CN117853371B - Multi-branch frequency domain enhanced real image defogging method, system and terminal - Google Patents
- Publication number
- CN117853371B CN202410252937.1A
- Authority
- CN
- China
- Prior art keywords
- image
- frequency domain
- enhancement
- defogging
- residual
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/50—Image enhancement or restoration using two or more images, e.g. averaging or subtraction
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
- G06V10/443—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering
- G06V10/449—Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters
- G06V10/451—Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters with interaction between the filter responses, e.g. cortical complex cells
- G06V10/454—Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/80—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
- G06V10/806—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20212—Image combination
- G06T2207/20221—Image fusion; Image merging
Abstract
The invention provides a multi-branch frequency domain enhanced real image defogging method, system and terminal. The method comprises the following steps: inputting a sample enhancement set into an image defogging model to perform multi-branch network feature extraction to obtain sample image features, and performing semantic segmentation, residual attention processing and feature fusion on the sample image features to obtain fusion features; performing mixed skip connection on the residual output images to obtain a residual image, performing image recovery and frequency domain enhancement on the residual image and the fusion features to obtain a residual enhanced image and a fusion enhanced image, and fusing the residual enhanced image and the fusion enhanced image to obtain a defogging image; determining a model loss according to the defogging image, and updating parameters of the image defogging model according to the model loss; and inputting the image to be defogged into the image defogging model to perform defogging processing, so as to obtain a defogged output image. By utilizing frequency domain information, the embodiment of the invention can effectively refine the defogging effect in the defogging image and improve the defogging accuracy of the image.
Description
Technical Field
The invention relates to the technical field of image processing, in particular to a real image defogging method, a real image defogging system and a real image defogging terminal with multi-branch frequency domain enhancement.
Background
Image defogging is a low-level visual task, typically used as a preprocessing step that improves the performance of high-level visual tasks such as crowd counting, object detection, or image segmentation. High-level visual tasks generally require clear, haze-free images. Therefore, the problem of image defogging is attracting growing interest from academia and industry.
In existing image defogging processes, the mapping relation between the defogging image and the foggy image is generally learned only from the image feature information of the sample data set in order to generate the defogging image. However, a defogging image generated from image feature information alone is of low quality, which reduces the defogging accuracy of the image.
Disclosure of Invention
The embodiment of the invention aims to provide a real image defogging method, a real image defogging system and a real image defogging terminal with multi-branch frequency domain enhancement, and aims to solve the problem that the conventional image defogging accuracy is low.
The embodiment of the invention is realized as a multi-branch frequency domain enhanced real image defogging method comprising the following steps:
acquiring a sample data set, and performing data enhancement on the sample data set to obtain a sample enhancement set;
inputting the sample enhancement set into an image defogging model to perform multi-branch network feature extraction to obtain sample image features, and performing semantic segmentation on the sample image features of each branch network to obtain semantic segmentation features;
performing residual attention processing on the semantic segmentation features of each branch network to obtain mask feature maps and residual output images, and fusing the mask feature maps of the branch networks to obtain fusion features;
performing mixed skip connection on the residual output images of the branch networks to obtain a residual image, and performing image recovery on the residual image and the fusion features to obtain a residual recovery image and a fusion recovery image;
performing frequency domain enhancement on the residual recovery image and the fusion recovery image to obtain a residual enhanced image and a fusion enhanced image, and fusing the residual enhanced image and the fusion enhanced image to obtain a defogging image;
determining a model loss according to the defogging image and the real image of the sample data set, and updating parameters of the image defogging model according to the model loss until the image defogging model converges;
and inputting the image to be defogged into the converged image defogging model to perform defogging processing, so as to obtain a defogged output image.
It is another object of an embodiment of the present invention to provide a multi-branch frequency domain enhanced real image defogging system, the system comprising:
a data enhancement module, used for acquiring a sample data set and performing data enhancement on the sample data set to obtain a sample enhancement set;
a feature extraction module, used for inputting the sample enhancement set into an image defogging model to perform multi-branch network feature extraction to obtain sample image features, and performing semantic segmentation on the sample image features of each branch network to obtain semantic segmentation features;
the feature extraction module being further used for performing residual attention processing on the semantic segmentation features of each branch network to obtain mask feature maps and residual output images, and fusing the mask feature maps of the branch networks to obtain fusion features;
and for performing mixed skip connection on the residual output images of the branch networks to obtain a residual image, and performing image recovery on the residual image and the fusion features to obtain a residual recovery image and a fusion recovery image;
a frequency domain enhancement module, used for performing frequency domain enhancement on the residual recovery image and the fusion recovery image to obtain a residual enhanced image and a fusion enhanced image, and fusing the residual enhanced image and the fusion enhanced image to obtain a defogging image;
a parameter updating module, used for determining a model loss according to the defogging image and the real image of the sample data set, and updating parameters of the image defogging model according to the model loss until the image defogging model converges;
and a defogging processing module, used for inputting the image to be defogged into the converged image defogging model to perform defogging processing, so as to obtain a defogged output image.
It is a further object of an embodiment of the present invention to provide a terminal device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, which processor implements the steps of the method as described above when executing the computer program.
It is a further object of embodiments of the present invention to provide a computer readable storage medium storing a computer program which, when executed by a processor, implements the steps of the above method.
According to the embodiment of the invention, performing data enhancement on the sample data set effectively improves the diversity of its data. Inputting the sample enhancement set into the image defogging model for multi-branch network feature extraction extracts the sample image features of the sample data set in a multi-channel manner. Performing frequency domain enhancement on the residual recovery image and the fusion recovery image makes it possible to use frequency domain information to refine the defogging effect in both images, which improves the quality of the defogging image and the defogging accuracy of the image.
Drawings
FIG. 1 is a flow chart of a true image defogging method for multi-branch frequency domain enhancement provided by a first embodiment of the present invention;
FIG. 2 is a schematic structural diagram of a multi-branch frequency domain enhanced real image defogging system according to a second embodiment of the present invention;
Fig. 3 is a schematic structural diagram of a terminal device according to a third embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
In order to illustrate the technical scheme of the invention, the following description is made by specific examples.
Example 1
Referring to fig. 1, a flowchart of a multi-branch frequency domain enhanced real image defogging method according to a first embodiment of the present invention is provided, and the multi-branch frequency domain enhanced real image defogging method can be applied to any system, and includes the steps of:
step S10, a sample data set is obtained, and data enhancement is carried out on the sample data set to obtain a sample enhancement set;
Performing data enhancement on the sample data set effectively increases the amount of data, so that the trained model can flexibly adapt to real-world images and achieves good performance.
In this step, the formula adopted for data enhancement on the sample data set includes:
$$I(x) = J(x)\,t(x) + A\,\bigl(1 - t(x)\bigr)$$
$$t(x) = e^{-\beta d(x)}$$
wherein $I(x)$ is a hazy image of the sample enhancement set, $J(x)$ is a true haze-free image of the sample data set, $A$ is the global atmospheric light, $t(x)$ is the medium transmittance map, $d(x)$ is the scene depth, and $\beta$ is the haze density. In this step, the values of the global atmospheric light and the haze density may be perturbed based on the depth estimator. Preferably, the range of the global atmospheric light can be adjusted from (0.7, 1.0) to (0.5, 1.8), and the range of the haze density can be adjusted from (0.6, 1.8) to (0.8, 2.8); amplifying the global atmospheric light and the haze density in this way effectively covers more challenging real-world conditions. When the haze is severe, the dense fog causes serious information loss and the supervision signal in the model input is markedly reduced, so amplifying the global atmospheric light and the haze density enhances the generalization capability of the model.
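The synthesis step above can be sketched in NumPy as follows. This is a minimal illustration rather than the patented implementation: it assumes the standard atmospheric scattering form $I = J\,t + A\,(1 - t)$ with $t = e^{-\beta d}$, uniform sampling of $A$ and $\beta$ from the widened ranges, and a final clip to [0, 1]. The function names, the uniform sampling and the clipping are assumptions made for the example.

```python
import numpy as np

def synthesize_hazy(clear, depth, atmos_light, haze_density):
    """Apply I = J * t + A * (1 - t) with t = exp(-beta * d)."""
    # Transmission map, broadcast over the colour channels: shape (H, W, 1).
    t = np.exp(-haze_density * depth)[..., None]
    return clear * t + atmos_light * (1.0 - t)

def augment(clear, depth, rng, a_range=(0.5, 1.8), beta_range=(0.8, 2.8)):
    """Sample A and beta from the widened ranges and synthesize a hazy image."""
    atmos_light = rng.uniform(*a_range)      # assumption: uniform sampling
    haze_density = rng.uniform(*beta_range)  # assumption: uniform sampling
    hazy = synthesize_hazy(clear, depth, atmos_light, haze_density)
    # Assumption: clip to the valid intensity range, since A may exceed 1.
    return np.clip(hazy, 0.0, 1.0), atmos_light, haze_density

rng = np.random.default_rng(0)
clear = rng.random((4, 4, 3))   # stand-in haze-free image J, values in [0, 1)
depth = rng.random((4, 4))      # stand-in depth map d from a depth estimator
hazy, atmos_light, haze_density = augment(clear, depth, rng)
```

With zero depth the transmittance is 1 everywhere and the hazy image equals the clear image, which gives a quick sanity check of the formula.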
In the frequency domain, the haze effect is generally regarded as a spatially static signal composed of low-frequency components. In this embodiment, the low-frequency spectra of two images are exchanged, so that the haze pattern is changed without affecting high-level semantic perception: the image style is transferred and the haze patterns are diversified, while the understanding of the scene is unchanged. In addition, the unique position dependency of the haze image on the scene depth is also decomposed, so that more non-uniform haze patterns are generated and the diversity of the data set is further improved.
$$I^{a \to b} = \mathcal{F}^{-1}\left(\left[\, M \circ \mathcal{F}^{A}(I^{b}) + (1 - M) \circ \mathcal{F}^{A}(I^{a}),\; \mathcal{F}^{P}(I^{a}) \,\right]\right)$$
wherein $I^{a \to b}$ is the augmented image generated from hazy image $I^{a}$ and hazy image $I^{b}$, $I^{a}$ and $I^{b}$ are the hazy images synthesized from true haze-free image $J^{a}$ and true haze-free image $J^{b}$, $J^{a}$ and $J^{b}$ are the a-th and b-th true haze-free images in the sample data set, $\mathcal{F}^{A}$ is the amplitude component of the Fourier transform, $\mathcal{F}^{P}$ is the phase component of the Fourier transform, $\mathcal{F}^{-1}$ is the inverse Fourier transform, $M$ is the image mask, and $\circ$ is the composition of functions.
$$M(h, w) = \mathbb{1}_{(h, w) \in [-\rho H : \rho H,\; -\rho W : \rho W]}$$
wherein $H$ and $W$ are the height and width of the image, $h$ and $w$ are the coordinates along the height and width of the image, $\rho \in (0, 1)$ is a ratio controlling the migration range of the low-frequency components, and $\mathbb{1}$ is the indicator function.
The low-frequency range (i.e., the haze pattern) of $\mathcal{F}^{A}(I^{a})$ is replaced by that of $\mathcal{F}^{A}(I^{b})$, while $\mathcal{F}^{P}(I^{a})$ is left undisturbed to maintain the scene content of $I^{a}$. To alleviate the negative impact of the hard threshold in $M$, a Gaussian filter may be applied to smooth the image.
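The low-frequency exchange described above can be sketched with NumPy's FFT routines. The following is an illustration under stated assumptions, not the patented code: it assumes a centred rectangular low-frequency window whose half-size is a fraction `ratio` of the image size, keeps the phase of the first image, and takes the low-frequency amplitude from the second. The names `low_freq_mask`, `swap_low_freq` and `ratio` are hypothetical, and the optional Gaussian smoothing is omitted.

```python
import numpy as np

def low_freq_mask(height, width, ratio):
    """Centred rectangular mask: 1 inside the low-frequency window, 0 elsewhere."""
    mask = np.zeros((height, width))
    bh, bw = int(ratio * height), int(ratio * width)
    ch, cw = height // 2, width // 2  # zero frequency after fftshift
    mask[max(ch - bh, 0):ch + bh + 1, max(cw - bw, 0):cw + bw + 1] = 1.0
    return mask

def swap_low_freq(img_a, img_b, ratio=0.1):
    """Keep the phase of img_a; take the low-frequency amplitude from img_b.

    Both images are real arrays of shape (H, W, C).
    """
    axes = (0, 1)
    fa = np.fft.fftshift(np.fft.fft2(img_a, axes=axes), axes=axes)
    fb = np.fft.fftshift(np.fft.fft2(img_b, axes=axes), axes=axes)
    amp_a, pha_a = np.abs(fa), np.angle(fa)
    amp_b = np.abs(fb)
    mask = low_freq_mask(img_a.shape[0], img_a.shape[1], ratio)[..., None]
    # Low frequencies come from img_b, the remaining amplitude from img_a.
    amp = mask * amp_b + (1.0 - mask) * amp_a
    mixed = amp * np.exp(1j * pha_a)
    out = np.fft.ifft2(np.fft.ifftshift(mixed, axes=axes), axes=axes)
    return np.real(out)
```

Swapping an image with itself returns the image unchanged, and with `ratio=0` only the zero-frequency bin is exchanged, so the output takes on the per-channel mean intensity of the second image.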
According to the embodiment, the data enhancement is performed on the sample data set, so that the distribution difference between the real data and the synthesized data can be effectively reduced, and the defogging performance of the model in a real scene can be improved.
Step S20, inputting the sample enhancement set into an image defogging model to perform multi-branch network feature extraction to obtain sample image features, and performing semantic segmentation on the sample image features of each branch network to obtain semantic segmentation features;
The image defogging model is provided with a plurality of branch networks. Each branch network extracts the sample image features (shallow features) through two 3x3 convolutions and one ReLU layer. All branch networks use a 3-layer U-Net architecture to perform semantic segmentation on the sample image features, each scale is 3, and different attention blocks can be used in place of naive convolutions in each branch.
Step S30, carrying out residual attention processing on semantic segmentation features of each branch network to obtain mask feature mapping and residual output images, and fusing the mask feature mapping of each branch network to obtain fusion features;
Each branch network is provided with a Residual Attention Module (RAM). The RAM generates two outputs, namely a mask feature map and a residual output image. The residual output image is the degraded output image obtained by passing through a 3x3 convolution in the branch network, and the mask feature map is generated from a mask obtained by passing the residual output image through a sigmoid function.
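A minimal NumPy sketch of the two RAM outputs follows. The text does not state how the mask is combined with the features, so this example assumes an element-wise product; it also works on a single 2-D feature map with one 3x3 kernel and zero padding. All names and these simplifications are assumptions for illustration only.

```python
import numpy as np

def conv3x3(x, kernel):
    """3x3 cross-correlation with zero padding on a single (H, W) feature map."""
    padded = np.pad(x, 1)
    out = np.zeros_like(x, dtype=float)
    for i in range(3):
        for j in range(3):
            out += kernel[i, j] * padded[i:i + x.shape[0], j:j + x.shape[1]]
    return out

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def residual_attention(features, kernel):
    """Return (residual output, mask feature map) for one branch."""
    residual = conv3x3(features, kernel)  # residual output image
    mask = sigmoid(residual)              # mask derived from the residual output
    masked = mask * features              # assumption: element-wise gating
    return residual, masked
```

With an all-zero kernel the residual output is zero, the mask is 0.5 everywhere, and the mask feature map is half the input features.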
Step S40, carrying out mixed skip connection on residual output images of all branch networks to obtain residual images, and respectively carrying out image restoration on the residual images and the fusion features to obtain residual restored images and fusion restored images;
Mixed skip connection (MSC) is performed on the residual output images of the branch networks to obtain the residual image, and a 3x3 convolution is applied to the residual image and to the fusion features to perform image recovery, yielding the residual recovery image and the fusion recovery image.
Step S50, carrying out frequency domain enhancement on the residual error recovery image and the fusion recovery image to obtain a residual error enhancement image and a fusion enhancement image, and fusing the residual error enhancement image and the fusion enhancement image to obtain a defogging image;
Optionally, performing frequency domain enhancement on the residual error recovery image and the fusion recovery image includes:
Acquiring spatial domain features of the residual error recovery image and the fusion recovery image, and determining frequency domain features according to the spatial domain features;
determining an amplitude spectrum and a phase spectrum according to the frequency domain characteristics, and repairing the amplitude spectrum to obtain an amplitude repairing spectrum;
Determining a residual amplitude from the amplitude restoration spectrum and the amplitude spectrum, and determining an attention map from the residual amplitude;
Performing phase change on the phase spectrum according to the attention diagram to obtain a phase change spectrum, and determining a frequency domain enhanced real part and a frequency domain enhanced imaginary part according to the phase change spectrum and the amplitude restoration spectrum;
Determining frequency domain enhancement features according to the frequency domain enhancement real part and the frequency domain enhancement imaginary part, and determining spatial domain enhancement features according to the frequency domain enhancement features;
Performing feature enhancement on the residual error restored image and the fusion restored image according to the spatial domain enhanced features to obtain the residual error enhanced image and the fusion enhanced image;
Wherein the formula adopted for determining the frequency domain features according to the spatial domain features comprises:
$$\mathcal{F}(x)(u, v) = \sum_{h=0}^{H-1} \sum_{w=0}^{W-1} x(h, w)\, e^{-j 2\pi \left( \frac{h}{H} v + \frac{w}{W} u \right)}$$
wherein $x$ is the spatial domain feature of the residual restored image and the fused restored image, $(u, v)$ are the coordinates of the residual recovery image and the fused recovery image in the frequency domain, $u$ is the frequency component in the horizontal direction of the image, $v$ is the frequency component in the vertical direction of the image, $H$ and $W$ are the height and width of the image, $h$ and $w$ are the coordinates in the vertical direction and the horizontal direction of the image, and $\mathcal{F}(x)$ is the frequency domain feature of the residual restored image and the fused restored image;
The formula adopted for determining the amplitude spectrum and the phase spectrum according to the frequency domain features comprises the following steps:
$$\mathcal{F}(x) = \mathcal{R}(x) + j\,\mathcal{I}(x)$$
wherein $\mathcal{R}(x)$ is the real part of the frequency domain feature and $\mathcal{I}(x)$ is the imaginary part of the frequency domain feature;
$$\mathcal{A}(x) = \sqrt{\mathcal{R}^{2}(x) + \mathcal{I}^{2}(x)}, \qquad \mathcal{P}(x) = \arctan\left[\frac{\mathcal{I}(x)}{\mathcal{R}(x)}\right]$$
wherein $\mathcal{A}(x)$ is the amplitude spectrum and $\mathcal{P}(x)$ is the phase spectrum;
Repairing the amplitude spectrum to obtain an amplitude repairing spectrum, wherein a formula adopted for determining residual amplitude according to the amplitude repairing spectrum and the amplitude spectrum comprises the following steps:
$$\mathcal{A}'(x) = k \circledast \mathcal{A}(x), \qquad \mathcal{A}_{res}(x) = \mathcal{A}'(x) - \mathcal{A}(x)$$
wherein, because the amplitude may be severely distorted by conditions such as sampling and signal corruption, the amplitude may be repaired using a 1x1 convolution; $\mathcal{A}'(x)$ is the amplitude restoration spectrum, $\mathcal{A}_{res}(x)$ is the residual amplitude between the amplitude repair spectrum and the amplitude spectrum, $\circledast$ is the convolution operator, and $k$ is a filter with a convolution kernel size of 1x1 pixels.
Determining an attention map from the residual amplitude, and phase-changing the phase spectrum according to the attention map, wherein $Att$ is the attention map, determined from the residual amplitude $\mathcal{A}_{res}(x)$ using GAP, GAP is global average pooling, $\mathcal{P}'(x)$ is the phase change spectrum, obtained by applying $Att$ to the phase spectrum $\mathcal{P}(x)$, and $\odot$ is the element-wise product operation;
The formula adopted for determining the frequency domain enhancement real part and the frequency domain enhancement imaginary part according to the phase change spectrum and the amplitude restoration spectrum comprises the following steps:
$$\mathcal{R}'(x) = \mathcal{A}'(x) \cos \mathcal{P}'(x), \qquad \mathcal{I}'(x) = \mathcal{A}'(x) \sin \mathcal{P}'(x)$$
wherein $\mathcal{R}'(x)$ is the frequency domain enhanced real part and $\mathcal{I}'(x)$ is the frequency domain enhanced imaginary part;
the formula adopted for determining the frequency domain enhancement features according to the frequency domain enhancement real part and the frequency domain enhancement imaginary part comprises the following steps:
$$\mathcal{F}'(x) = \mathcal{R}'(x) + j\,\mathcal{I}'(x)$$
wherein $\mathcal{F}'(x)$ is the frequency domain enhancement feature of the residual enhancement image and the fusion enhancement image. In this step, the frequency domain enhancement features are transformed into the spatial domain enhancement features by the inverse Fourier transform.
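The frequency domain enhancement pipeline can be sketched end to end in NumPy. This is a simplified illustration under explicit assumptions: the 1x1 convolution is modelled as a channel-mixing matrix `weight`, the attention map is taken to be the global average of the residual amplitude per channel, and the phase change is assumed to be an element-wise product of the attention map and the phase spectrum. The source does not give these exact forms, so the names and choices below are hypothetical.

```python
import numpy as np

def to_amp_phase(x):
    """2-D FFT of (C, H, W) features, split into amplitude and phase spectra."""
    freq = np.fft.fft2(x, axes=(-2, -1))
    # np.angle uses arctan2(imag, real), a quadrant-aware form of arctan(I / R).
    return np.abs(freq), np.angle(freq)

def from_amp_phase(amp, pha):
    """Rebuild real and imaginary parts, then return to the spatial domain."""
    real = amp * np.cos(pha)
    imag = amp * np.sin(pha)
    return np.real(np.fft.ifft2(real + 1j * imag, axes=(-2, -1)))

def frequency_enhance(x, weight):
    """x: (C, H, W) spatial features; weight: (C, C) stand-in for a 1x1 conv."""
    amp, pha = to_amp_phase(x)
    amp_rep = np.einsum("oc,chw->ohw", weight, amp)   # amplitude repair
    amp_res = amp_rep - amp                           # residual amplitude
    att = amp_res.mean(axis=(-2, -1), keepdims=True)  # GAP -> attention map
    pha_chg = att * pha                               # assumption: Att (.) P
    return from_amp_phase(amp_rep, pha_chg)
```

Splitting a feature map into amplitude and phase and recombining it without modification reproduces the input, which confirms that the polar and rectangular forms used in the text are consistent.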
Step S60, determining model loss according to the defogging images and the real images of the sample data set, and updating parameters of the image defogging model according to the model loss until the image defogging model converges;
optionally, determining the model loss from the defogging image and the real image of the sample data set comprises:
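$$\mathrm{PSNR}(X, Y) = 10 \log_{10}\left(\frac{MAX_{Y}^{2}}{\mathrm{MSE}(X, Y)}\right), \qquad \mathrm{SSIM}(X, Y) = l(X, Y)^{\alpha} \cdot c(X, Y)^{\beta} \cdot s(X, Y)^{\gamma}$$
wherein $X$ is the defogging image, $Y$ is the real image, $L$ is the model loss, $L_{PS}$ is the loss composed of the peak signal-to-noise ratio and the structural similarity index, $L_{edge}$ is the edge loss, $\lambda_{1}$, $\lambda_{2}$ and $\lambda_{3}$ are preset constants, $\Delta$ is the Laplacian operator, PSNR is the peak signal-to-noise ratio, SSIM is the structural similarity, $MAX_{Y}$ is the maximum pixel value that an image can take, MSE is the mean square error of the defogging image and the corresponding real image, $l$, $c$ and $s$ represent the comparison of brightness, contrast and structure between the real image and the defogging image respectively, and $\alpha$, $\beta$ and $\gamma$ are hyperparameters. Preferably, $\lambda_{1}$ can be set to 0.005, $\lambda_{2}$ can be set to 0.05, $\lambda_{3}$ can be set to $10^{-3}$, and $\alpha$, $\beta$ and $\gamma$ are typically set to 1. SSIM measures the similarity of images in terms of brightness, contrast, and structure.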
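The individual loss ingredients named above can be sketched in NumPy. This is a simplified illustration, not the patented loss: SSIM is computed from global image statistics rather than sliding windows, the stabilising constants follow the commonly used values (0.01 and 0.03 times the dynamic range), and the edge term is assumed to be a Charbonnier-style penalty on the Laplacian difference with a periodic-boundary Laplacian. How the terms are weighted into the model loss is not reproduced here, and all function names are hypothetical.

```python
import numpy as np

def psnr(x, y, max_val=1.0):
    """Peak signal-to-noise ratio: 10 * log10(MAX^2 / MSE)."""
    mse = np.mean((x - y) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(max_val ** 2 / mse)

def global_ssim(x, y, max_val=1.0, alpha=1.0, beta=1.0, gamma=1.0):
    """SSIM = l^alpha * c^beta * s^gamma from global statistics (simplified)."""
    c1, c2 = (0.01 * max_val) ** 2, (0.03 * max_val) ** 2
    c3 = c2 / 2.0
    mx, my = x.mean(), y.mean()
    sx, sy = x.std(), y.std()
    sxy = np.mean((x - mx) * (y - my))
    lum = (2 * mx * my + c1) / (mx ** 2 + my ** 2 + c1)  # brightness term
    con = (2 * sx * sy + c2) / (sx ** 2 + sy ** 2 + c2)  # contrast term
    struct = (sxy + c3) / (sx * sy + c3)                 # structure term
    return lum ** alpha * con ** beta * struct ** gamma

def laplacian(img):
    """Discrete 4-neighbour Laplacian with periodic boundaries (assumption)."""
    return (np.roll(img, 1, 0) + np.roll(img, -1, 0)
            + np.roll(img, 1, 1) + np.roll(img, -1, 1) - 4.0 * img)

def edge_loss(x, y, eps=1e-3):
    """Assumed Charbonnier-style edge penalty on Laplacian responses."""
    diff = laplacian(x) - laplacian(y)
    return float(np.sqrt(np.mean(diff ** 2) + eps ** 2))
```

For identical images the SSIM is 1, the PSNR is infinite, and the edge term reduces to `eps`.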
And step S70, inputting the image to be defogged into the converged image defogging model to perform defogging treatment, so as to obtain a defogged output image.
In this embodiment, performing data enhancement on the sample data set effectively improves the diversity of its data. Inputting the sample enhancement set into the image defogging model for multi-branch network feature extraction extracts the sample image features of the sample data set in a multi-channel manner. Performing frequency domain enhancement on the residual recovery image and the fusion recovery image makes it possible to use frequency domain information to refine the defogging effect in both images, which improves the quality of the defogging image and the defogging accuracy of the image.
Example two
Referring to fig. 2, a schematic structural diagram of a multi-branch frequency domain enhanced real image defogging system 100 according to a second embodiment of the present invention is shown, which includes:
the data enhancement module 10 is configured to obtain a sample data set, and perform data enhancement on the sample data set to obtain a sample enhancement set.
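Optionally, the formula for performing data enhancement on the sample data set includes:
$$I(x) = J(x)\,t(x) + A\,\bigl(1 - t(x)\bigr)$$
$$t(x) = e^{-\beta d(x)}$$
wherein $I(x)$ is a hazy image of the sample enhancement set, $J(x)$ is a true haze-free image of the sample data set, $A$ is the global atmospheric light, $t(x)$ is the medium transmittance map, $d(x)$ is the scene depth, and $\beta$ is the haze density;
$$I^{a \to b} = \mathcal{F}^{-1}\left(\left[\, M \circ \mathcal{F}^{A}(I^{b}) + (1 - M) \circ \mathcal{F}^{A}(I^{a}),\; \mathcal{F}^{P}(I^{a}) \,\right]\right)$$
wherein $I^{a \to b}$ is the augmented image generated from $I^{a}$ and $I^{b}$, $J^{a}$ and $J^{b}$ are the a-th and b-th true haze-free images in the sample data set, $I^{a}$ and $I^{b}$ are the hazy images synthesized from images $J^{a}$ and $J^{b}$, $\mathcal{F}^{A}$ is the amplitude component of the Fourier transform, $\mathcal{F}^{P}$ is the phase component of the Fourier transform, $\mathcal{F}^{-1}$ is the inverse Fourier transform, $M$ is the image mask, and $\circ$ is the composition of functions.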
The feature extraction module 11 is configured to input the sample enhancement set into the image defogging model to perform multi-branch network feature extraction to obtain sample image features, and perform semantic segmentation on the sample image features of each branch network to obtain semantic segmentation features;
residual attention processing is carried out on semantic segmentation features of each branch network respectively to obtain mask feature mapping and residual output images, and the mask feature mapping of each branch network is fused to obtain fusion features;
And performing mixed skip connection on residual output images of each branch network to obtain residual images, and performing image restoration on the residual images and the fusion features respectively to obtain residual restoration images and fusion restoration images.
The frequency domain enhancement module 12 is configured to perform frequency domain enhancement on the residual error recovery image and the fusion recovery image to obtain a residual error enhancement image and a fusion enhancement image, and fuse the residual error enhancement image and the fusion enhancement image to obtain a defogging image.
Optionally, the frequency domain enhancement module 12 is further configured to: acquiring spatial domain features of the residual error recovery image and the fusion recovery image, and determining frequency domain features according to the spatial domain features;
determining an amplitude spectrum and a phase spectrum according to the frequency domain characteristics, and repairing the amplitude spectrum to obtain an amplitude repairing spectrum;
Determining a residual amplitude from the amplitude restoration spectrum and the amplitude spectrum, and determining an attention map from the residual amplitude;
Performing phase change on the phase spectrum according to the attention diagram to obtain a phase change spectrum, and determining a frequency domain enhanced real part and a frequency domain enhanced imaginary part according to the phase change spectrum and the amplitude restoration spectrum;
Determining frequency domain enhancement features according to the frequency domain enhancement real part and the frequency domain enhancement imaginary part, and determining spatial domain enhancement features according to the frequency domain enhancement features;
and carrying out feature enhancement on the residual error restored image and the fusion restored image according to the spatial domain enhanced features to obtain the residual error enhanced image and the fusion enhanced image.
Further, the formula adopted for determining the frequency domain features according to the spatial domain features comprises:
wherein x is the spatial domain feature of the residual restored image and the fused restored image, Coordinates in the frequency domain for the residual recovery image and the fused recovery image,/>Is a frequency component in the horizontal direction of the image,/>Is a frequency component in the vertical direction of the image,/>And/>For the height and width of the image,/>And/>Is the coordinates of the vertical direction and the horizontal direction of the image,/>Restoring frequency domain features of the image for the residual error and the fused restored image;
The formulas adopted for determining the amplitude spectrum and the phase spectrum from the frequency domain features comprise:
$$\mathcal{F}(x)=R(x)+jI(x)$$
wherein $R(x)$ is the real part of the frequency domain feature and $I(x)$ is the imaginary part of the frequency domain feature;
$$A(x)=\sqrt{R^{2}(x)+I^{2}(x)},\qquad P(x)=\arctan\left[\frac{I(x)}{R(x)}\right]$$
wherein $A(x)$ is the amplitude spectrum and $P(x)$ is the phase spectrum;
The formulas adopted for repairing the amplitude spectrum to obtain the repaired amplitude spectrum, and for determining the residual amplitude from the repaired amplitude spectrum and the amplitude spectrum, comprise:
$$A'(x)=A(x)\circledast k,\qquad A_{res}(x)=A'(x)-A(x)$$
wherein $A(x)$ is the amplitude spectrum, $A'(x)$ is the repaired amplitude spectrum, $A_{res}(x)$ is the residual amplitude between the repaired amplitude spectrum and the amplitude spectrum, $\circledast$ is the convolution operator, and $k$ is a filter;
The formulas adopted for determining the attention map from the residual amplitude, and for applying the phase change to the phase spectrum according to the attention map, comprise:
wherein $Att$ is the attention map, GAP is global average pooling, $P'(x)$ is the phase change spectrum, and $\odot$ is the element-wise product;
The formulas adopted for determining the frequency domain enhanced real part and the frequency domain enhanced imaginary part from the phase change spectrum and the repaired amplitude spectrum comprise:
$$R'(x)=A'(x)\cos P'(x),\qquad I'(x)=A'(x)\sin P'(x)$$
wherein $A'(x)$ is the repaired amplitude spectrum, $P'(x)$ is the phase change spectrum, $R'(x)$ is the frequency domain enhanced real part, and $I'(x)$ is the frequency domain enhanced imaginary part;
The formula adopted for determining the frequency domain enhanced features from the frequency domain enhanced real part and the frequency domain enhanced imaginary part comprises:
$$\mathcal{F}'(x)=R'(x)+jI'(x)$$
wherein $R'(x)$ and $I'(x)$ are the frequency domain enhanced real part and imaginary part, and $\mathcal{F}'(x)$ is the frequency domain enhanced feature of the residual enhanced image and the fused enhanced image.
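The frequency domain enhancement steps above can be illustrated with a minimal NumPy sketch. It is not the patented implementation: the learned repair filter is replaced by a fixed circular box filter, the attention map is taken as a softmax over the globally average-pooled residual amplitude of each channel, and the final feature enhancement is a plain residual addition. These three choices, and the names `to_amp_phase`, `from_amp_phase`, `box_filter` and `frequency_enhance`, are assumptions made for illustration only.

```python
import numpy as np


def to_amp_phase(freq):
    """Split a complex spectrum into amplitude and phase spectra."""
    return np.abs(freq), np.angle(freq)


def from_amp_phase(amp, pha):
    """Rebuild a complex spectrum from amplitude and phase spectra."""
    real = amp * np.cos(pha)   # frequency domain real part
    imag = amp * np.sin(pha)   # frequency domain imaginary part
    return real + 1j * imag


def box_filter(amp, k=3):
    """Circular k x k mean filter (k odd); a stand-in for the learned filter (assumption)."""
    r = k // 2
    out = np.zeros_like(amp)
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            out += np.roll(amp, shift=(dy, dx), axis=(-2, -1))
    return out / (k * k)


def frequency_enhance(x, k=3):
    """x: real feature tensor of shape (C, H, W). Returns an array of the same shape."""
    freq = np.fft.fft2(x)                    # spatial domain -> frequency domain
    amp, pha = to_amp_phase(freq)            # amplitude and phase spectra
    amp_rep = box_filter(amp, k)             # repaired amplitude spectrum
    res = amp_rep - amp                      # residual amplitude
    # Attention map: GAP per channel, then softmax over channels (assumption).
    gap = res.mean(axis=(-2, -1), keepdims=True)
    e = np.exp(gap - gap.max())
    att = e / e.sum()
    pha_chg = att * pha                      # phase change via element-wise product
    freq_enh = from_amp_phase(amp_rep, pha_chg)
    x_enh = np.fft.ifft2(freq_enh).real      # frequency domain -> spatial domain
    return x + x_enh                         # residual feature enhancement (assumption)
```

Applied to a feature tensor of shape (C, H, W), `frequency_enhance` returns an array of the same shape. Because the phase is modified, the inverse transform is generally complex, and the sketch keeps only its real part.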
The parameter updating module 13 is configured to determine a model loss from the defogged image and the real image of the sample data set, and to update the parameters of the image defogging model according to the model loss until the image defogging model converges.
Optionally, the formulas adopted for determining the model loss from the defogged image and the real image of the sample data set are defined as follows:
wherein $X$ is the defogged image and $Y$ is the real image; the model loss combines a loss composed of the peak signal-to-noise ratio and the structural similarity index with an edge loss computed with the Laplacian operator, using three preset constants; PSNR is the peak signal-to-noise ratio, $PSNR=10\log_{10}\left(MAX_Y^{2}/MSE\right)$, where $MAX_Y$ is the maximum pixel value that an image can take and MSE is the mean square error between the defogged image and the corresponding real image; SSIM is the structural similarity, $SSIM(X,Y)=[l(X,Y)]^{\alpha}[c(X,Y)]^{\beta}[s(X,Y)]^{\gamma}$, where $l$, $c$ and $s$ are the luminance, contrast and structure comparisons between the real image and the defogged image, and $\alpha$, $\beta$ and $\gamma$ are hyperparameters.
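The quantities named in the loss definition can be sketched in NumPy as follows. The sketch computes PSNR, a single-window (global) SSIM with unit exponents, and a Laplacian-based edge term. How the patent weights and combines these terms is not reproduced here, and the L1 form of `edge_loss`, the SSIM stabilising constants, and the periodic boundary handling in `laplacian` are assumptions.

```python
import numpy as np


def psnr(x, y, max_val=1.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(MAX^2 / MSE)."""
    mse = np.mean((x - y) ** 2)
    return np.inf if mse == 0 else 10.0 * np.log10(max_val ** 2 / mse)


def global_ssim(x, y, max_val=1.0):
    """SSIM over the whole image as one window, with alpha = beta = gamma = 1."""
    c1, c2 = (0.01 * max_val) ** 2, (0.03 * max_val) ** 2  # usual stabilisers (assumption)
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / (
        (mx ** 2 + my ** 2 + c1) * (vx + vy + c2)
    )


def laplacian(img):
    """4-neighbour discrete Laplacian with periodic boundaries (assumption)."""
    return (
        np.roll(img, 1, axis=0) + np.roll(img, -1, axis=0)
        + np.roll(img, 1, axis=1) + np.roll(img, -1, axis=1)
        - 4.0 * img
    )


def edge_loss(x, y):
    """Mean absolute difference of Laplacian responses (hypothetical L1 form)."""
    return np.mean(np.abs(laplacian(x) - laplacian(y)))
```

For images scaled to [0, 1], a higher `psnr` and a `global_ssim` closer to 1 indicate a defogged image closer to the real image, while `edge_loss` penalises differences in edge structure.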
The defogging processing module 14 is configured to input the image to be defogged into the converged image defogging model for defogging, obtaining a defogged output image.
In this embodiment, data enhancement of the sample data set increases the diversity of the training data. Inputting the sample enhancement set into the image defogging model for multi-branch network feature extraction extracts the sample image features through multiple channels. Applying frequency domain enhancement to the residual restored image and the fused restored image uses frequency domain information to refine the defogging effect in both images, which improves the quality of the defogged image and the accuracy of image defogging.
Example III
Fig. 3 is a block diagram of a terminal device 2 according to a third embodiment of the present application. As shown in Fig. 3, the terminal device 2 of this embodiment includes a processor 20, a memory 21, and a computer program 22 stored in the memory 21 and executable on the processor 20, for example a program implementing the multi-branch frequency domain enhanced real image defogging method. When executing the computer program 22, the processor 20 implements the steps of the embodiments of the multi-branch frequency domain enhanced real image defogging method described above.
Illustratively, the computer program 22 may be partitioned into one or more modules that are stored in the memory 21 and executed by the processor 20 to implement the present application. The one or more modules may be a series of computer program instruction segments capable of performing specific functions, the instruction segments describing the execution of the computer program 22 in the terminal device 2. The terminal device may include, but is not limited to, the processor 20 and the memory 21.
The processor 20 may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor or any conventional processor.
The memory 21 may be an internal storage unit of the terminal device 2, such as a hard disk or a memory of the terminal device 2. The memory 21 may also be an external storage device of the terminal device 2, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card provided on the terminal device 2. Further, the memory 21 may include both an internal storage unit and an external storage device of the terminal device 2. The memory 21 is used for storing the computer program as well as other programs and data required by the terminal device. The memory 21 may also be used for temporarily storing data that has been output or is to be output.
In addition, each functional module in the embodiments of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated modules, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer-readable storage medium, which may be nonvolatile or volatile. Based on this understanding, all or part of the flow of the methods of the above embodiments may be accomplished by a computer program instructing related hardware. The computer program may be stored in a computer-readable storage medium and, when executed by a processor, implements the steps of each method embodiment described above. The computer program comprises computer program code, which may be in the form of source code, object code, an executable file, or some intermediate form. The computer-readable storage medium may include: any entity or device capable of carrying computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and so forth. It should be noted that the content of the computer-readable storage medium may be appropriately expanded or restricted according to the requirements of legislation and patent practice in a jurisdiction; for example, in some jurisdictions, according to legislation and patent practice, the computer-readable storage medium does not include electrical carrier signals and telecommunications signals.
The above embodiments are only for illustrating the technical solution of the present application, and not for limiting the same; although the application has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present application, and are intended to be included in the scope of the present application.
Claims (6)
1. A multi-branch frequency domain enhanced real image defogging method, the method comprising:
acquiring a sample data set, and performing data enhancement on the sample data set to obtain a sample enhancement set;
inputting the sample enhancement set into an image defogging model to perform multi-branch network feature extraction to obtain sample image features, and performing semantic segmentation on the sample image features of each branch network to obtain semantic segmentation features;
performing residual attention processing on the semantic segmentation features of each branch network to obtain mask feature maps and residual output images, and fusing the mask feature maps of the branch networks to obtain fused features;
applying mixed skip connections to the residual output images of the branch networks to obtain a residual image, and performing image restoration on the residual image and the fused features to obtain a residual restored image and a fused restored image;
performing frequency domain enhancement on the residual restored image and the fused restored image to obtain a residual enhanced image and a fused enhanced image, and fusing the residual enhanced image and the fused enhanced image to obtain a defogged image;
determining a model loss from the defogged image and the real image of the sample data set, and updating parameters of the image defogging model according to the model loss until the image defogging model converges;
inputting an image to be defogged into the converged image defogging model for defogging to obtain a defogged output image;
wherein performing frequency domain enhancement on the residual restored image and the fused restored image comprises:
acquiring spatial domain features of the residual restored image and the fused restored image, and determining frequency domain features from the spatial domain features;
determining an amplitude spectrum and a phase spectrum from the frequency domain features, and repairing the amplitude spectrum to obtain a repaired amplitude spectrum;
determining a residual amplitude from the repaired amplitude spectrum and the amplitude spectrum, and determining an attention map from the residual amplitude;
applying a phase change to the phase spectrum according to the attention map to obtain a phase change spectrum, and determining a frequency domain enhanced real part and a frequency domain enhanced imaginary part from the phase change spectrum and the repaired amplitude spectrum;
determining frequency domain enhanced features from the frequency domain enhanced real part and the frequency domain enhanced imaginary part, and determining spatial domain enhanced features from the frequency domain enhanced features;
performing feature enhancement on the residual restored image and the fused restored image according to the spatial domain enhanced features to obtain the residual enhanced image and the fused enhanced image;
the formula adopted for determining the frequency domain features from the spatial domain features comprises:
$$\mathcal{F}(x)(u,v)=\sum_{h=0}^{H-1}\sum_{w=0}^{W-1}x(h,w)\,e^{-j2\pi\left(\frac{w}{W}u+\frac{h}{H}v\right)}$$
wherein $x$ is the spatial domain feature of the residual restored image and the fused restored image, $(u,v)$ are the coordinates in the frequency domain, $u$ is the frequency component in the horizontal direction of the image, $v$ is the frequency component in the vertical direction of the image, $H$ and $W$ are the height and width of the image, $h$ and $w$ are the coordinates in the vertical and horizontal directions of the image, and $\mathcal{F}(x)$ is the frequency domain feature of the residual restored image and the fused restored image;
the formulas adopted for determining the amplitude spectrum and the phase spectrum from the frequency domain features comprise:
$$\mathcal{F}(x)=R(x)+jI(x)$$
wherein $R(x)$ is the real part of the frequency domain feature and $I(x)$ is the imaginary part of the frequency domain feature;
$$A(x)=\sqrt{R^{2}(x)+I^{2}(x)},\qquad P(x)=\arctan\left[\frac{I(x)}{R(x)}\right]$$
wherein $A(x)$ is the amplitude spectrum and $P(x)$ is the phase spectrum;
the formulas adopted for repairing the amplitude spectrum to obtain the repaired amplitude spectrum, and for determining the residual amplitude from the repaired amplitude spectrum and the amplitude spectrum, comprise:
$$A'(x)=A(x)\circledast k,\qquad A_{res}(x)=A'(x)-A(x)$$
wherein $A(x)$ is the amplitude spectrum, $A'(x)$ is the repaired amplitude spectrum, $A_{res}(x)$ is the residual amplitude between the repaired amplitude spectrum and the amplitude spectrum, $\circledast$ is the convolution operator, and $k$ is a filter;
the formulas adopted for determining the attention map from the residual amplitude, and for applying the phase change to the phase spectrum according to the attention map, comprise:
;
wherein $Att$ is the attention map, GAP is global average pooling, $P'(x)$ is the phase change spectrum, and $\odot$ is the element-wise product;
the formulas adopted for determining the frequency domain enhanced real part and the frequency domain enhanced imaginary part from the phase change spectrum and the repaired amplitude spectrum comprise:
$$R'(x)=A'(x)\cos P'(x),\qquad I'(x)=A'(x)\sin P'(x)$$
wherein $A'(x)$ is the repaired amplitude spectrum, $P'(x)$ is the phase change spectrum, $R'(x)$ is the frequency domain enhanced real part, and $I'(x)$ is the frequency domain enhanced imaginary part;
the formula adopted for determining the frequency domain enhanced features from the frequency domain enhanced real part and the frequency domain enhanced imaginary part comprises:
$$\mathcal{F}'(x)=R'(x)+jI'(x)$$
wherein $R'(x)$ and $I'(x)$ are the frequency domain enhanced real part and imaginary part, and $\mathcal{F}'(x)$ is the frequency domain enhanced feature of the residual enhanced image and the fused enhanced image.
2. The multi-branch frequency domain enhanced real image defogging method of claim 1, wherein the formulas adopted for data enhancement of the sample data set comprise:
$$I(x)=J(x)\,t(x)+A\,\bigl(1-t(x)\bigr),\qquad t(x)=e^{-\beta d(x)}$$
wherein $I(x)$ is the sample enhancement set, $J(x)$ is the real fog-free image of the sample data set, $A$ is the global atmospheric light, $t(x)$ is the medium transmittance map, $d(x)$ is the scene depth, and $\beta$ is the haze density;
;
wherein $\hat{I}_{a\to b}$ is the augmented image generated from hazy image $I_a$ and hazy image $I_b$, $I_a$ and $I_b$ are the hazy images synthesized from real haze-free image $J_a$ and real haze-free image $J_b$, $J_a$ and $J_b$ are the a-th and b-th real haze-free images in the sample data set, $\mathcal{A}$ is the amplitude of the Fourier transform, $\mathcal{P}$ is the phase component of the Fourier transform, $M$ is an image mask, and $\circ$ is the composition of functions.
3. The multi-branch frequency domain enhanced real image defogging method of claim 1, wherein the formulas adopted for determining the model loss from the defogged image and the real image of the sample data set are defined as follows:
;
wherein $X$ is the defogged image and $Y$ is the real image; the model loss combines a loss composed of the peak signal-to-noise ratio and the structural similarity index with an edge loss computed with the Laplacian operator, using three preset constants; PSNR is the peak signal-to-noise ratio, $PSNR=10\log_{10}\left(MAX_Y^{2}/MSE\right)$, where $MAX_Y$ is the maximum pixel value that an image can take and MSE is the mean square error between the defogged image and the corresponding real image; SSIM is the structural similarity, $SSIM(X,Y)=[l(X,Y)]^{\alpha}[c(X,Y)]^{\beta}[s(X,Y)]^{\gamma}$, where $l$, $c$ and $s$ are the luminance, contrast and structure comparisons between the real image and the defogged image, and $\alpha$, $\beta$ and $\gamma$ are hyperparameters.
4. A multi-branch frequency domain enhanced real image defogging system, characterized by applying the multi-branch frequency domain enhanced real image defogging method of any of claims 1 to 3, the system comprising:
a data enhancement module, configured to acquire a sample data set and perform data enhancement on the sample data set to obtain a sample enhancement set;
a feature extraction module, configured to input the sample enhancement set into an image defogging model to perform multi-branch network feature extraction to obtain sample image features, and to perform semantic segmentation on the sample image features of each branch network to obtain semantic segmentation features;
to perform residual attention processing on the semantic segmentation features of each branch network to obtain mask feature maps and residual output images, and to fuse the mask feature maps of the branch networks to obtain fused features;
and to apply mixed skip connections to the residual output images of the branch networks to obtain a residual image, and to perform image restoration on the residual image and the fused features to obtain a residual restored image and a fused restored image;
a frequency domain enhancement module, configured to perform frequency domain enhancement on the residual restored image and the fused restored image to obtain a residual enhanced image and a fused enhanced image, and to fuse the residual enhanced image and the fused enhanced image to obtain a defogged image;
a parameter updating module, configured to determine a model loss from the defogged image and the real image of the sample data set, and to update parameters of the image defogging model according to the model loss until the image defogging model converges;
and a defogging processing module, configured to input an image to be defogged into the converged image defogging model for defogging to obtain a defogged output image.
5. The multi-branch frequency domain enhanced real image defogging system of claim 4, wherein the frequency domain enhancement module is further configured to:
acquire spatial domain features of the residual restored image and the fused restored image, and determine frequency domain features from the spatial domain features;
determine an amplitude spectrum and a phase spectrum from the frequency domain features, and repair the amplitude spectrum to obtain a repaired amplitude spectrum;
determine a residual amplitude from the repaired amplitude spectrum and the amplitude spectrum, and determine an attention map from the residual amplitude;
apply a phase change to the phase spectrum according to the attention map to obtain a phase change spectrum, and determine a frequency domain enhanced real part and a frequency domain enhanced imaginary part from the phase change spectrum and the repaired amplitude spectrum;
determine frequency domain enhanced features from the frequency domain enhanced real part and the frequency domain enhanced imaginary part, and determine spatial domain enhanced features from the frequency domain enhanced features;
and perform feature enhancement on the residual restored image and the fused restored image according to the spatial domain enhanced features to obtain the residual enhanced image and the fused enhanced image.
6. A terminal device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the method according to any of claims 1 to 3 when the computer program is executed.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410252937.1A CN117853371B (en) | 2024-03-06 | 2024-03-06 | Multi-branch frequency domain enhanced real image defogging method, system and terminal |
Publications (2)
Publication Number | Publication Date |
---|---|
CN117853371A (en) | 2024-04-09 |
CN117853371B (en) | 2024-05-31 |
Family
ID=90532825
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202410252937.1A Active CN117853371B (en) | 2024-03-06 | 2024-03-06 | Multi-branch frequency domain enhanced real image defogging method, system and terminal |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117853371B (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||