CN113658057A - Swin Transformer low-light-level image enhancement method - Google Patents
- Publication number
- CN113658057A (application CN202110805770.3A)
- Authority
- CN
- China
- Prior art keywords
- image
- module
- input
- swin
- msa
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G06T5/70—
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G06T5/90—
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10024—Color image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
Abstract
The invention discloses a Swin Transformer low-light-level image enhancement method comprising the following steps: step 1, constructing a preprocessing module whose input is an original low-light-level image and whose output is a feature map; step 2, constructing a Swin Transformer module whose input is the output features of step 1 and whose output is the extracted image features; step 3, constructing a recovery module whose input is the output features of step 2 and whose output is an enhanced, high-quality, noise-free color image. The method solves the problems of low visibility, low contrast, noise pollution and color distortion of low-light-level images in the prior art.
Description
Technical Field
The invention belongs to the technical field of image processing, in particular to RGB true-color image recovery technology, and relates to a Swin Transformer low-light-level image enhancement method.
Background
When imaging under low-light conditions, few photons enter the lens, so the resulting image is dark and information such as color and target structure is hard to distinguish. Extending the exposure time raises image brightness to some extent, but the optical sensor then tends to generate a large amount of noise. Directly acquired low-light-level images therefore mostly suffer from degradations such as low contrast, high noise, color distortion and blurred detail. Their poor visual perception quality also affects subsequent image processing tasks such as image segmentation, target recognition and video monitoring. Low-light-level image enhancement uses a series of mathematical methods to restore directly acquired low-light-level images to a normal illumination level, improving visual perception quality and greatly improving the precision of subsequent image processing tasks, and it has therefore become a research hotspot in the field of image processing.
Early low-light image enhancement methods were mainly based on histogram equalization (HE) and Retinex theory. HE is a histogram modification method based on the cumulative distribution function: it adjusts the image histogram toward a uniform distribution to stretch the dynamic range of the image and thereby improve contrast. The method is simple and efficient, but the generated images are prone to artifacts and look unrealistic. Retinex-based methods instead decompose the input image into a reflection component, which is an inherent property of the scene, and an illumination component, which is affected by ambient illuminance; they typically enhance the illumination component of a low-light image to approximate the corresponding normal-light image. The parameters of such models must be set manually, so they cannot adapt to the diversity of images, they perform poorly on high-noise images, and they can leave local details underexposed or overexposed.
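As an illustration of the HE approach described above, the following is a minimal sketch that assumes an 8-bit grayscale image held in a NumPy array. The function name `equalize_histogram` and the synthetic test image are assumptions made for this example; they do not come from the patent.

```python
import numpy as np

def equalize_histogram(gray: np.ndarray) -> np.ndarray:
    """Equalize an 8-bit grayscale image using its cumulative distribution function."""
    hist = np.bincount(gray.ravel(), minlength=256)   # per-level pixel counts
    cdf = np.cumsum(hist).astype(np.float64)          # cumulative distribution
    cdf_min = cdf[cdf > 0][0]                         # CDF value at the darkest occupied level
    denom = cdf[-1] - cdf_min
    if denom == 0:                                    # constant image: nothing to stretch
        return gray.copy()
    # Map each gray level so that the occupied range is stretched to [0, 255].
    lut = np.round((cdf - cdf_min) / denom * 255.0)
    lut = np.clip(lut, 0, 255).astype(np.uint8)
    return lut[gray]

# Hypothetical dark image: all pixel values fall in [0, 40).
rng = np.random.default_rng(0)
dark = rng.integers(0, 40, size=(32, 32), dtype=np.uint8)
bright = equalize_histogram(dark)
```

The lookup table is built once per image and applied to every pixel, which is why the method is cheap; it is also global, which is one reason it can introduce the artifacts mentioned above.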
With the rapid development of artificial intelligence theory, deep-learning-based low-light-level image enhancement algorithms have been proposed in succession in recent years. Although these methods make up for the shortcomings of traditional methods to a certain extent and achieve good enhancement on certain image sets, most of them depend heavily on the quality of the data set, and they either assume that dark regions are noise-free or do not consider how noise is distributed across regions of different illumination. In practice these priors deviate from real images, and complete real-image data sets are difficult to acquire, so existing deep learning models cannot effectively suppress real image noise and struggle to produce satisfactory visual quality.
Research on traditional models and deep learning models for low-light-level image enhancement reveals two challenging problems in enhancing real low-light-level images: spatially independent low illumination, and non-uniform noise. Statistics show that the spatial feature distribution of a real low-light-level image is complex: the number of photons entering the lens differs greatly between spatial positions, and illumination varies strongly across the image. Most existing deep learning methods can effectively improve illumination on artificially generated data sets, but for low illumination that is independent of spatial distribution, their enhancement of overall image visibility and of underexposed regions is not ideal. In addition, traditional models cannot handle the non-uniform noise introduced during image acquisition well, and deep learning models cannot achieve an ideal result by simply cascading noise reduction. For example, denoising before enhancement loses some image detail and makes high-noise pixel information hard to reconstruct, while denoising after enhancement easily blurs the image. How to suppress noise effectively while recovering the information hidden in the dark is therefore another challenge for current low-light enhancement models.
Disclosure of Invention
The invention aims to provide a Swin Transformer low-light-level image enhancement method that solves the problems of low visibility, low contrast, noise pollution and color distortion of low-light-level images in the prior art.
The technical scheme adopted by the invention is a Swin Transformer low-light-level image enhancement method, implemented according to the following steps:
step 1, constructing a preprocessing module, wherein the input of the preprocessing module is an original low-light-level image of size H × W × 3; the output of the preprocessing module is a feature map of size H/4 × W/4 × 96;
step 2, constructing a Swin Transformer module, wherein the input data of the Swin Transformer module is the output features of step 1, of size H/4 × W/4 × 96; the output of the Swin Transformer module is the extracted image features, of size H/4 × W/4 × 96;
step 3, constructing a recovery module, wherein the input data of the recovery module is the output features of step 2, of size H/4 × W/4 × 96; the output of the recovery module is an enhanced, high-quality, noise-free color image of size H × W × 3.
The advantage of the method is that a low-light-level image can be effectively restored to an image as acquired under normal illumination, while the texture details, color information and other properties of the image are preserved.
Drawings
FIG. 1 is a block flow diagram of the method of the present invention;
FIG. 2 is a flow chart of the structure of a pre-processing module constructed in the method of the present invention;
FIG. 3 is a flow chart of the structure of a Swin Transformer module constructed in the method of the present invention;
FIG. 4 is a flow chart of the structure of the recovery module constructed in the method of the present invention.
Detailed Description
The present invention will be described in detail below with reference to the accompanying drawings and specific embodiments.
The invention provides a low-light-level image enhancement method based mainly on the Swin Transformer model. The overall idea is as follows: first, the input image is preprocessed, converting the image data into image features that the Swin Transformer model can readily extract information from; next, the resulting image features are input into the Swin Transformer model for feature extraction; finally, the recovery module restores the extracted image features to the size of the original low-light-level image and outputs the enhanced result.
Referring to FIG. 1, the method of the invention is based on a Swin Transformer low-light-level image enhancement network (hereinafter, the network), which comprises a preprocessing module, a Swin Transformer module and a recovery module. The preprocessing module comprises two steps, Patch Partition and Linear Embedding; it takes the original low-light-level image directly as input and converts it into preprocessed image features. Patch Partition divides the original low-light-level image into non-overlapping patches of size 4 × 4, so that the feature dimension of each patch becomes 4 × 4 × 3 = 48; the Linear Embedding layer then projects this feature dimension to a set value, which is 96 in the implementation of the invention. The Swin Transformer module divides the input feature map into windows using a shifted-window strategy. Within each window, a multi-head self-attention mechanism is applied independently to model the dependencies between input and output, thereby extracting global image features. Shifting the window positions solves the problem of information interaction between different windows. The recovery module likewise comprises two steps, Patch Expanding and Linear: Patch Expanding restores the feature size so that it matches that of the original input low-light-level image, and Linear maps the feature dimension so that the output has 3 channels.
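The tensor sizes through the three modules can be traced with a few lines of arithmetic. The sketch below is illustrative only: the function name `trace_shapes`, the stage labels and the 224 × 224 input size are assumptions for this example, while the patch size 4, the embedding dimension 96 and the 3 output channels come from the text.

```python
def trace_shapes(height, width, patch=4, embed_dim=96, out_channels=3):
    """Return the (H, W, C) feature shape after each stage of the network."""
    if height % patch or width % patch:
        raise ValueError("height and width must be multiples of the patch size")
    h, w = height // patch, width // patch
    return {
        "input": (height, width, 3),
        "patch_partition": (h, w, patch * patch * 3),       # 4 * 4 * 3 = 48
        "linear_embedding": (h, w, embed_dim),               # projected to 96
        "swin_transformer": (h, w, embed_dim),               # size unchanged
        "patch_expanding": (height, width, embed_dim // (patch * patch)),
        "linear": (height, width, out_channels),             # back to 3 channels
    }

# Hypothetical input size chosen for illustration.
shapes = trace_shapes(224, 224)
for stage, shape in shapes.items():
    print(f"{stage:>17}: {shape}")
```

Under these assumptions the Patch Expanding stage yields 96 / 16 = 6 channels at full resolution, which the final Linear step maps to 3.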
The method of the invention is implemented by utilizing the network framework according to the following steps:
step 1, constructing a preprocessing module, wherein the input of the preprocessing module is an original low-light-level image of size H × W × 3; the output of the preprocessing module is a feature map of size H/4 × W/4 × 96.
Referring to FIG. 2, the preprocessing module preprocesses the data of the original image. Its structure is, in sequence: the original low-light-level image (Input_image) as the input image → Patch Partition layer (Conv 4 × 4 × 48) → Linear Embedding layer (Linear, H/4 × W/4 × 96) → output feature (Output_feature).
The Patch Partition layer is a convolution operation with a kernel size of 4 × 4, a stride of 4 and 48 feature maps in total; the Linear Embedding layer performs feature mapping as a linear operation, with a convolution kernel size of H/4 × W/4 and 96 feature maps in total.
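The following sketch illustrates the data flow of the preprocessing module on a small array. It makes a simplifying assumption: the patch partition is done with a plain reshape that flattens each non-overlapping 4 × 4 × 3 patch into 48 values, rather than with the learned stride-4 convolution named in the text. The function names and the random weights are hypothetical.

```python
import numpy as np

def patch_partition(image: np.ndarray, patch: int = 4) -> np.ndarray:
    """Split an (H, W, C) image into non-overlapping patches, flattened per patch."""
    H, W, C = image.shape
    h, w = H // patch, W // patch
    x = image.reshape(h, patch, w, patch, C)      # split both spatial axes
    x = x.transpose(0, 2, 1, 3, 4)                # (h, w, patch, patch, C)
    return x.reshape(h, w, patch * patch * C)     # 4 * 4 * 3 = 48 values per patch

def linear_embedding(patches: np.ndarray, weight: np.ndarray, bias: np.ndarray) -> np.ndarray:
    """Project each patch vector to the embedding dimension."""
    return patches @ weight + bias

rng = np.random.default_rng(0)
image = rng.random((8, 12, 3))                     # small stand-in for an H x W x 3 image
patches = patch_partition(image)                   # shape (2, 3, 48)
weight = rng.standard_normal((48, 96)) * 0.02      # untrained, illustrative weights
bias = np.zeros(96)
features = linear_embedding(patches, weight, bias) # shape (2, 3, 96)
```

A stride-4 convolution with a 4 × 4 kernel is itself a linear map applied to each such flattened patch, so the reshape shows which pixels each output position sees.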
Step 2, constructing a Swin Transformer module, wherein the input data of the Swin Transformer module is the output features of step 1, of size H/4 × W/4 × 96; the output of the Swin Transformer module is the extracted image features, of size H/4 × W/4 × 96.
The low-light-level image enhancement network is designed on the basis of the Swin Transformer: it uses a self-attention mechanism to model the dependencies between features at different spatial positions, captures global context information effectively, and so has better feature extraction capability. Reducing the stacking of convolution layers maintains precision while greatly improving processing speed. The Swin Transformer module performs attention weighting according to the correlation between pairs of spatial position features, blending local features and global information in the network; its structure avoids the CNN approach of stacking convolution layers to obtain global information, so the model can perform well.
The internal structure of a single Swin Transformer module is consistent with the paper "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows".
Referring to FIG. 3, the structure of the Swin Transformer module is, in sequence: the output feature of step 1 as the input feature → LN regularization layer → W-MSA submodule (window multi-head self-attention layer) or SW-MSA submodule (shifted-window multi-head self-attention layer) → residual connection layer → LN regularization layer → feedforward network → residual connection layer → output feature. The Swin Transformer module is repeated 6 times in total, with odd and even layers alternating: the three odd layers use W-MSA submodules and the three even layers use SW-MSA submodules.
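The layer ordering and the alternation of the two attention variants can be sketched as follows. This is an outline only: `block_schedule` and `swin_block` are hypothetical names, and the normalization, attention and feedforward steps are passed in as callables rather than implemented.

```python
def block_schedule(depth: int = 6) -> list:
    """Odd-numbered layers use W-MSA, even-numbered layers use SW-MSA."""
    return ["W-MSA" if layer % 2 == 1 else "SW-MSA" for layer in range(1, depth + 1)]

def swin_block(x, norm1, attention, norm2, feed_forward):
    """One block: LN -> (S)W-MSA -> residual -> LN -> feedforward -> residual."""
    x = x + attention(norm1(x))       # first residual connection
    x = x + feed_forward(norm2(x))    # second residual connection
    return x

schedule = block_schedule()

# Toy check with scalar stand-ins for the real layers.
identity = lambda v: v
result = swin_block(1.0, identity, lambda v: 2 * v, identity, identity)
```

Passing the sublayers as arguments keeps the residual wiring visible without committing to any particular normalization or attention implementation.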
Referring to FIG. 3, the LN regularization layer performs LN regularization, normalizing the input data to between 0 and 1 so that the data distribution of the input layer is the same; the residual connection layer performs residual connection, which addresses the problems of vanishing gradients and weight matrix degradation; the feedforward network consists of two feedforward layers, the first of which maps the input vector from $d_{model}$ dimensions to $4 \times d_{model}$ dimensions with a ReLU activation function, and the second of which maps from $4 \times d_{model}$ dimensions back to $d_{model}$ dimensions without an activation function. The feedforward network is expressed as formula (1):
$$\mathrm{FFN}(x) = \max(0,\ xW_1 + b_1)\,W_2 + b_2 \tag{1}$$
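Formula (1) translates directly into array code. The sketch below assumes $d_{model} = 96$ (the embedding dimension used elsewhere in the text) and 49 tokens per 7 × 7 window; the weights are random placeholders, not trained values.

```python
import numpy as np

def feed_forward(x, w1, b1, w2, b2):
    """FFN(x) = max(0, x W1 + b1) W2 + b2, applied to each token independently."""
    hidden = np.maximum(0.0, x @ w1 + b1)   # expand to 4 * d_model, then ReLU
    return hidden @ w2 + b2                 # project back to d_model, no activation

d_model = 96
rng = np.random.default_rng(0)
w1 = rng.standard_normal((d_model, 4 * d_model)) * 0.02
b1 = np.zeros(4 * d_model)
w2 = rng.standard_normal((4 * d_model, d_model)) * 0.02
b2 = np.zeros(d_model)

tokens = rng.standard_normal((49, d_model))  # e.g. the tokens of one 7 x 7 window
out = feed_forward(tokens, w1, b1, w2, b2)
```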
The W-MSA submodule (window multi-head self-attention layer) first divides the input features into windows; the window size set in the embodiment of the invention is 7 × 7, and multi-head self-attention is computed within each window. The W-MSA submodule maps the input features into different subspaces, performs dot-product operations in each subspace to compute attention vectors, and finally concatenates the attention vectors computed in all subspaces and maps them back to the original input space to obtain the final attention vector as output. The W-MSA submodule is expressed as formula (2):
$$\mathrm{MultiHead}(Q, K, V) = \mathrm{Concat}(\mathrm{head}_1, \ldots, \mathrm{head}_h)\,W^O$$
$$\mathrm{head}_i = \mathrm{Attention}(QW_i^Q,\ KW_i^K,\ VW_i^V) \tag{2}$$
where Q, K and V are the inputs of the W-MSA submodule, namely the query vector, key vector and value vector; $W_i^Q$, $W_i^K$ and $W_i^V$ are the mapping matrices of Q, K and V in the different subspaces; and the number of subspaces h is set to 8 in this step. The attention vector in a single subspace is computed as follows: the query vector Q is dot-multiplied with the key vector K and the result is divided by the square root $\sqrt{d_k}$ of the dimension of the key vector K to obtain the score matrix of the query vector Q; the score matrix is then normalized by a softmax function to obtain a weight matrix, which is multiplied by the value vector V to obtain the attention vector of the subspace, as in formula (3):
$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^{T}}{\sqrt{d_k}}\right)V \tag{3}$$
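The multi-head computation described above can be sketched for the tokens of a single window. The example assumes h = 8 heads and a model dimension of 96, so each head works in a 12-dimensional subspace; the per-head dimension, the function names and the random weights are assumptions for illustration. The sketch also omits the relative position bias used in the referenced Swin Transformer paper.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = k.shape[-1]
    scores = q @ np.swapaxes(k, -1, -2) / np.sqrt(d_k)   # score matrix
    return softmax(scores) @ v                           # weights times values

def multi_head(x, wq, wk, wv, wo):
    """Self-attention over x: (tokens, d_model), with per-head projections."""
    heads = [attention(x @ wq[i], x @ wk[i], x @ wv[i]) for i in range(wq.shape[0])]
    return np.concatenate(heads, axis=-1) @ wo           # concatenate, then map back

rng = np.random.default_rng(0)
d_model, h = 96, 8
d_head = d_model // h                                    # 12 dimensions per subspace
window_tokens = rng.standard_normal((49, d_model))       # one 7 x 7 window
wq, wk, wv = (rng.standard_normal((h, d_model, d_head)) * 0.02 for _ in range(3))
wo = rng.standard_normal((h * d_head, d_model)) * 0.02
out = multi_head(window_tokens, wq, wk, wv, wo)
```

Because attention is restricted to the 49 tokens of a window, the score matrix is 49 × 49 regardless of the image size.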
By mapping the input features to different subspaces and then computing attention vectors, the W-MSA submodule (window multi-head self-attention layer) captures the dependencies of the features in different subspaces, so the final attention vector captures the dependencies between features more comprehensively and from multiple perspectives.
Using the W-MSA submodule alone gives the network very poor modeling ability, because it treats each window as an independent region and ignores the necessary interaction between windows. For this reason, the Swin Transformer in the method of the invention further introduces the SW-MSA submodule (shifted-window multi-head self-attention layer). Before the image features are input to the SW-MSA submodule, they are shifted by half the window size in pixels, and the W-MSA submodule operation is then performed. As a result, a window at the same position contains different image feature information from that in the W-MSA division, which solves the problem of information interaction between different windows. The specific operation flow is as follows:
the output features of step 1 are cyclically shifted up and to the left by half the window size, and windows are then divided on the shifted features in the same way as in the W-MSA submodule, giving window contents different from those of W-MSA; the W-MSA submodule operation is then performed, after which the resulting image features are cyclically shifted down and to the right by half the window size to restore their original positions.
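The shift, partition and restore steps above can be sketched with `np.roll`. The example assumes a 7 × 7 window and takes "half the window size" as 7 // 2 = 3 pixels; the function names are hypothetical, the per-window operation is passed in as `fn`, and the attention mask that the referenced paper applies to wrapped-around regions is omitted.

```python
import numpy as np

def window_partition(x, window=7):
    """Split (H, W, C) features into (num_windows, window * window, C) token groups."""
    H, W, C = x.shape
    x = x.reshape(H // window, window, W // window, window, C)
    return x.transpose(0, 2, 1, 3, 4).reshape(-1, window * window, C)

def window_reverse(windows, H, W, window=7):
    """Inverse of window_partition."""
    C = windows.shape[-1]
    x = windows.reshape(H // window, W // window, window, window, C)
    return x.transpose(0, 2, 1, 3, 4).reshape(H, W, C)

def shifted_window_apply(x, fn, window=7):
    """Cyclically shift up-left, apply fn per window, then shift back down-right."""
    shift = window // 2
    H, W, _ = x.shape
    shifted = np.roll(x, shift=(-shift, -shift), axis=(0, 1))   # up and left
    windows = fn(window_partition(shifted, window))             # e.g. W-MSA per window
    merged = window_reverse(windows, H, W, window)
    return np.roll(merged, shift=(shift, shift), axis=(0, 1))   # down and right

x = np.arange(14 * 14 * 2, dtype=np.float64).reshape(14, 14, 2)
restored = shifted_window_apply(x, lambda w: w)   # identity stands in for attention
```

With the identity in place of attention, the round trip returns the input unchanged, which confirms that the second shift restores the original positions.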
Referring to FIG. 4, the recovery module restores the image features extracted by the Swin Transformer module to the size of the original input low-light-level image and outputs an enhanced, high-quality, noise-free color image. Its structure is, in sequence: the output feature of step 2 as the input (Input_feature) → Patch Expanding layer (performing a rearrange operation) → Linear layer (Linear, H × W × 3) → output image (Output_image).
The Patch Expanding layer performs a rearrange operation that expands the resolution of the input features to 4 times the input resolution and reduces the feature dimension to 1/16 of the input dimension; the Linear layer performs feature mapping as a linear operation, with a convolution kernel size of H × W and 3 feature maps in total.
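A resolution-for-channels rearrangement of the kind described can be sketched as below: 96 channels at H/4 × W/4 become 6 channels at H × W, which a linear map then reduces to 3. The exact channel ordering inside the rearrangement is an assumption, as are the function name and the random weights.

```python
import numpy as np

def patch_expand(x: np.ndarray, scale: int = 4) -> np.ndarray:
    """Rearrange (h, w, c) features into (h * scale, w * scale, c / scale^2)."""
    h, w, c = x.shape
    c_out = c // (scale * scale)                 # 96 / 16 = 6
    x = x.reshape(h, w, scale, scale, c_out)     # split channels into a scale x scale grid
    x = x.transpose(0, 2, 1, 3, 4)               # interleave grid rows and columns spatially
    return x.reshape(h * scale, w * scale, c_out)

rng = np.random.default_rng(0)
features = rng.standard_normal((2, 3, 96))       # stand-in for H/4 x W/4 x 96 features
expanded = patch_expand(features)                # shape (8, 12, 6)
w_out = rng.standard_normal((6, 3)) * 0.02       # untrained, illustrative weights
image = expanded @ w_out + np.zeros(3)           # shape (8, 12, 3)
```

The rearrangement only moves values, so no information is added or lost at this step; the number of elements stays the same.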
When training the Swin Transformer-based low-light-level image enhancement network, the following is considered: the $L_1$ loss function performs well on the contrast of target contours and the smoothness of uniform regions; the SSIM loss function introduces a structural constraint and so restores the structure and local details of the image well; and the perceptual loss function constrains the difference between the real image and the predicted image, preserving the fidelity of image perception and detail. In this step, the $L_1$ loss function, the SSIM loss function and the perceptual loss function are combined as the total loss function of the Swin Transformer-based low-light-level image enhancement network, expressed as:
$$L_{total} = (1 - \lambda_s - \lambda_p)\,L_1 + \lambda_s L_{ssim} + \lambda_p L_{perc}$$
where $L_1$ denotes the pixel-level $L_1$ norm loss, $L_{ssim}$ denotes the structural similarity loss, $L_{perc}$ denotes the perceptual loss, and $\lambda_s$ and $\lambda_p$ are the corresponding coefficients, with values in the range [0, 1], preferably $\lambda_s = 0.2$ and $\lambda_p = 0.1$.
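The weighting of the three terms can be checked numerically. In this sketch the three component losses are taken as already-computed numbers; the function name `total_loss` and the sample values 0.5, 0.3 and 0.2 are made up for illustration, while the default coefficients are the preferred values from the text.

```python
def total_loss(l1, l_ssim, l_perc, lambda_s=0.2, lambda_p=0.1):
    """L_total = (1 - lambda_s - lambda_p) * L1 + lambda_s * L_ssim + lambda_p * L_perc."""
    if not (0.0 <= lambda_s <= 1.0 and 0.0 <= lambda_p <= 1.0):
        raise ValueError("lambda_s and lambda_p must lie in [0, 1]")
    return (1.0 - lambda_s - lambda_p) * l1 + lambda_s * l_ssim + lambda_p * l_perc

# Hypothetical component values.
value = total_loss(l1=0.5, l_ssim=0.3, l_perc=0.2)
```

With the preferred coefficients the three weights are 0.7, 0.2 and 0.1, so they sum to 1 and the $L_1$ term dominates.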
The $L_1$ norm loss formula is [formula not preserved in the source text], where $I_{gt}$ denotes the real image, $I_h$ denotes the predicted image, and l denotes a non-zero constant, taken as $10^{-6}$;
The SSIM structural similarity loss formula is [formula not preserved in the source text], where $\mu_x$ and $\mu_y$ denote the pixel means of images x and y; $\sigma_{xy}$ denotes the covariance of images x and y; $\sigma_x^2$ and $\sigma_y^2$ denote the variances of images x and y; N denotes the total number of image samples; and $C_1$ and $C_2$ are constants;
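For reference, the standard SSIM index uses exactly the quantities named above (means, variances, a cross term and two stabilizing constants). The index itself is well established; the way it is turned into a loss below, as one minus the mean over N samples, is a common choice and an assumption here, not a statement of the patent's exact formula.

```latex
\mathrm{SSIM}(x, y) =
  \frac{(2\mu_x \mu_y + C_1)\,(2\sigma_{xy} + C_2)}
       {(\mu_x^2 + \mu_y^2 + C_1)\,(\sigma_x^2 + \sigma_y^2 + C_2)},
\qquad
L_{ssim} = 1 - \frac{1}{N} \sum_{i=1}^{N} \mathrm{SSIM}(x_i, y_i)
```

Under this form, identical images give an SSIM of 1 and therefore a loss of 0.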
The perceptual loss function formula is [formula not preserved in the source text], where $I_{gt}$ denotes the real image, $I_h$ denotes the predicted image, $C_j$ denotes the number of channels, $H_j$ and $W_j$ denote the height and width of the j-th feature map, and the feature map is the one obtained from the j-th convolutional layer of the pre-trained VGG16 model.
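A commonly used feature-reconstruction form of the perceptual loss is consistent with the symbols listed above. It is given here as an assumption about the likely shape of the missing formula; the notation $\phi_j$ for the j-th VGG16 feature map is introduced for this example.

```latex
L_{perc} = \frac{1}{C_j H_j W_j}
  \left\lVert \phi_j(I_{gt}) - \phi_j(I_h) \right\rVert_2^2
```

Dividing by $C_j H_j W_j$ makes the value independent of the size of the chosen feature map.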
Claims (5)
1. A Swin Transformer low-light-level image enhancement method, characterized by being implemented according to the following steps:
step 1, constructing a preprocessing module, wherein the input of the preprocessing module is an original low-light-level image of size H × W × 3; the output of the preprocessing module is a feature map of size H/4 × W/4 × 96;
step 2, constructing a Swin Transformer module, wherein the input data of the Swin Transformer module is the output features of step 1, of size H/4 × W/4 × 96; the output of the Swin Transformer module is the extracted image features, of size H/4 × W/4 × 96;
step 3, constructing a recovery module, wherein the input data of the recovery module is the output features of step 2, of size H/4 × W/4 × 96; the output of the recovery module is an enhanced, high-quality, noise-free color image of size H × W × 3.
2. The Swin Transformer low-light-level image enhancement method of claim 1, wherein: in step 1, the structure of the preprocessing module is, in sequence: the original low-light-level image as the input image → Patch Partition layer → Linear Embedding layer → output feature (Output_feature),
the Patch Partition layer is a convolution operation with a kernel size of 4 × 4, a stride of 4 and 48 feature maps in total; the Linear Embedding layer performs feature mapping as a linear operation, with a convolution kernel size of H/4 × W/4 and 96 feature maps in total.
3. The Swin Transformer low-light-level image enhancement method of claim 1, wherein: in step 2, the structure of the Swin Transformer module is, in sequence: the output feature of step 1 as the input feature → LN regularization layer → W-MSA submodule or SW-MSA submodule → residual connection layer → LN regularization layer → feedforward network → residual connection layer → output feature; the Swin Transformer module is repeated 6 times in total, with odd and even layers alternating, wherein the three odd layers use W-MSA submodules and the three even layers use SW-MSA submodules.
4. The Swin Transformer low-light-level image enhancement method of claim 3, wherein: the LN regularization layer performs LN regularization, normalizing the input data to between 0 and 1;
the residual connection layer performs residual connection, which addresses the problems of vanishing gradients and weight matrix degradation;
the feedforward network consists of two feedforward layers, the first of which maps the input vector from $d_{model}$ dimensions to $4 \times d_{model}$ dimensions with a ReLU activation function, and the second of which maps from $4 \times d_{model}$ dimensions back to $d_{model}$ dimensions without an activation function; the feedforward network is expressed as formula (1):
$$\mathrm{FFN}(x) = \max(0,\ xW_1 + b_1)\,W_2 + b_2 \tag{1}$$
the W-MSA submodule first divides the input features into windows and computes multi-head self-attention within each window; the W-MSA submodule maps the input features into different subspaces, performs dot-product operations in each subspace to compute attention vectors, and finally concatenates the attention vectors computed in all subspaces and maps them back to the original input space to obtain the final attention vector as output; the W-MSA submodule is expressed as formula (2):
$$\mathrm{MultiHead}(Q, K, V) = \mathrm{Concat}(\mathrm{head}_1, \ldots, \mathrm{head}_h)\,W^O$$
$$\mathrm{head}_i = \mathrm{Attention}(QW_i^Q,\ KW_i^K,\ VW_i^V) \tag{2}$$
where Q, K and V are the inputs of the W-MSA submodule, namely the query vector, key vector and value vector; $W_i^Q$, $W_i^K$ and $W_i^V$ are the mapping matrices of Q, K and V in the different subspaces; the attention vector in a single subspace is computed as follows: the query vector Q is dot-multiplied with the key vector K and the result is divided by the square root $\sqrt{d_k}$ of the dimension of the key vector K to obtain the score matrix of the query vector Q; the score matrix is then normalized by a softmax function to obtain a weight matrix, which is multiplied by the value vector V to obtain the attention vector of the subspace, as in formula (3):
$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^{T}}{\sqrt{d_k}}\right)V \tag{3}$$
by mapping the input features to different subspaces and then computing attention vectors, the W-MSA submodule captures the dependencies of the features in different subspaces, so the final attention vector captures the dependencies between features more comprehensively and from multiple perspectives;
in the SW-MSA submodule, the image features are shifted by half the window size in pixels before being input to the SW-MSA submodule, and the W-MSA submodule operation is then performed; the specific operation flow is as follows: the output features of step 1 are cyclically shifted up and to the left by half the window size, and windows are then divided on the shifted features in the same way as in the W-MSA submodule, giving window contents different from those of W-MSA; the W-MSA submodule operation is then performed, after which the resulting image features are cyclically shifted down and to the right by half the window size to restore their original positions.
5. The method of claim 1, wherein: in step 3, the recovery module restores the image features extracted by the Swin Transformer module to the size of the original input low-light-level image and outputs the enhanced, high-quality, noise-free color image; the structure of the recovery module is, in sequence: the output feature of step 2 as the input → Patch Expanding layer → Linear layer → output image;
the Patch Expanding layer performs a rearrange operation that expands the resolution of the input features to 4 times the input resolution and reduces the feature dimension to 1/16 of the input dimension; the Linear layer performs feature mapping as a linear operation, with a convolution kernel size of H × W and 3 feature maps in total,
when training a Swin transform-based low-light-level image enhancement network, L is considered1The loss function is better in the aspects of contrast of a target contour and smooth effect of a uniform region, meanwhile, the SSIM loss function introduces structural constraint to well restore the structure and local details of an image, the perception loss function can constrain the difference between a real image and a predicted image and keep the fidelity of image perception and details, and in the step, L is added1The + SSIM loss function + perception loss function are combined to be used as a total loss function of the Swin Transformer-based low-light-level image enhancement network, and are expressed as follows:
$$L_{total} = (1 - \lambda_s - \lambda_p) L_1 + \lambda_s L_{ssim} + \lambda_p L_{perc}$$
in the formula, $L_1$ represents the pixel-level $L_1$ norm loss, $L_{ssim}$ represents the structural similarity loss, $L_{perc}$ represents the perceptual loss, and $\lambda_s$ and $\lambda_p$ are the corresponding coefficients, each in the range [0, 1], preferably $\lambda_s = 0.2$ and $\lambda_p = 0.1$.
Wherein, in the $L_1$ norm loss formula, $I_{gt}$ represents the real image, $I_h$ represents the predicted image, and l represents a non-zero constant, taken as $10^{-6}$;
in the structural similarity (SSIM) loss formula, $\mu_x$ and $\mu_y$ represent the pixel means of images x and y, respectively; $\sigma_{xy}$ represents the covariance of images x and y; $\sigma_x^2$ and $\sigma_y^2$ represent the variances of images x and y, respectively; N represents the total number of image samples; and $C_1$ and $C_2$ are constants;
in the perceptual loss formula, $I_{gt}$ represents the real image, $I_h$ represents the predicted image, $C_j$ represents the channel dimension, $H_j$ and $W_j$ represent the height and width of the j-th feature map, respectively, and the feature map is the one obtained from the j-th convolutional layer of the pre-trained VGG16 model.
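The weighted combination of the three losses can be sketched in NumPy as follows. This is a sketch under stated assumptions, not the patented formulas: the $L_1$ term is written in a Charbonnier-style form in which the constant $10^{-6}$ acts as a stabiliser, the SSIM term uses the standard SSIM expression computed over the whole image rather than over local windows, and the perceptual term is passed in as a precomputed number because it requires a pre-trained VGG16 model. All function names and the default values of `c1` and `c2` are hypothetical.

```python
import numpy as np

def l1_loss(gt, pred, eps=1e-6):
    """Pixel-level L1-type loss; the eps stabiliser is an assumed Charbonnier form."""
    return float(np.mean(np.sqrt((gt - pred) ** 2 + eps ** 2)))

def global_ssim(x, y, c1=0.01 ** 2, c2=0.03 ** 2):
    """Standard SSIM expression evaluated once over the whole image (assumption)."""
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return float(numerator / denominator)

def ssim_loss(gt, pred):
    """Structural similarity loss taken as 1 - SSIM (assumed form)."""
    return 1.0 - global_ssim(gt, pred)

def total_loss(l1, l_ssim, l_perc, lam_s=0.2, lam_p=0.1):
    """L_total = (1 - lam_s - lam_p) * L1 + lam_s * L_ssim + lam_p * L_perc."""
    return (1.0 - lam_s - lam_p) * l1 + lam_s * l_ssim + lam_p * l_perc

# Example with random images in [0, 1]; the perceptual value is a stand-in number.
rng = np.random.default_rng(0)
gt = rng.random((16, 16, 3))
pred = np.clip(gt + 0.05 * rng.normal(size=gt.shape), 0.0, 1.0)
loss = total_loss(l1_loss(gt, pred), ssim_loss(gt, pred), l_perc=0.0)
```

With the preferred coefficients $\lambda_s = 0.2$ and $\lambda_p = 0.1$, the $L_1$ term receives the remaining weight of 0.7, so the three weights sum to 1.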
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110805770.3A CN113658057A (en) | 2021-07-16 | 2021-07-16 | Swin transform low-light-level image enhancement method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113658057A (en) | 2021-11-16 |
Family
ID=78477997
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110805770.3A (Pending) CN113658057A (en) | Swin transform low-light-level image enhancement method | 2021-07-16 | 2021-07-16 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113658057A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CA2933547A1 (en) * | 2005-10-07 | 2007-04-07 | Rearden Mova, Llc | Apparatus and method for performing motion capture using a random pattern on capture surfaces |
US20200234414A1 (en) * | 2019-01-23 | 2020-07-23 | Inception Institute of Artificial Intelligence, Ltd. | Systems and methods for transforming raw sensor data captured in low-light conditions to well-exposed images using neural network architectures |
CN111968044A (en) * | 2020-07-16 | 2020-11-20 | 中国科学院沈阳自动化研究所 | Low-illumination image enhancement method based on Retinex and deep learning |
CN112381897A (en) * | 2020-11-16 | 2021-02-19 | 西安电子科技大学 | Low-illumination image enhancement method based on self-coding network structure |
Non-Patent Citations (2)
Title |
---|
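刘超; 张晓晖: "Restoration of low-light-level images under ultra-low illumination using a deep convolutional autoencoder network", Optics and Precision Engineering (光学精密工程), no. 04 * |
张艳; 张明路; 蒋志宏; 吕晓玲: "Low-illumination image enhancement algorithm based on an improved LIP algorithm", Journal of Electronic Measurement and Instrumentation (电子测量与仪器学报), no. 11 * |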
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114140353A (en) * | 2021-11-25 | 2022-03-04 | 苏州大学 | Swin-Transformer image denoising method and system based on channel attention |
CN114140353B (en) * | 2021-11-25 | 2023-04-07 | 苏州大学 | Swin-Transformer image denoising method and system based on channel attention |
WO2023092813A1 (en) * | 2021-11-25 | 2023-06-01 | 苏州大学 | Swin-transformer image denoising method and system based on channel attention |
CN114283288A (en) * | 2021-12-24 | 2022-04-05 | 合肥工业大学智能制造技术研究院 | Method, system, equipment and storage medium for enhancing night vehicle image |
CN114283288B (en) * | 2021-12-24 | 2022-07-12 | 合肥工业大学智能制造技术研究院 | Method, system, equipment and storage medium for enhancing night vehicle image |
CN115330898A (en) * | 2022-08-24 | 2022-11-11 | 晋城市大锐金马工程设计咨询有限公司 | Improved Swin Transformer-based magazine, book and periodical advertisement embedding method |
CN116128768A (en) * | 2023-04-17 | 2023-05-16 | 中国石油大学(华东) | Unsupervised image low-illumination enhancement method with denoising module |
CN116385317A (en) * | 2023-06-02 | 2023-07-04 | 河北工业大学 | Low-dose CT image recovery method based on self-adaptive convolution and Transformer hybrid structure |
CN116385317B (en) * | 2023-06-02 | 2023-08-01 | 河北工业大学 | Low-dose CT image recovery method based on self-adaptive convolution and Transformer hybrid structure |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||