CN115035011A - Low-illumination image enhancement method for self-adaptive RetinexNet under fusion strategy - Google Patents
- Publication number: CN115035011A
- Application number: CN202210644966.3A
- Authority: CN (China)
- Prior art keywords: image, illumination, low, reflectivity, enhanced
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06T5/50 — Image enhancement or restoration by the use of more than one image, e.g. averaging, subtraction (G06T: Image data processing or generation, in general)
- G06N3/084 — Backpropagation, e.g. using gradient descent (G06N3/02: Neural networks; G06N3/08: Learning methods)
- G06T5/70
- G06T7/90 — Determination of colour characteristics (G06T7/00: Image analysis)
- G06T2207/20081 — Training; Learning (G06T2207/20: Special algorithmic details)
- G06T2207/20084 — Artificial neural networks [ANN]
- G06T2207/20221 — Image fusion; Image merging (G06T2207/20212: Image combination)
Abstract
The invention relates to the technical field of image processing, in particular to a low-illumination image enhancement method of a self-adaptive RetinexNet under a fusion strategy. The method comprises: inputting the V-channel image of a low-illumination image and a normal-light image into a DecomNet to obtain the illumination and reflectivity of each image; inputting the reflectivity and illumination of the low-illumination image into a RestorationNet, where the illumination guides noise reduction of the reflectivity to obtain the denoised reflectivity; inputting the reflectivity and illumination of the low-illumination image into an EnhanceNet to enhance the illumination of the low-illumination image and obtain the enhanced illumination; reconstructing the image to obtain a coarse enhanced image; and acquiring a virtual overexposed image of the low-illumination image and fusing it with the low-illumination image and the coarse enhanced image. The invention reduces the color distortion that follows image enhancement, and effectively suppresses noise while retaining the edge structure and detail information.
Description
Technical Field
The invention relates to the technical field of image processing, in particular to a low-illumination image enhancement method of a self-adaptive RetinexNet under a fusion strategy.
Background
The popularization of the internet has rapidly brought people into the information era, and people's demand for all kinds of information keeps growing. The information acquired through the human visual system accounts for approximately 75% of the total information people acquire, and images are one of the important carriers by which human eyes obtain information. As images are such an important information carrier, image processing technology has become a popular field of research, aiming to make images meet the needs of various application fields. In real life, factors such as the equipment and environment used to acquire images often yield low-quality pictures: some are caused by abnormal weather, others by equipment (such as underexposure). Such images generally suffer from overall darkness, poor contrast and lack of detail, which affects both viewing of the image content and subsequent use of the image.
In order to extract important information in a low-quality image as much as possible, it is necessary to perform image processing on such an image, and image processing techniques such as image enhancement have been developed. The content of the image is reproduced by utilizing the image enhancement technology, so that on one hand, the visual experience can be improved, and the visual appreciation requirements of people are met; on the other hand, the image enhancement is one of the preprocessing means of computer vision, the detail information in the image is reproduced, and the accuracy of production application such as detection and identification in the computer vision field, pathological characteristic information extraction in the biomedical field and the like can be greatly improved.
Image enhancement in low-illumination environments currently includes histogram equalization, wavelet-transform image enhancement, Retinex-theory enhancement and the like. By comparison, image enhancement based on Retinex theory gives a good enhancement effect on most images and has a wide application range: the Retinex method works well on night images, foggy images, low-illumination images and so on. However, it also exhibits fairly obvious color distortion and loses some edge detail information.
Disclosure of Invention
In order to improve the details and contrast of an image and enable the image to contain rich texture details and good visual effect, the invention provides a low-illumination image enhancement method of a self-adaptive RetinexNet under a fusion strategy, which specifically comprises the following steps:
acquiring an original image and a synthesized low-illumination image corresponding to the original image from historical data, taking the original image as a normal light image, and taking the synthesized low-illumination image as a low-illumination image;
inputting the V-channel image of the low-illumination image and the normal light image into a DecomNet to obtain the illumination and reflectivity of the normal light image and the illumination and reflectivity of the low-illumination image;
inputting the reflectivity and illumination of the obtained low-illumination image into a RestorationNet, and using the illumination to guide the reflectivity to reduce noise to obtain the reflectivity after noise reduction;
inputting the reflectivity of the low-illumination image and illumination into an EnhanceNet, and enhancing the illumination of the low-illumination image to obtain enhanced illumination;
reconstructing an image, namely synthesizing an RGB image from the H channel, V channel and S channel of the optimized image to obtain the coarse enhanced image;
and acquiring a virtual overexposure image of the low-illumination image, and fusing the low-illumination image, the rough enhanced image and the virtual overexposure image to obtain a final optimized enhanced image.
Further, before the image is input into the DecomNet, color channel conversion is carried out on the training set or on the real-time low-illumination image to be enhanced, converting the image from an RGB image into an HSV image.
Further, features of the images input into the DecomNet are extracted using 3 × 3 convolution kernels: features are extracted sequentially by 5 convolutional layers with ReLU and 3 × 3 kernels, where each convolutional layer extracts features and the ReLU maps the obtained features toward reflectivity and illumination. After this mapping, the features pass through one convolutional layer with a 3 × 3 kernel followed by a Sigmoid function, yielding a 4-channel image; the first 3 channels of this image are taken as the reflectivity R and the last channel as the illumination I. That is, R and I are projected from the feature space of the image by the final 3 × 3 convolutional layer, and the Sigmoid function constrains the projected values to the range [0, 1].
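The split of the final 4-channel projection into reflectivity and illumination can be sketched as follows (a minimal NumPy illustration of the output stage only, not the trained network; the array shapes are assumptions):

```python
import numpy as np

def sigmoid(x):
    # constrains the projected values to the range [0, 1]
    return 1.0 / (1.0 + np.exp(-x))

def split_decomnet_output(feat):
    """Split a (H, W, 4) projection into reflectivity R and illumination I.

    The first 3 channels are taken as the reflectivity R and the last
    channel as the illumination I, each constrained to [0, 1] by a sigmoid.
    """
    out = sigmoid(feat)
    return out[..., :3], out[..., 3:]

feat = np.random.randn(8, 8, 4)   # stand-in for the last conv layer's output
R, I = split_decomnet_output(feat)
```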
Further, enhancing the illumination of the low-illumination image, and obtaining the enhanced illumination includes:
splicing the illumination of the low-illumination image with the denoised reflectivity as the input of the EnhanceNet network;
acquiring context information in a large area of an input image through an encoder-decoder framework of an EnhanceNet network;
in the EnhanceNet network, the input image is down-sampled to different sizes by three down-sampling modules; for example, an original image of size 600 × 400 is reduced to 75 × 50 after 3 down-samplings;
the down-sampled images are respectively spliced with the context information, and the spliced result is reconstructed by up-sampling to obtain the enhanced illumination. Each splice is an element-wise summation; skip connections are introduced from each down-sampling block to its mirrored up-sampling block, followed by multi-scale splicing; the final scale is then adjusted by nearest-neighbor interpolation and connected to a channel feature map, and a final 3 × 3 convolution produces the final illumination map.
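The size bookkeeping of the three stride-2 down-samplings and the nearest-neighbor interpolation used for up-sampling can be sketched as follows (a NumPy illustration, under the assumption that each down-sampling module halves both dimensions):

```python
import numpy as np

def downsampled_size(h, w, times=3):
    # each stride-2 down-sampling module halves both dimensions
    for _ in range(times):
        h, w = h // 2, w // 2
    return h, w

def nearest_upsample(x, factor=2):
    # nearest-neighbour interpolation, as used in resize-convolution
    return np.repeat(np.repeat(x, factor, axis=0), factor, axis=1)
```

For the example in the text, a 600 × 400 image shrinks to 75 × 50 after three such down-samplings.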
Further, DecomNet is trained by backpropagation through its loss function. The loss function L_1 of DecomNet is composed of a reconstruction loss function L_recon, a reflection-component consistency loss function L_ir, and a structure smoothness loss function L_is, expressed as:
L_1 = L_recon + λ_ir·L_ir + λ_is·L_is;
L_recon = Σ_{i∈{low,normal}} Σ_{j∈{low,normal}} λ_ij·‖R_i∘I_j − S_j‖_1;
L_ir = ‖R_low − R_normal‖_1;
where λ_ir denotes the reflectivity consistency coefficient and λ_is the illumination smoothness coefficient; low denotes the low-light image dataset and normal the normal-light image dataset; λ_ij is the balancing coefficient of the reconstruction loss; R_i is the reflectivity when i equals low or normal; I_j is the illumination of the low-light image when j = low and of the normal-light image when j = normal; S_j likewise denotes the low-light image when j = low and the normal-light image when j = normal; R_low is the reflectivity of the low-illumination image and R_normal the reflectivity of the normal-light image; ∇ denotes taking the gradient; λ_g is the balancing coefficient of structure-aware intensity; ‖·‖_1 denotes the 1-norm (in L_recon it measures the reconstruction loss over the whole training image, and in L_ir the difference between the reflectivity maps of the training images); ‖·‖_2 denotes the 2-norm, used to compute modular length;
RestorationNet is trained by backpropagation through its loss function, expressed as:
L_res = ‖R̂ − R_h‖_2² − SSIM(R̂, R_h);
where L_res is the loss function of RestorationNet; R̂ is the reflectivity map after noise reduction; R_h is the reflectivity of the normal-light image; and SSIM(R̂, R_h) is the structural similarity measure between R̂ and R_h;
EnhanceNet is trained by backpropagation through its loss function, which is composed of the reconstruction loss function L_recon and the structure smoothness loss function L_is, expressed as:
L_2 = L_recon + λ_is·L_is;
where L_2 is the loss function of EnhanceNet.
Further, after obtaining the enhanced V-channel image, the S-channel image is adaptively adjusted; the adjustment process is expressed as:
s′(x, y) = s(x, y) + t·[v′(x, y) − v(x, y)]·λ(x, y);
where s′(x, y) is the saturation of the pixel at row x, column y of the coarse enhanced image; s(x, y) is the saturation of the pixel at row x, column y of the low-illumination image; v′(x, y) is the brightness of the pixel at row x, column y of the coarse enhanced image; v(x, y) is the brightness of the pixel at row x, column y of the low-illumination image; t is a proportionality constant; and λ(x, y) is the correlation coefficient of v(x, y) and s(x, y).
Further, the correlation coefficient λ(x, y) of v(x, y) and s(x, y) is expressed as:
λ(x, y) = [ (1/n²)·Σ_{(p,q)∈w} (v(p, q) − v̄(x, y))·(s(p, q) − s̄(x, y)) ] / √(δ_v(x, y)·δ_s(x, y));
where v(p, q) is the brightness of the pixel at position (p, q) in the neighborhood window of pixel (x, y), and s(p, q) is the saturation of the pixel at position (p, q) in that window; v̄(x, y) is the mean brightness of pixel (x, y) over the neighborhood window w, and s̄(x, y) is the mean saturation of pixel (x, y) over w; δ_v(x, y) is the brightness variance of pixel (x, y) over w, and δ_s(x, y) the saturation variance over w; and w is an n × n window centered on pixel (x, y).
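The adaptive saturation adjustment above can be sketched as follows (a direct NumPy translation; the window size n = 3 and the guard against zero variance are assumptions):

```python
import numpy as np

def adjust_saturation(v, v_enh, s, t=0.4, n=3):
    """s'(x,y) = s(x,y) + t*[v'(x,y) - v(x,y)] * lambda(x,y).

    lambda(x,y) is the local correlation coefficient of brightness and
    saturation over an n x n neighborhood window centered on (x, y).
    """
    h, w = v.shape
    r = n // 2
    s_out = s.copy()
    for y in range(h):
        for x in range(w):
            y0, y1 = max(0, y - r), min(h, y + r + 1)
            x0, x1 = max(0, x - r), min(w, x + r + 1)
            wv, ws = v[y0:y1, x0:x1], s[y0:y1, x0:x1]
            cov = np.mean((wv - wv.mean()) * (ws - ws.mean()))
            denom = np.sqrt(wv.var() * ws.var())
            lam = cov / denom if denom > 1e-8 else 0.0
            s_out[y, x] = s[y, x] + t * (v_enh[y, x] - v[y, x]) * lam
    return np.clip(s_out, 0.0, 1.0)

v = np.linspace(0.0, 1.0, 16).reshape(4, 4)
s = np.full((4, 4), 0.5)
unchanged = adjust_saturation(v, v, s)   # no brightness change -> no shift
boosted = adjust_saturation(v, np.clip(v + 0.2, 0.0, 1.0), s)
```

When the brightness is unchanged the saturation stays put, which is a quick sanity check on the formula.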
Further, a virtual overexposure image is obtained using the camera response model, as represented by:
P=f(E);
where P is the image obtained by camera imaging, namely the virtual overexposed image; E is the irradiance of the low-illumination image; and f is the nonlinear response function of the camera.
Further, the process of fusing the original low-illumination image, the rough enhanced image and the virtual overexposure image to obtain a final optimized enhanced image comprises the following steps:
decomposing the original low-illumination image, the coarse enhanced image and the virtual overexposed image into image blocks, and column-vectorizing each block;
taking the maximum block signal strength over the column-vectorized images as the desired signal strength, denoted ĉ:
ĉ = max_k ‖x̃_k‖_2;
obtaining the desired block structure of the column-vectorized images, denoted ŝ and expressed as:
ŝ = ( Σ_k W(x̃_k)·s_k ) / ( Σ_k W(x̃_k) ), with s_k = x̃_k / ‖x̃_k‖_2;
where W(x̃_k) = ‖x̃_k‖_p^p is a weighting function; x̃_k = x_k − μ_k is the image block with its mean removed, x_k denotes an image block and μ_k the mean value of the block x_k; p is a weight parameter; s_k is a unit-length vector, the block structure of the image with exposure ratio k; and k is the exposure ratio;
obtaining the desired mean intensity of the block by a weighted linear fusion mechanism, denoted l̂ and expressed as:
l̂ = ( Σ_k L(μ_k, l_k)·l_k ) / ( Σ_k L(μ_k, l_k) );
where L(μ_k, l_k) is a weighting function taking as input the global mean μ_k of the image X_k and the mean intensity l_k of the current image block x_k; l_k denotes the mean intensity of pixel blocks at different exposure ratios;
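The strength/structure/intensity fusion of co-located blocks can be sketched as follows (a minimal NumPy illustration; the Gaussian well-exposedness term standing in for the weighting function L(μ_k, l_k) is an assumption):

```python
import numpy as np

def fuse_blocks(blocks, p=4):
    """Fuse K co-located, column-vectorized blocks x_k of shape (K, n).

    Desired signal strength: c_hat = max_k ||x_k - mu_k||.
    Desired structure:       s_hat ~ sum_k ||x~_k||^p * s_k, renormalized.
    Desired mean intensity:  l_hat = weighted average of the block means.
    """
    mu = blocks.mean(axis=1, keepdims=True)         # l_k, block mean intensity
    xt = blocks - mu                                # mean-removed blocks x~_k
    c = np.linalg.norm(xt, axis=1)                  # signal strengths c_k
    c_hat = c.max()
    s = xt / (c[:, None] + 1e-12)                   # unit-length structures s_k
    W = c ** p                                      # structure weights ||x~_k||^p
    s_hat = (W[:, None] * s).sum(axis=0)
    s_hat /= np.linalg.norm(s_hat) + 1e-12
    Lw = np.exp(-((mu.ravel() - 0.5) ** 2) / 0.08)  # stand-in for L(mu_k, l_k)
    l_hat = (Lw * mu.ravel()).sum() / (Lw.sum() + 1e-12)
    return c_hat * s_hat + l_hat                    # fused block

x = np.array([0.2, 0.4, 0.6, 0.8])
fused = fuse_blocks(np.stack([x, x]))   # two identical exposures
```

With identical inputs the fusion is the identity, a quick sanity check on the decomposition of a block into strength, structure and mean intensity.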
the method can well depict the edge and the texture area of the image in the process of processing the texture image noise, can reserve the low-frequency information of the image as much as possible, can distinguish the high-frequency information of the image, and is suitable for the occasions of image noise reduction with complicated texture detail characteristics. Compared with the traditional RetinexNet method, the scheme of the invention also has the following advantages:
1. and enhancing the brightness component by utilizing the mutually independent characteristic of each channel in the HSV color space model.
2. And the saturation component is adaptively adjusted along with the change of the brightness component by utilizing the correlation coefficient, so that the change of the color sensation of the image is avoided.
3. On the basis of UNet, different areas of the illumination enhancement image are combined to bear different levels of noise, and a reflectivity noise reduction model is constructed.
4. A camera response model is introduced to generate a virtual overexposed image that is complementary to the original image. On the basis of the original image, the image enhancement effect is improved, the brightness is more uniform, and the image detail information is better reserved.
Drawings
Fig. 1 is a schematic flow diagram of the low-illumination image enhancement method of the self-adaptive RetinexNet under a fusion strategy according to the present invention;
FIG. 2 is a schematic diagram of the EnhanceNet network structure adopted by the present invention;
FIG. 3 is a schematic diagram of a RestorationNet network architecture according to the present invention;
FIG. 4 is a schematic diagram of a low-illumination image enhancement method of an adaptive RetinexNet under a fusion strategy according to the present invention;
fig. 5 is a schematic diagram of the contrast of the enhanced image obtained by the method of the present invention, wherein (a) is low luminance data, (b) is (a) a coarse enhanced image obtained by the present invention, and (c) is (a) a final optimized enhanced image obtained by the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without making any creative effort based on the embodiments in the present invention, belong to the protection scope of the present invention.
The invention provides a low-illumination image enhancement method of a self-adaptive RetinexNet under a fusion strategy, which specifically comprises the following steps as shown in figure 1:
acquiring an original image and a synthesized low-illumination image corresponding to the original image from historical data, taking the original image as a normal light image, and taking the synthesized low-illumination image as a low-illumination image;
inputting the V-channel image of the low-illumination image and the normal light image into a DecomNet to obtain the illumination and reflectivity of the normal light image and the illumination and reflectivity of the low-illumination image;
inputting the reflectivity and illumination of the obtained low-illumination image into a RestorationNet, and using the illumination to guide the reflectivity to reduce noise to obtain the reflectivity after noise reduction;
inputting the reflectivity and illumination of the low-illumination image into an EnhanceNet, and enhancing the illumination of the low-illumination image to obtain enhanced illumination;
reconstructing an image, namely synthesizing an RGB image by using color channels for an H channel, a V channel and an S channel of the optimized image, namely a coarse enhancement image; the reconstruction performed in this embodiment mainly includes three steps: obtaining an enhanced V-channel image by the enhanced illumination and the reflectivity after noise reduction according to a retinex theory; according to the enhanced V-channel image, the S-channel image is subjected to self-adaptive adjustment so as to keep the contrast of the image; the H channel image is not transformed, and an RGB image is synthesized through images of a V channel, an S channel and an H channel;
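The channel recombination in this step can be sketched with the standard-library colorsys conversion (a per-pixel illustration; the array shapes and the [0, 1] value range are assumptions):

```python
import colorsys
import numpy as np

def reconstruct_rgb(h, s, v_enhanced):
    """Synthesize the RGB coarse-enhanced image from the untouched H channel,
    the adjusted S channel and the enhanced V channel (all in [0, 1])."""
    out = np.empty(h.shape + (3,))
    for idx in np.ndindex(h.shape):
        out[idx] = colorsys.hsv_to_rgb(h[idx], s[idx], v_enhanced[idx])
    return out

# pure red: hue 0, full saturation, full (enhanced) brightness
rgb = reconstruct_rgb(np.zeros((2, 2)), np.ones((2, 2)), np.ones((2, 2)))
```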
and acquiring a virtual overexposure image of the low-illumination image, and fusing the low-illumination image, the rough enhanced image and the virtual overexposure image to obtain a final optimized enhanced image.
In the embodiment, the RetinexNet illumination component enhancement method for the HSV color space is provided, and the method separates a V-channel image from an RGB image, so that the distortion problem of a color image is better solved; on the basis, the design of a reflectivity recovery model based on the UNet network is provided, and the reflectivity is recovered in an auxiliary mode by utilizing illumination; and finally, introducing a camera response model and designing a fusion strategy by combining image block decomposition.
In the embodiment, by analyzing the HSV model and utilizing the mutually independent relationship among the channels, the color information of the low-illumination image is completely reserved, and the color distortion problem of the enhanced image is improved; meanwhile, the saturation is adaptively adjusted, so that color deviation is avoided; and reconstructing the image and converting the image into an RGB space to obtain a final enhancement effect. The change of the image brightness can cause the image contrast to change, so that the color deviation of the enhanced image occurs, the saturation of the image is adaptively adjusted by using a relative coefficient, and the contrast of the image is maintained:
s′(x,y)=s(x,y)+t[v′(x,y)-v(x,y)]×λ(x,y);
where v(x, y) is the brightness of the corresponding pixel in the original image, v′(x, y) is the brightness of the pixel after enhancement, s(x, y) is the saturation of the corresponding pixel in the original image, s′(x, y) is the saturation of the pixel after correction, and t is a proportionality constant (t = 0.4 in the experiments); λ(x, y) is the correlation coefficient of v(x, y) and s(x, y); n × n is the size of the neighborhood window w; v̄(x, y) and s̄(x, y) are respectively the mean brightness and mean saturation of pixel (x, y) over the neighborhood window w, and δ_v(x, y) and δ_s(x, y) are respectively the brightness and saturation variances of pixel (x, y) over w; v(p, q) and s(p, q) are the brightness and saturation of the pixel at position (p, q) in the neighborhood window, with (p, q) ∈ w denoting a pixel in that window.
The obtained V-channel image and the original normal-light image are taken as input to the DecomNet network. Features are first extracted from the input with a 3 × 3 convolution kernel; the features are then mapped into R and I by five 3 × 3 convolutional layers with ReLU; finally, a 3 × 3 convolution followed by a Sigmoid function yields a 4-channel image, of which the first 3 channels are taken as the reflection component and the last channel as the illumination component, giving the reflectivity and illumination of both the normal-light and the low-illumination image. The loss function of the decomposition network model consists of a reconstruction loss function, a reflection-component consistency loss function and a structure smoothness loss function:
L = L_recon + λ_ir·L_ir + λ_is·L_is;
where λ_ir and λ_is respectively denote the reflectivity consistency coefficient and the illumination smoothness coefficient.
The model decomposes the image into a reflection component and an illumination component, which are recombined to reconstruct the original image; the reconstruction loss function can be expressed as:
L_recon = Σ_{i∈{low,normal}} Σ_{j∈{low,normal}} λ_ij·‖R_i∘I_j − S_j‖_1;
To maintain reflectivity consistency, a shared reflectivity loss function is introduced, expressed as:
L_ir = ‖R_low − R_normal‖_1;
The original TV function is weighted by the gradient of the reflectivity map, giving the smoothness loss function L_is:
L_is = Σ_{i∈{low,normal}} ‖∇I_i ∘ exp(−λ_g·∇R_i)‖_1;
where ∇ comprises the horizontal gradient ∇_h and the vertical gradient ∇_v, and λ_g denotes the balancing coefficient of structure-aware intensity. With the weight exp(−λ_g·∇R_i), L_is relaxes the smoothness constraint where the reflectivity gradient is large, i.e. it reduces the smoothness constraint at image structures where the illumination is discontinuous, while maintaining smoothness elsewhere, so that a clearer illumination map is obtained.
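The structure-aware smoothness term can be sketched in NumPy as follows (forward differences stand in for the gradient operator; single-channel maps are assumed):

```python
import numpy as np

def smoothness_loss(I, R, lambda_g=10.0):
    """L_is = sum |grad(I)| * exp(-lambda_g * |grad(R)|), both directions.

    Large reflectance gradients (image structure) shrink the weight, so
    illumination smoothness is relaxed exactly where edges occur.
    """
    loss = 0.0
    for axis in (0, 1):                 # vertical and horizontal gradients
        gI = np.abs(np.diff(I, axis=axis))
        gR = np.abs(np.diff(R, axis=axis))
        loss += np.sum(gI * np.exp(-lambda_g * gR))
    return float(loss)

flat = smoothness_loss(np.ones((4, 4)), np.zeros((4, 4)))  # constant I -> 0
I_step = np.zeros((4, 4)); I_step[:, 2:] = 1.0
loss_on_edge = smoothness_loss(I_step, I_step)             # edge in R relaxes penalty
loss_no_edge = smoothness_loss(I_step, np.zeros((4, 4)))
```

A step in the illumination is penalized far less when the reflectance has an edge at the same location, which is the intended structure-aware behavior.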
The network adjusting stage mainly comprises two parts, namely a noise reduction operation (RestorationNet) and an enhancement network (EnhanceNet), and the noise reduction function of the low-illumination reflection image and the enhancement function of the illumination image are respectively realized.
For RestorationNet, a typical 5-layer UNet structure is followed by a convolutional layer and a Sigmoid layer. Its loss function is
L_res = ‖R̂ − R_h‖_2² − SSIM(R̂, R_h);
where R̂ denotes the recovered reflection map and SSIM(·, ·) is the structural similarity measure. The first term constrains the L_2 distance between the recovery result R̂ and the target result R_h, while the last term keeps texture detail information and the like consistent.
The RestorationNet architecture is shown in Fig. 3. The input data pass through four cascaded convolution + pooling structures; each comprises a convolution module (a 3 × 3 convolution operation (Conv) and a ReLU activation function) and a pooling module (a 3 × 3 convolution, a ReLU activation function and a 2 × 2 max-pooling operation (MaxPooling)). The four cascaded convolution + pooling structures are followed by four cascaded convolution + up-sampling structures; each comprises a convolution module (a 3 × 3 convolution and a ReLU) and an up-sampling module (a 3 × 3 convolution, a ReLU and an up-sampling operation (UpSampling)). Each up-sampling module is skip-connected to the corresponding pooling module as shown in Fig. 3, and the output of the last convolution + up-sampling structure passes through a 3 × 3 convolution and a Sigmoid to give the final output.
The illumination intensity is adjusted by the illumination enhancement network (EnhanceNet). In the enhancement network, the reflection component map and the illumination component map are concatenated and input into the network, where the convolution kernels of the convolutional layers are 3 × 3 and those of the pooling layers 2 × 2. Following the U-Net idea, the image is enlarged by nearest-neighbor interpolation for up-sampling. Feature maps of matching size are added element-wise and then fused, yielding feature maps with more completely preserved detail; finally the network is fine-tuned end-to-end with gradient descent. The whole encoder-decoder structure is used to capture the image information: the input image is successively down-sampled, producing a large number of illumination-reflection feature maps.
As shown in fig. 2, the EnhanceNet network adopted in this embodiment is an encoder-decoder architecture as a whole. The input data is first subjected to feature extraction by a convolution layer with a 3 × 3 convolution kernel and a stride of 1. The extracted features are then passed sequentially through a first down-sampling, a second down-sampling and a third down-sampling, each down-sampling being composed of a convolution layer with a stride of 2 and an activation function, i.e. a Conv + ReLU structure. After the third down-sampling, the features are processed by a convolution layer with a 3 × 3 kernel and then up-sampled; each up-sampling is likewise followed by a convolution layer with a 3 × 3 kernel. The output of the first up-sampling is skip-connected (concatenated) with the output of the second down-sampling and used as the input of the second up-sampling; similarly, the output of the convolution layer after the second up-sampling is skip-connected with the output of the first down-sampling to form the output of the second up-sampling stage, and the output of the convolution layer after the third up-sampling is skip-connected with the input of the first down-sampling to form the output of the third up-sampling stage. Each up-sampling layer uses resize-convolution, that is, a nearest-neighbour interpolation operation followed by a convolution layer with a stride of 1 and an activation function, i.e. a Conv + ReLU structure. The outputs of the first, second and third up-sampling stages are spliced, the spliced features are reduced to C channels by a 1 × 1 convolution layer, and finally the illumination map is reconstructed by a 3 × 3 convolution layer.
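The resize-convolution up-sampling described above can be sketched as follows. This is a minimal single-channel NumPy illustration, not the network implementation: the 3 × 3 averaging kernel is a hypothetical stand-in for a learned convolution kernel. Nearest-neighbour interpolation first doubles the spatial size, and a stride-1 convolution followed by ReLU then smooths the result.

```python
import numpy as np

def nearest_upsample(x, factor=2):
    """Nearest-neighbour interpolation: repeat each pixel along both axes."""
    return x.repeat(factor, axis=0).repeat(factor, axis=1)

def conv2d_stride1(x, kernel):
    """'Same'-padded stride-1 2-D convolution (single channel)."""
    kh, kw = kernel.shape
    xp = np.pad(x, ((kh // 2, kh // 2), (kw // 2, kw // 2)), mode="edge")
    out = np.zeros_like(x, dtype=float)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = np.sum(xp[i:i + kh, j:j + kw] * kernel)
    return out

def resize_conv(x, kernel):
    """Resize-convolution up-sampling: nearest interpolation + stride-1 conv + ReLU."""
    return np.maximum(conv2d_stride1(nearest_upsample(x), kernel), 0.0)

x = np.arange(9, dtype=float).reshape(3, 3)
k = np.full((3, 3), 1.0 / 9.0)   # hypothetical averaging kernel
y = resize_conv(x, k)
print(y.shape)                    # (6, 6): spatial size doubled
```

Resize-convolution is used here instead of transposed convolution because the interpolate-then-convolve structure avoids the checkerboard artifacts transposed convolutions can introduce.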
In this way a large amount of illumination information is utilized: a local illumination distribution image is reconstructed through up-sampling to obtain an improved illumination map, while the multi-scale connections improve the adaptivity of the network model.
Similar to the decomposition model loss function, the loss function of the enhancement network model is also mainly composed of a reconstruction loss function and a structure smoothing loss function, and its expression is shown in the following formula:
L = L_recon + λ_is · L_is;
The finally obtained enhanced brightness-component (V-channel) image, the H-channel image and the S-channel image are then mapped back into RGB space to obtain a coarse enhanced image.
Under the same conditions, a better-exposed image can provide more detailed information. Thus, by constructing a camera response model, a virtual over-exposed image can be obtained as a complement to the original image. The camera response model is:
P=f(E);
wherein E is the irradiance of the image, P is the image obtained by imaging of the camera, and f is the nonlinear response function of the camera.
For the low-illumination image enhancement problem, the functional form of f can be obtained indirectly by modelling the luminance transfer function (BTF). The BTF is the mapping function between two images P_0 and P_1 of different exposures of the same scene.
The BTF can be written as P_1 = g(P_0, k) = β · P_0^γ, where g is the luminance transfer function, k is the exposure ratio, and β and γ are parameters determined by the camera parameters and the exposure ratio k. Solving the above equation gives the camera response model, namely:
g(P, k) = e^{b(1−k^a)} · P^{k^a};
which accommodates most cameras when a = −0.3293 and b = 1.1258. In order to express as much information as possible using the input image and the generated image, it is necessary to find the optimal exposure ratio k so that the synthesized image is well exposed where the original image is under-exposed.
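The camera response model above can be sketched in NumPy, assuming the parameterization g(P, k) = e^{b(1−k^a)} · P^{k^a} with a = −0.3293, b = 1.1258 and pixel values normalized to [0, 1]:

```python
import numpy as np

A, B = -0.3293, 1.1258  # camera parameters said to fit most cameras

def btf(P, k):
    """Luminance transfer function g(P, k): map image P to exposure ratio k."""
    beta = np.exp(B * (1.0 - k ** A))
    gamma = k ** A
    return np.clip(beta * np.power(P, gamma), 0.0, 1.0)

low = np.array([[0.05, 0.10], [0.20, 0.40]])  # a tiny under-exposed "image"
bright = btf(low, k=5.0)                       # virtual image at 5x exposure
print(np.all(bright >= low))                   # True: k > 1 brightens everywhere
```

At k = 1 the model reduces to the identity (β = 1, γ = 1), which is a quick sanity check for any implementation.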
The optimal exposure ratio is determined according to the principle of image entropy maximization. The image entropy is expressed as:
H(B) = −Σ_{i=0}^{N} p_i · log₂ p_i;
The image entropy maximization process is represented as:
k̂ = argmax_k H(g(B, k));
wherein H(·) represents the image entropy, N is the maximum value of the grey value of the image, and p_i represents the probability of occurrence of grey value i; H(g(B, k)) represents the image entropy of the output of the luminance transfer function g(B, k).
After the optimal exposure ratio has been obtained, a virtual over-exposure ratio Δk can be derived, and a virtual over-exposed image is then generated using the luminance transfer function (BTF). The parameter Δk is set to 0.5.
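The entropy-maximization search for the optimal exposure ratio can be sketched as follows (a minimal NumPy illustration; the 256-bin histogram and the candidate grid k ∈ [1, 10] are assumptions, and the BTF from above is repeated to keep the block self-contained):

```python
import numpy as np

A, B = -0.3293, 1.1258

def btf(P, k):
    """Luminance transfer function g(P, k)."""
    return np.clip(np.exp(B * (1.0 - k ** A)) * np.power(P, k ** A), 0.0, 1.0)

def image_entropy(img, bins=256):
    """Shannon entropy of the grey-level histogram: H = -sum(p_i * log2 p_i)."""
    hist, _ = np.histogram(img, bins=bins, range=(0.0, 1.0))
    p = hist / hist.sum()
    p = p[p > 0]                    # 0 * log 0 is taken as 0
    return float(-np.sum(p * np.log2(p)))

def optimal_exposure(v, candidates=np.linspace(1.0, 10.0, 50)):
    """Grid search for the exposure ratio k maximizing the entropy of g(v, k)."""
    entropies = [image_entropy(btf(v, k)) for k in candidates]
    return float(candidates[int(np.argmax(entropies))])

rng = np.random.default_rng(0)
v = rng.uniform(0.0, 0.2, size=(64, 64))   # synthetic under-exposed V channel
k_hat = optimal_exposure(v)
print(1.0 <= k_hat <= 10.0)                # True
```

A coarse grid search is sufficient here because the entropy curve over k is smooth for typical images; a finer search (or golden-section search) could refine k̂.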
There are three images that need to be fused: the low-illumination image, the coarse enhanced image and the virtual over-exposed image. Each image is partitioned into blocks, and each block is column-vectorized using the image block decomposition method, expressed as:
P=c·s+l;
where c is the block signal strength, s is the block structure, and l is the average strength of the block.
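The block decomposition P = c · s + l can be sketched in NumPy; the decomposition is exact, so the block is recovered by recombining the three components:

```python
import numpy as np

def decompose_block(x):
    """Decompose a vectorized image block x into signal strength c,
    unit-length structure s and mean intensity l, so that x = c*s + l."""
    l = x.mean()                                 # average intensity of the block
    d = x - l                                    # mean-removed block
    c = np.linalg.norm(d)                        # block signal strength (contrast)
    s = d / c if c > 0 else np.zeros_like(d)     # unit-length block structure
    return c, s, l

x = np.array([0.1, 0.3, 0.5, 0.7])   # a column-vectorized 2x2 block
c, s, l = decompose_block(x)
print(np.allclose(c * s + l, x))     # True: exact reconstruction
```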
Generally, the higher the contrast, the better the visibility. Considering that all input source image blocks are true captures of the scene, the block with the highest contrast among them corresponds to the best visibility. Therefore, the desired signal strength of the fused image block is determined by the maximum of the signal strengths over all source image blocks; the maximum block signal strength among the column-vectorized images is obtained and recorded as:
ĉ = max_{1≤k≤K} c_k;
wherein c_k denotes the signal strength of x_k, and {x_k} = {x_k | 1 ≤ k ≤ K} is the set of image blocks extracted at the same spatial position from a source sequence containing K multi-exposure images. Each x_k is a CN²-dimensional column vector, where C is the number of colour channels of the input image and N is the spatial size of the square block.
Unlike the signal strength, the structural vector s_k has unit length and points to a particular direction in the CN²-dimensional space. The desired structure of the fused image block should best represent the structures of all source image blocks, and is computed from the weighted expectation of the block structures of the column-vectorized images:
s̄ = Σ_{k=1}^{K} w(x_k) · s_k / Σ_{k=1}^{K} w(x_k),  ŝ = s̄ / ‖s̄‖;
A simple implementation of this relationship is as follows:
wherein w(x_k) is a weighting function that determines the contribution of each source image block to the structure of the fused image block; k indexes the exposures; and s_k is a unit-length structure vector. The contribution increases with the signal strength of the image block, using a power weighting function given by:
w(x_k) = ‖x̃_k‖^p, where x̃_k = x_k − l_k is the mean-removed block;
wherein p ≥ 0 is an exponent parameter; different choices of p produce weighting functions with different physical meanings. The larger p is, the more the image blocks with relatively greater signal strength contribute.
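The power-weighted structure fusion can be sketched as follows (a minimal NumPy illustration assuming the weighting w(x_k) = ‖x_k − l_k‖^p discussed above; p = 4 is an arbitrary choice):

```python
import numpy as np

def fused_structure(blocks, p=4.0):
    """Desired structure of the fused block: power-weighted mean of the
    source block structures, renormalized to unit length.
    `blocks` is a (K, d) array of column-vectorized source blocks."""
    s_bar = np.zeros(blocks.shape[1])
    for x in blocks:
        d = x - x.mean()                  # mean-removed block
        c = np.linalg.norm(d)             # signal strength
        if c > 0:
            s_bar += (c ** p) * (d / c)   # weight w = c**p times unit structure
    n = np.linalg.norm(s_bar)
    return s_bar / n if n > 0 else s_bar

blocks = np.array([[0.0, 0.1, 0.2, 0.3],    # low exposure
                   [0.2, 0.4, 0.6, 0.8],    # mid exposure (highest contrast)
                   [0.8, 0.9, 1.0, 1.0]])   # high exposure, partly saturated
s_hat = fused_structure(blocks)
print(np.isclose(np.linalg.norm(s_hat), 1.0))   # True: unit-length structure
```

With a large p the result approaches the structure of the highest-contrast block alone, which matches the "highest contrast, best visibility" assumption above.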
Regarding the average intensity of the local image block, the average intensity of the fused block obtained by the weighted linear fusion mechanism is recorded as l̂ and expressed as:
l̂ = Σ_{k=1}^{K} L(μ_k, l_k) · l_k / Σ_{k=1}^{K} L(μ_k, l_k);
wherein L(μ_k, l_k) is a weighting function taking the global mean value μ_k of image X_k and the average intensity l_k of the current block x_k as input; l_k represents the average intensity of a block with exposure k. L(μ_k, l_k) quantifies how well exposed x_k is within X_k, so that well-exposed content in either X_k or x_k is favoured. As a preferred embodiment, this embodiment uses a two-dimensional Gaussian distribution to specify this measure, expressed as:
L(μ_k, l_k) = exp(−(μ_k − μ_c)² / (2σ_g²) − (l_k − l_c)² / (2σ_l²));
wherein σ g And σ l Control the curve edge mu separately k And l k Distribution of dimensions, μ c And l c Is a constant of medium intensity value, where medium intensity refers to a median value between the maximum and minimum values of a parameter, for example, if the parameter has a value in the range of [0, 1%]Then the median intensity value in this parameter is 0.5, mu c And l c Is respectively based on mu k And l k Is determined.
This embodiment also provides a specific implementation example. The deep learning framework used in the example is TensorFlow 1.13 (GPU), with the NumPy computing library and the PIL image processing library installed; the software development environment of the experiments is PyCharm 2019 with Python 3.7. The implementation results are shown in FIG. 5, wherein (a) is the low-illumination image, (b) is the coarse enhanced image and (c) is the final enhanced image. It can be seen that the image processed by the method retains more detail information and exhibits less image distortion, so that the image quality is effectively improved.
The implementation process of the embodiment proposed by the present invention is shown in FIG. 4. In the training stage, the V-channel image S_low of the corresponding low-illumination image is generated from the normal image S_normal. During training, DecomNet (Decomposition) generates the reflectivity and illumination, and the network parameters are updated using the difference between the illumination I_normal of the normal image S_normal and the illumination I_low of the V-channel image S_low of the low-illumination image, as well as the difference between the reflectivity R_normal of the normal image S_normal and the reflectivity R_low of the V-channel image S_low of the low-illumination image. RestorationNet and EnhanceNet are updated in the same way. The S-channel image of the low-illumination image is adaptively adjusted according to the obtained coarse enhanced image of the V channel, and the coarse enhanced V-channel image, the adaptively adjusted S-channel image and the H-channel image are synthesized into an RGB image, i.e. the coarse enhanced image. A virtual over-exposed image is generated from the low-illumination image; the virtual over-exposed image, the low-illumination image and the coarse enhanced image are synthesized into the final enhanced image, and the loss between this image and the normal image is used to update the network parameters.
In the real-time data stage, a low-illumination image is taken as input. The trained DecomNet yields the reflectivity and illumination corresponding to the V-channel image of the low-illumination image; the reflectivity is denoised by RestorationNet, the illumination is enhanced by EnhanceNet, and the denoised reflectivity and the enhanced illumination are synthesized to obtain the enhanced V-channel image. The S-channel image of the low-illumination image is adaptively enhanced according to this coarse enhanced V channel, and the processed S and V channels are synthesized with the H channel to obtain an RGB image, i.e. the coarse enhanced image. Finally, a virtual over-exposed image of the low-illumination image is generated, and the virtual over-exposed image, the low-illumination image and the coarse enhanced image are synthesized to obtain the final enhanced image.
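The adaptive S-channel adjustment mentioned above follows s′(x, y) = s(x, y) + t · [v′(x, y) − v(x, y)] · λ(x, y). A minimal NumPy sketch (the proportionality constant t = 1.0 and the 3 × 3 window size are hypothetical choices; λ is computed as the local brightness–saturation correlation coefficient):

```python
import numpy as np

def local_corr(v, s, i, j, n=3):
    """Correlation coefficient lambda(x, y) of brightness and saturation
    over an n x n neighbourhood window centred on pixel (i, j)."""
    h = n // 2
    vw = v[max(0, i - h):i + h + 1, max(0, j - h):j + h + 1].ravel()
    sw = s[max(0, i - h):i + h + 1, max(0, j - h):j + h + 1].ravel()
    dv, ds = vw - vw.mean(), sw - sw.mean()
    denom = np.sqrt((dv ** 2).sum() * (ds ** 2).sum())
    return float((dv * ds).sum() / denom) if denom > 0 else 0.0

def adjust_saturation(s, v, v_enh, t=1.0):
    """s'(x, y) = s(x, y) + t * [v'(x, y) - v(x, y)] * lambda(x, y)."""
    out = np.empty_like(s)
    for i in range(s.shape[0]):
        for j in range(s.shape[1]):
            out[i, j] = s[i, j] + t * (v_enh[i, j] - v[i, j]) * local_corr(v, s, i, j)
    return np.clip(out, 0.0, 1.0)

rng = np.random.default_rng(1)
s = rng.uniform(0.2, 0.8, (8, 8))     # S channel of the low-illumination image
v = rng.uniform(0.0, 0.3, (8, 8))     # dark V channel
v_enh = np.clip(v + 0.4, 0.0, 1.0)    # coarse enhanced V channel
s_new = adjust_saturation(s, v, v_enh)
print(s_new.shape)                     # (8, 8)
```

When the enhanced brightness equals the original brightness, the adjustment term vanishes and the saturation is left unchanged, which is a useful sanity check.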
Although embodiments of the present invention have been shown and described, it will be appreciated by those skilled in the art that changes, modifications, substitutions and alterations can be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.
Claims (9)
1. A low-illumination image enhancement method of a self-adaptive RetinexNet under a fusion strategy is characterized by comprising the following steps:
acquiring an original image and a synthesized low-illumination image corresponding to the original image from historical data, taking the original image as a normal light image, and taking the synthesized low-illumination image as a low-illumination image;
inputting the V-channel image of the low-illumination image and the normal light image into a DecomNet to obtain the illumination and the reflectivity of the normal light image and the illumination and the reflectivity of the low-illumination image;
inputting the reflectivity and illumination of the obtained low-illumination image into a RestorationNet, and using the illumination to guide the reflectivity to reduce noise to obtain the reflectivity after noise reduction;
inputting the reflectivity and illumination of the low-illumination image into an EnhanceNet, and enhancing the illumination of the low-illumination image to obtain enhanced illumination;
reconstructing an image, namely synthesizing the H channel, V channel and S channel of the optimized image into an RGB image by colour-channel conversion, i.e. a coarse enhanced image;
and acquiring a virtual overexposure image of the low-illumination image, and fusing the low-illumination image, the rough enhanced image and the virtual overexposure image to obtain a final optimized enhanced image.
2. The method as claimed in claim 1, wherein the training set or the real-time low-illumination image to be enhanced is subjected to colour-channel conversion before being input into DecomNet, converting the image from an RGB image to an HSV image.
3. The method for enhancing a low-illumination image of a self-adaptive RetinexNet under a fusion strategy as claimed in claim 1, wherein for the image input into DecomNet, features are extracted by a convolution layer with a 3 × 3 kernel, the extracted features are sequentially mapped by 5 convolution layers with ReLU activations and 3 × 3 kernels, and then by one convolution layer with a 3 × 3 kernel and a sigmoid function, to obtain an image with 4 channels; the first 3 channels of this image are used as the reflectivity R of the image, and the last channel as the illumination I of the image.
4. The method for enhancing a low-illumination image by self-adaptive RetinexNet under a fusion strategy according to claim 1, wherein the process of enhancing the illumination of the low-illumination image to obtain the enhanced illumination comprises:
splicing the illumination of the low-illumination image and the reflectivity after noise reduction to serve as the input of the EnhanceNet network;
acquiring context information in a large area of an input image through an encoder-decoder framework of an EnhanceNet network;
in an EnhanceNet network, an input image is downsampled to different sizes by adopting three downsampling modules;
and respectively splicing the image subjected to down-sampling and the context information, and reconstructing the spliced image through up-sampling to obtain enhanced illumination.
5. The method as claimed in claim 1, wherein DecomNet is trained by back propagation through its loss function; the loss function L_1 of DecomNet is composed of a reconstruction loss function L_recon, a reflectivity consistency loss function L_ir and a structure smoothing loss function L_is, expressed as:
L_1 = L_recon + λ_ir · L_ir + λ_is · L_is;
L_recon = Σ_{i∈{low,normal}} Σ_{j∈{low,normal}} λ_ij · ||R_i ∘ I_j − S_j||_1;
L_ir = ||R_low − R_normal||_1;
L_is = Σ_{i∈{low,normal}} ||∇I_i ∘ exp(−λ_g · ∇R_i)||_1;
wherein λ_ir denotes the reflectivity consistency coefficient and λ_is the illumination smoothness coefficient; low denotes the low-illumination image dataset and normal the normal-light image dataset; λ_ij is the balance coefficient of the reconstruction loss; R_i is the reflectivity when i equals low or normal; I_j is the illumination of the low-illumination image when j is low, and of the normal-light image when j is normal; S_j represents the low-illumination image when j is low, and the normal-light image when j is normal; R_low represents the reflectivity of the low-illumination image and R_normal the reflectivity of the normal-light image; ||·||_1 denotes the 1-norm and ||·||_2 the 2-norm; ∇ denotes the gradient operation; λ_g is the coefficient balancing the strength of structure awareness;
RestorationNet is trained by back propagation through its loss function, expressed as:
wherein L_3 is the loss function of RestorationNet; R̂ is the reflectivity map after noise reduction; R_h is the reflectivity of the normal-light image; SSIM(R̂, R_h) is a structural similarity measure between R̂ and R_h;
EnhanceNet is trained by back propagation through its loss function; the loss function L_2 of EnhanceNet is composed of a reconstruction loss function L_recon and a structure smoothing loss function L_is, expressed as:
L_2 = L_recon + λ_is · L_is;
wherein L_2 is the loss function of EnhanceNet.
6. The method for enhancing the low-illumination image of the adaptive RetinexNet under the fusion strategy according to claim 1, wherein after the enhanced V-channel image is obtained, the S-channel image is adaptively adjusted, and the adjustment process is represented as follows:
s′(x,y)=s(x,y)+t[v′(x,y)-v(x,y)]×λ(x,y);
wherein s′(x, y) is the saturation of the pixel at row x, column y of the coarse enhanced image; s(x, y) is the saturation of the pixel at row x, column y of the low-illumination image; v′(x, y) is the brightness of the pixel at row x, column y of the coarse enhanced image; v(x, y) is the brightness of the pixel at row x, column y of the low-illumination image; t is a proportionality constant; λ(x, y) is the correlation coefficient of v(p, q) and s(p, q).
7. The method for enhancing a low-illumination image of an adaptive RetinexNet under a fusion strategy according to claim 1, wherein the correlation coefficient λ (x, y) of v (p, q) and s (p, q) is expressed as:
wherein v(p, q) is the brightness of the pixel at position (p, q) in the neighbourhood window of pixel (x, y), and s(p, q) is the saturation of the pixel at position (p, q) in the neighbourhood window of pixel (x, y); v̄(x, y) is the mean brightness of pixel (x, y) over the neighbourhood window w, and s̄(x, y) is the mean saturation of pixel (x, y) over the neighbourhood window w; δ_v(x, y) is the variance of the brightness of pixel (x, y) over the neighbourhood window w, and δ_s(x, y) is the variance of the saturation of pixel (x, y) over the neighbourhood window w; w is an n × n window centred on pixel (x, y).
8. The method for enhancing the low-illumination image of the adaptive RetinexNet under the fusion strategy according to claim 1, wherein the virtual overexposure image is obtained by using a camera response model, and the method is represented as follows:
P=f(E);
wherein, P is an image obtained by camera imaging, namely a virtual overexposure image; e is the irradiance of the low-illumination image; f is the camera nonlinear response function.
9. The method for enhancing the low-illumination image of the adaptive RetinexNet under the fusion strategy according to claim 1, wherein the process of fusing the original low-illumination image, the rough enhanced image and the virtual overexposed image to obtain the final optimized enhanced image comprises:
carrying out column vectorization on the original low-illumination image, the coarse enhanced image and the virtual over-exposed image by using the image block decomposition method;
the value with the maximum block signal intensity of the image after column vectorization is obtained and recorded as
obtaining the expectation of the block structures of the column-vectorized images, recorded as s̄, and calculating from it the block structure value ŝ of the column-vectorized images, expressed as:
s̄ = Σ_{k=1}^{K} w(x_k) · s_k / Σ_{k=1}^{K} w(x_k),  ŝ = s̄ / ‖s̄‖;
wherein w(x_k) is a weighting function, expressed as w(x_k) = ‖x̃_k‖^p; x̃_k = x_k − l_k is the mean-removed image block, where x_k represents an image block and l_k is the average value of image block x_k; p is a weighting parameter; s_k is a unit-length vector, namely the block structure of the image with exposure k;
obtaining the average intensity of the fused block by using a weighted linear fusion mechanism, recorded as l̂ and expressed as:
l̂ = Σ_{k=1}^{K} L(μ_k, l_k) · l_k / Σ_{k=1}^{K} L(μ_k, l_k);
wherein L(μ_k, l_k) is a weighting function taking the global mean value μ_k of image X_k and the current image block x_k as input; l_k represents the average intensity of pixel blocks of different exposures.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210644966.3A CN115035011A (en) | 2022-06-09 | 2022-06-09 | Low-illumination image enhancement method for self-adaptive RetinexNet under fusion strategy |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115035011A true CN115035011A (en) | 2022-09-09 |
Family
ID=83123144
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210644966.3A Pending CN115035011A (en) | 2022-06-09 | 2022-06-09 | Low-illumination image enhancement method for self-adaptive RetinexNet under fusion strategy |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115035011A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115294126A (en) * | 2022-10-08 | 2022-11-04 | 南京诺源医疗器械有限公司 | Intelligent cancer cell identification method for pathological image |
CN116363009A (en) * | 2023-03-31 | 2023-06-30 | 哈尔滨工业大学 | Method and system for enhancing rapid light-weight low-illumination image based on supervised learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||