CN115035011A - Low-illumination image enhancement method for self-adaptive RetinexNet under fusion strategy - Google Patents

Low-illumination image enhancement method for self-adaptive RetinexNet under fusion strategy

Info

Publication number
CN115035011A
CN115035011A
Authority
CN
China
Prior art keywords
image
illumination
low
reflectivity
enhanced
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210644966.3A
Other languages
Chinese (zh)
Inventor
尹学辉
陈巧玉
涂戈
赵锡琰
田璐
唐逸航
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing University of Post and Telecommunications
Original Assignee
Chongqing University of Post and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing University of Post and Telecommunications filed Critical Chongqing University of Post and Telecommunications
Priority to CN202210644966.3A priority Critical patent/CN115035011A/en
Publication of CN115035011A publication Critical patent/CN115035011A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/50Image enhancement or restoration by the use of more than one image, e.g. averaging, subtraction
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • G06T5/70
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/90Determination of colour characteristics
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20212Image combination
    • G06T2207/20221Image fusion; Image merging

Abstract

The invention relates to the technical field of image processing, in particular to a low-illumination image enhancement method of a self-adaptive RetinexNet under a fusion strategy. The method comprises: inputting the V-channel image of a low-illumination image and a normal-light image into a DecomNet to obtain the illumination and reflectivity of each image; inputting the reflectivity and illumination of the low-illumination image into a RestorationNet, where the illumination guides denoising of the reflectivity to obtain the denoised reflectivity; inputting the reflectivity and illumination of the low-illumination image into an EnhanceNet to enhance the illumination of the low-illumination image and obtain the enhanced illumination; reconstructing the image to obtain a coarse enhanced image; and acquiring a virtual overexposed image of the low-illumination image and fusing it with the low-illumination image and the coarse enhanced image. The invention reduces the color distortion that follows image enhancement and effectively suppresses noise while retaining edge structure and detail information.

Description

Low-illumination image enhancement method of self-adaptive RetinexNet under fusion strategy
Technical Field
The invention relates to the technical field of image processing, in particular to a low-illumination image enhancement method of a self-adaptive RetinexNet under a fusion strategy.
Background
The spread of the internet has carried society rapidly into the information era, and people's demand for information of all kinds keeps growing. Roughly 75% of the information a person acquires arrives through the human visual system, and images are one of the most important carriers through which the eyes obtain information. As images are such an important information carrier, image processing technology has become a popular research field, aimed at making images meet the needs of various application domains. In real life, because of the acquisition equipment and the environment, low-quality pictures are often obtained: some degradation is caused by abnormal weather, some by the equipment itself (for example, underexposure). Such images are generally dark overall, have poor contrast and lack detail, which affects both the viewing of the image content and the subsequent use of the image.
To extract as much of the important information in a low-quality image as possible, such images must be processed, and image processing techniques such as image enhancement have been developed for this purpose. Reproducing the content of an image with image enhancement technology improves, on the one hand, the visual experience, satisfying people's viewing needs; on the other hand, image enhancement is one of the preprocessing steps of computer vision: reproducing the detail information in an image can greatly improve the accuracy of production applications such as detection and recognition in the computer vision field or pathological feature extraction in the biomedical field.
Current approaches to image enhancement in low-illumination environments include histogram equalization, wavelet-transform image enhancement and Retinex-theory-based enhancement. By comparison, enhancement based on the Retinex theory gives good results on most images (night images, foggy images, low-illumination images and so on) and has a wide application range, but it exhibits noticeable color distortion and loses some edge detail information.
Disclosure of Invention
In order to improve the details and contrast of an image and enable the image to contain rich texture details and good visual effect, the invention provides a low-illumination image enhancement method of a self-adaptive RetinexNet under a fusion strategy, which specifically comprises the following steps:
acquiring an original image and a synthesized low-illumination image corresponding to the original image from historical data, taking the original image as a normal light image, and taking the synthesized low-illumination image as a low-illumination image;
inputting the V-channel image of the low-illumination image and the normal light image into a DecomNet to obtain the illumination and reflectivity of the normal light image and the illumination and reflectivity of the low-illumination image;
inputting the reflectivity and illumination of the obtained low-illumination image into a RestorationNet, and using the illumination to guide the reflectivity to reduce noise to obtain the reflectivity after noise reduction;
inputting the reflectivity of the low-illumination image and illumination into an EnhanceNet, and enhancing the illumination of the low-illumination image to obtain enhanced illumination;
reconstructing the image, namely synthesizing an RGB image from the H channel, V channel and S channel of the optimized image, to obtain the coarse enhanced image;
and acquiring a virtual overexposure image of the low-illumination image, and fusing the low-illumination image, the rough enhanced image and the virtual overexposure image to obtain a final optimized enhanced image.
Further, before an image is input into the DecomNet, color channel conversion is performed on the training set or on the real-time low-illumination image to be enhanced, converting the image from an RGB image into an HSV image, as sketched below.
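Purely as an illustration, this color-space conversion can be written with OpenCV as follows; the helper name rgb_to_hsv_channels is hypothetical and not part of the patent:

```python
import cv2
import numpy as np

def rgb_to_hsv_channels(rgb_image: np.ndarray):
    """Convert an RGB image to HSV and split its channels.

    Only the V channel is fed to the DecomNet; H and S are kept
    untouched for the later image reconstruction step.
    """
    hsv = cv2.cvtColor(rgb_image, cv2.COLOR_RGB2HSV)
    h, s, v = cv2.split(hsv)
    return h, s, v
```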
Further, for an image input into the DecomNet, features are first extracted with a 3 × 3 convolution kernel; 5 convolutional layers with ReLU activations and 3 × 3 kernels then map the extracted features to reflectivity and illumination. After this mapping, one further convolutional layer with a 3 × 3 kernel and a Sigmoid function produce a 4-channel image: the first 3 channels are taken as the reflectivity R of the image and the last channel as the illumination I. In other words, R and I are projected from the feature space of the image by the final 3 × 3 convolutional layer, and the Sigmoid function constrains the projected values to the range [0, 1].
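A minimal tf.keras sketch of such a decomposition network follows; the layer counts, kernel sizes and the 4-channel output split come from the text above, while the filter width and the input channel convention are assumptions:

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

def build_decomnet(filters: int = 64) -> Model:
    """DecomNet sketch: 3x3 feature extraction, five 3x3 Conv+ReLU
    layers, then one 3x3 Conv + Sigmoid producing 4 channels:
    the first 3 are the reflectivity R, the last the illumination I."""
    # Input channel count is an assumption; the low-light V-channel image
    # and the normal-light image pass through the same shared network.
    inp = layers.Input(shape=(None, None, 3))
    x = layers.Conv2D(filters, 3, padding="same")(inp)  # feature extraction
    for _ in range(5):
        x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    out = layers.Conv2D(4, 3, padding="same", activation="sigmoid")(x)
    reflectivity = out[..., :3]   # R, constrained to [0, 1]
    illumination = out[..., 3:]   # I, constrained to [0, 1]
    return Model(inp, [reflectivity, illumination])
```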
Further, enhancing the illumination of the low-illumination image, and obtaining the enhanced illumination includes:
splicing the illumination of the low-illumination image and the reflectivity after noise reduction to be used as the input of an Enhancenet network;
acquiring context information in a large area of an input image through an encoder-decoder framework of an EnhanceNet network;
in an EnhanceNet network, an input image is downsampled to different sizes by adopting three downsampling modules; for example, if the size of the original image is 600x400, the image needs to be reduced to 75x50 after 3 times of downsampling;
splicing the down-sampled images with the context information respectively, and reconstructing the spliced result through up-sampling to obtain the enhanced illumination; each splice uses element-wise summation, with skip connections introduced from each down-sampling block to its mirrored up-sampling block; multi-scale splicing is then introduced, the final scale is adjusted by nearest-neighbor interpolation and connected into a C-channel feature map, and a final 3 × 3 convolution yields the final illumination map, as sketched below.
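A tf.keras sketch of this encoder-decoder follows; only the overall topology (three down-samplings, mirrored nearest-neighbor up-samplings with element-wise sums, multi-scale concatenation reduced to C channels, final 3 × 3 convolution) is taken from the text and fig. 2, while the uniform channel width and stride-2 convolutions are assumptions:

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

def build_enhancenet(channels: int = 32) -> Model:
    """EnhanceNet sketch: encoder-decoder with three stride-2
    down-samplings, mirrored nearest-neighbour up-samplings with
    skip connections, multi-scale concatenation and a final 3x3 conv."""
    inp = layers.Input(shape=(None, None, 4))  # illumination (1) + denoised reflectivity (3)
    c0 = layers.Conv2D(channels, 3, padding="same")(inp)
    # three down-sampling blocks (Conv stride 2 + ReLU)
    d1 = layers.Conv2D(channels, 3, 2, padding="same", activation="relu")(c0)
    d2 = layers.Conv2D(channels, 3, 2, padding="same", activation="relu")(d1)
    d3 = layers.Conv2D(channels, 3, 2, padding="same", activation="relu")(d2)

    def up(x, skip):
        x = layers.UpSampling2D(interpolation="nearest")(x)
        x = layers.Conv2D(channels, 3, padding="same", activation="relu")(x)
        return layers.add([x, skip])           # element-wise sum with mirrored feature map

    u1 = up(d3, d2)
    u2 = up(u1, d1)
    u3 = up(u2, c0)
    # multi-scale splicing: resize u1/u2 to the full scale first
    u1r = layers.UpSampling2D(4, interpolation="nearest")(u1)
    u2r = layers.UpSampling2D(2, interpolation="nearest")(u2)
    ms = layers.Concatenate()([u1r, u2r, u3])
    ms = layers.Conv2D(channels, 1, padding="same")(ms)   # reduce cascade features to C channels
    out = layers.Conv2D(1, 3, padding="same")(ms)          # reconstructed illumination map
    return Model(inp, out)
```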
Further, the DecomNet is trained by backpropagation through its loss function. The loss function L_1 of the DecomNet consists of a reconstruction loss L_recon, a reflection-component consistency loss L_ir and a structural smoothness loss L_is, expressed as:

$$L_1 = L_{recon} + \lambda_{ir} L_{ir} + \lambda_{is} L_{is}$$

$$L_{recon} = \sum_{i \in \{low,\,normal\}} \sum_{j \in \{low,\,normal\}} \lambda_{ij} \left\| R_i \circ I_j - S_j \right\|_1$$

$$L_{ir} = \left\| R_{low} - R_{normal} \right\|_1$$

$$L_{is} = \sum_{i \in \{low,\,normal\}} \left\| \nabla I_i \circ \exp(-\lambda_g \nabla R_i) \right\|_1$$

where λ_ir denotes the reflectivity consistency coefficient and λ_is the illumination smoothness coefficient; low denotes the low-illumination image dataset and normal the normal-light image dataset; λ_ij are the balance coefficients of the reconstruction loss; R_i is the reflectivity for i = low or normal; I_j is the illumination of the low-illumination image when j = low and of the normal-light image when j = normal; S_j denotes the low-illumination image when j = low and the normal-light image when j = normal; R_low and R_normal are the reflectivities of the low-illumination and normal-light images; ∇ denotes the gradient operator; λ_g is the coefficient balancing the strength of structure awareness; and ‖·‖_1 denotes the 1-norm, used in L_recon for the reconstruction loss over the whole training image and in L_ir for the difference between the reflectivity maps of the training images.
The RestorationNet is trained by backpropagation through its loss function, expressed as:

$$L_{res} = \left\| \hat{R} - R_h \right\|_2^2 - \mathrm{SSIM}(\hat{R}, R_h) + \left\| \nabla \hat{R} - \nabla R_h \right\|_2^2$$

where L_res is the loss function of the RestorationNet; R̂ is the denoised reflection map; R_h is the reflectivity of the normal-light image; and SSIM(R̂, R_h) is the structural similarity measure between R̂ and R_h.
The EnhanceNet is trained by backpropagation through its loss function, which consists of the reconstruction loss L_recon and the structural smoothness loss L_is, expressed as:

$$L_2 = L_{recon} + \lambda_{is} L_{is}$$

where L_2 is the loss function of the EnhanceNet.
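As an illustration only, the decomposition loss can be sketched in TensorFlow as below; the gradient operator, the cross-reconstruction weights lam_cross and the default coefficient values are assumptions, not values taken from the patent:

```python
import tensorflow as tf

def gradients(img):
    """Forward-difference gradients of a (B, H, W, C) tensor."""
    dh = img[:, :, 1:, :] - img[:, :, :-1, :]
    dv = img[:, 1:, :, :] - img[:, :-1, :, :]
    return dh, dv

def smoothness_loss(I, R, lam_g=10.0):
    """Structure-aware smoothness: illumination gradients are penalised
    except where the reflectivity also has strong gradients."""
    ih, iv = gradients(I)
    rh, rv = gradients(tf.reduce_mean(R, axis=-1, keepdims=True))
    return (tf.reduce_mean(tf.abs(ih) * tf.exp(-lam_g * tf.abs(rh))) +
            tf.reduce_mean(tf.abs(iv) * tf.exp(-lam_g * tf.abs(rv))))

def decom_loss(R_low, I_low, S_low, R_norm, I_norm, S_norm,
               lam_ir=0.01, lam_is=0.1, lam_cross=0.001):
    """L1 = L_recon + lam_ir * L_ir + lam_is * L_is.
    S_* are the corresponding input images (channel conventions assumed)."""
    recon = (tf.reduce_mean(tf.abs(R_low * I_low - S_low)) +
             tf.reduce_mean(tf.abs(R_norm * I_norm - S_norm)) +
             lam_cross * tf.reduce_mean(tf.abs(R_low * I_norm - S_norm)) +
             lam_cross * tf.reduce_mean(tf.abs(R_norm * I_low - S_low)))
    consistency = tf.reduce_mean(tf.abs(R_low - R_norm))
    smooth = smoothness_loss(I_low, R_low) + smoothness_loss(I_norm, R_norm)
    return recon + lam_ir * consistency + lam_is * smooth
```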
Further, after the enhanced V-channel image is obtained, the S-channel image is adaptively adjusted. The adjustment is expressed as:

s′(x, y) = s(x, y) + t[v′(x, y) − v(x, y)] × λ(x, y);

where s′(x, y) is the saturation of the pixel in row x, column y of the coarse enhanced image; s(x, y) is the saturation of the pixel in row x, column y of the low-illumination image; v′(x, y) is the brightness of the pixel in row x, column y of the coarse enhanced image; v(x, y) is the brightness of the pixel in row x, column y of the low-illumination image; t is a proportionality constant; and λ(x, y) is the correlation coefficient of the brightness and saturation in the neighborhood of (x, y).

Further, the correlation coefficient λ(x, y) is expressed as:

$$\lambda(x, y) = \frac{\sum_{(p,q) \in w} \bigl( v(p,q) - \bar{v}(x,y) \bigr) \bigl( s(p,q) - \bar{s}(x,y) \bigr)}{n^2 \sqrt{\delta_v(x,y)\, \delta_s(x,y)}}$$

where v(p, q) is the brightness of the pixel at position (p, q) in the neighborhood window of pixel (x, y), and s(p, q) is the saturation of the pixel at position (p, q) in the neighborhood window of pixel (x, y); v̄(x, y) is the mean brightness and s̄(x, y) the mean saturation of pixel (x, y) over the neighborhood window w; δ_v(x, y) is the brightness variance and δ_s(x, y) the saturation variance of pixel (x, y) in the neighborhood window w; and w is an n × n window centered on pixel (x, y).
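A NumPy sketch of this adaptive saturation adjustment follows, assuming a Pearson-style local correlation coefficient (the exact normalization of λ is reconstructed, not quoted) and t = 0.4 as stated later in the description:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def adjust_saturation(s, v, v_enh, t=0.4, n=5, eps=1e-6):
    """s, v: saturation/brightness of the low-light image in [0, 1];
    v_enh: enhanced brightness. Returns the adjusted saturation."""
    mean_v = uniform_filter(v, n)
    mean_s = uniform_filter(s, n)
    cov = uniform_filter(v * s, n) - mean_v * mean_s          # local covariance
    var_v = uniform_filter(v * v, n) - mean_v ** 2            # local variances
    var_s = uniform_filter(s * s, n) - mean_s ** 2
    lam = cov / np.sqrt(np.maximum(var_v * var_s, eps))       # correlation coefficient
    return np.clip(s + t * (v_enh - v) * lam, 0.0, 1.0)
```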
Further, a virtual overexposure image is obtained using the camera response model, as represented by:
P=f(E);
wherein P is the image obtained by camera imaging, namely the virtual overexposed image; E is the irradiance of the low-illumination image; and f is the nonlinear response function of the camera.
Further, the process of fusing the original low-illumination image, the coarse enhanced image and the virtual overexposed image to obtain the final optimized enhanced image comprises the following steps:

performing column vectorization on the original low-illumination image, the coarse enhanced image and the virtual overexposed image using the image-block decomposition method;

taking the desired block signal strength of the fused image as the maximum block signal strength of the column-vectorized images, denoted ĉ:

$$\hat{c} = \max_{1 \le k \le K} c_k$$

obtaining the desired block structure of the fused image through the expectation of the block structures of the column-vectorized images, denoted ŝ and expressed as:

$$\bar{s} = \frac{\sum_{k=1}^{K} S(\tilde{x}_k)\, s_k}{\sum_{k=1}^{K} S(\tilde{x}_k)}, \qquad \hat{s} = \frac{\bar{s}}{\|\bar{s}\|}$$

where S(x̃_k) = ‖x̃_k‖^p is the weighting function; x̃_k = x_k − μ_{x_k} is the mean-removed image block, x_k denotes an image block and μ_{x_k} the mean value of image block x_k; p is a weighting parameter; k is the exposure index; and s_k is the unit-length block structure vector of the image block with exposure k;

obtaining the average intensity of the block through a weighted linear fusion mechanism, denoted l̂ and expressed as:

$$\hat{l} = \frac{\sum_{k=1}^{K} L(\mu_k, l_k)\, l_k}{\sum_{k=1}^{K} L(\mu_k, l_k)}$$

where L(μ_k, l_k) is a weighting function that takes as input the global mean μ_k of image X_k and the current image block x_k; l_k denotes the average intensity of the pixel blocks at different exposure rates;

restoring the stacked fused components ĉ, ŝ and l̂ to the RGB channels, i.e. the optimized enhanced image is expressed as:

$$\hat{x} = \hat{c} \cdot \hat{s} + \hat{l}$$
the method can well depict the edge and the texture area of the image in the process of processing the texture image noise, can reserve the low-frequency information of the image as much as possible, can distinguish the high-frequency information of the image, and is suitable for the occasions of image noise reduction with complicated texture detail characteristics. Compared with the traditional RetinexNet method, the scheme of the invention also has the following advantages:
1. and enhancing the brightness component by utilizing the mutually independent characteristic of each channel in the HSV color space model.
2. And the saturation component is adaptively adjusted along with the change of the brightness component by utilizing the correlation coefficient, so that the change of the color sensation of the image is avoided.
3. On the basis of UNet, different areas of the illumination enhancement image are combined to bear different levels of noise, and a reflectivity noise reduction model is constructed.
4. A camera response model is introduced to generate a virtual overexposed image that is complementary to the original image. On the basis of the original image, the image enhancement effect is improved, the brightness is more uniform, and the image detail information is better reserved.
Drawings
Fig. 1 is a schematic flow diagram of the low-illumination image enhancement method of a self-adaptive RetinexNet under a fusion strategy according to the present invention;
FIG. 2 is a schematic diagram of an Enhancenet network structure adopted by the present invention;
FIG. 3 is a schematic diagram of a RestorationNet network architecture according to the present invention;
FIG. 4 is a schematic diagram of a low-illumination image enhancement method of an adaptive RetinexNet under a fusion strategy according to the present invention;
fig. 5 is a schematic comparison of the enhanced images obtained by the method of the present invention, wherein (a) is the low-illumination input, (b) is the coarse enhanced image of (a) obtained by the present invention, and (c) is the final optimized enhanced image of (a) obtained by the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without making any creative effort based on the embodiments in the present invention, belong to the protection scope of the present invention.
The invention provides a low-illumination image enhancement method of a self-adaptive RetinexNet under a fusion strategy, which specifically comprises the following steps as shown in figure 1:
acquiring an original image and a synthesized low-illumination image corresponding to the original image from historical data, taking the original image as a normal light image, and taking the synthesized low-illumination image as a low-illumination image;
inputting the V-channel image of the low-illumination image and the normal light image into a DecomNet to obtain the illumination and reflectivity of the normal light image and the illumination and reflectivity of the low-illumination image;
inputting the reflectivity and illumination of the obtained low-illumination image into a RestorationNet, and using the illumination to guide the reflectivity to reduce noise to obtain the reflectivity after noise reduction;
inputting the reflectivity and illumination of the low-illumination image into an EnhanceNet, and enhancing the illumination of the low-illumination image to obtain enhanced illumination;
reconstructing the image, namely synthesizing an RGB image from the H channel, V channel and S channel of the optimized image, to obtain the coarse enhanced image; the reconstruction performed in this embodiment mainly comprises three steps: obtaining the enhanced V-channel image from the enhanced illumination and the denoised reflectivity according to the Retinex theory; adaptively adjusting the S-channel image according to the enhanced V-channel image so as to preserve the contrast of the image; and leaving the H-channel image unchanged and synthesizing an RGB image from the V-channel, S-channel and H-channel images (a reconstruction sketch follows this list);
and acquiring a virtual overexposure image of the low-illumination image, and fusing the low-illumination image, the rough enhanced image and the virtual overexposure image to obtain a final optimized enhanced image.
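Illustratively, the three-step reconstruction (V′ = R ∘ I′ per the Retinex theory, adaptive S adjustment, H unchanged) might be sketched as follows; adjust_saturation is the hypothetical helper sketched earlier, and a single-channel reflectivity and normalized float channels are assumed for simplicity:

```python
import cv2
import numpy as np

def reconstruct_coarse(h, s, v, R_denoised, I_enhanced, t=0.4):
    """Retinex reconstruction: enhanced V = reflectivity x enhanced
    illumination; S adaptively adjusted; H left unchanged.
    Assumes float32 HSV with H in [0, 360] and S, V in [0, 1]."""
    v_enh = np.clip(R_denoised * I_enhanced, 0.0, 1.0)      # V' = R o I'
    s_adj = adjust_saturation(s, v, v_enh, t=t)             # preserve contrast
    hsv = cv2.merge([h.astype(np.float32),
                     s_adj.astype(np.float32),
                     v_enh.astype(np.float32)])
    return cv2.cvtColor(hsv, cv2.COLOR_HSV2RGB)             # coarse enhanced image
```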
This embodiment provides a RetinexNet illumination-component enhancement method in the HSV color space: the method separates the V-channel image out of the RGB image, which better resolves the distortion problem of the color image; on this basis, a reflectivity restoration model based on the UNet network is designed, in which the illumination assists the recovery of the reflectivity; finally, a camera response model is introduced and a fusion strategy is designed in combination with image-block decomposition.
In this embodiment, by analyzing the HSV model and exploiting the mutual independence of its channels, the color information of the low-illumination image is completely retained and the color distortion of the enhanced image is reduced; meanwhile, the saturation is adaptively adjusted to avoid color deviation, and the image is reconstructed and converted back into the RGB space to obtain the final enhancement result. A change in image brightness changes the image contrast, producing a color deviation in the enhanced image; the saturation of the image is therefore adaptively adjusted with a correlation coefficient so that the contrast of the image is maintained:
s′(x, y) = s(x, y) + t[v′(x, y) − v(x, y)] × λ(x, y);

$$\lambda(x, y) = \frac{\sum_{(p,q) \in w} \bigl( v(p,q) - \bar{v}(x,y) \bigr) \bigl( s(p,q) - \bar{s}(x,y) \bigr)}{n^2 \sqrt{\delta_v(x,y)\, \delta_s(x,y)}}$$

where v(x, y) is the brightness of the corresponding pixel in the original image, v′(x, y) is the brightness of the pixel after enhancement, s(x, y) is the saturation of the corresponding pixel in the original image, s′(x, y) is the saturation of the pixel after correction, and t is a proportionality constant (t = 0.4 in the experiments); λ(x, y) is the correlation coefficient of v(x, y) and s(x, y); n × n is the size of the neighborhood window w; v̄(x, y) and s̄(x, y) are respectively the mean brightness and mean saturation of pixel (x, y) in the neighborhood window w; δ_v(x, y) and δ_s(x, y) are respectively the brightness and saturation variances of pixel (x, y) in the neighborhood window w; v(p, q) and s(p, q) are the brightness and saturation of the pixel at (p, q) in the neighborhood window; and (p, q) ∈ w denotes a pixel point in the neighborhood window.
The obtained V-channel image and the original normal-light image are taken as the input of the DecomNet. A 3 × 3 convolution kernel first extracts features from the input; 5 convolutional layers with ReLU and 3 × 3 kernels then map the input into R and I; finally a 3 × 3 convolution followed by a Sigmoid function produces a 4-channel image, of which the first 3 channels are taken as the reflection component and the last channel as the illumination component, yielding the reflectivity and illumination of the normal-light image and of the low-illumination image. The loss function of the decomposition network model mainly consists of a reconstruction loss, a reflection-component consistency loss and a structural smoothness loss, specifically:

$$L = L_{recon} + \lambda_{ir} L_{ir} + \lambda_{is} L_{is}$$

where λ_ir and λ_is denote the reflectivity consistency coefficient and the illumination smoothness coefficient respectively.
The model decomposes the image into a reflection component and an illumination component, whose recombination reconstructs the original image; the reconstruction loss can be expressed as:

$$L_{recon} = \sum_{i \in \{low,\,normal\}} \sum_{j \in \{low,\,normal\}} \lambda_{ij} \left\| R_i \circ I_j - S_j \right\|_1$$
introducing a shared reflectivity loss function for the purpose of maintaining reflectivity consistency, which is expressed by the following formula:
$$L_{ir} = \left\| R_{low} - R_{normal} \right\|_1$$
The original TV function is weighted by the gradient of the reflectivity map, giving the smoothness loss L_is:

$$L_{is} = \sum_{i \in \{low,\,normal\}} \left\| \nabla I_i \circ \exp(-\lambda_g \nabla R_i) \right\|_1$$

where ∇ comprises the horizontal gradient ∇_h and the vertical gradient ∇_v, and λ_g denotes the coefficient balancing the strength of structure awareness. With the weight exp(−λ_g ∇R_i), L_is relaxes the smoothness constraint where the gradient of the reflectivity is large, i.e. at image structures where the illumination is correspondingly discontinuous; the smoothness of the image structure is thereby maintained and a clearer illumination map is obtained.
The network adjustment stage mainly comprises two parts, a denoising network (RestorationNet) and an enhancement network (EnhanceNet), which respectively realize the denoising of the low-illumination reflection image and the enhancement of the illumination image.
The RestorationNet follows a typical 5-layer UNet structure, followed by a convolutional layer and a Sigmoid layer, with the loss function

$$L_{res} = \left\| \hat{R} - R_h \right\|_2^2 - \mathrm{SSIM}(\hat{R}, R_h) + \left\| \nabla \hat{R} - \nabla R_h \right\|_2^2$$

where R̂ denotes the recovered reflection map and SSIM(·) is the structural similarity measure: the second term measures the structural similarity between the recovered result R̂ and the target result R_h, the first term is their L_2 distance, and the last term keeps texture detail information and the like consistent.
The RestorationNet architecture is shown in fig. 3. The input data passes through four cascaded convolution + pooling structures; each comprises a convolution module and a pooling module, where the convolution module consists of a 3 × 3 convolution operation (Conv) and a ReLU activation function, and the pooling module consists of a 3 × 3 convolution operation, a ReLU activation function and a 2 × 2 max-pooling operation (MaxPooling). The four cascaded convolution + pooling structures are followed by four cascaded convolution + up-sampling structures; each comprises a convolution module and an up-sampling module, where the convolution module consists of a 3 × 3 convolution operation (Conv) and a ReLU activation function, and the up-sampling module consists of a 3 × 3 convolution operation (Conv), a ReLU activation function and an up-sampling operation (UpSampling). Each up-sampling module is skip-connected to the corresponding pooling module as shown in fig. 3, and the output of the last convolution + up-sampling stage is passed through a 3 × 3 convolution operation and a Sigmoid before being output.
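A compact tf.keras sketch of such a UNet-style denoiser follows; the filter counts and concatenation-style skip connections are assumptions, with only the four encoder/decoder stages and the final Conv + Sigmoid taken from fig. 3 and the text:

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

def build_restorationnet(base: int = 32) -> Model:
    """RestorationNet sketch: 4 Conv+ReLU+MaxPool encoder stages,
    4 Conv+ReLU+UpSampling decoder stages with skip connections,
    then a 3x3 Conv + Sigmoid producing the denoised reflectivity."""
    inp = layers.Input(shape=(None, None, 4))  # reflectivity (3) + illumination (1)
    x, skips = inp, []
    for i in range(4):                                   # encoder
        x = layers.Conv2D(base * 2 ** i, 3, padding="same", activation="relu")(x)
        skips.append(x)
        x = layers.MaxPooling2D(2)(x)
    for i in reversed(range(4)):                         # decoder
        x = layers.Conv2D(base * 2 ** i, 3, padding="same", activation="relu")(x)
        x = layers.UpSampling2D(2)(x)
        x = layers.Concatenate()([x, skips[i]])          # skip connection
    out = layers.Conv2D(3, 3, padding="same", activation="sigmoid")(x)
    return Model(inp, out)
```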
An illumination enhancement network (EnhanceNet) is adopted to adjust the illumination intensity. In the enhancement network, the reflection component map and the illumination component map are concatenated and input into the network layers, where the convolutional layers use 3 × 3 kernels and the pooling layers use 2 × 2 kernels. Following the U-Net idea, the image is then enlarged by nearest-neighbor interpolation for up-sampling. Feature maps of matching sizes are added element-wise and then fused, producing feature maps that preserve detail more completely; finally, the network is fine-tuned end-to-end by gradient descent. The whole encoder-decoder structure is adopted to capture the image information: the input image can be continuously down-sampled, yielding a large number of illumination-reflection representations.
As shown in fig. 2, the EnhanceNet adopted in this embodiment is an encoder-decoder architecture overall. The input data first passes through a convolutional layer with kernel size 3 and stride 1 for feature extraction. The extracted features are then fed sequentially into a first, second and third down-sampling, each consisting of a convolutional layer with stride 2 and an activation function, i.e. a Conv + ReLU structure. After the third down-sampling, the features are processed by a convolutional layer with kernel size 3 and then up-sampled, and after each up-sampling the result is processed by a further convolutional layer with kernel size 3. Each up-sampling layer uses resize-convolution, i.e. a nearest-neighbor interpolation operation followed by a convolutional layer with stride 1 and an activation function (Conv + ReLU), and each up-sampling output is skip-connected to the mirrored down-sampling feature map of the same scale. The outputs of the first, second and third up-sampling are spliced, the spliced data passes through a 1 × 1 convolutional layer to reduce the cascaded features to C channels, and finally the illumination map is reconstructed with a 3 × 3 convolutional layer.
Therefore, a large amount of illumination information is utilized, a local illumination distribution image is reconstructed through up-sampling, an improved illumination image is obtained, and meanwhile, multi-scale connection is introduced to improve the adaptivity of a network model.
Similar to the decomposition model loss function, the loss function of the enhanced network model is also mainly composed of a reconstruction loss function and a structure smoothing loss function, and the expression is shown in the following formula:
$$L_2 = L_{recon} + \lambda_{is} L_{is}$$
The finally obtained enhanced brightness component image is mapped, together with the H-channel and S-channel images, into the RGB space to obtain the coarse enhanced image.
Under the same conditions, a better-exposed image may provide more detailed information. Thus, by constructing a camera response model, a virtual overexposed image complementary to the original image can be obtained. The camera response model is:
P=f(E);
wherein E is the irradiance of the image, P is the image obtained by imaging of the camera, and f is the nonlinear response function of the camera.
For the low-illumination image enhancement problem, the functional form of f can be obtained indirectly by modeling the brightness transfer function (BTF). The BTF is the mapping function between two images P_0 and P_1 of the same scene taken with different exposures:

$$P_1 = g(P_0, k) = \beta P_0^{\gamma}$$

where g is the brightness transfer function, k is the exposure ratio, and β and γ are parameters determined by the camera parameters and the exposure ratio k. Solving the above equation yields the camera response model, namely:

$$g(P, k) = e^{b(1 - k^a)} P^{(k^a)}$$

Most cameras are accommodated when a = −0.3293 and b = 1.1258. In order to express as much information as possible using the input image and the generated image, the optimal exposure ratio k must be found so that the synthesized image is well exposed where the original image is underexposed.
The optimal exposure ratio k̂ is determined according to the principle of image-entropy maximization. The image entropy is expressed as:

$$H(P) = -\sum_{i=1}^{N} p_i \log_2 p_i$$

and the entropy maximization process is expressed as:

$$\hat{k} = \arg\max_{k} H\bigl(g(B, k)\bigr)$$

where H(·) denotes the image entropy, N is the maximum gray value of the image, p_i denotes the probability of occurrence of gray value i, and H(g(B, k)) is the image entropy of the brightness transfer function g(B, k).
After the optimal exposure ratio k̂ is obtained, a virtual overexposure Δk can be derived, and the virtual overexposed image is then generated using the brightness transfer function (BTF). The parameter Δk is set to 0.5.
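A sketch of the optimal-exposure search by entropy maximization over a grid of candidate ratios; the grid range is an assumption, and apply_btf is the helper sketched above:

```python
import numpy as np

def image_entropy(img: np.ndarray, bins: int = 256) -> float:
    """Shannon entropy of the gray-level histogram of img in [0, 1]."""
    hist, _ = np.histogram(img, bins=bins, range=(0.0, 1.0))
    p = hist / hist.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def optimal_exposure(brightness: np.ndarray,
                     ks=np.linspace(1.0, 10.0, 50)) -> float:
    """Return the exposure ratio k maximizing the entropy of g(B, k)."""
    return float(max(ks, key=lambda k: image_entropy(apply_btf(brightness, k))))
```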
Three images need to be fused. Each image is column-vectorized into blocks using the image-block decomposition method, each block being decomposed as:

$$x = c \cdot s + l$$

where c is the block signal strength, s is the block structure, and l is the average intensity of the block.

Considering that all input pixel blocks are true captures of the scene, the visibility is best for the block with the highest contrast. The desired signal strength of the fused image block is therefore determined by the maximum signal strength over all source image blocks, denoted ĉ:

$$\hat{c} = \max_{1 \le k \le K} c_k$$

where c_k denotes the signal strength of x_k, and {x_k | 1 ≤ k ≤ K} is the set of image blocks extracted at the same spatial position of a source sequence containing K multi-exposure images. Each x_k is a CN²-dimensional column vector, where C is the number of color channels of the input image and N is the spatial size of the square block.
Unlike the signal strength, the unit-length structure vector s_k points in a particular direction in the CN²-dimensional space. The desired structure of the fused image block should best represent the structures of all source image blocks, and its value is computed through the expectation of the block structures of the column-vectorized images. A simple implementation of this relationship is as follows:

$$\bar{s} = \frac{\sum_{k=1}^{K} S(\tilde{x}_k)\, s_k}{\sum_{k=1}^{K} S(\tilde{x}_k)}, \qquad \hat{s} = \frac{\bar{s}}{\|\bar{s}\|}$$

where S(x̃_k) is a weighting function that determines the contribution of each source image block to the structure of the fused block, and x̃_k = x_k − μ_{x_k} is the mean-removed image block. The contribution increases with the strength of the image block, using the power weighting function:

$$S(\tilde{x}_k) = \|\tilde{x}_k\|^p$$

where p ≥ 0 is an exponent parameter; different choices of p yield weighting functions with different physical meanings, and the larger p is, the more the image blocks with relatively greater strength dominate.
Regarding the average intensity of the local image, the average intensity of the fused block is obtained through a weighted linear fusion mechanism, denoted l̂ and expressed as:

$$\hat{l} = \frac{\sum_{k=1}^{K} L(\mu_k, l_k)\, l_k}{\sum_{k=1}^{K} L(\mu_k, l_k)}$$

where L(μ_k, l_k) is a weighting function that takes as input the global mean μ_k of image X_k and the mean intensity l_k of the current image block x_k, and quantifies the well-exposedness of x_k within X_k; l_k denotes the average intensity of a block with exposure k. As a preferred embodiment, this measure is specified with a two-dimensional Gaussian distribution:

$$L(\mu_k, l_k) = \exp\left( -\frac{(\mu_k - \mu_c)^2}{2\sigma_g^2} - \frac{(l_k - l_c)^2}{2\sigma_l^2} \right)$$

where σ_g and σ_l control the spread of the profile along the μ_k and l_k dimensions respectively, and μ_c and l_c are constants at the middle intensity value. Here the middle intensity refers to the median between the maximum and minimum values of a parameter; for example, if a parameter ranges over [0, 1], its middle intensity value is 0.5. μ_c and l_c are determined from the ranges of μ_k and l_k respectively.
The stacked fused components ĉ, ŝ and l̂ are restored to the RGB channels, i.e. the optimized enhanced image block is expressed as:

$$\hat{x} = \hat{c} \cdot \hat{s} + \hat{l}$$
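An illustrative NumPy sketch of this patch-wise fusion for K exposures at one spatial position follows; the window extraction, the aggregation over overlapping patches and the Gaussian parameter defaults are assumptions:

```python
import numpy as np

def fuse_patches(patches: np.ndarray, global_means: np.ndarray,
                 p: float = 4.0, sigma_g: float = 0.2, sigma_l: float = 0.5,
                 mu_c: float = 0.5, l_c: float = 0.5) -> np.ndarray:
    """patches: K x D array, one column-vectorized block per exposure.
    Returns the fused block c_hat * s_hat + l_hat."""
    l = patches.mean(axis=1, keepdims=True)          # mean intensity per block
    x_tilde = patches - l                            # mean-removed blocks
    c = np.linalg.norm(x_tilde, axis=1)              # signal strength
    c_hat = c.max()                                  # strongest contrast wins
    w_s = c ** p                                     # power weighting S = ||x~||^p
    s_bar = (w_s[:, None] * x_tilde / np.maximum(c, 1e-12)[:, None]).sum(0) / w_s.sum()
    s_hat = s_bar / max(np.linalg.norm(s_bar), 1e-12)
    w_l = np.exp(-(global_means - mu_c) ** 2 / (2 * sigma_g ** 2)
                 - (l[:, 0] - l_c) ** 2 / (2 * sigma_l ** 2))
    l_hat = (w_l * l[:, 0]).sum() / w_l.sum()        # weighted mean intensity
    return c_hat * s_hat + l_hat
```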
the embodiment also provides a specific implementation example, a deep learning framework used in the example is Tensorflow 1.13GPU, a NumPy computing library and a PIL image processing library are installed, software development environments of experiments are Pycherm 2019 and python3.7, and implementation results are shown in FIG. 5, wherein (a) is a low-illumination image, (b) is a coarse enhanced image, and (c) is a final enhanced image, so that the image processed by the method has higher detail information and smaller image distortion phenomenon, and the image quality is effectively improved.
The implementation process of the embodiment proposed by the present invention is shown in fig. 4. In the training stage, the V-channel image S_low of a corresponding low-illumination image is generated from the normal image S_normal. The Decomposition stage produces reflectivity and illumination during training, and the differences between the illumination I_normal of the normal image S_normal and the illumination I_low of the V-channel image S_low, and between the reflectivity R_normal of S_normal and the reflectivity R_low of S_low, are used to update the network parameters; the RestorationNet and the EnhanceNet are updated in the same way. The S-channel image of the low-illumination image is adaptively adjusted according to the obtained coarse enhanced V-channel image, and the coarse enhanced V-channel image, the adaptively adjusted S-channel image and the H-channel image are synthesized into an RGB image, namely the coarse enhanced image. A virtual overexposed image is generated from the low-illumination image, and the virtual overexposed image, the low-illumination image and the coarse enhanced image are fused into the final enhanced image; the loss between this image and the normal image is used to update the network parameters. In the real-time stage, a low-illumination image is taken as input: the trained Decomposition yields the reflectivity and illumination of its V-channel image, the reflectivity is denoised by the RestorationNet and the illumination is enhanced by the EnhanceNet, and the denoised reflectivity and enhanced illumination are combined into the enhanced V-channel image. The S-channel image of the low-illumination image is then adaptively adjusted according to the coarse enhanced V channel, and the processed S and V channels are synthesized with the H channel into an RGB image, namely the coarse enhanced image; finally, a virtual overexposed image of the low-illumination image is generated, and the virtual overexposed image, the low-illumination image and the coarse enhanced image are fused to obtain the final enhanced image.
Although embodiments of the present invention have been shown and described, it will be appreciated by those skilled in the art that changes, modifications, substitutions and alterations can be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.

Claims (9)

1. A low-illumination image enhancement method of a self-adaptive RetinexNet under a fusion strategy is characterized by comprising the following steps:
acquiring an original image and a synthesized low-illumination image corresponding to the original image from historical data, taking the original image as a normal light image, and taking the synthesized low-illumination image as a low-illumination image;
inputting the V-channel image of the low-illumination image and the normal light image into a DecomNet to obtain the illumination and the reflectivity of the normal light image and the illumination and the reflectivity of the low-illumination image;
inputting the reflectivity and illumination of the obtained low-illumination image into a RestorationNet, and using the illumination to guide the reflectivity to reduce noise to obtain the reflectivity after noise reduction;
inputting the reflectivity and illumination of the low-illumination image into an EnhanceNet, and enhancing the illumination of the low-illumination image to obtain enhanced illumination;
reconstructing the image, namely synthesizing an RGB image from the H channel, V channel and S channel of the optimized image, to obtain the coarse enhanced image;
and acquiring a virtual overexposure image of the low-illumination image, and fusing the low-illumination image, the rough enhanced image and the virtual overexposure image to obtain a final optimized enhanced image.
2. The method for enhancing a low-illumination image by self-adaptive RetinexNet under a fusion strategy as claimed in claim 1, wherein before an image is input into the DecomNet, color channel conversion is performed on the training set or on the real-time low-illumination image to be enhanced, converting the image from an RGB image into an HSV image.
3. The method for enhancing a low-illumination image by self-adaptive RetinexNet under a fusion strategy as claimed in claim 1, wherein for an image input into the DecomNet, features are extracted by a convolutional layer with a 3 × 3 kernel, the extracted features are then mapped sequentially by 5 convolutional layers with ReLU and 3 × 3 kernels, and one further convolutional layer with a 3 × 3 kernel followed by a Sigmoid function produces a 4-channel image, the first 3 channels of which are taken as the reflectivity R of the image and the last channel as the illumination I.
4. The method for enhancing a low-illumination image by self-adaptive RetinexNet under a fusion strategy according to claim 1, wherein the process of enhancing the illumination of the low-illumination image to obtain the enhanced illumination comprises:
splicing the illumination of the low-illumination image and the reflectivity after noise reduction to be used as the input of an Enhancenet network;
acquiring context information in a large area of an input image through an encoder-decoder framework of an EnhanceNet network;
in an EnhanceNet network, an input image is downsampled to different sizes by adopting three downsampling modules;
and respectively splicing the image subjected to down-sampling and the context information, and reconstructing the spliced image through up-sampling to obtain enhanced illumination.
5. The method for enhancing a low-illumination image by self-adaptive RetinexNet under a fusion strategy as claimed in claim 1, wherein the DecomNet is trained by backpropagation through its loss function; the loss function L_1 of the DecomNet consists of a reconstruction loss L_recon, a reflection-component consistency loss L_ir and a structural smoothness loss L_is, expressed as:

$$L_1 = L_{recon} + \lambda_{ir} L_{ir} + \lambda_{is} L_{is}$$

$$L_{recon} = \sum_{i \in \{low,\,normal\}} \sum_{j \in \{low,\,normal\}} \lambda_{ij} \left\| R_i \circ I_j - S_j \right\|_1$$

$$L_{ir} = \left\| R_{low} - R_{normal} \right\|_1$$

$$L_{is} = \sum_{i \in \{low,\,normal\}} \left\| \nabla I_i \circ \exp(-\lambda_g \nabla R_i) \right\|_1$$

where λ_ir denotes the reflectivity consistency coefficient and λ_is the illumination smoothness coefficient; low denotes the low-illumination image dataset and normal the normal-light image dataset; λ_ij are the balance coefficients of the reconstruction loss; R_i is the reflectivity for i = low or normal; I_j is the illumination of the low-illumination image when j = low and of the normal-light image when j = normal; S_j denotes the low-illumination image when j = low and the normal-light image when j = normal; R_low denotes the reflectivity of the low-illumination image; R_normal denotes the reflectivity of the normal-light image; ∇ denotes the gradient operator; λ_g is the coefficient balancing the strength of structure awareness; and ‖·‖_1 denotes the 1-norm and ‖·‖_2 the 2-norm;
the RestorationNet is trained by backpropagation through its loss function, expressed as:

$$L_{res} = \left\| \hat{R} - R_h \right\|_2^2 - \mathrm{SSIM}(\hat{R}, R_h) + \left\| \nabla \hat{R} - \nabla R_h \right\|_2^2$$

where L_res is the loss function of the RestorationNet; R̂ is the denoised reflection map; R_h is the reflectivity of the normal-light image; and SSIM(R̂, R_h) is the structural similarity measure between R̂ and R_h;
the EnhanceNet is trained by backpropagation through its loss function, which consists of the reconstruction loss L_recon and the structural smoothness loss L_is, expressed as:

$$L_2 = L_{recon} + \lambda_{is} L_{is}$$

where L_2 is the loss function of the EnhanceNet.
6. The method for enhancing a low-illumination image by self-adaptive RetinexNet under a fusion strategy as claimed in claim 1, wherein after the enhanced V-channel image is obtained, the S-channel image is adaptively adjusted, the adjustment being expressed as:

s′(x, y) = s(x, y) + t[v′(x, y) − v(x, y)] × λ(x, y);

where s′(x, y) is the saturation of the pixel in row x, column y of the coarse enhanced image; s(x, y) is the saturation of the pixel in row x, column y of the low-illumination image; v′(x, y) is the brightness of the pixel in row x, column y of the coarse enhanced image; v(x, y) is the brightness of the pixel in row x, column y of the low-illumination image; t is a proportionality constant; and λ(x, y) is the correlation coefficient of the brightness and saturation in the neighborhood of (x, y).
7. The method for enhancing a low-illumination image by self-adaptive RetinexNet under a fusion strategy as claimed in claim 1, wherein the correlation coefficient λ(x, y) is expressed as:

$$\lambda(x, y) = \frac{\sum_{(p,q) \in w} \bigl( v(p,q) - \bar{v}(x,y) \bigr) \bigl( s(p,q) - \bar{s}(x,y) \bigr)}{n^2 \sqrt{\delta_v(x,y)\, \delta_s(x,y)}}$$

where v(p, q) is the brightness of the pixel at position (p, q) in the neighborhood window of pixel (x, y), and s(p, q) is the saturation of the pixel at position (p, q) in the neighborhood window of pixel (x, y); v̄(x, y) is the mean brightness and s̄(x, y) the mean saturation of pixel (x, y) in the neighborhood window w; δ_v(x, y) is the brightness variance and δ_s(x, y) the saturation variance of pixel (x, y) in the neighborhood window w; and w is an n × n window centered on pixel (x, y).
8. The method for enhancing the low-illumination image of the adaptive RetinexNet under the fusion strategy according to claim 1, wherein the virtual overexposure image is obtained by using a camera response model, and the method is represented as follows:
P=f(E);
wherein, P is an image obtained by camera imaging, namely a virtual overexposure image; e is the irradiance of the low-illumination image; f is the camera nonlinear response function.
9. The method for enhancing a low-illumination image by self-adaptive RetinexNet under a fusion strategy as claimed in claim 1, wherein the process of fusing the original low-illumination image, the coarse enhanced image and the virtual overexposed image to obtain the final optimized enhanced image comprises:

performing column vectorization on the original low-illumination image, the coarse enhanced image and the virtual overexposed image using the image-block decomposition method;

taking the desired block signal strength of the fused image as the maximum block signal strength of the column-vectorized images, denoted ĉ:

$$\hat{c} = \max_{1 \le k \le K} c_k$$

obtaining the desired block structure of the fused image through the expectation of the block structures of the column-vectorized images, denoted ŝ and expressed as:

$$\bar{s} = \frac{\sum_{k=1}^{K} S(\tilde{x}_k)\, s_k}{\sum_{k=1}^{K} S(\tilde{x}_k)}, \qquad \hat{s} = \frac{\bar{s}}{\|\bar{s}\|}$$

where S(x̃_k) = ‖x̃_k‖^p is the weighting function; x̃_k = x_k − μ_{x_k} is the mean-removed image block, x_k denotes an image block and μ_{x_k} the mean value of image block x_k; p is a weighting parameter; k is the exposure rate; and s_k is the unit-length block structure vector of the image with exposure k;

obtaining the average intensity of the block through a weighted linear fusion mechanism, denoted l̂ and expressed as:

$$\hat{l} = \frac{\sum_{k=1}^{K} L(\mu_k, l_k)\, l_k}{\sum_{k=1}^{K} L(\mu_k, l_k)}$$

where L(μ_k, l_k) is a weighting function taking as input the global mean μ_k of image X_k and the current image block x_k; l_k denotes the average intensity of pixel blocks of different exposures;

restoring the stacked fused components ĉ, ŝ and l̂ to the RGB channels, i.e. the optimized enhanced image is expressed as:

$$\hat{x} = \hat{c} \cdot \hat{s} + \hat{l}$$
CN202210644966.3A 2022-06-09 2022-06-09 Low-illumination image enhancement method for self-adaptive RetinexNet under fusion strategy Pending CN115035011A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210644966.3A CN115035011A (en) 2022-06-09 2022-06-09 Low-illumination image enhancement method for self-adaptive RetinexNet under fusion strategy

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210644966.3A CN115035011A (en) 2022-06-09 2022-06-09 Low-illumination image enhancement method for self-adaptive RetinexNet under fusion strategy

Publications (1)

Publication Number Publication Date
CN115035011A true CN115035011A (en) 2022-09-09

Family

ID=83123144

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210644966.3A Pending CN115035011A (en) 2022-06-09 2022-06-09 Low-illumination image enhancement method for self-adaptive RetinexNet under fusion strategy

Country Status (1)

Country Link
CN (1) CN115035011A (en)


Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115294126A (en) * 2022-10-08 2022-11-04 南京诺源医疗器械有限公司 Intelligent cancer cell identification method for pathological image
CN115294126B (en) * 2022-10-08 2022-12-16 南京诺源医疗器械有限公司 Cancer cell intelligent identification method for pathological image
CN116363009A (en) * 2023-03-31 2023-06-30 哈尔滨工业大学 Method and system for enhancing rapid light-weight low-illumination image based on supervised learning
CN116363009B (en) * 2023-03-31 2024-03-12 哈尔滨工业大学 Method and system for enhancing rapid light-weight low-illumination image based on supervised learning


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination