CN111612722B - Low-illumination image processing method based on simplified Unet full-convolution neural network - Google Patents
- Publication number: CN111612722B (application CN202010455150.7A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G06T5/00 — Image data processing or generation, in general; Image enhancement or restoration
- G06N3/045 — Computing arrangements based on biological models; Neural networks; Architecture, e.g. interconnection topology; Combinations of networks
- G06N3/08 — Computing arrangements based on biological models; Neural networks; Learning methods
- G06T2207/10016 — Indexing scheme for image analysis or image enhancement; Image acquisition modality; Video; Image sequence
- Y02T10/40 — Climate change mitigation technologies related to transportation; Internal combustion engine [ICE] based vehicles; Engine management systems
Abstract
The invention discloses a low-illumination image processing method based on a simplified Unet full convolution neural network, comprising the following steps: collecting dark images and corresponding clear images to generate an image data set; constructing a full convolution neural network (FCN) model for end-to-end image enhancement; training the FCN model with the generated image data set to obtain a trained FCN model; and inputting a low-light image in the original RAW format into the trained FCN model to obtain an enhanced clear image. The invention rebuilds the network by reducing the number of network layers and the number of convolution kernels, and trains the simplified Unet in a supervised manner on a large data set of real scenes. Compared with the performance of the network before simplification, the simplified Unet full convolution neural network cuts the running time by more than half and greatly reduces the training cost while keeping the difference in visual effect insignificant.
Description
Technical Field
The invention relates to the technical field of neural networks, in particular to a video image processing method in a low-illumination environment.
Background
With the rapid development of information technology, digital images play a large role in many fields such as public safety, medicine and entertainment, and people's requirements for image quality keep rising. However, owing to factors such as the capture device and the shooting environment, an original low-illumination image often cannot fully satisfy either the visual viewing needs of people or the requirements of engineering applications of image technology. Because many upper-layer image processing algorithms place certain demands on picture quality, enhancement of low-illumination images is fundamental work for image applications.
Disclosure of Invention
In view of the above, the present invention is directed to a low-illumination image processing method based on a simplified Unet full convolution neural network, so as to solve two technical problems: how to process a low-illumination, low-quality image into an image recognizable to the human eye, and how to increase the processing speed.
The invention discloses a low-illumination image processing method based on a simplified Unet full convolution neural network, which comprises the following steps of:
the method comprises the following steps: collecting a short-exposure dark image and a corresponding long-exposure bright clear image in a low-illumination environment, and generating an image data set by the collected dark image and the corresponding clear image;
step two: constructing a full convolution neural network (FCN) model for end-to-end image enhancement, wherein the full convolution neural network (FCN) model comprises an input layer, a hidden layer and an output layer, the input layer is used for inputting a graph, the convolution layer of each computing node in the hidden layer is used for performing convolution calculation and deconvolution calculation on input data, all layers of the FCN model are connected together through an activation function, and network parameters are continuously improved through a training algorithm;
step three: training the FCN model of the convolutional network by using the image data set generated in the first step to obtain a trained FCN model;
step four: and inputting the low-light image in the original RAW format into the trained FCN model to obtain an enhanced clear image.
Further, acquiring a short-exposure dark image and a corresponding bright, clear long-exposure image in step one comprises:
step 1: selecting a shooting scene, and fixing the camera so that its shooting posture remains unchanged;
step 2: setting the camera exposure time to 0.1 s, 0.04 s and 0.033 s in turn for short-exposure shots, and to 10 s for a long-exposure shot;
step 3: repeatedly selecting different shooting scenes and acquiring images according to steps 1 and 2, obtaining pairs of matched dark and bright, clear images.
Further, the input layer of the full convolution neural network FCN model in step two receives 4-channel image data;
the hidden layer of the full convolution neural network FCN model in the step two comprises:
convolutional layer 1: the size of the convolution kernel is 3 × 3, the number of convolution kernels is 32, the convolution stride s = 1, padding = valid;
pooling layer 1: Max pooling is selected, with pooling size 2 × 2, stride s = 2, padding = same;
convolutional layer 2: the size of the convolution kernel is 3 × 3, the number of convolution kernels is 64, stride s = 1, padding = valid;
pooling layer 2: Max pooling is selected, with pooling size 2 × 2, stride s = 2, padding = same;
convolutional layer 3: the size of the convolution kernel is 3 × 3, the number of convolution kernels is 128, stride s = 1, padding = valid;
pooling layer 3: Max pooling is selected, with pooling size 2 × 2, stride s = 2, padding = same;
convolutional layer 4: the size of the convolution kernel is 3 × 3, the number of convolution kernels is 256, stride s = 1, padding = valid;
pooling layer 4: Max pooling is selected, with pooling size 2 × 2, stride s = 2, padding = same;
convolutional layer 5-1: the size of the convolution kernel is 3 × 3, the number of convolution kernels is 512, stride s = 1, padding = valid;
convolutional layer 5-2: the size of the convolution kernel is 3 × 3, the number of convolution kernels is 512, stride s = 1, padding = valid;
deconvolution layer 6: the size of the convolution kernel is 2 × 2, doubling the rows and columns;
cascade structure: after a Crop operation, convolutional layer 4 is concatenated with convolutional layer 5-2; the high-resolution and low-resolution feature maps are fused and the spliced result is used as the input of the next convolutional layer;
convolutional layer 6: the size of the convolution kernel is 3 × 3, the number of convolution kernels is 256, stride s = 1, padding = valid;
deconvolution layer 7: the size of the convolution kernel is 2 × 2, doubling the rows and columns;
cascade structure: after a Crop operation, convolutional layer 3 is concatenated with convolutional layer 6; the high-resolution and low-resolution feature maps are fused and the spliced result is used as the input of the next convolutional layer;
convolutional layer 7: the size of the convolution kernel is 3 × 3, the number of convolution kernels is 128, stride s = 1, padding = valid;
deconvolution layer 8: the size of the convolution kernel is 2 × 2, doubling the rows and columns;
cascade structure: after a Crop operation, convolutional layer 2 is concatenated with convolutional layer 7; the high-resolution and low-resolution feature maps are fused and the spliced result is used as the input of the next convolutional layer;
convolutional layer 8: the size of the convolution kernel is 3 × 3, the number of convolution kernels is 64, stride s = 1, padding = valid;
deconvolution layer 9: the size of the convolution kernel is 2 × 2, doubling the rows and columns;
cascade structure: after a Crop operation, convolutional layer 1 is concatenated with convolutional layer 8; the high-resolution and low-resolution feature maps are fused and the spliced result is used as the input of the next convolutional layer;
convolutional layer 9: the size of the convolution kernel is 3 × 3, the number of convolution kernels is 32, stride s = 1, padding = valid;
The output layer of the full convolution neural network FCN model in step two is convolutional layer 10: the convolution kernel size is 1 × 1 and the number of convolution kernels is 12, so 12 channels are output; each feature map is 1/2 the size of the RGB components, and the 1 × 1 convolution changes only the number of feature maps, not their spatial size.
Further, training the FCN model of the convolutional network in step three using the image data set generated in step one comprises:
1) Preprocessing the original image: first perform RGB pixel separation on the Raw-format image, unpacking each Raw image block into a four-channel feature map of RGBG components; the spatial resolution on each channel is halved, finally yielding four RGBG channels each 1/2 the size of the original image. The black level value is then subtracted from the feature maps and different amplification ratios are applied, so that their brightness matches that of the clear image corresponding to the long exposure;
2) Inputting the preprocessed image into the full convolution neural network FCN model; the feature map output by the model has 12 channels, and the size of each feature map is half that of the RGB components;
3) Performing a sub-pixel convolution operation on the image output by the full convolution neural network FCN model to restore the data to the normal sRGB image format.
Further, the training algorithm comprises:
1) The activation functions of the FCN model are all selected as Leaky Relu functions, and the expression of the Leaky Relu functions is as follows:
f(x) = max(0.2x, x), where x is the input value;
2) Pooling the images after the convolution operation, and selecting a maximum pooling method;
3) A loss function is constructed from the predicted image I_out and the desired image I_gt, penalizing the pixel-wise difference between them;
4) The loss function is processed using an ADAM optimizer.
5) A piecewise (segmented) learning-rate schedule is set to strengthen the learning effect: the initial learning rate is 1e-4 and training runs for 4000 epochs; at epoch 2000 the learning rate is divided by 10.
The invention has the beneficial effects that:
aiming at the problems that in the image enhancement by utilizing a Unet convolutional neural network in a low-illumination environment, the operation speed is low due to the complex network structure, the noise-modulated multi-enhancement effect is not obvious by using a traditional method and the like, the low-illumination image processing method based on the simplified Unet fully convolutional neural network provided by the invention reserves the network construction style of computer coding and decoding, rebuilds the network by simplifying the network layer number and reducing the convolutional kernel number, and trains the simplified Unet network in a supervised learning mode through a large batch of data sets based on real scenes. The final experiment result shows that compared with the network performance before simplification, the simplified Unet full convolution neural network reduces the running time by more than half and greatly reduces the training cost under the condition of keeping the visual effect difference unobvious.
Drawings
FIG. 1 is a schematic structural diagram of a deep learning image enhancement model;
FIG. 2 is a long exposure sharp image acquired;
FIG. 3 is a low-illumination image taken with a short exposure;
FIG. 4 is an image after processing a short-exposed low-illumination image using the method of the embodiment;
FIG. 5 illustrates an image after white balance adjustment using conventional methods;
FIG. 6 is a graph of the Leaky ReLU function;
fig. 7 is a diagram of an image enhancement process.
Detailed Description
The invention is further described below with reference to the figures and examples.
In this embodiment, the method for processing a low-illuminance image based on a simplified Unet full convolution neural network includes the steps of:
the method comprises the following steps: a short-exposure dark image and a corresponding long-exposure bright sharp image are acquired in a low-light environment, and an image dataset is generated from the acquired dark image and the corresponding sharp image.
Acquiring a short-exposure dark image and a corresponding long-exposure bright sharp image includes:
step 1: selecting a shooting scene, and fixing the camera so that its shooting posture remains unchanged;
step 2: setting the camera exposure time to 0.1 s, 0.04 s and 0.033 s in turn for short-exposure shots, and to 10 s for a long-exposure shot;
step 3: repeatedly selecting different shooting scenes and acquiring images according to steps 1 and 2, obtaining pairs of matched dark and bright, clear images.
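The exposure settings above determine the brightness gap the network must bridge. As a hedged illustration (the patent does not give this formula; scaling by the ratio of exposure times, as in related RAW-enhancement work, is assumed here), the amplification factor pairing each short exposure with the 10 s reference can be computed as:

```python
def amplification_ratio(short_exposure_s: float, long_exposure_s: float = 10.0) -> float:
    # Assumed helper, not stated in the patent: the brightness amplification
    # factor between a short-exposure frame and its long-exposure reference.
    return long_exposure_s / short_exposure_s

# The three short exposures used above, each paired with the 10 s long exposure:
ratios = [amplification_ratio(t) for t in (0.1, 0.04, 0.033)]
```

So the three short exposures correspond to amplification factors of roughly 100x, 250x and 300x.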
Step two: the method comprises the steps of constructing a full convolution neural network FCN model for end-to-end image enhancement, wherein the full convolution neural network FCN model comprises an input layer, a hidden layer and an output layer, the input layer is used for inputting a graph, the convolution layer of each computing node in the hidden layer is used for performing convolution calculation and deconvolution calculation on input data, all layers of the FCN model are connected together through an activation function, and network parameters are continuously improved through a training algorithm.
The input layer of the full convolution neural network FCN model accepts full-resolution input images; the size of the image it receives is not limited. The input layer of this embodiment receives 4-channel (R, G, B, G) image data.
The hidden layer of the full convolution neural network FCN model comprises:
convolutional layer 1: the size of the convolution kernel is 3 multiplied by 3, the number of the convolution kernels is 32, the convolution step length s =1, valid is selected by padding; thus, the convolutional layer can ensure that the segmentation result is complete, is obtained based on the context features without missing, and can cause the sizes of the input and the output to be inconsistent. So the feature map size will be reduced by 2 after this operation.
Pooling layer 1: Max pooling is selected, with pooling size 2 × 2, stride s = 2, padding = same. Here padding = same, which differs from the original Unet's choice of valid. The same strategy fills the edges with 0, ensuring that every value of the feature map is visited; valid instead skips, without filling, any region that cannot form a complete pooling window, which loses some information whenever the feature-map size before pooling is odd.
Convolutional layer 2: the convolution kernel size is 3 × 3, the number of convolution kernels is 64, stride s = 1, padding = valid.
Pooling layer 2: Max pooling is chosen, with pooling size 2 × 2, stride s = 2, padding = same.
Convolutional layer 3: the convolution kernel size is 3 × 3, the number of convolution kernels is 128, stride s = 1, padding = valid.
Pooling layer 3: Max pooling is chosen, with pooling size 2 × 2, stride s = 2, padding = same.
Convolutional layer 4: the convolution kernel size is 3 × 3, the number of convolution kernels is 256, stride s = 1, padding = valid.
Pooling layer 4: Max pooling is chosen, with pooling size 2 × 2, stride s = 2, padding = same.
Convolutional layer 5-1: the convolution kernel size is 3 × 3, the number of convolution kernels is 512, stride s = 1, padding = valid.
Convolutional layer 5-2: the convolution kernel size is 3 × 3, the number of convolution kernels is 512, stride s = 1, padding = valid.
Deconvolution layer 6: the convolution kernel size is 2 × 2, doubling the rows and columns.
Cascade structure: after a Crop operation, convolutional layer 4 is concatenated with convolutional layer 5-2; the high-resolution and low-resolution feature maps are fused and the spliced result is used as the input of the next convolutional layer.
Convolutional layer 6: the convolution kernel size is 3 × 3, the number of convolution kernels is 256, stride s = 1, padding = valid.
Deconvolution layer 7: the convolution kernel size is 2 × 2, doubling the rows and columns.
Cascade structure: after a Crop operation, convolutional layer 3 is concatenated with convolutional layer 6; the high-resolution and low-resolution feature maps are fused and the spliced result is used as the input of the next convolutional layer.
Convolutional layer 7: the convolution kernel size is 3 × 3, the number of convolution kernels is 128, stride s = 1, padding = valid.
Deconvolution layer 8: the convolution kernel size is 2 × 2, doubling the rows and columns.
Cascade structure: after a Crop operation, convolutional layer 2 is concatenated with convolutional layer 7; the high-resolution and low-resolution feature maps are fused and the spliced result is used as the input of the next convolutional layer.
Convolutional layer 8: the convolution kernel size is 3 × 3, the number of convolution kernels is 64, stride s = 1, padding = valid.
Deconvolution layer 9: the convolution kernel size is 2 × 2, doubling the rows and columns.
Cascade structure: after a Crop operation, convolutional layer 1 is concatenated with convolutional layer 8; the high-resolution and low-resolution feature maps are fused and the result is used as the input of the next convolutional layer.
Convolutional layer 9: the convolution kernel size is 3 × 3, the number of convolution kernels is 32, stride s = 1, padding = valid.
The output layer of the full convolution neural network FCN model in step two is convolutional layer 10: the convolution kernel size is 1 × 1 and the number of convolution kernels is 12, so 12 channels are output; each feature map is 1/2 the size of the RGB components, and the 1 × 1 convolution changes only the number of feature maps, not their spatial size.
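Since every 3 × 3 valid convolution shrinks each spatial dimension by 2, and every same-padded 2 × 2, stride-2 pooling halves it (rounding up), the encoder shapes can be traced with simple arithmetic. A sketch of this bookkeeping, using an illustrative input size of 512 (the model itself accepts arbitrary full-resolution inputs):

```python
def conv_valid(n: int, k: int = 3) -> int:
    # 'valid' convolution: no padding, so the spatial size shrinks by k - 1
    return n - (k - 1)

def pool_same(n: int, s: int = 2) -> int:
    # 'same' max pooling with stride s: output size is ceil(n / s)
    return -(-n // s)

# Trace one spatial dimension through the encoder of the simplified Unet:
# convolutional layers 1-4, each followed by a pooling layer, then the
# two 512-kernel bottleneck convolutions (layers 5-1 and 5-2).
n = 512  # illustrative input size, an assumption for this sketch
trace = []
for _ in range(4):
    n = conv_valid(n)
    trace.append(n)
    n = pool_same(n)
    trace.append(n)
n = conv_valid(n)
trace.append(n)
n = conv_valid(n)
trace.append(n)
# trace -> [510, 255, 253, 127, 125, 63, 61, 31, 29, 27]
```

The shrink-by-2 of valid convolutions is exactly why each cascade structure needs a Crop operation before concatenating encoder and decoder feature maps.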
The training algorithm comprises:
1) The activation functions of the FCN model are all selected as Leaky Relu functions, and the expression of the Leaky Relu functions is as follows:
f(x)=max(0.2x,x)
where x is the input value; a graph of this function is shown in FIG. 6. The Leaky ReLU activation function avoids the vanishing-gradient problem, since its slope is nonzero even for negative inputs.
2) The feature maps are pooled after the convolution operation, using max pooling. Extracting the dominant features of each local area greatly reduces the dimensionality of the data and correspondingly the total number of weight parameters, which lowers the computation cost and improves computational efficiency.
3) A loss function is constructed from the predicted image I_out and the desired image I_gt, penalizing the pixel-wise difference between them.
4) The loss function is optimized with the ADAM optimizer, which dynamically adjusts the learning rate of each parameter using first- and second-moment estimates of the gradient, computing a different adaptive learning rate for each parameter. Adam is chosen as the optimizer because, after bias correction, the learning rate of each iteration stays within a certain range, which keeps the parameters relatively stable, and its memory requirements are comparatively small.
5) A piecewise (segmented) learning-rate schedule is set to strengthen the learning effect: the initial learning rate is 1e-4 and training runs for 4000 epochs; at epoch 2000 the learning rate is divided by 10. Finally, a deep-learning image enhancement model with fixed, fully trained parameters is obtained.
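The training ingredients above can be sketched in a few lines: the Leaky ReLU activation f(x) = max(0.2x, x), the piecewise learning-rate schedule, and a mean-absolute (L1) pixel loss. The L1 form is an assumption in this sketch — the patent's exact loss formula is not reproduced in the text:

```python
def leaky_relu(x: float) -> float:
    # f(x) = max(0.2x, x): identity for x >= 0, slope 0.2 for x < 0,
    # so the gradient never vanishes entirely
    return max(0.2 * x, x)

def learning_rate(epoch: int, initial: float = 1e-4, drop_at: int = 2000) -> float:
    # Piecewise schedule from the text: divide the rate by 10 at epoch 2000
    # of the 4000 training epochs
    return initial if epoch < drop_at else initial / 10.0

def l1_loss(i_out, i_gt):
    # Assumed loss form (not confirmed by the patent text): mean absolute
    # difference between predicted and ground-truth pixel values
    return sum(abs(p - g) for p, g in zip(i_out, i_gt)) / len(i_out)
```

For example, leaky_relu(-5.0) returns -1.0 rather than 0, which is what keeps a gradient flowing through negative activations.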
Step three: training the FCN model of the convolutional network by using the image data set generated in step one to obtain a trained FCN model, which comprises the following steps:
1) Preprocessing the original image: first perform RGB pixel separation on the Raw-format image, unpacking each Raw image block into a four-channel feature map of RGBG components; the spatial resolution of each channel is halved, finally yielding four RGBG channels each 1/2 the size of the original image. The black level value is then subtracted from the feature maps and different amplification ratios are applied, so that their brightness matches that of the clear image corresponding to the long exposure;
2) Inputting the preprocessed image into the full convolution neural network FCN model; the feature map output by the model has 12 channels, and the size of each feature map is half that of the RGB components;
3) A sub-pixel convolution operation is performed on the image output by the full convolution neural network FCN model to restore the data to the normal sRGB image format; the processing procedure is shown in FIG. 7.
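The packing and restoration steps above can be sketched in NumPy. The RGGB Bayer layout below is an assumption (the patent only specifies a four-channel RGBG result), and depth_to_space is the sub-pixel rearrangement that turns a 12-channel half-resolution output into full-resolution 3-channel RGB:

```python
import numpy as np

def pack_raw(bayer: np.ndarray) -> np.ndarray:
    # Split a Bayer mosaic (H x W, RGGB layout assumed) into a 4-channel
    # RGBG feature map at half the spatial resolution
    h, w = bayer.shape
    return np.stack([bayer[0:h:2, 0:w:2],   # R
                     bayer[0:h:2, 1:w:2],   # G (same row as R)
                     bayer[1:h:2, 1:w:2],   # B
                     bayer[1:h:2, 0:w:2]],  # G (same row as B)
                    axis=0)

def depth_to_space(x: np.ndarray, r: int = 2) -> np.ndarray:
    # Sub-pixel (pixel-shuffle) step: (C*r*r, H, W) -> (C, H*r, W*r),
    # e.g. the 12-channel half-resolution output becomes 3-channel RGB
    c, h, w = x.shape
    out = x.reshape(c // (r * r), r, r, h, w)
    out = out.transpose(0, 3, 1, 4, 2)      # interleave the r x r sub-pixels
    return out.reshape(c // (r * r), h * r, w * r)
```

pack_raw is the space-to-depth half of the pipeline and depth_to_space its inverse-direction counterpart, so the network operates entirely at half resolution in between.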
Step four: and inputting the low-light image in the original RAW format into the trained FCN model to obtain an enhanced clear image.
The following table shows test results obtained by processing low-illumination images; the image-quality evaluation indices used in the test are Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity (SSIM). The test results in the table show that, compared with the performance of the network before simplification, the simplified Unet full convolution neural network of the invention reduces the running time by more than half and greatly reduces the training cost while keeping the difference in visual effect insignificant.
Comparison of experimental results of different algorithms
|               | White balance | Algorithm of the invention | Unet         |
| PSNR/SSIM     | 17.668/0.207  | 28.956/0.695               | 29.012/0.703 |
| Training time | --            | 16 h                       | 24 h         |
| Run time      | --            | 0.04 s                     | 0.1 s        |
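For reference, PSNR — one of the two metrics in the table — can be sketched as follows (a peak value of 255 for 8-bit images is assumed):

```python
import math

def psnr(pred, gt, peak: float = 255.0) -> float:
    # Peak Signal-to-Noise Ratio in decibels: 10 * log10(peak^2 / MSE);
    # higher values mean the prediction is closer to the reference
    mse = sum((p - g) ** 2 for p, g in zip(pred, gt)) / len(pred)
    return 10.0 * math.log10(peak ** 2 / mse)
```

On identical images the MSE is zero and PSNR is undefined (infinite), which is why the metric is only reported for imperfect reconstructions such as those in the table.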
Finally, the above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the preferred embodiments, those skilled in the art should understand that modifications or equivalent substitutions may be made to the technical solutions of the present invention without departing from their spirit and scope, and all such changes should be covered by the claims of the present invention.
Claims (4)
1. The low-illumination image processing method based on the simplified Unet full convolution neural network is characterized by comprising the following steps of:
the method comprises the following steps: collecting a short-exposure dark image and a corresponding long-exposure bright clear image in a low-illumination environment, and generating an image data set by the collected dark image and the corresponding clear image;
step two: constructing a full convolution neural network (FCN) model for end-to-end image enhancement, wherein the full convolution neural network (FCN) model comprises an input layer, a hidden layer and an output layer, the input layer is used for inputting a graph, the convolution layer of each computing node in the hidden layer is used for performing convolution calculation and deconvolution calculation on input data, all layers of the FCN model are connected together through an activation function, and network parameters are continuously improved through a training algorithm;
the input layer of the full convolution neural network FCN model in the step two is as follows: receiving 4-channel image data;
the hidden layer of the full convolution neural network FCN model in the step two comprises:
convolutional layer 1: convolution kernel size 3 × 3, 32 kernels, stride s = 1, padding set to valid;
pooling layer 1: max pooling with a 2 × 2 window, stride s = 2, padding set to same;
convolutional layer 2: convolution kernel size 3 × 3, 64 kernels, stride s = 1, padding set to valid;
pooling layer 2: max pooling with a 2 × 2 window, stride s = 2, padding set to same;
convolutional layer 3: convolution kernel size 3 × 3, 128 kernels, stride s = 1, padding set to valid;
pooling layer 3: max pooling with a 2 × 2 window, stride s = 2, padding set to same;
convolutional layer 4: convolution kernel size 3 × 3, 256 kernels, stride s = 1, padding set to valid;
pooling layer 4: max pooling with a 2 × 2 window, stride s = 2, padding set to same;
convolutional layer 5-1: convolution kernel size 3 × 3, 512 kernels, stride s = 1, padding set to valid;
convolutional layer 5-2: convolution kernel size 3 × 3, 512 kernels, stride s = 1, padding set to valid;
deconvolution layer 6: convolution kernel size 2 × 2, doubling the rows and columns of the feature map;
cascade structure: after a Crop operation on convolutional layer 4, it is cascaded with convolutional layer 5-2, fusing the high-resolution and low-resolution feature maps; the spliced result serves as the input of the next convolutional layer;
convolutional layer 6: convolution kernel size 3 × 3, 256 kernels, stride s = 1, padding set to valid;
deconvolution layer 7: convolution kernel size 2 × 2, doubling the rows and columns of the feature map;
cascade structure: after a Crop operation on convolutional layer 3, it is cascaded with convolutional layer 6, fusing the high-resolution and low-resolution feature maps; the spliced result serves as the input of the next convolutional layer;
convolutional layer 7: convolution kernel size 3 × 3, 128 kernels, stride s = 1, padding set to valid;
deconvolution layer 8: convolution kernel size 2 × 2, doubling the rows and columns of the feature map;
cascade structure: after a Crop operation on convolutional layer 2, it is cascaded with convolutional layer 7, fusing the high-resolution and low-resolution feature maps; the spliced result serves as the input of the next convolutional layer;
convolutional layer 8: convolution kernel size 3 × 3, 64 kernels, stride s = 1, padding set to valid;
deconvolution layer 9: convolution kernel size 2 × 2, doubling the rows and columns of the feature map;
cascade structure: after a Crop operation on convolutional layer 1, it is cascaded with convolutional layer 8, fusing the high-resolution and low-resolution feature maps; the spliced result serves as the input of the next convolutional layer;
convolutional layer 9: convolution kernel size 3 × 3, 32 kernels, stride s = 1, padding set to valid;
the output layer of the FCN model in step two is convolutional layer 10: convolution kernel size 1 × 1, 12 kernels, outputting 12 channels; each output feature map is an RGB component at 1/2 the original size, the 1 × 1 convolution kernels setting the number of output feature maps to 12;
step three: training the FCN model using the image data set generated in step one to obtain a trained FCN model;
step four: inputting a low-illumination image in the original RAW format into the trained FCN model to obtain an enhanced clear image.
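The encoder–decoder wiring recited in claim 1 can be sketched in PyTorch as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the layer names, the crop-and-concatenate helper, and the use of `ceil_mode=True` to mimic the same-padded 2 × 2 pooling are choices made here; only the kernel sizes, kernel counts (32→64→128→256→512, back down to 32, then a 12-channel 1 × 1 output) and the valid-padded 3 × 3 convolutions come from the claim text.

```python
import torch
import torch.nn as nn

def center_crop(src, like):
    # Crop operation of the cascade structure: center-crop `src` to the
    # spatial size of `like` so the two feature maps can be concatenated.
    _, _, h, w = like.shape
    dh = (src.shape[2] - h) // 2
    dw = (src.shape[3] - w) // 2
    return src[:, :, dh:dh + h, dw:dw + w]

def conv3(cin, cout):
    # 3x3 kernel, stride 1, valid padding, Leaky ReLU activation (claim 4)
    return nn.Sequential(nn.Conv2d(cin, cout, 3), nn.LeakyReLU(0.2))

class SimplifiedUnet(nn.Module):
    def __init__(self):
        super().__init__()
        self.c1, self.c2 = conv3(4, 32), conv3(32, 64)
        self.c3, self.c4 = conv3(64, 128), conv3(128, 256)
        self.c5 = nn.Sequential(conv3(256, 512), conv3(512, 512))  # layers 5-1, 5-2
        self.pool = nn.MaxPool2d(2, 2, ceil_mode=True)             # 2x2 max pooling, stride 2
        self.up6, self.c6 = nn.ConvTranspose2d(512, 256, 2, 2), conv3(512, 256)
        self.up7, self.c7 = nn.ConvTranspose2d(256, 128, 2, 2), conv3(256, 128)
        self.up8, self.c8 = nn.ConvTranspose2d(128, 64, 2, 2), conv3(128, 64)
        self.up9, self.c9 = nn.ConvTranspose2d(64, 32, 2, 2), conv3(64, 32)
        self.out = nn.Conv2d(32, 12, 1)                            # 1x1 kernels, 12 channels out

    def forward(self, x):
        f1 = self.c1(x)
        f2 = self.c2(self.pool(f1))
        f3 = self.c3(self.pool(f2))
        f4 = self.c4(self.pool(f3))
        f5 = self.c5(self.pool(f4))
        u = self.up6(f5)                                           # deconv doubles rows/cols
        d = self.c6(torch.cat([center_crop(f4, u), u], dim=1))     # crop + cascade
        u = self.up7(d)
        d = self.c7(torch.cat([center_crop(f3, u), u], dim=1))
        u = self.up8(d)
        d = self.c8(torch.cat([center_crop(f2, u), u], dim=1))
        u = self.up9(d)
        d = self.c9(torch.cat([center_crop(f1, u), u], dim=1))
        return self.out(d)

model = SimplifiedUnet()
y = model(torch.zeros(1, 4, 128, 128))  # 4-channel packed-RAW input
```

Because every 3 × 3 convolution is valid-padded, the output is spatially smaller than the input; the crop step absorbs the size mismatch at each skip connection.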
2. The low-illumination image processing method based on the simplified Unet fully convolutional neural network according to claim 1, characterized in that collecting the short-exposure dark images and the corresponding long-exposure bright clear images in step one comprises:
step 1: selecting a shooting scene and fixing the camera so that its shooting pose remains unchanged;
step 2: setting the camera exposure time to 0.1 s, 0.04 s and 0.033 s for separate short-exposure shots; setting the exposure time to 10 s for a long-exposure shot;
step 3: repeating steps 1 and 2 for different shooting scenes to acquire images, obtaining mutually matched dark images and bright clear images.
3. The low-illumination image processing method based on the simplified Unet fully convolutional neural network according to claim 1, characterized in that the training of the FCN model in step three using the image data set generated in step one comprises:
1) Preprocessing the original image: first performing RGBG pixel separation on the Raw-format image, expanding each Raw image block into a four-channel feature map of RGBG components with the spatial resolution of each channel halved, finally obtaining RGBG four-channel feature maps at 1/2 the size of the original image; then subtracting the black-level value from the feature maps and applying different gamma amplification ratios to obtain a brightness image matching the corresponding long-exposure clear image;
2) Inputting the preprocessed image into the FCN model; the feature map output by the model has 12 channels, each feature map being half the size of the RGB components;
3) Performing a sub-pixel convolution operation on the image output by the FCN model to restore the data to the normal sRGB image format.
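The pack-and-restore pipeline of claim 3 can be illustrated as follows. A hedged sketch: the RGGB Bayer layout, the black-level value 512, the 14-bit white level 16383 and the amplification ratio 100 are example assumptions introduced here; the claim only specifies packing into four half-resolution RGBG channels, black-level subtraction, amplification, and a final sub-pixel rearrangement of the 12-channel network output.

```python
import torch
import torch.nn.functional as F

def pack_raw(raw, black_level=512.0, white_level=16383.0, ratio=100.0):
    # Pack an (H, W) Bayer mosaic into a 4-channel RGBG tensor at half
    # resolution, then subtract the black level, normalize and amplify.
    # black_level / white_level / ratio are illustrative, not from the claim.
    r  = raw[0::2, 0::2]
    g1 = raw[0::2, 1::2]
    g2 = raw[1::2, 0::2]
    b  = raw[1::2, 1::2]
    packed = torch.stack([r, g1, g2, b], dim=0).float()
    packed = (packed - black_level) / (white_level - black_level)
    return (packed.clamp(min=0.0) * ratio).clamp(max=1.0)

packed = pack_raw(torch.full((256, 256), 1000.0))  # -> 4 channels of 128 x 128

# Sub-pixel convolution step: rearrange the FCN's 12-channel,
# half-resolution output into a 3-channel image at full resolution.
fcn_out = torch.rand(1, 12, 128, 128)              # stand-in for the network output
rgb = F.pixel_shuffle(fcn_out, 2)                  # 12 channels -> 3 channels, 2x size
```

`pixel_shuffle` is the standard depth-to-space operation: each group of 4 channels supplies the 2 × 2 sub-pixels of one output channel, which is why 12 half-resolution channels yield a full-resolution 3-channel sRGB image.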
4. The low-illumination image processing method based on the simplified Unet fully convolutional neural network according to claim 1, characterized in that the training algorithm comprises:
1) The activation functions of the FCN model are all Leaky ReLU functions, whose expression is:
f(x) = max(0.2x, x), where x is the input value;
2) Pooling the feature maps after each convolution operation, using the max pooling method;
3) Constructing the loss function from I_out and I_gt, where I_out denotes the predicted image and I_gt denotes the desired image;
4) Optimizing the loss function using the ADAM optimizer;
5) Setting a piecewise learning rate to enhance the learning effect: the initial learning rate is 1e-4 and training runs for 4000 iterations; when training reaches 2000 iterations, the learning rate is divided by 10.
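The training recipe of claim 4 maps directly onto standard PyTorch components. A minimal sketch under assumptions: the loss is taken to be L1 (mean absolute error) between I_out and I_gt, since the formula itself did not survive extraction, and the one-layer stand-in model and random tensors exist only to keep the loop self-contained.

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Conv2d(4, 12, 1), nn.LeakyReLU(0.2))  # stand-in for the FCN
loss_fn = nn.L1Loss()                                          # assumed |I_out - I_gt| loss
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)      # initial learning rate 1e-4
# Piecewise learning rate: divide by 10 when training reaches step 2000 of 4000.
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[2000], gamma=0.1)

for step in range(4000):
    x = torch.rand(1, 4, 16, 16)        # stand-in packed short-exposure input
    i_gt = torch.rand(1, 12, 16, 16)    # stand-in long-exposure reference
    optimizer.zero_grad()
    i_out = model(x)                    # predicted image I_out
    loss = loss_fn(i_out, i_gt)
    loss.backward()
    optimizer.step()
    scheduler.step()
```

After the loop the scheduler has fired exactly once, so the effective learning rate ends at 1e-5, matching the divide-by-10 schedule in the claim.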
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010455150.7A CN111612722B (en) | 2020-05-26 | 2020-05-26 | Low-illumination image processing method based on simplified Unet full-convolution neural network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111612722A CN111612722A (en) | 2020-09-01 |
CN111612722B true CN111612722B (en) | 2023-04-18 |
Family
ID=72196328
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010455150.7A Active CN111612722B (en) | 2020-05-26 | 2020-05-26 | Low-illumination image processing method based on simplified Unet full-convolution neural network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111612722B (en) |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112200226B (en) * | 2020-09-27 | 2021-11-05 | 北京达佳互联信息技术有限公司 | Image processing method based on reinforcement learning, image processing method and related device |
CN112581401B (en) * | 2020-12-25 | 2023-04-28 | 英特灵达信息技术(深圳)有限公司 | RAW picture acquisition method and device and electronic equipment |
CN113379861B (en) * | 2021-05-24 | 2023-05-09 | 南京理工大学 | Color low-light-level image reconstruction method based on color recovery block |
US11468543B1 (en) | 2021-08-27 | 2022-10-11 | Hong Kong Applied Science and Technology Research Institute Company Limited | Neural-network for raw low-light image enhancement |
CN113744167B (en) * | 2021-09-02 | 2024-04-30 | 厦门美图之家科技有限公司 | Image data conversion method and device |
CN114724022B (en) * | 2022-03-04 | 2024-05-10 | 大连海洋大学 | Method, system and medium for detecting farmed fish shoal by fusing SKNet and YOLOv5 |
Citations (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0028844A2 (en) * | 1979-11-13 | 1981-05-20 | Phillips Petroleum Company | Polypropylene filament yarn and process for making same |
CN106600571A (en) * | 2016-11-07 | 2017-04-26 | 中国科学院自动化研究所 | Brain tumor automatic segmentation method through fusion of full convolutional neural network and conditional random field |
CN106981067A (en) * | 2017-04-05 | 2017-07-25 | 深圳市唯特视科技有限公司 | A kind of Texture Segmentation Methods based on full convolutional network |
CN107169974A (en) * | 2017-05-26 | 2017-09-15 | 中国科学技术大学 | It is a kind of based on the image partition method for supervising full convolutional neural networks more |
CN107273864A (en) * | 2017-06-22 | 2017-10-20 | 星际(重庆)智能装备技术研究院有限公司 | A kind of method for detecting human face based on deep learning |
CA2948499A1 (en) * | 2016-11-16 | 2018-05-16 | The Governing Council Of The University Of Toronto | System and method for classifying and segmenting microscopy images with deep multiple instance learning |
CN108492297A (en) * | 2017-12-25 | 2018-09-04 | 重庆理工大学 | The MRI brain tumors positioning for cascading convolutional network based on depth and dividing method in tumor |
CN109345476A (en) * | 2018-09-19 | 2019-02-15 | 南昌工程学院 | High spectrum image super resolution ratio reconstruction method and device based on depth residual error network |
CN109598727A (en) * | 2018-11-28 | 2019-04-09 | 北京工业大学 | A kind of CT image pulmonary parenchyma three-dimensional semantic segmentation method based on deep neural network |
CN109871798A (en) * | 2019-02-01 | 2019-06-11 | 浙江大学 | A kind of remote sensing image building extracting method based on convolutional neural networks |
CN110062173A (en) * | 2019-03-15 | 2019-07-26 | 北京旷视科技有限公司 | Image processor and image processing method, equipment, storage medium and intelligent terminal |
AU2019101133A4 (en) * | 2019-09-30 | 2019-10-31 | Bo, Yaxin MISS | Fast vehicle detection using augmented dataset based on RetinaNet |
Non-Patent Citations (2)
Title |
---|
李超波 et al., "Application of deep learning in image recognition", Journal of Nantong University (Natural Science Edition), 2018, vol. 17, no. 01, pp. 1-9. *
熊炜 et al., "Document image binarization algorithm fusing background estimation and U-Net", Application Research of Computers, 2019, vol. 37, no. 03, pp. 896-900. *
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111612722B (en) | Low-illumination image processing method based on simplified Unet full-convolution neural network | |
WO2021179820A1 (en) | Image processing method and apparatus, storage medium and electronic device | |
WO2021164234A1 (en) | Image processing method and image processing device | |
CN111292264A (en) | Image high dynamic range reconstruction method based on deep learning | |
CN113450290B (en) | Low-illumination image enhancement method and system based on image inpainting technology | |
CN111047543A (en) | Image enhancement method, device and storage medium | |
CN112465727A (en) | Low-illumination image enhancement method without normal illumination reference based on HSV color space and Retinex theory | |
CN111696033B (en) | Real image super-resolution model and method based on angular point guided cascade hourglass network structure learning | |
CN111105376B (en) | Single-exposure high-dynamic-range image generation method based on double-branch neural network | |
CN112348747A (en) | Image enhancement method, device and storage medium | |
CN111915513A (en) | Image denoising method based on improved adaptive neural network | |
CN109785252A (en) | Based on multiple dimensioned residual error dense network nighttime image enhancing method | |
WO2023151511A1 (en) | Model training method and apparatus, image moire removal method and apparatus, and electronic device | |
CN111696034B (en) | Image processing method and device and electronic equipment | |
CN115115516A (en) | Real-world video super-resolution algorithm based on Raw domain | |
Xu et al. | Deep video inverse tone mapping | |
CN113379861B (en) | Color low-light-level image reconstruction method based on color recovery block | |
CN114299180A (en) | Image reconstruction method, device, equipment and storage medium | |
CN117974459A (en) | Low-illumination image enhancement method integrating physical model and priori | |
CN117611467A (en) | Low-light image enhancement method capable of balancing details and brightness of different areas simultaneously | |
CN117422653A (en) | Low-light image enhancement method based on weight sharing and iterative data optimization | |
CN114897718B (en) | Low-light image enhancement method capable of balancing context information and space detail simultaneously | |
Omrani et al. | High dynamic range image reconstruction using multi-exposure Wavelet HDRCNN | |
CN111861877A (en) | Method and apparatus for video hyper-resolution | |
CN114596219B (en) | Image motion blur removing method based on condition generation countermeasure network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||