CN115082315B - Demosaicing method applicable to low-illumination small-pixel CFA sampling and edge computing equipment - Google Patents
- Publication number: CN115082315B (application CN202210763316.0A)
- Authority: CN (China)
- Prior art keywords: image, training, neural network, network model, pixel
- Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06T3/4015: Image demosaicing, e.g. colour filter arrays [CFA] or Bayer patterns
- G06N3/082: Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
- G06T5/70: Denoising; Smoothing
- G06T2207/10024: Color image
- G06T2207/20081: Training; Learning
- G06T2207/20084: Artificial neural networks [ANN]
- Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention provides a demosaicing method applicable to low-illumination small-pixel CFA sampling and edge computing equipment, comprising the following steps. Step 1: after Gaussian noise is added to the original data set and the brightness is reduced, the RGB images are processed into mosaic images through a CFA with 75% transparent elements, and the parameters for training the target neural network are set. Step 2: a neural network model with the UNet++ network as its main framework is built. Step 3: according to the neural network model, the corresponding network models are trained in two stages, each stage minimizing its own loss function. Step 4: the image to be processed is processed with the trained neural network model to obtain a demosaiced image. On the basis of preserving the restoration quality for small-pixel color-filter-array mosaic images, the invention optimizes the network topology, reduces the number of parameters as far as possible, shortens the training time, and can be applied to network deployment on edge computing equipment.
Description
Technical Field
The invention belongs to the field of digital image processing, and in particular relates to the hardware/software co-design of a visual-processing neural network.
Background
In computer vision, many methods complete demosaicing with deep learning, but they focus solely on the best result at the algorithm level. Their parameter counts are therefore huge; training takes a long time when the energy and cost budget is tight and relies on high-cost, low-energy-efficiency graphics processing units or a remote computing center; and the algorithms are difficult to deploy on portable or mobile real-time systems. On the other hand, most of these algorithms restore mosaic images sampled from conventional-size pixels with the classical Bayer-pattern color filter array, and are not suitable for demosaicing the mosaic images obtained from a low-illumination small-pixel color filter array.
Existing solutions therefore focus mainly on the demosaicing quality at the algorithm level, without considering the computation time or the suitability of the algorithm for deployment on edge computing devices; in addition, they target conventional-size-pixel color filter arrays in the classical Bayer pattern and do not apply to the low-light small-pixel case.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides a demosaicing method applicable to low-illumination small-pixel CFA (color filter array) sampling and edge computing equipment. On the basis of preserving the restoration quality of the small-pixel color-filter-array mosaic image, the method optimizes the network topology, reduces the number of parameters as far as possible, shortens the training time, and is suitable for network deployment on edge computing equipment.
The technical scheme of the invention is as follows:
A demosaicing method applicable to low-illumination small-pixel CFA sampling and edge computing equipment, comprising the following steps:
Step 1: Gaussian noise is first added to the original data set and the brightness is reduced; the RGB images are then processed into mosaic images through a CFA with 75% transparent elements. A training set is formed after data preprocessing (e.g., cropping each picture to 256×256 and cutting it into four 128×128 patches), and the parameters for training the target neural network are set.
Step 2: and building a neural network model taking the UNet++ network as a main framework.
Step 3: according to the neural network model, training a corresponding network model in two stages with the aim of minimizing respective loss functions.
Step 4: and demosaicing the image to be processed, which is already added with Gaussian noise, is subjected to brightness reduction and is sampled according to a color filter array mode with 75% transparent elements, by using the trained neural network model, so as to obtain a demosaiced image.
Further, in step 1, the method of adding Gaussian noise to the image and reducing its brightness is as follows:
The variance v of a single black (dark-frame) photograph is calculated and used as a reference; Gaussian noise with mean 0 and variance v is applied to the three channels of the data set to be processed according to the following formula (1):
Y = C(A·X + N(0, B·v)) (1)
where Y is the processed low-light noisy image, C is the image pixel cut-off (clipping) function, A is the illumination-reduction multiple, X is the original image to be processed, N(a, b) is a Gaussian noise generation function with mean a and variance b, and B is the fine-tuning multiple of the Gaussian noise variance.
As described above, using Gaussian smoothing instead of a convolution structure ensures that semantics at different levels can be connected across layers.
Further, in step 1, forming a training set through data preprocessing means that the training data set is sample-preprocessed according to the color-filter-array pattern to obtain the data set used for training the neural network, as follows:
The image to which Gaussian noise has been added and whose brightness has been reduced is sampled according to the data-sampling pattern of the color filter array with 75% transparent elements. The image containing only B-channel sampling pixels, the image containing only R-channel sampling pixels, the images G1 and G2 each containing only G-channel sampling pixels, and the image containing only transparent-channel sampling pixels are each divided into several mosaic image blocks; the image containing only transparent-channel samples is 3 times the size of the others, and after being downsampled twice it yields transparent-channel image blocks of the same size as the other blocks.
Further, the neural network model in step 2 includes a feature extraction module and an image reconstruction module.
The feature extraction module comprises three Gaussian smoothing operations, feature extraction, six cross-layer connections and one upsampling structure. The three Gaussian smoothing operations are related as follows: the input image is first Gaussian-smoothed once, and the once-smoothed picture is sent horizontally into the feature extraction module; at the same time, the once-smoothed picture is Gaussian-smoothed again on the downward path to extract higher-level semantics, giving the twice-smoothed picture, which is likewise sent into a feature extraction module; the twice-smoothed picture is then Gaussian-smoothed once more on the downward path to extract still higher-level semantics, giving the thrice-smoothed picture, which is again sent into a feature extraction module.
In the feature extraction module, pictures that have undergone 0, 1, 2 and 3 Gaussian smoothing operations are first sent to the feature extraction module of the corresponding layer, and the convolution kernel size differs between feature extraction modules according to the number of smoothing operations: 3×3 after 0 smoothings, 3×3 after 1, 5×5 after 2, and 7×7 after 3. The role of these convolution structures is to further expand the receptive field, so that the network can obtain a larger range of image information in the upper layers of the triangular structure. Following the convolution structure of each layer are three dense connection units, each consisting of three depth-separable convolutions and one PReLU. Dense connections are used between the depth-separable convolutions in each dense connection unit: the input and output of the first depth-separable convolution are fed together into the input of the second, and likewise the input and output of the second are fed together into the input of the third. Meanwhile, in the feature reconstruction module, the numbers of input and output channels of each dense connection unit are the same.
The image reconstruction module consists of a 3×3 convolution, a 1×1 convolution structure and a PReLU, and reconstructs the feature maps after feature extraction into a mosaic-free and noise-free image.
Further, dense connection units with a residual connection structure are embedded in the feature extraction module. The residual structure consists of a convolution structure and three dense connection units; the convolution structure preliminarily extracts the corresponding level features of each layer while further expanding the receptive field, and each dense connection unit comprises three depth-separable convolutions and six PReLU activation layers.
Further, step 3 includes:
3.1 Data selection: ImageNet is selected as the data set for training the network. Before training, each picture is first cropped to 256×256 resolution around its center, and then cropped into 128×128 patches for network training.
3.2 Optimizer selection: Adam optimization is used; the initial learning rate is set to 0.001 and is halved every 10 epochs; the mini-batch size is set to 16; the other hyperparameters use their default settings.
3.3 Loss-function design: during the first 10 epochs of training (e ≤ 10), the SSIM index, which measures local similarity, is used as the basis of the loss-function design; after 10 epochs (e > 10), the MSE, which measures global characteristics, is used instead. The total number of epochs is set to 150.
As can be seen from the above technical solution, the present invention proposes a neural network demosaicing method suitable for edge computing devices and matched to the color filter array dedicated to low-light small-pixel cameras. Its advantages are:
1. The image obtained by sampling with the color filter array with 75% transparent elements under low-light conditions is denoised and demosaicked.
2. On the basis of the original UNet++ network structure, the original convolution structure in the downsampling stage is replaced by a Gaussian smoothing structure, so that semantics at different levels can be connected across layers and feature maps of different semantics can be used simultaneously for image recovery.
3. Different loss functions are used in different training periods to train the neural network; prior knowledge is fully utilized, the local quality of image recovery is guaranteed, and fast training convergence is ensured.
4. On the basis of preserving the image recovery quality, the network parameter count is reduced by modifying the network topology and replacing the traditional convolutions of the upsampling stage with depth-separable convolutions; after network training is finished, an appropriate network scale can be selected by pruning according to the actual situation, making the network convenient to port to edge computing equipment.
5. The method is high in efficiency and short in training time.
The neural network algorithm designed by the invention denoises and demosaics the RAW image obtained by sampling with the color filter array dedicated to the low-illumination small-pixel condition; on the basis of preserving good image recovery quality, the parameter count of the neural network is reduced by modifying the topology and other methods so that it is suitable for edge computing equipment.
Drawings
Fig. 1 shows the effect of adding Gaussian noise and reducing luminance;
Fig. 2 shows the color filter array;
Fig. 3 shows the data set preprocessing;
Fig. 4 shows the network topology;
Fig. 5 shows the feature extraction module.
Detailed Description
The technical implementation of the invention is further described in detail below with reference to the accompanying drawings:
The scheme provided by the invention is a demosaicing method matched to a color filter array with 75% transparent elements. The method mainly solves the problem of feature maps of different sizes, and achieves demosaicing while exploiting the higher light sensitivity of this color filter array and its suitability for small cameras. The method mainly comprises the following steps:
Step 1: firstly, after Gaussian noise is added to an original data set and brightness is reduced, an RGB image is processed into a mosaic image through a CFA with 75% transparent elements, data preprocessing is carried out to form a training set, and parameters of a training target neural network are set.
Step 2: and building a neural network model taking the UNet++ network as a main framework.
Step 3: according to the neural network model, training a corresponding network model in two stages with the aim of minimizing respective loss functions.
Step 4: and processing the image to be processed which is sampled in a color filter array mode with 75% transparent elements by utilizing the trained neural network model, wherein Gaussian noise is added, the brightness is reduced, and a demosaiced image is obtained.
The following examples detail the specific implementation of each step:
Step 1: Gaussian noise is first added to the original image and the brightness is reduced; the RGB image is processed into a mosaic image through a CFA with 75% transparent elements; the data are preprocessed to form a training set; and the parameters for training the target neural network are set. Specifically:
Step 1.1: Add Gaussian noise of suitable intensity to the original image and reduce the brightness:
The variance v of a single black (dark-frame) photograph is calculated and used as a reference; Gaussian noise with mean 0 and variance v is applied to the three channels of the data set to be processed according to the following formula (1).
Y = C(A·X + N(0, B·v)) (1)
where:
- Y: the processed low-light noisy image;
- C: the image pixel truncation (clipping) function;
- A: the illumination-reduction multiple;
- X: the original picture to be processed;
- N(a, b): a Gaussian noise generation function with mean a and variance b;
- B: the fine-tuning multiple of the Gaussian noise variance.
This process simulates the noise, such as photon and readout noise, generated by the electronics as the image passes through the sensor. By comparing the noise-adding and brightness-reducing results after fine-tuning the intensity, a low-light picture with Gaussian noise of mean 0 and variance B·v is obtained, as shown in Fig. 1.
Based on the above results, the parameters A = 0.2 and B = 2 and 3 are selected to process the images.
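As a minimal sketch of formula (1) in NumPy (the function name is illustrative, and the default value of v is a placeholder for the variance actually measured from a dark frame):

```python
import numpy as np

def add_low_light_noise(x, a=0.2, b=2.0, v=0.001, seed=None):
    """Formula (1): Y = C(A*X + N(0, B*v)).

    x : RGB image as a float array with values in [0, 1]
    a : illumination-reduction multiple A
    b : Gaussian-noise variance fine-tuning multiple B
    v : reference variance measured from a single black (dark-frame) photo
    The cut-off function C is realized as clipping to [0, 1].
    """
    rng = np.random.default_rng(seed)
    noise = rng.normal(loc=0.0, scale=np.sqrt(b * v), size=x.shape)
    return np.clip(a * x + noise, 0.0, 1.0)

# Darken a mid-gray test image and add noise.
img = np.full((4, 4, 3), 0.5)
noisy = add_low_light_noise(img, a=0.2, b=2.0, seed=0)
```

In practice v would be estimated from a real dark-frame capture of the target sensor before processing the data set.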
Step 1.2, processing the RGB image into a mosaic image through a CFA with 75% transparent elements, and forming a training set after data preprocessing:
Taking the minimal repeating 4×4 unit as an example, there are 16 elements in the unit. The upper-left corner is sampled according to the Bayer pattern: the main diagonal samples the green value, the upper-right corner samples the red value, and the lower-left corner samples the blue value. The rest are transparent samples, i.e., these positions obtain only the luminance value of the original picture (computed here as Gray = R×0.299 + G×0.587 + B×0.114). For the specific sampling pattern, see: Gang Luo, "A novel color filter array with 75% transparent elements," Proc. SPIE 6502, Digital Photography III, 65020T (20 February 2007); doi:10.1117/12.702950.
Analyzing the data sampling pattern of the color filter array according to the present invention (as shown in fig. 2), the original image is sampled in the manner shown in fig. 2.
First, the R and G channels of pixels whose abscissa mod 4 is 0 and whose ordinate mod 4 is 1 in the original image are set to 0, giving the image containing only B-channel sampling pixels. The B and G channels of pixels whose abscissa mod 4 is 1 and whose ordinate mod 4 is 0 are set to 0, giving the image containing only R-channel sampling pixels. The R and B channels of pixels whose abscissa mod 4 is 0 and whose ordinate mod 4 is 0 are set to 0, giving the image G1 containing only G-channel sampling pixels, and the R and B channels of pixels whose abscissa mod 4 is 1 and whose ordinate mod 4 is 1 are set to 0, giving the image G2 containing only G-channel sampling pixels. Finally, the luminance of each pixel is calculated by Gray = R×0.299 + G×0.587 + B×0.114 and then sampled, giving the image containing only transparent-channel sampling pixels.
Then, the image containing only B-channel sampling pixels, the image containing only R-channel sampling pixels, the images G1 and G2 containing only G-channel sampling pixels, and the image containing only transparent-channel sampling pixels are each divided into several mosaic image blocks. The sampled image containing only the transparent channel is 3 times the size of the other images; after being downsampled twice, transparent-channel image blocks of the same size as the other blocks are obtained. This process is shown in Fig. 3.
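The sampling masks for the 4×4 repeating unit can be sketched as follows (NumPy; the exact placement of the Bayer 2×2 block inside the tile is an assumption based on the description above, and `cfa_masks` and `luminance` are hypothetical helper names):

```python
import numpy as np

def cfa_masks(h, w):
    """Boolean sampling masks for the 75%-transparent CFA.

    In each 4x4 tile, the top-left 2x2 block follows the Bayer pattern
    (main diagonal = green, top-right = red, bottom-left = blue); the
    remaining 12 of 16 positions are transparent (luminance) samples.
    """
    r = np.arange(h)[:, None] % 4          # row index within the tile
    c = np.arange(w)[None, :] % 4          # column index within the tile
    g1 = (r == 0) & (c == 0)
    g2 = (r == 1) & (c == 1)
    red = (r == 0) & (c == 1)
    blue = (r == 1) & (c == 0)
    transparent = ~(g1 | g2 | red | blue)
    return g1, g2, red, blue, transparent

def luminance(rgb):
    # Gray = R*0.299 + G*0.587 + B*0.114, as in the text.
    return rgb @ np.array([0.299, 0.587, 0.114])

g1, g2, red, blue, trans = cfa_masks(8, 8)
print(trans.mean())  # 0.75, i.e. 75% transparent elements
```

Multiplying an image by each mask (after zeroing the non-sampled channels) yields the five single-channel images described above.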
In this way, the data set for training is sampled and preprocessed in a color filter array pattern with 75% transparent elements to obtain the data set for neural network training in the present design.
Step 1.3: Set the parameters of the target neural network for training.
Specifically, the optimizer for training is set to Adam, the initial learning rate is set to 0.001, and the learning rate drops by half every 10 epochs during training. The mini-batch size is set to 16. The remaining hyperparameters use their default settings.
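The step schedule just described (initial rate 0.001, halved every 10 epochs) can be sketched as:

```python
def learning_rate(epoch, base_lr=1e-3, drop_every=10):
    # Halve the base rate once per 10 completed epochs:
    # epochs 0-9 use 0.001, 10-19 use 0.0005, 20-29 use 0.00025, ...
    return base_lr * 0.5 ** (epoch // drop_every)

for e in (0, 9, 10, 20):
    print(e, learning_rate(e))
```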
Step 2, building a neural network model taking a UNet++ network as a main framework
The main framework of the model is the UNet++ structure; the network is characterized as follows:
The overall model is shown in Fig. 4; the neural network model comprises a feature extraction part and an image reconstruction module.
2.1 Feature extraction part:
The feature extraction part comprises three Gaussian smoothing operations, feature extraction, six cross-layer connections and one upsampling structure. The processing procedure is as follows:
The three Gaussian smoothing operations are related as follows: the input image is first Gaussian-smoothed once, and the once-smoothed picture is sent horizontally into the feature extraction module; at the same time, the once-smoothed picture is Gaussian-smoothed again on the downward path to extract higher-level semantics, giving the twice-smoothed picture, which is likewise sent into a feature extraction module; the twice-smoothed picture is then Gaussian-smoothed once more on the downward path to extract still higher-level semantics, giving the thrice-smoothed picture, which is again sent into a feature extraction module.
In the feature extraction module, pictures that have undergone 0, 1, 2 and 3 Gaussian smoothing operations are first sent to the feature extraction module of the corresponding layer, and the convolution kernel size differs between feature extraction modules according to the number of smoothing operations: 3×3 after 0 smoothings, 3×3 after 1, 5×5 after 2, and 7×7 after 3. The role of these convolution structures is to further expand the receptive field, so that the network can obtain a larger range of image information in the upper layers of the triangular structure. Following the convolution structure of each layer are three dense connection units, each consisting of three depth-separable convolutions and one PReLU. Dense connections are used between the depth-separable convolutions in each dense connection unit: the input and output of the first depth-separable convolution are fed together into the input of the second, and likewise the input and output of the second are fed together into the input of the third. Meanwhile, in the feature reconstruction module, the numbers of input and output channels of each dense connection unit are the same.
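The parameter saving from replacing a standard convolution with a depth-separable convolution can be checked by simple counting (the channel width 64 below is an illustrative value; the actual widths are those of Table 1):

```python
def conv_params(c_in, c_out, k, bias=True):
    """Parameter count of a standard k x k convolution."""
    return c_in * c_out * k * k + (c_out if bias else 0)

def depthwise_separable_params(c_in, c_out, k, bias=True):
    """Depthwise k x k convolution (one filter per input channel)
    followed by a pointwise 1 x 1 convolution."""
    depthwise = c_in * k * k + (c_in if bias else 0)
    pointwise = c_in * c_out + (c_out if bias else 0)
    return depthwise + pointwise

std = conv_params(64, 64, 3, bias=False)                  # 36864
sep = depthwise_separable_params(64, 64, 3, bias=False)   # 576 + 4096 = 4672
print(std, sep)  # roughly an 8x reduction at this width
```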
After the input image undergoes the three Gaussian smoothing operations, the network obtains three different levels of image blur. The advantage is that feature maps of different semantic levels are obtained without changing the size of the intermediate feature maps, which facilitates the subsequent cross-layer connections. After Gaussian smoothing, the image is sent to the feature extraction module of each layer, as shown in Fig. 5.
As shown in Fig. 5, dense connection units with a residual connection structure are embedded in the feature extraction module of the neural network model. The residual structure consists of a convolution structure and three dense connection units; the convolution structure preliminarily extracts the corresponding level features of each layer while further expanding the receptive field. In this neural network model, the convolution kernel sizes of layers 0, 1, 2 and 3 are 3×3, 3×3, 5×5 and 7×7 respectively. Each dense connection unit comprises three depth-separable convolutions and six PReLU activation layers.
The numbers of input and output channels of the feature extraction modules of the network model are kept consistent. The channel-number parameters of the intermediate layers are shown in Table 1.
TABLE 1
2.2 Image reconstruction portion:
After the input image passes through the feature extraction part described above, the feature maps are reconstructed into a mosaic-free and noise-free image along the purple part pointing up and to the right in Fig. 4; this up-and-right process is the image reconstruction part. In this part, the features of each layer are concatenated with the lower-level features further forward using a 1×1 convolution (shown as small white circles in the figure); each image reconstruction operation consists of a 3×3 convolution, a 1×1 convolution structure and a PReLU; the dashed parts in the figure represent cross-layer connections between different semantic levels, implemented with 1×1 convolutions; the orange arrow at the top of the structure represents upsampling implemented with a 3×3 transposed convolution structure. The final output images are produced at the top layer of the structure, i.e., L=1, L=2, L=3, etc. in the figure. In the image reconstruction part, the image size is unchanged before the transposed convolution operation; after the transposed convolution, the image length and width each become 2 times the original. The input image sizes, convolution kernel sizes and output image sizes of the intermediate process are shown in Table 2.
TABLE 2
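The 2× upsampling by a 3×3 transposed convolution follows the usual output-size formula; the stride, padding and output-padding values below are assumptions chosen so that the length and width double, as stated in the text (the text specifies only the 3×3 kernel and the 2× scaling):

```python
def transposed_conv_out(size, k=3, stride=2, pad=1, out_pad=1):
    # Standard transposed-convolution output size:
    # out = (size - 1) * stride - 2 * pad + k + out_pad
    # With k=3, stride=2, pad=1, out_pad=1 the spatial size doubles.
    return (size - 1) * stride - 2 * pad + k + out_pad

print(transposed_conv_out(64))   # 128
print(transposed_conv_out(128))  # 256
```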
Step 3: According to the neural network model, the corresponding network models are trained in two stages, each stage minimizing its own loss function.
3.1 Data selection: ImageNet is selected as the data set for training the network. Before training, each picture is first cropped to 256×256 resolution around its center, and then cropped into 128×128 patches for network training.
3.2 Optimizer selection: the Adam optimizer is adopted, with the initial learning rate set to 0.001 and halved every 10 epochs; the mini-batch size is 16, and the other hyperparameters use their default settings.
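The step schedule in 3.2 can be sketched as a small helper; counting epochs from zero is an assumption, as the text does not state the indexing:

```python
def learning_rate(epoch, base_lr=0.001, drop_every=10):
    """Step schedule from the text: start at 0.001 and halve every 10 epochs
    (epochs assumed 0-indexed)."""
    return base_lr * 0.5 ** (epoch // drop_every)
```

So epochs 0–9 use 0.001, epochs 10–19 use 0.0005, and so on.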
3.3 Loss function design: within the first 10 training epochs (e ≤ 10), the SSIM index, which measures local similarity, is adopted as the basis of the loss function design; after 10 epochs (e > 10), the MSE, which measures global characteristics, is adopted instead. The total number of epochs is set to 150.
Wherein the SSIM loss function L_SSIM is as follows:
L_SSIM = 1 − (1/(M·N)) · Σ_{i=1..M} Σ_{j=1..N} SSIM(x_ij, y_ij)
The MSE loss function L_MSE is as follows:
L_MSE = (1/(M·N)) · Σ_{i=1..M} Σ_{j=1..N} (x_ij − y_ij)²
where M and N are the numbers of rows and columns of the matrices abstracted from the two images.
The overall loss function is as follows:
L = L^1 + L^2 + L^3
where L^l (l = 1, 2, 3) denotes the loss of the deeply supervised output at layer l, taken as L^l_SSIM in the first stage (e ≤ 10) and as L^l_MSE in the second stage (e > 10).
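A minimal numpy sketch of the two-stage loss switch, assuming a single-window (global-statistics) SSIM rather than the 11×11 sliding-window version, and ignoring the deep-supervision sum over layers:

```python
import numpy as np

def ssim_global(x, y, L=1.0, k1=0.01, k2=0.03):
    """Single-window SSIM computed from global image statistics — a
    simplification of the 11x11 sliding-window SSIM used in the method."""
    c1, c2 = (k1 * L) ** 2, (k2 * L) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))

def two_stage_loss(pred, target, epoch):
    """SSIM-based loss for the first 10 epochs (e <= 10), MSE afterwards."""
    if epoch <= 10:
        return 1.0 - ssim_global(pred, target)
    return float(np.mean((pred - target) ** 2))
```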
step 4: and processing the image to be processed which is sampled in a color filter array mode with 75% transparent elements by utilizing the trained neural network model, wherein Gaussian noise is added, the brightness is reduced, and a demosaiced image can be obtained.
The above embodiment shows that the method serves as a demosaicing algorithm matched to a color filter array with 75% transparent elements and is suitable for small-pixel color filter arrays under low-illumination conditions. It mainly solves the problem of feature maps of different sizes, and achieves the demosaicing effect while exploiting the higher light sensitivity of this color filter array and its suitability for small cameras.
The method embodies the idea of software-hardware co-design: by modifying the topological structure while maintaining good image recovery quality and an acceptable demosaicing effect, it reduces the number of parameters of the neural network from the perspective of the topology, making the network suitable for edge computing devices.
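The parameter-reduction argument can be made concrete: the depthwise separable convolutions used in the dense connection units need far fewer weights than standard convolutions of the same shape. The 64-channel figures below are illustrative, not taken from Table 1:

```python
def standard_conv_params(cin, cout, k):
    """Weight count of a standard k x k convolution (biases omitted)."""
    return cin * cout * k * k

def depthwise_separable_params(cin, cout, k):
    """Depthwise k x k convolution followed by a 1 x 1 pointwise convolution."""
    return cin * k * k + cin * cout

# Example: a 3x3 layer with 64 input and 64 output channels.
assert standard_conv_params(64, 64, 3) == 36864
assert depthwise_separable_params(64, 64, 3) == 4672  # roughly 8x fewer weights
```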
Claims (5)
1. A demosaicing method applicable to low-light small-pixel CFA sampling and edge computing equipment, comprising the steps of:
Step 1: firstly, Gaussian noise is added to the original data set and its brightness is reduced; the RGB images are then processed into mosaic images through a color filter array (CFA) pattern with 75% transparent elements, data preprocessing is carried out to form a training set, and the parameters of the target neural network to be trained are set;
Step 2: building a neural network model taking a UNet++ network as a main framework, wherein the neural network model comprises a feature extraction part and an image reconstruction module:
The feature extraction part comprises three Gaussian smoothing and feature extraction modules, a cross-layer connection structure and one upsampling structure; each feature extraction module, connected through a residual connection structure, embeds dense connection units, the residual connection structure consisting of a convolution structure and three dense connection units, wherein the convolution structure preliminarily extracts the features of the corresponding level of each layer and further enlarges the receptive field, and each dense connection unit comprises three depthwise separable convolutions and six PReLU activation layers;
The image reconstruction module consists of a 3×3 convolution, a 1×1 convolution structure and a PReLU, and is used for reconstructing the feature map obtained from feature extraction into a mosaic-free, noise-free image;
Step 3: according to the neural network model, training the corresponding network models in two stages with the aim of minimizing the respective loss function of each stage, including:
3.1 data selection: ImageNet is selected as the data set for training the network; before training, each picture is first center-cropped to 256×256 resolution and then cropped to 128×128 resolution for use in network training;
3.2 optimizer selection: Adam optimization is adopted, the initial learning rate is set to 0.001 and is halved every 10 epochs, the mini-batch size is set to 16, and the other hyperparameters adopt default settings;
3.3 designing the loss function: within the first 10 training epochs (e ≤ 10), the SSIM index measuring local similarity is adopted as the basis of the loss function design; after 10 epochs (e > 10), the MSE measuring global characteristics is adopted as the basis of the loss function design; the total number of experimental epochs is set to 150;
Step 4: the image to be processed, which has had Gaussian noise added and brightness reduced and has been sampled in the color filter array pattern with 75% transparent elements, is processed with the trained neural network model to obtain the demosaiced image.
2. The demosaicing method applicable to low-light small-pixel CFA sampling and edge computing equipment according to claim 1, wherein in step 1, the method of adding Gaussian noise to the original data set and reducing brightness is as follows:
the variance v of a single black (dark-frame) image is calculated and taken as a reference, and Gaussian noise obeying a mean of 0 and distributed with variance v is applied to the three channels of the original data set according to the following formula (1):
Y = C(A·X + N(0, B·v)) (1)
wherein Y is the processed low-light noisy image, C is the image pixel cut-off function, A is the illumination reduction multiple, X is the original image to be processed, N(a, b) is a Gaussian (normal-distribution) noise generation function with mean a and variance b, and B is the Gaussian noise variance fine-adjustment multiple.
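A minimal numpy sketch of formula (1); the concrete values of the darkening multiple A and the noise fine-adjustment multiple B below are illustrative, as the claim does not fix them:

```python
import numpy as np

def degrade(x, v, a=0.25, b=1.0, rng=None):
    """Formula (1): Y = C(A*X + N(0, B*v)).
    Darken by A, add zero-mean Gaussian noise of variance B*v, then apply the
    pixel cut-off function C (here: clipping to [0, 1])."""
    rng = np.random.default_rng() if rng is None else rng
    noise = rng.normal(0.0, np.sqrt(b * v), size=x.shape)  # std = sqrt(variance)
    return np.clip(a * x + noise, 0.0, 1.0)
```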
3. The method for demosaicing as recited in claim 1, wherein in step 1, the processing of the RGB image into the mosaic image by the CFA having 75% transparent elements means:
The original image is sampled in the pattern of a color filter array with 75% transparent elements and split into mosaic image blocks: an image containing only B-channel sampled pixels, an image containing only R-channel sampled pixels, an image containing only G-channel sampled pixels (G2), and an image containing only transparent-channel sampled pixels; the image containing only transparent-channel samples is 3 times the size of the other images, and after being downsampled twice it yields transparent-channel image blocks of the same size as the other image blocks.
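The sampling pattern can be sketched with a hypothetical 4×4 super-cell; the claim fixes only the 75%-transparent ratio, so the exact layout below is an assumption for illustration:

```python
import numpy as np

# Hypothetical 4x4 super-cell: 12 transparent (W) cells plus one R, one B
# and two G cells — only the 75% transparent ratio comes from the claim.
PATTERN = np.array([
    ['W', 'W', 'W', 'W'],
    ['W', 'R', 'G', 'W'],
    ['W', 'G', 'B', 'W'],
    ['W', 'W', 'W', 'W'],
])

def channel_mask(h, w, ch):
    """Boolean mask of the pixels sampled for channel ch in an h x w image
    (h and w assumed to be multiples of 4)."""
    return np.tile(PATTERN, (h // 4, w // 4)) == ch

assert (PATTERN == 'W').mean() == 0.75  # 12 of 16 cells are transparent
```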
4. The low-light small-pixel CFA sampling and edge computing device-adapted demosaicing method of claim 1, wherein the sizes of the convolution kernels of the 0, 1,2, 3 layers in the neural network model are 3 x 3, 5 x 5, 7 x 7, respectively.
5. The low-light small-pixel CFA sampling and edge computing device-adapted demosaicing method of claim 1, wherein in step 3.3:
The SSIM loss function L_SSIM is as follows:
L_SSIM = 1 − (1/(M·N)) · Σ_{i=1..M} Σ_{j=1..N} SSIM(x_ij, y_ij)
wherein SSIM(x, y) = ((2·μ_x·μ_y + C_1)·(2·σ_xy + C_2)) / ((μ_x² + μ_y² + C_1)·(σ_x² + σ_y² + C_2)) represents the structural similarity of two images, where x and y are the two images used to calculate the structural similarity, μ_x is the mean of x, μ_y is the mean of y, σ_x² is the variance of x, σ_y² is the variance of y, σ_xy is the covariance of x and y, and C_1 = (k_1·L)² and C_2 = (k_2·L)² are constants used to maintain stability; L is the dynamic range of the pixel values, with k_1 = 0.01 and k_2 = 0.03; the size of the sliding window in the SSIM calculation is set to 11 in this design;
M in the loss function L_SSIM denotes the number of rows of the matrices abstracted from the two images, and N denotes their number of columns;
After the number of training epochs exceeds 10, the loss takes the form of the MSE loss function L_MSE, which is specified as follows:
L_MSE = (1/(M·N)) · Σ_{i=1..M} Σ_{j=1..N} (x_ij − y_ij)²
The overall loss function is as follows:
L = L^1 + L^2 + L^3
The superscript l on the loss functions in the above equation indicates that the per-layer loss functions L^1, L^2, L^3 of the network, which introduce deep supervision, need to be considered together in the training process; accordingly, L^1_SSIM, L^2_SSIM, L^3_SSIM denote the SSIM loss functions of layers 1, 2 and 3, and likewise L^1_MSE, L^2_MSE, L^3_MSE denote the MSE loss functions of layers 1, 2 and 3.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210763316.0A CN115082315B (en) | 2022-06-30 | 2022-06-30 | Demosaicing method applicable to low-illumination small-pixel CFA sampling and edge computing equipment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115082315A CN115082315A (en) | 2022-09-20 |
CN115082315B true CN115082315B (en) | 2024-09-13 |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111696036A (en) * | 2020-05-25 | 2020-09-22 | 电子科技大学 | Residual error neural network based on cavity convolution and two-stage image demosaicing method |
CN112614072A (en) * | 2020-12-29 | 2021-04-06 | 北京航空航天大学合肥创新研究院 | Image restoration method and device, image restoration equipment and storage medium |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10366471B2 (en) * | 2015-12-02 | 2019-07-30 | Texas Instruments Incorporated | Universal and adaptive de-mosaicing (CFA) system |
RU2764395C1 (en) * | 2020-11-23 | 2022-01-17 | Самсунг Электроникс Ко., Лтд. | Method and apparatus for joint debayering and image noise elimination using a neural network |
CN113888405B (en) * | 2021-08-23 | 2023-08-15 | 西安电子科技大学 | Denoising and demosaicing method based on clustering self-adaptive expansion convolutional neural network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||