CN116029930A - Multispectral image demosaicing method based on convolutional neural network


Info

Publication number
CN116029930A
CN116029930A
Authority
CN
China
Prior art keywords
mosaic
convolution
layer
attention
module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310027162.3A
Other languages
Chinese (zh)
Inventor
高大化
石劢
刘丹华
牛毅
石光明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xidian University
Original Assignee
Xidian University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xidian University
Priority to CN202310027162.3A
Publication of CN116029930A
Legal status: Pending

Abstract

The invention relates to a multispectral image demosaicing method based on a convolutional neural network, which uses a constructed mosaic self-adaptive attention convolutional neural network to perform demosaicing reconstruction of an input multispectral Raw image. Aiming at the characteristics of the multispectral Raw image, namely its high spatial resolution and the fact that different pixel positions correspond to different spectral bands, a mosaic convolution is proposed in which convolution kernel weights are shared according to the positions of the optical filters in the MSFA (multispectral filter array); feature extraction can therefore be performed on the whole multispectral Raw image, and different convolution kernel weights are applied to the pixels sampled by different optical filters, so that the spectral features of the corresponding pixels can be fully extracted. The mosaic channel attention and mosaic spatial attention mechanisms provided by the invention can further reduce mosaic distortion phenomena such as the checkerboard effect in the feature map, highlight the main part of the image, and further improve the reconstruction effect.

Description

Multispectral image demosaicing method based on convolutional neural network
Technical Field
The invention belongs to the technical field of multispectral images, and particularly relates to a multispectral image demosaicing method based on a convolutional neural network.
Background
Compared with the traditional RGB color image, the multispectral image has more spectral bands and more spectral information, and is widely used in remote sensing image processing, medical image analysis, food quality detection, true-and-false target detection and other fields. Existing multispectral imaging technologies mainly comprise spatial scanning, spectral scanning, snapshot spectral imaging and the like, and sacrifice spatial or temporal resolution in exchange for spectral resolution. Existing snapshot spectral imaging (shown in fig. 3) based on a multispectral filter array (MSFA) (shown in fig. 1) improves spectral resolution at the expense of spatial resolution, which decreases as the spectral resolution increases. To obtain a complete multispectral image, a demosaicing process must be performed. Demosaicing is typically performed using interpolation, which, because of the spatially sparse sampling of the snapshot multispectral image, results in significant artifacts and checkerboard effects in the demosaiced image.
At present, snapshot multispectral image demosaicing methods based on deep convolutional neural networks have certain shortcomings: they all separate the individual spectral bands of the snapshot multispectral image and reconstruct from the resulting low-spatial-resolution multispectral image, which causes a loss of spatial information and limits the reconstruction effect. Because the filters of the snapshot spectral image sample different pixel positions, the spatial information corresponding to each spectral band is offset to a certain extent, and the multispectral images reconstructed by these methods may exhibit artifacts. Tewodros Amberbir Habtegebrial et al., in the paper Deep Convolutional Networks For Snapshot Hyperspectral Demosaicking, propose a multispectral Raw image demosaicing method based on a deep residual network, which generates a low-spatial-resolution multispectral image by separating and recombining the pixels corresponding to each spectral band in the multispectral Raw image, and then reconstructs this low-spatial-resolution multispectral image through the deep residual network to generate a high-spatial-resolution hyperspectral image.
However, in the prior art, separating the spectral bands of the multispectral Raw image reduces the spatial resolution of the image and loses part of its spatial information. In addition, the spectral bands of the multispectral Raw image are not completely independent; after separation, the relationships between the bands cannot be extracted, which affects the spectral accuracy of the reconstructed image. Moreover, when a convolutional neural network is used in the prior art to extract features from a multispectral Raw image, the convolution kernels share the same weights, so the spectral information is mixed after feature extraction.
Disclosure of Invention
In order to solve the problems in the prior art, the invention provides a multispectral image demosaicing method based on a convolutional neural network. The technical problems to be solved by the invention are realized by the following technical scheme:
the invention provides a multispectral image demosaicing method based on a convolutional neural network, which comprises the following steps:
acquiring a multispectral Raw image training set;
building a mosaic self-adaptive attention convolution neural network;
training the mosaic self-adaptive attention convolutional neural network by using the multispectral Raw image training set to obtain a trained mosaic self-adaptive attention convolutional neural network;
inputting the multispectral Raw image to be detected into a trained mosaic self-adaptive attention convolutional neural network for demosaicing reconstruction, and obtaining a corresponding demosaicing reconstruction multispectral image;
wherein the mosaic self-adaptive attention convolutional neural network comprises a mosaic convolutional module, a mosaic feature coding module, a mosaic feature decoding module and a plurality of spatial attention modules, wherein,
the mosaic convolution module, the mosaic feature encoding module and the mosaic feature decoding module are cascaded;
the mosaic feature encoding module and the mosaic feature decoding module both comprise a plurality of cascaded intensive residual attention modules;
the plurality of spatial attention modules are correspondingly connected between the dense residual attention module of the mosaic feature encoding module and the dense residual attention module of the mosaic feature decoding module.
In one embodiment of the present invention, obtaining a training set of multispectral Raw images includes:
performing simulation sampling on a plurality of original multispectral images by utilizing a multispectral filter array to obtain a plurality of multispectral Raw images serving as a multispectral Raw image training set;
the multi-spectral filter array comprises c spectral filters with m x n spatial arrangement modes, c is the number of spectral bands of the multi-spectral filter array, and m and n are integers larger than zero.
In one embodiment of the present invention, in the mosaic feature encoding module, the number of channels of the input/output feature map of each dense residual attention module is different;
in the mosaic feature decoding module, the number of input and output feature map channels of each dense residual attention module is different.
In one embodiment of the present invention, the dense residual attention module comprises a first convolution unit, a first concat splicing layer, a second convolution unit, a second concat splicing layer, a third convolution unit, a mosaic channel attention layer, and a fusion layer that are sequentially cascaded, wherein,
the first convolution unit, the second convolution unit and the third convolution unit each comprise a cascaded convolution layer and a first activation function layer, wherein the convolution kernel size of the convolution layer is 3×3, the number of convolution kernels of the convolution layers of the first and second convolution units is one quarter of the number of input channels of the dense residual attention module, and the number of convolution kernels of the convolution layer of the third convolution unit is consistent with the number of output channels of the dense residual attention module; the activation function of the first activation function layer is a PReLU activation function;
the first concat splicing layer is used for concatenating the input features of the dense residual attention module with the output features of the first convolution unit;
the second concat splicing layer is used for concatenating the input features of the dense residual attention module, the output features of the first convolution unit, and the output features of the second convolution unit;
the mosaic channel attention layer is used for aggregating the spectra at positions corresponding to the spectral filters in the output features of the third convolution unit, calculating a weight for each pixel of the aggregated features, and weighting the output features of the third convolution unit with the calculated weights;
and the fusion layer is used for fusing the input features of the dense residual attention module with the output features of the mosaic channel attention layer.
In one embodiment of the invention, the spatial attention module comprises a spatial feature aggregation layer, a spatial feature screening layer, a third concat splicing layer, a mosaic convolution module, a second activation function layer, and a feature attention layer, wherein,
the spatial feature aggregation layer is used for averaging the pixels over all channels at each spatial position of the input features of the spatial attention module to achieve feature aggregation;
the spatial feature screening layer is used for selecting the maximum pixel value over all channels at each spatial position of the input features of the spatial attention module;
the third concat splicing layer is used for concatenating the output features of the spatial feature aggregation layer and the output features of the spatial feature screening layer, and the spliced feature map is input into the mosaic convolution module;
the mosaic convolution module, the second activation function layer and the feature attention layer are sequentially cascaded;
the activation function of the second activation function layer is a Sigmoid activation function.
In one embodiment of the present invention, the mosaic convolution module comprises a multi-kernel convolution layer, a feature channel screening layer, and a feature channel fusion layer; wherein,
the number of convolution kernels of the multi-kernel convolution layer is consistent with the number of spectral bands of the multispectral filter array, and each convolution kernel of the multi-kernel convolution layer performs a convolution operation on the input feature map to obtain a corresponding mosaic feature map;
the feature channel screening layer is used for performing a spatial-position-based filtering operation on the mosaic feature maps generated by the multi-kernel convolution layer, wherein the number of filters is consistent with the number of convolution kernels of the multi-kernel convolution layer, and each filter responds to the spatial positions of the optical filter of the corresponding spectral band in the multispectral filter array;
and the feature channel fusion layer is used for additively fusing all the feature maps generated by the feature channel screening layer.
In one embodiment of the present invention, training the mosaic adaptive attention convolutional neural network by using the multispectral Raw image training set to obtain a trained mosaic adaptive attention convolutional neural network, including:
inputting the multispectral Raw image training set and the original multispectral image into a mosaic self-adaptive attention convolution neural network to obtain a reconstructed multispectral image corresponding to the multispectral Raw image;
calculating a regression loss between the reconstructed multispectral image and the original multispectral image using a Charbonnier regression loss function;
and carrying out multi-round training on the mosaic self-adaptive attention convolutional neural network by utilizing the self-adaptive moment estimation gradient descent algorithm according to the regression loss until the regression loss converges, so as to obtain the trained mosaic self-adaptive attention convolutional neural network.
In one embodiment of the invention, the Charbonnier regression loss function is:

$$L = \frac{1}{M}\sum_{i=1}^{M}\sqrt{\left\|\hat{y}_{i}-y_{i}\right\|^{2}+\varepsilon^{2}}$$

wherein L represents the regression loss between the reconstructed multispectral image and the original multispectral image, ŷ_i represents the i-th original multispectral image, y_i represents the i-th reconstructed multispectral image, M represents the total number of images in the multispectral Raw image training set, and ε represents the stability bias parameter.
Compared with the prior art, the invention has the beneficial effects that:
1. According to the multispectral image demosaicing method based on the convolutional neural network, the constructed mosaic self-adaptive attention convolutional neural network is used to perform demosaicing reconstruction of the input multispectral Raw image. A mosaic convolution module based on a convolution kernel weight-sharing strategy for the multispectral Raw image is provided in the mosaic self-adaptive attention convolutional neural network: the mosaic convolution module applies a specific convolution kernel weight only to the pixel positions corresponding to the same optical filter in the MSFA, and different convolution kernel weights are applied to pixels sampled by different optical filters, so feature extraction can be performed on the whole multispectral Raw image without separating the spectral bands, and the spectral features at each pixel can be better extracted without loss of spatial information;
2. According to the multispectral image demosaicing method based on the convolutional neural network, a dense residual attention module that performs feature aggregation adaptively based on the spatial positions of the optical filters is provided in the proposed mosaic self-adaptive attention convolutional neural network. After feature extraction, feature fusion can be performed on the pixel points in the feature map according to the corresponding positions of the different optical filters in the MSFA (multispectral filter array), and a mosaic channel attention map is generated therefrom, so that the network has a stronger learning capability and the checkerboard effect in the feature map can be reduced;
3. according to the multispectral image demosaicing method based on the convolutional neural network, a spatial attention module aiming at mosaic characteristics is arranged in the proposed mosaic self-adaptive attention convolutional neural network, and is mainly used for highlighting the main body content of a multispectral Raw image, enhancing the characterization capability of the neural network on the main body content in a mosaic characteristic diagram and reducing the loss of main body information of images before and after demosaicing;
4. according to the multispectral image demosaicing method based on the convolutional neural network, the network parameters of the mosaic self-adaptive attention convolutional neural network can be adjusted according to the number and arrangement mode of optical filters in the MSFA used for generating multispectral Raw images.
The foregoing is only an overview of the technical solution of the present invention. In order that the technical means of the present invention may be understood more clearly and implemented in accordance with the contents of the specification, and to make the above and other objects, features and advantages of the present invention more apparent, preferred embodiments are described in detail below with reference to the accompanying drawings.
Drawings
FIG. 1 is a schematic diagram of a multi-spectral filter array (MSFA) according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a multispectral Raw image according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a snapshot spectral imaging according to an embodiment of the present invention;
FIG. 4 is a flowchart of a multi-spectral image demosaicing method based on a convolutional neural network according to an embodiment of the present invention;
fig. 5 is a schematic structural diagram of a mosaic adaptive attention convolutional neural network according to an embodiment of the present invention;
fig. 6 is a schematic diagram of a mosaic convolution module according to an embodiment of the present invention;
FIG. 7 is a schematic diagram of a dense residual attention module provided by an embodiment of the present invention;
fig. 8 is a schematic diagram of a spatial attention module according to an embodiment of the present invention.
Detailed Description
In order to further explain the technical means and effects adopted by the invention to achieve the preset aim, the invention provides a multispectral image demosaicing method based on a convolutional neural network, which is described in detail below with reference to the accompanying drawings and the specific embodiments.
The foregoing and other features, aspects, and advantages of the present invention will become more apparent from the following detailed description of the preferred embodiments when taken in conjunction with the accompanying drawings. The technical means and effects adopted by the present invention to achieve the intended purpose can be more deeply and specifically understood through the description of the specific embodiments, however, the attached drawings are provided for reference and description only, and are not intended to limit the technical scheme of the present invention.
Example 1
Referring to fig. 4 and fig. 5 in combination, fig. 4 is a flowchart of a multi-spectral image demosaicing method based on a convolutional neural network according to an embodiment of the present invention; fig. 5 is a schematic structural diagram of a mosaic adaptive attention convolution neural network according to an embodiment of the present invention. As shown in the figure, the multispectral image demosaicing method based on the convolutional neural network of the embodiment comprises the following steps:
step 1: acquiring a multispectral Raw image training set;
in an alternative embodiment, the multiple original multispectral images are subjected to simulation sampling by using the multispectral filter array, so that multiple multispectral Raw images are obtained, and the multispectral Raw images are used as a multispectral Raw image training set, and a multispectral Raw image schematic diagram is shown in fig. 2.
The original multispectral image is three-dimensional data S ∈ R^(a×b×c), and each band of the multispectral image corresponds to a two-dimensional matrix S_i ∈ R^(a×b) in the three-dimensional data, where ∈ denotes membership in a set, R denotes the real number field, a denotes the width of the multispectral image, b denotes the height of the multispectral image, c denotes the number of spectral bands of the multispectral image, and i denotes the index of a spectral band in the multispectral image, i = 1, 2, …, c.
In the embodiment of the present invention, the multispectral filter array (MSFA) includes c spectral filters with spatial arrangement of m×n, c is the number of spectral bands of the multispectral filter array, and m and n are integers greater than zero.
The spatial size of the multispectral Raw image is a×b, consistent with that of the original multispectral image, and its number of channels is 1; spatially, every k pixels form a group, and the information at each pixel is obtained after filtering and sampling by the corresponding spectral filter. The pixels sampled by each spectral filter are periodically arranged in the multispectral Raw image. Optionally, in this example, c = 9, m = 3, n = 3.
In this embodiment, a number of sampled multispectral Raw images are selected; 80% of them are used as the training sample set, with their original multispectral images serving as ground truth for evaluation and comparison, and the remaining 20% of the multispectral Raw images and their corresponding original multispectral images are used as the test sample set for subsequent testing.
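As an illustration only (not part of the patented embodiment), the simulated MSFA sampling described above can be sketched in a few lines of Python; the function name msfa_sample, the NumPy-based implementation and the row-major band-to-position mapping are assumptions of this sketch.

```python
import numpy as np


def msfa_sample(cube, msfa_h=3, msfa_w=3):
    """Simulate snapshot MSFA sampling of a multispectral cube.

    cube: ndarray of shape (a, b, c) with c = msfa_h * msfa_w spectral bands.
    Returns a single-channel Raw image of shape (a, b) in which every pixel
    keeps only the band selected by the periodic filter pattern.
    """
    a, b, c = cube.shape
    assert c == msfa_h * msfa_w, "band count must match the MSFA pattern"
    raw = np.zeros((a, b), dtype=cube.dtype)
    for k in range(c):
        row, col = divmod(k, msfa_w)           # assumed position of band k in the pattern
        mask = np.zeros((a, b), dtype=bool)
        mask[row::msfa_h, col::msfa_w] = True  # periodic repetition of the m x n pattern
        raw[mask] = cube[..., k][mask]
    return raw
```

With c = 9 and a 3×3 pattern as in this example, every 3×3 block of the Raw image then contains exactly one sample from each spectral band.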
Step 2: building a mosaic self-adaptive attention convolution neural network;
As shown in fig. 5, the mosaic adaptive attention convolutional neural network of the present embodiment includes a mosaic convolution module, a mosaic feature encoding module, a mosaic feature decoding module, and a plurality of spatial attention modules, where the mosaic convolution module, the mosaic feature encoding module, and the mosaic feature decoding module are cascaded. The mosaic feature encoding module and the mosaic feature decoding module both comprise a plurality of cascaded dense residual attention modules; in the mosaic feature encoding module, the number of input and output feature map channels of each dense residual attention module is different, and in the mosaic feature decoding module, the number of input and output feature map channels of each dense residual attention module is different. The plurality of spatial attention modules are correspondingly connected between the dense residual attention modules of the mosaic feature encoding module and the dense residual attention modules of the mosaic feature decoding module.
Referring to fig. 6, which is a schematic diagram of a mosaic convolution module according to an embodiment of the present invention, in an alternative implementation the mosaic convolution module comprises a multi-kernel convolution layer, a feature channel screening layer, and a feature channel fusion layer.
The number of convolution kernels of the multi-kernel convolution layer is consistent with the number of spectral bands of the multispectral filter array, and each convolution kernel performs a convolution operation on the input feature map to obtain a corresponding mosaic feature map; the feature channel screening layer is used for performing a spatial-position-based filtering operation on the mosaic feature maps generated by the multi-kernel convolution layer, wherein the number of filters is consistent with the number of convolution kernels of the multi-kernel convolution layer, and each filter responds to the spatial positions of the optical filter of the corresponding spectral band in the multispectral filter array; the feature channel fusion layer is used for additively fusing all the feature maps generated by the feature channel screening layer.
In this embodiment, the multi-kernel convolution layer has 9 convolution kernels, each of size 3×3 and each producing a single output channel; each convolution kernel performs a convolution operation on the input feature map, generating 9 mosaic feature maps, and the number of filters of the feature channel screening layer is 9.
In this embodiment, the mosaic convolution module is designed according to a convolution kernel weight-sharing strategy based on the optical filters of the multispectral Raw image: the mosaic convolution module applies a specific convolution kernel weight only to the pixel positions corresponding to the same optical filter in the MSFA, and the convolution kernel weights applied to pixels sampled by different optical filters are different, so that feature extraction can be performed on the whole multispectral Raw image without separating the spectral bands. The spectral features at each pixel can thus be better extracted without loss of spatial information.
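For illustration only, the following PyTorch sketch shows one possible reading of such a mosaic convolution, assuming a 3×3 MSFA with c = 9 bands; the class name MosaicConv2d, the mask construction and the row-major band-to-position mapping are assumptions of this sketch rather than the patented implementation.

```python
import torch
import torch.nn as nn


class MosaicConv2d(nn.Module):
    """Sketch of a mosaic convolution: one 3x3 kernel per MSFA band, with each
    kernel's response kept only at the pixel positions sampled by the
    corresponding spectral filter."""

    def __init__(self, in_ch, out_ch, msfa_h=3, msfa_w=3):
        super().__init__()
        self.msfa_h, self.msfa_w = msfa_h, msfa_w
        self.num_bands = msfa_h * msfa_w      # c = m * n spectral bands
        # Multi-kernel convolution layer: one convolution per band, so weights
        # are shared only among pixel positions of the same optical filter.
        self.convs = nn.ModuleList(
            [nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1) for _ in range(self.num_bands)]
        )

    def _masks(self, h, w, device):
        # Feature channel screening layer: binary masks that are 1 at the
        # spatial positions of the corresponding filter in the periodic MSFA.
        masks = torch.zeros(self.num_bands, 1, h, w, device=device)
        for k in range(self.num_bands):
            row, col = divmod(k, self.msfa_w)
            masks[k, :, row::self.msfa_h, col::self.msfa_w] = 1.0
        return masks

    def forward(self, x):
        _, _, h, w = x.shape
        masks = self._masks(h, w, x.device)
        # Feature channel fusion layer: masked responses are summed, so every
        # pixel is produced by exactly one band-specific kernel.
        out = 0
        for k, conv in enumerate(self.convs):
            out = out + conv(x) * masks[k]
        return out
```

Because the masks simply repeat the MSFA pattern, a module of this kind can be applied to the Raw image itself or to mosaic feature maps with the same arrangement.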
In an alternative embodiment, the mosaic feature encoding module includes four cascaded dense residual attention modules, a first dense residual attention module, a second dense residual attention module, a third dense residual attention module, and a fourth dense residual attention module, respectively.
Optionally, the first dense residual attention module input channel number is 32, the output channel number is 32, the second dense residual attention module input channel number is 32, the output channel number is 64, the third dense residual attention module input channel number is 64, the output channel number is 128, the fourth dense residual attention module input channel number is 128, and the output channel number is 128.
In an alternative embodiment, the mosaic feature decoding module includes four cascaded dense residual attention modules, a fifth dense residual attention module, a sixth dense residual attention module, a seventh dense residual attention module, and an eighth dense residual attention module, respectively.
Optionally, the fifth dense residual attention module has an input channel number of 128, an output channel number of 128, the sixth dense residual attention module has an input channel number of 128, an output channel number of 64, the seventh dense residual attention module has an input channel number of 64, an output channel number of 64, the eighth dense residual attention module has an input channel number of 64, and an output channel number of 32.
In an alternative embodiment, the mosaic adaptive attention convolutional neural network comprises four spatial attention modules, wherein a spatial attention module is connected between the first dense residual attention module and the eighth dense residual attention module, a spatial attention module is connected between the second dense residual attention module and the seventh dense residual attention module, a spatial attention module is connected between the third dense residual attention module and the sixth dense residual attention module, and a spatial attention module is connected between the fourth dense residual attention module and the fifth dense residual attention module.
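Purely as an illustrative sketch of the wiring described above (and not the patented implementation), the four-stage encoder and decoder with spatial-attention skip connections could be assembled as follows; the class names, the concatenate-then-1×1-convolution fusion of skip and decoder features, and the final 3×3 output convolution are assumptions of this sketch, and MosaicConv2d, DenseResidualAttention and SpatialAttention refer to the sketches given elsewhere in this description.

```python
import torch
import torch.nn as nn


class MosaicDemosaicNet(nn.Module):
    """Sketch of the overall wiring: a mosaic convolution module, four encoder
    and four decoder dense residual attention modules (channel counts
    32-32-64-128-128 and 128-128-64-64-32), and spatial attention modules on
    the skip connections."""

    def __init__(self, out_bands=9):
        super().__init__()
        enc_ch = [(32, 32), (32, 64), (64, 128), (128, 128)]   # encoder in/out channels
        dec_ch = [(128, 128), (128, 64), (64, 64), (64, 32)]   # decoder in/out channels
        self.head = MosaicConv2d(1, 32)                        # mosaic convolution module
        self.encoder = nn.ModuleList([DenseResidualAttention(i, o) for i, o in enc_ch])
        self.decoder = nn.ModuleList([DenseResidualAttention(i, o) for i, o in dec_ch])
        self.skips = nn.ModuleList([SpatialAttention() for _ in enc_ch])
        # How skip and decoder features are merged is not spelled out in the
        # embodiment; this sketch concatenates them and reduces channels with
        # a 1x1 convolution (an assumption).
        skip_ch = [o for _, o in reversed(enc_ch)]              # 128, 128, 64, 32
        prev_ch = [enc_ch[-1][1]] + [o for _, o in dec_ch[:-1]]  # 128, 128, 64, 64
        self.fuse = nn.ModuleList([
            nn.Conv2d(p + s, d_in, kernel_size=1)
            for p, s, (d_in, _) in zip(prev_ch, skip_ch, dec_ch)
        ])
        self.tail = nn.Conv2d(32, out_bands, kernel_size=3, padding=1)

    def forward(self, raw):                        # raw: (B, 1, H, W) multispectral Raw image
        x = self.head(raw)
        feats = []
        for block in self.encoder:                 # mosaic feature encoding module
            x = block(x)
            feats.append(x)
        for i, block in enumerate(self.decoder):   # mosaic feature decoding module
            skip = self.skips[i](feats[-(i + 1)])  # spatial attention on the skip path
            x = block(self.fuse[i](torch.cat([x, skip], dim=1)))
        return self.tail(x)                        # reconstructed multispectral image
```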
Referring to fig. 7, which is a schematic diagram of a dense residual attention module according to an embodiment of the present invention, in an alternative implementation the dense residual attention module comprises a first convolution unit, a first concat splicing layer, a second convolution unit, a second concat splicing layer, a third convolution unit, a mosaic channel attention layer, and a fusion layer that are cascaded in sequence.
The first convolution unit, the second convolution unit and the third convolution unit each comprise a cascaded convolution layer and a first activation function layer, wherein the convolution kernel size of the convolution layer is 3×3, the number of convolution kernels of the convolution layers of the first and second convolution units is one quarter of the number of input channels of the dense residual attention module, and the number of convolution kernels of the convolution layer of the third convolution unit is consistent with the number of output channels of the dense residual attention module; the activation function of the first activation function layer is a PReLU activation function.
The first concat splicing layer is used for concatenating the input features of the dense residual attention module with the output features of the first convolution unit; the second concat splicing layer is used for concatenating the input features of the dense residual attention module, the output features of the first convolution unit, and the output features of the second convolution unit.
The mosaic channel attention layer is used for aggregating the spectrums at the positions corresponding to the spectrum filters in the output characteristics of the third convolution unit, calculating the weight of each pixel of the aggregated characteristics, and weighting the output characteristics of the third convolution unit by using the calculated weight. The fusion layer is used for fusing the input features of the intensive residual attention module with the output features of the mosaic channel attention layer.
Optionally, the mosaic channel attention layer uses adaptive averaging pooling to aggregate features in the feature map space dimension and uses Sigmoid activation functions.
In this embodiment, by providing a dense residual attention module that performs feature aggregation adaptively based on the spatial positions of the filters, feature fusion can be performed, after feature extraction, on the pixel points in the feature map according to the corresponding positions of the different filters in the MSFA (multispectral filter array), and a mosaic channel attention map is generated therefrom, so that the network has a stronger learning capability and the checkerboard effect in the feature map can be reduced.
It should be noted that the dense residual attention module is used to build the mosaic feature encoding module and the mosaic feature decoding module, and serves to extract neural network features. In other alternative embodiments, the dense residual attention module may use only a convolution layer and an activation function layer without a concat splicing layer or a fusion layer, or use only a convolution layer, an activation function layer and a concat splicing layer without a fusion layer, or use only a convolution layer, an activation function layer and a fusion layer without a concat splicing layer; in each case the function of extracting neural network features can still be fulfilled.
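The following sketch gives one possible reading of the dense residual attention module; the SE-style channel attention (which simplifies the per-filter-position aggregation of the mosaic channel attention to global adaptive average pooling) and the 1×1 residual projection used when the input and output channel counts differ are assumptions of this sketch, not the patented implementation.

```python
import torch
import torch.nn as nn


class DenseResidualAttention(nn.Module):
    """Sketch of the dense residual attention module: three 3x3 conv + PReLU
    units with dense concat connections, a channel attention layer built on
    adaptive average pooling and a Sigmoid, and a residual fusion."""

    def __init__(self, in_ch, out_ch):
        super().__init__()
        growth = in_ch // 4                    # first/second units output in_ch/4 channels
        self.conv1 = nn.Sequential(nn.Conv2d(in_ch, growth, 3, padding=1), nn.PReLU())
        self.conv2 = nn.Sequential(nn.Conv2d(in_ch + growth, growth, 3, padding=1), nn.PReLU())
        self.conv3 = nn.Sequential(nn.Conv2d(in_ch + 2 * growth, out_ch, 3, padding=1), nn.PReLU())
        # Channel attention: spatial aggregation by adaptive average pooling,
        # followed by a 1x1 convolution and a Sigmoid (simplified here).
        self.attn = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(out_ch, out_ch, kernel_size=1),
            nn.Sigmoid(),
        )
        # Residual projection when input/output channel counts differ
        # (an assumption of this sketch).
        self.proj = nn.Conv2d(in_ch, out_ch, 1) if in_ch != out_ch else nn.Identity()

    def forward(self, x):
        f1 = self.conv1(x)
        f2 = self.conv2(torch.cat([x, f1], dim=1))       # first concat splicing layer
        f3 = self.conv3(torch.cat([x, f1, f2], dim=1))   # second concat splicing layer
        f3 = f3 * self.attn(f3)                          # channel attention weighting
        return self.proj(x) + f3                         # fusion layer (residual add)
```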
Referring to fig. 8, in an alternative implementation, the spatial attention module comprises a spatial feature aggregation layer, a spatial feature screening layer, a third concat splicing layer, a mosaic convolution module, a second activation function layer, and a feature attention layer.
The spatial feature aggregation layer is used for averaging the pixels over all channels at each spatial position of the input features of the spatial attention module to achieve feature aggregation; the spatial feature screening layer is used for selecting the maximum pixel value over all channels at each spatial position of the input features of the spatial attention module; the third concat splicing layer is used for concatenating the output features of the spatial feature aggregation layer and the output features of the spatial feature screening layer, and the spliced feature map is input into the mosaic convolution module. The mosaic convolution module, the second activation function layer and the feature attention layer are sequentially cascaded.
Optionally, the activation function of the second activation function layer is a Sigmoid activation function.
In this embodiment, the spatial attention module is set for the mosaic feature, and is mainly used for highlighting the main content of the multispectral Raw image, enhancing the capability of the neural network for characterizing the main content in the mosaic feature map, and reducing the loss of main information of the images before and after demosaicing.
It should be noted that, in this embodiment, the mosaic convolution module is used in both the backbone structure and the spatial attention module of the mosaic adaptive attention convolution neural network, and the mosaic convolution module may act on not only the multispectral Raw image but also the mosaic feature map, and because the mosaic feature map still has a mosaic arrangement, the mosaic convolution may still be used.
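A minimal sketch of such a spatial attention module is given below; it reuses the MosaicConv2d sketch from above for the convolution over the two aggregated maps, and applying the resulting attention map back onto the module's own input features is an assumption of this sketch.

```python
import torch
import torch.nn as nn


class SpatialAttention(nn.Module):
    """Sketch of the spatial attention module: channel-wise mean and max
    aggregation, concatenation, a mosaic convolution, a Sigmoid, and weighting
    of the input features by the resulting attention map."""

    def __init__(self):
        super().__init__()
        # The embodiment applies a mosaic convolution to the two aggregated
        # maps; MosaicConv2d refers to the sketch given earlier.
        self.mosaic_conv = MosaicConv2d(2, 1)
        self.act = nn.Sigmoid()

    def forward(self, x):
        avg_map = torch.mean(x, dim=1, keepdim=True)    # spatial feature aggregation layer
        max_map, _ = torch.max(x, dim=1, keepdim=True)  # spatial feature screening layer
        attn = torch.cat([avg_map, max_map], dim=1)     # third concat splicing layer
        attn = self.act(self.mosaic_conv(attn))         # mosaic convolution + Sigmoid
        return x * attn                                 # feature attention layer
```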
The network parameters of the mosaic adaptive attention convolution neural network in this embodiment may be adjusted according to the number and arrangement of optical filters in the MSFA used to generate the multispectral Raw image.
Step 3: training the mosaic self-adaptive attention convolutional neural network by utilizing a multispectral Raw image training set to obtain a trained mosaic self-adaptive attention convolutional neural network;
the specific training process of the mosaic self-adaptive attention convolutional neural network is as follows:
inputting the multispectral Raw image training set and the original multispectral image into a mosaic self-adaptive attention convolution neural network to obtain a reconstructed multispectral image corresponding to the multispectral Raw image;
calculating regression loss between the reconstructed multispectral image and the original multispectral image by using a Charbonnier regression loss function;
in the image reconstruction task, a regression loss function is used as an optimization target, when the traditional L1 and L2 loss functions are relatively similar between a reconstructed image and a real image matrix, the optimization target is close to 0, the optimization is difficult to continue, the reconstruction performance of a convolutional neural network is limited, and in order to further improve the reconstruction performance of the neural network, the Charbonnier regression loss function is used in the embodiment, and the formula is as follows:
Figure BDA0004045594800000131
wherein L represents the regression loss between the reconstructed multispectral image and the original multispectral image,
Figure BDA0004045594800000132
representing the i-th original multispectral image, y i Representing the ith reconstructed multispectral image, M represents the total number of images of the multispectral Raw image training set, epsilon represents the stable bias parameter, and epsilon=0.001 in this embodiment.
And carrying out multiple rounds of training on the mosaic self-adaptive attention convolutional neural network by utilizing the self-adaptive moment estimation gradient descent algorithm according to the regression loss until the regression loss converges, and obtaining the trained mosaic self-adaptive attention convolutional neural network.
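For illustration, the Charbonnier loss and one training step might be sketched as follows; the per-pixel form of the loss, the learning rate and the surrounding training-loop code are assumptions of this sketch, and MosaicDemosaicNet refers to the network sketch given earlier.

```python
import torch


def charbonnier_loss(pred, target, eps=1e-3):
    # Per-pixel Charbonnier variant of L = (1/M) * sum_i sqrt(||y_hat_i - y_i||^2 + eps^2),
    # with eps = 0.001 as in the embodiment.
    return torch.mean(torch.sqrt((pred - target) ** 2 + eps ** 2))


model = MosaicDemosaicNet(out_bands=9)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)  # adaptive moment estimation (Adam)


def train_step(raw_batch, gt_batch):
    """One optimization step on a batch of Raw images and their ground truth."""
    optimizer.zero_grad()
    recon = model(raw_batch)                  # demosaicing reconstruction
    loss = charbonnier_loss(recon, gt_batch)  # regression loss against the original image
    loss.backward()
    optimizer.step()
    return loss.item()
```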
Step 4: inputting the multispectral Raw image to be detected into a trained mosaic self-adaptive attention convolutional neural network for demosaicing reconstruction, and obtaining a corresponding demosaicing reconstruction multispectral image.
The multispectral image demosaicing method based on the convolutional neural network of this embodiment uses the constructed mosaic self-adaptive attention convolutional neural network to perform demosaicing reconstruction of the input multispectral Raw image. The spectral bands of the multispectral Raw image do not need to be separated, so the loss of spatial information during feature extraction is reduced, and extracting features from the whole multispectral Raw image allows the inter-spectral relationships between different bands to be captured.
The mosaic channel attention added to the mosaic self-adaptive attention convolutional neural network of this embodiment gives the neural network a stronger learning capability and reduces the checkerboard effect in the feature map, while the mosaic spatial attention enhances the ability of the neural network to characterize the main content in the mosaic feature map and reduces the loss of main-body information between the images before and after demosaicing.
The multispectral images generated after demosaicing by this method are superior to those of existing methods in terms of peak signal-to-noise ratio (PSNR), structural similarity (SSIM), spectral angle mapping (SAM) and other indices.
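For reference only, PSNR and the spectral angle can be computed for two multispectral cubes of shape (a, b, c) as sketched below; the normalization to a peak value of 1.0 and the use of the mean per-pixel spectral angle are assumptions of this sketch, and no results from the embodiment are reproduced here.

```python
import numpy as np


def psnr(x, y, peak=1.0):
    """Peak signal-to-noise ratio between two multispectral cubes."""
    mse = np.mean((x - y) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)


def sam(x, y, eps=1e-8):
    """Mean spectral angle (in radians) between the per-pixel spectra of two
    (a, b, c) multispectral cubes."""
    dot = np.sum(x * y, axis=-1)
    denom = np.linalg.norm(x, axis=-1) * np.linalg.norm(y, axis=-1) + eps
    return np.mean(np.arccos(np.clip(dot / denom, -1.0, 1.0)))
```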
It should be noted that in this document relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that an article or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in an article or apparatus that comprises the element. The terms "connected" or "connected," and the like, are not limited to physical or mechanical connections, but may include electrical connections, whether direct or indirect. The orientation or positional relationship indicated by "upper", "lower", "left", "right", etc. is based on the orientation or positional relationship shown in the drawings, and is merely for convenience of description and to simplify the description, and is not indicative or implying that the apparatus or elements referred to must have a specific orientation, be constructed and operated in a specific orientation, and therefore should not be construed as limiting the invention.
The foregoing is a further detailed description of the invention in connection with the preferred embodiments, and it is not intended that the invention be limited to the specific embodiments described. It will be apparent to those skilled in the art that several simple deductions or substitutions may be made without departing from the spirit of the invention, and these should be considered to be within the scope of the invention.

Claims (8)

1. A multi-spectral image demosaicing method based on a convolutional neural network, comprising:
acquiring a multispectral Raw image training set;
building a mosaic self-adaptive attention convolution neural network;
training the mosaic self-adaptive attention convolutional neural network by using the multispectral Raw image training set to obtain a trained mosaic self-adaptive attention convolutional neural network;
inputting the multispectral Raw image to be detected into a trained mosaic self-adaptive attention convolutional neural network for demosaicing reconstruction, and obtaining a corresponding demosaicing reconstruction multispectral image;
wherein the mosaic self-adaptive attention convolutional neural network comprises a mosaic convolutional module, a mosaic feature coding module, a mosaic feature decoding module and a plurality of spatial attention modules, wherein,
the mosaic convolution module, the mosaic feature encoding module and the mosaic feature decoding module are cascaded;
the mosaic feature encoding module and the mosaic feature decoding module both comprise a plurality of cascaded intensive residual attention modules;
the plurality of spatial attention modules are correspondingly connected between the dense residual attention module of the mosaic feature encoding module and the dense residual attention module of the mosaic feature decoding module.
2. The method for demosaicing a multispectral image based on a convolutional neural network according to claim 1, wherein obtaining a multispectral Raw image training set comprises:
performing simulation sampling on a plurality of original multispectral images by utilizing a multispectral filter array to obtain a plurality of multispectral Raw images serving as a multispectral Raw image training set;
the multi-spectral filter array comprises c spectral filters with m x n spatial arrangement modes, c is the number of spectral bands of the multi-spectral filter array, and m and n are integers larger than zero.
3. The multi-spectral image demosaicing method based on the convolutional neural network according to claim 1, wherein in the mosaic feature coding module, the number of input/output feature map channels of each dense residual attention module is different;
in the mosaic feature decoding module, the number of input and output feature map channels of each dense residual attention module is different.
4. The method for demosaicing a multispectral image based on a convolutional neural network according to claim 3, wherein the dense residual attention module comprises a first convolution unit, a first concat splicing layer, a second convolution unit, a second concat splicing layer, a third convolution unit, a mosaic channel attention layer and a fusion layer which are sequentially cascaded, wherein,
the first convolution unit, the second convolution unit and the third convolution unit each comprise a cascaded convolution layer and a first activation function layer, wherein the convolution kernel size of the convolution layer is 3×3, the number of convolution kernels of the convolution layers of the first and second convolution units is one quarter of the number of input channels of the dense residual attention module, and the number of convolution kernels of the convolution layer of the third convolution unit is consistent with the number of output channels of the dense residual attention module; the activation function of the first activation function layer is a PReLU activation function;
the first concat splicing layer is used for concatenating the input features of the dense residual attention module with the output features of the first convolution unit;
the second concat splicing layer is used for concatenating the input features of the dense residual attention module, the output features of the first convolution unit and the output features of the second convolution unit;
the mosaic channel attention layer is used for aggregating the spectra at positions corresponding to the spectral filters in the output features of the third convolution unit, calculating a weight for each pixel of the aggregated features, and weighting the output features of the third convolution unit with the calculated weights;
and the fusion layer is used for fusing the input features of the dense residual attention module with the output features of the mosaic channel attention layer.
5. The method of claim 1, wherein the spatial attention module comprises a spatial feature aggregation layer, a spatial feature screening layer, a third concat splicing layer, a mosaic convolution module, a second activation function layer, and a feature attention layer, wherein,
the spatial feature aggregation layer is used for averaging the pixels over all channels at each spatial position of the input features of the spatial attention module to achieve feature aggregation;
the spatial feature screening layer is used for selecting the maximum pixel value over all channels at each spatial position of the input features of the spatial attention module;
the third concat splicing layer is used for concatenating the output features of the spatial feature aggregation layer and the output features of the spatial feature screening layer, and the spliced feature map is input into the mosaic convolution module;
the mosaic convolution module, the second activation function layer and the feature attention layer are sequentially cascaded;
the activation function of the second activation function layer is a Sigmoid activation function.
6. The method for demosaicing a multispectral image based on a convolutional neural network of claim 5, wherein the mosaic convolution module comprises a multi-kernel convolution layer, a feature channel screening layer and a feature channel fusion layer; wherein,
the number of convolution kernels of the multi-kernel convolution layer is consistent with the number of spectral bands of the multispectral filter array, and each convolution kernel of the multi-kernel convolution layer performs a convolution operation on the input feature map to obtain a corresponding mosaic feature map;
the feature channel screening layer is used for performing a spatial-position-based filtering operation on the mosaic feature maps generated by the multi-kernel convolution layer, wherein the number of filters is consistent with the number of convolution kernels of the multi-kernel convolution layer, and each filter responds to the spatial positions of the optical filter of the corresponding spectral band in the multispectral filter array;
and the feature channel fusion layer is used for additively fusing all the feature maps generated by the feature channel screening layer.
7. The multi-spectral image demosaicing method based on the convolutional neural network according to claim 2, wherein training the mosaic adaptive attention convolutional neural network by using the multi-spectral Raw image training set to obtain a trained mosaic adaptive attention convolutional neural network comprises:
inputting the multispectral Raw image training set and the original multispectral image into a mosaic self-adaptive attention convolution neural network to obtain a reconstructed multispectral image corresponding to the multispectral Raw image;
calculating a regression loss between the reconstructed multispectral image and the original multispectral image using a Charbonnier regression loss function;
and carrying out multi-round training on the mosaic self-adaptive attention convolutional neural network by utilizing the self-adaptive moment estimation gradient descent algorithm according to the regression loss until the regression loss converges, so as to obtain the trained mosaic self-adaptive attention convolutional neural network.
8. The method for demosaicing a multispectral image based on a convolutional neural network of claim 7, wherein the Charbonnier regression loss function is:

$$L = \frac{1}{M}\sum_{i=1}^{M}\sqrt{\left\|\hat{y}_{i}-y_{i}\right\|^{2}+\varepsilon^{2}}$$

wherein L represents the regression loss between the reconstructed multispectral image and the original multispectral image, ŷ_i represents the i-th original multispectral image, y_i represents the i-th reconstructed multispectral image, M represents the total number of images in the multispectral Raw image training set, and ε represents the stability bias parameter.
CN202310027162.3A 2023-01-09 2023-01-09 Multispectral image demosaicing method based on convolutional neural network Pending CN116029930A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310027162.3A CN116029930A (en) 2023-01-09 2023-01-09 Multispectral image demosaicing method based on convolutional neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310027162.3A CN116029930A (en) 2023-01-09 2023-01-09 Multispectral image demosaicing method based on convolutional neural network

Publications (1)

Publication Number Publication Date
CN116029930A true CN116029930A (en) 2023-04-28

Family

ID=86077505

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310027162.3A Pending CN116029930A (en) 2023-01-09 2023-01-09 Multispectral image demosaicing method based on convolutional neural network

Country Status (1)

Country Link
CN (1) CN116029930A (en)


Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117237474A (en) * 2023-11-16 2023-12-15 长春理工大学 Depth guidance-based on-chip integrated multispectral imaging reconstruction method
CN117237474B (en) * 2023-11-16 2024-02-09 长春理工大学 Depth guidance-based on-chip integrated multispectral imaging reconstruction method
CN117372564A (en) * 2023-12-04 2024-01-09 长春理工大学 Method, system and storage medium for reconstructing multispectral image
CN117372564B (en) * 2023-12-04 2024-03-08 长春理工大学 Method, system and storage medium for reconstructing multispectral image


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination