CN115689962A - Multi-exposure image fusion method based on multi-scale self-encoder - Google Patents
- Publication number
- CN115689962A (application number CN202211424921.1A)
- Authority
- CN
- China
- Prior art keywords
- image
- convolution
- scale
- output
- encoder
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Image Processing (AREA)
Abstract
The invention discloses a multi-exposure image fusion method based on a multi-scale self-encoder, comprising the following steps: 1, prepare and preprocess the data, and construct a multi-scale self-encoder network consisting mainly of a multi-scale encoder and a decoder, where the encoder comprises convolution-plus-activation blocks and vision transformers and is mainly used for multi-scale feature extraction, while the decoder comprises densely cross-connected, multi-scale-fusing convolution-plus-activation blocks and is mainly used for image reconstruction; 2, fuse the input multi-exposure image pair, including network training and multi-exposure image fusion. The invention makes full use of the complementary and redundant information in the overexposed and underexposed images to fuse a high-quality image, providing a better image for human observation while also supporting computer vision tasks such as segmentation and classification of images, thereby assisting human recognition, computer analysis and related research.
Description
Technical Field
The invention relates to the technical field of multi-exposure image fusion, in particular to a multi-exposure image fusion method based on a multi-scale self-encoder.
Background
The brightness in natural scenes can vary greatly, and the dynamic range of a single image is much lower than that of a natural scene due to the limitations of imaging devices. The photographed scene may be affected by light, weather, sun altitude and other factors, so overexposure and underexposure often occur. A single image may not fully reflect the light and dark levels of the scene, and some information may be lost, resulting in unsatisfactory imaging. Matching the limited dynamic response of existing imaging devices, display monitors and the human eye to real natural scenes remains challenging. Multi-exposure image fusion (MEF) technology provides a simple, economical and efficient way to overcome the contradiction between high-dynamic-range (HDR) imaging and low-dynamic-range (LDR) display. It avoids the complexity of HDR imaging hardware design, reduces the weight and power consumption of the whole device, improves image quality, and has important applicability in digital photography. MEF fuses multiple differently exposed images of the same scene into a single visually pleasing, high-quality image. MEF is similar to other image fusion tasks, such as medical image fusion and remote sensing image fusion, in that all of them combine important information from multiple source images to produce a high-quality fused image; the main difference between these tasks lies in the source images and the information to be fused. For MEF, the source images are images with different exposures. MEF has attracted considerable attention due to its effectiveness in producing high-quality images.
Multi-exposure image fusion is an important branch of image fusion; its main task is to fuse several images of the same scene with different exposure levels into a single high-dynamic-range, high-quality image. Existing methods have several problems. Traditional methods rely on hand-designed feature extractors and fusion rules, so they are not robust, perform poorly across different scenes, and can produce uneven brightness and artifacts. Deep-learning-based methods rely on training on multi-exposure datasets, which are much smaller than natural-image datasets, so networks with more layers and deeper parameters cannot be trained adequately. Moreover, most previous methods are purely convolutional and neglect global information, and they also lack multi-scale feature fusion and interaction.
Through multi-exposure image fusion, all important feature information can be obtained from a single image, which facilitates human perception and subsequent image processing such as target detection, target segmentation and edge extraction. Realizing multi-exposure image fusion technology is therefore of great significance.
Disclosure of Invention
To overcome the problems of existing multi-exposure image fusion methods, the invention provides a multi-exposure image fusion method based on a multi-scale self-encoder, which makes full use of the complementary and redundant information of images with different exposure levels to provide a better feature representation and reconstruct a higher-quality image, thereby providing a better image for human observation while also supporting computer vision tasks such as segmentation and classification of images.
The invention adopts the following technical scheme for solving the problems:
the invention discloses a multi-exposure image fusion method based on a multi-scale self-encoder, which is characterized by comprising the following steps:
step 1: obtain P RGB natural images and convert them into grayscale images, denoted {I_1, I_2, …, I_p, …, I_P} and used as the training set, where I_p represents the p-th grayscale image;
step 2: construct a multi-scale self-encoder network comprising a multi-scale encoder and a decoder;
step 2.1: the multi-scale encoder comprises W convolution blocks A_1, A_2, …, A_w, …, A_W, X convolution blocks N_1, N_2, …, N_x, …, N_X, and Y vision transformers Trans_1, Trans_2, …, Trans_y, …, Trans_Y, where A_w represents the w-th convolution block and comprises a convolution layer with an A×A kernel and a ReLU activation function; N_x represents the x-th convolution block and comprises a convolution layer with an N×N kernel and a ReLU activation function; Trans_y represents the y-th vision transformer; Y = W-2;
step 2.1.1: input the p-th grayscale image I_p into the multi-scale encoder; after it passes through the 1st convolution block A_1 and the 1st convolution block N_1 in turn, a primary shallow feature map is obtained; this map then passes through the Y vision transformers Trans_1, Trans_2, …, Trans_y, …, Trans_Y in turn, correspondingly yielding Y primary deep feature maps, the y-th being the output of Trans_y;
step 2.1.2: input the primary shallow feature map into the 2nd convolution block A_2 to obtain a shallow feature map with C channels;
input the Y primary deep feature maps correspondingly into the W-2 convolution blocks A_3, …, A_w, …, A_W to obtain W-2 deep feature maps, each with C channels, the (w-2)-th being the output of A_w; the shallow feature map and the deep feature maps together form W-1 comprehensive feature maps;
step 2.1.3: pass the (W-1)-th comprehensive feature map through the X-th convolution block N_X to obtain the (X-1)-th multi-scale feature map;
step 2.1.4: upsample the (W-1)-th comprehensive feature map to obtain the (W-2)-th upsampled feature; add this upsampled feature element-wise to the (W-2)-th comprehensive feature map to obtain the (W-2)-th intermediate feature map; pass the (W-2)-th intermediate feature map through the (X-1)-th convolution block N_{X-1} to obtain the (X-2)-th multi-scale feature;
step 2.1.5: upsample the (W-2)-th intermediate feature map to obtain the (W-3)-th upsampled feature; add it element-wise to the (W-3)-th comprehensive feature map to obtain the (W-3)-th intermediate feature map; pass the (W-3)-th intermediate feature map through the (X-2)-th convolution block N_{X-2} to obtain the (X-3)-th multi-scale feature;
step 2.1.6: continue processing the remaining intermediate feature maps according to the procedure of step 2.1.5, finally obtaining X-1 multi-scale feature maps, the (x-1)-th being the output of convolution block N_x;
step 2.2: the decoder comprises P convolution blocks and an output block Conv, where the P convolution blocks are densely connected in an upper-triangular pattern and denoted Decoder_(i,j); Decoder_(i,j), the convolution block in the i-th column and j-th row, comprises a convolution layer with an N×N kernel and a ReLU activation function; I = J = X-2, where I and J are the maximum column and row indices;
the output block Conv comprises a convolution layer with an A×A kernel and a ReLU activation function;
step 2.2.1: renumber the X-1 multi-scale features output by the encoder by the row subscripts 1, 2, …, j, …, J of the corresponding decoders, so that the j-th renumbered multi-scale feature corresponds to decoder row j;
step 2.2.2: the input of Decoder_(1,j) in the 1st column, j-th row is the j-th multi-scale feature map concatenated with the upsampled (j+1)-th multi-scale feature map; the output of Decoder_(1,j) is the feature map I_(1,j);
step 2.2.3: for the convolution block Decoder_(i,j) in the j-th row of any column other than the first, the input is the concatenation of the j-th multi-scale feature map, the upsampled output feature I_(i-1,j+1) of the decoder in row j+1, column i-1, and the feature maps I_(i-1,j), …, I_(1,j) output by the decoders in row j from column i-1 down to column 1; the output of Decoder_(i,j) is the feature map I_(i,j); after the P convolution blocks of the decoder, the feature I_(I,1) is obtained; the feature I_(I,1) is processed by the output block Conv to obtain the output result O_p;
step 3: construct the total loss function L of the multi-scale self-encoder network using equation (1):
L = L_ssim + λ·L_pixel (1)
in equation (1), λ is the weight coefficient of the pixel loss, L_ssim is the structural similarity loss function given by equation (2), and L_pixel is the pixel loss function given by equation (3);
L_ssim = 1 - SSIM(I_p, O_p) (2)
in equation (2), SSIM denotes structural similarity;
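A minimal numerical sketch of the loss in equations (1) and (2) follows. Two assumptions are made for illustration only: since equation (3) is not reproduced here, L_pixel is taken as a mean-squared pixel error, and SSIM is computed as a single global window rather than the usual sliding-window variant.

```python
import numpy as np

def ssim_global(x, y, c1=0.01 ** 2, c2=0.03 ** 2):
    """Single-window SSIM over whole images with pixel values in [0, 1]."""
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))

def total_loss(i_p, o_p, lam=0.5):
    """L = L_ssim + lambda * L_pixel; L_pixel assumed to be MSE."""
    l_ssim = 1.0 - ssim_global(i_p, o_p)    # equation (2)
    l_pixel = np.mean((i_p - o_p) ** 2)     # assumed form of equation (3)
    return l_ssim + lam * l_pixel           # equation (1)
```

For a perfect reconstruction O_p = I_p, SSIM equals 1 and both loss terms vanish; lam = 0.5 is a placeholder, as no value of λ is given here.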
step 4: based on the training set, train the multi-scale self-encoder network with a back-propagation algorithm, computing the total loss function L to adjust the network parameters until the maximum number of iterations is reached, thereby obtaining the trained multi-scale self-encoder network;
step 5: obtain B pairs of multi-exposure images, convert them into the YCbCr color space, and keep only the Y-channel image pairs, obtaining B preprocessed multi-exposure image pairs {(I_o1, I_u1), (I_o2, I_u2), …, (I_ob, I_ub), …, (I_oB, I_uB)}, where (I_ob, I_ub) denotes the b-th multi-exposure image pair, I_ob the overexposed image of the b-th Y channel, and I_ub the underexposed image of the b-th Y channel;
step 6: input the multi-exposure image pairs {(I_o1, I_u1), …, (I_oB, I_uB)} into the trained multi-scale encoder to obtain overexposed image features at S scales {Io_f1, Io_f2, …, Io_fs, …, Io_fS} and underexposed image features {Iu_f1, Iu_f2, …, Iu_fs, …, Iu_fS}, where Io_fs denotes the s-th overexposed image feature and Iu_fs the s-th underexposed image feature;
add the s-th overexposed image feature Io_fs and the s-th underexposed image feature Iu_fs and average the sum to obtain the s-th fused feature f_s, yielding the fused feature set {f_1, f_2, …, f_s, …, f_S}; input this set into the trained decoder to obtain the fusion results {Output_1, Output_2, …, Output_b, …, Output_B}, where Output_b is the fusion result of the overexposed image I_ob and the underexposed image I_ub of the b-th Y channel;
convert {Output_1, …, Output_B} from the YCbCr domain to the RGB domain, finally obtaining uniformly exposed color images {Result_1, Result_2, …, Result_b, …, Result_B}, where Result_b denotes the b-th color image result.
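The per-scale fusion rule of step 6 (add the two feature maps element-wise, then halve) can be sketched as follows; the feature shapes used are illustrative placeholders:

```python
import numpy as np

def fuse_features(over_feats, under_feats):
    """Step 6 fusion rule: f_s = (Io_fs + Iu_fs) / 2, applied at each scale s."""
    return [(io + iu) / 2.0 for io, iu in zip(over_feats, under_feats)]

# illustrative channels-last shapes for S = 4 scales
shapes = [(256, 256, 32), (128, 128, 64), (64, 64, 128), (32, 32, 256)]
over = [np.ones(s) for s in shapes]    # stand-ins for encoder outputs Io_fs
under = [np.zeros(s) for s in shapes]  # stand-ins for encoder outputs Iu_fs
fused = fuse_features(over, under)     # each entry is a constant 0.5 map
```

In the method, the fused list {f_1, …, f_S} is then passed to the trained decoder.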
The invention also relates to an electronic device comprising a memory and a processor, characterized in that the memory stores a program supporting the processor in executing the multi-exposure image fusion method, and the processor is configured to execute the program stored in the memory.
The invention also relates to a computer-readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, performs the steps of the multi-exposure image fusion method.
Compared with the prior art, the invention has the beneficial effects that:
1. The invention provides a unified network framework that fuses overexposed and underexposed images in a single task, making full use of the redundant and complementary information between images of different modes to fuse high-quality images. Unlike existing methods that must train a learning network on multi-exposure images, the method achieves high-quality multi-exposure fusion after training only on an ordinary natural-image dataset (such as the MS-COCO 2014 dataset), thereby avoiding dependence on multi-exposure datasets for training parameters and avoiding the need for deeper networks with more layers.
2. The invention designs a top-down and bottom-up encoder that combines the multi-scale characteristics of CNNs and transformers to effectively extract local and global features. The pyramidal multi-scale extraction handles features at varying image scales well, ensures that features at different scales carry strong semantic information, and integrates low-level details with high-level semantics, bringing better detail expression to the fusion result.
3. The invention designs a decoder composed of upper-triangular dense connections and upsampling, which effectively fuses multi-scale features and makes full use of deep features. It retains more of the information at different scales extracted by the encoder network and prevents the network from losing shallow features while extracting deeper ones, so that the extracted feature information is more comprehensive; the multi-scale features produced by the encoder can thus be fully exploited, enhancing the quality of the fused image.
Drawings
FIG. 1 is a flow chart of a multi-exposure image fusion method based on a multi-scale self-encoder according to the present invention;
FIG. 2 is a schematic diagram of a network architecture according to the present invention;
FIG. 3 is a schematic view of the fusion construct of the present invention;
FIG. 4 is a schematic diagram of an encoder of the present invention;
FIG. 5 is a block diagram of a decoder according to the present invention.
Detailed Description
In this embodiment, a general flow of a multi-exposure image fusion method based on a multi-scale self-encoder is shown in fig. 1, and includes the following steps:
Step 1: obtain P RGB natural images, convert them into grayscale images, and denote them {I_1, I_2, …, I_p, …, I_P}, where I_p represents the p-th grayscale image.
Step 2: constructing a multi-scale self-encoder network which adopts the multi-scale self-encoder network shown in FIG. 2 and comprises a multi-scale encoder and a decoder;
Step 2.1: the multi-scale encoder comprises W convolution blocks A_1, A_2, …, A_w, …, A_W, X convolution blocks N_1, N_2, …, N_x, …, N_X, and Y vision transformers Trans_1, Trans_2, …, Trans_y, …, Trans_Y, where A_w represents the w-th convolution block and comprises a convolution layer with an A×A kernel and a ReLU activation function; N_x represents the x-th convolution block and comprises a convolution layer with an N×N kernel and a ReLU activation function; Trans_y represents the y-th vision transformer; Y = W-2. In the present embodiment, as shown in fig. 4, W = 5, X = 5, Y = 3.
Step 2.1.1: p-th gray image I p Input into a multi-scale encoder and sequentially pass through a 1 st convolution block A 1 And the 1 st convolution block N 1 After the treatment, obtaining a primary shallow feature mapThen sequentially passes through Y visual converters Trans 1 ,Trans 2 ,…,Trans y ,…,Trans Y After the treatment, Y primary deep characteristic maps are correspondingly obtainedWherein the content of the first and second substances,representing the y-th primary deep profile.
Step 2.1.2: will be provided withInput the 2 nd convolution block A 2 Processing to obtain shallow layer characteristic diagram with channel number C
Will be provided withCorresponding to the input W-2 convolution blocks A in sequence 3 ,…,A w ,…,A W Processing to obtain deep characteristic diagram with channel number CWherein, the first and the second end of the pipe are connected with each other,represents the w-2 deep characteristic diagram; from shallow feature mapsAnd deep level feature mapForm W-1 comprehensive characteristic graphs
Step 2.1.3: w-1 comprehensive characteristic diagramThrough the Xth convolution block N X After the treatment, obtaining an X-1 th multi-scale characteristic diagram
Step 2.1.4: for the W-1 comprehensive characteristic diagramObtaining the W-2 th up-sampling characteristic after up-samplingUpsampling featureAnd then with the W-2 comprehensive characteristic diagramPerforming element-by-element addition to obtain W-2 intermediate characteristic diagramW-2 th intermediate feature mapThrough the X-1 th convolution block N X-1 Then obtaining the X-2 characteristics
Step 2.1.5: to pairObtaining W-3 th up-sampling characteristic after up-sampling Combined with the W-3 th featurePerforming element-by-element addition to obtain W-3 intermediate characteristic diagramW-3 intermediate feature mapPasses through the X-2 th convolution block N X-2 Then obtaining the X-3 characteristics
Step 2.1.6: according to the process of step 2.1.5After processing, X-1 multi-scale characteristic graphs are obtained Showing an x-1 th multi-scale feature map.
As shown in fig. 4, the encoder input I_p is a 256×256×1 image. After A_1 the output is a 256×256×16 feature map, and after N_1 a 256×256×32 primary shallow feature map. Passing this through Trans_1, Trans_2, Trans_3 in turn yields primary deep feature maps of 128×128×64, 64×64×128 and 32×32×256, where each vision transformer is a standard vision transformer. The primary shallow feature map then passes through A_2 to give a 256×256×128 shallow feature map, and the primary deep feature maps pass through convolution blocks A_3, A_4, A_5 to give deep feature maps of 128×128×128, 64×64×128 and 32×32×128. The shallow feature map and the deep feature maps form 4 comprehensive feature maps.
The 4th comprehensive feature map is processed by the 5th convolution block N_5 to obtain the 4th multi-scale feature map, of size 32×32×256.
The 4th comprehensive feature map is upsampled and added element-wise to the 3rd comprehensive feature map to obtain a 64×64×128 intermediate feature map, which passes through the fourth convolution block N_4 to give the 3rd multi-scale feature, of size 64×64×128.
This intermediate feature map is upsampled and added element-wise to the 2nd comprehensive feature map to obtain a 128×128×128 intermediate feature map, which passes through the third convolution block N_3 to give the 2nd multi-scale feature, of size 128×128×64.
This intermediate feature map is upsampled and added element-wise to the 1st comprehensive feature map to obtain a 256×256×128 intermediate feature map, which passes through the second convolution block N_2 to give the 1st multi-scale feature, of size 256×256×32. Four multi-scale features are thus obtained.
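The top-down pathway above (steps 2.1.3 to 2.1.6: upsample the coarser map, add it element-wise to the next finer comprehensive map) can be sketched as below. The convolution blocks N_x are stood in for by the identity, so only the upsample-and-add skeleton is shown; `top_down_merge` is an illustrative name, not from the patent.

```python
import numpy as np

def upsample2x(f):
    """Nearest-neighbour 2x spatial upsampling of an (H, W, C) feature map."""
    return f.repeat(2, axis=0).repeat(2, axis=1)

def top_down_merge(comprehensive):
    """comprehensive: list of (H, W, C) maps, finest first, with H and W
    halving at each level and a common channel count C (as in the
    embodiment, where all comprehensive maps have 128 channels).
    Returns the merged maps, finest first; in the real network each one
    would additionally pass through its convolution block N_x."""
    merged = [comprehensive[-1]]                # coarsest map (step 2.1.3)
    for finer in reversed(comprehensive[:-1]):  # steps 2.1.4 to 2.1.6
        merged.append(upsample2x(merged[-1]) + finer)
    return merged[::-1]
```

With three all-ones maps, each merge adds one more accumulated level, so the finest output is a constant 3, which makes the running element-wise accumulation easy to check.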
Step 2.2: as shown in fig. 5, the decoder is composed of P convolutional blocks and an output block Conv, where the P convolutional blocks are connected densely at an upper triangle and are sequentially marked asWherein, decoder (i,j) Represents the convolution block of the ith column and jth row, and the convolution block Decoder of the ith column and jth row (i,j) The method comprises the following steps of (1) including a convolution layer with convolution kernel of NxN and a ReLU activation function; i = J = X-2,
the output block Conv comprises a convolution layer with convolution kernel a × a and a ReLU activation function.
Step 2.2.1: multiple X-1 multi-scale features to be output by an encoderThe row subscripts of the corresponding decoders are respectively marked as 1,2, \8230;, J; then will beIn turn mark asWherein the content of the first and second substances,representing the multi-scale features of the jth output.
In the specific design, the multi-scale features are renumbered in the decoder description so that their indices correspond to the decoder rows.
Step 2.2.2: decoder for 1 st column j row (1,j) Is inputted asAnd up-sampling feature maps thereofDecoder for 1 st column j row (1,j) Is characterized by (1,j) ;
Two times of upsampling andmerging and inputting Decoder (1,3) Decoder 13 Obtain a characteristic diagram I (1,3) At the same time as this, the first and second,two times up sampling andmerging and inputting Decoder (1,2) Obtain a characteristic diagram I (1,2) ;Two times up sampling andmerging input Decoder (1,1) Get the characteristic diagram I (1,1) 。
Step 2.2.3: convolution block Decoder for jth row of other columns except for first column (i,j) Is input as a jth multi-scale feature mapJ +1 th row, I-1 th column decoder output characteristic I (i-1,j+1) Up-sampled feature map ofAnd from the ith row, the ith-1 column, the jth row, the ith-2 column to the jth rowSignature I of decoder output of 1 column (i-1,j) …I (1,j) Splicing the feature maps; decoder (i,j) Is a characteristic diagram I (i,j) . Thereby composed ofObtaining characteristic I after P convolution blocks of decoder (I,1) (ii) a Characteristic I (I,1) Processing the output result by an output block Conv to obtain an output result O p 。
In this embodiment, feature map I_(1,3) is upsampled two-fold and spliced with the 2nd multi-scale feature and I_(1,2) as input to Decoder_(2,2), giving feature map I_(2,2);
feature map I_(1,2) is upsampled two-fold and spliced with the 1st multi-scale feature and I_(1,1) as input to Decoder_(2,1), giving feature map I_(2,1);
feature map I_(2,2) is upsampled two-fold and spliced with the 1st multi-scale feature, I_(1,1) and I_(2,1) as input to Decoder_(3,1), giving feature map I_(3,1);
where the input and output channel numbers of Decoder_(1,1), Decoder_(1,2), Decoder_(1,3), Decoder_(2,1), Decoder_(2,2), Decoder_(3,1) are (96, 32), (192, 64), (384, 128), (128, 32), (256, 64), (160, 32) respectively.
Feature map I_(3,1) is input to the output block Conv to obtain the output result O_p.
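The input-channel counts listed above follow directly from the concatenation rule of steps 2.2.2 and 2.2.3 together with the multi-scale feature channel counts (32, 64, 128, 256). A small bookkeeping sketch of that check (the names `F`, `out_ch` and `in_channels` are illustrative, not from the patent):

```python
# channel counts of the four multi-scale features from the encoder embodiment
F = {1: 32, 2: 64, 3: 128, 4: 256}
# output channels of each Decoder_(i,j), taken from the (in, out) tuples above
out_ch = {(1, 1): 32, (1, 2): 64, (1, 3): 128,
          (2, 1): 32, (2, 2): 64, (3, 1): 32}

def in_channels(i, j):
    """Concatenated input channels of Decoder_(i,j).

    Column 1 (step 2.2.2): the j-th multi-scale feature plus the upsampled
    (j+1)-th multi-scale feature. Other columns (step 2.2.3): the j-th
    multi-scale feature, the upsampled output of Decoder_(i-1,j+1), and the
    outputs of Decoder_(1,j) .. Decoder_(i-1,j)."""
    if i == 1:
        return F[j] + F[j + 1]
    return F[j] + out_ch[(i - 1, j + 1)] + sum(out_ch[(k, j)] for k in range(1, i))

# the six input-channel numbers listed in the embodiment
listed = {(1, 1): 96, (1, 2): 192, (1, 3): 384,
          (2, 1): 128, (2, 2): 256, (3, 1): 160}
assert all(in_channels(i, j) == c for (i, j), c in listed.items())
```

Nearest-neighbour (or bilinear) upsampling leaves channel counts unchanged, which is why the upsampled terms contribute their source channel counts directly.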
Step 3: construct the total loss function L of the multi-scale self-encoder network using equation (1):
L = L_ssim + λ·L_pixel (1)
in equation (1), λ is the weight coefficient of the pixel loss, L_ssim is the structural similarity loss function given by equation (2), and L_pixel is the pixel loss function given by equation (3);
L_ssim = 1 - SSIM(I_p, O_p) (2)
in equation (2), SSIM denotes structural similarity.
Step 4: train the multi-scale self-encoder network on the training set with a back-propagation algorithm, computing the total loss function L to adjust the network parameters until the maximum number of iterations is reached, thereby obtaining the trained multi-scale self-encoder network.
Step 5: obtain B pairs of multi-exposure images, convert them into the YCbCr color space, and keep only the Y-channel image pairs, obtaining B preprocessed multi-exposure image pairs {(I_o1, I_u1), (I_o2, I_u2), …, (I_ob, I_ub), …, (I_oB, I_uB)}, where (I_ob, I_ub) denotes the b-th multi-exposure image pair, I_ob the overexposed image of the b-th Y channel, and I_ub the underexposed image of the b-th Y channel.
Step 6: input the multi-exposure image pairs {(I_o1, I_u1), …, (I_oB, I_uB)} into the trained multi-scale encoder to obtain overexposed image features at S scales {Io_f1, Io_f2, …, Io_fs, …, Io_fS} and underexposed image features {Iu_f1, Iu_f2, …, Iu_fs, …, Iu_fS}, where Io_fs denotes the s-th overexposed image feature and Iu_fs the s-th underexposed image feature.
Add the s-th overexposed image feature Io_fs and the s-th underexposed image feature Iu_fs and average the sum to obtain the s-th fused feature f_s, yielding the fused feature set {f_1, f_2, …, f_s, …, f_S}; input this set into the trained decoder to obtain the fusion results {Output_1, Output_2, …, Output_b, …, Output_B}, where Output_b is the fusion result of the overexposed image I_ob and the underexposed image I_ub of the b-th Y channel; the fusion process is shown in fig. 3.
Convert {Output_1, …, Output_B} from the YCbCr domain to the RGB domain, finally obtaining uniformly exposed color images {Result_1, Result_2, …, Result_b, …, Result_B}, where Result_b denotes the b-th color image result.
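The color-space handling of steps 5 and 6 can be sketched with the standard full-range BT.601 YCbCr transform. This choice of conversion matrices is an assumption, as the patent does not specify them; in use, the fused Output_b would stand in for the `y` channel before converting back, with Cb and Cr taken from a source image (the patent does not state its chroma handling).

```python
import numpy as np

# Full-range BT.601 RGB <-> YCbCr, values in [0, 1] (assumed matrices).
def rgb_to_ycbcr(rgb):
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y = 0.299 * r + 0.587 * g + 0.114 * b                    # luma (Y channel)
    cb = -0.168736 * r - 0.331264 * g + 0.5 * b + 0.5        # blue-difference
    cr = 0.5 * r - 0.418688 * g - 0.081312 * b + 0.5         # red-difference
    return np.stack([y, cb, cr], axis=-1)

def ycbcr_to_rgb(ycc):
    y, cb, cr = ycc[..., 0], ycc[..., 1] - 0.5, ycc[..., 2] - 0.5
    r = y + 1.402 * cr
    g = y - 0.344136 * cb - 0.714136 * cr
    b = y + 1.772 * cb
    return np.stack([r, g, b], axis=-1)
```

The two functions are inverses up to floating-point precision, so extracting Y for fusion and converting back to RGB does not itself distort color.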
In this embodiment, an electronic device comprises a memory and a processor; the memory stores a program supporting the processor in executing the multi-exposure image fusion method described above, and the processor is configured to execute the program stored in the memory.
In this embodiment, a computer-readable storage medium stores a computer program that, when executed by a processor, performs the steps of the multi-exposure image fusion method described above.
Claims (3)
1. A multi-exposure image fusion method based on a multi-scale self-encoder is characterized by comprising the following steps:
step 1: obtain P RGB natural images and convert them into grayscale images, denoted {I_1, I_2, …, I_p, …, I_P} and used as the training set, where I_p represents the p-th grayscale image;
step 2, constructing a multi-scale self-encoder network, comprising: a multi-scale encoder and decoder;
step 2.1: the multi-scale encoder comprises W convolution blocks A 1 ,A 2 ,…,A w ,…,A W X convolution blocks N 1 ,N 2 ,…,N x ,…,N X And Y vision converter Trans 1 ,Trans 2 ,…,Trans y ,…,Trans Y Wherein A is w Represents the w-th convolution block, and the w-th convolution block A w The method comprises the following steps of (1) including a convolution layer with convolution kernel of A multiplied by A and a ReLU activation function; n is a radical of x Represents the xth volume block; and the xth convolution block N x The method comprises the following steps of (1) including a convolution layer with convolution kernel of NxN and a ReLU activation function; trans y Represents the y-th vision converter; y = W-2;
step 2.1.1: the p-th gray image I p Input into the multi-scale encoder and sequentially pass through the 1 st convolution block A 1 And the 1 st convolution block N 1 After the treatment, obtaining a primary shallow layer characteristic diagram I N1 Then sequentially passes through Y vision converters Trans 1 ,Trans 2 ,…,Trans y ,…,Trans Y After the treatment, Y primary deep characteristic maps are correspondingly obtainedWherein, the first and the second end of the pipe are connected with each other,representing the y primary deep profile;
step 2.1.2: will be provided withInput the 2 nd convolution block A 2 Processing to obtain shallow layer characteristic diagram with channel number of CWill be provided withSequentially corresponding to input W-2 convolution blocks A 3 ,…,A w ,…,A W Processing to obtain deep layer characteristics with channel number CDrawing (A)Wherein the content of the first and second substances,represents the w-2 deep characteristic diagram; from shallow feature mapsAnd deep level feature mapForm W-1 comprehensive characteristic graphs
Step 2.1.3: w-1 th comprehensive characteristic diagramThrough the Xth convolution block N X After the treatment, obtaining an X-1 th multi-scale characteristic diagram
Step 2.1.4: for the W-1 comprehensive characteristic diagramObtaining the W-2 th up-sampling characteristic after up-samplingThe up-sampling featureAnd then with the W-2 comprehensive characteristic diagramCarry out element-by-element additionObtaining a W-2 intermediate characteristic diagram after the method operationThe W-2 intermediate feature mapThrough the X-1 th convolution block N X-1 Then obtaining the X-2 characteristics
Step 2.1.5: to pairObtaining W-3 th up-sampling characteristic after up-sampling Combined with the W-3 th featurePerforming element-by-element addition to obtain W-3 intermediate characteristic diagramThe W-3 intermediate feature mapThrough the X-2 th convolution block N X-2 Then obtaining the X-3 characteristic
Step 2.1.6: according to the process of step 2.1.5After processing, X-1 multi-scale characteristic graphs are obtained Representing an x-1 th multi-scale feature map;
Step 2.2: the decoder consists of P convolution blocks and an output block Conv; the P convolution blocks are densely connected in an upper-triangular pattern and are denoted in turn as Decoder_(i,j), where Decoder_(i,j) is the convolution block in the i-th column and j-th row and contains a convolution layer with an N×N kernel followed by a ReLU activation function; I = J = X-2;
the output block Conv contains a convolution layer with an A×A kernel followed by a ReLU activation function;
Step 2.2.1: the X-1 multi-scale features {M_1, …, M_{X-1}} output by the encoder are assigned to decoder rows whose subscripts are denoted 1, 2, …, J; accordingly the features are re-labelled in turn as M^(1), …, M^(j), …, M^(J), where M^(j) denotes the multi-scale feature of the j-th output;
step 2.2.2: decoder for 1 st column j row (1,j) Is inputted asAnd up-sampling feature maps thereofDecoder for 1 st column j row (1,j) Is characteristic I (1,j) ;
Step 2.2.3: for the j-th-row convolution block Decoder_(i,j) in every column other than the first, the input is the concatenation of the j-th multi-scale feature map M^(j), the up-sampled feature map of the output I_(i-1,j+1) of the (i-1)-th-column, (j+1)-th-row decoder, and the feature maps I_(i-1,j), …, I_(1,j) output by the decoders of the j-th row from column i-1 down to column 1; the output of Decoder_(i,j) is the feature map I_(i,j); after the P convolution blocks of the decoder, the feature I_(I,1) is obtained; I_(I,1) is processed by the output block Conv to obtain the output result O_p;
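The dense upper-triangular wiring of steps 2.2.2 and 2.2.3 is easier to see as explicit bookkeeping. The sketch below only enumerates, as strings, which inputs feed each block Decoder(i, j); the triangle shape (a block exists when i + j ≤ J + 1) and the source of the first column's up-sampled input are assumptions inferred from step 2.2.3, not statements of the patent.

```python
def decoder_inputs(I_cols, J_rows):
    """Enumerate, as strings, the inputs of each Decoder(i, j) in the
    assumed upper-triangular grid (block exists when i + j <= J_rows + 1)."""
    wiring = {}
    for j in range(1, J_rows + 1):
        for i in range(1, I_cols + 1):
            if i + j > J_rows + 1:
                continue  # outside the upper triangle
            inputs = [f"M({j})"]  # the j-th multi-scale feature (step 2.2.1)
            if i == 1:
                if j < J_rows:
                    inputs.append(f"up(M({j + 1}))")  # assumed source, step 2.2.2
            else:
                # up-sampled output of the block one column left, one row down
                inputs.append(f"up(I({i - 1},{j + 1}))")
                # dense skips: all earlier outputs in the same row (step 2.2.3)
                inputs += [f"I({k},{j})" for k in range(i - 1, 0, -1)]
            wiring[(i, j)] = inputs
    return wiring

w = decoder_inputs(I_cols=3, J_rows=3)
print(w[(3, 1)])  # ['M(1)', 'up(I(2,2))', 'I(2,1)', 'I(1,1)']
print(len(w))     # 6 blocks in the triangle, i.e. P = 6
```

With I = J = 3 there are P = 6 blocks; for example Decoder(3, 1) concatenates M(1), the up-sampled output of Decoder(2, 2), and the earlier row-1 outputs I(2,1) and I(1,1), matching step 2.2.3.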
Step 3: construct the overall loss function L of the multi-scale self-encoder network using equation (1):
L = L_ssim + λ·L_pixel (1)
In equation (1), λ denotes the weight coefficient of the pixel loss, L_ssim denotes the structural similarity loss function, obtained from equation (2), and L_pixel denotes the pixel loss function, obtained from equation (3);
L_ssim = 1 - SSIM(I_p, O_p) (2)
In equation (2), SSIM denotes the structural similarity index;
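A minimal numpy sketch of the overall loss of equations (1) and (2). The SSIM here is a simplified single-window (global) version rather than the usual locally windowed SSIM, and since equation (3) is not reproduced above, a mean-squared pixel error is assumed as a stand-in for L_pixel.

```python
import numpy as np

def ssim_global(x, y, c1=0.01**2, c2=0.03**2):
    # Simplified SSIM: one window over the whole image (real SSIM is local).
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))

def total_loss(I_p, O_p, lam=0.1):
    l_ssim = 1.0 - ssim_global(I_p, O_p)   # eq. (2)
    l_pixel = np.mean((I_p - O_p) ** 2)    # assumed L2 form; eq. (3) not shown
    return l_ssim + lam * l_pixel          # eq. (1)

img = np.random.rand(32, 32)
print(total_loss(img, img))        # identical images -> loss ~ 0
print(total_loss(img, 1.0 - img))  # dissimilar images -> loss > 0
```

The structural term rewards outputs that preserve the luminance, contrast, and structure of the input, while the pixel term keeps intensities close; λ trades the two off.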
Step 4: based on the training set, train the multi-scale self-encoder network with the back-propagation algorithm, computing the overall loss function L to adjust the network parameters until the maximum number of iterations is reached, thereby obtaining the trained multi-scale self-encoder network;
Step 5: acquire B pairs of multi-exposure images, convert them to the YCbCr color space, and keep only the Y-channel image pairs, thereby obtaining the B preprocessed multi-exposure image pairs {(I_o1, I_u1), (I_o2, I_u2), …, (I_ob, I_ub), …, (I_oB, I_uB)}, where (I_ob, I_ub) denotes the b-th multi-exposure image pair, I_ob denotes the overexposed image of the b-th Y channel, and I_ub denotes the underexposed image of the b-th Y channel;
Step 6: input the multi-exposure image pairs {(I_o1, I_u1), (I_o2, I_u2), …, (I_ob, I_ub), …, (I_oB, I_uB)} into the trained multi-scale encoder for processing, obtaining overexposed-image features at S scales {Io_f1, Io_f2, …, Io_fs, …, Io_fS} and underexposed-image features {Iu_f1, Iu_f2, …, Iu_fs, …, Iu_fS}, where Io_fs denotes the s-th overexposed-image feature and Iu_fs denotes the s-th underexposed-image feature;
add the s-th overexposed-image feature Io_fs and the s-th underexposed-image feature Iu_fs and take the average to obtain the s-th fused feature f_s, yielding the fused feature set {f_1, f_2, …, f_s, …, f_S}; this set is input into the trained decoder, thereby obtaining the fusion results {Output_1, Output_2, …, Output_b, …, Output_B}, where Output_b denotes the fusion result of the overexposed image I_ob and the underexposed image I_ub of the b-th Y channel;
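The fusion rule of step 6 is a per-scale element-wise average of the overexposed and underexposed feature sets; sketched with numpy (channel count and resolutions are illustrative):

```python
import numpy as np

def fuse_features(over_feats, under_feats):
    # f_s = (Io_fs + Iu_fs) / 2 at every scale s (step 6).
    return [(o + u) / 2.0 for o, u in zip(over_feats, under_feats)]

# Two scales, C = 4 channels; values chosen so the averages are easy to check.
over  = [np.full((4, 8, 8), 1.0), np.full((4, 4, 4), 3.0)]
under = [np.full((4, 8, 8), 0.0), np.full((4, 4, 4), 1.0)]
fused = fuse_features(over, under)
print(fused[0][0, 0, 0], fused[1][0, 0, 0])  # 0.5 2.0
```

Averaging weights both exposures equally at every scale, so complementary detail from each branch survives into the decoder input.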
convert {Output_1, Output_2, …, Output_b, …, Output_B} from the YCbCr space to the RGB space, finally obtaining the uniformly exposed color images {Result_1, Result_2, …, Result_b, …, Result_B}, where Result_b denotes the b-th color image result.
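Steps 5 and 6 convert between RGB and YCbCr. A sketch using the common BT.601 full-range (JPEG-style) conversion matrices; the patent does not state which YCbCr variant it uses, so the constants below are an assumption.

```python
import numpy as np

# BT.601 full-range conversion (assumed variant), channels in [0, 1].
def rgb_to_ycbcr(rgb):
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y  =  0.299 * r + 0.587 * g + 0.114 * b
    cb = -0.168736 * r - 0.331264 * g + 0.5 * b + 0.5
    cr =  0.5 * r - 0.418688 * g - 0.081312 * b + 0.5
    return np.stack([y, cb, cr], axis=-1)

def ycbcr_to_rgb(ycc):
    y, cb, cr = ycc[..., 0], ycc[..., 1] - 0.5, ycc[..., 2] - 0.5
    r = y + 1.402 * cr
    g = y - 0.344136 * cb - 0.714136 * cr
    b = y + 1.772 * cb
    return np.stack([r, g, b], axis=-1)

img = np.random.rand(8, 8, 3)
ycc = rgb_to_ycbcr(img)
# Step 5 keeps only the Y channel for fusion: ycc[..., 0]
back = ycbcr_to_rgb(ycc)
print(np.abs(back - img).max() < 1e-4)  # True: round trip is numerically lossless
```

Fusion operates on the Y channel only; the final conversion back to RGB recombines the fused luminance with chroma to produce the color result.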
2. An electronic device, comprising a memory and a processor, wherein the memory stores a program that enables the processor to perform the multi-exposure image fusion method of claim 1, and the processor is configured to execute the program stored in the memory.
3. A computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, carries out the steps of the multi-exposure image fusion method of claim 1.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211424921.1A CN115689962A (en) | 2022-11-14 | 2022-11-14 | Multi-exposure image fusion method based on multi-scale self-encoder |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115689962A true CN115689962A (en) | 2023-02-03 |
Family
ID=85051690
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211424921.1A Pending CN115689962A (en) | 2022-11-14 | 2022-11-14 | Multi-exposure image fusion method based on multi-scale self-encoder |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115689962A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117173525A (en) * | 2023-09-05 | 2023-12-05 | 北京交通大学 | Universal multi-mode image fusion method and device |
CN117115926A (en) * | 2023-10-25 | 2023-11-24 | 天津大树智能科技有限公司 | Human body action standard judging method and device based on real-time image processing |
CN117115926B (en) * | 2023-10-25 | 2024-02-06 | 天津大树智能科技有限公司 | Human body action standard judging method and device based on real-time image processing |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Anwar et al. | Diving deeper into underwater image enhancement: A survey | |
Yang et al. | Underwater image enhancement based on conditional generative adversarial network | |
Wang et al. | An experiment-based review of low-light image enhancement methods | |
Rana et al. | Deep tone mapping operator for high dynamic range images | |
Liu et al. | HoLoCo: Holistic and local contrastive learning network for multi-exposure image fusion | |
CN111242883B (en) | Dynamic scene HDR reconstruction method based on deep learning | |
Zamir et al. | Learning digital camera pipeline for extreme low-light imaging | |
Zhu et al. | Stacked U-shape networks with channel-wise attention for image super-resolution | |
Yan et al. | High dynamic range imaging via gradient-aware context aggregation network | |
US20220076459A1 (en) | Image optimization method, apparatus, device and storage medium | |
Li et al. | Hdrnet: Single-image-based hdr reconstruction using channel attention cnn | |
Lv et al. | BacklitNet: A dataset and network for backlit image enhancement | |
Lv et al. | Low-light image enhancement via deep Retinex decomposition and bilateral learning | |
CN115689962A (en) | Multi-exposure image fusion method based on multi-scale self-encoder | |
Chen et al. | End-to-end single image enhancement based on a dual network cascade model | |
Yang et al. | Low‐light image enhancement based on Retinex decomposition and adaptive gamma correction | |
Zhang et al. | Multi-branch and progressive network for low-light image enhancement | |
Li et al. | Low-light hyperspectral image enhancement | |
Wang et al. | Low-light image enhancement by deep learning network for improved illumination map | |
Chen et al. | Improving dynamic hdr imaging with fusion transformer | |
Li et al. | AMBCR: Low‐light image enhancement via attention guided multi‐branch construction and Retinex theory | |
Liu et al. | Non-homogeneous haze data synthesis based real-world image dehazing with enhancement-and-restoration fused CNNs | |
CN104123707B (en) | Local rank priori based single-image super-resolution reconstruction method | |
Cao et al. | A deep thermal-guided approach for effective low-light visible image enhancement | |
Zhang et al. | Invertible network for unpaired low-light image enhancement |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||