CN112184550B - Neural network training method, image fusion method, device, equipment and medium - Google Patents

Neural network training method, image fusion method, device, equipment and medium

Info

Publication number
CN112184550B
CN112184550B CN202010986245.1A
Authority
CN
China
Prior art keywords
network
sub
low
resolution image
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010986245.1A
Other languages
Chinese (zh)
Other versions
CN112184550A (en)
Inventor
邓欣
张雨童
徐迈
段一平
关振宇
李大伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Beihang University
Original Assignee
Tsinghua University
Beihang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University, Beihang University filed Critical Tsinghua University
Priority to CN202010986245.1A priority Critical patent/CN112184550B/en
Publication of CN112184550A publication Critical patent/CN112184550A/en
Application granted granted Critical
Publication of CN112184550B publication Critical patent/CN112184550B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformations in the plane of the image
    • G06T3/40Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4053Scaling of whole images or parts thereof, e.g. expanding or contracting based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformations in the plane of the image
    • G06T3/40Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4046Scaling of whole images or parts thereof, e.g. expanding or contracting using neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Processing (AREA)

Abstract

The disclosure relates to the technical field of image processing, and discloses a neural network training method, an image fusion method, an apparatus, a device, and a medium. The method comprises: designing a neural network comprising a first sub-network and a second sub-network with the same network structure, each sub-network including a primary feature extraction module, a high-level feature extraction module and a coupling feedback module. The primary feature extraction module is used for extracting low-level features of an under-exposed low-resolution image and an over-exposed low-resolution image; the high-level feature extraction module is used for further extracting high-level features of the under-exposed and over-exposed low-resolution images from the corresponding low-level features; and the coupling feedback module cross-fuses the low-level features and the high-level features corresponding to the under-exposed and over-exposed low-resolution images. With this technical scheme, multi-exposure fusion processing and super-resolution processing of images are performed simultaneously by a single neural network, which improves both the image processing speed and the processing accuracy.

Description

Neural network training method, image fusion method, device, equipment and medium
Technical Field
The present disclosure relates to the field of image processing technologies, and in particular, to a neural network training method, an image fusion method, an apparatus, a device, and a medium.
Background
With the development of technology, people are increasingly accustomed to recording moments of their lives in photos. However, due to the hardware limitations of camera sensors, a captured image usually contains various distortions that make it very different from the real natural scene. Compared with real scenes, images taken with a camera tend to have a low dynamic range (LDR) and a low resolution (LR). In order to reduce the difference between the captured image and the real scene, the image needs to be processed.
At present, the problem of the low dynamic range of images is mainly addressed by multi-exposure image fusion (MEF), and the problem of low image resolution is addressed by image super-resolution (ISR). The multi-exposure image fusion technique aims to fuse LDR images with different exposure levels so as to generate an image with a high dynamic range (HDR). The image super-resolution technique aims to reconstruct an LR image into a high-resolution (HR) image.
However, in practice a single captured image often exhibits both the LDR and LR characteristics, while the multi-exposure image fusion technique and the image super-resolution technique are two independent image processing techniques, which means that a captured image has to undergo multi-exposure image fusion processing and image super-resolution processing one after the other. Moreover, the order in which the two techniques are executed may affect the final image processing result. Therefore, the existing image processing method is not only complicated in its processing flow, but also yields an unsatisfactory image processing effect.
Disclosure of Invention
To solve the above technical problems, or to at least partially solve the above technical problems, the present disclosure provides a neural network training method, an image fusion method, an apparatus, a device, and a medium.
In a first aspect, the present disclosure provides a neural network training method for multi-exposure image fusion, where the neural network includes a first sub-network and a second sub-network with the same network structure, and any sub-network includes a primary feature extraction module, a high-level feature extraction module, and a coupling feedback module; the method comprises the following steps:
acquiring an under-exposed low-resolution image and an over-exposed low-resolution image;
inputting the underexposed low-resolution image and the overexposed low-resolution image into the primary feature extraction modules in the first sub-network and the second sub-network respectively to generate an underexposed low-level feature and an overexposed low-level feature;
inputting the underexposed low-level features and the overexposed low-level features into the high-level feature extraction modules in the first sub-network and the second sub-network respectively to generate underexposed high-level features and overexposed high-level features;
inputting the underexposed low-level features, the underexposed high-level features and the overexposed high-level features into the coupling feedback module in the first sub-network to generate a coupling feedback result corresponding to the first sub-network;
inputting the over-exposed low-level features, the over-exposed high-level features and the under-exposed high-level features into the coupling feedback module in the second sub-network to generate a coupling feedback result corresponding to the second sub-network;
and adjusting parameters of the neural network based on the underexposed low-resolution image, the underexposed high-level features and the coupling feedback results corresponding to the first sub-network, and the overexposed low-resolution image, the overexposed high-level features and the coupling feedback results corresponding to the second sub-network.
In some embodiments, the neural network includes a plurality of the coupled feedback modules, and each of the coupled feedback modules does not share model parameters.
In some embodiments, each of the coupled feedback modules processes serially;
the inputting the underexposed low-level features, the underexposed high-level features, and the overexposed high-level features into the coupling feedback module in the first sub-network, and the generating of the coupling feedback result corresponding to the first sub-network includes:
inputting the underexposed low-level features, the underexposed high-level features and the overexposed high-level features into a first coupling feedback module in the first sub-network to generate a coupling feedback result corresponding to the first sub-network;
for any subsequent coupled feedback module in the first sub-network except the first coupled feedback module, inputting the underexposed low-level feature, the coupled feedback result of a previous adjacent coupled feedback module of the subsequent coupled feedback module, and the coupled feedback result of the coupled feedback module corresponding to the previous adjacent coupled feedback module in the second sub-network into the subsequent coupled feedback module, and generating the coupled feedback result corresponding to the first sub-network;
the inputting the overexposed low-level features, the overexposed high-level features, and the underexposed high-level features into the coupling feedback module in the second sub-network, and the generating of the coupling feedback result corresponding to the second sub-network includes:
inputting the over-exposed low-level features, the over-exposed high-level features and the under-exposed high-level features into a first coupling feedback module in the second sub-network to generate a coupling feedback result corresponding to the second sub-network;
and for any subsequent coupled feedback module except the first coupled feedback module in the second sub-network, inputting the over-exposed low-level feature, the coupled feedback result of a previous adjacent coupled feedback module of the subsequent coupled feedback module, and the coupled feedback result of the coupled feedback module corresponding to the previous adjacent coupled feedback module in the first sub-network into the subsequent coupled feedback module, and generating the coupled feedback result corresponding to the second sub-network.
In some embodiments, the coupling feedback module comprises at least two concatenation sub-modules and at least two feature map groups, wherein each feature map group comprises a filter, a deconvolution layer, and a convolution layer;
the first concatenation sub-module precedes all of the feature map groups;
each concatenation sub-module other than the first is located between two adjacent feature map groups, and no two of these sub-modules are located at the same position.
In some embodiments, said adjusting parameters of said neural network based on said underexposed low resolution image, said underexposed high level features and said coupled feedback results corresponding to said first sub-network, and said overexposed low resolution image, said overexposed high level features and said coupled feedback results corresponding to said second sub-network comprises:
respectively carrying out up-sampling operation on the under-exposed low-resolution image and the over-exposed low-resolution image;
adding the image corresponding to the under-exposed high-level features and the image corresponding to the coupling feedback result of the first sub-network to the up-sampled under-exposed low-resolution image, respectively, to generate an under-exposed high-resolution image and a fusion-exposed high-resolution image corresponding to the first sub-network;
adding the image corresponding to the overexposure high-level feature and the image corresponding to the coupling feedback result corresponding to the second sub-network to the upsampled overexposure low-resolution image respectively to generate an overexposure high-resolution image and a fusion exposure high-resolution image corresponding to the second sub-network;
and adjusting parameters of the neural network based on the underexposed high-resolution image, the fusion exposed high-resolution image corresponding to the first sub-network, the overexposed high-resolution image and the fusion exposed high-resolution image corresponding to the second sub-network.
In some embodiments, the parameters of the neural network are adjusted by a loss function as shown in the following equation:

L_total = λ_o · L_o^SRB + λ_u · L_u^SRB + λ_F · Σ_{t=1}^{T} ( L_o^(CFB,t) + L_u^(CFB,t) )

wherein L_total represents the value of the total loss function; λ_o, λ_u and λ_F respectively represent the weights corresponding to each partial loss function value; L_u^SRB and L_u^(CFB,t) respectively represent the loss function values corresponding to the high-level feature extraction module and the coupling feedback module in the first sub-network; L_o^SRB and L_o^(CFB,t) respectively represent the loss function values corresponding to the high-level feature extraction module and the coupling feedback module in the second sub-network; L_MS represents a loss value between two images determined based on the structural similarity index of the images, so that L_o^SRB = L_MS(Î_o^HR, I_o^gt), L_u^SRB = L_MS(Î_u^HR, I_u^gt), L_o^(CFB,t) = L_MS(Î_o^(F,t), I^gt) and L_u^(CFB,t) = L_MS(Î_u^(F,t), I^gt); Î_o^HR and I_o^gt respectively represent the over-exposed high-resolution image and the over-exposed high-resolution reference image; Î_u^HR and I_u^gt respectively represent the under-exposed high-resolution image and the under-exposed high-resolution reference image; Î_o^(F,t), Î_u^(F,t) and I^gt respectively represent the fusion-exposed high-resolution image corresponding to the t-th coupling feedback module of the second sub-network, the fusion-exposed high-resolution image corresponding to the t-th coupling feedback module of the first sub-network, and the fusion-exposed high-resolution reference image; and T represents the number of the coupling feedback modules.
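For illustration only, the following sketch shows how such a hierarchical, MS-SSIM-based loss could be computed in PyTorch, assuming the third-party pytorch_msssim package; the function and argument names (total_loss, lambda_f, etc.) are illustrative and not taken from the patent.

```python
from pytorch_msssim import ms_ssim  # assumed third-party MS-SSIM implementation


def ms_loss(pred, target):
    # L_MS: structural-similarity-based loss between two images (here 1 - MS-SSIM).
    return 1.0 - ms_ssim(pred, target, data_range=1.0)


def total_loss(sr_u, gt_u, sr_o, gt_o, fused_u_list, fused_o_list, gt_fused,
               lambda_u=1.0, lambda_o=1.0, lambda_f=1.0):
    # SRB losses for the under-exposed and over-exposed branches.
    loss = lambda_u * ms_loss(sr_u, gt_u) + lambda_o * ms_loss(sr_o, gt_o)
    # CFB losses: one fused-exposure HR prediction per coupled feedback module (t = 1..T).
    for fused_u, fused_o in zip(fused_u_list, fused_o_list):
        loss = loss + lambda_f * (ms_loss(fused_u, gt_fused) + ms_loss(fused_o, gt_fused))
    return loss
```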
In a second aspect, the present disclosure provides an image fusion method, including:
acquiring an under-exposed low-resolution image and an over-exposed low-resolution image;
inputting the underexposed low-resolution image and the overexposed low-resolution image into a pre-trained neural network to generate a first fusion exposure high-resolution image and a second fusion exposure high-resolution image; the neural network is obtained by training through a neural network training method for multi-exposure image fusion in any embodiment of the disclosure;
and generating an image fusion result based on the first fusion exposure high-resolution image and the second fusion exposure high-resolution image.
In some embodiments, the generating an image fusion result based on the first and second fused-exposure high-resolution images comprises:
and respectively utilizing a first weight and a second weight to carry out weighted summation processing on the first fusion exposure high-resolution image and the second fusion exposure high-resolution image so as to generate an image fusion result.
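As a minimal illustration of this weighted summation (the weight values are placeholders, not prescribed by the patent):

```python
def fuse_outputs(fused_hr_1, fused_hr_2, w1=0.5, w2=0.5):
    # Weighted summation of the two fusion-exposed high-resolution images.
    return w1 * fused_hr_1 + w2 * fused_hr_2
```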
In a third aspect, the present disclosure provides a neural network training device for multi-exposure image fusion, where the neural network includes a first sub-network and a second sub-network with the same network structure, and any sub-network includes a primary feature extraction module, a high-level feature extraction module, and a coupling feedback module; the device includes:
the image acquisition unit is used for acquiring an under-exposed low-resolution image and an over-exposed low-resolution image;
a low-level feature generation unit, configured to input the under-exposed low-resolution image and the over-exposed low-resolution image into the primary feature extraction modules in the first sub-network and the second sub-network, respectively, and generate an under-exposed low-level feature and an over-exposed low-level feature;
a high-level feature generation unit configured to input the underexposed low-level features and the overexposed low-level features into the high-level feature extraction modules in the first sub-network and the second sub-network, respectively, and generate the underexposed high-level features and the overexposed high-level features;
a first coupling feedback result generating unit, configured to input the underexposed low-level features, the underexposed high-level features, and the overexposed high-level features into the coupling feedback module in the first sub-network, and generate a coupling feedback result corresponding to the first sub-network;
a second coupling feedback result generating unit, configured to input the overexposed low-level feature, the overexposed high-level feature, and the underexposed high-level feature into the coupling feedback module in the second sub-network, and generate a coupling feedback result corresponding to the second sub-network;
and the parameter adjusting unit is used for adjusting the parameters of the neural network based on the underexposed low-resolution image, the underexposed high-level feature and the coupling feedback result corresponding to the first sub-network, and the overexposed low-resolution image, the overexposed high-level feature and the coupling feedback result corresponding to the second sub-network.
In some embodiments, the neural network includes a plurality of coupled feedback modules, and the coupled feedback modules do not share model parameters.
In some embodiments, each coupled feedback module processes serially;
correspondingly, the first coupling feedback result generating unit is specifically configured to:
inputting the underexposure low-level features, the underexposure high-level features and the overexposure high-level features into a first coupling feedback module in a first sub-network to generate a coupling feedback result corresponding to the first sub-network;
for any subsequent coupled feedback module except the first coupled feedback module in the first sub-network, inputting the underexposed low-level feature, the coupled feedback result of a previous adjacent coupled feedback module of the subsequent coupled feedback module, and the coupled feedback result of the coupled feedback module corresponding to the previous adjacent coupled feedback module in the second sub-network into the subsequent coupled feedback module to generate the coupled feedback result corresponding to the first sub-network;
correspondingly, the second coupling feedback result generating unit is specifically configured to:
inputting the overexposure low-level features, the overexposure high-level features and the underexposure high-level features into a first coupling feedback module in a second sub-network to generate a coupling feedback result corresponding to the second sub-network;
and inputting the over-exposure low-level feature, the coupling feedback result of a previous adjacent coupling feedback module of the subsequent coupling feedback module and the coupling feedback result of the coupling feedback module corresponding to the previous adjacent coupling feedback module in the first sub-network into the subsequent coupling feedback module aiming at any subsequent coupling feedback module except the first coupling feedback module in the second sub-network, and generating the coupling feedback result corresponding to the second sub-network.
In some embodiments, the coupling feedback module comprises at least two concatenation sub-modules and at least two feature map groups, wherein each feature map group comprises a filter, a deconvolution layer and a convolution layer;
the first concatenation sub-module precedes all of the feature map groups;
each concatenation sub-module other than the first is located between two adjacent feature map groups, and no two of these sub-modules are located at the same position.
In some embodiments, the parameter adjusting unit is specifically configured to:
respectively carrying out up-sampling operation on the underexposed low-resolution image and the overexposed low-resolution image;
adding the image corresponding to the under-exposed high-level features and the image corresponding to the coupling feedback result of the first sub-network to the up-sampled under-exposed low-resolution image, respectively, to generate an under-exposed high-resolution image and a fusion-exposed high-resolution image corresponding to the first sub-network;
adding the image corresponding to the over-exposed high-level features and the image corresponding to the coupling feedback result corresponding to the second sub-network to the up-sampled over-exposed low-resolution image respectively to generate an over-exposed high-resolution image and a fusion exposure high-resolution image corresponding to the second sub-network;
and adjusting parameters of the neural network based on the underexposed high-resolution image, the fusion exposed high-resolution image corresponding to the first sub-network, the overexposed high-resolution image and the fusion exposed high-resolution image corresponding to the second sub-network.
Further, the parameter adjusting unit is specifically configured to:
parameters of the neural network are adjusted by a loss function as shown in the following equation:
L_total = λ_o · L_o^SRB + λ_u · L_u^SRB + λ_F · Σ_{t=1}^{T} ( L_o^(CFB,t) + L_u^(CFB,t) )

wherein L_total represents the value of the total loss function; λ_o, λ_u and λ_F respectively represent the weights corresponding to each partial loss function value; L_u^SRB and L_u^(CFB,t) respectively represent the loss function values corresponding to the high-level feature extraction module and the coupling feedback module in the first sub-network; L_o^SRB and L_o^(CFB,t) respectively represent the loss function values corresponding to the high-level feature extraction module and the coupling feedback module in the second sub-network; L_MS represents a loss value between two images determined based on the structural similarity index of the images; Î_o^HR and I_o^gt respectively represent the over-exposed high-resolution image and the over-exposed high-resolution reference image; Î_u^HR and I_u^gt respectively represent the under-exposed high-resolution image and the under-exposed high-resolution reference image; Î_o^(F,t), Î_u^(F,t) and I^gt respectively represent the fusion-exposed high-resolution image corresponding to the t-th coupling feedback module of the second sub-network, the fusion-exposed high-resolution image corresponding to the t-th coupling feedback module of the first sub-network, and the fusion-exposed high-resolution reference image; and T represents the number of the coupling feedback modules.
In a fourth aspect, the present disclosure provides an image fusion apparatus, comprising:
an image acquisition unit for acquiring an under-exposed low-resolution image and an over-exposed low-resolution image;
the fusion exposure high-resolution image generating unit is used for inputting the underexposure low-resolution image and the overexposure low-resolution image into a pre-trained neural network to generate a first fusion exposure high-resolution image and a second fusion exposure high-resolution image; wherein the neural network is trained by any embodiment of the neural network training method for multi-exposure image fusion in the disclosure;
an image fusion result generating unit configured to generate an image fusion result based on the first fusion-exposed high-resolution image and the second fusion-exposed high-resolution image.
In some embodiments, the image fusion result generating unit is specifically configured to:
and respectively utilizing the first weight and the second weight to carry out weighted summation processing on the first fusion exposure high-resolution image and the second fusion exposure high-resolution image so as to generate an image fusion result.
In a fifth aspect, the present disclosure provides an electronic device, including:
one or more processors;
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the neural network training method for multi-exposure image fusion or the image fusion method according to any of the above embodiments.
In a sixth aspect, the present disclosure provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements any of the above-described neural network training methods for multi-exposure image fusion or image fusion methods.
According to the technical scheme provided by the embodiments of the present disclosure, a neural network is designed that comprises a first sub-network and a second sub-network with the same network structure, each sub-network including a primary feature extraction module, a high-level feature extraction module and a coupling feedback module. The primary feature extraction module extracts low-level features of an under-exposed low-resolution image and an over-exposed low-resolution image, and the high-level feature extraction module further extracts high-level features of the two images from the corresponding low-level features, preliminarily mapping the low-resolution images into high-resolution features. The coupling feedback module then cross-fuses the low-level features and the high-level features corresponding to the under-exposed and over-exposed low-resolution images, thereby realizing multi-exposure fusion of the over-exposed and under-exposed images and further improving the image resolution, so that an image with both high resolution and a high dynamic range is obtained. In this way, multi-exposure fusion processing and super-resolution processing are performed on the images simultaneously, which simplifies the processing flow of captured images, increases the image processing speed, and further improves the image processing accuracy by exploiting the complementary characteristics of multi-exposure fusion and super-resolution.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the disclosure.
In order to more clearly illustrate the embodiments of the present disclosure or the technical solutions in the prior art, the drawings required in the description of the embodiments or the prior art will be briefly introduced below; it is obvious that other drawings can be obtained by those skilled in the art from these drawings without inventive effort.
Fig. 1 is a network architecture diagram of a neural network provided by an embodiment of the present disclosure;
FIG. 2 is a network architecture diagram of a high-level feature extraction module in a neural network provided by an embodiment of the present disclosure;
FIG. 3 is a network architecture diagram of a coupled feedback module in a neural network provided by an embodiment of the present disclosure;
FIG. 4 is a network architecture diagram of a neural network for neural network training provided by an embodiment of the present disclosure;
FIG. 5 is a flowchart of a neural network training method for multi-exposure image fusion provided by an embodiment of the present disclosure;
FIG. 6 is a flowchart of an image fusion method provided by an embodiment of the present disclosure;
FIG. 7 is a schematic structural diagram of a neural network training apparatus for multi-exposure image fusion according to an embodiment of the present disclosure;
fig. 8 is a schematic structural diagram of an image fusion apparatus provided in an embodiment of the present disclosure;
fig. 9 is a schematic structural diagram of an electronic device provided in an embodiment of the present disclosure.
Detailed Description
In order that the above objects, features and advantages of the present disclosure may be more clearly understood, aspects of the present disclosure will be described in further detail below. It should be noted that, in the case of no conflict, the embodiments and features in the embodiments of the present disclosure may be combined with each other.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present disclosure, but the present disclosure may be practiced in other ways than those described herein; it is to be understood that the embodiments disclosed in the specification are only a few embodiments of the present disclosure, and not all embodiments.
The neural network training scheme provided by the embodiments of the present disclosure can be applied to application scenarios in which images having both the low-dynamic-range and low-resolution characteristics are fused, and is particularly suitable for scenarios in which image fusion is performed on an over-exposed low-resolution image and an under-exposed low-resolution image.
Fig. 1 is a block diagram of the network structure of a neural network for image fusion according to an embodiment of the present disclosure. As shown in Fig. 1, the neural network includes a first sub-network 110 and a second sub-network 120 that have the same network structure but do not share model parameters. The first sub-network 110 includes a primary Feature Extraction Block (FEB) 111, a Super-Resolution Block (SRB, i.e., the high-level feature extraction module) 112, and a Coupled Feedback Block (CFB) 113. The second sub-network 120 includes a primary feature extraction module 121, a high-level feature extraction module 122 and a coupling feedback module 123. The number of coupled feedback modules 113 in the first sub-network 110 equals the number of coupled feedback modules 123 in the second sub-network 120, and is equal to or greater than 1. The input data of the neural network are an over-exposed low-resolution image and an under-exposed low-resolution image; each of the two input images only needs to be input into one of the sub-networks, and the specific correspondence is not limited. In the embodiment of the present disclosure, the under-exposed low-resolution image is input into the first sub-network 110, and the over-exposed low-resolution image is input into the second sub-network 120. The FEB and the SRB are used to extract high-level features from the input image, which helps to enhance the image resolution; the CFB is located behind the SRB and absorbs the features learned by the SRBs of both sub-networks, so as to fuse an image with both high resolution (HR) and a high dynamic range (HDR).
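For illustration only, the following PyTorch sketch shows one possible way to lay out this two-sub-network structure (FEB, then SRB, then T serially coupled CFBs); the FEB, SRB and CFB classes are sketched later in this description, and all class and variable names are illustrative, not taken from the patent.

```python
import torch.nn as nn


class CoupledFusionNet(nn.Module):
    """Two sub-networks with identical structure but unshared parameters."""

    def __init__(self, num_cfb=3, channels=64):
        super().__init__()
        self.feb_u, self.feb_o = FEB(channels), FEB(channels)   # primary feature extraction
        self.srb_u, self.srb_o = SRB(channels), SRB(channels)   # high-level feature extraction
        self.cfb_u = nn.ModuleList(CFB(channels) for _ in range(num_cfb))
        self.cfb_o = nn.ModuleList(CFB(channels) for _ in range(num_cfb))

    def forward(self, img_u_lr, img_o_lr):
        f_u, f_o = self.feb_u(img_u_lr), self.feb_o(img_o_lr)   # low-level features
        g_u, g_o = self.srb_u(f_u), self.srb_o(f_o)             # high-level features
        prev_u, prev_o = g_u, g_o
        results_u, results_o = [], []
        for cfb_u, cfb_o in zip(self.cfb_u, self.cfb_o):        # serial coupled feedback
            out_u = cfb_u(f_u, prev_u, prev_o)
            out_o = cfb_o(f_o, prev_o, prev_u)
            results_u.append(out_u)
            results_o.append(out_o)
            prev_u, prev_o = out_u, out_o
        return results_u, results_o
```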
The primary feature extraction module 111 and the primary feature extraction module 121 are respectively used for extracting low-level features from the input under-exposed low-resolution image I_u^LR and the input over-exposed low-resolution image I_o^LR, so as to obtain the corresponding under-exposed low-level feature F_u and over-exposed low-level feature F_o. The primary feature extraction process can be expressed by the formulas:

F_u = f_FEB(I_u^LR) and F_o = f_FEB(I_o^LR),

where f_FEB(·) represents the operation of the primary feature extraction module. In some embodiments, f_FEB(·) comprises convolutional layers with a series of 3 × 3 and 1 × 1 convolution kernels.
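A minimal sketch of such a primary feature extraction block in PyTorch is given below; the channel counts and activation choice are assumptions for illustration, not values specified by the patent.

```python
import torch.nn as nn


class FEB(nn.Module):
    def __init__(self, channels=64, in_channels=3):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_channels, 4 * channels, kernel_size=3, padding=1),  # 3x3 feature extraction
            nn.PReLU(),
            nn.Conv2d(4 * channels, channels, kernel_size=1),                # 1x1 channel compression
            nn.PReLU(),
        )

    def forward(self, img_lr):
        return self.body(img_lr)  # low-level feature F
```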
The high-level feature extraction module 112 and the high-level feature extraction module 122 are respectively used for performing further feature extraction on the input under-exposed low-level feature F_u and over-exposed low-level feature F_o, so as to extract high-level features of the under-exposed low-resolution image I_u^LR and the over-exposed low-resolution image I_o^LR and obtain the under-exposed high-level feature G_u and the over-exposed high-level feature G_o. Because the high-level features contain higher-level semantic information that better represents small and complex targets in the image and thus enriches the detail information of the image, the under-exposed high-level feature G_u and the over-exposed high-level feature G_o can improve the resolution of the corresponding images and realize the super-resolution effect. In some embodiments, referring to Fig. 2, the feedback block of the SRFBN network is used as the main structure of the high-level feature extraction module 112 (122); it comprises a plurality of feature map groups 210 connected in series in a densely connected manner. Each feature map group 210 contains at least one up-sampling operation (Deconv) and one down-sampling operation (Conv). Through this continuous up- and down-sampling, higher-level features G are gradually extracted from the low-level input feature F_in while the feature size is kept unchanged, so as to improve the resolution of the image. The high-level feature extraction process can be expressed by the formulas:

G_u = f_SRB(F_u) and G_o = f_SRB(F_o),

where f_SRB(·) represents the operation of the high-level feature extraction module.
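The sketch below illustrates one feature map group (a Deconv up-sampling followed by a Conv down-sampling) and a simplified SRB built from several such groups; it assumes PyTorch, the dense connections of SRFBN are simplified to a final 1 × 1 fusion of all group outputs, and the kernel/stride values are illustrative assumptions.

```python
import torch
import torch.nn as nn


class FeatureMapGroup(nn.Module):
    """One Deconv (up-sampling) + Conv (down-sampling) pair."""

    def __init__(self, channels=64, scale=4):
        super().__init__()
        k, s, p = scale + 4, scale, 2  # e.g. kernel 8, stride 4, padding 2 for 4x scaling
        self.up = nn.ConvTranspose2d(channels, channels, k, stride=s, padding=p)
        self.down = nn.Conv2d(channels, channels, k, stride=s, padding=p)

    def forward(self, lr_feat):
        hr_feat = self.up(lr_feat)      # up-sample to the HR feature size
        return self.down(hr_feat)       # back to the LR size with richer features


class SRB(nn.Module):
    def __init__(self, channels=64, num_groups=6, scale=4):
        super().__init__()
        self.groups = nn.ModuleList(FeatureMapGroup(channels, scale) for _ in range(num_groups))
        # Dense connections are simplified here to a final 1x1 fusion of all group outputs.
        self.fuse = nn.Conv2d(num_groups * channels, channels, kernel_size=1)

    def forward(self, f_in):
        outs, x = [], f_in
        for group in self.groups:
            x = group(x)
            outs.append(x)
        return self.fuse(torch.cat(outs, dim=1))  # high-level feature G
```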
The coupled feedback module CFB is the core component of the neural network, and aims to realize super-resolution and multi-exposure image fusion simultaneously through a composite network structure. The CFB has three inputs: the low-level feature and the high-level feature from the same sub-network, and the high-level feature from the other sub-network. The two inputs from the same sub-network are used to further improve the image resolution and enhance the super-resolution effect of the image, while the input from the other sub-network serves to improve the image fusion effect and realize multi-exposure image fusion.
In some embodiments, each sub-network of the neural network contains one coupled feedback module CFB. In this case, the coupling feedback module 113 is used to fuse the input under-exposed low-level feature F_u, the under-exposed high-level feature G_u and the over-exposed high-level feature G_o, and generates the coupled feedback result corresponding to the first sub-network 110. The coupling feedback module 123 is used to fuse the input over-exposed low-level feature F_o, the over-exposed high-level feature G_o and the under-exposed high-level feature G_u, and generates the coupled feedback result corresponding to the second sub-network 120. The coupled feedback results are image features in which multi-exposure fusion and super-resolution are realized simultaneously.
In some embodiments, multiple coupled feedback modules CFB are included in each sub-network in the neural network, and the multiple CFBs are in a parallel processing fashion. In this embodiment, the input data of each CFB in the same sub-network is the same, and the output coupled feedback results need to be further fused (such as weighted summation, etc.), so as to obtain a coupled feedback result.
In some embodiments, each sub-network in the neural network contains a plurality of coupled feedback modules CFB, and the CFBs are connected serially in a loop, as shown in Fig. 1. In this embodiment, assuming that there are T CFBs in each sub-network, the coupled feedback results corresponding to the first sub-network 110 are generated as follows. The under-exposed low-level feature F_u, the under-exposed high-level feature G_u and the over-exposed high-level feature G_o are input into the first coupled feedback module 113 in the first sub-network 110 to generate the coupled feedback result F_u^1 corresponding to the first sub-network. For any subsequent coupled feedback module 113 (numbered t) in the first sub-network other than the first one, the under-exposed low-level feature F_u, the coupled feedback result F_u^(t-1) of its preceding adjacent coupled feedback module (numbered t-1), and the coupled feedback result F_o^(t-1) of the corresponding coupled feedback module in the second sub-network 120 are input into that coupled feedback module 113 to generate the coupled feedback result F_u^t corresponding to the first sub-network 110. Proceeding in this way through all CFBs, the final coupled feedback result F_u^T corresponding to the first sub-network 110 is obtained.
Similarly, the coupled feedback results corresponding to the second sub-network 120 are generated as follows. The over-exposed low-level feature F_o, the over-exposed high-level feature G_o and the under-exposed high-level feature G_u are input into the first coupled feedback module 123 in the second sub-network 120 to generate the coupled feedback result F_o^1 corresponding to the second sub-network 120. For any subsequent coupled feedback module 123 (numbered t) in the second sub-network 120 other than the first one, the over-exposed low-level feature F_o, the coupled feedback result F_o^(t-1) of its preceding adjacent coupled feedback module, and the coupled feedback result F_u^(t-1) of the corresponding coupled feedback module in the first sub-network 110 are input into that coupled feedback module 123 to generate the coupled feedback result F_o^t corresponding to the second sub-network 120. Proceeding in this way through all CFBs, the final coupled feedback result F_o^T corresponding to the second sub-network 120 is obtained.
The above process can be expressed by the formulas:

F_u^t = f_CFB(F_u, F_u^(t-1), F_o^(t-1)) and F_o^t = f_CFB(F_o, F_o^(t-1), F_u^(t-1)),

where f_CFB(·) represents the operation of the coupled feedback module (for t = 1, the high-level features G_u and G_o take the place of the previous coupled feedback results).
In some embodiments, the number of coupling feedback modules 113 and 123 is three. This can better balance the computation speed and model accuracy of the neural network.
In some embodiments, the internal network structure of every coupled feedback module CFB is the same, but model parameters are not shared. Referring to Fig. 3, the structure of the t-th coupled feedback module 123 in the second sub-network 120 and its relation to other modules are illustrated. The coupling feedback module 123 includes at least two concatenation sub-modules 310 and at least two feature map groups 320. As in Fig. 2, the feature map groups 320 are densely connected, and each feature map group 320 includes a filter, a deconvolution layer (Deconv) and a convolution layer (Conv), implementing successive up-sampling and down-sampling. The first concatenation sub-module 310 precedes all feature map groups 320; each concatenation sub-module 310 other than the first is located between two adjacent feature map groups 320, and no two of them are located at the same position.
The t-th CFB has three inputs: the over-exposed low-level feature F_o, the coupled feedback result F_o^(t-1) extracted by the (t-1)-th CFB, and the coupled feedback result F_u^(t-1) extracted by the (t-1)-th CFB in the first sub-network 110. The feedback feature F_o^(t-1) is feedback information obtained from the same sub-network, so its main function is to correct the over-exposed low-level feature F_o and thereby further improve the super-resolution effect; the feedback feature F_u^(t-1), on the other hand, is feedback information from the other sub-network, whose main function is to bring complementary information that improves the effect of multi-exposure image fusion.
The processing procedure of the t-th coupling feedback module 123 is as follows. First, the three input features are concatenated along the channel dimension using a concatenation sub-module 310. The concatenation result is then fused with a series of 1 × 1 filters:

L_o^(t,0) = M_in([F_o, F_o^(t-1), F_u^(t-1)]),

where L_o^(t,0) represents the low-resolution feature obtained by filter fusion of the three input features, M_in(·) denotes a series of 1 × 1 filters, and [·] denotes concatenation of its internal elements. Then, based on the fused feature L_o^(t,0), the up-sampling (Deconv) and down-sampling (Conv) operations are repeated using a series of feature map groups 320; each up-sampling yields a high-resolution feature and each down-sampling yields a low-resolution feature, so that more effective high-level features are progressively extracted.
As explained above, the function of F_u^(t-1) during the operation of the feature map groups 320 is to bring complementary information that enhances the effect of multi-exposure image fusion. However, as the number of feature map groups increases, the information of F_u^(t-1) is gradually forgotten inside the module, its influence gradually decreases, and the subsequent fusion effect deteriorates. Therefore, in order to strengthen the influence of F_u^(t-1) on the network, in addition to using F_u^(t-1) as an input of each CFB, it is re-injected between the feature map groups 320 to reactivate the memory of the CFB module; that is, an additional concatenation sub-module 310 is added between the feature map groups of the CFB, and the input of this added concatenation sub-module 310 contains F_u^(t-1). In one embodiment, at least two concatenation sub-modules 310 are provided in each CFB, and the concatenation sub-modules other than the first one are arranged between different feature map groups 320. If there is no requirement on the operation speed of the neural network, more than two concatenation sub-modules 310 may be provided, and a concatenation sub-module 310 may even be added between every two feature map groups 320, which further improves the fusion effect. If there are high requirements on both the operation speed and the accuracy of the neural network, only two concatenation sub-modules 310 may be provided in order to balance speed and accuracy, with the second concatenation sub-module 310 arranged at the middle position of the feature map groups 320. For example, assuming that the total number of feature map groups is N, the low-resolution feature output by the ⌊N/2⌋-th feature map group and the feedback feature F_u^(t-1) are combined to form a new low-resolution (LR) feature map, where ⌊·⌋ indicates the rounding-down operation. The new low-resolution feature map replaces the original output of the ⌊N/2⌋-th feature map group as the input feature of the subsequent feature map groups.
Finally, after the N feature map groups 320 have been operated, the LR feature maps produced by the feature map groups 320 are aggregated together and fused by a series of 1 × 1 filters to obtain the final output result of the CFB:

F_o^t = M_out([L_o^(t,1), L_o^(t,2), ..., L_o^(t,N)]),

where M_out(·) denotes the operation of convolution with a series of 1 × 1 filters.
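The sketch below illustrates one possible PyTorch implementation of a coupled feedback block along these lines: the three inputs are concatenated and fused by 1 × 1 convolutions (M_in), passed through N feature map groups, the feedback feature from the other sub-network is re-injected at the middle group, and the LR outputs of all groups are aggregated by 1 × 1 convolutions (M_out). FeatureMapGroup is taken from the SRB sketch above; all hyper-parameters and names are illustrative assumptions.

```python
import torch
import torch.nn as nn


class CFB(nn.Module):
    def __init__(self, channels=64, num_groups=6, scale=4):
        super().__init__()
        self.m_in = nn.Conv2d(3 * channels, channels, kernel_size=1)    # fuse the first concatenation
        self.groups = nn.ModuleList(FeatureMapGroup(channels, scale) for _ in range(num_groups))
        self.mid = num_groups // 2                                       # position of the second concatenation
        self.m_mid = nn.Conv2d(2 * channels, channels, kernel_size=1)    # fuse the re-injected feedback
        self.m_out = nn.Conv2d(num_groups * channels, channels, kernel_size=1)

    def forward(self, f_low, fb_same, fb_other):
        # f_low: low-level feature of this sub-network; fb_same / fb_other: previous coupled
        # feedback results of this and the other sub-network (high-level features for the first CFB).
        x = self.m_in(torch.cat([f_low, fb_same, fb_other], dim=1))
        lr_feats = []
        for n, group in enumerate(self.groups, start=1):
            x = group(x)
            if n == self.mid:  # re-activate the memory of the other sub-network's feedback
                x = self.m_mid(torch.cat([x, fb_other], dim=1))
            lr_feats.append(x)
        return self.m_out(torch.cat(lr_feats, dim=1))  # coupled feedback result
```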
In some embodiments, the first sub-network 110 and the second sub-network 120 respectively include an image reconstruction module (REC) 114 and an image reconstruction module 124, which reconstruct the coupled feedback results (features) obtained by the at least one CFB into images. With multiple CFBs, multiple reconstructed images can thus be obtained. On this basis, the original input images of the neural network can be further fused in to obtain the first fusion-exposed high-resolution image Î_u^F and the second fusion-exposed high-resolution image Î_o^F. Each fusion-exposed high-resolution image has both the high-dynamic-range (HDR) and the high-resolution (HR) characteristics. It should be noted that, in the embodiment with multiple serially connected CFBs, every CFB outputs a coupled feedback result; however, since the serial feedback processing of the CFBs gradually improves the image fusion and super-resolution effects, the coupled feedback result obtained by the last CFB has the best overall effect. Accordingly, when obtaining the first fusion-exposed high-resolution image Î_u^F and the second fusion-exposed high-resolution image Î_o^F, the reconstructed image corresponding to the coupled feedback result of the last CFB in each sub-network is taken as one of the inputs. In addition, the feature size of the coupled feedback result is larger than the image size of the original input image, so the original input image is enlarged by an up-sampling operation such as bicubic interpolation, and the up-sampling result is taken as the other input to obtain the fusion-exposed high-resolution image. The above process can be expressed by the formulas:

Î_u^F = f_REC(F_u^T) + f_UP(I_u^LR) and Î_o^F = f_REC(F_o^T) + f_UP(I_o^LR),

where f_UP(·) and f_REC(·) respectively represent the up-sampling operation and the image reconstruction operation.
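A minimal sketch of this reconstruction step in PyTorch is shown below: the coupled feedback result is mapped back to an image (f_REC) and added to a bicubic up-sampling of the original LR input (f_UP); layer sizes and names are illustrative assumptions.

```python
import torch.nn as nn
import torch.nn.functional as F


class REC(nn.Module):
    def __init__(self, channels=64, out_channels=3, scale=4):
        super().__init__()
        k, s, p = scale + 4, scale, 2
        self.up = nn.ConvTranspose2d(channels, channels, k, stride=s, padding=p)  # feature up-sampling
        self.to_img = nn.Conv2d(channels, out_channels, kernel_size=3, padding=1)
        self.scale = scale

    def forward(self, feedback_result, img_lr):
        residual = self.to_img(self.up(feedback_result))                 # f_REC: reconstruct an image
        base = F.interpolate(img_lr, scale_factor=self.scale,            # f_UP: bicubic up-sampling
                             mode='bicubic', align_corners=False)
        return base + residual                                           # fusion-exposed HR image
```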
Based on the above description, the parameter settings of each part of the neural network provided by the embodiments of the present disclosure may be exemplified as follows:
(table of example parameter settings for the individual modules)
Fig. 4 is a network architecture diagram of a neural network for neural network training provided by an embodiment of the present disclosure. Based on the architecture of the neural network in Fig. 1, the neural network contains multi-level features, such as the low-level features, the high-level features and at least one coupled feedback result (feature), all of which are used to realize multi-exposure image fusion and super-resolution simultaneously. Therefore, in order to ensure the effectiveness of each obtained feature, a hierarchical loss function constraint is adopted in the neural network training process. Since the hierarchical loss function requires the image output of each layer to be calculated, the network architecture for neural network training in Fig. 4 adds several image-output branches compared with the network architecture for image fusion prediction in Fig. 1: for example, it outputs the over-exposed high-resolution image Î_o^HR and the under-exposed high-resolution image Î_u^HR corresponding to the high-level features, outputs the fusion-exposed high-resolution images Î_u^(F,1), ..., Î_u^(F,T-1) corresponding to the other coupled feedback modules in the first sub-network 110, and outputs the fusion-exposed high-resolution images Î_o^(F,1), ..., Î_o^(F,T-1) corresponding to the other coupled feedback modules in the second sub-network 120.
fig. 5 is a flowchart of a neural network training method for multi-exposure image fusion according to an embodiment of the present disclosure. The neural network training method for multi-exposure image fusion is implemented based on the neural network architecture in fig. 4, wherein the same or corresponding explanations as those in the above embodiments are not repeated herein. The neural network training method for multi-exposure image fusion provided by the embodiment of the disclosure may be executed by a neural network training apparatus for multi-exposure image fusion, which may be implemented by software and/or hardware, and may be integrated in an electronic device with certain computing capability, such as a notebook computer, a desktop computer, a server or a super computer. Referring to fig. 5, the neural network training method for multi-exposure image fusion specifically includes:
s110, acquiring an under-exposed low-resolution image and an over-exposed low-resolution image.
Specifically, the whole neural network training process requires multiple rounds of network training; one training image group is acquired for each round, so multiple training image groups are needed for the whole training process, and the training procedure is the same for each group. In this embodiment, only one training round is described. A training image group comprises an under-exposed low-resolution image I_u^LR and an over-exposed low-resolution image I_o^LR. The under-exposed low-resolution image is an image whose exposure is less than a first preset exposure threshold and whose image resolution is less than a preset resolution threshold. The over-exposed low-resolution image is an image whose exposure is greater than a second preset exposure threshold and whose image resolution is less than the preset resolution threshold. Here, the first preset exposure threshold is smaller than the second preset exposure threshold, and the first preset exposure threshold, the second preset exposure threshold and the preset resolution threshold are a predetermined exposure level and a predetermined image resolution, respectively.
And S120, inputting the underexposed low-resolution image and the overexposed low-resolution image into the primary feature extraction modules in the first sub-network and the second sub-network respectively to generate the underexposed low-level features and the overexposed low-level features.
Specifically, the under-exposed low-resolution image I_u^LR is input into the primary feature extraction module FEB in the first sub-network to obtain the under-exposed low-level feature F_u, and the over-exposed low-resolution image I_o^LR is input into the primary feature extraction module FEB in the second sub-network to obtain the over-exposed low-level feature F_o.
And S130, inputting the underexposed low-level features and the overexposed low-level features into high-level feature extraction modules in the first sub-network and the second sub-network respectively to generate the underexposed high-level features and the overexposed high-level features.
Specifically, the under-exposed low-level feature F_u is input into the high-level feature extraction module SRB in the first sub-network to obtain the under-exposed high-level feature G_u, and the over-exposed low-level feature F_o is input into the high-level feature extraction module SRB in the second sub-network to obtain the over-exposed high-level feature G_o.
And S140, inputting the underexposed low-level features, the underexposed high-level features and the overexposed high-level features into the coupling feedback module in the first sub-network to generate a coupling feedback result corresponding to the first sub-network.
Specifically, with the under-exposed low-level feature F_u, the under-exposed high-level feature G_u and the over-exposed high-level feature G_o as the basic input features, at least one coupled feedback result corresponding to the first sub-network is generated through the processing of at least one coupled feedback module CFB in the first sub-network.
In some embodiments, S140 may be implemented as: inputting the underexposure low-level features, the underexposure high-level features and the overexposure high-level features into a first coupling feedback module in a first sub-network to generate a coupling feedback result corresponding to the first sub-network; and aiming at any subsequent coupled feedback module except the first coupled feedback module in the first sub-network, inputting the underexposed low-level feature, the coupled feedback result of the previous adjacent coupled feedback module of the subsequent coupled feedback module and the coupled feedback result of the coupled feedback module corresponding to the previous adjacent coupled feedback module in the second sub-network into the subsequent coupled feedback module to generate the coupled feedback result corresponding to the first sub-network. In this embodiment, the neural network includes a plurality of coupled feedback modules CFB (T are taken as an example), and each coupled feedback module processes serially.
Referring to Fig. 4, the at least one coupled feedback result corresponding to the first sub-network is generated as follows. First, for the first CFB, the under-exposed low-level feature F_u, the under-exposed high-level feature G_u and the over-exposed high-level feature G_o are input into the first CFB, and the first coupled feedback result F_u^1 corresponding to the first sub-network is output. Then, for a subsequent CFB in the first sub-network other than the first CFB (assumed to be the t-th, with t ≤ T), the under-exposed low-level feature F_u, the coupled feedback result F_u^(t-1) of the preceding adjacent CFB (i.e., the (t-1)-th CFB), and the coupled feedback result F_o^(t-1) of the (t-1)-th CFB in the second sub-network are input into the t-th CFB, and the t-th coupled feedback result F_u^t corresponding to the first sub-network is output. By analogy, through this iterative feedback, the coupled feedback result output by every subsequent CFB in the first sub-network can be obtained.
And S150, inputting the overexposed low-level features, the overexposed high-level features and the underexposed high-level features into a coupling feedback module in the second sub-network, and generating a coupling feedback result corresponding to the second sub-network.
Specifically, with the over-exposed low-level feature F_o, the over-exposed high-level feature G_o and the under-exposed high-level feature G_u as the basic input features, at least one coupled feedback result corresponding to the second sub-network is generated through the processing of at least one coupled feedback module CFB in the second sub-network.
In some embodiments, S150 may be implemented as: inputting the overexposed low-level feature, the overexposed high-level feature and the underexposed high-level feature into the first coupling feedback module in the second sub-network to generate a coupling feedback result corresponding to the second sub-network; and, for any subsequent coupling feedback module in the second sub-network other than the first coupling feedback module, inputting the overexposed low-level feature, the coupling feedback result of the previous adjacent coupling feedback module of that subsequent coupling feedback module, and the coupling feedback result of the coupling feedback module in the first sub-network corresponding to that previous adjacent coupling feedback module into the subsequent coupling feedback module, to generate a coupling feedback result corresponding to the second sub-network. In this embodiment, the neural network includes a plurality of coupling feedback modules CFB (T modules are taken as an example), and the coupling feedback modules process serially.
Referring to fig. 4, the process of generating at least one coupling feedback result corresponding to the second sub-network is as follows: first, for the first CFB, the overexposed low-level feature, the overexposed high-level feature Go and the underexposed high-level feature Gu are input into the first CFB, which outputs the first coupling feedback result corresponding to the second sub-network. Then, for any subsequent CFB in the second sub-network other than the first CFB (assumed to be the t-th CFB, with 1 < t ≤ T), the overexposed low-level feature, the coupling feedback result of the previous adjacent CFB (i.e., the (t-1)-th CFB) in the second sub-network, and the coupling feedback result of the (t-1)-th CFB in the first sub-network are input into the t-th CFB, which outputs the t-th coupling feedback result corresponding to the second sub-network.
By analogy, through iterative feedback, the coupling feedback result output by any subsequent CFB in the second sub-network can be obtained.
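As an illustration of this serial, coupled processing, the following sketch shows one way the T coupling feedback results of the two sub-networks could be produced. It assumes PyTorch; the class CFB, its internal structure, the feature channel count and the tensor names (f_u, g_u, f_o, g_o) are placeholders for illustration only and are not the actual implementation of the disclosure.

```python
import torch
import torch.nn as nn

class CFB(nn.Module):
    """Placeholder coupling feedback block: fuses three feature maps into one (assumption)."""
    def __init__(self, channels):
        super().__init__()
        self.fuse = nn.Conv2d(3 * channels, channels, kernel_size=3, padding=1)

    def forward(self, low_feat, own_prev, other_prev):
        # Concatenate the low-level feature with the two feedback/high-level inputs and fuse them.
        return self.fuse(torch.cat([low_feat, own_prev, other_prev], dim=1))

T, C = 4, 64                                        # number of CFBs and feature channels (assumed)
cfb_u = nn.ModuleList([CFB(C) for _ in range(T)])   # CFBs of the first (under-exposed) sub-network
cfb_o = nn.ModuleList([CFB(C) for _ in range(T)])   # CFBs of the second (over-exposed) sub-network

def coupled_feedback(f_u, g_u, f_o, g_o):
    """f_u/f_o: low-level features, g_u/g_o: high-level features of the two sub-networks."""
    results_u, results_o = [], []
    prev_u, prev_o = g_u, g_o                       # the first CFB receives the high-level features
    for t in range(T):
        out_u = cfb_u[t](f_u, prev_u, prev_o)       # S140: under-exposed branch, t-th CFB
        out_o = cfb_o[t](f_o, prev_o, prev_u)       # S150: over-exposed branch, t-th CFB
        results_u.append(out_u)
        results_o.append(out_o)
        prev_u, prev_o = out_u, out_o               # iterative feedback across the two sub-networks
    return results_u, results_o
```

Note that at every step the two branches exchange their previous results, which is what couples the super-resolution information of the two exposures.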
And S160, adjusting parameters of the neural network based on the underexposed low-resolution image, the underexposed high-level feature and the coupling feedback result corresponding to the first sub-network, and the overexposed low-resolution image, the overexposed high-level feature and the coupling feedback result corresponding to the second sub-network.
Specifically, according to the above description, the embodiment of the present disclosure employs a hierarchical loss function to train the neural network. The images output by each layer of the neural network therefore need to be determined based on the underexposed low-resolution image, the underexposed high-level feature Gu and the respective coupling feedback results corresponding to the first sub-network, as well as the overexposed low-resolution image, the overexposed high-level feature Go and the respective coupling feedback results corresponding to the second sub-network; the loss value of the current training step is then calculated from these output images, and the loss value is used to adjust the model parameters of the neural network.
In some embodiments, S160 may be implemented as:
A. the under-exposed low resolution image and the over-exposed low resolution image are respectively subjected to an up-sampling operation.
Specifically, in order to further improve the image fusion effect, the image output by each layer of the neural network in the embodiment of the present disclosure needs to be fused with the original input images, namely the underexposed low-resolution image and the overexposed low-resolution image. However, since the high-level features and the coupling feedback results both add super-resolution image detail and their feature size is larger than that of the original input images, the original input images need to be up-sampled to enlarge their size. For example, a bicubic interpolation up-sampling operation is performed on the underexposed low-resolution image and on the overexposed low-resolution image respectively, to obtain an up-sampled underexposed low-resolution image and an up-sampled overexposed low-resolution image.
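A minimal sketch of step A, assuming PyTorch; the ×4 scale factor, image sizes and variable names are illustrative assumptions only.

```python
import torch
import torch.nn.functional as F

scale = 4                                  # assumed super-resolution factor
i_lr_u = torch.rand(1, 3, 32, 32)          # under-exposed low-resolution image (dummy data)
i_lr_o = torch.rand(1, 3, 32, 32)          # over-exposed low-resolution image (dummy data)

# Bicubic interpolation up-sampling, enlarging the inputs to the high-resolution output size.
i_up_u = F.interpolate(i_lr_u, scale_factor=scale, mode='bicubic', align_corners=False)
i_up_o = F.interpolate(i_lr_o, scale_factor=scale, mode='bicubic', align_corners=False)
```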
B. And adding the image corresponding to the underexposed high-level feature and the images corresponding to the coupling feedback results corresponding to the first sub-network to the up-sampled underexposed low-resolution image respectively, to generate an underexposed high-resolution image and fusion exposure high-resolution images corresponding to the first sub-network.
Specifically, first, the operation of the image reconstruction module REC is applied to the underexposed high-level feature Gu and to each coupling feedback result corresponding to the first sub-network to obtain the corresponding images. Then, the image corresponding to the underexposed high-level feature Gu and the up-sampled underexposed low-resolution image are added to obtain the underexposed high-resolution image. In addition, the up-sampled underexposed low-resolution image is added to the image corresponding to each coupling feedback result of the first sub-network, to obtain the respective fusion exposure high-resolution images corresponding to the first sub-network.
C. And adding the image corresponding to the overexposed high-level feature and the images corresponding to the coupling feedback results corresponding to the second sub-network to the up-sampled overexposed low-resolution image respectively, to generate an overexposed high-resolution image and fusion exposure high-resolution images corresponding to the second sub-network.
Specifically, first, the operation of the image reconstruction module REC is applied to the overexposed high-level feature Go and to each coupling feedback result corresponding to the second sub-network to obtain the corresponding images. Then, the image corresponding to the overexposed high-level feature Go and the up-sampled overexposed low-resolution image are added to obtain the overexposed high-resolution image. In addition, the up-sampled overexposed low-resolution image is added to the image corresponding to each coupling feedback result of the second sub-network, to obtain the respective fusion exposure high-resolution images corresponding to the second sub-network.
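Steps B and C can be sketched as follows, continuing the tensor names from the sketches above; the reconstruction modules rec_u and rec_o are assumed here to be single convolutions mapping features back to a 3-channel image, which is an illustrative simplification of the REC module, and the features are assumed to already have the up-sampled spatial size.

```python
import torch.nn as nn

rec_u = nn.Conv2d(C, 3, kernel_size=3, padding=1)   # placeholder REC of the first sub-network
rec_o = nn.Conv2d(C, 3, kernel_size=3, padding=1)   # placeholder REC of the second sub-network

# Step B: under-exposed high-resolution image and fusion exposure images of the first sub-network.
i_sr_u = i_up_u + rec_u(g_u)                         # residual addition to the up-sampled input
fused_u = [i_up_u + rec_u(r) for r in results_u]     # one fused image per coupling feedback result

# Step C: over-exposed high-resolution image and fusion exposure images of the second sub-network.
i_sr_o = i_up_o + rec_o(g_o)
fused_o = [i_up_o + rec_o(r) for r in results_o]
```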
D. And adjusting parameters of the neural network based on the underexposed high-resolution image, the fusion exposed high-resolution image corresponding to the first sub-network, the overexposed high-resolution image and the fusion exposed high-resolution image corresponding to the second sub-network.
Specifically, the loss value of the current training step is calculated from the underexposed high-resolution image, the respective fusion exposure high-resolution images corresponding to the first sub-network, the overexposed high-resolution image and the respective fusion exposure high-resolution images corresponding to the second sub-network obtained by the above process, and the model parameters of the neural network are adjusted by back-propagating the loss value.
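Step D amounts to a standard gradient-based update. The sketch below continues the variables from the sketches above and assumes PyTorch; the optimizer, learning rate and the abridged parameter list are illustrative choices, and total_loss stands for the hierarchical loss of equation (1) given below.

```python
import torch.optim as optim

# Only the modules from the earlier sketches are listed here; a full model would also
# include the primary and high-level feature extraction modules (assumption).
params = (list(cfb_u.parameters()) + list(cfb_o.parameters()) +
          list(rec_u.parameters()) + list(rec_o.parameters()))
optimizer = optim.Adam(params, lr=1e-4)

optimizer.zero_grad()
loss = total_loss(i_sr_u, i_sr_o, fused_u, fused_o,
                  gt_u, gt_o, gt_fused)      # reference images gt_* are assumed to be given
loss.backward()                              # back-propagate the loss value
optimizer.step()                             # adjust the model parameters
```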
In some embodiments, step D may be implemented as: parameters of the neural network are adjusted by a loss function as shown in the following equation (1):
L_total = λ_u·L_SRB^u + λ_o·L_SRB^o + λ_CFB·(L_CFB^u + L_CFB^o)    (1)

with L_SRB^u = L_MS(I_SR^u, I_gt^u), L_SRB^o = L_MS(I_SR^o, I_gt^o), L_CFB^u = Σ_{t=1..T} L_MS(I_F^{u,t}, I_gt) and L_CFB^o = Σ_{t=1..T} L_MS(I_F^{o,t}, I_gt),

wherein L_total represents the value of the total loss function; λ_o, λ_u and λ_CFB respectively represent the weights corresponding to each partial loss function value; L_SRB^u and L_CFB^u respectively represent the loss function values corresponding to the high-level feature extraction module and the coupling feedback module in the first sub-network; L_SRB^o and L_CFB^o respectively represent the loss function values corresponding to the high-level feature extraction module and the coupling feedback module in the second sub-network; L_MS represents a loss value between two images determined based on the structural similarity index of the images; I_SR^o and I_gt^o respectively represent the overexposed high-resolution image and the overexposed high-resolution reference image; I_SR^u and I_gt^u respectively represent the underexposed high-resolution image and the underexposed high-resolution reference image; I_F^{o,t}, I_F^{u,t} and I_gt respectively represent the fusion exposure high-resolution image corresponding to the t-th coupling feedback module of the second sub-network, the fusion exposure high-resolution image corresponding to the t-th coupling feedback module of the first sub-network and the fusion exposure high-resolution reference image; and T represents the number of coupling feedback modules.
Each reference image is the ground-truth image corresponding to the respective output image of the neural network, that is, a target image that the image generated by the neural network is expected to approach as closely as possible. The loss value L_MS described above, which measures the difference between two images X and Y based on the image-level structural similarity index (SSIM), is determined from SSIM(X, Y): the more structurally similar the two images are, the smaller the loss value.
the loss function in equation (1) above can be divided into two parts. First two loss functions
Figure GDA00037647995900002212
And
Figure GDA00037647995900002213
for ensuring the effectiveness of the high-level feature extraction module SRB, the last part of the loss function
Figure GDA00037647995900002214
To ensure the effectiveness of the coupling feedback module CFB. That is, the first two loss functions are to ensure the effect of super-resolution, while the latter part is constructed to ensure the effect of super-resolution and multi-exposure image fusion simultaneously. At the same time, the first two loss functions are also important bases for the last part of the loss function. The entire neural network is trained in an end-to-end manner by minimizing the loss function defined in equation (1).
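The following sketch gives one possible realization of this hierarchical loss, using the pytorch_msssim package for the structural-similarity term. Treating L_MS as 1 − SSIM, the default weight values and the function name total_loss are assumptions for illustration; the disclosure only requires that L_MS be determined from the structural similarity index between the two images.

```python
from pytorch_msssim import ssim   # third-party SSIM implementation (assumed choice)

def l_ms(x, y):
    # Assumed form of L_MS: one minus the image-level structural similarity index.
    return 1.0 - ssim(x, y, data_range=1.0)

def total_loss(i_sr_u, i_sr_o, fused_u, fused_o, gt_u, gt_o, gt_fused,
               lam_u=1.0, lam_o=1.0, lam_cfb=1.0):
    # First two terms: effectiveness of the high-level feature extraction modules (SRB).
    loss_srb = lam_u * l_ms(i_sr_u, gt_u) + lam_o * l_ms(i_sr_o, gt_o)
    # Last part: effectiveness of the T coupling feedback modules (CFB) of both sub-networks.
    loss_cfb = sum(l_ms(fu, gt_fused) + l_ms(fo, gt_fused)
                   for fu, fo in zip(fused_u, fused_o))
    return loss_srb + lam_cfb * loss_cfb
```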
It should be noted that the execution order of S140 and S150 is not limited, and S140 may be executed first and then S150 may be executed, S150 may be executed first and then S140 may be executed, or S140 and S150 may be executed in parallel.
According to the technical scheme of the embodiment of the disclosure, the obtained underexposed low-resolution images and the obtained overexposed low-resolution images are respectively input into the primary feature extraction modules in the first sub-network and the second sub-network to generate underexposed low-level features and overexposed low-level features; respectively inputting the underexposure low-level features and the overexposure low-level features into high-level feature extraction modules in the first sub-network and the second sub-network to generate the underexposure high-level features and the overexposure high-level features; inputting the underexposure low-level features, the underexposure high-level features and the overexposure high-level features into a coupling feedback module in the first sub-network to generate a coupling feedback result corresponding to the first sub-network; inputting the overexposure low-level features, the overexposure high-level features and the underexposure high-level features into a coupling feedback module in the second sub-network to generate a coupling feedback result corresponding to the second sub-network; and adjusting parameters of the neural network based on the underexposed low-resolution image, the underexposed high-level feature and the coupling feedback result corresponding to the first sub-network, and the overexposed low-resolution image, the overexposed high-level feature and the coupling feedback result corresponding to the second sub-network. The end-to-end training of the neural network coupled with the multi-exposure fusion technology and the super-resolution technology is realized, the neural network with more accurate parameters of each module is obtained, the neural network can simplify the processing flow of the shot image and improve the image processing speed, and the image processing accuracy is further improved by utilizing the complementary characteristic between the multi-exposure fusion and the super-resolution.
Fig. 6 is a flowchart of an image fusion method provided in an embodiment of the present disclosure. The image fusion method is implemented based on the neural network architecture in fig. 1, and the explanation of the same or corresponding contents as those in the above embodiments is not repeated herein. The image fusion method provided by the embodiment of the present disclosure may be executed by an image fusion device, where the image fusion device may be implemented by software and/or hardware, and the image fusion device may be integrated in an electronic device with certain computing capability, such as a mobile phone, a tablet computer, a notebook computer, a desktop computer, a server, or a super computer. Referring to fig. 6, the image fusion method includes:
s210, acquiring an under-exposed low-resolution image and an over-exposed low-resolution image.
Specifically, two images with extreme exposures of the same shooting scene and the same shooting object are acquired, namely an underexposed low-resolution image and an overexposed low-resolution image.
S220, inputting the underexposed low-resolution image and the overexposed low-resolution image into a pre-trained neural network to generate a first fusion exposure high-resolution image and a second fusion exposure high-resolution image.
Specifically, in the application process of the neural network, only the underexposed low-resolution image and the overexposed low-resolution image need to be input, and two images are output after processing by the neural network: the first fusion exposure high-resolution image corresponding to the underexposed low-resolution image, and the second fusion exposure high-resolution image corresponding to the overexposed low-resolution image.
And S230, generating an image fusion result based on the first fusion exposure high-resolution image and the second fusion exposure high-resolution image.
Specifically, although the first fusion exposure high-resolution image and the second fusion exposure high-resolution image are both super-resolution, multi-exposure fused images, the two output images still differ because their corresponding input images differ. In order to further improve the fusion precision, the embodiment of the present disclosure further combines the first fusion exposure high-resolution image and the second fusion exposure high-resolution image to obtain the final output image, namely the image fusion result.
In some embodiments, S230 may be implemented as: performing weighted summation on the first fusion exposure high-resolution image and the second fusion exposure high-resolution image by using a first weight and a second weight respectively, to generate the image fusion result. Specifically, this embodiment performs weighting processing on the first fusion exposure high-resolution image and the second fusion exposure high-resolution image, so two weights, namely the first weight and the second weight, need to be determined in advance. The values of the two weights are related to the exposure levels and shooting scenes of the underexposed low-resolution image and the overexposed low-resolution image. For example, 0.5 may be used as the default value of both the first weight and the second weight. Then, the image fusion result can be generated according to the following formula (2):

I_out = w_u·I_F^u + w_o·I_F^o    (2)

wherein I_out, w_o and w_u respectively represent the image fusion result, the second weight and the first weight, and I_F^u and I_F^o respectively represent the first fusion exposure high-resolution image and the second fusion exposure high-resolution image.
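Formula (2) amounts to a per-pixel weighted average of the two network outputs; a direct rendering in code, with the default weights mentioned above and assumed variable names:

```python
w_u, w_o = 0.5, 0.5                  # first and second weights (default values from the text)
# i_f_u / i_f_o: first and second fusion exposure high-resolution images output by the network
i_out = w_u * i_f_u + w_o * i_f_o    # image fusion result of formula (2)
```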
According to the technical scheme of the embodiment of the disclosure, the under-exposed low-resolution image and the over-exposed low-resolution image obtained by shooting are input into a pre-trained neural network to generate a first fusion exposure high-resolution image and a second fusion exposure high-resolution image; and generating an image fusion result based on the first fusion exposure high-resolution image and the second fusion exposure high-resolution image. The method and the device have the advantages that the neural network coupled with the multi-exposure fusion technology and the super-resolution technology is utilized to process two extremely-exposed low-resolution images, an image fusion result with high resolution HR and high dynamic range HDR is generated, the processing flow of shot images is simplified, and the image processing speed and accuracy are improved.
Fig. 7 is a schematic structural diagram of a neural network training device for multi-exposure image fusion according to an embodiment of the present disclosure. The neural network comprises a first sub-network and a second sub-network which have the same network structure, and any one of the sub-networks comprises a primary feature extraction module, a high-level feature extraction module and a coupling feedback module. Referring to fig. 7, the apparatus specifically includes:
an image acquisition unit 710 for acquiring an under-exposed low resolution image and an over-exposed low resolution image;
a low-level feature generation unit 720, configured to input the under-exposed low-resolution image and the over-exposed low-resolution image into the primary feature extraction modules in the first sub-network and the second sub-network, respectively, and generate an under-exposed low-level feature and an over-exposed low-level feature;
a high-level feature generating unit 730, configured to input the underexposed low-level features and the overexposed low-level features into the high-level feature extraction modules in the first subnetwork and the second subnetwork, respectively, and generate the underexposed high-level features and the overexposed high-level features;
a first coupling feedback result generating unit 740, configured to input the under-exposure low-level features, the under-exposure high-level features, and the over-exposure high-level features into the coupling feedback modules in the first sub-network, and generate a coupling feedback result corresponding to the first sub-network;
a second coupling feedback result generating unit 750, configured to input the overexposed low-level feature, the overexposed high-level feature, and the underexposed high-level feature into a coupling feedback module in the second sub-network, and generate a coupling feedback result corresponding to the second sub-network;
and a parameter adjusting unit 760, configured to adjust parameters of the neural network based on the under-exposed low-resolution image, the under-exposed high-level feature, and the coupling feedback result corresponding to the first sub-network, and the over-exposed low-resolution image, the over-exposed high-level feature, and the coupling feedback result corresponding to the second sub-network.
In some embodiments, the neural network includes a plurality of coupled feedback modules, and each coupled feedback module does not share model parameters.
In some embodiments, each coupled feedback module processes serially;
accordingly, the first coupling feedback result generating unit 740 is specifically configured to:
inputting the underexposure low-level features, the underexposure high-level features and the overexposure high-level features into a first coupling feedback module in a first sub-network to generate a coupling feedback result corresponding to the first sub-network;
inputting the underexposure low-level feature, the coupling feedback result of a previous adjacent coupling feedback module of the subsequent coupling feedback module and the coupling feedback result of the coupling feedback module corresponding to the previous adjacent coupling feedback module in the second sub-network into any subsequent coupling feedback module except the first coupling feedback module in the first sub-network, so as to generate the coupling feedback result corresponding to the first sub-network;
correspondingly, the second coupling feedback result generating unit 750 is specifically configured to:
inputting the overexposure low-level features, the overexposure high-level features and the underexposure high-level features into a first coupling feedback module in a second sub-network to generate a coupling feedback result corresponding to the second sub-network;
and inputting the overexposure low-level feature, the coupling feedback result of a previous adjacent coupling feedback module of the subsequent coupling feedback module and the coupling feedback result of the coupling feedback module corresponding to the previous adjacent coupling feedback module in the first sub-network into the subsequent coupling feedback module aiming at any subsequent coupling feedback module except the first coupling feedback module in the second sub-network, and generating the coupling feedback result corresponding to the second sub-network.
In some embodiments, the coupling feedback module comprises at least two concatenation submodules and at least two feature map groups, wherein each feature map group comprises a filter, a deconvolution layer and a convolution layer;
the first concatenation submodule is positioned before all of the feature map groups;
any concatenation submodule other than the first concatenation submodule is located between two adjacent feature map groups, and any two such concatenation submodules are located at different positions.
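A structural sketch of such a coupling feedback module with two concatenation submodules and two feature map groups, assuming PyTorch; the channel counts, kernel sizes, the use of a 1×1 convolution as the "filter", and the choice of what the second concatenation submodule joins are illustrative assumptions only, not the disclosure's actual design.

```python
import torch
import torch.nn as nn

class FeatureMapGroup(nn.Module):
    """One feature map group: filter -> deconvolution layer -> convolution layer (sizes assumed)."""
    def __init__(self, in_ch, ch):
        super().__init__()
        self.filter = nn.Conv2d(in_ch, ch, kernel_size=1)                             # filter
        self.deconv = nn.ConvTranspose2d(ch, ch, kernel_size=4, stride=2, padding=1)  # deconvolution
        self.conv = nn.Conv2d(ch, ch, kernel_size=3, stride=2, padding=1)             # convolution

    def forward(self, x):
        return self.conv(self.deconv(self.filter(x)))

class CouplingFeedbackModule(nn.Module):
    """Two concatenation submodules and two feature map groups, mirroring the layout above."""
    def __init__(self, ch):
        super().__init__()
        self.group1 = FeatureMapGroup(3 * ch, ch)   # follows the first concatenation submodule
        self.group2 = FeatureMapGroup(2 * ch, ch)   # follows the second concatenation submodule

    def forward(self, low_feat, own_feedback, other_feedback):
        # First concatenation submodule, positioned before all feature map groups.
        x = torch.cat([low_feat, own_feedback, other_feedback], dim=1)
        y = self.group1(x)
        # Second concatenation submodule, located between the two adjacent feature map groups;
        # re-injecting the low-level feature here is an illustrative choice (assumption).
        y = self.group2(torch.cat([y, low_feat], dim=1))
        return y
```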
In some embodiments, the parameter adjusting unit 760 is specifically configured to:
respectively carrying out up-sampling operation on the underexposed low-resolution image and the overexposed low-resolution image;
adding the image corresponding to the underexposed high-level feature and the images corresponding to the coupling feedback results corresponding to the first sub-network to the up-sampled underexposed low-resolution image respectively, to generate an underexposed high-resolution image and fusion exposure high-resolution images corresponding to the first sub-network;
adding the image corresponding to the overexposed high-level feature and the images corresponding to the coupling feedback results corresponding to the second sub-network to the up-sampled overexposed low-resolution image respectively, to generate an overexposed high-resolution image and fusion exposure high-resolution images corresponding to the second sub-network;
and adjusting parameters of the neural network based on the underexposed high-resolution image, the fusion exposed high-resolution image corresponding to the first sub-network, the overexposed high-resolution image and the fusion exposed high-resolution image corresponding to the second sub-network.
Further, the parameter adjusting unit 760 is specifically configured to:
parameters of the neural network are adjusted by a loss function as shown in the following equation:
L_total = λ_u·L_SRB^u + λ_o·L_SRB^o + λ_CFB·(L_CFB^u + L_CFB^o)

with L_SRB^u = L_MS(I_SR^u, I_gt^u), L_SRB^o = L_MS(I_SR^o, I_gt^o), L_CFB^u = Σ_{t=1..T} L_MS(I_F^{u,t}, I_gt) and L_CFB^o = Σ_{t=1..T} L_MS(I_F^{o,t}, I_gt),

wherein L_total represents the value of the total loss function; λ_o, λ_u and λ_CFB respectively represent the weights corresponding to each partial loss function value; L_SRB^u and L_CFB^u respectively represent the loss function values corresponding to the high-level feature extraction module and the coupling feedback module in the first sub-network; L_SRB^o and L_CFB^o respectively represent the loss function values corresponding to the high-level feature extraction module and the coupling feedback module in the second sub-network; L_MS represents a loss value between two images determined based on the structural similarity index of the images; I_SR^o and I_gt^o respectively represent the overexposed high-resolution image and the overexposed high-resolution reference image; I_SR^u and I_gt^u respectively represent the underexposed high-resolution image and the underexposed high-resolution reference image; I_F^{o,t}, I_F^{u,t} and I_gt respectively represent the fusion exposure high-resolution image corresponding to the t-th coupling feedback module of the second sub-network, the fusion exposure high-resolution image corresponding to the t-th coupling feedback module of the first sub-network and the fusion exposure high-resolution reference image; and T represents the number of coupling feedback modules.
Through the neural network training device for multi-exposure image fusion provided by the embodiment of the disclosure, multi-exposure fusion processing and super-resolution processing of images are simultaneously performed by using one neural network, so that the processing flow of image shooting is simplified, the image processing speed is increased, and the image processing accuracy is further improved by using the complementary characteristic between multi-exposure fusion and super-resolution.
The neural network training device for multi-exposure image fusion provided by the embodiment of the disclosure can execute the neural network training method for multi-exposure image fusion provided by any embodiment of the disclosure, and has corresponding functional modules and beneficial effects of the execution method.
Fig. 8 is a schematic structural diagram of an image fusion apparatus provided in an embodiment of the present disclosure. Referring to fig. 8, the apparatus specifically includes:
an image obtaining unit 810 for obtaining an under-exposed low-resolution image and an over-exposed low-resolution image;
a fusion exposure high-resolution image generation unit 820, configured to input the under-exposure low-resolution image and the over-exposure low-resolution image into a neural network trained in advance, and generate a first fusion exposure high-resolution image and a second fusion exposure high-resolution image; the neural network is obtained by training through a neural network training method for multi-exposure image fusion in any embodiment of the disclosure;
an image fusion result generating unit 830 for generating an image fusion result based on the first fusion-exposed high-resolution image and the second fusion-exposed high-resolution image.
In some embodiments, the image fusion result generating unit 830 is specifically configured to:
and respectively utilizing the first weight and the second weight to carry out weighted summation processing on the first fusion exposure high-resolution image and the second fusion exposure high-resolution image so as to generate an image fusion result.
Through the image fusion device provided by the embodiment of the disclosure, multi-exposure fusion processing and super-resolution processing of images are simultaneously performed by using one neural network, so that the processing flow of the shot images is simplified, the image processing speed is improved, and the image processing accuracy is further improved by using the complementary characteristic between multi-exposure fusion and super-resolution.
The image fusion device provided by the embodiment of the disclosure can execute the image fusion method provided by any embodiment of the disclosure, and has corresponding functional modules and beneficial effects of the execution method.
It should be noted that, in the embodiment of the neural network training device for multi-exposure image fusion, the included units are only divided according to functional logic, but are not limited to the above division, as long as the corresponding functions can be implemented; in addition, specific names of the functional units are only used for distinguishing one functional unit from another, and are not used for limiting the protection scope of the present disclosure.
Referring to fig. 9, the present embodiment provides an electronic device, which includes: one or more processors 920; a storage device 910, configured to store one or more programs, when the one or more programs are executed by the one or more processors 920, so that the one or more processors 920 implement the neural network training method for multi-exposure image fusion provided by the embodiment of the present invention, where the neural network includes a first sub-network and a second sub-network with the same network structure, and any one of the sub-networks includes a primary feature extraction module, a higher-level feature extraction module, and a coupling feedback module; the method comprises the following steps:
acquiring an under-exposed low-resolution image and an over-exposed low-resolution image;
respectively inputting the underexposed low-resolution image and the overexposed low-resolution image into primary feature extraction modules in a first sub-network and a second sub-network to generate an underexposed low-level feature and an overexposed low-level feature;
respectively inputting the underexposed low-level features and the overexposed low-level features into high-level feature extraction modules in the first sub-network and the second sub-network to generate the underexposed high-level features and the overexposed high-level features;
inputting the underexposure low-level features, the underexposure high-level features and the overexposure high-level features into a coupling feedback module in the first sub-network to generate a coupling feedback result corresponding to the first sub-network;
inputting the overexposure low-level features, the overexposure high-level features and the underexposure high-level features into a coupling feedback module in the second sub-network to generate a coupling feedback result corresponding to the second sub-network;
and adjusting parameters of the neural network based on the underexposed low-resolution image, the underexposed high-level feature and the coupling feedback result corresponding to the first sub-network, and the overexposed low-resolution image, the overexposed high-level feature and the coupling feedback result corresponding to the second sub-network.
Of course, it can be understood by those skilled in the art that the processor 920 may also implement the technical solution of the neural network training method for multi-exposure image fusion provided in any embodiment of the present invention.
The electronic device shown in fig. 9 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present invention.
As shown in fig. 9, the electronic device includes a processor 920, a storage 910, an input 930, and an output 940; the number of the processors 920 in the electronic device may be one or more, and one processor 920 is taken as an example in fig. 9; the processor 920, the storage device 910, the input device 930, and the output device 940 in the electronic apparatus may be connected by a bus or other means, and are exemplified by a bus 950 in fig. 9.
The storage device 910 is a computer-readable storage medium, and can be used to store software programs, computer-executable programs, and modules, such as program instructions/modules corresponding to the neural network training method for multi-exposure image fusion in the embodiment of the present invention.
The storage device 910 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created according to the use of the terminal, and the like. In addition, the storage 910 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some examples, the storage 910 may further include memory located remotely from the processor 920, which may be connected to electronic devices over a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The input unit 930 may be used to receive input numeric or character information and generate key signal inputs related to user settings and function control of the electronic device. The output device 940 may include a display device such as a display screen.
An embodiment of the present invention further provides another electronic device, which includes: one or more processors; the storage device is used for storing one or more programs, and when the one or more programs are executed by one or more processors, the one or more processors implement the image fusion method provided by the embodiment of the invention, the method comprises the following steps:
acquiring an under-exposed low-resolution image and an over-exposed low-resolution image;
inputting the underexposed low-resolution image and the overexposed low-resolution image into a pre-trained neural network to generate a first fusion exposure high-resolution image and a second fusion exposure high-resolution image; the neural network is obtained by training through a neural network training method for multi-exposure image fusion in any embodiment of the disclosure;
and generating an image fusion result based on the first fusion exposure high-resolution image and the second fusion exposure high-resolution image.
Of course, those skilled in the art can understand that the processor may also implement the technical solution of the image fusion method provided in any embodiment of the present invention. The hardware structure and the function of the electronic device can be explained with reference to fig. 9.
The disclosed embodiments also provide a storage medium containing computer-executable instructions for performing a neural network training method for multi-exposure image fusion when executed by a computer processor, the neural network including a first sub-network and a second sub-network having the same network structure, and any one of the sub-networks including a primary feature extraction module, a high-level feature extraction module and a coupling feedback module; the method comprises the following steps:
acquiring an under-exposed low-resolution image and an over-exposed low-resolution image;
inputting the underexposed low-resolution image and the overexposed low-resolution image into primary feature extraction modules in the first sub-network and the second sub-network respectively to generate an underexposed low-level feature and an overexposed low-level feature;
respectively inputting the underexposure low-level features and the overexposure low-level features into high-level feature extraction modules in the first sub-network and the second sub-network to generate the underexposure high-level features and the overexposure high-level features;
inputting the underexposure low-level features, the underexposure high-level features and the overexposure high-level features into a coupling feedback module in the first sub-network to generate a coupling feedback result corresponding to the first sub-network;
inputting the overexposure low-level features, the overexposure high-level features and the underexposure high-level features into a coupling feedback module in the second sub-network to generate a coupling feedback result corresponding to the second sub-network;
and adjusting parameters of the neural network based on the underexposed low-resolution image, the underexposed high-level feature and the coupling feedback result corresponding to the first sub-network, and the overexposed low-resolution image, the overexposed high-level feature and the coupling feedback result corresponding to the second sub-network.
Of course, the storage medium containing the computer-executable instructions provided by the embodiments of the present invention is not limited to the above method operations, and may also perform related operations in the neural network training method for multi-exposure image fusion provided by any embodiments of the present invention.
Computer storage media for embodiments of the present invention may take the form of any combination of one or more computer-readable media. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++, or the like, as well as conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
Embodiments of the present invention also provide another computer-readable storage medium, where computer-executable instructions, when executed by a computer processor, are configured to perform an image fusion method, including:
acquiring an under-exposed low-resolution image and an over-exposed low-resolution image;
inputting the underexposed low-resolution image and the overexposed low-resolution image into a pre-trained neural network to generate a first fusion exposure high-resolution image and a second fusion exposure high-resolution image; the neural network is obtained by training through a neural network training method for multi-exposure image fusion in any embodiment of the disclosure;
and generating an image fusion result based on the first fusion exposure high-resolution image and the second fusion exposure high-resolution image.
Of course, the storage medium provided in the embodiments of the present invention includes computer-executable instructions, and the computer-executable instructions are not limited to the above method operations, and may also perform related operations in the image fusion method provided in any embodiment of the present invention. The description of the storage medium is explained with reference to the above embodiments.
It is to be understood that the terminology used in the disclosure is for the purpose of describing particular embodiments only, and is not intended to limit the scope of the present application. As used in the specification and claims of this disclosure, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. The term "and/or" includes any and all combinations of one or more of the associated listed items. Relational terms such as "first" and "second" may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. The terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional like elements in a process, method, or apparatus that comprises the element.
The previous description is only for the purpose of describing particular embodiments of the present disclosure, so as to enable those skilled in the art to understand or implement the present disclosure. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the disclosure. Thus, the present disclosure is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (12)

1. A neural network training method for multi-exposure image fusion is characterized in that the neural network comprises a first sub-network and a second sub-network which have the same network structure, and any sub-network comprises a primary feature extraction module, a high-level feature extraction module and a coupling feedback module; the method comprises the following steps:
acquiring an under-exposed low-resolution image and an over-exposed low-resolution image;
inputting the underexposed low-resolution image and the overexposed low-resolution image into the primary feature extraction modules in the first sub-network and the second sub-network respectively to generate an underexposed low-level feature and an overexposed low-level feature;
inputting the underexposed low-level features and the overexposed low-level features into the high-level feature extraction modules in the first sub-network and the second sub-network respectively to generate the underexposed high-level features and the overexposed high-level features;
inputting the underexposed low-level features, the underexposed high-level features and the overexposed high-level features into the coupling feedback module in the first sub-network to generate a coupling feedback result corresponding to the first sub-network;
inputting the over-exposed low-level features, the over-exposed high-level features and the under-exposed high-level features into the coupling feedback module in the second sub-network to generate a coupling feedback result corresponding to the second sub-network;
and adjusting parameters of the neural network based on the underexposed low-resolution image, the underexposed high-level features and the coupling feedback results corresponding to the first sub-network, and the overexposed low-resolution image, the overexposed high-level features and the coupling feedback results corresponding to the second sub-network.
2. The method of claim 1, wherein the neural network comprises a plurality of the coupled feedback modules, and each of the coupled feedback modules does not share model parameters.
3. The method of claim 2, wherein each of the coupled feedback modules processes serially;
the inputting the underexposed low-level features, the underexposed high-level features, and the overexposed high-level features into the coupling feedback module in the first sub-network, and the generating of the coupling feedback result corresponding to the first sub-network includes:
inputting the underexposed low-level features, the underexposed high-level features and the overexposed high-level features into a first coupling feedback module in the first sub-network to generate a coupling feedback result corresponding to the first sub-network;
for any subsequent coupling feedback module in the first sub-network except the first coupling feedback module, inputting the underexposure low-level feature, the coupling feedback result of a previous adjacent coupling feedback module of the subsequent coupling feedback module, and the coupling feedback result of the coupling feedback module corresponding to the previous adjacent coupling feedback module in the second sub-network into the subsequent coupling feedback module, and generating a coupling feedback result corresponding to the first sub-network;
the inputting the over-exposed low-level features, the over-exposed high-level features, and the under-exposed high-level features into the coupling feedback module in the second sub-network, and the generating of the coupling feedback result corresponding to the second sub-network includes:
inputting the over-exposed low-level features, the over-exposed high-level features and the under-exposed high-level features into a first coupling feedback module in the second sub-network to generate a coupling feedback result corresponding to the second sub-network;
and for any subsequent coupled feedback module except the first coupled feedback module in the second sub-network, inputting the over-exposed low-level feature, the coupled feedback result of a previous adjacent coupled feedback module of the subsequent coupled feedback module, and the coupled feedback result of the coupled feedback module corresponding to the previous adjacent coupled feedback module in the first sub-network into the subsequent coupled feedback module, and generating the coupled feedback result corresponding to the second sub-network.
4. The method of any one of claims 1 to 3, wherein the coupling feedback module comprises at least two concatenation sub-modules and at least two feature map groups, wherein each feature map group comprises a filter, a deconvolution layer and a convolution layer;
a first one of said concatenation sub-modules precedes all of said feature map groups;
any concatenation sub-module other than the first concatenation sub-module is located between two adjacent feature map groups, and any two such concatenation sub-modules are located at different positions.
5. The method of claim 1, wherein the adjusting parameters of the neural network based on the under-exposed low resolution image, the under-exposed high level features and the coupled feedback results corresponding to the first sub-network, and the over-exposed low resolution image, the over-exposed high level features and the coupled feedback results corresponding to the second sub-network comprises:
respectively carrying out up-sampling operation on the under-exposed low-resolution image and the over-exposed low-resolution image;
adding the image corresponding to the under-exposed high-level features and the image corresponding to the coupling feedback result corresponding to the first sub-network to the up-sampled under-exposed low-resolution image respectively to generate an under-exposed high-resolution image and a fusion exposure high-resolution image corresponding to the first sub-network;
adding the image corresponding to the overexposure high-level feature and the image corresponding to the coupling feedback result corresponding to the second sub-network to the upsampled overexposure low-resolution image respectively to generate an overexposure high-resolution image and a fusion exposure high-resolution image corresponding to the second sub-network;
and adjusting parameters of the neural network based on the underexposed high-resolution image, the fusion exposed high-resolution image corresponding to the first sub-network, the overexposed high-resolution image and the fusion exposed high-resolution image corresponding to the second sub-network.
6. The method of claim 5, wherein adjusting parameters of the neural network based on the under-exposed high resolution image, the fused-exposed high resolution image corresponding to the first sub-network, the over-exposed high resolution image, and the fused-exposed high resolution image corresponding to the second sub-network comprises:
adjusting parameters of the neural network by a loss function as shown in the following equation:
L_total = λ_u·L_SRB^u + λ_o·L_SRB^o + λ_CFB·(L_CFB^u + L_CFB^o)

with L_SRB^u = L_MS(I_SR^u, I_gt^u), L_SRB^o = L_MS(I_SR^o, I_gt^o), L_CFB^u = Σ_{t=1..T} L_MS(I_F^{u,t}, I_gt) and L_CFB^o = Σ_{t=1..T} L_MS(I_F^{o,t}, I_gt),

wherein L_total represents the value of the total loss function; λ_o, λ_u and λ_CFB respectively represent the weights corresponding to each partial loss function value; L_SRB^u and L_CFB^u respectively represent the loss function values corresponding to the high-level feature extraction module and the coupling feedback module in the first sub-network; L_SRB^o and L_CFB^o respectively represent the loss function values corresponding to the high-level feature extraction module and the coupling feedback module in the second sub-network; L_MS represents a loss value between two images determined based on the structural similarity index of the images; I_SR^o and I_gt^o respectively represent the over-exposed high-resolution image and the over-exposed high-resolution reference image; I_SR^u and I_gt^u respectively represent the under-exposed high-resolution image and the under-exposed high-resolution reference image; I_F^{o,t}, I_F^{u,t} and I_gt respectively represent the fusion exposure high-resolution image corresponding to the t-th coupling feedback module of the second sub-network, the fusion exposure high-resolution image corresponding to the t-th coupling feedback module of the first sub-network and the fusion exposure high-resolution reference image; and T represents the number of the coupling feedback modules.
7. An image fusion method, comprising:
acquiring an under-exposed low-resolution image and an over-exposed low-resolution image;
inputting the underexposed low-resolution image and the overexposed low-resolution image into a pre-trained neural network to generate a first fusion exposure high-resolution image and a second fusion exposure high-resolution image; wherein, the neural network is obtained by training the neural network training method for multi-exposure image fusion according to any one of claims 1 to 6;
and generating an image fusion result based on the first fusion exposure high-resolution image and the second fusion exposure high-resolution image.
8. The method of claim 7, wherein generating an image fusion result based on the first and second fused-exposure high-resolution images comprises:
and respectively utilizing a first weight and a second weight to carry out weighted summation processing on the first fusion exposure high-resolution image and the second fusion exposure high-resolution image so as to generate the image fusion result.
9. A neural network training device for multi-exposure image fusion is characterized in that the neural network comprises a first sub-network and a second sub-network which have the same network structure, and any sub-network comprises a primary feature extraction module, a high-level feature extraction module and a coupling feedback module; the device comprises:
an image acquisition unit for acquiring an under-exposed low-resolution image and an over-exposed low-resolution image;
a low-level feature generation unit configured to input the under-exposed low-resolution image and the over-exposed low-resolution image into the primary feature extraction modules in the first sub-network and the second sub-network, respectively, and generate an under-exposed low-level feature and an over-exposed low-level feature;
a high-level feature generation unit, configured to input the underexposed low-level features and the overexposed low-level features into the high-level feature extraction modules in the first subnetwork and the second subnetwork, respectively, and generate underexposed high-level features and overexposed high-level features;
a first coupling feedback result generating unit, configured to input the underexposed low-level features, the underexposed high-level features, and the overexposed high-level features into the coupling feedback module in the first sub-network, and generate a coupling feedback result corresponding to the first sub-network;
a second coupling feedback result generating unit, configured to input the overexposed low-level feature, the overexposed high-level feature, and the underexposed high-level feature into the coupling feedback module in the second sub-network, and generate a coupling feedback result corresponding to the second sub-network;
and the parameter adjusting unit is used for adjusting the parameters of the neural network based on the underexposed low-resolution image, the underexposed high-level feature and the coupling feedback result corresponding to the first sub-network, and the overexposed low-resolution image, the overexposed high-level feature and the coupling feedback result corresponding to the second sub-network.
10. An image fusion apparatus, comprising:
the image acquisition unit is used for acquiring an under-exposed low-resolution image and an over-exposed low-resolution image;
the fusion exposure high-resolution image generating unit is used for inputting the underexposure low-resolution image and the overexposure low-resolution image into a pre-trained neural network to generate a first fusion exposure high-resolution image and a second fusion exposure high-resolution image; wherein, the neural network is obtained by training the neural network training method for multi-exposure image fusion according to any one of claims 1 to 6;
an image fusion result generating unit configured to generate an image fusion result based on the first fusion-exposed high-resolution image and the second fusion-exposed high-resolution image.
11. An electronic device, characterized in that the electronic device comprises:
one or more processors;
a storage device for storing one or more programs,
when executed by the one or more processors, cause the one or more processors to implement the neural network training method for multi-exposure image fusion of any of claims 1-6 or the image fusion method of any of claims 7-8.
12. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the neural network training method for multi-exposure image fusion of any one of claims 1 to 6 or the image fusion method of any one of claims 7 to 8.
CN202010986245.1A 2020-09-18 2020-09-18 Neural network training method, image fusion method, device, equipment and medium Active CN112184550B (en)

Priority Applications (1)

Application Number: CN202010986245.1A (granted as CN112184550B)
Priority Date: 2020-09-18, Filing Date: 2020-09-18
Title: Neural network training method, image fusion method, device, equipment and medium

Publications (2)

Publication Number: CN112184550A (en), Publication Date: 2021-01-05
Publication Number: CN112184550B (en), Publication Date: 2022-11-01

Family

Family ID: 73921653

Family Applications (1)

Application Number: CN202010986245.1A (Active, granted as CN112184550B)
Priority Date: 2020-09-18, Filing Date: 2020-09-18
Title: Neural network training method, image fusion method, device, equipment and medium

Country Status (1)

Country: CN (1), Link: CN112184550B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115103118B * 2022-06-20 2023-04-07 Beihang University High dynamic range image generation method, device, equipment and readable storage medium
CN115100043B * 2022-08-25 2022-11-15 Tianjin University HDR image reconstruction method based on deep learning

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10497105B2 (en) * 2017-11-01 2019-12-03 Google Llc Digital image auto exposure adjustment
CN108805836A * 2018-05-31 2018-11-13 Dalian University of Technology Method for correcting image based on the reciprocating HDR transformation of depth
US11232541B2 (en) * 2018-10-08 2022-01-25 Rensselaer Polytechnic Institute CT super-resolution GAN constrained by the identical, residual and cycle learning ensemble (GAN-circle)
CN110728633B * 2019-09-06 2022-08-02 Shanghai Jiao Tong University Multi-exposure high-dynamic-range inverse tone mapping model construction method and device
CN111246091B * 2020-01-16 2021-09-03 Beijing Megvii Technology Co., Ltd. Dynamic automatic exposure control method and device and electronic equipment

Also Published As

Publication Number: CN112184550A (en), Publication Date: 2021-01-05

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant