CN112184550B - Neural network training method, image fusion method, device, equipment and medium - Google Patents

Neural network training method, image fusion method, device, equipment and medium

Info

Publication number
CN112184550B
CN112184550B CN202010986245.1A
Authority
CN
China
Prior art keywords
network
sub
low
resolution image
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010986245.1A
Other languages
Chinese (zh)
Other versions
CN112184550A (en)
Inventor
邓欣
张雨童
徐迈
段一平
关振宇
李大伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Beihang University
Original Assignee
Tsinghua University
Beihang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University, Beihang University filed Critical Tsinghua University
Priority to CN202010986245.1A priority Critical patent/CN112184550B/en
Publication of CN112184550A publication Critical patent/CN112184550A/en
Application granted granted Critical
Publication of CN112184550B publication Critical patent/CN112184550B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformations in the plane of the image
    • G06T3/40Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4053Scaling of whole images or parts thereof, e.g. expanding or contracting based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformations in the plane of the image
    • G06T3/40Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4046Scaling of whole images or parts thereof, e.g. expanding or contracting using neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Processing (AREA)

Abstract

The disclosure relates to the technical field of image processing, and discloses a neural network training method, an image fusion method, an apparatus, a device, and a medium. The method comprises: designing a neural network comprising a first sub-network and a second sub-network with the same network structure, each sub-network including a primary feature extraction module, a high-level feature extraction module and a coupling feedback module. The primary feature extraction module is used for extracting low-level features of an under-exposed low-resolution image and an over-exposed low-resolution image; the high-level feature extraction module is used for further extracting high-level features of the under-exposed and over-exposed low-resolution images from the corresponding low-level features; and the coupling feedback module cross-fuses the low-level features and the high-level features corresponding to the under-exposed and over-exposed low-resolution images. With this technical scheme, multi-exposure fusion processing and super-resolution processing of images are performed simultaneously by a single neural network, which improves both the image processing speed and the processing accuracy.

Description

Neural network training method, image fusion method, device, equipment and medium
Technical Field
The present disclosure relates to the field of image processing technologies, and in particular, to a neural network training method, an image fusion method, an apparatus, a device, and a medium.
Background
With the development of technology, people are increasingly accustomed to recording moments of their lives in photos. However, due to the hardware limitations of camera sensors, a captured image usually contains various distortions that make it very different from the real natural scene. Compared with real scenes, images taken with a camera tend to have a low dynamic range (LDR) and a low resolution (LR). In order to reduce the difference between the captured image and the real scene, the image needs to be processed.
At present, the problem of the low dynamic range of images is mainly addressed by multi-exposure image fusion (MEF), and the problem of low image resolution is addressed by image super-resolution (ISR). The multi-exposure image fusion technique aims to fuse LDR images with different exposure levels so as to generate an image with a high dynamic range (HDR). The image super-resolution technique aims to reconstruct an LR image into a high-resolution (HR) image.
However, in practice a single captured image often exhibits both the LDR and LR characteristics, while the multi-exposure image fusion technique and the image super-resolution technique are two independent image processing techniques, which means that a captured image has to undergo multi-exposure image fusion processing and image super-resolution processing one after the other. Moreover, the order in which the two techniques are executed may affect the final image processing result. Therefore, the existing image processing method is not only complicated in its processing flow, but also yields an unsatisfactory image processing effect.
Disclosure of Invention
To solve the above technical problems, or to at least partially solve the above technical problems, the present disclosure provides a neural network training method, an image fusion method, an apparatus, a device, and a medium.
In a first aspect, the present disclosure provides a neural network training method for multi-exposure image fusion, where the neural network includes a first sub-network and a second sub-network with the same network structure, and any sub-network includes a primary feature extraction module, a high-level feature extraction module, and a coupling feedback module; the method comprises the following steps:
acquiring an under-exposed low-resolution image and an over-exposed low-resolution image;
inputting the underexposed low-resolution image and the overexposed low-resolution image into the primary feature extraction modules in the first sub-network and the second sub-network respectively to generate an underexposed low-level feature and an overexposed low-level feature;
inputting the underexposed low-level features and the overexposed low-level features into the high-level feature extraction modules in the first sub-network and the second sub-network respectively to generate underexposed high-level features and overexposed high-level features;
inputting the underexposed low-level features, the underexposed high-level features and the overexposed high-level features into the coupling feedback module in the first sub-network to generate a coupling feedback result corresponding to the first sub-network;
inputting the over-exposed low-level features, the over-exposed high-level features and the under-exposed high-level features into the coupling feedback module in the second sub-network to generate a coupling feedback result corresponding to the second sub-network;
and adjusting parameters of the neural network based on the underexposed low-resolution image, the underexposed high-level features and the coupling feedback results corresponding to the first sub-network, and the overexposed low-resolution image, the overexposed high-level features and the coupling feedback results corresponding to the second sub-network.
In some embodiments, the neural network includes a plurality of the coupled feedback modules, and each of the coupled feedback modules does not share model parameters.
In some embodiments, each of the coupled feedback modules processes serially;
the inputting the underexposed low-level features, the underexposed high-level features, and the overexposed high-level features into the coupling feedback module in the first sub-network, and the generating of the coupling feedback result corresponding to the first sub-network includes:
inputting the underexposed low-level features, the underexposed high-level features and the overexposed high-level features into a first coupling feedback module in the first sub-network to generate a coupling feedback result corresponding to the first sub-network;
for any subsequent coupled feedback module in the first sub-network except the first coupled feedback module, inputting the underexposed low-level feature, the coupled feedback result of a previous adjacent coupled feedback module of the subsequent coupled feedback module, and the coupled feedback result of the coupled feedback module corresponding to the previous adjacent coupled feedback module in the second sub-network into the subsequent coupled feedback module, and generating the coupled feedback result corresponding to the first sub-network;
the inputting the overexposed low-level features, the overexposed high-level features, and the underexposed high-level features into the coupling feedback module in the second sub-network, and the generating of the coupling feedback result corresponding to the second sub-network includes:
inputting the over-exposed low-level features, the over-exposed high-level features and the under-exposed high-level features into a first coupling feedback module in the second sub-network to generate a coupling feedback result corresponding to the second sub-network;
and for any subsequent coupled feedback module except the first coupled feedback module in the second sub-network, inputting the over-exposed low-level feature, the coupled feedback result of a previous adjacent coupled feedback module of the subsequent coupled feedback module, and the coupled feedback result of the coupled feedback module corresponding to the previous adjacent coupled feedback module in the first sub-network into the subsequent coupled feedback module, and generating the coupled feedback result corresponding to the second sub-network.
In some embodiments, the coupling feedback module comprises at least two concatenation sub-modules and at least two feature map groups, wherein each feature map group comprises a filter, a deconvolution layer, and a convolution layer;
the first concatenation sub-module precedes all of the feature map groups;
each concatenation sub-module other than the first is located between two adjacent feature map groups, and no two of these sub-modules are located at the same position.
In some embodiments, said adjusting parameters of said neural network based on said underexposed low resolution image, said underexposed high level features and said coupled feedback results corresponding to said first sub-network, and said overexposed low resolution image, said overexposed high level features and said coupled feedback results corresponding to said second sub-network comprises:
respectively carrying out up-sampling operation on the under-exposed low-resolution image and the over-exposed low-resolution image;
adding the image corresponding to the under-exposed high-level features and the image corresponding to the coupling feedback result of the first sub-network to the up-sampled under-exposed low-resolution image, respectively, to generate an under-exposed high-resolution image and a fusion-exposed high-resolution image corresponding to the first sub-network;
adding the image corresponding to the overexposure high-level feature and the image corresponding to the coupling feedback result corresponding to the second sub-network to the upsampled overexposure low-resolution image respectively to generate an overexposure high-resolution image and a fusion exposure high-resolution image corresponding to the second sub-network;
and adjusting parameters of the neural network based on the underexposed high-resolution image, the fusion exposed high-resolution image corresponding to the first sub-network, the overexposed high-resolution image and the fusion exposed high-resolution image corresponding to the second sub-network.
In some embodiments, the parameters of the neural network are adjusted by a loss function as shown in the following equation:

L_total = λ_o · L_o^SRB + λ_u · L_u^SRB + λ_F · Σ_{t=1}^{T} ( L_o^(CFB,t) + L_u^(CFB,t) )

wherein L_total represents the value of the total loss function; λ_o, λ_u and λ_F respectively represent the weights corresponding to each partial loss function value; L_u^SRB and L_u^(CFB,t) respectively represent the loss function values corresponding to the high-level feature extraction module and the coupling feedback module in the first sub-network; L_o^SRB and L_o^(CFB,t) respectively represent the loss function values corresponding to the high-level feature extraction module and the coupling feedback module in the second sub-network; L_MS represents a loss value between two images determined based on the structural similarity index of the images, so that L_o^SRB = L_MS(Î_o^HR, I_o^gt), L_u^SRB = L_MS(Î_u^HR, I_u^gt), L_o^(CFB,t) = L_MS(Î_o^(F,t), I^gt) and L_u^(CFB,t) = L_MS(Î_u^(F,t), I^gt); Î_o^HR and I_o^gt respectively represent the over-exposed high-resolution image and the over-exposed high-resolution reference image; Î_u^HR and I_u^gt respectively represent the under-exposed high-resolution image and the under-exposed high-resolution reference image; Î_o^(F,t), Î_u^(F,t) and I^gt respectively represent the fusion-exposed high-resolution image corresponding to the t-th coupling feedback module of the second sub-network, the fusion-exposed high-resolution image corresponding to the t-th coupling feedback module of the first sub-network, and the fusion-exposed high-resolution reference image; and T represents the number of the coupling feedback modules.
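For illustration only, the following sketch shows how such a hierarchical, MS-SSIM-based loss could be computed in PyTorch, assuming the third-party pytorch_msssim package; the function and argument names (total_loss, lambda_f, etc.) are illustrative and not taken from the patent.

```python
from pytorch_msssim import ms_ssim  # assumed third-party MS-SSIM implementation


def ms_loss(pred, target):
    # L_MS: structural-similarity-based loss between two images (here 1 - MS-SSIM).
    return 1.0 - ms_ssim(pred, target, data_range=1.0)


def total_loss(sr_u, gt_u, sr_o, gt_o, fused_u_list, fused_o_list, gt_fused,
               lambda_u=1.0, lambda_o=1.0, lambda_f=1.0):
    # SRB losses for the under-exposed and over-exposed branches.
    loss = lambda_u * ms_loss(sr_u, gt_u) + lambda_o * ms_loss(sr_o, gt_o)
    # CFB losses: one fused-exposure HR prediction per coupled feedback module (t = 1..T).
    for fused_u, fused_o in zip(fused_u_list, fused_o_list):
        loss = loss + lambda_f * (ms_loss(fused_u, gt_fused) + ms_loss(fused_o, gt_fused))
    return loss
```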
In a second aspect, the present disclosure provides an image fusion method, including:
acquiring an under-exposed low-resolution image and an over-exposed low-resolution image;
inputting the underexposed low-resolution image and the overexposed low-resolution image into a pre-trained neural network to generate a first fusion exposure high-resolution image and a second fusion exposure high-resolution image; the neural network is obtained by training through a neural network training method for multi-exposure image fusion in any embodiment of the disclosure;
and generating an image fusion result based on the first fusion exposure high-resolution image and the second fusion exposure high-resolution image.
In some embodiments, the generating an image fusion result based on the first and second fused-exposure high-resolution images comprises:
and respectively utilizing a first weight and a second weight to carry out weighted summation processing on the first fusion exposure high-resolution image and the second fusion exposure high-resolution image so as to generate an image fusion result.
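As a minimal illustration of this weighted summation (the weight values are placeholders, not prescribed by the patent):

```python
def fuse_outputs(fused_hr_1, fused_hr_2, w1=0.5, w2=0.5):
    # Weighted summation of the two fusion-exposed high-resolution images.
    return w1 * fused_hr_1 + w2 * fused_hr_2
```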
In a third aspect, the present disclosure provides a neural network training device for multi-exposure image fusion, where the neural network includes a first sub-network and a second sub-network with the same network structure, and any sub-network includes a primary feature extraction module, a high-level feature extraction module, and a coupling feedback module; the device includes:
the image acquisition unit is used for acquiring an under-exposed low-resolution image and an over-exposed low-resolution image;
a low-level feature generation unit, configured to input the under-exposed low-resolution image and the over-exposed low-resolution image into the primary feature extraction modules in the first sub-network and the second sub-network, respectively, and generate an under-exposed low-level feature and an over-exposed low-level feature;
a high-level feature generation unit configured to input the underexposed low-level features and the overexposed low-level features into the high-level feature extraction modules in the first sub-network and the second sub-network, respectively, and generate the underexposed high-level features and the overexposed high-level features;
a first coupling feedback result generating unit, configured to input the underexposed low-level features, the underexposed high-level features, and the overexposed high-level features into the coupling feedback module in the first sub-network, and generate a coupling feedback result corresponding to the first sub-network;
a second coupling feedback result generating unit, configured to input the overexposed low-level feature, the overexposed high-level feature, and the underexposed high-level feature into the coupling feedback module in the second sub-network, and generate a coupling feedback result corresponding to the second sub-network;
and the parameter adjusting unit is used for adjusting the parameters of the neural network based on the underexposed low-resolution image, the underexposed high-level feature and the coupling feedback result corresponding to the first sub-network, and the overexposed low-resolution image, the overexposed high-level feature and the coupling feedback result corresponding to the second sub-network.
In some embodiments, the neural network includes a plurality of coupled feedback modules, and the coupled feedback modules do not share model parameters.
In some embodiments, each coupled feedback module processes serially;
correspondingly, the first coupling feedback result generating unit is specifically configured to:
inputting the underexposure low-level features, the underexposure high-level features and the overexposure high-level features into a first coupling feedback module in a first sub-network to generate a coupling feedback result corresponding to the first sub-network;
for any subsequent coupled feedback module except the first coupled feedback module in the first sub-network, inputting the underexposed low-level feature, the coupled feedback result of a previous adjacent coupled feedback module of the subsequent coupled feedback module, and the coupled feedback result of the coupled feedback module corresponding to the previous adjacent coupled feedback module in the second sub-network into the subsequent coupled feedback module to generate the coupled feedback result corresponding to the first sub-network;
correspondingly, the second coupling feedback result generating unit is specifically configured to:
inputting the overexposure low-level features, the overexposure high-level features and the underexposure high-level features into a first coupling feedback module in a second sub-network to generate a coupling feedback result corresponding to the second sub-network;
and inputting the over-exposure low-level feature, the coupling feedback result of a previous adjacent coupling feedback module of the subsequent coupling feedback module and the coupling feedback result of the coupling feedback module corresponding to the previous adjacent coupling feedback module in the first sub-network into the subsequent coupling feedback module aiming at any subsequent coupling feedback module except the first coupling feedback module in the second sub-network, and generating the coupling feedback result corresponding to the second sub-network.
In some embodiments, the coupling feedback module comprises at least two concatenation sub-modules and at least two feature map groups, wherein each feature map group comprises a filter, a deconvolution layer and a convolution layer;
the first concatenation sub-module precedes all of the feature map groups;
each concatenation sub-module other than the first is located between two adjacent feature map groups, and no two of these sub-modules are located at the same position.
In some embodiments, the parameter adjusting unit is specifically configured to:
respectively carrying out up-sampling operation on the underexposed low-resolution image and the overexposed low-resolution image;
adding the image corresponding to the under-exposed high-level features and the image corresponding to the coupling feedback result of the first sub-network to the up-sampled under-exposed low-resolution image, respectively, to generate an under-exposed high-resolution image and a fusion-exposed high-resolution image corresponding to the first sub-network;
adding the image corresponding to the over-exposed high-level features and the image corresponding to the coupling feedback result corresponding to the second sub-network to the up-sampled over-exposed low-resolution image respectively to generate an over-exposed high-resolution image and a fusion exposure high-resolution image corresponding to the second sub-network;
and adjusting parameters of the neural network based on the underexposed high-resolution image, the fusion exposed high-resolution image corresponding to the first sub-network, the overexposed high-resolution image and the fusion exposed high-resolution image corresponding to the second sub-network.
Further, the parameter adjusting unit is specifically configured to:
parameters of the neural network are adjusted by a loss function as shown in the following equation:
L_total = λ_o · L_o^SRB + λ_u · L_u^SRB + λ_F · Σ_{t=1}^{T} ( L_o^(CFB,t) + L_u^(CFB,t) )

wherein L_total represents the value of the total loss function; λ_o, λ_u and λ_F respectively represent the weights corresponding to each partial loss function value; L_u^SRB and L_u^(CFB,t) respectively represent the loss function values corresponding to the high-level feature extraction module and the coupling feedback module in the first sub-network; L_o^SRB and L_o^(CFB,t) respectively represent the loss function values corresponding to the high-level feature extraction module and the coupling feedback module in the second sub-network; L_MS represents a loss value between two images determined based on the structural similarity index of the images; Î_o^HR and I_o^gt respectively represent the over-exposed high-resolution image and the over-exposed high-resolution reference image; Î_u^HR and I_u^gt respectively represent the under-exposed high-resolution image and the under-exposed high-resolution reference image; Î_o^(F,t), Î_u^(F,t) and I^gt respectively represent the fusion-exposed high-resolution image corresponding to the t-th coupling feedback module of the second sub-network, the fusion-exposed high-resolution image corresponding to the t-th coupling feedback module of the first sub-network, and the fusion-exposed high-resolution reference image; and T represents the number of the coupling feedback modules.
In a fourth aspect, the present disclosure provides an image fusion apparatus, comprising:
an image acquisition unit for acquiring an under-exposed low-resolution image and an over-exposed low-resolution image;
the fusion exposure high-resolution image generating unit is used for inputting the underexposure low-resolution image and the overexposure low-resolution image into a pre-trained neural network to generate a first fusion exposure high-resolution image and a second fusion exposure high-resolution image; wherein the neural network is trained by any embodiment of the neural network training method for multi-exposure image fusion in the disclosure;
an image fusion result generating unit configured to generate an image fusion result based on the first fusion-exposed high-resolution image and the second fusion-exposed high-resolution image.
In some embodiments, the image fusion result generating unit is specifically configured to:
and respectively utilizing the first weight and the second weight to carry out weighted summation processing on the first fusion exposure high-resolution image and the second fusion exposure high-resolution image so as to generate an image fusion result.
In a fifth aspect, the present disclosure provides an electronic device, including:
one or more processors;
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the neural network training method for multi-exposure image fusion or the image fusion method according to any of the above embodiments.
In a sixth aspect, the present disclosure provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements any of the above-described neural network training methods for multi-exposure image fusion or image fusion methods.
According to the technical scheme provided by the embodiments of the present disclosure, a neural network is designed that comprises a first sub-network and a second sub-network with the same network structure, each sub-network including a primary feature extraction module, a high-level feature extraction module and a coupling feedback module. The primary feature extraction module extracts low-level features of an under-exposed low-resolution image and an over-exposed low-resolution image, and the high-level feature extraction module further extracts high-level features of the two images from the corresponding low-level features, preliminarily mapping the low-resolution images into high-resolution features. The coupling feedback module then cross-fuses the low-level features and the high-level features corresponding to the under-exposed and over-exposed low-resolution images, thereby realizing multi-exposure fusion of the over-exposed and under-exposed images and further improving the image resolution, so that an image with both high resolution and a high dynamic range is obtained. In this way, multi-exposure fusion processing and super-resolution processing are performed on the images simultaneously, which simplifies the processing flow of captured images, increases the image processing speed, and further improves the image processing accuracy by exploiting the complementary characteristics of multi-exposure fusion and super-resolution.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the disclosure.
In order to more clearly illustrate the embodiments of the present disclosure or the technical solutions in the prior art, the drawings required in the description of the embodiments or the prior art will be briefly introduced below; it is obvious that other drawings can be obtained by those skilled in the art from these drawings without inventive effort.
Fig. 1 is a network architecture diagram of a neural network provided by an embodiment of the present disclosure;
FIG. 2 is a network architecture diagram of a high-level feature extraction module in a neural network provided by an embodiment of the present disclosure;
FIG. 3 is a network architecture diagram of a coupled feedback module in a neural network provided by an embodiment of the present disclosure;
FIG. 4 is a network architecture diagram of a neural network for neural network training provided by an embodiment of the present disclosure;
FIG. 5 is a flowchart of a neural network training method for multi-exposure image fusion provided by an embodiment of the present disclosure;
FIG. 6 is a flowchart of an image fusion method provided by an embodiment of the present disclosure;
FIG. 7 is a schematic structural diagram of a neural network training apparatus for multi-exposure image fusion according to an embodiment of the present disclosure;
fig. 8 is a schematic structural diagram of an image fusion apparatus provided in an embodiment of the present disclosure;
fig. 9 is a schematic structural diagram of an electronic device provided in an embodiment of the present disclosure.
Detailed Description
In order that the above objects, features and advantages of the present disclosure may be more clearly understood, aspects of the present disclosure will be described in further detail below. It should be noted that, in the case of no conflict, the embodiments and features in the embodiments of the present disclosure may be combined with each other.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present disclosure, but the present disclosure may be practiced in other ways than those described herein; it is to be understood that the embodiments disclosed in the specification are only a few embodiments of the present disclosure, and not all embodiments.
The neural network training scheme provided by the embodiments of the present disclosure can be applied to application scenarios in which images having both the low-dynamic-range and low-resolution characteristics are fused, and is particularly suitable for scenarios in which image fusion is performed on an over-exposed low-resolution image and an under-exposed low-resolution image.
Fig. 1 is a block diagram of the network structure of a neural network for image fusion according to an embodiment of the present disclosure. As shown in Fig. 1, the neural network includes a first sub-network 110 and a second sub-network 120 that have the same network structure but do not share model parameters. The first sub-network 110 includes a primary Feature Extraction Block (FEB) 111, a Super-Resolution Block (SRB, i.e., the high-level feature extraction module) 112, and a Coupled Feedback Block (CFB) 113. The second sub-network 120 includes a primary feature extraction module 121, a high-level feature extraction module 122 and a coupling feedback module 123. The number of coupled feedback modules 113 in the first sub-network 110 equals the number of coupled feedback modules 123 in the second sub-network 120, and is equal to or greater than 1. The input data of the neural network are an over-exposed low-resolution image and an under-exposed low-resolution image; each of the two input images only needs to be input into one of the sub-networks, and the specific correspondence is not limited. In the embodiment of the present disclosure, the under-exposed low-resolution image is input into the first sub-network 110, and the over-exposed low-resolution image is input into the second sub-network 120. The FEB and the SRB are used to extract high-level features from the input image, which helps to enhance the image resolution; the CFB is located behind the SRB and absorbs the features learned by the SRBs of both sub-networks, so as to fuse an image with both high resolution (HR) and a high dynamic range (HDR).
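For illustration only, the following PyTorch sketch shows one possible way to lay out this two-sub-network structure (FEB, then SRB, then T serially coupled CFBs); the FEB, SRB and CFB classes are sketched later in this description, and all class and variable names are illustrative, not taken from the patent.

```python
import torch.nn as nn


class CoupledFusionNet(nn.Module):
    """Two sub-networks with identical structure but unshared parameters."""

    def __init__(self, num_cfb=3, channels=64):
        super().__init__()
        self.feb_u, self.feb_o = FEB(channels), FEB(channels)   # primary feature extraction
        self.srb_u, self.srb_o = SRB(channels), SRB(channels)   # high-level feature extraction
        self.cfb_u = nn.ModuleList(CFB(channels) for _ in range(num_cfb))
        self.cfb_o = nn.ModuleList(CFB(channels) for _ in range(num_cfb))

    def forward(self, img_u_lr, img_o_lr):
        f_u, f_o = self.feb_u(img_u_lr), self.feb_o(img_o_lr)   # low-level features
        g_u, g_o = self.srb_u(f_u), self.srb_o(f_o)             # high-level features
        prev_u, prev_o = g_u, g_o
        results_u, results_o = [], []
        for cfb_u, cfb_o in zip(self.cfb_u, self.cfb_o):        # serial coupled feedback
            out_u = cfb_u(f_u, prev_u, prev_o)
            out_o = cfb_o(f_o, prev_o, prev_u)
            results_u.append(out_u)
            results_o.append(out_o)
            prev_u, prev_o = out_u, out_o
        return results_u, results_o
```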
The primary feature extraction module 111 and the primary feature extraction module 121 are respectively used for extracting low-level features from the input under-exposed low-resolution image I_u^LR and the input over-exposed low-resolution image I_o^LR, so as to obtain the corresponding under-exposed low-level feature F_u and over-exposed low-level feature F_o. The primary feature extraction process can be expressed by the formulas:

F_u = f_FEB(I_u^LR) and F_o = f_FEB(I_o^LR),

where f_FEB(·) represents the operation of the primary feature extraction module. In some embodiments, f_FEB(·) comprises convolutional layers with a series of 3 × 3 and 1 × 1 convolution kernels.
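A minimal sketch of such a primary feature extraction block in PyTorch is given below; the channel counts and activation choice are assumptions for illustration, not values specified by the patent.

```python
import torch.nn as nn


class FEB(nn.Module):
    def __init__(self, channels=64, in_channels=3):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_channels, 4 * channels, kernel_size=3, padding=1),  # 3x3 feature extraction
            nn.PReLU(),
            nn.Conv2d(4 * channels, channels, kernel_size=1),                # 1x1 channel compression
            nn.PReLU(),
        )

    def forward(self, img_lr):
        return self.body(img_lr)  # low-level feature F
```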
The high-level feature extraction module 112 and the high-level feature extraction module 122 are respectively used for performing further feature extraction on the input under-exposed low-level feature F_u and over-exposed low-level feature F_o, so as to extract high-level features of the under-exposed low-resolution image I_u^LR and the over-exposed low-resolution image I_o^LR and obtain the under-exposed high-level feature G_u and the over-exposed high-level feature G_o. Because the high-level features contain higher-level semantic information that better represents small and complex targets in the image and thus enriches the detail information of the image, the under-exposed high-level feature G_u and the over-exposed high-level feature G_o can improve the resolution of the corresponding images and realize the super-resolution effect. In some embodiments, referring to Fig. 2, the feedback block of the SRFBN network is used as the main structure of the high-level feature extraction module 112 (122); it comprises a plurality of feature map groups 210 connected in series in a densely connected manner. Each feature map group 210 contains at least one up-sampling operation (Deconv) and one down-sampling operation (Conv). Through this continuous up- and down-sampling, higher-level features G are gradually extracted from the low-level input feature F_in while the feature size is kept unchanged, so as to improve the resolution of the image. The high-level feature extraction process can be expressed by the formulas:

G_u = f_SRB(F_u) and G_o = f_SRB(F_o),

where f_SRB(·) represents the operation of the high-level feature extraction module.
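The sketch below illustrates one feature map group (a Deconv up-sampling followed by a Conv down-sampling) and a simplified SRB built from several such groups; it assumes PyTorch, the dense connections of SRFBN are simplified to a final 1 × 1 fusion of all group outputs, and the kernel/stride values are illustrative assumptions.

```python
import torch
import torch.nn as nn


class FeatureMapGroup(nn.Module):
    """One Deconv (up-sampling) + Conv (down-sampling) pair."""

    def __init__(self, channels=64, scale=4):
        super().__init__()
        k, s, p = scale + 4, scale, 2  # e.g. kernel 8, stride 4, padding 2 for 4x scaling
        self.up = nn.ConvTranspose2d(channels, channels, k, stride=s, padding=p)
        self.down = nn.Conv2d(channels, channels, k, stride=s, padding=p)

    def forward(self, lr_feat):
        hr_feat = self.up(lr_feat)      # up-sample to the HR feature size
        return self.down(hr_feat)       # back to the LR size with richer features


class SRB(nn.Module):
    def __init__(self, channels=64, num_groups=6, scale=4):
        super().__init__()
        self.groups = nn.ModuleList(FeatureMapGroup(channels, scale) for _ in range(num_groups))
        # Dense connections are simplified here to a final 1x1 fusion of all group outputs.
        self.fuse = nn.Conv2d(num_groups * channels, channels, kernel_size=1)

    def forward(self, f_in):
        outs, x = [], f_in
        for group in self.groups:
            x = group(x)
            outs.append(x)
        return self.fuse(torch.cat(outs, dim=1))  # high-level feature G
```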
The coupled feedback module CFB is the core component of the neural network, and aims to realize super-resolution and multi-exposure image fusion simultaneously through a composite network structure. The CFB has three inputs: the low-level feature and the high-level feature from the same sub-network, and the high-level feature from the other sub-network. The two inputs from the same sub-network are used to further improve the image resolution and enhance the super-resolution effect of the image, while the input from the other sub-network serves to improve the image fusion effect and realize multi-exposure image fusion.
In some embodiments, each sub-network of the neural network contains one coupled feedback module CFB. In this case, the coupling feedback module 113 is used to fuse the input under-exposed low-level feature F_u, the under-exposed high-level feature G_u and the over-exposed high-level feature G_o, and generates the coupled feedback result corresponding to the first sub-network 110. The coupling feedback module 123 is used to fuse the input over-exposed low-level feature F_o, the over-exposed high-level feature G_o and the under-exposed high-level feature G_u, and generates the coupled feedback result corresponding to the second sub-network 120. The coupled feedback results are image features in which multi-exposure fusion and super-resolution are realized simultaneously.
In some embodiments, multiple coupled feedback modules CFB are included in each sub-network in the neural network, and the multiple CFBs are in a parallel processing fashion. In this embodiment, the input data of each CFB in the same sub-network is the same, and the output coupled feedback results need to be further fused (such as weighted summation, etc.), so as to obtain a coupled feedback result.
In some embodiments, each sub-network in the neural network contains a plurality of coupled feedback modules CFB, and the CFBs are connected serially in a loop, as shown in Fig. 1. In this embodiment, assuming that there are T CFBs in each sub-network, the coupled feedback results corresponding to the first sub-network 110 are generated as follows. The under-exposed low-level feature F_u, the under-exposed high-level feature G_u and the over-exposed high-level feature G_o are input into the first coupled feedback module 113 in the first sub-network 110 to generate the coupled feedback result F_u^1 corresponding to the first sub-network. For any subsequent coupled feedback module 113 (numbered t) in the first sub-network other than the first one, the under-exposed low-level feature F_u, the coupled feedback result F_u^(t-1) of its preceding adjacent coupled feedback module (numbered t-1), and the coupled feedback result F_o^(t-1) of the corresponding coupled feedback module in the second sub-network 120 are input into that coupled feedback module 113 to generate the coupled feedback result F_u^t corresponding to the first sub-network 110. Proceeding in this way through all CFBs, the final coupled feedback result F_u^T corresponding to the first sub-network 110 is obtained.
Similarly, the coupled feedback results corresponding to the second sub-network 120 are generated as follows. The over-exposed low-level feature F_o, the over-exposed high-level feature G_o and the under-exposed high-level feature G_u are input into the first coupled feedback module 123 in the second sub-network 120 to generate the coupled feedback result F_o^1 corresponding to the second sub-network 120. For any subsequent coupled feedback module 123 (numbered t) in the second sub-network 120 other than the first one, the over-exposed low-level feature F_o, the coupled feedback result F_o^(t-1) of its preceding adjacent coupled feedback module, and the coupled feedback result F_u^(t-1) of the corresponding coupled feedback module in the first sub-network 110 are input into that coupled feedback module 123 to generate the coupled feedback result F_o^t corresponding to the second sub-network 120. Proceeding in this way through all CFBs, the final coupled feedback result F_o^T corresponding to the second sub-network 120 is obtained.
The above process can be expressed by the formulas:

F_u^t = f_CFB(F_u, F_u^(t-1), F_o^(t-1)) and F_o^t = f_CFB(F_o, F_o^(t-1), F_u^(t-1)),

where f_CFB(·) represents the operation of the coupled feedback module (for t = 1, the high-level features G_u and G_o take the place of the previous coupled feedback results).
In some embodiments, the number of coupling feedback modules 113 and 123 is three. This can better balance the computation speed and model accuracy of the neural network.
In some embodiments, the internal network structure of every coupled feedback module CFB is the same, but model parameters are not shared. Referring to Fig. 3, the structure of the t-th coupled feedback module 123 in the second sub-network 120 and its relation to other modules are illustrated. The coupling feedback module 123 includes at least two concatenation sub-modules 310 and at least two feature map groups 320. As in Fig. 2, the feature map groups 320 are densely connected, and each feature map group 320 includes a filter, a deconvolution layer (Deconv) and a convolution layer (Conv), implementing successive up-sampling and down-sampling. The first concatenation sub-module 310 precedes all feature map groups 320; each concatenation sub-module 310 other than the first is located between two adjacent feature map groups 320, and no two of them are located at the same position.
The t-th CFB has three inputs: the over-exposed low-level feature F_o, the coupled feedback result F_o^(t-1) extracted by the (t-1)-th CFB, and the coupled feedback result F_u^(t-1) extracted by the (t-1)-th CFB in the first sub-network 110. The feedback feature F_o^(t-1) is feedback information obtained from the same sub-network, so its main function is to correct the over-exposed low-level feature F_o and thereby further improve the super-resolution effect; the feedback feature F_u^(t-1), on the other hand, is feedback information from the other sub-network, whose main function is to bring complementary information that improves the effect of multi-exposure image fusion.
The processing procedure of the t-th coupling feedback module 123 is as follows. First, the three input features are concatenated along the channel dimension using a concatenation sub-module 310. The concatenation result is then fused with a series of 1 × 1 filters:

L_o^(t,0) = M_in([F_o, F_o^(t-1), F_u^(t-1)]),

where L_o^(t,0) represents the low-resolution feature obtained by filter fusion of the three input features, M_in(·) denotes a series of 1 × 1 filters, and [·] denotes concatenation of its internal elements. Then, based on the fused feature L_o^(t,0), the up-sampling (Deconv) and down-sampling (Conv) operations are repeated using a series of feature map groups 320; each up-sampling yields a high-resolution feature and each down-sampling yields a low-resolution feature, so that more effective high-level features are progressively extracted.
As explained above, the function of F_u^(t-1) during the operation of the feature map groups 320 is to bring complementary information that enhances the effect of multi-exposure image fusion. However, as the number of feature map groups increases, the information of F_u^(t-1) is gradually forgotten inside the module, its influence gradually decreases, and the subsequent fusion effect deteriorates. Therefore, in order to strengthen the influence of F_u^(t-1) on the network, in addition to using F_u^(t-1) as an input of each CFB, it is re-injected between the feature map groups 320 to reactivate the memory of the CFB module; that is, an additional concatenation sub-module 310 is added between the feature map groups of the CFB, and the input of this added concatenation sub-module 310 contains F_u^(t-1). In one embodiment, at least two concatenation sub-modules 310 are provided in each CFB, and the concatenation sub-modules other than the first one are arranged between different feature map groups 320. If there is no requirement on the operation speed of the neural network, more than two concatenation sub-modules 310 may be provided, and a concatenation sub-module 310 may even be added between every two feature map groups 320, which further improves the fusion effect. If there are high requirements on both the operation speed and the accuracy of the neural network, only two concatenation sub-modules 310 may be provided in order to balance speed and accuracy, with the second concatenation sub-module 310 arranged at the middle position of the feature map groups 320. For example, assuming that the total number of feature map groups is N, the low-resolution feature output by the ⌊N/2⌋-th feature map group and the feedback feature F_u^(t-1) are combined to form a new low-resolution (LR) feature map, where ⌊·⌋ indicates the rounding-down operation. The new low-resolution feature map replaces the original output of the ⌊N/2⌋-th feature map group as the input feature of the subsequent feature map groups.
Finally, after the N feature map groups 320 have been operated, the LR feature maps produced by the feature map groups 320 are aggregated together and fused by a series of 1 × 1 filters to obtain the final output result of the CFB:

F_o^t = M_out([L_o^(t,1), L_o^(t,2), ..., L_o^(t,N)]),

where M_out(·) denotes the operation of convolution with a series of 1 × 1 filters.
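The sketch below illustrates one possible PyTorch implementation of a coupled feedback block along these lines: the three inputs are concatenated and fused by 1 × 1 convolutions (M_in), passed through N feature map groups, the feedback feature from the other sub-network is re-injected at the middle group, and the LR outputs of all groups are aggregated by 1 × 1 convolutions (M_out). FeatureMapGroup is taken from the SRB sketch above; all hyper-parameters and names are illustrative assumptions.

```python
import torch
import torch.nn as nn


class CFB(nn.Module):
    def __init__(self, channels=64, num_groups=6, scale=4):
        super().__init__()
        self.m_in = nn.Conv2d(3 * channels, channels, kernel_size=1)    # fuse the first concatenation
        self.groups = nn.ModuleList(FeatureMapGroup(channels, scale) for _ in range(num_groups))
        self.mid = num_groups // 2                                       # position of the second concatenation
        self.m_mid = nn.Conv2d(2 * channels, channels, kernel_size=1)    # fuse the re-injected feedback
        self.m_out = nn.Conv2d(num_groups * channels, channels, kernel_size=1)

    def forward(self, f_low, fb_same, fb_other):
        # f_low: low-level feature of this sub-network; fb_same / fb_other: previous coupled
        # feedback results of this and the other sub-network (high-level features for the first CFB).
        x = self.m_in(torch.cat([f_low, fb_same, fb_other], dim=1))
        lr_feats = []
        for n, group in enumerate(self.groups, start=1):
            x = group(x)
            if n == self.mid:  # re-activate the memory of the other sub-network's feedback
                x = self.m_mid(torch.cat([x, fb_other], dim=1))
            lr_feats.append(x)
        return self.m_out(torch.cat(lr_feats, dim=1))  # coupled feedback result
```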
In some embodiments, the first sub-network 110 and the second sub-network 120 respectively include an image reconstruction module (REC) 114 and an image reconstruction module 124, which reconstruct the coupled feedback results (features) obtained by the at least one CFB into images. With multiple CFBs, multiple reconstructed images can thus be obtained. On this basis, the original input images of the neural network can be further fused in to obtain the first fusion-exposed high-resolution image Î_u^F and the second fusion-exposed high-resolution image Î_o^F. Each fusion-exposed high-resolution image has both the high-dynamic-range (HDR) and the high-resolution (HR) characteristics. It should be noted that, in the embodiment with multiple serially connected CFBs, every CFB outputs a coupled feedback result; however, since the serial feedback processing of the CFBs gradually improves the image fusion and super-resolution effects, the coupled feedback result obtained by the last CFB has the best overall effect. Accordingly, when obtaining the first fusion-exposed high-resolution image Î_u^F and the second fusion-exposed high-resolution image Î_o^F, the reconstructed image corresponding to the coupled feedback result of the last CFB in each sub-network is taken as one of the inputs. In addition, the feature size of the coupled feedback result is larger than the image size of the original input image, so the original input image is enlarged by an up-sampling operation such as bicubic interpolation, and the up-sampling result is taken as the other input to obtain the fusion-exposed high-resolution image. The above process can be expressed by the formulas:

Î_u^F = f_REC(F_u^T) + f_UP(I_u^LR) and Î_o^F = f_REC(F_o^T) + f_UP(I_o^LR),

where f_UP(·) and f_REC(·) respectively represent the up-sampling operation and the image reconstruction operation.
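A minimal sketch of this reconstruction step in PyTorch is shown below: the coupled feedback result is mapped back to an image (f_REC) and added to a bicubic up-sampling of the original LR input (f_UP); layer sizes and names are illustrative assumptions.

```python
import torch.nn as nn
import torch.nn.functional as F


class REC(nn.Module):
    def __init__(self, channels=64, out_channels=3, scale=4):
        super().__init__()
        k, s, p = scale + 4, scale, 2
        self.up = nn.ConvTranspose2d(channels, channels, k, stride=s, padding=p)  # feature up-sampling
        self.to_img = nn.Conv2d(channels, out_channels, kernel_size=3, padding=1)
        self.scale = scale

    def forward(self, feedback_result, img_lr):
        residual = self.to_img(self.up(feedback_result))                 # f_REC: reconstruct an image
        base = F.interpolate(img_lr, scale_factor=self.scale,            # f_UP: bicubic up-sampling
                             mode='bicubic', align_corners=False)
        return base + residual                                           # fusion-exposed HR image
```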
Based on the above description, the parameter settings of each part of the neural network provided by the embodiments of the present disclosure may be exemplified as follows:
(table of example parameter settings for the individual modules)
Fig. 4 is a network architecture diagram of a neural network for neural network training provided by an embodiment of the present disclosure. Based on the architecture of the neural network in Fig. 1, the neural network contains multi-level features, such as the low-level features, the high-level features and at least one coupled feedback result (feature), all of which are used to realize multi-exposure image fusion and super-resolution simultaneously. Therefore, in order to ensure the effectiveness of each obtained feature, a hierarchical loss function constraint is adopted in the neural network training process. Since the hierarchical loss function requires the image output of each layer to be calculated, the network architecture for neural network training in Fig. 4 adds several image-output branches compared with the network architecture for image fusion prediction in Fig. 1: for example, it outputs the over-exposed high-resolution image Î_o^HR and the under-exposed high-resolution image Î_u^HR corresponding to the high-level features, outputs the fusion-exposed high-resolution images Î_u^(F,1), ..., Î_u^(F,T-1) corresponding to the other coupled feedback modules in the first sub-network 110, and outputs the fusion-exposed high-resolution images Î_o^(F,1), ..., Î_o^(F,T-1) corresponding to the other coupled feedback modules in the second sub-network 120.
fig. 5 is a flowchart of a neural network training method for multi-exposure image fusion according to an embodiment of the present disclosure. The neural network training method for multi-exposure image fusion is implemented based on the neural network architecture in fig. 4, wherein the same or corresponding explanations as those in the above embodiments are not repeated herein. The neural network training method for multi-exposure image fusion provided by the embodiment of the disclosure may be executed by a neural network training apparatus for multi-exposure image fusion, which may be implemented by software and/or hardware, and may be integrated in an electronic device with certain computing capability, such as a notebook computer, a desktop computer, a server or a super computer. Referring to fig. 5, the neural network training method for multi-exposure image fusion specifically includes:
s110, acquiring an under-exposed low-resolution image and an over-exposed low-resolution image.
Specifically, the whole neural network training process requires multiple rounds of network training; one training image group is acquired for each round, so multiple training image groups are needed for the whole training process, and the training procedure is the same for each group. In this embodiment, only one training round is described. A training image group comprises an under-exposed low-resolution image I_u^LR and an over-exposed low-resolution image I_o^LR. The under-exposed low-resolution image is an image whose exposure is less than a first preset exposure threshold and whose image resolution is less than a preset resolution threshold. The over-exposed low-resolution image is an image whose exposure is greater than a second preset exposure threshold and whose image resolution is less than the preset resolution threshold. Here, the first preset exposure threshold is smaller than the second preset exposure threshold, and the first preset exposure threshold, the second preset exposure threshold and the preset resolution threshold are a predetermined exposure level and a predetermined image resolution, respectively.
And S120, inputting the underexposed low-resolution image and the overexposed low-resolution image into the primary feature extraction modules in the first sub-network and the second sub-network respectively to generate the underexposed low-level features and the overexposed low-level features.
Specifically, the under-exposed low-resolution image I_u^LR is input into the primary feature extraction module FEB in the first sub-network to obtain the under-exposed low-level feature F_u, and the over-exposed low-resolution image I_o^LR is input into the primary feature extraction module FEB in the second sub-network to obtain the over-exposed low-level feature F_o.
And S130, inputting the underexposed low-level features and the overexposed low-level features into high-level feature extraction modules in the first sub-network and the second sub-network respectively to generate the underexposed high-level features and the overexposed high-level features.
Specifically, the under-exposed low-level feature F_u is input into the high-level feature extraction module SRB in the first sub-network to obtain the under-exposed high-level feature G_u, and the over-exposed low-level feature F_o is input into the high-level feature extraction module SRB in the second sub-network to obtain the over-exposed high-level feature G_o.
And S140, inputting the underexposed low-level features, the underexposed high-level features and the overexposed high-level features into the coupling feedback module in the first sub-network to generate a coupling feedback result corresponding to the first sub-network.
Specifically, with the under-exposed low-level feature F_u, the under-exposed high-level feature G_u and the over-exposed high-level feature G_o as the basic input features, at least one coupled feedback result corresponding to the first sub-network is generated through the processing of at least one coupled feedback module CFB in the first sub-network.
In some embodiments, S140 may be implemented as: inputting the underexposure low-level features, the underexposure high-level features and the overexposure high-level features into a first coupling feedback module in a first sub-network to generate a coupling feedback result corresponding to the first sub-network; and aiming at any subsequent coupled feedback module except the first coupled feedback module in the first sub-network, inputting the underexposed low-level feature, the coupled feedback result of the previous adjacent coupled feedback module of the subsequent coupled feedback module and the coupled feedback result of the coupled feedback module corresponding to the previous adjacent coupled feedback module in the second sub-network into the subsequent coupled feedback module to generate the coupled feedback result corresponding to the first sub-network. In this embodiment, the neural network includes a plurality of coupled feedback modules CFB (T are taken as an example), and each coupled feedback module processes serially.
Referring to Fig. 4, the at least one coupled feedback result corresponding to the first sub-network is generated as follows. First, for the first CFB, the under-exposed low-level feature F_u, the under-exposed high-level feature G_u and the over-exposed high-level feature G_o are input into the first CFB, and the first coupled feedback result F_u^1 corresponding to the first sub-network is output. Then, for a subsequent CFB in the first sub-network other than the first CFB (assumed to be the t-th, with t ≤ T), the under-exposed low-level feature F_u, the coupled feedback result F_u^(t-1) of the preceding adjacent CFB (i.e., the (t-1)-th CFB), and the coupled feedback result F_o^(t-1) of the (t-1)-th CFB in the second sub-network are input into the t-th CFB, and the t-th coupled feedback result F_u^t corresponding to the first sub-network is output. By analogy, through this iterative feedback, the coupled feedback result output by every subsequent CFB in the first sub-network can be obtained.
And S150, inputting the overexposed low-level features, the overexposed high-level features and the underexposed high-level features into a coupling feedback module in the second sub-network, and generating a coupling feedback result corresponding to the second sub-network.
Specifically, with the over-exposed low-level feature F_o, the over-exposed high-level feature G_o and the under-exposed high-level feature G_u as the basic input features, at least one coupled feedback result corresponding to the second sub-network is generated through the processing of at least one coupled feedback module CFB in the second sub-network.
In some embodiments, S150 may be implemented as: inputting the overexposed low-level feature, the overexposed high-level feature and the underexposed high-level feature into the first coupling feedback module in the second sub-network to generate a coupling feedback result corresponding to the second sub-network; and, for any subsequent coupling feedback module in the second sub-network other than the first coupling feedback module, inputting the overexposed low-level feature, the coupling feedback result of the previous adjacent coupling feedback module of that subsequent coupling feedback module, and the coupling feedback result of the coupling feedback module in the first sub-network corresponding to that previous adjacent coupling feedback module into the subsequent coupling feedback module, to generate a coupling feedback result corresponding to the second sub-network. In this embodiment, the neural network includes a plurality of coupling feedback modules CFB (T modules are taken as an example), and the coupling feedback modules process serially.
Referring to fig. 4, the process of generating at least one coupling feedback result corresponding to the second sub-network is as follows: first, for the first CFB, the overexposed low-level feature, the overexposed high-level feature Go and the underexposed high-level feature Gu are input into the first CFB, which outputs the first coupling feedback result corresponding to the second sub-network. Then, for any subsequent CFB in the second sub-network other than the first CFB (assumed to be the t-th CFB, with 1 < t ≤ T), the overexposed low-level feature, the coupling feedback result of the previous adjacent CFB (i.e., the (t-1)-th CFB) in the second sub-network, and the coupling feedback result of the (t-1)-th CFB in the first sub-network are input into the t-th CFB, which outputs the t-th coupling feedback result corresponding to the second sub-network.
By analogy, through iterative feedback, the coupling feedback result output by any subsequent CFB in the second sub-network can be obtained.
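As an illustration of this serial, coupled processing, the following sketch shows one way the T coupling feedback results of the two sub-networks could be produced. It assumes PyTorch; the class CFB, its internal structure, the feature channel count and the tensor names (f_u, g_u, f_o, g_o) are placeholders for illustration only and are not the actual implementation of the disclosure.

```python
import torch
import torch.nn as nn

class CFB(nn.Module):
    """Placeholder coupling feedback block: fuses three feature maps into one (assumption)."""
    def __init__(self, channels):
        super().__init__()
        self.fuse = nn.Conv2d(3 * channels, channels, kernel_size=3, padding=1)

    def forward(self, low_feat, own_prev, other_prev):
        # Concatenate the low-level feature with the two feedback/high-level inputs and fuse them.
        return self.fuse(torch.cat([low_feat, own_prev, other_prev], dim=1))

T, C = 4, 64                                        # number of CFBs and feature channels (assumed)
cfb_u = nn.ModuleList([CFB(C) for _ in range(T)])   # CFBs of the first (under-exposed) sub-network
cfb_o = nn.ModuleList([CFB(C) for _ in range(T)])   # CFBs of the second (over-exposed) sub-network

def coupled_feedback(f_u, g_u, f_o, g_o):
    """f_u/f_o: low-level features, g_u/g_o: high-level features of the two sub-networks."""
    results_u, results_o = [], []
    prev_u, prev_o = g_u, g_o                       # the first CFB receives the high-level features
    for t in range(T):
        out_u = cfb_u[t](f_u, prev_u, prev_o)       # S140: under-exposed branch, t-th CFB
        out_o = cfb_o[t](f_o, prev_o, prev_u)       # S150: over-exposed branch, t-th CFB
        results_u.append(out_u)
        results_o.append(out_o)
        prev_u, prev_o = out_u, out_o               # iterative feedback across the two sub-networks
    return results_u, results_o
```

Note that at every step the two branches exchange their previous results, which is what couples the super-resolution information of the two exposures.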
And S160, adjusting parameters of the neural network based on the underexposed low-resolution image, the underexposed high-level feature and the coupling feedback result corresponding to the first sub-network, and the overexposed low-resolution image, the overexposed high-level feature and the coupling feedback result corresponding to the second sub-network.
Specifically, according to the above description, the embodiment of the present disclosure employs a hierarchical loss function to train the neural network. The images output by each layer of the neural network therefore need to be determined based on the underexposed low-resolution image, the underexposed high-level feature Gu and the respective coupling feedback results corresponding to the first sub-network, as well as the overexposed low-resolution image, the overexposed high-level feature Go and the respective coupling feedback results corresponding to the second sub-network; the loss value of the current training step is then calculated from these output images, and the loss value is used to adjust the model parameters of the neural network.
In some embodiments, S160 may be implemented as:
A. the under-exposed low resolution image and the over-exposed low resolution image are respectively subjected to an up-sampling operation.
Specifically, in order to further improve the image fusion effect, the image output by each layer of the neural network in the embodiment of the present disclosure needs to be fused with the original input images, namely the underexposed low-resolution image and the overexposed low-resolution image. However, since the high-level features and the coupling feedback results both add super-resolution image detail and their feature size is larger than that of the original input images, the original input images need to be up-sampled to enlarge their size. For example, a bicubic interpolation up-sampling operation is performed on the underexposed low-resolution image and on the overexposed low-resolution image respectively, to obtain an up-sampled underexposed low-resolution image and an up-sampled overexposed low-resolution image.
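A minimal sketch of step A, assuming PyTorch; the ×4 scale factor, image sizes and variable names are illustrative assumptions only.

```python
import torch
import torch.nn.functional as F

scale = 4                                  # assumed super-resolution factor
i_lr_u = torch.rand(1, 3, 32, 32)          # under-exposed low-resolution image (dummy data)
i_lr_o = torch.rand(1, 3, 32, 32)          # over-exposed low-resolution image (dummy data)

# Bicubic interpolation up-sampling, enlarging the inputs to the high-resolution output size.
i_up_u = F.interpolate(i_lr_u, scale_factor=scale, mode='bicubic', align_corners=False)
i_up_o = F.interpolate(i_lr_o, scale_factor=scale, mode='bicubic', align_corners=False)
```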
B. And adding the image corresponding to the underexposed high-level feature and the images corresponding to the coupling feedback results corresponding to the first sub-network to the up-sampled underexposed low-resolution image respectively, to generate an underexposed high-resolution image and fusion exposure high-resolution images corresponding to the first sub-network.
Specifically, first, the operation of the image reconstruction module REC is applied to the underexposed high-level feature Gu and to each coupling feedback result corresponding to the first sub-network to obtain the corresponding images. Then, the image corresponding to the underexposed high-level feature Gu and the up-sampled underexposed low-resolution image are added to obtain the underexposed high-resolution image. In addition, the up-sampled underexposed low-resolution image is added to the image corresponding to each coupling feedback result of the first sub-network, to obtain the respective fusion exposure high-resolution images corresponding to the first sub-network.
C. And adding the image corresponding to the overexposed high-level feature and the images corresponding to the coupling feedback results corresponding to the second sub-network to the up-sampled overexposed low-resolution image respectively, to generate an overexposed high-resolution image and fusion exposure high-resolution images corresponding to the second sub-network.
Specifically, first, the operation of the image reconstruction module REC is applied to the overexposed high-level feature Go and to each coupling feedback result corresponding to the second sub-network to obtain the corresponding images. Then, the image corresponding to the overexposed high-level feature Go and the up-sampled overexposed low-resolution image are added to obtain the overexposed high-resolution image. In addition, the up-sampled overexposed low-resolution image is added to the image corresponding to each coupling feedback result of the second sub-network, to obtain the respective fusion exposure high-resolution images corresponding to the second sub-network.
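Steps B and C can be sketched as follows, continuing the tensor names from the sketches above; the reconstruction modules rec_u and rec_o are assumed here to be single convolutions mapping features back to a 3-channel image, which is an illustrative simplification of the REC module, and the features are assumed to already have the up-sampled spatial size.

```python
import torch.nn as nn

rec_u = nn.Conv2d(C, 3, kernel_size=3, padding=1)   # placeholder REC of the first sub-network
rec_o = nn.Conv2d(C, 3, kernel_size=3, padding=1)   # placeholder REC of the second sub-network

# Step B: under-exposed high-resolution image and fusion exposure images of the first sub-network.
i_sr_u = i_up_u + rec_u(g_u)                         # residual addition to the up-sampled input
fused_u = [i_up_u + rec_u(r) for r in results_u]     # one fused image per coupling feedback result

# Step C: over-exposed high-resolution image and fusion exposure images of the second sub-network.
i_sr_o = i_up_o + rec_o(g_o)
fused_o = [i_up_o + rec_o(r) for r in results_o]
```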
D. And adjusting parameters of the neural network based on the underexposed high-resolution image, the fusion exposed high-resolution image corresponding to the first sub-network, the overexposed high-resolution image and the fusion exposed high-resolution image corresponding to the second sub-network.
Specifically, the loss value of the current training step is calculated from the underexposed high-resolution image, the respective fusion exposure high-resolution images corresponding to the first sub-network, the overexposed high-resolution image and the respective fusion exposure high-resolution images corresponding to the second sub-network obtained by the above process, and the model parameters of the neural network are adjusted by back-propagating the loss value.
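Step D amounts to a standard gradient-based update. The sketch below continues the variables from the sketches above and assumes PyTorch; the optimizer, learning rate and the abridged parameter list are illustrative choices, and total_loss stands for the hierarchical loss of equation (1) given below.

```python
import torch.optim as optim

# Only the modules from the earlier sketches are listed here; a full model would also
# include the primary and high-level feature extraction modules (assumption).
params = (list(cfb_u.parameters()) + list(cfb_o.parameters()) +
          list(rec_u.parameters()) + list(rec_o.parameters()))
optimizer = optim.Adam(params, lr=1e-4)

optimizer.zero_grad()
loss = total_loss(i_sr_u, i_sr_o, fused_u, fused_o,
                  gt_u, gt_o, gt_fused)      # reference images gt_* are assumed to be given
loss.backward()                              # back-propagate the loss value
optimizer.step()                             # adjust the model parameters
```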
In some embodiments, step D may be implemented as: parameters of the neural network are adjusted by a loss function as shown in the following equation (1):
L_total = λ_u·L_SRB^u + λ_o·L_SRB^o + λ_CFB·(L_CFB^u + L_CFB^o)    (1)

with L_SRB^u = L_MS(I_SR^u, I_gt^u), L_SRB^o = L_MS(I_SR^o, I_gt^o), L_CFB^u = Σ_{t=1..T} L_MS(I_F^{u,t}, I_gt) and L_CFB^o = Σ_{t=1..T} L_MS(I_F^{o,t}, I_gt),

wherein L_total represents the value of the total loss function; λ_o, λ_u and λ_CFB respectively represent the weights corresponding to each partial loss function value; L_SRB^u and L_CFB^u respectively represent the loss function values corresponding to the high-level feature extraction module and the coupling feedback module in the first sub-network; L_SRB^o and L_CFB^o respectively represent the loss function values corresponding to the high-level feature extraction module and the coupling feedback module in the second sub-network; L_MS represents a loss value between two images determined based on the structural similarity index of the images; I_SR^o and I_gt^o respectively represent the overexposed high-resolution image and the overexposed high-resolution reference image; I_SR^u and I_gt^u respectively represent the underexposed high-resolution image and the underexposed high-resolution reference image; I_F^{o,t}, I_F^{u,t} and I_gt respectively represent the fusion exposure high-resolution image corresponding to the t-th coupling feedback module of the second sub-network, the fusion exposure high-resolution image corresponding to the t-th coupling feedback module of the first sub-network and the fusion exposure high-resolution reference image; and T represents the number of coupling feedback modules.
Each reference image is the ground-truth image corresponding to the respective output image of the neural network, that is, a target image that the image generated by the neural network is expected to approach as closely as possible. The loss value L_MS described above, which measures the difference between two images X and Y based on the image-level structural similarity index (SSIM), is determined from SSIM(X, Y): the more structurally similar the two images are, the smaller the loss value.
the loss function in equation (1) above can be divided into two parts. First two loss functions
Figure GDA00037647995900002212
And
Figure GDA00037647995900002213
for ensuring the effectiveness of the high-level feature extraction module SRB, the last part of the loss function
Figure GDA00037647995900002214
To ensure the effectiveness of the coupling feedback module CFB. That is, the first two loss functions are to ensure the effect of super-resolution, while the latter part is constructed to ensure the effect of super-resolution and multi-exposure image fusion simultaneously. At the same time, the first two loss functions are also important bases for the last part of the loss function. The entire neural network is trained in an end-to-end manner by minimizing the loss function defined in equation (1).
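The following sketch gives one possible realization of this hierarchical loss, using the pytorch_msssim package for the structural-similarity term. Treating L_MS as 1 − SSIM, the default weight values and the function name total_loss are assumptions for illustration; the disclosure only requires that L_MS be determined from the structural similarity index between the two images.

```python
from pytorch_msssim import ssim   # third-party SSIM implementation (assumed choice)

def l_ms(x, y):
    # Assumed form of L_MS: one minus the image-level structural similarity index.
    return 1.0 - ssim(x, y, data_range=1.0)

def total_loss(i_sr_u, i_sr_o, fused_u, fused_o, gt_u, gt_o, gt_fused,
               lam_u=1.0, lam_o=1.0, lam_cfb=1.0):
    # First two terms: effectiveness of the high-level feature extraction modules (SRB).
    loss_srb = lam_u * l_ms(i_sr_u, gt_u) + lam_o * l_ms(i_sr_o, gt_o)
    # Last part: effectiveness of the T coupling feedback modules (CFB) of both sub-networks.
    loss_cfb = sum(l_ms(fu, gt_fused) + l_ms(fo, gt_fused)
                   for fu, fo in zip(fused_u, fused_o))
    return loss_srb + lam_cfb * loss_cfb
```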
It should be noted that the execution order of S140 and S150 is not limited, and S140 may be executed first and then S150 may be executed, S150 may be executed first and then S140 may be executed, or S140 and S150 may be executed in parallel.
According to the technical scheme of the embodiment of the disclosure, the obtained underexposed low-resolution images and the obtained overexposed low-resolution images are respectively input into the primary feature extraction modules in the first sub-network and the second sub-network to generate underexposed low-level features and overexposed low-level features; respectively inputting the underexposure low-level features and the overexposure low-level features into high-level feature extraction modules in the first sub-network and the second sub-network to generate the underexposure high-level features and the overexposure high-level features; inputting the underexposure low-level features, the underexposure high-level features and the overexposure high-level features into a coupling feedback module in the first sub-network to generate a coupling feedback result corresponding to the first sub-network; inputting the overexposure low-level features, the overexposure high-level features and the underexposure high-level features into a coupling feedback module in the second sub-network to generate a coupling feedback result corresponding to the second sub-network; and adjusting parameters of the neural network based on the underexposed low-resolution image, the underexposed high-level feature and the coupling feedback result corresponding to the first sub-network, and the overexposed low-resolution image, the overexposed high-level feature and the coupling feedback result corresponding to the second sub-network. The end-to-end training of the neural network coupled with the multi-exposure fusion technology and the super-resolution technology is realized, the neural network with more accurate parameters of each module is obtained, the neural network can simplify the processing flow of the shot image and improve the image processing speed, and the image processing accuracy is further improved by utilizing the complementary characteristic between the multi-exposure fusion and the super-resolution.
Fig. 6 is a flowchart of an image fusion method provided in an embodiment of the present disclosure. The image fusion method is implemented based on the neural network architecture in fig. 1, and the explanation of the same or corresponding contents as those in the above embodiments is not repeated herein. The image fusion method provided by the embodiment of the present disclosure may be executed by an image fusion device, where the image fusion device may be implemented by software and/or hardware, and the image fusion device may be integrated in an electronic device with certain computing capability, such as a mobile phone, a tablet computer, a notebook computer, a desktop computer, a server, or a super computer. Referring to fig. 6, the image fusion method includes:
s210, acquiring an under-exposed low-resolution image and an over-exposed low-resolution image.
Specifically, two images with extreme exposures of the same shooting scene and the same shooting object are acquired, namely an underexposed low-resolution image and an overexposed low-resolution image.
S220, inputting the underexposed low-resolution image and the overexposed low-resolution image into a pre-trained neural network to generate a first fusion exposure high-resolution image and a second fusion exposure high-resolution image.
Specifically, in the application process of the neural network, only the underexposed low-resolution image and the overexposed low-resolution image need to be input, and two images are output after processing by the neural network: the first fusion exposure high-resolution image corresponding to the underexposed low-resolution image, and the second fusion exposure high-resolution image corresponding to the overexposed low-resolution image.
And S230, generating an image fusion result based on the first fusion exposure high-resolution image and the second fusion exposure high-resolution image.
Specifically, although the first fusion exposure high-resolution image and the second fusion exposure high-resolution image are both super-resolution, multi-exposure fused images, the two output images still differ because their corresponding input images differ. In order to further improve the fusion precision, the embodiment of the present disclosure further combines the first fusion exposure high-resolution image and the second fusion exposure high-resolution image to obtain the final output image, namely the image fusion result.
In some embodiments, S230 may be implemented as: performing weighted summation on the first fusion exposure high-resolution image and the second fusion exposure high-resolution image by using a first weight and a second weight respectively, to generate the image fusion result. Specifically, this embodiment performs weighting processing on the first fusion exposure high-resolution image and the second fusion exposure high-resolution image, so two weights, namely the first weight and the second weight, need to be determined in advance. The values of the two weights are related to the exposure levels and shooting scenes of the underexposed low-resolution image and the overexposed low-resolution image. For example, 0.5 may be used as the default value of both the first weight and the second weight. Then, the image fusion result can be generated according to the following formula (2):

I_out = w_u·I_F^u + w_o·I_F^o    (2)

wherein I_out, w_o and w_u respectively represent the image fusion result, the second weight and the first weight, and I_F^u and I_F^o respectively represent the first fusion exposure high-resolution image and the second fusion exposure high-resolution image.
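Formula (2) amounts to a per-pixel weighted average of the two network outputs; a direct rendering in code, with the default weights mentioned above and assumed variable names:

```python
w_u, w_o = 0.5, 0.5                  # first and second weights (default values from the text)
# i_f_u / i_f_o: first and second fusion exposure high-resolution images output by the network
i_out = w_u * i_f_u + w_o * i_f_o    # image fusion result of formula (2)
```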
According to the technical scheme of the embodiment of the disclosure, the under-exposed low-resolution image and the over-exposed low-resolution image obtained by shooting are input into a pre-trained neural network to generate a first fusion exposure high-resolution image and a second fusion exposure high-resolution image; and generating an image fusion result based on the first fusion exposure high-resolution image and the second fusion exposure high-resolution image. The method and the device have the advantages that the neural network coupled with the multi-exposure fusion technology and the super-resolution technology is utilized to process two extremely-exposed low-resolution images, an image fusion result with high resolution HR and high dynamic range HDR is generated, the processing flow of shot images is simplified, and the image processing speed and accuracy are improved.
Fig. 7 is a schematic structural diagram of a neural network training device for multi-exposure image fusion according to an embodiment of the present disclosure. The neural network comprises a first sub-network and a second sub-network which have the same network structure, and any one of the sub-networks comprises a primary feature extraction module, a high-level feature extraction module and a coupling feedback module. Referring to fig. 7, the apparatus specifically includes:
an image acquisition unit 710 for acquiring an under-exposed low resolution image and an over-exposed low resolution image;
a low-level feature generation unit 720, configured to input the under-exposed low-resolution image and the over-exposed low-resolution image into the primary feature extraction modules in the first sub-network and the second sub-network, respectively, and generate an under-exposed low-level feature and an over-exposed low-level feature;
a high-level feature generating unit 730, configured to input the underexposed low-level features and the overexposed low-level features into the high-level feature extraction modules in the first subnetwork and the second subnetwork, respectively, and generate the underexposed high-level features and the overexposed high-level features;
a first coupling feedback result generating unit 740, configured to input the under-exposure low-level features, the under-exposure high-level features, and the over-exposure high-level features into the coupling feedback modules in the first sub-network, and generate a coupling feedback result corresponding to the first sub-network;
a second coupling feedback result generating unit 750, configured to input the overexposed low-level feature, the overexposed high-level feature, and the underexposed high-level feature into a coupling feedback module in the second sub-network, and generate a coupling feedback result corresponding to the second sub-network;
and a parameter adjusting unit 760, configured to adjust parameters of the neural network based on the under-exposed low-resolution image, the under-exposed high-level feature, and the coupling feedback result corresponding to the first sub-network, and the over-exposed low-resolution image, the over-exposed high-level feature, and the coupling feedback result corresponding to the second sub-network.
In some embodiments, the neural network includes a plurality of coupled feedback modules, and each coupled feedback module does not share model parameters.
In some embodiments, each coupled feedback module processes serially;
accordingly, the first coupling feedback result generating unit 740 is specifically configured to:
inputting the underexposure low-level features, the underexposure high-level features and the overexposure high-level features into a first coupling feedback module in a first sub-network to generate a coupling feedback result corresponding to the first sub-network;
inputting the underexposure low-level feature, the coupling feedback result of a previous adjacent coupling feedback module of the subsequent coupling feedback module and the coupling feedback result of the coupling feedback module corresponding to the previous adjacent coupling feedback module in the second sub-network into any subsequent coupling feedback module except the first coupling feedback module in the first sub-network, so as to generate the coupling feedback result corresponding to the first sub-network;
correspondingly, the second coupling feedback result generating unit 750 is specifically configured to:
inputting the overexposure low-level features, the overexposure high-level features and the underexposure high-level features into a first coupling feedback module in a second sub-network to generate a coupling feedback result corresponding to the second sub-network;
and inputting the overexposure low-level feature, the coupling feedback result of a previous adjacent coupling feedback module of the subsequent coupling feedback module and the coupling feedback result of the coupling feedback module corresponding to the previous adjacent coupling feedback module in the first sub-network into the subsequent coupling feedback module aiming at any subsequent coupling feedback module except the first coupling feedback module in the second sub-network, and generating the coupling feedback result corresponding to the second sub-network.
In some embodiments, the coupling feedback module comprises at least two concatenation submodules and at least two feature map groups, wherein each feature map group comprises a filter, a deconvolution layer and a convolution layer;
the first concatenation submodule is positioned before all of the feature map groups;
any concatenation submodule other than the first concatenation submodule is located between two adjacent feature map groups, and any two such concatenation submodules are located at different positions.
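A structural sketch of such a coupling feedback module with two concatenation submodules and two feature map groups, assuming PyTorch; the channel counts, kernel sizes, the use of a 1×1 convolution as the "filter", and the choice of what the second concatenation submodule joins are illustrative assumptions only, not the disclosure's actual design.

```python
import torch
import torch.nn as nn

class FeatureMapGroup(nn.Module):
    """One feature map group: filter -> deconvolution layer -> convolution layer (sizes assumed)."""
    def __init__(self, in_ch, ch):
        super().__init__()
        self.filter = nn.Conv2d(in_ch, ch, kernel_size=1)                             # filter
        self.deconv = nn.ConvTranspose2d(ch, ch, kernel_size=4, stride=2, padding=1)  # deconvolution
        self.conv = nn.Conv2d(ch, ch, kernel_size=3, stride=2, padding=1)             # convolution

    def forward(self, x):
        return self.conv(self.deconv(self.filter(x)))

class CouplingFeedbackModule(nn.Module):
    """Two concatenation submodules and two feature map groups, mirroring the layout above."""
    def __init__(self, ch):
        super().__init__()
        self.group1 = FeatureMapGroup(3 * ch, ch)   # follows the first concatenation submodule
        self.group2 = FeatureMapGroup(2 * ch, ch)   # follows the second concatenation submodule

    def forward(self, low_feat, own_feedback, other_feedback):
        # First concatenation submodule, positioned before all feature map groups.
        x = torch.cat([low_feat, own_feedback, other_feedback], dim=1)
        y = self.group1(x)
        # Second concatenation submodule, located between the two adjacent feature map groups;
        # re-injecting the low-level feature here is an illustrative choice (assumption).
        y = self.group2(torch.cat([y, low_feat], dim=1))
        return y
```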
In some embodiments, the parameter adjusting unit 760 is specifically configured to:
respectively carrying out up-sampling operation on the underexposed low-resolution image and the overexposed low-resolution image;
adding the image corresponding to the underexposed high-level feature and the images corresponding to the coupling feedback results corresponding to the first sub-network to the up-sampled underexposed low-resolution image respectively, to generate an underexposed high-resolution image and fusion exposure high-resolution images corresponding to the first sub-network;
adding the image corresponding to the overexposed high-level feature and the images corresponding to the coupling feedback results corresponding to the second sub-network to the up-sampled overexposed low-resolution image respectively, to generate an overexposed high-resolution image and fusion exposure high-resolution images corresponding to the second sub-network;
and adjusting parameters of the neural network based on the underexposed high-resolution image, the fusion exposed high-resolution image corresponding to the first sub-network, the overexposed high-resolution image and the fusion exposed high-resolution image corresponding to the second sub-network.
Further, the parameter adjusting unit 760 is specifically configured to:
parameters of the neural network are adjusted by a loss function as shown in the following equation:
L_total = λ_u·L_SRB^u + λ_o·L_SRB^o + λ_CFB·(L_CFB^u + L_CFB^o)

with L_SRB^u = L_MS(I_SR^u, I_gt^u), L_SRB^o = L_MS(I_SR^o, I_gt^o), L_CFB^u = Σ_{t=1..T} L_MS(I_F^{u,t}, I_gt) and L_CFB^o = Σ_{t=1..T} L_MS(I_F^{o,t}, I_gt),

wherein L_total represents the value of the total loss function; λ_o, λ_u and λ_CFB respectively represent the weights corresponding to each partial loss function value; L_SRB^u and L_CFB^u respectively represent the loss function values corresponding to the high-level feature extraction module and the coupling feedback module in the first sub-network; L_SRB^o and L_CFB^o respectively represent the loss function values corresponding to the high-level feature extraction module and the coupling feedback module in the second sub-network; L_MS represents a loss value between two images determined based on the structural similarity index of the images; I_SR^o and I_gt^o respectively represent the overexposed high-resolution image and the overexposed high-resolution reference image; I_SR^u and I_gt^u respectively represent the underexposed high-resolution image and the underexposed high-resolution reference image; I_F^{o,t}, I_F^{u,t} and I_gt respectively represent the fusion exposure high-resolution image corresponding to the t-th coupling feedback module of the second sub-network, the fusion exposure high-resolution image corresponding to the t-th coupling feedback module of the first sub-network and the fusion exposure high-resolution reference image; and T represents the number of coupling feedback modules.
Through the neural network training device for multi-exposure image fusion provided by the embodiment of the disclosure, multi-exposure fusion processing and super-resolution processing of images are simultaneously performed by using one neural network, so that the processing flow of image shooting is simplified, the image processing speed is increased, and the image processing accuracy is further improved by using the complementary characteristic between multi-exposure fusion and super-resolution.
The neural network training device for multi-exposure image fusion provided by the embodiment of the disclosure can execute the neural network training method for multi-exposure image fusion provided by any embodiment of the disclosure, and has corresponding functional modules and beneficial effects of the execution method.
Fig. 8 is a schematic structural diagram of an image fusion apparatus provided in an embodiment of the present disclosure. Referring to fig. 8, the apparatus specifically includes:
an image obtaining unit 810 for obtaining an under-exposed low-resolution image and an over-exposed low-resolution image;
a fusion exposure high-resolution image generation unit 820, configured to input the under-exposure low-resolution image and the over-exposure low-resolution image into a neural network trained in advance, and generate a first fusion exposure high-resolution image and a second fusion exposure high-resolution image; the neural network is obtained by training through a neural network training method for multi-exposure image fusion in any embodiment of the disclosure;
an image fusion result generating unit 830 for generating an image fusion result based on the first fusion-exposed high-resolution image and the second fusion-exposed high-resolution image.
In some embodiments, the image fusion result generating unit 830 is specifically configured to:
and respectively utilizing the first weight and the second weight to carry out weighted summation processing on the first fusion exposure high-resolution image and the second fusion exposure high-resolution image so as to generate an image fusion result.
Through the image fusion device provided by the embodiment of the disclosure, multi-exposure fusion processing and super-resolution processing of images are simultaneously performed by using one neural network, so that the processing flow of the shot images is simplified, the image processing speed is improved, and the image processing accuracy is further improved by using the complementary characteristic between multi-exposure fusion and super-resolution.
The image fusion device provided by the embodiment of the disclosure can execute the image fusion method provided by any embodiment of the disclosure, and has corresponding functional modules and beneficial effects of the execution method.
It should be noted that, in the embodiment of the neural network training device for multi-exposure image fusion, the included units are only divided according to functional logic, but are not limited to the above division, as long as the corresponding functions can be implemented; in addition, specific names of the functional units are only used for distinguishing one functional unit from another, and are not used for limiting the protection scope of the present disclosure.
Referring to fig. 9, the present embodiment provides an electronic device, which includes: one or more processors 920; a storage device 910, configured to store one or more programs, when the one or more programs are executed by the one or more processors 920, so that the one or more processors 920 implement the neural network training method for multi-exposure image fusion provided by the embodiment of the present invention, where the neural network includes a first sub-network and a second sub-network with the same network structure, and any one of the sub-networks includes a primary feature extraction module, a higher-level feature extraction module, and a coupling feedback module; the method comprises the following steps:
acquiring an under-exposed low-resolution image and an over-exposed low-resolution image;
respectively inputting the underexposed low-resolution image and the overexposed low-resolution image into primary feature extraction modules in a first sub-network and a second sub-network to generate an underexposed low-level feature and an overexposed low-level feature;
respectively inputting the underexposed low-level features and the overexposed low-level features into high-level feature extraction modules in the first sub-network and the second sub-network to generate the underexposed high-level features and the overexposed high-level features;
inputting the underexposure low-level features, the underexposure high-level features and the overexposure high-level features into a coupling feedback module in the first sub-network to generate a coupling feedback result corresponding to the first sub-network;
inputting the overexposure low-level features, the overexposure high-level features and the underexposure high-level features into a coupling feedback module in the second sub-network to generate a coupling feedback result corresponding to the second sub-network;
and adjusting parameters of the neural network based on the underexposed low-resolution image, the underexposed high-level feature and the coupling feedback result corresponding to the first sub-network, and the overexposed low-resolution image, the overexposed high-level feature and the coupling feedback result corresponding to the second sub-network.
Of course, it can be understood by those skilled in the art that the processor 920 may also implement the technical solution of the neural network training method for multi-exposure image fusion provided in any embodiment of the present invention.
The electronic device shown in fig. 9 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present invention.
As shown in fig. 9, the electronic device includes a processor 920, a storage 910, an input 930, and an output 940; the number of the processors 920 in the electronic device may be one or more, and one processor 920 is taken as an example in fig. 9; the processor 920, the storage device 910, the input device 930, and the output device 940 in the electronic apparatus may be connected by a bus or other means, and are exemplified by a bus 950 in fig. 9.
The storage device 910 is a computer-readable storage medium, and can be used to store software programs, computer-executable programs, and modules, such as program instructions/modules corresponding to the neural network training method for multi-exposure image fusion in the embodiment of the present invention.
The storage device 910 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created according to the use of the terminal, and the like. In addition, the storage 910 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some examples, the storage 910 may further include memory located remotely from the processor 920, which may be connected to electronic devices over a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The input unit 930 may be used to receive input numeric or character information and generate key signal inputs related to user settings and function control of the electronic device. The output device 940 may include a display device such as a display screen.
An embodiment of the present invention further provides another electronic device, which includes: one or more processors; the storage device is used for storing one or more programs, and when the one or more programs are executed by one or more processors, the one or more processors implement the image fusion method provided by the embodiment of the invention, the method comprises the following steps:
acquiring an under-exposed low-resolution image and an over-exposed low-resolution image;
inputting the underexposed low-resolution image and the overexposed low-resolution image into a pre-trained neural network to generate a first fusion exposure high-resolution image and a second fusion exposure high-resolution image; the neural network is obtained by training through a neural network training method for multi-exposure image fusion in any embodiment of the disclosure;
and generating an image fusion result based on the first fusion exposure high-resolution image and the second fusion exposure high-resolution image.
Of course, those skilled in the art can understand that the processor may also implement the technical solution of the image fusion method provided in any embodiment of the present invention. The hardware structure and the function of the electronic device can be explained with reference to fig. 9.
The disclosed embodiments also provide a storage medium containing computer-executable instructions for performing a neural network training method for multi-exposure image fusion when executed by a computer processor, the neural network including a first sub-network and a second sub-network having the same network structure, and any one of the sub-networks including a primary feature extraction module, a high-level feature extraction module and a coupling feedback module; the method comprises the following steps:
acquiring an under-exposed low-resolution image and an over-exposed low-resolution image;
inputting the underexposed low-resolution image and the overexposed low-resolution image into primary feature extraction modules in the first sub-network and the second sub-network respectively to generate an underexposed low-level feature and an overexposed low-level feature;
respectively inputting the underexposure low-level features and the overexposure low-level features into high-level feature extraction modules in the first sub-network and the second sub-network to generate the underexposure high-level features and the overexposure high-level features;
inputting the underexposure low-level features, the underexposure high-level features and the overexposure high-level features into a coupling feedback module in the first sub-network to generate a coupling feedback result corresponding to the first sub-network;
inputting the overexposure low-level features, the overexposure high-level features and the underexposure high-level features into a coupling feedback module in the second sub-network to generate a coupling feedback result corresponding to the second sub-network;
and adjusting parameters of the neural network based on the underexposed low-resolution image, the underexposed high-level feature and the coupling feedback result corresponding to the first sub-network, and the overexposed low-resolution image, the overexposed high-level feature and the coupling feedback result corresponding to the second sub-network.
Of course, the storage medium containing the computer-executable instructions provided by the embodiments of the present invention is not limited to the above method operations, and may also perform related operations in the neural network training method for multi-exposure image fusion provided by any embodiments of the present invention.
Computer storage media for embodiments of the present invention may take the form of any combination of one or more computer-readable media. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++, or the like, as well as conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
Embodiments of the present invention also provide another computer-readable storage medium, where computer-executable instructions, when executed by a computer processor, are configured to perform an image fusion method, including:
acquiring an under-exposed low-resolution image and an over-exposed low-resolution image;
inputting the underexposed low-resolution image and the overexposed low-resolution image into a pre-trained neural network to generate a first fusion exposure high-resolution image and a second fusion exposure high-resolution image; the neural network is obtained by training through a neural network training method for multi-exposure image fusion in any embodiment of the disclosure;
and generating an image fusion result based on the first fusion exposure high-resolution image and the second fusion exposure high-resolution image.
Of course, the storage medium provided in the embodiments of the present invention includes computer-executable instructions, and the computer-executable instructions are not limited to the above method operations, and may also perform related operations in the image fusion method provided in any embodiment of the present invention. The description of the storage medium is explained with reference to the above embodiments.
It is to be understood that the terminology used in the disclosure is for the purpose of describing particular embodiments only, and is not intended to limit the scope of the present application. As used in the specification and claims of this disclosure, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. The term "and/or" includes any and all combinations of one or more of the associated listed items. Relational terms such as "first" and "second" may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. The terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional like elements in a process, method, or apparatus that comprises the element.
The previous description is only for the purpose of describing particular embodiments of the present disclosure, so as to enable those skilled in the art to understand or implement the present disclosure. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the disclosure. Thus, the present disclosure is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (12)

1. A neural network training method for multi-exposure image fusion is characterized in that the neural network comprises a first sub-network and a second sub-network which have the same network structure, and any sub-network comprises a primary feature extraction module, a high-level feature extraction module and a coupling feedback module; the method comprises the following steps:
acquiring an under-exposed low-resolution image and an over-exposed low-resolution image;
inputting the underexposed low-resolution image and the overexposed low-resolution image into the primary feature extraction modules in the first sub-network and the second sub-network respectively to generate an underexposed low-level feature and an overexposed low-level feature;
inputting the underexposed low-level features and the overexposed low-level features into the high-level feature extraction modules in the first sub-network and the second sub-network respectively to generate the underexposed high-level features and the overexposed high-level features;
inputting the underexposed low-level features, the underexposed high-level features and the overexposed high-level features into the coupling feedback module in the first sub-network to generate a coupling feedback result corresponding to the first sub-network;
inputting the over-exposed low-level features, the over-exposed high-level features and the under-exposed high-level features into the coupling feedback module in the second sub-network to generate a coupling feedback result corresponding to the second sub-network;
and adjusting parameters of the neural network based on the underexposed low-resolution image, the underexposed high-level features and the coupling feedback results corresponding to the first sub-network, and the overexposed low-resolution image, the overexposed high-level features and the coupling feedback results corresponding to the second sub-network.
2. The method of claim 1, wherein the neural network comprises a plurality of the coupled feedback modules, and each of the coupled feedback modules does not share model parameters.
3. The method of claim 2, wherein each of the coupled feedback modules processes serially;
the inputting the underexposed low-level features, the underexposed high-level features, and the overexposed high-level features into the coupling feedback module in the first sub-network, and the generating of the coupling feedback result corresponding to the first sub-network includes:
inputting the underexposed low-level features, the underexposed high-level features and the overexposed high-level features into a first coupling feedback module in the first sub-network to generate a coupling feedback result corresponding to the first sub-network;
for any subsequent coupling feedback module in the first sub-network except the first coupling feedback module, inputting the underexposure low-level feature, the coupling feedback result of a previous adjacent coupling feedback module of the subsequent coupling feedback module, and the coupling feedback result of the coupling feedback module corresponding to the previous adjacent coupling feedback module in the second sub-network into the subsequent coupling feedback module, and generating a coupling feedback result corresponding to the first sub-network;
the inputting the over-exposed low-level features, the over-exposed high-level features, and the under-exposed high-level features into the coupling feedback module in the second sub-network, and the generating of the coupling feedback result corresponding to the second sub-network includes:
inputting the over-exposed low-level features, the over-exposed high-level features and the under-exposed high-level features into a first coupling feedback module in the second sub-network to generate a coupling feedback result corresponding to the second sub-network;
and for any subsequent coupled feedback module except the first coupled feedback module in the second sub-network, inputting the over-exposed low-level feature, the coupled feedback result of a previous adjacent coupled feedback module of the subsequent coupled feedback module, and the coupled feedback result of the coupled feedback module corresponding to the previous adjacent coupled feedback module in the first sub-network into the subsequent coupled feedback module, and generating the coupled feedback result corresponding to the second sub-network.
4. The method of any one of claims 1 to 3, wherein the coupling feedback module comprises at least two concatenation sub-modules and at least two feature map groups, wherein each feature map group comprises a filter, a deconvolution layer and a convolution layer;
a first one of said concatenation sub-modules precedes all of said feature map groups;
any concatenation sub-module other than the first concatenation sub-module is located between two adjacent feature map groups, and any two such concatenation sub-modules are located at different positions.
5. The method of claim 1, wherein the adjusting parameters of the neural network based on the under-exposed low resolution image, the under-exposed high level features and the coupled feedback results corresponding to the first sub-network, and the over-exposed low resolution image, the over-exposed high level features and the coupled feedback results corresponding to the second sub-network comprises:
respectively carrying out up-sampling operation on the under-exposed low-resolution image and the over-exposed low-resolution image;
adding the image corresponding to the under-exposed high-level features and the image corresponding to the coupling feedback result corresponding to the first sub-network to the up-sampled under-exposed low-resolution image respectively to generate an under-exposed high-resolution image and a fusion exposure high-resolution image corresponding to the first sub-network;
adding the image corresponding to the overexposure high-level feature and the image corresponding to the coupling feedback result corresponding to the second sub-network to the upsampled overexposure low-resolution image respectively to generate an overexposure high-resolution image and a fusion exposure high-resolution image corresponding to the second sub-network;
and adjusting parameters of the neural network based on the underexposed high-resolution image, the fusion exposed high-resolution image corresponding to the first sub-network, the overexposed high-resolution image and the fusion exposed high-resolution image corresponding to the second sub-network.
6. The method of claim 5, wherein adjusting parameters of the neural network based on the under-exposed high resolution image, the fused-exposed high resolution image corresponding to the first sub-network, the over-exposed high resolution image, and the fused-exposed high resolution image corresponding to the second sub-network comprises:
adjusting parameters of the neural network by a loss function as shown in the following equation:
L_total = λ_u·L_SRB^u + λ_o·L_SRB^o + λ_CFB·(L_CFB^u + L_CFB^o)

with L_SRB^u = L_MS(I_SR^u, I_gt^u), L_SRB^o = L_MS(I_SR^o, I_gt^o), L_CFB^u = Σ_{t=1..T} L_MS(I_F^{u,t}, I_gt) and L_CFB^o = Σ_{t=1..T} L_MS(I_F^{o,t}, I_gt),

wherein L_total represents the value of the total loss function; λ_o, λ_u and λ_CFB respectively represent the weights corresponding to each partial loss function value; L_SRB^u and L_CFB^u respectively represent the loss function values corresponding to the high-level feature extraction module and the coupling feedback module in the first sub-network; L_SRB^o and L_CFB^o respectively represent the loss function values corresponding to the high-level feature extraction module and the coupling feedback module in the second sub-network; L_MS represents a loss value between two images determined based on the structural similarity index of the images; I_SR^o and I_gt^o respectively represent the over-exposed high-resolution image and the over-exposed high-resolution reference image; I_SR^u and I_gt^u respectively represent the under-exposed high-resolution image and the under-exposed high-resolution reference image; I_F^{o,t}, I_F^{u,t} and I_gt respectively represent the fusion exposure high-resolution image corresponding to the t-th coupling feedback module of the second sub-network, the fusion exposure high-resolution image corresponding to the t-th coupling feedback module of the first sub-network and the fusion exposure high-resolution reference image; and T represents the number of the coupling feedback modules.
7. An image fusion method, comprising:
acquiring an under-exposed low-resolution image and an over-exposed low-resolution image;
inputting the underexposed low-resolution image and the overexposed low-resolution image into a pre-trained neural network to generate a first fusion exposure high-resolution image and a second fusion exposure high-resolution image; wherein, the neural network is obtained by training the neural network training method for multi-exposure image fusion according to any one of claims 1 to 6;
and generating an image fusion result based on the first fusion exposure high-resolution image and the second fusion exposure high-resolution image.
8. The method of claim 7, wherein generating an image fusion result based on the first and second fused-exposure high-resolution images comprises:
and respectively utilizing a first weight and a second weight to carry out weighted summation processing on the first fusion exposure high-resolution image and the second fusion exposure high-resolution image so as to generate the image fusion result.
9. A neural network training device for multi-exposure image fusion is characterized in that the neural network comprises a first sub-network and a second sub-network which have the same network structure, and any sub-network comprises a primary feature extraction module, a high-level feature extraction module and a coupling feedback module; the device comprises:
an image acquisition unit for acquiring an under-exposed low-resolution image and an over-exposed low-resolution image;
a low-level feature generation unit configured to input the under-exposed low-resolution image and the over-exposed low-resolution image into the primary feature extraction modules in the first sub-network and the second sub-network, respectively, and generate an under-exposed low-level feature and an over-exposed low-level feature;
a high-level feature generation unit, configured to input the underexposed low-level features and the overexposed low-level features into the high-level feature extraction modules in the first subnetwork and the second subnetwork, respectively, and generate underexposed high-level features and overexposed high-level features;
a first coupling feedback result generating unit, configured to input the underexposed low-level features, the underexposed high-level features, and the overexposed high-level features into the coupling feedback module in the first sub-network, and generate a coupling feedback result corresponding to the first sub-network;
a second coupling feedback result generating unit, configured to input the overexposed low-level feature, the overexposed high-level feature, and the underexposed high-level feature into the coupling feedback module in the second sub-network, and generate a coupling feedback result corresponding to the second sub-network;
and the parameter adjusting unit is used for adjusting the parameters of the neural network based on the underexposed low-resolution image, the underexposed high-level feature and the coupling feedback result corresponding to the first sub-network, and the overexposed low-resolution image, the overexposed high-level feature and the coupling feedback result corresponding to the second sub-network.
10. An image fusion apparatus, comprising:
the image acquisition unit is used for acquiring an under-exposed low-resolution image and an over-exposed low-resolution image;
the fusion exposure high-resolution image generating unit is used for inputting the underexposure low-resolution image and the overexposure low-resolution image into a pre-trained neural network to generate a first fusion exposure high-resolution image and a second fusion exposure high-resolution image; wherein, the neural network is obtained by training the neural network training method for multi-exposure image fusion according to any one of claims 1 to 6;
an image fusion result generating unit configured to generate an image fusion result based on the first fusion-exposed high-resolution image and the second fusion-exposed high-resolution image.
11. An electronic device, characterized in that the electronic device comprises:
one or more processors;
a storage device for storing one or more programs,
when executed by the one or more processors, cause the one or more processors to implement the neural network training method for multi-exposure image fusion of any of claims 1-6 or the image fusion method of any of claims 7-8.
12. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the neural network training method for multi-exposure image fusion of any one of claims 1 to 6 or the image fusion method of any one of claims 7 to 8.
CN202010986245.1A 2020-09-18 2020-09-18 Neural network training method, image fusion method, device, equipment and medium Active CN112184550B (en)

Priority Applications (1)

Application Number: CN202010986245.1A (granted as CN112184550B)
Priority Date: 2020-09-18, Filing Date: 2020-09-18
Title: Neural network training method, image fusion method, device, equipment and medium

Publications (2)

Publication Number: CN112184550A (en), Publication Date: 2021-01-05
Publication Number: CN112184550B (en), Publication Date: 2022-11-01

Family

Family ID: 73921653

Family Applications (1)

Application Number: CN202010986245.1A (Active, granted as CN112184550B)
Priority Date: 2020-09-18, Filing Date: 2020-09-18
Title: Neural network training method, image fusion method, device, equipment and medium

Country Status (1)

Country: CN (1), Link: CN112184550B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115103118B * 2022-06-20 2023-04-07 Beihang University High dynamic range image generation method, device, equipment and readable storage medium
CN115100043B * 2022-08-25 2022-11-15 Tianjin University HDR image reconstruction method based on deep learning

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10497105B2 (en) * 2017-11-01 2019-12-03 Google Llc Digital image auto exposure adjustment
CN108805836A * 2018-05-31 2018-11-13 Dalian University of Technology Method for correcting image based on the reciprocating HDR transformation of depth
US11232541B2 (en) * 2018-10-08 2022-01-25 Rensselaer Polytechnic Institute CT super-resolution GAN constrained by the identical, residual and cycle learning ensemble (GAN-circle)
CN110728633B * 2019-09-06 2022-08-02 Shanghai Jiao Tong University Multi-exposure high-dynamic-range inverse tone mapping model construction method and device
CN111246091B * 2020-01-16 2021-09-03 Beijing Megvii Technology Co., Ltd. Dynamic automatic exposure control method and device and electronic equipment

Also Published As

Publication Number: CN112184550A (en), Publication Date: 2021-01-05

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant