CN112184550A - Neural network training method, image fusion method, device, equipment and medium - Google Patents

Neural network training method, image fusion method, device, equipment and medium

Info

Publication number
CN112184550A
CN112184550A
Authority
CN
China
Prior art keywords
network
sub
low
resolution image
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010986245.1A
Other languages
Chinese (zh)
Other versions
CN112184550B (en)
Inventor
邓欣
张雨童
徐迈
段一平
关振宇
李大伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Beihang University
Original Assignee
Tsinghua University
Beihang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University and Beihang University
Priority to CN202010986245.1A
Publication of CN112184550A
Application granted
Publication of CN112184550B
Legal status: Active
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformation in the plane of the image
    • G06T3/40Scaling the whole image or part thereof
    • G06T3/4053Super resolution, i.e. output image resolution higher than sensor resolution
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformation in the plane of the image
    • G06T3/40Scaling the whole image or part thereof
    • G06T3/4046Scaling the whole image or part thereof using neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Processing (AREA)

Abstract

The disclosure relates to the technical field of image processing, and discloses a neural network training method, an image fusion method, a device, equipment and a medium. The method comprises the following steps: designing a neural network comprising a first sub-network and a second sub-network with the same network structure, wherein each sub-network comprises a primary feature extraction module, a high-level feature extraction module and a coupling feedback module; the primary feature extraction module is used for extracting low-level features of an under-exposed low-resolution image and an over-exposed low-resolution image; the high-level feature extraction module is used for further extracting high-level features of the under-exposed low-resolution image and the over-exposed low-resolution image from the corresponding low-level features; and the coupling feedback module is used for cross-fusing the low-level features and the high-level features corresponding to the under-exposed low-resolution image and the over-exposed low-resolution image. With this technical scheme, multi-exposure fusion processing and super-resolution processing of an image are performed simultaneously by a single neural network, which improves both the image processing speed and the processing accuracy.

Description

Neural network training method, image fusion method, device, equipment and medium
Technical Field
The present disclosure relates to the field of image processing technologies, and in particular, to a neural network training method, an image fusion method, an apparatus, a device, and a medium.
Background
With the development of technology, people are increasingly accustomed to recording moments of their lives with photos. However, due to the hardware limitations of camera sensors, captured images often exhibit various distortions that make them very different from the real natural scene. Compared with real scenes, images captured with a camera tend to have a Low Dynamic Range (LDR) and Low Resolution (LR). In order to reduce the difference between the captured image and the real scene, the image needs to be processed.
At present, the problem of low dynamic range is mainly addressed by multi-exposure image fusion (MEF) techniques, and the problem of low resolution is mainly addressed by image super-resolution (ISR) techniques. Multi-exposure image fusion aims to fuse LDR images with different exposure levels so as to generate an image with a High Dynamic Range (HDR). Image super-resolution aims to reconstruct an LR image into a High-Resolution (HR) image.
However, in practice a single captured image often has both LDR and LR characteristics, while multi-exposure image fusion and image super-resolution are two independent image processing techniques, which means that a captured image must undergo multi-exposure fusion processing and super-resolution processing one after the other. Moreover, the order in which the two techniques are applied can affect the final processing result. Therefore, the existing image processing approach is not only cumbersome, but its processing effect is also unsatisfactory.
Disclosure of Invention
To solve the above technical problem or at least partially solve the above technical problem, the present disclosure provides a neural network training method, an image fusion method, an apparatus, a device, and a medium.
In a first aspect, the present disclosure provides a neural network training method, where the neural network includes a first sub-network and a second sub-network with the same network structure, and any sub-network includes a primary feature extraction module, a high-level feature extraction module, and a coupling feedback module; the method comprises the following steps:
acquiring an under-exposed low-resolution image and an over-exposed low-resolution image;
inputting the underexposed low-resolution image and the overexposed low-resolution image into the primary feature extraction modules in the first sub-network and the second sub-network respectively to generate an underexposed low-level feature and an overexposed low-level feature;
inputting the underexposed low-level features and the overexposed low-level features into the high-level feature extraction modules in the first sub-network and the second sub-network respectively to generate the underexposed high-level features and the overexposed high-level features;
inputting the underexposed low-level features, the underexposed high-level features and the overexposed high-level features into the coupling feedback module in the first sub-network to generate a coupling feedback result corresponding to the first sub-network;
inputting the over-exposed low-level features, the over-exposed high-level features and the under-exposed high-level features into the coupling feedback module in the second sub-network to generate a coupling feedback result corresponding to the second sub-network;
and adjusting parameters of the neural network based on the underexposed low-resolution image, the underexposed high-level features and the coupling feedback results corresponding to the first sub-network, and the overexposed low-resolution image, the overexposed high-level features and the coupling feedback results corresponding to the second sub-network.
In some embodiments, the neural network includes a plurality of the coupled feedback modules, and each of the coupled feedback modules does not share model parameters.
In some embodiments, each of the coupled feedback modules processes serially;
the inputting the underexposed low-level features, the underexposed high-level features, and the overexposed high-level features into the coupling feedback module in the first sub-network, and the generating of the coupling feedback result corresponding to the first sub-network includes:
inputting the underexposed low-level features, the underexposed high-level features and the overexposed high-level features into a first coupling feedback module in the first sub-network to generate a coupling feedback result corresponding to the first sub-network;
for any subsequent coupled feedback module in the first sub-network except the first coupled feedback module, inputting the underexposed low-level feature, the coupled feedback result of a previous adjacent coupled feedback module of the subsequent coupled feedback module, and the coupled feedback result of the coupled feedback module corresponding to the previous adjacent coupled feedback module in the second sub-network into the subsequent coupled feedback module, and generating the coupled feedback result corresponding to the first sub-network;
the inputting the overexposed low-level features, the overexposed high-level features, and the underexposed high-level features into the coupling feedback module in the second sub-network, and the generating of the coupling feedback result corresponding to the second sub-network includes:
inputting the over-exposed low-level features, the over-exposed high-level features and the under-exposed high-level features into a first coupling feedback module in the second sub-network to generate a coupling feedback result corresponding to the second sub-network;
and for any subsequent coupled feedback module except the first coupled feedback module in the second sub-network, inputting the over-exposed low-level feature, the coupled feedback result of a previous adjacent coupled feedback module of the subsequent coupled feedback module, and the coupled feedback result of the coupled feedback module corresponding to the previous adjacent coupled feedback module in the first sub-network into the subsequent coupled feedback module, and generating the coupled feedback result corresponding to the second sub-network.
In some embodiments, the coupling feedback module comprises at least two concatenation sub-modules and at least two feature map groups, wherein each feature map group comprises a filter, a deconvolution layer and a convolution layer;
the first concatenation sub-module is located before all of the feature map groups;
every concatenation sub-module other than the first one is located between two adjacent feature map groups, and no two of these other concatenation sub-modules are located at the same position.
In some embodiments, the adjusting parameters of the neural network based on the under-exposed low resolution image, the under-exposed high level features, and the coupled feedback results corresponding to the first sub-network, and the over-exposed low resolution image, the over-exposed high level features, and the coupled feedback results corresponding to the second sub-network comprises:
respectively carrying out up-sampling operation on the under-exposed low-resolution image and the over-exposed low-resolution image;
adding the image corresponding to the under-exposed high-level features and the image corresponding to the coupling feedback result corresponding to the first sub-network to the up-sampled under-exposed low-resolution image respectively, so as to generate an under-exposed high-resolution image and a fusion-exposed high-resolution image corresponding to the first sub-network;
adding the image corresponding to the overexposure high-level feature and the image corresponding to the coupling feedback result corresponding to the second sub-network to the upsampled overexposure low-resolution image respectively to generate an overexposure high-resolution image and a fusion exposure high-resolution image corresponding to the second sub-network;
and adjusting parameters of the neural network based on the underexposed high-resolution image, the fusion exposed high-resolution image corresponding to the first sub-network, the overexposed high-resolution image and the fusion exposed high-resolution image corresponding to the second sub-network.
In some embodiments, the parameters of the neural network are adjusted by a loss function as shown in the following equation:
$$L_{total} = \lambda_o L^{SR}_o + \lambda_u L^{SR}_u + \lambda_f \sum_{t=1}^{T} \left( L^{CF,t}_o + L^{CF,t}_u \right)$$

with

$$L^{SR}_o = L_{MS}\big(\hat{I}^{HR}_o, I^{HR}_o\big),\quad L^{SR}_u = L_{MS}\big(\hat{I}^{HR}_u, I^{HR}_u\big),\quad L^{CF,t}_o = L_{MS}\big(\hat{I}^{HR,t}_{f,o}, I_{gt}\big),\quad L^{CF,t}_u = L_{MS}\big(\hat{I}^{HR,t}_{f,u}, I_{gt}\big)$$

wherein $L_{total}$ represents the total loss value; $\lambda_o$, $\lambda_u$ and $\lambda_f$ respectively represent the weights corresponding to the partial loss values; $L^{SR}_u$ and $L^{CF,t}_u$ respectively represent the loss values corresponding to the high-level feature extraction module and the coupling feedback modules in the first sub-network; $L^{SR}_o$ and $L^{CF,t}_o$ respectively represent the loss values corresponding to the high-level feature extraction module and the coupling feedback modules in the second sub-network; $L_{MS}$ represents a loss value between two images determined based on the structural similarity index of the images; $\hat{I}^{HR}_o$ and $I^{HR}_o$ respectively represent the over-exposed high-resolution image and the over-exposed high-resolution reference image; $\hat{I}^{HR}_u$ and $I^{HR}_u$ respectively represent the under-exposed high-resolution image and the under-exposed high-resolution reference image; $\hat{I}^{HR,t}_{f,o}$, $\hat{I}^{HR,t}_{f,u}$ and $I_{gt}$ respectively represent the fusion-exposed high-resolution image corresponding to the t-th coupling feedback module of the second sub-network, the fusion-exposed high-resolution image corresponding to the t-th coupling feedback module of the first sub-network, and the fusion-exposed high-resolution reference image; and $T$ represents the number of coupling feedback modules.
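As an illustration, the following is a minimal sketch of how such a layered loss could be computed; the dictionary keys, the weight arguments and the `l_ms` callable (any differentiable loss based on the structural similarity index) are assumed names, not names taken from the patent.

```python
def total_loss(outputs, refs, l_ms, lambda_u=1.0, lambda_o=1.0, lambda_f=1.0):
    """Layered loss of the equation above (sketch).

    outputs: {"sr_u", "sr_o": HR images from the high-level features,
              "fused_u", "fused_o": lists of fusion-exposed HR images, one per CFB}
    refs:    {"hr_u", "hr_o": exposure-specific HR references, "gt": fused HR reference}
    l_ms:    callable implementing the SSIM-based image loss L_MS.
    """
    loss = lambda_u * l_ms(outputs["sr_u"], refs["hr_u"])
    loss = loss + lambda_o * l_ms(outputs["sr_o"], refs["hr_o"])
    for fused_u_t, fused_o_t in zip(outputs["fused_u"], outputs["fused_o"]):
        loss = loss + lambda_f * (l_ms(fused_u_t, refs["gt"]) + l_ms(fused_o_t, refs["gt"]))
    return loss
```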
In a second aspect, the present disclosure provides an image fusion method, including:
acquiring an under-exposed low-resolution image and an over-exposed low-resolution image;
inputting the underexposed low-resolution image and the overexposed low-resolution image into a pre-trained neural network to generate a first fusion exposure high-resolution image and a second fusion exposure high-resolution image; the neural network is obtained by training through a neural network training method in any embodiment of the disclosure;
and generating an image fusion result based on the first fusion exposure high-resolution image and the second fusion exposure high-resolution image.
In some embodiments, the generating an image fusion result based on the first and second fused-exposure high-resolution images comprises:
and respectively utilizing a first weight and a second weight to carry out weighted summation processing on the first fusion exposure high-resolution image and the second fusion exposure high-resolution image so as to generate the image fusion result.
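For instance, a minimal sketch of this weighted summation (the equal default weights are an assumption; the disclosure does not fix their values):

```python
def fuse_exposures(fused_hr_1, fused_hr_2, w1=0.5, w2=0.5):
    # Weighted summation of the two fusion-exposed high-resolution images
    # produced by the two sub-networks; w1 and w2 are the first and second weights.
    return w1 * fused_hr_1 + w2 * fused_hr_2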
In a third aspect, the present disclosure provides a neural network training device, where the neural network includes a first sub-network and a second sub-network with the same network structure, and any sub-network includes a primary feature extraction module, a high-level feature extraction module, and a coupling feedback module; the device includes:
an image acquisition unit for acquiring an under-exposed low-resolution image and an over-exposed low-resolution image;
a low-level feature generation unit, configured to input the under-exposed low-resolution image and the over-exposed low-resolution image into the primary feature extraction modules in the first sub-network and the second sub-network, respectively, and generate an under-exposed low-level feature and an over-exposed low-level feature;
a high-level feature generation unit configured to input the underexposed low-level features and the overexposed low-level features into the high-level feature extraction modules in the first sub-network and the second sub-network, respectively, and generate the underexposed high-level features and the overexposed high-level features;
a first coupling feedback result generating unit, configured to input the underexposed low-level features, the underexposed high-level features, and the overexposed high-level features into the coupling feedback module in the first sub-network, and generate a coupling feedback result corresponding to the first sub-network;
a second coupling feedback result generating unit, configured to input the overexposed low-level feature, the overexposed high-level feature, and the underexposed high-level feature into the coupling feedback module in the second sub-network, and generate a coupling feedback result corresponding to the second sub-network;
and the parameter adjusting unit is used for adjusting the parameters of the neural network based on the underexposed low-resolution image, the underexposed high-level feature and the coupling feedback result corresponding to the first sub-network, and the overexposed low-resolution image, the overexposed high-level feature and the coupling feedback result corresponding to the second sub-network.
In some embodiments, the neural network includes a plurality of coupled feedback modules, and each coupled feedback module does not share model parameters.
In some embodiments, each coupled feedback module processes serially;
correspondingly, the first coupling feedback result generating unit is specifically configured to:
inputting the underexposure low-level features, the underexposure high-level features and the overexposure high-level features into a first coupling feedback module in a first sub-network to generate a coupling feedback result corresponding to the first sub-network;
for any subsequent coupled feedback module except the first coupled feedback module in the first sub-network, inputting the underexposed low-level feature, the coupled feedback result of a previous adjacent coupled feedback module of the subsequent coupled feedback module, and the coupled feedback result of the coupled feedback module corresponding to the previous adjacent coupled feedback module in the second sub-network into the subsequent coupled feedback module to generate the coupled feedback result corresponding to the first sub-network;
correspondingly, the second coupling feedback result generating unit is specifically configured to:
inputting the overexposure low-level features, the overexposure high-level features and the underexposure high-level features into a first coupling feedback module in a second sub-network to generate a coupling feedback result corresponding to the second sub-network;
and inputting the overexposure low-level feature, the coupling feedback result of a previous adjacent coupling feedback module of the subsequent coupling feedback module and the coupling feedback result of the coupling feedback module corresponding to the previous adjacent coupling feedback module in the first sub-network into the subsequent coupling feedback module aiming at any subsequent coupling feedback module except the first coupling feedback module in the second sub-network, and generating the coupling feedback result corresponding to the second sub-network.
In some embodiments, the coupling feedback module comprises at least two concatenation sub-modules and at least two feature map groups, wherein each feature map group comprises a filter, a deconvolution layer and a convolution layer;
the first concatenation sub-module is located before all of the feature map groups;
every concatenation sub-module other than the first one is located between two adjacent feature map groups, and no two of these other concatenation sub-modules are located at the same position.
In some embodiments, the parameter adjusting unit is specifically configured to:
respectively carrying out up-sampling operation on the underexposed low-resolution image and the overexposed low-resolution image;
adding the image corresponding to the under-exposed high-level features and the image corresponding to the coupling feedback result corresponding to the first sub-network to the upsampled under-exposed low-resolution image respectively, so as to generate an under-exposed high-resolution image and a fusion-exposed high-resolution image corresponding to the first sub-network;
adding the image corresponding to the over-exposed high-level features and the image corresponding to the coupling feedback result corresponding to the second sub-network to the upsampled over-exposed low-resolution image respectively, so as to generate an over-exposed high-resolution image and a fusion-exposed high-resolution image corresponding to the second sub-network;
and adjusting parameters of the neural network based on the underexposed high-resolution image, the fusion exposed high-resolution image corresponding to the first sub-network, the overexposed high-resolution image and the fusion exposed high-resolution image corresponding to the second sub-network.
Further, the parameter adjusting unit is specifically configured to:
parameters of the neural network are adjusted by a loss function as shown in the following equation:
$$L_{total} = \lambda_o L^{SR}_o + \lambda_u L^{SR}_u + \lambda_f \sum_{t=1}^{T} \left( L^{CF,t}_o + L^{CF,t}_u \right)$$

with

$$L^{SR}_o = L_{MS}\big(\hat{I}^{HR}_o, I^{HR}_o\big),\quad L^{SR}_u = L_{MS}\big(\hat{I}^{HR}_u, I^{HR}_u\big),\quad L^{CF,t}_o = L_{MS}\big(\hat{I}^{HR,t}_{f,o}, I_{gt}\big),\quad L^{CF,t}_u = L_{MS}\big(\hat{I}^{HR,t}_{f,u}, I_{gt}\big)$$

wherein $L_{total}$ represents the total loss value; $\lambda_o$, $\lambda_u$ and $\lambda_f$ respectively represent the weights corresponding to the partial loss values; $L^{SR}_u$ and $L^{CF,t}_u$ respectively represent the loss values corresponding to the high-level feature extraction module and the coupling feedback modules in the first sub-network; $L^{SR}_o$ and $L^{CF,t}_o$ respectively represent the loss values corresponding to the high-level feature extraction module and the coupling feedback modules in the second sub-network; $L_{MS}$ represents a loss value between two images determined based on the structural similarity index of the images; $\hat{I}^{HR}_o$ and $I^{HR}_o$ respectively represent the over-exposed high-resolution image and the over-exposed high-resolution reference image; $\hat{I}^{HR}_u$ and $I^{HR}_u$ respectively represent the under-exposed high-resolution image and the under-exposed high-resolution reference image; $\hat{I}^{HR,t}_{f,o}$, $\hat{I}^{HR,t}_{f,u}$ and $I_{gt}$ respectively represent the fusion-exposed high-resolution image corresponding to the t-th coupling feedback module of the second sub-network, the fusion-exposed high-resolution image corresponding to the t-th coupling feedback module of the first sub-network, and the fusion-exposed high-resolution reference image; and $T$ represents the number of coupling feedback modules.
In a fourth aspect, the present disclosure provides an image fusion apparatus, comprising:
an image acquisition unit for acquiring an under-exposed low-resolution image and an over-exposed low-resolution image;
the fusion exposure high-resolution image generating unit is used for inputting the underexposure low-resolution image and the overexposure low-resolution image into a pre-trained neural network to generate a first fusion exposure high-resolution image and a second fusion exposure high-resolution image; wherein the neural network is trained by any embodiment of the neural network training method in the present disclosure;
an image fusion result generating unit configured to generate an image fusion result based on the first fusion-exposed high-resolution image and the second fusion-exposed high-resolution image.
In some embodiments, the image fusion result generating unit is specifically configured to:
and respectively utilizing the first weight and the second weight to carry out weighted summation processing on the first fusion exposure high-resolution image and the second fusion exposure high-resolution image so as to generate an image fusion result.
In a fifth aspect, the present disclosure provides an electronic device, including:
one or more processors;
a storage device for storing one or more programs,
when executed by the one or more processors, cause the one or more processors to implement any of the embodiments of the neural network training method or the image fusion method described above.
In a sixth aspect, the present disclosure provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements any of the embodiments of the neural network training method or the image fusion method described above.
According to the technical scheme provided by the embodiments of the present disclosure, a neural network is designed that comprises a first sub-network and a second sub-network with the same network structure, each sub-network comprising a primary feature extraction module, a high-level feature extraction module and a coupling feedback module. The primary feature extraction module extracts the low-level features of the under-exposed low-resolution image and of the over-exposed low-resolution image, and the high-level feature extraction module further extracts the high-level features of the under-exposed low-resolution image and of the over-exposed low-resolution image from the corresponding low-level features, preliminarily realizing the mapping from the low-resolution images to high-resolution features. The coupling feedback module cross-fuses the low-level features and high-level features corresponding to the under-exposed low-resolution image and the over-exposed low-resolution image, thereby realizing multi-exposure fusion of the over-exposed and under-exposed images and further improving the resolution of the image, so that an image with both high resolution and a high dynamic range is obtained. In this way, multi-exposure fusion processing and super-resolution processing of the image are performed simultaneously, the processing flow of the captured images is simplified, the image processing speed is improved, and the complementary characteristics between multi-exposure fusion and super-resolution further improve the image processing accuracy.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and together with the description, serve to explain the principles of the disclosure.
In order to more clearly illustrate the embodiments of the present disclosure or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly introduced below; those skilled in the art can obviously obtain other drawings from these drawings without inventive effort.
Fig. 1 is a network architecture diagram of a neural network provided by an embodiment of the present disclosure;
FIG. 2 is a network architecture diagram of a high-level feature extraction module in a neural network provided by an embodiment of the present disclosure;
FIG. 3 is a network architecture diagram of a coupled feedback module in a neural network provided by an embodiment of the present disclosure;
FIG. 4 is a network architecture diagram of a neural network for neural network training provided by an embodiment of the present disclosure;
FIG. 5 is a flow chart of a neural network training method provided by an embodiment of the present disclosure;
FIG. 6 is a flowchart of an image fusion method provided by an embodiment of the present disclosure;
fig. 7 is a schematic structural diagram of a neural network training device provided in an embodiment of the present disclosure;
fig. 8 is a schematic structural diagram of an image fusion apparatus provided in an embodiment of the present disclosure;
fig. 9 is a schematic structural diagram of an electronic device provided in an embodiment of the present disclosure.
Detailed Description
In order that the above objects, features and advantages of the present disclosure may be more clearly understood, aspects of the present disclosure will be described in further detail below. It should be noted that the embodiments and features of the embodiments of the present disclosure may be combined with each other without conflict.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present disclosure, but the present disclosure may be practiced in other ways than those described herein; it is to be understood that the embodiments disclosed in the specification are only a few embodiments of the present disclosure, and not all embodiments.
The neural network training scheme provided by the embodiments of the present disclosure can be applied to scenarios in which images having both low-dynamic-range and low-resolution characteristics are fused, and is particularly suitable for scenarios in which image fusion processing is performed on an over-exposed low-resolution image and an under-exposed low-resolution image.
Fig. 1 is a block diagram of a network structure of a neural network for image fusion according to an embodiment of the present disclosure. As shown in fig. 1, the neural network includes a first sub-network 110 and a second sub-network 120 that have the same network structure but do not share model parameters. The first sub-network 110 includes a primary Feature Extraction Block (FEB) 111, a high-level Feature Extraction Block (SRB) 112, and a Coupled Feedback Block (CFB) 113. The second sub-network 120 includes a primary feature extraction module 121, a high-level feature extraction module 122 and a coupling feedback module 123. The number of coupling feedback modules 113 in the first sub-network 110 and the number of coupling feedback modules 123 in the second sub-network 120 are the same and are equal to or greater than 1. The input data of the neural network are an over-exposed low-resolution image and an under-exposed low-resolution image; each of the two input images only needs to be fed into one of the sub-networks, and the specific correspondence between inputs and sub-networks is not limited. In the embodiment of the present disclosure, the under-exposed low-resolution image is input into the first sub-network 110, and the over-exposed low-resolution image is input into the second sub-network 120. The FEB and the SRB are used to extract high-level features from the input image, which helps to enhance the image resolution; the CFB is located after the SRB and is used to absorb the features learned by the SRBs of the two sub-networks, so as to fuse an image with both High Resolution (HR) and a High Dynamic Range (HDR).
The primary feature extraction module 111 and the primary feature extraction module 121 are respectively used for extracting low-level features from the input under-exposed low-resolution image $I^{LR}_u$ and the input over-exposed low-resolution image $I^{LR}_o$, so as to obtain the corresponding under-exposed low-level feature $F^{LR}_u$ and over-exposed low-level feature $F^{LR}_o$. The primary feature extraction process is characterized by the following formulas:

$$F^{LR}_u = f_{FEB}\big(I^{LR}_u\big) \quad \text{and} \quad F^{LR}_o = f_{FEB}\big(I^{LR}_o\big)$$

where $f_{FEB}(\cdot)$ represents the operation of the primary feature extraction module. In some embodiments, $f_{FEB}(\cdot)$ comprises a series of convolutional layers with 3 x 3 and 1 x 1 convolution kernels.
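As an illustration, a minimal PyTorch-style sketch of such a primary feature extraction module is given below; the class name, channel widths and activation functions are assumptions rather than values fixed by the disclosure.

```python
import torch.nn as nn

class FeatureExtractionBlock(nn.Module):
    """Primary feature extraction module f_FEB: a stack of 3x3 and 1x1
    convolutions, as described above. Channel widths are assumed values."""
    def __init__(self, in_channels=3, hidden=64, out_channels=32):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_channels, hidden, kernel_size=3, padding=1),
            nn.PReLU(),
            nn.Conv2d(hidden, out_channels, kernel_size=1),
            nn.PReLU(),
        )

    def forward(self, x):
        # x: an under- or over-exposed low-resolution image, shape (B, 3, H, W)
        return self.body(x)
```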
The high-level feature extraction module 112 and the high-level feature extraction module 122 are respectively used for performing further feature extraction on the input under-exposed low-level feature $F^{LR}_u$ and over-exposed low-level feature $F^{LR}_o$, so as to extract high-level features of the under-exposed low-resolution image $I^{LR}_u$ and the over-exposed low-resolution image $I^{LR}_o$ and obtain the under-exposed high-level feature $G_u$ and the over-exposed high-level feature $G_o$. Since the high-level features include higher-level semantic features, which better represent small and complex targets in the image and thereby enrich the detail information in the image, the under-exposed high-level feature $G_u$ and the over-exposed high-level feature $G_o$ can improve the resolution of the corresponding image and realize a super-resolution effect. In some embodiments, referring to fig. 2, the feedback module of the SRFBN network is used as the main structure of the high-level feature extraction module 112 (122), which comprises a plurality of feature map groups 210 connected in series in a dense-connection manner. Each feature map group 210 contains at least one upsampling operation (Deconv) and one downsampling operation (Conv). By means of continuous up- and down-sampling, and while ensuring that the feature size does not change, higher-level features $G$ are gradually extracted from the low-level feature $F_{in}$ to improve the resolution of the image. The high-level features output by the SRB can be expressed as:

$$G_u = f_{SRB}\big(F^{LR}_u\big) \quad \text{and} \quad G_o = f_{SRB}\big(F^{LR}_o\big)$$

where $f_{SRB}(\cdot)$ represents the operation of the high-level feature extraction module.
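The following PyTorch-style sketch illustrates one possible form of a feature map group and of a densely connected high-level feature extraction module; the kernel sizes (chosen here for a 4x scale), channel widths and number of groups are assumptions.

```python
import torch
import torch.nn as nn

class FeatureMapGroup(nn.Module):
    """One feature map group: an upsampling (Deconv) followed by a downsampling
    (Conv). With these assumed kernel/stride/padding values the output keeps
    the spatial size of the input, as described above."""
    def __init__(self, channels=32, kernel=8, stride=4, padding=2):
        super().__init__()
        self.up = nn.Sequential(
            nn.ConvTranspose2d(channels, channels, kernel, stride, padding), nn.PReLU())
        self.down = nn.Sequential(
            nn.Conv2d(channels, channels, kernel, stride, padding), nn.PReLU())

    def forward(self, x):
        return self.down(self.up(x))

class SuperResolutionBlock(nn.Module):
    """High-level feature extraction module f_SRB: densely connected feature
    map groups; each group receives a 1x1 fusion of all previous outputs."""
    def __init__(self, channels=32, n_groups=4):
        super().__init__()
        self.groups = nn.ModuleList(FeatureMapGroup(channels) for _ in range(n_groups))
        self.fuse = nn.ModuleList(
            nn.Conv2d(channels * (i + 1), channels, 1) for i in range(n_groups))

    def forward(self, f_in):
        feats = [f_in]
        for group, fuse in zip(self.groups, self.fuse):
            feats.append(group(fuse(torch.cat(feats, dim=1))))
        return feats[-1]  # the high-level feature G
```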
The coupling feedback module CFB is the core component of the neural network, and aims to realize super-resolution and multi-exposure image fusion simultaneously through a composite network structure. The CFB receives three inputs: a low-level feature and a high-level feature from the same sub-network, and a high-level feature from the other sub-network. The two inputs from the same sub-network serve to further improve the image resolution and enhance the super-resolution effect; the input from the other sub-network serves to improve the image fusion effect and realize multi-exposure image fusion.
In some embodiments, one coupling feedback module CFB is included in each sub-network of the neural network. In that case, the coupling feedback module 113 is used to fuse the input under-exposed low-level feature $F^{LR}_u$, the under-exposed high-level feature $G_u$ and the over-exposed high-level feature $G_o$, and generates a coupling feedback result corresponding to the first sub-network 110. The coupling feedback module 123 is used to fuse the input over-exposed low-level feature $F^{LR}_o$, the over-exposed high-level feature $G_o$ and the under-exposed high-level feature $G_u$, and generates a coupling feedback result corresponding to the second sub-network 120. The coupling feedback results are image features that realize multi-exposure fusion and super-resolution simultaneously.
In some embodiments, multiple coupled feedback modules CFB are included in each sub-network in the neural network, and the multiple CFBs are in a parallel processing fashion. In this embodiment, the input data of each CFB in the same sub-network is the same, and the output coupled feedback results need to be further fused (such as weighted summation, etc.), so as to obtain a coupled feedback result.
In some embodiments, a plurality of coupling feedback modules CFB are included in each sub-network of the neural network, and the plurality of CFBs are serially connected in a loop, as shown in fig. 1. In this embodiment, assuming that there are T CFBs in each sub-network, the process of generating the coupling feedback results corresponding to the first sub-network 110 is as follows. The under-exposed low-level feature $F^{LR}_u$, the under-exposed high-level feature $G_u$ and the over-exposed high-level feature $G_o$ are input into the first coupling feedback module 113 in the first sub-network 110 to generate the first coupling feedback result $F^{1}_{out,u}$ corresponding to the first sub-network. For any subsequent coupling feedback module 113 (numbered t) in the first sub-network other than the first one, the under-exposed low-level feature $F^{LR}_u$, the coupling feedback result $F^{t-1}_{out,u}$ of the preceding adjacent coupling feedback module (numbered t-1), and the coupling feedback result $F^{t-1}_{out,o}$ of the corresponding coupling feedback module in the second sub-network 120 are input into that subsequent coupling feedback module 113 to generate the coupling feedback result $F^{t}_{out,u}$ corresponding to the first sub-network 110. Proceeding in this way through all CFBs yields the final coupling feedback result $F^{T}_{out,u}$ corresponding to the first sub-network 110.

Similarly, the process of generating the coupling feedback results corresponding to the second sub-network 120 is as follows. The over-exposed low-level feature $F^{LR}_o$, the over-exposed high-level feature $G_o$ and the under-exposed high-level feature $G_u$ are input into the first coupling feedback module 123 in the second sub-network 120 to generate the first coupling feedback result $F^{1}_{out,o}$ corresponding to the second sub-network 120. For any subsequent coupling feedback module 123 (numbered t) in the second sub-network other than the first one, the over-exposed low-level feature $F^{LR}_o$, the coupling feedback result $F^{t-1}_{out,o}$ of the preceding adjacent coupling feedback module, and the coupling feedback result $F^{t-1}_{out,u}$ of the corresponding coupling feedback module in the first sub-network 110 are input into that subsequent coupling feedback module 123 to generate the coupling feedback result $F^{t}_{out,o}$ corresponding to the second sub-network 120. Proceeding in this way through all CFBs yields the final coupling feedback result $F^{T}_{out,o}$ corresponding to the second sub-network 120.

The above process is characterized by the formulas:

$$F^{t}_{out,u} = f_{CFB}\big(F^{LR}_u,\, F^{t-1}_{out,u},\, F^{t-1}_{out,o}\big) \quad \text{and} \quad F^{t}_{out,o} = f_{CFB}\big(F^{LR}_o,\, F^{t-1}_{out,o},\, F^{t-1}_{out,u}\big)$$

where $f_{CFB}(\cdot)$ represents the operation of the coupling feedback module, and $F^{0}_{out,u}$ and $F^{0}_{out,o}$ are taken to be the high-level features $G_u$ and $G_o$.
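The iteration above can be sketched as follows, assuming `cfbs_u` and `cfbs_o` hold the T coupling feedback modules of the two sub-networks as callables:

```python
def run_coupled_feedback(f_u, f_o, g_u, g_o, cfbs_u, cfbs_o):
    """Serial coupled feedback between the two sub-networks (sketch)."""
    prev_u, prev_o = g_u, g_o          # the first CFB consumes the high-level features
    results_u, results_o = [], []
    for cfb_u, cfb_o in zip(cfbs_u, cfbs_o):
        cur_u = cfb_u(f_u, prev_u, prev_o)  # F_out,u^t from (F_u^LR, F_out,u^{t-1}, F_out,o^{t-1})
        cur_o = cfb_o(f_o, prev_o, prev_u)  # F_out,o^t from (F_o^LR, F_out,o^{t-1}, F_out,u^{t-1})
        results_u.append(cur_u)
        results_o.append(cur_o)
        prev_u, prev_o = cur_u, cur_o
    return results_u, results_o        # the last elements are the final coupling feedback results
```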
In some embodiments, the number of coupling feedback modules 113 and 123 is three. This can better balance the computation speed and model accuracy of the neural network.
In some embodiments, the internal network structure of each coupling feedback module CFB is the same, but the CFBs do not share model parameters. Referring to fig. 3, the structure of the t-th coupling feedback module 123 in the second sub-network 120 and its connections to the other modules are illustrated. The coupling feedback module 123 includes at least two concatenation sub-modules 310 and at least two feature map groups 320. As in fig. 2, the plurality of feature map groups 320 are connected in a dense manner, and each feature map group 320 includes a filter, a deconvolution layer (Deconv) and a convolution layer (Conv), realizing successive upsampling and downsampling. The first concatenation sub-module 310 is located before all of the feature map groups 320; every other concatenation sub-module 310 is located between two adjacent feature map groups 320, and no two of these other concatenation sub-modules 310 are located at the same position.
The t-th CFB has three inputs: the over-exposed low-level feature $F^{LR}_o$, the coupling feedback result $F^{t-1}_{out,o}$ extracted by the (t-1)-th CFB, and the coupling feedback result $F^{t-1}_{out,u}$ extracted by the (t-1)-th CFB in the first sub-network 110. The feedback feature $F^{t-1}_{out,o}$ is feedback information obtained from the same sub-network, so its main function is to correct the over-exposed low-level feature $F^{LR}_o$ and thereby further improve the super-resolution effect; the feedback feature $F^{t-1}_{out,u}$ is feedback information from the other sub-network, and its main function is to bring complementary information that improves the effect of multi-exposure image fusion.
The processing procedure of the t-th coupling feedback module 123 is as follows. First, the three input features are concatenated along the channel dimension using the concatenation sub-module 310. The concatenation result is then fused using a series of 1 x 1 filters:

$$L^{t}_{0} = M_{in}\big(\big[F^{LR}_o,\, F^{t-1}_{out,o},\, F^{t-1}_{out,u}\big]\big)$$

where $L^{t}_{0}$ represents the low-resolution feature obtained by filter fusion of the three input features, $M_{in}$ represents a series of 1 x 1 filters, and $[\cdot]$ represents concatenation of the internal elements. Then, based on the filter-fused feature $L^{t}_{0}$, the upsampling (Deconv) and downsampling (Conv) operations are repeated using a series of feature map groups 320; each upsampling yields a high-resolution feature $H^{t}_{n}$ and each downsampling yields a low-resolution feature $L^{t}_{n}$, so that more effective high-level features are progressively extracted.
During the operation of the feature map groups 320, the feedback feature $F^{t-1}_{out,u}$ is, as stated above, meant to bring complementary information that enhances the effect of multi-exposure image fusion. However, as the number of feature map groups increases, $F^{t-1}_{out,u}$ is gradually forgotten inside the module, so its influence gradually diminishes and the subsequent fusion deteriorates. To strengthen the influence of $F^{t-1}_{out,u}$ on the network, this embodiment not only uses $F^{t-1}_{out,u}$ as an input of each CFB but also implants it between the feature map groups 320 to reactivate the memory of the CFB module; that is, a concatenation sub-module 310 is added between feature map groups of the CFB, and the input data of each added concatenation sub-module 310 contains $F^{t-1}_{out,u}$. Specifically, at least two concatenation sub-modules 310 are provided in each CFB, and the concatenation sub-modules other than the first one are placed between different feature map groups 320. If there is no requirement on the operation speed of the neural network, more than two concatenation sub-modules 310 can be provided, and one concatenation sub-module 310 can even be added between every two feature map groups 320, which improves the fusion effect to a greater extent. If there are high demands on both the operation speed and the operation accuracy of the neural network, only two concatenation sub-modules 310 may be provided in order to balance speed and accuracy, with the second concatenation sub-module 310 placed at the middle position of the sequence of feature map groups 320. For example, assuming that the total number of feature map groups is N, the feedback feature $F^{t-1}_{out,u}$ and the low-resolution feature $L^{t}_{\lfloor N/2 \rfloor}$ are combined to form a new low-resolution (LR) feature map:

$$\tilde{L}^{t}_{\lfloor N/2 \rfloor} = M_{mid}\big(\big[L^{t}_{\lfloor N/2 \rfloor},\, F^{t-1}_{out,u}\big]\big)$$

where $\lfloor \cdot \rfloor$ denotes the rounding-down operation and $M_{mid}$ represents a series of 1 x 1 filters. The new low-resolution feature map $\tilde{L}^{t}_{\lfloor N/2 \rfloor}$ replaces $L^{t}_{\lfloor N/2 \rfloor}$ as the input feature of the subsequent feature map groups.
Finally, after all N feature map groups 320 have been operated, the low-resolution feature maps of the feature map groups 320 are aggregated and fused by a series of 1 x 1 filters to obtain the final output result of the CFB:

$$F^{t}_{out,o} = M_{out}\big(\big[L^{t}_{1}, L^{t}_{2}, \ldots, L^{t}_{N}\big]\big)$$

where $M_{out}(\cdot)$ represents the operation of convolution with a series of 1 x 1 filters.
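Putting the pieces together, the following is a minimal PyTorch-style sketch of such a coupling feedback module with two concatenation sub-modules, the second placed before the middle feature map group; the channel widths, the number of feature map groups and the up/down-sampling hyper-parameters are assumptions rather than values fixed by the disclosure.

```python
import torch
import torch.nn as nn

def feature_map_group(channels=32, kernel=8, stride=4, padding=2):
    # One up/down-sampling pair (Deconv then Conv) that preserves spatial size,
    # matching the feature map groups described above (assumed hyper-parameters).
    return nn.Sequential(
        nn.ConvTranspose2d(channels, channels, kernel, stride, padding), nn.PReLU(),
        nn.Conv2d(channels, channels, kernel, stride, padding), nn.PReLU())

class CoupledFeedbackBlock(nn.Module):
    """Coupling feedback module (sketch): M_in fuses the three concatenated
    inputs, a second concatenation sub-module re-injects the cross-branch
    feedback in the middle, and M_out aggregates all LR feature maps."""
    def __init__(self, channels=32, n_groups=4):
        super().__init__()
        self.m_in = nn.Conv2d(3 * channels, channels, 1)
        self.groups = nn.ModuleList(feature_map_group(channels) for _ in range(n_groups))
        self.m_mid = nn.Conv2d(2 * channels, channels, 1)
        self.m_out = nn.Conv2d(n_groups * channels, channels, 1)
        self.mid = n_groups // 2

    def forward(self, f_low, fb_same, fb_cross):
        # f_low: low-level feature of this branch; fb_same / fb_cross: previous
        # coupling feedback results of the same / the other sub-network.
        x = self.m_in(torch.cat([f_low, fb_same, fb_cross], dim=1))
        lr_feats = []
        for i, group in enumerate(self.groups):
            if i == self.mid:  # second concatenation sub-module
                x = self.m_mid(torch.cat([x, fb_cross], dim=1))
            x = group(x)
            lr_feats.append(x)
        return self.m_out(torch.cat(lr_feats, dim=1))
```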
In some embodiments, the first sub-network 110 and the second sub-network 120 respectively include an image reconstruction module (REC) 114 and an image reconstruction module 124, which are used for reconstructing the coupling feedback results (features) obtained by the at least one CFB into images; multiple CFBs can therefore yield multiple reconstructed images. On this basis, the original input images of the neural network can be further fused in, so as to obtain a first fusion-exposed high-resolution image $\hat{I}^{HR}_{f,u}$ and a second fusion-exposed high-resolution image $\hat{I}^{HR}_{f,o}$; each fusion-exposed high-resolution image has both a high dynamic range (HDR) and a high resolution (HR). It should be noted that, in the embodiment in which multiple CFBs are serially connected in a loop, every CFB outputs a coupling feedback result, but since the serial feedback processing of the CFBs gradually improves the image fusion and super-resolution effects, the coupling feedback result obtained by the last CFB is the best overall result. Based on this, the reconstructed image corresponding to the coupling feedback result obtained by the last CFB in each sub-network is taken as one of the inputs for obtaining the first fusion-exposed high-resolution image $\hat{I}^{HR}_{f,u}$ and the second fusion-exposed high-resolution image $\hat{I}^{HR}_{f,o}$. In addition, since the feature size of the coupling feedback result is larger than the image size of the original input image, the original input image is enlarged by an upsampling operation such as bicubic interpolation, and the upsampling result is used as the other input for obtaining the fusion-exposed high-resolution image. The above process is characterized by the formulas:

$$\hat{I}^{HR}_{f,u} = f_{REC}\big(F^{T}_{out,u}\big) + f_{UP}\big(I^{LR}_u\big) \quad \text{and} \quad \hat{I}^{HR}_{f,o} = f_{REC}\big(F^{T}_{out,o}\big) + f_{UP}\big(I^{LR}_o\big)$$

where $f_{UP}(\cdot)$ and $f_{REC}(\cdot)$ represent the upsampling operation and the image reconstruction operation, respectively.
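As a sketch of how an image reconstruction module and the upsampling branch might be combined (placing the feature upscaling inside the reconstruction module, the 4x scale factor and the layer choices are assumptions):

```python
import torch.nn as nn
import torch.nn.functional as F

class ReconstructionBlock(nn.Module):
    """Image reconstruction module f_REC plus the f_UP branch (sketch): the
    coupling feedback feature is mapped to an RGB residual and added to the
    bicubically upsampled low-resolution input image."""
    def __init__(self, channels=32, scale=4):
        super().__init__()
        self.scale = scale
        self.up_feat = nn.ConvTranspose2d(channels, channels, 8, stride=scale, padding=2)
        self.to_rgb = nn.Conv2d(channels, 3, kernel_size=3, padding=1)

    def forward(self, feedback_feature, lr_image):
        upsampled = F.interpolate(lr_image, scale_factor=self.scale,
                                  mode="bicubic", align_corners=False)
        return self.to_rgb(self.up_feat(feedback_feature)) + upsampled
```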
Based on the above description, the parameter settings of each part of the neural network provided by the embodiments of the present disclosure can be exemplified by a parameter table (not reproduced here).
Fig. 4 is a network architecture diagram of a neural network for neural network training provided by an embodiment of the present disclosure. Based on the architecture of the neural network in fig. 1, the neural network contains multi-level features (the low-level features, the high-level features and at least one coupling feedback result), which are all used to realize multi-exposure image fusion and super-resolution simultaneously. In order to ensure the effectiveness of each obtained feature, a layered loss function constraint is therefore adopted in the neural network training process. Since a layered loss function requires the image of each layer for its calculation, the network architecture for training in fig. 4 adds a plurality of image-output branches to the architecture for image fusion prediction in fig. 1: it outputs the over-exposed high-resolution image $\hat{I}^{HR}_o$ and the under-exposed high-resolution image $\hat{I}^{HR}_u$ corresponding to the high-level features, outputs the fusion-exposed high-resolution images $\hat{I}^{HR,1}_{f,u}, \ldots, \hat{I}^{HR,T-1}_{f,u}$ corresponding to the other coupling feedback modules in the first sub-network 110, and outputs the fusion-exposed high-resolution images $\hat{I}^{HR,1}_{f,o}, \ldots, \hat{I}^{HR,T-1}_{f,o}$ corresponding to the other coupling feedback modules in the second sub-network 120.
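A minimal sketch of how such a training-time forward pass could collect all of the branch outputs needed by the layered loss; the module names on `net` follow the sketches above and are assumptions:

```python
def training_forward(i_lr_u, i_lr_o, net):
    """Forward pass with the extra output branches of fig. 4 (sketch).
    `net` is assumed to bundle the FEB/SRB/CFB/REC modules of both branches."""
    f_u, f_o = net.feb_u(i_lr_u), net.feb_o(i_lr_o)      # low-level features
    g_u, g_o = net.srb_u(f_u), net.srb_o(f_o)            # high-level features
    sr_u = net.rec_sr_u(g_u, i_lr_u)                     # under-exposed HR image
    sr_o = net.rec_sr_o(g_o, i_lr_o)                     # over-exposed HR image
    fused_u, fused_o = [], []
    prev_u, prev_o = g_u, g_o
    for cfb_u, cfb_o, rec_u, rec_o in zip(net.cfbs_u, net.cfbs_o,
                                          net.recs_u, net.recs_o):
        cur_u = cfb_u(f_u, prev_u, prev_o)               # coupled feedback, first branch
        cur_o = cfb_o(f_o, prev_o, prev_u)               # coupled feedback, second branch
        fused_u.append(rec_u(cur_u, i_lr_u))             # fusion-exposed HR image at step t
        fused_o.append(rec_o(cur_o, i_lr_o))
        prev_u, prev_o = cur_u, cur_o
    return {"sr_u": sr_u, "sr_o": sr_o, "fused_u": fused_u, "fused_o": fused_o}
```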
fig. 5 is a flowchart of a neural network training method provided in an embodiment of the present disclosure. The neural network training method is implemented based on the neural network architecture in fig. 4, wherein the same or corresponding explanations as those in the above embodiments are not repeated herein. The neural network training method provided by the embodiment of the disclosure can be executed by a neural network training device, which can be implemented by software and/or hardware, and the device can be integrated in an electronic device with certain computing capability, such as a notebook computer, a desktop computer, a server or a super computer, and the like. Referring to fig. 5, the neural network training method specifically includes:
s110, acquiring an under-exposed low-resolution image and an over-exposed low-resolution image.
Specifically, in the process of training the entire neural network, multiple rounds of network training are performed; each round of training requires one training image group, so the whole training process requires multiple training image groups, and the training procedure is the same for every training image group. In this embodiment, only one round of training is described. A training image group comprises an under-exposed low-resolution image $I^{LR}_u$ and an over-exposed low-resolution image $I^{LR}_o$. The under-exposed low-resolution image is an image whose exposure at capture is less than a first preset exposure threshold and whose image resolution is less than a preset resolution threshold. The over-exposed low-resolution image is an image whose exposure at capture is greater than a second preset exposure threshold and whose image resolution is less than the preset resolution threshold. Here, the first preset exposure threshold is smaller than the second preset exposure threshold, and the first preset exposure threshold, the second preset exposure threshold and the preset resolution threshold are a predetermined exposure level and a predetermined image resolution, respectively.
And S120, inputting the underexposed low-resolution image and the overexposed low-resolution image into the primary feature extraction modules in the first sub-network and the second sub-network respectively to generate the underexposed low-level features and the overexposed low-level features.
Specifically, the under-exposed low-resolution image $I^{LR}_u$ is input into the primary feature extraction module FEB in the first sub-network to obtain the under-exposed low-level feature $F^{LR}_u$, and the over-exposed low-resolution image $I^{LR}_o$ is input into the primary feature extraction module FEB in the second sub-network to obtain the over-exposed low-level feature $F^{LR}_o$.
And S130, inputting the underexposed low-level features and the overexposed low-level features into high-level feature extraction modules in the first sub-network and the second sub-network respectively to generate the underexposed high-level features and the overexposed high-level features.
Specifically, the under-exposed low-level feature $F^{LR}_u$ is input into the high-level feature extraction module SRB in the first sub-network to obtain the under-exposed high-level feature $G_u$, and the over-exposed low-level feature $F^{LR}_o$ is input into the high-level feature extraction module SRB in the second sub-network to obtain the over-exposed high-level feature $G_o$.
And S140, inputting the underexposed low-level features, the underexposed high-level features and the overexposed high-level features into a coupling feedback module in the first sub-network to generate a coupling feedback result corresponding to the first sub-network.
Specifically, with the under-exposed low-level feature $F^{LR}_u$, the under-exposed high-level feature $G_u$ and the over-exposed high-level feature $G_o$ as the basic input features, at least one coupling feedback result corresponding to the first sub-network is generated through the processing of at least one coupling feedback module CFB in the first sub-network.
In some embodiments, S140 may be implemented as: inputting the underexposure low-level features, the underexposure high-level features and the overexposure high-level features into a first coupling feedback module in a first sub-network to generate a coupling feedback result corresponding to the first sub-network; and aiming at any subsequent coupled feedback module except the first coupled feedback module in the first sub-network, inputting the underexposed low-level feature, the coupled feedback result of the previous adjacent coupled feedback module of the subsequent coupled feedback module and the coupled feedback result of the coupled feedback module corresponding to the previous adjacent coupled feedback module in the second sub-network into the subsequent coupled feedback module to generate the coupled feedback result corresponding to the first sub-network. In this embodiment, the neural network includes a plurality of coupled feedback modules CFB (T are taken as an example), and each coupled feedback module processes in series.
Referring to fig. 4, the process of generating at least one coupling feedback result corresponding to the first sub-network is as follows. First, for the first CFB, the under-exposed low-level feature $F^{LR}_u$, the under-exposed high-level feature $G_u$ and the over-exposed high-level feature $G_o$ are input into the first CFB, which outputs the first coupling feedback result $F^{1}_{out,u}$ corresponding to the first sub-network. Then, for a subsequent CFB in the first sub-network other than the first CFB (say the t-th CFB, with 1 < t <= T), the under-exposed low-level feature $F^{LR}_u$, the coupling feedback result $F^{t-1}_{out,u}$ of the preceding adjacent CFB (i.e. the (t-1)-th CFB), and the coupling feedback result $F^{t-1}_{out,o}$ of the (t-1)-th CFB in the second sub-network are input into the t-th CFB, which outputs the t-th coupling feedback result $F^{t}_{out,u}$ corresponding to the first sub-network. By analogy, through iterative feedback, the coupling feedback result output by any subsequent CFB in the first sub-network can be obtained.
And S150, inputting the overexposed low-level features, the overexposed high-level features and the underexposed high-level features into a coupling feedback module in the second sub-network, and generating a coupling feedback result corresponding to the second sub-network.
Specifically, with the over-exposed low-level feature $F^{LR}_o$, the over-exposed high-level feature $G_o$ and the under-exposed high-level feature $G_u$ as the basic input features, at least one coupling feedback result corresponding to the second sub-network is generated through the processing of at least one coupling feedback module CFB in the second sub-network.
In some embodiments, S150 may be implemented as: inputting the overexposure low-level features, the overexposure high-level features and the underexposure high-level features into a first coupling feedback module in a second sub-network to generate a coupling feedback result corresponding to the second sub-network; and inputting the overexposure low-level feature, the coupling feedback result of a previous adjacent coupling feedback module of the subsequent coupling feedback module and the coupling feedback result of the coupling feedback module corresponding to the previous adjacent coupling feedback module in the first sub-network into the subsequent coupling feedback module aiming at any subsequent coupling feedback module except the first coupling feedback module in the second sub-network, and generating the coupling feedback result corresponding to the second sub-network. In this embodiment, the neural network includes a plurality of coupled feedback modules CFB (T are taken as an example), and each coupled feedback module processes in series.
Referring to fig. 4, the process of generating at least one coupling feedback result corresponding to the second sub-network is as follows. First, for the first CFB, the over-exposed low-level feature, the over-exposed high-level feature Go and the under-exposed high-level feature Gu are input into the first CFB, which outputs the first coupled feedback result corresponding to the second sub-network. Then, for any subsequent CFB in the second sub-network other than the first CFB (say the t-th CFB, with 1 < t ≤ T), the over-exposed low-level feature, the coupled feedback result of the (t-1)-th CFB in the second sub-network and the coupled feedback result of the (t-1)-th CFB in the first sub-network are input into the t-th CFB, which outputs the t-th coupled feedback result corresponding to the second sub-network.
By analogy, through iterative feedback, a coupled feedback result of any subsequent CFB output in the second sub-network can be obtained.
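As an illustration only, the serial coupled feedback described in S140 and S150 can be sketched in PyTorch-style code. The module lists cfb_u and cfb_o, the argument order of each CFB and all variable names below are assumptions introduced for this sketch, not the exact structure of the disclosed embodiment:

```python
def coupled_feedback(cfb_u, cfb_o, low_u, low_o, high_u, high_o):
    """Run the T coupled feedback modules (CFBs) of the two sub-networks serially.

    cfb_u / cfb_o: the CFBs of the first (under-exposed) and second (over-exposed)
    sub-network, e.g. two nn.ModuleList objects of equal length T. Each CFB is
    assumed to take (low-level feature, previous result of its own branch,
    previous result of the other branch) and to return a feature map.
    """
    results_u, results_o = [], []
    prev_u, prev_o = high_u, high_o              # the first CFBs consume the high-level features
    for t in range(len(cfb_u)):
        out_u = cfb_u[t](low_u, prev_u, prev_o)  # t-th coupled feedback result, first sub-network
        out_o = cfb_o[t](low_o, prev_o, prev_u)  # t-th coupled feedback result, second sub-network
        results_u.append(out_u)
        results_o.append(out_o)
        prev_u, prev_o = out_u, out_o            # fed back into the (t+1)-th CFBs
    return results_u, results_o
```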
And S160, adjusting parameters of the neural network based on the underexposed low-resolution image, the underexposed high-level feature and the coupling feedback result corresponding to the first sub-network, and the overexposed low-resolution image, the overexposed high-level feature and the coupling feedback result corresponding to the second sub-network.
Specifically, according to the above description, the embodiment of the present disclosure employs a hierarchical loss function to train the neural network. Therefore, based on the under-exposed low-resolution image, the under-exposed high-level feature Gu and the respective coupled feedback results corresponding to the first sub-network, as well as the over-exposed low-resolution image, the over-exposed high-level feature Go and the respective coupled feedback results corresponding to the second sub-network, the images output by each layer of the neural network are determined, the loss value of the training is calculated by using these output images, and the loss value is used to adjust the model parameters in the whole neural network.
In some embodiments, S160 may be implemented as:
A. the under-exposed low resolution image and the over-exposed low resolution image are respectively subjected to an up-sampling operation.
Specifically, in order to further improve the image fusion effect, the images output by each layer of the neural network in the embodiment of the present disclosure all need to be fused with the original input images, that is, with the under-exposed low-resolution image and the over-exposed low-resolution image. However, since both the high-level features and the coupled feedback results add more super-resolution image detail, their feature sizes are larger than those of the original input images, so the original input images need to be up-sampled to enlarge the image size. For example, a bicubic-interpolation up-sampling operation is performed separately on the under-exposed low-resolution image and the over-exposed low-resolution image to obtain an up-sampled under-exposed low-resolution image and an up-sampled over-exposed low-resolution image.
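As a rough illustration of this step, the bicubic up-sampling can be performed with a standard library call; the 4x scale factor, tensor shapes and variable names below are assumptions made only for this sketch:

```python
import torch
import torch.nn.functional as F

# Hypothetical under- and over-exposed low-resolution inputs (N x C x H x W).
lr_u = torch.rand(1, 3, 64, 64)
lr_o = torch.rand(1, 3, 64, 64)
scale = 4  # assumed super-resolution factor

# Bicubic-interpolation up-sampling of both original inputs.
up_u = F.interpolate(lr_u, scale_factor=scale, mode="bicubic", align_corners=False)
up_o = F.interpolate(lr_o, scale_factor=scale, mode="bicubic", align_corners=False)
```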
B. And adding the image corresponding to the underexposed high-level features and the image corresponding to the coupling feedback result corresponding to the first sub-network to the up-sampled underexposed low-resolution image respectively to generate an underexposed high-resolution image and a fusion exposed high-resolution image corresponding to the first sub-network.
Specifically, the operation of the image reconstruction module REC is first applied to the under-exposed high-level feature Gu and to the respective coupled feedback results corresponding to the first sub-network to obtain the corresponding images. Then, the image corresponding to the under-exposed high-level feature Gu is added to the up-sampled under-exposed low-resolution image to obtain the under-exposed high-resolution image. In addition, the up-sampled under-exposed low-resolution image is added to the image corresponding to each coupled feedback result of the first sub-network to obtain each fusion exposed high-resolution image corresponding to the first sub-network.
C. And adding the image corresponding to the over-exposed high-level feature and the image corresponding to the coupling feedback result corresponding to the second sub-network to the up-sampled over-exposed low-resolution image respectively, to generate an over-exposed high-resolution image and a fusion exposed high-resolution image corresponding to the second sub-network.
Specifically, the operation of the image reconstruction module REC is first applied to the over-exposed high-level feature Go and to the respective coupled feedback results corresponding to the second sub-network to obtain the corresponding images. Then, the image corresponding to the over-exposed high-level feature Go is added to the up-sampled over-exposed low-resolution image to obtain the over-exposed high-resolution image. In addition, the up-sampled over-exposed low-resolution image is added to the image corresponding to each coupled feedback result of the second sub-network to obtain each fusion exposed high-resolution image corresponding to the second sub-network.
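The reconstruction-and-addition operations of steps B and C can be illustrated with the following sketch. The single-convolution REC class and the helper reconstruct_branch are simplifications introduced here for illustration; the actual image reconstruction module of the embodiment may be more elaborate:

```python
import torch
from torch import nn


class REC(nn.Module):
    """Toy stand-in for the image reconstruction module: one 3x3 convolution that
    maps a feature map (64 channels assumed) to a 3-channel image."""

    def __init__(self, channels=64):
        super().__init__()
        self.conv = nn.Conv2d(channels, 3, kernel_size=3, padding=1)

    def forward(self, feat):
        return self.conv(feat)


def reconstruct_branch(up_img, high_feat, cfb_results, rec):
    """Steps B/C for one branch: reconstruct images from the high-level feature and
    from every coupled feedback result, then add the up-sampled input image."""
    sr_img = up_img + rec(high_feat)                     # under-/over-exposed high-resolution image
    fused_imgs = [up_img + rec(f) for f in cfb_results]  # fusion exposed high-resolution images
    return sr_img, fused_imgs
```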
D. And adjusting parameters of the neural network based on the underexposed high-resolution image, the fusion exposed high-resolution image corresponding to the first sub-network, the overexposed high-resolution image and the fusion exposed high-resolution image corresponding to the second sub-network.
Specifically, the loss value of the training is calculated from the under-exposed high-resolution image, the respective fusion exposed high-resolution images corresponding to the first sub-network, the over-exposed high-resolution image and the respective fusion exposed high-resolution images corresponding to the second sub-network obtained by the above process, and the model parameters in the neural network are adjusted by back-propagating the loss value.
In some embodiments, step D may be implemented as: parameters of the neural network are adjusted by a loss function as shown in the following equation (1):
L_total = λ_u·L_MS(Î_u^HR, I_u^HR) + λ_o·L_MS(Î_o^HR, I_o^HR) + Σ_{t=1...T} λ_t·[ L_MS(Î_u,t^HR, I_gt) + L_MS(Î_o,t^HR, I_gt) ]    (1)

wherein L_total represents the total loss function value; λ_o, λ_u and λ_t respectively represent the weights corresponding to the partial loss function values; the term λ_u·L_MS(Î_u^HR, I_u^HR) and the terms L_MS(Î_u,t^HR, I_gt) respectively represent the loss function values corresponding to the high-level feature extraction module and the coupling feedback modules in the first sub-network; the term λ_o·L_MS(Î_o^HR, I_o^HR) and the terms L_MS(Î_o,t^HR, I_gt) respectively represent the loss function values corresponding to the high-level feature extraction module and the coupling feedback modules in the second sub-network; L_MS represents a loss value between two images determined based on the structural similarity index of the images; Î_o^HR and I_o^HR respectively represent the over-exposed high-resolution image and the over-exposed high-resolution reference image; Î_u^HR and I_u^HR respectively represent the under-exposed high-resolution image and the under-exposed high-resolution reference image; Î_o,t^HR, Î_u,t^HR and I_gt respectively represent the fusion exposed high-resolution image corresponding to the t-th coupling feedback module in the second sub-network, the fusion exposed high-resolution image corresponding to the t-th coupling feedback module in the first sub-network and the fusion exposed high-resolution reference image; and T represents the number of coupling feedback modules.
Each of the reference images is a ground-truth image corresponding to the output image of the respective part of the neural network, i.e. the target image that the image generated by the neural network is expected to approximate as closely as possible. The loss value L_MS between two images X and Y, which is determined from the image-level Structural Similarity Index (SSIM), can be computed, for example, as:

L_MS(X, Y) = 1 − SSIM(X, Y)
the loss function in equation (1) above can be divided into two parts. First two loss functions
Figure BDA0002689356980000222
And
Figure BDA0002689356980000223
for ensuring the effectiveness of the high-level feature extraction module SRB, the last part of the loss function
Figure BDA0002689356980000224
To ensure the effectiveness of the coupled feedback module CFB. That is, the first two loss functions are to ensure the effect of super-resolution, while the latter part is constructed to ensure the effect of super-resolution and multi-exposure image fusion simultaneously. At the same time, the first two loss functions are also important bases for the last part of the loss function. The entire neural network is trained in an end-to-end manner by minimizing the loss function defined in equation (1).
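For illustration, the hierarchical loss of equation (1) might be computed as in the following sketch. A differentiable SSIM implementation is assumed to be available (for example from an external package such as pytorch_msssim), and the default weight values and helper names are assumptions of this sketch only:

```python
def ms_loss(x, y, ssim_fn):
    # Structural-similarity based loss between two images; ssim_fn is any
    # differentiable SSIM implementation passed in by the caller.
    return 1.0 - ssim_fn(x, y)


def total_loss(sr_u, sr_o, fused_u, fused_o, ref_u, ref_o, ref_fused,
               ssim_fn, lam_u=1.0, lam_o=1.0, lam_t=None):
    """Equation (1): two super-resolution terms plus per-CFB fusion terms.

    fused_u / fused_o are the lists of fusion exposed high-resolution images
    produced by the T coupled feedback modules of the two sub-networks.
    """
    if lam_t is None:
        lam_t = [1.0] * len(fused_u)
    loss = lam_u * ms_loss(sr_u, ref_u, ssim_fn) + lam_o * ms_loss(sr_o, ref_o, ssim_fn)
    for t in range(len(fused_u)):
        loss = loss + lam_t[t] * (ms_loss(fused_u[t], ref_fused, ssim_fn)
                                  + ms_loss(fused_o[t], ref_fused, ssim_fn))
    return loss
```

A training iteration would then back-propagate the value returned by total_loss and update the parameters of both sub-networks with the chosen optimizer, which corresponds to the end-to-end training described above.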
It should be noted that the execution order of S140 and S150 is not limited, and S140 may be executed first and then S150 may be executed, S150 may be executed first and then S140 may be executed, or S140 and S150 may be executed in parallel.
According to the technical scheme of the embodiment of the disclosure, the obtained underexposed low-resolution images and the obtained overexposed low-resolution images are respectively input into the initial feature extraction modules in the first sub-network and the second sub-network to generate underexposed low-level features and overexposed low-level features; respectively inputting the underexposure low-level features and the overexposure low-level features into high-level feature extraction modules in the first sub-network and the second sub-network to generate the underexposure high-level features and the overexposure high-level features; inputting the underexposure low-level features, the underexposure high-level features and the overexposure high-level features into a coupling feedback module in the first sub-network to generate a coupling feedback result corresponding to the first sub-network; inputting the overexposure low-level features, the overexposure high-level features and the underexposure high-level features into a coupling feedback module in the second sub-network to generate a coupling feedback result corresponding to the second sub-network; and adjusting parameters of the neural network based on the underexposed low-resolution image, the underexposed high-level feature and the coupling feedback result corresponding to the first sub-network, and the overexposed low-resolution image, the overexposed high-level feature and the coupling feedback result corresponding to the second sub-network. The end-to-end training of the neural network coupled with the multi-exposure fusion technology and the super-resolution technology is realized, the neural network with more accurate parameters of each module is obtained, the neural network can simplify the processing flow of the shot image and improve the image processing speed, and the image processing accuracy is further improved by utilizing the complementary characteristic between the multi-exposure fusion and the super-resolution.
Fig. 6 is a flowchart of an image fusion method provided in an embodiment of the present disclosure. The image fusion method is implemented based on the neural network architecture in fig. 1, and the explanation of the same or corresponding contents as those in the above embodiments is not repeated herein. The image fusion method provided by the embodiment of the present disclosure may be executed by an image fusion device, where the image fusion device may be implemented by software and/or hardware, and the image fusion device may be integrated in an electronic device with certain computing capability, such as a mobile phone, a tablet computer, a notebook computer, a desktop computer, a server, or a super computer. Referring to fig. 6, the image fusion method includes:
s210, acquiring an under-exposed low-resolution image and an over-exposed low-resolution image.
Specifically, two extreme-exposed images, i.e., an underexposed low-resolution image and an overexposed low-resolution image, for the same photographic scene and the same photographic subject are acquired.
S220, inputting the underexposed low-resolution image and the overexposed low-resolution image into a pre-trained neural network to generate a first fusion exposure high-resolution image and a second fusion exposure high-resolution image.
Specifically, in the application process of the neural network, only an under-exposed low-resolution image and an over-exposed low-resolution image need to be input, and two images are output through the processing of the neural network, namely a first fusion exposed high-resolution image corresponding to the under-exposed low-resolution image and a second fusion exposed high-resolution image corresponding to the over-exposed low-resolution image.
And S230, generating an image fusion result based on the first fusion exposure high-resolution image and the second fusion exposure high-resolution image.
Specifically, although both the first fusion exposed high-resolution image and the second fusion exposed high-resolution image have undergone super-resolution and multi-exposure fusion, they still differ from each other because their corresponding input images differ. In order to further improve the fusion precision, the embodiment of the present disclosure further combines the two images to obtain the final output image, namely the image fusion result.
In some embodiments, S230 may be implemented as: performing weighted summation processing on the first fusion exposure high-resolution image and the second fusion exposure high-resolution image by using the first weight and the second weight respectively, so as to generate the image fusion result. Specifically, this embodiment applies a weighting process to the first fusion exposure high-resolution image and the second fusion exposure high-resolution image, so two weights, i.e. the first weight and the second weight, need to be determined in advance. The values of the two weights are related to the exposure levels, the shooting scene and the like of the under-exposed low-resolution image and the over-exposed low-resolution image. For example, 0.5 may be used as a default value for both the first weight and the second weight. Then, the image fusion result may be generated according to the following formula (2):

I_out = w_u·I_1 + w_o·I_2    (2)

wherein I_out, w_o and w_u respectively represent the image fusion result, the second weight and the first weight, and I_1 and I_2 respectively represent the first fusion exposure high-resolution image and the second fusion exposure high-resolution image.
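Formula (2) amounts to a pixel-wise weighted average of the two network outputs; a minimal sketch, assuming the two images are tensors of equal size and using the default weights mentioned above, is:

```python
def fuse(fused_hr_u, fused_hr_o, w_u=0.5, w_o=0.5):
    # Formula (2): weighted sum of the first and second fusion exposure
    # high-resolution images; 0.5 / 0.5 are the default weights mentioned above.
    return w_u * fused_hr_u + w_o * fused_hr_o
```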
According to the technical scheme of the embodiment of the disclosure, the under-exposed low-resolution image and the over-exposed low-resolution image obtained by shooting are input into a pre-trained neural network to generate a first fusion exposure high-resolution image and a second fusion exposure high-resolution image; and generating an image fusion result based on the first fusion exposure high-resolution image and the second fusion exposure high-resolution image. The method and the device realize the processing of two extremely exposed low-resolution images by utilizing the neural network coupled with the multi-exposure fusion technology and the super-resolution technology to generate an image fusion result with high resolution HR and high dynamic range HDR, simplify the processing flow of the shot image, and improve the image processing speed and the processing accuracy.
Fig. 7 is a schematic structural diagram of a neural network training device according to an embodiment of the present disclosure. The neural network comprises a first sub-network and a second sub-network which have the same network structure, and any sub-network comprises a primary feature extraction module, a high-level feature extraction module and a coupling feedback module. Referring to fig. 7, the apparatus specifically includes:
an image acquisition unit 710 for acquiring an under-exposed low resolution image and an over-exposed low resolution image;
a low-level feature generation unit 720, configured to input the under-exposed low-resolution image and the over-exposed low-resolution image into the initial feature extraction modules in the first sub-network and the second sub-network, respectively, and generate an under-exposed low-level feature and an over-exposed low-level feature;
a high-level feature generation unit 730, configured to input the underexposed low-level features and the overexposed low-level features into the high-level feature extraction modules in the first sub-network and the second sub-network, respectively, and generate the underexposed high-level features and the overexposed high-level features;
a first coupling feedback result generating unit 740, configured to input the underexposed low-level features, the underexposed high-level features, and the overexposed high-level features into a coupling feedback module in the first sub-network, and generate a coupling feedback result corresponding to the first sub-network;
a second coupling feedback result generating unit 750, configured to input the overexposed low-level feature, the overexposed high-level feature, and the underexposed high-level feature into a coupling feedback module in the second sub-network, and generate a coupling feedback result corresponding to the second sub-network;
and a parameter adjusting unit 760, configured to adjust parameters of the neural network based on the under-exposed low-resolution image, the under-exposed high-level feature, and the coupling feedback result corresponding to the first sub-network, and the over-exposed low-resolution image, the over-exposed high-level feature, and the coupling feedback result corresponding to the second sub-network.
In some embodiments, the neural network includes a plurality of coupled feedback modules, and each coupled feedback module does not share model parameters.
In some embodiments, each coupled feedback module processes serially;
accordingly, the first coupling feedback result generating unit 740 is specifically configured to:
inputting the underexposure low-level features, the underexposure high-level features and the overexposure high-level features into a first coupling feedback module in a first sub-network to generate a coupling feedback result corresponding to the first sub-network;
for any subsequent coupled feedback module except the first coupled feedback module in the first sub-network, inputting the underexposed low-level feature, the coupled feedback result of a previous adjacent coupled feedback module of the subsequent coupled feedback module, and the coupled feedback result of the coupled feedback module corresponding to the previous adjacent coupled feedback module in the second sub-network into the subsequent coupled feedback module to generate the coupled feedback result corresponding to the first sub-network;
correspondingly, the second coupling feedback result generating unit 750 is specifically configured to:
inputting the overexposure low-level features, the overexposure high-level features and the underexposure high-level features into a first coupling feedback module in a second sub-network to generate a coupling feedback result corresponding to the second sub-network;
and inputting the overexposure low-level feature, the coupling feedback result of a previous adjacent coupling feedback module of the subsequent coupling feedback module and the coupling feedback result of the coupling feedback module corresponding to the previous adjacent coupling feedback module in the first sub-network into the subsequent coupling feedback module aiming at any subsequent coupling feedback module except the first coupling feedback module in the second sub-network, and generating the coupling feedback result corresponding to the second sub-network.
In some embodiments, the coupling feedback module comprises at least two concatenation sub-modules and at least two feature mapping sets, wherein each feature mapping set comprises a filter, a deconvolution layer and a convolution layer;
a first concatenation sub-module is located before each feature mapping set;
any other concatenation sub-module than the first concatenation sub-module is located between two adjacent feature mapping sets, and any two other concatenation sub-modules are located at different positions.
In some embodiments, the parameter adjusting unit 760 is specifically configured to:
respectively carrying out up-sampling operation on the underexposed low-resolution image and the overexposed low-resolution image;
adding the image corresponding to the underexposed high-level features and the image corresponding to the coupling feedback result corresponding to the first sub-network to the upsampled underexposed low-resolution image respectively to generate an underexposed high-resolution image and a fusion exposed high-resolution image corresponding to the first sub-network;
adding the image corresponding to the over-exposed high-level features and the image corresponding to the coupling feedback result corresponding to the second sub-network to the up-sampled over-exposed low-resolution image respectively, to generate an over-exposed high-resolution image and a fusion exposed high-resolution image corresponding to the second sub-network;
and adjusting parameters of the neural network based on the underexposed high-resolution image, the fusion exposed high-resolution image corresponding to the first sub-network, the overexposed high-resolution image and the fusion exposed high-resolution image corresponding to the second sub-network.
Further, the parameter adjusting unit 760 is specifically configured to:
parameters of the neural network are adjusted by a loss function as shown in the following equation:
L_total = λ_u·L_MS(Î_u^HR, I_u^HR) + λ_o·L_MS(Î_o^HR, I_o^HR) + Σ_{t=1...T} λ_t·[ L_MS(Î_u,t^HR, I_gt) + L_MS(Î_o,t^HR, I_gt) ]

wherein L_total represents the total loss function value; λ_o, λ_u and λ_t respectively represent the weights corresponding to the partial loss function values; the term λ_u·L_MS(Î_u^HR, I_u^HR) and the terms L_MS(Î_u,t^HR, I_gt) respectively represent the loss function values corresponding to the high-level feature extraction module and the coupling feedback modules in the first sub-network; the term λ_o·L_MS(Î_o^HR, I_o^HR) and the terms L_MS(Î_o,t^HR, I_gt) respectively represent the loss function values corresponding to the high-level feature extraction module and the coupling feedback modules in the second sub-network; L_MS represents a loss value between two images determined based on the structural similarity index of the images; Î_o^HR and I_o^HR respectively represent the over-exposed high-resolution image and the over-exposed high-resolution reference image; Î_u^HR and I_u^HR respectively represent the under-exposed high-resolution image and the under-exposed high-resolution reference image; Î_o,t^HR, Î_u,t^HR and I_gt respectively represent the fusion exposed high-resolution image corresponding to the t-th coupling feedback module in the second sub-network, the fusion exposed high-resolution image corresponding to the t-th coupling feedback module in the first sub-network and the fusion exposed high-resolution reference image; and T represents the number of coupling feedback modules.
Through the neural network training device provided by the embodiment of the disclosure, the multi-exposure fusion processing and the super-resolution processing of the image are simultaneously performed by using one neural network, the processing flow of the shot image is simplified, the image processing speed is improved, and the image processing accuracy is further improved by using the complementary characteristic between the multi-exposure fusion and the super-resolution.
The neural network training device provided by the embodiment of the disclosure can execute the neural network training method provided by any embodiment of the disclosure, and has corresponding functional modules and beneficial effects of the execution method.
Fig. 8 is a schematic structural diagram of an image fusion apparatus provided in an embodiment of the present disclosure. Referring to fig. 8, the apparatus specifically includes:
an image acquisition unit 810 for acquiring an under-exposed low-resolution image and an over-exposed low-resolution image;
a fusion exposure high-resolution image generation unit 820, configured to input the underexposure low-resolution image and the overexposure low-resolution image into a pre-trained neural network, and generate a first fusion exposure high-resolution image and a second fusion exposure high-resolution image; the neural network is obtained by training through a neural network training method in any embodiment of the disclosure;
an image fusion result generating unit 830 for generating an image fusion result based on the first fusion-exposed high-resolution image and the second fusion-exposed high-resolution image.
In some embodiments, the image fusion result generating unit 830 is specifically configured to:
and respectively utilizing the first weight and the second weight to carry out weighted summation processing on the first fusion exposure high-resolution image and the second fusion exposure high-resolution image so as to generate an image fusion result.
Through the image fusion device provided by the embodiment of the disclosure, multi-exposure fusion processing and super-resolution processing of images are simultaneously performed by using one neural network, so that the processing flow of the shot images is simplified, the image processing speed is improved, and the image processing accuracy is further improved by using the complementary characteristic between the multi-exposure fusion and the super-resolution.
The image fusion device provided by the embodiment of the disclosure can execute the image fusion method provided by any embodiment of the disclosure, and has corresponding functional modules and beneficial effects of the execution method.
It should be noted that, in the embodiment of the neural network training device, the included units are only divided according to functional logic, but are not limited to the above division as long as the corresponding functions can be realized; in addition, specific names of the functional units are only used for distinguishing one functional unit from another, and are not used for limiting the protection scope of the present disclosure.
Referring to fig. 9, the present embodiment provides an electronic device, which includes: one or more processors 920; the storage device 910 is configured to store one or more programs, and when the one or more programs are executed by the one or more processors 920, the one or more processors 920 implement the neural network training method provided in the embodiments of the present invention, the neural network includes a first sub-network and a second sub-network with the same network structure, and any one of the sub-networks includes a primary feature extraction module, a high-level feature extraction module, and a coupling feedback module; the method comprises the following steps:
acquiring an under-exposed low-resolution image and an over-exposed low-resolution image;
respectively inputting the underexposed low-resolution image and the overexposed low-resolution image into initial feature extraction modules in a first sub-network and a second sub-network to generate an underexposed low-level feature and an overexposed low-level feature;
respectively inputting the underexposure low-level features and the overexposure low-level features into high-level feature extraction modules in the first sub-network and the second sub-network to generate the underexposure high-level features and the overexposure high-level features;
inputting the underexposure low-level features, the underexposure high-level features and the overexposure high-level features into a coupling feedback module in the first sub-network to generate a coupling feedback result corresponding to the first sub-network;
inputting the overexposure low-level features, the overexposure high-level features and the underexposure high-level features into a coupling feedback module in the second sub-network to generate a coupling feedback result corresponding to the second sub-network;
and adjusting parameters of the neural network based on the underexposed low-resolution image, the underexposed high-level feature and the coupling feedback result corresponding to the first sub-network, and the overexposed low-resolution image, the overexposed high-level feature and the coupling feedback result corresponding to the second sub-network.
Of course, those skilled in the art will understand that the processor 920 may also implement the technical solution of the neural network training method provided in any embodiment of the present invention.
The electronic device shown in fig. 9 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present invention.
As shown in fig. 9, the electronic device includes a processor 920, a storage 910, an input 930, and an output 940; the number of the processors 920 in the electronic device may be one or more, and one processor 920 is taken as an example in fig. 9; the processor 920, the storage device 910, the input device 930, and the output device 940 in the electronic apparatus may be connected by a bus or other means, and fig. 9 illustrates an example in which the processor, the storage device 910, the input device 930, and the output device 940 are connected by a bus 950.
The storage device 910 is a computer-readable storage medium, and can be used to store software programs, computer-executable programs, and modules, such as program instructions/modules corresponding to the neural network training method in the embodiment of the present invention.
The storage device 910 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created according to the use of the terminal, and the like. In addition, the storage 910 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some examples, the storage 910 may further include memory located remotely from the processor 920, which may be connected to the electronic device through a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The input device 930 may be used to receive input numeric or character information and generate key signal inputs related to user settings and function controls of the electronic apparatus. The output device 940 may include a display device such as a display screen.
An embodiment of the present invention further provides another electronic device, which includes: one or more processors; the storage device is used for storing one or more programs, and when the one or more programs are executed by one or more processors, the one or more processors implement the image fusion method provided by the embodiment of the invention, the method comprises the following steps:
acquiring an under-exposed low-resolution image and an over-exposed low-resolution image;
inputting the underexposed low-resolution image and the overexposed low-resolution image into a pre-trained neural network to generate a first fusion exposure high-resolution image and a second fusion exposure high-resolution image; the neural network is obtained by training through a neural network training method in any embodiment of the disclosure;
and generating an image fusion result based on the first fusion exposure high-resolution image and the second fusion exposure high-resolution image.
Of course, those skilled in the art can understand that the processor can also implement the technical solution of the image fusion method provided by any embodiment of the present invention. The hardware structure and functions of the electronic device can be explained with reference to fig. 9.
The disclosed embodiments also provide a storage medium containing computer-executable instructions for performing a neural network training method when executed by a computer processor, the neural network comprising a first sub-network and a second sub-network having the same network structure, and a primary feature extraction module, a high-level feature extraction module and a coupling feedback module being included in any of the sub-networks; the method comprises the following steps:
acquiring an under-exposed low-resolution image and an over-exposed low-resolution image;
respectively inputting the underexposed low-resolution image and the overexposed low-resolution image into initial feature extraction modules in a first sub-network and a second sub-network to generate an underexposed low-level feature and an overexposed low-level feature;
respectively inputting the underexposure low-level features and the overexposure low-level features into high-level feature extraction modules in the first sub-network and the second sub-network to generate the underexposure high-level features and the overexposure high-level features;
inputting the underexposure low-level features, the underexposure high-level features and the overexposure high-level features into a coupling feedback module in the first sub-network to generate a coupling feedback result corresponding to the first sub-network;
inputting the overexposure low-level features, the overexposure high-level features and the underexposure high-level features into a coupling feedback module in the second sub-network to generate a coupling feedback result corresponding to the second sub-network;
and adjusting parameters of the neural network based on the underexposed low-resolution image, the underexposed high-level feature and the coupling feedback result corresponding to the first sub-network, and the overexposed low-resolution image, the overexposed high-level feature and the coupling feedback result corresponding to the second sub-network.
Of course, the storage medium provided by the embodiments of the present invention contains computer-executable instructions, and the computer-executable instructions are not limited to the above method operations, and may also perform related operations in the neural network training method provided by any embodiments of the present invention.
Computer storage media for embodiments of the invention may employ any combination of one or more computer-readable media. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C + +, or the like, as well as conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
Embodiments of the present invention also provide another computer-readable storage medium, where computer-executable instructions, when executed by a computer processor, are configured to perform an image fusion method, including:
acquiring an under-exposed low-resolution image and an over-exposed low-resolution image;
inputting the underexposed low-resolution image and the overexposed low-resolution image into a pre-trained neural network to generate a first fusion exposure high-resolution image and a second fusion exposure high-resolution image; the neural network is obtained by training through a neural network training method in any embodiment of the disclosure;
and generating an image fusion result based on the first fusion exposure high-resolution image and the second fusion exposure high-resolution image.
Of course, the storage medium provided by the embodiment of the present invention contains computer-executable instructions, and the computer-executable instructions are not limited to the above method operations, and may also perform related operations in the image fusion method provided by any embodiment of the present invention. The description of the storage medium is explained with reference to the above embodiments.
It is to be understood that the terminology used in the disclosure is for the purpose of describing particular embodiments only, and is not intended to limit the scope of the present application. As used in the specification and claims of this disclosure, the terms "a," "an," "the," and/or "the" are not intended to be inclusive in the singular, but rather are inclusive in the plural, unless the context clearly dictates otherwise. The term "and/or" includes any and all combinations of one or more of the associated listed items. Relational terms such as "first" and "second," and the like, may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. The terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method or apparatus that comprises the element.
The foregoing are merely exemplary embodiments of the present disclosure, which enable those skilled in the art to understand or practice the present disclosure. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the disclosure. Thus, the present disclosure is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (12)

1. A neural network training method is characterized in that the neural network comprises a first sub-network and a second sub-network which have the same network structure, and any sub-network comprises a primary feature extraction module, a high-level feature extraction module and a coupling feedback module; the method comprises the following steps:
acquiring an under-exposed low-resolution image and an over-exposed low-resolution image;
inputting the underexposed low-resolution image and the overexposed low-resolution image into the initial feature extraction modules in the first sub-network and the second sub-network respectively to generate an underexposed low-level feature and an overexposed low-level feature;
inputting the underexposed low-level features and the overexposed low-level features into the high-level feature extraction modules in the first sub-network and the second sub-network respectively to generate the underexposed high-level features and the overexposed high-level features;
inputting the underexposed low-level features, the underexposed high-level features and the overexposed high-level features into the coupling feedback module in the first sub-network to generate a coupling feedback result corresponding to the first sub-network;
inputting the over-exposed low-level features, the over-exposed high-level features and the under-exposed high-level features into the coupling feedback module in the second sub-network to generate a coupling feedback result corresponding to the second sub-network;
and adjusting parameters of the neural network based on the underexposed low-resolution image, the underexposed high-level features and the coupling feedback results corresponding to the first sub-network, and the overexposed low-resolution image, the overexposed high-level features and the coupling feedback results corresponding to the second sub-network.
2. The method of claim 1, wherein the neural network comprises a plurality of the coupled feedback modules, and wherein each of the coupled feedback modules does not share model parameters.
3. The method of claim 2, wherein each of the coupled feedback modules processes serially;
the inputting the underexposed low-level features, the underexposed high-level features, and the overexposed high-level features into the coupling feedback module in the first sub-network, and the generating of the coupling feedback result corresponding to the first sub-network includes:
inputting the underexposed low-level features, the underexposed high-level features and the overexposed high-level features into a first coupling feedback module in the first sub-network to generate a coupling feedback result corresponding to the first sub-network;
for any subsequent coupled feedback module in the first sub-network except the first coupled feedback module, inputting the underexposed low-level feature, the coupled feedback result of a previous adjacent coupled feedback module of the subsequent coupled feedback module, and the coupled feedback result of the coupled feedback module corresponding to the previous adjacent coupled feedback module in the second sub-network into the subsequent coupled feedback module, and generating the coupled feedback result corresponding to the first sub-network;
the inputting the overexposed low-level features, the overexposed high-level features, and the underexposed high-level features into the coupling feedback module in the second sub-network, and the generating of the coupling feedback result corresponding to the second sub-network includes:
inputting the over-exposed low-level features, the over-exposed high-level features and the under-exposed high-level features into a first coupling feedback module in the second sub-network to generate a coupling feedback result corresponding to the second sub-network;
and for any subsequent coupled feedback module except the first coupled feedback module in the second sub-network, inputting the over-exposed low-level feature, the coupled feedback result of a previous adjacent coupled feedback module of the subsequent coupled feedback module, and the coupled feedback result of the coupled feedback module corresponding to the previous adjacent coupled feedback module in the first sub-network into the subsequent coupled feedback module, and generating the coupled feedback result corresponding to the second sub-network.
4. The method of any one of claims 1 to 3, wherein the coupled feedback module comprises at least two concatenation sub-modules and at least two feature mapping sets, wherein each feature mapping set comprises a filter, a deconvolution layer and a convolution layer;
a first one of said concatenation sub-modules precedes each of said feature mapping sets;
any other concatenation sub-module than the first concatenation sub-module is located between any two adjacent feature mapping sets, and any two other concatenation sub-modules are located at different positions.
5. The method of claim 1, wherein the adjusting parameters of the neural network based on the under-exposed low resolution image, the under-exposed high level features and the coupled feedback results corresponding to the first sub-network, and the over-exposed low resolution image, the over-exposed high level features and the coupled feedback results corresponding to the second sub-network comprises:
respectively carrying out up-sampling operation on the under-exposed low-resolution image and the over-exposed low-resolution image;
adding the image corresponding to the under-exposed high-level features and the image corresponding to the coupling feedback result corresponding to the first sub-network to the up-sampled under-exposed low-resolution image respectively to generate an under-exposed high-resolution image and a fusion exposed high-resolution image corresponding to the first sub-network;
adding the image corresponding to the overexposure high-level feature and the image corresponding to the coupling feedback result corresponding to the second sub-network to the upsampled overexposure low-resolution image respectively to generate an overexposure high-resolution image and a fusion exposure high-resolution image corresponding to the second sub-network;
and adjusting parameters of the neural network based on the underexposed high-resolution image, the fusion exposed high-resolution image corresponding to the first sub-network, the overexposed high-resolution image and the fusion exposed high-resolution image corresponding to the second sub-network.
6. The method of claim 5, wherein adjusting parameters of the neural network based on the under-exposed high resolution image, the fused-exposed high resolution image corresponding to the first sub-network, the over-exposed high resolution image, and the fused-exposed high resolution image corresponding to the second sub-network comprises:
adjusting parameters of the neural network by a loss function as shown in the following equation:
L_total = λ_u·L_MS(Î_u^HR, I_u^HR) + λ_o·L_MS(Î_o^HR, I_o^HR) + Σ_{t=1...T} λ_t·[ L_MS(Î_u,t^HR, I_gt) + L_MS(Î_o,t^HR, I_gt) ]

wherein L_total represents the total loss function value; λ_o, λ_u and λ_t respectively represent the weights corresponding to the partial loss function values; the term λ_u·L_MS(Î_u^HR, I_u^HR) and the terms L_MS(Î_u,t^HR, I_gt) respectively represent the loss function values corresponding to the high-level feature extraction module and the coupling feedback modules in the first sub-network; the term λ_o·L_MS(Î_o^HR, I_o^HR) and the terms L_MS(Î_o,t^HR, I_gt) respectively represent the loss function values corresponding to the high-level feature extraction module and the coupling feedback modules in the second sub-network; L_MS represents a loss value between two images determined based on the structural similarity index of the images; Î_o^HR and I_o^HR respectively represent the overexposed high-resolution image and the overexposed high-resolution reference image; Î_u^HR and I_u^HR respectively represent the underexposed high-resolution image and the underexposed high-resolution reference image; Î_o,t^HR, Î_u,t^HR and I_gt respectively represent the fusion exposure high-resolution image corresponding to the t-th coupling feedback module in the second sub-network, the fusion exposure high-resolution image corresponding to the t-th coupling feedback module in the first sub-network and the fusion exposure high-resolution reference image; and T represents the number of the coupling feedback modules.
7. An image fusion method, comprising:
acquiring an under-exposed low-resolution image and an over-exposed low-resolution image;
inputting the underexposed low-resolution image and the overexposed low-resolution image into a pre-trained neural network to generate a first fusion exposure high-resolution image and a second fusion exposure high-resolution image; wherein the neural network is trained by the neural network training method of any one of claims 1-6;
and generating an image fusion result based on the first fusion exposure high-resolution image and the second fusion exposure high-resolution image.
8. The method of claim 7, wherein generating an image fusion result based on the first and second fused-exposure high-resolution images comprises:
and respectively utilizing a first weight and a second weight to carry out weighted summation processing on the first fusion exposure high-resolution image and the second fusion exposure high-resolution image so as to generate the image fusion result.
9. A neural network training device is characterized in that the neural network comprises a first sub-network and a second sub-network which have the same network structure, and any sub-network comprises a primary feature extraction module, a high-level feature extraction module and a coupling feedback module; the device comprises:
an image acquisition unit for acquiring an under-exposed low-resolution image and an over-exposed low-resolution image;
a low-level feature generation unit, configured to input the under-exposed low-resolution image and the over-exposed low-resolution image into the initial feature extraction modules in the first sub-network and the second sub-network, respectively, and generate an under-exposed low-level feature and an over-exposed low-level feature;
a high-level feature generation unit configured to input the underexposed low-level features and the overexposed low-level features into the high-level feature extraction modules in the first sub-network and the second sub-network, respectively, and generate the underexposed high-level features and the overexposed high-level features;
a first coupling feedback result generating unit, configured to input the underexposed low-level features, the underexposed high-level features, and the overexposed high-level features into the coupling feedback module in the first sub-network, and generate a coupling feedback result corresponding to the first sub-network;
a second coupling feedback result generating unit, configured to input the overexposed low-level feature, the overexposed high-level feature, and the underexposed high-level feature into the coupling feedback module in the second sub-network, and generate a coupling feedback result corresponding to the second sub-network;
and the parameter adjusting unit is used for adjusting the parameters of the neural network based on the underexposed low-resolution image, the underexposed high-level feature and the coupling feedback result corresponding to the first sub-network, and the overexposed low-resolution image, the overexposed high-level feature and the coupling feedback result corresponding to the second sub-network.
10. An image fusion apparatus, comprising:
an image acquisition unit for acquiring an under-exposed low-resolution image and an over-exposed low-resolution image;
the fusion exposure high-resolution image generating unit is used for inputting the underexposure low-resolution image and the overexposure low-resolution image into a pre-trained neural network to generate a first fusion exposure high-resolution image and a second fusion exposure high-resolution image; wherein the neural network is trained by the neural network training method of any one of claims 1-6;
an image fusion result generating unit configured to generate an image fusion result based on the first fusion-exposed high-resolution image and the second fusion-exposed high-resolution image.
11. An electronic device, characterized in that the electronic device comprises:
one or more processors;
a storage device for storing one or more programs,
when executed by the one or more processors, cause the one or more processors to implement the neural network training method of any one of claims 1-6 or the image fusion method of any one of claims 7-8.
12. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the neural network training method of any one of claims 1 to 6 or the image fusion method of any one of claims 7 to 8.
CN202010986245.1A 2020-09-18 2020-09-18 Neural network training method, image fusion method, device, equipment and medium Active CN112184550B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010986245.1A CN112184550B (en) 2020-09-18 2020-09-18 Neural network training method, image fusion method, device, equipment and medium

Publications (2)

Publication Number Publication Date
CN112184550A true CN112184550A (en) 2021-01-05
CN112184550B CN112184550B (en) 2022-11-01

Family

ID=73921653

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010986245.1A Active CN112184550B (en) 2020-09-18 2020-09-18 Neural network training method, image fusion method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN112184550B (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190130545A1 (en) * 2017-11-01 2019-05-02 Google Llc Digital image auto exposure adjustment
CN108805836A (en) * 2018-05-31 2018-11-13 大连理工大学 Method for correcting image based on the reciprocating HDR transformation of depth
US20200111194A1 (en) * 2018-10-08 2020-04-09 Rensselaer Polytechnic Institute Ct super-resolution gan constrained by the identical, residual and cycle learning ensemble (gan-circle)
CN110728633A (en) * 2019-09-06 2020-01-24 上海交通大学 Multi-exposure high-dynamic-range inverse tone mapping model construction method and device
CN111246091A (en) * 2020-01-16 2020-06-05 北京迈格威科技有限公司 Dynamic automatic exposure control method and device and electronic equipment

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
SHI ZHENWEI et al.: "A Survey of Image Super-Resolution Reconstruction Algorithms", Journal of Data Acquisition and Processing *
CHEN WEN et al.: "Research on Reconstructing HDR Images from LDR Images Based on Convolutional Neural Networks", Packaging Engineering *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115103118A (en) * 2022-06-20 2022-09-23 北京航空航天大学 High dynamic range image generation method, device, equipment and readable storage medium
CN115103118B (en) * 2022-06-20 2023-04-07 北京航空航天大学 High dynamic range image generation method, device, equipment and readable storage medium
CN115100043A (en) * 2022-08-25 2022-09-23 天津大学 HDR image reconstruction method based on deep learning

Also Published As

Publication number Publication date
CN112184550B (en) 2022-11-01

Similar Documents

Publication Publication Date Title
Xu et al. Learning to restore low-light images via decomposition-and-enhancement
Afifi et al. Learning multi-scale photo exposure correction
CN109102483B (en) Image enhancement model training method and device, electronic equipment and readable storage medium
Zhang et al. Dual illumination estimation for robust exposure correction
CN108010031B (en) Portrait segmentation method and mobile terminal
Huang et al. Deep Fourier-based exposure correction network with spatial-frequency interaction
CN111311532B (en) Image processing method and device, electronic device and storage medium
CN112184550B (en) Neural network training method, image fusion method, device, equipment and medium
CN112602088B (en) Method, system and computer readable medium for improving quality of low light images
CN107103585B (en) Image super-resolution system
CN111028142A (en) Image processing method, apparatus and storage medium
CN109886875A (en) Image super-resolution rebuilding method and device, storage medium
Liu et al. Tape: Task-agnostic prior embedding for image restoration
WO2023081399A1 (en) Integrated machine learning algorithms for image filters
Le et al. Single-image HDR reconstruction by multi-exposure generation
Yin et al. Two exposure fusion using prior-aware generative adversarial network
Li et al. D2c-sr: A divergence to convergence approach for real-world image super-resolution
CN114049258A (en) Method, chip and device for image processing and electronic equipment
CN115115518B (en) Method, device, equipment, medium and product for generating high dynamic range image
CN112150363A (en) Convolution neural network-based image night scene processing method, and computing module and readable storage medium for operating method
Hung et al. Image interpolation using convolutional neural networks with deep recursive residual learning
CN113810597B (en) Rapid image and scene rendering method based on semi-predictive filtering
CN111861940A (en) Image toning enhancement method based on condition continuous adjustment
CN110351489B (en) Method and device for generating HDR image and mobile terminal
CN110378852A (en) Image enchancing method, device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant