CN113096029A - High dynamic range image generation method based on multi-branch codec neural network - Google Patents


Info

Publication number
CN113096029A
Authority
CN
China
Prior art keywords
image
branch
neural network
value
network
Prior art date
Legal status
Pending
Application number
CN202110246503.7A
Other languages
Chinese (zh)
Inventor
霍永青
李翰林
甘静
刘耀辉
武畅
Current Assignee
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China
Priority claimed from application CN202110246503.7A
Publication of CN113096029A
Legal status: Pending

Classifications

    • G06T 5/90
    • G PHYSICS
        • G06 COMPUTING; CALCULATING OR COUNTING
            • G06F ELECTRIC DIGITAL DATA PROCESSING
                • G06F 18/00 Pattern recognition
                    • G06F 18/20 Analysing
                        • G06F 18/25 Fusion techniques
                            • G06F 18/253 Fusion techniques of extracted features
            • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
                • G06N 3/00 Computing arrangements based on biological models
                    • G06N 3/02 Neural networks
                        • G06N 3/04 Architecture, e.g. interconnection topology
                            • G06N 3/045 Combinations of networks
                            • G06N 3/048 Activation functions
                        • G06N 3/08 Learning methods
                • G06N 5/00 Computing arrangements using knowledge-based models
                    • G06N 5/04 Inference or reasoning models
                        • G06N 5/046 Forward inferencing; Production systems
            • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
                • G06T 2207/00 Indexing scheme for image analysis or image enhancement
                    • G06T 2207/20 Special algorithmic details
                        • G06T 2207/20172 Image enhancement details
                            • G06T 2207/20208 High dynamic range [HDR] image processing

Abstract

The invention discloses a high dynamic range image generation method based on a multi-branch codec neural network, comprising the following steps: S1: collect and clean HDR images; S2: preprocess the cleaned HDR images to obtain LDR images; S3: use the LDR images as the input of a multi-branch codec neural network model and train it until convergence; S4: feed a test image to the trained multi-branch codec neural network model to generate a high dynamic range image. The invention provides a neural-network-based high dynamic range image generation method: a single-frame low dynamic range image captured in a real scene is passed through a neural network with a multi-branch codec structure, which outputs a high dynamic range image of high imaging quality.

Description

High dynamic range image generation method based on multi-branch codec neural network
Technical Field
The invention belongs to the technical field of high dynamic image generation, and particularly relates to a high dynamic range image generation method based on a multi-branch codec neural network.
Background
In recent years, High Dynamic Range (HDR) image technology has been widely studied and applied in both academia and industry. It mainly involves HDR image acquisition, encoding, and display. For HDR image acquisition, the most common approach is to capture multiple Low Dynamic Range (LDR) frames of the same scene at different exposures and then merge them into an HDR image. However, generating a high-quality HDR image from multiple LDR frames requires aligning the backgrounds of the differently exposed frames and handling foreground motion, both of which increase algorithm complexity and degrade the reconstruction. In addition, a large number of real-world images exist only as single frames, and most photographs people actually take are single-exposure images. As camera performance improves, a single-exposure capture contains enough information to reconstruct a high dynamic range image. Methods that generate HDR images from single-frame LDR images have therefore also attracted the attention of researchers.
The conventional approach to single-exposure HDR imaging is to stretch the dynamic range of the luminance channel of a single frame in a specific way to obtain an HDR image. The main algorithms fall into two categories: methods based on a Camera Response Function (CRF) and methods based on an Inverse Tone Mapping operator (ITM). The first estimates a camera response function from the input image and applies it to recover the pixel values of the original radiance field, yielding the target HDR image. The second is the mainstream single-frame HDR generation approach: a piecewise mapping function or a specific inverse tone mapping operator is applied to the differently exposed regions of the image, expanding the dynamic range of the original LDR image and enhancing the detail in its poorly imaged regions, so that the generated HDR image has a better visual effect.
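As a concrete illustration of the second category, a minimal inverse tone mapping sketch follows; it simply inverts a display gamma and rescales, whereas real ITM operators use piecewise mappings tuned per exposure region. The function name and the `gamma` and `max_luminance` values are illustrative assumptions, not taken from any cited method.

```python
import numpy as np

def simple_inverse_tone_map(ldr, gamma=2.2, max_luminance=1000.0):
    """Expand an 8-bit LDR image into a linear HDR radiance estimate.

    A minimal stand-in for the inverse tone mapping operators (ITMs)
    described above; `gamma` and `max_luminance` are illustrative.
    """
    x = ldr.astype(np.float64) / 255.0   # normalize to [0, 1]
    linear = np.power(x, gamma)          # undo the display gamma
    return linear * max_luminance        # rescale to scene luminance

ldr = np.array([[0, 128, 255]], dtype=np.uint8)
hdr = simple_inverse_tone_map(ldr)
```

Note how the expansion is strongly nonlinear: mid-gray pixels stay in a moderate range while near-saturated pixels are pushed toward the luminance ceiling, which is the qualitative behavior ITMs exploit.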
With the development of deep learning, methods that generate HDR images with deep convolutional neural networks have begun to appear in recent years. A deep network can replace the various complex algorithms of the traditional methods to realize the nonlinear mapping from LDR to HDR images, and it also addresses their shortcomings: limited generalization and algorithms that are complex and difficult to implement in hardware. For single-exposure HDR generation, a convolutional neural network extracts and combines low-level features into abstract high-level representations and has strong fitting capability. Using deep learning to enhance, or recover by estimation, the detail in the defective regions of a single frame can largely reproduce the original scene information behind a single-frame LDR image; compared with traditional HDR acquisition methods, a trained deep network also has lower computational complexity and better real-time performance.
Disclosure of Invention
The invention aims to solve the problem of high dynamic range image generation and provides a high dynamic range image generation method based on a multi-branch codec neural network.
The technical scheme of the invention is as follows: a high dynamic range image generation method based on a multi-branch codec neural network comprises the following steps:
s1: collecting and cleaning an HDR image;
s2: preprocessing the cleaned HDR image to obtain an LDR image;
s3: taking the LDR image as the input of a multi-branch codec neural network model, and training until convergence;
s4: and testing the input test image by using the trained multi-branch codec neural network model to generate a high dynamic range image.
The invention has the beneficial effects that: it provides a neural-network-based high dynamic range image generation method. A single-frame low dynamic range image captured in a real scene is passed through a neural network with a multi-branch codec structure, which outputs a high dynamic range image of high imaging quality.
Further, in step S1, the HDR images are generated by a multi-exposure method. The cleaning procedure is: manually or with scripts, discard images in which the proportion of dead pixels (pixel value 0) exceeds a threshold, as well as image files with corrupted data.
Further, step S2 includes the following sub-steps:
s21: randomly cropping the cleaned HDR image, and adjusting the size of the HDR image to 256 × 256;
s22: and sequentially carrying out random tone, saturation adjustment, random histogram cutting and tone mapping of random parameters on the HDR image after size adjustment to obtain an LDR image.
The beneficial effects of this further scheme are: high-quality HDR images have high resolution, so to reduce the computational burden the original image is randomly cropped and then resized to a small fixed-size image (the model in the invention uses 256 × 256); the resulting image then undergoes random hue and saturation adjustment, random histogram clipping, and tone mapping with random parameters to obtain an LDR image. This series of operations must be controlled to produce LDR images of moderate quality: if the generated LDR image is too poor, the network cannot converge normally, and if it is too good, the effect of the network is not evident.
Further, in step S2, the values clipped by random histogram clipping are the top 3% to 5% of pixels with the highest pixel values across the RGB channels of the whole image.
The beneficial effects of this further scheme are: the clipping ratio used is the top 3–5% of pixels with the highest values in the RGB channels of the whole picture. This data degradation can be learned effectively by the network and yields a measurable quality improvement. The image brightness is also suppressed so that values in low-exposure regions are confined to a low range; some information is then lost after 8-bit quantization, which lets the network learn the quantization error.
Further, in step S3, the multi-branch codec neural network model includes an encoder and a decoder;
the encoder comprises a detail information processing network, an intermediate frequency information processing network and a global information processing network; the detail information processing network is used for extracting the detail information of the LDR image; the intermediate frequency information processing network is used for extracting intermediate frequency characteristic information of the LDR image; the global information processing network is used for extracting global characteristic information of the LDR image;
the decoder comprises an information fusion network; the information fusion network is used for fusing the cascaded detail information, the intermediate frequency characteristic information and the global characteristic information.
The beneficial effects of the further scheme are as follows: in the present invention, the encoder network finally obtains the feature information of 64 channels size 1 × 1, and uses the copy operation to obtain the feature information of 64 channels 256 × 256 size for final decoding. And the LDR image is input into a network, and after calculation through each branch of the encoder, the characteristic information output by each part of the encoder is fused.
Further, in the detail information processing network the channel counts are 64 and 128 respectively, the stride is 1, the convolution kernel padding is 1, and the kernel shape is 3 × 3;
in the intermediate frequency information processing network the channel count is 64, the stride is 1, the convolution kernel padding is 2, the kernel shape is 3 × 3, and the dilation coefficient is 2;
in the global information processing network the channel count is 64 and the stride is 1; the 1st to 6th groups of convolution kernels have padding 1 and shape 3 × 3, and the seventh group has padding 0 and shape 4 × 4.
The beneficial effects of the further scheme are as follows: in the invention, each square in the detail information processing network represents a group of convolution kernels, the number of channels (channel) is 64 and 128 respectively, the step length (stride) is 1, the padding value (padding) is 1, and the shape is 3 x 3; the part pays attention to the learning of the pixel level and finishes the extraction of detail information.
Each square in the intermediate frequency information processing network represents a group of convolution kernels, the number of channels is 64, the step length is 1, the filling value is 2, the shape is 3 x 3, and the sparse convolution with the coefficient of 2 is used for expanding the network receptive field, so that the partial network extracts the intermediate frequency characteristic information of the input LDR image.
Each square in the global information processing network represents a group of convolution kernels, the number of channels is 64, and the step length is 1; the first 6 sets of convolution kernel fill values are 1, the shape is 3 x 3, and the last set of convolution kernel fill values are 0, the shape is 4 x 4, so that the partial network extracts the input LDR image global feature information.
The partial network finally obtains the feature information of 64 channels with size 1 × 1, and obtains the feature information of 64 channels with size 256 × 256 by using the copy operation for final decoding.
Further, in the information fusion network the channel counts are 64 and 3, the kernel shape is 1 × 1, the stride is 1, and the convolution kernel padding is 0.
Furthermore, in the multi-branch codec neural network model, the activation function of the detail, intermediate frequency, and global information processing networks is SELU; in the information fusion network, the activation function of the first convolution module is SELU and that of the second convolution module is sigmoid.
Further, in step S3, the loss function L used to train the multi-branch codec neural network model is defined as follows (the formulas themselves are rendered only as images in the original filing). L combines a masked reconstruction term and a hue term, where: Ii = 0.299Xr + 0.587Xg + 0.114Xb is the image luminance; Mi denotes the final mask, Mi^bright the bright-area mask, and Mi^dark the dark-area mask; ta is the threshold for deciding whether an image region is over-exposed, and tb the threshold for deciding whether it is under-exposed; Xr, Xg, and Xb are the red, green, and blue channel values of pixel X; w and h are the width and height of the image; Ŷi,c is the network output value of the i-th pixel in channel c; ε is a small constant; Yi,c is the Ground Truth value of the i-th pixel in channel c; αL is the weight of the hue loss in the loss function; Ĥi is the i-th pixel value of a color channel of the network output image; Hi is the corresponding pixel value of the Ground Truth image; and log(·) denotes the logarithm operation.
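The exact formulas are lost to image extraction, but the symbol definitions above are consistent with a masked log-domain reconstruction loss plus a weighted hue loss. A hedged reconstruction of that general shape — which may differ in detail from the actual filing — is:

```latex
\mathcal{L} \;=\; \frac{1}{w h}\sum_{i=1}^{w h}\sum_{c\in\{r,g,b\}}
  M_i\,\bigl(\log(\hat{Y}_{i,c}+\varepsilon)-\log(Y_{i,c}+\varepsilon)\bigr)^{2}
  \;+\; \alpha_L\,\frac{1}{w h}\sum_{i=1}^{w h}\bigl(\hat{H}_i - H_i\bigr)^{2},
\qquad
M_i \;=\; \max\bigl(M_i^{\mathrm{bright}},\,M_i^{\mathrm{dark}}\bigr),
```

with, for instance, masks built from the luminance as Mi^bright = 1[Ii > ta] and Mi^dark = 1[Ii < tb], where Ii = 0.299Xr + 0.587Xg + 0.114Xb. The log-domain difference matches the log(·) operation named in the text and compresses the wide value range of HDR targets.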
Drawings
FIG. 1 is a flow chart of a high dynamic range image generation method;
FIG. 2 is a training image contrast map;
FIG. 3 is a diagram of a network architecture;
FIG. 4 is a graph of network model test results;
FIG. 5 is a diagram of a forward inference process;
FIG. 6 is a comparative outdoor 1 chart;
FIG. 7 is a comparative view of the outdoor unit 2;
FIG. 8 is a comparison of outdoor evening;
FIG. 9 is a test image contrast map;
FIG. 10 is a graph of HDR-VDP2 visualization results.
Detailed Description
The embodiments of the present invention will be further described with reference to the accompanying drawings.
As shown in fig. 1, the present invention provides a high dynamic range image generation method based on a multi-branch codec neural network, which includes the following steps:
s1: collecting and cleaning an HDR image;
s2: preprocessing the cleaned HDR image to obtain an LDR image;
s3: taking the LDR image as the input of a multi-branch codec neural network model, and training until convergence;
s4: and testing the input test image by using the trained multi-branch codec neural network model to generate a high dynamic range image.
In the embodiment of the present invention, as shown in fig. 1, in step S1 the HDR images are generated by a multi-exposure method. The cleaning procedure is: manually or with scripts, discard images in which the proportion of dead pixels (pixel value 0) exceeds a threshold, as well as image files with corrupted data.
In an embodiment of the invention, the threshold is 70%.
In the embodiment of the present invention, as shown in fig. 1, step S2 includes the following sub-steps:
s21: randomly cropping the cleaned HDR image, and adjusting the size of the HDR image to 256 × 256;
s22: and sequentially carrying out random tone, saturation adjustment, random histogram cutting and tone mapping of random parameters on the HDR image after size adjustment to obtain an LDR image.
In the present invention, high-quality HDR images have high resolution, so to reduce the computational burden the original image is randomly cropped and then resized to a small fixed-size image (the model in the invention uses 256 × 256); the resulting image then undergoes random hue and saturation adjustment, random histogram clipping, and tone mapping with random parameters to obtain an LDR image. This series of operations must be controlled to produce LDR images of moderate quality: if the generated LDR image is too poor, the network cannot converge normally, and if it is too good, the effect of the network is not evident.
The training image pairs obtained by preprocessing are shown in fig. 2, where the two images on the left are LDR inputs and the two on the right are the corresponding tone-mapped HDR images. The HDR images clearly retain more detail.
In the embodiment of the present invention, as shown in fig. 1, in step S2 the values clipped by random histogram clipping are the top 3%-5% of pixels with the highest pixel values across the RGB channels of the whole image.
In the invention, the clipping ratio used is the top 3-5% of pixels with the highest values in the RGB channels of the whole picture. This data degradation can be learned effectively by the network and yields a measurable quality improvement. The image brightness is also suppressed so that values in low-exposure regions are confined to a low range; some information is then lost after 8-bit quantization, which lets the network learn the quantization error.
In the embodiment of the present invention, as shown in fig. 3, in step S3, the multi-branch codec neural network model includes an encoder and a decoder;
the encoder comprises a detail information processing network, an intermediate frequency information processing network and a global information processing network; the detail information processing network is used for extracting the detail information of the LDR image; the intermediate frequency information processing network is used for extracting intermediate frequency characteristic information of the LDR image; the global information processing network is used for extracting global characteristic information of the LDR image;
the decoder comprises an information fusion network; the information fusion network is used for fusing the cascaded detail information, the intermediate frequency characteristic information and the global characteristic information.
In the present invention, the encoder network finally produces 64-channel feature information of size 1 × 1, which is replicated into 64-channel feature information of size 256 × 256 for the final decoding. The LDR image is fed into the network, and after passing through each encoder branch, the feature information output by all parts of the encoder is fused.
In the embodiment of the present invention, as shown in fig. 3, in the detail information processing network, the number of channels is 64 and 128, respectively, the step size is 1, the convolution kernel filling value is 1, and the shape is 3 × 3;
in the intermediate frequency information processing network, the number of channels is 64, the step length is 1, the filling value of a convolution kernel is 2, the shape is 3 x 3, and the sparse convolution coefficient is 2;
in the global information processing network, the number of channels is 64, and the step length is 1; the 1 st to 6 th sets of convolution kernels have a fill value of 1, a shape of 3 x 3, and the seventh set of convolution kernels has a fill value of 0, a shape of 4 x 4.
In the invention, each square in the detail information processing network represents a group of convolution kernels with channel counts of 64 and 128 respectively, stride 1, padding 1, and shape 3 × 3; this part focuses on pixel-level learning and performs the extraction of detail information.
Each square in the intermediate frequency information processing network represents a group of convolution kernels with channel count 64, stride 1, padding 2, and shape 3 × 3; dilated (sparse) convolution with coefficient 2 is used to enlarge the receptive field, so that this branch extracts the intermediate frequency feature information of the input LDR image.
Each square in the global information processing network represents a group of convolution kernels with channel count 64 and stride 1; the first six groups have padding 1 and shape 3 × 3, and the last group has padding 0 and shape 4 × 4, so that this branch extracts the global feature information of the input LDR image.
This branch finally produces 64-channel feature information of size 1 × 1, which is replicated into 64-channel feature information of size 256 × 256 for the final decoding.
In the embodiment of the present invention, as shown in fig. 3, in the information fusion network, the number of channels is 64 and 3, the shape is 1 × 1, the step size is 1, and the convolution kernel padding value is 0.
In the embodiment of the present invention, as shown in fig. 3, in the multi-branch codec neural network model, the activation functions of the detail information processing network, the intermediate frequency information processing network, and the global information processing network are Selu; the activation function of the first convolution module in the information fusion network is Selu, and the activation function of the second convolution module is sigmoid.
In the embodiment of the present invention, as shown in FIG. 1, in step S3 the loss function L used to train the multi-branch codec neural network model is defined as follows (the formulas themselves are rendered only as images in the original filing). L combines a masked reconstruction term and a hue term, where: Ii = 0.299Xr + 0.587Xg + 0.114Xb is the image luminance; Mi denotes the final mask, Mi^bright the bright-area mask, and Mi^dark the dark-area mask; ta is the threshold for deciding whether an image region is over-exposed, and tb the threshold for deciding whether it is under-exposed; Xr, Xg, and Xb are the red, green, and blue channel values of pixel X; w and h are the width and height of the image; Ŷi,c is the network output value of the i-th pixel in channel c; ε is a small constant; Yi,c is the Ground Truth value of the i-th pixel in channel c; αL is the weight of the hue loss in the loss function; Ĥi is the i-th pixel value of a color channel of the network output image; Hi is the corresponding pixel value of the Ground Truth image; and log(·) denotes the logarithm operation.
In the embodiment of the invention, after training of the network model has converged, the model is tested. The input test images and output results are shown in fig. 4.
As shown in fig. 5, to better understand the present invention, the complete forward inference process of the network is summarized:
1. compute a mask on the LDR image according to a threshold;
2. convert the LDR image to the HDR domain (using a gamma transformation) and feed it into the network;
3. point-multiply the network output by the computed mask; this operation extracts the information of the corresponding regions of the image;
4. point-multiply the input converted to the HDR domain by (1 - mask), making full use of the information in the normally exposed regions of the original image;
5. sum the outputs of steps 3 and 4 to obtain the final output.
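The five steps above can be sketched as follows; the luminance threshold, the gamma value, and the hard (binary) mask are illustrative simplifications of the masks the patent builds from the thresholds ta and tb described earlier:

```python
import numpy as np

def blend_hdr(ldr, net_out, t_high=0.95, gamma=2.0):
    """Sketch of forward-inference steps 1-5: build a high-exposure mask
    from the LDR luminance, gamma-expand the input into the HDR domain,
    and keep the network output only in the masked regions."""
    luminance = (0.299 * ldr[..., 0] + 0.587 * ldr[..., 1]
                 + 0.114 * ldr[..., 2])                       # step 1
    mask = (luminance > t_high).astype(np.float64)[..., None]
    hdr_domain = np.power(ldr, gamma)                         # step 2
    # Steps 3-5: network estimate in poorly exposed areas,
    # gamma-expanded input everywhere else.
    return mask * net_out + (1.0 - mask) * hdr_domain

ldr = np.full((2, 2, 3), 0.5)
ldr[0, 0] = 1.0                       # one saturated pixel
net_out = np.full((2, 2, 3), 2.0)     # stand-in network estimate
out = blend_hdr(ldr, net_out)
```

The saturated pixel takes the network's estimate while the well-exposed pixels keep the gamma-expanded original values, which is exactly the blending that steps 3-5 describe.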
To verify the effect of the present invention, the above results were compared with "LDR to HDR image mapping with iterative preprocessing" proposed by Y. Huo et al. in 2013 (denoted Huo below) and "ExpandNet: A Deep Convolutional Neural Network for High Dynamic Range Expansion from Low Dynamic Range Content" proposed by D. Marnerides et al. in 2018 (denoted ExpandNet below). The HDR images generated by the three algorithms are tone mapped to give a subjective comparison; objective results are reported with the HDR-VDP2 index.
Subjective results: in figs. 6, 7, and 8, the input images for evaluation are outdoor 1, outdoor 2, and outdoor evening, respectively. The test images include both over-exposed and under-exposed cases. In each group, the upper left is the LDR image fed to the network, the upper right is the result of Huo, the lower left is the result of ExpandNet, and the lower right is the result of the proposed method. By comparison, the present invention produces more natural colors and richer details in the poorly exposed areas, and noise in the low-exposure areas is effectively suppressed.
Objective results: using HDR-VDP2 as the objective metric, the test images shown in fig. 9 are, clockwise from the first image in the first row, outdoor 1, outdoor 2, and outdoor evening — the same test pictures used for the subjective comparison. The corresponding visualization results are given in fig. 10 and the Q-value comparison in Table 1. In fig. 10, the rows from first to last correspond to the inputs outdoor 1, outdoor 2, and outdoor evening; the first column shows the results of Huo, the second those of ExpandNet, and the third those of the present invention. More blue regions indicate a higher-quality generated HDR image: blue pixels mark areas with little perceptual difference between the original and target images, red pixels mark areas with a large perceptual difference, and green pixels lie in between. Table 1 reports the HDR-VDP2 evaluation index Q, where a larger value means the generated HDR image has a lower perceptual difference from the real HDR image. Compared with the other two methods, the visualization results of the present invention contain more blue and fewer red areas, and the invention achieves a higher Q score, showing an advantage in both the visualization maps and the specific Q values.
TABLE 1: HDR-VDP2 Q-value comparison of Huo, ExpandNet, and the proposed method (the table is rendered as an image in the original filing).
The working principle and process of the invention are as follows: the invention estimates, by deep learning, the detail information lost in the high-exposure and low-exposure areas of a single-frame LDR image to obtain an HDR image. In an ordinary LDR image, details in the bright or dark parts are often lost because the dynamic range of the camera is insufficient or because strong illumination in the scene creates excessive contrast. Moreover, when a camera records the illumination of a natural scene, hardware limits force a nonlinear compression of the high and low brightness values, so the captured image data cannot faithfully reflect the scene information. The key to reconstructing an HDR image from a single-frame LDR image is therefore to restore the nonlinearly compressed, detail-poor regions of high and low exposure. The invention uses deep learning to estimate the lost information and linearize the high-exposure and low-exposure areas, reconstructing an HDR image of better quality.
The invention has the beneficial effects that: it provides a neural-network-based high dynamic range image generation method. A single-frame low dynamic range image captured in a real scene is passed through a neural network with a multi-branch codec structure, which outputs a high dynamic range image of high imaging quality.
It will be appreciated by those of ordinary skill in the art that the embodiments described herein are intended to assist the reader in understanding the principles of the invention and are to be construed as being without limitation to such specifically recited embodiments and examples. Those skilled in the art can make various other specific changes and combinations based on the teachings of the present invention without departing from the spirit of the invention, and these changes and combinations are within the scope of the invention.

Claims (9)

1. A high dynamic range image generation method based on a multi-branch codec neural network is characterized by comprising the following steps:
S1: collecting and cleaning HDR images;
S2: preprocessing the cleaned HDR images to obtain LDR images;
S3: taking the LDR images as the input of a multi-branch codec neural network model and training it until convergence;
S4: feeding a test image to the trained multi-branch codec neural network model to generate a high dynamic range image.
2. The method for generating a high dynamic range image based on a multi-branch codec neural network as claimed in claim 1, wherein in step S1, the HDR images are generated by a multi-exposure method; the cleaning comprises: removing, manually or by script, images in which the area of dead pixels with a pixel value of 0 exceeds a threshold, as well as image files with damaged data.
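A possible script-based cleaning step of the kind claim 2 describes can be sketched as follows (the 5% area threshold and the function names are illustrative assumptions; the patent does not specify the threshold value):

```python
import numpy as np

def is_bad_hdr(img, dead_fraction_threshold=0.05):
    """Flag an HDR image for removal when the fraction of dead
    (all-channels-zero) pixels exceeds a threshold.

    img: float array of shape (H, W, 3).
    The 0.05 threshold is an illustrative assumption.
    """
    dead = np.all(img == 0, axis=-1)  # per-pixel: all channels zero
    return bool(dead.mean() > dead_fraction_threshold)

good = np.random.rand(8, 8, 3) + 0.1  # strictly positive, no dead pixels
bad = good.copy()
bad[:4, :, :] = 0.0                   # half of the pixels are dead
```

Detecting files with damaged data would additionally require catching read errors from the HDR loader, which is omitted here.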
3. The multi-branch codec neural network-based high dynamic range image generation method according to claim 1, wherein the step S2 comprises the following sub-steps:
S21: randomly cropping the cleaned HDR image and resizing it to 256 × 256;
S22: sequentially applying random hue and saturation adjustment, random histogram clipping, and tone mapping with random parameters to the resized HDR image to obtain an LDR image.
4. The method for generating a high dynamic range image based on a multi-branch codec neural network as claimed in claim 3, wherein in step S2, the objects of random clipping and random histogram clipping are the top 3%-5% of pixels with the highest values in the RGB channels of the whole image.
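The clipping of the brightest 3%-5% of pixel values in claim 4 can be sketched as a percentile clip (the 5% setting below is one point in the claimed range; the function name is illustrative, not from the patent):

```python
import numpy as np

def clip_top_percent(img, percent=5.0):
    """Clip the top `percent` brightest values across the RGB
    channels of the whole image, leaving other values unchanged.
    percent=5.0 is one point in the 3%-5% range of claim 4."""
    ceiling = np.percentile(img, 100.0 - percent)  # 95th-percentile value
    return np.minimum(img, ceiling)

img = np.arange(300, dtype=np.float64).reshape(10, 10, 3)
clipped = clip_top_percent(img, percent=5.0)
```

Clipping the highlights this way simulates the saturation behavior of a limited-dynamic-range sensor, which is exactly the degradation the network is later trained to undo.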
5. The multi-branch codec neural network-based high dynamic range image generation method according to claim 1, wherein in the step S3, the multi-branch codec neural network model includes an encoder and a decoder;
the encoder comprises a detail information processing network, an intermediate frequency information processing network and a global information processing network; the detail information processing network is used for extracting the detail information of the LDR image; the intermediate frequency information processing network is used for extracting intermediate frequency characteristic information of the LDR image; the global information processing network is used for extracting global characteristic information of the LDR image;
the decoder comprises an information fusion network; the information fusion network is used for fusing the cascaded detail information, the intermediate frequency characteristic information and the global characteristic information.
6. The method according to claim 5, wherein in the detail information processing network, the numbers of channels are 64 and 128, the stride is 1, the convolution kernel padding value is 1, and the kernel shape is 3 × 3;
in the intermediate frequency information processing network, the number of channels is 64, the stride is 1, the convolution kernel padding value is 2, the kernel shape is 3 × 3, and the dilation coefficient is 2;
in the global information processing network, the number of channels is 64 and the stride is 1; the 1st to 6th groups of convolution kernels have a padding value of 1 and a shape of 3 × 3, and the 7th group of convolution kernels has a padding value of 0 and a shape of 4 × 4.
7. The method according to claim 5, wherein in the information fusion network, the numbers of channels are 64 and 3, the convolution kernels are 1 × 1 in shape, the stride is 1, and the convolution kernel padding value is 0.
8. The method according to claim 5, wherein in the multi-branch codec neural network model, the activation functions of the detail information processing network, the intermediate frequency information processing network, and the global information processing network are SELU; in the information fusion network, the activation function of the first convolution module is SELU and that of the second convolution module is Sigmoid.
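The kernel parameters in claims 6-7 can be sanity-checked with the standard convolution output-size formula, out = ⌊(in + 2·padding − dilation·(kernel − 1) − 1) / stride⌋ + 1. The small sketch below (pure arithmetic, not the patent's network code) shows that the detail branch (3 × 3, padding 1) and the dilated mid-frequency branch (3 × 3, padding 2, dilation 2) both preserve the 256 × 256 input resolution, while a 4 × 4 kernel with padding 0 collapses a 4 × 4 feature map to 1 × 1:

```python
def conv_out(size, kernel, padding, stride=1, dilation=1):
    """Spatial output size of a 2-D convolution along one axis."""
    return (size + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1

# Detail branch: 3x3 kernels, padding 1 -> resolution preserved.
assert conv_out(256, kernel=3, padding=1) == 256
# Mid-frequency branch: 3x3, padding 2, dilation 2 -> also preserved,
# but each output pixel sees a wider 5x5 receptive field.
assert conv_out(256, kernel=3, padding=2, dilation=2) == 256
# Final 4x4 kernel of the global branch, padding 0: a 4x4 map -> 1x1.
assert conv_out(4, kernel=4, padding=0) == 1
# 1x1 fusion kernels, padding 0 -> resolution preserved.
assert conv_out(256, kernel=1, padding=0) == 256
```

Note that for the global branch's final 4 × 4 kernel to see a 4 × 4 map and produce a single global descriptor, the preceding layers must downsample the 256 × 256 input; the claim text does not spell this out, but the related ExpandNet architecture (cited in this patent's non-patent references) uses stride-2 convolutions in its global branch for this purpose.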
9. The method for generating a high dynamic range image based on a multi-branch codec neural network of claim 1, wherein in step S3, the loss function of the multi-branch codec neural network model is computed by a formula given in the patent as formula images (not rendered in this text), in which: M_i denotes the final mask, obtained from a bright-area mask and a dark-area mask (each defined by a formula image); ta denotes the threshold for judging whether an image area is over-exposed; tb denotes the threshold for judging whether an image area is under-exposed; I_i = 0.299X_r + 0.587X_g + 0.114X_b denotes the luminance of the image, where X_r, X_g, and X_b denote the red, green, and blue channel values of pixel X; w denotes the width of the image and h its height; Ŷ_{i,c} denotes the network output value of the i-th pixel in channel c; ε denotes a small constant; Y_{i,c} denotes the Ground Truth value of the i-th pixel in channel c; α_L denotes the weight of the hue loss in the loss function; Ĥ_i denotes the i-th pixel value of a color channel of the network output image; H_i denotes the i-th pixel value of the color channel of the Ground Truth image; and log(·) denotes the logarithm operation.
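The formula images of claim 9 are not reproduced in this text. From the surrounding variable definitions, one plausible reading of the loss structure can be sketched as follows; this is an assumption for the reader's orientation, not the patent's verbatim formula:

```latex
% Luminance (given explicitly in the claim):
I_i = 0.299\,X_r + 0.587\,X_g + 0.114\,X_b

% Bright/dark masks from thresholds t_a, t_b (assumed form):
M_i^{\mathrm{bright}} = \max\!\left(0,\ \tfrac{I_i - t_a}{1 - t_a}\right),\qquad
M_i^{\mathrm{dark}} = \max\!\left(0,\ \tfrac{t_b - I_i}{t_b}\right),\qquad
M_i = \max\!\left(M_i^{\mathrm{bright}},\ M_i^{\mathrm{dark}}\right)

% Masked log-domain reconstruction loss plus weighted hue loss (assumed structure):
\mathcal{L} = \frac{1}{w\,h}\sum_{i=1}^{w h}\sum_{c} M_i
  \left|\log\!\left(\hat{Y}_{i,c} + \varepsilon\right)
      - \log\!\left(Y_{i,c} + \varepsilon\right)\right|
  + \alpha_L\,\mathcal{L}_{\mathrm{hue}}\!\left(\hat{H}_i,\ H_i\right)
```

The log-domain comparison with a small ε is a standard choice for HDR reconstruction losses, since HDR pixel values span several orders of magnitude; the masks restrict the reconstruction penalty to the over- and under-exposed regions the network is meant to hallucinate.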
CN202110246503.7A 2021-03-05 2021-03-05 High dynamic range image generation method based on multi-branch codec neural network Pending CN113096029A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110246503.7A CN113096029A (en) 2021-03-05 2021-03-05 High dynamic range image generation method based on multi-branch codec neural network


Publications (1)

Publication Number Publication Date
CN113096029A true CN113096029A (en) 2021-07-09

Family

ID=76666728

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110246503.7A Pending CN113096029A (en) 2021-03-05 2021-03-05 High dynamic range image generation method based on multi-branch codec neural network

Country Status (1)

Country Link
CN (1) CN113096029A (en)


Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103905738A (en) * 2012-12-31 2014-07-02 博世汽车部件(苏州)有限公司 High-dynamic-range image generation system and method
CN110300989A (en) * 2017-05-15 2019-10-01 谷歌有限责任公司 Configurable and programmable image processor unit
CN111105376A (en) * 2019-12-19 2020-05-05 电子科技大学 Single-exposure high-dynamic-range image generation method based on double-branch neural network
CN111372006A (en) * 2020-03-03 2020-07-03 山东大学 High dynamic range imaging method and system for mobile terminal


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
D. Marnerides et al.: "ExpandNet: A Deep Convolutional Neural Network for High Dynamic Range Expansion from Low Dynamic Range Content", arXiv:1803.02266 *

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113973175A (en) * 2021-08-27 2022-01-25 天津大学 Rapid HDR video reconstruction method
CN114332755A (en) * 2021-12-06 2022-04-12 南京瀚元科技有限公司 Power generation incinerator monitoring method based on binocular three-dimensional modeling
CN114359083A (en) * 2021-12-24 2022-04-15 北京航空航天大学 High-dynamic thermal infrared image self-adaptive preprocessing method for interference environment
CN114693548A (en) * 2022-03-08 2022-07-01 电子科技大学 Dark channel defogging method based on bright area detection
CN114693548B (en) * 2022-03-08 2023-04-18 电子科技大学 Dark channel defogging method based on bright area detection
CN114998141A (en) * 2022-06-07 2022-09-02 西北工业大学 Space environment high dynamic range imaging method based on multi-branch network
CN114998141B (en) * 2022-06-07 2024-03-12 西北工业大学 Space environment high dynamic range imaging method based on multi-branch network
CN116912602A (en) * 2023-09-11 2023-10-20 荣耀终端有限公司 Training method of image processing model, image processing method and electronic equipment
CN116912602B (en) * 2023-09-11 2023-12-15 荣耀终端有限公司 Training method of image processing model, image processing method and electronic equipment


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210709