CN113096029A - High dynamic range image generation method based on multi-branch codec neural network - Google Patents
- Publication number: CN113096029A (application CN202110246503.7A)
- Authority
- CN
- China
- Prior art keywords
- image
- branch
- neural network
- value
- network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G06T5/90
- G06F18/253 — Fusion techniques of extracted features (pattern recognition)
- G06N3/045 — Combinations of networks (neural network architecture)
- G06N3/048 — Activation functions
- G06N3/08 — Learning methods
- G06N5/046 — Forward inferencing; production systems
- G06T2207/20208 — High dynamic range [HDR] image processing
Abstract
The invention discloses a high dynamic range image generation method based on a multi-branch codec neural network, which comprises the following steps: S1: collecting and cleaning HDR images; S2: preprocessing the cleaned HDR images to obtain LDR images; S3: taking the LDR images as the input of a multi-branch codec neural network model and training it until convergence; S4: testing an input image with the trained multi-branch codec neural network model to generate a high dynamic range image. Given a single-frame low dynamic range image captured in a real scene, the multi-branch codec neural network outputs a high dynamic range image of high imaging quality.
Description
Technical Field
The invention belongs to the technical field of high dynamic image generation, and particularly relates to a high dynamic range image generation method based on a multi-branch codec neural network.
Background
In recent years, High Dynamic Range (HDR) imaging techniques have been widely studied and applied in both academia and industry. HDR imaging mainly involves the acquisition, encoding, and display of high dynamic range images. For acquisition, the most common method captures multiple Low Dynamic Range (LDR) frames of the same scene at different exposures and merges them into an HDR image. However, generating a high-quality HDR image from multiple LDR frames requires aligning the backgrounds of the differently exposed frames and handling foreground motion, both of which increase algorithm complexity and degrade reconstruction quality. Moreover, a large number of real-world images exist only as single frames, since most photographs are captured with a single exposure. As camera performance improves, a single-exposure image carries enough information to reconstruct a high dynamic range image. Methods for generating HDR images from single-frame LDR images have therefore attracted increasing attention from researchers.
The conventional approach to single-exposure HDR imaging stretches the dynamic range of the luminance channel of a single frame in a specific way to obtain an HDR image. The main algorithms fall into two categories: methods based on the Camera Response Function (CRF) and methods based on an Inverse Tone Mapping operator (ITM). The first estimates a camera response function from the input image and applies it to the pixel values of the original irradiance field to obtain the target HDR image. The second is the mainstream approach to single-frame HDR generation: by applying a piecewise mapping function or a specific inverse tone mapping operator to the differently exposed regions of the image, it expands the dynamic range of the original LDR image and enhances detail in its poorly imaged regions, giving the generated HDR image a better visual appearance.
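As a minimal illustration of the second category, a global inverse tone mapping operator can be sketched as a gamma expansion that undoes the display nonlinearity and rescales to a target peak luminance. The gamma value and peak luminance below are illustrative assumptions, not parameters from the invention:

```python
import numpy as np

def gamma_expand_itm(ldr_u8, gamma=2.2, peak=1000.0):
    """Toy inverse tone mapping: normalize an 8-bit LDR image, undo the
    display gamma, and rescale to a target peak luminance (in nits).
    Real ITM operators treat differently exposed regions separately."""
    ldr = np.clip(ldr_u8.astype(np.float64) / 255.0, 0.0, 1.0)
    return (ldr ** gamma) * peak

hdr = gamma_expand_itm(np.array([0, 128, 255], dtype=np.uint8))
# mid-gray expands to roughly a fifth of peak; white maps to the full peak
```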
With the development of deep learning, methods that generate HDR images with deep convolutional neural networks have appeared in recent years. A deep network can replace the various complex algorithms of traditional methods to realize the nonlinear mapping from LDR to HDR images, and can also remedy their shortcomings, such as limited generalization and algorithms that are complex and hard to implement in hardware. For single-exposure HDR generation, a convolutional neural network extracts and combines low-level features into abstract high-level representations and has strong fitting capability. Using deep learning to enhance, restore, and estimate the detail of the poorly exposed regions of a single frame can largely reproduce the original scene information of the LDR image; compared with traditional HDR acquisition methods, a trained deep network also offers lower computational complexity and better real-time performance.
Disclosure of Invention
The invention aims to solve the problem of high dynamic image generation and provides a high dynamic range image generation method based on a multi-branch codec neural network.
The technical scheme of the invention is as follows: a high dynamic range image generation method based on a multi-branch codec neural network comprises the following steps:
s1: collecting and cleaning an HDR image;
s2: preprocessing the cleaned HDR image to obtain an LDR image;
s3: taking the LDR image as the input of a multi-branch codec neural network model, and training until convergence;
s4: and testing the input test image by using the trained multi-branch codec neural network model to generate a high dynamic range image.
The invention has the beneficial effects that: the invention provides a high dynamic range image generation method based on a neural network. Given a single-frame low dynamic range image captured in a real scene, the neural network with the multi-branch codec structure outputs a high dynamic range image of high imaging quality.
Further, in step S1, the HDR images are generated using a multi-exposure method; the cleaning comprises: removing, manually or by script, images in which the proportion of zero-valued (dead) pixels exceeds a threshold, as well as images with corrupted file data.
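Such a cleaning rule can be sketched as a short script that flags images whose fraction of zero-valued pixels exceeds the threshold; the function name is illustrative, and the 70% default is taken from the embodiment described later:

```python
import numpy as np

def is_bad_hdr(img, threshold=0.70):
    """Flag an HDR image for removal when the fraction of zero-valued
    (dead) pixels exceeds the threshold; corrupted files would be
    caught separately when decoding fails."""
    return float(np.mean(img == 0)) > threshold
```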
Further, step S2 includes the following sub-steps:
s21: randomly cropping the cleaned HDR image, and adjusting the size of the HDR image to 256 × 256;
s22: and sequentially carrying out random tone, saturation adjustment, random histogram cutting and tone mapping of random parameters on the HDR image after size adjustment to obtain an LDR image.
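The two sub-steps above can be sketched as one preprocessing function; the jitter ranges, clipping percentile, and gamma range below are illustrative assumptions, since the text fixes only the 256 × 256 size and the 3%-5% clipping proportion:

```python
import numpy as np

rng = np.random.default_rng(0)

def hdr_to_ldr(hdr, crop=256):
    """S21: random crop to a fixed 256x256 size. S22: random tone /
    saturation jitter, histogram clipping of the brightest pixels, and
    tone mapping with a random gamma, quantized to 8 bits."""
    h, w, _ = hdr.shape
    y = rng.integers(0, h - crop + 1)
    x = rng.integers(0, w - crop + 1)
    patch = hdr[y:y + crop, x:x + crop].astype(np.float64)

    patch *= rng.uniform(0.9, 1.1)                           # random tone/saturation jitter
    ceiling = np.percentile(patch, rng.uniform(95.0, 97.0))  # clip top 3-5% of values
    patch = np.clip(patch, 0.0, ceiling) / max(ceiling, 1e-8)

    gamma = rng.uniform(1.8, 2.4)                            # tone mapping, random gamma
    return (patch ** (1.0 / gamma) * 255.0).astype(np.uint8)
```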
The beneficial effects of the further scheme are as follows: high-quality HDR images have high resolution, so to reduce the computational burden the original image is randomly cropped and then resized to a small fixed-size image (the model in the invention uses 256 × 256); the resulting image then undergoes random hue and saturation adjustment, random histogram clipping, and tone mapping with random parameters to obtain an LDR image. These operations must be controlled so that the generated LDR image is of moderate quality: if it is too poor, the network cannot converge normally; if it is too good, the benefit of the network is not apparent.
Further, in step S2, the random histogram clipping targets the top 3%-5% of pixels with the highest values in the RGB channels of the whole image.
The beneficial effects of the further scheme are as follows: clipping the top 3%-5% of pixel values in the RGB channels of the whole image produces a data degradation the network can learn effectively, yielding a measurable quality improvement; it also suppresses image brightness so that low-exposure values are confined to a narrow range, where the information loss introduced by subsequent 8-bit quantization lets the network learn the quantization error.
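The clipping and the quantization loss it induces can be illustrated as follows; the 4% fraction is one value inside the stated 3%-5% range:

```python
import numpy as np

def clip_and_quantize(img, clip_fraction=0.04):
    """Clip the brightest `clip_fraction` of values over all channels,
    normalize by the new ceiling, and quantize to 8 bits. After
    normalization, nearby low-exposure values collapse into the same
    8-bit bin -- the quantization error the network learns to undo."""
    ceiling = np.quantile(img.reshape(-1), 1.0 - clip_fraction)
    clipped = np.clip(img, 0.0, ceiling) / ceiling
    return np.round(clipped * 255.0).astype(np.uint8)
```

For example, two distinct dark radiance values such as 0.00100 and 0.00101 quantize to the same 8-bit value once a bright outlier sets the ceiling.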
Further, in step S3, the multi-branch codec neural network model includes an encoder and a decoder;
the encoder comprises a detail information processing network, an intermediate frequency information processing network and a global information processing network; the detail information processing network is used for extracting the detail information of the LDR image; the intermediate frequency information processing network is used for extracting intermediate frequency characteristic information of the LDR image; the global information processing network is used for extracting global characteristic information of the LDR image;
the decoder comprises an information fusion network; the information fusion network is used for fusing the cascaded detail information, the intermediate frequency characteristic information and the global characteristic information.
The beneficial effects of the further scheme are as follows: in the present invention, the encoder network finally obtains the feature information of 64 channels size 1 × 1, and uses the copy operation to obtain the feature information of 64 channels 256 × 256 size for final decoding. And the LDR image is input into a network, and after calculation through each branch of the encoder, the characteristic information output by each part of the encoder is fused.
Further, in the detail information processing network, the number of channels is 64 and 128 respectively, the step size is 1, the filling value of the convolution kernel is 1, and the shape is 3 x 3;
in the intermediate frequency information processing network, the number of channels is 64, the step length is 1, the filling value of a convolution kernel is 2, the shape is 3 x 3, and the sparse convolution coefficient is 2;
in the global information processing network, the number of channels is 64, and the step length is 1; the 1 st to 6 th sets of convolution kernels have a fill value of 1, a shape of 3 x 3, and the seventh set of convolution kernels has a fill value of 0, a shape of 4 x 4.
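Under the stated kernel shapes, paddings, dilation, and channel counts, the model can be sketched in PyTorch. Branch depths are not fully specified in the text, and the stride-2 downsampling in the global branch is an assumption, chosen so that six 3 × 3 convolutions followed by a 4 × 4 convolution reduce a 256 × 256 input to 1 × 1:

```python
import torch
import torch.nn as nn

class MultiBranchHDRNet(nn.Module):
    """Sketch of the multi-branch codec; branch depths and the
    global-branch stride are assumptions, other hyperparameters
    follow the text."""
    def __init__(self):
        super().__init__()
        # detail branch: 3x3 convs, stride 1, padding 1, 64 then 128 channels
        self.detail = nn.Sequential(
            nn.Conv2d(3, 64, 3, padding=1), nn.SELU(),
            nn.Conv2d(64, 128, 3, padding=1), nn.SELU(),
        )
        # mid-frequency branch: dilated 3x3 convs (dilation 2, padding 2)
        self.mid = nn.Sequential(
            nn.Conv2d(3, 64, 3, padding=2, dilation=2), nn.SELU(),
            nn.Conv2d(64, 64, 3, padding=2, dilation=2), nn.SELU(),
        )
        # global branch: six 3x3 convs (assumed stride 2: 256 -> 4),
        # then one 4x4 conv with padding 0 down to 1x1
        layers, in_ch = [], 3
        for _ in range(6):
            layers += [nn.Conv2d(in_ch, 64, 3, stride=2, padding=1), nn.SELU()]
            in_ch = 64
        layers += [nn.Conv2d(64, 64, 4, padding=0), nn.SELU()]
        self.global_branch = nn.Sequential(*layers)
        # decoder: 1x1 convs with 64 and 3 channels, SELU then sigmoid
        self.fuse = nn.Sequential(
            nn.Conv2d(128 + 64 + 64, 64, 1), nn.SELU(),
            nn.Conv2d(64, 3, 1), nn.Sigmoid(),
        )

    def forward(self, x):
        d, m = self.detail(x), self.mid(x)
        g = self.global_branch(x)                      # (N, 64, 1, 1)
        g = g.expand(-1, -1, x.shape[2], x.shape[3])   # "copy" to full size
        return self.fuse(torch.cat([d, m, g], dim=1))
```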
The beneficial effects of the further scheme are as follows: in the invention, each square in the detail information processing network represents a group of convolution kernels with 64 and 128 channels respectively, stride 1, padding 1, and shape 3 × 3; this branch focuses on pixel-level learning and performs the extraction of detail information.

Each square in the intermediate frequency information processing network represents a group of convolution kernels with 64 channels, stride 1, padding 2, and shape 3 × 3; dilated convolution with coefficient 2 enlarges the receptive field, so this branch extracts the intermediate frequency feature information of the input LDR image.

Each square in the global information processing network represents a group of convolution kernels with 64 channels and stride 1; the first six groups have padding 1 and shape 3 × 3, and the last group has padding 0 and shape 4 × 4, so this branch extracts the global feature information of the input LDR image.

This branch finally produces 64-channel feature information of size 1 × 1, which is replicated by a copy operation into 64-channel feature information of size 256 × 256 for the final decoding.
Further, in the information fusion network, the number of channels is 64 and 3, the shape is 1 × 1, the step size is 1, and the convolution kernel filling value is 0.
Furthermore, in the multi-branch codec neural network model, the activation functions of the detail information processing network, the intermediate frequency information processing network and the global information processing network are Selu; the activation function of the first convolution module in the information fusion network is Selu, and the activation function of the second convolution module is sigmoid.
Further, in step S3, the loss function used to train the multi-branch codec neural network model is calculated by the following formula:

where I_i = 0.299 X_r + 0.587 X_g + 0.114 X_b; M_i denotes the final mask; M_i^bright denotes the bright-area mask; M_i^dark denotes the dark-area mask; t_a denotes the threshold for judging whether an image region is over-exposed; t_b denotes the threshold for judging whether an image region is under-exposed; I_i denotes the image luminance; X_r, X_g, and X_b denote the red, green, and blue channel values of pixel X; w denotes the width of the image; h denotes the height of the image; Ŷ_{i,c} denotes the network output value of the i-th pixel in channel c; ε denotes a small constant; Y_{i,c} denotes the Ground Truth value of the i-th pixel in channel c; α_L denotes the weight of the hue loss in the loss function; H̃_i denotes the i-th pixel value of a color channel of the network output image; H_i denotes the i-th pixel value of the color channel of the Ground Truth image; and log(·) denotes the logarithm operation.
Drawings
FIG. 1 is a flow chart of a high dynamic range image generation method;
FIG. 2 is a training image contrast map;
FIG. 3 is a diagram of a network architecture;
FIG. 4 is a graph of network model test results;
FIG. 5 is a diagram of a forward inference process;
FIG. 6 is a comparative outdoor 1 chart;
FIG. 7 is a comparative view of the outdoor unit 2;
FIG. 8 is a comparison of outdoor evening;
FIG. 9 is a test image contrast map;
FIG. 10 is a graph of HDR-VDP2 visualization results.
Detailed Description
The embodiments of the present invention will be further described with reference to the accompanying drawings.
As shown in fig. 1, the present invention provides a high dynamic range image generation method based on a multi-branch codec neural network, which comprises the following steps:
s1: collecting and cleaning an HDR image;
s2: preprocessing the cleaned HDR image to obtain an LDR image;
s3: taking the LDR image as the input of a multi-branch codec neural network model, and training until convergence;
s4: and testing the input test image by using the trained multi-branch codec neural network model to generate a high dynamic range image.
In the embodiment of the present invention, as shown in fig. 1, in step S1, an HDR image is generated using a multiple exposure method; the cleaning method comprises the following steps: and eliminating bad pixels with the pixel value of 0 in the area exceeding the threshold value and damaged data of the image file by utilizing manual work or scripts.
In an embodiment of the invention, the threshold is 70%.
In the embodiment of the present invention, as shown in fig. 1, step S2 includes the following sub-steps:
s21: randomly cropping the cleaned HDR image, and adjusting the size of the HDR image to 256 × 256;
s22: and sequentially carrying out random tone, saturation adjustment, random histogram cutting and tone mapping of random parameters on the HDR image after size adjustment to obtain an LDR image.
The beneficial effects of the further scheme are as follows: high-quality HDR images have high resolution, so to reduce the computational burden the original image is randomly cropped and then resized to a small fixed-size image (the model in the invention uses 256 × 256); the resulting image then undergoes random hue and saturation adjustment, random histogram clipping, and tone mapping with random parameters to obtain an LDR image. These operations must be controlled so that the generated LDR image is of moderate quality: if it is too poor, the network cannot converge normally; if it is too good, the benefit of the network is not apparent.
The pair of training images obtained by pre-processing is shown in fig. 2, where the left two are LDR images for input and the right two are the corresponding tone-mapped HDR images. It is clear that HDR images have better detail.
In the embodiment of the present invention, as shown in fig. 1, in step S2, the random histogram clipping targets the top 3%-5% of pixels with the highest values in the RGB channels of the whole image.
In the invention, clipping the top 3%-5% of pixel values in the RGB channels of the whole image produces a data degradation the network can learn effectively, yielding a measurable quality improvement; it also suppresses image brightness so that low-exposure values are confined to a narrow range, where the information loss introduced by subsequent 8-bit quantization lets the network learn the quantization error.
In the embodiment of the present invention, as shown in fig. 3, in step S3, the multi-branch codec neural network model includes an encoder and a decoder;
the encoder comprises a detail information processing network, an intermediate frequency information processing network and a global information processing network; the detail information processing network is used for extracting the detail information of the LDR image; the intermediate frequency information processing network is used for extracting intermediate frequency characteristic information of the LDR image; the global information processing network is used for extracting global characteristic information of the LDR image;
the decoder comprises an information fusion network; the information fusion network is used for fusing the cascaded detail information, the intermediate frequency characteristic information and the global characteristic information.
In the present invention, the encoder network finally produces 64-channel feature information of size 1 × 1, which is replicated by a copy operation into 64-channel feature information of size 256 × 256 for the final decoding. The LDR image is fed into the network, and after passing through each encoder branch, the feature information output by the branches is fused.
In the embodiment of the present invention, as shown in fig. 3, in the detail information processing network, the number of channels is 64 and 128, respectively, the step size is 1, the convolution kernel filling value is 1, and the shape is 3 × 3;
in the intermediate frequency information processing network, the number of channels is 64, the step length is 1, the filling value of a convolution kernel is 2, the shape is 3 x 3, and the sparse convolution coefficient is 2;
in the global information processing network, the number of channels is 64, and the step length is 1; the 1 st to 6 th sets of convolution kernels have a fill value of 1, a shape of 3 x 3, and the seventh set of convolution kernels has a fill value of 0, a shape of 4 x 4.
In the invention, each square in the detail information processing network represents a group of convolution kernels with 64 and 128 channels respectively, stride 1, padding 1, and shape 3 × 3; this branch focuses on pixel-level learning and performs the extraction of detail information.

Each square in the intermediate frequency information processing network represents a group of convolution kernels with 64 channels, stride 1, padding 2, and shape 3 × 3; dilated convolution with coefficient 2 enlarges the receptive field, so this branch extracts the intermediate frequency feature information of the input LDR image.

Each square in the global information processing network represents a group of convolution kernels with 64 channels and stride 1; the first six groups have padding 1 and shape 3 × 3, and the last group has padding 0 and shape 4 × 4, so this branch extracts the global feature information of the input LDR image.

This branch finally produces 64-channel feature information of size 1 × 1, which is replicated by a copy operation into 64-channel feature information of size 256 × 256 for the final decoding.
In the embodiment of the present invention, as shown in fig. 3, in the information fusion network, the number of channels is 64 and 3, the shape is 1 × 1, the step size is 1, and the convolution kernel padding value is 0.
In the embodiment of the present invention, as shown in fig. 3, in the multi-branch codec neural network model, the activation functions of the detail information processing network, the intermediate frequency information processing network, and the global information processing network are Selu; the activation function of the first convolution module in the information fusion network is Selu, and the activation function of the second convolution module is sigmoid.
In the embodiment of the present invention, as shown in FIG. 1, in step S3, the loss function used to train the multi-branch codec neural network model is calculated by the following formula:

where I_i = 0.299 X_r + 0.587 X_g + 0.114 X_b; M_i denotes the final mask; M_i^bright denotes the bright-area mask; M_i^dark denotes the dark-area mask; t_a denotes the threshold for judging whether an image region is over-exposed; t_b denotes the threshold for judging whether an image region is under-exposed; I_i denotes the image luminance; X_r, X_g, and X_b denote the red, green, and blue channel values of pixel X; w denotes the width of the image; h denotes the height of the image; Ŷ_{i,c} denotes the network output value of the i-th pixel in channel c; ε denotes a small constant; Y_{i,c} denotes the Ground Truth value of the i-th pixel in channel c; α_L denotes the weight of the hue loss in the loss function; H̃_i denotes the i-th pixel value of a color channel of the network output image; H_i denotes the i-th pixel value of the color channel of the Ground Truth image; and log(·) denotes the logarithm operation.
In the embodiment of the invention, after the network model training is converged, the network model is tested. The input test image and the output result graph are shown in fig. 4.
As shown in fig. 5, to better understand the present invention, the complete forward inference process of the network is summarized as follows:
1. Compute a mask on the LDR image according to the thresholds;
2. Convert the LDR image to the HDR domain (using a gamma transformation) and feed it into the network;
3. Multiply the network output element-wise by the computed mask; this extracts information from the corresponding regions of the image;
4. Multiply the HDR-domain input element-wise by (1 - mask), making full use of the normally exposed regions of the original image;
5. Sum the outputs of steps 3 and 4 to obtain the final output.
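These forward-inference steps can be sketched as a blending function, with the mask taken from the thresholding step and an assumed gamma of 2.2 for the HDR-domain conversion:

```python
import numpy as np

def blend_inference(ldr, net_output, mask, gamma=2.2):
    """Gamma-expand the LDR input to the HDR domain, then keep the
    network prediction in poorly exposed regions (mask) and the
    expanded input in normally exposed regions (1 - mask)."""
    hdr_input = ldr ** gamma
    return mask * net_output + (1.0 - mask) * hdr_input
```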
To verify the effect of the present invention, the above results were compared with "LDR to HDR image mapping with iterative preprocessing" proposed by Y. Huo et al. in 2013 (denoted Huo) and "ExpandNet: A Deep Convolutional Neural Network for High Dynamic Range Expansion from Low Dynamic Range Content" proposed by D. Marnerides et al. in 2018 (denoted ExpandNet). The HDR images generated by the three algorithms were tone mapped to give a subjective comparison; objective comparisons are given using the HDR-VDP2 index.
The subjective results are as follows: in figs. 6, 7, and 8, the evaluation inputs are outdoor 1, outdoor 2, and outdoor evening, respectively. The test images include both over-exposed and under-exposed cases. In each group, the upper left is the LDR image fed into the network, the upper right is the Huo result, the lower left is the ExpandNet result, and the lower right is the result of the present method. By comparison, the present invention produces more natural colors and richer details in poorly exposed areas, and effectively suppresses noise in low-exposure regions.
The objective results are as follows: using HDR-VDP2 as the objective metric, the test images shown in fig. 9 are outdoor 1, outdoor 2, and outdoor evening, in clockwise order from the first picture in the first row; they are the same test pictures used in the subjective comparison. The corresponding visualization results (fig. 10) and the Q-value comparison table (table 1) are given. In fig. 10, the rows from first to last correspond to the inputs outdoor 1, outdoor 2, and outdoor evening; the first column shows the results of Huo, the second column the results of ExpandNet, and the third column the results of the present invention. Table 1 lists the Q value of the HDR-VDP2 evaluation index: a larger value indicates a lower perceptual difference between the generated HDR image and the real HDR image, and the present invention achieves a higher score than the other two methods. In the HDR-VDP2 visualizations of fig. 10, blue pixels mark regions with a small perceptual difference between the original and target images, red pixels mark regions with a large perceptual difference, and green pixels indicate a perceptual difference between those of red and blue. Compared with the other two methods, the visualizations of the present invention contain more blue and fewer red areas, indicating a higher-quality generated image; the method thus shows advantages both in the visualization results and in the specific Q values.
TABLE 1
The working principle and process of the invention are as follows: the invention estimates, by deep learning, the detail information lost in the high-exposure and low-exposure regions of a single-frame LDR image to obtain an HDR image. In an ordinary LDR image, details in bright or dark areas are often lost because the dynamic range of the camera is insufficient or because strong illumination in the scene creates excessive contrast. Moreover, when a camera records the illumination of a natural scene, hardware limitations nonlinearly compress the highest and lowest brightness values, so the captured image data cannot faithfully reflect the scene information; the key to reconstructing an HDR image from a single-frame LDR image is therefore to restore the nonlinearly compressed, detail-poor regions at high and low exposures. The invention uses deep learning to estimate the lost information of the high- and low-exposure regions and to linearize them, reconstructing an HDR image of better quality.
The beneficial effects of the invention are as follows. The invention provides a high dynamic range image generation method based on a neural network: a single-frame low dynamic range image captured in a real scene is passed through a neural network with a multi-branch codec structure, which outputs a high dynamic range image of high imaging quality.
It will be appreciated by those of ordinary skill in the art that the embodiments described herein are intended to help the reader understand the principles of the invention, and that the invention is not limited to the specifically described embodiments and examples. Those skilled in the art can make various other specific changes and combinations based on the teachings of the present invention without departing from its spirit, and such changes and combinations remain within the scope of the invention.
Claims (9)
1. A high dynamic range image generation method based on a multi-branch codec neural network is characterized by comprising the following steps:
S1: collecting and cleaning HDR images;
S2: preprocessing the cleaned HDR images to obtain LDR images;
S3: taking the LDR images as the input of a multi-branch codec neural network model and training it until convergence;
S4: inputting a test image into the trained multi-branch codec neural network model to generate a high dynamic range image.
2. The method for generating a high dynamic range image based on a multi-branch codec neural network as claimed in claim 1, wherein in step S1 the HDR images are generated by a multi-exposure method, and the cleaning comprises: removing, manually or by script, images in which the area of bad pixels (pixel value 0) exceeds a threshold, as well as damaged image files.
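A minimal sketch of the cleaning rule in this claim, assuming "bad pixels" means zero-valued pixels and that an image is discarded when their share of the area exceeds a threshold (the 5% default and the helper name are illustrative assumptions, not values from the patent):

```python
import numpy as np

def is_bad_hdr(image: np.ndarray, zero_fraction_threshold: float = 0.05) -> bool:
    """Flag an HDR image for removal when the fraction of dead pixels
    (all channels zero) exceeds the threshold.  The 5% default is an
    assumption; the patent only states that a threshold is used."""
    zero_fraction = np.mean(np.all(image == 0, axis=-1))
    return zero_fraction > zero_fraction_threshold

clean = np.random.rand(8, 8, 3) + 0.01   # strictly positive -> no dead pixels
damaged = clean.copy()
damaged[:4, :, :] = 0.0                  # half of the pixel area is dead
```

Here `is_bad_hdr(clean)` is `False` while `is_bad_hdr(damaged)` is `True`, since half the area is dead.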
3. The multi-branch codec neural network-based high dynamic range image generation method according to claim 1, wherein the step S2 includes the following sub-steps:
S21: randomly cropping the cleaned HDR image and resizing it to 256 × 256;
S22: sequentially applying random hue and saturation adjustment, random histogram clipping, and tone mapping with random parameters to the resized HDR image to obtain an LDR image.
4. The method for generating a high dynamic range image based on a multi-branch codec neural network as claimed in claim 3, wherein in step S2 the objects of the random clipping and the random histogram clipping are the top 3%-5% of pixels with the highest values in the RGB channels of the whole image.
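A hedged sketch of the clipping-and-tone-mapping step from claims 3-4: the brightest roughly 4% of RGB values are clipped, and a simple gamma curve stands in for the patent's unspecified "tone mapping with random parameters" (function name, clip default, and gamma value are illustrative):

```python
import numpy as np

def hdr_to_ldr(hdr: np.ndarray, clip_percent: float = 4.0, gamma: float = 2.2) -> np.ndarray:
    """Clip the brightest `clip_percent` of RGB values, normalise, then
    tone-map with a gamma curve.  The gamma operator is a stand-in; the
    patent's exact tone-mapping operator is not specified."""
    ceiling = np.percentile(hdr, 100.0 - clip_percent)       # ~96th percentile of all RGB values
    clipped = np.minimum(hdr, ceiling) / max(ceiling, 1e-8)  # clip and normalise to [0, 1]
    return np.power(clipped, 1.0 / gamma)                    # simple gamma tone mapping

rng = np.random.default_rng(0)
hdr = rng.exponential(scale=1.0, size=(16, 16, 3))           # long-tailed, HDR-like values
ldr = hdr_to_ldr(hdr)
```

The output stays in [0, 1] with the original spatial shape, as an LDR training input should.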
5. The multi-branch codec neural network-based high dynamic range image generation method according to claim 1, wherein in the step S3, the multi-branch codec neural network model includes an encoder and a decoder;
the encoder comprises a detail information processing network, an intermediate frequency information processing network and a global information processing network; the detail information processing network is used for extracting the detail information of the LDR image; the intermediate frequency information processing network is used for extracting intermediate frequency characteristic information of the LDR image; the global information processing network is used for extracting global characteristic information of the LDR image;
the decoder comprises an information fusion network; the information fusion network is used for fusing the concatenated detail information, intermediate-frequency feature information, and global feature information.
6. The method according to claim 5, wherein in the detail information processing network, the numbers of channels are 64 and 128, the stride is 1, and the convolution kernels have padding 1 and shape 3 × 3;
in the intermediate frequency information processing network, the number of channels is 64, the stride is 1, the convolution kernels have padding 2 and shape 3 × 3, and the dilation rate of the sparse (dilated) convolution is 2;
in the global information processing network, the number of channels is 64 and the stride is 1; the 1st to 6th groups of convolution kernels have padding 1 and shape 3 × 3, and the seventh group has padding 0 and shape 4 × 4.
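As a check on the parameters above, the standard convolution output-size formula o = ⌊(i + 2p − d(k − 1) − 1)/s⌋ + 1 confirms that the stated padding values keep a 256 × 256 input at full resolution in the detail and intermediate-frequency branches (a small bookkeeping sketch, not code from the patent):

```python
def conv_out(size: int, kernel: int, stride: int = 1, padding: int = 0, dilation: int = 1) -> int:
    """Standard convolution output-size formula."""
    return (size + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1

# Detail branch: 3x3 kernel, stride 1, padding 1 -> resolution preserved.
assert conv_out(256, 3, 1, 1) == 256
# Intermediate-frequency branch: 3x3 kernel, padding 2, dilation 2 -> also preserved.
assert conv_out(256, 3, 1, 2, 2) == 256
# Fusion network: 1x1 kernel, padding 0 -> also preserved.
assert conv_out(256, 1, 1, 0) == 256
```

The seventh group's 4 × 4 kernel with padding 0 reduces the spatial size, consistent with the global branch condensing the image into global features.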
7. The method according to claim 5, wherein in the information fusion network the numbers of channels are 64 and 3, the convolution kernels have shape 1 × 1, the stride is 1, and the padding is 0.
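A minimal numpy sketch of the fusion step, assuming the three branch outputs are concatenated along the channel axis and mixed by a 1 × 1 convolution, which acts as a per-pixel linear map (random weights stand in for learned ones; all names are illustrative):

```python
import numpy as np

def fuse_branches(detail, midfreq, global_feat, out_channels=64, seed=0):
    """Concatenate three (H, W, C) feature maps along channels and apply a
    1x1 convolution.  A 1x1 conv is a matrix multiply over the channel axis."""
    stacked = np.concatenate([detail, midfreq, global_feat], axis=-1)
    rng = np.random.default_rng(seed)
    weights = rng.standard_normal((stacked.shape[-1], out_channels)) * 0.01
    return stacked @ weights  # (H, W, 3C) @ (3C, out) -> (H, W, out)

h, w = 16, 16
fused = fuse_branches(np.ones((h, w, 64)), np.ones((h, w, 64)), np.ones((h, w, 64)))
```

With padding 0 and a 1 × 1 kernel the spatial resolution is untouched; only the channel count changes.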
8. The method according to claim 5, wherein in the multi-branch codec neural network model, the activation functions of the detail information processing network, the intermediate frequency information processing network, and the global information processing network are SELU; the activation function of the first convolution module in the information fusion network is SELU, and that of the second convolution module is sigmoid.
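The two activation functions named in this claim can be sketched directly; the λ and α constants below are the standard SELU parameters, not values stated in the patent:

```python
import numpy as np

# Standard SELU constants (Klambauer et al.); assumed, not given in the patent.
SELU_LAMBDA, SELU_ALPHA = 1.0507009873554805, 1.6732632423543772

def selu(x):
    """SELU: lambda * x for x > 0, lambda * alpha * (exp(x) - 1) otherwise."""
    return SELU_LAMBDA * np.where(x > 0, x, SELU_ALPHA * (np.exp(x) - 1.0))

def sigmoid(x):
    """Sigmoid squashes the final fusion output into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))
```

Using sigmoid on the last convolution module bounds the network output to (0, 1), matching an image-valued prediction.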
9. The method for generating a high dynamic range image based on a multi-branch codec neural network as claimed in claim 1, wherein in step S3 the loss function used to train the multi-branch codec neural network model is calculated as follows:
wherein Ii = 0.299Xr + 0.587Xg + 0.114Xb represents the brightness of the i-th pixel; Mi represents the final mask, obtained from a bright-area mask and a dark-area mask; ta represents the threshold for judging whether an image region is over-exposed, and tb the threshold for judging whether an image region is under-exposed; Xr, Xg and Xb represent the red, green and blue channel values of pixel X; w and h represent the width and height of the image; the network output value of the i-th pixel in channel c is compared against Yi,c, the Ground Truth value of the i-th pixel in channel c; ε represents a minimum value; αL represents the weight of the hue loss in the loss function; Hi represents the i-th pixel value of the color channel of the Ground Truth image, compared against the corresponding pixel value of the network output image; and log(·) represents the logarithm operation.
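The luminance formula from this claim, together with an illustrative hard bright/dark mask. The patent's exact mask formula and its ta, tb values are not recoverable from the translation, so the thresholds and the hard-threshold form below are assumptions:

```python
import numpy as np

def luminance(rgb: np.ndarray) -> np.ndarray:
    """I = 0.299*R + 0.587*G + 0.114*B, as defined in claim 9."""
    return 0.299 * rgb[..., 0] + 0.587 * rgb[..., 1] + 0.114 * rgb[..., 2]

def exposure_masks(rgb: np.ndarray, ta: float = 0.95, tb: float = 0.05):
    """Hard bright/dark masks: a pixel is 'bright' when its luminance exceeds
    ta and 'dark' when it falls below tb.  ta/tb values are assumed."""
    lum = luminance(rgb)
    return lum > ta, lum < tb

white = np.ones((2, 2, 3))
black = np.zeros((2, 2, 3))
bright_w, dark_w = exposure_masks(white)
bright_b, dark_b = exposure_masks(black)
```

A pure-white patch is flagged entirely bright and a pure-black patch entirely dark, matching the intended roles of the two masks.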
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110246503.7A CN113096029A (en) | 2021-03-05 | 2021-03-05 | High dynamic range image generation method based on multi-branch codec neural network |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113096029A true CN113096029A (en) | 2021-07-09 |
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103905738A (en) * | 2012-12-31 | 2014-07-02 | 博世汽车部件(苏州)有限公司 | High-dynamic-range image generation system and method |
CN110300989A (en) * | 2017-05-15 | 2019-10-01 | 谷歌有限责任公司 | Configurable and programmable image processor unit |
CN111105376A (en) * | 2019-12-19 | 2020-05-05 | 电子科技大学 | Single-exposure high-dynamic-range image generation method based on double-branch neural network |
CN111372006A (en) * | 2020-03-03 | 2020-07-03 | 山东大学 | High dynamic range imaging method and system for mobile terminal |
Non-Patent Citations (1)
Title |
---|
D. Marnerides et al., "ExpandNet: A Deep Convolutional Neural Network for High Dynamic Range Expansion from Low Dynamic Range Content", arXiv:1803.02266 |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113973175A (en) * | 2021-08-27 | 2022-01-25 | 天津大学 | Rapid HDR video reconstruction method |
CN114332755A (en) * | 2021-12-06 | 2022-04-12 | 南京瀚元科技有限公司 | Power generation incinerator monitoring method based on binocular three-dimensional modeling |
CN114359083A (en) * | 2021-12-24 | 2022-04-15 | 北京航空航天大学 | High-dynamic thermal infrared image self-adaptive preprocessing method for interference environment |
CN114693548A (en) * | 2022-03-08 | 2022-07-01 | 电子科技大学 | Dark channel defogging method based on bright area detection |
CN114693548B (en) * | 2022-03-08 | 2023-04-18 | 电子科技大学 | Dark channel defogging method based on bright area detection |
CN114998141A (en) * | 2022-06-07 | 2022-09-02 | 西北工业大学 | Space environment high dynamic range imaging method based on multi-branch network |
CN114998141B (en) * | 2022-06-07 | 2024-03-12 | 西北工业大学 | Space environment high dynamic range imaging method based on multi-branch network |
CN116912602A (en) * | 2023-09-11 | 2023-10-20 | 荣耀终端有限公司 | Training method of image processing model, image processing method and electronic equipment |
CN116912602B (en) * | 2023-09-11 | 2023-12-15 | 荣耀终端有限公司 | Training method of image processing model, image processing method and electronic equipment |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20210709 |