CN111951171A - HDR image generation method and device, readable storage medium and terminal equipment

HDR image generation method and device, readable storage medium and terminal equipment

Info

Publication number
CN111951171A
CN111951171A
Authority
CN
China
Prior art keywords
processing
deep learning
hdr image
network
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910407590.2A
Other languages
Chinese (zh)
Inventor
黄海鹏 (Huang Haipeng)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan TCL Group Industrial Research Institute Co Ltd
Original Assignee
Wuhan TCL Group Industrial Research Institute Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan TCL Group Industrial Research Institute Co Ltd filed Critical Wuhan TCL Group Industrial Research Institute Co Ltd
Priority to CN201910407590.2A
Publication of CN111951171A

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 - Image enhancement or restoration
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/20 - Special algorithmic details
    • G06T2207/20081 - Training; Learning
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/20 - Special algorithmic details
    • G06T2207/20172 - Image enhancement details
    • G06T2207/20208 - High dynamic range [HDR] image processing

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention belongs to the technical field of image processing, and in particular relates to an HDR image generation method and device, a computer-readable storage medium, and a terminal device. The method first obtains a single-frame original image to be processed, then processes the original image using a preset deep learning network to generate the HDR image corresponding to the original image. The deep learning network adopts a network structure combining a lightweight Unet network with a residual module, and is trained based on a preset training sample set. Compared with the original Unet network, the lightweight Unet network reduces the number of down-sampling layers, the number of up-sampling layers and the number of feature-map channels, which greatly simplifies the network structure; the residual module effectively extracts high-frequency information from the original image, so that the detail of highlight regions is better recovered and the quality of the generated HDR image is greatly improved.

Description

HDR image generation method and device, readable storage medium and terminal equipment
Technical Field
The invention belongs to the technical field of image processing, and in particular relates to an HDR image generation method and device, a computer-readable storage medium, and a terminal device.
Background
High-Dynamic-Range (HDR) images provide a wider dynamic range and more image detail than ordinary images. In the prior art, HDR image generation techniques fall mainly into two categories: multi-frame synthesis and single-frame generation. The multi-frame synthesis method needs to capture multiple images of the same scene at different exposure times (typically 3 or 5 exposures) and then fuse them into the final HDR image. This approach has two disadvantages. First, because multiple images of the same scene must be captured, shooting takes a long time. Second, during shooting the user often cannot hold the phone steady; if the phone shakes, the framing of each image shifts, and the final synthesis result suffers. By comparison, single-frame generation, which produces an HDR image from only a single input image, is more convenient and practical. Commonly used single-frame methods are based on GAN networks or CNN networks, but the network structures they use are complex, so they are time-consuming and resource-intensive; moreover, because it is difficult to recover the detail of highlight regions clearly and completely from a single frame, the generated HDR images are often poor.
Disclosure of Invention
In view of this, embodiments of the present invention provide an HDR image generation method and apparatus, a computer-readable storage medium, and a terminal device, so as to solve the problems of long time consumption, large resource consumption, and poor effect of the existing single-frame HDR image generation method.
A first aspect of an embodiment of the present invention provides an HDR image generation method, which may include:
acquiring a single-frame original image to be processed;
processing the original image by using a preset deep learning network to generate an HDR image corresponding to the original image, wherein the deep learning network adopts a network structure combining a light-weight Unet network and a residual error module, the deep learning network is obtained by training based on a preset training sample set, and the light-weight Unet network is a network obtained by reducing the number of down-sampling layers, the number of up-sampling layers and the number of channels of a feature map on the basis of the original Unet network.
Further, the deep learning network includes a down-sampling layer, an intermediate processing layer, and an up-sampling layer, and the processing the original image using the preset deep learning network to generate the HDR image corresponding to the original image may include:
performing convolution and downsampling processing on the original image by using the downsampling layer to obtain a first processing result;
processing the first processing result by using a residual error module preset in the intermediate processing layer to obtain a second processing result;
and performing convolution and upsampling processing on the second processing result by using the upsampling layer to obtain an HDR image corresponding to the original image.
Further, the training process of the deep learning network comprises the following steps:
obtaining the training sample set, wherein the training sample set comprises training samples, each training sample comprises a sample image and an HDR image, and the HDR image in each training sample and the sample image have a one-to-one correspondence relationship;
inputting sample images in each training sample into the deep learning network for processing to obtain output images;
evaluating a first degree of difference between the HDR image and the output image in each training sample by using a preset first loss function;
if the first difference degree is larger than a preset first threshold value, adjusting model parameters of the deep learning network, and returning to execute the step of inputting the sample images in the training samples into the deep learning network for processing;
and if the first difference degree is smaller than or equal to the first threshold value, finishing the training of the deep learning network.
Further, after evaluating a first degree of difference between the HDR image and the output image in each training sample using a preset first loss function, the method further includes:
evaluating a second degree of difference between the HDR image and the output image in each training sample using a preset second loss function;
if the second difference degree is larger than a preset second threshold value, adjusting model parameters of the deep learning network, and returning to execute the step of inputting the sample images in the training samples into the deep learning network for processing;
and if the second difference degree is smaller than or equal to the second threshold value, finishing the training of the deep learning network.
Further, after the training of the deep learning network is finished, the method may further include:
freezing model parameters of the deep learning network;
and transplanting the frozen model file of the deep learning network to preset terminal equipment.
A second aspect of an embodiment of the present invention provides an HDR image generation apparatus, which may include:
the original image acquisition module is used for acquiring a single-frame original image to be processed;
the HDR image generation module is used for processing the original image by using a preset deep learning network to generate an HDR image corresponding to the original image, the deep learning network adopts a network structure combining a light-weight Unet network and a residual error module, the deep learning network is obtained by training based on a preset training sample set, and the light-weight Unet network is a network obtained by reducing the number of down-sampling layers, the number of up-sampling layers and the number of channels of a feature map on the basis of the original Unet network.
Further, the deep learning network includes a down-sampling layer, an intermediate processing layer, and an up-sampling layer, and the HDR image generation module may include:
the first processing unit is used for performing convolution and downsampling processing on the original image by using the downsampling layer to obtain a first processing result;
the second processing unit is used for processing the first processing result by using a residual error module preset in the intermediate processing layer to obtain a second processing result;
and a third processing unit, configured to perform convolution and upsampling processing on the second processing result by using the upsampling layer, so as to obtain an HDR image corresponding to the original image.
Further, the HDR image generation apparatus may further include:
a sample set obtaining module, configured to obtain the training sample set, where the training sample set includes training samples, and each training sample includes a sample image and an HDR image, where the HDR image in each training sample and the sample image have a one-to-one correspondence relationship;
the network training module is used for inputting the sample images in the training samples into the deep learning network for processing to obtain output images;
a first difference degree calculation module for evaluating a first difference degree between the HDR image and the output image in each training sample by using a preset first loss function;
the parameter adjusting module is used for adjusting the model parameters of the deep learning network;
and the training ending module is used for ending the training process of the deep learning network.
Further, the HDR image generation apparatus may further include:
and the second difference calculation module is used for evaluating a second difference between the HDR image and the output image in each training sample by using a preset second loss function.
Further, the HDR image generation apparatus may further include:
the parameter freezing module is used for freezing the model parameters of the deep learning network;
and the model transplanting module is used for transplanting the frozen model file of the deep learning network to preset terminal equipment.
A third aspect of embodiments of the present invention provides a computer readable storage medium storing computer readable instructions which, when executed by a processor, implement the steps of any of the HDR image generation methods described above.
A fourth aspect of the embodiments of the present invention provides a terminal device, including a memory, a processor, and computer readable instructions stored in the memory and executable on the processor, where the processor implements the steps of any of the HDR image generation methods described above when executing the computer readable instructions.
Compared with the prior art, the embodiments of the present invention have the following beneficial effects. A single-frame original image to be processed is first obtained, and the original image is then processed using a preset deep learning network to generate the corresponding HDR image. The deep learning network adopts a network structure combining a lightweight Unet network with a residual module, and is trained based on a preset training sample set. Compared with the original Unet network, the lightweight Unet network reduces the number of down-sampling layers, the number of up-sampling layers and the number of feature-map channels, greatly simplifying the network structure, so an HDR image can be generated with little time and resource consumption.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without inventive effort.
FIG. 1 is a flow chart of one embodiment of an HDR image generation method in an embodiment of the present invention;
FIG. 2 is a schematic structural diagram of a deep learning network;
FIG. 3 is a schematic flow diagram of the processing of raw images using a deep learning network;
FIG. 4 is a schematic diagram of a residual module;
FIG. 5 is a schematic flow diagram of training a deep learning network;
FIG. 6 is a comparison of an original image and an HDR image;
FIG. 7 is a block diagram of an embodiment of an HDR image generating apparatus according to an embodiment of the present invention;
fig. 8 is a schematic block diagram of a terminal device in an embodiment of the present invention.
Detailed Description
In order to make the objects, features and advantages of the present invention more obvious and understandable, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is obvious that the embodiments described below are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to fig. 1, an embodiment of an HDR image generation method according to an embodiment of the present invention may include:
and step S101, acquiring a single-frame original image to be processed.
The original image may be an image captured on the spot by the user through the camera of a mobile phone, tablet computer or other terminal device. In one specific usage scenario of this embodiment, when a user wants to obtain an HDR image directly, the user may, before photographing a target object, open the HDR shooting mode of the terminal device by clicking a specific physical or virtual key; in this mode, the terminal device processes each original image taken by the user according to the subsequent steps to generate an HDR image. The specific processing procedure will be described in detail later.
The original image may also be an image already stored in the terminal device, or one the terminal device acquires over a network from a cloud server or another terminal device. In another specific usage scenario of this embodiment, when a user wants to convert one or more existing original images into HDR images, the user may open the HDR conversion mode of the terminal device by clicking a specific physical or virtual key and then select the original images (the order may be reversed: the images may be selected first and the HDR conversion mode opened afterwards); the terminal device then processes the original images according to the subsequent steps to generate the HDR images. The specific processing procedure will be described in detail later.
Step S102, processing the original image by using a preset deep learning network, and generating an HDR image corresponding to the original image.
Preferably, the deep learning network used in this embodiment may adopt a network structure combining a lightweight Unet network and residual modules (Residual Blocks), as shown in fig. 2, and the deep learning network is trained based on a preset training sample set.
The original Unet network was proposed by Olaf Ronneberger, Philipp Fischer and Thomas Brox in the paper "U-Net: Convolutional Networks for Biomedical Image Segmentation" (see MICCAI 2015: Medical Image Computing and Computer-Assisted Intervention, pp. 234-241 for details). In the original Unet network, the number of down-sampling layers is 5, with the feature maps of these layers having 64, 128, 256, 512 and 1024 channels; the number of up-sampling layers is also 5, with the feature maps of these layers having 1024, 512, 256, 128 and 64 channels. The lightweight Unet network is obtained by reducing the number of down-sampling layers, the number of up-sampling layers and the number of feature-map channels of the original Unet network. The network structure is thereby greatly simplified, so an HDR image can be generated with little time and resource consumption. Moreover, because a residual module is added to the deep learning network, high-frequency information in the original image can be effectively extracted, so that the detail of highlight regions is better recovered and the quality of the generated HDR image is greatly improved.
The deep learning network comprises three processing stages: a down-sampling layer, an intermediate processing layer and an up-sampling layer. The specific processing process may include the steps shown in fig. 3:
and S1021, performing convolution and downsampling processing on the original image by using the downsampling layer to obtain a first processing result.
Here, the number of down-sampling layers is denoted LayerNum1, which satisfies LayerNum1 < LayerNum2, where LayerNum2 is the number of down-sampling layers in the original Unet network. The specific value of LayerNum1 can be set according to actual conditions.
In this embodiment, LayerNum1 is preferably set to 3, and the three layers are referred to in order as DL1, DL2 and DL3. DL1 is the first down-sampling layer; its input is the original image, and its output, denoted DL1_Res, is the result of the original image undergoing convolution and down-sampling in DL1. DL2 is the second down-sampling layer; its input is DL1_Res, and its output, denoted DL2_Res, is the result of DL1_Res undergoing convolution and down-sampling in DL2. DL3 is the third down-sampling layer; its input is DL2_Res, and its output is the result of DL2_Res undergoing convolution and down-sampling in DL3.
The specific processing in DL1 is as follows. The original image has 3 channels, namely R (red component), G (green component) and B (blue component). The convolution kernels in DL1 convolve the original image to obtain its feature map; the number of convolution kernels in DL1 may be set according to actual conditions and is preferably set to 8 in this embodiment, so after the convolution in DL1 the original image yields an 8-channel feature map. An activation function (for example, the rectified linear unit, ReLU) is then used to process the feature map and limit its values to the range [0, 1]. The feature map is then down-sampled to reduce its scale; for example, the down-sampling may halve the length and width of the feature map. The feature map after DL1's down-sampling serves as the input of DL2.
The processing in DL2 and DL3 is similar to that in DL1 and is not repeated here. Note, however, that the number of convolution kernels in DL2 is twice that in DL1, and the number in DL3 is twice that in DL2, so the feature maps output by DL1, DL2 and DL3 have 8, 16 and 32 channels respectively. The feature map output by DL3 is the first processing result.
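To make the down-sampling path concrete, the following sketch expresses DL1 to DL3 in TensorFlow/Keras. It is an illustrative reading of the description rather than the patented implementation: the 3 x 3 kernel size and the use of max pooling are assumptions, since the text only fixes the kernel counts (8, 16 and 32), the ReLU activation, and the halving of the feature map's length and width.

```python
import tensorflow as tf
from tensorflow.keras import layers

def downsampling_block(x, num_kernels):
    """One down-sampling layer: convolution + ReLU, then 2x down-sampling.
    The 3x3 kernels and max pooling are assumptions; the description only
    fixes the kernel count, the activation, and the halving of the scale."""
    x = layers.Conv2D(num_kernels, 3, padding="same", activation="relu")(x)
    skip = x                       # kept for the jump connections used later
    x = layers.MaxPooling2D(2)(x)  # halve the length and width
    return x, skip

inputs = layers.Input(shape=(None, None, 3))  # 3-channel (RGB) original image
x, skip1 = downsampling_block(inputs, 8)      # DL1: 8-channel feature map
x, skip2 = downsampling_block(x, 16)          # DL2: 16-channel feature map
x, skip3 = downsampling_block(x, 32)          # DL3: 32-channel first processing result
```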
Step S1022, the residual error module preset in the intermediate processing layer is used to process the first processing result, so as to obtain a second processing result.
The intermediate processing layer includes two residual modules of the form shown in fig. 4. Each residual module contains two branches: the first branch extracts deeper features from the first processing result, and the second branch preserves the first processing result. On the first branch, the first processing result passes in turn through convolution, a ReLU function, and convolution again; the number of convolution kernels on the first branch equals the number in DL3, so the channel count of the feature map remains unchanged throughout the processing. On the second branch, the first processing result skips the first branch's processing via a jump connection. The data on the two branches are then weighted and superposed to obtain the second processing result. This processing effectively preserves the high-frequency characteristics of the data and counters the vanishing-gradient and exploding-gradient problems that can arise as network depth grows, so a deeper neural network can be trained while good performance is maintained.
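Continuing the sketch above, one possible form of the residual module is shown below. The 3 x 3 kernel size is again an assumption, and the "weighted superposition" of the two branches is rendered as a plain addition, the usual reading when no branch weights are specified.

```python
def residual_block(x, num_kernels=32):
    """Residual module: convolution -> ReLU -> convolution on the first
    branch, a jump connection on the second, then superposition of both.
    num_kernels matches DL3 so the channel count stays unchanged."""
    branch = layers.Conv2D(num_kernels, 3, padding="same", activation="relu")(x)
    branch = layers.Conv2D(num_kernels, 3, padding="same")(branch)
    return layers.Add()([x, branch])  # superpose the two branches

# the intermediate processing layer stacks two residual modules
x = residual_block(x)
x = residual_block(x)  # x is now the second processing result
```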
Step S1023, performing convolution and upsampling processing on the second processing result by using the upsampling layer, to obtain an HDR image corresponding to the original image.
The number of up-sampling layers is the same as the number of down-sampling layers, and is smaller than the number of up-sampling layers in the original Unet network. The case of 3 layers is described here as an example, and the three layers are referred to in order as UL1, UL2 and UL3. UL1 is the first up-sampling layer; its input is the second processing result, and its output, denoted UL1_Res, is the result of the second processing result undergoing convolution and up-sampling in UL1. UL2 is the second up-sampling layer; its input is UL1_Res, and its output, denoted UL2_Res, is the result of UL1_Res undergoing convolution and up-sampling in UL2. UL3 is the third up-sampling layer; its input is UL2_Res, and its output is the result of UL2_Res undergoing convolution and up-sampling in UL3.
The specific processing in UL1 is as follows. The number of convolution kernels in UL1 equals that in DL2, so convolving the second processing result in UL1 yields a 16-channel feature map. An activation function (for example, ReLU) is then used to process the feature map and limit its values to the range [0, 1]. The feature map is then up-sampled to enlarge its scale; for example, the up-sampling may double the length and width of the feature map. The feature map after UL1's up-sampling serves as the input of UL2.
The processing in UL2 and UL3 is similar to that in UL1 and is not repeated here. Note, however, that the number of convolution kernels in UL2 is half that in UL1, and the number in UL3 is 3, so the feature maps output by UL1, UL2 and UL3 have 16, 8 and 3 channels respectively. Finally, the feature map output by UL3 passes through one more convolution (with 3 convolution kernels) and one activation function (for example, Sigmoid) to obtain the HDR image corresponding to the original image. Note also that jump connections are introduced between the down-sampling and up-sampling layers: before each convolution in the up-sampling layers, the data to be convolved is superposed with the down-sampling-layer output that has the same number of channels, and the superposed result is used as the input of the next convolution.
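Continuing the same sketch, the up-sampling path and the final output convolution might look as follows. Up-sampling is applied before the jump connection here so that the feature-map scales match; the description fixes the kernel counts, the 2x rescaling and the Sigmoid output, but not this exact ordering, which is therefore an assumption.

```python
def upsampling_block(x, skip, num_kernels):
    """One up-sampling layer: 2x up-sampling, superposition with the
    down-sampling-layer output of equal channel count, then convolution."""
    x = layers.UpSampling2D(2)(x)        # double the length and width
    x = layers.Concatenate()([x, skip])  # jump connection from the down-sampling path
    return layers.Conv2D(num_kernels, 3, padding="same", activation="relu")(x)

x = upsampling_block(x, skip3, 16)  # UL1: 16-channel feature map
x = upsampling_block(x, skip2, 8)   # UL2: 8-channel feature map
x = upsampling_block(x, skip1, 3)   # UL3: 3-channel feature map
# one final 3-kernel convolution with a Sigmoid activation yields the HDR image
outputs = layers.Conv2D(3, 3, padding="same", activation="sigmoid")(x)
model = tf.keras.Model(inputs, outputs)
```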
The deep learning network adopted in this embodiment combines a lightweight Unet network with residual modules. Compared with the original Unet network, the lightweight Unet network reduces the number of down-sampling layers, the number of up-sampling layers and the number of feature-map channels, which greatly simplifies the network structure, so an HDR image can be generated with little time and resource consumption.
The deep learning network is obtained through sample-based training; before it is put into use, it can be trained in advance through the process shown in fig. 5:
step S501, obtaining the training sample set from a preset database.
The training sample set comprises training samples, each training sample comprises a sample image and an HDR image, and the HDR images in the training samples and the sample images have one-to-one correspondence.
The training sample set may be constructed by selecting any one or more of various types of image datasets including, but not limited to, ImageNet, PASCAL VOC, Labelme, COCO, Caltech, CIFAR, and the like, which are commonly used in the image processing arts. After the image data set is selected, a certain number of images may be selected as sample images, and HDR images corresponding to the respective sample images are produced by existing HDR image generation software.
Preferably, this embodiment may construct the training sample set in advance from the watermark expansion Database data set, which contains 4744 original images of common scenes such as people, animals, plants, natural scenery, urban landscapes, daily life and transportation. 3744 of the original images are used as sample images, and the Aurora HDR tool, an image processing application with an HDR image generation function, is used to produce the HDR image corresponding to each sample image as the reference standard (ground truth). The above steps yield 3744 training samples.
Step S502, inputting the sample images in each training sample into the deep learning network for processing to obtain output images.
Preferably, before training, the sample images may be preprocessed by randomly cropping them into smaller images to be used as input, for example 512 pixel by 512 pixel images; in that case the HDR image in each training sample must also be cropped correspondingly. Cropping the sample images to a smaller scale reduces the computation required for training and speeds up the training process.
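As a sketch of this preprocessing step, the following function crops the same randomly chosen 512 x 512 window from a sample image and its HDR counterpart; the description prescribes the crop size but not a particular implementation.

```python
import tensorflow as tf

def random_crop_pair(sample, hdr, size=512):
    """Crop the same random 512x512 window from a sample image and the
    corresponding HDR image so the training pair stays aligned."""
    stacked = tf.stack([sample, hdr], axis=0)                  # shape (2, H, W, 3)
    cropped = tf.image.random_crop(stacked, (2, size, size, 3))
    return cropped[0], cropped[1]
```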
Step S503, evaluating a first difference between the HDR image and the output image in each training sample by using a preset first loss function.
The first loss function may be expressed as:

Loss_1 = (1/N) Σ_{n=1}^{N} (s_n − y_n)²

where n is the index of a training sample, 1 ≤ n ≤ N, N is the total number of training samples, s_n is the output image for the n-th training sample, s = (s_1, s_2, ..., s_N), y_n is the HDR image of the n-th training sample, y = (y_1, y_2, ..., y_N), and Loss_1 is the first difference degree.
Step S504, determining whether the first difference is greater than a preset first threshold.
If the first difference is greater than the first threshold, step S505 is executed, and if the first difference is less than or equal to the first threshold, step S506 is executed.
And step S505, adjusting the model parameters of the deep learning network.
After the parameter adjustment is completed, returning to perform step S502, that is, continuing to train the deep learning network by using the training sample set until the first difference degree is less than or equal to the first threshold.
And step S506, finishing the training of the deep learning network.
When the first difference degree is less than or equal to the first threshold, the training has reached the preset effect and can be ended. At this point the deep learning network has been trained on a large number of samples and its difference degree is kept within a small range, so using it to process original images yields a good HDR image generation effect.
Preferably, before step S506, the deep learning network may be further fine-tuned with a preset second loss function to further enhance the training effect. Specifically, the preset second loss function may be used to evaluate a second degree of difference between the HDR image and the output image in each training sample; if the second difference degree is greater than a preset second threshold, the model parameters of the deep learning network are adjusted and training with the training sample set continues, and if the second difference degree is less than or equal to the second threshold, training of the deep learning network ends.
The second loss function may be expressed as:

Loss_2 = (1/N) Σ_{n=1}^{N} |s_n − y_n|

where Loss_2 is the second difference degree.
The second loss function differs from the first in the following way. The first loss function amplifies the difference between the actual output and the target output by squaring it, and therefore penalizes outputs that deviate from the target heavily; it is also a smooth function, which makes the error gradient easy to compute when solving its optimization problem. The second loss function takes the absolute value of the difference between the actual output and the target output and is insensitive to outputs that deviate from the target, which helps keep the model stable when the actual output contains outliers.
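Under the reading above, a squared penalty for the first loss function and an absolute penalty for the second, the two losses might be written as follows; averaging over pixels as well as over the N training samples is an implementation assumption.

```python
import tensorflow as tf

def loss_1(y, s):
    """First loss function: the squared difference amplifies deviations
    of the output s from the target y (mean squared error)."""
    return tf.reduce_mean(tf.square(s - y))

def loss_2(y, s):
    """Second loss function: the absolute difference is less sensitive
    to outliers in the actual output (mean absolute error)."""
    return tf.reduce_mean(tf.abs(s - y))
```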
Preferably, in this embodiment the training process may be performed on a GTX 1080 Ti, with parameters initialized by the Xavier initialization method, an initial learning rate of 1e-4, Adam as the optimizer, a polynomial decay schedule for the learning rate, 3000 epochs, and a batch size of 8.
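Continuing the sketch, the stated configuration might be set up as follows. Keras's default glorot_uniform kernel initializer already implements the Xavier method; the end learning rate of the polynomial schedule is an assumption, since only the schedule type, the initial rate, the epoch count and the batch size are given.

```python
steps_per_epoch = 3744 // 8  # 3744 training samples, batch size 8

schedule = tf.keras.optimizers.schedules.PolynomialDecay(
    initial_learning_rate=1e-4,
    decay_steps=3000 * steps_per_epoch,  # decay over all 3000 epochs
    end_learning_rate=1e-6,              # assumed final rate
)
optimizer = tf.keras.optimizers.Adam(learning_rate=schedule)

model.compile(optimizer=optimizer, loss=loss_1)
# model.fit(dataset, epochs=3000, ...) would then run the training loop
```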
After the training of the deep learning network is finished, its model parameters can be frozen using the TensorFlow framework, with the input size fixed at 3024 pixels by 4032 pixels, and the frozen model file of the deep learning network (for example, a file in PB format) can then be migrated to a preset terminal device using the MACE framework. The terminal device thereby gains the function of generating HDR images, and the process of converting an original image into an HDR image can be carried out on it through the steps shown in fig. 1. As shown in fig. 6, compared with the original image, the finally generated HDR image retains more detail in highlight regions, improving the photographing effect of the terminal device.
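A minimal sketch of the freezing step in TensorFlow 2 is given below; the description states only that a TensorFlow framework performs the freezing at a 3024 x 4032 input size and that the MACE framework then migrates the frozen PB file, so the exact API route and the NHWC tensor layout are assumptions.

```python
from tensorflow.python.framework.convert_to_constants import (
    convert_variables_to_constants_v2,
)

# bake the trained parameters into a single frozen graph at a fixed input size
concrete_fn = tf.function(model).get_concrete_function(
    tf.TensorSpec([1, 3024, 4032, 3], tf.float32)  # assumed NHWC layout
)
frozen_fn = convert_variables_to_constants_v2(concrete_fn)
tf.io.write_graph(frozen_fn.graph, "export", "hdr_model.pb", as_text=False)
# the resulting hdr_model.pb is what a tool such as MACE would then convert
# for deployment on the terminal device
```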
In summary, in the embodiments of the present invention, a single-frame original image to be processed is first obtained, and the original image is then processed using a preset deep learning network to generate the corresponding HDR image. The deep learning network adopts a network structure combining a lightweight Unet network with a residual module, and is trained based on a preset training sample set. Compared with the original Unet network, the lightweight Unet network reduces the number of down-sampling layers, the number of up-sampling layers and the number of feature-map channels, greatly simplifying the network structure, so an HDR image can be generated with little time and resource consumption; the residual module, in turn, effectively extracts high-frequency information from the original image, so that the detail of highlight regions is better recovered and the quality of the generated HDR image is greatly improved.
It should be understood that, the sequence numbers of the steps in the foregoing embodiments do not imply an execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present invention.
Fig. 7 shows a structure diagram of an embodiment of an HDR image generation apparatus according to an embodiment of the present invention, corresponding to the HDR image generation method described in the foregoing embodiment.
In this embodiment, an HDR image generation apparatus may include:
an original image obtaining module 701, configured to obtain a single-frame original image to be processed;
an HDR image generation module 702, configured to process the original image using a preset deep learning network, and generate an HDR image corresponding to the original image, where the deep learning network adopts a network structure in which a light-weighted Unet network and a residual module are combined, and is obtained by training based on a preset training sample set, and the light-weighted Unet network is a network obtained by reducing, based on the original Unet network, the number of down-sampling layers, the number of up-sampling layers, and the number of channels of a feature map.
Further, the deep learning network includes a down-sampling layer, an intermediate processing layer, and an up-sampling layer, and the HDR image generation module may include:
the first processing unit is used for performing convolution and downsampling processing on the original image by using the downsampling layer to obtain a first processing result;
the second processing unit is used for processing the first processing result by using a residual error module preset in the intermediate processing layer to obtain a second processing result;
and a third processing unit, configured to perform convolution and upsampling processing on the second processing result by using the upsampling layer, so as to obtain an HDR image corresponding to the original image.
Further, the HDR image generation apparatus may further include:
a sample set obtaining module, configured to obtain the training sample set, where the training sample set includes training samples, and each training sample includes a sample image and an HDR image, where the HDR image in each training sample and the sample image have a one-to-one correspondence relationship;
the network training module is used for inputting the sample images in the training samples into the deep learning network for processing to obtain output images;
a first difference degree calculation module for evaluating a first difference degree between the HDR image and the output image in each training sample by using a preset first loss function;
the parameter adjusting module is used for adjusting the model parameters of the deep learning network;
and the training ending module is used for ending the training process of the deep learning network.
Further, the HDR image generation apparatus may further include:
and the second difference calculation module is used for evaluating a second difference between the HDR image and the output image in each training sample by using a preset second loss function.
Further, the HDR image generation apparatus may further include:
the parameter freezing module is used for freezing the model parameters of the deep learning network;
and the model transplanting module is used for transplanting the frozen model file of the deep learning network to preset terminal equipment.
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described apparatuses, modules and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to the related descriptions of other embodiments for parts that are not described or illustrated in a certain embodiment.
Fig. 8 shows a schematic block diagram of a terminal device according to an embodiment of the present invention, and for convenience of description, only the relevant parts related to the embodiment of the present invention are shown.
As shown in fig. 8, the HDR image generation terminal device 8 of this embodiment includes: a processor 80, a memory 81 and a computer program 82 stored in said memory 81 and executable on said processor 80. The processor 80, when executing the computer program 82, implements the steps in the various HDR image generation method embodiments described above, such as steps S101 to S102 shown in fig. 1. Alternatively, the processor 80, when executing the computer program 82, implements the functions of each module/unit in the above-described device embodiments, such as the functions of the modules 701 to 702 shown in fig. 7.
Illustratively, the computer program 82 may be partitioned into one or more modules/units that are stored in the memory 81 and executed by the processor 80 to implement the present invention. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions for describing the execution process of the computer program 82 in the HDR image generation terminal device 8.
The HDR image generation terminal device 8 may be a computing device such as a mobile phone, a tablet computer, a desktop computer, a notebook computer, a palm computer, and a cloud server. Those skilled in the art will appreciate that fig. 8 is merely an example of an HDR image generation terminal device 8, and does not constitute a limitation of the HDR image generation terminal device 8, and may include more or less components than those shown, or combine certain components, or different components, for example, the HDR image generation terminal device 8 may further include an input-output device, a network access device, a bus, etc.
The Processor 80 may be a Central Processing Unit (CPU), other general purpose Processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other Programmable logic device, discrete Gate or transistor logic device, discrete hardware component, etc. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
The memory 81 may be an internal storage unit of the HDR image generation terminal device 8, such as a hard disk or a memory of the HDR image generation terminal device 8. The memory 81 may also be an external storage device of the HDR image generation terminal device 8, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like equipped on the HDR image generation terminal device 8. Further, the memory 81 may also include both an internal storage unit of the HDR image generation terminal device 8 and an external storage device. The memory 81 is used to store the computer program and other programs and data required by the HDR image generation terminal device 8. The memory 81 may also be used to temporarily store data that has been output or is to be output.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-mentioned division of the functional units and modules is illustrated, and in practical applications, the above-mentioned function distribution may be performed by different functional units and modules according to needs, that is, the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-mentioned functions. Each functional unit and module in the embodiments may be integrated in one processing unit, or each unit may exist alone physically, or two or more units are integrated in one unit, and the integrated unit may be implemented in a form of hardware, or in a form of software functional unit. In addition, specific names of the functional units and modules are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present application. The specific working processes of the units and modules in the system may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to the related descriptions of other embodiments for parts that are not described or illustrated in a certain embodiment.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
In the embodiments provided in the present invention, it should be understood that the disclosed apparatus/terminal device and method may be implemented in other ways. For example, the above-described embodiments of the apparatus/terminal device are merely illustrative, and for example, the division of the modules or units is only one logical division, and there may be other divisions when actually implemented, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated modules/units, if implemented in the form of software functional units and sold or used as separate products, may be stored in a computer readable storage medium. Based on such understanding, all or part of the flow of the method according to the embodiments of the present invention may also be implemented by a computer program, which may be stored in a computer-readable storage medium, and when the computer program is executed by a processor, the steps of the method embodiments may be implemented. Wherein the computer program comprises computer program code, which may be in the form of source code, object code, an executable file or some intermediate form, etc. The computer-readable medium may include: any entity or device capable of carrying the computer program code, recording medium, usb disk, removable hard disk, magnetic disk, optical disk, computer Memory, Read-Only Memory (ROM), Random Access Memory (RAM), electrical carrier wave signals, telecommunications signals, software distribution medium, and the like. It should be noted that the computer readable medium may contain content that is subject to appropriate increase or decrease as required by legislation and patent practice in jurisdictions, for example, in some jurisdictions, computer readable media does not include electrical carrier signals and telecommunications signals as is required by legislation and patent practice.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present invention, and not for limiting the same; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present invention, and are intended to be included within the scope of the present invention.

Claims (10)

1. An HDR image generation method, comprising:
acquiring a single-frame original image to be processed;
processing the original image by using a preset deep learning network to generate an HDR image corresponding to the original image, wherein the deep learning network adopts a network structure combining a light-weight Unet network and a residual error module, the deep learning network is obtained by training based on a preset training sample set, and the light-weight Unet network is a network obtained by reducing the number of down-sampling layers, the number of up-sampling layers and the number of channels of a feature map on the basis of the original Unet network.
2. The HDR image generation method of claim 1, wherein the deep learning network comprises a down-sampling layer, an intermediate processing layer and an up-sampling layer, and wherein the processing the original image using the preset deep learning network to generate the HDR image corresponding to the original image comprises:
performing convolution and downsampling processing on the original image by using the downsampling layer to obtain a first processing result;
processing the first processing result by using a residual error module preset in the intermediate processing layer to obtain a second processing result;
and performing convolution and upsampling processing on the second processing result by using the upsampling layer to obtain an HDR image corresponding to the original image.
3. The HDR image generation method of claim 1, wherein the training process of the deep learning network comprises:
obtaining the training sample set, wherein the training sample set comprises training samples, each training sample comprises a sample image and an HDR image, and the HDR image in each training sample and the sample image have a one-to-one correspondence relationship;
inputting sample images in each training sample into the deep learning network for processing to obtain output images;
evaluating a first degree of difference between the HDR image and the output image in each training sample by using a preset first loss function;
if the first difference degree is larger than a preset first threshold value, adjusting model parameters of the deep learning network, and returning to execute the step of inputting the sample images in the training samples into the deep learning network for processing;
and if the first difference degree is smaller than or equal to the first threshold value, finishing the training of the deep learning network.
4. The HDR image generation method of claim 3, further comprising, after evaluating the first degree of difference between the HDR image and the output image in each training sample using a preset first loss function:
evaluating a second degree of difference between the HDR image and the output image in each training sample using a preset second loss function;
if the second difference degree is larger than a preset second threshold value, adjusting model parameters of the deep learning network, and returning to execute the step of inputting the sample images in the training samples into the deep learning network for processing;
and if the second difference degree is smaller than or equal to the second threshold value, finishing the training of the deep learning network.
5. The HDR image generation method of any one of claims 3 to 4, further comprising, after training of the deep learning network is finished:
freezing model parameters of the deep learning network;
and transplanting the frozen model file of the deep learning network to preset terminal equipment.
6. An HDR image generation apparatus, comprising:
the original image acquisition module is used for acquiring a single-frame original image to be processed;
the HDR image generation module is used for processing the original image by using a preset deep learning network to generate an HDR image corresponding to the original image, the deep learning network adopts a network structure combining a light-weight Unet network and a residual error module, the deep learning network is obtained by training based on a preset training sample set, and the light-weight Unet network is a network obtained by reducing the number of down-sampling layers, the number of up-sampling layers and the number of channels of a feature map on the basis of the original Unet network.
7. An HDR image generation apparatus as claimed in claim 6, wherein the deep learning network comprises a down-sampling layer, an intermediate processing layer and an up-sampling layer, the HDR image generation module comprising:
the first processing unit is used for performing convolution and downsampling processing on the original image by using the downsampling layer to obtain a first processing result;
the second processing unit is used for processing the first processing result by using a residual error module preset in the intermediate processing layer to obtain a second processing result;
and a third processing unit, configured to perform convolution and upsampling processing on the second processing result by using the upsampling layer, so as to obtain an HDR image corresponding to the original image.
8. An HDR image generation apparatus as claimed in claim 6, further comprising:
a sample set obtaining module, configured to obtain the training sample set, where the training sample set includes training samples, and each training sample includes a sample image and an HDR image, where the HDR image in each training sample and the sample image have a one-to-one correspondence relationship;
the network training module is used for inputting the sample images in the training samples into the deep learning network for processing to obtain output images;
a first difference degree calculation module for evaluating a first difference degree between the HDR image and the output image in each training sample by using a preset first loss function;
the parameter adjusting module is used for adjusting the model parameters of the deep learning network;
and the training ending module is used for ending the training process of the deep learning network.
9. A computer readable storage medium storing computer readable instructions, which when executed by a processor implement the steps of the HDR image generation method of any of claims 1 to 5.
10. A terminal device comprising a memory, a processor and computer readable instructions stored in the memory and executable on the processor, characterized in that the processor, when executing the computer readable instructions, implements the steps of the HDR image generation method of any of claims 1 to 5.
CN201910407590.2A 2019-05-16 2019-05-16 HDR image generation method and device, readable storage medium and terminal equipment Pending CN111951171A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910407590.2A CN111951171A (en) 2019-05-16 2019-05-16 HDR image generation method and device, readable storage medium and terminal equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910407590.2A CN111951171A (en) 2019-05-16 2019-05-16 HDR image generation method and device, readable storage medium and terminal equipment

Publications (1)

Publication Number Publication Date
CN111951171A true CN111951171A (en) 2020-11-17

Family

ID=73335869

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910407590.2A Pending CN111951171A (en) 2019-05-16 2019-05-16 HDR image generation method and device, readable storage medium and terminal equipment

Country Status (1)

Country Link
CN (1) CN111951171A (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170104971A1 (en) * 2014-06-20 2017-04-13 SZ DJI Technology Co., Ltd. Method and apparatus for generating hdri
CN105787500A (en) * 2014-12-26 2016-07-20 日本电气株式会社 Characteristic selecting method and characteristic selecting device based on artificial neural network
CN106897739A (en) * 2017-02-15 2017-06-27 国网江苏省电力公司电力科学研究院 A kind of grid equipment sorting technique based on convolutional neural networks
CN108492258A (en) * 2018-01-17 2018-09-04 天津大学 A kind of radar image denoising method based on generation confrontation network
CN109618094A (en) * 2018-12-14 2019-04-12 深圳市华星光电半导体显示技术有限公司 Image processing method and image processing system
CN109727259A (en) * 2019-01-07 2019-05-07 哈尔滨理工大学 A kind of retinal images partitioning algorithm based on residual error U-NET network

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112651975A (en) * 2020-12-29 2021-04-13 奥比中光科技集团股份有限公司 Training method, device and equipment of lightweight network model
WO2022266955A1 (en) * 2021-06-24 2022-12-29 Oppo广东移动通信有限公司 Image decoding method and apparatus, image processing method and apparatus, and device
CN115205157A (en) * 2022-07-29 2022-10-18 如你所视(北京)科技有限公司 Image processing method and system, electronic device, and storage medium
CN115205157B (en) * 2022-07-29 2024-04-26 如你所视(北京)科技有限公司 Image processing method and system, electronic device and storage medium

Similar Documents

Publication Publication Date Title
CN109949255B (en) Image reconstruction method and device
CN111369440B (en) Model training and image super-resolution processing method, device, terminal and storage medium
Chen et al. MICU: Image super-resolution via multi-level information compensation and U-net
CN111340077B (en) Attention mechanism-based disparity map acquisition method and device
RU2697928C1 (en) Superresolution of an image imitating high detail based on an optical system, performed on a mobile device having limited resources, and a mobile device which implements
CN111951171A (en) HDR image generation method and device, readable storage medium and terminal equipment
CN111640060A (en) Single image super-resolution reconstruction method based on deep learning and multi-scale residual dense module
CN113129212B (en) Image super-resolution reconstruction method and device, terminal device and storage medium
CN115393191A (en) Method, device and equipment for reconstructing super-resolution of lightweight remote sensing image
He et al. Remote sensing image super-resolution using deep–shallow cascaded convolutional neural networks
CN114627035A (en) Multi-focus image fusion method, system, device and storage medium
CN114926734B (en) Solid waste detection device and method based on feature aggregation and attention fusion
CN114519667A (en) Image super-resolution reconstruction method and system
CN113902658A (en) RGB image-to-hyperspectral image reconstruction method based on dense multiscale network
CN114926342A (en) Image super-resolution reconstruction model construction method, device, equipment and storage medium
CN115222581A (en) Image generation method, model training method, related device and electronic equipment
CN106558021A (en) Video enhancement method based on super-resolution technique
CN112489103B (en) High-resolution depth map acquisition method and system
CN111325700B (en) Multi-dimensional fusion method and system based on color image
CN113379606A (en) Face super-resolution method based on pre-training generation model
CN115311149A (en) Image denoising method, model, computer-readable storage medium and terminal device
CN113096015A (en) Image super-resolution reconstruction method based on progressive sensing and ultra-lightweight network
CN116739920A (en) Double-decoupling mutual correction multi-temporal remote sensing image missing information reconstruction method and system
CN114638761B (en) Full-color sharpening method, equipment and medium for hyperspectral image
CN112785498B (en) Pathological image superscore modeling method based on deep learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination