CN113592726A - High dynamic range imaging method, device, electronic equipment and storage medium - Google Patents

High dynamic range imaging method, device, electronic equipment and storage medium

Info

Publication number
CN113592726A
CN113592726A (application CN202110732200.6A)
Authority
CN
China
Prior art keywords
dynamic range
image
low dynamic
range image
scale
Prior art date
Legal status
Pending
Application number
CN202110732200.6A
Other languages
Chinese (zh)
Inventor
刘震
刘帅成
Current Assignee
Beijing Kuangshi Technology Co Ltd
Beijing Megvii Technology Co Ltd
Original Assignee
Beijing Kuangshi Technology Co Ltd
Beijing Megvii Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Kuangshi Technology Co Ltd, Beijing Megvii Technology Co Ltd filed Critical Beijing Kuangshi Technology Co Ltd
Priority to CN202110732200.6A priority Critical patent/CN113592726A/en
Publication of CN113592726A publication Critical patent/CN113592726A/en
Pending legal-status Critical Current

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00: Image enhancement or restoration
    • G06T5/50: Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • G06T5/90: Dynamic range modification of images or parts thereof
    • G06T2207/00: Indexing scheme for image analysis or image enhancement
    • G06T2207/10: Image acquisition modality
    • G06T2207/10004: Still image; Photographic image
    • G06T2207/20: Special algorithmic details
    • G06T2207/20212: Image combination
    • G06T2207/20221: Image fusion; Image merging

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Studio Devices (AREA)
  • Image Processing (AREA)

Abstract

The application relates to a high dynamic range imaging method, a device, an electronic device and a storage medium, wherein the method comprises the following steps: acquiring low dynamic range images of a target object, the low dynamic range images comprising a first low dynamic range image, a second low dynamic range image, and a third low dynamic range image whose exposure values increase in sequence; generating masks corresponding to different scales based on each low dynamic range image; and performing fusion processing and image reconstruction processing according to the low dynamic range images and the masks of each scale to obtain a high dynamic range image of the target object. By generating multi-scale masks from multiple frames of the same target object captured at different exposure values, and by reconstructing and restoring the dynamic range of those differently exposed frames on the basis of the masks, the method recovers as much image detail of the target object as possible while avoiding artifacts, improving the quality of the reconstructed high dynamic range image.

Description

High dynamic range imaging method, device, electronic equipment and storage medium
Technical Field
The present application relates to the field of image processing technologies, and in particular, to a high dynamic range imaging method and apparatus, an electronic device, and a storage medium.
Background
Dynamic range is the ratio of the maximum value to the minimum value of a variable signal, such as a light signal or a sound signal. In the field of image processing, dynamic range refers to the range of light exposure values in a scene that an image can capture. The dynamic range of a natural scene observable by the human eye can reach 10000:1, whereas common consumer-grade photographic equipment (such as a smartphone) can only capture a limited Low Dynamic Range (LDR) image (around 100:1 to 300:1). A High Dynamic Range (HDR) image has a wider dynamic range and can reproduce lighting and shading much closer to the real scene, yielding photos with richer gradation, a more realistic picture, and higher quality.
There are two methods for obtaining high dynamic range photographs. The first uses specialized equipment, but such equipment is expensive and rare in consumer electronics (e.g., smartphones). The more common approach obtains an HDR image by fusing multiple frames of LDR photographs taken at different exposures. In a dynamic scene, however, the shake of a handheld camera or the motion of a foreground object makes multi-frame fusion prone to artifacts.
In the related art, deep learning is used for multi-frame fusion in dynamic scenes: three input LDR frames are first aligned via optical flow, and the aligned frames are then fed into a fusion network to obtain the final HDR image.
Disclosure of Invention
In view of the above, it is necessary to provide a high dynamic range imaging method, apparatus, electronic device and storage medium capable of avoiding artifacts.
A high dynamic range imaging method, the method comprising:
acquiring a low dynamic range image of a target object; the low dynamic range image includes a first low dynamic range image, a second low dynamic range image, and a third low dynamic range image whose exposure values increase in sequence;
generating masks corresponding to different scales based on the low dynamic range images;
and performing fusion processing and image reconstruction processing according to the low dynamic range images and the masks corresponding to the scales to obtain high dynamic range images of the target object.
A high dynamic range imaging device, the device comprising:
the image acquisition module is used for acquiring a low dynamic range image of the target object; the low dynamic range image includes a first low dynamic range image, a second low dynamic range image, and a third low dynamic range image whose exposure values increase in sequence;
the mask generation module is used for generating masks corresponding to different scales based on the low dynamic range images;
and the image reconstruction module is used for performing fusion processing and image reconstruction processing according to the low dynamic range images and the masks corresponding to the scales to obtain high dynamic range images of the target object.
An electronic device comprising a memory and a processor, the memory storing a computer program, the processor implementing the following steps when executing the computer program:
acquiring a low dynamic range image of a target object; the low dynamic range image includes a first low dynamic range image, a second low dynamic range image, and a third low dynamic range image whose exposure values increase in sequence;
generating masks corresponding to different scales based on the low dynamic range images;
and performing fusion processing and image reconstruction processing according to the low dynamic range images and the masks corresponding to the scales to obtain high dynamic range images of the target object.
A computer-readable storage medium, on which a computer program is stored which, when executed by a processor, carries out the steps of:
acquiring a low dynamic range image of a target object; the low dynamic range image includes a first low dynamic range image, a second low dynamic range image, and a third low dynamic range image whose exposure values increase in sequence;
generating masks corresponding to different scales based on the low dynamic range images;
and performing fusion processing and image reconstruction processing according to the low dynamic range images and the masks corresponding to the scales to obtain high dynamic range images of the target object.
The high dynamic range imaging method, apparatus, electronic device and storage medium acquire low dynamic range images of a target object at three different exposure values and generate masks corresponding to different scales based on the low dynamic range images; fusion processing and image reconstruction processing are then performed according to the low dynamic range images and the masks of each scale to obtain a high dynamic range image of the target object. By generating multi-scale masks from multiple frames of the same target object captured at different exposure values, and by reconstructing and restoring the dynamic range of those differently exposed frames on the basis of the masks, the method recovers as much image detail of the target object as possible while avoiding artifacts, improving the quality of the reconstructed high dynamic range image.
Drawings
FIG. 1 is a schematic flow diagram of a high dynamic range imaging method in one embodiment;
fig. 2 is a schematic flow chart illustrating a process of performing fusion processing and image reconstruction processing according to each low dynamic range image and a mask corresponding to each scale to obtain a high dynamic range image of a target object in one embodiment;
FIG. 3 is a schematic flow chart illustrating fusion processing of second low dynamic range image features of respective scales and fused image features of the scales in one embodiment;
fig. 4 is a schematic flow chart illustrating that, in one embodiment, based on masks of respective scales, fusion processing is performed on the first low dynamic range image feature and the third low dynamic range image feature of the respective scales to obtain fused image features of the respective scales;
FIG. 5 is a flow diagram that illustrates the training process for a high dynamic range imaging model, according to one embodiment;
FIG. 6 is a flow diagram illustrating a process for training a high dynamic range imaging model in accordance with another embodiment;
FIG. 7(1) (left) is a diagram of a first initial sample low dynamic range image in one embodiment;
FIG. 7(1) (right) is a diagram of a first transformed sample low dynamic range image in one embodiment;
FIG. 7(2) (left) is a diagram of a second initial sample low dynamic range image in one embodiment;
FIG. 7(2) (right) is a diagram of a second transformed sample low dynamic range image in one embodiment;
FIG. 7(3) (left) is a diagram of a third initial sample low dynamic range image in one embodiment;
FIG. 7(3) (right) is a diagram of a third transformed sample low dynamic range image in one embodiment;
FIG. 7(4) (left) is a diagram of a fourth initial sample low dynamic range image in one embodiment;
FIG. 7(4) (right) is a diagram of a fourth transformed sample low dynamic range image in one embodiment;
FIG. 8 is a schematic diagram of a high dynamic range imaging framework in one embodiment;
FIG. 9 is a block diagram of a high dynamic range imaging device in one embodiment;
FIG. 10 is a block diagram showing the construction of a high dynamic range imaging apparatus in another embodiment;
FIG. 11 is a diagram illustrating the internal architecture of an electronic device in one embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
In recent years, technical research based on artificial intelligence, such as computer vision, deep learning, machine learning, image processing, and image recognition, has developed rapidly. Artificial Intelligence (AI) is an emerging science and technology that studies and develops theories, methods, techniques, and application systems for simulating and extending human intelligence. Artificial intelligence is a comprehensive discipline involving many technical fields, such as chips, big data, cloud computing, the Internet of Things, distributed storage, deep learning, machine learning, and neural networks. Computer vision, an important branch of artificial intelligence concerned with enabling machines to perceive the world, generally covers technologies such as face recognition, liveness detection, fingerprint recognition and anti-counterfeiting verification, biometric recognition, face detection, pedestrian detection, target detection, pedestrian recognition, image processing, image recognition, image semantic understanding, image retrieval, character recognition, video processing, video content recognition, behavior recognition, three-dimensional reconstruction, virtual reality, augmented reality, simultaneous localization and mapping (SLAM), computational photography, and robot navigation and positioning.
With the research and progress of artificial intelligence technology, the technology is applied to various fields, such as security, city management, traffic management, building management, park management, face passage, face attendance, logistics management, warehouse management, robots, intelligent marketing, computational photography, mobile phone images, cloud services, smart homes, wearable equipment, unmanned driving, automatic driving, smart medical treatment, face payment, face unlocking, fingerprint unlocking, identity verification, smart screens, smart televisions, cameras, mobile internet, live webcasts, beauty filters, medical cosmetology, intelligent temperature measurement and the like.
In one embodiment, as shown in fig. 1, a high dynamic range imaging method is provided, and this embodiment is illustrated by applying the method to a terminal, and it is to be understood that the method may also be applied to a server, and may also be applied to a system including a terminal and a server, and is implemented by interaction between the terminal and the server. In this embodiment, the method includes steps S110 to S130.
Step S110, acquiring a low dynamic range image of a target object; the low dynamic range image includes a first low dynamic range image, a second low dynamic range image, and a third low dynamic range image in which exposure values sequentially increase.
The target object refers to the subject contained in the captured picture; in one embodiment, the target object may include a human figure, an animal, a plant, and the like. In the field of image processing, the dynamic range describes the range of light intensity distribution from the darkest shadow region to the brightest highlight region in a picture. It can be measured in Exposure Value (EV) units, also called "stops". A scene described as having a very wide dynamic range has a large difference in exposure value between its shadow and highlight regions, so the picture has relatively high contrast and rich gradation, for example when shooting a silhouette at sunset; other scenes have a much smaller dynamic range. In the present embodiment, a low dynamic range image is one captured over a narrow (low) range of exposure values, that is, one in which the difference in exposure value between the shadow and highlight regions is small. It should be noted that the high dynamic range and the low dynamic range mentioned in the embodiments of the present application are relative concepts.
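As a quick illustration of the EV arithmetic above (a minimal sketch; the stop counts below are derived from the contrast ratios quoted in the Background, not stated explicitly in the patent): each EV stop corresponds to a doubling of light, so a contrast ratio converts to stops via a base-2 logarithm.

```python
import math

def dynamic_range_stops(ratio: float) -> float:
    """Convert a contrast ratio (max/min luminance) to EV stops (log base 2)."""
    return math.log2(ratio)

# A natural scene at ~10000:1 versus a consumer LDR capture at ~300:1:
print(round(dynamic_range_stops(10000), 1))  # 13.3 stops
print(round(dynamic_range_stops(300), 1))    # 8.2 stops
```

The gap of roughly five stops between the two figures is the detail an HDR reconstruction tries to recover.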
Further, in this embodiment, the obtained low dynamic range images include at least three images with different dynamic ranges, characterized by their exposure values; in order of increasing exposure value they are denoted the first low dynamic range image, the second low dynamic range image, and the third low dynamic range image. It should be noted that, in the embodiments of the present application, "first", "second", and "third" are used only to distinguish images with different exposure values and carry no other meaning; in other embodiments, the exposure values of the first, second, and third low dynamic range images may instead decrease in sequence.
Step S120, based on each low dynamic range image, masks corresponding to different scales are generated.
Scale is an important concept in the fields of computer vision and image processing. The answer to any visual question often depends on the scale at which it is posed, and an object in an image is meaningful only within a certain range of scales. For example, to view a whole tree, the appropriate scale is on the order of meters; to observe the leaves on the tree, the appropriate scale is on the order of centimeters; and to observe the cellular structure of a leaf, a millimeter or micron scale is needed. The concept of "scale" in vision problems can also be understood intuitively as the distance between the observing device and the observed object: a longer distance corresponds to a larger scale, and a shorter distance to a smaller scale. In image processing, the scale represents a resolution measurement scale.
An image mask works like a physical mask: a selected image, graphic, or object is used to block, wholly or partially, the image being processed, so as to control the region or course of image processing. Masks can be used to extract a region of interest, for occlusion, for structural feature extraction, for making irregularly shaped images, and so on. In one embodiment, the mask at any scale is a mask matrix whose element values lie in the range [0, 1].
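A minimal sketch of how such a mask matrix acts on an image (the element-wise product below is one standard way to apply a [0, 1] mask, shown for illustration; it is not a formula taken from the patent):

```python
def apply_mask(image, mask):
    """Element-wise product of an image with a [0, 1] mask matrix:
    1.0 keeps a pixel fully, 0.0 blocks it, fractions attenuate it."""
    return [[px * m for px, m in zip(img_row, mask_row)]
            for img_row, mask_row in zip(image, mask)]

image = [[10, 20], [30, 40]]
mask = [[1.0, 0.5], [0.0, 1.0]]   # keep, halve, block, keep
print(apply_mask(image, mask))    # [[10.0, 10.0], [0.0, 40.0]]
```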
In one embodiment, generating masks corresponding to different scales based on respective low dynamic range images includes: splicing the first low dynamic range image and the third low dynamic range image to obtain a spliced image; and generating masks with different scales based on the spliced images.
Image stitching is an important branch of computer vision, and is to seamlessly stitch two or more images with partial overlap to obtain a higher resolution or wide viewing angle image. The two images are spliced to obtain one image, and the image splicing method can be realized in any mode. In one embodiment, stitching the first low dynamic range image and the third low dynamic range image to obtain a stitched image includes: and splicing the same color channel in RGB (three primary optical colors: red, green and blue) of the first low dynamic range image and the third low dynamic range image to obtain a spliced image. In a specific embodiment, a concat function is used to implement the process of stitching the first low dynamic range image and the third low dynamic range image. And after the spliced images are obtained, generating corresponding masks with different scales based on the spliced images. In other embodiments, the two images can be spliced in other manners.
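The channel-wise splicing described above can be sketched in pure Python as below (a stand-in for the concat operation; in a real pipeline this would be a tensor concatenation along the channel axis, and each pixel here is simply a list of channel values):

```python
def concat_channels(img_a, img_b):
    """Splice two H x W images pixel by pixel along the channel axis:
    each pixel is a list of channel values, so [R,G,B] + [R,G,B] -> 6 channels."""
    return [[ch_a + ch_b for ch_a, ch_b in zip(row_a, row_b)]
            for row_a, row_b in zip(img_a, img_b)]

first_ldr = [[[1, 2, 3]]]   # a 1x1 RGB image
third_ldr = [[[7, 8, 9]]]
print(concat_channels(first_ldr, third_ldr))  # [[[1, 2, 3, 7, 8, 9]]]
```

The spliced 6-channel result is what the guide mask encoder would consume.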
In one embodiment, generating masks of different dimensions based on the stitched image comprises: and generating masks with different scales based on the guide mask encoder determined by training and the spliced images.
The guide mask encoder is used for generating a mask, and the preset mask encoder frame can be trained in advance by utilizing sample data to determine the guide mask encoder. The detailed process of training the determined guide mask encoder will be described in detail in the following embodiments, and will not be described herein again. Further, in this embodiment, the stitched image is input to a trained guide mask encoder, so that masks with different scales output by the guide mask encoder can be obtained.
Further, in one embodiment, the guide mask encoder has a multi-layer structure with the same number of layers as the network structure used to extract the image features. Each layer of the guide mask encoder includes a convolutional layer and an activation function. An activation function is a function that runs on a neuron of an artificial neural network and is responsible for mapping the neuron's input to its output; commonly used activation functions include the Sigmoid function (S-type growth curve), the Tanh function (hyperbolic tangent), the ReLU activation function (Rectified Linear Unit), and the like. In one embodiment, the stitched image is input into the first-layer network structure of the guide mask encoder to obtain the down-sampled image features output by the first layer and the corresponding mask; the down-sampled image features output by the first layer are then input into the second-layer network structure to obtain the down-sampled image features output by the second layer and the corresponding mask; and so on, layer by layer, down to the lowest scale.
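One such layer (convolution followed by an activation) can be sketched as below. The 1x1 "convolution" (a scalar weight plus bias) and the choice of Sigmoid are illustrative assumptions, picked because the mask elements must land in [0, 1]; the patent does not prescribe either.

```python
import math

def sigmoid(x):
    """S-type growth curve; squashes any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def mask_layer(feature_map, weight=1.0, bias=0.0):
    """A toy 1x1 convolution (scalar weight + bias) followed by Sigmoid,
    so every output element is a valid mask value in [0, 1]."""
    return [[sigmoid(weight * v + bias) for v in row] for row in feature_map]

mask = mask_layer([[-10.0, 0.0, 10.0]])
# elements come out near 0.0, exactly 0.5, and near 1.0 respectively
```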
And step S130, performing fusion processing and image reconstruction processing according to each low dynamic range image and the mask corresponding to each scale to obtain a high dynamic range image of the target object.
In one embodiment, as shown in fig. 2, a fusion process and an image reconstruction process are performed according to each low dynamic range image and a mask corresponding to each scale to obtain a high dynamic range image of the target object, including steps S131 to S133. Wherein:
step S131, respectively carrying out multi-scale feature extraction on the first low dynamic range image, the second low dynamic range image and the third low dynamic range image to obtain a first low dynamic range image feature, a second low dynamic range image feature and a third low dynamic range image feature under different scales; each scale of each low dynamic range image feature corresponds to each scale of the mask.
For images, each image has self characteristics which can be distinguished from other images, and some images are natural characteristics which can be intuitively felt, such as exposure values, edges, textures, colors and the like; some of them are obtained by transformation or processing, such as moment, histogram, principal component, etc. In one embodiment, feature extraction is a concept in computer vision and image processing, which refers to using a computer to extract image information to decide whether a point of each image belongs to an image feature.
Multi-scale feature extraction extracts features at different scales from each low dynamic range image; the multi-scale image features extracted from one image form that image's feature pyramid. Multi-scale feature extraction on a low dynamic range image can be realized in any manner. In one embodiment, it may be performed with a neural network; for example, the encoder portion of a unet network may be used to successively down-sample the input low dynamic range image, yielding image features of the low dynamic range image at different scales. The unet network has a U-shaped symmetric structure whose left side consists of convolutional layers and whose right side consists of up-sampling layers; only the left, convolutional part is used in this embodiment. Further, the unet encoders used for multi-scale feature extraction on the different low dynamic range images share their weights, and the scales of the extracted image features are the same. In other embodiments, multi-scale feature extraction on the low dynamic range images may be performed in other ways.
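The repeated down-sampling performed by such an encoder can be sketched as follows (a toy stand-in: a real encoder down-samples with strided convolutions and also changes the channel count, both of which are omitted here):

```python
def downsample_2x(feat):
    """Halve the spatial resolution by keeping every other row and column."""
    return [row[::2] for row in feat[::2]]

def feature_pyramid(feat, levels=4):
    """Encoder-style multi-scale features: level 0 is the input scale,
    and each further level halves the height and width."""
    pyramid = [feat]
    for _ in range(levels - 1):
        feat = downsample_2x(feat)
        pyramid.append(feat)
    return pyramid

levels = feature_pyramid([[0] * 8 for _ in range(8)], levels=4)
print([len(f) for f in levels])  # [8, 4, 2, 1]
```

Running the same pyramid over all three LDR inputs with shared logic mirrors the weight sharing described above.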
Further, in this embodiment, multi-scale feature extraction is performed on each of the obtained low dynamic range images, so as to obtain an image feature of the first low dynamic range image of each scale, an image feature of the second low dynamic range image of each scale, and an image feature of the third low dynamic range image of each scale.
In this embodiment, the scale of the image feature obtained by feature extraction corresponds to the scale of the generated mask; further, in one embodiment, the number of channels in each layer of the network structure of the guide mask encoder is set so that the guide mask encoder is consistent with the scale of the image feature extraction.
And S132, respectively fusing the first low dynamic range image characteristic and the third low dynamic range image characteristic of the scale based on the mask of each scale to obtain fused image characteristics of each scale.
After the masks with different scales are generated, the image features of the first low dynamic range image and the image features of the third low dynamic range image of the scale are fused by using the masks, and the fused image features of the scale can be obtained. It will be appreciated that one fused image feature may be obtained for each scale. The fused image features include details of the first low dynamic range image and the third low dynamic range image in each corresponding region.
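A minimal sketch of one plausible mask-weighted fusion at a single scale (the convex combination below is an assumption for illustration; the patent states that the mask values lie in [0, 1] but does not spell out the exact fusion formula):

```python
def fuse_features(feat_first, feat_third, mask):
    """Blend two feature maps with a [0, 1] mask, element by element:
    fused = mask * first + (1 - mask) * third."""
    return [[m * a + (1.0 - m) * b for a, b, m in zip(row_a, row_b, row_m)]
            for row_a, row_b, row_m in zip(feat_first, feat_third, mask)]

first = [[0.0, 0.0]]
third = [[1.0, 1.0]]
print(fuse_features(first, third, [[0.25, 0.75]]))  # [[0.75, 0.25]]
```

Applying this once per scale yields exactly one fused image feature per scale, as the text describes.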
And S133, respectively reconstructing an image according to the second low dynamic range image characteristics of each scale and the fused image characteristics to obtain a high dynamic range image of the target object.
One task of image reconstruction is to visually present the large amount of measured and observed information about an object in the reconstructed image. In one embodiment, performing image reconstruction according to the second low dynamic range image features and the fused image features of each scale to obtain the high dynamic range image of the target object includes: fusing the second low dynamic range image features of each scale with the fused image features of that scale to obtain the image features of the high dynamic range image; and converting the image features of the high dynamic range image into the high dynamic range image of the target object.
The image features can be fused in any mode. In a specific embodiment, the image features are fused through a concat function to obtain the image features of the high dynamic range image. Further, after obtaining the image features of the high dynamic range image, the image features of the high dynamic range image are converted into a corresponding high dynamic range image, i.e., a high dynamic range image of the target object.
The conversion of image features into a high dynamic range image may be achieved in any manner. For example, in a specific embodiment, the process of converting the image features of the high dynamic range image into the high dynamic range image is implemented by an RCAB (Residual Channel Attention Block).
Further, in an embodiment, image reconstruction according to the image features of each scale of the second low dynamic range image and the fused image features of each scale proceeds from coarse to fine. In the foregoing steps, multi-scale image feature extraction is performed on the first, second, and third low dynamic range images, and the scales of the extracted second low dynamic range image features, like those of the fused image features, correspond one to one to the scales of the generated masks. In this step, the second low dynamic range image features and the fused image features of each scale are used to construct the high dynamic range image features of the target object: starting from the lowest scale, the fused image features and the second low dynamic range image features are fused, combining the output of the previous scale at each step, until high dynamic range image features with the same scale as the input low dynamic range images are obtained.
In one embodiment, the fusion processing is performed on the second low dynamic range image features of each scale and the fused image features of the scale, so as to obtain the image features of the high dynamic range image, and the method includes: from the lowest scale, splicing the image features of the scale, the fused image features and the output features of the previous scale to obtain cross-scale splicing features; performing convolution on the cross-scale splicing characteristics to obtain output characteristics of the scale; the output characteristic of the last scale of the lowest scale is a preset value; and determining the output characteristic corresponding to the highest scale as the image characteristic of the high dynamic range image.
Wherein, the lowest scale is the image feature extracted by the last layer when the image feature is extracted. In this embodiment, a specific process of fusing the fused image features of each scale and the second low dynamic range image features to obtain the high dynamic range image features of the target object includes: splicing the second low dynamic range image features, the fused image features (obtained by fusing the first low dynamic range image features and the third low dynamic range image features based on the mask) which correspond to the same scale and the output features of the previous scale, wherein the obtained result is marked as a cross-scale splicing feature in the embodiment; and then, performing convolution on the cross-scale splicing characteristics to obtain the output characteristics of the scale.
Further, since the operation starts from the lowest scale and there is no scale below it, in this embodiment the output feature of the scale preceding the lowest scale is set to a preset value. This ensures that the operation at every scale is identical: splice the fused image feature, the second low dynamic range image feature, and the output feature of the previous scale, then apply convolution to obtain the output feature of the current scale. The operation proceeds upward from the lowest scale until the output corresponding to the highest scale is obtained, which in this embodiment is the image feature of the high dynamic range image. In a specific embodiment, the output feature of the scale preceding the lowest scale is set to 0.
Further, when the output features of the fused image features and the second low dynamic range image features in the previous scale are spliced, the scale of the output features in the previous scale is still the previous scale and is different from the currently calculated scale, and at this time, the splicing is cross-scale splicing. In one embodiment, when performing cross-scale stitching, stitching the second low dynamic range image feature of the scale, the fused image feature, and the output feature of the previous scale, and obtaining the cross-scale stitching feature includes: and performing up-sampling on the output characteristic of the previous scale to obtain an up-sampled image characteristic, and splicing the second low dynamic range image characteristic of the scale, the fused image characteristic and the up-sampled image characteristic to obtain a cross-scale splicing characteristic.
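The coarse-to-fine loop described above can be sketched as follows. This is a minimal NumPy illustration, not the patent's actual network: the names (coarse_to_fine, conv1x1, upsample2x) are hypothetical, the per-scale convolution is stood in for by a 1 × 1 channel mixing, and the preset output feature of the scale preceding the lowest scale is taken to be 0 as in the specific embodiment above.

```python
import numpy as np

def upsample2x(x):
    """Nearest-neighbour 2x up-sampling of a (C, H, W) feature map."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def conv1x1(x, weight):
    """Stand-in for the per-scale convolution: a 1x1 channel mixing.
    weight has shape (C_out, C_in); x has shape (C_in, H, W)."""
    c_out, _ = weight.shape
    c, h, w = x.shape
    return (weight @ x.reshape(c, -1)).reshape(c_out, h, w)

def coarse_to_fine(ref_feats, fused_feats, weights, preset=0.0):
    """From the lowest scale upward: splice [second-LDR feature, fused
    feature, (up-sampled) previous output], then convolve. The 'previous'
    output at the lowest scale is a preset value (0 here)."""
    prev = np.full_like(ref_feats[0], preset)
    for f_r, f_star, w in zip(ref_feats, fused_feats, weights):
        if prev.shape[1:] != f_r.shape[1:]:
            prev = upsample2x(prev)  # cross-scale: lift prev to this scale
        spliced = np.concatenate([f_r, f_star, prev], axis=0)
        prev = conv1x1(spliced, w)
    return prev  # output at the highest scale = HDR image features

# Two toy scales (lowest first), one channel each, all-ones 1x1 weights.
ref = [np.ones((1, 2, 2)), np.ones((1, 4, 4))]    # second-LDR features
fused = [np.ones((1, 2, 2)), np.ones((1, 4, 4))]  # mask-fused features
weights = [np.ones((1, 3)), np.ones((1, 3))]
out = coarse_to_fine(ref, fused, weights)
```

With these toy inputs the loop runs once per scale, up-sampling the previous output exactly when its spatial size differs from the current scale, and returns the highest-scale feature map.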
In a specific embodiment, taking 4 scales as an example, fig. 3 shows a schematic flow diagram of the fusion processing performed on the second low dynamic range image features of each scale and the fused image features of that scale. In the figure, {f*_4, f*_3, f*_2, f*_1} denote the fused image features at the scales from high to low, {f^r_4, f^r_3, f^r_2, f^r_1} denote the second low dynamic range image features at the scales from high to low, the cross-scale splicing features are likewise indexed from the highest to the lowest scale, "up" denotes the up-sampling operation, and {f^up_4, f^up_3, f^up_2, f^up_1} denote the up-sampled image features at the scales from high to low. The subscripts (1, 2, 3, 4) indicate the scale, with subscript 4 indicating the highest scale.
The above high dynamic range imaging method acquires low dynamic range images of a target object at three different exposure values and generates masks corresponding to different scales based on these low dynamic range images; fusion processing and image reconstruction processing are then performed according to the low dynamic range images and the masks at each scale to obtain a high dynamic range image of the target object. By generating masks at multiple scales from multiple frames of the same target object with different exposure values, and reconstructing and restoring the dynamic range of the differently exposed images based on these masks, the method restores as many image details of the target object as possible while avoiding artifacts, improving the quality of the reconstructed high dynamic range image.
In one embodiment, as shown in fig. 4, based on a mask of each scale, the first low dynamic range image feature and the third low dynamic range image feature of the scale are respectively fused to obtain fused image features of each scale, including steps S141 to S143:
for the same scale:
step S141 calculates a first product of the first low dynamic range image feature and the mask.
Step S142, determining the difference between 1 and the mask, and calculating a second product of the difference and the third low dynamic range image feature.
And S143, determining the sum of the first product and the second product as the fused image characteristic of the scale.
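Steps S141 to S143 amount to a per-pixel convex combination of the two feature maps. A minimal NumPy sketch (the name fuse_features is hypothetical):

```python
import numpy as np

def fuse_features(f_low, f_high, mask):
    """Steps S141-S143 at one scale:
    fused = mask * f_low + (1 - mask) * f_high."""
    return mask * f_low + (1.0 - mask) * f_high

# Where the mask is 1 the first (low-exposure) feature wins;
# where it is 0 the third (high-exposure) feature wins.
f_l = np.full((1, 2, 2), 10.0)            # first LDR image feature
f_h = np.full((1, 2, 2), 2.0)             # third LDR image feature
m = np.array([[[1.0, 0.0], [0.5, 0.5]]])  # mask for this scale
fused = fuse_features(f_l, f_h, m)
```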
For each scale, the first low dynamic range image feature and the third low dynamic range image feature are fused using the mask of that scale to obtain the fused image feature. In one embodiment, {f*_k} denotes the fused image features at each scale, {f^l_k} denotes the first low dynamic range image features at each scale, {f^h_k} denotes the third low dynamic range image features at each scale, and m_k denotes the mask corresponding to each scale, where the subscript k indicates the scale. The above process can be expressed by the following formula:

f*_k = m_k · f^l_k + (1 − m_k) · f^h_k
in this embodiment, an operation process when fusing image features of two images by using a mask is provided, according to the above manner, the first low dynamic range image feature and the third low dynamic range image feature may be fused together, and details of corresponding regions in the first low dynamic range image feature and the third low dynamic range image feature may be adaptively obtained and retained in the fused image features.
Further, in one embodiment, the high dynamic range imaging method as in any of the above embodiments is implemented by a high dynamic range imaging model. In the present embodiment, as shown in fig. 5, the training process of the high dynamic range imaging model includes steps S310 to S330.
Step S310, obtaining an initial sample low dynamic range image, wherein the initial sample image comprises a first initial sample low dynamic range image, a second initial sample low dynamic range image and a third initial sample low dynamic range image, and exposure values of the first initial sample low dynamic range image, the second initial sample low dynamic range image and the third initial sample low dynamic range image are sequentially increased.
In one embodiment, the initial sample low dynamic range images are photos obtained by ordinary shooting; more specifically, the initial samples are multiple low dynamic range images of the same target object shot with different exposure values. In one embodiment, the initial sample low dynamic range images are obtained from the dynamic scene dataset proposed by Kalantari, which contains 74 training samples and 15 test samples; the exposure value (EV) combinations in the data are only {EV-2, EV0, EV+2} and {EV-3, EV0, EV+3}.
Further, in one embodiment, after acquiring the initial sample low dynamic range images, the method further includes: uniformly cropping the initial sample low dynamic range images to a preset size. In one embodiment, the preset size is 256 × 256. Cropping the initial sample low dynamic range images saves video memory on the Graphics Processing Unit (GPU).
Step S320, randomly selecting an image block in each initial sample low dynamic range image, and performing pixel value transformation on the image block to obtain a transformed sample low dynamic range image.
Randomly selecting image blocks from the initial sample images and transforming the pixel values in the selected blocks compensates for the scarcity of extreme overexposure/underexposure scenes by simulating them. It will be appreciated that the size of the selected image block is typically smaller than the size of the initial sample image itself. In one embodiment, the selected image block size is 64 × 64.
Further, pixel value conversion processing is performed on the selected image block, including converting pixel values in the image block to pixel values of a higher exposure value or a lower exposure value. The pixel values for replacement may be determined by transformation based on the original pixel values.
In one embodiment, as shown in fig. 6, randomly selecting an image block in each initial sample low dynamic range image, and performing pixel value transformation processing on the image block to obtain a transformed sample low dynamic range image, includes steps S321 and S322:
step S321, randomly selecting a first image block in the first initial sample low dynamic range image and the second initial sample low dynamic range image, and converting pixel values in the first image block into first target pixel values to obtain a first converted sample low dynamic range image and a second converted sample low dynamic range image; the first target pixel value is obtained by converting an original pixel value in the first image block based on a gamma curve, and the exposure value of the first target pixel value is larger than that of the pixel value in the first image block.
As can be seen from the foregoing steps, the exposure values of the first and second initial sample low dynamic range images are lower than that of the third. In this embodiment, the two initial samples with the lower exposure values are selected from a group of initial samples, image blocks at the same position are selected from each of them (denoted in this embodiment as the first image blocks), and the pixel values in the first image blocks of the two initial samples are replaced with pixel values of a higher exposure value than the original pixels (denoted in this embodiment as the first target pixel values).
The Gamma curve is a special tone curve: when the gamma value equals 1, the curve is a straight line at 45° to the coordinate axes, indicating that the input and output densities are the same. Gamma values above 1 darken the output, and gamma values below 1 brighten it. In one embodiment, transforming the pixel values in an image block based on the gamma curve can be expressed as: output = input^gamma, with pixel values normalized to [0, 1]. Further, in this embodiment, the pixel values in the first image block need to be replaced with pixel values of a higher exposure value, so a gamma value below 1 can be selected from the gamma curve to transform the pixel values in the first image block and obtain the transformed pixel values.
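A minimal NumPy sketch of this gamma remapping, under the assumption that pixel values are normalized to [0, 1] (the function name gamma_transform is hypothetical): gamma < 1 raises values toward 1 (simulating a higher exposure value), while gamma > 1 lowers them (simulating a lower exposure value).

```python
import numpy as np

def gamma_transform(patch, gamma):
    """Remap normalized pixel values in [0, 1] by output = input ** gamma.
    gamma < 1 brightens the block (simulates a higher exposure value);
    gamma > 1 darkens it (simulates a lower exposure value)."""
    return np.clip(patch, 0.0, 1.0) ** gamma

patch = np.array([[0.25, 0.5], [0.75, 1.0]])
brighter = gamma_transform(patch, 0.5)  # gamma < 1: higher exposure
darker = gamma_transform(patch, 2.0)    # gamma > 1: lower exposure
```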
Further, a first image block is randomly selected from the first initial sample low dynamic range image and the second initial sample low dynamic range image for pixel value replacement, and after pixel values in the first image blocks of the two initial samples are replaced with pixel values with higher exposure values, a sample image including a transformed image block is obtained, and in this embodiment, the sample image is marked as a first transformed sample low dynamic range image and a second transformed sample low dynamic range image.
Step S322, randomly selecting a second image block in the second initial sample low dynamic range image and the third initial sample low dynamic range image, and converting pixel values in the second image block into second target pixel values to obtain a third transformed sample low dynamic range image and a fourth transformed sample low dynamic range image; the second target pixel value is obtained by converting the original pixel value in the second image block based on the gamma curve, and the exposure value of the second target pixel value is smaller than the exposure value of the pixel value in the second image block.
As can be seen from the foregoing steps, the exposure values of the second and third initial sample low dynamic range images are higher than that of the first. In this embodiment, the two initial samples with the higher exposure values are selected from a group of initial samples, an image block is selected (denoted in this embodiment as the second image block), and the pixel values in the second image block are replaced with pixel values of a lower exposure value than the original pixels (denoted in this embodiment as the second target pixel values).
Further, in this embodiment, the pixel values in the second image block need to be replaced with the pixel values with the lower exposure value, and the gamma value higher than 1 may be selected from the gamma curve to transform the pixel values in the second image block, so as to obtain the transformed pixel values.
Further, a second image block is randomly selected from the second initial sample low dynamic range image and the third initial sample low dynamic range image for pixel value replacement, and after the pixel value in the second image block is replaced with a pixel value with a lower exposure value, a sample image including a transformed image block is obtained, and in this embodiment, the sample image is marked as a second transformed sample low dynamic range image and a third transformed sample low dynamic range image.
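Steps S321 and S322 can be sketched as the following NumPy augmentation routine. This is an illustration under stated assumptions, not the patent's code: the helper names (augment_pair, random_block) are hypothetical, pixel values are assumed normalized to [0, 1], and the same gamma remap is applied to the same randomly placed block in both frames of a pair.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_block(img_hw, size):
    """Pick a random top-left corner for a size x size block."""
    h, w = img_hw
    y = int(rng.integers(0, h - size + 1))
    x = int(rng.integers(0, w - size + 1))
    return (y, x, size, size)

def augment_pair(img_a, img_b, block, gamma):
    """Apply the same gamma remap to the same block in two frames
    (step S321 uses gamma < 1; step S322 uses gamma > 1)."""
    y, x, h, w = block
    out_a, out_b = img_a.copy(), img_b.copy()
    out_a[y:y + h, x:x + w] **= gamma
    out_b[y:y + h, x:x + w] **= gamma
    return out_a, out_b

# Step S321 analogue: brighten the same 4x4 block in the two darker frames.
img1 = np.full((8, 8), 0.25)
img2 = np.full((8, 8), 0.25)
blk = random_block(img1.shape, 4)
t1, t2 = augment_pair(img1, img2, blk, gamma=0.5)
```

Step S322 would be the analogous call on the two brighter frames with a gamma value above 1.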
In one embodiment, for the same initial sample low dynamic range image, a plurality of transformed sample low dynamic range images can be obtained by transforming image block pixel values.
Fig. 7(1) (left) is a schematic diagram of a first initial sample low dynamic range image in an embodiment, and fig. 7(1) (right) of a first transformed sample low dynamic range image, where the image block with the higher exposure value is the first image block. Fig. 7(2) (left) is a schematic diagram of a second initial sample low dynamic range image, and fig. 7(2) (right) of a second transformed sample low dynamic range image, where the image block with the higher exposure value is the first image block. Fig. 7(3) (left) is a schematic diagram of the second initial sample low dynamic range image, and fig. 7(3) (right) of a third transformed sample low dynamic range image, where the image block with the lower exposure value is the second image block. Fig. 7(4) (left) is a schematic diagram of the third initial sample low dynamic range image, and fig. 7(4) (right) of a fourth transformed sample low dynamic range image, where the image block with the lower exposure value is the second image block.
In other embodiments, the initial sample low dynamic range image may be randomly selected, and the image block may be selected therefrom for pixel value replacement.
In this embodiment, the gamma curve is used to transform the pixel values of randomly selected image blocks in the initial sample images, yielding the transformed sample low dynamic range images. These transformed samples simulate, to a certain extent, scenes of extreme overexposure or extreme underexposure, supplementing the sample images lacking such scenes in the original dataset and increasing the richness of the sample data. Using the transformed sample low dynamic range images to subsequently train the high dynamic range imaging model yields a model with better effect that recovers image details better.
In one embodiment, of the acquired initial sample low dynamic range images, half are selected for the step of randomly selecting image blocks and replacing pixel values, while the other half are left unprocessed; this ensures that the sample data contains images of a variety of different scenes.
And step S330, training an initial high dynamic range imaging model frame based on the transformed low dynamic range image and the initial sample low dynamic range image to obtain a high dynamic range imaging model.
And training a preset initial high dynamic range imaging model frame based on all initial sample low dynamic range images and the transformed sample low dynamic range images to obtain a high dynamic range imaging model.
In one embodiment, model training is performed using the L1 loss function:
l = ||I_GT − I_H||_1

where I_GT is the ground-truth high dynamic range image label of the training data and I_H is the predicted high dynamic range image generated by the model during training.
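As a sketch, the loss above can be computed as follows; note that typical implementations reduce with a mean rather than a plain sum, which is an assumption here, and the name l1_loss is hypothetical:

```python
import numpy as np

def l1_loss(pred, target):
    """Mean absolute error between the predicted HDR image I_H and the
    ground-truth HDR label I_GT (mean reduction is assumed here)."""
    return float(np.mean(np.abs(pred - target)))

i_h = np.array([0.2, 0.8, 0.5])   # predicted HDR values (toy)
i_gt = np.array([0.0, 1.0, 0.5])  # ground-truth HDR values (toy)
loss = l1_loss(i_h, i_gt)
```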
Further, in one embodiment, the model is trained using an Adam optimizer with parameters β1 = 0.9, β2 = 0.999 and ε = 10^-8, and a learning rate of 1e-4; the training code is implemented with the PyTorch framework and trained for 300 epochs on 2 RTX 2080 Ti graphics cards. The Adam algorithm performs first-order gradient optimization of a stochastic objective function based on adaptive estimates of lower-order moments; it is easy to implement, computationally efficient, and has low memory requirements.
This embodiment provides a way to construct the sample images used to train the model: image blocks are randomly selected from the initial samples and their pixel values replaced to simulate the extreme overexposure or extreme underexposure scenes found in real scenes, and model training is performed using both the initial sample image data and the constructed transformed sample image data. This yields a model with better effect, so that reconstructing a high dynamic range image from low dynamic range images restores more image details.
In a specific embodiment, the above-mentioned high dynamic range imaging method is described in a detailed embodiment:
1. training model
Initial sample image data were acquired from the dynamic scene dataset proposed by Kalantari. To address the problems that the dataset contains few data samples, that its exposure value (EV) combinations are only {EV-2, EV0, EV+2} and {EV-3, EV0, EV+3}, and that it contains few extreme overexposure or extreme underexposure scenes, this patent proposes a novel cross-exposure data augmentation method. As shown in the figure, the left-most column is the three-frame input LDR images; to simulate extreme over-/underexposure scenes, the pixel values of a selected region in two randomly chosen frames are transformed using a gamma curve, yielding the column of augmented data samples in the middle of the figure.
2. Model construction
After the training data set is prepared, the deep neural network needs to be trained. In this embodiment, the high dynamic range imaging frame structure is shown in fig. 8.
In this embodiment, the input to the high dynamic range imaging model is three frames of low dynamic range images I_l, I_r and I_h, and the output is a high dynamic range image I_H. The input images are first passed through a shared-weight multi-scale feature encoding neural network to extract the multi-scale features corresponding to the three LDR frames: {f^l_k}, {f^r_k} and {f^h_k}. At the same time, I_l and I_h are fed into a guide mask encoder to obtain four masks at the corresponding scales, {m_4, m_3, m_2, m_1}; these four masks are then used to fuse {f^l_k} and {f^h_k}, adaptively obtaining the details of the corresponding regions in the bright frame I_h and the dark frame I_l. The new features obtained by fusion at the four scales are denoted {f*_k}. Finally, a coarse-to-fine HDR reconstruction is performed with the newly obtained features {f*_k} and the features {f^r_k} of the reference frame I_r: starting from the lowest scale, as in the example of fig. 3, f*, f^r and the up-sampled result f^up output at the previous scale are spliced and sent into a convolution module; this is repeated at each layer, and the final output is the high dynamic range image.
During model training, the L1 loss function is used. To save GPU video memory, the training data are first cropped to a size of 256 × 256. The model is trained with an Adam optimizer (β1 = 0.9, β2 = 0.999, ε = 10^-8) and a learning rate of 1e-4; the training code is implemented with the PyTorch framework and trained for 300 epochs on 2 RTX 2080 Ti graphics cards.
In this embodiment, the high dynamic range imaging model performs image reconstruction based on multiple low dynamic range images with different exposure values to finally generate a high dynamic range image, and the model performs mask-based multi-scale feature fusion to effectively remove artifacts. Meanwhile, the acquired initial sample data is enhanced by randomly selecting image blocks and replacing their pixel values to simulate scenes of extreme overexposure or extreme underexposure, so that the trained model performs better in overexposed saturated regions (scenes with a large light ratio) and extremely dark scenes, and recovers good image detail.
Further, in a specific embodiment, comparing the effect of the high dynamic range imaging method provided in the embodiment of the present application with the effect of two conventional high dynamic range imaging methods, the method provided in the embodiment of the present application has better effect on high dynamic range images, has fewer artifacts, and can recover more image details.
It should be understood that, although the steps in the flowcharts involved in the above embodiments are shown in the order indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated otherwise herein, the steps are not strictly limited in order and may be performed in other orders. Moreover, at least some of the steps in the flowcharts involved in the above embodiments may include multiple sub-steps or stages, which are not necessarily performed at the same moment but may be performed at different moments, and whose order of execution is not necessarily sequential; they may be performed in turn or alternately with other steps or with at least part of the sub-steps or stages of other steps.
In one embodiment, as shown in fig. 9, there is provided a high dynamic range imaging apparatus including: an image acquisition module 910, a mask generation module 920, and an image reconstruction module 930, wherein:
an image acquisition module 910, configured to acquire a low dynamic range image of a target object; the low dynamic range image includes a first low dynamic range image, a second low dynamic range image, and a third low dynamic range image in which exposure values sequentially increase;
a mask generation module 920, configured to generate masks corresponding to different scales based on each low dynamic range image;
and an image reconstruction module 930, configured to perform fusion processing and image reconstruction processing according to each low dynamic range image and the mask corresponding to each scale, to obtain a high dynamic range image of the target object.
The high dynamic range imaging device acquires low dynamic range images of three different exposure values of a target object, and generates masks corresponding to different scales based on the low dynamic range images; and then carrying out fusion processing and image reconstruction processing according to the low dynamic range images and the masks of all scales to obtain a high dynamic range image of the target object. The device generates masks with a plurality of scales according to the images of a plurality of frames of the same target object with different exposure values, and carries out dynamic range reconstruction and restoration on the images with different exposure values based on the masks, so that more image details of the target object can be restored as far as possible on the premise of avoiding the occurrence of artifacts, and the image effect of the reconstructed high-dynamic-range image is improved.
In one embodiment, the mask generation module 920 of the apparatus comprises: the splicing submodule is used for splicing the first low dynamic range image and the third low dynamic range image to obtain a spliced image; and the mask generation submodule is used for generating masks with different scales based on the spliced images.
In one embodiment, the mask generation module 920 of the apparatus is specifically configured to generate masks of different scales based on the guide mask encoder determined by training and the stitched image.
In one embodiment, the image reconstruction module 930 of the above apparatus comprises: the feature extraction submodule is used for respectively carrying out multi-scale feature extraction on the first low dynamic range image, the second low dynamic range image and the third low dynamic range image to obtain a first low dynamic range image feature, a second low dynamic range image feature and a third low dynamic range image feature under different scales; each scale of each low dynamic range image feature corresponds to each scale of the mask respectively; the fusion submodule is used for respectively carrying out fusion processing on the first low dynamic range image characteristic and the third low dynamic range image characteristic of the scale based on the mask of each scale to obtain fused image characteristics of each scale; and the reconstruction submodule is used for respectively reconstructing images according to the second low dynamic range image characteristics of each scale and the fused image characteristics to obtain a high dynamic range image of the target object.
In one embodiment, the fusion submodule of the above apparatus is specifically configured to: for the same scale: calculating a first product of the first low dynamic range image feature and the mask; determining a difference between 1 and the mask, and calculating a second product of the difference and the image feature of the third low dynamic range; and determining the sum of the first product and the second product as the fused image feature of the scale.
In one embodiment, the reconstruction sub-module of the apparatus comprises: the feature fusion unit is used for respectively carrying out fusion processing on the second low dynamic range image features of each scale and the fused image features of the scale to obtain the image features of the high dynamic range image; and the image conversion unit is used for converting the image characteristics of the high dynamic range image into the high dynamic range image of the target object.
Further, in one embodiment, the feature fusion unit of the above apparatus includes: the splicing subunit is used for splicing the second low dynamic range image feature of the scale, the fused image feature and the output feature of the previous scale from the lowest scale to obtain a cross-scale splicing feature; the convolution subunit is used for performing convolution on the cross-scale splicing characteristics to obtain output characteristics of the scale; determining the output characteristic corresponding to the highest scale as the image characteristic of the high dynamic range image; the output characteristic of the last scale of the lowest scale is a preset value.
In one embodiment, the high dynamic range imaging method as in any one of the above embodiments is implemented by a high dynamic range imaging model; in this embodiment, as shown in fig. 10, the apparatus further includes: a model training module:
a sample image obtaining module 1010, configured to obtain an initial sample low dynamic range image, where the initial sample image includes a first initial sample low dynamic range image, a second initial sample low dynamic range image, and a third initial sample low dynamic range image, where exposure values of the first initial sample low dynamic range image, the second initial sample low dynamic range image, and the third initial sample low dynamic range image are sequentially increased;
a pixel value transformation processing module 1020, configured to randomly select an image block in each initial sample low dynamic range image, and perform pixel value transformation processing on the image block to obtain a transformed sample low dynamic range image;
the training module 1030 is configured to train the initial high dynamic range imaging model frame based on the transformed low dynamic range image and the initial sample low dynamic range image to obtain a high dynamic range imaging model.
Further, in one embodiment, the pixel value transformation processing module 1020 of the apparatus includes a first pixel value transformation unit and a second pixel value transformation unit, wherein:
the first pixel value transformation unit is configured to randomly select a first image block in the first initial sample low dynamic range image and the second initial sample low dynamic range image, and transform the pixel values in the first image block into first target pixel values to obtain a first transformed sample low dynamic range image and a second transformed sample low dynamic range image; the first target pixel value is obtained by converting an original pixel value in the first image block based on a gamma curve, and the exposure value of the first target pixel value is larger than that of the pixel value in the first image block. The second pixel value transformation unit is configured to randomly select a second image block in the second initial sample low dynamic range image and the third initial sample low dynamic range image, and transform the pixel values in the second image block into second target pixel values to obtain a third transformed sample low dynamic range image and a fourth transformed sample low dynamic range image; the second target pixel value is obtained by converting the original pixel value in the second image block based on the gamma curve, and the exposure value of the second target pixel value is smaller than the exposure value of the pixel value in the second image block.
For specific limitations of the high dynamic range imaging apparatus, reference may be made to the above limitations of the high dynamic range imaging method, which are not described in detail herein. The various modules in the high dynamic range imaging apparatus described above may be implemented in whole or in part by software, hardware, and combinations thereof. The modules can be embedded in a hardware form or independent of a processor in the electronic device, or can be stored in a memory in the electronic device in a software form, so that the processor can call and execute operations corresponding to the modules.
In one embodiment, an electronic device is provided, which may be a terminal, and its internal structure diagram may be as shown in fig. 11. The electronic device comprises a processor, a memory, a communication interface, a display screen and an input device which are connected through a system bus. Wherein the processor of the electronic device is configured to provide computing and control capabilities. The memory of the electronic equipment comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The communication interface of the electronic device is used for carrying out wired or wireless communication with an external terminal, and the wireless communication can be realized through WIFI, an operator network, NFC (near field communication) or other technologies. The computer program is executed by a processor to implement a high dynamic range imaging method. The display screen of the electronic equipment can be a liquid crystal display screen or an electronic ink display screen, and the input device of the electronic equipment can be a touch layer covered on the display screen, a key, a track ball or a touch pad arranged on the shell of the electronic equipment, an external keyboard, a touch pad or a mouse and the like.
Those skilled in the art will appreciate that the architecture shown in fig. 11 is a block diagram of only a portion of the architecture associated with the subject application, and does not constitute a limitation on the electronic devices to which the subject application may be applied, and that a particular electronic device may include more or less components than those shown, or may combine certain components, or have a different arrangement of components.
In one embodiment, an electronic device is provided, comprising a memory and a processor, the memory having a computer program stored therein, the processor implementing the following steps when executing the computer program:
acquiring a low dynamic range image of a target object; the low dynamic range image includes a first low dynamic range image, a second low dynamic range image, and a third low dynamic range image in which exposure values sequentially increase; generating masks corresponding to different scales based on each low dynamic range image; and performing fusion processing and image reconstruction processing according to the low dynamic range images and the masks corresponding to the scales to obtain the high dynamic range image of the target object.
In one embodiment, the processor, when executing the computer program, further performs the steps of: splicing the first low dynamic range image and the third low dynamic range image to obtain a spliced image; and generating masks of different scales based on the spliced image.
In one embodiment, the processor, when executing the computer program, further performs the steps of: generating masks of different scales based on the spliced image and a guide mask encoder determined by training.
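As a rough illustration of this stitching-and-masking step, the sketch below concatenates the under- and over-exposed images along the channel axis and derives one mask per scale. Everything here is an illustrative assumption: the function name, the brightness-difference heuristic and the fixed downsampling merely stand in for the trained guide mask encoder, whose learned weights the embodiment does not disclose.

```python
import numpy as np

def generate_multiscale_masks(ldr_under, ldr_over, num_scales=3):
    """Toy stand-in for the trained guide mask encoder: stitch the
    under- and over-exposed images along the channel axis, then emit
    one mask per scale with values in [0, 1]. A hand-crafted
    brightness heuristic replaces the learned convolutional weights."""
    stitched = np.concatenate([ldr_under, ldr_over], axis=-1)  # H x W x 2C
    masks = []
    x = stitched
    for _ in range(num_scales):
        half = x.shape[-1] // 2
        # Heuristic "mask": sigmoid of the mean channel difference, so the
        # fusion step can weight the two exposures against each other.
        diff = x[..., :half].mean(-1) - x[..., half:].mean(-1)
        masks.append(1.0 / (1.0 + np.exp(-diff)))
        # Halve the spatial resolution for the next, coarser scale.
        x = x[::2, ::2, :]
    return masks
```

A learned encoder would replace both the heuristic and the strided downsampling with convolutional layers, but the multi-scale output structure would be the same.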
In one embodiment, the processor, when executing the computer program, further performs the steps of:
respectively performing multi-scale feature extraction on the first low dynamic range image, the second low dynamic range image and the third low dynamic range image to obtain first, second and third low dynamic range image features at different scales, wherein each scale of each low dynamic range image feature corresponds to a scale of the mask; based on the mask of each scale, fusing the first low dynamic range image feature and the third low dynamic range image feature of that scale to obtain fused image features for each scale; and performing image reconstruction according to the second low dynamic range image features and the fused image features of each scale to obtain the high dynamic range image of the target object.
In one embodiment, the processor, when executing the computer program, further performs the steps of: for the same scale: calculating a first product of the first low dynamic range image feature and the mask; determining the difference between 1 and the mask, and calculating a second product of the difference and the third low dynamic range image feature; and determining the sum of the first product and the second product as the fused image feature of the scale.
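The fusion rule of this embodiment (mask times the first feature, plus one-minus-mask times the third feature) can be written out directly. The helper below is a minimal sketch with illustrative names; the embodiment applies it per scale to learned feature maps:

```python
import numpy as np

def fuse_features(feat_first, feat_third, mask):
    """Mask-weighted fusion described in the embodiment:
    fused = mask * feat_first + (1 - mask) * feat_third."""
    first_product = mask * feat_first            # first product
    second_product = (1.0 - mask) * feat_third   # second product
    return first_product + second_product        # fused image feature
```

A mask of 1 selects the first (under-exposed) feature, a mask of 0 selects the third (over-exposed) feature, and intermediate values blend the two.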
In one embodiment, the processor, when executing the computer program, further performs the steps of: fusing the second low dynamic range image features of each scale with the fused image features of that scale, respectively, to obtain the image features of the high dynamic range image; and converting the image features of the high dynamic range image into the high dynamic range image of the target object.
In one embodiment, the processor, when executing the computer program, further performs the steps of: starting from the lowest scale, splicing the second low dynamic range image feature of the scale, the fused image feature and the output feature of the previous scale to obtain a cross-scale splicing feature; performing convolution on the cross-scale splicing feature to obtain the output feature of the scale, wherein the output feature of the scale preceding the lowest scale is a preset value; and determining the output feature corresponding to the highest scale as the image feature of the high dynamic range image.
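A coarse-to-fine sketch of this reconstruction loop follows. The channel-mean "convolution", the nearest-neighbour upsampling and the zero preset value are stand-in assumptions for illustration only; the embodiment would use learned convolution layers between scales.

```python
import numpy as np

def reconstruct(second_feats, fused_feats, init_value=0.0):
    """Coarse-to-fine sketch of the reconstruction step: at each scale,
    stack the second-exposure feature, the fused feature and the
    previous scale's (upsampled) output, then apply a stand-in
    'convolution' (a channel mean) to get this scale's output."""
    prev = None
    for ref, fused in zip(second_feats, fused_feats):  # lowest scale first
        if prev is None:
            # The scale preceding the lowest one is a preset value.
            prev = np.full_like(ref, init_value)
        else:
            # Nearest-neighbour upsample to match the finer scale.
            prev = prev.repeat(2, axis=0).repeat(2, axis=1)
        stacked = np.stack([ref, fused, prev], axis=-1)  # cross-scale splice
        prev = stacked.mean(axis=-1)                     # stand-in for conv
    return prev  # output feature at the highest scale
```

The features are assumed to be ordered from lowest to highest scale, each scale doubling the spatial resolution of the previous one.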
In one embodiment, the processor, when executing the computer program, further performs the steps of: acquiring initial sample low dynamic range images, wherein the initial sample low dynamic range images comprise a first initial sample low dynamic range image, a second initial sample low dynamic range image and a third initial sample low dynamic range image whose exposure values increase in sequence; randomly selecting image blocks in the initial sample low dynamic range images, and performing pixel value transformation processing on the image blocks to obtain transformed sample low dynamic range images; and training an initial high dynamic range imaging model framework based on the transformed sample low dynamic range images and the initial sample low dynamic range images to obtain a high dynamic range imaging model.
In one embodiment, the processor, when executing the computer program, further performs the steps of: randomly selecting a first image block in the first initial sample low dynamic range image and the second initial sample low dynamic range image, and transforming the pixel values in the first image block into first target pixel values to obtain a first transformed sample low dynamic range image and a second transformed sample low dynamic range image, wherein the first target pixel values are obtained by transforming the original pixel values in the first image block based on a gamma curve, and the exposure value of the first target pixel values is greater than that of the original pixel values in the first image block; and randomly selecting a second image block in the second initial sample low dynamic range image and the third initial sample low dynamic range image, and transforming the pixel values in the second image block into second target pixel values to obtain a third transformed sample low dynamic range image and a fourth transformed sample low dynamic range image, wherein the second target pixel values are obtained by transforming the original pixel values in the second image block based on the gamma curve, and the exposure value of the second target pixel values is greater than that of the original pixel values in the second image block.
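The patch-wise gamma transformation described above can be sketched as follows, assuming pixel values normalised to [0, 1]. The function name, patch size and seeding are illustrative assumptions; mapping a value through v ** (1/gamma) with gamma greater than 1 raises every value in (0, 1), which mimics the higher exposure required of the target pixel values.

```python
import random

def gamma_augment_patch(image, gamma=2.2, patch=4, seed=None):
    """Data-augmentation sketch: pick a random square patch and map its
    pixels (assumed normalised to [0, 1]) through v ** (1 / gamma),
    brightening the patch to simulate a higher exposure value."""
    rng = random.Random(seed)
    h, w = len(image), len(image[0])
    y = rng.randrange(h - patch + 1)
    x = rng.randrange(w - patch + 1)
    out = [row[:] for row in image]  # copy; leave the original untouched
    for i in range(y, y + patch):
        for j in range(x, x + patch):
            out[i][j] = out[i][j] ** (1.0 / gamma)
    return out, (y, x)
```

In training, the same transformed patch location would be applied consistently to the pair of adjacent-exposure sample images, as the embodiment describes.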
In one embodiment, the present application further provides a computer-readable storage medium having stored thereon a computer program, which when executed by a processor, performs the steps in the above-described method embodiments.
Those skilled in the art will understand that all or part of the processes of the methods of the above embodiments may be implemented by a computer program instructing the relevant hardware; the computer program may be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the embodiments of the methods described above. Any reference to memory, storage, database or other medium used in the embodiments provided herein may include at least one of non-volatile and volatile memory. Non-volatile memory may include read-only memory (ROM), magnetic tape, floppy disk, flash memory or optical storage. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM may take many forms, such as static random access memory (SRAM) or dynamic random access memory (DRAM).
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; nevertheless, any such combination should be considered within the scope of this specification as long as it contains no contradiction.
The above examples express only several embodiments of the present application, which are described in relatively specific detail, but they should not be construed as limiting the scope of the invention. It should be noted that, for a person skilled in the art, several variations and improvements can be made without departing from the concept of the present application, all of which fall within the protection scope of the present application. Therefore, the protection scope of this patent shall be subject to the appended claims.

Claims (12)

1. A method of high dynamic range imaging, the method comprising:
acquiring a low dynamic range image of a target object; the low dynamic range image includes a first low dynamic range image, a second low dynamic range image and a third low dynamic range image, of which exposure values are sequentially increased;
generating masks corresponding to different scales based on the low dynamic range images;
and performing fusion processing and image reconstruction processing according to the low dynamic range images and the masks corresponding to the scales to obtain high dynamic range images of the target object.
2. The high dynamic range imaging method of claim 1, wherein said generating masks corresponding to different scales based on each of said low dynamic range images comprises:
splicing the first low dynamic range image and the third low dynamic range image to obtain a spliced image;
and generating masks with different scales based on the spliced images.
3. The high dynamic range imaging method of claim 2, wherein said generating different scale masks based on said stitched image comprises:
and generating masks with different scales based on the guide mask encoder determined by training and the spliced images.
4. The high dynamic range imaging method according to claim 2, wherein performing fusion processing and image reconstruction processing according to each of the low dynamic range images and the mask corresponding to each scale to obtain a high dynamic range image of the target object comprises:
respectively carrying out multi-scale feature extraction on the first low dynamic range image, the second low dynamic range image and the third low dynamic range image to obtain a first low dynamic range image feature, a second low dynamic range image feature and a third low dynamic range image feature under different scales; each scale of each low dynamic range image feature corresponds to each scale of the mask respectively;
based on the mask of each scale, respectively carrying out fusion processing on the first low dynamic range image feature and the third low dynamic range image feature of the scale to obtain fused image features of each scale;
and respectively carrying out image reconstruction according to the second low dynamic range image characteristics and the fused image characteristics of each scale to obtain a high dynamic range image of the target object.
5. The high dynamic range imaging method according to claim 4, wherein based on the mask of each scale, performing fusion processing on the first low dynamic range image feature and the third low dynamic range image feature of the scale to obtain fused image features of each scale, respectively, includes:
for the same scale:
calculating a first product of the first low dynamic range image feature and the mask;
determining a difference between 1 and the mask, calculating a second product of the difference and the third low dynamic range image feature;
and determining the sum of the first product and the second product as the fused image feature of the scale.
6. The high dynamic range imaging method according to claim 4, wherein performing image reconstruction according to the second low dynamic range image features and the fused image features of each scale to obtain a high dynamic range image of the target object comprises:
respectively fusing the second low dynamic range image features of each scale with the fused image features of the scale to obtain image features of a high dynamic range image;
converting image features of the high dynamic range image into a high dynamic range image of the target object.
7. The high dynamic range imaging method according to claim 6, wherein the step of performing fusion processing on the second low dynamic range image features of each scale and the fused image features of the scale to obtain the image features of the high dynamic range image comprises:
from the lowest scale, splicing the second low dynamic range image feature of the scale, the fused image feature and the output feature of the previous scale to obtain a cross-scale splicing feature;
performing convolution on the cross-scale splicing features to obtain output features of the scale; the output feature of the scale preceding the lowest scale is a preset value;
and determining the output characteristic corresponding to the highest scale as the image characteristic of the high dynamic range image.
8. The high dynamic range imaging method according to any one of claims 1 to 7, wherein said high dynamic range imaging method is implemented by a high dynamic range imaging model;
the training process of the high dynamic range imaging model comprises the following steps:
acquiring an initial sample low dynamic range image, wherein the initial sample low dynamic range image comprises a first initial sample low dynamic range image, a second initial sample low dynamic range image and a third initial sample low dynamic range image, of which exposure values are increased in sequence;
randomly selecting image blocks in the initial sample low dynamic range images, and performing pixel value transformation processing on the image blocks to obtain transformed sample low dynamic range images;
and training an initial high dynamic range imaging model frame based on the transformed low dynamic range image and the initial sample low dynamic range image to obtain the high dynamic range imaging model.
9. The high dynamic range imaging method according to claim 8, wherein randomly selecting image blocks in each of the initial sample low dynamic range images, and performing pixel value transformation processing on the image blocks to obtain transformed sample low dynamic range images comprises:
randomly selecting a first image block in the first initial sample low dynamic range image and the second initial sample low dynamic range image, and converting pixel values in the first image block into first target pixel values to obtain a first converted sample low dynamic range image and a second converted sample low dynamic range image; the first target pixel value is obtained by transforming an original pixel value in the first image block based on a gamma curve, and the exposure value of the first target pixel value is larger than that of the pixel value in the first image block;
randomly selecting a second image block in the second initial sample low dynamic range image and a third initial sample low dynamic range image, and converting pixel values in the second image block into second target pixel values to obtain a third converted sample low dynamic range image and a fourth converted sample low dynamic range image; the second target pixel value is obtained by transforming the original pixel value in the second image block based on a gamma curve, and the exposure value of the second target pixel value is larger than that of the pixel value in the second image block.
10. A high dynamic range imaging apparatus, comprising:
the image acquisition module is used for acquiring a low dynamic range image of the target object; the low dynamic range image includes a first low dynamic range image, a second low dynamic range image and a third low dynamic range image, of which exposure values are sequentially increased;
the mask generation module is used for generating masks corresponding to different scales based on the low dynamic range images;
and the image reconstruction module is used for performing fusion processing and image reconstruction processing according to the low dynamic range images and the masks corresponding to the scales to obtain high dynamic range images of the target object.
11. An electronic device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor realizes the steps of the method of any of claims 1 to 9 when executing the computer program.
12. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 9.
CN202110732200.6A 2021-06-29 2021-06-29 High dynamic range imaging method, device, electronic equipment and storage medium Pending CN113592726A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110732200.6A CN113592726A (en) 2021-06-29 2021-06-29 High dynamic range imaging method, device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN113592726A true CN113592726A (en) 2021-11-02

Family

ID=78245173

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120288217A1 (en) * 2010-01-27 2012-11-15 Jiefu Zhai High dynamic range (hdr) image synthesis with user input
CN103024300A (en) * 2012-12-25 2013-04-03 华为技术有限公司 Device and method for high dynamic range image display
CN104349066A (en) * 2013-07-31 2015-02-11 华为终端有限公司 Method and device for generating images with high dynamic ranges
CN105323497A (en) * 2014-05-30 2016-02-10 苹果公司 Constant bracket for high dynamic range (cHDR) operations
CN106169182A (en) * 2016-05-25 2016-11-30 西安邮电大学 A kind of method synthesizing several different exposure images
CN106886386A (en) * 2017-01-23 2017-06-23 苏州科达科技股份有限公司 The method that high-dynamics image is generated from low dynamic image
CN108288253A (en) * 2018-01-08 2018-07-17 厦门美图之家科技有限公司 HDR image generation method and device
CN109997351A (en) * 2016-12-22 2019-07-09 华为技术有限公司 Method and apparatus for generating high dynamic range images
CN111340731A (en) * 2020-02-27 2020-06-26 深圳市商汤科技有限公司 Image processing method and device, electronic equipment and storage medium
CN111489320A (en) * 2019-01-29 2020-08-04 华为技术有限公司 Image processing method and device
CN111986106A (en) * 2020-07-30 2020-11-24 南京大学 High dynamic image reconstruction method based on neural network
CN112085673A (en) * 2020-08-27 2020-12-15 宁波大学 Multi-exposure image fusion method for removing strong ghost

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
LIU ZHEN: "A Multi-Exposure High Dynamic Range Imaging Method Based on Deep Learning", Modern Computer (现代计算机), no. 06, pages 91 - 94 *

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114581316A (en) * 2022-01-13 2022-06-03 北京旷视科技有限公司 Image reconstruction method, electronic device, storage medium, and program product
CN115115726A (en) * 2022-05-10 2022-09-27 深圳市元甪科技有限公司 Method, device, equipment and medium for reconstructing multi-frequency electrical impedance tomography image
CN115115726B (en) * 2022-05-10 2024-06-07 深圳市元甪科技有限公司 Reconstruction method, device, equipment and medium of multi-frequency electrical impedance tomography image
CN115103118A (en) * 2022-06-20 2022-09-23 北京航空航天大学 High dynamic range image generation method, device, equipment and readable storage medium
CN115103118B (en) * 2022-06-20 2023-04-07 北京航空航天大学 High dynamic range image generation method, device, equipment and readable storage medium
CN115115518A (en) * 2022-07-01 2022-09-27 腾讯科技(深圳)有限公司 Method, apparatus, device, medium and product for generating high dynamic range image
CN115115518B (en) * 2022-07-01 2024-04-09 腾讯科技(深圳)有限公司 Method, device, equipment, medium and product for generating high dynamic range image
CN116528058A (en) * 2023-05-26 2023-08-01 中国人民解放军战略支援部队航天工程大学 High dynamic imaging method and system based on compression reconstruction
CN116528058B (en) * 2023-05-26 2023-10-31 中国人民解放军战略支援部队航天工程大学 High dynamic imaging method and system based on compression reconstruction
CN117456313A (en) * 2023-12-22 2024-01-26 中国科学院宁波材料技术与工程研究所 Training method, estimation and mapping method and system of tone curve estimation network
CN117456313B (en) * 2023-12-22 2024-03-22 中国科学院宁波材料技术与工程研究所 Training method, estimation and mapping method and system of tone curve estimation network

Similar Documents

Publication Publication Date Title
CN113592726A (en) High dynamic range imaging method, device, electronic equipment and storage medium
EP4198875A1 (en) Image fusion method, and training method and apparatus for image fusion model
WO2021164731A1 (en) Image enhancement method and image enhancement apparatus
CN112446834B (en) Image enhancement method and device
WO2021048607A1 (en) Motion deblurring using neural network architectures
CN110910486A (en) Indoor scene illumination estimation model, method and device, storage medium and rendering method
CN111292264A (en) Image high dynamic range reconstruction method based on deep learning
CN113572962B (en) Outdoor natural scene illumination estimation method and device
CN111669514A (en) High dynamic range imaging method and apparatus
KR20200140713A (en) Method and apparatus for training neural network model for enhancing image detail
CN111833360B (en) Image processing method, device, equipment and computer readable storage medium
CN115115552B (en) Image correction model training method, image correction device and computer equipment
CN116612015A (en) Model training method, image mole pattern removing method and device and electronic equipment
CN116797504A (en) Image fusion method, electronic device and storage medium
CN116740261B (en) Image reconstruction method and device and training method and device of image reconstruction model
CN114708172A (en) Image fusion method, computer program product, storage medium, and electronic device
CN114372931A (en) Target object blurring method and device, storage medium and electronic equipment
CN114708173A (en) Image fusion method, computer program product, storage medium, and electronic device
Wang et al. Single low-light image brightening using learning-based intensity mapping
CN116797505A (en) Image fusion method, electronic device and storage medium
CN114648604A (en) Image rendering method, electronic device, storage medium and program product
CN115115518A (en) Method, apparatus, device, medium and product for generating high dynamic range image
RU2757563C1 (en) Method for visualizing a 3d portrait of a person with altered lighting and a computing device for it
CN114581316A (en) Image reconstruction method, electronic device, storage medium, and program product
CN115409721A (en) Dim light video enhancement method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination