WO2022247232A1 - Image enhancement method and apparatus, terminal device, and storage medium - Google Patents


Info

Publication number: WO2022247232A1
Authority: WIPO (PCT)
Prior art keywords: image, network, processed, layer, image enhancement
Application number: PCT/CN2021/137821
Other languages: French (fr), Chinese (zh)
Inventors: 陈翔宇 (Xiangyu Chen), 刘翼豪 (Yihao Liu), 章政文 (Zhengwen Zhang), 乔宇 (Yu Qiao), 董超 (Chao Dong)
Original Assignee: 中国科学院深圳先进技术研究院 (Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences)
Application filed by 中国科学院深圳先进技术研究院 (Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences)
Publication of WO2022247232A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/20 Image enhancement or restoration using local operators
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning

Definitions

  • the present application relates to the field of deep learning technology, and in particular to an image enhancement method, device, terminal equipment and storage medium.
  • Image enhancement tasks generally include operations such as dehazing, denoising, deraining, super-resolution, compression-artifact removal, deblurring, and high dynamic range (HDR) reconstruction.
  • The U-net structure is a special type of convolutional neural network (CNN). Although it can extract spatial features of different scales from the input image well when processing image enhancement tasks, a pure U-net network cannot process the input features in a targeted way, so it is difficult for it to simultaneously handle features that differ greatly within an enhancement task. This affects the image enhancement effect and results in poor image quality in some areas of the enhanced image.
  • the present application provides an image enhancement method, device, terminal equipment, and storage medium, which can improve the quality of an enhanced image in an image enhancement task.
  • In a first aspect, the present application provides an image enhancement method, including: acquiring an image to be processed; and inputting the image to be processed into a trained image enhancement model for processing and outputting an enhanced image. The image enhancement model includes a main network and a conditional network, and the main network is a U-net structure.
  • the main network includes M downsampling layers and M upsampling layers
  • the conditional network includes a shared convolution layer and M+1 feature extraction modules
  • The M+1 feature extraction modules include different numbers of downsampling operations. Extracting multiple feature tensors of different scales from the image to be processed through the conditional network includes: extracting intermediate features from the image to be processed through the shared convolutional layer; and respectively inputting the intermediate features into the M+1 feature extraction modules for processing to obtain M+1 feature tensors of different scales.
  • The main network also includes a first SFT layer and a plurality of residual modules, and each residual module includes alternately arranged second SFT layers and convolutional layers.
  • The first SFT layer is connected to the input side of the M downsampling layers and the output side of the M upsampling layers; the residual modules are interspersed between the M downsampling layers and the M upsampling layers; and the M+1 feature tensors of different scales are respectively input into the SFT layers of the corresponding scale among the first SFT layer and the second SFT layers.
  • The image enhancement model also includes a weight network, and the weight network includes a skip connection and multiple convolutional layers. The enhanced image is obtained by fusing the output of the main network with the original features, and the original features are extracted from the image to be processed by the weight network.
  • the image to be processed is an LDR image
  • the enhanced image is an HDR image
  • The image enhancement model is obtained by training a preset initial image enhancement model with a preset loss function and a training set. The training set includes a plurality of LDR image samples and the HDR image sample corresponding to each LDR image sample. The preset loss function describes the L1 loss between the value obtained by applying the Tanh function to the HDR predicted image and the value obtained by applying the Tanh function to the HDR image sample, where the HDR predicted image is the image obtained after the initial image enhancement model processes the LDR image sample.
  • an image enhancement device including:
  • the acquiring unit is used to acquire the image to be processed.
  • the processing unit is used to input the image to be processed into the trained image enhancement model for processing, and output the enhanced image.
  • the image enhancement model includes a main network and a conditional network.
  • the main network is a U-net structure.
  • the conditional network extracts multiple feature tensors of different scales from the image to be processed, and inputs the image to be processed and multiple feature tensors of different scales to the network layer of the corresponding scale in the main network for processing to obtain an enhanced image.
  • The image enhancement model also includes a weight network, and the weight network includes a skip connection and multiple convolutional layers. The enhanced image is obtained by fusing the output of the main network with the original features, and the original features are extracted from the image to be processed by the weight network.
  • the present application provides a terminal device, including: a memory and a processor, where the memory is used to store a computer program; and the processor is used to execute the method described in any one of the above first aspects when calling the computer program.
  • the present application provides a computer-readable storage medium, on which a computer program is stored, and when the computer program is executed by a processor, the method described in any one of the above-mentioned first aspects is implemented.
  • an embodiment of the present application provides a computer program product, which, when the computer program product runs on a processor, causes the processor to execute the method described in any one of the above-mentioned first aspects.
  • The image enhancement method, device, terminal equipment, and storage medium provided by this application add a conditional network to the main network of the U-net structure, and use the conditional network to extract multiple feature tensors of different scales from the image to be processed and input them into the main network.
  • After the main network extracts spatial features of different scales from the image to be processed, it can process those spatial features in a targeted way based on the multiple feature tensors of different scales, so as to retain the effective information of the spatial features at each scale and thereby improve the quality of the enhanced image.
  • FIG. 1 is a first network architecture diagram of an image enhancement model provided by an embodiment of the present application.
  • FIG. 2 is a second network architecture diagram of an image enhancement model provided by an embodiment of the present application.
  • FIG. 3 is a schematic flowchart of an HDR reconstruction task provided by an embodiment of the present application.
  • FIG. 4 is a schematic diagram of the reconstruction effect of an HDR reconstruction task provided by an embodiment of the present application.
  • FIG. 5 is a schematic structural diagram of an image enhancement device provided by an embodiment of the present application.
  • FIG. 6 is a schematic structural diagram of a terminal device provided by an embodiment of the present application.
  • the present application provides an image enhancement method. After the image to be processed is acquired, the image to be processed is input into the image enhancement model provided by the application for processing, and the enhanced image is output.
  • The image enhancement model provided by this application adds a conditional network to the main network of the U-net structure, and uses the conditional network to extract multiple feature tensors of different scales from the image to be processed as adjustment information, which is input into the main network.
  • After the main network extracts spatial features of different scales from the image to be processed, it can process those spatial features in a targeted way based on the feature tensors of different scales, so as to retain the effective information of the spatial features at each scale, effectively improving the enhancement effect and the quality of the enhanced image.
  • an exemplary image enhancement model provided by the present application is introduced with reference to FIG. 1 .
  • the image enhancement model is deployed in an image processing device, which may be a mobile terminal such as a smart phone, a tablet computer, or a camera, or a device capable of processing image data such as a desktop computer, a robot, or a server.
  • the image enhancement model provided by this application includes a main network and a condition network.
  • The main network adopts a U-net structure, which includes M downsampling layers and M upsampling layers connected to the M downsampling layers via skip connections.
  • When the U-net network performs an image enhancement task, it gradually extracts spatial features of different scales through the layer-by-layer downsampling layers, and restores the spatial features of the corresponding scales through the layer-by-layer upsampling layers to identify the enhancement information corresponding to each pixel and achieve image enhancement.
  • a conditional network is added to the main network of the U-net structure.
  • The conditional network extracts multiple feature tensors of different scales from the image to be processed, and these feature tensors are input into the main network as adjustment information, so that the main network can process spatial features of different scales in a targeted way and retain their effective information, thereby effectively improving the enhancement effect and the quality of the enhanced image.
  • The conditional network can be designed based on the number of different-scale spatial features processed by the main network. For example, M downsampling layers in the main network mean that the main network can process spatial features of M+1 scales, including the original scale. Correspondingly, the conditional network can output at most M+1 feature tensors of different scales, one for each of the M+1 scales.
  • The conditional network may include a shared convolutional layer and M+1 feature extraction modules, where the M+1 feature extraction modules contain different numbers of downsampling operations.
  • The conditional network first extracts intermediate features from the image to be processed through the shared convolutional layer (which may itself comprise multiple convolutional layers); the intermediate features are then respectively input into the M+1 feature extraction modules for processing to obtain M+1 feature tensors of different scales.
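As a concrete illustration of the scale bookkeeping described above, the i-th feature extraction module, containing i downsampling operations, produces a tensor at a reduced resolution. This sketch assumes each downsampling operation halves height and width (the patent does not fix the factor); the helper name is hypothetical:

```python
# Hypothetical helper: for a main network with M downsampling layers, the
# conditional network's M+1 feature extraction modules produce tensors at
# M+1 scales. Module i applies i downsampling operations; assuming each one
# halves height and width, its output scale is (H // 2**i, W // 2**i).
def feature_tensor_scales(height, width, m):
    return [(height // 2**i, width // 2**i) for i in range(m + 1)]

# For the two-downsampling-layer example of FIG. 1 (M = 2):
scales = feature_tensor_scales(256, 256, 2)
print(scales)  # → [(256, 256), (128, 128), (64, 64)]
```

The three scales match the original, intermediate, and small scales discussed below for the FIG. 1 example.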
  • The main network includes 2 downsampling (Down) layers and 2 corresponding upsampling (Up) layers.
  • The spatial features processed by the main network then include: the original-scale (large-scale) spatial features extracted from the image to be processed; the intermediate-scale (smaller than the original scale) spatial features obtained after the first downsampling layer downsamples the original scale; and the small-scale (smaller than the intermediate scale) spatial features obtained after the second downsampling layer downsamples the intermediate scale.
  • Correspondingly, the conditional network can include 3 feature extraction modules.
  • The first feature extraction module can be composed of convolution (Conv) layers (3 convolutional layers are taken as an example in FIG. 1) and includes 0 downsampling operations; the scale of its output feature tensor 1 is the original scale, which is used for targeted processing of the original-scale spatial features.
  • The second feature extraction module can be composed of convolutional layers and a downsampling layer (2 convolutional layers and 1 downsampling layer are taken as an example in FIG. 1), i.e., it includes 1 downsampling operation; the scale of its output feature tensor 2 is the intermediate scale, which is used for targeted processing of the intermediate-scale spatial features.
  • The third feature extraction module can be composed of a convolutional layer and 2 downsampling layers (1 convolutional layer and 2 downsampling layers are taken as an example in FIG. 1), i.e., it includes 2 downsampling operations; the scale of its output feature tensor 3 is the small scale, which is used for targeted processing of the small-scale spatial features.
  • Spatial feature transform (SFT) layers can be designed into the main network so that the feature tensors of different scales output by the conditional network act on the spatial features of the corresponding scale.
  • the main network further includes a first SFT layer and multiple residual modules.
  • the first SFT layer is connected to the input side of the M downsampling layers and the output side of the M upsampling layers.
  • Each residual module includes alternately arranged second SFT layers and convolutional layers. Multiple residual modules are interspersed between the M downsampling layers and M upsampling layers, and the M+1 feature tensors of different scales are respectively input into the SFT layers of the corresponding scale among the first SFT layer and the second SFT layers, so as to perform spatial feature transformation on the spatial features of different scales and realize targeted processing of them.
  • The main network includes, in order: a convolutional layer, the first SFT layer (SFT layer1), a convolutional layer, downsampling layer Down1, two residual modules (Residual block), downsampling layer Down2, N (N is an integer greater than or equal to 1) residual modules, a convolutional layer, upsampling layer Up1, two residual modules, upsampling layer Up2, another SFT layer1, and two convolutional layers.
  • the residual module includes two sets of alternately arranged second SFT layers (ie, SFT layer2) and convolutional layers.
  • Feature tensor 1 is input to the two first SFT layers (SFT layer1); feature tensor 2 is input to the SFT layer2 instances in the two residual modules between Down1 and Down2 and in the two residual modules between Up1 and Up2.
  • Feature tensor 3 is input to the SFT layer2 instances in the N residual modules located between Down2 and Up1.
  • An SFT layer can include two groups of convolutional layers (in FIG. 1, each group includes two convolutional layers as an example). The feature tensor output by the conditional network is processed by one group of convolutional layers to obtain a modulation parameter a.
  • The modulation parameter a is multiplied by the output feature of the layer preceding the SFT layer to obtain the transformed feature.
  • The feature tensor is processed by the other group of convolutional layers to obtain a modulation parameter b.
  • The modulation parameter b is added to the transformed feature to obtain the output feature of the SFT layer.
  • In this way, the modulation parameters (a, b) are learned from the feature tensor output by the conditional network through the SFT layer, and an adaptive affine transformation is then performed on the spatial features of the corresponding scale based on (a, b), realizing targeted processing of different spatial features and retaining more effective spatial information.
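The modulation described above can be sketched numerically. In this minimal NumPy sketch (an assumption for illustration: the two groups of convolutional layers are replaced by precomputed per-pixel maps `a` and `b`, since the actual convolution weights are learned), the SFT layer reduces to an element-wise affine transform:

```python
import numpy as np

def sft_transform(features, a, b):
    """Element-wise affine modulation of an SFT layer: the scale map `a`
    multiplies the incoming features and the shift map `b` is added.
    In the model, `a` and `b` are produced from the conditional network's
    feature tensor by two groups of convolutional layers."""
    return a * features + b

# Toy maps standing in for the learned modulation parameters.
features = np.ones((1, 4, 4))        # output of the preceding layer
a = np.full((1, 4, 4), 2.0)          # modulation parameter a (scale)
b = np.full((1, 4, 4), 0.5)          # modulation parameter b (shift)
out = sft_transform(features, a, b)  # every element becomes 2*1 + 0.5 = 2.5
```

Because `a` and `b` vary per spatial position, the transform can treat, e.g., highlight and non-highlight regions differently, which is the "targeted processing" the text refers to.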
  • The image enhancement model of the present application may also include a weight network; that is, a weight network is further added on top of the main network.
  • the weight network is used to extract raw features from the image to be processed.
  • The weight network includes multiple convolutional layers (four convolutional layers are taken as an example in FIG. 2), and the input of this convolutional stack is skip-connected to its output.
  • The enhanced image output by the image enhancement model is obtained by fusing the output of the main network with the original features; the fusion can be performed by superposition.
  • By designing the weight network, the original features can be learned from the image to be processed without manual estimation, so that more, and more accurate, original features are retained in the image enhancement task. Moreover, the weight network structure provided by this application is simple and easy to optimize, which reduces the training difficulty of the image enhancement network while fully preserving the original features.
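A minimal sketch of this residual design (assumptions: the four convolutional layers are collapsed into a single callable `conv_stack`, and fusion is plain addition, matching the superposition described above; both function names are hypothetical):

```python
import numpy as np

def weight_network(image, conv_stack):
    """Original features = conv_stack(image) + image: the skip connection
    adds the input of the convolutional stack to its output, so raw image
    content is preserved without manual estimation."""
    return conv_stack(image) + image

def fuse(main_output, original_features):
    """Enhanced image: main-network output superposed with the original
    features from the weight network."""
    return main_output + original_features

# Toy stand-in for the learned convolutional stack (scaled identity).
image = np.ones((3, 8, 8))
original = weight_network(image, conv_stack=lambda x: 0.1 * x)  # 1.1 everywhere
enhanced = fuse(np.zeros_like(image), original)
```

The skip connection is what makes the weight network easy to optimize: even if the conv stack initially outputs values near zero, the original image content passes through unchanged.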
  • The network framework provided by this application is universal: it can be applied to any image enhancement task, or to tasks that use image enhancement quality as an evaluation indicator, such as image dehazing, denoising, deraining, super-resolution, compression-artifact removal, deblurring, and HDR reconstruction.
  • the initial model can be trained by designing corresponding training sets and loss functions, so as to obtain image enhancement models suitable for different image enhancement tasks.
  • the image enhancement initial model can be trained to obtain an image enhancement model that can be applied to super-resolution image enhancement tasks.
  • the initial image enhancement model is trained to obtain an image enhancement model that can be applied to the image enhancement task of deblurring.
  • the image enhancement model may be pre-trained by the image processing device, or the file corresponding to the image enhancement model may be transplanted to the image processing device after being pre-trained by other devices. That is to say, the execution subject for training the image enhancement model and the execution subject for performing the image enhancement task using the image enhancement model may be the same or different. For example, when other devices are used to train the image enhancement initial model, after the other devices complete the training of the image enhancement initial model, the model parameters are fixed to obtain the corresponding file of the image enhancement model. This file is then ported to an image processing device.
  • the following uses the HDR reconstruction task to illustrate the training process and effect of the image enhancement model provided by this application.
  • An initial image enhancement model is constructed first; that is, after building the initial U-net main network, the corresponding initial conditional network and initial weight network are designed based on the initial main network and added to it.
  • the training set includes a plurality of image sample pairs, and each image sample pair includes an LDR image sample and an HDR image sample corresponding to the LDR image sample.
  • The image sample pairs may be collected by a mobile phone, a camera, or the like. It is also possible to use an open-source algorithm to generate the corresponding LDR image samples from publicly available HDR image samples.
  • In an LDR image sample, the pixels in highlight areas (i.e., overexposed areas) and the pixels in non-highlight areas (i.e., normally exposed areas) generally differ greatly, which easily causes the initial model to focus on areas with larger pixel values during training. That is to say, during training, the initial model may concentrate on restoring the brightness and texture details of the highlight areas while ignoring the noise and quantization losses that may exist in the non-highlight areas.
  • To address this, the present application provides a loss function Tanh_L1, which describes the L1 loss between the value obtained by applying the Tanh function to the HDR predicted image and the value obtained by applying the Tanh function to the HDR image sample; that is, the Tanh function is added on top of the L1 loss. Here the HDR predicted image is the image obtained after the initial image enhancement model processes the LDR image sample.
  • The expression of Tanh_L1 is as follows: Tanh_L1 = || Tanh(Y) - Tanh(H) ||_1, where Y represents the HDR predicted image obtained by processing the LDR image sample through the initial image enhancement model, and H represents the HDR image sample corresponding to the LDR image sample.
  • The Tanh function nonlinearly compresses pixel values, so applying it within the L1 loss balances the highlight-area pixels and non-highlight-area pixels of the LDR image sample, reducing the impact on training of the excessively large difference between highlight and non-highlight pixel values.
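Following the definition above, Tanh_L1 can be written in a few lines (a sketch; averaging the L1 norm over pixels is an assumption about the reduction):

```python
import numpy as np

def tanh_l1(pred, target):
    """L1 loss between tanh-compressed predictions and targets:
    Tanh_L1 = || tanh(Y) - tanh(H) ||_1, here averaged over pixels."""
    return float(np.mean(np.abs(np.tanh(pred) - np.tanh(target))))

# The compression balances highlight and non-highlight pixels: a fixed
# error of 1.0 costs much less in a highlight (large-valued) region than
# in a normally exposed one, so the loss no longer fixates on highlights.
lo = tanh_l1(np.array([0.0]), np.array([1.0]))   # error in non-highlight region
hi = tanh_l1(np.array([5.0]), np.array([6.0]))   # same error, highlight region
```

Here `lo` is about 0.76 while `hi` is below 1e-4, showing how the saturating Tanh curve flattens the contribution of large pixel values.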
  • During training, the gradient descent method can be used to iteratively train the initial image enhancement model; when the model converges (that is, the value of Tanh_L1 no longer decreases), the trained image enhancement model is obtained.
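The convergence criterion can be illustrated with a toy gradient-descent run (purely illustrative: a single scalar "prediction" is fitted to a scalar "target" by a finite-difference gradient of Tanh_L1, whereas real training updates all network weights by backpropagation):

```python
import numpy as np

def tanh_l1(y, h):
    # scalar version of the Tanh_L1 loss: |tanh(y) - tanh(h)|
    return abs(np.tanh(y) - np.tanh(h))

y, h, lr = 0.0, 1.0, 0.1   # initial prediction, target, learning rate
for _ in range(200):
    # central finite-difference estimate of d(loss)/dy
    grad = (tanh_l1(y + 1e-6, h) - tanh_l1(y - 1e-6, h)) / 2e-6
    y -= lr * grad
# after enough iterations the loss stops decreasing and y sits near h
```

When `tanh_l1(y, h)` stops decreasing between iterations, the loop has reached the "model converges" condition described in the text.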
  • When the image processing device uses the image enhancement model for HDR reconstruction, after acquiring the LDR image to be processed, it inputs the LDR image into the model, where the LDR image is respectively fed to the weight network, the main network, and the conditional network.
  • The weight network extracts the original features from the LDR image.
  • The conditional network extracts feature tensors of different scales from the LDR image.
  • The feature tensors of different scales are input into the main network, which uses them to perform targeted processing on the spatial features of the corresponding scales of the LDR image, obtaining the output features of the main network.
  • The output features are fused with the original features to obtain the HDR image.
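The four steps above compose as follows (a sketch with toy stand-ins; `weight_net`, `cond_net`, and `main_net` are hypothetical callables, not the patent's trained networks):

```python
import numpy as np

def enhance(ldr, weight_net, cond_net, main_net):
    original = weight_net(ldr)         # step 1: original features (skip connection)
    tensors = cond_net(ldr)            # step 2: multi-scale feature tensors
    main_out = main_net(ldr, tensors)  # step 3: SFT-modulated U-net output
    return main_out + original         # step 4: fusion -> HDR image

# Toy stand-ins so the pipeline runs end to end.
ldr = np.ones((3, 8, 8))
hdr = enhance(
    ldr,
    weight_net=lambda x: x,                                  # identity features
    cond_net=lambda x: [x, x[:, ::2, ::2], x[:, ::4, ::4]],  # 3 scales (M = 2)
    main_net=lambda x, t: 0.5 * x,                           # dummy main output
)
# hdr keeps the input's spatial size; with these stand-ins it is 1.5 everywhere
```

The point of the sketch is the data flow: the LDR image feeds all three networks, and only the final addition combines their results into the HDR output.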
  • The reconstruction effect of the HDR reconstruction can be seen in FIG. 4.
  • When using the image enhancement model provided by this application to perform HDR reconstruction tasks, firstly, it is not necessary to collect LDR images of the same scene at different exposures: after training is completed, only a single LDR image is needed to restore the corresponding HDR image. Secondly, compared with previous HDR reconstruction methods based on a single LDR image, this application adopts an end-to-end training approach, giving a simple model, high training efficiency, and a good model effect.
  • Since the image enhancement model provided by this application adds a conditional network to the main network, the main network can perform targeted processing on spatial features of different scales; combined with the Tanh_L1 loss function used during training, the model is able to denoise and remove quantization losses in non-highlight regions while simultaneously restoring brightness and texture details in highlight regions. That is to say, in the HDR reconstruction task, the image enhancement model provided by this application can jointly realize HDR reconstruction, denoising, and dequantization.
  • The image enhancement model can be directly added to a camera's post-processing pipeline to improve shooting quality on the software side.
  • The image enhancement model can also be used for post-hoc image/video enhancement to improve the quality of existing LDR data.
  • An embodiment of the present application provides an image enhancement device. This device embodiment corresponds to the foregoing method embodiment; for ease of reading, the details of the foregoing method embodiment are not repeated one by one here, but it should be clear that the device in this embodiment can correspondingly implement all the content of the foregoing method embodiment.
  • FIG. 5 is a schematic structural diagram of an image enhancement device provided by an embodiment of the present application.
  • the image enhancement device provided by this embodiment includes: an acquisition unit 501 and a processing unit 502 .
  • the obtaining unit 501 is configured to obtain an image to be processed.
  • the processing unit 502 is configured to input the image to be processed into a trained image enhancement model for processing, and output the enhanced image.
  • The image enhancement model includes a main network and a conditional network, the main network being a U-net structure. When processing the image to be processed, multiple feature tensors of different scales are extracted from the image to be processed through the conditional network, and the image to be processed and the feature tensors of different scales are respectively input into the network layers of the corresponding scales in the main network for processing to obtain the enhanced image.
  • the main network includes M downsampling layers and M upsampling layers
  • the conditional network includes a shared convolution layer and M+1 feature extraction modules
  • The M+1 feature extraction modules include different numbers of downsampling operations. Extracting multiple feature tensors of different scales from the image to be processed through the conditional network includes:
  • extracting intermediate features from the image to be processed through the shared convolutional layer, inputting the intermediate features into the M+1 feature extraction modules for processing, and obtaining M+1 feature tensors of different scales.
  • The main network further includes a first SFT layer and a plurality of residual modules, and each residual module includes alternately arranged second SFT layers and convolutional layers.
  • The first SFT layer is connected to the input side of the M downsampling layers and the output side of the M upsampling layers, and the residual modules are interspersed between the M downsampling layers and the M upsampling layers.
  • The M+1 feature tensors of different scales are respectively input into the SFT layers of the corresponding scales among the first SFT layer and the second SFT layers.
  • The image enhancement model also includes a weight network, and the weight network includes a skip connection and multiple convolutional layers; the enhanced image is obtained by fusing the output of the main network with the original features, and the original features are extracted from the image to be processed through the weight network.
  • the image to be processed is an LDR image
  • the enhanced image is an HDR image
  • The image enhancement model is obtained by training a preset initial image enhancement model with a preset loss function and a training set, where the training set includes a plurality of LDR image samples and the HDR image sample corresponding to each LDR image sample. The preset loss function describes the L1 loss between the value obtained by applying the Tanh function to the HDR predicted image and the value obtained by applying the Tanh function to the HDR image sample, and the HDR predicted image is the image obtained after the initial image enhancement model processes the LDR image sample.
  • the image enhancement device provided in this embodiment can execute the above-mentioned method embodiment, and its implementation principle and technical effect are similar, and details are not repeated here.
  • the embodiment of the present application also provides a terminal device.
  • the terminal device 6 of this embodiment includes: a processor 60 , a memory 61 , and a computer program 62 stored in the memory 61 and operable on the processor 60 .
  • When the processor 60 executes the computer program 62, the steps in the above image enhancement method embodiments are implemented, for example, steps S101 to S104 shown in FIG. 1.
  • Alternatively, when the processor 60 executes the computer program 62, the functions of the modules/units in the above device embodiment are realized, for example, the functions of modules 401 to 403 shown in FIG. 4.
  • the computer program 62 can be divided into one or more modules/units, and the one or more modules/units are stored in the memory 61 and executed by the processor 60 to complete this application.
  • the one or more modules/units may be a series of computer program instruction segments capable of accomplishing specific functions, and the instruction segments are used to describe the execution process of the computer program 62 in the terminal device 6 .
  • FIG. 6 is only an example of the terminal device 6 and does not constitute a limitation on it; the terminal device may include more or fewer components than shown, combine certain components, or arrange the components differently.
  • the terminal device 6 may also include an input and output device, a network access device, a bus, and the like.
  • the processor 60 can be a central processing unit (Central Processing Unit, CPU), and can also be other general-purpose processors, digital signal processors (Digital Signal Processor, DSP), application specific integrated circuits (Application Specific Integrated Circuit, ASIC), Field-Programmable Gate Array (Field-Programmable Gate Array, FPGA) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, etc.
  • a general-purpose processor may be a microprocessor, or the processor may be any conventional processor, or the like.
  • The memory 61 may be an internal storage unit of the terminal device 6 , such as a hard disk or memory of the terminal device 6 .
  • The memory 61 may also be an external storage device of the terminal device 6, such as a plug-in hard disk, a smart media card (Smart Media Card, SMC), a secure digital (Secure Digital, SD) card, or a flash memory card (Flash Card) equipped on the terminal device 6. Further, the memory 61 may include both an internal storage unit and an external storage device of the terminal device 6.
  • the memory 61 is used to store the computer program and other programs and data required by the terminal device 6 .
  • the memory 61 can also be used to temporarily store data that has been output or will be output.
  • the terminal device provided in this embodiment can execute the foregoing method embodiment, and its implementation principle and technical effect are similar, and details are not repeated here.
  • the embodiment of the present application also provides a computer-readable storage medium, on which a computer program is stored, and when the computer program is executed by a processor, the method described in the foregoing method embodiment is implemented.
  • The embodiment of the present application further provides a computer program product which, when run on a terminal device, enables the terminal device to implement the method described in the foregoing method embodiments.
  • When the above integrated units are implemented in the form of software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium. Based on this understanding, all or part of the procedures in the methods of the above embodiments of the present application can be completed by instructing the relevant hardware through a computer program, and the computer program can be stored in a computer-readable storage medium.
  • When the computer program is executed by a processor, the steps in the above-mentioned various method embodiments can be realized.
  • the computer program includes computer program code, and the computer program code may be in the form of source code, object code, executable file or some intermediate form.
  • The computer-readable storage medium may include at least: any entity or device capable of carrying the computer program code to the photographing device/terminal device, a recording medium, a computer memory, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), electrical carrier signals, telecommunication signals, and software distribution media, such as a USB flash drive, a removable hard disk, a magnetic disk, or an optical disk.
  • In some jurisdictions, under legislation and patent practice, computer-readable media may not include electrical carrier signals and telecommunication signals.
  • The disclosed apparatus/device and method may be implemented in other ways.
  • The apparatus/device embodiments described above are merely illustrative.
  • The division into the modules or units described above is only a division by logical function.
  • The mutual coupling, direct coupling, or communication connection shown or discussed may be implemented through some interfaces; the indirect coupling or communication connection between devices or units may be electrical, mechanical, or in other forms.
  • The term “if” may be construed, depending on the context, as “when”, “once”, “in response to determining”, or “in response to detecting”.
  • The phrase “if determined” or “if [the described condition or event] is detected” may be construed, depending on the context, to mean “once determined”, “in response to the determination”, “once [the described condition or event] is detected”, or “in response to detection of [the described condition or event]”.
  • References to “one embodiment” or “some embodiments” and the like in the specification of the present application mean that a particular feature, structure, or characteristic described in connection with the embodiment is included in one or more embodiments of the present application.
  • Appearances of the phrases “in one embodiment,” “in some embodiments,” “in other embodiments,” etc. in various places in this specification do not necessarily all refer to the same embodiment; rather, they mean “one or more but not all embodiments” unless specifically stated otherwise.
  • the terms “including”, “comprising”, “having” and variations thereof mean “including but not limited to”, unless specifically stated otherwise.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The present application relates to the technical field of deep learning, and provides an image enhancement method and apparatus, a terminal device, and a storage medium, capable of improving the quality of an enhanced image in an image enhancement task. The image enhancement method comprises: acquiring an image to be processed; inputting said image into a trained image enhancement model for processing; and outputting an enhanced image. The image enhancement model comprises a master network and a conditional network, the master network being of a U-net structure. When said image is processed, a plurality of feature tensors of different scales are extracted from said image by means of the conditional network, and said image and the plurality of feature tensors of different scales are respectively input into network layers of corresponding scales in the master network for processing to obtain the enhanced image.

Description

Image Enhancement Method, Apparatus, Terminal Device, and Storage Medium

Technical Field

The present application relates to the field of deep learning technology, and in particular to an image enhancement method, apparatus, terminal device, and storage medium.
Background

Image enhancement tasks generally include operations such as dehazing, denoising, rain removal, super-resolution, compression-artifact removal, deblurring, and high dynamic range (High Dynamic Range, HDR) reconstruction. With the wide application of deep learning in image processing and computer vision, improving the image enhancement effect has become a focal problem in neural-network-based image enhancement tasks.
Technical Problem

For example, a U-net-structured network, as a special convolutional neural network (Convolutional Neural Network, CNN), can extract spatial features of different scales from the input image well when handling image enhancement tasks. However, a plain U-net network cannot process the input features in a feature-specific way, making it difficult to simultaneously handle features that differ greatly within an enhancement task; this degrades the enhancement effect and results in poor image quality in some regions of the enhanced image.
Technical Solution

In view of this, the present application provides an image enhancement method, apparatus, terminal device, and storage medium, which can improve the quality of the enhanced image in image enhancement tasks.

In a first aspect, the present application provides an image enhancement method, including: acquiring an image to be processed; and inputting the image to be processed into a trained image enhancement model for processing and outputting an enhanced image. The image enhancement model includes a main network and a conditional network, the main network having a U-net structure. When the image to be processed is processed, a plurality of feature tensors of different scales are extracted from it by the conditional network, and the image to be processed and the feature tensors of different scales are respectively input into network layers of corresponding scales in the main network for processing to obtain the enhanced image.
Optionally, the main network includes M downsampling layers and M upsampling layers, and the conditional network includes a shared convolutional layer and M+1 feature extraction modules, the M+1 feature extraction modules containing different numbers of downsampling operations. Extracting the plurality of feature tensors of different scales from the image to be processed through the conditional network includes: extracting intermediate features from the image to be processed through the shared convolutional layer; and inputting the intermediate features into the M+1 feature extraction modules respectively for processing, to obtain M+1 feature tensors of different scales.
Optionally, the main network further includes a first SFT layer and a plurality of residual modules, each residual module including alternately arranged second SFT layers and convolutional layers. The first SFT layer is connected to the input side of the M downsampling layers and the output side of the M upsampling layers; the residual modules are interspersed between the M downsampling layers and the M upsampling layers; and the M+1 feature tensors of different scales are respectively input into SFT layers of corresponding scales among the first SFT layer and the second SFT layers.
Optionally, the image enhancement model further includes a weight network, which includes a skip connection and multiple convolutional layers. The enhanced image is obtained by fusing the output of the main network with original features, the original features being extracted from the image to be processed by the weight network.

Optionally, the image to be processed is an LDR image, and the enhanced image is an HDR image.

Optionally, the image enhancement model is obtained by training a preset initial image enhancement model with a preset loss function and a training set. The training set includes a plurality of LDR image samples and an HDR image sample corresponding to each LDR image sample. The preset loss function describes the L1 loss between the value obtained by applying the Tanh function to an HDR predicted image and the value obtained by applying the Tanh function to the HDR image sample, where the HDR predicted image is the image obtained after the initial image enhancement model processes the LDR image sample.
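As a minimal, hypothetical illustration of the Tanh-domain L1 loss described above (not the actual training code of the application), the loss can be sketched in pure Python, with images simplified to flat lists of pixel values:

```python
import math

def tanh_l1_loss(pred, target):
    """L1 loss computed in the Tanh-compressed domain.

    `pred` and `target` are flat lists of pixel values; this is a
    simplification for illustration -- a real implementation would
    operate on image tensors in a deep learning framework.
    """
    assert len(pred) == len(target)
    return sum(abs(math.tanh(p) - math.tanh(t))
               for p, t in zip(pred, target)) / len(pred)
```

Compressing both images through Tanh before taking the L1 difference down-weights errors in very bright regions, which is useful when HDR pixel values span a large dynamic range.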
In a second aspect, the present application provides an image enhancement apparatus, including:

an acquiring unit, configured to acquire an image to be processed; and

a processing unit, configured to input the image to be processed into a trained image enhancement model for processing and output an enhanced image. The image enhancement model includes a main network and a conditional network, the main network having a U-net structure. When the image to be processed is processed, the conditional network extracts a plurality of feature tensors of different scales from it, and the image to be processed and the feature tensors of different scales are respectively input into network layers of corresponding scales in the main network for processing to obtain the enhanced image.
Optionally, the image enhancement model further includes a weight network, which includes a skip connection and multiple convolutional layers; the enhanced image is obtained by fusing the output of the main network with original features extracted from the image to be processed by the weight network.

In a third aspect, the present application provides a terminal device, including a memory and a processor, the memory being configured to store a computer program, and the processor being configured to execute the method described in any manner of the first aspect when invoking the computer program.

In a fourth aspect, the present application provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the method described in any manner of the first aspect is implemented.

In a fifth aspect, an embodiment of the present application provides a computer program product which, when run on a processor, causes the processor to execute the method described in any manner of the first aspect.
Beneficial Effects

In the image enhancement method, apparatus, terminal device, and storage medium provided by the present application, a conditional network is added to the U-net-structured main network. The conditional network extracts a plurality of feature tensors of different scales from the image to be processed and inputs them into the main network. After the main network extracts spatial features of different scales from the image to be processed, it can perform scale-specific processing on those spatial features based on the feature tensors of different scales, so as to retain the effective information of the spatial features at each scale, thereby improving the quality of the enhanced image.
Description of Drawings

FIG. 1 is a first network architecture diagram of an image enhancement model provided by an embodiment of the present application;

FIG. 2 is a second network architecture diagram of an image enhancement model provided by an embodiment of the present application;

FIG. 3 is a schematic flowchart of a method for an HDR reconstruction task provided by an embodiment of the present application;

FIG. 4 is a schematic diagram of the reconstruction effect of an HDR reconstruction task provided by an embodiment of the present application;

FIG. 5 is a schematic structural diagram of an image enhancement apparatus provided by an embodiment of the present application;

FIG. 6 is a schematic structural diagram of a terminal device provided by an embodiment of the present application.
Embodiments of the Present Invention

To make the objectives, technical solutions, and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some rather than all of the embodiments of the present application. Based on the embodiments of the present application, all other embodiments obtained by a person of ordinary skill in the art without creative effort fall within the protection scope of the present application.
For image enhancement tasks, the present application provides an image enhancement method: after an image to be processed is acquired, it is input into the image enhancement model provided herein for processing, and an enhanced image is output. In the image enhancement model provided by the present application, a conditional network is added to the U-net-structured main network, and the conditional network extracts a plurality of feature tensors of different scales from the image to be processed and inputs them into the main network as conditioning information. After the main network extracts spatial features of different scales from the image to be processed, it can perform scale-specific processing on those spatial features based on the feature tensors of different scales, so as to retain their effective information, thereby effectively improving the enhancement effect and the quality of the enhanced image.

The technical solution of the present application is described in detail below with specific embodiments. The following specific embodiments may be combined with each other, and the same or similar concepts or processes may not be repeated in some embodiments.

First, an image enhancement model provided by the present application is introduced by way of example with reference to FIG. 1. The image enhancement model is deployed in an image processing device, which may be a mobile terminal such as a smartphone, tablet computer, or camera, or a device capable of processing image data such as a desktop computer, robot, or server.
The image enhancement model provided by the present application includes a main network and a condition network. The main network adopts a U-net structure, that is, it includes M downsampling layers and M upsampling layers connected to the M downsampling layers by skip connections. When performing an image enhancement task, a U-net-structured network progressively extracts spatial features of different scales through successive downsampling layers, and then restores the spatial features of the corresponding scales through successive upsampling layers to identify the enhancement information of the corresponding pixels, thereby achieving image enhancement.

In the embodiments of the present application, a conditional network is added to the U-net-structured main network. The conditional network can extract a plurality of feature tensors of different scales from the image to be processed; these feature tensors are input into the main network as conditioning information, so that the main network performs scale-specific processing on the spatial features of different scales based on them, retaining the effective information of the spatial features at each scale and thereby effectively improving the enhancement effect and the quality of the enhanced image.

In one example, the conditional network can be designed based on the number of spatial-feature scales processed by the main network. For example, M downsampling layers in the main network mean that the main network can process spatial features at M+1 scales, including the original scale. Correspondingly, the conditional network can output at most M+1 feature tensors of different scales corresponding to the spatial features at these M+1 scales.
For example, when the main network includes M downsampling layers and the corresponding M upsampling layers, the conditional network may include a shared convolutional layer and M+1 feature extraction modules containing different numbers of downsampling operations. When processing the image to be processed, the conditional network first extracts intermediate features from the image through the shared convolutional layer (which may, for example, consist of multiple convolutional layers), and then inputs the intermediate features into the M+1 feature extraction modules respectively, thereby obtaining M+1 feature tensors of different scales.

Exemplarily, as shown in FIG. 1, assume M=2, so the main network includes 2 downsampling (Down) layers and the corresponding 2 upsampling (Up) layers. The spatial features processed by the main network then include the original-scale (large-scale) spatial features extracted from the image to be processed, the intermediate-scale (smaller than large-scale) spatial features obtained after the first downsampling layer downsamples at the original scale, and the small-scale (smaller than intermediate-scale) spatial features obtained after the second downsampling layer downsamples at the intermediate scale.

Correspondingly, the conditional network may include 3 feature extraction modules. The first feature extraction module may consist of convolutional (Conv) layers (3 convolutional layers in FIG. 1), i.e., it contains 0 downsampling operations; its output feature tensor 1 is at the original scale and is used for scale-specific processing of the original-scale spatial features. The second feature extraction module may consist of convolutional layers and 1 downsampling layer (2 convolutional layers and 1 downsampling layer in FIG. 1), i.e., it contains 1 downsampling operation; its output feature tensor 2 is at the intermediate scale and is used for scale-specific processing of the intermediate-scale spatial features. The third feature extraction module may consist of a convolutional layer and 2 downsampling layers (1 convolutional layer and 2 downsampling layers in FIG. 1), i.e., it contains 2 downsampling operations; its output feature tensor 3 is at the small scale and is used for scale-specific processing of the small-scale spatial features.
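The multi-scale extraction performed by the feature extraction modules can be illustrated with a simplified sketch. For illustration only, 2x average pooling stands in for the learned convolution-plus-downsampling modules, and a 2D list of pixel values stands in for a feature map; the actual modules are learned convolutional layers:

```python
def downsample2x(img):
    """2x average-pool downsampling of a 2D grayscale image (list of rows)."""
    h, w = len(img) // 2, len(img[0]) // 2
    return [[(img[2*r][2*c] + img[2*r][2*c+1] +
              img[2*r+1][2*c] + img[2*r+1][2*c+1]) / 4.0
             for c in range(w)] for r in range(h)]

def multi_scale_tensors(img, m=2):
    """Return M+1 stand-in 'feature tensors': module k applies k
    downsampling operations, so the scales match the main network's
    original, intermediate, and small scales for M=2."""
    tensors, cur = [img], img
    for _ in range(m):
        cur = downsample2x(cur)
        tensors.append(cur)
    return tensors
```

For M=2, a 4x4 input yields three tensors at 4x4, 2x2, and 1x1 resolution, mirroring the large, intermediate, and small scales described above.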
In the embodiments of the present application, spatial feature transform (Spatial Feature Transform, SFT) layers can be designed into the main network so that the feature tensors of different scales output by the conditional network act on the spatial features of the corresponding scales. Exemplarily, the main network further includes a first SFT layer and a plurality of residual modules. The first SFT layer is connected to the input side of the M downsampling layers and the output side of the M upsampling layers. Each residual module includes alternately arranged second SFT layers and convolutional layers; the residual modules are interspersed between the M downsampling layers and the M upsampling layers. The M+1 feature tensors of different scales are respectively input into the SFT layers of corresponding scales among the first SFT layer and the second SFT layers, so as to perform spatial feature transformation on the spatial features of different scales and realize scale-specific processing.

For example, in FIG. 1, the main network includes, in order from input (Input) to output (Output): a convolutional layer, the first SFT layer (SFT layer1), a convolutional layer, downsampling layer Down1, two residual blocks (Residual block), downsampling layer Down2, N residual modules (N being an integer greater than or equal to 1), a convolutional layer, upsampling layer Up1, two residual modules, upsampling layer Up2, SFT layer1, and two convolutional layers.

The output of Down1 is skip-connected to the output of Up2, the input of Down2 is skip-connected to the output of Up1, and the output of Down2 is skip-connected to the outputs of the N residual modules. Each residual module includes two groups of alternately arranged second SFT layers (i.e., SFT layer2) and convolutional layers.

Feature tensor 1 is input into the first SFT layers (SFT layer11 and SFT layer5); feature tensor 2 is input into the SFT layer2 of the two residual modules located between Down1 and Down2 and of the two residual modules located between Up1 and Up2; feature tensor 3 is input into the SFT layer2 of the N residual modules located between Down2 and Up1.
As shown in FIG. 1, an SFT layer may include two groups of convolutional layers (in FIG. 1, each group includes two convolutional layers). The feature tensor output by the conditional network is processed by one group of convolutional layers to obtain a modulation parameter a, which is multiplied by the output features of the layer preceding the SFT layer to obtain transformed features. The feature tensor is also processed by the other group of convolutional layers to obtain a modulation parameter b, which is added to the transformed features to obtain the output features of the SFT layer.

That is, the SFT layer learns the modulation parameters (a, b) from the feature tensor output by the conditional network, and then performs an adaptive affine transformation on the spatial features of the corresponding scale based on (a, b), thereby realizing feature-specific processing of different spatial features and retaining more effective spatial information.
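The affine modulation an SFT layer applies, y = a · x + b, can be sketched as follows. This is a simplification: features and modulation parameters are flattened to lists, whereas in the model a and b are feature maps produced by small convolution stacks applied to the conditional network's feature tensor:

```python
def sft_modulate(features, a, b):
    """Spatial feature transform: element-wise affine y = a * x + b.

    `features` are the previous layer's outputs; `a` and `b` are the
    modulation parameters learned from the conditional feature tensor.
    """
    return [ai * x + bi for x, ai, bi in zip(features, a, b)]
```

Because a and b vary per element, different spatial positions receive different affine transforms, which is what enables the scale- and feature-specific processing described above.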
It can be understood that, compared with a traditional U-net-structured network, the image enhancement network provided by the present application retains more effective spatial information when enhancing the image to be processed, and can therefore recover more detailed textures while effectively removing noise and quantization losses, thereby improving the enhancement effect and the quality of the enhanced image.

Optionally, to further reduce the loss of original features of the input image incurred by a traditional U-net-structured network, the image enhancement model of the present application may further include a weight network; that is, a weight network is added on top of the main network. The weight network is used to extract original features from the image to be processed.

Exemplarily, as shown in FIG. 2, the weight network includes multiple convolutional layers (4 convolutional layers in FIG. 2), and the input of these convolutional layers is skip-connected to their output. In this example, the enhanced image output by the image enhancement model is obtained by fusing the output of the main network with the original features; the fusion may be performed by addition.
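The skip connection around the weight network and the additive fusion can be sketched as follows (a simplification, with feature maps flattened to lists; `conv` is a hypothetical stand-in for the weight network's learned convolutional stack):

```python
def weight_network(x, conv):
    """Weight network with its skip connection: output = conv(x) + x.

    `conv` stands in for the learned convolutional layers; the skip
    connection passes the input through unchanged and adds it back.
    """
    return [c + xi for c, xi in zip(conv(x), x)]

def fuse(main_output, original_features):
    """Additive fusion of the main network's output with the original
    features extracted by the weight network."""
    return [m + o for m, o in zip(main_output, original_features)]
```

The additive skip connection means the weight network only has to learn a residual correction on top of the raw input, which is one reason such a structure is easy to optimize.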
通过设计权重网络,可以从待处理图像中学习到原始特征,无需人工估计,使得在图像增强任务中保留更多且更准确的原始特征。且本申请提供的权重网络结构简单,便于优化,在保证原始特征的充分保留的情况下,能够降低图像增强网络的训练难度。By designing the weight network, the original features can be learned from the image to be processed without manual estimation, so that more and more accurate original features can be retained in the image enhancement task. Moreover, the weight network structure provided by this application is simple, easy to optimize, and can reduce the training difficulty of the image enhancement network under the condition that the original features are fully preserved.
值得说明的是,本申请提供网络框架具备泛用性。可以应用于任何图像增强任务或者以图像增强效果为评价指标的任务中。图像去雾,去噪声,去雨,超分辨率,去压缩伪影,去模糊,HDR重建等多种图像增强任务中。It is worth noting that the network framework provided by this application is universal. It can be applied to any image enhancement tasks or tasks that use image enhancement effects as evaluation indicators. Image defogging, denoising, deraining, super-resolution, decompression artifacts, deblurring, HDR reconstruction and other image enhancement tasks.
It can be understood that, for different image enhancement tasks, the initial model can be trained with a correspondingly designed training set and loss function, thereby obtaining image enhancement models suited to those tasks.
For example, training the image enhancement initial model on a training set composed of low-resolution image samples and corresponding high-resolution image samples yields an image enhancement model applicable to the super-resolution task; training it on a training set composed of blurred image samples and corresponding sharp image samples yields an image enhancement model applicable to the deblurring task.
It can be understood that the image enhancement model may be pre-trained by the image processing device itself, or pre-trained by another device, after which the file corresponding to the image enhancement model is ported to the image processing device. In other words, the entity that trains the image enhancement model and the entity that uses it to perform image enhancement tasks may be the same or different. For example, when another device trains the image enhancement initial model, that device fixes the model parameters after training is finished to obtain the file corresponding to the image enhancement model, and this file is then ported to the image processing device.
The following takes the HDR reconstruction task as an example to illustrate the training process and effect of the image enhancement model provided by this application.
Exemplarily, an image enhancement initial model is constructed first: after the U-net initial network is built, the corresponding conditional initial network and weight initial network are designed based on the initial main network and added to it.
For the HDR reconstruction task, a corresponding training set is collected. The training set includes multiple image sample pairs, each consisting of an LDR image sample and its corresponding HDR image sample. For example, image sample pairs may be captured with a mobile phone or a camera; corresponding LDR image samples may also be generated from publicly available HDR image samples using open-source algorithms.
In one embodiment, because in an LDR image the pixels of highlight regions (i.e., overexposed regions) generally differ greatly from those of non-highlight regions (i.e., normally exposed regions), the initial model tends to concentrate on regions with large pixel values during training. That is, during training the initial model may focus on restoring the brightness and texture details of highlight regions while ignoring the noise and quantization loss that may exist in non-highlight regions.
To address this, for the HDR reconstruction task the present application provides a loss function, Tanh_L1, which describes the L1 loss between the value obtained by applying the Tanh function to the HDR predicted image and the value obtained by applying the Tanh function to the HDR image sample; that is, the Tanh function is applied on top of the L1 loss. Here, the HDR predicted image is the image obtained after the image enhancement initial model processes the LDR image sample.
The expression of Tanh_L1 is as follows:

Tanh_L1 = ||Tanh(Y) - Tanh(H)||_1
where Y denotes the HDR predicted image obtained by processing the LDR image sample with the image enhancement initial model, and H denotes the HDR image sample corresponding to that LDR image sample.
The Tanh function compresses pixel values non-linearly; applied within the L1 loss, it therefore balances the highlight-region pixels and non-highlight-region pixels of the LDR image sample, reducing the adverse effect on training that an excessive difference between the two kinds of pixels would otherwise cause.
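The compression effect of Tanh inside the L1 loss can be demonstrated in a few lines of NumPy. This sketch assumes the loss averages the per-pixel L1 distance; the function name and toy pixel values are illustrative only.

```python
import numpy as np

def tanh_l1(y_pred: np.ndarray, h_true: np.ndarray) -> float:
    """Tanh_L1 loss: mean L1 distance between the tanh-compressed
    prediction Y and the tanh-compressed ground truth H."""
    return float(np.mean(np.abs(np.tanh(y_pred) - np.tanh(h_true))))

# The same raw error of 0.5 contributes far less in a highlight
# (large-valued) region than in a normally exposed region, because
# tanh saturates for large pixel values.
y_small, h_small = np.array([0.1]), np.array([0.6])  # normally exposed
y_large, h_large = np.array([5.0]), np.array([5.5])  # overexposed highlight
assert tanh_l1(y_large, h_large) < tanh_l1(y_small, h_small)
```

This is exactly the balancing effect described above: plain L1 would penalize both errors equally, so training would be dominated by the numerically larger highlight pixels.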
Based on this training set and Tanh_L1, the image enhancement initial model can be trained iteratively using gradient descent; when the model converges (i.e., the value of Tanh_L1 no longer decreases), the trained image enhancement model is obtained.
As shown in FIG. 3, when the image enhancement model is used for HDR reconstruction, the image processing device, after acquiring the LDR image to be processed, inputs the LDR image into the image enhancement model, where it is fed to the weight network, the main network, and the conditional network respectively. The weight network extracts the original features from the LDR image; the conditional network extracts feature tensors of different scales from the LDR image; the feature tensors of different scales are input into the main network, which, when processing the spatial features of the LDR image at different scales, uses the feature tensors of the corresponding scales to process those spatial features specifically, yielding the output features of the main network. The output features are then fused with the original features to obtain the HDR image.
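The three-branch data flow just described can be traced end to end with stand-in functions. In this NumPy sketch each "network" is reduced to a trivial array transform purely to make the wiring explicit; all function names and the toy modulation are hypothetical, not the application's actual layers.

```python
import numpy as np

def weight_net(ldr):
    """Stand-in weight network: extracts 'original features'."""
    return 0.1 * ldr

def condition_net(ldr, num_scales=3):
    """Stand-in conditional network: one condition tensor per scale."""
    return [ldr[::2**s, ::2**s].mean() * np.ones(1) for s in range(num_scales)]

def main_net(ldr, conditions):
    """Stand-in U-net body: modulated by each scale's condition tensor."""
    out = ldr.copy()
    for c in conditions:          # toy scale-specific modulation
        out = out * (1.0 + 0.01 * c[0])
    return out

def hdr_reconstruct(ldr):
    original = weight_net(ldr)    # branch 1: weight network
    conds = condition_net(ldr)    # branch 2: conditional network
    main_out = main_net(ldr, conds)  # branch 3: main network, conditioned
    return main_out + original    # feature fusion -> HDR image

ldr = np.random.default_rng(0).random((8, 8))
hdr = hdr_reconstruct(ldr)
```

Note that a single LDR input drives all three branches, which is why inference needs only one LDR image rather than a bracketed exposure stack.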
Exemplarily, the reconstruction effect of HDR reconstruction can be seen in FIG. 4.
In summary, when the image enhancement model provided by this application is used for the HDR reconstruction task, there is, first, no need to capture LDR images of the same scene at different exposures: once training is complete, a single LDR image suffices to recover the corresponding HDR image. Second, compared with previous HDR reconstruction methods based on a single LDR image, this application adopts end-to-end training, so the model is simple, training is efficient, and the results are good.
Finally, because the image enhancement model provided by this application introduces the conditional network into the main network, the main network can process spatial features of different scales specifically; and because the Tanh_L1 loss function is used during training, the model can restore the brightness and texture details of highlight regions while simultaneously denoising non-highlight regions and removing their quantization loss. In other words, in the HDR reconstruction task, the image enhancement model provided by this application can jointly accomplish HDR reconstruction, denoising, and quantization-loss removal.
Therefore, the image enhancement model can be added directly to a camera's post-processing pipeline to improve shooting quality from the software side. Of course, the image enhancement model can also serve as a means of image/video post-enhancement, improving the quality of existing LDR data.
Based on the same inventive concept, as an implementation of the above method, an embodiment of the present application provides an image enhancement apparatus. This apparatus embodiment corresponds to the foregoing method embodiment; for ease of reading, the details of the foregoing method embodiment are not repeated one by one here, but it should be clear that the apparatus in this embodiment can correspondingly implement all the content of the foregoing method embodiment.
FIG. 5 is a schematic structural diagram of the image enhancement apparatus provided by an embodiment of the present application. As shown in FIG. 5, the image enhancement apparatus provided by this embodiment includes an acquisition unit 501 and a processing unit 502.
The acquisition unit 501 is configured to acquire an image to be processed.
The processing unit 502 is configured to input the image to be processed into a trained image enhancement model for processing and to output an enhanced image. The image enhancement model includes a main network and a conditional network, the main network having a U-net structure. When the image to be processed is processed, multiple feature tensors of different scales are extracted from it through the conditional network, and the image to be processed and the feature tensors of different scales are respectively input into the network layers of the corresponding scales in the main network for processing, obtaining the enhanced image.
Optionally, the main network includes M downsampling layers and M upsampling layers, and the conditional network includes a shared convolutional layer and M+1 feature extraction modules, the M+1 feature extraction modules respectively including different numbers of downsampling operations. Extracting the multiple feature tensors of different scales from the image to be processed through the conditional network includes:
extracting intermediate features from the image to be processed through the shared convolutional layer; and inputting the intermediate features respectively into the M+1 feature extraction modules for processing, obtaining M+1 feature tensors of different scales.
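The shared-layer-plus-branches scheme above can be illustrated numerically: one shared transform, then M+1 branches that differ only in how many downsampling operations they apply. In this NumPy sketch the shared convolution is replaced by a simple mean-subtraction and each downsampling operation by stride-2 average pooling; both substitutions, and the function names, are assumptions for illustration.

```python
import numpy as np

def avg_pool2(x):
    """Stride-2 average pooling: one stand-in 'downsampling operation'."""
    h, w = x.shape[0] // 2 * 2, x.shape[1] // 2 * 2
    x = x[:h, :w]
    return 0.25 * (x[0::2, 0::2] + x[1::2, 0::2] + x[0::2, 1::2] + x[1::2, 1::2])

def extract_condition_tensors(image, M=2):
    """Shared layer, then M+1 branches with 0..M downsampling operations,
    yielding M+1 condition tensors at successively coarser scales."""
    shared = image - image.mean()   # stand-in for the shared conv layer
    tensors, feat = [], shared
    for _ in range(M + 1):
        tensors.append(feat)
        feat = avg_pool2(feat)      # next branch: one more downsample
    return tensors

conds = extract_condition_tensors(np.ones((16, 16)), M=2)
# scales: (16,16), (8,8), (4,4) -- one per U-net resolution level
```

The M+1 scales line up with the M downsampling layers of the U-net body plus its full-resolution level, so each network layer receives a condition tensor of matching spatial size.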
Optionally, the main network further includes a first SFT layer and multiple residual modules, each residual module including alternately arranged second SFT layers and convolutional layers. The first SFT layer is connected to the input side of the M downsampling layers and the output side of the M upsampling layers, the residual modules are interspersed between the M downsampling layers and the M upsampling layers, and the M+1 feature tensors of different scales are respectively input into the SFT layers of the corresponding scales among the first SFT layer and the second SFT layers.
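An SFT (spatial feature transform) layer is commonly realized as a spatially varying affine modulation of the feature maps, with the scale and shift maps derived from the condition tensor. The sketch below shows that core operation only; how the scale/shift maps are produced from the condition tensor is an assumption not spelled out here.

```python
import numpy as np

def sft(features, gamma, beta):
    """Spatial Feature Transform: per-pixel affine modulation of the
    feature maps, with (gamma, beta) derived from the condition tensor."""
    return gamma * features + beta

feat = np.ones((4, 4))
gamma = np.full((4, 4), 2.0)   # scale map (from the condition tensor)
beta = np.full((4, 4), -0.5)   # shift map (from the condition tensor)
out = sft(feat, gamma, beta)   # each element: 2.0 * 1.0 - 0.5 = 1.5
```

Because gamma and beta vary per pixel and per scale, the same U-net weights can treat, e.g., overexposed and normally exposed regions differently, which is the "scale-specific processing" the description refers to.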
Optionally, the image enhancement model further includes a weight network, the weight network including a skip connection and multiple convolutional layers. The enhanced image is obtained by feature fusion of the output of the main network with original features, the original features being extracted from the image to be processed through the weight network.
Optionally, the image to be processed is an LDR image, and the enhanced image is an HDR image.
Optionally, the image enhancement model is obtained by training a preset image enhancement initial model with a preset loss function and a training set, wherein the training set includes multiple LDR image samples and the HDR image sample corresponding to each LDR image sample; the preset loss function describes the L1 loss between the value obtained by applying the Tanh function to an HDR predicted image and the value obtained by applying the Tanh function to the HDR image sample; and the HDR predicted image is the image obtained after the image enhancement initial model processes the LDR image sample.
The image enhancement apparatus provided by this embodiment can execute the above method embodiment; its implementation principle and technical effect are similar and are not repeated here.
Those skilled in the art can clearly understand that, for convenience and brevity of description, the division into the above functional units and modules is used only as an example. In practical applications, the above functions may be allocated to different functional units and modules as required; that is, the internal structure of the apparatus may be divided into different functional units or modules to complete all or part of the functions described above. The functional units and modules in the embodiments may be integrated into one processing unit, or each unit may exist physically on its own, or two or more units may be integrated into one unit; the integrated units may be implemented in the form of hardware or in the form of software functional units. In addition, the specific names of the functional units and modules are only for ease of distinguishing them from one another and are not intended to limit the protection scope of the present application. For the specific working process of the units and modules in the above system, reference may be made to the corresponding process in the foregoing method embodiments, which is not repeated here.
Based on the same inventive concept, an embodiment of the present application further provides a terminal device. As shown in FIG. 6, the terminal device 6 of this embodiment includes a processor 60, a memory 61, and a computer program 62 stored in the memory 61 and executable on the processor 60. When the processor 60 executes the computer program 62, the steps in each of the above image enhancement method embodiments are implemented, for example, steps S101 to S104 shown in FIG. 1. Alternatively, when the processor 60 executes the computer program 62, the functions of the modules/units in each of the above apparatus embodiments are implemented, for example, the functions of modules 401 to 403 shown in FIG. 4.
Exemplarily, the computer program 62 may be divided into one or more modules/units, which are stored in the memory 61 and executed by the processor 60 to complete the present application. The one or more modules/units may be a series of computer program instruction segments capable of accomplishing specific functions, the instruction segments being used to describe the execution process of the computer program 62 in the terminal device 6.
Those skilled in the art can understand that FIG. 6 is merely an example of the terminal device 6 and does not constitute a limitation on it; the terminal device may include more or fewer components than shown, or combine certain components, or use different components. For example, the terminal device 6 may further include an input/output device, a network access device, a bus, and the like.
The processor 60 may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor, etc.
The memory 61 may be an internal storage unit of the terminal device 6, such as a hard disk or memory of the terminal device 6. The memory 61 may also be an external storage device of the terminal device 6, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card provided on the terminal device 6. Further, the memory 61 may include both an internal storage unit of the terminal device 6 and an external storage device. The memory 61 is used to store the computer program and other programs and data required by the terminal device 6, and may also be used to temporarily store data that has been output or is to be output.
The terminal device provided by this embodiment can execute the above method embodiment; its implementation principle and technical effect are similar and are not repeated here.
An embodiment of the present application further provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the method described in the above method embodiment is implemented.
An embodiment of the present application further provides a computer program product which, when run on a terminal device, causes the terminal device to implement the method described in the above method embodiment.
If the above integrated units are implemented in the form of software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium. Based on this understanding, all or part of the processes in the methods of the above embodiments of the present application may be accomplished by instructing the relevant hardware through a computer program; the computer program may be stored in a computer-readable storage medium and, when executed by a processor, implements the steps of each of the above method embodiments. The computer program includes computer program code, which may be in source-code form, object-code form, an executable file, some intermediate form, or the like. The computer-readable storage medium may include at least: any entity or apparatus capable of carrying the computer program code to the photographing apparatus/terminal device, a recording medium, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electric carrier signal, a telecommunication signal, and a software distribution medium, for example a USB flash drive, a removable hard disk, a magnetic disk, or an optical disk. In some jurisdictions, according to legislation and patent practice, computer-readable media may not be electric carrier signals or telecommunication signals.
In the above embodiments, the description of each embodiment has its own emphasis; for parts not detailed or recorded in a certain embodiment, reference may be made to the relevant descriptions of other embodiments.
Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented with electronic hardware, or with a combination of computer software and electronic hardware. Whether these functions are executed in hardware or software depends on the specific application and design constraints of the technical solution. Skilled artisans may use different methods to implement the described functions for each particular application, but such implementation should not be considered to go beyond the scope of the present application.
In the embodiments provided in this application, it should be understood that the disclosed apparatus/device and method may be implemented in other ways. For example, the apparatus/device embodiments described above are merely illustrative; for instance, the division into the modules or units is only a division by logical function, and there may be other divisions in actual implementation, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. Furthermore, the mutual coupling or direct coupling or communication connection shown or discussed may be indirect coupling or communication connection through some interfaces, apparatuses, or units, and may be in electrical, mechanical, or other forms.
It should be understood that, when used in the specification and appended claims of this application, the term "comprising" indicates the presence of the described features, integers, steps, operations, elements and/or components, but does not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components and/or collections thereof.
It should also be understood that the term "and/or" used in the specification and appended claims of this application refers to any combination and all possible combinations of one or more of the associated listed items, and includes these combinations.
As used in the specification and appended claims of this application, the term "if" may be construed, depending on the context, as "when" or "once" or "in response to determining" or "in response to detecting". Similarly, the phrase "if it is determined" or "if [the described condition or event] is detected" may be construed, depending on the context, to mean "once it is determined" or "in response to determining" or "once [the described condition or event] is detected" or "in response to detecting [the described condition or event]".
In addition, in the description of the specification and appended claims of this application, the terms "first", "second", "third", etc. are used only to distinguish the descriptions and shall not be understood as indicating or implying relative importance.
References in the specification of this application to "one embodiment" or "some embodiments" and the like mean that a particular feature, structure, or characteristic described in connection with the embodiment is included in one or more embodiments of this application. Thus, the phrases "in one embodiment", "in some embodiments", "in some other embodiments", "in still other embodiments", etc. appearing in various places in this specification do not necessarily all refer to the same embodiment, but mean "one or more but not all embodiments", unless otherwise specifically emphasized. The terms "including", "comprising", "having" and their variants all mean "including but not limited to", unless otherwise specifically emphasized.
Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they may still modify the technical solutions recorded in the foregoing embodiments, or make equivalent replacements for some or all of the technical features therein; such modifications or replacements do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present application.

Claims (10)

  1. An image enhancement method, characterized in that the method comprises:
    acquiring an image to be processed; and
    inputting the image to be processed into a trained image enhancement model for processing, and outputting an enhanced image, wherein the image enhancement model comprises a main network and a conditional network, the main network has a U-net structure, and when the image to be processed is processed, multiple feature tensors of different scales are extracted from the image to be processed through the conditional network, and the image to be processed and the multiple feature tensors of different scales are respectively input into network layers of corresponding scales in the main network for processing to obtain the enhanced image.
  2. The method according to claim 1, characterized in that the main network comprises M downsampling layers and M upsampling layers, the conditional network comprises a shared convolutional layer and M+1 feature extraction modules, and the M+1 feature extraction modules respectively comprise different numbers of downsampling operations;
    wherein extracting the multiple feature tensors of different scales from the image to be processed through the conditional network comprises:
    extracting intermediate features from the image to be processed through the shared convolutional layer; and
    inputting the intermediate features respectively into the M+1 feature extraction modules for processing to obtain M+1 feature tensors of different scales.
  3. The method according to claim 2, characterized in that the main network further comprises a first SFT layer and multiple residual modules, each residual module comprising alternately arranged second SFT layers and convolutional layers; the first SFT layer is connected to the input side of the M downsampling layers and the output side of the M upsampling layers, the multiple residual modules are interspersed between the M downsampling layers and the M upsampling layers, and the M+1 feature tensors of different scales are respectively input into SFT layers of corresponding scales among the first SFT layer and the second SFT layers.
  4. The method according to claim 1, characterized in that the image enhancement model further comprises a weight network, the weight network comprising a skip connection and multiple convolutional layers; the enhanced image is obtained by feature fusion of the output of the main network with original features, the original features being extracted from the image to be processed through the weight network.
  5. The method according to any one of claims 1-4, characterized in that the image to be processed is an LDR image and the enhanced image is an HDR image.
  6. The method according to claim 5, characterized in that the image enhancement model is obtained by training a preset image enhancement initial model with a preset loss function and a training set;
    wherein the training set comprises multiple LDR image samples and an HDR image sample corresponding to each LDR image sample, the preset loss function describes the L1 loss between the value obtained by applying the Tanh function to an HDR predicted image and the value obtained by applying the Tanh function to the HDR image sample, and the HDR predicted image is the image obtained after the image enhancement initial model processes the LDR image sample.
  7. An image enhancement apparatus, comprising:
    an acquisition unit, configured to acquire an image to be processed; and
    a processing unit, configured to input the image to be processed into a trained image enhancement model for processing and to output an enhanced image, wherein the image enhancement model comprises a main network and a condition network, the main network having a U-net structure; when the image to be processed is processed, a plurality of feature tensors of different scales are extracted from the image to be processed by the condition network, and the image to be processed and the plurality of feature tensors of different scales are respectively input to network layers of corresponding scales in the main network for processing, so as to obtain the enhanced image.
  8. The image enhancement apparatus according to claim 7, wherein the image enhancement model further comprises a weight network, the weight network comprising a skip connection and a plurality of convolutional layers; the enhanced image is obtained by feature fusion of the output of the main network with original features, the original features being extracted from the image to be processed by the weight network.
  9. A terminal device, comprising a memory and a processor, the memory being configured to store a computer program, and the processor being configured to execute the method according to any one of claims 1-6 when invoking the computer program.
  10. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the method according to any one of claims 1-6.
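The SFT-modulated residual module recited in claim 3 can be sketched as follows. This is a minimal NumPy illustration rather than the claimed implementation: the affine form of the spatial feature transform, the 1x1 stand-in convolution, and all shapes are assumptions, since the claims do not fix kernel sizes, channel counts, or activations.

```python
import numpy as np

def sft(features, scale, shift):
    # Spatial feature transform: per-pixel affine modulation of a feature
    # map by scale/shift maps predicted from the condition network
    # (this exact affine form is an assumption, not fixed by the claims).
    return features * scale + shift

def residual_module(features, scale, shift, conv_weights):
    # One residual module as in claim 3: an SFT layer alternating with a
    # convolution (a 1x1 convolution stands in here), plus a residual add.
    out = sft(features, scale, shift)                  # second SFT layer
    out = np.einsum('chw,oc->ohw', out, conv_weights)  # 1x1 convolution
    return features + out                              # residual connection
```

With identity convolution weights, the module simply adds the condition-modulated features back onto the input, which is the basic behaviour a residual SFT block should exhibit.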
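Claims 4 and 8 state that the enhanced image results from feature fusion of the main network's output with original features extracted by the weight network, without fixing the fusion operator. A common choice, assumed here purely for illustration, is an element-wise convex combination driven by per-pixel weights:

```python
import numpy as np

def fuse(main_out, original_features, weights):
    # Element-wise weighted fusion of the main-network output with the
    # original features. The convex-combination form is an assumption;
    # the claims only state that the two are feature-fused.
    return weights * main_out + (1.0 - weights) * original_features
```

Under this assumed form, weights near 1 keep the network's enhancement while weights near 0 pass the original features through, which is consistent with the skip connection the weight network is said to contain.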
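The loss of claim 6 compares Tanh-compressed values of the predicted and ground-truth HDR images under an L1 distance. A direct NumPy sketch (the mean reduction is an assumption; the claim does not specify how the L1 loss is reduced):

```python
import numpy as np

def tanh_l1_loss(hdr_pred, hdr_sample):
    # L1 loss in Tanh-compressed space: tanh squashes the wide dynamic
    # range of HDR values into (-1, 1) before the absolute difference,
    # so very bright regions do not dominate the loss.
    return np.mean(np.abs(np.tanh(hdr_pred) - np.tanh(hdr_sample)))
```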
PCT/CN2021/137821 2021-05-27 2021-12-14 Image enhancement method and apparatus, terminal device, and storage medium WO2022247232A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110584556.XA CN113298740A (en) 2021-05-27 2021-05-27 Image enhancement method and device, terminal equipment and storage medium
CN202110584556.X 2021-05-27

Publications (1)

Publication Number Publication Date
WO2022247232A1

Family

ID=77325578

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/137821 WO2022247232A1 (en) 2021-05-27 2021-12-14 Image enhancement method and apparatus, terminal device, and storage medium

Country Status (2)

Country Link
CN (1) CN113298740A (en)
WO (1) WO2022247232A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113298740A (en) * 2021-05-27 2021-08-24 中国科学院深圳先进技术研究院 Image enhancement method and device, terminal equipment and storage medium
CN117157665A (en) * 2022-03-25 2023-12-01 京东方科技集团股份有限公司 Video processing method and device, electronic equipment and computer readable storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190172230A1 (en) * 2017-12-06 2019-06-06 Siemens Healthcare Gmbh Magnetic resonance image reconstruction with deep reinforcement learning
CN111353939A (en) * 2020-03-02 2020-06-30 中国科学院深圳先进技术研究院 Image super-resolution method based on multi-scale feature representation and weight sharing convolution layer
CN112270644A (en) * 2020-10-20 2021-01-26 西安工程大学 Face super-resolution method based on spatial feature transformation and cross-scale feature integration
CN112419152A (en) * 2020-11-23 2021-02-26 中国科学院深圳先进技术研究院 Image super-resolution method and device, terminal equipment and storage medium
CN113298740A (en) * 2021-05-27 2021-08-24 中国科学院深圳先进技术研究院 Image enhancement method and device, terminal equipment and storage medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108830816B (en) * 2018-06-27 2020-12-04 厦门美图之家科技有限公司 Image enhancement method and device
RU2709661C1 (en) * 2018-09-19 2019-12-19 Общество с ограниченной ответственностью "Аби Продакшн" Training neural networks for image processing using synthetic photorealistic containing image signs


Also Published As

Publication number Publication date
CN113298740A (en) 2021-08-24

Similar Documents

Publication Publication Date Title
Zamir et al. Learning enriched features for fast image restoration and enhancement
CN111402130B (en) Data processing method and data processing device
Shi et al. Real-time single image and video super-resolution using an efficient sub-pixel convolutional neural network
EP3948764B1 (en) Method and apparatus for training neural network model for enhancing image detail
WO2021164234A1 (en) Image processing method and image processing device
CN112308200B (en) Searching method and device for neural network
CN112233038A (en) True image denoising method based on multi-scale fusion and edge enhancement
CN112767290B (en) Image fusion method, image fusion device, storage medium and terminal device
WO2022247232A1 (en) Image enhancement method and apparatus, terminal device, and storage medium
WO2022242122A1 (en) Video optimization method and apparatus, terminal device, and storage medium
CN111932480A (en) Deblurred video recovery method and device, terminal equipment and storage medium
Guan et al. Srdgan: learning the noise prior for super resolution with dual generative adversarial networks
Xu et al. Exploiting raw images for real-scene super-resolution
CN116547694A (en) Method and system for deblurring blurred images
Zhang et al. Deep motion blur removal using noisy/blurry image pairs
CN110717864B (en) Image enhancement method, device, terminal equipment and computer readable medium
CN113628134B (en) Image noise reduction method and device, electronic equipment and storage medium
Hua et al. Dynamic scene deblurring with continuous cross-layer attention transmission
CN111383188A (en) Image processing method, system and terminal equipment
CN113658050A (en) Image denoising method, denoising device, mobile terminal and storage medium
CN111953888B (en) Dim light imaging method and device, computer readable storage medium and terminal equipment
US20230060988A1 (en) Image processing device and method
Zhang et al. A new image filtering method: Nonlocal image guided averaging
WO2023273515A1 (en) Target detection method, apparatus, electronic device and storage medium
CN115937121A (en) Non-reference image quality evaluation method and system based on multi-dimensional feature fusion

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application
    Ref document number: 21942795; Country of ref document: EP; Kind code of ref document: A1
NENP Non-entry into the national phase
    Ref country code: DE
122 Ep: pct application non-entry in european phase
    Ref document number: 21942795; Country of ref document: EP; Kind code of ref document: A1