WO2023050731A1 - Method for training image enhancement model, image enhancement method, and readable medium - Google Patents

Method for training image enhancement model, image enhancement method, and readable medium

Info

Publication number
WO2023050731A1
WO2023050731A1 (PCT/CN2022/081106)
Authority
WO
WIPO (PCT)
Prior art keywords
image
convolution
enhancement
loss
sampling
Prior art date
Application number
PCT/CN2022/081106
Other languages
English (en)
French (fr)
Inventor
任聪
刘衡祁
徐科
孔德辉
艾吉松
刘欣
游晶
Original Assignee
深圳市中兴微电子技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳市中兴微电子技术有限公司
Priority to EP22874134.4A, published as EP4394692A1
Publication of WO2023050731A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/60Image enhancement or restoration using machine learning, e.g. neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]

Definitions

  • The present disclosure relates to, but is not limited to, the technical field of image processing.
  • In low-light conditions such as nighttime, captured images are often too dark and insufficiently clear.
  • In particular, when images are captured with a portable electronic device such as a mobile phone, the image capture unit therein has, owing to size limitations, a weaker light-sensing capability than an SLR camera, which makes the above problems even more severe.
  • For images captured in low-light environments, image enhancement techniques can be used to raise their brightness and contrast and make them clearer, for example night-scene image enhancement ("night-scene image enhancement" only means that the overall brightness of the image is low, not that the image must have been captured at night).
  • However, existing image enhancement techniques cannot effectively improve the brightness and the contrast of an image at the same time.
  • The present disclosure provides a method for training an image enhancement model, an image enhancement method, and a computer-readable medium.
  • In a first aspect, the present disclosure provides a method for training an image enhancement model.
  • The image enhancement model includes an enhancement module configured to enhance brightness and contrast, and the enhancement module includes convolution branches in one-to-one correspondence with a plurality of preset brightness intervals; the enhancement module is configured to feed the pixels of an input image into the corresponding convolution branches according to the brightness intervals they fall in, convolve within each convolution branch with a first convolution unit, merge the images output by the convolution branches, and convolve the merged image with a second convolution unit. The method includes: inputting a sample image into the image enhancement model and obtaining a result image output by the image enhancement model; calculating a loss, the loss including an image loss of the result image relative to a standard image, and a first constraint loss of a brightness histogram constraint, in each convolution branch, of the image output by the convolution branches relative to the standard image; adjusting the enhancement module according to the loss; and, if a training end condition is not satisfied, returning to the step of inputting a sample image into the image enhancement model.
  • In a second aspect, the present disclosure provides an image enhancement method, including: inputting at least an image to be enhanced into an image enhancement model, the image enhancement model having been trained by any method for training an image enhancement model described herein; and obtaining a result image output by the image enhancement model.
  • In a third aspect, the present disclosure provides a computer-readable medium storing a computer program which, when executed by a processor, implements any method for training an image enhancement model described herein and/or any image enhancement method described herein.
  • FIG. 1 is a flowchart of a method for training an image enhancement model provided by the present disclosure;
  • FIG. 2 is a schematic diagram of the processing flow of an image enhancement model provided by the present disclosure;
  • FIG. 3 is a schematic diagram of input images of an alignment module of an image enhancement model provided by the present disclosure;
  • FIG. 4 is a schematic diagram of input images of an alignment module of another image enhancement model provided by the present disclosure;
  • FIG. 5 is a schematic diagram of the processing flow of an AP3D alignment module of an image enhancement model provided by the present disclosure;
  • FIG. 6 is a schematic diagram of the processing flow of the appearance-preserving unit in the AP3D alignment module of an image enhancement model provided by the present disclosure;
  • FIG. 7 is a schematic diagram of the processing flow of a fusion module of an image enhancement model provided by the present disclosure;
  • FIG. 8 is a schematic diagram of the processing flow of an enhancement module of an image enhancement model provided by the present disclosure;
  • FIG. 9 is a schematic diagram of the processing flow of the residual down-sampling unit of the enhancement module of an image enhancement model provided by the present disclosure;
  • FIG. 10 is a schematic diagram of the processing flow of the residual up-sampling unit of the enhancement module of an image enhancement model provided by the present disclosure;
  • FIG. 11 is a flowchart of an image enhancement method provided by the present disclosure;
  • FIG. 12 is a block diagram of a computer-readable medium provided by the present disclosure.
  • The present disclosure may be described with reference to plan views and/or cross-sectional views by way of idealized schematic illustrations of the present disclosure. Accordingly, the example illustrations may be modified according to manufacturing techniques and/or tolerances.
  • The terms used in the present disclosure are for describing specific embodiments only and are not intended to limit the present disclosure.
  • The term "and/or" includes any and all combinations of one or more of the associated listed items.
  • The singular forms "a" and "the" are intended to include the plural forms as well, unless the context clearly dictates otherwise.
  • The terms "comprising" and "made of" specify the presence of the stated features, integers, steps, operations, elements and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components and/or groups thereof.
  • The present disclosure is not limited to the embodiments shown in the drawings, but includes modifications of configurations formed on the basis of manufacturing processes. Accordingly, the regions illustrated in the figures have schematic properties; the shapes of the regions shown in the figures illustrate the specific shapes of regions of elements and are not intended to be limiting.
  • "Super Night” technology can be used for image enhancement.
  • CNN Convolutional Neural Networks
  • the purpose of Super Night Scene technology is not to increase the brightness of the image as a whole (to turn the night scene into daytime), but to increase the local area with relatively high brightness while keeping the brightness of the low-brightness area in the image basically unchanged.
  • the brightness of the image can make the brighter part of the image more clear and obvious (in the case of maintaining the night scene, the part of the scene can be brightened).
  • the super night scene technology should be able to keep the night sky relatively dark (maintain the night scene), improve the starlight, night scene, etc.
  • the brightness of the area where lights etc. are located local brightening).
  • the training method of the image enhancement model is unreasonable and lacks appropriate brightness constraints, which leads to unsatisfactory effects of image enhancement (such as super night scenes) using it: or the overall brightness of the obtained image is insufficiently improved, and the brightening The effect is not obvious, and the image is still not clear; or the overall brightness of the obtained image is increased too much and the contrast is insufficient, that is, it is not a local brightening but an overall brightening, such as changing the night scene into daytime.
  • The present disclosure provides a method for training an image enhancement model.
  • The image enhancement model of the present disclosure may be a neural network model, for example a model including a convolutional neural network (CNN).
  • The image enhancement model of the present disclosure can be used for image enhancement (such as night-scene image enhancement), specifically to raise the brightness and the contrast of an image at the same time.
  • For an image whose overall brightness is low, the image enhancement model can raise the brightness of high-brightness regions while keeping the brightness of low-brightness regions basically unchanged (brightening local scenery while preserving the night scene), thereby achieving "Super Night" image enhancement. The above image enhancement model can therefore also be regarded as a "Super Night Network (SNN)".
  • The disclosed method is used to train the image enhancement model, i.e., to adjust the parameters in the image enhancement model during the training process so as to improve the performance of the image enhancement model and finally obtain an image enhancement model that meets the requirements.
  • The image enhancement model includes an enhancement module configured to enhance brightness and contrast.
  • The enhancement module includes convolution branches in one-to-one correspondence with a plurality of preset brightness intervals; the pixels of an input image are fed into the corresponding convolution branches according to the brightness intervals they fall in, each convolution branch convolves with a first convolution unit, and the images output by the convolution branches are merged and then convolved with a second convolution unit.
  • The enhancement module may be, for example, a Histogram Consistency Module (HCM).
  • After an image enters the enhancement module, its pixels (features) enter the corresponding convolution branches according to the brightness interval (i.e., range of brightness levels) that their own brightness falls in.
  • The convolution branches form multiple (three in the figure, as an example) parallel data streams; that is, the data stream in each convolution branch only includes pixels (features) whose brightness lies within a specific brightness interval.
  • The data stream entering each convolution branch is convolved by the first convolution unit (CONV1) of that branch to obtain an image (FM_in) for subsequent processing (such as entering the sampling section).
  • The merged image of the multiple convolution branches is the output image (FM_out) of the convolution branches, which is then convolved by the "overall" second convolution unit (CONV2) and output from the enhancement module as the result image (I_out).
  • The second convolution unit and each first convolution unit may each include one or more convolution kernels, and the number, size, weights, elements, etc. of the convolution kernels in different convolution units may differ.
  • In some embodiments, the image enhancement model further includes an alignment module and a fusion module.
  • The alignment module is arranged before the enhancement module and is configured to align the image to be enhanced that is input to the image enhancement model with its adjacent images.
  • An adjacent image is an image that corresponds to the same scene as the image to be enhanced and is captured at a time adjacent to the image to be enhanced.
  • The fusion module is arranged between the alignment module and the enhancement module, and is configured to fuse the multiple aligned images output by the alignment module into one image, which is input to the enhancement module.
  • The image enhancement model may thus further include an Alignment Module (AM) and a Fusion Module (FM) arranged before the enhancement module.
  • The alignment module is configured to align (e.g., pixel-align) the multiple images input to the image enhancement model, after which the fusion module fuses the multiple images into one image for subsequent processing by the enhancement module.
  • Multiple frames can be captured consecutively for the same scene, one frame serving as the image to be enhanced and the others as its adjacent images (captured at adjacent times); the image to be enhanced and the adjacent images are all input into the alignment module and the fusion module for alignment and fusion, so that the fused image combines the information of multiple images, which can reduce or eliminate the noise in the image, recover more details, and provide higher resolving power.
  • Through the alignment module and the fusion module, the information of multiple (multi-frame) images can be comprehensively utilized, enriching the detail information of the image input to the enhancement module (HCM), so that the details of the final result image are further improved and the noise is reduced.
  • There are many specific ways of selecting the image to be enhanced and the adjacent images, and many specific forms of the alignment module and the fusion module.
  • For consecutive frames in a video stream, the INt-th frame can be used as the image to be enhanced, and the n frames before it and the n frames after it (n being an integer greater than or equal to 1) as adjacent images.
  • Image enhancement performed on a video stream may also be performed on every frame of the video stream; in other words, the method of the embodiments of the present disclosure is performed on every frame, so that the overall processing effect is an enhancement of the video stream.
  • Alternatively, the INt-th frame can be used as the image to be enhanced and the following n frames as adjacent images.
  • Image enhancement is thus not limited to capturing only one frame; continuous shooting can be used to capture multiple frames of the same scene consecutively, so as to obtain richer content details.
  • The alignment module can adopt an Appearance-Preserving 3D Convolution network (AP3D).
  • AP3D can be used to reconstruct images while guaranteeing the appearance alignment of the reconstructed images.
  • The structure of AP3D can be seen in FIG. 5.
  • The input images are INt-1, INt, INt+1, where INt is the image to be enhanced and the others are adjacent images.
  • The input images are first copied and split into two paths; the 3 images in each path are copied again to obtain 6 images, which are reordered so that each image serves once as the main image and is paired with another image serving as the secondary image.
  • Each pair is input into the Appearance-Preserving Module (APM) for pixel-level adjacent alignment, yielding 6 images.
  • The 6 images are then concatenated (concat) with the 3 originally input images to obtain 9 images.
  • The 9 images are convolved (Conv) with a 3*3*3 convolution kernel at a stride of 3, 1, 1, producing 3 images (i.e., the aligned, reconstructed images) as the output of the alignment module (I_align).
  • The structure of the appearance-preserving unit (APM) can be seen in FIG. 6.
  • The main image (central) and the secondary image (adjacent) input into it go through a series of reshape, L2-norm, Hadamard-product, and inner-product operations, by which the two input images are aligned adjacent into one image.
  • At 601 the tensor is transposed; at 602 a weight is computed for the input feature tensor and normalized to [0, 1]; at 603 the input feature tensor is non-linearized and normalized to (0, 1); at 604 the feature output by the processing at 603 is given a name, for example, mask.
  • The alignment module can be any other module that can perform the alignment function, such as another 3D convolutional network, an optical-flow network, a deformable convolution network, an MEMC (Motion Estimation and Motion Compensation) network, and the like.
  • The fusion module can fuse the aligned images (I_align_t-1, I_align_t, I_align_t+1) by direct feature-wise (pixel-wise) addition (ADD) to obtain a fused image (I_fusion) of dimension c×h×w.
  • The image (I_fusion) can be the image input to the enhancement module.
  • The fusion module may also be another module capable of performing the fusion function, such as a concatenation module, an addition module, a convolutional neural network, and the like.
  • The method of the present disclosure includes steps S101 to S104.
  • In step S101, a sample image is input into the image enhancement model, and a result image output by the image enhancement model is obtained.
  • The sample image is input into the current image enhancement model, and the image enhancement model performs processing (image enhancement) to obtain the result image (I_out).
  • The sample image comes from preset training samples; each training sample includes a sample image and a corresponding standard (Ground Truth) image (I_GT). The sample image corresponds to the image to be enhanced, and the standard image is the image that the sample image should yield after a well-performed image enhancement.
  • For example, an image capture unit (such as a camera) can consecutively capture, for the same scene, an image with a shorter exposure time and an image with a longer exposure time, and the resulting pair of long-exposure (long) and short-exposure (short) images can be used as a training sample.
  • Since the long-exposure image has a long exposure time and collects more light, it is equivalent to the short-exposure image after image enhancement; hence the short-exposure image can serve as the sample image and the long-exposure image as the standard image.
  • In step S102, the loss is calculated.
  • The loss includes the image loss of the result image relative to the standard image, and the first constraint loss of the brightness histogram constraint, in each convolution branch, of the image output by the convolution branches relative to the standard image.
  • The corresponding loss can be calculated from the loss function; the loss characterizes the difference between the result image produced by the current image enhancement model and the expected standard result.
  • The loss includes at least an image loss and a first constraint loss.
  • The image loss characterizes the difference between the current result image (i.e., the image I_out output by the enhancement module) and the corresponding standard image (I_GT).
  • The first constraint loss characterizes the difference, in each of the above brightness intervals, between the histogram-counted pixel distributions of the image output by the convolution branches (i.e., the image FM_out that is to be convolved by the second convolution unit) and of the standard image (I_GT).
  • The first constraint loss is thus equivalent to introducing a brightness histogram constraint (a regularization constraint) into the training process.
  • In step S103, the enhancement module is adjusted according to the loss.
  • The parameters in the enhancement module (such as the weights of the convolution kernels and the values of their elements) are adjusted accordingly, in the direction that reduces the loss, to improve the image enhancement model.
  • If the image enhancement model also includes other modules (such as the above alignment module and fusion module), these modules can be preset (e.g., using existing mature modules, or trained separately in advance), so that their parameters need not be adjusted during the training process, i.e., they are not trained.
  • In step S104, if the training end condition is not satisfied, the method returns to the step of inputting a sample image into the image enhancement model.
  • The specific form of the training end condition can vary.
  • For example, the training end condition can be that the loss reaches a predetermined range, that the convergence of the image enhancement model reaches a certain level, or that a predetermined number of iterations is reached, which is not described in detail here.
  • In some embodiments, the convolution branch further includes a sampling section arranged after the first convolution unit; the sampling section includes a plurality of sampling units configured to perform sampling, and the input of each sampling unit comes from the convolution branch it belongs to and from at least one other convolution branch.
  • A sampling section can be arranged after the first convolution unit of each convolution branch. The input of each sampling unit comes not only from the preceding unit of the same convolution branch (such as the first sampling unit or another sampling unit) but also from the corresponding preceding unit in other convolution branches (i.e., in the other convolution branches, the preceding unit at the same position as the preceding unit of this convolution branch); the multiple inputs are merged (Add) before entering the subsequent unit. In other words, the output of the first sampling unit and of each sampling unit is fed not only to the following sampling unit of its own convolution branch but also to the corresponding following sampling units of the other convolution branches.
  • Each sampling unit of the sampling section of the middle convolution branch receives, besides the output of the preceding unit of its own branch, the outputs of the corresponding preceding units in the other two convolution branches, and the merged result enters the subsequent unit; each sampling unit of the other two convolution branches receives, besides the output of the preceding unit of its own branch, the output of the corresponding preceding unit in the middle convolution branch.
  • The sampling section includes a down-sampling unit configured to perform down-sampling, and an up-sampling unit arranged after the down-sampling unit and configured to perform up-sampling.
  • The down-sampling unit can be configured to perform residual down-sampling, and the up-sampling unit to perform residual up-sampling.
  • To keep the size of the final output image of each convolution branch (i.e., the image output to the second convolution unit) unchanged, the sampling section can first contain down-sampling units and then up-sampling units.
  • The sampling section may include equal numbers of down-sampling units and up-sampling units; for example, referring to FIG. 8, there are first two down-sampling units and then two up-sampling units.
  • The above down-sampling unit may be a residual down-sampling unit (Resblock_down) configured to perform residual down-sampling.
  • The up-sampling unit may be a residual up-sampling unit (Resblock_up) configured to perform residual up-sampling.
  • The structures of the residual down-sampling unit and the residual up-sampling unit can be seen in FIG. 9 and FIG. 10, respectively; they include several convolutions (Conv) of different sizes, rectified linear units (ReLU) for activation, down-sampling (Down-Sampling), up-sampling (Up-Sampling), and other operations.
  • The specific forms of the sampling units and the residual sampling units are not limited thereto; they may be any other sampling units, or even convolutional sampling units that do not change the size.
  • In some embodiments, the convolution branch further includes a short-cut connection connected between the input and the output of the sampling section, configured to short the image input to the sampling section to the output of the sampling section.
  • The convolution branch thus also includes a short-cut connection between its input and its output; that is, the image input to the convolution branch is also fed to its output for merging (Add), so as to further improve the contrast-enhancement effect.
  • Other short-cut connections may also be included in the enhancement module, for example a short-cut connection that directly connects the image input to the enhancement module (i.e., the image I_fusion output by the fusion module) to the output of the enhancement module (i.e., after the second convolution unit); that is, the image (I_fusion) input to the enhancement module is merged with the image that the enhancement module would otherwise output (if the short-cut connection did not exist), i.e., the image output by the second convolution unit.
  • In some embodiments, the loss further includes a second constraint loss of the brightness histogram constraint, in each convolution branch, of the image input to the sampling section relative to the image input to the enhancement module.
  • The first constraint loss characterizes the difference, in each brightness interval, between the histogram-counted pixel distributions of the image output by the convolution branches (FM_out) and of the standard image (I_GT), i.e., the difference on the "output side" of the enhancement module. At the same time, however, differences may also exist on the "input side" of the enhancement module and likewise affect the loss.
  • A second constraint loss can therefore also be added to the loss: the difference, in each brightness interval, between the histogram-counted pixel distributions of the image input to the sampling section (the image FM_in that has passed through the first convolution unit) and of the image input to the enhancement module (I_in, such as the image output by the fusion module).
  • The loss is calculated by the following formulas:

$$\mathrm{loss}=\left\|I_{out}-I_{GT}\right\|_{1}+\lambda_{1}\,\mathrm{Hist}(FM_{out},I_{GT},S)+\lambda_{2}\,\mathrm{Hist}(FM_{in},I_{in},S)$$

$$\mathrm{Hist}(FM_{out},I_{GT},S)=\sum_{s=1}^{S}\frac{\left|\mathrm{hist}_{s}(FM_{out})-\mathrm{hist}_{s}(I_{GT})\right|}{N}$$

$$\mathrm{Hist}(FM_{in},I_{in},S)=\sum_{s=1}^{S}\frac{\left|\mathrm{hist}_{s}(FM_{in})-\mathrm{hist}_{s}(I_{in})\right|}{N}$$

  • Hist(FM_out, I_GT, S) represents the first constraint loss; Hist(FM_in, I_in, S) represents the second constraint loss; I_in represents the image input to the enhancement module; I_out represents the result image; I_GT represents the standard image; FM_in represents the image input to the sampling section; FM_out represents the image output by the convolution branches; S represents the number of brightness intervals; ||·||_1 represents the L1-norm function; hist represents the HIST statistics function, with hist_s(·) denoting the number of pixels falling in the s-th brightness interval and N the total number of pixels; λ1 represents a preset coefficient greater than 0 (for example, 0.2); and λ2 represents a preset coefficient greater than 0 (for example, 0.2).
  • The FM in FM_in and FM_out above stands for feature map and has nothing to do with the fusion module (FM).
  • The loss can be calculated by the above formulas, in which ||I_out - I_GT||_1 is the image loss.
  • hist represents the HIST statistics function, which counts the numbers of different features (pixels of different brightness) in an object (image).
  • Hist(FM_out, I_GT, S) is thus the sum, over the S brightness intervals, of the absolute differences between the number of pixels of the standard image (I_GT) falling in each interval and the number of pixels of the image output by the convolution branches (FM_out) falling in the same interval, taken as a proportion of the total number of pixels.
  • In a second aspect, the present disclosure provides an image enhancement method (such as night-scene image enhancement), which may include steps S201 and S202.
  • In step S201, at least the image to be enhanced is input to the image enhancement model.
  • The image enhancement model is trained by any method for training an image enhancement model described herein.
  • In step S202, a result image output by the image enhancement model is obtained.
  • The image enhancement model can then be used for image enhancement (Super Night).
  • The image to be enhanced (which may, of course, be accompanied by one or more adjacent images) is input into the image enhancement model, and the alignment module, fusion module, and enhancement module of the image enhancement model process it in sequence to obtain the enhanced result image.
  • In a third aspect, the present disclosure provides a computer-readable medium storing a computer program which, when executed by a processor, implements any method for training an image enhancement model described herein and/or any image enhancement method described herein.
  • The processor is a device with data processing capability, including but not limited to a central processing unit (CPU).
  • The computer-readable medium is a device with data storage capability, including but not limited to random access memory (RAM, more specifically SDRAM, DDR, etc.), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), and flash memory (FLASH).
  • The I/O interface (read/write interface) is connected between the processor and the memory and enables information interaction between them.
  • Brightness histogram constraints (a form of regularization constraint) are introduced into the training process of the image enhancement model, so that the resulting model can effectively perform brightness enhancement in different ways according to the brightness of different regions; that is, it effectively raises the brightness of high-brightness regions while keeping the brightness of low-brightness regions basically unchanged. It thus improves both brightness and contrast, can satisfy image enhancement requirements such as Super Night (e.g., night-scene image enhancement: brightening local scenery while preserving the night scene), and achieves a better image enhancement effect.
  • The division between the functional modules/units mentioned in the above description does not necessarily correspond to the division of physical components; for example, one physical component may have multiple functions, or one function or step may be performed by several physical components in cooperation.
  • Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit (CPU), a digital signal processor, or a microprocessor, or as hardware, or as an integrated circuit such as an ASIC.
  • Such software may be distributed on computer-readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media).
  • The term computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for the storage of information, such as computer-readable instructions, data structures, program modules, or other data.
  • Computer storage media include, but are not limited to, random access memory (RAM, more specifically SDRAM, DDR, etc.), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory (FLASH), or other disk storage; compact disc read-only memory (CD-ROM), digital versatile disc (DVD), or other optical disc storage; magnetic cassettes, magnetic tape, magnetic disk storage, or other magnetic storage; or any other medium that can be used to store the desired information and that can be accessed by a computer.
  • Communication media typically embody computer-readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism, and may include any information delivery media.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Processing (AREA)

Abstract

The present disclosure provides a method for training an image enhancement model. The image enhancement model includes an enhancement module configured to enhance brightness and contrast; the enhancement module includes convolution branches corresponding to brightness intervals and is used to feed the pixels of an input image into the corresponding convolution branches according to the brightness intervals they fall in, convolve within each convolution branch with a first convolution unit, merge the images output by the convolution branches, and convolve the merged image with a second convolution unit. The method includes: inputting a sample image into the image enhancement model and obtaining a result image output by the image enhancement model; calculating a loss, the loss including an image loss of the result image relative to a standard image, and a first constraint loss of a brightness histogram constraint, in each convolution branch, of the image output by the convolution branches relative to the standard image; adjusting the enhancement module according to the loss; and, if a training end condition is not satisfied, returning to the step of inputting the sample image into the image enhancement model. The present disclosure further provides an image enhancement method and a computer-readable medium.

Description

Method for training image enhancement model, image enhancement method, and readable medium
Cross-Reference to Related Applications
This application claims priority to Patent Application No. 202111143110.X, filed with the Chinese Patent Office on September 28, 2021, the entire contents of which are incorporated herein by reference.
Technical Field
The present disclosure relates to, but is not limited to, the technical field of image processing.
Background
In low-light conditions such as nighttime, captured images are often too dark and insufficiently clear. In particular, when images are captured with a portable electronic device such as a mobile phone, the image capture unit therein has, owing to size limitations, a weaker light-sensing capability than an SLR camera or the like, which makes the above problems even more severe.
For this reason, images captured in low-light environments can be processed with image enhancement techniques to raise their brightness and contrast and make them clearer, for example by night-scene image enhancement ("night-scene image enhancement" only means that the overall brightness of the image is low, not that the image must have been captured at night). However, existing image enhancement techniques cannot effectively improve the brightness and the contrast of an image at the same time.
Summary
The present disclosure provides a method for training an image enhancement model, an image enhancement method, and a computer-readable medium.
In a first aspect, the present disclosure provides a method for training an image enhancement model. The image enhancement model includes an enhancement module configured to enhance brightness and contrast, and the enhancement module includes convolution branches in one-to-one correspondence with a plurality of preset brightness intervals. The enhancement module is configured to feed the pixels of an input image into the corresponding convolution branches according to the brightness intervals they fall in, convolve within each convolution branch with a first convolution unit, merge the images output by the convolution branches, and convolve the merged image with a second convolution unit. The method includes: inputting a sample image into the image enhancement model and obtaining a result image output by the image enhancement model; calculating a loss, where the loss includes an image loss of the result image relative to a standard image, and a first constraint loss of a brightness histogram constraint, in each convolution branch, of the image output by the convolution branches relative to the standard image; adjusting the enhancement module according to the loss; and, if a training end condition is not satisfied, returning to the step of inputting a sample image into the image enhancement model.
In a second aspect, the present disclosure provides an image enhancement method, including: inputting at least an image to be enhanced into an image enhancement model, the image enhancement model having been trained by any method for training an image enhancement model described herein; and obtaining a result image output by the image enhancement model.
In a third aspect, the present disclosure provides a computer-readable medium storing a computer program which, when executed by a processor, implements any method for training an image enhancement model described herein and/or any image enhancement method described herein.
Brief Description of the Drawings
FIG. 1 is a flowchart of a method for training an image enhancement model provided by the present disclosure;
FIG. 2 is a schematic diagram of the processing flow of an image enhancement model provided by the present disclosure;
FIG. 3 is a schematic diagram of input images of an alignment module of an image enhancement model provided by the present disclosure;
FIG. 4 is a schematic diagram of input images of an alignment module of another image enhancement model provided by the present disclosure;
FIG. 5 is a schematic diagram of the processing flow of an AP3D alignment module of an image enhancement model provided by the present disclosure;
FIG. 6 is a schematic diagram of the processing flow of the appearance-preserving unit in the AP3D alignment module of an image enhancement model provided by the present disclosure;
FIG. 7 is a schematic diagram of the processing flow of a fusion module of an image enhancement model provided by the present disclosure;
FIG. 8 is a schematic diagram of the processing flow of an enhancement module of an image enhancement model provided by the present disclosure;
FIG. 9 is a schematic diagram of the processing flow of the residual down-sampling unit of the enhancement module of an image enhancement model provided by the present disclosure;
FIG. 10 is a schematic diagram of the processing flow of the residual up-sampling unit of the enhancement module of an image enhancement model provided by the present disclosure;
FIG. 11 is a flowchart of an image enhancement method provided by the present disclosure;
FIG. 12 is a block diagram of a computer-readable medium provided by the present disclosure.
Detailed Description
To help those skilled in the art better understand the technical solutions of the present disclosure, the method for training an image enhancement model, the image enhancement method, and the computer-readable medium provided by the present disclosure are described in detail below with reference to the accompanying drawings.
The present disclosure is described more fully hereinafter with reference to the accompanying drawings, but the illustrated embodiments may be embodied in different forms, and the present disclosure should not be construed as limited to the embodiments set forth below. Rather, these embodiments are provided so that the present disclosure will be thorough and complete, and will fully convey the scope of the present disclosure to those skilled in the art.
The accompanying drawings of the embodiments of the present disclosure are intended to provide a further understanding of the embodiments, constitute a part of the specification, and serve, together with the detailed embodiments, to explain the present disclosure; they do not limit the present disclosure. The above and other features and advantages will become more apparent to those skilled in the art from the description of the detailed embodiments with reference to the drawings.
The present disclosure may be described with reference to plan views and/or cross-sectional views by way of idealized schematic illustrations of the present disclosure. Accordingly, the example illustrations may be modified according to manufacturing techniques and/or tolerances.
The embodiments of the present disclosure and the features therein may be combined with one another in the absence of conflict.
The terms used in the present disclosure are for describing specific embodiments only and are not intended to limit the present disclosure. As used in the present disclosure, the term "and/or" includes any and all combinations of one or more of the associated listed items. As used in the present disclosure, the singular forms "a" and "the" are intended to include the plural forms as well, unless the context clearly dictates otherwise. As used in the present disclosure, the terms "comprising" and "made of" specify the presence of the stated features, integers, steps, operations, elements and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components and/or groups thereof.
Unless otherwise defined, all terms (including technical and scientific terms) used in the present disclosure have the same meaning as commonly understood by one of ordinary skill in the art. It will also be understood that terms such as those defined in commonly used dictionaries should be interpreted as having a meaning consistent with their meaning in the context of the relevant art and the present disclosure, and will not be interpreted in an idealized or overly formal sense unless the present disclosure expressly so defines them.
The present disclosure is not limited to the embodiments shown in the drawings, but includes modifications of configurations formed on the basis of manufacturing processes. Accordingly, the regions illustrated in the figures have schematic properties, and the shapes of the regions shown in the figures illustrate the specific shapes of regions of elements, but are not intended to be limiting.
In some related technologies, "Super Night" technology can be used for image enhancement.
In Super Night technology, a convolutional neural network (CNN) or the like is used as an image enhancement model to process images, so as to raise the brightness and contrast of the image at the same time.
That is, the aim of Super Night technology is not to raise the brightness of the image as a whole (turning a night scene into daytime), but to raise the brightness of local regions of relatively high brightness while keeping the brightness of low-brightness regions basically unchanged, so that the brighter parts of the image become clearer and more distinct (brightening local scenery while preserving the night scene). For example, in a night-scene image shot at night, most of the image is dark night sky and only a few parts are starlight, lamplight, and the like; Super Night technology should be able to raise the brightness of the regions where the starlight, lamplight, etc. are located (local brightening) while keeping the night sky relatively dark (preserving the night scene).
However, in the related art the image enhancement model is trained in an unreasonable way that lacks appropriate brightness constraints, so that image enhancement performed with it (e.g., Super Night) is unsatisfactory: either the overall brightness of the resulting image is raised insufficiently, the brightening effect is not obvious, and the image is still unclear; or the overall brightness of the resulting image is raised too much while the contrast is insufficient, i.e., the result is overall brightening rather than local brightening, for example turning a night scene into daytime.
In a first aspect, referring to FIG. 1 to FIG. 10, the present disclosure provides a method for training an image enhancement model.
The image enhancement model of the present disclosure may be a neural network model, for example a model including a convolutional neural network (CNN).
The image enhancement model of the present disclosure can be used for image enhancement (e.g., night-scene image enhancement), specifically to raise the brightness and the contrast of an image at the same time. Thus, for an image whose overall brightness is low (such as a night-scene image), the image enhancement model can raise the brightness of high-brightness regions while keeping the brightness of low-brightness regions basically unchanged (brightening local scenery while preserving the night scene), thereby achieving "Super Night" image enhancement. Accordingly, the above image enhancement model may also be regarded as a "Super Night Network (SNN)".
The method of the present disclosure is used to train the image enhancement model, i.e., to adjust the parameters in the image enhancement model during the training process so as to improve the performance of the image enhancement model and finally obtain an image enhancement model that meets the requirements.
In the present disclosure, the image enhancement model includes an enhancement module configured to enhance brightness and contrast; the enhancement module includes convolution branches in one-to-one correspondence with a plurality of preset brightness intervals, and is configured to feed the pixels of an input image into the corresponding convolution branches according to the brightness intervals they fall in, convolve within each convolution branch with a first convolution unit, merge the images output by the convolution branches, and convolve the merged image with a second convolution unit.
Referring to FIG. 2, the image enhancement model of the present disclosure includes an enhancement module configured to enhance brightness and contrast, for example a Histogram Consistency Module (HCM).
Referring to FIG. 2 and FIG. 8, after an image (such as the image I_fusion output by the subsequent fusion module FM) enters the enhancement module, its pixels (features) enter the corresponding convolution branches according to the brightness interval (i.e., range of brightness levels) that their own brightness falls in, forming multiple (three in the figure, as an example) parallel data streams; that is, the data stream in each convolution branch only includes pixels (features) whose brightness lies within a specific brightness interval. The data stream entering each convolution branch is convolved by the first convolution unit (CONV1) of that branch to obtain an image (FM_in) for subsequent processing (e.g., entering the sampling section); the images output by the multiple convolution branches are merged to form the output image (FM_out) of the convolution branches, which is then convolved by the "overall" second convolution unit (CONV2) and output from the enhancement module as the result image (I_out).
The second convolution unit and each first convolution unit may each include one or more convolution kernels, and the number, size, weights, elements, etc. of the convolution kernels in different convolution units may differ.
In some embodiments, the image enhancement model further includes an alignment module and a fusion module.
The alignment module is arranged before the enhancement module and is configured to align the image to be enhanced that is input to the image enhancement model with its adjacent images.
An adjacent image is an image that corresponds to the same scene as the image to be enhanced and is captured at a time adjacent to the image to be enhanced.
The fusion module is arranged between the alignment module and the enhancement module, and is configured to fuse the multiple aligned images output by the alignment module into one image, which is input to the enhancement module.
Referring to FIG. 2, as an embodiment of the present disclosure, the image enhancement model may further include an Alignment Module (AM) and a Fusion Module (FM) arranged before the enhancement module. The alignment module is configured to align (e.g., pixel-align) the multiple images input to the image enhancement model, after which the fusion module fuses the multiple images into one image for subsequent processing by the enhancement module.
Images captured in low-light conditions such as nighttime collect little light and therefore often suffer from heavy noise, weak resolving power, and low brightness; enhancing a single image directly may, for example, mistakenly amplify the noise.
For this reason, multiple frames can be captured consecutively for the same scene, one frame serving as the image to be enhanced and the others as its adjacent images (captured at adjacent times); the image to be enhanced and the adjacent images are all input into the alignment module and the fusion module for alignment and fusion. The fused image thus combines the information of multiple images, which can reduce or eliminate the noise in the image, recover more details, and provide higher resolving power.
In other words, through the alignment module and the fusion module, the information of multiple (multi-frame) images can be comprehensively utilized, enriching the detail information of the image input to the enhancement module (HCM), so that the details of the final result image are further improved and the noise is reduced.
There are many specific ways of selecting the image to be enhanced and the adjacent images, and many specific forms of the alignment module and the fusion module.
For example, referring to FIG. 3, for consecutive frames INt-n…INt…INt+n in a video stream, the INt-th frame may be used as the image to be enhanced, and the n frames before it and the n frames after it (n being an integer greater than or equal to 1) may be used as adjacent images.
Of course, the image enhancement performed on a video stream may also be performed on every frame of the video stream; in other words, the method of the embodiments of the present disclosure is performed on every frame of the video stream. The overall effect of this processing is then an enhancement of the video stream.
As another example, referring to FIG. 4, for consecutively captured frames INt…INt+n, the INt-th frame may be used as the image to be enhanced and the following n frames as adjacent images.
That is, enhancing an image is not limited to capturing only one frame; continuous shooting may be used to capture multiple frames of the same scene consecutively, thereby obtaining richer content details.
For example, the alignment module (AM) may adopt an Appearance-Preserving 3D Convolution network (AP3D); AP3D can be used to reconstruct the images while guaranteeing the appearance alignment of the reconstructed images.
The structure of AP3D can be seen in FIG. 5. The image to be enhanced and the adjacent images are input into it, with an input dimension of n×c×h×w, where n is the number of images (frames; n=3 in the figure, as an example), c is the number of channels of each image (raw format in the figure, hence c=1), and h and w are the height and width of the images. Thus the input images are, for example, INt-1, INt, INt+1, where INt is the image to be enhanced and the others are adjacent images.
Referring to FIG. 5, the input images are first copied and split into two paths; the 3 images in each path are copied again to obtain 6 images, which are reordered so that each image serves once as the main image and is input, together with another image serving as the secondary image, into the Appearance-Preserving Module (APM) for pixel-level adjacent alignment, yielding 6 images. These 6 images are then concatenated (concat) with the 3 originally input images to obtain 9 images, and the 9 images are convolved (Conv) with a 3*3*3 convolution kernel at a stride of 3, 1, 1, producing 3 images (i.e., the aligned, reconstructed images) as the output of the alignment module (I_align).
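The temporal-convolution step of this flow can be sketched as follows; the APM is stubbed out (it is sketched separately below), the exact interleaving order of the nine frames is an assumption (FIG. 5 fixes it in the present disclosure), and the channel count c=1 follows the raw-format example of the figure.

```python
import torch
import torch.nn as nn

frames = torch.rand(1, 1, 3, 64, 64)  # (batch, c=1, n=3 frames, h, w)

def apm_stub(central, adjacent):
    # Placeholder for pixel-level adjacent alignment (sketched separately below).
    return adjacent

# Build the 9-frame stack: APM-aligned frames interleaved with the originals.
stack = []
for t in range(3):
    stack.append(apm_stub(frames[:, :, 1], frames[:, :, t]))  # aligned copy
    stack.append(frames[:, :, t])                             # original frame
    stack.append(apm_stub(frames[:, :, t], frames[:, :, 1]))  # aligned copy
stacked = torch.stack(stack, dim=2)  # (1, 1, 9, 64, 64)

# 3x3x3 convolution at stride (3, 1, 1): 9 frames in, 3 aligned frames out.
conv = nn.Conv3d(1, 1, kernel_size=3, stride=(3, 1, 1), padding=(0, 1, 1))
i_align = conv(stacked)              # (1, 1, 3, 64, 64), the I_align output
```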
Specifically, the structure of the appearance-preserving unit (APM) can be seen in FIG. 6. The main image (central) and the secondary image (adjacent) input into it go through a series of reshape, L2-norm, Hadamard-product, and inner-product operations, by which the two input images are aligned adjacent into one image. Referring to FIG. 6, at 601 the tensor is transposed; at 602 a weight is computed for the input feature tensor and normalized to [0, 1]; at 603 the input feature tensor is non-linearized and normalized to (0, 1); at 604 the feature output by the processing at 603 is given a name, for example, mask.
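The following is a loose sketch of the general idea behind such an appearance-preserving unit: an affinity between the central and adjacent features is computed from inner products of L2-normalized features, normalized into (0, 1), and used to reconstruct the adjacent frame so that it aligns with the central one. This is an interpretation of the reshape/L2-norm/Hadamard-product/inner-product pipeline of FIG. 6, not its exact layout.

```python
import torch
import torch.nn.functional as F

def apm_sketch(central, adjacent):
    # central, adjacent: (c, h, w) feature tensors
    c, h, w = central.shape
    q = F.normalize(central.reshape(c, h * w), dim=0)  # L2-norm over channels
    k = F.normalize(adjacent.reshape(c, h * w), dim=0)
    mask = torch.softmax(q.T @ k, dim=-1)              # (h*w, h*w) weights in (0, 1)
    aligned = adjacent.reshape(c, h * w) @ mask.T      # affinity-weighted rebuild
    return aligned.reshape(c, h, w)                    # frame aligned to `central`
```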
Of course, the alignment module (AM) may be any other module capable of performing the alignment function, such as another 3D convolutional network, an optical-flow network, a deformable convolution network, an MEMC (Motion Estimation and Motion Compensation) network, and the like.
Referring to FIG. 7, the fusion module (FM) may fuse the aligned images (I_align_t-1, I_align_t, I_align_t+1) by direct feature-wise (pixel-wise) addition (ADD), yielding a fused image (I_fusion) of dimension c×h×w. Referring to FIG. 2 and FIG. 8, this image (I_fusion) may be the image input to the enhancement module.
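As a sketch, this fusion is a plain element-wise addition of the aligned frames:

```python
import torch

i_align = torch.rand(1, 1, 3, 64, 64)  # aligned frames from the alignment module
# Element-wise addition of the three aligned frames gives the c x h x w I_fusion.
i_fusion = i_align[:, :, 0] + i_align[:, :, 1] + i_align[:, :, 2]
```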
Of course, the fusion module (FM) may be another module capable of performing the fusion function, such as a concatenation module, an addition module, a convolutional neural network, and the like.
Referring to FIG. 1, the method of the present disclosure includes steps S101 to S104.
In step S101, a sample image is input into the image enhancement model, and a result image output by the image enhancement model is obtained.
In the present disclosure, the sample image is input into the current image enhancement model, which processes it (performs image enhancement) to obtain the result image (I_out).
The sample image comes from preset training samples; each training sample includes a sample image and a corresponding standard (Ground Truth) image (I_GT). The sample image corresponds to the image to be enhanced, and the standard image is the image that the sample image should yield after a well-performed image enhancement.
There are many specific ways of obtaining corresponding sample and standard images. For example, an image capture unit (such as a camera) may consecutively capture, for the same scene, an image with a shorter exposure time and an image with a longer exposure time, and the resulting pair of long-exposure (long) and short-exposure (short) images may be used as a training sample. Since the long-exposure image has a long exposure time and collects more light, it is equivalent to the short-exposure image after image enhancement; hence the short-exposure image can serve as the sample image and the long-exposure image as the standard image.
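Wrapped as a dataset, such short-/long-exposure pairs might look as follows; the pair list and tensor layout are illustrative assumptions.

```python
from torch.utils.data import Dataset

class ExposurePairDataset(Dataset):
    def __init__(self, pairs):
        # pairs: list of (short_exposure_tensor, long_exposure_tensor)
        self.pairs = pairs

    def __len__(self):
        return len(self.pairs)

    def __getitem__(self, idx):
        short_img, long_img = self.pairs[idx]
        return short_img, long_img  # (sample image, standard/GT image)
```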
In step S102, the loss is calculated.
The loss includes the image loss of the result image relative to the standard image, and the first constraint loss of the brightness histogram constraint, in each convolution branch, of the image output by the convolution branches relative to the standard image.
After the result image is obtained, the corresponding loss can be calculated from the loss function; the loss characterizes the difference between the result image produced by the current image enhancement model and the expected standard result.
In the embodiments of the present disclosure, the loss includes at least an image loss and a first constraint loss.
The image loss characterizes the difference between the current result image (i.e., the image I_out output by the enhancement module) and the corresponding standard image (I_GT).
The first constraint loss characterizes the difference, in each of the above brightness intervals, between the histogram-counted pixel (feature) distributions of the image output by the convolution branches (i.e., the image FM_out output by the convolution branches that is to be convolved by the second convolution unit) and of the standard image (I_GT), i.e., the difference between the discrete brightness-histogram distributions of the two images. The first constraint loss is thus equivalent to introducing a brightness histogram constraint (a regularization constraint) into the training process.
In step S103, the enhancement module is adjusted according to the loss.
According to the loss, the parameters in the enhancement module (such as the weights of the convolution kernels and the values of their elements) are adjusted in the direction that reduces the loss, so as to improve the image enhancement model.
Of course, if the image enhancement model also includes other modules (such as the above alignment module and fusion module), these modules may be preset (e.g., using existing mature modules, or trained separately in advance), so that their parameters need not be adjusted during the training process, i.e., they are not trained.
Of course, it is also feasible for the above modules to have their parameters adjusted according to the loss, i.e., to be trained as well.
The specific manner of determining the parameter adjustment from the loss can vary, and is not described in detail here.
In step S104, if the training end condition is not satisfied, the method returns to the step of inputting a sample image into the image enhancement model.
It is determined whether the preset training end condition is currently satisfied:
If not, the method returns to the step of inputting a sample image into the image enhancement model (S101), a sample image is selected again, and training continues to further optimize the image enhancement model.
If yes, training of the image enhancement model is finished, and it can subsequently be used for image enhancement.
The specific form of the training end condition can vary; for example, it may be that the loss reaches a predetermined range, that the convergence of the image enhancement model reaches a certain level, or that a predetermined number of iterations is reached, which is not described in detail here.
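A compact sketch of the S101 to S104 cycle is given below. The assumption that the model returns the intermediate images (FM_in, FM_out, I_in) along with I_out, and the names model, loader, and histogram_loss (the constraint function sketched further below), are ours, not fixed by the present disclosure.

```python
def train(model, loader, optimizer, lambda1=0.2, lambda2=0.2, max_steps=10000):
    for step, (sample, gt) in enumerate(loader):
        out, fm_in, fm_out, i_in = model(sample)      # S101: forward pass
        loss = (out - gt).abs().mean()                # image loss (L1)
        loss = loss + lambda1 * histogram_loss(fm_out, gt, s=3)    # 1st constraint
        loss = loss + lambda2 * histogram_loss(fm_in, i_in, s=3)   # 2nd constraint
        optimizer.zero_grad()                         # S102/S103: compute loss,
        loss.backward()                               # adjust the enhancement module
        optimizer.step()
        if step >= max_steps:                         # S104: training end condition
            break
```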
In some embodiments, the convolution branch further includes a sampling section arranged after the first convolution unit; the sampling section includes a plurality of sampling units configured to perform sampling, and the input of each sampling unit comes from the convolution branch it belongs to and from at least one other convolution branch.
To better exchange information between the data streams of different convolution branches, referring to FIG. 8, a sampling section may be arranged after the first convolution unit of each convolution branch. The sampling section includes sampling units, and the input of each sampling unit comes not only from the preceding unit of the same convolution branch (such as the first sampling unit or another sampling unit) but also from the corresponding preceding unit in other convolution branches (i.e., in the other convolution branches, the preceding unit at the same position as the preceding unit of this convolution branch); the multiple inputs are merged (Add) before entering the subsequent unit. In other words, the output of the first sampling unit and of each sampling unit is fed not only to the following sampling unit of its own convolution branch but also to the corresponding following sampling units of the other convolution branches.
For example, referring to FIG. 8, among the three convolution branches, each sampling unit of the sampling section of the middle convolution branch receives, besides the output of the preceding unit of its own branch, the outputs of the corresponding preceding units in the other two convolution branches, and the merged result enters the subsequent unit; each sampling unit of the other two convolution branches receives, besides the output of the preceding unit of its own branch, the output of the corresponding preceding unit in the middle convolution branch. Because the brightness of the pixels in the top and bottom convolution branches differs greatly, these two branches do not feed their outputs to each other.
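A sketch of this cross-branch merging for three branches follows; the sampling units themselves are passed in as arbitrary modules.

```python
import torch.nn as nn

class CrossBranchStage(nn.Module):
    def __init__(self, top_unit, mid_unit, bot_unit):
        super().__init__()
        self.top, self.mid, self.bot = top_unit, mid_unit, bot_unit

    def forward(self, top, mid, bot):
        # Each unit receives its own branch's stream merged (Add) with the
        # corresponding stream(s) of the neighboring branch(es); the top and
        # bottom branches do not exchange with each other.
        return (
            self.top(top + mid),
            self.mid(mid + top + bot),
            self.bot(bot + mid),
        )
```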
In some embodiments, the sampling section includes a down-sampling unit configured to perform down-sampling, and an up-sampling unit arranged after the down-sampling unit and configured to perform up-sampling.
In some embodiments, the down-sampling unit is configured to perform residual down-sampling, and the up-sampling unit is configured to perform residual up-sampling.
As an embodiment of the present disclosure, in order to keep the size of the final output image of each convolution branch (i.e., the image output to the second convolution unit) unchanged, the sampling section may first contain down-sampling units and then up-sampling units. For example, the sampling section may include equal numbers of down-sampling units and up-sampling units; referring to FIG. 8, there are first two down-sampling units and then two up-sampling units.
Further, the above down-sampling unit may be a residual down-sampling unit (Resblock_down) configured to perform residual down-sampling, and the up-sampling unit may be a residual up-sampling unit (Resblock_up) configured to perform residual up-sampling.
Illustratively, the structures of the residual down-sampling unit and the residual up-sampling unit can be seen in FIG. 9 and FIG. 10, respectively; they include several convolutions (Conv) of different sizes, rectified linear units (ReLU) for activation, down-sampling (Down-Sampling), up-sampling (Up-Sampling), and other operations.
Of course, the specific forms of the sampling units and the residual sampling units are not limited thereto; they may be any other sampling units, or even convolutional sampling units that do not change the size.
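The following are minimal residual down-/up-sampling blocks in the spirit of FIG. 9 and FIG. 10, i.e., a convolutional residual path plus a resampled identity path. The kernel sizes and the use of average pooling and nearest-neighbor upsampling are assumptions; the figures fix the actual operations.

```python
import torch.nn as nn

class ResblockDown(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, ch, 3, stride=2, padding=1),  # halves h and w
        )
        self.skip = nn.AvgPool2d(2)                     # down-sampled identity path

    def forward(self, x):
        return self.body(x) + self.skip(x)

class ResblockUp(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.up = nn.Upsample(scale_factor=2, mode="nearest")  # doubles h and w
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, ch, 3, padding=1),
        )

    def forward(self, x):
        x = self.up(x)
        return self.body(x) + x                         # residual connection
```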
In some embodiments, the convolution branch further includes a short-cut connection connected between the input and the output of the sampling section, the short-cut connection being configured to short the image input to the sampling section to the output of the sampling section.
Referring to the dashed arrows in FIG. 8, the convolution branch further includes a short-cut connection between its input and its output; that is, the image input to the convolution branch is also fed to its output for merging (Add), so as to further improve the contrast-enhancement effect.
Of course, the enhancement module may include other short-cut connections, for example a short-cut connection that directly connects the image input to the enhancement module (i.e., the image I_fusion output by the fusion module) to the output of the enhancement module (i.e., after the second convolution unit); that is, the image (I_fusion) input to the enhancement module is merged with the image that the enhancement module would otherwise output (if the short-cut connection did not exist), i.e., the image output by the second convolution unit.
It should be understood that the above short-cut connections also span multiple stages, so they are in essence residual connections.
In some embodiments, the loss further includes a second constraint loss of the brightness histogram constraint, in each convolution branch, of the image input to the sampling section relative to the image input to the enhancement module.
As above, the first constraint loss characterizes the difference, in each brightness interval, between the histogram-counted pixel distributions of the image output by the convolution branches (FM_out) and of the standard image (I_GT), i.e., the difference on the "output side" of the enhancement module. At the same time, however, differences may also exist on the "input side" of the enhancement module and likewise affect the loss.
For this reason, a second constraint loss may be added to the loss: the difference, in each brightness interval, between the histogram-counted pixel distributions of the image input to the sampling section (the image FM_in that has passed through the first convolution unit) and of the image input to the enhancement module (I_in, e.g., the image output by the fusion module).
In some embodiments, the loss is calculated by the following formulas:
$$\mathrm{loss}=\left\|I_{out}-I_{GT}\right\|_{1}+\lambda_{1}\,\mathrm{Hist}(FM_{out},I_{GT},S)+\lambda_{2}\,\mathrm{Hist}(FM_{in},I_{in},S)$$

$$\mathrm{Hist}(FM_{out},I_{GT},S)=\sum_{s=1}^{S}\frac{\left|\mathrm{hist}_{s}(FM_{out})-\mathrm{hist}_{s}(I_{GT})\right|}{N}$$

$$\mathrm{Hist}(FM_{in},I_{in},S)=\sum_{s=1}^{S}\frac{\left|\mathrm{hist}_{s}(FM_{in})-\mathrm{hist}_{s}(I_{in})\right|}{N}$$

Here hist_s(·) denotes the number of pixels falling in the s-th brightness interval, and N denotes the total number of pixels.
In the above, Hist(FM_out, I_GT, S) denotes the first constraint loss, Hist(FM_in, I_in, S) denotes the second constraint loss, I_in denotes the image input to the enhancement module, I_out denotes the result image, I_GT denotes the standard image, FM_in denotes the image input to the sampling section, FM_out denotes the image output by the convolution branches, S denotes the number of brightness intervals, ||·||_1 denotes the L1-norm function, hist denotes the HIST statistics function, λ1 denotes a preset coefficient greater than 0 (for example, 0.2), and λ2 denotes a preset coefficient greater than 0 (for example, 0.2).
The FM in FM_in and FM_out above stands for feature map and has nothing to do with the fusion module (FM).
As an embodiment of the present disclosure, the loss can be calculated by the above formulas, in which ||I_out - I_GT||_1 is the image loss, while Hist(FM_out, I_GT, S) and Hist(FM_in, I_in, S) are the first constraint loss and the second constraint loss, respectively.
Specifically, hist denotes the HIST statistics function, which counts the numbers of different features (pixels of different brightness) in an object (image). For example, Hist(FM_out, I_GT, S) is the sum, over the S brightness intervals, of the absolute differences between the number of pixels of the standard image (I_GT) falling in each interval and the number of pixels of the image output by the convolution branches (FM_out) falling in the same interval, taken as a proportion of the total number of pixels.
Of course, it is also feasible to calculate the constraint loss produced by the histogram in other specific ways.
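For instance, a minimal sketch of the histogram statistic as described above: count the pixels of the two images in each of the S brightness intervals and sum the absolute count differences as a proportion of the total pixel count. Note that hard counting is not differentiable, so a practical training implementation would substitute a soft binning here; interval edges over [0, 1] are assumed.

```python
import torch

def histogram_loss(a, b, s=3):
    # a, b: tensors of per-pixel brightness values scaled to [0, 1]
    n = a.numel()
    hist_a = torch.histc(a, bins=s, min=0.0, max=1.0)  # pixels per interval
    hist_b = torch.histc(b, bins=s, min=0.0, max=1.0)
    return (hist_a - hist_b).abs().sum() / n           # Hist(a, b, s)
```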
In a second aspect, referring to FIG. 11, the present disclosure provides an image enhancement method (e.g., night-scene image enhancement), which may include steps S201 and S202.
In step S201, at least the image to be enhanced is input into the image enhancement model.
The image enhancement model is trained by any method for training an image enhancement model according to the embodiments of the present disclosure.
In step S202, a result image output by the image enhancement model is obtained.
After the above image enhancement model is trained in the above manner, which includes the brightness histogram constraint, it can be used for image enhancement (Super Night).
That is, the image to be enhanced (which may, of course, be accompanied by one or more adjacent images) is input into the image enhancement model, and the alignment module, the fusion module, and the enhancement module of the image enhancement model process it in sequence to obtain the enhanced result image.
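An illustrative inference call for steps S201 and S202, assuming the model object and input layout of the sketches above:

```python
import torch

def enhance(model, frames):
    # frames: (1, c, n, h, w), the image to be enhanced plus adjacent frames
    model.eval()
    with torch.no_grad():
        return model(frames)  # result image I_out after align -> fuse -> enhance
```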
In a third aspect, referring to FIG. 12, the present disclosure provides a computer-readable medium storing a computer program which, when executed by a processor, implements any method for training an image enhancement model according to the embodiments of the present disclosure and/or any image enhancement method according to the embodiments of the present disclosure.
The processor is a device with data processing capability, including but not limited to a central processing unit (CPU); the computer-readable medium is a device with data storage capability, including but not limited to random access memory (RAM, more specifically SDRAM, DDR, etc.), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), and flash memory (FLASH); the I/O interface (read/write interface) is connected between the processor and the memory, enables information interaction between the memory and the processor, and includes but is not limited to a data bus (Bus).
In the present disclosure, a brightness histogram constraint (a regularization constraint) is introduced into the training process of the image enhancement model, so that the resulting image enhancement model can effectively perform brightness enhancement in different ways according to the brightness of different regions, i.e., effectively raise the brightness of high-brightness regions while keeping the brightness of low-brightness regions basically unchanged. It thus improves both brightness and contrast, can satisfy image enhancement requirements such as Super Night (e.g., night-scene image enhancement: brightening local scenery while preserving the night scene), and achieves a better image enhancement effect.
Those of ordinary skill in the art will appreciate that all or some of the steps, systems, and functional modules/units in the devices disclosed above may be implemented as software, firmware, hardware, and appropriate combinations thereof.
In a hardware implementation, the division between the functional modules/units mentioned in the above description does not necessarily correspond to the division of physical components; for example, one physical component may have multiple functions, or one function or step may be performed by several physical components in cooperation.
Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit (CPU), a digital signal processor, or a microprocessor, or as hardware, or as an integrated circuit such as an application-specific integrated circuit. Such software may be distributed on computer-readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). As is well known to those of ordinary skill in the art, the term computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for the storage of information (such as computer-readable instructions, data structures, program modules, or other data). Computer storage media include, but are not limited to, random access memory (RAM, more specifically SDRAM, DDR, etc.), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory (FLASH), or other disk storage; compact disc read-only memory (CD-ROM), digital versatile disc (DVD), or other optical disc storage; magnetic cassettes, magnetic tape, magnetic disk storage, or other magnetic storage; or any other medium that can be used to store the desired information and that can be accessed by a computer. In addition, as is well known to those of ordinary skill in the art, communication media typically embody computer-readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism, and may include any information delivery media.
The present disclosure has disclosed example embodiments, and although specific terms are employed, they are used and should be interpreted in a generic and descriptive sense only and not for purposes of limitation. In some instances, it will be apparent to those skilled in the art that, unless otherwise expressly indicated, features, characteristics, and/or elements described in connection with a particular embodiment may be used alone or in combination with features, characteristics, and/or elements described in connection with other embodiments. Accordingly, those skilled in the art will understand that various changes in form and detail may be made without departing from the scope of the present disclosure as set forth in the appended claims.

Claims (10)

  1. A method for training an image enhancement model, wherein the image enhancement model comprises an enhancement module configured to enhance brightness and contrast, the enhancement module comprising convolution branches in one-to-one correspondence with a plurality of preset brightness intervals; the enhancement module is configured to feed pixels of an input image into the corresponding convolution branches according to the brightness intervals they fall in, convolve within each convolution branch with a first convolution unit, merge the images output by the convolution branches, and convolve the merged image with a second convolution unit; the method comprising:
    inputting a sample image into the image enhancement model, and obtaining a result image output by the image enhancement model;
    calculating a loss, the loss comprising an image loss of the result image relative to a standard image, and a first constraint loss of a brightness histogram constraint, in each convolution branch, of the image output by the convolution branches relative to the standard image;
    adjusting the enhancement module according to the loss;
    and, if a training end condition is not satisfied, returning to the step of inputting a sample image into the image enhancement model.
  2. The method according to claim 1, wherein the convolution branch further comprises:
    a sampling section arranged after the first convolution unit, the sampling section comprising a plurality of sampling units configured to perform sampling, wherein the input of each sampling unit comes from the convolution branch it belongs to and from at least one other convolution branch.
  3. The method according to claim 2, wherein the loss further comprises:
    a second constraint loss of the brightness histogram constraint, in each convolution branch, of the image input to the sampling section relative to the image input to the enhancement module.
  4. The method according to claim 3, wherein the loss is calculated by the following formulas:
    $$\mathrm{loss}=\left\|I_{out}-I_{GT}\right\|_{1}+\lambda_{1}\,\mathrm{Hist}(FM_{out},I_{GT},S)+\lambda_{2}\,\mathrm{Hist}(FM_{in},I_{in},S)$$
    $$\mathrm{Hist}(FM_{out},I_{GT},S)=\sum_{s=1}^{S}\frac{\left|\mathrm{hist}_{s}(FM_{out})-\mathrm{hist}_{s}(I_{GT})\right|}{N}$$
    $$\mathrm{Hist}(FM_{in},I_{in},S)=\sum_{s=1}^{S}\frac{\left|\mathrm{hist}_{s}(FM_{in})-\mathrm{hist}_{s}(I_{in})\right|}{N}$$
    wherein Hist(FM_out, I_GT, S) denotes the first constraint loss, Hist(FM_in, I_in, S) denotes the second constraint loss, I_in denotes the image input to the enhancement module, I_out denotes the result image, I_GT denotes the standard image, FM_in denotes the image input to the sampling section, FM_out denotes the image output by the convolution branches, S denotes the number of brightness intervals, ||·||_1 denotes the L1-norm function, hist denotes the HIST statistics function (hist_s(·) being the number of pixels falling in the s-th brightness interval and N the total number of pixels), λ1 denotes a preset coefficient greater than 0, and λ2 denotes a preset coefficient greater than 0.
  5. The method according to claim 2, wherein
    the sampling section comprises a down-sampling unit configured to perform down-sampling, and an up-sampling unit arranged after the down-sampling unit and configured to perform up-sampling.
  6. The method according to claim 5, wherein
    the down-sampling unit is configured to perform residual down-sampling;
    the up-sampling unit is configured to perform residual up-sampling.
  7. The method according to claim 2, wherein the convolution branch further comprises:
    a short-cut connection connected between the input and the output of the sampling section, the short-cut connection being configured to short the image input to the sampling section to the output of the sampling section.
  8. The method according to claim 1, wherein the image enhancement model further comprises:
    an alignment module, arranged before the enhancement module and configured to align an image to be enhanced that is input to the image enhancement model with adjacent images, an adjacent image being an image that corresponds to the same scene as the image to be enhanced and is captured at a time adjacent to the image to be enhanced;
    a fusion module, arranged between the alignment module and the enhancement module and configured to fuse the multiple aligned images output by the alignment module into one image, which is input to the enhancement module.
  9. An image enhancement method, comprising:
    inputting at least an image to be enhanced into an image enhancement model, the image enhancement model having been trained by the method for training an image enhancement model according to any one of claims 1 to 8;
    obtaining a result image output by the image enhancement model.
  10. A computer-readable medium storing a computer program which, when executed by a processor, implements:
    the method for training an image enhancement model according to any one of claims 1 to 8, and/or the image enhancement method according to claim 9.
PCT/CN2022/081106 2021-09-28 2022-03-16 Method for training image enhancement model, image enhancement method, and readable medium WO2023050731A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
EP22874134.4A EP4394692A1 (en) 2021-09-28 2022-03-16 Method for training image enhancement model, image enhancement method, and readable medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111143110.XA 2021-09-28 2021-09-28 Method for training image enhancement model, image enhancement method, and readable medium
CN202111143110.X 2021-09-28

Publications (1)

Publication Number Publication Date
WO2023050731A1 (zh) 2023-04-06

Family

ID=85763471

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/081106 2021-09-28 2022-03-16 Method for training image enhancement model, image enhancement method, and readable medium WO2023050731A1 (zh)

Country Status (3)

Country Link
EP (1) EP4394692A1 (zh)
CN (1) CN115880162A (zh)
WO (1) WO2023050731A1 (zh)


Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102930517A (zh) * 2012-11-30 2013-02-13 江苏技术师范学院 Histogram equalization image enhancement method
CN108447036A (zh) * 2018-03-23 2018-08-24 北京大学 Low-illumination image enhancement method based on a convolutional neural network
CN110197463A (zh) * 2019-04-25 2019-09-03 深圳大学 High dynamic range image tone mapping method and system based on deep learning
US20200219238A1 (en) * 2020-03-18 2020-07-09 Intel Corporation Brightness and contrast enhancement for video
CN112614077A (zh) * 2020-12-30 2021-04-06 北京航空航天大学杭州创新研究院 Unsupervised low-illumination image enhancement method based on a generative adversarial network
CN113052210A (zh) * 2021-03-11 2021-06-29 北京工业大学 Fast low-light object detection method based on a convolutional neural network

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116664463A (zh) * 2023-05-29 2023-08-29 中兴协力(山东)数字科技集团有限公司 Two-stage low-illumination image enhancement method
CN116664463B (zh) * 2023-05-29 2024-01-30 中兴协力(山东)数字科技集团有限公司 Two-stage low-illumination image enhancement method

Also Published As

Publication number Publication date
EP4394692A1 (en) 2024-07-03
CN115880162A (zh) 2023-03-31

Similar Documents

Publication Publication Date Title
US11922639B2 (en) HDR image generation from single-shot HDR color image sensors
JP6066536B2 (ja) ゴーストのない高ダイナミックレンジ画像の生成
US9648251B2 (en) Method for generating an HDR image of a scene based on a tradeoff between brightness distribution and motion
CN108205796B (zh) 一种多曝光图像的融合方法及装置
US9131160B2 (en) Method for controlling exposure time of high dynamic range image
US9137454B2 (en) Method of filming high dynamic range videos
CN110728648A (zh) 图像融合的方法、装置、电子设备及可读存储介质
WO2021208706A1 (zh) 高动态范围图像合成方法、装置、图像处理芯片及航拍相机
US20100253833A1 (en) Exposing pixel groups in producing digital images
CN111064904A (zh) 一种暗光图像增强方法
WO2023050731A1 (zh) 2023-04-06 Method for training image enhancement model, image enhancement method, and readable medium
US20230074180A1 (en) Method and apparatus for generating super night scene image, and electronic device and storage medium
US20220198625A1 (en) High-dynamic-range image generation with pre-combination denoising
CN111612722A (zh) 基于简化Unet全卷积神经网络的低照度图像处理方法
WO2023124123A1 (zh) 图像处理方法及其相关设备
CN116711317A (zh) 用于图像处理的高动态范围技术选择
CN111325679A (zh) 一种Raw到Raw的暗光图像增强方法
CN112598609A (zh) 一种动态图像的处理方法及装置
WO2023000878A1 (zh) 拍摄方法、装置、控制器、设备和计算机可读存储介质
US20230196721A1 (en) Low-light video processing method, device and storage medium
CN115482173A (zh) 一种基于Retinex理论的夜间无人机跟踪低照度图像增强方法
WO2024098260A1 (zh) 图像确定方法和装置、图像处理方法和装置、电子设备
CN114450934B (zh) 获取图像的方法、装置、设备及计算机可读存储介质
CN117455805A (zh) 图像处理方法及装置、计算机可读存储介质、终端
US20230336887A1 (en) In-line chromatic aberration correction in wide dynamic range (wdr) image processing pipeline

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22874134

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 2022874134

Country of ref document: EP

ENP Entry into the national phase

Ref document number: 2022874134

Country of ref document: EP

Effective date: 20240325

NENP Non-entry into the national phase

Ref country code: DE