CN116452472A: Low-illumination image enhancement method based on semantic knowledge guidance


Info

Publication number
CN116452472A
Authority
CN
China
Prior art keywords
image
semantic
loss
image enhancement
network
Prior art date
Legal status
Pending
Application number
CN202310277679.8A
Other languages
Chinese (zh)
Inventor
王国庆
吴煜辉
潘晨
位纪伟
杨阳
Current Assignee
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China
Priority to CN202310277679.8A
Publication of CN116452472A
Status: Pending


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/40 Image enhancement or restoration using histogram techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/90 Determination of colour characteristics
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/10 Internal combustion engine [ICE] based vehicles
    • Y02T10/40 Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a low-illumination image enhancement method based on semantic knowledge guidance, belonging to the technical field of low-illumination image enhancement. By introducing semantic information, the method addresses a problem that previous methods ignore. The invention can be applied to any image enhancement network with an encoder-decoder structure, enabling models that lack semantically related information to learn more knowledge. By combining the semantic-guided embedding module with the semantic-guided color histogram loss and the semantic-guided adversarial loss, the invention attends to semantically related knowledge from several different angles. The invention improves the capability of the low-illumination image enhancement network and obtains more real and natural enhancement results.

Description

Low-illumination image enhancement method based on semantic knowledge guidance
Technical Field
The invention belongs to the technical field of low-illumination image enhancement, and particularly relates to a low-illumination image enhancement method based on semantic knowledge guidance.
Background
Due to unavoidable environmental and/or technical limitations, such as insufficient illumination and limited exposure time, images are often taken under suboptimal lighting conditions and are degraded by backlight, non-uniform illumination, and low light. The aesthetic quality of such images suffers, and the information they convey is degraded, which harms higher-level tasks such as object tracking, recognition, and detection. Low-light enhancement (for images whose brightness is less than or equal to a specified value) enjoys wide application in fields including visual surveillance, autonomous driving, and computational photography. Smartphone photography, in particular, has become ubiquitous, yet the small aperture of a phone camera, real-time processing requirements, and memory limits make taking pictures in dim environments especially challenging. Enhancing low-light images and video is therefore a valuable area of research. Conventional low-illumination image enhancement methods include histogram-equalization-based and Retinex-model-based methods, but these do not adapt well to diverse environments, generally run slowly, and require parameters that are difficult to tune. In recent years, with the progress of deep learning, low-illumination image enhancement based on deep learning has achieved remarkable success.
Current deep-learning-based low-illumination image enhancement methods fall mainly into two classes: end-to-end methods and Retinex-based methods. Starting from the classical LLNet, researchers have proposed a variety of end-to-end approaches, including end-to-end parametric filter estimation networks, recurrent neural networks, multi-exposure fusion networks, stacked Laplacian enhancement networks, and wavelet-transform-based enhancement networks. In contrast to learning the enhancement mapping directly in an end-to-end network, Retinex theory offers physical interpretability, so deep low-illumination enhancement methods based on it generally obtain better results. The first Retinex-based method, Retinex-Net, decomposes the low-light image into an illumination component and a reflectance component through a network, enhances the illumination component, and then fuses the two into a normal-light image. Later researchers proposed KinD, built on Retinex-Net, which adds enhancement and denoising operations on the reflectance component and improves the enhancement effect. There are also KinD++, enhancement networks based on Retinex and neural architecture search, Retinex-based deep unfolding enhancement networks, and normalizing-flow-based enhancement networks. Notably, all of these methods enhance the low-light image without regard to the semantic information of its different regions; when the low-light image contains objects that are inherently black, such as black hair or black vehicles, these methods typically enhance those parts to gray, producing color bias. To solve this problem, the enhancement network must learn semantically relevant information. Researchers have proposed preliminary schemes, including fusing the predictions of a semantic segmentation network into a Retinex-based network, and constraining the parameter updates of the image enhancement network with the loss function of a semantic segmentation network. Both achieve a combination of semantic and image information through carefully designed networks and training methods, but they neither fully exploit the information a semantic segmentation network can provide nor account for the differences between semantic information and the original enhancement task. In the former, the gap between the segmentation result and the intermediate enhancement features is large, and fusion inevitably damages the original image information; in the latter, directly constraining two different tasks by loss disturbs the original optimization of the enhancement network's parameters and thus the final result. In summary, existing schemes cannot introduce semantic information into the image enhancement task well; the interaction between semantic and image information requires elaborate design, which limits generalization, and the generated normal-light images exhibit abnormal colors and details, degrading both the visual effect and subsequent image processing tasks.
Disclosure of Invention
The invention provides a low-illumination image enhancement method based on semantic knowledge guidance, which improves the enhancement quality of low-illumination images.
The invention adopts the following technical scheme:
A semantic-knowledge-guided low-illumination image enhancement method, the method comprising:
step 1, constructing an image enhancement processing network model;
the image enhancement processing network model comprises two branches, wherein one branch is a semantic segmentation network and the other branch is an image enhancement network, and N (N ≥ 2) semantic embedding modules are arranged between the two branches;
the semantic segmentation network sequentially comprises a first encoder, a first decoder, and a prediction head, wherein the first encoder performs feature extraction on the input image $I_l$ (a low-illumination image) to obtain a first initial feature map of the input image $I_l$;
the first decoder performs multi-scale decoding on the first initial feature map to obtain deep feature maps at different scales, namely the semantic segmentation features $F_s$; the number of scales M of the semantic segmentation features is larger than N;
the prediction head performs pixel-level semantic category prediction on the multi-scale semantic segmentation features $F_s$ output by the first decoder and outputs the semantic prediction map $I_{seg}$ (the semantic segmentation result) of the input image $I_l$;
semantic segmentation features at N consecutive scales are selected from the M scales of deep feature maps, each serving as one of the two inputs of a semantic embedding module, and the N semantic embedding modules are numbered the 1st to the N-th in order of increasing scale;
the image enhancement network comprises a second encoder and a second decoder, wherein the second encoder performs feature extraction on the low-illumination input image $I_l$ to obtain a second initial feature map of the input image $I_l$; the second decoder comprises N+1 convolution blocks, each of which upsamples its input and outputs image enhancement features $F_i$ at a different scale, the output of the last convolution block being the predicted enhanced image $I_{out}$ of the input image $I_l$; the input of the 1st convolution block is the second initial feature map, the output of the 1st convolution block serves as the other input of the 1st semantic embedding module, and the output of the i-th semantic embedding module (i = 1, ..., N) serves as the input of the (i+1)-th convolution block, as sketched below;
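As a non-authoritative sketch of this wiring, the following PyTorch fragment shows how a frozen segmentation branch could feed N semantic embedding modules interleaved with the N+1 decoder convolution blocks. The module names, the assumption that the segmentation branch returns its multi-scale features alongside $I_{seg}$, and the channel handling are illustrative assumptions, not the patented implementation.

```python
import torch
import torch.nn as nn

class SemanticGuidedEnhancer(nn.Module):
    """Two-branch wiring sketch: a frozen segmentation branch feeds N
    semantic embedding modules placed between the N+1 decoder convolution
    blocks of the enhancement branch."""

    def __init__(self, seg_net: nn.Module, encoder: nn.Module,
                 dec_blocks: list, embed_modules: list):
        super().__init__()
        self.seg_net = seg_net                       # pre-trained, frozen
        self.encoder = encoder                       # second encoder
        self.dec_blocks = nn.ModuleList(dec_blocks)  # N+1 convolution blocks
        self.embed = nn.ModuleList(embed_modules)    # N embedding modules

    def forward(self, i_l: torch.Tensor):
        # Assumed interface: the segmentation branch returns N multi-scale
        # features F_s plus the semantic prediction map I_seg.
        seg_feats, i_seg = self.seg_net(i_l)
        x = self.encoder(i_l)                        # second initial feature map
        x = self.dec_blocks[0](x)                    # 1st convolution block
        for i, embed in enumerate(self.embed):
            x = embed(seg_feats[i], x)               # inject semantic knowledge
            x = self.dec_blocks[i + 1](x)            # (i+1)-th convolution block
        return x, i_seg                              # x is the enhanced image I_out
```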
step 2, learning and training network parameters of the image enhancement processing network model based on the training sample, and stopping when a preset training ending condition is met, so as to obtain a trained image enhancement processing network model;
the loss function for training the image enhancement processing network model is set as:

$$\mathcal{L} = \mathcal{L}_{rec} + \lambda_{SCH}\,\mathcal{L}_{SCH} + \lambda_{SA}\,\mathcal{L}_{SA},$$

where $\mathcal{L}_{rec}$ denotes the reconstruction loss between the predicted enhanced image $I_{out}$ and the label image $I_h$ of the input image $I_l$; $\mathcal{L}_{SCH}$ denotes the semantic-guided color histogram loss, i.e., the L1-norm loss between the histograms of $I_{out}$ and $I_h$; $\lambda_{SCH}$ denotes the weight of $\mathcal{L}_{SCH}$; $\mathcal{L}_{SA}$ denotes the semantic-guided adversarial loss; and $\lambda_{SA}$ denotes the weight of $\mathcal{L}_{SA}$;
and step 3, inputting an image to be enhanced that matches the input of the image enhancement processing network model into the trained model, and obtaining the enhancement result of the image to be enhanced from the output of the last convolution block of the image enhancement network.
Further, in step 2, the semantic-guided adversarial loss $\mathcal{L}_{SA}$ is the sum of a global adversarial loss $\mathcal{L}_{SA}^{global}$ and a local adversarial loss $\mathcal{L}_{SA}^{local}$, obtained as follows:
based on the semantic prediction map $I_{seg}$ output by the semantic segmentation network, the predicted enhanced image $I_{out}$ is partitioned into image blocks, each corresponding to one semantic category, with $P_k$ denoting an arbitrary k-th image block;
the local adversarial loss $\mathcal{L}_{SA}^{local}$ is calculated as

$$\mathcal{L}_{SA}^{local} = \mathbb{E}_{x_r \sim p_{real}}\big[\mathrm{MSE}(D(x_r),\,1)\big] + \mathbb{E}_{x_f \sim p_{fake}}\big[\mathrm{MSE}(D(x_f),\,0)\big],$$

$$x_f = P_t, \qquad D(P_t) = \min_k \{D(P_k)\},$$

where G denotes the generator, i.e., the image enhancement processing network model; D denotes the discriminator and $D(\cdot)$ its output; $x_r$ denotes a real image block and $p_{real}$ the data distribution of real image blocks; $x_f$ denotes a false image block and $p_{fake}$ the data distribution of false image blocks; and $\mathbb{E}_{x_r \sim p_{real}}$ denotes the mathematical expectation over real image blocks;
the input of the prediction head of the semantic segmentation network is recorded as the feature map $I'_{seg}$; the predicted enhanced image $I_{out}$ and the feature map $I'_{seg}$ are concatenated in the channel dimension to form a new false image block $x'_f$, and the global adversarial loss is calculated as

$$\mathcal{L}_{SA}^{global} = \mathbb{E}_{x_r \sim p_{real}}\big[\mathrm{MSE}(D(x_r),\,1)\big] + \mathbb{E}_{x'_f \sim p_{fake}}\big[\mathrm{MSE}(D(x'_f),\,0)\big],$$

where $\mathbb{E}_{x'_f \sim p_{fake}}$ denotes the mathematical expectation over the new false image blocks.
Furthermore, the semantic segmentation network is a pre-trained network and is kept frozen while the network parameters of the image enhancement processing network model are learned; that is, only the network parameters of the image enhancement network and of the N semantic embedding modules are learned and updated.
Further, the network structure of the semantic embedding module is specifically:
the input semantic segmentation features $F_s$ and image enhancement features $F_i$ each pass through a normalization layer and a convolutional layer to obtain a semantic feature map and an image enhancement feature map of consistent dimensions; the two feature maps are flattened in the channel dimension, and an attention map between the two flattened feature maps is computed through a transposed attention mechanism, giving the semantically related attention map A;
the semantically related attention map A is used to adjust the image enhancement features $F_i$, giving the output feature $F_o$ of the semantic embedding module:

$$F_o = \mathrm{FN}\big(W_v(F_i) \times A + F_i\big),$$

where $W_v$ denotes the weights of the value-embedding convolutional layer and $\mathrm{FN}(\cdot)$ denotes the output of the feedforward neural network.
Further, when calculating the semantic-guided color histogram loss $\mathcal{L}_{SCH}$, the histogram is estimated in a differentiable manner, specifically:
based on the semantic prediction map $I_{seg}$ output by the semantic segmentation network, the predicted enhanced image $I_{out}$ and the label image $I_h$ are each partitioned into image blocks, each block corresponding to one semantic category;
the semantic-guided color histograms of the predicted enhanced image $I_{out}$ and of the label image $I_h$ are then estimated separately:
class-edge pixel adjustment is carried out on each color channel of each image block, and high and low anchor values are obtained for the gray value of each pixel of each image block;
for a given pixel gray value in a given color channel of a given semantic class, the high and low anchor values are multiplied by a preset scaling factor and used as inputs to a Sigmoid activation function; the estimated number of pixels at the current gray value of the current semantic class is obtained by accumulating, over all pixels, the difference between the Sigmoid activations of the scaled high and low anchor values; the estimated histogram of the current color channel of the current semantic class is obtained from the pixel-count estimates over all gray values; and the semantic-guided color histogram of the predicted enhanced image $I_{out}$ or of the label image $I_h$ is obtained from the estimated histograms over all color channels.
The technical scheme provided by the invention has at least the following beneficial effects:
By introducing semantic information, the semantic-knowledge-guided low-illumination image enhancement method attends to a problem that previous methods ignore. The invention can be applied to any image enhancement network with an encoder-decoder structure, enabling models that lack semantically related information to learn more knowledge. By combining the semantic-guided embedding module with the semantic-guided color histogram loss and the semantic-guided adversarial loss, the invention attends to semantically related knowledge from several different angles.
The semantic-guided embedding module operates at the feature level: the multi-scale features extracted by the semantic segmentation network (the semantic segmentation features) correspond to the multi-scale features in the decoder of the original low-illumination image enhancement network; deep coding information from semantic segmentation is introduced into the image features through similarity calculation and transformed in the feature representation space, optimizing the output. The semantic-guided color histogram loss operates at the output level: after the semantic segmentation prediction is obtained, the final enhanced image output by the enhancement network is divided by category, color histograms are estimated for the different image blocks, and each is compared with the histogram of the real label image, realizing more accurate color constraints, making the network learn color information related to semantic categories, and preserving the color consistency of the enhanced result. The semantic-guided adversarial loss likewise operates at the output level: the semantic segmentation prediction is used again to combine global and local adversarial losses. In the local adversarial loss, the most false image block is found by comparing the discriminator's outputs over the image blocks, so the generator (i.e., the image enhancement network) focuses on the false parts. In the global adversarial loss, the segmentation result is concatenated with the enhancement result and input to the discriminator, so the discriminator refers to the semantic information when giving its global judgment; together with the local adversarial loss, this constrains the discriminator and the generator, improving the capability of the low-illumination image enhancement network and yielding more real and natural enhancement results.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic diagram of a processing procedure of a low-illumination image enhancement method based on semantic knowledge guidance according to an embodiment of the present invention;
fig. 2 is a schematic diagram of the network structure of the semantic-guided embedding module according to an embodiment of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, the embodiments of the present invention will be described in further detail with reference to the accompanying drawings.
The embodiment of the invention addresses the color deviation and abnormal details that appear in enhanced images when semantic guidance is absent. Its aim is to determine which semantic information can be utilized for image enhancement and how that information can positively affect the low-light enhancement task. A semantic segmentation network produces many multi-scale intermediate-layer features (semantic segmentation features) with different receptive fields and different representational capabilities; these can optimize the intermediate-layer features of the image enhancement network in the representation space. In addition, the prediction of the semantic segmentation network can serve as prior information to guide the enhancement network in learning semantics-related mappings.
Because the intermediate-layer features of the semantic segmentation network differ from those of the image enhancement network, directly splicing or multiplying them degrades the features and harms the enhancement effect. The method therefore proposes a semantic-guided embedding module that establishes cross-modal interaction and embeds the semantic features into the image enhancement features reasonably. Second, for color optimization, the embodiment maintains the color consistency of the output image through a color histogram constraint; but the histogram is a global statistical feature that cannot guarantee local consistency, so its ability to preserve color is limited. The invention therefore proposes the semantic-guided color histogram loss: regions are divided with the help of the semantic segmentation result, and a histogram and a loss are computed for each region, constraining the output colors at the semantic level. Finally, current loss functions do not represent the visual quality of an image well and cannot capture its internal structure, leading to visually poor results. To further improve output quality, researchers have used global and local adversarial training, but randomly selecting local image blocks cannot fully exploit the local adversarial loss. The method therefore proposes a semantic-guided adversarial loss: image blocks corresponding to different categories are obtained from the segmentation result, and the most false block is selected as the local block for parameter updates, strengthening the local loss and improving the quality of the final output image.
As a possible implementation manner, the specific implementation process of the low-illumination image enhancement method based on semantic knowledge guidance provided by the embodiment of the invention includes:
First, a low-illumination image $I_l$ is input into both the image enhancement network and the semantic segmentation network; after multi-layer feature interaction, the enhanced image and the semantic segmentation result $I_{seg}$ are output, and the color histogram loss and the adversarial loss are computed under the guidance of the segmentation result to constrain the training of the image enhancement network, as shown in fig. 1.
In the present method, the low-light image enhancement problem under semantic guidance can be formulated as

$$M = F_{segment}(I_l;\ \theta_s),$$

where $F_{segment}$ is a pre-trained semantic segmentation network, M is the semantic prior information obtained from it, $I_l$ is the input low-light image, and $\theta_s$ are the parameters of the semantic segmentation network. Because it is pre-trained on a large-scale dataset, the semantic segmentation network can provide rich and varied semantic prior information and is referred to in this embodiment as the semantic knowledge base. The semantic prior information and the low-illumination image are then input into the image enhancement network together:
$$I_{out} = F_{enhance}(I_l, M;\ \theta_e),$$

where $F_{enhance}$ is the low-light image enhancement network, $\theta_e$ are the parameters of the image enhancement network, and $I_{out}$ is the output normal-light image, i.e., the predicted enhanced image. In this embodiment, only the parameters of the image enhancement network are updated during training, while the semantic segmentation network is fixed:
$$\theta_e^{*} = \arg\min_{\theta_e} \mathcal{L}\big(F_{enhance}(I_l, M;\ \theta_e),\ I_h\big),$$

where $I_h$ is the normal-light label image corresponding to $I_l$, used as the label to constrain the image enhancement network.
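A minimal sketch of this update policy, assuming PyTorch and the placeholder names seg_net / enh_net for the two branches: the semantic knowledge base is frozen and only $\theta_e$ (including the embedding modules) is optimized.

```python
import torch

# seg_net / enh_net are placeholder names for the two branches; only the
# enhancement parameters theta_e (and the embedding modules) are trained.
seg_net.eval()
for p in seg_net.parameters():
    p.requires_grad_(False)            # theta_s stays fixed (knowledge base)

optimizer = torch.optim.Adam(
    (p for p in enh_net.parameters() if p.requires_grad), lr=1e-4)
```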
In order to solve the influence of the difference between semantic segmentation and image enhancement on feature fusion, reasonable interaction is established between a semantic segmentation network and an image enhancement network through the constructed semantic guidance embedding module. In this embodiment, HRNet (High-Resolution Net) is selected as the semantic knowledge base to provide semantic priori information. In HRNet, multi-scale intermediate layer features, output features, and prediction results are used as semantic information in image enhancement tasks. For better explanation, the number of semantic guidance embedding modules in the present embodiment is set to three as shown in fig. 1.
FIG. 2 shows the network structure of each semantic-guided embedding module. The inputs of a module are semantic segmentation features and image enhancement features; after convolution, layer normalization, and flattening operations they become feature maps of the same dimensions, an attention map is computed, the information contained in the semantic segmentation features is fused into the image enhancement features, and the optimized features are output, realizing the feature interaction. Concretely, the input semantic segmentation features and image enhancement features are preprocessed by a convolutional layer and layer normalization, and the dimensions of the two features are transformed to be consistent, denoted $H \times W \times C$. The features are then flattened in the channel dimension, yielding two $HW \times C$ feature maps. Based on the transposed attention mechanism, which also saves computational resources, the attention map between the two feature maps is computed; the resulting semantically related attention map A is

$$A = \mathrm{Softmax}\!\left(\frac{W_q(F_i)^{T} \times W_k(F_s)}{\sqrt{C}}\right),$$
where $W_k$ and $W_q$ are the key-embedding and query-embedding convolutional layers, $F_i$ and $F_s$ are the image enhancement features and the semantic segmentation features, C is the number of channels, and Softmax is the activation function. The resulting semantically related attention map $A \in \mathbb{R}^{C \times C}$ represents the correlation between $F_i$ and $F_s$; A is then used to adjust $F_i$ as follows:
$$F_o = \mathrm{FN}\big(W_v(F_i) \times A + F_i\big),$$

where $W_v$ is the value-embedding convolutional layer, FN is the feedforward neural network, and $F_o$ is the optimized feature, i.e., the output of the semantic-guided embedding module. The invention thus optimizes the image enhancement features through the semantic segmentation features, so that the image enhancement features attend to semantically related information in the representation subspace.
That is, after the inputs pass through their normalization and convolutional layers, the matrix product $W_v(F_i) \times A$ is computed, passed through a convolutional layer, and added to the image enhancement features $F_i$; the sum is then processed by the feedforward network and added back to the sum, giving the output features of the semantic-guided embedding module.
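The following PyTorch module is a hedged sketch of this transposed-attention design. The use of GroupNorm as the normalization layer, 1×1 convolutions for the query/key/value embeddings, and the two-layer feedforward network are assumptions consistent with the description, not the exact patented architecture.

```python
import torch
import torch.nn as nn

class SemanticEmbedding(nn.Module):
    """Transposed (channel) attention between semantic features F_s and
    image features F_i, following F_o = FN(W_v(F_i) x A + F_i)."""

    def __init__(self, c: int):
        super().__init__()
        self.norm_s = nn.GroupNorm(1, c)   # stand-in for layer normalization
        self.norm_i = nn.GroupNorm(1, c)
        self.w_q = nn.Conv2d(c, c, 1)      # query embedding (assumed on F_i)
        self.w_k = nn.Conv2d(c, c, 1)      # key embedding (assumed on F_s)
        self.w_v = nn.Conv2d(c, c, 1)      # value embedding
        self.proj = nn.Conv2d(c, c, 1)
        self.ffn = nn.Sequential(nn.Conv2d(c, c, 1), nn.GELU(),
                                 nn.Conv2d(c, c, 1))

    def forward(self, f_s: torch.Tensor, f_i: torch.Tensor) -> torch.Tensor:
        b, c, h, w = f_i.shape
        q = self.w_q(self.norm_i(f_i)).flatten(2)    # B x C x HW
        k = self.w_k(self.norm_s(f_s)).flatten(2)    # B x C x HW
        v = self.w_v(f_i).flatten(2)                 # B x C x HW
        # C x C attention map A: channel-wise correlation of F_i and F_s
        attn = torch.softmax(q @ k.transpose(1, 2) / c ** 0.5, dim=-1)
        out = (attn @ v).view(b, c, h, w)            # A applied to W_v(F_i)
        out = self.proj(out) + f_i                   # residual with F_i
        return self.ffn(out) + out                   # feedforward refinement
```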
Color histograms carry important image statistics and are well suited to preserving the color consistency of images. To optimize colors, the learned color histogram can be combined with the image content by an affinity-matrix method; however, the histogram describes global statistical information that differs greatly from the content, and direct fusion harms the recovery of detail textures. Moreover, computing a global histogram ignores the color characteristics of each category, limiting the color optimization capability. The embodiment of the invention therefore proposes the semantic-guided color histogram loss to achieve local color adjustment and improve the color retention capability of the image enhancement framework.
First, the embodiment of the present invention uses the semantic segmentation result to divide the image into image blocks, each of which contains content of only one category. The image block generation process is as follows:
$$P_c = I_{out} \odot I_{seg}^{c}, \qquad P = \{P_0, P_1, \ldots, P_{class}\},$$

where $\odot$ denotes the element-wise (matrix dot) product, $I_{out}$ denotes the output enhancement result (i.e., the predicted enhanced image), $I_{seg}^{c}$ denotes the c-th category prediction output by the semantic segmentation network, $P_c$ denotes the c-th category image block, and P denotes the group of image blocks. Image blocks of each category are thus obtained, as sketched below.
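A minimal sketch of this block-generation step, assuming the segmentation output is a logit tensor of shape B × K × H × W and the enhanced image is B × 3 × H × W:

```python
import torch

def class_tiles(i_out: torch.Tensor, seg_logits: torch.Tensor) -> list:
    """Split the enhanced image into per-class blocks P_c = I_out * M_c,
    where M_c is the one-hot mask of class c from the segmentation map."""
    labels = seg_logits.argmax(dim=1, keepdim=True)   # B x 1 x H x W
    tiles = []
    for c in range(seg_logits.shape[1]):
        mask = (labels == c).float()                  # content of one class only
        tiles.append(i_out * mask)                    # element-wise product
    return tiles                                      # P = {P_0, ..., P_class}
```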
Since color histograms are discrete statistical features, the embodiment of the present invention estimates the histogram in a differentiable manner so that it can be used for model training. Considering errors in the semantic segmentation result, pixels at class edges are ignored during calculation to reduce the influence of segmentation errors on training; each image block of the group P is adjusted by this class-edge pixel adjustment, giving the adjusted image block group P'. For convenience, the histogram estimation process is illustrated with the R channel of the adjusted c-th class image block, $P_{c'}(R)$:

$$a^{h}_{j,i} = x_j - (i - 0.5), \qquad a^{l}_{j,i} = x_j - (i + 0.5),$$

where $x_j$ denotes the j-th pixel of $P_{c'}(R)$ and $i \in [0, 255]$ denotes the pixel gray value; $a^{h}_{j,i}$ and $a^{l}_{j,i}$ denote the high anchor value and the low anchor value, features of the current pixel used in the subsequent calculation:
$$H_c(i) = \sum_{j}\Big(\mathrm{Sigmoid}\big(\alpha\,a^{h}_{j,i}\big) - \mathrm{Sigmoid}\big(\alpha\,a^{l}_{j,i}\big)\Big),$$

where $H_c$ denotes the differentiable histogram estimate of $P_{c'}(R)$ and $H_c(i)$ denotes the estimated number of pixels with gray value i; $\alpha$ is a scaling factor, set to 400 in this embodiment. The two anchor values are scaled and passed through the Sigmoid activation function, and the difference of the outputs is pixel $x_j$'s contribution to the pixel-count estimate: the closer $x_j$ is to i, the greater the difference, and when $x_j$ equals i exactly the difference is 1, i.e., one full pixel is contributed. Finally, the $l_1$ loss is used as the constraint on the estimated color histogram; the semantic-guided color histogram loss is therefore:
$$\mathcal{L}_{SCH} = \sum_{c}\big\|\,H_c(I_{out}) - H_c(I_h)\,\big\|_1,$$

where $I_{out}$ and $I_h$ denote the output image and the real label image respectively, $H_c(\cdot)$ denotes the histogram estimation process applied to each color channel, and the sum runs over the semantic categories.
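The differentiable histogram and the resulting loss could be sketched as follows; the half-unit anchor offsets, the [0, 1] input range, and the omission of the class-edge/masked-pixel exclusion are assumptions made for brevity, not the exact patented procedure.

```python
import torch

def soft_histogram(x: torch.Tensor, bins: int = 256,
                   alpha: float = 400.0) -> torch.Tensor:
    """Differentiable histogram of one channel of one image block. Each
    pixel contributes sigmoid(alpha*high) - sigmoid(alpha*low), which
    approaches 1 when the pixel value equals the bin center i."""
    x = x.reshape(-1, 1) * 255.0                  # assumes input in [0, 1]
    centers = torch.arange(bins, device=x.device).view(1, -1)
    high = x - (centers - 0.5)                    # high anchor value
    low = x - (centers + 0.5)                     # low anchor value
    contrib = torch.sigmoid(alpha * high) - torch.sigmoid(alpha * low)
    return contrib.sum(dim=0)                     # estimated pixel counts

def sch_loss(tiles_out: list, tiles_gt: list) -> torch.Tensor:
    """Semantic-guided color histogram loss: l1 distance between per-class,
    per-channel estimated histograms of output and label image blocks.
    (In the method, class-edge pixels would be excluded first.)"""
    loss = 0.0
    for p_out, p_gt in zip(tiles_out, tiles_gt):
        for ch in range(p_out.shape[1]):          # R, G, B channels
            loss = loss + torch.abs(
                soft_histogram(p_out[:, ch]) - soft_histogram(p_gt[:, ch])
            ).sum()
    return loss
```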
In the image completion task, global and local discriminators have been used to obtain more realistic completion results. In the low-illumination image enhancement task, semantic information can be introduced to guide the discriminator to attend to regions of interest. To this end, the embodiment of the present invention optimizes the global and local adversarial losses by introducing the semantic prediction map $I_{seg}$ and the image block group P' into the loss computation, yielding the proposed semantic-guided adversarial loss.
For the local adversarial loss, the image block group P' is first taken as the set of candidate false image blocks. The candidate blocks are input into the discriminator to obtain discrimination results (the probability that each block comes from a label image); the block with the smallest output is regarded as the most false part, and the gradients from that output are used to update the parameters of the discriminator and the generator, so that the discriminator exploits the semantic prior to find false target regions. Real image blocks are obtained from the dataset by random cropping, so the local adversarial loss can be described as:
$$\mathcal{L}_{SA}^{local} = \mathbb{E}_{x_r \sim p_{real}}\big[\mathrm{MSE}(D(x_r),\,1)\big] + \mathbb{E}_{x_f \sim p_{fake}}\big[\mathrm{MSE}(D(x_f),\,0)\big],$$

$$x_f = P_t, \qquad D(P_t) = \min\{D(P_0), \ldots, D(P_{class})\},$$

where $\mathrm{MSE}(\cdot)$ denotes the mean square error, $P_t$ denotes the selected candidate false image block, $x_r$ denotes a real image block, and $x_f$ denotes the false image block.
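A sketch of this selection rule, assuming an LSGAN-style objective (consistent with the MSE formulation above) and a discriminator that returns a score tensor per block:

```python
import torch
import torch.nn.functional as F

def local_adversarial_loss(tiles: list, real_block: torch.Tensor, disc):
    """Pick the most false class block P_t (lowest discriminator score) and
    apply an MSE adversarial objective; returns the discriminator-side and
    generator-side loss terms."""
    scores = torch.stack([disc(p).mean() for p in tiles])   # D(P_0..P_class)
    t = int(torch.argmin(scores))                           # most false block
    d_fake = disc(tiles[t])                                 # D(x_f), x_f = P_t
    d_real = disc(real_block)                               # D(x_r), random crop
    loss_d = (F.mse_loss(d_real, torch.ones_like(d_real)) +
              F.mse_loss(d_fake.detach(), torch.zeros_like(d_fake)))
    loss_g = F.mse_loss(d_fake, torch.ones_like(d_fake))    # generator fools D
    return loss_d, loss_g
```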
For the global adversarial loss, the embodiment of the present invention adopts a simple design to realize the semantic guidance: $I_{out}$ (the output of the (N+1)-th convolution block) and $I'_{seg}$ are concatenated in the channel dimension as the new $x'_f$, where $I'_{seg}$ is the feature map before the last Softmax activation of the semantic segmentation network, i.e., the input features of its prediction head. Real images are still randomly sampled, so the final global adversarial loss can be described as:

$$\mathcal{L}_{SA}^{global} = \mathbb{E}_{x_r \sim p_{real}}\big[\mathrm{MSE}(D(x_r),\,1)\big] + \mathbb{E}_{x'_f \sim p_{fake}}\big[\mathrm{MSE}(D(x'_f),\,0)\big].$$
That is, the semantic-guided adversarial loss of the embodiment of the present invention can be described as:

$$\mathcal{L}_{SA} = \mathcal{L}_{SA}^{global} + \mathcal{L}_{SA}^{local}.$$
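The global branch just described could be sketched as follows; concatenating the real image with its own segmentation features for channel parity is an assumption, since the text only states that real images are randomly sampled.

```python
import torch
import torch.nn.functional as F

def global_adversarial_loss(i_out, i_seg_feat, real_img, real_seg_feat, disc_g):
    """Concatenate the enhanced image with the pre-Softmax segmentation
    features I'_seg along the channel dimension to build the new false
    sample x'_f; the real sample gets the same treatment (assumed)."""
    x_f = torch.cat([i_out, i_seg_feat], dim=1)       # new false image block
    x_r = torch.cat([real_img, real_seg_feat], dim=1)
    d_fake, d_real = disc_g(x_f), disc_g(x_r)
    loss_d = (F.mse_loss(d_real, torch.ones_like(d_real)) +
              F.mse_loss(d_fake.detach(), torch.zeros_like(d_fake)))
    loss_g = F.mse_loss(d_fake, torch.ones_like(d_fake))
    return loss_d, loss_g
```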
Meanwhile, the embodiment of the invention also retains the original loss function of the enhancement network, denoted $\mathcal{L}_{rec}$: the reconstruction loss between the predicted enhanced image $I_{out}$ and the label image $I_h$ of the input image $I_l$. In general, this reconstruction loss may be a first-order differential loss, a mean square error loss, or a structural similarity loss.
In summary, in the embodiment of the present invention, the loss function of semantic-knowledge-guided low-light image enhancement can be described as:

$$\mathcal{L} = \mathcal{L}_{rec} + \lambda_{SCH}\,\mathcal{L}_{SCH} + \lambda_{SA}\,\mathcal{L}_{SA},$$

where $\lambda_{SCH}$ and $\lambda_{SA}$ are weights balancing the loss terms, set empirically.
The image enhancement processing network model is trained on the total loss $\mathcal{L}$; training stops when the loss value converges and remains stable, yielding the trained model. The enhancement result of an image to be enhanced (a low-illumination image) is then obtained from the output of the trained image enhancement processing network model.
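Putting the pieces together, a generator-side training step under the total loss might look like the following sketch; the weight values are placeholders (the patent only says they are set empirically), and sch_loss, class_tiles, and the adversarial helpers refer to the sketches above.

```python
import torch

LAMBDA_SCH, LAMBDA_SA = 0.1, 0.01    # placeholder weights, tuned empirically

def generator_step(i_out, i_seg, i_h, lg_local, lg_global, opt_g):
    """Combine L = L_rec + lambda_SCH * L_SCH + lambda_SA * L_SA and update
    theta_e; lg_local / lg_global are the generator-side adversarial terms
    returned by the helper sketches above."""
    l_rec = torch.abs(i_out - i_h).mean()            # e.g. an L1 reconstruction
    l_sch = sch_loss(class_tiles(i_out, i_seg),
                     class_tiles(i_h, i_seg))        # semantic-guided histograms
    loss = l_rec + LAMBDA_SCH * l_sch + LAMBDA_SA * (lg_local + lg_global)
    opt_g.zero_grad()
    loss.backward()
    opt_g.step()
    return float(loss)
```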
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.
What has been described above is merely some embodiments of the present invention. It will be apparent to those skilled in the art that various modifications and improvements can be made without departing from the spirit of the invention.

Claims (5)

1. A low-illumination image enhancement method based on semantic knowledge guidance, characterized by comprising the following steps:
step 1, constructing an image enhancement processing network model;
the image enhancement processing network model comprises two branches, wherein one branch is a semantic segmentation network, the other branch is an image enhancement network, and N semantic embedding modules are arranged between the two branches, wherein N is more than or equal to 2;
the semantic segmentation network sequentially comprises a first encoder, a first decoder, and a prediction head, wherein the first encoder performs feature extraction on an input image $I_l$ to obtain a first initial feature map of the input image $I_l$, the input image $I_l$ being a low-illumination image;
the first decoder performs multi-scale decoding on the first initial feature map to obtain deep feature maps at different scales, namely the semantic segmentation features $F_s$, the number of scales M of the semantic segmentation features being larger than N;
the prediction head performs pixel-level semantic category prediction on the multi-scale semantic segmentation features $F_s$ output by the first decoder and outputs the semantic prediction map $I_{seg}$ of the input image $I_l$;
semantic segmentation features at N consecutive scales are selected from the M scales of deep feature maps, each serving as one of the two inputs of a semantic embedding module, and the N semantic embedding modules are numbered the 1st to the N-th in order of increasing scale;
the image enhancement network comprises a second encoder and a second decoder, wherein the second encoder performs feature extraction on the low-illumination input image $I_l$ to obtain a second initial feature map of the input image $I_l$; the second decoder comprises N+1 convolution blocks, each of which upsamples its input and outputs image enhancement features $F_i$ at a different scale, the output of the last convolution block being the predicted enhanced image $I_{out}$ of the input image $I_l$; the input of the 1st convolution block is the second initial feature map, the output of the 1st convolution block serves as the other input of the 1st semantic embedding module, and the output of the i-th semantic embedding module serves as the input of the (i+1)-th convolution block, where i = 1, ..., N;
step 2, learning and training network parameters of the image enhancement processing network model based on the training sample, and stopping when a preset training ending condition is met, so as to obtain a trained image enhancement processing network model;
the loss function for training the image enhancement processing network model is set as:

$$\mathcal{L} = \mathcal{L}_{rec} + \lambda_{SCH}\,\mathcal{L}_{SCH} + \lambda_{SA}\,\mathcal{L}_{SA},$$

where $\mathcal{L}_{rec}$ denotes the reconstruction loss between the predicted enhanced image $I_{out}$ and the label image $I_h$ of the input image $I_l$; $\mathcal{L}_{SCH}$ denotes the semantic-guided color histogram loss, i.e., the L1-norm loss between the histograms of $I_{out}$ and $I_h$; $\lambda_{SCH}$ denotes the weight of $\mathcal{L}_{SCH}$; $\mathcal{L}_{SA}$ denotes the semantic-guided adversarial loss; and $\lambda_{SA}$ denotes the weight of $\mathcal{L}_{SA}$;
and step 3, inputting an image to be enhanced that matches the input of the image enhancement processing network model into the trained model, and obtaining the enhancement result of the image to be enhanced from the output of the last convolution block of the image enhancement network.
2. The method of claim 1, wherein in step 2 the semantic-guided adversarial loss $\mathcal{L}_{SA}$ is the sum of a global adversarial loss $\mathcal{L}_{SA}^{global}$ and a local adversarial loss $\mathcal{L}_{SA}^{local}$, obtained by introducing a discriminator during training as follows:
based on the semantic prediction map $I_{seg}$ output by the semantic segmentation network, the predicted enhanced image $I_{out}$ is partitioned into image blocks, each corresponding to one semantic category, with $P_k$ denoting an arbitrary k-th image block;
the local adversarial loss $\mathcal{L}_{SA}^{local}$ is calculated as

$$\mathcal{L}_{SA}^{local} = \mathbb{E}_{x_r \sim p_{real}}\big[\mathrm{MSE}(D(x_r),\,1)\big] + \mathbb{E}_{x_f \sim p_{fake}}\big[\mathrm{MSE}(D(x_f),\,0)\big],$$

$$x_f = P_t, \qquad D(P_t) = \min_k \{D(P_k)\},$$

where G denotes the generator, i.e., the image enhancement processing network model; D denotes the discriminator and $D(\cdot)$ its output; $x_r$ denotes a real image block and $p_{real}$ the data distribution of real image blocks; $x_f$ denotes a false image block and $p_{fake}$ the data distribution of false image blocks; and $\mathbb{E}_{x_r \sim p_{real}}$ denotes the mathematical expectation over real image blocks;
the input of the prediction head of the semantic segmentation network is recorded as the feature map $I'_{seg}$; the predicted enhanced image $I_{out}$ and the feature map $I'_{seg}$ are concatenated in the channel dimension to form a new false image block $x'_f$, and the global adversarial loss is calculated as

$$\mathcal{L}_{SA}^{global} = \mathbb{E}_{x_r \sim p_{real}}\big[\mathrm{MSE}(D(x_r),\,1)\big] + \mathbb{E}_{x'_f \sim p_{fake}}\big[\mathrm{MSE}(D(x'_f),\,0)\big],$$

where $\mathbb{E}_{x'_f \sim p_{fake}}$ denotes the mathematical expectation over the new false image blocks.
3. The method of claim 1, wherein the semantic segmentation network is a pre-trained network that remains unchanged while the network parameters of the image enhancement processing network model are learned and trained.
4. The method according to claim 1, wherein the network structure of the semantic embedding module is specifically:
the input semantic segmentation features $F_s$ and image enhancement features $F_i$ each pass through a normalization layer and a convolutional layer to obtain a semantic feature map and an image enhancement feature map of consistent dimensions; the two feature maps are flattened in the channel dimension, and an attention map between the two flattened feature maps is computed through a transposed attention mechanism, giving the semantically related attention map A;
the semantically related attention map A is used to adjust the image enhancement features $F_i$, giving the output feature $F_o$ of the semantic embedding module:

$$F_o = \mathrm{FN}\big(W_v(F_i) \times A + F_i\big),$$

where $W_v$ denotes the weights of the value-embedding convolutional layer and $\mathrm{FN}(\cdot)$ denotes the output of the feedforward neural network.
5. The method of claim 1, wherein, when calculating the semantic-guided color histogram loss $\mathcal{L}_{SCH}$, the histogram is estimated in a differentiable manner, specifically:
based on the semantic prediction map $I_{seg}$ output by the semantic segmentation network, the predicted enhanced image $I_{out}$ and the label image $I_h$ are each partitioned into image blocks, each block corresponding to one semantic category;
the semantic-guided color histograms of the predicted enhanced image $I_{out}$ and of the label image $I_h$ are then estimated separately:
class-edge pixel adjustment is carried out on each color channel of each image block, and high and low anchor values are obtained for the gray value of each pixel of each image block;
for a given pixel gray value in a given color channel of a given semantic class, the high and low anchor values are multiplied by a preset scaling factor and used as inputs to a Sigmoid activation function; the estimated number of pixels at the current gray value of the current semantic class is obtained by accumulating, over all pixels, the difference between the Sigmoid activations of the scaled high and low anchor values; the estimated histogram of the current color channel of the current semantic class is obtained from the pixel-count estimates over all gray values; and the semantic-guided color histogram of the predicted enhanced image $I_{out}$ or of the label image $I_h$ is obtained from the estimated histograms over all color channels.
CN202310277679.8A (filed 2023-03-21, priority date 2023-03-21) Low-illumination image enhancement method based on semantic knowledge guidance · Status: Pending · Published as CN116452472A

Priority Applications (1)

Application Number: CN202310277679.8A · Priority/Filing Date: 2023-03-21 · Title: Low-illumination image enhancement method based on semantic knowledge guidance (published as CN116452472A)


Publications (1)

Publication Number Publication Date
CN116452472A · 2023-07-18

Family

ID=87119327

Family Applications (1)

Application: CN202310277679.8A · Filed: 2023-03-21 · Title: Low-illumination image enhancement method based on semantic knowledge guidance · Status: Pending

Country Status (1)

Country Link
CN (1) CN116452472A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117853348A * 2024-03-07 2024-04-09 China University of Petroleum (East China) Underwater image enhancement method based on semantic perception



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination