CN111507910A - Single image reflection removing method and device and storage medium - Google Patents

Single image reflection removing method and device and storage medium

Info

Publication number
CN111507910A
CN111507910A
Authority
CN
China
Prior art keywords
image
reflection
loss function
network
background image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010193974.1A
Other languages
Chinese (zh)
Other versions
CN111507910B (en)
Inventor
田治仁
张贵峰
李锐海
廖永力
张巍
龚博
王俊锞
黄增浩
朱登杰
何锦强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Research Institute of Southern Power Grid Co Ltd
Original Assignee
Research Institute of Southern Power Grid Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Research Institute of Southern Power Grid Co Ltd filed Critical Research Institute of Southern Power Grid Co Ltd
Priority to CN202010193974.1A priority Critical patent/CN111507910B/en
Publication of CN111507910A publication Critical patent/CN111507910A/en
Application granted granted Critical
Publication of CN111507910B publication Critical patent/CN111507910B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/90 Dynamic range modification of images or parts thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/10 Internal combustion engine [ICE] based vehicles
    • Y02T10/40 Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a single image reflection removal method, device and storage medium. The method comprises the following steps: capturing a background image and a corresponding reflection image manually, and superimposing the background image and the reflection image to obtain a reflective image; inputting the reflective image into a pre-trained VGG-19 network for hypercolumn feature extraction to obtain a feature set; inputting the feature set into a preset generation network to obtain a predicted background image and a predicted reflection image; inputting the predicted background image and the background image into a preset discrimination network to compute the discrimination loss function of the discrimination network; training the generation network and the discrimination network through repeated iterations until the joint loss function of the generation network and the discrimination loss function both converge; and selecting several reflective images for reflection removal so as to quantitatively evaluate the reflection removal effect. The method extracts high-level perceptual information from the image and incorporates it into the training of the generative adversarial network, thereby effectively solving the single image reflection removal problem.

Description

Single image reflection removing method and device and storage medium
Technical Field
The invention relates to the technical field of image processing, in particular to a method and a device for removing reflection of a single image and a storage medium.
Background
Reflection removal from a single image usually relies on predetermined prior information. The most common approach exploits the sparsity of natural image gradients to separate the image layers by finding minimal edges and corners, for example by combining a gradient-sparsity constraint with a data fidelity term in the Laplacian domain to suppress image reflections. However, this approach relies on low-level heuristics and breaks down in situations that require high-level analysis of the image content. Another prior assumes that the reflection layer is generally out of focus and smooth; algorithms based on this assumption fail when the reflection itself has strong contrast. None of these methods can effectively exploit the high-level perceptual information of the image or remove high-contrast reflections.
Disclosure of Invention
The embodiments of the present invention aim to provide a single image reflection removal method, device and storage medium that effectively extract high-level perceptual information from an image, incorporate that information into network training, and exploit the advantages of generative adversarial networks, thereby effectively solving the single image reflection removal problem and achieving a satisfactory result even on high-contrast reflective images.
In order to achieve the above object, an embodiment of the present invention provides a single image reflection removal method, comprising the following steps:
capturing a background image and a corresponding reflection image manually, and superimposing the background image and the reflection image to obtain a reflective image;
inputting the reflective image into a pre-trained VGG-19 network for hypercolumn feature extraction to obtain a feature set;
inputting the feature set into a preset generation network to obtain a predicted background image and a predicted reflection image, wherein the joint loss function of the generation network comprises a reconstruction loss function over the hypercolumn feature space, an adversarial loss function and a separation loss function;
inputting the predicted background image and the background image into a preset discrimination network to compute the discrimination loss function of the discrimination network;
training the generation network and the discrimination network through repeated iterations until the joint loss function and the discrimination loss function both converge; and
selecting several reflective images for reflection removal so as to quantitatively evaluate the reflection removal effect.
Preferably, obtaining the reflective image by superimposing the background image and the reflection image specifically comprises:
acquiring a first gray value of the background image;
acquiring a second gray value of the reflection image; and
weighting the first gray value and the second gray value to obtain the reflective image.
Preferably, the convolutional layers of the VGG-19 network comprise conv1_2, conv2_2, conv3_2, conv4_2 and conv5_2.
Preferably, the generation network comprises an input layer with a 1 × 1 convolution kernel and 8 dilated convolution layers with 3 × 3 convolution kernels, wherein the last dilated convolution layer generates two three-channel RGB images through a linear transformation.
Preferably, the joint loss function of the generation network comprises a reconstruction loss function over the hypercolumn feature space, an adversarial loss function and a separation loss function, specifically:
the reconstruction loss function over the hypercolumn feature space is expressed as
$L_{feat}(\theta)=\sum_{I\in\Omega}\sum_{l}\lambda_{l}\,\big\|\Phi_{l}(T)-\Phi_{l}\big(f_{T}(I;\theta)\big)\big\|_{1}$
wherein $L_{feat}(\theta)$ is the reconstruction loss function over the hypercolumn feature space; $I$, $T$ and $f_{T}(I;\theta)$ are the reflective image, the background image and the predicted background image, respectively; $\lambda_{l}$ is the influence weight of the $l$-th convolutional layer; $\Omega$ is the set of training images; $\|\cdot\|_{1}$ denotes the 1-norm of the vector output by the network convolution, i.e. the sum of the absolute values of its elements; $\Phi_{l}(x)$ denotes the convolution output of the $l$-th selected convolutional layer of the VGG-19 network; and $\theta$ denotes the parameters of the generation network;
the adversarial loss function is expressed as
$L_{adv}(\theta)=-\sum_{I\in\Omega}\log D\big(I,f_{T}(I;\theta)\big)$
wherein $L_{adv}(\theta)$ is the adversarial loss function, and $D(I,x)$, obtained from the output of the discrimination network, denotes the probability that $x$ is the background image corresponding to the reflective image $I$;
the separation loss function is expressed as
$L_{excl}(\theta)=\sum_{I\in\Omega}\sum_{n=1}^{N}\big\|\Psi\big(f_{T}^{n}(I;\theta),f_{R}^{n}(I;\theta)\big)\big\|_{F}$
with
$\Psi(f_{T},f_{R})=\tanh\big(\lambda_{T}\,|\nabla f_{T}|\big)\odot\tanh\big(\lambda_{R}\,|\nabla f_{R}|\big)$
wherein $L_{excl}(\theta)$ is the separation loss function; $\lambda_{T}$ and $\lambda_{R}$ are the first and second normalization parameters, respectively; $\|\cdot\|_{F}$ is the Frobenius norm; $\odot$ denotes element-wise multiplication; $n$ is the image down-sampling level, $1\le n\le N$, with $N$ the maximum down-sampling level; $f_{R}(I;\theta)$ is the predicted reflection image; $|\nabla f_{T}|$ is the modulus of the gradient of the predicted background image; and $|\nabla f_{R}|$ is the modulus of the gradient of the predicted reflection image;
the joint loss function of the generation network is $L(\theta)=w_{1}L_{feat}(\theta)+w_{2}L_{adv}(\theta)+w_{3}L_{excl}(\theta)$, wherein $L(\theta)$ is the joint loss function, and $w_{1}$, $w_{2}$ and $w_{3}$ are the coefficients of the reconstruction loss function over the hypercolumn feature space, the adversarial loss function and the separation loss function, respectively.
Preferably, the discrimination loss function of the discrimination network is $L_{disc}(\theta)=\log D\big(I,f_{T}(I;\theta)\big)-\log D(I,T)$, wherein $L_{disc}(\theta)$ is the discrimination loss function.
Preferably, selecting several reflective images for reflection removal so as to quantitatively evaluate the reflection removal effect specifically comprises:
selecting several reflective images for reflection removal, and calculating the peak signal-to-noise ratio and the structural similarity between the predicted background image generated by the generation network and the background image, so as to quantitatively evaluate the reflection removal effect.
Another embodiment of the present invention provides a single image reflection removal apparatus, comprising:
an image set acquisition module, configured to capture a background image and a corresponding reflection image manually, and superimpose the background image and the reflection image to obtain a reflective image;
a feature extraction module, configured to input the reflective image into a pre-trained VGG-19 network for hypercolumn feature extraction to obtain a feature set;
a prediction generation module, configured to input the feature set into a preset generation network to obtain a predicted background image and a predicted reflection image, wherein the joint loss function of the generation network comprises a reconstruction loss function over the hypercolumn feature space, an adversarial loss function and a separation loss function;
a discrimination module, configured to input the predicted background image and the background image into a preset discrimination network to compute the discrimination loss function of the discrimination network;
a training module, configured to train the generation network and the discrimination network through repeated iterations until the joint loss function and the discrimination loss function both converge; and
an evaluation module, configured to select several reflective images for reflection removal so as to quantitatively evaluate the reflection removal effect.
The invention correspondingly provides an apparatus using the single image reflection removal method, comprising a processor, a memory, and a computer program stored in the memory and configured to be executed by the processor, wherein the processor, when executing the computer program, implements the single image reflection removal method according to any one of the above.
Another embodiment of the present invention provides a computer-readable storage medium comprising a stored computer program, wherein, when the computer program runs, the apparatus on which the computer-readable storage medium resides is controlled to execute the single image reflection removal method according to any one of the above.
Compared with the prior art, the single image reflection removal method, apparatus and storage medium provided by the embodiments of the present invention effectively extract high-level perceptual information from the image by means of a deep convolutional neural network and, combined with the optimization properties of a generative adversarial network, obtain a predicted background image closer to the real background image, thereby effectively solving the reflection problem in image acquisition and achieving a satisfactory reflection removal effect even on high-contrast reflective images.
Drawings
FIG. 1 is a schematic flow chart of a single image reflection removal method according to an embodiment of the present invention;
FIG. 2 is a simplified flow diagram of the single image reflection removal method according to an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of a generation network according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of a discrimination network according to an embodiment of the present invention;
FIG. 5 is a reflection removal comparison over 4 sets of reflective images, background images, predicted background images and reflection images according to an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of a single image reflection removal apparatus according to an embodiment of the present invention;
FIG. 7 is a schematic structural diagram of an apparatus using the single image reflection removal method according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to FIG. 1, which is a schematic flow chart of a single image reflection removal method according to an embodiment of the present invention, the method comprises steps S1 to S6:
S1, capturing a background image and a corresponding reflection image manually, and superimposing the background image and the reflection image to obtain a reflective image;
S2, inputting the reflective image into a pre-trained VGG-19 network for hypercolumn feature extraction to obtain a feature set;
S3, inputting the feature set into a preset generation network to obtain a predicted background image and a predicted reflection image, wherein the joint loss function of the generation network comprises a reconstruction loss function over the hypercolumn feature space, an adversarial loss function and a separation loss function;
S4, inputting the predicted background image and the background image into a preset discrimination network to compute the discrimination loss function of the discrimination network;
S5, training the generation network and the discrimination network through repeated iterations until the joint loss function and the discrimination loss function both converge;
S6, selecting several reflective images for reflection removal so as to quantitatively evaluate the reflection removal effect.
Specifically, a background image and a corresponding reflection image are captured manually, and a reflective image is obtained by superimposing the two. Because reflection-free originals are difficult to obtain in real life, the background image is produced artificially as follows: an indoor scene is chosen as the background, with the target object placed on one side of a transparent glass pane (preferably the darker side) and the camera lens on the other side. The positions of the object and the lens are then fixed, and a reflection-free background image is captured. An outdoor scene may be chosen as the reflection image. After the background image and the reflection image are obtained, both are resized to the same size H × W × 3 and then superimposed to obtain the reflective image. The final data set contains 2000 reflective images together with their corresponding background and reflection images.
The reflective image is input into the pre-trained VGG-19 network for hypercolumn feature extraction to obtain a feature set. The hypercolumn features have 1472 dimensions in total; the three channels of the reflective image are then concatenated with the hypercolumn features to form a 1475-dimensional feature set, denoted Φ(x), where x is the input reflective image.
The feature set is input into the preset generation network to obtain a predicted background image and a predicted reflection image; the joint loss function of the generation network comprises a reconstruction loss function over the hypercolumn feature space, an adversarial loss function and a separation loss function.
and inputting the predicted background image and the background image into a preset identification network to calculate and obtain an identification loss function of the identification network. The authentication network is introduced to judge the two input images and output the probability that the two images are derived from the data set.
Training of the generation network and the discrimination network is completed through repeated iterations until the joint loss function and the discrimination loss function both converge. During training, the output probability of the discrimination network feeds into the joint loss function of the generation network, thereby optimizing the generation network. In general, when the output probability of the discrimination network reaches 0.5, both functions have converged. Preferably, the training parameters are: max_epoch = 250 and batch_size = 1; the optimizer is the Adam algorithm with a learning rate of 10^-4.
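For illustration only, this alternating training procedure can be sketched as the loop below. It is a minimal sketch under stated assumptions: the names generator, discriminator, hypercolumn_features, loader, phi, lambdas and the loss helpers are hypothetical and refer to the illustrative sketches given later in this description, not to anything fixed by the patent itself.

```python
import torch

# Hypothetical alternating training loop: Adam, lr 1e-4,
# max_epoch = 250, batch_size = 1, as stated above.
g_opt = torch.optim.Adam(generator.parameters(), lr=1e-4)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=1e-4)

for epoch in range(250):
    for reflective, background in loader:  # batch_size = 1
        pred_bg, pred_rf = generator(hypercolumn_features(reflective))

        # Discriminator step: minimize L_disc (prediction detached inside)
        d_opt.zero_grad()
        disc_loss(discriminator, reflective, pred_bg, background).backward()
        d_opt.step()

        # Generator step: minimize the joint loss L(theta)
        g_opt.zero_grad()
        l_feat = reconstruction_loss(pred_bg, background, phi, lambdas)
        l_adv = adversarial_loss(discriminator, reflective, pred_bg)
        l_excl = exclusion_loss(pred_bg, pred_rf)
        joint_loss(l_feat, l_adv, l_excl).backward()
        g_opt.step()
```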
To assess the merits of the method, several reflective images are selected for reflection removal so that the reflection removal effect can be quantitatively evaluated.
To more clearly understand the implementation of the method of the present invention, refer to FIG. 2, a simplified flow diagram of the single image reflection removal method according to this embodiment of the present invention.
The single image reflection removal method provided by embodiment 1 of the present invention effectively extracts high-level perceptual information from the image by means of a deep convolutional neural network and, combined with the optimization properties of a generative adversarial network, obtains a predicted background image closer to the real background image, thereby effectively solving the reflection problem in image acquisition while remaining satisfactory on high-contrast reflective images.
As an improvement of the above scheme, obtaining the reflective image by superimposing the background image and the reflection image specifically comprises:
acquiring a first gray value of the background image;
acquiring a second gray value of the reflection image; and
weighting the first gray value and the second gray value to obtain the reflective image.
That is, the reflective image is obtained by acquiring the first gray value of the background image and the second gray value of the reflection image and computing their weighted sum, expressed mathematically as $I=(1-\alpha)T+\alpha R$, wherein $I$ is the reflective image, $T$ is the background image, $R$ is the reflection image, and $\alpha\in[0,1]$ is the weighting parameter of the reflection image; preferably, $\alpha=0.5$.
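As an illustration of this weighting, a minimal NumPy sketch follows; the function name and the final clipping step are illustrative assumptions, not part of the patent text.

```python
import numpy as np

def make_reflective_image(background: np.ndarray, reflection: np.ndarray,
                          alpha: float = 0.5) -> np.ndarray:
    """I = (1 - alpha) * T + alpha * R for arrays in [0, 1] that were
    already resized to the same H x W x 3 shape."""
    assert background.shape == reflection.shape
    blended = (1.0 - alpha) * background + alpha * reflection
    return np.clip(blended, 0.0, 1.0)  # keep the result a valid image
```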
As an improvement of the above scheme, the convolutional layers of the VGG-19 network comprise conv1_2, conv2_2, conv3_2, conv4_2 and conv5_2.
Specifically, the selected convolutional layers of the VGG-19 network are conv1_2, conv2_2, conv3_2, conv4_2 and conv5_2. The VGG-19 network is pre-trained on ImageNet and is used to extract hypercolumn features from the input image; the advantage of hypercolumn features is that the input gains useful features abstracting the visual perception of a large data set (e.g., ImageNet). The hypercolumn feature of a given pixel location is the stack of activations at that location across the selected convolutional layers of the network.
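A minimal PyTorch sketch of this extraction follows, assuming torchvision's VGG-19 and its layer indexing; the index list and the bilinear upsampling back to input resolution are implementation assumptions consistent with the 1475-dimensional feature set described above.

```python
import torch
import torch.nn.functional as F
from torchvision.models import vgg19

# Assumed indices of conv1_2, conv2_2, conv3_2, conv4_2, conv5_2 in
# torchvision's VGG-19 `features` Sequential.
LAYER_IDS = (2, 7, 12, 21, 30)  # channels 64+128+256+512+512 = 1472

def hypercolumn_features(image: torch.Tensor) -> torch.Tensor:
    """image: 1 x 3 x H x W in [0, 1]. Returns a 1 x 1475 x H x W stack:
    the 3 input channels concatenated with 1472 hypercolumn channels."""
    vgg = vgg19(weights="IMAGENET1K_V1").features.eval()  # build once in practice
    h, w = image.shape[2:]
    feats, x = [image], image
    with torch.no_grad():
        for i, layer in enumerate(vgg):
            x = layer(x)
            if i in LAYER_IDS:
                # Upsample each activation map back to the input size so
                # the per-pixel stacks line up.
                feats.append(F.interpolate(x, size=(h, w), mode="bilinear",
                                           align_corners=False))
    return torch.cat(feats, dim=1)
```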
As an improvement of the above scheme, the generation network comprises an input layer with a 1 × 1 convolution kernel and 8 dilated convolution layers with 3 × 3 convolution kernels, wherein the last dilated convolution layer generates two three-channel RGB images through a linear transformation.
Specifically, referring to FIG. 3, which is a schematic structural diagram of the generation network according to this embodiment of the present invention, the generation network comprises an input layer with a 1 × 1 convolution kernel and 8 dilated convolution layers with 3 × 3 convolution kernels, where the last dilated convolution layer generates two three-channel RGB maps through a linear transformation. The input layer reduces the 1475-dimensional features output by the VGG-19 network to 64 dimensions, the dilation rates of the 8 dilated convolution layers range from 1 to 128, and all intermediate layers of the generation network have 64 feature channels.
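A minimal PyTorch sketch of such a generator follows. The geometric dilation schedule (1, 2, 4, ..., 128) and the ReLU activations are assumptions; the text above fixes only the 1 × 1 input layer, the 8 dilated 3 × 3 layers with dilations from 1 to 128, the 64-channel intermediate width, and the linear final layer producing two RGB maps.

```python
import torch
import torch.nn as nn

class Generator(nn.Module):
    """Sketch of the generation network of FIG. 3: a 1x1 input convolution
    reducing 1475 hypercolumn channels to 64, seven dilated 3x3 convolutions
    with ReLU, and a final dilated 3x3 convolution acting as a linear
    transformation to 6 channels (two three-channel RGB maps)."""

    def __init__(self, in_channels: int = 1475, width: int = 64):
        super().__init__()
        layers = [nn.Conv2d(in_channels, width, kernel_size=1),
                  nn.ReLU(inplace=True)]
        for d in (1, 2, 4, 8, 16, 32, 64):  # assumed geometric schedule
            layers += [nn.Conv2d(width, width, kernel_size=3,
                                 padding=d, dilation=d),
                       nn.ReLU(inplace=True)]
        # Eighth dilated layer: linear output, no activation.
        layers.append(nn.Conv2d(width, 6, kernel_size=3,
                                padding=128, dilation=128))
        self.net = nn.Sequential(*layers)

    def forward(self, x: torch.Tensor):
        out = self.net(x)
        return out[:, :3], out[:, 3:]  # predicted background, predicted reflection
```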
As an improvement of the above scheme, the joint loss function of the generation network comprises a reconstruction loss function over the hypercolumn feature space, an adversarial loss function and a separation loss function, specifically:
the reconstruction loss function over the hypercolumn feature space is expressed as
$L_{feat}(\theta)=\sum_{I\in\Omega}\sum_{l}\lambda_{l}\,\big\|\Phi_{l}(T)-\Phi_{l}\big(f_{T}(I;\theta)\big)\big\|_{1}$
wherein $L_{feat}(\theta)$ is the reconstruction loss function over the hypercolumn feature space; $\Phi_{l}$ denotes the $l$-th selected convolutional layer of the VGG-19 network; $I$, $T$ and $f_{T}(I;\theta)$ are the reflective image, the background image and the predicted background image, respectively; $\lambda_{l}$ is the influence weight of the $l$-th convolutional layer; $\Omega$ is the set of training images; $\|\cdot\|_{1}$ denotes the 1-norm of the vector output by the network convolution, i.e. the sum of the absolute values of its elements; and $\theta$ denotes the parameters of the generation network;
the adversarial loss function is expressed as
$L_{adv}(\theta)=-\sum_{I\in\Omega}\log D\big(I,f_{T}(I;\theta)\big)$
wherein $L_{adv}(\theta)$ is the adversarial loss function, and $D(I,x)$, obtained from the output of the discrimination network, denotes the probability that $x$ is the background image corresponding to the reflective image $I$;
the separation loss function is expressed as
$L_{excl}(\theta)=\sum_{I\in\Omega}\sum_{n=1}^{N}\big\|\Psi\big(f_{T}^{n}(I;\theta),f_{R}^{n}(I;\theta)\big)\big\|_{F}$
with
$\Psi(f_{T},f_{R})=\tanh\big(\lambda_{T}\,|\nabla f_{T}|\big)\odot\tanh\big(\lambda_{R}\,|\nabla f_{R}|\big)$
wherein $L_{excl}(\theta)$ is the separation loss function; $\lambda_{T}$ and $\lambda_{R}$ are the first and second normalization parameters, respectively; $\|\cdot\|_{F}$ is the Frobenius norm; $\odot$ denotes element-wise multiplication; $n$ is the image down-sampling level, $1\le n\le N$, with $N$ the maximum down-sampling level; $f_{R}(I;\theta)$ is the predicted reflection image; $|\nabla f_{T}|$ is the modulus of the gradient of the predicted background image; and $|\nabla f_{R}|$ is the modulus of the gradient of the predicted reflection image;
the joint loss function of the generation network is $L(\theta)=w_{1}L_{feat}(\theta)+w_{2}L_{adv}(\theta)+w_{3}L_{excl}(\theta)$, wherein $L(\theta)$ is the joint loss function, and $w_{1}$, $w_{2}$ and $w_{3}$ are the coefficients of the reconstruction loss function over the hypercolumn feature space, the adversarial loss function and the separation loss function, respectively.
Specifically, the reconstruction loss function over the hypercolumn feature space, also called the feature reconstruction loss, measures the distance in hypercolumn space between the predicted background image generated by the generation network and the background image T. Typically, the distance between the predicted image and the target image is computed at the selected VGG-19 network layers. The reconstruction loss function over the hypercolumn feature space is expressed as
$L_{feat}(\theta)=\sum_{I\in\Omega}\sum_{l}\lambda_{l}\,\big\|\Phi_{l}(T)-\Phi_{l}\big(f_{T}(I;\theta)\big)\big\|_{1}$
wherein $L_{feat}(\theta)$ is the reconstruction loss function over the hypercolumn feature space; $I$, $T$ and $f_{T}(I;\theta)$ are the reflective image, the background image and the predicted background image, respectively; $\lambda_{l}$ is the influence weight of the $l$-th convolutional layer; $\Omega$ is the set of training images; $\|\cdot\|_{1}$ denotes the 1-norm, i.e. the sum of the absolute values of the elements of the vector; $\Phi_{l}(x)$ denotes the convolution output of the $l$-th selected convolutional layer of the VGG-19 network; and $\theta$ denotes the parameters of the generation network.
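A per-image sketch of this loss in PyTorch follows; `phi` and `lambdas` are hypothetical stand-ins for the five selected VGG-19 layer activations and their weights $\lambda_{l}$.

```python
import torch

def reconstruction_loss(pred_bg: torch.Tensor, target_bg: torch.Tensor,
                        phi, lambdas) -> torch.Tensor:
    """L_feat for one image: sum_l lambda_l * ||phi_l(T) - phi_l(f_T)||_1.
    The caller sums over the training set Omega. `phi(x)` is assumed to
    return the list of selected VGG-19 layer activations of x."""
    loss = pred_bg.new_zeros(())
    for lam, f_t, f_p in zip(lambdas, phi(target_bg), phi(pred_bg)):
        loss = loss + lam * (f_t - f_p).abs().sum()
    return loss
```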
The purpose of the adversarial loss function is to make the predicted background map $f_{T}(I;\theta)$ differ more from the reflective image $I$. The adversarial loss function is expressed as
$L_{adv}(\theta)=-\sum_{I\in\Omega}\log D\big(I,f_{T}(I;\theta)\big)$
wherein $L_{adv}(\theta)$ is the adversarial loss function, and $D(I,x)$, obtained from the output of the discrimination network, denotes the probability that $x$ is the background image corresponding to the reflective image $I$.
The separation loss function, also called the exclusion loss, is designed from the observation that reflections manifest at image edges: in a reflective image, the edges of the background layer and of the reflection layer generally do not overlap, so an edge in the reflective image I can only be produced by the background image or by the reflection image, not by the superposition of both. The invention therefore minimizes the gradient-space correlation between the reflection layer and the background layer predicted by the generation network, computing the image edge correlation as a separation loss function from normalized gradient information evaluated at multiple resolutions of the two layers.
The separation loss function is expressed as
$L_{excl}(\theta)=\sum_{I\in\Omega}\sum_{n=1}^{N}\big\|\Psi\big(f_{T}^{n}(I;\theta),f_{R}^{n}(I;\theta)\big)\big\|_{F}$
with
$\Psi(f_{T},f_{R})=\tanh\big(\lambda_{T}\,|\nabla f_{T}|\big)\odot\tanh\big(\lambda_{R}\,|\nabla f_{R}|\big)$
wherein $L_{excl}(\theta)$ is the separation loss function; $\lambda_{T}$ and $\lambda_{R}$ are the first and second normalization parameters, respectively; $\|\cdot\|_{F}$ is the Frobenius norm; $\odot$ denotes element-wise multiplication; $n$ is the image down-sampling level, $1\le n\le N$, with $N$ the maximum down-sampling level; $f_{R}(I;\theta)$ is the predicted reflection image; $|\nabla f_{T}|$ is the modulus of the gradient of the predicted background image; and $|\nabla f_{R}|$ is the modulus of the gradient of the predicted reflection image. Both $f_{T}$ and $f_{R}$ are down-sampled by a factor of $2^{n-1}$ through bilinear interpolation. Preferably, $N=3$.
The joint loss function of the generation network is $L(\theta)=w_{1}L_{feat}(\theta)+w_{2}L_{adv}(\theta)+w_{3}L_{excl}(\theta)$, wherein $L(\theta)$ is the joint loss function, and $w_{1}$, $w_{2}$ and $w_{3}$ are the coefficients of the reconstruction loss function over the hypercolumn feature space, the adversarial loss function and the separation loss function, respectively, balancing the influence of each loss on the generation network. Preferably, $w_{1}=20$, $w_{2}=100$, $w_{3}=1$.
As an improvement of the above scheme, the discrimination loss function of the discrimination network is $L_{disc}(\theta)=\log D\big(I,f_{T}(I;\theta)\big)-\log D(I,T)$, wherein $L_{disc}(\theta)$ is the discrimination loss function.
It should be noted that the discrimination network is constructed as follows. First, the predicted background map and the background image input to the discrimination network are concatenated by channel to obtain a stacked input image: if both have size C × W × H, where C is the number of channels and W and H are the width and height of the image, respectively, the stacked image has dimension 2C × W × H. The stacked input is then passed through several cascaded down-sampling units, which turn it into progressively smaller feature maps. Each down-sampling unit consists of a convolutional layer with stride 2, a batch normalization layer and a nonlinear activation layer: the stride-2 convolution halves the size of its input, performing the down-sampling; the batch normalization layer normalizes the data to zero mean and unit variance, stabilizing and accelerating the convergence of the model; and the nonlinear activation layer increases the expressive power of the network. Finally, the resulting feature map is passed through a linear regression unit and normalized into the output probability of the discrimination network.
Specifically, the discrimination loss function of the discrimination network is $L_{disc}(\theta)=\log D\big(I,f_{T}(I;\theta)\big)-\log D(I,T)$, wherein $L_{disc}(\theta)$ is the discrimination loss function and $D(I,x)$ denotes the probability that $x$ is the background image corresponding to the reflective image $I$; that is, $D\big(I,f_{T}(I;\theta)\big)$ is the probability that the predicted background map $f_{T}(I;\theta)$ is drawn from the background images of the data set, and $D(I,T)$ is the probability that the background image $T$ is drawn from the background images of the data set.
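A PyTorch sketch of such a discrimination network and its loss follows; the number of down-sampling units, the channel widths, the LeakyReLU activation and the sigmoid squashing of a global average are assumptions beyond the structure stated above.

```python
import torch
import torch.nn as nn

class Discriminator(nn.Module):
    """Sketch of FIG. 4: channel-concatenate the reflective image and a
    candidate background (2C x H x W), pass the stack through cascaded
    down-sampling units (stride-2 conv + batch norm + activation), and
    squash the result into one probability."""

    def __init__(self, channels: int = 3, width: int = 64, n_down: int = 4):
        super().__init__()
        layers, c_in = [], 2 * channels
        for i in range(n_down):
            c_out = width * 2 ** i
            layers += [nn.Conv2d(c_in, c_out, kernel_size=4,
                                 stride=2, padding=1),  # halves H and W
                       nn.BatchNorm2d(c_out),
                       nn.LeakyReLU(0.2, inplace=True)]
            c_in = c_out
        layers.append(nn.Conv2d(c_in, 1, kernel_size=1))
        self.net = nn.Sequential(*layers)

    def forward(self, reflective: torch.Tensor, candidate: torch.Tensor):
        x = torch.cat([reflective, candidate], dim=1)  # 2C-channel stack
        logits = self.net(x)
        return torch.sigmoid(logits.mean(dim=(2, 3)))  # one probability

def disc_loss(D, reflective, pred_bg, true_bg, eps: float = 1e-8):
    """L_disc = log D(I, f_T(I; theta)) - log D(I, T); the prediction is
    detached so only the discrimination network is updated."""
    return (torch.log(D(reflective, pred_bg.detach()) + eps)
            - torch.log(D(reflective, true_bg) + eps)).mean()
```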
As an improvement of the above scheme, selecting several reflective images for reflection removal so as to quantitatively evaluate the reflection removal effect specifically comprises:
selecting several reflective images for reflection removal, and calculating the peak signal-to-noise ratio and the structural similarity between the predicted background image generated by the generation network and the background image, so as to quantitatively evaluate the reflection removal effect.
Specifically, several reflective images are selected for reflection removal, and the peak signal-to-noise ratio (PSNR) and the structural similarity (SSIM) between the predicted background image generated by the generation network and the background image are calculated to quantitatively evaluate the reflection removal effect. Referring to FIG. 5, a reflection removal comparison over 4 sets of reflective images, background images, predicted background images and reflection images is provided by this embodiment of the present invention; Table 1 gives the corresponding quantitative evaluation of the reflection removal metrics. As can be seen from FIG. 5 and Table 1, the method of the present invention performs well on single image reflection removal.
TABLE 1. Quantitative evaluation of the image reflection removal effect

Reflective image / background image | PSNR/SSIM | Background image / predicted background image | PSNR/SSIM
First reflective image / first background image | 15.93/0.54 | First background image / first predicted background image | 23.85/0.82
Second reflective image / second background image | 14.70/0.53 | Second background image / second predicted background image | 25.40/0.87
Third reflective image / third background image | 14.45/0.54 | Third background image / third predicted background image | 23.86/0.83
Fourth reflective image / fourth background image | 15.52/0.58 | Fourth background image / fourth predicted background image | 22.74/0.79
Mean | 15.15/0.55 | Mean | 23.96/0.83
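The PSNR/SSIM figures in Table 1 can be computed with standard implementations; a scikit-image sketch follows (assuming scikit-image >= 0.19 for the channel_axis argument).

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate_pair(pred_bg: np.ndarray, true_bg: np.ndarray):
    """PSNR and SSIM between a predicted and a real background image,
    both H x W x 3 arrays scaled to [0, 1]."""
    psnr = peak_signal_noise_ratio(true_bg, pred_bg, data_range=1.0)
    ssim = structural_similarity(true_bg, pred_bg, channel_axis=-1,
                                 data_range=1.0)
    return psnr, ssim
```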
Referring to FIG. 6, which is a schematic structural diagram of a single image reflection removal apparatus according to an embodiment of the present invention, the apparatus comprises:
an image set acquisition module 11, configured to capture a background image and a corresponding reflection image manually, and superimpose the background image and the reflection image to obtain a reflective image;
a feature extraction module 12, configured to input the reflective image into a pre-trained VGG-19 network for hypercolumn feature extraction to obtain a feature set;
a prediction generation module 13, configured to input the feature set into a preset generation network to obtain a predicted background image and a predicted reflection image, wherein the joint loss function of the generation network comprises a reconstruction loss function over the hypercolumn feature space, an adversarial loss function and a separation loss function;
a discrimination module 14, configured to input the predicted background image and the background image into a preset discrimination network to compute the discrimination loss function of the discrimination network;
a training module 15, configured to train the generation network and the discrimination network through repeated iterations until the joint loss function and the discrimination loss function both converge; and
an evaluation module 16, configured to select several reflective images for reflection removal so as to quantitatively evaluate the reflection removal effect.
The single image reflection removal apparatus provided in this embodiment of the present invention can implement the entire flow of the single image reflection removal method described in any of the above embodiments; the functions and technical effects of its modules and units are the same as those of the method described above and are not repeated here.
Referring to FIG. 7, the apparatus using the single image reflection removal method according to an embodiment of the present invention comprises a processor 10, a memory 20, and a computer program stored in the memory 20 and configured to be executed by the processor 10, wherein the processor 10, when executing the computer program, implements the single image reflection removal method according to any of the above embodiments.
Illustratively, the computer program may be divided into one or more modules/units, which are stored in the memory 20 and executed by the processor 10 to implement the present invention. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions, used to describe the execution of the computer program in the single image reflection removal method. For example, the computer program may be divided into an image set acquisition module, a feature extraction module, a prediction generation module, a discrimination module, a training module and an evaluation module, whose specific functions are as follows:
the image set acquisition module 11 is configured to acquire a background image and a corresponding reflection image through manual shooting, and obtain a reflection image according to superposition of the background image and the reflection image;
the feature extraction module 12 is configured to input the reflection image into a pre-trained VGG-19 network to perform supercolumn feature extraction, so as to obtain a feature set;
the prediction generation module 13 is configured to input the feature set into a preset generation network to obtain a prediction background image and a prediction reflection image; wherein the joint loss function of the generation network comprises a reconstruction loss function, a countermeasure loss function and a separation loss function of the supercolumn feature space;
the identification module 14 is configured to input the prediction background image and the background image into a preset identification network to calculate an identification loss function of the identification network;
a training module 15, configured to complete training of the generation network and the discrimination network by performing multiple iterative computations until both the joint loss function and the discrimination loss function converge;
and the evaluation module 16 is used for selecting a plurality of reflection images to perform reflection removing treatment so as to quantitatively evaluate the reflection removing effect.
The apparatus using the single image reflection removal method may be a desktop computer, a notebook computer, a palmtop computer, a cloud server or another computing device, and may include, but is not limited to, a processor and a memory. Those skilled in the art will understand that FIG. 7 is merely an example of the apparatus using the single image reflection removal method and does not limit it; the apparatus may include more or fewer components than shown, combine certain components, or use different components; for example, it may further include input and output devices, network access devices, buses, and the like.
The processor 10 may be a Central Processing Unit (CPU), another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general-purpose processor may be a microprocessor, or the processor 10 may be any conventional processor. The processor 10 is the control center of the apparatus using the single image reflection removal method, connecting the parts of the entire apparatus through various interfaces and lines.
The memory 20 may be used to store the computer programs and/or modules, and the processor 10 implements the various functions of the apparatus using the single image reflection removal method by running or executing the computer programs and/or modules stored in the memory 20 and calling the data stored therein. The memory 20 may mainly include a program storage area and a data storage area, where the program storage area may store an operating system and the application programs required by at least one function, and the data storage area may store data created according to program use. In addition, the memory 20 may include high-speed random access memory and non-volatile memory, such as a hard disk, a memory, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, a Flash Card, at least one magnetic disk storage device, a flash memory device, or another non-volatile solid-state storage device.
If the integrated modules of the apparatus using the single image reflection removal method are implemented as software functional units and sold or used as a stand-alone product, they may be stored in a computer-readable storage medium. Based on this understanding, all or part of the flow of the method of the above embodiments may also be implemented by a computer program, which may be stored in a computer-readable storage medium and which, when executed by a processor, implements the steps of the above method embodiments. The computer program comprises computer program code, which may be in source code form, object code form, an executable file, some intermediate form, or the like. The computer-readable medium may include any entity or device capable of carrying the computer program code, a recording medium, a USB flash disk, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and the like. It should be noted that the content of the computer-readable medium may be increased or decreased as appropriate according to the requirements of legislation and patent practice in a given jurisdiction; for example, in some jurisdictions, the computer-readable medium does not include electrical carrier signals and telecommunications signals.
An embodiment of the present invention further provides a computer-readable storage medium comprising a stored computer program, wherein, when the computer program runs, the apparatus on which the computer-readable storage medium resides is controlled to execute the single image reflection removal method described in any of the above embodiments.
To sum up, the single image reflection removal method, apparatus and storage medium provided by the embodiments of the present invention treat reflection separation as a layer separation and evaluation task: the convolutional layers of the generation network use dilated convolutions to enlarge the receptive field without losing detail, and the loss function fully accounts for the high-level features and gradient characteristics of the image as well as the difference between the predicted background image and the reflective image. The high-level features are obtained through a VGG-19 network and abstract the visual perception of the data set. The discrimination network's loss function is designed from the difference between the predicted background image and the input reflective image, making the predicted background image more similar to the background image. As a result, the invention removes reflections from images well, and its effect on high-contrast reflective images is particularly satisfactory.
While the foregoing is directed to the preferred embodiment of the present invention, it will be understood by those skilled in the art that various changes and modifications may be made without departing from the spirit and scope of the invention.

Claims (10)

1. A single image reflection removal method, comprising the following steps:
capturing a background image and a corresponding reflection image manually, and superimposing the background image and the reflection image to obtain a reflective image;
inputting the reflective image into a pre-trained VGG-19 network for hypercolumn feature extraction to obtain a feature set;
inputting the feature set into a preset generation network to obtain a predicted background image and a predicted reflection image, wherein the joint loss function of the generation network comprises a reconstruction loss function over the hypercolumn feature space, an adversarial loss function and a separation loss function;
inputting the predicted background image and the background image into a preset discrimination network to compute the discrimination loss function of the discrimination network;
training the generation network and the discrimination network through repeated iterations until the joint loss function and the discrimination loss function both converge; and
selecting several reflective images for reflection removal so as to quantitatively evaluate the reflection removal effect.
2. The single image reflection removal method according to claim 1, wherein obtaining the reflective image by superimposing the background image and the reflection image specifically comprises:
acquiring a first gray value of the background image;
acquiring a second gray value of the reflection image; and
weighting the first gray value and the second gray value to obtain the reflective image.
3. The single image reflection removal method according to claim 1, wherein the convolutional layers of the VGG-19 network comprise conv1_2, conv2_2, conv3_2, conv4_2 and conv5_2.
4. The single image reflection removal method according to claim 1, wherein the generation network comprises an input layer with a 1 × 1 convolution kernel and 8 dilated convolution layers with 3 × 3 convolution kernels, wherein the last dilated convolution layer generates two three-channel RGB images through a linear transformation.
5. The single image reflection removal method according to claim 1, wherein the joint loss function of the generation network comprises a reconstruction loss function over the hypercolumn feature space, an adversarial loss function and a separation loss function, specifically:
the reconstruction loss function over the hypercolumn feature space is expressed as
$L_{feat}(\theta)=\sum_{I\in\Omega}\sum_{l}\lambda_{l}\,\big\|\Phi_{l}(T)-\Phi_{l}\big(f_{T}(I;\theta)\big)\big\|_{1}$
wherein $L_{feat}(\theta)$ is the reconstruction loss function over the hypercolumn feature space; $I$, $T$ and $f_{T}(I;\theta)$ are the reflective image, the background image and the predicted background image, respectively; $\lambda_{l}$ is the influence weight of the $l$-th convolutional layer; $\Omega$ is the set of training images; $\|\cdot\|_{1}$ denotes the 1-norm of the vector output by the network convolution, i.e. the sum of the absolute values of its elements; $\Phi_{l}(x)$ denotes the convolution output of the $l$-th selected convolutional layer of the VGG-19 network; and $\theta$ denotes the parameters of the generation network;
the adversarial loss function is expressed as
$L_{adv}(\theta)=-\sum_{I\in\Omega}\log D\big(I,f_{T}(I;\theta)\big)$
wherein $L_{adv}(\theta)$ is the adversarial loss function, and $D(I,x)$, obtained from the output of the discrimination network, denotes the probability that $x$ is the background image corresponding to the reflective image $I$;
the separation loss function is expressed as
$L_{excl}(\theta)=\sum_{I\in\Omega}\sum_{n=1}^{N}\big\|\Psi\big(f_{T}^{n}(I;\theta),f_{R}^{n}(I;\theta)\big)\big\|_{F}$
with
$\Psi(f_{T},f_{R})=\tanh\big(\lambda_{T}\,|\nabla f_{T}|\big)\odot\tanh\big(\lambda_{R}\,|\nabla f_{R}|\big)$
wherein $L_{excl}(\theta)$ is the separation loss function; $\lambda_{T}$ and $\lambda_{R}$ are the first and second normalization parameters, respectively; $\|\cdot\|_{F}$ is the Frobenius norm; $\odot$ denotes element-wise multiplication; $n$ is the image down-sampling level, $1\le n\le N$, with $N$ the maximum down-sampling level; $f_{R}(I;\theta)$ is the predicted reflection image; $|\nabla f_{T}|$ is the modulus of the gradient of the predicted background image; and $|\nabla f_{R}|$ is the modulus of the gradient of the predicted reflection image;
the joint loss function of the generation network is $L(\theta)=w_{1}L_{feat}(\theta)+w_{2}L_{adv}(\theta)+w_{3}L_{excl}(\theta)$, wherein $L(\theta)$ is the joint loss function, and $w_{1}$, $w_{2}$ and $w_{3}$ are the coefficients of the reconstruction loss function over the hypercolumn feature space, the adversarial loss function and the separation loss function, respectively.
6. The single image reflection removal method according to claim 5, wherein the discrimination loss function of the discrimination network is $L_{disc}(\theta)=\log D\big(I,f_{T}(I;\theta)\big)-\log D(I,T)$, wherein $L_{disc}(\theta)$ is the discrimination loss function.
7. The single image reflection removal method according to claim 1, wherein selecting several reflective images for reflection removal so as to quantitatively evaluate the reflection removal effect specifically comprises:
selecting several reflective images for reflection removal, and calculating the peak signal-to-noise ratio and the structural similarity between the predicted background image generated by the generation network and the background image, so as to quantitatively evaluate the reflection removal effect.
8. A single image reflection removal apparatus, comprising:
an image set acquisition module, configured to capture a background image and a corresponding reflection image manually, and superimpose the background image and the reflection image to obtain a reflective image;
a feature extraction module, configured to input the reflective image into a pre-trained VGG-19 network for hypercolumn feature extraction to obtain a feature set;
a prediction generation module, configured to input the feature set into a preset generation network to obtain a predicted background image and a predicted reflection image, wherein the joint loss function of the generation network comprises a reconstruction loss function over the hypercolumn feature space, an adversarial loss function and a separation loss function;
a discrimination module, configured to input the predicted background image and the background image into a preset discrimination network to compute the discrimination loss function of the discrimination network;
a training module, configured to train the generation network and the discrimination network through repeated iterations until the joint loss function and the discrimination loss function both converge; and
an evaluation module, configured to select several reflective images for reflection removal so as to quantitatively evaluate the reflection removal effect.
9. An apparatus using a single image reflection removal method, comprising a processor, a memory, and a computer program stored in the memory and configured to be executed by the processor, wherein the processor, when executing the computer program, implements the single image reflection removal method according to any one of claims 1 to 7.
10. A computer-readable storage medium, comprising a stored computer program, wherein, when the computer program runs, the apparatus on which the computer-readable storage medium resides is controlled to perform the single image reflection removal method according to any one of claims 1 to 7.
CN202010193974.1A 2020-03-18 2020-03-18 Single image antireflection method, device and storage medium Active CN111507910B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010193974.1A CN111507910B (en) 2020-03-18 2020-03-18 Single image antireflection method, device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010193974.1A CN111507910B (en) 2020-03-18 2020-03-18 Single image antireflection method, device and storage medium

Publications (2)

Publication Number Publication Date
CN111507910A true CN111507910A (en) 2020-08-07
CN111507910B CN111507910B (en) 2023-06-06

Family

ID=71864034

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010193974.1A Active CN111507910B (en) 2020-03-18 2020-03-18 Single image antireflection method, device and storage medium

Country Status (1)

Country Link
CN (1) CN111507910B (en)


Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109993124A (en) * 2019-04-03 2019-07-09 深圳市华付信息技术有限公司 Based on the reflective biopsy method of video, device and computer equipment
CN110188776A (en) * 2019-05-30 2019-08-30 京东方科技集团股份有限公司 Image processing method and device, the training method of neural network, storage medium
CN110473154A (en) * 2019-07-31 2019-11-19 西安理工大学 A kind of image de-noising method based on generation confrontation network
CN110675336A (en) * 2019-08-29 2020-01-10 苏州千视通视觉科技股份有限公司 Low-illumination image enhancement method and device
CN110827217A (en) * 2019-10-30 2020-02-21 维沃移动通信有限公司 Image processing method, electronic device, and computer-readable storage medium

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112085671A (en) * 2020-08-19 2020-12-15 北京影谱科技股份有限公司 Background reconstruction method and device, computing equipment and storage medium
CN112198483A (en) * 2020-09-28 2021-01-08 上海眼控科技股份有限公司 Data processing method, device and equipment for satellite inversion radar and storage medium
CN112634161A (en) * 2020-12-25 2021-04-09 南京信息工程大学滨江学院 Reflected light removing method based on two-stage reflected light eliminating network and pixel loss
CN112907466A (en) * 2021-02-01 2021-06-04 南京航空航天大学 Nondestructive testing reflection interference removing method and device and computer readable storage medium
CN112802076A (en) * 2021-03-23 2021-05-14 苏州科达科技股份有限公司 Reflection image generation model and training method of reflection removal model
WO2022222080A1 (en) * 2021-04-21 2022-10-27 浙江大学 Single-image reflecting layer removing method based on position perception
CN114926705A (en) * 2022-05-12 2022-08-19 网易(杭州)网络有限公司 Cover design model training method, medium, device and computing equipment
CN114926705B (en) * 2022-05-12 2024-05-28 网易(杭州)网络有限公司 Cover design model training method, medium, device and computing equipment

Also Published As

Publication number Publication date
CN111507910B (en) 2023-06-06

Similar Documents

Publication Publication Date Title
CN111507910A (en) Single image reflection removing method and device and storage medium
US10891537B2 (en) Convolutional neural network-based image processing method and image processing apparatus
Yuan et al. Factorization-based texture segmentation
Zheng Gradient descent algorithms for quantile regression with smooth approximation
KR20220125377A (en) Method for Distinguishing a Real Three-Dimensional Object from a Two-Dimensional Spoof of the Real Object
CN110023989B (en) Sketch image generation method and device
CN111340077B (en) Attention mechanism-based disparity map acquisition method and device
CN110929836B (en) Neural network training and image processing method and device, electronic equipment and medium
CN112561028A (en) Method for training neural network model, and method and device for data processing
CN111223128A (en) Target tracking method, device, equipment and storage medium
CN113744136A (en) Image super-resolution reconstruction method and system based on channel constraint multi-feature fusion
CN111899203A (en) Real image generation method based on label graph under unsupervised training and storage medium
Jiang et al. Fast and high quality image denoising via malleable convolution
TW202004568A (en) Full exponential operation method applied to deep neural network, computer apparatus, and computer-readable recording medium reducing the operation complexity and circuit complexity, increasing the operation speed of the deep neural network and reducing the occupation of memory space.
US20140089365A1 (en) Object detection method, object detector and object detection computer program
CN114581318A (en) Low-illumination image enhancement method and system
WO2024078112A1 (en) Method for intelligent recognition of ship outfitting items, and computer device
Fang et al. Learning explicit smoothing kernels for joint image filtering
Zhao et al. Saliency map-aided generative adversarial network for raw to rgb mapping
Piriyatharawet et al. Image denoising with deep convolutional and multi-directional LSTM networks under Poisson noise environments
CN116543433A (en) Mask wearing detection method and device based on improved YOLOv7 model
US20220284545A1 (en) Image processing device and operating method thereof
Lu et al. Kernel estimation for motion blur removal using deep convolutional neural network
Xu et al. Blind image deblurring via the weighted schatten p-norm minimization prior
CN111027670A (en) Feature map processing method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant