CN116228696A - Glass detection method based on deep learning and ghost phenomenon - Google Patents

Info

Publication number
CN116228696A
Authority
CN
China
Prior art keywords
ghost
glass
reflection
deep learning
prediction graph
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310128767.1A
Other languages
Chinese (zh)
Inventor
晏涛
高嘉晖
李贺龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangnan University
Original Assignee
Jiangnan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangnan University
Priority to CN202310128767.1A
Publication of CN116228696A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/0002 Inspection of images, e.g. flaw detection
    • G06T7/0004 Industrial image inspection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30 Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Quality & Reliability (AREA)
  • Multimedia (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The application relates to a glass detection method based on deep learning and the ghost phenomenon, in the technical field of computer vision. The method comprises the following steps: performing glass detection on a single original input image, the image being a single RGB image; extracting ghost features from the original input image with a deep-learning backbone network to obtain a ghost region prediction map; concatenating the ghost region prediction map with the original input image along the channel dimension, extracting glass features with a backbone network under the guidance of the ghost cues, and then decoding the glass features with a convolutional neural network to obtain a segmentation result for the glass region; and outputting a glass region prediction map. Compared with the prior art, performing glass detection on a single image gives the method a wider range of application. The backbone network extracts the ghost features and the glass region features more accurately and efficiently, and the ghost phenomenon is used to locate the glass region more accurately, so that a high-quality glass region prediction map is obtained with good robustness.

Description

Glass detection method based on deep learning and ghost phenomenon
Technical Field
The application relates to the technical field of computer vision, in particular to a glass detection method based on deep learning and ghost phenomena.
Background
Glass detection has recently attracted a great deal of attention. Glass surfaces, including glass windows, glass doors and glass walls, are ubiquitous in the indoor and outdoor scenes of daily life. However, because they are transparent, they typically have no distinctive visual appearance of their own, and what they show depends largely on the scene behind them. Because glass lacks a consistent visual appearance and distinctive features, computer-vision-based systems such as robots and drones can easily overlook a glass surface: what they detect is often not the glass surface but the scene seen through the glass, which affects their proper operation. Accurate detection of glass surfaces is therefore critical to many computer-vision-based systems.
Meanwhile, with the development of computer technology and the wide application of computer vision, deep learning, owing to its strong learning and feature-representation abilities, has developed rapidly in the field of computer vision and has rapidly replaced the traditional approach of hand-crafting features from prior knowledge. In recent years, Transformer-based deep learning methods have obtained results exceeding those of convolutional neural networks in many fields.
Existing deep-learning-based methods make use of contextual contrast information, but they do not mine useful cues from the physical properties of glass. Glass detection using reflection is also very limited, because reflection is not a physical property specific to glass regions: reflections from other smooth surfaces such as walls, floors and displays affect the accuracy of the detection result for the glass region.
Disclosure of Invention
The technical problem to be solved by the application is the low accuracy of glass region detection in the prior art. The application aims to provide a glass detection method based on deep learning and the ghost phenomenon that extracts global features more effectively through deep learning and detects the glass region more accurately through the ghost phenomenon, so as to obtain a high-quality detection result with good robustness.
To achieve the above purpose, the application adopts the following technical scheme:
In one aspect, a glass detection method based on deep learning and the ghost phenomenon includes the following steps:
performing glass detection on a single original input image, wherein the image is a single RGB image;
extracting ghost features from the original input image with a deep-learning backbone network, and computing a ghost region prediction map;
concatenating the ghost region prediction map with the original input image along the channel dimension, extracting glass features with a backbone network under the guidance of the ghost cues, and then decoding the glass features with a convolutional neural network to obtain a segmentation result for the glass region;
and outputting a glass region prediction map based on the segmentation result for the glass region.
The step of obtaining the ghost region prediction map comprises the following steps:
acquiring multi-scale features with the backbone network;
inputting the multi-scale features into a dual reflection estimation module to obtain an offset estimation map for detecting the ghost region, wherein the dual reflection estimation module obtains primary reflection features and secondary reflection features through primary reflection detection and secondary reflection detection;
and fusing the primary reflection features, the secondary reflection features and the offset estimation map to obtain ghost features, and obtaining a high-quality ghost region prediction map through a convolutional-neural-network decoder.
The process of the dual reflection estimation module comprises the following steps:
inputting the multi-scale features acquired by the backbone network into the dual reflection estimation module;
based on the multi-scale features, obtaining primary reflection features and a primary reflection region prediction map through primary reflection detection, and obtaining secondary reflection features and a secondary reflection region prediction map through secondary reflection detection;
applying a feature constraint to the primary reflection features and the secondary reflection features through deformable convolution;
inputting the primary reflection region prediction map and the secondary reflection region prediction map into an encoder-decoder structure, and obtaining the offset estimation map through the encoder;
and fusing the primary reflection features, the secondary reflection features and the offset estimation map into ghost features, and inputting the ghost features into a decoder to obtain the ghost region prediction map.
The feature constraint takes the difference between the primary reflection features after deformable convolution and the secondary reflection features, using the following loss function:
[Formula image BDA0004083035710000031 in the original publication]
where the symbols shown in images BDA0004083035710000034, BDA0004083035710000032 and BDA0004083035710000033 of the original publication denote, respectively, the deformable convolution operation, the primary reflection feature at scale i, and the secondary reflection feature at scale i.
The backbone network is a Swin-Transformer.
The application extracts global features with the Swin-Transformer while taking the physical characteristics of ghosting fully into account, and designs a dual reflection estimation module that predicts an offset map, which improves the accuracy of the detected ghost region. The feature constraint applied also benefits the extraction of ghost features.
In another aspect, a glass detection system based on deep learning and the ghost phenomenon, suitable for carrying out the above glass detection method, comprises:
an acquisition module for acquiring a single original input image;
a ghost detection module for extracting ghost features from the original input image with a deep-learning backbone network and computing a ghost region prediction map;
a glass segmentation module for concatenating the ghost region prediction map with the original input image along the channel dimension, extracting glass features with a backbone network under the guidance of the ghost cues, and then decoding the glass features with a convolutional neural network to obtain a segmentation result for the glass region;
and an output module for outputting the glass region prediction map.
The glass segmentation module has a U-shaped structure comprising an encoding part and a decoding part.
Unlike the traditional U-Net and most methods based on convolutional neural networks, the U-shaped glass segmentation module combines a Swin-Transformer with convolution: the global features extracted by the Transformer help locate potential glass regions, and a convolutional neural network performs feature fusion and progressive decoding, finally yielding a high-quality glass region prediction map.
The beneficial effects of the technical scheme provided by the application include at least the following:
Compared with the prior art, performing glass detection on a single image gives the method a wider range of application. The network is built on a backbone network, so that ghost features and glass region features can be extracted more accurately and efficiently. Meanwhile, by using ghosting as a distinctive visual cue, the glass region can be located more accurately and a high-quality glass region prediction map is obtained, so the method has good robustness.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the description of the embodiments will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 shows a schematic flow chart of a glass detection method based on deep learning and ghost phenomena;
FIG. 2 illustrates a flow diagram of a ghost detection module provided in an exemplary embodiment of the present application;
FIG. 3 illustrates a dual reflection estimation module process schematic provided in one exemplary embodiment of the present application;
FIG. 4 illustrates a flow diagram of a dual reflection estimation module provided in an exemplary embodiment of the present application;
FIG. 5 illustrates a block diagram of a deep learning and ghost phenomenon based glass detection system according to an exemplary embodiment of the present application;
FIG. 6 illustrates a block diagram of a dual reflection estimation module in a deep learning and ghost phenomenon based glass detection system according to an exemplary embodiment of the present application;
FIG. 7 illustrates a schematic diagram of a glass detection network connection provided in an exemplary embodiment of the present application;
FIG. 8 shows the experimental results, wherein the first column, Input, is the input picture of a real scene containing glass regions; the second column, ourGhos, is the 2D mask of the ghost region obtained by an exemplary embodiment of the present application; the third column, Ours, is the 2D mask of the glass region predicted by an exemplary embodiment of the present application; and the fourth column, GT, is the ground-truth mask of the glass region.
Detailed Description
The following description of the embodiments of the present application will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are only some, but not all, of the embodiments of the present application. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are within the scope of the present disclosure.
The present application is further described below with reference to the drawings and examples.
First, the terms involved in the embodiments of the present application will be briefly described:
The ghost effect is an inherent property of glass surfaces and always occurs on them. It arises because a glass pane has two surfaces (one on each side), so that a ghost is formed by two attenuated, slightly offset reflections. Ghost information can therefore effectively guide the detection of glass regions: since ghosting in daily life occurs only in glass regions, it provides effective guidance and support for detecting glass.
SwinTransformer is a deep learning method based on the Transformer.
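The ghost effect described above can be illustrated on a toy 1-D signal. The following is a minimal sketch and not part of the application: the function name `synthesize_ghost`, the attenuation factors `alpha` and `beta`, and the offset `shift` are all illustrative assumptions.

```python
def synthesize_ghost(reflection, shift=2, alpha=0.6, beta=0.3):
    """Sum two attenuated copies of a 1-D reflection signal.

    The first copy is scaled by `alpha`; the second is scaled by `beta`
    and displaced by `shift` samples (zero-padded at the border).
    """
    n = len(reflection)
    ghosted = []
    for x in range(n):
        first = alpha * reflection[x]
        second = beta * reflection[x - shift] if x - shift >= 0 else 0.0
        ghosted.append(first + second)
    return ghosted


# A single edge produces two responses: the edge itself and a fainter, shifted copy.
edge = [0, 0, 1, 0, 0, 0, 0, 0]
ghosted = synthesize_ghost(edge, shift=2)
```

In this toy signal the duplicated, weaker edge two samples away from the original is the kind of cue that the ghost detection module is meant to pick up.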
Fig. 1 shows a schematic flow chart of a glass detection method based on deep learning and ghost phenomena according to an exemplary embodiment of the present application, where the method includes the following steps:
Step 101, glass detection is performed on a single original input image, wherein the image is a single RGB image.
Step 102, ghost features are extracted from the original input image with a deep-learning backbone network, and a ghost region prediction map is computed.
Step 103, the ghost region prediction map is concatenated with the original input image along the channel dimension, glass features are extracted with a backbone network under the guidance of the ghost cues, and the glass features are then decoded with a convolutional neural network to obtain a segmentation result for the glass region.
Step 104, a glass region prediction map is output based on the segmentation result for the glass region. Training is repeated until a high-quality glass region prediction map is obtained.
The specific flow is as follows:
Glass detection is performed on a single original input image. The ghost detection module first extracts ghost features with a backbone network, which is a Swin-Transformer. The extracted features are input into the dual reflection estimation module, where two branches detect the primary reflection and the secondary reflection, and a deformable convolution block aligns the features of the primary reflection with those of the secondary reflection so that both reflections are estimated accurately. The estimated prediction maps of the primary and secondary reflections are up-sampled to the original resolution and input to the encoder-decoder structure to estimate a displacement map, which is the offset estimation map. The displacement map is then down-sampled, concatenated and fused with the feature maps of the primary and secondary reflections, and input to a decoder that decodes the ghost region prediction map.
The glass segmentation module then segments the glass region under the guidance of the ghost region: the obtained ghost region is concatenated with the input image along the channel dimension and input into another backbone network to extract glass features, a convolutional neural network decodes the features to obtain the segmentation result for the glass region, and training is repeated until a high-quality glass region prediction map is obtained.
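The two-stage flow above (ghost map first, then ghost-guided segmentation on the channel-concatenated input) can be sketched as follows. This is a minimal illustration under stated assumptions, not the application's implementation: `concat_channels`, `detect_glass` and the two toy stand-in networks are hypothetical names, and images are plain nested lists rather than tensors.

```python
def concat_channels(rgb, ghost_map):
    """Append the ghost map as a fourth channel.

    rgb[y][x] is an (r, g, b) tuple; ghost_map[y][x] is a float in [0, 1].
    """
    return [
        [tuple(pixel) + (ghost,) for pixel, ghost in zip(rgb_row, ghost_row)]
        for rgb_row, ghost_row in zip(rgb, ghost_map)
    ]


def detect_glass(rgb, ghost_detector, glass_segmenter):
    """Two-stage flow: ghost map first, then ghost-guided glass segmentation."""
    ghost_map = ghost_detector(rgb)                 # step 102
    guided_input = concat_channels(rgb, ghost_map)  # step 103: RGB + ghost channel
    return glass_segmenter(guided_input)            # steps 103-104


# Toy stand-ins for the two learned networks (hypothetical, illustration only).
def toy_ghost_detector(rgb):
    # Marks bright pixels as "ghost"; a real detector would be a trained network.
    return [[1.0 if sum(px) / 3 > 0.5 else 0.0 for px in row] for row in rgb]


def toy_glass_segmenter(guided):
    # Thresholds the ghost channel to produce the output mask.
    return [[1 if px[3] > 0.5 else 0 for px in row] for row in guided]


image = [[(0.9, 0.9, 0.9), (0.1, 0.1, 0.1)],
         [(0.2, 0.2, 0.2), (0.8, 0.8, 0.8)]]
mask = detect_glass(image, toy_ghost_detector, toy_glass_segmenter)
```

Passing the two networks in as callables keeps the sketch independent of any particular backbone; only the channel concatenation between the stages is fixed by the description.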
In summary, compared with the prior art, the glass detection method based on deep learning and the ghost phenomenon provided by the application improves on existing methods and also analyzes the physical characteristics of glass, which distinguishes it from them. Since ghosting in daily life occurs only in glass regions, it provides effective guidance and support for detecting glass.
Global features are extracted with a Swin Transformer-based method while the physical characteristics of ghosting are taken fully into account, and a dual reflection estimation module that predicts an offset map is designed, which improves the accuracy of the detected ghost region. The feature constraint applied also benefits the extraction of ghost features.
The glass segmentation module has a U-shaped encoder-decoder structure. Unlike the traditional U-Net and most methods based on convolutional neural networks, it combines a Swin-Transformer with convolution: the global features extracted by the Transformer help locate potential glass regions, and a convolutional neural network performs feature fusion and progressive decoding, finally yielding a high-quality glass region prediction map.
The application performs glass detection on a single image, which gives it a wider range of application. Multimodal methods, by contrast, process several images at once and require additional input devices such as an infrared camera, a polarization camera or a depth camera; their usage scenarios are therefore more restricted than those of a single camera, and they are less widely applicable.
Fig. 2 shows a schematic flow chart of the ghost detection module according to an exemplary embodiment of the present application; the process includes the following steps:
Step 201, multi-scale features are acquired through the backbone network.
Step 202, the multi-scale features are input into the dual reflection estimation module, which estimates a displacement map to detect the ghost region; the dual reflection estimation module uses two branches to detect the primary reflection and the secondary reflection and obtain primary reflection features and secondary reflection features.
Step 203, the primary reflection features, the secondary reflection features and the displacement map are fused to obtain ghost features, and a high-quality ghost region prediction map is obtained through a convolutional-neural-network decoder.
Figs. 3 and 4 illustrate the dual reflection estimation module provided in an exemplary embodiment of the present application; the process includes the following steps:
Step 301, the multi-scale features acquired by the backbone network are input into the dual reflection estimation module.
Step 302, based on the multi-scale features, primary reflection features and a primary reflection region prediction map are obtained through primary reflection detection, and secondary reflection features and a secondary reflection region prediction map are obtained through secondary reflection detection.
Step 303, the primary reflection features and the secondary reflection features are constrained through deformable convolution, wherein the deformable convolution block aligns the primary reflection features with the secondary reflection features so that the primary and secondary reflections are estimated accurately.
Step 304, the primary reflection region prediction map and the secondary reflection region prediction map are up-sampled to the original resolution and input to an encoder-decoder structure, and an offset estimation map is obtained through the encoder, wherein the offset estimation map is the estimated displacement map.
Step 305, the primary reflection features, the secondary reflection features and the offset estimation map are fused into ghost features, which are input to a decoder that decodes the ghost region prediction map.
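To make the idea of an offset between the primary and secondary reflections concrete, the sketch below estimates a single shift between two 1-D reflection maps by cross-correlation. This is a classical stand-in chosen for illustration, not the learned encoder-decoder described in steps 301 to 305; the function name `estimate_shift` and the search range `max_shift` are assumptions.

```python
def estimate_shift(primary, secondary, max_shift=4):
    """Return the shift s in [0, max_shift] that maximises the correlation
    sum(primary[x] * secondary[x + s]) over the valid positions x."""
    best_shift, best_score = 0, float("-inf")
    for s in range(max_shift + 1):
        score = sum(
            p * secondary[x + s]
            for x, p in enumerate(primary)
            if x + s < len(secondary)
        )
        if score > best_score:
            best_shift, best_score = s, score
    return best_shift


primary = [0, 0, 1, 0, 0, 0, 0, 0]    # response of the first reflection
secondary = [0, 0, 0, 0, 1, 0, 0, 0]  # the same edge, displaced by two samples
offset = estimate_shift(primary, secondary)
```

A learned estimator produces a dense, per-pixel displacement map instead of one global value, but the quantity being estimated is the same displacement between the two reflection layers.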
FIG. 5 shows a block diagram of a deep learning and ghost phenomenon based glass detection system according to an exemplary embodiment of the present application, the system comprising: an acquisition module 410, a ghost detection module 420, a glass segmentation module 430, and an output module 440.
The acquisition module 410 is configured to acquire a single original input image.
The ghost detection module 420 is configured to extract ghost features from the original input image with a deep-learning backbone network and to compute a ghost region prediction map.
The glass segmentation module 430 is configured to concatenate the ghost region prediction map with the original input image along the channel dimension, extract glass features with a backbone network under the guidance of the ghost cues, and then decode the glass features with a convolutional neural network to obtain a segmentation result for the glass region.
The output module 440 is configured to output the glass region prediction map.
Fig. 6 shows a block diagram of the dual reflection estimation module in the glass detection system based on deep learning and the ghost phenomenon according to an exemplary embodiment of the present application.
The ghost detection module obtains multi-scale features with a backbone network, extracts the ghost features in the image from them through a specially designed dual reflection estimation module, and finally obtains a ghost region prediction map through a decoder.
The ghost detection module 420 includes: a backbone network 510, a dual reflection estimation module 520, and a decoder 530. Preferably, backbone network 510 is a SwinTransformer network.
The backbone network 510 is used to detect potential glass regions. Preferably, the backbone network is an existing Swin Transformer. First, because ghost effects are typically observed as duplicated edges in the input image, the application takes advantage of the Swin Transformer's strengths in extracting low-level features and learning long-range dependencies. Second, the Swin Transformer can model region dependencies hierarchically from local to global, which helps to handle appearance changes (e.g., in intensity and shape) of ghost effects.
Compared with traditional methods, the Transformer is better at extracting long-range dependency features; compared with traditional CNN-based methods, it obtains a larger receptive field at shallower layers of the network and can extract features that are beneficial for glass detection.
Given the multi-scale features obtained from the backbone network, the dual reflection estimation module 520 detects ghost effects by estimating an offset map at multiple scales. The dual reflection estimation is used to detect the presence of an offset between reflections: the application detects any two main reflections and then estimates the offset between the reflective layers. This design brings two practical advantages. First, because it is based on the ghost phenomenon, the proposed model can handle any type of glass surface, regardless of the number of glass regions in the input image. Second, the module does not need to estimate the offset precisely, so the application does not require ground-truth offsets for real scenes and supervises the offset only on synthetic scenes.
The present application uses two branches to detect the primary and secondary reflections, and uses deformable convolution blocks to align the displaced features of the primary reflection with the features of the secondary reflection so that both reflections are estimated accurately. The estimated prediction maps of the primary and secondary reflections are first up-sampled to the original resolution and then input to the encoder-decoder structure to estimate a displacement map, which is the offset estimation map. A non-zero value in the displacement map may indicate the presence of a ghost effect in that region, so the displacement map provides a strong cue for distinguishing ghost effects from single reflections. Finally, the estimated displacement map is down-sampled, concatenated and fused with the feature maps of the primary and secondary reflections, and input to the decoder.
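The statement that non-zero displacement values indicate ghosting, and the subsequent down-sampling, can be sketched as follows. This is an illustrative sketch only: the names `ghost_mask` and `downsample2x`, the tolerance `eps`, the (dx, dy) layout of the displacement map, and the use of 2x average pooling are assumptions not taken from the application.

```python
def ghost_mask(displacement, eps=1e-6):
    """displacement[y][x] is a (dx, dy) pair; returns 1 where the offset is non-zero."""
    return [
        [1 if abs(dx) > eps or abs(dy) > eps else 0 for dx, dy in row]
        for row in displacement
    ]


def downsample2x(grid):
    """Average-pool a 2-D grid by a factor of 2 (even height and width assumed)."""
    return [
        [
            (grid[y][x] + grid[y][x + 1] + grid[y + 1][x] + grid[y + 1][x + 1]) / 4.0
            for x in range(0, len(grid[0]), 2)
        ]
        for y in range(0, len(grid), 2)
    ]


# Left half: single reflection (zero offset). Right half: ghosting (offset of 2 px in x).
displacement = [
    [(0.0, 0.0), (0.0, 0.0), (2.0, 0.0), (2.0, 0.0)],
    [(0.0, 0.0), (0.0, 0.0), (2.0, 0.0), (2.0, 0.0)],
]
mask = ghost_mask(displacement)
coarse = downsample2x(mask)
```

Down-sampling brings the cue to the resolution of the reflection feature maps so that they can be concatenated channel-wise before decoding.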
The decoder part uses a convolutional neural network: the features output by the dual reflection estimation module are input into the decoder to obtain the final ghost region prediction.
The convolutional network decodes and fuses the multi-scale features extracted by the backbone network; fusing multi-scale features helps detect glass regions of different sizes.
The loss functions used for supervision in the network of the present application are as follows.
The loss function used to supervise the predictions of the primary reflection 2D mask, the secondary reflection 2D mask and the ghost region 2D mask is the BCE (binary cross-entropy) loss:
Formula 1:
[Formula image BDA0004083035710000091 in the original publication]
where i is the scale index of the predicted mask map, s is the total number of scales, M is the predicted 2D mask, and the symbol shown in image BDA0004083035710000092 of the original publication is the ground truth.
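A multi-scale BCE supervision of this kind might look as follows. This is a hedged sketch: the formula itself is only available as an image, so summing (rather than averaging) the per-scale losses is an assumption, as are the function names `bce` and `multiscale_bce` and the clamping constant `eps`. Masks are flattened to plain lists for brevity.

```python
import math


def bce(pred, target, eps=1e-7):
    """Mean binary cross-entropy between predicted probabilities and 0/1 labels."""
    total = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(pred)


def multiscale_bce(preds, targets):
    """Sum the BCE over all scales i = 1..s (assumed aggregation)."""
    return sum(bce(p, t) for p, t in zip(preds, targets))


# Two scales: a 4-pixel mask and a 2-pixel mask, both maximally uncertain.
preds = [[0.5, 0.5, 0.5, 0.5], [0.5, 0.5]]
targets = [[1, 0, 1, 0], [1, 0]]
loss = multiscale_bce(preds, targets)
```

With every prediction at 0.5 each scale contributes ln 2, which makes the per-scale accumulation easy to check by hand.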
The loss function used by the feature constraint is a mean square error loss, computed as follows:
Formula 2:
[Formula image BDA0004083035710000093 in the original publication]
where the symbols shown in images BDA0004083035710000096, BDA0004083035710000094 and BDA0004083035710000095 of the original publication denote, respectively, the deformable convolution operation, the primary reflection feature at scale i, and the secondary reflection feature at scale i.
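The feature constraint can be sketched as a mean square error between aligned primary features and secondary features, accumulated over scales. The sketch below is illustrative only: an integer shift (`shift_features`) stands in for the learned deformable convolution, features are 1-D lists, and summing over scales is an assumption because the formula is only available as an image.

```python
def shift_features(feature, offset):
    """Stand-in for deformable convolution: shift a 1-D feature by an integer
    offset, padding with zeros at the border."""
    n = len(feature)
    return [feature[x - offset] if 0 <= x - offset < n else 0.0 for x in range(n)]


def feature_constraint_loss(primary_feats, secondary_feats, align):
    """Sum over scales of the MSE between align(primary) and secondary."""
    total = 0.0
    for primary, secondary in zip(primary_feats, secondary_feats):
        aligned = align(primary)
        total += sum((a - b) ** 2 for a, b in zip(aligned, secondary)) / len(secondary)
    return total


primary_feats = [[0.0, 1.0, 0.0, 0.0]]    # one scale
secondary_feats = [[0.0, 0.0, 1.0, 0.0]]  # same pattern, displaced by one sample

aligned_loss = feature_constraint_loss(
    primary_feats, secondary_feats, lambda f: shift_features(f, 1))
unaligned_loss = feature_constraint_loss(
    primary_feats, secondary_feats, lambda f: f)
```

The loss vanishes once the alignment compensates the displacement, which is the behaviour the constraint is meant to encourage in the alignment block.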
Fig. 7 is a schematic diagram of the glass detection network according to an exemplary embodiment of the present application. The glass segmentation module is a U-shaped network with an encoder-decoder structure, in order to better detect glass of different sizes.
The encoder uses a Swin Transformer to obtain four feature representations at different scales. The input is first position-encoded, and the numbers of Swin Transformer blocks stacked in the 4 stages of the network, from shallow to deep, are respectively 2, 16 and 2. The decoder concatenates the feature maps from the encoding stage with those from the decoding stage along the channel dimension, combining deep and shallow features to refine the result, and predicts and segments the glass region from the resulting feature maps.
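The four-scale encoder output and the channel-wise skip connections can be walked through with simple shape arithmetic. The sketch below is a rough illustration under stated assumptions: a patch size of 4 and a base width of 96 are the defaults of the smallest standard Swin Transformer variant and are not given in the application, and the helper names `swin_stage_shapes` and `concat_shape` are hypothetical.

```python
def swin_stage_shapes(input_size=384, patch_size=4, embed_dim=96, num_stages=4):
    """Return (channels, height, width) per stage for a hierarchical encoder
    that halves the resolution and doubles the width at each new stage."""
    shapes = []
    resolution, channels = input_size // patch_size, embed_dim
    for _ in range(num_stages):
        shapes.append((channels, resolution, resolution))
        resolution //= 2
        channels *= 2
    return shapes


def concat_shape(encoder_shape, decoder_shape):
    """Shape after channel-wise concatenation of an encoder map with an
    up-sampled decoder map of the same spatial size."""
    assert encoder_shape[1:] == decoder_shape[1:], "spatial sizes must match"
    return (encoder_shape[0] + decoder_shape[0],) + tuple(encoder_shape[1:])


shapes = swin_stage_shapes()
# Skip connection at the third stage, with a decoder map assumed to have equal width.
fused = concat_shape(shapes[2], shapes[2])
```

Under these assumptions a 384×384 input yields feature maps of 96, 48, 24 and 12 pixels per side, and each skip connection doubles the channel count before the decoder's convolutions reduce it again.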
Loss function:
Formula 3:
[Formula image BDA0004083035710000101 in the original publication]
The loss function of the glass segmentation module comprises three parts: a BCE loss, an SSIM loss and an IoU loss. The final loss is the sum of the three.
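A loss that sums BCE, SSIM and IoU terms can be sketched as below. This is an illustrative approximation, not the application's formula: the SSIM term here uses global image statistics instead of the usual sliding window, the soft IoU form and the constants `c1`, `c2` and `eps` are assumptions, and masks are flattened to plain lists.

```python
import math


def bce_loss(pred, target, eps=1e-7):
    """Mean binary cross-entropy with clamped probabilities."""
    total = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(pred)


def iou_loss(pred, target, eps=1e-7):
    """1 - soft IoU, with intersection p*t and union p + t - p*t."""
    inter = sum(p * t for p, t in zip(pred, target))
    union = sum(p + t - p * t for p, t in zip(pred, target))
    return 1.0 - (inter + eps) / (union + eps)


def ssim_loss(pred, target, c1=0.01 ** 2, c2=0.03 ** 2):
    """1 - SSIM computed from global means, variances and covariance."""
    n = len(pred)
    mu_p, mu_t = sum(pred) / n, sum(target) / n
    var_p = sum((p - mu_p) ** 2 for p in pred) / n
    var_t = sum((t - mu_t) ** 2 for t in target) / n
    cov = sum((p - mu_p) * (t - mu_t) for p, t in zip(pred, target)) / n
    ssim = ((2 * mu_p * mu_t + c1) * (2 * cov + c2)) / (
        (mu_p ** 2 + mu_t ** 2 + c1) * (var_p + var_t + c2))
    return 1.0 - ssim


def glass_loss(pred, target):
    """Sum of the three terms, as the description states."""
    return bce_loss(pred, target) + ssim_loss(pred, target) + iou_loss(pred, target)


target = [1, 0, 1, 0]
perfect = glass_loss([1.0, 0.0, 1.0, 0.0], target)
uncertain = glass_loss([0.5, 0.5, 0.5, 0.5], target)
```

The three terms are complementary: BCE acts per pixel, the SSIM term compares structure, and the IoU term compares the regions as a whole.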
Fig. 8 shows the experimental results; the present application is illustrated by the following specific experiments:
The spatial resolution of the input image is 384×384. Training was performed on a server equipped with an Intel i9-10900X CPU (10 cores, 3.7 GHz, 19.25 MB cache), 16 GB of memory and an NVIDIA RTX 3090 GPU with 24 GB of video memory. The environment for network training was Python 3.6.13 and PyTorch 1.7.1, the number of iterations was set to 200, and the batch size was set to 2. The Adam optimizer was used to train the network. The learning rate of the ghost detection module was set to 0.00001, and the learning rate of the glass segmentation module was set to 0.00001.
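For reproducibility, the reported settings can be collected in one configuration object. The sketch below only restates the values given above; the class name `TrainConfig` and its field names are hypothetical, and the text's "iterations" is kept as stated without interpreting it as epochs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Training settings as reported in the experiment description."""
    input_size: tuple = (384, 384)  # spatial resolution of the input image
    iterations: int = 200           # "number of iterations" as stated in the text
    batch_size: int = 2
    optimizer: str = "Adam"
    lr_ghost: float = 1e-5          # ghost detection module
    lr_glass: float = 1e-5          # glass segmentation module


config = TrainConfig()
```

Freezing the dataclass prevents the settings from being modified accidentally during a run.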
The network is trained as follows. First, a backbone network extracts ghost features. The extracted features are input into the dual reflection estimation module, which uses two branches to detect the primary and secondary reflections. A deformable convolution then aligns the features of the primary reflection with those of the secondary reflection so that both reflections can be estimated accurately. The estimated prediction maps of the primary and secondary reflections are up-sampled to the original resolution and input to the encoder-decoder structure to obtain an offset estimation map. Finally, the offset estimation map is down-sampled and concatenated with the feature maps of the primary and secondary reflections, and the fused features are input into a decoder to decode the ghost region.
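The resolution bookkeeping in the last two steps can be sketched in NumPy as follows. This is an illustration under stated assumptions, not the disclosed network: nearest-neighbour up-sampling and block-average down-sampling stand in for whatever resampling the embodiment uses, the deformable-convolution alignment is omitted, and `offset_net` is a hypothetical callable standing in for the encoder-decoder structure that produces the offset estimation map.

```python
import numpy as np


def upsample_nearest(x, factor):
    """Up-sample a (C, H, W) array by repeating each pixel `factor` times."""
    return x.repeat(factor, axis=1).repeat(factor, axis=2)


def downsample_mean(x, factor):
    """Down-sample a (C, H, W) array by averaging factor x factor blocks."""
    c, h, w = x.shape
    return x.reshape(c, h // factor, factor, w // factor, factor).mean(axis=(2, 4))


def fuse_ghost_features(feat_r1, feat_r2, pred_r1, pred_r2, offset_net, factor):
    """Build the fused ghost feature that is passed to the decoder.

    feat_r1, feat_r2: primary / secondary reflection features, (C, h, w)
    pred_r1, pred_r2: primary / secondary reflection prediction maps, (1, h, w)
    offset_net: hypothetical stand-in for the encoder-decoder structure
    factor: ratio between the original resolution and the feature resolution
    """
    # Bring both prediction maps to the original resolution.
    full = np.concatenate(
        [upsample_nearest(pred_r1, factor), upsample_nearest(pred_r2, factor)], axis=0
    )
    # Estimate the offset map at full resolution.
    offset_full = offset_net(full)
    # Return to the feature resolution and concatenate along channels.
    offset_small = downsample_mean(offset_full, factor)
    return np.concatenate([feat_r1, feat_r2, offset_small], axis=0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    f1, f2 = rng.random((8, 12, 12)), rng.random((8, 12, 12))
    p1, p2 = rng.random((1, 12, 12)), rng.random((1, 12, 12))
    fused = fuse_ghost_features(f1, f2, p1, p2, offset_net=lambda x: x, factor=4)
    print(fused.shape)
```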
The glass segmentation module segments the glass region under the guidance of the ghost region. Specifically, the obtained ghost region is concatenated with the input image along the channel dimension and input into another backbone network to extract glass features; a convolutional neural network then decodes the features and produces the segmentation result of the glass region. Training is repeated until a high-quality glass region prediction map is obtained.
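The channel concatenation that provides the ghost guidance can be illustrated with a few lines of NumPy. The sketch assumes a channel-first layout, with the RGB image stored as (3, H, W) and the ghost region prediction as a single (H, W) map; the layout and the function name are assumptions made for the example.

```python
import numpy as np


def build_glass_input(image, ghost_map):
    """Stack the ghost region prediction onto the RGB image as a 4th channel."""
    if image.ndim != 3 or image.shape[0] != 3:
        raise ValueError("image must have shape (3, H, W)")
    if ghost_map.shape != image.shape[1:]:
        raise ValueError("ghost_map must have shape (H, W) matching the image")
    return np.concatenate([image, ghost_map[None]], axis=0)


if __name__ == "__main__":
    image = np.zeros((3, 384, 384), dtype=np.float32)
    ghost = np.ones((384, 384), dtype=np.float32)
    print(build_glass_input(image, ghost).shape)
```

Under this layout, the second backbone would receive a four-channel input in which the extra channel marks where ghosting was detected.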
Fig. 8 shows a qualitative evaluation of the method on real scenes. As can be seen from Fig. 8, the ghost region is accurately detected in real scenes and the glass region is accurately predicted. The first and second rows show scenes with a large pane of glass under different lighting conditions; the method accurately captures the ghost phenomenon and detects the glass region. The third and fourth rows show complex outdoor scenes with multiple glass panes: in the third row some panes lie only partly within the frame at the edge of the scene, and in the fourth row some panes are partly occluded. In these cases the present application still accurately predicts the regions, and most of the glass is detected. The fifth row shows the result of glass detection in an outdoor scene, where the method accurately detects the glass. The application therefore has good practicability and universality.
In summary, compared with the prior art, the glass detection method based on deep learning and the ghost phenomenon provided by the embodiments of the present application is widely applicable to glass detection based on a single image. In terms of network structure, the network is built on a powerful Transformer, so that ghost features and glass region features can be extracted more accurately and efficiently. Meanwhile, by exploiting ghosting as a distinctive visual cue, the glass region can be located more accurately and a high-quality glass region prediction map is obtained, giving the method good robustness.
The foregoing description of the preferred embodiments is merely exemplary in nature and is not intended to limit the invention, but is intended to cover various modifications, substitutions, improvements, and alternatives falling within the spirit and principles of the invention.

Claims (7)

1. A glass detection method based on deep learning and ghost phenomenon comprises the following steps:
glass detection is carried out based on a single original input image, wherein the image is a single RGB image;
extracting ghost features based on an original input image through a deep learning method of a backbone network, and calculating to obtain a ghost region prediction graph;
after the ghost area prediction graph is connected with an original input image channel, extracting glass features under the guidance of ghost cues based on a backbone network, and then decoding the glass features and acquiring a segmentation result of a glass area based on a convolutional neural network;
and outputting a glass region prediction graph based on the segmentation result of the glass region.
2. The method for detecting glass based on deep learning and ghost phenomenon according to claim 1, wherein the obtaining ghost region prediction map comprises the steps of:
acquiring multi-scale characteristics based on a backbone network;
inputting the acquired multi-scale characteristics into a double-reflection estimation module to acquire an offset estimation diagram to detect a ghost area, wherein the double-reflection estimation module acquires primary reflection characteristics and secondary reflection characteristics through primary reflection detection and secondary reflection detection;
and fusing the primary reflection characteristic, the secondary reflection characteristic and the offset estimation graph to obtain ghost characteristics, and obtaining a high-quality ghost area prediction graph through a decoder based on a convolutional neural network.
3. The method for detecting glass based on deep learning and ghost phenomenon according to claim 2, wherein the processing of the dual reflection estimation module comprises the steps of:
inputting the multi-scale features acquired based on a backbone network into the dual reflection estimation module;
based on the multi-scale features, acquiring primary reflection features and a primary reflection region prediction map through primary reflection detection, and acquiring secondary reflection features and a secondary reflection region prediction map through secondary reflection detection;
performing feature constraint on the primary reflection features and the secondary reflection features through deformable convolution;
inputting the primary reflection region prediction map and the secondary reflection region prediction map into an encoder-decoder structure, and acquiring an offset estimation map through the encoder;
and fusing the primary reflection features, the secondary reflection features and the offset estimation map to obtain ghost features, and inputting the ghost features into a decoder to obtain a ghost region prediction map.
4. The method for detecting glass based on deep learning and ghost phenomenon according to claim 3, wherein,
the feature constraint takes the difference between the primary reflection feature and the secondary reflection feature through deformable convolution, using the loss function:
Figure FDA0004083035700000021
where
Figure FDA0004083035700000022
is the deformable convolution operation,
Figure FDA0004083035700000023
is the primary reflection feature at the corresponding scale i, and
Figure FDA0004083035700000024
is the secondary reflection feature at the corresponding scale i.
5. The method for glass detection based on deep learning and ghost phenomenon according to any one of claims 1 to 4, wherein the backbone network is Swin Transformer.
6. A glass detection system based on deep learning and ghost phenomena, suitable for use in the detection method according to claim 1, characterized in that it comprises:
an acquisition module for acquiring a single original input image;
the ghost detection module is used for extracting ghost features based on an original input image through a deep learning method of a backbone network and calculating to obtain a ghost region prediction graph;
the glass segmentation module is used for extracting glass features based on a backbone network under the guidance of ghost cues after the ghost region prediction map is concatenated with the original input image along the channel dimension, and then decoding the glass features and acquiring a segmentation result of the glass region based on a convolutional neural network;
and the output module is used for outputting the glass region prediction graph.
7. The glass detection system based on deep learning and ghost phenomena according to claim 6, wherein the glass segmentation module has a U-shaped structure including an encoding part and a decoding part.
CN202310128767.1A 2023-02-17 2023-02-17 Glass detection method based on deep learning and ghost phenomenon Pending CN116228696A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310128767.1A CN116228696A (en) 2023-02-17 2023-02-17 Glass detection method based on deep learning and ghost phenomenon

Publications (1)

Publication Number Publication Date
CN116228696A true CN116228696A (en) 2023-06-06

Family

ID=86585241

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310128767.1A Pending CN116228696A (en) 2023-02-17 2023-02-17 Glass detection method based on deep learning and ghost phenomenon

Country Status (1)

Country Link
CN (1) CN116228696A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117333467A (en) * 2023-10-16 2024-01-02 山东景耀玻璃集团有限公司 Image processing-based glass bottle body flaw identification and detection method and system
CN117333467B (en) * 2023-10-16 2024-05-14 山东景耀玻璃集团有限公司 Image processing-based glass bottle body flaw identification and detection method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination