CN113313669A - Method for enhancing semantic features of top layer of surface disease image of subway tunnel - Google Patents
- Publication number
- CN113313669A (application number CN202110443056.4)
- Authority
- CN
- China
- Prior art keywords
- feature map
- feature
- level
- map
- original
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
- G06T7/0004—Industrial image inspection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/253—Fusion techniques of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10004—Still image; Photographic image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20016—Hierarchical, coarse-to-fine, multiscale or multiresolution image processing; Pyramid transform
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30108—Industrial image inspection
- G06T2207/30132—Masonry; Concrete
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- General Engineering & Computer Science (AREA)
- Computing Systems (AREA)
- Software Systems (AREA)
- Molecular Biology (AREA)
- Computational Linguistics (AREA)
- Biophysics (AREA)
- Biomedical Technology (AREA)
- Mathematical Physics (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Bioinformatics & Computational Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Evolutionary Biology (AREA)
- Multimedia (AREA)
- Quality & Reliability (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses a method for enhancing the top-level semantic features of subway tunnel surface disease images. The method comprises the following steps: constructing a pyramid structure model to extract multi-level original feature maps of an image; for the original top-level feature map extracted by the pyramid structure model, enhancing the top-level semantic features by using a channel self-attention mechanism and by using a sample label ground-truth map, to obtain a semantically enhanced top-level feature map; and replacing the original top-level feature map with the semantically enhanced top-level feature map, performing top-down inter-layer feature fusion, and training with the fused features as the predicted target feature maps output by a deep learning network, to obtain a deep learning detection and identification model for tunnel surface diseases. The invention can reduce the loss of feature information for the whole water leakage region during pyramid feature fusion and improve the accuracy of disease identification.
Description
Technical Field
The invention relates to the technical field of image detection and identification, in particular to a top-layer semantic feature enhancement method for a surface disease image of a subway tunnel.
Background
The detection and identification of defects such as water leakage and cracks on the surface of a subway tunnel is an important part of routine subway tunnel inspection. Because manual inspection is highly subjective and inefficient, machine-vision-based detection and identification of tunnel surface diseases has become a new trend in the industry in recent years. It mainly comprises traditional image processing methods and deep learning methods. Traditional image processing methods include threshold segmentation, edge detection, morphological analysis and the like. Although their computational complexity and hardware requirements are low, they struggle to overcome interference such as the low contrast of subway tunnel surface diseases, uneven illumination and severe background noise.
Compared with traditional image processing methods, deep learning methods use multilayer neural networks to mine multi-level image features from massive image data and continuously aggregate them into the network model, and then complete tasks such as classification, localization and segmentation of the input image data by training a specific network model. Deep learning methods show excellent generalization capability and robustness, and have been widely applied to image-based detection and identification of tunnel lining surface diseases in recent years. For example, patent application CN201910348834.4 discloses a subway shield tunnel disease detection method based on deep learning, which performs disease detection on collected shield tunnel images by using a cocknet deep learning model, thereby reducing to a certain extent the interference of environmental factors on damage identification. Patent application 201810843204.X discloses a device and method for detecting apparent diseases of a tunnel structure: the developed detection device photographs the surface of a subway tunnel, and a region-based fully convolutional network (R-FCN), which relies on region proposal candidate boxes, detects and identifies diseases in the captured images.
Although these methods, by virtue of the strong feature extraction and pattern classification capability of deep learning networks, achieve more accurate experimental results than traditional image processing methods, the following problems remain when they are applied to actual subway tunnel surface disease inspection: 1) the boundaries of patch-like diseases such as water leakage are indistinct because of the pronounced seepage variation pattern along the boundary, so boundary information is easily lost during disease detection and detection precision decreases; 2) as the number of layers of a deep learning network increases, the semantic feature information contained in the top-level feature map becomes increasingly rich. However, to implement top-down inter-layer feature fusion, the number of feature channels of the top-level feature map must be reduced. Because the top-level feature map contains the richest semantic information and the most channels, a large amount of feature information useful for disease detection and identification is lost during channel dimensionality reduction, which seriously affects the final detection and identification accuracy.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides a method for enhancing the semantic features of the top layer of a surface disease image of a subway tunnel.
According to the first aspect of the invention, a method for enhancing the semantic features of the top layer of a surface disease image of a subway tunnel is provided. The method comprises the following steps:
step S1, constructing a pyramid structure model to extract a multilayer original feature map of the image;
step S2, for the original top level feature map extracted by the pyramid structure model, enhancing the top level semantic features by utilizing a channel self-attention mechanism and enhancing the top level semantic features by utilizing a sample marking truth map to obtain a top level feature map with enhanced semantic features;
and step S3, replacing the original top-level feature map with the top-level feature map enhanced by the semantic features, performing top-down interlayer feature fusion, training the features subjected to interlayer feature fusion as a prediction target feature map output by a deep learning network, and obtaining a tunnel surface disease deep learning detection and recognition model.
According to a second aspect of the invention, a subway tunnel surface disease detection method is provided, which comprises the following steps:
a subway tunnel surface image to be inspected is collected and input into the tunnel surface disease deep learning detection and identification model obtained according to the invention, so as to identify subway tunnel surface diseases.
Compared with the prior art, the method has the advantage that the importance of each channel of the top-level feature map is calculated by a feature channel weight calculation module, so that important feature channels are more easily retained and background clutter is suppressed. Further, the similarity distance between the sample label ground-truth map and the top-level semantic feature map is calculated, which strengthens the weight that the deep learning network model assigns to the key feature channels of the water leakage region and reduces the loss of feature information for the whole water leakage region during pyramid feature fusion.
Other features of the present invention and advantages thereof will become apparent from the following detailed description of exemplary embodiments thereof, which proceeds with reference to the accompanying drawings.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description, serve to explain the principles of the invention.
FIG. 1 is a schematic diagram of a disease feature enhancement learning framework based on a feature pyramid structure according to an embodiment of the present invention;
FIG. 2 is a block diagram of top-level semantic feature information enhancement learning according to one embodiment of the present invention;
FIG. 3 is a flowchart of a top-level semantic feature enhancement method for a subway tunnel surface disease image according to an embodiment of the present invention;
FIG. 4 is a diagram of a feature pyramid network architecture and feature fusion enhancement learning framework according to an embodiment of the present invention;
FIG. 5 is a basic block diagram of a residual network according to one embodiment of the present invention;
FIG. 6 is a schematic diagram of a process for self-attention computation of each channel in a top-level feature map according to an embodiment of the present invention;
FIG. 7 is a diagram illustrating a sample label similarity weight calculation process, according to an embodiment of the invention.
Detailed Description
Various exemplary embodiments of the present invention will now be described in detail with reference to the accompanying drawings. It should be noted that: the relative arrangement of the components and steps, the numerical expressions and numerical values set forth in these embodiments do not limit the scope of the present invention unless specifically stated otherwise.
The following description of at least one exemplary embodiment is merely illustrative in nature and is in no way intended to limit the invention, its application, or uses.
Techniques, methods, and apparatus known to those of ordinary skill in the relevant art may not be discussed in detail but are intended to be part of the specification where appropriate.
In all examples shown and discussed herein, any particular value should be construed as merely illustrative, and not limiting. Thus, other examples of the exemplary embodiments may have different values.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, further discussion thereof is not required in subsequent figures.
The invention provides a novel technical scheme for enhancing the top-level feature information of subway tunnel surface disease images based on a feature pyramid network. As shown in fig. 1, it mainly comprises the following aspects.
As shown in the lower left dashed box of fig. 1, four original feature levels, C2, C3, C4 and C5, are extracted by using, for example, the feature pyramid structure in the deep residual network ResNet-101 as the basic feature extraction framework. C5 is the top-level feature map of the entire pyramid feature extraction network and has the largest number of feature channels (e.g., 2048 channels).
As shown in the upper left dashed box of fig. 1 and in fig. 2, in a conventional feature pyramid the top-level semantic feature map suffers serious loss of channel semantic information during top-down inter-layer feature fusion. To reduce this loss, the invention proposes applying both a feature channel weight calculation module and a sample label ground-truth map similarity distance calculation module to the top-level feature map of the feature pyramid, so as to realize enhancement learning of the top-level semantic feature information.
As shown in the dashed box on the right side of fig. 1, the top-level semantic feature enhancement method is used to perform enhancement learning on the top-level feature map, the learning result replaces the original top-level feature map, and top-down inter-layer feature fusion is performed; the image label ground-truth map is resized by down-sampling to match each inter-layer feature map; and a deep learning error loss function is established using a cross-entropy function, the error between the pixel predictions in the feature map and the image label ground-truth map is calculated, and the error is then back-propagated to update the network parameters of each module in each layer of the network.
Through continuous iterative learning, a deep learning detection and identification model of the tunnel surface diseases is finally obtained through training, and the deep learning detection and identification model can be applied to new tunnel surface disease image detection and identification.
Specifically, referring to fig. 3, the method for enhancing the semantic features of the top layer of the surface defect image of the subway tunnel includes the following steps.
Step S110, extracting multi-level feature maps of the image by using the pyramid structure model.
As shown in the left part of fig. 4, in one embodiment the deep residual network ResNet-101 is used as the basic feature extraction module. ResNet-101 has 101 layers and is composed of two kinds of basic blocks, Conv Block and Identity Block, connected alternately in series. The structure of the two basic blocks is shown in fig. 5, where Conv2D, BatchNorm and ReLu denote convolution, batch normalization and the ReLU activation function, respectively. After feature extraction by the ResNet-101 backbone, four original feature maps C2, C3, C4 and C5 are generated, where C2 is the bottom-level feature map and C5 is the top-level feature map. Taking an input image of size 1024 × 1024 × 3 as an example, the sizes of C2, C3, C4 and C5 are 256 × 256 × 256, 128 × 128 × 512, 64 × 64 × 1024 and 32 × 32 × 2048, respectively.
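Reading each size as height × width × channels, the sizes in this example follow from the usual ResNet stage strides (4, 8, 16, 32) and stage channel counts (256, 512, 1024, 2048). The strides are an assumption taken from the standard ResNet design, not something the text states, and `pyramid_shapes` is a hypothetical helper name. A minimal sketch:

```python
# Assumed (stride, channels) for the ResNet-101 stages that produce C2..C5.
STAGES = {"C2": (4, 256), "C3": (8, 512), "C4": (16, 1024), "C5": (32, 2048)}

def pyramid_shapes(height, width):
    """Return {stage: (H, W, channels)} for an input image of the given size."""
    return {
        name: (height // stride, width // stride, channels)
        for name, (stride, channels) in STAGES.items()
    }

shapes = pyramid_shapes(1024, 1024)
for name, shape in shapes.items():
    print(name, shape)
```

Each level halves the spatial resolution of the one below it, which is why the later top-down fusion only needs a 2× upsampling between neighbouring levels.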
Step S120, performing enhancement learning on the top-level features.
As shown in the "top-level Feature-enhanced learning" section on the lower left side of fig. 4, the top-level feature map C5 extracted by the pyramid backbone network is passed through a feature enhancement module (FE-Block), which computes and assigns a weight to each channel feature of C5. The FE-Block mainly comprises a channel self-attention mechanism and a sample label ground-truth map.
In one embodiment, enhancement learning of the top-level features includes:
Step S121, enhancing the top-level semantic feature map by using a channel self-attention mechanism.
The importance of each channel of the top-level feature map extracted by the pyramid backbone network is learned by a channel self-attention calculation; the specific structure is shown in fig. 6.
Specifically, the input feature (i.e., the input top-level feature map) is globally pooled, the relationships among the channels are then learned through a fully connected operation to obtain a weight for each channel of the top-level feature map, and finally the channel weights are multiplied by the original top-level feature map to obtain the channel-enhanced output. This process can be expressed as:
$F_{se} = F * W, \quad W = f_c[g_p(F)]$ (1)
where $F$ is the top-level input feature, $g_p(\cdot)$ is the global pooling layer, $f_c(\cdot)$ is the fully connected layer, $W$ is the channel weight, and $F_{se}$ is the channel-enhanced output.
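The channel weighting described here (global pooling, a fully connected operation, then a per-channel multiplication) can be sketched in NumPy as follows. The text does not specify the pooling type, the number of fully connected layers or the activation, so average pooling, a two-layer bottleneck with reduction ratio `r`, and a sigmoid are assumptions borrowed from squeeze-and-excitation style attention. The weights are random placeholders, not trained values.

```python
import numpy as np

def channel_attention(F, W1, W2):
    """F: (H, W, C) feature map. Returns (F_se, w) with w of shape (C,)."""
    pooled = F.mean(axis=(0, 1))            # g_p: global average pooling -> (C,)
    hidden = np.maximum(pooled @ W1, 0.0)   # first fully connected layer + ReLU
    logits = hidden @ W2                    # second fully connected layer -> (C,)
    w = 1.0 / (1.0 + np.exp(-logits))       # sigmoid keeps each weight in (0, 1)
    return F * w, w                         # broadcast the weights over H and W

rng = np.random.default_rng(0)
C, r = 64, 16                               # small channel count for illustration
F = rng.standard_normal((8, 8, C))
W1 = rng.standard_normal((C, C // r)) * 0.1
W2 = rng.standard_normal((C // r, C)) * 0.1
F_se, w = channel_attention(F, W1, W2)
print(F_se.shape, float(w.min()), float(w.max()))
```

Every spatial position of a channel is scaled by the same weight, so the operation changes the relative importance of channels without altering the spatial pattern within a channel.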
Step S122, enhancing the top-level semantic feature map by using the sample label ground-truth map.
As shown in fig. 7, the similarity distance between the disease sample label ground-truth map and each channel feature matrix of the top-level feature map is calculated, and the weights of all channels in the top-level feature map are determined according to this distance. The sample label similarity weight is calculated as:
$L_b = f_s[\Upsilon(F, F_b)]$ (2)
where $F$ is the top-level input feature, $F_b$ is the down-sampled sample label ground-truth map, $\Upsilon(\cdot,\cdot)$ is the feature-map Euclidean distance function, and $f_s(\cdot)$ is the feature-map similarity weight normalization function, such as a cosine similarity function.
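One possible reading of the sample label similarity weight is sketched below: the label map is down-sampled to the spatial size of the top-level feature map, a Euclidean distance to each channel is computed, and the distances are mapped to weights so that channels closer to the label map weigh more. The block-average down-sampling, the per-channel min-max scaling, and the `exp(-d / mean(d))` normalization are all assumptions chosen for illustration; the text only names a normalization function "such as a cosine similarity function" and does not give its form.

```python
import numpy as np

def downsample_mask(mask, out_h, out_w):
    """Block-average a binary mask to (out_h, out_w); sizes must divide evenly."""
    fh, fw = mask.shape[0] // out_h, mask.shape[1] // out_w
    return mask.reshape(out_h, fh, out_w, fw).mean(axis=(1, 3))

def label_similarity_weights(F, mask):
    """F: (H, W, C) feature map, mask: binary label map. Returns weights (C,)."""
    H, W, _ = F.shape
    Fb = downsample_mask(mask, H, W)
    # Scale each channel to [0, 1] so its distance to the mask is comparable.
    lo, hi = F.min(axis=(0, 1)), F.max(axis=(0, 1))
    Fn = (F - lo) / (hi - lo + 1e-8)
    # Euclidean distance between every channel and the down-sampled mask.
    dist = np.sqrt(((Fn - Fb[..., None]) ** 2).sum(axis=(0, 1)))
    # Assumed normalization: a smaller distance gives a weight closer to 1.
    return np.exp(-dist / (dist.mean() + 1e-8))

rng = np.random.default_rng(0)
mask = np.zeros((32, 32))
mask[8:24, 8:24] = 1.0                      # toy leakage region
F = rng.random((8, 8, 4))
F[..., 0] = downsample_mask(mask, 8, 8)     # channel 0 matches the label map
L_b = label_similarity_weights(F, mask)
print(L_b)
```

In this toy input the channel that reproduces the label map receives the largest weight, which is the behaviour the step is meant to encourage for leakage-region channels.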
Step S123, generating the semantically enhanced feature map.
The calculated feature channel weight matrix and the sample label similarity weight matrix are multiplied together with the original top-level feature map C5 to generate the semantically enhanced feature map S5, which replaces the original C5 in the subsequent top-down feature fusion.
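A minimal sketch of this combination step, assuming both weight sets are per-channel vectors that are broadcast over the spatial dimensions of C5 (the text does not spell out the shapes). `enhance_top_level` is a hypothetical name and the inputs are random placeholders.

```python
import numpy as np

def enhance_top_level(C5, channel_w, label_w):
    """Scale every channel of C5 (H, W, C) by both per-channel weights."""
    return C5 * channel_w[None, None, :] * label_w[None, None, :]

rng = np.random.default_rng(0)
C5 = rng.standard_normal((4, 4, 6))
channel_w = rng.random(6)                   # stands in for the channel weights W
label_w = rng.random(6)                     # stands in for the label weights L_b
label_w[2] = 0.0                            # a channel judged irrelevant to the label
S5 = enhance_top_level(C5, channel_w, label_w)
print(S5.shape)
```

Because the two weights multiply, a channel is kept strong only when both the self-attention branch and the label-similarity branch rate it highly.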
Step S130, top-down inter-layer feature fusion.
After the C2 to C4 feature maps extracted by the pyramid backbone network and the semantically enhanced top-level feature map S5 are obtained, the number of channels of every feature map is reduced to 256, with P5 being generated from S5 by channel dimensionality reduction, and top-down feature fusion is then performed. The specific process is shown in the "feature fusion" part of fig. 4 and comprises the following steps:
step S131, up-sampling P5 to the size of the channel-reduced C4 feature map (64 × 64 × 256), and adding the two to generate P4;
step S132, up-sampling P4 to the size of the channel-reduced C3 feature map (128 × 128 × 256), and adding the two to generate P3;
step S133, up-sampling P3 to the size of the channel-reduced C2 feature map (256 × 256 × 256), and adding the two to generate P2.
Finally, the P2, P3, P4 and P5 obtained by inter-layer feature fusion are used as the predicted target feature maps output by the whole deep learning network.
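The top-down fusion in steps S131 to S133 can be sketched as below. The channel reduction is written as a 1×1 convolution (a per-pixel matrix product) and the upsampling as nearest-neighbour 2× repetition; both are assumptions, since the text names neither operator. The toy sizes are much smaller than the 256-channel maps in the example, and the weights are random placeholders.

```python
import numpy as np

def reduce_channels(x, weight):
    """1x1 convolution as a per-pixel product: (H, W, Cin) @ (Cin, Cout)."""
    return x @ weight

def upsample2x(x):
    """Nearest-neighbour 2x upsampling of the two spatial dimensions."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

def top_down_fusion(C2, C3, C4, S5, weights):
    P5 = reduce_channels(S5, weights["S5"])
    P4 = upsample2x(P5) + reduce_channels(C4, weights["C4"])   # step S131
    P3 = upsample2x(P4) + reduce_channels(C3, weights["C3"])   # step S132
    P2 = upsample2x(P3) + reduce_channels(C2, weights["C2"])   # step S133
    return P2, P3, P4, P5

rng = np.random.default_rng(0)
out_ch = 4
# Toy pyramid: spatial size halves and channel count doubles at each level.
C2, C3 = rng.standard_normal((16, 16, 8)), rng.standard_normal((8, 8, 16))
C4, S5 = rng.standard_normal((4, 4, 32)), rng.standard_normal((2, 2, 64))
weights = {
    name: rng.standard_normal((cin, out_ch)) * 0.1
    for name, cin in {"C2": 8, "C3": 16, "C4": 32, "S5": 64}.items()
}
P2, P3, P4, P5 = top_down_fusion(C2, C3, C4, S5, weights)
print(P2.shape, P3.shape, P4.shape, P5.shape)
```

All four outputs share the same channel count, so the element-wise additions are well defined at every level.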
Step S140, training with the set error loss function as a constraint.
The predicted feature maps P2, P3 and P4 generated by the feature pyramid structure, together with the top-level feature map P5 obtained after feature enhancement and channel dimensionality reduction, serve as the basis for calculating the feature-map error loss function of the whole network. Each predicted feature map is enlarged to the size of the original input image, and the error loss corresponding to each predicted feature map is calculated according to the following formula.
where y is the sample-labeled binary image of the input image, and P is the predicted target feature map generated by the pyramid structure. From equation (3), the overall error loss function generated by the predicted feature maps over all training samples can be determined as:
Equation (4) can be generally expressed as:
where N is the number of predicted feature maps plus 1.
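The loss formulas themselves are not legible in this text, so their exact form is not restated here. As a rough illustration only, the sketch below assumes a per-pixel binary cross-entropy (the description names a cross-entropy function), assumes each prediction has already been reduced to a single-channel probability map, and assumes the per-map losses are simply averaged. None of these choices, nor the helper names, are confirmed by the text.

```python
import numpy as np

def upsample_to(pred, out_h, out_w):
    """Nearest-neighbour enlargement of a (h, w) map; sizes must divide evenly."""
    fh, fw = out_h // pred.shape[0], out_w // pred.shape[1]
    return pred.repeat(fh, axis=0).repeat(fw, axis=1)

def bce(y, p, eps=1e-7):
    """Mean binary cross-entropy between label image y and probabilities p."""
    p = np.clip(p, eps, 1.0 - eps)
    return float(-(y * np.log(p) + (1.0 - y) * np.log(1.0 - p)).mean())

def pyramid_loss(y, preds):
    """Enlarge each prediction to the label size and average the per-map losses."""
    H, W = y.shape
    losses = [bce(y, upsample_to(p, H, W)) for p in preds]
    return sum(losses) / len(losses), losses

y = np.zeros((16, 16))
y[4:12, 4:12] = 1.0                          # toy binary label image
perfect = [y[::2, ::2].copy(), y[::4, ::4].copy()]
uniform = [np.full((8, 8), 0.5)]
print(pyramid_loss(y, perfect)[0], pyramid_loss(y, uniform)[0])
```

A prediction that matches the label at every level drives the loss towards zero, while an uninformative prediction of 0.5 everywhere gives a loss of ln 2.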
Based on the error loss function and through continuous iterative learning, the tunnel surface disease deep learning detection and identification model finally obtained by training can be applied to the detection and identification of new tunnel surface disease images. The identifiable tunnel diseases include but are not limited to deformation invasion limit, cracks, water leakage, slab staggering, chipping, collapse, foundation grout pumping, sinking, bottom bulging, lining back holes and the like.
In summary, the present invention focuses on enhancement learning of the top-level feature map of the feature pyramid in a deep learning network. Compared with the prior art, it has at least the following advantages:
1) In a traditional feature pyramid network, the top-level feature map, having passed through multiple rounds of convolution and pooling, is small in scale, has many channels and carries rich semantic feature information. During feature fusion among the feature pyramid levels, the top-level feature map must undergo channel dimensionality reduction, so a large amount of feature information useful for identifying disease objects is lost. The invention calculates the importance of each channel of the top-level feature map with a feature channel weight calculation module, so that important feature channels are more easily retained and background clutter is suppressed.
2) The boundary of a water leakage disease exhibits a distinct seepage pattern, and existing deep learning methods easily lose this boundary during disease detection, which reduces detection precision. The invention further calculates the similarity distance between the sample label ground-truth map and the top-level semantic feature map, which strengthens the weight that the deep learning network model assigns to the key feature channels of the water leakage region and reduces the loss of feature information for the whole water leakage region during pyramid feature fusion.
It should be noted that those skilled in the art may make appropriate changes or modifications to the above-described embodiments without departing from the spirit and scope of the present invention. For example, besides a deep residual network, other types of network models may be used as the basic feature extraction module, and the present invention does not limit the number of levels from which feature maps are extracted, the size of the convolution kernels, the dimensions of the feature maps, and the like.
The present invention may be a system, method and/or computer program product. The computer program product may include a computer-readable storage medium having computer-readable program instructions embodied therewith for causing a processor to implement various aspects of the present invention.
The computer readable storage medium may be a tangible device that can hold and store the instructions for use by the instruction execution device. The computer readable storage medium may be, for example, but not limited to, an electronic memory device, a magnetic memory device, an optical memory device, an electromagnetic memory device, a semiconductor memory device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a Static Random Access Memory (SRAM), a portable compact disc read-only memory (CD-ROM), a Digital Versatile Disc (DVD), a memory stick, a floppy disk, a mechanical coding device, such as punch cards or in-groove projection structures having instructions stored thereon, and any suitable combination of the foregoing. Computer-readable storage media as used herein is not to be construed as transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission medium (e.g., optical pulses through a fiber optic cable), or electrical signals transmitted through electrical wires.
The computer-readable program instructions described herein may be downloaded from a computer-readable storage medium to a respective computing/processing device, or to an external computer or external storage device via a network, such as the internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, fiber optic transmission, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers. The network adapter card or network interface in each computing/processing device receives computer-readable program instructions from the network and forwards the computer-readable program instructions for storage in a computer-readable storage medium in the respective computing/processing device.
The computer program instructions for carrying out operations of the present invention may be assembler instructions, Instruction Set Architecture (ISA) instructions, machine-related instructions, microcode, firmware instructions, state setting data, or source code or object code written in any combination of one or more programming languages, including an object oriented programming language such as Smalltalk, C + +, Python, or the like, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, aspects of the present invention are implemented by personalizing an electronic circuit, such as a programmable logic circuit, a Field Programmable Gate Array (FPGA), or a Programmable Logic Array (PLA), with state information of computer-readable program instructions, which can execute the computer-readable program instructions.
Aspects of the present invention are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer-readable program instructions.
These computer-readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer-readable program instructions may also be stored in a computer-readable storage medium that can direct a computer, programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer-readable medium storing the instructions comprises an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.
The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer, other programmable apparatus or other devices implement the functions/acts specified in the flowchart and/or block diagram block or blocks.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions. It is well known to those skilled in the art that implementation by hardware, by software, and by a combination of software and hardware are equivalent.
Having described embodiments of the present invention, the foregoing description is intended to be exemplary, not exhaustive, and not limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein is chosen in order to best explain the principles of the embodiments, the practical application, or improvements made to the technology in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein. The scope of the invention is defined by the appended claims.
Claims (10)
1. A method for enhancing the top-level semantic features of a subway tunnel surface disease image, comprising the following steps:
step S1, constructing a pyramid structure model to extract multi-layer original feature maps of the image;
step S2, for the original top-level feature map extracted by the pyramid structure model, enhancing the top-level semantic features by using a channel self-attention mechanism and by using a sample label truth map, to obtain a semantic-feature-enhanced top-level feature map;
and step S3, replacing the original top-level feature map with the semantic-feature-enhanced top-level feature map, performing top-down inter-layer feature fusion, and training with the features obtained by inter-layer feature fusion as the predicted target feature maps output by the deep learning network, to obtain a deep learning detection and recognition model for tunnel surface diseases.
2. The method according to claim 1, wherein step S2 includes:
pooling the original top-level feature map and learning the relationships among all channels through a fully connected operation, to obtain the weight of each channel in the original top-level feature map;
calculating the similarity distance between the sample label truth map and each channel feature matrix of the original top-level feature map, and determining the weight of each channel in the original top-level feature map according to the similarity distance, to obtain the sample label similarity weights;
and multiplying the obtained feature channel weight matrix and the sample label similarity weight matrix by the original top-level feature map to generate the semantic-feature-enhanced top-level feature map.
3. The method according to claim 2, wherein the weight of each channel in the original top-level feature map is represented as:
$$F_{se} = F * W$$
$$W = f_c[g_p(F)]$$
wherein $F$ is the top-level input feature, $g_p(\cdot)$ is a global pooling layer, $f_c(\cdot)$ is the fully connected layer, and $W$ is the channel weight.
4. The method of claim 2, wherein the sample label similarity weight is expressed as:
$$L_b = f_s[\Upsilon(F, F_b)]$$
wherein $F$ is the top-level input feature, $F_b$ is the down-sampled sample label truth map, $\Upsilon(\cdot,\cdot)$ is the feature map Euclidean distance calculation function, and $f_s(\cdot)$ is the feature map similarity weight coefficient normalization function.
5. The method of claim 1, wherein the loss function for training the deep learning network is established according to the following steps:
upsampling each predicted feature map to the size of the original input image, and calculating the error loss corresponding to each predicted feature map, wherein the error loss is expressed as:
wherein y is the sample label binary image of the input image, and P is the predicted target feature map generated by the pyramid structure model;
determining the overall error loss function generated by the predicted feature maps of all training samples, expressed as:
wherein k is the index of the predicted feature map and N is an integer related to the number of predicted feature maps.
6. The method of claim 1, wherein in step S3 the top-down inter-layer feature fusion is performed according to the following steps:
upsampling the semantic-feature-enhanced top-level feature map so that its size is the same as that of the second-highest-level feature map after channel dimensionality reduction, and adding the two feature maps to generate a fused feature map;
and upsampling the fused feature map so that its size is the same as that of the channel-reduced feature map of the next lower level, adding the two feature maps to generate a fused feature map, and so on in sequence, the features finally generated through inter-layer feature fusion being used as the predicted target feature maps output by the deep learning network.
7. The method of claim 1, wherein the pyramid structure model is constructed based on the deep residual network ResNet-101.
8. A subway tunnel surface disease detection method, comprising the following steps:
acquiring a subway tunnel surface image to be detected, and inputting it into the tunnel surface disease deep learning detection and recognition model obtained by the method according to any one of claims 1 to 7, to identify subway tunnel surface diseases.
9. A computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the steps of the method according to any one of claims 1 to 8.
10. A computer device comprising a memory and a processor, on which memory a computer program is stored which is executable on the processor, characterized in that the steps of the method of any of claims 1 to 8 are implemented when the processor executes the program.
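As an illustration only, the channel weighting of claims 2 to 4 and the top-down fusion of claim 6 can be sketched in plain Python on nested lists shaped channels x height x width. Several details here are assumptions that the claims do not fix: the sigmoid applied after the fully connected layer, the choice of exp(-d) divided by its sum as the normalization function f_s, nearest-neighbor upsampling, and every function and variable name. The lower-level maps are assumed to have already been reduced to the same channel count as the top-level map.

```python
import math

def global_avg_pool(fmap):
    """g_p: reduce a C x H x W feature map to one mean value per channel."""
    return [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in fmap]

def channel_weights(fmap, fc_weights, fc_bias):
    """W = f_c[g_p(F)]; the sigmoid squashing is an assumption of this sketch."""
    pooled = global_avg_pool(fmap)
    return [1.0 / (1.0 + math.exp(-(sum(w * p for w, p in zip(row, pooled)) + b)))
            for row, b in zip(fc_weights, fc_bias)]

def similarity_weights(fmap, truth):
    """L_b = f_s[Upsilon(F, F_b)]: per-channel Euclidean distance to the truth
    map, then an assumed normalization so that closer channels weigh more."""
    dists = [math.sqrt(sum((a - b) ** 2
                           for ra, rb in zip(ch, truth) for a, b in zip(ra, rb)))
             for ch in fmap]
    scores = [math.exp(-d) for d in dists]
    total = sum(scores)
    return [s / total for s in scores]

def enhance_top_level(fmap, truth, fc_weights, fc_bias):
    """Scale each channel of F by its channel weight and its similarity weight."""
    w = channel_weights(fmap, fc_weights, fc_bias)
    lb = similarity_weights(fmap, truth)
    return [[[x * wc * lc for x in row] for row in ch]
            for ch, wc, lc in zip(fmap, w, lb)]

def upsample_nearest(fmap, out_h, out_w):
    """Resize every channel to out_h x out_w by nearest-neighbor lookup."""
    in_h, in_w = len(fmap[0]), len(fmap[0][0])
    return [[[ch[i * in_h // out_h][j * in_w // out_w] for j in range(out_w)]
             for i in range(out_h)] for ch in fmap]

def add_maps(a, b):
    """Element-wise sum of two feature maps with identical shapes."""
    return [[[x + y for x, y in zip(ra, rb)] for ra, rb in zip(ca, cb)]
            for ca, cb in zip(a, b)]

def top_down_fusion(enhanced_top, laterals):
    """laterals: channel-reduced lower-level maps, ordered from the
    second-highest level downward. Returns the map produced at every level."""
    fused = enhanced_top
    outputs = [fused]
    for lateral in laterals:
        h, w = len(lateral[0]), len(lateral[0][0])
        fused = add_maps(upsample_nearest(fused, h, w), lateral)
        outputs.append(fused)
    return outputs

# Toy data: a 2-channel 2 x 2 top-level map and a 2 x 2 down-sampled truth map.
top = [[[1.0, 0.0], [0.0, 1.0]], [[0.0, 0.0], [0.0, 0.0]]]
truth = [[1.0, 0.0], [0.0, 1.0]]
identity, zero_bias = [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]  # untrained stand-in for f_c

enhanced = enhance_top_level(top, truth, identity, zero_bias)
lateral = [[[0.0] * 4 for _ in range(4)] for _ in range(2)]  # 2 channels, 4 x 4
pyramid = top_down_fusion(enhanced, [lateral])
print(len(pyramid), len(pyramid[1][0]), len(pyramid[1][0][0]))
```

In this toy input the first channel coincides with the truth map, so it receives the larger similarity weight, while the all-zero second channel stays zero after enhancement. In a trained network the fully connected weights would be learned rather than set to the identity.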
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110443056.4A CN113313669B (en) | 2021-04-23 | 2021-04-23 | Method for enhancing semantic features of top layer of surface defect image of subway tunnel |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113313669A (en) | 2021-08-27 |
CN113313669B (en) | 2022-06-03 |
Family
ID=77372634
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110443056.4A Active CN113313669B (en) | 2021-04-23 | 2021-04-23 | Method for enhancing semantic features of top layer of surface defect image of subway tunnel |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113313669B (en) |
Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104268140A (en) * | 2014-07-31 | 2015-01-07 | 浙江大学 | Image retrieval method based on weight learning hypergraphs and multivariate information combination |
CN106503729A (en) * | 2016-09-29 | 2017-03-15 | 天津大学 | A kind of generation method of the image convolution feature based on top layer weights |
CN109300119A (en) * | 2018-09-11 | 2019-02-01 | 石家庄铁道大学 | Detection method, detection device and the terminal device in steel structure surface corrosion region |
CN109993774A (en) * | 2019-03-29 | 2019-07-09 | 大连理工大学 | Online Video method for tracking target based on depth intersection Similarity matching |
WO2020077940A1 (en) * | 2018-10-16 | 2020-04-23 | Boe Technology Group Co., Ltd. | Method and device for automatic identification of labels of image |
CN111462126A (en) * | 2020-04-08 | 2020-07-28 | 武汉大学 | Semantic image segmentation method and system based on edge enhancement |
CN111524117A (en) * | 2020-04-20 | 2020-08-11 | 南京航空航天大学 | Tunnel surface defect detection method based on characteristic pyramid network |
CN112149547A (en) * | 2020-09-17 | 2020-12-29 | 南京信息工程大学 | Remote sensing image water body identification based on image pyramid guidance and pixel pair matching |
CN112434721A (en) * | 2020-10-23 | 2021-03-02 | 特斯联科技集团有限公司 | Image classification method, system, storage medium and terminal based on small sample learning |
CN112613356A (en) * | 2020-12-07 | 2021-04-06 | 北京理工大学 | Action detection method and device based on deep attention fusion network |
CN112634292A (en) * | 2021-01-06 | 2021-04-09 | 烟台大学 | Asphalt pavement crack image segmentation method based on deep convolutional neural network |
Non-Patent Citations (2)
Title |
---|
GUANGYU REN ET AL.: "Salient Object Detection Combining a Self-Attention Module and a Feature Pyramid Network", 《ELECTRONICS》, 16 October 2020 (2020-10-16), pages 1 - 13 * |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114818998A (en) * | 2022-06-28 | 2022-07-29 | 浙江大学 | Method for judging mud pumping disease state of ballastless track foundation bed during slurry turning |
CN114818998B (en) * | 2022-06-28 | 2022-09-13 | 浙江大学 | Method for judging mud pumping disease state of ballastless track foundation bed during slurry turning |
CN117011688A (en) * | 2023-07-11 | 2023-11-07 | 广州大学 | Method, system and storage medium for identifying diseases of underwater structure |
CN117011688B (en) * | 2023-07-11 | 2024-03-08 | 广州大学 | Method, system and storage medium for identifying diseases of underwater structure |
Also Published As
Publication number | Publication date |
---|---|
CN113313669B (en) | 2022-06-03 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Kalfarisi et al. | Crack detection and segmentation using deep learning with 3D reality mesh model for quantitative assessment and integrated visualization | |
CN111461114B (en) | Multi-scale feature pyramid text detection method based on segmentation | |
Hoang et al. | Metaheuristic optimized edge detection for recognition of concrete wall cracks: a comparative study on the performances of roberts, prewitt, canny, and sobel algorithms | |
CN111476284B (en) | Image recognition model training and image recognition method and device and electronic equipment | |
CN111582175B (en) | High-resolution remote sensing image semantic segmentation method for sharing multi-scale countermeasure features | |
CN110443818B (en) | Graffiti-based weak supervision semantic segmentation method and system | |
WO2019192397A1 (en) | End-to-end recognition method for scene text in any shape | |
CN111488826B (en) | Text recognition method and device, electronic equipment and storage medium | |
CN114841972B (en) | Transmission line defect identification method based on saliency map and semantic embedded feature pyramid | |
US20190385054A1 (en) | Text field detection using neural networks | |
CN113313669B (en) | Method for enhancing semantic features of top layer of surface defect image of subway tunnel | |
CN114266794B (en) | Pathological section image cancer region segmentation system based on full convolution neural network | |
CN110895695A (en) | Deep learning network for character segmentation of text picture and segmentation method | |
CN113744153B (en) | Double-branch image restoration forgery detection method, system, equipment and storage medium | |
CN113313668B (en) | Subway tunnel surface disease feature extraction method | |
CN111507337A (en) | License plate recognition method based on hybrid neural network | |
CN114781499B (en) | Method for constructing ViT model-based intensive prediction task adapter | |
Manjari et al. | QEST: Quantized and efficient scene text detector using deep learning | |
CN117727046A (en) | Novel mountain torrent front-end instrument and meter reading automatic identification method and system | |
CN114283343B (en) | Map updating method, training method and device based on remote sensing satellite image | |
CN115330703A (en) | Remote sensing image cloud and cloud shadow detection method based on context information fusion | |
CN116486393A (en) | Scene text detection method based on image segmentation | |
El Abbadi | Scene Text detection and Recognition by Using Multi-Level Features Extractions Based on You Only Once Version Five (YOLOv5) and Maximally Stable Extremal Regions (MSERs) with Optical Character Recognition (OCR) | |
CN113837255B (en) | Method, apparatus and medium for predicting cell-based antibody karyotype class | |
Vidhyalakshmi et al. | Text detection in natural images with hybrid stroke feature transform and high performance deep Convnet computing |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||