CN112734739A - Visual building crack identification method based on attention mechanism and ResNet fusion - Google Patents
- Publication number
- CN112734739A (application CN202110065534.2A)
- Authority
- CN
- China
- Prior art keywords
- building
- picture
- attention mechanism
- crack
- image
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Links
- 238000000034 method Methods 0.000 title claims abstract description 34
- 230000007246 mechanism Effects 0.000 title claims abstract description 29
- 230000000007 visual effect Effects 0.000 title claims abstract description 14
- 230000004927 fusion Effects 0.000 title claims abstract description 11
- 238000004422 calculation algorithm Methods 0.000 claims abstract description 16
- 238000013528 artificial neural network Methods 0.000 claims abstract description 14
- 230000002146 bilateral effect Effects 0.000 claims abstract description 12
- 238000007781 pre-processing Methods 0.000 claims abstract description 11
- 230000004913 activation Effects 0.000 claims abstract description 9
- 238000005520 cutting process Methods 0.000 claims abstract description 8
- 238000001914 filtration Methods 0.000 claims abstract description 8
- 238000001514 detection method Methods 0.000 claims description 16
- 238000010586 diagram Methods 0.000 claims description 14
- 238000012800 visualization Methods 0.000 claims description 14
- 238000012549 training Methods 0.000 claims description 9
- 230000002708 enhancing effect Effects 0.000 claims description 6
- 238000004364 calculation method Methods 0.000 claims description 4
- 238000011176 pooling Methods 0.000 claims description 4
- 239000013598 vector Substances 0.000 claims description 4
- 230000003213 activating effect Effects 0.000 claims description 3
- 230000003044 adaptive effect Effects 0.000 claims description 3
- 238000010606 normalization Methods 0.000 claims description 3
- 238000005457 optimization Methods 0.000 claims description 3
- 230000008569 process Effects 0.000 abstract description 3
- 230000006870 function Effects 0.000 description 9
- 238000004590 computer program Methods 0.000 description 7
- 238000012545 processing Methods 0.000 description 7
- 238000013527 convolutional neural network Methods 0.000 description 5
- 238000007689 inspection Methods 0.000 description 5
- 230000009286 beneficial effect Effects 0.000 description 3
- 238000003062 neural network model Methods 0.000 description 3
- 238000003860 storage Methods 0.000 description 3
- 230000004048 modification Effects 0.000 description 2
- 238000012986 modification Methods 0.000 description 2
- 230000008859 change Effects 0.000 description 1
- 238000010276 construction Methods 0.000 description 1
- 238000003708 edge detection Methods 0.000 description 1
- 238000005516 engineering process Methods 0.000 description 1
- 230000036541 health Effects 0.000 description 1
- 238000004519 manufacturing process Methods 0.000 description 1
- 230000003287 optical effect Effects 0.000 description 1
- 238000003672 processing method Methods 0.000 description 1
- 230000001902 propagating effect Effects 0.000 description 1
- 230000002194 synthesizing effect Effects 0.000 description 1
- 238000007794 visualization technique Methods 0.000 description 1
Images
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
- G06T7/0004—Industrial image inspection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/20—Image enhancement or restoration using local operators
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/40—Image enhancement or restoration using histogram techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20024—Filtering details
- G06T2207/20028—Bilateral filtering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30108—Industrial image inspection
- G06T2207/30132—Masonry; Concrete
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Quality & Reliability (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Image Analysis (AREA)
- Image Processing (AREA)
Abstract
The invention relates to a visual building crack identification method based on the fusion of an attention mechanism and ResNet. The method comprises the following steps: (1) collecting building crack images with an unmanned aerial vehicle to construct a crack data set; (2) performing data preprocessing and data enhancement on the crack images using histogram equalization, bilateral filtering, random center cropping, and similar methods; (3) establishing a building crack identification model that combines an attention mechanism with a deep residual neural network; (4) adopting a gradient-weighted class activation heat map algorithm to visualize the image recognition result layer by layer, adjusting the network structure and network parameters according to the visualization result, and building the final model; (5) detecting images in the actual field with the adjusted and optimized model. The method can quickly and accurately identify cracks, effectively break the black-box mechanism of the neural network during identification, and provide a visual basis for adjusting the network structure.
Description
Technical Field
The invention relates to the technical field of computer image processing, in particular to a visual building crack identification method based on attention mechanism and ResNet fusion.
Background
In the engineering construction process, quality and safety inspection of buildings is a very important link, and detection of cracks in building outer walls is particularly important. Cracks not only affect the function and appearance of a building, but also reduce structural safety and degrade seismic performance. Rapid and accurate detection and identification of building cracks is therefore a pressing problem in the field of structural health monitoring.
At present, manual periodic inspection is commonly used to detect and identify building cracks. However, this approach is highly subjective, time-consuming and labor-intensive, costly, and inefficient. In addition, high-rise buildings and buildings with complex structures are difficult for inspectors to observe visually, making missed and false crack detections likely.
In recent years, crack recognition methods based on image processing technology have attracted much attention, but conventional image processing methods, represented by edge detection, suffer low detection accuracy because of shadows and low-contrast noise in crack images. Intelligent crack identification models based on deep neural networks are now widely researched. However, existing deep neural network models for crack detection have limited identification accuracy, cannot overcome the black-box nature of the neural network model, and do not allow the recognition result of each layer in the network to be visualized layer by layer. The optimal network model is therefore difficult to determine, and most existing crack detection networks are designed empirically.
Disclosure of Invention
The invention aims to overcome the problems of existing methods by providing a visual building crack identification method based on the fusion of an attention mechanism and ResNet. The method improves identification accuracy, breaks the black-box mechanism of conventional neural network crack identification, and aids the visual construction of a network model. In addition, the method operates on images acquired by an unmanned aerial vehicle, and the intelligent algorithm can rapidly and accurately identify cracks in building outer walls, reducing manual inspection cost and greatly improving detection efficiency.
In order to achieve the purpose, the technical scheme of the invention is as follows: a visual building crack identification method based on attention mechanism and ResNet fusion comprises the following steps:
Step S1, collecting a preset number of building outer wall pictures with an unmanned aerial vehicle, labeling the collected pictures as two classes of samples (with or without cracks), and constructing a training data set;
Step S2, preprocessing and enhancing the building outer wall pictures by histogram equalization, bilateral filtering, and random center cropping;
Step S3, establishing a deep residual neural network crack identification model based on an attention mechanism;
Step S4, feeding the data processed in step S2 into the model established in step S3 for training to obtain a preliminary building crack identification model;
Step S5, adopting the gradient-weighted class activation heat map algorithm to visualize each convolutional layer, adjusting the network structure and network parameters according to each layer's visualization of the sensitive regions of the feature maps, and retraining the model to obtain the optimal building crack identification model;
Step S6, constructing a detection data set from newly acquired unmanned aerial vehicle pictures of building outer walls, preprocessing the detection data set with histogram equalization and bilateral filtering, and passing the preprocessed pictures into the optimal building crack identification model; performing crack identification and layered model visualization on the building outer wall pictures, and outputting the picture identification result and the layered model visualization result.
In an embodiment of the present invention, the building outer wall picture preprocessing and image enhancement method in step S2 includes: performing histogram equalization on the three RGB channels of the building outer wall picture separately, then recombining the three channel vectors to obtain a picture with enhanced information content; denoising the picture with a bilateral filter, which removes noise without blurring the picture; randomly cropping the center of the picture into pictures of different sizes and aspect ratios within a set proportion range, then scaling the cropped pictures to 224 × 224; randomly flipping a given image horizontally with a given probability; and normalizing according to the mean and variance of the given RGB three-channel image.
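The per-channel histogram equalization in step S2 can be sketched as follows. This is a minimal numpy illustration, not the patent's implementation; in practice library routines such as OpenCV's `cv2.equalizeHist` and `cv2.bilateralFilter` would typically perform the equalization and bilateral-filtering steps.

```python
import numpy as np

def equalize_channel(channel):
    """Histogram-equalize one uint8 channel (values 0-255).
    Assumes the channel is not perfectly uniform."""
    hist = np.bincount(channel.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0].min()
    # Map each grey level through the normalised cumulative histogram.
    lut = np.clip(np.round((cdf - cdf_min) / (cdf[-1] - cdf_min) * 255), 0, 255)
    return lut.astype(np.uint8)[channel]

def preprocess(rgb):
    """Equalize the three RGB channels independently, then recombine them."""
    return np.stack([equalize_channel(rgb[..., c]) for c in range(3)], axis=-1)
```

Applied to a low-contrast crack photograph, this stretches each channel's intensity range before the bilateral-filtering, cropping, and normalization steps described above.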
In an embodiment of the present invention, in step S3, the attention-based deep residual neural network consists of a ResNet18 network plus AM attention modules: an AM attention module is inserted after each of the four middle layers of ResNet18. Each AM attention module consists of a channel attention module and a spatial attention module; the AM module sequentially infers attention maps along two independent dimensions, channel and spatial, and multiplies each attention map with the input feature map for adaptive feature refinement.
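A minimal numpy sketch of such a channel-plus-spatial attention module. The patent publishes no code; CBAM-style modules of this kind normally use a shared two-layer MLP for channel attention and a 7×7 convolution for spatial attention — here the spatial convolution is simplified to a per-pixel 1×1 combination, and the weights `w1`, `w2`, `conv_w` are hypothetical placeholders for learned parameters.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(x, w1, w2):
    """x: (C, H, W). Shared two-layer MLP over avg- and max-pooled channel descriptors."""
    avg, mx = x.mean(axis=(1, 2)), x.max(axis=(1, 2))        # both shape (C,)
    att = sigmoid(w2 @ np.maximum(w1 @ avg, 0.0) + w2 @ np.maximum(w1 @ mx, 0.0))
    return x * att[:, None, None]                            # reweight each channel

def spatial_attention(x, conv_w):
    """Channel-wise avg/max maps combined per pixel (1x1 stand-in for a 7x7 conv)."""
    mask = sigmoid(conv_w[0] * x.mean(axis=0) + conv_w[1] * x.max(axis=0))
    return x * mask[None]                                    # reweight each location

def am_block(x, w1, w2, conv_w):
    """AM module: channel attention first, then spatial attention, as described in step S3."""
    return spatial_attention(channel_attention(x, w1, w2), conv_w)
```

Because both gates are sigmoids, each stage scales the feature map by values in (0, 1), letting the network emphasize crack-relevant channels and locations.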
In an embodiment of the present invention, in step S5, the specific steps of visualizing each convolutional layer with the gradient-weighted class activation heat map algorithm are as follows:
Step S51, after obtaining the initial building crack identification model, calculate the partial derivative of the network's final output $y^c$ (the input to the Softmax layer) with respect to each pixel of the feature map:

$$\frac{\partial y^{c}}{\partial A_{ij}^{k}}$$

wherein $y^c$ is the score corresponding to crack class $c$, $A^k$ is the $k$-th feature map output by the last convolutional layer, and $i$ and $j$ are the width and height indices of each pixel in the feature map;

Step S52, after the partial derivative of $y^c$ with respect to each pixel of the $k$-th feature map is obtained, apply global average pooling over the width and height dimensions to obtain the weight coefficient $\alpha_{k}^{c}$ of the $k$-th feature map for class $c$:

$$\alpha_{k}^{c}=\frac{1}{Z}\sum_{i}\sum_{j}\frac{\partial y^{c}}{\partial A_{ij}^{k}}$$

wherein $Z$ is the number of pixels in the feature map and $A_{ij}^{k}$ is the value at position $(i, j)$ of the $k$-th feature map; the class activation map is then the ReLU-rectified weighted sum of the feature maps:

$$L^{c}=\mathrm{ReLU}\Big(\sum_{k}\alpha_{k}^{c}A^{k}\Big),\qquad \mathrm{ReLU}(x)=\max(0,x).$$
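Steps S51–S52 can be sketched numerically. This is an assumed minimal numpy version of the standard Grad-CAM computation — global average pooling of the class gradients to obtain the per-map weights, followed by the ReLU-rectified weighted sum of feature maps; in a real pipeline the gradients would come from automatic differentiation of the trained network.

```python
import numpy as np

def grad_cam(feature_maps, grads):
    """feature_maps, grads: arrays of shape (K, H, W) holding A^k and dy^c/dA^k.
    Returns the ReLU-rectified class activation map for class c."""
    weights = grads.mean(axis=(1, 2))                  # alpha^c_k via global average pooling
    cam = np.tensordot(weights, feature_maps, axes=1)  # sum_k alpha^c_k * A^k, shape (H, W)
    return np.maximum(cam, 0.0)                        # ReLU keeps positive evidence only
```

A location whose weighted sum is negative collapses to zero, which is exactly the role of the ReLU in the formula: only features that positively support the crack class survive in the heat map.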
Compared with the prior art, the invention has the following beneficial effects:
1. The method adopts a deep residual learning algorithm, making crack identification on building outer walls more intelligent and reducing manual inspection cost.
2. Acquiring building outer wall images with an unmanned aerial vehicle overcomes detection difficulties caused by harsh geographic environments, so the method is highly applicable and easy to popularize.
3. Using a deep residual neural network as the basic architecture effectively mitigates the drop in identification accuracy that deep network models suffer as depth increases.
4. The AM module weights the feature map from two perspectives, channel and spatial, so the network quickly attends to the image regions that matter, accelerating model training and convergence.
5. The AM attention module is a lightweight, general-purpose module that can be seamlessly integrated into any convolutional neural network architecture with negligible overhead and trained end to end with the base convolutional neural network.
6. The Gradcam algorithm can be seamlessly integrated into the model and can visualize the residual blocks layer by layer without changing the model structure, yielding the gradient-weighted class activation heat map. A high-resolution, concept-specific guided gradient-weighted class activation heat map (Guided Gradcam) is obtained through back-propagation.
7. The feature maps obtained by the Gradcam visualization algorithm break the black-box mechanism of the convolutional neural network and enhance model interpretability. The visualization result of each layer provides a theoretical basis for efficiently adjusting the network structure.
8. The algorithm of the invention is simple to implement and runs fast.
Drawings
FIG. 1 is an architecture diagram of the attention-based deep residual neural network building crack identification and visualization method proposed by the invention;
FIG. 2 shows one convolution block of the attention-based deep residual neural network crack identification method; the whole network contains four such blocks;
FIG. 3 is a flow chart of the gradient-weighted class activation heat map visualization algorithm of the present invention;
FIG. 4 is a first example of the algorithm provided by the present invention identifying a crack in a building outer wall;
FIG. 5 is a second example of the algorithm provided by the present invention identifying a crack in a building outer wall.
Detailed Description
The technical scheme of the invention is specifically explained below with reference to the accompanying drawings.
It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the disclosure. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of example embodiments according to the present application. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, and it should be understood that when the terms "comprises" and/or "comprising" are used in this specification, they specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof, unless the context clearly indicates otherwise.
As shown in FIG. 1, the invention provides a visual building crack identification method based on the fusion of an attention mechanism and ResNet, comprising the following steps:
Step S1, collecting a preset number of building outer wall pictures with an unmanned aerial vehicle, labeling the collected pictures as two classes of samples (with or without cracks), and constructing a training data set;
Step S2, preprocessing and enhancing the building outer wall pictures by histogram equalization, bilateral filtering, and random center cropping;
Step S3, establishing a deep residual neural network crack identification model based on an attention mechanism;
Step S4, feeding the data processed in step S2 into the model established in step S3 for training to obtain a preliminary building crack identification model;
Step S5, adopting the gradient-weighted class activation heat map algorithm to visualize each convolutional layer, adjusting the network structure and network parameters according to each layer's visualization of the sensitive regions of the feature maps, and retraining the model to obtain the optimal building crack identification model;
Step S6, constructing a detection data set from newly acquired unmanned aerial vehicle pictures of building outer walls, preprocessing the detection data set with histogram equalization and bilateral filtering, and passing the preprocessed pictures into the optimal building crack identification model; performing crack identification and layered model visualization on the building outer wall pictures, and outputting the picture identification result and the layered model visualization result.
In the embodiment of the present invention, the building outer wall image preprocessing and enhancement method in step S2 is as follows: histogram equalization is performed on the three RGB channels of the picture, and the three channel vectors are then recombined to obtain a building outer wall image with enhanced information content; the image is denoised with a bilateral filter, removing noise without blurring the picture; the center of the picture is randomly cropped into pictures of different sizes and aspect ratios within a set proportion range, and the cropped pictures are scaled to 224 × 224; a given image is randomly flipped horizontally with a given probability; and normalization is performed according to the mean and variance of the given RGB three-channel image.
In the embodiment of the present invention, the preprocessing in step S2 performs gray-level histogram equalization on each RGB component of the color building outer wall image and then recombines the three channel vectors to obtain an image with enhanced information content. The bilateral filter smooths the building outer wall image by assigning each pixel a value that jointly weights spatial distance and color difference, so noise is removed effectively while edge information is well preserved.
In the embodiment of the present invention, the crack sample data enhancement in step S2 randomly crops the center of the picture into pictures of different sizes and aspect ratios within a set proportion range and scales the cropped pictures to 224 × 224; randomly flips a given image horizontally with a given probability; and normalizes according to the mean and variance of the given RGB three-channel image.
In the present embodiment, the attention-based deep residual neural network in step S3 consists of the ResNet18 network plus AM attention modules. The ResNet18 network is shown in FIG. 2. The AM attention module consists of a channel attention module and a spatial attention module, which sequentially infer attention maps along two independent dimensions (channel and spatial) and multiply them with the input feature map for adaptive feature refinement, as shown in FIG. 3. An AM module is inserted after each of the four middle layers of ResNet18; since AM is a lightweight general-purpose module, its overhead is negligible, it can be seamlessly integrated into any convolutional neural network architecture, and it can be trained end to end with the base convolutional neural network.
In the present embodiment, ResNet18 consists of 4 residual blocks, 18 convolutional layers, 18 ReLU layers, 2 pooling layers, and 1 fully connected layer; the 4 AM modules are placed after the 4 residual blocks respectively to perform channel and spatial attention operations (FIG. 2).
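The placement of the four AM modules behind the four residual blocks can be sketched structurally. This is a hedged illustration with generic callables standing in for the actual residual blocks and attention modules; the real model would be built in a deep learning framework.

```python
def build_am_resnet18(resnet_blocks, am_modules):
    """Interleave the 4 residual stages of ResNet18 with the 4 AM attention modules.
    Both arguments are length-4 lists of callables mapping feature map -> feature map."""
    assert len(resnet_blocks) == len(am_modules) == 4
    def forward(x):
        for block, am in zip(resnet_blocks, am_modules):
            x = am(block(x))  # each residual stage's output is refined by its AM module
        return x
    return forward
```

The design choice is that attention follows each stage rather than sitting inside the residual branch, so the refined map feeds directly into the next stage.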
In the present embodiment, the gradient-weighted class activation heat map visualization algorithm in step S5 works as follows. Given an image and a target class as input, the image is forwarded through the model to obtain the raw score of that class. The gradient of the target class is set to 1 and the gradients of all other classes are set to zero; a convolutional layer to visualize is selected, and the signal is then back-propagated to the convolutional feature maps of interest. Combining the weight information of the feature maps in the layers preceding the selected convolutional layer yields the gradient-weighted class activation heat map (Gradcam), in which the darker regions are the areas of the feature map the model is sensitive to and thus the basis of the model's decision, as shown in FIG. 4(a) and FIG. 5(a). Finally, the heat map is multiplied pointwise by the result of guided back-propagation to obtain a high-resolution, concept-specific guided gradient-weighted class activation heat map (Guided Gradcam), which further reveals fine-grained, pixel-level gradient information about the crack, as shown in FIG. 4(b) and FIG. 5(b).
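The final pointwise product that yields Guided Gradcam can be sketched as follows — a minimal numpy illustration; the nearest-neighbour upsampling here stands in for whatever interpolation an implementation uses to bring the coarse heat map up to input resolution.

```python
import numpy as np

def upsample_nearest(cam, out_h, out_w):
    """Nearest-neighbour resize of the coarse (H, W) CAM to the input resolution."""
    h, w = cam.shape
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    return cam[np.ix_(rows, cols)]

def guided_grad_cam(cam, guided_backprop):
    """Pointwise product of the upsampled heat map with the guided-backprop
    saliency (out_h, out_w, 3), as described for Guided Gradcam above."""
    up = upsample_nearest(cam, *guided_backprop.shape[:2])
    return guided_backprop * up[..., None]
```

Regions where the coarse heat map is zero are suppressed entirely, so only pixel-level gradients inside the class-relevant regions survive.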
In an embodiment of the present invention, a gradient-weighted class activation heat map algorithm (GradCam) is adopted; the specific steps for visualizing each convolutional layer are as follows:
Step S51, after obtaining the preliminary building crack identification model, calculate the partial derivative of the network's final output $y^c$ (the input to the Softmax layer) with respect to each pixel of the feature map:

$$\frac{\partial y^{c}}{\partial A_{ij}^{k}}$$

wherein $y^c$ is the score corresponding to crack class $c$, $A^k$ is the $k$-th feature map output by the last convolutional layer, and $i$ and $j$ are the width and height indices of each pixel in the feature map;

Step S52, after the partial derivative of $y^c$ with respect to each pixel of the $k$-th feature map is obtained, apply global average pooling over the width and height dimensions to obtain the weight coefficient $\alpha_{k}^{c}$ of the $k$-th feature map for class $c$:

$$\alpha_{k}^{c}=\frac{1}{Z}\sum_{i}\sum_{j}\frac{\partial y^{c}}{\partial A_{ij}^{k}}$$

wherein $Z$ is the number of pixels in the feature map and $A_{ij}^{k}$ is the value at position $(i, j)$ of the $k$-th feature map; the class activation map is then the ReLU-rectified weighted sum of the feature maps:

$$L^{c}=\mathrm{ReLU}\Big(\sum_{k}\alpha_{k}^{c}A^{k}\Big),\qquad \mathrm{ReLU}(x)=\max(0,x).$$
as will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The foregoing is directed to preferred embodiments of the present invention; other and further embodiments may be devised without departing from its basic scope, which is determined by the claims that follow. Any simple modification, equivalent change, or variation of the above embodiments according to the technical essence of the present invention falls within the protection scope of the technical solution of the invention.
Claims (4)
1. A visual building crack identification method based on attention mechanism and ResNet fusion is characterized by comprising the following steps:
Step S1, collecting a preset number of building outer wall pictures with an unmanned aerial vehicle, labeling the collected pictures as two classes of samples (with or without cracks), and constructing a training data set;
Step S2, preprocessing and enhancing the building outer wall pictures by histogram equalization, bilateral filtering, and random center cropping;
Step S3, establishing a ResNet deep residual neural network crack identification model based on an attention mechanism;
Step S4, feeding the data processed in step S2 into the model established in step S3 for training to obtain a preliminary building crack identification model;
Step S5, adopting the gradient-weighted class activation heat map algorithm to visualize each convolutional layer, adjusting the network structure and network parameters according to each layer's visualization of the sensitive regions of the feature maps, and retraining the model to obtain the optimal building crack identification model;
Step S6, constructing a detection data set from newly acquired unmanned aerial vehicle pictures of building outer walls, preprocessing the detection data set with histogram equalization and bilateral filtering, and passing the preprocessed pictures into the optimal building crack identification model; performing crack identification and layered model visualization on the building outer wall pictures, and outputting the picture identification result and the layered model visualization result.
2. The visual building crack identification method based on the fusion of the attention mechanism and ResNet according to claim 1, wherein the building outer wall picture preprocessing and image enhancement method in step S2 comprises: performing histogram equalization on the three RGB channels of the building outer wall picture separately, then recombining the three channel vectors to obtain a picture with enhanced information content; denoising the picture with a bilateral filter, which removes noise without blurring the picture; randomly cropping the center of the picture into pictures of different sizes and aspect ratios within a set proportion range, then scaling the cropped pictures to 224 × 224; randomly flipping a given image horizontally with a given probability; and normalizing according to the mean and variance of the given RGB three-channel image.
3. The visual building crack identification method based on the fusion of the attention mechanism and ResNet according to claim 1, wherein in step S3 the attention-based deep residual neural network consists of a ResNet18 network plus AM attention modules, that is, an AM attention module is inserted after each of the four middle layers of ResNet18; the AM attention module consists of a channel attention module and a spatial attention module, sequentially infers attention maps along two independent dimensions, channel and spatial, and multiplies each attention map with the input feature map for adaptive feature refinement.
4. The method for visually identifying building cracks based on the fusion of the attention mechanism and ResNet according to claim 1, wherein in step S5 each convolutional layer is visualized with the gradient-weighted class activation heatmap (Grad-CAM) algorithm, in the following specific steps:
step S51, after obtaining the initial building crack identification model, calculating the partial derivative of the output value y^c of the final layer of the network, namely the layer before the Softmax layer, with respect to each pixel of the feature map:

∂y^c / ∂A^k_(i,j)

wherein y^c is the score corresponding to crack class c, A^k is the k-th feature map output by the last convolutional layer, and i and j are respectively the width and height indices of a pixel in the feature map;
step S52, after the partial derivative of y^c with respect to each pixel of the k-th feature map is obtained, taking the global average pooling over the width and height dimensions to obtain the weight coefficient α^c_k of the k-th feature map for class c:

α^c_k = (1/Z) Σ_i Σ_j ∂y^c / ∂A^k_(i,j)

wherein Z is the number of pixels in the feature map and A^k_(i,j) is the value at position (i, j) of the k-th feature map;
step S53, passing the weighted combination of the feature maps through a ReLU to obtain the class activation heatmap:

L^c = ReLU( Σ_k α^c_k · A^k )

wherein the calculation formula of the ReLU function is: ReLU(x) = max(0, x).
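Steps S51–S52 and the ReLU-weighted combination can be sketched as follows, given activations and gradients that are assumed to be precomputed (obtaining the gradients themselves requires a framework hook on the last convolutional layer, which is omitted here):

```python
import numpy as np

def grad_cam(feature_maps, grads):
    """
    feature_maps: (K, H, W) activations A^k of the last conv layer.
    grads:        (K, H, W) gradients  dy^c / dA^k_(i,j) for class c.
    Returns the (H, W) Grad-CAM heatmap L^c.
    """
    # Step S52: global average pooling of the gradients over width and
    # height yields one weight alpha^c_k per feature map.
    alpha = grads.mean(axis=(1, 2))                              # (K,)
    # Weighted sum of feature maps, then ReLU, so that only regions
    # with a positive influence on the crack score survive.
    cam = np.maximum((alpha[:, None, None] * feature_maps).sum(axis=0), 0.0)
    return cam
```

The resulting heatmap is typically upsampled to the input resolution and overlaid on the wall picture, which is what the claim's "model layer-wise visualization" displays.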
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110065534.2A CN112734739B (en) | 2021-01-18 | 2021-01-18 | Visual building crack identification method based on attention mechanism and ResNet fusion |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112734739A true CN112734739A (en) | 2021-04-30 |
CN112734739B CN112734739B (en) | 2022-07-08 |
Family
ID=75592192
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110065534.2A Active CN112734739B (en) | 2021-01-18 | 2021-01-18 | Visual building crack identification method based on attention mechanism and ResNet fusion |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112734739B (en) |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2018195001A (en) * | 2017-05-16 | 2018-12-06 | 株式会社パスコ | Linear graphic extraction device, linear graphic extraction program, and neural network learning method |
KR20200063364A (en) * | 2018-11-23 | 2020-06-05 | 네이버 주식회사 | Method and system for visualizing classification result of deep neural network for prediction of disease prognosis through time series medical data |
CN111401177A (en) * | 2020-03-09 | 2020-07-10 | 山东大学 | End-to-end behavior recognition method and system based on adaptive space-time attention mechanism |
CN111784679A (en) * | 2020-07-06 | 2020-10-16 | 金陵科技学院 | Retaining wall crack identification method based on CNN and SVM |
CN111860106A (en) * | 2020-05-28 | 2020-10-30 | 江苏东印智慧工程技术研究院有限公司 | Unsupervised bridge crack identification method |
CN112053354A (en) * | 2020-09-15 | 2020-12-08 | 上海应用技术大学 | Track slab crack detection method |
CN112130200A (en) * | 2020-09-23 | 2020-12-25 | 电子科技大学 | Fault identification method based on grad-CAM attention guidance |
Non-Patent Citations (4)
Title |
---|
NAV RAJ BHATT: "Post-earthquake damage assessment of masonry walls based on convolutional neural networks", China Master's Theses Full-text Database, Engineering Science and Technology II * |
RAMPRASAATH R. SELVARAJU ET AL.: "Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization", 《 2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV)》 * |
TING DENG ET AL.: "Generate adversarial examples by spatially perturbing on the meaningful area", 《PATTERN RECOGNITION LETTERS》 * |
叶亮 (YE LIANG): "Aerial visual detection and identification method for ground fissures in coal mine goaf areas", China Master's Theses Full-text Database, Engineering Science and Technology I * |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113744205A (en) * | 2021-08-17 | 2021-12-03 | 哈尔滨工业大学(威海) | End-to-end road crack detection system |
CN113744205B (en) * | 2021-08-17 | 2024-02-06 | 哈尔滨工业大学(威海) | End-to-end road crack detection system |
CN114120046A (en) * | 2022-01-25 | 2022-03-01 | 武汉理工大学 | Lightweight engineering structure crack identification method and system based on phantom convolution |
CN115660647A (en) * | 2022-11-05 | 2023-01-31 | 一鸣建设集团有限公司 | Maintenance method for building outer wall |
CN117274788A (en) * | 2023-10-07 | 2023-12-22 | 南开大学 | Sonar image target positioning method, system, electronic equipment and storage medium |
CN117274788B (en) * | 2023-10-07 | 2024-04-30 | 南开大学 | Sonar image target positioning method, system, electronic equipment and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN112734739B (en) | 2022-07-08 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112734739B (en) | Visual building crack identification method based on attention mechanism and ResNet fusion | |
CN112270249B (en) | Target pose estimation method integrating RGB-D visual characteristics | |
CN111444809B (en) | Power transmission line abnormal target detection method based on improved YOLOv3 | |
CN110348376B (en) | Pedestrian real-time detection method based on neural network | |
CN108647585B (en) | Traffic identifier detection method based on multi-scale circulation attention network | |
CN108960135B (en) | Dense ship target accurate detection method based on high-resolution remote sensing image | |
CN114743119B (en) | High-speed rail contact net hanger nut defect detection method based on unmanned aerial vehicle | |
CN111738206B (en) | Excavator detection method for unmanned aerial vehicle inspection based on CenterNet | |
CN111860143B (en) | Real-time flame detection method for inspection robot | |
CN111242026A (en) | Remote sensing image target detection method based on spatial hierarchy perception module and metric learning | |
CN111860175B (en) | Unmanned aerial vehicle image vehicle detection method and device based on lightweight network | |
CN111414807A (en) | Tidal water identification and crisis early warning method based on YO L O technology | |
CN112818871B (en) | Target detection method of full fusion neural network based on half-packet convolution | |
CN112562255A (en) | Intelligent image detection method for cable channel smoke and fire condition in low-light-level environment | |
CN113298024A (en) | Unmanned aerial vehicle ground small target identification method based on lightweight neural network | |
CN111652297B (en) | Fault picture generation method for image detection model training | |
CN111582102B (en) | Remote sensing data refined classification method and device based on multi-mode end-to-end network | |
CN116485885A (en) | Method for removing dynamic feature points at front end of visual SLAM based on deep learning | |
CN114332739A (en) | Smoke detection method based on moving target detection and deep learning technology | |
CN115272826A (en) | Image identification method, device and system based on convolutional neural network | |
CN114399734A (en) | Forest fire early warning method based on visual information | |
CN116452983B (en) | Quick discovering method for land landform change based on unmanned aerial vehicle aerial image | |
CN116385758A (en) | Detection method for damage to surface of conveyor belt based on YOLOv5 network | |
CN115578624A (en) | Agricultural disease and pest model construction method, detection method and device | |
CN116092019A (en) | Ship periphery abnormal object monitoring system, storage medium thereof and electronic equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||