CN116416244A - Crack detection method and system based on deep learning - Google Patents

Crack detection method and system based on deep learning

Info

Publication number
CN116416244A
Authority
CN
China
Prior art keywords
deep
features
convolution
feature
crack
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310492746.8A
Other languages
Chinese (zh)
Inventor
刘国良 (Liu Guoliang)
石昌腾 (Shi Changteng)
田国会 (Tian Guohui)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong University
Original Assignee
Shandong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong University
Priority to CN202310492746.8A
Publication of CN116416244A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/0002 Inspection of images, e.g. flaw detection
    • G06T 7/0004 Industrial image inspection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/048 Activation functions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 11/00 2D [Two Dimensional] image generation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/40 Extraction of image or video features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V 10/80 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V 10/806 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20084 Artificial neural networks [ANN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/30 Subject of image; Context of image processing
    • G06T 2207/30108 Industrial image inspection

Abstract

The invention provides a crack detection method and system based on deep learning, and relates to the technical field of crack detection. The method comprises: obtaining an original crack image; taking a DeepLabV3+ model comprising an encoder module and a decoder module as the basic model, wherein the encoder module comprises a backbone feature extraction network and a pyramid part; fusing an SA-Net attention module into the pyramid part of the encoder module, and replacing the convolution layer applied after the fusion of the decoder's shallow and deep features with a depthwise separable convolution to build the SA-DeepLabV3+ model; and inputting the original crack image into the SA-DeepLabV3+ model to extract features and obtain a crack prediction image. The invention achieves higher crack detection accuracy, higher efficiency and stronger generalization.

Description

Crack detection method and system based on deep learning
Technical Field
The invention belongs to the technical field of crack detection, and particularly relates to a crack detection method and system based on deep learning.
Background
The statements in this section merely provide background information related to the present disclosure and may not necessarily constitute prior art.
Cracks are a common form of damage, occurring not only in buildings but also on roads and airport pavements. In a building, an ordinary crack generally has no effect on its use, but once the crack width exceeds a certain limit it becomes a harmful crack, and harmful cracks seriously shorten a building's service life. Similarly, small cracks on roads and airport pavements do not affect their use, but if road cracks are not repaired in time, repeated loading and severe weather are likely to further aggravate the damage, structural failure may occur, and accidents may follow. It is therefore necessary to detect cracks accurately, promptly and efficiently.
The inventor has found that traditional detection methods can detect cracks, but suffer from low efficiency, significant limitations, high data cost and low accuracy.
Disclosure of Invention
To overcome the defects of the prior art, the invention provides a crack detection method and system based on deep learning, which makes the algorithm concentrate more on crack pixels, yielding higher crack detection accuracy, higher efficiency and stronger generalization.
To achieve the above object, one or more embodiments of the present invention provide the following technical solutions:
the first aspect of the invention provides a crack detection method based on deep learning.
A crack detection method based on deep learning comprises the following steps:
acquiring an original image of the crack;
taking a DeepLabV3+ model comprising an encoder module and a decoder module as the basic model, wherein the encoder module comprises a backbone feature extraction network and a pyramid part; fusing an SA-Net attention module into the pyramid part of the encoder module, and replacing the convolution layer applied after the fusion of shallow and deep features in the decoder network with a depthwise separable convolution, thereby building the SA-DeepLabV3+ model, wherein the shallow features are features extracted after fewer convolutions in the network, and the deep features are features extracted after more convolutions in the network;
inputting the original crack image into the SA-DeepLabV3+ model to extract features and obtain a crack prediction image.
Further, inputting the original crack image into the SA-DeepLabV3+ model, extracting features and obtaining a crack prediction image specifically comprises:
inputting the original crack image into the backbone feature extraction network of the encoder, extracting shallow features after 3-5 convolutions of the backbone network, and extracting deep features after more than 14 convolutions of the backbone network;
inputting the deep features into the pyramid part, sampling the deep features in parallel with atrous (dilated) convolutions of different sampling rates, and capturing image context at multiple scales to obtain feature maps after parallel sampling;
using the SA-Net attention module to assign attention weights to the feature maps obtained by parallel sampling, and weighting the attention weights with the corresponding feature maps to obtain feature-weighted deep features;
upsampling the feature-weighted deep features, feeding the upsampled result together with the shallow features into the decoder for stacking, and applying a depthwise separable convolution to the stacked features to obtain an effective feature map;
and upsampling the effective feature map to obtain the crack prediction image.
Further, using the SA-Net attention module to assign attention weights to the feature maps obtained by parallel sampling, and weighting the attention weights with the corresponding feature maps to obtain feature-weighted deep features, specifically comprises:
SA-Net first divides each feature map obtained by parallel sampling into G groups to obtain G sub-features; each sub-feature is split into two branches along the channel dimension, one branch generating a spatial attention map and the other a channel attention map; each sub-feature is captured during training, and a corresponding weight coefficient is generated for each sub-feature by the SA-Net attention module;
weighting each sub-feature with its corresponding weight coefficient to obtain weighted sub-features;
and using a shuffle mechanism to let the weighted sub-features flow along the channel dimension, finally integrating all weighted sub-features in the channel dimension to obtain the processed overall features, i.e. the deep features.
Further, the depthwise separable convolution consists of a channel-by-channel (depthwise) convolution and a point-by-point (1x1) convolution:
in the channel-by-channel convolution, each convolution kernel is responsible for exactly one channel, and each channel is convolved by only one kernel, so the number of channels of the feature map produced in this step is identical to the number of input channels;
the point-by-point convolution performs a weighted combination only in the channel direction, and the number of feature maps in the effective feature map it produces is determined by the number of convolution kernels.
Further, the loss function of the SA-DeepLabV3+ model adopts a Focal loss and a Dice loss, wherein the Focal loss is calculated as:
L_Focal = -a·(1 - y')^γ·y·log(y') - (1 - a)·(y')^γ·(1 - y)·log(1 - y')
where y and y' are the label value and predicted value of the image, respectively; a is a balance factor; γ is a modulating factor. The Dice loss is calculated as:
L_Dice = 1 - 2·Σ_{i=1..N} y_i·y_i' / (Σ_{i=1..N} y_i + Σ_{i=1..N} y_i')
where y_i and y_i' are the label value and predicted value of pixel i, respectively; N is the total number of pixels in the image.
Further, the dynamic compensation weights are fused into the Focal loss:
L_DWF = -(1 + β1)·a·(1 - y')^γ·y·log(y') - (1 + β2)·(1 - a)·(y')^γ·(1 - y)·log(1 - y')
where y and y' are the label value and predicted value of the image, respectively, and β1 and β2 are the dynamic compensation weight coefficients.
Further, β1 and β2 are calculated by the following formulas:
β1 = (F_p / P) · ((S_n - P) / S_n)
β2 = (F_n / P) · (P / S_n)
where F_p is the number of false positives, F_n the number of false negatives, and P the total number of crack pixels in the image; S_n is the total number of pixels in the image, P/S_n is the fraction of crack pixels in the whole image, and (S_n - P)/S_n is the fraction of non-crack pixels in the whole image.
The second aspect of the invention provides a crack detection system based on deep learning.
A deep learning based crack detection system, comprising:
an image acquisition module configured to: acquiring an original image of the crack;
a model building module configured to: take a DeepLabV3+ model comprising an encoder module and a decoder module as the basic model, wherein the encoder module comprises a backbone feature extraction network and a pyramid part; fuse an SA-Net attention module into the pyramid part of the encoder module, and replace the convolution layer applied after the fusion of shallow and deep features in the decoder network with a depthwise separable convolution, thereby building the SA-DeepLabV3+ model, wherein the shallow features are features extracted after fewer convolutions in the network, and the deep features are features extracted after more convolutions in the network;
a crack detection module configured to: input the original crack image into the SA-DeepLabV3+ model, extract features and obtain a crack prediction image.
A third aspect of the present invention provides a computer-readable storage medium having stored thereon a program which, when executed by a processor, implements the steps in the deep learning based crack detection method according to the first aspect of the present invention.
A fourth aspect of the invention provides an electronic device comprising a memory, a processor and a program stored on the memory and executable on the processor, the processor implementing the steps in the deep learning based crack detection method according to the first aspect of the invention when the program is executed.
The one or more of the above technical solutions have the following beneficial effects:
the invention uses the fusion loss function of Focal loss and Dice loss after fusing dynamic compensation weights on the basis of SA-deep V < 3+ >, constructs the overall algorithm of DWF+SA-deep V < 3+ >, performs data training and testing on the basis of the algorithm, and obtains better crack extraction effect.
Additional aspects of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the invention.
FIG. 1 is a diagram showing the overall structure of SA-DeepLabV3+ according to the first embodiment.
Fig. 2 is a diagram showing the construction of the first embodiment SA-Net attention module.
Fig. 3 is a schematic diagram of a general convolution structure of the first embodiment.
Fig. 4 is a schematic diagram of a channel-by-channel convolution structure according to the first embodiment.
Fig. 5 is a schematic diagram of a point-by-point convolution structure of the first embodiment.
Fig. 6 (a) is an original picture of the crack of the first embodiment.
Fig. 6 (b) is the crack label map of the first embodiment.
FIG. 6 (c) shows the DeepLabV3+ detection result of the first embodiment.
FIG. 6 (d) shows the DWF+SA-DeepLabV3+ detection result of the first embodiment.
Fig. 7 is a system configuration diagram of the second embodiment.
Detailed Description
It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the invention. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of exemplary embodiments according to the present invention.
Embodiments of the invention and features of the embodiments may be combined with each other without conflict.
The core structure of the original DeepLabV3+ neural network consists of two modules: a spatial pyramid pooling module and an encoder-decoder module.
The specific characteristic extraction process is as follows:
The image is input into the backbone feature extraction network to obtain shallow features and deep features. The deep features are input into the atrous spatial pyramid pooling (ASPP) module, where they are convolved and pooled by four atrous convolution layers and one pooling layer to obtain five feature layers; these feature layers are then concatenated, passed through a 1x1 convolution layer, and upsampled to obtain a new feature layer A.
The shallow features are directly passed through a 1x1 convolution to obtain dimension-reduced features B, which are then concatenated with feature layer A; finally, a 3x3 convolution layer and upsampling output a prediction result of the same size as the original image.
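A quick way to see the multi-scale behaviour of the parallel ASPP branches is the effective spatial extent of a dilated 3x3 kernel; the sketch below computes it for several rates (the rates 6, 12 and 18 are the values commonly used with DeepLabV3+, not taken from this text):

```python
def dilated_kernel_extent(kernel, dilation):
    """Effective spatial extent of a dilated ("atrous") convolution kernel:
    inserting (dilation - 1) zeros between taps spreads a k-tap kernel over
    dilation*(k-1)+1 input positions, so parallel branches with different
    rates sample context at several scales from the same feature map."""
    return dilation * (kernel - 1) + 1

# Rate 1 is an ordinary 3x3 convolution; larger rates widen the context.
for rate in (1, 6, 12, 18):
    print(rate, dilated_kernel_extent(3, rate))  # 3, 13, 25, 37
```

This is why the branches capture context "at multiple scales" without enlarging the kernel's parameter count.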
In the invention, shallow features are extracted after a few convolutions of the backbone feature extraction network, and deep features are extracted after the whole backbone network.
The overall concept of the invention is as follows:
Traditional detection methods can detect cracks, but suffer from low efficiency, significant limitations, high data cost and low accuracy. With the growth of computing power, deep learning methods are increasingly applied to crack detection, offering higher accuracy, higher efficiency and strong generalization. In view of this situation, the invention provides a crack detection algorithm based on deep learning.
Based on the existing deep learning network framework, the invention mainly provides a new and more accurate crack detection algorithm; the specific improvements can be summarized as follows:
1. The SA-Net attention mechanism is added into the pyramid module of the original DeepLabV3+ neural network.
2. A loss function fusing Dice loss and Focal loss is used, which reduces the interference in loss calculation caused by the small proportion of crack foreground in the image and improves the network's attention to cracks.
3. A new dynamic compensation weight for the loss function is proposed; it can be used in combination with loss functions such as the focal loss and the cross entropy loss to compensate for the crack pixels occupying too small a proportion of the whole image during crack detection. The dynamic compensation weight is fused with the Focal loss and used together with the Dice loss to replace the original cross entropy loss function, forming the DWF+SA-DeepLabV3+ algorithm model together with SA-DeepLabV3+.
The algorithm improvements of the invention mainly aim to make the algorithm concentrate more on crack pixels; practical experiments after training on data have obtained good results.
Example 1
The embodiment discloses a crack detection method based on deep learning.
As shown in fig. 1, a crack detection method based on deep learning includes the following steps:
acquiring an original image of the crack;
taking a DeepLabV3+ model comprising an encoder module and a decoder module as the basic model, wherein the encoder module comprises a backbone feature extraction network and a pyramid part; fusing an SA-Net attention module into the pyramid part of the encoder module, and replacing the convolution layer applied after the fusion of shallow and deep features in the decoder network with a depthwise separable convolution, thereby building the SA-DeepLabV3+ model, wherein the shallow features are features extracted after 3-5 convolutions in the network, and the deep features are features extracted after more than 14 convolutions in the network;
inputting the original crack image into the SA-DeepLabV3+ model to extract features and obtain a crack prediction image.
Further, inputting the original crack image into the SA-DeepLabV3+ model, extracting features and obtaining a crack prediction image specifically comprises:
inputting the original crack image into the backbone feature extraction network of the encoder, extracting shallow features after a few convolutions of the backbone network, and extracting deep features after many convolutions of the backbone network;
inputting the deep features into the pyramid part, sampling the deep features in parallel with atrous convolutions of different sampling rates, and capturing image context at multiple scales to obtain feature maps after parallel sampling;
using the SA-Net attention module to assign attention weights to the feature maps obtained by parallel sampling, and weighting the attention weights with the corresponding feature maps to obtain feature-weighted deep features;
upsampling the feature-weighted deep features, feeding the upsampled result together with the shallow features into the decoder for stacking, and applying a depthwise separable convolution to the stacked features to obtain an effective feature map;
and upsampling the effective feature map to obtain the crack prediction image.
In this embodiment, the shallow features are features obtained after 3-5 convolutions in the network, and the deep features are features obtained after more than 14 convolutions in the network.
Specific:
(I) Overall network framework SA-DeepLabV3+
To reduce the information loss caused by upsampling, the 3x3 convolution layer applied after the fusion of shallow and deep features in the decoder of the DeepLabV3+ model is replaced with a depthwise separable convolution. Meanwhile, to make the algorithm pay more attention to crack pixels, the pyramid part (ASPP) of the model is fused with an SA-Net attention module; the specific network structure is shown in FIG. 1.
The core structure of SA-DeepLabV3+ consists of two parts: an encoder module and a decoder module.
The backbone network of the encoder is MobileNetV2. Unlike the commonly used ResNet residual structure, MobileNetV2 first raises the dimension of the input feature matrix through a 1x1 convolution to increase the number of channels, then applies a 3x3 depthwise convolution (DW convolution), and finally reduces the dimension through a 1x1 convolution. Meanwhile, MobileNetV2 adopts the ReLU6 activation function: when the input is less than 0 the output is set to zero, inputs in [0, 6] are kept unchanged, and inputs greater than 6 are clipped to 6.
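The ReLU6 clipping and the expand/depthwise/project channel flow of a MobileNetV2 inverted-residual block can be sketched in a few lines (the expansion ratio 6 and the channel counts below are illustrative values, not taken from the patent):

```python
def relu6(x):
    """ReLU6 as used by MobileNetV2: clamp activations to [0, 6]."""
    return min(max(x, 0.0), 6.0)

def inverted_residual_channels(in_ch, expand_ratio, out_ch):
    """Channel bookkeeping of a MobileNetV2 inverted-residual block:
    1x1 expansion -> 3x3 depthwise -> 1x1 linear projection.
    Returns the channel count after each stage."""
    hidden = in_ch * expand_ratio
    return [hidden,   # after 1x1 expansion (followed by ReLU6)
            hidden,   # after 3x3 depthwise conv (keeps channel count, ReLU6)
            out_ch]   # after 1x1 projection (no activation)

print([relu6(v) for v in (-2.0, 3.5, 9.0)])   # [0.0, 3.5, 6.0]
print(inverted_residual_channels(32, 6, 16))  # [192, 192, 16]
```

The depthwise stage never mixes channels, which is why the expansion and projection 1x1 convolutions are needed around it.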
Meanwhile, SA-Net is fused into the spatial feature pyramid part of the encoder module. For a given feature map, SA-Net first divides it into G groups, then captures each sub-feature during training and generates a corresponding weight coefficient for each sub-feature through the attention module. Specifically, each sub-feature is split into two branches along the channel dimension, one branch generating a spatial attention map and the other a channel attention map, so that the model can focus more on meaningful parts. For the channel attention mechanism, SA-Net does not use the conventional Squeeze-and-Excitation (SE) design; instead it first performs global pooling, then shifts and scales the channel vector through a pair of parameters, and finally applies a sigmoid activation to the result. SA-Net also adds a shuffle mechanism so that each group's information flows along the channel dimension; finally, the grouped information is integrated in the channel dimension to obtain the processed overall features. The SA-Net attention module architecture is shown in FIG. 2.
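Two of the SA-Net mechanisms described above, the channel shuffle and the lightweight channel gate, can be sketched in NumPy. In the real module the scale/shift pair is learned; it is fixed to 1 and 0 here purely for illustration:

```python
import numpy as np

def channel_shuffle(x, groups):
    """Channel shuffle as used after SA-Net's grouped attention: reshape
    the channel axis into (groups, channels_per_group), transpose, and
    flatten back, so information mixes across groups."""
    n, c, h, w = x.shape
    return (x.reshape(n, groups, c // groups, h, w)
             .transpose(0, 2, 1, 3, 4)
             .reshape(n, c, h, w))

def sa_channel_weight(sub_feature):
    """Sketch of SA-Net's lightweight channel attention on one sub-feature
    (shape: channels x H x W): global average pooling, then a scale/shift
    by a learnable pair (fixed to 1 and 0 here), then a sigmoid gate.
    Returns per-channel weights in (0, 1)."""
    pooled = sub_feature.mean(axis=(1, 2))   # global average pooling
    scaled = 1.0 * pooled + 0.0              # stand-ins for the learned pair
    return 1.0 / (1.0 + np.exp(-scaled))     # sigmoid

x = np.arange(8, dtype=float).reshape(1, 8, 1, 1)
print(channel_shuffle(x, 2).ravel().tolist())
# channels reordered as [0, 4, 1, 5, 2, 6, 3, 7]
```

Multiplying each sub-feature by its gate and then shuffling is what lets the grouped branches still exchange information in the channel dimension.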
For the decoder part, the ordinary convolution is replaced with a depthwise separable convolution. Unlike ordinary convolution, the depthwise separable convolution consists of a channel-by-channel convolution and a point-by-point convolution. In the channel-by-channel convolution, each convolution kernel is responsible for exactly one channel, and each channel is convolved by only one kernel, so the number of channels of the produced feature map is identical to the number of input channels. The point-by-point convolution performs a weighted combination only in the channel direction to generate new feature maps, whose number is determined by the number of convolution kernels. The depthwise separable convolution therefore has fewer parameters, lower computational cost, and less loss during feature extraction than ordinary convolution. Exemplary diagrams of the ordinary convolution and of the channel-by-channel and point-by-point convolution structures in the depthwise separable convolution are shown in FIGS. 3, 4 and 5, respectively.
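The parameter saving follows from simple counting; the sketch below compares a standard convolution with its depthwise separable decomposition at an illustrative channel width (256 channels is an assumption for the example, not a value from the patent):

```python
def conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out

def depthwise_separable_params(k, c_in, c_out):
    """Weights in a depthwise separable convolution: one k x k kernel per
    input channel (channel-by-channel), then a 1x1 point-by-point
    convolution that mixes channels."""
    return k * k * c_in + c_in * c_out

std = conv_params(3, 256, 256)                 # 9 * 256 * 256 = 589824
sep = depthwise_separable_params(3, 256, 256)  # 2304 + 65536 = 67840
print(std, sep, round(std / sep, 1))           # roughly an 8.7x reduction
```

The same factorization also cuts the multiply-accumulate count by the same ratio, which is where the lower computational cost comes from.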
(II) loss function improvement
Through observation and analysis of a large number of crack images, we find that, unlike traditional semantic segmentation targets, cracks tend to occupy only a small part of the image. Therefore, if the cross entropy loss function (Cross Entropy Loss) is used, the overly large background component interferes with training.
For this case, the focal loss function (Focal Loss) can alleviate the foreground-background imbalance to a certain extent: it up-weights the minority target class and up-weights misclassified samples. The focal loss is a modification of the cross entropy loss function; the two-class cross entropy loss is calculated as:
L_CE = -log(y') if y = 1; L_CE = -log(1 - y') if y = 0
L_CE = -y·log(y') - (1 - y)·log(1 - y')
where y and y' are the label value and predicted value of the image, respectively. The label value indicates whether a pixel actually belongs to a crack: crack pixels take the value 1 and non-crack pixels the value 0. The predicted value is the value output by the detection algorithm presented herein.
The Focal loss modifies the cross entropy loss by adding a factor γ (γ > 0), which reduces the loss of easily classified samples so that the model focuses on difficult and misclassified samples. Changing the magnitude of γ also adjusts the rate at which the weights of simple samples decrease; as γ increases, the influence of the modulating factor increases. In addition, a balance factor a is added to balance the uneven proportion of positive and negative samples. The Focal loss is calculated as:
L_Focal = -a·(1 - y')^γ·y·log(y') - (1 - a)·(y')^γ·(1 - y)·log(1 - y')
where y and y' are the label value and predicted value of the image, respectively.
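The Focal loss above is a minimal NumPy sketch away; the version below assumes the standard two-term binary form, and the default a = 0.25, γ = 2 are common choices rather than values stated in the patent:

```python
import numpy as np

def focal_loss(y, y_pred, alpha=0.25, gamma=2.0, eps=1e-7):
    """Binary focal loss: y is the 0/1 label map, y_pred the predicted
    crack probability. alpha balances the classes, gamma down-weights
    easy examples. Returns the mean over pixels."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)  # avoid log(0)
    pos = -alpha * (1.0 - y_pred) ** gamma * y * np.log(y_pred)
    neg = -(1.0 - alpha) * y_pred ** gamma * (1.0 - y) * np.log(1.0 - y_pred)
    return float(np.mean(pos + neg))

y = np.array([1.0, 1.0, 0.0, 0.0])
confident = focal_loss(y, np.array([0.9, 0.8, 0.1, 0.2]))
uncertain = focal_loss(y, np.array([0.6, 0.5, 0.4, 0.5]))
print(confident < uncertain)  # True: uncertain pixels are penalised more
```

The (1 - y')^γ and (y')^γ modulators are exactly what suppresses the loss of easily classified background pixels.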
Meanwhile, the invention also introduces the Dice loss, which is suited to binary image segmentation and can alleviate the imbalance between the numbers of positive and negative samples to a certain extent. Predictions with higher confidence obtain a higher Dice coefficient and therefore a smaller Dice loss, while predictions with lower confidence obtain a lower Dice coefficient and therefore a larger Dice loss. The Dice loss is calculated as:
L_Dice = 1 - 2·Σ_{i=1..N} y_i·y_i' / (Σ_{i=1..N} y_i + Σ_{i=1..N} y_i')
where y_i and y_i' are the label value and predicted value of pixel i, respectively, and N is the total number of pixels in the image.
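The Dice loss above is likewise a few lines; a minimal NumPy sketch (the small eps guards against an empty denominator and is an implementation detail, not part of the formula):

```python
import numpy as np

def dice_loss(y, y_pred, eps=1e-7):
    """Dice loss: 1 minus twice the soft overlap between the label map y
    and the prediction y_pred, divided by their summed mass."""
    inter = np.sum(y * y_pred)
    return float(1.0 - 2.0 * inter / (np.sum(y) + np.sum(y_pred) + eps))

y = np.array([1.0, 1.0, 0.0, 0.0])
print(round(dice_loss(y, y), 4))                               # 0.0, perfect match
print(round(dice_loss(y, np.array([0.0, 0.0, 1.0, 1.0])), 4))  # 1.0, no overlap
```

Because both numerator and denominator are sums over crack pixels only, the large background mass barely moves the value, which is why Dice is robust to class imbalance.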
Using the Dice loss alone suffers from loss saturation, so this patent uses a mixture of Focal loss and Dice loss.
In addition, the invention proposes dynamic compensation weights based on the F1 score. The ratios of false positives and of false negatives to the true crack pixel count during model training are used as weight coefficients, so that the loss is compensated when the model's prediction is poor. The Focal loss after fusing the dynamic compensation weights is calculated as:
L_DWF = -(1 + β1)·a·(1 - y')^γ·y·log(y') - (1 + β2)·(1 - a)·(y')^γ·(1 - y)·log(1 - y')
similarly, y and y' refer to the label value and the predicted value, β, respectively, of the image 1 And beta 2 Is a dynamic compensation weight coefficient.
β 1 And beta 2 The calculation can be performed by the following formula:
β1 = (F_p / P) · ((S_n - P) / S_n)
β2 = (F_n / P) · (P / S_n)
where F_p is the number of false positives, F_n the number of false negatives, and P the total number of crack pixels in the image. When there are more predicted false positives, the corresponding coefficient increases, increasing the prediction loss on the true crack pixels; when there are more predicted false negatives, i.e. more crack pixels misjudged as non-crack pixels during prediction, the corresponding loss on the background pixels increases; and as the prediction result gradually improves, β1 gradually approaches 0. S_n is the total number of pixels in the image, P/S_n is the fraction of crack pixels in the whole image, and (S_n - P)/S_n is the fraction of non-crack pixels in the whole image. Because the fraction of non-crack pixels in the image is much larger, this coefficient compensates for the prediction loss being small due to crack pixels occupying only a small part of the whole image.
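The β1/β2 equations appear only as images in the source, so the sketch below encodes one reading that is consistent with the surrounding description: β1 grows with false positives and vanishes when they do, β2 grows with false negatives, and the crack/non-crack fractions act as compensation factors. Treat the exact functional form as an assumption:

```python
def dynamic_weights(false_pos, false_neg, crack_pixels, total_pixels):
    """One plausible reading of the dynamic compensation weights (the exact
    equations are not reproduced in this text): beta1 scales with false
    positives and the non-crack fraction, beta2 with false negatives and
    the crack fraction."""
    non_crack_frac = (total_pixels - crack_pixels) / total_pixels
    crack_frac = crack_pixels / total_pixels
    beta1 = (false_pos / crack_pixels) * non_crack_frac
    beta2 = (false_neg / crack_pixels) * crack_frac
    return beta1, beta2

# A poor prediction (many false positives) yields a large beta1 ...
b1, b2 = dynamic_weights(false_pos=50, false_neg=20,
                         crack_pixels=100, total_pixels=10000)
print(round(b1, 3), round(b2, 3))  # 0.495 0.002

# ... while a prediction with no false positives drives beta1 to 0.
b1_good, _ = dynamic_weights(0, 0, 100, 10000)
print(b1_good)  # 0.0
```

Whatever the exact form, the key behaviours stated in the text hold here: the weights rise when the prediction is poor and β1 decays to zero as false positives disappear.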
(III) Overall algorithm structure DWF+SA-DeepLabV3+
On the basis of SA-DeepLabV3+, this patent uses the fused loss function of Focal loss (with fused dynamic compensation weights) and Dice loss, constructing the overall DWF+SA-DeepLabV3+ algorithm. Data training and testing based on this algorithm obtain a good crack extraction effect. The original picture, the label map, the DeepLabV3+ detection result, and the DWF+SA-DeepLabV3+ detection result on the public CrackForest dataset are shown in FIGS. 6 (a), 6 (b), 6 (c) and 6 (d).
The DWF+SA-DeepLabV3+ algorithm provided by the invention can effectively realize crack detection. The original algorithm and the DWF+SA-DeepLabV3+ algorithm were trained and verified on the whole CrackForest dataset and compared, leading to the conclusion that the proposed algorithm has a better crack detection effect: the detection results are more complete and finer and closer to the label image, the F1 score is improved by 0.057 and the mean intersection-over-union (MIOU) by 0.07, and good results were obtained in actual crack detection, showing that the algorithm handles the crack detection task well. The quantitative results of the two algorithms are compared in Table 1.
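For reference, the two reported metrics can be computed from binary masks as in the following sketch (MIOU is assumed here to be the mean of the crack-class and background-class IoU, the usual two-class convention):

```python
def f1_and_miou(pred, label):
    """F1 score and mean intersection-over-union for binary crack masks.

    pred, label: flat sequences of 0/1 pixel values (1 = crack pixel).
    """
    tp = sum(p and y for p, y in zip(pred, label))        # true positives
    fp = sum(p and not y for p, y in zip(pred, label))    # false positives
    fn = sum(y and not p for p, y in zip(pred, label))    # false negatives
    tn = len(label) - tp - fp - fn                        # true negatives
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    iou_crack = tp / (tp + fp + fn)
    iou_background = tn / (tn + fp + fn)
    return f1, (iou_crack + iou_background) / 2
```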
Table 1 comparison of the results of the two algorithm tests
[Table 1 appears only as an image in the original publication.]
Embodiment Two
The embodiment discloses a crack detection system based on deep learning.
As shown in fig. 7, a crack detection system based on deep learning includes:
an image acquisition module configured to: acquiring an original image of the crack;
a model building module configured to: take a DeepLabV3+ model comprising an encoder module and a decoder module as a basic model, the encoder module comprising a backbone feature extraction network and a pyramid part; fuse an SA-Net attention module into the pyramid part of the encoder module, and replace the convolution layer after the fusion of shallow features and deep features in the decoder network with a depthwise separable convolution, so as to build an SA-DeepLabV3+ model, wherein the shallow features are features extracted after fewer convolutions in the network, and the deep features are features extracted after more convolutions in the network;
a crack detection module configured to: input the original image of the crack into the SA-DeepLabV3+ model and perform feature extraction to obtain a crack prediction image.
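The motivation for replacing the decoder's fusion convolution with a depthwise separable convolution can be seen from a simple parameter count (a sketch; the channel counts in the example below are illustrative, not taken from the patent):

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k

def depthwise_separable_params(c_in, c_out, k):
    """Channel-by-channel (depthwise) conv -- one k x k kernel per input
    channel -- followed by a point-by-point (1 x 1) conv that performs the
    weighted combination across channels."""
    depthwise = c_in * k * k
    pointwise = c_in * c_out
    return depthwise + pointwise
```

For example, with 64 input channels, 128 output channels and 3 x 3 kernels, the separable form uses 8,768 weights against 73,728 for the standard convolution, roughly an 8.4x reduction.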
Embodiment Three
An object of the present embodiment is to provide a computer-readable storage medium.
A computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps in the deep learning based crack detection method as described in Embodiment One of the present disclosure.
Embodiment Four
An object of the present embodiment is to provide an electronic apparatus.
An electronic device comprising a memory, a processor and a program stored on the memory and executable on the processor, wherein the processor, when executing the program, implements the steps in the deep learning based crack detection method as described in Embodiment One of the present disclosure.
The steps involved in the systems of the second, third and fourth embodiments correspond to those of the first, method embodiment; for details, see the related description of the first embodiment. The term "computer-readable storage medium" should be taken to include a single medium or multiple media comprising one or more sets of instructions; it should also be understood to include any medium capable of storing, encoding or carrying a set of instructions for execution by a processor that cause the processor to perform any one of the methods of the present invention.
It will be appreciated by those skilled in the art that the modules or steps of the invention described above may be implemented by general-purpose computing means; alternatively, they may be implemented by program code executable by computing means, so that they may be stored in storage means and executed by computing means, or they may each be made into an individual integrated circuit module, or a plurality of the modules or steps may be made into a single integrated circuit module. The present invention is not limited to any specific combination of hardware and software.
While the foregoing description of the embodiments of the present invention has been presented in conjunction with the drawings, it should be understood that it is not intended to limit the scope of the invention, but rather, it is intended to cover all modifications or variations within the scope of the invention as defined by the claims of the present invention.

Claims (10)

1. A crack detection method based on deep learning, characterized by comprising the following steps:
acquiring an original image of the crack;
taking a DeepLabV3+ model comprising an encoder module and a decoder module as a basic model, the encoder module comprising a backbone feature extraction network and a pyramid part; fusing an SA-Net attention module into the pyramid part of the encoder module, and replacing the convolution layer after the fusion of shallow features and deep features in the decoder network with a depthwise separable convolution, so as to build an SA-DeepLabV3+ model, wherein the shallow features are features extracted after fewer convolutions in the network, and the deep features are features extracted after more convolutions in the network;
inputting the original image of the crack into the SA-DeepLabV3+ model and performing feature extraction to obtain a crack prediction image.
2. The deep learning based crack detection method according to claim 1, characterized in that inputting the original image of the crack into the SA-DeepLabV3+ model and performing feature extraction to obtain the crack prediction image specifically comprises:
inputting the original image of the crack into the backbone feature extraction network of the encoder, the backbone feature extraction network extracting shallow features after a few convolutions and deep features after more convolutions;
inputting the deep features into the pyramid part, performing parallel sampling on the deep features with atrous (dilated) convolutions at different sampling rates, and capturing image context at multiple scales to obtain feature maps after parallel sampling;
using the SA-Net attention module to assign attention weights to the feature maps after parallel sampling, and weighting the attention weights with the corresponding feature maps, thereby obtaining feature-weighted deep features;
upsampling the feature-weighted deep features, inputting the upsampled result together with the shallow features into the decoder for stacking, and performing the depthwise separable convolution on the stacked features to obtain an effective feature map;
and up-sampling the effective feature map to obtain a crack prediction image.
3. The deep learning based crack detection method according to claim 2, characterized in that using the SA-Net attention module to assign attention weights to the feature maps after parallel sampling and weighting the attention weights with the corresponding feature maps to obtain the feature-weighted deep features specifically comprises:
the SA-Net first divides the feature maps after parallel sampling into G groups to obtain G sub-features; each sub-feature is split into two branches along the channel dimension, one branch generating a spatial attention map and the other generating a channel attention map, and during training a corresponding weight coefficient is generated for each sub-feature by the SA-Net attention module;
weighting each sub-feature with its corresponding weight coefficient to obtain weighted sub-features;
using a shuffling mechanism to let the weighted sub-features flow along the channel dimension, and finally integrating all the weighted sub-features along the channel dimension to obtain the processed overall features, namely the deep features.
4. The deep learning based crack detection method according to claim 2, characterized in that the depthwise separable convolution consists of a channel-by-channel convolution and a point-by-point convolution:
in the channel-by-channel convolution, each convolution kernel is responsible for a single channel, and each channel is convolved by only one convolution kernel, so that the number of channels of the feature map generated in this process is identical to the number of input channels;
the point-by-point convolution performs a weighted combination only in the channel direction to generate effective feature maps, and the number of effective feature maps generated is determined by the number of convolution kernels.
5. The deep learning based crack detection method according to claim 1, characterized in that the loss function of the SA-DeepLabV3+ model employs a Focal loss and a Dice loss, wherein the Focal loss is calculated as:

FL = -a·(1-y')^γ·y·log(y') - (1-a)·(y')^γ·(1-y)·log(1-y')

where y and y' refer to the label value and the predicted value of the image, respectively; a is a balance factor; γ is a regulator; and the Dice loss is calculated as:

L_Dice = 1 - (2·Σ_{i=1..N} y_i·y_i') / (Σ_{i=1..N} y_i + Σ_{i=1..N} y_i')

where y_i and y_i' refer to the label value and the predicted value of the i-th pixel, respectively, and N refers to the total number of pixels in the image.
6. The deep learning based crack detection method according to claim 5, characterized in that the Focal loss is fused with dynamic compensation weights:

L_DWF = -β1·a·(1-y')^γ·y·log(y') - β2·(1-a)·(y')^γ·(1-y)·log(1-y')

wherein y and y' refer to the label value and the predicted value of the image, respectively, and β1 and β2 are dynamic compensation weight coefficients.
7. The deep learning based crack detection method according to claim 6, characterized in that β1 and β2 are calculated by the following formulas:

β1 = (Fp / P) · ((Sn - P) / Sn)

β2 = (Fn / P) · (P / Sn)

wherein Fp is the number of false positives, Fn is the number of false negatives, P is the total number of crack pixels in the image, and Sn is the total number of pixels in the image; P/Sn is the percentage of crack pixels in the entire image, and (Sn - P)/Sn is the percentage of non-crack pixels in the entire image.
8. A crack detection system based on deep learning, characterized by comprising:
an image acquisition module configured to: acquire an original image of the crack;
a model building module configured to: take a DeepLabV3+ model comprising an encoder module and a decoder module as a basic model, the encoder module comprising a backbone feature extraction network and a pyramid part; fuse an SA-Net attention module into the pyramid part of the encoder module, and replace the convolution layer after the fusion of shallow features and deep features in the decoder network with a depthwise separable convolution, so as to build an SA-DeepLabV3+ model, wherein the shallow features are features extracted after fewer convolutions in the network, and the deep features are features extracted after more convolutions in the network;
a crack detection module configured to: input the original image of the crack into the SA-DeepLabV3+ model and perform feature extraction to obtain a crack prediction image.
9. A computer-readable storage medium having a program stored thereon, wherein the program, when executed by a processor, implements the steps of the deep learning based crack detection method according to any one of claims 1-7.
10. An electronic device comprising a memory, a processor and a program stored on the memory and executable on the processor, wherein the processor, when executing the program, implements the steps of the deep learning based crack detection method according to any one of claims 1-7.
CN202310492746.8A 2023-04-26 2023-04-26 Crack detection method and system based on deep learning Pending CN116416244A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310492746.8A CN116416244A (en) 2023-04-26 2023-04-26 Crack detection method and system based on deep learning


Publications (1)

Publication Number Publication Date
CN116416244A true CN116416244A (en) 2023-07-11

Family

ID=87059552

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310492746.8A Pending CN116416244A (en) 2023-04-26 2023-04-26 Crack detection method and system based on deep learning

Country Status (1)

Country Link
CN (1) CN116416244A (en)


Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116773534A (en) * 2023-08-15 2023-09-19 宁德思客琦智能装备有限公司 Detection method and device, electronic equipment and computer readable medium
CN116773534B (en) * 2023-08-15 2024-03-05 宁德思客琦智能装备有限公司 Detection method and device, electronic equipment and computer readable medium
CN116993739A (en) * 2023-09-27 2023-11-03 中国计量大学 Concrete crack depth prediction model, method and application based on deep learning
CN116993739B (en) * 2023-09-27 2023-12-12 中国计量大学 Concrete crack depth prediction model, method and application based on deep learning
CN117274789A (en) * 2023-11-21 2023-12-22 长江勘测规划设计研究有限责任公司 Underwater crack semantic segmentation method for hydraulic concrete structure
CN117274789B (en) * 2023-11-21 2024-04-09 长江勘测规划设计研究有限责任公司 Underwater crack semantic segmentation method for hydraulic concrete structure
CN117291913A (en) * 2023-11-24 2023-12-26 长江勘测规划设计研究有限责任公司 Apparent crack measuring method for hydraulic concrete structure
CN117291913B (en) * 2023-11-24 2024-04-16 长江勘测规划设计研究有限责任公司 Apparent crack measuring method for hydraulic concrete structure

Similar Documents

Publication Publication Date Title
CN109800631B (en) Fluorescence coding microsphere image detection method based on mask region convolution neural network
CN116416244A (en) Crack detection method and system based on deep learning
CN113052210B (en) Rapid low-light target detection method based on convolutional neural network
Liu et al. Cross-SRN: Structure-preserving super-resolution network with cross convolution
CN115497005A (en) YOLOV4 remote sensing target detection method integrating feature transfer and attention mechanism
CN111783819B (en) Improved target detection method based on region of interest training on small-scale data set
CN112348036A (en) Self-adaptive target detection method based on lightweight residual learning and deconvolution cascade
CN115035295B (en) Remote sensing image semantic segmentation method based on shared convolution kernel and boundary loss function
CN110020658B (en) Salient object detection method based on multitask deep learning
CN111597920A (en) Full convolution single-stage human body example segmentation method in natural scene
CN111798469A (en) Digital image small data set semantic segmentation method based on deep convolutional neural network
CN114841972A (en) Power transmission line defect identification method based on saliency map and semantic embedded feature pyramid
CN113822951A (en) Image processing method, image processing device, electronic equipment and storage medium
CN113743505A (en) Improved SSD target detection method based on self-attention and feature fusion
CN113781510A (en) Edge detection method and device and electronic equipment
CN116152226A (en) Method for detecting defects of image on inner side of commutator based on fusible feature pyramid
CN111223087A (en) Automatic bridge crack detection method based on generation countermeasure network
CN116012722A (en) Remote sensing image scene classification method
CN113936299A (en) Method for detecting dangerous area in construction site
CN117372853A (en) Underwater target detection algorithm based on image enhancement and attention mechanism
CN117437201A (en) Road crack detection method based on improved YOLOv7
CN116523875A (en) Insulator defect detection method based on FPGA pretreatment and improved YOLOv5
CN114663654B (en) Improved YOLOv4 network model and small target detection method
CN116363629A (en) Traffic sign detection method based on improved YOLOv5
CN116452900A (en) Target detection method based on lightweight neural network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination