CN116486231A

CN116486231A - Concrete crack detection method based on improved YOLOv5

Info

Publication number: CN116486231A
Application number: CN202310453914.2A
Authority: CN
Inventors: 蔡泽龙; 刘杰; 何宽芳; 林显信; 黄成锵; 刘振泳; 李志聪; 茹家荣
Original assignee: Foshan University
Current assignee: Foshan University
Priority date: 2023-04-24
Filing date: 2023-04-24
Publication date: 2023-07-25

Abstract

The invention provides a concrete crack detection method based on improved YOLOv5, which comprises the following steps: constructing a concrete crack data set; building an improved YOLOv5 model; acquiring pre-training data and pre-training the improved YOLOv5 model by utilizing the pre-training data; training the pre-trained improved YOLOv5 model with a concrete crack dataset; acquiring a real-time image of the concrete to be detected; and carrying out crack detection on the concrete real-time image to be detected by using the trained improved YOLOv5 model. The invention not only can complete the target detection task without manual intervention and greatly reduce the labor cost and the time cost, but also can keep higher detection precision and accurately identify the target object, thereby improving the detection effect.

Description

Concrete crack detection method based on improved YOLOv5

Technical Field

The invention relates to the technical field of concrete detection, in particular to a concrete crack detection method based on improved YOLOv5.

Background

Concrete is one of the most common materials in building construction, and concrete cracking is one of the most common structural defects. In concrete structures, cracks can reduce the strength, stability and durability of the structure, thereby affecting its service life and safety. Regular crack detection and evaluation of the surfaces of building structures is therefore of great significance: it provides a basis for timely early warning and for the maintenance and management of buildings, prolongs their service life, and helps prevent accidents.

Through searching, the applicant found some typical prior art. For example, Chinese patent application No. CN201910502811.4 discloses a concrete crack detection device based on an unmanned aerial vehicle; by providing a crack detection mechanism with a rotatable measuring ruler, the length and width of a crack can be read directly from the ruler, so that no large amount of later data processing is needed, the measuring method is simple, and long-term tracking and observation of the crack is achieved. As another example, Chinese patent application No. CN201610799729.9 discloses an automatic detection device, method and system for cracks in concrete members, which can objectively judge from image information whether cracks of different positions, widths and lengths meet the acceptance criteria for stress cracks; this facilitates the acquisition of bridge member load test data and improves the authenticity, accuracy and reliability of that data. As a further example, Chinese patent application No. CN202010615768.5 takes the development characteristics of concrete cracks into account and adopts a post-segmentation error limit control method to ensure the precision of crack segmentation and calculation, providing an effective guarantee for the quantitative evaluation of concrete crack detection in engineering structures.

Thus, as regards how to detect concrete cracks, many problems arising in practical applications remain for which no technical solution has yet been proposed.

Disclosure of Invention

Based on this, in order to improve the detection of concrete cracks, the invention provides a concrete crack detection method based on improved YOLOv5, with the following specific technical scheme:

a concrete crack detection method based on improved YOLOv5 comprises the following steps:

constructing a concrete crack data set;

building an improved YOLOv5 model;

acquiring pre-training data and pre-training the improved YOLOv5 model by utilizing the pre-training data;

training the pre-trained improved YOLOv5 model with a concrete crack dataset;

acquiring a real-time image of the concrete to be detected;

and carrying out crack detection on the concrete real-time image to be detected by using the trained improved YOLOv5 model.

According to the concrete crack detection method, concrete crack detection is carried out based on an improved YOLOv5 network model, so that the position of a crack can be automatically detected, a target detection task can be completed without manual intervention, and labor cost and time cost are greatly reduced; meanwhile, based on the improved YOLOv5 model, higher detection precision can be kept, and the target object can be accurately identified, so that the detection effect is improved.

Further, the specific method for constructing the improved YOLOv5 model comprises the following steps:

in the original YOLOv5 network structure, a Rep-GFPN network structure with the same input size is used for replacing an original neck feature stacking network Concat to serve as a main network of a neck network;

and adding an ECA attention mechanism before the feature fusion part of the feature extraction network module to construct an improved YOLOv5 network model.

Further, the specific method for using the Rep-GFPN network structure with the same input size to replace the original neck feature stack network Concat as the backbone network of the neck network comprises the following steps:

firstly, the Rep-GFPN receives two feature maps x0 and x1 from the backbone network and the previous layer network, performs a convolution block operation with a convolution kernel size of 1 and a stride of 1 on x0 and x1 respectively, and outputs x00 and x10;

secondly, putting the x00 and the x10 output in the first step into a CSPS module together, and outputting as x01;

thirdly, putting the x01 output from the second step and the x10 output from the first step into a CSPS module together, and outputting as x11;

fourthly, independently putting the x11 output in the third step into CSPS, and performing Concat stacking operation on the output of the CSPS module and the output x10 in the first step, wherein the output is x12;

fifthly, putting the output x01 of the second step, the output x11 of the third step and the output x13 of the fourth step together into a CSPS module, and performing a Concat stacking operation on the output of the CSPS module and the output x00 of the first step, wherein the output is x02;

and sixthly, performing a stacking operation on the x02 output from the fifth step and the x12 output from the fourth step, and performing a convolution block operation with a convolution kernel size of 1 and a stride of 1 on the stacked output to obtain the final output x_final of the Rep-GFPN.
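The wiring of the six steps can be sketched by tracking only channel counts. This is a minimal illustration, not the patented implementation: the function name `rep_gfpn_channels`, the parameter `width`, and the rule that every 1x1 convolution block and every CSPS module emits `width` channels are assumptions made for the example. The text refers to the fourth-step output as both x12 and x13; the sketch assumes these are the same tensor.

```python
def rep_gfpn_channels(c_x0, c_x1, width):
    """Trace channel counts through the six Rep-GFPN steps.

    Assumptions (not stated in the text): every 1x1 convolution block and
    every CSPS module emits `width` channels; Concat adds channel counts.
    """
    def conv1x1(c_in):
        # Stand-in for a convolution block with kernel size 1 and stride 1.
        return width

    def csps(*c_ins):
        # Stand-in for a CSPS module taking one to three inputs.
        return width

    def concat(*c_ins):
        # Stacking along the channel dimension.
        return sum(c_ins)

    x00, x10 = conv1x1(c_x0), conv1x1(c_x1)   # step 1
    x01 = csps(x00, x10)                      # step 2
    x11 = csps(x01, x10)                      # step 3
    x12 = concat(csps(x11), x10)              # step 4
    # Step 5: the third CSPS input is called "x13" in the text;
    # it is assumed here to be the step-4 output x12.
    x02 = concat(csps(x01, x11, x12), x00)
    x_final = conv1x1(concat(x02, x12))       # step 6
    return {"x00": x00, "x10": x10, "x01": x01, "x11": x11,
            "x12": x12, "x02": x02, "x_final": x_final}
```

With `rep_gfpn_channels(256, 512, width=128)`, both concatenated tensors x12 and x02 carry 256 channels under these assumptions, and the final 1x1 convolution block reduces the 512-channel stack back to 128.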

Further, the specific method for using the Rep-GFPN network structure with the same input size to replace the original neck feature stack network Concat as the backbone network of the neck network further comprises the following steps:

firstly, the CSPS module receives two or three inputs of the same size, y0, y1 and y2, stacks them along the channel dimension, and outputs y3;

secondly, performing a convolution block operation (convolution with a kernel size of 3 and a stride of 1, followed by normalization and a SiLU activation function) on the y3 output from the first step, and outputting y4;

thirdly, performing on the y3 output from the first step a convolution block operation (kernel size 3, stride 1, normalization and SiLU activation function), then a RepConv with a kernel size of 3 and a stride of 1, then another convolution block operation (kernel size 3, stride 1, normalization and SiLU activation function), and finally feeding the result into an ECA attention mechanism module;

fourthly, stacking the output y4 of the second step with the outputs of the third step, and outputting y7;

and fifthly, performing a convolution block operation (kernel size 1, stride 1, normalization and SiLU activation function) on the output y7 of the fourth step to obtain the final output y8 of the CSPS module.

Further, the specific method for adding the ECA attention mechanism before the feature fusion part of the feature extraction network module further comprises the following steps:

firstly, the ECA module receives a feature map z0 and performs average pooling on each channel through a GAP (global average pooling) module; each channel yields one output node, giving a block z1 of size 1×1×C;

step two, performing convolution operation with the convolution kernel size of 3 and the step length of 1 on the output z1 of the step one, and outputting the output z2;

thirdly, performing Sigmoid activation function operation on the output z2 of the second step, and outputting the output as z3;

and fourthly, multiplying the z0 of the first step channel by channel by the output z3 of the third step to obtain the final output z4 of the ECA module.
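The four ECA steps can be sketched in plain Python on nested lists. This is only an illustration under stated assumptions: in a trained network the convolution weights are learned, whereas the `kernel` argument here holds fixed example values, and the zero padding at both ends of the channel axis is an assumption, not something the text specifies. The function name `eca` is likewise hypothetical.

```python
import math


def eca(z0, kernel):
    """Apply the four ECA steps to a feature map.

    z0 is a list of C channels, each an H x W nested list of floats.
    kernel is a list of 1-D convolution weights of odd length
    (length 3 in the text).
    """
    # Step 1: global average pooling, one value per channel (1 x 1 x C).
    z1 = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in z0]

    # Step 2: 1-D convolution across channels, stride 1.
    # Zero padding keeps the output length equal to C (an assumption).
    pad = len(kernel) // 2
    padded = [0.0] * pad + z1 + [0.0] * pad
    z2 = [sum(w * padded[i + j] for j, w in enumerate(kernel))
          for i in range(len(z1))]

    # Step 3: Sigmoid turns each value into a channel weight in (0, 1).
    z3 = [1.0 / (1.0 + math.exp(-v)) for v in z2]

    # Step 4: scale every element of each channel by its weight.
    return [[[v * w for v in row] for row in ch] for ch, w in zip(z0, z3)]
```

With the identity kernel `[0.0, 1.0, 0.0]`, each channel is simply scaled by the Sigmoid of its own mean, which makes the data flow easy to check by hand.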

Further, the concrete method for carrying out crack detection on the concrete real-time image to be detected by using the trained improved YOLOv5 model comprises the following steps:

evaluating the trained improved YOLOv5 model using three metrics: precision, recall and average precision;

and selecting the best trained improved YOLOv5 model to detect the crack of the real-time image of the concrete to be detected.

Further, the concrete method for acquiring the real-time image of the concrete to be detected comprises the following steps:

shooting a real-time video of the concrete to be detected through a camera;

and performing image preprocessing, frame by frame, on the captured real-time video of the concrete to be detected using the computer vision library OpenCV.

Further, the concrete method for constructing the concrete crack data set comprises the following steps:

collecting concrete crack images through web crawling and public data sets;

aiming at a concrete crack image obtained by network crawling, marking cracks by using LabelImg software, removing repeated data, missing value data and abnormal value data, and dividing the concrete crack image from which the repeated data, the missing value data and the abnormal value data are removed into a training set and a verification set;

changing the collected concrete crack image into a lower resolution image by using a computer vision library OpenCV;

a data enhancement operation is performed on the data set using a mosaic data enhancement algorithm.

A computer readable storage medium storing a computer program which when executed by a processor implements the improved YOLOv5 based concrete crack detection method.

Drawings

The invention will be further understood from the following description taken in conjunction with the accompanying drawings. The components in the figures are not necessarily to scale, emphasis instead being placed upon illustrating the principles of the embodiments. Like reference numerals designate corresponding parts throughout the different views.

Fig. 1 is a schematic overall flow chart of a concrete crack detection method based on improved YOLOv5 according to an embodiment of the invention.

Fig. 2 is a network structure diagram of the improved YOLOv5 according to an embodiment of the present invention.

Fig. 3 is a block diagram of the Rep-GFPN network module according to an embodiment of the present invention.

Fig. 4 is a diagram of the ECA network structure and the GAP network structure according to an embodiment of the present invention.

Fig. 5 is a diagram showing the final crack detection results according to an embodiment of the present invention.

Detailed Description

The present invention will be described in further detail with reference to the following examples thereof in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the detailed description and specific examples are intended for purposes of illustration only and are not intended to limit the scope of the invention.

It will be understood that when an element is referred to as being "fixed to" another element, it can be directly on the other element or intervening elements may also be present. When an element is referred to as being "connected" to another element, it can be directly connected to the other element or intervening elements may also be present. The terms "vertical," "horizontal," "left," "right," and the like are used herein for illustrative purposes only and are not meant to be the only embodiment.

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used herein in the description of the invention is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. The term "and/or" as used herein includes any and all combinations of one or more of the associated listed items.

The terms "first" and "second" in this specification do not denote a particular quantity or order, but rather are used for distinguishing between similar or identical items.

At present, concrete crack detection technology mainly comprises methods such as manual detection, laser scanning detection and sensor detection. Manual detection is the most traditional method and is commonly used in the field, but it often consumes a great deal of manpower and time. Laser scanning detection scans the concrete surface with a high-precision measuring instrument to detect cracks. Sensor detection fuses the information obtained by a plurality of sensors, so a certain amount of time is needed to process and analyze the data, and the accuracy requirements on the sensors are high.

With the continuous development of intelligent technology, the concrete crack detection technology can be widely applied. For example, the concrete crack detection algorithm based on artificial intelligence and machine learning can improve the accuracy and efficiency of detection and realize intelligent detection and diagnosis.

The existing concrete crack detection technology has the following problems:

1. When cracks on the concrete surface are detected by manual visual inspection, missed detections and false detections occur easily, the detection result is not necessarily accurate, and tiny or hidden cracks are difficult to find.

2. Laser scanning detection of concrete cracks is still limited in detecting deep cracks and cannot monitor in real time. Laser scanning is very sensitive to the condition of the concrete surface; dirt, uneven color, excessive roughness and the like can affect the scanning result, so the surface must be cleaned and treated before scanning to obtain accurate results. The laser beam is also easily interfered with by ambient light, which makes the scanning result inaccurate, so scanning must avoid light interference and be carried out in a dimly lit environment.

3. Most of the prior art cannot realize the real-time detection function, and the accuracy and the efficiency of detection are often not high.

As shown in fig. 1, a concrete crack detection method based on improved YOLOv5 according to an embodiment of the present invention includes the following steps:

S1, constructing a concrete crack data set.

In step S1, a specific method of constructing a concrete crack dataset comprises the steps of:

S10, acquiring the images required for training the model, namely collecting concrete crack images through web crawling and public data sets.

S11, cleaning the collected concrete crack images. Specifically, the concrete crack images obtained by web crawling are annotated using LabelImg software (producing the crack labels and the annotation files containing position information), the duplicate data, missing-value data and abnormal-value data are removed, and the remaining concrete crack images are divided into a training set and a verification set.

Removing duplicate data, missing-value data and abnormal-value data ensures the accuracy, integrity, consistency and usability of the data, and thus the quality and credibility of the data set.

More specifically, the data set can be divided into a training set and a verification set according to the ratio of 9:1, so that model training is facilitated.
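The duplicate removal and the 9:1 split can be sketched with the Python standard library. This is a minimal sketch under assumptions: duplicates are taken to mean byte-identical files (the text does not say how duplicates are identified), and the helper names `dedupe_by_content` and `split_train_val`, the fixed `seed`, and the rounding rule are all illustrative choices.

```python
import hashlib
import random


def dedupe_by_content(files):
    """Keep the first file of every group with identical bytes.

    files is a list of (name, data) pairs, where data is a bytes object.
    """
    seen, kept = set(), []
    for name, data in files:
        digest = hashlib.sha256(data).hexdigest()
        if digest not in seen:
            seen.add(digest)
            kept.append(name)
    return kept


def split_train_val(names, val_ratio=0.1, seed=0):
    """Shuffle reproducibly and split into (train, val), 9:1 by default."""
    shuffled = sorted(names)
    random.Random(seed).shuffle(shuffled)
    n_val = max(1, round(len(shuffled) * val_ratio))
    return shuffled[n_val:], shuffled[:n_val]
```

Sorting before shuffling makes the split independent of the order in which the files were listed, so the same seed always yields the same training and verification sets.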

S12, changing the collected concrete crack image into a lower resolution image by using a computer vision library OpenCV so as to reduce the calculated amount and relieve the pressure of equipment.

S13, performing a data enhancement operation on the data set using a mosaic data enhancement algorithm to enlarge the data set, reduce overfitting and improve the accuracy and stability of the model.

Specifically, four different images are first randomly selected, and one of the four is randomly chosen as the center image of the composite image. Next, the mosaic data enhancement algorithm randomly crops the three adjacent images around the center image and stitches them together according to a certain rule to form a new composite image. During stitching, the algorithm may apply random transformations such as rotation, scaling and horizontal flipping to increase the diversity of the composite images. Finally, the algorithm takes the label of the central area of the composite image as the label of the whole composite image, thereby generating a new training sample.
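For orientation, the stitching idea can be sketched as a fixed 2x2 tiling of four equally sized images with the bounding boxes shifted into the composite. This is a deliberately simplified variant and not the procedure described above: it applies no random cropping, scaling, rotation or flipping, and it keeps every box instead of only the label of the central area. The function name `mosaic4` and the nested-list image format are assumptions made for the example.

```python
def mosaic4(images, boxes, size):
    """Tile four size x size images into one 2*size x 2*size image.

    images: four images, each a list of `size` rows of `size` pixel values.
    boxes:  four lists of (x1, y1, x2, y2) boxes, one list per image.
    Returns the composite image and the boxes in composite coordinates.
    """
    canvas = [[0] * (2 * size) for _ in range(2 * size)]
    # (dx, dy) offset of each tile: top-left, top-right,
    # bottom-left, bottom-right.
    offsets = [(0, 0), (size, 0), (0, size), (size, size)]
    out_boxes = []
    for img, img_boxes, (dx, dy) in zip(images, boxes, offsets):
        for y in range(size):
            # Copy one row of the tile into its place on the canvas.
            canvas[dy + y][dx:dx + size] = img[y]
        for x1, y1, x2, y2 in img_boxes:
            # Shift each box by the tile offset.
            out_boxes.append((x1 + dx, y1 + dy, x2 + dx, y2 + dy))
    return canvas, out_boxes
```

A box at `(0, 0, 2, 2)` in the bottom-right tile of a composite built from 2x2 images ends up at `(2, 2, 4, 4)`, which shows how annotations follow their image into the mosaic.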

S2, building an improved YOLOv5 model.

Specifically, in step S2, the specific method for building the improved YOLOv5 model includes the following steps:

S20, in the original YOLOv5 network structure, a Rep-GFPN (Reparameterized Generalized-FPN, reparameterized generalized feature pyramid network) network structure with the same input size is used to replace the original neck feature stacking network Concat and serve as the backbone network of the neck network.

The improved model retains the original backbone network and detection head network of YOLOv5 and improves the feature stacking layer in the neck network; compared with the original plain channel-dimension stacking operation, Rep-GFPN performs further feature extraction and fusion, which can improve the accuracy of the model.

S21, adding an ECA (Efficient Channel Attention) attention mechanism before the feature fusion part of the feature extraction network module to construct the improved YOLOv5 network model.

In step S20, the specific method for using the Rep-GFPN network structure with the same input size to replace the original neck feature stacking network Concat as the backbone network of the neck network includes the following steps:

In the first step, the Rep-GFPN receives two feature maps x0 and x1 from the backbone network and the previous layer network, performs a convolution block operation with a convolution kernel size of 1 and a stride of 1 on x0 and x1 respectively, and outputs x00 and x10.

In the second step, the x00 and x10 output from the first step are put together into a CSPS module, and the output is x01.

In the third step, the x01 output from the second step and the x10 output from the first step are put together into a CSPS module, and the output is x11.

In the fourth step, the x11 output from the third step is put on its own into a CSPS (Constraint Satisfaction Problems, constraint satisfaction problem) module, and a Concat stacking operation is performed on the output of the CSPS module and the output x10 of the first step; the output is x12.

In the fifth step, the output x01 of the second step, the output x11 of the third step and the output x13 of the fourth step are put together into a CSPS module, and a Concat stacking operation is performed on the output of the CSPS module and the output x00 of the first step; the output is x02.

S3, obtaining pre-training data and pre-training the improved YOLOv5 model by using the pre-training data.

Publicly available pre-training data is fed to the improved YOLOv5 model for pre-training to generate pre-training weights, which reduces the time and computing resources needed to train the model and shortens the model's iteration cycle.

And S4, training the pre-trained improved YOLOv5 model by using the concrete crack data set.

In particular, the method comprises the steps of,

S5, acquiring a real-time image of the concrete to be detected.

Specifically, in step S5, the specific method for acquiring the real-time image of the concrete to be detected includes the following steps:

S50, capturing real-time video of the concrete to be detected with a camera.

S51, performing image preprocessing, frame by frame, on the captured real-time video of the concrete to be detected using the computer vision library OpenCV. The image preprocessing includes enlarging or reducing each frame to a prescribed size, Gaussian-filter noise reduction, and the like.
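Steps S50 and S51 can be sketched with OpenCV's Python bindings as a generator that yields preprocessed frames. This is a sketch under assumptions: the 640x640 target size, the 5x5 Gaussian kernel, camera index 0, and the helper names `odd_kernel` and `iter_preprocessed_frames` are illustrative and do not come from the text. OpenCV is imported inside the generator body, so the helper can be used without it installed.

```python
def odd_kernel(k):
    """Return k if it is odd, otherwise the next odd number.

    cv2.GaussianBlur expects odd, positive kernel dimensions.
    """
    return k if k % 2 == 1 else k + 1


def iter_preprocessed_frames(source=0, size=(640, 640), blur_kernel=5):
    """Yield resized, denoised frames from a camera index or video path."""
    import cv2  # requires the opencv-python package

    k = odd_kernel(blur_kernel)
    cap = cv2.VideoCapture(source)
    try:
        while True:
            ok, frame = cap.read()
            if not ok:
                break  # end of stream or camera unavailable
            frame = cv2.resize(frame, size)            # prescribed size
            frame = cv2.GaussianBlur(frame, (k, k), 0)  # noise reduction
            yield frame
    finally:
        cap.release()
```

Each yielded frame could then be passed to the trained detector; wrapping the loop in `try`/`finally` releases the camera even if the consumer stops early.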

S6, performing crack detection on the concrete real-time image to be detected by using the trained improved YOLOv5 model. The detection effect is shown in fig. 5.

Specifically, in step S6, the specific method for performing crack detection on the real-time image of the concrete to be detected by using the trained improved YOLOv5 model includes:

S60, evaluating the trained improved YOLOv5 model using three metrics: precision, recall and average precision.

The loss curve, mAP curve and the like are drawn from the YOLOv5 model data, and the trained improved YOLOv5 model is evaluated using three metrics, namely precision $P = \frac{TP}{TP + FP}$, recall $R = \frac{TP}{TP + FN}$, and average precision $AP = \int_0^1 P(R)\,dR$.

S61, selecting the best trained improved YOLOv5 model to detect cracks in the real-time image of the concrete to be detected.

Here, TP (true positives) is the number of correctly classified positive samples: samples that are actually positive and are classified as positive by the model.

FP (false positives) is the number of samples misclassified as positive: samples that are actually negative but are classified as positive by the model.

TN (true negatives) is the number of correctly classified negative samples: samples that are actually negative and are classified as negative by the model.

FN (false negatives) is the number of samples misclassified as negative: samples that are actually positive but are classified as negative by the model.

The average precision (AP) is calculated as the area under the Precision-Recall curve.
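The three metrics can be computed directly from the counts defined above. The sketch below is illustrative: the function names are hypothetical, and the area under the Precision-Recall curve is approximated with the trapezoidal rule over sampled points, which is one of several possible integration schemes and not necessarily the one used by the YOLOv5 tooling.

```python
def precision(tp, fp):
    """Fraction of predicted positives that are correct: TP / (TP + FP)."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    """Fraction of actual positives that are found: TP / (TP + FN)."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def average_precision(pr_points):
    """Area under the Precision-Recall curve by the trapezoidal rule.

    pr_points is a list of (recall, precision) pairs sampled along the
    curve, for example at different confidence thresholds.
    """
    pts = sorted(pr_points)
    area = 0.0
    for (r0, p0), (r1, p1) in zip(pts, pts[1:]):
        area += (r1 - r0) * (p0 + p1) / 2.0
    return area
```

For example, 8 true positives with 2 false positives give a precision of 0.8, and the same 8 true positives with 8 false negatives give a recall of 0.5.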

The indexes before and after the YOLOv5 model is improved are shown in the following table:

In one embodiment, the specific method for using the Rep-GFPN network structure with the same input size to replace the original neck feature stacking network Concat as the backbone network of the neck network further comprises the following steps:

In the first step, the CSPS module receives two or three inputs of the same size, y0, y1 and y2, stacks them along the channel dimension, and outputs y3.

In the second step, a convolution block operation (convolution with a kernel size of 3 and a stride of 1, followed by normalization and a SiLU activation function) is performed on the y3 output from the first step, and the output is y4.

In the third step, the y3 output from the first step passes through a convolution block operation (kernel size 3, stride 1, normalization and SiLU activation function), then a RepConv with a kernel size of 3 and a stride of 1, then another convolution block operation (kernel size 3, stride 1, normalization and SiLU activation function), and is finally fed into an ECA attention mechanism module; this process is repeated N times.

Preferably, N is 1 in order to reduce the number of parameters of the model. The output of the third step is y5.

In the fourth step, the output y4 of the second step is stacked with the outputs of the third step, and the output is y7.

In the fifth step, a convolution block operation (kernel size 1, stride 1, normalization and SiLU activation function) is performed on the output y7 of the fourth step to obtain the final output y8 of the CSPS module. The SiLU activation function has the form:

$$\mathrm{SiLU}(x) = x \cdot \sigma(x) = \frac{x}{1 + e^{-x}}$$
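The SiLU activation used in the convolution blocks is the input multiplied by its Sigmoid, SiLU(x) = x · sigmoid(x). A minimal scalar sketch follows; the two-branch Sigmoid is a common way to avoid overflow for large negative inputs and is an implementation choice of this example, not something the text prescribes.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function 1 / (1 + exp(-x))."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)  # safe: x < 0, so exp(x) cannot overflow
    return e / (1.0 + e)


def silu(x):
    """SiLU activation: x * sigmoid(x)."""
    return x * sigmoid(x)
```

SiLU is close to the identity for large positive inputs and close to zero for large negative inputs, with a smooth transition around zero.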

Preferably, the specific method of adding the ECA attention mechanism before the feature fusion part of the feature extraction network module further comprises the following steps:

In the first step, the ECA module receives a feature map z0 and performs average pooling on each channel through a GAP (global average pooling) module; each channel yields one output node, giving a block z1 of size 1×1×C.

In the second step, a convolution operation with a convolution kernel size of 3 and a stride of 1 is performed on the output z1 of the first step, and the output is z2.

In the third step, a Sigmoid activation function is applied to the output z2 of the second step, and the output is z3. The Sigmoid activation function has the form:

$$\sigma(x) = \frac{1}{1 + e^{-x}}$$

According to the concrete crack detection method, video of the concrete to be detected is transmitted to a computer in real time through a network camera, and the transmitted video is processed frame by frame using the computer vision library OpenCV. Meanwhile, a training data set and a verification data set are formed by collecting pictures of cracks on concrete structure surfaces, and the data set is fed into the improved YOLOv5 network for model training. The improved YOLOv5 network uses a Rep-GFPN network with the same input size together with an ECA attention mechanism; its structure is shown in Fig. 2. Compared with the plain stacking operation of the original YOLOv5, the Rep-GFPN (Reparameterized Generalized-FPN) network structure increases information exchange, so that high-level semantic information and low-level spatial information can be fully exchanged and the feature extraction capability is improved. In addition, cross-layer connections provide more efficient information transfer through multi-scale feature fusion across the different-scale features of the previous and current layers, and can be extended to deeper networks; the structure is shown in Fig. 3. The ECA (Efficient Channel Attention) attention mechanism realizes local cross-channel interaction through one-dimensional convolution and extracts the dependency relationships among channels; its structure is shown in Fig. 4. Finally, the video stream processed by the computer vision library is fed into the trained model to identify cracks in the video.

According to the concrete crack detection method based on the improved YOLOv5, the neural network is combined with the network camera, so that a quick detection method of concrete cracks can be realized, the accuracy of detecting the cracks is improved, a real-time image of concrete to be detected is input through the network camera, a real-time detection effect can be realized, and the method has the characteristics of high efficiency, high accuracy, low cost, strong robustness and the like.

In one embodiment, the present invention also provides a computer readable storage medium storing a computer program which, when executed by a processor, implements the improved YOLOv5-based concrete crack detection method.

In summary, the concrete crack detection method based on improved YOLOv5 provides the following beneficial effects:

1. The Concat channel stacking network layer of the YOLOv5 head network is replaced by a Rep-GFPN network with the same input size, so that the detection precision can be improved, and the method is suitable for real-time scenes.

2. The ECA attention mechanism is introduced to adaptively learn the importance weight of each channel, so that the network can extract the characteristics more accurately and the generalization performance of the model is improved.

3. The data sets under different illumination, angles, sizes and backgrounds are obtained through data enhancement, so that the robustness of the model can be improved, and the target can still be accurately detected in various application scenes.

4. The improved YOLOv5 network adopts a lightweight design with a smaller model size and computational cost, can run on lower-end hardware (such as a Raspberry Pi), and can be applied to different devices and platforms.

The technical features of the above-described embodiments may be arbitrarily combined, and all possible combinations of the technical features in the above-described embodiments are not described for brevity of description, however, as long as there is no contradiction between the combinations of the technical features, they should be considered as the scope of the description.

The above examples illustrate only a few embodiments of the invention, which are described in detail and are not to be construed as limiting the scope of the invention. It should be noted that it will be apparent to those skilled in the art that several variations and modifications can be made without departing from the spirit of the invention, which are all within the scope of the invention. Accordingly, the scope of protection of the present invention is to be determined by the appended claims.

Claims

1. The concrete crack detection method based on the improved YOLOv5 is characterized by comprising the following steps of:

constructing a concrete crack data set;

building an improved YOLOv5 model;

training the pre-trained improved YOLOv5 model with a concrete crack dataset;

acquiring a real-time image of the concrete to be detected;

2. The concrete crack detection method based on improved YOLOv5 as claimed in claim 1, wherein the concrete method for constructing the improved YOLOv5 model comprises the following steps:

3. The concrete crack detection method based on improved YOLOv5 of claim 2, wherein the concrete method for using the Rep-GFPN network structure with the same input size to replace the original neck feature stacking network Concat as the main network of the neck network comprises the following steps:

4. The concrete crack detection method based on improved YOLOv5 of claim 3, wherein the concrete method for using the Rep-GFPN network structure with the same input size to replace the original neck feature stacking network Concat as the main network of the neck network further comprises the following steps:

step two, performing convolution block operation with a convolution kernel size of 3 and a step length of 1 on y3 output from the step one, normalizing and Silu activation functions, and outputting y4;

thirdly, performing convolution block operation with a convolution kernel size of 3 and a step length of 1, normalizing and a Silu activation function on y3 output in the first step, then performing RepConv with the convolution kernel size of 3 and the step length of 1, then performing convolution block operation with the convolution kernel size of 3 and the step length of 1, normalizing and the Silu activation function, and finally putting the function into an ECA attention mechanism module;

and fifthly, performing convolution block operation with a convolution kernel size of 1 and a step length of 1 on the output y7 of the fourth step, normalizing and a Silu activation function to obtain a final output y8 of the CSPS module.

5. The concrete crack detection method based on improved YOLOv5 of claim 4, wherein the specific method of adding ECA attention mechanism before feature fusion part of the feature extraction network module further comprises the steps of:

6. The concrete crack detection method based on improved YOLOv5 as claimed in claim 5, wherein the concrete method for performing crack detection on the real-time image of the concrete to be detected by using the trained improved YOLOv5 model comprises the following steps:

7. The concrete crack detection method based on improved YOLOv5 as claimed in claim 6, wherein the concrete method for acquiring the real-time image of the concrete to be detected comprises the following steps:

shooting a concrete video to be detected through a camera;

and carrying out image preprocessing on shot concrete videos to be detected frame by using a computer vision library OpenCV.

8. The improved YOLOv5-based concrete crack detection method of claim 7, wherein the concrete method of constructing a concrete crack dataset comprises the steps of:

9. A computer readable storage medium, characterized in that the computer readable storage medium stores a computer program, which when executed by a processor implements the improved YOLOv5-based concrete crack detection method according to any one of claims 1-8.