Disclosure of Invention
In order to overcome the defects in the prior art, the invention aims to provide a building surface disease detection method and system based on a convolutional neural network.
In order to achieve the above object, the present invention provides a building surface disease detection method based on a deep learning network model, comprising the following steps:
acquiring a building surface image as a data set;
inputting the data set into a deep learning network model for learning, wherein the deep learning network model detects and fuses the multi-scale feature map of the feature extraction network in the learning process;
performing iterative training on the fused feature map in the deep learning network model, wherein the training process is divided into first iterative training and second iterative training, the parameters of the model obtained at each iteration of the second iterative training are stored, and the median of the parameters of all stored models is taken to obtain a new model;
and identifying the surface diseases of the building based on the obtained new model.
In this building surface disease detection method, detecting and fusing the multi-scale feature maps of the feature extraction network allows smaller disease features to be identified. In addition, taking the median over the models saved during iterative training greatly improves the AP of the model and the accuracy of target localization and classification without increasing the number of parameters, so that disease features are identified more accurately.
A preferred scheme of the building surface disease detection method is as follows: the deep learning network model learns based on a Yolov5 network, the Yolov5 network uses PANet as a feature extraction backbone network, and the 4×, 8×, 16× and 32× downsampled feature maps of the feature extraction network are output and fused.
Yolov5 uses PANet as its feature extraction backbone network, which not only extracts and learns feature maps effectively but also fuses the learned feature maps. The structure of the PANet is improved here: the 4× downsampled feature map of the feature extraction network is additionally output and fused, which further improves the detection of small-area defects by the network model.
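The benefit of the additional 4× output can be illustrated by the grid size of each fused feature map. The following is a minimal sketch only; the 640-pixel square input and the helper name `grid_sizes` are assumptions for illustration and are not taken from the embodiment.

```python
# Grid sizes of the four fused detection scales for a square input image.
# The 640-pixel input size is an assumption chosen for illustration only.
STRIDES = (4, 8, 16, 32)  # downsampling factors named in the text


def grid_sizes(input_size, strides=STRIDES):
    """Return {stride: cells per side} for each output feature map."""
    return {s: input_size // s for s in strides}


if __name__ == "__main__":
    for stride, cells in grid_sizes(640).items():
        # Each cell of the feature map covers a stride x stride pixel patch.
        print(f"stride {stride:>2}: {cells} x {cells} grid, "
              f"{stride} x {stride} px per cell")
```

Under these assumptions the 4× map has four times as many cells as the 8× map, so each cell covers a smaller image patch, which is consistent with the stated aim of detecting small-area defects.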
A preferred scheme of the building surface disease detection method is as follows: layer 2 of the Yolov5 network is BottleneckCSP × 3, and the connection between layer 16 and layer 17 is removed. The feature map obtained from layer 16 is passed through a BottleneckCSP operation to extract features, reduced in dimensionality by a 1 × 1 convolution, upsampled, and spliced with the feature map of layer 2; a BottleneckCSP × 3 operation then yields the first output. At the same time, the feature map is reduced in dimensionality by a 3 × 3 convolution to obtain a new feature map, which is spliced and then fused downwards, finally forming a feature fusion network with four outputs. This enhances the ability of the network model to detect small targets and helps the model detect defects on tiny objects.
A preferred scheme of the building surface disease detection method is as follows: the second iterative training starts from the model finally obtained by the first iterative training and uses a cosine annealing learning rate within a set range; the model parameters obtained at each iteration of the second iterative training are stored, and the median of all the stored model parameters is taken to obtain a new model.
The second iterative training is adopted because the fluctuation of the cosine annealing learning rate lets the stabilized model explore more of the surrounding region, so that the model parameters can jump out of the current local optimum and search for better solutions. Taking the median of all the model parameters integrates the results explored under the cosine annealing learning rate, and according to the experiments the median works better than the mean.
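The element-wise median over saved models can be sketched as follows. This is an illustrative sketch only: the toy two-parameter snapshots and the helper name `aggregate` are assumptions, and a real model would hold tensors rather than short lists.

```python
from statistics import mean, median


def aggregate(snapshots, reducer=median):
    """Combine parameter snapshots element by element.

    snapshots: list of equal-length lists, one list per saved model.
    reducer:   function applied to the values of one parameter position.
    """
    return [reducer(values) for values in zip(*snapshots)]


# Three toy snapshots of a two-parameter model (hypothetical values);
# the first parameter of the last snapshot is an outlier.
snapshots = [
    [0.10, -0.50],
    [0.12, -0.48],
    [0.90, -0.52],
]

median_model = aggregate(snapshots, median)
mean_model = aggregate(snapshots, mean)

if __name__ == "__main__":
    print("median:", median_model)
    print("mean:  ", mean_model)
```

In this toy case the outlier pulls the mean of the first parameter well away from the other two snapshots, whereas the median stays at a value one of the snapshots actually took, which is one way to read the robustness argument above.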
A preferred scheme of the building surface disease detection method is as follows: when the disease area of a building surface disease is extracted, the prediction boxes inferred by the model are sorted by confidence from high to low, the prediction box with the highest confidence is found, the IOU values between it and the other prediction boxes are calculated, and for each prediction box whose IOU exceeds the threshold the confidence is reduced using the following formula:
wherein $s_i$ is the confidence of the pending prediction box $b_i$, and the remaining terms are the prediction box with the highest confidence, the IOU value between the prediction box with the highest confidence and the pending box $b_i$, and a hyperparameter. This improves detection when defects are densely packed, and the improvement is most pronounced where many defects crowd together.
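For orientation only, the two rescoring rules commonly used for soft non-maximum suppression are given below. The symbols $M$ (the prediction box with the highest confidence), $N_t$ (the IOU threshold) and $\sigma$ (the hyperparameter) are assumptions introduced here, and which of the two forms the method uses is not established by the text; the mention of a single hyperparameter would be consistent with the Gaussian form.

```latex
% Linear rescoring: decay only above the IOU threshold N_t
s_i \leftarrow
\begin{cases}
  s_i, & \mathrm{iou}(M, b_i) < N_t \\
  s_i \,\bigl(1 - \mathrm{iou}(M, b_i)\bigr), & \mathrm{iou}(M, b_i) \ge N_t
\end{cases}

% Gaussian rescoring: continuous decay controlled by sigma
s_i \leftarrow s_i \, \exp\!\left(-\frac{\mathrm{iou}(M, b_i)^2}{\sigma}\right)
```

In both forms a box that overlaps $M$ heavily keeps a reduced confidence instead of being discarded outright.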
The invention also provides a computer storage medium, wherein at least one executable instruction is stored in the storage medium, and the executable instruction enables a processor to execute the operation corresponding to the building surface disease detection method.
The invention further provides a building surface disease detection system, which comprises a processor and a memory, wherein the processor is in communication connection with the memory, the memory is used for storing at least one executable instruction, and the executable instruction enables the processor to execute the operation corresponding to the building surface disease detection method.
The invention has the beneficial effects that: the invention can identify the minor diseases on the surface of the building, has high identification precision and is particularly suitable for identifying and detecting the diseases on the surface of the bridge.
Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the accompanying drawings are illustrative only for the purpose of explaining the present invention, and are not to be construed as limiting the present invention.
In the description of the present invention, unless otherwise specified and limited, it is to be noted that the terms "mounted," "coupled," and "connected" are to be interpreted broadly, and may denote, for example, a mechanical connection or an electrical connection, a communication between two elements, a direct connection, or an indirect connection via an intermediate medium; the specific meanings of these terms may be understood by those skilled in the art according to the specific situation.
The invention provides a building surface disease detection method based on a deep learning network model, which comprises the following steps:
An image of a building surface is acquired as a data set.
The data set is input into a deep learning network model for learning, and the deep learning network model detects and fuses the multi-scale feature maps of the feature extraction network during learning.
In this embodiment the deep learning network model learns based on the Yolov5 network, which uses PANet as a feature extraction backbone network. The Yolov5 backbone network is deepened and the structure of the PANet is improved, so that the 4×, 8×, 16× and 32× downsampled feature maps of the feature extraction network are output and fused.
Specifically, as shown in fig. 1, layer 2 of the Yolov5 network in this scheme is BottleneckCSP × 3 so as to better extract defect features, and the connection between layers 16 and 17 is removed. The feature map obtained at layer 16 is passed through a BottleneckCSP operation to extract features, reduced in dimensionality by a 1 × 1 convolution, upsampled, and spliced with the feature map of layer 2; a BottleneckCSP × 3 operation then yields the first output, which is used to detect small-object defects. At the same time, the feature map is reduced in dimensionality by a 3 × 3 convolution to obtain a new feature map, which is spliced and then fused downwards, finally forming a feature fusion network with four outputs that enhances the ability of the network model to detect small objects.
After the feature maps are output and fused, iterative training is performed on the fused feature maps in the deep learning network model. The training process comprises first iterative training and second iterative training; the parameters of the model obtained at each iteration of the second iterative training are stored, and the median of the parameters of all stored models is taken to obtain a new model.
Specifically, in this embodiment the first iterative training consists of 300 training iterations of the improved model. The second iterative training starts from the final model obtained by the first iterative training and runs 24 further iterations with a cosine annealing learning rate within a set range; the model parameters obtained at each of the 24 iterations are stored, and the median of the 24 sets of model parameters is taken to obtain a new model. Adopting the more robust median rule stabilizes the performance of the Yolov5 model and greatly improves the AP of the model and the accuracy of target localization and classification without increasing the number of parameters.
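A cosine annealing learning rate over the 24 second-stage iterations can be sketched as follows. This is a minimal sketch under stated assumptions: the bounds 0.001 and 0.00001 are the range reported for the experiment, while the single half-cosine shape (no restarts) and the function name `cosine_annealing_lr` are assumptions, since the embodiment does not specify the exact schedule.

```python
import math


def cosine_annealing_lr(step, total_steps, lr_max, lr_min):
    """Learning rate at `step` (0-based) on one cosine half-wave.

    Assumed shape: starts at lr_max and decays smoothly to lr_min.
    """
    if total_steps < 2:
        return lr_max
    cos_term = math.cos(math.pi * step / (total_steps - 1))
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + cos_term)


# 24 second-stage iterations as in the embodiment; bounds as assumed above.
schedule = [cosine_annealing_lr(t, 24, 1e-3, 1e-5) for t in range(24)]

if __name__ == "__main__":
    for t, lr in enumerate(schedule):
        print(f"iteration {t:2d}: lr = {lr:.6f}")
```

A model snapshot would be saved after each of these iterations, and the stored snapshots would then be combined by the median as described above.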
The surface diseases of the building are then identified based on the trained deep learning network model.
Specifically, when the disease area of a building surface disease is extracted, soft non-maximum suppression (Soft-NMS) may be used in place of the original non-maximum suppression (NMS): the prediction boxes inferred by the model are sorted by confidence from high to low, the box with the highest confidence is found, the IOU values between it and the other prediction boxes are calculated, and for each prediction box whose IOU exceeds the threshold (that is, whose overlap is high) the confidence is reduced using the formula:
rather than crudely removing such boxes as soon as they exceed the threshold, as the original NMS does, where $s_i$ is the confidence of the pending prediction box $b_i$, and the remaining terms are the prediction box with the highest confidence, the IOU value between the prediction box with the highest confidence and the pending box $b_i$, and a hyperparameter.
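The procedure described above can be sketched in Python as follows. This is an illustrative sketch, not the implementation of the embodiment: the Gaussian decay, the parameter names `iou_thresh`, `sigma` and `score_thresh`, their default values, and the (x1, y1, x2, y2) box format are all assumptions.

```python
import math


def iou(a, b):
    """IOU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, iou_thresh=0.5, sigma=0.5, score_thresh=0.001):
    """Return (box, score) pairs in the order they were selected.

    Assumed rule: a Gaussian decay is applied only to boxes whose IOU
    with the current highest-confidence box exceeds iou_thresh.
    """
    candidates = list(zip(boxes, scores))
    kept = []
    while candidates:
        # Select the remaining box with the highest confidence.
        best = max(range(len(candidates)), key=lambda i: candidates[i][1])
        best_box, best_score = candidates.pop(best)
        kept.append((best_box, best_score))
        rescored = []
        for box, score in candidates:
            overlap = iou(best_box, box)
            if overlap > iou_thresh:
                # Lower the confidence instead of deleting the box.
                score *= math.exp(-(overlap ** 2) / sigma)
            if score >= score_thresh:
                rescored.append((box, score))
        candidates = rescored
    return kept


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 0, 11, 10), (20, 20, 30, 30)]
    scores = [0.9, 0.8, 0.7]
    for box, score in soft_nms(boxes, scores):
        print(box, round(score, 4))
```

In the usage example the second box overlaps the first heavily, so it is kept with a reduced confidence and ends up ranked below the non-overlapping third box; the original NMS would have discarded it.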
The following takes bridge surface defects as an example:
Experimental data
The experimental data mainly come from photos of various bridge diseases collected from 2015 to 2020. After manual screening, the bridge images containing defects were labeled with the LabelImg software to assemble the bridge defect data set used in the experiment. The data set covers six types of diseases: cracks, peeling, honeycombing, holes, exposed reinforcement, and water seepage, for a total of 3828 pictures; detailed data set information is shown in table 1. In the experiment, 3461 pictures are randomly selected as the training set and the remaining 367 pictures are used as the test set.
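The random split can be sketched as follows. This is a minimal sketch only: the integer image identifiers, the fixed seed and the helper name `split_dataset` are assumptions for illustration and do not describe how the split was actually performed.

```python
import random


def split_dataset(items, n_train, seed=0):
    """Shuffle a copy of `items` and cut it into training and test parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical identifiers standing in for the 3828 labeled pictures.
image_ids = list(range(3828))
train_ids, test_ids = split_dataset(image_ids, 3461)

if __name__ == "__main__":
    print(len(train_ids), "training pictures,", len(test_ids), "test pictures")
```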
TABLE 1 Experimental data details
Evaluation indexes are as follows:
This example mainly uses precision, recall, average precision (AP) and mean average precision (mAP) to evaluate the target detection performance of the experimental method.
1. Precision (P) and recall (R) are calculated from TP (true positives), FP (false positives) and FN (false negatives), where TP is the number of positive samples correctly classified as positive, FP is the number of negative samples incorrectly classified as positive, and FN is the number of positive samples incorrectly classified as negative. They are calculated as follows:
$$P = \frac{TP}{TP + FP}, \qquad R = \frac{TP}{TP + FN}$$
2. Average precision (AP) and mean average precision (mAP): AP represents the detection accuracy for a single class, and mAP represents the average of the APs of all classes. AP and mAP are calculated as follows:
$$AP = \int_0^1 P(R)\,dR, \qquad mAP = \frac{1}{N}\sum_{i=1}^{N} AP_i$$
where N is the number of target categories. Precision usually rises as recall falls. mAP50 and mAP0.5:0.95 are used in the experiment to measure the performance of the detection algorithm. mAP50 sets the IOU threshold to 0.5: a prediction box whose IOU with the ground-truth box is greater than 0.5 is counted as a positive sample (TP), and one whose IOU is less than 0.5 is counted as a false positive (FP). mAP0.5:0.95 computes one mAP value at each IOU threshold from 0.5 to 0.95 in steps of 0.05, giving 10 mAP values in total, and their average is the mAP0.5:0.95 index value.
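The evaluation indexes above can be sketched in Python as follows. This is an illustrative sketch: the rectangle-rule approximation of the area under the precision-recall curve and all function names are assumptions, and evaluation toolkits typically apply their own interpolation, so their AP values may differ.

```python
def precision_recall(tp, fp, fn):
    """Precision and recall from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def average_precision(recalls, precisions):
    """Approximate the area under the P-R curve with a rectangle rule.

    recalls must be in ascending order, paired with their precisions.
    """
    ap, prev_recall = 0.0, 0.0
    for r, p in zip(recalls, precisions):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


def mean_average_precision(ap_per_class):
    """Average the per-class AP values over the N target categories."""
    return sum(ap_per_class) / len(ap_per_class)


# The ten IOU thresholds used by mAP0.5:0.95 (0.50, 0.55, ..., 0.95).
IOU_THRESHOLDS = [0.5 + 0.05 * k for k in range(10)]

if __name__ == "__main__":
    print(precision_recall(tp=8, fp=2, fn=2))
    print(average_precision([0.5, 1.0], [1.0, 0.5]))
    print([round(t, 2) for t in IOU_THRESHOLDS])
```

mAP0.5:0.95 would then be the mean of the ten mAP values obtained by repeating the evaluation at each threshold in `IOU_THRESHOLDS`.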
Results and analysis of the experiments
The experiment was carried out with the PyTorch deep learning library and its dependency packages on an i9-10900K CPU, a 2080Ti GPU, 64GB of memory and Windows 10. During the experiment, the batch size was 8, the initial learning rate was 0.01, and the weight decay was 0.0005. For a fair comparison, the results of all experimental methods are averaged over repeated training runs. After 300 epochs of iterative training, the convergence of the loss functions when the proposed model achieves its best effect is shown in fig. 2.
From fig. 2, it can be seen that after 300 epochs of training the Box, Objectness and Classification losses on both the training set and the validation set converge well. Fig. 3 shows the precision, recall, average precision and mean average precision of the proposed method on the validation set.
From fig. 3, it can be seen that the precision, recall, average precision and mean average precision on the validation set all reach satisfactory results after 300 epochs of training. Fig. 4 shows the best mAP50 and mAP0.5:0.95 on the validation set.
From fig. 4, it can be seen that the best mAP50 and mAP0.5:0.95 values of the proposed method are 0.608 and 0.305, respectively. The model with this performance is then saved and retrained for 24 iterations with a cosine annealing learning rate between 0.001 and 0.00001; the median of the parameters of the 24 resulting models is calculated to determine the final network model, whose performance is finally evaluated on the validation set. Table 2 reports the comparative results against the original Yolov5 algorithm on the mAP50 and mAP0.5:0.95 indexes on the bridge defect data set.
TABLE 2 mAP comparison on bridge Defect dataset
From table 2, it can be seen that on the mAP50 index neither adding the SWA algorithm nor adding the STAM algorithm improves the model performance over the original Yolov5 algorithm. On the mAP0.5:0.95 index, however, adding the SWA algorithm improves performance by 0.9% over the original Yolov5 algorithm, and adding the STAM algorithm improves it by 1.5%. The STAM random median training algorithm designed in this method is therefore the more effective of the two. Table 3 reports the comparison of different experimental methods on the bridge defect data set.
TABLE 3 comparison of the different methods
From table 3, it can be seen that on the mAP50 index the proposed method, Our Yolov5x + STAM + Soft-NMS, improves by 2.2% over the original Yolov5x and by 8.1% over the worst-performing Yolov3, and its value (0.622) is the best among the compared methods. On the mAP0.5:0.95 index the proposed method keeps its advantage: it improves by 2.1% over Yolov5x and by 6.9% over the worst-performing Yolov3, and its value (0.32) is again the best among the compared methods. FIG. 5 shows the detection results of the proposed method on the validation set.
From fig. 5, it can be seen that even though the bridge image set contains many defect types, the method accurately identifies the defect type, accurately locates the defect area, automatically matches the defect type, and identifies defect targets accurately and efficiently.
The application also provides an embodiment of a computer storage medium, wherein the storage medium stores at least one executable instruction, and the executable instruction causes a processor to execute the operation corresponding to the building surface disease detection method.
The application also provides a building surface disease detection system, which comprises a processor and a memory, wherein the processor is in communication connection with the memory, and the memory is used for storing at least one executable instruction, and the executable instruction enables the processor to execute the operation corresponding to the building surface disease detection method.
In the description herein, references to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
While embodiments of the invention have been shown and described, it will be understood by those of ordinary skill in the art that: various changes, modifications, substitutions and alterations can be made to the embodiments without departing from the principles and spirit of the invention, the scope of which is defined by the claims and their equivalents.