CN116051496A - Real-time sewer defect detection method

Info

Publication number: CN116051496A
Application number: CN202310006821.5A
Authority: CN
Inventors: 朱武进; 孙全平; 侯志伟; 邹赟; 孙雨晴; 袁天然
Original assignee: Huaiyin Institute of Technology
Current assignee: Huaiyin Institute of Technology
Priority date: 2023-01-04
Filing date: 2023-01-04
Publication date: 2023-05-02

Abstract

The invention discloses a real-time sewer defect detection method comprising the following steps: 1) collecting live-action images of sewer defects and labeling the defects in the images with bounding frames using labeling software to construct a data set; 2) constructing an initial convolutional neural network model and training environment, improving the computation of similar feature maps in the network, and, using pre-training weights, inputting the data set into the pre-trained network with initial weights w and biases b for optimized training; 3) during back propagation of the convolutional neural network, combining gradient descent with weighted non-maximum suppression and a clustering algorithm to optimize the calculation of the loss function Loss, and adjusting the weights w and biases b through back propagation to obtain an optimal model; 4) performing defect detection on live-action images using the optimal model. Compared with the prior art, the invention takes both detection speed and detection accuracy into account, improves detection accuracy while maintaining detection speed, and basically meets the requirements of real-time sewer defect detection.

Description

Real-time sewer defect detection method

Technical Field

The invention relates to the technical field of defect detection, in particular to a real-time sewer defect detection method.

Background

The health of sewer facilities is an important guarantee for the normal operation of cities, yet their location and function greatly restrict inspection and maintenance work. At present, there is little research in China on detecting sewer defects and the detection equipment is relatively outdated; for large-diameter pipelines without water, the traditional approach is manual entry for inspection. With the development of technology in recent years, pipeline inspection robots have overcome the limitations of purely manual inspection and have become the most popular and most widely applied inspection method: by manually controlling the robot, the operator can observe conditions inside the sewer in real time and make timely judgments. Although pipeline robots have greatly improved the efficiency of pipeline maintenance, the sheer number of sewer facilities means that relying on manual judgment alone wastes a great deal of manpower and time. Promoting automatic defect detection for pipeline robots has therefore become a problem to be solved in the development of sewer inspection in China.

Existing deep-learning-based target detection methods fall mainly into two types. The first is the two-stage detection network, such as RCNN, Fast-RCNN and Faster-RCNN, in which detection is divided into two steps: candidate regions are first extracted from the input image, and the extracted candidate regions are then input into the detection network for classification and recognition. This approach achieves higher detection accuracy, but because of the step-by-step computation and the large number of extracted candidate frames, there are many parameters to train and inference is slow; taking the fastest of these, Faster-RCNN, as an example, the detection speed is only 10 fps, which makes the requirement of real-time detection difficult to meet. The second is the one-stage detection network, represented mainly by SSD and YOLO. The main difference is that the position and category of the detected target are output directly after the input image or video stream data has been processed by the backbone network. The detection accuracy of this approach is lower than that of the two-stage approach, but it can meet the requirement of real-time detection.

In summary, traditional sewer inspection still suffers from heavy workload, high labor cost and low detection efficiency, and automatic detection of sewer defects has difficulty balancing detection speed and detection accuracy.

Disclosure of Invention

Object of the invention: in view of the problems in the prior art, the invention provides a real-time sewer defect detection method that takes both detection speed and detection accuracy into account, improves detection accuracy while maintaining detection speed, and basically meets the requirements of real-time sewer defect detection.

The technical scheme is as follows: the invention provides a real-time sewer defect detection method, which comprises the following steps:

Step 1: collecting live-action images of a sewer, marking the positions of defects with an image annotation tool, repeating this operation to generate a training data set, preprocessing the data set, and dividing it into a training set, a verification set and a test set in proportion;

Step 2: setting up an initial convolutional neural network model and a training environment, initializing the learning rate λ and the number of training epochs, and using grouped convolution at the network detection head to reduce the number of parameters of the convolutional neural network model;

Step 3: using pre-training weights in the constructed convolutional neural network model, improving the computation of similar feature maps in the network, and inputting the data set to be trained into the pre-trained network with initial weights w and biases b for optimized training;

Step 4: during back propagation of the convolutional neural network, combining gradient descent with weighted non-maximum suppression and a clustering algorithm to optimize the calculation of the loss function Loss, adjusting the weights w and biases b through back propagation, repeating the operation until the loss function reaches its minimum, and outputting the weights w and biases b corresponding to the minimum loss to obtain an optimal model;

Step 5: calculating model evaluation indexes, testing the optimal model, verifying the model detection effect, and using the optimal model to detect defects in single frames taken from a real-time video stream and input into the network.

Further, the preprocessing operation in step 1 comprises performing image enhancement on the image data in the original data set by cropping, rotating, scaling and adding noise.

Further, the use of grouped convolution at the network detection head in step 2 is specifically as follows: the feature map obtained after convolution is split into two parts, one part is convolved, and the other part is concatenated with the feature map input to the network, so that the number of network input parameters is reduced to 1/2 of the original number.
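The halving can be checked with a small weight count. The following is a minimal sketch, assuming that the split into two parts behaves like a grouped convolution with two groups; the function name and the channel counts are hypothetical and are not taken from the patent.

```python
def conv_weight_count(c_in: int, c_out: int, k: int, groups: int = 1) -> int:
    """Number of weights (bias excluded) in a k x k convolution with `groups` groups."""
    if c_in % groups or c_out % groups:
        raise ValueError("channel counts must be divisible by groups")
    # Each group maps c_in/groups input channels to c_out/groups output channels.
    return groups * (c_in // groups) * (c_out // groups) * k * k


# Hypothetical layer: 256 input channels, 256 output channels, 3x3 kernel.
standard = conv_weight_count(256, 256, 3)            # ordinary convolution
grouped = conv_weight_count(256, 256, 3, groups=2)   # two groups (assumption)
ratio = grouped / standard
```

With two groups each filter only sees half of the input channels, which is where the factor of 1/2 comes from.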

Further, the similar feature map calculation in step 3 is as follows: after a feature map is obtained, a convolution with a 3x3 kernel is applied to it to obtain several feature maps, which are input into the subsequent network.

Further, the clustering algorithm in the step 4 is as follows:

randomly selecting 4 points from the prediction results output after network weight training is completed as the initial cluster centers C; calculating the distance from each sample x_i in the data set to each cluster center and assigning the sample to the cluster whose center is nearest; and, for each defect class i, repeating the operation until the cluster centers are stable.
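The clustering steps above can be sketched as a plain k-means loop. This is an illustration only: the text does not state the distance measure or the dimensionality of the samples, so Euclidean distance on 2-D points is assumed, and the function name, the `init` and `seed` parameters and the sample points are hypothetical.

```python
import math
import random


def kmeans(points, k=4, init=None, max_iter=100, seed=0):
    """Plain k-means on 2-D points; returns (centers, labels)."""
    rng = random.Random(seed)
    # Initial centers: given explicitly, or k samples picked at random.
    centers = list(init) if init is not None else rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(max_iter):
        # Assignment step: each sample goes to the nearest cluster center.
        labels = [min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points]
        # Update step: each center moves to the mean of its assigned samples.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centers.append(centers[j])  # keep the center of an empty cluster
        if new_centers == centers:  # centers are stable, so stop
            break
        centers = new_centers
    return centers, labels


pts = [(0, 0), (0, 1), (10, 0), (10, 1), (0, 10), (1, 10), (10, 10), (11, 10)]
centers, labels = kmeans(pts, k=4, init=[(0, 0), (10, 0), (0, 10), (10, 10)])
```

Passing `init` makes the run reproducible; with random initialization the result can depend on the seed.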

Further, the loss function Loss is composed of a classification loss function, a positioning frame loss function and a confidence loss function, specifically:

$$Loss = a \cdot cls\_loss + b \cdot box\_loss + c \cdot obj\_loss \tag{1}$$

wherein cls_loss is the classification loss function, which measures whether the class predicted for an anchor frame matches its labeled class; the calculation formula is as follows:

wherein $N$ is the number of categories; $y_{n,c}$ is an indicator variable (0 or 1); $p_{n,c}$ is the probability of being predicted as category $c$;

wherein box_loss is the positioning loss function, which calculates the error between the prediction frame and the calibration frame; the calculation formula is as follows:

wherein $D$ is the center distance between the predicted frame and the actual frame, $D_C$ is the diagonal length of the minimum bounding rectangle of the predicted and actual frames, $v$ is the aspect ratio similarity of the predicted and actual frames, and $\alpha$ is the weighting factor of $v$:

IOU is the intersection over union;

$w^{gt}/h^{gt}$ is the aspect ratio of the actual frame, and $w/h$ is the aspect ratio of the predicted frame;
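The positioning formulas themselves are not reproduced in this text. The symbols defined above, together with the boundary values quoted in the detailed description (α = 0.5 and a CIOU range of -1.5 to 1), are consistent with the standard Complete-IoU definition. Assuming that definition, the expressions would read as follows; the final relation box_loss = 1 - CIOU is a common convention and is an assumption here, not something the text states.

```latex
\[
\begin{aligned}
CIOU &= IOU - \frac{D^{2}}{D_C^{2}} - \alpha v \\
v &= \frac{4}{\pi^{2}} \left( \arctan\frac{w^{gt}}{h^{gt}} - \arctan\frac{w}{h} \right)^{2} \\
\alpha &= \frac{v}{(1 - IOU) + v} \\
box\_loss &= 1 - CIOU \qquad \text{(assumed convention)}
\end{aligned}
\]
```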

wherein obj_loss is the confidence loss function, used to calculate the confidence of the network; the calculation formula is as follows:

wherein the sample value is compared with the predicted value $c_i$; $\lambda_{nobj}$ is the negative sample weight coefficient; $\lambda_{obj}$ is the positive sample weight coefficient; $S^2$ is the total number of predictions; $B$ is the number of predictions per prediction block.

Further, in step 4 the weighted non-maximum suppression algorithm performs a weighting operation according to the confidence of the network predictions to obtain a new predicted rectangular frame, which is taken as the final predicted rectangular frame, and predicted frames of the same category but with smaller values are removed.

The IOU is the overlap ratio between the predicted bounding frame and the reference bounding frame: the larger the intersection area, the more accurate the prediction, and when the IOU equals 1 the predicted frame and the actual frame coincide completely, which is the optimal state of detection frame prediction. The IOU value is used as the criterion for classifying samples as TP or FP: when the IOU is greater than a threshold the result is judged as TP, otherwise as FP. Prediction frames with poor detection performance are suppressed according to their IOU value, and the IOU and the confidence of the detection frame are combined in a weighting operation, with the calculation formula as follows:

wherein Intersection is the intersection area, Union is the union area, and α is the weight.
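The weighting formula is not reproduced in this text, so the following is only a sketch of one common form of weighted non-maximum suppression, not the patent's exact rule. It assumes boxes in (x1, y1, x2, y2) form, merges same-class boxes that overlap the highest-scoring box, and weights each box by confidence times IOU; that weighting, the threshold of 0.5 and all names and sample detections are assumptions, and the weight α mentioned above is not modeled.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def weighted_nms(dets, iou_thr=0.5):
    """dets: list of (box, score, cls). Returns a list of merged (box, score, cls)."""
    remaining = sorted(dets, key=lambda d: d[1], reverse=True)
    kept = []
    while remaining:
        best, rest = remaining[0], remaining[1:]
        group, remaining = [best], []
        for d in rest:
            # Same-class boxes that overlap the best box join its group.
            if d[2] == best[2] and iou(best[0], d[0]) >= iou_thr:
                group.append(d)
            else:
                remaining.append(d)
        # Assumed weighting: confidence times IOU with the best box.
        weights = [d[1] * iou(best[0], d[0]) for d in group]
        total = sum(weights)
        if total > 0:
            merged = tuple(
                sum(w * d[0][i] for w, d in zip(weights, group)) / total
                for i in range(4)
            )
        else:
            merged = tuple(best[0])  # degenerate case: keep the best box unchanged
        kept.append((merged, best[1], best[2]))
    return kept


# Hypothetical detections: two overlapping boxes of class 0 and one box of class 1.
dets = [((0, 0, 10, 10), 0.9, 0), ((1, 0, 11, 10), 0.6, 0), ((50, 50, 60, 60), 0.8, 1)]
result = weighted_nms(dets)
```

Unlike plain non-maximum suppression, the lower-scoring overlapping box is not simply discarded: it pulls the final frame slightly toward itself before being removed.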

The beneficial effects are that:

1. The network model training time is short. Using pre-trained model weight parameters greatly reduces the amount of data required for training and the time spent on training.

2. The deep-learning-based sewer defect detection model can better detect the positions of defects and distinguish different defect types. Real-time detection is fast: the defect detection network model based on a convolutional neural network uses end-to-end input and output and can preprocess different types of images, and after acceleration the detection speed reaches 53 FPS, which basically meets the speed requirement of real-time detection.

3. The detection method is efficient and stable. The data set formed from live-action images is authentic and reliable, is suitable for defect detection under different conditions, and can effectively reduce labor and time costs.

Drawings

FIG. 1 is a flow chart of training a defect detection model according to the present invention;

FIG. 2 is a block diagram of the designed convolutional neural network;

FIG. 3 is a diagram showing the detection effect of the defect model;

FIG. 4 is a detection flow chart.

Detailed Description

The following detailed description of specific embodiments of the invention is provided to facilitate a thorough understanding of the invention.

As shown in fig. 1, the invention provides a real-time sewer defect detection method. An intelligent detection algorithm based on a deep learning model realizes real-time detection of different sewer defects; the detection model is deployed and accelerated on a GPU platform, a video data stream is acquired by a front-end camera and input into the network model, and the position and type of each defect are detected. The method mainly comprises the following steps:

S1, data collection: the invention requires image data sets of the different sewer defect types. Sewer inspection images are collected in a real environment, and the image data in the original data set are enhanced by operations such as cropping, rotation, scaling and noise addition. This enlarges the training data set and balances the proportion of samples of the different defect types to a certain extent: adjusting the number of samples per type through data enhancement mitigates the sample imbalance caused by large differences in the number of samples between defect types, and prevents the model from underfitting a defect type for which little data is available. The positions of defects are then marked with an image annotation tool; this operation is repeated to generate the training data set, which is divided into a training set, a verification set and a test set in proportion.
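The proportional split at the end of S1 can be sketched as below. The text does not give the proportions, so the 8:1:1 ratio, the fixed seed, the function name and the file names are all assumptions for illustration.

```python
import random


def split_dataset(items, ratios=(0.8, 0.1, 0.1), seed=0):
    """Shuffle `items` and split them into train / verification / test lists."""
    if abs(sum(ratios) - 1.0) > 1e-9:
        raise ValueError("ratios must sum to 1")
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_train = round(len(shuffled) * ratios[0])
    n_val = round(len(shuffled) * ratios[1])
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_val],
        shuffled[n_train + n_val:],  # the remainder forms the test set
    )


# Hypothetical file names standing in for annotated sewer images.
images = [f"img_{i:04d}.jpg" for i in range(1000)]
train, val, test = split_dataset(images)
```

Shuffling before splitting avoids putting consecutive frames from one inspection run into a single subset, although frames from the same run can still end up in different subsets.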

S2, training a target detection network model:

An initial convolutional neural network model and training environment are constructed, the computation of similar feature maps in the network is improved, the learning rate λ and the number of training epochs are initialized, and grouped convolution is used at the network detection head to reduce the number of parameters of the model. The use of grouped convolution at the network detection head is specifically as follows: the feature map obtained after convolution is split into two parts; one part is convolved, and the other part is concatenated with the feature map of the previously obtained result and input to the network, so that the number of network input parameters is reduced to 1/2 of the original number.

The feature map output of the residual blocks of the training network is improved as follows. Many nonlinear feature maps are relatively similar, and computing these similar feature maps repeatedly increases the amount of computation required in training. The ratio of the computation of the original feature map calculation mode to that of the present calculation mode is:

c*k*k

wherein c is the number of input channels; k is the convolution kernel size, with k = 3 in this scheme; s is the number of transformations, with s = 3 in this scheme. Because the identity mapping requires no complex calculation, this feature map calculation method reduces the amount of computation in the neural network and accelerates the training of the network.
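Only a fragment of the ratio expression survives here. As a hedged illustration, suppose the scheme computes 1/s of the output feature maps by ordinary convolution, keeps them through the identity mapping, and derives the remaining maps with single-channel k x k convolutions. Under that assumption the cost ratio can be checked numerically; the cost model, the function names, the channel count and the feature map size are assumptions, not values from the text.

```python
def conv_cost(c: int, k: int, n_out: int, h: int, w: int) -> int:
    """Multiply-accumulate count of an ordinary k x k convolution with c input
    channels that produces n_out feature maps of size h x w."""
    return n_out * h * w * c * k * k


def similar_map_cost(c: int, k: int, n_out: int, h: int, w: int, s: int) -> int:
    """Assumed cheaper scheme: n_out / s base maps by ordinary convolution, then
    (s - 1) further maps per base map by a single-channel k x k convolution.
    The identity mapping that keeps the base maps costs nothing."""
    base = n_out // s
    return base * h * w * c * k * k + (s - 1) * base * h * w * k * k


c, k, s = 64, 3, 3          # c is hypothetical; k = 3 and s = 3 as in the text
n_out, h, w = 96, 40, 40    # hypothetical output size, n_out divisible by s
ratio = conv_cost(c, k, n_out, h, w) / similar_map_cost(c, k, n_out, h, w, s)
closed_form = s * c / (c + s - 1)
```

Under this cost model the ratio simplifies to s·c / (c + s - 1), which approaches s when the channel count c is much larger than s.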

During training, the data set is input into the pre-trained network with initial weights w and biases b for optimized training, which reduces the time required for model training to converge.

During back propagation of the convolutional neural network, gradient descent is combined with weighted non-maximum suppression and the clustering algorithm to optimize the calculation of the loss function Loss; the weights w and biases b are adjusted backward, and the operation is repeated until the error function reaches its minimum, whereupon the weights w and biases b corresponding to the minimum error are output and the trained optimal model is saved.

The weighted non-maximum suppression algorithm is applied to the large number of rectangular prediction frames generated in step S2: a weighting operation according to the confidence of the network predictions yields a new predicted rectangular frame, which is taken as the final predicted rectangular frame, and prediction frames of the same category but with smaller values are eliminated.

S3, calculating model evaluation indexes, testing the optimal model, and verifying the model detection effect. For defect detection with the network model, the trained defect detection model is deployed with TensorRT on a platform equipped with an NVIDIA GPU and the hardware environment is set up; the content currently captured by the camera is acquired in real time and input into the detection model for detection, and the detection result of the model is fed back.

Detection mainly consists of locating the target and classifying its type. The loss function is accordingly composed of a classification loss function, a positioning frame loss function and a confidence loss function, and the total error function is calculated as:

$$Loss = a \cdot cls\_loss + b \cdot box\_loss + c \cdot obj\_loss \tag{1}$$

wherein a, b and c are weights; cls_loss is the classification loss function, which measures whether the class predicted for an anchor frame matches its labeled class. The input picture is divided into an 80x80 grid, three detection frames are predicted per grid cell, and the prediction information of each frame includes N classification probabilities, where N is the total number of categories. The invention mainly detects four defect types, so N = 4 and each prediction frame has 4 classification probabilities between 0 and 1. The total classification loss function is calculated as:

wherein $N$ is the number of categories; $y_{n,c}$ is an indicator variable (0 or 1); $p_{n,c}$ is the probability of being predicted as category $c$.
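The classification formula is not reproduced in this text. The symbols defined above match the standard cross-entropy loss; assuming that form, and assuming that $n$ indexes the prediction frame (the text does not define $n$), the loss for one frame would read:

```latex
\[
cls\_loss = -\sum_{c=1}^{N} y_{n,c} \, \log p_{n,c}
\]
```

Since $y_{n,c}$ is 1 only for the labeled category, the sum reduces to the negative log-probability that the network assigns to the correct defect type.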

wherein box_loss is the positioning loss function, which quantifies the error between the predicted frame and the correct frame, taking into account the overlapping area of the two frames, their center distance and their aspect ratios; the calculation formula is as follows:

wherein $D$ is the center distance between the predicted frame and the actual frame; $D_C$ is the diagonal length of the minimum bounding rectangle of the predicted and actual frames; $v$ is the aspect ratio similarity of the predicted and actual frames; $\alpha$ is the weighting factor of $v$:

IOU is the intersection over union;

$w^{gt}/h^{gt}$ is the aspect ratio of the actual frame; $w/h$ is the aspect ratio of the predicted frame.

In the above, $D^2$ is the squared distance between the center points of the predicted frame A and the actual frame B, $D_C$ is the diagonal length of the minimum bounding rectangle of frames A and B, $v$ is the aspect ratio similarity of frames A and B, and $\alpha$ is the influence factor of $v$. The value of $v$ ranges from 0 to 1: $v$ is 0 when the aspect ratios of frames A and B are equal, and $v$ approaches 1 when the aspect ratios differ infinitely.

When the distance between frames A and B is infinite and the aspect ratio difference is infinite, DIOU takes -1, v takes 1 and α takes 0.5, so CIOU takes -1 - 0.5 = -1.5; when frames A and B overlap completely, DIOU takes 1, v takes 0 and α takes 0, so CIOU takes 1. The value range of CIOU is therefore -1.5 to 1.
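These limiting values can be checked numerically. The sketch below assumes the standard Complete-IoU definition (CIOU = IOU - D²/D_C² - αv, with α = v / ((1 - IOU) + v)), which reproduces the values quoted above; the function name, the (x1, y1, x2, y2) box format and the sample boxes are assumptions.

```python
import math


def ciou(box, gt):
    """Complete IoU of two boxes given as (x1, y1, x2, y2) with positive size."""
    # Overlap term: intersection over union.
    iw = max(0.0, min(box[2], gt[2]) - max(box[0], gt[0]))
    ih = max(0.0, min(box[3], gt[3]) - max(box[1], gt[1]))
    inter = iw * ih
    w, h = box[2] - box[0], box[3] - box[1]
    wg, hg = gt[2] - gt[0], gt[3] - gt[1]
    iou = inter / (w * h + wg * hg - inter)
    # Distance term: squared center distance over squared enclosing-box diagonal.
    d2 = ((box[0] + box[2] - gt[0] - gt[2]) / 2) ** 2 \
        + ((box[1] + box[3] - gt[1] - gt[3]) / 2) ** 2
    dc2 = (max(box[2], gt[2]) - min(box[0], gt[0])) ** 2 \
        + (max(box[3], gt[3]) - min(box[1], gt[1])) ** 2
    # Aspect-ratio term v in [0, 1] and its weight alpha.
    v = (4 / math.pi ** 2) * (math.atan(wg / hg) - math.atan(w / h)) ** 2
    alpha = v / ((1 - iou) + v) if v > 0 else 0.0
    return iou - d2 / dc2 - alpha * v


same = ciou((0, 0, 10, 10), (0, 0, 10, 10))                     # identical frames
far = ciou((0, 0, 1, 1000), (1e6, 1e6, 1e6 + 1000, 1e6 + 1))    # distant, opposite shapes
```

Identical frames give exactly 1, while a tall narrow frame very far from a wide flat one gives a value just above the lower bound of -1.5.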

wherein obj_loss is the confidence loss function, used to calculate the confidence of the network. A mask is constructed; positions where the mask is true are not directly assigned 1. Instead, the CIOU between the corresponding prediction frame and target frame is calculated and the CIOU matrix is used as the confidence label of the prediction frames. Positions where the CIOU is smaller than 0 or the mask is false are directly assigned 0. The loss function is calculated as follows:
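The labeling rule described above can be written as a small sketch: the confidence label is the CIOU where the mask is true and the CIOU is non-negative, and 0 otherwise. The function name and the sample values are hypothetical.

```python
def confidence_labels(ciou_values, mask):
    """Confidence target for each prediction frame: the CIOU with its target
    frame where the mask is true, clipped at 0, and 0 where the mask is false."""
    return [max(c, 0.0) if m else 0.0 for c, m in zip(ciou_values, mask)]


# Hypothetical CIOU values and mask for four prediction frames.
labels = confidence_labels([0.82, -0.30, 0.55, 0.91], [True, True, False, True])
```

Using the CIOU rather than a fixed 1 as the target means a frame that only loosely covers its defect is trained toward a lower confidence than a tightly fitting one.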

The defect detection network model is evaluated using four quantitative indexes: accuracy, precision, recall and IOU (intersection over union).

The IOU is the overlap ratio between the predicted bounding frame and the reference bounding frame. The larger the intersection area, the more accurate the prediction; when the IOU equals 1, the predicted frame and the actual frame coincide completely, which is the optimal state of detection frame prediction. The neural network uses the IOU value as the criterion for classifying samples as TP or FP, with the IOU threshold set to 0.7: when the IOU is greater than 0.7 the result is judged as TP, otherwise as FP. Prediction frames with poor detection performance are suppressed according to their IOU value, and the IOU and the confidence of the detection frame are combined in a weighting operation, with the calculation formula as follows:

Precision represents the ratio of accurately predicted positive samples (TP) to the number of samples predicted as positive, calculated as:

$$Precision = \frac{TP}{TP + FP}$$

Recall represents the ratio of accurately predicted positive samples (TP) to the number of all actual positive samples, calculated as:

$$Recall = \frac{TP}{TP + FN}$$

Accuracy is the proportion of correct predictions among all predictions:

$$Accuracy = \frac{TP + TN}{TP + TN + FP + FN}$$

In these formulas, TP represents positive samples that the model predicts correctly; FP represents samples that the model incorrectly predicts as positive; TN represents negative samples that the model predicts correctly; FN represents samples that the model incorrectly predicts as negative.

The evaluation results of the network model of the invention are as follows: the precision is 0.73 and the recall is 0.75. To verify the effectiveness of the real-time sewer defect detection method, it was tested on real data: 1000 test samples of different types were input into the detection model, and the experimental results show that, as shown in fig. 4, the detection accuracy of the network model reaches 96%, which is close to the evaluation result on the training data.
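The evaluation indexes can be computed as in the following sketch. The counts are hypothetical and chosen only to make the arithmetic easy to follow; they are not the experimental data of the invention. The threshold of 0.7 is the IOU threshold stated above.

```python
def judge(iou_value: float, threshold: float = 0.7) -> str:
    """Label a detection as TP when its IOU exceeds the threshold, else FP."""
    return "TP" if iou_value > threshold else "FP"


def precision(tp: int, fp: int) -> float:
    return tp / (tp + fp)


def recall(tp: int, fn: int) -> float:
    return tp / (tp + fn)


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical confusion counts, for illustration only.
tp, fp, fn, tn = 30, 10, 10, 50
p = precision(tp, fp)          # 30 / 40
r = recall(tp, fn)             # 30 / 40
a = accuracy(tp, tn, fp, fn)   # 80 / 100
```

Precision and recall share the numerator TP and differ only in whether the denominator counts false positives or false negatives.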

The above is only a preferred embodiment of the present invention and not limiting, and those skilled in the art will, under the guidance of the present invention, propose various modifications and variations on the basis of the schemes, which are still within the scope of the present invention.

Claims

1. The real-time sewer defect detection method is characterized by comprising the following steps of:

2. The method for detecting a sewer defect in real time according to claim 1, wherein the preprocessing operation in step 1 comprises performing image enhancement on the image data in the original data set by cropping, rotating, scaling and adding noise.

3. The method for detecting a sewer defect in real time according to claim 1, wherein the use of grouped convolution at the network detection head in step 2 is specifically as follows: the feature map obtained after convolution is split into two parts, one part is convolved, and the other part is concatenated with the feature map input to the network, so that the number of network input parameters is reduced to 1/2 of the original number.

4. The method for detecting a sewer defect in real time according to claim 1, wherein the similar feature map calculation in step 3 is as follows: after a feature map is obtained, a convolution with a 3x3 kernel is applied to it to obtain several feature maps, which are input into the subsequent network.

5. The method for detecting defects in real time according to claim 1, wherein the clustering algorithm in step 4 is:

6. The method for detecting a sewer defect in real time according to claim 1, wherein the loss function Loss is composed of a classification loss function, a positioning frame loss function and a confidence loss function, specifically:

$$Loss = a \cdot cls\_loss + b \cdot box\_loss + c \cdot obj\_loss \tag{1}$$

wherein IOU is the intersection over union.

7. The method for detecting a sewer defect in real time according to claim 1, wherein in step 4 the weighted non-maximum suppression algorithm performs a weighting operation according to the confidence of the network predictions to obtain a new predicted rectangular frame, which is taken as the final predicted rectangular frame, and predicted frames of the same category but with smaller values are removed.