CN113221646A

CN113221646A - Method for detecting abnormal objects of urban underground comprehensive pipe gallery based on Scaled-YOLOv4

Info

Publication number: CN113221646A
Application number: CN202110373441.6A
Authority: CN
Inventors: 谷永辉; 刘昌军
Original assignee: Shandong Jiexun Communication Technology Co ltd
Current assignee: Shandong Jiexun Communication Technology Co ltd
Priority date: 2021-04-07
Filing date: 2021-04-07
Publication date: 2021-08-06

Abstract

The invention relates to a method for detecting abnormal objects in an urban underground comprehensive pipe gallery based on Scaled-YOLOv4, which comprises the following steps: collecting images of abnormal objects with the cameras of the underground pipe gallery; cleaning and labeling the collected images, and establishing a training set and a test set of underground comprehensive pipe gallery abnormal object images; and training Scaled-YOLOv4 on the training set and testing it on the test set to obtain a test result. Because the volume of abnormal object image data is small, data augmentation such as random flipping and cropping is applied to the data set. After data preprocessing is completed, the Scaled-YOLOv4 model is trained on the training set and evaluated on the test set; the trained Scaled-YOLOv4 model is then used to detect abnormal objects in the urban underground comprehensive pipe gallery. The invention accurately detects abnormal objects in the underground comprehensive pipe gallery, is practical and accurate, and achieves a good balance between speed and accuracy. It can be widely applied in the field of image target detection of abnormal objects in comprehensive pipe galleries.

Description

Method for detecting abnormal objects of urban underground comprehensive pipe gallery based on Scaled-YOLOv4

Technical Field

The invention relates to the field of image target detection of abnormal objects in underground comprehensive pipe galleries, and in particular to a method for detecting abnormal objects in an underground comprehensive pipe gallery based on Scaled-YOLOv4.

Background

The urban underground comprehensive pipe gallery is an underground tunnel space that integrates public utilities such as electric power, communication, water supply, sewage, gas and heat supply; it is an important piece of infrastructure and a lifeline for urban operation. The construction of underground comprehensive pipe gallery systems for the centralized laying of municipal pipelines such as water supply and drainage, heat, gas, electric power, communication, and radio and television has become one of the markers of modern, scientific urban development in developed countries. Abnormal objects in the pipe gallery, such as packing paper, fallen objects and masonry, can disrupt the operation of the utilities housed there and cause significant losses to the city.

At present, the inspection of abnormal objects in the underground comprehensive pipe gallery relies mainly on the experience of experts, who examine each position in the gallery one by one through the gallery cameras. Manual inspection is time-consuming and costly, and its quality depends on the technical level of the inspector, so missed and false detections may occur, which increases the workload of on-site cleaning personnel.

Deep learning is one of the important methods for image target detection. Typical target detection algorithms include SSD, Faster R-CNN, RetinaNet, EfficientNet and the YOLO series. Within the YOLO series, YOLOv4 performs best on public data sets (the VOC and COCO data sets). For a model to be convenient to deploy on equipment and fast at inference, the requirements on the size of the deep learning model are high. In addition to inference speed, the accuracy of the target detection algorithm must also be taken into account.

Disclosure of Invention

The invention has two objects: (1) to solve the problem that traditional machine learning methods detect abnormal objects in the urban underground comprehensive pipe gallery with low precision; and (2) to assist manual inspection of abnormal objects in the comprehensive pipe gallery and improve the detection precision for abnormal objects. Based on these two objects, the invention provides a method for detecting abnormal objects in an urban underground comprehensive pipe gallery based on Scaled-YOLOv4.

In order to achieve the above purpose, the technical scheme adopted by the invention is as follows:

A method for detecting abnormal objects in an urban underground comprehensive pipe gallery based on Scaled-YOLOv4 comprises the following steps:

Step 1: using the cameras of the urban underground comprehensive pipe gallery, an expert screens the captured images for those containing abnormal objects or suspected abnormal objects, and an abnormal object image data set of the urban underground comprehensive pipe gallery is established;

Step 2: preprocessing the established data set, in which an expert labels the abnormal objects on the captured images and the labeled images are then cropped, to establish a training set and a test set for the model;

Step 3: constructing a Scaled-YOLOv4 model, in which the PAN architecture of YOLOv4 is CSP-ized; the new structure reduces the amount of computation by 40%;

Step 4: training the Scaled-YOLOv4 model on the training set, with the input image size of the network set to the size of the cropped images;

Step 5: testing the performance of the trained Scaled-YOLOv4 model on the test set, then using the model to recognize abnormal object images of the urban comprehensive pipe gallery and outputting the recognition result.

Preferably, in step 2, the established urban comprehensive pipe gallery abnormal object images are preprocessed by the following specific steps:

Preferably, the method further comprises step 21: labeling the captured images with the tool labelImg, marking the abnormal object targets in the data set established in step 1. During labeling, an expert selects an abnormal object target with the labelImg tool and enters a label for the selected object; labelImg then saves the annotation to a specified folder as an XML file. The XML file stores the class of the labeled abnormal object, the X and Y coordinates of the upper left corner of the bounding box, and the X and Y coordinates of the lower right corner of the bounding box.

Preferably, the method further comprises step 22: after the image labeling of step 21, the labeled images are cropped to 600 × 600 so that the abnormal object occupies a larger proportion of the image. The abnormal object target then takes up a large share of the image, and the detection of a small target is converted into the detection of a large target.

In the above scheme, in step 3, the Scaled-YOLOv4 model includes three scaled models of different sizes, namely YOLOv4-tiny, YOLOv4-CSP and YOLOv4-large. The network structure of YOLOv4-CSP is mainly introduced here; it comprises an input end, a feature extraction part, a Neck part and a Head part.

Preferably, the method further comprises step 31: at the input end, the input of YOLOv4-CSP is the same as the input of YOLOv4. YOLOv4 adds Mosaic data augmentation and self-adversarial training. Mosaic data augmentation is derived from CutMix: four images are recombined into one new image; during recombination the length and width of each image are random, and operations such as random flipping, random scaling and random arrangement are applied at random. Mosaic data augmentation improves the detection of small targets to a certain extent.
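The four-image recombination described above (commonly called Mosaic augmentation in YOLOv4) can be sketched at the level of geometry only. The following is a minimal illustration, not the patented implementation: the function names `mosaic_layout` and `place_box`, the square canvas and the range from which the centre point is sampled are all assumptions, and random flipping and pixel copying are omitted.

```python
import random


def mosaic_layout(canvas_size, rng=None):
    """Split a square canvas into four tiles around a random centre point.

    Returns four (x1, y1, x2, y2) regions in the order top-left, top-right,
    bottom-left, bottom-right. One source image would be resized or cropped
    into each region.
    """
    rng = rng or random.Random()
    # Assumption: sample the centre from the middle half of the canvas so
    # that no tile collapses to zero width or height.
    xc = rng.randint(canvas_size // 4, 3 * canvas_size // 4)
    yc = rng.randint(canvas_size // 4, 3 * canvas_size // 4)
    return [
        (0, 0, xc, yc),
        (xc, 0, canvas_size, yc),
        (0, yc, xc, canvas_size),
        (xc, yc, canvas_size, canvas_size),
    ]


def place_box(box, tile, scale_x, scale_y):
    """Map a labeled box from a source image into its tile on the canvas."""
    x1, y1, x2, y2 = box
    tx, ty = tile[0], tile[1]
    return (tx + x1 * scale_x, ty + y1 * scale_y,
            tx + x2 * scale_x, ty + y2 * scale_y)


tiles = mosaic_layout(600, random.Random(0))
for tile in tiles:
    print(tile)
```

Because the tile sizes vary from sample to sample, the same labeled object appears at different scales and positions across training batches, which is the property the text credits for better small-target detection.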

Preferably, self-adversarial training is a new data augmentation technique that operates in two stages. In the first stage, the neural network alters the original image instead of the network weights; in this way the network performs an adversarial attack on itself, altering the original image to create the illusion that there is no target in the image. In the second stage, the neural network is trained to perform normal target detection on the modified image. The anchor boxes in YOLOv4 are computed separately with the K-means clustering algorithm; the result is calculated before the network is trained and is supplied as a network parameter.

Preferably, step 32: in the feature extraction part, YOLOv4-CSP modifies the originally used CSPDarknet, and in order to obtain better speed and precision compromise, the first CSP stage is converted into an Original Darknet Residual Layer.

Preferably, step 33: to effectively reduce the amount of computation, YOLOv4-CSP CSP-izes the PAN architecture of YOLOv4. The PAN architecture mainly integrates features from different feature pyramids and then passes them through two sets of reversed Darknet residual layers without shortcut connections. After CSP-ization, the new network architecture reduces the amount of computation by 40%. An SPP (Spatial Pyramid Pooling) module is also added: the SPP module was originally inserted in the middle of the first CSP stage of the neck, so YOLOv4-CSP likewise inserts the SPP module in the middle of the first convolution block of the CSP-PAN.

Preferably, step 34: the Head produces prediction results at three different scales. Taking an input image size of 608 × 608 as an example, the prediction results are 19 × 19 × 255, 38 × 38 × 255 and 76 × 76 × 255 respectively; results of different sizes are used to predict targets of different sizes. In the post-processing of target detection, a Non-Maximum Suppression (NMS) operation is usually required to screen the many candidate anchor boxes: the anchor boxes are compared by confidence, and those with lower scores are suppressed. YOLOv4-CSP uses weighted NMS: when eliminating anchor boxes, the confidence of each anchor box is used as a weight to obtain a new rectangular box, this rectangle is taken as the final predicted box, and the lower-scoring anchor boxes are then removed. The loss function is the CIOU loss, given by:

$$L_{CIOU} = 1 - IOU + \frac{\rho^2(b, b^{gt})}{c^2} + \alpha v$$

where IOU is the intersection over union, $\rho^2(b, b^{gt})$ is the square of the Euclidean distance between the center coordinates of the predicted anchor box and the actual anchor box, $c^2$ is the square of the diagonal length of the smallest rectangle enclosing the predicted anchor box and the actual anchor box, and $\alpha$ and $v$ are penalty factors, given by:

$$\alpha = \frac{v}{(1 - IOU) + v}, \qquad v = \frac{4}{\pi^2}\left(\arctan\frac{w^{gt}}{h^{gt}} - \arctan\frac{w}{h}\right)^2$$

where $w$, $h$ and $w^{gt}$, $h^{gt}$ are the width and height of the predicted anchor box and the actual anchor box respectively.
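The weighted NMS step described above can be illustrated with a short sketch. This is one plausible reading of the text, not the exact procedure of YOLOv4-CSP: boxes that overlap the current highest-scoring box are merged into a single box whose coordinates are the confidence-weighted average, and the merged group is then removed. The function names, the `iou_thresh` value of 0.5 and the use of the confidence alone as the weight are assumptions.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def weighted_nms(boxes, scores, iou_thresh=0.5):
    """Merge overlapping boxes using confidence as the weight.

    Assumes every score is positive. Returns (merged_box, score) pairs,
    where the score is that of the highest-scoring box in each group.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    results = []
    while order:
        best = order[0]
        # The best box always belongs to its own group.
        group = [best] + [i for i in order[1:]
                          if iou(boxes[best], boxes[i]) >= iou_thresh]
        total = sum(scores[i] for i in group)
        merged = tuple(
            sum(scores[i] * boxes[i][k] for i in group) / total
            for k in range(4)
        )
        results.append((merged, scores[best]))
        order = [i for i in order if i not in group]
    return results


detections = weighted_nms(
    [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)],
    [0.9, 0.6, 0.8],
)
print(detections)
```

Compared with plain NMS, which simply keeps the top box, the merged box also uses the position information of the suppressed boxes.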

Preferably, in order to adapt to different GPUs, the model of the YOLOv4-CSP is scaled by adjusting the depth, width and input size of the network to obtain three models, namely YOLOv4-tiny, YOLOv4-CSP and YOLOv4-large, which are suitable for different GPUs.

The invention provides a method for detecting abnormal objects in an urban underground comprehensive pipe gallery based on Scaled-YOLOv4. The method has the following beneficial effects:

The invention crops the images and uses the Scaled-YOLOv4 model to adapt to target detection of abnormal objects in the urban underground comprehensive pipe gallery, thereby improving the precision of target detection. The invention has the following advantages:

1. The method crops the pictures and converts the original small-target detection task into a larger-target detection task. Using Scaled-YOLOv4 preserves both speed and accuracy and achieves a good balance between the two.

2. There are three models, corresponding to the requirements of different classes of GPU. The smallest model, YOLOv4-tiny, reduces the cost of deployment and facilitates rapid deployment of the model.

3. According to published data, the YOLOv4-tiny model reaches 22.0% AP (42.0% AP₅₀) at 443 FPS on an RTX 2080Ti on the COCO data set, and with TensorRT YOLOv4-tiny can reach 1774 FPS.

4. The method is implemented with the PyTorch framework, is easy for users to extend and apply, and has practical value in the inspection of abnormal objects in the urban underground comprehensive pipe gallery.

Drawings

FIG. 1 is a schematic diagram of the PAN structure of YOLOv4 and the CSP-ize structure;

FIG. 2 is a schematic structural view of YOLOv4-large-P5;

FIG. 3 is a diagram of the detection result of Scaled-YOLOv4 on masonry of the urban underground utility tunnel, wherein the rectangular frame part is the detection result of the masonry;

FIG. 4 shows the detection results of Scaled-YOLOv4 for the wrapping paper and the fallen objects of the urban underground pipe gallery, wherein the large rectangular frame is the detection result of the wrapping paper, and the small rectangular frame is the detection result of the fallen objects.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all embodiments.

In the present invention, unless otherwise expressly stated or limited, the terms "connected," "secured," and the like are to be construed broadly, e.g., as meaning permanently attached, removably attached, or integral to one another; can be mechanically or electrically connected; either directly or indirectly through intervening media, either internally or in any combination thereof. The specific meanings of the above terms in the present invention can be understood by those skilled in the art according to specific situations.

The invention is further described with reference to the following figures and examples. The example detects abnormal objects in the urban underground comprehensive pipe gallery, where the abnormal objects comprise masonry, packing paper and fallen objects, and serves to explain the underground comprehensive pipe gallery abnormal object detection algorithm based on Scaled-YOLOv4. The specific steps of the example are as follows:

Step 1: using the cameras of the urban underground comprehensive pipe gallery, experts screen the captured images for those containing masonry, packing paper and fallen objects, and an image data set of abnormal objects in the urban underground comprehensive pipe gallery is established.

Step 2: the established data set is preprocessed: an expert labels the masonry, packing paper and fallen object targets, the labeled images are then cropped, and a training set and a test set for the model are established.

Step 21: the captured images are labeled with the tool labelImg, marking the masonry, packing paper and fallen object targets in the data set established in step 1. During labeling, an expert selects an abnormal object target with the labelImg tool and enters a label for the selected object. labelImg then saves the annotation to a specified folder as an XML file. The XML file stores the class of the labeled abnormal object (in the invention there are three classes: masonry, packing paper and fallen objects), the X and Y coordinates of the upper left corner of the bounding box, and the X and Y coordinates of the lower right corner of the bounding box.
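As an illustration of the annotation format described above, the following sketch reads the class name and corner coordinates from a labelImg XML file in Pascal VOC layout. The sample file name, the class label string and the function name `parse_voc_annotation` are hypothetical and chosen only for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation in the Pascal VOC layout that labelImg writes.
SAMPLE_XML = """<annotation>
  <filename>gallery_0001.jpg</filename>
  <size><width>1920</width><height>1080</height><depth>3</depth></size>
  <object>
    <name>masonry</name>
    <bndbox>
      <xmin>812</xmin><ymin>640</ymin><xmax>930</xmax><ymax>731</ymax>
    </bndbox>
  </object>
</annotation>"""


def parse_voc_annotation(xml_text):
    """Return a list of (label, xmin, ymin, xmax, ymax) tuples."""
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.iter("object"):
        label = obj.findtext("name")
        bndbox = obj.find("bndbox")
        # Upper left corner (xmin, ymin) and lower right corner (xmax, ymax).
        corners = tuple(
            int(float(bndbox.findtext(tag)))
            for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        boxes.append((label, *corners))
    return boxes


print(parse_voc_annotation(SAMPLE_XML))
```

For annotation files on disk, `ET.parse(path).getroot()` can replace `ET.fromstring` in the same function.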

Step 22: after the image labeling of step 21, the labeled images are cropped to 600 × 600 so that the abnormal object occupies a larger proportion of the image. The abnormal object target then takes up a large share of the image, and the detection of a small target is converted into the detection of a large target.
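The cropping step can be sketched as follows. The source gives only the crop size of 600 × 600; how the crop window is positioned is not stated, so centering the window on the labeled box and clamping it to the image border is an assumption, as are the function names and the example image size of 1920 × 1080.

```python
def crop_window(img_w, img_h, box, size=600):
    """Return a size x size window (left, top, right, bottom) around a box.

    Assumption: the window is centred on the labeled box and shifted
    inwards where it would otherwise leave the image.
    """
    x1, y1, x2, y2 = box
    cx, cy = (x1 + x2) // 2, (y1 + y2) // 2
    left = min(max(cx - size // 2, 0), max(img_w - size, 0))
    top = min(max(cy - size // 2, 0), max(img_h - size, 0))
    return left, top, left + size, top + size


def box_in_crop(box, window):
    """Express a box in crop coordinates, clipped to the crop window."""
    left, top, right, bottom = window
    x1, y1, x2, y2 = box
    return (max(x1 - left, 0), max(y1 - top, 0),
            min(x2 - left, right - left), min(y2 - top, bottom - top))


box = (812, 640, 930, 731)            # hypothetical labeled box
window = crop_window(1920, 1080, box)
print(window, box_in_crop(box, window))

# Share of the image covered by the box before and after cropping.
area = (box[2] - box[0]) * (box[3] - box[1])
print(area / (1920 * 1080), area / (600 * 600))
```

In this example the box covers the same number of pixels, but its share of the image grows by the ratio of the two image areas, which is how the small-target task becomes a larger-target task.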

Step 3: the network structure of YOLOv4-CSP mainly comprises an input part, a feature extraction part, a Neck part and an output part.

Step 31: at the input end, the input of YOLOv4-CSP is the same as the input of YOLOv4. YOLOv4 adds Mosaic data augmentation and self-adversarial training. Mosaic data augmentation is derived from CutMix: four images are recombined into one new image; during recombination the length and width of each image are random, and operations such as random flipping, random scaling and random arrangement are applied at random. Mosaic data augmentation improves the detection of small targets to a certain extent. Self-adversarial training is a new data augmentation technique that operates in two stages. In the first stage, the neural network alters the original image instead of the network weights; in this way the network performs an adversarial attack on itself, altering the original image to create the illusion that there is no target in the image. In the second stage, the neural network is trained to perform normal target detection on the modified image. The anchor boxes in YOLOv4 are computed separately with the K-means clustering algorithm; the result is calculated before the network is trained and is supplied as a network parameter.
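The anchor box computation mentioned at the end of step 31 can be sketched with a small K-means routine over labeled box sizes. This is a minimal illustration under stated assumptions: the distance measure is 1 − IoU between (width, height) pairs, which is the usual choice for YOLO anchors but is not specified in the source, the initial centers are picked deterministically at evenly spaced positions after sorting by area, and the box sizes are hypothetical.

```python
def wh_iou(a, b):
    """IoU of two boxes given as (width, height) and aligned at one corner."""
    inter = min(a[0], b[0]) * min(a[1], b[1])
    return inter / (a[0] * a[1] + b[0] * b[1] - inter)


def kmeans_anchors(sizes, k, iters=50):
    """Cluster (width, height) pairs into k anchors, sorted by area."""
    ordered = sorted(sizes, key=lambda s: s[0] * s[1])
    # Assumption: deterministic initialisation from evenly spaced samples.
    step = max(k - 1, 1)
    centers = [ordered[i * (len(ordered) - 1) // step] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for wh in sizes:
            # Highest IoU is the same as smallest (1 - IoU) distance.
            nearest = max(range(k), key=lambda j: wh_iou(wh, centers[j]))
            groups[nearest].append(wh)
        new_centers = [
            (sum(w for w, _ in g) / len(g), sum(h for _, h in g) / len(g))
            if g else centers[j]
            for j, g in enumerate(groups)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return sorted(centers, key=lambda c: c[0] * c[1])


# Hypothetical box sizes in pixels, taken from labeled training images.
sizes = [(10, 12), (11, 13), (12, 11), (50, 60), (52, 58), (48, 62),
         (200, 180), (210, 190), (190, 185)]
print(kmeans_anchors(sizes, 3))
```

The resulting anchors are computed once, before training, and then passed to the network configuration.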

Step 32: in the feature extraction part, YOLOv4-CSP modifies the originally used CSPDarknet, and in order to obtain better speed and precision compromise, the first CSP stage is converted into an Original Darknet Residual Layer.

Step 33: to effectively reduce the amount of computation, YOLOv4-CSP CSP-izes the PAN architecture of YOLOv4. The computation flow of the PAN structure is shown in FIG. 1(a). It mainly integrates features from different feature pyramids and then passes them through two sets of reversed Darknet residual layers without shortcut connections. After CSP-ization, the new network architecture is as shown in FIG. 1(b), which reduces the amount of computation by 40%. An SPP (Spatial Pyramid Pooling) module is also added to the network: the SPP module was originally inserted in the middle of the first CSP stage of the neck, so YOLOv4-CSP likewise inserts the SPP module in the middle of the first convolution block of the CSP-PAN.

Step 34: head obtains the prediction results of three different scales. Take an input image size of 608 × 608 as an example. The prediction result sizes were 19 × 19 × 255, 38 × 38 × 255, and 76 × 76 × 255, respectively. Different sized results are used to predict different sized targets. In the post-processing process of target detection, a Non-Maximum Suppression (NMS) operation is usually required for screening many target anchor frames. And if not, screening the anchor frames with different confidence degrees, and screening the anchor frames with lower inhibition scores. In YOLOv4-CSP, weighting NMS is used, in the process of eliminating the anchor frame, the confidence coefficient of the anchor frame is used as a weight value to obtain a new rectangular frame, the rectangle is used as a final predicted rectangular frame, and then the anchor frame with a lower score is eliminated. The LOSS function uses CIOU-LOSS LOSS function, and the formula is as follows:

wherein IOU is Intersection ratio (IOU), rho²(b，b^gt) The square of the Euclidean distance value representing the coordinates of the center of the predicted anchor frame and the actual anchor frame, c²For predicting anchor frame and realThe square of the length of the diagonal of the circumscribed rectangle of the interstar anchor frame represents a penalty factor, the formula of which is shown below.
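The CIOU loss can be evaluated for a single pair of boxes with a few lines of plain Python. This sketch follows the standard CIoU definition (IoU term, normalized center distance, and the aspect-ratio penalty αv); the function name and the example boxes are illustrative, and a training implementation would operate on batched tensors instead.

```python
import math


def ciou_loss(pred, gt):
    """CIoU loss for two boxes given as (x1, y1, x2, y2) with positive area."""
    px1, py1, px2, py2 = pred
    gx1, gy1, gx2, gy2 = gt

    # Intersection over union.
    iw = max(0.0, min(px2, gx2) - max(px1, gx1))
    ih = max(0.0, min(py2, gy2) - max(py1, gy1))
    inter = iw * ih
    union = (px2 - px1) * (py2 - py1) + (gx2 - gx1) * (gy2 - gy1) - inter
    iou = inter / union

    # Squared distance between the two box centers (rho^2).
    rho2 = ((px1 + px2) / 2 - (gx1 + gx2) / 2) ** 2 \
         + ((py1 + py2) / 2 - (gy1 + gy2) / 2) ** 2

    # Squared diagonal of the smallest enclosing rectangle (c^2).
    c2 = (max(px2, gx2) - min(px1, gx1)) ** 2 \
       + (max(py2, gy2) - min(py1, gy1)) ** 2

    # Aspect-ratio penalty v and its weight alpha.
    v = (4 / math.pi ** 2) * (
        math.atan((gx2 - gx1) / (gy2 - gy1))
        - math.atan((px2 - px1) / (py2 - py1))
    ) ** 2
    alpha = v / ((1 - iou) + v) if v > 0 else 0.0

    return 1 - iou + rho2 / c2 + alpha * v


print(ciou_loss((0, 0, 10, 10), (0, 0, 10, 10)))   # identical boxes
print(ciou_loss((0, 0, 10, 10), (5, 5, 15, 15)))   # shifted, same shape
```

The loss is zero only when the boxes coincide; unlike a plain IoU loss, it still changes with center distance when the boxes do not overlap at all.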

Step 4: the Scaled-YOLOv4 model is trained on the training set, with the input image size of the network set to the size of the cropped images.

Step 41: during training, a part with intersection over union IOU > 0.7 is regarded as a foreground target, and a part with IOU < 0.3 is regarded as a background target. The loss function is as shown in step 34, where CIOU_Loss is the regression loss of the anchor box, L_conf is the confidence loss, and L_cls is the classification loss.

Finally, the model is optimized with the Adam method, and the learning rate is set to 0.001. Training on the training set data yields an abnormal object detection model that assists in the inference task for abnormal objects in the urban underground comprehensive pipe gallery.

Step 5: the performance of the trained Scaled-YOLOv4 model is tested on the test set; the model is then used to recognize abnormal object images of the urban underground comprehensive pipe gallery, and the recognition result is output.
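The foreground and background rule of step 41 can be written as a small labeling function. The thresholds 0.7 and 0.3 come from the text; what happens to anchors whose IOU falls between the two thresholds is not stated, so marking them as ignored is an assumption, as are the function names.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def label_anchor(anchor, gt_boxes, fg_thresh=0.7, bg_thresh=0.3):
    """Classify an anchor by its best IoU with any labeled box."""
    best = max((iou(anchor, gt) for gt in gt_boxes), default=0.0)
    if best > fg_thresh:
        return "foreground"
    if best < bg_thresh:
        return "background"
    # Assumption: anchors between the thresholds do not contribute.
    return "ignored"


gt_boxes = [(0, 0, 10, 11)]
print(label_anchor((0, 0, 10, 10), gt_boxes))
print(label_anchor((100, 100, 110, 110), gt_boxes))
```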

The statistics of the underground comprehensive pipe gallery abnormal object image training set and test set used in this example are shown in Table 1.

Table 1. Statistics of the underground comprehensive pipe gallery abnormal object image data set

Data set	Masonry	Fallen objects	Wrapping paper
Training set	5500	4510	4300
Test set	3400	3115	3500
Total	8900	7625	7800

In order to evaluate the detection results for abnormal object targets, the invention adopts the precision (Precision) and recall (Recall) indexes, calculated as follows:

$$Precision = \frac{TP}{TP + FP}, \qquad Recall = \frac{TP}{TP + FN}$$

where TP (true positive) is the number of correctly detected abnormal objects, FP (false positive) is the number of incorrectly detected abnormal objects, and FN (false negative) is the number of abnormal objects that were not detected.
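A short worked example of the two indexes, using the definitions of TP, FP and FN given above. The counts are hypothetical and are not results from the patent.

```python
def precision_recall(tp, fp, fn):
    """Precision = TP / (TP + FP), Recall = TP / (TP + FN)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical counts: 90 correct detections, 10 false detections,
# 30 abnormal objects that were missed.
print(precision_recall(tp=90, fp=10, fn=30))
```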

From the precision and the recall, a P-R curve can be obtained. The area under the P-R curve is the AP (Average Precision); each class of abnormal object has its own AP value, from which the mean average precision mAP (mean Average Precision) over all classes of abnormal objects can be calculated to measure the precision of the detection algorithm.
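The area under a P-R curve can be computed numerically from a list of (recall, precision) points. The source does not say which interpolation is used, so this sketch assumes the common all-point scheme in which precision is first made monotonically non-increasing; the function names and sample points are hypothetical.

```python
def average_precision(recalls, precisions):
    """Area under the P-R curve; recalls must be in ascending order."""
    r = [0.0] + list(recalls) + [1.0]
    p = [0.0] + list(precisions) + [0.0]
    # Assumption: replace each precision by the maximum precision found
    # at any higher recall (monotone envelope).
    for i in range(len(p) - 2, -1, -1):
        p[i] = max(p[i], p[i + 1])
    # Sum the rectangles between consecutive recall values.
    return sum((r[i + 1] - r[i]) * p[i + 1] for i in range(len(r) - 1))


def mean_average_precision(ap_by_class):
    """mAP is the unweighted mean of the per-class AP values."""
    return sum(ap_by_class.values()) / len(ap_by_class)


ap = average_precision([0.5, 1.0], [1.0, 0.5])
print(ap)
print(mean_average_precision({"class_a": ap, "class_b": 0.25}))
```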

FIG. 3 and FIG. 4 show the detection results for abnormal objects in the underground comprehensive pipe gallery. The method proposed by the invention is then compared with the original Faster RCNN method, where ResNet50 is adopted as the backbone network of the original Faster RCNN; the detection results are shown in Table 2.

Table 2. Comparison of the detection results of the method of the invention and the conventional Faster RCNN method

Method	AP (masonry)	AP (fallen objects)	AP (wrapping paper)	mAP
Faster RCNN	79.45%	73.74%	78.25%	77.15%
The method of the invention	94.53%	86.76%	92.36%	91.22%

As can be seen from Table 2, compared with the conventional Faster RCNN, the detection precision of the method provided by the invention on masonry, fallen objects and wrapping paper is improved by 15.08, 13.02 and 14.11 percentage points respectively, and the mean average precision is improved by 14.07 percentage points.
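The per-class gains and the mAP values can be cross-checked directly from the AP figures in Table 2. The sketch below only restates the table; the dictionary keys and variable names are chosen for the example, and the differences are expressed in percentage points.

```python
# AP values (in percent) copied from Table 2.
faster_rcnn = {"masonry": 79.45, "fallen objects": 73.74, "wrapping paper": 78.25}
proposed = {"masonry": 94.53, "fallen objects": 86.76, "wrapping paper": 92.36}


def mean(values):
    values = list(values)
    return sum(values) / len(values)


# Per-class improvement in percentage points.
gains = {name: round(proposed[name] - faster_rcnn[name], 2) for name in proposed}

# mAP is the mean of the three per-class AP values.
map_faster_rcnn = round(mean(faster_rcnn.values()), 2)
map_proposed = round(mean(proposed.values()), 2)
map_gain = round(mean(proposed.values()) - mean(faster_rcnn.values()), 2)

print(gains)
print(map_faster_rcnn, map_proposed, map_gain)
```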

The method of the invention uses the underground pipe gallery cameras to collect images of abnormal objects; cleans and labels the collected images and establishes a training set and a test set of underground comprehensive pipe gallery abnormal object images; and trains Scaled-YOLOv4 on the training set and tests it on the test set to obtain a test result. Because the volume of abnormal object image data is small, the method applies data augmentation such as random flipping and cropping to the underground pipe gallery abnormal object data set. After data preprocessing is completed, the Scaled-YOLOv4 model is trained on the training set and evaluated on the test set; the trained Scaled-YOLOv4 model is then used to detect abnormal objects in the urban underground comprehensive pipe gallery. The invention accurately detects abnormal objects in the underground comprehensive pipe gallery and is practical and accurate. The Scaled-YOLOv4 target detection method can be scaled up and down, suits both small and large networks, and achieves a good balance between speed and accuracy.

In summary, the method for detecting the abnormal objects in the urban underground comprehensive pipe gallery based on Scaled-YOLOv4 combines EfficientNet and YOLOv4, adjusts the network structure of the YOLOv4 to obtain Scaled-YOLOv4, obtains three network models suitable for different GPUs by adjusting the input size, depth and width of the network, realizes better balance between speed and precision, and realizes accurate detection of the abnormal objects in the underground comprehensive pipe gallery, and has strong practicability and high accuracy.

The invention has been described above by way of examples, but it should be understood that the above examples are provided for purposes of illustration and description only. Therefore, all technical solutions that can be obtained in this technical field through logical analysis, reasoning or limited experimentation should be considered within the scope of the described examples.

Claims

1. A method for detecting abnormal objects of an urban underground comprehensive pipe gallery based on Scaled-YOLOv4 is characterized by comprising the following steps:

Step 3: constructing a Scaled-YOLOv4 model, in which the PAN architecture of YOLOv4 is CSP-ized, the new structure reducing the amount of computation by 40%;

Step 4: training the Scaled-YOLOv4 model on the training set, with the input image size of the network set to the size of the cropped images;

Step 5: testing the performance of the trained Scaled-YOLOv4 model on the test set, then using the model to recognize abnormal object images of the urban comprehensive pipe gallery, and outputting the recognition result.

2. The method for detecting the abnormal objects of the urban underground comprehensive pipe gallery based on Scaled-YOLOv4 as claimed in claim 1, wherein in the step 2, the established image of the abnormal objects of the urban comprehensive pipe gallery is preprocessed, and the specific steps are as follows:

Step 21: labeling the captured images with the tool labelImg, marking the abnormal object targets in the data set established in step 1; during labeling, an expert selects an abnormal object target with the labelImg tool and enters a label for the selected object; labelImg then saves the annotation to a specified folder as an XML file; the XML file stores the class of the labeled abnormal object, the X and Y coordinates of the upper left corner of the bounding box, and the X and Y coordinates of the lower right corner of the bounding box.

3. The method for detecting the abnormal objects in the urban underground comprehensive pipe gallery based on Scaled-YOLOv4 as claimed in claim 2, further comprising step 22: after the image labeling of step 21, in order to make the abnormal object occupy a larger proportion of the captured image, the labeled image is cropped to a size of 600 × 600; the abnormal object target then occupies a larger share of the image, and the detection of a small target is converted into the detection of a larger target.

4. The method for detecting the abnormal objects in the urban underground comprehensive pipe gallery based on Scaled-YOLOv4 as claimed in claim 1, wherein in step 3, the Scaled-YOLOv4 model comprises three scaled models of different sizes, namely YOLOv4-tiny, YOLOv4-CSP and YOLOv4-large; the network structure of YOLOv4-CSP consists of an input end, a feature extraction part, a Neck part and a Head part.

5. The method for detecting the abnormal objects in the urban underground comprehensive pipe gallery based on Scaled-YOLOv4 as claimed in claim 4, further comprising step 31: at the input end, the input of YOLOv4-CSP is the same as the input of YOLOv4, and YOLOv4 adds Mosaic data augmentation and self-adversarial training; Mosaic data augmentation is derived from CutMix: four images are recombined into one new image; during recombination the length and width of each image are random, and random flipping, random scaling and random arrangement are applied at random; Mosaic data augmentation improves the detection of small targets to a certain extent.

6. The method for detecting the abnormal objects in the urban underground comprehensive pipe gallery based on Scaled-YOLOv4 as claimed in claim 5, wherein the self-adversarial training is a new data augmentation technique that operates in two stages: in the first stage, the neural network alters the original image instead of the network weights, in such a way that the neural network performs an adversarial attack on itself, altering the original image to create the illusion that there is no target in the image; in the second stage, the neural network is trained to perform normal target detection on the modified image;

the anchor boxes in YOLOv4 are computed separately with the K-means clustering algorithm, and the result is calculated before the network is trained and is supplied as a network parameter.

7. The method for detecting the abnormal objects in the urban underground comprehensive pipe gallery based on Scaled-YOLOv4 as claimed in claim 5, further comprising step 32: in the feature extraction part, YOLOv4-CSP modifies the originally used CSPDarknet; to obtain a better trade-off between speed and precision, the first CSP stage is converted into an original Darknet residual layer.

8. The method for detecting the abnormal objects in the urban underground comprehensive pipe gallery based on Scaled-YOLOv4 as claimed in claim 7, further comprising step 33: to effectively reduce the amount of computation, YOLOv4-CSP CSP-izes the PAN architecture of YOLOv4; the PAN structure integrates features from different feature pyramids and then passes them through two sets of reversed Darknet residual layers without shortcut connections, and CSP-ization yields the new network architecture; an SPP (Spatial Pyramid Pooling) module is also added to the network: the SPP module was originally inserted in the middle of the first CSP stage of the neck, so YOLOv4-CSP likewise inserts the SPP module in the middle of the first convolution block of the CSP-PAN.

9. The method for detecting the abnormal objects in the urban underground comprehensive pipe gallery based on Scaled-YOLOv4 as claimed in claim 8, further comprising step 34: the Head produces prediction results at three different scales, and results of different sizes are used to predict targets of different sizes; in the post-processing of target detection, a Non-Maximum Suppression (NMS) operation is usually required to screen the many candidate anchor boxes, where non-maximum suppression compares the anchor boxes by confidence and suppresses those with lower scores; YOLOv4-CSP uses weighted NMS: when eliminating anchor boxes, the confidence of each anchor box is used as a weight to obtain a new rectangular box, the rectangular box is taken as the final predicted box, and the lower-scoring anchor boxes are then removed; the loss function is the CIOU loss, given by:

$$L_{CIOU} = 1 - IOU + \frac{\rho^2(b, b^{gt})}{c^2} + \alpha v$$

10. The method for detecting the abnormal objects in the urban underground comprehensive pipe gallery based on Scaled-YOLOv4 as claimed in claim 9, further comprising the step of obtaining three models of YOLOv4-tiny, YOLOv4-CSP and YOLOv4-large suitable for different GPUs by adjusting the scaling of the depth, width and input size of the network in order to adapt to different GPUs.