CN116935157A - Pavement disease detection method based on PConv-YOLOv7 - Google Patents


Info

Publication number
CN116935157A
Authority
CN
China
Prior art keywords
pconv
model
yolov7
detection method
yolov
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310840520.2A
Other languages
Chinese (zh)
Inventor
潘广源
白志远
崔亚巍
魏浩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Linyi University
Original Assignee
Linyi University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Linyi University filed Critical Linyi University
Priority to CN202310840520.2A priority Critical patent/CN116935157A/en
Publication of CN116935157A publication Critical patent/CN116935157A/en
Pending legal-status Critical Current


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 - Arrangements for image or video recognition or understanding
    • G06V 10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/77 - Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V 10/774 - Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/04 - Architecture, e.g. interconnection topology
    • G06N 3/0464 - Convolutional networks [CNN, ConvNet]
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/08 - Learning methods
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 - Arrangements for image or video recognition or understanding
    • G06V 10/20 - Image preprocessing
    • G06V 10/25 - Determination of region of interest [ROI] or a volume of interest [VOI]
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 - Arrangements for image or video recognition or understanding
    • G06V 10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/764 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 - Arrangements for image or video recognition or understanding
    • G06V 10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Multimedia (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses a pavement disease detection method based on PConv-YOLOv7, which comprises the following steps: collecting road surface images and establishing a target data set; constructing a PConv-YOLOv7 model; training and evaluating the PConv-YOLOv7 model with the target data set; and performing pavement disease detection with the evaluated PConv-YOLOv7 model. The application replaces the conventional 3×3, stride-1 convolution layers with PConv layers and replaces the CIOU loss function with the Focal EIOU loss function. This not only improves the recognition effect but also reduces model complexity and increases detection speed, so the method can quickly handle large-scale target detection tasks. It has strong generalization ability, high robustness and high accuracy in identifying pavement diseases, and plays an important role in preventing serious accidents and reducing economic losses and losses of life and property.

Description

Pavement disease detection method based on PConv-YOLOv7
Technical Field
The application relates to the technical field of computer vision, in particular to a pavement disease detection method based on PConv-YOLOv7.
Background
In recent years, with accelerating urbanization and people's rising requirements for traffic safety, road maintenance and management have become increasingly important. Highway construction has greatly boosted China's economy, but China's road maintenance level remains relatively backward. Because many highways have been in long-term use without the necessary maintenance, pavement diseases are gradually increasing; these include various irregular and uneven cracks, depressions, crazes, pits, blurred crosswalks, damaged well covers and the like. Pavement diseases seriously affect the driving safety and comfort of the road, shorten its normal service life, increase vehicle maintenance costs and energy consumption, and have negative effects on the environment and the economy.
Existing road disease identification methods often use computer vision: image processing techniques identify road diseases, which effectively improves identification efficiency and reduces the workload of inspection personnel. Compared with manually identifying and evaluating road damage images, machine vision can quickly identify and evaluate each image, and connecting multiple machines in parallel can further multiply detection efficiency. However, because pavement diseases vary widely in topological structure and morphology, local misjudgments and omissions occur easily, and the detection result is also affected by interference factors such as oil stains and illumination. Therefore, quickly and accurately identifying pavement diseases is of great significance for precise control of pavement diseases and for ensuring the safety and comfort of transportation.
Disclosure of Invention
To solve these problems, the application provides a pavement disease detection method based on PConv-YOLOv7 that improves the recognition effect, reduces model complexity, increases detection speed, and can quickly handle large-scale target detection tasks.
The technical scheme adopted for solving the technical problems is as follows:
the pavement disease detection method based on PConv-YOLOv7 provided by the embodiment of the application comprises the following steps:
collecting road surface images and establishing a target data set;
constructing a PConv-YOLOv7 model;
training and evaluating a PConv-YOLOv7 model by using a target data set;
performing pavement disease detection using the evaluated PConv-YOLOv7 model.
Further, the target dataset is comprised of road pictures tagged with XML documents in PASCAL VOC format.
Further, the construction of the PConv-YOLOv7 model comprises the following steps:
establishing a basic YOLOv7 model;
replacing the convolution layers whose kernel size is 3 and stride is 1 with PConv layers;
the Focal EIOU loss function is used as the regression loss function of the model.
Further, the penalty term formula of the Focal EIOU loss function includes:
L_Focal-EIOU = IOU^γ · L_EIOU
where C_w and C_h are the width and height of the smallest enclosing box covering the two boxes, c is the diagonal distance of the smallest enclosing region that can contain both the predicted box and the ground-truth box, b and b^gt are the center points of the anchor box and the target box, w and w^gt are their widths, h and h^gt are their heights, ρ is the Euclidean distance between the two center points, IOU is the intersection area of the predicted box and the ground-truth box divided by their union, and γ is a parameter controlling the degree of outlier suppression.
Further, the target data set is divided into a training set, a verification set and a test set, wherein the ratio of the training set, the verification set and the test set is 7:1:2.
Further, the evaluation of the PConv-YOLOv7 model includes:
calculating Recall, precision and F1 scores:
Recall=TP/(TP+FN)
Precision=TP/(TP+FP)
F1-Score = 2 × Precision × Recall / (Precision + Recall)
where TP is the number of positive samples predicted as positive, FN is the number of positive samples predicted as negative, FP is the number of negative samples predicted as positive, and TN is the number of negative samples predicted as negative.
Further, evaluating the PConv-YOLOv7 model also makes it possible to measure the complexity of the model through the floating-point operations of the PConv layer:
FLOPs = h × w × k² × c_p²
where h, w, k and c denote the height, width, convolution kernel size and number of channels of the input picture, and c_p is the number of the first or last consecutive channels.
Further, the types of road surface defects include longitudinal cracks, transverse cracks, crazes, pits, fuzzy intersections, fuzzy white lines, and lost well covers.
The technical scheme of the embodiment of the application has the following beneficial effects:
(1) The PConv layer replaces the convolution layers with kernel size 3 and stride 1 in the basic YOLOv7 model, making the model lighter, reducing computational redundancy and memory access, lowering model complexity, and improving the recognition effect.
(2) The Focal EIOU loss function is used for replacing the CIOU loss function, so that regression accuracy is improved.
(3) The application provides more visual and specific reference for pavement disease prevention work, completes the identification of 7 pavement diseases, and can prevent and solve the pavement disease problems, thereby having practical significance.
The application not only improves the recognition effect but also reduces model complexity and increases detection speed, so it can quickly handle large-scale target detection tasks. It has strong generalization ability, high robustness and high accuracy in identifying pavement diseases, and plays an important role in preventing serious accidents and reducing economic losses and losses of life and property.
Drawings
FIG. 1 is a flowchart illustrating a PConv-YOLOv 7-based pavement disease detection method, according to an exemplary embodiment;
FIG. 2 is a block diagram of a basic YOLOv7 model shown in accordance with an exemplary embodiment;
FIG. 3 is a graph illustrating the training loss and evaluation results of the PConv-YOLOv7 model, according to an exemplary embodiment.
Detailed Description
The application is further illustrated by the following examples in conjunction with the accompanying drawings:
in order to clearly illustrate the technical features of the present solution, the present application will be described in detail below with reference to the following detailed description and the accompanying drawings. The following disclosure provides many different embodiments, or examples, for implementing different structures of the application. In order to simplify the present disclosure, components and arrangements of specific examples are described below. Furthermore, the present application may repeat reference numerals and/or letters in the various examples. This repetition is for the purpose of simplicity and clarity and does not in itself dictate a relationship between the various embodiments and/or configurations discussed. It should be noted that the components illustrated in the figures are not necessarily drawn to scale. Descriptions of well-known components and processing techniques and processes are omitted so as to not unnecessarily obscure the present application.
As shown in fig. 1, the pavement disease detection method based on PConv-YOLOv7 provided by the embodiment of the application comprises the following steps:
step one: and acquiring a pavement image and establishing a target data set.
In this embodiment, the GRDDC2020 dataset is used as the main model-training dataset; it consists of road images collected from India, Japan and the Czech Republic. Japan is a highly developed industrialized country with a very dense road network, high road quality, flat road surfaces, clear markings and complete traffic facilities; its expressways, motorways and ordinary roads form a complete network covering the whole country, and its traffic signs and road-condition information are detailed enough for drivers to learn road conditions in time and make correct driving decisions. India is a developing country with a relatively backward road network: road quality is uneven, road conditions are poor, and congestion and accidents are common; Indian roads are mainly divided into national, inter-state and local roads, and even the relatively high-quality national roads still suffer from uneven surfaces and unclear markings. The Czech Republic is a European country with a relatively developed road network, high road quality, flat surfaces, clear markings and complete traffic facilities; its motorways and expressways are built to higher standards with better pavement quality and very clear signage. In summary, these three countries cover the road conditions of most countries, so a model trained on this data can generalize well.
The application uses the Japanese subset as the target dataset according to the similarity of the specific road conditions. The target dataset is composed of road pictures annotated with XML documents in PASCAL VOC format. The experimental environment is the deep learning framework PyTorch under a Windows 10 system.
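Since the annotations are PASCAL VOC XML documents, they can be read with the Python standard library. A minimal sketch follows; the file name and the class tag "D00" are illustrative values, not taken from the patent:

```python
import xml.etree.ElementTree as ET

# Illustrative PASCAL VOC annotation; the tag layout follows the VOC
# convention, while the file name and class label are made up for the example.
SAMPLE = """<annotation>
  <filename>road_000001.jpg</filename>
  <object>
    <name>D00</name>
    <bndbox><xmin>10</xmin><ymin>20</ymin><xmax>110</xmax><ymax>220</ymax></bndbox>
  </object>
</annotation>"""

def parse_voc(xml_text):
    """Return a list of (label, xmin, ymin, xmax, ymax) tuples."""
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.iter("object"):
        b = obj.find("bndbox")
        boxes.append((obj.findtext("name"),
                      int(b.findtext("xmin")), int(b.findtext("ymin")),
                      int(b.findtext("xmax")), int(b.findtext("ymax"))))
    return boxes
```

In a real pipeline the same function would be applied to each XML file on disk via `ET.parse` instead of `ET.fromstring`.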
The dataset contains 7 types of road diseases, as shown in Table 1.
TABLE 1 Road disease categories: longitudinal cracks, transverse cracks, crazes, pits, fuzzy intersections, fuzzy white lines, and lost well covers.
Step two: and constructing a PConv-YOLOv7 model.
The method mainly comprises establishing a basic YOLOv7 model and improving the backbone in it; the model structure is shown in fig. 2. The input picture enters the model through 3 RGB channels and is first fed into the backbone network, which has 4 stages. Stage1 contains 4 blocks: a 1×1 convolution layer, a 3×3 stride-1 convolution layer, a 3×3 stride-2 convolution layer, and one ELAN layer, where an ELAN layer is a stack of several 3×3 stride-1 convolution layers and one 3×3 stride-2 convolution layer. PConv layers replace all 3×3 stride-1 convolution layers in the backbone: the two 3×3 stride-1 convolution blocks in Stage1 and the four 3×3 stride-1 layers inside each of the ELAN layers of Stage1, Stage2, Stage3 and Stage4, 18 layers in total.
That is, the convolution layers in the backbone network whose kernel size is 3 and stride is 1 are replaced by PConv layers.
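The replacement can be sketched as a FasterNet-style partial convolution module in PyTorch. The class name and the default partial ratio of 1/4 below are illustrative assumptions, not values fixed by the patent:

```python
import torch
import torch.nn as nn

class PConv(nn.Module):
    """Partial convolution: a regular k x k, stride-1 convolution is applied
    to only the first c_p channels; the remaining channels pass through
    unchanged, so spatial size and channel count are both preserved."""
    def __init__(self, channels: int, n_div: int = 4, kernel_size: int = 3):
        super().__init__()
        self.c_p = channels // n_div           # channels that are convolved
        self.c_rest = channels - self.c_p      # untouched channels
        self.conv = nn.Conv2d(self.c_p, self.c_p, kernel_size,
                              stride=1, padding=kernel_size // 2, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x1, x2 = torch.split(x, [self.c_p, self.c_rest], dim=1)
        return torch.cat((self.conv(x1), x2), dim=1)
```

Because the output shape matches the input shape, such a module can stand in for any 3×3, stride-1 convolution block inside the backbone.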
Replacing them with PConv layers reduces both computational redundancy and memory access. A PConv layer applies a conventional convolution to only a part of the input channels for spatial feature extraction and keeps the remaining channels unchanged. For consecutive or regular memory access, the first or last c_p consecutive channels are treated as representative of the entire feature map in the calculation. Without loss of generality, assume the input and output feature maps have the same number of channels; the calculation formula is then:
FLOPs = h × w × k² × c_p²
where h, w, k and c denote the height, width, convolution kernel size and number of channels of the input picture, and c_p is the number of the first or last consecutive channels. Since only c_p channels are used for spatial feature extraction, the FLOPs of the PConv layer are much smaller than those of a conventional convolution layer (h × w × k² × c²).
Similarly, the PConv layer requires less memory access; the specific formula is:
Memory access = h × w × 2c_p + k² × c_p² ≈ h × w × 2c_p
where h, w, k and c denote the height, width, convolution kernel size and number of channels of the input picture, and c_p is the number of the first or last consecutive channels.
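The two formulas above can be checked numerically; with the common choice c_p = c/4 the partial convolution needs exactly 1/16 of the FLOPs of a full convolution. The function names below are ours, introduced only for this sketch:

```python
def conv_flops(h, w, k, c):
    """FLOPs of a conventional convolution with c input and c output channels."""
    return h * w * k * k * c * c

def pconv_flops(h, w, k, c_p):
    """FLOPs of a partial convolution touching only c_p channels."""
    return h * w * k * k * c_p * c_p

def pconv_memory_access(h, w, k, c_p):
    """Memory access of the partial convolution: feature-map reads/writes
    on c_p channels plus the k*k*c_p*c_p weights."""
    return h * w * 2 * c_p + k * k * c_p * c_p
```

For a 32×32 feature map with 64 channels and c_p = 16, `pconv_flops` is 1/16 of `conv_flops`.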
The Focal EIOU loss function is used as the regression loss function of the model.
This embodiment uses the Focal EIOU regression loss function instead of the CIOU loss function of conventional YOLOv7. CIOU takes into account the overlapping area, center-point distance and aspect ratio in bounding-box regression, but the v term in its formula reflects only the difference in aspect ratio rather than the real differences between the widths and between the heights, which sometimes prevents the model from effectively optimizing similarity.
The EIOU penalty term splits the aspect-ratio factor of the CIOU penalty term and computes the width and height of the target box and the anchor box separately. The loss function consists of three parts: IOU loss, distance loss and side-length loss. The first two follow the CIOU method, but the width and height losses directly minimize the differences between the widths and heights of the target box and the anchor box, which leads to faster convergence. The penalty term formula is as follows:
L_EIOU = L_IOU + L_dis + L_asp
       = 1 - IOU + ρ²(b, b^gt)/c² + ρ²(w, w^gt)/C_w² + ρ²(h, h^gt)/C_h²
where L_IOU is the IOU loss, L_dis the distance loss and L_asp the side-length loss; C_w and C_h are the width and height of the smallest enclosing box covering the two boxes; c is the diagonal distance of the smallest enclosing region that can contain both the predicted box and the ground-truth box; b and b^gt are the center points of the anchor box and the target box; w and w^gt are their widths; h and h^gt are their heights; ρ is the Euclidean distance between the two points; IOU is the intersection area of the predicted box and the ground-truth box divided by their union; and γ is a parameter controlling the degree of outlier suppression.
There is also a sample-imbalance problem in bounding-box regression: in one image the number of high-quality anchor boxes with small regression error is far smaller than the number of low-quality samples with large error, and poor-quality samples generate excessively large gradients that disturb training. Based on EIOU and combined with Focal Loss, the Focal EIOU loss function separates high-quality anchor boxes from low-quality ones from the gradient perspective. The penalty term formula is as follows:
L_Focal-EIOU = IOU^γ · L_EIOU
the IOU is the intersection area between the prediction frame and the real frame divided by the union, and gamma is a parameter for controlling the inhibition degree of the abnormal value. The Focal in the Loss is different from the traditional Focal Loss to a certain extent, and the traditional Focal Loss plays a role in difficult sample mining aiming at the more difficult sample Loss; and according to the above formula: the higher the IOU, the greater the loss, which is equivalent to the weighting action, giving a greater loss to the better regression target, contributing to the improvement of regression accuracy.
Step three: the improved YOLOv7 model was trained and evaluated using the target dataset.
The target data set is divided into a training set, a verification set and a test set, wherein the ratio of the training set, the verification set and the test set is 7:1:2.
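The 7:1:2 split can be reproduced with a seeded shuffle; a minimal sketch, where the function name and the seed are illustrative choices:

```python
import random

def split_dataset(items, ratios=(0.7, 0.1, 0.2), seed=0):
    """Shuffle deterministically and split into train / validation / test."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_train = int(len(items) * ratios[0])
    n_val = int(len(items) * ratios[1])
    return (items[:n_train],
            items[n_train:n_train + n_val],
            items[n_train + n_val:])
```

Fixing the seed keeps the three subsets disjoint and reproducible across runs.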
A conventional training procedure is adopted in the model training process; the hyper-parameter settings are shown in Table 2:
TABLE 2 Hyper-parameter settings
The evaluation criteria used in this example are the following values:
(1) Recall: also known as the recall rate, the proportion of all positive samples that are correctly predicted as positive, reflecting missed detections.
Recall=TP/(TP+FN)
(2) Precision: also known as the accuracy rate, the percentage of predicted positive examples that are truly positive, measuring how likely a predicted positive example is to be correct.
Precision=TP/(TP+FP)
(3) F1-Score: the method is a harmonic average value of the accuracy rate and the recall rate, and simultaneously considers the accuracy rate and the recall rate of the classification model.
F1-Score = 2 × Precision × Recall / (Precision + Recall)
where TP is the number of positive samples predicted as positive, FN is the number of positive samples predicted as negative, FP is the number of negative samples predicted as positive, and TN is the number of negative samples predicted as negative.
(4) FLOPs: an abbreviation of floating-point operations, i.e. the number of floating-point operations, which can be understood as the amount of computation and used to measure the complexity of an algorithm/model.
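Evaluation values (1)-(3) above follow directly from the confusion-matrix counts; a small helper, where the example counts in the usage are made up:

```python
def prf1(tp, fp, fn):
    """Precision, Recall and F1-Score from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

For example, tp=6, fp=4, fn=4 gives precision 0.6, recall 0.6 and F1-Score 0.6.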
The training loss curves and evaluation values are shown in fig. 3. The first row shows the bounding-box training loss, objectness training loss, classification training loss, precision and recall; the second row shows the bounding-box validation loss, objectness validation loss, classification validation loss, the average precision at IOU=0.5, and the mean average precision over IOU=0.5-0.95. As the figure shows, the bounding-box and classification losses decrease gradually and level off after about 40 iterations, the objectness loss oscillates between 20 and 40 iterations, and precision and recall exceed 0.6 after 40 iterations; the bounding-box and classification validation losses in the second row also level off after 40 iterations, and the average precision at IOU=0.5 likewise exceeds 0.6.
After training, the evaluation values of the improved model of the application are compared with those of other models in Table 3.
Table 3 model evaluation value comparison
As can be seen from the table, the models were tested on the test set in terms of precision, recall and F1-score, and the model proposed by the application performed best: compared with conventional YOLOv7, its pavement disease recognition reached 60% precision and 53% recall, and its F1-score was 0.21 higher.
Model parameters for PConv layer versus conventional convolution layer are compared as shown in table 4:
table 4 model parameter size comparison
As can be seen from Table 4, introducing the PConv layer reduces the computation amount (FLOPs) of the model by 20% compared with conventional YOLOv7, greatly saving computation.
Step four: pavement disease detection is performed using the PConv-YOLOv7 model after the evaluation is completed.
The types of road surface defects include longitudinal cracks, transverse cracks, crazes, pits, fuzzy intersections, fuzzy white lines and lost well covers.
In another aspect, a computer device includes a processor, a memory storing machine-readable instructions executable by the processor, and a bus; when the computer device is in operation, the processor communicates with the memory via the bus, and the processor executes the machine-readable instructions to perform the steps of the PConv-YOLOv7-based pavement disease detection method described in any of the foregoing.
The embodiment of the application also provides a storage medium, on which a computer program is stored, which when being executed by a processor performs the steps of a PConv-YOLOv 7-based pavement disease detection method as described in any of the above.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other manners. The above-described apparatus embodiments are merely illustrative, for example, the division of modules is merely a logical function division, and there may be additional divisions in actual implementation, and for example, multiple modules or components may be combined or integrated into another system, or some features may be omitted, or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be through some communication interface, indirect coupling or communication connection of devices or modules, electrical, mechanical, or other form.
The modules illustrated as separate components may or may not be physically separate, and components shown as modules may or may not be physical modules, i.e., may be located in one place, or may be distributed over a plurality of network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional module in the embodiment provided by the application may be integrated in one processing module, or each module may exist alone physically, or two or more modules may be integrated in one module.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
Finally, it should be noted that: the above embodiments are only for illustrating the technical aspects of the present application and not for limiting the same, and although the present application has been described in detail with reference to the above embodiments, it should be understood by those of ordinary skill in the art that: modifications and equivalents may be made to the specific embodiments of the application without departing from the spirit and scope of the application, which is intended to be covered by the claims.

Claims (10)

1. The pavement disease detection method based on PConv-YOLOv7 is characterized by comprising the following steps of:
collecting road surface images and establishing a target data set;
constructing a PConv-YOLOv7 model;
training and evaluating a PConv-YOLOv7 model by using a target data set;
performing pavement disease detection using the evaluated PConv-YOLOv7 model.
2. The PConv-YOLOv7-based pavement disease detection method of claim 1, wherein the target dataset is composed of road pictures annotated with XML documents in PASCAL VOC format.
3. The PConv-YOLOv7-based pavement disease detection method of claim 1, wherein the constructing a PConv-YOLOv7 model comprises:
establishing a basic YOLOv7 model;
replacing the convolution layers whose kernel size is 3 and stride is 1 with PConv layers;
the Focal EIOU loss function is used as the regression loss function of the model.
4. The pavement disease detection method based on PConv-YOLOv7 according to claim 3, wherein the formula of the Focal-EIOU loss function comprises:
L_Focal-EIOU = IOU^γ · L_EIOU
L_EIOU = 1 − IOU + ρ²(b, b^gt)/c² + ρ²(w, w^gt)/C_w² + ρ²(h, h^gt)/C_h²
wherein C_w and C_h are respectively the width and height of the smallest bounding box covering the predicted box and the real box, c is the diagonal distance of the smallest closure region that can contain both the predicted box and the real box, b and b^gt are respectively the center points of the anchor frame and the target frame, w and w^gt are respectively their widths, h and h^gt are respectively their heights, ρ(·) is the Euclidean distance between its two arguments, IOU is the intersection area between the predicted frame and the real frame divided by their union, and γ is a parameter controlling the degree of outlier suppression.
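A plain-Python rendering of the loss in claim 4, for axis-aligned boxes given as (x1, y1, x2, y2). The value of γ is an assumption for illustration; the claims leave it as a tunable parameter:

```python
def focal_eiou(pred, gt, gamma=0.5):
    """Focal-EIoU loss for two axis-aligned boxes (x1, y1, x2, y2).

    L_EIoU = 1 - IoU + rho^2(b, b_gt)/c^2
             + rho^2(w, w_gt)/C_w^2 + rho^2(h, h_gt)/C_h^2
    L_Focal-EIoU = IoU^gamma * L_EIoU
    Sketch only: assumes overlapping, non-degenerate boxes.
    """
    px1, py1, px2, py2 = pred
    gx1, gy1, gx2, gy2 = gt
    # intersection over union
    iw = max(0.0, min(px2, gx2) - max(px1, gx1))
    ih = max(0.0, min(py2, gy2) - max(py1, gy1))
    inter = iw * ih
    area_p = (px2 - px1) * (py2 - py1)
    area_g = (gx2 - gx1) * (gy2 - gy1)
    iou = inter / (area_p + area_g - inter)
    # smallest enclosing box: width C_w, height C_h, squared diagonal c^2
    cw = max(px2, gx2) - min(px1, gx1)
    ch = max(py2, gy2) - min(py1, gy1)
    c2 = cw ** 2 + ch ** 2
    # squared distances between centers, widths and heights
    d_center = ((px1 + px2) / 2 - (gx1 + gx2) / 2) ** 2 \
             + ((py1 + py2) / 2 - (gy1 + gy2) / 2) ** 2
    d_w = ((px2 - px1) - (gx2 - gx1)) ** 2
    d_h = ((py2 - py1) - (gy2 - gy1)) ** 2
    l_eiou = 1 - iou + d_center / c2 + d_w / cw ** 2 + d_h / ch ** 2
    return iou ** gamma * l_eiou
```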
5. The PConv-YOLOv7-based pavement disease detection method of claim 1, wherein the target data set is divided into a training set, a validation set and a test set in the ratio 7:1:2.
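The 7:1:2 split in claim 5 can be sketched as follows; the fixed seed is an assumption added for reproducibility:

```python
import random

def split_dataset(paths, seed=0):
    """Shuffle image paths and split them 7:1:2 into train/val/test."""
    rng = random.Random(seed)          # fixed seed: reproducible split
    paths = list(paths)
    rng.shuffle(paths)
    n = len(paths)
    n_train = int(n * 0.7)
    n_val = int(n * 0.1)
    return (paths[:n_train],
            paths[n_train:n_train + n_val],
            paths[n_train + n_val:])
```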
6. The PConv-YOLOv7-based pavement disease detection method according to claim 1, wherein the evaluating the PConv-YOLOv7 model comprises:
calculating Recall, precision and F1 scores:
Recall=TP/(TP+FN)
Precision=TP/(TP+FP)
F1-Score=2×(Precision×Recall)/(Precision+Recall)
where TP is the number of positive samples predicted as positive, FN is the number of positive samples predicted as negative, FP is the number of negative samples predicted as positive, and TN is the number of negative samples predicted as negative.
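The three metrics in claim 6 follow directly from the TP/FP/FN counts:

```python
def detection_metrics(tp, fp, fn):
    """Recall, Precision and F1 score from TP/FP/FN counts (claim 6)."""
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    f1 = 2 * precision * recall / (precision + recall)
    return recall, precision, f1
```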
7. The PConv-YOLOv7-based pavement disease detection method according to claim 6, wherein the evaluation of the PConv-YOLOv7 model further comprises measuring the complexity of the model by the floating-point operations (FLOPs) of a PConv layer:
FLOPs = h × w × k² × c_p²
wherein h and w respectively represent the height and width of the input feature map, k is the convolution kernel size, c is the number of channels, and c_p is the number of consecutive channels (the first or last c_p channels) on which the convolution is applied.
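The complexity comparison in claim 7 can be sketched by counting multiply-accumulate operations: since PConv convolves only c_p = c/n_div channels, its FLOPs are n_div² times lower than a regular convolution over all channels. The value n_div = 4 is an assumption following the FasterNet default:

```python
def conv_flops(h, w, k, c):
    """Multiply-accumulate count of a regular k x k conv, c -> c channels."""
    return h * w * k ** 2 * c ** 2

def pconv_flops(h, w, k, c, n_div=4):
    """PConv convolves only c_p = c // n_div channels, so FLOPs shrink
    by a factor of n_div^2 relative to the regular convolution."""
    cp = c // n_div
    return h * w * k ** 2 * cp ** 2
```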
8. The PConv-YOLOv7-based pavement disease detection method of claim 1, wherein the types of pavement diseases include longitudinal cracks, transverse cracks, alligator cracks, potholes, crosswalk blur, white-line blur, and manhole cover loss.
9. A computer device comprising a processor, a memory and a bus, the memory storing machine-readable instructions executable by the processor, the processor and the memory communicating via the bus when the computer device is in operation, the machine-readable instructions, when executed by the processor, performing the steps of the PConv-YOLOv7-based pavement disease detection method of any one of claims 1 to 8.
10. A readable storage medium, characterized in that the readable storage medium stores a computer program which, when executed by a processor, implements the steps of the PConv-YOLOv7-based pavement disease detection method according to any one of claims 1 to 8.
CN202310840520.2A 2023-07-10 2023-07-10 Pavement disease detection method based on PConv-YOLOv7 Pending CN116935157A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310840520.2A CN116935157A (en) 2023-07-10 2023-07-10 Pavement disease detection method based on PConv-YOLOv7

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310840520.2A CN116935157A (en) 2023-07-10 2023-07-10 Pavement disease detection method based on PConv-YOLOv7

Publications (1)

Publication Number Publication Date
CN116935157A true CN116935157A (en) 2023-10-24

Family

ID=88381997

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310840520.2A Pending CN116935157A (en) 2023-07-10 2023-07-10 Pavement disease detection method based on PConv-YOLOv7

Country Status (1)

Country Link
CN (1) CN116935157A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117557765A (en) * 2023-11-15 2024-02-13 兰州交通大学 Small-target water-float garbage detection method based on APM-YOLOv7
CN117557765B (en) * 2023-11-15 2024-04-09 兰州交通大学 Small-target water-float garbage detection method based on APM-YOLOv7

Similar Documents

Publication Publication Date Title
CN104766046B (en) Traffic sign detection and recognition method using color and shape features
Zhang et al. Pavement distress detection using convolutional neural network (CNN): A case study in Montreal, Canada
CN111784657A (en) Digital image-based system and method for automatically identifying cement pavement diseases
CN104156731A (en) License plate recognition system based on artificial neural network and method
CN110427860A (en) Lane line detection method, apparatus and storage medium
CN111797829A (en) License plate detection method and device, electronic equipment and storage medium
CN116310785B (en) Unmanned aerial vehicle image pavement disease detection method based on YOLO v4
CN114998852A (en) Intelligent detection method for road pavement diseases based on deep learning
CN116935157A (en) Pavement disease detection method based on PConv-YOLOv7
CN113011331B (en) Method and device for detecting whether motor vehicle gives way to pedestrians, electronic equipment and medium
CN110599453A (en) Panel defect detection method and device based on image fusion and equipment terminal
CN115661072A (en) Disc rake surface defect detection method based on improved fast RCNN algorithm
CN111462140A (en) Real-time image instance segmentation method based on block splicing
Yang et al. PDNet: Improved YOLOv5 nondeformable disease detection network for asphalt pavement
CN106529391B (en) Robust speed-limit traffic sign detection and recognition method
CN111476890A (en) Method for repairing moving vehicle in three-dimensional scene reconstruction based on image
CN104331708B (en) Automatic zebra-crossing detection and analysis method and system
CN116258820B (en) Large-scale urban point cloud data set and building individuation construction method and related device
CN115100173B (en) Road pavement image crack geometric property detection and crack identification method based on deep convolutional neural network
CN115546743B (en) Vehicle road cooperative control method, device, equipment and medium based on adhesion coefficient
CN117058069A (en) Automatic detection method for apparent diseases of pavement in panoramic image
CN116052110A (en) Intelligent positioning method and system for pavement marking defects
CN114612731B (en) Intelligent identification method and system for road flatness detection
CN115294774B (en) Non-motor vehicle road stopping detection method and device based on deep learning
CN110765900A (en) DSSD-based automatic illegal building detection method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination