CN116433629A - Airport pavement defect identification method based on GA-Unet - Google Patents


Info

Publication number
CN116433629A
CN116433629A (application CN202310386523.3A)
Authority
CN
China
Prior art keywords
model
image
airport pavement
unet
module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310386523.3A
Other languages
Chinese (zh)
Inventor
罗仁泽
邓治林
罗任权
谭亮
余泓
李华督
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southwest Petroleum University
Original Assignee
Southwest Petroleum University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Southwest Petroleum University filed Critical Southwest Petroleum University
Priority to CN202310386523.3A priority Critical patent/CN116433629A/en
Publication of CN116433629A publication Critical patent/CN116433629A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • G06T7/0004Industrial image inspection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • G06N3/0455Auto-encoder networks; Encoder-decoder networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/048Activation functions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T1/00General purpose image data processing
    • G06T1/0014Image feed-back for automatic industrial control, e.g. robot with camera
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/54Extraction of image or video features relating to texture
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/776Validation; Performance evaluation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Abstract

The invention provides an airport pavement defect recognition method based on GA-Unet. A GFM module enriches the input information of the model, increasing the feature information available to it; MRCM and CAM modules then make effective use of the features at each level of the network, improving the model's feature-recognition capability. During training, the Focal loss function makes the model pay more attention to hard-to-classify small-sample features, such as cracks in the images, further improving the model's recognition of cracks, sealed joints, and slab joints. Experimental results show that deep-learning image recognition technology can be applied to the field of airport pavement image recognition, providing a theoretical basis for reducing the workload of airport pavement managers and for the scientific maintenance and management of airport pavement.

Description

Airport pavement defect identification method based on GA-Unet
Technical Field
The invention relates to the field of airport pavement defect detection, and in particular to an airport pavement crack detection method based on GA-Unet.
Background
As airports age and the traffic they carry grows, defects appear inside and on the surface of airport runways, affecting the safety and comfort of aircraft take-off and landing. Cracks are an early manifestation of serious airport pavement defects; if allowed to develop, they cause more severe damage, so finding and treating pavement cracks in time reduces airport operation and maintenance costs. Meanwhile, as airports grow in scale and number, manual inspection can no longer keep pace with their development. Automatic identification techniques for airport pavement cracks have made some progress but remain far from practical deployment. With the application of deep learning in computer vision, airport pavement crack detection technology is gradually maturing.
Various detection techniques have been applied in the field of airport pavement inspection, such as ground-penetrating radar (Cheng, 2019; Huo Dongwei, 2014), unmanned aerial vehicles (Chen Fengchen et al., 2019), and FWD deflection testing (Zhao Zhihua and Yuan Jie, 2015). For apparent (surface) defect detection, image data of the airport pavement are collected mainly by platforms carrying industrial cameras (line-scan and area-scan cameras); after collection, the defects in the pavement images are identified and analyzed to assess the specific condition of the pavement. For feature recognition in images, there are various conventional techniques based on numerical analysis, such as thresholding (Tang and Wu, 2011) and morphological segmentation (Meng et al., 2018); with the development of machine learning, techniques such as support vector machines and neural networks have also been applied (Mahadevkar et al., 2022); and deep learning has matured in computer vision, yielding many image detection and recognition methods such as YOLO (Hsu and Lin, 2021), FRCNN (Girshick, 2015), and U-Net (Ronneberger et al., 2015). Although deep learning has seen little research in airport pavement surface-defect detection specifically, it has many practical applications in closely related fields, such as detecting cracks on concrete building surfaces and highway pavements.
Yao et al. (2022) combined YOLO v5 with an attention mechanism to detect road surface cracks, achieving good results; FRCNN has likewise been applied effectively in road surface inspection (Kortmann et al., 2020). The success of the YOLO and FRCNN models demonstrates the feasibility of object-detection networks for road crack detection. However, crack detection requires extracting not only the crack region but also feature information such as the crack's orientation and morphology, so semantic segmentation methods are used to obtain pixel-level predictions, which facilitate further analysis of the cracks. Chen and Jahanshahi (2020) proposed NB-FCN, which combines an FCN architecture with Bayesian probability to detect cracks in nuclear power plant subsea components with high precision. Sun et al. (2022) proposed DMA-Net, based on the DeepLab v3+ architecture with a multi-scale attention module in the decoder, achieving better crack detection; their experimental results show that the method realizes pixel-level crack detection in images. Shamsabadi et al. (Asadi Shamsabadi et al., 2022) used ViT (Vision Transformer) to detect asphalt pavement cracks, obtaining better performance than conventional CNN methods. Filtering methods also have mature applications in crack detection; in particular, the Gabor filter has achieved high-precision detection of road surface cracks.
However, in actual crack detection, and especially for airport pavement, crack features in the images are not obvious: airport pavement cracks are narrow, and various interference conditions such as standing water and glue deposits exist, so conventional deep-learning crack detection and identification methods perform poorly in this field.
To address the problems that deep learning encounters in airport pavement crack recognition, a GA-Unet network model is presented herein for airport pavement defect detection. The model is based on the Unet encoder-decoder structure (Ronneberger et al., 2015). At the input, Gabor filtering is applied to the airport pavement image to extract texture features in all orientations; these feature maps are used as network input, enriching it. In the encoder, a multi-scale residual convolution module extracts feature information at different scales in the image. In the decoder, a channel attention mechanism combines shallow and deep features, processes the fused features, and dynamically assigns weights across the different feature layers so that features at all levels are fully used; finally, deconvolution is used for upsampling, realizing semantic segmentation of the image.
Disclosure of Invention
The invention overcomes the deficiencies of the prior art; its purpose is to provide a GA-Unet-based method for identifying airport pavement defects.
In order to achieve the technical purpose, the invention adopts the following technical scheme:
A method for identifying airport pavement defects based on GA-Unet, characterized in that the method comprises the following steps:
s1: acquiring apparent image data of the airport pavement with an area-array camera mounted on a robot platform, and preprocessing the images after acquisition is complete;
s2: classifying the collected apparent image data of the airport pavement; after classification, performing pixel-level labeling on the images with the python-based labelme annotation tool; constructing an airport pavement defect image dataset and randomly dividing it into training, validation, and test sets for training the deep learning model;
s3: constructing a Multi-scale residual convolution module (Multi-scale Residual Convolution Module, MRCM);
s4: constructing a Gabor filter transformation module (Gabor Filter Module, GFM);
s5: constructing a channel attention module (Channel Attention Module, CAM);
s6: constructing a GA-Unet model by using the constructed residual convolution module, the channel attention module, the Gabor filter transformation module, the deconvolution module and the image maximum pooling downsampling;
s7: optimizing model parameters by using a focus loss function;
s8: setting a suitable quantitative evaluation scheme for the model, using precision, recall, and F1 score;
s9: inputting experimental data into a model, performing model training by using data in a training set, and evaluating the generalization performance of the model by using a verification set and a test set in the training process;
s10: detecting actual airport pavement images with the best model parameters, realizing pixel-level segmentation detection, and evaluating the model on the detection results for different images.
Further, in step S1, an area-array camera mounted on the robot platform collects the apparent image data of the airport pavement; the collected data are grayscale images of size 1800×900;
Further, the image preprocessing in step S1 is implemented as follows:
The illumination of an airport pavement image can be expressed as:
I(x,y) = b(x,y) + g(x,y)
where I(x,y) is the total illumination distribution of the image, b(x,y) is the background light-intensity distribution, and g(x,y) is the light-intensity distribution of the spotlight used during acquisition.
After the pixel-value distribution characteristics of the airport pavement image are obtained, several images are randomly selected and their average pixel distribution is computed. This step reduces the interference of features such as pavement markings and glue deposits, as well as noise, with subsequent identification of the images. The average pixel distribution is calculated as:

Ī(x,y) = (1/N) · Σ_{i=1}^{N} I_i(x,y)

where Ī(x,y) is the average pixel distribution matrix of the selected images, N is the number of selected images, and I_i(x,y) is the pixel-value distribution matrix of the i-th image.
After the average pixel distribution matrix is computed, an illumination compensation coefficient matrix is designed from it; a suitable form (the original equation appears only as an image) divides the global mean by the local average:

M_p(x,y) = mean(Ī) / Ī(x,y)

where M_p(x,y) is the resulting illumination compensation coefficient matrix.
After the coefficient matrix is obtained, the original image matrix is multiplied element-wise by the coefficient matrix to obtain the illumination-compensated image:

I_n(x,y) = α · (M_p(x,y) ⊙ I(x,y))

where I_n(x,y) is the compensated version of the original image I(x,y), α is an illumination gain coefficient whose value usually lies between 0.90 and 1.20, and ⊙ denotes element-wise multiplication of corresponding matrix entries.
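The preprocessing steps above can be sketched in Python with NumPy (a minimal illustration; the function name and the exact form of the coefficient matrix — here the global mean divided by the local average — are assumptions, since the patent gives the formula only as an image):

```python
import numpy as np

def illumination_compensation(images, alpha=1.0):
    """Sketch of the S1 preprocessing (assumed implementation).

    images: list of 2-D grayscale arrays with identical shape.
    Returns the compensation coefficient matrix M_p and the
    compensated version of the first image.
    """
    stack = np.stack([img.astype(np.float64) for img in images])
    avg = stack.mean(axis=0)  # average pixel distribution over the N images
    # Coefficient matrix: boost dark regions and damp bright ones so the
    # compensated background becomes roughly uniform (assumed form).
    m_p = avg.mean() / np.clip(avg, 1e-6, None)
    # Element-wise product with gain alpha, as in I_n = alpha * (M_p . I).
    compensated = alpha * (m_p * images[0].astype(np.float64))
    return m_p, np.clip(compensated, 0, 255)
```

With two uniformly lit images the coefficient matrix is all ones and the first image passes through unchanged; with a single unevenly lit image the output is flattened to the image's mean level.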
Further, in step S2, labelme annotation is performed by manually drawing outlines to delimit the pixel regions.
Further, the multi-scale residual convolution module in step S3 is implemented as follows:

y_1 = [output of the first convolution branch applied to x_in; equation given only as an image in the source]
y_2 = [output of the second convolution branch applied to x_in; equation given only as an image in the source]
y_MRCM = y_1 + y_2

where x_in is the input data of the module, y_1 and y_2 are the outputs of the two branches, and y_MRCM is the output of the entire residual convolution module.
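As a rough illustration of the idea (not the patent's exact branch equations, which appear only as images in the source), a two-branch multi-scale residual combination might look like the following NumPy sketch, with assumed 3×3 and 5×5 branches, ReLU activations, and residual shortcuts:

```python
import numpy as np

def conv2d_same(x, kernel):
    """Naive 'same' 2-D cross-correlation with zero padding (illustration only)."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    xp = np.pad(x, ((ph, ph), (pw, pw)))
    out = np.zeros_like(x, dtype=np.float64)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = np.sum(xp[i:i + kh, j:j + kw] * kernel)
    return out

def mrcm(x_in, k3, k5):
    """Assumed MRCM sketch: two branches at different receptive fields
    (3x3 and 5x5 kernels), each with a residual shortcut, summed as
    y_MRCM = y_1 + y_2."""
    y1 = np.maximum(conv2d_same(x_in, k3), 0.0) + x_in  # 3x3 branch + shortcut
    y2 = np.maximum(conv2d_same(x_in, k5), 0.0) + x_in  # 5x5 branch + shortcut
    return y1 + y2
```

In the real model the branches would be learned convolutions over multi-channel feature maps; the sketch just shows how multi-scale branch outputs and residual connections combine.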
Further, the principle of the Gabor filter transformation module in step S4 is as follows:
The complex expression of the two-dimensional Gabor function is:

g(x,y; λ,θ,ψ,σ,γ) = exp(−(x′² + γ²·y′²)/(2σ²)) · exp(i(2π·x′/λ + ψ))

The real-part expression of the two-dimensional Gabor function is:

g_real(x,y) = exp(−(x′² + γ²·y′²)/(2σ²)) · cos(2π·x′/λ + ψ)

The imaginary-part expression of the two-dimensional Gabor function is:

g_imag(x,y) = exp(−(x′² + γ²·y′²)/(2σ²)) · sin(2π·x′/λ + ψ)

where x′ and y′ are obtained by:
x′ = x·cosθ + y·sinθ
y′ = −x·sinθ + y·cosθ
In the above equations, λ is the wavelength parameter of the cosine factor of the Gabor kernel; θ is the orientation of the parallel stripes of the Gabor filter kernel; ψ is the phase parameter of the cosine factor; γ is the aspect-ratio parameter of the filter kernel, determining the shape of the Gabor kernel; and σ is the standard deviation of the Gaussian factor of the Gabor function. λ, σ, and the bandwidth b (in octaves) are related as follows:

b = log₂( (σ/λ·π + √(ln2/2)) / (σ/λ·π − √(ln2/2)) )
σ/λ = (1/π) · √(ln2/2) · (2^b + 1)/(2^b − 1)
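The real-part kernel and the bandwidth relation above can be implemented directly; the following NumPy sketch (function names are illustrative) builds a Gabor kernel over a square window and computes the σ/λ ratio implied by a bandwidth of b octaves:

```python
import numpy as np

def gabor_kernel(ksize, lam, theta, psi, sigma, gamma):
    """Real part of the 2-D Gabor function over a ksize x ksize window."""
    half = ksize // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(np.float64)
    xp = x * np.cos(theta) + y * np.sin(theta)    # x' =  x cos(theta) + y sin(theta)
    yp = -x * np.sin(theta) + y * np.cos(theta)   # y' = -x sin(theta) + y cos(theta)
    envelope = np.exp(-(xp ** 2 + gamma ** 2 * yp ** 2) / (2 * sigma ** 2))
    return envelope * np.cos(2 * np.pi * xp / lam + psi)

def sigma_over_lambda(b):
    """sigma/lambda ratio implied by a half-response bandwidth of b octaves."""
    c = np.sqrt(np.log(2) / 2) / np.pi
    return c * (2 ** b + 1) / (2 ** b - 1)
```

A GFM-style input bank would evaluate `gabor_kernel` for several orientations θ and window sizes and convolve each kernel with the pavement image, stacking the responses as extra input channels.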
further, in the step S5, the channel attention module calculates the formula:
y CAM =L·f(L,H)+H
wherein y is CAM Representing the final output result of the channel attention module; l represents a shallow feature map output in the encoder; h represents a deep feature map of the same size as in the encoder output in the decoder; f (L, H) represents the result of cascade, global average pooling, 1×1 convolution, reLU activation function, 1×1 convolution, and Sigmoid activation function of the shallow feature map output in the encoder and the deep feature map output in the decoder.
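In sketch form, the computation of f(L, H) reduces to matrix products once the feature maps are globally pooled; the following NumPy illustration (shapes and weight names are assumptions, and the learned 1×1 convolutions are replaced by fixed matrices) mirrors the sequence of operations described above:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def channel_attention(low, high, w1, w2):
    """y_CAM = L * f(L, H) + H with
    f = Sigmoid(W2 @ ReLU(W1 @ GAP(Concat(L, H)))).

    low, high: (C, H, W) feature maps; w1: (C_mid, 2C); w2: (C, C_mid).
    On pooled vectors, 1x1 convolutions reduce to matrix products.
    """
    cat = np.concatenate([low, high], axis=0)          # concat along channels -> (2C, H, W)
    gap = cat.mean(axis=(1, 2))                        # global average pooling -> (2C,)
    weights = sigmoid(w2 @ np.maximum(w1 @ gap, 0.0))  # per-channel weights in (0, 1)
    return low * weights[:, None, None] + high         # reweight shallow, add deep
```

The per-channel weights dynamically scale the shallow (encoder) features before they are merged with the deep (decoder) features, which is the "dynamic weight assignment between feature layers" described in the text.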
Further, in step S7, the cross-entropy loss function and the focal loss function are used to optimize the model parameters, as follows:
First, the cross-entropy (CE) loss is considered:

CE(p_k) = −lg(p_k)

where p_k is the predicted probability that a sample belongs to the k-th class.
For crack image data, most of each image is background, because cracks are linearly distributed and occupy only a small portion of the image. With so many negative samples, the background loss dominates the total loss during training, which skews the optimization direction of the model. To handle this positive/negative sample imbalance, the focal loss is used here to re-weight the samples:

FL(p_k) = −α(1 − p_k)^γ · lg(p_k)

where α is a constant that should be decreased slightly as γ increases (α = 2.5 in this experiment); γ is the focusing parameter (γ = 2.0 in this experiment); and (1 − p_k)^γ is a modulation factor that reduces the weight of easily classified samples, making the model focus on hard-to-classify samples.
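A direct NumPy transcription of the two loss terms (using the natural logarithm; the source writes lg, which differs only by a constant factor):

```python
import numpy as np

def cross_entropy(p_k):
    """CE(p_k) = -log(p_k), the per-sample cross-entropy loss."""
    return -np.log(p_k)

def focal_loss(p_k, alpha=2.5, gamma=2.0):
    """FL(p_k) = -alpha * (1 - p_k)**gamma * log(p_k), with the
    experiment's settings alpha = 2.5, gamma = 2.0.
    The (1 - p_k)**gamma factor down-weights well-classified samples."""
    return -alpha * (1.0 - p_k) ** gamma * np.log(p_k)
```

For a well-classified crack pixel (p_k near 1) the modulation factor is tiny, so the loss is dominated by hard, low-probability samples such as thin cracks.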
Further, the evaluation metrics in step S8 are calculated as follows:

Precision = TP / (TP + FP)
Recall = TP / (TP + FN)
F1 = 2 · Precision · Recall / (Precision + Recall)

where TP is the number of correctly detected crack pixels, FP is the number of non-crack pixels predicted as crack pixels, and FN is the number of crack pixels that were not detected.
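These three metrics can be computed from binary masks as follows (a small NumPy sketch; the function name is illustrative):

```python
import numpy as np

def pixel_metrics(pred, gt):
    """Precision, recall, F1 from binary crack masks (1 = crack pixel)."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    tp = np.sum(pred & gt)    # crack pixels correctly detected
    fp = np.sum(pred & ~gt)   # non-crack pixels predicted as crack
    fn = np.sum(~pred & gt)   # crack pixels missed
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

In practice the masks would be the model's thresholded segmentation output and the labelme ground-truth annotation for the same image.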
The invention provides a new image semantic segmentation model, GA-Unet. In the standard Unet, the channel count of the input is usually expanded by plain convolution; here, a GFM module built from Gabor filter transforms extracts features such as textures in different orientations in a relatively fixed manner, realizing preliminary feature extraction, enriching the model's input information, and increasing the feature information available to it. The model then uses the MRCM and CAM modules to make effective use of the features at each level of the network, improving its feature-recognition capability. During training, the Focal loss function makes the model pay more attention to hard-to-classify small-sample features, such as cracks in the images, further improving its recognition of cracks, sealed joints, and slab joints. Experimental results show that deep-learning image recognition technology can be applied to the field of airport pavement image recognition, providing a theoretical basis for reducing the workload of airport pavement managers and for the scientific maintenance and management of airport pavement.
The beneficial effects are that:
compared with the prior art, the invention has the following beneficial effects:
the experiment is based on airport pavement data acquired by robots, features (cracks, pouring seams and slab seams) in airport pavement images are respectively identified by using FCN, deep Lab v3, unet and GA-Unet models, so that the automatic identification of airport pavement defects is realized, and the method is summarized as follows: the experiment is mainly to use a GA-Unet model to identify apparent cracks of an airport pavement, and compare the apparent cracks with the existing classical semantic segmentation model in identification effect. The comparison experiment result shows that under the condition of using the same experimental data, the same experimental equipment and the same training period, the GA-Unet model trained by using the Focal loss function has better robustness and generalization capability, and the GA-Unet can accurately identify cracks in airport pavement images with different interference degrees. At the same time of identifying the crack of the airport pavement, the crack filling and board crack identifying performance of the GA-Unet is evaluated in detail. The comparison experiment result shows that the GA-Unet can accurately extract the characteristics of the crack pouring and the board crack in the images with different interference degrees, and the effective identification of the crack pouring and the board crack in the airport pavement is realized. Finally, all comparison experiment results show that the identification results of the GA-Unet model on cracks, pouring seams and slab joints in the airport pavement are obviously superior to those of FCN, deep Lab v3 and Unet.
Drawings
FIG. 1 is a diagram of the structure of a GA-Unet model;
FIG. 2 shows Gabor filter kernels with different window sizes and different directions;
FIG. 3 is a block diagram of a residual convolution module;
FIG. 4 is a block diagram of a channel attention module;
FIG. 5 is an airport pavement image;
FIG. 6 is a graph of loss function variation;
FIG. 7 is a graph of F1 value variation;
FIG. 8 is a graph comparing the crack recognition effect of the image of the airport pavement by different algorithms;
FIG. 9 is a graph showing comparison of crack pouring recognition effects of different algorithms;
FIG. 10 is a graph showing the comparison of the plate seam recognition effect of different algorithms.
Detailed Description
The present invention will be described in further detail with reference to the following examples in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
Examples:
An airport pavement defect recognition method based on GA-Unet comprises the following steps:
s1: acquiring apparent image data of the airport pavement with an area-array camera mounted on a robot platform, and preprocessing the images after acquisition is complete;
s2: classifying the collected apparent image data of the airport pavement; after classification, performing pixel-level labeling on the images with the python-based labelme annotation tool; constructing an airport pavement defect image dataset and randomly dividing it into training, validation, and test sets for training the deep learning model;
s3: constructing the deep learning model with the python-based deep learning framework PyTorch, and building a multi-scale residual convolution module (Multi-scale Residual Convolution Module, MRCM), whose specific structure is shown in FIG. 3;
the multi-scale residual convolution module in the step S3 is realized by the following steps:
Figure SMS_16
Figure SMS_17
y MRCM =y 1 +y 2
wherein x is in Input data representing a module; y is 1 And y 2 Respectively representing the outputs of the two branches; y is MRCM Representing the output of the entire residual convolution module.
s4: constructing a Gabor filter transformation module (Gabor Filter Module, GFM) in python; the filter kernels are shown in FIG. 2;
The principle of the Gabor filter transformation module in step S4 is as follows:
The complex expression of the two-dimensional Gabor function is:

g(x,y; λ,θ,ψ,σ,γ) = exp(−(x′² + γ²·y′²)/(2σ²)) · exp(i(2π·x′/λ + ψ))

The real-part expression of the two-dimensional Gabor function is:

g_real(x,y) = exp(−(x′² + γ²·y′²)/(2σ²)) · cos(2π·x′/λ + ψ)

The imaginary-part expression of the two-dimensional Gabor function is:

g_imag(x,y) = exp(−(x′² + γ²·y′²)/(2σ²)) · sin(2π·x′/λ + ψ)

where x′ and y′ are obtained by:
x′ = x·cosθ + y·sinθ
y′ = −x·sinθ + y·cosθ
In the above equations, λ is the wavelength parameter of the cosine factor of the Gabor kernel; θ is the orientation of the parallel stripes of the Gabor filter kernel; ψ is the phase parameter of the cosine factor; γ is the aspect-ratio parameter of the filter kernel, determining the shape of the Gabor kernel; and σ is the standard deviation of the Gaussian factor of the Gabor function. λ, σ, and the bandwidth b (in octaves) are related as follows:

b = log₂( (σ/λ·π + √(ln2/2)) / (σ/λ·π − √(ln2/2)) )
σ/λ = (1/π) · √(ln2/2) · (2^b + 1)/(2^b − 1)
s5: constructing the deep learning model with the python-based deep learning framework PyTorch, and building a channel attention module (Channel Attention Module, CAM), whose specific structure is shown in FIG. 4;
In step S5, the channel attention module is computed as:

y_CAM = H + L·W

where y_CAM is the final output of the channel attention module, H the high-level (deep) features, L the low-level (shallow) features, and W the output of the intermediate feature-fusion flow:

W = Sigmoid(Conv_1×1(ReLU(Conv_1×1(AdaptAvgPool(Concat(L, H))))))

where Concat denotes feature concatenation, AdaptAvgPool global average pooling, Conv_1×1 a 1×1 convolution, ReLU the ReLU activation function, and Sigmoid the Sigmoid activation function;
s6: constructing the GA-Unet model from the residual convolution module, channel attention module, Gabor filter transformation module, deconvolution module, and max-pooling downsampling; the structure is shown in FIG. 1;
s7: optimizing model parameters with the cross-entropy loss function and the focal loss function; the variation of the loss function is shown in FIG. 6;
In step S7, the cross-entropy loss function and the focal loss function are used to optimize the model parameters, as follows:
First, the cross-entropy (CE) loss is considered:

CE(p_k) = −lg(p_k)

where p_k is the predicted probability that a sample belongs to the k-th class.
For crack image data, most of each image is background, because cracks are linearly distributed and occupy only a small portion of the image. With so many negative samples, the background loss dominates the total loss during training, which skews the optimization direction of the model. To handle this positive/negative sample imbalance, the focal loss is used here to re-weight the samples:

FL(p_k) = −α(1 − p_k)^γ · lg(p_k)

where α is a constant that should be decreased slightly as γ increases (α = 2.5 in this experiment); γ is the focusing parameter (γ = 2.0 in this experiment); and (1 − p_k)^γ is a modulation factor that reduces the weight of easily classified samples, making the model focus on hard-to-classify samples.
s8: setting a suitable quantitative evaluation scheme for the model, using precision, recall, and F1 score; the variation of the F1 score is shown in FIG. 7;
S9: inputting experimental data into the model, performing model training with the training-set data, and evaluating the generalization performance of the model with the validation set and test set during training;
S10: detecting actual airport pavement images with the best model parameters, realizing image pixel-level segmentation detection, and evaluating the model by its detection effect on different images.
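Step S9's checkpoint selection, keeping the weights of the epoch with the highest validation F1 as also stated in the claims, can be sketched as follows; the epoch and metric values are illustrative, not from the experiments:

```python
def f1_score(precision, recall):
    """F1 value computed from precision and recall (step S8)."""
    return 2.0 * precision * recall / (precision + recall)

def select_best_epoch(history):
    """history: iterable of (epoch, precision, recall) measured on the
    validation set; returns the epoch whose F1 value is highest, i.e.
    the checkpoint kept as the trained model in step S9."""
    return max(history, key=lambda e: f1_score(e[1], e[2]))[0]

history = [(1, 0.50, 0.50), (2, 0.80, 0.90), (3, 0.70, 0.70)]  # illustrative
best = select_best_epoch(history)
```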
Example 1:
the data used in this example was from an airport pavement APD dataset and the airport pavement images were from different airports as shown in fig. 5.
Model training was performed on a computer with an Intel i7-8700 CPU (3.20 GHz main frequency), 16 GB of RAM, a GeForce RTX 2080 GPU with 8 GB of video memory, and the Windows 10 operating system; the change of the model's F1 value during training is shown in fig. 7.
The following shows the effect comparison of the airport pavement defect detection realized by the airport pavement defect identification method based on GA-Unet.
Table 1 compares quantitative indexes of airport pavement crack recognition performance for the deep learning model trained with different loss functions; the comparison shows that the model trained with the Focal loss function recognizes cracks better. The specific effects of different models on crack identification in airport pavement are compared in fig. 8. From both the data indexes and the image recognition effects, GA-Unet identifies airport pavement cracks better than FCN, DeepLab v3 and Unet, and can accurately and efficiently identify cracks of different forms in the airport pavement: the crack recognition precision, recall and F1 value of GA-Unet reach 80.27%, 86.69% and 83.36%, respectively.
As the data in table 1 show, GA-Unet efficiently identifies cracks in the airport pavement regardless of which loss function is used to optimize the network model parameters; moreover, when the model is trained with the Focal loss function, its parameters are optimized more effectively for cracks, giving the model better crack recognition performance.
Airport pavement images differ considerably and are also disturbed by water stains, glue accumulation and the like, so a model identifying airport pavement cracks must have good anti-interference capability. Fig. 8 shows the airport pavement crack recognition results of the different models on images under different interferences.
As seen from the detection results for image 1 in fig. 8, for a pavement image with a simple background, little interference and clearly visible cracks, every model can effectively detect the cracks in the image. Specifically, apart from the discontinuous cracks detected by the DeepLab v3 model, all models identify the cracks in the image completely.
As seen from the detection results for image 2 in fig. 8, for an image with water-stain interference and inconspicuous crack morphology, the FCN, DeepLab v3, Unet and GA-Unet models can all detect the cracks in the image. Although every deep learning model used in the experiments produces some discontinuity in its recognition result, the cracks identified by FCN, DeepLab v3 and Unet contain many break points, whereas the cracks identified by GA-Unet are relatively complete and show better continuity.
As shown by image 3 in fig. 8, when interference such as grooving and water stains is present and the pavement crack features are not obvious, the recognition performance of the FCN, DeepLab v3, Unet and GA-Unet models drops noticeably. Specifically, the FCN model can only identify the cracks with relatively obvious features, and its recognition effect is poor; DeepLab v3 performs better than FCN but is still disturbed to some extent by water stains and grooving; the Unet model outperforms both FCN and DeepLab v3, but the cracks it identifies are incomplete; finally, the GA-Unet model is barely disturbed by factors such as water stains and grooving in the image, effectively detects the cracks, and produces recognition results with good integrity.
For an image with a complex background and multiple interferences such as grooving and glue accumulation, such as image 4 in fig. 8, the FCN, DeepLab v3 and Unet models can only detect cracks with obvious features; the cracks they identify are discontinuous, and fine portions of the same crack are not effectively identified. Specifically, the cracks detected by FCN and DeepLab v3 are both discontinuous and incomplete; the cracks detected by Unet are relatively complete, but the Unet model cannot detect the fine cracks against the background; in contrast, GA-Unet completely detects both the cracks with obvious features and the fine cracks against the background.
Overall, GA-Unet adapts well to different interference scenarios, so it can effectively extract cracks from images of different airport pavement scenes; compared with the classical FCN, DeepLab v3 and Unet, the cracks identified by GA-Unet are more complete and closer to reality.
Table 1 comparison of crack identification properties
[table content provided as an image in the source]
The recognition effects and recognition performance of the different models on crack sealing in the airport pavement are compared in fig. 9 and table 2. The data in table 2 show that the deep learning models trained with the Focal loss function recognize airport pavement crack sealing better; the GA-Unet model trained with the Focal loss function reaches a precision, recall and F1 value of 77.06%, 82.50% and 79.68%, respectively, and by the overall evaluation indexes its recognition performance is better than that of the other models. As seen in the row 1 image of fig. 9, for an image with obvious crack-sealing features, FCN, DeepLab v3, Unet and GA-Unet can all effectively identify the crack sealing, but the FCN, DeepLab v3 and Unet models mask the sealing features to different degrees while identifying it, so the crack sealing in the image is not well extracted. For the row 2 image of fig. 9, which suffers some interference and has poor image quality, FCN suffers less interference but cannot completely identify the crack-sealing area; DeepLab v3 identifies the area relatively completely but suffers greater interference, so the identified area is too large; Unet is also disturbed to some extent, so its identified area is inaccurate; GA-Unet both identifies the crack-sealing area completely and suffers the least interference. For an image with extremely inconspicuous crack-sealing features (row 3 of fig. 9), FCN can hardly identify the crack sealing at all; although DeepLab v3 and Unet extract it to some extent, their results are incomplete and far from the actual situation; GA-Unet, by contrast, extracts the crack sealing completely, and its result is closest to the actual distribution. In summary, whether or not the crack-sealing features in an image are obvious, the GA-Unet model effectively extracts the crack sealing, and its recognition effect is noticeably better than that of the other three models.
Table 2 comparison of crack sealing identification performance
[table content provided as an image in the source]
Fig. 10 and table 3 compare the effects and performance of the different deep learning models in recognizing plate seams in the airport pavement. The data in table 3 show that the GA-Unet trained with the Focal loss function has the best plate seam recognition performance, with a precision, recall and F1 value of 88.65%, 85.06% and 86.82%, respectively, and an overall recognition effect clearly better than that of the other models. As the row 1 results in fig. 10 show, for an image with obvious plate seam features and little interference, the FCN, DeepLab v3, Unet and GA-Unet models can all identify the plate seams effectively and completely. For the row 2 image of fig. 10, where the plate seams are surrounded by interference such as water stains, the recognition results show that FCN performs worst and cannot completely identify the plate seams in this scene; DeepLab v3 is also affected to some extent, and the plate seams it identifies are not only curved but also disturbed by the water-stain edges, so its effect is not ideal; the Unet model is hardly affected by the water-stain edges, but the plate seams it identifies are discontinuous; the GA-Unet model is barely disturbed by external factors, and the plate seams it identifies have good continuity and integrity. For images with complex backgrounds (row 3 of fig. 10), no model can identify the plate seams completely, but relative to the FCN, DeepLab v3 and Unet models, GA-Unet identifies the plate seams relatively completely and continuously despite the complex image background.
In summary, whatever interference exists in the acquired airport pavement image, the GA-Unet model can effectively extract the plate seam features in the image and achieve a better plate seam recognition effect.
Table 3 comparison of plate seam identification performance
[table content provided as an image in the source]
The invention provides a new image semantic segmentation model, GA-Unet. The GFM module enriches the input information of the model, increasing the feature information available to it; the MRCM and CAM modules then make effective use of the features at each level of the network, improving the model's feature recognition capability. During training, the Focal loss function makes the model pay more attention to hard-to-classify small-sample features such as cracks in images, further improving its recognition of cracks, crack sealing and plate seams. The experimental results show that deep learning image recognition technology can be applied to the field of airport pavement image recognition, providing a theoretical basis for airport pavement managers to reduce workload and to maintain and manage airport pavement scientifically. While the invention has been described with reference to the above embodiments, it will be understood by those skilled in the art that various changes and modifications may be made to the above embodiments without departing from the spirit and scope of the invention.

Claims (4)

1. A method for identifying airport pavement defects based on GA-Unet is characterized by comprising the following steps:
s1: the method comprises the steps of collecting airport pavement apparent image data by using a robot platform carrying area array camera, and preprocessing an image after collection is completed, wherein the implementation mode of the image preprocessing is as follows:
airport pavement image illumination may be expressed as:
I(x,y)=b(x,y)+g(x,y)
where I(x, y) represents the total illumination distribution of the airport pavement image, b(x, y) represents the background light intensity distribution in the airport pavement image, and g(x, y) represents the light intensity distribution of the spotlight when the airport pavement image is acquired;
after obtaining the distribution characteristics of the pixel values of the airport pavement images, randomly selecting a plurality of images to obtain the average pixel value distribution of the images, and calculating the average pixel value distribution of the images by the following calculation method:
Ī(x, y) = (1/N) · Σ_{i=1..N} I_i(x, y)
where Ī(x, y) represents the obtained average pixel distribution matrix of the selected images, N represents the number of selected images, and I_i(x, y) represents the pixel value distribution matrix of the i-th image;
after calculating the average pixel distribution matrix, an illumination compensation coefficient matrix is designed according to the average pixel distribution, calculated as follows:
[equation image not reproduced in the source]
where M_p(x, y) represents the resulting illumination compensation coefficient matrix.
After the coefficient matrix is obtained, multiplying the corresponding position elements of the original image matrix by the coefficient matrix to obtain an illumination compensated image, wherein the specific calculation flow is as follows:
I_n(x, y) = α · (I(x, y) ⊙ M_p(x, y))
where I_n(x, y) represents the illumination-compensation result image of the original image I(x, y); α represents the illumination gain coefficient of the image, whose value usually ranges between 0.90 and 1.20; and ⊙ represents the operation of multiplying corresponding position elements of the matrices;
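A NumPy sketch of this S1 preprocessing. The equation image defining M_p(x, y) is not reproduced in the source, so the coefficient matrix below is an assumption: the global mean of the average-pixel matrix divided pointwise by that matrix, a common normalization that brightens dark regions and dims bright ones.

```python
import numpy as np

def illumination_compensate(images, alpha=1.0):
    """Illumination compensation for grey-scale pavement images.

    images : list of 2-D arrays with identical shape
    alpha  : illumination gain coefficient (0.90-1.20 per the text)
    """
    avg = np.mean(np.stack(images), axis=0)   # average pixel distribution matrix
    # Assumed form of M_p(x, y): mean brightness over local brightness.
    m_p = avg.mean() / np.maximum(avg, 1e-6)
    # Element-wise (Hadamard) product, with the gain alpha applied.
    return [alpha * img * m_p for img in images]
```

On uniformly lit images the coefficient matrix is all ones and the images pass through unchanged (for α = 1), while a persistently dark corner across the sampled images is scaled up.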
s2: after the airport pavement apparent image data acquisition is completed, pixel-level labeling is carried out on pavement defects and pavement features existing in the images, an airport pavement defect image data set is constructed, and the data set is randomly divided into a training set, a verification set and a test set and is used for training a deep learning model;
s3: constructing a multi-scale residual convolution module, wherein the multi-scale residual convolution module is realized by the following steps:
[equation images defining the two branch outputs y_1 and y_2 are not reproduced in the source]
y_MRCM = y_1 + y_2
where x_in represents the input data of the module; y_1 and y_2 represent the outputs of the two branches; y_MRCM represents the output of the whole multi-scale residual convolution module; Conv represents convolution operations, whose subscripts denote convolution kernels of different scales; BatchNorm represents the batch normalization operation; and ReLU represents the ReLU activation function;
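The branch equations are equation images in the source, so the sketch below assumes each branch is a convolution at a different kernel scale followed by ReLU (batch normalization omitted for brevity), with the branch outputs summed as y_MRCM = y_1 + y_2:

```python
import numpy as np

def conv2d(x, k):
    """'Same'-padded 2-D convolution of a single-channel map x with kernel k."""
    kh, kw = k.shape
    xp = np.pad(x, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    out = np.zeros_like(x, dtype=float)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = np.sum(xp[i:i + kh, j:j + kw] * k)
    return out

def mrcm(x, k3, k5):
    """Multi-scale residual convolution sketch with a 3x3 and a 5x5 branch."""
    y1 = np.maximum(conv2d(x, k3), 0.0)  # 3x3 branch + ReLU
    y2 = np.maximum(conv2d(x, k5), 0.0)  # 5x5 branch + ReLU
    return y1 + y2                       # y_MRCM = y_1 + y_2

x = np.ones((4, 4))
delta3 = np.zeros((3, 3)); delta3[1, 1] = 1.0  # identity 3x3 kernel
delta5 = np.zeros((5, 5)); delta5[2, 2] = 1.0  # identity 5x5 kernel
y = mrcm(x, delta3, delta5)                    # both branches pass x through
```

With identity kernels each branch reproduces the input, so the output is 2·x; in the real module the kernels are learned and the two scales capture cracks of different widths.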
s4: the Gabor filter transformation module is constructed, and the principle of the Gabor filter transformation module is as follows:
the complex expression of the two-dimensional Gabor function is:
g(x, y; λ, θ, ψ, σ, γ) = exp(−(x′² + γ²·y′²) / (2σ²)) · exp(i(2π·x′/λ + ψ))
the real part expression of the two-dimensional Gabor function is:
g_real(x, y; λ, θ, ψ, σ, γ) = exp(−(x′² + γ²·y′²) / (2σ²)) · cos(2π·x′/λ + ψ)
the imaginary part expression of the two-dimensional Gabor function is:
g_imag(x, y; λ, θ, ψ, σ, γ) = exp(−(x′² + γ²·y′²) / (2σ²)) · sin(2π·x′/λ + ψ)
wherein x 'and y' are obtained by:
x′=xcosθ+ysinθ
y′=-xsinθ+ycosθ
in the above equations, λ represents the wavelength of the cosine factor of the Gabor kernel function; θ represents the orientation of the parallel stripes of the Gabor filter kernel; ψ represents the phase parameter of the cosine factor; γ is the aspect-ratio parameter of the filter kernel, determining the shape of the Gabor filter kernel; and σ represents the standard deviation of the Gaussian factor of the Gabor function. λ, σ and the bandwidth b are related as follows:
b = log₂[ ((σ/λ)·π + √(ln 2 / 2)) / ((σ/λ)·π − √(ln 2 / 2)) ]
σ/λ = (1/π) · √(ln 2 / 2) · (2^b + 1) / (2^b − 1)
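The two-dimensional Gabor kernel above can be generated directly from its real and imaginary parts; the kernel size and parameter values below are illustrative, not taken from the patent:

```python
import numpy as np

def gabor_kernel(ksize=21, lam=8.0, theta=0.0, psi=0.0, sigma=4.0, gamma=0.5):
    """Real and imaginary parts of a 2-D Gabor kernel of size ksize x ksize."""
    half = ksize // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    xp = x * np.cos(theta) + y * np.sin(theta)    # x' = x cos(theta) + y sin(theta)
    yp = -x * np.sin(theta) + y * np.cos(theta)   # y' = -x sin(theta) + y cos(theta)
    envelope = np.exp(-(xp ** 2 + (gamma ** 2) * yp ** 2) / (2.0 * sigma ** 2))
    real = envelope * np.cos(2.0 * np.pi * xp / lam + psi)  # real part
    imag = envelope * np.sin(2.0 * np.pi * xp / lam + psi)  # imaginary part
    return real, imag

real, imag = gabor_kernel()
```

Convolving the input image with a bank of such kernels at several orientations θ is what lets the GFM module emphasize the line-like texture of cracks and plate seams before the encoder sees the image.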
s5: constructing a channel attention module, and calculating a formula by the channel attention module:
y_CAM = L·f(L, H) + H
where y_CAM represents the final output result of the channel attention module; L represents a shallow feature map output by the encoder; H represents a deep feature map of the same size output by the decoder; and f(L, H) represents the result obtained by cascading the shallow feature map output by the encoder with the deep feature map output by the decoder and passing it through global average pooling, a 1×1 convolution, a ReLU activation function, a 1×1 convolution and a Sigmoid activation function;
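A NumPy sketch of y_CAM = L·f(L, H) + H. After global average pooling the 1×1 convolutions act on a channel vector, so they reduce to matrix products here; the weight matrices w1 and w2 are placeholders for learned parameters:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def channel_attention(L, H, w1, w2):
    """Channel attention sketch.

    L, H   : (C, height, width) shallow / deep feature maps
    w1, w2 : 1x1-convolution weights with shapes (C_mid, 2C) and (C, C_mid)
    """
    cat = np.concatenate([L, H], axis=0)  # cascade along the channel axis
    gap = cat.mean(axis=(1, 2))           # global average pooling -> (2C,)
    mid = np.maximum(w1 @ gap, 0.0)       # 1x1 conv + ReLU
    att = sigmoid(w2 @ mid)               # 1x1 conv + Sigmoid -> (C,)
    f = att[:, None, None]                # per-channel weights f(L, H)
    return L * f + H                      # y_CAM = L * f(L, H) + H

C, C_mid = 2, 4
shallow = np.ones((C, 3, 3))
deep = np.ones((C, 3, 3))
w1 = np.zeros((C_mid, 2 * C))  # placeholder learned weights
w2 = np.zeros((C, C_mid))
y = channel_attention(shallow, deep, w1, w2)  # attention = sigmoid(0) = 0.5
```

The deep map H always passes through unchanged; the attention vector only gates how much of the shallow encoder detail is mixed back in at each channel.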
s6: constructing a GA-Unet model by using the constructed residual convolution module, the channel attention module, the Gabor filter transformation module, the deconvolution module and the image maximum pooling downsampling;
s7: optimizing model parameters by using a focus loss function;
s8: setting a proper model quantification evaluation mode, and quantitatively evaluating the model performance by adopting precision, recall rate and F1 value;
s9: inputting experimental data into a model, training the model by using data in a training set, evaluating the generalization performance of the model by using a verification set and a test set in the training process, calculating an F1 value of the model according to the precision and the recall rate, and selecting a weight parameter corresponding to the model with the highest F1 value as a trained model;
s10: and detecting the actual airport pavement image by using the trained model, realizing image pixel-level segmentation detection, and evaluating the model from the detection effects on different images.
2. A method for identifying airport pavement defects based on GA-Unet as set forth in claim 1, wherein in step S1 the apparent airport pavement image data are collected by a robot platform carrying an area-array camera, and the collected data are grey-scale images with a size of 1800 × 900.
3. A method for identifying airport pavement defects based on GA-Unet as set forth in claim 1, wherein the optimization of model parameters by focus loss function in step S7 is as follows:
FL(p_k) = −α(1 − p_k)^γ · lg(p_k),
where α is a constant that is decreased slightly as γ increases, (1 − p_k)^γ is the modulation factor, and p_k represents the probability that the predicted sample belongs to the k-th category.
4. The method for identifying airport pavement defects based on GA-Unet of claim 1, wherein the evaluation scheme in step S8 quantitatively evaluates the algorithm model using precision, recall and F1 value, calculated as follows:
Precision = TP / (TP + FP)
Recall = TP / (TP + FN)
F1 = 2 × Precision × Recall / (Precision + Recall)
in the formula, TP represents the number of correctly detected pixels in the crack region, FP represents the number of pixels in the non-crack region predicted as the crack pixels, and FN represents the number of pixels in the crack region not detected.
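The three formulas reduce to pixel counting on binary masks; a minimal sketch using the TP/FP/FN definitions above:

```python
import numpy as np

def pixel_metrics(pred, truth):
    """Precision, recall and F1 for binary crack masks (1 = crack pixel)."""
    pred = pred.astype(bool)
    truth = truth.astype(bool)
    tp = np.sum(pred & truth)    # crack pixels correctly detected
    fp = np.sum(pred & ~truth)   # non-crack pixels predicted as crack
    fn = np.sum(~pred & truth)   # crack pixels missed
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2.0 * precision * recall / (precision + recall)
    return precision, recall, f1

pred = np.array([[1, 1], [0, 0]])
truth = np.array([[1, 0], [1, 0]])
p, r, f = pixel_metrics(pred, truth)
```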
CN202310386523.3A 2023-04-12 2023-04-12 Airport pavement defect identification method based on GA-Unet Pending CN116433629A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310386523.3A CN116433629A (en) 2023-04-12 2023-04-12 Airport pavement defect identification method based on GA-Unet


Publications (1)

Publication Number Publication Date
CN116433629A true CN116433629A (en) 2023-07-14

Family

ID=87079278

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310386523.3A Pending CN116433629A (en) 2023-04-12 2023-04-12 Airport pavement defect identification method based on GA-Unet

Country Status (1)

Country Link
CN (1) CN116433629A (en)


Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117291913A (en) * 2023-11-24 2023-12-26 长江勘测规划设计研究有限责任公司 Apparent crack measuring method for hydraulic concrete structure
CN117291913B (en) * 2023-11-24 2024-04-16 长江勘测规划设计研究有限责任公司 Apparent crack measuring method for hydraulic concrete structure
CN117893543A (en) * 2024-03-18 2024-04-16 辽宁云也智能信息科技有限公司 Visual-assistance-based pavement crack intelligent detection method and system
CN117893543B (en) * 2024-03-18 2024-05-10 辽宁云也智能信息科技有限公司 Visual-assistance-based pavement crack intelligent detection method and system


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination