CN117788471A - Method for detecting and classifying aircraft skin defects based on YOLOv5 - Google Patents

Method for detecting and classifying aircraft skin defects based on YOLOv5 Download PDF

Info

Publication number
CN117788471A
CN117788471A (application CN202410211173.1A)
Authority
CN
China
Prior art keywords
model
yolov5
aircraft skin
interpolation
original
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202410211173.1A
Other languages
Chinese (zh)
Other versions
CN117788471B (en)
Inventor
易程
汪俊
吴巧云
刘程子
张沅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Aeronautics and Astronautics
Original Assignee
Nanjing University of Aeronautics and Astronautics
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Aeronautics and Astronautics filed Critical Nanjing University of Aeronautics and Astronautics
Priority to CN202410211173.1A priority Critical patent/CN117788471B/en
Publication of CN117788471A publication Critical patent/CN117788471A/en
Application granted granted Critical
Publication of CN117788471B publication Critical patent/CN117788471B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Image Analysis (AREA)

Abstract

The invention relates to a method for detecting and classifying aircraft skin defects based on YOLOv5. The method comprises: collecting aircraft skin damage pictures and marking anchor frames on them; classifying the pictures according to damage type and dividing them into a training set and a testing set; performing information interpolation with a bilinear interpolation algorithm to enhance the training set data; constructing and optimizing a YOLOv5 model; inputting the enhanced training set data into the optimized YOLOv5 model to train it, obtaining the trained weight file and saving the best model; applying the trained weight file to detection on the test set pictures; and visualizing the detection results through PySide6. The invention introduces a new attention mechanism in the backbone layer of the backbone network that encodes horizontal and vertical spatial position information into the channel attention, improving the model's ability to detect small objects and to accurately locate detected objects when objects are too dense.

Description

Method for detecting and classifying aircraft skin defects based on YOLOv5
Technical Field
The invention relates to the technical field of aircraft skin defects, and in particular to a method for detecting and classifying aircraft skin defects based on YOLOv5.
Background
The aircraft skin is a covering layer positioned outside the aircraft structural frame, not only forms the appearance of the aircraft and protects internal components, but also has key effects on the aspects of aerodynamic performance, protection, stealth performance, maintainability, weight, appearance quality and the like of the aircraft. Its design and material selection requires balancing a number of factors to ensure that the aircraft performs well under a variety of operating conditions.
When the aircraft skin is damaged, the strength of the affected parts is reduced, the aerodynamic performance of the aircraft is degraded, and ultimately flight safety is endangered. Aircraft skin damage falls into several types, namely cracks, corrosion and impact damage. When damage occurs, it can be inspected by an unmanned aerial vehicle or a 3D scanner and transmitted to maintenance personnel as video, from which the personnel judge which type of skin damage is present. In order to reduce misjudgment by maintenance personnel, improve detection accuracy and guide workers to repair damaged areas in real time, a target detection algorithm is proposed.
Mainstream target detection algorithms can be divided into two categories: one-stage and two-stage detection algorithms. Two-stage algorithms, including Faster R-CNN and Mask R-CNN, achieve accurate classification and high detection accuracy, but low detection speed. One-stage algorithms, including the YOLO and SSD algorithms, offer high detection speed at somewhat lower accuracy. YOLOv5, the fifth generation of the YOLO algorithm, combines fast detection with high detection accuracy, but its performance is poor when objects are too dense.
Therefore, it is necessary to design an optimized YOLOv5 algorithm model to improve the ability of the model to detect small objects and to accurately locate the detected objects when the objects are too dense.
Disclosure of Invention
Aiming at the deficiencies of the prior art, the invention provides a method for detecting and classifying aircraft skin defects based on YOLOv5, which solves the traditional methods' poor ability to detect small objects and poor performance when objects are too dense. The interpolation method based on depth information ensures that both near and distant objects are properly considered during interpolation, so that more accurate and realistic multi-view interpolation images are generated. By optimizing the YOLOv5 model, a new attention mechanism is introduced in the backbone layer of the backbone network: it encodes horizontal and vertical spatial position information into the channel attention, so that a mobile network can attend to position information over a large range and capture long-range spatial interactions with accurate positions without incurring excessive computation; global average pooling is decomposed, improving the detection precision of the small model and the network's attention to channels. The model's ability to detect small objects is thus improved, detected objects can be accurately located when objects are too dense, and, applied to aircraft skin damage detection, different types of skin damage can be identified in real time.
In order to solve the technical problems, the invention provides the following technical scheme: a method for detecting and classifying skin defects of an aircraft based on YOLOv5, comprising the steps of:
s1, acquiring an original image of aircraft skin damage, classifying the acquired image according to different damage, and dividing the classified image into a training set and a testing set;
s2, performing information interpolation on the collected damaged pictures of the training sets with different angles by using a bilinear interpolation algorithm, and further obtaining an optimized image so as to enhance the training set data;
s3, optimizing the original YOLOv5 model to obtain an optimized YOLOv5 model;
s4, inputting the acquired enhanced training set data into an optimized YOLOv5 model to obtain a trained weight file;
s5, applying the trained weight file to picture detection of the test set, and visualizing the detection result through PySide6.
Further, step S1 specifically includes: classifying the original images of aircraft skin damage into corrosion, impact and crack damage images, recording the order of the categories in a TXT file, and marking the anchor frames with the LabelImg target detection annotation tool.
Further, in step S2, the specific process includes the following steps:
s21, performing interpolation twice in the x direction to obtain the interpolated pixel values $f(R_1)$ and $f(R_2)$; the formulas for the two interpolations in the x direction are:

$f(R_1) \approx \dfrac{x_2-x}{x_2-x_1} f(Q_{11}) + \dfrac{x-x_1}{x_2-x_1} f(Q_{21}),\qquad R_1=(x,\,y_1)$

$f(R_2) \approx \dfrac{x_2-x}{x_2-x_1} f(Q_{12}) + \dfrac{x-x_1}{x_2-x_1} f(Q_{22}),\qquad R_2=(x,\,y_2)$

wherein $Q_{11}(x_1,y_1)$, $Q_{12}(x_1,y_2)$, $Q_{21}(x_2,y_1)$, $Q_{22}(x_2,y_2)$ are four points in the x-y plane rectangular coordinate system;

s22, performing interpolation once in the y direction to obtain the pixel value in the y direction; the interpolation formula in the y direction is:

$f(P) \approx \dfrac{y_2-y}{y_2-y_1} f(R_1) + \dfrac{y-y_1}{y_2-y_1} f(R_2)$

s23, substituting the two formulas of S21 into the formula of S22 yields the high-quality, high-resolution multi-view interpolation image:

$f(x,y) \approx \dfrac{(x_2-x)(y_2-y)\,f(Q_{11}) + (x-x_1)(y_2-y)\,f(Q_{21}) + (x_2-x)(y-y_1)\,f(Q_{12}) + (x-x_1)(y-y_1)\,f(Q_{22})}{(x_2-x_1)(y_2-y_1)}$
further, in step S3, the optimizing the original YOLOv5 model includes the following steps:
s31, introducing a new coordinate attention (CA) mechanism in the Backbone network layer. CA splits the pooling of the input feature map into two independent directions, x and y, and encodes horizontal and vertical spatial position information into the channel attention; the horizontal and vertical features are encoded with pooling kernels of size (1, W) and (H, 1) respectively, so the output of the c-th channel at height h is:

$z_c^{h}(h) = \dfrac{1}{W}\sum_{0 \le i < W} x_c(h,\,i)$

similarly, the output of the c-th channel at width w is:

$z_c^{w}(w) = \dfrac{1}{H}\sum_{0 \le j < H} x_c(j,\,w)$

wherein $H$ is the height of the feature map, $W$ is the width of the feature map, $x_c$ is the input feature of the c-th channel, and $(h, i)$, $(j, w)$ are known coordinates.

s32, the two formulas above aggregate features along the two different directions; the resulting pair of direction-aware feature maps $z^h$ and $z^w$ are concatenated into a joint feature map $F_1$, which is transformed by a convolution change function to obtain an intermediate feature map $f$ containing both horizontal and vertical spatial information:

$f = \delta\!\left(\mathrm{Conv}_{1\times1}(F_1)\right),\qquad F_1 = [z^{h},\,z^{w}],\qquad f \in \mathbb{R}^{C/r \times (H+W)}$

wherein $\delta$ is a nonlinear activation function and $r$ is a reduction factor;

s33, encoding the information in the x and y directions into separate attention maps along the spatial dimension: the intermediate feature map $f$ is split into two direction tensors $f^{h} \in \mathbb{R}^{C/r \times H}$ and $f^{w} \in \mathbb{R}^{C/r \times W}$, and the final height and width attention weights $g^{h}$ and $g^{w}$ are obtained through a sigmoid activation function:

$g^{h} = \sigma\!\left(F_h(f^{h})\right),\qquad g^{w} = \sigma\!\left(F_w(f^{w})\right)$

wherein $F_h$ and $F_w$ are $1\times1$ convolutions and $\sigma$ is the sigmoid activation function;

s34, the CA attention mechanism formula can then be expressed as:

$y_c(i,\,j) = x_c(i,\,j) \times g_c^{h}(i) \times g_c^{w}(j)$
further, in step S3, optimizing the original YOLOv5 model further includes: replacing the NMS algorithm in the original model with the DIoU-NMS algorithm. DIoU-NMS considers not only IoU but also the distance between the centers of two boxes; if the center distance between two boxes is large, the algorithm treats them as boxes of two different objects, so neither is filtered out. To optimize the NMS in the model, IoU in the original NMS is simply replaced with DIoU, whose formula is:

$\mathrm{DIoU} = \mathrm{IoU} - \dfrac{\rho^{2}\!\left(b,\,b^{gt}\right)}{c^{2}}$

wherein $b$ and $b^{gt}$ represent the center points of the predicted and ground-truth boxes respectively, $\rho$ denotes the Euclidean distance between the two center points, and $c$ represents the diagonal length of the minimum closure area that contains both the predicted and ground-truth boxes; the formula of DIoU-NMS is:

$s_i = \begin{cases} s_i, & \mathrm{IoU} - R_{\mathrm{DIoU}}(M,\,B_i) < \varepsilon \\ 0, & \mathrm{IoU} - R_{\mathrm{DIoU}}(M,\,B_i) \ge \varepsilon \end{cases}$

wherein $s_i$ is the classification confidence; $M$ is the highest-scoring bounding box among the prediction boxes; $B_i$ traverses the remaining high-confidence boxes whose overlap with $M$ is examined; $\varepsilon$ is the DIoU-NMS threshold; and $R_{\mathrm{DIoU}} = \rho^{2}(b,\,b^{gt})/c^{2}$ is the penalty term.
Further, in step S3, the optimizing the original YOLOv5 model further includes: changing the loss function in the original model to CIoU, wherein the loss function is increased in loss function of detection scale based on DIoU, and increased in loss function of length and width, and the difference of center distance, width and height of the boundary frame is considered, so that regression of the target frame is more stable, and the formula is as follows:
wherein a, B can be seen as two sets;is a weight function; />To measure the similarity of aspect ratios.
Further, step S4 specifically includes: placing the optimized YOLOv5 model into a configured Python environment, training it with the annotated pictures of the data-enhanced training set, and accelerating training with a GPU.
By means of the technical scheme, the method for detecting the defects and classifying the aircraft skin based on the YOLOv5 has the following beneficial effects:
the interpolation method based on depth information ensures that both near and distant objects are properly considered during interpolation, generating more accurate and realistic multi-view interpolation images and enhancing the training set data. The enhanced training set data are fed into the optimized YOLOv5 model, in which a new attention mechanism is introduced in the backbone layer. It encodes horizontal and vertical spatial position information into the channel attention, so that a mobile network can attend to position information over a large range and capture long-range spatial interactions with accurate positions, without incurring excessive computation. Global average pooling is decomposed, improving the detection precision of the small model and the network's attention to channels. As a result, the model's ability to detect small objects is improved, detected objects can be accurately located even when they are too dense, and, applied to aircraft skin damage detection, different types of skin damage can be identified in real time with both high detection speed and high detection accuracy, solving the problem of poor model performance when objects are too dense.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiments of the application and together with the description serve to explain the application and do not constitute an undue limitation to the application. In the drawings:
FIG. 1 is a flowchart of a method for detecting and classifying skin defects of an aircraft based on YOLOv 5.
FIG. 2 is a schematic diagram of a specific algorithm model modified by Yolov 5.
Detailed Description
In order that the above-recited objects, features and advantages of the present invention will become more readily apparent, a more particular description of the invention is given below with reference to the accompanying drawings and the detailed embodiments, so that how the technical means are applied to solve the technical problems and achieve the technical effects can be fully understood and implemented.
Those of ordinary skill in the art will appreciate that all or a portion of the steps in a method of implementing an embodiment described above may be implemented by a program to instruct related hardware, and thus the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
Fig. 1 and 2 show a specific implementation of the present embodiment. In this embodiment, aircraft skin damage pictures are first collected and anchor frames are marked on them; the pictures are classified according to damage type and divided into a training set and a testing set; information interpolation is performed with a bilinear interpolation algorithm to enhance the training set data; a YOLOv5 model is constructed and optimized; the enhanced training set data are input into the optimized YOLOv5 model to train it, the trained weight file is obtained and the best model is saved; the trained weight file is applied to detection on the test set pictures; and the detection results are visualized through PySide6. The interpolation method based on depth information ensures that both near and distant objects are properly considered during interpolation, generating more accurate and realistic multi-view interpolation images. By optimizing the YOLOv5 model, a new attention mechanism is introduced in the backbone layer of the backbone network; it encodes horizontal and vertical spatial position information into the channel attention, so that a mobile network can attend to position information over a large range and capture long-range spatial interactions with accurate positions without excessive computation; global average pooling is decomposed, improving the detection precision of the small model and the network's attention to channels. The model's ability to detect small objects is thus improved, detected objects can be accurately located when objects are too dense, and, applied to aircraft skin damage detection, different types of skin damage can be identified in real time.
Referring to fig. 1, the present embodiment provides a method for detecting defects and classification of an aircraft skin based on YOLOv5, which includes the following steps:
s1, acquiring an original image of aircraft skin damage, classifying the acquired image according to different damage, and dividing the classified images into a training set and a testing set;
as a preferred embodiment of step S1, S1 specifically includes: classifying the original images of aircraft skin damage into corrosion, impact and crack damage images, marking anchor frames, and recording the order of the categories in a TXT file for use by the YOLOv5 model. An anchor frame is a reference frame preset on the image with different sizes and different aspect ratios. The anchor frames are annotated with the LabelImg target detection annotation tool.
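For illustration, YOLO-style txt annotations such as those produced by LabelImg store one object per line as `class x_center y_center width height`, with the four box values normalized by the image size. The following minimal parsing sketch is hypothetical (the class id and the 640×640 image size are assumptions for illustration, not taken from the patent):

```python
def parse_yolo_label(line: str):
    """Parse one line of a YOLO-format .txt annotation:
    <class_id> <x_center> <y_center> <width> <height>, normalized to [0, 1]."""
    parts = line.split()
    cls = int(parts[0])
    x, y, w, h = (float(v) for v in parts[1:5])
    return cls, (x, y, w, h)

def to_pixel_box(box, img_w, img_h):
    """Convert a normalized (xc, yc, w, h) box to pixel corners (x1, y1, x2, y2)."""
    xc, yc, w, h = box
    return (
        (xc - w / 2) * img_w,
        (yc - h / 2) * img_h,
        (xc + w / 2) * img_w,
        (yc + h / 2) * img_h,
    )

# hypothetical label line (class 2 could stand for "crack" in the recorded order)
cls, box = parse_yolo_label("2 0.5 0.5 0.25 0.125")
print(cls, to_pixel_box(box, 640, 640))  # 2 (240.0, 280.0, 400.0, 360.0)
```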
S2, performing information interpolation on the collected training-set damage pictures taken from different angles using a bilinear interpolation algorithm, obtaining optimized, high-quality and high-resolution multi-view interpolation pictures, thereby enhancing the training set data;
as a preferred embodiment of step S2, the specific procedure of S2 includes: the bilinear interpolation algorithm for depth information in multi-view interpolation uses the gray values or RGB tristimulus values of adjacent pixels to generate the gray value or RGB tristimulus value of an unknown pixel, the aim being to generate a higher-resolution image from the original image. Assume the target is the pixel value of the function at point $Q(x, y)$, and that the points $Q_{11}(x_1,y_1)$, $Q_{12}(x_1,y_2)$, $Q_{21}(x_2,y_1)$, $Q_{22}(x_2,y_2)$ and their corresponding pixel values are known, wherein $Q_{11}$, $Q_{12}$, $Q_{21}$, $Q_{22}$ are four points in the x-y plane rectangular coordinate system.
S21, performing interpolation twice in the x direction to obtain the interpolated pixel values $f(R_1)$ and $f(R_2)$; the formulas for the two interpolations in the x direction are:

$f(R_1) \approx \dfrac{x_2-x}{x_2-x_1} f(Q_{11}) + \dfrac{x-x_1}{x_2-x_1} f(Q_{21}),\qquad R_1=(x,\,y_1)$

$f(R_2) \approx \dfrac{x_2-x}{x_2-x_1} f(Q_{12}) + \dfrac{x-x_1}{x_2-x_1} f(Q_{22}),\qquad R_2=(x,\,y_2)$

S22, performing interpolation once in the y direction to obtain the pixel value in the y direction; the interpolation formula in the y direction is:

$f(P) \approx \dfrac{y_2-y}{y_2-y_1} f(R_1) + \dfrac{y-y_1}{y_2-y_1} f(R_2)$

S23, substituting the two formulas of S21 into the formula of S22 yields the high-quality, high-resolution multi-view interpolation image:

$f(x,y) \approx \dfrac{(x_2-x)(y_2-y)\,f(Q_{11}) + (x-x_1)(y_2-y)\,f(Q_{21}) + (x_2-x)(y-y_1)\,f(Q_{12}) + (x-x_1)(y-y_1)\,f(Q_{22})}{(x_2-x_1)(y_2-y_1)}$
in the embodiment, the interpolation method based on the depth information ensures proper consideration of the short-distance and long-distance objects in the interpolation process, thereby generating more accurate and more real multi-view interpolation images.
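The two-step bilinear interpolation of steps S21-S23 can be sketched in a few lines of plain Python (a toy single-value example; real use would loop over all pixels of the upscaled image):

```python
def bilinear(x, y, q11, q12, q21, q22, x1, x2, y1, y2):
    """Bilinear interpolation of the pixel value at (x, y) from the four
    known corner values q11=f(x1,y1), q12=f(x1,y2), q21=f(x2,y1), q22=f(x2,y2)."""
    # two interpolations along the x direction (S21)
    r1 = (x2 - x) / (x2 - x1) * q11 + (x - x1) / (x2 - x1) * q21
    r2 = (x2 - x) / (x2 - x1) * q12 + (x - x1) / (x2 - x1) * q22
    # one interpolation along the y direction (S22)
    return (y2 - y) / (y2 - y1) * r1 + (y - y1) / (y2 - y1) * r2

# at the cell midpoint the result is the average of the four corners
print(bilinear(0.5, 0.5, 10, 20, 30, 40, 0, 1, 0, 1))  # 25.0
```

At a known corner the interpolation reproduces the corner value exactly, which is a quick sanity check on the formulas.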
S3, optimizing an original YOLOv5 model to obtain an optimized YOLOv5 model;
as a preferred embodiment of step S3, in S3, the optimizing the original YOLOv5 model specifically includes the following steps:
s31, introducing a new coordinate attention (CA) mechanism in the Backbone network layer. CA splits the pooling of the input feature map into two independent directions, x and y, and encodes horizontal and vertical spatial position information into the channel attention; the horizontal and vertical features are encoded with pooling kernels of size (1, W) and (H, 1) respectively, so the output of the c-th channel at height h is:

$z_c^{h}(h) = \dfrac{1}{W}\sum_{0 \le i < W} x_c(h,\,i)$

similarly, the output of the c-th channel at width w is:

$z_c^{w}(w) = \dfrac{1}{H}\sum_{0 \le j < H} x_c(j,\,w)$

wherein $H$ is the height of the feature map, $W$ is the width of the feature map, $x_c$ is the input feature of the c-th channel, and $(h, i)$, $(j, w)$ are known coordinates.

S32, the two formulas above aggregate features along the two different directions; the resulting pair of direction-aware feature maps $z^h$ and $z^w$ are concatenated into a joint feature map $F_1$, which is transformed by a convolution change function to obtain an intermediate feature map $f$ containing both horizontal and vertical spatial information:

$f = \delta\!\left(\mathrm{Conv}_{1\times1}(F_1)\right),\qquad F_1 = [z^{h},\,z^{w}],\qquad f \in \mathbb{R}^{C/r \times (H+W)}$

wherein $\delta$ is a nonlinear activation function and $r$ is a reduction factor;
more specifically, a Neck intermediate layer is added after the backbone network, as shown in fig. 2. The Neck layer is responsible for multi-scale feature fusion of the feature maps and passes the fused features to the Detect layer for detection and identification. Because the size and position of objects in an image are uncertain, a mechanism is needed to handle targets of different scales and sizes. In the Neck layer, feature maps of different levels are fused through upsampling and downsampling operations. The Neck layer comprises a top-down part and a bottom-up part; the top-down part fuses features of different levels by connecting and fusing upsampled maps with coarser feature maps, and mainly comprises the following steps:
s321, up-sampling a final layer of feature map of a backbone network to obtain a finer feature map;
s322, carrying out connection fusion on the up-sampled feature map and a feature map of the upper layer (a pair of feature maps with known directions obtained through a new attention mechanism CA) to obtain richer feature expression; the bottom-up part mainly uses convolution layers to connect and fuse characteristic graphs from different layers, and mainly comprises the following steps:
s323, convolving the bottom-most feature map of the top-down part to obtain richer feature expression;
s324, connecting and fusing the convolved feature map and the feature map of the upper layer to obtain richer feature expression;
s325, finally, the result is connected and fused with the convolved final-layer feature map of the backbone network, so that the output feature map of the backbone network, the feature maps of the top-down part and the feature maps of the bottom-up part are fused into the final feature maps used for target detection. This gives higher detection precision and mitigates the poor detection performance of the model when objects are too dense.
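The top-down/bottom-up fusion of steps S321-S325 can be illustrated with a deliberately simplified toy: feature maps are 1-D lists, upsampling is nearest-neighbor repetition, and elementwise addition stands in for the connect-and-fuse (concatenation + convolution) used in the real Neck. All names and values here are illustrative assumptions, not the patent's implementation:

```python
def upsample(feat):
    """Nearest-neighbor 2x upsampling of a 1-D toy feature map."""
    return [v for v in feat for _ in (0, 1)]

def downsample(feat):
    """2x downsampling by striding."""
    return feat[::2]

def fuse(a, b):
    """Connect-and-fuse two equal-size maps (elementwise add as a toy
    stand-in for channel concatenation followed by convolution)."""
    return [x + y for x, y in zip(a, b)]

# toy backbone outputs at three scales, coarse -> fine
c5, c4, c3 = [1.0, 1.0], [2.0] * 4, [3.0] * 8

# top-down path (S321-S322): upsample coarser maps and fuse with finer ones
p4 = fuse(upsample(c5), c4)
p3 = fuse(upsample(p4), c3)

# bottom-up path (S323-S325): downsample finer maps and fuse back up
n4 = fuse(downsample(p3), p4)
n5 = fuse(downsample(n4), c5)
print(len(p3), len(n4), len(n5))  # 8 4 2
```

The three output lengths mirror how the Neck emits fused maps at each scale for the Detect layer.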
S33, encoding the information in the x and y directions into separate attention maps along the spatial dimension: the intermediate feature map $f$ containing horizontal and vertical spatial information is split into two direction tensors $f^{h} \in \mathbb{R}^{C/r \times H}$ and $f^{w} \in \mathbb{R}^{C/r \times W}$, and the final height and width attention weights $g^{h}$ and $g^{w}$ are obtained through a sigmoid activation function:

$g^{h} = \sigma\!\left(F_h(f^{h})\right),\qquad g^{w} = \sigma\!\left(F_w(f^{w})\right)$

wherein $F_h$ and $F_w$ are $1\times1$ convolutions and $\sigma$ is the sigmoid activation function;

S34, the CA attention mechanism formula can then be expressed as:

$y_c(i,\,j) = x_c(i,\,j) \times g_c^{h}(i) \times g_c^{w}(j)$
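A minimal pure-Python sketch of the directional pooling (step S31) and the final reweighting (step S34) for a single-channel toy feature map. For simplicity it omits the 1×1 convolutions F1, Fh and Fw and applies the sigmoid directly to the pooled values, which is an assumption for illustration only:

```python
import math

def coordinate_pool(x):
    """Directional average pooling of step S31: x is an H x W feature map
    (nested lists, one channel). Returns (z_h, z_w): per-row averages
    (pooling kernel 1 x W) and per-column averages (pooling kernel H x 1)."""
    H, W = len(x), len(x[0])
    z_h = [sum(row) / W for row in x]                              # length H
    z_w = [sum(x[j][w] for j in range(H)) / H for w in range(W)]   # length W
    return z_h, z_w

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def apply_ca(x, g_h, g_w):
    """Step S34: y(i, j) = x(i, j) * g_h(i) * g_w(j)."""
    return [[x[i][j] * g_h[i] * g_w[j] for j in range(len(x[0]))]
            for i in range(len(x))]

feat = [[1.0, 3.0], [5.0, 7.0]]
z_h, z_w = coordinate_pool(feat)           # ([2.0, 6.0], [3.0, 5.0])
g_h = [sigmoid(v) for v in z_h]            # height attention weights
g_w = [sigmoid(v) for v in z_w]            # width attention weights
y = apply_ca(feat, g_h, g_w)               # reweighted feature map
```

Note how each output position is scaled by a row weight and a column weight, which is how the mechanism injects horizontal and vertical position information into the channel response.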
more specifically, in S3, optimizing the original YOLOv5 model further includes: replacing the non-maximum suppression (Non-Maximum Suppression, NMS) algorithm in the original model with the Distance-IoU non-maximum suppression (DIoU-NMS) algorithm. DIoU-NMS considers not only IoU but also the distance between the centers of two boxes; if the center distance between two boxes is large, the boxes are treated as belonging to two different objects and neither is filtered out. To optimize the NMS in the model, IoU in the original NMS is simply replaced with DIoU, whose formula is:

$\mathrm{DIoU} = \mathrm{IoU} - \dfrac{\rho^{2}\!\left(b,\,b^{gt}\right)}{c^{2}}$

wherein $b$ and $b^{gt}$ represent the center points of the predicted and ground-truth boxes respectively, $\rho$ denotes the Euclidean distance between the two center points, and $c$ represents the diagonal length of the minimum closure area that contains both the predicted and ground-truth boxes; the formula of DIoU-NMS is:

$s_i = \begin{cases} s_i, & \mathrm{IoU} - R_{\mathrm{DIoU}}(M,\,B_i) < \varepsilon \\ 0, & \mathrm{IoU} - R_{\mathrm{DIoU}}(M,\,B_i) \ge \varepsilon \end{cases}$

wherein $s_i$ is the classification confidence; $M$ is the highest-scoring bounding box among the prediction boxes; $B_i$ traverses the remaining high-confidence boxes whose overlap with $M$ is examined; $\varepsilon$ is the DIoU-NMS threshold; and $R_{\mathrm{DIoU}} = \rho^{2}(b,\,b^{gt})/c^{2}$ is the penalty term.
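The DIoU-NMS procedure described above can be sketched in pure Python. Boxes are given as (x1, y1, x2, y2) corner tuples and the threshold value is an illustrative assumption; the patent's model of course operates on YOLOv5 prediction tensors rather than plain tuples:

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def diou_penalty(a, b):
    """R_DIoU = rho^2 / c^2: squared center distance over squared diagonal
    of the smallest box enclosing both boxes."""
    ax, ay = (a[0] + a[2]) / 2, (a[1] + a[3]) / 2
    bx, by = (b[0] + b[2]) / 2, (b[1] + b[3]) / 2
    rho2 = (ax - bx) ** 2 + (ay - by) ** 2
    cx1, cy1 = min(a[0], b[0]), min(a[1], b[1])
    cx2, cy2 = max(a[2], b[2]), max(a[3], b[3])
    c2 = (cx2 - cx1) ** 2 + (cy2 - cy1) ** 2
    return rho2 / c2

def diou_nms(boxes, scores, eps=0.5):
    """Suppress box i only if IoU(M, B_i) - R_DIoU(M, B_i) >= eps."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        m = order.pop(0)
        keep.append(m)
        order = [i for i in order
                 if iou(boxes[m], boxes[i]) - diou_penalty(boxes[m], boxes[i]) < eps]
    return keep

boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(diou_nms(boxes, scores))  # [0, 2]: the near-duplicate second box is suppressed
```

Because the penalty subtracts a center-distance term, two heavily overlapping boxes whose centers are far apart survive suppression, which is exactly the dense-object behavior the patent targets.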
More specifically, in S3, optimizing the original YOLOv5 model further includes: changing the loss function in the original model to CIoU. On the basis of DIoU, CIoU adds a loss term for the detection scale, i.e. for the width and height of the box; by considering more factors, namely the center distance and the width-height difference of the bounding boxes, the regression of the target box becomes more stable. The formulas are:

$\mathrm{IoU} = \dfrac{|A \cap B|}{|A \cup B|}$

$L_{\mathrm{CIoU}} = 1 - \mathrm{IoU} + \dfrac{\rho^{2}\!\left(b,\,b^{gt}\right)}{c^{2}} + \alpha v$

$v = \dfrac{4}{\pi^{2}}\left(\arctan\dfrac{w^{gt}}{h^{gt}} - \arctan\dfrac{w}{h}\right)^{2},\qquad \alpha = \dfrac{v}{(1-\mathrm{IoU}) + v}$

wherein A and B can be seen as two sets (the predicted and ground-truth boxes); $\alpha$ is a weight function; $v$ measures the similarity of the aspect ratios.
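The CIoU loss above can be computed directly from two corner-format boxes; the following standalone sketch mirrors the three terms of the formula (the corner-tuple box format is an assumption for illustration):

```python
import math

def ciou_loss(pred, gt):
    """L_CIoU = 1 - IoU + rho^2/c^2 + alpha*v for boxes (x1, y1, x2, y2)."""
    # IoU term
    ix1, iy1 = max(pred[0], gt[0]), max(pred[1], gt[1])
    ix2, iy2 = min(pred[2], gt[2]), min(pred[3], gt[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_p = (pred[2] - pred[0]) * (pred[3] - pred[1])
    area_g = (gt[2] - gt[0]) * (gt[3] - gt[1])
    iou = inter / (area_p + area_g - inter)
    # center-distance term rho^2 / c^2
    px, py = (pred[0] + pred[2]) / 2, (pred[1] + pred[3]) / 2
    gx, gy = (gt[0] + gt[2]) / 2, (gt[1] + gt[3]) / 2
    rho2 = (px - gx) ** 2 + (py - gy) ** 2
    cw = max(pred[2], gt[2]) - min(pred[0], gt[0])
    ch = max(pred[3], gt[3]) - min(pred[1], gt[1])
    c2 = cw ** 2 + ch ** 2
    # aspect-ratio term alpha * v
    w_p, h_p = pred[2] - pred[0], pred[3] - pred[1]
    w_g, h_g = gt[2] - gt[0], gt[3] - gt[1]
    v = (4 / math.pi ** 2) * (math.atan(w_g / h_g) - math.atan(w_p / h_p)) ** 2
    alpha = v / ((1 - iou) + v) if v > 0 else 0.0
    return 1 - iou + rho2 / c2 + alpha * v

# a perfect prediction has zero loss; any offset or shape mismatch increases it
print(ciou_loss((0, 0, 10, 10), (0, 0, 10, 10)))  # 0.0
```

The loss decomposes exactly as in the formula: overlap (1 − IoU), center alignment (ρ²/c²), and aspect-ratio consistency (αv), which is why the regression stays informative even when the boxes do not overlap at all.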
In this embodiment, the YOLOv5 model is optimized by introducing a new attention mechanism in the backbone layer of the backbone network. Most attention mechanisms for lightweight networks currently adopt the Squeeze-and-Excitation (SE) module, which considers only channel information and ignores position information. The later Bottleneck Attention Module (BAM) and Convolutional Block Attention Module (CBAM) attempt to extract positional attention through convolution after reducing the number of channels, but convolution can only capture local relations and lacks the ability to model long-range dependencies. The new attention mechanism introduced in the backbone layer encodes horizontal and vertical spatial position information into the channel attention, so that a mobile network can attend to position information over a large range and capture long-range spatial interactions with accurate positions without incurring excessive computation; global average pooling is decomposed, improving the detection precision of the small model and the network's attention to channels. The model's ability to detect small objects is thus improved, detected objects can be accurately located when objects are too dense, and, applied to aircraft skin damage detection, different types of skin damage can be identified in real time.
S4, inputting the acquired enhanced training set data into the optimized YOLOv5 model to obtain a trained weight file.
As a preferred embodiment of step S4, S4 specifically includes: placing the optimized YOLOv5 model into a configured Python environment, training it with the annotated pictures of the data-enhanced training set, and accelerating training with a GPU.
And S5, applying the trained weight file to picture detection of the test set and visualizing the detection result through PySide6: the test set is run with the weight file, PySide6 is installed in the PyCharm compiler, and the detected and identified pictures are output. PySide6 is a Python graphical user interface (GUI) library developed from the C++ Qt framework; it is free software that may also be used commercially.
In the description of the present specification, a description referring to terms "one embodiment," "some embodiments," "examples," "specific examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present application. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, the different embodiments or examples described in this specification and the features of the different embodiments or examples may be combined and combined by those skilled in the art without contradiction.
Logic and/or steps represented in the flowcharts or otherwise described herein, e.g., an ordered listing of executable instructions for implementing logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, a processor-containing system, or any other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute them.
The foregoing has described the invention in detail; specific examples are applied herein to explain the principles and embodiments of the invention, and the description of the above embodiments is merely intended to facilitate understanding of the method of the invention and its core concepts. Meanwhile, those skilled in the art will make variations to the specific embodiments and application scope in accordance with the ideas of the present invention; in view of the above, the contents of this description should not be construed as limiting the present invention.

Claims (7)

1. A method for detecting and classifying skin defects of an aircraft based on YOLOv5, comprising the steps of:
s1, acquiring an original image of aircraft skin damage, classifying the acquired image according to different damage, and dividing the classified image into a training set and a testing set;
s2, performing information interpolation on the collected damaged pictures of the training sets with different angles by using a bilinear interpolation algorithm, and further obtaining an optimized image so as to enhance the training set data;
s3, optimizing the original YOLOv5 model to obtain an optimized YOLOv5 model;
s4, inputting the acquired enhanced training set data into an optimized YOLOv5 model to obtain a trained weight file;
s5, applying the trained weight file to picture detection of the test set, and visualizing the detection result through PySide6.
2. The YOLOv5-based method of detecting and classifying aircraft skin defects of claim 1, wherein S1 specifically comprises: classifying the original images of aircraft skin damage into corrosion, impact and crack damage images, recording the order of the categories in a TXT file, and marking the anchor frames with the LabelImg target detection annotation tool.
3. The method for detecting and classifying aircraft skin defects based on YOLOv5 according to claim 1, wherein in S2 the specific process comprises the following steps:
s21, performing interpolation twice in the x direction to obtain the interpolated pixel values R_1 and R_2; the formulas for the two interpolations in the x direction are:

f(R_1) = \frac{x_2 - x}{x_2 - x_1} f(Q_{11}) + \frac{x - x_1}{x_2 - x_1} f(Q_{21})

f(R_2) = \frac{x_2 - x}{x_2 - x_1} f(Q_{12}) + \frac{x - x_1}{x_2 - x_1} f(Q_{22})

wherein Q_{11}(x_1, y_1), Q_{12}(x_1, y_2), Q_{21}(x_2, y_1), Q_{22}(x_2, y_2) are four known points in the x-y plane rectangular coordinate system, and R_1 = (x, y_1), R_2 = (x, y_2);
s22, performing interpolation once in the y direction to obtain the pixel value at the target point, wherein the interpolation formula in the y direction is:

f(x, y) \approx \frac{y_2 - y}{y_2 - y_1} f(R_1) + \frac{y - y_1}{y_2 - y_1} f(R_2)
s23, substituting the two formulas in S21 into the formula in S22 yields the multi-view interpolated image:

f(x, y) \approx \frac{1}{(x_2 - x_1)(y_2 - y_1)} \left[ f(Q_{11})(x_2 - x)(y_2 - y) + f(Q_{21})(x - x_1)(y_2 - y) + f(Q_{12})(x_2 - x)(y - y_1) + f(Q_{22})(x - x_1)(y - y_1) \right]
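As a rough illustration (not part of the claims), the three-step bilinear interpolation of S21-S23 can be sketched in Python; the function name and the dictionary-based corner lookup are hypothetical conveniences:

```python
def bilinear_interpolate(f, x, y, x1, y1, x2, y2):
    """Interpolate the pixel value at (x, y) from four known corner values.

    f maps corner coordinates to pixel values, matching the claim's points
    Q11=(x1, y1), Q12=(x1, y2), Q21=(x2, y1), Q22=(x2, y2).
    """
    # S21: two interpolations along x, at heights y1 and y2 (R1 and R2)
    r1 = (x2 - x) / (x2 - x1) * f[(x1, y1)] + (x - x1) / (x2 - x1) * f[(x2, y1)]
    r2 = (x2 - x) / (x2 - x1) * f[(x1, y2)] + (x - x1) / (x2 - x1) * f[(x2, y2)]
    # S22: one interpolation along y between R1 and R2
    return (y2 - y) / (y2 - y1) * r1 + (y - y1) / (y2 - y1) * r2
```

Applied per channel over a regular pixel grid, this is the operation used to generate the resampled views that enlarge the training set.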
4. The method for detecting and classifying aircraft skin defects based on YOLOv5 according to claim 1, wherein in S3, the optimizing the original YOLOv5 model comprises the following steps:
s31, introducing a new coordinate attention (CA) mechanism at the Backbone layer of the backbone network, wherein the CA mechanism decomposes the input feature map along the x and y directions into two independent directions and encodes the lateral and longitudinal spatial position information into the channel attention; the lateral and longitudinal features are encoded with pooling kernels of size (1, W) and (H, 1), so that the output of the c-th channel feature at height h is:

z_c^h(h) = \frac{1}{W} \sum_{0 \le i < W} x_c(h, i)
similarly, the output of the c-th channel feature at width w is:

z_c^w(w) = \frac{1}{H} \sum_{0 \le j < H} x_c(j, w)
wherein H is the height of the feature map; W is the width of the feature map; x_c is the input feature of the c-th channel; h, i, j, w are known coordinates;
s32, aggregating the features of the two formulas from the lateral and longitudinal directions: the pair of direction-aware output feature maps z^h and z^w are concatenated to obtain the connected feature map F_1, and a convolutional transform function is applied to it to obtain an intermediate feature map f \in \mathbb{R}^{C/r \times (H+W)} containing the lateral and longitudinal spatial information; the formula is as follows:

f = \delta\left(\mathrm{Conv}_{1 \times 1}([z^h, z^w])\right)
wherein \delta is a nonlinear activation function and r is the reduction factor;
s33, encoding the information in the x and y directions into separate attention maps along the spatial dimension: the intermediate feature map f containing the lateral and longitudinal spatial information is split into two direction-specific tensors f^h and f^w, and the final attention weights over height and width, g^h and g^w, are then obtained through a sigmoid activation function; the expressions are as follows:

g^h = \sigma(F_h(f^h))

g^w = \sigma(F_w(f^w))
wherein F_h and F_w are 1 \times 1 convolutions, and \sigma is the sigmoid activation function;
s34, the CA attention mechanism formula can be expressed as:

y_c(i, j) = x_c(i, j) \times g_c^h(i) \times g_c^w(j).
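A minimal NumPy sketch of the S31-S34 forward pass follows; the random matrices standing in for the learned 1x1 convolutions F_1, F_h, F_w, and the ReLU used for the nonlinearity, are assumptions for illustration only:

```python
import numpy as np

def coordinate_attention(x, r=8, rng=None):
    """Sketch of the CA forward pass on a (C, H, W) feature map (S31-S34).

    The matrices standing in for the learned 1x1 convolutions are random
    here; in the trained model they would be learned parameters.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    C, H, W = x.shape
    Cr = max(C // r, 1)                       # reduced channel count C/r
    # S31: directional average pooling with kernels (1, W) and (H, 1)
    z_h = x.mean(axis=2)                      # (C, H), pooled over width
    z_w = x.mean(axis=1)                      # (C, W), pooled over height
    # S32: concatenate along the spatial axis, shared 1x1 conv, nonlinearity
    F1 = rng.standard_normal((Cr, C)) * 0.1
    f = np.maximum(F1 @ np.concatenate([z_h, z_w], axis=1), 0.0)  # (C/r, H+W)
    # S33: split back into the two directions, per-direction 1x1 conv + sigmoid
    f_h, f_w = f[:, :H], f[:, H:]
    Fh = rng.standard_normal((C, Cr)) * 0.1
    Fw = rng.standard_normal((C, Cr)) * 0.1
    g_h = 1.0 / (1.0 + np.exp(-(Fh @ f_h)))   # (C, H) height attention
    g_w = 1.0 / (1.0 + np.exp(-(Fw @ f_w)))   # (C, W) width attention
    # S34: y_c(i, j) = x_c(i, j) * g_h_c(i) * g_w_c(j)
    return x * g_h[:, :, None] * g_w[:, None, :]
```

Because each attention weight lies in (0, 1), the output keeps the input's shape while rescaling each position by its height and width attention.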
5. The method for detecting and classifying aircraft skin defects based on YOLOv5 according to claim 4, wherein in S3, the optimizing the original YOLOv5 model further comprises: replacing the NMS algorithm in the original model with the DIoU-NMS algorithm; DIoU-NMS considers not only the IoU but also the distance between the centers of two boxes: if the center distance of two boxes is large, the algorithm regards them as boxes of two different objects, so that neither is filtered out; to optimize the NMS in the model, only the IoU in the NMS of the original model is replaced with DIoU, whose formula is as follows:

\mathrm{DIoU} = \mathrm{IoU} - \frac{\rho^2(b, b^{gt})}{c^2}
wherein b and b^{gt} represent the center points of the predicted box and the real box respectively, \rho denotes the Euclidean distance between the two center points, and c represents the diagonal distance of the minimum closure area that can contain both the predicted box and the real box; the formula of DIoU-NMS is as follows:

s_i = \begin{cases} s_i, & \mathrm{IoU} - R_{\mathrm{DIoU}}(M, B_i) < \varepsilon \\ 0, & \mathrm{IoU} - R_{\mathrm{DIoU}}(M, B_i) \ge \varepsilon \end{cases}
wherein s_i is the classification confidence; M represents the higher-scoring bounding box among the prediction boxes; \varepsilon is the threshold of DIoU-NMS; B_i traverses each remaining box whose overlap with the high-confidence box M is checked; R_{\mathrm{DIoU}} is the penalty term.
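For illustration, DIoU and the greedy DIoU-NMS loop of claim 5 can be sketched as below; the corner box layout (x1, y1, x2, y2) and the score-ordered greedy loop are standard NMS conventions assumed here, not details fixed by the claim:

```python
import numpy as np

def diou(box, boxes):
    """DIoU between one box and an array of boxes, each as (x1, y1, x2, y2)."""
    # Intersection area
    ix1 = np.maximum(box[0], boxes[:, 0]); iy1 = np.maximum(box[1], boxes[:, 1])
    ix2 = np.minimum(box[2], boxes[:, 2]); iy2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(ix2 - ix1, 0, None) * np.clip(iy2 - iy1, 0, None)
    area1 = (box[2] - box[0]) * (box[3] - box[1])
    area2 = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    iou = inter / (area1 + area2 - inter)
    # Squared center distance rho^2(b, b_gt)
    cx1, cy1 = (box[0] + box[2]) / 2, (box[1] + box[3]) / 2
    cx2, cy2 = (boxes[:, 0] + boxes[:, 2]) / 2, (boxes[:, 1] + boxes[:, 3]) / 2
    rho2 = (cx1 - cx2) ** 2 + (cy1 - cy2) ** 2
    # Squared diagonal c^2 of the minimum enclosing box
    ex1 = np.minimum(box[0], boxes[:, 0]); ey1 = np.minimum(box[1], boxes[:, 1])
    ex2 = np.maximum(box[2], boxes[:, 2]); ey2 = np.maximum(box[3], boxes[:, 3])
    c2 = (ex2 - ex1) ** 2 + (ey2 - ey1) ** 2
    return iou - rho2 / c2

def diou_nms(boxes, scores, eps=0.5):
    """Indices of boxes surviving greedy DIoU-NMS at threshold eps."""
    order = np.argsort(scores)[::-1]
    keep = []
    while order.size:
        i = order[0]
        keep.append(i)
        if order.size == 1:
            break
        rest = order[1:]
        # Suppress only boxes whose DIoU with the kept box reaches eps;
        # distant boxes get a negative center-distance penalty and survive.
        order = rest[~(diou(boxes[i], boxes[rest]) >= eps)]
    return keep
```

Two heavily overlapping boxes collapse to one detection, while a distant box with the same IoU characteristics is retained, which is the behavior the claim describes for dense objects.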
6. The method for detecting and classifying aircraft skin defects based on YOLOv5 according to claim 4, wherein in S3, the optimizing the original YOLOv5 model further comprises: changing the loss function in the original model to CIoU; on the basis of DIoU, the CIoU loss adds a penalty on the detection scale, namely on the width and height of the box, so that the differences in the center distance and in the width and height of the bounding boxes are all taken into account, making the regression of the target box more stable; the formula is as follows:

L_{\mathrm{CIoU}} = 1 - \mathrm{IoU} + \frac{\rho^2(b, b^{gt})}{c^2} + \alpha v

v = \frac{4}{\pi^2}\left(\arctan\frac{w^{gt}}{h^{gt}} - \arctan\frac{w}{h}\right)^2, \qquad \alpha = \frac{v}{(1 - \mathrm{IoU}) + v}
wherein A and B can be seen as the two box sets in \mathrm{IoU} = \frac{|A \cap B|}{|A \cup B|}; \alpha is a weight function; v is used to measure the similarity of the aspect ratios.
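The CIoU loss of claim 6 written out in plain Python; boxes are assumed to be in corner format (x1, y1, x2, y2), and the 1e-9 in the denominator of alpha is a small stabilizer added here (not in the formula) to avoid division by zero for perfectly matching boxes:

```python
import math

def ciou_loss(box_p, box_g):
    """CIoU loss between a predicted box and a ground-truth box."""
    ix1, iy1 = max(box_p[0], box_g[0]), max(box_p[1], box_g[1])
    ix2, iy2 = min(box_p[2], box_g[2]), min(box_p[3], box_g[3])
    inter = max(ix2 - ix1, 0) * max(iy2 - iy1, 0)
    area_p = (box_p[2] - box_p[0]) * (box_p[3] - box_p[1])
    area_g = (box_g[2] - box_g[0]) * (box_g[3] - box_g[1])
    iou = inter / (area_p + area_g - inter)
    # Center-distance term rho^2 / c^2 (c = enclosing-box diagonal)
    rho2 = (((box_p[0] + box_p[2]) - (box_g[0] + box_g[2])) ** 2
            + ((box_p[1] + box_p[3]) - (box_g[1] + box_g[3])) ** 2) / 4
    cw = max(box_p[2], box_g[2]) - min(box_p[0], box_g[0])
    ch = max(box_p[3], box_g[3]) - min(box_p[1], box_g[1])
    c2 = cw ** 2 + ch ** 2
    # Aspect-ratio term: v measures aspect-ratio mismatch, alpha weights it
    wp, hp = box_p[2] - box_p[0], box_p[3] - box_p[1]
    wg, hg = box_g[2] - box_g[0], box_g[3] - box_g[1]
    v = (4 / math.pi ** 2) * (math.atan(wg / hg) - math.atan(wp / hp)) ** 2
    alpha = v / ((1 - iou) + v + 1e-9)
    return 1 - iou + rho2 / c2 + alpha * v
```

The loss is zero for identical boxes and grows with center offset and width/height mismatch, which is what makes the regression more stable than plain IoU.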
7. The method for detecting and classifying aircraft skin defects based on YOLOv5 according to claim 1, wherein S4 specifically comprises: placing the optimized YOLOv5 model into a configured Python environment, training it with the annotated training-set pictures after data enhancement, and accelerating the training of the model through a GPU.
CN202410211173.1A 2024-02-27 2024-02-27 YOLOv 5-based method for detecting and classifying aircraft skin defects Active CN117788471B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410211173.1A CN117788471B (en) 2024-02-27 2024-02-27 YOLOv 5-based method for detecting and classifying aircraft skin defects


Publications (2)

Publication Number Publication Date
CN117788471A true CN117788471A (en) 2024-03-29
CN117788471B CN117788471B (en) 2024-04-26

Family

ID=90393057


Country Status (1)

Country Link
CN (1) CN117788471B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118052811A (en) * 2024-04-10 2024-05-17 南京航空航天大学 NAM-DSSD model-based aircraft skin defect detection method

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114119586A (en) * 2021-12-01 2022-03-01 中国计量大学 Intelligent detection method for aircraft skin defects based on machine vision
CN115063725A (en) * 2022-06-23 2022-09-16 中国民航大学 Airplane skin defect identification system based on multi-scale self-adaptive SSD algorithm
US20230351573A1 (en) * 2021-03-17 2023-11-02 Southeast University Intelligent detection method and unmanned surface vehicle for multiple type faults of near-water bridges
CN117252842A (en) * 2023-09-27 2023-12-19 中国民航大学 Aircraft skin defect detection and network model training method





Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant