CN114758255A - Unmanned aerial vehicle detection method based on YOLOV5 algorithm


Info

Publication number
CN114758255A
Authority
CN
China
Prior art keywords
detection
aerial vehicle
unmanned aerial
yolov5
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210350981.7A
Other languages
Chinese (zh)
Inventor
马峻
王晓
徐翠锋
陈寿宏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guilin University of Electronic Technology
Original Assignee
Guilin University of Electronic Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2022-04-02
Filing date
2022-04-02
Publication date
2022-07-15
Application filed by Guilin University of Electronic Technology filed Critical Guilin University of Electronic Technology
Priority to CN202210350981.7A
Publication of CN114758255A

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • G06F18/253Fusion techniques of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computational Linguistics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to the technical field of target detection, and in particular to an unmanned aerial vehicle detection method based on the YOLOV5 algorithm. The method comprises: collecting unmanned aerial vehicle image data, and screening and labeling the image data to obtain a training set and a validation set; training a network model with the training set and the validation set to obtain a detection model; and detecting unmanned aerial vehicle video with the detection model to obtain a detection result. The detection model obtained by training on the training set and the validation set improves the detection accuracy for small targets such as unmanned aerial vehicles under complex backgrounds or in dense scenes, reduces the probability of false and missed detections, and thereby solves the problems that the existing unmanned aerial vehicle detection technology is insensitive to small targets and has large detection errors when targets are dense.

Description

Unmanned aerial vehicle detection method based on YOLOV5 algorithm
Technical Field
The invention relates to the technical field of target detection, in particular to an unmanned aerial vehicle detection method based on a YOLOV5 algorithm.
Background
Owing to their small size, agility and ease of operation, unmanned aerial vehicles are commonly applied to tasks such as target tracking and target searching, in which unmanned aerial vehicle detection technology is employed.
At present, existing unmanned aerial vehicle detection technology is mainly based on deep learning models and can be roughly divided into two categories. One is the two-stage target detection algorithm, which divides the detection problem into two stages: first generating candidate regions containing approximate position information of the target, then classifying the candidate regions and refining the positions. The other is the one-stage algorithm, which predicts object categories and positions directly in a single pass.
Methods of this kind are insensitive to small targets, and their detection error is large when targets are dense.
Disclosure of Invention
The invention aims to provide an unmanned aerial vehicle detection method based on the YOLOV5 algorithm, so as to solve the problems that the existing unmanned aerial vehicle detection technology is insensitive to small targets and has large detection errors when targets are dense.
To this end, the invention provides an unmanned aerial vehicle detection method based on the YOLOV5 algorithm, which comprises the following steps:
collecting unmanned aerial vehicle image data, and screening and labeling the image data to obtain a training set and a validation set;
training a network model with the training set and the validation set to obtain a detection model;
and detecting the unmanned aerial vehicle video with the detection model to obtain a detection result.
The specific manner of collecting the unmanned aerial vehicle image data and of screening and labeling the image data is:
collecting unmanned aerial vehicle image data, and screening and labeling the image data to obtain a data set;
and dividing the data set into a training set and a validation set.
The network model comprises an Input end, a Backbone part, a Neck part and a Head part.
The specific manner of training the network model with the training set and the validation set to obtain the detection model is:
carrying out standardized preprocessing on the training set through the Input end to obtain a data-enhanced image;
extracting the features of the data-enhanced image through the Backbone part to obtain a feature map set;
obtaining tensor data of the feature map set through the Neck part based on the feature pyramid structure and feature fusion;
and calculating the gradient from the tensor data through the Head part to obtain a calculation result, and updating and verifying the gradient based on the calculation result and the validation set to obtain the YOLOV5 detection model.
The specific manner of calculating the gradient from the tensor data through the Head part to obtain the calculation result, and of updating and verifying the gradient based on the calculation result and the validation set to obtain the YOLOV5 detection model, is:
calculating the gradient through the Head part from the tensor data based on a loss function and back propagation to obtain a calculation result;
updating the gradient based on the calculation result to obtain an update state;
and verifying the update state with the validation set to obtain the evaluation-index results of the trained model and the YOLOV5 detection model.
The specific manner of detecting the unmanned aerial vehicle video with the detection model to obtain the detection result is:
splitting the unmanned aerial vehicle video into frames with the detection model, and detecting each frame image to obtain the detection result.
The unmanned aerial vehicle detection method based on the YOLOV5 algorithm collects unmanned aerial vehicle image data, and screens and labels the image data to obtain a training set and a validation set; trains a network model with the training set and the validation set to obtain a detection model; and detects the unmanned aerial vehicle video with the detection model to obtain a detection result. The detection model obtained by training on the training set and the validation set improves the detection accuracy for small targets such as unmanned aerial vehicles under complex backgrounds or in dense scenes and reduces the probability of false and missed detections, thereby solving the problems that the existing unmanned aerial vehicle detection technology is insensitive to small targets and has large detection errors when targets are dense.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly introduced below. It is obvious that the drawings in the following description show only some embodiments of the present invention; for those skilled in the art, other drawings can be obtained from these drawings without creative effort.
FIG. 1 is a flowchart of the unmanned aerial vehicle detection method based on the YOLOV5 algorithm provided by the invention.
FIG. 2 is a network structure diagram of the unmanned aerial vehicle detection method based on the YOLOV5 algorithm provided by the invention.
FIG. 3 is a structural diagram of C3 SwinTR.
FIG. 4 is a Swin Transformer block diagram.
FIG. 5 is a schematic diagram of the partitioning of regular and shifted windows.
FIG. 6 is a schematic diagram of the FPN structure and the PAN structure.
Detailed Description
Reference will now be made in detail to the embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or to elements having the same or similar functions throughout. The embodiments described below with reference to the drawings are exemplary; they are intended to explain the invention and are not to be construed as limiting it.
Referring to FIGS. 1 to 6, the present invention provides a method for detecting an unmanned aerial vehicle based on the YOLOV5 algorithm, comprising the following steps:
s1, collecting unmanned aerial vehicle image data, and screening and labeling the image data to obtain a training set and a verification set;
the concrete method is as follows:
s11, collecting unmanned aerial vehicle image data, and screening and labeling the image data to obtain a data set;
s12 divides the data set into a training set and a validation set.
S2, training a network model with the training set and the validation set to obtain a detection model;
Specifically, the network model incorporates the Swin Transformer and comprises an Input end (Input), a backbone network part (Backbone), a feature fusion part (Neck) and a prediction output part (Head).
The specific manner of step S2 is:
s21, carrying out standardized preprocessing on the training set through the Input end to obtain a data enhanced image;
specifically, the Input end processes the Input image by using a Mosaic data enhancement mode, a self-adaptive anchor frame calculation mode and a self-adaptive picture scaling mode.
Mosaic data enhancement splices 4 pictures by random scaling, random cropping and random arrangement. Its advantages are enriching the data set and reducing the GPU memory required for training.
In the adaptive anchor-box calculation algorithm, each data set has anchor boxes with initially set length and width. During network training, the network outputs prediction boxes on the basis of the initial anchor boxes, compares them with the ground-truth boxes, calculates the gap between the two, and then updates the network parameters iteratively through back propagation. The initial anchor boxes are therefore an important component; in YOLOV5 the function for calculating the initial anchor-box values is embedded in the code, and the optimal anchor-box values for a given training set are calculated adaptively at each training run, as the clustering sketch below illustrates.
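The adaptive anchor-box calculation can be sketched as clustering the ground-truth box shapes of a training set. The plain k-means below is an illustrative stand-in for YOLOV5's built-in autoanchor routine (which additionally applies a genetic-evolution refinement); the choice of k = 9 anchors and the ratio-based distance are assumptions:

```python
import numpy as np

def kmeans_anchors(wh, k=9, iters=100, seed=0):
    """Cluster ground-truth (width, height) pairs into k anchor shapes.
    A simplified stand-in for YOLOV5's autoanchor; the ratio-based
    distance measures how well a box shape matches an anchor shape."""
    wh = np.asarray(wh, dtype=float)
    rng = np.random.default_rng(seed)
    anchors = wh[rng.choice(len(wh), k, replace=False)].copy()
    for _ in range(iters):
        # distance = 1 - worst per-dimension size ratio to each anchor
        r = np.minimum(wh[:, None] / anchors[None], anchors[None] / wh[:, None])
        assign = (1 - r.min(axis=-1)).argmin(axis=1)
        for j in range(k):
            if (assign == j).any():
                anchors[j] = wh[assign == j].mean(axis=0)
    return anchors[np.argsort(anchors.prod(axis=1))]  # sorted by area
```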
In common target detection algorithms, pictures differ in length and width, so the usual approach is to scale the original picture uniformly to a standard size and then feed it to the detection network. However, since many pictures have different aspect ratios, the black borders at the two ends differ in size after scaling and padding; if the padding is large, information redundancy arises and inference speed suffers. YOLOV5 adaptively adds the smallest possible black border to the original image, so the black borders at the two ends of the image height are reduced, the amount of computation during inference decreases, and the target detection speed improves; a minimal sketch of this "letterbox" scaling follows.
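The sketch below assumes a 640-pixel target size and the gray padding value 114, which follow common YOLOV5 practice and are assumptions rather than text from the patent:

```python
import cv2

def letterbox(img, new_size=640, pad_value=114):
    """Resize while keeping the aspect ratio, then pad the shorter side
    with the minimal border so the output is new_size x new_size."""
    h, w = img.shape[:2]
    scale = min(new_size / h, new_size / w)
    nh, nw = round(h * scale), round(w * scale)
    resized = cv2.resize(img, (nw, nh))
    top = (new_size - nh) // 2
    left = (new_size - nw) // 2
    return cv2.copyMakeBorder(resized, top, new_size - nh - top,
                              left, new_size - nw - left,
                              cv2.BORDER_CONSTANT,
                              value=(pad_value, pad_value, pad_value))
```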
Through Mosaic data enhancement, 4 pictures in the training set are spliced by random scaling, random cropping and random arrangement to obtain a spliced picture; the optimal anchor-box values of the spliced picture are calculated by the adaptive anchor-box calculation algorithm to obtain a preprocessed picture; and the preprocessed picture is scaled and padded to obtain the data-enhanced image. A sketch of the Mosaic stitch follows.
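A minimal sketch of the four-picture Mosaic stitch; the random center point is an assumption, and the remapping of bounding-box labels is omitted for brevity:

```python
import random
import cv2
import numpy as np

def mosaic4(imgs, size=640):
    """Stitch 4 images into one mosaic canvas around a random center;
    label (bounding-box) remapping is omitted for brevity."""
    canvas = np.full((size, size, 3), 114, dtype=np.uint8)
    cx = random.randint(size // 4, 3 * size // 4)
    cy = random.randint(size // 4, 3 * size // 4)
    regions = [(0, 0, cx, cy), (cx, 0, size, cy),
               (0, cy, cx, size), (cx, cy, size, size)]
    for img, (x1, y1, x2, y2) in zip(imgs, regions):
        # random scaling/cropping reduces here to resizing into the cell
        canvas[y1:y2, x1:x2] = cv2.resize(img, (x2 - x1, y2 - y1))
    return canvas
```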
S22, performing feature extraction on the data-enhanced image through the Backbone part to obtain a feature map set;
Specifically, the Backbone part extracts rich information features from the input image and mainly comprises two structures, convolution and C3. C3 is a simplified BottleneckCSP: apart from the Bottleneck part it contains only 3 convolutions, which reduces parameters. The invention replaces the Bottleneck with a Swin Transformer block and combines it with the C3 structure, naming the result C3 SwinTR. Introducing a self-attention module into the backbone network lets the network attend better to global information and rich context information, and thus extract the features of a target better.
A Swin Transformer block consists of a left part and a right part, which use W-MSA and SW-MSA alternately, each followed by a two-layer Multilayer Perceptron (MLP); a layer normalization unit (LayerNorm, LN) is applied before each multi-head self-attention (MSA) module and before each MLP structure.
The Swin Transformer block is computed as follows:
$$\hat{z}^{l} = \mathrm{W\text{-}MSA}\big(\mathrm{LN}(z^{l-1})\big) + z^{l-1}$$
$$z^{l} = \mathrm{MLP}\big(\mathrm{LN}(\hat{z}^{l})\big) + \hat{z}^{l}$$
$$\hat{z}^{l+1} = \mathrm{SW\text{-}MSA}\big(\mathrm{LN}(z^{l})\big) + z^{l}$$
$$z^{l+1} = \mathrm{MLP}\big(\mathrm{LN}(\hat{z}^{l+1})\big) + \hat{z}^{l+1}$$
where $\hat{z}^{l}$ and $z^{l}$ denote the output features of the (S)W-MSA module and of the MLP module of block $l$, respectively ($l$ indexes the layer); W-MSA and SW-MSA denote window-based multi-head self-attention under the regular-window and shifted-window partitioning configurations, respectively.
The shifted windows of SW-MSA bridge adjacent windows of the previous layer, a connection that is very useful for image classification, object detection and semantic segmentation. The partitioning of regular and shifted windows is shown in FIG. 5, where the left diagram shows the regular windows and the right diagram the shifted windows; a sketch of the window mechanics follows this paragraph.
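The window mechanics can be sketched in PyTorch as follows; the window size of 7 and the torch.roll-based cyclic shift follow the public Swin Transformer reference implementation and are assumptions about this patent's internals:

```python
import torch

def window_partition(x, ws=7):
    """Split a (B, H, W, C) feature map into non-overlapping ws x ws
    windows, returning (num_windows * B, ws * ws, C) token groups."""
    B, H, W, C = x.shape
    x = x.view(B, H // ws, ws, W // ws, ws, C)
    return x.permute(0, 1, 3, 2, 4, 5).reshape(-1, ws * ws, C)

def shift_windows(x, ws=7):
    """Cyclic shift by ws // 2 so that SW-MSA windows straddle the
    borders of the previous layer's regular windows."""
    return torch.roll(x, shifts=(-(ws // 2), -(ws // 2)), dims=(1, 2))

x = torch.randn(1, 56, 56, 96)                       # toy feature map
w_msa_tokens = window_partition(x)                   # regular (W-MSA)
sw_msa_tokens = window_partition(shift_windows(x))   # shifted (SW-MSA)
```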
S23, obtaining tensor data of the feature map set through the Neck part based on the feature pyramid structure and feature fusion;
Specifically, the Neck part mainly generates a feature pyramid from the information features. The feature pyramid strengthens the model's detection of objects at different scales, so that the same object can be recognized at different sizes. The Neck part combines an FPN structure with a PAN structure to blend the image features and pass the feature maps on to the prediction layer. The FPN and PAN structures are shown in FIG. 6, with the FPN on the left and the PAN on the right.
The FPN structure establishes a top-down path for feature fusion and then makes predictions with the fused feature layers, which carry richer semantic information, but the structure is limited by its unidirectional information flow;
the PAN structure adds a bottom-up path on the basis of the FPN and passes the position information of the bottom layers up to the prediction feature layers, so that each prediction feature layer carries both the semantic information of the top layers and the position information of the bottom layers, which can greatly improve target detection precision. An illustrative fusion sketch follows.
S24, calculating the gradient from the tensor data through the Head part to obtain a calculation result, and updating and verifying the gradient based on the calculation result and the validation set to obtain the YOLOV5 detection model.
The specific manner is as follows:
S241, calculating the gradient through the Head part from the tensor data based on a loss function and back propagation to obtain a calculation result;
Specifically, the Head part makes predictions on the image features from the feature pyramid, producing bounding boxes and predicted categories. The loss function of a target detection task generally consists of a classification loss function and a bounding-box regression loss function;
CIOU_Loss is adopted as the loss function of the bounding box.
The formula of CIOU_Loss is shown in equation (1):
$$\mathrm{CIOU\_Loss} = 1 - IoU + \frac{\rho^{2}(b, b^{gt})}{c^{2}} + \alpha v \tag{1}$$
where $\rho$ denotes the Euclidean distance between the two center points $b$ and $b^{gt}$ (the centers of the prediction box and of the ground-truth box), $c$ denotes the diagonal length of the minimum closure region that simultaneously contains the prediction box and the ground-truth box, $\alpha$ is a positive trade-off weight, and $v$ is a parameter measuring aspect-ratio consistency, given by equation (2):
$$v = \frac{4}{\pi^{2}}\left(\arctan\frac{w^{gt}}{h^{gt}} - \arctan\frac{w}{h}\right)^{2} \tag{2}$$
where $w^{gt}$ and $h^{gt}$ denote the width and height of the ground-truth box, and $w$ and $h$ denote the width and height of the prediction box. CIOU_Loss thus takes into account the three important geometric factors of target-box regression: overlap area, center-point distance and aspect ratio. A sketch implementation follows.
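A direct PyTorch implementation of equations (1) and (2) as a sketch; the (x1, y1, x2, y2) box format is an assumption:

```python
import math
import torch

def ciou_loss(pred, target, eps=1e-7):
    """CIOU loss per equations (1)-(2) for (N, 4) boxes in
    (x1, y1, x2, y2) format; returns a per-box loss tensor."""
    # overlap area -> IoU
    ix1 = torch.max(pred[:, 0], target[:, 0])
    iy1 = torch.max(pred[:, 1], target[:, 1])
    ix2 = torch.min(pred[:, 2], target[:, 2])
    iy2 = torch.min(pred[:, 3], target[:, 3])
    inter = (ix2 - ix1).clamp(0) * (iy2 - iy1).clamp(0)
    area_p = (pred[:, 2] - pred[:, 0]) * (pred[:, 3] - pred[:, 1])
    area_t = (target[:, 2] - target[:, 0]) * (target[:, 3] - target[:, 1])
    iou = inter / (area_p + area_t - inter + eps)

    # rho^2: squared distance between box centers; c^2: squared diagonal
    # of the minimum closure region containing both boxes
    rho2 = ((pred[:, 0] + pred[:, 2] - target[:, 0] - target[:, 2]) ** 2
            + (pred[:, 1] + pred[:, 3] - target[:, 1] - target[:, 3]) ** 2) / 4
    cw = torch.max(pred[:, 2], target[:, 2]) - torch.min(pred[:, 0], target[:, 0])
    ch = torch.max(pred[:, 3], target[:, 3]) - torch.min(pred[:, 1], target[:, 1])
    c2 = cw ** 2 + ch ** 2 + eps

    # v: aspect-ratio consistency (eq. 2); alpha: its trade-off weight
    v = (4 / math.pi ** 2) * (
        torch.atan((target[:, 2] - target[:, 0]) / (target[:, 3] - target[:, 1] + eps))
        - torch.atan((pred[:, 2] - pred[:, 0]) / (pred[:, 3] - pred[:, 1] + eps))) ** 2
    alpha = v / (1 - iou + v + eps)
    return 1 - iou + rho2 / c2 + alpha * v
```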
S242, updating the gradient based on the calculation result to obtain an update state;
S243, verifying the update state with the validation set: if the verification succeeds, the detection model is obtained; if it fails, the process returns to step S11 and unmanned aerial vehicle image data are collected again.
S3, detecting the unmanned aerial vehicle video by using the detection model to obtain a detection result.
The specific manner is as follows: the detection model splits the unmanned aerial vehicle video into frames and detects each frame image to obtain a number of target boxes; the target boxes are then screened to obtain the detection result. For the screening of the target boxes, redundant detection boxes are removed by an NMS operation using the IOU as the threshold, yielding the final correct detection result. A sketch of this inference loop follows.
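A sketch of the frame-by-frame inference loop; the torch.hub entry point, the weight file name and the thresholds are illustrative assumptions rather than part of the patent:

```python
import cv2
import torch

# Load a trained YOLOv5 model via torch.hub (assumed weight file).
model = torch.hub.load("ultralytics/yolov5", "custom", path="uav_best.pt")
model.conf = 0.25   # confidence threshold (assumed)
model.iou = 0.45    # IoU threshold used by the built-in NMS (assumed)

cap = cv2.VideoCapture("uav_video.mp4")
while True:
    ok, frame = cap.read()
    if not ok:
        break
    # detect on each frame; the model's post-processing applies NMS,
    # discarding redundant boxes above the IoU threshold
    results = model(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    for *xyxy, conf, cls in results.xyxy[0].tolist():
        x1, y1, x2, y2 = map(int, xyxy)
        cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
cap.release()
```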
Conv in FIG. 2: a convolution layer;
C3: a structure consisting of a Bottleneck part and 3 convolutions;
C3 SwinTR: the structure obtained by replacing the Bottleneck part in C3 with a Swin Transformer block;
SPPF: a spatial pyramid pooling layer;
UpSample: upsampling;
Concat: feature fusion;
NMS: non-maximum suppression.
In FIG. 3, [bs, c1, w, h], [bs, c2, w, h], [bs, c2/2, w, h]: parameters of the input feature maps;
bs (batch size): the number of samples trained at one time, which influences the degree and speed of model optimization;
c1: the number of input channels of the whole C3 SwinTR;
c2: the number of output channels of the whole C3 SwinTR;
c2/2: half of the number of output channels of the whole C3 SwinTR;
w (width), h (height): the width and height of the feature map;
Conv: convolution;
cv1, cv2, cv3: the 1st, 2nd and 3rd convolution blocks;
k (kernel): convolution kernel size;
s (stride): step size;
Concat: feature-map splicing.
In layer l (left) in FIG. 4, a regular window partitioning scheme is adopted, and self-attention is computed within each window. In the next layer l+1 (right), the window partitioning is shifted, producing new windows. The self-attention computation in the new windows crosses the boundaries of the previous windows in layer l, providing connections between them.
local windows self-attention: a window within which self-attention is performed;
patch: an image block;
zl in FIG. 5 denotes the features of layer l.
Although the invention has been described above with reference to a preferred embodiment based on the YOLOV5 algorithm, it should be understood that the scope of the invention is not limited thereto; those skilled in the art may implement all or part of the process flow of the above embodiment in equivalent ways, and all equivalent changes made within the scope of the claims remain within the scope of the invention.

Claims (6)

1. An unmanned aerial vehicle detection method based on a YOLOV5 algorithm is characterized by comprising the following steps:
acquiring image data of an unmanned aerial vehicle, and screening and labeling the image data to obtain a training set and a validation set;
training a network model with the training set and the validation set to obtain a detection model;
and detecting the unmanned aerial vehicle video by using the detection model to obtain a detection result.
2. The unmanned aerial vehicle detection method based on the YOLOV5 algorithm of claim 1, wherein
the specific manner of acquiring the unmanned aerial vehicle image data and of screening and labeling the image data is:
acquiring unmanned aerial vehicle image data, and screening and labeling the image data to obtain a data set;
the data set is divided into a training set and a validation set.
3. The unmanned aerial vehicle detection method based on the YOLOV5 algorithm of claim 1, wherein
the network model comprises an Input end, a Backbone part, a Neck part and a Head part.
4. The unmanned aerial vehicle detection method based on the YOLOV5 algorithm of claim 3, wherein
the specific manner of training the network model with the training set and the validation set to obtain the detection model is as follows:
carrying out standardized preprocessing on the training set through the Input end to obtain a data enhanced image;
performing feature extraction on the data-enhanced image through the Backbone part to obtain a feature map set;
obtaining tensor data of the feature map set through the Neck part based on the feature pyramid structure and feature fusion;
and calculating the gradient from the tensor data through the Head part to obtain a calculation result, and updating and verifying the gradient based on the calculation result and the validation set to obtain the YOLOV5 detection model.
5. The unmanned aerial vehicle detection method based on the YOLOV5 algorithm of claim 4, wherein
the specific manner of calculating the gradient from the tensor data through the Head part to obtain the calculation result, and of updating and verifying the gradient based on the calculation result and the validation set to obtain the YOLOV5 detection model, is as follows:
calculating the gradient through the Head part from the tensor data based on a loss function and back propagation to obtain a calculation result;
updating the gradient based on the calculation result to obtain an update state;
and verifying the update state with the validation set to obtain the evaluation-index results of the trained model and the YOLOV5 detection model.
6. The unmanned aerial vehicle detection method based on the YOLOV5 algorithm of claim 1, wherein
the specific manner of detecting the unmanned aerial vehicle video with the detection model to obtain the detection result is as follows:
splitting the unmanned aerial vehicle video into frames with the detection model, and detecting each frame image to obtain the detection result.
CN202210350981.7A 2022-04-02 2022-04-02 Unmanned aerial vehicle detection method based on YOLOV5 algorithm Pending CN114758255A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210350981.7A CN114758255A (en) 2022-04-02 2022-04-02 Unmanned aerial vehicle detection method based on YOLOV5 algorithm

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210350981.7A CN114758255A (en) 2022-04-02 2022-04-02 Unmanned aerial vehicle detection method based on YOLOV5 algorithm

Publications (1)

Publication Number Publication Date
CN114758255A (en) 2022-07-15

Family

ID=82330061

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210350981.7A Pending CN114758255A (en) 2022-04-02 2022-04-02 Unmanned aerial vehicle detection method based on YOLOV5 algorithm

Country Status (1)

Country Link
CN (1) CN114758255A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116152591A (en) * 2022-11-25 2023-05-23 中山大学 Model training method, infrared small target detection method and device and electronic equipment
CN116152591B (en) * 2022-11-25 2023-11-07 中山大学 Model training method, infrared small target detection method and device and electronic equipment
CN116152685A (en) * 2023-04-19 2023-05-23 武汉纺织大学 Pedestrian detection method and system based on unmanned aerial vehicle visual field
CN116704317A (en) * 2023-08-09 2023-09-05 深圳华付技术股份有限公司 Target detection method, storage medium and computer device
CN116704317B (en) * 2023-08-09 2024-04-19 深圳华付技术股份有限公司 Target detection method, storage medium and computer device

Similar Documents

Publication Publication Date Title
CN114758255A (en) Unmanned aerial vehicle detection method based on YOLOV5 algorithm
CN112884064A (en) Target detection and identification method based on neural network
CN111950453A (en) Optional-shape text recognition method based on selective attention mechanism
CN114359851A (en) Unmanned target detection method, device, equipment and medium
CN115496752B (en) Steel surface defect detection method based on one-stage target detection algorithm
CN112149547A (en) Remote sensing image water body identification based on image pyramid guidance and pixel pair matching
CN112801270B (en) Automatic U-shaped network slot identification method integrating depth convolution and attention mechanism
CN114399672A (en) Railway wagon brake shoe fault detection method based on deep learning
CN112633149B (en) Domain-adaptive foggy-day image target detection method and device
CN115035361A (en) Target detection method and system based on attention mechanism and feature cross fusion
CN114648665A (en) Weak supervision target detection method and system
CN116342894B (en) GIS infrared feature recognition system and method based on improved YOLOv5
CN113052834A (en) Pipeline defect detection method based on convolution neural network multi-scale features
CN111753732A (en) Vehicle multi-target tracking method based on target center point
CN113313706A (en) Power equipment defect image detection method based on detection reference point offset analysis
CN115830471A (en) Multi-scale feature fusion and alignment domain self-adaptive cloud detection method
CN116597411A (en) Method and system for identifying traffic sign by unmanned vehicle in extreme weather
CN113920479A (en) Target detection network construction method, target detection device and electronic equipment
CN113378727A (en) Remote sensing image binary change detection method based on characteristic deviation alignment
CN115294176B (en) Double-light multi-model long-time target tracking method and system and storage medium
CN117095155A (en) Multi-scale nixie tube detection method based on improved YOLO self-adaptive attention-feature enhancement network
CN114494893B (en) Remote sensing image feature extraction method based on semantic reuse context feature pyramid
CN116310293A (en) Method for detecting target of generating high-quality candidate frame based on weak supervised learning
CN114937239A (en) Pedestrian multi-target tracking identification method and tracking identification device
CN115035429A (en) Aerial photography target detection method based on composite backbone network and multiple measuring heads

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination