CN111985451A - Unmanned aerial vehicle scene detection method based on YOLOv4 - Google Patents

Unmanned aerial vehicle scene detection method based on YOLOv4

Info

Publication number
CN111985451A
CN111985451A
Authority
CN
China
Prior art keywords
unmanned aerial
aerial vehicle
training
scene detection
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202010921511.2A
Other languages
Chinese (zh)
Inventor
韩玉洁
曹杰
万思钰
刘琨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Aeronautics and Astronautics
Original Assignee
Nanjing University of Aeronautics and Astronautics
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Aeronautics and Astronautics filed Critical Nanjing University of Aeronautics and Astronautics
Priority to CN202010921511.2A priority Critical patent/CN111985451A/en
Publication of CN111985451A publication Critical patent/CN111985451A/en
Withdrawn legal-status Critical Current


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/52Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides an unmanned aerial vehicle scene detection method based on YOLOv4, which comprises the following steps: S1, establishing a proprietary data set and dividing it into a training set and a test set in a fixed proportion; S2, establishing a network structure; S3, using a pre-training model and setting its specific training parameters to obtain a training model; S4, carrying out iterative training until the loss function converges, to obtain an unmanned aerial vehicle scene detection model; S5, testing the model with the test set and judging whether it meets the requirements; if not, returning to step S4 until the test result meets the requirements; S6, outputting an unmanned aerial vehicle scene detection model that meets the requirements; and S7, carrying out target detection on sequence images with the unmanned aerial vehicle scene detection model. Compared with the prior art, the improved model occupies little memory, and improves the average intersection-over-union (IoU) by 5.26%, the accuracy by 3.30%, and the recall rate by 1.08%.

Description

Unmanned aerial vehicle scene detection method based on YOLOv4
Technical Field
The invention relates to the field of unmanned aerial vehicles, in particular to an unmanned aerial vehicle scene detection method based on YOLOv4.
Background
An unmanned aerial vehicle is small and flexible and has a wide viewing angle; in recent years it has been widely applied in agricultural plant protection, disaster detection, security protection, aerial video and other fields. Deep learning schemes have greatly advanced the accuracy and speed of target recognition, but most existing work assumes ground-level viewing angles, whereas targets in unmanned aerial vehicle images vary in scale, are small in size and low in resolution, so existing models cannot be applied directly to target recognition in unmanned aerial vehicle images. Two-stage target detection frameworks represented by Faster R-CNN require more hardware resources and run slowly, and are therefore unsuitable for real-time scenes.
In order to solve the problems of numerous small targets, low pixel counts, multiple scales, the limited resources of the unmanned aerial vehicle hardware platform and the high real-time requirement in unmanned aerial vehicle images, the invention improves and trains a landing-scene multi-target recognition model for unmanned aerial vehicle images based on the YOLOv4 network.
Disclosure of Invention
In view of this, the present invention provides an unmanned aerial vehicle scene detection method based on YOLOv4, which mainly solves the following problem: existing target recognition models detect poorly the variable-scale, small, low-resolution targets in unmanned aerial vehicle images, making it difficult to judge the scene in which the unmanned aerial vehicle is located.
In order to achieve the above purpose, the invention provides the following technical scheme:
the invention discloses an unmanned aerial vehicle scene detection method based on YOLOv4, which comprises the following steps:
s1, establishing a proprietary data set; dividing a training set and a test set according to a certain proportion;
s2, establishing a network structure, wherein the network structure is based on an improved YOLOv4 network, with CSPdarknet53 as the backbone network, a spatial pyramid pooling module and a path aggregation network module as the neck, and the YOLOv3 head as the prediction output;
s3, firstly, training the network structure obtained in the step S2 by using an ImageNet large-scale data set to obtain a pre-training model, and then setting specific training parameters for the network structure;
s4, carrying out iterative training on the pre-training model by using a training set until a loss function is converged to obtain an unmanned aerial vehicle scene detection model;
s5, testing the unmanned aerial vehicle scene detection model by using the test set, judging whether the unmanned aerial vehicle scene detection model meets the requirements, if not, continuing to perform the step S4, and continuing to perform iterative training until the test result meets the requirements;
s6, outputting an unmanned aerial vehicle scene detection model meeting the requirements;
and S7, carrying out target detection on the sequence images by using the unmanned aerial vehicle scene detection model meeting the requirements in the step S6, and identifying the scene where the unmanned aerial vehicle is located.
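Steps S4 to S6 above form a train-until-converged, test-until-qualified loop; a minimal Python sketch of that control flow is shown below (all function arguments are hypothetical stand-ins, not part of the patent; the thresholds follow the 0.5 loss convergence value and the 93.33% mAP@0.5 requirement stated later):

```python
def build_detection_model(pretrained_model, train_step, loss, evaluate,
                          loss_threshold=0.5, map_threshold=0.9333):
    """S4-S6: iterate until the loss converges, test on the test set,
    and resume training whenever the test result falls short."""
    model = pretrained_model
    while True:
        # S4: iterative training until the loss function converges
        while loss(model) > loss_threshold:
            model = train_step(model)
        # S5: test; if mAP@0.5 meets the requirement, output the model (S6)
        if evaluate(model) >= map_threshold:
            return model
        # otherwise continue training (back to step S4)
        model = train_step(model)
```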
Further, in the step S1, the creating of the proprietary data set includes the following steps:
s1.1, acquiring basic data samples, wherein the basic data samples comprise: pictures captured from videos shot by an unmanned aerial vehicle, and pictures taken from aerial data sets on the network; the pictures comprise: pictures containing at least one of the six targets of automobile, ship, playground, basketball court, bridge and port, and pictures containing none of these six targets;
s1.2, labeling the targets in the basic data samples and processing the labels into the format required by the YOLO network, wherein each label comprises: category, center point abscissa, center point ordinate, target width and target height;
s1.3, expanding the pictures which are subjected to the tags and the pictures which are not subjected to the tags by adopting a data enhancement method to obtain a proprietary data set, wherein the proprietary data set comprises 1000 pictures.
Further, in step S1, the ratio of the training set to the test set is: 9:1.
Further, in the step S3, the specific training parameters are: the batch size is 64, each image is 608x608, the batch subdivision is 16, the maximum number of training batches is 20000, and the initial learning rate is 0.0013.
Further, in the step S4, the convergence value of the loss function is 0.5.
Further, in the step S5, meeting the requirements means performing performance evaluation on the unmanned aerial vehicle scene detection model, with mAP@0.5 of 93.33% or more.
The invention has the beneficial effects that:
the unmanned aerial vehicle scene detection model provided by the invention occupies less memory than an RCNN series network, only occupies 2G video memory, and can be used for an unmanned aerial vehicle airborne hardware platform and other devices with insufficient hardware resources; the improved YOLOV4 is improved by 5.26% compared with the average intersection of the original network, the accuracy is improved by 3.30% compared with the original network, and the recall rate is improved by 1.08%.
Drawings
Fig. 1 is a flowchart of the unmanned aerial vehicle scene detection method based on YOLOv4.
Fig. 2 is a YOLOv4 network framework diagram.
Fig. 3 is a detailed structure diagram of the YOLOv4 network.
Fig. 4 is a graph of the loss function as a function of the number of training iterations.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The invention discloses an unmanned aerial vehicle scene detection method based on YOLOv4; compared with existing models, the unmanned aerial vehicle scene detection model provided by the invention occupies little memory, using only 2 GB of video memory, and its overall performance is markedly improved after training.
Example 1
Referring to fig. 1, fig. 2 and fig. 3, embodiment 1 discloses an unmanned aerial vehicle scene detection method based on YOLOv4, including:
s1, establishing a proprietary data set; dividing a training set and a test set according to a ratio of 9: 1;
the data of the proprietary data set is from basic data samples, which are obtained through two channels, the first channel being: intercepting pictures formed by videos shot by an unmanned aerial vehicle, wherein the second channel is the pictures in aerial data sets on a network; the pictures are classified into two types, wherein the first type of pictures contain at least one of six targets of automobiles, ships, playgrounds, basketball courts, bridges and ports, and the second type of pictures do not contain six targets of automobiles, ships, playgrounds, basketball courts, bridges and ports.
In this embodiment, the targets in the pictures may be tagged with the labelImg tagging tool or another tagging tool with the same function. The labelImg tool outputs xml files, and since the label files for YOLO training are in txt format, the xml files need to be converted into txt format, either manually or in batch by a tool; each label comprises: category, center point abscissa, center point ordinate, target width and target height.
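The labelImg-to-YOLO conversion described above can be sketched as follows. The Pascal VOC field names are standard labelImg output; the class list and its ordering are a hypothetical example, since the patent does not fix one:

```python
import xml.etree.ElementTree as ET

# hypothetical class ordering for the six targets named in the patent
CLASSES = ["car", "ship", "playground", "basketball_court", "bridge", "port"]

def voc_to_yolo(xml_text):
    """Convert one labelImg (Pascal VOC) annotation into YOLO txt lines:
    class_id, normalized center x, center y, box width, box height."""
    root = ET.fromstring(xml_text)
    w = float(root.find("size/width").text)
    h = float(root.find("size/height").text)
    lines = []
    for obj in root.iter("object"):
        cls = CLASSES.index(obj.find("name").text)
        b = obj.find("bndbox")
        xmin, ymin = float(b.find("xmin").text), float(b.find("ymin").text)
        xmax, ymax = float(b.find("xmax").text), float(b.find("ymax").text)
        cx, cy = (xmin + xmax) / 2 / w, (ymin + ymax) / 2 / h
        bw, bh = (xmax - xmin) / w, (ymax - ymin) / h
        lines.append(f"{cls} {cx:.6f} {cy:.6f} {bw:.6f} {bh:.6f}")
    return lines
```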
And finally, expanding the pictures which are subjected to the tags and the pictures which are not subjected to the tags by adopting a data enhancement method to obtain a proprietary data set, wherein the proprietary data set comprises 1000 pictures.
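The 9:1 division of the 1000-picture proprietary data set can be sketched as below (the fixed seed and the helper name are illustrative choices, not specified by the patent):

```python
import random

def split_dataset(image_paths, train_ratio=0.9, seed=42):
    """Shuffle and divide the proprietary data set 9:1 into train/test."""
    paths = list(image_paths)
    random.Random(seed).shuffle(paths)  # fixed seed for a reproducible split
    cut = int(len(paths) * train_ratio)
    return paths[:cut], paths[cut:]
```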
S2, establishing a network structure, wherein the network structure is based on an improved YOLOv4 network, with CSPdarknet53 as the backbone network, a spatial pyramid pooling module and a path aggregation network module as the neck, and the YOLOv3 head as the prediction output; fig. 2 and fig. 3 show frame diagrams of the network structure;
the multi-Channel (CSP) only directly performs convolution operation on one part of the feature map, and the convolution result is combined with the other part of the original features, so that the learning capability of the neural network can be enhanced, the accuracy can be kept while the weight is reduced, the calculation bottleneck can be reduced, and the memory cost can be reduced.
The spatial pyramid pooling module integrates the spatial-pyramid feature-matching method into the convolutional neural network, and can generate output of a fixed size regardless of the size of the input image.
The path aggregation network module performs up-sampling followed by down-sampling, improving the flow of information through the neural network and enhancing the feature pyramid.
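Of the three neck modules, the pooling step is the easiest to sketch. Below is a numpy version of the stride-1 multi-kernel max pooling used by the YOLOv4-style SPP block, which pools at several scales and concatenates along the channel axis; the kernel sizes 5, 9 and 13 are the reference YOLOv4 values, assumed here because the patent does not state them:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def maxpool2d_same(x, k):
    """Stride-1 max pooling with 'same' padding over a (C, H, W) map."""
    p = k // 2
    xp = np.pad(x, ((0, 0), (p, p), (p, p)), constant_values=-np.inf)
    win = sliding_window_view(xp, (k, k), axis=(1, 2))  # (C, H, W, k, k)
    return win.max(axis=(-2, -1))

def spp(x, kernels=(5, 9, 13)):
    """YOLOv4-style SPP: concatenate the input map with its pooled
    versions along the channel axis, quadrupling the channel count."""
    return np.concatenate([x] + [maxpool2d_same(x, k) for k in kernels], axis=0)
```

With a 19x19 feature map (the 608-input YOLOv4 grid size), the spatial size is preserved and the channel count is multiplied by four.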
S3, firstly, training the network structure obtained in the step S2 with the ImageNet large-scale data set to obtain a pre-training model that contains initialized feature parameters for a wide range of objects, and then setting specific training parameters for the network structure. Specifically, the training parameters are: the number of pictures sent to the network per batch is 64, each picture is 608x608, the batch subdivision is 16 (to reduce video-memory use), the maximum number of training batches is 20000, and the initial learning rate is 0.0013.
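In the Darknet framework these training parameters sit at the top of the network .cfg file; a fragment matching the values above might look as follows (other fields omitted):

```ini
[net]
batch=64            # pictures sent to the network per batch
subdivisions=16     # split each batch to reduce video-memory use
width=608
height=608
max_batches=20000   # maximum number of training batches
learning_rate=0.0013
```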
S4, inputting the training set into the pre-training model for iterative training until the loss function converges, the convergence value of the loss function being 0.5, to obtain the unmanned aerial vehicle scene detection model. The shape of the training loss curve shown in fig. 4 indicates that the learning rate is set reasonably; training occupies 2 GB of video memory, saving computing resources compared with two-stage target detection frameworks.
The overall performance indexes of the model obtained after training are shown in Table 1: compared with the original network, the average intersection-over-union is improved by 5.26%, the accuracy by 3.30%, and the recall rate by 1.08%, and mAP@0.5 reaches 93%.
TABLE 1
Model             Accuracy  Recall  Average IoU  mAP@0.5
YOLOv4            0.91      0.93    0.76         0.89
Improved YOLOv4   0.94      0.94    0.80         0.93
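The percentage gains quoted in the text are relative improvements over the Table 1 baseline; a quick arithmetic check, together with the intersection-over-union measure the table reports (helper names are illustrative):

```python
def relative_gain(new, old):
    """Relative improvement in percent, as quoted in the text."""
    return round((new - old) / old * 100, 2)

def iou(box_a, box_b):
    """Intersection-over-union of two (xmin, ymin, xmax, ymax) boxes."""
    ix = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    iy = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = ix * iy
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

# Table 1: accuracy 0.91 -> 0.94, recall 0.93 -> 0.94, average IoU 0.76 -> 0.80
print(relative_gain(0.94, 0.91))  # 3.3
print(relative_gain(0.94, 0.93))  # 1.08
print(relative_gain(0.80, 0.76))  # 5.26
```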
S5, testing the unmanned aerial vehicle scene detection model with the test set and judging whether it meets the requirements; if not, returning to step S4 and continuing iterative training until the test result meets the requirements. Specifically, the requirement is that mAP@0.5 is 93.33% or more.
And S6, outputting the unmanned aerial vehicle scene detection model meeting the requirements.
S7, carrying out target detection on sequence images with the unmanned aerial vehicle scene detection model that meets the requirements in step S6. Specifically, after the unmanned aerial vehicle transmits the video shot by its airborne camera to the ground station, the ground station detects the images in the sequence and identifies the scene in which the unmanned aerial vehicle is located.
Matters not described in detail in the present invention are well known to those skilled in the art.
The foregoing detailed description of the preferred embodiments of the invention has been presented. It should be understood that numerous modifications and variations could be devised by those skilled in the art in light of the present teachings without departing from the inventive concepts. Therefore, the technical solutions available to those skilled in the art through logic analysis, reasoning and limited experiments based on the prior art according to the concept of the present invention should be within the scope of protection defined by the claims.

Claims (6)

1. A method for unmanned aerial vehicle scene detection based on YOLOv4 is characterized by comprising the following steps:
s1, establishing a proprietary data set; dividing a training set and a test set according to a certain proportion;
s2, establishing a network structure, wherein the network structure is based on an improved YOLOv4 network, with CSPdarknet53 as the backbone network, a spatial pyramid pooling module and a path aggregation network module as the neck, and the YOLOv3 head as the prediction output;
s3, firstly, training the network structure obtained in the step S2 by using an ImageNet large-scale data set to obtain a pre-training model, and then setting specific training parameters for the network structure;
s4, carrying out iterative training on the pre-training model by using a training set until a loss function is converged to obtain an unmanned aerial vehicle scene detection model;
s5, testing the unmanned aerial vehicle scene detection model by using the test set, judging whether the unmanned aerial vehicle scene detection model meets the requirements, if not, continuing to perform the step S4, and continuing to perform iterative training until the test result meets the requirements;
s6, outputting an unmanned aerial vehicle scene detection model meeting the requirements;
and S7, carrying out target detection on the sequence images by using the unmanned aerial vehicle scene detection model meeting the requirements in the step S6, and identifying the scene where the unmanned aerial vehicle is located.
2. The unmanned aerial vehicle scene detection method based on YOLOv4 of claim 1, wherein in the step S1, establishing the proprietary data set comprises the following steps:
s1.1, acquiring basic data samples, wherein the basic data samples comprise: intercepting a picture formed by shooting a video by an unmanned aerial vehicle, and intercepting a picture in an aerial data set on a network;
the picture comprises: pictures containing six targets of an automobile, a ship, an playground, a basketball court, a bridge and a port, and pictures containing six targets of the automobile, the ship, the playground, the basketball court, the bridge and the port;
s1.2, labeling the targets of the pictures in the basic data samples and processing the labels into the format required by the YOLO network, wherein each label comprises: category, center point abscissa, center point ordinate, target width and target height;
s1.3, expanding the pictures which are subjected to the tags and the pictures which are not subjected to the tags by adopting a data enhancement method to obtain a proprietary data set, wherein the proprietary data set comprises 1000 pictures.
3. The method of claim 1, wherein in step S1, the ratio of the training set to the test set is: 9:1.
4. The method of claim 1, wherein the specific training parameters in step S3 are: the batch size is 64, each image is 608x608, the batch subdivision is 16, the maximum number of training batches is 20000, and the initial learning rate is 0.0013.
5. The method of claim 1, wherein in step S4, the convergence value of the loss function is 0.5.
6. The method of claim 1, wherein in step S5, meeting the requirements means performing performance evaluation on the unmanned aerial vehicle scene detection model, with mAP@0.5 of 93.33% or more.
CN202010921511.2A 2020-09-04 2020-09-04 Unmanned aerial vehicle scene detection method based on YOLOv4 Withdrawn CN111985451A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010921511.2A CN111985451A (en) 2020-09-04 2020-09-04 Unmanned aerial vehicle scene detection method based on YOLOv4

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010921511.2A CN111985451A (en) 2020-09-04 2020-09-04 Unmanned aerial vehicle scene detection method based on YOLOv4

Publications (1)

Publication Number Publication Date
CN111985451A true CN111985451A (en) 2020-11-24

Family

ID=73447540

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010921511.2A Withdrawn CN111985451A (en) 2020-09-04 2020-09-04 Unmanned aerial vehicle scene detection method based on YOLOv4

Country Status (1)

Country Link
CN (1) CN111985451A (en)

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112287899A (en) * 2020-11-26 2021-01-29 山东捷讯通信技术有限公司 Unmanned aerial vehicle aerial image river drain detection method and system based on YOLO V5
CN112465794A (en) * 2020-12-10 2021-03-09 无锡卡尔曼导航技术有限公司 Golf ball detection method based on YOLOv4 and embedded platform
CN112508076A (en) * 2020-12-02 2021-03-16 国网江西省电力有限公司建设分公司 Intelligent identification method and system for abnormal state of power engineering
CN112561996A (en) * 2020-12-08 2021-03-26 江苏科技大学 Target detection method in autonomous underwater robot recovery docking
CN113158962A (en) * 2021-05-06 2021-07-23 北京工业大学 Swimming pool drowning detection method based on YOLOv4
CN113160219A (en) * 2021-05-12 2021-07-23 北京交通大学 Real-time railway scene analysis method for unmanned aerial vehicle remote sensing image
CN113361347A (en) * 2021-05-25 2021-09-07 东南大学成贤学院 Job site safety detection method based on YOLO algorithm
CN113420607A (en) * 2021-05-31 2021-09-21 西南电子技术研究所(中国电子科技集团公司第十研究所) Multi-scale target detection and identification method for unmanned aerial vehicle
CN113657261A (en) * 2021-08-16 2021-11-16 成都民航空管科技发展有限公司 Real-time target detection method and device based on airport remote tower panoramic video
CN113702393A (en) * 2021-09-29 2021-11-26 安徽理工大学 Intrinsic safety type mining conveyor belt surface damage detection system and detection method
WO2022267686A1 (en) * 2021-06-24 2022-12-29 广州汽车集团股份有限公司 Adaptive processing method, apparatus and system for automatic driving and new scene
JP2023527615A (en) * 2021-04-28 2023-06-30 ベイジン バイドゥ ネットコム サイエンス テクノロジー カンパニー リミテッド Target object detection model training method, target object detection method, device, electronic device, storage medium and computer program


Similar Documents

Publication Publication Date Title
CN111985451A (en) Unmanned aerial vehicle scene detection method based on YOLOv4
CN110532878B (en) Driver behavior identification method based on lightweight convolutional neural network
CN112380921A (en) Road detection method based on Internet of vehicles
CN114202672A (en) Small target detection method based on attention mechanism
CN111461083A (en) Rapid vehicle detection method based on deep learning
CN113723377B (en) Traffic sign detection method based on LD-SSD network
CN109376580B (en) Electric power tower component identification method based on deep learning
CN111680705B (en) MB-SSD method and MB-SSD feature extraction network suitable for target detection
CN112307853A (en) Detection method of aerial image, storage medium and electronic device
US20220114396A1 (en) Methods, apparatuses, electronic devices and storage media for controlling image acquisition
CN111597920A (en) Full convolution single-stage human body example segmentation method in natural scene
CN117671509A (en) Remote sensing target detection method and device, electronic equipment and storage medium
CN111553337A (en) Hyperspectral multi-target detection method based on improved anchor frame
CN117789077A (en) Method for predicting people and vehicles for video structuring in general scene
CN117437615A (en) Foggy day traffic sign detection method and device, storage medium and electronic equipment
CN112364864A (en) License plate recognition method and device, electronic equipment and storage medium
CN112115737B (en) Vehicle orientation determining method and device and vehicle-mounted terminal
CN116682085A (en) Lane line detection method based on geometric feature extraction and position information coding
CN115690770A (en) License plate recognition method based on space attention characteristics in non-limited scene
CN111104965A (en) Vehicle target identification method and device
CN115965831A (en) Vehicle detection model training method and vehicle detection method
CN113947723B (en) High-resolution remote sensing scene target detection method based on size balance FCOS
CN112686147B (en) Vehicle and wheel subordinate relation prediction method, system, storage medium and terminal
CN113610838A (en) Bolt defect data set expansion method
CN113011268A (en) Intelligent vehicle navigation method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20201124

WW01 Invention patent application withdrawn after publication