CN111460968A - Video-based UAV identification and tracking method and device - Google Patents
- Publication number
- CN111460968A (application number CN202010231230.4A)
- Authority
- CN
- China
- Prior art keywords
- video
- frame
- unmanned aerial vehicle
- tracking
- Prior art date: 2020-03-27
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/41—Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
- G06V20/42—Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items of sport video content
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/07—Target detection
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- General Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Computation (AREA)
- General Health & Medical Sciences (AREA)
- Biophysics (AREA)
- Biomedical Technology (AREA)
- Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Multimedia (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Image Analysis (AREA)
Abstract
The invention provides a video-based UAV identification and tracking method and device. The method includes: manually annotating the collected data set image by image to obtain annotated UAV samples covering multiple models and different sizes; training a YOLOv3-based network on this data set to obtain a trained deep learning object detection model; applying Retinex image enhancement to improve the image quality of the UAV video to be detected and recognizing every frame of the video with the trained detection model; and tracking the UAV in the video rapidly with the Sort algorithm. The invention can identify and track UAVs in video with high robustness and high precision, can enhance unclear UAV images, and is applicable to a variety of complex scenes.
Description
Technical Field
The invention relates to the field of UAV identification and tracking, and in particular to a video-based UAV identification and tracking method and device.
Background Art
Video-based moving-target detection and tracking already has a solid research foundation in science, technology, and engineering applications, and relatively mature solutions exist in intelligent transportation, intelligent surveillance, and artificial intelligence research. Modern UAVs play an increasingly important role and have attracted attention from all sides. As demands for intelligent systems grow, UAVs are naturally favored by many industries: drone filming at concerts, drone delivery by SF Express, drone photography for outdoor adventures, and so on, showing that UAVs are already well integrated into daily life and bring people much convenience. In recent years, real-time monitoring of UAVs has shown great military and civilian value and has drawn strong attention from academia and industry. As a typical video-based moving-target detection and tracking problem, applying existing technology to video surveillance of UAV moving targets, so as to achieve real-time detection and tracking of UAV targets, offers significant economic and social benefits in areas such as military alerting and public security.
Because small UAV targets are small in size, fly at variable speeds, and operate in complex environments, methods such as radar detection and passive localization alone are easily disturbed by clutter from other signals and tend to produce false alarms. The result obtained may cover only a few pixels and provides only the position of the UAV target; such methods cannot monitor the UAV's flight area and flight intent with high precision, nor provide accurate target localization for subsequent jamming and interception, so satisfactory results are hard to obtain. UAV identification and tracking methods based on optical image processing have appeared in recent years, but their performance remains unsatisfactory.
A search found Chinese invention patent application CN201911268966.2 (publication CN110706266A), which discloses a YOLOv3-based aerial target tracking method comprising: generating a model file; capturing video in real time and creating two threads, one for YOLOv3 target detection and one for KCF target tracking; performing target detection in the YOLOv3 thread; sending the target position information of step S03 to the KCF tracking thread while executing steps S07 and S11; starting the KCF tracking thread and checking whether it has finished initialization; manually setting the detection box; completing KCF parameter initialization; performing detection in the KCF tracking thread; taking the detection box with the largest response value as the target; updating the position parameters; and obtaining the final target position information. Although that patent uses YOLOv3, its tracking speed still needs improvement.
Summary of the Invention
In view of the above problems in the prior art, the present invention proposes a video-based UAV identification and tracking method and device that greatly improve the real-time performance of tracking.
To solve the above technical problems, the present invention is realized through the following technical solutions:
According to a first aspect of the present invention, a video-based UAV identification and tracking method is provided, comprising:
S11: obtaining annotated UAV image samples of multiple models and different sizes as a data set;
S12: training on the data set with a YOLOv3 network to obtain a trained deep learning object detection model;
S13: applying the Retinex image enhancement method to improve the image quality of the input video, and recognizing every frame of the input UAV video with the trained YOLOv3 deep learning object detection model to obtain the target UAV detection box of each frame, in preparation for the subsequent tracking task;
S14: based on the recognition result of S13, tracking the UAV in the video rapidly with the Sort algorithm.
The invention adopts improvements based on the YOLOv3 network and the Sort tracking algorithm, which increase tracking speed while still guaranteeing good accuracy.
Preferably, S11 specifically comprises:
collecting a large number of images containing UAVs, covering various UAV models with multiple images per model, resizing the UAV images to a uniform size, and annotating the UAV in each image one by one.
Preferably, in S12, the YOLOv3 network is trained on the data set and the network hyperparameters are adjusted until a deep learning object detection model is obtained whose gradient descent is stable, whose loss function has fallen to the expected value, and whose degree of fit meets the requirements.
Preferably, an attention mechanism is added to the Darknet-53 backbone of the YOLOv3 network to quickly extract the important features of the data and improve recognition. An attention mechanism focuses computation on important information and saves system resources; the plain max pooling or average pooling used in common convolutional neural networks is too crude and can leave key information unrecognized, so an attention mechanism alleviates this problem and improves model accuracy.
Preferably, in the YOLOv3 network, the loss function adopts the GIoU function as the measure of detection and localization performance:
$$\mathrm{GIoU} = \mathrm{IoU} - \frac{\left|C \setminus (A \cup B)\right|}{|C|}$$

In the above formula, A denotes the predicted box, B the ground-truth box, and C the area of the smallest enclosing region containing both A and B; the numerator of the second term is the area of C that covers neither A nor B. The GIoU value ranges from -1 to 1 and better reflects the relationship between the predicted box and the ground-truth box; IoU is the IoU loss value of the YOLOv3 network. The invention replaces the IoU loss with GIoU, which better captures the relationship between the predicted box and the ground-truth box and helps improve the recognition accuracy of the network.
Preferably, S13 further comprises:
converting the images of the input video into color-constancy images, which preserves high image fidelity and compresses the dynamic range of the image while enhancing color and maintaining color constancy, thereby improving the robustness of the subsequent recognition network.
The constancy image r(x, y) is

$$r(x,y) = \sum_{k=1}^{K} w_k \left\{ \log S(x,y) - \log\left[ F_k(x,y) * S(x,y) \right] \right\}$$

In the above formula, K is the number of Gaussian center-surround functions (k = 1, 2, 3), w_k is the weight corresponding to the k-th scale, S(x, y) is the observed image, F_k(x, y) is the k-th center-surround function, and * denotes convolution.
Preferably, in S14, tracking the UAV in the video rapidly with the Sort algorithm comprises:
in each frame, taking the detected UAV detection boxes as the reference while using a Kalman filter to predict the UAV tracking boxes; computing the IoU between all detection boxes of the current frame and all tracking boxes predicted by the Kalman filter; obtaining the optimal matching pairs of detection boxes and tracking boxes through the Hungarian algorithm; taking the matched detection boxes as the tracking result of the current frame; updating the Kalman tracker with the currently detected target positions; and then matching the predicted boxes of the next frame against the detection boxes of the next frame;
repeating the above process to achieve continuous tracking of the UAV.
According to a second aspect of the present invention, a video-based UAV identification and tracking device is provided, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the program, can perform the video-based UAV identification and tracking method described above.
Compared with the prior art, the present invention has the following beneficial effects:
The video-based UAV identification and tracking method provided by the present invention trains a network model on a large data set, uses deep learning for UAV identification and tracking, improves the existing network, and enhances the images, yielding identification and tracking results that are more accurate and more robust.
Aimed at the situation where current target tracking cannot balance real-time performance and accuracy, the video-based UAV identification and tracking method provided by the present invention offers very fast tracking while maintaining high tracking accuracy, and is applicable to practical target tracking tasks.
Brief Description of the Drawings
Other features, objects, and advantages of the present invention will become more apparent upon reading the detailed description of non-limiting embodiments with reference to the following drawings:
Fig. 1 is a flowchart of a video-based UAV identification and tracking method according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of the modified Darknet-53 network structure based on the YOLOv3 network according to an embodiment of the present invention;
Fig. 3 is a block flow diagram of a video-based UAV identification and tracking method according to an embodiment of the present invention.
Detailed Description of Embodiments
The embodiments of the present invention are described in detail below. This embodiment is implemented on the premise of the technical solution of the present invention and provides a detailed implementation and a specific operating process. It should be noted that those of ordinary skill in the art can make several variations and improvements without departing from the concept of the present invention, all of which fall within the protection scope of the present invention.
Fig. 1 shows a flowchart of the video-based UAV identification and tracking method according to an embodiment of the present invention.
Referring to Fig. 1, the video-based UAV identification and tracking method of this embodiment comprises the following steps:
S11: obtaining annotated UAV image samples of multiple models and different sizes.
A large number of images containing UAVs are collected, covering various UAV models with multiple images per model; the UAV in each image is annotated one by one to obtain annotated UAV image samples as the training data set.
S12: training on the data set obtained in S11 with the YOLOv3 network to obtain a trained deep learning object detection model.
S13: applying the Retinex image enhancement method to improve the image quality of the UAV video to be detected, and recognizing every frame of the video with the deep learning object detection model to obtain the target UAV detection box of each frame, in preparation for the subsequent tracking task.
S14: based on the target UAV detection boxes obtained in S13, tracking the UAV in the video rapidly with the Sort algorithm.
This embodiment trains the network model on a large data set and uses deep learning for UAV identification and tracking, which improves the accuracy and robustness of UAV identification and tracking; when the UAV image is unclear, image enhancement can be applied, making the method suitable for various complex scenes. Tracking speed is increased while good accuracy is maintained.
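For illustration only, the overall inference flow of S12-S14 can be sketched in Python as follows. The names `enhance_retinex`, `detector`, and `tracker` are hypothetical stand-ins for the Retinex step, the trained YOLOv3 model, and the Sort tracker described in this embodiment, not identifiers from the patent:

```python
import cv2  # OpenCV, used here for video I/O and drawing


def run(video_path, detector, tracker, enhance_retinex):
    """Detect-then-track loop: enhance each frame (S13), detect UAV boxes
    with the trained model (S12), and associate them into tracks (S14)."""
    cap = cv2.VideoCapture(video_path)
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        frame = enhance_retinex(frame)       # Retinex image enhancement (S13)
        boxes = detector.detect(frame)       # per-frame UAV detection boxes
        tracks = tracker.update(boxes)       # associate boxes with track IDs (S14)
        for track_id, (x1, y1, x2, y2) in tracks:
            cv2.rectangle(frame, (int(x1), int(y1)),
                          (int(x2), int(y2)), (0, 255, 0), 2)
    cap.release()
```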
In a preferred embodiment, S11 may use 2664 UAV images of identical size as the training data set, covering essentially all kinds of UAVs in different states and against various backgrounds. Of course, this number of images is merely illustrative; other embodiments may use a different number of UAV images, not limited to 2664.
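The patent does not specify the annotation format. As an assumption, YOLO-style training commonly uses one label file per image with normalized box coordinates; a conversion from pixel coordinates might look like:

```python
# Hypothetical YOLO-format label: "class x_center y_center width height",
# all coordinates normalized to [0, 1]; class 0 is assumed to mean "drone".
def to_yolo_label(x1, y1, x2, y2, img_w, img_h, cls=0):
    xc = (x1 + x2) / 2 / img_w
    yc = (y1 + y2) / 2 / img_h
    w = (x2 - x1) / img_w
    h = (y2 - y1) / img_h
    return f"{cls} {xc:.6f} {yc:.6f} {w:.6f} {h:.6f}"

# A 60x40-pixel drone box at (200, 120) in a 960x540 image:
print(to_yolo_label(200, 120, 260, 160, 960, 540))
# -> "0 0.239583 0.259259 0.062500 0.074074"
```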
In another preferred embodiment, S12 trains the YOLOv3 network on the data set obtained in S11 and adjusts the network hyperparameters until a deep learning model is obtained whose gradient descent is stable, whose loss function has fallen to the expected value, and whose degree of fit meets the requirements. This embodiment applies the YOLOv3 network, commonly used in image domains such as vehicle detection, to UAV identification and tracking. To achieve better results, the following improvements are made on top of the original YOLOv3 network:
1) An attention mechanism is added to the Darknet-53 backbone of the YOLOv3 network, which quickly extracts the important features of the data and improves recognition. An attention mechanism focuses computation on important information and saves system resources; the plain max pooling or average pooling used in common convolutional neural networks is too crude and can leave key information unrecognized, so an attention mechanism alleviates this problem and improves model accuracy.
2) The loss function is improved from IoU (Intersection over Union) to GIoU (Generalized Intersection over Union), which better reflects the relationship between the predicted box and the ground-truth box and compensates for the shortcomings of IoU.
In the YOLOv3 network, IoU is used as the measure of object detection and localization performance:

$$\mathrm{IoU} = \frac{|A \cap B|}{|A \cup B|}$$

In the above formula, A denotes the predicted box and B the ground-truth box; the numerator is the area of their intersection and the denominator the area of their union. However, if the predicted box and the ground-truth box do not intersect, IoU is zero and cannot be optimized; moreover, equal IoU values do not imply equally good detections. GIoU improves on these problems:
$$\mathrm{GIoU} = \mathrm{IoU} - \frac{\left|C \setminus (A \cup B)\right|}{|C|}$$

In the above formula, C denotes the area of the smallest enclosing region containing both A and B, and the numerator of the second term is the area of C that covers neither A nor B. Since IoU ranges from 0 to 1 while GIoU ranges from -1 to 1, GIoU better reflects the relationship between the predicted box and the ground-truth box, which helps improve the recognition accuracy of the network.
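As an illustration only (this code is not from the patent), a minimal Python sketch of IoU and GIoU for axis-aligned boxes in [x1, y1, x2, y2] format:

```python
def iou_giou(a, b):
    """IoU and GIoU for two axis-aligned boxes [x1, y1, x2, y2]."""
    # Intersection area |A ∩ B|.
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    iou = inter / union

    # Smallest enclosing box C of A and B.
    area_c = ((max(a[2], b[2]) - min(a[0], b[0]))
              * (max(a[3], b[3]) - min(a[1], b[1])))

    # GIoU = IoU - |C \ (A ∪ B)| / |C|, which lies in (-1, 1].
    return iou, iou - (area_c - union) / area_c

# Disjoint boxes: IoU is 0 regardless of separation, but GIoU still
# decreases with distance and therefore still provides a training signal.
print(iou_giou([0, 0, 2, 2], [3, 3, 5, 5]))  # (0.0, -0.68)
```

During training, the corresponding loss is typically taken as 1 - GIoU.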
As shown in Fig. 2, the modified Darknet-53 network structure based on the YOLOv3 network comprises 52 convolutional layers and 23 residual units. The network is fully convolutional and makes extensive use of residual skip connections; to reduce the negative gradient effects caused by pooling, pooling layers are abandoned and downsampling is performed five times with stride-2 convolutions. In this five-stage downsampling process, each convolutional layer is followed by a residual unit and an attention mechanism. For example, an input of 416x416 yields an output of 13x13 (416/2^5 = 13), which performs the tensor size transformation. These improvements markedly increase the accuracy and robustness of UAV identification and tracking.
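The patent does not name a specific attention module. As one common possibility, a squeeze-and-excitation style channel-attention block placed after a residual unit could be sketched in PyTorch as:

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Squeeze-and-excitation style channel attention (an assumed choice;
    the patent only states that an attention mechanism is added)."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)   # squeeze: one value per channel
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),                     # excitation: weights in (0, 1)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w                          # reweight the feature maps

# Usage inside the backbone, e.g.: y = ChannelAttention(256)(residual_out)
```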
In another preferred embodiment, S13 converts the images of the UAV video into color-constancy images, which preserves high image fidelity and compresses the dynamic range of the image while enhancing color and maintaining color constancy. Specifically, the constancy image r(x, y) is

$$r(x,y) = \sum_{k=1}^{K} w_k \left\{ \log S(x,y) - \log\left[ F_k(x,y) * S(x,y) \right] \right\}$$

In the above formula, K = 3 is the number of scales, and the enhancement is applied to each of the three RGB channels; w_k is the weight corresponding to the k-th scale, each equal to 1/3; the three scales are 15, 101, and 301; S(x, y) is the observed image; F_k(x, y) is the k-th center-surround function; and * denotes convolution.
In another preferred embodiment, S14 uses the Sort algorithm to track the UAV in the video rapidly, which may be implemented as follows:
The video is passed through the trained YOLOv3 deep learning object detection model, which detects every frame of the input UAV video and yields the target UAV detection boxes of each frame. In each frame, the detected UAV detection boxes serve as the reference while a Kalman filter predicts the UAV tracking boxes; the IoU between all detection boxes of the current frame and all Kalman-predicted tracking boxes is computed; the Hungarian algorithm yields the optimal matching pairs of detection boxes and tracking boxes; the matched detection boxes are taken as the tracking result of the current frame; the Kalman tracker is updated with the currently detected target positions; and the predicted boxes of the next frame are then matched against the detection boxes of the next frame. This achieves continuous tracking of the target.
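As an illustrative sketch of the association step only (the Kalman prediction and update are abstracted away; `linear_sum_assignment` is SciPy's Hungarian-algorithm solver):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment  # Hungarian algorithm

def associate(detections, predictions, iou_threshold=0.3):
    """Match current-frame detection boxes to Kalman-predicted tracking boxes.

    Both arguments are lists of [x1, y1, x2, y2] boxes; returns index pairs
    (detection_index, track_index) whose IoU clears the threshold.
    """
    iou = np.zeros((len(detections), len(predictions)))
    for d, det in enumerate(detections):
        for t, trk in enumerate(predictions):
            iw = max(0.0, min(det[2], trk[2]) - max(det[0], trk[0]))
            ih = max(0.0, min(det[3], trk[3]) - max(det[1], trk[1]))
            inter = iw * ih
            union = ((det[2] - det[0]) * (det[3] - det[1])
                     + (trk[2] - trk[0]) * (trk[3] - trk[1]) - inter)
            iou[d, t] = inter / union if union > 0 else 0.0
    # linear_sum_assignment minimizes total cost, so negate IoU to maximize it.
    det_idx, trk_idx = linear_sum_assignment(-iou)
    keep = iou[det_idx, trk_idx] >= iou_threshold
    return det_idx[keep], trk_idx[keep]
```

Matched detections update their Kalman trackers; unmatched detections would start new tracks, as in the standard Sort pipeline.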
The video-based UAV identification and tracking method of the above embodiments trains a network model on a large data set, uses deep learning for UAV identification and tracking, improves the existing network, and enhances the images, yielding identification and tracking results that are more accurate and more robust.
Another embodiment of the present invention further provides a video-based UAV identification and tracking device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the program, can perform the video-based UAV identification and tracking method of the above embodiments.
Optionally, the memory is used to store the program. The memory may include volatile memory, for example random-access memory (RAM) such as static random-access memory (SRAM) or double data rate synchronous dynamic random access memory (DDR SDRAM); the memory may also include non-volatile memory, for example flash memory. The memory is used to store computer programs (such as application programs and functional modules implementing the above method), computer instructions, and the like, which may be stored in partitions across one or more memories and may be called by the processor.
The processor is configured to execute the computer program stored in the memory to implement the steps of the methods in the above embodiments. For details, refer to the relevant descriptions in the foregoing method embodiments.
The processor and the memory may be independent structures or an integrated structure. When they are independent structures, the memory and the processor may be coupled through a bus.
Embodiments of the present application further provide a computer-readable storage medium storing computer-executable instructions; when at least one processor of a user device executes these instructions, the user device performs the various possible methods described above.
Only preferred embodiments of the present invention are disclosed herein. This specification selects and specifically describes these embodiments to better explain the principles and practical application of the present invention, not to limit it. Any modifications and changes made by those skilled in the art within the scope of the specification shall fall within the protection scope of the present invention. Each of the preferred features described above may be used alone in any embodiment or, provided they do not conflict, in any combination.
Claims (8)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010231230.4A (granted as CN111460968B) | 2020-03-27 | 2020-03-27 | Video-based drone identification and tracking method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111460968A true CN111460968A (en) | 2020-07-28 |
CN111460968B CN111460968B (en) | 2024-02-06 |
Family
ID=71680515
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010231230.4A (granted as CN111460968B, active) | Video-based drone identification and tracking method and device | 2020-03-27 | 2020-03-27 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111460968B (en) |
Patent Citations (7)
Publication number | Priority date | Publication date | Title |
---|---|---|---|
CN101706964A (en) * | 2009-08-27 | 2010-05-12 | 北京交通大学 | Color constancy calculating method and system based on derivative structure of image |
CN101674490A (en) * | 2009-09-23 | 2010-03-17 | 电子科技大学 | A color constancy method for color images based on retinal vision mechanism |
WO2018133666A1 (en) * | 2017-01-17 | 2018-07-26 | 腾讯科技(深圳)有限公司 | Method and apparatus for tracking video target |
US20190362190A1 (en) * | 2018-05-28 | 2019-11-28 | Samsung Electronics Co., Ltd. | Method and system for dnn based imaging |
CN110070561A (en) * | 2019-03-13 | 2019-07-30 | 合肥师范学院 | A kind of image enhancement of blurred vision condition and method for tracking target and system |
CN110516556A (en) * | 2019-07-31 | 2019-11-29 | 平安科技(深圳)有限公司 | Multi-target tracking detection method, device and storage medium based on Darkflow-DeepSort |
CN110706266A (en) * | 2019-12-11 | 2020-01-17 | 北京中星时代科技有限公司 | Aerial target tracking method based on YOLOv3 |
Non-Patent Citations (1)
Title |
---|
徐义鎏 (XU Yiliu), 贺鹏 (HE Peng): "YOLOv3 vehicle detection algorithm with an improved loss function" (改进损失函数的Yolov3车型检测算法), no. 12, pages 4-7 *
Cited By (20)
Publication number | Priority date | Publication date | Title |
---|---|---|---|
CN111931654A (en) * | 2020-08-11 | 2020-11-13 | 精英数智科技股份有限公司 | Intelligent monitoring method, system and device for personnel tracking |
CN112184767A (en) * | 2020-09-22 | 2021-01-05 | 深研人工智能技术(深圳)有限公司 | Method, device, equipment and storage medium for tracking moving object track |
CN112132032A (en) * | 2020-09-23 | 2020-12-25 | 平安国际智慧城市科技股份有限公司 | Traffic sign detection method and device, electronic equipment and storage medium |
CN112288770A (en) * | 2020-09-25 | 2021-01-29 | 航天科工深圳(集团)有限公司 | Video real-time multi-target detection and tracking method and device based on deep learning |
CN112348057A (en) * | 2020-10-20 | 2021-02-09 | 歌尔股份有限公司 | Target identification method and device based on YOLO network |
CN112419368A (en) * | 2020-12-03 | 2021-02-26 | 腾讯科技(深圳)有限公司 | Method, device and equipment for tracking track of moving target and storage medium |
CN112465854A (en) * | 2020-12-17 | 2021-03-09 | 北京三川未维科技有限公司 | Unmanned aerial vehicle tracking method based on anchor-free detection algorithm |
CN113139419B (en) * | 2020-12-28 | 2024-05-31 | 西安天和防务技术股份有限公司 | Unmanned aerial vehicle detection method and device |
CN113139419A (en) * | 2020-12-28 | 2021-07-20 | 西安天和防务技术股份有限公司 | Unmanned aerial vehicle detection method and device |
CN112734794A (en) * | 2021-01-14 | 2021-04-30 | 北京航空航天大学 | Moving target tracking and positioning method based on deep learning |
CN112734794B (en) * | 2021-01-14 | 2022-12-23 | 北京航空航天大学 | A moving target tracking and localization method based on deep learning |
CN112819858A (en) * | 2021-01-29 | 2021-05-18 | 北京博雅慧视智能技术研究院有限公司 | Target tracking method, device and equipment based on video enhancement and storage medium |
CN112819858B (en) * | 2021-01-29 | 2024-03-22 | 北京博雅慧视智能技术研究院有限公司 | Target tracking method, device, equipment and storage medium based on video enhancement |
CN112906523A (en) * | 2021-02-04 | 2021-06-04 | 上海航天控制技术研究所 | Hardware accelerated deep learning target machine type identification method |
CN112884811A (en) * | 2021-03-18 | 2021-06-01 | 中国人民解放军国防科技大学 | Photoelectric detection tracking method and system for unmanned aerial vehicle cluster |
CN114140501A (en) * | 2022-01-30 | 2022-03-04 | 南昌工程学院 | Target tracking method and device and readable storage medium |
WO2025038580A1 (en) * | 2023-08-14 | 2025-02-20 | Epirus, Inc. | Stacking color and motion signals to detect tiny objects |
CN117455955A (en) * | 2023-12-14 | 2024-01-26 | 武汉纺织大学 | Pedestrian multi-target tracking method based on unmanned aerial vehicle visual angle |
CN117455955B (en) * | 2023-12-14 | 2024-03-08 | 武汉纺织大学 | Pedestrian multi-target tracking method based on unmanned aerial vehicle visual angle |
CN120014560A (en) * | 2025-04-18 | 2025-05-16 | 民航成都电子技术有限责任公司 | Aircraft identification method, device, medium and electronic equipment based on panoramic video |
Also Published As
Publication number | Publication date |
---|---|
CN111460968B (en) | 2024-02-06 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111460968B (en) | Video-based drone identification and tracking method and device | |
CN112418117B (en) | Small target detection method based on unmanned aerial vehicle image | |
CN111898651B (en) | A tree detection method based on Tiny YOLOV3 algorithm | |
WO2021147325A1 (en) | Object detection method and apparatus, and storage medium | |
CN106845430A (en) | Pedestrian detection and tracking based on acceleration region convolutional neural networks | |
Cepni et al. | Vehicle detection using different deep learning algorithms from image sequence | |
CN110555420B (en) | Fusion model network and method based on pedestrian regional feature extraction and re-identification | |
CN106897738A (en) | A kind of pedestrian detection method based on semi-supervised learning | |
CN114241511B (en) | Weak supervision pedestrian detection method, system, medium, equipment and processing terminal | |
CN113807399A (en) | Neural network training method, neural network detection method and neural network detection device | |
CN110781744A (en) | A small-scale pedestrian detection method based on multi-level feature fusion | |
CN109919223B (en) | Target detection method and device based on deep neural network | |
CN110969648A (en) | 3D target tracking method and system based on point cloud sequence data | |
CN110096979B (en) | Model construction method, crowd density estimation method, device, equipment and medium | |
Ataş | Performance evaluation of jaccard-dice coefficient on building segmentation from high resolution satellite images | |
Castellano et al. | Density-based clustering with fully-convolutional networks for crowd flow detection from drones | |
CN113033356B (en) | A scale-adaptive long-term correlation target tracking method | |
CN117809230A (en) | Water flow velocity identification method based on image identification and related products | |
CN118351435A (en) | A method and device for detecting target in UAV remote sensing images based on lightweight model LTE-Det | |
Wu et al. | Research on asphalt pavement disease detection based on improved YOLOv5s | |
CN109934147B (en) | Target detection method, system and device based on deep neural network | |
Zhang et al. | Visual image and radio signal fusion identification based on convolutional neural networks | |
CN102156879A (en) | Human target matching method based on weighted terrestrial motion distance | |
Zhu et al. | Real-time traffic sign detection based on YOLOv2 | |
May et al. | Polo–point-based, multi-class animal detection |
Legal Events
Code | Title |
---|---|
PB01 | Publication |
SE01 | Entry into force of request for substantive examination |
GR01 | Patent grant |