CN114220053B - Unmanned aerial vehicle video vehicle retrieval method based on vehicle feature matching - Google Patents

Unmanned aerial vehicle video vehicle retrieval method based on vehicle feature matching

Info

Publication number
CN114220053B
CN114220053B
Authority
CN
China
Prior art keywords
vehicle
map
feature
layer
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111534212.4A
Other languages
Chinese (zh)
Other versions
CN114220053A (en
Inventor
吕京国
白颖奇
曹逸飞
王琛
贺柳良
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Lingyun Space Technology Co ltd
Original Assignee
Beijing University of Civil Engineering and Architecture
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Civil Engineering and Architecture filed Critical Beijing University of Civil Engineering and Architecture
Priority to CN202111534212.4A priority Critical patent/CN114220053B/en
Publication of CN114220053A publication Critical patent/CN114220053A/en
Application granted granted Critical
Publication of CN114220053B publication Critical patent/CN114220053B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/048Activation functions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention provides an unmanned aerial vehicle (UAV) video vehicle retrieval method based on vehicle feature matching, which comprises the following steps: inputting each image frame into a trained light suppression model and a feature-enhanced multi-scale vehicle detection module to obtain a plurality of vehicle detection result boxes; cropping the image inside each vehicle detection result box from the image frame to obtain z detected vehicle images; and inputting each detected vehicle image together with the target vehicle image S into a multi-feature joint vehicle search network for feature matching to obtain the detected vehicle image containing the target vehicle, thereby completing retrieval and positioning of the target vehicle. The method is suitable for videos shot by a UAV in different complex scenes; it removes, to the greatest extent, the loss of vehicle detail caused by strong illumination and the influence of target size changes at different UAV altitudes, solves the problem that the vehicle to be queried is difficult to find among many targets, and retrieves the vehicle to be queried more accurately.

Description

A UAV video vehicle retrieval method based on vehicle feature matching

Technical Field

The invention belongs to the technical field of intelligent processing of remote sensing information, and in particular relates to a UAV video vehicle retrieval method based on vehicle feature matching.

Background Art

Ground surveillance video obtains road information from fixed cameras installed at key locations such as intersections and expressway junctions, and has the advantages of all-weather operation and low sensitivity to the environment. Vehicle retrieval systems based on ground surveillance video mainly include: (1) traditional vehicle retrieval methods, which extract detailed features of the target vehicle, such as Haar, SIFT and HOG features, through algorithms such as bag of visual words and deep hashing; their limited ability to represent vehicles leads to weak discrimination between similar vehicles; (2) deep-learning-based vehicle retrieval methods, which train a neural network on a large number of samples so that the network can extract vehicle features and complete the retrieval task. These methods extract the semantic information of vehicles based on classic object detection networks such as Faster R-CNN, YOLOv3 and SPP-Net, and achieve high retrieval accuracy in simple scenes.

However, because the shooting angle of ground surveillance cameras is oblique, the video contains vehicles at many scales, which causes missed detections and thus reduces the accuracy of vehicle retrieval. Moreover, since the camera position is fixed, the vehicle to be retrieved appears only briefly in the frame, which is of limited help for subsequent tracking tasks.

Unlike ground surveillance, UAVs have the advantages of low cost, rapid deployment, flexible maneuvering and a wide monitoring range. Vehicle retrieval from UAV surveillance video can not only quickly retrieve vehicles at any intersection, but also continue with tasks such as tracking the target vehicle after a successful retrieval. However, because the size of every vehicle in the video changes with the UAV's altitude, an unreasonable candidate-box design or an overly deep network leads to insufficient box regression ability and dilution of target information, so that very large or very small vehicles are missed. In addition, because UAVs are usually flown in good weather, the video often contains over-bright regions in which vehicle details are lost, so vehicles in those regions are missed as well.

Summary of the Invention

Aiming at the missed-detection problem that arises when the prior art is applied directly to UAV video vehicle retrieval, the present invention provides a UAV video vehicle retrieval method based on vehicle feature matching, which can effectively solve the above problems.

本发明采用的技术方案如下:The technical scheme adopted in the present invention is as follows:

本发明提供一种基于车辆特征匹配的无人机视频车辆检索方法,包括以下步骤:The present invention provides a UAV video vehicle retrieval method based on vehicle feature matching, comprising the following steps:

步骤1,确定需要检索的目标车辆图S;Step 1, determine the target vehicle map S to be retrieved;

步骤2,无人机对地面进行拍摄,获得无人机视频数据;Step 2, the drone shoots the ground to obtain the drone video data;

步骤3,对所述无人机视频数据的每一帧图像均执行步骤4-步骤8,判断每一帧图像中是否包含需要检索的目标车辆图S:Step 3: Steps 4 to 8 are performed on each frame of the UAV video data to determine whether each frame of the image contains the target vehicle map S that needs to be retrieved:

其中,将当前图像帧记为Frm(t),t为当前图像帧的帧数,采用以下步骤4-步骤8,判断图像帧Frm(t)中是否包含需要检索的目标车辆图S:Wherein, the current image frame is denoted as Frm(t), t is the frame number of the current image frame, and the following steps 4 to 8 are used to determine whether the image frame Frm(t) contains the target vehicle map S that needs to be retrieved:

步骤4,将所述图像帧Frm(t)输入到训练完成的抑光模型中,进行特征提取和抑光处理,得到包含n个图层的光照抑制特征图,记为FRestrainMap;Step 4: Input the image frame Frm(t) into the light suppression model that has been trained, perform feature extraction and light suppression processing, and obtain a light suppression feature map containing n layers, denoted as F Restrain Map;

步骤5,将光照抑制特征图FRestrainMap输入到特征增强的多尺度车辆检测模块中,获取图像帧Frm(t)中的z个车辆检测结果框:Step 5: Input the light suppression feature map F Restrain Map into the feature-enhanced multi-scale vehicle detection module, and obtain z vehicle detection result frames in the image frame Frm(t):

Step 5.1: the light suppression feature map FRestrainMap has n layers; for each layer, denoted layeri, i = 1, ..., n, perform steps 5.1.1 to 5.1.3 to obtain the dependent weight value w″i of layeri:

步骤5.1.1,计算图层layeri的所有像素点的平均值,作为图层layeri的初始权重wiStep 5.1.1, calculate the average value of all pixels of layer i as the initial weight w i of layer i ;

Step 5.1.2: input the initial weight wi of layeri into a fully connected layer and map it to the (0, 1) feature space through a sigmoid activation function, outputting the normalized weight value w′i of layeri;

Step 5.1.3: build a piecewise function that suppresses or enhances the normalized weight value w′i of layeri segment by segment, obtaining the dependent weight value w″i of layeri:

Figure BDA0003412564660000031

其中:in:

ε代表系统常数,用于调节依赖权重值对图层的影响程度;ε represents the system constant, which is used to adjust the degree of influence of the dependent weight value on the layer;

步骤5.2,由于得到光照抑制特征图FRestrainMap的n个图层的依赖权重值,分别为:w″1...w″nIn step 5.2, the dependent weight values of the n layers of the light suppression feature map F Restrain Map are obtained, respectively: w″ 1 ... w″ n ;

将w″1...w″n合并,得到光照抑制特征图FRestrainMap的1*1*n的依赖权重向量W″;Combine w″ 1 ... w″ n to obtain the 1*1*n dependent weight vector W″ of the light suppression feature map F Restrain Map;

将依赖权重向量W″作为卷积核对光照抑制特征图FRestrainMap进行卷积,得到图层增强特征图FEhcMap;Convolve the light suppression feature map F Restrain Map with the dependent weight vector W" as the convolution kernel to obtain the layer enhancement feature map F Ehc Map;
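As an illustrative sketch of steps 5.1 and 5.2, the following Python/NumPy code computes the per-layer dependent weights and reweights the light suppression feature map channel by channel. The fully connected layer is reduced here to a single scalar weight and bias, and both the piecewise suppression/enhancement function and the reading of the 1*1*n convolution as channel-wise reweighting are assumptions, since the patent gives the exact function only as an image.

```python
import numpy as np

def layer_dependent_weights(f_restrain, eps=0.5, fc_w=1.0, fc_b=0.0, thresh=0.5):
    """Sketch of steps 5.1-5.2 under the assumptions stated above.
    f_restrain: light suppression feature map of shape (H, W, n)."""
    n = f_restrain.shape[-1]
    w = f_restrain.reshape(-1, n).mean(axis=0)            # step 5.1.1: mean of all pixels per layer
    w_norm = 1.0 / (1.0 + np.exp(-(fc_w * w + fc_b)))     # step 5.1.2: FC + sigmoid -> (0, 1)
    # step 5.1.3 (assumed form): suppress small weights, enhance large ones, modulated by eps
    w_dep = np.where(w_norm < thresh, w_norm / (1.0 + eps), w_norm * (1.0 + eps))
    # step 5.2: apply the 1*1*n dependent weight vector W" to every layer (channel-wise reweighting)
    f_ehc = f_restrain * w_dep                             # broadcast over H and W
    return w_dep, f_ehc
```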

步骤5.3,将图层增强特征图FEhcMap输入到小目标响应层,得到小目标显著特征图FSmallMap;Step 5.3, input the layer enhancement feature map F Ehc Map to the small target response layer, and obtain the small target salient feature map F Small Map;

其中,小目标显著特征图FSmallMap中包含更多的车辆细节信息,在无人机飞行高度较高时,可提高小目标车辆检测的成功率;Among them, the small target salient feature map F Small Map contains more vehicle detail information, which can improve the success rate of small target vehicle detection when the drone is flying at a high altitude;

步骤5.4,将小目标显著特征图FSmallMap输入到大目标响应层,得到大目标显著特征图FLargeMap;Step 5.4, input the salient feature map F Small Map of the small target into the response layer of the large target, and obtain the salient feature map F Large Map of the large target;

其中:大目标显著特征图FLargeMap包含更多的语义信息,在无人机飞行高度较低时,可提高大目标车辆检测的精确率;Among them: the large target salient feature map F Large Map contains more semantic information, which can improve the accuracy of large target vehicle detection when the UAV is flying at a low altitude;

步骤5.5,将小目标显著特征图FSmallMap输入到结果框生成层,从而在图像帧Frm(t)中,得到p个小目标车辆检测结果框BoxSmall(1)...BoxSmall(p);Step 5.5, input the small target salient feature map F Small Map into the result frame generation layer, so that in the image frame Frm(t), p small target vehicle detection result frames Box Small (1)...Box Small (p );

将大目标显著特征图FLargeMap输入到结果框生成层,从而在图像帧Frm(t)中,得到q个大目标车辆检测结果框BoxLarge(1)...BoxLarge(q);Input the large target salient feature map F Large Map into the result box generation layer, so that in the image frame Frm(t), q large target vehicle detection result boxes Box Large (1)...Box Large (q) are obtained;

具体方法为:The specific method is:

Step 5.5.1: take each pixel of the small-target salient feature map FSmallMap as an anchor point and, centered on each anchor point, generate several candidate boxes of different sizes; over all pixels of FSmallMap this yields a set of candidate boxes;

步骤5.5.2,计算得到每个候选框的车辆概率值;Step 5.5.2, calculate the vehicle probability value of each candidate frame;

步骤5.5.3,对候选框进行筛选,去除车辆概率值低于预设阈值的候选框,从而得到候选框:A1,A2...Ap;其中,p代表候选框数量;Step 5.5.3: Screen the candidate frames, and remove the candidate frames whose vehicle probability value is lower than the preset threshold, so as to obtain the candidate frames: A 1 , A 2 . . . A p ; wherein, p represents the number of candidate frames;

步骤5.5.4计算候选框A1,A2...Ap中每个候选框的回归参数,每个候选框均具有以下回归参数:宽度,高度和锚点偏移量;Step 5.5.4 Calculate the regression parameters of each candidate box in the candidate boxes A 1 , A 2 . . . A p , each candidate box has the following regression parameters: width, height and anchor offset;

步骤5.5.5,将候选框A1,A2...Ap中每个候选框的锚点坐标和其对应的回归参数映射回图像帧Frm(t),从而在图像帧Frm(t)中,得到p个小目标车辆检测结果框BoxSmall(1)...BoxSmall(p);Step 5.5.5, map the anchor point coordinates and the corresponding regression parameters of each candidate frame in the candidate frame A 1 , A 2 . . . A p back to the image frame Frm(t), so that in the image frame Frm(t) , obtain p small target vehicle detection result boxes Box Small (1)...Box Small (p);

步骤5.5.6,以大目标显著特征图FLargeMap替换步骤5.5.1中的小目标显著特征图FSmallMap,增大步骤5.5.1中的候选框的初始生成尺寸,采用步骤5.5.1-5.5.5的方法,在图像帧Frm(t)中,得到q个大目标车辆检测结果框BoxLarge(1)...BoxLarge(q);Step 5.5.6, replace the small target salient feature map F Small Map in step 5.5.1 with the large target salient feature map F Large Map, increase the initial generation size of the candidate frame in step 5.5.1, and use step 5.5.1 - The method of 5.5.5, in the image frame Frm(t), obtain q large target vehicle detection result boxes Box Large (1)...Box Large (q);

步骤5.6,将图像帧Frm(t)中的p个小目标车辆检测结果框BoxSmall(1)...BoxSmall(p)和q个大目标车辆检测结果框BoxLarge(1)...BoxLarge(q),统称为p+q个车辆检测结果框;Step 5.6, put the p small target vehicle detection result boxes Box Small (1)...Box Small (p) and q large target vehicle detection result boxes Box Large (1)... Box Large (q), collectively referred to as p+q vehicle detection result boxes;

For the p+q vehicle detection result boxes obtained in the image frame Frm(t), compute the similarity coefficient between every pair of boxes; if the similarity coefficient is below the set threshold, no processing is performed; if it exceeds the threshold, the two boxes are merged into a single vehicle detection result box, finally yielding z vehicle detection result boxes, denoted Box(1) ... Box(z);

步骤6,在图像帧Frm(t)中分别截取每个车辆检测结果框中的图像,得到z个检测车辆图;Step 6, in the image frame Frm(t), respectively intercept the image in each vehicle detection result frame to obtain z detection vehicle images;

Step 7: input each detected vehicle image together with the target vehicle image S into the multi-feature joint vehicle search network for feature matching, obtaining the detected vehicle image that contains the target vehicle; the position of that detected vehicle image in the image frame Frm(t) is the position of the target vehicle in Frm(t), which completes retrieval and positioning of the target vehicle;

Step 8: if the matching degree between every detected vehicle image in the current frame Frm(t) and the target vehicle image S is below the set threshold, the target vehicle is not present in Frm(t), and retrieval continues with the next image frame Frm(t+1).
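As an illustrative sketch, the per-frame retrieval loop of steps 3 to 8 can be written as follows (Python). The callables passed in are hypothetical stand-ins for the light suppression model, the detection module, the cropping step and the multi-feature matching network described above, not the patent's actual implementation.

```python
def retrieve_target(frames, target_img, suppress_light, detect_vehicles, crop, match,
                    match_thresh=0.5):
    """Sketch of steps 3-8: scan frames until the target vehicle is found."""
    for t, frame in enumerate(frames):
        f_restrain = suppress_light(frame)                  # step 4: light suppression feature map
        boxes = detect_vehicles(f_restrain)                 # step 5: z detection result boxes (frame coords)
        crops = [crop(frame, box) for box in boxes]         # step 6: z detected vehicle images
        scores = [match(img, target_img) for img in crops]  # step 7: feature matching scores
        if scores and max(scores) >= match_thresh:
            best = scores.index(max(scores))
            return t, boxes[best]                           # frame index and box of the target vehicle
    return None                                             # step 8: target not found in any frame
```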

优选的,步骤4具体为:Preferably, step 4 is specifically:

步骤4.1,构建抑光模型;Step 4.1, build a light suppression model;

所述抑光模型为双分支网络,包括学习分支网络和抑制分支网络;其中,所述学习分支网络包括串联的卷积层conv1、浅层特征挑选层f1()和深层特征挑选层f2();所述抑制分支网络包括串联的卷积层conv1′、浅层特征挑选层f′1()和深层特征挑选层f′2();The light suppression model is a double-branch network, including a learning branch network and a suppressing branch network; wherein, the learning branch network includes a convolutional layer conv1 in series, a shallow feature selection layer f 1 () and a deep feature selection layer f 2 ( ); the suppression branch network includes a convolutional layer conv1 ′ in series, a shallow feature selection layer f′ 1 ( ) and a deep feature selection layer f′ 2 ( );

步骤4.2,获取a组训练样本对;Step 4.2, obtain a group of training sample pairs;

每组训练样本对包括无人机视角下光线正常图像I和光线过亮图像I′;其中,光线过亮图像I′为对光线正常图像I随机添加亮度值的方式获得;将a组训练样本对分别表示为:(I1,I′1),(I2,I′2),...,(Ia,I′a);Each group of training sample pairs includes a normal light image I and an excessively bright image I' from the perspective of the drone; wherein, the excessively bright image I' is obtained by randomly adding brightness values to the normal light image I; The pairs are respectively expressed as: (I 1 , I' 1 ), (I 2 , I' 2 ), ..., (I a , I' a );

步骤4.3,采用a组训练样本对输入到步骤4.1构建的抑光模型进行离线训练,离线训练的目标函数为:Step 4.3, use a group of training samples to perform offline training on the light suppression model constructed in step 4.1. The objective function of offline training is:

Figure BDA0003412564660000061

其中:in:

Loss抑光代表抑光损失函数;Loss suppression represents the suppression loss function;

argmin()代表使目标函数取最小值时的变量值;argmin() represents the variable value when the objective function takes the minimum value;

f′1(I′j)代表光线过亮图像I′j输入到浅层特征挑选层f′1()后,输出的浅层特征值;f' 1 (I' j ) represents the output shallow feature value after the image I' j with too bright light is input to the shallow feature selection layer f' 1 ();

f′2(I′j)代表光线过亮图像I′j输入到深层特征挑选层f′2()后,输出的深层特征值;f′ 2 (I′ j ) represents the output deep feature value after the image I′ j with too bright light is input to the deep feature selection layer f′ 2 ( );

f1(Ij)代表光线正常图像Ij输入到浅层特征挑选层f1()后,输出的浅层特征值;f 1 (I j ) represents the shallow feature value of the output after the normal light image I j is input to the shallow feature selection layer f 1 ();

f2(Ij)代表光线正常图像Ij输入到深层特征挑选层f2()后,输出的深层特征值;f 2 (I j ) represents the deep feature value of the output after the normal light image I j is input to the deep feature selection layer f 2 ();

Figure BDA0003412564660000062
denotes the square of the L2 norm;

γ denotes a penalty coefficient that is set manually to control the influence of
Figure BDA0003412564660000063
on the light suppression loss function; the larger its value, the greater the influence of
Figure BDA0003412564660000064
on the light suppression loss function;

Step 4.4: offline training of the light suppression model weakens the suppression branch network's sensitivity to brightness features, so that the suppression branch can suppress illumination features in over-bright images captured by the UAV and improve the saliency of vehicle detail features from the UAV's viewpoint;

因此,将图像帧Frm(t)输入到训练完成的抑光模型的抑制分支网络,得到光照抑制特征图FRestrainMap。Therefore, the image frame Frm(t) is input into the suppression branch network of the trained light suppression model, and the light suppression feature map F Restrain Map is obtained.
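A minimal sketch of the offline training objective described above is given below (Python/PyTorch). The exact formula appears only as an image, so the form shown here, a squared L2 distance between the shallow features of the over-bright and normal-light images plus γ times the squared L2 distance between their deep features, is an assumption based on the term definitions; during training this quantity would be summed or averaged over the a training sample pairs.

```python
import torch

def light_suppression_loss(f1_normal, f2_normal, f1_bright, f2_bright, gamma=1.0):
    """Assumed form of the light suppression loss for one sample pair (I_j, I'_j).
    f1_*: shallow feature values, f2_*: deep feature values; gamma: penalty coefficient."""
    shallow = torch.sum((f1_bright - f1_normal) ** 2)  # squared L2 distance of shallow features
    deep = torch.sum((f2_bright - f2_normal) ** 2)     # squared L2 distance of deep features
    return shallow + gamma * deep                      # gamma weights the deep-feature term
```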

优选的,步骤5.6中,将两个车辆检测结果框合并为一个车辆检测结果框,具体为:Preferably, in step 5.6, two vehicle detection result frames are combined into one vehicle detection result frame, specifically:

设需要合并的两个车辆检测结果框分别为:车辆检测结果框BoxSmall(1)和车辆检测结果框BoxLarge(1);合并后的车辆检测结果框表示为Box(1),则:Suppose the two vehicle detection result boxes that need to be merged are: the vehicle detection result box Box Small (1) and the vehicle detection result box Box Large (1); the combined vehicle detection result box is represented as Box (1), then:

Box(1)的中心点,为BoxSmall(1)中心点和BoxLarge(1)中心点连线的中间点;The center point of Box(1) is the middle point of the line connecting the center point of Box Small (1) and the center point of Box Large (1);

Box(1)的高度,为BoxSmall(1)高度和BoxLarge(1)高度的平均值;The height of Box(1) is the average of the height of Box Small (1) and Box Large (1);

Box(1)的宽度,为BoxSmall(1)宽度和BoxLarge(1)宽度的平均值。The width of Box(1), which is the average of the width of Box Small (1) and the width of Box Large (1).
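A short sketch of this merge rule follows (Python); boxes are assumed to be represented as (center x, center y, width, height) tuples.

```python
def merge_boxes(box_a, box_b):
    """Merge two detection result boxes as described above (step 5.6)."""
    cx = (box_a[0] + box_b[0]) / 2.0   # midpoint of the line joining the two centers
    cy = (box_a[1] + box_b[1]) / 2.0
    w = (box_a[2] + box_b[2]) / 2.0    # average of the two widths
    h = (box_a[3] + box_b[3]) / 2.0    # average of the two heights
    return (cx, cy, w, h)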

优选的,步骤7中,多特征联合车辆搜索网络建立方式为:Preferably, in step 7, the multi-feature joint vehicle search network is established in the following manner:

以车辆颜色特征、车辆类型特征作为车辆全局特征,以车辆侧视图、车辆前视图、车辆后视图、车辆顶视图和非车辆视图作为车辆局部特征,建立多特征联合车辆搜索网络。A multi-feature joint vehicle search network is established with vehicle color features and vehicle type features as vehicle global features, and vehicle side view, vehicle front view, vehicle rear view, vehicle top view and non-vehicle view as vehicle local features.

优选的,步骤7具体为:Preferably, step 7 is specifically:

步骤7.1,构建多特征联合车辆搜索网络;所述多特征联合车辆搜索网络包括全局特征识别模块和局部特征匹配模块;Step 7.1, building a multi-feature joint vehicle search network; the multi-feature joint vehicle search network includes a global feature identification module and a local feature matching module;

步骤7.2,将z个检测车辆图和目标车辆图S分别输入到全局特征识别模块,采用以下方法,得到z′个与目标车辆图S颜色、车辆类型一致的疑似车辆图;Step 7.2, input the z detected vehicle images and the target vehicle image S to the global feature recognition module respectively, and obtain z' suspected vehicle images with the same color and vehicle type as the target vehicle image S by using the following method;

其中,全局特征识别模块包括共享特征层、车辆颜色特征层和车辆类型特征层;The global feature recognition module includes a shared feature layer, a vehicle color feature layer, and a vehicle type feature layer;

步骤7.2.1,识别目标车辆图S的颜色特征,包括以下步骤:Step 7.2.1, identify the color features of the target vehicle map S, including the following steps:

步骤7.2.1.1,将目标车辆图S输入到共享特征层,得到共享特征图FShrMap;Step 7.2.1.1, input the target vehicle map S into the shared feature layer to obtain the shared feature map F Shr Map;

步骤7.2.1.2,将共享特征图FShrMap输入到车辆颜色特征层,得到车辆颜色特征向量VColor;其中,所述车辆颜色特征层包括conv4Color、最大池化层Maxpool和全连接层FCColorStep 7.2.1.2, the shared feature map F Shr Map is input into the vehicle color feature layer, and the vehicle color feature vector V Color is obtained; wherein, the vehicle color feature layer includes conv4 Color , maximum pooling layer Maxpool and fully connected layer FC Color ;

步骤7.2.1.3,采用矩阵广播的方式将车辆颜色特征向量VColor与共享特征图FShrMap相乘,得到颜色敏感特征图FColorMap;Step 7.2.1.3: Multiply the vehicle color feature vector V Color and the shared feature map F Shr Map by means of matrix broadcasting to obtain the color-sensitive feature map F Color Map;

步骤7.2.1.4,以颜色敏感特征图FColorMap为卷积核,对目标车辆图S进行互卷积,得到颜色特征增强图S′Color,增强目标车辆图S对颜色特征的响应程度;Step 7.2.1.4, take the color-sensitive feature map F Color Map as the convolution kernel, perform mutual convolution on the target vehicle map S to obtain a color feature enhancement map S′ Color , and enhance the response degree of the target vehicle map S to the color feature;

Step 7.2.1.5: input the color feature enhancement map S′Color sequentially into the shared feature layer, Conv4Color, Conv5Color, the maximum pooling layer and the fully connected layer, and obtain the color category of the target vehicle image S through the non-maximum suppression algorithm;

步骤7.2.2,采用相同方法,得到目标车辆图S的车辆类型,进而得到每个检测车辆图所属的颜色类别和车辆类型;Step 7.2.2, using the same method, obtain the vehicle type of the target vehicle map S, and then obtain the color category and vehicle type to which each detected vehicle map belongs;

步骤7.2.3,在z个检测车辆图中,判断是否存在与目标车辆图S颜色、车辆类型相同的检测车辆图,如果没有,则直接对下一帧图像进行检索;Step 7.2.3, in the z detected vehicle images, determine whether there is a detected vehicle image with the same color and vehicle type as the target vehicle image S, if not, directly search the next frame of image;

如果有,则将所有与目标车辆图S颜色、车辆类型相同的检测车辆图均提取出来,假设一共提取到z′个,将提取到的z′个检测车辆图称为疑似车辆图,表示为:疑似车辆图Dc,其中,c=1...z′;If there are, all the detected vehicle images with the same color and vehicle type as the target vehicle image S are extracted. Assuming that a total of z' are extracted, the extracted z' detected vehicle images are called suspected vehicle images, which are expressed as : Suspected vehicle map D c , where c=1...z′;

步骤7.3,将目标车辆图S和每个疑似车辆图Dc分别输入到局部特征匹配模块,局部特征匹配模块采用匹配算法,得到目标车辆图S的车辆均值向量矩阵VsStep 7.3, input the target vehicle map S and each suspected vehicle map D c to the local feature matching module respectively, and the local feature matching module adopts a matching algorithm to obtain the vehicle mean vector matrix V s of the target vehicle map S;

局部特征匹配模块采用相同匹配算法,得到每个疑似车辆图Dc的疑似车辆均值向量矩阵VcThe local feature matching module adopts the same matching algorithm to obtain the suspected vehicle mean value vector matrix V c of each suspected vehicle map D c ;

其中,局部特征匹配模块包括特征提取层、特征稀疏卷积层Conv6和一个全连接层FCsightAmong them, the local feature matching module includes a feature extraction layer, a feature sparse convolutional layer Conv6 and a fully connected layer FC sight ;

局部特征匹配模块对目标车辆图S进行特征匹配,得到目标车辆图S的车辆均值向量矩阵Vs,具体为:The local feature matching module performs feature matching on the target vehicle map S, and obtains the vehicle mean vector matrix V s of the target vehicle map S, specifically:

步骤7.3.1,将目标车辆图S通过4*4网格进行格网分割,得到16个车辆子块图;Step 7.3.1, divide the target vehicle map S through a 4*4 grid to obtain 16 vehicle sub-block maps;

步骤7.3.2,将每个车辆子块图分别输入到特征提取层,得到对应的车辆子块特征图FsubMap(m),m=1...16;Step 7.3.2, input each vehicle sub-block map to the feature extraction layer respectively, and obtain the corresponding vehicle sub-block feature map F sub Map(m), m=1...16;

步骤7.3.3,将每个车辆子块特征图FsubMap(m)输入到特征稀疏卷积层Conv6,得到对应的稀疏特征图FsparseMap(m);Step 7.3.3, input the feature map F sub Map(m) of each vehicle sub-block into the feature sparse convolution layer Conv6 to obtain the corresponding sparse feature map F sparse Map(m);

步骤7.3.4,确定车辆子块图的视角类别:Step 7.3.4, determine the viewing angle category of the vehicle subplot:

将每个稀疏特征图FsparseMap(m)输入到全连接层FCsight,通过非极大值抑制,得到该车辆子块图的视角类别;其中,视角类别包括侧视图、前视图、后视图、顶视图和非车辆视图五类;Input each sparse feature map F sparse Map(m) into the fully connected layer FC sight , and obtain the viewing angle category of the vehicle sub-block map through non-maximum suppression; among which, the viewing angle categories include side view, front view, and rear view , top view and non-vehicle view five categories;

步骤7.3.5,确定车辆子块图的视角类别的视角向量:Step 7.3.5, determine the view vector of the view category of the vehicle subplot:

If the viewing angle category is side view, front view, rear view or top view, extract the features of the sparse feature map FsparseMap(m) and reshape them into a one-dimensional feature vector, which serves as the viewing angle vector of that vehicle sub-block image; according to the viewing angle category, the viewing angle vectors comprise side-view, front-view, rear-view and top-view vectors;

如果视角类别为非车辆视图,则舍弃;If the viewing angle category is non-vehicle view, it will be discarded;

步骤7.3.6,确定每种视角类别的视角均值向量:Step 7.3.6, determine the view mean vector for each view category:

求取目标车辆图S中相同视角类别的各个车辆子块图的视角向量均值,分别得到侧视视角均值向量、正视视角均值向量、后视视角均值向量和顶视视角均值向量;Obtain the mean value of the angle vector of each vehicle sub-block graph of the same angle of view category in the target vehicle map S, and obtain the mean vector of the side view angle, the mean vector of the front view, the mean vector of the rear view and the mean vector of the top view respectively;

若不存在某一视角类别,则其视角均值向量不存在,则将该视角均值向量的所有元素置为0;If a certain viewing angle category does not exist, then its viewing angle mean vector does not exist, then all elements of the viewing angle mean vector are set to 0;

因此,得到四个视角类别的视角均值向量Vcl;其中,cl=1,2,3,4,分别代表侧视视角均值向量V1、正视视角均值向量V2、后视视角均值向量V3和顶视视角均值向量V4;四个视角类别的视角均值向量Vcl,构成目标车辆图S的车辆均值向量矩阵VsTherefore, the viewing angle mean vector V cl of the four viewing angle categories is obtained; wherein, cl=1, 2, 3, 4, respectively representing the side view viewing angle mean vector V 1 , the front viewing viewing angle mean vector V 2 , and the rear viewing viewing angle mean vector V 3 and the top view angle mean vector V 4 ; the angle mean vector V cl of the four viewing angle categories constitutes the vehicle mean vector matrix V s of the target vehicle map S;

与之对应,得到每个疑似车辆图Dc的四个视角类别的疑似车辆均值向量V′cl,构成疑似车辆图Dc的疑似车辆均值向量矩阵VcCorrespondingly, the suspected vehicle mean vector V′ cl of the four viewing angle categories of each suspected vehicle map D c is obtained, forming the suspected vehicle mean vector matrix V c of the suspected vehicle map D c ;
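The following sketch (Python/NumPy) illustrates steps 7.3.4 to 7.3.6: building the mean-view-vector matrix from the per-sub-block view classes and view vectors. The class encoding and vector dimension are assumptions made for illustration.

```python
import numpy as np

def mean_view_matrix(sub_blocks, dim):
    """Build the 4 x dim mean-view-vector matrix (V_s or V_c) from a list of
    (view_class, feature_vector) pairs for the 16 sub-blocks.
    view_class: 0 side, 1 front, 2 rear, 3 top, 4 non-vehicle (discarded)."""
    matrix = np.zeros((4, dim))
    for cls in range(4):
        vecs = [v for c, v in sub_blocks if c == cls]
        if vecs:                                # a missing view class keeps an all-zero row
            matrix[cls] = np.mean(vecs, axis=0)
    return matrix
```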

步骤7.4,计算目标车辆图S与每个疑似车辆图Dc的共有视角类别的视角均值向量个数Num,采用下式,得到与每个疑似车辆图Dc对应的特征匹配值Match;Step 7.4: Calculate the number Num of the average viewing angle vectors of the common viewing angle category of the target vehicle map S and each suspected vehicle map D c , and use the following formula to obtain the feature matching value Match corresponding to each suspected vehicle map D c ;

Figure BDA0003412564660000101

其中,λ为视角均值向量个数的权重;T表示转置;tr表示矩阵的迹,代表矩阵主对角线元素之和;Among them, λ is the weight of the number of viewing angle mean vectors; T is the transpose; tr is the trace of the matrix, which is the sum of the main diagonal elements of the matrix;
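Since the Match formula itself appears only as an image, the sketch below (Python/NumPy) encodes one plausible reading of its terms: a contribution weighted by the number Num of view classes shared by the two images plus the trace of Vs·VcT. The exact combination is an assumption.

```python
import numpy as np

def match_score(v_s, v_c, lam=0.1):
    """Assumed form of the feature matching value Match for one suspected vehicle.
    v_s, v_c: 4 x dim mean-view-vector matrices; lam: weight of the shared-view count."""
    shared = [cl for cl in range(4)
              if np.any(v_s[cl]) and np.any(v_c[cl])]  # view classes present in both images
    num = len(shared)                                   # Num in the description above
    return lam * num + np.trace(v_s @ v_c.T)            # trace = sum of main-diagonal elements
```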

步骤7.5,当存在多个疑似车辆图Dc的特征匹配值Match高于阈值时,则通过非极大值抑制方法在多个疑似车辆图Dc中,确定目标车辆所在的疑似车辆图,该疑似车辆图在图像帧Frm(t)中的位置为目标车辆在图像帧Frm(t)的位置;Step 7.5, when the feature matching value Match of multiple suspected vehicle maps Dc is higher than the threshold, the non-maximum value suppression method is used to determine the suspected vehicle map where the target vehicle is located in the multiple suspected vehicle maps Dc . The position of the suspected vehicle map in the image frame Frm(t) is the position of the target vehicle in the image frame Frm(t);

当目标车辆图S和所有疑似车辆图Dc的特征匹配值Match均低于阈值,则图像帧Frm(t)中不包含目标车辆。When the feature matching value Match of the target vehicle map S and all suspected vehicle maps Dc is lower than the threshold, the image frame Frm( t ) does not contain the target vehicle.

优选的,步骤7.3.3中,为使稀疏特征图能够充分表达车辆子块特征图FsubMap(m)中的特征,减少压缩过程中的信息损失,在训练时采用压缩损失函数LosssparsePreferably, in step 7.3.3, in order to enable the sparse feature map to fully express the features in the vehicle sub-block feature map F sub Map(m) and reduce the information loss in the compression process, the compression loss function Loss sparse is used during training:

Losssparse=Min(FsubMap(m)-(FsparseMap(m)*WTran))Loss sparse = Min(F sub Map(m)-(F sparse Map(m)*W Tran ))

式中:where:

WTran为通过反卷积得到的上采样权重。W Tran is the upsampling weight obtained by deconvolution.
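A minimal sketch of this compression loss is given below (Python/PyTorch): the sparse feature map is upsampled with the deconvolution weight WTran and compared against the sub-block feature map. Tensor shapes, the stride and the use of a mean-squared difference are assumptions.

```python
import torch
import torch.nn.functional as F

def compression_loss(f_sub, f_sparse, w_tran, stride=2):
    """Assumed form of Loss_sparse: reconstruction error of the upsampled sparse map.
    f_sub: (N, C, H, W), f_sparse: (N, C', h, w), w_tran: (C', C, k, k)."""
    upsampled = F.conv_transpose2d(f_sparse, w_tran, stride=stride)  # upsample with W_Tran
    if upsampled.shape[-2:] != f_sub.shape[-2:]:                     # align sizes for the sketch
        upsampled = F.interpolate(upsampled, size=f_sub.shape[-2:])
    return torch.mean((f_sub - upsampled) ** 2)                      # minimized during training
```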

本发明提供的一种基于车辆特征匹配的无人机视频车辆检索方法具有以下优点:A UAV video vehicle retrieval method based on vehicle feature matching provided by the present invention has the following advantages:

The present invention provides a UAV video vehicle retrieval method based on vehicle feature matching that is suitable for videos shot by a UAV in different complex scenes. It removes, to the greatest extent, the loss of vehicle detail information caused by strong illumination and the influence of target size changes at different UAV altitudes, solves the problem that the vehicle to be queried is difficult to find among many targets, and retrieves the vehicle to be queried more accurately.

Brief Description of the Drawings

图1为本发明提供的一种基于车辆特征匹配的无人机视频车辆检索方法的流程示意图;1 is a schematic flowchart of a method for retrieving UAV video vehicles based on vehicle feature matching provided by the present invention;

图2为抑光模型的结构图;Fig. 2 is the structure diagram of light suppression model;

图3为每个特征挑选层f的结构图;Fig. 3 is the structure diagram of each feature selection layer f;

图4为右侧分支网络的结构图;Fig. 4 is the structure diagram of the right branch network;

图5为车辆多维特征概率识别网络的图。Figure 5 is a diagram of a vehicle multi-dimensional feature probabilistic identification network.

Detailed Description of the Embodiments

为了使本发明所解决的技术问题、技术方案及有益效果更加清楚明白,以下结合附图及实施例,对本发明进行进一步详细说明。应当理解,此处所描述的具体实施例仅用以解释本发明,并不用于限定本发明。In order to make the technical problems, technical solutions and beneficial effects solved by the present invention clearer, the present invention will be further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are only used to explain the present invention, but not to limit the present invention.

The present invention provides a UAV video vehicle retrieval method based on vehicle feature matching. The main idea is as follows: build and train a light suppression model to generate a light suppression feature map; build a feature-enhanced multi-scale vehicle detection module to obtain all vehicle detection result boxes in the current frame; crop the image inside each vehicle detection result box from the image frame Frm(t) to obtain z detected vehicle images; and input each detected vehicle image together with the target vehicle image S into the multi-feature joint vehicle search network for feature matching to obtain the detected vehicle image containing the target vehicle. The position of that detected vehicle image in the image frame Frm(t) is the position of the target vehicle in Frm(t), which completes retrieval and positioning of the target vehicle. The method is suitable for videos shot by a UAV in different complex scenes; it removes, to the greatest extent, the loss of vehicle detail caused by illumination and the influence of target size changes at different UAV altitudes, solves the problem that the vehicle to be queried is difficult to find among many targets, and retrieves the vehicle to be queried more accurately.

本发明提供一种基于车辆特征匹配的无人机视频车辆检索方法,参考图1,包括以下步骤:The present invention provides a UAV video vehicle retrieval method based on vehicle feature matching. Referring to FIG. 1, the method includes the following steps:

步骤1,确定需要检索的目标车辆图S;Step 1, determine the target vehicle map S to be retrieved;

步骤2,无人机对地面进行拍摄,获得无人机视频数据;Step 2, the drone shoots the ground to obtain the drone video data;

步骤3,对所述无人机视频数据的每一帧图像均执行步骤4-步骤8,判断每一帧图像中是否包含需要检索的目标车辆图S:Step 3: Steps 4 to 8 are performed on each frame of the UAV video data to determine whether each frame of the image contains the target vehicle map S that needs to be retrieved:

其中,将当前图像帧记为Frm(t),t为当前图像帧的帧数,采用以下步骤4-步骤8,判断图像帧Frm(t)中是否包含需要检索的目标车辆图S:Wherein, the current image frame is denoted as Frm(t), t is the frame number of the current image frame, and the following steps 4 to 8 are used to determine whether the image frame Frm(t) contains the target vehicle map S that needs to be retrieved:

步骤4,将所述图像帧Frm(t)输入到训练完成的抑光模型中,进行特征提取和抑光处理,得到包含n个图层的光照抑制特征图,记为FRestrainMap;Step 4: Input the image frame Frm(t) into the light suppression model that has been trained, perform feature extraction and light suppression processing, and obtain a light suppression feature map containing n layers, denoted as F Restrain Map;

Specifically, UAV video is often captured in strong light, so the images are excessively bright and the retrieval method has difficulty extracting valid information from them, which leads to missed detections. The present invention therefore adopts a light suppression model trained on image pairs consisting of a normal-light image and an over-bright image, so that the model can suppress illumination features in over-bright images and improve detection accuracy when the light is too strong.

步骤4具体为:Step 4 is specifically:

步骤4.1,构建抑光模型;Step 4.1, build a light suppression model;

如图2所示,为抑光模型的结构图;所述抑光模型为双分支网络,包括学习分支网络和抑制分支网络;其中,所述学习分支网络包括串联的卷积层conv1、浅层特征挑选层f1()和深层特征挑选层f2();所述抑制分支网络包括串联的卷积层conv1′、浅层特征挑选层f′1()和深层特征挑选层f′2();As shown in Figure 2, it is the structure diagram of the light suppression model; the light suppression model is a double branch network, including a learning branch network and a suppression branch network; wherein, the learning branch network includes a convolutional layer conv1 connected in series, a shallow layer feature selection layer f 1 ( ) and deep feature selection layer f 2 ( ); the suppression branch network includes a concatenated convolution layer conv1′, a shallow feature selection layer f′ 1 ( ) and a deep feature selection layer f′ 2 ( );

作为一种具体实现方式,初始未训练时,学习分支网络和抑制分支网络的网络结构相同,每侧的分支网络结构如下表所示:As a specific implementation method, when the initial training is not performed, the network structures of the learning branch network and the suppression branch network are the same. The branch network structure on each side is shown in the following table:

表1:抑光模型主干网络卷积核参数Table 1: Light suppression model backbone network convolution kernel parameters

Figure BDA0003412564660000131

每个特征挑选层f的结构如图3所示,包括:3个1*1的卷积核和2个3*3卷积核、两个最大值池化层(Maxpool)。The structure of each feature selection layer f is shown in Figure 3, including: three 1*1 convolution kernels, two 3*3 convolution kernels, and two maximum pooling layers (Maxpool).

步骤4.2,获取a组训练样本对;Step 4.2, obtain a group of training sample pairs;

每组训练样本对包括无人机视角下光线正常图像I和光线过亮图像I′;其中,光线过亮图像I′为对光线正常图像I随机添加亮度值的方式获得;将a组训练样本对分别表示为:(I1,I′1),(I2,I′2),...,(Ia,I′a);Each group of training sample pairs includes a normal light image I and an excessively bright image I' from the perspective of the drone; wherein, the excessively bright image I' is obtained by randomly adding brightness values to the normal light image I; The pairs are respectively expressed as: (I 1 , I' 1 ), (I 2 , I' 2 ), ..., (I a , I' a );

步骤4.3,采用a组训练样本对输入到步骤4.1构建的抑光模型进行离线训练,离线训练的目标函数为:Step 4.3, use a group of training samples to perform offline training on the light suppression model constructed in step 4.1. The objective function of offline training is:

Figure BDA0003412564660000132

其中:in:

Loss抑光代表抑光损失函数;Loss suppression represents the suppression loss function;

argmin()代表使目标函数取最小值时的变量值;argmin() represents the variable value when the objective function takes the minimum value;

f′1(I′j)代表光线过亮图像I′j输入到浅层特征挑选层f′1()后,输出的浅层特征值;f' 1 (I' j ) represents the output shallow feature value after the image I' j with too bright light is input to the shallow feature selection layer f' 1 ();

f′2(I′j)代表光线过亮图像I′j输入到深层特征挑选层f′2()后,输出的深层特征值;f′ 2 (I′ j ) represents the output deep feature value after the image I′ j with too bright light is input to the deep feature selection layer f′ 2 ( );

f1(Ij)代表光线正常图像Ij输入到浅层特征挑选层f1()后,输出的浅层特征值;f 1 (I j ) represents the shallow feature value of the output after the normal light image I j is input to the shallow feature selection layer f 1 ();

f2(Ij)代表光线正常图像Ij输入到深层特征挑选层f2()后,输出的深层特征值;f 2 (I j ) represents the deep feature value of the output after the normal light image I j is input to the deep feature selection layer f 2 ();

Figure BDA0003412564660000141
denotes the square of the L2 norm;

γ denotes a penalty coefficient that is set manually to control the influence of
Figure BDA0003412564660000142
on the light suppression loss function; the larger its value, the greater the influence of
Figure BDA0003412564660000143
on the light suppression loss function;

Step 4.4: offline training of the light suppression model weakens the suppression branch network's sensitivity to brightness features, so that the suppression branch can suppress illumination features in over-bright images captured by the UAV and improve the saliency of vehicle detail features from the UAV's viewpoint;

因此,将图像帧Frm(t)输入到训练完成的抑光模型的抑制分支网络,得到光照抑制特征图FRestrainMap。Therefore, the image frame Frm(t) is input into the suppression branch network of the trained light suppression model, and the light suppression feature map F Restrain Map is obtained.

In summary, to give the model its light suppression ability, the light suppression model adopts a two-branch network; each branch includes one convolutional layer and two feature selection layers. Because UAV video contains many objects, the two feature selection layers suppress illumination features in the shallow features and the deep features of the image respectively, which strengthens the network's learning ability and improves the suppression effect.

During online detection, the light suppression model uses only the suppression branch network; Figure 4 shows the structure of the suppression branch network.

抑制分支网络对图像帧Frm(t)的处理方法,如图3所示,具体为:The processing method of the image frame Frm(t) by the suppression branch network, as shown in Figure 3, is as follows:

1)图像帧Frm(t)通过conv1层得到低维特征图FLowMap;目的是为了将输入图像不同层的特征信息进行融合,提高物体特征的显著性。1) The image frame Frm(t) obtains the low-dimensional feature map F Low Map through the conv1 layer; the purpose is to fuse the feature information of different layers of the input image and improve the saliency of object features.

2) Input the low-dimensional feature map FLowMap into the shallow feature selection layer f1(); after passing through conv21 and conv22, it yields the mid-dimensional feature map FMidMap on the one hand, and on the other hand, after further passing through conv23 and conv24, the high-dimensional feature map FhighMap;

3)将低维特征图FLowMap经过3*3最大值池化,得到特征图F′LowMap;将中维特征图FMidMap经过2*2的最大值池化,得到特征图F′MidMap;3) The low-dimensional feature map F Low Map is subjected to 3*3 maximum pooling to obtain the feature map F' Low Map; the mid-dimensional feature map F Mid Map is subjected to 2*2 maximum pooling to obtain the feature map F' Mid Map;

After this step, the low-dimensional feature map FLowMap and the mid-dimensional feature map FMidMap are rescaled so that the resulting feature maps F′LowMap and F′MidMap have the same size as the high-dimensional feature map FhighMap;

4)将特征图F′LowMap、特征图F′MidMap和高维特征图FhighMap进行串联,经过conv25卷积后,输出多维特征图FMultiMap;4) Connect the feature map F' Low Map, the feature map F' Mid Map and the high-dimensional feature map F high Map in series, and after conv2 5 convolution, output the multi-dimensional feature map F Multi Map;

Specifically, because the onboard computing capability of the UAV is limited and the vehicle targets in the captured video are small, fusing the multi-dimensional feature maps not only reduces the amount of computation but also improves the utilization of features of different dimensions, yielding more object feature information.

The shallow feature selection layer f1() produces shallow features that are sensitive to the texture and shape of objects and can suppress the brightness of most regions of the image; however, because the UAV's field of view is wide, there are many non-object texture and geometric features that interfere with brightness suppression. Therefore, after the shallow feature selection layer f1(), the features must be further processed by the deep feature selection layer f2().

5)将多维特征图FMultiMap作为新的特征图FLowMap,输入到深层特征挑选层f2()中,重复以上步骤,得到光照抑制特征图FRestrainMap。5) Use the multi-dimensional feature map F Multi Map as a new feature map F Low Map, and input it into the deep feature selection layer f 2 ( ), and repeat the above steps to obtain the light suppression feature map F Restrain Map.

随着网络深度的增加,深层特征挑选层f2()的深度特征对于语义特征更加敏感,可以有效抑制非物体纹理、几何特征带来的干扰,弥补浅层特征挑选层f1()的不足。With the increase of network depth, the deep features of the deep feature selection layer f 2 ( ) are more sensitive to semantic features, which can effectively suppress the interference caused by non-object texture and geometric features, and make up for the deficiencies of the shallow feature selection layer f 1 ( ). .
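The following sketch (Python/PyTorch) illustrates one feature selection layer f() as described above: two convolution stages producing mid- and high-dimensional maps, max pooling of the low- and mid-dimensional maps to the size of the high-dimensional map, concatenation, and a final convolution producing the multi-dimensional feature map. Channel counts, kernel sizes and strides are assumptions, since Table 1 is available only as an image.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FeatureSelectLayer(nn.Module):
    """Sketch of a feature selection layer f() under the assumptions stated above."""
    def __init__(self, ch):
        super().__init__()
        self.conv2_1 = nn.Conv2d(ch, ch, 1)                          # 1*1 kernel
        self.conv2_2 = nn.Conv2d(ch, ch, 3, stride=2, padding=1)     # 3*3 kernel (assumed stride)
        self.conv2_3 = nn.Conv2d(ch, ch, 1)                          # 1*1 kernel
        self.conv2_4 = nn.Conv2d(ch, ch, 3, stride=2, padding=1)     # 3*3 kernel (assumed stride)
        self.conv2_5 = nn.Conv2d(3 * ch, ch, 1)                      # 1*1 kernel after concatenation

    def forward(self, f_low):
        f_mid = self.conv2_2(self.conv2_1(f_low))            # mid-dimensional feature map F_MidMap
        f_high = self.conv2_4(self.conv2_3(f_mid))           # high-dimensional feature map F_highMap
        size = f_high.shape[-2:]
        f_low_p = F.adaptive_max_pool2d(f_low, size)          # max-pool F_LowMap to F_highMap's size
        f_mid_p = F.adaptive_max_pool2d(f_mid, size)          # max-pool F_MidMap to F_highMap's size
        cat = torch.cat([f_low_p, f_mid_p, f_high], dim=1)    # concatenate the three maps
        return self.conv2_5(cat)                              # multi-dimensional feature map F_MultiMap
```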

步骤5,将光照抑制特征图FRestrainMap输入到特征增强的多尺度车辆检测模块中,获取图像帧Frm(t)中的z个车辆检测结果框:Step 5: Input the light suppression feature map F Restrain Map into the feature-enhanced multi-scale vehicle detection module, and obtain z vehicle detection result frames in the image frame Frm(t):

具体的,由于无人机视频图像中存在大量相似外观的物体,例如,路边矩形的电箱、长形状的遮阳伞等。为了使车辆更加显著,本发明提出了特征增强的多尺度车辆检测模块,从而更有利于提取到车辆。Specifically, there are a large number of objects with similar appearances in the UAV video images, such as roadside rectangular electric boxes, long-shaped sunshades, etc. In order to make the vehicle more salient, the present invention proposes a feature-enhanced multi-scale vehicle detection module, which is more beneficial to extract the vehicle.

At the same time, the size of vehicles in UAV video changes with the UAV's altitude. When the UAV flies low, vehicles appear large in the image and a shallow network misses or falsely detects them because its receptive field is too small; when the UAV flies too high, vehicles appear very small and a deep network loses information through excessive convolution, which also causes missed detections. The present invention therefore designs a hierarchical detection method for large and small targets, realizing the detection of vehicles of various sizes, improving the completeness of vehicle detection in video images and avoiding missed detections.

Step 5.1: the light suppression feature map FRestrainMap has n layers; for each layer, denoted layeri, i = 1, ..., n, perform steps 5.1.1 to 5.1.3 to obtain the dependent weight value w″i of layeri:

步骤5.1.1,计算图层layeri的所有像素点的平均值,作为图层layeri的初始权重wiStep 5.1.1, calculate the average value of all pixels of layer i as the initial weight w i of layer i ;

Step 5.1.2: input the initial weight wi of layeri into a fully connected layer and map it to the (0, 1) feature space through a sigmoid activation function, outputting the normalized weight value w′i of layeri;

Step 5.1.3: build a piecewise function that suppresses or enhances the normalized weight value w′i of layeri segment by segment, obtaining the dependent weight value w″i of layeri:

Figure BDA0003412564660000161

其中:in:

ε代表系统常数,用于调节依赖权重值对图层的影响程度;ε represents the system constant, which is used to adjust the degree of influence of the dependent weight value on the layer;

步骤5.2,由于得到光照抑制特征图FRestrainMap的n个图层的依赖权重值,分别为:w″1...w″nIn step 5.2, the dependent weight values of the n layers of the light suppression feature map F Restrain Map are obtained, respectively: w″ 1 ... w″ n ;

将w″1...w″n合并,得到光照抑制特征图FRestrainMap的1*1*n的依赖权重向量W″;Combine w″ 1 ... w″ n to obtain the 1*1*n dependent weight vector W″ of the light suppression feature map F Restrain Map;

将依赖权重向量W″作为卷积核对光照抑制特征图FRestrainMap进行卷积,得到图层增强特征图FEhcMap;Convolve the light suppression feature map F Restrain Map with the dependent weight vector W" as the convolution kernel to obtain the layer enhancement feature map F Ehc Map;

步骤5.3,将图层增强特征图FEhcMap输入到小目标响应层,得到小目标显著特征图FSmallMap;Step 5.3, input the layer enhancement feature map F Ehc Map to the small target response layer, and obtain the small target salient feature map F Small Map;

其中,小目标响应层可以采用一个1*1的卷积层。1*1的卷积层的目的是降低特征图深度,提高小目标检测的成功率。Among them, the small target response layer can use a 1*1 convolutional layer. The purpose of the 1*1 convolutional layer is to reduce the depth of the feature map and improve the success rate of small target detection.

其中,小目标显著特征图FSmallMap中包含更多的车辆细节信息,在无人机飞行高度较高时,可提高小目标车辆检测的成功率;Among them, the small target salient feature map F Small Map contains more vehicle detail information, which can improve the success rate of small target vehicle detection when the drone is flying at a high altitude;

步骤5.4,将小目标显著特征图FSmallMap输入到大目标响应层,得到大目标显著特征图FLargeMap;Step 5.4, input the salient feature map F Small Map of the small target into the response layer of the large target, and obtain the salient feature map F Large Map of the large target;

其中,大目标响应层可以采用2个3*3的卷积层。通过两个3*3的卷积层增大感受野,提高大目标检测的成功率。同时,在感受野相同的前提下,两个3*3卷积层比一个5*5卷积层的计算量更少。Among them, the large target response layer can use two 3*3 convolutional layers. The receptive field is increased by two 3*3 convolutional layers, and the success rate of large target detection is improved. At the same time, under the premise of the same receptive field, two 3*3 convolutional layers require less computation than one 5*5 convolutional layer.

其中:大目标显著特征图FLargeMap包含更多的语义信息,在无人机飞行高度较低时,可提高大目标车辆检测的精确率;Among them: the large target salient feature map F Large Map contains more semantic information, which can improve the accuracy of large target vehicle detection when the UAV is flying at a low altitude;

步骤5.5,将小目标显著特征图FSmallMap输入到结果框生成层,从而在图像帧Frm(t)中,得到p个小目标车辆检测结果框BoxSmall(1)...BoxSmall(p);Step 5.5, input the small target salient feature map F Small Map into the result frame generation layer, so that in the image frame Frm(t), p small target vehicle detection result frames Box Small (1)...Box Small (p );

将大目标显著特征图FLargeMap输入到结果框生成层,从而在图像帧Frm(t)中,得到q个大目标车辆检测结果框BoxLarge(1)...BoxLarge(q);Input the large target salient feature map F Large Map into the result box generation layer, so that in the image frame Frm(t), q large target vehicle detection result boxes Box Large (1)...Box Large (q) are obtained;

具体方法为:The specific method is:

Step 5.5.1: take each pixel of the small-target salient feature map FSmallMap as an anchor point and, centered on each anchor point, generate several candidate boxes of different sizes; over all pixels of FSmallMap this yields a set of candidate boxes;

For example, six candidate boxes of different sizes are generated centered on each anchor point; for small-target detection, three candidate boxes with an area of 8 and three candidate boxes with an area of 16 can be generated with length-to-width ratios of 1:1, 1:2 and 2:1.

当进行大目标检测时,可以按1∶1,1∶2,2∶1的长和宽的比例,生成3个面积为32的候选框,以及,生成3个面积为64的候选框。When performing large target detection, three candidate boxes with an area of 32 and three candidate boxes with an area of 64 can be generated according to the ratio of length and width of 1:1, 1:2, and 2:1.

1∶1,1∶2,2∶1的长和宽的比例,是根据无人机视窗中车辆的长宽比特点设定。The ratio of length and width of 1:1, 1:2, 2:1 is set according to the aspect ratio of the vehicle in the drone window.
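A short sketch of this candidate-box generation (step 5.5.1) follows (Python); the areas (8 and 16 for small targets, 32 and 64 for large targets) and the 1:1, 1:2, 2:1 ratios follow the example above, and the (center x, center y, width, height) box representation is an assumption.

```python
def generate_candidate_boxes(anchor_x, anchor_y, areas=(8, 16), ratios=(1.0, 0.5, 2.0)):
    """Generate candidate boxes of several sizes centered on one anchor point."""
    boxes = []
    for area in areas:
        for r in ratios:                     # r = width / height
            w = (area * r) ** 0.5            # width such that w * h = area and w / h = r
            h = area / w
            boxes.append((anchor_x, anchor_y, w, h))
    return boxes
```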

步骤5.5.2,计算得到每个候选框的车辆概率值;Step 5.5.2, calculate the vehicle probability value of each candidate frame;

例如,采用1个1*1卷积层,将候选框重塑为一个1维向量,再利用sigmoid函数计算候选框的车辆概率值。For example, use a 1*1 convolutional layer to reshape the candidate frame into a 1-dimensional vector, and then use the sigmoid function to calculate the vehicle probability value of the candidate frame.

步骤5.5.3,对候选框进行筛选,去除车辆概率值低于预设阈值的候选框,例如,阈值设定为0.6,从而得到候选框:A1,A2...Ap;其中,p代表候选框数量;Step 5.5.3: Screen the candidate frames, and remove the candidate frames whose vehicle probability value is lower than the preset threshold. For example, the threshold is set to 0.6, so as to obtain the candidate frames: A 1 , A 2 . . . A p ; wherein, p represents the number of candidate boxes;

步骤5.5.4计算候选框A1,A2...Ap中每个候选框的回归参数,每个候选框均具有以下回归参数:宽度,高度和锚点偏移量;Step 5.5.4 Calculate the regression parameters of each candidate box in the candidate boxes A 1 , A 2 . . . A p , each candidate box has the following regression parameters: width, height and anchor offset;

步骤5.5.5,将候选框A1,A2...Ap中每个候选框的锚点坐标和其对应的回归参数映射回图像帧Frm(t),从而在图像帧Frm(t)中,得到p个小目标车辆检测结果框BoxSmall(1)...BoxSmall(p);Step 5.5.5, map the anchor point coordinates and the corresponding regression parameters of each candidate frame in the candidate frame A 1 , A 2 . . . A p back to the image frame Frm(t), so that in the image frame Frm(t) , obtain p small target vehicle detection result boxes Box Small (1)...Box Small (p);

Step 5.5.6: replace the small target salient feature map FSmallMap in step 5.5.1 with the large target salient feature map FLargeMap, increase the initial generation size of the candidate boxes in step 5.5.1, and apply the method of steps 5.5.1-5.5.5 to obtain q large target vehicle detection result boxes BoxLarge(1)...BoxLarge(q) in the image frame Frm(t).

Step 5.6: the p small target vehicle detection result boxes BoxSmall(1)...BoxSmall(p) and the q large target vehicle detection result boxes BoxLarge(1)...BoxLarge(q) in the image frame Frm(t) are collectively referred to as the p+q vehicle detection result boxes.

For the p+q vehicle detection result boxes obtained in the image frame Frm(t), calculate the similarity coefficient between any two vehicle detection result boxes. If the similarity coefficient is smaller than a set threshold, no processing is performed; if the similarity coefficient is greater than the set threshold, the two vehicle detection result boxes are merged into one, finally yielding z vehicle detection result boxes, denoted Box(1)...Box(z).

For example, if the Jaccard similarity coefficient between two boxes satisfies Ja > 0.8, the merge operation is performed.

Assuming the two boxes are denoted BoxSmall(1) and BoxLarge(1), the Jaccard similarity coefficient is calculated as:

Ja = Area(BoxSmall(1) ∩ BoxLarge(1)) / Area(BoxSmall(1) ∪ BoxLarge(1))
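The Jaccard similarity coefficient of step 5.6 can be computed for two axis-aligned boxes as in the sketch below, where boxes are given as (cx, cy, w, h) and the 0.8 merge threshold follows the example above.

```python
# Sketch of the Jaccard (intersection-over-union) coefficient between two boxes.
def jaccard(box_a, box_b):
    ax1, ay1 = box_a[0] - box_a[2] / 2, box_a[1] - box_a[3] / 2
    ax2, ay2 = box_a[0] + box_a[2] / 2, box_a[1] + box_a[3] / 2
    bx1, by1 = box_b[0] - box_b[2] / 2, box_b[1] - box_b[3] / 2
    bx2, by2 = box_b[0] + box_b[2] / 2, box_b[1] + box_b[3] / 2
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))     # intersection width
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))     # intersection height
    inter = iw * ih
    union = box_a[2] * box_a[3] + box_b[2] * box_b[3] - inter
    return inter / union if union > 0 else 0.0

print(jaccard((50, 50, 20, 10), (52, 51, 20, 12)) > 0.8)   # merge only if Ja > 0.8
```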

In step 5.6, merging two vehicle detection result boxes into one vehicle detection result box is performed as follows:

Let the two vehicle detection result boxes to be merged be BoxSmall(1) and BoxLarge(1), and let the merged vehicle detection result box be Box(1). Then:

the center point of Box(1) is the midpoint of the line connecting the center points of BoxSmall(1) and BoxLarge(1);

the height of Box(1) is the average of the heights of BoxSmall(1) and BoxLarge(1);

the width of Box(1) is the average of the widths of BoxSmall(1) and BoxLarge(1).
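The merge rule above can be written directly, as in this sketch, where boxes are (cx, cy, w, h) tuples:

```python
# Sketch of the merge rule of step 5.6: midpoint of centres, averaged width/height.
def merge_boxes(box_small, box_large):
    cx = (box_small[0] + box_large[0]) / 2.0
    cy = (box_small[1] + box_large[1]) / 2.0
    w = (box_small[2] + box_large[2]) / 2.0
    h = (box_small[3] + box_large[3]) / 2.0
    return (cx, cy, w, h)

print(merge_boxes((50, 50, 20, 10), (54, 52, 26, 14)))   # -> (52.0, 51.0, 23.0, 12.0)
```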

Step 6: crop the image inside each vehicle detection result box from the image frame Frm(t), obtaining z detected vehicle maps.

Step 7: input each detected vehicle map together with the target vehicle map S into the multi-feature joint vehicle search network for feature matching, obtaining the detected vehicle map in which the target vehicle is located; the position of that detected vehicle map in the image frame Frm(t) is the position of the target vehicle in the image frame Frm(t), thereby completing the retrieval and localization of the target vehicle.

In step 7, the multi-feature joint vehicle search network is established as follows:

the vehicle color feature and the vehicle type feature are taken as vehicle global features, and the vehicle side view, vehicle front view, vehicle rear view, vehicle top view and non-vehicle view are taken as vehicle local features, to build the multi-feature joint vehicle search network.

Step 7 is specifically as follows:

Step 7.1: construct the multi-feature joint vehicle search network; the multi-feature joint vehicle search network comprises a global feature recognition module and a local feature matching module.

Step 7.2: input the z detected vehicle maps and the target vehicle map S into the global feature recognition module, and obtain, by the following method, z′ suspected vehicle maps whose color and vehicle type are consistent with those of the target vehicle map S.

The global feature recognition module comprises a shared feature layer, a vehicle color feature layer and a vehicle type feature layer.

Step 7.2.1: identify the color features of the target vehicle map S, comprising the following steps:

Step 7.2.1.1: input the target vehicle map S into the shared feature layer to obtain the shared feature map FShrMap;

Step 7.2.1.2: input the shared feature map FShrMap into the vehicle color feature layer to obtain the vehicle color feature vector VColor; the vehicle color feature layer comprises conv4Color, a max pooling layer Maxpool and a fully connected layer FCColor;

Step 7.2.1.3: multiply the vehicle color feature vector VColor with the shared feature map FShrMap by matrix broadcasting to obtain the color-sensitive feature map FColorMap;

Step 7.2.1.4: using the color-sensitive feature map FColorMap as a convolution kernel, cross-convolve the target vehicle map S to obtain the color feature enhancement map S′Color, which strengthens the response of the target vehicle map S to color features;
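A hedged sketch of steps 7.2.1.3 and 7.2.1.4 follows: the color feature vector is broadcast over the channels of the shared feature map, and the resulting color-sensitive map is used as a convolution kernel over the target vehicle map S; all tensor shapes and channel counts are illustrative assumptions, not the network's actual dimensions.

```python
# Hedged sketch: broadcast multiply, then cross-convolution with FColorMap as kernel.
import torch
import torch.nn.functional as F

C, Hf, Wf = 3, 7, 7
f_shr_map = torch.randn(C, Hf, Wf)        # shared feature map FShrMap (assumed shape)
v_color = torch.randn(C)                  # vehicle color feature vector VColor

# Step 7.2.1.3: matrix broadcasting -> color-sensitive feature map FColorMap
f_color_map = v_color.view(C, 1, 1) * f_shr_map

# Step 7.2.1.4: use FColorMap as a kernel and convolve it over S
s = torch.randn(1, C, 128, 128)           # target vehicle map S (assumed C-channel image)
kernel = f_color_map.unsqueeze(0)         # (1, C, Hf, Wf): one output channel
s_color = F.conv2d(s, kernel, padding=Hf // 2)   # color feature enhancement map S'Color
print(s_color.shape)                      # torch.Size([1, 1, 128, 128])
```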

Step 7.2.1.5: input the color feature enhancement map S′Color successively into the shared feature layer, Conv4Color, Conv5Color, the max pooling layer and the fully connected layer, and obtain the color category of the target vehicle map S through the non-maximum suppression algorithm.

Step 7.2.2: using the same method, obtain the vehicle type of the target vehicle map S, and then obtain the color category and vehicle type of each detected vehicle map.

Step 7.2.3: among the z detected vehicle maps, judge whether there is a detected vehicle map with the same color and vehicle type as the target vehicle map S; if not, directly proceed to retrieve the next image frame.

If there is, extract all detected vehicle maps with the same color and vehicle type as the target vehicle map S. Assuming z′ such maps are extracted in total, the extracted z′ detected vehicle maps are called suspected vehicle maps, denoted Dc, where c = 1...z′.

Step 7.3: input the target vehicle map S and each suspected vehicle map Dc into the local feature matching module; the local feature matching module applies a matching algorithm to obtain the vehicle mean vector matrix Vs of the target vehicle map S.

The local feature matching module applies the same matching algorithm to obtain the suspected vehicle mean vector matrix Vc of each suspected vehicle map Dc.

The local feature matching module comprises a feature extraction layer, a feature sparse convolutional layer Conv6 and a fully connected layer FCsight.

The local feature matching module performs feature matching on the target vehicle map S and obtains its vehicle mean vector matrix Vs as follows:

Step 7.3.1: divide the target vehicle map S with a 4*4 grid to obtain 16 vehicle sub-block maps;
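Step 7.3.1 amounts to a simple grid split, sketched below under the assumption that the target vehicle map S is an H*W*3 array whose sides are divisible by 4 (otherwise it would be resized or padded first):

```python
# Minimal sketch of the 4*4 grid split of the target vehicle map S.
import numpy as np

def grid_split(img, grid=4):
    h, w = img.shape[:2]
    bh, bw = h // grid, w // grid
    return [img[r * bh:(r + 1) * bh, c * bw:(c + 1) * bw]
            for r in range(grid) for c in range(grid)]

s = np.zeros((128, 96, 3), dtype=np.uint8)   # assumed image size
sub_blocks = grid_split(s)
print(len(sub_blocks), sub_blocks[0].shape)  # 16 (32, 24, 3)
```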

Step 7.3.2: input each vehicle sub-block map into the feature extraction layer to obtain the corresponding vehicle sub-block feature map FsubMap(m), m = 1...16.

Step 7.3.3: input each vehicle sub-block feature map FsubMap(m) into the feature sparse convolutional layer Conv6 to obtain the corresponding sparse feature map FsparseMap(m).

In step 7.3.3, so that the sparse feature map can fully express the features of the vehicle sub-block feature map FsubMap(m) and the information loss during compression is reduced, the compression loss function Losssparse is used during training:

Losssparse = Min(FsubMap(m) − (FsparseMap(m) * WTran))

where:

WTran is the upsampling weight obtained by deconvolution.
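The compression loss can be sketched as a reconstruction objective: the sparse feature map is upsampled by a deconvolution whose weights play the role of WTran, and the difference from FsubMap(m) is minimised during training. Interpreting Min(...) as a mean-squared reconstruction error, and the channel counts below, are assumptions of this sketch.

```python
# Hedged sketch of the compression loss Losssparse used when training Conv6.
import torch
import torch.nn as nn

conv6 = nn.Conv2d(256, 64, kernel_size=3, stride=2, padding=1)        # feature sparse layer Conv6 (assumed sizes)
deconv = nn.ConvTranspose2d(64, 256, kernel_size=3, stride=2,
                            padding=1, output_padding=1)              # upsampling weights WTran

f_sub = torch.randn(1, 256, 16, 16)        # vehicle sub-block feature map FsubMap(m)
f_sparse = conv6(f_sub)                    # sparse feature map FsparseMap(m), (1, 64, 8, 8)
reconstruction = deconv(f_sparse)          # FsparseMap(m) * WTran, back to (1, 256, 16, 16)

loss_sparse = torch.mean((f_sub - reconstruction) ** 2)   # assumed MSE reading of Min(...)
loss_sparse.backward()                     # trained so the sparse map keeps the sub-block features
```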

Step 7.3.4: determine the view category of each vehicle sub-block map:

input each sparse feature map FsparseMap(m) into the fully connected layer FCsight and obtain the view category of the vehicle sub-block map through non-maximum suppression; the view categories comprise five classes: side view, front view, rear view, top view and non-vehicle view.

Step 7.3.5: determine the view vector of the view category of each vehicle sub-block map:

if the view category is side view, front view, rear view or top view, extract the features of the sparse feature map FsparseMap(m) and reshape them into a one-dimensional feature vector, which serves as the view vector of that vehicle sub-block map; according to view category, the view vectors comprise side-view vectors, front-view vectors, rear-view vectors and top-view vectors;

if the view category is non-vehicle view, the sub-block is discarded.

Step 7.3.6: determine the view mean vector of each view category:

compute the mean of the view vectors of all vehicle sub-block maps of the same view category in the target vehicle map S, obtaining the side-view mean vector, front-view mean vector, rear-view mean vector and top-view mean vector respectively.

If a view category is not present, its view mean vector does not exist, and all elements of that view mean vector are set to 0.

Thus, the view mean vectors Vcl of the four view categories are obtained, where cl = 1, 2, 3, 4 denotes the side-view mean vector V1, the front-view mean vector V2, the rear-view mean vector V3 and the top-view mean vector V4; the view mean vectors Vcl of the four view categories constitute the vehicle mean vector matrix Vs of the target vehicle map S.
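Steps 7.3.5 and 7.3.6 can be sketched as follows, grouping view vectors by category, averaging each group, zero-filling missing categories, and stacking the four mean vectors into the matrix Vs (the same routine yields Vc); the vector length of 128 is an assumed example.

```python
# Sketch: build the vehicle mean vector matrix from per-sub-block view vectors.
import numpy as np

VIEW_CLASSES = ("side", "front", "rear", "top")   # non-vehicle sub-blocks were discarded earlier

def mean_vector_matrix(view_vectors, dim=128):
    """view_vectors: dict mapping view class -> list of 1-D view vectors."""
    rows = []
    for cls in VIEW_CLASSES:
        vecs = view_vectors.get(cls, [])
        rows.append(np.mean(vecs, axis=0) if vecs else np.zeros(dim))  # missing class -> zeros
    return np.stack(rows)                 # shape (4, dim): the matrix Vs (or Vc)

vs = mean_vector_matrix({"side": [np.ones(128), 3 * np.ones(128)], "top": [np.ones(128)]})
print(vs.shape, vs[0, 0], vs[1, 0])       # (4, 128) 2.0 0.0
```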

Correspondingly, the suspected vehicle mean vectors V′cl of the four view categories of each suspected vehicle map Dc are obtained, constituting the suspected vehicle mean vector matrix Vc of the suspected vehicle map Dc.

Step 7.4: calculate the number Num of view mean vectors of the view categories shared by the target vehicle map S and each suspected vehicle map Dc, and obtain the feature matching value Match corresponding to each suspected vehicle map Dc with the following formula:

Match = λ·Num + tr(Vsᵀ·Vc)

where λ is the weight of the number of view mean vectors; T denotes the transpose; tr denotes the trace of a matrix, i.e. the sum of its main diagonal elements.
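A sketch of step 7.4 is given below. It assumes the matching value combines the weighted count of shared view categories with the trace term additively, i.e. Match = λ·Num + tr(Vsᵀ·Vc), which is the reading adopted above; the exact combination in the original formula may differ, so this is a hedged illustration only.

```python
# Hedged sketch of the feature matching value Match between Vs and Vc.
import numpy as np

def match_value(vs, vc, lam=0.5):
    # Num: view categories whose mean vectors are non-zero in both matrices
    shared = int(np.sum((np.abs(vs).sum(axis=1) > 0) & (np.abs(vc).sum(axis=1) > 0)))
    trace = float(np.sum(vs * vc))         # equals tr(Vs^T Vc), the sum of main-diagonal elements
    return lam * shared + trace

vs = np.random.rand(4, 128)                # vehicle mean vector matrix of S
vc = np.random.rand(4, 128)                # suspected vehicle mean vector matrix of Dc
print(match_value(vs, vc))
```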

Step 7.5: when the feature matching value Match of more than one suspected vehicle map Dc is higher than the threshold, the non-maximum suppression method is used to determine, among the multiple suspected vehicle maps Dc, the suspected vehicle map in which the target vehicle is located; the position of that suspected vehicle map in the image frame Frm(t) is the position of the target vehicle in the image frame Frm(t).

When the feature matching values Match of the target vehicle map S with all suspected vehicle maps Dc are lower than the threshold, the image frame Frm(t) does not contain the target vehicle.

Step 8: if the matching degree between every detected vehicle map in the current image frame Frm(t) and the target vehicle map S is lower than the set threshold, that is, the target vehicle is not present in the current image frame Frm(t), retrieval continues with the image frame Frm(t+1) at the next moment.

The invention provides a UAV video vehicle retrieval method based on vehicle feature matching, which is suitable for videos captured by UAVs in various complex scenes. It minimizes the loss of vehicle detail information caused by illumination and the influence of target size changes at different UAV flight heights, solves the problem that the vehicle to be queried is difficult to find among many targets, and retrieves the vehicle to be queried more accurately.

The above are only preferred embodiments of the present invention. It should be noted that those skilled in the art can make several improvements and modifications without departing from the principles of the present invention, and such improvements and modifications shall also fall within the protection scope of the present invention.

Claims (6)

1. An unmanned aerial vehicle video vehicle retrieval method based on vehicle feature matching, characterized by comprising the following steps:
step 1, determining a target vehicle map S to be retrieved;
step 2, shooting the ground with the unmanned aerial vehicle to obtain unmanned aerial vehicle video data;
step 3, executing steps 4 to 8 on each frame of image of the unmanned aerial vehicle video data, and judging whether each frame of image contains the target vehicle map S to be retrieved:
recording the current image frame as Frm(t), wherein t is the frame number of the current image frame, and judging whether the image frame Frm(t) contains the target vehicle map S to be retrieved by adopting the following steps 4 to 8;
step 4, inputting the image frame Frm(t) into a trained light suppression model, and performing feature extraction and light suppression processing to obtain an illumination suppression feature map comprising n layers, denoted FRestrainMap;
the light suppression model is a double-branch network comprising a learning branch network and a suppression branch network; the learning branch network comprises, connected in series, a convolution layer conv1, a shallow feature selection layer f1() and a deep feature selection layer f2(); the suppression branch network comprises, connected in series, a convolution layer conv1′, a shallow feature selection layer f′1() and a deep feature selection layer f′2();
step 5, inputting the illumination suppression feature map FRestrainMap into a feature-enhanced multi-scale vehicle detection module, and acquiring z vehicle detection result boxes in the image frame Frm(t):
step 5.1, the illumination suppression feature map FRestrainMap has n layers, each denoted layeri, i = 1, ... n; for each layer, executing steps 5.1.1 to 5.1.3 to obtain the dependence weight value w″i of layeri;
step 5.1.1, calculating the average value of all pixel points of layeri as the initial weight wi of layeri;
step 5.1.2, inputting the initial weight wi of layeri into a fully connected layer, and mapping the initial weight wi to the (0, 1) feature space through the sigmoid activation function, thereby outputting the normalized weight value w′i of layeri;
step 5.1.3, establishing a piecewise function and performing piecewise suppression or enhancement on the normalized weight value w′i of layeri to obtain the dependence weight value w″i of layeri:
w″i is given by a piecewise function of w′i, whose suppression or enhancement is adjusted by the system constant ε;
wherein:
ε represents a system constant used to adjust the degree of influence of the dependence weight value on the layer;
step 5.2, obtaining the dependence weight values of the n layers of the illumination suppression feature map FRestrainMap, respectively w″1 ... w″n;
combining w″1 ... w″n to obtain the 1*n dependence weight vector W″ of the illumination suppression feature map FRestrainMap;
using the dependence weight vector W″ as a convolution kernel, convolving the illumination suppression feature map FRestrainMap to obtain the layer enhancement feature map FEhcMap;
Step 5.3, enhancing feature map F of image layerEhcInputting Map into small target response layer to obtain small target significant feature graph FSmallMap;
Wherein, the small target significant feature map FSmallThe Map contains more vehicle detail information, and the success rate of small target vehicle detection can be improved when the flying height of the unmanned aerial vehicle is higher;
step 5.4, a small target salient feature map FSmallInputting Map into large target response layer to obtain large target significant characteristic diagram FLargeMap;
Wherein: large target significant feature map FLargeThe Map contains more semantic information, so that the accuracy rate of large target vehicle detection can be improved when the flying height of the unmanned aerial vehicle is low;
step 5.5, a small target salient feature map FSmallMap is input to the result frame generation layer, so that in the image frame frm (t), p small target vehicle detection result frames Box are obtainedSmall(1)…BoxSmall(p);
Drawing F for salient features of large targetLargeMap is input to the result frame generation layer, so that q large target vehicle detection result frames Box are obtained in the image frame frm (t)Large(1)…BoxLarge(q);
The specific method comprises the following steps:
step 5.5.1, a small target salient feature map FSmallEach pixel point in the Map is used as an anchor point, and a plurality of candidate frames with different sizes are generated by taking each anchor point as a center; thus, for a small target salient feature map FSmallAll pixel points in the Map obtain a plurality of candidate frames;
step 5.5.2, calculating to obtain the vehicle probability value of each candidate box;
and 5.5.3, screening the candidate frames, and removing the candidate frames with the vehicle probability value lower than a preset threshold value to obtain the candidate frames: a. the1,A2…Ap(ii) a Wherein p represents the number of candidate boxes;
step 5.5.4 calculating candidate Box A1,A2…ApThe regression parameters of each candidate box in the list, each candidate box having the following regression parameters: width, height, and anchor point offset;
step 5.5.5, candidate frame A1,A2…ApThe anchor point coordinates of each candidate frame and the regression parameters corresponding to the anchor point coordinates are mapped back to the image frame Frm (t), so that p small target vehicle detection result frames Box are obtained in the image frame Frm (t)Small(1)…BoxSmall(p);
Step 5.5.6, toLarge target significant feature map FLargeMap substitution small target salient feature Map F in step 5.5.1SmallMap, increasing the initial generation size of the candidate frame in step 5.5.1, and obtaining q large target vehicle detection result frames Box in the image frame frm (t) by adopting the method of steps 5.5.1-5.5.5Large(1)…BoxLarge(q);
Step 5.6, detecting result frames Box of p small target vehicles in the image frames Frm (t)Small(1)…BoxSmall(p) and q large target vehicle detection result boxes BoxLarge(1)…BoxLarge(q), collectively referred to as p + q vehicle detection result frames;
calculating a similarity coefficient between any two vehicle detection result frames for the p + q vehicle detection result frames obtained in the image frame frm (t), and if the similarity coefficient is smaller than a set threshold, not performing processing; if the similarity coefficient is larger than the set threshold, combining the two vehicle detection result frames into one vehicle detection result frame, and finally obtaining z vehicle detection result frames, wherein the z vehicle detection result frames are represented as: box (1) … Box (z);
step 6, respectively intercepting images in each vehicle detection result frame in an image frame Frm (t) to obtain z detection vehicle images;
step 7, inputting each detected vehicle map and the target vehicle map S into a multi-feature united vehicle search network for feature matching to obtain a detected vehicle map of the target vehicle; the position of the detected vehicle map in the image frame frm (t) is the position of the target vehicle in the image frame frm (t), so that the retrieval and positioning of the target vehicle are completed;
the multi-feature combined vehicle search network comprises a global feature recognition module and a local feature matching module; the global feature identification module comprises a shared feature layer, a vehicle color feature layer and a vehicle type feature layer; the local feature matching module comprises a feature extraction layer, a feature sparse convolution layer Conv6 and a full connection layer FCsight
Step 8, if the matching degrees of all the detected vehicle maps and the target vehicle map S in the current image frame frm (t) are lower than the set threshold, that is, the target vehicle does not exist in the current image frame frm (t), the image frame Frm (t +1) at the next time is continuously retrieved.
2. The unmanned aerial vehicle video vehicle retrieval method based on vehicle feature matching according to claim 1, wherein step 4 specifically comprises:
step 4.1, constructing a light suppression model;
the light suppression model is a double-branch network comprising a learning branch network and a suppression branch network; the learning branch network comprises, connected in series, a convolution layer conv1, a shallow feature selection layer f1() and a deep feature selection layer f2(); the suppression branch network comprises, connected in series, a convolution layer conv1′, a shallow feature selection layer f′1() and a deep feature selection layer f′2();
step 4.2, obtaining a groups of training sample pairs;
each group of training sample pairs comprises a normal-light image I and an over-bright image I′ under the unmanned aerial vehicle viewing angle; the over-bright image I′ is obtained by randomly adding a brightness value to the normal-light image I; the a groups of training sample pairs are respectively expressed as: (I1, I′1), (I2, I′2), ..., (Ia, I′a);
step 4.3, performing offline training on the light suppression model constructed in step 4.1 with the a groups of training samples, the objective function of the offline training being:
Losslight suppression = argmin Σj=1...a ( ||f′1(I′j) − f1(Ij)||²₂ + γ·||f′2(I′j) − f2(Ij)||²₂ )
wherein:
Losslight suppression represents the light suppression loss function;
argmin() represents the value of the variable at which the objective function takes its minimum value;
f′1(I′j) represents the shallow feature value output after the over-bright image I′j is input into the shallow feature selection layer f′1();
f′2(I′j) represents the deep feature value output after the over-bright image I′j is input into the deep feature selection layer f′2();
f1(Ij) represents the shallow feature value output after the normal-light image Ij is input into the shallow feature selection layer f1();
f2(Ij) represents the deep feature value output after the normal-light image Ij is input into the deep feature selection layer f2();
||·||²₂ represents the square of the L2 norm;
γ represents a penalty coefficient, set manually, that controls the influence of the ||f′2(I′j) − f2(Ij)||²₂ term on the light suppression loss function: the larger its value, the greater the influence of that term on the light suppression loss function;
step 4.4, the offline training of the light suppression model weakens the sensitivity of the suppression branch network to brightness features, so that the suppression branch network can suppress the illumination features of over-bright images captured by the unmanned aerial vehicle and improve the saliency of vehicle detail features from the unmanned aerial vehicle viewing angle;
therefore, the image frame Frm(t) is input into the suppression branch network of the trained light suppression model to obtain the illumination suppression feature map FRestrainMap.
3. The unmanned aerial vehicle video vehicle retrieval method based on vehicle feature matching of claim 1, wherein in step 5.6, merging two vehicle detection result boxes into one vehicle detection result box specifically comprises:
setting the two vehicle detection result boxes to be merged as the vehicle detection result box BoxSmall(1) and the vehicle detection result box BoxLarge(1), and denoting the merged vehicle detection result box as Box(1), then:
the center point of Box(1) is the midpoint of the line connecting the center points of BoxSmall(1) and BoxLarge(1);
the height of Box(1) is the average of the heights of BoxSmall(1) and BoxLarge(1);
the width of Box(1) is the average of the widths of BoxSmall(1) and BoxLarge(1).
4. The unmanned aerial vehicle video vehicle retrieval method based on vehicle feature matching of claim 1, wherein in step 7, the multi-feature joint vehicle search network establishment method is as follows:
and establishing a multi-feature joint vehicle search network by taking the vehicle color feature and the vehicle type feature as vehicle global features and taking the vehicle side view, the vehicle front view, the vehicle rear view, the vehicle top view and the non-vehicle view as vehicle local features.
5. The unmanned aerial vehicle video vehicle retrieval method based on vehicle feature matching according to claim 4, wherein step 7 specifically comprises:
step 7.1, constructing a multi-feature joint vehicle search network; the multi-feature joint vehicle search network comprises a global feature recognition module and a local feature matching module;
step 7.2, inputting the z detected vehicle maps and the target vehicle map S into the global feature recognition module, and obtaining, by the following method, z′ suspected vehicle maps with the same color and the same vehicle type as the target vehicle map S;
the global feature recognition module comprises a shared feature layer, a vehicle color feature layer and a vehicle type feature layer;
step 7.2.1, identifying the color features of the target vehicle map S, comprising the following steps:
step 7.2.1.1, inputting the target vehicle map S into the shared feature layer to obtain the shared feature map FShrMap;
step 7.2.1.2, inputting the shared feature map FShrMap into the vehicle color feature layer to obtain the vehicle color feature vector VColor; the vehicle color feature layer comprises conv4Color, a max pooling layer Maxpool and a fully connected layer FCColor;
step 7.2.1.3, multiplying the vehicle color feature vector VColor with the shared feature map FShrMap by matrix broadcasting to obtain the color-sensitive feature map FColorMap;
step 7.2.1.4, using the color-sensitive feature map FColorMap as a convolution kernel, cross-convolving the target vehicle map S to obtain the color feature enhancement map S′Color, which enhances the response of the target vehicle map S to color features;
step 7.2.1.5, inputting the color feature enhancement map S′Color successively into the shared feature layer, Conv4Color, Conv5Color, the max pooling layer and the fully connected layer, and obtaining the color category of the target vehicle map S through the non-maximum suppression algorithm;
step 7.2.2, obtaining the vehicle type of the target vehicle map S by the same method, and then obtaining the color category and vehicle type of each detected vehicle map;
step 7.2.3, judging whether there is, among the z detected vehicle maps, a detected vehicle map with the same color and the same vehicle type as the target vehicle map S, and if not, directly retrieving the next image frame;
if there is, extracting all the detected vehicle maps with the same color and the same vehicle type as the target vehicle map S; assuming z′ are extracted in total, the extracted z′ detected vehicle maps are called suspected vehicle maps, denoted Dc, wherein c = 1...z′;
step 7.3, inputting the target vehicle map S and each suspected vehicle map Dc into the local feature matching module, the local feature matching module applying a matching algorithm to obtain the vehicle mean vector matrix Vs of the target vehicle map S;
the local feature matching module applies the same matching algorithm to obtain the suspected vehicle mean vector matrix Vc of each suspected vehicle map Dc;
the local feature matching module comprises a feature extraction layer, a feature sparse convolutional layer Conv6 and a fully connected layer FCsight;
the local feature matching module performs feature matching on the target vehicle map S and obtains the vehicle mean vector matrix Vs of the target vehicle map S specifically as follows:
step 7.3.1, dividing the target vehicle map S with a 4*4 grid to obtain 16 vehicle sub-block maps;
step 7.3.2, inputting each vehicle sub-block map into the feature extraction layer to obtain the corresponding vehicle sub-block feature map FsubMap(m), m = 1...16;
step 7.3.3, inputting each vehicle sub-block feature map FsubMap(m) into the feature sparse convolutional layer Conv6 to obtain the corresponding sparse feature map FsparseMap(m);
step 7.3.4, determining the view category of each vehicle sub-block map:
inputting each sparse feature map FsparseMap(m) into the fully connected layer FCsight and obtaining the view category of the vehicle sub-block map through non-maximum suppression; the view categories comprise five classes: side view, front view, rear view, top view and non-vehicle view;
step 7.3.5, determining the view vector of the view category of each vehicle sub-block map:
if the view category is side view, front view, rear view or top view, extracting the features of the sparse feature map FsparseMap(m) and reshaping them into a one-dimensional feature vector, which serves as the view vector of the vehicle sub-block map; according to view category, the view vectors comprise: side-view vectors, front-view vectors, rear-view vectors and top-view vectors;
if the view category is non-vehicle view, the sub-block is discarded;
step 7.3.6, determining the view mean vector of each view category:
computing the mean of the view vectors of the vehicle sub-block maps of the same view category in the target vehicle map S, and obtaining the side-view mean vector, front-view mean vector, rear-view mean vector and top-view mean vector respectively;
if a view category is not present, its view mean vector does not exist, and all elements of that view mean vector are set to 0;
thus, the view mean vectors Vcl of the four view categories are obtained, where cl = 1, 2, 3, 4 denotes the side-view mean vector V1, the front-view mean vector V2, the rear-view mean vector V3 and the top-view mean vector V4; the view mean vectors Vcl of the four view categories constitute the vehicle mean vector matrix Vs of the target vehicle map S;
correspondingly, the suspected vehicle mean vectors V′cl of the four view categories of each suspected vehicle map Dc are obtained, constituting the suspected vehicle mean vector matrix Vc of the suspected vehicle map Dc;
step 7.4, calculating the number Num of view mean vectors of the view categories shared by the target vehicle map S and each suspected vehicle map Dc, and obtaining the feature matching value Match corresponding to each suspected vehicle map Dc with the following formula:
Match = λ·Num + tr(Vsᵀ·Vc)
wherein λ is the weight of the number of view mean vectors; T represents the transpose; tr represents the trace of the matrix, i.e. the sum of its main diagonal elements;
step 7.5, when the feature matching values Match of multiple suspected vehicle maps Dc are higher than the threshold, determining, by the non-maximum suppression method, the suspected vehicle map in which the target vehicle is located among the multiple suspected vehicle maps Dc, the position of that suspected vehicle map in the image frame Frm(t) being the position of the target vehicle in the image frame Frm(t);
when the feature matching values Match of the target vehicle map S with all suspected vehicle maps Dc are lower than the threshold, the target vehicle is not contained in the image frame Frm(t).
6. The unmanned aerial vehicle video vehicle retrieval method based on vehicle feature matching of claim 1, wherein in step 7.3.3, so that the sparse feature map can sufficiently express the features of the vehicle sub-block feature map FsubMap(m) and the information loss during compression is reduced, the compression loss function Losssparse is used during training:
Losssparse = Min(FsubMap(m) − (FsparseMap(m) * WTran))
wherein:
WTran is the upsampling weight obtained by deconvolution.
CN202111534212.4A 2021-12-15 2021-12-15 Unmanned aerial vehicle video vehicle retrieval method based on vehicle feature matching Active CN114220053B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111534212.4A CN114220053B (en) 2021-12-15 2021-12-15 Unmanned aerial vehicle video vehicle retrieval method based on vehicle feature matching

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111534212.4A CN114220053B (en) 2021-12-15 2021-12-15 Unmanned aerial vehicle video vehicle retrieval method based on vehicle feature matching

Publications (2)

Publication Number Publication Date
CN114220053A CN114220053A (en) 2022-03-22
CN114220053B true CN114220053B (en) 2022-06-03

Family

ID=80702585

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111534212.4A Active CN114220053B (en) 2021-12-15 2021-12-15 Unmanned aerial vehicle video vehicle retrieval method based on vehicle feature matching

Country Status (1)

Country Link
CN (1) CN114220053B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102436738A (en) * 2011-09-26 2012-05-02 同济大学 Traffic monitoring device based on unmanned aerial vehicle
CN110110624A (en) * 2019-04-24 2019-08-09 江南大学 A kind of Human bodys' response method based on DenseNet network and the input of frame difference method feature
CN110443208A (en) * 2019-08-08 2019-11-12 南京工业大学 YOLOv 2-based vehicle target detection method, system and equipment
CN110717387A (en) * 2019-09-02 2020-01-21 东南大学 A real-time vehicle detection method based on UAV platform

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107885764B (en) * 2017-09-21 2020-12-18 银江股份有限公司 A fast hash vehicle retrieval method based on multi-task deep learning
US11036216B2 (en) * 2018-09-26 2021-06-15 International Business Machines Corporation Voice-controllable unmanned aerial vehicle for object retrieval and delivery
CN109815886B (en) * 2019-01-21 2020-12-18 南京邮电大学 A pedestrian and vehicle detection method and system based on improved YOLOv3
CN109977812B (en) * 2019-03-12 2023-02-24 南京邮电大学 A vehicle video object detection method based on deep learning
US20200301015A1 (en) * 2019-03-21 2020-09-24 Foresight Ai Inc. Systems and methods for localization
CN112703368B (en) * 2020-04-16 2022-08-09 华为技术有限公司 Vehicle positioning method and device and positioning layer generation method and device
CN112149643B (en) * 2020-11-09 2022-02-22 西北工业大学 Vehicle weight identification method for unmanned aerial vehicle platform based on multi-stage attention mechanism
CN112381043A (en) * 2020-11-27 2021-02-19 华南理工大学 A method of flag detection
CN112766087A (en) * 2021-01-04 2021-05-07 武汉大学 Optical remote sensing image ship detection method based on knowledge distillation

Also Published As

Publication number Publication date
CN114220053A (en) 2022-03-22


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20230518

Address after: 3032B, 3rd Floor, Building 9, No.16 Fengguan Road, Fengtai District, Beijing, 100071

Patentee after: Beijing Lingyun Space Technology Co.,Ltd.

Address before: 100044 No. 1, Exhibition Road, Beijing, Xicheng District

Patentee before: Beijing University of Civil Engineering and Architecture

TR01 Transfer of patent right