CN111414997B - A Method for Battlefield Target Recognition Based on Artificial Intelligence
Info
- Publication number
- CN111414997B (application CN202010231438.6A)
- Authority
- CN
- China
- Prior art keywords
- target frame
- target
- battlefield
- learning
- recognition
- Prior art date: 2020-03-27
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G06N3/045: Computing arrangements based on biological models; neural networks; architecture; combinations of networks
- G06N3/08: Computing arrangements based on biological models; neural networks; learning methods
- G06T3/4007: Geometric image transformations in the plane of the image; scaling of whole images or parts thereof based on interpolation, e.g. bilinear interpolation
- G06V20/13: Scenes; scene-specific elements; terrestrial scenes; satellite images
- G06V2201/07: Indexing scheme relating to image or video recognition or understanding; target detection
- Y02T10/40: Climate change mitigation technologies related to transportation; engine management systems
Abstract
Description
Technical Field
The invention belongs to the fields of artificial intelligence and deep learning, and in particular relates to a method for target recognition based on artificial intelligence.
Background Art
Future warfare will inevitably be an intelligent military contest. To improve battlefield decision-making capability, the first problem to be solved is intelligent target recognition in strong, complex electromagnetic environments. Non-intelligent methods without self-learning capability, such as traditional machine learning and expert systems, struggle to cope with this problem.

Deep learning algorithms have already been used to perceive the battlefield environment, drawing on artificial intelligence technology to obtain timely and accurate comprehensive battlefield situation information and to assist combat commanders in making rapid command decisions.

Most research on target recognition is based on radar intelligence, but in complex electromagnetic environments such recognition is severely limited, whereas images acquired by optical sensors suffer much less from this interference. Current AI-based image recognition algorithms, however, suffer from overfitting, redundant recognition, and insufficient recognition accuracy, and cannot meet the needs of battlefield target recognition.
Summary of the Invention
The purpose of the present invention is to provide an artificial-intelligence-based method for battlefield target recognition that uses an improved deep learning algorithm for target recognition and can improve the ability of reconnaissance UAVs to recognize battlefield targets in complex environments.

The technical scheme of the present invention is as follows. A method for battlefield target recognition based on artificial intelligence comprises the following steps:
Step 1: image preprocessing optimization.

Step 2: learning rate optimization.

A decay rate is set; after a specified number of training steps, the current learning rate is reduced to prevent oscillation.

Step 3: multi-resolution learning and recognition.

Step 4: non-maximum suppression.
Step 1 comprises:
(1) generating target frames;
(2) image transformation optimization;
(3) Gaussian blur.
Step (1) of step 1 comprises generating a target frame through the Transfer-Faster-RCNN model. Before a target is recognized, a target frame is first generated and represented by a four-dimensional vector (x, y, w, h), where x is the abscissa of the center of the target frame, y is the ordinate of the center, w is the width, and h is the height:

A = (Ax, Ay, Aw, Ah)   (1)
G = (Gx, Gy, Gw, Gh)   (2)

where A is the original target frame and G is the ground-truth target frame.

The original input window is mapped to a regression window G′ that is closer to the ground-truth frame G; G′ is obtained by a translation transformation:

G′x = Ax + Aw·dx(A)   (3)
G′y = Ay + Ah·dy(A)   (4)

where dx(A) and dy(A) are the translation amounts, G′x is the abscissa of the translated center, and G′y is its ordinate.
Step (2) of step 1 comprises finding an image transformation F such that

F(Ax, Ay, Aw, Ah) = (G′x, G′y, G′w, G′h)   (5)

F can be computed by translation and scaling:

G′w = Aw·dw(A)   (6)
G′h = Ah·dh(A)   (7)

where dw(A) and dh(A) are the scaling factors, G′w is the width of the scaled target frame, and G′h is its height.
The objective function is constructed as

d_*(A) = w_*^T φ(A)   (8)

where φ(A) is the feature vector formed from the corresponding feature map, w_* is the parameter to be learned, and d_*(A) is the predicted value (* stands for x, y, w, h, i.e., each transformation has its own objective function of this form). To minimize the gap between the predicted values dx(A), dy(A), dw(A), dh(A) and the ground-truth values tx, ty, tw, th, the loss function is

Loss = Σ_{i=1}^{N} (t_*^i − w_*^T φ(A^i))²   (9)

where t_*^i is the ground-truth value of the target frame (its true center point and size), N is the number of feature maps, and A^i is the target frame of the i-th feature map.

The optimization objective w_* is

w_* = argmin_{ŵ_*} Σ_{i=1}^{N} (t_*^i − ŵ_*^T φ(A^i))² + λ‖ŵ_*‖²   (10)
Step (3) of step 1 comprises, before loading the data into the Faster-RCNN model, applying different degrees of Gaussian blur and exposure processing to the same image:

G(p, q) = (1/(2πσ²)) · exp(−(p² + q²)/(2σ²))   (11)

where p and q are the pixel coordinates in each RGB channel and σ is the exposure-degree coefficient.
In step 3, the bicubic interpolation algorithm, which loses the least image quality after processing, is selected. The interpolation kernel is expressed by the function W(m) as follows:

W(m) = (a+2)|m|³ − (a+3)|m|² + 1,    for |m| ≤ 1
W(m) = a|m|³ − 5a|m|² + 8a|m| − 4a,  for 1 < |m| < 2
W(m) = 0,                            otherwise

where m is the independent variable and a is an adjustment value.
Step 4 comprises:

(1) computing the intersection over union (IoU), the overlap-area ratio of each target frame with its neighboring target frames;

(2) comparing the IoU against a threshold and updating the confidence of neighboring target frames:

s_i = s_i,                      if IoU(M, b_i) < N_t
s_i = s_i·(1 − IoU(M, b_i)),    if IoU(M, b_i) ≥ N_t

where s_i is the confidence of each target frame, M is the currently selected highest-confidence frame, b_i is a neighboring frame, and N_t is the set threshold.
The beneficial effects of the present invention are as follows. The invention uses a deep learning method to recognize acquired optical image data: a deep convolutional neural network is trained on image data acquired by aerial reconnaissance UAVs. Addressing the three major problems of the original model (overfitting, redundant recognition, and insufficient recognition accuracy), the algorithm model is upgraded according to the characteristics of the optical images returned by UAV reconnaissance. Through image preprocessing optimization, learning rate optimization, and transfer learning, a neural network capable of quickly recognizing multiple classes of battlefield targets is generated. The optimization and transfer learning strategies effectively solve overfitting and redundant recognition and significantly improve target recognition accuracy.
Description of Drawings

Figure 1 shows the Faster-RCNN model;
Figure 2 shows the YOLO v3 model;
Figure 3 shows the Transfer-Faster RCNN model;
Figure 4 shows the recognition results of the YOLO v3 model;
Figure 5 shows the recognition results of the Faster-RCNN model.
Detailed Description of Embodiments

The present invention is described in further detail below with reference to the accompanying drawings and specific embodiments.
Normally, training a convolutional neural network requires a large amount of data, but because the actual battlefield environment is complex and highly time-critical, collecting a large amount of image data containing real battlefield targets is very difficult. According to the requirements of the actual battlefield environment, the present invention introduces a transfer learning algorithm, adapting an already trained model so that it satisfies a new requirement. Since only the last single-layer fully connected network of a trained deep learning model distinguishes the image classes, the preceding input and convolutional layers can be reused to extract a feature vector from any image, and the extracted feature vectors can then be used as input to train a new classifier. On the basis of the Faster-RCNN deep learning model shown in Figure 1, the invention introduces transfer learning and, by optimizing the trained model, builds the Transfer-Faster RCNN model (shown in Figure 3) to meet the requirements of battlefield target recognition.
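As an illustrative sketch of this transfer-learning setup (the patent does not name a framework; PyTorch/torchvision, the weights flag, and the class count taken from the later experiments are assumptions here), one can load a pre-trained Faster R-CNN, freeze its convolutional backbone, and replace only the final box predictor:

```python
# Hypothetical sketch: a "Transfer-Faster-RCNN"-style model in torchvision.
# Framework and API choices are assumptions, not taken from the patent.
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

NUM_CLASSES = 1 + 3  # background + (aircraft, tank, ship), per the experiments

# Load a detector pre-trained on a large generic dataset.
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")

# Freeze the pre-trained input/convolutional layers; reuse them as a fixed
# feature extractor, as described above.
for param in model.backbone.parameters():
    param.requires_grad = False

# Replace the final fully connected predictor with one sized for the new task.
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, NUM_CLASSES)
```

Only the new head (and any unfrozen layers) is then trained on the UAV imagery, which is what lets a small battlefield dataset suffice.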
The method for battlefield target recognition based on artificial intelligence provided by the present invention comprises the following steps.

Step 1: image preprocessing optimization.

In one embodiment of the present invention, step 1 comprises the following.
(1) Generating target frames

The target frame is generated through the Transfer-Faster-RCNN model, which accepts input images of any size. Before a target is recognized, a target frame is first generated and represented by a four-dimensional vector (x, y, w, h), where x is the abscissa of the center of the target frame, y is the ordinate of the center, w is the width, and h is the height:

A = (Ax, Ay, Aw, Ah)   (1)
G = (Gx, Gy, Gw, Gh)   (2)

where A is the original target frame and G is the ground-truth target frame.

The goal is to find a mapping from the original input window to a regression window G′ that is closer to the ground-truth frame G; G′ is obtained by a translation transformation:

G′x = Ax + Aw·dx(A)   (3)
G′y = Ay + Ah·dy(A)   (4)

where dx(A) and dy(A) are the translation amounts, G′x is the abscissa of the translated center, and G′y is its ordinate.
(2) Image transformation optimization

Because of the mechanical uncertainty of the camera and its carrier platform, the orientation and size of the acquired images inevitably drift. The images are therefore scaled and rotated to different degrees so that, once loaded into the model, the system can recognize the same target at different angles and sizes more sensitively.

That is, an image transformation F is sought such that

F(Ax, Ay, Aw, Ah) = (G′x, G′y, G′w, G′h)   (5)

F can be computed by translation and scaling:

G′w = Aw·dw(A)   (6)
G′h = Ah·dh(A)   (7)

where dw(A) and dh(A) are the scaling factors, G′w is the width of the scaled target frame, and G′h is its height.

The objective function is constructed as

d_*(A) = w_*^T φ(A)   (8)

where φ(A) is the feature vector formed from the corresponding feature map, w_* is the parameter to be learned, and d_*(A) is the predicted value (* stands for x, y, w, h, i.e., each transformation has its own objective function of this form). To minimize the gap between the predicted values dx(A), dy(A), dw(A), dh(A) and the ground-truth values tx, ty, tw, th, the loss function is

Loss = Σ_{i=1}^{N} (t_*^i − w_*^T φ(A^i))²   (9)

where t_*^i is the ground-truth value of the target frame (its true center point and size), N is the number of feature maps, and A^i is the target frame of the i-th feature map.

The optimization objective w_* is

w_* = argmin_{ŵ_*} Σ_{i=1}^{N} (t_*^i − ŵ_*^T φ(A^i))² + λ‖ŵ_*‖²   (10)
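A small NumPy sketch may make equations (3)-(9) concrete (the function names and array shapes are illustrative, not from the patent):

```python
# Illustrative sketch of the box regression in equations (3)-(7) and the
# squared-error loss of equation (9).
import numpy as np

def apply_regression(A, d):
    """A = (Ax, Ay, Aw, Ah) anchor box; d = (dx, dy, dw, dh) predictions."""
    Ax, Ay, Aw, Ah = A
    dx, dy, dw, dh = d
    Gx = Ax + Aw * dx   # eq. (3): translate the center abscissa
    Gy = Ay + Ah * dy   # eq. (4): translate the center ordinate
    Gw = Aw * dw        # eq. (6): scale the width
    Gh = Ah * dh        # eq. (7): scale the height
    return np.array([Gx, Gy, Gw, Gh])

def regression_loss(t, phi, w):
    """Eq. (9): sum of squared gaps between targets t_i and w^T phi(A_i).
    t: (N,) ground-truth values; phi: (N, D) feature vectors; w: (D,)."""
    return float(np.sum((t - phi @ w) ** 2))
```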
(3) Gaussian blur

Considering the complex interference present in real battlefield environments, before the data are loaded into the Faster-RCNN model, different degrees of Gaussian blur and exposure processing are first applied to the same image:

G(p, q) = (1/(2πσ²)) · exp(−(p² + q²)/(2σ²))   (11)

where p and q are the pixel coordinates in each RGB channel and σ is the exposure-degree coefficient.

After this optimization, the processed photos are loaded into the Faster-RCNN model together with the original photos for training. The advantage is twofold: the amount of target data is increased, which improves training accuracy, and, more importantly, the processed images adapt well to the effects of weather conditions and mechanical factors in real battlefield environments, greatly improving the robustness of the system.
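A minimal sketch of this augmentation step, assuming OpenCV; the specific sigma and gain values are illustrative choices, not values from the patent:

```python
# Hypothetical preprocessing sketch: produce blurred and exposure-shifted
# copies of each training image, mirroring the augmentation described above.
import cv2
import numpy as np

def augment(image):
    """Return the original image plus blurred and exposure-adjusted variants."""
    variants = [image]
    for sigma in (1.0, 2.0, 3.0):              # blur strengths (assumed)
        variants.append(cv2.GaussianBlur(image, (0, 0), sigmaX=sigma))
    for gain in (0.6, 1.4):                    # under- and over-exposure
        scaled = image.astype(np.float32) * gain
        variants.append(np.clip(scaled, 0, 255).astype(np.uint8))
    return variants
```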
Step 2: learning rate optimization.

In one embodiment of the present invention, step 2 comprises the following.

When training a model, the learning rate affects the training result: too large a learning rate makes the training loss oscillate, while too small a learning rate makes training too slow. The present invention addresses this as follows.

First, a decay rate is set; after a specified number of training steps, the current learning rate is reduced to prevent oscillation.

Specifically, in the invention the learning rate is reduced to 90% of its current value every 10,000 training steps. After 100,000 training steps, training is interrupted and the learning rate is adjusted according to the loss rate: if the loss rate exceeds 30%, the learning rate is increased by 50%. After adjustment, training continues on the previous training data, yielding a more effective mature model.
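This schedule can be sketched as a plain function of the global step and the most recent loss rate; the 10,000-step interval, 90% decay, 30% loss threshold, and 50% boost are the embodiment's values, while the wiring into a concrete trainer is left open:

```python
# Sketch of the learning-rate policy in this embodiment: decay to 90% every
# 10,000 steps; at each 100,000-step checkpoint, boost by 50% if the loss
# rate is still above 30%.
def adjust_learning_rate(lr, step, loss_rate=None):
    if step > 0 and step % 10_000 == 0:
        lr *= 0.90                       # periodic decay damps oscillation
    if step > 0 and step % 100_000 == 0 and loss_rate is not None:
        if loss_rate > 0.30:             # loss still high: enlarge the steps
            lr *= 1.50
    return lr
```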
Step 3: multi-resolution learning and recognition.

In one embodiment of the present invention, step 3 comprises the following.

Verification of the Faster-RCNN model showed that a scarce training set causes a high misrecognition rate in final detection, which can hardly meet real battlefield requirements. The present invention therefore proposes a multi-resolution learning optimization method, as follows.

To recover the information lost in the images, the invention interpolates low-resolution images into high-resolution images within the model framework, obtaining more image details and features for the neural network to learn.

After a comparative analysis of several classical image interpolation algorithms, including nearest-neighbor interpolation, bilinear interpolation, and bicubic interpolation, the embodiment of the invention selects bicubic interpolation, which loses the least image quality. The interpolation kernel is expressed by the function W(m) as follows:

W(m) = (a+2)|m|³ − (a+3)|m|² + 1,    for |m| ≤ 1
W(m) = a|m|³ − 5a|m|² + 8a|m| − 4a,  for 1 < |m| < 2
W(m) = 0,                            otherwise

where m is the independent variable and a is an adjustment value. While preserving as much of the original image detail as possible, the original image is enlarged so that the neural network can better capture the features of targets in the image, optimizing the quality of the training set.

In addition, during recognition an interpolated second-pass recognition is performed: the target-frame region is enlarged, which greatly improves recognition accuracy and lowers the misrecognition rate. In subsequent experiments, the model optimized by this process recognized battlefield camouflage against real backgrounds better and adapted more robustly to complex environments.
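The kernel W(m) can be coded directly; the sketch below also shows the interpolated second-pass enlargement using OpenCV's bicubic mode. The value a = -0.5 is a common default for bicubic resampling, assumed here since the patent leaves a open:

```python
# Sketch: the bicubic kernel W(m) from the text, plus enlargement of a
# first-pass target-frame region for the second-pass recognition.
import cv2

def W(m, a=-0.5):  # a = -0.5 is an assumed, commonly used adjustment value
    m = abs(m)
    if m <= 1:
        return (a + 2) * m**3 - (a + 3) * m**2 + 1
    if m < 2:
        return a * m**3 - 5 * a * m**2 + 8 * a * m - 4 * a
    return 0.0

def upscale_region(image, box, factor=2):
    """Crop a target-frame region (x1, y1, x2, y2) and enlarge it bicubically."""
    x1, y1, x2, y2 = box
    crop = image[y1:y2, x1:x2]
    return cv2.resize(crop, None, fx=factor, fy=factor,
                      interpolation=cv2.INTER_CUBIC)
```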
Step 4: non-maximum suppression (NMS).

In one embodiment of the present invention, step 4 comprises the following.

After the above optimizations, the model's recognition ability improves greatly over its initial state, but another problem follows: an over-detection form of overfitting in which the same target is detected by multiple recognition frames, producing a large amount of redundant information. This is very unfavorable for decision support in a battlefield environment.

To address this, the present invention adopts a linear non-maximum suppression algorithm to remove the redundant target frames produced for a single target, retaining the best one, thereby improving recognition accuracy and lowering the misrecognition rate.

The implementation comprises:

(1) computing the intersection over union (IoU), the overlap-area ratio of each target frame with its neighboring target frames;

(2) comparing the IoU against a threshold and updating the confidence of neighboring target frames:

s_i = s_i,                      if IoU(M, b_i) < N_t
s_i = s_i·(1 − IoU(M, b_i)),    if IoU(M, b_i) ≥ N_t

where s_i is the confidence of each target frame, M is the currently selected highest-confidence frame, b_i is a neighboring frame, and N_t is the set threshold; in one embodiment of the invention N_t is set to 0.75. After this processing, redundant recognition target frames are filtered out well while recognition accuracy is preserved.
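A compact NumPy sketch of this linear suppression rule follows; the greedy selection loop, the (x1, y1, x2, y2) box format, and the pruning floor are assumptions, while the N_t = 0.75 threshold follows the embodiment:

```python
# Sketch of linear non-maximum suppression: repeatedly pick the highest-
# confidence frame M, then down-weight neighbors whose IoU with M exceeds N_t.
import numpy as np

def iou(a, b):
    """IoU of two boxes in (x1, y1, x2, y2) format."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def linear_nms(boxes, scores, Nt=0.75, score_floor=0.05):
    boxes = np.asarray(boxes, dtype=float)
    scores = np.asarray(scores, dtype=float)
    order = list(np.argsort(-scores))
    keep = []
    while order:
        m = order.pop(0)                  # current highest-confidence frame M
        keep.append(m)
        for i in order:
            ov = iou(boxes[m], boxes[i])
            if ov >= Nt:                  # linear decay instead of hard removal
                scores[i] *= (1.0 - ov)
        # Drop frames whose confidence fell below the floor, re-sort the rest.
        order = sorted((i for i in order if scores[i] > score_floor),
                       key=lambda i: -scores[i])
    return keep, scores
```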
To demonstrate the effectiveness of the battlefield target recognition method of the present invention, it is verified experimentally below.

To better match a complex, realistic battlefield environment, only a reconnaissance UAV was used for short-duration target reconnaissance in the experiments. The target recognition procedure is as follows.

(1) The limited sample set acquired by the UAV is preprocessed with translation, rotation, scaling, and blur/exposure transformations, and the processed images are added to the original sample set, enlarging it.

(2) After preprocessing, the data are loaded into the trained Faster-RCNN model (shown in Figure 1) and YOLO v3 model (shown in Figure 2) for training. During training, the learning rate is adjusted promptly according to the real-time training loss rate to avoid local optima or overfitting.

When training the Faster-RCNN and YOLO v3 models, three target classes were set (aircraft, tank, ship), and the parameters of each convolutional layer were modified according to the target classes. The initial learning rate was 0.0003; every 10,000 training steps the learning rate was reduced to 95% of its current value, and training was interrupted every 100,000 steps to adjust the learning rate by observing the loss curve. After 200,000 training steps each, the model parameters were exported and randomly tested; the training results are shown in Figures 4 and 5.

(3) Image interpolation is applied to the photos in the training set to raise their resolution and increase the picture information. For key regions recognized in the first pass with high confidence, the region resolution is enlarged by interpolation and a second-pass recognition is performed, improving precision and reducing misrecognition.

(4) Based on the confidences from the previous step, if two adjacent recognition regions both exceed the set threshold, their overlap rate is checked; an excessively high overlap is treated as a single target, reducing misrecognition both when one target is recognized multiple times and when two targets are very close together.

Sixty photos were verified under each model, and three groups of tests were carried out in total; the algorithm comparison is shown in Table 1.

Table 1: Algorithm comparison

From these experimental results it can be seen that, under identical training conditions, YOLO v3 achieves higher recognition speed owing to its structural advantages, but its accuracy is slightly worse than the other two models. After optimization of the original model, Transfer-Faster RCNN greatly improves recognition accuracy while maintaining the original recognition speed, and adapts better to the complex and changeable battlefield environment.
Claims (3)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title
---|---|---|---|
CN202010231438.6A | 2020-03-27 | 2020-03-27 | CN111414997B (en): A Method for Battlefield Target Recognition Based on Artificial Intelligence
Publications (2)
Publication Number | Publication Date
---|---|
CN111414997A (en) | 2020-07-14
CN111414997B (en) | 2023-06-06
Family
ID=71491576
Family Applications (1)
Application Number | Title | Priority Date | Filing Date
---|---|---|---|
CN202010231438.6A (Active, granted as CN111414997B) | A Method for Battlefield Target Recognition Based on Artificial Intelligence | 2020-03-27 | 2020-03-27
Country Status (1)
Country | Link
---|---|
CN | CN111414997B (en)
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title
---|---|---|---|---|
CN112465057B | 2020-12-08 | 2023-05-12 | Air Force Engineering University of the PLA | Target detection and identification method based on deep convolutional neural network
CN112633168B | 2020-12-23 | 2023-10-31 | Changsha Zoomlion Environmental Industry Co., Ltd. | Garbage truck and method and device for identifying garbage can overturning action of garbage truck
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title
---|---|---|---|---|
CN109690554B | 2016-07-21 | 2023-12-05 | Siemens Healthcare GmbH | Method and system for artificial intelligence based medical image segmentation
EP4293574A3 | 2017-08-08 | 2024-04-03 | RealD Spark, LLC | Adjusting a digital representation of a head region
CN108399362B | 2018-01-24 | 2022-01-07 | Sun Yat-sen University | Rapid pedestrian detection method and device
CN109522938A | 2018-10-26 | 2019-03-26 | South China University of Technology | Recognition method of targets in images based on deep learning

2020-03-27: Application filed as CN202010231438.6A (granted as CN111414997B, status Active)
Also Published As
Publication number | Publication date |
---|---|
CN111414997A (en) | 2020-07-14 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110020651B (en) | License plate detection and positioning method based on deep learning network | |
Wu et al. | Rapid target detection in high resolution remote sensing images using YOLO model | |
CN106845478B (en) | A kind of secondary licence plate recognition method and device of character confidence level | |
CN109308483B (en) | Dual-source image feature extraction and fusion identification method based on convolutional neural network | |
CN111241931B (en) | Aerial unmanned aerial vehicle target identification and tracking method based on YOLOv3 | |
EP3254238B1 (en) | Method for re-identification of objects | |
CN108108746B (en) | License plate character recognition method based on Caffe deep learning framework | |
CN107529650B (en) | Closed loop detection method and device and computer equipment | |
CN111461213B (en) | A training method for a target detection model and a fast target detection method | |
CN106023257B (en) | A kind of method for tracking target based on rotor wing unmanned aerial vehicle platform | |
CN111353512A (en) | Obstacle classification method, device, storage medium and computer equipment | |
Lai et al. | Traffic Signs Recognition and Classification based on Deep Feature Learning. | |
US11244188B2 (en) | Dense and discriminative neural network architectures for improved object detection and instance segmentation | |
CN108647573A (en) | A kind of military target recognition methods based on deep learning | |
CN114549891B (en) | Foundation cloud image cloud class identification method based on comparison self-supervision learning | |
CN111414997B (en) | A Method for Battlefield Target Recognition Based on Artificial Intelligence | |
CN112149533A (en) | Target detection method based on improved SSD model | |
EP4024343A1 (en) | Viewpoint image processing method and related device | |
CN115761552B (en) | Target detection method, device and medium for unmanned aerial vehicle carrying platform | |
CN116503763A (en) | Unmanned aerial vehicle cruising forest fire detection method based on binary cooperative feedback | |
CN116363535A (en) | Ship detection method in unmanned aerial vehicle aerial image based on convolutional neural network | |
CN114973026A (en) | Target detection system in unmanned aerial vehicle scene of taking photo by plane, unmanned aerial vehicle system of taking photo by plane | |
CN116309270B (en) | Binocular image-based transmission line typical defect identification method | |
CN114549969B (en) | Saliency detection method and system based on image information fusion | |
CN116665097A (en) | Self-adaptive target tracking method combining context awareness |
Legal Events
Date | Code | Title | Description
---|---|---|---|
| PB01 | Publication |
| SE01 | Entry into force of request for substantive examination |
| GR01 | Patent grant |