CN111414997B - A Method for Battlefield Target Recognition Based on Artificial Intelligence - Google Patents

A Method for Battlefield Target Recognition Based on Artificial Intelligence

Info

Publication number
CN111414997B
CN111414997B (application CN202010231438.6A)
Authority
CN
China
Prior art keywords
target frame
target
battlefield
learning
recognition
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010231438.6A
Other languages
Chinese (zh)
Other versions
CN111414997A (en)
Inventor
权文
宋亚飞
路艳丽
王坚
王亚男
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Air Force Engineering University of PLA
Original Assignee
Air Force Engineering University of PLA
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Air Force Engineering University of PLA filed Critical Air Force Engineering University of PLA
Priority to CN202010231438.6A priority Critical patent/CN111414997B/en
Publication of CN111414997A publication Critical patent/CN111414997A/en
Application granted granted Critical
Publication of CN111414997B publication Critical patent/CN111414997B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00: Geometric image transformations in the plane of the image
    • G06T3/40: Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4007: Scaling based on interpolation, e.g. bilinear interpolation
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00: Scenes; Scene-specific elements
    • G06V20/10: Terrestrial scenes
    • G06V20/13: Satellite images
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00: Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07: Target detection
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T: CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00: Road transport of goods or passengers
    • Y02T10/10: Internal combustion engine [ICE] based vehicles
    • Y02T10/40: Engine management systems


Abstract

The invention discloses an artificial-intelligence-based method for battlefield target recognition, which comprises the following steps. Step 1: image preprocessing optimization. Step 2: learning rate optimization: a decay rate is set, and after a specified number of training steps the learning rate is reduced to prevent oscillation. Step 3: multi-resolution learning and recognition. Step 4: non-maximum suppression. Beneficial effects: the acquired optical image data are subjected to recognition processing. Specifically, a deep neural network is used to perform convolutional neural network training on image data acquired by an aerial unmanned aerial vehicle (UAV). Addressing the three problems of the original model (overfitting, redundant recognition, and insufficient recognition accuracy), the algorithm model is upgraded according to the characteristics of the optical images returned by UAV reconnaissance, and through image preprocessing optimization, learning rate optimization, and transfer learning, a neural network capable of rapidly recognizing multiple types of battlefield targets is generated.

Description

A Method for Battlefield Target Recognition Based on Artificial Intelligence

Technical Field

The invention belongs to the fields of artificial intelligence and deep learning, and in particular relates to an artificial-intelligence-based method for target recognition.

Background

Future warfare will inevitably be an intelligent military contest. To improve battlefield decision-making, the first problem to solve is intelligent target recognition in strong, complex electromagnetic environments. Non-intelligent methods without self-learning capability, such as traditional machine learning and expert systems, struggle with the problem of intelligent target recognition.

Deep learning algorithms have already been used to perceive the battlefield environment: artificial intelligence is used to obtain timely, accurate, and comprehensive battlefield situation information that helps combat commanders make rapid command decisions.

Most research on target recognition is based on radar intelligence, but in complex electromagnetic environments such recognition is severely limited, whereas images acquired by optical sensors suffer far less from this interference. Current artificial-intelligence image recognition algorithms, however, exhibit overfitting, redundant recognition, and insufficient recognition accuracy, and cannot meet the needs of battlefield target recognition.

Summary of the Invention

The object of the present invention is to provide an artificial-intelligence-based method for battlefield target recognition that performs target recognition with an improved deep learning algorithm and can improve the ability of reconnaissance UAVs to recognize battlefield targets in complex environments.

The technical scheme of the present invention is as follows. An artificial-intelligence-based method for battlefield target recognition comprises the following steps:

Step 1: image preprocessing optimization;

Step 2: learning rate optimization;

a decay rate is set, and after a specified number of training steps the learning rate is reduced to prevent oscillation;

Step 3: multi-resolution learning and recognition;

Step 4: non-maximum suppression.

Step 1 comprises the following:

(1) generating a target frame;

(2) image transformation optimization;

(3) Gaussian blur.

Step (1) of step 1 comprises: generating a target frame through the Transfer-Faster-RCNN model. Before a target is recognized, a target frame is first generated and represented by a four-dimensional vector (x, y, w, h), where x is the abscissa of the target frame's centre point, y is the ordinate of the centre point, w is the width of the target frame, and h is the height of the target frame:

A=(Ax,Ay,Aw,Ah) (1)

G=(Gx,Gy,Gw,Gh) (2)

where A is the original target frame dataset and G is the ground-truth target frame dataset.

The original input window is mapped to a regression window G′ that is closer to the ground-truth frame G, where G′ is obtained by a translation transformation:

G′x=Ax+Aw·dx(A) (3)

G′y=Ay+Ah·dy(A) (4)

where dx(A) and dy(A) denote the translation amounts, G′x denotes the abscissa of the translated centre point, and G′y denotes the ordinate of the translated centre point;

Step (2) of step 1 comprises:

finding an image transformation F such that

F(Ax,Ay,Aw,Ah)=(G′x,G′y,G′w,G′h) (5)

The calculation of F is achieved by translation and scaling:

G′w=Aw·dw(A) (6)

G′h=Ah·dh(A) (7)

where dw(A) and dh(A) denote the scaling factors, G′w denotes the scaled target frame width, and G′h denotes the scaled target frame height;

The objective function is constructed as:

d*(A) = w*^T·φ(A) (8)

where φ(A) is the feature vector composed from the corresponding feature map and w*^T is the parameter to be learned; d*(A) is the predicted value (the asterisk stands for x, y, w, h, i.e., each transformation corresponds to one objective function of the above form). To minimize the differences between the predicted values dx(A), dy(A), dw(A), dh(A) and the ground-truth values tx, ty, tw, th, the cost function loss is as follows:

Loss = Σ_{i=1..N} (t*^i - w*^T·φ(A^i))^2 (9)

where t*^i denotes the true centre point of the target frame, N denotes the number of feature maps, and A^i denotes the target frame of the i-th feature map;

The function optimization objective w* is:

w* = argmin_w [ Σ_{i=1..N} (t*^i - w^T·φ(A^i))^2 + λ‖w‖^2 ] (10)

Step (3) of step 1 comprises: before the data are loaded into the Faster-RCNN model, the same picture is first subjected to different degrees of Gaussian blur and exposure processing:

G(p,q) = (1/(2πσ^2))·e^(-(p^2+q^2)/(2σ^2)) (11)

where p and q are the pixel point positions in each RGB channel and σ is the exposure-degree coefficient.

In step 3, bicubic interpolation is selected because it loses the least image quality after processing; the interpolation kernel is expressed by the function W(m) as follows:

W(m) = (a+2)|m|^3 - (a+3)|m|^2 + 1,  for |m| ≤ 1;
W(m) = a|m|^3 - 5a|m|^2 + 8a|m| - 4a,  for 1 < |m| < 2;
W(m) = 0,  otherwise (12)

where m is the independent variable and a is an adjustment value.

Step 4 comprises the following:

(1) calculating IoU, the area ratio of the overlapping region between each target frame and its neighbouring target frames;

(2) comparing the IoU with a threshold and changing the confidence of the neighbouring target frames:

s_i = s_i,  for IoU < N_t;
s_i = s_i·(1 - IoU),  for IoU ≥ N_t (13)

where s_i is the confidence of each target frame and N_t is the set threshold.

The beneficial effects of the present invention are as follows. The present invention uses a deep learning method to perform recognition processing on acquired optical image data. Specifically, a deep neural network is used to perform convolutional neural network training on image data acquired by an aerial UAV. Addressing the three major problems of the original model, namely overfitting, redundant recognition, and insufficient recognition accuracy, the algorithm model is upgraded according to the characteristics of the optical images returned by UAV reconnaissance. Through image preprocessing optimization, learning rate optimization, and transfer learning, a neural network capable of rapidly recognizing multiple types of battlefield targets is generated. The optimization and transfer learning strategies effectively solve the problems of overfitting and redundant recognition and significantly improve target recognition accuracy.

Brief Description of the Drawings

Figure 1 shows the Faster-RCNN model;

Figure 2 shows the YOLO v3 model;

Figure 3 shows the Transfer-Faster-RCNN model;

Figure 4 shows the recognition results of the YOLO v3 model;

Figure 5 shows the recognition results of the Faster-RCNN model.

Detailed Description

The present invention is described in further detail below with reference to the accompanying drawings and specific embodiments.

Normally, training a convolutional neural network requires a large amount of data, but because of the complexity and time pressure of the actual battlefield environment, it is very difficult to collect a large amount of image information containing real battlefield targets. According to the requirements of the actual battlefield environment, the present invention introduces a transfer learning algorithm that adapts an already trained model to satisfy a new requirement. Since only the final single-layer fully connected network of a trained deep learning model is used to distinguish image classes, the preceding input and convolutional layers can be reused to extract a feature vector from any image, and the extracted feature vectors can be used as input to train a new classifier. On the basis of the Faster-RCNN deep learning model shown in Figure 1, the present invention introduces a transfer learning algorithm and, by optimizing the already trained model, establishes the Transfer-Faster-RCNN model (shown in Figure 3), which satisfies and is applicable to the requirements of battlefield target recognition.
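
As an illustrative sketch only (not code from the patent), the freeze-and-retrain strategy described above might look as follows in PyTorch; the ResNet-50 backbone, the torchvision weights identifier, and the three output classes are assumptions for the example:

```python
import torch.nn as nn
import torchvision

# Load a model pretrained on a large generic dataset.
model = torchvision.models.resnet50(weights="IMAGENET1K_V1")

# Freeze the input and convolutional layers so they serve as a fixed
# feature extractor, as described above.
for param in model.parameters():
    param.requires_grad = False

# Replace the final fully connected layer with a new classifier head
# for the battlefield target classes (aircraft, tank, ship).
model.fc = nn.Linear(model.fc.in_features, 3)
```

Only the new head then receives gradients, so the scarce battlefield samples are spent on the classifier rather than on the whole network.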

The artificial-intelligence-based method for battlefield target recognition provided by the present invention comprises the following steps.

Step 1: image preprocessing optimization.

In one embodiment of the present invention, step 1 comprises the following:

(1) Generating a target frame

The target frame is generated through the Transfer-Faster-RCNN model, which supports input pictures of arbitrary size. Before a target is recognized, a target frame is first generated and represented by a four-dimensional vector (x, y, w, h), where x is the abscissa of the target frame's centre point, y is the ordinate of the centre point, w is the width of the target frame, and h is the height of the target frame:

A=(Ax,Ay,Aw,Ah) (1)

G=(Gx,Gy,Gw,Gh) (2)

where A is the original target frame dataset and G is the ground-truth target frame dataset.

The goal is to find a relationship such that the original input window is mapped to a regression window G′ closer to the ground-truth frame G, where G′ is obtained by a translation transformation:

G′x=Ax+Aw·dx(A) (3)

G′y=Ay+Ah·dy(A) (4)

where dx(A) and dy(A) denote the translation amounts, G′x denotes the abscissa of the translated centre point, and G′y denotes the ordinate of the translated centre point.

(2) Image transformation optimization

Owing to the mechanical uncertainty of the camera and its carrier platform, the orientation and size of the acquired images inevitably drift. The images are therefore scaled and rotated to various degrees so that, once loaded into the model, the system can recognize the same target at different angles and sizes more sensitively.

That is, an image transformation F is sought such that

F(Ax,Ay,Aw,Ah)=(G′x,G′y,G′w,G′h) (5)

The calculation of F is achieved by translation and scaling:

G′w=Aw·dw(A) (6)

G′h=Ah·dh(A) (7)

where dw(A) and dh(A) denote the scaling factors, G′w denotes the scaled target frame width, and G′h denotes the scaled target frame height.

The objective function is constructed as:

d*(A) = w*^T·φ(A) (8)

where φ(A) is the feature vector composed from the corresponding feature map and w*^T is the parameter to be learned; d*(A) is the predicted value (the asterisk stands for x, y, w, h, i.e., each transformation corresponds to one objective function of the above form). To minimize the differences between the predicted values dx(A), dy(A), dw(A), dh(A) and the ground-truth values tx, ty, tw, th, the cost function loss is as follows:

Loss = Σ_{i=1..N} (t*^i - w*^T·φ(A^i))^2 (9)

where t*^i denotes the true centre point of the target frame, N denotes the number of feature maps, and A^i denotes the target frame of the i-th feature map.

The function optimization objective w* is:

w* = argmin_w [ Σ_{i=1..N} (t*^i - w^T·φ(A^i))^2 + λ‖w‖^2 ] (10)
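
As an illustrative sketch of Eqs. (3) to (10), not the patent's own code, the transformation and the least-squares fit can be written as follows; the helper names and the ridge coefficient lam are assumptions:

```python
import numpy as np

def apply_transform(A, d):
    """Eqs. (3)-(7): translate the centre of anchor A = (Ax, Ay, Aw, Ah)
    by (dx, dy) and scale its width and height by (dw, dh)."""
    Ax, Ay, Aw, Ah = A
    dx, dy, dw, dh = d
    return (Ax + Aw * dx, Ay + Ah * dy, Aw * dw, Ah * dh)

def predict_delta(w, phi_A):
    """Eq. (8): d*(A) = w*^T . phi(A) for one of the four regressors."""
    return float(w @ phi_A)

def fit_regressor(phi, t, lam=1e-3):
    """Eq. (10): closed-form ridge solution minimizing the loss of
    Eq. (9); phi is the (N, D) matrix of feature vectors phi(A^i) and
    t the (N,) vector of ground-truth targets t*^i."""
    D = phi.shape[1]
    return np.linalg.solve(phi.T @ phi + lam * np.eye(D), phi.T @ t)
```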

(3) Gaussian blur

Meanwhile, considering the complex interference present in a real battlefield environment, before the data are loaded into the Faster-RCNN model, the same picture is first subjected to different degrees of Gaussian blur and exposure processing:

G(p,q) = (1/(2πσ^2))·e^(-(p^2+q^2)/(2σ^2)) (11)

where p and q are the pixel point positions in each RGB channel and σ is the exposure-degree coefficient.

After this optimization, the processed photos are loaded into the Faster-RCNN model together with the original photos for training. The advantage is twofold: the amount of acquired target data increases, which improves training accuracy; more importantly, the processed pictures adapt well to the effects that weather conditions and mechanical factors produce in a real battlefield environment, greatly improving the robustness of the system.
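
A minimal sketch of this augmentation step, assuming OpenCV is available; the blur strengths and exposure factors below are illustrative values, not values fixed by the patent:

```python
import cv2
import numpy as np

def augment(image, sigmas=(1.0, 2.0, 3.0), exposures=(0.7, 1.3)):
    """Produce Gaussian-blurred copies (the kernel of Eq. (11), via
    cv2.GaussianBlur) and exposure-shifted copies of one image, to be
    added to the training set together with the original."""
    copies = []
    for s in sigmas:
        copies.append(cv2.GaussianBlur(image, (0, 0), sigmaX=s))
    for e in exposures:
        shifted = np.clip(image.astype(np.float32) * e, 0, 255)
        copies.append(shifted.astype(np.uint8))
    return copies
```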

Step 2: learning rate optimization.

In one embodiment of the present invention, step 2 comprises the following:

When training a model, the learning rate affects the training result: an overly large learning rate makes the training loss oscillate, while an overly small one makes training too slow. The present invention addresses this problem as follows.

First, a decay rate is set; after a specified number of training steps, the learning rate is reduced to prevent oscillation.

Specifically, in the invention the learning rate is reduced to 90% of its current value after every ten thousand training steps. Then, after one hundred thousand steps, training is interrupted and the learning rate is adjusted according to the loss rate: if the loss rate exceeds 30%, the learning rate is increased by 50%. After the adjustment, training continues on the previous training data, yielding a mature model with better performance.
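
The schedule just described can be sketched as a simple step-decay function; the function names are assumptions, and the base rate 0.0003 is taken from the experimental section below:

```python
def learning_rate(step, base_lr=0.0003, decay=0.90, every=10_000):
    """Reduce the learning rate to 90% of its current value after
    every ten thousand training steps."""
    return base_lr * decay ** (step // every)

def adjust_at_interruption(lr, loss_rate, threshold=0.30, boost=1.5):
    """At the 100,000-step interruption: if the loss rate exceeds 30%,
    raise the learning rate by 50%; otherwise keep it unchanged."""
    return lr * boost if loss_rate > threshold else lr
```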

Step 3: multi-resolution learning and recognition.

In one embodiment of the present invention, step 3 comprises the following:

Verification of the Faster-RCNN model showed that a scarce training set causes a relatively high misrecognition rate in the final detection, making it difficult to meet the demands of a real battlefield environment. The present invention therefore proposes a multi-resolution learning optimization, as follows.

To recover the information lost in an image, the present invention, on top of the model framework, interpolates low-resolution images into high-resolution images, obtaining more image details and features for the neural network to learn from.

After a comparative analysis of several classic image interpolation algorithms, including nearest-neighbour interpolation, bilinear interpolation, and bicubic interpolation, the embodiment of the present invention selects bicubic interpolation, which loses the least image quality after processing. The interpolation kernel is expressed by the function W(m) as follows:

W(m) = (a+2)|m|^3 - (a+3)|m|^2 + 1,  for |m| ≤ 1;
W(m) = a|m|^3 - 5a|m|^2 + 8a|m| - 4a,  for 1 < |m| < 2;
W(m) = 0,  otherwise (12)

where m is the independent variable and a is an adjustment value. While preserving the details of the original image as much as possible, the original image is enlarged so that the neural network can better capture the features of targets in the image, thereby improving the quality of the training set.
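
For illustration, the kernel W(m) of Eq. (12) translates directly into code; the value a = -0.5, a common choice for bicubic resampling, is an assumption here rather than a value fixed by the patent:

```python
def W(m, a=-0.5):
    """Bicubic interpolation kernel of Eq. (12)."""
    m = abs(m)
    if m <= 1:
        return (a + 2) * m**3 - (a + 3) * m**2 + 1
    if m < 2:
        return a * m**3 - 5 * a * m**2 + 8 * a * m - 4 * a
    return 0.0
```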

In addition, during recognition an interpolated second pass is performed: the target frame region is enlarged and recognized again, which greatly improves target recognition accuracy and lowers the misrecognition rate, as sketched below. In subsequent experiments, the model optimized by this process recognized battlefield camouflage against real backgrounds better and adapted more strongly to complex environments.
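
A hypothetical sketch of the zoom step in this second recognition pass, assuming OpenCV (cv2.INTER_CUBIC performs bicubic resampling); the (x1, y1, x2, y2) box format and the 2x factor are assumptions:

```python
import cv2

def zoom_region(image, box, factor=2.0):
    """Crop the detected target-frame region and enlarge it by bicubic
    interpolation before running recognition on it a second time."""
    x1, y1, x2, y2 = box
    crop = image[y1:y2, x1:x2]
    return cv2.resize(crop, None, fx=factor, fy=factor,
                      interpolation=cv2.INTER_CUBIC)
```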

Step 4: non-maximum suppression (NMS).

In one embodiment of the present invention, step 4 comprises the following:

After the above model optimizations were adopted, the recognition ability of the model improved greatly over its initial state, but another problem followed: an overfitting phenomenon of repeated recognition, in which the same target is detected with multiple recognition target frames, producing a large amount of redundant information. This is very unfavourable for decision support in a battlefield environment.

To address this phenomenon, the present invention adopts a linear non-maximum suppression algorithm to remove the redundant target frames produced by recognizing the same target, retaining the single best one, thereby improving recognition accuracy and lowering the misrecognition rate.

The specific implementation process comprises:

(1) calculating IoU (Intersection over Union), the area ratio of the overlapping region between each target frame and its neighbouring target frames;

(2) comparing the IoU with a threshold and changing the confidence of the neighbouring target frames:

s_i = s_i,  for IoU < N_t;
s_i = s_i·(1 - IoU),  for IoU ≥ N_t (13)

where s_i is the confidence of each target frame and N_t is the set threshold; in one embodiment of the present invention, N_t is set to 0.75. After this processing, redundant recognition target frames are filtered well while recognition accuracy is preserved.
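
A compact sketch of this linear suppression (the linear variant of Soft-NMS), assuming boxes given as (x1, y1, x2, y2) corner coordinates with per-box confidences; a full implementation would re-sort after each decay, which this sketch omits:

```python
import numpy as np

def iou(b1, b2):
    """Intersection over Union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(b1[0], b2[0]), max(b1[1], b2[1])
    ix2, iy2 = min(b1[2], b2[2]), min(b1[3], b2[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    a1 = (b1[2] - b1[0]) * (b1[3] - b1[1])
    a2 = (b2[2] - b2[0]) * (b2[3] - b2[1])
    return inter / (a1 + a2 - inter)

def linear_soft_nms(boxes, scores, nt=0.75):
    """Eq. (13): keep each higher-scoring box and linearly decay the
    confidence of neighbours whose IoU with it reaches threshold nt."""
    order = np.argsort(scores)[::-1]
    boxes = [boxes[i] for i in order]
    scores = [float(scores[i]) for i in order]
    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            overlap = iou(boxes[i], boxes[j])
            if overlap >= nt:
                scores[j] *= 1.0 - overlap
    return boxes, scores
```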

To demonstrate the effectiveness of the battlefield target recognition method of the present invention, it is verified experimentally below.

To better match the complex real battlefield environment, only a reconnaissance UAV was used for short-duration target reconnaissance in the experiments. The target recognition procedure was as follows.

(1) The limited sample set acquired by the UAV was subjected to translation, rotation, scaling, and blur/exposure image preprocessing, and the processed images were added to the original sample set, enlarging it.

(2) After the acquired information was preprocessed, the pretrained Faster-RCNN model (Figure 1) and YOLO v3 model (Figure 2) were loaded for training. During training, the learning rate was adjusted in time according to the real-time training loss rate to keep the model from falling into a local optimum or overfitting.

When training the Faster-RCNN and YOLO v3 models, three target types were set for recognition (aircraft, tanks, and ships), the parameters in each convolutional layer were modified according to the target types, and the initial learning rate was 0.0003; the learning rate was reduced to 95% of its current value after every ten thousand training steps, and training was interrupted every hundred thousand steps to adjust the learning rate by observing the loss-rate curve. After two hundred thousand training steps each, the model parameters were exported and tested at random. The training results are shown in Figures 4 and 5.

(3) The photos in the training set were interpolated to raise their resolution and increase the picture information. Then, for key regions given high weights in the first recognition pass, the region resolution was enlarged by interpolation and a second recognition pass was performed, improving precision and reducing misrecognition.

(4) Based on the weights obtained in the previous step, if two neighbouring recognition regions both had weights above the set threshold, their overlap rate was checked; an excessively high overlap rate was taken to indicate a single target, lowering the misrecognition rate in the two cases of one target being recognized multiple times and two targets lying too close together.

Sixty photos were verified under each model, for three groups of tests in total. The algorithm comparison results are shown in Table 1.

Table 1. Algorithm comparison

[Table 1 appears as an image in the original publication; per the discussion below, it compares the recognition speed and accuracy of YOLO v3, Faster-RCNN, and Transfer-Faster-RCNN.]

From observation and analysis of the above experimental results, it is easy to see that under identical training conditions, YOLO v3 has the higher recognition speed owing to its model structure, but its accuracy is slightly worse than that of the other two. By optimizing the original model, Transfer-Faster-RCNN greatly improves the system's recognition accuracy while maintaining the original recognition speed, and can better adapt to the complex and changing battlefield environment.

Claims (3)

1. An artificial-intelligence-based method for battlefield target recognition, characterized by comprising the following steps:
step 1: image preprocessing optimization, comprising:
(1) Generating a target frame, specifically:
generating a target frame through a Transfer-Faster-RCNN model; before a target is identified, a target frame is first generated and represented by a four-dimensional vector (x, y, w, h), wherein x is the abscissa of the centre point of the target frame, y is the ordinate of the centre point of the target frame, w is the width of the target frame, and h is the height of the target frame,
A=(Ax,Ay,Aw,Ah) (1)
G=(Gx,Gy,Gw,Gh) (2)
wherein A is an original target frame data set, and G is a real target frame data set;
such that the original input window is mapped to a regression window G′ closer to the real box G, where G′ is obtained by the translation transformation:
G′x=Ax+Aw·dx(A) (3)
G′y=Ay+Ah·dy(A) (4)
wherein dx(A) and dy(A) represent translation amounts, G′x represents the abscissa of the translated centre point, and G′y represents the ordinate of the translated centre point;
(2) Image transformation optimization, specifically:
finding an image transformation F such that
F(Ax,Ay,Aw,Ah)=(G′x,G′y,G′w,G′h) (5)
The calculation of F is achieved by translation and scaling:
G′w=Aw·dw(A) (6)
G′h=Ah·dh(A) (7)
wherein dw(A) and dh(A) represent scaling factors, G′w represents the scaled target frame width, and G′h represents the scaled target frame height;
constructing the objective function:
d*(A) = w*^T·φ(A) (8)
wherein φ(A) is the feature vector composed of the corresponding feature maps, w*^T is the parameter to be learned, and d*(A) is the predicted value; in order to minimize the differences between the predicted values dx(A), dy(A), dw(A), dh(A) and the true values tx, ty, tw, th, the cost function loss is as follows:
Loss = Σ_{i=1..N} (t*^i - w*^T·φ(A^i))^2 (9)
wherein t*^i represents the true centre point of the target frame, N represents the number of feature maps, and A^i represents the target frame of the i-th feature map;
the function optimization objective w* is:
w* = argmin_w [ Σ_{i=1..N} (t*^i - w^T·φ(A^i))^2 + λ‖w‖^2 ] (10)
(3) Gaussian blur, in particular:
before loading the data into the Faster-RCNN model, the same picture is first subjected to different degrees of Gaussian blur and exposure:
G(p,q) = (1/(2πσ^2))·e^(-(p^2+q^2)/(2σ^2)) (11)
wherein p and q are the pixel point positions in each RGB channel and σ is the exposure-degree coefficient;
step 2: optimizing the learning rate;
setting a decay rate and reducing the original learning rate after training for a designated number of steps, so as to prevent oscillation;
step 3: multi-resolution learning and identification;
step 4: non-maximum suppression.
2. The method for battlefield target recognition based on artificial intelligence as recited in claim 1, wherein: in step 3, a bicubic interpolation algorithm is selected, the interpolation being expressed by a function W(m) as follows:
W(m) = (a+2)|m|^3 - (a+3)|m|^2 + 1,  for |m| ≤ 1;
W(m) = a|m|^3 - 5a|m|^2 + 8a|m| - 4a,  for 1 < |m| < 2;
W(m) = 0,  otherwise (12)
wherein m is an independent variable, and a is an adjustment value.
3. The method for battlefield target recognition based on artificial intelligence of claim 1, wherein said step 4 comprises the steps of:
(1) Calculating the area ratio IoU of the overlapping area of each target frame and the adjacent target frames;
(2) Comparing IoU to a threshold, changing the confidence of the adjacent target frame:
s_i = s_i,  for IoU < N_t;
s_i = s_i·(1 - IoU),  for IoU ≥ N_t (13)
wherein s_i is the confidence of each target frame and N_t is the set threshold.
CN202010231438.6A 2020-03-27 2020-03-27 A Method for Battlefield Target Recognition Based on Artificial Intelligence Active CN111414997B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010231438.6A CN111414997B (en) 2020-03-27 2020-03-27 A Method for Battlefield Target Recognition Based on Artificial Intelligence

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010231438.6A CN111414997B (en) 2020-03-27 2020-03-27 A Method for Battlefield Target Recognition Based on Artificial Intelligence

Publications (2)

Publication Number Publication Date
CN111414997A CN111414997A (en) 2020-07-14
CN111414997B true CN111414997B (en) 2023-06-06

Family

ID=71491576

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010231438.6A Active CN111414997B (en) 2020-03-27 2020-03-27 A Method for Battlefield Target Recognition Based on Artificial Intelligence

Country Status (1)

Country Link
CN (1) CN111414997B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112465057B (en) * 2020-12-08 2023-05-12 中国人民解放军空军工程大学 Target detection and identification method based on deep convolutional neural network
CN112633168B (en) * 2020-12-23 2023-10-31 长沙中联重科环境产业有限公司 Garbage truck and method and device for identifying garbage can overturning action of garbage truck

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109690554B (en) * 2016-07-21 2023-12-05 西门子保健有限责任公司 Method and system for artificial intelligence based medical image segmentation
EP4293574A3 (en) * 2017-08-08 2024-04-03 RealD Spark, LLC Adjusting a digital representation of a head region
CN108399362B (en) * 2018-01-24 2022-01-07 中山大学 Rapid pedestrian detection method and device
CN109522938A (en) * 2018-10-26 2019-03-26 华南理工大学 The recognition methods of target in a kind of image based on deep learning

Also Published As

Publication number Publication date
CN111414997A (en) 2020-07-14

Similar Documents

Publication Publication Date Title
CN110020651B (en) License plate detection and positioning method based on deep learning network
Wu et al. Rapid target detection in high resolution remote sensing images using YOLO model
CN106845478B (en) A kind of secondary licence plate recognition method and device of character confidence level
CN109308483B (en) Dual-source image feature extraction and fusion identification method based on convolutional neural network
CN111241931B (en) Aerial unmanned aerial vehicle target identification and tracking method based on YOLOv3
EP3254238B1 (en) Method for re-identification of objects
CN108108746B (en) License plate character recognition method based on Caffe deep learning framework
CN107529650B (en) Closed loop detection method and device and computer equipment
CN111461213B (en) A training method for a target detection model and a fast target detection method
CN106023257B (en) A kind of method for tracking target based on rotor wing unmanned aerial vehicle platform
CN111353512A (en) Obstacle classification method, device, storage medium and computer equipment
Lai et al. Traffic Signs Recognition and Classification based on Deep Feature Learning.
US11244188B2 (en) Dense and discriminative neural network architectures for improved object detection and instance segmentation
CN108647573A (en) A kind of military target recognition methods based on deep learning
CN114549891B (en) Foundation cloud image cloud class identification method based on comparison self-supervision learning
CN111414997B (en) A Method for Battlefield Target Recognition Based on Artificial Intelligence
CN112149533A (en) Target detection method based on improved SSD model
EP4024343A1 (en) Viewpoint image processing method and related device
CN115761552B (en) Target detection method, device and medium for unmanned aerial vehicle carrying platform
CN116503763A (en) Unmanned aerial vehicle cruising forest fire detection method based on binary cooperative feedback
CN116363535A (en) Ship detection method in unmanned aerial vehicle aerial image based on convolutional neural network
CN114973026A (en) Target detection system in unmanned aerial vehicle scene of taking photo by plane, unmanned aerial vehicle system of taking photo by plane
CN116309270B (en) Binocular image-based transmission line typical defect identification method
CN114549969B (en) Saliency detection method and system based on image information fusion
CN116665097A (en) Self-adaptive target tracking method combining context awareness

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant