CN116935033A - Infrared target detection and identification method based on convolutional neural network - Google Patents
- Publication number
- CN116935033A (Application No. CN202310879338.8A)
- Authority
- CN
- China
- Prior art keywords
- target
- pooling
- layer
- convolutional neural network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G06V10/25 — Determination of region of interest [ROI] or a volume of interest [VOI]
- G06N3/0464 — Convolutional networks [CNN, ConvNet]
- G06N3/084 — Backpropagation, e.g. using gradient descent
- G06T7/246 — Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
- G06V10/82 — Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
- G06T2207/10016 — Video; Image sequence
- G06T2207/10048 — Infrared image
- G06T2207/20081 — Training; Learning
- G06T2207/20084 — Artificial neural networks [ANN]
- Y02T10/40 — Engine management systems
Abstract
The invention relates to an infrared target detection and identification method based on a convolutional neural network, which comprises the following steps: S1, reading pictures: collecting a plurality of frames from the video of a predetermined area as input images; S2, building and training a convolutional neural network model; S3, performing target detection on the images to be identified from S1 with the convolutional neural network model to obtain the target probability; S4, if a target is detected in S3, recording the size and motion information of the target and displaying the detection result; S5, if no target is detected in S3, predicting the area where the target may appear from the motion information recorded in S4 to obtain a predicted area, representing the target position by the predicted position, lowering the detection threshold within the predicted area for re-verification, and finally correcting the target track with the detection result and displaying it. The invention achieves accurate detection of long-range targets in infrared images with low signal-to-noise and signal-to-clutter ratios, offers a high target detection rate, and is applicable to a wide range of environments.
Description
Technical Field
The invention relates to the technical field of image recognition, in particular to an infrared target detection and recognition method based on a convolutional neural network.
Background
In general, owing to the detection environment, sensor performance and other factors, acquired infrared images often have a low signal-to-noise ratio, which manifests as low image contrast and makes target detection difficult. Besides low signal-to-noise ratio, practical scenes also pose a low signal-to-clutter ratio problem: as the target passes through the detection scene, bright background interference (cloud layers, leaves, buildings, etc. in the image) may challenge the detection of the target.
In such low signal-to-noise and low signal-to-clutter scenes, conventional infrared target detection algorithms have two shortcomings. First, the hand-designed structural features they rely on lack robustness, so background and small targets cannot be distinguished in complex scenes. Second, although some approaches employ local contrast, the comparison is performed on low-level visual features and lacks awareness and understanding of high-level features. Conventional methods therefore often suffer from low target detection rates and high false-alarm rates.
Thanks to its strong high-level feature extraction capability, the convolutional neural network can overcome the drawbacks that infrared targets offer few conventional features and are difficult to describe, and it has decisively outperformed conventional methods in many target recognition fields.
A long-range infrared target appears on the image as a point or a blob; such a target image has no geometric or texture features, and only the size and trajectory features of the target can be exploited. Based on these two features, the target can be detected by combining detection with prediction, and the convolutional-neural-network-based long-range infrared target extraction method can accurately detect targets in long-range infrared images with low signal-to-noise and signal-to-clutter ratios.
Disclosure of Invention
Aiming at the above problems, the invention provides an infrared target detection and identification method based on a convolutional neural network that achieves a high detection rate and accurate detection of targets in long-range infrared images with low signal-to-noise and signal-to-clutter ratios.
The technical scheme adopted for solving the technical problems is as follows: the infrared target detection and identification method based on the convolutional neural network comprises the following steps:
S1, reading pictures: collecting a plurality of frames from the video of a predetermined area as input images;
S2, building and training a convolutional neural network model, which comprises the following specific steps:
S201, data set preparation: labeling the targets in the image data, and augmenting the training data set with simulated targets;
S202, network structure design: constructing the network, comprising an input layer, hidden layers and an output layer, and initializing the network parameters;
S203, parameter training: feeding the data set of S201 into the network structure of S202 to train the model parameters, obtaining the network model by determining a cost function, evaluating the forward results and updating the weights by backpropagation;
S3, performing target detection on the image to be identified from S1 with the convolutional neural network model to obtain the target probability;
S4, if a target is detected in S3, recording the size and motion information of the target, displaying the detection result, and marking the target in the image with a box around its position;
S5, if no target is detected in S3, predicting the area where the target may appear from the motion information recorded in S4 to obtain a predicted area; if the target is not detected in the predicted area over several consecutive frames, confirming that the target is lost and representing the target position by the predicted position; lowering the detection threshold within the predicted area for verification; and finally correcting the target track with the detection result and displaying it, marking the target in the image with a box around its position.
Preferably, the hidden layers in step S202 comprise 3 convolution layers, 2 pooling layers and 1 fully connected layer.
Convolution layer: used for feature extraction; a convolution layer contains a plurality of convolution kernels, and each element of a kernel corresponds to a weight coefficient and a bias.
Given an image $X \in \mathbb{R}^{M \times N}$ and a filter $W \in \mathbb{R}^{U \times V}$, generally with $U \ll M$ and $V \ll N$, the convolution is
$$y_{ij} = \sum_{u=1}^{U} \sum_{v=1}^{V} w_{uv}\, x_{i-u+1,\, j-v+1}.$$
Pooling layer: the pooling layer applies a preset pooling function, which replaces the value of a single point in the feature map with a statistic of its neighboring region.
Common pooling operators include max pooling and mean pooling:
Max pooling (Max Pooling): for a region $R_{m,n}^{d}$, take the maximum activity value of all neurons in the region as its representation:
$$y_{m,n}^{d} = \max_{i \in R_{m,n}^{d}} x_i,$$
where $x_i$ is the activity value of each neuron within region $R_{m,n}^{d}$.
Mean pooling (Mean Pooling): for a region $R_{m,n}^{d}$, take the average of all neuron activity values in the region as its representation:
$$y_{m,n}^{d} = \frac{1}{\left|R_{m,n}^{d}\right|} \sum_{i \in R_{m,n}^{d}} x_i.$$
Fully connected layer: the fully connected layer has no feature extraction capability of its own; instead it combines the extracted features nonlinearly to produce the output, so the target feature map loses its spatial topology and is expressed as a vector.
Compared with the prior art, the invention has the following beneficial effects:
the invention constructs a constructed convolutional neural network model for target detection, can realize accurate detection on targets of far-distance infrared images with low signal-to-noise/noise ratio, has high target detection rate, greatly reduces the influence of complex background interference factors, has wider application environment range and high automation degree, and can meet the practical requirements.
Drawings
FIG. 1 is a flow chart of the infrared target detection and identification process of the present invention;
FIG. 2 is a schematic diagram of the convolutional neural network model-based target detection of the present invention;
FIG. 3 is a schematic diagram of the construction of a convolutional neural network of the present invention;
FIG. 4 is a schematic diagram of experimental result 1 in the example of the present invention;
FIG. 5 is a schematic diagram of experimental result 2 in the example of the present invention.
Detailed Description
The present invention will now be described in detail with reference to fig. 1-5, wherein the exemplary embodiments and descriptions of the present invention are provided for illustration of the present invention and are not intended to be limiting.
The infrared target detection and identification method based on the convolutional neural network comprises the following steps:
S1, reading pictures: collecting a plurality of frames from the video of a predetermined area as input images;
S2, building and training a convolutional neural network model, which comprises the following specific steps:
S201, data set preparation: labeling the targets in the image data, and augmenting the training data set with simulated targets;
S202, network structure design: constructing the network, comprising an input layer, hidden layers and an output layer, and initializing the network parameters;
the invention comprehensively considers the balance between efficiency and effect to further determine the layer number of the network. In order to ensure the effect of the method, namely the detection rate and the omission rate of the target, the hidden layers comprise 3 convolution layers, 2 pooling layers and 1 full connection layer,
Convolution layer: used for feature extraction; a convolution layer contains a plurality of convolution kernels, and each element of a kernel corresponds to a weight coefficient and a bias.
Given an image $X \in \mathbb{R}^{M \times N}$ and a filter $W \in \mathbb{R}^{U \times V}$, generally with $U \ll M$ and $V \ll N$, the convolution is
$$y_{ij} = \sum_{u=1}^{U} \sum_{v=1}^{V} w_{uv}\, x_{i-u+1,\, j-v+1}.$$
Pooling layer: the pooling layer applies a preset pooling function, which replaces the value of a single point in the feature map with a statistic of its neighboring region.
Common pooling operators include max pooling and mean pooling:
Max pooling (Max Pooling): for a region $R_{m,n}^{d}$, take the maximum activity value of all neurons in the region as its representation:
$$y_{m,n}^{d} = \max_{i \in R_{m,n}^{d}} x_i,$$
where $x_i$ is the activity value of each neuron within region $R_{m,n}^{d}$.
Mean pooling (Mean Pooling): for a region $R_{m,n}^{d}$, take the average of all neuron activity values in the region as its representation:
$$y_{m,n}^{d} = \frac{1}{\left|R_{m,n}^{d}\right|} \sum_{i \in R_{m,n}^{d}} x_i.$$
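A small numeric sketch of the two pooling operators, assuming non-overlapping 2x2 regions (the region size is an illustrative choice):

```python
import numpy as np

def pool2d(x: np.ndarray, size: int = 2, mode: str = "max") -> np.ndarray:
    """Replace each size x size region R_{m,n} of the feature map with
    its maximum (max pooling) or mean (mean pooling) activity value."""
    M, N = x.shape
    r = x[:M - M % size, :N - N % size].reshape(
        M // size, size, N // size, size)
    return r.max(axis=(1, 3)) if mode == "max" else r.mean(axis=(1, 3))

x = np.array([[1., 3., 2., 0.],
              [4., 2., 1., 5.],
              [0., 1., 3., 2.],
              [2., 2., 4., 0.]])
print(pool2d(x, 2, "max"))   # [[4. 5.] [2. 4.]]
print(pool2d(x, 2, "mean"))  # [[2.5  2.  ] [1.25 2.25]]
```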
Fully connected layer: the fully connected layer has no feature extraction capability of its own; instead it combines the extracted features nonlinearly to produce the output, so the target feature map loses its spatial topology and is expressed as a vector.
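Putting the layers together, the hidden part of the network could be sketched in PyTorch as below; the channel counts, kernel sizes, 32x32 single-channel input and two-class output are assumptions for illustration, since the text fixes only the layer counts (3 convolution, 2 pooling, 1 fully connected):

```python
import torch
import torch.nn as nn

class InfraredTargetNet(nn.Module):
    """Hidden layers per the text: 3 conv, 2 max-pool, 1 fully connected."""
    def __init__(self, num_classes: int = 2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1),   # convolution 1
            nn.ReLU(),
            nn.MaxPool2d(2),                              # pooling 1
            nn.Conv2d(16, 32, kernel_size=3, padding=1),  # convolution 2
            nn.ReLU(),
            nn.MaxPool2d(2),                              # pooling 2
            nn.Conv2d(32, 64, kernel_size=3, padding=1),  # convolution 3
            nn.ReLU(),
        )
        # Flattening discards the spatial topology, leaving a vector that
        # the fully connected layer maps to target/background scores.
        self.classifier = nn.Linear(64 * 8 * 8, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x).flatten(1))

model = InfraredTargetNet()
scores = model(torch.randn(1, 1, 32, 32))  # one assumed 32x32 infrared patch
probs = torch.softmax(scores, dim=1)       # target probability as in S3
```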
S203, parameter training: feeding the data set of S201 into the network structure of S202 to train the model parameters, obtaining the network model by determining a cost function, evaluating the forward results and updating the weights by backpropagation.
The training process takes as input single-frame images together with the corresponding target labels such as bounding boxes and masks; the network structure and cost function must be designed, and the parameters updated with a suitable optimizer. The inference process takes a single-frame image as input, and a suitable evaluation scheme must be designed to measure the similarity between the inference result and the ground-truth label.
S3, performing target detection on the image to be identified from S1 with the convolutional neural network model to obtain the target probability;
S4, if a target is detected in S3, recording the size and motion information of the target and displaying the detection result; this step stores the dynamic characteristics of the target and provides a prior for accurately re-screening the target later, marking the target in the image with a box around its position;
S5, if no target is detected in S3, predicting the area where the target may appear from the motion information recorded in S4 to obtain a predicted area; the purpose of this step is to screen out a suspected region and avoid full-image detection, reducing the amount of data processed and hence the running time of the algorithm; if the target is not detected in the predicted area over several consecutive frames, the target is confirmed lost and its position is represented by the predicted position; the detection threshold is lowered within the predicted area for verification, ensuring the target is not missed there; finally the target track is corrected with the detection result and displayed, providing prior knowledge for subsequent detection and marking the target in the image with a box around its position.
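For illustration, the S3-S5 logic could be sketched as the simplified detect-or-predict loop below; `detect` (a callable wrapping the CNN of S3), the threshold values, and the miss count before declaring the target lost are hypothetical placeholders:

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional, Tuple

Box = Tuple[int, int, int, int]              # x, y, width, height

@dataclass
class Detection:
    box: Box
    velocity: Tuple[int, int] = (0, 0)       # recorded motion information

def predict_region(track: list) -> Box:
    """S5: extrapolate the last box using the recorded motion."""
    x, y, w, h = track[-1].box
    dx, dy = track[-1].velocity
    return (x + dx, y + dy, w, h)

def run(frames: Iterable, detect: Callable) -> None:
    NORMAL_THR, LOW_THR, LOST_AFTER = 0.5, 0.3, 5   # assumed values
    track, misses = [], 0
    for frame in frames:
        roi = predict_region(track) if track else None
        thr = LOW_THR if roi else NORMAL_THR        # lower threshold in ROI
        hit: Optional[Detection] = detect(frame, roi, thr)
        if hit is not None:                         # S4: record and display
            misses = 0
            track.append(hit)                       # size + motion info
        else:                                       # S5: predict instead
            misses += 1
            if track and misses >= LOST_AFTER:      # target confirmed lost;
                track.append(Detection(predict_region(track)))  # use prediction
```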
During implementation, to verify the feasibility and effectiveness of the method, the project team collected relevant data and ran experiments following the above flow; partial results are shown in Figs. 4 and 5, which are two frames of the captured video. The scene contains a large number of trees and the intensity difference between the target and the interference is very small, so the target is difficult to detect accurately with conventional methods, whereas the technical route of this project detects it reliably, preliminarily verifying the feasibility and effectiveness of the method.
The foregoing describes in detail the technical solutions provided by the embodiments of the present invention, using specific examples to illustrate their principles and implementations; the above description of the embodiments is intended only to aid understanding of those principles. Those skilled in the art may vary the specific embodiments and the scope of application according to the ideas of the embodiments, and this description should not be construed as limiting the invention.
Claims (2)
1. The infrared target detection and identification method based on the convolutional neural network is characterized by comprising the following steps:
S1, reading pictures: collecting a plurality of frames from the video of a predetermined area as input images;
S2, building and training a convolutional neural network model, which comprises the following specific steps:
S201, data set preparation: labeling the targets in the image data, and augmenting the training data set with simulated targets;
S202, network structure design: constructing the network, comprising an input layer, hidden layers and an output layer, and initializing the network parameters;
S203, parameter training: feeding the data set of S201 into the network structure of S202 to train the model parameters, obtaining the network model by determining a cost function, evaluating the forward results and updating the weights by backpropagation;
S3, performing target detection on the image to be identified from S1 with the convolutional neural network model to obtain the target probability;
S4, if a target is detected in S3, recording the size and motion information of the target, displaying the detection result, and marking the target in the image with a box around its position;
S5, if no target is detected in S3, predicting the area where the target may appear from the motion information recorded in S4 to obtain a predicted area; if the target is not detected in the predicted area over several consecutive frames, confirming that the target is lost and representing the target position by the predicted position; lowering the detection threshold within the predicted area for verification; and finally correcting the target track with the detection result and displaying it, marking the target in the image with a box around its position.
2. The method for detecting and identifying an infrared target based on a convolutional neural network according to claim 1, wherein the hidden layers in step S202 comprise 3 convolutional layers, 2 pooling layers and 1 fully connected layer;
convolution layer: used for feature extraction; a convolution layer contains a plurality of convolution kernels, and each element of a kernel corresponds to a weight coefficient and a bias;
given an image $X \in \mathbb{R}^{M \times N}$ and a filter $W \in \mathbb{R}^{U \times V}$, generally with $U \ll M$ and $V \ll N$, the convolution is
$$y_{ij} = \sum_{u=1}^{U} \sum_{v=1}^{V} w_{uv}\, x_{i-u+1,\, j-v+1};$$
pooling layer: the pooling layer applies a preset pooling function, which replaces the value of a single point in the feature map with a statistic of its neighboring region;
common pooling operators include max pooling and mean pooling:
max pooling (Max Pooling): for a region $R_{m,n}^{d}$, take the maximum activity value of all neurons in the region as its representation:
$$y_{m,n}^{d} = \max_{i \in R_{m,n}^{d}} x_i,$$
where $x_i$ is the activity value of each neuron within region $R_{m,n}^{d}$;
mean pooling (Mean Pooling): for a region $R_{m,n}^{d}$, take the average of all neuron activity values in the region as its representation:
$$y_{m,n}^{d} = \frac{1}{\left|R_{m,n}^{d}\right|} \sum_{i \in R_{m,n}^{d}} x_i;$$
fully connected layer: the fully connected layer has no feature extraction capability of its own; instead it combines the extracted features nonlinearly to produce the output, so the target feature map loses its spatial topology and is expressed as a vector.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310879338.8A CN116935033A (en) | 2023-07-18 | 2023-07-18 | Infrared target detection and identification method based on convolutional neural network |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310879338.8A CN116935033A (en) | 2023-07-18 | 2023-07-18 | Infrared target detection and identification method based on convolutional neural network |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116935033A true CN116935033A (en) | 2023-10-24 |
Family
ID=88376826
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310879338.8A Pending CN116935033A (en) | 2023-07-18 | 2023-07-18 | Infrared target detection and identification method based on convolutional neural network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116935033A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117475262A (en) * | 2023-12-26 | 2024-01-30 | 苏州镁伽科技有限公司 | Image generation method and device, storage medium and electronic equipment |
CN117475262B (en) * | 2023-12-26 | 2024-03-19 | 苏州镁伽科技有限公司 | Image generation method and device, storage medium and electronic equipment |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |