CN116935033A - Infrared target detection and identification method based on convolutional neural network - Google Patents

Infrared target detection and identification method based on convolutional neural network Download PDF

Info

Publication number
CN116935033A
Authority
CN
China
Prior art keywords
target
pooling
layer
convolutional neural
neural network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310879338.8A
Other languages
Chinese (zh)
Inventor
周欢喜
田岩
杨俊波
贾红辉
胡政欢
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hunan Hongdong Photoelectric Co ltd
Original Assignee
Hunan Hongdong Photoelectric Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hunan Hongdong Photoelectric Co ltd filed Critical Hunan Hongdong Photoelectric Co ltd
Priority to CN202310879338.8A priority Critical patent/CN116935033A/en
Publication of CN116935033A publication Critical patent/CN116935033A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/25Determination of region of interest [ROI] or a volume of interest [VOI]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • G06T7/246Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10016Video; Image sequence
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10048Infrared image
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to an infrared target detection and identification method based on a convolutional neural network, which comprises the following steps: S1, reading images: collecting a plurality of frames from the video of the detection area as input images; S2, building and training a convolutional neural network model; S3, detecting targets in the images from S1 with the convolutional neural network model to obtain the target probability; S4, if a target is detected in S3, recording its size and motion information and displaying the detection result; and S5, if no target is detected in S3, predicting the region where the target is likely to appear from the motion information recorded in S4 to obtain a predicted region, representing the target position by the predicted position, lowering the detection threshold within the predicted region for re-checking, and finally correcting the target track with the detection result and displaying the detection result. The invention achieves accurate detection of long-range targets in infrared images with low signal-to-noise/clutter ratio, with a high target detection rate and a wide range of applicable environments.

Description

Infrared target detection and identification method based on convolutional neural network
Technical Field
The invention relates to the technical field of image recognition, in particular to an infrared target detection and recognition method based on a convolutional neural network.
Background
In general, owing to the detection environment, sensor performance and other factors, the signal-to-noise ratio of acquired infrared images is often low, which manifests as low image contrast and makes target detection difficult. Besides the low signal-to-noise ratio, a low signal-to-clutter ratio may also arise in practice: when the target passes through the detection scene, there may be many bright background interferences (such as clouds, leaves and buildings in the image) that further challenge target detection.
In scenes with low signal-to-noise and low signal-to-clutter ratios, conventional infrared target detection algorithms have the following shortcomings: first, the structural features they rely on lack robustness, so the background and small targets cannot be distinguished in complex scenes; second, although some approaches employ local contrast, the comparisons are made on low-level visual features and lack cognition and understanding of high-level features. Traditional methods therefore often suffer from a low target detection rate and a high false-alarm rate.
Thanks to its strong high-level feature extraction capability, the convolutional neural network can overcome the difficulty that infrared targets have few conventional features and are hard to describe, and it has decisively outperformed traditional methods in many target recognition fields.
For a long-range infrared target, the target appears on the image as a point or a blob; such a target image has no geometric or texture features, and only the size and track features of the target can be considered. Based on these two features, the target can be detected by combining detection with prediction, and the convolutional-neural-network-based long-range infrared target extraction method can accurately detect targets in long-range infrared images with low signal-to-noise/clutter ratio.
Disclosure of Invention
In view of these problems, the invention provides an infrared target detection and identification method based on a convolutional neural network, which achieves a high detection rate and accurate detection of targets in long-range infrared images with low signal-to-noise/clutter ratio.
The technical scheme adopted to solve the technical problem is as follows: the infrared target detection and identification method based on the convolutional neural network comprises the following steps:
S1, reading images: collecting a plurality of frames from the video of the detection area as input images;
S2, building and training a convolutional neural network model, with the following specific steps:
S201, data set preparation: labeling the targets in the image data, and augmenting the training data set with simulated targets;
S202, network structure design: constructing the network structure, comprising an input layer, hidden layers and an output layer, and initializing the network parameters;
S203, parameter training: inputting the data set of S201 into the network structure of S202 to train the parameters of the network model, i.e. determining a cost function, evaluating the forward result, and updating the weights by backpropagation to obtain the network model;
S3, detecting targets in the images to be identified from S1 with the convolutional neural network model to obtain the target probability;
S4, if a target is detected in S3, recording its size and motion information, displaying the detection result, and marking and framing the target position in the image;
S5, if no target is detected in S3, predicting the region where the target is likely to appear from the motion information recorded in S4 to obtain a predicted region; if no target is detected in the predicted region over several consecutive frames, confirming that the target is lost and representing the target position by the predicted position; lowering the detection threshold within the predicted region for re-checking; and finally correcting the target track with the detection result and displaying the detection result, marking and framing the target position in the image.
Preferably, the hidden layers in step S202 comprise 3 convolutional layers, 2 pooling layers and 1 fully connected layer.
Convolutional layer: used for feature extraction; the layer contains a number of convolution kernels, and each element of a kernel corresponds to a weight coefficient and a bias.
Given an image $X \in \mathbb{R}^{M \times N}$ and a filter $W \in \mathbb{R}^{U \times V}$, typically $U < M$, $V < N$, the convolution is
$$y_{ij} = \sum_{u=1}^{U} \sum_{v=1}^{V} w_{uv}\, x_{i+u-1,\, j+v-1}.$$
Pooling layer: the pooling layer applies a set pooling function, which replaces the value at a single point in the feature map with a statistic of its neighbouring region.
Common pooling operations include maximum pooling and average pooling.
Maximum pooling (max pooling): for a region $R^{d}_{m,n}$, the maximum activity value of all neurons in the region is chosen as its representation:
$$y^{d}_{m,n} = \max_{i \in R^{d}_{m,n}} x_i,$$
where $x_i$ is the activity value of each neuron within the region $R^{d}_{m,n}$.
Average pooling (mean pooling): for a region $R^{d}_{m,n}$, the average of all neuron activity values in the region is chosen as its representation:
$$y^{d}_{m,n} = \frac{1}{|R^{d}_{m,n}|} \sum_{i \in R^{d}_{m,n}} x_i.$$
Fully connected layer: the fully connected layer itself has no feature extraction capability; it combines the extracted features nonlinearly to produce the output, so the target feature map loses its spatial topology and is expressed as a set of vectors.
Compared with the prior art, the invention has the following beneficial effects:
the invention constructs a constructed convolutional neural network model for target detection, can realize accurate detection on targets of far-distance infrared images with low signal-to-noise/noise ratio, has high target detection rate, greatly reduces the influence of complex background interference factors, has wider application environment range and high automation degree, and can meet the practical requirements.
Drawings
FIG. 1 is a flow chart of the infrared target detection and identification process of the present invention;
FIG. 2 is a schematic diagram of the convolutional neural network model-based target detection of the present invention;
FIG. 3 is a schematic diagram of the construction of a convolutional neural network of the present invention;
FIG. 4 is a schematic diagram of experimental result 1 in the example of the present invention;
FIG. 5 is a schematic diagram of experimental result 2 in the example of the present invention.
Detailed Description
The present invention will now be described in detail with reference to fig. 1-5, wherein the exemplary embodiments and descriptions of the present invention are provided for illustration of the present invention and are not intended to be limiting.
The infrared target detection and identification method based on the convolutional neural network comprises the following steps:
S1, reading images: collecting a plurality of frames from the video of the detection area as input images;
S2, building and training a convolutional neural network model, with the following specific steps:
S201, data set preparation: labeling the targets in the image data, and augmenting the training data set with simulated targets;
S202, network structure design: constructing the network structure, comprising an input layer, hidden layers and an output layer, and initializing the network parameters;
the invention comprehensively considers the balance between efficiency and effect to further determine the layer number of the network. In order to ensure the effect of the method, namely the detection rate and the omission rate of the target, the hidden layers comprise 3 convolution layers, 2 pooling layers and 1 full connection layer,
convolution layer: for feature extraction, the convolution layer internally comprises a plurality of convolution kernels, each element composing the convolution kernels corresponds to a weight coefficient and a deviation amount,
given an image: x epsilon R M×N And a filter: w epsilon R U×V Typically U < M, V < N, the convolution is:
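As an illustration outside the patent text, the convolution above can be sketched in pure Python. This is a minimal sketch assuming a single-channel image, unit stride and no padding ("valid" convolution, matching $U < M$, $V < N$); function and variable names are illustrative, not from the patent.

```python
def conv2d_valid(x, w):
    """2D valid convolution: y[i][j] = sum over u, v of w[u][v] * x[i+u][j+v]."""
    M, N = len(x), len(x[0])
    U, V = len(w), len(w[0])
    return [[sum(w[u][v] * x[i + u][j + v]
                 for u in range(U) for v in range(V))
             for j in range(N - V + 1)]       # output width: N - V + 1
            for i in range(M - U + 1)]        # output height: M - U + 1

# A 3x3 image convolved with a 2x2 averaging kernel gives a 2x2 output.
img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
kernel = [[0.25, 0.25],
          [0.25, 0.25]]
print(conv2d_valid(img, kernel))  # [[3.0, 4.0], [6.0, 7.0]]
```

A real implementation would add multiple input/output channels and a bias per kernel, as the text describes, but the sliding-window sum is the core operation.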
Pooling layer: the pooling layer applies a set pooling function, which replaces the value at a single point in the feature map with a statistic of its neighbouring region.
Common pooling operations include maximum pooling and average pooling.
Maximum pooling (max pooling): for a region $R^{d}_{m,n}$, the maximum activity value of all neurons in the region is chosen as its representation:
$$y^{d}_{m,n} = \max_{i \in R^{d}_{m,n}} x_i,$$
where $x_i$ is the activity value of each neuron within the region $R^{d}_{m,n}$.
Average pooling (mean pooling): for a region $R^{d}_{m,n}$, the average of all neuron activity values in the region is chosen as its representation:
$$y^{d}_{m,n} = \frac{1}{|R^{d}_{m,n}|} \sum_{i \in R^{d}_{m,n}} x_i.$$
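For illustration (not part of the patent), both pooling operations can be sketched in pure Python, assuming non-overlapping square pooling regions; names are illustrative.

```python
def pool2d(x, size, mode="max"):
    """Pool non-overlapping size x size regions of a 2D feature map.

    mode="max"  -> each region is represented by its maximum activity value.
    mode="mean" -> each region is represented by the average activity value.
    """
    M, N = len(x), len(x[0])
    out = []
    for i in range(0, M - size + 1, size):
        row = []
        for j in range(0, N - size + 1, size):
            region = [x[i + u][j + v] for u in range(size) for v in range(size)]
            row.append(max(region) if mode == "max" else sum(region) / len(region))
        out.append(row)
    return out

fmap = [[1, 3, 2, 0],
        [4, 6, 1, 1],
        [0, 2, 8, 5],
        [1, 1, 3, 7]]
print(pool2d(fmap, 2, "max"))   # [[6, 2], [2, 8]]
print(pool2d(fmap, 2, "mean"))  # [[3.5, 1.0], [1.0, 5.75]]
```

Either statistic halves the spatial resolution here (4x4 to 2x2), which is what makes pooling useful for summarising a point's neighbourhood.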
Fully connected layer: the fully connected layer itself has no feature extraction capability; it combines the extracted features nonlinearly to produce the output, so the target feature map loses its spatial topology and is expressed as a set of vectors.
S203, parameter training: inputting the data set of S201 into the network structure of S202 to train the parameters of the network model, i.e. determining a cost function, evaluating the forward result, and updating the weights by backpropagation to obtain the network model;
the input needed in the training process is single frame image and label data corresponding to targets such as bounding boxes, masks and the like, and the network structure and cost function are required to be designed and updated by adopting a proper optimizer. The reasoning process is input as a single frame image, and a proper evaluation system is required to be designed to measure the similarity between the reasoning result and the real label.
S3, detecting targets in the images to be identified from S1 with the convolutional neural network model to obtain the target probability;
S4, if a target is detected in S3, recording its size and motion information and displaying the detection result; this step stores the dynamic characteristics of the target and provides a prior for the subsequent accurate re-screening of the target, so that the target is marked and framed in the image;
S5, if no target is detected in S3, predicting the region where the target is likely to appear from the motion information recorded in S4 to obtain a predicted region; the purpose of this step is to screen out the suspected region and avoid full-image detection, reducing the amount of data processed and hence the running time of the algorithm. If no target is detected in the predicted region over several consecutive frames, the target is confirmed lost and its position is represented by the predicted position; the detection threshold is lowered within the predicted region for re-checking, which ensures the target is not missed there; finally the target track is corrected with the detection result and the detection result is displayed, providing prior knowledge for subsequent detection and marking and framing the target position in the image.
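The patent gives no formulas for the region prediction or the lowered threshold. One possible sketch, assuming roughly linear target motion estimated from the last two recorded positions; the margin, thresholds and all names are arbitrary illustrative choices, not values from the patent.

```python
def predict_region(track, size, margin=2.0):
    """Predict the next-frame search region from the last two recorded
    target centres, assuming roughly linear (constant-velocity) motion."""
    (x1, y1), (x2, y2) = track[-2], track[-1]
    vx, vy = x2 - x1, y2 - y1          # velocity estimated over one frame
    cx, cy = x2 + vx, y2 + vy          # predicted centre for the next frame
    half = margin * size               # search window grows with target size
    return (cx - half, cy - half, cx + half, cy + half)

def detect_in_region(prob, base_thr=0.8, lowered_thr=0.5, in_region=True):
    """Apply a lowered decision threshold inside the predicted region (S5)."""
    thr = lowered_thr if in_region else base_thr
    return prob >= thr

track = [(10.0, 20.0), (14.0, 23.0)]   # last two detected centres
print(predict_region(track, size=3.0))  # region centred on (18.0, 26.0)
print(detect_in_region(0.6))            # True: 0.6 passes the lowered threshold
```

A detection accepted this way can then replace the predicted position and be fed back to correct the track, as the step describes.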
During implementation, to verify the feasibility and effectiveness of the method, the project team collected relevant data and ran experiments following the above procedure; part of the experimental results are shown in Figs. 4 and 5, which are two frames of the captured video. The results show that the scene contains a large number of trees and the intensity difference between the target and the interference is very small, so the target is difficult to detect accurately with traditional methods; with the technical route of this project, however, the target is detected well, preliminarily verifying the feasibility and effectiveness of the method.
The foregoing describes in detail the technical solutions provided by the embodiments of the invention, using specific examples to illustrate their principles and implementations; the above description of the embodiments is intended only to aid understanding of those principles. For those skilled in the art, the specific embodiments and the scope of application may vary according to the ideas of the embodiments, and this description should not be construed as limiting the invention.

Claims (2)

1. The infrared target detection and identification method based on the convolutional neural network is characterized by comprising the following steps:
S1, reading images: collecting a plurality of frames from the video of the detection area as input images;
S2, building and training a convolutional neural network model, with the following specific steps:
S201, data set preparation: labeling the targets in the image data, and augmenting the training data set with simulated targets;
S202, network structure design: constructing the network structure, comprising an input layer, hidden layers and an output layer, and initializing the network parameters;
S203, parameter training: inputting the data set of S201 into the network structure of S202 to train the parameters of the network model, i.e. determining a cost function, evaluating the forward result, and updating the weights by backpropagation to obtain the network model;
S3, detecting targets in the images to be identified from S1 with the convolutional neural network model to obtain the target probability;
S4, if a target is detected in S3, recording its size and motion information, displaying the detection result, and marking and framing the target position in the image;
S5, if no target is detected in S3, predicting the region where the target is likely to appear from the motion information recorded in S4 to obtain a predicted region; if no target is detected in the predicted region over several consecutive frames, confirming that the target is lost and representing the target position by the predicted position; lowering the detection threshold within the predicted region for re-checking; and finally correcting the target track with the detection result and displaying the detection result, marking and framing the target position in the image.
2. The method for detecting and identifying an infrared target based on a convolutional neural network according to claim 1, characterized in that the hidden layers in step S202 comprise 3 convolutional layers, 2 pooling layers and 1 fully connected layer.
Convolutional layer: used for feature extraction; the layer contains a number of convolution kernels, and each element of a kernel corresponds to a weight coefficient and a bias.
Given an image $X \in \mathbb{R}^{M \times N}$ and a filter $W \in \mathbb{R}^{U \times V}$, typically $U < M$, $V < N$, the convolution is
$$y_{ij} = \sum_{u=1}^{U} \sum_{v=1}^{V} w_{uv}\, x_{i+u-1,\, j+v-1}.$$
Pooling layer: the pooling layer applies a set pooling function, which replaces the value at a single point in the feature map with a statistic of its neighbouring region.
Common pooling operations include maximum pooling and average pooling.
Maximum pooling (max pooling): for a region $R^{d}_{m,n}$, the maximum activity value of all neurons in the region is chosen as its representation:
$$y^{d}_{m,n} = \max_{i \in R^{d}_{m,n}} x_i,$$
where $x_i$ is the activity value of each neuron within the region $R^{d}_{m,n}$.
Average pooling (mean pooling): for a region $R^{d}_{m,n}$, the average of all neuron activity values in the region is chosen as its representation:
$$y^{d}_{m,n} = \frac{1}{|R^{d}_{m,n}|} \sum_{i \in R^{d}_{m,n}} x_i.$$
Fully connected layer: the fully connected layer itself has no feature extraction capability; it combines the extracted features nonlinearly to produce the output, so the target feature map loses its spatial topology and is expressed as a set of vectors.
CN202310879338.8A 2023-07-18 2023-07-18 Infrared target detection and identification method based on convolutional neural network Pending CN116935033A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310879338.8A CN116935033A (en) 2023-07-18 2023-07-18 Infrared target detection and identification method based on convolutional neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310879338.8A CN116935033A (en) 2023-07-18 2023-07-18 Infrared target detection and identification method based on convolutional neural network

Publications (1)

Publication Number Publication Date
CN116935033A true CN116935033A (en) 2023-10-24

Family

ID=88376826

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310879338.8A Pending CN116935033A (en) 2023-07-18 2023-07-18 Infrared target detection and identification method based on convolutional neural network

Country Status (1)

Country Link
CN (1) CN116935033A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117475262A (en) * 2023-12-26 2024-01-30 苏州镁伽科技有限公司 Image generation method and device, storage medium and electronic equipment
CN117475262B (en) * 2023-12-26 2024-03-19 苏州镁伽科技有限公司 Image generation method and device, storage medium and electronic equipment

Similar Documents

Publication Publication Date Title
CN107423702B (en) Video target tracking method based on TLD tracking system
CN106897681B (en) Remote sensing image contrast analysis method and system
CN110309808B (en) Self-adaptive smoke root node detection method in large-scale space
CN113591968A (en) Infrared weak and small target detection method based on asymmetric attention feature fusion
CN110415260B (en) Smoke image segmentation and identification method based on dictionary and BP neural network
CN112597928B (en) Event detection method and related device
CN116935033A (en) Infrared target detection and identification method based on convolutional neural network
CN111967345B (en) Method for judging shielding state of camera in real time
CN114973112A (en) Scale-adaptive dense crowd counting method based on antagonistic learning network
CN112907626A (en) Moving object extraction method based on satellite time-exceeding phase data multi-source information
CN115719463A (en) Smoke and fire detection method based on super-resolution reconstruction and adaptive extrusion excitation
CN116402852A (en) Dynamic high-speed target tracking method and device based on event camera
US20130271601A1 (en) Method and device for the detection of change in illumination for vision systems
CN113962900A (en) Method, device, equipment and medium for detecting infrared dim target under complex background
CN113657264A (en) Forest fire smoke root node detection method based on fusion of dark channel and KNN algorithm
CN110472639B (en) Target extraction method based on significance prior information
CN110837787B (en) Multispectral remote sensing image detection method and system for three-party generated countermeasure network
CN114596244A (en) Infrared image identification method and system based on visual processing and multi-feature fusion
Park et al. Automatic radial un-distortion using conditional generative adversarial network
CN114757941A (en) Transformer substation equipment defect identification method and device, electronic equipment and storage medium
CN113313678A (en) Automatic sperm morphology analysis method based on multi-scale feature fusion
CN112084922A (en) Abnormal behavior crowd detection method based on gestures and facial expressions
CN111191575A (en) Naked flame detection method and system based on flame jumping modeling
CN113591705B (en) Inspection robot instrument identification system and method and storage medium
CN117011196B (en) Infrared small target detection method and system based on combined filtering optimization

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination