CN113920159A - Infrared aerial small target tracking method based on full convolution twin network


Info

Publication number
CN113920159A
CN113920159A (application CN202111081287.1A)
Authority
CN
China
Prior art keywords
target
value
response
tracking
APCE
Prior art date
Legal status
Granted
Application number
CN202111081287.1A
Other languages
Chinese (zh)
Other versions
CN113920159B (en)
Inventor
刘刚
张文波
曹紫绚
董猛
刘龙哲
田慧
权冰洁
Current Assignee
Henan University of Science and Technology
Original Assignee
Henan University of Science and Technology
Priority date
Filing date
Publication date
Application filed by Henan University of Science and Technology
Priority to CN202111081287.1A
Publication of CN113920159A
Application granted
Publication of CN113920159B
Legal status: Active
Anticipated expiration

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 - Image analysis
    • G06T 7/20 - Analysis of motion
    • G06T 7/246 - Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G06T 7/251 - Analysis of motion using feature-based methods, e.g. the tracking of corners or segments involving models
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 - Pattern recognition
    • G06F 18/20 - Analysing
    • G06F 18/22 - Matching criteria, e.g. proximity measures
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/04 - Architecture, e.g. interconnection topology
    • G06N 3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/08 - Learning methods
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 - Image analysis
    • G06T 7/20 - Analysis of motion
    • G06T 7/277 - Analysis of motion involving stochastic approaches, e.g. using Kalman filters
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 - Image acquisition modality
    • G06T 2207/10048 - Infrared image

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Radar Systems Or Details Thereof (AREA)

Abstract

The invention provides an infrared aerial small target tracking method based on a full convolution twin network, aiming at the practical problems, such as background clutter interference and occlusion, encountered by an infrared imaging guidance system when tracking small aerial targets. The invention can adapt to complex and varied infrared aerial scenes and achieves effective, stable and real-time tracking of small infrared aerial targets.

Description

Infrared aerial small target tracking method based on full convolution twin network
Technical Field
The invention belongs to the technical field of infrared aerial target tracking, and particularly relates to an infrared aerial small target tracking method based on a full convolution twin network.
Background
Aerial target tracking is one of the key technologies of an infrared imaging guidance system. The demands such systems place on aerial target tracking keep growing, mainly because targets are far away and the natural environment is complex and changeable: during the tracking stage, most targets in the field of view occupy only a limited number of pixels (small targets), their feature information is weak, background clutter interference is heavy, and occlusion occurs, all of which greatly hampers target tracking. How to effectively handle the tracking failures caused by background clutter interference and occlusion of the target during infrared aerial small target tracking, while improving the accuracy and real-time performance of the tracking method, is therefore a technical problem urgently awaiting a solution from those skilled in the art.
At present, most infrared target tracking methods are traditional algorithms in which targeted feature extraction schemes are designed by hand for specific scenes. For complex infrared aerial scenes, however, existing traditional tracking algorithms can hardly adapt to every situation. In recent years, deep learning has developed rapidly; scholars at home and abroad have widely applied deep features to target tracking algorithms, and some deep-learning-based trackers now outperform traditional correlation filtering trackers. However, the back-propagation of a convolutional neural network is computationally expensive, so tracking algorithms that update network parameters online run slowly and cannot meet real-time tracking requirements.
To address these problems, twin-network-based target tracking algorithms use similarity verification to recast tracking as a template matching problem; thanks to their strong end-to-end trainability and real-time tracking speed, they have become an important research direction in the target tracking field. The target tracking algorithm based on a fully-convolutional twin network (SiamFC) is the classic algorithm of this kind: it performs a cross-correlation operation between the depth features of the region to be searched and those of the target template, measures similarity by the resulting response values, and selects the position of the maximum response as the target center point, achieving good tracking accuracy and speed.
Disclosure of Invention
Based on the above problems, the invention provides an infrared aerial small target tracking method based on a full convolution twin network, which aims to solve infrared small target tracking in complex aerial scenes, in particular the cases where the tracked target suffers background clutter interference or occlusion.
In order to achieve this purpose, the invention adopts the following technical scheme: an infrared aerial small target tracking method based on a full convolution twin network, comprising the following steps:
S1, inputting the image sequence into the full convolution twin network, selecting the first frame of the sequence as the target template z, taking a region to be searched x from each subsequent frame, and passing both through the parameter-sharing convolutional neural network φ(·) to extract depth features, obtaining the feature map φ(z) of the target template and the feature map φ(x) of the region to be searched;
S2, using the feature map of the target template
Figure BDA0003264139380000024
Checking the area to be searched for by convolutionCharacteristic diagram
Figure BDA0003264139380000025
Performing convolution operation to obtain a characteristic response graph M;
S3, evaluating and judging the current frame target tracking state using the average peak-to-correlation energy and the maximum peak of the current frame response map: if current target tracking is judged to be in the normal state, executing step S4; if the target is judged to be interfered by background clutter, executing step S5; if the target is judged to be occluded, executing step S6;
S4, in the normal tracking state, applying a Hamming window of the same size to the current frame (frame t) response map M_t to suppress the boundary effect, and selecting the maximum peak point of the response map as the target point;
S5, when the target is interfered by background clutter, forming a candidate point set from the multiple peak points of the current frame response map, calculating the feature similarity score of each candidate target against the real target of the historical frames using the depth features and the local contrast feature, and selecting the candidate point with the highest similarity as the current frame target point;
and S6, when the target is occluded, predicting the current frame target position in the occluded state with a Kalman filter constructed from the target position information of the historical frames; once the target is judged to have left occlusion, normal tracking resumes and step S4 is executed.
Further, step S3 includes:
S3.1, measuring the fluctuation of the response map with the average peak-to-correlation energy (APCE) index, specifically defined as:

APCE = \frac{\left|F_{max}-F_{min}\right|^{2}}{\operatorname{mean}_{i,j}\left[\left(F_{i,j}-F_{min}\right)^{2}\right]}

where F_max and F_min respectively denote the maximum and minimum values in the response map, i and j denote the abscissa and ordinate of the response map, and F_{i,j} is the response value at (i, j) in the response map; under normal tracking, the response map fluctuates little and the APCE value is large; when the target is interfered by background clutter or occluded, the response map fluctuates severely and the APCE value drops sharply compared with normal tracking; a smaller APCE value indicates a more unstable tracking state;
S3.2, defining λ_APCE and λ_Fmax as the ratios of the current frame response map's APCE value and maximum peak, respectively, to the corresponding average values over the historical frames, quantifying the degree of change of the current frame's APCE value and maximum peak, namely:

\lambda_{APCE} = \frac{APCE_t}{\frac{1}{n_1}\sum_{i=t-n_1}^{t-1} APCE_i}, \qquad \lambda_{F_{max}} = \frac{F_{max-t}}{\frac{1}{n_1}\sum_{i=t-n_1}^{t-1} F_{max-i}}
where APCE_t and F_max-t are the response map APCE value and maximum peak of the current frame, APCE_i and F_max-i those of historical frame i, and n_1 is the number of reference historical frames; during tracking, the current tracking state is judged from the λ_APCE and λ_Fmax values combined with the historical frame response map information.
Further, the method in step S3.2 for judging the current tracking state from the λ_APCE and λ_Fmax values combined with the historical frame response map information comprises the following steps:
a. if the change rate λ_APCE of the current frame response map's APCE value is greater than a certain threshold, judging the current tracking state to be the normal tracking state; otherwise executing step b to discriminate the other tracking states;
b. if the maximum peak change rate λ_Fmax of the current frame response map is less than a threshold, the maxima of the n_4 consecutive frame response maps before the current frame decrease gradually, and the decrease relative to the previous frame is greater than a certain threshold, judging the current tracking state to be the occluded state; otherwise judging it to be the background clutter interference state.
Further, step S5 includes:
S5.1, defining the feature similarity score S_j of each candidate target as:

S_j = \beta\left|F_j - \frac{1}{n_2}\sum_{i=t-n_2}^{t-1} F_{max-i}\right| + (1-\beta)\left|C_j - \frac{1}{n_2}\sum_{i=t-n_2}^{t-1} C_{max-i}\right|, \quad j \in D

where D is the candidate target point set, of size n_3; j is the candidate target index; F_j and C_j are the response value and local contrast of candidate target j in the current frame; F_max-i and C_max-i are the peak value and local contrast of the maximum peak in the response map of historical frame i; n_2 is the number of reference historical frames; β and 1 − β are the weights of the response value and the local contrast respectively; the feature similarity score S_j, constructed from the depth feature response values and the local contrast feature of the target, measures the similarity between candidate target j and the real target, and a smaller S_j indicates that the candidate target's feature values are closer to the real target;
S5.2, searching the multiple peak points in the current frame response map with a maximum filter, and selecting the points with the top n_3 peak values as candidate target center points;
S5.3, calculating the local contrast of each candidate target using the same target box size;
and S5.4, calculating the corresponding feature similarity score S_j from the local contrast and response value of each candidate target, taking the candidate target point with the smallest S_j as the target point, and obtaining the current frame target center point position through position transformation.
Compared with the prior art, the invention has the following beneficial effects: the infrared aerial small target tracking method of the invention extracts depth features with a full convolution twin network and obtains a depth feature response map. The current target tracking state is judged from the average peak-to-correlation energy of the response map and the change of its maximum peak. During normal tracking, the maximum peak point of the response map is selected as the target center point; when background clutter interference is detected, the clutter is rejected using the depth feature response values together with the local contrast; when occlusion is detected, the position is predicted with Kalman filtering. The proposed tracking method can effectively handle infrared aerial small targets under complex background interference and occlusion, performs well on infrared aerial small target tracking, and meets the real-time requirement of tracking.
Drawings
FIG. 1 is a schematic overall flow diagram of the tracking method of the present invention;
FIG. 2 is a diagram of a full convolution twin network in accordance with the present invention;
FIG. 3 is a graph of the success rate of the method of the present invention (Our) and 9 other comparison algorithms on an infrared aerial small target test set;
FIG. 4 is a graph of the accuracy of the method of the present invention (Our) and 9 other comparison algorithms on an infrared aerial small target test set;
FIG. 5 is a graph of the success rate of the method of the present invention (Our) and 9 other comparison algorithms on infrared aerial small target test data with the complex background interference attribute;
FIG. 6 is a graph of the accuracy of the method of the present invention (Our) and 9 other comparison algorithms on infrared aerial small target test data with the complex background interference attribute;
FIG. 7 is a graph of the success rate of the method of the present invention (Our) and 9 other comparison algorithms on infrared aerial small target test data with the target occluded attribute;
FIG. 8 is a graph of the accuracy of the method of the present invention (Our) and 9 other comparison algorithms on infrared aerial small target test data with the target occluded attribute.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be described clearly and completely with reference to the accompanying drawings in the present invention, and it is obvious that the described embodiments are some embodiments of the present invention, but not all embodiments, and all other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative efforts belong to the protection scope of the present invention.
The principle of the invention is as follows: the infrared aerial small target tracking method of the invention extracts depth features with a full convolution twin network and obtains a depth feature response map. The current target tracking state is judged from the average peak-to-correlation energy of the response map and the change of its maximum peak. During normal tracking, the maximum peak point of the response map is selected as the target center point; when background clutter interference is detected, the clutter is rejected using the depth feature response values together with the local contrast; when occlusion is detected, the position is predicted with Kalman filtering. The proposed tracking method can effectively handle infrared aerial small targets under complex background interference and occlusion, performs well on infrared aerial small target tracking, and meets the real-time requirement of tracking.
The invention discloses an infrared aerial small target tracking method based on a full convolution twin network which, as shown in FIG. 1, comprises the following steps:
S1, inputting the image sequence into the full convolution twin network, selecting the first frame of the sequence as the target template z, taking a region to be searched x from each subsequent frame, and passing both through the parameter-sharing convolutional neural network φ(·) to extract depth features, obtaining the feature maps φ(z) and φ(x).
specifically, the deep feature extraction network in step S1 is designed on the basis of AlexNet. Five convolutional layers in total, wherein the maximum pooling layer is used after the first two convolutional layers, each convolutional layer except the last convolutional layer uses ReLU, and the last three convolutional layers use packet convolution. After each linear layer there is a Batch Normalization layer (Batch Normalization), the convolutional layer has no padding operation, and the net total step size is 8. In addition, the present invention trains features through off-lineAnd extracting a network, wherein in an off-line training stage, a visible light data set (ILSVRC2015) is used for training, and then, the training is further carried out on the infrared aerial small target image sequence to better acquire the depth characteristics of the infrared aerial small target. The network parameter optimization is carried out by using a stochastic gradient descent algorithm in the training process, 50 rounds of training are carried out, and the learning rate is set to be 10-2The batch size of the training images is set to be 8, and the optimal result in the last 10 rounds is obtained.
S2, using the feature map φ(z) of the target template as a convolution kernel, performing a convolution operation on the feature map φ(x) of the region to be searched to obtain the feature response map M.
Specifically, in the actual tracking process, as shown in the full convolution twin network structure diagram of FIG. 2, the target template and the region to be searched are respectively cropped and resized into a 127 × 127 × 3 template z and a 255 × 255 × 3 search region x; after both pass through the feature extraction network φ(·), a 6 × 6 × 128 feature map φ(z) and a 22 × 22 × 128 feature map φ(x) are obtained. Taking the 6 × 6 × 128 feature map as a convolution kernel and performing a convolution operation with the 22 × 22 × 128 feature map yields a 17 × 17 feature response map, which is enlarged by bicubic interpolation into the final 272 × 272 feature response map.
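One way to realize the cross-correlation and upsampling of step S2, under the tensor shapes just given, is a minimal sketch using torch.nn.functional.conv2d with the template features acting as the kernel (an implementation assumption):

```python
import torch
import torch.nn.functional as F

def response_map(feat_z: torch.Tensor, feat_x: torch.Tensor) -> torch.Tensor:
    """Cross-correlate the template feature map (used as a convolution
    kernel) with the search-region feature map, then enlarge the 17x17
    response to 272x272 by bicubic interpolation as described."""
    # feat_z: (1, 128, 6, 6), feat_x: (1, 128, 22, 22)
    m = F.conv2d(feat_x, feat_z)                            # -> (1, 1, 17, 17)
    m = F.interpolate(m, size=(272, 272), mode="bicubic",
                      align_corners=False)                  # -> (1, 1, 272, 272)
    return m[0, 0]
```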
And S3, evaluating and judging the current frame target tracking state using the average peak-to-correlation energy and the maximum peak of the current frame response map. If current target tracking is judged to be in the normal state, step S4 is executed; if the target is judged to be interfered by background clutter, step S5 is executed; if the target is judged to be occluded, step S6 is executed.
Specifically, step S3 includes:
S3.1, measuring the fluctuation of the response map with the average peak-to-correlation energy (APCE) index, specifically defined as:

APCE = \frac{\left|F_{max}-F_{min}\right|^{2}}{\operatorname{mean}_{i,j}\left[\left(F_{i,j}-F_{min}\right)^{2}\right]}

where F_max and F_min respectively denote the maximum and minimum values in the response map, i and j denote the abscissa and ordinate of the response map, and F_{i,j} is the response value at (i, j) in the response map. Under normal tracking conditions, the response map fluctuates little, visually appears "unimodal", and the APCE value is large. When the target is interfered by background clutter or occluded, the response map fluctuates severely, visually appears "multi-peaked", and the APCE value drops sharply compared with normal tracking. Analysing the fluctuation of the response map and computing its APCE value therefore effectively reflects the current tracking state.
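The APCE index itself is straightforward to compute from a response map; a minimal sketch:

```python
import numpy as np

def apce(response: np.ndarray) -> float:
    """Average peak-to-correlation energy: |Fmax - Fmin|^2 over the mean
    squared deviation of all response values from the minimum. Larger
    values indicate a sharper, more unimodal response map."""
    f_max, f_min = float(response.max()), float(response.min())
    return (f_max - f_min) ** 2 / float(np.mean((response - f_min) ** 2))
```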
The cause of a tracking failure is judged from how the maximum peak of the response map changes. When the target is interfered by background clutter, the maximum peak of the current frame response map changes abruptly relative to that of the previous frame, while the maximum peaks of the frames before the current frame remain stable for a period of time. As the target goes from slightly occluded to fully occluded, the maximum peak of the corresponding response map decreases gradually and reaches its minimum at full occlusion; that is, when the current frame target is fully occluded, the maximum peak of the current frame response map is at its minimum and the maxima of the response maps before the current frame decrease gradually. Therefore, when the APCE value is small, the change state of the response map's maximum peak can be used to classify the factor causing the tracking failure.
S3.2, the invention defines λ_APCE and λ_Fmax as the ratios of the current frame response map's APCE value and maximum peak, respectively, to the corresponding average values over the historical frames, quantifying the degree of change of the current frame's APCE value and maximum peak, namely:

\lambda_{APCE} = \frac{APCE_t}{\frac{1}{n_1}\sum_{i=t-n_1}^{t-1} APCE_i}, \qquad \lambda_{F_{max}} = \frac{F_{max-t}}{\frac{1}{n_1}\sum_{i=t-n_1}^{t-1} F_{max-i}}
where APCE_t and F_max-t are the response map APCE value and maximum peak of the current frame (frame t), APCE_i and F_max-i those of historical frame i, and n_1 is the number of reference historical frames; n_1 = 10 was determined experimentally. During tracking, the current tracking state is judged from the λ_APCE and λ_Fmax values combined with the historical frame response map information.
In the present invention, the current tracking state is judged during tracking from the λ_APCE and λ_Fmax values combined with the historical frame response map information; the specific implementation steps are as follows:
a. If the change rate λ_APCE of the current frame response map's APCE value is greater than a threshold α_1, the current tracking state is judged to be the normal tracking state; otherwise step b is executed to discriminate the other tracking states. α_1 = 0.55 was determined experimentally.
b. If the maximum peak change rate λ_Fmax of the current frame response map is less than a threshold α_2, the maxima of the n_4 consecutive frame response maps before the current frame decrease gradually, and the decrease relative to the previous frame is greater than α_3, the current tracking state is judged to be the occluded state; otherwise it is judged to be the background clutter interference state. α_2 = 0.68, α_3 = 0.08 and n_4 = 4 were determined experimentally.
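Steps a and b can be sketched as follows with the experimentally determined thresholds above; reading "decreases gradually with a drop larger than α_3 versus the previous frame" as "every successive drop exceeds α_3" is one possible interpretation of the text:

```python
import numpy as np

# Thresholds and window sizes as reported in the experiments above.
ALPHA1, ALPHA2, ALPHA3 = 0.55, 0.68, 0.08
N1, N4 = 10, 4

def judge_state(apce_t: float, fmax_t: float,
                apce_hist: list, fmax_hist: list) -> str:
    """Classify the current frame as 'normal', 'occluded' or 'clutter'
    from the APCE and maximum-peak change rates; the history lists hold
    values of the preceding frames, most recent last (a sketch)."""
    lam_apce = apce_t / np.mean(apce_hist[-N1:])
    lam_fmax = fmax_t / np.mean(fmax_hist[-N1:])
    if lam_apce > ALPHA1:
        return "normal"
    # Occlusion: peak ratio small and the last N4 maxima fall steadily,
    # each drop exceeding ALPHA3 (interpretation assumed, see lead-in).
    recent = fmax_hist[-N4:]
    drops = [a - b for a, b in zip(recent, recent[1:])] + [recent[-1] - fmax_t]
    if lam_fmax < ALPHA2 and all(d > ALPHA3 for d in drops):
        return "occluded"
    return "clutter"
```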
S4, in the normal tracking state, a Hamming window of the same size is applied to the current frame (frame t) response map M_t to suppress the boundary effect, and the maximum peak point of the response map is selected as the target point.
Specifically, the maximum peak point in the response map is selected as the target point, the offset of this point relative to the center of the response map is obtained, and the offset multiplied by the total network stride gives the center position of the current frame target.
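A sketch of this peak localization, assuming it operates on the 17 × 17 pre-interpolation response map (on the 272 × 272 upsampled map, the per-pixel stride would shrink by the interpolation factor):

```python
import numpy as np

def locate_target(response: np.ndarray, stride: int = 8) -> np.ndarray:
    """Step S4: weight the response map with a same-size Hamming window
    to suppress the boundary effect, take the maximum peak, and convert
    its offset from the map center into an image displacement via the
    total network stride."""
    win = np.outer(np.hamming(response.shape[0]), np.hamming(response.shape[1]))
    peak = np.unravel_index(np.argmax(response * win), response.shape)
    center = (np.array(response.shape) - 1) / 2.0
    return (np.array(peak) - center) * stride  # (dy, dx) of the target center
```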
S5, when the target is interfered by background clutter, a candidate point set is formed from the multiple peak points of the current frame response map, the feature similarity score of each candidate target against the real target of the historical frames is calculated using the depth features and the local contrast feature, and the candidate point with the highest similarity is selected as the current frame target point.
Specifically, step S5 includes:
S5.1, the local contrast in step S5 is the ratio of the target's mean gray level to the mean gray level of the local neighborhood background, defined as:

C = \frac{\frac{1}{N_{\Omega}}\sum_{(i,j)\in\Omega} I(i,j)}{\frac{1}{N_{\Psi}}\sum_{(i,j)\in\Psi} I(i,j)}

where Ω denotes the target region, Ψ the neighborhood of the target region, N_Ω and N_Ψ the numbers of pixels in the respective regions, and I(i, j) the gray level of the pixel at (i, j) in the original image.
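A sketch of this local contrast computation; the width of the neighborhood ring Ψ around the target box is an assumption, as the patent does not specify it:

```python
import numpy as np

def local_contrast(image: np.ndarray, box: tuple, margin: int = 2) -> float:
    """Ratio of the mean gray level of the target region (Omega) to the
    mean gray level of the surrounding neighborhood ring (Psi); the
    ring width `margin` is an assumption."""
    x, y, w, h = box
    target = image[y:y + h, x:x + w].astype(np.float64)
    y0, y1 = max(y - margin, 0), min(y + h + margin, image.shape[0])
    x0, x1 = max(x - margin, 0), min(x + w + margin, image.shape[1])
    neigh = image[y0:y1, x0:x1].astype(np.float64)
    ring_mean = (neigh.sum() - target.sum()) / (neigh.size - target.size)
    return float(target.mean() / ring_mean)
```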
In step S5, the depth feature response values and the local contrast feature of the target are used to construct the feature similarity score S_j between each candidate target and the real target of the historical frames, defined for each candidate target as:

S_j = \beta\left|F_j - \frac{1}{n_2}\sum_{i=t-n_2}^{t-1} F_{max-i}\right| + (1-\beta)\left|C_j - \frac{1}{n_2}\sum_{i=t-n_2}^{t-1} C_{max-i}\right|, \quad j \in D

where D is the candidate target point set, of size n_3; j is the candidate target index; F_j and C_j are the response value and local contrast of candidate target j in the current frame; F_max-i and C_max-i are the peak value and local contrast of the maximum peak in the response map of historical frame i; n_2 is the number of reference historical frames; and β and 1 − β are the weights of the response value and the local contrast respectively. n_3 = 8, n_2 = 5 and β = 0.4 were determined experimentally. The feature similarity score S_j measures the similarity between candidate target j and the real target; a smaller S_j indicates that the candidate target's feature values are closer to the real target.
S5.2, the multiple peak points in the current frame response map are searched with a maximum filter, and the points with the top n_3 peak values are selected as candidate target center points.
And S5.3, the local contrast of each candidate target is calculated using the same target box size.
S5.4, the corresponding feature similarity score S_j is calculated from the local contrast and response value of each candidate target; the candidate target point with the smallest S_j is taken as the target point, and the current frame target center point position is obtained through position transformation.
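Steps S5.1, S5.2 and S5.4 can be sketched as follows, assuming S_j takes the weighted absolute-difference form reconstructed above and using scipy's maximum filter for peak detection (the filter footprint is an assumption):

```python
import numpy as np
from scipy.ndimage import maximum_filter

BETA, N2, N3 = 0.4, 5, 8  # weight and window sizes reported above

def top_peaks(response: np.ndarray, size: int = 5):
    """Step S5.2: candidate center points are the local maxima found with
    a maximum filter, keeping the N3 highest (footprint `size` assumed)."""
    is_peak = response == maximum_filter(response, size=size)
    ys, xs = np.nonzero(is_peak)
    order = np.argsort(response[ys, xs])[::-1][:N3]
    return list(zip(ys[order], xs[order]))

def similarity_score(f_j: float, c_j: float,
                     fmax_hist: list, cmax_hist: list) -> float:
    """S_j in the weighted absolute-difference form reconstructed above:
    distance of the candidate's response value and local contrast from
    their historical-frame means; the candidate with the smallest score
    is taken as the target point (step S5.4)."""
    f_bar = np.mean(fmax_hist[-N2:])
    c_bar = np.mean(cmax_hist[-N2:])
    return BETA * abs(f_j - f_bar) + (1 - BETA) * abs(c_j - c_bar)
```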
S6, when the target is occluded, the current frame target position in the occluded state is predicted by Kalman filtering constructed from the target position information of the historical frames; when the target is judged to have left occlusion, normal tracking is performed and step S4 is executed.
Specifically, during target tracking a Kalman filter is initialized and then, in an iterative process, predicts and optimally estimates the target's motion state from subsequent tracking results. Thus, when the target is occluded, the current frame target position can be predicted by Kalman filtering from the historical frame target position information. During occlusion, if the APCE value change rate λ_APCE of the current frame is greater than the threshold α_1 and the maximum peak change rate λ_Fmax is greater than a threshold α_4, the target is judged to have left occlusion and normal tracking is performed. α_1 = 0.55 and α_4 = 0.88 were determined experimentally.
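As one possible realization, a minimal constant-velocity Kalman filter over the target center is sketched below; the patent states only that a Kalman filter built from historical positions predicts the position under occlusion, so the motion model and noise levels here are assumptions:

```python
import numpy as np

class ConstantVelocityKF:
    """State [x, y, vx, vy]; observes the target center (x, y)."""

    def __init__(self, x: float, y: float, dt: float = 1.0,
                 q: float = 1e-2, r: float = 1.0):
        self.s = np.array([x, y, 0.0, 0.0])   # initial state
        self.P = np.eye(4)                    # state covariance
        self.F = np.eye(4)                    # constant-velocity transition
        self.F[0, 2] = self.F[1, 3] = dt
        self.H = np.eye(2, 4)                 # observe position only
        self.Q = q * np.eye(4)                # process noise (assumed)
        self.R = r * np.eye(2)                # measurement noise (assumed)

    def predict(self) -> np.ndarray:
        """Predicted center; used as the target position while occluded."""
        self.s = self.F @ self.s
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.s[:2]

    def update(self, x: float, y: float) -> None:
        """Fold in a tracked position from a normally tracked frame."""
        z = np.array([x, y])
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.s = self.s + K @ (z - self.H @ self.s)
        self.P = (np.eye(4) - K @ self.H) @ self.P
```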
The above steps S1 and S2 constitute the target depth feature extraction process, step S3 the target tracking state judgment, and steps S4, S5 and S6 the respective handling for normal tracking, background clutter interference and occlusion; combined, they form a complete target tracking procedure. In actual tracking, the entire target tracking is completed by repeating steps S1 to S3 followed by S4, S5 or S6 as appropriate.
The tracking effect of the invention (Our) is verified by simulation experiments: tracking performance is compared with 9 classic tracking algorithms on a public dataset for detection and tracking of dim-small aircraft targets in infrared imagery against ground/air backgrounds. The comparison algorithms comprise correlation-filter-based algorithms including STRCF, the correlation filtering algorithm ECO-HC fusing depth features, the deep learning algorithm MDNet based on online fine-tuning, and the twin-network-based deep learning algorithms SiamFC, SiamRPN, DaSiamRPN, SiamDW and SiamFC++. The deep-learning-based algorithms are trained with the same training set.
For the simulation results, refer to FIG. 3 and FIG. 4, which are respectively the success rate plot and the precision plot of the proposed tracking method and the other 9 comparison algorithms on the test set. In FIG. 3 and FIG. 4, the uppermost curves are respectively the tracking success rate curve and the tracking precision curve of the invention; the proposed tracking method is clearly superior to the other 9 algorithms and ranks first in both tracking success rate and tracking precision.
To further analyze the performance of the tracking method under complex background and occlusion in detail, test data with the two attributes of complex background interference and target occlusion were selected from the test set, and the tracking method was compared with the other 9 comparison algorithms on each attribute. In FIG. 5 and FIG. 6, the uppermost curves are respectively the tracking success rate curve and the tracking precision curve of the invention under complex background interference: the proposed method ranks first under this condition. In FIG. 7 and FIG. 8, the uppermost curves are respectively the tracking success rate curve and the tracking precision curve under occlusion: the proposed method again ranks first. In addition, the tracking method runs at a test speed of 145 frames/s, meeting the real-time requirement.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (4)

1. An infrared aerial small target tracking method based on a full convolution twin network, characterized by comprising the following steps:
S1, inputting the image sequence into the full convolution twin network, selecting the first frame of the sequence as the target template z, taking a region to be searched x from each subsequent frame, and passing both through the parameter-sharing convolutional neural network φ(·) to extract depth features, obtaining the feature map φ(z) of the target template and the feature map φ(x) of the region to be searched;
S2, using the feature map of the target template
Figure FDA0003264139370000012
Treating feature maps of search regions for convolution kernels
Figure FDA0003264139370000011
Performing convolution operation to obtain a characteristic response graph M;
S3, evaluating and judging the current frame target tracking state using the average peak-to-correlation energy and the maximum peak of the current frame response map: if current target tracking is judged to be in the normal state, executing step S4; if the target is judged to be interfered by background clutter, executing step S5; if the target is judged to be occluded, executing step S6;
S4, in the normal tracking state, applying a Hamming window of the same size to the current frame (frame t) response map M_t to suppress the boundary effect, and selecting the maximum peak point of the response map as the target point;
S5, when the target is interfered by background clutter, forming a candidate point set from the multiple peak points of the current frame response map, calculating the feature similarity score of each candidate target against the real target of the historical frames using the depth features and the local contrast feature, and selecting the candidate point with the highest similarity as the current frame target point;
and S6, when the target is occluded, predicting the current frame target position in the occluded state with a Kalman filter constructed from the target position information of the historical frames, and, if the target is judged to have left occlusion, performing normal tracking and executing step S4.
2. The infrared aerial small target tracking method based on a full convolution twin network as claimed in claim 1, characterized in that step S3 includes:
S3.1, measuring the fluctuation of the response map with the average peak-to-correlation energy (APCE) index, specifically defined as:

APCE = \frac{\left|F_{max}-F_{min}\right|^{2}}{\operatorname{mean}_{i,j}\left[\left(F_{i,j}-F_{min}\right)^{2}\right]}

where F_max and F_min respectively denote the maximum and minimum values in the response map, i and j denote the abscissa and ordinate of the response map, and F_{i,j} is the response value at (i, j) in the response map; under normal tracking, the response map fluctuates little and the APCE value is large; when the target is interfered by background clutter or occluded, the response map fluctuates severely and the APCE value drops sharply compared with normal tracking; a smaller APCE value indicates a more unstable tracking state;
S3.2, defining λ_APCE and λ_Fmax as the ratios of the current frame response map's APCE value and maximum peak, respectively, to the corresponding average values over the historical frames, quantifying the degree of change of the current frame's APCE value and maximum peak, namely:

\lambda_{APCE} = \frac{APCE_t}{\frac{1}{n_1}\sum_{i=t-n_1}^{t-1} APCE_i}, \qquad \lambda_{F_{max}} = \frac{F_{max-t}}{\frac{1}{n_1}\sum_{i=t-n_1}^{t-1} F_{max-i}}
where APCE_t and F_max-t are the response map APCE value and maximum peak of the current frame, APCE_i and F_max-i those of historical frame i, and n_1 is the number of reference historical frames; during tracking, the current tracking state is judged from the λ_APCE and λ_Fmax values combined with the historical frame response map information.
3. The infrared aerial small target tracking method based on a full convolution twin network as claimed in claim 2, characterized in that the method in step S3.2 for judging the current tracking state from the λ_APCE and λ_Fmax values combined with the historical frame response map information comprises the following steps:
a. if the change rate λ_APCE of the current frame response map's APCE value is greater than a certain threshold, judging the current tracking state to be the normal tracking state; otherwise executing step b to discriminate the other tracking states;
b. if the maximum peak change rate λ_Fmax of the current frame response map is less than a threshold, the maxima of the n_4 consecutive frame response maps before the current frame decrease gradually, and the decrease relative to the previous frame is greater than a certain threshold, judging the current tracking state to be the occluded state; otherwise judging it to be the background clutter interference state.
4. The infrared aerial small target tracking method based on a full convolution twin network as claimed in claim 1, characterized in that step S5 includes:
S5.1, defining the feature similarity score S_j of each candidate target as:

S_j = \beta\left|F_j - \frac{1}{n_2}\sum_{i=t-n_2}^{t-1} F_{max-i}\right| + (1-\beta)\left|C_j - \frac{1}{n_2}\sum_{i=t-n_2}^{t-1} C_{max-i}\right|, \quad j \in D

where D is the candidate target point set, of size n_3; j is the candidate target index; F_j and C_j are the response value and local contrast of candidate target j in the current frame; F_max-i and C_max-i are the peak value and local contrast of the maximum peak in the response map of historical frame i; n_2 is the number of reference historical frames; β and 1 − β are the weights of the response value and the local contrast respectively; the feature similarity score S_j, constructed from the depth feature response values and the local contrast feature of the target, measures the similarity between candidate target j and the real target, and a smaller S_j indicates that the candidate target's feature values are closer to the real target;
S5.2, searching the multiple peak points in the current frame response map with a maximum filter, and selecting the points with the top n_3 peak values as candidate target center points;
S5.3, calculating the local contrast of each candidate target using the same target box size;
and S5.4, calculating the corresponding feature similarity score S_j from the local contrast and response value of each candidate target, taking the candidate target point with the smallest S_j as the target point, and obtaining the current frame target center point position through position transformation.
CN202111081287.1A 2021-09-15 2021-09-15 Infrared aerial small target tracking method based on full convolution twin network Active CN113920159B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111081287.1A CN113920159B (en) 2021-09-15 2021-09-15 Infrared aerial small target tracking method based on full convolution twin network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111081287.1A CN113920159B (en) 2021-09-15 2021-09-15 Infrared aerial small target tracking method based on full convolution twin network

Publications (2)

Publication Number Publication Date
CN113920159A (en) 2022-01-11
CN113920159B CN113920159B (en) 2024-05-10

Family

ID=79235149

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111081287.1A Active CN113920159B (en) 2021-09-15 Infrared aerial small target tracking method based on full convolution twin network

Country Status (1)

Country Link
CN (1) CN113920159B (en)


Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200051250A1 (en) * 2018-08-08 2020-02-13 Beihang University Target tracking method and device oriented to airborne-based monitoring scenarios
CN110544269A (en) * 2019-08-06 2019-12-06 西安电子科技大学 twin network infrared target tracking method based on characteristic pyramid
WO2021035807A1 (en) * 2019-08-23 2021-03-04 深圳大学 Target tracking method and device fusing optical flow information and siamese framework
CN110728697A (en) * 2019-09-30 2020-01-24 华中光电技术研究所(中国船舶重工集团有限公司第七一七研究所) Infrared dim target detection tracking method based on convolutional neural network
KR20210096473A (en) * 2020-01-28 2021-08-05 인하대학교 산학협력단 Robust visual object tracking based on global and local search with confidence estimation
CN112069896A (en) * 2020-08-04 2020-12-11 河南科技大学 Video target tracking method based on twin network fusion multi-template features
CN112581502A (en) * 2020-12-23 2021-03-30 北京环境特性研究所 Target tracking method based on twin network

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
柳恩涵; 张锐; 赵硕; 王茹: "An infrared pedestrian target tracking method based on video prediction", Journal of Harbin Institute of Technology, no. 10, 25 September 2020 (2020-09-25) *
火元莲; 李明; 曹鹏飞; 石明: "Moving target tracking based on depth features and an anti-occlusion strategy", Journal of Northwest Normal University (Natural Science Edition), no. 04, 15 July 2020 (2020-07-15) *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114862910A (en) * 2022-04-25 2022-08-05 南京航空航天大学 Multi-vehicle target tracking method based on deep learning
CN116630373A (en) * 2023-07-19 2023-08-22 江南大学 Infrared weak and small target tracking method based on style recalibration and improved twin network
CN116630373B (en) * 2023-07-19 2023-09-22 江南大学 Infrared weak and small target tracking method based on style recalibration and improved twin network
CN118279568A (en) * 2024-05-31 2024-07-02 西北工业大学 Multi-target identity judging method for distributed double-infrared sensor time sequence twin network

Also Published As

Publication number Publication date
CN113920159B (en) 2024-05-10

Similar Documents

Publication Publication Date Title
CN108665481B (en) Self-adaptive anti-blocking infrared target tracking method based on multi-layer depth feature fusion
CN108830145B (en) People counting method based on deep neural network and storage medium
CN113920159A (en) Infrared aerial small target tracking method based on full convolution twin network
CN112258554B (en) Double-current hierarchical twin network target tracking method based on attention mechanism
CN110120064B (en) Depth-related target tracking algorithm based on mutual reinforcement and multi-attention mechanism learning
CN107154024A (en) Dimension self-adaption method for tracking target based on depth characteristic core correlation filter
CN114863097B (en) Infrared dim target detection method based on attention mechanism convolutional neural network
CN110555870B (en) DCF tracking confidence evaluation and classifier updating method based on neural network
CN111462191B (en) Non-local filter unsupervised optical flow estimation method based on deep learning
CN113706581B (en) Target tracking method based on residual channel attention and multi-level classification regression
Wang et al. GKFC-CNN: Modified Gaussian kernel fuzzy C-means and convolutional neural network for apple segmentation and recognition
CN107944354B (en) Vehicle detection method based on deep learning
CN111160407A (en) Deep learning target detection method and system
CN107452022A (en) A kind of video target tracking method
CN110942471A (en) Long-term target tracking method based on space-time constraint
CN107977683A (en) Joint SAR target identification methods based on convolution feature extraction and machine learning
CN112561796A (en) Laser point cloud super-resolution reconstruction method based on self-attention generation countermeasure network
CN111160229A (en) Video target detection method and device based on SSD (solid State disk) network
CN111027586A (en) Target tracking method based on novel response map fusion
CN113393457A (en) Anchor-frame-free target detection method combining residual dense block and position attention
CN110795599B (en) Video emergency monitoring method and system based on multi-scale graph
CN116381672A (en) X-band multi-expansion target self-adaptive tracking method based on twin network radar
CN115830537A (en) Crowd counting method
CN114998890A (en) Three-dimensional point cloud target detection algorithm based on graph neural network
CN114913337A (en) Camouflage target frame detection method based on ternary cascade perception

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant