CN113920159B - Infrared air small and medium target tracking method based on full convolution twin network - Google Patents


Info

Publication number
CN113920159B
CN113920159B (granted publication of application CN202111081287.1A; published as CN113920159A)
Authority
CN
China
Prior art keywords
target
value
response
tracking
frame
Prior art date
Legal status (assumed; not a legal conclusion)
Active
Application number
CN202111081287.1A
Other languages
Chinese (zh)
Other versions
CN113920159A (en)
Inventor
刘刚
张文波
曹紫绚
董猛
刘龙哲
田慧
权冰洁
Current Assignee
Henan University of Science and Technology
Original Assignee
Henan University of Science and Technology
Priority date
Filing date
Publication date
Application filed by Henan University of Science and Technology
Priority to CN202111081287.1A
Publication of CN113920159A
Application granted
Publication of CN113920159B
Active legal status
Anticipated expiration


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/20 Analysis of motion
    • G06T 7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G06T 7/251 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments, involving models
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/20 Analysis of motion
    • G06T 7/277 Analysis of motion involving stochastic approaches, e.g. using Kalman filters
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 Image acquisition modality
    • G06T 2207/10048 Infrared image

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Radar Systems Or Details Thereof (AREA)

Abstract

The invention provides a method that, building on a full convolution twin network, judges the target tracking state from the average peak correlation energy and the maximum peak of the depth-feature response map. When background clutter interference occurs, a local-contrast criterion is combined with the depth-feature response value to select the target; when occlusion occurs, the target position is predicted by Kalman filtering. The method adapts to complex and diverse infrared aerial scenes and achieves effective, stable, real-time tracking of small and medium infrared aerial targets.

Description

Infrared air small and medium target tracking method based on full convolution twin network
Technical Field
The invention belongs to the technical field of infrared aerial target tracking, and in particular relates to a method for tracking small infrared aerial targets based on a full convolution twin network.
Background
Aerial target tracking is one of the key technologies of infrared imaging guidance systems, and the requirements placed on it keep rising. Because the target is far away and the natural environment is complex and changeable, most targets in the field of view during the tracking stage occupy only a few pixels (small targets), their characteristic information is weak, background clutter interference is strong, and occlusion occurs, all of which severely hamper tracking. Therefore, in tracking small and medium infrared aerial targets, how to effectively overcome tracking failures caused by background clutter interference and occlusion while improving the accuracy and real-time performance of the tracking method is a technical problem that urgently needs to be solved by those skilled in the art.
At present, most infrared target tracking methods are conventional algorithms, in which targeted feature-extraction schemes are designed manually for different scenes. For complex infrared aerial scenes, however, it is difficult for conventional tracking algorithms to accommodate every situation. In recent years deep learning has developed rapidly, and researchers at home and abroad have widely applied deep features to target tracking, for example by incorporating them into traditional correlation-filter trackers. However, the back-propagation of a convolutional neural network is computationally expensive, so tracking algorithms that update network parameters online are slow and cannot meet real-time tracking requirements.
To address these problems, target tracking algorithms based on twin networks use similarity verification to convert tracking into a template-matching problem; their end-to-end trainability and real-time performance are strong, making them an important research direction in the field of target tracking. The target tracking algorithm based on the full convolution twin network (Fully-Convolutional Siamese Network, SiamFC) is a classical algorithm in this area: it cross-correlates the depth features of the region to be searched with those of the target template, uses the resulting response values to measure similarity, and selects the position with the maximum response value as the target center point, thereby achieving good tracking precision and speed.
Disclosure of Invention
Based on the above problems, the invention provides an infrared aerial small target tracking method based on a full convolution twin network, which aims to solve infrared small-target tracking in complex aerial scenes, in particular when the tracked target suffers background clutter interference or occlusion.
To achieve the above purpose, the technical scheme adopted by the invention is as follows. A method for tracking small and medium targets in infrared air based on a full convolution twin network comprises the following steps:
S1, inputting an image sequence into the full convolution twin network; the first frame of the sequence is marked as the target template z, and each subsequent frame provides a region to be searched x. Both pass through a parameter-sharing convolutional neural network φ that extracts depth features, yielding the target-template feature map φ(z) and the search-region feature map φ(x);
S2, using the target-template feature map φ(z) as a convolution kernel, performing a convolution (cross-correlation) operation over the search-region feature map φ(x) to obtain the feature response map M;
S3, evaluating and judging a target tracking state of the current frame by using the average peak correlation energy and the maximum peak of the current frame response diagram, and executing a step S4 if the current target tracking is judged to be in a normal state; if the target is judged to be interfered by the background clutter, executing a step S5; if the target is judged to be shielded, executing a step S6;
S4, in the normal tracking state, applying a Hamming window of the same size to the current frame-t response map M_t to suppress the boundary effect, and selecting the maximum peak point of the response map as the target point;
S5, when the target is interfered by background clutter, forming a candidate point set from the multiple peak points of the current-frame response map, computing a feature similarity score between each candidate target and the real target of the historical frames using the depth feature and the local contrast feature, and selecting the candidate point with the highest similarity as the current-frame target point;
S6, when the target is occluded, predicting the current-frame target position in the occluded state with a Kalman filter constructed from the target position information of the historical frames; once the target is judged to have left occlusion, normal tracking resumes and step S4 is executed.
Further, the step S3 includes:
S3.1, measuring the fluctuation of the response map with the average peak correlation energy (APCE) index, defined as:

$$\mathrm{APCE}=\frac{\left|F_{\max }-F_{\min }\right|^{2}}{\operatorname{mean}\left(\sum_{i, j}\left(F_{i, j}-F_{\min }\right)^{2}\right)}$$

wherein F_max and F_min are the maximum and minimum values in the response map, i and j are the abscissa and ordinate of the response map, and F_{i,j} is the response value at (i, j). Under normal tracking the response map fluctuates little and the APCE value is large; when the target is interfered by background clutter or occluded, the response map fluctuates severely and the APCE value drops sharply compared with normal tracking. A smaller APCE value indicates a less stable tracking state;
S3.2, defining λ_APCE and λ_Fmax as the ratios of the current-frame response map's APCE value and maximum peak to the corresponding averages over the historical frames, which quantify how much the APCE value and maximum peak of the current frame have changed:

$$\lambda_{\mathrm{APCE}}=\frac{\mathrm{APCE}_{t}}{\frac{1}{n_{1}} \sum_{i=t-n_{1}}^{t-1} \mathrm{APCE}_{i}}, \qquad \lambda_{F_{\max }}=\frac{F_{\max -t}}{\frac{1}{n_{1}} \sum_{i=t-n_{1}}^{t-1} F_{\max -i}}$$

wherein APCE_t and F_max-t are the response-map APCE value and maximum peak of the current frame, APCE_i and F_max-i are those of historical frame i, and n_1 is the number of reference historical frames; during tracking, the λ_APCE and λ_Fmax values are combined with the information of the historical-frame response maps to determine the current tracking state.
Further, in step S3.2 the λ_APCE and λ_Fmax values are combined with the information of the historical-frame response maps to judge the current tracking state as follows:
a. When the change rate λ_APCE of the current-frame response map's APCE value is larger than a certain threshold, the current tracking state is judged to be the normal tracking state; otherwise, step b is executed to judge the other tracking states;
b. If the maximum-peak change rate λ_Fmax of the current-frame response map is smaller than a certain threshold, the maxima of the n_4 consecutive response maps before the current frame decrease monotonically, and each decrease relative to the previous frame exceeds a certain threshold, the current tracking state is judged to be the occlusion state; otherwise it is judged to be the background clutter interference state.
Further, step S5 includes:
s5.1, defining a feature similarity score S j of each candidate target as follows:
Wherein D is a candidate target point set, and the number is n 3; j is a candidate target sequence number, and F j and C j are the response value and local contrast of the candidate target j of the current frame; f max-i and C max-i are peak values of maximum peaks and local contrasts in the historical frame response diagram, and n 2 is the reference historical frame number; beta and 1-beta are the weights occupied by the response value and the local contrast respectively; the feature similarity score S j constructed by the depth feature response value and the local contrast feature of the target is used for measuring the similarity between the candidate target j and the real target feature, and the smaller the S j value is, the closer the candidate target feature value is to the real target;
S5.2, searching for the multiple peak points of the current-frame response map with a maximum filter, and taking the top n_3 peaks as the candidate targets' center points;
S5.3, computing the local contrast of each candidate target using a window of the same size as the target bounding box;
S5.4, computing the feature similarity score S_j of each candidate target from its local contrast and response value, taking the candidate target point with the smallest S_j as the target point, and obtaining the position of the current-frame target center point through the position transformation.
Compared with the prior art, the invention has the following beneficial effects. In the infrared aerial small and medium target tracking method provided by the invention, depth features are extracted with the full convolution twin network to obtain a depth-feature response map. The current tracking state is judged from the average peak correlation energy and the variation of the maximum peak of the response map. Under normal tracking, the maximum peak point of the response map is selected as the target center point; when background clutter interference is detected, the depth-feature response value is combined with the local-contrast method to eliminate the clutter interference; and when occlusion is detected, Kalman filtering is used for position prediction. The proposed tracking method effectively handles complex background interference and occlusion of small and medium infrared targets, performs well in tracking them, and meets the real-time requirement of tracking.
Drawings
FIG. 1 is a schematic overall flow diagram of the tracking method of the present invention;
FIG. 2 is a structure diagram of the full convolution twin network of the present invention;
FIG. 3 is a success-rate plot of the method of the present invention (Our) and 9 comparison algorithms on the infrared aerial small-target test set;
FIG. 4 is a precision plot of the method of the present invention (Our) and 9 comparison algorithms on the infrared aerial small-target test set;
FIG. 5 is a success-rate plot of the method of the present invention (Our) and 9 comparison algorithms on infrared aerial small and medium target test data with the complex-background-interference attribute;
FIG. 6 is a precision plot of the method of the present invention (Our) and 9 comparison algorithms on infrared aerial small and medium target test data with the complex-background-interference attribute;
FIG. 7 is a success-rate plot of the method of the present invention (Our) and 9 comparison algorithms on infrared aerial small and medium target test data with the target-occlusion attribute;
FIG. 8 is a precision plot of the method of the present invention (Our) and 9 comparison algorithms on infrared aerial small and medium target test data with the target-occlusion attribute.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is apparent that the described embodiments are some, but not all, embodiments of the present invention, and all other embodiments obtained by those skilled in the art without making any inventive effort based on the embodiments of the present invention are within the scope of protection of the present invention.
The principle of the invention is as summarized above: depth features are extracted with the full convolution twin network to obtain a depth-feature response map; the current tracking state is judged from the average peak correlation energy and the maximum-peak variation of the response map; and, depending on the state, the maximum peak point is selected (normal tracking), the depth-feature response value is combined with local contrast (clutter interference), or Kalman filtering predicts the position (occlusion).
The invention discloses a method for tracking small and medium targets in infrared air based on a full convolution twin network, which is shown in a figure 1 and comprises the following steps:
S1, inputting the image sequence into the full convolution twin network; the first frame of the sequence is marked as the target template z, and each subsequent frame provides a region to be searched x. Both pass through the parameter-sharing convolutional neural network φ that extracts depth features, yielding the feature maps φ(z) and φ(x).
Specifically, the depth feature extraction network in step S1 is designed on the basis of AlexNet. It uses five convolutional layers; a max-pooling layer follows each of the first two convolutional layers, every convolutional layer except the last is followed by a ReLU, and the last three layers use grouped convolution. Each of these layers is followed by a batch normalization layer (Batch Normalization); the convolutional layers use no padding, and the total stride of the network is 8. In addition, in the offline training stage, the feature extraction network is first trained on the visible-light data set (ILSVRC) and then further trained on infrared aerial small and medium target image sequences, so as to better capture the depth features of small and medium infrared aerial targets. During training, a stochastic gradient descent algorithm optimizes the network parameters for 50 epochs, the learning rate is set to 10^-2, the training batch size is set to 8, and the best result of the last 10 epochs is taken.
S2, using the target-template feature map φ(z) as a convolution kernel, a convolution operation is performed over the search-region feature map φ(x) to obtain the feature response map M.
Specifically, in actual tracking, as shown in the full convolution twin network structure diagram of fig. 2, the target template and the region to be searched are respectively cropped and resized to z of 127×127×3 and x of 255×255×3. After both pass through the feature extraction network φ, a 6×6×128 feature map φ(z) and a 22×22×128 feature map φ(x) are obtained. The 6×6×128 feature map is used as a convolution kernel and convolved with the 22×22×128 feature map to obtain a 17×17 feature response map, which is bicubically interpolated to the final 272×272 feature response map.
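The shapes in this step can be checked with a small NumPy sketch of the cross-correlation; random arrays stand in for the real template and search-region features, and a plain sliding-window loop replaces an optimized convolution:

```python
import numpy as np

def cross_correlation(template_feat, search_feat):
    """Slide the template feature map over the search feature map.

    Shapes follow the sizes given in the text:
    template_feat: (6, 6, 128), search_feat: (22, 22, 128)
    -> response map of size (17, 17), since 22 - 6 + 1 = 17.
    """
    th, tw, _ = template_feat.shape
    sh, sw, _ = search_feat.shape
    out = np.empty((sh - th + 1, sw - tw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # response value = inner product of template and window
            out[i, j] = np.sum(search_feat[i:i+th, j:j+tw] * template_feat)
    return out

rng = np.random.default_rng(0)
z_feat = rng.standard_normal((6, 6, 128))    # stand-in for phi(z)
x_feat = rng.standard_normal((22, 22, 128))  # stand-in for phi(x)
M = cross_correlation(z_feat, x_feat)
print(M.shape)  # (17, 17)
```

The bicubic upsampling to 272×272 only refines peak localization and is omitted here.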
And S3, evaluating and judging the target tracking state of the current frame by using the average peak correlation energy and the maximum peak of the response diagram of the current frame. If it is determined that the current target tracking is in the normal state, step S4 is executed. If the target is judged to be interfered by the background clutter, executing a step S5; if it is determined that the target is blocked, step S6 is executed.
Specifically, step S3 includes:
S3.1, measuring the fluctuation of the response map with the average peak-to-correlation energy (APCE) index, defined as:

$$\mathrm{APCE}=\frac{\left|F_{\max }-F_{\min }\right|^{2}}{\operatorname{mean}\left(\sum_{i, j}\left(F_{i, j}-F_{\min }\right)^{2}\right)}$$

wherein F_max and F_min are the maximum and minimum values in the response map, i and j are the abscissa and ordinate of the response map, and F_{i,j} is the response value at (i, j). Under normal tracking the response map fluctuates little, visually presenting a 'single-peak' state, and the APCE value is large. When the target is interfered by background clutter or occluded, the response map fluctuates severely, visually presenting a 'multi-peak' state, and its APCE value drops sharply compared with normal tracking. Analyzing the fluctuation of the response map and computing its APCE value therefore effectively reflects the current tracking state.
The cause of a tracking failure is judged from the changing state of the maximum peak of the response map. When the target is interfered by background clutter, the maximum peak of the current-frame response map changes abruptly compared with the previous frame, while the maximum peaks before the current frame remained stable for some time. As the target goes from slight occlusion to complete occlusion, the maximum peak of the response map decreases gradually, reaching its minimum when the target is completely occluded. Thus, when the APCE value is small, the factors likely to cause tracking failure can be classified by additionally considering the changing state of the response map's maximum peak.
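A minimal sketch of the APCE index, with the mean taken over all response-map positions; it illustrates why a single sharp peak scores far higher than a multi-peak map:

```python
import numpy as np

def apce(response):
    """Average peak-to-correlation energy of a 2-D response map."""
    f_max, f_min = response.max(), response.min()
    return (f_max - f_min) ** 2 / np.mean((response - f_min) ** 2)

# One sharp peak (normal tracking) vs. two peaks (clutter/occlusion).
unimodal = np.zeros((17, 17)); unimodal[8, 8] = 1.0
multimodal = np.zeros((17, 17)); multimodal[4, 4] = multimodal[12, 12] = 1.0
print(apce(unimodal), apce(multimodal))  # 289.0 144.5
```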
S3.2, the invention defines λ_APCE and λ_Fmax as the ratios of the current-frame response map's APCE value and maximum peak to the corresponding averages over the historical frames, quantifying how much they have changed:

$$\lambda_{\mathrm{APCE}}=\frac{\mathrm{APCE}_{t}}{\frac{1}{n_{1}} \sum_{i=t-n_{1}}^{t-1} \mathrm{APCE}_{i}}, \qquad \lambda_{F_{\max }}=\frac{F_{\max -t}}{\frac{1}{n_{1}} \sum_{i=t-n_{1}}^{t-1} F_{\max -i}}$$

wherein APCE_t and F_max-t are the response-map APCE value and maximum peak of the current frame (frame t), APCE_i and F_max-i are those of historical frame i, and n_1 is the number of reference historical frames, experimentally determined to be 10. During tracking, the λ_APCE and λ_Fmax values are combined with the information of the historical-frame response maps to determine the current tracking state.
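The two change ratios reduce to dividing the current value by the historical mean; a sketch keeping the last n1 = 10 frames in a bounded deque (the toy history values are illustrative):

```python
from collections import deque

N1 = 10  # number of reference historical frames (n1 in the text)

def change_ratios(apce_t, fmax_t, apce_hist, fmax_hist):
    """lambda_APCE and lambda_Fmax: current values over historical means."""
    lam_apce = apce_t / (sum(apce_hist) / len(apce_hist))
    lam_fmax = fmax_t / (sum(fmax_hist) / len(fmax_hist))
    return lam_apce, lam_fmax

apce_hist = deque([40.0, 44.0, 36.0], maxlen=N1)  # toy history
fmax_hist = deque([1.0, 1.2, 0.8], maxlen=N1)
print(change_ratios(20.0, 0.5, apce_hist, fmax_hist))
```

After each frame, appending the new APCE and maximum-peak values to the deques automatically discards entries older than n1 frames.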
In the invention, during tracking the λ_APCE and λ_Fmax values are combined with the information of the historical-frame response maps to judge the current tracking state; the specific implementation steps are as follows:
a. When the change rate λ_APCE of the current-frame response map's APCE value is larger than the threshold α_1, the current tracking state is judged to be the normal tracking state; otherwise, step b is executed to judge the other tracking states. α_1 was experimentally determined to be 0.55.
b. If the maximum-peak change rate λ_Fmax of the current-frame response map is smaller than the threshold α_2, the maxima of the n_4 consecutive response maps before the current frame decrease monotonically, and each decrease relative to the previous frame exceeds α_3, the current tracking state is judged to be the occlusion state; otherwise it is judged to be the background clutter interference state. Experimentally, α_2 is 0.68, α_3 is 0.08, and n_4 is 4.
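The two-step decision rule above can be sketched directly, using the thresholds reported in the text (0.55, 0.68, 0.08, n4 = 4); the function names and argument layout are illustrative:

```python
def classify_state(lam_apce, lam_fmax, recent_fmax,
                   a1=0.55, a2=0.68, a3=0.08):
    """Decision rule of steps a/b with the thresholds from the text.

    recent_fmax holds the maxima of the n4 = 4 consecutive response
    maps before the current frame, oldest first.
    """
    if lam_apce > a1:
        return "normal"
    # occlusion: peak value decayed gradually, by more than a3 per frame
    drops = [prev - cur for prev, cur in zip(recent_fmax, recent_fmax[1:])]
    gradually_decreasing = all(d > a3 for d in drops)
    if lam_fmax < a2 and gradually_decreasing:
        return "occlusion"
    return "clutter"

print(classify_state(0.9, 1.0, [1.0, 1.0, 1.0, 1.0]))     # normal
print(classify_state(0.3, 0.5, [1.0, 0.85, 0.70, 0.55]))  # occlusion
print(classify_state(0.3, 0.9, [1.0, 1.0, 1.0, 1.0]))     # clutter
```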
S4, in the normal tracking state, a Hamming window of the same size is applied to the current frame-t response map M_t to suppress the boundary effect, and the maximum peak point of the response map is selected as the target point.
Specifically, the maximum peak point of the response map is selected as the target point, its offset from the center of the response map is computed, and multiplying this offset by the total stride of the network gives the center position of the current-frame target.
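A sketch of step S4 on the pre-interpolation 17×17 map: window, take the argmax, and convert the offset from the map centre to an image-plane displacement. Treating one response-map cell as total_stride = 8 image pixels is an assumption consistent with the network stride given earlier:

```python
import numpy as np

def locate_peak(response, total_stride=8):
    """Windowed argmax, then peak offset from centre -> (dy, dx) pixels."""
    h, w = response.shape
    window = np.outer(np.hamming(h), np.hamming(w))  # 2-D Hamming window
    windowed = response * window
    i, j = np.unravel_index(np.argmax(windowed), windowed.shape)
    return (i - h // 2) * total_stride, (j - w // 2) * total_stride

r = np.zeros((17, 17)); r[10, 8] = 1.0
print(locate_peak(r))  # (16, 0): two cells below centre -> 16 px down
```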
And S5, when the target is interfered by background clutter, forming a candidate point set by using multi-peak points of the current frame response graph, calculating the feature similarity score of each candidate target and the real target of the historical frame by using the depth feature and the local contrast feature, and selecting the candidate point with the highest similarity as the current frame target point.
Specifically, step S5 includes:
S5.1, in step S5 the local contrast is defined as the ratio of the target's mean gray value to the mean gray value of the local neighborhood background:

$$C=\frac{\frac{1}{N_{\Omega}} \sum_{(i, j) \in \Omega} I(i, j)}{\frac{1}{N_{\Psi}} \sum_{(i, j) \in \Psi} I(i, j)}$$

where Ω denotes the target region, Ψ denotes the neighborhood of the target region, N_Ω and N_Ψ are the numbers of pixels in the respective regions, and I(i, j) is the gray value of the pixel at (i, j) in the original image.
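A minimal sketch of this contrast: mean grey inside the target box over mean grey of the ring around it. The ring width `margin` is an illustrative assumption, not a value from the text:

```python
import numpy as np

def local_contrast(image, box, margin=2):
    """Mean grey of the target region / mean grey of the surrounding ring.

    box = (top, left, height, width); the neighbourhood is the ring of
    `margin` pixels around the box (clipped at the image border).
    """
    t, l, h, w = box
    target = image[t:t + h, l:l + w]
    outer = image[max(t - margin, 0):t + h + margin,
                  max(l - margin, 0):l + w + margin]
    # background = outer block minus the target block
    bg_mean = (outer.sum() - target.sum()) / (outer.size - target.size)
    return target.mean() / bg_mean

img = np.ones((10, 10)); img[4:6, 4:6] = 4.0  # bright 2x2 target on grey
print(local_contrast(img, (4, 4, 2, 2)))  # 4.0
```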
In step S5, the feature similarity score S_j between each candidate target and the real target of the historical frames is constructed from the target's depth-feature response value and local contrast feature:

$$S_{j}=\beta\left|F_{j}-\frac{1}{n_{2}} \sum_{i=t-n_{2}}^{t-1} F_{\max -i}\right|+(1-\beta)\left|C_{j}-\frac{1}{n_{2}} \sum_{i=t-n_{2}}^{t-1} C_{\max -i}\right|, \quad j \in D$$

wherein D is the candidate target point set, of size n_3; j is the candidate target index, and F_j and C_j are the response value and local contrast of candidate target j in the current frame. F_max-i and C_max-i are the maximum peak and the local contrast at the maximum peak in the historical-frame response maps, and n_2 is the number of reference historical frames. β and 1−β are the weights of the response value and the local contrast, respectively. Experimentally, n_3 is 8, n_2 is 5, and β is 0.4. The feature similarity score S_j measures the similarity between candidate target j and the real target; the smaller S_j is, the closer the candidate target's feature values are to the real target's.
S5.2, the multiple peak points of the current-frame response map are found with a maximum filter, and the top n_3 peaks are taken as the candidate targets' center points.
S5.3, the local contrast of each candidate target is computed using a window of the same size as the target bounding box.
S5.4, the feature similarity score S_j of each candidate target is computed from its local contrast and response value; the candidate target point with the smallest S_j is taken as the target point, and the position of the current-frame target center point is obtained through the position transformation.
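Steps S5.1 and S5.2 can be sketched as follows. The peak finder is a plain-NumPy stand-in for a maximum filter (strict 3×3 local maxima, strongest first), and the weighted absolute-difference form of the score is an assumption consistent with "smaller S_j = closer to the real target":

```python
import numpy as np

def top_peaks(response, n3=8):
    """Strict local maxima of a response map, at most n3, strongest first."""
    h, w = response.shape
    peaks = []
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            patch = response[i - 1:i + 2, j - 1:j + 2]
            # strictly greater than every neighbour in the 3x3 patch
            if response[i, j] > np.partition(patch.ravel(), -2)[-2]:
                peaks.append((response[i, j], i, j))
    peaks.sort(reverse=True)
    return [(i, j) for _, i, j in peaks[:n3]]

def similarity_score(f_j, c_j, fmax_hist, cmax_hist, beta=0.4):
    """S_j: weighted distance of a candidate's response value and local
    contrast from their historical means (smaller = more target-like)."""
    f_ref = sum(fmax_hist) / len(fmax_hist)
    c_ref = sum(cmax_hist) / len(cmax_hist)
    return beta * abs(f_j - f_ref) + (1 - beta) * abs(c_j - c_ref)

r = np.zeros((17, 17)); r[5, 5] = 1.0; r[10, 10] = 0.8
print(top_peaks(r))  # [(5, 5), (10, 10)]
```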
S6, when the target is occluded, the current-frame target position in the occluded state is predicted with a Kalman filter constructed from the target position information of the historical frames; once the target is judged to have left occlusion, normal tracking resumes and step S4 is executed.
Specifically, during target tracking a Kalman filter is initialized and then iteratively predicts and optimally estimates the motion state of the target from the subsequent tracking results; that is, when the target is occluded, its current-frame position can be predicted by Kalman filtering from the target position information of the historical frames. During occlusion, if the APCE change rate λ_APCE of the current frame is larger than the threshold α_1 and the maximum-value change rate λ_Fmax is larger than the threshold α_4, the target is judged to have left occlusion, and normal tracking resumes. α_1 was experimentally determined to be 0.55 and α_4 to be 0.88.
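A minimal constant-velocity Kalman filter over (x, y, vx, vy) sketches the occlusion-time predictor; the patent does not specify the motion model or noise levels, so the constant-velocity form and the q, r magnitudes are illustrative assumptions:

```python
import numpy as np

class ConstantVelocityKF:
    """Predict/update cycle for a 2-D position with constant velocity."""

    def __init__(self, x, y, q=1e-2, r=1.0):
        self.s = np.array([x, y, 0.0, 0.0])   # state: x, y, vx, vy
        self.P = np.eye(4)                    # state covariance
        self.F = np.eye(4)
        self.F[0, 2] = self.F[1, 3] = 1.0     # x += vx, y += vy per frame
        self.H = np.array([[1.0, 0, 0, 0], [0, 1.0, 0, 0]])
        self.Q, self.R = q * np.eye(4), r * np.eye(2)

    def predict(self):
        self.s = self.F @ self.s
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.s[:2]                     # predicted (x, y)

    def update(self, x, y):
        residual = np.array([x, y]) - self.H @ self.s
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.s = self.s + K @ residual
        self.P = (np.eye(4) - K @ self.H) @ self.P

kf = ConstantVelocityKF(1.0, 2.0)
kf.s[2:] = [3.0, 4.0]   # pretend the velocity is already estimated
print(kf.predict())     # [4. 6.]
```

During occlusion only predict() is called each frame; once the target reappears, update() resumes with the tracker's measured positions.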
Steps S1 and S2 form the target depth-feature extraction process, S3 is the tracking-state judgment, and S4, S5 and S6 are the processing methods for normal tracking, background clutter interference and target occlusion, respectively; together they form the complete target tracking process. In actual tracking, the whole process is completed by repeating steps S1 to S3 followed by S4, S5 or S6 as appropriate.
The tracking effect of the invention (Our) is verified by a simulation experiment, which uses the public ground-to-air-background infrared dim-small aircraft target detection and tracking dataset and compares tracking performance against 9 classical tracking algorithms: the correlation-filter-based algorithms Staple and STRCF, the fused-feature correlation-filter algorithm ECO-HC, the online-fine-tuned deep learning algorithm MDNet, and the twin-network-based deep learning algorithms SiamFC, SiamRPN, DaSiamRPN, SiamDW and SiamFC++. The deep-learning-based algorithms are trained on the same training set.
The simulation experiment results refer to fig. 3 and 4, which are respectively a success rate diagram and an accuracy diagram of the tracking method and other 9 comparison algorithms on the test set. In fig. 3 and fig. 4, the uppermost curves are the tracking success rate curve and the tracking accuracy curve of the present invention, respectively, and it can be seen that the tracking method of the present invention is significantly better than other 9 algorithms in terms of tracking success rate and accuracy, and the tracking success rate and the tracking accuracy are ranked first.
To further analyze the performance of the tracking method under complex-background and occlusion conditions, sequences with the two attributes of complex background interference and target occlusion were selected from the test set, and the proposed tracking method was compared with the other 9 algorithms on each attribute. In fig. 5 and fig. 6 the uppermost curves are the tracking success-rate curve and the tracking precision curve of the tracking method under complex background interference; as shown in fig. 5 and fig. 6, its tracking performance ranks first under this condition. In fig. 7 and fig. 8 the uppermost curves are the tracking success-rate curve and the tracking precision curve of the invention under occlusion; as shown in fig. 7 and fig. 8, its tracking performance also ranks first under occlusion. In addition, the tracking method runs at 145 frames/s, meeting the real-time requirement.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (2)

1. A method for tracking infrared air small and medium targets based on a full convolution twin network, characterized by comprising the following steps:
S1, inputting an image sequence into the full convolution twin network, selecting the first frame of the sequence and marking it as the target template z, with each subsequent frame providing a region to be searched x; the first frame and the subsequent frames each pass through a convolutional neural network φ with shared parameters to extract depth features, obtaining the feature map φ(z) of the target template and the feature map φ(x) of the region to be searched;
S2, using the feature map φ(z) of the target template as a convolution kernel, performing a convolution operation on the feature map φ(x) of the region to be searched to obtain a feature response map M;
S3, evaluating and judging a target tracking state of the current frame by using the average peak correlation energy and the maximum peak of the current frame response diagram, and executing a step S4 if the current target tracking is judged to be in a normal state; if the target is judged to be interfered by the background clutter, executing a step S5; if the target is judged to be shielded, executing a step S6;
S4, in the normal tracking state, applying a Hamming window of the same size to the current frame-t response map M_t to suppress the boundary effect, and selecting the maximum peak point of the response map as the target point;
S5, when the target is interfered by background clutter, forming a candidate point set from the multiple peak points of the current frame response map, calculating the feature similarity score between each candidate target and the real target of the historical frames using the depth feature and the local contrast feature, and selecting the candidate point with the highest similarity as the current frame target point;
S6, when the target is occluded, predicting the target position of the current frame in the occluded state by Kalman filtering constructed from the target position information of the historical frames; if the target is judged to have left occlusion, normal tracking is resumed and step S4 is executed; the step S3 comprises the following steps:
S3.1, measuring the fluctuation of the response map by using the average peak correlation energy (APCE) index, where APCE is specifically defined as:

APCE = |F_max − F_min|² / mean_i,j((F_i,j − F_min)²)
wherein F_max and F_min are respectively the maximum and minimum values in the response map; i and j are the abscissa and ordinate of the response map, and F_i,j is the response value at (i, j); under normal tracking the response map fluctuates little and the APCE value is large; when the target is interfered by background clutter or occluded, the response map fluctuates sharply, and the APCE value drops greatly compared with that of normal tracking; a smaller APCE value indicates a less stable tracking state;
S3.2, defining λ_APCE and λ_Fmax as the ratios of the APCE value and the maximum peak of the current frame response map to the corresponding historical-frame averages, so as to quantify the degree of change of the current frame's APCE value and maximum peak, namely:

λ_APCE = APCE_t / ((1/n1)·Σ_{i=1..n1} APCE_i), λ_Fmax = F_max-t / ((1/n1)·Σ_{i=1..n1} F_max-i)
wherein APCE_t and F_max-t are the APCE value and maximum peak of the current frame response map, APCE_i and F_max-i are those of historical frame i, and n1 is the number of reference historical frames; during tracking, the λ_APCE and λ_Fmax values are combined with the historical-frame response-map information to judge the current tracking state; the step S5 comprises the following steps:
S5.1, defining the feature similarity score S_j of each candidate target as:

S_j = β·|F_j − F̄| / F̄ + (1 − β)·|C_j − C̄| / C̄, j ∈ D, where F̄ = (1/n2)·Σ_{i=1..n2} F_max-i and C̄ = (1/n2)·Σ_{i=1..n2} C_max-i;
wherein D is the candidate target point set, of size n3; j is the candidate target index, and F_j and C_j are the response value and local contrast of candidate target j in the current frame; F_max-i and C_max-i are the maximum peak and the peak local contrast in the historical frame response maps, and n2 is the number of reference historical frames; β and 1 − β are the weights of the response value and the local contrast, respectively; the feature similarity score S_j, constructed from the depth-feature response value and the local contrast feature of the target, measures the similarity between candidate target j and the real target: the smaller the value of S_j, the closer the candidate target's feature values are to those of the real target;
S5.2, searching for multiple peak points in the current frame response map using a maximum filter, and selecting the top n3 peaks as the center points of the candidate targets;
S5.3, calculating the local contrast of each candidate target using a window of the same size as the target frame;
S5.4, calculating the feature similarity score S_j of each candidate target from its local contrast and response value, taking the candidate point with the minimum S_j as the target point, and obtaining the position of the center point of the current frame target by position transformation.
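The quantities used in S3.1 and S5.1-S5.2 can be computed as in the following NumPy/SciPy sketch. The APCE formula is the standard one described in S3.1; the exact similarity-score formula and the weight β = 0.5 are assumptions reconstructed from the variable definitions in claim 1, and the filter window size is illustrative.

```python
import numpy as np
from scipy.ndimage import maximum_filter

def apce(response):
    """Average peak correlation energy of a response map (S3.1):
    APCE = |Fmax - Fmin|^2 / mean((F_ij - Fmin)^2)."""
    f_max, f_min = response.max(), response.min()
    return abs(f_max - f_min) ** 2 / np.mean((response - f_min) ** 2)

def candidate_peaks(response, n3=5, size=5):
    """Local maxima of the response map found with a maximum filter (S5.2);
    returns the n3 highest peaks as (row, col) coordinates."""
    peaks = (response == maximum_filter(response, size=size))
    coords = np.argwhere(peaks)
    order = np.argsort(response[peaks])[::-1]     # strongest peaks first
    return coords[order[:n3]]

def similarity_score(f_j, c_j, f_hist, c_hist, beta=0.5):
    """Assumed form of the feature similarity score S_j (S5.1): weighted
    relative deviation of the candidate's response value f_j and local
    contrast c_j from the historical-frame averages; smaller is closer
    to the real target."""
    f_bar, c_bar = np.mean(f_hist), np.mean(c_hist)
    return beta * abs(f_j - f_bar) / f_bar + (1 - beta) * abs(c_j - c_bar) / c_bar
```

A candidate whose response value and local contrast both match the historical averages scores 0, so selecting the minimum-S_j candidate (S5.4) picks the point most consistent with the real target's past appearance.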
2. The method for tracking infrared air small and medium targets based on the full convolution twin network according to claim 1, wherein in step S3.2, judging the current tracking state from the λ_APCE and λ_Fmax values combined with the historical-frame response-map information comprises the following steps:
a. when the change rate λ_APCE of the APCE value of the current frame response map is larger than a certain threshold, judging the current tracking state to be the normal tracking state; otherwise, executing step b to distinguish the other tracking states;
b. if the maximum-peak change rate λ_Fmax of the current frame response map is smaller than a certain threshold, and the maximum value of the response map has decreased monotonically over the n4 consecutive frames preceding the current frame, with each decrease relative to the previous frame exceeding a certain threshold, judging the current tracking state to be the occlusion state; otherwise, judging it to be the background-clutter interference state.
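The decision logic of steps a and b above can be sketched as follows. Only α1 = 0.55 is given in the description; the values of the other two thresholds (here called alpha2 and alpha3) and n4 are illustrative assumptions, as this excerpt does not disclose them.

```python
import numpy as np

def judge_state(apce_t, fmax_t, apce_hist, fmax_hist,
                alpha1=0.55, alpha2=0.5, alpha3=0.1, n4=3):
    """Classify the current tracking state from response-map statistics,
    following steps a/b of claim 2. alpha2, alpha3 and n4 are assumed values."""
    lambda_apce = apce_t / np.mean(apce_hist)   # APCE change rate
    lambda_fmax = fmax_t / np.mean(fmax_hist)   # maximum-peak change rate

    # a. large APCE change rate -> stable response map, normal tracking
    if lambda_apce > alpha1:
        return "normal"

    # b. small peak ratio AND the maximum response of the last n4 frames
    #    decreasing monotonically, each drop exceeding alpha3 -> occlusion
    recent = list(fmax_hist[-n4:]) + [fmax_t]
    steadily_falling = all(
        prev - cur > alpha3 for prev, cur in zip(recent, recent[1:])
    )
    if lambda_fmax < alpha2 and steadily_falling:
        return "occluded"

    return "clutter"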
CN202111081287.1A 2021-09-15 2021-09-15 Infrared air small and medium target tracking method based on full convolution twin network Active CN113920159B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111081287.1A CN113920159B (en) 2021-09-15 2021-09-15 Infrared air small and medium target tracking method based on full convolution twin network


Publications (2)

Publication Number Publication Date
CN113920159A CN113920159A (en) 2022-01-11
CN113920159B (en) 2024-05-10

Family

ID=79235149

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111081287.1A Active CN113920159B (en) 2021-09-15 2021-09-15 Infrared air small and medium target tracking method based on full convolution twin network

Country Status (1)

Country Link
CN (1) CN113920159B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116630373B (en) * 2023-07-19 2023-09-22 江南大学 Infrared weak and small target tracking method based on style recalibration and improved twin network

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110544269A (en) * 2019-08-06 2019-12-06 西安电子科技大学 twin network infrared target tracking method based on characteristic pyramid
CN110728697A (en) * 2019-09-30 2020-01-24 华中光电技术研究所(中国船舶重工集团有限公司第七一七研究所) Infrared dim target detection tracking method based on convolutional neural network
CN112069896A (en) * 2020-08-04 2020-12-11 河南科技大学 Video target tracking method based on twin network fusion multi-template features
WO2021035807A1 (en) * 2019-08-23 2021-03-04 深圳大学 Target tracking method and device fusing optical flow information and siamese framework
CN112581502A (en) * 2020-12-23 2021-03-30 北京环境特性研究所 Target tracking method based on twin network
KR20210096473A (en) * 2020-01-28 2021-08-05 인하대학교 산학협력단 Robust visual object tracking based on global and local search with confidence estimation

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109272530B (en) * 2018-08-08 2020-07-21 北京航空航天大学 Target tracking method and device for space-based monitoring scene

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110544269A (en) * 2019-08-06 2019-12-06 西安电子科技大学 twin network infrared target tracking method based on characteristic pyramid
WO2021035807A1 (en) * 2019-08-23 2021-03-04 深圳大学 Target tracking method and device fusing optical flow information and siamese framework
CN110728697A (en) * 2019-09-30 2020-01-24 华中光电技术研究所(中国船舶重工集团有限公司第七一七研究所) Infrared dim target detection tracking method based on convolutional neural network
KR20210096473A (en) * 2020-01-28 2021-08-05 인하대학교 산학협력단 Robust visual object tracking based on global and local search with confidence estimation
CN112069896A (en) * 2020-08-04 2020-12-11 河南科技大学 Video target tracking method based on twin network fusion multi-template features
CN112581502A (en) * 2020-12-23 2021-03-30 北京环境特性研究所 Target tracking method based on twin network

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
An infrared pedestrian target tracking method based on video prediction; Liu Enhan; Zhang Rui; Zhao Shuo; Wang Ru; Journal of Harbin Institute of Technology; 2020-09-25 (10); full text *
Moving target tracking based on depth features and an anti-occlusion strategy; Huo Yuanlian; Li Ming; Cao Pengfei; Shi Ming; Journal of Northwest Normal University (Natural Science Edition); 2020-07-15 (04); full text *

Also Published As

Publication number Publication date
CN113920159A (en) 2022-01-11

Similar Documents

Publication Publication Date Title
CN111354017B (en) Target tracking method based on twin neural network and parallel attention module
CN108665481B (en) Self-adaptive anti-blocking infrared target tracking method based on multi-layer depth feature fusion
CN110929578B (en) Anti-shielding pedestrian detection method based on attention mechanism
CN112184752A (en) Video target tracking method based on pyramid convolution
CN111160407B (en) Deep learning target detection method and system
CN110084149B (en) Face verification method based on hard sample quadruple dynamic boundary loss function
CN107918772B (en) Target tracking method based on compressed sensing theory and gcForest
CN112837344B (en) Target tracking method for generating twin network based on condition countermeasure
CN109886128B (en) Face detection method under low resolution
CN109784358B (en) No-reference image quality evaluation method integrating artificial features and depth features
Wang et al. GKFC-CNN: Modified Gaussian kernel fuzzy C-means and convolutional neural network for apple segmentation and recognition
CN107944354B (en) Vehicle detection method based on deep learning
CN109359661B (en) Sentinel-1 radar image classification method based on convolutional neural network
CN110276784B (en) Correlation filtering moving target tracking method based on memory mechanism and convolution characteristics
CN110555870A (en) DCF tracking confidence evaluation and classifier updating method based on neural network
CN114419413A (en) Method for constructing sensing field self-adaptive transformer substation insulator defect detection neural network
CN114463677A (en) Safety helmet wearing detection method based on global attention
CN113920159B (en) Infrared air small and medium target tracking method based on full convolution twin network
CN110991554B (en) Improved PCA (principal component analysis) -based deep network image classification method
CN116030396A (en) Accurate segmentation method for video structured extraction
CN109344720B (en) Emotional state detection method based on self-adaptive feature selection
CN112464982A (en) Target detection model, method and application based on improved SSD algorithm
CN113496159B (en) Multi-scale convolution and dynamic weight cost function smoke target segmentation method
CN115294424A (en) Sample data enhancement method based on generation countermeasure network
CN116453033A (en) Crowd density estimation method with high precision and low calculation amount in video monitoring scene

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant