CN109102522B - Target tracking method and device

Target tracking method and device

Info

Publication number
CN109102522B
Authority
CN
China
Prior art keywords
target
filter
frame
image
tracking
Prior art date
Legal status
Active
Application number
CN201810768738.0A
Other languages
Chinese (zh)
Other versions
CN109102522A (en)
Inventor
魏振忠
闵玥
谈可
张广军
Current Assignee
Beihang University
Original Assignee
Beihang University
Priority date
Filing date
Publication date
Application filed by Beihang University
Priority to CN201810768738.0A
Publication of CN109102522A
Application granted
Publication of CN109102522B

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00: Image analysis
    • G06T 7/20: Analysis of motion
    • G06T 7/246: Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G06T 2207/00: Indexing scheme for image analysis or image enhancement
    • G06T 2207/10: Image acquisition modality
    • G06T 2207/10016: Video; Image sequence

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a target tracking method and a target tracking device. The method comprises the following steps: initializing all filters according to a given target position in the initial frame image; extracting candidate boxes from the current frame image, matching each to a filter, and computing the responses. If a response peak is greater than the threshold, the target position is determined from the response peak; if no response peak is greater than the threshold, filter updating is suspended and an occlusion state is entered, in which the target detection area of the current frame is determined from the previous frame's target bounding box. Candidate boxes are then extracted around the high-scoring detection boxes, matched to filters, and their response peaks computed; if both the detection score and the response peak are greater than their thresholds, the target is considered retrieved, otherwise it remains occluded. If occlusion persists over consecutive frames, the detection range is expanded to the full image until the target is retrieved. The method adapts robustly to target pose changes, environmental illumination changes, occlusion and other conditions, facilitating whole-process tracking and observation of the target.

Description

Target tracking method and device
Technical Field
The invention relates to a target tracking method and device, belongs to the technical field of image processing, and in particular relates to a detection-assisted multi-filter algorithm and device for stable target tracking.
Background
Target tracking is widely used in computer vision research, surveillance systems, civil security inspection, infrared guidance and other fields. The essence of target tracking is to determine the position and geometric information of a target in an image sequence. Interference from similar backgrounds and occlusion of the tracked target, including by objects similar to it, greatly increase the difficulty of long-term stable and accurate tracking, making it a research hotspot in computer vision.
The appearance models adopted by target tracking methods fall into two main categories: generative models and discriminative models. The essential difference is that a generative model does not exploit background information while a discriminative model does. That is, a generative model uses the positive samples to establish a prior distribution over the target's appearance, characterizing only the target itself and ignoring the background image information, whereas a discriminative model also uses negative samples containing background to train a classifier that separates positive from negative samples well and generalizes. Because background information can be exploited, most current tracking algorithms are discriminative, and correlation filter tracking algorithms in particular are discriminative models;
at present, much work has been done on correlation filter tracking algorithms and several effective improvements have been proposed. For example, "Henriques J F, Rui C, Martins P, et al., 'High-Speed Tracking with Kernelized Correlation Filters,' IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 37, no. 3, pp. 583-596, 2015" introduces kernelized correlation filters; "Danelljan M, Häger G, Khan F S, 'Accurate scale estimation for robust visual tracking,' in British Machine Vision Conference, Nottingham, United Kingdom, 2014, pp. 65.1-65.11" adapts to scale changes of the tracked target; "Galoogahi H K, Sim T, Lucey S, 'Correlation Filters with limited boundaries,' in IEEE Conference on Computer Vision and Pattern Recognition, Boston, USA, 2015, pp. 4630-4638" reduces the boundary effect; and "Danelljan M, Robinson A, Khan F S, et al., 'Beyond Correlation Filters: Learning Continuous Convolution Operators for Visual Tracking,' in European Conference on Computer Vision, Amsterdam, Netherlands, 2016, pp. 472-489" interpolates feature maps into the continuous domain and trains a unified continuous filter that fuses feature map information at different resolutions. Among these, the KCF tracking algorithm with its Gaussian kernel function is a tracker of excellent performance and is widely applied in practical engineering. However, because KCF cannot identify occlusion and stores only the most recent appearance of the tracked target, the target bounding box drifts easily when the target deforms severely, and many erroneous training samples are introduced when the target is occluded, both of which eventually cause tracking to fail.
Researchers have also addressed severe deformation and occlusion of the tracked target. For example, "Ma C, Yang X, Zhang C, et al., 'Long-term correlation tracking,' in IEEE Conference on Computer Vision and Pattern Recognition, Boston, USA, 2015, pp. 5388-5396" re-detects the tracked target and performs conservative scale estimation using regression models with different learning rates; "Danelljan M, Hager G, Khan F S, et al., 'Adaptive Decontamination of the Training Set: A Unified Formulation for Discriminative Visual Tracking,' in IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, USA, 2016, pp. 1430-1438" learns sample weights to down-weight corrupted training samples; an ECCV 2008 method (Marseille, France, 2008, pp. 788-801) determines occlusion from tracking-trajectory information to assist tracking; and "Cehovin L, Kristan M, Leonardis A, 'Robust visual tracking using an adaptive coupled-layer visual model,' IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 4, pp. 941-953, 2013" models occlusion as the non-zero elements, at occluded image patch positions, of a sparse representation of the target template. Although these methods improve the robustness of tracking to occlusion to a certain extent, they are computationally expensive and cannot meet the speed requirement of real-time tracking. To avoid these problems, a multi-filter KCF tracking algorithm can be considered: it classifies the training samples so that the typical historical appearances of the tracked target are recorded and contamination of correct training samples by erroneous ones is reduced, and, combined with an introduced detection algorithm, it identifies occlusion, retrieves the target after it reappears, and corrects the position of the target bounding box.
Disclosure of Invention
The invention aims to provide a target tracking method and a target tracking device.
The technical scheme adopted by the invention is as follows: a target tracking device comprising a visible light zoom imaging system, a control computer and a two-axis servo system, wherein:
the visible light zoom imaging system consists of an industrial camera and a visible light zoom lens. The industrial camera captures the target image, the image data are transmitted to the control computer through the conversion and transmission system, and the zoom lens adjusts its focal length according to feedback from the control computer, so that the size of the target in the captured image remains constant;
the control computer determines the position of the tracked target box in the current image using the tracking algorithm, adjusts the lens focal length according to the proportion of the image occupied by the target box, and drives the two-axis servo system with the deviation between the target position given by the tracking algorithm and the centre of the field of view, i.e. the offset of the target centre from the image centre, so that the servo system keeps the target in the central area of the image.
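For illustration, this feedback law can be sketched in a few lines of Python; the function name, the square-root zoom law and the target fill fraction below are illustrative assumptions, not values given by the invention:

```python
def servo_and_zoom_commands(box, frame_w, frame_h, target_fill=0.2):
    """box = (x, y, w, h) of the tracked target in pixels."""
    cx, cy = box[0] + box[2] / 2.0, box[1] + box[3] / 2.0
    # Pixel offset of the target centre from the field-of-view centre;
    # the two-axis servo pans and tilts to null this error.
    dx = cx - frame_w / 2.0
    dy = cy - frame_h / 2.0
    # Fraction of the frame occupied by the target box; the zoom lens is
    # driven so this stays near target_fill, keeping apparent size constant.
    fill = (box[2] * box[3]) / float(frame_w * frame_h)
    zoom_ratio = (target_fill / fill) ** 0.5 if fill > 0 else 1.0
    return (dx, dy), zoom_ratio
```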
The two-axis servo system consists mainly of a turntable body and an electric cabinet. The turntable body is the final actuating mechanism of the system; it adopts a vertical U-shaped structure and performs the angular motions in the azimuth and pitch directions. The control part of the servo system is housed below the interior of the mechanical body; it receives control signals from the control computer and performs real-time motion control of each gimbal frame.
The target tracking method comprises the following steps:
Step 1: in the initial frame image, extract the feature image of the target from the given target region, train to obtain the feature image and weight coefficients of the first filter, and initialize the other filters, performing the following operations in sequence:
Extracting the target feature image: extract the corresponding HOG feature image from the given target region.
Training the weight coefficients: the trained filter is a linear combination, after a nonlinear transformation, of all the images obtained by cyclically shifting the original training image; the coefficients of this linear combination are the training weight coefficients, computed as

$$\hat{\alpha} = \frac{\hat{y}}{\hat{k}^{xx} + \lambda}, \qquad k^{xx} = \exp\!\left(-\frac{1}{\sigma^{2}}\Big(2\lVert x\rVert^{2} - 2F^{-1}\big(\textstyle\sum_{c}\hat{x}_{c}^{*}\odot\hat{x}_{c}\big)\Big)\right),$$

where x is the c-channel HOG feature image, y is the desired Gaussian-shaped response, the hat denotes the discrete Fourier transform, $F^{-1}$ the inverse discrete Fourier transform, $\odot$ element-wise multiplication, $*$ complex conjugation, λ the regularization coefficient and σ the standard deviation of the Gaussian kernel; see "Henriques J F, Rui C, Martins P, et al., 'High-Speed Tracking with Kernelized Correlation Filters,' IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 37, no. 3, pp. 583-596, 2015". The feature images and weight coefficients of the remaining filters are initialized to those of the first filter, but their corresponding filter weights are smaller than that of the first filter.
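As an illustration of this training step, the following is a minimal numpy sketch of the Gaussian-kernel correlation and the closed-form weight coefficients from the cited KCF paper; the per-element normalization inside the exponential follows the KCF reference implementation, the HOG extraction is assumed to be done elsewhere, and the dictionary layout used to store a filter is an assumption of this sketch:

```python
import numpy as np

def gaussian_correlation(x1, x2, sigma):
    """Gaussian kernel correlation of two H x W x C feature images,
    computed in the Fourier domain (Henriques et al., TPAMI 2015)."""
    X1 = np.fft.fft2(x1, axes=(0, 1))
    X2 = np.fft.fft2(x2, axes=(0, 1))
    # F^{-1}( sum over channels of conj(x1_hat) elementwise x2_hat )
    cross = np.real(np.fft.ifft2(np.sum(np.conj(X1) * X2, axis=2)))
    d = (x1 ** 2).sum() + (x2 ** 2).sum() - 2.0 * cross
    return np.exp(-np.maximum(d, 0.0) / (sigma ** 2 * x1.size))

def train_filter(x, y, sigma=0.5, lam=1e-4):
    """Closed-form weights alpha_hat = y_hat / (k_hat^{xx} + lambda).
    x: H x W x C HOG feature image; y: H x W Gaussian-shaped response."""
    k = gaussian_correlation(x, x, sigma)
    alpha_f = np.fft.fft2(y) / (np.fft.fft2(k) + lam)
    return {'x': x, 'alpha_f': alpha_f, 'w': 1.0}  # 'w' is the filter weight
```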
Step 2: obtain candidate-box feature maps at different scales in the current frame image and match each to a filter. Let each filter $f_i$ have the corresponding training feature image $I_{f_i}$. For the current candidate-box feature image I at a given scale, the similarity between I and the training feature images of all N filters is computed in order to select the best-matched filter; the similarity measure is the sum of squared differences

$$S = \sum\big(I - I_{f_i}\big)^{2},$$

and the filter whose training feature image attains the minimum S is the best-matched filter.
Computing the response peak: the candidate-box feature image is correlated with the matched filter, and the maximum of the result over all positions is the response peak.
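A sketch of the matching and response computation, reusing the helpers above (the dictionary layout of a filter is again an assumption of the sketch):

```python
def best_matched_filter(candidate, filters):
    """Choose the filter whose stored training feature image is most
    similar to the candidate feature image, i.e. attains minimum S."""
    s = [((candidate - f['x']) ** 2).sum() for f in filters]
    i = int(np.argmin(s))
    return i, filters[i]

def response_peak(candidate, filt, sigma=0.5):
    """Correlate the candidate feature image with the matched filter and
    take the maximum of the response map over all positions."""
    k = gaussian_correlation(filt['x'], candidate, sigma)
    resp = np.real(np.fft.ifft2(np.fft.fft2(k) * filt['alpha_f']))
    return float(resp.max())
```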
If a response peak is greater than the threshold, the best-matched filter and the corresponding candidate-box feature image x are selected according to the maximum response peak, and the weight coefficient α is computed from the selected candidate-box feature image by the method of step 1. The candidate-box feature image and the computed weight coefficient represent the newly trained filter information, while the stored history of the best-matched filter is represented by its historical training feature image x' and historical weight vector α'. The best-matched filter is updated as follows: the historical training image x' is updated to θx + (1-θ)x', and the historical weight vector α' is updated to θα + (1-θ)α'.
The filter weight $\omega_\beta$ of the best-matched filter $f_\beta$ is then increased, and the filter weights $\omega_i$ of the other filters are correspondingly decreased. If no response peak is greater than the threshold, the tracker enters the occlusion state.
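The history update is given explicitly above, but the exact reweighting rule is not recoverable from the text, so in the sketch below the additive boost followed by renormalization is only a plausible stand-in:

```python
def update_matched_filter(filt, x_new, alpha_f_new, theta=0.02):
    """x' <- theta*x + (1-theta)*x'; alpha' <- theta*alpha + (1-theta)*alpha'.
    theta is a learning-rate assumption of this sketch."""
    filt['x'] = theta * x_new + (1.0 - theta) * filt['x']
    filt['alpha_f'] = theta * alpha_f_new + (1.0 - theta) * filt['alpha_f']

def reweight_filters(filters, best_idx, boost=0.1):
    """Raise omega_beta for the best-matched filter and lower the other
    omega_i, keeping the weights summing to one (stand-in rule)."""
    w = np.array([f['w'] for f in filters], dtype=float)
    w[best_idx] += boost
    w /= w.sum()
    for f, wi in zip(filters, w):
        f['w'] = float(wi)
```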
Step 3: in the occlusion state, filter updating is suspended, and the target detection area in frame k is determined from the position of the target bounding box in frame k-1. Based on the positions of the high-scoring detection boxes, candidate-box feature images at different scales are extracted, matched to filters, and their response peaks computed; if both the detection score and the filter response peak are greater than their thresholds, the target is considered retrieved and the occlusion state is exited, otherwise the next frame directly enters the occlusion state again.
The target detection area in frame k is still a rectangular bounding box whose centre image coordinates coincide with the centre of the target bounding box in frame k-1 and whose length and width are specific multiples of the length and width of that box. When determining candidate-box feature images from a rectangular high-scoring detection box, each candidate box shares the centre image coordinates of the detection box, and its length and width are a specific multiple η of the detection box's length and width; candidate boxes at different scales are obtained by taking different values of η and extracting the corresponding feature images.
Step 4: if the target remains occluded for 20 consecutive frames without being retrieved, the emergency state is entered and the detection range is expanded to the full image, i.e. detection is no longer limited to the previous frame's target bounding box and its surrounding area but covers the whole image; the subsequent target-retrieval steps of matching the high-scoring detection boxes to filters and computing response peaks are the same as in step 2.
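Steps 2-4 amount to a small state machine over the tracker's condition; the 20-frame limit comes from the text, while the threshold handling below is a simplifying assumption:

```python
class OcclusionStateMachine:
    """Tracks the 'tracking' / 'occluded' / 'emergency' condition of steps 2-4."""

    def __init__(self, peak_thresh, det_thresh, max_occluded=20):
        self.state = 'tracking'
        self.occluded_frames = 0
        self.peak_thresh = peak_thresh
        self.det_thresh = det_thresh
        self.max_occluded = max_occluded

    def step(self, peak, det_score=None):
        if self.state == 'tracking':
            if peak <= self.peak_thresh:      # no response peak clears the threshold
                self.state, self.occluded_frames = 'occluded', 1
        else:
            # occluded or emergency: the filters are frozen, the detector runs
            recovered = (det_score is not None
                         and det_score > self.det_thresh
                         and peak > self.peak_thresh)
            if recovered:
                self.state, self.occluded_frames = 'tracking', 0
            else:
                self.occluded_frames += 1
                if self.occluded_frames > self.max_occluded:
                    self.state = 'emergency'  # widen detection to the full image
        return self.state
```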
Step 5: after the target is retrieved, the occlusion or emergency state ends; if the filter weight of some filter is below the threshold, that filter is updated, otherwise the best-matched filter is updated; the weight of the updated filter is increased and the filter weights of the other filters are correspondingly decreased.
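The choice of which filter to update on recovery can be written as a one-line policy; picking the minimum-weight filter when one falls below the threshold is an assumption of the sketch:

```python
def filter_index_to_update(filters, best_idx, w_thresh):
    """Step 5: recycle a filter whose weight fell below the threshold for
    the re-found appearance; otherwise update the best-matched filter."""
    w = np.array([f['w'] for f in filters])
    return int(w.argmin()) if float(w.min()) < w_thresh else best_idx
```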
The theoretical basis of the invention is the detection-assisted multi-filter tracking algorithm implemented in steps 1-5; a block diagram of the complete algorithm is shown in FIG. 4. The main innovation is that all historical training images are classified by similarity into different training image sets, and multiple filters corresponding to the different sets (i.e. different historical appearances of the target) are obtained by training; the image position of the current tracked target is determined by the joint action of the filters, with weight coefficients determined by how well the current appearance matches each filter, and the currently determined target bounding box is used to update only the corresponding matched filter. A further innovation of the invention is the introduction of detection-assisted tracking for occlusion judgment and target retrieval.
Advantages and effects of the invention: the method memorizes multiple historical appearances so as to adapt to a discontinuous appearance model of the tracked target, and it separates erroneous training samples from correct ones, so that even if an erroneous sample is introduced it does not contaminate the stored history of the other, correct filters. Detection is introduced for correction, ensuring the accuracy of the tracking box. Combining detection with the multiple target appearances stored by the filters allows occlusion, even occlusion by objects similar to the target, to be judged, so the introduction of erroneous samples is avoided; during target retrieval the detection range can be narrowed (for short occlusions only the original target candidate box and its surrounding area need be searched), interference from objects similar to the tracked target is excluded, and the target is retrieved in any of its typical historical forms.
Drawings
FIG. 1 is a flow diagram of a device detection tracking module;
FIG. 2 is a schematic diagram of a two-axis servo system;
FIG. 3 is a visible light zoom imaging system;
FIG. 4 is a block diagram of a complete algorithm implementation of the present invention;
FIG. 5 is a schematic diagram of object loss and recovery in an embodiment of the present invention;
fig. 6 is a schematic view of the tracking effect of the airplane in the embodiment of the invention.
Detailed Description
The device of the invention comprises a visible light zoom imaging system, a control computer and a two-axis servo system, wherein:
The visible light zoom imaging system consists of an industrial camera and a visible light zoom lens. The industrial camera captures the target image, the image data are transmitted to the control computer through the conversion and transmission system, and the zoom lens adjusts its focal length according to feedback from the control computer, so that the size of the target in the captured image remains constant;
The control computer determines the position of the tracked target box in the current image using the tracking algorithm, adjusts the lens focal length according to the proportion of the image occupied by the target box, and drives the two-axis servo system with the deviation between the target position given by the tracking algorithm and the centre of the field of view, i.e. the offset of the target centre from the image centre, so that the servo system keeps the target in the central area of the image.
The two-axis servo system consists mainly of a turntable body and an electric cabinet. The turntable body is the final actuating mechanism of the system; it adopts a vertical U-shaped structure and performs the angular motions in the azimuth and pitch directions. The control part of the servo system is housed below the interior of the mechanical body; it receives control signals from the control computer and performs real-time motion control of each gimbal frame.
The stable target tracking method comprises the following steps:
Step 1: in the initial frame image, extract the feature image of the target from the given target region, train to obtain the feature image and weight coefficients of the first filter, and initialize the other filters, performing the following operations in sequence:
Extracting the target feature image: extract the corresponding HOG feature image from the given target region.
Training the weight coefficients: to express the feature image information more fully and further improve tracking accuracy, the KCF authors nonlinearly map the original feature image into a higher-dimensional feature space and train the corresponding filter there. The trained filter is a linear combination, after the same nonlinear mapping, of all the images obtained by cyclically shifting the original training image, and the sum of the products of corresponding elements (the dot product) of the mapped feature image and the filter is the response score for the target centred in the original image. The coefficients of the linear combination are the training weight coefficients, computed as

$$\hat{\alpha} = \frac{\hat{y}}{\hat{k}^{xx} + \lambda}, \qquad k^{xx} = \exp\!\left(-\frac{1}{\sigma^{2}}\Big(2\lVert x\rVert^{2} - 2F^{-1}\big(\textstyle\sum_{c}\hat{x}_{c}^{*}\odot\hat{x}_{c}\big)\Big)\right),$$

where x is the c-channel HOG feature image, y is the desired Gaussian-shaped response, the hat denotes the discrete Fourier transform, $F^{-1}$ the inverse discrete Fourier transform, $\odot$ element-wise multiplication, $*$ complex conjugation, λ the regularization coefficient and σ the standard deviation of the Gaussian kernel; see "Henriques J F, Rui C, Martins P, et al., 'High-Speed Tracking with Kernelized Correlation Filters,' IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 37, no. 3, pp. 583-596, 2015". The feature images and weight coefficients of the remaining filters are initialized to those of the first filter, but their corresponding filter weights are smaller than that of the first filter.
Step 2: based on the position of the target bounding box in the previous frame, obtain candidate-box feature maps at different scales in the current frame image and match each to a filter. Let each filter $f_i$ have the corresponding training feature image $I_{f_i}$. For the current candidate-box feature image I at a given scale, the similarity between I and the training feature images of all N filters is computed in order to select the best-matched filter; the similarity measure is the sum of squared differences

$$S = \sum\big(I - I_{f_i}\big)^{2},$$

and the filter whose training feature image attains the minimum S is the best-matched filter.
Computing the response peak: the candidate-box feature image is correlated with the matched filter, and the maximum of the result over all positions is the response peak.
If a response peak is greater than the threshold, the best-matched filter and the corresponding candidate-box feature image x are selected according to the maximum response peak, and the weight coefficient α is computed from the selected candidate-box feature image by the method of step 1. The candidate-box feature image and the computed weight coefficient represent the newly trained filter information, while the stored history of the best-matched filter is represented by its historical training feature image x' and historical weight vector α'. The best-matched filter is updated as follows: the historical training image x' is updated to θx + (1-θ)x', and the historical weight vector α' is updated to θα + (1-θ)α'. The filter weight $\omega_\beta$ of the best-matched filter $f_\beta$ is then increased, and the filter weights $\omega_i$ of the other filters are correspondingly decreased. If no response peak is greater than the threshold, the tracker enters the occlusion state.
Step 3: in the occlusion state, filter updating is suspended; the detection area is computed from the position of the previous frame's target bounding box and set to M times that box, giving the detection area image. The target detection area in the current frame is still a rectangular bounding box whose centre image coordinates coincide with the centre of the previous frame's target bounding box and whose length and width are specific multiples of that box's length and width. When determining candidate-box feature images from a rectangular high-scoring detection box, each candidate box shares the centre image coordinates of the detection box, and its length and width are a specific multiple η of the detection box's length and width; candidate boxes at different scales are obtained by taking different values of η and extracting the corresponding feature images. An SSD detector detects all objects of the same class as the tracked target within the detection area; if every detection score is below the threshold, the target is considered occluded, otherwise the detection boxes with score d greater than the threshold v1 are kept. When there is significant interference from objects of the same class, the detection boxes are matched to the filters again and the maximum response is computed to confirm that the same target has been found: based on the positions of the high-scoring detection boxes, candidate-box feature images at different scales are extracted, matched to filters, and their response peaks computed; if both the detection score and the filter response peak are greater than their thresholds, the target is considered retrieved and the occlusion state is exited, otherwise the next frame again enters the occlusion state directly. In a tracking environment without interference from similar objects, the step of re-matching the detection boxes to filters and computing the maximum response can be omitted.
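A sketch of this detection-assisted recovery, reusing the helpers from the earlier sketches; `detect` stands for a class detector such as the SSD of this embodiment and is assumed to yield (box, score) pairs, and `extract_hog` is a hypothetical helper returning the H x W x C feature image of a box:

```python
def try_recover(detect, frame, region, filters, v1, peak_thresh, sigma=0.5):
    """Return the recovered target box, or None if still occluded."""
    for box, score in detect(frame, region):
        if score <= v1:                    # detection score below threshold v1
            continue
        for cand in candidate_boxes(box):  # different-scale candidates via eta
            feat = extract_hog(frame, cand)            # hypothetical helper
            _, filt = best_matched_filter(feat, filters)
            if response_peak(feat, filt, sigma) > peak_thresh:
                return cand                # detection and response both agree
    return None
```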
Step 4: if the target remains occluded for 20 consecutive frames without being retrieved, the emergency state is entered and the detection range is expanded to the full image, i.e. detection is no longer limited to the previous frame's target bounding box and its surrounding area but covers the whole image; the subsequent target-retrieval steps of matching the high-scoring detection boxes to filters and computing response peaks are the same as in step 2.
Step 5: after the target is retrieved, the occlusion or emergency state ends; if the filter weight of some filter is below the threshold, that filter is updated, otherwise the best-matched filter is updated; the weight of the updated filter is increased and the filter weights of the other filters are correspondingly decreased.
Examples
The technical solution of the present invention is further described in detail by the following specific examples.
The device consists of a visible light zoom imaging system, a control computer and a two-axis servo system. The visible light zoom imaging system consists of an industrial camera and a visible light zoom lens. The industrial camera captures the target image and transmits the image data to the control computer through the conversion and transmission system;
the control computer determines the position of a target enclosure frame in the current frame image by using a tracking algorithm, and controls the two-axis servo system through the deviation of the center of the target enclosure frame and the center point of a view field, namely, the offset of the center of the target from the center of the image, so that the two-axis servo system is controlled to keep the target in the central area of the image. The control computer also controls the focal length of the lens according to the proportion of the target enclosure frame in the current frame image to the whole image, so that the size of the target in the shot image is kept constant. The system detection tracking module flow chart is shown in fig. 1.
For testing, the two-axis servo system performs the corresponding two-dimensional angular motion under set commands. The turntable body is the final actuating mechanism of the system; it adopts a vertical U-shaped structure and performs the angular motions in the azimuth and pitch directions. The control part of the servo system is housed below the interior of the mechanical body and provides real-time motion control, monitoring, protection and other functions for each gimbal frame. Control commands are input through a joystick to realize the various control functions of the turntable, as shown in fig. 2.
The visible light zoom imaging system consists of a visible light zoom lens and an industrial camera. The zoom lens comprises, in order, a focusing assembly, a zoom assembly, a rear fixed assembly, a sealing assembly, a camera assembly and so on. The industrial camera is a JAI (Japan) SP-5000 high-definition color camera, as shown in FIG. 3.
The tracking algorithm of the invention was used to track a pedestrian image sequence containing similar-background interference and occlusions of the target, including by objects similar to the target. The images are 640 × 480 JPEG images with 24-bit depth; the sequence is 3624 frames long, the tracked pedestrian is occluded 9 times, and the occluded frames total 1379. The experimental platform was Ubuntu 16.04, and all experiments were run on a computer with an Intel Core i7 CPU (2.81 GHz main frequency), 8 GB of memory and an NVIDIA GeForce GTX 1050 Ti graphics card. All other parameters used the original authors' default code settings.
Fig. 4 presents the system flow diagram of the detection-assisted multi-filter tracking method. The effect of retrieving the occluded target is shown in fig. 5: each row shows one occlusion of the target pedestrian, with the occlusion correctly judged and the pedestrian retrieved upon reappearing.
When there is little interference from objects similar to the target in the environment where the tracking algorithm is applied, the step of matching the high-scoring detection boxes to filters and computing their responses can be omitted and the detection result used directly as the target position. In the experiment, ECO with deep features, KCF with HOG features and the detection-assisted multi-filter improvement of KCF were used to track 10 airplane videos with no occlusion interference but with obvious changes in target appearance. Screenshots of the tracking process are shown in fig. 6; in each row, from left to right, are the tracking-box positions of ECO, KCF and the method herein. The specific ECO method is described in "Danelljan M, Bhat G, Khan F S, et al., 'ECO: Efficient Convolution Operators for Tracking,' in IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, Hawaii, USA, 2017, pp. 6931-6939"; the specific KCF method in "Henriques J F, Rui C, Martins P, et al., 'High-Speed Tracking with Kernelized Correlation Filters,' IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 37, no. 3, pp. 583-596, 2015".
It can be seen that although KCF is fast, its tracking is relatively unstable: before the airplane takes off, KCF's tracking box has already drifted noticeably because of strong airflow disturbance over the field and poor picture quality. ECO is more stable, but after take-off it cannot adapt to the severe deformation caused by the obvious change in the airplane's appearance, and its tracking box gradually drifts off the target. The method proposed herein tracks the airplane well throughout, with essentially no drift of the tracking box. The specific experimental results and the comparison of the tracking methods are shown in Table 1, where FPS is the number of frames tracked per second (related to picture size and resolution), the average CLE is the average centre location error, defined as the mean Euclidean distance between the target centre located by the algorithm and the reference target centre, and the average OR is the average overlap ratio, defined as the mean overlap (intersection over union) between the bounding box located by the algorithm and the reference bounding box.
Table 1. Comparison of tracking data for ECO, KCF and the algorithm herein

Video name          Land1  Land2  Land3  Land4  Land5  Launch1  Launch2  Launch3  Launch4  Launch5  Mean
Length (frames)     4800   7200   1997   2760   5101   4800     3600     3449     3960     4640     4230.7
FPS (ECO)           <10    <10    <10    <10    <10    <10      <10      <10      <10      <10      <10
FPS (KCF)           55     70     57     47     64     61       62       34       73       87.5     61.05
FPS (improved)      48     34     69     74.6   62     33.6     50       34       43.5     45       49.37
Avg CLE (ECO)       90.5   74.9   131.6  94.5   81     51.7     43       81.8     36       46.3     73.13
Avg CLE (KCF)       55     37     94     48.5   45     60       74       105      30.5     43.8     59.28
Avg CLE (improved)  26.43  23     26     34.6   30.5   15       25       17.6     16.7     31.8     24.663
Avg OR (ECO)        0.76   0.72   0.65   0.8    0.75   0.76     0.82     0.63     0.82     0.75     0.746
Avg OR (KCF)        0.7    0.77   0.73   0.73   0.7    0.73     0.78     0.71     0.73     0.68     0.726
Avg OR (improved)   0.92   0.92   0.88   0.87   0.83   0.94     0.9      0.93     0.93     0.81     0.893
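The two accuracy metrics of Table 1 are straightforward to compute; a minimal sketch, assuming boxes are (x, y, w, h) tuples and centres are (x, y) pairs:

```python
def average_cle(pred_centers, gt_centers):
    """Mean Euclidean distance between predicted and reference target centres."""
    d = [np.hypot(px - gx, py - gy)
         for (px, py), (gx, gy) in zip(pred_centers, gt_centers)]
    return float(np.mean(d))

def average_or(pred_boxes, gt_boxes):
    """Mean overlap ratio (intersection over union) of predicted and reference boxes."""
    ious = []
    for (x1, y1, w1, h1), (x2, y2, w2, h2) in zip(pred_boxes, gt_boxes):
        iw = max(0.0, min(x1 + w1, x2 + w2) - max(x1, x2))
        ih = max(0.0, min(y1 + h1, y2 + h2) - max(y1, y2))
        inter = iw * ih
        union = w1 * h1 + w2 * h2 - inter
        ious.append(inter / union if union > 0 else 0.0)
    return float(np.mean(ious))
```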
Although the 10 video segments contain images of the same size (all 1920 × 1080 pixel JPEG images of 24-bit depth), the size of the target and the proportion of the frame it occupies differ between videos, and this proportion largely determines the computational load and tracking speed. KCF is fastest on most video segments because each frame it trains the filter on only one target-bounding-box image: the filter is a weighted combination of cyclic shifts of that image, and the weight coefficients can be computed directly. Real time generally requires a frame rate above 25 frames per second, and KCF fully meets, indeed greatly exceeds, that speed, but its accuracy is not guaranteed: its average OR stays around 0.7, it drifts obviously when the target's appearance changes drastically, and its average CLE even reaches 100 pixels on the Launch3 video segment.
ECO must train a continuous filter each time from all the historically stored target-bounding-box images; the filter cannot be computed directly and must be iteratively optimized with a time-consuming conjugate-gradient method, so ECO is the slowest tracking algorithm on all video segments and falls completely short of the real-time requirement. Its average OR is around 0.75, it drifts obviously when the target's appearance changes drastically, and its average CLE even reaches 130 pixels on Land3.
The speed of the detection-assisted multi-filter improved algorithm varies considerably because it combines detection with KCF, whose speeds differ: detection is slower than KCF, so a video segment in which the airplane's appearance changes drastically many times has many typical historical appearances and requires detection to be introduced many times, which takes longer. The improved algorithm approaches or even exceeds KCF on some segments because, once KCF drifts, its box encloses large background regions that do not belong to the target, which increases the computation and slows it down. The frame rate of the improved algorithm is always above 30, sufficient for real time, and its accuracy is clearly improved: the average OR is always above 0.8 and frequently reaches 0.9, and the average CLE is always within 35 pixels.
The above description is only an embodiment of the present invention and is not intended to limit its scope; all equivalent structures and equivalent process changes made using the contents of this specification and the drawings, whether applied directly or indirectly in other related technical fields, are included in the scope of the present invention.

Claims (5)

1. A target tracking method using a target tracking apparatus, the apparatus comprising: a visible light zoom imaging system, a control computer and a two-axis servo system; wherein,
the visible light zoom imaging system consists of an industrial camera and a visible light zoom lens, wherein the industrial camera captures the target image and transmits the image data to the control computer through a conversion and transmission system, and the visible light zoom lens adjusts the lens focal length according to feedback from the control computer, so that the size of the target in the captured image remains constant;
the control computer determines the position of the tracked target box in the current image using the tracking algorithm, adjusts the lens focal length according to the proportion of the image occupied by the target box, and drives the two-axis servo system with the deviation between the target position given by the tracking algorithm and the centre of the field of view, i.e. the offset of the target centre from the image centre, so that the servo system keeps the target in the central area of the image;
the two-axis servo system consists mainly of a turntable body and an electric cabinet; the turntable body is the final actuating mechanism of the system, adopts a vertical U-shaped structure and performs the angular motions in the azimuth and pitch directions respectively; the control part of the servo system is housed below the interior of the mechanical body, receives control signals from the control computer and performs real-time motion control of each gimbal frame; characterized in that the method comprises the following implementation steps:
step one, in the initial frame image, extracting the feature image of the target from the given target region, training to obtain the feature image and weight coefficients of the first filter, and initializing the other filters and their filter weights;
step two, obtaining candidate-box feature maps at different scales in the current frame image, matching each to a filter and computing response peaks; if a response peak is greater than the threshold, selecting the candidate-box feature map and filter of the corresponding scale according to the maximum response peak, updating the corresponding filter and increasing its filter weight; if no response peak is greater than the threshold, entering the occlusion state;
step three, suspending filter updating in the occlusion state, and determining the target detection area in the current frame from the position of the previous frame's target bounding box; extracting candidate-box feature images at different scales based on the positions of the high-scoring detection boxes, matching filters and computing response peaks; if both the detection score and the response peak are greater than their thresholds, considering the target retrieved, otherwise the next frame directly enters the occlusion state;
in step three, filter updating is suspended in the occlusion state, and the target detection area in frame k is determined from the position of the target bounding box in frame k-1; candidate-box feature images at different scales are extracted based on the positions of the high-scoring detection boxes, matched to filters, and their response peaks computed; if both the detection score and the filter response peak are greater than their thresholds, the target is considered retrieved and the occlusion state is exited, otherwise the next frame directly enters the occlusion state;
the target detection area in frame k is still a rectangular bounding box whose centre image coordinates coincide with the centre of the target bounding box in frame k-1 and whose length and width are specific multiples of the length and width of that box; when determining candidate-box feature images from a rectangular high-scoring detection box, each candidate box shares the centre image coordinates of the detection box, its length and width being a specific multiple η of the detection box's length and width, and candidate boxes at different scales are obtained by taking different values of η and extracting the corresponding feature images;
step four, entering the emergency state if the target remains occluded for 20 consecutive frames without being retrieved, and expanding the detection range to the full image;
step five, after the target is retrieved, ending the occlusion or emergency state; if the filter weight of some filter is below the threshold, updating that filter, otherwise updating the best-matched filter; increasing the filter weight of the updated filter and correspondingly decreasing the filter weights of the other filters.
2. The target tracking method of claim 1, wherein:
in step one, in the initial frame image, the feature image of the target is extracted from the given target region, the feature image and weight coefficients of the first filter are obtained by training, and the other filters are initialized, the following operations being performed in sequence:
extracting the target feature image: extracting the corresponding HOG feature image from the given target region;
training the weight coefficients: the trained filter is a linear combination, after a nonlinear transformation, of all the images obtained by cyclically shifting the original training image, the coefficients of this linear combination being the training weight coefficients; the feature images and weight coefficients of the remaining filters are the same as those of the first filter, but their corresponding filter weights are smaller than that of the first filter.
3. The target tracking method of claim 1, wherein:
in step two, candidate-box feature maps are obtained at different scales in the current frame image and matched to filters, each filter $f_i$ having the corresponding training feature image $I_{f_i}$; for the current candidate-box feature image I at a given scale, the similarity between I and the training feature images of all N filters is computed in order to select the best-matched filter, using the sum of squared differences

$$S = \sum\big(I - I_{f_i}\big)^{2},$$

the filter whose training feature image attains the minimum S being the matched filter;
computing the response peak: the current-scale candidate-box feature image I is correlated with the matched filter, the maximum of the result over all positions being the response peak;
if a response peak is greater than the threshold, the best-matched filter and the corresponding candidate-box feature image x are selected according to the maximum response peak and the weight coefficient α is computed from the selected candidate-box feature image; the candidate-box feature image and the computed weight coefficient represent the newly trained filter information, the stored history of the best-matched filter being represented by its historical training feature image x' and historical weight vector α'; the best-matched filter is updated as follows: the historical training image x' is updated to θx + (1-θ)x' and the historical weight vector α' is updated to θα + (1-θ)α';
and the filter weight $\omega_\beta$ of the best-matched filter $f_\beta$ is increased while the filter weights $\omega_i$ of the other filters are correspondingly decreased; if no response peak is greater than the threshold, the occlusion state is entered.
4. The target tracking method of claim 2, wherein:
in step four, if the target remains occluded for 20 consecutive frames without being retrieved, the emergency state is entered and the detection range is expanded to the full image, i.e. detection is not limited to the previous frame's target bounding box and its surrounding area but covers the whole image, the subsequent target-retrieval steps of matching the high-scoring detection boxes to filters and computing response peaks being unchanged.
5. The target tracking method of claim 2, wherein:
in step five, after the target is retrieved, the occlusion or emergency state ends; if the filter weight of some filter is below the threshold, that filter is updated, otherwise the best-matched filter is updated, the filter weight of the updated filter being increased and the filter weights of the other filters correspondingly decreased.
CN201810768738.0A 2018-07-13 2018-07-13 Target tracking method and device Active CN109102522B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810768738.0A CN109102522B (en) 2018-07-13 2018-07-13 Target tracking method and device

Publications (2)

Publication Number Publication Date
CN109102522A CN109102522A (en) 2018-12-28
CN109102522B true CN109102522B (en) 2021-08-31

Family

ID=64846336

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810768738.0A Active CN109102522B (en) 2018-07-13 2018-07-13 Target tracking method and device

Country Status (1)

Country Link
CN (1) CN109102522B (en)

Families Citing this family (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110009660B (en) * 2019-03-06 2021-02-12 浙江大学 Object position tracking method based on correlation filter algorithm
CN109977928B (en) * 2019-04-25 2021-03-23 中国科学院自动化研究所 Robot target pedestrian retrieval method
CN110210304B (en) * 2019-04-29 2021-06-11 北京百度网讯科技有限公司 Method and system for target detection and tracking
CN110189365B (en) * 2019-05-24 2023-04-07 上海交通大学 Anti-occlusion correlation filtering tracking method
CN110400347B (en) * 2019-06-25 2022-10-28 哈尔滨工程大学 Target tracking method for judging occlusion and target relocation
CN110290351B (en) * 2019-06-26 2021-03-23 广东康云科技有限公司 Video target tracking method, system, device and storage medium
CN110517483B (en) * 2019-08-06 2021-05-18 新奇点智能科技集团有限公司 Road condition information processing method and digital rail side unit
CN110490907B (en) * 2019-08-21 2023-05-16 上海无线电设备研究所 Moving target tracking method based on multi-target feature and improved correlation filter
CN110599519B (en) * 2019-08-27 2022-11-08 上海交通大学 Anti-occlusion related filtering tracking method based on domain search strategy
CN112489077A (en) * 2019-09-12 2021-03-12 阿里巴巴集团控股有限公司 Target tracking method and device and computer system
CN110909604B (en) * 2019-10-23 2024-04-19 深圳市重投华讯太赫兹科技有限公司 Security check image detection method, terminal equipment and computer storage medium
CN112585944A (en) * 2020-01-21 2021-03-30 深圳市大疆创新科技有限公司 Following method, movable platform, apparatus and storage medium
CN112084914B (en) * 2020-08-31 2024-04-26 的卢技术有限公司 Multi-target tracking method integrating space motion and apparent feature learning
US11566521B2 (en) 2020-09-22 2023-01-31 Trans Astronautica Corporation Systems and methods for radiant gas dynamic mining of permafrost
CN112862863B (en) * 2021-03-04 2023-01-31 广东工业大学 Target tracking and positioning method based on state machine
CN114842048B (en) * 2022-04-11 2024-08-02 北京航天晨信科技有限责任公司 Target tracking method, system, readable storage medium and computer device
US11748897B1 (en) * 2022-06-24 2023-09-05 Trans Astronautica Corporation Optimized matched filter tracking of space objects

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106651913A (en) * 2016-11-29 2017-05-10 开易(北京)科技有限公司 Target tracking method based on correlation filtering and color histogram statistics and ADAS (Advanced Driving Assistance System)
CN106686306A (en) * 2016-12-22 2017-05-17 西安工业大学 Target tracking device and target tracking method
CN107657630A (en) * 2017-07-21 2018-02-02 南京邮电大学 A kind of modified anti-shelter target tracking based on KCF
CN107748873A (en) * 2017-10-31 2018-03-02 河北工业大学 A kind of multimodal method for tracking target for merging background information
CN108037510A (en) * 2017-12-07 2018-05-15 武汉华之洋科技有限公司 A kind of photoelectronic reconnaissance equipment for unmanned boat
CN108010067A (en) * 2017-12-25 2018-05-08 北京航空航天大学 A kind of visual target tracking method based on combination determination strategy
CN107993257A (en) * 2017-12-28 2018-05-04 中国科学院西安光学精密机械研究所 Intelligent IMM Kalman filtering feedforward compensation target tracking method and system

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
"High-Speed Tracking with Kernelized Correlation Filters";Joa~o F. Henriques, at el.;《IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE》;20150331;第37卷(第3期);583-596 *
"基于似物性采样和核化相关滤波器的目标跟踪算法研究";王鹏飞;《中国优秀硕士学位论文全文数据库 信息科技辑》;20180215(第2期);I138-1994 *
"基于核相关滤波器的目标跟踪方法研究";江维创;《中国优秀硕士学位论文全文数据库 信息科技辑》;20171215(第12期);I138-1411 *
"电视导引头相关滤波跟踪算法研究";马晓楠;《中国优秀硕士学位论文全文数据库 工程科技Ⅱ辑》;20170715(第7期);C032-143 *

Also Published As

Publication number Publication date
CN109102522A (en) 2018-12-28

Similar Documents

Publication Publication Date Title
CN109102522B (en) Target tracking method and device
CN109800689B (en) Target tracking method based on space-time feature fusion learning
CN106780620B (en) Table tennis motion trail identification, positioning and tracking system and method
CN113807187B (en) Unmanned aerial vehicle video multi-target tracking method based on attention feature fusion
CN110443827B (en) Unmanned aerial vehicle video single-target long-term tracking method based on improved twin network
CN104574445B (en) A kind of method for tracking target
CN111354017A (en) Target tracking method based on twin neural network and parallel attention module
CN109102525B (en) Mobile robot following control method based on self-adaptive posture estimation
CN110490907B (en) Moving target tracking method based on multi-target feature and improved correlation filter
CN109448023B (en) Satellite video small target real-time tracking method
CN108009494A (en) A kind of intersection wireless vehicle tracking based on unmanned plane
CN109087337B (en) Long-time target tracking method and system based on hierarchical convolution characteristics
CN111383252B (en) Multi-camera target tracking method, system, device and storage medium
CN111680713B (en) Unmanned aerial vehicle ground target tracking and approaching method based on visual detection
CN109323697B (en) Method for rapidly converging particles during starting of indoor robot at any point
CN110827321B (en) Multi-camera collaborative active target tracking method based on three-dimensional information
CN110992378B (en) Dynamic updating vision tracking aerial photographing method and system based on rotor flying robot
CN113643329B (en) Twin attention network-based online update target tracking method and system
CN113538509B (en) Visual tracking method and device based on adaptive correlation filtering feature fusion learning
CN111415370A (en) Embedded infrared complex scene target real-time tracking method and system
CN108986139B (en) Feature integration method with significance map for target tracking
CN111915653B (en) Dual-station visual target tracking method
CN106408593A (en) Video-based vehicle tracking method and device
Wu et al. Joint feature embedding learning and correlation filters for aircraft tracking with infrared imagery
CN112884799A (en) Target tracking method in complex scene based on twin neural network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant