CN104036528A - Real-time distribution field target tracking method based on global search - Google Patents

Real-time distribution field target tracking method based on global search

Info

Publication number: CN104036528A
Application number: CN201410298728.7A
Authority: CN (China)
Prior art keywords: distribution field, target image, target, FFT, field model
Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis)
Other languages: Chinese (zh)
Inventors: 宁纪锋, 叱干鹏飞, 石武祯
Current and Original Assignee: Northwest A&F University
Application filed by Northwest A&F University
Priority date / Filing date: 2014-06-26
Publication date: 2014-09-10

Abstract

The invention discloses a real-time distribution field target tracking method based on global search. The method comprises the following steps: first, in a target image I selected at the first frame, the position of the target is marked by hand; second, a target model is established by applying Gaussian smoothing to the distribution field model of the target image I, a region to be searched is determined in the next frame of the image and Gaussian-smoothed to obtain a candidate-region distribution field, and the image block whose correlation coefficient with the target model is maximal is sought; third, the two target models obtained from the two frames are fused according to a learning rate ρ; fourth, the second and third steps are executed in a loop until the whole video sequence ends. A global template-matching search strategy based on the correlation coefficient is adopted, which overcomes both the tendency of the original distribution field's gradient-descent search to become trapped in local optima and the sensitivity to noise of the L1-distance similarity measure, thereby improving the tracking performance.

Description

A real-time distribution field target tracking method based on global search
[technical field]
The invention belongs to the field of computer vision and image analysis, and specifically relates to a real-time distribution field target tracking method based on global search.
[background technology]
Tracking is an important problem in computer vision and has been widely applied in video surveillance, human-machine interfaces, robot perception, behaviour understanding, action recognition and other fields. Because of the rotation, deformation, occlusion, illumination variation and other complicating factors affecting the target during tracking, visual tracking has always been a problem worthy of in-depth study [1,2].
In general, a tracking algorithm comprises three aspects: target representation, search strategy and model updating. Target representation is the first problem a tracking algorithm must solve. Early work commonly represented the target with a template containing its brightness, gradients or other features [3], but such templates are sensitive to the spatial structure of the target. Another template-based representation is the histogram [4,5]: it is simple to compute, fast, insensitive to target deformation and pose changes, and can avoid drift to some extent. However, as a statistics-based representation it loses some spatial information, and when the target and the background are very similar its expressive power declines. In 2001, Viola et al. [6] first introduced the Adaboost algorithm based on Haar-like features into face detection; applying the idea of the integral image to the computation of Haar-like features greatly accelerated feature extraction. Inspired by this, Babenko et al. [7] trained a classifier by online multiple-instance learning and used Haar-like features to train a discriminative model of target and background, achieving robust tracking. Haar-like features are simple to compute, but they respond mainly to edges and line segments, can only describe features of particular orientations, and are rather coarse. Yao Zhijun [8] proposed a new spatial-histogram similarity measure, viewing the spatial distribution of each histogram bin as a Gaussian distribution, measuring the similarity of the spatial distributions with the Jensen-Shannon divergence (JSD) and the similarity of the colour histograms by histogram intersection, and applying the measure in a particle-filter tracking algorithm to improve the tracking results.
Recently, Laura Sevilla-Lara et al. [9] adopted a novel target representation, the distribution field (DF), and introduced it into target tracking. The method first decomposes the image into layers, retaining the essential information of the original image, and then introduces "ambiguity" into the target representation by Gaussian-smoothing each layer both within the image plane and across layers, which to some extent overcomes the influence of deformation, illumination and other changes. In its search strategy, however, the original distribution field searches by gradient descent and stops when a minimum is reached, which partly reduces the amount of computation. But the objective function is non-convex: in tracking, gradient descent starts from the peak-response position detected in the previous frame and searches for the target within a limited region of the current frame by computing the L1 norm. When the target moves quickly, searching within such a limited region easily falls into a locally optimal solution, which limits the tracking performance.
List of references:
[1] Yilmaz A, Javed O, Shah M. Object Tracking: a Survey[J]. ACM Computing Surveys (CSUR), 2006, 38(4): 13.
[2] Yang Han-xuan, Zheng Feng, Wang Liang, et al. Recent advances and trends in visual tracking: A review[J]. Neurocomputing, 2011, 74(18): 3823-3831.
[3] Baker S, Matthews I. Lucas-Kanade 20 years on: A unifying framework[J]. International Journal of Computer Vision, 2004, 56(3): 221-255.
[4] Collins R T. Mean-shift blob tracking through scale space[C]//Proc of IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Madison: IEEE Press, 2003, 2: II-234-40.
[5] Ning Ji-feng, Zhang Lei, Zhang David, et al. Robust Mean Shift Tracking with Corrected Background-Weighted Histogram[J]. IET Computer Vision, 2012, 6(1): 62-69.
[6] Viola P, Jones M. Rapid object detection using a boosted cascade of simple features[C]//Proc of IEEE Conference on Computer Vision and Pattern Recognition, Hawaii: IEEE Press, 2001, 1: 511-518.
[7] Babenko B, Yang M, Belongie S. Robust Object Tracking with Online Multiple Instance Learning[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2011, 33(8): 1619-1632.
[8] Yao Zhijun. A new spatial histogram similarity measure and its application to target tracking[J]. Journal of Electronics & Information Technology, 2013, 35(7): 1644-1649.
[9] Laura S L, Erik L M. Distribution fields for tracking[C]//Proc of IEEE Conference on Computer Vision and Pattern Recognition, Providence: IEEE Press, 2012: 1910-1917.
[summary of the invention]
The object of the invention is to address the difficulties and challenges posed in video target tracking by partial occlusion, rotation, scaling, illumination variation, motion blur and complex backgrounds, by providing a real-time distribution field target tracking method based on global search that adapts to changes in the target appearance model.
To achieve the above object, the present invention is realized through the following technical solution:
A real-time distribution field target tracking method based on global search comprises the following steps:
1) In the target image I selected at the first frame, mark the target position by hand: delimit the target area with a rectangular box and record the coordinates of the upper-left corner of the box together with its width and height. Then represent the delimited target area with a distribution field model, which is constructed as follows:
Using the Kronecker delta function, the target image I is represented as a distribution field model df(i, j, k):
df(i, j, k) = 1 if I(i, j)/(255/K) = k, and 0 otherwise    (1)
In the formula: i and j denote the row and column of target image I, respectively;
K denotes the number of layers into which target image I is divided;
k denotes the index of each layer, k = 1, 2, 3, ..., K;
df(i, j, k) denotes the value at row i, column j of the k-th layer of the decomposed target image I; it takes the value 0 or 1;
the set of pixels whose intensity falls into one interval of depth 255/K is called "one layer";
I(i, j) denotes the pixel value at row i, column j of target image I;
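As a concrete illustration of formula (1), the following is a minimal sketch in Python/NumPy; the function name, the 0-based layer indices and the clipping of the top intensity bin are assumptions made for illustration and are not part of the patent:

```python
import numpy as np

def build_distribution_field(image, K):
    """Decompose an 8-bit grayscale image into K binary layers, as in formula (1).

    df[i, j, k] is 1 exactly when pixel I(i, j) falls into the k-th intensity
    interval of width 255/K, and 0 otherwise.
    """
    image = np.asarray(image, dtype=np.float64)
    rows, cols = image.shape
    df = np.zeros((rows, cols, K))
    # Layer index of every pixel; clip so an intensity of exactly 255 stays in the last layer.
    layer = np.clip((image / (255.0 / K)).astype(int), 0, K - 1)
    df[np.arange(rows)[:, None], np.arange(cols)[None, :], layer] = 1.0
    return df
```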
Then Gaussian smoothing is applied to the distribution field model of target image I. The smoothing consists of spatial-domain smoothing and feature-domain smoothing. First the distribution field model of target image I is smoothed in the spatial domain, i.e. along the x and y directions of the image:
df_s(k) = df(k) * h_{σ_s}    (2)
In the formula: df_s(k) denotes the distribution field model of target image I after spatial smoothing;
df(k) denotes the k-th layer of the distribution field model of target image I;
h_{σ_s} is a 2D Gaussian kernel with standard deviation σ_s;
"*" denotes convolution.
Then the spatially smoothed distribution field model of target image I is smoothed in the feature domain:
df_ss(i, j) = df_s(i, j) * h_{σ_f}    (3)
In the formula: df_ss(i, j) denotes the distribution field model of target image I after feature-domain smoothing;
df_s(i, j) denotes the value at row i, column j of the spatially smoothed distribution field model of target image I;
h_{σ_f} is a 1D Gaussian kernel with standard deviation σ_f.
After Gaussian smoothing, the feature column of every pixel of the distribution field model of target image I still sums to 1.
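A hedged sketch of the two-stage smoothing of formulas (2) and (3), assuming SciPy's Gaussian filters; the helper name and the final renormalization step are illustrative assumptions:

```python
from scipy.ndimage import gaussian_filter, gaussian_filter1d

def smooth_distribution_field(df, sigma_s, sigma_f):
    """Spatial smoothing (formula (2)) followed by feature-domain smoothing (formula (3)).

    df has shape (rows, cols, K): axes 0 and 1 span the image plane, axis 2 indexes the layers.
    """
    # Formula (2): 2D Gaussian smoothing of every layer in the image plane.
    df_s = gaussian_filter(df, sigma=(sigma_s, sigma_s, 0))
    # Formula (3): 1D Gaussian smoothing along the layer (feature) axis at every pixel.
    df_ss = gaussian_filter1d(df_s, sigma=sigma_f, axis=2)
    # Renormalize so that each pixel's distribution over the K layers sums to 1.
    df_ss /= df_ss.sum(axis=2, keepdims=True) + 1e-12
    return df_ss
```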
During tracking, in order to adapt to changes in the appearance of the target and its environment, the distribution field model of the target image must be updated dynamically: the old target distribution field model is mixed in a fixed proportion with the distribution field model of the newly tracked target image. A simple update formula for the target distribution field model is:
df_{t+1}(i, j, k) = ρ·df_t(i, j, k) + (1 - ρ)·df_{t-1}(i, j, k)    (4)
In the formula: ρ denotes the learning rate; its value lies between 0 and 1 and is usually set to 0.9; it controls the update speed of the target model;
t denotes the current frame of the tracked image sequence, t+1 the frame following the current frame, and t-1 the frame preceding the current frame;
df_{t+1}(i, j, k) denotes the value at row i, column j of the k-th layer of the target distribution field model of the frame following the current frame;
df_t(i, j, k) denotes the value at row i, column j of the k-th layer of the target distribution field model of the current frame;
df_{t-1}(i, j, k) denotes the value at row i, column j of the k-th layer of the target distribution field model of the frame preceding the current frame;
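The linear blend of formula (4) is trivial to realize; the following one-liner is an assumed Python sketch, with `rho` standing for the learning rate ρ:

```python
def update_model(df_previous, df_current, rho=0.9):
    """Formula (4): blend the current and previous target models with learning rate rho."""
    return rho * df_current + (1.0 - rho) * df_previous
```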
2) Once the distribution field model of target image I has been Gaussian-smoothed, the target model is established. Next, a region to be searched is determined in the next frame of the image and Gaussian-smoothed in the same way to obtain the candidate-region distribution field; the image block whose correlation coefficient with the target model is maximal is then sought. The correlation coefficient between the target model and the candidate-region distribution field can be expressed as:
C_{i,j} = Σ_{k=1..K} Σ_{m=1..M} Σ_{n=1..N} df(m, n, k)·df_{i,j}(m+i-1, n+j-1, k)    (5)
In the formula: df(m, n, k) is the target model;
df_{i,j}(m+i-1, n+j-1, k) is the candidate-region distribution field;
i and j are the row and column coordinates in the candidate region;
C_{i,j} is the correlation matrix between the candidate-region distribution field and the target model;
m and n denote the row and column summation indices;
If the candidate image is M × M and the target image is N × N, both must be zero-padded in the frequency domain to size (M+N-1)²; letting S = M+N-1, the complexity of the algorithm is reduced to O(S² log₂ S), because the time-domain correlation of formula (5) can be realized as a product in the frequency domain. The correlation coefficient between the target model and the candidate-region distribution field is therefore computed in the frequency domain as:
C_{i,j} = Σ_{k=1..K} ifft( fft(df(k)) · (fft(df_{i,j}(k)))* )    (6)
In the formula:
ifft(fft(df(k))·(fft(df_{i,j}(k)))*) means that the k-th layers of the target distribution field model df(k) and of the candidate distribution field df_{i,j}(k) are each transformed with the fast Fourier transform, giving fft(df(k)) and fft(df_{i,j}(k)); the conjugate (fft(df_{i,j}(k)))* of the transformed candidate distribution field is taken and multiplied with fft(df(k)) to obtain the frequency-domain correlation matrix, and the inverse Fourier transform finally yields the time-domain correlation matrix C_{i,j};
fft(df(k)) denotes the fast Fourier transform of the k-th layer of the target distribution field model df(k);
(fft(df_{i,j}(k)))* denotes the conjugate of the fast Fourier transform of the k-th layer of the candidate distribution field df_{i,j}(k);
df(k) and df_{i,j}(k) are the k-th layers of the target model and of the candidate-region distribution field, respectively;
C_{i,j} is the correlation matrix between the candidate-region distribution field and the target distribution field model;
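A minimal sketch of the frequency-domain correlation of formula (6), assuming NumPy's FFT routines; the padded size and the variable names are illustrative assumptions, not the patent's notation:

```python
import numpy as np

def correlation_via_fft(df_target, df_candidate):
    """Formula (6): sum over layers of ifft(fft(target layer) * conj(fft(candidate layer))).

    df_target has shape (N, N, K); df_candidate has shape (M, M, K).
    Returns a real correlation surface of size (M+N-1) x (M+N-1).
    """
    N = df_target.shape[0]
    M = df_candidate.shape[0]
    S = M + N - 1                                  # zero-padded size, as in the complexity analysis
    C = np.zeros((S, S))
    for k in range(df_target.shape[2]):
        Ft = np.fft.fft2(df_target[:, :, k], s=(S, S))
        Fc = np.fft.fft2(df_candidate[:, :, k], s=(S, S))
        C += np.real(np.fft.ifft2(Ft * np.conj(Fc)))
    return C

# The best match corresponds to the maximum of the correlation surface, e.g.:
# di, dj = np.unravel_index(np.argmax(C), C.shape)
```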
3) Using formula (4), the two target models obtained from the two consecutive frames are fused with the learning rate ρ to update the target model of the current frame;
4) Steps 2) and 3) are repeated in a loop until the whole video sequence ends.
Compared with the prior art, the present invention has the following beneficial effects:
(1) Because a global template-matching search strategy based on the correlation coefficient is adopted, the method overcomes both the limitation of the original distribution field, whose gradient-descent search easily falls into local optima, and the sensitivity to noise of the L1-distance similarity measure, thereby improving the tracking performance.
(2) Because a dense sampling strategy is adopted and, based on circulant-matrix theory, the Fourier transform is used to move the correlation computation from the computationally expensive time domain to the computationally cheap frequency domain, the complexity of the algorithm is greatly reduced and the real-time requirement is met.
[accompanying drawing explanation]
Fig. 1 is a schematic diagram of converting the Twinings image into a distribution field, where Fig. 1(a) is the original image, Fig. 1(b) shows the distribution field of the original image, and Fig. 1(c) shows the distribution field of the original image after smoothing.
Fig. 2 is a schematic diagram of the time-domain global search algorithm.
Fig. 3 is a schematic diagram of the FFT-based correlation-coefficient matching algorithm.
Fig. 4 shows the center-error curves of the video sequences.
Fig. 5 shows some tracking-result comparisons for each video sequence.
[embodiment]
The invention is further described below in conjunction with the accompanying drawings.
A real-time distribution field target tracking method based on global search according to the present invention comprises the following steps:
1) In the target image I selected at the first frame, mark the target position by hand: delimit the target area with a rectangular box and record the coordinates of the upper-left corner of the box together with its width and height. Then represent the delimited target area with a distribution field model, which is constructed as follows:
Using the Kronecker delta function, the target image I is represented as a distribution field model df(i, j, k):
df(i, j, k) = 1 if I(i, j)/(255/K) = k, and 0 otherwise    (1)
In the formula: i and j denote the row and column of target image I, respectively;
K denotes the number of layers into which target image I is divided;
k denotes the index of each layer, k = 1, 2, 3, ..., K;
df(i, j, k) denotes the value at row i, column j of the k-th layer of the decomposed target image I; it takes the value 0 or 1;
the set of pixels whose intensity falls into one interval of depth 255/K is called "one layer";
I(i, j) denotes the pixel value at row i, column j of target image I;
Then Gaussian smoothing is applied to the distribution field model of target image I. The smoothing consists of spatial-domain smoothing and feature-domain smoothing. First the distribution field model of target image I is smoothed in the spatial domain, i.e. along the x and y directions of the image:
df_s(k) = df(k) * h_{σ_s}    (2)
In the formula: df_s(k) denotes the distribution field model of target image I after spatial smoothing;
df(k) denotes the k-th layer of the distribution field model of target image I;
h_{σ_s} is a 2D Gaussian kernel with standard deviation σ_s;
"*" denotes convolution.
Then the spatially smoothed distribution field model of target image I is smoothed in the feature domain:
df_ss(i, j) = df_s(i, j) * h_{σ_f}    (3)
In the formula: df_ss(i, j) denotes the distribution field model of target image I after feature-domain smoothing;
df_s(i, j) denotes the value at row i, column j of the spatially smoothed distribution field model of target image I;
h_{σ_f} is a 1D Gaussian kernel with standard deviation σ_f.
After Gaussian smoothing, the feature column of every pixel of the distribution field model of target image I still sums to 1.
During tracking, in order to adapt to changes in the appearance of the target and its environment, the distribution field model of the target image must be updated dynamically: the old target distribution field model is mixed in a fixed proportion with the distribution field model of the newly tracked target image. A simple update formula for the target distribution field model is:
df_{t+1}(i, j, k) = ρ·df_t(i, j, k) + (1 - ρ)·df_{t-1}(i, j, k)    (4)
In the formula: ρ denotes the learning rate; its value lies between 0 and 1 and is usually set to 0.9; it controls the update speed of the target model;
t denotes the current frame of the tracked image sequence, t+1 the frame following the current frame, and t-1 the frame preceding the current frame;
df_{t+1}(i, j, k) denotes the value at row i, column j of the k-th layer of the target distribution field model of the frame following the current frame;
df_t(i, j, k) denotes the value at row i, column j of the k-th layer of the target distribution field model of the current frame;
df_{t-1}(i, j, k) denotes the value at row i, column j of the k-th layer of the target distribution field model of the frame preceding the current frame;
2) Once the distribution field model of target image I has been Gaussian-smoothed, the target model is established. Next, a region to be searched is determined in the next frame of the image and Gaussian-smoothed in the same way to obtain the candidate-region distribution field; the image block whose correlation coefficient with the target model is maximal is then sought. The correlation coefficient between the target model and the candidate-region distribution field can be expressed as:
C_{i,j} = Σ_{k=1..K} Σ_{m=1..M} Σ_{n=1..N} df(m, n, k)·df_{i,j}(m+i-1, n+j-1, k)    (5)
In the formula: df(m, n, k) is the target model;
df_{i,j}(m+i-1, n+j-1, k) is the candidate-region distribution field;
i and j are the row and column coordinates in the candidate region;
C_{i,j} is the correlation matrix between the candidate-region distribution field and the target model;
m and n denote the row and column summation indices;
If the candidate image is M × M and the target image is N × N, both must be zero-padded in the frequency domain to size (M+N-1)²; letting S = M+N-1, the complexity of the algorithm is reduced to O(S² log₂ S), because the time-domain correlation of formula (5) can be realized as a product in the frequency domain. The correlation coefficient between the target model and the candidate-region distribution field is therefore computed in the frequency domain as:
C_{i,j} = Σ_{k=1..K} ifft( fft(df(k)) · (fft(df_{i,j}(k)))* )    (6)
In the formula:
ifft(fft(df(k))·(fft(df_{i,j}(k)))*) means that the k-th layers of the target distribution field model df(k) and of the candidate distribution field df_{i,j}(k) are each transformed with the fast Fourier transform, giving fft(df(k)) and fft(df_{i,j}(k)); the conjugate (fft(df_{i,j}(k)))* of the transformed candidate distribution field is taken and multiplied with fft(df(k)) to obtain the frequency-domain correlation matrix, and the inverse Fourier transform finally yields the time-domain correlation matrix C_{i,j};
fft(df(k)) denotes the fast Fourier transform of the k-th layer of the target distribution field model df(k);
(fft(df_{i,j}(k)))* denotes the conjugate of the fast Fourier transform of the k-th layer of the candidate distribution field df_{i,j}(k);
df(k) and df_{i,j}(k) are the k-th layers of the target model and of the candidate-region distribution field, respectively;
C_{i,j} is the correlation matrix between the candidate-region distribution field and the target distribution field model;
3) Using formula (4), the two target models obtained from the two consecutive frames are fused with the learning rate ρ to update the target model of the current frame;
4) Steps 2) and 3) are repeated in a loop until the whole video sequence ends.
Referring to Fig. 1 and Fig. 2: Fig. 1 is a schematic diagram of converting the Twinings image into a distribution field, where Fig. 1(a) is the original image, Fig. 1(b) shows the distribution field of the original image, and Fig. 1(c) shows the distribution field of the original image after smoothing; Fig. 2 is a schematic diagram of the time-domain global search algorithm.
An image-matching algorithm based on a global search strategy searches and matches point by point within the candidate image, as shown in Fig. 2: each dot marks a searched position and the black dot marks the best matching position. This exhaustive global search strategy has the property of finding the globally optimal solution and avoids local extrema, but it has a drawback: when the target is large or the candidate region is large, the amount of computation is huge and the real-time requirement cannot be met.
Fig. 3 is a schematic diagram of the FFT-based correlation-coefficient matching algorithm.
Using the FFT to realize fast image matching can effectively reduce the time of the correlation computation, so that large-scale correlation matching can be processed in real time. The correlation values for which the target image extends beyond the boundary of the candidate image contribute nothing to the final result. Let the search image be X(k) of size L × L and the template image be H(k) of size M × M; both images are enlarged to N ≥ L with N = 2^r (r an integer). The upper-left part of the resulting correlation surface is the part required for image matching, while the lower-right part is the wrap-around part of the circular correlation; the wrap-around part is simply ignored during matching and does not affect the matching result.
The concrete operation steps are as follows:
(1) The search image X(k) and the template image H(k) are first zero-extended to N × N; the FFT is then applied to obtain X(k) and H(k) in the frequency domain, and the conjugate X*(k) is taken.
(2) The complex matrices X*(k) and H(k) are multiplied point by point and the conjugate of the product is taken, giving:
Y(k) = (H(k)X*(k))* = H*(k)X(k)
(3) An N × N-point IFFT is applied to Y(k) to obtain the correlation matrix Y(n); the lower-right wrap-around part is then discarded to obtain the required correlation matrix y(n);
(4) The maximum of the matrix y(n) is the best matching point; its (x, y) coordinates are recovered and returned.
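A hedged sketch of operation steps (1)-(4), assuming NumPy and grayscale inputs; the power-of-two padding choice and the helper name are illustrative assumptions:

```python
import numpy as np

def fft_match(search_img, template):
    """Steps (1)-(4): FFT-based correlation matching with power-of-two zero padding."""
    L = max(search_img.shape)                     # search image X(k) is roughly L x L
    N = 1 << int(np.ceil(np.log2(L)))             # step (1): extend both images to N x N, N = 2^r >= L

    X = np.fft.fft2(search_img, s=(N, N))         # step (1): FFT of the zero-extended search image
    H = np.fft.fft2(template, s=(N, N))           # step (1): FFT of the zero-extended template
    Y = np.conj(H * np.conj(X))                   # step (2): Y(k) = (H(k) X*(k))* = H*(k) X(k)

    corr = np.real(np.fft.ifft2(Y))               # step (3): N x N point IFFT gives the correlation surface
    valid = corr[:search_img.shape[0] - template.shape[0] + 1,
                 :search_img.shape[1] - template.shape[1] + 1]   # keep the upper-left part, drop wrap-around

    y, x = np.unravel_index(np.argmax(valid), valid.shape)       # step (4): coordinates of the best match
    return y, x
```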
Embodiment:
To verify the performance of the proposed algorithm, the video library compiled by Babenko (2009) was chosen as the test set. It covers many of the difficulties of visual tracking, such as long-term occlusion (Occluded Face, Occluded Face2), target rotation (Sylv, Twinings), special appearance (Surfer, Coke11), target deformation (Cliffbar, Twinings), illumination variation (David, Sylv, Coke11), fast motion with occlusion (Tiger1, Tiger2) and distraction by similar objects (Dollar, Cliffbar). For comparison, the multiple-instance-learning tracker (MIL), representative of sparse sampling, and the original distribution field tracker (DF) were chosen, and the algorithms were compared in terms of success rate, average error and speed to test the performance of the proposed algorithm. The algorithms were run in Matlab R2010a under Windows 7 on a computer with an Intel(R) Core(TM) i3-2130 3.40 GHz CPU and 4 GB RAM.
Parameter settings
The algorithm parameters are set as follows. The number of layers K of the distribution field is set to 4 or 8 depending on the characteristics of the video sequence: fewer layers are faster but less accurate, while for some videos more layers do not necessarily give a better tracking result. The width and variance of the spatial Gaussian smoothing follow the recommendation of the original distribution field tracker: the larger the target, the larger the parameters, and vice versa. In the present invention the variance of the spatial Gaussian smoothing is 0.3 and the variance of the feature-domain Gaussian smoothing is 1. Because the target size and motion amplitude differ between videos, the search radius of the candidate region is set to either 5 or 20. Finally, the learning rate of the model update is set to 0.95, except that it is 0.8 for Coke11 and Tiger1 and 0.75 for Sylv and Surfer, to meet the requirements of the different videos. For the multiple-instance-learning tracker and the original distribution field tracker, the best tracking result of each algorithm is reported.
Quantitative analysis
The experiments analyse the tracking results with three different measures: the center offset between the tracked position and the ground-truth position (Table 1), the tracking success rate (Table 2) and the tracking speed (Table 3). The center error reflects the offset between the tracked position and the true position: the smaller its value, the closer the tracking is to the true position and the better the result; the larger its value, the farther the tracked position is from the true position and the worse the tracking performance. Because the scenes of the different video sequences differ, this value also fluctuates considerably. For a video frame, tracking is considered successful if (A ∩ B)/(A ∪ B) > 0.5, where A denotes the tracked rectangle and B denotes the ground-truth rectangle. This overlap rate is another index of tracking performance: tracking is generally considered successful when the overlap between the tracked rectangle and the ground-truth rectangle exceeds 50%, and failed otherwise; the larger the value, the better the tracking effect.
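A small sketch of the overlap criterion (A ∩ B)/(A ∪ B) > 0.5 described above, assuming rectangles given as (x, y, width, height); the function and parameter names are purely illustrative:

```python
def is_success(rect_a, rect_b, threshold=0.5):
    """Return True when the overlap rate (intersection over union) of two boxes exceeds the threshold."""
    ax, ay, aw, ah = rect_a
    bx, by, bw, bh = rect_b
    iw = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))   # width of the intersection
    ih = max(0.0, min(ay + ah, by + bh) - max(ay, by))   # height of the intersection
    inter = iw * ih
    union = aw * ah + bw * bh - inter
    return inter / union > threshold
```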
As can be seen from Table 1 and Table 2, for most video sequences the proposed algorithm achieves better results than the multiple-instance-learning and distribution field algorithms in both tracking center error and tracking success rate. Table 3 compares the tracking speed of the three algorithms; clearly the proposed algorithm is faster and remains real-time even in the Matlab environment. Fig. 4 shows the relative position error (in pixels) between the tracking result and the ground-truth target position.
Table 1: Center distance between tracking result and ground-truth position
Table 2: Tracking success rate
Note: in Table 1 and Table 2, bold indicates the best result and italics indicate the second-best result.
The reason for these results is mainly the particular way the distribution field represents the target: by spreading the image into a feature volume and smoothing it in both the feature domain and the spatial domain, all information about the target is retained while the basin of attraction of the target is enlarged, so the influence of "uncertainty" is handled carefully and the effects of target deformation, occlusion and illumination variation during motion can be coped with. In its search strategy, however, the original distribution field uses gradient descent: a new frame is likewise spread and smoothed into a distribution field model, the gradient of the L1-norm difference between the target distribution field and the candidate sample is computed, and the search continues along the descent direction until a locally better solution is reached, which is then used to update the appearance model of the target. When the target moves violently or is occluded for a long time, searching only within a limited region easily falls into a local optimum; erroneous background information is then added to the target model, which can no longer represent the target effectively, the matching precision of subsequent searches drops, and the tracking result is limited. Therefore the present invention proposes a global search strategy in which the correlation coefficient replaces the L1 norm, overcoming the limitations of the original distribution field in local search and in real-time performance.
Table 3: Tracking speed (frames per second)
Note: bold indicates the best result and italics indicate the second-best result.
Qualitative analysis
Fig. 5 lists the tracking results of the proposed method and the other two trackers on some frames of the 12 video sequences. The original distribution field (DF) and the algorithm of the present invention both achieve good tracking results; since both inherit the advantages of the distribution field algorithm, which handles blur while remaining sensitive to spatial structure, they behave well under long-term occlusion (see #732 Occluded face, #387 Occluded face2), large-scale changes of complex background (see #886 Sylvester, #281 David) and targets that are similar to the background (#236 Dollar).
The distribution field target descriptor proposed by the original distribution field tracker has a wide basin of attraction and can retain the spatial information of the target together with an uncertainty representation of it, so a simple gradient-descent search can be used and good tracking results obtained. However, gradient descent suffers from the problem of locally optimal solutions, which makes the tracking inaccurate, and the accumulation of inaccurate tracking eventually causes drift. The present invention instead performs global search matching with a dense sampling strategy, which avoids this defect, and uses the fast Fourier transform, so the algorithm has low complexity, runs fast and obtains better tracking results (see Twinings #429, Girl #311).
The multiple-instance-learning method can update the appearance model adaptively and correctly reflect changes in the appearance of the target; it also introduces the concept of bags, which alleviates to some extent the tracking drift caused by inaccurate matching. It performs well under occlusion, but drift appears once the occlusion ends. This may be because the Noisy-OR model used by multiple-instance learning does not make full use of the correct information of the previous frame, so the positive samples collected in the next frame are inefficient, the error in training the classifier grows, and drift finally results (see Occluded face #732, Occluded face2 #387).
In general, the proposed algorithm first performs global search matching with a dense sampling strategy, which prevents the algorithm from falling into a locally optimal solution, and then uses the Fourier transform to move the expensive time-domain computation to the frequency domain, where it is computed quickly. While improving the tracking speed, this overcomes the slowness of multiple-instance-learning tracking and distribution field tracking and their tendency to fall into local minima. The algorithm of the present invention therefore has advantages in coping with occlusion, appearance change, illumination variation and rotation, and runs in real time.
It should be added that the video library compiled by Babenko (2009), which the present invention uses as its test set, covers many of the difficulties of visual tracking, such as long-term occlusion (Occluded Face, Occluded Face2), target rotation (Sylv, Twinings), special appearance (Surfer, Coke11), target deformation (Cliffbar, Twinings), illumination variation (David, Sylv, Coke11), fast motion with occlusion (Tiger1, Tiger2) and distraction by similar objects (Dollar, Cliffbar); it has been adopted by most researchers and has become a de facto standard test set in the target tracking field.
Reference: Babenko B, Yang M H, Belongie S. Visual Tracking with Online Multiple Instance Learning[C]//Proc of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, 2009: 983-990.

Claims (2)

1. A real-time distribution field target tracking method based on global search, characterized in that it comprises the following steps:
1) In the target image I selected at the first frame, mark the target position by hand: delimit the target area with a rectangular box and record the coordinates of the upper-left corner of the box together with its width and height. Then represent the delimited target area with a distribution field model, which is constructed as follows:
Using the Kronecker delta function, the target image I is represented as a distribution field model df(i, j, k):
df(i, j, k) = 1 if I(i, j)/(255/K) = k, and 0 otherwise    (1)
In the formula: i and j denote the row and column of target image I, respectively;
K denotes the number of layers into which target image I is divided;
k denotes the index of each layer, k = 1, 2, 3, ..., K;
df(i, j, k) denotes the value at row i, column j of the k-th layer of the decomposed target image I; it takes the value 0 or 1;
the set of pixels whose intensity falls into one interval of depth 255/K is called "one layer";
I(i, j) denotes the pixel value at row i, column j of target image I;
Then Gaussian smoothing is applied to the distribution field model of target image I. The smoothing consists of spatial-domain smoothing and feature-domain smoothing. First the distribution field model of target image I is smoothed in the spatial domain, i.e. along the x and y directions of the image:
df_s(k) = df(k) * h_{σ_s}    (2)
In the formula: df_s(k) denotes the distribution field model of target image I after spatial smoothing;
df(k) denotes the k-th layer of the distribution field model of target image I;
h_{σ_s} is a 2D Gaussian kernel with standard deviation σ_s;
"*" denotes convolution.
Then the spatially smoothed distribution field model of target image I is smoothed in the feature domain:
df_ss(i, j) = df_s(i, j) * h_{σ_f}    (3)
In the formula: df_ss(i, j) denotes the distribution field model of target image I after feature-domain smoothing;
df_s(i, j) denotes the value at row i, column j of the spatially smoothed distribution field model of target image I;
h_{σ_f} is a 1D Gaussian kernel with standard deviation σ_f.
After Gaussian smoothing, the feature column of every pixel of the distribution field model of target image I still sums to 1.
During tracking, in order to adapt to changes in the appearance of the target and its environment, the distribution field model of the target image must be updated dynamically: the old target distribution field model is mixed in a fixed proportion with the distribution field model of the newly tracked target image. A simple update formula for the target distribution field model is:
df_{t+1}(i, j, k) = ρ·df_t(i, j, k) + (1 - ρ)·df_{t-1}(i, j, k)    (4)
In the formula: ρ denotes the learning rate; its value lies between 0 and 1, and it controls the update speed of the target model;
t denotes the current frame of the tracked image sequence, t+1 the frame following the current frame, and t-1 the frame preceding the current frame;
df_{t+1}(i, j, k) denotes the value at row i, column j of the k-th layer of the target distribution field model of the frame following the current frame;
df_t(i, j, k) denotes the value at row i, column j of the k-th layer of the target distribution field model of the current frame;
df_{t-1}(i, j, k) denotes the value at row i, column j of the k-th layer of the target distribution field model of the frame preceding the current frame;
2) Once the distribution field model of target image I has been Gaussian-smoothed, the target model is established. Next, a region to be searched is determined in the next frame of the image and Gaussian-smoothed in the same way to obtain the candidate-region distribution field; the image block whose correlation coefficient with the target model is maximal is then sought. The correlation coefficient between the target model and the candidate-region distribution field can be expressed as:
C_{i,j} = Σ_{k=1..K} Σ_{m=1..M} Σ_{n=1..N} df(m, n, k)·df_{i,j}(m+i-1, n+j-1, k)    (5)
In the formula: df(m, n, k) is the target model;
df_{i,j}(m+i-1, n+j-1, k) is the candidate-region distribution field;
i and j are the row and column coordinates in the candidate region;
C_{i,j} is the correlation matrix between the candidate-region distribution field and the target model;
m and n denote the row and column summation indices;
If the candidate image is M × M and the target image is N × N, both must be zero-padded in the frequency domain to size (M+N-1)²; letting S = M+N-1, the complexity of the algorithm is reduced to O(S² log₂ S), because the time-domain correlation of formula (5) can be realized as a product in the frequency domain. The correlation coefficient between the target model and the candidate-region distribution field is therefore computed in the frequency domain as:
C_{i,j} = Σ_{k=1..K} ifft( fft(df(k)) · (fft(df_{i,j}(k)))* )    (6)
In the formula:
ifft(fft(df(k))·(fft(df_{i,j}(k)))*) means that the k-th layers of the target distribution field model df(k) and of the candidate distribution field df_{i,j}(k) are each transformed with the fast Fourier transform, giving fft(df(k)) and fft(df_{i,j}(k)); the conjugate (fft(df_{i,j}(k)))* of the transformed candidate distribution field is taken and multiplied with fft(df(k)) to obtain the frequency-domain correlation matrix, and the inverse Fourier transform finally yields the time-domain correlation matrix C_{i,j};
fft(df(k)) denotes the fast Fourier transform of the k-th layer of the target distribution field model df(k);
(fft(df_{i,j}(k)))* denotes the conjugate of the fast Fourier transform of the k-th layer of the candidate distribution field df_{i,j}(k);
df(k) and df_{i,j}(k) are the k-th layers of the target model and of the candidate-region distribution field, respectively;
C_{i,j} is the correlation matrix between the candidate-region distribution field and the target distribution field model;
3) Using formula (4), the two target models obtained from the two consecutive frames are fused with the learning rate ρ to update the target model of the current frame;
4) Steps 2) and 3) are repeated in a loop until the whole video sequence ends.
2. The real-time distribution field target tracking method based on global search according to claim 1, characterized in that the learning rate ρ is set to 0.9.
CN201410298728.7A 2014-06-26 2014-06-26 Real-time distribution field target tracking method based on global search Pending CN104036528A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410298728.7A CN104036528A (en) 2014-06-26 2014-06-26 Real-time distribution field target tracking method based on global search

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410298728.7A CN104036528A (en) 2014-06-26 2014-06-26 Real-time distribution field target tracking method based on global search

Publications (1)

Publication Number Publication Date
CN104036528A true CN104036528A (en) 2014-09-10

Family

ID=51467287

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410298728.7A Pending CN104036528A (en) 2014-06-26 2014-06-26 Real-time distribution field target tracking method based on global search

Country Status (1)

Country Link
CN (1) CN104036528A (en)

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103761747A (en) * 2013-12-31 2014-04-30 西北农林科技大学 Target tracking method based on weighted distribution field

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
叱干鹏飞 et al.: 《http://www.cnki.net/kcms/detail/51.1196.TP.20140418.0917.063.html》, 18 April 2014 *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104574445A (en) * 2015-01-23 2015-04-29 北京航空航天大学 Target tracking method and device
CN105100727A (en) * 2015-08-14 2015-11-25 河海大学 Real-time tracking method for specified object in fixed position monitoring image
CN105100727B (en) * 2015-08-14 2018-03-13 河海大学 A kind of fixed bit monitoring image middle finger earnest product method for real time tracking
CN106408593A (en) * 2016-09-18 2017-02-15 东软集团股份有限公司 Video-based vehicle tracking method and device
CN106408593B (en) * 2016-09-18 2019-05-17 东软集团股份有限公司 A kind of wireless vehicle tracking and device based on video
CN109255304A (en) * 2018-08-17 2019-01-22 西安电子科技大学 Method for tracking target based on distribution field feature
CN109766943A (en) * 2019-01-10 2019-05-17 哈尔滨工业大学(深圳) A kind of template matching method and system based on global perception diversity measurement
CN109766943B (en) * 2019-01-10 2020-08-21 哈尔滨工业大学(深圳) Template matching method and system based on global perception diversity measurement
CN113610888A (en) * 2021-06-29 2021-11-05 南京信息工程大学 Twin network target tracking method based on Gaussian smoothness
CN113610888B (en) * 2021-06-29 2023-11-24 南京信息工程大学 Twin network target tracking method based on Gaussian smoothing


Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20140910