CN107341817A - Self-adaptive visual track algorithm based on online metric learning - Google Patents
- Publication number
- CN107341817A (CN201710455281.3A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/246—Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/277—Analysis of motion involving stochastic approaches, e.g. using Kalman filters
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Image Analysis (AREA)
Abstract
The present invention relates to the field of visual tracking, and specifically to a self-adaptive visual tracking algorithm based on online metric learning. In practical visual tracking scenarios, little prior knowledge about the target is usually available in the video sequence to be tracked, and traditional predefined distance metrics struggle to meet the requirements of long-term tracking tasks. The invention proposes a robust online visual tracking algorithm that incorporates distance metric learning. It treats tracking as a two-class classification problem between foreground and background, and continuously updates the classifier as the video advances. A new template update algorithm is also proposed to make the tracking process more robust. To improve the accuracy and efficiency of the algorithm, dense SIFT features and randomized PCA are used to reduce the feature dimension while preserving tracking performance. A series of experimental results shows that the proposed algorithm is competitive with many currently popular algorithms.
Description
Technical field:
The present invention relates to the field of visual tracking, and specifically to a self-adaptive visual tracking algorithm based on online metric learning.
Background technology:
As an important topic in computer vision, visual tracking has been studied for many years. Its main task is to identify the target in a video sequence and to keep tracking the target's position as the video advances. Among the many tracking approaches, classifying the target's appearance in a feature space is one of the most popular.
Most existing algorithms use a predefined distance metric, such as the Euclidean distance or the Mahalanobis distance, when computing similarity. However, when a structured target deforms noticeably, such a fixed metric can hardly achieve high tracking accuracy. Changes in background and illumination can likewise cause tracking algorithms based on predefined distance metrics to fail.
Researchers have therefore proposed adaptive tracking algorithms based on distance metric learning to improve tracking robustness. The basic idea of metric learning tracking is to train a classifier on the first few frames of the video sequence and to keep updating it during subsequent tracking. The learned distance metric separates foreground and background better in the feature space, while reducing the distance between points that share a class label. The new projected space usually has a lower dimension, which also makes the computation considerably cheaper than in the original space.
Summary of the invention:
To address the shortcomings and deficiencies of the prior art, the present invention proposes a self-adaptive visual tracking algorithm based on online metric learning. It abandons the predefined distance metric used by most current algorithms and searches for matches in a continuously learned new space, which broadens the variety of targets that can be tracked and improves the robustness of the algorithm in long-term tracking. It also incorporates adaptive template selection, replacing the mechanical template update of existing algorithms, so that the template library adapts to target changes more accurately and quickly while retaining fault tolerance.
The present invention is achieved by the following measures:
A self-adaptive visual tracking algorithm based on online metric learning, characterized by comprising the following:
Step 1: Describe the target with SIFT features. Feature extraction uses the fast dSIFT algorithm from the VLFeat feature extraction library. For a patch of 50 × 50 pixels, the corresponding SIFT feature dimension is 128 × 9 = 1152, which is a heavy computational load for an online tracking algorithm; the extracted SIFT features are therefore reduced in dimension with randomized PCA (RPCA).
Given a feature matrix $X$, the basic idea of feature dimensionality reduction is to find a mapping matrix that projects the original n-dimensional feature space onto a new k-dimensional feature space. Once the feature matrix and the target dimension are known, RPCA defines an oversampling dimension and a random projection matrix $\Omega$. Applying a QR decomposition to the resulting matrix $Y = X\Omega$ gives
$$QR = Y, \qquad B := Q^{T}X \tag{1}$$
where $B$ is an intermediate matrix. An SVD is then applied to $B$:
$$B = \tilde{U}\Sigma V^{T} \tag{2}$$
An approximation of $X$ is then obtained from
$$X \approx QQ^{T}X = Q(\tilde{U}\Sigma V^{T}) := U\Sigma V^{T} \tag{3}$$
Finally, the projected feature matrix $X_{proj}$ is obtained as $X_{proj} = XV_k$, where $V_k$ consists of the first k columns of $V$. For a 1152-dimensional SIFT feature, the dimension after reduction is about 1/3 of the original, i.e. 384 dimensions.
Step 2: Construct and train the classifier with supervised learning. Given two samples $x$ and $y$, the purpose of distance metric learning is to adjust the original feature space and change the relative spatial relationships among the training samples, so that the distance between samples of the same class is reduced as far as possible and the distance between samples of different classes is increased as far as possible. Under this condition the distance can be written as
$$d_G(x, y) = (x-y)^{T}G(x-y) \tag{4}$$
where $G$ is the learned distance metric matrix. Using the K-L divergence as the similarity measure, the constraints above can be written as
$$\min_{G}\ KL\big(p(x;G_0)\,\|\,p(x;G)\big) \quad \text{s.t.} \quad \begin{cases} d_G(x,y) \le l, & \text{if } label(x) = label(y) \\ d_G(x,y) \ge u, & \text{if } label(x) \ne label(y) \end{cases} \tag{5}$$
where $l$ and $u$ are two distance thresholds.
The divergence problem in (5) is solved with the LogDet method. Because the number of samples available in a single frame is very limited, the values of all parameters cannot be obtained, so the algorithm constructs the training set with the bootstrap method. In practice, the training set contains two classes of samples: positive samples representing the target and negative samples representing background information.
After an initial distance metric matrix satisfying the constraints has been obtained, the tracking algorithm updates the distance metric in every subsequent frame. Given two patches $u_t$ and $v_t$ extracted in frame t, the distance between them is computed under the current metric. If the predicted distance is $y_t$, the new distance metric $G_{t+1}$ is obtained by solving
$$G_{t+1} = \arg\min_{G \succ 0}\ D(G, G_t) + \eta L\big(d_G(u_t, v_t), y_t\big) \tag{6}$$
where $D$ is the regularization function, $\eta$ is the regularization parameter, and $L$ is the loss function between the target distance and the estimated distance. Let $z_t = u_t - v_t$; the solution of this minimization problem is then given by equation (7).
The objective of distance metric learning is given by equation (8), in which the two sample sets are the target and background samples of frame t in the sample library.
The present invention also includes adaptive selection for template update. According to the update mode used, the training templates are divided into two classes: fast-update templates and robust-update templates. The former adapt to target deformation in time, while the latter prevent the tracking result from drifting. Fast-update templates are extracted from every frame according to a search template, whose design ensures both efficient handling of background information and an accurate description of the target. Robust-update templates are stored in the template library, whose size is fixed for a given video sequence; the initial templates are extracted around the target position that the user marks in the first frame.
In the fast-update template extraction pattern of the present invention, each asterisk (*) marks the center of one extracted sample patch. With the current target center as the origin, patches extracted within 2 pixels of it are positive samples and the remaining patches are negative samples.
After the algorithm has estimated the target position in the current frame, the tracking algorithm extracts 9 target-sized patches from the region within 2 pixels around that position. Let $I(x; t)$ denote the patch extracted from frame t that has the smallest distance to the positive sample set $T_{pos}$ of the training set $T$, and consider the SIFT feature extracted from this patch. If the average distance between this feature and $T_{pos}$ is below a threshold, the template library update computes the distance between the feature and each template in the current template library $M_t = \{m_1, \ldots, m_k\}$ and compares it with the distances among the templates already in the library. If the distance corresponding to the new template is larger than the distance corresponding to at least one template in $M_t$, the template with the smallest corresponding distance is replaced by the new template $I(x; t)$. The rationale is that a new template whose distance to the positive sample set is below the threshold can be regarded as a positive sample; if, in addition, its distance to the templates in the current library is larger, it is considered to carry more new information that the current templates lack. The new template therefore replaces the old template that is most similar to the other templates in $M_t$.
The present invention proposes an online visual tracking algorithm based on metric learning. It effectively addresses the problem of the target being easily lost during long-term tracking, and adapts well to motion blur and object deformation. The proposed algorithm continuously updates the metric space through distance metric learning, which improves its robustness. The templates in the proposed adaptive template library are divided by update mode into fast-update templates and robust-update templates, which respectively handle noticeable target deformation and maintain the robustness and continuity of tracking. Combining the two lets the algorithm adapt quickly to target changes while retaining the ability to recover the target after a misjudgment. In addition, the proposed visual tracking algorithm uses randomized principal component analysis for feature dimensionality reduction, which reduces the original dSIFT feature dimension by 2/3 and further increases the speed of the algorithm. The proposed algorithm was compared with currently popular algorithms on OTB video sequences; the experimental results show that it handles most tracking tasks well and is highly competitive.
Brief description of the drawings:
Fig. 1 shows the fast-update template extraction pattern of the present invention.
Fig. 2 is a schematic diagram of the overlap ratio curves of the present invention.
Fig. 3 is a schematic diagram of the center location error of the present invention.
Fig. 4 is a visual comparison of tracking results of the present invention.
Detailed description of the embodiments:
The present invention is further described below with reference to the accompanying drawings.
The present invention proposes a visual tracking algorithm based on distance metric learning. It abandons the predefined distance metric used by most current algorithms and searches for matches in a continuously learned new space, which broadens the variety of targets that can be tracked and improves robustness in long-term tracking. The algorithm also incorporates adaptive template selection, replacing the mechanical template update of existing algorithms, so that the template library adapts to target changes more accurately and quickly while retaining fault tolerance.
One of the key issues in building a robust visual tracking algorithm is choosing a suitable target description, which affects both the tracking result and the tracking speed. Intuitively, pixel values can describe an object effectively and are easy to obtain. Without further processing, however, gray-value features are easily affected by changes in illumination, pose and similar conditions. Even when metric matrix learning compensates for this, improving the performance of a tracker based on gray-value features remains very challenging. Because SIFT features cope better with complex situations such as rotation, slight deformation and illumination change, the present invention focuses on SIFT features.
SIFT describes a target with keypoints and descriptors. For an object, the keypoints correspond to extrema of its multi-scale difference of Gaussians (DoG). After a series of necessary intermediate steps (keypoint elimination, dimensionality reduction, orientation assignment and so on), each keypoint extracted from the target corresponds to a 128-dimensional feature vector. In this application, 9 SIFT keypoints are selected from a patch of 50 × 50 pixels. Because SIFT feature extraction is computationally expensive, it severely affects tracking speed. Feature extraction here therefore uses the fast dSIFT algorithm from the VLFeat feature extraction library, which improves computation speed while preserving feature quality and invariance.
For a patch of 50 × 50 pixels, the corresponding SIFT feature dimension is 128 × 9 = 1152, which is a heavy computational load for an online tracking algorithm. The algorithm therefore reduces the dimension of the extracted SIFT features with randomized PCA (RPCA).
Given a feature matrix $X$, the basic idea of feature dimensionality reduction is to find a mapping matrix that projects the original n-dimensional feature space onto a new k-dimensional feature space. Once the feature matrix and the target dimension are known, RPCA defines an oversampling dimension and a random projection matrix $\Omega$. Applying a QR decomposition to the resulting matrix $Y = X\Omega$ gives
$$QR = Y, \qquad B := Q^{T}X \tag{9}$$
where $B$ is an intermediate matrix. An SVD is then applied to $B$:
$$B = \tilde{U}\Sigma V^{T} \tag{10}$$
An approximation of $X$ is then obtained from
$$X \approx QQ^{T}X = Q(\tilde{U}\Sigma V^{T}) := U\Sigma V^{T} \tag{11}$$
Finally, the projected feature matrix $X_{proj}$ is obtained as $X_{proj} = XV_k$, where $V_k$ consists of the first k columns of $V$. For a 1152-dimensional SIFT feature, the dimension after reduction is about 1/3 of the original, i.e. 384 dimensions. Compared with the random projection (RP) algorithm, RPCA gives more stable results.
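The randomized PCA steps described above can be sketched in Python with NumPy. This is a minimal sketch under stated assumptions, not the patented implementation: the function name `randomized_pca_projection`, the Gaussian choice for the random matrix Ω, the default `oversample` value and the omission of mean-centering are all illustrative choices that the text does not specify.

```python
import numpy as np


def randomized_pca_projection(X, k, oversample=10, seed=0):
    """Project the rows of X (m samples x n features) onto k dimensions.

    Follows the steps in the text: Y = X @ Omega, QR = Y, B = Q^T X,
    B = U~ S V^T, X_proj = X @ V_k. Returns (X_proj, V_k).
    """
    rng = np.random.default_rng(seed)
    n = X.shape[1]
    sketch_dim = k + oversample                    # oversampling dimension (assumed form)
    omega = rng.standard_normal((n, sketch_dim))   # random projection matrix (assumed Gaussian)
    Y = X @ omega                                  # sketch of the column space of X
    Q, _ = np.linalg.qr(Y)                         # orthonormal basis: QR = Y
    B = Q.T @ X                                    # small intermediate matrix
    _, _, Vt = np.linalg.svd(B, full_matrices=False)
    V_k = Vt[:k].T                                 # first k right singular vectors, n x k
    return X @ V_k, V_k


if __name__ == "__main__":
    # Hypothetical data: 200 patches with 1152-dimensional dSIFT features.
    features = np.random.default_rng(42).standard_normal((200, 1152))
    reduced, _ = randomized_pca_projection(features, k=384)
    print(reduced.shape)
```

With k = 384 the output has one third of the original 1152 dimensions, matching the reduction ratio stated in the text.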
Machine learning methods fall into two classes depending on whether the training set carries class labels: supervised learning and unsupervised learning. On the whole, supervised learning (whose training process includes class labels) performs better in most visual tracking applications. First, class labels for training samples are easy to obtain during tracking, because in both the initialization and the tracking decision stages the target position and related information before the current frame are known. The target, as foreground, is labeled with class 1; all other information is treated as background and labeled with class 0 (or -1). In addition, unsupervised algorithms usually need longer training time, because their training consists of repeated clustering and mean shifting. The algorithm therefore uses supervised learning to construct and train the classifier.
Given two samples $x$ and $y$, the purpose of distance metric learning is to adjust the original feature space and change the relative spatial relationships among the training samples, so that the distance between samples of the same class is reduced as far as possible and the distance between samples of different classes is increased as far as possible. Under this condition the distance can be written as
$$d_G(x, y) = (x-y)^{T}G(x-y) \tag{12}$$
where $G$ is the learned distance metric matrix. Using the K-L divergence as the similarity measure, the constraints above can be written as
$$\min_{G}\ KL\big(p(x;G_0)\,\|\,p(x;G)\big) \quad \text{s.t.} \quad \begin{cases} d_G(x,y) \le l, & \text{if } label(x) = label(y) \\ d_G(x,y) \ge u, & \text{if } label(x) \ne label(y) \end{cases} \tag{13}$$
where $l$ and $u$ are two distance thresholds.
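The learned distance and its two threshold constraints can be illustrated with a short pure-Python sketch. The function names `metric_distance` and `satisfies_constraint`, and the parameter names `l_same` and `u_diff` (standing for the thresholds l and u), are hypothetical and chosen only for this illustration.

```python
def metric_distance(x, y, G):
    """Return d_G(x, y) = (x - y)^T G (x - y) for list-based vectors and matrix."""
    diff = [a - b for a, b in zip(x, y)]
    # Matrix-vector product G @ diff, one row at a time.
    g_diff = [sum(g * d for g, d in zip(row, diff)) for row in G]
    return sum(d * gd for d, gd in zip(diff, g_diff))


def satisfies_constraint(x, y, same_label, G, l_same, u_diff):
    """Check the pair constraint: d_G <= l for same-class pairs, d_G >= u otherwise."""
    d = metric_distance(x, y, G)
    return d <= l_same if same_label else d >= u_diff


if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]
    # With G equal to the identity, d_G is the squared Euclidean distance.
    print(metric_distance([1.0, 2.0], [4.0, 6.0], identity))
```

With the identity matrix as G the function reduces to the squared Euclidean distance, which is the kind of fixed metric the learned G is meant to replace.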
The divergence problem in (13) is solved with the LogDet method. Because the number of samples available in a single frame is very limited, the values of all parameters cannot be obtained, so the algorithm constructs the training set with the bootstrap method. In practice, the training set contains two classes of samples: positive samples representing the target and negative samples representing background information.
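For reference, the link between the K-L objective and the LogDet divergence can be written out as follows. This is the standard identity from information-theoretic metric learning, given here as background under the assumption that $p(x; G)$ denotes a Gaussian whose inverse covariance is $G$; the patent text itself does not spell it out.

$$KL\big(p(x;G_0)\,\|\,p(x;G)\big) = \tfrac{1}{2} D_{ld}(G, G_0), \qquad D_{ld}(G, G_0) = \operatorname{tr}(G G_0^{-1}) - \log\det(G G_0^{-1}) - n$$

Here $n$ is the feature dimension. Under this assumption, minimizing the K-L divergence subject to the distance constraints is equivalent to minimizing the LogDet divergence between $G$ and the initial metric $G_0$.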
After an initial distance metric matrix satisfying the constraints has been obtained, the tracking algorithm updates the distance metric in every subsequent frame. Given two patches $u_t$ and $v_t$ extracted in frame t, the distance between them is computed under the current metric. If the predicted distance is $y_t$, the new distance metric $G_{t+1}$ is obtained by solving
$$G_{t+1} = \arg\min_{G \succ 0}\ D(G, G_t) + \eta L\big(d_G(u_t, v_t), y_t\big) \tag{14}$$
where $D$ is the regularization function, $\eta$ is the regularization parameter, and $L$ is the loss function between the target distance and the estimated distance. Let $z_t = u_t - v_t$; the solution of this minimization problem is then given by equation (15).
The objective of distance metric learning is given by equation (16), in which the two sample sets are the target and background samples of frame t in the sample library.
Templates have an important influence on the algorithm during tracking. Typically, when a new video frame arrives, most existing algorithms extract patches around the center given by the user or around the estimated target position to obtain positive and negative templates. This makes the templates sensitive to object deformation, so it is hard to handle the deformation a target object may undergo during the video, which can cause tracking to fail. Once the target is lost, all templates are affected and such algorithms can hardly recover it.
To avoid this, the training templates are divided into two classes according to the update mode used: fast-update templates and robust-update templates. The former adapt to target deformation in time, while the latter prevent the tracking result from drifting. Fast-update templates are extracted from every frame according to the search template shown in Fig. 1. The design of the search template ensures both efficient handling of background information and an accurate description of the target. Robust-update templates are stored in the template library, whose size is fixed for a given video sequence. The initial templates are extracted around the target position that the user marks in the first frame.
As shown in Fig. 1, each asterisk (*) in the fast-update template extraction pattern marks the center of one extracted sample patch. With the current target center as the origin, patches extracted within 2 pixels of it are positive samples and the remaining patches are negative samples.
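The positive/negative labeling rule for fast-update samples can be sketched as follows. The sketch assumes that "within 2 pixels" means Euclidean distance from the target center, which the text does not state explicitly; the function name `label_sample_centers` and the example offsets are hypothetical. Labels follow the convention given earlier: 1 for foreground and 0 for background.

```python
import math


def label_sample_centers(offsets, radius=2.0):
    """Label each sample center, given as a (dx, dy) offset from the target center.

    Returns 1 (positive, target) for offsets within `radius` pixels and
    0 (negative, background) otherwise. Euclidean distance is an assumption.
    """
    return [1 if math.hypot(dx, dy) <= radius else 0 for dx, dy in offsets]


if __name__ == "__main__":
    # Hypothetical sampling pattern around the current target center.
    pattern = [(0, 0), (2, 0), (2, 2), (-1, 1), (10, -4)]
    print(label_sample_centers(pattern))
```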
After the algorithm has estimated the target position in the current frame, the tracking algorithm extracts 9 target-sized patches from the region within 2 pixels around that position. Let $I(x; t)$ denote the patch extracted from frame t that has the smallest distance to the positive sample set $T_{pos}$ of the training set $T$, and consider the SIFT feature extracted from this patch. If the average distance between this feature and $T_{pos}$ is below a threshold, the template library update computes the distance between the feature and each template in the current template library $M_t = \{m_1, \ldots, m_k\}$ and compares it with the distances among the templates already in the library. If the distance corresponding to the new template is larger than the distance corresponding to at least one template in $M_t$, the template with the smallest corresponding distance is replaced by the new template $I(x; t)$. The rationale is that a new template whose distance to the positive sample set is below the threshold can be regarded as a positive sample; if, in addition, its distance to the templates in the current library is larger, it is considered to carry more new information that the current templates lack. The new template therefore replaces the old template that is most similar to the other templates in $M_t$.
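One possible reading of this template library update rule is sketched below in pure Python. Several details are assumptions made for illustration only: the "distance corresponding to a template" is taken to be its mean distance to the other templates in the library, the default distance is the squared Euclidean distance rather than the learned metric, and the names `update_library`, `mean_dist` and `sq_dist` are hypothetical. The library must hold at least two templates.

```python
def sq_dist(a, b):
    """Squared Euclidean distance; a stand-in for the learned metric d_G."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def mean_dist(feature, others, dist):
    """Average distance from one feature vector to a list of feature vectors."""
    return sum(dist(feature, o) for o in others) / len(others)


def update_library(library, positives, candidate, threshold, dist=sq_dist):
    """Return a copy of the library, with at most one template replaced."""
    # Gate: the candidate must be close enough to the positive sample set.
    if mean_dist(candidate, positives, dist) >= threshold:
        return list(library)
    # Novelty of the candidate relative to the current library.
    cand_score = mean_dist(candidate, library, dist)
    # Score of each stored template: mean distance to the other templates.
    scores = [
        mean_dist(m, library[:i] + library[i + 1:], dist)
        for i, m in enumerate(library)
    ]
    # The template most similar to the rest is the replacement candidate.
    weakest = min(range(len(library)), key=scores.__getitem__)
    updated = list(library)
    if cand_score > scores[weakest]:
        updated[weakest] = candidate
    return updated
```

Replacing the template that is closest to its peers keeps the library diverse, which is the stated motivation: the incoming template carries information the stored ones lack.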
A series of experiments was carried out to compare the proposed algorithm with several currently popular algorithms. The video sequences used are from the OTB benchmark. They cover many kinds of conditions, including pose, illumination, rotation and scale changes, occlusion and fast motion. The experimental environment was MATLAB R2012a on an Intel 3.10 GHz processor with 4 GB of RAM.
The proposed algorithm was compared with the following seven algorithms: 1) CCT; 2) CSK; 3) DFT; 4) FOT; 5) KCF; 6) LCT; 7) LSHT.
Two evaluation metrics were used in the comparison: overlap ratio (OR) and center location error (CLE). Given the patch $P_t$ at the estimated position in frame t and the ground-truth region $G_t$ of that frame, the overlap ratio is defined as
$$OR = \frac{|P_t \cap G_t|}{|P_t \cup G_t|} \tag{17}$$
where ∩ and ∪ denote the intersection and union of regions and |·| denotes the number of pixels in a region. The OR curve is drawn for thresholds from 0 to 1, and each point on the curve gives the proportion of frames whose overlap ratio exceeds the threshold. The higher the OR curve, the better the algorithm performs.
CLE is the Euclidean distance between the centers of the estimated position and the ground truth:
$$CLE = \sqrt{(Px_t - Gx_t)^2 + (Py_t - Gy_t)^2} \tag{18}$$
where $(Px_t, Py_t)$ and $(Gx_t, Gy_t)$ are the centers of the estimated position and of the ground-truth position, respectively. Likewise, the smaller the center location error, the better the algorithm performs.
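Both metrics are straightforward to compute for axis-aligned bounding boxes. The sketch below assumes boxes are given as `(x, y, width, height)` tuples, a representation chosen for illustration; the function names are hypothetical. `fraction_above` gives one point of an OR curve for a chosen threshold.

```python
import math


def overlap_ratio(p, g):
    """Overlap ratio |P ∩ G| / |P ∪ G| of two (x, y, w, h) boxes."""
    px, py, pw, ph = p
    gx, gy, gw, gh = g
    inter_w = max(0.0, min(px + pw, gx + gw) - max(px, gx))
    inter_h = max(0.0, min(py + ph, gy + gh) - max(py, gy))
    inter = inter_w * inter_h
    union = pw * ph + gw * gh - inter
    return inter / union if union > 0 else 0.0


def center_location_error(p, g):
    """Euclidean distance between the centers of two (x, y, w, h) boxes."""
    pcx, pcy = p[0] + p[2] / 2.0, p[1] + p[3] / 2.0
    gcx, gcy = g[0] + g[2] / 2.0, g[1] + g[3] / 2.0
    return math.hypot(pcx - gcx, pcy - gcy)


def fraction_above(values, threshold):
    """Proportion of frames whose value exceeds the threshold (one OR curve point)."""
    return sum(1 for v in values if v > threshold) / len(values)


if __name__ == "__main__":
    estimate, truth = (0, 0, 10, 10), (5, 0, 10, 10)
    print(overlap_ratio(estimate, truth), center_location_error(estimate, truth))
```

For a CLE curve the comparison is reversed, since a frame counts as successful when its error is below the threshold.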
Tracking results on some challenging sequences are given here. Throughout the experiments, the parameters of all tracking algorithms were kept fixed, to reflect the requirement that one algorithm should handle all situations well. Fig. 2 shows the overlap ratio (OR) curves for some OTB sequences. The abscissa is the threshold, from 0% to 100%, and the curve gives the proportion of frames in the whole video sequence whose overlap ratio exceeds the threshold. The more slowly a curve falls, therefore, the better the corresponding algorithm performs. Since tracking has failed when there is no overlap at all, the value at a threshold of 1% can be regarded as the success rate of the algorithm. Fig. 3 shows the center location error (CLE) for some OTB video sequences. As in Fig. 2, the CLE curve is drawn for thresholds from 1 to 50, and each point on the curve gives the proportion of frames in the whole video sequence whose center location error is below the threshold. The higher the proportion, the better the algorithm performs. The value at a CLE threshold of 20 is commonly used for analysis.
Fast motion (FM) and motion blur (MB) are common challenges in real video sequences, so an algorithm's performance on them is an important indicator in practical evaluation. The OTB benchmark contains many sequences with fast motion and motion blur, such as BlurCar1, BlurFace, Car11, Deer, Girl2, Human9 and Soccer. The proposed algorithm was compared with the other algorithms on these sequences; the corresponding OR and CLE results are shown in Fig. 2 and Fig. 3, respectively. On most sequences containing FM and MB, the proposed algorithm outperforms the other algorithms. Some algorithms, such as CCT, perform well when the target can be regarded as approximately static, but show clear weaknesses with fast-moving objects and motion blur.
Fig. 2 is a schematic diagram of the overlap ratio curves. The curves are drawn for thresholds from 1% to 100%; the horizontal axis is the threshold and the vertical axis is the proportion of video frames whose overlap ratio exceeds the threshold.
Fig. 3 is a schematic diagram of the center location error. The curves are drawn for thresholds from 1 to 50 on the horizontal axis, where the threshold is the Euclidean distance between the centers of the estimated position and the ground truth; the vertical axis is the proportion of video frames whose error is below the threshold.
Besides FM and MB, deformation of structured objects (DEF), occlusion (OCC), in-plane rotation (IPR) and out-of-plane rotation (OPR) are also major challenges in video tracking. Sequences containing these factors include Bolt2, Coupon, David3, Gym and Trellis. David3 and Trellis additionally contain blurred background, illumination variation and scale variation. The experimental results on these sequences also show the superiority of the proposed tracking algorithm.
Fig. 4 marks the tracking results on the last frame of the above video sequences for a more intuitive comparison. The tracking results of the algorithms are marked in 8 different colors. The proposed algorithm performs better than the other algorithms, and its tracked position is closer to the true target.
Claims (3)
1. a kind of self-adaptive visual track algorithm based on online metric learning, it is characterised in that including herein below:
Step 1:Target is described with SIFT feature:DSIFT in the VLFeat feature extractions storehouse of feature extraction algorithm application is quick
Algorithm, for the segment of 50 × 50 sizes, corresponding SIFT feature dimension is 128 × 9=1152, this to on-line tracking and
Speech is very big amount of calculation;Therefore dimensionality reduction is carried out to the SIFT feature extracted with random PCA (RPCA):It is given
One eigenmatrixThe basic thought of Feature Dimension Reduction is to find a mapping matrixBy original n Wei Te
Sign space is projected to new k dimensional feature spaces, so as to realize dimensionality reduction purpose, after known features matrix and target dimension, and RPCA
An over-sampling dimension can be definedWith an accidental projection matrixShould to the new matrix Y=X Ω being calculated
With feature decomposition, have
<mrow>
<mtable>
<mtr>
<mtd>
<mrow>
<mi>Q</mi>
<mi>R</mi>
<mo>=</mo>
<mi>Y</mi>
</mrow>
</mtd>
</mtr>
<mtr>
<mtd>
<mrow>
<mi>B</mi>
<mo>:</mo>
<mo>=</mo>
<msup>
<mi>Q</mi>
<mi>T</mi>
</msup>
<mi>X</mi>
</mrow>
</mtd>
</mtr>
</mtable>
<mo>-</mo>
<mo>-</mo>
<mo>-</mo>
<mrow>
<mo>(</mo>
<mn>19</mn>
<mo>)</mo>
</mrow>
</mrow>
Wherein B is an intermediary matrix, is then carried out SVD decomposition
<mrow>
<mi>B</mi>
<mo>=</mo>
<mover>
<mi>U</mi>
<mo>~</mo>
</mover>
<msup>
<mi>&Sigma;V</mi>
<mi>T</mi>
</msup>
<mo>-</mo>
<mo>-</mo>
<mo>-</mo>
<mrow>
<mo>(</mo>
<mn>20</mn>
<mo>)</mo>
</mrow>
</mrow>
X approximate matrix can be thus obtained by following formula
<mrow>
<mi>X</mi>
<mo>&ap;</mo>
<msup>
<mi>QQ</mi>
<mi>T</mi>
</msup>
<mi>X</mi>
<mo>=</mo>
<mi>Q</mi>
<mrow>
<mo>(</mo>
<mover>
<mi>U</mi>
<mo>~</mo>
</mover>
<msup>
<mi>&Sigma;V</mi>
<mi>T</mi>
</msup>
<mo>)</mo>
</mrow>
<mo>:</mo>
<mo>=</mo>
<msup>
<mi>U&Sigma;V</mi>
<mi>T</mi>
</msup>
<mo>-</mo>
<mo>-</mo>
<mo>-</mo>
<mrow>
<mo>(</mo>
<mn>21</mn>
<mo>)</mo>
</mrow>
</mrow>
Finally, the eigenmatrix X after new projectionprojCan passes through Xproj=XVkObtain, here VkIt is that k arranges to obtain before being intercepted to V
, for the SIFT feature of one 1152 dimension, the feature quantity after dimensionality reduction is about original 1/3, i.e., 384 tie up;
Step 2:Grader is constructed and trains using supervised study:Give two samplesDistance metric
The purpose of habit is by adjusting original feature space, changing the relative space position relation between training set sample so that identical
Distance reduces as far as possible between the sample of classification, it is different classes of between sample distance increase as far as possible, distance on this condition can be write as
dG(x, y)=(x-y)TG(x-y) (22)
Wherein, G is the distance matrix metric of acquistion, and using K-L divergences as the index for weighing similarity, constraints above is just
It can be write as
$$
\begin{aligned}
\min_{G}\quad & \mathrm{KL}\bigl(p(x; G_0)\,\|\,p(x; G)\bigr) \\
\text{s.t.}\quad & d_G(x, y) \le l, \quad \text{if } \mathrm{label}(x) = \mathrm{label}(y) \\
& d_G(x, y) \ge u, \quad \text{if } \mathrm{label}(x) \ne \mathrm{label}(y)
\end{aligned}
\tag{23}
$$
where l and u are two distance thresholds.
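As a minimal sketch of the distance in (22) and the pairwise constraints in (23), the following plain-Python helpers evaluate d_G for a given matrix G and check whether a labeled pair satisfies its threshold. The function names `metric_distance` and `satisfies_constraint` are illustrative assumptions and do not come from the patent text; the sketch only evaluates the constraints and does not solve the divergence minimization itself.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def metric_distance(x: Sequence[float], y: Sequence[float], G: Matrix) -> float:
    """Evaluate d_G(x, y) = (x - y)^T G (x - y), as in Eq. (22)."""
    z = [xi - yi for xi, yi in zip(x, y)]
    # G z, computed row by row
    Gz = [sum(g * zj for g, zj in zip(row, z)) for row in G]
    return sum(zi * gzi for zi, gzi in zip(z, Gz))


def satisfies_constraint(x: Sequence[float], y: Sequence[float],
                         same_label: bool, G: Matrix,
                         l: float, u: float) -> bool:
    """Check the pairwise constraint of Eq. (23) for one labeled pair.

    Same-class pairs must satisfy d_G <= l; different-class pairs d_G >= u.
    """
    d = metric_distance(x, y, G)
    return d <= l if same_label else d >= u


if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]
    print(metric_distance((1.0, 0.0), (0.0, 0.0), identity))
    print(satisfies_constraint((1.0, 0.0), (0.0, 0.0), True, identity, l=1.5, u=3.0))
```

With G set to the identity matrix the distance reduces to the squared Euclidean distance, which gives a convenient sanity check before a learned G is substituted.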
The above divergence problem is solved with the LogDet method. Because the number of samples that can be
obtained from a single frame is very limited, the values of all parameters cannot be determined, so the
algorithm constructs the training set with the bootstrap method. In practice, the training set contains two
classes of samples: positive samples representing the target and negative samples representing background
information. After the initial distance metric matrix satisfying the constraints has been obtained, the
tracking algorithm updates the distance metric in every subsequent frame. Given two patches u_t and v_t
extracted from frame t, their distance under the current metric is $d_{G_t}(u_t, v_t)$. If the predicted
distance is y_t, the new distance metric G_{t+1} is obtained by solving
$$ G_{t+1} = \arg\min_{G \succ 0}\; D(G, G_t) + \eta\, L\bigl(d_G(u_t, v_t),\, y_t\bigr) \tag{24} $$
where D is the regularization function, η is the regularization parameter, and L is the loss function between
the target distance and the estimated distance. Let z_t = u_t - v_t; then the solution of this minimization
problem is
$$ G_{t+1} = G_t - \frac{\eta(\bar{y} - y_t)\, G_t z_t z_t^{T} G_t}{1 + \eta(\bar{y} - y_t)\, z_t^{T} G_t z_t} \tag{25} $$
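The closed-form update in (25) can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the patented implementation: the function name `online_metric_update` is hypothetical, and ȳ is passed in as the argument `y_bar` because the surrounding text does not define how it is computed.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def online_metric_update(G_t: Matrix, u_t: Sequence[float], v_t: Sequence[float],
                         y_t: float, y_bar: float, eta: float) -> Matrix:
    """Apply the rank-one update of Eq. (25) and return G_{t+1}.

    y_bar is supplied by the caller (assumption: its definition is not
    given in the text at this point).
    """
    z = [a - b for a, b in zip(u_t, v_t)]           # z_t = u_t - v_t
    n = len(z)
    # Column vector G_t z_t and row vector z_t^T G_t
    Gz = [sum(G_t[i][j] * z[j] for j in range(n)) for i in range(n)]
    zG = [sum(z[i] * G_t[i][j] for i in range(n)) for j in range(n)]
    zGz = sum(z[i] * Gz[i] for i in range(n))       # scalar z_t^T G_t z_t
    coeff = eta * (y_bar - y_t)
    denom = 1.0 + coeff * zGz
    if denom == 0.0:
        raise ValueError("update undefined: denominator of Eq. (25) is zero")
    # G_t - coeff * (G_t z_t)(z_t^T G_t) / denom
    return [[G_t[i][j] - coeff * Gz[i] * zG[j] / denom for j in range(n)]
            for i in range(n)]


if __name__ == "__main__":
    G0 = [[1.0, 0.0], [0.0, 1.0]]
    print(online_metric_update(G0, (1.0, 0.0), (0.0, 0.0), y_t=1.0, y_bar=2.0, eta=1.0))
```

Because the correction term is a rank-one matrix, each update costs O(n²) operations for an n-dimensional feature vector, which is one reason a closed form of this kind suits per-frame updating.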
The goal of distance metric learning is
$$
\begin{cases}
d_{G_{t+1}}\bigl(\hat{I}(x;t),\, M\bigr) \ll d_{G_t}\bigl(\hat{I}(x;t),\, M\bigr) \\
d_{G_{t+1}}\bigl(\hat{I}(x;t),\, \hat{J}(x_j;t)\bigr) \ll d_{G_t}\bigl(\hat{I}(x;t),\, \hat{J}(x_j;t)\bigr)
\end{cases}
\tag{26}
$$
where $\hat{I}(x;t)$ and $\hat{J}(x_j;t)$ denote, respectively, the target and background samples of frame t in the sample library.
2. The self-adaptive visual tracking algorithm based on online metric learning according to claim 1,
characterized in that it further comprises adaptively selected template updating. The training templates used
by the algorithm are divided into two classes according to their update mode: fast-update templates and
robust-update templates. The former are used to adapt promptly to deformation of the target, while the latter
are used to prevent the tracking result from drifting. The fast-update templates are extracted in every frame
according to a search template, whose design ensures both the efficiency of the algorithm in processing
background information and an accurate description of the target. The robust-update templates are stored in a
template library whose size is fixed for a given video sequence; the initial templates are extracted around
the target position marked by the user in the first frame.
3. The self-adaptive visual tracking algorithm based on online metric learning according to claim 2,
characterized in that, in the fast-update template extraction scheme, each asterisk (*) denotes the center of
one extracted sample patch. With the current target center as the origin, patches extracted within 2 pixels of
it are positive samples and the remaining patches are negative samples. After the algorithm has estimated the
target position in the current frame, the tracking algorithm extracts 9 target-sized patches from the region
within 2 pixels around that position. Let I(x;t) denote the patch extracted from frame t that has the smallest
distance to the positive sample set T_pos of the training set T, and let $\hat{I}(x;t)$ denote the SIFT
feature extraction result of that patch. If the average distance between $\hat{I}(x;t)$ and T_pos is less
than a threshold, the template library update consists of computing the distance between $\hat{I}(x;t)$ and
each template in the current template library M_t = {m_1, …, m_k} and comparing it with the distances between
the templates within the library. If the distance corresponding to the new template is greater than the
distance corresponding to at least one template in M_t, the template with the smallest corresponding distance
is replaced by the new template I(x;t). The reasoning is as follows. If the distance between the new template
and the positive sample set is below the threshold, the template is considered a positive sample. If, in
addition, its distance to the templates in the current library is larger, it is considered to carry more new
information that the templates in the current library do not have. The new template therefore replaces the old
template that is most similar to the other templates in the library M_t.
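The replacement rule of claim 3 can be sketched as follows. Several details are assumptions made only for illustration: the "distance corresponding to" a template is read here as its mean distance to the other templates in the library, the new template's score is its mean distance to all library templates, squared Euclidean distance stands in for the learned metric d_G, and the names `mean_distance` and `update_template_library` are hypothetical.

```python
from typing import Callable, List, Sequence

Vector = Sequence[float]
Distance = Callable[[Vector, Vector], float]


def sq_euclidean(a: Vector, b: Vector) -> float:
    """Stand-in distance (assumption); the learned metric d_G would replace it."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def mean_distance(x: Vector, others: List[Vector], dist: Distance) -> float:
    """Average distance from x to every vector in others."""
    return sum(dist(x, o) for o in others) / len(others)


def update_template_library(new_feat: Vector, library: List[Vector],
                            positives: List[Vector], threshold: float,
                            dist: Distance = sq_euclidean) -> List[Vector]:
    """Return a new library, possibly with one template replaced by new_feat.

    Assumes the library holds at least two templates.
    """
    # Gate: accept the candidate only if it is close enough to the positive set.
    if mean_distance(new_feat, positives, dist) >= threshold:
        return list(library)
    new_score = mean_distance(new_feat, library, dist)
    # Score of each stored template: mean distance to the other templates.
    scores = [mean_distance(m, library[:i] + library[i + 1:], dist)
              for i, m in enumerate(library)]
    most_redundant = min(range(len(library)), key=scores.__getitem__)
    updated = list(library)
    # Replace the most redundant template if the candidate is more distinct.
    if new_score > scores[most_redundant]:
        updated[most_redundant] = new_feat
    return updated


if __name__ == "__main__":
    lib = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1)]
    print(update_template_library((1.0, 1.0), lib, [(0.5, 0.5)], threshold=1.0))
```

Returning a new list rather than mutating the input keeps the library size fixed, as the claim requires, while leaving the caller free to compare the library before and after the update.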
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710455281.3A CN107341817B (en) | 2017-06-16 | 2017-06-16 | Self-adaptive visual track algorithm based on online metric learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107341817A (en) | 2017-11-10 |
CN107341817B (en) | 2019-05-21 |
Family
ID=60220727
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710455281.3A Active CN107341817B (en) | 2017-06-16 | 2017-06-16 | Self-adaptive visual track algorithm based on online metric learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107341817B (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103778415A (en) * | 2014-01-21 | 2014-05-07 | 蔺全录 | Mine personnel iris checking-in and tracking and positioning method and system |
CN104616324A (en) * | 2015-03-06 | 2015-05-13 | 厦门大学 | Target tracking method based on adaptive appearance model and point-set distance metric learning |
Non-Patent Citations (3)
Title |
---|
SHUQIAO SUN: "Self-Adaptive Visual Tracker Based on Background Information", 《HARBIN: 2016 THE SIXTH INTERNATIONAL CONFERENCE ON INSTRUMENTATION & MEASUREMENT, COMPUTER, COMMUNICATION AND CONTROL》 * |
贾桂敏: "复杂背景下基于自适应模板更新的目标跟踪算法", 《光学学报》 * |
赵永威: "基于特征分组与特征值最优化的距离度量学习方法", 《数据采集与处理》 * |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109934849A (en) * | 2019-03-08 | 2019-06-25 | 西北工业大学 | Online multi-object tracking method based on track metric learning |
CN111854728A (en) * | 2020-05-20 | 2020-10-30 | 哈尔滨工程大学 | Fault-tolerant filtering method based on generalized relative entropy |
CN111854728B (en) * | 2020-05-20 | 2022-12-13 | 哈尔滨工程大学 | Fault-tolerant filtering method based on generalized relative entropy |
CN112037255A (en) * | 2020-08-12 | 2020-12-04 | 深圳市道通智能航空技术有限公司 | Target tracking method and device |
Also Published As
Publication number | Publication date |
---|---|
CN107341817B (en) | 2019-05-21 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||