CN107341817A - Self-adaptive visual track algorithm based on online metric learning - Google Patents


Info

Publication number
CN107341817A
CN107341817A
Authority
CN
China
Prior art keywords
template
distance
Prior art date
Legal status
Granted
Application number
CN201710455281.3A
Other languages
Chinese (zh)
Other versions
CN107341817B (en)
Inventor
康文静
孙叔桥
刘功亮
Current Assignee
Harbin Institute of Technology Weihai
Original Assignee
Harbin Institute of Technology Weihai
Priority date
Filing date
Publication date
Application filed by Harbin Institute of Technology Weihai
Priority to CN201710455281.3A
Publication of CN107341817A
Application granted
Publication of CN107341817B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G06T7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G06T7/277 Analysis of motion involving stochastic approaches, e.g. using Kalman filters

Abstract

The present invention relates to the field of visual tracking, and in particular to a self-adaptive visual tracking algorithm based on online metric learning. In practical visual-tracking scenarios, little prior knowledge of the target is usually available in the video sequence to be tracked, and traditional predefined distance metrics have difficulty meeting the demands of long-term tracking. The present invention proposes a robust online visual tracking algorithm that incorporates distance metric learning: tracking is treated as a two-class foreground/background classification problem, and the classifier is continually updated as the video advances. A new template update algorithm is also proposed to make the tracking process more robust. To improve the accuracy and efficiency of the algorithm, dense SIFT features and randomized PCA are used to reduce the feature dimensionality while preserving tracking performance. A series of experimental results shows that the proposed algorithm is competitive with many currently popular algorithms.

Description

Self-adaptive visual track algorithm based on online metric learning
Technical field:
The present invention relates to the field of visual tracking, and in particular to a self-adaptive visual tracking algorithm based on online metric learning.
Background technology:
As an important topic in computer vision, visual tracking has been studied for many years. Its main task is to identify a target in a video sequence and to keep tracking the target's position as the video advances. Among the many tracking methods, classifying the target's appearance in a feature space is one of the most popular.
Most existing algorithms apply a predefined distance metric matrix in the similarity computation, such as the Euclidean distance or the Mahalanobis distance. However, when a structured target undergoes obvious deformation, such a fixed metric can hardly meet a high tracking-accuracy requirement. Moreover, changes in background and illumination can also cause tracking algorithms based on a predefined distance metric to fail.
Therefore, some scholars have proposed adaptive tracking algorithms based on distance metric learning to improve tracking robustness. The basic idea of metric-learning tracking is to train a classifier on the first few frames of the video sequence and to keep updating it during subsequent tracking. The learned distance metric allows foreground and background to be better separated in the feature space, while the distances between points sharing the same class label shrink. The new projection space usually has a lower dimension, which also makes the computation substantially smaller than in the original space.
The content of the invention:
To address the shortcomings and defects of the prior art, the present invention proposes a self-adaptive visual tracking algorithm based on online metric learning. It abandons the predefined distance metric applied by most current algorithms and instead searches for matching results in a continually learned new space, which broadens the variety of trackable targets and improves robustness under long-term tracking. It further incorporates adaptive template screening in place of the mechanical template update of existing algorithms, so that the template library adapts to target changes more accurately and quickly while preserving fault tolerance.
The present invention is achieved by the following measures:
A self-adaptive visual tracking algorithm based on online metric learning, characterized by comprising the following:
Step 1: Describe the target with SIFT features. Feature extraction uses the dSIFT fast algorithm from the VLFeat feature-extraction library; for a patch of 50×50 pixels, the corresponding SIFT feature dimension is 128×9 = 1152, which is a very large computational load for an online tracking algorithm. Therefore, randomized principal component analysis (RPCA) is applied to reduce the dimensionality of the extracted SIFT features: given a feature matrix X, the basic idea of feature dimensionality reduction is to find a mapping matrix that projects the original n-dimensional feature space onto a new k-dimensional feature space, thereby achieving the dimensionality-reduction goal. Once the feature matrix and the target dimension are known, RPCA defines an oversampling dimension and a random projection matrix Ω and applies the factorization below to the newly computed matrix Y = XΩ:
$QR = Y,\qquad B := Q^{T}X$  (1)
where B is an intermediate matrix; B is then decomposed by SVD:
$B = \tilde{U}\Sigma V^{T}$  (2)
An approximation of X can thus be obtained by the following formula:
$X \approx QQ^{T}X = Q(\tilde{U}\Sigma V^{T}) := U\Sigma V^{T}$  (3)
Finally, the projected feature matrix $X_{proj}$ is obtained via $X_{proj} = XV_{k}$, where $V_{k}$ is formed by keeping the first k columns of V; for a 1152-dimensional SIFT feature, the feature dimension after reduction is about one third of the original, i.e., 384 dimensions;
Step 2: Construct and train the classifier with supervised learning. Given two samples x and y, the purpose of distance metric learning is to adjust the original feature space and change the relative spatial relations between the training-set samples, so that distances between samples of the same class shrink as far as possible while distances between samples of different classes grow as far as possible; under this condition the distance can be written as
$d_{G}(x, y) = (x - y)^{T} G (x - y)$  (4)
where G is the learned distance metric matrix. Using the K-L divergence as the index for measuring similarity, the above constraints can be written as
$\min_{G}\ KL\left(p(x; G_{0})\,\|\,p(x; G)\right)$
$\text{s.t.}\ \ d_{G}(x, y) \le l,\ \text{if } label(x) = label(y)$
$\phantom{\text{s.t.}\ \ } d_{G}(x, y) \ge u,\ \text{if } label(x) \ne label(y)$  (5)
where l and u are two distance thresholds;
The above divergence problem is solved with the LogDet method. Because the number of samples obtainable in one frame is very limited, the values of all parameters cannot be obtained, so a "boot-strap" (bootstrap) method is used in the algorithm to construct the training set; in practice, the training set contains two classes of samples: a positive sample set representing the target and a negative sample set representing the background information.
After an initial distance metric matrix satisfying the constraints has been obtained, the tracking algorithm updates the distance metric in every subsequent frame. Given two patches u_t and v_t extracted in frame t, the distance between them is $d_{G_{t}}(u_{t}, v_{t})$; if the predicted distance is y_t, the new distance metric G_{t+1} can be obtained by solving
$G_{t+1} = \underset{G \succ 0}{\arg\min}\ D(G, G_{t}) + \eta\, L\left(d_{G}(u_{t}, v_{t}), y_{t}\right)$  (6)
where D is a regularization function, η is a regularization parameter, and L is the loss function between the target distance and the estimated distance. Let z_t = u_t − v_t; then the solution of this minimization problem is
$G_{t+1} = G_{t} - \dfrac{\eta(\bar{y} - y_{t})\,G_{t}z_{t}z_{t}^{T}G_{t}}{1 + \eta(\bar{y} - y_{t})\,z_{t}^{T}G_{t}z_{t}}$  (7)
The goal of distance metric learning is
$d_{G_{t+1}}(\hat{I}(x;t),\,M) \ll d_{G_{t}}(\hat{I}(x;t),\,M)$
$d_{G_{t+1}}(\hat{I}(x;t),\,\hat{J}(x_{j};t)) \gg d_{G_{t}}(\hat{I}(x;t),\,\hat{J}(x_{j};t))$  (8)
where Î(x;t) and Ĵ(x_j;t) denote, respectively, the target and background samples in the sample library at frame t.
The present invention also includes adaptive template selection and update. The training templates are divided into two classes according to the update mode used in the algorithm: fast-update templates and robust-update templates. The former are used to adapt to target deformation in time, while the latter are used to prevent the tracking result from drifting. The fast-update templates are extracted from each frame according to a search template, whose design guarantees both the algorithm's efficiency in handling background information and an accurate description of the target. The robust-update templates are stored in the template library, whose size is fixed for a given video sequence; the initial templates are extracted around the target position marked by the user in the first frame.
In the fast-update template extraction pattern of the present invention, each asterisk (*) represents the center position of an extracted sample patch; with the current target center as the origin, patches extracted within 2 pixels of it are positive samples, and the remaining patches are negative samples. After the algorithm estimates the target position in the current frame, 9 patches of the target size are extracted by the tracking algorithm from the region within 2 pixels around that position. Suppose I(x;t) denotes the patch extracted from frame t whose distance to the positive sample set T_pos of the training set T is smallest, and let Î(x;t) denote the SIFT feature extraction result of that patch. If the average distance between Î(x;t) and T_pos is smaller than a threshold, the template library update task is to compute the distance between Î(x;t) and each template in the current template library M_t = {m_1, ..., m_k}, and to compare them with the distances between the target and the templates already in the library. If the distance corresponding to the new template is larger than the distance corresponding to at least one template in M_t, the template with the smallest corresponding distance is replaced by the new template I(x;t). The reason is that if the distance between the new template and the positive sample set is below the threshold, the template can be regarded as a positive sample; meanwhile, if its distance to the templates in the current template library is larger, it is considered to carry more new information, information that the templates in the current library do not provide. Therefore, the new template replaces the old template that is most similar to the other templates in the template library M_t.
The present invention proposes an online visual tracking algorithm based on metric learning, which effectively solves the problem that the target is easily lost during long-term tracking and has strong adaptability to motion blur and target deformation. The proposed algorithm continually updates the metric space through distance metric learning, thereby improving robustness. The templates of the proposed adaptive template library are divided, according to their update mode, into fast-update templates and robust-update templates, which respectively handle obvious target deformation and the robustness and continuity of tracking. Their combination guarantees that the algorithm can quickly adapt to target changes while still preserving the possibility of recovering the target after a misjudgment. In addition, the proposed visual tracking algorithm applies randomized principal component analysis for feature dimensionality reduction, effectively cutting the original dSIFT feature dimension by about two thirds and further increasing the algorithm's speed. The proposed algorithm is compared with currently popular algorithms on OTB video sequences, and the experimental results show that the algorithm of the invention copes well with most tracking tasks and is quite competitive.
Brief description of the drawings:
Fig. 1 shows the fast-update template extraction pattern of the present invention.
Fig. 2 is a schematic diagram of the overlap-rate curves of the present invention.
Fig. 3 is a schematic diagram of the center location error of the present invention.
Fig. 4 is a schematic diagram of an intuitive comparison of tracking results of the present invention.
Embodiment:
The present invention is further illustrated below in conjunction with the accompanying drawings.
The present invention proposes a visual tracking algorithm based on distance metric learning. It abandons the predefined distance metric applied by most current algorithms and searches for matching results in a continually learned new space, which broadens the variety of trackable targets and improves robustness under long-term tracking. In addition, the algorithm incorporates adaptive template screening, replacing the mechanical template update of existing algorithms, so that the template library adapts to target changes more accurately and quickly while maintaining fault tolerance.
One of the major issues in constructing a robust visual tracking algorithm is how to choose a suitable target description, which affects not only the tracking result but also the tracking speed. Intuitively, pixel values can effectively describe an object and are easy to obtain. Without further processing, however, gray-value features are easily affected by changes in illumination, pose and other conditions. Even with the compensation provided by metric matrix learning, improving the performance of a tracker based on gray-value features remains a heavy challenge. Considering that SIFT features cope well with complex situations such as rotation, slight deformation and illumination variation, the present invention focuses on SIFT features.
SIFT features describe a target with keypoints and descriptors. For an object, the keypoints correspond to the extrema of its multi-level difference of Gaussians (DoG). After a series of necessary intermediate steps (keypoint elimination, dimensionality reduction, orientation assignment, etc.), each keypoint extracted on the target corresponds to a 128-dimensional feature vector. In this application, 9 SIFT keypoints are selected from a patch of 50×50 pixels. Because SIFT feature extraction is computationally heavy and would seriously slow down tracking, feature extraction here applies the dSIFT fast algorithm from the VLFeat feature-extraction library, which improves the computation speed while preserving the effectiveness and consistency of the features.
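For illustration only, the following Python sketch computes SIFT descriptors on a fixed keypoint grid with OpenCV as a stand-in for VLFeat's dSIFT; the 3×3 grid, the keypoint scale of 16 pixels and the use of OpenCV are assumptions, not the patent's implementation:

```python
import cv2
import numpy as np

def dense_sift_descriptor(patch_gray, grid=3, kp_size=16):
    """Approximate dense SIFT for one patch: descriptors are computed on a
    fixed grid of keypoints (grid*grid keypoints -> grid*grid*128 dims)."""
    h, w = patch_gray.shape
    sift = cv2.SIFT_create()
    xs = np.linspace(kp_size, w - kp_size, grid)
    ys = np.linspace(kp_size, h - kp_size, grid)
    keypoints = [cv2.KeyPoint(float(x), float(y), float(kp_size))
                 for y in ys for x in xs]
    keypoints, desc = sift.compute(patch_gray, keypoints)
    return desc.reshape(-1)   # e.g. 9 keypoints * 128 = 1152 dimensions

# Example: one 50x50 patch -> a 1152-dimensional feature vector.
patch = (np.random.rand(50, 50) * 255).astype(np.uint8)
print(dense_sift_descriptor(patch).shape)   # (1152,)
```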
For a patch of 50×50 pixels, the corresponding SIFT feature dimension is 128×9 = 1152, which is a very large computational load for an online tracking algorithm. Therefore, this algorithm applies randomized principal component analysis (RPCA) to reduce the dimensionality of the extracted SIFT features.
Given a feature matrix X, the basic idea of feature dimensionality reduction is to find a mapping matrix that projects the original n-dimensional feature space onto a new k-dimensional feature space, thereby achieving the dimensionality-reduction goal. Once the feature matrix and the target dimension are known, RPCA defines an oversampling dimension and a random projection matrix Ω and applies the factorization below to the newly computed matrix Y = XΩ:
$QR = Y,\qquad B := Q^{T}X$  (9)
where B is an intermediate matrix. B is then decomposed by SVD:
$B = \tilde{U}\Sigma V^{T}$  (10)
An approximation of X can thus be obtained by the following formula:
$X \approx QQ^{T}X = Q(\tilde{U}\Sigma V^{T}) := U\Sigma V^{T}$  (11)
Finally, the projected feature matrix $X_{proj}$ is obtained via $X_{proj} = XV_{k}$, where $V_{k}$ is formed by keeping the first k columns of V. For a 1152-dimensional SIFT feature, the feature dimension after reduction is about one third of the original, i.e., 384 dimensions. Compared with the random projection (RP) algorithm, RPCA is more stable.
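A minimal numpy sketch of the randomized-PCA projection described by equations (9)–(11) follows; the oversampling amount, the random seed and the use of numpy's QR/SVD routines are illustrative assumptions:

```python
import numpy as np

def rpca_project(X, k, oversample=10, seed=0):
    """Randomized PCA: project the rows of X (m x n) onto the first k
    right singular directions estimated from a random sketch of X."""
    rng = np.random.default_rng(seed)
    n = X.shape[1]
    p = min(n, k + oversample)           # oversampled sketch width
    Omega = rng.standard_normal((n, p))  # random projection matrix
    Y = X @ Omega                        # Y = X * Omega
    Q, _ = np.linalg.qr(Y)               # QR = Y, eq. (9)
    B = Q.T @ X                          # B := Q^T X
    _, _, Vt = np.linalg.svd(B, full_matrices=False)  # B = U~ Sigma V^T, eq. (10)
    Vk = Vt[:k].T                        # first k columns of V
    return X @ Vk                        # X_proj = X V_k

# Example: reduce 1152-dim dSIFT features of 500 patches to 384 dimensions
# (requires at least k + oversample sample rows).
X = np.random.rand(500, 1152)
print(rpca_project(X, k=384).shape)      # (500, 384)
```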
Machine learning methods can be divided into two classes according to whether the training set has class labels: supervised learning and unsupervised learning. On the whole, supervised learning (whose training process includes class labels) performs better in most visual tracking applications. First, during tracking it is easy to obtain the class labels of the training samples, because both the initialization procedure and the tracking decision procedure make the target position and related information of the preceding frames known. The target is marked with class 1 as foreground, and all other information is regarded as background and marked with class 0 (or -1). In addition, unsupervised algorithms usually require a longer training time, because their training consists of repeated clustering and mean shift. Therefore, this algorithm applies supervised learning to construct and train the classifier.
Given two samples x and y, the purpose of distance metric learning is to adjust the original feature space and change the relative spatial relations between the training-set samples, so that distances between samples of the same class shrink as far as possible while distances between samples of different classes grow as far as possible. Under this condition the distance can be written as
$d_{G}(x, y) = (x - y)^{T} G (x - y)$  (12)
where G is the learned distance metric matrix. Using the K-L divergence as the index for measuring similarity, the above constraints can be written as
$\min_{G}\ KL\left(p(x; G_{0})\,\|\,p(x; G)\right)$
$\text{s.t.}\ \ d_{G}(x, y) \le l,\ \text{if } label(x) = label(y)$
$\phantom{\text{s.t.}\ \ } d_{G}(x, y) \ge u,\ \text{if } label(x) \ne label(y)$  (13)
where l and u are two distance thresholds.
The above divergence problem is solved with the LogDet method. Because the number of samples obtainable in one frame is very limited, the values of all parameters cannot be obtained, so a "boot-strap" (bootstrap) method is used in the algorithm to construct the training set. In practice, the training set contains two classes of samples: a positive sample set representing the target and a negative sample set representing the background information.
After an initial distance metric matrix satisfying the constraints has been obtained, the tracking algorithm updates the distance metric in every subsequent frame. Given two patches u_t and v_t extracted in frame t, the distance between them is $d_{G_{t}}(u_{t}, v_{t})$. If the predicted distance is y_t, the new distance metric G_{t+1} can be obtained by solving
$G_{t+1} = \underset{G \succ 0}{\arg\min}\ D(G, G_{t}) + \eta\, L\left(d_{G}(u_{t}, v_{t}), y_{t}\right)$  (14)
where D is a regularization function, η is a regularization parameter, and L is the loss function between the target distance and the estimated distance. Let z_t = u_t − v_t; then the solution of this minimization problem is
$G_{t+1} = G_{t} - \dfrac{\eta(\bar{y} - y_{t})\,G_{t}z_{t}z_{t}^{T}G_{t}}{1 + \eta(\bar{y} - y_{t})\,z_{t}^{T}G_{t}z_{t}}$  (15)
The goal of distance metric learning is
$d_{G_{t+1}}(\hat{I}(x;t),\,M) \ll d_{G_{t}}(\hat{I}(x;t),\,M)$
$d_{G_{t+1}}(\hat{I}(x;t),\,\hat{J}(x_{j};t)) \gg d_{G_{t}}(\hat{I}(x;t),\,\hat{J}(x_{j};t))$  (16)
where Î(x;t) and Ĵ(x_j;t) denote, respectively, the target and background samples in the sample library at frame t.
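The distance of equation (12) and the closed-form update of equation (15) can be sketched directly in numpy; the text does not spell out how the intermediate value ȳ is computed, so here it is simply passed in as an argument, and using the current distance for it is an illustrative assumption:

```python
import numpy as np

def mahalanobis(G, x, y):
    """d_G(x, y) = (x - y)^T G (x - y), eq. (12)."""
    d = x - y
    return float(d @ G @ d)

def metric_update(G_t, u_t, v_t, y_t, y_bar, eta=0.1):
    """One online metric-learning step, eq. (15), with z_t = u_t - v_t."""
    z = (u_t - v_t).reshape(-1, 1)
    Gz = G_t @ z
    coeff = eta * (y_bar - y_t)
    return G_t - coeff * (Gz @ Gz.T) / (1.0 + coeff * float(z.T @ Gz))

# Example with 384-dimensional (reduced dSIFT) features.
dim = 384
G = np.eye(dim)                          # start from the identity metric
u, v = np.random.rand(dim), np.random.rand(dim)
y_t = 1.0                                # predicted (target) distance
d_before = mahalanobis(G, u, v)
G = metric_update(G, u, v, y_t, y_bar=d_before)
print(d_before, mahalanobis(G, u, v))    # the updated distance moves toward y_t
```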
During tracking, the templates have an important influence on the algorithm. Generally, when a new video frame arrives, most existing algorithms extract patches around the center given by the user or around the estimated target position, thereby obtaining positive and negative templates. This makes the templates sensitive to deformation of the target object, so such algorithms have difficulty coping with the deformations the target may undergo during the video, which leads to tracking failure. Moreover, once the target is lost, such algorithms can hardly recover it, because all templates have already been affected.
To avoid this situation, the training templates are divided into two classes according to the update mode used in the algorithm: fast-update templates and robust-update templates. The former are used to adapt to target deformation in time, while the latter are used to prevent the tracking result from drifting. Fast-update templates are extracted from each frame according to the search template shown in Fig. 1; the design of the search template guarantees both the algorithm's efficiency in handling background information and an accurate description of the target. Robust-update templates are stored in the template library, whose size is fixed for a given video sequence; the initial templates are extracted around the target position marked by the user in the first frame.
As shown in Fig. 1, in the fast-update template extraction pattern each asterisk (*) represents the center position of an extracted sample patch; with the current target center as the origin, patches extracted within 2 pixels of it are positive samples, and the remaining patches are negative samples.
After the algorithm estimates the target position in the current frame, 9 patches of the target size are extracted by the tracking algorithm from the region within 2 pixels around that position. Suppose I(x;t) denotes the patch extracted from frame t whose distance to the positive sample set T_pos of the training set T is smallest, and let Î(x;t) denote the SIFT feature extraction result of that patch. If the average distance between Î(x;t) and T_pos is smaller than a threshold, the template library update task is to compute the distance between Î(x;t) and each template in the current template library M_t = {m_1, ..., m_k}, and to compare them with the distances between the target and the templates already in the library. If the distance corresponding to the new template is larger than the distance corresponding to at least one template in M_t, the template with the smallest corresponding distance is replaced by the new template I(x;t). The reason is that if the distance between the new template and the positive sample set is below the threshold, the template can be regarded as a positive sample; meanwhile, if its distance to the templates in the current template library is larger, it is considered to carry more new information, information that the templates in the current library do not provide. Therefore, the new template replaces the old template that is most similar to the other templates in the template library M_t.
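One possible reading of this update rule is sketched in Python below; the threshold value, the use of average distances, and the use of the learned metric G for all comparisons are assumptions made for illustration:

```python
import numpy as np

def update_template_library(templates, new_feat, pos_samples, G, dist_thresh):
    """Replace the most 'redundant' template with the new patch feature when
    (a) the new patch is close enough to the positive sample set, and
    (b) it is farther from the library than at least one existing template."""
    d_G = lambda a, b: float((a - b) @ G @ (a - b))
    # (a) average distance between the candidate and the positive samples
    if np.mean([d_G(new_feat, p) for p in pos_samples]) >= dist_thresh:
        return templates                              # candidate rejected
    # redundancy of each stored template: mean distance to the other templates
    lib_dist = [np.mean([d_G(m, o) for j, o in enumerate(templates) if j != i])
                for i, m in enumerate(templates)]
    new_dist = np.mean([d_G(new_feat, m) for m in templates])
    # (b) the candidate carries more new information than the most redundant one
    if new_dist > min(lib_dist):
        templates[int(np.argmin(lib_dist))] = new_feat
    return templates

# Example: a library of 5 robust templates over 384-dimensional features.
G = np.eye(384)
library = [np.random.rand(384) for _ in range(5)]
positives = [np.random.rand(384) for _ in range(3)]
library = update_template_library(library, np.random.rand(384), positives, G, 100.0)
```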
A series of experiments is carried out in the present invention to compare the proposed algorithm with some currently popular algorithms. The video sequences used in the experiments are from the OTB benchmark and cover many kinds of situations such as pose, illumination, rotation and scale variation, occlusion, and fast motion. The experimental environment is MATLAB R2012a, and the hardware is an Intel 3.10 GHz processor with 4 GB RAM.
The algorithm of the present invention is compared with the following seven algorithms: 1) CCT; 2) CSK; 3) DFT; 4) FOT; 5) KCF; 6) LCT; 7) LSHT.
Two classes of evaluation indexes are used in the comparison: overlap rate (OR) and center location error (CLE). Given the patch P_t corresponding to the estimated position in frame t and the corresponding ground-truth position G_t of that frame, the overlap rate is defined as
$OR = \dfrac{|P_{t} \cap G_{t}|}{|P_{t} \cup G_{t}|}$  (17)
where ∩ and ∪ denote the intersection and union of the regions, respectively, and |·| denotes the number of pixels in a region. The overlap-rate curve is drawn for thresholds from 0 to 1, and a point on the curve represents the proportion of frames whose overlap rate exceeds the threshold. Clearly, the higher the overlap-rate curve, the better the algorithm performs.
CLE represents the deviation between the estimated center and the ground-truth center under the Euclidean distance:
$CLE = \sqrt{(Px_{t} - Gx_{t})^{2} + (Py_{t} - Gy_{t})^{2}}$  (18)
where (Px_t, Py_t) and (Gx_t, Gy_t) denote the center of the estimated position and the center of the ground-truth position, respectively. Likewise, the smaller the center location error, the better the algorithm performs.
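Both evaluation indexes can be written in a few lines of Python, assuming axis-aligned boxes given as (x, y, w, h):

```python
import math

def overlap_rate(box_p, box_g):
    """OR = |P ∩ G| / |P ∪ G| for boxes (x, y, w, h), eq. (17)."""
    x1 = max(box_p[0], box_g[0]); y1 = max(box_p[1], box_g[1])
    x2 = min(box_p[0] + box_p[2], box_g[0] + box_g[2])
    y2 = min(box_p[1] + box_p[3], box_g[1] + box_g[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    union = box_p[2] * box_p[3] + box_g[2] * box_g[3] - inter
    return inter / union if union > 0 else 0.0

def center_location_error(box_p, box_g):
    """CLE = Euclidean distance between the box centers, eq. (18)."""
    pcx, pcy = box_p[0] + box_p[2] / 2, box_p[1] + box_p[3] / 2
    gcx, gcy = box_g[0] + box_g[2] / 2, box_g[1] + box_g[3] / 2
    return math.hypot(pcx - gcx, pcy - gcy)

# Example frame: estimated box vs. ground-truth box.
print(overlap_rate((10, 10, 50, 50), (20, 15, 50, 50)))           # ~0.56
print(center_location_error((10, 10, 50, 50), (20, 15, 50, 50)))  # ~11.18
```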
Here the tracking results on some challenging sequences are given. Throughout the experiments, the parameters of all tracking algorithms are fixed, to match the requirement that an algorithm should cope well with all situations. Fig. 2 shows the overlap rate (OR) curves for part of the OTB sequences; the abscissa corresponds to thresholds from 0% to 100%, and the curve gives the proportion of frames of the whole video sequence whose overlap rate exceeds the threshold, so the more slowly a curve declines, the better the corresponding algorithm performs. Considering that tracking has failed when there is no overlapping region, the value at a threshold of 1% can be regarded as the success rate of the algorithm. Fig. 3 shows the center location error (CLE) for part of the OTB video sequences. Similar to Fig. 2, the CLE curve is drawn for thresholds from 1 to 50, and a point on the curve represents the proportion of frames of the whole video sequence whose center deviation is below the threshold; the higher the proportion, the better the algorithm performs. The value at a center-deviation threshold of 20 is usually used for analysis.
Fast motion (FM) and motion blur (MB) are common challenges in real video sequences, so how an algorithm handles them is an important indicator in practical evaluation. The OTB benchmark contains many video sequences with fast motion and motion blur, such as BlurCar1, BlurFace, Car11, Deer, Girl2, Human9 and Soccer. These video sequences are used to compare the proposed algorithm with the other algorithms, and the corresponding OR and CLE results are shown in Figs. 2 and 3, respectively. It can be seen that the proposed algorithm outperforms the other algorithms on most sequences containing FM and MB. Although some algorithms (such as CCT) perform well on tracking tasks that can be regarded as approximately static, they show obvious defects when handling fast-moving targets and motion blur.
Fig. 2 is a schematic diagram of the overlap-rate curves; the curves are drawn for thresholds from 1% to 100%, the horizontal axis is the threshold, and the vertical axis is the proportion of video frames whose overlap rate exceeds the threshold.
Fig. 3 is a schematic diagram of the center location error; the curves are drawn for thresholds from 1 to 50, where the threshold corresponds to the Euclidean distance between the centers of the estimated position and the ground truth, and the vertical axis represents the proportion of video frames whose deviation is below the threshold.
Besides FM and MB, deformation of structured objects (DEF), occlusion (OCC), in-plane rotation (IPR) and out-of-plane rotation (OPR) are also significant challenges in video tracking. Video sequences containing these factors include Bolt2, Coupon, David3, Gym and Trellis; in addition, David3 and Trellis also contain background blur, illumination variation and scale variation. The experimental results on these video sequences likewise demonstrate the superiority of the proposed tracking algorithm.
Fig. 4 marks the tracking results for the last frame of the above video sequences for a more intuitive comparison. The tracking results of the algorithms are marked in 8 different colors. It can be seen that the proposed algorithm performs better than the other algorithms, and the position of the tracked target is closer to the real target.

Claims (3)

1. A self-adaptive visual tracking algorithm based on online metric learning, characterized by comprising the following:
Step 1: Describe the target with SIFT features. Feature extraction uses the dSIFT fast algorithm from the VLFeat feature-extraction library; for a patch of 50×50 pixels, the corresponding SIFT feature dimension is 128×9 = 1152, which is a very large computational load for an online tracking algorithm; therefore, randomized principal component analysis (RPCA) is used to reduce the dimensionality of the extracted SIFT features: given a feature matrix X, the basic idea of feature dimensionality reduction is to find a mapping matrix that projects the original n-dimensional feature space onto a new k-dimensional feature space, thereby achieving the dimensionality-reduction goal; once the feature matrix and the target dimension are known, RPCA defines an oversampling dimension and a random projection matrix Ω and applies the factorization below to the newly computed matrix Y = XΩ:
$QR = Y,\qquad B := Q^{T}X$  (19)
where B is an intermediate matrix; B is then decomposed by SVD:
$B = \tilde{U}\Sigma V^{T}$  (20)
An approximation of X can thus be obtained by the following formula:
$X \approx QQ^{T}X = Q(\tilde{U}\Sigma V^{T}) := U\Sigma V^{T}$  (21)
Finally, the projected feature matrix $X_{proj}$ is obtained via $X_{proj} = XV_{k}$, where $V_{k}$ is formed by keeping the first k columns of V; for a 1152-dimensional SIFT feature, the feature dimension after reduction is about one third of the original, i.e., 384 dimensions;
Step 2: Construct and train the classifier with supervised learning. Given two samples x and y, the purpose of distance metric learning is to adjust the original feature space and change the relative spatial relations between the training-set samples, so that distances between samples of the same class shrink as far as possible while distances between samples of different classes grow as far as possible; under this condition the distance can be written as
$d_{G}(x, y) = (x - y)^{T} G (x - y)$  (22)
where G is the learned distance metric matrix; using the K-L divergence as the index for measuring similarity, the above constraints can be written as
$\min_{G}\ KL\left(p(x; G_{0})\,\|\,p(x; G)\right)$
$\text{s.t.}\ \ d_{G}(x, y) \le l,\ \text{if } label(x) = label(y)$
$\phantom{\text{s.t.}\ \ } d_{G}(x, y) \ge u,\ \text{if } label(x) \ne label(y)$  (23)
where l and u are two distance thresholds;
The above divergence problem is solved with the LogDet method; because the number of samples obtainable in one frame is very limited, the values of all parameters cannot be obtained, so a "boot-strap" (bootstrap) method is used in the algorithm to construct the training set; in practice, the training set contains two classes of samples: a positive sample set representing the target and a negative sample set representing the background information;
After an initial distance metric matrix satisfying the constraints has been obtained, the tracking algorithm updates the distance metric in every subsequent frame; given two patches u_t and v_t extracted in frame t, the distance between them is $d_{G_{t}}(u_{t}, v_{t})$; if the predicted distance is y_t, the new distance metric G_{t+1} can be obtained by solving
$G_{t+1} = \underset{G \succ 0}{\arg\min}\ D(G, G_{t}) + \eta\, L\left(d_{G}(u_{t}, v_{t}), y_{t}\right)$  (24)
where D is a regularization function, η is a regularization parameter, and L is the loss function between the target distance and the estimated distance; let z_t = u_t − v_t; then the solution of this minimization problem is
$G_{t+1} = G_{t} - \dfrac{\eta(\bar{y} - y_{t})\,G_{t}z_{t}z_{t}^{T}G_{t}}{1 + \eta(\bar{y} - y_{t})\,z_{t}^{T}G_{t}z_{t}}$  (25)
The goal of distance metric learning is
$d_{G_{t+1}}(\hat{I}(x;t),\,M) \ll d_{G_{t}}(\hat{I}(x;t),\,M)$
$d_{G_{t+1}}(\hat{I}(x;t),\,\hat{J}(x_{j};t)) \gg d_{G_{t}}(\hat{I}(x;t),\,\hat{J}(x_{j};t))$  (26)
where Î(x;t) and Ĵ(x_j;t) denote, respectively, the target and background samples in the sample library at frame t.
2. The self-adaptive visual tracking algorithm based on online metric learning according to claim 1, characterized in that it further comprises adaptive template selection and update, in which the training templates are divided into two classes according to the update mode used in the algorithm: fast-update templates and robust-update templates; the former are used to adapt to target deformation in time, while the latter are used to prevent the tracking result from drifting; the fast-update templates are extracted from each frame according to a search template, whose design guarantees both the algorithm's efficiency in handling background information and an accurate description of the target; the robust-update templates are stored in the template library, whose size is fixed for a given video sequence, and the initial templates are extracted around the target position marked by the user in the first frame.
3. The self-adaptive visual tracking algorithm based on online metric learning according to claim 2, characterized in that, in the fast-update template extraction pattern, each asterisk (*) represents the center position of an extracted sample patch; with the current target center as the origin, patches extracted within 2 pixels of it are positive samples, and the remaining patches are negative samples; after the algorithm estimates the target position in the current frame, 9 patches of the target size are extracted by the tracking algorithm from the region within 2 pixels around that position; suppose I(x;t) denotes the patch extracted from frame t whose distance to the positive sample set T_pos of the training set T is smallest, and let Î(x;t) denote the SIFT feature extraction result of that patch; if the average distance between Î(x;t) and T_pos is smaller than a threshold, the template library update task is to compute the distance between Î(x;t) and each template in the current template library M_t = {m_1, ..., m_k}, and to compare them with the distances between the target and the templates already in the library; if the distance corresponding to the new template is larger than the distance corresponding to at least one template in M_t, the template with the smallest corresponding distance is replaced by the new template I(x;t); the reason is that if the distance between the new template and the positive sample set is below the threshold, the template can be regarded as a positive sample; meanwhile, if its distance to the templates in the current template library is larger, it is considered to carry more new information, information that the templates in the current library do not provide; therefore, the new template replaces the old template that is most similar to the other templates in the template library M_t.
CN201710455281.3A 2017-06-16 2017-06-16 Self-adaptive visual track algorithm based on online metric learning Active CN107341817B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710455281.3A CN107341817B (en) 2017-06-16 2017-06-16 Self-adaptive visual track algorithm based on online metric learning

Publications (2)

Publication Number Publication Date
CN107341817A true CN107341817A (en) 2017-11-10
CN107341817B CN107341817B (en) 2019-05-21

Family

ID=60220727

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710455281.3A Active CN107341817B (en) 2017-06-16 2017-06-16 Self-adaptive visual track algorithm based on online metric learning

Country Status (1)

Country Link
CN (1) CN107341817B (en)

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103778415A (en) * 2014-01-21 2014-05-07 蔺全录 Mine personnel iris checking-in and tracking and positioning method and system
CN104616324A (en) * 2015-03-06 2015-05-13 厦门大学 Target tracking method based on adaptive appearance model and point-set distance metric learning

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
SHUQIAO SUN: "Self-Adaptive Visual Tracker Based on Background Information", Harbin: 2016 The Sixth International Conference on Instrumentation & Measurement, Computer, Communication and Control *
贾桂敏: "Target tracking algorithm based on adaptive template update under complex background", Acta Optica Sinica *
赵永威: "Distance metric learning method based on feature grouping and eigenvalue optimization", Journal of Data Acquisition and Processing *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109934849A (en) * 2019-03-08 2019-06-25 西北工业大学 Online multi-object tracking method based on track metric learning
CN111854728A (en) * 2020-05-20 2020-10-30 哈尔滨工程大学 Fault-tolerant filtering method based on generalized relative entropy
CN111854728B (en) * 2020-05-20 2022-12-13 哈尔滨工程大学 Fault-tolerant filtering method based on generalized relative entropy
CN112037255A (en) * 2020-08-12 2020-12-04 深圳市道通智能航空技术有限公司 Target tracking method and device

Also Published As

Publication number Publication date
CN107341817B (en) 2019-05-21

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant