CN112233140A - SSVM tracking method based on DIOU loss and smoothness constraint - Google Patents

SSVM tracking method based on DIOU loss and smoothness constraint Download PDF

Info

Publication number
CN112233140A
CN112233140A CN202010755733.1A CN202010755733A CN112233140A CN 112233140 A CN112233140 A CN 112233140A CN 202010755733 A CN202010755733 A CN 202010755733A CN 112233140 A CN112233140 A CN 112233140A
Authority
CN
China
Prior art keywords
target
tracking
representing
formula
structured
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010755733.1A
Other languages
Chinese (zh)
Other versions
CN112233140B (en
Inventor
袁广林
孙子文
李从利
秦晓燕
韩裕生
陈萍
李豪
琚长瑞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
PLA Army Academy of Artillery and Air Defense
Original Assignee
PLA Army Academy of Artillery and Air Defense
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by PLA Army Academy of Artillery and Air Defense filed Critical PLA Army Academy of Artillery and Air Defense
Priority to CN202010755733.1A priority Critical patent/CN112233140B/en
Publication of CN112233140A publication Critical patent/CN112233140A/en
Application granted granted Critical
Publication of CN112233140B publication Critical patent/CN112233140B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10016Video; Image sequence
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30196Human being; Person
    • G06T2207/30201Face

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an SSVM tracking method based on DIOU loss and smoothness constraint. The method comprises the following steps: establishing a structured SVM model based on DIOU loss and smoothness constraint, and converting the structured SVM model based on DIOU loss and smoothness constraint into a dual problem according to a Lagrange multiplier method; solving a structured SVM model based on DIOU loss and smooth constraint by adopting a dual coordinate descent principle, and estimating the state of a target; and evaluating the position of the tracking target by adopting a multi-scale target tracking method, and selecting the structured output with the maximum response as a tracking result. The method overcomes the problems of inaccurate loss function and model drift, and effectively improves the accuracy and the success rate of the target tracking algorithm based on the structured SVM.

Description

SSVM tracking method based on DIOU loss and smoothness constraint
Technical Field
The invention relates to the technical field of computer visual target tracking, in particular to an SSVM tracking method based on DIOU loss and smoothness constraint.
Background
Object tracking is an important research content in the field of computer vision, the object of which is to estimate the state of an object using a sequence of images. The target tracking has important application prospects in the civil fields of video monitoring, vehicle navigation, man-machine interaction, intelligent transportation and the like, and the military fields of vision guidance, target positioning, fire control and the like. In recent years, although the target tracking has been greatly developed, it still faces many challenging problems such as complex background, illumination change and target shielding, so the target tracking is always a hot problem in the field of computer vision.
Target tracking is classified into production tracking and discriminant tracking. Representative methods in generative Tracking are IVT Tracking [ Ross D A, Lim J, Lin R S, et al. inventive Learning for Robust Visual Tracking [ J ]. International Journal of Computer Vision,2008,77(1-3):125- & ltd > 141.], L1 Tracking [ Mei X, Ling H. Robust Visual Tracking L1 minimization [ A ]. Proceedings of IEEE Conference on Computer Vision [ C ]. Kyoto, Japan: IEEE Computer Society Press,2009,1436- & ltd. ] and coherent filter Tracking [ Sun voyage, Lily, Xiao Fu, Wai ] related filter target Tracking [ J ]. electroc, 2017,45(10) & ltd. & gt 2332 ] based on multi-stage Learning. Representative methods of discriminant tracking are MIL tracking [ Babenko B, Yang M H, Belongie S.Robust object tracking with online multiple instance learning [ J ]. IEEE Transactions on Pattern and Machine learning, 2011,33(8): 1619. 1632 ], TLD tracking [ Kalal Z, MikolajczyK, as J.Tracking-learning-detecting [ J ]. IEEE Transactions on Pattern and Machine learning, 2012,34 (7. nA. 1422 ], random forest tracking [ SaffariA, Leistner C, Torque J, Godec M, Bib. Link [ Saffoni ] C, equation [ J.Soffoni J.: sample C, sample J, sample M, sample C, 2011,263-. In 2004, Avidan first proposed an SVM-based target tracking method [ s.avidan.supported vector tracking [ J ]. IEEE Transactions on Pattern Analysis and Machine Analysis, 2004,26(8): 1064-. Inspired by the application of structured SVM in target detection, in 2011, Hare et al first proposed a structured SVM-based target tracking method on ICCV [ Hare S, Saffari A, Torr P H S.Struck: structured output tracking with keys [ A ]. Procedents of IEEE International Conference on Computer Vision: Barcelona, Spain: IEEE Computer Society Press,2011, 263. 270 ], an extension of this article in 2015 [ Hare S, Saffari A, Torr PH S.Struck: structured output tracking with keys [ J ]. IEEE Transactions Patches and interfaces, 2016 (2106): intelligent model for analysis on the top of the International journal (PAMI 2099). The Struck considers the target tracking as a structured output problem, avoids the intermediate classification link of the traditional discriminant tracking and obviously improves the target tracking performance. In order to adapt to the change of the target without losing the time context information of the target, Yao et al [ Yao R, Shi Q, Shen C, et al. robust Tracking with Weighted Online Structured Tracking [ A ]. Proceedings of the European conference on Computer Vision [ C ]. Berlin Heidelberg: Springer,2012, Part III: 158-. In order to improve the Tracking performance of the occlusion and deformation target, Yao et al (Yao R, Shi Q, Shen C, et al, part-based Visual Tracking with on line tension Structural Learning [ A ]. Procedings of Computer Vision and Pattern registration [ C ]. Portland, OR, USA: IEEE Computer Society Press,2013, 2363-. In order to solve the problem of model drift in target Tracking, Bai and Tang [ BaiY, Tang M. robust Tracking via super powered Tracking SVM [ A ]. Proceedings of IEEE Conference on Computer Vision and Pattern registration [ C ]. Providence, RI, USA IEEE Computer Society Press,2012,1854 1861 ] proposed an online Laplace sequencing SVM Tracking in 2012. Also in order to deal with the problem of model drift, MEEM tracking was proposed in 2014 by Zhang et al [ J.Zhang, S.Ma, and S.Scalaroff. Meem: Robust tracking via multiple experiments using estimation minimization [ A ]. Proceedings of European Conference on Computer Vision [ C ]. Zurich, Switzerland: IEEE Computer Society Press,2014, 188-. In 2015, Hong et al [ Hong S, You T, KWak S, et al, on line Tracking by boundary Learning Using Map with a coherent Neural Network [ A ]. Proceedings of International Conference on Machine Learning [ C ]. Lille, France: IEEE Computer Society Press,2015,597-606.] proposes to use an online SVM to guide back-propagation of CNN characteristics of a specific target to an input layer, and further to establish a Saliency Map Tracking target of the specific target. Ning et al [ Ning J F, Yang J M, Jiang S J, et al, object tracking via dual linear structured SVM and explicit feature map [ A ]. Procedence of IEEE Conference on Computer Vision and Pattern Recognition [ C ]. Las Vegas: IEEE Computer Society Press,2016, 4266-. Also, to increase the speed of the target, in 2017 Wang et al [ Wang, Mengmeng, Liu, Yong, Huang, Zeyi. Large marker Object Tracking with circular Feature Maps [ A ]. Procedents of IEEE Conference on Computer Vision and Pattern registration [ C ]. Honolulu, HI, USA: IEEE Computer Society Press,2017, 4800-. Ji et al (Ji Z, Feng K, Qian Y.part-based Visual Tracking Via Structural Support Correlation Filter [ J ]. Computing Research reproduction, 2018,1805 and 09971.] in 2018 adopt a similar idea to LMCF Tracking to accelerate structured SVM Tracking based on a target component, and the aim is to improve the speed of target Tracking. In order to further improve the performance of target Tracking based on SVM, in 2019, Zuo et al [ Zuo W, Wu X, Lin L, et al. learning Support Correlation Filters for Visual Tracking [ J ]. IEEE Transactions on Pattern Analysis and Machine Analysis, 2019,41(5): 1158-.
In summary, the target tracking method based on the structured SVM has better tracking performance and thus receives wide attention, but the existing method has two problems. In one aspect, prior structured SVM target tracking utilizes IOU as a loss function, and as shown in FIG. 1(a), when the sample box does not overlap with the target box, there is no difference in IOU loss for the sample. However, the distances between most of the sampling frames and the center of the target frame are different, the IOU loss cannot describe the difference, and the influence on the hyperplane of the structured SVM cannot be reflected, so that the performance of the classifier is influenced. On the other hand, in order to adapt to the change of the target, the existing method updates the structured SVM by using the tracking result. As shown in fig. 1(b), due to target occlusion, illumination variation, complex background, motion blur, target deformation, and the like, non-target information is introduced into the training sample, thereby causing a structural SVM hyperplane offset (model drift), and further causing a tracking failure.
Disclosure of Invention
The invention aims to provide an SSVM tracking method based on DIOU loss and smoothness constraint, which effectively solves the problems of inaccurate loss function and model drift in target tracking based on a structured SVM, thereby improving the accuracy and success rate of a target tracking algorithm based on the structured SVM.
The technical solution for realizing the purpose of the invention is as follows: a SSVM tracking method based on DIOU loss and smoothness constraint comprises the following steps:
step 1, establishing a structured SVM model based on DIOU loss and smoothness constraint, and converting the structured SVM model based on DIOU loss and smoothness constraint into a dual problem according to a Lagrange multiplier method;
step 2, solving a structured SVM model based on DIOU loss and smooth constraint by adopting a dual coordinate descent principle, and estimating the state of the target;
and 3, evaluating the position of the tracking target by adopting a multi-scale target tracking method, and selecting the structured output with the maximum response as a tracking result.
Further, the step 1 of establishing a structured SVM model based on the DIOU loss and the smoothness constraint, and converting the structured SVM model based on the DIOU loss and the smoothness constraint into a dual problem according to a lagrange multiplier method, specifically as follows:
step 1.1, in the target tracking based on the structured SVM, Y is a rectangular frame space, any element of Y is represented by (x ', Y', w, h), wherein (x ', Y') represents the central position of the rectangular frame, w and h respectively represent the width and height of the rectangular frame, and the training data is assumed to be
Figure BDA0002611504850000041
Where N denotes the total number of training data, i denotes any number of data indices in the training data set, i ═ 1,2, …, N, (x)i,yi) A rectangular frame center position representing the ith training data representation; establishing a DCSSVM (distributed support vector machine) model based on DIOU loss and smoothness constraint, wherein the model is described as follows:
Figure BDA0002611504850000042
Figure BDA0002611504850000043
Figure BDA0002611504850000044
wherein w is a normal vector of the structured SVM at the moment t, and wt-1The normal vector of the structured SVM at the moment t-1 is shown, and lambda is a smooth constraint coefficient; Ψi(y)=Φ(xi,yi)-Φ(xi,y),Ψi(y) represents a feature vector Φ (x)i,yi) And the eigenvector phi (x)iY) difference, phi (x)i,yi) Rectangular box y representing training sampleiIn the image xiCharacteristic vector of (c), phi (x)iY) represents the prediction rectangle y in the training image xiA feature vector of (a); c is a regularization parameter, ξiIs a relaxation variable; l (y)iY) is a loss function representing the loss of structural error of the predicted output rectangular box y;
Figure BDA0002611504850000051
Figure BDA0002611504850000052
b、bgtrespectively represent B, BgtCenter point of (B)gt=(xgt,ygt,wgt,hgt) Is the position of the target frame, where (x)gt,ygt) Indicates the center position, w, of the target framegtAnd hgtRespectively representing the width and height of the target frame; b ═ x, y, w, h) is the position of the prediction box, where (x, y) denotes the center position of the prediction box, and w and h denote the width and height of the prediction box, respectively; ρ (·) is the euclidean distance, c is the diagonal length of the smallest bounding box covering the two location boxes;
step 1.2, in order to solve the formula (1), obtaining the dual problem of the formula (1) by using a Lagrangian multiplier method, and introducing the Lagrangian multiplier for the dual problem
Figure BDA0002611504850000053
And
Figure BDA0002611504850000054
the following conditions are satisfied:
Figure BDA0002611504850000055
the lagrangian function of equation (1) is as follows:
Figure BDA0002611504850000056
where ξ represents the relaxation variable, α and β are the introduced lagrangian multipliers; the right side of the equation represents the meaning of the value, ξ, corresponding to a particular training sampleiRepresents the corresponding relaxation variable at the ith training sample,
Figure BDA0002611504850000057
and
Figure BDA0002611504850000058
representing a Lagrange multiplier corresponding to the ith training sample;
the Lagrange function L (w, xi, alpha, beta) is respectively corresponding to w and xiiCalculating the partial derivative and making the partial derivative be 0:
Figure BDA0002611504850000059
Figure BDA00026115048500000510
substituting the formulas (4) and (5) into the formula (3), and eliminating w, beta and xi in L (w, xi, alpha, beta) to obtain the dual problem of the formula (1), which is shown in formulas (6a) to (6 c):
Figure BDA00026115048500000511
Figure BDA00026115048500000512
Figure BDA0002611504850000061
further, the step 2 of solving the structured SVM model based on the DIOU loss and the smooth constraint by using the dual coordinate descent principle estimates the state of the target as follows:
step 2.1, the dual coordinate descent optimization algorithm selects a training sample k from the training set by using a formula (7) each time, and then updates the dual scalar quantity by using a formula (8)
Figure BDA0002611504850000062
Figure BDA0002611504850000063
Figure BDA0002611504850000064
Wherein the content of the first and second substances,
Figure BDA0002611504850000065
representing the dual scalar before the update,
Figure BDA0002611504850000066
representing the dual scalar after the update,
Figure BDA0002611504850000067
is represented by
Figure BDA0002611504850000068
To
Figure BDA0002611504850000069
An increment of change;
to obtain
Figure BDA00026115048500000610
Formula (8) is first substituted into formula (6a), and then formula (6a) is converted to
Figure BDA00026115048500000611
To yield formula (9):
Figure BDA00026115048500000612
wherein c is an and increment
Figure BDA00026115048500000613
An unrelated constant; will be the pair of formula (9)
Figure BDA00026115048500000614
Derivative and let the derivative be 0:
Figure BDA00026115048500000615
according to the constraint condition of the formula (6c), obtaining
Figure BDA00026115048500000616
The value range is as follows:
Figure BDA00026115048500000617
wherein the content of the first and second substances,
Figure BDA00026115048500000618
representing the dual scalar corresponding to the kth training sample,
Figure BDA00026115048500000619
representing the accumulated sum of dual scalars corresponding to all the training samples;
step 2.2, benefitObtained by the formulae (10) to (11)
Figure BDA00026115048500000620
Then, combining equation (4) to obtain an updated equation of w, as shown in equation (12):
Figure BDA0002611504850000071
in the formula w(old)Representing a classification hyperplane normal vector, w(new)Representing the classification hyperplane normal vector after the update;
Figure BDA0002611504850000072
is represented by
Figure BDA0002611504850000073
To
Figure BDA0002611504850000074
An increment of change; λ is a smooth constraint coefficient;
Figure BDA0002611504850000075
Ψk(y*) Representing feature vectors
Figure BDA0002611504850000076
And the eigenvector phi (x)k,y*) The difference value of (a) to (b),
Figure BDA0002611504850000077
rectangular box for representing training sample
Figure BDA0002611504850000078
In the image xkThe feature vector of (a) above (b),
Figure BDA0002611504850000079
representing the prediction rectangle y*In the image xkA feature vector of (a);
and 2.3, calculating the score of the candidate sample by utilizing inner product operation, and estimating the state of the target according to a maximum score criterion formula (13):
Figure BDA00026115048500000710
structured output y with maximum response*I.e. the position of the target, Y represents the set of all predicted structured output rectangular boxes; Ψt(y)=Φ(xt,yt)-Φ(xt,y),Ψt(y) represents a feature vector Φ (x)t,yt) And the eigenvector phi (x)tY) difference, phi (x)t,yt) Rectangular box y representing training sampletIn the image xtCharacteristic vector of (c), phi (x)tY) denotes predicting the rectangular frame y in the image xtThe feature vector of (1).
Further, the step 3 of estimating the position of the tracking target by using the multi-scale target tracking method, and selecting the structured output with the maximum response as the tracking result specifically comprises the following steps:
when target tracking is carried out, a conservative scale pool S is used as {1, 0.995, 1.005}, three different scales are adopted for evaluation, and the maximum response is selected as a tracking result; for the target feature, the Lab color and local rank transformed LRT feature of the target is selected.
Compared with the prior art, the invention has the remarkable advantages that: (1) the problems of inaccurate loss function and model drift in the existing target tracking method based on the structured SVM are solved, and the accuracy and the success rate of the target tracking algorithm based on the structured SVM are improved; (2) the method has better performance in challenging videos such as background mixing, fast moving, illumination change, motion blur, target deformation, target shielding and the like, and improves the performance of target tracking.
Drawings
Fig. 1 is a schematic diagram of loss function inaccuracy and model drift problems existing in a conventional target tracking method based on a structured SVM, wherein (a) is a schematic diagram of a sampling box, and (b) is a schematic diagram of online learning of the structured SVM.
FIG. 2 is a flow chart of the SSVM tracking method based on DIOU loss and smoothness constraint of the present invention.
Fig. 3 is a schematic diagram of results of OPE performance evaluation on OTB100 data set by using the method of the present invention and other 3 high performance algorithms in the embodiment of the present invention, where (a) is a mean center error diagram, and (b) is a position coincidence ratio diagram.
Fig. 4 is a key frame screenshot of 5 challenging videos using the method of the present invention and other 3 high performance tracking algorithms in an embodiment of the present invention, where (a) is a toy video screenshot, (b) is a pedestrian video screenshot, (c) is a walking video screenshot, (d) is a face video screenshot, and (e) is a singer 2 video screenshot.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings.
With reference to fig. 2, the ssov tracking method based on DIOU loss and smoothness constraint of the present invention includes the following steps:
step 1, establishing a structured SVM model based on DIOU loss and smoothness constraint, and converting the structured SVM model based on DIOU loss and smoothness constraint into a dual problem according to a Lagrange multiplier method, which is specifically as follows:
step 1.1, in the target tracking based on the structured SVM, Y is a rectangular frame space, any element of Y is represented by (x ', Y', w, h), wherein (x ', Y') represents the central position of the rectangular frame, w and h respectively represent the width and height of the rectangular frame, and the training data is assumed to be
Figure BDA0002611504850000081
Where N denotes the total number of training data, i denotes any number of data indices in the training data set, i ═ 1,2, …, N, (x)i,yi) A rectangular frame center position representing the ith training data representation; an SSVM (Structured SVM Based on DIOU Loss and Smoothness Constraints, DCSSVM) model Based on DIOU Loss and Smoothness Constraints is established, and is described as follows:
Figure BDA0002611504850000082
Figure BDA0002611504850000083
Figure BDA0002611504850000084
wherein w is a normal vector of the structured SVM at the moment t, and wt-1The normal vector of the structured SVM at the moment t-1 is shown, and lambda is a smooth constraint coefficient; Ψi(y)=Φ(xi,yi)-Φ(xi,y),Ψi(y) represents a feature vector Φ (x)i,yi) And the eigenvector phi (x)iY) difference, phi (x)i,yi) Rectangular box y representing training sampleiIn the image xiCharacteristic vector of (c), phi (x)iY) represents the prediction rectangle y in the training image xiA feature vector of (a); c is a regularization parameter, ξiIs a relaxation variable; l (y)iY) is a loss function representing the loss of structural error of the predicted output rectangular box y;
Figure BDA0002611504850000085
Figure BDA0002611504850000086
b、bgtrespectively represent B, BgtCenter point of (B)gt=(xgt,ygt,wgt,hgt) Is the position of the target frame, where (x)gt,ygt) Indicates the center position, w, of the target framegtAnd hgtRespectively representing the width and height of the target frame; b ═ x, y, w, h) is the position of the prediction box, where (x, y) denotes the center position of the prediction box, and w and h denote the width and height of the prediction box, respectively; ρ (·) is the euclidean distance, c is the diagonal length of the smallest bounding box covering the two location boxes; .
Step 1.2, in order to solve the formula (1), obtaining the dual problem of the formula (1) by using a Lagrangian multiplier method, and introducing the Lagrangian multiplier for the purpose
Figure BDA0002611504850000091
And
Figure BDA0002611504850000092
the following conditions are satisfied:
Figure BDA0002611504850000093
the lagrangian function of equation (1) is as follows:
Figure BDA0002611504850000094
where ξ represents the relaxation variable, α and β are the introduced lagrangian multipliers; the right side of the equation represents the meaning of the value, ξ, corresponding to a particular training sampleiRepresents the corresponding relaxation variable at the ith training sample,
Figure BDA0002611504850000095
and
Figure BDA0002611504850000096
representing a Lagrange multiplier corresponding to the ith training sample;
the Lagrange function L (w, xi, alpha, beta) is respectively corresponding to w and xiiCalculating the partial derivative and making the partial derivative be 0:
Figure BDA0002611504850000097
Figure BDA0002611504850000098
substituting the formulas (4) and (5) into the formula (3), and eliminating w, beta and xi in L (w, xi, alpha, beta) to obtain the dual problem of the formula (1), which is shown in formulas (6a) to (6 c):
Figure BDA0002611504850000099
Figure BDA00026115048500000910
Figure BDA00026115048500000911
step 2, solving a structural SVM model based on DIOU loss and smooth constraint by adopting a dual coordinate descent principle, and estimating the state of the target, wherein the structural SVM model comprises the following specific steps:
step 2.1, the dual coordinate descent optimization algorithm selects a training sample k from the training set by using a formula (7) each time, and then updates the dual scalar quantity by using a formula (8)
Figure BDA0002611504850000101
Figure BDA0002611504850000102
Figure BDA0002611504850000103
Wherein the content of the first and second substances,
Figure BDA0002611504850000104
representing the dual scalar before the update,
Figure BDA0002611504850000105
representing the dual scalar after the update,
Figure BDA0002611504850000106
is represented by
Figure BDA0002611504850000107
To
Figure BDA0002611504850000108
An increment of change;
the key to solve the problem is how to obtain the formula (8)
Figure BDA0002611504850000109
Is represented by
Figure BDA00026115048500001010
To
Figure BDA00026115048500001011
The increment of the change. For this purpose, formula (8) is first substituted into formula (6a), and then formula (6a) is converted to
Figure BDA00026115048500001012
Thus, the formula (9) is obtained.
Figure BDA00026115048500001013
Wherein c is an and increment
Figure BDA00026115048500001014
An unrelated constant; will be the pair of formula (9)
Figure BDA00026115048500001015
Derivative and let the derivative be 0:
Figure BDA00026115048500001016
based on the constraint conditions of the formula (6c), the following results are obtained
Figure BDA00026115048500001017
The value range is as follows:
Figure BDA00026115048500001018
wherein the content of the first and second substances,
Figure BDA00026115048500001019
representing the dual scalar corresponding to the kth training sample,
Figure BDA00026115048500001020
representing the accumulated sum of dual scalars corresponding to all the training samples;
step 2.2 obtaining the compound of formulae (10) to (11)
Figure BDA00026115048500001021
Then, combining equation (4) to obtain an updated equation of w, as shown in equation (12):
Figure BDA00026115048500001022
in the formula w(old)Representing a classification hyperplane normal vector, w(new)Representing the classification hyperplane normal vector after the update;
Figure BDA00026115048500001023
is represented by
Figure BDA00026115048500001024
To
Figure BDA00026115048500001025
An increment of change; λ is a smooth constraint coefficient;
Figure BDA00026115048500001026
Ψk(y*) Representing feature vectors
Figure BDA00026115048500001027
And the eigenvector phi (x)k,y*) The difference value of (a) to (b),
Figure BDA00026115048500001028
rectangular box for representing training sample
Figure BDA00026115048500001029
In the image xkCharacteristic vector of (c), phi (x)k,y*) Representing the prediction rectangle y*In the image xkThe feature vector of (1).
In the target tracking based on the structured SVM, the number of support vectors in the structured SVM is increasing with the passage of time, and in order to ensure the efficiency of target tracking, the number of support vectors needs to be fixed. For this purpose, when the number of patterns in the structured SVM exceeds the budget, a support vector deletion is selected according to equation (13). The structuring SVM of the DIOU loss and smooth constraint provided by the invention adopts the strategy.
Figure BDA0002611504850000111
In the formula of alpha*Representing the corresponding dual scalar when the support vector is the minimum norm,
Figure BDA0002611504850000112
represents the training sample (x)i,yi) Corresponding dual scalars, alpha representing the set of all dual scalars, Ψi(xiY) is sample xiThe support vector of (2).
In summary, the DCSSVM learning algorithm proposed by the present invention is shown as algorithm 1.
Figure BDA0002611504850000113
Figure BDA0002611504850000121
Example 1
The experimental data for this example is data on the 2015 published OTB100 reference dataset y.wu, j.lim, and m.h.yang.object tracking benchmark J. IEEE Transactions on Pattern Analysis and Machine Analysis 2015,37(9): 1834. 1848. the reference library contains 100 videos annotated as 11 challenging attributes, respectively Illumination Variation (IV), Scale Variation (SV), Occlusion (OCC), Deformation (DEF), Motion Blur (MB), Fast Motion (FM), in-plane rotation (IPR), out-of-plane rotation (OPR), object out-of-bounds (OV), Background Clutter (BC), Low Resolution (LR). The OTB100 reference data set provides two indexes for evaluating the performance of the target tracking method. One is precision index and the other is success index. The present embodiment is written using Matlab (version R2017 a) and OpenCV (version 2.4.8). In terms of hardware environment, experiments were performed on a DELL XPS8930 desktop (with a CPU model of Intel core i78700K, 16GB for memory).
4 recent structured SVM-based trackers [ Hare S, safari a, Torr P H s.struck: structured output tracking with kernels [ A ]. Proceedings of IEEE International Conference on Computer Vision: barcelona, Spain: IEEE Computer Society Press,2011,263-270.] [ Ning J F, Yang J M, Jiang S J, et al, object tracking video dual linear structured SVM and explicit feature map [ A ]. Proceedings of IEEE Conference on Computer Vision and Pattern Recognition [ C ]. Las Vegas: IEEE Computer Society Press,2016, 4266-: IEEE Computer Society Press,2017, 4800-. At tracking speed, we chose a long video, liquor, with 1741 frames to evaluate. Table 1 shows OPE performance versus velocity for 6 structured SVM tracking methods on OTB100 data sets.
TABLE 16 OPE Performance vs. velocity comparisons on OTB100 datasets for structured SVM tracking methods
Figure BDA0002611504850000131
Table 2 shows the results of Success in different challenging videos by 6 structured SVM tracking methods. As can be seen from table 2, the Scale-DCSSVM tracking method of the present invention performs better in challenging videos such as background clutter, fast movement, illumination change, motion blur, target deformation, and target occlusion, and the result verifies the effectiveness of the tracking method of the present invention.
TABLE 26 comparison of Success in different attribute videos by structured SVM tracking method
Figure BDA0002611504850000141
We evaluated the performance of our tracking algorithm Scale-DCSSVM on the OTB100 dataset along with the 3 newer high performance algorithms DeepSRDCF, CREST, SimFC. Respectively, SiamFC [ Bertonitto L., Valmadre J., Henriques J.F., Vedal A., & Torr P.H.S.Fully-conditional Simerase Networks for Object Tracking [ C ]// processing of European Conference Convergence Vision, Amsterdam, The Netherlands: IEEE Computer Society Press,2016, 850-. Fig. 3 shows the results of OPE performance evaluation on OTB100 data set by 4 high performance algorithms, wherein fig. 3(a) is a mean center error graph and fig. 3(b) is a position coincidence graph. The result shows that the algorithm of the invention has obvious improvement on two indexes of accuracy and success rate.
For further comparison, fig. 4 shows key frame screenshots of the tracking model of the present invention on 5 challenging videos with 3 high performance algorithms, such as SiamFC, CREST and DeepSRDCF, where fig. 4(a) is a toy video screenshot, fig. 4(b) is a pedestrian video screenshot, fig. 4(c) is a walking video screenshot, fig. 4(d) is a face video screenshot, and fig. 4(e) is a singer 2 video screenshot. The main challenges of the toy video are target occlusion, the main challenges of the pedestrian video are target occlusion and target deformation, the main challenges of the walking video are target occlusion, motion blur and background mixing, the main challenges of the face video are illumination change, and the main challenges of the singer 2 video are background mixing. From fig. 4(a) - (e), the method of the present invention has better tracking performance on the challenging video such as target occlusion, background mixing, and illumination change.

Claims (4)

1. A SSVM tracking method based on DIOU loss and smoothness constraint is characterized by comprising the following steps:
step 1, establishing a structured SVM model based on DIOU loss and smoothness constraint, and converting the structured SVM model based on DIOU loss and smoothness constraint into a dual problem according to a Lagrange multiplier method;
step 2, solving a structured SVM model based on DIOU loss and smooth constraint by adopting a dual coordinate descent principle, and estimating the state of the target;
and 3, evaluating the position of the tracking target by adopting a multi-scale target tracking method, and selecting the structured output with the maximum response as a tracking result.
2. The DIOU loss and smoothness constraint-based SSVM tracking method of claim 1, wherein the step 1 of establishing a DIOU loss and smoothness constraint-based structured SVM model, and according to lagrangian multiplier method, transforming the DIOU loss and smoothness constraint-based structured SVM model into a dual problem specifically as follows:
step 1.1, in the target tracking based on the structured SVM, Y is a rectangular frame space, any element of Y is represented by (x ', Y', w, h), wherein (x ', Y') represents the central position of the rectangular frame, w and h respectively represent the width and height of the rectangular frame, and the training data is assumed to be
Figure FDA0002611504840000011
Where N denotes the total number of training data, i denotes any number of data indices in the training data set, i ═ 1,2, …, N, (x)i,yi) A rectangular frame center position representing the ith training data representation; establishing a DCSSVM (distributed support vector machine) model based on DIOU loss and smoothness constraint, wherein the model is described as follows:
Figure FDA0002611504840000012
Figure FDA0002611504840000013
Figure FDA0002611504840000014
wherein w is a normal vector of the structured SVM at the moment t, and wt-1The normal vector of the structured SVM at the moment t-1 is shown, and lambda is a smooth constraint coefficient; Ψi(y)=Φ(xi,yi)-Φ(xi,y),Ψi(y) represents a feature vector Φ (x)i,yi) And the eigenvector phi (x)iY) difference, phi (x)i,yi) Rectangular box y representing training sampleiIn the image xiCharacteristic vector of (c), phi (x)iY) represents the prediction rectangle y in the training image xiA feature vector of (a); c is a regularization parameter, ξiIs a relaxation variable; l (y)iY) is a loss function representing the loss of structural error of the predicted output rectangular box y;
Figure FDA0002611504840000015
Figure FDA0002611504840000016
b、bgtrespectively represent B, BgtCenter point of (B)gt=(xgt,ygt,wgt,hgt) Is the position of the target frame, where (x)gt,ygt) Indicates the center position, w, of the target framegtAnd hgtRespectively representing the width and height of the target frame; b ═ x, y, w, h) is the position of the prediction box, where (x, y) denotes the center position of the prediction box, and w and h denote the width and height of the prediction box, respectively; ρ (·) is the euclidean distance, c is the diagonal length of the smallest bounding box covering the two location boxes;
step 1.2, in order to solve the formula (1), obtaining the dual problem of the formula (1) by using a Lagrangian multiplier method, and introducing the Lagrangian multiplier for the dual problem
Figure FDA0002611504840000021
And
Figure FDA0002611504840000022
the following conditions are satisfied:
Figure FDA0002611504840000023
the lagrangian function of equation (1) is as follows:
Figure FDA0002611504840000024
where ξ represents the relaxation variable, α and β are the introduced lagrangian multipliers; the right side of the equation represents the meaning of the value, ξ, corresponding to a particular training sampleiRepresents the corresponding relaxation variable at the ith training sample,
Figure FDA0002611504840000025
and
Figure FDA0002611504840000026
representing a Lagrange multiplier corresponding to the ith training sample;
the Lagrange function L (w, xi, alpha, beta) is respectively corresponding to w and xiiCalculating the partial derivative and making the partial derivative be 0:
Figure FDA0002611504840000027
Figure FDA0002611504840000028
substituting the formulas (4) and (5) into the formula (3), and eliminating w, beta and xi in L (w, xi, alpha, beta) to obtain the dual problem of the formula (1), which is shown in formulas (6a) to (6 c):
Figure FDA0002611504840000029
Figure FDA00026115048400000210
Figure FDA00026115048400000211
3. the method of claim 2, wherein the step 2 of solving the structured SVM model based on the DIOU loss and the smoothness constraint by using the dual coordinate descent principle estimates the state of the target as follows:
step 2.1, the dual coordinate descent optimization algorithm selects a training sample k from the training set by using a formula (7) each time, and then updates the dual scalar quantity by using a formula (8)
Figure FDA0002611504840000031
Figure FDA0002611504840000032
Figure FDA0002611504840000033
Wherein the content of the first and second substances,
Figure FDA0002611504840000034
representing the dual scalar before the update,
Figure FDA0002611504840000035
representing the dual scalar after the update,
Figure FDA0002611504840000036
is represented by
Figure FDA0002611504840000037
To
Figure FDA0002611504840000038
An increment of change;
to obtain
Figure FDA0002611504840000039
Formula (8) is first substituted into formula (6a), and then formula (6a) is converted to
Figure FDA00026115048400000310
To yield formula (9):
Figure FDA00026115048400000311
wherein c is an and increment
Figure FDA00026115048400000312
An unrelated constant; will be the pair of formula (9)
Figure FDA00026115048400000313
Derivative and let the derivative be 0:
Figure FDA00026115048400000314
according to the constraint condition of the formula (6c), obtaining
Figure FDA00026115048400000315
The value range is as follows:
Figure FDA00026115048400000316
wherein the content of the first and second substances,
Figure FDA00026115048400000317
representing the dual scalar corresponding to the kth training sample,
Figure FDA00026115048400000318
representing the accumulated sum of dual scalars corresponding to all the training samples;
step 2.2 obtaining the compound of formulae (10) to (11)
Figure FDA00026115048400000319
Then, combining equation (4) to obtain an updated equation of w, as shown in equation (12):
Figure FDA00026115048400000320
in the formula w(old)Representing a classification hyperplane normal vector, w(new)Representing the classification hyperplane normal vector after the update;
Figure FDA00026115048400000321
is represented by
Figure FDA0002611504840000041
To
Figure FDA0002611504840000042
An increment of change; λ is a smooth constraint coefficient;
Figure FDA0002611504840000043
Ψk(y*) Representing feature vectors
Figure FDA0002611504840000044
And the eigenvector phi (x)k,y*) The difference value of (a) to (b),
Figure FDA0002611504840000045
rectangular box for representing training sample
Figure FDA0002611504840000046
In the image xkCharacteristic vector of (c), phi (x)k,y*) Representing the prediction rectangle y*In the image xkA feature vector of (a);
and 2.3, calculating the score of the candidate sample by utilizing inner product operation, and estimating the state of the target according to a maximum score criterion formula (13):
Figure FDA0002611504840000047
structured output y with maximum response*I.e. the position of the target, Y represents the set of all predicted structured output rectangular boxes; Ψt(y)=Φ(xt,yt)-Φ(xt,y),Ψt(y) represents a feature vector Φ (x)t,yt) And the eigenvector phi (x)tY) difference, phi (x)t,yt) Rectangular box y representing training sampletIn the image xtCharacteristic vector of (c), phi (x)tY) denotes predicting the rectangular frame y in the image xtThe feature vector of (1).
4. The SSVM tracking method based on DIOU loss and smoothness constraint of claim 1, wherein the step 3 of estimating the position of the tracked target by using the multi-scale target tracking method, selects the structured output with the largest response as the tracking result, and specifically comprises the following steps:
when target tracking is carried out, a conservative scale pool S is used as {1, 0.995, 1.005}, three different scales are adopted for evaluation, and the maximum response is selected as a tracking result; for the target feature, the Lab color and local rank transformed LRT feature of the target is selected.
CN202010755733.1A 2020-07-31 2020-07-31 SSVM tracking method based on DIOU loss and smoothness constraint Active CN112233140B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010755733.1A CN112233140B (en) 2020-07-31 2020-07-31 SSVM tracking method based on DIOU loss and smoothness constraint

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010755733.1A CN112233140B (en) 2020-07-31 2020-07-31 SSVM tracking method based on DIOU loss and smoothness constraint

Publications (2)

Publication Number Publication Date
CN112233140A true CN112233140A (en) 2021-01-15
CN112233140B CN112233140B (en) 2022-10-21

Family

ID=74116515

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010755733.1A Active CN112233140B (en) 2020-07-31 2020-07-31 SSVM tracking method based on DIOU loss and smoothness constraint

Country Status (1)

Country Link
CN (1) CN112233140B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180211396A1 (en) * 2015-11-26 2018-07-26 Sportlogiq Inc. Systems and Methods for Object Tracking and Localization in Videos with Adaptive Image Representation
US20200065976A1 (en) * 2018-08-23 2020-02-27 Seoul National University R&Db Foundation Method and system for real-time target tracking based on deep learning
CN111292355A (en) * 2020-02-12 2020-06-16 江南大学 Nuclear correlation filtering multi-target tracking method fusing motion information
CN111460948A (en) * 2020-03-25 2020-07-28 中国人民解放军陆军炮兵防空兵学院 Target tracking method based on cost sensitive structured SVM

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180211396A1 (en) * 2015-11-26 2018-07-26 Sportlogiq Inc. Systems and Methods for Object Tracking and Localization in Videos with Adaptive Image Representation
US20200065976A1 (en) * 2018-08-23 2020-02-27 Seoul National University R&Db Foundation Method and system for real-time target tracking based on deep learning
CN111292355A (en) * 2020-02-12 2020-06-16 江南大学 Nuclear correlation filtering multi-target tracking method fusing motion information
CN111460948A (en) * 2020-03-25 2020-07-28 中国人民解放军陆军炮兵防空兵学院 Target tracking method based on cost sensitive structured SVM

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
IOANNIS SARAFIS: "Weighted SVM from clickthrough data for image retrieval", 《2014 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP)》 *
江少杰: "结构化支持向量机目标跟踪中的特征表示与优化方法研究", 《中国优秀硕士学位论文全文数据库》 *

Also Published As

Publication number Publication date
CN112233140B (en) 2022-10-21

Similar Documents

Publication Publication Date Title
Li et al. DXSLAM: A robust and efficient visual SLAM system with deep features
CN104200495B (en) A kind of multi-object tracking method in video monitoring
Gong et al. Pagerank tracker: From ranking to tracking
CN113628244B (en) Target tracking method, system, terminal and medium based on label-free video training
Zhang et al. A background-aware correlation filter with adaptive saliency-aware regularization for visual tracking
CN109544600A (en) It is a kind of based on it is context-sensitive and differentiate correlation filter method for tracking target
Zhang et al. Sparse learning-based correlation filter for robust tracking
Zhang et al. SiamOA: siamese offset-aware object tracking
Wang et al. Convolution operators for visual tracking based on spatial–temporal regularization
Huang et al. Correlation-filter based scale-adaptive visual tracking with hybrid-scheme sample learning
Li et al. Adaptive multi-branch correlation filters for robust visual tracking
Lu et al. Distracter-aware tracking via correlation filter
CN111460948B (en) Target tracking method based on cost sensitive structured SVM
CN112233140A (en) SSVM tracking method based on DIOU loss and smoothness constraint
Hu et al. Siamese network object tracking algorithm combining attention mechanism and correlation filter theory
Yin et al. Fast scale estimation method in object tracking
Yang et al. High-performance UAVs visual tracking using deep convolutional feature
Liu et al. Anti-occlusion object tracking based on correlation filter
Chen et al. Scale adaptive part-based tracking method using multiple correlation filters
Yuan et al. Algorithms based on correlation filter in target tracking: A survey
Wei et al. An IoU-aware Siamese network for real-time visual tracking
Fan et al. A multi-scale face detection algorithm based on improved SSD model
Xiao et al. Research on scale adaptive particle filter tracker with feature integration
Zhang et al. Rt-track: robust tricks for multi-pedestrian tracking
Hao et al. Robust cascaded-parallel visual tracking using collaborative color and correlation filter models

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant