CN111862167A - Rapid robust target tracking method based on sparse compact correlation filter - Google Patents
- Publication number
- CN111862167A (application CN202010705423.9A)
- Authority
- CN
- China
- Prior art keywords
- sparse
- filter
- channel
- correlation filter
- group
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/277—Analysis of motion involving stochastic approaches, e.g. using Kalman filters
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Image Analysis (AREA)
Abstract
A fast, robust target tracking method based on a sparse compact correlation filter, relating to computer vision technology. A basic sample is constructed from the target and its context, the training samples consist of all cyclic shifts of the basic sample, and the loss function of a multi-channel correlation filter is defined as in a discriminative correlation filter (DCF). In a multi-task learning setting, an exclusive sparse regularization term and a group sparse regularization term are integrated into an intra-group-inter-group sparse regularization term; a temporal consistency constraint is introduced to alleviate the degradation of the DCF over time; and the intra-group-inter-group sparse term and the temporal term together define a regression loss function from which a sparse correlation filter is learned. Channel pruning removes redundant filters as a whole: the D channel filters are ranked by importance and the top-ranked channel filters are selected for tracking. A Lagrange function is constructed and the regression loss is optimized with the ADMM algorithm. The method effectively improves the discriminability and interpretability of the filter and achieves high precision and high speed.
Description
Technical Field
The invention relates to a computer vision technology, in particular to a fast robust tracking method based on a sparse compact correlation filter.
Background
Humans have a strong capacity for visual perception of video, and the brain can quickly and accurately locate a moving target in it. Computer vision aims to mimic this capability and to reach human-level speed and accuracy. Visual tracking is a fundamental problem in computer vision and a basic component of visual perception; its speed and precision determine the real-time performance and accuracy of visual perception as a whole. Target tracking is an important research direction in computer vision and plays an important role in intelligent video surveillance, human-computer interaction, robot navigation, virtual reality, medical diagnosis, public safety and other fields. The task first selects an object of interest in the initial frame of a video and then predicts the state of the object in subsequent frames. Target tracking is challenging: the appearance of the target often changes during tracking (for example through occlusion, deformation and rotation), and this is accompanied by complex illumination changes, interference from similar objects in the background and fast motion of the target. In recent years, target tracking methods based on correlation filtering and on deep learning have become the mainstream of research because of their good performance.
Correlation-filter-based methods have become a research hotspot in target tracking in recent years; they have a clear speed advantage and achieve good results on numerous datasets and in various competitions. The DCF started the widespread application of correlation filtering to target tracking, and many researchers subsequently improved on it. To handle scale and rotation changes, LDES uses a phase filter to estimate the scale and the rotation angle of the target simultaneously in a polar coordinate system, and MCPF embeds a correlation filter into the particle filter tracking framework to handle scale changes during tracking. To alleviate the spatial boundary effect of the filter, DSARCF and ASRCF introduce a dynamic saliency response map and an adaptive spatial response map, respectively, into filter learning to weight the filter coefficients adaptively. To alleviate the degradation of the filter over time, STRCF and LADCF introduce a temporal regularization term into filter learning for robust and fast target tracking. To train the filter with more samples, CACF and BACF use context samples and background samples, respectively, to train the correlation filter, which greatly improves precision while remaining real-time. To select more robust and discriminative features, HCF applies multi-layer deep features extracted from VGG-Net within a correlation filtering tracking framework and achieves accurate and robust tracking by fusing the multi-layer response maps. GFSDCF combines feature selection with filter learning, so that the learned filter has stronger discriminative power, overfitting is alleviated, and tracking precision is improved.
ECO uses a factorization matrix to compress the original deep features before training a continuous convolution filter, which yields efficient and accurate target tracking. To enhance the discriminability of the response map, LMCF introduces a correlation filter into the Struck tracking framework and exploits both the high speed of the correlation filter and the strong discriminative power of Struck to achieve fast and robust tracking.
In recent years, deep-learning-based methods have become another research hotspot in target tracking because of their higher precision. Current deep-learning-based tracking methods fall roughly into two categories. The first category constructs a deep network, trains it offline on selected samples, and tracks the target by fine-tuning the network online; the representative method is MDNet. Building on the MDNet tracking framework, VITAL uses adversarial learning to preserve robust features during tracking, and ADNet uses reinforcement learning to predict the states of the target during tracking so as to adapt to complex tracking environments. These methods are highly accurate but can hardly run in real time. The second category converts target tracking into an instance retrieval problem and obtains the matching function used for retrieval by offline training on external video data. SINT and SiamFC address deep similarity learning by training Siamese networks offline; DCFNet and CFNet add a differentiable correlation filtering layer to the Siamese network to train, end to end, feature representations suited to correlation filtering; EAST introduces reinforcement learning into a Siamese network to select the deep features of a particular layer adaptively for fast and robust tracking; SiamRPN further introduces a region proposal network (RPN) into the Siamese network to handle changes in the scale and aspect ratio of the target during tracking. Most of these offline-trained methods run in real time, but their accuracy depends on the network and the data used for training.
Disclosure of Invention
The invention aims to provide a fast, robust target tracking method based on a sparse compact correlation filter, which effectively addresses the overfitting and high computational complexity of traditional correlation filters and improves the robustness of the algorithm to occlusion, deformation, rotation and background interference.
The invention comprises the following steps:
1) for a given target, constructing a basic sample from the target and its context, wherein the training samples consist of all cyclic shifts of the basic sample and their labels are given by a Gaussian function, and defining the DCF loss function for training a multi-channel correlation filter;
2) in multi-task learning, integrating an exclusive sparse regularization term and a group sparse regularization term to construct an intra-group-inter-group sparse regularization term; introducing, as in DCF-based trackers, a temporal consistency constraint in filter learning to alleviate the degradation of the DCF over time; and defining a regression loss function from the intra-group-inter-group sparse regularization term and the temporal regularization term so as to learn a sparse correlation filter;
3) performing channel pruning based on the regression loss function defined in step 2), removing redundant filters as a whole to further accelerate computation: the change in the regression loss caused by removing the filter of a particular channel is calculated, the D channel filters are ranked by importance, and the top-ranked C channel filters are selected for tracking;
4) constructing a Lagrange function and optimizing the regression loss with the ADMM algorithm, thereby completing the fast robust target tracking based on the sparse compact correlation filter.
In step 1), the loss function of the multi-channel correlation filter may be defined as follows:
in the t-th frame, for a given target, a basic sample is constructed from the target and its context; the training samples consist of all cyclic shifts of the basic sample, and their labels are given by a Gaussian function. The DCF loss function $L(X_t, Y; W_t)$ for training the multi-channel correlation filter is defined in equation (1),
where $\circledast$ denotes the circular convolution operator; $X_t \in \mathbb{R}^{M \times N \times D}$ and $W_t \in \mathbb{R}^{M \times N \times D}$ are the basic sample and the filter of the t-th frame; $Y \in \mathbb{R}^{M \times N}$ is the label given by the Gaussian function; M, N and D denote the width, the height and the number of channels; and $\xi$ is the regularization parameter. The goal of filter learning is to minimize the loss function $L(X_t, Y; W_t)$. In equation (1), all of the multi-channel features representing the basic sample are used to train the multi-channel correlation filter.
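As a rough illustration of step 1), the following NumPy sketch builds a Gaussian label and evaluates a ridge-regression DCF loss with FFT-based circular convolution. The exact form of equation (1) is not available in this text, so the loss in `dcf_loss` (squared error plus $\xi$ times the squared filter norm), the label width `sigma`, the default value of `xi` and all function names are assumptions made for illustration only.

```python
import numpy as np

def gaussian_label(M, N, sigma=2.0):
    """Gaussian regression target Y (M x N), peaked at the zero-shift position (0, 0)."""
    m = np.minimum(np.arange(M), M - np.arange(M))  # circular distance to row 0
    n = np.minimum(np.arange(N), N - np.arange(N))  # circular distance to column 0
    return np.exp(-(m[:, None] ** 2 + n[None, :] ** 2) / (2.0 * sigma ** 2))

def circular_conv_response(X, W):
    """Sum over channels of the 2-D circular convolution of X^d with W^d, via the FFT."""
    Xf = np.fft.fft2(X, axes=(0, 1))
    Wf = np.fft.fft2(W, axes=(0, 1))
    return np.real(np.fft.ifft2(np.sum(Xf * Wf, axis=2)))

def dcf_loss(X, Y, W, xi=1e-2):
    """Assumed ridge form: ||sum_d X^d (*) W^d - Y||_F^2 + xi * ||W||_F^2."""
    residual = circular_conv_response(X, W) - Y
    return np.sum(residual ** 2) + xi * np.sum(W ** 2)
```

Circular convolution with a shifted unit impulse reproduces a cyclic shift of the sample, which is why a single FFT evaluation implicitly covers all cyclically shifted training samples.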
In step 2), in multi-task learning, the exclusive sparse regularization term and the group sparse regularization term are integrated to construct the intra-group-inter-group sparse regularization term $R_{EG}(W_t)$ of equation (2),
where $w_t^{mn}$ denotes the vector of $W_t$ at spatial position (m, n), $w_t^{mnd}$ denotes the element of $W_t$ at position (m, n, d), and $\theta$ is a weight parameter that balances the exclusive sparse and group sparse regularization terms;
group sparsity takes the $\ell_2$ norm across channels and then the $\ell_1$ norm over space, which removes spatially redundant features and makes the filter spatially sparse; exclusive sparsity takes the $\ell_1$ norm across channels and then the $\ell_2$ norm over space, which removes features that are redundant across channels and makes the filter sparse across channels;
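The two norms described above can be sketched in NumPy as follows. The text fixes only the order of the norms; how $\theta$ combines the two terms, and whether the outer $\ell_2$ norm of the exclusive term is squared, are not recoverable here, so `intra_inter_group_reg` is an assumed combination and all names are hypothetical.

```python
import numpy as np

def group_sparsity(W):
    """l2 norm across channels at each (m, n), then l1 norm (sum) over space."""
    return np.sum(np.sqrt(np.sum(W ** 2, axis=2)))

def exclusive_sparsity(W):
    """l1 norm across channels at each (m, n), then l2 norm over space."""
    return np.sqrt(np.sum(np.sum(np.abs(W), axis=2) ** 2))

def intra_inter_group_reg(W, theta=0.2):
    """Assumed combination: theta weights the exclusive term, (1 - theta) the group term."""
    return theta * exclusive_sparsity(W) + (1.0 - theta) * group_sparsity(W)
```

For a filter whose only non-zero channel vector is (3, 4), the group term is 5 and the exclusive term is 7, which makes the difference between the two norm orders easy to check by hand.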
in target tracking, DCF-based trackers introduce a temporal consistency constraint in filter learning to alleviate the degradation of the DCF over time; the temporal regularization term $R_T(W_t)$ is given in equation (3),
where $W_{t-1}$ denotes the filter of frame t-1;
the intra-group-inter-group sparse regularization term $R_{EG}(W_t)$ and the temporal regularization term $R_T(W_t)$ are introduced to define the regression loss function of equation (4), from which the sparse correlation filter is learned,
where $\lambda$ and $\mu$ are the regularization parameters of $R_{EG}(W_t)$ and $R_T(W_t)$, respectively.
In step 2), the weight parameter in equation (2) is $\theta = 0.2$, and the regularization parameters in equation (4) are $\lambda = 1.0 \times 10^{-4}$ and $\mu = 5$.
In step 3), channel pruning based on the regression loss function defined in step 2) evaluates the change in loss caused by pruning:
$\Delta L = L(X_t, Y; W'_t) - L(X_t, Y; W_t)$, (5)
where $W_t$ and $W'_t$ denote the filter before and after pruning, respectively; for a filter with D channels, exhaustive channel pruning requires $2^D$ evaluations of the loss function;
the change in the regression loss is computed by removing the filter $W_t^d$ of a particular channel. Let $Z_t^d$ denote the response map generated by $W_t^d$, let $D_t = \{X_t, Y\}$, and let $z_t^d$ denote the vectorized form of $Z_t^d$; the change in loss is then given by equation (6),
where $L(D_t; z_t^d = 0)$ is the loss after the response map vector $z_t^d$ is pruned and $L(D_t; z_t^d)$ is the loss without pruning. A first-order Taylor expansion of $L(D_t; z_t^d = 0)$ at the point $z_t^d$ gives equation (7),
where $R_1(z_t^d = 0)$ denotes the first-order remainder of the Taylor expansion. Substituting equation (7) into equation (6) and dropping $R_1(z_t^d = 0)$ gives the importance measure $\Theta(z_t^d)$ of equation (8),
where $z_t^{mnd}$ denotes the element of the response map tensor $Z_t$ at position (m, n, d). The D channel filters are ranked by their importance $\Theta(z_t^d)$, and only the top-ranked C channel filters are selected for tracking. This channel selection is performed in the first frame, and only the selected C channel filters are kept in subsequent frames, so the computational complexity is significantly reduced.
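A minimal sketch of the Taylor-expansion ranking in step 3) follows, assuming the squared-error data term $\|\sum_d Z_t^d - Y\|^2$ so that the gradient with respect to every response element is $2(\sum_d Z_t^d - Y)$. That loss form, the averaging over the M x N positions and all function names are assumptions for illustration, not the patented formula.

```python
import numpy as np

def channel_responses(X, W):
    """Per-channel response maps Z^d = X^d (*) W^d (circular convolution), shape (M, N, D)."""
    Xf = np.fft.fft2(X, axes=(0, 1))
    Wf = np.fft.fft2(W, axes=(0, 1))
    return np.real(np.fft.ifft2(Xf * Wf, axes=(0, 1)))

def taylor_channel_importance(X, Y, W):
    """|mean over (m, n) of dL/dz * z| per channel, for the assumed squared-error loss."""
    Z = channel_responses(X, W)
    grad = 2.0 * (Z.sum(axis=2) - Y)  # dL/dz^{mnd}; identical for every channel d
    return np.abs(np.mean(grad[:, :, None] * Z, axis=(0, 1)))

def select_channels(X, Y, W, C=64):
    """Indices of the C most important channels, most important first."""
    importance = taylor_channel_importance(X, Y, W)
    return np.argsort(-importance)[:C]
```

Each channel is scored from one forward pass and one gradient, so ranking all D channels costs roughly as much as a single loss evaluation.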
In step 3), the channel parameter C is 64.
In step 4), the Lagrange function may be constructed and the regression loss optimized with the ADMM algorithm as follows:
to minimize the regression loss of equation (4) in step 2), the ADMM algorithm is adopted; considering that the sparse compact filter is compressed from D channels to C channels in the initial frame, an auxiliary variable $U_t = W_t$ is introduced and the corresponding Lagrange function is constructed,
where $V_t$ is the Lagrange multiplier and $\gamma$ is the penalty factor; the ADMM algorithm alternately solves for the following variables:
Optimizing the correlation filter $W_t$: first, Parseval's theorem is used to convert the sub-problem for $W_t$ from the time domain to the frequency domain,
where $\hat{\cdot}$ denotes the discrete Fourier transform and $\odot$ denotes element-wise multiplication; as in the solutions of STRCF and LADCF, each vector $\hat{w}_t^{mn}$ of $\hat{W}_t$ has a closed-form solution,
where $\hat{w}_t^{mn}$ denotes the element of $\hat{W}_t$ at position (m, n); after $\hat{W}_t$ is computed, an inverse Fourier transform yields $W_t$.
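The frequency-domain update of $W_t$ can be sketched as below. The sub-problem is assumed to be $\|\sum_d X^d \circledast W^d - Y\|^2 + \mu\|W - W_{t-1}\|^2 + \tfrac{\gamma}{2}\|W - U + V/\gamma\|^2$, which is the usual scaled ADMM form and not necessarily the exact expression of the patent; the function name and signature are hypothetical. In the Fourier domain the problem decouples per position (m, n) into a D x D system that is a rank-one update of a scaled identity, so the Sherman-Morrison formula gives a closed form, as in STRCF-style solvers.

```python
import numpy as np

def solve_w_subproblem(X, Y, W_prev, U, V, mu=5.0, gamma=0.01):
    """Closed-form minimiser of the assumed W sub-problem, computed in the Fourier domain."""
    def fft_hw(A):
        return np.fft.fft2(A, axes=(0, 1))

    Xf, Yf = fft_hw(X), np.fft.fft2(Y)
    c = mu + gamma / 2.0
    # Right-hand side of the per-position normal equations (conj(x) x^T + c I) w = rhs.
    rhs = np.conj(Xf) * Yf[:, :, None] + mu * fft_hw(W_prev) + (gamma / 2.0) * fft_hw(U - V / gamma)
    # Sherman-Morrison: invert the rank-one update of c * I without forming any matrix.
    energy = np.sum(np.abs(Xf) ** 2, axis=2, keepdims=True)
    proj = np.sum(Xf * rhs, axis=2, keepdims=True)
    Wf = (rhs - np.conj(Xf) * proj / (c + energy)) / c
    return np.real(np.fft.ifft2(Wf, axes=(0, 1)))
```

The cost is dominated by the FFTs, so one update is O(MND log(MN)) rather than the cubic cost of solving a dense system at every position.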
Optimizing the auxiliary variable $U_t$: with $W_t$ and $V_t$ fixed, the sub-problem in $U_t$ is solved.
Updating the Lagrange multiplier $V_t$ and the penalty factor $\gamma$: given the filter $W_t$ and the auxiliary variable $U_t$, the Lagrange multiplier $V_t$ is updated, and the penalty factor is updated as $\gamma_{i+1} = \max(\gamma_{\min}, \rho\gamma_i)$, where i is the iteration index, $\gamma_{\min}$ is the minimum value of $\gamma$, and $\rho$ is the scale factor.
In step 4), the parameters are $\gamma_{\min} = 0.002$, $\gamma_0 = 0.01$ and $\rho = 0.2$, and ADMM runs for 2 iterations.
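The penalty schedule above, together with a multiplier step, can be sketched as follows. The rule $\gamma_{i+1} = \max(\gamma_{\min}, \rho\gamma_i)$ and the parameter values come from the text; the multiplier update $V \leftarrow V + \gamma(W - U)$ is the standard ADMM dual step and is an assumption here, as is the function name.

```python
import numpy as np

def update_multiplier_and_penalty(W, U, V, gamma, gamma_min=0.002, rho=0.2):
    """One dual update: assumed standard multiplier step plus the stated penalty schedule."""
    V_next = V + gamma * (W - U)              # assumed standard ADMM dual ascent step
    gamma_next = max(gamma_min, rho * gamma)  # gamma_{i+1} = max(gamma_min, rho * gamma_i)
    return V_next, gamma_next
```

With $\gamma_0 = 0.01$ and $\rho = 0.2$, the penalty reaches its floor $\gamma_{\min} = 0.002$ after the first update and stays there for the second iteration.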
Compared with the prior art, the invention has the following advantages:
The sparse and compact correlation filter provided by the invention is used for robust real-time target tracking and effectively alleviates overfitting and high computational complexity during tracking. The proposed intra-group-inter-group sparse regularization term selects discriminative, target-specific features to train the filter, which improves the discriminability and interpretability of the filter. On the one hand, the new intra-group-inter-group sparse regularization term introduced in filter learning keeps the learned filter sparse in both space and channels, so that discriminative target-specific features are activated during tracking; this alleviates overfitting and improves the robustness of the algorithm to occlusion, deformation, rotation and background interference. On the other hand, a new channel pruning algorithm based on Taylor expansion prunes the filter, retaining a small number of filters that respond strongly to the specific target and removing a large number of redundant filters; this further alleviates overfitting and reduces computational complexity. The correlation filter is solved with the efficient ADMM algorithm, which optimizes the filter in only a few iterations. Experimental results on various challenging datasets show that the method obtains good tracking results with high precision and high speed. On the OTB-2015 dataset, the DP/AUC score of the invention is 93.3%/70.0%, and the speed reaches about 20 FPS.
Detailed Description
The present invention is a target tracking method of the correlation filtering class; the following embodiment describes it further.
The embodiment of the invention comprises the following steps:
A. In the t-th frame, for a given target, a basic sample is constructed from the target and its context; the training samples consist of all cyclic shifts of the basic sample, and their labels are given by a Gaussian function. The DCF loss function $L(X_t, Y; W_t)$ for training the multi-channel correlation filter is defined in equation (1),
where $\circledast$ denotes the circular convolution operator; $X_t \in \mathbb{R}^{M \times N \times D}$ and $W_t \in \mathbb{R}^{M \times N \times D}$ are the basic sample and the filter of the t-th frame; $Y \in \mathbb{R}^{M \times N}$ is the label given by the Gaussian function; M, N and D denote the width, the height and the number of channels; and $\xi$ is the regularization parameter. The goal of filter learning is to minimize the loss function $L(X_t, Y; W_t)$.
In equation (1), the multi-channel features representing the base samples are all used to train the multi-channel correlation filter. However, a significant portion of these features are unrelated to the particular target being tracked or otherwise useless for distinguishing between background and target. In order to select discriminative and target-specific features to train the filter to effectively alleviate the problems of overfitting and high computational complexity, a sparse and compact correlation filter is proposed below for fast and robust tracking.
B. In step A, the correlation filter of each channel is usually trained on the features of that channel alone. However, different feature channels have different characteristics: some are mutually exclusive and some are mutually cooperative. When training a multi-channel correlation filter, mutually exclusive feature channels require their correlation filters to be trained individually, whereas mutually cooperative feature channels require their correlation filters to be trained jointly. The learning of the multi-channel correlation filter can therefore be cast as a multi-task learning problem in which each task corresponds to the correlation filter of one channel.
In multi-task learning, the exclusive sparse regularization term encourages the model parameters of different tasks to compete, which ultimately yields intra-group sparsity; the group sparse regularization term encourages the model parameters of different tasks to cooperate, which ultimately yields inter-group sparsity. For target tracking, both intra-group and inter-group sparsity effectively alleviate overfitting. To overcome the limitation of using only the exclusive sparse or only the group sparse regularization term, the two are integrated into a new regularization term, the intra-group-inter-group sparse regularization term $R_{EG}(W_t)$ of equation (2),
where $w_t^{mn}$ denotes the vector of the filter $W_t$ at spatial position (m, n), and $w_t^{mnd}$ denotes the element of the filter $W_t$ at position (m, n, d). $\theta$ is a weight parameter that balances the exclusive sparse and group sparse regularization terms.
On the one hand, group sparsity first takes the $\ell_2$ norm across channels and then the $\ell_1$ norm over space, so it removes spatially redundant features and makes the filter spatially sparse. On the other hand, exclusive sparsity first takes the $\ell_1$ norm across channels and then the $\ell_2$ norm over space, so it removes features that are redundant across channels and makes the filter sparse across channels. Overall, the proposed intra-group-inter-group sparse regularization term selects discriminative, target-specific features to train the filter, which improves the discriminability and interpretability of the filter.
In target tracking, some recent DCF-based trackers introduce a temporal consistency constraint in filter learning to alleviate the degradation of the DCF over time; the temporal regularization term $R_T(W_t)$ that is usually introduced is given in equation (3),
where $W_{t-1}$ denotes the filter of frame t-1.
To exploit spatial sparsity, channel sparsity and temporal consistency together, the intra-group-inter-group sparse regularization term $R_{EG}(W_t)$ and the temporal regularization term $R_T(W_t)$ are both introduced into the regression loss function of equation (4), from which the sparse correlation filter is learned,
where $\lambda$ and $\mu$ are the regularization parameters of $R_{EG}(W_t)$ and $R_T(W_t)$, respectively. With this regression loss function, the learned filter strengthens discriminative features and alleviates overfitting.
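Equations (3) and (4) are not reproduced in this text. One form consistent with the definitions above, and with the temporal terms used by STRCF-style trackers, is sketched below; the absence of scaling constants, and the omission of the $\xi$ term of equation (1), are assumptions.

```latex
\begin{align}
R_T(W_t) &= \left\| W_t - W_{t-1} \right\|_F^2 \tag{3} \\
L(X_t, Y; W_t) &= \Big\| \sum_{d=1}^{D} X_t^d \circledast W_t^d - Y \Big\|_F^2
  + \lambda R_{EG}(W_t) + \mu R_T(W_t) \tag{4}
\end{align}
```

Under this reading, $\lambda = 1.0 \times 10^{-4}$ controls how strongly sparsity is enforced and $\mu = 5$ how closely the filter must follow the filter of the previous frame.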
C. The regression loss function defined by step B can make the learned correlation filter sparse in space and channels, and can alleviate the problem of overfitting. However, the sparse correlation filters are not compact enough in structure, and the computation complexity is still high, so that in order to further accelerate the computation process, an effective solution is to remove the redundant filters as a whole, namely channel pruning. The goal of channel pruning, which is usually based on an evaluation criterion of the importance of the filter, is to minimize the impact of removing the filter. Oracle channel pruning is the best criterion for removing redundant filters, and it estimates the importance of a filter based on the variation of the loss as follows:
$\Delta L = L(X_t, Y; W'_t) - L(X_t, Y; W_t)$, (5)
where $W_t$ and $W'_t$ denote the filter before and after pruning, respectively. Oracle channel pruning achieves very high accuracy, but its computational complexity is high: for a filter with D channels, it requires $2^D$ evaluations of the loss function to complete channel pruning.
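To make the $2^D$ cost concrete, the sketch below enumerates every keep/drop mask over the D channels and evaluates a loss for each one. It is a toy illustration only: `loss_fn` is a hypothetical callable standing in for the regression loss, and restricting the winner to masks that keep exactly C channels is an assumption.

```python
from itertools import product

import numpy as np

def oracle_prune(loss_fn, W, C):
    """Exhaustively evaluate all 2^D channel masks; return the best mask keeping C channels."""
    D = W.shape[2]
    best_mask, best_loss, evaluations = None, np.inf, 0
    for bits in product((0.0, 1.0), repeat=D):
        mask = np.array(bits)
        value = loss_fn(W * mask)  # broadcasting zeroes out the dropped channels
        evaluations += 1
        if mask.sum() == C and value < best_loss:
            best_mask, best_loss = mask, value
    return best_mask, evaluations
```

The number of evaluations doubles with every additional channel, which is what motivates replacing the exhaustive search with a cheaper approximation.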
Based on the idea of Oracle channel pruning, channel pruning based on Taylor expansion calculates the change in the regression loss caused by removing the filter $W_t^d$ of a particular channel. Let $Z_t^d$ denote the response map generated by $W_t^d$, let $D_t = \{X_t, Y\}$, and let $z_t^d$ denote the vectorized form of $Z_t^d$; the change in loss is then given by equation (6),
where $L(D_t; z_t^d = 0)$ is the loss after the response map vector $z_t^d$ is pruned and $L(D_t; z_t^d)$ is the loss when the response map vector $z_t^d$ is not pruned. A first-order Taylor expansion of $L(D_t; z_t^d = 0)$ at the point $z_t^d$ gives equation (7),
where $R_1(z_t^d = 0)$ denotes the first-order remainder of the Taylor expansion. Substituting equation (7) into equation (6) and dropping $R_1(z_t^d = 0)$ gives the importance measure $\Theta(z_t^d)$ of equation (8),
where $z_t^{mnd}$ denotes the element of the response map tensor $Z_t$ at position (m, n, d). The D channel filters are ranked by their importance $\Theta(z_t^d)$, and only the top-ranked C channel filters are selected for tracking. This channel selection is performed in the first frame, and only the selected C channel filters are kept in subsequent frames, so the computational complexity is significantly reduced.
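Equations (6) to (8) are missing from this text. The derivation described above matches the standard first-order Taylor pruning criterion, so a plausible reconstruction is given below; the symbols $z_t^d$ and $\Theta$ and the $1/(MN)$ normalization are assumptions rather than recovered notation.

```latex
\begin{align}
\left| \Delta L(z_t^d) \right| &= \left| L(D_t; z_t^d = 0) - L(D_t; z_t^d) \right| \tag{6} \\
L(D_t; z_t^d = 0) &= L(D_t; z_t^d)
  - \left( \frac{\partial L}{\partial z_t^d} \right)^{\top} z_t^d + R_1(z_t^d = 0) \tag{7} \\
\Theta(z_t^d) &= \left| \frac{1}{MN} \sum_{m=1}^{M} \sum_{n=1}^{N}
  \frac{\partial L}{\partial z_t^{mnd}} \, z_t^{mnd} \right| \tag{8}
\end{align}
```

Substituting (7) into (6) cancels $L(D_t; z_t^d)$ and leaves only the gradient-times-activation term once the remainder is dropped, so each channel can be scored from one forward pass and one gradient evaluation.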
D. To minimize the regression loss of equation (4) in step B, the problem is optimized with the ADMM algorithm. Considering that the sparse compact filter is compressed from D channels to C channels in the initial frame, an auxiliary variable $U_t = W_t$ is introduced and the corresponding Lagrange function is constructed,
where $V_t$ is the Lagrange multiplier and $\gamma$ is the penalty factor. The ADMM algorithm alternately solves for the following variables:
Optimizing the filter $W_t$. To optimize the filter $W_t$ efficiently, Parseval's theorem is first used to convert the sub-problem for $W_t$ from the time domain to the frequency domain,
where $\hat{\cdot}$ denotes the discrete Fourier transform and $\odot$ denotes element-wise multiplication. As in the solutions of STRCF and LADCF, each vector $\hat{w}_t^{mn}$ of $\hat{W}_t$ has a closed-form solution,
where $\hat{w}_t^{mn}$ denotes the element of $\hat{W}_t$ at position (m, n). After $\hat{W}_t$ is computed, an inverse Fourier transform yields $W_t$.
Optimizing the auxiliary variable $U_t$. With $W_t$ and $V_t$ fixed, the sub-problem in $U_t$ is solved.
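The exact $U_t$ sub-problem is not reproduced here. As a partial illustration, the sketch below implements group soft-thresholding, the proximal operator of the group-sparse part $\tau \sum_{m,n} \|u^{mn}\|_2$ only. Treating the update this way, the threshold `tau` and the function name are assumptions; the exclusive-sparse part would need its own step.

```python
import numpy as np

def group_soft_threshold(Q, tau):
    """Prox of tau * sum_{m,n} ||q^{mn}||_2: shrink each channel vector towards zero."""
    norms = np.sqrt(np.sum(Q ** 2, axis=2, keepdims=True))
    scale = np.maximum(0.0, 1.0 - tau / np.maximum(norms, 1e-12))
    return scale * Q
```

Spatial positions whose channel vector has norm below `tau` are set exactly to zero, which is how the group term produces spatial sparsity.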
Updating the Lagrange multiplier $V_t$ and the penalty factor $\gamma$: given the filter $W_t$ and the auxiliary variable $U_t$, the Lagrange multiplier $V_t$ is updated, and the penalty factor is updated as $\gamma_{i+1} = \max(\gamma_{\min}, \rho\gamma_i)$, where i is the iteration index, $\gamma_{\min}$ is the minimum value of $\gamma$, and $\rho$ is the scale factor.
In step B, the weight parameter in equation (2) is $\theta = 0.2$, and the regularization parameters in equation (4) are $\lambda = 1.0 \times 10^{-4}$ and $\mu = 5$.
In step C, the channel parameter is C = 64.
In step D, the parameters are $\gamma_{\min} = 0.002$, $\gamma_0 = 0.01$ and $\rho = 0.2$, and ADMM runs for 2 iterations.
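The Lagrange function of step D is not reproduced in this text. Assuming the squared-error data term and the temporal term $\|W_t - W_{t-1}\|_F^2$, a standard augmented Lagrangian consistent with the variables $U_t$, $V_t$ and $\gamma$ named above would read as follows; this is a sketch of the usual ADMM construction, not the recovered formula.

```latex
\begin{equation}
\mathcal{L}(W_t, U_t, V_t) =
  \Big\| \sum_{d} X_t^d \circledast W_t^d - Y \Big\|_F^2
  + \lambda R_{EG}(U_t)
  + \mu \left\| W_t - W_{t-1} \right\|_F^2
  + \left\langle V_t, W_t - U_t \right\rangle
  + \frac{\gamma}{2} \left\| W_t - U_t \right\|_F^2
\end{equation}
```

Splitting the variables this way leaves the $W_t$ sub-problem smooth and quadratic, so it can be solved in the Fourier domain, while the non-smooth sparsity term acts only on $U_t$.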
Table 1 compares the precision, success rate and speed on the OTB100 dataset of the present invention and several other correlation-filter-based target tracking methods, where SCCF denotes the method of the present invention.
TABLE 1
Tracking method | CCOT | MCPF | ECO | STRCF | MCCT | ASRCF | LADCF | GFSDCF | SCCF |
Precision (%) | 89.6 | 87.3 | 90.9 | 88.0 | 91.7 | 91.9 | 90.6 | 92.5 | 93.3 |
Success rate (%) | 66.6 | 62.8 | 68.7 | 67.5 | 68.2 | 68.9 | 69.6 | 68.9 | 70.0 |
Speed (FPS) | 2.1 | 3.2 | 8.4 | 5.2 | 6.8 | 24.8 | 10.6 | 7.8 | 19.8 |
CCOT corresponds to the method proposed by Danelljan, M. et al. (Danelljan, M., Robinson, A., Khan, F.S., Felsberg, M.: Beyond correlation filters: Learning continuous convolution operators for visual tracking. In: ECCV. pp. 472-488, 2016);
MCPF corresponds to the method proposed by Zhang, T. et al. (Zhang, T., Xu, C., Yang, M.H.: Multi-task correlation particle filter for robust object tracking. In: CVPR. pp. 4819-4827, 2017);
ECO corresponds to the method proposed by Danelljan, M. et al. (Danelljan, M., Bhat, G., Khan, F.S., Felsberg, M.: ECO: Efficient convolution operators for tracking. In: CVPR. pp. 6931-6939, 2017);
STRCF corresponds to the method proposed by Li, F. et al. (Li, F., Tian, C., Zuo, W., Zhang, L., Yang, M.H.: Learning spatial-temporal regularized correlation filters for visual tracking. In: CVPR. pp. 4904-4913, 2018);
MCCT corresponds to the method proposed by Wang, N. et al. (Wang, N., Zhou, W., Tian, Q., Hong, R., Wang, M., Li, H.: Multi-cue correlation filters for robust visual tracking. In: CVPR. pp. 4844-4853, 2018);
ASRCF corresponds to the method proposed by Dai, K. et al. (Dai, K., Wang, D., Lu, H., Sun, C., Li, J.: Visual tracking via adaptive spatially-regularized correlation filters. In: CVPR. pp. 4670-4679, 2019);
LADCF corresponds to the method proposed by Xu, T. et al. (Xu, T., Feng, Z., Wu, X., Kittler, J.: Learning adaptive discriminative correlation filters via temporal consistency preserving spatial feature selection for robust visual object tracking. IEEE TIP 28(11), 5596-5609, 2019);
GFSDCF corresponds to the method proposed by Xu, T. et al. (Xu, T., Feng, Z.H., Wu, X.J., Kittler, J.: Joint group feature selection and discriminative filter learning for robust visual object tracking. In: ICCV. pp. 7950-7960, 2019).
According to the invention, a sparse and compact correlation filter is learned for fast, robust visual tracking. The learned correlation filter adaptively selects features relevant to the target and suppresses redundant features and features irrelevant to the target, which effectively alleviates the overfitting and high computational complexity of conventional correlation filters and improves the robustness of the algorithm to occlusion, deformation, rotation and background clutter. Through the sparsity and temporal consistency constraints, the correlation filter adaptively selects discriminative features from a small number of channels that are temporally consistent and spatially localized. The resulting correlation filter learning problem is solved with ADMM, which solves it efficiently in only a few iterations. Experiments on several challenging datasets (OTB-2013, OTB-2015, VOT-2016, VOT-2017 and UAV20L) show that the method achieves better performance, with high precision and high speed. Specifically, on the OTB-2015 dataset, the tracker obtains an AUC score of 70.0% at a speed of approximately 20 FPS when using handcrafted and CNN features.
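As an illustration of the sparsity and temporal consistency constraints summarized above, the following Python/NumPy sketch evaluates an intra-group-inter-group sparsity term and a temporal term on a filter of shape M × N × D. The corresponding formulas are not reproduced in this text, so several details are assumptions rather than the claimed method: the two norms are mixed as θ·(exclusive) + (1 − θ)·(group), the temporal term is taken as the squared Frobenius distance to the previous filter, and all function names are hypothetical. The order of the norms follows the wording of claim 3.

```python
import numpy as np

def group_sparsity(W):
    """l2 norm over channels at each spatial position, then l1 norm over space."""
    return np.sqrt((W ** 2).sum(axis=2)).sum()

def exclusive_sparsity(W):
    """l1 norm over channels at each spatial position, then l2 norm over space."""
    per_position_l1 = np.abs(W).sum(axis=2)
    return np.sqrt((per_position_l1 ** 2).sum())

def intra_inter_group_term(W, theta=0.2):
    # ASSUMPTION: convex combination of the two terms; the source only
    # states that theta balances exclusive and group sparsity.
    return theta * exclusive_sparsity(W) + (1.0 - theta) * group_sparsity(W)

def temporal_term(W, W_prev):
    # ASSUMPTION: squared Frobenius distance between consecutive filters.
    return ((W - W_prev) ** 2).sum()

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W_prev = rng.standard_normal((8, 8, 4))   # filter of frame t-1 (M=N=8, D=4)
    W = W_prev.copy()
    W[:, :, 2:] = 0.0                         # a filter that is sparse across channels
    print(intra_inter_group_term(W), temporal_term(W, W_prev))
```

Under these assumptions, zeroing whole channels or whole spatial positions lowers the sparsity term, while the temporal term penalizes abrupt changes between the filters of consecutive frames.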
Claims (10)
1. A fast robust target tracking method based on a sparse compact correlation filter, characterized by comprising the following steps:
1) for a given target, constructing a base sample from the target and its context, wherein the training samples consist of all cyclic shifts of the base sample and the labels of the cyclically shifted samples are determined by a Gaussian function, and defining the loss function with which the DCF trains the multi-channel correlation filter;
2) in multi-task learning, combining an exclusive sparsity regularization term and a group sparsity regularization term to construct an intra-group-inter-group sparsity regularization term; introducing a temporal consistency constraint into the filter learning of the DCF-based tracker to alleviate the degradation of the DCF over time during target tracking; and defining a regression loss function with the intra-group-inter-group sparsity regularization term and the temporal regularization term so as to learn a sparse correlation filter;
3) performing channel pruning based on the regression loss function defined in step 2), removing redundant filters as a whole to further accelerate computation, wherein the change in the regression loss is calculated by removing the filter of a specific channel, the D channel filters are ranked by importance, and the C top-ranked channel filters are selected for tracking;
4) constructing a Lagrangian function and optimizing the regression loss with the ADMM algorithm, thereby completing the fast robust target tracking based on the sparse compact correlation filter.
2. The sparse compact correlation filter-based fast robust target tracking method as claimed in claim 1, wherein in step 1), constructing the base sample from the target and its context, forming the training samples from all cyclic shifts of the base sample, determining the labels of the cyclically shifted samples by a Gaussian function, and defining the loss function with which the DCF trains the multi-channel correlation filter are specifically as follows:
in the t-th frame, for a given target, a base sample is constructed from the target and its context, the training samples consist of all cyclic shifts of the base sample, the labels of the cyclically shifted samples are determined by a Gaussian function, and the loss function with which the DCF trains the multi-channel correlation filter is defined as follows:
wherein the operator in equation (1) denotes circular convolution, $X_t \in \mathbb{R}^{M \times N \times D}$ and $W_t \in \mathbb{R}^{M \times N \times D}$ are the base sample and the filter of the t-th frame, $Y \in \mathbb{R}^{M \times N}$ is the label determined by the Gaussian function, M, N and D denote the width, the height and the number of channels, respectively, and ξ is the regularization parameter; the objective of filter learning is to minimize the loss function; in equation (1), all of the multi-channel features representing the base sample are used to train the multi-channel correlation filter.
3. The sparse compact correlation filter-based fast robust target tracking method according to claim 1, wherein in step 2), in multi-task learning, the exclusive sparsity regularization term and the group sparsity regularization term are combined to construct the intra-group-inter-group sparsity regularization term, as follows:
wherein the regularization term is built from the vector of $W_t$ at each spatial position (m, n) and the element of $W_t$ at each position (m, n, d), and θ is a weight parameter that balances the exclusive sparsity and group sparsity regularization terms;
the group sparsity term takes the $\ell_2$ norm over the channels and then the $\ell_1$ norm over space, which removes spatially redundant features so that the filter is spatially sparse; the exclusive sparsity term takes the $\ell_1$ norm over the channels and then the $\ell_2$ norm over space, which removes redundant features across channels so that the filter is sparse across channels.
4. A fast robust target tracking method based on sparse compact correlation filter as claimed in claim 3 wherein the weight parameter θ is 0.2.
5. The sparse compact correlation filter-based fast robust target tracking method as claimed in claim 1, wherein in step 2), during target tracking, the DCF-based tracker introduces a temporal consistency constraint into filter learning to alleviate the degradation of the DCF over time, and the temporal regularization term introduced is as follows:
wherein $W_{t-1}$ denotes the filter of the (t-1)-th frame;
the intra-group-inter-group sparsity regularization term $R_{EG}(W_t)$ and the temporal regularization term $R_T(W_t)$ are introduced to define a regression loss function for learning the sparse correlation filter, the regression loss function being as follows:
wherein λ and μ are the regularization parameters of $R_{EG}(W_t)$ and $R_T(W_t)$, respectively.
6. The sparse compact correlation filter-based fast robust target tracking method as claimed in claim 5, wherein the regularization parameters are $\lambda = 1.0 \times 10^{-4}$ and $\mu = 5$.
7. The sparse compact correlation filter-based fast robust target tracking method according to claim 1, wherein in step 3), the specific method for performing channel pruning based on the regression loss function defined in step 2) is as follows:
$\Delta L = L(X_t, Y; W_t') - L(X_t, Y; W_t), \quad (5)$
wherein $W_t$ and $W_t'$ are the filters before and after pruning, respectively; for a filter with D channels, the loss function would have to be evaluated $2^D$ times to complete channel pruning exhaustively;
the change in the regression loss is instead calculated by removing the filter of one specific channel; the response map generated by the filter of that channel is considered, with $D_t = \{X_t, Y\}$, and its vectorized form is taken, which gives:
wherein the first term denotes the loss after the vector of the current response map is pruned, and the second term denotes the loss without pruning; a first-order Taylor expansion of the loss is then performed as follows:
wherein the last term denotes the first-order remainder of the Taylor expansion; substituting equation (7) into equation (6) and dropping the remainder gives:
8. The sparse compact correlation filter-based fast robust target tracking method of claim 7, wherein C = 64.
9. The fast robust target tracking method based on the sparse compact correlation filter as claimed in claim 1, wherein in step 4), the specific method for constructing the Lagrangian function and optimizing the regression loss with the ADMM algorithm is as follows:
in order to minimize the regression loss given by formula (4) in step 2), the ADMM algorithm is adopted for optimization; considering that the sparse compact filter is compressed from D channels to C channels in the initial frame, an auxiliary variable $U_t = W_t$ is introduced and the Lagrangian function is constructed as follows:
wherein $V_t$ is the Lagrange multiplier and γ is the penalty factor; the ADMM algorithm alternately solves for the following variables:
optimizing the correlation filter $W_t$: first, Parseval's theorem is used to convert the problem for the correlation filter $W_t$ from the time domain to the frequency domain, as follows:
wherein the transform symbol denotes the discrete Fourier transform and the product symbol denotes element-wise multiplication; similar to the solutions of STRCF and LADCF, each vector of the frequency-domain filter is solved as follows:
wherein the indexed term denotes the element at position (m, n); after the frequency-domain filter is computed, the inverse Fourier transform is applied to it to obtain $W_t$;
optimizing the auxiliary variable $U_t$: to solve for the auxiliary variable $U_t$, the following sub-problem is optimized:
10. The sparse compact correlation filter-based fast robust target tracking method as claimed in claim 9, wherein the parameters are $\gamma_{\min} = 0.002$, $\gamma_0 = 0.01$ and $\rho = 0.2$, and the number of ADMM iterations is 2.
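The channel pruning of claim 7 can be illustrated with a short Python/NumPy sketch. Because equations (6) to (8) are not reproduced in this text, the sketch rests on assumptions: the data term is taken as the squared error between the summed per-channel responses and the label Y, the importance of channel d is approximated by the absolute first-order Taylor term |⟨∂L/∂R^d, R^d⟩|, where R^d is the response map of channel d, and all function names are hypothetical. The sketch ranks the D channels and keeps the C highest-ranked ones.

```python
import numpy as np

def channel_responses(X, W):
    """Per-channel circular convolution of X and W via the 2-D FFT; shape (M, N, D)."""
    Xf = np.fft.fft2(X, axes=(0, 1))
    Wf = np.fft.fft2(W, axes=(0, 1))
    return np.real(np.fft.ifft2(Xf * Wf, axes=(0, 1)))

def taylor_channel_importance(X, W, Y):
    """First-order estimate of the loss change caused by removing each channel."""
    R = channel_responses(X, W)
    residual = R.sum(axis=2) - Y
    # ASSUMPTION: data term ||sum_d R^d - Y||_F^2, whose gradient with respect
    # to every channel response R^d is 2 * residual.
    grad = 2.0 * residual
    return np.abs((grad[:, :, None] * R).sum(axis=(0, 1)))

def select_channels(X, W, Y, C=64):
    """Return the sorted indices of the C channels with the largest importance."""
    scores = taylor_channel_importance(X, W, Y)
    order = np.argsort(-scores)          # descending importance
    return np.sort(order[:C])

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((16, 16, 128))   # base sample features, D = 128
    W = rng.standard_normal((16, 16, 128))   # filter before pruning
    Y = rng.standard_normal((16, 16))        # stand-in for the Gaussian label
    kept = select_channels(X, W, Y, C=64)
    print(kept.shape)
```

With C = 64, as in claim 8, the returned indices identify the channel filters that would be retained for tracking; this needs one response computation per channel rather than an evaluation of every channel subset.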
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010705423.9A CN111862167B (en) | 2020-07-21 | 2020-07-21 | Rapid robust target tracking method based on sparse compact correlation filter |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111862167A (en) | 2020-10-30 |
CN111862167B (en) | 2022-05-10 |
Family
ID=73000807
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010705423.9A Active CN111862167B (en) | 2020-07-21 | 2020-07-21 | Rapid robust target tracking method based on sparse compact correlation filter |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111862167B (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112767450A (en) * | 2021-01-25 | 2021-05-07 | 开放智能机器(上海)有限公司 | Multi-loss learning-based related filtering target tracking method and system |
CN113379804A (en) * | 2021-07-12 | 2021-09-10 | 闽南师范大学 | Unmanned aerial vehicle target tracking method, terminal equipment and storage medium |
CN114117926A (en) * | 2021-12-01 | 2022-03-01 | 南京富尔登科技发展有限公司 | Robot cooperative control algorithm based on federal learning |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106203495A (en) * | 2016-07-01 | 2016-12-07 | 广东技术师范学院 | A kind of based on the sparse method for tracking target differentiating study |
CN109859241A (en) * | 2019-01-09 | 2019-06-07 | 厦门大学 | Adaptive features select and time consistency robust correlation filtering visual tracking method |
CN109859244A (en) * | 2019-01-22 | 2019-06-07 | 西安微电子技术研究所 | A kind of visual tracking method based on convolution sparseness filtering |
CN110490907A (en) * | 2019-08-21 | 2019-11-22 | 上海无线电设备研究所 | Motion target tracking method based on multiple target feature and improvement correlation filter |
CN111126132A (en) * | 2019-10-25 | 2020-05-08 | 宁波必创网络科技有限公司 | Learning target tracking algorithm based on twin network |
Non-Patent Citations (4)
Title |
---|
CHENGGANG GUO ET AL.: "Learning Local Structured Correlation Filters for Visual Tracking via Spatial Joint Regularization", 《IEEE ACCESS》 * |
LUO XIONG ET AL.: "correlation filter tracking with adaptive proposal selection for accurate scale estimation", 《ARXIV》 * |
YANJIE LIANG ET AL.: "Robust Correlation Filter Tracking with Shepherded Instance-Aware Proposals", 《26TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA 2018》 * |
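WANG XINYUAN (王欣远): "Research on Target Tracking Algorithms Based on Correlation Filters", 《CHINA MASTER'S THESES FULL-TEXT DATABASE, INFORMATION SCIENCE AND TECHNOLOGY》 * |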
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112767450A (en) * | 2021-01-25 | 2021-05-07 | 开放智能机器(上海)有限公司 | Multi-loss learning-based related filtering target tracking method and system |
CN113379804A (en) * | 2021-07-12 | 2021-09-10 | 闽南师范大学 | Unmanned aerial vehicle target tracking method, terminal equipment and storage medium |
CN113379804B (en) * | 2021-07-12 | 2023-05-09 | 闽南师范大学 | Unmanned aerial vehicle target tracking method, terminal equipment and storage medium |
CN114117926A (en) * | 2021-12-01 | 2022-03-01 | 南京富尔登科技发展有限公司 | Robot cooperative control algorithm based on federal learning |
CN114117926B (en) * | 2021-12-01 | 2024-05-14 | 南京富尔登科技发展有限公司 | Robot cooperative control algorithm based on federal learning |
Also Published As
Publication number | Publication date |
---|---|
CN111862167B (en) | 2022-05-10 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||