CN111862167A - Rapid robust target tracking method based on sparse compact correlation filter - Google Patents

Rapid robust target tracking method based on sparse compact correlation filter Download PDF

Info

Publication number
CN111862167A
CN111862167A CN202010705423.9A CN202010705423A CN111862167A CN 111862167 A CN111862167 A CN 111862167A CN 202010705423 A CN202010705423 A CN 202010705423A CN 111862167 A CN111862167 A CN 111862167A
Authority
CN
China
Prior art keywords
sparse
filter
channel
correlation filter
group
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010705423.9A
Other languages
Chinese (zh)
Other versions
CN111862167B (en
Inventor
王菡子
梁艳杰
熊逻
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xiamen University
Original Assignee
Xiamen University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xiamen University filed Critical Xiamen University
Priority to CN202010705423.9A priority Critical patent/CN111862167B/en
Publication of CN111862167A publication Critical patent/CN111862167A/en
Application granted granted Critical
Publication of CN111862167B publication Critical patent/CN111862167B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • G06T7/277Analysis of motion involving stochastic approaches, e.g. using Kalman filters
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

A fast robust target tracking method based on a sparse compact correlation filter relates to a computer vision technology. Constructing a basic sample by a target and the context thereof, constructing a training sample by all cyclic translation samples of the basic sample, and training a loss function of the multi-channel correlation filter by a DCF (digital-to-analog converter); in the multi-task learning, an exclusive sparse regular term and a group sparse regular term are integrated to construct an intra-group and inter-group sparse regular term, time consistency constraint is introduced in target tracking to relieve the problem that DCF degrades along with time, an intra-group and inter-group sparse regular term and a time regular term are introduced to define a regression loss function, and a sparse correlation filter is learned; channel pruning removes the redundant filters integrally, sorts the D channel filters according to the importance degree, and selects the channel filter with the top sorting for tracking; and constructing a Lagrange function, and optimizing the regression loss by adopting an ADMM algorithm. The discriminability and the interpretability of the filter are effectively improved, the precision is high, and the speed is high.

Description

Rapid robust target tracking method based on sparse compact correlation filter
Technical Field
The invention relates to a computer vision technology, in particular to a fast robust tracking method based on a sparse compact correlation filter.
Background
The human body has high visual perception capability to the outside video, and the brain can quickly and accurately locate the moving target in the video. Computers are intended to mimic the visual perception of the human brain, to the human level in terms of speed and accuracy. Visual tracking is a fundamental problem in computer vision, and is the fundamental content of visual perception, and the speed and precision of the visual perception determine the real-time performance and precision of the visual perception. Target tracking is one of important research directions in the field of computer vision, and plays an important role in the fields of intelligent video monitoring, human-computer interaction, robot navigation, virtual reality, medical diagnosis, public safety and the like. The task first selects an object of interest in an initial frame of a video and then predicts the state of the object in the next successive frame. In addition, target tracking is a challenging task, and the target often changes in appearance (such as occlusion, deformation, rotation, etc.) during tracking, and is accompanied by complicated illumination changes, interference of similar targets in the background, and rapid movement of the target, which all make the task difficult. In recent years, a target tracking method based on correlation filtering and deep learning becomes a mainstream direction of current research due to good performance of the target tracking method.
Methods based on correlation filters have become one of the research hotspots in the field of target tracking in recent years, and have a good speed advantage and achieve good results in numerous data sets and various games. The DCF provides a hot application trend of the correlation filtering in the target tracking field. Subsequently, many researchers made improvements to DCF. In order to process scale and rotation change, the LDES proposes that a phase filter simultaneously estimates the scale and the rotation angle of a target in a polar coordinate system; the MCPF effectively embeds a correlation filter into the particle filter tracking framework to handle scale changes in the target tracking process. In order to effectively alleviate the problem of the spatial boundary effect of the filter, the DSARCF and ASRCF respectively introduce a dynamic significance response map and an adaptive spatial response map in filter learning to adaptively weight the filter coefficients. In order to effectively alleviate the problem of filter degradation over time, the STRCF and the LADCF introduce a time regularization term in filter learning to perform robust and fast target tracking. In order to train the filter by adopting more samples, the CACF and the BACF respectively use the context sample and the background sample for training the related filter, thereby ensuring real-time and greatly improving the precision. In order to select more robust and discriminative features, the HCF applies the multilayer depth features extracted from the VGG-Net into a related filtering tracking framework, and realizes accurate and robust tracking through fusion of multilayer response graphs. GFSDCF combines feature selection and filter learning, so that the learned filter has stronger discrimination capability, the problem of overfitting can be effectively relieved, and the target tracking precision is improved. The ECO uses a decomposition matrix to effectively compress the original depth features to train a continuous convolution filter, thereby achieving efficient and accurate target tracking. In order to enhance the discriminability of the response diagram, the LMCF introduces a correlation filter into a Struck tracking frame, and fully utilizes the characteristic of high speed of the correlation filter and the characteristic of strong distinguishing capability of the Struck to realize the fast and robust tracking.
In recent years, a method based on deep learning has become another research hotspot in the field of target tracking with its advantage of higher precision. Currently, target tracking methods based on deep learning can be roughly divided into two categories: the first type of deep learning-based method is to construct a deep network, select a sample for off-line training, and realize target tracking by on-line fine tuning of the network, and the representative method is MDNET. The MDNET-based tracking framework has the advantages that VITAL maintains the characteristic of good robustness in the tracking process through counterstudy, and ADNET predicts various states of a target in the tracking process through reinforcement study to adapt to a complex tracking environment. The tracking accuracy of the method is high, but the real-time tracking is difficult to achieve. The second type is that the target tracking problem is converted into an instance retrieval problem, and a matching function used for instance retrieval is obtained by training external video data offline. SINT and SimFC solve the problem of deep similarity measurement by training twin networks offline; DCFNet and CFNet add differentiable relevant filtering layers in the twin network to train end-to-end characteristic expressions suitable for relevant filtering; EAST introduces reinforcement learning in a twin network to adaptively select the depth feature of a certain layer to realize rapid and robust tracking; SiamRPN further introduces RPN networks in the twin network to effectively handle the scale and aspect ratio changes of the target during tracking. This off-line training method is mostly capable of achieving real-time, but its accuracy depends on the network and data used for training.
Disclosure of Invention
The invention aims to provide a rapid robust target tracking method based on a sparse compact correlation filter, which can effectively solve the problems of overfitting, high calculation complexity and the like of the traditional correlation filter and improve the robustness of an algorithm to shielding, deformation, rotation and background interference.
The invention comprises the following steps:
1) for a given target, constructing a basic sample by the target and the context thereof, wherein a training sample is composed of all cyclic translation samples of the basic sample, labels of the cyclic translation samples are determined by Gaussian functions, and DCF trains a loss function of a multi-channel correlation filter;
2) in the multi-task learning, integrating an exclusive sparse regular term and a group sparse regular term together to construct an intra-group-inter-group sparse regular term, introducing a time consistency constraint in filter learning by a tracker based on DCF to relieve the problem of DCF degradation along with time in target tracking, and introducing the intra-group-inter-group sparse regular term and the time regular term to define a regression loss function so as to learn a sparse correlation filter;
3) performing channel pruning based on the regression loss function defined in the step 2), integrally removing redundant filters to further accelerate the calculation process, and calculating the change of the regression loss by removing the filter of a certain specific channel; sorting the D channel filters according to the importance degree, and selecting C channel filters which are sorted in the front for tracking;
4) And constructing a Lagrange function, optimizing regression loss by adopting an ADMM algorithm, and completing the fast robust target tracking based on the sparse compact correlation filter.
In step 1), for a given target, a basic sample is constructed from the target and its context, a training sample is composed of all cyclic shift samples of the basic sample, labels of the cyclic shift samples are determined by gaussian functions, and a specific method for DCF training a loss function of a multichannel correlation filter may be:
in the t-th frame, for a given target, a basic sample is constructed by the target and the context thereof, a training sample is composed of all cyclic translation samples of the basic sample, labels of the cyclic translation samples are determined by Gaussian functions, and a loss function of the DCF training multichannel correlation filter is defined as follows:
Figure BDA0002594515090000031
wherein,
Figure BDA0002594515090000032
for cyclic convolution operation symbols, Xt∈RM×N×DAnd Wt∈RM×N×DFor the base sample sum filter of the t-th frame, Y ∈ RM×NFor the label determined by the Gaussian function, M, N and D are respectively used for representing the width, the height and the number of channels, and xi is a regular term parameter; the goal of filter learning is to minimize the loss function
Figure BDA0002594515090000033
In equation (1), the multi-channel features representing the base samples are all used to train the multi-channel correlation filter.
In step 2), in the multitask learning, integrating the exclusive sparsity regular term and the group sparsity regular term together to construct an intra-group sparsity regular term, as follows:
Figure BDA0002594515090000034
wherein,
Figure BDA0002594515090000035
Figure BDA0002594515090000036
represents WtOf the vector at the (m, n) position,
Figure BDA0002594515090000037
represents WtThe element in the (m, n, d) position, theta is a weight parameter to balance exclusive sparsity and group sparsity regularization terms;
group sparsity is performed on channels2The norm is then taken over spatially1A norm for removing spatially redundant features such that the filter is spatially sparse; exclusive sparse on channel1The norm is then taken over spatially2A norm for removing redundant features on the channel such that the filter is sparse on the channel;
in target tracking, the DCF-based tracker introduces a temporal consistency constraint in filter learning to alleviate the DCF degradation over time, the introduced temporal regularization term is as follows:
Figure BDA0002594515090000041
wherein, Wt-1A filter representing a t-1 th frame;
introducing an intra-group-to-inter-group sparsity regularization term REG(Wt) And a temporal regularization term RT(Wt) Defining a regression loss function to learn the sparse correlation filter, the regression loss function being as follows:
Figure BDA0002594515090000042
wherein λ and μ are each REG(Wt) And RT(Wt) The regularization term parameter of (2).
In step 2), the weight parameter θ in equation (2) is 0.2, and the regularization term parameter λ in equation (4) is 1.0 × 10-4,μ=5。
In step 3), the specific method for performing channel pruning based on the regression loss function defined in step 2) is as follows:
ΔL=L(Xt,Y;W′t)-L(Xt,Y;Wt), (5)
wherein, WtAnd W'tRespectively representing filters which are not pruned and filters which are pruned; for a filter with D channels, channel pruning requires estimation of the loss function 2DChannel pruning can be completed in the next time;
by removing a particular channelTo calculate the change in the regression loss; order to
Figure BDA0002594515090000043
Is represented by
Figure BDA0002594515090000044
Generated response graph:
Figure BDA0002594515090000045
Dt={Xt,Y},
Figure BDA0002594515090000046
order to
Figure BDA0002594515090000047
To represent
Figure BDA0002594515090000048
The vector of (c) can be obtained:
Figure BDA0002594515090000049
wherein,
Figure BDA00025945150900000410
vector representing current response graph
Figure BDA00025945150900000411
The loss after the pruning is carried out,
Figure BDA00025945150900000412
to represent
Figure BDA00025945150900000413
Loss of non-pruned branches; to pair
Figure BDA00025945150900000414
At the point of
Figure BDA00025945150900000415
Is subjected to a first order Taylor expansion as followsShown in the figure:
Figure BDA00025945150900000416
wherein,
Figure BDA00025945150900000417
representing a first-order residue in the Taylor representation; substituting equation (7) into equation (6) and removing
Figure BDA00025945150900000418
The following can be obtained:
Figure BDA0002594515090000051
thus, for each filter
Figure BDA0002594515090000052
Its degree of importance
Figure BDA0002594515090000053
The calculation formula is as follows:
Figure BDA0002594515090000054
wherein,
Figure BDA0002594515090000055
tensor Z representing the response maptAn element located at the (m, n, d) position; according to the degree of importance
Figure BDA0002594515090000056
And sequencing the D channel filters, and only selecting the C channel filters sequenced at the front for tracking. This channel selection process is performed in the first frame and only the selected C channel filter remains in subsequent frames, thus the computational complexity can be significantly reduced.
In step 3), the channel parameter C is 64.
In step 4), the specific method for constructing the lagrangian function and optimizing the regression loss by using the ADMM algorithm may be:
in order to minimize the regression loss proposed by the formula (4) in the step 2), an ADMM algorithm is adopted for optimization; considering that the sparse compact filter is compressed into a C channel from a D channel in an initial frame, an auxiliary variable U is introducedt=WtAnd constructs the lagrange function as follows:
Figure BDA0002594515090000057
wherein, VtIs Lagrange multiplier, gamma is penalty factor; the naadmm algorithm alternately solves for the following variables:
for correlation filter WtOptimizing; firstly, the correlation filter W is processed by adopting the Pasteval theoremtThe conversion from the time domain to the frequency domain of (1) is as follows:
Figure BDA0002594515090000058
wherein,
Figure BDA0002594515090000059
a discrete fourier transform symbol is represented,
Figure BDA00025945150900000510
represents a dot-by-symbol; similar to the solution of STRCF and LADCF,
Figure BDA00025945150900000511
each vector in
Figure BDA00025945150900000512
The solution of (a) is as follows:
Figure BDA00025945150900000513
wherein,
Figure BDA00025945150900000514
Figure BDA00025945150900000515
to represent
Figure BDA00025945150900000516
The element at the (m, n) position in (c); computing
Figure BDA0002594515090000061
Then, the obtained product is subjected to inverse Fourier transform to obtain the product
Figure BDA0002594515090000062
For auxiliary variable UtOptimization is carried out in order to solve the auxiliary variable UtThe following sub-problems are optimized:
Figure BDA0002594515090000063
auxiliary variable UtEach element of
Figure BDA0002594515090000064
The solution of (a) is as follows:
Figure BDA0002594515090000065
Figure BDA0002594515090000066
wherein
Figure BDA0002594515090000067
(·)+For the contraction operator, the following is defined: (x) +=max(0,x)。
Updating lagrange multiplier VtAnd a penalty factor γ: known filter WtAnd an auxiliary variable UtFor lagrange multiplier VtAnd the penalty factor gamma is updated as follows:
Figure BDA0002594515090000068
γi+1=max(γmin,ργi) (ii) a Where i is the iteration index, γminIs the minimum value of γ, and ρ is the scale factor.
In step 4), the parameter γmin=0.002,γ0At 0.01, ρ 0.2, ADMM iterates 2 times.
Compared with the prior art, the invention has the following advantages:
the sparse and compact correlation filter provided by the invention is used for robust real-time target tracking, and can effectively relieve the problems of overfitting and high calculation complexity in the tracking process. The proposed intra-group-inter-group sparse regularization term can effectively select specific target features with discriminant power to train the filter, so that the discriminant and interpretability of the filter are effectively improved. On one hand, a new intra-group-inter-group sparse regular term is introduced in filter learning, so that the learned filter keeps sparse in space and channels simultaneously, the characteristic of a specific target with discriminability can be activated in the tracking process to effectively relieve the problem of overfitting, and the robustness of the algorithm to shielding, deformation, rotation and background interference is improved. On the other hand, a new channel pruning algorithm based on Taylor expansion is adopted to prune the filter, so that a small number of filters with strong response aiming at a specific target are effectively reserved, a large number of redundant filters are removed, the problem of overfitting can be further relieved, and the calculation complexity is effectively reduced. The solution of the correlation filtering uses an efficient ADMM algorithm, which can efficiently optimize the filter with only a few iterations. Experimental results on various challenging data sets show that the method can obtain a good tracking result, and is high in precision and high in speed. On the OTB-2015 data set, the DP/AUC score of the invention is 93.3%/70.0%, and the speed can reach 20 FPS.
Detailed Description
The present invention belongs to a target tracking method of the related filtering class, and the following embodiments will further describe the present invention.
The embodiment of the invention comprises the following steps:
A. in the t-th frame, for a given target, a basic sample is constructed by the target and the context thereof, a training sample is composed of all cyclic translation samples of the basic sample, labels of the cyclic translation samples are determined by Gaussian functions, and a loss function of the DCF training multichannel correlation filter is defined as follows:
Figure BDA0002594515090000071
wherein,
Figure BDA0002594515090000072
for cyclic convolution operation symbols, Xt∈RM×N×DAnd Wt∈RM×N×DFor the base sample sum filter of the t-th frame, Y ∈ RM×NFor the label determined by the Gaussian function, M, N and D are respectively used for representing the width, the height and the number of channels, and xi is a regular term parameter; the goal of filter learning is to minimize the loss function
Figure BDA0002594515090000073
In equation (1), the multi-channel features representing the base samples are all used to train the multi-channel correlation filter. However, a significant portion of these features are unrelated to the particular target being tracked or otherwise useless for distinguishing between background and target. In order to select discriminative and target-specific features to train the filter to effectively alleviate the problems of overfitting and high computational complexity, a sparse and compact correlation filter is proposed below for fast and robust tracking.
B. In step a, the correlation filter of each channel is usually trained by the features of each channel individually. However, different feature channels exhibit different characteristics, some feature channels being mutually exclusive and some feature channels being mutually synergistic. In training a multichannel correlation filter, mutually exclusive eigen-channels require individual training of the correlation filter of the respective channel and mutually cooperating eigen-channels require joint training of their correlation filters. At this time, the learning problem of the multi-channel correlation filter can be converted into a multi-task learning problem, wherein each task corresponds to the correlation filter of each channel.
In multi-task learning, the exclusive sparse regular term can effectively promote model parameters of different tasks to be in a competitive state, and finally intra-group sparsity can be realized; the group sparse regular term can effectively promote model parameters of different tasks to be in a collaborative state, and finally, inter-group sparsity can be achieved. For target tracking, both intra-group sparsity and inter-group sparsity can effectively alleviate the over-fitting problem. In order to solve the problem of using only exclusive sparse or group sparse regularization terms, the two are integrated together to construct a new regularization term, i.e. an intra-group-inter-group sparse regularization term, as follows:
Figure BDA0002594515090000074
Wherein,
Figure BDA0002594515090000075
Figure BDA0002594515090000076
representing a filter WtOf the vector at the (m, n) position,
Figure BDA0002594515090000077
representing a filter WtIs located at the (m, n, d) position. θ is a weight parameter to balance the exclusive sparsity and the group sparsity regularization terms.
On the one hand, group sparseness is first performed on channels l2The norm is then taken over spatially1Norm, and thus, group sparseness can effectively remove spatially redundant features, making the filter spatially sparse. Exclusive sparseness, on the other hand, is first performed on the channel1The norm is then taken over spatially2Norm, and therefore exclusive sparsity, can effectively remove features that are redundant on the channel, making the filter sparse on the channel. In general, the proposed groupThe interclass sparse regularization term can effectively select specific target features with discriminative power to train the filter, so that the discriminative power and the interpretability of the filter are effectively improved.
In target tracking, some recent DCF-based trackers often introduce a temporal consistency constraint in filter learning to effectively alleviate the DCF degradation over time, and the temporal regularization term that is usually introduced is as follows:
Figure BDA0002594515090000081
wherein, Wt-1Representing the filter for frame t-1.
In order to fully utilize the sparsity in space, the sparsity in channels and the consistency in time, an intra-group-to-inter-group sparsity regular term R is simultaneously introduced into a regression loss function EG(Wt) And a temporal regularization term RT(Wt) To learn the sparse correlation filter, as follows:
Figure BDA0002594515090000082
wherein λ and μ are each REG(Wt) And RT(Wt) The regularization term parameter of (2). By designing the regression loss function, the learned filter can effectively enhance the characteristics with discrimination power and can effectively relieve the problem of overfitting.
C. The regression loss function defined by step B can make the learned correlation filter sparse in space and channels, and can alleviate the problem of overfitting. However, the sparse correlation filters are not compact enough in structure, and the computation complexity is still high, so that in order to further accelerate the computation process, an effective solution is to remove the redundant filters as a whole, namely channel pruning. The goal of channel pruning, which is usually based on an evaluation criterion of the importance of the filter, is to minimize the impact of removing the filter. Oracle channel pruning is the best criterion for removing redundant filters, and it estimates the importance of a filter based on the variation of the loss as follows:
ΔL=L(Xt,Y;W′t)-L(Xt,Y;Wt), (5)
wherein, WtAnd W'tRespectively showing the filters without pruning and after pruning. Oracle channel pruning can achieve very high accuracy but its computational complexity is high. For a filter for the D channel, Oracle channel pruning requires estimation of the loss function 2 DChannel pruning can be completed the next time.
Based on the idea of Oracle channel pruning, channel pruning based on taylor expansion calculates the change in regression loss by removing the filter of a particular channel. Order to
Figure BDA0002594515090000091
Is represented by
Figure BDA0002594515090000092
Generated response graph:
Figure BDA0002594515090000093
Dt={Xt,Y},
Figure BDA0002594515090000094
order to
Figure BDA0002594515090000095
To represent
Figure BDA0002594515090000096
The vector of (c) can be obtained:
Figure BDA0002594515090000097
wherein,
Figure BDA0002594515090000098
vector representing current response graph
Figure BDA0002594515090000099
The loss after the pruning is carried out,
Figure BDA00025945150900000910
representing response map vectors
Figure BDA00025945150900000911
Loss without pruning. To pair
Figure BDA00025945150900000912
At the point of
Figure BDA00025945150900000913
The first order Taylor expansion is performed as follows:
Figure BDA00025945150900000914
wherein,
Figure BDA00025945150900000915
representing the first-order residue in the taylor equation. Substituting equation (7) into equation (6) and removing
Figure BDA00025945150900000916
The following can be obtained:
Figure BDA00025945150900000917
thus, for each filter
Figure BDA00025945150900000918
Its degree of importance
Figure BDA00025945150900000919
The calculation formula is as follows:
Figure BDA00025945150900000920
wherein
Figure BDA00025945150900000921
Tensor Z representing the response maptIs located at the (m, n, d) position. According to the degree of importance
Figure BDA00025945150900000922
And sequencing the D channel filters, and only selecting the C channel filters sequenced at the front for tracking. This channel selection process is performed in the first frame and only the selected C channel filter remains in subsequent frames, thus the computational complexity can be significantly reduced.
D. To minimize the regression loss presented by equation (4) in step B, the problem was optimized using the ADMM algorithm. Considering that the sparse compact filter is compressed into a C channel from a D channel in an initial frame, an auxiliary variable U is introduced t=WtAnd constructs the lagrange function as follows:
Figure BDA00025945150900000923
wherein, VtIs lagrange multiplier and gamma is penalty factor. The ADMM algorithm is adopted to solve the following variables alternately:
to filter WtAnd (6) optimizing. To efficiently align the filter WtOptimizing by first applying Parceval's theorem to filter WtThe conversion from the time domain to the frequency domain of (1) is as follows:
Figure BDA0002594515090000101
wherein,
Figure BDA0002594515090000102
a discrete fourier transform symbol is represented,
Figure BDA0002594515090000103
indicating a dot-by-symbol. Similar to the solution of STRCF and LADCF,
Figure BDA0002594515090000104
each vector in
Figure BDA0002594515090000105
The solution of (a) is as follows:
Figure BDA0002594515090000106
wherein,
Figure BDA0002594515090000107
Figure BDA0002594515090000108
to represent
Figure BDA0002594515090000109
Is located at the (m, n) position. Computing
Figure BDA00025945150900001010
Then, the obtained product is subjected to inverse Fourier transform to obtain the product
Figure BDA00025945150900001011
For auxiliary variable UtAnd (6) optimizing. To solve for the auxiliary variable UtThe following sub-problems are optimized:
Figure BDA00025945150900001012
auxiliary variable UtEach element of
Figure BDA00025945150900001013
The solution of (a) is as follows:
Figure BDA00025945150900001014
Figure BDA00025945150900001015
wherein,
Figure BDA00025945150900001016
(·)+for the contraction operator, the following is defined: (x)+=sign(x)max(0,x)。
Updating lagrange multiplier VtAnd a penalty factor γ: known filter WtAnd an auxiliary variable UtFor lagrange multiplier VtAnd the penalty factor gamma is updated as follows:
Figure BDA00025945150900001017
γi+1=max(γmin,ργi). Where i is the iteration index, γminIs the minimum value of γ, and ρ is the scale factor.
In step B, the weight parameter θ in formula (2) is 0.2, and the regularization term parameter λ in formula (4) is 1.0 × 10 -4,μ=5。
In step C, the channel parameter C is 64.
In step D, the parameter γmin=0.002,γ0At 0.01, ρ 0.2, ADMM iterates 2 times.
Table 1 shows the accuracy, success rate and speed of the OTB100 data set of the present invention and several other correlation filter-based target tracking methods. Wherein SCCF is the method of the present invention.
TABLE 1
Tracking method CCOT MCPF ECO STRCF MCCT ASRCF LADCF GFSDCF SCCF
Precision (%) 89.6 87.3 90.9 88.0 91.7 91.9 90.6 92.5 93.3
Success rate (%) 66.6 62.8 68.7 67.5 68.2 68.9 69.6 68.9 70.0
Speed (FPS) 2.1 3.2 8.4 5.2 6.8 24.8 10.6 7.8 19.8
CCOT corresponds to the method proposed by Danelljan, M.et al (Danelljan, M., Robinson, A., Khan, F.S., Felsberg, M.: Beyond correction filters: left connecting connectors for visual tracking. in: ECCV. pp.472-488,2016);
MCPF corresponds to the method proposed by Zhang, T. et al (Zhang, T., Xu, C., Yang, M.H.: Multi-tasskcorrration partial filter for robust object tracking. in: CVPR. pp.4819-4827,2017);
ECO corresponds to the method proposed by Danelljan, M. et al (Danelljan, M., Bhat, G., Khan, F.S., Felsberg, M.: ECO: effective restriction operators for tracking. in: CVPR. pp.6931-6939,2017);
STRCF corresponds to the method proposed by Li, F et al (Li, F., Tian, C., Zuo, W., Zhang, L., Yang, M.H.: leaving spatial-temporal regulated correlation filters for visual tracking. in: CVPR. pp.4904-4913,2018);
MCCT corresponds to the method proposed by Wang, N. et al (Wang, N., Zhou, W., Tian, Q., Hong, R., Wang, M., Li, H.: Multi-core correlation filters for robust visual tracking. in: CVPR.pp.4844-4853,2018);
ASRCF corresponds to the method proposed by Dai, K. et al (Dai, K., Wang, D., Lu, H., Sun, C., Li, J.: Visual tracking via adaptive sizing filters. in: CVPR. pp.4670-4679,2019);
LADCF corresponds to the method proposed by Xu, T.et al (Xu, T., Feng, Z., Wu, X., Kittler, J.: Learning adaptive correlation filters, visual temporal correlation selection. IEEETIP 28(11), 5596-one 5609, 2019);
GFSDCF corresponds to the method proposed by Xu, T, et al (Xu, T., Feng, Z.H., Wu, X.J., Kittler, J.: Joint group failure selection and discrete filter tracking. in: ICCV. pp.7950-7960,2019).
According to the invention, the sparse and compact correlation filter is learned to carry out rapid robust visual tracking, the learned correlation filter can adaptively select the characteristics related to the target and inhibit redundancy and characteristics unrelated to the target, the problems of overfitting and high calculation complexity of the traditional correlation filter can be effectively relieved, and the robustness of the algorithm to shielding, deformation, rotation and background interference is improved. Through sparseness and time consistency constraint, the correlation filter adaptively selects discriminant features of a small number of channels which are continuous in time and have regional characteristics. The derived correlation-filtered learning problem can be solved by ADMM, which can be solved efficiently with only a few iterations. Experiments are carried out on various challenging data sets (OTB-2013, OTB-2015, VOT-2016, VOT2017 and UAV20L), and the results show that the method can obtain better performance, high precision and high speed. Specifically, on the OTB-2015 dataset, the tracker AUC scored 70.0% with a velocity of approximately 20FPS when using the Handcrafted and CNN features.

Claims (10)

1. A fast robust target tracking method based on a sparse compact correlation filter is characterized by comprising the following steps:
1) for a given target, constructing a basic sample by the target and the context thereof, wherein a training sample is composed of all cyclic translation samples of the basic sample, labels of the cyclic translation samples are determined by Gaussian functions, and DCF trains a loss function of a multi-channel correlation filter;
2) in the multi-task learning, integrating an exclusive sparse regular term and a group sparse regular term together to construct an intra-group-inter-group sparse regular term, introducing a time consistency constraint in filter learning by a tracker based on DCF to relieve the problem of DCF degradation along with time in target tracking, and introducing the intra-group-inter-group sparse regular term and the time regular term to define a regression loss function so as to learn a sparse correlation filter;
3) performing channel pruning based on the regression loss function defined in the step 2), integrally removing redundant filters to further accelerate the calculation process, and calculating the change of the regression loss by removing the filter of a certain specific channel; sorting the D channel filters according to the importance degree, and selecting C channel filters which are sorted in the front for tracking;
4) And constructing a Lagrange function, optimizing regression loss by adopting an ADMM algorithm, and completing the fast robust target tracking based on the sparse compact correlation filter.
2. The sparse compact correlation filter-based fast robust target tracking method as claimed in claim 1, wherein in step 1), for a given target, a basic sample is constructed from the target and its context, the training sample is composed of all cyclic shift samples of the basic sample, the labels of the cyclic shift samples are determined by gaussian function, and the DCF training loss function of the multichannel correlation filter is specifically:
in the t-th frame, for a given target, a basic sample is constructed by the target and the context thereof, a training sample is composed of all cyclic translation samples of the basic sample, labels of the cyclic translation samples are determined by Gaussian functions, and a loss function of the DCF training multichannel correlation filter is defined as follows:
Figure FDA0002594515080000011
wherein,
Figure FDA0002594515080000012
for cyclic convolution operation symbols, Xt∈RM×N×DAnd Wt∈RM×N×DFor the base sample sum filter of the t-th frame, Y ∈ RM×NFor the label determined by the Gaussian function, M, N and D are respectively used for representing the width, the height and the number of channels, and xi is a regular term parameter; filter science The objective is to minimize the loss function
Figure FDA0002594515080000013
In equation (1), the multi-channel features representing the base samples are all used to train the multi-channel correlation filter.
3. The sparse compact correlation filter-based fast robust target tracking method according to claim 1, wherein in step 2), in the multitask learning, an exclusive sparse regularization term and a group sparse regularization term are integrated together to construct an intra-group-inter-group sparse regularization term, as follows:
Figure FDA0002594515080000021
wherein,
Figure FDA0002594515080000022
Figure FDA0002594515080000023
represents WtOf the vector at the (m, n) position,
Figure FDA0002594515080000024
represents WtThe element in the (m, n, d) position, theta is a weight parameter to balance exclusive sparsity and group sparsity regularization terms;
group sparsity is performed on channels2The norm is then taken over spatially1A norm for removing spatially redundant features such that the filter is spatially sparse; exclusive sparse on channel1The norm is then taken over spatially2Norm, which is used to remove redundant features on the channel, so that the filter is sparse on the channel.
4. A fast robust target tracking method based on sparse compact correlation filter as claimed in claim 3 wherein the weight parameter θ is 0.2.
5. The sparse compact correlation filter-based fast robust target tracking method as claimed in claim 1, wherein in step 2), in the target tracking, the DCF-based tracker introduces a temporal consistency constraint in filter learning to alleviate the DCF degradation problem over time, and the introduced temporal regularization term is as follows:
Figure FDA0002594515080000025
Wherein, Wt-1A filter representing a t-1 th frame;
introducing an intra-group-to-inter-group sparsity regularization term REG(Wt) And a temporal regularization term RT(Wt) Defining a regression loss function to learn the sparse correlation filter, the regression loss function being as follows:
Figure FDA0002594515080000026
wherein λ and μ are each REG(Wt) And RT(Wt) The regularization term parameter of (2).
6. The sparse compact correlation filter-based fast robust target tracking method as claimed in claim 5 wherein the regularization term parameter λ ═ 1.0 × 10-4,μ=5。
7. The sparse compact correlation filter-based fast robust target tracking method according to claim 1, wherein in step 3), the specific method for performing channel pruning based on the regression loss function defined in step 2) is as follows:
ΔL=L(Xt,Y;Wt')-L(Xt,Y;Wt), (5)
wherein, WtAnd Wt' filter without and after pruning respectively;for a filter with D channels, channel pruning requires estimation of the loss function 2DChannel pruning can be completed in the next time;
calculating the change of the regression loss by removing a filter of a specific channel; order to
Figure FDA0002594515080000031
Is represented by
Figure FDA0002594515080000032
Generated response graph:
Figure FDA0002594515080000033
Dt={Xt,Y},
Figure FDA0002594515080000034
order to
Figure FDA0002594515080000035
To represent
Figure FDA0002594515080000036
The vector of (a) is obtained:
Figure FDA0002594515080000037
wherein,
Figure FDA0002594515080000038
vector representing current response graph
Figure FDA0002594515080000039
The loss after the pruning is carried out,
Figure FDA00025945150800000310
to represent
Figure FDA00025945150800000311
Loss of non-pruned branches(ii) a To pair
Figure FDA00025945150800000312
At the point of
Figure FDA00025945150800000313
The first order Taylor expansion is performed as follows:
Figure FDA00025945150800000314
wherein,
Figure FDA00025945150800000315
representing a first-order residue in the Taylor representation; substituting equation (7) into equation (6) and removing
Figure FDA00025945150800000316
Obtaining:
Figure FDA00025945150800000317
thus, for each filter
Figure FDA00025945150800000318
Its degree of importance
Figure FDA00025945150800000319
The calculation formula is as follows:
Figure FDA00025945150800000320
wherein,
Figure FDA00025945150800000321
tensor Z representing the response maptAn element located at the (m, n, d) position; according to the degree of importance
Figure FDA00025945150800000322
And sequencing the D channel filters, and only selecting the C channel filters sequenced at the front for tracking.
8. The sparse compact correlation filter-based fast robust target tracking method of claim 7, wherein C-64.
9. The fast robust target tracking method based on the sparse compact correlation filter as claimed in claim 1, wherein in step 4), the specific method for constructing the lagrangian function and optimizing the regression loss by using the ADMM algorithm is as follows:
in order to minimize the regression loss proposed by the formula (4) in the step 2), an ADMM algorithm is adopted for optimization; considering that the sparse compact filter is compressed into a C channel from a D channel in an initial frame, an auxiliary variable U is introducedt=WtAnd constructs the lagrange function as follows:
Figure FDA0002594515080000041
wherein, VtIs Lagrange multiplier, gamma is penalty factor; the naadmm algorithm alternately solves for the following variables:
For correlation filter WtOptimizing; firstly, the correlation filter W is processed by adopting the Pasteval theoremtThe conversion from the time domain to the frequency domain of (1) is as follows:
Figure FDA0002594515080000042
wherein,
Figure FDA0002594515080000043
indicates a discrete Fourier transform symbol,. indicates a point-by-symbol; similar to the solution of STRCF and LADCF,
Figure FDA0002594515080000044
each vector in
Figure FDA0002594515080000045
The solution of (a) is as follows:
Figure FDA0002594515080000046
wherein,
Figure FDA0002594515080000047
Figure FDA0002594515080000048
to represent
Figure FDA0002594515080000049
The element at the (m, n) position in (c); computing
Figure FDA00025945150800000410
Then, it is subjected to inverse Fourier transform to obtain
Figure FDA00025945150800000411
For auxiliary variable UtOptimization is carried out in order to solve the auxiliary variable UtThe following sub-problems are optimized:
Figure FDA00025945150800000412
auxiliary variable UtEach element of
Figure FDA00025945150800000413
The solution of (a) is as follows:
Figure FDA00025945150800000414
Figure FDA00025945150800000415
wherein,
Figure FDA00025945150800000416
(·)+for the contraction operator, the following is defined: (x)+=max(0,x);
Updating lagrange multiplier VtAnd a penalty factor γ: known filter WtAnd an auxiliary variable UtFor lagrange multiplier VtAnd the penalty factor gamma is updated as follows:
Figure FDA00025945150800000417
γi+1=max(γmin,ργi) (ii) a Where i is the iteration index, γminIs the minimum value of γ, and ρ is the scale factor.
10. The sparse compact correlation filter-based fast robust target tracking method as claimed in claim 9 wherein the parameter γ ismin=0.002,γ0At 0.01, ρ 0.2, ADMM iterates 2 times.
CN202010705423.9A 2020-07-21 2020-07-21 Rapid robust target tracking method based on sparse compact correlation filter Active CN111862167B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010705423.9A CN111862167B (en) 2020-07-21 2020-07-21 Rapid robust target tracking method based on sparse compact correlation filter

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010705423.9A CN111862167B (en) 2020-07-21 2020-07-21 Rapid robust target tracking method based on sparse compact correlation filter

Publications (2)

Publication Number Publication Date
CN111862167A true CN111862167A (en) 2020-10-30
CN111862167B CN111862167B (en) 2022-05-10

Family

ID=73000807

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010705423.9A Active CN111862167B (en) 2020-07-21 2020-07-21 Rapid robust target tracking method based on sparse compact correlation filter

Country Status (1)

Country Link
CN (1) CN111862167B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112767450A (en) * 2021-01-25 2021-05-07 开放智能机器(上海)有限公司 Multi-loss learning-based related filtering target tracking method and system
CN113379804A (en) * 2021-07-12 2021-09-10 闽南师范大学 Unmanned aerial vehicle target tracking method, terminal equipment and storage medium
CN114117926A (en) * 2021-12-01 2022-03-01 南京富尔登科技发展有限公司 Robot cooperative control algorithm based on federal learning

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106203495A (en) * 2016-07-01 2016-12-07 广东技术师范学院 A kind of based on the sparse method for tracking target differentiating study
CN109859241A (en) * 2019-01-09 2019-06-07 厦门大学 Adaptive features select and time consistency robust correlation filtering visual tracking method
CN109859244A (en) * 2019-01-22 2019-06-07 西安微电子技术研究所 A kind of visual tracking method based on convolution sparseness filtering
CN110490907A (en) * 2019-08-21 2019-11-22 上海无线电设备研究所 Motion target tracking method based on multiple target feature and improvement correlation filter
CN111126132A (en) * 2019-10-25 2020-05-08 宁波必创网络科技有限公司 Learning target tracking algorithm based on twin network

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106203495A (en) * 2016-07-01 2016-12-07 广东技术师范学院 A kind of based on the sparse method for tracking target differentiating study
CN109859241A (en) * 2019-01-09 2019-06-07 厦门大学 Adaptive features select and time consistency robust correlation filtering visual tracking method
CN109859244A (en) * 2019-01-22 2019-06-07 西安微电子技术研究所 A kind of visual tracking method based on convolution sparseness filtering
CN110490907A (en) * 2019-08-21 2019-11-22 上海无线电设备研究所 Motion target tracking method based on multiple target feature and improvement correlation filter
CN111126132A (en) * 2019-10-25 2020-05-08 宁波必创网络科技有限公司 Learning target tracking algorithm based on twin network

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
CHENGGANG GUO ET AL.: "Learning Local Structured Correlation Filters for Visual Tracking via Spatial Joint Regularization", 《IEEE ACCESS》 *
LUO XIONG ET AL.: "correlation filter tracking with adaptive proposal selection for accurate scale estimation", 《ARXIV》 *
YANJIE LIANG ET AL.: "Robust Correlation Filter Tracking with Shepherded Instance-Aware Proposals", 《26TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA 2018》 *
王欣远: "基于相关滤波器的目标跟踪算法研究", 《中国优秀博硕士学位论文全文数据库(硕士)信息科技辑》 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112767450A (en) * 2021-01-25 2021-05-07 开放智能机器(上海)有限公司 Multi-loss learning-based related filtering target tracking method and system
CN113379804A (en) * 2021-07-12 2021-09-10 闽南师范大学 Unmanned aerial vehicle target tracking method, terminal equipment and storage medium
CN113379804B (en) * 2021-07-12 2023-05-09 闽南师范大学 Unmanned aerial vehicle target tracking method, terminal equipment and storage medium
CN114117926A (en) * 2021-12-01 2022-03-01 南京富尔登科技发展有限公司 Robot cooperative control algorithm based on federal learning
CN114117926B (en) * 2021-12-01 2024-05-14 南京富尔登科技发展有限公司 Robot cooperative control algorithm based on federal learning

Also Published As

Publication number Publication date
CN111862167B (en) 2022-05-10

Similar Documents

Publication Publication Date Title
CN111862167B (en) Rapid robust target tracking method based on sparse compact correlation filter
Luo et al. Decomposition algorithm for depth image of human health posture based on brain health
CN112560656B (en) Pedestrian multi-target tracking method combining attention mechanism end-to-end training
CN109859241B (en) Adaptive feature selection and time consistency robust correlation filtering visual tracking method
CN104484890B (en) Video target tracking method based on compound sparse model
CN108874149B (en) Method for continuously estimating human body joint angle based on surface electromyogram signal
CN102508867A (en) Human-motion diagram searching method
CN110135365B (en) Robust target tracking method based on illusion countermeasure network
CN107203747B (en) Sparse combined model target tracking method based on self-adaptive selection mechanism
Zhang et al. Robust low-rank kernel subspace clustering based on the schatten p-norm and correntropy
CN115147456B (en) Target tracking method based on time sequence self-adaptive convolution and attention mechanism
CN112258557B (en) Visual tracking method based on space attention feature aggregation
CN113673313B (en) Gesture recognition method based on hierarchical convolutional neural network
CN112668543B (en) Isolated word sign language recognition method based on hand model perception
CN110555864B (en) Self-adaptive target tracking method based on PSPCE
CN114781441B (en) EEG motor imagery classification method and multi-space convolution neural network model
Sahbi Topologically-consistent magnitude pruning for very lightweight graph convolutional networks
CN111428555A (en) Joint-divided hand posture estimation method
Jeong et al. Deep efficient continuous manifold learning for time series modeling
Lajkó et al. Surgical skill assessment automation based on sparse optical flow data
Hashida et al. Multi-channel mhlf: Lstm-fcn using macd-histogram with multi-channel input for time series classification
Firouznia et al. Adaptive chaotic sampling particle filter to handle occlusion and fast motion in visual object tracking
Wang et al. Human motion data refinement unitizing structural sparsity and spatial-temporal information
CN116597996A (en) Infant brain development quantitative evaluation system based on self-adaptive neighbor propagation self-clustering model
Kalimuthu et al. Multi-class facial emotion recognition using hybrid dense squeeze network

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant