CN111862167B

CN111862167B - Rapid robust target tracking method based on sparse compact correlation filter

Info

Publication number: CN111862167B
Application number: CN202010705423.9A
Authority: CN
Inventors: 王菡子; 梁艳杰; 熊逻
Original assignee: Xiamen University
Current assignee: Xiamen University
Priority date: 2020-07-21
Filing date: 2020-07-21
Publication date: 2022-05-10
Anticipated expiration: 2040-07-21
Also published as: CN111862167A

Abstract

A fast and robust target tracking method based on a sparse compact correlation filter, relating to computer vision technology. A base sample is constructed from the target and its context, the training samples consist of all cyclic shifts of the base sample, and the loss function with which a discriminative correlation filter (DCF) trains the multi-channel correlation filter is defined. In multi-task learning, an exclusive sparse regularization term and a group sparse regularization term are integrated into an intra-group-inter-group sparse regularization term; a temporal consistency constraint is introduced to alleviate the degradation of the DCF over time; and the intra-group-inter-group sparse regularization term and the temporal regularization term together define a regression loss function from which a sparse correlation filter is learned. Channel pruning removes redundant filters as a whole: the D channel filters are sorted by importance and the top-ranked channel filters are selected for tracking. A Lagrange function is constructed and the regression loss is optimized with the ADMM algorithm. The method effectively improves the discriminability and interpretability of the filter, and achieves high accuracy at high speed.

Description

Rapid robust target tracking method based on sparse compact correlation filter

Technical Field

The invention relates to a computer vision technology, in particular to a fast robust tracking method based on a sparse compact correlation filter.

Background

Humans have a strong ability to perceive video visually: the brain can locate a moving target in a video quickly and accurately. Computer vision aims to mimic this ability and to reach human-level speed and accuracy. Visual tracking is a fundamental problem in computer vision and a basic component of visual perception, and its speed and precision determine the real-time performance and accuracy of visual perception as a whole. Target tracking is an important research direction in computer vision and plays an important role in intelligent video surveillance, human-computer interaction, robot navigation, virtual reality, medical diagnosis, public safety and other fields. The task is to select an object of interest in the initial frame of a video and then predict the state of that object in the subsequent frames. Target tracking is challenging: the appearance of the target often changes during tracking (for example through occlusion, deformation or rotation), and complex illumination changes, background distractors that resemble the target, and fast target motion all make the task difficult. In recent years, target tracking methods based on correlation filtering and on deep learning have become the mainstream research directions because of their good performance.

Correlation-filter-based methods have become a research hotspot in target tracking in recent years; they offer a clear speed advantage and have achieved good results on numerous data sets and in various competitions. The DCF started the wave of applications of correlation filtering in target tracking, and many researchers have since improved on it. To handle scale and rotation changes, LDES proposes a phase filter that simultaneously estimates the scale and the rotation angle of the target in a polar coordinate system; MCPF embeds correlation filters into a particle filter tracking framework to handle scale changes during tracking. To alleviate the spatial boundary effect of the filter, DSARCF and ASRCF introduce a dynamic saliency response map and an adaptive spatial response map, respectively, into filter learning to weight the filter coefficients adaptively. To alleviate the degradation of the filter over time, STRCF and LADCF introduce a temporal regularization term into filter learning for robust and fast target tracking. To train the filter with more samples, CACF and BACF use context samples and background samples, respectively, to train the correlation filter, which greatly improves accuracy while preserving real-time performance. To select more robust and discriminative features, HCF applies multi-layer deep features extracted from VGG-Net to the correlation filter tracking framework and achieves accurate and robust tracking by fusing the multi-layer response maps. GFSDCF combines feature selection with filter learning, so that the learned filter has stronger discriminative power, overfitting is alleviated, and tracking accuracy is improved. ECO uses a decomposition matrix to compress the original deep features and train a continuous convolution filter, achieving efficient and accurate target tracking. To enhance the discriminability of the response map, LMCF introduces the correlation filter into the Struck tracking framework and exploits both the high speed of the correlation filter and the strong discriminative ability of Struck to achieve fast and robust tracking.

In recent years, deep-learning-based methods have become another research hotspot in target tracking because of their higher accuracy. Current deep-learning-based tracking methods can be roughly divided into two categories. The first category constructs a deep network, trains it offline on selected samples, and performs tracking by fine-tuning the network online; the representative method is MDNET. Building on the MDNET tracking framework, VITAL uses adversarial learning to preserve features that remain robust during tracking, and ADNET uses reinforcement learning to predict the various states of the target during tracking so as to adapt to complex tracking environments. These methods achieve high tracking accuracy but have difficulty running in real time. The second category converts the tracking problem into an instance retrieval problem, in which the matching function used for retrieval is trained offline on external video data. SINT and SiamFC address deep similarity measurement by training Siamese networks offline; DCFNet and CFNet add differentiable correlation filter layers to the Siamese network to train, end to end, feature representations suited to correlation filtering; EAST introduces reinforcement learning into the Siamese network to adaptively select the deep features of a particular layer for fast and robust tracking; SiamRPN further introduces an RPN into the Siamese network to handle changes in the scale and aspect ratio of the target during tracking. Most of these offline-trained methods run in real time, but their accuracy depends on the network and on the data used for training.

Disclosure of Invention

The invention aims to provide a fast and robust target tracking method based on a sparse compact correlation filter, which effectively addresses the overfitting and high computational complexity of traditional correlation filters and improves the robustness of the algorithm to occlusion, deformation, rotation and background interference.

The invention comprises the following steps:

1) for a given target, constructing a base sample from the target and its context, where the training samples consist of all cyclic shifts of the base sample and the labels of the cyclic shifts are determined by a Gaussian function, and defining the loss function with which the DCF trains the multi-channel correlation filter;

2) in multi-task learning, integrating an exclusive sparse regularization term and a group sparse regularization term to construct an intra-group-inter-group sparse regularization term; in target tracking, DCF-based trackers introduce a temporal consistency constraint into filter learning to alleviate the degradation of the DCF over time; the intra-group-inter-group sparse regularization term and the temporal regularization term are introduced to define a regression loss function from which a sparse correlation filter is learned;

3) performing channel pruning based on the regression loss function defined in step 2), removing redundant filters as a whole to further accelerate computation: the change in the regression loss caused by removing the filter of a specific channel is calculated, the D channel filters are sorted by importance, and the C top-ranked channel filters are selected for tracking;

4) constructing a Lagrange function and optimizing the regression loss with the ADMM algorithm, thereby completing the fast and robust target tracking based on the sparse compact correlation filter.

In step 1), the specific method for constructing the base sample from the target and its context, composing the training samples from all cyclic shifts of the base sample, determining the labels of the cyclic shifts with a Gaussian function, and defining the loss function with which the DCF trains the multi-channel correlation filter may be as follows:

In the t-th frame, for a given target, a base sample is constructed from the target and its context, the training samples consist of all cyclic shifts of the base sample, and the labels of the cyclic shifts are determined by a Gaussian function. The loss function with which the DCF trains the multi-channel correlation filter is defined by equation (1),

where ⊛ denotes the circular convolution operator, X_t ∈ R^{M×N×D} and W_t ∈ R^{M×N×D} are the base sample and the filter of the t-th frame, respectively, Y ∈ R^{M×N} is the label determined by the Gaussian function, M, N and D denote the width, the height and the number of channels, respectively, and ξ is the regularization parameter. The goal of filter learning is to minimize this loss function.

In equation (1), the multi-channel features representing the base samples are all used to train the multi-channel correlation filter.
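The use of all cyclic shifts as training samples is what makes the filter cheap to evaluate. The following NumPy sketch is an illustration only and is not taken from the patent: the function names, the single-channel setting and the array sizes are assumptions. It checks that the responses of a filter to every cyclic shift of a base sample can be obtained with one FFT-based circular correlation.

```python
import numpy as np

def responses_by_shifting(base, filt):
    """Brute force: inner product of filt with every cyclic shift of base."""
    M, N = base.shape
    out = np.empty((M, N))
    for dm in range(M):
        for dn in range(N):
            # shifted[m, n] == base[(m + dm) % M, (n + dn) % N]
            shifted = np.roll(base, shift=(-dm, -dn), axis=(0, 1))
            out[dm, dn] = np.sum(shifted * filt)
    return out

def responses_by_fft(base, filt):
    """Same responses from the correlation theorem (real-valued inputs)."""
    B = np.fft.fft2(base)
    F = np.fft.fft2(filt)
    return np.real(np.fft.ifft2(B * np.conj(F)))

# Hypothetical 5 x 6 single-channel example.
rng = np.random.default_rng(0)
base = rng.standard_normal((5, 6))
filt = rng.standard_normal((5, 6))
same = np.allclose(responses_by_shifting(base, filt), responses_by_fft(base, filt))
```

The brute-force version costs on the order of (MN)² operations, while the FFT version costs on the order of MN log(MN), which is why the explicit shifted samples never need to be materialized.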

In step 2), in multi-task learning, the exclusive sparse regularization term and the group sparse regularization term are integrated to construct the intra-group-inter-group sparse regularization term R_EG(W_t) of equation (2),

where w_t^{mn} denotes the vector of W_t at position (m, n), w_t^{mnd} denotes the element of W_t at position (m, n, d), and θ is a weight parameter that balances the exclusive sparse and group sparse regularization terms.

Group sparsity first takes the l₂ norm over the channels and then the l₁ norm over space; it removes spatially redundant features, so that the filter is sparse in space. Exclusive sparsity first takes the l₁ norm over the channels and then the l₂ norm over space; it removes features that are redundant across channels, so that the filter is sparse across channels.

In target tracking, DCF-based trackers introduce a temporal consistency constraint into filter learning to alleviate the degradation of the DCF over time. The temporal regularization term R_T(W_t) introduced for this purpose is given by equation (3),

where W_{t-1} denotes the filter of the (t-1)-th frame.

The intra-group-inter-group sparse regularization term R_EG(W_t) and the temporal regularization term R_T(W_t) are introduced to define the regression loss function of equation (4), from which the sparse correlation filter is learned,

where λ and μ are the regularization parameters of R_EG(W_t) and R_T(W_t), respectively.

In step 2), the weight parameter θ in equation (2) is 0.2, and the regularization parameters in equation (4) are λ = 1.0 × 10^-4 and μ = 5.

In step 3), the specific method for performing channel pruning based on the regression loss function defined in step 2) is as follows. The importance of a filter is estimated from the change in the loss:

ΔL = L(X_t, Y; W′_t) − L(X_t, Y; W_t), (5)

where W_t and W′_t denote the filter before and after pruning, respectively. For a filter with D channels, channel pruning by this criterion requires 2^D evaluations of the loss function.

The change in the regression loss is therefore calculated by removing the filter of one specific channel. Let Z_t denote the response map generated by the filter, let D_t = {X_t, Y}, and let z_t denote the vectorized form of Z_t. Then:

|ΔL(z_t)| = |L(D_t, z_t = 0) − L(D_t, z_t)|, (6)

where L(D_t, z_t = 0) is the loss after the current response map vector z_t has been pruned and L(D_t, z_t) is the loss without pruning. The first-order Taylor expansion of L(D_t, z_t = 0) at the point z_t is:

L(D_t, z_t = 0) = L(D_t, z_t) − (∂L/∂z_t) z_t + R₁(z_t = 0), (7)

where R₁(z_t = 0) is the first-order remainder of the Taylor expansion. Substituting equation (7) into equation (6) and discarding R₁(z_t = 0) gives:

|ΔL(z_t)| = |(∂L/∂z_t) z_t|. (8)

Thus, for each channel filter W_t^d, its importance Θ(z_t^d) is calculated as:

Θ(z_t^d) = |Σ_m Σ_n (∂L/∂z_t^{mnd}) z_t^{mnd}|, (9)

where z_t^{mnd} denotes the element of the response map tensor Z_t at position (m, n, d). The D channel filters are sorted by their importance Θ(z_t^d), and only the C top-ranked channel filters are selected for tracking. This channel selection is performed in the first frame, and only the selected C channel filters are retained in subsequent frames, so the computational complexity is significantly reduced.

In step 3), the channel parameter C is 64.

In step 4), the specific method for constructing the Lagrange function and optimizing the regression loss with the ADMM algorithm may be as follows:

To minimize the regression loss of equation (4) in step 2), the ADMM algorithm is adopted. Since the sparse compact filter is compressed from D channels to C channels in the initial frame, an auxiliary variable U_t = W_t is introduced and the corresponding Lagrange function is constructed,

where V_t is the Lagrange multiplier and γ is the penalty factor. The ADMM algorithm solves for the following variables alternately:

Optimization of the correlation filter W_t. First, Parseval's theorem is applied to convert the subproblem for W_t from the time domain to the frequency domain, where ^ denotes the discrete Fourier transform and ⊙ denotes the element-wise product. Similar to the solutions of STRCF and LADCF, the vector ŵ_t^{mn} at each position (m, n) of Ŵ_t is solved for separately, where the superscript mn denotes the element at position (m, n) of the corresponding tensor. After Ŵ_t has been computed, the inverse Fourier transform is applied to it to obtain W_t.
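The two facts this step relies on can be checked numerically. The sketch below is illustrative and not part of the patent: the variable names and the 8 × 8 size are assumptions. It verifies Parseval's theorem for NumPy's unnormalized FFT, and that a circular convolution becomes an element-wise product in the frequency domain.

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.standard_normal((8, 8))   # hypothetical single-channel filter
x = rng.standard_normal((8, 8))   # hypothetical single-channel sample

W_hat = np.fft.fft2(w)
X_hat = np.fft.fft2(x)

# Parseval: energy is preserved up to the 1/(M*N) factor of NumPy's convention.
energy_time = np.sum(w ** 2)
energy_freq = np.sum(np.abs(W_hat) ** 2) / w.size

# Convolution theorem: circular convolution <-> element-wise product.
conv_fft = np.real(np.fft.ifft2(X_hat * W_hat))

# Direct evaluation of the (0, 0) output of the circular convolution:
# sum over (m, n) of x[m, n] * w[(-m) % M, (-n) % N].
w_flipped = np.roll(w[::-1, ::-1], shift=(1, 1), axis=(0, 1))
conv_direct_00 = np.sum(x * w_flipped)
```

Because the data term decouples across spatial frequencies after the transform, each position (m, n) can be solved as its own small problem over the channel dimension.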

Optimization of the auxiliary variable U_t. The auxiliary variable U_t is obtained by solving the corresponding subproblem, and each element u_t^{mnd} of U_t is solved for separately, where (·)₊ is the contraction operator, defined as (x)₊ = max(0, x).
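The operator (x)₊ is the usual positive-part function. As an illustration only, the sketch below pairs it with group soft-thresholding, the standard proximal step for an l₂,₁ (group sparse) penalty. This is a swapped-in textbook technique, not the patent's closed-form solution for U_t, which is not reproduced in this text; the function names and the threshold `tau` are hypothetical.

```python
import numpy as np

def positive_part(x):
    """(x)_+ = max(0, x), applied element-wise."""
    return np.maximum(0.0, x)

def group_soft_threshold(P, tau):
    """Shrink each channel vector P[m, n, :] toward zero by tau in l2 norm.

    Vectors whose l2 norm is at most tau are set exactly to zero, which is
    how a group penalty produces spatial sparsity.
    """
    norms = np.linalg.norm(P, axis=2, keepdims=True)
    scale = positive_part(1.0 - tau / np.maximum(norms, 1e-12))
    return scale * P

# Hypothetical 2 x 2 x 3 tensor: one strong position, one weak, two empty.
P = np.zeros((2, 2, 3))
P[0, 0] = [3.0, 4.0, 0.0]   # l2 norm 5, survives with scale 0.8
P[1, 1] = [0.1, 0.0, 0.0]   # l2 norm 0.1, removed entirely
U = group_soft_threshold(P, tau=1.0)
```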

Update of the Lagrange multiplier V_t and the penalty factor γ. Given the filter W_t and the auxiliary variable U_t, the Lagrange multiplier V_t and the penalty factor γ are updated, the latter according to

γⁱ⁺¹ = max(γ^min, ργⁱ),

where i is the iteration index, γ^min is the minimum value of γ, and ρ is the scale factor.

In step 4), the parameters are γ^min = 0.002, γ⁰ = 0.01 and ρ = 0.2, and ADMM runs for 2 iterations.
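A small sketch of the penalty schedule with the parameter values stated above. The function name and the list-based return value are illustrative assumptions.

```python
def penalty_schedule(gamma0=0.01, rho=0.2, gamma_min=0.002, iterations=2):
    """Return [gamma^0, gamma^1, ...] under gamma^{i+1} = max(gamma_min, rho * gamma^i)."""
    gammas = [gamma0]
    for _ in range(iterations):
        gammas.append(max(gamma_min, rho * gammas[-1]))
    return gammas

gammas = penalty_schedule()
```

With these values ρ·γ⁰ equals γ^min, so the penalty factor reaches its floor after the first update and stays there for the second iteration.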

Compared with the prior art, the invention has the following advantages:

The sparse and compact correlation filter provided by the invention is used for robust real-time target tracking and effectively alleviates overfitting and high computational complexity during tracking. The proposed intra-group-inter-group sparse regularization term selects discriminative, target-specific features for training the filter, which improves both the discriminability and the interpretability of the filter. On the one hand, introducing the new intra-group-inter-group sparse regularization term into filter learning keeps the learned filter sparse in both space and channels, so that discriminative target-specific features are activated during tracking; this alleviates overfitting and improves the robustness of the algorithm to occlusion, deformation, rotation and background interference. On the other hand, a new channel pruning algorithm based on Taylor expansion prunes the filter, retaining a small number of filters that respond strongly to the specific target and removing a large number of redundant filters, which further alleviates overfitting and reduces the computational complexity. The correlation filter is solved with an efficient ADMM algorithm that optimizes the filter in only a few iterations. Experimental results on a variety of challenging data sets show that the method achieves good tracking results with high accuracy and high speed. On the OTB-2015 data set, the DP/AUC score of the invention is 93.3%/70.0%, and the speed can reach 20 FPS.

Detailed Description

The present invention is a correlation-filter-based target tracking method; the following embodiment describes it further.

The embodiment of the invention comprises the following steps:

A. In the t-th frame, for a given target, a base sample is constructed from the target and its context, the training samples consist of all cyclic shifts of the base sample, and the labels of the cyclic shifts are determined by a Gaussian function. The loss function with which the DCF trains the multi-channel correlation filter is defined by equation (1),

where ⊛ denotes the circular convolution operator, X_t ∈ R^{M×N×D} and W_t ∈ R^{M×N×D} are the base sample and the filter of the t-th frame, respectively, Y ∈ R^{M×N} is the label determined by the Gaussian function, M, N and D denote the width, the height and the number of channels, respectively, and ξ is the regularization parameter. The goal of filter learning is to minimize this loss function.

In equation (1), the multi-channel features representing the base samples are all used to train the multi-channel correlation filter. However, a significant portion of these features are unrelated to the particular target being tracked or otherwise useless for distinguishing between background and target. In order to select discriminative and target-specific features to train the filter to effectively alleviate the problems of overfitting and high computational complexity, a sparse and compact correlation filter is proposed below for fast and robust tracking.
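As a concrete reading of step A, the sketch below builds a Gaussian label map and evaluates a multi-channel DCF loss with FFT-based circular convolution. It is an assumption-laden illustration rather than the patent's equation (1): the ridge form 0.5·‖Σ_d X^d ⊛ W^d − Y‖² + 0.5·ξ·‖W‖², the constant factors, and the values of `sigma` and `xi` are all hypothetical.

```python
import numpy as np

def gaussian_label(M, N, sigma=2.0):
    """Gaussian label map peaked at the zero shift, wrapped cyclically."""
    m = np.minimum(np.arange(M), M - np.arange(M))  # cyclic distance to row 0
    n = np.minimum(np.arange(N), N - np.arange(N))  # cyclic distance to column 0
    return np.exp(-(m[:, None] ** 2 + n[None, :] ** 2) / (2.0 * sigma ** 2))

def dcf_loss(X, W, Y, xi=1e-2):
    """Assumed ridge-regression form of the multi-channel DCF loss.

    X, W: arrays of shape (M, N, D); Y: array of shape (M, N).
    """
    X_hat = np.fft.fft2(X, axes=(0, 1))
    W_hat = np.fft.fft2(W, axes=(0, 1))
    # Sum of per-channel circular convolutions, computed in the frequency domain.
    response = np.real(np.fft.ifft2(np.sum(X_hat * W_hat, axis=2)))
    return 0.5 * np.sum((response - Y) ** 2) + 0.5 * xi * np.sum(W ** 2)

# Hypothetical toy problem: a unit impulse in channel 0 and a filter equal
# to the label reproduce the label exactly, leaving only the ridge penalty.
Y = gaussian_label(4, 4)
X = np.zeros((4, 4, 2)); X[0, 0, 0] = 1.0
W = np.zeros((4, 4, 2)); W[:, :, 0] = Y
loss_fit = dcf_loss(X, W, Y)
loss_zero = dcf_loss(X, np.zeros_like(W), Y)
```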

B. In step A, the correlation filter of each channel is usually trained on the features of that channel alone. However, different feature channels exhibit different characteristics: some feature channels are mutually exclusive and some are mutually cooperative. When training a multi-channel correlation filter, mutually exclusive feature channels require the correlation filters of the respective channels to be trained individually, whereas mutually cooperative feature channels require their correlation filters to be trained jointly. The learning problem of the multi-channel correlation filter can therefore be cast as a multi-task learning problem in which each task corresponds to the correlation filter of one channel.

In multi-task learning, the exclusive sparse regularization term encourages the model parameters of different tasks to compete, which ultimately yields intra-group sparsity; the group sparse regularization term encourages the model parameters of different tasks to cooperate, which ultimately yields inter-group sparsity. For target tracking, both intra-group sparsity and inter-group sparsity alleviate overfitting. To overcome the limitations of using only the exclusive sparse term or only the group sparse term, the two are integrated into a new regularization term, the intra-group-inter-group sparse regularization term R_EG(W_t) of equation (2),

where w_t^{mn} denotes the vector of the filter W_t at position (m, n), w_t^{mnd} denotes the element of the filter W_t at position (m, n, d), and θ is a weight parameter that balances the exclusive sparse and group sparse regularization terms.

On the one hand, group sparsity first takes the l₂ norm over the channels and then the l₁ norm over space; it therefore removes spatially redundant features, making the filter sparse in space. On the other hand, exclusive sparsity first takes the l₁ norm over the channels and then the l₂ norm over space; it therefore removes features that are redundant across channels, making the filter sparse across channels. Overall, the proposed intra-group-inter-group sparse regularization term selects discriminative, target-specific features for training the filter, thereby improving the discriminability and the interpretability of the filter.
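The two norms described above can be written directly in NumPy. The sketch below is a hedged illustration: the order of the norms follows the text, but the way θ combines the two terms (a convex combination here) and the absence of squaring on the exclusive term are assumptions, since equation (2) is not reproduced in this text.

```python
import numpy as np

def group_sparse(W):
    """l2 norm over channels at each position, then l1 norm (sum) over space."""
    return np.sum(np.linalg.norm(W, axis=2))

def exclusive_sparse(W):
    """l1 norm over channels at each position, then l2 norm over space."""
    return np.linalg.norm(np.sum(np.abs(W), axis=2))

def intra_inter_group_regularizer(W, theta=0.2):
    # Assumption: theta weights the exclusive term and (1 - theta) the group
    # term; the patent only states that theta balances the two.
    return theta * exclusive_sparse(W) + (1.0 - theta) * group_sparse(W)

# Hypothetical 1 x 2 x 2 filter: one active position, one empty position.
W = np.zeros((1, 2, 2))
W[0, 0] = [3.0, 4.0]
value = intra_inter_group_regularizer(W)   # 0.2 * 7 + 0.8 * 5
```

The group term is zero only where a whole channel vector vanishes, so it switches off spatial positions; the exclusive term penalizes several channels being active at the same position, so it thins out channels.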

In target tracking, some recent DCF-based trackers introduce a temporal consistency constraint into filter learning to alleviate the degradation of the DCF over time. The temporal regularization term R_T(W_t) that is usually introduced is given by equation (3),

where W_{t-1} denotes the filter of the (t-1)-th frame.

To exploit sparsity in space, sparsity across channels and consistency over time together, the intra-group-inter-group sparse regularization term R_EG(W_t) and the temporal regularization term R_T(W_t) are introduced simultaneously into the regression loss function of equation (4), from which the sparse correlation filter is learned,

where λ and μ are the regularization parameters of R_EG(W_t) and R_T(W_t), respectively. With this regression loss function, the learned filter enhances discriminative features and alleviates overfitting.
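Read together with the definitions above, the temporal term and the overall objective plausibly take the additive form sketched below. This is an assumption for illustration: the squared Frobenius norm in the temporal term follows the convention of trackers such as STRCF, and 𝓛_DCF is hypothetical shorthand for the loss of equation (1); neither is copied from the patent.

```latex
R_T(W_t) = \lVert W_t - W_{t-1} \rVert_F^2 ,
\qquad
\mathcal{L}(X_t, Y; W_t)
  = \mathcal{L}_{\mathrm{DCF}}(X_t, Y; W_t)
  + \lambda \, R_{EG}(W_t)
  + \mu \, R_T(W_t)
```

Under this reading, a larger μ keeps W_t closer to the previous filter W_{t-1}, while a larger λ pushes more spatial positions and channels of W_t to zero.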

C. The regression loss function defined by step B can make the learned correlation filter sparse in space and channels, and can alleviate the problem of overfitting. However, the sparse correlation filters are not compact enough in structure, and the computation complexity is still high, so that in order to further accelerate the computation process, an effective solution is to remove the redundant filters as a whole, namely channel pruning. The goal of channel pruning, which is usually based on an evaluation criterion of the importance of the filter, is to minimize the impact of removing the filter. Oracle channel pruning is the best criterion for removing redundant filters, and it estimates the importance of a filter based on the variation of the loss as follows:

ΔL＝L(X_t,Y；W′_t)-L(X_t,Y；W_t), (5)

where W_t and W′_t denote the filter before and after pruning, respectively. Oracle channel pruning achieves very high accuracy, but its computational complexity is high: for a filter with D channels, Oracle channel pruning requires 2^D evaluations of the loss function.
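The 2^D figure is the number of channel subsets that an exhaustive search would have to score. A minimal sketch, with a hypothetical function name, that counts them by enumeration:

```python
from itertools import combinations

def oracle_evaluations(D):
    """Count every subset of D channels that exhaustive pruning would score."""
    return sum(1 for k in range(D + 1) for _ in combinations(range(D), k))

count_4 = oracle_evaluations(4)    # 16 subsets for 4 channels
count_10 = oracle_evaluations(10)  # 1024 subsets for 10 channels
```

A criterion that scores each channel once needs only D evaluations, which is the motivation for the first-order approximation that follows.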

Based on the idea of Oracle channel pruning, channel pruning based on Taylor expansion calculates the change in the regression loss caused by removing the filter of one specific channel. Let Z_t denote the response map generated by the filter, let D_t = {X_t, Y}, and let z_t denote the vectorized form of Z_t. Then:

|ΔL(z_t)| = |L(D_t, z_t = 0) − L(D_t, z_t)|, (6)

where L(D_t, z_t = 0) is the loss after the current response map vector z_t has been pruned and L(D_t, z_t) is the loss of the response map vector z_t without pruning. The first-order Taylor expansion of L(D_t, z_t = 0) at the point z_t is:

L(D_t, z_t = 0) = L(D_t, z_t) − (∂L/∂z_t) z_t + R₁(z_t = 0), (7)

where R₁(z_t = 0) is the first-order remainder of the Taylor expansion. Substituting equation (7) into equation (6) and discarding R₁(z_t = 0) gives:

|ΔL(z_t)| = |(∂L/∂z_t) z_t|. (8)

Thus, for each channel filter W_t^d, its importance Θ(z_t^d) is calculated as:

Θ(z_t^d) = |Σ_m Σ_n (∂L/∂z_t^{mnd}) z_t^{mnd}|, (9)

where z_t^{mnd} denotes the element of the response map tensor Z_t at position (m, n, d). The D channel filters are sorted by their importance Θ(z_t^d), and only the C top-ranked channel filters are selected for tracking. This channel selection is performed in the first frame, and only the selected C channel filters are retained in subsequent frames, so the computational complexity is significantly reduced.
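The ranking step can be sketched as follows. This is an illustration under stated assumptions, not the patent's implementation: only the data term 0.5·‖Σ_d Z^d − Y‖² is differentiated (the regularization terms are ignored), the importance is the absolute value of the summed gradient-times-activation products, and all function names are hypothetical.

```python
import numpy as np

def channel_responses(X, W):
    """Per-channel response maps Z[:, :, d] = X^d circularly convolved with W^d."""
    X_hat = np.fft.fft2(X, axes=(0, 1))
    W_hat = np.fft.fft2(W, axes=(0, 1))
    return np.real(np.fft.ifft2(X_hat * W_hat, axes=(0, 1)))

def taylor_importance(Z, Y):
    """|sum over (m, n) of dL/dz * z| per channel for the assumed data term.

    For L = 0.5 * ||sum_d Z^d - Y||^2 the gradient with respect to every
    channel's response map is the same residual.
    """
    residual = np.sum(Z, axis=2) - Y
    return np.abs(np.sum(residual[:, :, None] * Z, axis=(0, 1)))

def select_channels(importance, C):
    """Indices of the C most important channels, most important first."""
    return np.argsort(importance)[::-1][:C]

# Hypothetical response tensor with three channels of increasing strength.
Z = np.zeros((2, 2, 3))
Z[:, :, 1] = 1.0
Z[:, :, 2] = 2.0
Y = np.zeros((2, 2))
importance = taylor_importance(Z, Y)     # residual is 3 at every position
kept = select_channels(importance, C=2)
```

In a tracker, `kept` would be computed once in the first frame and then used to index both the features and the filter, for example `X[:, :, kept]`, in every later frame.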

D. To minimize the regression loss of equation (4) in step B, the ADMM algorithm is adopted. Since the sparse compact filter is compressed from D channels to C channels in the initial frame, an auxiliary variable U_t = W_t is introduced and the corresponding Lagrange function is constructed,

where V_t is the Lagrange multiplier and γ is the penalty factor. The ADMM algorithm solves for the following variables alternately:

Optimization of the filter W_t. To optimize the filter W_t efficiently, Parseval's theorem is first applied to convert the subproblem for W_t from the time domain to the frequency domain, where ^ denotes the discrete Fourier transform and ⊙ denotes the element-wise product. Similar to the solutions of STRCF and LADCF, the vector ŵ_t^{mn} at each position (m, n) of Ŵ_t is solved for separately, where the superscript mn denotes the element at position (m, n) of the corresponding tensor. After Ŵ_t has been computed, the inverse Fourier transform is applied to it to obtain W_t.

Optimization of the auxiliary variable U_t. The auxiliary variable U_t is obtained by solving the corresponding subproblem, and each element u_t^{mnd} of U_t is solved for separately, where (·)₊ is the contraction operator, defined as (x)₊ = sign(x)·max(0, x).

Update of the Lagrange multiplier V_t and the penalty factor γ. Given the filter W_t and the auxiliary variable U_t, the Lagrange multiplier V_t and the penalty factor γ are updated, the latter according to

γⁱ⁺¹ = max(γ^min, ργⁱ),

where i is the iteration index, γ^min is the minimum value of γ, and ρ is the scale factor.

In step B, the weight parameter θ in equation (2) is 0.2, and the regularization parameters in equation (4) are λ = 1.0 × 10^-4 and μ = 5.

In step C, the channel parameter C is 64.

In step D, the parameters are γ^min = 0.002, γ⁰ = 0.01 and ρ = 0.2, and ADMM runs for 2 iterations.

Table 1 compares the precision, success rate and speed of the present invention with those of several other correlation-filter-based target tracking methods on the OTB100 data set, where SCCF is the method of the present invention.

TABLE 1

Tracking method	CCOT	MCPF	ECO	STRCF	MCCT	ASRCF	LADCF	GFSDCF	SCCF
Precision (%)	89.6	87.3	90.9	88.0	91.7	91.9	90.6	92.5	93.3
Success rate (%)	66.6	62.8	68.7	67.5	68.2	68.9	69.6	68.9	70.0
Speed (FPS)	2.1	3.2	8.4	5.2	6.8	24.8	10.6	7.8	19.8

CCOT corresponds to the method proposed by Danelljan, M. et al. (Danelljan, M., Robinson, A., Khan, F.S., Felsberg, M.: Beyond correlation filters: Learning continuous convolution operators for visual tracking. In: ECCV. pp. 472-488, 2016);

MCPF corresponds to the method proposed by Zhang, T. et al. (Zhang, T., Xu, C., Yang, M.H.: Multi-task correlation particle filter for robust object tracking. In: CVPR. pp. 4819-4827, 2017);

ECO corresponds to the method proposed by Danelljan, M. et al. (Danelljan, M., Bhat, G., Khan, F.S., Felsberg, M.: ECO: Efficient convolution operators for tracking. In: CVPR. pp. 6931-6939, 2017);

STRCF corresponds to the method proposed by Li, F. et al. (Li, F., Tian, C., Zuo, W., Zhang, L., Yang, M.H.: Learning spatial-temporal regularized correlation filters for visual tracking. In: CVPR. pp. 4904-4913, 2018);

MCCT corresponds to the method proposed by Wang, N. et al. (Wang, N., Zhou, W., Tian, Q., Hong, R., Wang, M., Li, H.: Multi-cue correlation filters for robust visual tracking. In: CVPR. pp. 4844-4853, 2018);

ASRCF corresponds to the method proposed by Dai, K. et al. (Dai, K., Wang, D., Lu, H., Sun, C., Li, J.: Visual tracking via adaptive spatially-regularized correlation filters. In: CVPR. pp. 4670-4679, 2019);

LADCF corresponds to the method proposed by Xu, T. et al. (Xu, T., Feng, Z., Wu, X., Kittler, J.: Learning adaptive discriminative correlation filters via temporal consistency preserving spatial feature selection for robust visual object tracking. IEEE TIP 28(11), 5596-);

GFSDCF corresponds to the method proposed by Xu, T. et al. (Xu, T., Feng, Z.H., Wu, X.J., Kittler, J.: Joint group feature selection and discriminative filter learning for robust visual object tracking. In: ICCV. pp. 7950-7960, 2019).

The invention learns a sparse and compact correlation filter for fast and robust visual tracking. The learned correlation filter adaptively selects target-related features and suppresses redundant and target-irrelevant features, which alleviates the overfitting and high computational complexity of traditional correlation filters and improves the robustness of the algorithm to occlusion, deformation, rotation and background interference. Through the sparsity and temporal consistency constraints, the correlation filter adaptively selects discriminative features from a small number of channels that are consistent over time and spatially localized. The resulting correlation filter learning problem is solved with ADMM, which converges efficiently in only a few iterations. Experiments on a variety of challenging data sets (OTB-2013, OTB-2015, VOT-2016, VOT2017 and UAV20L) show that the method achieves better performance with high accuracy and high speed. Specifically, on the OTB-2015 data set, the tracker reaches an AUC score of 70.0% at a speed of approximately 20 FPS when using both hand-crafted and CNN features.

Claims

1. A fast robust target tracking method based on a sparse compact correlation filter is characterized by comprising the following steps:

2. The sparse compact correlation filter-based fast robust target tracking method according to claim 1, wherein in step 1), for a given target, a base sample is constructed from the target and its context, the training samples consist of all cyclic shifts of the base sample, the labels of the cyclic shifts are determined by a Gaussian function, and the loss function with which the DCF trains the multi-channel correlation filter is defined by equation (1),

where ⊛ denotes the circular convolution operator, X_t ∈ R^{M×N×D} and W_t ∈ R^{M×N×D} are the base sample and the filter of the t-th frame, respectively, Y ∈ R^{M×N} is the label determined by the Gaussian function, M, N and D denote the width, the height and the number of channels, respectively, and ξ is the regularization parameter; the goal of filter learning is to minimize this loss function.

3. The sparse compact correlation filter-based fast robust target tracking method according to claim 2, wherein in step 2), in multi-task learning, an exclusive sparse regularization term and a group sparse regularization term are integrated to construct the intra-group-inter-group sparse regularization term of equation (2),

where w_t^{mn} denotes the vector of W_t at position (m, n);

group sparsity first takes the l₂ norm over the channels and then the l₁ norm over space, and is used to remove spatially redundant features so that the filter is sparse in space; exclusive sparsity first takes the l₁ norm over the channels and then the l₂ norm over space, and is used to remove features that are redundant across channels so that the filter is sparse across channels.

4. A fast robust target tracking method based on sparse compact correlation filter as claimed in claim 3 wherein the weight parameter θ is 0.2.

5. The sparse compact correlation filter-based fast robust target tracking method according to claim 3, wherein in step 2), in target tracking, the DCF-based tracker introduces a temporal consistency constraint into filter learning to alleviate the degradation of the DCF over time, and the temporal regularization term introduced is given by equation (3),

where W_{t-1} denotes the filter of the (t-1)-th frame.

6. The sparse compact correlation filter-based fast robust target tracking method according to claim 5, wherein the regularization parameters are λ = 1.0 × 10^-4 and μ = 5.

7. The sparse compact correlation filter-based fast robust target tracking method according to claim 5, wherein in step 3), the specific method for performing channel pruning based on the regression loss function defined in step 2) is as follows:

ΔL = L(X_t, Y; W′_t) − L(X_t, Y; W_t), (5)

where W_t and W′_t denote the filter before and after pruning, respectively; for a filter with D channels, channel pruning by this criterion requires 2^D evaluations of the loss function;

let D_t = {X_t, Y}, and let z_t denote the vectorized form of the response map Z_t; then:

|ΔL(z_t)| = |L(D_t, z_t = 0) − L(D_t, z_t)|, (6)

where L(D_t, z_t = 0) is the loss after the current response map vector z_t has been pruned and L(D_t, z_t) is the loss without pruning; the first-order Taylor expansion of L(D_t, z_t = 0) at the point z_t is:

L(D_t, z_t = 0) = L(D_t, z_t) − (∂L/∂z_t) z_t + R₁(z_t = 0), (7)

where R₁(z_t = 0) is the first-order remainder of the Taylor expansion; substituting equation (7) into equation (6) and discarding R₁(z_t = 0) gives:

|ΔL(z_t)| = |(∂L/∂z_t) z_t|; (8)

thus, for each channel filter W_t^d, its importance Θ(z_t^d) is calculated as:

Θ(z_t^d) = |Σ_m Σ_n (∂L/∂z_t^{mnd}) z_t^{mnd}|; (9)

the D channel filters are sorted by importance, and only the C top-ranked channel filters are selected for tracking.

8. The sparse compact correlation filter-based fast robust target tracking method according to claim 7, wherein C = 64.

9. The fast robust target tracking method based on the sparse compact correlation filter according to claim 7, wherein in step 4), the specific method for constructing the Lagrange function and optimizing the regression loss with the ADMM algorithm is as follows:

to minimize the regression loss of equation (4) in step 2), the ADMM algorithm is adopted; since the sparse compact filter is compressed from D channels to C channels in the initial frame, an auxiliary variable U_t = W_t is introduced and the corresponding Lagrange function is constructed;

in the subproblem for the filter, ^ denotes the discrete Fourier transform and ⊙ denotes the element-wise product, and the vector ŵ_t^{mn} at each position (m, n) of Ŵ_t is solved for separately, where the superscript mn denotes the element at position (m, n) of the corresponding tensor; after Ŵ_t has been computed, the inverse Fourier transform is applied to it to obtain W_t;

each element u_t^{mnd} of the auxiliary variable U_t is solved for separately, where (·)₊ is the contraction operator, defined as (x)₊ = max(0, x).

10. The sparse compact correlation filter-based fast robust target tracking method according to claim 9, wherein the parameters are γ^min = 0.002, γ⁰ = 0.01 and ρ = 0.2, and ADMM runs for 2 iterations.