CN108986139A

CN108986139A - Feature integration method with importance map for target tracking

Info

Publication number: CN108986139A
Application number: CN201810619533.6A
Authority: CN
Inventors: 杨明; 李爱师
Original assignee: Nanjing Normal University
Current assignee: Nanjing Normal University
Priority date: 2018-06-12
Filing date: 2018-06-12
Publication date: 2018-12-11
Anticipated expiration: 2038-06-12
Also published as: CN108986139B

Abstract

The invention discloses a feature integration method with an importance map for single-target tracking. The method comprises the following steps: centered on the target position in the previous frame, extracting the candidate-region features of the current frame, including the gradient orientation histogram and the gray feature; predicting the target position with the correlation filtering model; extracting the training samples of the current frame centered on the current-frame target position; training the correlation filtering model and the importance maps of the features for locating the target in the next frame; and repeating this cycle until the image sequence ends. By using the importance maps, the method fully exploits the complementarity of different hand-crafted features and can train the model and detect the target in real time. The gradient orientation histogram focuses on gradients and is robust to illumination, whereas the gray feature focuses on homogeneous regions and is robust to deformation. With the help of the importance maps, the proposed method takes both features into account and improves the robustness of correlation filtering to deformation and illumination, and therefore has high practical value.

Description

Feature integration method with importance map for target tracking

Technical Field

The invention belongs to the field of single-target tracking, and in particular provides a feature integration method with an importance map for target tracking.

Background

Single-target tracking is an important task in computer vision and has numerous real-life applications, such as video surveillance and service robots. Its difficulty is that prior information about the appearance and shape of the target is lacking, yet challenging problems such as deformation, occlusion and illumination change must be handled while the tracking speed is maintained. Because of these challenges, the appearance and shape of the target change over the image sequence, so most trackers update the model online rather than learning a fixed tracker. In addition, the lack of training samples makes learning a fixed tracker difficult.

The detection model and the feature representation are the two elements of greatest interest in a tracking model. At present, correlation filtering is a widely used single-target tracking method; it performs dense sampling through circular shifts, which approximate the translations of a sample in the real image. Furthermore, the target is represented by a Gaussian response, which avoids the ambiguity of a binary classification boundary. Thanks to the convolution theorem, a correlation filter can be trained and applied quickly in the frequency domain while achieving good accuracy and robustness. Approximating real translations by circular shifts causes a boundary effect, for which there are currently two main remedies: one obtains more accurate samples through cropping, and the other imposes a two-norm penalty on the filter whose strength increases with the distance from the center. The proposed method uses the cropping approach to resolve the boundary effect.
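As an illustration of why the frequency domain makes training and detection fast, the following Python sketch trains a single-channel ridge-regression correlation filter and locates a circularly shifted copy of the training patch. It is a generic textbook formulation, not the model of the invention; the function names, the Gaussian width `sigma` and the regularization weight `lam` are assumptions chosen for the example.

```python
import numpy as np

def gaussian_response(shape, sigma=2.0):
    """2-D Gaussian label whose peak sits at index (0, 0) and wraps circularly."""
    m, n = shape
    rows = np.minimum(np.arange(m), m - np.arange(m))  # circular distance to row 0
    cols = np.minimum(np.arange(n), n - np.arange(n))  # circular distance to column 0
    return np.exp(-(rows[:, None] ** 2 + cols[None, :] ** 2) / (2.0 * sigma ** 2))

def train_filter(x, y, lam=1e-2):
    """Ridge regression over all circular shifts of x, solved independently per frequency."""
    x_hat = np.fft.fft2(x)
    y_hat = np.fft.fft2(y)
    return np.conj(x_hat) * y_hat / (np.conj(x_hat) * x_hat + lam)

def detect(h_hat, z):
    """Apply the filter to a new patch z and return the position of the response peak."""
    response = np.real(np.fft.ifft2(h_hat * np.fft.fft2(z)))
    return np.unravel_index(np.argmax(response), response.shape)

# Usage: train on a random patch, then detect a circularly shifted copy of it.
rng = np.random.default_rng(0)
patch = rng.standard_normal((32, 32))
h_hat = train_filter(patch, gaussian_response(patch.shape))
peak = detect(h_hat, np.roll(patch, (3, 5), axis=(0, 1)))
```

Because every circular shift of the patch is handled by one element-wise division, no explicit set of shifted samples is ever built; the response peak moves by the same amount as the patch.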

The feature representation has a crucial influence on the performance of the model; it mainly relies on hand-crafted features or deep features. Hand-crafted features currently consist mainly of gradient orientation histograms and color features, but in most methods they are simply concatenated, so their respective advantages are not fully exploited. Deep features obtain excellent robustness and accuracy at the cost of tracking speed. The proposed method adopts the gradient orientation histogram and the gray feature and, through the importance maps, exploits the robustness of the gradient orientation histogram to illumination and the robustness of the gray feature to deformation.

Disclosure of Invention

The purpose of the invention is as follows: the invention provides a feature integration method with an importance map, aiming to solve the single-target tracking problem and to effectively improve robustness to deformation and illumination.

Summary of the invention: a feature integration method with an importance map for target tracking, specifically comprising the following steps:

step 1, taking the predicted position of the previous frame as the center, and extracting the gradient orientation histogram and gray features of a candidate region;

step 2, using the filter to predict the target position of the current frame from the maximum response value;

step 3, taking the predicted target position of the current frame as the center, and extracting a training sample;

step 4, learning with the correlation filtering model to obtain a filter and an importance map, which are used to predict the target position of the next frame.
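The four steps form a per-frame loop. The following Python sketch illustrates only the control flow; the callables `extract`, `detect` and `train`, and the choice to train an initial model on the first frame, are assumptions made for illustration and are not specified in the text.

```python
def track(frames, init_pos, extract, detect, train):
    """Run the detect-then-train cycle over a sequence of frames.

    extract(frame, pos) -> features          (hypothetical feature extractor)
    detect(model, features, pos) -> new pos  (hypothetical detector)
    train(features, model) -> updated model  (model is None on the first call)
    """
    pos = init_pos
    # Assumed initialization: train on the first frame at the given position.
    model = train(extract(frames[0], pos), None)
    positions = [pos]
    for frame in frames[1:]:
        candidate = extract(frame, pos)      # step 1: features around the previous position
        pos = detect(model, candidate, pos)  # step 2: position of the maximum response
        sample = extract(frame, pos)         # step 3: training sample at the new position
        model = train(sample, model)         # step 4: learn filter and importance map
        positions.append(pos)
    return positions
```

Passing the three stages in as functions keeps the loop independent of how features, detection and training are actually implemented.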

As a further preferred embodiment of the feature integration method with an importance map for target tracking according to the present invention, in step 1, centered on the position $p_{t-1}$ predicted in the previous frame, a candidate region feature $x_t \in \mathbb{R}^{DMN}$ is extracted and processed with a Hann window; D is the number of feature channels (the feature dimension), MN is the number of elements in each channel, and $x_t$ consists of K features, i.e. $x_t = [x^1, x^2, \ldots, x^K]$, where K = 2.
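A minimal Python sketch of this step is given below. It assumes that the gradient orientation histogram and the gray feature have already been computed by some external routine and resampled to the same M x N grid; the function name and the channel counts used in the usage lines are hypothetical.

```python
import numpy as np

def build_candidate_features(hog, gray):
    """Stack K = 2 feature maps along the channel axis and apply a 2-D Hann window.

    hog:  (D1, M, N) array; gray: (D2, M, N) array with the same M and N (assumed).
    Returns an array of shape (D1 + D2, M, N).
    """
    x = np.concatenate([hog, gray], axis=0)
    m, n = x.shape[-2:]
    window = np.outer(np.hanning(m), np.hanning(n))  # separable 2-D Hann window
    return x * window                                # broadcasts over the channel axis

# Usage with placeholder data: 31 histogram channels and 1 gray channel.
features = build_candidate_features(np.ones((31, 8, 8)), np.ones((1, 8, 8)))
```

The window tapers every channel to zero at the border, which reduces the discontinuities introduced when the patch is treated as periodic.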

As a further preferred embodiment of the feature integration method with an importance map for target tracking according to the present invention, predicting the target position of the current frame specifically comprises the following steps:

step 2.1, applying the two-dimensional discrete Fourier transform to each channel of the feature to obtain the candidate features in the frequency domain;

step 2.2, obtaining the response with the correlation filter, and predicting the position of the target from the maximum response value;

wherein the filter with the importance map is the one obtained by training on the previous frame; the frequency-domain feature term is equivalent to concatenating the diagonalized channels of each feature; and $F^{-1}$ is the inverse two-dimensional discrete Fourier transform.
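Steps 2.1 and 2.2 can be sketched in Python as follows. The response formula itself is not reproduced in the text, so the sketch assumes the common form in which the per-channel products of frequency-domain features and filter are summed and transformed back; the function name, the conjugation convention and the conversion of wrapped peaks into signed displacements are assumptions.

```python
import numpy as np

def predict_shift(x, g_hat):
    """x: (D, M, N) windowed features; g_hat: (D, M, N) frequency-domain filter."""
    x_hat = np.fft.fft2(x, axes=(-2, -1))                            # step 2.1: per-channel DFT
    response = np.real(np.fft.ifft2(np.sum(x_hat * g_hat, axis=0)))  # step 2.2: response map
    peak = np.unravel_index(np.argmax(response), response.shape)
    m, n = response.shape
    # Peaks past the midpoint correspond to negative (wrapped) displacements.
    dy = peak[0] - m if peak[0] > m // 2 else peak[0]
    dx = peak[1] - n if peak[1] > n // 2 else peak[1]
    return int(dy), int(dx), response
```

The returned displacement is added to the previous position to obtain the predicted position of the current frame.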

As a further preferred embodiment of the feature integration method with an importance map for target tracking according to the present invention, in step 3, centered on the position $p_t$ predicted in the current frame, the training samples are extracted; as in step 1, the training samples are processed with a Hann window.

As a further preferred embodiment of the feature integration method with an importance map for target tracking according to the present invention, learning with the correlation filtering model to obtain the filter and the importance map specifically comprises the following steps:

step 4.1, establishing a model: similar to ridge regression, the objective function of correlation filtering in the real domain takes a vectorized two-dimensional Gaussian response $y \in \mathbb{R}^{MN}$ as the regression target; the true translation of the target in the image is approximated by circular shifts, and the goal is to minimize the least-squares error with regularization terms

wherein: $\otimes$ denotes circular correlation; $x_{k,l}$ is the $l$-th channel of the $k$-th feature; $h \in \mathbb{R}^{DMN}$ is the filter; $h_{k,l}$ is the filter corresponding to $x_{k,l}$; $m_k$ is the importance map of the $k$-th feature; $\mathrm{diag}(m_k) \in \mathbb{R}^{MN \times MN}$ is the diagonalization of $m_k$; $\alpha_k$ is the two-norm regularization coefficient of the $k$-th importance map; and $\lambda$ is the two-norm regularization coefficient of the filter;

converting the above formula into the frequency domain using the convolution theorem and Parseval's theorem, and diagonalizing;

wherein the frequency-domain feature term is obtained by diagonalization, the filter term is the filter with the importance map, and $F$ is the unitary discrete Fourier transform matrix;

step 4.2, solving the model: the problem is transformed with the augmented Lagrangian method, and ADMM is then used to solve the following subproblems;

wherein $\xi_{k,l} \in \mathbb{R}^{MN}$ is the Lagrange multiplier of $g_{k,l}$, and $\mu$ is the coefficient of the two-norm penalty term;

solving the subproblem for the filter: as in ridge regression, this subproblem has a closed-form solution; the Sherman-Morrison formula is used to avoid an explicit matrix inversion, so the solution is

wherein $t = [1, 2, \ldots, MN]$ denotes the position,

conj represents the complex conjugate;
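The role of the Sherman-Morrison formula can be illustrated with a short Python sketch. The exact linear system of the filter subproblem is not reproduced in the text, so the sketch assumes the generic form that arises at each position t in multi-channel correlation filters, a scaled identity plus a rank-one term built from the feature vector across channels; the function name and the test data are hypothetical.

```python
import numpy as np

def solve_rank_one(x, b, mu):
    """Solve (mu * I + x x^H) g = b without forming or inverting the matrix.

    Sherman-Morrison: (mu*I + x x^H)^{-1} = (I - x x^H / (mu + x^H x)) / mu.
    """
    xhb = np.vdot(x, b)        # x^H b (np.vdot conjugates its first argument)
    xhx = np.vdot(x, x).real   # x^H x, a non-negative real number
    return (b - x * xhb / (mu + xhx)) / mu

# Usage: compare against a dense solve for one position with D = 5 channels.
rng = np.random.default_rng(2)
x = rng.standard_normal(5) + 1j * rng.standard_normal(5)
b = rng.standard_normal(5) + 1j * rng.standard_normal(5)
g_fast = solve_rank_one(x, b, mu=0.5)
g_dense = np.linalg.solve(0.5 * np.eye(5) + np.outer(x, np.conj(x)), b)
```

The closed form costs a few inner products per position instead of a D x D solve, which matters because it is repeated for all MN positions in every ADMM iteration.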

solving the subproblem for the importance map m: the objective function with respect to m is converted to the real domain; the $m_k \in \mathbb{R}^{MN}$ are independent of one another and can therefore be solved separately

solving the subproblem for the intermediate variable h: the objective function with respect to h is converted to the real domain; the $h_{k,l} \in \mathbb{R}^{MN}$ are independent of one another and are solved separately

updating the Lagrange multiplier

updating $\mu$: wherein $\mu_{\max}$ is the maximum value of $\mu$ and $\beta$ is the update rate;
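The two update formulas are not reproduced in the text. The Python sketch below shows the forms commonly used in ADMM-based correlation filters, which are consistent with the description of $\mu_{\max}$ as a maximum and $\beta$ as an update rate; both update rules, the sign convention and the function name are assumptions rather than the formulas of the invention.

```python
import numpy as np

def admm_dual_step(xi, g, h, mu, beta, mu_max):
    """One multiplier update and one penalty update of a generic ADMM iteration.

    xi: Lagrange multiplier; g and h: the two coupled variables, expressed in
    the same domain and with the same shape (assumed).
    """
    xi = xi + mu * (g - h)       # assumed form: dual ascent on the constraint g = h
    mu = min(mu_max, beta * mu)  # assumed form: grow the penalty, capped at mu_max
    return xi, mu

# Usage: two consecutive updates with beta = 10 and mu_max = 50.
xi, mu = admm_dual_step(np.zeros(2), np.array([1.0, 2.0]), np.array([1.0, 1.0]), 1.0, 10.0, 50.0)
xi2, mu2 = admm_dual_step(xi, np.array([1.0, 1.0]), np.array([1.0, 1.0]), mu, 10.0, 50.0)
```

Increasing the penalty geometrically tightens the coupling between the two variables as the iterations proceed, while the cap keeps the subproblems well conditioned.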

it is noted that m is initialized as a zero matrix and the initial solution of h comes from KCF; following the way BACF handles the boundary effect, h is cropped at the center and all remaining entries are set to 0; the model is trained after the target feature model has been updated, wherein the target feature model is updated online, t represents the index of the frame, $\eta$ is the learning rate, and the frequency-domain feature term is obtained by diagonalizing the updated target feature model;

step 4.3, step 4.2 is repeated until the maximum number of iterations is reached, which yields the filter and the importance map. The filter is used for target detection in the next frame.
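The cropping of h and the online update of the target feature model can be sketched in Python as follows. The update formula is not reproduced in the text; the sketch assumes the linear interpolation with learning rate η that is standard in correlation filter trackers, and the function names and the target size are hypothetical.

```python
import numpy as np

def crop_center(h, target_shape):
    """Keep the central target_shape block of every (M, N) filter channel; zero the rest."""
    m, n = h.shape[-2:]
    tm, tn = target_shape
    top, left = (m - tm) // 2, (n - tn) // 2
    mask = np.zeros((m, n))
    mask[top:top + tm, left:left + tn] = 1.0
    return h * mask  # broadcasts over any leading channel axes

def update_model(model, new_features, eta):
    """Assumed update rule: linear interpolation between the running model and new features."""
    return (1.0 - eta) * model + eta * new_features

# Usage with placeholder data.
cropped = crop_center(np.ones((2, 8, 8)), (4, 4))
blended = update_model(np.zeros(3), np.ones(3), eta=0.1)
```

Restricting the filter support to the central block means that only the target-sized part of the filter is learned, so the wrapped-around borders of the circularly shifted samples carry no weight.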

The invention provides a feature integration method for correlation filtering. Compared with the prior art, it has the following characteristics:

First, a correlation filtering framework is adopted to ensure tracking speed and accuracy, and cropping is used to overcome the boundary effect. Then, the importance maps and the features are integrated into the correlation filtering framework so that the importance maps and the filter are learned jointly. The method thus maintains speed while taking both features into account, which improves the robustness of the tracker, so it has high practical value.

Drawings

FIG. 1 is an overall flow chart of the present invention;

FIG. 2 is a flow chart of jointly learning the filter and the importance map according to the present invention.

Detailed Description

The following describes embodiments of the present invention with reference to the drawings.

As shown in fig. 1, the present invention discloses a feature integration method with an importance map for target tracking, which comprises the following specific steps:

step 1, taking the predicted position of the previous frame as the center, and extracting the gradient orientation histogram and gray features of a candidate region;

step 2, using the filter to predict the target position of the current frame from the maximum response value;

step 3, taking the predicted position of the current frame as the center, and extracting a training sample;

step 4, jointly learning the filter and the importance map, which are used to predict the target position of the next frame.

It should be noted that the core step of the present invention is the joint learning of the filter and the importance map, so the description of the specific embodiment focuses mainly on step 4; steps 1, 2 and 3 can be implemented using the prior art.

As shown in fig. 2, the specific steps of jointly learning the filter and the importance map are as follows:

step 4.1, establishing a model. Similar to ridge regression, the objective function of correlation filtering in the real domain takes a vectorized two-dimensional Gaussian response $y \in \mathbb{R}^{MN}$ as the regression target; the true translation of the target in the image is approximated by circular shifts, and the goal is to minimize the least-squares error with regularization terms

step 4.2, solving the model. The problem is transformed with the augmented Lagrangian method, and ADMM is then used to solve the following subproblems;

solving the subproblem for the filter: as in ridge regression, this subproblem has a closed-form solution; the Sherman-Morrison formula is used to avoid an explicit matrix inversion, so the solution is

conj represents the complex conjugate;

updating the Lagrange multiplier

updating $\mu$: wherein $\mu_{\max}$ is the maximum value of $\mu$ and $\beta$ is the update rate;

it is noted that m is initialized as a zero matrix and the initial solution of h comes from KCF; following the way BACF handles the boundary effect, h is cropped at the center and all remaining entries are set to 0; the model is trained after the target feature model has been updated, wherein the target feature model is updated online, t represents the index of the frame, $\eta$ is the learning rate, and the frequency-domain feature term is obtained by diagonalizing the updated target feature model.

step 4.3, step 4.2 is repeated until the maximum number of iterations is reached, which yields the filter and the importance map. The filter is used for target tracking in the next frame.

The above is a detailed description of the feature integration method with an importance map for target tracking according to the present invention. It should be noted that there are many ways to implement the technical solution; the above description is only a preferred embodiment of the present invention and serves only to help in understanding its method and core idea. Modifications and adjustments made by a person skilled in the art on the basis of the core idea of the present invention shall fall within its scope of protection. In view of the foregoing, the present disclosure is not to be considered limiting; the scope of the invention is limited only by the appended claims.

Claims

1. A feature integration method with an importance map for target tracking is characterized by comprising the following steps:

2. The method of claim 1, wherein in step 1, centered on the position $p_{t-1}$ predicted in the previous frame, a candidate region feature $x_t \in \mathbb{R}^{DMN}$ is extracted and processed with a Hann window; D is the number of feature channels (the feature dimension), MN is the number of elements in each channel, and $x_t$ consists of K features, i.e. $x_t = [x^1, x^2, \ldots, x^K]$, where K = 2.

3. The method of claim 2, wherein predicting the target position of the current frame specifically comprises the following steps:

wherein the filter with the importance map is the one obtained by training on the previous frame; the frequency-domain feature term is equivalent to concatenating the diagonalized channels of each feature; and $F^{-1}$ is the inverse two-dimensional discrete Fourier transform.

4. The feature integration method with an importance map for target tracking of claim 3, wherein,

in step 3, centered on the position $p_t$ predicted in the current frame, the training samples are extracted; as in step 1, the training samples are processed with a Hann window.

5. The method of claim 4, wherein learning with the correlation filtering model to obtain the filter and the importance map specifically comprises the following steps:

step 4.1, establishing a model: similar to ridge regression, the objective function of correlation filtering in the real domain takes a vectorized two-dimensional Gaussian response $y \in \mathbb{R}^{MN}$ as the regression target; the true translation of the target in the image is approximated by circular shifts, and the goal is to minimize the least-squares error with regularization terms

wherein: $\otimes$ denotes circular correlation; $x_{k,l}$ is the $l$-th channel of the $k$-th feature; $h \in \mathbb{R}^{DMN}$ is the filter; $h_{k,l}$ is the filter corresponding to $x_{k,l}$; $m_k$ is the importance map of the $k$-th feature; $\mathrm{diag}(m_k) \in \mathbb{R}^{MN \times MN}$ is the diagonalization of $m_k$; $\alpha_k$ is the two-norm regularization coefficient of the $k$-th importance map; and $\lambda$ is the two-norm regularization coefficient of the filter;

wherein $\xi_{k,l} \in \mathbb{R}^{MN}$ is the Lagrange multiplier of $g_{k,l}$, and $\mu$ is the coefficient of the two-norm penalty term;

wherein $t = [1, 2, \ldots, MN]$ denotes the position,

conj represents the complex conjugate;

solving the subproblem for the intermediate variable h: the objective function with respect to h is converted to the real domain; the $h_{k,l} \in \mathbb{R}^{MN}$ are independent of one another and are solved separately

updating the Lagrange multiplier

updating $\mu$: wherein $\mu_{\max}$ is the maximum value of $\mu$ and $\beta$ is the update rate;

it is noted that m is initialized as a zero matrix and the initial solution of h comes from KCF; following the way BACF handles the boundary effect, h is cropped at the center and all remaining entries are set to 0; the model is trained after the target feature model has been updated, wherein the target feature model is updated online, t represents the index of the frame, $\eta$ is the learning rate, and the frequency-domain feature term is obtained by diagonalizing the updated target feature model;