CN114897938A - Improved cosine window correlation filtering target tracking method


Info

Publication number: CN114897938A
Authority: CN (China)
Prior art keywords: target, cosine window, image, scale, filter
Legal status: Pending
Application number: CN202210575210.8A
Other languages: Chinese (zh)
Inventors: Sun Jiawei (孙家伟), Deng Lizhen (邓丽珍), Zhu Hu (朱虎)
Current Assignee: Nanjing University of Posts and Telecommunications
Original Assignee: Nanjing University of Posts and Telecommunications
Priority date: 2022-05-25
Filing date: 2022-05-25
Publication date: 2022-08-12
Application filed by Nanjing University of Posts and Telecommunications
Priority to CN202210575210.8A


Classifications

    • G: Physics
    • G06: Computing; Calculating or Counting
    • G06T: Image Data Processing or Generation, in General
    • G06T 7/00: Image analysis
    • G06T 7/20: Analysis of motion
    • G06T 7/246: Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G06T 7/248: Analysis of motion using feature-based methods involving reference images or patches
    • G06T 7/262: Analysis of motion using transform domain methods, e.g. Fourier domain methods
    • G06T 2207/00: Indexing scheme for image analysis or image enhancement
    • G06T 2207/10: Image acquisition modality
    • G06T 2207/10016: Video; Image sequence
    • G06T 2207/10024: Color image
    • G06T 2207/20: Special algorithmic details
    • G06T 2207/20081: Training; Learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

A correlation filtering target tracking method with an improved cosine window. An improved, scale-adaptive cosine window is proposed: the cosine window is peak-clipped according to the initial size of the target and, to account for scale changes of the target, the peak-clipping position is dynamically adjusted by a scale change factor, which improves the tracking success rate of the tracking algorithm in scenes where the target scale changes. The ADTrack algorithm is also improved: self-constraints on, and a mutual constraint between, the trained target filter and background filter are introduced to further optimize the objective function, mitigating the tracking drift and rapid template degradation of existing trackers, and an ADMM optimization derivation is given for the new objective function.

Description

Improved cosine window correlation filtering target tracking method
Technical Field
The invention belongs to the technical field of target tracking, and particularly relates to a correlation filtering target tracking method with an improved cosine window.
Background
Image recognition and target tracking are key tasks in computer vision. Their core content is the continuous inference of a target's state in a video sequence: locating the tracked target in every frame to obtain the corresponding motion trajectory, and providing the target's active region for each frame of the image. Target tracking technology is now widely applied in both military and civilian fields and brings great convenience to people's lives, and as living standards improve, the requirements placed on target tracking grow ever higher. Although the related technology is refreshed every year, the algorithms still need performance optimization, and researchers still face many challenges.
In recent years, many researchers have proposed methods to build stable and accurate target trackers for the various problems of target tracking. In 2017, Galoogahi et al. proposed the BACF algorithm, whose key "cropping" idea improves sample quality: the original sampling region is enlarged to capture more background information, and the circularly shifted samples are then center-cropped, which both increases the number of samples and improves their quality. The ADTrack algorithm proposed by Li et al. in 2021 further improves BACF by extracting target information with a 0-1 mask and then training on the target and the target background information separately.
However, ADTrack still has shortcomings. First, it adopts the ordinary cosine window, a model that is zero everywhere around a central extremum; this ignores the actual size of the target, and preprocessing samples with it directly adds texture to the target and pollutes the target source information. In addition, the constraint terms of the ADTrack objective are not ideal: the filter is prone to overfitting, and the filter model degrades rapidly.
Disclosure of Invention
The invention relates to a correlation filtering target tracking method with an improved cosine window, in which filtering training is carried out separately on the target and on the background information. Target tracking currently faces many problems, such as target scale change and target occlusion. Aiming at target scale change, the invention constructs a scale-adaptive cosine window: the cosine window is peak-clipped and updated in time, two filters are used, one for the target and one for the target background, and their self-constraints and mutual constraint are combined to improve the correlation filtering target tracking model and thereby the tracking performance.
The technical scheme adopted by the invention is as follows:
A correlation filtering target tracking method with an improved cosine window comprises the following steps:
Step 1, preprocessing stage: judge whether the current frame is a dark scene; if so, enhance the video sequence; and compute a mask m for the subsequent filter training;
Step 2, feature extraction: obtain a feature set x_g from the training samples, adapt to target scale updates through the peak-clipped cosine window, and obtain the target feature set x_o from the mask m, yielding a scale-adaptive cosine window model;
Step 3, training stage: use the feature sets x_g and x_o obtained in Step 2 to train the two filters h_g and h_o of the next frame, where h_g is the filter trained on background information and h_o is the filter trained on target information;
Step 4, detection stage: obtain a target response map from the trained filters and the extracted sample channel features, and determine the position information of the target from the maximum of the target response map.
Further, Step 1 comprises the following sub-steps:
Step 1-1: given a color image I ∈ R^(W×H×3), perform photometric fusion of the RGB channels of the image:
V(x, y, I) = α_R Ψ_R(I(x, y)) + α_G Ψ_G(I(x, y)) + α_B Ψ_B(I(x, y))
where Ψ_m(I(x, y)) denotes the pixel value of channel m of the image at (x, y), and α_R + α_G + α_B = 1, converting the color image into a single-channel image; then carry out logarithmic averaging of the brightness:
V̄(I) = exp( (1/N) Σ_(x,y) log(δ + V(x, y, I)) )
where δ is a small constant that prevents the occurrence of log(0), N is the number of pixels, and V̄(I) denotes the logarithmic mean scene brightness of the current image; whether the current image is a dark scene is judged by introducing a threshold τ: if V̄(I) < τ, the image is a dark scene, denoted S(I);
Step 1-2: enhance the image using the previously acquired image brightness V(x, y, I) and image logarithmic mean brightness V̄(I). The global enhancement matrix of the image is computed from V(x, y, I), V̄(I), and V_max(I) (the defining equation is given only as an image in the original), where V_max(I) denotes the maximum value of the image brightness V(x, y, I); the three channels of the image are then enhanced with it, giving the enhanced image I_e, whose pixel value at the m-channel (x, y) location is Ψ_m(I_e(x, y)); the enhanced-part information of the image is obtained from the enhanced image:
E(I) = V(I) − V(I_e)
Step 1-3: obtain the mean μ and standard deviation σ of E(I), and threshold E(I) with them to obtain the global mask m_g (the thresholding equation is given only as an image in the original); cropping m_g with the clipping matrix P yields the desired mask m = m_g ⊙ P, P ∈ R^(w×h), used to extract target size information within the sample.
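As a concrete illustration of Step 1, the following Python sketch implements the dark-scene test and the global-mask thresholding described above; the equal channel weights, the default values of τ and δ, and the mean-plus-one-standard-deviation rule for m_g are all assumptions for illustration, not values taken from the patent:

```python
import numpy as np

def is_dark_scene(image, tau=0.25, delta=1e-6):
    """Dark-scene test of Step 1-1 for an RGB image with values in [0, 1].

    Uses equal channel weights (alpha_R = alpha_G = alpha_B = 1/3) and the
    logarithmic mean scene brightness; `tau` and `delta` are assumed values.
    """
    V = image.mean(axis=2)                       # V(x, y, I), single channel
    V_bar = np.exp(np.mean(np.log(delta + V)))   # log-average brightness
    return V_bar < tau, V, V_bar

def global_mask(E):
    """Global mask m_g of Step 1-3 from the enhanced-part information E(I).

    The exact thresholding rule appears only as an equation image in the
    original; marking pixels more than one standard deviation above the
    mean is an assumed, plausible reading.
    """
    mu, sigma = E.mean(), E.std()
    return (E > mu + sigma).astype(np.float32)
```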
Further, Step 2 comprises the following sub-steps:
step 2-1, a large number of training samples are obtained through cosine window preprocessing, a cyclic matrix and center clipping, cosine window processing is that a cosine window function is directly point-multiplied on the samples, and the operation of cyclic matrix and center clipping is shown in an attached figure 2. Extracting the characteristics of the obtained sample, including gray information, color information, gradient information and the like, to obtain a characteristic set x g
Step 2-2: peak-clip the cosine window using the initial size of the target, the peak-clipping position being a parameter Q ∈ (0, 1) obtained from the target size relative to the cosine window size (the two defining equations appear only as images in the original), where cosWin_0 is the original cosine window, W×H is the cosine window size, and w×h, with w ≤ W and h ≤ H, is the target size;
Step 2-3: after each update of the tracked target's scale, obtain the scale update factor S_scale and update the peak-clipping position of the cosine window as Q_scale = Q × S_scale, adapting the model scale; then use the mask m to obtain x_o = m ⊙ x_g, representing the pure target features, which yields the two feature sets x_g and x_o; the concrete scale-adaptive cosine window model is a function of the original cosine window cosWin_0, the peak-clipping position Q_scale, and the scale factor S_scale (its equation is given only as an image in the original).
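To make the peak clipping concrete, here is a minimal Python sketch of a scale-adaptive cosine window, assuming that "peak clipping at position Q_scale" means flattening the 2-D Hann window at the level it reaches at the edge of the central fraction Q_scale of its support; this reading, and the renormalization of the plateau to 1, are assumptions, since the defining equation appears only as an image in the original:

```python
import numpy as np

def clipped_cosine_window(W, H, Q, S_scale=1.0):
    """Peak-clipped cosine window of Step 2 (a sketch, not the patent's
    verbatim formula).

    W, H    : cosine window size
    Q       : initial peak-clipping position in (0, 1), derived from the
              ratio of the target size to the window size
    S_scale : scale update factor; Q_scale = Q * S_scale widens or narrows
              the flat plateau as the target scale changes
    """
    Q_scale = float(np.clip(Q * S_scale, 1e-3, 0.999))
    cos_win0 = np.outer(np.hanning(H), np.hanning(W))   # original window
    # Value of the 1-D Hann window at the edge of the central fraction
    # Q_scale of its support: everything above it is clipped flat.
    level = 0.5 * (1.0 + np.cos(np.pi * Q_scale))
    return np.minimum(cos_win0, level) / level          # plateau equals 1
```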
Further, Step 3 comprises the following sub-steps:
Step 3-1: the objective function of the filters (given only as an equation image in the original) combines, for k ∈ {g, o}, least-squares data terms with regularization, self-constraint, and mutual-constraint terms, where P is the sample clipping matrix; h_k^c denotes the target-information or background-information filter of the c-th channel; cosWin is the cosine window function proposed above, which varies with the target scale factor and is used to preprocess the training sample data; y is the ideal Gaussian model; h_t and h_(t−1) denote the filters of the current frame and the previous frame; the M matrix represents the relation between the two filters; λ is the constraint parameter of the filter regularization term; and μ is the constraint parameter of the mutual constraint term between the two filters;
Step 3-2: for the whole objective function, since k ∈ {g, o} and the last two terms constrain each other, the objective can be regarded as the sum of 7 parts; parts 1, 2, 4, and 5 form a conventional linear model augmented with the clipping matrix, i.e. least squares plus a regularization term whose purpose is to prevent the filter from overfitting; parts 3 and 6 are the self-constraints of the two filters, which effectively prevent rapid degradation of the filters; part 7 is the mutual constraint of the two filters, which binds the two filters to each other during training and gives them stronger discriminative power;
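For concreteness, the seven-part structure just described can be written in one formula, sketched below in LaTeX. This is a reconstruction from the verbal description, not the patent's verbatim equation; in particular, reusing λ as the weight of the self-constraint terms is an assumption.

```latex
E(h_g, h_o, M) = \sum_{k \in \{g,o\}} \Big[
    \tfrac{1}{2} \big\| y - \sum_{c=1}^{C} (\mathrm{cosWin} \odot x_k^c)
        \star (P^{\top} h_{k,t}^c) \big\|_2^2          % parts 1, 2: least squares
  + \tfrac{\lambda}{2} \| h_{k,t} \|_2^2               % parts 4, 5: regularization
  + \tfrac{\lambda}{2} \| h_{k,t} - h_{k,t-1} \|_2^2   % parts 3, 6: self-constraint
\Big]
+ \tfrac{\mu}{2} \| h_{g,t} - M h_{o,t} \|_2^2         % part 7: mutual constraint
```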
Step 3-3: solve by the ADMM iterative algorithm; since cosWin is only a preprocessing of the samples, it can be ignored during iteration. With h_o and M known, perform the ADMM iterative optimal solution for h_g: using the augmented Lagrangian method, introduce a relaxation variable ξ (defined by an equation that appears only as an image in the original), where P^T is the transpose of the clipping matrix P and I_N is the N×N identity matrix. The augmented Lagrangian form of the objective function (shown only as an equation image in the original) involves the Lagrange vector and a penalty factor γ; with the ADMM method it is converted iteratively into three subproblems, for h_g, for ξ*, and for the Lagrange multiplier update.
For the h_g subproblem, the closed-form solution of h_g is found from the first derivative (the solution appears only as an equation image in the original). The ξ* subproblem needs to be converted into the frequency domain for further calculation; it is decomposed into T subproblems, where T = 42 represents the dimension of the features. Setting up each subproblem and taking its derivative yields a linear system; the inverse matrix in that system is solved efficiently with the Sherman-Morrison equation, giving a closed-form update whose normalizing denominator is a scalar (the intermediate equations appear only as images in the original);
The iteration process of h_o is the same as that of h_g, and the M matrix has its own iterative update (given only as an equation image in the original).
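The Sherman-Morrison step can be checked in isolation: the identity (γI + x xᴴ)⁻¹ = (1/γ)(I − x xᴴ / (γ + xᴴx)) replaces the per-position T×T inversion with a few vector operations. The Python sketch below shows only this generic solve with an assumed penalty scalar; the full right-hand side of the patent's subproblem is not reproduced.

```python
import numpy as np

def sherman_morrison_solve(x, c, b):
    """Solve (c*I + x x^H) z = b for z without forming the T x T matrix,
    using the Sherman-Morrison identity; x and b are complex T-vectors
    and c > 0 is an ADMM penalty-type scalar."""
    s = c + np.real(np.vdot(x, x))        # c + x^H x, the scalar denominator
    return (b - x * (np.vdot(x, b) / s)) / c

# Quick check against a dense solve on random complex data:
rng = np.random.default_rng(0)
T = 42                                    # feature dimension used in the text
x = rng.standard_normal(T) + 1j * rng.standard_normal(T)
b = rng.standard_normal(T) + 1j * rng.standard_normal(T)
A = 3.0 * np.eye(T) + np.outer(x, np.conj(x))
assert np.allclose(sherman_morrison_solve(x, 3.0, b), np.linalg.solve(A, b))
```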
Further, in Step 4, the target response map R_(f+1) is obtained by inverse Fourier transform of the channel-wise correlations of the two filters with the corresponding sample features, fused with the weight ρ (the equation appears only as an image in the original). Here the hat symbol denotes the corresponding quantity of given data in the Fourier domain; F^(−1) denotes the response after an inverse Fourier transform; D represents the dimension of the filter; ĥ_g^f and ĥ_o^f are the two filters of the f-th frame; x̂_(f+1)^c represents the c-th channel feature of the search-area sample extracted from frame f+1, and x̂_o is its mask-processed version; ρ is the weight parameter controlling the two response maps generated by the background filter and the target filter.
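A minimal Python sketch of the Step 4 detection, assuming channel-stacked Fourier-domain features and filters and a linear fusion of the two response maps with weight ρ (the exact fusion form is given only as an equation image in the original):

```python
import numpy as np

def detect(x_hat, hg_hat, xo_hat, ho_hat, rho=0.5):
    """Fuse background- and target-filter responses and locate the target.

    x_hat, xo_hat : C x H x W complex FFTs of the search-region features
                    and of their masked version
    hg_hat, ho_hat: C x H x W complex FFTs of the two filters of frame f
    rho           : assumed fusion weight between the two response maps
    """
    R_g = np.fft.ifft2(np.sum(np.conj(hg_hat) * x_hat, axis=0)).real
    R_o = np.fft.ifft2(np.sum(np.conj(ho_hat) * xo_hat, axis=0)).real
    R = (1.0 - rho) * R_g + rho * R_o             # fused response map
    peak = np.unravel_index(np.argmax(R), R.shape)
    return R, peak                                # peak gives the new position
```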
Compared with the prior art, the invention has the following beneficial effects:
(1) The invention discloses a scale-adaptive cosine window. The shortcomings of the existing cosine window are analyzed, and the cosine window is peak-clipped using the initial size of the target. In addition, the scale change of the target is taken into account: the peak-clipping position of the cosine window is dynamically adjusted through the scale change factor, improving the tracking success rate of the tracking algorithm in scenes where the target scale changes.
(2) The ADTrack algorithm is improved: self-constraints on, and a mutual constraint between, the trained target filter and background filter are introduced, further optimizing the objective function; the tracking drift and rapid template degradation of existing trackers are mitigated; and an ADMM optimization derivation is carried out for the new objective function.
Drawings
Fig. 1 illustrates the construction of a scale-adaptive cosine window in an embodiment of the present invention.
Fig. 2 is an illustration of a circulant matrix and a clipping matrix.
FIG. 3 is an overview framework of the algorithm model in an embodiment of the invention.
Fig. 4 shows the tracking results on the tiger1 video sequence of the TC128 data set in an embodiment of the present invention.
FIG. 5 shows the overall precision results of recent algorithms and the method of the present invention on the TC128 data set, in an embodiment of the invention.
FIG. 6 shows the overall success-rate results of recent algorithms and the method of the present invention on the TC128 data set, in an embodiment of the invention.
Detailed Description
The technical scheme of the invention is further explained in detail by combining the drawings in the specification.
The invention relates to a correlation filtering target tracking method with an improved cosine window, in which filtering training is carried out separately on the target and on the background information. Target tracking currently faces many problems, such as target scale change and target occlusion. Aiming at target scale change, the invention constructs a scale-adaptive cosine window: the cosine window is peak-clipped and updated in time, two filters are used, one for the target and one for the target background, and their self-constraints and mutual constraint are combined to improve the correlation filtering target tracking model and thereby the tracking performance.
The technical scheme of the invention mainly comprises the following contents. Aiming at the problem that the cosine window in correlation filtering target tracking algorithms increases the texture of a sample and pollutes the sample, a scale-adaptive cosine window function is constructed: the cosine window model is first peak-clipped based on the reference size of the target, and the peak-clipped cosine window model is then updated at appropriate times using the scale factor S_scale from the detection module of the DSST algorithm, making it scale-adaptive; the specific effect is shown in Figure 1. The overall tracking model of the invention is based on the ADTrack algorithm, with the specific effect shown in Figure 3; besides the scale-adaptive cosine window, the self-constraints and the mutual constraint of the filters are optimized on the basis of the basic ADTrack model, and the ADMM iteration of the model is re-derived.
The method comprises the following specific steps:
step 1: in the preprocessing stage, a threshold value tau is used for judging whether the scene is dark, and the optimal value of tau is obtained by adjusting parameters. And if the video sequence is judged to be in a dark scene, enhancing the video sequence to improve the accuracy and robustness of target tracking in a night state, and obtaining a mask m for training two subsequent filters.
Given a color image I ∈ R^(W×H×3), perform photometric fusion of the RGB channels of the image:
V(x, y, I) = α_R Ψ_R(I(x, y)) + α_G Ψ_G(I(x, y)) + α_B Ψ_B(I(x, y))
where Ψ_m(I(x, y)) denotes the pixel value of channel m of the image at (x, y), and α_R + α_G + α_B = 1; this can be understood as converting the color image into a single-channel image. Then carry out logarithmic averaging of the brightness:
V̄(I) = exp( (1/N) Σ_(x,y) log(δ + V(x, y, I)) )
where δ is a small value intended to prevent a log(0) error condition, N is the number of pixels, and V̄(I) represents the logarithmic mean scene brightness of the current image; whether the current image is a dark scene is then judged by introducing a threshold τ: an image with V̄(I) < τ is defined as a dark scene, denoted S(I).
The image is then enhanced using the previously acquired image brightness V(x, y, I) and image logarithmic mean brightness V̄(I). The global enhancement matrix of the image is computed from V(x, y, I), V̄(I), and V_max(I) (its equation is given only as an image in the original), where V_max(I) represents the maximum value of the image brightness V(x, y, I); the three channels of the image can then be enhanced with it, where I_e represents the enhanced image and Ψ_m(I_e(x, y)) the pixel value of the enhanced image at the m-channel (x, y) position. From the enhanced image, the enhanced-part information of the image is easily obtained:
E(I) = V(I) − V(I_e)
Finally, the mean μ and standard deviation σ of E(I) are obtained, from which the global mask m_g is computed by thresholding (the equation appears only as an image in the original). Referring to the clipping matrix P in the BACF algorithm, m_g is cropped to obtain the desired mask m = m_g ⊙ P, P ∈ R^(w×h), used to extract target size information within the sample.
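In array terms, applying the clipping matrix P amounts to taking the central w × h sub-block of the full-size mask; a short sketch (with assumed centered, integer-division indexing) makes this explicit:

```python
def crop_mask(m_g, w, h):
    """Central w x h crop of the full-size numpy mask m_g, i.e. the array
    reading of m = m_g applied through the BACF clipping matrix P."""
    H_full, W_full = m_g.shape
    top, left = (H_full - h) // 2, (W_full - w) // 2
    return m_g[top:top + h, left:left + w]
```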
Step 2: feature extraction. A large number of training samples are obtained through cosine window preprocessing, a circulant matrix, and center cropping; cosine window preprocessing directly point-multiplies the cosine window function onto the samples, and the circulant-matrix and center-cropping operations are shown in Figure 2. Features are then extracted from the resulting samples, including gray-scale information, color information, and gradient information, to obtain the feature set x_g. In view of the shortcomings of the original cosine window, the invention improves it, with the specific steps shown in Figure 1: the cosine window is first peak-clipped using the initial size of the target, the peak-clipping position being a parameter Q ∈ (0, 1) obtained from the target size relative to the cosine window size (the two defining equations appear only as images in the original), where cosWin_0 is the original cosine window, W×H is the cosine window size, and w×h, with w ≤ W and h ≤ H, is the target size.
Then, after each target scale update, the scale update factor S_scale is obtained and the peak-clipping position of the cosine window is updated as Q_scale = Q × S_scale, adapting the model scale. The mask m is then used to obtain x_o = m ⊙ x_g, representing the pure target features; at this point the two feature sets x_g and x_o have been obtained. The concrete scale-adaptive cosine window model is a function of the original cosine window cosWin_0, the peak-clipping position Q_scale, and the scale factor S_scale, with W×H the cosine window size and w×h the target size (its equation is given only as an image in the original).
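Continuing the hypothetical clipped_cosine_window sketch given earlier, the per-frame scale adaptation then reduces to re-evaluating the window with the new factor:

```python
# Window for a 100 x 100 search region whose target initially covers
# roughly 40% of the window; after the detected scale grows by 5%,
# the flat plateau widens accordingly (Q_scale = Q * S_scale).
win_t0 = clipped_cosine_window(100, 100, Q=0.4, S_scale=1.00)
win_t1 = clipped_cosine_window(100, 100, Q=0.4, S_scale=1.05)
assert (win_t1 >= win_t0 - 1e-12).all()   # larger plateau, gentler taper
```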
Step 3: training stage. From the Step 2 feature sets x_g and x_o, two filters are trained for the next frame: h_g, the filter trained on background information, and h_o, the filter trained on target information. The algorithm model is improved on the basis of the ADTrack model; the concrete model is shown in Figure 3, and its objective function (given only as an equation image in the original) combines, for k ∈ {g, o}, least-squares data terms with regularization, self-constraint, and mutual-constraint terms, where P is the sample clipping matrix of the BACF algorithm, the reference algorithm of ADTrack; h_k^c represents the target-information or background-information filter of the c-th channel; cosWin is the cosine window function proposed above, which varies with the target scale factor and is used to preprocess the training sample data; y is the ideal Gaussian model; h_t and h_(t−1) represent the filters of the current frame and the previous frame; the M matrix represents the relation between the two filters; λ is the constraint parameter of the filter regularization term; and μ is the constraint parameter of the mutual constraint term between the two filters.
For the whole objective function, since k ∈ {g, o} and the last two terms constrain each other, the objective can be regarded as the sum of 7 parts: parts 1, 2, 4, and 5 form a conventional linear model augmented with the clipping matrix, i.e. least squares plus a regularization term whose purpose is to prevent the filter from overfitting; parts 3 and 6 are the self-constraints of the two filters, which effectively prevent rapid degradation of the filters; part 7 is the mutual constraint of the two filters, which binds the two filters to each other during training so that their discriminative power is stronger.
The solution is then performed by the ADMM iterative algorithm. In the iterative process, h_o and M can first be assumed known, and the ADMM iterative optimal solution is carried out for h_g. Since the cosine window function is only a preprocessing of the samples, it can be ignored when performing the iteration. Using the augmented Lagrangian method, a relaxation variable ξ is introduced (defined by an equation that appears only as an image in the original), where P^T is the transpose of the clipping matrix P and I_N is the N×N identity matrix. The augmented Lagrangian form of the objective function (shown only as an equation image in the original) involves the Lagrange vector and the penalty factor γ. With the ADMM method, it can be converted iteratively into three subproblems, for h_g, for ξ*, and for the Lagrange multiplier update.
For the h_g subproblem, the closed-form solution of h_g can be found directly from the first derivative (the solution appears only as an equation image in the original). The ξ* subproblem needs to be converted into the frequency domain for further calculation; it can be decomposed into T subproblems, where T = 42 represents the dimension of the features. Setting up each subproblem and deriving it yields a linear system; because of the matrix division, the amount of calculation is large, so the Sherman-Morrison equation is needed to optimize the solution of the inverse matrix, giving a closed-form update whose normalizing denominator is a scalar (the intermediate equations appear only as images in the original).
The iteration processes of h_g and h_o are substantially the same and are not repeated here; the M matrix has its own iterative update (given only as an equation image in the original).
Step 4: detection stage. The target response map R_(f+1) can be expressed as the inverse Fourier transform of the channel-wise correlations of the two filters with the corresponding sample features, fused with the weight ρ (the equation appears only as an image in the original), where the hat symbol represents the corresponding quantity of the given data in the Fourier domain; F^(−1) represents the response after an inverse Fourier transform; D corresponds to the dimension of the filter; ĥ_g^f and ĥ_o^f are the two filters of the f-th frame; x̂_(f+1)^c represents the c-th channel feature of the search-area sample extracted from frame f+1, and x̂_o is its mask-processed version; ρ is the weight parameter controlling the two response maps generated by the background filter and the target filter. Finally, the maximum value of the response map R_(f+1) determines the position information of the target.
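A toy run of the hypothetical detect() sketch given earlier illustrates the peak localization: when the sample equals the template, the fused response peaks at zero displacement.

```python
rng = np.random.default_rng(1)
feat = rng.standard_normal((31, 50, 50))          # 31-channel toy features
feat_hat = np.fft.fft2(feat, axes=(-2, -1))
R, peak = detect(feat_hat, feat_hat, feat_hat, feat_hat, rho=0.5)
print(peak)   # (0, 0): the autocorrelation peaks at zero displacement
```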
Through the scale-adaptive cosine window and the self-constraints and mutual constraint of the filters, the method improves the tracking performance of the target tracking algorithm in scale-change scenes, reduces the problem of template drift during tracking, and greatly improves the precision and success rate of the algorithm. As shown in Figures 5-6 and Table 1, the tracking success rate of the method ranks first, and its precision ties for first with the AutoTrack algorithm.
TABLE 1. Overall precision and success rate of each algorithm on the TC128 data set

Metric      Ours   CPCF   ADTrack  AutoTrack  BACF   KCF    SRDCF  BiCF
Precision   0.702  0.697  0.689    0.702      0.644  0.544  0.644  0.641
Success     0.649  0.617  0.619    0.629      0.610  0.454  0.584  0.559
The above description is only a preferred embodiment of the present invention, and the scope of the present invention is not limited to this embodiment; equivalent modifications or changes made by those skilled in the art according to the present disclosure shall fall within the protection scope set forth in the appended claims.

Claims (6)

1. A correlation filtering target tracking method with an improved cosine window, characterized by comprising the following steps:
Step 1, preprocessing stage: judge whether the current frame is a dark scene; if so, enhance the video sequence; and compute a mask m for the subsequent filter training;
Step 2, feature extraction: obtain a feature set x_g from the training samples, adapt to target scale updates through the peak-clipped cosine window, and obtain the target feature set x_o from the mask m, yielding a scale-adaptive cosine window model;
Step 3, training stage: use the feature sets x_g and x_o obtained in Step 2 to train the two filters h_g and h_o of the next frame, where h_g is the filter trained on background information and h_o is the filter trained on target information;
Step 4, detection stage: obtain a target response map from the trained filters and the extracted sample channel features, and determine the position information of the target from the maximum of the target response map.
2. The correlation filtering target tracking method with an improved cosine window according to claim 1, characterized in that Step 1 comprises the following sub-steps:
Step 1-1: given a color image I ∈ R^(W×H×3), perform photometric fusion of the RGB channels of the image:
V(x, y, I) = α_R Ψ_R(I(x, y)) + α_G Ψ_G(I(x, y)) + α_B Ψ_B(I(x, y))
where Ψ_m(I(x, y)) denotes the pixel value of channel m of the image at (x, y), and α_R + α_G + α_B = 1, converting the color image into a single-channel image; then carry out logarithmic averaging of the brightness:
V̄(I) = exp( (1/N) Σ_(x,y) log(δ + V(x, y, I)) )
where δ is a constant that prevents the occurrence of log(0), N is the number of pixels, and V̄(I) represents the logarithmic mean scene brightness of the current image; whether the current image is a dark scene is judged by introducing a threshold τ: if V̄(I) < τ, the image is a dark scene, denoted S(I);
Step 1-2: enhance the image using the previously acquired image brightness V(x, y, I) and image logarithmic mean brightness V̄(I). The global enhancement matrix of the image is computed from V(x, y, I), V̄(I), and V_max(I) (its equation is given only as an image in the original), where V_max(I) represents the maximum value of the image brightness V(x, y, I); the three channels of the image are then enhanced with it, where I_e represents the enhanced image and Ψ_m(I_e(x, y)) the pixel value of the enhanced image at the m-channel (x, y) location; the enhanced-part information of the image is obtained from the enhanced image:
E(I) = V(I) − V(I_e)
Step 1-3: obtain the mean μ and standard deviation σ of E(I), and threshold E(I) with them to obtain the global mask m_g (the thresholding equation appears only as an image in the original); cropping m_g with the clipping matrix P yields the desired mask m = m_g ⊙ P, P ∈ R^(w×h), used to extract target size information within the sample.
3. The correlation filtering target tracking method with an improved cosine window according to claim 1, characterized in that Step 2 comprises the following sub-steps:
Step 2-1: a large number of training samples are obtained through cosine window preprocessing, a circulant matrix, and center cropping, cosine window preprocessing being a direct point-multiplication of the cosine window function onto the samples; features are extracted from the resulting samples, including gray-scale information, color information, and gradient information, to obtain the feature set x_g;
Step 2-2: peak-clip the cosine window using the initial size of the target, the peak-clipping position being a parameter Q ∈ (0, 1) obtained from the target size relative to the cosine window size (the two defining equations appear only as images in the original), where cosWin_0 is the original cosine window, W×H is the cosine window size, and w×h, with w ≤ W and h ≤ H, is the target size;
Step 2-3: after each update of the tracked target's scale, obtain the scale update factor S_scale and update the peak-clipping position of the cosine window as Q_scale = Q × S_scale, adapting the model scale; then use the mask m to obtain x_o = m ⊙ x_g, representing the pure target features, which yields the two feature sets x_g and x_o; the concrete scale-adaptive cosine window model is a function of the original cosine window cosWin_0, the peak-clipping position Q_scale, and the scale factor S_scale (its equation is given only as an image in the original).
4. The correlation filtering target tracking method with an improved cosine window according to claim 1, characterized in that Step 3 comprises the following sub-steps:
Step 3-1: the objective function of the filters (given only as an equation image in the original) combines, for k ∈ {g, o}, least-squares data terms with regularization, self-constraint, and mutual-constraint terms, where P is the sample clipping matrix; h_k^c represents the target-information or background-information filter of the c-th channel; cosWin is the cosine window function proposed above, which varies with the target scale factor and is used to preprocess the training sample data; y is the ideal Gaussian model; h_t and h_(t−1) represent the filters of the current frame and the previous frame; the M matrix represents the relation between the two filters; λ is the constraint parameter of the filter regularization term; and μ is the constraint parameter of the mutual constraint term between the two filters;
Step 3-2: for the whole objective function, since k ∈ {g, o} and the last two terms constrain each other, the objective is regarded as the sum of 7 parts; parts 1, 2, 4, and 5 form a conventional linear model augmented with the clipping matrix, i.e. least squares plus a regularization term to prevent the filter from overfitting; parts 3 and 6 are the self-constraints of the two filters, preventing rapid degradation of the filters; part 7 is the mutual constraint of the two filters, which binds the two filters to each other during training so that their discriminative power is stronger;
Step 3-3: solve by the ADMM iterative algorithm; since cosWin is only a preprocessing of the samples, it can be ignored during iteration; with h_o and M known, perform the ADMM iterative optimal solution for h_g: using the augmented Lagrangian method, introduce a relaxation variable ξ (defined by an equation that appears only as an image in the original), where P^T is the transpose of the clipping matrix P and I_N is the N×N identity matrix.
5. The correlation filtering target tracking method with an improved cosine window according to claim 4, characterized in that in Step 3-3 the augmented Lagrangian form of the objective function (given only as an equation image in the original) involves the Lagrange vector, the penalty factor γ, the N×N identity matrix I_N, and the N×N Fourier matrix F_N; with the ADMM method it is converted iteratively into three subproblems, for h_g, for ξ*, and for the Lagrange multiplier update. For the h_g subproblem, the closed-form solution of h_g is found from the first derivative (the solution appears only as an equation image in the original). The ξ* subproblem needs to be converted into the frequency domain for further calculation; it is decomposed into T subproblems, where T = 42 represents the dimension of the features. Setting up each subproblem and deriving it yields a linear system whose inverse matrix is solved efficiently with the Sherman-Morrison equation, giving a closed-form update whose normalizing denominator is a scalar (the intermediate equations appear only as images in the original). The iteration process of h_o is the same as that of h_g, and the M matrix has its own iterative update (given only as an equation image in the original).
6. The correlation filtering target tracking method with an improved cosine window according to claim 1, characterized in that in Step 4 the target response map R_(f+1) is expressed as the inverse Fourier transform of the channel-wise correlations of the two filters with the corresponding sample features, fused with the weight ρ (the equation appears only as an image in the original), where the hat symbol represents the corresponding quantity of given data in the Fourier domain; F^(−1) represents the response after an inverse Fourier transform; D represents the dimension of the filter; ĥ_g^f and ĥ_o^f are the two filters of the f-th frame; x̂_(f+1)^c represents the c-th channel feature of the search-area sample extracted from frame f+1, and x̂_o is its mask-processed version; ρ is the weight parameter controlling the two response maps generated by the background filter and the target filter.
CN202210575210.8A 2022-05-25 2022-05-25 Improved cosine window correlation filtering target tracking method Pending CN114897938A

Priority Applications (1)

Application Number CN202210575210.8A; Priority Date 2022-05-25; Filing Date 2022-05-25; Title: Improved cosine window correlation filtering target tracking method

Applications Claiming Priority (1)

Application Number CN202210575210.8A; Priority Date 2022-05-25; Filing Date 2022-05-25; Title: Improved cosine window correlation filtering target tracking method

Publications (1)

Publication Number CN114897938A; Publication Date 2022-08-12

Family

ID=82725509

Family Applications (1)

Application Number CN202210575210.8A; Title: Improved cosine window correlation filtering target tracking method; Priority Date 2022-05-25; Filing Date 2022-05-25

Country Status (1)

Country CN; Document CN114897938A

Citations (3)

* Cited by examiner, † Cited by third party

CN102496016A * (priority 2011-11-22, published 2012-06-13, Wuhan University): Infrared target detection method based on space-time cooperation framework
CN111951298A * (priority 2020-06-25, published 2020-11-17, Hunan University): Target tracking method fusing time series information
US20210227132A1 * (priority 2018-05-30, published 2021-07-22, Arashi Vision Inc.): Method for tracking target in panoramic video, and panoramic camera

Non-Patent Citations (1)

Wang Yongxiong et al., "Correlation filtering tracking algorithm based on background awareness and fast size discrimination", Journal of Data Acquisition and Processing, No. 02, 15 March 2020 *

Similar Documents

Publication Publication Date Title
CN109389608B (en) There is the fuzzy clustering image partition method of noise immunity using plane as cluster centre
CN111489364B (en) Medical image segmentation method based on lightweight full convolution neural network
CN109410247A (en) A kind of video tracking algorithm of multi-template and adaptive features select
CN107657625A (en) Merge the unsupervised methods of video segmentation that space-time multiple features represent
CN112329784A (en) Correlation filtering tracking method based on space-time perception and multimodal response
CN110322445A (en) A kind of semantic segmentation method based on maximization prediction and impairment correlations function between label
CN110853064B (en) Image collaborative segmentation method based on minimum fuzzy divergence
CN114092353A (en) Infrared image enhancement method based on weighted guided filtering
CN112001294A (en) YOLACT + + based vehicle body surface damage detection and mask generation method and storage device
CN111310609A (en) Video target detection method based on time sequence information and local feature similarity
KR101018299B1 (en) Apparatus and method for detecting a plurality of objects in an image
CN110991554B (en) Improved PCA (principal component analysis) -based deep network image classification method
CN111126169B (en) Face recognition method and system based on orthogonalization graph regular nonnegative matrix factorization
CN115797205A (en) Unsupervised single image enhancement method and system based on Retinex fractional order variation network
CN113947732B (en) Aerial visual angle crowd counting method based on reinforcement learning image brightness adjustment
CN116433909A (en) Similarity weighted multi-teacher network model-based semi-supervised image semantic segmentation method
CN116110113A (en) Iris recognition method based on deep learning
CN104952071A (en) Maximum between-cluster variance image segmentation algorithm based on GLSC (gray-level spatial correlation)
CN116543162B (en) Image segmentation method and system based on feature difference and context awareness consistency
CN117649694A (en) Face detection method, system and device based on image enhancement
CN112528077A (en) Video face retrieval method and system based on video embedding
CN109543684B (en) Real-time target tracking detection method and system based on full convolution neural network
CN115690704B (en) LG-CenterNet model-based complex road scene target detection method and device
CN114897938A (en) Improved cosine window correlation filtering target tracking method
CN116884067A (en) Micro-expression recognition method based on improved implicit semantic data enhancement

Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination
CB03: Change of inventor or designer information
    Inventor after: Deng Lizhen; Sun Jiawei; Zhu Hu
    Inventor before: Sun Jiawei; Deng Lizhen; Zhu Hu