CN114897938A - Improved cosine window correlation filter target tracking method
- Publication number
- CN114897938A (application number CN202210575210.8A)
- Authority
- CN
- China
- Prior art keywords
- target
- cosine window
- scale
- image
- matrix
- Prior art date
- Legal status (assumed, not a legal conclusion): Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/246—Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
- G06T7/248—Analysis of motion using feature-based methods, e.g. the tracking of corners or segments involving reference images or patches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/262—Analysis of motion using transform domain methods, e.g. Fourier domain methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10016—Video; Image sequence
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10024—Color image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Multimedia (AREA)
- Computer Vision & Pattern Recognition (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Mathematical Physics (AREA)
- Image Analysis (AREA)
- Radar Systems Or Details Thereof (AREA)
Abstract
Description
Technical Field
The invention belongs to the technical field of target tracking, and in particular relates to a correlation filter target tracking method with an improved cosine window.
Background Art
Image recognition and target tracking are key tasks in computer vision. The task is to continuously infer the state of a target throughout a video sequence; the core work is to locate the tracked target in every frame of the video, obtain its motion trajectory, and provide the target's active region for each frame. Target tracking technology is now widely used in both military and civilian fields and brings great convenience to daily life; as living standards rise, so do the demands placed on target tracking. Although target tracking techniques are renewed every year, the algorithms still leave room for performance optimization, and researchers continue to face many challenges.
In recent years, many researchers have proposed methods for designing stable, accurate target trackers. In 2017, Galoogahi et al. proposed the BACF algorithm. To improve sample quality, it introduced the key idea of "cropping": the original sampling region is enlarged to capture more background information, and the circularly shifted samples are then center-cropped, which both increases the number of samples and improves their quality. Later, the ADTrack algorithm proposed by Li et al. in 2021 extracts target information with a 0-1 mask and trains on the target and the target background separately, further improving on BACF.
However, ADTrack also has shortcomings. The first is the cosine window: the ordinary cosine window directly adopts a model with an extremum at the center and zeros around it, ignoring the actual size of the target, so preprocessing samples with it directly adds texture to the target and pollutes the source target information. In addition, the constraint terms imposed on the filter in ADTrack are not ideal, leading to filter overfitting and rapid degradation of the filter model.
Summary of the Invention
The present invention is a correlation filter target tracking method with an improved cosine window, in which the target and the background information are filtered and trained separately. Many problems remain in the field of target tracking, such as target scale change and target occlusion. To address target scale change, the present invention constructs a scale-adaptive cosine window, clips its peak, and updates it in time; it then uses two filters, one for the target and one for the target background, and combines their self-constraints and mutual constraint to improve the correlation filter tracking model and thereby the tracking performance.
The technical scheme adopted by the present invention is roughly as follows:
A correlation filter target tracking method with an improved cosine window, comprising the following steps:
Step 1, preprocessing stage: judge whether the scene is dark; if so, enhance the video sequence and compute the mask m used for the subsequent filter training;
Step 2, feature extraction stage: obtain the feature set x_g from the training samples, realize scale-adaptive updating of the tracked target through the peak-clipped cosine window, and obtain the target feature set x_o from the mask m, yielding the scale-adaptive cosine window model;
Step 3, training stage: use the feature sets x_g and x_o obtained in step 2 to train the two filters h_g and h_o for the next frame, where h_g is the filter trained on background information and h_o is the filter trained on target information;
Step 4, detection stage: based on the trained filters and the extracted sample channel features, compute the target response map; the target's position is determined by the maximum of the response map.
Further, step 1 comprises the following sub-steps:
Step 1-1, given a color image I, perform photometric integration over the RGB channels of the image:
where Ψ_m(I(x,y)) denotes the pixel value of channel m of the image at position (x,y), and α_R + α_G + α_B = 1; this converts the color image into a single-channel brightness image V(x,y,I). The brightness is then log-averaged:
where δ is a constant that prevents log(0) from occurring; the result represents the logarithmic average scene brightness of the current image. A threshold τ is then introduced to judge whether the current image is a dark scene: if the log-average brightness falls below the threshold, the scene is dark, denoted S(I);
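As an illustration, a minimal Python sketch of this dark-scene test follows; the channel weights α_R, α_G, α_B (here BT.601 luma weights, with an assumed RGB channel order) and the value of τ are assumptions, since the patent tunes τ by experiment and leaves the weights open:

```python
import numpy as np

def is_dark_scene(img, alphas=(0.299, 0.587, 0.114), delta=1e-6, tau=0.35):
    """Step 1-1 sketch: photometric integration of the RGB channels into a
    single brightness channel V(x,y,I), followed by the log-average scene
    brightness and the dark-scene threshold test S(I)."""
    img = img.astype(np.float64) / 255.0                     # pixels in [0, 1]
    v = sum(a * img[..., c] for c, a in enumerate(alphas))   # V(x, y, I)
    v_bar = np.exp(np.mean(np.log(delta + v)))               # log-average brightness
    return v_bar < tau                                       # S(I): dark if below tau
```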
Step 1-2, enhance the image: using the previously obtained image brightness V(x,y,I) and the logarithmic average brightness, the global enhancement matrix of the image is expressed as:
where V_max(I) denotes the maximum of the image brightness V(x,y,I); the three channels of the image are then enhanced:
where I_e denotes the enhanced image and Ψ_m(I_e(x,y)) denotes its pixel value at position (x,y) of channel m. From the enhanced image, the enhancement component is obtained:
E(I) = V(I) - V(I_e)
Step 1-3, compute the mean μ and standard deviation σ of E(I), from which the global mask m_g is obtained:
m_g is then cropped by the cropping matrix P to obtain the desired mask m = m_g ⊙ P, P ∈ R^{w×h}, which is used to extract the target-size information within the sample.
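The enhancement-matrix formula and the μ/σ thresholding rule for m_g appear in the source only as images. Purely as a hedged sketch, the following assumes a logarithmic gain built from V, the log-average brightness, and V_max, and keeps pixels whose enhancement deviates from μ by more than σ; both choices are assumptions, not the patent's exact formulas:

```python
import numpy as np

def target_mask(img, box, delta=1e-6):
    """Steps 1-2/1-3 sketch (assumed formulas, see lead-in): enhance the
    brightness, form E(I) = V(I) - V(I_e), threshold with mean/std to get
    the global mask m_g, then crop m_g to the target box (the role of P)."""
    img = img.astype(np.float64) / 255.0
    v = img.mean(axis=2)                                  # brightness V(x, y, I)
    v_bar = np.exp(np.mean(np.log(delta + v)))            # log-average brightness
    gain = np.log1p(v / v_bar) / np.log1p(v.max() / v_bar)  # assumed enhancement matrix
    v_e = np.clip(v + gain * (1.0 - v), 0.0, 1.0)         # enhanced brightness V(I_e)
    e = v - v_e                                           # E(I) = V(I) - V(I_e)
    m_g = (np.abs(e - e.mean()) > e.std()).astype(np.float64)  # assumed mu/sigma rule
    x0, y0, w, h = box                                    # m = m_g (.) P: crop to target
    return m_g[y0:y0 + h, x0:x0 + w]
```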
Further, step 2 comprises the following sub-steps:
Step 2-1, obtain a large number of training samples through cosine window preprocessing, circulant matrices, and center cropping; cosine window processing means multiplying the sample element-wise by the cosine window function, and the circulant-matrix and center-cropping operations are shown in Figure 2. Extract features from the samples, including grayscale, color, and gradient information, to obtain the feature set x_g;
Step 2-2, peak-clip the cosine window using the initial target size; the clipping position is the parameter Q ∈ (0,1), obtained by the following formula:
where cosWin_0 is the original cosine window, W×H with W = H is the cosine window size, and w×h with w ≥ h is the target size;
Step 2-3, after each update of the tracked target's scale, obtain the scale update factor S_scale and update the peak-clipping position of the cosine window as Q_scale = Q × S_scale, making the model scale-adaptive. Then use the mask m to obtain x_o = m ⊙ x_g, which represents the pure target features, yielding the two feature sets x_g and x_o. The specific model of the scale-adaptive cosine window is:
where cosWin_0 is the original cosine window, Q_scale is the peak-clipping position of the cosine window, and S_scale is the scale factor.
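A Python sketch of the construction in steps 2-2/2-3 follows. The source gives the formulas for Q and for the clipped window only as images, so the mapping from target size to Q (here the target-to-window size ratio) and the flattening rule are assumptions:

```python
import numpy as np

def scale_adaptive_coswin(W, target_w, target_h, s_scale=1.0):
    """Build a W-by-W cosine (Hann) window whose peak is clipped to a flat
    plateau over the region the target is assumed to occupy; the clipping
    position moves with the scale factor (Q_scale = Q * S_scale)."""
    win1d = np.hanning(W)
    cos_win0 = np.outer(win1d, win1d)             # the original cosine window
    q = max(target_w, target_h) / float(W)        # assumed Q in (0, 1)
    q_scale = min(q * s_scale, 0.99)              # Q_scale = Q * S_scale
    edge = max(1, int(round((1.0 - q_scale) * (W - 1) / 2.0)))
    level = win1d[edge]                           # window value at the clip point
    return np.minimum(cos_win0, level) / level    # plateau normalised to 1
```

Multiplying samples by such a window leaves the assumed target region unweighted and tapers only the surrounding background, which matches the stated goal of not adding texture to, or polluting, the target.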
Further, step 3 comprises the following sub-steps:
Step 3-1, the objective function of the filters is:
where P is the sample cropping matrix; h_k^c denotes the target-information filter or the background-information filter of channel c; cosWin is the cosine window function introduced above, which varies with the target scale factor and is used to preprocess the training samples; y is the ideal Gaussian response; h_t and h_{t-1} denote the filters of the current frame and the previous frame; the matrix M represents the relationship between the two filters; λ is the constraint parameter of the filter regularization terms, and μ is the constraint parameter of the mutual-constraint term between the two filters;
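The formula itself appears in the source only as an image. Based on the seven-part decomposition explained in step 3-2 below, one plausible reconstruction is the following, where θ is a hypothetical name for the self-constraint parameter (the text leaves it unnamed), C is the number of feature channels, ⋆ denotes correlation, and ⊙ the element-wise product:

$$\min_{h_{g,t},\,h_{o,t}}\;\sum_{k\in\{g,o\}}\left[\frac{1}{2}\left\|y-\sum_{c=1}^{C}\left(P^{\mathsf{T}}h_{k,t}^{c}\right)\star\left(\mathrm{cosWin}\odot x_{k}^{c}\right)\right\|_{2}^{2}+\frac{\lambda}{2}\sum_{c=1}^{C}\left\|h_{k,t}^{c}\right\|_{2}^{2}+\frac{\theta}{2}\left\|h_{k,t}-h_{k,t-1}\right\|_{2}^{2}\right]+\frac{\mu}{2}\left\|h_{g,t}-Mh_{o,t}\right\|_{2}^{2}$$

Parts 1-2 and 4-5 are the data and regularization terms for k = g and k = o, parts 3 and 6 the self-constraints, and the final term part 7, the mutual constraint.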
Step 3-2: since k ∈ {g, o}, plus the final mutual constraint between the two filters, the overall objective function can be viewed as the sum of seven parts. Parts 1, 2, 4, and 5 form the conventional linear model augmented with the cropping matrix, namely least squares plus a regularization term, the regularization term serving to prevent filter overfitting. Parts 3 and 6 are the self-constraints of the two filters, which effectively prevent rapid filter degradation. Part 7 is the mutual constraint between the two filters, binding them to each other during training so that their discriminative power becomes stronger;
Step 3-3, solve by the ADMM iterative algorithm. Because cosWin is only a preprocessing of the samples, it can be ignored during the iteration. For known h_o and M, ADMM iterates to the optimal h_g. Using the augmented Lagrangian method, a slack variable ξ is introduced, where P^T is the transpose of the cropping matrix P and I_N is the N×N identity matrix; the augmented Lagrangian form of the objective function is expressed as:
where the introduced vector is the Lagrange multiplier and γ is the penalty factor. Using the ADMM method, the above formula is transformed iteratively into the following three subproblems:
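The three subproblems appear in the source only as images; their generic ADMM structure, writing ζ as a hypothetical symbol for the Lagrange vector and L for the augmented Lagrangian, is:

$$h_{g}^{\,i+1}=\arg\min_{h_{g}}\;\mathcal{L}\left(h_{g},\xi^{\,i},\zeta^{\,i}\right),\qquad \xi^{\,i+1}=\arg\min_{\xi}\;\mathcal{L}\left(h_{g}^{\,i+1},\xi,\zeta^{\,i}\right),\qquad \zeta^{\,i+1}=\zeta^{\,i}+\gamma\left(h_{g}^{\,i+1}-\xi^{\,i+1}\right)$$

The multiplier update assumes standard dual ascent on the slack constraint that ties ξ to h_g (up to the fixed transform relating them).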
For the h_g subproblem, the closed-form solution of h_g is obtained from the first-order derivative:
For the ξ* subproblem, it must first be transformed into the frequency domain for further computation:
The above formula decomposes into T subproblems, where T = 42 is the feature dimension; writing out each subproblem gives:
Taking the derivative of the above formula gives:
The Sherman-Morrison formula is used to optimize the computation of the inverse matrix, giving:
where the factor involved is a scalar;
The iterative processes for h_g and h_o are identical; the iteration of the matrix M is:
Further, in step 4, the target response map is expressed as:
where the hat denotes the Fourier-domain counterpart of a given quantity and the response map is obtained after the inverse Fourier transform; D is the dimensionality of the filter; the two filters of frame f act on the channel-c features of the search-region sample extracted from frame f+1, the target filter acting on the mask-processed features; ρ is a weight parameter controlling the two response maps produced by the background filter and the target filter.
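A minimal Python sketch of this detection step follows; the (1-ρ)/ρ split between the two response maps is an assumption about how the single weight ρ is applied:

```python
import numpy as np

def detect(h_g, h_o, x, m, rho=0.5):
    """Step 4 sketch: correlate the background filter h_g with the raw
    features x and the target filter h_o with the masked features m*x in
    the Fourier domain, fuse the two response maps with weight rho, and
    read the target position off the peak.  h_g, h_o, x: (H, W, D); m: (H, W)."""
    x_hat = np.fft.fft2(x, axes=(0, 1))
    xo_hat = np.fft.fft2(m[..., None] * x, axes=(0, 1))   # mask-processed features
    hg_hat = np.fft.fft2(h_g, axes=(0, 1))
    ho_hat = np.fft.fft2(h_o, axes=(0, 1))
    # Correlation = conjugate product in the Fourier domain; sum over the
    # D feature channels, then invert the transform to get each response map.
    r_g = np.fft.ifft2(np.sum(np.conj(hg_hat) * x_hat, axis=2)).real
    r_o = np.fft.ifft2(np.sum(np.conj(ho_hat) * xo_hat, axis=2)).real
    r = (1.0 - rho) * r_g + rho * r_o                     # fused response map
    dy, dx = np.unravel_index(np.argmax(r), r.shape)      # peak = target position
    return dy, dx, r
```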
Compared with the prior art, the present invention has the following beneficial effects:
(1) A scale-adaptive cosine window is devised. The deficiencies of the existing cosine window are analyzed, and the window is peak-clipped according to the initial target size. In addition, taking the target's scale changes into account, the peak-clipping position is dynamically adjusted through the scale factor, which improves the tracking success rate in scenes where the target changes scale.
(2) The ADTrack algorithm is improved by introducing self-constraints and a mutual constraint between the target filter and the background filter, further optimizing the objective function, alleviating the tracking drift and rapid template degradation of existing trackers, and deriving the ADMM optimization for the new objective function.
Description of the Drawings
Figure 1 shows the construction of the scale-adaptive cosine window in an embodiment of the present invention.
Figure 2 illustrates the circulant matrix and the cropping matrix.
Figure 3 shows the overall framework of the algorithm model in an embodiment of the present invention.
Figure 4 shows the tracking results on the tiger1 video sequence of the TC128 dataset in an embodiment of the present invention.
Figure 5 compares the overall precision of recent algorithms and the proposed method on the TC128 dataset.
Figure 6 compares the overall success rate of recent algorithms and the proposed method on the TC128 dataset.
Detailed Description
The technical solution of the present invention is described in further detail below with reference to the accompanying drawings.
As summarized above, the present invention constructs a scale-adaptive cosine window with peak clipping and timely updates, and combines the self-constraints and mutual constraint of the target and background filters to improve the correlation filter tracking model.
The main content of the technical solution is as follows. To address the problem that the cosine window in correlation filter tracking adds texture to and pollutes the samples, a scale-adaptive cosine window function is constructed: the window is first peak-clipped based on the target's reference size, and the scale factor S_scale from the scale-detection module of the DSST algorithm is then used to update the clipped window in time, making it scale-adaptive (the effect is shown in Figure 1). The overall tracking model is based on the ADTrack algorithm (see Figure 3); besides the scale-adaptive cosine window described above, the self-constraints and mutual constraint of the filters are optimized on top of the ADTrack base model, and the ADMM iteration of the model is re-derived.
The specific steps of the present invention are as follows:
Step 1: preprocessing stage. The threshold τ is used to judge whether the scene is dark, the optimal value of τ being obtained by parameter tuning. If the scene is judged dark, the video sequence is enhanced in order to improve the accuracy and robustness of target tracking at night, and the mask m is obtained for training the two subsequent filters.
Given a color image I, photometric integration is performed over the RGB channels of the image:
where Ψ_m(I(x,y)) denotes the pixel value of channel m at position (x,y), and α_R + α_G + α_B = 1, which can be understood as converting the color image into a single-channel image. The brightness is then log-averaged:
where δ is a small value whose purpose is to prevent the log(0) error; the result represents the logarithmic average scene brightness of the current image. By introducing the threshold τ, one can judge whether the current image is a dark scene: below the threshold it is a dark scene, denoted S(I).
The image is then enhanced. Using the previously obtained image brightness V(x,y,I) and the logarithmic average brightness, the global enhancement matrix of the image can be expressed as:
where V_max(I) denotes the maximum of the image brightness V(x,y,I); the three channels of the image can then be enhanced:
where I_e denotes the enhanced image and Ψ_m(I_e(x,y)) denotes the pixel value of the enhanced image at position (x,y) of channel m. From the enhanced image, the enhancement component of the image is easily obtained:
E(I) = V(I) - V(I_e)
Finally, the mean μ and standard deviation σ of E(I) are computed, from which the global mask m_g can be obtained:
Referring to the cropping matrix P of the BACF algorithm, cropping m_g yields the desired mask m = m_g ⊙ P, where P ∈ R^{w×h} is used to extract the target-size information within the sample.
Step 2: feature extraction stage. A large number of training samples are obtained through cosine window preprocessing, circulant matrices, and center cropping; cosine window processing means multiplying the sample element-wise by the cosine window function, and the circulant-matrix and center-cropping operations are shown in Figure 2. Features including grayscale, color, and gradient information are then extracted from the samples to obtain the feature set x_g. Considering the deficiency of the original cosine window, the present invention improves it (the specific steps are shown in Figure 1): the cosine window is first peak-clipped using the initial target size, with the clipping position given by the parameter Q ∈ (0,1), which can be obtained from the following formula:
where cosWin_0 is the original cosine window, W×H with W = H is the cosine window size, and w×h with w ≥ h is the target size.
After each subsequent update of the tracked target's scale, the scale update factor S_scale is obtained and the peak-clipping position of the cosine window is updated again as Q_scale = Q × S_scale, making the model scale-adaptive. The mask m is then used to obtain x_o = m ⊙ x_g, representing the pure target features; at this point the two feature sets x_g and x_o have been obtained. The specific model of the scale-adaptive cosine window is:
where cosWin_0 is the original cosine window, Q_scale is the peak-clipping position, S_scale is the scale factor, W×H with W = H is the cosine window size, and w×h with w ≥ h is the target size.
Step 3: training stage. The feature sets x_g and x_o obtained in Step 2 are used to train the two filters for the next frame, h_g and h_o, where h_g is the filter trained on background information and h_o is the filter trained on target information. The algorithm model is improved from the ADTrack model (see Figure 3); the objective function of the model can be expressed as:
where P is the sample cropping matrix from the BACF algorithm, the baseline of ADTrack; h_k^c denotes the target-information or background-information filter of channel c; cosWin is the cosine window function introduced above, which varies with the target scale factor and is used to preprocess the training samples; y is the ideal Gaussian response; h_t and h_{t-1} denote the filters of the current and previous frames; the matrix M represents the relationship between the two filters; λ is the constraint parameter of the filter regularization terms, and μ is the constraint parameter of the mutual-constraint term between the two filters.
For the overall objective function, since k ∈ {g, o} plus the final mutual constraint, it can be viewed as the sum of seven parts. Parts 1, 2, 4, and 5 form the conventional linear model augmented with the cropping matrix, namely least squares plus a regularization term that prevents filter overfitting; parts 3 and 6 are the self-constraints of the two filters, which effectively prevent rapid filter degradation; part 7 is the mutual constraint between the two filters, binding them to each other during training so that their discriminative power becomes stronger.
Next, the problem is solved by the ADMM iterative algorithm. During the iteration, h_o and M can first be assumed known, and ADMM iterates to the optimal h_g. Since the cosine window function is only a preprocessing of the samples, it can be ignored during the iteration. Using the augmented Lagrangian method, a slack variable ξ is introduced, where P^T is the transpose of the cropping matrix P and I_N is the N×N identity matrix; the augmented Lagrangian form of the objective function is expressed as:
where the introduced vector is the Lagrange multiplier and γ is the penalty factor. Using the ADMM method, the above formula can be transformed iteratively into the following three subproblems:
For the h_g subproblem, the closed-form solution of h_g can be obtained directly from the first-order derivative:
For the subproblem ξ*, it must first be transformed into the frequency domain for further computation:
The above formula can be decomposed into T subproblems, where T = 42 is the feature dimension; writing out each subproblem gives:
Taking the derivative of the above formula gives:
Because the matrix inversion that appears is computationally expensive, the Sherman-Morrison formula is needed to optimize the computation of the inverse matrix, giving:
where the factor involved is a scalar.
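For reference, the Sherman-Morrison identity invoked here is the standard rank-one-update formula

$$\left(A+uv^{\mathsf{T}}\right)^{-1}=A^{-1}-\frac{A^{-1}u\,v^{\mathsf{T}}A^{-1}}{1+v^{\mathsf{T}}A^{-1}u}$$

whose denominator 1 + v^T A^{-1} u is precisely the scalar mentioned above; when A is a multiple of the identity, the right-hand side requires no explicit matrix inversion at all, which is what makes each per-dimension subproblem cheap to solve.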
The iterative processes of h_g and h_o are roughly the same and are not repeated here; the iteration of the matrix M is:
Step 4: detection stage. The target response map can be expressed as:
where the hat denotes the Fourier-domain counterpart of a given quantity and the response map is obtained after the inverse Fourier transform; D corresponds to the dimensionality of the filter; the two filters of frame f act on the channel-c features of the search-region sample extracted from frame f+1, the target filter acting on the mask-processed features; ρ is a weight parameter controlling the two response maps produced by the background filter and the target filter. Finally, the position of the target is determined from the maximum of the response map.
Through the scale-adaptive cosine window and the self-constraints and mutual constraint of the filters, the present invention improves the tracking performance of the target tracking algorithm in scale-change scenarios, reduces the template-drift problem during tracking, and greatly improves the precision and success rate of the algorithm. As shown in Figures 5-6 and Table 1, the proposed method ranks first in tracking success rate and ties with the AutoTrack algorithm for first place in precision.
Table 1. Overall precision and overall success rate of each algorithm on the TC128 dataset
The above is only a preferred embodiment of the present invention, and the protection scope of the present invention is not limited to the above embodiment; any equivalent modification or variation made by a person of ordinary skill in the art according to the disclosure of the present invention shall fall within the protection scope recorded in the claims.
Claims (6)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210575210.8A CN114897938B (en) | 2022-05-25 | 2022-05-25 | An improved cosine window correlation filter target tracking method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210575210.8A CN114897938B (en) | 2022-05-25 | 2022-05-25 | An improved cosine window correlation filter target tracking method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114897938A true CN114897938A (en) | 2022-08-12 |
CN114897938B CN114897938B (en) | 2025-02-11 |
Family
ID=82725509
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210575210.8A Active CN114897938B (en) | 2022-05-25 | 2022-05-25 | An improved cosine window correlation filter target tracking method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114897938B (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102496016A (en) * | 2011-11-22 | 2012-06-13 | 武汉大学 | Infrared target detection method based on space-time cooperation framework |
CN111951298A (en) * | 2020-06-25 | 2020-11-17 | 湖南大学 | A target tracking method fused with time series information |
US20210227132A1 (en) * | 2018-05-30 | 2021-07-22 | Arashi Vision Inc. | Method for tracking target in panoramic video, and panoramic camera |
- 2022-05-25: CN application CN202210575210.8A filed; granted as CN114897938B (en) — status: Active
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102496016A (en) * | 2011-11-22 | 2012-06-13 | 武汉大学 | Infrared target detection method based on space-time cooperation framework |
US20210227132A1 (en) * | 2018-05-30 | 2021-07-22 | Arashi Vision Inc. | Method for tracking target in panoramic video, and panoramic camera |
CN111951298A (en) * | 2020-06-25 | 2020-11-17 | 湖南大学 | A target tracking method fused with time series information |
Non-Patent Citations (1)
Title |
---|
Wang Yongxiong et al., "Correlation filter tracking algorithm based on background awareness and fast size discrimination", Journal of Data Acquisition and Processing, no. 02, 15 March 2020 (2020-03-15) *
Also Published As
Publication number | Publication date |
---|---|
CN114897938B (en) | 2025-02-11 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110210551B (en) | A Visual Object Tracking Method Based on Adaptive Subject Sensitivity | |
CN108550161B (en) | Scale self-adaptive kernel-dependent filtering rapid target tracking method | |
CN114037930B (en) | Video action recognition method based on spatiotemporal enhancement network | |
CN110084221B (en) | Serialized human face key point detection method with relay supervision based on deep learning | |
CN111192292A (en) | Target tracking method based on attention mechanism and twin network and related equipment | |
CN109410247A (en) | A kind of video tracking algorithm of multi-template and adaptive features select | |
CN109993137A (en) | A fast face correction method based on convolutional neural network | |
CN111489364A (en) | Medical image segmentation method based on lightweight full convolution neural network | |
CN111612024B (en) | Feature extraction method, device, electronic equipment and computer readable storage medium | |
CN111639524A (en) | Automatic driving image semantic segmentation optimization method | |
CN110866938B (en) | A fully automatic video moving object segmentation method | |
CN113298036B (en) | Method for dividing unsupervised video target | |
CN111460915B (en) | Light weight neural network-based finger vein verification method and system | |
CN113420794A (en) | Binaryzation Faster R-CNN citrus disease and pest identification method based on deep learning | |
CN113449671B (en) | Pedestrian re-recognition method and device based on multi-scale multi-feature fusion | |
CN114445665B (en) | Hyperspectral image classification method based on non-local U-shaped network enhanced by Transformer | |
CN111860278A (en) | A human action recognition algorithm based on deep learning | |
CN111931722B (en) | Correlated filtering tracking method combining color ratio characteristics | |
US12307746B1 (en) | Nighttime unmanned aerial vehicle object tracking method fusing hybrid attention mechanism | |
CN105046202A (en) | Adaptive face identification illumination processing method | |
CN116229154A (en) | Class increment image classification method based on dynamic hybrid model | |
CN111612802A (en) | A re-optimization training method and application based on existing image semantic segmentation model | |
CN114972435A (en) | Object Tracking Method Based on Long-Short-Time Integrated Appearance Update Mechanism | |
CN114897938A (en) | Improved cosine window correlation filtering target tracking method |
CN109543684B (en) | Real-time target tracking detection method and system based on full convolution neural network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
CB03 | Change of inventor or designer information | ||
CB03 | Change of inventor or designer information |
Inventor after: Deng Lizhen Inventor after: Sun Jiawei Inventor after: Zhu Hu Inventor before: Sun Jiawei Inventor before: Deng Lizhen Inventor before: Zhu Hu |
|
GR01 | Patent grant | ||
GR01 | Patent grant |