CN107368785A - Video target tracking method with multi-kernel local constraints - Google Patents

Video target tracking method with multi-kernel local constraints

Info

Publication number
CN107368785A
CN107368785A
Authority
CN
China
Prior art keywords
sample
target
tracking
local constraint
multi-kernel
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710455426.XA
Other languages
Chinese (zh)
Inventor
王仁芳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Wanli College
Original Assignee
Zhejiang Wanli College
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Wanli College filed Critical Zhejiang Wanli College
Priority to CN201710455426.XA priority Critical patent/CN107368785A/en
Publication of CN107368785A publication Critical patent/CN107368785A/en
Pending legal-status Critical Current

Links

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00: Scenes; Scene-specific elements
    • G06V 20/40: Scenes; Scene-specific elements in video content
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/24: Classification techniques
    • G06F 18/241: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F 18/2411: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The present invention provides a video target tracking method with multi-kernel local constraints, comprising: adopting a locally constrained linear coding method that introduces the local structure of the sample data into the collaborative representation method, so as to obtain a sample representation with good classification performance; using kernel functions to extend the collaborative representation to a multi-feature-fusion kernel space, which strengthens the class-discrimination ability of the dictionary and of the sparse representation coefficients with respect to the target features; and treating target tracking as a binary classification problem, taking the candidate target with the highest classifier score under the particle filter framework as the tracking target. The method enables accurate and robust tracking of video targets.

Description

Video Target Tracking Method with Multi-Kernel Local Constraints

Technical Field

The present invention relates to the technical field of computer vision, and in particular to a video target tracking method with multi-kernel local constraints.

Background Art

Visual target tracking is an important research topic in computer vision and is widely applied in visual navigation, human-computer interaction, intelligent transportation, video surveillance and other fields. It is the foundation of subsequent high-level video processing and applications such as target recognition, behavior analysis, video compression coding and scene understanding. However, occlusion, illumination changes, scale changes, abrupt motion and viewpoint changes in tracking videos make accurate and robust video target tracking a difficult and important task.

With the development of compressed sensing and sparse coding theory, sparse representation has been applied to video target tracking; its core idea is to treat tracking as a sparse representation problem under the particle filter framework. Among sparse representation trackers, the l1 tracking method is highly robust, but solving the l1-norm minimization problem is difficult and time-consuming. Collaborative sparse representation based on the l2 norm was therefore proposed and applied to target tracking. Although the reconstruction coefficients under the l2 norm are less sparse than those under the l1 norm, the projection matrix can be computed in advance rather than updated for every particle, which improves computational efficiency. However, the collaborative representation method is essentially linear, and tracking fails when the target undergoes nonlinear changes (strong illumination changes, sudden motion, violent background jitter).

Summary of the Invention

Aiming at the deficiencies of the prior art, the present invention provides a video target tracking method with multi-kernel local constraints that tracks video targets accurately and robustly.

The technical solution adopted by the present invention to achieve its object is as follows:

A video target tracking method with multi-kernel local constraints, comprising the following steps:

(1) Introduce the local structure of the sample data into the collaborative representation method to construct a locally constrained linear coding of the sample features. (2) Use the kernel method to construct a multi-kernel locally constrained collaborative coding of the sample features. (3) Based on a support vector machine (SVM), embed the classifier scores of the samples into the particle filter framework to track the video target. (4) Dynamically update the target template, the background template and the classifier according to changes of the target and background samples.

Preferably, in step (1), to obtain a sample representation with good classification performance, the local structure of the sample data is introduced into the collaborative representation to construct the locally constrained collaborative coding of sample features, specifically:

(a) Under the sparse representation framework, a test sample y is expressed as y = d_1x_1 + d_2x_2 + ... + d_nx_n = Dx, where the dictionary D = [d_1, d_2, ..., d_n] and each d_i is a dictionary atom. The objective function of its collaborative representation is min_x ||y - Dx||_2^2 + λ||x||_2^2, and minimization gives the collaborative representation of the sample as x* = (D^T D + λI)^{-1} D^T y = Py. The optimal solution x* is simply a linear projection of y, and P = (D^T D + λI)^{-1} D^T is independent of y, so the projection matrix P can be precomputed. This avoids the separate optimization that the l1 norm requires for every test sample and greatly improves the computation speed.
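As a sketch of the precomputation argument above (illustrative random data, not the patent's dictionaries), the projection matrix P = (D^T D + λI)^{-1} D^T is computed once and every test sample is then encoded with a single matrix-vector product:

```python
import numpy as np

rng = np.random.default_rng(0)
n_dim, n_atoms = 20, 10
D = rng.standard_normal((n_dim, n_atoms))   # dictionary of atoms d_1..d_n
lam = 0.1                                   # regularization weight lambda

# P is independent of the test sample, so it is computed once up front
P = np.linalg.inv(D.T @ D + lam * np.eye(n_atoms)) @ D.T

y = D @ rng.standard_normal(n_atoms)        # a test sample
x = P @ y                                   # collaborative code of y: one matvec
```

The code satisfies the normal equations D^T(y - Dx) = λx, which is a quick way to check the closed form.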

(b) The collaborative representation method is essentially a linear method that reconstructs candidate targets with non-local dictionary atoms. However, the local structure of the data often carries more information than the global structure; for example, a sample and its surrounding samples should have similar codes, so sample reconstruction with local constraints is more accurate. Introducing the local structure of the sample data into the collaborative representation, the objective function becomes min_x ||y - Dx||_2^2 + λ||Γx||_2^2, where Γ is a locally constrained diagonal matrix, and minimization gives the locally constrained collaborative code of the sample as x* = (D^T D + λΓ^T Γ)^{-1} D^T y.
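The locally constrained variant can be sketched the same way. The choice of the diagonal locality adaptor Γ below (normalized distances from y to each atom) is an assumption for illustration, since the text does not spell out how Γ is built:

```python
import numpy as np

rng = np.random.default_rng(1)
n_dim, n_atoms = 20, 10
D = rng.standard_normal((n_dim, n_atoms))
y = rng.standard_normal(n_dim)
lam = 0.1

# locality adaptor: atoms far from y are penalized more (one plausible choice)
dists = np.linalg.norm(D - y[:, None], axis=0)
Gamma = np.diag(dists / dists.max())

# minimize ||y - D x||^2 + lam * ||Gamma x||^2  ->  closed form:
x = np.linalg.solve(D.T @ D + lam * Gamma.T @ Gamma, D.T @ y)
```

Here the stationarity condition is D^T(y - Dx) = λΓ^TΓx, which the solution satisfies exactly.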

Preferably, in step (2), the multi-kernel locally constrained collaborative coding of the sample features is constructed based on the kernel method, specifically:

(a) The kernel method maps nonlinear data into a high-dimensional linear kernel space, where the image of a sample y is linearly represented by Φ = [φ(d_1), φ(d_2), ..., φ(d_n)]. This obviously increases the computational complexity, so it is necessary to reduce the dimensionality of the feature space. Using a random projection matrix P^T to map the high-dimensional data to a low-dimensional space, the objective function of locally constrained kernel collaborative coding becomes

min_x ||P^T φ(y) - P^T Φx||_2^2 + λ||Γx||_2^2   (1)

Setting its partial derivative with respect to x to zero gives

(P^T Φ)^T (P^T Φ) x + λΓ^T Γ x = (P^T Φ)^T P^T φ(y)   (2)

where x = [x_1, x_2, ..., x_n]^T is the n-dimensional coefficient vector. The solution of equation (1) can then be written as

x* = ((P^T Φ)^T P^T Φ + λΓ^T Γ)^{-1} (P^T Φ)^T P^T φ(y)   (3)

By the properties of the kernel function, writing P = ΦB and substituting into equations (2) and (3) gives

x* = (KBB^T K + λΓ^T Γ)^{-1} KBB^T k(·, y)

where K = Φ^T Φ is the kernel Gram matrix (a positive semi-definite symmetric matrix). Inner products in the feature space can be computed through the kernel function: for any two samples φ(x), φ(y), we have φ(x)^T φ(y) = (φ(x)·φ(y)) = k(x, y), so K(i, j) = k(d_i, d_j) and k(·, y) = [k(d_1, y), ..., k(d_n, y)]^T.

(b) In order to track the target stably, multiple features are used to describe the target. A fused kernel k(x, y) = Σ_m β_m k_m(x, y) with fusion weights β_m is obtained by multi-kernel fusion, from which the multi-kernel locally constrained collaborative code z of the sample features is computed.
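A numeric sketch of the multi-kernel coding under stated assumptions: RBF kernels stand in for the color and gradient kernels, the 0.5/0.5 fusion weights are a guess, and Γ is filled with arbitrary positive entries. It solves the closed form x = (KBB^TK + λΓ^TΓ)^{-1}KBB^T k(·, y):

```python
import numpy as np

rng = np.random.default_rng(2)
n_atoms, proj_dim, lam = 8, 4, 0.1

def rbf(X, Y, gamma):
    # Gram matrix of an RBF kernel between row-sample matrices X and Y
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

atoms = rng.standard_normal((n_atoms, 5))    # dictionary samples (rows)
y = rng.standard_normal((1, 5))              # test sample

# fused kernel: two feature kernels combined with fixed (assumed) weights
K = 0.5 * rbf(atoms, atoms, 0.5) + 0.5 * rbf(atoms, atoms, 2.0)
ky = (0.5 * rbf(atoms, y, 0.5) + 0.5 * rbf(atoms, y, 2.0)).ravel()

B = rng.standard_normal((n_atoms, proj_dim)) / np.sqrt(proj_dim)  # P = Phi B
Gamma = np.diag(rng.uniform(0.5, 1.0, n_atoms))                   # locality adaptor

A = K @ B @ B.T @ K + lam * Gamma.T @ Gamma
z = np.linalg.solve(A, K @ B @ B.T @ ky)     # multi-kernel code of y
```

Since λΓ^TΓ is positive definite, the system matrix A is invertible even though KBB^TK is low rank.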

The present invention uses a spatial color histogram and a spatial gradient histogram to represent the target. Each target sample is divided into four sub-regions of equal area; the color features of the sub-regions are extracted separately and concatenated into the target color feature h_c. For the gradient feature, the image is first filtered with the kernels [-0.5, 0, 0.5] and [-0.5, 0, 0.5]^T to obtain the image gradients; the same scheme then yields the four sub-region histograms of oriented gradients, which are concatenated into the spatial gradient histogram h_g. Let K_c and K_g be the kernel matrices of the color and gradient features respectively; each element of K_c and K_g measures the similarity between two histograms. The elements of K_c are computed as K_c(i, j) = BhaCoff(h_c^i, h_c^j), where h_c^i and h_c^j are two color histograms and BhaCoff(·) is the Bhattacharyya coefficient function; K_g, K_c(·, y) and K_g(·, y) are computed in the same way as K_c.
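The descriptor construction can be sketched as follows; bin counts, the value range and the use of gradient magnitude (rather than orientation binning) are simplifications for illustration:

```python
import numpy as np

def bhattacharyya(h1, h2):
    # Bhattacharyya coefficient: 1.0 for identical normalized histograms
    return float(np.sum(np.sqrt(h1 * h2)))

def spatial_hist(patch, bins=8):
    # concatenate per-quadrant histograms into one spatial histogram
    H, W = patch.shape
    quads = [patch[:H//2, :W//2], patch[:H//2, W//2:],
             patch[H//2:, :W//2], patch[H//2:, W//2:]]
    hs = [np.histogram(q, bins=bins, range=(0, 1))[0].astype(float) for q in quads]
    h = np.concatenate(hs)
    return h / h.sum()

patch = np.random.default_rng(3).random((16, 16))
# image gradients via the [-0.5, 0, 0.5] filters, as in the text
gx = np.apply_along_axis(lambda r: np.convolve(r, [-0.5, 0, 0.5], 'same'), 1, patch)
gy = np.apply_along_axis(lambda c: np.convolve(c, [-0.5, 0, 0.5], 'same'), 0, patch)
mag = np.hypot(gx, gy)                       # gradient magnitude image
hc = spatial_hist(patch)                     # color-like spatial histogram
```

A Gram matrix K_c can then be filled by evaluating `bhattacharyya` on every pair of sample histograms.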

Preferably, in step (3), based on the support vector machine SVM, the classifier scores of the samples are embedded into the particle filter framework to track the video target, specifically:

(a) A certain number of target samples and background samples are extracted around the target region of the first frame according to a Gaussian distribution; the target samples and background samples serve as the positive and negative samples respectively. The multi-kernel locally constrained collaborative codes z_i of the positive and negative samples under the current dictionary are computed, and the codes together with their corresponding labels l_i ∈ {+1, -1} are fed to the support vector machine SVM for training. The classifier is learned by minimizing the cost function min_{w,b} (1/2)||w||^2 + C Σ_i max(0, 1 - l_i(w^T z_i + b)), and the score of the classifier is computed as f(z) = w^T z + b. (4)
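A minimal stand-in for the SVM step, assuming the standard hinge-loss formulation (the text does not name a specific solver), trained by batch subgradient descent on synthetic positive and negative codes:

```python
import numpy as np

rng = np.random.default_rng(4)
pos = rng.standard_normal((30, 5)) + 2.0     # codes z_i of target samples
neg = rng.standard_normal((30, 5)) - 2.0     # codes z_i of background samples
Z = np.vstack([pos, neg])
labels = np.array([1.0] * 30 + [-1.0] * 30)  # l_i in {+1, -1}

w, b, lr, C = np.zeros(5), 0.0, 0.01, 1.0
for _ in range(200):
    # subgradient of (1/2)||w||^2 + C * sum of hinge losses
    margins = labels * (Z @ w + b)
    mask = margins < 1                        # margin violators
    w -= lr * (w - C * (labels[mask, None] * Z[mask]).sum(0))
    b += lr * C * labels[mask].sum()

def score(z):
    return z @ w + b                          # classifier score f(z) = w.z + b
```

On these well-separated clusters the learned hyperplane classifies the training codes almost perfectly.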

(b) The classification is embedded into the particle filter framework to track the target. The particle filter is a Bayesian sequential importance sampling technique, often used to estimate the posterior density of the state variables of a dynamic system. Let s_t denote the target state variable at time t. Given the target observation set y_{1:t} = {y_1, y_2, ..., y_t}, the current state s_t is determined by the maximum a posteriori estimate: s_t* = argmax_{s_t} p(s_t | y_{1:t}).

The recursive computation of the posterior probability consists of two steps, the prediction step (5) and the update step (6):

p(s_t | y_{1:t-1}) = ∫ p(s_t | s_{t-1}) p(s_{t-1} | y_{1:t-1}) ds_{t-1}   (5)

p(s_t | y_{1:t}) ∝ p(y_t | s_t) p(s_t | y_{1:t-1})   (6)

where p(s_t | s_{t-1}) is the state transition probability that describes the dynamic model, and p(y_t | s_t) is the observation likelihood that describes the observation model. The target motion between adjacent frames is defined by six affine transformation parameters of the image: let s_t = (υ_1, υ_2, υ_3, υ_4, t_x, t_y), where υ_1, υ_2, υ_3, υ_4 denote the rotation angle, scale, aspect ratio and skew direction respectively, and (t_x, t_y) are the 2D position parameters. The state transition between adjacent frames is then modeled by a Gaussian distribution as

p(s_t | s_{t-1}) = N(s_t; s_{t-1}, Σ)

where N(·) is the Gaussian distribution function and Σ is a diagonal covariance matrix whose diagonal elements are the variances of the corresponding motion parameters in s. Based on the learned classifier, the observation model is defined as p(y|s) ∝ f(z), where f(z) is the classification score computed by formula (4); the candidate target with the highest score is taken as the tracking result.
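One particle-filter step under the affine state model above can be sketched as follows; the scoring function here is a placeholder for the learned classifier score f(z), and the state values and noise scales are illustrative:

```python
import numpy as np

rng = np.random.default_rng(5)
n_particles = 100
# state s = (rotation, scale, aspect ratio, skew, t_x, t_y)
state = np.array([0.0, 1.0, 1.0, 0.0, 50.0, 60.0])
sigma = np.array([0.02, 0.01, 0.01, 0.01, 4.0, 4.0])   # per-parameter std dev

# prediction: propagate particles with the Gaussian transition N(s_t; s_{t-1}, Sigma)
particles = state + rng.standard_normal((n_particles, 6)) * sigma

def f(s):
    # placeholder observation score: prefers candidates near position (52, 63)
    return -np.hypot(s[4] - 52.0, s[5] - 63.0)

# update: score every candidate and take the MAP particle as the tracked state
scores = np.array([f(p) for p in particles])
best = particles[np.argmax(scores)]
```

In the patent's setting, `f` would evaluate the SVM score of the candidate's multi-kernel code instead of this geometric placeholder.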

Preferably, in step (4), to achieve robust video target tracking, the target template, the background template and the classifier are dynamically updated, specifically:

(a) For the update of the target template D_f, let α be the coefficient vector of the new tracking result y on the target dictionary template, let s be the Bhattacharyya coefficient between y and the atom corresponding to the largest coefficient in α, and let s_i be the Bhattacharyya coefficient between y and each atom of the target template D_f, with s_m its minimum. Two thresholds τ_1 < τ_2 are set. If s > τ_2, the tracking result is well represented by the target template; if s < τ_1, the tracked target has undergone a strong appearance change, and the target sample corresponding to s_m is replaced by y.

(b) For the update of the background template D_b, after the tracking target is determined in the current frame, M_B background samples are selected around the target region according to a Gaussian distribution and randomly replace samples in the current background dictionary.

(c) For the update of the classifier, the samples in the currently updated target template D_f and background template D_b, i.e. the current positive and negative samples, are fed to the classifier for training to obtain the current classifier.
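The target-template update rule can be sketched as below. It simplifies the text slightly by using the maximum Bhattacharyya similarity as s, and the thresholds are illustrative:

```python
import numpy as np

def bha(h1, h2):
    # Bhattacharyya coefficient between two normalized histograms
    return float(np.sum(np.sqrt(h1 * h2)))

def update_target_template(Df, y, tau1=0.4):
    # replace the least-similar atom by y only under a strong appearance change
    sims = np.array([bha(atom, y) for atom in Df])
    s = sims.max()
    if s < tau1:                     # s < tau1: strong appearance change
        Df = Df.copy()
        Df[sims.argmin()] = y        # swap out the atom with minimum similarity
    return Df                        # otherwise the template is kept as-is

rng = np.random.default_rng(6)
Df = rng.dirichlet(np.ones(16), size=5)   # 5 histogram atoms (rows sum to 1)
y_new = rng.dirichlet(np.ones(16))        # new tracking result as a histogram
```

With tau1 above 1 the replacement always fires (the coefficient is at most 1), which makes the branch easy to exercise.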

Beneficial effects of the invention:

The present invention adopts a locally constrained linear coding method that introduces the local structure of the sample data into the collaborative representation method, obtaining a sample representation with good classification performance. Kernel functions extend the collaborative representation to a multi-feature-fusion kernel space, which strengthens the class-discrimination ability of the dictionary and of the sparse representation coefficients with respect to the target features. Under the particle filter framework, the candidate target with the highest binary classifier score is taken as the tracking target, so video targets can be tracked accurately and robustly.

Brief Description of the Drawings

Figures 1-4 show the video target tracking results of the present invention and comparisons with four other methods: the IVT method proposed by Ross in 2008, the L1 method proposed by Mei in 2011, the VTD method proposed by Kwon in 2010, and the MIL method proposed by Babenko in 2009.

Note: In Figures 1-4, each rectangular box marks the result of one method; the boxes are labeled A, B, C, D and E, which denote the following methods:

A (IVT); B (L1); C (VTD); D (MIL); E (the present invention)

Figure 1 - Occlusion2 sequence (the target undergoes rotation and occlusion changes). The tracking results in Figure 1 show that the present invention tracks the target accurately under occlusion and rotation; IVT and VTD fail (e.g. #501), MIL tracks the target but cannot estimate the rotation change, and the L1 tracker drifts.

Figure 2 - DavidIndoor sequence (as the target moves forward and backward, the head rotates, the target size changes with distance, and the background illumination changes). Figure 3 - Car11 sequence (complex background, blurred target and changing target size). Figures 2 and 3 show that the IVT algorithm outperforms the other three compared algorithms and adapts to rotation but adapts poorly to scale changes; the present invention outperforms the others in adapting to both target rotation and scale changes.

Figure 4 - Deer sequence (the target is blurred by its fast motion). Figure 4 shows that only the present invention and the VTD algorithm can locate the target; the other algorithms lose it (e.g. #40), and the VTD tracking drifts. The present invention achieves accurate and robust target tracking in complex environments.

Detailed Description

The technical solution of the present invention is described in further detail below through specific embodiments in conjunction with the accompanying drawings.

Referring to Figures 1-4, a video target tracking method with multi-kernel local constraints comprises:

(1) Introduce the local structure of the sample data into the collaborative representation method to construct a locally constrained linear coding of the sample features. Specifically: (a) under the sparse representation framework, a test sample y = d_1x_1 + d_2x_2 + ... + d_nx_n = Dx has the collaborative representation objective min_x ||y - Dx||_2^2 + λ||x||_2^2, and minimization gives the collaborative representation x* = (D^T D + λI)^{-1} D^T y; (b) introducing the local structure of the sample data into the collaborative representation, the objective function becomes min_x ||y - Dx||_2^2 + λ||Γx||_2^2, and minimization gives the locally constrained collaborative code x* = (D^T D + λΓ^T Γ)^{-1} D^T y.

(2) Use the kernel method to construct the multi-kernel locally constrained collaborative coding of the sample features. Specifically: (a) the image of sample y in the high-dimensional space is linearly represented by Φ = [φ(d_1), φ(d_2), ..., φ(d_n)]; a random projection matrix P^T maps the high-dimensional data to a low-dimensional space, giving the objective function of locally constrained kernel collaborative coding (formula (1)); setting its partial derivative to zero and, by the properties of the kernel function, writing P = ΦB yields the solution x* = (KBB^T K + λΓ^T Γ)^{-1} KBB^T k(·, y).

(b) A fused kernel k(x, y) = Σ_m β_m k_m(x, y) is obtained by multi-kernel fusion. The target is represented by a spatial color histogram and a spatial gradient histogram: each target sample is divided into four sub-regions of equal area, whose color features are extracted and concatenated into the target color feature h_c. For the gradient feature, the image is first filtered with the kernels [-0.5, 0, 0.5] and [-0.5, 0, 0.5]^T to obtain the image gradients; the same scheme yields the four sub-region histograms of oriented gradients, which are concatenated into the spatial gradient histogram h_g. The elements of K_c are computed as K_c(i, j) = BhaCoff(h_c^i, h_c^j), and K_g, K_c(·, y), K_g(·, y) are computed in the same way as K_c. The multi-kernel locally constrained collaborative code z of the sample features is then computed.

(3) Based on the SVM, embed the classifier scores of the samples into the particle filter framework to track the video target. Specifically: (a) extract target samples (positive samples) and background samples (negative samples) around the target region of the first frame according to a Gaussian distribution, compute the multi-kernel locally constrained collaborative codes z_i of the positive and negative samples under the current dictionary, feed the codes and their corresponding labels l_i ∈ {+1, -1} to the SVM for training, learn the classifier by minimizing the cost function, and obtain the classifier score f(z) from formula (4); (b) based on the learned classifier, define the observation model under the particle filter framework as p(y|s) ∝ f(z) and take the candidate target with the highest score as the tracking result.

(4) Dynamically update the target template, the background template and the classifier according to changes of the target and background samples. Specifically: (a) update of the target template D_f: if s > τ_2, the tracking result is well represented by the target template and D_f is retained; if s < τ_1, the tracked target has undergone a strong appearance change, and the target sample corresponding to s_m is replaced by y; (b) update of the background template D_b: after the tracking target is determined in the current frame, M_B background samples are selected around the target region according to a Gaussian distribution and randomly replace samples in the current background dictionary; (c) update of the classifier: the samples in the currently updated target template D_f and background template D_b are fed to the classifier for training to obtain the current classifier.

Referring to Figures 1-4, which compare the tracking results of the present invention with those of the four other methods, the present invention outperforms the other four methods under target motion blur, scale change, fast motion, occlusion and illumination change, achieving accurate and robust target tracking.

Claims (5)

1. A video target tracking method with multi-kernel local constraints, characterized in that the method comprises the following steps: (1) introducing the local structure of the sample data into the collaborative representation method to construct a locally constrained linear coding of the sample features; (2) using the kernel method to construct a multi-kernel locally constrained collaborative coding of the sample features; (3) based on a support vector machine, embedding the classifier scores of the samples into the particle filter framework to track the video target; (4) dynamically updating the target template, the background template and the classifier according to changes of the target and background samples.
2. The video target tracking method with multi-kernel local constraints according to claim 1, characterized in that in step (1), to obtain a sample representation with good classification performance, the local structure of the sample data is introduced into the collaborative representation to construct the locally constrained collaborative coding of sample features, specifically: (a) under the sparse representation framework, a test sample y is expressed as y = d_1x_1 + d_2x_2 + ... + d_nx_n = Dx, where the dictionary D = [d_1, d_2, ..., d_n]; the objective function of its collaborative representation is min_x ||y - Dx||_2^2 + λ||x||_2^2, and minimization gives the collaborative representation x* = (D^T D + λI)^{-1} D^T y; (b) the local structure of the sample data is introduced into the collaborative representation, whose objective function is min_x ||y - Dx||_2^2 + λ||Γx||_2^2, where Γ is a locally constrained diagonal matrix, and minimization gives the locally constrained collaborative code x* = (D^T D + λΓ^T Γ)^{-1} D^T y.
3. The video target tracking method with multi-kernel local constraints according to claim 1, characterized in that in step (2) the multi-kernel locally constrained collaborative coding of the sample features is constructed based on the kernel method, specifically: (a) the kernel method maps nonlinear data into a high-dimensional linear kernel space, where the image of a sample y in the high-dimensional space is linearly represented by Φ = [φ(d_1), φ(d_2), ..., φ(d_n)]; a random projection matrix P^T then maps the high-dimensional data to a low-dimensional space, and the objective function of locally constrained kernel collaborative coding is min_x ||P^T φ(y) - P^T Φx||_2^2 + λ||Γx||_2^2; setting its partial derivative to zero and, by the properties of the kernel function, writing P = ΦB gives the solution x* = (KBB^T K + λΓ^T Γ)^{-1} KBB^T k(·, y), where K = Φ^T Φ is the kernel Gram matrix; (b) in order to track the target stably, multiple features are used to describe the target, and a fused kernel k(x, y) = Σ_m β_m k_m(x, y) is obtained by multi-kernel fusion, from which the multi-kernel locally constrained collaborative code of the sample features is computed.
4. The video target tracking method with multi-kernel local constraints according to claim 1, characterized in that in step (3), based on the support vector machine, the classifier scores of the samples are embedded into the particle filter framework to track the video target, specifically: (a) a certain number of target samples and background samples are extracted around the target region of the first frame according to a Gaussian distribution, the target samples and background samples serving as the positive and negative samples respectively; the multi-kernel locally constrained collaborative codes z_i of the positive and negative samples under the current dictionary are computed, and the codes together with their corresponding labels l_i ∈ {+1, -1} are fed to the support vector machine for training; the classifier is learned by minimizing the cost function min_{w,b} (1/2)||w||^2 + C Σ_i max(0, 1 - l_i(w^T z_i + b)), and the classifier score is computed as f(z) = w^T z + b; (b) based on the learned classifier, the observation model under the particle filter framework is defined as p(y|s) ∝ f(z), and the candidate target with the top score is the tracking result.
5. The video target tracking method with multi-kernel local constraints according to claim 1, characterized in that in step (4), to achieve robust video target tracking, the target template, the background template and the classifier are dynamically updated, specifically: (a) for the update of the target template D_f, D_f is retained if the tracking result y is well represented by the target template; otherwise the dictionary atom in D_f with the minimum Bhattacharyya coefficient to y is found and replaced by y; (b) for the update of the background template D_b, after the tracking target is determined in the current frame, M_B background samples are selected around the target region according to a Gaussian distribution and randomly replace samples in the current background dictionary; (c) for the update of the classifier, the samples in the currently updated target template D_f and background template D_b are fed to the classifier for training to obtain the current classifier.
CN201710455426.XA 2017-06-16 2017-06-16 The video target tracking method of multinuclear local restriction Pending CN107368785A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710455426.XA CN107368785A (en) 2017-06-16 2017-06-16 The video target tracking method of multinuclear local restriction

Publications (1)

Publication Number Publication Date
CN107368785A true CN107368785A (en) 2017-11-21

Family

ID=60306515

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710455426.XA Pending CN107368785A (en) 2017-06-16 2017-06-16 The video target tracking method of multinuclear local restriction

Country Status (1)

Country Link
CN (1) CN107368785A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108108719A (en) * 2018-01-05 2018-06-01 Chongqing University of Posts and Telecommunications Hyperspectral image classification method based on weighted-kernel sparse and collaborative representation coefficients
CN109348410A (en) * 2018-11-16 2019-02-15 电子科技大学 Indoor localization method based on global and local joint constraint transfer learning
CN112578675A (en) * 2021-02-25 2021-03-30 中国人民解放军国防科技大学 High-dynamic vision control system and task allocation and multi-core implementation method thereof

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103440645A (en) * 2013-08-16 2013-12-11 东南大学 Target tracking algorithm based on self-adaptive particle filter and sparse representation
CN103544483A (en) * 2013-10-25 2014-01-29 合肥工业大学 United target tracking method based on local sparse representation and system thereof
CN104766343A (en) * 2015-03-27 2015-07-08 电子科技大学 Vision target tracking method based on sparse representation
KR101542206B1 (en) * 2014-04-24 2015-08-12 (주)디브이알씨앤씨 Method and system for tracking with extraction object using coarse to fine techniques

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
LAI WEI等: "Kernel locality-constrained collaborative representation based discriminant analysis", 《 KNOWLEDGE-BASED SYSTEMS 》 *
LINGFENG WANG等: "Visual Tracking Via Kernel Sparse Representation With Multikernel Fusion", 《IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY》 *
LIN DONGDONG: "Research on object tracking based on sparse representation and machine learning", 《CHINA MASTER'S THESES FULL-TEXT DATABASE, INFORMATION SCIENCE AND TECHNOLOGY SERIES》 *

Similar Documents

Publication Publication Date Title
Zahra et al. Person re-identification: A retrospective on domain specific open challenges and future trends
Von Stumberg et al. Gn-net: The gauss-newton loss for multi-weather relocalization
Yang et al. An improving faster-RCNN with multi-attention ResNet for small target detection in intelligent autonomous transport with 6G
Xiong et al. Spatiotemporal modeling for crowd counting in videos
Zhong et al. Robust object tracking via sparse collaborative appearance model
Luo et al. Spatio-temporal feature extraction and representation for RGB-D human action recognition
Ma et al. Generalized pooling for robust object tracking
Zhang et al. Image object detection and semantic segmentation based on convolutional neural network
Zhang et al. Visual tracking via Boolean map representations
Chen et al. Multitarget tracking in nonoverlapping cameras using a reference set
Li et al. Visual tracking with spatio-temporal Dempster–Shafer information fusion
Hu et al. Surveillance video face recognition with single sample per person based on 3D modeling and blurring
CN103714556A (en) Moving target tracking method based on pyramid appearance model
Hwang et al. Lidar depth completion using color-embedded information via knowledge distillation
Fan et al. HCPVF: Hierarchical cascaded point-voxel fusion for 3D object detection
Deng et al. Detail preserving coarse-to-fine matching for stereo matching and optical flow
Liu et al. AnchorPoint: Query design for transformer-based 3D object detection and tracking
CN107368785A (en) The video target tracking method of multinuclear local restriction
Kumar et al. Small and dim target detection in ir imagery: A review
Liu et al. Unsupervised spike depth estimation via cross-modality cross-domain knowledge transfer
Guo et al. MDSFE: Multiscale deep stacking fusion enhancer network for visual data enhancement
Dang et al. Adaptive sparse memory networks for efficient and robust video object segmentation
Taylor et al. Pose-sensitive embedding by nonlinear nca regression
Chen et al. An improved BIM aided indoor localization method via enhancing cross-domain image retrieval based on deep learning
Shih et al. Video interpolation and prediction with unsupervised landmarks

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20171121