CN107368785A - Multi-kernel locality-constrained video target tracking method - Google Patents
Multi-kernel locality-constrained video target tracking method
- Publication number
- CN107368785A, CN201710455426.XA, CN201710455426A
- Authority
- CN
- China
- Prior art keywords
- sample
- target
- tracking
- local constraint
- multi-kernel
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2411—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Bioinformatics & Computational Biology (AREA)
- General Engineering & Computer Science (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Multimedia (AREA)
- Image Analysis (AREA)
Abstract
The present invention provides a multi-kernel locality-constrained video target tracking method, comprising: adopting a locality-constrained linear coding method that introduces the local structure of the sample data into the collaborative representation, so as to obtain a sample representation with good classification performance; using kernel functions to extend the collaborative representation to a multi-feature-fusion kernel space, which strengthens the class-discrimination ability of the dictionary and the sparse representation coefficients with respect to the target features; and treating target tracking as a binary classification problem, taking the candidate with the highest classifier score as the tracking target within a particle filter framework. The method tracks video targets accurately and robustly.
Description
Technical Field
The present invention relates to the technical field of computer vision, and in particular to a multi-kernel locality-constrained video target tracking method.
Background Art
Visual target tracking is an important research topic in computer vision. It is widely applied in visual navigation, human-computer interaction, intelligent transportation, video surveillance, and other fields, and it underpins subsequent high-level video processing and applications such as target recognition, behavior analysis, video and image compression coding, and content understanding. However, occlusion, illumination changes, scale changes, abrupt motion, and viewpoint changes in tracking videos make accurate and robust video target tracking a very challenging and important task.
With the development of compressed sensing theory and sparse coding theory, sparse representation has been applied to video target tracking; its core idea is to cast tracking as a sparse representation problem under a particle filter framework. Among sparse representation trackers, the $l_1$ tracker is strongly robust, but solving the $l_1$-norm minimization problem is difficult and time-consuming. Collaborative sparse representation based on the $l_2$ norm was therefore proposed and applied to target tracking: although the reconstruction coefficients under the $l_2$ norm are less sparse than those under the $l_1$ norm, the projection matrix can be computed in advance and need not be updated for every particle, which improves computational efficiency. However, collaborative representation is essentially a linear method, and tracking fails when the target undergoes nonlinear changes (strong illumination change, sudden motion, violent background jitter).
Summary of the Invention
Aiming at the deficiencies of the prior art, the present invention provides a multi-kernel locality-constrained video target tracking method that tracks video targets accurately and robustly.
The technical solution adopted by the present invention to achieve its object is as follows:
A multi-kernel locality-constrained video target tracking method comprises the following steps:
(1) Introduce the local structure of the sample data into the collaborative representation method to construct a locality-constrained linear coding of the sample features; (2) use the kernel method to construct a multi-kernel locality-constrained collaborative coding of the sample features; (3) based on a support vector machine (SVM), embed the classifier scores of the samples into a particle filter framework to track the video target; (4) dynamically update the target template, the background template, and the classifier according to changes in the target and background samples.
Preferably, in step (1), in order to obtain a sample representation with good classification performance, the local structure of the sample data is introduced into the collaborative representation to construct a locality-constrained collaborative coding of the sample features, specifically:
(a) Under the sparse representation framework, a test sample $y$ is expressed as $y = d_1x_1 + d_2x_2 + \dots + d_nx_n = Dx$, where the dictionary $D = [d_1, d_2, \dots, d_n]$ collects the dictionary atoms. The objective function of its collaborative representation is $\hat{x} = \arg\min_x \|y - Dx\|_2^2 + \lambda\|x\|_2^2$, and minimizing it yields the collaborative code of the sample $\hat{x} = (D^TD + \lambda I)^{-1}D^Ty = Py$. The optimal solution $\hat{x}$ is simply a linear projection of $y$, and $P = (D^TD + \lambda I)^{-1}D^T$ is independent of $y$, so the projection matrix $P$ can be precomputed; this avoids the separate optimization that every test sample requires under the $l_1$ norm and greatly improves speed.
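As an illustration, a minimal numpy sketch of this precomputation follows; the regularization value and dictionary layout are assumptions for the example, not values from the patent:

```python
import numpy as np

def collaborative_projection(D, lam=0.01):
    """Precompute P = (D^T D + lam*I)^(-1) D^T for a dictionary D whose
    columns are template atoms. P does not depend on the test sample,
    so it is computed once per dictionary update and reused."""
    n = D.shape[1]
    return np.linalg.solve(D.T @ D + lam * np.eye(n), D.T)

# the collaborative code of any candidate y is then a single product:
# x_hat = collaborative_projection(D) @ y
```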
(b) Collaborative representation is essentially a linear method that reconstructs the candidate target with non-local dictionary atoms. However, the local structure of data often carries more information than the global structure; for example, a sample and the samples around it should have similar codes, so reconstruction under a locality constraint is more accurate. Introducing the local structure of the sample data into the collaborative representation gives the objective function $\hat{x} = \arg\min_x \|y - Dx\|_2^2 + \lambda\|\Gamma x\|_2^2$, where $\Gamma$ is a locality-constrained diagonal matrix; minimizing it yields the locality-constrained collaborative code of the sample $\hat{x} = (D^TD + \lambda\Gamma^T\Gamma)^{-1}D^Ty$.
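A sketch of the locality-constrained variant, assuming an LLC-style exponential distance weighting for the diagonal of $\Gamma$ (the patent states only that $\Gamma$ is a locality-constrained diagonal matrix):

```python
import numpy as np

def locality_constrained_code(D, y, lam=0.01, sigma=1.0):
    """Compute x = (D^T D + lam * G^T G)^(-1) D^T y, where G is a
    diagonal locality matrix penalising dictionary atoms far from y."""
    dists = np.linalg.norm(D - y[:, None], axis=0)  # distance from y to each atom
    g = np.exp(dists / sigma)                       # distant atoms get larger penalties (assumed weighting)
    return np.linalg.solve(D.T @ D + lam * np.diag(g ** 2), D.T @ y)
```

Because $\Gamma$ depends on the candidate $y$, this system is solved once per sample, whereas the plain collaborative projection above is precomputed once per dictionary.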
Preferably, in step (2), a multi-kernel locality-constrained collaborative coding of the sample features is constructed based on the kernel method, specifically:
(a) The kernel method maps nonlinear data into a high-dimensional linear kernel space, where the image of the sample $y$ is represented linearly by $\Phi = [\phi(d_1), \phi(d_2), \dots, \phi(d_n)]$. This clearly increases the computational complexity, so it is necessary to reduce the dimensionality of the feature space. Using a random projection matrix $P^T$ to map the high-dimensional data to a low-dimensional space, the objective function of the locality-constrained kernel collaborative coding becomes

$$\hat{x} = \arg\min_x \|P^T\phi(y) - P^T\Phi x\|_2^2 + \lambda\|\Gamma x\|_2^2. \quad (1)$$
Setting its partial derivative with respect to $x$ to zero gives

$$(\Phi^TPP^T\Phi + \lambda\Gamma^T\Gamma)\,x = \Phi^TPP^T\phi(y), \quad (2)$$

where $[x_1, x_2, \dots, x_n]^T$ is the $n$-dimensional coefficient vector $x$. The solution of formula (1) can then be written as

$$\hat{x} = (\Phi^TPP^T\Phi + \lambda\Gamma^T\Gamma)^{-1}\Phi^TPP^T\phi(y). \quad (3)$$

By the properties of the kernel function, write $P = \Phi B$ and substitute into formulas (2) and (3) to obtain, respectively,

$$(KBB^TK + \lambda\Gamma^T\Gamma)\,x = KBB^Tk(\cdot,y) \quad \text{and} \quad \hat{x} = (KBB^TK + \lambda\Gamma^T\Gamma)^{-1}KBB^Tk(\cdot,y),$$

where $K = \Phi^T\Phi$ is the kernel Gram matrix (a positive semi-definite symmetric matrix) and $k(\cdot,y) = \Phi^T\phi(y)$. Inner products of samples in the feature space are computed through the kernel function: for any two samples $\phi(x), \phi(y)$ there holds $\phi(x)^T\phi(y) = (\phi(x)\cdot\phi(y)) = k(x,y)$, so every entry of $K$ and of $k(\cdot,y)$ is evaluated directly from the kernel.
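The closed form above translates directly into code. A sketch, taking the fused Gram matrix K, the kernel vector k_y, the projection coefficients B, and the locality diagonal g as given:

```python
import numpy as np

def kernel_lcc_code(K, k_y, B, g, lam=0.01):
    """Multi-kernel locality-constrained collaborative code:
    x = (K B B^T K + lam * G^T G)^(-1) K B B^T k_y.

    K   : (n, n) fused kernel Gram matrix, K[i, j] = k(d_i, d_j)
    k_y : (n,)   fused kernel vector, k_y[i] = k(d_i, y)
    B   : (n, r) coefficients of the random projection P = Phi @ B
    g   : (n,)   diagonal of the locality matrix Gamma
    """
    KB = K @ B                                    # (n, r); K is symmetric
    A = KB @ KB.T + lam * np.diag(g ** 2)         # K B B^T K + lam * G^T G
    return np.linalg.solve(A, KB @ (B.T @ k_y))   # = A^(-1) K B B^T k_y
```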
(b) To track the target stably, multiple features are used to describe it, and a fused kernel function is obtained by multi-kernel fusion as a weighted combination of the per-feature kernels, $k(x,y) = \omega_c k_c(x,y) + \omega_g k_g(x,y)$; the multi-kernel locality-constrained collaborative code of the sample features is then computed from the fused kernel with the solution above.
The present invention represents the target with a spatial color histogram and a spatial gradient histogram. Each target sample is divided into four sub-regions of equal area; the color features of the sub-regions are extracted separately and concatenated into the target color feature $h_c$. For the gradient feature, the image is first filtered with the kernels $[-0.5, 0, 0.5]$ and $[-0.5, 0, 0.5]^T$ to obtain the image gradients; the same procedure then yields the four sub-region histograms of oriented gradients, which are concatenated into the spatial gradient histogram $h_g$. Let $K_c$ and $K_g$ be the kernel matrices of the color and gradient features, respectively; each element of $K_c$ and $K_g$ measures the similarity between two histograms. The elements of $K_c$ are computed as $K_c(i,j) = \mathrm{BhaCoff}(h_c^i, h_c^j)$, where $h_c^i$ and $h_c^j$ are two color histograms and $\mathrm{BhaCoff}(\cdot)$ is the Bhattacharyya coefficient function; $K_g$, $K_c(\cdot,y)$, and $K_g(\cdot,y)$ are computed in the same way as $K_c$.
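A sketch of the feature kernels and their fusion; the equal fusion weight is an assumption, since the patent does not show its fusion coefficients. Because the Bhattacharyya coefficient is a sum of element-wise square roots, each kernel matrix reduces to a single matrix product:

```python
import numpy as np

def fused_kernel(Hc, Hg, w=0.5):
    """Hc, Hg: (n, bins) arrays, one spatial colour / gradient histogram
    per sample (L1-normalised rows). Returns the fused kernel matrix."""
    Kc = np.sqrt(Hc) @ np.sqrt(Hc).T   # Kc[i, j] = BhaCoff = sum_k sqrt(Hc[i, k] * Hc[j, k])
    Kg = np.sqrt(Hg) @ np.sqrt(Hg).T
    return w * Kc + (1.0 - w) * Kg     # assumed equal-weight fusion
```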
Preferably, in step (3), based on the support vector machine (SVM), the classifier scores of the samples are embedded into the particle filter framework to track the video target, specifically:
(a) Around the target region in the first frame, a number of target samples and background samples (the positive and negative samples, respectively) are drawn according to a Gaussian distribution, and the multi-kernel locality-constrained collaborative codes $z_i$ of the positive and negative samples under the current dictionary are computed. The codes and their corresponding labels $l_i \in \{+1, -1\}$ are fed to the support vector machine (SVM) for training; the classifier is learned by minimizing the standard SVM cost function $\min_{w,b}\ \tfrac{1}{2}\|w\|_2^2 + C\sum_i \max(0,\, 1 - l_i(w^Tz_i + b))$, and the score of the classifier is computed as

$$f(z) = w^Tz + b. \quad (4)$$
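A sketch of the training and scoring step, with scikit-learn's linear SVM standing in for the solver (the patent specifies only an SVM over the codes):

```python
import numpy as np
from sklearn.svm import LinearSVC

def train_classifier(z_pos, z_neg, C=1.0):
    """z_pos, z_neg: (m, n) arrays of multi-kernel locality-constrained
    codes of the positive (target) and negative (background) samples."""
    Z = np.vstack([z_pos, z_neg])
    labels = np.hstack([np.ones(len(z_pos)), -np.ones(len(z_neg))])
    return LinearSVC(C=C).fit(Z, labels)

# score of a candidate code z, i.e. f(z) = w^T z + b:
# f = clf.decision_function(z.reshape(1, -1))[0]
```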
(b) The classification is embedded into the particle filter framework to track the target. The particle filter is a Bayesian sequential importance sampling technique that is often used to estimate the posterior density of the state variables of a dynamic system. Let $s_t$ denote the target state variable at time $t$; given the target observation set $y_{1:t} = \{y_1, y_2, \dots, y_t\}$, the current state $s_t$ is determined by the maximum a posteriori estimate $\hat{s}_t = \arg\max_{s_t} p(s_t \mid y_{1:t})$,
where the recursive computation of the posterior probability consists of the prediction step (5) and the update step (6):

$$p(s_t \mid y_{1:t-1}) = \int p(s_t \mid s_{t-1})\,p(s_{t-1} \mid y_{1:t-1})\,ds_{t-1}, \quad (5)$$

$$p(s_t \mid y_{1:t}) \propto p(y_t \mid s_t)\,p(s_t \mid y_{1:t-1}), \quad (6)$$
where $p(s_t \mid s_{t-1})$ is the state transition probability describing the dynamic model and $p(y_t \mid s_t)$ is the observation likelihood describing the observation model. The target motion between adjacent frames is defined by six affine transformation parameters: let $s_t = (\upsilon_1, \upsilon_2, \upsilon_3, \upsilon_4, t_x, t_y)$, where $(\upsilon_1, \upsilon_2, \upsilon_3, \upsilon_4)$ denote the rotation angle, scale, aspect ratio, and skew direction, and $(t_x, t_y)$ are the 2D position parameters. The state transition between adjacent frames is then modeled by the Gaussian distribution

$$p(s_t \mid s_{t-1}) = N(s_t;\, s_{t-1}, \Sigma),$$
where $N(\cdot)$ is the Gaussian distribution function and $\Sigma$ is a diagonal covariance matrix whose diagonal elements are the variances of the corresponding motion parameters in $s$. Based on the learned classifier, the observation model is defined as $p(y \mid s) \propto f(z)$, where $f(z)$ is the classification score computed by formula (4); the candidate target with the highest score is taken as the tracking result.
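A sketch of one particle filter step under this model; mapping a state to an image patch and its code is abstracted into score_fn, and exponentiating the classifier score to obtain positive weights is an assumption (the patent states only that p(y|s) is proportional to f(z)):

```python
import numpy as np

def particle_filter_step(particles, sigmas, score_fn, rng):
    """particles: (N, 6) affine states (rotation, scale, aspect ratio,
    skew, tx, ty); sigmas: (6,) std-devs, the diagonal of Sigma;
    score_fn: maps a state s to the classifier score f(z) of its patch."""
    # predict: p(s_t | s_{t-1}) = N(s_t; s_{t-1}, Sigma)
    particles = particles + rng.normal(scale=sigmas, size=particles.shape)
    # update: weight each particle by its observation likelihood
    scores = np.array([score_fn(s) for s in particles])
    w = np.exp(scores - scores.max())       # shift for numerical stability (assumed mapping to positive weights)
    w /= w.sum()
    best = particles[np.argmax(w)]          # highest-scoring candidate = tracking result
    # resample so the next frame starts from the posterior
    idx = rng.choice(len(particles), size=len(particles), p=w)
    return particles[idx], best
```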
Preferably, in step (4), in order to achieve robust video target tracking, the target template, the background template, and the classifier are updated dynamically, specifically:
(a) For the update of the target template $D_f$: let $\alpha$ be the coefficient vector of the new tracking result $y$ over the target dictionary template, let $s$ be the Bhattacharyya coefficient between $y$ and the atom (sample) corresponding to the largest coefficient in $\alpha$, and let $s_i$ be the Bhattacharyya coefficient between $y$ and each atom of the target template $D_f$, with minimum $s_m$; two thresholds $\tau_1 < \tau_2$ are set. If $s > \tau_2$, the tracking result is well represented by the target template and the template is kept; if $s < \tau_1$, the tracked target has undergone a strong appearance change, and $y$ replaces the target sample corresponding to $s_m$;
(b) For the update of the background template $D_b$: after the tracking target is determined in the current frame, $M_B$ background samples are selected around the target region according to a Gaussian distribution and randomly replace samples in the current background dictionary;
(c) For the update of the classifier: the samples in the updated target template $D_f$ and background template $D_b$, i.e., the current positive and negative samples, are fed to the classifier for training, which yields the current classifier.
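A sketch of the update rules under the stated thresholds; the input layout and the reuse of histogram features for the Bhattacharyya coefficients s_i are assumptions for the example:

```python
import numpy as np

def update_templates(Df, Db, y, alpha, si, bg_samples, tau1, tau2, rng):
    """Df, Db: (d, n) target / background templates, one atom per column.
    y: feature of the new tracking result; alpha: its codes over Df;
    si: Bhattacharyya coefficients between y and each atom of Df;
    bg_samples: (d, M_B) freshly drawn background samples."""
    s = si[np.argmax(alpha)]            # similarity to the most-used atom
    if s < tau1:                        # strong appearance change:
        Df[:, np.argmin(si)] = y        # y replaces the least similar atom
    # if s > tau2 the template already represents y well: keep Df as-is
    idx = rng.choice(Db.shape[1], size=bg_samples.shape[1], replace=False)
    Db[:, idx] = bg_samples             # randomly replace background atoms
    return Df, Db                       # the SVM is then retrained on the updated templates
```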
The beneficial effects of the present invention are as follows:
The present invention adopts a locality-constrained linear coding method to introduce the local structure of the sample data into the collaborative representation, obtaining a sample representation with good classification performance; it uses kernel functions to extend the collaborative representation to a multi-feature-fusion kernel space, which strengthens the class-discrimination ability of the dictionary and the sparse representation coefficients with respect to the target features; and, within the particle filter framework, it takes the candidate with the highest binary classifier score as the tracking target, so video targets can be tracked accurately and robustly.
Brief Description of the Drawings
Figures 1-4 show video target tracking results of the present invention and compare them with four other methods: the IVT method proposed by Ross in 2008, the L1 method proposed by Mei in 2011, the VTD method proposed by Kwon in 2010, and the MIL method proposed by Babenko in 2009.
Note: in Figures 1-4, the rectangular boxes mark the result of each method; the boxes are labeled A, B, C, D, and E, which denote the following methods:
A (IVT); B (L1); C (VTD); D (MIL); E (the present invention)
Figure 1 - Occlusion2 sequence (this sequence contains target rotation and occlusion changes). The tracking results in Figure 1 show that the present invention tracks the target accurately under occlusion and rotation, whereas IVT and VTD fail (e.g., #501), MIL keeps tracking but cannot estimate the rotation change, and the $l_1$ tracker drifts.
Figure 2 - DavidIndoor sequence (as the target moves forward and backward, the head rotates, the target size changes with distance, and the background illumination changes as well). Figure 3 - Car11 sequence (complex background, blurred target, and changing target size). Figures 2 and 3 show that the IVT algorithm outperforms the other three comparison algorithms: it adapts to rotation but adapts poorly to scale changes, while the present invention outperforms all of them in adapting to both target rotation and scale changes.
Figure 4 - Deer sequence (the target is blurred by its fast motion). Figure 4 shows that only the present invention and the VTD algorithm can locate the target, while the other algorithms lose it (e.g., #40), and the VTD track is offset. The present invention achieves accurate and robust target tracking in complex environments.
Detailed Description of the Embodiments
The technical solution of the present invention is described in further detail below through specific embodiments and with reference to the accompanying drawings.
Referring to Figures 1-4, a multi-kernel locality-constrained video target tracking method comprises:
(1) Introduce the local structure of the sample data into the collaborative representation method to construct a locality-constrained linear coding of the sample features. Specifically: (a) under the sparse representation framework, a test sample is expressed as $y = d_1x_1 + d_2x_2 + \dots + d_nx_n = Dx$, and the objective function of its collaborative representation is $\hat{x} = \arg\min_x \|y - Dx\|_2^2 + \lambda\|x\|_2^2$, whose minimization yields the collaborative code $\hat{x} = (D^TD + \lambda I)^{-1}D^Ty$; (b) introducing the local structure of the sample data into the collaborative representation gives the objective function $\hat{x} = \arg\min_x \|y - Dx\|_2^2 + \lambda\|\Gamma x\|_2^2$, whose minimization yields the locality-constrained collaborative code $\hat{x} = (D^TD + \lambda\Gamma^T\Gamma)^{-1}D^Ty$.
(2) Use the kernel method to construct a multi-kernel locality-constrained collaborative coding of the sample features. Specifically: (a) the image of the sample $y$ in the high-dimensional space is represented linearly by $\Phi = [\phi(d_1), \phi(d_2), \dots, \phi(d_n)]$; a random projection matrix $P^T$ then maps the high-dimensional data to a low-dimensional space, which yields the objective function of the locality-constrained kernel collaborative coding (formula (1)); setting its partial derivative to zero and, by the properties of the kernel function, writing $P = \Phi B$ gives the solution $\hat{x} = (KBB^TK + \lambda\Gamma^T\Gamma)^{-1}KBB^Tk(\cdot,y)$.
(b) The fused kernel function is obtained by multi-kernel fusion as above. The target is represented with a spatial color histogram and a spatial gradient histogram: each target sample is divided into four sub-regions of equal area, and the sub-region color features are extracted and concatenated into the target color feature $h_c$. For the gradient feature, the image is first filtered with the kernels $[-0.5, 0, 0.5]$ and $[-0.5, 0, 0.5]^T$ to obtain the image gradients; the same procedure then yields the four sub-region histograms of oriented gradients, which are concatenated into the spatial gradient histogram $h_g$. The elements of $K_c$ are computed as $K_c(i,j) = \mathrm{BhaCoff}(h_c^i, h_c^j)$, and $K_g$, $K_c(\cdot,y)$, and $K_g(\cdot,y)$ are computed in the same way as $K_c$. The multi-kernel locality-constrained collaborative code of the sample features is then obtained.
(3) Based on the SVM, embed the classifier scores of the samples into the particle filter framework to track the video target. Specifically: (a) around the target region in the first frame, target samples (positive samples) and background samples (negative samples) are drawn according to a Gaussian distribution; the multi-kernel locality-constrained collaborative codes $z_i$ of the positive and negative samples under the current dictionary are computed, the codes and their corresponding labels are fed to the SVM for training, the classifier is learned by minimizing the cost function, and the classifier score $f(z)$ is obtained from formula (4); (b) based on the learned classifier, the observation model under the particle filter framework is defined as $p(y \mid s) \propto f(z)$, and the candidate with the highest score is the tracking result.
(4) Dynamically update the target template, the background template, and the classifier according to changes in the target and background samples. Specifically: (a) update of the target template $D_f$: if $s > \tau_2$, the tracking result is well represented by the target template and $D_f$ is kept; if $s < \tau_1$, the tracked target has undergone a strong appearance change, and $y$ replaces the target sample corresponding to $s_m$; (b) update of the background template $D_b$: after the tracking target is determined in the current frame, $M_B$ background samples are selected around the target region according to a Gaussian distribution and randomly replace samples in the current background dictionary; (c) update of the classifier: the samples in the updated target template $D_f$ and background template $D_b$ are fed to the classifier for training, which yields the current classifier.
Referring to Figures 1-4, which compare the tracking results of the present invention with those of the four other methods, it can be seen that under target motion blur, scale change, fast motion, occlusion, and illumination change, the present invention outperforms the other four methods and achieves accurate and robust target tracking.
Claims (5)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710455426.XA CN107368785A (en) | 2017-06-16 | 2017-06-16 | Multi-kernel locality-constrained video target tracking method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710455426.XA CN107368785A (en) | 2017-06-16 | 2017-06-16 | Multi-kernel locality-constrained video target tracking method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107368785A true CN107368785A (en) | 2017-11-21 |
Family
ID=60306515
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710455426.XA Pending CN107368785A (en) | 2017-06-16 | 2017-06-16 | The video target tracking method of multinuclear local restriction |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107368785A (en) |
- 2017-06-16 CN CN201710455426.XA patent/CN107368785A/en active Pending
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103440645A (en) * | 2013-08-16 | 2013-12-11 | 东南大学 | Target tracking algorithm based on self-adaptive particle filter and sparse representation |
CN103544483A (en) * | 2013-10-25 | 2014-01-29 | 合肥工业大学 | United target tracking method based on local sparse representation and system thereof |
KR101542206B1 (en) * | 2014-04-24 | 2015-08-12 | (주)디브이알씨앤씨 | Method and system for tracking with extraction object using coarse to fine techniques |
CN104766343A (en) * | 2015-03-27 | 2015-07-08 | 电子科技大学 | Vision target tracking method based on sparse representation |
Non-Patent Citations (3)
Title |
---|
LAI WEI et al.: "Kernel locality-constrained collaborative representation based discriminant analysis", KNOWLEDGE-BASED SYSTEMS *
LINGFENG WANG et al.: "Visual Tracking Via Kernel Sparse Representation With Multikernel Fusion", IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY *
LIN Dongdong: "Research on Target Tracking Based on Sparse Representation and Machine Learning", China Master's Theses Full-text Database, Information Science and Technology Series *
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108108719A (en) * | 2018-01-05 | 2018-06-01 | 重庆邮电大学 | A kind of Weighted Kernel is sparse and cooperates with the Hyperspectral Image Classification method for representing coefficient |
CN109348410A (en) * | 2018-11-16 | 2019-02-15 | 电子科技大学 | Indoor localization method based on global and local joint constraint transfer learning |
CN112578675A (en) * | 2021-02-25 | 2021-03-30 | 中国人民解放军国防科技大学 | High-dynamic vision control system and task allocation and multi-core implementation method thereof |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Zahra et al. | Person re-identification: A retrospective on domain specific open challenges and future trends | |
Von Stumberg et al. | Gn-net: The gauss-newton loss for multi-weather relocalization | |
Yang et al. | An improving faster-RCNN with multi-attention ResNet for small target detection in intelligent autonomous transport with 6G | |
Xiong et al. | Spatiotemporal modeling for crowd counting in videos | |
Zhong et al. | Robust object tracking via sparse collaborative appearance model | |
Luo et al. | Spatio-temporal feature extraction and representation for RGB-D human action recognition | |
Ma et al. | Generalized pooling for robust object tracking | |
Zhang et al. | Image object detection and semantic segmentation based on convolutional neural network | |
Zhang et al. | Visual tracking via Boolean map representations | |
Chen et al. | Multitarget tracking in nonoverlapping cameras using a reference set | |
Li et al. | Visual tracking with spatio-temporal Dempster–Shafer information fusion | |
Hu et al. | Surveillance video face recognition with single sample per person based on 3D modeling and blurring | |
CN103714556A (en) | Moving target tracking method based on pyramid appearance model | |
Hwang et al. | Lidar depth completion using color-embedded information via knowledge distillation | |
Fan et al. | HCPVF: Hierarchical cascaded point-voxel fusion for 3D object detection | |
Deng et al. | Detail preserving coarse-to-fine matching for stereo matching and optical flow | |
Liu et al. | AnchorPoint: Query design for transformer-based 3D object detection and tracking | |
CN107368785A (en) | The video target tracking method of multinuclear local restriction | |
Kumar et al. | Small and dim target detection in ir imagery: A review | |
Liu et al. | Unsupervised spike depth estimation via cross-modality cross-domain knowledge transfer | |
Guo et al. | MDSFE: Multiscale deep stacking fusion enhancer network for visual data enhancement | |
Dang et al. | Adaptive sparse memory networks for efficient and robust video object segmentation | |
Taylor et al. | Pose-sensitive embedding by nonlinear nca regression | |
Chen et al. | An improved BIM aided indoor localization method via enhancing cross-domain image retrieval based on deep learning | |
Shih et al. | Video interpolation and prediction with unsupervised landmarks |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20171121 |