CN105654514A - Image target tracking method

Info

Publication number
CN105654514A
CN105654514A
Authority
CN
China
Prior art keywords
target
image
tracking
candidate target
local
Prior art date
Legal status
Pending
Application number
CN201511026620.3A
Other languages
Chinese (zh)
Inventor
罗武胜
孙备
鲁琴
杜列波
李阳
肖晶晶
Current Assignee
National University of Defense Technology
Original Assignee
National University of Defense Technology
Priority date
Filing date
Publication date
Application filed by National University of Defense Technology filed Critical National University of Defense Technology
Priority to CN201511026620.3A priority Critical patent/CN105654514A/en
Publication of CN105654514A publication Critical patent/CN105654514A/en
Pending legal-status Critical Current


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 - Image analysis
    • G06T7/20 - Analysis of motion
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/10 - Image acquisition modality
    • G06T2207/10016 - Video; Image sequence

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides an image target tracking method comprising the following steps: obtaining a first frame of image, assembling the appearance models of local image blocks of different scales into a whole-image dictionary, and calculating the sparse coefficient of each local image block; taking the current state of the image as a candidate target, establishing a particle filter and a similarity function, and calculating the estimated position of the candidate target through the similarity function within the current-state particle filter framework; and, taking the estimated position of the candidate target as a basis, recalculating the sparse coefficients of the local image blocks to determine the final position of the target. The method improves on traditional tracking methods that identify the target through random sampling consistency over statistical features, and instead identifies the target object in a complex background through a discriminative model constructed from the sparse coefficients of local image blocks. A candidate-target similarity function fusing the discriminative and generative models improves the stability and accuracy of the tracking method.

Description

Image target tracking method
Technical Field
The invention relates to the technical field of computer image processing, in particular to a method for tracking an image target in computer image processing.
Background
With the continuous development of sensor technology, computer processing power, and storage in recent years, moving target tracking has become a popular research field in pattern recognition and computer vision, with wide application in military and civilian domains. Vision is one of the main senses through which humans acquire information about the outside world, and a video sequence carries more useful information than a static image. Segmenting and tracking a target in a video sequence is therefore the premise and foundation of subsequent research work; tasks such as abnormal behavior detection and target identification all presuppose that the target has been tracked and segmented.
Target tracking research covers two aspects. The first is to detect, track, and identify the moving target in a captured video sequence and extract the required information, such as the target's trajectory and related motion parameters (speed, acceleration, position at a given moment). The second is to use the acquired motion parameters to estimate and predict the target's state, so as to assist decision making. Accurate extraction of the moving target's features is therefore a precondition for better tracking, identification, and classification, and tracking accuracy determines the accuracy and difficulty of higher-level processes such as describing, understanding, and acting on target behavior.
In practice, however, many factors degrade the extraction of a moving target and its appearance features. Occlusion, spatial rotation of the target, and the target entering or leaving the field of view affect the performance of most trackers, so that errors accumulate and the final result drifts. Crowded scenes contain many similar distractors, which makes detection and tracking harder. Factors that interfere with the extraction of the target's edge features, such as pose changes of the moving target, affect trackers based on edges, gradients, and similar features, so that they may fail to detect regions similar to their training samples. Camera jitter and shadows can change the gray level and content of two adjacent frames substantially, which affects most trackers. Sudden illumination changes, weak light, low visibility, or contrast changes between an indoor target and its background alter the target's color features, and because visual features are sensitive to illumination, trackers based on color features suffer. Finding an effective way to handle the many factors that degrade target tracking in complex scenes, improving the robustness and accuracy of the algorithm, and building an efficient, real-time moving target tracking platform therefore has important theoretical value and wide application value for research such as behavior pattern understanding.
Disclosure of Invention
Aiming at the problems in the prior art, the invention provides a target tracking method based on sparse coefficients. It addresses the weak discriminative power of existing tracking methods that identify the target through random sampling consistency over statistical features, and instead identifies the target object in a complex background through a discriminative model constructed from the sparse coefficients of local image blocks. A candidate-target similarity function fusing the discriminative and generative models improves the stability and accuracy of the tracking method.
To achieve the above purpose, the invention provides the following technical scheme:
the invention adopts a method for constructing a multi-scale whole-image dictionary of the scene image: a static dictionary is established from the first-frame target image using blocks of different scales, and the appearance of the image is represented by combining the whole-image dictionaries at different scales with the sparse coefficients of the local blocks.
Furthermore, the invention adopts a discriminative method constructed from the sparse coefficients of local image blocks. Target representation is realized from the local sparse coefficients through a pooling method, and single-scale sparse representation is extended to multi-scale sparse coefficient representation, which refines the appearance representation and strengthens the robustness of the tracker. Since the choice of block size has a large influence on tracking performance, the method resolves the selection of local image blocks through adaptive fusion of multi-scale blocks.
Furthermore, the invention discloses a target state estimation method that establishes a sparse histogram of local images: in the generative model, the appearance of the target is represented by the corresponding sparse coding histogram, and the similarity between the target template and a candidate template is calculated from these sparse coding histograms.
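As a minimal illustration of this generative side, the sketch below pools local sparse coefficients into a normalized histogram over dictionary atoms and scores a candidate against the target template. The pooling scheme and the histogram-intersection measure are illustrative assumptions; the patent names a sparse coding histogram similarity but does not fix a specific formula.

```python
import numpy as np

def sparse_histogram(coeffs):
    """Pool local sparse coefficients (atoms x patches) into a normalized histogram over atoms."""
    h = np.abs(coeffs).sum(axis=1)
    total = h.sum()
    return h / total if total > 0 else h

def histogram_intersection(h1, h2):
    """Similarity between two normalized sparse-coding histograms (1.0 = identical)."""
    return float(np.minimum(h1, h2).sum())

rng = np.random.default_rng(0)
target_codes = np.abs(rng.random((20, 8)))                   # 20 dictionary atoms, 8 local blocks
candidate_codes = target_codes + 0.01 * rng.random((20, 8))  # near-identical candidate appearance
h_target = sparse_histogram(target_codes)
h_candidate = sparse_histogram(candidate_codes)
print(histogram_intersection(h_target, h_candidate) > 0.9)   # prints True
```

A candidate whose local blocks code onto the same dictionary atoms as the template scores near 1, while occluders or background clutter shift mass to other atoms and lower the score.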
The invention adopts a particle filter framework: the similarity of the candidate target is calculated on the basis of the tracked candidate target image, and the final estimated position for target localization is determined in combination with the particle filter.
Further, the particle filter provides a framework for estimating and propagating the posterior probability density function of the state change.
The invention adopts a target localization method combining the multi-scale whole-image dictionary and a particle filter: the state estimate of the candidate target is analyzed through a multi-scale dictionary integrating the discriminative model and the generative model, and the final position of the candidate target is determined in combination with the particle filter.
Furthermore, the invention adopts a target localization method based on the multi-scale dictionary and particle filtering: a sparse coefficient is recomputed for each local image block of the candidate target, and the optimal position of the tracked target is selected through particle filtering.
Furthermore, for tracking the candidate target, the invention effectively identifies the target by constructing a discriminative model from sparse coefficients: the target image in the discriminative model is represented by its corresponding sparse codes, and a linear classifier learned from the sparse codes separates background images from target images.
In conclusion, the target localization method combining the multi-scale whole-image dictionary and the particle filter is adopted for tracking the candidate target. This method not only avoids the difficulty of choosing local blocks before tracking, but also lets target information and spatial information complement each other, improving tracking precision.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention and not to limit the invention. In the drawings:
FIG. 1 is a flowchart of an image target tracking method according to an embodiment of the present invention
Detailed Description
The following detailed description of embodiments of the invention refers to the accompanying drawings. It should be understood that the detailed description and specific examples, while indicating the present invention, are given by way of illustration and explanation only, not limitation. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the invention. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
A target tracking method based on sparse representation addresses the weak discriminative power of existing tracking methods that identify the target through random sampling consistency over statistical features, and instead identifies the target object in a complex background through a discriminative model constructed from the sparse coefficients of local image blocks. A candidate-target similarity function fusing the discriminative and generative models improves the stability and accuracy of the tracking method.
To achieve this purpose, the invention adopts a method that realizes appearance representation by combining dictionaries at different scales with local block sparse coefficients, and improves the stability and precision of tracking through particle filtering. The adopted technical scheme mainly comprises three parts:
(1) Representing the image appearance model by sparse coefficients of multi-scale blocks:
an image appearance model is established according to the sizes of blocks of different scales: the sparse coefficient of each local image block is solved at every scale, and all the target sparsities are collected to express the image appearance. Local image blocks are sampled by sliding windows of different sizes over the target area, so as to establish a whole-image dictionary at each scale:

$D^s = \{ d_j^s \mid j = 1{:}n \times K \}, \quad s = 1, 2, \ldots, L, \quad r = 2s + 2$

where $d_j^s \in \mathbb{R}^d$ denotes the image block in the $j$-th column, $d$ is the image block dimension, and $r$ denotes the block size at scale $s$. Local image blocks are then extracted from the candidate target area at the different scales, and with the whole-image dictionary $D^s$ established from these image blocks, the local image blocks at each scale obtain their corresponding sparse coefficients:

$a_j^s = \arg\min \| a_j^s \|_1 \quad \text{subject to} \quad \| p_j^s - D^s a_j^s \| < \varepsilon$
Once the above formula yields the sparse coding of a candidate target image block, the variation of local images across different scales can be represented by a weighted combination of sparse coefficients, thereby constructing a representation model of the target scene.
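To make the dictionary construction and sparse coding step concrete, here is a minimal single-scale sketch: patches sampled by a sliding window form the dictionary columns, and a greedy orthogonal matching pursuit stands in for the $\ell_1$ minimization above. The solver choice, patch size, and stride are illustrative assumptions, not prescribed by the patent.

```python
import numpy as np

def extract_patches(image, size, stride):
    """Slide a size x size window over the image; return L2-normalized patches as columns."""
    patches = []
    H, W = image.shape
    for y in range(0, H - size + 1, stride):
        for x in range(0, W - size + 1, stride):
            p = image[y:y + size, x:x + size].ravel().astype(float)
            n = np.linalg.norm(p)
            patches.append(p / n if n > 0 else p)
    return np.stack(patches, axis=1)          # dictionary D^s, shape (d, n_patches)

def sparse_code_omp(D, p, n_nonzero=3):
    """Greedy orthogonal matching pursuit: sparse a with ||p - D a|| driven toward zero."""
    residual = p.copy()
    support = []
    a = np.zeros(D.shape[1])
    for _ in range(n_nonzero):
        j = int(np.argmax(np.abs(D.T @ residual)))   # atom most correlated with the residual
        if j not in support:
            support.append(j)
        coef, *_ = np.linalg.lstsq(D[:, support], p, rcond=None)
        a[:] = 0.0
        a[support] = coef
        residual = p - D @ a
    return a

# toy first frame and one candidate patch (identical to a dictionary atom)
rng = np.random.default_rng(0)
frame0 = rng.random((24, 24))
D = extract_patches(frame0, size=6, stride=3)   # in the patent, size r = 2s + 2 per scale s
p = D[:, 5]
a = sparse_code_omp(D, p)
print(np.linalg.norm(p - D @ a) < 1e-8)         # prints True: near-perfect reconstruction
```

Repeating the extraction for each window size $r = 2s + 2$ yields the family of dictionaries $D^s$ that the multi-scale representation stacks together.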
(2) Establishing a particle filter:
the particle filter provides a framework for estimating the posterior probability density function of the candidate target's state change. Let the observations of the candidate target up to time $t$ be $y_{1:t} = \{ y_1, \ldots, y_t \}$; the current state $s_t$ is then estimated by maximizing the posterior probability:

$s_t = \arg\max p(S_t \mid y_{1:t})$

where $p(S_t \mid y_{1:t})$ is the posterior probability, computed through the similarity function of $y_t$ given the state $S_t$.
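A toy version of one such particle filter step might look as follows. The Gaussian random-walk motion model and the synthetic similarity function are assumptions for illustration; in the patent the weights would come from the sparse-coefficient similarity function.

```python
import numpy as np

rng = np.random.default_rng(1)
TRUE_POS = np.array([50.0, 30.0])              # synthetic ground-truth target position

def similarity(state):
    """Stand-in appearance similarity p(y_t | S_t): peaks when the state hits the target."""
    return np.exp(-np.sum((state - TRUE_POS) ** 2) / 50.0)

# one particle-filter step: predict with a random-walk motion model,
# weight each particle by the similarity function, take the MAP particle as s_t
n_particles = 500
particles = rng.normal(loc=[48.0, 28.0], scale=4.0, size=(n_particles, 2))
weights = np.array([similarity(p) for p in particles])
weights /= weights.sum()                       # normalized posterior over the particle set
s_t = particles[np.argmax(weights)]            # s_t = argmax p(S_t | y_1:t)
print(np.linalg.norm(s_t - TRUE_POS) < 3.0)
```

In a full tracker this step repeats per frame, with resampling of low-weight particles to keep the approximation of the posterior density from degenerating.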
(3) Tracking the target according to the multi-scale dictionary:
in practical application, similarity calculation is realized through a static representation model: a static dictionary created from the first-frame target image using blocks of different scales. At time $t$, the candidate target's position is estimated by combining the similarity function within the particle filter framework with the multi-scale dictionary updated in real time. Then, on the basis of this first-step estimate and the established static dictionary, the corresponding sparse coefficient is recalculated for each local image block of the candidate target:

$\hat{a}_j^s = \arg\min \| a_j^s \|_1 \quad \text{subject to} \quad \| p_j^s - D^s a_j^s \| < \varepsilon$

Finally, the final position of the image target is located as in the first stage of the algorithm flow.
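Putting the pieces together, here is a compact 1-D analogue of the tracking loop under stated simplifications: the static dictionary is built from the first frame, candidates near the previous position are scored per frame, and coefficients are recomputed for each candidate block, with a plain least-squares fit standing in for the sparse solver.

```python
import numpy as np

rng = np.random.default_rng(2)
PATCH = 8
appearance = rng.random(PATCH)                  # fixed 1-D "appearance" of the target

def frame_at(pos, length=64):
    """Synthesize a 1-D frame: low-level noise with the target appearance placed at `pos`."""
    f = 0.05 * rng.random(length)
    f[pos:pos + PATCH] = appearance
    return f

# static dictionary from the first frame: slightly shifted views of the target region,
# a toy stand-in for the patent's multi-scale local-block whole-image dictionary
first_frame = frame_at(10)
D = np.stack([first_frame[10 + k:10 + k + PATCH] for k in range(-2, 3)], axis=1)
D = D / np.linalg.norm(D, axis=0, keepdims=True)

def locate(frame, prev_pos, radius=4):
    """Per-frame step: score candidates near prev_pos by dictionary reconstruction error."""
    best, best_err = prev_pos, np.inf
    for c in range(max(0, prev_pos - radius), prev_pos + radius + 1):
        p = frame[c:c + PATCH]
        a, *_ = np.linalg.lstsq(D, p, rcond=None)   # recompute coefficients per block
        err = np.linalg.norm(p - D @ a)             # low error = high similarity
        if err < best_err:
            best, best_err = c, err
    return best

pos, track = 10, []
for true_pos in [10, 12, 14, 15, 17]:               # the target drifts right frame by frame
    pos = locate(frame_at(true_pos), pos)
    track.append(pos)
print(track)
```

Because the true target patch lies exactly in the span of the first-frame dictionary, its reconstruction error is near zero at the correct offset, so the tracker follows the drift.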
In summary, the image target tracking method provided by the invention realizes target state estimation through appearance similarity within a particle filter tracking framework. A discriminative method constructed from the sparse coefficients of local image blocks is adopted: the local sparse coefficients realize target representation through a pooling method, and single-scale sparse representation is extended to multi-scale sparse coefficient representation, which refines the appearance representation and constructs a multi-scale dictionary of the scene image. The similarity of candidate targets is calculated on the basis of the target state estimate, the final target position is determined in combination with the particle filter, a sparse coefficient is recomputed for each local image block of the candidate target, and the optimal position of the tracked target is selected through particle filtering. Through this target localization method, which combines the multi-scale whole-image dictionary and the particle filter, the state estimate of the candidate target is analyzed through a multi-scale whole-image dictionary integrating the discriminative and generative models, and tracking precision is improved in combination with the particle filter.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims (5)

1. A method for tracking an image target, comprising the steps of:
(1) acquiring a first frame of image, assembling a whole-image dictionary from the appearance models of local image blocks of different scales, and calculating the sparse coefficients of the local image blocks;
(2) acquiring the current state of an image as a candidate target, establishing a particle filter and a similarity function, and calculating the estimated position of the candidate target through the similarity function in the current state particle filter frame;
(3) recalculating the sparse coefficients of the local image blocks on the basis of the estimated position of the candidate target, so as to determine the final position of the target.
2. The method for tracking the image target according to claim 1, wherein the whole-image dictionary in step (1) comprises:
sampling local image blocks by sliding windows of different sizes in the appearance model of the first frame image, and establishing the whole-image dictionary $D^s = \{ d_j^s \mid j = 1{:}n \times K \}$, $s = 1, 2, \ldots, L$, $r = 2s + 2$;
wherein $d_j^s \in \mathbb{R}^d$ denotes the image block in the $j$-th column, $d$ is the image block dimension, and $r$ denotes the scale; local image blocks are extracted from the candidate target area at the different scales, and the whole-image dictionary $D^s$ is established from the different image blocks.
3. The method for tracking the image target according to claim 1, wherein the sparse coefficients in step (1) are: $a_j^s = \arg\min \| a_j^s \|_1 \text{ subject to } \| p_j^s - D^s a_j^s \| < \varepsilon$.
4. The method for tracking the image target according to claim 1, wherein the method for calculating the estimated position of the candidate target in step (2) comprises:
the particle filter provides a framework for estimating and propagating the posterior probability density function of the state change; the target observations up to time $t$ are $y_{1:t} = \{ y_1, \ldots, y_t \}$;
the current state $S_t$ is estimated by maximizing the posterior probability: $s_t = \arg\max p(S_t \mid y_{1:t})$;
wherein $p(S_t \mid y_{1:t})$ is the posterior probability, i.e. the similarity function of $y_t$ given the state $S_t$.
5. The method for tracking the image target according to claim 1, wherein the method for recalculating the sparse coefficients in step (3) comprises:
at time $t$, recalculating, for each local image block of the candidate target at the estimated target position, the corresponding sparse coefficient by: $\hat{a}_j^s = \arg\min \| a_j^s \|_1 \text{ subject to } \| p_j^s - D^s a_j^s \| < \varepsilon$.
CN201511026620.3A 2015-12-31 2015-12-31 Image target tracking method Pending CN105654514A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201511026620.3A CN105654514A (en) 2015-12-31 2015-12-31 Image target tracking method


Publications (1)

Publication Number Publication Date
CN105654514A true CN105654514A (en) 2016-06-08

Family

ID=56490506

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201511026620.3A Pending CN105654514A (en) 2015-12-31 2015-12-31 Image target tracking method

Country Status (1)

Country Link
CN (1) CN105654514A (en)


Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109782810A (en) * 2018-12-26 2019-05-21 中国人民解放军战略支援部队航天工程大学 Video satellite motion target tracking imaging method and its device based on image guidance
CN109782810B (en) * 2018-12-26 2020-06-02 中国人民解放军战略支援部队航天工程大学 Video satellite moving target tracking imaging method and device based on image guidance
CN116993785A (en) * 2023-08-31 2023-11-03 东之乔科技有限公司 Target object visual tracking method and device, electronic equipment and storage medium
CN116993785B (en) * 2023-08-31 2024-02-02 东之乔科技有限公司 Target object visual tracking method and device, electronic equipment and storage medium

Similar Documents

Publication Publication Date Title
CN110070074B (en) Method for constructing pedestrian detection model
CN108470332B (en) Multi-target tracking method and device
CN109146921B (en) Pedestrian target tracking method based on deep learning
Wang et al. Online discriminative object tracking with local sparse representation
Jia et al. Visual tracking via adaptive structural local sparse appearance model
CN110084836B (en) Target tracking method based on deep convolution characteristic hierarchical response fusion
CN102789568B (en) Gesture identification method based on depth information
CN105160310A (en) 3D (three-dimensional) convolutional neural network based human body behavior recognition method
CN101339655B (en) Visual sense tracking method based on target characteristic and bayesian filtering
CN108961312A (en) High-performance visual object tracking and system for embedded vision system
CN104615986B (en) The method that pedestrian detection is carried out to the video image of scene changes using multi-detector
CN105550678A (en) Human body motion feature extraction method based on global remarkable edge area
CN111104867A (en) Recognition model training and vehicle heavy recognition method and device based on component segmentation
CN105608456A (en) Multi-directional text detection method based on full convolution network
CN104200495A (en) Multi-target tracking method in video surveillance
KR101449744B1 (en) Face detection device and method using region-based feature
CN110555868A (en) method for detecting small moving target under complex ground background
CN103093198A (en) Crowd density monitoring method and device
CN113763424B (en) Real-time intelligent target detection method and system based on embedded platform
CN108734200B (en) Human target visual detection method and device based on BING (building information network) features
CN104050685A (en) Moving target detection method based on particle filtering visual attention model
Zhang Detection and tracking of human motion targets in video images based on camshift algorithms
CN113160283A (en) Target tracking method based on SIFT under multi-camera scene
Ali et al. Vehicle detection and tracking in UAV imagery via YOLOv3 and Kalman filter
CN104463909A (en) Visual target tracking method based on credibility combination map model

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20160608