CN105654505B - A superpixel-based collaborative tracking algorithm and system - Google Patents
- Publication number
- CN105654505B · CN201510971312.1A · CN201510971312A
- Authority
- CN
- China
- Prior art keywords
- model
- area
- adaptive
- thr
- superpixel
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
Landscapes
- Image Analysis (AREA)
Abstract
The invention relates to a superpixel-based collaborative tracking algorithm and system. The method combines a global judgment with local judgments to determine whether a candidate image contains the target region, and can therefore keep tracking when the target region is occluded; at the same time, an update strategy lets the method adapt to the various appearance changes of the target region during tracking, greatly improving its accuracy and applicability.
Description
Technical Field
The present invention relates to the field of computer-vision target tracking, and more specifically to a superpixel-based collaborative tracking algorithm and system.
Background Art
With the development and popularization of computers, people increasingly expect computers to perceive and recognize the world as humans do; one direction of effort is a visual-perception system similar to the human one. Computer vision processes input image information with a computer, simulating how the human eye perceives and recognizes visual information, to complete tasks such as target recognition and tracking. With improving computer performance and the spread of cameras, we acquire a massive and still-growing amount of video imagery every day, so the demand for automated processing of visual information keeps rising.
Target tracking detects a pre-selected target of interest in a sequence of images and follows it frame by frame. By the number of targets, tracking algorithms divide into single-target and multi-target algorithms; by the number of cameras used, into single-camera and multi-camera tracking. The present invention addresses the single-camera, single-target tracking problem. Target tracking is itself an applied technology within computer vision, and at the same time the foundation of other higher-level applications; typical applications include human-computer interaction, security surveillance, traffic detection, and intelligent robot navigation. Target tracking is, however, a complex process, and the field still faces many challenges, such as partial occlusion, appearance changes, illumination changes, violent motion, reappearance of a target after it leaves the field of view, and background interference.
Summary of the Invention
To remedy the above defects of the prior art, the present invention provides a superpixel-based collaborative tracking algorithm that handles common tracking problems such as occlusion and appearance change, with good stability and robustness.
To achieve the above object of the invention, the following technical scheme is adopted:
A collaborative tracking algorithm based on superpixel segmentation, for solving the single-camera, single-target tracking problem, comprises the following steps:
1. Training stage
S1. Construct a global discriminative model that extracts Haar-like features of the target region, builds a global classifier GC from the extracted features, and determines the parameters of GC;
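One plausible reading of the feature extraction in S1 is sketched below: compressed Haar-like features computed as fixed, randomly chosen signed sums of rectangle sums over an integral image, in the spirit of the Real-Time Compressive Tracking method the patent later cites for its update scheme. The function names, rectangle counts, and weights are illustrative assumptions, not the patent's specification.

```python
import numpy as np

def make_projection(H, W, n_feats=50, n_rects=3, seed=0):
    """Fix a random sparse projection: each feature is a signed sum of rectangles."""
    rng = np.random.default_rng(seed)
    proj = []
    for _ in range(n_feats):
        rects = []
        for _ in range(n_rects):
            h = int(rng.integers(1, H + 1))
            w = int(rng.integers(1, W + 1))
            y = int(rng.integers(0, H - h + 1))
            x = int(rng.integers(0, W - w + 1))
            rects.append((y, x, h, w, float(rng.choice((-1.0, 1.0)))))
        proj.append(rects)
    return proj

def haar_like_features(gray, proj):
    """Evaluate the fixed projection on one grayscale patch via an integral image."""
    ii = gray.astype(np.float64).cumsum(axis=0).cumsum(axis=1)
    def rect_sum(y, x, h, w):
        s = ii[y + h - 1, x + w - 1]
        if y: s -= ii[y - 1, x + w - 1]
        if x: s -= ii[y + h - 1, x - 1]
        if y and x: s += ii[y - 1, x - 1]
        return s
    return np.array([sum(sgn * rect_sum(y, x, h, w) for y, x, h, w, sgn in rects)
                     for rects in proj])
```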
S2. Partition the target region into N sub-regions with an overlapping-sliding-window tiling method, then construct N local discriminative models that extract Haar-like features from the N sub-regions, build one local classifier per sub-region from the extracted features, and determine the parameters of each local classifier;
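A minimal sketch of the overlapping-sliding-window tiling of step S2; the window size and stride are illustrative assumptions, since the patent specifies neither.

```python
import numpy as np

def tile_overlapping(region, win=32, stride=16):
    """Split a target region (H x W [x C]) into overlapping square sub-regions."""
    h, w = region.shape[:2]
    tiles = []
    for y in range(0, max(h - win, 0) + 1, stride):
        for x in range(0, max(w - win, 0) + 1, stride):
            tiles.append(region[y:y + win, x:x + win])
    return tiles  # N sub-regions; one local discriminative model per tile
```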
S3. Construct the adaptive generative model and determine its model parameters, specifically as follows:
Apply superpixel segmentation to the target region, extract a feature vector for each superpixel, and then cluster all superpixels of the target region with the K-means algorithm, thereby determining the model parameters of the adaptive generative model;
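A sketch of step S3 with common off-the-shelf tools, assuming SLIC for the superpixel segmentation and a mean-HSV color vector as the per-superpixel feature; the patent names none of these choices except K-means, so they are assumptions.

```python
import numpy as np
from skimage.color import rgb2hsv
from skimage.segmentation import slic
from sklearn.cluster import KMeans

def fit_adaptive_generative_model(target_rgb, n_segments=200, n_clusters=8):
    labels = slic(target_rgb, n_segments=n_segments, start_label=0)
    hsv = rgb2hsv(target_rgb)
    feats = []
    for sp in range(labels.max() + 1):
        mask = labels == sp
        if mask.any():
            feats.append(hsv[mask].mean(axis=0))  # one feature vector per superpixel
    km = KMeans(n_clusters=n_clusters, n_init=10).fit(np.asarray(feats))
    return km  # cluster centers and assignments serve as the model parameters
```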
2. Tracking stage
S4. Input a candidate image p_i into the global discriminative model, which extracts the Haar-like features of p_i; the global classifier GC then classifies these features, with GC(p_i) denoting the classification result for candidate image p_i;
S5. Divide the candidate image p_i into N sub-regions with the method of step S2, have the N local discriminative models extract Haar-like features from the N sub-regions, and classify them with the N local classifiers, with LC_j(p_i) denoting the classification result of the j-th local classifier on its sub-region;
S6. Combine the classification results of the global and local classification models to judge whether the candidate image contains the target region:
thr_GC and thr_LC denote the thresholds of the global classification and the local classification, respectively; y(p_i) = 1 indicates that the candidate image p_i contains the target region;
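The combination rule itself does not survive in this text; the OR-style rule sketched below, in which a candidate passes if the global score clears thr_GC or at least one local score clears thr_LC, is an assumption consistent with the occlusion rationale given in the detailed description.

```python
def contains_target(gc_score, lc_scores, thr_gc, thr_lc):
    """y(p_i): 1 if the candidate image is judged to contain the target region."""
    if gc_score >= thr_gc:
        return 1          # global classifier accepts the whole candidate
    if max(lc_scores) >= thr_lc:
        return 1          # an unoccluded sub-region still matches locally
    return 0
```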
S7. Apply steps S4 to S6 to all candidate images to judge whether each contains the target region, then feed every candidate image judged to contain the target region into the adaptive generative model;
S8. For each candidate image, the adaptive generative model applies superpixel segmentation, extracts the feature vector of every superpixel, clusters all superpixel feature vectors with the K-means algorithm, and computes the clustering confidence of the candidate image; the candidate image with the highest confidence is then output as the tracking result. The output data comprise the confidence conf_T of the current tracking result and its matching area area_T with the target region, where A_i is the area of each superpixel and N is the number of superpixels contained in the candidate image patch.
This formulation says that when a superpixel is close to its cluster center in feature space, lies at a relative position in the target region close to that of its cluster's template superpixels, and belongs to a cluster with a high target/background confidence, this patent regards it as describing the appearance of the current target more fully and as having strong discriminative power. Here g'_i denotes a superpixel contained in the candidate image patch, k'_i the cluster to which g'_i belongs, and S'_i the distance of g'_i to that cluster; the target/background confidence of k'_i enters the score, R'_j denotes the cluster radius, and conf'_i denotes the confidence of g'_i. L_i is the minimum spatial distance between each superpixel of the candidate image patch and the template superpixels of its cluster, and a_s is a weight factor controlling the spatial-distance weight: conf'_i contains a power term with base a_s whose exponent is the spatial distance, within the target region, between g'_i and its cluster's template superpixels. A'_j denotes the pixel count of each superpixel in the current tracking result, the per-cluster sums give the number of target-region pixels contained in each superpixel cluster, and M denotes the total number of superpixels.
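A sketch of how step S8's outputs might be computed. The formula images for conf_T and area_T are not reproduced in this text, so the functional form below is an assumption guided by the prose: conf'_i combines feature-space closeness to the cluster center (scaled by R'_j), a spatial weight a_s ** L_i, and the target/background confidence of the owning cluster; conf_T aggregates the per-superpixel confidences, and area_T sums the areas A_i of positively scored superpixels.

```python
import numpy as np

def tracking_outputs(sp_feats, sp_areas, sp_min_dists, km, cluster_conf,
                     cluster_radii, a_s=0.9):
    """conf_T and area_T for one candidate patch (functional form assumed).

    cluster_conf and cluster_radii are per-cluster arrays; km is a fitted KMeans.
    """
    feats = np.asarray(sp_feats, dtype=float)
    labels = km.predict(feats)                       # cluster of each superpixel
    dists = np.linalg.norm(feats - km.cluster_centers_[labels], axis=1)
    conf = (np.exp(-dists / cluster_radii[labels])   # feature-space closeness
            * (a_s ** np.asarray(sp_min_dists))      # spatial-distance weight
            * cluster_conf[labels])                  # cluster target/background
    conf_T = float(conf.sum())
    area_T = float(np.asarray(sp_areas)[conf > 0].sum())
    return conf_T, area_T
```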
3. Detection stage
S9. Construct the template-library generative model, have it detect the target region in the current frame, and return the confidence conf_D of the detection result; then estimate the current position of the target region from the outputs of the adaptive generative model and the template-library generative model:
1) When area_T ≥ thr_PL and conf_T ≥ thr_TH
where thr_TH and thr_PL denote the confidence threshold and the matching-area threshold, respectively. In this case the tracking result of the adaptive generative model has both high confidence and a high matching area, so the model is working normally and has adapted to the appearance of the target region; its output is therefore taken as the target position. The parameters of the global classifier GC, the local classifiers, and the adaptive generative model are then updated from area_T and conf_T according to the update strategy;
2) When area_T < thr_PL and conf_T ≥ thr_TH
In this case the matching area of the adaptive generative model's tracking result is low, but the confidence of the tracking result is still above the threshold, so the model's output is still taken as the target position. The parameters of the global classifier GC, the local classifiers, and the adaptive generative model are then updated from area_T and conf_T according to the update strategy;
3) When area_T ≥ thr_PL and conf_T < thr_TH
In this case the adaptive generative model's tracking result has low confidence but a high matching area, so the model's output is still taken as the target position. The parameters of the global classifier GC, the local classifiers, and the adaptive generative model are then updated from area_T and conf_T according to the update strategy;
4) When area_T < thr_PL, conf_T < thr_TH, and conf_D ≥ thr_DH
where thr_DH denotes the threshold on the confidence of the detection result. In this case both the confidence and the matching area of the adaptive generative model's tracking result are below their preset thresholds, while the template-library generative model has detected a target position with high confidence; the detection result of the template-library generative model is therefore output as the target position, and the global classifier GC, the local classifiers, and the adaptive generative model are reinitialized.
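The four cases of step S9 reduce to the decision logic sketched below; `update` and `reinitialize` stand in for the patent's update strategy and reinitialization, and the final fall-through branch is an assumption, since the text does not state what happens when no case applies.

```python
def estimate_position(area_T, conf_T, conf_D, track_pos, detect_pos,
                      thr_PL, thr_TH, thr_DH, update, reinitialize):
    """Return the estimated target position for the current frame."""
    if conf_T >= thr_TH or area_T >= thr_PL:
        # Cases 1-3: the adaptive generative model's result is trusted;
        # update GC, the local classifiers, and the generative model.
        update(area_T, conf_T)
        return track_pos
    if conf_D >= thr_DH:
        # Case 4: both tracker scores are low but the detector is confident;
        # reinitialize the models and adopt the detection result.
        reinitialize()
        return detect_pos
    return None  # no case applies; behavior here is an assumption
```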
Model updating is the key to letting the tracking algorithm adapt to changes in the target's appearance. The discriminative models adopt an incremental update method similar to that in the Real-Time Compressive Tracking literature (unrelated to this patent, so not elaborated here), while the generative model adopts a sliding-window update method. During tracking, every U frames one frame is added to the model and undergoes superpixel segmentation, feature extraction, and clustering. To keep the algorithm real-time, a fixed-size window is used: at each update, if the number of frames in the window exceeds the predetermined size, the frame with the least influence on the generative model is discarded according to a set strategy.
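A sketch of the sliding-window update just described; the window size, the update period U, and the influence score used to pick the discarded frame are assumptions.

```python
from collections import deque

class SlidingWindowUpdater:
    """Every U frames, add a frame; beyond the window size, drop the least useful."""
    def __init__(self, max_frames=10, period_U=5):
        self.window = deque()
        self.max_frames = max_frames
        self.period_U = period_U

    def maybe_update(self, frame_idx, frame, influence_of):
        if frame_idx % self.period_U != 0:
            return
        self.window.append(frame)
        if len(self.window) > self.max_frames:
            # Discard the frame with the least influence on the generative model.
            victim = min(self.window, key=influence_of)
            self.window.remove(victim)
        # ...then re-run superpixel segmentation, feature extraction, clustering
```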
The present invention also provides a system applying the collaborative tracking algorithm, with the following concrete scheme: it comprises a tracking module, a detection module, and a position-estimation module, where the tracking module comprises the global discriminative model, the local discriminative models, and the adaptive generative model, the detection module comprises the template-library generative model, and the position-estimation module estimates the current position of the target region from the outputs of the adaptive generative model and the template-library generative model.
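A structural sketch of the claimed system, with one class per claimed module; all class and attribute names are illustrative assumptions.

```python
from dataclasses import dataclass, field

@dataclass
class TrackingModule:
    global_model: object = None          # global discriminative model
    local_models: list = field(default_factory=list)  # N local models
    adaptive_model: object = None        # adaptive generative model

@dataclass
class DetectionModule:
    template_library_model: object = None  # template-library generative model

@dataclass
class PositionEstimationModule:
    def estimate(self, tracker_output, detector_output):
        # Combine both outputs per the step-S9 strategy sketched above.
        ...
```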
Compared with the prior art, the beneficial effects of the present invention are:
The superpixel-based collaborative tracking algorithm provided by the present invention handles common tracking problems such as occlusion and appearance change, with good stability and robustness.
Brief Description of the Drawings
Figure 1 is the framework diagram of the method.
Figure 2 is a schematic diagram of the training of the discriminative models.
Figure 3 is a schematic diagram of the training of the adaptive generative model.
Detailed Description of the Embodiments
The accompanying drawings are for illustrative purposes only and shall not be construed as limiting this patent;
The present invention is further elaborated below with reference to the drawings and embodiments.
Embodiment 1
A collaborative tracking algorithm based on superpixel segmentation, for solving the single-camera, single-target tracking problem, comprises the following steps:
1. Training stage
S1. Construct a global discriminative model that extracts Haar-like features of the target region, builds the global classifier GC from the extracted globally compressed Haar-like features, and determines the parameters of GC, as shown in Figure 2;
S2. Partition the target region into N sub-regions with the overlapping-sliding-window tiling method, then construct N local discriminative models that extract Haar-like features from the N sub-regions, build one local classifier per sub-region from the extracted locally compressed Haar-like features, and determine the parameters of each local classifier, as shown in Figure 2;
S3. Construct the adaptive generative model and determine its model parameters, specifically as follows:
Apply superpixel segmentation to the target region, extract a feature vector for each superpixel, and then cluster all superpixels of the target region with the K-means algorithm, thereby determining the model parameters of the adaptive generative model, as shown in Figure 3;
2. Tracking stage
S4. Input a candidate image p_i into the global discriminative model, which extracts the Haar-like features of p_i; the global classifier GC then classifies the globally compressed Haar-like features of p_i, with GC(p_i) denoting the classification result for candidate image p_i;
S5. Divide the candidate image p_i into N sub-regions with the method of step S2, have the N local discriminative models extract Haar-like features from the N sub-regions, and classify the locally compressed Haar-like features of the N sub-regions with the N local classifiers, with LC_j(p_i) denoting the classification result of the j-th local classifier on its sub-region. When the target is occluded, the global discriminative model may fail to judge the target region correctly, but among the N local discriminative models there are usually still one or more local classifiers whose corresponding sub-regions are unoccluded and which can judge the target region correctly.
S6. Combine the classification results of the global and local classification models to judge whether the candidate image contains the target region:
thr_GC and thr_LC denote the thresholds of the global classification and the local classification, respectively; y(p_i) = 1 indicates that the candidate image p_i contains the target region;
In the above scheme, the global discriminative model cannot work properly when the target region is blocked; to avoid this defect, the method provided by the present invention combines global judgment with local judgment to determine whether a candidate image contains the target region, greatly improving accuracy and applicability.
S7. Apply steps S4 to S6 to all candidate images to judge whether each contains the target region, then feed every candidate image judged to contain the target region into the adaptive generative model;
S8. For each candidate image, the adaptive generative model applies superpixel segmentation, extracts the feature vector of every superpixel, clusters all superpixel feature vectors with the K-means algorithm, and computes the clustering confidence of the candidate image; the candidate image with the highest confidence is then output as the tracking result. The output data comprise the confidence conf_T of the current tracking result and its matching area area_T with the target region, where A_i is the area of each superpixel and N is the number of superpixels contained in the candidate image patch.
This formulation says that when a superpixel is close to its cluster center in feature space, lies at a relative position in the target region close to that of its cluster's template superpixels, and belongs to a cluster with a high target/background confidence, this patent regards it as describing the appearance of the current target more fully and as having strong discriminative power. Here g'_i denotes a superpixel contained in the candidate image patch, k'_i the cluster to which g'_i belongs, and S'_i the distance of g'_i to that cluster; each cluster carries a target/background confidence, R'_j denotes the cluster radius, and conf'_i denotes the confidence of g'_i. L_i is the minimum spatial distance between each superpixel of the candidate image patch and the template superpixels of its cluster, a_s is a weight factor controlling the spatial-distance weight, and the remaining term is the spatial distance, within the target region, between g'_i and its cluster's template superpixels;
where A_target denotes, for each cluster, the summed count of member pixels belonging to the target region, and A_background the summed count of member pixels belonging to the background region;
A'_j denotes the pixel count of each superpixel in the current tracking result, the per-cluster sums give the number of target-region pixels contained in each superpixel cluster, and M denotes the total number of superpixels;
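The per-cluster target/background confidence can be formed from A_target and A_background as sketched below; the normalized difference is a common choice in the superpixel-tracking literature, and since the patent's formula image is not reproduced in this text, this form is an assumption.

```python
import numpy as np

def cluster_confidence(a_target, a_background):
    """Target/background confidence per cluster from summed member pixel counts."""
    a_target = np.asarray(a_target, dtype=float)
    a_background = np.asarray(a_background, dtype=float)
    total = np.maximum(a_target + a_background, 1.0)  # avoid division by zero
    return (a_target - a_background) / total          # in [-1, 1]
```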
3. Detection stage
S9. Construct the template-library generative model, have it detect the target region in the current frame, and return the confidence conf_D of the detection result; then estimate the current position of the target region from the outputs of the adaptive generative model and the template-library generative model:
1) When area_T ≥ thr_PL and conf_T ≥ thr_TH
where thr_TH and thr_PL denote the confidence threshold and the matching-area threshold, respectively. In this case the tracking result of the adaptive generative model has both high confidence and a high matching area, so the model is working normally and has adapted to the appearance of the target region; its output is therefore taken as the target position. The parameters of the global classifier GC, the local classifiers, and the adaptive generative model are then updated from area_T and conf_T according to the update strategy;
2) When area_T < thr_PL and conf_T ≥ thr_TH
In this case the matching area of the adaptive generative model's tracking result is low, but the confidence of the tracking result is still above the threshold, so the model's output is still taken as the target position. The parameters of the global classifier GC, the local classifiers, and the adaptive generative model are then updated from area_T and conf_T according to the update strategy;
3) When area_T ≥ thr_PL and conf_T < thr_TH
In this case the adaptive generative model's tracking result has low confidence but a high matching area, so the model's output is still taken as the target position. The parameters of the global classifier GC, the local classifiers, and the adaptive generative model are then updated from area_T and conf_T according to the update strategy;
4) When area_T < thr_PL, conf_T < thr_TH, and conf_D ≥ thr_DH
where thr_DH denotes the threshold on the confidence of the detection result. In this case both the confidence and the matching area of the adaptive generative model's tracking result are below their preset thresholds, while the template-library generative model has detected a target position with high confidence; the detection result of the template-library generative model is therefore output as the target position, and the global classifier GC, the local classifiers, and the adaptive generative model are reinitialized.
In the above scheme, the template-library generative model determines and outputs, according to a set strategy, the current working state of each model and the target position, while feeding back to, and updating, the global classifier GC, the local classifiers, and the adaptive generative model, so that the method can adapt to the various appearance changes of the target region during tracking.
Embodiment 2
The present invention also provides a system applying the collaborative tracking algorithm, as shown in Figure 3; its concrete scheme is as follows:
It comprises a tracking module, a detection module, and a position-estimation module, where the tracking module comprises the global discriminative model, the local discriminative models, and the adaptive generative model, the detection module comprises the template-library generative model, and the position-estimation module estimates the current position of the target region from the outputs of the adaptive generative model and the template-library generative model.
Obviously, the above embodiments of the present invention are merely examples given to illustrate the present invention clearly, not limitations on its implementation. Those of ordinary skill in the art can make changes or variations in other forms on the basis of the above description; it is neither necessary nor possible to enumerate all implementations here. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the protection scope of the claims of the present invention.
Claims (2)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510971312.1A CN105654505B (en) | 2015-12-18 | 2015-12-18 | A kind of collaboration track algorithm and system based on super-pixel |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510971312.1A CN105654505B (en) | 2015-12-18 | 2015-12-18 | A kind of collaboration track algorithm and system based on super-pixel |
Publications (2)
Publication Number | Publication Date |
---|---|
CN105654505A CN105654505A (en) | 2016-06-08 |
CN105654505B true CN105654505B (en) | 2018-06-26 |
Family
ID=56477692
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510971312.1A Expired - Fee Related CN105654505B (en) | 2015-12-18 | 2015-12-18 | A kind of collaboration track algorithm and system based on super-pixel |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105654505B (en) |
Families Citing this family (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107633500A (en) * | 2016-07-14 | 2018-01-26 | 南京视察者图像识别科技有限公司 | A kind of new image object testing process |
CN106504269B (en) * | 2016-10-20 | 2019-02-19 | 北京信息科技大学 | A multi-algorithm cooperative target tracking method based on image classification |
CN107273905B (en) * | 2017-06-14 | 2020-05-08 | 电子科技大学 | Target active contour tracking method combined with motion information |
CN109325387B (en) * | 2017-07-31 | 2021-09-28 | 株式会社理光 | Image processing method and device and electronic equipment |
JP7263983B2 (en) * | 2019-08-30 | 2023-04-25 | 富士通株式会社 | Photography omission detection device and photography omission detection method |
CN112489085B (en) * | 2020-12-11 | 2025-04-04 | 赵华 | Target tracking method, target tracking device, electronic device and storage medium |
CN116994337B (en) * | 2023-08-09 | 2025-04-25 | 西安电子科技大学 | Appearance detection action recognition device and method for dense objects in narrow space |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103413120A (en) * | 2013-07-25 | 2013-11-27 | 华南农业大学 | Tracking method based on integral and partial recognition of object |
CN103886619A (en) * | 2014-03-18 | 2014-06-25 | 电子科技大学 | Multi-scale superpixel-fused target tracking method |
CN104298968A (en) * | 2014-09-25 | 2015-01-21 | 电子科技大学 | Target tracking method under complex scene based on superpixel |
Non-Patent Citations (1)
Title |
---|
Yu Liu et al.; "Tracking Based on SURF and Superpixel"; 2011 Sixth International Conference on Image and Graphics; 2011-12-31; pp. 714-719 *
Also Published As
Publication number | Publication date |
---|---|
CN105654505A (en) | 2016-06-08 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN105654505B (en) | A kind of collaboration track algorithm and system based on super-pixel | |
US11393103B2 (en) | Target tracking method, device, system and non-transitory computer readable medium | |
Tian et al. | Robust and efficient foreground analysis in complex surveillance videos | |
Babu et al. | Robust tracking with motion estimation and local kernel-based color modeling | |
CN103164858B (en) | Adhesion crowd based on super-pixel and graph model is split and tracking | |
CN105046195B (en) | Human bodys' response method based on asymmetric generalized gaussian model | |
CN103824070B (en) | A kind of rapid pedestrian detection method based on computer vision | |
CN106778712B (en) | A multi-target detection and tracking method | |
CN104992453B (en) | Target in complex environment tracking based on extreme learning machine | |
CN107369183A (en) | Towards the MAR Tracing Registration method and system based on figure optimization SLAM | |
Han et al. | Automatic skin segmentation and tracking in sign language recognition | |
CN106570490B (en) | A real-time pedestrian tracking method based on fast clustering | |
EP2580739A2 (en) | Monocular 3d pose estimation and tracking by detection | |
CN110009060B (en) | A Robust Long-Term Tracking Method Based on Correlation Filtering and Object Detection | |
CN104091349A (en) | Robust target tracking method based on support vector machine | |
Ling et al. | A background modeling and foreground segmentation approach based on the feedback of moving objects in traffic surveillance systems | |
CN106097385A (en) | A kind of method and apparatus of target following | |
Xiao et al. | Vehicle and person tracking in aerial videos | |
Rohit et al. | A review on abnormal crowd behavior detection | |
Wang et al. | Hand posture recognition from disparity cost map | |
NC et al. | HOG-PCA descriptor with optical flow based human detection and tracking | |
Zhang et al. | A novel efficient method for abnormal face detection in ATM | |
CN109063600A (en) | Human motion method for tracing and device based on face recognition | |
Singh et al. | Human activity tracking using star skeleton and activity recognition using hmms and neural network | |
CN108573217A (en) | A Compressive Tracking Method Combined with Local Structured Information |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20180626 |