CN103198493B

CN103198493B - A kind ofly to merge and the method for tracking target of on-line study based on multiple features self-adaptation

Info

Publication number: CN103198493B
Application number: CN201310121576.9A
Authority: CN
Inventors: 苏育挺; 刘安安; 刘晓伟
Original assignee: Tianjin University
Current assignee: Tianjin University
Priority date: 2013-04-09
Filing date: 2013-04-09
Publication date: 2015-10-28
Anticipated expiration: 2033-04-09
Also published as: CN103198493A

Abstract

The invention discloses a target tracking method based on multi-feature adaptive fusion and online learning, which extracts target features and uses them as template features; respectively extracts three types of features for new candidate target areas; performs automatic tracking according to the discrimination and correlation of each feature Adapt to fusion; calculate the Bhattacharyachian distance between the fused feature and the template feature, and normalize the Bhattacharyachian distance as the weight of the new candidate target area; judge the overlap between the new candidate target area and the target area with the largest weight, if the overlap rate is less than Overlap rate threshold Input the multiple area of the new candidate target area with the maximum weight into the detector. When the output of the recognizer is yes, it means that the tracking is successful and the recognizer, template features and target area are updated; if the output is no, it means that a new target is found; if it overlaps rate is greater than or equal to the overlap rate threshold, update the recognizer, template features, and target regions. Enhanced the adaptability of target tracking in different scenes and under certain deformation conditions, avoiding the problem of easy tracking drift after occlusion.

Description

A Target Tracking Method Based on Multi-Feature Adaptive Fusion and Online Learning

技术领域technical field

本发明涉及目标跟踪领域，特别涉及一种基于多特征自适应融合和在线学习的目标跟踪方法。The invention relates to the field of target tracking, in particular to a target tracking method based on multi-feature adaptive fusion and online learning.

背景技术Background technique

视频目标跟踪是指对视频序列中目标的检测、表征和轨迹提取的过程。视频目标跟踪在视频监控、事件分析、人机交互等领域都有实际应用需求。Video object tracking refers to the process of object detection, representation and trajectory extraction in video sequences. Video object tracking has practical application requirements in video surveillance, event analysis, human-computer interaction and other fields.

目前，世界上最先进的监控系统也不能完美地处理复杂场景下的动态跟踪任务，例如：形变、遮挡、光照变化、阴影或拥挤环境下的跟踪。特别在目标间发生部分遮挡以及目标发生形变时，目标跟踪仍是个挑战。At present, the world's most advanced surveillance systems cannot perfectly handle dynamic tracking tasks in complex scenes, such as: deformation, occlusion, lighting changes, shadows or tracking in crowded environments. Object tracking remains a challenge, especially when objects are partially occluded and deformed.

基于单特征的跟踪方法通常初始化目标区域，提取任意目标特征，例如：颜色特征，在下一帧进行搜索和匹配。但该方法很难处理复杂背景下的跟踪任务，造成对目标跟踪以及后续的轨迹评估不具鲁棒性。Single-feature-based tracking methods usually initialize target regions, extract arbitrary target features, such as color features, and search and match them in the next frame. However, it is difficult for this method to deal with tracking tasks in complex backgrounds, which makes it not robust to target tracking and subsequent trajectory evaluation.

为此，现有技术中又提出了基于多特征的目标跟踪方法，该方法通常提取颜色、边缘等特征，可以较好地完成某些复杂情况下的跟踪任务。For this reason, an object tracking method based on multi-features is proposed in the prior art. This method usually extracts features such as color and edge, and can better complete tracking tasks in some complex situations.

发明人在实现本发明的过程中，发现现有技术中至少存在以下缺点和不足：In the process of realizing the present invention, the inventor finds that at least the following disadvantages and deficiencies exist in the prior art:

基于多特征的目标跟踪方法无法适应目标外形的变化，不能很好的解决目标之间发生部分遮挡后跟踪漂移的问题。The target tracking method based on multi-features cannot adapt to the change of target shape, and cannot solve the problem of tracking drift after partial occlusion between targets.

发明内容Contents of the invention

本发明提供了一种基于多特征自适应融合和在线学习的目标跟踪方法，本方法避免了跟踪漂移的问题，很好的适应了目标外形的变化，详见下文描述：The present invention provides a target tracking method based on multi-feature adaptive fusion and online learning. This method avoids the problem of tracking drift and is well adapted to changes in target shape. See the following description for details:

一种基于多特征自适应融合和在线学习的目标跟踪方法，所述方法包括以下步骤：A target tracking method based on multi-feature adaptive fusion and online learning, said method comprising the following steps:

（1）从一帧图像中选定目标区域，提取目标特征并作为模板特征；(1) Select the target area from a frame of image, extract the target features and use them as template features;

（2）初始化识别器，输入下一帧图像，初始化候选目标区域；根据转移公式获取新候选目标区域；(2) Initialize the recognizer, input the next frame of image, and initialize the candidate target area; obtain the new candidate target area according to the transfer formula;

（3）对所述新候选目标区域分别提取颜色、边缘、纹理三种特征；根据各特征的辨别性和相关性进行自适应融合；(3) Extract three features of color, edge, and texture from the new candidate target area; perform adaptive fusion according to the discrimination and correlation of each feature;

（4）计算融合后特征与模版特征的巴氏距离，将所述巴氏距离归一化后作为所述新候选目标区域的权重；(4) Calculate the Bhattacharyachian distance between the fused feature and the template feature, and normalize the Bhattacharyachian distance as the weight of the new candidate target area;

（5）将N个新候选目标区域按照权重大小进行排序，若重采样判断值大于重采样判决阈值，进行重采样，执行步骤（6）；如果否，执行步骤（6）；(5) Sort the N new candidate target areas according to their weights. If the resampling judgment value is greater than the resampling judgment threshold, perform resampling and perform step (6); if not, perform step (6);

（6）对最大权重的新候选目标区域与所述目标区域进行重叠判断，若重叠率小于重叠率阈值执行步骤（7）；否则，执行步骤（8）；(6) Judging the overlap between the new candidate target area with the maximum weight and the target area, if the overlap rate is less than the overlap rate threshold, perform step (7); otherwise, perform step (8);

（7）将最大权重的新候选目标区域的多倍区域输入检测器，若所述检测器输出0，表示跟踪失败；否则将所述检测器输出结果输入到识别器，若所述识别器输出是，表示跟踪成功并更新所述识别器、所述模板特征和所述目标区域；若输出否，表示发现新目标，流程结束；(7) Input the multiple area of the new candidate target area with the maximum weight into the detector. If the detector outputs 0, it means that the tracking fails; otherwise, input the output result of the detector to the recognizer. If the recognizer outputs Yes, it means that the tracking is successful and the recognizer, the template feature and the target area are updated; if the output is No, it means that a new target is found, and the process ends;

（8）重叠率大于等于重叠率阈值，则认为跟踪成功，更新所述识别器、所述模板特征和所述目标区域，流程结束。(8) If the overlap rate is greater than or equal to the overlap rate threshold, it is considered that the tracking is successful, the recognizer, the template feature and the target area are updated, and the process ends.

所述提取目标特征并作为模板特征的步骤具体包括：The step of extracting target features and using them as template features specifically includes:

1）提取颜色特征信息；1) Extract color feature information;

2）提取边缘特征信息；2) Extract edge feature information;

3）提取纹理特征信息；3) Extract texture feature information;

4）融合所述颜色特征信息、所述边缘特征信息和所述纹理特征信息，得到目标特征直方图，作为所述模板特征。4) Fusing the color feature information, the edge feature information and the texture feature information to obtain a target feature histogram as the template feature.

所述提取颜色特征信息的步骤具体包括：The step of extracting color feature information specifically includes:

1）将颜色空间分为彩色区域和非彩色区域，对所述彩色区域和所述非彩色区域进行HSV分区，获取Q_H×Q_S个彩色子区间和Q_v个非彩色子区间，将所述Q_H×Q_S个彩色子区间和所述Q_v个非彩色子区间作为Q_H×Q_S+Q_v个颜色区间u；1) Divide the color space into a color area and an achromatic area, perform HSV partition on the color area and the achromatic area, obtain Q _H × Q _S color sub-intervals and Q _v achromatic sub-intervals, divide all The QH×QS color subintervals and the _Qv achromatic subintervals are used as _QH × _QS ₊ _Qv color intervals _u ;

2）根据像素点与目标区域中心点的距离远近对所述像素点赋予不同的权值，根据所述像素点的HSV对相应的颜色区间u进行投票；2) Assign different weights to the pixel according to the distance between the pixel and the center point of the target area, and vote for the corresponding color interval u according to the HSV of the pixel;

3）统计每一颜色区间的投票值得到颜色特征直方图。3) Count the voting values of each color interval to obtain a color feature histogram.

所述提取边缘特征信息的步骤具体包括：The step of extracting edge feature information specifically includes:

1）对所述目标区域插值得到2倍宽和高的插值区域，然后分别对所述目标区域和所述插值区域分块；1) Interpolating the target area to obtain an interpolation area twice the width and height, and then dividing the target area and the interpolation area into blocks respectively;

2）计算每一子块的边缘强度和方向，在0°-360°范围内将边缘方向划分为若干个方向区域，根据边缘强度对方向区域投票，得到每一子块的边缘特征直方图；2) Calculate the edge strength and direction of each sub-block, divide the edge direction into several direction areas within the range of 0°-360°, vote for the direction area according to the edge strength, and obtain the edge feature histogram of each sub-block;

3）将每一子块计算得到的所述边缘特征直方图连接起来得到完整边缘特征直方图。3) The edge feature histograms calculated by each sub-block are connected to obtain a complete edge feature histogram.

所述提取纹理特征信息的步骤具体包括：The step of extracting texture feature information specifically includes:

1）对每一子块，计算局部二值模式特征直方图；1) For each sub-block, calculate the local binary pattern feature histogram;

2）将每一子块计算得到的所述局部二值模式特征直方图连接起来得到完整纹理特征直方图。2) Connect the local binary pattern feature histograms calculated by each sub-block to obtain a complete texture feature histogram.

所述辨别性定义为所述新候选目标区域与相邻背景关于某一特征的相似程度，用两直方图的巴氏系数表示。The discrimination is defined as the degree of similarity between the new candidate target region and the adjacent background with respect to a certain feature, expressed by the Bhattachary coefficient of the two histograms.

所述相关性定义为所述新候选目标区域与所述模版特征关于某一特征的相似程度，用两直方图的巴氏系数表示。The correlation is defined as the degree of similarity between the new candidate target region and the template feature with respect to a certain feature, expressed by the Bhattachary coefficient of the two histograms.

所述更新识别器具体为：正样本由所述检测器验证过的新候选目标区域组成，负样本为背景中随机选取的与新候选目标区域等尺寸的区域；The update identifier is specifically: the positive sample is composed of the new candidate target area verified by the detector, and the negative sample is a randomly selected area in the background with the same size as the new candidate target area;

所述更新模板特征具体为：将最大权重的新候选目标区域的特征作为更新后的模板特征；The updating template feature is specifically: using the feature of the new candidate target region with the maximum weight as the updated template feature;

所述更新目标区域具体为：将最大权重的新候选目标区域作为更新后的目标区域。The updating of the target area specifically includes: taking the new candidate target area with the largest weight as the updated target area.

本发明提供的技术方案的有益效果是：本方法克服了单特征的单一性，增强了不同场景以及一定形变情况下目标跟踪的适应能力，并且避免了遮挡后易发生跟踪漂移的问题，大大提高了目标跟踪的精确性和鲁棒性。The beneficial effects of the technical solution provided by the present invention are: the method overcomes the singleness of a single feature, enhances the adaptability of target tracking in different scenes and under certain deformation conditions, and avoids the problem of easy tracking drift after occlusion, greatly improving The accuracy and robustness of target tracking are improved.

附图说明Description of drawings

图1为一种基于多特征自适应融合和在线学习的目标跟踪方法的流程图；Fig. 1 is a flow chart of a target tracking method based on multi-feature adaptive fusion and online learning;

图2为初始化目标区域的示意图；FIG. 2 is a schematic diagram of initializing a target area;

图3为目标1发生遮挡的示意图；FIG. 3 is a schematic diagram of occlusion of target 1;

图4为通过在线学习成功找回遮挡丢失的目标的示意图；Figure 4 is a schematic diagram of successfully retrieving the lost target through online learning;

图5为另一初始化目标区域的示意图；5 is a schematic diagram of another initialization target area;

图6为目标1和目标2发生交错遮挡的示意图；Fig. 6 is a schematic diagram of staggered occlusion of target 1 and target 2;

图7为精确跟踪没有发生跟踪漂移的示意图。Fig. 7 is a schematic diagram of accurate tracking without tracking drift.

具体实施方式Detailed ways

为使本发明的目的、技术方案和优点更加清楚，下面将结合附图对本发明实施方式作进一步地详细描述。In order to make the object, technical solution and advantages of the present invention clearer, the implementation manner of the present invention will be further described in detail below in conjunction with the accompanying drawings.

为了避免跟踪漂移的问题，很好的适应目标外形的变化，本发明实施例提供了一种基于多特征自适应融合和在线学习的目标跟踪方法，在线学习通过实时自动学习，可以克服目标形变以及跟踪漂移带来的问题，较好地实现了预期的跟踪效果，参见图1，详见下文描述：In order to avoid the problem of tracking drift and adapt to the change of target shape, the embodiment of the present invention provides a target tracking method based on multi-feature adaptive fusion and online learning. Online learning can overcome target deformation and The problems caused by tracking drift have better achieved the expected tracking effect, see Figure 1, and see the description below for details:

101：将任意视频序列的一帧作为输入，从一帧图像中选定目标区域，提取目标特征并作为模板特征；101: Take a frame of any video sequence as input, select a target area from a frame of image, extract target features and use them as template features;

其中，选定目标区域的操作为本领域技术人员所公知，目标区域为矩形区域，例如：根据实际需求可以人为手工选定被跟踪对象所在矩形区域；或者采用被检测对象模型（例如：人体检测模型[1]）进行自动检测，计算模版特征。Wherein, the operation of selecting the target area is well known to those skilled in the art, and the target area is a rectangular area, for example: according to actual needs, the rectangular area where the tracked object is located can be manually selected; or the detected object model (for example: human body detection Model [1]) for automatic detection and calculation of template features.

其中，提取目标特征并作为模板特征的步骤具体为：Among them, the steps of extracting target features and using them as template features are as follows:

1）提取颜色特征信息；1) Extract color feature information;

本方法采用基于HSV（色度、饱和度和亮度）颜色空间模型的核加权颜色特征直方图对目标进行建模，基本思想是：This method uses the kernel-weighted color feature histogram based on the HSV (hue, saturation, and brightness) color space model to model the target. The basic idea is:

（1）将颜色空间分为彩色区域和非彩色区域，对彩色区域和非彩色区域进行HSV分区，获取Q_H×Q_S个彩色子区间和Q_v个非彩色子区间，将Q_H×Q_S个彩色子区间和Q_v个非彩色子区间作为Q_H×Q_S+Q_v个颜色区间u；(1) Divide the color space into color area and achromatic area, perform HSV partition on the color area and achromatic area, obtain Q _H × Q _S color sub-intervals and Q _v achromatic sub-intervals, divide Q _H × Q _S color sub-intervals and Q _v achromatic sub-intervals are used as Q _H × Q _S + Q _v color intervals u;

例如：所有亮度小于20%或者饱和度小于10%均归入非彩色区域，并按亮度值分为Q_v个非彩色子区间，非彩色区域以外的颜色区域为彩色区域，按色度和饱和度分为Q_H×Q_S个彩色子区间。For example: all brightness less than 20% or saturation less than 10% are classified as non-color areas, and are divided into Q _v achromatic sub-intervals according to the brightness value, and the color areas outside the achromatic area are color areas, according to chroma and saturation Degree is divided into Q _H × Q _S color subintervals.

（2）根据像素点与目标区域中心点的距离远近对像素点赋予不同的权值（即对距离目标中心较远的像素点赋予较小的权值，从而减弱目标边界及背景的干扰），根据像素点的HSV对相应的颜色区间u进行投票；(2) According to the distance between the pixel point and the center point of the target area, different weights are assigned to the pixel point (that is, a smaller weight value is assigned to the pixel point farther from the center point of the target area, thereby reducing the interference of the target boundary and background), Vote for the corresponding color interval u according to the HSV of the pixel;

定义目标区域为宽W，高H的矩形区域。在目标区域边界的像素点可能属于背景或发生了部分遮挡，为了增加颜色分布的可靠性，采用如下函数分配权重：Define the target area as a rectangular area with width W and height H. Pixels on the boundary of the target area may belong to the background or be partially occluded. In order to increase the reliability of the color distribution, the following function is used to assign weights:

$k k (({x x}_{i i})) = = \{\begin{matrix} 11 - - {((\frac{y the y - - {x x}_{i i}}{h h}))}^{22} & r r < < 11 \\ 00 & r r &GreaterEqual; &Greater Equal; 11 \end{matrix}$

式中，k(x_i)表示赋予像素点x_i的权重，y表示目标区域中心点，x_i为目标区域内第i个像素点；表示目标区域大小。In the formula, k( _xi ) represents the weight assigned to the pixel x _i , y represents the center point of the target area, and x _i is the i-th pixel in the target area; Indicates the target area size.

本方法通过狄拉克函数δ确定每个像素点所对应的颜色区间，δ(b(x_i)-u)表示像素点x_i在直方图中颜色区间u的分布：b(x_i)为像素点x_i的HSV，φ，η为常数。狄拉克函数δ定义如下：This method uses the Dirac function δ to determine the color interval corresponding to each pixel point, δ(b( _xi )-u) represents the distribution of pixel point x _i in the color interval u in the histogram: b( _xi ) is the pixel HSV of point x _i , φ, η is a constant. The Dirac function δ is defined as follows:

δ(x-φ)=0,x≠φδ(x-φ)=0, x≠φ

例如：有3个像素点，分别为：x₁、x₂和x₃，像素点x₁对应的颜色区间为u1，像素点x2对应的颜色区间为u1，像素点x₃对应的颜色区间为u2，则颜色区间u1的投票结果为像素点x₁和像素点x₂的权重之和；颜色区间u2的投票结果为像素点x₃的权重；颜色区间u3的投票结果为0。For example: there are 3 pixels, namely: x ₁ , x ₂ and x ₃ , the color interval corresponding to pixel x ₁ is u1, the color interval corresponding to pixel x2 is u1, and the color interval corresponding to pixel x ₃ is u2, the voting result of color interval u1 is the sum of the weights of pixel point x1 and pixel point _x2 _; the voting result of color interval u2 is the weight of pixel point _x3 ; the voting result of color interval u3 is 0.

（3）通过每一颜色区间计算得到的投票值得到颜色特征直方图，颜色特征直方图的维数为彩色区间与非彩色区间的个数之和。(3) The color feature histogram is obtained through the voting value calculated in each color interval, and the dimension of the color feature histogram is the sum of the number of color intervals and achromatic intervals.

颜色特征直方图p_c可以表示为：The color feature histogram p _c can be expressed as:

${p p}_{c c} = = {C C}_{h h} {Σ Σ}_{u u = = 11}^{U u} {Σ Σ}_{i i = = 11}^{N N} k k (({x x}_{i i})) * * δ δ ((b b (({x x}_{i i})) - - u u))$

式中，U为颜色区间的个数，N为目标区域像素点的个数,归一化常数C_h：In the formula, U is the number of color intervals, N is the number of pixels in the target area, and the normalization constant C _h :

${C C}_{h h} = = \frac{11}{{Σ Σ}_{i i = = 11}^{N N} k k (({x x}_{i i}))}$

2）提取边缘特征信息；2) Extract edge feature information;

本方法采用一种基于多尺度分块的边缘强度加权的边缘方向特征，该步骤具体为：This method adopts an edge direction feature based on multi-scale block edge intensity weighting, and the steps are as follows:

（1）对目标区域插值得到2倍宽和高的插值区域，然后分别对目标区域和插值区域分块；(1) Interpolate the target area to obtain an interpolation area twice as wide and high, and then divide the target area and the interpolation area into blocks;

其中，对目标区域插值的方法为本领域技术人员所公知，分块方法采用重叠块或非重叠块等方法，将目标区域和插值区域分为若干个子块，本发明实施例对此不做限制。Among them, the method of interpolating the target area is well known to those skilled in the art. The block method uses methods such as overlapping blocks or non-overlapping blocks to divide the target area and the interpolation area into several sub-blocks, which is not limited in the embodiment of the present invention .

（2）计算每一子块的边缘强度和方向，在0°-360°范围内将边缘方向划分为若干个方向区域，根据边缘强度对方向区域投票，得到每一子块的边缘特征直方图；(2) Calculate the edge strength and direction of each sub-block, divide the edge direction into several direction areas within the range of 0°-360°, vote for the direction area according to the edge strength, and obtain the edge feature histogram of each sub-block ;

其中，本发明实施例采用索贝尔算子[2]计算出边缘强度和方向，还可以采用其他的方法，本发明实施例对此不做限制。方向区域的个数根据实际应用中的需要进行设定，例如：以20度为区间，在0°-360°范围内将边缘方向划分为18个方向区域。根据边缘强度对方向区域投票的步骤为本领域技术人员所公知，本发明实施例在此不做赘述。Wherein, the embodiment of the present invention uses the Sobel operator [2] to calculate the edge strength and direction, and other methods may also be used, which is not limited in the embodiment of the present invention. The number of direction areas is set according to the needs of practical applications, for example, the edge direction is divided into 18 direction areas within the range of 0°-360° with 20 degrees as the interval. The step of voting for the direction region according to the edge strength is well known to those skilled in the art, and the embodiment of the present invention will not repeat it here.

（3）将每一子块计算得到的边缘特征直方图连接起来得到完整边缘特征直方图。(3) Connect the edge feature histograms calculated by each sub-block to obtain a complete edge feature histogram.

其中，完整边缘特征直方图的维数为方向区域个数与子块总数的乘积。Among them, the dimension of the complete edge feature histogram is the product of the number of direction regions and the total number of sub-blocks.

3）提取纹理特征信息；3) Extract texture feature information;

本方法采用基于分块的多尺度局部二值模式[3]算子计算纹理特征，相比原始的局部二值模式特征，基于分块的多尺度局部二值模式特征对图像噪声的影响更加不敏感，且能够提取到更加丰富的局部和全局信息，对目标图像具有更强的表示能力和判别能力，鲁棒性更强。This method uses the block-based multi-scale local binary model [3] operator to calculate the texture features. Compared with the original local binary pattern features, the block-based multi-scale local binary pattern features have less impact on image noise. It is sensitive, and can extract richer local and global information, has stronger representation and discrimination capabilities for target images, and is more robust.

（1）对每一子块，计算局部二值模式特征直方图；(1) For each sub-block, calculate the feature histogram of the local binary pattern;

其中，每一子块为边缘特征信息中划分得到的子块，本方法采用包含59种样式的统一模式[4]计算局部二值模式特征。Among them, each sub-block is a sub-block obtained by dividing edge feature information, and this method uses a unified mode [4] containing 59 styles to calculate local binary mode features.

（2）将每一子块计算得到的局部二值模式特征直方图连接起来得到完整纹理特征直方图。(2) Connect the local binary pattern feature histograms calculated by each sub-block to obtain a complete texture feature histogram.

其中，完整纹理特征直方图的维数为59倍的子块总数。Among them, the dimensionality of the complete texture feature histogram is 59 times the total number of sub-blocks.

4）融合颜色特征信息、边缘特征信息和纹理特征信息，得到目标特征直方图，作为模板特征。4) The color feature information, edge feature information and texture feature information are fused to obtain the target feature histogram as the template feature.

即对颜色特征信息、边缘特征信息和纹理特征信息进行归一化处理，按照预设权重（权重值根据实际应用中的需要进行设定，例如：颜色特征信息、边缘特征信息和纹理特征信息的权重比为1：1：1）将归一化后的三种特征信息连接，得到目标特征直方图。That is to normalize the color feature information, edge feature information and texture feature information, according to the preset weight (the weight value is set according to the needs of practical applications, for example: color feature information, edge feature information and texture feature information The weight ratio is 1:1:1) The three normalized feature information are connected to obtain the target feature histogram.

102：初始化识别器；102: Initialize the recognizer;

识别器用于找回丢失的已跟踪目标，识别器采用Boost算法[5]训练进行学习，学习中的正样本为目标区域，负样本随机选取背景中与正样本大小相同的矩形区域，负样本个数的选择根据实际需求确定，本实验参考值为3。The recognizer is used to retrieve the lost tracked target. The recognizer uses the Boost algorithm [5] to train and learn. The positive samples in the learning are the target areas, and the negative samples randomly select the rectangular area with the same size as the positive samples in the background. The negative samples The selection of the number is determined according to actual needs, and the reference value of this experiment is 3.

103：输入下一帧图像，初始化候选目标区域；103: Input the next frame of image, and initialize the candidate target area;

根据高斯分布，在目标区域周围随机选取N个区域作为候选目标区域。N的取值根据实际需要进行选择，本实验中参考值为20。According to the Gaussian distribution, randomly select N regions around the target region as candidate target regions. The value of N is selected according to actual needs, and the reference value is 20 in this experiment.

104：根据转移公式获取新候选目标区域；104: Obtain a new candidate target area according to the transfer formula;

R_n=A*(R_c-R₀)+B*rng+R₀ R _n =A*(R _c -R ₀ )+B*rng+R ₀

其中，R_n，R_c，R₀分别为新候选目标区域、候选目标区域、目标区域；A、B为转移系数；rng为随机数产生器。例如：新候选目标区域、候选目标区域和目标区域用矩形区域的宽、高和中心坐标（x,y）表示，即宽、高和中心坐标来确定新候选目标区域，当计算新候选目标区域的宽参数时，将候选目标区域和目标区域的宽参数代入到转移公式中，得到新候选目标区域的宽参数，以此类推，分别计算出新候选目标区域的高参数和中心坐标，最终得到新候选目标区域。Among them, R _n , R _c , and R ₀ are the new candidate target area, the candidate target area, and the target area respectively; A and B are transfer coefficients; rng is a random number generator. For example: the new candidate target area, the candidate target area and the target area are represented by the width, height and center coordinates (x, y) of the rectangular area, that is, the width, height and center coordinates are used to determine the new candidate target area. When calculating the new candidate target area When the width parameter of the candidate target area and the width parameter of the target area are substituted into the transfer formula, the width parameter of the new candidate target area is obtained. By analogy, the height parameter and center coordinates of the new candidate target area are calculated respectively, and finally New candidate target regions.

105：对新候选目标区域分别提取颜色、边缘、纹理三种视觉特征；105: Extract three visual features of color, edge, and texture from the new candidate target area;

其中，该步骤的具体操作与步骤101相同，本发明实施例对此不做赘述。Wherein, the specific operation of this step is the same as that of step 101, which is not repeated in this embodiment of the present invention.

106：根据颜色、边缘和纹理特征的辨别性和相关性，对各特征进行自适应融合；106: According to the discrimination and correlation of color, edge and texture features, adaptively fuse each feature;

辨别性RD_f定义为新候选目标区域与相邻背景关于某一特征的相似程度，用两直方图的巴氏系数来表示：The discriminative RD _f is defined as the degree of similarity between the new candidate target area and the adjacent background on a certain feature, expressed by the Bhattachary coefficient of the two histograms:

${RD RD}_{f f} = = \frac{11}{{Σ Σ}_{m m = = 11}^{M m} \sqrt{{q q}_{f f ((bg bg))}^{m m} \cdot \cdot {p p}_{f f}^{m m}}},, f f &Element; &Element; {{HSV HSV,, EO EO,, LBP LBP}}$

式中，HSV、EO、LBP分别表示颜色、边缘、纹理特征，和分别表示相邻背景和目标区域特征直方图的第m维；M为颜色特征直方图的维数或完整边缘特征直方图的维数或完整纹理特征直方图的维数。辨别性RD_f越小，该特征的区分程度越强，分配较高的权重；相反，分配较低的权重。In the formula, HSV, EO, and LBP respectively represent color, edge, and texture features, and Represents the mth dimension of the feature histogram of the adjacent background and target area respectively; M is the dimension of the color feature histogram or the dimension of the complete edge feature histogram or the dimension of the complete texture feature histogram. The smaller the discriminative RD _f is, the stronger the discrimination of the feature is, and a higher weight is assigned; on the contrary, a lower weight is assigned.

相关性定义为新候选目标区域与模版特征关于某一特征的相似程度，用两直方图的巴氏系数来表示：Correlation is defined as the degree of similarity between the new candidate target area and the template feature on a certain feature, expressed by the Bhattachary coefficient of the two histograms:

${CD cd}_{f f} = = \frac{11}{{Σ Σ}_{m m = = 11}^{M m} \sqrt{{q q}_{f f}^{m m} \cdot \cdot {p p}_{f f}^{m m}}},, f f &Element; &Element; {{HSV HSV,, EO EO,, LBP LBP}}$

式中，HSV、EO、LBP分别表示颜色、边缘、纹理特征，和分别表示新候选目标区域和目标区域特征直方图的第m维；M为颜色特征直方图的维数或完整边缘特征直方图的维数或完整纹理特征直方图的维数；相关性CD_f越大，该特征的相关程度越强，分配较高的权重；相反，分配较低的权重。In the formula, HSV, EO, and LBP respectively represent color, edge, and texture features, and represent the mth dimension of the feature histogram of the new candidate target area and the target area respectively; M is the dimension of the color feature histogram or the dimension of the complete edge feature histogram or the dimension of the complete texture feature histogram; the correlation CD _f The larger the feature is, the stronger the correlation of the feature is, and a higher weight is assigned; on the contrary, a lower weight is assigned.

每一种特征的权重α，β，γ最终由RDf和CDf共同决定：The weights α, β, and γ of each feature are finally jointly determined by RDf and CDf:

$\{\begin{matrix} α α = = C C \cdot \cdot {RD RD}_{HSV HSV} \cdot \cdot {CD cd}_{HSV HSV} \\ β β = = C C \cdot &Center Dot; {RD RD}_{EO EO} \cdot \cdot {CD cd}_{EO EO} \\ γ γ = = C C \cdot &Center Dot; {RD RD}_{LBP LBP} \cdot &Center Dot; {CD cd}_{LBP LBP} \end{matrix}$

式中，是归一化常数，α+β+γ=1；HSV、EO、LBP分别表示颜色、边缘、纹理特征。In the formula, is a normalization constant, α+β+γ=1; HSV, EO, and LBP represent color, edge, and texture features, respectively.

107：计算融合后特征与模版特征的巴氏距离d，对巴氏距离d进行归一化处理，将归一化结果作为新候选目标区域的权重；107: Calculate the Bhattacharyachian distance d between the fused feature and the template feature, perform normalization processing on the Bhattacharyachian distance d, and use the normalized result as the weight of the new candidate target area;

$d d = = \sqrt{11 - - ρ ρ}$

$ρ ρ = = {Σ Σ}_{m m = = 11}^{M m} \sqrt{{p p}^{m m} \cdot &Center Dot; {q q}^{m m}}$

其中，ρ表示巴氏系数；M为颜色特征直方图的维数或完整边缘特征直方图的维数或完整纹理特征直方图的维数；q^m，p^m分别为融合后特征与模版特征的第m维。Among them, ρ represents the ^Bhattachary coefficient; ^M is the dimension of the color feature histogram or the dimension of the complete edge feature histogram or the dimension of the complete texture feature histogram; mth dimension.

108：将N个新候选目标区域按照权重大小进行排序，计算重采样判断值N_j，若重采样判断值大于重采样判决阈值，进行重采样，执行步骤109；如果否，执行步骤109；108: Sort the N new candidate target areas according to their weights, and calculate the resampling judgment value N _j , if the resampling judgment value is greater than the resampling judgment threshold, perform resampling, and execute step 109; if not, execute step 109;

其中，N为新候选目标区域数目；为第i个新候选目标区域权重。in, N is the number of new candidate target regions; is the weight of the i-th new candidate object region.

重采样判决阈值根据经验所得，参考值为50。重采样即删除小权重新候选目标区域，然后复制最大权重的新候选目标区域替换删除掉的新候选目标区域。The resampling decision threshold is obtained based on experience, and the reference value is 50. Resampling is to delete the low-weight candidate target area, and then copy the new candidate target area with the largest weight to replace the deleted new candidate target area.

109：对最大权重的新候选目标区域与目标区域进行重叠判断，若重叠率小于重叠率阈值执行步骤110；否则，执行步骤111；109: Judging the overlap between the new candidate target area with the maximum weight and the target area, if the overlap rate is less than the overlap rate threshold, perform step 110; otherwise, perform step 111;

其中，重叠率阈值根据实际应用中的情况进行设定，例如：0.3，具体实现时，本发明实施例对此不做限制。Wherein, the overlapping rate threshold is set according to the actual application situation, for example: 0.3, and this embodiment of the present invention does not limit it during specific implementation.

110：将最大权重的新候选目标区域(宽W，高H，中心（x,y）)的多倍区域（宽LW，高LH，中心（x,y），L为倍数）输入检测器，若检测器输出0，表示跟踪失败；否则将检测器输出结果输入到识别器，若识别器输出是，表示跟踪成功并更新识别器、模板特征和目标区域；若输出否，表示发现新目标，流程结束；110: Input the multiple area (width LW, height LH, center (x, y), L is a multiple) of the maximum weighted new candidate target area (width W, height H, center (x, y)) to the detector, If the output of the detector is 0, it means the tracking failed; otherwise, the output of the detector is input to the recognizer, if the output of the recognizer is yes, it means that the tracking is successful and the recognizer, template features and target area are updated; if the output is no, it means that a new target is found, end of process;

检测器是一种可以检测感兴趣目标的通用分类器，用于区分被跟踪目标与其他物体。检测器在离线下训练，在跟踪过程中不会更新。检测器输入可以是一帧图像，也可以是一帧图像中的一块区域，输出是该图像中目标对象的区域。A detector is a general-purpose classifier that can detect objects of interest and is used to distinguish tracked objects from other objects. The detectors are trained offline and are not updated during tracking. The detector input can be a frame of image, or a region in a frame of image, and the output is the region of the target object in the image.

识别器用于获取跟踪过程中特定目标的先验信息，它只在特定情况下进行更新。识别器更新中的正样本是由检测器验证过的新候选目标区域组成，负样本为背景中随机选取的与新候选目标区域等尺寸区域。识别器通过对验证过的新候选目标区域进行识别来防止跟踪漂移的发生。识别器输入是目标区域，输出是逻辑值，表示该目标是否属于识别器表示的目标。将最大权重的新候选目标区域的特征作为更新后的模板特征；将最大权重的新候选目标区域作为更新后的目标区域。The recognizer is used to obtain prior information about a specific target during tracking, and it is only updated in specific situations. The positive samples in the discriminator update are composed of new candidate object regions verified by the detector, and the negative samples are randomly selected regions in the background with the same size as the new candidate object regions. The recognizer prevents the occurrence of tracking drift by identifying new candidate object regions that have been verified. The recognizer input is the object region, and the output is a logical value indicating whether the object belongs to the object represented by the recognizer. The feature of the new candidate target area with the maximum weight is used as the updated template feature; the new candidate target area with the maximum weight is used as the updated target area.

111：重叠率大于等于重叠率阈值，则认为跟踪成功，更新识别器、模板特征和目标区域，流程结束。111: If the overlap rate is greater than or equal to the overlap rate threshold, it is considered that the tracking is successful, and the recognizer, template features and target area are updated, and the process ends.

其中，该步骤中的更新识别器、模板特征和目标区域的操作和步骤110中的更新操作一致，本发明实施例在此不做赘述。Wherein, the operation of updating the recognizer, the template feature, and the target area in this step is consistent with the updating operation in step 110, and will not be described in detail here in this embodiment of the present invention.

对视频序列的其他帧重复执行步骤104至步骤111，直到遍历完整个视频。Repeat step 104 to step 111 for other frames of the video sequence until the entire video is traversed.

下面以具体的实验来验证本发明实施例提供的一种基于多特征自适应融合和在线学习的目标跟踪方法的可行性，本实验将人作为目标进行跟踪，具体结果如图2至图7所示。其中，检测器采用基于边缘梯度直方图特征和支持向量机模型训练得到的人体检测器。The following specific experiments are used to verify the feasibility of a target tracking method based on multi-feature adaptive fusion and online learning provided by the embodiment of the present invention. In this experiment, people are used as targets for tracking. The specific results are shown in Figures 2 to 7 Show. Among them, the detector adopts the human detector based on the edge gradient histogram feature and the support vector machine model training.

图2，图3和图4，分别为视频序列第16，19，23帧。在第16帧初始化目标区域，第19帧，目标1发生了遮挡，跟踪丢失，第23帧通过在线学习成功找回遮挡丢失的目标，证明了本方法可以有效避免遮挡后目标丢失问题。Figure 2, Figure 3 and Figure 4 are the 16th, 19th, and 23rd frames of the video sequence, respectively. The target area was initialized in the 16th frame. In the 19th frame, the target 1 was occluded and the tracking was lost. In the 23rd frame, the occlusion lost target was successfully retrieved through online learning, which proved that this method can effectively avoid the problem of target loss after occlusion.

图5，图6和图7分别为视频序列第291，294和299帧。在第291帧，初始化目标区域，第294帧，目标1和目标2发生交错遮挡，第299帧，本方法进行了精确有效地跟踪，避免了目标交错遮挡后易发生跟踪漂移的问题。Figure 5, Figure 6 and Figure 7 are the 291st, 294th and 299th frames of the video sequence respectively. In frame 291, the target area is initialized. In frame 294, target 1 and target 2 are occluded alternately. In frame 299, this method performs accurate and effective tracking, avoiding the problem of easy tracking drift after target staggered occlusion.

参考文献references

[1]N.Dalal and B.Triggs.Histograms of oriented gradients for human detection.CVPR,2005.[1]N.Dalal and B.Triggs.Histograms of oriented gradients for human detection.CVPR,2005.

[2]Kanopoulos N,Vasanthavada N,Baker R.L.Design of an image edge detectionfilter using the Sobel operator.IEEE Journal of Volume:23,Issue.1988.[2] Kanopoulos N, Vasanthavada N, Baker R.L. Design of an image edge detection filter using the Sobel operator. IEEE Journal of Volume: 23, Issue.1988.

[3]Wang Xiaoyu,Han,Tony X,Yan Shuicheng.An HOG-LBP human detector withpartial occlusion handling.Computer Vision,32-39,2009.[3]Wang Xiaoyu, Han, Tony X, Yan Shuicheng. An HOG-LBP human detector with partial occlusion handling. Computer Vision, 32-39, 2009.

[4]T.Ahonen,A.Hadid,and M.Pietikainen.Face Recogniton with Local BinaryPatterns.Proc.European Conf.Computer Vision,469-481,2004.[4] T. Ahonen, A. Hadid, and M. Pietikainen. Face Recogniton with Local Binary Patterns. Proc. European Conf. Computer Vision, 469-481, 2004.

[5]P.Viola and M.Jones.Rapid object detection using a boosted cascade of simplefeatures.In Proc.CVPR,volume I,511-518,2001.[5]P.Viola and M.Jones.Rapid object detection using a boosted cascade of simple features.In Proc.CVPR,volume I,511-518,2001.

本领域技术人员可以理解附图只是一个优选实施例的示意图，上述本发明实施例序号仅仅为了描述，不代表实施例的优劣。Those skilled in the art can understand that the accompanying drawing is only a schematic diagram of a preferred embodiment, and the serial numbers of the above-mentioned embodiments of the present invention are for description only, and do not represent the advantages and disadvantages of the embodiments.

以上所述仅为本发明的较佳实施例，并不用以限制本发明，凡在本发明的精神和原则之内，所作的任何修改、等同替换、改进等，均应包含在本发明的保护范围之内。The above descriptions are only preferred embodiments of the present invention, and are not intended to limit the present invention. Any modifications, equivalent replacements, improvements, etc. made within the spirit and principles of the present invention shall be included in the protection of the present invention. within range.

Claims

1. A target tracking method based on multi-feature adaptive fusion and online learning, characterized in that, the method may further comprise the steps:

(1) Select the target area from a frame of image, extract the target feature and use it as a template feature;

(2) Initialize the recognizer, input the next frame image, and initialize the candidate target area; obtain a new candidate target area according to the transfer formula;

(3) extract three kinds of features of color, edge and texture respectively to described new candidate target area; Carry out self-adaptive fusion according to discrimination and correlation of each feature;

(4) Calculating the Bhattacharyachian distance between the fused feature and the template feature, and normalizing the Bhattacharyachian distance as the weight of the new candidate target region;

(5) N new candidate target areas are sorted according to the weight size, if the resampling judgment value is greater than the resampling judgment threshold, resampling is performed, and step (6) is executed; if not, step (6) is executed;

(6) Carry out overlap judgment to the new candidate target area of maximum weight and described target area, if overlap rate is less than overlap rate threshold value execution step (7); Otherwise, execution step (8);

(7) Input multiple regions of the new candidate target region with the maximum weight into the detector, if the detector outputs 0, it means that the tracking fails; otherwise, input the output result of the detector to the recognizer, if the recognizer outputs Yes, it means that the tracking is successful and the recognizer, the template feature and the target area are updated; if the output is No, it means that a new target is found, and the process ends;

(8) If the overlap rate is greater than or equal to the overlap rate threshold, it is considered that the tracking is successful, and the identifier, the template feature and the target area are updated, and the process ends;

in,

The discrimination is defined as the degree of similarity between the new candidate target region and the adjacent background on a certain feature, expressed by the Bhattachary coefficient of the two histograms;

The correlation is defined as the degree of similarity between the new candidate target region and the template feature with respect to a certain feature, expressed by the Bhattachary coefficient of the two histograms;

The update identifier is specifically: the positive sample is composed of the new candidate target area verified by the detector, and the negative sample is a randomly selected area in the background with the same size as the new candidate target area;

The updating template feature is specifically: using the feature of the new candidate target region with the maximum weight as the updated template feature;

The updating of the target area specifically includes: taking the new candidate target area with the largest weight as the updated target area.

2. a kind of target tracking method based on multi-feature adaptive fusion and online learning according to claim 1, is characterized in that, the described step of extracting target feature and as template feature specifically comprises:

1) Extract color feature information;

2) Extract edge feature information;

3) Extract texture feature information;

4) Fusing the color feature information, the edge feature information and the texture feature information to obtain a target feature histogram as the template feature.

3. A kind of target tracking method based on multi-feature adaptive fusion and online learning according to claim 2, characterized in that, the step of extracting color feature information specifically comprises:

1) Divide the color space into a color area and an achromatic area, perform HSV partitioning on the color area and the achromatic area, obtain Q _H × Q _S color sub-intervals and Q _v achromatic sub-intervals, and divide all The QH×QS color subintervals and the _Qv achromatic subintervals are used as _QH × _QS ₊ _Qv color intervals _u ;

2) assign different weights to the pixel according to the distance between the pixel and the center point of the target area, and vote for the corresponding color interval u according to the HSV of the pixel;

3) Count the voting values of each color interval to obtain a color feature histogram.

4. A kind of target tracking method based on multi-feature adaptive fusion and online learning according to claim 2, characterized in that, the step of extracting edge feature information specifically comprises:

1) Interpolating the target area to obtain an interpolation area twice as wide and high, and then dividing the target area and the interpolation area into blocks respectively;

2) Calculate the edge strength and direction of each sub-block, divide the edge direction into several direction areas within the range of 0°-360°, vote for the direction area according to the edge strength, and obtain the edge feature histogram of each sub-block;

3) The edge feature histograms calculated by each sub-block are connected to obtain a complete edge feature histogram.

5. A kind of target tracking method based on multi-feature adaptive fusion and online learning according to claim 2, characterized in that, the step of extracting texture feature information specifically comprises:

1) For each sub-block, calculate the local binary pattern feature histogram;

2) Connecting the feature histograms of local binary patterns calculated by each sub-block to obtain a complete texture feature histogram.