CN103632382A

CN103632382A - Compressive sensing-based real-time multi-scale target tracking method

Info

Publication number: CN103632382A
Application number: CN201310700915.9A
Authority: CN
Inventors: 孙继平; 贾倪; 伍云霞
Original assignee: China University of Mining and Technology Beijing CUMTB
Current assignee: China University of Mining and Technology Beijing CUMTB
Priority date: 2013-12-19
Filing date: 2013-12-19
Publication date: 2014-03-12
Anticipated expiration: 2033-12-19
Also published as: CN103632382B

Abstract

The invention discloses a real-time multi-scale target tracking method based on compressed sensing. Samples are modeled by extracting normalized rectangular features from the sampled image, which makes the model robust to changes in target scale. Because normalized rectangular features are very high-dimensional, the invention compresses them using compressed sensing, extracts scale-invariant compressed feature vectors, and uses integral images to greatly reduce the computational cost so that tracking can run in real time. A naive Bayes classifier classifies the compressed feature vector of each sample to determine the most likely position of the target. The classifier response is also used to estimate particle weights, the particles are resampled to prevent degeneracy of the particle filter, and a second-order model that accounts for target velocity is used to estimate and predict particle states. The method tracks a target in video images in real time with high accuracy and low computational complexity, and the tracking box follows the target's scale in real time, meeting the needs of practical tracking applications.

Description

A real-time multi-scale target tracking method based on compressed sensing

Technical field

The invention relates to a real-time multi-scale target tracking method based on compressed sensing, and belongs to the technical field of computer vision.

Background art

Tracking moving targets in video is one of the most important topics in computer vision, with wide application in surveillance, motion detection and recognition, and medical imaging. The tracking task is to estimate the target state in subsequent video frames, given that the target state in the first frame is known. Moving-target tracking in video is usually formulated as a dynamic state estimation problem. Depending on the application, the target state generally consists of kinematic quantities such as position coordinates and target scale. Although researchers have proposed many solutions over years of study, tracking is affected by many factors, such as changes in target pose, appearance changes caused by illumination variation, occlusion, scale change, non-rigid deformation, motion blur, fast motion, rotation, and background clutter, so real-time, reliable video target tracking still faces many challenges.

Generally speaking, video target tracking methods fall into two categories: statistical tracking methods and deterministic tracking methods. Deterministic tracking methods obtain the target state in the current frame by maximizing a similarity measure between the target model and candidate samples. The most commonly used similarity measure is the sum of squared differences (SSD). It is simple but often not robust, so various improvements have been proposed, such as mean-shift-based methods and optimization algorithms that search for the best candidate sample. Although these improvements greatly increase robustness, the iterative searches consume substantial computing resources and real-time performance cannot be guaranteed. Deterministic tracking methods also model the target in several ways, including point-based models, contour-based models, and models based on kernel density space. A point-based model may, for example, describe the target with scale-invariant feature points (SIFT), match these key points between the original target and the target region in the current frame, remove mismatched points with RANSAC, and finally map the original target region to the candidate region in the current frame through an affine or perspective transformation. A contour-based model usually requires offline modeling of the target contour, in which contour features are trained from a large number of samples; during online tracking and detection, the moving target is then approximated adaptively to obtain its position. The biggest problem of deterministic tracking methods is therefore their heavy computational load, which hinders real-time use. In addition, deterministic tracking depends strongly on the features chosen when building the target model. To be robust to occlusion, scale change, illumination change and similar factors, the features must be selected and designed under strict requirements, which inevitably increases the amount of computation considerably.

Statistical tracking methods have received growing attention in recent years. Although new deterministic tracking results continue to appear, these schemes have not fundamentally solved the real-time problem. Statistical tracking methods use a state space and a measurement space to jointly describe the dynamics of the tracked target, and estimate the state by finding the peak of the posterior probability of the state given the measurements. Under a linear Gaussian model, the Kalman filter continuously updates the mean and variance of the corresponding distribution and yields the optimal estimate. For tracking problems under nonlinear or non-Gaussian conditions, an analytical optimum of the state posterior distribution can no longer be obtained, so many algorithms, such as the particle filter and the extended Kalman filter, have been proposed to approximate the distribution. The particle filter is the most typical statistical tracking scheme: it repeatedly propagates and predicts sampled particles, obtains feature measurements for each sample to update the sample weights, and uses the weighted samples to approximate the posterior probability density over the state space. Because the particle filter updates and propagates particles recursively, the computational load of its prediction and update stages is low. After years of development, many particle-filter-based trackers have emerged. They usually observe samples through contour or color features, for example particle filters based on color histograms, on scale-invariant features, or on cascaded features. A color-histogram observation model is fairly robust to noise, but when illumination changes markedly or the background interferes strongly with the target, system reliability often drops sharply. In addition, the cost of computing a color histogram depends on the target size and grows as the target grows. Methods based on contours and scale-invariant SIFT feature points have the same problem. This limits the number of particles that can be sampled, and with fewer particles the approximation of the posterior probability density necessarily becomes less accurate.

It can be seen that, for both deterministic and statistical tracking methods, the choice of target appearance model is a critical part of the tracking task and directly affects the tracker's real-time performance, robustness, and adaptability to various factors. In recent years researchers have studied appearance models for video tracking extensively. Broadly, they fall into two categories: generative models and discriminative models. A generative model first learns the appearance features of the target, then uses them to search the relevant image regions and obtains the target position in subsequent frames according to a minimum-error criterion. To cope with the many disturbances that arise during tracking, a robust and efficient appearance model is difficult to construct and subject to strict requirements, which again raises computational complexity substantially. Typical examples include appearance models based on sparse representation, appearance models based on orthogonal matching pursuit, and incremental learning methods. The main problem with existing generative appearance models is that learning appearance features requires many training samples; to reduce computational complexity, learning can only be done offline under the assumption that the target's appearance does not change during tracking. Moreover, generative models cannot make full use of the background information near the target, which often helps improve tracking.

Discriminative models treat tracking as a binary classification problem whose main idea is to separate the target from the background. Typical discriminative trackers include tracking with support vector machine classifiers, online boosting tracking, semi-supervised online boosting tracking, multiple instance learning (MIL) tracking, and compressive tracking (CT). Compressive tracking has attracted wide attention for its real-time performance and reliability, but it has the following problems, which limit its practicality. First, compressive tracking detects and classifies samples with a fixed-size tracking box. The box does not change with the target scale, so the method cannot adapt to multi-scale changes of the target, although many practical applications require multi-scale tracking. As a discriminative method, it continuously classifies target and background, and the classifier parameters must be updated as tracking proceeds. Because the tracking scale is fixed, the algorithm may still locate the target region under some conditions, but once the target scale changes it is in fact tracking either a combination of target and background (when the target is smaller than its initial scale) or only part of the target (when the target is larger than its initial scale) as if it were a new target. If the target scale changes abruptly, the classifier does not have enough time to learn the changed target features, and the probability of losing the target rises sharply. Second, existing discriminative trackers usually rely on the temporal correlation of the target position when collecting samples and select them within a fixed radius. They do not consider the velocity and acceleration of the target, so they adapt poorly to fast target motion. Third, in existing discriminative trackers the classifier's learning parameter is fixed. When the target is occluded for a long time, the classifier inevitably mistakes the occluder for the target and the target is lost.

Summary of the invention

To overcome the shortcomings of existing video tracking methods, the present invention proposes a real-time multi-scale single-target tracking method based on compressed sensing. The method runs in real time, adapts to changes in target scale, and produces robust tracking results.

The real-time multi-scale target tracking method of the invention comprises a system initialization phase and a real-time video target tracking phase. The specific steps are as follows:

System initialization phase:

1. Read the target's initial position parameters R_state=[x, y, w, h], where (x, y) is the upper-left corner of the target's initial bounding rectangle and w and h are its width and height;

2. Read the first frame F₀ of the video sequence and convert it to a grayscale image, denoted I₀;

3. Compute the integral image of the first frame;

4. Construct the initial random measurement matrix R₀;

5. Taking the center of the initial target position as the reference, collect positive and negative samples with the same width and height as the initial target;

6. Extract the scale-invariant compressed feature vectors of all collected positive and negative samples, and update the naive Bayes classifier parameters;

7. Compute the scale-invariant compressed feature vector v₀ of the sample inside the initial target rectangle, and initialize the particle distribution.

Real-time video target tracking phase:

1. Read the t-th frame and convert it to a grayscale image, denoted I_t;

2. Compute the integral image of the current frame;

3. Estimate and predict the particle states using a second-order model;

4. Compute the scale-invariant compressed feature vectors of all particles;

5. Classify all particles with the naive Bayes classifier and obtain the classifier response of every particle;

6. Take the particle with the largest classifier response as the target position and its scale as the estimate of the target's current scale, then compute the particle weights from the classifier responses;

7. Resample all particles according to their weights. After resampling, more samples are placed at high-weight particles, and particles whose weights are too low are discarded to avoid the degeneracy caused by low-weight samples;

8. Taking the currently determined target center as the reference, collect positive and negative samples with the same width and height as the determined target box;

9. Extract the scale-invariant compressed feature vectors of the collected positive and negative samples, and update the classifier parameters;

10. If the video has not ended, return to step 1 and read the next video frame.
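Step 7 of the tracking phase can be illustrated with a small resampling sketch. The text does not name a particular resampling scheme here, so the multinomial scheme below, the function name `resample`, and the example particle tuples are assumptions made for illustration only.

```python
import random

def resample(particles, weights, rng=None):
    """Multinomial resampling (an assumed scheme): draw len(particles)
    particles with probability proportional to their weights."""
    rng = rng or random.Random()
    if sum(weights) <= 0:
        # Degenerate case: the weights carry no information, keep particles as-is.
        return list(particles)
    return rng.choices(particles, weights=weights, k=len(particles))

# Hypothetical particles given as (x, y, scale) with classifier-derived weights.
particles = [(10, 20, 1.0), (12, 21, 1.1), (40, 50, 0.9)]
weights = [0.7, 0.3, 0.0]
resampled = resample(particles, weights, random.Random(0))
```

High-weight particles tend to be duplicated and the zero-weight particle never survives, which is the behavior step 7 describes.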

Let $z \in \mathbb{R}^{w \times h}$ denote a sample image of width w and height h. The invention uses a set of normalized rectangular features of the sample image as the original high-dimensional features representing the sample. These features are obtained by convolving the sample image z with a set of normalized rectangular filters $\{h_{1,1}, \ldots, h_{w,h}\}$ defined as:

$h_{i,j}(x, y) = \begin{cases} \frac{1}{ij}, & 1 \le x \le i,\ 1 \le y \le j \\ 0, & \text{otherwise} \end{cases}$

where i and j denote the width and height of the normalized rectangular filter. Each filtered image is written as a column vector, and concatenating these column vectors yields the original high-dimensional feature vector of the sample image. For a sample image of width w and height h, the dimension of this vector is approximately (wh)².

Clearly, computing the original high-dimensional feature vectors of all sample images would consume a large amount of computing resources and could not meet real-time requirements. The compressed sensing principle is therefore used below to obtain a compressed, low-dimensional feature vector for each sample image.

Let $x \in \mathbb{R}^{n}$ denote the original high-dimensional feature vector of the sample described above, and let $R \in \mathbb{R}^{m \times n}$ denote a random measurement matrix that maps high-dimensional features to low-dimensional features. The compressed low-dimensional feature vector is then v=Rx, where $v \in \mathbb{R}^{m}$ with $m \ll n$. The invention uses the following sparse random measurement matrix to compress the original high-dimensional features:

$r_{ij} = w_i \times \begin{cases} 1, & \text{with probability } \frac{1}{2s} \\ 0, & \text{with probability } 1 - \frac{1}{s} \\ -1, & \text{with probability } \frac{1}{2s} \end{cases}$

where s=n/4 and the weight w_i is the square root of the reciprocal of the number of non-zero elements in row i. It follows that each row of the matrix $R \in \mathbb{R}^{m \times n}$ contains at most 4 non-zero elements.
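Because each row of R has only a handful of non-zero entries, the projection v=Rx never needs the full matrix. The following is a minimal sketch of that idea; the function names, the list-of-pairs row layout, and the random stand-in for x are assumptions for illustration, and the 2-to-4 non-zeros per row follow the bounds the patent sets when it constructs the matrix.

```python
import math
import random

def sparse_measurement_rows(m, n, rng, nz_min=2, nz_max=4):
    """Build m sparse rows; each row is a list of (column, value) pairs."""
    rows = []
    for _ in range(m):
        nz = rng.randint(nz_min, nz_max)    # number of non-zeros in this row
        weight = math.sqrt(1.0 / nz)        # w_i = sqrt(1 / nz_i)
        cols = rng.sample(range(n), nz)     # distinct column indices
        rows.append([(c, weight * rng.choice((-1.0, 1.0))) for c in cols])
    return rows

def project(rows, x):
    """Compute v = R x while touching only the non-zero entries of R."""
    return [sum(val * x[c] for c, val in row) for row in rows]

rng = random.Random(0)
x = [rng.random() for _ in range(10_000)]   # stand-in for a raw feature vector
R = sparse_measurement_rows(m=50, n=len(x), rng=rng)
v = project(R, x)
```

With this weighting every row has unit Euclidean norm, and the cost of the projection is proportional to the number of non-zero entries rather than to m·n.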

The specific steps for constructing the initial random measurement matrix are as follows:

(1) Let w and h denote the width and height of the rectangle at the target's initial position, and let m be the number of low-dimensional features. Set the upper and lower limits on the number of non-zero elements in each row of the initial random measurement matrix to Num_max and Num_min; in accordance with the sparse matrix above, Num_max and Num_min are set to 4 and 2 respectively.

(2) Construct each row of the initial random measurement matrix in a loop:

A. Use a uniform random number generator to draw a random integer in the interval [Num_min, Num_max] as the number of non-zero elements nz_i in row i of the initial random measurement matrix;

B. Randomly generate nz_i rectangular regions such that px(i, t)～U(1, w-3), py(i, t)～U(1, h-3), pw(i, t)～U(1, w-px-2), ph(i, t)～U(1, h-py-2), where px(i, t), py(i, t), pw(i, t), ph(i, t) denote the x-coordinate of the upper-left corner, the y-coordinate of the upper-left corner, the width and the height of the t-th generated rectangle, and U(a, b) denotes the uniform distribution over the integers in [a, b]. For each rectangle, also generate the corresponding measurement matrix value p_value(i, t)=w_i·sign_t, where $w_i = \sqrt{1 / nz_i}$ is the weight of the current row and sign_t is the sign of the t-th non-zero element, drawn with equal probability by the random number generator.

Finally, obtain and store the values of all non-zero elements of the initial random measurement matrix, the upper-left x-coordinate, upper-left y-coordinate, width and height of the rectangle corresponding to each non-zero element, and the set of per-row non-zero counts {nz_i | i=1, 2, ..., m}.
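The construction steps above can be sketched as follows. The function name `build_measurement_matrix` and the tuple layout `(px, py, pw, ph, p_value)` are assumptions chosen for illustration; the sampling intervals are the ones stated in steps A and B.

```python
import math
import random

def build_measurement_matrix(w, h, m, rng, num_min=2, num_max=4):
    """Return one list per row; each entry is (px, py, pw, ph, p_value)."""
    matrix = []
    for _ in range(m):
        nz = rng.randint(num_min, num_max)      # step A: non-zeros in row i
        weight = math.sqrt(1.0 / nz)            # w_i = sqrt(1 / nz_i)
        row = []
        for _ in range(nz):                     # step B: one rectangle per non-zero
            px = rng.randint(1, w - 3)
            py = rng.randint(1, h - 3)
            pw = rng.randint(1, w - px - 2)
            ph = rng.randint(1, h - py - 2)
            sign = rng.choice((-1.0, 1.0))      # equal-probability sign
            row.append((px, py, pw, ph, weight * sign))
        matrix.append(row)
    return matrix

# Example: a 40x60 target box compressed to m = 100 features.
R0 = build_measurement_matrix(w=40, h=60, m=100, rng=random.Random(42))
```

Storing only the rectangles and their signed weights is what later allows the same matrix to be rescaled to any sample size. The sketch assumes w and h are at least 4 so that every interval is non-empty.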

The specific steps for extracting the scale-invariant compressed feature vector of a sample are as follows:

Let w and h denote the width and height of the rectangle at the target's initial position, and define the scale of the initial target size as 1. Let iH be the integral image of the current video frame, w_s and h_s the width and height of the sampled image, (x, y) the coordinates of its upper-left corner, and s its scale, where $s = w_s / w = h_s / h$.

(1) According to the scale s of the current sample image, adapt the non-zero elements of the stored initial random measurement matrix R₀ to obtain the random measurement matrix R_s at scale s, as follows:

Keep the values of all non-zero elements of the initial measurement matrix unchanged, and multiply the rectangle parameters px(i, t), py(i, t), pw(i, t), ph(i, t) of each non-zero element by s, rounding to the nearest integer, that is:

$px_s(i, t) = \mathrm{round}(s \cdot px(i, t)),\quad py_s(i, t) = \mathrm{round}(s \cdot py(i, t)),\quad pw_s(i, t) = \mathrm{round}(s \cdot pw(i, t)),\quad ph_s(i, t) = \mathrm{round}(s \cdot ph(i, t))$

where i=1, ..., m, t=1, ..., nz_i, and m is the dimension of the compressed feature vector;

(2) The i-th compressed feature value is computed as follows:

$v_i = \sum_{t=1}^{nz_i} \frac{p_{value}(i, t) \, P_{sum}(i, t)}{pw_s(i, t) \, ph_s(i, t)}$

where P_sum(i, t) denotes the sum of the pixel values inside the rectangle corresponding to the t-th non-zero element in row i of the adapted random measurement matrix. It can be computed from the integral image as:

P_sum(i, t)=iH(maxI, maxJ)-iH(maxI, minJ)-iH(minI, maxJ)+iH(minI, minJ)

where (minI, minJ) and (maxI, maxJ) are the corner coordinates of that rectangle in the current frame, and iH(u, v) denotes the value of the integral image at point (u, v);

(3) Finally, the scale-invariant compressed feature vector of the sampled image, v={v_i | i=1, 2, ..., m}, is obtained.
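The extraction steps above can be sketched end to end as follows. Several details are assumptions rather than statements from the text: each rectangle sum is divided by the scaled rectangle area (one reading of "normalized rectangular feature"), rectangle offsets are taken relative to the sample's upper-left corner (x, y), the integral image carries one extra zero row and column, and all function names are hypothetical.

```python
import math

def integral_image(img):
    """iH[r][c] = sum of img[0:r][0:c]; one extra zero row and column."""
    height, width = len(img), len(img[0])
    iH = [[0] * (width + 1) for _ in range(height + 1)]
    for r in range(height):
        row_sum = 0
        for c in range(width):
            row_sum += img[r][c]
            iH[r + 1][c + 1] = iH[r][c + 1] + row_sum
    return iH

def round_half_up(value):
    return int(math.floor(value + 0.5))

def compressed_features(iH, R0, x, y, s):
    """Feature vector of the sample with upper-left corner (x, y) at scale s."""
    v = []
    for row in R0:
        total = 0.0
        for px, py, pw, ph, p_value in row:
            # Scale the rectangle geometry by s; the matrix value is unchanged.
            sx, sy = round_half_up(s * px), round_half_up(s * py)
            sw = max(1, round_half_up(s * pw))
            sh = max(1, round_half_up(s * ph))
            min_i, min_j = x + sx, y + sy
            max_i, max_j = min_i + sw, min_j + sh
            p_sum = (iH[max_j][max_i] - iH[min_j][max_i]
                     - iH[max_j][min_i] + iH[min_j][min_i])
            total += p_value * p_sum / (sw * sh)   # assumed area normalization
        v.append(total)
    return v

# Constant image: every rectangle mean is 7, whatever the scale.
img = [[7] * 64 for _ in range(64)]
iH = integral_image(img)
R0 = [[(1, 1, 4, 3, 0.5), (2, 5, 6, 2, -0.5)], [(3, 2, 5, 5, 1.0)]]
v1 = compressed_features(iH, R0, x=0, y=0, s=1.0)
v2 = compressed_features(iH, R0, x=0, y=0, s=2.0)
```

On this constant image the features at scales 1 and 2 coincide, which is the scale invariance the normalization is meant to provide. Each feature costs four integral-image lookups per rectangle regardless of the rectangle size.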

The beneficial effects of the invention are as follows. Using normalized rectangular features as the original high-dimensional features of the target model makes the features adapt well to changes in target scale, so the method can accurately track both the target position and its scale changes, and tracking accuracy improves. By the compressed sensing principle, randomly extracting only a small number of the original high-dimensional features is enough to model the target effectively, and the integral image allows normalized rectangular features to be computed quickly, so the method has low computational complexity and can track the target in real time.

Description of drawings

To explain the embodiments and technical solutions of the invention more clearly, the drawings used in the description are briefly introduced below.

Figure 1 is a flowchart of the real-time multi-scale target tracking method based on compressed sensing;

Figure 2 is a schematic diagram of the initial random measurement matrix;

Figure 3 is a schematic diagram of the extraction of the scale-invariant compressed feature vector of a sample;

Figure 4 is a schematic diagram of the collection regions for positive and negative samples;

Figure 5 is a schematic diagram of particle state estimation and prediction;

In the figures: 1, video image region; 2, target region; 3, collection region for positive sample center points; 4, collection region for negative sample center points.

Detailed description

Specific embodiments of the invention are described in detail below with reference to the drawings, beginning with the basic flow of the real-time multi-scale single-target tracking method based on compressed sensing. Referring to Figure 1, the process is divided into a system initialization phase and a real-time video target tracking phase, with the following steps:

System initialization phase:

2. Read the first frame F₀={F_R, F_G, F_B} of the video sequence and convert it to a grayscale image, denoted I₀;

The formula for converting a color image to a grayscale image is:

I₀(x, y)=0.299F_R(x, y)+0.587F_G(x, y)+0.114F_B(x, y)

I₀(x, y) denotes the gray value of the grayscale image I₀ at point (x, y). Gray values lie in the range [0, 255], where 0 is black and 255 is white, and F_R, F_G, F_B are the R, G and B components of the original image.
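As a quick check of the conversion formula, here is a minimal sketch that applies it to a tiny frame. The function name `to_gray` and the nested-list frame layout are assumptions for illustration; a real implementation would typically operate on image arrays.

```python
def to_gray(frame):
    """frame: rows of (R, G, B) tuples with components in [0, 255]."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in frame]

frame = [[(255, 255, 255), (0, 0, 0)],
         [(255, 0, 0), (0, 0, 255)]]
gray = to_gray(frame)
```

Because the three coefficients sum to 1, a white pixel maps to 255 and a black pixel to 0, so the output stays within [0, 255].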

3. Compute the integral image of the first frame using the following formula:

$iH(x, y) = \sum_{u=1}^{x} \sum_{v=1}^{y} I_0(u, v)$
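The integral image can be built in a single pass with running row sums, after which the sum over any rectangle takes four lookups. The sketch below is one way to do this; the function names and the extra zero row and column (added so that the 1-based formula maps directly onto list indices) are assumptions for illustration.

```python
def integral_image(gray):
    """gray[y][x] is the intensity at column x, row y (0-based storage).
    Returns iH with an extra zero row and column so that iH[y][x] equals
    the sum over the first x columns and first y rows, as in the formula."""
    height, width = len(gray), len(gray[0])
    iH = [[0] * (width + 1) for _ in range(height + 1)]
    for y in range(1, height + 1):
        row_sum = 0
        for x in range(1, width + 1):
            row_sum += gray[y - 1][x - 1]
            iH[y][x] = iH[y - 1][x] + row_sum
    return iH

def rect_sum(iH, left, top, width, height):
    """Sum of the pixels in a rectangle using four lookups (0-based left/top)."""
    right, bottom = left + width, top + height
    return iH[bottom][right] - iH[top][right] - iH[bottom][left] + iH[top][left]

gray = [[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]]
iH = integral_image(gray)
lower_right_block = rect_sum(iH, left=1, top=1, width=2, height=2)
```

Here `lower_right_block` is the sum of 5, 6, 8 and 9. The lookup cost does not depend on the rectangle size, which is why the rectangular features stay cheap at every scale.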

5. Taking the center of the initial target position as the reference, collect positive and negative samples with the same width and height as the initial target.

Referring to Figure 4, the collection radius for positive sample centers lies in the interval (0, inrad), and the collection radius for negative sample centers lies in the interval (inrad+4, outrad). In this embodiment inrad and outrad are set to 4 and 30 respectively and are kept constant. Because the target scale changes, the sampling radii may alternatively be set in proportion to the scale: inrad=4s, outrad=30s, where s is the scale at which positive and negative samples are currently being collected. Note that suitable values of inrad and outrad depend on the video resolution and the target size; the values used in this embodiment give good tracking results for targets of different sizes at common video resolutions. Within these intervals, 45 positive samples and 50 negative samples are collected by uniform random sampling;
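The sampling regions can be sketched as a disc for positive centers and a ring for negative centers. The rejection-sampling approach, the function name `sample_centers`, the example center (160, 120), and the treatment of the interval endpoints (inner bound included, outer bound excluded) are assumptions for illustration; clipping to the image boundary is omitted.

```python
import random

def sample_centers(cx, cy, r_inner, r_outer, count, rng):
    """Draw `count` integer centers whose distance d from (cx, cy) satisfies
    r_inner <= d < r_outer, uniformly over the integer pixel offsets."""
    centers = []
    r = int(r_outer)
    while len(centers) < count:
        dx, dy = rng.randint(-r, r), rng.randint(-r, r)
        if r_inner ** 2 <= dx * dx + dy * dy < r_outer ** 2:
            centers.append((cx + dx, cy + dy))
    return centers

rng = random.Random(1)
s = 1.0                              # current scale
inrad, outrad = 4 * s, 30 * s        # radii proportional to scale
positives = sample_centers(160, 120, 0, inrad, 45, rng)
negatives = sample_centers(160, 120, inrad + 4, outrad, 50, rng)
```

The gap of 4 pixels between the two regions keeps ambiguous, partly overlapping samples out of both classes.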

6.提取所有采集得到的正样本的尺度不变压缩特征向量{v(n)|n=1，2，...，n_pos}以及所有采集得到的负样本的尺度不变压缩特征向量{w(t)|t=1，2，...，n_neg}，并利用下式更新分类器参数：6. Extract the scale-invariant compressed feature vector {v(n)|n=1, 2, ..., n _pos } of all collected positive samples and the scale-invariant compressed feature vector { w(t)|t=1, 2, ..., n _neg }, and use the following formula to update the classifier parameters:

$\begin{matrix} {μ μ}_{i i}^{11} &LeftArrow; &LeftArrow; λ λ {μ μ}_{i i}^{11} + + ((11 - - λ λ)) {μ μ}^{11} \\ {σ σ}_{i i}^{11} &LeftArrow; &LeftArrow; \sqrt{λ λ {(({σ σ}_{i i}^{11}))}^{22} + + ((11 - - λ λ)) {(({σ σ}^{11}))}^{22} + + λ λ ((11 - - λ λ)) {(({μ μ}_{i i}^{11} - - {μ μ}^{11}))}^{22}} \\ {μ μ}_{i i}^{00} &LeftArrow; &LeftArrow; λ λ {μ μ}_{i i}^{00} + + ((11 - - λ λ)) {μ μ}^{00} \\ {σ σ}_{i i}^{00} &LeftArrow; &LeftArrow; \sqrt{λ λ {(({σ σ}_{i i}^{00}))}^{22} + + ((11 - - λ λ)) {(({σ σ}^{00}))}^{22} + + λ λ ((11 - - λ λ))} {(({μ μ}_{i i}^{00} - - {μ μ}^{00}))}^{22} \end{matrix}$

where λ is the classifier learning-rate parameter (the smaller λ, the faster the classifier parameters are updated); in this embodiment λ = 0.9, n_pos = 45 and n_neg = 50; i = 1, 2, ..., m, where m is the dimension of the scale-invariant compressed feature vector; $(\mu_{i}^{1}, \sigma_{i}^{1}, \mu_{i}^{0}, \sigma_{i}^{0})$ are the naive Bayes classifier parameters corresponding to the i-th feature value of the compressed feature vector;

$\mu^{1} = \frac{1}{n_{pos}} \sum_{k=1}^{n_{pos}} v_{i}(k), \quad \sigma^{1} = \sqrt{\frac{1}{n_{pos}} \sum_{k=1}^{n_{pos}} (v_{i}(k) - \mu^{1})^{2}}, \quad \mu^{0} = \frac{1}{n_{neg}} \sum_{k=1}^{n_{neg}} w_{i}(k), \quad \sigma^{0} = \sqrt{\frac{1}{n_{neg}} \sum_{k=1}^{n_{neg}} (w_{i}(k) - \mu^{0})^{2}}$;

v_i(k) and w_i(k) denote the i-th feature value of the compressed feature vector of the k-th positive and negative sample respectively; initially, all classifier parameters $\mu_{i}^{1}, \mu_{i}^{0}$ are 0 and all parameters $\sigma_{i}^{1}, \sigma_{i}^{0}$ are 1;
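As a rough illustration of the update rule for a single feature dimension and a single class, the following Python sketch blends the stored parameters with the statistics of a new batch. The function names are hypothetical, and the batch statistics use the population (1/n) standard deviation as in the formulas above.

```python
import math


def batch_stats(values):
    """Mean and population standard deviation of one feature dimension over a batch."""
    n = len(values)
    mu = sum(values) / n
    sigma = math.sqrt(sum((v - mu) ** 2 for v in values) / n)
    return mu, sigma


def update_params(mu_i, sigma_i, values, lam=0.9):
    """Return the updated (mu_i, sigma_i) given the feature values of the new samples."""
    mu, sigma = batch_stats(values)
    # Both updates use the OLD mu_i, so sigma is computed before mu is overwritten.
    new_sigma = math.sqrt(
        lam * sigma_i ** 2
        + (1 - lam) * sigma ** 2
        + lam * (1 - lam) * (mu_i - mu) ** 2
    )
    new_mu = lam * mu_i + (1 - lam) * mu
    return new_mu, new_sigma
```

For example, starting from the initial values (0, 1) and a batch with mean 3 and standard deviation 1, λ = 0.9 moves the mean to 0.3 and the standard deviation to √1.81.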

In this embodiment, the number of particles pn is set to 200. Each particle contains the following parameters: the initial state (x₀, y₀, s₀), the state at the current time t (x_t, y_t, s_t), the state at time t-1 (x_{t-1}, y_{t-1}, s_{t-1}), and the state at time t-2 (x_{t-2}, y_{t-2}, s_{t-2}), where x_t, y_t are the coordinates of the center of the particle's sample region at time t and s_t is the scale of that region at time t. Each particle also contains the parameters v and w, where v is the scale-invariant compressed feature vector of the particle region at the current time and w is the particle weight;

When the particle distribution is initialized, the initial state, current state, state at time t-1 and state at time t-2 of every particle are all set to (x+floor(w/2), y+floor(h/2), 1), the weight of every particle is initialized to 0, and the scale-invariant compressed feature vector of every particle is initialized to v₀.
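A minimal Python sketch of this particle record and its initialization might look as follows. The class name `Particle`, its field names and the function `init_particles` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from math import floor


@dataclass
class Particle:
    initial: tuple   # (x0, y0, s0)
    cur: tuple       # state at time t
    prev1: tuple     # state at time t-1
    prev2: tuple     # state at time t-2
    feat: list       # scale-invariant compressed feature vector v
    weight: float = 0.0


def init_particles(x, y, w, h, v0, pn=200):
    """All particles start at the center of the initial target rectangle with scale 1."""
    state = (x + floor(w / 2), y + floor(h / 2), 1.0)
    # Each particle gets its own copy of v0 so later updates do not alias.
    return [Particle(state, state, state, state, list(v0)) for _ in range(pn)]
```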

视频实时目标跟踪阶段：Video real-time target tracking stage:

2. Calculate the integral image of the current image;

3. For each particle, estimate the particle state from its states at times t-1 and t-2 using the second-order model, as follows:

$\left\{\begin{matrix} x_{t}(i) = 2 x_{t-1}(i) - x_{t-2}(i) + wx_{t} \\ y_{t}(i) = 2 y_{t-1}(i) - y_{t-2}(i) + wy_{t} \\ s_{t}(i) = 2 s_{t-1}(i) - s_{t-2}(i) + ws_{t} \end{matrix}\right.$

where i=1, 2, ..., pn, and wx_t, wy_t, ws_t are zero-mean Gaussian noise terms on the three state components, with standard deviations std_x, std_y, std_s respectively. In this embodiment the standard deviations are set to std_x=5, std_y=2.5, std_s=0.06. To prevent particle samples from leaving the image, the estimated state is subjected to out-of-bounds handling. The estimated and bounds-checked state (x_t(i), y_t(i), s_t(i)) then replaces the particle's state at time t, and the particle's former states at times t and t-1 replace its states at times t-1 and t-2 respectively. Referring to Fig. 5, estimating and predicting the particle state with a second-order model takes the velocity of the target into account. Fig. 5 shows that the new particle positions computed with this model follow the direction of target motion instead of being confined to the neighborhood of the previous sample point. Likewise, the estimate of the particle scale considers the scale at the two preceding times and extrapolates the new scale along that trend;
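The prediction step can be sketched in Python as below. This is an illustrative reading only: the function takes the two most recent states explicitly and leaves the history shift to the caller, and the out-of-bounds handling is assumed to be simple clamping of the center to the image and of the scale to a hypothetical range `s_bounds`, which the source does not specify.

```python
import random


def clamp(value, lo, hi):
    return max(lo, min(hi, value))


def predict_state(prev1, prev2, img_w, img_h, rng,
                  std=(5.0, 2.5, 0.06), s_bounds=(0.1, 10.0)):
    """prev1, prev2: (x, y, s) at times t-1 and t-2. Returns the estimated (x, y, s) at time t."""
    # Constant-velocity extrapolation 2*a - b plus zero-mean Gaussian noise per component.
    x, y, s = (2.0 * a - b + rng.gauss(0.0, sd)
               for a, b, sd in zip(prev1, prev2, std))
    # ASSUMPTION: out-of-bounds handling is clamping; the scale range is illustrative.
    return (clamp(x, 0.0, img_w - 1.0),
            clamp(y, 0.0, img_h - 1.0),
            clamp(s, *s_bounds))
```

With the noise switched off, a particle that moved from x=100 to x=110 is predicted at x=120, which is the behavior Fig. 5 is described as showing.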

4. Calculate the scale-invariant compressed feature vectors of all particles;

5. Classify all particles with the naive Bayes classifier and obtain the classifier response values {H(v_t)|t=1, 2, ..., pn} of all particles, as follows:

$H(v_{t}) = \log\left(\frac{\prod_{i=1}^{m} p(v_{t}^{i} | y = 1)\, p(y = 1)}{\prod_{i=1}^{m} p(v_{t}^{i} | y = 0)\, p(y = 0)}\right) = \sum_{i=1}^{m} \log\left(\frac{p(v_{t}^{i} | y = 1)}{p(v_{t}^{i} | y = 0)}\right)$

where m is the dimension of the low-dimensional compressed feature vector, $v_{t}^{i}$ is the i-th feature value of the compressed feature vector v_t of the t-th particle, y=1 means the sample belongs to the target, y=0 means the sample belongs to the background, and p(y=1)=p(y=0) is assumed. Because low-dimensional random projections of high-dimensional random vectors follow a Gaussian distribution, each dimension of the low-dimensional compressed feature vector follows the normal distribution

$p(v_{t}^{i} | y = 1) \sim N(\mu_{i}^{1}, \sigma_{i}^{1}), \quad p(v_{t}^{i} | y = 0) \sim N(\mu_{i}^{0}, \sigma_{i}^{0})$,

where the classifier parameters $(\mu_{i}^{1}, \sigma_{i}^{1}, \mu_{i}^{0}, \sigma_{i}^{0})$ are updated from the positive and negative samples that are continually collected;
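For illustration, the classifier response can be computed directly in the log domain, as in the Python sketch below. The function names are hypothetical, σ is treated as a standard deviation, and the small floor `eps` on σ is an added safeguard against division by zero rather than something stated in the source.

```python
import math


def log_gauss(v, mu, sigma, eps=1e-6):
    """Log-density of N(mu, sigma) at v, with sigma interpreted as a standard deviation."""
    sigma = max(sigma, eps)  # ASSUMPTION: floor to avoid log(0) and division by zero
    return -0.5 * math.log(2.0 * math.pi) - math.log(sigma) - 0.5 * ((v - mu) / sigma) ** 2


def response(v, pos_params, neg_params):
    """H(v): sum over dimensions of log p(v_i | y=1) - log p(v_i | y=0).

    pos_params and neg_params are lists of (mu, sigma) pairs, one per dimension.
    """
    return sum(
        log_gauss(vi, m1, s1) - log_gauss(vi, m0, s0)
        for vi, (m1, s1), (m0, s0) in zip(v, pos_params, neg_params)
    )
```

Because the priors are assumed equal, they cancel and do not appear in the sum.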

6. Take the particle with the largest classifier response as the target position and the scale of that particle as the estimate of the current target scale, then compute the particle weights from the classifier response values as follows:

$w_{i} = \frac{p(z_{t} | x_{t}^{*}(i))}{\sum_{j=1}^{pn} p(z_{t} | x_{t}^{*}(j))}$

where $p(z_{t} | x_{t}^{*}(i))$ is the probability of observing the target z_t at time t given that the state of the i-th particle at time t is $x_{t}^{*}(i)$,

$p(z_{t} | x_{t}^{*}(i)) \propto \exp(H(v_{i})) = \prod_{j=1}^{m} \frac{p(v_{i}^{j} | y = 1)}{p(v_{i}^{j} | y = 0)}$;
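The weight formula amounts to normalizing exp(H(v_i)) over all particles. A minimal Python sketch follows; the function name is hypothetical, and subtracting the maximum response before exponentiating is a numerical-stability choice added here (it cancels in the ratio and does not change the weights).

```python
import math


def particle_weights(responses):
    """Normalize exp(H) over all particles so that the weights sum to 1."""
    h_max = max(responses)
    # Shifting by h_max avoids overflow for large responses; the ratio is unchanged.
    likelihoods = [math.exp(h - h_max) for h in responses]
    total = sum(likelihoods)
    return [lk / total for lk in likelihoods]
```

For two particles with responses 0 and log 3, the weights come out as 0.25 and 0.75.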

7. Resample all particles according to P{newp=p(i)}=w_i, so that the probability that a particle newp in the resampled set equals the i-th particle p(i) before resampling is its weight w_i. Let the particle set before resampling be {p(i)|i=1, 2, ..., pn}; the procedure is as follows:

(1) Sort all pn particles by weight in descending order to obtain a new particle set {p′(i)|i=1, 2, ..., pn}, with the particle weights reordered accordingly;

(2) Repeat the following steps until all pn resampled particles have been obtained: a. read the weight of the i-th sorted particle; b. compute n_i, the number of copies of the i-th sorted particle that resampling requires; c. copy that particle to obtain n_i new resampled particles.

After resampling, more samples are placed at high-weight particles, and particles whose weight is too low are discarded, which avoids the degeneracy caused by low-weight particle samples.
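The sort-and-copy procedure can be sketched in Python as follows. The expression for the copy count n_i is not recoverable from the text, so the sketch assumes n_i = ceil(w_i · pn), truncated once pn particles have been produced; that choice, the padding fallback and the function name are assumptions made only for illustration.

```python
import math


def resample(particles, weights):
    """Deterministic sort-and-copy resampling; returns a list of len(particles) particles."""
    pn = len(particles)
    # Step (1): indices sorted by weight, largest first.
    order = sorted(range(pn), key=lambda i: weights[i], reverse=True)
    out = []
    # Step (2): copy each sorted particle n_i times until pn particles exist.
    for i in order:
        n_i = math.ceil(weights[i] * pn)  # ASSUMPTION: copy count, not given in the source
        out.extend([particles[i]] * min(n_i, pn - len(out)))
        if len(out) == pn:
            break
    # ASSUMPTION: if the weights sum to less than 1, pad with the best particle.
    out.extend([particles[order[0]]] * (pn - len(out)))
    return out
```

With weights (0.1, 0.6, 0.25, 0.05), the second particle is copied three times, the third once, and the two lightest particles are dropped. Particles here are treated as immutable values; mutable particle objects would need to be copied rather than shared.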

9. Extract the scale-invariant compressed feature vectors of the collected positive and negative samples, and update the classifier parameters;

10. If the video has not ended, return to step 1 and read the next video frame.

Referring to Fig. 2, the initial random measurement matrix is constructed as follows:

(1) Let w and h denote the width and height of the rectangle at the initial target position, and let m be the number of low-dimensional features (m=150 in this embodiment). Set the upper and lower limits on the number of non-zero elements in each row of the initial random measurement matrix to Num_max and Num_min respectively (Num_max=4, Num_min=2 in this embodiment).

(2) Construct the rows of the initial random measurement matrix in a loop:

A. Use a uniform random number generator to draw a random integer in the interval [Num_min, Num_max] as nz_i, the number of non-zero elements in the current i-th row of the matrix;

B. Randomly generate nz_i rectangular regions such that px(i, t)~U(1, w-3), py(i, t)~U(1, h-3), pw(i, t)~U(1, w-px-2), ph(i, t)~U(1, h-py-2), where px(i, t), py(i, t), pw(i, t), ph(i, t) are the x-coordinate of the upper-left corner, the y-coordinate of the upper-left corner, the width and the height of the t-th generated rectangle, and U(a, b) denotes the uniform distribution over the integers in [a, b]. For each rectangle, generate the corresponding random measurement matrix value p_value(i, t)=w_i·sign_t, where w_i is the weight of the current row of the random measurement matrix and sign_t is the sign of the t-th non-zero element, generated randomly with equal probability.
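The construction above can be sketched in Python by storing, for each row, only its non-zero entries as (px, py, pw, ph, value) tuples. The function name and the sparse representation are illustrative choices. The expression for the row weight w_i is not recoverable from the text, so the sketch assumes w_i = 1/sqrt(nz_i).

```python
import math
import random


def build_measurement_matrix(w, h, m=150, num_min=2, num_max=4, seed=0):
    """Return m rows; each row is a list of (px, py, pw, ph, value) for its non-zero entries."""
    rng = random.Random(seed)
    rows = []
    for _ in range(m):
        nz = rng.randint(num_min, num_max)   # step A: number of non-zero entries in this row
        row_weight = 1.0 / math.sqrt(nz)     # ASSUMPTION: row weight, not given in the source
        row = []
        for _ in range(nz):                  # step B: one rectangle per non-zero entry
            px = rng.randint(1, w - 3)
            py = rng.randint(1, h - 3)
            pw = rng.randint(1, w - px - 2)
            ph = rng.randint(1, h - py - 2)
            sign = rng.choice((-1.0, 1.0))   # equal-probability sign
            row.append((px, py, pw, ph, row_weight * sign))
        rows.append(row)
    return rows
```

The sketch requires w and h to be at least 4 so that every interval U(a, b) is non-empty.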

The calculation steps for extracting the scale-invariant compressed feature vector of a sampled image are described below with reference to Fig. 3:

Let w and h denote the width and height of the rectangle at the initial target position, and take the initial target scale to be 1. Let iH be the integral image of the current video frame, w_s and h_s the width and height of the sampled image, (x, y) the coordinates of the upper-left corner of the sampled image, and s the scale of the sampled image.

To simplify the description of the algorithm, this embodiment assumes that the target scale changes identically in the width and height directions, so separate scale parameters for width and height are not used. Embodiments in which the width and height scales vary independently can be obtained by extension without creative effort.

(1) According to the scale s of the current sample image, adjust the non-zero elements of the stored initial random measurement matrix R₀ to obtain the random measurement matrix R_s at scale s:

where i=1, ..., m, t=1, ..., nz_i, and m is the dimension of the compressed feature vector;

where

Referring to Fig. 3, the term in the above formula is the normalized rectangular feature corresponding to the t-th non-zero element in row i of the adjusted random measurement matrix R_s, and p_value(i, t) is the value of that non-zero element in the random measurement matrix. In Fig. 3, r_ij and x_j denote the elements of the random measurement matrix R_s and the feature values of the original high-dimensional normalized rectangular feature vector respectively. Because R_s is a very sparse matrix, the calculation only needs to operate on its non-zero elements and does not need to compute every feature value of the high-dimensional feature vector.

This finally gives the scale-invariant compressed feature vector of the sampled image, v={v_i|i=1, 2, ..., m}.
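The following Python sketch illustrates one way the extraction could work. Since the formulas themselves are missing from the text, several points are assumptions: each rectangle is scaled by s and rounded, the normalized rectangular feature is taken to be the rectangle's pixel sum divided by its area, rectangle offsets are taken relative to the sample's upper-left corner, and the integral image is zero-padded so that rectangle sums need no special cases. All function names are hypothetical.

```python
def integral_image(img):
    """img: list of pixel rows. Returns an (H+1) x (W+1) table with a zero first row and column."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for r in range(h):
        row_sum = 0
        for c in range(w):
            row_sum += img[r][c]
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii


def rect_sum(ii, x, y, rw, rh):
    """Sum of pixels in the rectangle with upper-left (x, y), width rw, height rh (0-based)."""
    return ii[y + rh][x + rw] - ii[y][x + rw] - ii[y + rh][x] + ii[y][x]


def compressed_features(ii, x0, y0, s, matrix):
    """matrix: rows of (px, py, pw, ph, value) defined at scale 1; (x0, y0) is the sample corner."""
    feats = []
    for row in matrix:
        v = 0.0
        for px, py, pw, ph, value in row:
            # Scale the rectangle by s and round (Python rounds halves to even),
            # keeping at least one pixel in each direction.
            sx, sy = round(px * s), round(py * s)
            sw, sh = max(1, round(pw * s)), max(1, round(ph * s))
            # ASSUMPTION: normalized feature = mean pixel value inside the scaled rectangle.
            v += value * rect_sum(ii, x0 + sx, y0 + sy, sw, sh) / (sw * sh)
        feats.append(v)
    return feats
```

Dividing by the rectangle area is what keeps the feature comparable across scales under this assumption: doubling s quadruples both the pixel sum and the area. The caller is responsible for keeping the scaled sample inside the image.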

Claims

1. A real-time multi-scale target tracking method based on compressed sensing, characterized by comprising the following steps:

System initialization phase:

(1). Read the target initial position parameter R_state = [x, y, w, h], where (x, y) are the coordinates of the upper-left corner of the rectangle at the initial target position, and w and h are the width and height of that rectangle;

(2). Read the first frame image F₀ of the video sequence and convert it into a grayscale image, denoted I₀;

(3). Calculate the integral image of the first frame image;

(4). Construct an initial random measurement matrix R₀;

(5). Centered on the initial target position, collect positive and negative samples with the same width and height as the initial target;

(6). Extract the scale-invariant compressed feature vectors of all collected positive and negative samples, and update the parameters of the naive Bayes classifier;

(7). Calculate the scale-invariant compressed feature vector v₀ of the sample in the initial target rectangle, and initialize the particle distribution;

Video real-time target tracking stage:

(1). Read the t-th frame image and convert it into a grayscale image, denoted I_t;

(2). Calculate the integral image of the current image;

(3). Use the second-order model to estimate and predict the particle state;

(4). Calculate the scale-invariant compressed feature vectors of all particles;

(5). Classify all particles with the naive Bayes classifier and obtain the classifier response values of all particles;

(6). The particle with the largest classifier response is used as the target position, and the corresponding particle scale is used as the estimated value of the current scale of the target, and then the particle weight is calculated using the classifier response value;

(7). Resample all particles according to the weight of the particles. After resampling, the number of samples at the particles with high weight increases, and the particles with too low weight are discarded to avoid degradation caused by low weight particle samples;

(8). Based on the currently determined target center, collect positive and negative samples with the same width and height as the determined target frame;

(9). Extract the scale-invariant compressed feature vectors of the collected positive and negative samples, and update the classifier parameters;

(10). If the video is not over, return to step 1 and continue to read the next frame of video image.

2. The real-time multi-scale target tracking method based on compressed sensing according to claim 1, characterized in that the extraction of the scale-invariant compressed feature vector of a sample image comprises the following steps:

Let w and h denote the width and height of the rectangle at the initial target position, and take the initial target scale to be 1; the width and height of the sample image are w_s and h_s respectively, the coordinates of the upper-left corner of the sample image are (x, y), and the scale of the sample image is s,

The sample image is taken from a frame of the video sequence, and the integral image of that frame is iH;

(1). According to the scale s of the sample image, adjust the non-zero elements of the initial random measurement matrix R₀ to obtain the random measurement matrix R_s at scale s:

Keep the values of all non-zero elements of R₀ unchanged, and multiply the rectangle parameters px(i, t), py(i, t), pw(i, t), ph(i, t) corresponding to the non-zero elements by s, rounding the results; the formulas are as follows:

where i=1, ..., m, t=1, ..., nz_i, m is the dimension of the compressed feature vector, and nz_i is the number of non-zero elements in row i of the initial random measurement matrix;

(2). Calculate the i-th feature value of the scale-invariant compressed feature vector; the formula is as follows:

where P_sum(i, t) is the sum of the pixel values inside the rectangle corresponding to the t-th non-zero element in row i of the random measurement matrix R_s, which can be computed from the integral image as follows:

P_sum(i, t)=iH(maxI, maxJ)-iH(maxI, minJ)-iH(minI, maxJ)+iH(minI, minJ)

where

In the formula,

denotes the feature value of the original high-dimensional normalized rectangular feature vector that corresponds to the t-th non-zero element in row i of the random measurement matrix R_s;

(3). The scale-invariant compressed feature vector of the sampled image is v={v_i|i=1, 2, ..., m}.

3. The real-time multi-scale target tracking method based on compressed sensing according to claim 2, wherein the original high-dimensional normalized rectangular feature vector of the sample image has the following characteristics:

Let z denote a sample image of width w and height h; the original high-dimensional normalized rectangular feature vector can be described as the convolution of the sample image z with a series of normalized rectangular filters. The normalized rectangular filter is defined as follows:

where i and j are the width and height of the normalized rectangular filter respectively; each filtered sample image obtained by filtering with one of the rectangular filters is represented as a column vector, and these column vectors are concatenated to form the original high-dimensional feature vector of the sample image.

4. The real-time multi-scale target tracking method based on compressed sensing according to claim 1, characterized in that the second-order model used to estimate and predict the particle state is as follows:

$\left\{\begin{matrix} x_{t}(i) = 2 x_{t-1}(i) - x_{t-2}(i) + wx_{t} \\ y_{t}(i) = 2 y_{t-1}(i) - y_{t-2}(i) + wy_{t} \\ s_{t}(i) = 2 s_{t-1}(i) - s_{t-2}(i) + ws_{t} \end{matrix}\right.$

where i=1, 2, ..., pn, pn is the total number of particles, and wx_t, wy_t, ws_t are zero-mean Gaussian noise terms on the three state components of the particle at time t, with standard deviations std_x, std_y, std_s respectively; to prevent particle samples from leaving the image, the estimated state is subjected to out-of-bounds handling, the estimated and bounds-checked state (x_t(i), y_t(i), s_t(i)) replaces the particle's state at time t, and the particle's former states at times t and t-1 replace its states at times t-1 and t-2 respectively; estimating and predicting the particle state with the second-order model takes the velocity of the target into account.

5. The real-time multi-scale target tracking method based on compressed sensing according to claim 1, characterized in that the particle weight is calculated as follows:

$w_{i} = \frac{p(z_{t} | x_{t}^{*}(i))}{\sum_{j=1}^{pn} p(z_{t} | x_{t}^{*}(j))}$

where i=1, 2, ..., pn, pn is the total number of particles, and $p(z_{t} | x_{t}^{*}(i))$ is the probability of observing the target at time t given that the state of the i-th particle at time t is $x_{t}^{*}(i)$,

$p(z_{t} | x_{t}^{*}(i)) \propto \exp(H(v_{i})) = \prod_{j=1}^{m} \frac{p(v_{i}^{j} | y = 1)}{p(v_{i}^{j} | y = 0)}$,

H(v_i) is the naive Bayes classifier response value of the i-th particle.