CN108596947B - Rapid target tracking method suitable for RGB-D camera - Google Patents


Info

Publication number
CN108596947B
CN108596947B (application CN201810258190.5A)
Authority
CN
China
Prior art keywords
image
dimensional
rgb
template
camera
Prior art date
Legal status
Active
Application number
CN201810258190.5A
Other languages
Chinese (zh)
Other versions
CN108596947A (en)
Inventor
刘烨
聂建辉
荆晓远
Current Assignee
Nanjing University of Posts and Telecommunications
Original Assignee
Nanjing University of Posts and Telecommunications
Priority date: 2018-03-27
Filing date: 2018-03-27
Publication date: 2021-09-17
Application filed by Nanjing University of Posts and Telecommunications
Priority to CN201810258190.5A
Publication of CN108596947A
Application granted
Publication of CN108596947B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G06T7/223 Analysis of motion using block-matching
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/18 Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10016 Video; Image sequence


Abstract

The invention discloses a rapid target tracking method for RGB-D cameras, belonging to the fields of video analysis and three-dimensional point cloud processing. Building on traditional template matching, the method uses the depth information provided by the RGB-D camera to project the two-dimensional response map obtained by template matching into three-dimensional space, yielding a three-dimensional response map. The local maximum of the three-dimensional response map is found with the Parzen window method, which determines the object's position in three-dimensional space; the resulting three-dimensional position in turn provides accurate scale information for template matching at the next moment, producing a more accurate tracking result. The method can be applied to video surveillance, augmented reality, robot visual navigation, and other fields, and achieves real-time, accurate tracking of the target.

Description

A Fast Target Tracking Method for RGB-D Cameras

Technical Field

The invention relates to a fast target tracking method for RGB-D cameras, belonging to the technical field of video analysis and three-dimensional point cloud processing.

Background Art

Target tracking has important applications in video surveillance, virtual reality, and other fields. With a traditional camera, target tracking can only be performed on two-dimensional RGB images, where it is easily disturbed, leading to tracking failure. In recent years RGB-D cameras have become widespread; compared with a traditional RGB camera, an RGB-D camera obtains the scene's depth information in addition to the RGB image. Existing target tracking methods for RGB-D cameras, however, treat the depth channel as an ordinary color channel and ignore the three-dimensional scene structure it encodes, so they do not fully exploit the depth information the camera provides.

Summary of the Invention

The technical problem to be solved by the present invention is to overcome the defects of the prior art and provide a fast target tracking method for RGB-D cameras. Building on traditional template matching, the method uses the depth information obtained by the RGB-D camera to project the two-dimensional response map produced by template matching into three-dimensional space, yielding a three-dimensional response map. The local maximum of the three-dimensional response map is found with the Parzen window method, which determines the object's position in three-dimensional space; the resulting three-dimensional position in turn provides accurate scale information for template matching at the next moment, giving a more accurate tracking result.

To solve the above technical problem, the present invention provides a fast target tracking method for RGB-D cameras, comprising the following steps:

1) At the initial moment, manually select the target to be tracked, and use the RGB image block of size (w0, h0) covering the target region as both template T0 and template T1, where w0 is the width of the image block and h0 is its height;

2) Back-project all pixels in the region covered by template T0 into three-dimensional space to obtain a three-dimensional point cloud; take the centroid of the point cloud as the target's three-dimensional position (X0, Y0, Z0) at time 0, and compute the maximum distance r between any point of the point cloud and (X0, Y0, Z0);

3) Perform template matching with template T0 and template T1 on each of the R, G, and B channels of the RGB image acquired by the camera, obtaining six two-dimensional response maps R1, R2, ..., R6, and compute the average response map R = (R1 + R2 + ... + R6)/6;

4) Using the depth image captured by the camera at the current moment and the camera parameters provided by the camera manufacturer, back-project each point (x, y) of the average response map R into three-dimensional space to obtain a three-dimensional point cloud set; the weight of each point in the cloud is the value of the average response map R at the corresponding position;

5) Using the three-dimensional position tracked at the previous moment as the initial value, apply the Parzen window method to find the location of the local maximum of the three-dimensional point cloud set from step 4); this location is the target's three-dimensional position (Xt, Yt, Zt) at the current moment;

6) From the target's current position (Xt, Yt, Zt), compute the target's projected size (wt, ht) on the image at the current moment, where wt is the width of the projection and ht is its height;

7) Compute the projection (xt, yt) of (Xt, Yt, Zt) onto the RGB image at time t, and take the image block T of size (wt, ht) centered at (xt, yt) in the image;

8) Update template T1 with image block T; T0 remains unchanged;

9) Return to step 3) until the camera stops capturing new images.

In the aforementioned step 2), the target's three-dimensional position (X0, Y0, Z0) at time 0 is computed as:

$$X_0=\frac{1}{n}\sum_{i=1}^{n}X_i,\qquad Y_0=\frac{1}{n}\sum_{i=1}^{n}Y_i,\qquad Z_0=\frac{1}{n}\sum_{i=1}^{n}Z_i$$

where (Xi, Yi, Zi) are the coordinates of the i-th point in the three-dimensional point cloud at time 0 and n is the total number of points.
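For illustration, the centroid and radius of step 2) amount to the following sketch over the back-projected template points (a minimal sketch, not the patent's implementation):

```python
import numpy as np

def initial_position(points):
    # Step 2: centroid of the initial point cloud, and the maximum
    # distance r from the centroid to any of its points.
    center = points.mean(axis=0)                        # (X0, Y0, Z0)
    r = np.linalg.norm(points - center, axis=1).max()
    return center, r
```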

In the aforementioned step 3), the template matching process is as follows: compute the similarity between template T0 (or T1) and each sub-image S that lies at a different position within the RGB image acquired by the camera but has the same size as the template. The similarity is the normalized cross-correlation between the template image and the sub-image S:

$$\mathrm{NCC}(T_0,S)=\frac{\sum_i\left(T_{0,i}-\bar{T}_0\right)\left(S_i-\bar{S}\right)}{\sigma(T_0)\,\sigma(S)}$$

where T0,i and Si are the i-th elements of T0 and S respectively, σ(T0) and T̄0 are the variance and mean of T0, and σ(S) and S̄ are the variance and mean of S.
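For illustration, a sketch of this similarity and of the six-map averaging of step 3) follows. OpenCV's TM_CCOEFF_NORMED mode computes the same mean-subtracted normalized correlation; the sketch assumes T0 and T1 have identical sizes so the six response maps align, and it divides by the standard deviations and the element count, the usual NCC convention. Note that matchTemplate returns a map smaller than the image; padding it back to image size is omitted here.

```python
import cv2
import numpy as np

def ncc(template, sub_image):
    # Normalized cross-correlation between a template and an equally
    # sized sub-image, as used for the response maps of step 3).
    t = template.astype(np.float64).ravel()
    s = sub_image.astype(np.float64).ravel()
    denom = t.std() * s.std() * t.size
    return float(np.dot(t - t.mean(), s - s.mean()) / denom) if denom else 0.0

def average_response_map(image, t0, t1):
    # Step 3: match T0 and T1 against each of the three color channels
    # and average the six resulting response maps R1..R6.
    responses = []
    for template in (t0, t1):
        for c in range(3):
            img_c = np.ascontiguousarray(image[:, :, c], dtype=np.float32)
            tpl_c = np.ascontiguousarray(template[:, :, c], dtype=np.float32)
            responses.append(cv2.matchTemplate(img_c, tpl_c,
                                               cv2.TM_CCOEFF_NORMED))
    return np.mean(responses, axis=0)
```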

The back-projection used in steps 2) and 4) is as follows: for a point (x, y) on the two-dimensional image, read its depth d from the depth image; the back-projected three-dimensional position has X coordinate (x - cx)·d/fx, Y coordinate (y - cy)·d/fy, and Z coordinate equal to the depth d, where cx, cy, fx, fy are the camera parameters provided by the camera manufacturer.
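A vectorized sketch of this back-projection, also producing the weighted point cloud of step 4); it assumes the response map has been padded to the depth image's size and that the two are pixel-aligned:

```python
import numpy as np

def back_project_cloud(resp, depth, fx, fy, cx, cy):
    # Steps 2) and 4): lift every pixel with a valid depth reading to
    # X = (x - cx) d / fx, Y = (y - cy) d / fy, Z = d, keeping the
    # average-response value at that pixel as the point's weight.
    ys, xs = np.nonzero(depth > 0)
    d = depth[ys, xs].astype(np.float64)
    points = np.stack(((xs - cx) * d / fx,
                       (ys - cy) * d / fy,
                       d), axis=1)
    return points, resp[ys, xs]
```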

In the aforementioned step 5), the Parzen window method proceeds as follows:

51) Take the target's three-dimensional position (Xt-1, Yt-1, Zt-1) at the previous moment as the initial value; that is, when the iteration count j = 0, set Xj = Xt-1, Yj = Yt-1, Zj = Zt-1;

52) Compute the new three-dimensional position as the response-weighted mean of the points within the search sphere:

$$(X_j,Y_j,Z_j)=\frac{\sum_i R(p_i)\,(X_i,Y_i,Z_i)}{\sum_i R(p_i)}$$

where (Xi, Yi, Zi) is the i-th point within the sphere of radius r centered at the position (Xj-1, Yj-1, Zj-1) reached after iteration j-1, p_i is the projection of that point onto the image, and R(p_i) is the value of the step-4) average response map at p_i;

53) Increment the iteration count: j = j + 1;

54) Return to step 52) until the iteration count j > 10, then proceed to the next step;

55) Set Xt = Xj, Yt = Yj, Zt = Zj.
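Under the reading above, where each in-sphere point carries the response weight attached in step 4), the iteration is a flat-kernel weighted mean shift. A sketch follows; the clipping of negative correlations is our own addition, since TM_CCOEFF_NORMED-style responses can be negative:

```python
import numpy as np

def parzen_local_max(points, weights, start, r, max_iters=10):
    # Step 5: weighted mean of the points inside a sphere of radius r,
    # iterated from the previous frame's position.
    position = np.asarray(start, dtype=np.float64)
    for _ in range(max_iters):
        inside = np.linalg.norm(points - position, axis=1) < r
        if not inside.any():
            break                                  # no support: stop early
        w = np.clip(weights[inside], 0.0, None)    # negative NCC adds nothing
        if w.sum() == 0:
            break
        position = (points[inside] * w[:, None]).sum(axis=0) / w.sum()
    return position                                # (Xt, Yt, Zt)
```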

In the aforementioned step 6), the target's projected size on the image at the current moment is computed as follows:

61) compute the distance s0 between the image projections of the initial three-dimensional position (X0, Y0, Z0) at time 0 and of (X0 + r, Y0, Z0);

62) compute the distance st between the image projections of the current three-dimensional position (Xt, Yt, Zt) and of (Xt + r, Yt, Zt);

63) compute the target's projected size on the image at time t as

(wt, ht) = (w0, h0)·st/s0.
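Reading s0 and st as image-plane distances (the only reading under which the ratio varies, since both endpoints share the same depth and their 3D distance would always be r), the projected length reduces to fx·r/Z under the pinhole model above. A sketch, with this reading as an assumption:

```python
def projected_radius(Z, r, fx):
    # Image-plane length of a 3D segment of length r at depth Z that is
    # parallel to the image plane: s = fx * r / Z.
    return fx * r / Z

def projected_size(w0, h0, s0, st):
    # Step 63): (wt, ht) = (w0, h0) * st / s0.
    scale = st / s0
    return w0 * scale, h0 * scale
```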

In the aforementioned step 8), template T1 is updated as

T1 = T1 + 0.1·T.
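A sketch of the update; resizing the new block to T1's size, and keeping T1 in floating point so the accumulation cannot overflow, are our assumptions, as the text leaves them implicit. Because the NCC of step 3) is invariant to a constant gain, the unbounded growth of T1 under this rule does not disturb the match: the update in effect accumulates an appearance history, while the frozen T0 keeps the tracker anchored to the initial view.

```python
import cv2

def update_template(t1, patch):
    # Step 8: blend the newly tracked image block T into T1; T0 is
    # never touched. t1 is assumed to be a float array.
    patch = cv2.resize(patch, (t1.shape[1], t1.shape[0]))
    return t1 + 0.1 * patch.astype(t1.dtype)
```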

The beneficial effects of the present invention are:

The invention can be applied to video surveillance, augmented reality, robot visual navigation, and other fields, and can track the target in real time and accurately.

Description of Drawings

FIG. 1 is a flow chart of the method of the present invention.

Detailed Description

The present invention is further described below. The following embodiments serve only to illustrate the technical solution of the present invention more clearly and are not intended to limit its scope of protection.

As shown in FIG. 1, the specific procedure of the method of the present invention is as follows:

Step 1: At the initial moment, manually select the target to be tracked, and use the RGB image block of size (w0, h0) covering the target region as both template T0 and template T1; at this point T0 and T1 are identical image blocks.

Step 2: Back-project all pixels in the region covered by T0 into three-dimensional space to obtain a three-dimensional point cloud; take the centroid of the point cloud as the target's three-dimensional position (X0, Y0, Z0) at time 0, and compute the maximum distance r between any point of the point cloud and (X0, Y0, Z0).

The back-projection is computed as follows: for a point (x, y) on the two-dimensional image, read its depth d from the depth image; the back-projected three-dimensional position has X coordinate (x - cx)·d/fx, Y coordinate (y - cy)·d/fy, and Z coordinate equal to the depth d, where cx, cy, fx, fy are the calibration parameters provided by the camera manufacturer.

The three-dimensional position is:

$$X_0=\frac{1}{n}\sum_{i=1}^{n}X_i,\qquad Y_0=\frac{1}{n}\sum_{i=1}^{n}Y_i,\qquad Z_0=\frac{1}{n}\sum_{i=1}^{n}Z_i$$

where (Xi, Yi, Zi) are the coordinates of the i-th point in the three-dimensional point cloud at time 0 and n is the total number of points.

Step 3: Perform template matching with template T0 and template T1 on each of the R, G, and B channels of the RGB image acquired by the camera to obtain six two-dimensional response maps R1, R2, ..., R6, and compute the average response map R = (R1 + R2 + ... + R6)/6.

The template matching process computes the similarity between template image T0 or T1 and each sub-image S at a different position within the RGB image acquired by the camera but with the same size as the template image. The similarity is the normalized cross-correlation (NCC) between the template image and the sub-image S:

$$\mathrm{NCC}(T_0,S)=\frac{\sum_i\left(T_{0,i}-\bar{T}_0\right)\left(S_i-\bar{S}\right)}{\sigma(T_0)\,\sigma(S)}$$

where T0,i and Si are the i-th elements of T0 and S respectively, σ(T0) and T̄0 are the variance and mean of T0, and σ(S) and S̄ are the variance and mean of S.

Step 4: Using the depth image captured by the camera at the current moment and the given camera parameters provided by the camera manufacturer, back-project each point (x, y) of the average response map R into three-dimensional space to obtain a three-dimensional point cloud set; the weight of each point in the cloud is the value of the average response map R at the corresponding position. The back-projection method is the same as in Step 2.

Step 5: Using the three-dimensional position tracked at the previous moment as the initial value, apply the Parzen window method to find the location of the local maximum of the three-dimensional point cloud set from Step 4; this location is the target's three-dimensional position (Xt, Yt, Zt) at the current moment.

The steps of the Parzen window method are as follows:

Step S51: Take the target's three-dimensional position (Xt-1, Yt-1, Zt-1) at the previous moment as the initial value; that is, when the iteration count j = 0, set Xj = Xt-1, Yj = Yt-1, Zj = Zt-1.

Step S52: Compute the new three-dimensional position as the response-weighted mean of the points within the search sphere:

$$(X_j,Y_j,Z_j)=\frac{\sum_i R(p_i)\,(X_i,Y_i,Z_i)}{\sum_i R(p_i)}$$

where (Xi, Yi, Zi) is the i-th point within the sphere of radius r centered at the position (Xj-1, Yj-1, Zj-1) reached after iteration j-1, p_i is the projection of that point onto the image, and R(p_i) is the value of the Step 4 average response map at p_i.

Step S53: Increment the iteration count: j = j + 1.

Step S54: Return to Step S52 until the iteration count j > 10, then proceed to the next step.

Step S55: Set Xt = Xj, Yt = Yj, Zt = Zj.

Step 6: From the target's current position (Xt, Yt, Zt), compute the target's projected size (wt, ht) on the image at the current moment, where wt and ht are the width and height respectively.

The computation is as follows:

Step S61: Compute the distance s0 between the image projections of the initial three-dimensional position (X0, Y0, Z0) at time 0 and of (X0 + r, Y0, Z0).

Step S62: Compute the distance st between the image projections of (Xt, Yt, Zt) and of (Xt + r, Yt, Zt).

Step S63: Compute the target's projected size on the image at time t as

(wt, ht) = (w0, h0)·st/s0.

Step 7: Compute the projection (xt, yt) of (Xt, Yt, Zt) onto the RGB image at time t, and take the image block T of size (wt, ht) centered at (xt, yt) in the image.

Step 8: Update template T1 with image block T; T0 remains unchanged.

Template T1 is updated as T1 = T1 + 0.1·T.

Step 9: Return to Step 3 until the camera stops capturing new images.
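Tying the sketches above together, a skeleton of the per-frame loop of Steps 3 to 9 follows. All function names are ours, not the patent's; `camera` is assumed to yield aligned RGB and depth frames, and boundary clamping of the extracted image block is omitted.

```python
import numpy as np

def track(camera, t0, center, r, fx, fy, cx, cy):
    # Steps 3-9, run once per frame until the camera stops.
    t1 = t0.astype(np.float64)
    h0, w0 = t0.shape[:2]
    s0 = fx * r / center[2]                                    # projected radius at time 0
    for rgb, depth in camera:
        resp = average_response_map(rgb, t0, t1)               # step 3
        full = np.zeros(depth.shape, dtype=resp.dtype)         # pad the response
        full[h0 // 2:h0 // 2 + resp.shape[0],                  # map back to image
             w0 // 2:w0 // 2 + resp.shape[1]] = resp           # size, recentered
        points, weights = back_project_cloud(full, depth,
                                             fx, fy, cx, cy)   # step 4
        center = parzen_local_max(points, weights, center, r)  # step 5
        st = fx * r / center[2]                                # step 6
        wt, ht = int(w0 * st / s0), int(h0 * st / s0)
        xt = int(fx * center[0] / center[2] + cx)              # step 7
        yt = int(fy * center[1] / center[2] + cy)
        patch = rgb[yt - ht // 2:yt + ht // 2,
                    xt - wt // 2:xt + wt // 2]
        t1 = update_template(t1, patch)                        # step 8
        yield center, (xt, yt, wt, ht)                         # then loop (step 9)
```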

The method of the invention effectively tracks the position of an object in three-dimensional space and handles changes in target scale; the template update mechanism accommodates changes in the object's appearance while avoiding interference from nearby objects of similar appearance.

The above is only a preferred embodiment of the present invention. It should be noted that those of ordinary skill in the art can make several improvements and modifications without departing from the technical principle of the present invention, and these improvements and modifications should also be regarded as falling within the protection scope of the present invention.

Claims (7)

1. A fast target tracking method for an RGB-D camera, characterized in that it comprises the following steps:

1) at the initial moment, manually selecting the target to be tracked, and using the RGB image block of size (w0, h0) covering the target region as both template T0 and template T1, where w0 is the width of the image block and h0 is its height;

2) back-projecting all pixels in the region covered by template T0 into three-dimensional space to obtain a three-dimensional point cloud, taking the centroid of the point cloud as the target's three-dimensional position (X0, Y0, Z0) at time 0, and computing the maximum distance r between any point of the point cloud and (X0, Y0, Z0);

3) performing template matching with template T0 and template T1 on each of the R, G, and B channels of the RGB image acquired by the camera, obtaining six two-dimensional response maps R1, R2, ..., R6, and computing the average response map R = (R1 + R2 + ... + R6)/6;

4) using the depth image captured by the camera at the current moment and the camera parameters provided by the camera manufacturer, back-projecting each point (x, y) of the average response map R into three-dimensional space to obtain a three-dimensional point cloud set, the weight of each point in the cloud being the value of the average response map R at the corresponding position;

5) using the three-dimensional position tracked at the previous moment as the initial value, finding with the Parzen window method the location of the local maximum of the three-dimensional point cloud set of step 4), this location being the target's three-dimensional position (Xt, Yt, Zt) at the current moment;

6) from the target's current position (Xt, Yt, Zt), computing the target's projected size (wt, ht) on the image at the current moment, where wt is the width of the projection and ht is its height;

7) computing the projection (xt, yt) of (Xt, Yt, Zt) onto the RGB image at time t, and taking the image block T of size (wt, ht) centered at (xt, yt) in the image;

8) updating template T1 with image block T, T0 remaining unchanged;

9) returning to step 3) until the camera stops capturing new images.
2. The fast target tracking method for an RGB-D camera according to claim 1, characterized in that in step 2) the target's three-dimensional position (X0, Y0, Z0) at time 0 is:

$$X_0=\frac{1}{n}\sum_{i=1}^{n}X_i,\qquad Y_0=\frac{1}{n}\sum_{i=1}^{n}Y_i,\qquad Z_0=\frac{1}{n}\sum_{i=1}^{n}Z_i$$

where (Xi, Yi, Zi) are the coordinates of the i-th point in the three-dimensional point cloud at time 0 and n is the total number of points.
3. The fast target tracking method for an RGB-D camera according to claim 1, characterized in that the template matching process in step 3) is as follows:

computing the similarity between template T0 or template T1 and each sub-image S lying at a different position within the RGB image acquired by the camera but having the same size as the template image, the similarity being obtained as the normalized cross-correlation between the template image and the sub-image S:

$$\mathrm{NCC}(T_0,S)=\frac{\sum_i\left(T_{0,i}-\bar{T}_0\right)\left(S_i-\bar{S}\right)}{\sigma(T_0)\,\sigma(S)}$$

where T0,i and Si are the i-th elements of T0 and S respectively, σ(T0) and T̄0 are the variance and mean of T0, and σ(S) and S̄ are the variance and mean of S.
4. The fast target tracking method for an RGB-D camera according to claim 1, characterized in that the back-projection of steps 2) and 4) is as follows:

for a point (x, y) on the two-dimensional image, the depth d of (x, y) is read from the depth image; the back-projected three-dimensional position has X coordinate (x - cx)·d/fx, Y coordinate (y - cy)·d/fy, and Z coordinate equal to the depth d, where cx, cy, fx, fy are the camera parameters provided by the camera manufacturer.

5. The fast target tracking method for an RGB-D camera according to claim 1, characterized in that in step 5) the Parzen window method proceeds as follows:

51) taking the target's three-dimensional position (Xt-1, Yt-1, Zt-1) at the previous moment as the initial value, i.e. when the iteration count j = 0, Xj = Xt-1, Yj = Yt-1, Zj = Zt-1;

52) computing the new three-dimensional position as

$$(X_j,Y_j,Z_j)=\frac{\sum_i R(p_i)\,(X_i,Y_i,Z_i)}{\sum_i R(p_i)}$$

where (Xi, Yi, Zi) is the i-th point within the sphere of radius r centered at the position (Xj-1, Yj-1, Zj-1) reached after iteration j-1, p_i is the projection of that point onto the image, and R(p_i) is the value of the step-4) average response map at p_i;

53) incrementing the iteration count: j = j + 1;

54) returning to step 52) until the iteration count j > 10, then proceeding to the next step;

55) Xt = Xj, Yt = Yj, Zt = Zj.
6. The fast target tracking method for an RGB-D camera according to claim 5, characterized in that in step 6) the target's projected size on the image at the current moment is computed as follows:

61) computing the distance s0 between the image projections of the initial three-dimensional position (X0, Y0, Z0) at time 0 and of (X0 + r, Y0, Z0);

62) computing the distance st between the image projections of the current three-dimensional position (Xt, Yt, Zt) and of (Xt + r, Yt, Zt);

63) computing the target's projected size on the image at time t as (wt, ht) = (w0, h0)·st/s0.

7. The fast target tracking method for an RGB-D camera according to claim 6, characterized in that in step 8) template T1 is updated as T1 = T1 + 0.1·T.
CN201810258190.5A 2018-03-27 2018-03-27 Rapid target tracking method suitable for RGB-D camera Active CN108596947B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810258190.5A CN108596947B (en) 2018-03-27 2018-03-27 Rapid target tracking method suitable for RGB-D camera

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810258190.5A CN108596947B (en) 2018-03-27 2018-03-27 Rapid target tracking method suitable for RGB-D camera

Publications (2)

Publication Number Publication Date
CN108596947A CN108596947A (en) 2018-09-28
CN108596947B true CN108596947B (en) 2021-09-17

Family

ID=63624668

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810258190.5A Active CN108596947B (en) 2018-03-27 2018-03-27 Rapid target tracking method suitable for RGB-D camera

Country Status (1)

Country Link
CN (1) CN108596947B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109636814A (en) * 2018-12-18 2019-04-16 联想(北京)有限公司 A kind of image processing method and electronic equipment
CN109993086B (en) * 2019-03-21 2021-07-27 北京华捷艾米科技有限公司 Face detection method, device and system and terminal equipment
CN110245601B (en) * 2019-06-11 2022-03-01 Oppo广东移动通信有限公司 Eye tracking methods and related products
CN110472553B (en) * 2019-08-12 2022-03-11 北京易航远智科技有限公司 Target tracking method, computing device and medium for fusion of image and laser point cloud
CN117237406B (en) * 2022-06-08 2025-02-14 珠海一微半导体股份有限公司 Robot vision tracking method


Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103955682A (en) * 2014-05-22 2014-07-30 深圳市赛为智能股份有限公司 Behavior recognition method and device based on SURF interest points
CN106384079A (en) * 2016-08-31 2017-02-08 东南大学 RGB-D information based real-time pedestrian tracking method
CN107240129A (en) * 2017-05-10 2017-10-10 同济大学 Object and indoor small scene based on RGB D camera datas recover and modeling method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Antoine Petit et al., "Tracking fractures of deformable objects in real-time with an RGB-D sensor," 2015 International Conference on 3D Vision, 2015, pp. 632-639. *

Also Published As

Publication number Publication date
CN108596947A (en) 2018-09-28


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant