CN108596947B - Rapid target tracking method suitable for RGB-D camera - Google Patents


Info

Publication number
CN108596947B
CN108596947B (application CN201810258190.5A)
Authority
CN
China
Prior art keywords
image
dimensional
rgb
template
camera
Prior art date
Legal status
Active
Application number
CN201810258190.5A
Other languages
Chinese (zh)
Other versions
CN108596947A (en)
Inventor
刘烨
聂建辉
荆晓远
Current Assignee
Nanjing University of Posts and Telecommunications
Original Assignee
Nanjing University of Posts and Telecommunications
Priority date: 2018-03-27
Filing date: 2018-03-27
Publication date: 2021-09-17
Application filed by Nanjing University of Posts and Telecommunications
Priority to CN201810258190.5A
Publication of CN108596947A
Application granted
Publication of CN108596947B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G06T7/223 Analysis of motion using block-matching
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/18 Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10016 Video; Image sequence


Abstract

The invention discloses a rapid target tracking method for RGB-D cameras, belonging to the fields of video analysis and three-dimensional point cloud processing. Building on traditional template matching, the method uses the depth information provided by the RGB-D camera to project the two-dimensional response map obtained by template matching into three-dimensional space, yielding a three-dimensional response map. The local maximum of the three-dimensional response map is found with the Parzen window method, which determines the object's position in three-dimensional space; the resulting three-dimensional position in turn provides accurate scale information for template matching at the next moment, producing a more accurate tracking result. The method can be applied to video surveillance, augmented reality, robot visual navigation, and other fields, and achieves real-time, accurate tracking of the target.

Description

A Fast Target Tracking Method for RGB-D Cameras

Technical Field

The invention relates to a fast target tracking method for RGB-D cameras, belonging to the technical field of video analysis and three-dimensional point cloud processing.

Background Art

Target tracking has important applications in video surveillance, virtual reality, and other fields. With a traditional camera, target tracking can only be performed on two-dimensional RGB images, where it is easily disturbed, leading to tracking failure. In recent years RGB-D cameras have become widespread; compared with a traditional RGB camera, an RGB-D camera obtains the scene's depth information in addition to the RGB image. Existing target tracking methods for RGB-D cameras, however, treat the depth channel as an ordinary color channel and ignore the three-dimensional scene structure it encodes, so they do not fully exploit the depth information the camera provides.

Summary of the Invention

The technical problem to be solved by the present invention is to overcome the defects of the prior art and provide a fast target tracking method for RGB-D cameras. Building on traditional template matching, the method uses the depth information obtained by the RGB-D camera to project the two-dimensional response map produced by template matching into three-dimensional space, yielding a three-dimensional response map. The local maximum of the three-dimensional response map is found with the Parzen window method, which determines the object's position in three-dimensional space; the resulting three-dimensional position in turn provides accurate scale information for template matching at the next moment, giving a more accurate tracking result.

To solve the above technical problem, the present invention provides a fast target tracking method for RGB-D cameras, comprising the following steps:

1) At the initial moment, manually select the target to be tracked, and use the RGB image block of size (w0, h0) covering the target region as both template T0 and template T1, where w0 is the width of the image block and h0 is its height;

2) Back-project all pixels in the region covered by template T0 into three-dimensional space to obtain a three-dimensional point cloud; take the centroid of the point cloud as the target's three-dimensional position (X0, Y0, Z0) at time 0, and compute the maximum distance r between any point of the point cloud and (X0, Y0, Z0);

3) Perform template matching with template T0 and template T1 on each of the R, G, and B channels of the RGB image acquired by the camera, obtaining six two-dimensional response maps R1, R2, ..., R6, and compute the average response map R = (R1 + R2 + ... + R6)/6;

4) Using the depth image captured by the camera at the current moment and the camera parameters provided by the camera manufacturer, back-project each point (x, y) of the average response map R into three-dimensional space to obtain a three-dimensional point cloud set; the weight of each point in the cloud is the value of the average response map R at the corresponding position;

5) Using the three-dimensional position tracked at the previous moment as the initial value, apply the Parzen window method to find the location of the local maximum of the three-dimensional point cloud set from step 4); this location is the target's three-dimensional position (Xt, Yt, Zt) at the current moment;

6) From the target's current position (Xt, Yt, Zt), compute the target's projected size (wt, ht) on the image at the current moment, where wt is the width of the projection and ht is its height;

7) Compute the projection (xt, yt) of (Xt, Yt, Zt) onto the RGB image at time t, and take the image block T of size (wt, ht) centered at (xt, yt) in the image;

8) Update template T1 with image block T; T0 remains unchanged;

9) Return to step 3) until the camera stops capturing new images.

In the aforementioned step 2), the target's three-dimensional position (X0, Y0, Z0) at time 0 is computed as:

$$X_0=\frac{1}{n}\sum_{i=1}^{n}X_i,\qquad Y_0=\frac{1}{n}\sum_{i=1}^{n}Y_i,\qquad Z_0=\frac{1}{n}\sum_{i=1}^{n}Z_i$$

where (Xi, Yi, Zi) are the coordinates of the i-th point in the three-dimensional point cloud at time 0 and n is the total number of points.
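For illustration, the centroid and radius of step 2) amount to the following sketch over the back-projected template points (a minimal sketch, not the patent's implementation):

```python
import numpy as np

def initial_position(points):
    # Step 2: centroid of the initial point cloud, and the maximum
    # distance r from the centroid to any of its points.
    center = points.mean(axis=0)                        # (X0, Y0, Z0)
    r = np.linalg.norm(points - center, axis=1).max()
    return center, r
```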

In the aforementioned step 3), the template matching process is as follows: compute the similarity between template T0 (or T1) and each sub-image S that lies at a different position within the RGB image acquired by the camera but has the same size as the template. The similarity is the normalized cross-correlation between the template image and the sub-image S:

$$\mathrm{NCC}(T_0,S)=\frac{\sum_i\left(T_{0,i}-\bar{T}_0\right)\left(S_i-\bar{S}\right)}{\sigma(T_0)\,\sigma(S)}$$

where T0,i and Si are the i-th elements of T0 and S respectively, σ(T0) and T̄0 are the variance and mean of T0, and σ(S) and S̄ are the variance and mean of S.
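For illustration, a sketch of this similarity and of the six-map averaging of step 3) follows. OpenCV's TM_CCOEFF_NORMED mode computes the same mean-subtracted normalized correlation; the sketch assumes T0 and T1 have identical sizes so the six response maps align, and it divides by the standard deviations and the element count, the usual NCC convention. Note that matchTemplate returns a map smaller than the image; padding it back to image size is omitted here.

```python
import cv2
import numpy as np

def ncc(template, sub_image):
    # Normalized cross-correlation between a template and an equally
    # sized sub-image, as used for the response maps of step 3).
    t = template.astype(np.float64).ravel()
    s = sub_image.astype(np.float64).ravel()
    denom = t.std() * s.std() * t.size
    return float(np.dot(t - t.mean(), s - s.mean()) / denom) if denom else 0.0

def average_response_map(image, t0, t1):
    # Step 3: match T0 and T1 against each of the three color channels
    # and average the six resulting response maps R1..R6.
    responses = []
    for template in (t0, t1):
        for c in range(3):
            img_c = np.ascontiguousarray(image[:, :, c], dtype=np.float32)
            tpl_c = np.ascontiguousarray(template[:, :, c], dtype=np.float32)
            responses.append(cv2.matchTemplate(img_c, tpl_c,
                                               cv2.TM_CCOEFF_NORMED))
    return np.mean(responses, axis=0)
```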

The back-projection used in steps 2) and 4) is as follows: for a point (x, y) on the two-dimensional image, read its depth d from the depth image; the back-projected three-dimensional position has X coordinate (x - cx)·d/fx, Y coordinate (y - cy)·d/fy, and Z coordinate equal to the depth d, where cx, cy, fx, fy are the camera parameters provided by the camera manufacturer.
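A vectorized sketch of this back-projection, also producing the weighted point cloud of step 4); it assumes the response map has been padded to the depth image's size and that the two are pixel-aligned:

```python
import numpy as np

def back_project_cloud(resp, depth, fx, fy, cx, cy):
    # Steps 2) and 4): lift every pixel with a valid depth reading to
    # X = (x - cx) d / fx, Y = (y - cy) d / fy, Z = d, keeping the
    # average-response value at that pixel as the point's weight.
    ys, xs = np.nonzero(depth > 0)
    d = depth[ys, xs].astype(np.float64)
    points = np.stack(((xs - cx) * d / fx,
                       (ys - cy) * d / fy,
                       d), axis=1)
    return points, resp[ys, xs]
```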

In the aforementioned step 5), the Parzen window method proceeds as follows:

51) Take the target's three-dimensional position (Xt-1, Yt-1, Zt-1) at the previous moment as the initial value; that is, when the iteration count j = 0, set Xj = Xt-1, Yj = Yt-1, Zj = Zt-1;

52) Compute the new three-dimensional position as the response-weighted mean of the points within the search sphere:

$$(X_j,Y_j,Z_j)=\frac{\sum_i R(p_i)\,(X_i,Y_i,Z_i)}{\sum_i R(p_i)}$$

where (Xi, Yi, Zi) is the i-th point within the sphere of radius r centered at the position (Xj-1, Yj-1, Zj-1) reached after iteration j-1, p_i is the projection of that point onto the image, and R(p_i) is the value of the step-4) average response map at p_i;

53) Increment the iteration count: j = j + 1;

54) Return to step 52) until the iteration count j > 10, then proceed to the next step;

55) Set Xt = Xj, Yt = Yj, Zt = Zj.
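Under the reading above, where each in-sphere point carries the response weight attached in step 4), the iteration is a flat-kernel weighted mean shift. A sketch follows; the clipping of negative correlations is our own addition, since TM_CCOEFF_NORMED-style responses can be negative:

```python
import numpy as np

def parzen_local_max(points, weights, start, r, max_iters=10):
    # Step 5: weighted mean of the points inside a sphere of radius r,
    # iterated from the previous frame's position.
    position = np.asarray(start, dtype=np.float64)
    for _ in range(max_iters):
        inside = np.linalg.norm(points - position, axis=1) < r
        if not inside.any():
            break                                  # no support: stop early
        w = np.clip(weights[inside], 0.0, None)    # negative NCC adds nothing
        if w.sum() == 0:
            break
        position = (points[inside] * w[:, None]).sum(axis=0) / w.sum()
    return position                                # (Xt, Yt, Zt)
```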

In the aforementioned step 6), the target's projected size on the image at the current moment is computed as follows:

61) compute the distance s0 between the image projections of the initial three-dimensional position (X0, Y0, Z0) at time 0 and of (X0 + r, Y0, Z0);

62) compute the distance st between the image projections of the current three-dimensional position (Xt, Yt, Zt) and of (Xt + r, Yt, Zt);

63) compute the target's projected size on the image at time t as

(wt, ht) = (w0, h0)·st/s0.
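Reading s0 and st as image-plane distances (the only reading under which the ratio varies, since both endpoints share the same depth and their 3D distance would always be r), the projected length reduces to fx·r/Z under the pinhole model above. A sketch, with this reading as an assumption:

```python
def projected_radius(Z, r, fx):
    # Image-plane length of a 3D segment of length r at depth Z that is
    # parallel to the image plane: s = fx * r / Z.
    return fx * r / Z

def projected_size(w0, h0, s0, st):
    # Step 63): (wt, ht) = (w0, h0) * st / s0.
    scale = st / s0
    return w0 * scale, h0 * scale
```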

In the aforementioned step 8), template T1 is updated as

T1 = T1 + 0.1·T.
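A sketch of the update; resizing the new block to T1's size, and keeping T1 in floating point so the accumulation cannot overflow, are our assumptions, as the text leaves them implicit. Because the NCC of step 3) is invariant to a constant gain, the unbounded growth of T1 under this rule does not disturb the match: the update in effect accumulates an appearance history, while the frozen T0 keeps the tracker anchored to the initial view.

```python
import cv2

def update_template(t1, patch):
    # Step 8: blend the newly tracked image block T into T1; T0 is
    # never touched. t1 is assumed to be a float array.
    patch = cv2.resize(patch, (t1.shape[1], t1.shape[0]))
    return t1 + 0.1 * patch.astype(t1.dtype)
```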

The beneficial effects of the present invention are:

The invention can be applied to video surveillance, augmented reality, robot visual navigation, and other fields, and can track the target in real time and accurately.

Description of Drawings

FIG. 1 is a flow chart of the method of the present invention.

Detailed Description

The present invention is further described below. The following embodiments serve only to illustrate the technical solution of the present invention more clearly and are not intended to limit its scope of protection.

As shown in FIG. 1, the specific procedure of the method of the present invention is as follows:

Step 1: At the initial moment, manually select the target to be tracked, and use the RGB image block of size (w0, h0) covering the target region as both template T0 and template T1; at this point T0 and T1 are identical image blocks.

Step 2: Back-project all pixels in the region covered by T0 into three-dimensional space to obtain a three-dimensional point cloud; take the centroid of the point cloud as the target's three-dimensional position (X0, Y0, Z0) at time 0, and compute the maximum distance r between any point of the point cloud and (X0, Y0, Z0).

The back-projection is computed as follows: for a point (x, y) on the two-dimensional image, read its depth d from the depth image; the back-projected three-dimensional position has X coordinate (x - cx)·d/fx, Y coordinate (y - cy)·d/fy, and Z coordinate equal to the depth d, where cx, cy, fx, fy are the calibration parameters provided by the camera manufacturer.

The three-dimensional position is:

$$X_0=\frac{1}{n}\sum_{i=1}^{n}X_i,\qquad Y_0=\frac{1}{n}\sum_{i=1}^{n}Y_i,\qquad Z_0=\frac{1}{n}\sum_{i=1}^{n}Z_i$$

where (Xi, Yi, Zi) are the coordinates of the i-th point in the three-dimensional point cloud at time 0 and n is the total number of points.

Step 3: Perform template matching with template T0 and template T1 on each of the R, G, and B channels of the RGB image acquired by the camera to obtain six two-dimensional response maps R1, R2, ..., R6, and compute the average response map R = (R1 + R2 + ... + R6)/6.

The template matching process computes the similarity between template image T0 or T1 and each sub-image S at a different position within the RGB image acquired by the camera but with the same size as the template image. The similarity is the normalized cross-correlation (NCC) between the template image and the sub-image S:

$$\mathrm{NCC}(T_0,S)=\frac{\sum_i\left(T_{0,i}-\bar{T}_0\right)\left(S_i-\bar{S}\right)}{\sigma(T_0)\,\sigma(S)}$$

where T0,i and Si are the i-th elements of T0 and S respectively, σ(T0) and T̄0 are the variance and mean of T0, and σ(S) and S̄ are the variance and mean of S.

Step 4: Using the depth image captured by the camera at the current moment and the given camera parameters provided by the camera manufacturer, back-project each point (x, y) of the average response map R into three-dimensional space to obtain a three-dimensional point cloud set; the weight of each point in the cloud is the value of the average response map R at the corresponding position. The back-projection method is the same as in Step 2.

Step 5: Using the three-dimensional position tracked at the previous moment as the initial value, apply the Parzen window method to find the location of the local maximum of the three-dimensional point cloud set from Step 4; this location is the target's three-dimensional position (Xt, Yt, Zt) at the current moment.

The steps of the Parzen window method are as follows:

Step S51: Take the target's three-dimensional position (Xt-1, Yt-1, Zt-1) at the previous moment as the initial value; that is, when the iteration count j = 0, set Xj = Xt-1, Yj = Yt-1, Zj = Zt-1.

Step S52: Compute the new three-dimensional position as the response-weighted mean of the points within the search sphere:

$$(X_j,Y_j,Z_j)=\frac{\sum_i R(p_i)\,(X_i,Y_i,Z_i)}{\sum_i R(p_i)}$$

where (Xi, Yi, Zi) is the i-th point within the sphere of radius r centered at the position (Xj-1, Yj-1, Zj-1) reached after iteration j-1, p_i is the projection of that point onto the image, and R(p_i) is the value of the Step 4 average response map at p_i.

Step S53: Increment the iteration count: j = j + 1.

Step S54: Return to Step S52 until the iteration count j > 10, then proceed to the next step.

Step S55: Set Xt = Xj, Yt = Yj, Zt = Zj.

Step 6: From the target's current position (Xt, Yt, Zt), compute the target's projected size (wt, ht) on the image at the current moment, where wt and ht are the width and height respectively.

The computation is as follows:

Step S61: Compute the distance s0 between the image projections of the initial three-dimensional position (X0, Y0, Z0) at time 0 and of (X0 + r, Y0, Z0).

Step S62: Compute the distance st between the image projections of (Xt, Yt, Zt) and of (Xt + r, Yt, Zt).

Step S63: Compute the target's projected size on the image at time t as

(wt, ht) = (w0, h0)·st/s0.

Step 7: Compute the projection (xt, yt) of (Xt, Yt, Zt) onto the RGB image at time t, and take the image block T of size (wt, ht) centered at (xt, yt) in the image.

Step 8: Update template T1 with image block T; T0 remains unchanged.

Template T1 is updated as T1 = T1 + 0.1·T.

Step 9: Return to Step 3 until the camera stops capturing new images.
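Tying the sketches above together, a skeleton of the per-frame loop of Steps 3 to 9 follows. All function names are ours, not the patent's; `camera` is assumed to yield aligned RGB and depth frames, and boundary clamping of the extracted image block is omitted.

```python
import numpy as np

def track(camera, t0, center, r, fx, fy, cx, cy):
    # Steps 3-9, run once per frame until the camera stops.
    t1 = t0.astype(np.float64)
    h0, w0 = t0.shape[:2]
    s0 = fx * r / center[2]                                    # projected radius at time 0
    for rgb, depth in camera:
        resp = average_response_map(rgb, t0, t1)               # step 3
        full = np.zeros(depth.shape, dtype=resp.dtype)         # pad the response
        full[h0 // 2:h0 // 2 + resp.shape[0],                  # map back to image
             w0 // 2:w0 // 2 + resp.shape[1]] = resp           # size, recentered
        points, weights = back_project_cloud(full, depth,
                                             fx, fy, cx, cy)   # step 4
        center = parzen_local_max(points, weights, center, r)  # step 5
        st = fx * r / center[2]                                # step 6
        wt, ht = int(w0 * st / s0), int(h0 * st / s0)
        xt = int(fx * center[0] / center[2] + cx)              # step 7
        yt = int(fy * center[1] / center[2] + cy)
        patch = rgb[yt - ht // 2:yt + ht // 2,
                    xt - wt // 2:xt + wt // 2]
        t1 = update_template(t1, patch)                        # step 8
        yield center, (xt, yt, wt, ht)                         # then loop (step 9)
```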

The method of the invention effectively tracks the position of an object in three-dimensional space and handles changes in target scale; the template update mechanism accommodates changes in the object's appearance while avoiding interference from nearby objects of similar appearance.

The above is only a preferred embodiment of the present invention. It should be noted that those of ordinary skill in the art can make several improvements and modifications without departing from the technical principle of the present invention, and these improvements and modifications should also be regarded as falling within the protection scope of the present invention.

Claims (7)

1. A fast target tracking method for an RGB-D camera, characterized in that it comprises the following steps:

1) at the initial moment, manually selecting the target to be tracked, and using the RGB image block of size (w0, h0) covering the target region as both template T0 and template T1, where w0 is the width of the image block and h0 is its height;

2) back-projecting all pixels in the region covered by template T0 into three-dimensional space to obtain a three-dimensional point cloud, taking the centroid of the point cloud as the target's three-dimensional position (X0, Y0, Z0) at time 0, and computing the maximum distance r between any point of the point cloud and (X0, Y0, Z0);

3) performing template matching with template T0 and template T1 on each of the R, G, and B channels of the RGB image acquired by the camera, obtaining six two-dimensional response maps R1, R2, ..., R6, and computing the average response map R = (R1 + R2 + ... + R6)/6;

4) using the depth image captured by the camera at the current moment and the camera parameters provided by the camera manufacturer, back-projecting each point (x, y) of the average response map R into three-dimensional space to obtain a three-dimensional point cloud set, the weight of each point in the cloud being the value of the average response map R at the corresponding position;

5) using the three-dimensional position tracked at the previous moment as the initial value, finding with the Parzen window method the location of the local maximum of the three-dimensional point cloud set of step 4), this location being the target's three-dimensional position (Xt, Yt, Zt) at the current moment;

6) from the target's current position (Xt, Yt, Zt), computing the target's projected size (wt, ht) on the image at the current moment, where wt is the width of the projection and ht is its height;

7) computing the projection (xt, yt) of (Xt, Yt, Zt) onto the RGB image at time t, and taking the image block T of size (wt, ht) centered at (xt, yt) in the image;

8) updating template T1 with image block T, T0 remaining unchanged;

9) returning to step 3) until the camera stops capturing new images.
2. The fast target tracking method for an RGB-D camera according to claim 1, characterized in that in step 2) the target's three-dimensional position (X0, Y0, Z0) at time 0 is:

$$X_0=\frac{1}{n}\sum_{i=1}^{n}X_i,\qquad Y_0=\frac{1}{n}\sum_{i=1}^{n}Y_i,\qquad Z_0=\frac{1}{n}\sum_{i=1}^{n}Z_i$$

where (Xi, Yi, Zi) are the coordinates of the i-th point in the three-dimensional point cloud at time 0 and n is the total number of points.
3. The fast target tracking method for an RGB-D camera according to claim 1, characterized in that the template matching process in step 3) is as follows:

computing the similarity between template T0 or template T1 and each sub-image S lying at a different position within the RGB image acquired by the camera but having the same size as the template image, the similarity being obtained as the normalized cross-correlation between the template image and the sub-image S:

$$\mathrm{NCC}(T_0,S)=\frac{\sum_i\left(T_{0,i}-\bar{T}_0\right)\left(S_i-\bar{S}\right)}{\sigma(T_0)\,\sigma(S)}$$

where T0,i and Si are the i-th elements of T0 and S respectively, σ(T0) and T̄0 are the variance and mean of T0, and σ(S) and S̄ are the variance and mean of S.
4. The fast target tracking method for an RGB-D camera according to claim 1, characterized in that the back-projection of steps 2) and 4) is as follows:

for a point (x, y) on the two-dimensional image, the depth d of (x, y) is read from the depth image; the back-projected three-dimensional position has X coordinate (x - cx)·d/fx, Y coordinate (y - cy)·d/fy, and Z coordinate equal to the depth d, where cx, cy, fx, fy are the camera parameters provided by the camera manufacturer.

5. The fast target tracking method for an RGB-D camera according to claim 1, characterized in that in step 5) the Parzen window method proceeds as follows:

51) taking the target's three-dimensional position (Xt-1, Yt-1, Zt-1) at the previous moment as the initial value, i.e. when the iteration count j = 0, Xj = Xt-1, Yj = Yt-1, Zj = Zt-1;

52) computing the new three-dimensional position as

$$(X_j,Y_j,Z_j)=\frac{\sum_i R(p_i)\,(X_i,Y_i,Z_i)}{\sum_i R(p_i)}$$

where (Xi, Yi, Zi) is the i-th point within the sphere of radius r centered at the position (Xj-1, Yj-1, Zj-1) reached after iteration j-1, p_i is the projection of that point onto the image, and R(p_i) is the value of the step-4) average response map at p_i;

53) incrementing the iteration count: j = j + 1;

54) returning to step 52) until the iteration count j > 10, then proceeding to the next step;

55) Xt = Xj, Yt = Yj, Zt = Zj.
6. The fast target tracking method for an RGB-D camera according to claim 5, characterized in that in step 6) the target's projected size on the image at the current moment is computed as follows:

61) computing the distance s0 between the image projections of the initial three-dimensional position (X0, Y0, Z0) at time 0 and of (X0 + r, Y0, Z0);

62) computing the distance st between the image projections of the current three-dimensional position (Xt, Yt, Zt) and of (Xt + r, Yt, Zt);

63) computing the target's projected size on the image at time t as (wt, ht) = (w0, h0)·st/s0.

7. The fast target tracking method for an RGB-D camera according to claim 6, characterized in that in step 8) template T1 is updated as T1 = T1 + 0.1·T.
CN201810258190.5A 2018-03-27 2018-03-27 Rapid target tracking method suitable for RGB-D camera Active CN108596947B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810258190.5A CN108596947B (en) 2018-03-27 2018-03-27 Rapid target tracking method suitable for RGB-D camera

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810258190.5A CN108596947B (en) 2018-03-27 2018-03-27 Rapid target tracking method suitable for RGB-D camera

Publications (2)

Publication Number Publication Date
CN108596947A CN108596947A (en) 2018-09-28
CN108596947B true CN108596947B (en) 2021-09-17

Family

ID=63624668

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810258190.5A Active CN108596947B (en) 2018-03-27 2018-03-27 Rapid target tracking method suitable for RGB-D camera

Country Status (1)

Country Link
CN (1) CN108596947B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109636814A (en) * 2018-12-18 2019-04-16 联想(北京)有限公司 A kind of image processing method and electronic equipment
CN109993086B (en) * 2019-03-21 2021-07-27 北京华捷艾米科技有限公司 Face detection method, device and system and terminal equipment
CN110245601B (en) * 2019-06-11 2022-03-01 Oppo广东移动通信有限公司 Eye tracking methods and related products
CN110472553B (en) * 2019-08-12 2022-03-11 北京易航远智科技有限公司 Target tracking method, computing device and medium for fusion of image and laser point cloud
CN117237406B (en) * 2022-06-08 2025-02-14 珠海一微半导体股份有限公司 Robot vision tracking method


Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103955682A (en) * 2014-05-22 2014-07-30 深圳市赛为智能股份有限公司 Behavior recognition method and device based on SURF interest points
CN106384079A (en) * 2016-08-31 2017-02-08 东南大学 RGB-D information based real-time pedestrian tracking method
CN107240129A (en) * 2017-05-10 2017-10-10 同济大学 Object and indoor small scene based on RGB D camera datas recover and modeling method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Antoine Petit et al., "Tracking fractures of deformable objects in real-time with an RGB-D sensor," 2015 International Conference on 3D Vision, 2015, pp. 632-639. *

Also Published As

Publication number Publication date
CN108596947A (en) 2018-09-28


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant