CN108596947B - Rapid target tracking method suitable for RGB-D camera - Google Patents
- Publication number
- CN108596947B (application CN201810258190.5A)
- Authority
- CN
- China
- Prior art keywords
- image
- dimensional
- camera
- rgb
- template
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/223—Analysis of motion using block-matching
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N7/00—Television systems
- H04N7/18—Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10016—Video; Image sequence
Abstract
The invention discloses a rapid target tracking method suitable for an RGB-D camera, belonging to the fields of video analysis and three-dimensional point cloud processing. Building on traditional template matching, the method uses the depth information provided by the RGB-D camera to project the two-dimensional response map obtained by template matching into three-dimensional space, yielding a three-dimensional response map; it then searches for a local maximum of the three-dimensional response map with a Parzen window method to determine the position of the object in three-dimensional space, and the resulting three-dimensional position provides accurate scale information for template matching at the next moment, giving a more accurate tracking result. The method can be applied in fields such as video surveillance, augmented reality, and robot visual navigation, and achieves real-time, accurate tracking of the target.
Description
Technical Field
The invention relates to a rapid target tracking method suitable for an RGB-D camera, and belongs to the technical field of video analysis and three-dimensional point cloud processing.
Background
Target tracking has important applications in fields such as video surveillance and virtual reality. With a traditional camera, target tracking can only be performed on the two-dimensional RGB image and is easily disturbed, causing tracking failure. In recent years RGB-D cameras have become popular; compared with conventional RGB cameras they provide depth information of the scene in addition to RGB images. Existing target tracking methods for RGB-D cameras, however, treat the depth channel as just another color channel and ignore the three-dimensional scene structure it encodes, so the depth information provided by the RGB-D camera is not fully exploited.
Disclosure of Invention
The invention aims to overcome the defects of the prior art by providing a rapid target tracking method suitable for an RGB-D camera. Building on traditional template matching, the method uses the depth information obtained by the RGB-D camera to project the two-dimensional response map obtained by template matching into three-dimensional space, yielding a three-dimensional response map; a local maximum of the three-dimensional response map is then found with a Parzen window method to determine the position of the object in three-dimensional space, and the resulting three-dimensional position provides accurate scale information for template matching at the next moment, giving a more accurate tracking result.
In order to solve the above technical problem, the present invention provides a fast target tracking method suitable for an RGB-D camera, comprising the steps of:
1) at the initial moment, manually select the target to be tracked and take the RGB image block of size (w0, h0) covering the target region simultaneously as template T0 and template T1, wherein w0 is the width and h0 the height of the image block;
2) back-project all pixel points in the template T0 region into three-dimensional space to obtain a three-dimensional point cloud, compute the centroid of the point cloud as the target's three-dimensional position (X0, Y0, Z0) at time 0, and compute the maximum distance r between the points of the point cloud and (X0, Y0, Z0);
3) perform template matching with template T0 and template T1 on each of the R, G, B channels of the RGB image acquired by the camera to obtain six two-dimensional response maps R1, R2, ..., R6, and compute the average response map R as R = (R1 + R2 + ... + R6)/6;
4) using the depth image captured by the camera at the current moment and the camera parameters provided by the camera manufacturer, back-project every point of the average response map R into three-dimensional space to obtain a three-dimensional point cloud set, the weight of each point in the point cloud being the value at the corresponding position of the average response map R;
5) taking the three-dimensional position tracked at the previous moment as the initial value, obtain the position of the local maximum of the three-dimensional point cloud set of step 4) by the Parzen window method; this position is the target's three-dimensional position (Xt, Yt, Zt) at the current moment;
6) from the target's current position (Xt, Yt, Zt), compute the projected size (wt, ht) of the target on the current image, wherein wt is the width and ht the height of the projection;
7) compute the projection (xt, yt) of (Xt, Yt, Zt) on the RGB image at time t and extract the image block T of size (wt, ht) centered at (xt, yt);
8) update template T1 with image block T, while T0 remains unchanged;
9) return to step 3) whenever the camera captures a new image, until the camera stops shooting.
The three-dimensional position (X0, Y0, Z0) of the target at time 0 in the aforementioned step 2) is computed as
X0 = (1/n)·Σ(i=1..n) Xi,  Y0 = (1/n)·Σ(i=1..n) Yi,  Z0 = (1/n)·Σ(i=1..n) Zi,
wherein (Xi, Yi, Zi) are the coordinates of the ith point of the three-dimensional point cloud at time 0, and n is the total number of points.
The template matching process in the aforementioned step 3) is as follows:
compute the similarity between the template T0 or template T1 and each sub-image S located at a different position within the RGB image acquired by the camera but of the same size as the template image; the similarity is obtained by computing the normalized cross-correlation between the template image and the sub-image S:
NCC(T0, S) = Σ(i=1..n) (Ti - mean(T0))·(Si - mean(S)) / (n·σ(T0)·σ(S)),
wherein Ti and Si are the ith elements of T0 and S respectively, σ(T0) and mean(T0) are the standard deviation and mean of T0, and σ(S) and mean(S) are the standard deviation and mean of S.
The back projection method of the foregoing step 2) and step 4) is as follows:
For a point (x, y) on the two-dimensional image, obtain its depth d from the depth image; the back-projected three-dimensional position then has X coordinate (x - cx)·d/fx, Y coordinate (y - cy)·d/fy, and Z coordinate d, wherein cx, cy, fx, fy are camera parameters provided by the camera manufacturer.
In the aforementioned step 5), the Parzen window method comprises the following steps:
51): take the target's three-dimensional position (Xt-1, Yt-1, Zt-1) at the previous moment as the initial value, i.e. at iteration number j = 0:
Xj = Xt-1, Yj = Yt-1, Zj = Zt-1;
52): compute the new three-dimensional position as the response-weighted mean of the points inside a sphere of radius r centered at the previous estimate:
Xj = Σi wi·Xi / Σi wi,  Yj = Σi wi·Yi / Σi wi,  Zj = Σi wi·Zi / Σi wi,
wherein (Xi, Yi, Zi) is the ith point inside the sphere of radius r centered at the position (Xj-1, Yj-1, Zj-1) after the (j-1)th iteration, (xi, yi) is its projection onto the image, and wi = R(xi, yi) is the value of the average response map of step 4) at that position;
53): set the iteration number j = j + 1;
54): return to step 52) until the iteration number j exceeds 10, then proceed to the next step;
55): Xt = Xj, Yt = Yj, Zt = Zj.
In the foregoing step 6), the projected size of the target on the image at the current moment is computed as follows:
61) compute the distance s0 between the projections on the image of the initial three-dimensional positions (X0, Y0, Z0) and (X0 + r, Y0, Z0) at time 0;
62) compute the distance st between the projections on the image of the current three-dimensional positions (Xt, Yt, Zt) and (Xt + r, Yt, Zt);
63) compute the projected size of the target on the image at time t by:
(wt, ht) = (w0, h0)·st/s0.
The template T1 in the aforementioned step 8) is updated as:
T1 = T1 + 0.1·T.
The invention has the following beneficial effects:
The method can be applied in fields such as video surveillance, augmented reality, and robot visual navigation, and can track the target accurately in real time.
Drawings
FIG. 1 is a flow chart of the method of the present invention.
Detailed Description
The invention is further described below. The following examples are only for illustrating the technical solutions of the present invention more clearly, and the protection scope of the present invention is not limited thereby.
As shown in fig. 1, the method of the present invention comprises the following steps:
Step 1: at the initial moment, manually select the target to be tracked and take the RGB image block of size (w0, h0) covering the target region simultaneously as template T0 and template T1; at this time T0 and T1 are identical image blocks.
Step 2: back-project all pixel points in the T0 region into three-dimensional space to obtain a three-dimensional point cloud, compute the centroid of the point cloud as the target's three-dimensional position (X0, Y0, Z0) at time 0, and compute the maximum distance r between the points of the point cloud and (X0, Y0, Z0).
The back projection is computed as follows: for a point (x, y) on the two-dimensional image, obtain its depth d from the depth image; the back-projected three-dimensional position then has X coordinate (x - cx)·d/fx, Y coordinate (y - cy)·d/fy, and Z coordinate d, wherein cx, cy, fx, fy are calibration parameters provided by the camera manufacturer.
The centroid is computed as X0 = (1/n)·Σ(i=1..n) Xi, Y0 = (1/n)·Σ(i=1..n) Yi, Z0 = (1/n)·Σ(i=1..n) Zi, wherein (Xi, Yi, Zi) are the coordinates of the ith point of the three-dimensional point cloud at time 0, and n is the total number of points.
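As an illustration, the back projection and centroid computation of step 2 can be sketched as follows. This is a minimal sketch assuming a pinhole camera model; the intrinsics fx, fy, cx, cy, the flat synthetic depth image, and the chosen pixel region are hypothetical stand-ins for real calibration data and a real depth frame:

```python
import numpy as np

def back_project(xs, ys, depth, fx, fy, cx, cy):
    """Back-project pixels (xs, ys) to 3D camera coordinates:
    X = (x - cx)*d/fx, Y = (y - cy)*d/fy, Z = d."""
    d = depth[ys, xs].astype(float)
    X = (xs - cx) * d / fx
    Y = (ys - cy) * d / fy
    return np.stack([X, Y, d], axis=1)

# Hypothetical intrinsics; use the values supplied by the camera manufacturer.
fx = fy = 500.0
cx, cy = 320.0, 240.0
depth = np.full((480, 640), 2.0)                # synthetic flat scene, 2 m away
ys, xs = np.mgrid[100:110, 200:210]             # pixel grid of the template region
pts = back_project(xs.ravel(), ys.ravel(), depth, fx, fy, cx, cy)

center = pts.mean(axis=0)                       # (X0, Y0, Z0): centroid of the cloud
r = np.linalg.norm(pts - center, axis=1).max()  # max distance r to the centroid
```

The centroid `center` and radius `r` correspond to (X0, Y0, Z0) and r in the description above.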
Step 3: perform template matching with template T0 and template T1 on each of the R, G, B channels of the RGB image acquired by the camera to obtain six two-dimensional response maps R1, R2, ..., R6, and compute the average response map R as R = (R1 + R2 + ... + R6)/6.
The template matching process is as follows: compute the similarity between the template image T0 or T1 and each sub-image S located at a different position within the RGB image acquired by the camera but of the same size as the template image; the similarity is the Normalized Cross Correlation (NCC) between the template image and the sub-image S:
NCC(T0, S) = Σ(i=1..n) (Ti - mean(T0))·(Si - mean(S)) / (n·σ(T0)·σ(S)),
wherein Ti and Si are the ith elements of T0 and S respectively, σ(T0) and mean(T0) are the standard deviation and mean of T0, and σ(S) and mean(S) are the standard deviation and mean of S.
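For concreteness, the NCC response of step 3 can be sketched as below. This is a slow reference implementation assuming a single-channel image; in practice one would use an FFT-based correlation or OpenCV's `cv2.matchTemplate` with `TM_CCOEFF_NORMED`, and the test image and template here are synthetic stand-ins:

```python
import numpy as np

def ncc_response(template, image):
    """Slide `template` over `image` and compute the normalized
    cross-correlation at every valid position."""
    th, tw = template.shape
    ih, iw = image.shape
    t = template - template.mean()
    t_norm = np.sqrt((t ** 2).sum())
    out = np.zeros((ih - th + 1, iw - tw + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            s = image[y:y + th, x:x + tw]
            s = s - s.mean()
            denom = t_norm * np.sqrt((s ** 2).sum())
            out[y, x] = (t * s).sum() / denom if denom > 1e-12 else 0.0
    return out

rng = np.random.default_rng(0)
img = rng.random((40, 40))
tpl = img[10:18, 12:20].copy()   # template cut out of the image itself
resp = ncc_response(tpl, img)
y, x = np.unravel_index(resp.argmax(), resp.shape)
```

Running this for T0 and T1 on each of the R, G, B channels and averaging the six maps yields the average response map R of step 3.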
Step 4: using the depth image captured by the camera at the current moment and the camera parameters provided by the camera manufacturer, back-project every point of the average response map R into three-dimensional space to obtain a three-dimensional point cloud set; the weight of each point in the point cloud is the value at the corresponding position of the average response map R. The back projection method is the same as in step 2.
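Step 4 thus attaches each response value to its back-projected point as a weight. A sketch, with hypothetical intrinsics and a random stand-in for the response map and depth image:

```python
import numpy as np

# Hypothetical intrinsics and stand-in data; real values come from the camera.
fx = fy = 500.0
cx, cy = 320.0, 240.0
R = np.random.default_rng(1).random((480, 640))  # stand-in average response map
depth = np.full((480, 640), 1.5)                 # stand-in depth image (metres)

# Back-project every response-map position to 3D (same formula as step 2).
ys, xs = np.mgrid[0:480, 0:640]
Z = depth.ravel()
X = (xs.ravel() - cx) * Z / fx
Y = (ys.ravel() - cy) * Z / fy
cloud = np.stack([X, Y, Z], axis=1)  # one 3D point per response-map pixel
weights = R.ravel()                  # weight of each point = R(x, y)
```

The weighted cloud `(cloud, weights)` is the three-dimensional point cloud set consumed by the Parzen window search of step 5.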
Step 5: taking the three-dimensional position tracked at the previous moment as the initial value, obtain the position of the local maximum of the three-dimensional point cloud set of step 4 by the Parzen window method; this position is the target's three-dimensional position (Xt, Yt, Zt) at the current moment.
The procedure of the Parzen window method is as follows:
Step S51: take the target's three-dimensional position (Xt-1, Yt-1, Zt-1) at the previous moment as the initial value, i.e. at iteration number j = 0:
Xj = Xt-1, Yj = Yt-1, Zj = Zt-1.
Step S52: compute the new three-dimensional position as the response-weighted mean of the points inside a sphere of radius r centered at the previous estimate:
Xj = Σi wi·Xi / Σi wi,  Yj = Σi wi·Yi / Σi wi,  Zj = Σi wi·Zi / Σi wi,
wherein (Xi, Yi, Zi) is the ith point inside the sphere of radius r centered at the position (Xj-1, Yj-1, Zj-1) after the (j-1)th iteration, (xi, yi) is its projection onto the image, and wi = R(xi, yi) is the value of the average response map of step 4 at that position.
Step S53: set the iteration number j = j + 1.
Step S54: return to step S52 until the iteration number j exceeds 10, then proceed to the next step.
Step S55: Xt = Xj, Yt = Yj, Zt = Zj.
Step 6: from the target's current position (Xt, Yt, Zt), compute the projected size (wt, ht) of the target on the current image, where wt and ht are the width and height respectively.
the calculation method comprises the following steps:
Step S61: compute the distance s0 between the projections on the image of the initial three-dimensional positions (X0, Y0, Z0) and (X0 + r, Y0, Z0) at time 0.
Step S62: compute the distance st between the projections on the image of the current three-dimensional positions (Xt, Yt, Zt) and (Xt + r, Yt, Zt).
Step S63: compute the projected size of the target on the image at time t by:
(wt, ht) = (w0, h0)·st/s0.
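Steps S61-S63 can be sketched with a pinhole projection; the intrinsics, initial template size, and target positions below are hypothetical examples:

```python
import numpy as np

def project(P, fx, fy, cx, cy):
    """Pinhole projection of a 3D point to pixel coordinates."""
    X, Y, Z = P
    return np.array([fx * X / Z + cx, fy * Y / Z + cy])

fx = fy = 500.0
cx, cy = 320.0, 240.0   # hypothetical intrinsics
r = 0.1                 # target radius from step 2
w0, h0 = 40, 30         # initial template size

P0 = np.array([0.0, 0.0, 1.0])   # target at 1 m initially
Pt = np.array([0.2, 0.1, 2.0])   # target moved away, now at 2 m

s0 = np.linalg.norm(project(P0 + [r, 0, 0], fx, fy, cx, cy) - project(P0, fx, fy, cx, cy))
st = np.linalg.norm(project(Pt + [r, 0, 0], fx, fy, cx, cy) - project(Pt, fx, fy, cx, cy))
wt, ht = w0 * st / s0, h0 * st / s0   # (wt, ht) = (w0, h0) * st / s0
```

Because s scales as fx·r/Z, doubling the depth halves st/s0, so the template size shrinks accordingly, which is the scale adaptation the method relies on.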
Step 7: compute the projection (xt, yt) of (Xt, Yt, Zt) on the RGB image at time t and extract the image block T of size (wt, ht) centered at (xt, yt).
Step 8: update template T1 with image block T, while T0 remains unchanged.
Template T1 is updated as: T1 = T1 + 0.1·T.
Step 9: return to step 3 whenever the camera captures a new image, until the camera stops shooting.
The method can effectively track the position of the object in three-dimensional space, handle changes in target scale, and cope with changes in the object's appearance while resisting interference from nearby objects of similar appearance.
The above description is only a preferred embodiment of the present invention, and it should be noted that, for those skilled in the art, several modifications and variations can be made without departing from the technical principle of the present invention, and these modifications and variations should also be regarded as the protection scope of the present invention.
Claims (7)
1. A fast target tracking method suitable for an RGB-D camera is characterized by comprising the following steps:
1) at the initial moment, manually select the target to be tracked and take the RGB image block of size (w0, h0) covering the target region simultaneously as template T0 and template T1, wherein w0 is the width and h0 the height of the image block;
2) back-project all pixel points in the template T0 region into three-dimensional space to obtain a three-dimensional point cloud, compute the centroid of the point cloud as the target's three-dimensional position (X0, Y0, Z0) at time 0, and compute the maximum distance r between the points of the point cloud and (X0, Y0, Z0);
3) perform template matching with template T0 and template T1 on each of the R, G, B channels of the RGB image acquired by the camera to obtain six two-dimensional response maps R1, R2, ..., R6, and compute the average response map R as R = (R1 + R2 + ... + R6)/6;
4) using the depth image captured by the camera at the current moment and the camera parameters provided by the camera manufacturer, back-project every point of the average response map R into three-dimensional space to obtain a three-dimensional point cloud set, the weight of each point in the point cloud being the value at the corresponding position of the average response map R;
5) taking the three-dimensional position tracked at the previous moment as the initial value, obtain the position of the local maximum of the three-dimensional point cloud set of step 4) by the Parzen window method; this position is the target's three-dimensional position (Xt, Yt, Zt) at the current moment;
6) from the target's current position (Xt, Yt, Zt), compute the projected size (wt, ht) of the target on the current image, wherein wt is the width and ht the height of the projection;
7) compute the projection (xt, yt) of (Xt, Yt, Zt) on the RGB image at time t and extract the image block T of size (wt, ht) centered at (xt, yt);
8) update template T1 with image block T, while T0 remains unchanged;
9) return to step 3) whenever the camera captures a new image, until the camera stops shooting.
2. The fast target tracking method for an RGB-D camera as claimed in claim 1, wherein the three-dimensional position (X0, Y0, Z0) of the target at time 0 in step 2) is computed as
X0 = (1/n)·Σ(i=1..n) Xi,  Y0 = (1/n)·Σ(i=1..n) Yi,  Z0 = (1/n)·Σ(i=1..n) Zi,
wherein (Xi, Yi, Zi) are the coordinates of the ith point of the three-dimensional point cloud at time 0, and n is the total number of points.
3. The fast target tracking method for an RGB-D camera as claimed in claim 1, wherein the template matching process in step 3) is:
compute the similarity between the template T0 or template T1 and each sub-image S located at a different position within the RGB image acquired by the camera but of the same size as the template image, the similarity being obtained by computing the normalized cross-correlation between the template image and the sub-image S:
NCC(T0, S) = Σ(i=1..n) (Ti - mean(T0))·(Si - mean(S)) / (n·σ(T0)·σ(S)),
wherein Ti and Si are the ith elements of T0 and S respectively, σ(T0) and mean(T0) are the standard deviation and mean of T0, and σ(S) and mean(S) are the standard deviation and mean of S.
4. The fast target tracking method for RGB-D camera as claimed in claim 1, wherein the back projection method of step 2) and step 4) is as follows:
For a point (x, y) on the two-dimensional image, obtain its depth d from the depth image; the back-projected three-dimensional position then has X coordinate (x - cx)·d/fx, Y coordinate (y - cy)·d/fy, and Z coordinate d, wherein cx, cy, fx, fy are camera parameters provided by the camera manufacturer.
5. The fast target tracking method for an RGB-D camera as claimed in claim 1, wherein in step 5) the Parzen window method comprises the following steps:
51): take the target's three-dimensional position (Xt-1, Yt-1, Zt-1) at the previous moment as the initial value, i.e. at iteration number j = 0: Xj = Xt-1, Yj = Yt-1, Zj = Zt-1;
52): compute the new three-dimensional position as the response-weighted mean of the points inside a sphere of radius r centered at the previous estimate:
Xj = Σi wi·Xi / Σi wi,  Yj = Σi wi·Yi / Σi wi,  Zj = Σi wi·Zi / Σi wi,
wherein (Xi, Yi, Zi) is the ith point inside the sphere of radius r centered at the position (Xj-1, Yj-1, Zj-1) after the (j-1)th iteration, (xi, yi) is its projection onto the image, and wi = R(xi, yi) is the value of the average response map of step 4) at that position;
53): set the iteration number j = j + 1;
54): return to step 52) until the iteration number j exceeds 10, then proceed to the next step;
55): Xt = Xj, Yt = Yj, Zt = Zj.
6. The fast target tracking method for an RGB-D camera as claimed in claim 5, wherein in step 6) the projected size of the target on the image at the current moment is computed as follows:
61) compute the distance s0 between the projections on the image of the initial three-dimensional positions (X0, Y0, Z0) and (X0 + r, Y0, Z0) at time 0;
62) compute the distance st between the projections on the image of the current three-dimensional positions (Xt, Yt, Zt) and (Xt + r, Yt, Zt);
63) compute the projected size of the target on the image at time t by:
(wt, ht) = (w0, h0)·st/s0.
7. The fast target tracking method for an RGB-D camera as claimed in claim 6, wherein the template T1 in step 8) is updated as:
T1 = T1 + 0.1·T.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810258190.5A CN108596947B (en) | 2018-03-27 | 2018-03-27 | Rapid target tracking method suitable for RGB-D camera |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108596947A CN108596947A (en) | 2018-09-28 |
CN108596947B true CN108596947B (en) | 2021-09-17 |
Family
ID=63624668
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810258190.5A Active CN108596947B (en) | 2018-03-27 | 2018-03-27 | Rapid target tracking method suitable for RGB-D camera |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108596947B (en) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109636814A (en) * | 2018-12-18 | 2019-04-16 | 联想(北京)有限公司 | A kind of image processing method and electronic equipment |
CN109993086B (en) * | 2019-03-21 | 2021-07-27 | 北京华捷艾米科技有限公司 | Face detection method, device and system and terminal equipment |
CN110245601B (en) * | 2019-06-11 | 2022-03-01 | Oppo广东移动通信有限公司 | Eyeball tracking method and related product |
CN110472553B (en) * | 2019-08-12 | 2022-03-11 | 北京易航远智科技有限公司 | Target tracking method, computing device and medium for fusion of image and laser point cloud |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103955682A (en) * | 2014-05-22 | 2014-07-30 | 深圳市赛为智能股份有限公司 | Behavior recognition method and device based on SURF interest points |
CN106384079A (en) * | 2016-08-31 | 2017-02-08 | 东南大学 | RGB-D information based real-time pedestrian tracking method |
CN107240129A (en) * | 2017-05-10 | 2017-10-10 | 同济大学 | Object and indoor small scene based on RGB D camera datas recover and modeling method |
Non-Patent Citations (1)
Title |
---|
Antoine Petit et al., "Tracking fractures of deformable objects in real-time with an RGB-D sensor," 2015 International Conference on 3D Vision, Nov. 2015, pp. 632-639 * |
Also Published As
Publication number | Publication date |
---|---|
CN108596947A (en) | 2018-09-28 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2021196294A1 (en) | Cross-video person location tracking method and system, and device | |
CN108596947B (en) | Rapid target tracking method suitable for RGB-D camera | |
CN109345588B (en) | Tag-based six-degree-of-freedom attitude estimation method | |
US10059002B2 (en) | Image processing apparatus, image processing method, and non-transitory computer-readable medium | |
US10339389B2 (en) | Methods and systems for vision-based motion estimation | |
JP6095018B2 (en) | Detection and tracking of moving objects | |
CN111897349B (en) | Autonomous obstacle avoidance method for underwater robot based on binocular vision | |
CN108171715B (en) | Image segmentation method and device | |
JP2019536170A (en) | Virtually extended visual simultaneous localization and mapping system and method | |
CN107843251B (en) | Pose estimation method of mobile robot | |
US20200334842A1 (en) | Methods, devices and computer program products for global bundle adjustment of 3d images | |
CN105678809A (en) | Handheld automatic follow shot device and target tracking method thereof | |
Tang et al. | Camera self-calibration from tracking of moving persons | |
CN110006444B (en) | Anti-interference visual odometer construction method based on optimized Gaussian mixture model | |
CN109785373B (en) | Speckle-based six-degree-of-freedom pose estimation system and method | |
CN111354007B (en) | Projection interaction method based on pure machine vision positioning | |
JP6515039B2 (en) | Program, apparatus and method for calculating a normal vector of a planar object to be reflected in a continuous captured image | |
JP6894707B2 (en) | Information processing device and its control method, program | |
JP6061770B2 (en) | Camera posture estimation apparatus and program thereof | |
WO2022217794A1 (en) | Positioning method of mobile robot in dynamic environment | |
JP6922348B2 (en) | Information processing equipment, methods, and programs | |
JP2019200516A (en) | Template posture estimation apparatus, method, and program | |
CN109671084B (en) | Method for measuring shape of workpiece | |
CN107945166B (en) | Binocular vision-based method for measuring three-dimensional vibration track of object to be measured | |
Xu et al. | Head tracking using particle filter with intensity gradient and color histogram |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||