CN108010055B - Tracking system and tracking method for three-dimensional object - Google Patents

Tracking system and tracking method for three-dimensional object

Info

Publication number
CN108010055B
CN108010055B (application CN201711183555.4A)
Authority
CN
China
Prior art keywords
frame
data
module
feature point
video frame
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201711183555.4A
Other languages
Chinese (zh)
Other versions
CN108010055A (en)
Inventor
康大智 (Kang Dazhi)
吕国云 (Lyu Guoyun)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tapuyihai Shanghai Intelligent Technology Co ltd
Original Assignee
Tapuyihai Shanghai Intelligent Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tapuyihai Shanghai Intelligent Technology Co ltd filed Critical Tapuyihai Shanghai Intelligent Technology Co ltd
Priority to CN201711183555.4A
Publication of CN108010055A
Application granted
Publication of CN108010055B

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • G06T7/246Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G06T7/248Analysis of motion using feature-based methods, e.g. the tracking of corners or segments involving reference images or patches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10016Video; Image sequence

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a tracking system for a three-dimensional object and a tracking method thereof. The tracking system comprises a key frame forming unit, a video frame extrinsic parameter analysis unit, and a tracking judgment unit. The key frame forming unit forms key frame data, including the extrinsic parameters of the key frame, by analyzing the data of a template frame and the data of a video frame. The video frame extrinsic parameter analysis unit is communicatively connected to the key frame forming unit; it obtains the extrinsic parameters of the key frame and calculates the extrinsic parameters of the video frame from them. The tracking judgment unit is communicatively connected to the video frame extrinsic parameter analysis unit; after obtaining the data of the key frame and the data of the video frame, it calculates the corresponding pose of the tracked object in the video frame.

Description

Tracking system and tracking method for three-dimensional object
Technical Field
The present invention relates to a tracking system and a tracking method for a three-dimensional object, and more particularly to a tracking system and a tracking method that can track a three-dimensional object without placing any marker in the real scene.
Background
In the traditional tracking method for three-dimensional objects, a preset marker must be placed in the real scene in advance. Tracking is achieved by detecting this marker: the parameters of the camera are determined, the three-dimensional data model is rendered, and the extrinsic parameters of the camera are recalculated as the marker is scaled, translated, or rotated. In real life, placing a marker is troublesome. On one hand, the placed marker must not occlude the tracked object; on the other hand, some tracked objects are moving, so the video frame acquisition device moves with them while the placed marker stays still. If the video frame acquisition device moves quickly, the marker placed in the real scene is likely to leave the field of view that the device can capture, which can cause the whole tracking process to fail. To mitigate such problems, conventional three-dimensional object tracking may use a planar image as the marker; but because the marker is then a two-dimensional planar image, the direction of motion of the tracked object is generally restricted, and so is the freedom of movement of any three-dimensional model superimposed on it.
As modern technology has developed, markerless tracking based on visual features has found a much wider range of applications. However, current related techniques place high demands on the CAD model of the tracked object, the tracking process is very complex, and the required computing equipment is also demanding. With the wide adoption of mobile AR applications, real-time performance has become a key problem in algorithm research, and the crucial point is the stability of the tracking method. The traditional tracking method for three-dimensional objects tracks objects with complex textures slowly, and cannot extract a sufficient number of effective features from simple objects; moreover, when extracting features from the tracked object, it is very sensitive to illumination intensity. This degrades the continuity of the whole tracking process to a certain extent, and the superimposed virtual three-dimensional object often moves violently in the scene, without coherent visual perception.
Disclosure of Invention
An object of the present invention is to provide a tracking system for a three-dimensional object and a tracking method thereof, wherein the tracking system does not need specific physical markers or planar markers placed in the real scene when tracking an object.
Another object of the present invention is to provide a tracking system for a three-dimensional object and a tracking method thereof, wherein the tracking system for a three-dimensional object can stably track objects having different texture richness.
Another object of the present invention is to provide a tracking system for a three-dimensional object and a tracking method thereof, wherein the tracking system for a three-dimensional object can quickly track an object with a complex texture.
Another object of the present invention is to provide a tracking system for a three-dimensional object and a tracking method thereof, wherein the tracking system for a three-dimensional object can stably track an object with a complex texture.
Another object of the present invention is to provide a tracking system for a three-dimensional object and a tracking method thereof, wherein the tracking system for a three-dimensional object can extract sufficient feature points for a simple object, thereby performing effective and stable tracking.
Another object of the present invention is to provide a tracking system for a three-dimensional object and a tracking method thereof, wherein the tracking system is robust to illumination intensity when tracking the three-dimensional object.
Another object of the present invention is to provide a tracking system for a three-dimensional object and a tracking method thereof, wherein the tracking system tracks the object in real time by updating the key template frame, so that tracking can continue even when the tracked object leaves and later returns to the field of view of the video frame acquisition device. That is, after tracking fails because the tracked object goes out of bounds, the tracking system can automatically re-track it once it returns to the field of view.
To achieve at least one of the above objects, the present invention provides a tracking system for a three-dimensional object, comprising:
a key frame forming unit, wherein the key frame forming unit forms a key frame data by analyzing data of a template frame and data of a video frame, wherein the key frame data comprises extrinsic parameters of the key frame;
a video frame extrinsic parameter analysis unit, wherein the video frame extrinsic parameter analysis unit is communicatively connected to the key frame forming unit, and is capable of acquiring the extrinsic parameters of the key frame and calculating the extrinsic parameters of the video frame according to them; and
a tracking judgment unit, wherein the tracking judgment unit is communicatively connected to the video frame extrinsic parameter analysis unit, and, after acquiring the data of the key frame and the data of the video frame, calculates the corresponding pose of the tracked object in the video frame.
According to an embodiment of the present invention, the tracking system of the three-dimensional object further comprises a data obtaining unit, wherein the data obtaining unit further comprises a template frame obtaining module and a video frame obtaining module, wherein the template frame obtaining module and the video frame obtaining module are respectively connected to the key frame forming unit in a communication manner.
According to an embodiment of the present invention, the data acquisition unit further includes a feature point processing module, wherein the feature point processing module includes a feature point judgment module and a feature point extraction module, wherein the feature point judgment module is communicatively connected to the template frame acquisition module and the video frame acquisition module, and wherein the feature point extraction module is communicatively connected to the feature point judgment module, the key frame forming unit, and the video frame extrinsic parameter analysis unit.
According to an embodiment of the present invention, the feature point processing module further includes a texture richness determination module, wherein the texture richness determination module includes a feature point analysis module, wherein the feature point analysis module is communicatively connected to the feature point determination module.
According to an embodiment of the present invention, the texture richness judgment module further includes a feature point change judgment module and a homogenization module, wherein the feature point change judgment module is communicatively connected to the feature point extraction module, and wherein the homogenization module is communicatively connected to the feature point judgment module.
According to an embodiment of the present invention, the feature point processing module further includes a feature point matching judgment module, wherein the feature point matching judgment module is communicatively connected to the feature point extraction module.
To achieve at least one of the above objects, the present invention provides a tracking method of a three-dimensional object, comprising the steps of:
(A) acquiring, according to the data of a tracked object, the data of a key frame matching a planar image of the tracked object from the video frames, wherein the data of the key frame comprises the extrinsic parameters of the key frame;
(B) analyzing the acquired data of the key frame and the acquired data of a video frame, and calculating the extrinsic parameters corresponding to the current video frame; and
(C) calculating the pose between the tracked object in the video frame and the tracked object in the template frame according to the extrinsic parameters of the video frame.
According to an embodiment of the present invention, the tracking method of a three-dimensional object further includes:
(D) comparing whether the difference between the extrinsic parameters of the video frame and the extrinsic parameters of the key frame meets a specific threshold; and
(E) when the difference does not meet the threshold, replacing the template frame with the template frame used last time.
According to an embodiment of the present invention, wherein the step (a) comprises:
(F) judging whether a point in the current image data is a characteristic point according to a characteristic point judgment threshold;
(G) extracting feature points according to the judgment result;
(H) comparing the feature points extracted in step (G) with the feature points in reference data to judge whether they meet the requirements the reference data places on feature points; and
(I) if so, taking the feature points extracted in step (G) as the finally extracted feature points; if not, changing the feature point judgment threshold in step (F) and executing step (F) again.
According to an embodiment of the present invention, wherein the step (a) comprises:
(J) judging whether the number of extracted feature points is within a feature point number threshold range;
(K) when the number of extracted feature points is not within the threshold range, homogenizing the video frame; and
(L) extracting the feature points of the homogenized video frame.
Drawings
FIG. 1 is a schematic diagram of a three-dimensional object tracking system of the present invention.
FIG. 2 is a tracking schematic of a tracking system for a three-dimensional object according to the present invention.
FIG. 3 is a schematic diagram of feature point extraction of a three-dimensional object tracking system according to the present invention.
FIG. 4 is a flow chart of a method for tracking a three-dimensional object according to the present invention.
Detailed Description
The following description is presented to enable one of ordinary skill in the art to make and use the invention. The preferred embodiments provided in the following description are only intended as examples and modifications obvious to a person skilled in the art, and do not constitute a limitation of the scope of the invention. The general principles defined in the following description may be applied to other embodiments, alternatives, modifications, equivalent implementations, and applications without departing from the spirit and scope of the invention.
It will be understood by those skilled in the art that in the present disclosure, the terms "longitudinal," "lateral," "upper," "lower," "front," "rear," "left," "right," "vertical," "horizontal," "top," "bottom," "inner," "outer," and the like are used in an orientation or positional relationship indicated in the drawings for ease of description and simplicity of description, and do not indicate or imply that the referenced devices or components must be constructed and operated in a particular orientation and thus are not to be considered limiting.
The tracking system for a three-dimensional object and its tracking method disclosed by the invention are explained in detail below. The tracking system comprises a data acquisition unit 10, a key frame forming unit 20, a video frame extrinsic parameter analysis unit 30, and a tracking judgment unit 40. The data acquisition unit 10 obtains tracking data of a tracked object. The key frame forming unit 20 is communicatively connected to the data acquisition unit 10 to obtain the tracking data, and forms the initial data related to the key frame accordingly, where the initial data includes the extrinsic and intrinsic parameters of the key frame. The video frame extrinsic parameter analysis unit 30 is communicatively connected to the key frame forming unit 20 to obtain the extrinsic parameters M_i of the current video frame from the initial data formed by the key frame forming unit 20. The tracking judgment unit 40 is communicatively connected to the video frame extrinsic parameter analysis unit 30 and the key frame forming unit 20; it compares the extrinsic parameters M_i of the video frame with the extrinsic parameters M_0 of the key frame, and when the difference between M_i and M_0 exceeds a certain range, the tracking judgment unit 40 forms update data, which is transmitted to the key frame forming unit 20 to replace the initial data of the key frame with the data of the previous template frame, so that tracking can be re-established.
It will be appreciated by those skilled in the art that, by updating the key frames in real time, this CAD-model-based tracking system for three-dimensional objects can keep tracking an object even when the tracked object moves out of the field of view of the video frame acquisition device and later returns into it.
In the present invention, the data acquiring unit 10 includes a template frame acquiring module 11 and a video frame acquiring module 12, wherein the template frame acquiring module 11 and the video frame acquiring module 12 are communicatively connected to the key frame forming unit 20, and the key frame forming unit 20 determines whether the video frame can be used as a key template frame by analyzing and comparing data of a template frame formed by the template frame acquiring module 11 with data of a video frame formed by the video frame acquiring module 12.
Specifically, in an embodiment of the present invention, the template frame acquisition module 11 can acquire the CAD model data of a tracked object and parse it for storage, where the format of the CAD model data includes, but is not limited to, OBJ or 3DS, and the parsed CAD model data includes triangle facet data, vertex data, normal vector data, texture data, material data, illumination information data, and the like.
Those skilled in the art can understand that the template frame acquisition module 11 can at the same time perform vertex redundancy processing on the CAD model data to reduce its redundant information, thereby reducing the time needed for subsequent processing of the CAD model data.
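For illustration, a minimal Python sketch of such vertex-redundancy removal, assuming the parsed model is held as NumPy arrays (the array and function names are hypothetical, not from the patent):

```python
import numpy as np

def deduplicate_vertices(vertices, faces):
    """Remove duplicate vertices and remap the triangle indices.

    vertices: (N, 3) float array of vertex coordinates.
    faces:    (M, 3) int array of triangle vertex indices.
    """
    # np.unique returns the distinct rows plus, for every original row,
    # the index of its representative in the deduplicated array.
    unique_vertices, inverse = np.unique(vertices, axis=0, return_inverse=True)
    inverse = inverse.ravel()
    remapped_faces = inverse[faces]  # faces now index the unique vertices
    return unique_vertices, remapped_faces

# Example: a quad split into two triangles with duplicated corner rows.
verts = np.array([[0, 0, 0], [1, 0, 0], [1, 1, 0],
                  [0, 0, 0], [1, 1, 0], [0, 1, 0]], dtype=float)
tris = np.array([[0, 1, 2], [3, 4, 5]])
v, f = deduplicate_vertices(verts, tris)
print(v.shape)  # (4, 3) -- two duplicate vertices removed
```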
Further, the template frame acquisition module 11 can render the CAD model data with OpenGL to form the template frame, so as to match the data characteristics used for subsequent tracking. Those skilled in the art can understand that, in the present invention, other methods may be used to render the tracked object, and the invention is not limited in this respect.
The video frame acquiring module 12 can be communicatively connected to the video frame acquiring device, such as a monocular camera, wherein the video frame acquiring module 12 can acquire at least one video frame, wherein the key frame forming unit 20 can acquire the data of the template frame and the data of the video frame from the template frame acquiring module 11 and the video frame acquiring module 12, respectively, to determine whether the current video frame data can be used as the key frame, and if so, the video frame can be used as the key frame, and the data of the template frame can be used as the data of the key frame.
In the present invention, the initial data of the template frame is implemented as an image T of the tracked object rendered by OpenGL, and the data of the template frame includes the extrinsic parameters M_0 = [R_0 t_0] given when the image was rendered by OpenGL. The key frame forming unit 20 compares whether the current video frame matches the data of the template frame; if so, the video frame is regarded as the key frame, and the extrinsic parameters of the video frame acquisition device corresponding to this key frame are recorded as the extrinsic parameters of the template frame, M_0 = [R_0 t_0].
Specifically, a two-dimensional projection image T of the CAD model under the current pose is obtained by rendering the CAD model of the tracked object with OpenGL, with given intrinsic parameters K and extrinsic parameters M of the video frame acquisition device, such as a monocular camera, while video frames are acquired in real time by the device. The key frame forming unit 20 performs a fast initial matching of the template frame and the video frame; a successful match means the tracked object is contained in the current frame and the extrinsic parameters of the current frame equal those of the rendered image, i.e., M ≈ M′, and the current video frame is taken as the key frame.
The specific process of the rapid initial matching in the embodiment of the invention is as follows:
First, the intrinsic parameters K and extrinsic parameters M of the CAD model image rendered by OpenGL are given; then the contour features of the rendered image and of the video frame are extracted and binarized with an adaptive threshold. The key frame forming unit 20 matches the processed images using a similarity matrix obtained by the conventional least-square-error method, where D(i, j) is the similarity matrix and l(i, j) is the coordinate of the top-left corner of the region with the highest similarity:
d_ij = Σ_m Σ_n [S(i+m, j+n) − T(m, n)]² (the conventional least-square-error similarity, with S the binarized video frame image and T the binarized rendered image)
The best matching position in D(i, j) is its minimum element, ε_1 = min{d_ij}. Experiments show that the probability of the matching point falling in the range ε_1 to 2ε_1 is as high as 98%, so ε_2 = 2ε_1 is used as the upper bound of similarity for matching points, and the corrected similarity matrix has elements d′_ij:
[Equation image not reproduced in the source: the corrected matrix elements d′_ij, defined using the similarity upper bound ε_2.]
The corrected similarity matrix speeds up matching: when the template frame is close to the video frame image being matched, the value converges quickly; when the two differ greatly, the value grows quickly. A coarse scan can therefore use this distribution to quickly determine the approximate region, after which the search step is shortened for a fine scan, significantly reducing the matching time. It can be understood by those skilled in the art that, in this embodiment, this method increases the matching speed between the template frame and the video frame, so that a fast tracking rate can be maintained while an object is tracked.
After the template frame and the video frame pass this coarse-to-fine fast initial matching, the video frame is saved as the key frame, and the camera extrinsic parameters used to render the CAD model are taken as the extrinsic parameters M_0 = [R_0 t_0] of the key frame.
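A minimal sketch of this coarse-to-fine initial matching, assuming the similarity d_ij is the sum of squared differences over adaptively binarized images; the function names, strides, and threshold parameters are illustrative, and the bound ε_2 = 2ε_1 is applied here to the coarse-scan minimum:

```python
import cv2
import numpy as np

def binarize_contours(img, block=11, c=2):
    """Adaptive-threshold binarization of the contour structure (BGR input)."""
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    return cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C,
                                 cv2.THRESH_BINARY, block, c)

def ssd(patch, templ):
    d = patch.astype(np.float32) - templ.astype(np.float32)
    return float((d * d).sum())

def coarse_to_fine_match(frame_bin, templ_bin, coarse_step=8):
    th, tw = templ_bin.shape
    fh, fw = frame_bin.shape
    # Coarse scan: a large stride locates the approximate region quickly.
    best, best_ij = None, (0, 0)
    for i in range(0, fh - th + 1, coarse_step):
        for j in range(0, fw - tw + 1, coarse_step):
            d = ssd(frame_bin[i:i+th, j:j+tw], templ_bin)
            if best is None or d < best:
                best, best_ij = d, (i, j)
    eps2 = 2.0 * best  # upper similarity bound eps_2 = 2 * eps_1
    # Fine scan: unit stride around the coarse minimum, skipping
    # candidates whose dissimilarity exceeds the bound.
    i0, j0 = best_ij
    for i in range(max(0, i0 - coarse_step), min(fh - th, i0 + coarse_step) + 1):
        for j in range(max(0, j0 - coarse_step), min(fw - tw, j0 + coarse_step) + 1):
            d = ssd(frame_bin[i:i+th, j:j+tw], templ_bin)
            if d <= eps2 and d < best:
                best, best_ij = d, (i, j)
    return best_ij, best  # top-left corner l(i, j) and its dissimilarity
```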
Further, the data obtaining unit 10 further includes a feature point processing module 13, wherein the feature point processing module 13 is communicatively connected to the template frame obtaining module 11 and the video frame obtaining module 12, wherein the feature point processing module 13 is capable of extracting feature points of the template frame and feature points of the video frame to form corresponding feature point data, respectively.
Specifically, the feature point processing module 13 extracts feature descriptors with scale and rotation invariance from the template frame and the video frame by the ORB feature extraction method, and matches the two using a distance metric to obtain a coarse set of matching points. Mismatches are then eliminated from the obtained matching point set to reduce the influence of false correspondences on subsequent calculations, specifically as follows:
ORB feature points are extracted from the template frame and the video frame, corresponding feature point sets are constructed to obtain the descriptors of the feature points, and the feature points of the template frame are matched against those of the video frame using the Hamming distance, as follows:
for a feature point p on the template frame, let its descriptor be F_P, and let the descriptors of the feature points on the video frame I be F_I = {F_1, F_2, … F_n};
the Hamming distances between F_P and all feature points in F_I are calculated: D = {d_1, d_2, … d_n}, where:
d_n = F_P ⊕ F_n (⊕ denotes bitwise XOR; the Hamming distance is the number of differing bits);
the minimum Hamming distance D_min = min{d_1, d_2, … d_n} identifies the nearest point to p. With t the matching threshold, if D_min < t, p is judged to match that point; otherwise p has no matching point;
the same operation is performed on the remaining points of the template frame P to obtain the full set of feature point matching pairs.
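A minimal sketch of this ORB extraction and Hamming-distance matching using OpenCV (the feature count and the threshold t are illustrative values; the patent does not fix them):

```python
import cv2

def match_orb(template_img, frame_img, t=64):
    """ORB extraction plus brute-force Hamming matching with threshold t."""
    orb = cv2.ORB_create(nfeatures=500)
    kp_t, des_t = orb.detectAndCompute(template_img, None)
    kp_f, des_f = orb.detectAndCompute(frame_img, None)
    if des_t is None or des_f is None:
        return []
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING)
    # One nearest neighbour per template point; keep p only when the
    # distance D_min to its nearest point is below the threshold t.
    matches = matcher.match(des_t, des_f)
    return [m for m in matches if m.distance < t]
```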
In an embodiment of the present invention, the feature point processing module 13 further includes a feature point judgment module 131, a feature point extraction module 132, and a texture richness judgment module 133. The feature point extraction module 132 is communicatively connected to the feature point judgment module 131 and the texture richness judgment module 133. The feature point judgment module 131 is communicatively connected to the template frame acquisition module 11 and the video frame acquisition module 12 of the data acquisition unit 10, so that it can acquire the corresponding data of the template frame and of the video frame; it judges whether the vertex data in the template frame data and in the video frame data satisfy the feature point requirement. The feature point extraction module 132 extracts the points meeting the requirement to form a template frame feature point set and a video frame feature point set respectively. The texture richness judgment module 133 is communicatively connected to the feature point extraction module 132 so as to acquire the two feature point sets from it and analyze whether the number of feature points in the video frame feature point set meets the requirement; if not, the texture richness judgment module 133 forms replacement data, and the feature point judgment module 131 acquires this replacement data and changes the feature point requirement accordingly.
Specifically, in the embodiment of the present invention, the feature point judgment module 131 is preset with a feature point judgment threshold δ. Preferably, the criterion compares the pixel values of the 16 pixels on a circle of radius R around a candidate point with the pixel value of the center point: if more than det(P_num) pixels on the circle differ from the center point by more than the judgment threshold δ, the center point is taken as a feature point. The judgment condition is:
N_i = Σ|gray(p) − gray(x_i)|.
the feature point extraction module 132 can correspondingly extract feature points meeting the above requirements to form each of the video feature point sets.
In the present invention, the texture richness judgment module 133 includes a feature point analysis module 1331, which is communicatively connected to the feature point extraction module 132 and the feature point judgment module 131. The feature point analysis module 1331 obtains the video feature point set from the feature point extraction module 132 and compares the number of feature points in it with a preset reference range (between δ_1 and δ_2). When the number of feature points in the analyzed set is not within this range, the feature point analysis module 1331 forms the replacement data; after obtaining the replacement data, the feature point judgment module 131 executes it to adjust the feature point extraction condition.
From the above description, those skilled in the art can understand that, in this embodiment, by means of the feature point analysis module 1331 of the texture richness judgment module 133, the tracking system can extract sufficient video feature point sets from objects of different texture richness, so that it can be applied to objects with different texture richness.
Further, the texture richness judgment module 133 includes a feature point change judgment module 1332, communicatively connected to the feature point extraction module 132 to monitor the difference between the numbers of feature points extracted at two adjacent times; the feature point change judgment module 1332 is also communicatively connected to the feature point analysis module 1331. If the difference in number is less than a given change threshold, the current state is normal and feature points continue to be extracted from the video frames; if the difference is greater than the change threshold, the feature points have changed drastically, indicating that the currently tracked object is out of bounds or occluded, and the image then needs to be homogenized.
More specifically, the feature point processing module 13 further includes a homogenization module 134, communicatively connected to the feature point change judgment module 1332 and the feature point judgment module 131. When the feature point analysis module 1331 finds that the difference in number is greater than the change threshold, the homogenization module 134 forms homogenization data; the feature point judgment module 131 executes this data and divides the video frame into equal regions accordingly, so that the feature point extraction module 132 can extract feature points from the equally divided video frame.
As can be understood by those skilled in the art, with the feature point change judgment module 1332 and the homogenization module 134, the system can extract an appropriate number of feature points even when the object in the analyzed image is occluded or out of bounds.
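A sketch of such homogenization, dividing the frame into an equal grid and extracting features per cell (the grid size and per-cell budget are illustrative values):

```python
import cv2

def extract_uniform(gray, rows=4, cols=4, per_cell=30):
    """Split the frame into equal cells and detect up to per_cell ORB
    features in each, keeping the points evenly distributed even when
    part of the object is occluded or out of view."""
    orb = cv2.ORB_create(nfeatures=per_cell)
    h, w = gray.shape
    keypoints = []
    for r in range(rows):
        for c in range(cols):
            y0, y1 = r * h // rows, (r + 1) * h // rows
            x0, x1 = c * w // cols, (c + 1) * w // cols
            for kp in orb.detect(gray[y0:y1, x0:x1], None):
                kp.pt = (kp.pt[0] + x0, kp.pt[1] + y0)  # back to full-frame coords
                keypoints.append(kp)
    return keypoints
```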
The feature point processing module 13 further comprises a feature point matching judgment module 135, communicatively connected to the feature point extraction module 132 and the video frame extrinsic parameter analysis unit 30. After the feature point extraction module 132 extracts the feature points of the video frame, the feature point matching judgment module 135 calculates the Hamming distances D = {d_1, d_2, …, d_n} between each feature point of the template frame and the feature points of each video frame; the minimum Hamming distance D_min = min{d_1, d_2, …, d_n} identifies the nearest neighbor of p, and if D_min < t, the two points are judged to match, otherwise p has no matching point.
The feature point matching judgment module 135 further rearranges the feature points of the video frame according to the matching result to form the matched feature point set. Specifically, a ratio test is performed with the template frame image as reference and the video frame image as target. First, a 2-nearest-neighbor query is performed for each feature point of the template frame, giving the distance from it to the video frame feature points and, correspondingly, its nearest neighbor bn and next-nearest neighbor bn′ among them, with distances d_n and d_n′ respectively. A threshold test is then applied with the matching threshold t: if d_n > t, the matched pair is rejected. Finally, the ratio test is performed with a ratio threshold ε: if
d_n / d_n′ > ε,
then both bn and bn′ are considered possible matching points for the query point, so the matched pair is eliminated as ambiguous. A cross test is then performed on the resulting point sets: let the feature point sets of the matching pairs in the query set and the target set be {s_n} and {p_n} respectively, n = 1, 2, 3, …; the query set and target set are swapped, the matching points {s′_n} of {p_n} are solved for, and the query set corresponding to the correct matching pairs is {s_n} ∩ {s′_n}.
As can be understood by those skilled in the art, since the matching point set obtained by the first matching contains a large number of mismatches, which would introduce large errors into subsequent results and operations, the feature point matching judgment module 135 eliminates the mismatched pairs after obtaining the matching point pairs.
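A sketch combining the threshold, ratio, and cross tests with OpenCV's brute-force matcher (the threshold t and ratio ε values are illustrative; the patent does not specify them):

```python
import cv2

def filter_matches(des_t, des_f, t=64, ratio=0.8):
    """Ratio test plus cross test over ORB descriptors."""
    bf = cv2.BFMatcher(cv2.NORM_HAMMING)
    # 2-nearest-neighbour query: template (query set) -> frame (target set).
    kept = []
    for pair in bf.knnMatch(des_t, des_f, k=2):
        if len(pair) < 2:
            continue
        bn, bn2 = pair                       # nearest and next-nearest neighbour
        if bn.distance > t:                  # threshold test: d_n > t -> reject
            continue
        if bn.distance > ratio * bn2.distance:
            continue                         # ambiguous: both are plausible matches
        kept.append(bn)
    # Cross test: the reverse query must map back to the same pair.
    back = {(m.trainIdx, m.queryIdx) for m in bf.match(des_f, des_t)}
    return [m for m in kept if (m.queryIdx, m.trainIdx) in back]
```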
Further, the video frame extrinsic parameter analysis unit 30 is communicatively connected to the key frame forming unit 20 and the feature point matching judgment module 135, and obtains the data of the template frame and the related data of the video frame. From the known initial extrinsic parameters M_0 = [R_0 t_0], the intrinsic parameters K, and the correspondences of the current matching points, it obtains the extrinsic parameters M_i = [R_i t_i] of the video frame, using Kalman filter prediction to ensure the stability of the parameters, as follows:
the initial extrinsic parameters M_0 = [R_0 t_0] of the video frame acquisition device, together with the intrinsic parameters K, relate a set of non-coplanar 3D points p_i = (x_i, y_i, z_i)^T in the world coordinate system to their corresponding points g_i = (u′_i, v′_i, 1)^T on the two-dimensional image plane:
g_i = K M p_i;
in the subsequent feature-based tracking, the set of matching points between the feature points g_0 of the template frame I_0 and the feature points g_i of the video frame I_i is obtained, i.e., the 2D-2D correspondence:
g_i = H_0i g_0;
a new 2D-3D correspondence is then obtained from the initial extrinsic parameters and the current 2D-2D correspondence, and the extrinsic parameters M_i of the device currently acquiring the video frame are calculated from g_i = H_0i K M_0 p_0; the estimated result is then optimized with Kalman filtering.
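A minimal sketch of this step, assuming the 2D-2D matches have already been filtered as above; it uses RANSAC and OpenCV's PnP solver in place of the patent's unspecified solver, and omits the Kalman filtering applied afterwards:

```python
import cv2
import numpy as np

def estimate_extrinsics(pts3d_0, pts2d_0, pts2d_i, K):
    """Recover the current extrinsics M_i = [R_i t_i] by chaining the key
    frame's 2D-3D correspondences through the 2D-2D matches.

    pts3d_0: (N, 3) model points p_0 visible in the key frame (N >= 4)
    pts2d_0: (N, 2) their projections g_0 in the key frame
    pts2d_i: (N, 2) the matched points g_i in the current video frame
    """
    pts3d_0 = np.asarray(pts3d_0, dtype=np.float32)
    pts2d_0 = np.asarray(pts2d_0, dtype=np.float32)
    pts2d_i = np.asarray(pts2d_i, dtype=np.float32)
    # H_0i maps key-frame image points onto current-frame image points;
    # RANSAC is used here only to reject outlier matches.
    H_0i, mask = cv2.findHomography(pts2d_0, pts2d_i, cv2.RANSAC, 3.0)
    inl = mask.ravel().astype(bool)
    # New 2D-3D correspondences: the 3D points p_0 now observed at g_i.
    ok, rvec, tvec = cv2.solvePnP(pts3d_0[inl], pts2d_i[inl],
                                  np.asarray(K, dtype=np.float32), None)
    R, _ = cv2.Rodrigues(rvec)
    return np.hstack([R, tvec])  # 3x4 extrinsic matrix [R_i t_i]
```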
The tracking judgment unit 40 acquires the extrinsic parameters of the video frame calculated by the video frame extrinsic parameter analysis unit 30 and calculates from them the spatial pose of the tracked object in the video frame, thereby realizing tracking of the tracked object.
It should be noted that the video frame extrinsic parameter analysis unit 30 is communicatively connected to the feature point processing module 13 and the feature point matching judgment module 135, so as to obtain the feature point data, calculate the extrinsic parameters M_i of the video frame from it, and form the corresponding analysis data. The tracking judgment unit 40 is communicatively connected to the video frame extrinsic parameter analysis unit 30 to judge from the analysis data whether the tracked object is present in the current video frame, i.e., whether the current tracking is successful. If tracking has failed, the tracking judgment unit 40 forms update data; the tracking judgment unit 40 is communicatively connected to the key frame forming unit 20, so that the relevant data in the template frame can be updated.
Specifically, the tracking judgment unit 40 obtains the analysis result formed by the video frame extrinsic parameter analysis unit 30. A threshold is set in the tracking judgment unit 40, which compares it with the difference between the extrinsic parameters of the current video frame and those of the template frame. When the difference is greater than the threshold, the tracked object is not present in the current video frame, and the tracking judgment unit 40 forms update data; the key frame forming unit 20 obtains the update data and forms the template frame data from it. When the difference is smaller than the threshold, the tracked object is present in the current video frame; to guarantee stable tracking of the three-dimensional object under different viewing angles, the current tracking state is judged in real time and the key template image is updated. Through the above steps, stable tracking of the three-dimensional object is accomplished and the extrinsic parameters of the camera are output in real time, specifically including the following steps:
judging whether the number of matching points between the current frame and the key template frame is below a given threshold; and
judging whether the camera extrinsic parameters obtained for the current frame exceed a certain range, i.e., whether the camera's range of movement has grown. If either of the two conditions is met, a suitable video frame is stored as a new key template frame for subsequent tracking; the strategy for selecting that video frame is as follows:
the pose information of the frame obtained by matrix calculation is:
{r_x, r_y, r_z, t_x, t_y, t_z}
the pose information predicted after removing singular values and applying Kalman filtering is:
{r′_x, r′_y, r′_z, t′_x, t′_y, t′_z}
the frame tracking score is:
g = (r′_x − r_x)² + (r′_y − r_y)² + (r′_z − r_z)² + (t′_x − t_x)² + (t′_y − t_y)² + (t′_z − t_z)²
If g < G, where G is a given score threshold, the frame is judged to be a candidate frame for template replacement.
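A minimal sketch of this score test (the threshold value G is illustrative; the patent does not specify it):

```python
import numpy as np

def tracking_score(pose, pose_pred):
    """Squared distance between the measured pose {r, t} and the pose
    predicted by Kalman filtering {r', t'}, each a 6-vector
    (r_x, r_y, r_z, t_x, t_y, t_z)."""
    d = np.asarray(pose_pred, dtype=float) - np.asarray(pose, dtype=float)
    return float(d @ d)

def is_template_candidate(pose, pose_pred, G=0.05):
    """A frame whose score g stays below the threshold G is kept as a
    candidate new key template frame."""
    return tracking_score(pose, pose_pred) < G
```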
According to another aspect of the present invention, there is provided a method of tracking a three-dimensional object, wherein the method comprises the steps of:
(A) acquiring, according to the data of a tracked object, the data of a key frame matching a planar image of the tracked object from the video frames, wherein the data of the key frame comprises the extrinsic parameters of the key frame;
(B) analyzing the acquired data of the key frame and the acquired data of a video frame, and calculating the extrinsic parameters corresponding to the current video frame; and
(C) calculating the pose between the tracked object in the video frame and the tracked object in the template frame according to the extrinsic parameters of the video frame.
According to an embodiment of the present invention, the tracking method of a three-dimensional object further includes:
(D) comparing whether the difference between the extrinsic parameters of the video frame and the extrinsic parameters of the key frame meets a specific threshold; and
(E) when the difference does not meet the threshold, replacing the template frame with the template frame used last time.
In the present invention, the step (a) comprises:
(F) judging whether a point in the current image data is a characteristic point according to a characteristic point judgment threshold;
(G) extracting feature points according to the judgment result;
(H) comparing the feature points extracted in step (G) with the feature points in reference data to judge whether they meet the requirements the reference data places on feature points; and
(I) if so, taking the feature points extracted in step (G) as the finally extracted feature points; if not, changing the feature point judgment threshold in step (F) and executing step (F) again.
Further, the step (a) includes:
(J) judging whether the number of extracted feature points is within a feature point number threshold range;
(K) when the number of extracted feature points is not within the threshold range, homogenizing the video frame; and
(L) extracting the feature points of the homogenized video frame.
It can thus be seen that the objectives of the invention are effectively attained. The embodiments explaining the functional and structural principles of the present invention have been fully illustrated and described, and the invention is not limited by changes based on the principles of these embodiments. Accordingly, this invention includes all modifications encompassed within the scope and spirit of the following claims.

Claims (8)

1. A system for tracking a three-dimensional object, comprising:
a key frame forming unit, wherein the key frame forming unit forms a key frame data by analyzing data of a template frame and data of a video frame, wherein the key frame data comprises extrinsic parameters of the key frame;
a video frame extrinsic parameter analysis unit, wherein the video frame extrinsic parameter analysis unit is communicatively connected to the key frame forming unit, and is capable of acquiring the extrinsic parameters of the key frame and calculating the extrinsic parameters of the video frame according to them; and
a tracking judgment unit, wherein the tracking judgment unit is communicatively connected to the video frame extrinsic parameter analysis unit, and, after acquiring the data of the key frame and the data of the video frame, calculates the corresponding pose of the tracked object in the video frame;
a data acquisition unit, wherein the data acquisition unit further comprises a template frame acquisition module and a video frame acquisition module, wherein the template frame acquisition module and the video frame acquisition module are respectively communicatively connected to the key frame forming unit;
wherein the video frame acquisition module acquires at least one video frame, and wherein the key frame forming unit can acquire the data of the template frame and the data of the video frame from the template frame acquisition module and the video frame acquisition module respectively, and judge whether the current video frame data can be used as the key frame; if so, the video frame is used as the key frame, and the data of the template frame is used as the data of the key frame.
2. The system for tracking a three-dimensional object according to claim 1, wherein the data acquisition unit further comprises a feature point processing module, wherein the feature point processing module comprises a feature point judgment module and a feature point extraction module, wherein the feature point judgment module is communicatively connected to the template frame acquisition module and the video frame acquisition module, and wherein the feature point extraction module is communicatively connected to the feature point judgment module, the key frame forming unit, and the video frame extrinsic parameter analysis unit.
3. The system for tracking a three-dimensional object according to claim 2, wherein the feature point processing module further comprises a texture richness determination module, wherein the texture richness determination module comprises a feature point analysis module, wherein the feature point analysis module is communicatively connected to the feature point determination module.
4. The system for tracking a three-dimensional object according to claim 3, wherein the texture richness judging module further comprises a feature point change judging module and a homogenization processing module, wherein the feature point change judging module is communicatively connected to the feature point extracting module, wherein the homogenization processing module is communicatively connected to the feature point judging module.
5. The system for tracking a three-dimensional object according to any one of claims 2 to 4, wherein the feature point processing module further comprises a feature point matching judgment module, wherein the feature point matching judgment module is communicatively connected to the feature point extraction module.
6. A method of tracking a three-dimensional object, comprising the steps of:
(A) acquiring, according to the data of a tracked object, a video frame matched with a template frame from among the video frames as a key frame, and acquiring the data of the key frame, wherein the data of the key frame comprises the extrinsic parameters of the key frame;
(B) analyzing the acquired data of the key frame and the acquired data of a video frame, and calculating the extrinsic parameters corresponding to the current video frame; and
(C) calculating the pose between the tracked object in the video frame and the tracked object in the template frame according to the extrinsic parameters of the video frame;
(D) comparing whether the difference between the extrinsic parameters of the video frame and the extrinsic parameters of the key frame meets a specific threshold; and
(E) when the difference does not meet the threshold, replacing the template frame with the template frame used last time.
7. The tracking method of a three-dimensional object according to claim 6, wherein the step (A) includes:
(F) judging whether a point in the current image data is a characteristic point according to a characteristic point judgment threshold;
(G) extracting feature points according to the judgment result;
(H) comparing the feature points extracted in step (G) with the feature points in reference data to judge whether they meet the requirements the reference data places on feature points; and
(I) if so, taking the feature points extracted in step (G) as the finally extracted feature points; if not, changing the feature point judgment threshold in step (F) and executing step (F) again.
8. The tracking method of a three-dimensional object according to claim 7, wherein the step (a) includes:
(J) judging whether the change in the number of extracted feature points is within a feature point number change threshold range;
(K) when the change in the number of extracted feature points is not within the threshold range, homogenizing the video frame; and
(L) extracting the feature points of the homogenized video frame.
CN201711183555.4A 2017-11-23 2017-11-23 Tracking system and tracking method for three-dimensional object Active CN108010055B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711183555.4A CN108010055B (en) 2017-11-23 2017-11-23 Tracking system and tracking method for three-dimensional object

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711183555.4A CN108010055B (en) 2017-11-23 2017-11-23 Tracking system and tracking method for three-dimensional object

Publications (2)

Publication Number Publication Date
CN108010055A CN108010055A (en) 2018-05-08
CN108010055B true CN108010055B (en) 2022-07-12

Family

ID=62053343

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711183555.4A Active CN108010055B (en) 2017-11-23 2017-11-23 Tracking system and tracking method for three-dimensional object

Country Status (1)

Country Link
CN (1) CN108010055B (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2017769A2 (en) * 2007-07-19 2009-01-21 Honeywell International Inc. Multi-pose face tracking using multiple appearance models
CN106342332B (en) * 2008-07-04 2012-10-03 中国航空工业集团公司洛阳电光设备研究所 Target following keeping method when switch visual field under airborne moving condition
CN103198488A (en) * 2013-04-16 2013-07-10 北京天睿空间科技有限公司 PTZ surveillance camera realtime posture rapid estimation method
CN103839277A (en) * 2014-02-21 2014-06-04 北京理工大学 Mobile augmented reality registration method of outdoor wide-range natural scene
CN104145294A (en) * 2012-03-02 2014-11-12 高通股份有限公司 Scene structure-based self-pose estimation
CN105578034A (en) * 2015-12-10 2016-05-11 深圳市道通智能航空技术有限公司 Control method, control device and system for carrying out tracking shooting for object
CN107122770A (en) * 2017-06-13 2017-09-01 驭势(上海)汽车科技有限公司 Many mesh camera systems, intelligent driving system, automobile, method and storage medium
CN107122782A (en) * 2017-03-16 2017-09-01 成都通甲优博科技有限责任公司 A kind of half intensive solid matching method in a balanced way
CN107248169A (en) * 2016-03-29 2017-10-13 中兴通讯股份有限公司 Image position method and device

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2001155164A (en) * 1999-11-26 2001-06-08 Ntt Communications Kk Device for tracing mobile object
US7536030B2 (en) * 2005-11-30 2009-05-19 Microsoft Corporation Real-time Bayesian 3D pose tracking
CN101739686B (en) * 2009-02-11 2012-05-30 北京智安邦科技有限公司 Moving object tracking method and system thereof
CN102073864B (en) * 2010-12-01 2015-04-22 北京邮电大学 Football item detecting system with four-layer structure in sports video and realization method thereof
RU2013106357A (en) * 2013-02-13 2014-08-20 ЭлЭсАй Корпорейшн THREE-DIMENSIONAL TRACKING OF AREA OF INTEREST, BASED ON COMPARISON OF KEY FRAMES
US9373174B2 (en) * 2014-10-21 2016-06-21 The United States Of America As Represented By The Secretary Of The Air Force Cloud based video detection and tracking system
CN104778697B (en) * 2015-04-13 2017-07-28 清华大学 Based on Quick positioning map as yardstick and the three-dimensional tracking and system in region

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2017769A2 (en) * 2007-07-19 2009-01-21 Honeywell International Inc. Multi-pose face tracking using multiple appearance models
CN106342332B (en) * 2008-07-04 2012-10-03 中国航空工业集团公司洛阳电光设备研究所 Target following keeping method when switch visual field under airborne moving condition
CN104145294A (en) * 2012-03-02 2014-11-12 高通股份有限公司 Scene structure-based self-pose estimation
CN103198488A (en) * 2013-04-16 2013-07-10 北京天睿空间科技有限公司 PTZ surveillance camera realtime posture rapid estimation method
CN103839277A (en) * 2014-02-21 2014-06-04 北京理工大学 Mobile augmented reality registration method of outdoor wide-range natural scene
CN105578034A (en) * 2015-12-10 2016-05-11 深圳市道通智能航空技术有限公司 Control method, control device and system for carrying out tracking shooting for object
CN107248169A (en) * 2016-03-29 2017-10-13 中兴通讯股份有限公司 Image position method and device
CN107122782A (en) * 2017-03-16 2017-09-01 成都通甲优博科技有限责任公司 A kind of half intensive solid matching method in a balanced way
CN107122770A (en) * 2017-06-13 2017-09-01 驭势(上海)汽车科技有限公司 Many mesh camera systems, intelligent driving system, automobile, method and storage medium

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
3D Pose Tracking With Multitemplate Warping and SIFT Correspondences; Shu Chen et al.; IEEE Transactions on Circuits and Systems for Video Technology; Nov. 2016; Vol. 26, No. 11; pp. 2043-2055 *
Design and analysis of a calibration method for stereo-optical motion tracking in MRI using a virtual calibration phantom; Martin Hoßbach et al.; Medical Imaging 2013: Physics of Medical Imaging; 2013; Vol. 8668 *
Wide-baseline image matching algorithm based on singular value decomposition; Yue Sicong et al.; Computer Science; Mar. 2009; Vol. 36, No. 3; pp. 223-225, 265 *
A monocular visual simultaneous localization and mapping method based on partial inertial sensor information; Gu Zhaopeng, Dong Qiulei; Journal of Computer-Aided Design & Computer Graphics; Feb. 2012; Vol. 24, No. 2; pp. 155-160 *

Also Published As

Publication number Publication date
CN108010055A (en) 2018-05-08

Similar Documents

Publication Publication Date Title
CN105701820B (en) A kind of point cloud registration method based on matching area
Revaud et al. Epicflow: Edge-preserving interpolation of correspondences for optical flow
CN104573614B (en) Apparatus and method for tracking human face
JP6295645B2 (en) Object detection method and object detection apparatus
EP2707834B1 (en) Silhouette-based pose estimation
CN110378997B (en) ORB-SLAM 2-based dynamic scene mapping and positioning method
CN108090435B (en) Parking available area identification method, system and medium
US11037325B2 (en) Information processing apparatus and method of controlling the same
CN109472820B (en) Monocular RGB-D camera real-time face reconstruction method and device
CN110310320A (en) A kind of binocular vision matching cost optimizing polymerization method
GB2520338A (en) Automatic scene parsing
CN103106659A (en) Open area target detection and tracking method based on binocular vision sparse point matching
KR20130073812A (en) Device and method for object pose estimation
CN111998862B (en) BNN-based dense binocular SLAM method
CN111696133B (en) Real-time target tracking method and system
Zhu et al. Handling occlusions in video‐based augmented reality using depth information
WO2018129794A1 (en) Method and system for real-time three-dimensional scan modeling for large-scale scene
CN113744315B (en) Semi-direct vision odometer based on binocular vision
CN111046856A (en) Parallel pose tracking and map creating method based on dynamic and static feature extraction
JP2018113021A (en) Information processing apparatus and method for controlling the same, and program
CN111784775A (en) Identification-assisted visual inertia augmented reality registration method
CN105740751A (en) Object detection and identification method and system
KR100574227B1 (en) Apparatus and method for separating object motion from camera motion
Wang et al. Hand posture recognition from disparity cost map
CN108010055B (en) Tracking system and tracking method for three-dimensional object

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 202177 room 493-61, building 3, No. 2111, Beiyan highway, Chongming District, Shanghai

Applicant after: TAPUYIHAI (SHANGHAI) INTELLIGENT TECHNOLOGY Co.,Ltd.

Address before: 201802 room 412, building 5, No. 1082, Huyi Road, Jiading District, Shanghai

Applicant before: TAPUYIHAI (SHANGHAI) INTELLIGENT TECHNOLOGY Co.,Ltd.

GR01 Patent grant