CN109325444A

CN109325444A - A kind of texture-free three-dimension object Attitude Tracking method of monocular based on 3-D geometric model

Info

Publication number: CN109325444A
Application number: CN201811093757.4A
Authority: CN
Inventors: 王斌; 秦学英; 钟凡
Original assignee: Shandong University
Current assignee: Shandong University
Priority date: 2018-09-19
Filing date: 2018-09-19
Publication date: 2019-02-12
Anticipated expiration: 2038-09-19
Also published as: CN109325444B

Abstract

The present invention relates to a kind of texture-free three-dimension object Attitude Tracking methods of monocular based on 3-D geometric model.The method of the invention carries out Attitude estimation using the edge feature of object, does not depend on the characteristic point of body surface, and weak texture or texture-free three-dimension object is suitble to track；The method of the invention not to three-dimension object Installation Mark object or hardware positioning sensor under the premise of can the texture-free three-dimension object of Real-time solution attitude parameter (rotation and translation parameter of the object relative to camera), the dummy object for generating computer is aligned with the three-dimension object picture in image sequence, generates the actual situation Overlay of seamless fusion.

Description

A kind of texture-free three-dimension object Attitude Tracking method of monocular based on 3-D geometric model

Technical field

The present invention relates to a kind of texture-free three-dimension object Attitude Tracking methods of monocular based on 3-D geometric model, belong to three Tie up the technical field of tracking.

Background technique

As one of virtual information and the key technology of real world is registrated in augmented reality application, three-dimension object tracking is inhaled A large number of scientific researchers input research is drawn.There are many three-dimensional trackings to be suggested at present, but three-dimension object tracking is still Open full of challenges project.The physical attribute of above-mentioned challenge one side source object itself, such as object lack texture, object Body surface reflection etc.；On the other hand the environment locating for the object, for example complicated background, block and the illumination changed etc.. In general three-dimension object track algorithm is primarily upon two problems: how 1. establish the pass of three-dimensional object surface and the plane of delineation Join (3D-2D association)；2. how Optimization Solution three-dimension object posture.The solution relative maturity of Second Problem, so existing There is three-dimension object track algorithm to concentrate how research robustly establishes 3D-2D association.Used image is associated with according to 3D-2D is established The difference of feature, three-dimension object tracking can be divided into based on characteristic point, region and edge three classes.

In recent years, the excellent feature point description symbol algorithm of multiple performance emerges in large numbers, such as SIFT, SURF, ORB.SIFT has Scale and rotational invariance are widely used in Feature Points Matching and object identification application, but it calculates more time-consuming, because hereafter Continuous performance more excellent SURF and ORB is put forward one after another.The development for having benefited from above-mentioned algorithm, the three-dimension object based on characteristic point Tracking has in real-time and stability and is obviously improved.These methods would generally be in the texture of three-dimensional grid model Feature and feature descriptor are extracted on image, by will obtain in the characteristic point back projection to threedimensional model surface on texture image The corresponding threedimensional model point of characteristic point.When being tracked, detects characteristic point on the image first and extract corresponding feature description It is corresponding to establish 3D-2D point by being matched with the description of the three-dimensional feature of lane database point for symbol；Finally, using being matched to The posture of the corresponding iterative solution object of 3D-2D point.

Although the three-dimensional track algorithm based on characteristic point block, illumination and dimensional variation in the case where illustrate stabilization As a result, also being capable of real time execution under constraint environment.But due to the detection and matching of algorithm dependence characteristics point, when environment and object When body has similar grain or tracking target object to lack texture, Feature Points Matching can occur mistake or can't detect stabilization completely Characteristic point, it is difficult to obtain stable tracking result.

Local feature region and feature descriptor based on the method in region in addition to using body surface, also by body surface Complete or regional area information is tracked for three-dimension object.Three-dimension object tracking based on region is usually by object segmentation It is coupled under a frame with gestures of object estimation, seeks the posture of energy optimum segmentation object and background.Such method uses water Flat set function indicates the projected outline of three-dimension object, is taken turns by maximizing the prospect probability of each pixel and projection in projected outline The background probability of wide outer each pixel, carrys out the posture of iteration adjustment three-dimension object.When object is adjusted to optimum posture, projection wheel Exterior feature just separates object and background.

Based on the method in region to block, complex background and motion blur have a degree of anti-interference ability, still It is based on background before statistics level-set is distinguished, the discrimination of Background statistic information before tracking stability relies on.Current background face When color statistics is similar, the method based on region can not work well.

In method based on edge, image border reflects the geometry of object, because its calculating is simple and efficient, early stage Three-dimension object tracking can select image border or profile as feature.In the feelings of known 3-D geometric model and initial attitude Under condition, the three-dimension object tracking based on edge is sought to establish three-dimension object first by project objects to two dimensional image plane The corresponding relationship of the contour edge of object on projecting edge and image, by minimizing the distance between corresponding sides or point, iteration is excellent Change to solve and obtains the attitude parameter of object.

The quality that the stability dependency of method based on edge is extracted in image top edge.When image or object move Fuzzy, object is blocked or object background is mixed and disorderly, and the acquisition of object edge or profile will be interfered on image, three-dimensional The stability of track algorithm can reduce.

Summary of the invention

In view of the deficiencies of the prior art, the present invention provides a kind of texture-free three-dimension object of the monocular based on 3-D geometric model Attitude Tracking method.

Term explanation:

L-M optimization algorithm: full name Levenberg-Marquardt method is regression parameter minimum two in nonlinear regression Multiply a kind of estimation method of estimation.This method is integrated steepest descent method and linearization technique (Taylor series).Two kinds of sides Method combines can find optimal value quickly.

The technical solution of the present invention is as follows:

A kind of texture-free three-dimension object Attitude Tracking method of monocular based on 3-D geometric model, comprises the following steps that

1) system mode is initialized, setting and resetting digit counter is 0；Attitude Tracking method of the invention is based on computer and takes the photograph The hardware realization of camera, system corresponding to the system mode refer to the software for the carry out Attitude Tracking being arranged in computer System is provided in the software systems and resets digit counter；

2) motion image sequence or video of three-dimension object are acquired；The motion image sequence or video of three-dimension object are by taking the photograph Camera collects；After the motion image sequence or video acquisition of three-dimension object, for real-time transmission to computer is inputted, computer is soft Part system carries out the processing of subsequent step to present frame；

3) judge system current state；System can be as the case may be in three kinds of states: init state, tracking mode It is switched between reorientation state.Init state refers to the state after system initial start-up or the shape after reorientation failure State；Tracking mode refers to the state after system initialization success or the state after relocating successfully；System can be under tracking mode Pose refinement is carried out continuously on image sequence；Reorientation state refers to the state after system tracking failure, is under reorientation state System can be carried out continuously fast target detection；

3.1) if system mode is init state, the initial attitude of three-dimension object is read from Parameter File；Institute Stating record in Parameter File is the initial attitude manually set；

It is that three-dimension object calculates initial attitude using motion model if 3.2) system mode is tracking mode；Specifically Calculation method are as follows:

System calculates present frame posture according to the posture of previous frame and the speed of previous frame posture, using following motion model The initial attitude of optimization；

p'_k+1=ln (exp (v_k)exp(p_k))

Wherein, p_kIt is the posture of kth frame three-dimension object, p'_k+1It is the initial attitude of+1 frame pose refinement of kth, v_kIt is kth frame The speed of three-dimension object posture；

v_k=ln (exp (p_k)exp(p_k-1)^-1)

Wherein, p_k-1It is the posture of -1 frame three-dimension object of kth；

It is that three-dimension object calculates initial appearance using fast Template detection algorithm if 3.3) system is attached most importance to positioning states State；Using the image in key frame set as template, in the picture using the quick object detection algorithms positioning three-dimension object of LINE2D Position, set the corresponding posture of optimal Template in object detection for the initial attitude of object；Wherein, key frame refers to three-dimensional Subject image during object tracking on three-dimension object historical movement track, the objects in images attitude parameter is it is known that therefore The attitude parameter of key frame it is also known that；Optimal Template in LINE2D detection process refer in the image of all key frame set with The most matched key frame of present frame；

4) pose refinement based on Edge Distance field

Pose refinement process is converted into outline process, by by three-D profile point set Φ project to image border away from The D that leaves the theatre is realized；The 3 d pose of direct Optimization Solution object in Edge Distance field can not have to explicit building three peacekeepings two dimension Corresponding points.Iteratively minimize the global distance in the contour of object projection under prediction posture and present image between edge Optimum attitude parameter is solved, while being blocked to handle, Robust method for estimating is added in energy equation.

4.1) according to the initial attitude of three-dimension object, the geometrical model of renders three-dimensional object is obtained three-dimensional under initial attitude The depth image of object extracts the outline projection of three-dimension object, by outline projection back projection to three-dimension object on depth image Geometrical model surface obtains three-D profile point set Φ；

4.2) using Canny edge detection algorithm extract current frame image edge graph, on edge graph using quickly away from It leaves the theatre and converts the Edge Distance field D that image is calculated in algorithm；Three-D profile point set Φ is projected to described using initial attitude Edge Distance field D, value of the subpoint in the D of Edge Distance field indicate three-D profile point to the pixel distance of nearest image border； Gradient direction of the subpoint in Edge Distance field indicates three-D profile point to the negative direction of nearest image border；

4.3) for each three-D profile point X_i∈ Φ, matching error are expressed as

Wherein, K is video camera internal reference matrix,For several mapping functions, appearance is tieed up by the 6 of representation of Lie algebra State Parameter Switch is the euclidean transformation matrix of 4x4,It is X_iHomogeneous coordinates indicate,Function turns homogeneous coordinates Turn to inhomogeneous coordinate；Therefore, the matching error between three-D profile point set Φ and image border distance field D is all three-D profiles The sum of point matching error, i.e. objective energy function:

Wherein,It is the weight function of Robust method for estimating, Systematic selection Tukey Robust method for estimating, therefore

Wherein c is fixed threshold, indicates maximum value of the subpoint in the D of Edge Distance field；

4.4)+1 frame posture of kth is optimized using initial attitude obtained in step 3) as iteration initial point by L-M Algorithm iteration solves E (p), obtains optimum attitude parameter value p_k+1；

5) after iteration optimization, three-D profile point is projected to the Edge Distance field of image under optimum attitude, read each The value W of three-D profile point being projected in the distance field of image border calculates the average value of all W values, as each three-D profile The mean residual of point；If the mean residual of each three-D profile point is less than threshold residual value, determine that current pose optimizes successfully, Updating key frame set, updating motion model, set system mode is tracking mode, finally exports optimum attitude parameter value, then Go to step 8)；Otherwise determine current pose optimization failure, then go to step 6)；Mean residual has measured object three-dimensional contour outline The degree of registration of projection and objects in images profile；

6) judge system current state；If system mode is init state, system mode is kept, and export three-dimensional Object initialization posture, then goes to step 8)；If system mode is tracking mode, sets system mode and attaches most importance to positioning states, Next frame carries out Fault recovery, and exports three-dimension object initialization posture, then goes to step 8)；If system mode is to reset Position state, resets digit counter and adds 1, and go to step 7)；

7) judge to relocate whether number is less than frequency threshold value；If relocating number is less than frequency threshold value, system is set State is attached most importance to positioning states, and output three-dimension object initializes posture；If relocating number is more than or equal to frequency threshold value, weight is determined Positioning cannot complete Fault recovery, and setting and resetting digit counter is 0, and setting system mode is init state, exports three-dimension object Initialize posture；When relocating the continuous reorientation number of counter records less than specified threshold, system mode is set as reorientation State, next frame continue to attempt to Fault recovery.

8) judge whether present frame is last frame；If present frame is last frame, work end；If present frame It is not last frame, then using next frame as present frame, repeats step 2)~7), until present frame is last frame.

Preferred according to the present invention, in the step 5), the detailed process for updating key frame set is to judge present frame appearance State is to the minimum range of all postures of key frame set, if minimum range is greater than distance threshold, key frame is added in present frame Set；Crucial frame count is removed and is added into earliest in key frame set if key frame sum is greater than specified threshold simultaneously Key frame.

Preferred according to the present invention, in the step 5), the detailed process for updating motion model is, according in step 3.2) Formula according to the Attitude Calculation present frame three-dimension object attitudes vibration speed of present frame gestures of object and previous frame object.

The invention has the benefit that

1. the texture-free three-dimension object Attitude Tracking method of the monocular of the present invention based on 3-D geometric model, not to three-dimensional Under the premise of object Installation Mark object or hardware positioning sensor can the texture-free three-dimension object of Real-time solution attitude parameter (object Rotation and translation parameter of the body relative to camera), the three-dimension object in dummy object and image sequence for generating computer is drawn In face of neat, the actual situation Overlay of seamless fusion is generated；

2. the texture-free three-dimension object Attitude Tracking method of the monocular of the present invention based on 3-D geometric model, uses object Edge feature carry out Attitude estimation, do not depend on the characteristic point of body surface, weak texture or texture-free three-dimension object be suitble to track；

3. the texture-free three-dimension object Attitude Tracking method of the monocular of the present invention based on 3-D geometric model, merges and be The judgement of system state, when object of which movement goes out, the visual field is blocked by large area or when motion blur etc. leads to tracking failure, system starts base In the reorientation algorithm of fast Template detection, system mistake recovery is carried out, system is made to restore tracking mode.

Detailed description of the invention

Fig. 1 is the texture-free three-dimension object Attitude Tracking method flow of the monocular of the present invention based on 3-D geometric model Figure；

Fig. 2 is target three-dimension object to be tracked；

Fig. 3 a- Fig. 3 c is tracking result when article size changes；

Fig. 4 a- Fig. 4 c is tracking result when object background changes；

Tracking result when Fig. 5 a- Fig. 5 c holds object rotation；

Fig. 6 a- Fig. 6 c holds tracking result when object enters and leaves the visual field；

Tracking result when Fig. 7 a- Fig. 7 c camera rotates；

Fig. 8 a- Fig. 8 c camera acutely shakes tracking result when leading to motion blur；

Tracking result when Fig. 9 a- Fig. 9 c object is blocked；

Figure 10 a- Figure 10 c camera acutely shakes tracking result when object being caused to enter and leave the visual field；

Specific embodiment

Below with reference to embodiment and Figure of description, the present invention will be further described, but not limited to this.

Embodiment 1

p'_k+1=ln (exp (v_k)exp(p_k))

v_k=ln (exp (p_k)exp(p_k-1)^-1)

Wherein, p_k-1It is the posture of -1 frame three-dimension object of kth；

4) pose refinement based on Edge Distance field

4.1) according to the initial attitude of three-dimension object, the geometrical model of renders three-dimensional object is obtained three-dimensional under initial attitude The depth image of object extracts the outline projection of three-dimension object, by outline projection back projection to three-dimension object on depth image Geometrical model surface obtains three-D profile point set Φ；Wherein, render process is the technology of comparative maturity in computer graphics, this System uses the Rendering software library OpenGL being widely used.

4.3) for each three-D profile point X_i∈ Φ, matching error are expressed as

Wherein, the detailed process for updating key frame set is to judge present frame posture to all postures of key frame set Minimum range, if minimum range is greater than distance threshold, key frame set is added in present frame；Simultaneously to crucial frame count, such as Fruit key frame sum is greater than specified threshold, then removes the key frame being added into earliest in key frame set.

The detailed process for updating motion model is, according to the formula in step 3.2) according to present frame gestures of object and upper one The Attitude Calculation present frame three-dimension object attitudes vibration speed of frame object.

The present embodiment tracking effect figure, Fig. 2 are illustrated target three-dimension object to be tracked (black toy cat), Fig. 3 a- figure 10c illustrates the virtual reality fusion result (canescence is threedimensional model rendering figure) of the tracking test of object in several cases, The size variation of object during tracking, for background by simply to noisy, hand-held object is rotated, object of which movement goes out the visual field again again Secondary entrance, camera rotation, camera are acutely shaken, and motion blur, object is blocked completely, and three-dimensional geometry is based on described in the present embodiment The texture-free three-dimension object Attitude Tracking method of the monocular of model can tenacious tracking, and the visual field at the beginning of object of which movement again enter view It is wild and can quickly be relocated after being blocked completely, it is restored to tracking mode.

Claims

1. a kind of texture-free three-dimension object Attitude Tracking method of monocular based on 3-D geometric model, which is characterized in that including step It is rapid as follows:

1) system mode is initialized, setting and resetting digit counter is 0；

2) motion image sequence or video of three-dimension object are acquired；

3) judge system current state；

3.1) if system mode is init state, the initial attitude of three-dimension object is read from Parameter File；

It is that three-dimension object calculates initial attitude using motion model if 3.2) system mode is tracking mode；It is specific to calculate Method are as follows:

System calculates present frame pose refinement according to the posture of previous frame and the speed of previous frame posture, using following motion model Initial attitude；

p'_k+1=ln (exp (v_k)exp(p_k))

Wherein, p_kIt is the posture of kth frame three-dimension object, p'_k+1It is the initial attitude of+1 frame pose refinement of kth, v_kIt is kth frame three-dimensional The speed of gestures of object；

v_k=ln (exp (p_k)exp(p_k-1)^-1)

Wherein, p_k-1It is the posture of -1 frame three-dimension object of kth；

It is that three-dimension object calculates initial attitude using fast Template detection algorithm if 3.3) system is attached most importance to positioning states；With Image in key frame set is template, using the position of the quick object detection algorithms positioning three-dimension object of LINE2D in the picture It sets, sets the corresponding posture of optimal Template in object detection for the initial attitude of object；

4) pose refinement based on Edge Distance field

4.1) according to the initial attitude of three-dimension object, the geometrical model of renders three-dimensional object obtains three-dimension object under initial attitude Depth image, on depth image extract three-dimension object outline projection, by outline projection back projection to three-dimension object geometry Model surface obtains three-D profile point set Φ；

4.2) edge graph that current frame image is extracted using Canny edge detection algorithm, uses quick distance field on edge graph The Edge Distance field D of image is calculated in transformation algorithm；

4.3) for each three-D profile point X_i∈ Φ, matching error are expressed as

Wherein, K is video camera internal reference matrix,For several mapping functions, attitude parameter is tieed up by the 6 of representation of Lie algebra The euclidean transformation matrix of 4x4 is converted to,It is X_iHomogeneous coordinates indicate,Function converts homogeneous coordinates to non- Homogeneous coordinates；Therefore, the matching error between three-D profile point set Φ and image border distance field D is the matching of all three-D profile points The sum of error, i.e. objective energy function:

4.4) L-M optimization algorithm is passed through using initial attitude obtained in step 3) as iteration initial point for+1 frame posture of kth It iteratively solves E (p), obtains optimum attitude parameter value p_k+1；

5) after iteration optimization, three-D profile point is projected to the Edge Distance field of image under optimum attitude, reads each three-dimensional The value W of profile point being projected in the distance field of image border calculates the average value of all W values, as each three-D profile point Mean residual；If the mean residual of each three-D profile point is less than threshold residual value, determines that current pose optimizes successfully, update Key frame set updates motion model, sets system mode as tracking mode, finally exports optimum attitude parameter value, then goes to Step 8)；Otherwise determine current pose optimization failure, then go to step 6)；

6) judge system current state；If system mode is init state, system mode is kept, and export three-dimension object Posture is initialized, step 8) is then gone to；If system mode is tracking mode, sets system mode and attach most importance to positioning states, it is next Frame carries out Fault recovery, and exports three-dimension object initialization posture, then goes to step 8)；If system mode is reorientation shape State resets digit counter and adds 1, and goes to step 7)；

7) judge to relocate whether number is less than frequency threshold value；If relocating number is less than frequency threshold value, system mode is set Attach most importance to positioning states, output three-dimension object initializes posture；If relocating number is more than or equal to frequency threshold value, reorientation is determined Fault recovery cannot be completed, and setting and resetting digit counter is 0, and setting system mode is init state, and output three-dimension object is initial Change posture；

8) judge whether present frame is last frame；If present frame is last frame, work end；If present frame is not Last frame repeats step 2)~7 then using next frame as present frame), until present frame is last frame.

2. the texture-free three-dimension object Attitude Tracking method of the monocular according to claim 1 based on 3-D geometric model, It is characterized in that, in the step 5), the detailed process for updating key frame set is to judge present frame posture to key frame set institute There is the minimum range of posture, if minimum range is greater than distance threshold, key frame set is added in present frame；Simultaneously to key frame It counts, if key frame sum is greater than specified threshold, removes the key frame being added into earliest in key frame set.

3. the texture-free three-dimension object Attitude Tracking method of the monocular according to claim 1 based on 3-D geometric model, It is characterized in that, in the step 5), the detailed process for updating motion model is, according to the formula in step 3.2) according to present frame The Attitude Calculation present frame three-dimension object attitudes vibration speed of gestures of object and previous frame object.