CN105096341A - Mobile robot pose estimation method based on trifocal tensor and key frame strategy - Google Patents


Info

Publication number
CN105096341A
Authority
CN
China
Prior art keywords
pose
view
trifocal tensor
mobile robot
key frame
Prior art date
Legal status: Granted
Application number
CN201510445644.6A
Other languages
Chinese (zh)
Other versions
CN105096341B (en)
Inventor
陈剑 (Chen Jian)
贾丙西 (Jia Bingxi)
张凯祥 (Zhang Kaixiang)
Current Assignee: Zhejiang University (ZJU)
Original Assignee
Zhejiang University ZJU
Priority date: 2015-07-27
Filing date: 2015-07-27
Publication date: 2015-11-25
Application filed by Zhejiang University (ZJU)
Priority to CN201510445644.6A
Publication of CN105096341A
Application granted
Publication of CN105096341B
Legal status: Active

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06T — IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 — Image analysis
    • G06T 7/20 — Analysis of motion
    • G06T 2207/00 — Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 — Image acquisition modality
    • G06T 2207/10016 — Video; Image sequence

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Manipulator (AREA)

Abstract


Aiming at the visual servo tracking control problem, the invention proposes a mobile robot pose estimation method based on the trifocal tensor and a key frame strategy. For generalized scenes, the trifocal tensor is used to describe the geometric relationship among the initial view, the current view, and the final view, from which the robot's pose relative to the final view, up to an unknown scale factor, is obtained. For large-range visual servoing tasks, a key-frame-based method estimates the robot's pose without requiring matched feature points across the current, initial, and final views, and yields continuous pose measurements in global coordinates, greatly expanding the workspace of robot visual servoing. The obtained pose information can be widely used in the design of visual servo controllers.

Description

Pose Estimation Method for a Mobile Robot Based on the Trifocal Tensor and a Key Frame Strategy

Technical Field

The invention belongs to the intersection of robotics and computer vision, relates to the problem of vision-based pose estimation for mobile robots, and in particular to a mobile robot pose estimation method based on the trifocal tensor and a key frame strategy.

Background Art

With the rapid development of robotics, robots play an important role in practice, and the tasks they undertake are increasingly complex and diverse. In particular, mobile robots have traditionally been controlled using position-measuring devices such as GPS and odometry; the control accuracy is then limited by the accuracy of the sensing devices, and task flexibility is poor, since a reference trajectory must be specified for the robot in physical space, which increases implementation difficulty and cost. Visual servo control uses visual information as feedback, measures the environment in a non-contact manner, and exploits a larger amount of information, improving the flexibility and accuracy of the robot system; it plays an irreplaceable role in robot control.

Classical visual servo control methods can be divided into position-based, image-based, and geometric-constraint-based methods, which differ mainly in how the error system is constructed. According to a previous review (Jia Bingxi, Liu Shan, Zhang Kaixiang, Chen Jian. Research progress of robot visual servoing: vision systems and control strategies. Acta Automatica Sinica, 2015, 41(5): 861-873), methods based on images and geometric constraints are more robust to image noise and camera calibration errors, require less prior knowledge, and have a wider range of applications. However, most visual servoing systems require the target feature points to remain at least partially within the field of view during control, so that effective visual feedback can be constructed. The camera's field of view is limited, and existing image feature extraction methods have poor repeatability and accuracy under large rotations and translations, which degrades the accuracy of the visual servoing system and limits its workspace. In addition, most mobile robots are subject to nonholonomic constraints: they have three spatial degrees of freedom but only two control degrees of freedom. In earlier work (Salaris P, Fontanelli D, Pallottino L, et al. Shortest paths for a robot with nonholonomic and field-of-view constraints. IEEE Transactions on Robotics, 2010, 26(2): 269-281), a trajectory planning strategy guaranteeing target visibility was proposed for nonholonomic mobile robots, but it requires prior knowledge, and the planned paths are often tortuous, which reduces the robot's working efficiency in practice.

The multi-view geometric constraints commonly used in visual servoing mainly include homography, epipolar geometry, and the trifocal tensor, and visual servoing based on multi-view geometry generally involves estimating the robot's pose. Homography-based visual servoing usually combines three-dimensional spatial information with image information to construct the error system, but decomposing the homography matrix yields multiple solutions (four sets), often requires prior knowledge of the target plane, and for ease of solution the feature points are often required to be coplanar. Methods based on epipolar geometry use the epipoles between two views to estimate the robot's relative pose, but a singularity occurs when the two views are close, and when both epipoles are regulated to zero the pose is ambiguous: only that the two headings are identical and collinear can be guaranteed, so a switch to another control strategy is needed to reach the target pose. Methods based on the trifocal tensor use the correspondences among three views to estimate the robot's pose; they are independent of scene structure and depend only on the relative poses between views, so they are more generally applicable. However, compared with two-view methods such as homography and epipolar geometry, the trifocal tensor requires matched feature points across three views, imposing stronger field-of-view constraints. Moreover, feature extraction and matching have poor repeatability and accuracy under rotation and translation, which affects the accuracy of the visual servoing system and limits the workspace of robot visual servoing.

Summary of the Invention

To overcome the deficiencies of the prior art, the invention proposes, for the visual servo tracking control problem, a mobile robot pose estimation method based on the trifocal tensor and a key frame strategy, as well as a visual servo tracking system and a mobile robot using this method.

A mobile robot pose estimation method based on the trifocal tensor and a key frame strategy, used for visual trajectory tracking tasks of a mobile robot: the environment images captured by the robot during a teaching run serve as the expected trajectory; key frames are selected from the expected trajectory, and a geometric model of three views is then constructed based on the trifocal tensor to extract pose information. The three views are the current view C and any two other views C0, C*.

In the geometric model construction, for the current view C, the trifocal tensor with the other two views C0, C* is computed to obtain the pose of the current view relative to C*, comprising angle information and position information containing an unknown scale factor.

An image sequence containing the initial view and the final view is selected from the expected trajectory as key frames, such that adjacent key frames have a matching degree at a preset threshold. During pose estimation, the trifocal tensor between the current view and the two best-matching key frames is first computed to obtain the pose relative to the best-matching key frame; this pose is then iteratively transformed to the final view, yielding continuous pose measurements in global coordinates.

A visual servo tracking system employing the method according to any one of the above.

A mobile robot employing the method according to any one of the above.

Beneficial effects of the invention:

For large-range visual servoing tasks, no matched feature points are required between the current view and the initial and final views, which greatly expands the workspace of robot visual servoing. Because adjacent key frames have a matching rate close to the preset threshold, the trifocal tensor computation in the key-frame-based pose estimation algorithm is more accurate, improving the accuracy of the system.

Brief Description of the Drawings

Fig. 1 is a schematic diagram of the visual servo tracking task;

Fig. 2 is a schematic diagram of the trifocal tensor;

Fig. 3 is a schematic diagram of key-frame-based pose estimation;

Fig. 4 is a schematic diagram of the pose transformation between key frames.

Detailed Description of the Embodiments

The invention is described in detail below with reference to specific embodiments and the accompanying drawings.

For generalized scenes, the invention uses the trifocal tensor to describe the geometric relationship among the initial view, the current view, and the final view, and on this basis obtains the robot's pose up to an unknown scale factor. For large-range visual servo tracking, fully exploiting the fact that the trifocal tensor is independent of the environment structure, a key-frame-based pose estimation method is proposed: the pose of the current view relative to the two most similar key frames is computed first and then iteratively transformed to the final view, yielding continuous pose measurements in global coordinates. The trifocal-tensor-based pose estimation method is mainly used for vision-based trajectory tracking. Such problems are generally divided into a teaching stage and a tracking stage. As shown in Fig. 1, in the teaching stage the robot captures images of the environment along its motion as the expected trajectory; in the tracking stage, the robot tracks the expected images based on visual feedback so as to follow the expected trajectory. The tracking process consists of two parts: pose estimation and control. The invention provides a pose estimation method containing an unknown scale factor; a corresponding adaptive controller can then readily be designed with existing methods to complete the trajectory tracking task, e.g. (Chen J, Dixon W E, Dawson D M, et al. Homography-based visual servo tracking control of a wheeled mobile robot. IEEE Transactions on Robotics, 2006, 22(2): 406-415). The pose estimation of the mobile robot is divided into two parts: geometric model construction and key-frame-based pose estimation.

1. Geometric Model

As shown in Fig. 2, the trifocal tensor {T_i}, i = 1, 2, 3, describes the relationship among three views and is independent of the environment structure. Corresponding feature points x_0, x, x^* in the three images I_0, I, I^* satisfy

$$x_0^i \, x^j \, x^{*k} \, \varepsilon_{jqs} \, \varepsilon_{krt} \, T_{iqr} = 0_{st},$$

where x_0^i, x^j, x^{*k} denote the i-th, j-th, and k-th elements of x_0, x, x^*, respectively, T_{iqr} denotes the element in row q and column r of T_i, and ε_{jqs}, ε_{krt} are permutation symbols defined as

$$\varepsilon_{rst} = \begin{cases} +1, & (r,s,t) \text{ an even permutation of } (1,2,3),\\ -1, & (r,s,t) \text{ an odd permutation of } (1,2,3),\\ 0, & \text{otherwise.} \end{cases}$$
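For concreteness, the incidence relation can be checked numerically. The following is a minimal NumPy sketch (the function names are illustrative, not from the patent) that evaluates the 3×3 residual for one point triplet:

```python
import numpy as np

def levi_civita():
    """3x3x3 permutation symbol eps[r, s, t]."""
    eps = np.zeros((3, 3, 3))
    for r, s, t in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
        eps[r, s, t] = 1.0    # even permutations
        eps[r, t, s] = -1.0   # odd permutations
    return eps

def incidence_residual(T, x0, x, xs):
    """3x3 residual of the point-point-point relation; ~0 for a consistent
    triplet (x0, x, xs) given in homogeneous coordinates. T is (3,3,3),
    indexed T[i, q, r]."""
    eps = levi_civita()
    # residual_st = sum_{i,j,k,q,r} x0_i x_j xs_k eps_jqs eps_krt T_iqr
    return np.einsum('i,j,k,jqs,krt,iqr->st', x0, x, xs, eps, eps, T)
```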

In particular, for a planar mobile robot, consider the relationship among the initial view C_0, the current view C, and the final view C^*. Establishing the global coordinate frame at C^*, the pose of view C_0 can be written as (x_0, z_0, θ_0) and that of the current view as (x, z, θ). Let R_0, t_0 denote the rotation and translation from C_0 to C^*, and R, t the rotation and translation from C to C^*; the pose transformation among the three views C, C_0, C^* is determined by composing these two relative poses.

Writing R = [r_1 r_2 r_3], R_0 = [r_{01} r_{02} r_{03}] and t = [t_x 0 t_z]^T, t_0 = [t_{0x} 0 t_{0z}]^T, and substituting the pose transformation, the trifocal tensor can be expressed in terms of the relative poses of the three views. The nonzero elements are as follows, where T_{ijk} denotes the element in row j and column k of T_i:

$$\begin{aligned}
T_{111} &= t_{0x}\cos\theta - t_x\cos\theta_0, & T_{113} &= t_{0z}\cos\theta + t_x\sin\theta_0,\\
T_{131} &= -t_{0x}\sin\theta - t_z\cos\theta_0, & T_{133} &= -t_{0z}\sin\theta + t_z\sin\theta_0,\\
T_{212} &= -t_x, \quad T_{221} = -t_{0x}, & T_{223} &= t_{0z}, \quad T_{232} = -t_z,\\
T_{311} &= t_{0x}\sin\theta - t_x\sin\theta_0, & T_{313} &= t_{0z}\sin\theta - t_x\cos\theta_0,\\
T_{331} &= t_{0x}\cos\theta - t_z\sin\theta_0, & T_{333} &= t_{0z}\cos\theta - t_z\cos\theta_0.
\end{aligned}$$
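The element table translates directly into code. The sketch below transcribes it (names and array layout are our own; the text's 1-based T_ijk maps to the 0-based T[i-1, j-1, k-1]), which is useful for testing the recovery steps that follow:

```python
import numpy as np

def planar_trifocal_tensor(t0x, t0z, theta0, tx, tz, theta):
    """Nonzero elements of the planar trifocal tensor, transcribed from
    the table above; T[i-1, j-1, k-1] holds the text's T_ijk."""
    c, s = np.cos(theta), np.sin(theta)
    c0, s0 = np.cos(theta0), np.sin(theta0)
    T = np.zeros((3, 3, 3))
    T[0, 0, 0] = t0x * c - tx * c0;   T[0, 0, 2] = t0z * c + tx * s0
    T[0, 2, 0] = -t0x * s - tz * c0;  T[0, 2, 2] = -t0z * s + tz * s0
    T[1, 0, 1] = -tx;  T[1, 1, 0] = -t0x
    T[1, 1, 2] = t0z;  T[1, 2, 1] = -tz
    T[2, 0, 0] = t0x * s - tx * s0;   T[2, 0, 2] = t0z * s - tx * c0
    T[2, 2, 0] = t0x * c - tz * s0;   T[2, 2, 2] = t0z * c - tz * c0
    return T
```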

Based on the images captured at the three views, the trifocal tensor can be determined up to an unknown scale factor. Define d = ||t_0|| as the distance from C_0 to C^*. In general, the coordinate origins of C_0 and C^* do not coincide, i.e. d ≠ 0, and the trifocal tensor is normalized as

$$\hat{T}_i = \frac{\alpha\, T_i}{\sqrt{T_{221}^2 + T_{223}^2}},$$

where, by the element expressions above, the denominator equals ||t_0|| = d up to the common scale, and α is a sign variable that can be determined from sign(T_{221}) sign(t_{0x}) or sign(T_{223}) sign(t_{0z}). The signs of t_{0x}, t_{0z} do not change during the control process and can be obtained from prior knowledge in the teaching stage.
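A sketch of this normalization follows, under two stated assumptions: the scale is fixed by dividing by √(T221² + T223²), and the sign is fixed via sign(T223)·sign(t0z), one of the two consistent choices named above:

```python
import numpy as np

def normalize_trifocal(T_est, sign_t0z):
    """Sign/scale normalization of an up-to-scale tensor estimate.

    Scale: divide by sqrt(T221^2 + T223^2), which by the element table
    equals ||t0|| = d (up to the estimate's arbitrary common scale).
    Sign:  alpha = sign(T223) * sign(t0z), with sign(t0z) known a priori
    from the teaching run.
    """
    d_hat = np.hypot(T_est[1, 1, 0], T_est[1, 1, 2])  # |T221|, |T223|
    alpha = np.sign(T_est[1, 1, 2]) * sign_t0z
    return alpha * T_est / d_hat
```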

From the above derivation, define t_{xm} = t_x/d and t_{zm} = t_z/d as the position information with unknown scale factor; by the element expressions these can be read directly off the normalized tensor, t_{xm} = -\hat{T}_{212} and t_{zm} = -\hat{T}_{232} (and likewise t_{0x}/d = -\hat{T}_{221}, t_{0z}/d = \hat{T}_{223}).

The rotation angle θ of the robot can then be determined from the remaining tensor elements: once the scaled translations are known, the T_1 and T_3 slices are linear in cos θ, sin θ, cos θ_0, sin θ_0.

Define x_m = x/d, z_m = z/d as the coordinate information with unknown scale; these are obtained from (t_{xm}, t_{zm}, θ) by expressing the scaled translation in the global frame at C^*.

In the control system, (t_{xm}, t_{zm}, θ) or (x_m, z_m, θ) can be used to express the robot's current pose. Moreover, given an image sequence, the expected trajectory can be computed in the same way.
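One way to realize this recovery, sketched below, reads the scaled translations off the T2 slice and solves the eight linear T1/T3 equations for (cos θ, sin θ, cos θ_0, sin θ_0) by least squares; this is a reconstruction consistent with the element table, not necessarily the patent's exact formulas:

```python
import numpy as np

def pose_from_normalized_tensor(Tn):
    """Recover (t_xm, t_zm, theta) from a normalized planar trifocal
    tensor Tn (0-indexed version of the T_ijk in the text)."""
    # T221 = -t0x, T223 = t0z, T212 = -tx, T232 = -tz (all scaled by 1/d)
    t0x, t0z = -Tn[1, 1, 0], Tn[1, 1, 2]
    tx, tz = -Tn[1, 0, 1], -Tn[1, 2, 1]
    # unknowns u = [cos(theta), sin(theta), cos(theta0), sin(theta0)]
    A = np.array([
        [t0x, 0.0, -tx, 0.0],   # T111 = t0x*c  - tx*c0
        [t0z, 0.0, 0.0, tx],    # T113 = t0z*c  + tx*s0
        [0.0, -t0x, -tz, 0.0],  # T131 = -t0x*s - tz*c0
        [0.0, -t0z, 0.0, tz],   # T133 = -t0z*s + tz*s0
        [0.0, t0x, 0.0, -tx],   # T311 = t0x*s  - tx*s0
        [0.0, t0z, -tx, 0.0],   # T313 = t0z*s  - tx*c0
        [t0x, 0.0, 0.0, -tz],   # T331 = t0x*c  - tz*s0
        [t0z, 0.0, -tz, 0.0],   # T333 = t0z*c  - tz*c0
    ])
    b = np.array([Tn[0, 0, 0], Tn[0, 0, 2], Tn[0, 2, 0], Tn[0, 2, 2],
                  Tn[2, 0, 0], Tn[2, 0, 2], Tn[2, 2, 0], Tn[2, 2, 2]])
    u, *_ = np.linalg.lstsq(A, b, rcond=None)
    theta = np.arctan2(u[1], u[0])
    return tx, tz, theta  # (t_xm, t_zm, theta), positions scaled by 1/d
```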

2. Key-Frame-Based Pose Estimation

As shown in Fig. 3, the main idea of key-frame-based pose estimation is to select a series of key frames along the expected trajectory so that adjacent key frames share a sufficient number of matched points. The pose of the current frame is first computed relative to the two most similar key frames and then iteratively transformed to obtain the pose in the final view. The key frame strategy consists of two parts: key frame selection and pose estimation.

2.1 Key Frame Selection

Key frame selection proceeds as follows: given an image sequence {C_d} along the expected trajectory, select from it a series of key frames {C_k} such that the first key frame is the initial view, the last key frame is the final view, and the matching rate between adjacent key frames is close to the threshold τ. Here the matching rate between a frame C_t and a key frame C_p is defined as the ratio of the number of matched feature points between the two views to the number of feature points extracted in C_p. The selection procedure is as follows (a runnable sketch is given after the list):

(1) Add C_0 to {C_k}; set i = 1;

(2) Consider the next frame C_t in {C_d};

(3) If the matching rate between C_t and C_k(i) is less than τ, add C_t to {C_k} and set i = i + 1;

(4) If C_t is not C^*, return to (2);

(5) If C^* is not in {C_k}, add C^* to {C_k}.
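The sketch below implements this loop. It assumes OpenCV ORB features with a ratio test as the matching front end (the patent does not prescribe a specific detector or matcher), and τ = 0.5 is an illustrative value:

```python
import cv2

def match_rate(img_t, img_p, orb, matcher):
    """Matching rate of frame C_t against keyframe C_p: matched points
    divided by the number of features extracted in C_p."""
    kp_t, des_t = orb.detectAndCompute(img_t, None)
    kp_p, des_p = orb.detectAndCompute(img_p, None)
    if des_t is None or des_p is None or len(kp_p) == 0:
        return 0.0
    pairs = matcher.knnMatch(des_t, des_p, k=2)
    good = [p for p in pairs
            if len(p) == 2 and p[0].distance < 0.8 * p[1].distance]
    return len(good) / len(kp_p)

def select_keyframes(frames, tau=0.5):
    """Steps (1)-(5): greedily add a frame whenever its matching rate
    against the latest keyframe drops below tau."""
    orb = cv2.ORB_create()
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING)
    keyframes = [frames[0]]                        # (1) C0 is a keyframe
    for frame in frames[1:]:                       # (2) scan {Cd}
        if match_rate(frame, keyframes[-1], orb, matcher) < tau:
            keyframes.append(frame)                # (3) rate fell below tau
    if keyframes[-1] is not frames[-1]:
        keyframes.append(frames[-1])               # (5) ensure C* is included
    return keyframes
```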

2.2 Key-Frame-Based Pose Estimation

Given the current frame C, the pose is estimated by first computing the pose relative to the two most similar key frames and then iteratively transforming it to the final view C^*. The procedure is as follows (a skeleton sketch follows the list):

(1) Based on the matching rate, select the key frames C_k(i-1), C_k(i) most similar to the current frame;

(2) Compute the trifocal tensor T among C_k(i-1), C, C_k(i);

(3) Compute the pose information X_m = (x_{m,i}, z_{m,i}, θ_i) relative to C_k(i);

(4) For the key frames after C_k(i) in {C_k}, iteratively transform the pose information until C^*.
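A skeleton of steps (1)-(4) might look as follows; `match_rate`, `compute_tensor`, `pose_from_tensor`, and `transfer_pose` are hypothetical helpers standing in for the computations of Sections 1 and 2.2, and picking the adjacent key frame pair with the highest combined matching rate in step (1) is our assumption:

```python
def estimate_pose(current_img, keyframes, match_rate,
                  compute_tensor, pose_from_tensor, transfer_pose):
    """Skeleton of the keyframe-based estimation loop, steps (1)-(4)."""
    # (1) pick the adjacent keyframe pair most similar to the current view
    rates = [match_rate(current_img, kf) for kf in keyframes]
    i = max(range(1, len(keyframes)), key=lambda k: rates[k - 1] + rates[k])
    # (2) trifocal tensor among (Ck(i-1), C, Ck(i))
    T = compute_tensor(keyframes[i - 1], current_img, keyframes[i])
    # (3) pose (x_m, z_m, theta) of the current view relative to Ck(i)
    pose = pose_from_tensor(T)
    # (4) iterate the pose through the remaining keyframes up to C*
    for j in range(i, len(keyframes) - 1):
        pose = transfer_pose(pose, keyframes, j)  # re-express w.r.t. Ck(j+1)
    return pose  # pose relative to the final view C*, up to scale
```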

As shown in Fig. 4, the pose transformation between two adjacent key frames proceeds as follows:

(1) Compute the trifocal tensor T of C_k(j-1), C_k(j+1), C_k(j);

(2) From T, compute the relative pose between the adjacent key frames C_k(j) and C_k(j+1);

(3) Transform the orientation estimate into the frame of C_k(j+1);

(4) Transform the position estimate, rescaled accordingly, into the frame of C_k(j+1).

In the above pose transformation, the trifocal tensor information between key frames can be computed offline in advance to improve time efficiency. Moreover, since adjacent key frames share many matched points, the trifocal tensor computation is relatively accurate, and as the robot approaches the target the accumulated error of the pose transformations becomes smaller and smaller. In addition, the unknown scale factor in the global pose measured with the key frame strategy is the distance between the last two key frames, which remains constant during the robot's motion.
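The iterative step chains planar pose transformations between adjacent key frames. As a generic illustration of the composition underlying such a transfer (the sign conventions and the handling of the per-step scale factor are assumptions, not the patent's exact formulas):

```python
import numpy as np

def compose_se2(pose_a_in_b, pose_b_in_c):
    """Compose planar poses in the x-z plane: given the pose (x, z, theta)
    of frame A expressed in frame B, and of B expressed in C, return the
    pose of A expressed in C."""
    x_ab, z_ab, th_ab = pose_a_in_b
    x_bc, z_bc, th_bc = pose_b_in_c
    c, s = np.cos(th_bc), np.sin(th_bc)
    x_ac = c * x_ab + s * z_ab + x_bc
    z_ac = -s * x_ab + c * z_ab + z_bc
    return (x_ac, z_ac, th_ab + th_bc)
```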

Claims (5)

1. A mobile robot pose estimation method based on the trifocal tensor and a key frame strategy, characterized by being used for a visual trajectory tracking task of a mobile robot, using environment images captured by the mobile robot during a teaching process as an expected trajectory, selecting key frames from the expected trajectory, and then constructing a geometric model of three views based on the trifocal tensor so as to extract pose information; the three views being a current view C and any two other views C0, C*.
2. The method of claim 1, wherein the geometric model is constructed by computing the trifocal tensor of the current view C with the other two views C0, C*, thereby obtaining the pose information of the current view relative to C*, including angle information and position information containing an unknown scale factor.
3. The method according to claim 2, wherein an image sequence including the initial view and the final view selected from the expected trajectory is used as key frames, adjacent key frames having a matching degree at a preset threshold; in the pose estimation process, the trifocal tensor between the current view and the two best-matching key frames is computed to obtain pose information relative to the best-matching key frame, and the pose information is then iteratively transformed to the final view to obtain continuous pose measurements in global coordinates.
4. A visual servo tracking system employing the method according to any one of claims 1-3.
5. A mobile robot employing the method according to any one of claims 1-3.

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN201510445644.6A | 2015-07-27 | 2015-07-27 | Mobile robot position and orientation estimation method based on trifocal tensor and key frame strategy

Publications (2)

Publication Number | Publication Date
CN105096341A | 2015-11-25
CN105096341B | 2018-04-17

Family ID: 54576679




Patent Citations (2)

* Cited by examiner, † Cited by third party

Publication number | Priority date | Publication date | Assignee | Title
US 2012/0063638 A1 * | 2010-09-10 | 2012-03-15 | Honda Motor Co., Ltd. | Egomotion using assorted features
CN 102682448 A * | 2012-03-14 | 2012-09-19 | Zhejiang University | Stereo vision rapid navigation and positioning method based on double trifocal tensors

Non-Patent Citations (4)

Deon Sabatta et al.: "Vision-based Path Following using the 1D Trifocal Tensor", 2013 IEEE International Conference on Robotics and Automation. *
H. M. Becerra et al.: "Omnidirectional visual control of mobile robots based on the 1D trifocal tensor", Robotics and Autonomous Systems. *
Liu Yong: "Image-based self-localization and attitude measurement of a moving platform" (基于影像的运动平台自定位测姿), China Doctoral Dissertations Full-text Database (Basic Sciences). *
Jia Bingxi et al.: "Research progress of robot visual servoing: vision systems and control strategies" (机器人视觉伺服研究进展:视觉系统与控制策略), Acta Automatica Sinica. *


Also Published As

Publication Number | Publication Date
CN105096341B | 2018-04-17


Legal Events

Code | Description
C06, PB01 | Publication
C10, SE01 | Entry into substantive examination / request for substantive examination in force
GR01 | Patent grant