CN108921895B - Sensor relative pose estimation method - Google Patents

Sensor relative pose estimation method

Info

Publication number
CN108921895B
CN108921895B (application CN201810600704.0A)
Authority
CN
China
Prior art keywords
matching
point
point cloud
feature
screening
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810600704.0A
Other languages
Chinese (zh)
Other versions
CN108921895A (en)
Inventor
朱效洲
曹璐
姚雯
陈小前
赵勇
白玉铸
王祎
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National Defense Technology Innovation Institute PLA Academy of Military Science
Original Assignee
National Defense Technology Innovation Institute PLA Academy of Military Science
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by National Defense Technology Innovation Institute PLA Academy of Military Science filed Critical National Defense Technology Innovation Institute PLA Academy of Military Science
Priority to CN201810600704.0A priority Critical patent/CN108921895B/en
Publication of CN108921895A publication Critical patent/CN108921895A/en
Application granted granted Critical
Publication of CN108921895B publication Critical patent/CN108921895B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06T — IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 — Image analysis
    • G06T7/70 — Determining position or orientation of objects or cameras
    • G06T7/73 — Determining position or orientation of objects or cameras using feature-based methods
    • G06T7/74 — Determining position or orientation of objects or cameras using feature-based methods involving reference images or patches
    • G06T2207/00 — Indexing scheme for image analysis or image enhancement
    • G06T2207/10 — Image acquisition modality
    • G06T2207/10016 — Video; Image sequence
    • G06T2207/10024 — Color image
    • G06T2207/10028 — Range image; Depth image; 3D point clouds

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a method for estimating the relative pose of a sensor, comprising the following steps: a preprocessing step, in which feature points in the previous-frame and current-frame color images are extracted and described with a 2D feature detector, matched and screened to establish correspondences, and combined with the depth information of the feature points in the previous-frame depth map to form a 2D-3D information association set, while the previous-frame and current-frame depth maps are converted into a source point cloud and a target point cloud, respectively, and the point clouds are down-sampled; a point cloud correspondence construction step, in which the correspondence between the source point cloud and the target point cloud is constructed to form a corresponding point set; and an error function construction and solving step, in which an error function is constructed from the 2D-3D information association set and the corresponding point set and solved by optimization until a termination condition is met; if the termination condition is not met, the point cloud correspondence is reconstructed. Based on this method, the computation accuracy and robustness are significantly improved for the problem of estimating the relative pose of a sensor in close-range perception of space targets.

Description

Sensor relative pose estimation method
Technical Field
The invention belongs to the field of space-target close-range perception and in particular relates to a method for estimating the relative pose of a sensor.
Background
The Iterative Closest Point (ICP) algorithm was originally proposed by Besl et al. in the early 1990s to estimate the transformation between 3D point clouds; after more than twenty years of development it is now widely applied to three-dimensional reconstruction, target recognition, robot navigation and related areas. Given a source point cloud 𝒫 = {P_i}, a target point cloud 𝒬 = {Q_j}, and a transformation T to be solved, with rotational component R ∈ SO(3) and translational component t ∈ ℝ³, the ICP algorithm iterates through the following two steps until convergence or a specified number of iterations is reached:
(1) For each point P_i in the source point cloud, search the target point cloud for its nearest neighbor, i.e., the point at minimum distance from it, thereby establishing the corresponding point set C:

C = { (P_i, Q_j) | Q_j = argmin_{Q ∈ 𝒬} ||P_i − Q||_2 }    (1)

where ||·||_2 denotes, without distinction, the Frobenius norm of a vector or matrix and, where no ambiguity arises, is abbreviated as ||·||.
(2) Estimate the transformation from the current correspondences.
Besl et al. construct the error function E using point-to-point distances, i.e., the Euclidean distances between each transformed source point and its nearest neighbor, and take the R and t that minimize it as the current estimate:

(R, t) = argmin_{R, t} Σ_{(P_i, Q_j) ∈ C} ||R·P_i + t − Q_j||²    (2)

Formula (2) has a closed-form solution and can also be solved by iterative optimization. The closed-form solution is usually obtained with methods such as singular value decomposition, unit quaternions, orthogonal matrices, or dual quaternions. Eggert et al. compared these four methods and found that the differences between their results under realistic noise conditions are not significant.
One problem of the ICP algorithm is that the large number of points in the point clouds leads to excessive computation; some methods reduce the number of points and increase the computation speed through preprocessing such as random sampling or uniform sampling. Another problem is the strong dependence on the initial value: when the transformation between the two point clouds is small, simply using the identity matrix I as the initial value can give a satisfactory result; when the transformation is large, using the identity matrix as the initial value is no longer suitable and the algorithm easily falls into a local minimum. The usual remedy is a coarse-to-fine strategy that provides a rough but reasonable initial value in a preprocessing stage through different means. Zhang, Jost and others down-sample the point clouds to different degrees of compression, estimate hierarchically, and take the result estimated at a heavily compressed layer as the initial value for a lightly compressed layer. Some methods first detect 3D feature points using 3D Scale Invariant Feature Transform (SIFT), Normal Aligned Radial Feature (NARF) and the like, then describe the detected feature points using Point Feature Histograms (PFH), Signature of Histograms of Orientations (SHOT), Binary Robust Appearance and Normals Descriptor (BRAND) and the like, and finally complete the feature point matching and use the matched points to estimate a more reasonable initial value.
In the stage of establishing the correspondence between point clouds, some methods use spatial data structures such as the k-dimensional tree (k-d tree) and the Octree to accelerate the nearest-neighbor search; common open-source implementations include the Approximate Nearest Neighbor (ANN) library, the Fast Library for Approximate Nearest Neighbors (FLANN), libnabo, and others. Unlike the traditional ICP algorithm, which relies only on geometric information, some methods also introduce additional information when searching for nearest neighbors, which is particularly common in methods that use RGBD data: Johnson et al. and Ji et al. introduce YIQ color information, Men et al. introduce hue information, Korn et al. introduce LAB color information, and so on.
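For illustration, the nearest-neighbor search over the target point cloud can be sketched with a k-d tree as follows (Python with SciPy; this is a stand-in for libraries such as FLANN or ANN, not the implementation used in the patent):

```python
import numpy as np
from scipy.spatial import cKDTree

def find_correspondences(src, dst, max_dist=np.inf):
    """Match each source point to its nearest neighbour in the target cloud.
    Returns index arrays (i, j); pairs farther apart than max_dist are dropped."""
    tree = cKDTree(dst)                    # spatial index built over the target cloud
    dists, idx = tree.query(src, k=1)      # nearest neighbour for every source point
    keep = dists < max_dist                # reject correspondences that are too far apart
    return np.flatnonzero(keep), idx[keep]
```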
In terms of error function construction, researchers have also explored measures beyond the point-to-point distance. Chen et al., contemporaneously with Besl, proposed using the point-to-plane distance, i.e., the Euclidean distance between a transformed source point and the local tangent plane of the target point, which exploits the normal vector information of the target point. Segal et al. proposed using plane-to-plane distances, exploiting the normal vector information of both the source and target points. However, the existing methods still suffer from low estimation accuracy and weak robustness, and a new method is needed to solve these problems.
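For comparison with the point-to-point measure of formula (2), a point-to-plane residual can be sketched as follows (illustrative Python; the target normals are assumed to be available and unit length, which is an assumption of this sketch rather than something stated above):

```python
import numpy as np

def point_to_plane_error(R, t, src, dst, normals):
    """Sum of squared point-to-plane distances
    sum_i ( n_i . (R*src_i + t - dst_i) )^2 over corresponding pairs."""
    diff = src @ R.T + t - dst                         # transformed source minus target
    return float(np.sum(np.einsum('ij,ij->i', diff, normals) ** 2))
```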
Disclosure of Invention
The object of the invention is achieved by the following technical solution.
The invention provides a method for estimating the relative pose of a sensor, which comprises the following steps:
a preprocessing step, extracting and describing feature points in a previous frame color image and a current frame color image by using a 2D feature detector, matching and screening the feature points to establish a corresponding relation between the feature points to form a 2D-3D information association set, respectively converting a previous frame depth image and a current frame depth image into a source point cloud and a target point cloud, and performing down-sampling on the source point cloud and the target point cloud;
a point cloud corresponding relation construction step, wherein a corresponding relation between the source point cloud and the target point cloud is constructed to form a corresponding point set;
and an error function constructing and solving step, namely constructing an error function by using the 2D-3D information association set and the corresponding point set and carrying out optimization solving until a termination condition is met, and if the termination condition is not met, reconstructing the point cloud corresponding relation.
Preferably, the preprocessing comprises the following steps:
extracting and describing feature points in the color images by using SIFT, wherein SIFT comprises a feature detector and a descriptor, and the SIFT detection result includes three pieces of information: the position of the feature point, its principal direction, and the scale at which it was detected;
sequentially performing bidirectional nearest-neighbor matching, symmetry screening, pixel distance screening and feature distance screening on the feature points to obtain matching point pairs;
selecting any matching point pair and determining its image pixel coordinates; if a valid depth value exists at those image pixel coordinates in the previous-frame depth map, the 2D-3D information association succeeds and the association set is formed;
and respectively converting the depth map of the previous frame and the depth map of the current frame into a source point cloud and a target point cloud, and performing down-sampling on the source point cloud and the target point cloud.
Preferably, the bidirectional nearest-neighbor matching performs nearest-neighbor matching using the feature distance between feature points as the metric, and is divided into forward nearest-neighbor matching and reverse nearest-neighbor matching;
in forward nearest-neighbor matching, for each feature point f_i^prev in F^prev, the feature point f_j^curr with the smallest feature distance and the feature point with the second-smallest feature distance are found in F^curr as candidates; if the ratio of the smallest to the second-smallest feature distance is less than a given threshold, f_i^prev and f_j^curr are matched successfully, otherwise the matching is not established;
reverse nearest-neighbor matching is similar to forward nearest-neighbor matching but opposite in direction: f_j^curr establishes a matching relation with f_i^prev by reverse matching;
wherein F^prev denotes the feature point set of the previous-frame color image, f_i^prev denotes a feature point of F^prev, F^curr denotes the feature point set of the current-frame color image, f_j^curr denotes a feature point of F^curr, the feature distance between f_i^prev and f_j^curr is their Euclidean distance in feature space, and i and j are positive integers.
Preferably, the symmetry screening comprises the following steps:
if f_i^prev establishes a matching relation with f_j^curr by forward matching, and f_j^curr establishes a matching relation with f_i^prev by reverse matching, the pair passes the symmetry screening and the screened matching point set is obtained;
otherwise, the symmetry screening fails, and the corresponding matching relation is cancelled.
Preferably, the pixel distance screening comprises the steps of:
and (4) selecting any matching point pair in the screened matching point set, wherein the pixel distance between the selected matching point pair and the selected matching point set is larger than a given threshold value, the pixel distance screening fails, and the matching relation needs to be eliminated.
Preferably, the feature distance screening comprises the steps of:
and keeping the matching points with the characteristic distance smaller than a certain proportion threshold value alpha in proportion among all the matching points to form a matching point set.
Preferably, the error function construction comprises the steps of:
the error function E is constructed using the 2D-3D information association set 𝒜 acquired during preprocessing and the corresponding point set C obtained during the point cloud correspondence construction:

E(R, t) = w·Σ_{(p_prev, p_curr, d) ∈ 𝒜} ||p_proj − p_curr||² + Σ_{(P_i, Q_j) ∈ C} ||R·P_i + t − Q_j||²,

with p_proj = π(K·(R·P + t)) and P = d·K⁻¹·p̃_prev, where π(·) denotes perspective projection (division by the third coordinate),

wherein the error function is composed of a 2D term E_2D and a 3D term E_3D; p̃ denotes the homogeneous form of a variable p, i.e., p̃ = (pᵀ, 1)ᵀ; K is the calibrated, known camera intrinsic matrix; w is the 2D term weight; 𝒫 is the source point cloud and 𝒬 is the target point cloud; p_proj is the projection into the current-frame image of the 3D space point P corresponding to the 2D feature point p_prev after transformation; t ∈ ℝ³ is the translation component; and d is the depth value corresponding to the feature point.
Preferably, the error function solving comprises the steps of:
parameterizing the pose with the Lie algebra and expressing the problem in unconstrained least-squares form:

min_{ω, t} E(ω, t) = w·Σ_{(p_prev, p_curr, d) ∈ 𝒜} ||π(K·(exp([ω]×)·P + t)) − p_curr||² + Σ_{(P_i, Q_j) ∈ C} ||exp([ω]×)·P_i + t − Q_j||²,

wherein ω ∈ ℝ³ is the parameterized representation of the rotation matrix R, the two being in one-to-one correspondence; [·]× denotes the antisymmetric matrix generated from a vector, so that R = exp([ω]×); π(·) denotes perspective projection; p̃ denotes the homogeneous form of a variable p, i.e., p̃ = (pᵀ, 1)ᵀ; p̃_prev is the homogeneous form of p_prev; and P = d·K⁻¹·p̃_prev.
Preferably, the optimization of the least squares requires computing the Jacobian matrix of the error function; with the state ordered as (t, ω), the Jacobian matrix of the 2D term is:

J_2D = J_π · K · [ I  −[R·P + t]× ]

wherein R is the rotation matrix, I is the 3×3 identity matrix, and J_π is the Jacobian of the perspective projection evaluated at K·(R·P + t).
Preferably, the optimal solution of the least squares requires the calculation of the jacobian matrix of the error function, the jacobian matrix of the 3D term being:
[I-[R·P+t]×]
where I is a 3x3 identity matrix.
The invention has the advantages that:
the invention provides a relative pose estimation method of a sensor, which improves an iterative closest point algorithm and comprises the following steps: preprocessing, point cloud corresponding relation construction, error function construction and solving, and constructing an error function and optimizing and solving by using a 2D-3D information association set and a corresponding point set. Based on the algorithm, the calculation precision and the robustness are remarkably improved aiming at the problem of estimation of the relative pose of the sensor in the close-range sensing of the space target.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the specific embodiments. The drawings are only for purposes of illustrating the particular embodiments and are not to be construed as limiting the invention. Also, like reference numerals are used to refer to like parts throughout the drawings. In the drawings:
FIG. 1 illustrates an iterative closest point algorithm framework diagram for fusing 2D-3D information in accordance with an embodiment of the present invention;
FIG. 2 illustrates feature extraction and description in the color image according to an embodiment of the present invention: (a) color image; (b) feature extraction result using SIFT;
FIG. 3 shows a schematic of feature matching according to an embodiment of the invention: (a) forward nearest-neighbor matching; (b) reverse nearest-neighbor matching; (c) symmetry screening;
FIG. 4 shows feature matching and screening results according to an embodiment of the invention: (a) forward nearest-neighbor matching result; (b) symmetry screening result; (c) pixel distance screening result; (d) feature distance screening result;
FIG. 5 illustrates a schematic diagram of 2D-3D information association according to an embodiment of the invention;
FIG. 6 illustrates down-sampling a point cloud by a voxel grid according to an embodiment of the invention;
FIG. 7 shows a schematic diagram of an error function construction according to an embodiment of the invention.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
According to an embodiment of the invention, a method for estimating the relative pose of a sensor is provided: using the 2D images and 3D point clouds acquired by an RGBD camera, an improved ICP fusing 2D-3D information, referred to as 2D-3D ICP, is proposed for estimating the relative pose/motion of the sensor. As shown in fig. 1, the algorithm framework is divided into three parts: preprocessing, point cloud correspondence construction, and error function construction and solving. First, feature points are extracted from the previous-frame and current-frame color images with a 2D feature detector and described with a feature descriptor. On this basis, feature matching and screening are performed to establish correspondences between the feature points, forming a matching point pair set containing the 2D information. Subsequently, the 3D information in the depth map is associated with this matching set to form the 2D-3D information association set 𝒜. Meanwhile, according to the relation between the camera coordinate system and the image pixel coordinate system, the previous-frame and current-frame depth maps are converted into the source point cloud 𝒫 and the target point cloud 𝒬, respectively. After the point clouds are down-sampled, the correspondence between the two point clouds is constructed to form the corresponding point set C. Finally, the 2D-3D information association set 𝒜 and the corresponding point set C are used to construct the error function E, and the relative pose is solved by optimization until the termination condition is met. The point cloud correspondence construction part is completed with FLANN: for each point P_i in the source point cloud 𝒫, a randomized k-d tree is used to build a nearest-neighbor index, from which the nearest neighbor is then searched. Compared with the brute-force approach of traversing every point of the target point cloud to find the nearest neighbor, this effectively improves search efficiency.
The operation of the present invention is mainly embodied in the preprocessing and error function construction and solving sections, which are described in detail below.
1. Preprocessing
1.1 Feature extraction and description
The invention uses SIFT to extract and describe the feature points in the color images. SIFT was originally proposed by David G. Lowe and consists of two parts, a feature detector and a descriptor. The core ideas of the detector are, first, to construct an image pyramid and detect feature points in different scale spaces, so that the result is invariant to scale changes of the image, and second, to compute a principal direction from the local structure of the feature point neighborhood, so that the result is invariant to image rotation.
The SIFT detection result includes three pieces of information: the position of the feature point, its principal direction, and the scale at which it was detected. As shown in fig. 2, each circle represents a detected feature point: the center of the circle marks the position of the feature point, the size of the circle is proportional to the scale at detection, and the straight line inside the circle indicates the principal direction of the feature point. The SIFT detector remains stable under changes of image brightness and viewing angle, and performs well in repeatability, robustness and feature point localization accuracy. The SIFT descriptor divides the feature point neighborhood into 16 sub-regions and accumulates the gradient distribution of the pixels in each sub-region over 8 directions, thereby representing each feature point as a 128-dimensional vector.
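A minimal sketch of this detection-and-description step with OpenCV's SIFT implementation is given below (assuming an OpenCV build that includes SIFT; the function name is illustrative and not taken from the patent):

```python
import cv2

def detect_and_describe(color_img):
    """Detect SIFT keypoints and compute their 128-D descriptors."""
    gray = cv2.cvtColor(color_img, cv2.COLOR_BGR2GRAY)
    sift = cv2.SIFT_create()
    keypoints, descriptors = sift.detectAndCompute(gray, None)
    # each keypoint carries a position (kp.pt), a scale (kp.size) and a principal direction (kp.angle)
    return keypoints, descriptors
```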
1.2 Feature matching and screening
After feature detection and description are completed in the two frames of images, feature matching and screening are performed to establish the correspondence between feature points; this is divided, in order, into four parts: bidirectional nearest-neighbor matching, symmetry screening, pixel distance screening, and feature distance screening. The feature point sets detected in the previous-frame and current-frame color images are denoted F^prev = {f_i^prev} and F^curr = {f_j^curr}, respectively, where N_prev and N_curr denote the numbers of feature points in the two sets. The feature distance between feature points f_i^prev and f_j^curr is denoted d_feat(f_i^prev, f_j^curr) and represents their Euclidean distance in the 128-dimensional feature vector space; the pixel distance is denoted d_pix(f_i^prev, f_j^curr) and represents their Euclidean distance in the image pixel coordinate system.
(1) Bidirectional nearest-neighbor matching
Nearest-neighbor matching is performed using the feature distance between feature points as the metric, divided into a forward direction and a reverse direction; the matching point sets thus established are denoted M_fwd and M_rev, respectively. In forward nearest-neighbor matching, for each feature point f_i^prev in F^prev, all feature points in F^curr are traversed to find the one with the smallest feature distance and the one with the second-smallest feature distance as candidates. If the ratio of the smallest to the second-smallest feature distance is less than a given threshold, a match between f_i^prev and the nearest candidate is established; otherwise no match is established. As shown in fig. 3(a), for one previous-frame feature point the two candidates with the smallest feature distances do not satisfy the requirement, so no match is established; for another feature point the requirement is satisfied and the match is established; established forward matching relations are indicated by red arrows. As shown in fig. 3(b), reverse nearest-neighbor matching is similar to forward nearest-neighbor matching but opposite in direction, i.e., for each feature point f_j^curr in F^curr, every feature point in F^prev is traversed. Established reverse matching relations are indicated by green arrows.
(2) Symmetry screening
After bidirectional nearest-neighbor matching is completed, symmetry screening is applied to reduce the number of mismatches; the screened matching point set is denoted M_sym. If f_i^prev establishes a matching relation with f_j^curr by forward matching, and f_j^curr establishes a matching relation with f_i^prev by reverse matching, the pair passes the symmetry screening; otherwise the symmetry screening fails and the corresponding matching relation is cancelled. As shown in fig. 3(c), one previous-frame feature point establishes a matching relation by forward matching and its current-frame counterpart also establishes the matching relation with it by reverse matching, so the pair passes the symmetry screening; another previous-frame feature point establishes a matching relation by forward matching, but its counterpart establishes its reverse matching relation with a different feature point, so the symmetry screening fails.
(3) Pixel distance screening
After the symmetry screening is completed, pixel distance screening is applied to further reduce the number of mismatches; the screened matching point set is denoted M_pix. Considering that the pose change between the two poses at which the camera captured the images is limited, the change in the image pixel coordinates of feature points corresponding to the same target in the two frames should also be limited. For any matching point pair (f_i^prev, f_j^curr), if the pixel distance between the two is larger than a given threshold, the pixel distance screening fails and the matching relation is removed.
(4) Feature distance screening
After the pixel distance screening is completed, feature distance screening is performed: among all the matching points, the matching point pairs with smaller feature distances are retained in proportion, and the set of screened matching points is denoted M_feat.
The results of feature matching and screening are shown in fig. 4, in which feature points that have a matching relation across the two frames are connected by straight lines for display. From (a) it can be seen that using only forward nearest-neighbor matching leaves many mismatched point pairs; these pairs do not correspond to the same target, but a matching relation is erroneously established because of their similarity in feature space. In (b), symmetry screening of the bidirectional matching result effectively alleviates the mismatching. In (c), pixel distance screening effectively removes mismatched pairs whose distance in the image pixel coordinate system is too large. In (d), feature distance screening further improves the matching result. The four stages are sketched in code after this paragraph.
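The following sketch (illustrative Python with NumPy; the brute-force descriptor distance, the ratio test and the threshold names ratio_thresh, pix_thresh, keep_ratio are assumptions of this sketch, not values from the patent) strings the four stages together:

```python
import numpy as np

def match_and_screen(desc_prev, desc_curr, pts_prev, pts_curr,
                     ratio_thresh=0.8, pix_thresh=50.0, keep_ratio=0.7):
    """Bidirectional NN matching + symmetry, pixel-distance and
    feature-distance screening. Returns a list of index pairs (i, j)."""
    d = np.linalg.norm(desc_prev[:, None, :] - desc_curr[None, :, :], axis=2)

    def ratio_matches(dist):
        # nearest and second-nearest neighbours per row, kept if they pass the ratio test
        order = np.argsort(dist, axis=1)
        nn, nn2 = order[:, 0], order[:, 1]
        rows = np.arange(dist.shape[0])
        ok = dist[rows, nn] < ratio_thresh * dist[rows, nn2]
        return {i: j for i, j in zip(np.flatnonzero(ok), nn[ok])}

    fwd = ratio_matches(d)        # previous frame -> current frame
    rev = ratio_matches(d.T)      # current frame -> previous frame

    # symmetry screening: keep only pairs matched in both directions
    pairs = [(i, j) for i, j in fwd.items() if rev.get(j) == i]

    # pixel distance screening
    pairs = [(i, j) for i, j in pairs
             if np.linalg.norm(pts_prev[i] - pts_curr[j]) < pix_thresh]

    # feature distance screening: keep the fraction with the smallest descriptor distance
    pairs.sort(key=lambda ij: d[ij[0], ij[1]])
    return pairs[:int(len(pairs) * keep_ratio)]
```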
1.3 2D-3D information association
After feature matching and screening are completed, the correspondence between 2D feature points has been established; the 3D information now needs to be associated through the depth map. The 2D-3D information association process is illustrated in fig. 5, where the depth map is displayed in pseudo-color: red indicates larger depth values, blue indicates smaller depth values, and regions with invalid depth values are shown in white. Take any matching point pair and suppose the feature points f_i^prev and f_j^curr have image pixel coordinates (u_prev, v_prev) and (u_curr, v_curr) in the corresponding color images. If, in the previous-frame depth map, a valid depth value d exists at the image pixel coordinate ([u_prev], [v_prev]), where [·] denotes rounding of the coordinate value, the 2D-3D information association succeeds. The finally formed association set 𝒜 collects, for each successfully associated pair, the previous-frame pixel coordinates, the current-frame pixel coordinates and the depth value, and its size is the number of associated matching point pairs.
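A sketch of this association step (illustrative Python; the depth map is assumed to use 0 for invalid pixels, which is an assumption of the sketch):

```python
import numpy as np

def associate_2d_3d(pairs, pts_prev, pts_curr, depth_prev):
    """Attach the previous-frame depth to each matched feature pair.
    Returns a list of (p_prev, p_curr, d) tuples."""
    assoc = []
    for i, j in pairs:
        u, v = np.rint(pts_prev[i]).astype(int)        # rounded pixel coordinates
        if 0 <= v < depth_prev.shape[0] and 0 <= u < depth_prev.shape[1]:
            d = depth_prev[v, u]                        # depth indexed as (row, col)
            if d > 0:                                   # keep valid depth values only
                assoc.append((pts_prev[i], pts_curr[j], float(d)))
    return assoc
```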
1.4 Point cloud generation and down-sampling
The previous-frame and current-frame depth maps are converted into the source point cloud 𝒫 and the target point cloud 𝒬, respectively. Then, according to the requirements, the point clouds are down-sampled with a Voxel grid to reduce the number of points they contain, thereby increasing the computation speed of the algorithm. The down-sampling process is shown in fig. 6: a voxel grid partitions the 3D space, each point in the point cloud is treated as a mass point, and the points within each grid cell are approximated by their centroid.
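Back-projection of a depth map to a point cloud and voxel-grid down-sampling could be sketched as follows (illustrative Python; K is the camera intrinsic matrix, depth values of 0 mark invalid pixels, and the voxel size is an assumed parameter):

```python
import numpy as np

def depth_to_cloud(depth, K):
    """Back-project a depth map to 3D points: P = d * K^-1 * (u, v, 1)^T."""
    v, u = np.nonzero(depth > 0)                       # valid pixel coordinates
    d = depth[v, u]
    pix = np.stack([u, v, np.ones_like(u)], axis=0).astype(float)
    return (np.linalg.inv(K) @ pix * d).T              # (N, 3) points in the camera frame

def voxel_downsample(cloud, voxel=0.05):
    """Replace all points falling in the same voxel by their centroid."""
    keys = np.floor(cloud / voxel).astype(int)
    _, inv = np.unique(keys, axis=0, return_inverse=True)
    sums = np.zeros((inv.max() + 1, 3))
    np.add.at(sums, inv, cloud)                        # accumulate points per voxel
    counts = np.bincount(inv)[:, None]
    return sums / counts
```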
2 Error function construction and solution
2.1 Error function construction
The error function E is constructed using the 2D-3D information association set 𝒜 acquired during preprocessing and the corresponding point set C obtained during the point cloud correspondence construction. The preprocessing is carried out only once, whereas the correspondence construction and the error function construction are carried out iteratively: an initial value of the pose to be solved is given and the point cloud correspondence is constructed; an error function is then constructed from the obtained corresponding point set and solved by optimization, thereby updating the pose to be solved; a judgment is then made: if the termination condition is met the iteration ends, otherwise the point cloud correspondence is rebuilt using the updated pose.
The error function is composed of a 2D term E_2D and a 3D term E_3D:

E(R, t) = w·Σ_{(p_prev, p_curr, d) ∈ 𝒜} ||p_proj − p_curr||² + Σ_{(P_i, Q_j) ∈ C} ||R·P_i + t − Q_j||²,

with p_proj = π(K·(R·P + t)) and P = d·K⁻¹·p̃_prev, where π(·) denotes perspective projection (division by the third coordinate),

where p̃ denotes the homogeneous form of a variable p, i.e., p̃ = (pᵀ, 1)ᵀ, K is the calibrated, known camera intrinsic matrix, and w is the weight of the 2D term. As shown in fig. 7, the 3D term is consistent with the error function of the conventional ICP algorithm and measures, in 3D space, the Euclidean distance between a transformed source point R·P_i + t of the source point cloud 𝒫 and its nearest neighbor Q_j in the target point cloud 𝒬; the 2D term measures, in the 2D image, the Euclidean distance between p_proj and p_curr, where p_proj is the projection into the current-frame image of the transformed 3D space point P corresponding to the 2D feature point p_prev, t ∈ ℝ³ is the translation component, and d is the depth value corresponding to the feature point.
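Evaluating this combined error for a candidate (R, t) could be sketched as follows (illustrative Python following the 2D and 3D terms described above; assoc is assumed to be the output of the 2D-3D association, and corr a pair of index arrays for the current point correspondences):

```python
import numpy as np

def combined_error(R, t, K, assoc, src, dst, corr, w=1.0):
    """Weighted 2D reprojection term plus 3D point-to-point term."""
    e2d = 0.0
    for p_prev, p_curr, d in assoc:
        P = d * np.linalg.inv(K) @ np.append(p_prev, 1.0)   # back-project p_prev with depth d
        q = K @ (R @ P + t)                                  # project into the current frame
        p_proj = q[:2] / q[2]
        e2d += np.sum((p_proj - p_curr) ** 2)
    i, j = corr                                              # corresponding point indices
    e3d = np.sum((src[i] @ R.T + t - dst[j]) ** 2)
    return w * e2d + e3d
```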
2.2 Error function solution
To obtain the optimal R and t, the constrained least-squares problem

min_{R, t} E(R, t)   subject to   R ∈ SO(3)

needs to be solved. On the one hand, enforcing the constraint with Lagrange multipliers increases the complexity of the optimization problem and leads to more computation; on the other hand, representing R with a 3×3 orthogonal matrix requires 9 variables, which is an over-parameterized representation and negatively affects the optimization process. According to group theory, the special orthogonal group SO(3) and the special Euclidean group SE(3) are each a Lie Group and also a smooth manifold, which can be locally approximated by its corresponding Lie Algebra. The Lie algebra forms the tangent space of the Lie group at the identity element; derivatives can be computed on this tangent space, which makes interpolation convenient during optimization. In addition, the Lie group and the Lie algebra can be mapped to each other through exponential and logarithmic operations.
As a minimal parameterization, the Lie algebra needs only 3 parameters to represent a rotation R ∈ SO(3): if ω ∈ so(3), then R = exp([ω]×), where [·]× denotes the antisymmetric matrix generated from a vector. Similarly, only 6 parameters are needed to represent a transformation matrix T ∈ SE(3) with the Lie algebra: if ξ ∈ se(3), then T = exp(ξ^), where ξ collects the rotational and translational parameters. Therefore, the pose is represented parametrically using the Lie algebra, and the problem can be rearranged into the following unconstrained least-squares form, which is easy to solve.
min_{ω, t} E(ω, t) = w·Σ_{(p_prev, p_curr, d) ∈ 𝒜} ||π(K·(exp([ω]×)·P + t)) − p_curr||² + Σ_{(P_i, Q_j) ∈ C} ||exp([ω]×)·P_i + t − Q_j||²,

wherein ω ∈ ℝ³ is the parameterized representation of the rotation matrix R, the two being in one-to-one correspondence; [·]× denotes the antisymmetric matrix generated from a vector, so that R = exp([ω]×); π(·) denotes perspective projection; p̃ denotes the homogeneous form of a variable p, i.e., p̃ = (pᵀ, 1)ᵀ; p̃_prev is the homogeneous form of p_prev; and P = d·K⁻¹·p̃_prev.
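The exponential map from the Lie algebra to SO(3) used in this parameterization could be sketched as follows (illustrative Python, Rodrigues' formula; the function names are not taken from the patent):

```python
import numpy as np

def hat(w):
    """Antisymmetric matrix [w]_x generated from a 3-vector."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def exp_so3(w):
    """R = exp([w]_x) via Rodrigues' formula."""
    theta = np.linalg.norm(w)
    if theta < 1e-12:
        return np.eye(3) + hat(w)            # first-order approximation near zero
    a = hat(w / theta)
    return np.eye(3) + np.sin(theta) * a + (1.0 - np.cos(theta)) * (a @ a)
```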
In the optimization of the least squares, the key is to compute the Jacobian matrix of the error function. According to the differentiation properties of the Lie group and Lie algebra, with the state ordered as (t, ω), the Jacobian matrix of the 2D term is

J_2D = J_π · K · [ I  −[R·P + t]× ]

wherein R is the rotation matrix, I is the 3×3 identity matrix, and J_π is the Jacobian of the perspective projection evaluated at K·(R·P + t).
And the Jacobian matrix of the 3D terms is
[I-[R·P+t]×] (9)
Where I is a 3x3 identity matrix.
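Putting the 3D-term Jacobian to work, a single Gauss-Newton update could be sketched as follows (illustrative Python with NumPy/SciPy, not the patent's implementation; the state is ordered as (t, ω) to match [I  −[R·P+t]×], and the 2D term would contribute additional weighted rows in the same way):

```python
import numpy as np
from scipy.spatial.transform import Rotation

def gauss_newton_step_3d(R, t, src, dst):
    """One Gauss-Newton step on the 3D point-to-point term."""
    def hat(w):  # antisymmetric matrix generated from a vector
        return np.array([[0.0, -w[2], w[1]], [w[2], 0.0, -w[0]], [-w[1], w[0], 0.0]])

    JTJ, JTr = np.zeros((6, 6)), np.zeros(6)
    for P, Q in zip(src, dst):
        Pp = R @ P + t
        r = Pp - Q                                   # residual of this correspondence
        J = np.hstack([np.eye(3), -hat(Pp)])         # 3x6 Jacobian [I, -[R*P+t]_x]
        JTJ += J.T @ J
        JTr += J.T @ r
    delta = np.linalg.solve(JTJ, -JTr)               # increment [dt, dw]
    dR = Rotation.from_rotvec(delta[3:]).as_matrix() # exponential-map update of the rotation
    return dR @ R, dR @ t + delta[:3]
```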
The invention provides a sensor relative pose estimation method fusing 2D-3D information, which applies an iterative closest point algorithm, and comprises the following steps: preprocessing, point cloud corresponding relation construction, error function construction and solving, and constructing an error function and optimizing and solving by using a 2D-3D information association set and a corresponding point set. Based on the algorithm, the calculation precision and the robustness are obviously improved aiming at the problem of estimation of the relative pose of the sensor in the close-range sensing of the space target.
The above description is only an exemplary embodiment of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present invention are also included in the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (8)

1. A sensor relative pose estimation method is characterized by comprising the following steps:
a preprocessing step, extracting and describing feature points in a previous frame color image and a current frame color image by using a 2D feature detector, matching and screening the feature points to establish a corresponding relation between the feature points to form a 2D-3D information association set, converting the previous frame depth image and the current frame depth image into a source point cloud and a target point cloud respectively, and performing down-sampling on the source point cloud and the target point cloud;
the pretreatment comprises the following steps:
extracting and describing feature points in the color images by using SIFT, wherein SIFT comprises a feature detector and a descriptor, and the SIFT detection result includes three pieces of information: the position of the feature point, its principal direction, and the scale at which it was detected;
sequentially performing bidirectional nearest-neighbor matching, symmetry screening, pixel distance screening and feature distance screening on the feature points to obtain matching point pairs;
selecting any one of the matching point pairs and determining its image pixel coordinates; if a valid depth value exists at those image pixel coordinates in the previous-frame depth map, the 2D-3D information association succeeds and the association set is formed;
respectively converting the depth map of the previous frame and the depth map of the current frame into a source point cloud and a target point cloud, and performing down-sampling on the source point cloud and the target point cloud;
a point cloud corresponding relation construction step, wherein the corresponding relation between the source point cloud and the target point cloud is constructed to form a corresponding point set;
constructing and solving an error function, namely constructing the error function by using the 2D-3D information association set and the corresponding point set, and performing optimization solving until a termination condition is met, and if the termination condition is not met, reconstructing the point cloud corresponding relation;
the error function E is constructed using the 2D-3D information association set 𝒜 acquired during the preprocessing and the corresponding point set C obtained during the point cloud correspondence construction:

E(R, t) = w·Σ_{(p_prev, p_curr, d) ∈ 𝒜} ||p_proj − p_curr||² + Σ_{(P_i, Q_j) ∈ C} ||R·P_i + t − Q_j||²,

with p_proj = π(K·(R·P + t)) and P = d·K⁻¹·p̃_prev, where π(·) denotes perspective projection,

wherein the error function is composed of a 2D term E_2D and a 3D term E_3D; p̃ denotes the homogeneous form of a variable p, i.e., p̃ = (pᵀ, 1)ᵀ; K is the calibrated, known camera intrinsic matrix; w is the 2D term weight; 𝒫 is the source point cloud and 𝒬 is the target point cloud; p_proj is the projection into the current-frame image of the 3D space point P corresponding to the 2D feature point p_prev after transformation; t ∈ ℝ³ is the translation component; d is the depth value corresponding to the feature point; and R is the rotation matrix.
2. The method of claim 1, wherein the bi-directional nearest-neighbor matching performs nearest-neighbor matching using a feature distance between the feature points as a metric, classified into forward nearest-neighbor matching and reverse nearest-neighbor matching;
in the forward nearest-neighbor matching, for each feature point f_i^prev in F^prev, the feature point f_j^curr with the smallest feature distance and the feature point with the second-smallest feature distance are found in F^curr as candidates; if the ratio of the smallest to the second-smallest feature distance is less than a given threshold, f_i^prev and f_j^curr are matched successfully, otherwise the matching is not established;
the reverse nearest-neighbor matching is similar to the forward nearest-neighbor matching but opposite in direction: f_j^curr establishes a matching relation with f_i^prev by reverse matching;
wherein F^prev denotes the feature point set of the previous-frame color image, f_i^prev denotes a feature point of F^prev, F^curr denotes the feature point set of the current-frame color image, f_j^curr denotes a feature point of F^curr, the feature distance between f_i^prev and f_j^curr is their Euclidean distance in feature space, and i and j are positive integers.
3. The method of claim 2, wherein the symmetry-screening comprises the steps of:
if f_i^prev establishes a matching relation with f_j^curr by forward matching, and f_j^curr establishes a matching relation with f_i^prev by reverse matching, the pair passes the symmetry screening and the screened matching point set is obtained;
otherwise, the symmetry screening fails, and the corresponding matching relation is cancelled.
4. The method of claim 3, wherein the pixel distance filtering comprises the steps of:
and selecting any matching point pair in the screened matching point set, and if the pixel distance between the selected matching point pair and the selected matching point set is greater than a given threshold value, failing to screen the pixel distance, and rejecting the matching relationship.
5. The method of claim 1, wherein the feature distance screening comprises the steps of:
and keeping the matching points with the characteristic distance smaller than a certain proportion threshold value alpha in proportion among all the matching points to form a matching point set.
6. The method of claim 1, wherein the error function solving comprises the steps of:
parameterizing the pose with the Lie algebra and expressing the problem in unconstrained least-squares form:

min_{ω, t} E(ω, t) = w·Σ_{(p_prev, p_curr, d) ∈ 𝒜} ||π(K·(exp([ω]×)·P + t)) − p_curr||² + Σ_{(P_i, Q_j) ∈ C} ||exp([ω]×)·P_i + t − Q_j||²,

wherein ω ∈ ℝ³ is the parameterized representation of the rotation matrix R, the two being in one-to-one correspondence; [·]× denotes the antisymmetric matrix generated from a vector, so that R = exp([ω]×); π(·) denotes perspective projection; p̃ denotes the homogeneous form of the variable p, i.e., p̃ = (pᵀ, 1)ᵀ; p̃_prev is the homogeneous form of p_prev; and P = d·K⁻¹·p̃_prev.
7. The method of claim 6, wherein the Jacobian matrix of error functions is calculated when the least squares are optimally solved, the Jacobian matrix of 2D terms being:
J_2D = J_π · K · [ I  −[R·P + t]× ]
wherein R is the rotation matrix and J_π is the Jacobian of the perspective projection evaluated at K·(R·P + t).
8. The method of claim 6, wherein the Jacobian matrix of error functions is calculated when the least squares are optimally solved, the Jacobian matrix of 3D terms being:
[I-[R·P+t]×]
where I is a 3x3 identity matrix.
CN201810600704.0A 2018-06-12 2018-06-12 Sensor relative pose estimation method Active CN108921895B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810600704.0A CN108921895B (en) 2018-06-12 2018-06-12 Sensor relative pose estimation method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810600704.0A CN108921895B (en) 2018-06-12 2018-06-12 Sensor relative pose estimation method

Publications (2)

Publication Number Publication Date
CN108921895A CN108921895A (en) 2018-11-30
CN108921895B true CN108921895B (en) 2021-03-02

Family

ID=64419193

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810600704.0A Active CN108921895B (en) 2018-06-12 2018-06-12 Sensor relative pose estimation method

Country Status (1)

Country Link
CN (1) CN108921895B (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109827578B (en) * 2019-02-25 2019-11-22 中国人民解放军军事科学院国防科技创新研究院 Satellite relative attitude estimation method based on profile similitude
CN110188604A (en) * 2019-04-18 2019-08-30 盎锐(上海)信息科技有限公司 Face identification method and device based on 2D and 3D image
CN110097581B (en) * 2019-04-28 2021-02-19 西安交通大学 Method for constructing K-D tree based on point cloud registration ICP algorithm
CN110458177B (en) * 2019-07-12 2023-04-07 中国科学院深圳先进技术研究院 Method for acquiring image depth information, image processing device and storage medium
CN110490932B (en) * 2019-08-21 2023-05-09 东南大学 Method for measuring space pose of crane boom through monocular infrared coplanar cursor iteration optimization
CN110766716B (en) * 2019-09-10 2022-03-29 中国科学院深圳先进技术研究院 Method and system for acquiring information of space unknown moving target
CN111681279B (en) * 2020-04-17 2023-10-31 东南大学 Driving suspension arm space pose measurement method based on improved Liqun nonlinear optimization
CN112212889A (en) * 2020-09-16 2021-01-12 北京工业大学 SINS strapdown inertial navigation system shaking base rough alignment method based on special orthogonal group optimal estimation
CN112285734B (en) * 2020-10-30 2023-06-23 北京斯年智驾科技有限公司 Port unmanned set card high-precision alignment method and system based on spike
CN113436270B (en) * 2021-06-18 2023-04-25 上海商汤临港智能科技有限公司 Sensor calibration method and device, electronic equipment and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101645170A (en) * 2009-09-03 2010-02-10 北京信息科技大学 Precise registration method of multilook point cloud
CN105938619A (en) * 2016-04-11 2016-09-14 中国矿业大学 Visual odometer realization method based on fusion of RGB and depth information
CN107220995A (en) * 2017-04-21 2017-09-29 西安交通大学 A kind of improved method of the quick point cloud registration algorithms of ICP based on ORB characteristics of image
CN107292949A (en) * 2017-05-25 2017-10-24 深圳先进技术研究院 Three-dimensional rebuilding method, device and the terminal device of scene
CN107590827A (en) * 2017-09-15 2018-01-16 重庆邮电大学 A kind of indoor mobile robot vision SLAM methods based on Kinect

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2956891B1 (en) * 2013-02-18 2019-12-18 Tata Consultancy Services Limited Segmenting objects in multimedia data
US10223829B2 (en) * 2016-12-01 2019-03-05 Here Global B.V. Method and apparatus for generating a cleaned object model for an object in a mapping database

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101645170A (en) * 2009-09-03 2010-02-10 北京信息科技大学 Precise registration method of multilook point cloud
CN105938619A (en) * 2016-04-11 2016-09-14 中国矿业大学 Visual odometer realization method based on fusion of RGB and depth information
CN107220995A (en) * 2017-04-21 2017-09-29 西安交通大学 A kind of improved method of the quick point cloud registration algorithms of ICP based on ORB characteristics of image
CN107292949A (en) * 2017-05-25 2017-10-24 深圳先进技术研究院 Three-dimensional rebuilding method, device and the terminal device of scene
CN107590827A (en) * 2017-09-15 2018-01-16 重庆邮电大学 A kind of indoor mobile robot vision SLAM methods based on Kinect

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
"Evaluating 3D-2D correspondences for accurate camera pose estimation from a single image";Yonghuai Liu 等;《SMC"03 Conference Proceedings》;20031110;703-708 *
"Fast RGBD-ICP with bionic vision depth perception model";Xia Shen 等;《2015 IEEE International Conference on Cyber Technology in Automation, Control, and Intelligent Systems 》;20151005;1241-1246 *
"基于深度信息的仿生视觉模型快速RGB-D ICP算法研究";谌夏;《中国优秀硕士学位论文全文数据库-信息科技辑》;20150715;第2015年卷(第7期);I138-1257 *

Also Published As

Publication number Publication date
CN108921895A (en) 2018-11-30

Similar Documents

Publication Publication Date Title
CN108921895B (en) Sensor relative pose estimation method
Zhou et al. Canny-vo: Visual odometry with rgb-d cameras based on geometric 3-d–2-d edge alignment
CN108369741B (en) Method and system for registration data
CN106709947B (en) Three-dimensional human body rapid modeling system based on RGBD camera
Kurka et al. Applications of image processing in robotics and instrumentation
CN113178009B (en) Indoor three-dimensional reconstruction method utilizing point cloud segmentation and grid repair
CN104200461B (en) The remote sensing image registration method of block and sift features is selected based on mutual information image
CN109544599B (en) Three-dimensional point cloud registration method based on camera pose estimation
CN106485690A (en) Cloud data based on a feature and the autoregistration fusion method of optical image
CN106683173A (en) Method of improving density of three-dimensional reconstructed point cloud based on neighborhood block matching
WO2013029675A1 (en) Method for estimating a camera motion and for determining a three-dimensional model of a real environment
CN103729643A (en) Recognition and pose determination of 3d objects in multimodal scenes
CN113298934B (en) Monocular visual image three-dimensional reconstruction method and system based on bidirectional matching
CN107492107B (en) Object identification and reconstruction method based on plane and space information fusion
CN110310331A (en) A kind of position and orientation estimation method based on linear feature in conjunction with point cloud feature
Alcantarilla et al. Large-scale dense 3D reconstruction from stereo imagery
CN110009670A (en) The heterologous method for registering images described based on FAST feature extraction and PIIFD feature
Xie et al. Fine registration of 3D point clouds with iterative closest point using an RGB-D camera
Bethmann et al. Object-based multi-image semi-global matching–concept and first results
CN109345570B (en) Multi-channel three-dimensional color point cloud registration method based on geometric shape
CN117351078A (en) Target size and 6D gesture estimation method based on shape priori
CN117197333A (en) Space target reconstruction and pose estimation method and system based on multi-view vision
Wan et al. A performance comparison of feature detectors for planetary rover mapping and localization
Bignoli et al. Multi-view stereo 3D edge reconstruction
Zhang Binocular Stereo Vision

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant