CN111105460A - RGB-D camera pose estimation method for indoor scene three-dimensional reconstruction - Google Patents

RGB-D camera pose estimation method for indoor scene three-dimensional reconstruction

Info

Publication number
CN111105460A
Authority
CN
China
Prior art keywords
rgb
frame
camera
camera pose
frames
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911361680.9A
Other languages
Chinese (zh)
Other versions
CN111105460B (en)
Inventor
李纯明 (Li Chunming)
方硕 (Fang Shuo)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China filed Critical University of Electronic Science and Technology of China
Priority to CN201911361680.9A priority Critical patent/CN111105460B/en
Publication of CN111105460A publication Critical patent/CN111105460A/en
Application granted granted Critical
Publication of CN111105460B publication Critical patent/CN111105460B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00: Image analysis
    • G06T7/70: Determining position or orientation of objects or cameras
    • G06T7/73: Determining position or orientation of objects or cameras using feature-based methods
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00: Image analysis
    • G06T7/20: Analysis of motion
    • G06T7/269: Analysis of motion using gradient-based methods
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00: Indexing scheme for image analysis or image enhancement
    • G06T2207/10: Image acquisition modality
    • G06T2207/10016: Video; Image sequence
    • G06T2207/10021: Stereoscopic video; Stereoscopic image sequence
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00: Indexing scheme for image analysis or image enhancement
    • G06T2207/10: Image acquisition modality
    • G06T2207/10024: Color image
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00: Indexing scheme for image analysis or image enhancement
    • G06T2207/10: Image acquisition modality
    • G06T2207/10028: Range image; Depth image; 3D point clouds
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T: CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00: Road transport of goods or passengers
    • Y02T10/10: Internal combustion engine [ICE] based vehicles
    • Y02T10/40: Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)
  • Length Measuring Devices By Optical Means (AREA)

Abstract

The invention provides an RGB-D camera pose estimation method for three-dimensional reconstruction of indoor scenes, which combines joint optimization of camera poses and depth maps over local inter-frame intervals with camera pose estimation based on combined RGB-D feature matching. Dense RGB-D alignment within each local interval removes the influence of single-frame depth noise or holes on feature matching and on the camera poses estimated from those matches, and also reduces redundant RGB-D information; feature extraction and matching that combine RGB and depth information reduce camera pose estimation errors caused by repeated or weak RGB textures. The invention thus addresses severe depth loss caused by distance limits or infrared interference, repeated textures and structures, weak textures, drastic illumination changes, and rapid camera motion.

Description

RGB-D camera pose estimation method for indoor scene three-dimensional reconstruction
Technical Field
The invention belongs to the technical field of positioning and tracking, and particularly relates to an RGB-D camera pose estimation method for three-dimensional reconstruction of an indoor scene.
Background
At present, with the rise of many consumer-grade RGB-D camera products, numerous teams at home and abroad are devoted to research on more robust, accurate, efficient, and large-scale RGB-D three-dimensional reconstruction. Camera pose estimation, i.e., estimation of the inter-frame relative transformation T (rotation matrix R and translation vector t), is the most important link in three-dimensional reconstruction based on an RGB-D camera.
Current camera pose estimation methods for RGB-D cameras mainly include the feature point method, the direct method, the iterative closest point (ICP) algorithm, and RGB-D alignment methods. The feature point method and the direct method estimate the camera pose using only RGB information and discard the depth information. The feature point method estimates the pose by matching feature points; it works well in scenes that provide rich features and supports relocalization from those features, but it uses too little information, is computationally expensive, discards most of the content of the RGB image, and often fails in weak-texture or repeated-texture environments. The direct method obtains a dense or semi-dense map without computing feature descriptors, so it still works when features are missing, but its brightness-constancy assumption is too strict, and requirements such as slow camera motion and no automatic exposure make it unsuitable when illumination changes strongly or the camera moves quickly. The classical ICP algorithm uses only depth information, not RGB: it repeatedly selects corresponding point pairs, computes the optimal rigid-body transformation, applies it, re-establishes correspondences, and computes a new optimal transformation until the convergence accuracy required for correct registration is reached. Although ICP fully exploits the geometric structure of the point cloud and does not depend on RGB features or photometry, it is sensitive to the initial pose and needs a good initial value; using RGB feature matching to initialize ICP provides a better starting point, but increases the dependence on RGB features and still handles weak textures poorly. Widely influential RGB-D reconstruction systems such as KinectFusion, ElasticFusion, and their many variants mainly rely on ICP-based camera pose tracking. RGB-D alignment methods use both RGB and depth information and solve for the relative camera pose between two frames by minimizing depth and photometric errors; the BundleFusion framework is based on RGB-D alignment. However, the noise of depth cameras often degrades the quality of RGB-D alignment.
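For readers unfamiliar with ICP, the loop described above can be sketched as follows: a minimal point-to-point variant with an SVD (Kabsch) update. The SciPy nearest-neighbour search, the iteration count, and the convergence test are illustrative choices, not details taken from the patent.

```python
import numpy as np
from scipy.spatial import cKDTree

def icp(source, target, iters=30, tol=1e-6):
    """Minimal point-to-point ICP; source (N,3) and target (M,3) are point clouds."""
    R, t = np.eye(3), np.zeros(3)
    tree = cKDTree(target)
    prev_err = np.inf
    for _ in range(iters):
        moved = source @ R.T + t                  # apply the current pose estimate
        dist, idx = tree.query(moved)             # nearest-neighbour correspondences
        corr = target[idx]
        # closed-form rigid update (Kabsch / SVD) on the matched pairs
        mu_s, mu_c = moved.mean(0), corr.mean(0)
        H = (moved - mu_s).T @ (corr - mu_c)
        U, _, Vt = np.linalg.svd(H)
        D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
        dR = Vt.T @ D @ U.T
        dt = mu_c - dR @ mu_s
        R, t = dR @ R, dR @ t + dt                # compose the incremental transform
        err = dist.mean()
        if abs(prev_err - err) < tol:             # stop once registration has converged
            break
        prev_err = err
    return R, t
```

As the paragraph notes, such a loop converges to a good registration only when the initial pose is close enough to the true one.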
Therefore, how to estimate camera pose changes accurately and robustly, and thereby realize three-dimensional reconstruction of indoor scenes, in the presence of severe depth loss caused by distance limits or infrared interference, repeated textures and structures, weak textures, drastic illumination changes, and rapid camera motion, is a problem well worth attention.
Disclosure of Invention
Aiming at the above defects in the prior art, the present invention provides an RGB-D camera pose estimation method for indoor scene three-dimensional reconstruction, which solves the problems of severe depth loss caused by distance limits or infrared interference, repeated textures and structures, weak textures, drastic illumination changes, and rapid camera motion.
In order to achieve the above purpose, the invention adopts the technical scheme that:
The scheme provides an RGB-D camera pose estimation method for indoor scene three-dimensional reconstruction, which comprises the following steps:
s1, acquiring each RGB-D frame from the RGB-D camera;
s2, aligning the RGB image and the depth image of each RGB-D frame, preprocessing the depth image, and deleting abnormal depth data to obtain aligned RGB-D frames;
s3, performing optical flow tracking on the RGB images of the aligned RGB-D frames, and determining the local alignment and optimization intervals for RGB-D camera pose estimation;
s4, performing RGB-D camera pose estimation on the RGB-D frames within each local optimization interval, and transforming the RGB-D information of the interval into the coordinate system of the interval's RGB-D key frame to obtain optimized RGB-D key frames;
and S5, extracting and matching feature points of the optimized RGB-D key frames by combining the RGB and depth information to obtain pose estimates between the RGB-D key frames, completing the estimation of the RGB-D camera pose for indoor scene three-dimensional reconstruction.
Further, the abnormal depth data in step S2 includes:
points outside the RGB-D camera effective distance;
3D points whose distance to the nearest point in the point cloud of the RGB-D frame is greater than a preset threshold, the threshold being 0.9 times the maximum point-pair distance of that frame's point cloud; and
3D points whose included angles with the horizontal and vertical principal optical axes respectively exceed a preset threshold, the principal-optical-axis angle threshold being 60-70 degrees.
Still further, the step S3 includes the following steps:
s301, extracting ORB corner points from the aligned RGB image of the first RGB-D frame, and extracting ORB corner points from the aligned RGB image of the next RGB-D frame;
s302, performing optical flow tracking based on the brightness-constancy assumption on the extracted ORB corner points, and judging whether the optical flow tracking succeeds; if so, entering step S303, otherwise entering step S304;
s303, computing the relative camera pose between the adjacent RGB-D frames with an epipolar geometry method, and judging whether the L2 norm of the Lie-algebra representation of the relative pose change is within a preset threshold; if so, recording the RGB-D frame as a candidate frame of the current local optimization interval and returning to step S302, otherwise entering step S304;
s304, judging whether the current RGB-D frame is the first frame and no candidate RGB-D frame exists; if so, entering step S305, otherwise entering step S306;
s305, recording the RGB-D frame as a candidate frame of a new local optimization interval, and judging whether a next RGB-D frame exists; if so, returning to step S304, otherwise returning to step S302;
s306, forming one local alignment and optimization interval for RGB-D camera pose estimation from all current candidate RGB-D frames and entering step S4, recording the RGB-D frame as a candidate frame of a new local optimization interval, and judging whether a next RGB-D frame exists; if so, returning to step S304, otherwise returning to step S302.
Still further, the threshold value in step S303 is 10.
Still further, the step S4 includes the following steps:
s401, according to the RGB-D sequence in the local optimization interval, selecting the ⌊n_i/2⌋-th RGB-D frame of the interval as the key frame of the local optimization interval, wherein n_i denotes the number of RGB-D frames in the i-th local optimization interval and ⌊·⌋ denotes rounding down;
s402, according to the key frame, calculating by utilizing the minimized inverse depth error and the photometric error to obtain the camera pose in each local optimization interval;
and S403, transforming the 3D points of the adjacent RGB-D frames, using the camera poses obtained in the local optimization interval, into the camera coordinate system of the key frame to obtain the optimized RGB-D key frame.
Still further, the camera pose T in each local optimization interval in step S402 satisfies the following expression:
E_align(T) = E_z(T) + α·E_I(T)

E_z(T) = Σ_j ρ_Z( z(X_j) - Z_j(x_j) )

E_I(T) = Σ_i ρ_I( I_i(x_i) - I_k(x_k) )

wherein E_z is the inverse depth error, E_I is the photometric error, α is the relative weight that balances the inverse depth error with the photometric error, z(X_j) denotes the depth of key point X_j in the i-th frame, Z_j(x_j) denotes the depth at the projection position x_j of key point X_j on the depth image of the j-th frame, ρ_Z is the inverse depth error robustness function, I_i(x_i) denotes the photometry of point x_i on the i-th frame, I_k(x_k) denotes the photometry of the corresponding point x_k on the key frame, ρ_I is the photometric error robustness function, x_i denotes the 2D feature point position on the current key frame, and E_align denotes the total error.
Still further, the step S5 includes the following steps:
s501, extracting, from each optimized RGB-D key frame, key points that combine the RGB image and the depth image, i.e., the combined RGB-D information;
s502, combining a two-dimensional image feature descriptor SIFT and a three-dimensional point cloud feature descriptor FPFH according to the key points to generate a joint descriptor;
s503, matching corresponding points between the RGB-D key frames according to the joint descriptors;
s504, filtering the RGB-D key frames by utilizing a PnP algorithm, eliminating wrong matching point pairs, obtaining pose estimation between the RGB-D key frames, and finishing estimation of the pose of the RGB-D camera for three-dimensional reconstruction of the indoor scene.
Still further, the step S504 includes the steps of:
s5041, randomly selecting 8 groups of matching point pairs obtained in the step S503, and calculating by utilizing a PnP algorithm to obtain a rotation matrix R and a translational vector t of the pose of the camera;
s5042, forming a judgment function by using a 3D point reprojection error, an epipolar geometric model and a homography matrix error according to the rotation matrix R and the translational vector t of the camera pose;
s5043, judging whether the random matching pairs are eliminated or not according to the judgment function, if so, entering the step S5044, and if not, returning to the step S5041;
s5044, eliminating all matching point pairs which do not meet the judgment function, calculating the pose estimation between the RGB-D key frames by utilizing a PnP algorithm according to all matching point pairs which meet the judgment function, and finishing the estimation of the RGB-D camera pose of the indoor scene three-dimensional reconstruction.
Still further, the expression of the RGB-D camera pose estimation E (R, t) in step S504 is as follows:
E(R, t) = Σ_i ‖ x_i - π( K ( R·g_i + t ) ) ‖²

wherein K denotes the camera intrinsic matrix, g_i denotes the 3D feature point of the i-th key frame, R denotes the rotation matrix, t denotes the translation vector, x_i denotes the 2D feature point of the i-th key frame, and π(·) denotes perspective projection, i.e., division by the third coordinate.
The invention has the beneficial effects that:
the invention provides an RGB-D camera pose estimation method for three-dimensional reconstruction of indoor scenes, which combines the camera pose and depth map joint optimization of local interframes with a camera pose estimation method for combining RGB-D feature matching, eliminates the influence of single-frame depth noise or cavities on feature matching and camera pose estimation after feature matching by using dense RGB-D alignment of the local interframes, and can also reduce redundant RGB-D information; by combining the feature extraction and matching of the RGB and the depth information, the camera pose estimation error caused by RGB repeated texture and weak texture can be reduced. The invention solves the problems of serious depth loss, repeated texture and structure, weak texture, severe illumination change, severe camera motion and the like caused by distance limitation or infrared interference.
Drawings
FIG. 1 is a flow chart of the method of the present invention.
Detailed Description
The following description of the embodiments of the present invention is provided to facilitate understanding of the present invention by those skilled in the art. It should be understood, however, that the present invention is not limited to the scope of the embodiments; to those skilled in the art, various changes are apparent as long as they remain within the spirit and scope of the invention as defined in the appended claims, and all matter produced using the inventive concept is protected.
Examples
In order to solve the problems of severe depth loss caused by distance limits or infrared interference, repeated textures and structures, weak textures, drastic illumination changes, and rapid camera motion, the invention provides, as shown in FIG. 1, an RGB-D camera pose estimation method for three-dimensional reconstruction of an indoor scene, which comprises the following steps:
s1, acquiring the RGB-D information of each frame from the RGB-D camera;
s2, aligning the RGB image and the depth image according to the RGB-D information, preprocessing the depth image, and deleting abnormal depth data;
The abnormal depth data in step S2 includes points satisfying any one of the following conditions (a preprocessing sketch is given after the list):
The first condition:
points outside the effective distance of the RGB-D camera;
The second condition:
3D points whose distance to the nearest point in the point cloud of the RGB-D frame is greater than a preset threshold, the threshold being 0.9 times the maximum point-pair distance of that frame's point cloud;
The third condition:
3D points whose included angles with the horizontal and vertical principal optical axes respectively exceed a preset threshold, the principal-optical-axis angle threshold being 60-70 degrees.
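The three filters can be read, for illustration, as the sketch below. The function name, the effective-range values d_min/d_max, the bounding-box approximation of the maximum point-pair distance, and the interpretation of the principal-axis angles in the horizontal and vertical planes are assumptions; only the 0.9 factor and the 60-70 degree band come from the text above.

```python
import numpy as np
from scipy.spatial import cKDTree

def filter_depth_points(points, d_min=0.3, d_max=5.0, axis_angle_deg=65.0):
    """Remove abnormal 3D points from one RGB-D frame's point cloud (N,3, camera coords)."""
    keep = np.ones(len(points), dtype=bool)

    # 1) effective distance of the depth sensor (range values are assumed)
    rng = np.linalg.norm(points, axis=1)
    keep &= (rng > d_min) & (rng < d_max)

    # 2) isolated points: nearest-neighbour distance vs 0.9 x max point-pair distance
    #    (the max pair distance is approximated here by the bounding-box diagonal)
    diag = np.linalg.norm(points.max(0) - points.min(0))
    nn_dist, _ = cKDTree(points).query(points, k=2)   # k=2: first hit is the point itself
    keep &= nn_dist[:, 1] <= 0.9 * diag

    # 3) angle between the ray to the point and the principal (optical) axis,
    #    measured in the horizontal (x-z) and vertical (y-z) planes
    ang_h = np.degrees(np.abs(np.arctan2(points[:, 0], points[:, 2])))
    ang_v = np.degrees(np.abs(np.arctan2(points[:, 1], points[:, 2])))
    keep &= (ang_h < axis_angle_deg) & (ang_v < axis_angle_deg)

    return points[keep]
```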
S3, performing optical flow tracking on the RGB image according to the aligned RGB-D frame, and determining a local alignment and optimization interval of the pose estimation of the RGB-D camera, wherein the implementation method comprises the following steps:
s301, extracting ORB corner points from the aligned RGB image of the first RGB-D frame, and extracting ORB corner points from the aligned RGB image of the next RGB-D frame;
s302, performing optical flow tracking based on the brightness-constancy assumption on the extracted ORB corner points, and judging whether the optical flow tracking succeeds; if so, entering step S303, otherwise entering step S304;
s303, computing the relative camera pose between the adjacent RGB-D frames with an epipolar geometry method, and judging whether the L2 norm of the Lie-algebra representation of the relative pose change is within a preset threshold; if so, recording the RGB-D frame as a candidate frame of the current local optimization interval and returning to step S302, otherwise entering step S304;
s304, judging whether the current RGB-D frame is the first frame and no candidate RGB-D frame exists; if so, entering step S305, otherwise entering step S306;
s305, recording the RGB-D frame as a candidate frame of a new local optimization interval, and judging whether a next RGB-D frame exists; if so, returning to step S304, otherwise returning to step S302;
s306, forming one local alignment and optimization interval for RGB-D camera pose estimation from all current candidate RGB-D frames and entering step S4, recording the RGB-D frame as a candidate frame of a new local optimization interval, and judging whether a next RGB-D frame exists; if so, returning to step S304, otherwise returning to step S302.
In this embodiment, for the preprocessed RGB-D data whose pixel coordinates are in one-to-one correspondence, optical flow tracking is performed on the RGB images to determine the local RGB-D information alignment and optimization intervals. The optical flow tracking of each RGB frame specifically comprises extraction of ORB corner points and optical flow tracking under the brightness-constancy assumption, which gives a preliminary estimate of the pose change between two frames. Consecutive frames for which optical flow tracking succeeds and the pose change stays within a given threshold form one local optimization interval; if optical flow tracking fails or the pose change exceeds the threshold, the current frame becomes the start of the next local optimization interval.
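A possible reading of this interval-splitting logic, sketched with OpenCV primitives. The feature count, the minimum number of tracked points, and the exact form of the pose-change norm are assumptions; only the ORB corners, the brightness-constancy optical flow, the epipolar-geometry pose, and the threshold test come from the text above.

```python
import cv2
import numpy as np

def split_into_intervals(gray_frames, K, pose_norm_thresh=10.0, min_tracked=30):
    """Group consecutive frames into local optimization intervals (sketch of step S3).

    gray_frames: list of grayscale images; K: 3x3 camera intrinsic matrix.
    Returns a list of intervals, each a list of frame indices.
    """
    orb = cv2.ORB_create(nfeatures=1000)

    def corners(img):
        kps = orb.detect(img, None)
        return np.float32([kp.pt for kp in kps]).reshape(-1, 1, 2)

    intervals, current = [], [0]
    p0 = corners(gray_frames[0])

    for i in range(1, len(gray_frames)):
        ok = len(p0) > 0
        if ok:
            # Lucas-Kanade optical flow under the brightness-constancy assumption
            p1, status, _ = cv2.calcOpticalFlowPyrLK(gray_frames[i - 1], gray_frames[i], p0, None)
            good_old = p0[status.ravel() == 1]
            good_new = p1[status.ravel() == 1]
            ok = len(good_new) >= min_tracked
        if ok:
            # epipolar geometry gives a rough relative pose between adjacent frames
            E, inl = cv2.findEssentialMat(good_old, good_new, K, method=cv2.RANSAC)
            ok = E is not None and E.shape == (3, 3)
        if ok:
            _, R, t, _ = cv2.recoverPose(E, good_old, good_new, K, mask=inl)
            rvec, _ = cv2.Rodrigues(R)
            # L2 norm of the stacked rotation/translation change (Lie-algebra style)
            ok = np.linalg.norm(np.concatenate([rvec.ravel(), t.ravel()])) < pose_norm_thresh

        if ok:
            current.append(i)              # frame stays in the current interval
        else:
            intervals.append(current)      # tracking failed or motion too large:
            current = [i]                  # this frame starts a new interval

        p0 = corners(gray_frames[i])       # re-detect ORB corners for the next pair

    intervals.append(current)
    return intervals
```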
S4, performing RGB-D camera pose estimation on the RGB-D frames in the local optimization interval, and converting the RGB-D information in the interval into an RGB-D key frame coordinate system of the interval to obtain an optimized RGB-D key frame, wherein the implementation method comprises the following steps:
s401, according to the RGB-D sequence in the local optimization interval, selecting the ⌊n_i/2⌋-th RGB-D frame of the interval as the key frame of the local optimization interval, wherein n_i denotes the number of RGB-D frames in the i-th local optimization interval and ⌊·⌋ denotes rounding down;
s402, according to the key frame, calculating by utilizing the minimized inverse depth error and the photometric error to obtain the camera pose in each local optimization interval;
the camera pose T in each local optimization interval satisfies the following expression:
E_align(T) = E_z(T) + α·E_I(T)

E_z(T) = Σ_j ρ_Z( z(X_j) - Z_j(x_j) )

E_I(T) = Σ_i ρ_I( I_i(x_i) - I_k(x_k) )

wherein E_z is the inverse depth error, E_I is the photometric error, α is the relative weight that balances the inverse depth error with the photometric error, z(X_j) denotes the depth of key point X_j in the i-th frame, Z_j(x_j) denotes the depth at the projection position x_j of key point X_j on the depth image of the j-th frame, ρ_Z is the inverse depth error robustness function, I_i(x_i) denotes the photometry of point x_i on the i-th frame, I_k(x_k) denotes the photometry of the corresponding point x_k on the key frame, ρ_I is the photometric error robustness function, x_i denotes the 2D feature point position on the current key frame, and E_align denotes the total error (a numerical sketch of this cost is given after step S403);
and S403, transforming the 3D points of the adjacent RGB-D frames, using the optimized camera poses, into the camera coordinate system of the key frame to obtain the optimized RGB-D key frame.
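To make the cost of step S402 concrete, the following sketch evaluates E_align = E_z + α·E_I for a candidate pose of one neighbouring frame against the key frame. The Huber robustness functions, the weight α = 0.1, the inverse-depth form of the depth residual, and the comparison against the key-frame photometry are assumptions made for illustration; the patent only states that robust inverse depth and photometric errors are combined with weight α. In practice the poses of all frames in the interval would be optimized jointly, e.g. by Gauss-Newton on SE(3), rather than merely evaluated.

```python
import numpy as np

def huber(r, delta=1.0):
    """Robust cost rho(.): quadratic near zero, linear in the tails (an assumed choice)."""
    a = np.abs(r)
    return np.where(a <= delta, 0.5 * r ** 2, delta * (a - 0.5 * delta))

def align_error(T, pts_key, intensity_key, depth_j, image_j, K, alpha=0.1):
    """Evaluate E_align = E_z + alpha * E_I for one neighbouring frame j of the interval.

    T: 4x4 pose mapping key-frame coordinates into frame j; pts_key: (N,3) key points X
    in key-frame coordinates; intensity_key: (N,) photometry of those points on the key
    frame; depth_j, image_j: depth map and grayscale image of frame j; K: 3x3 intrinsics.
    """
    X = (T[:3, :3] @ pts_key.T).T + T[:3, 3]          # key points expressed in frame j
    front = X[:, 2] > 1e-6                            # keep only points in front of the camera
    Xf, If = X[front], intensity_key[front]

    uv = (K @ Xf.T).T
    u = np.round(uv[:, 0] / uv[:, 2]).astype(int)     # projected pixel positions x_j
    v = np.round(uv[:, 1] / uv[:, 2]).astype(int)
    h, w = depth_j.shape
    inside = (u >= 0) & (u < w) & (v >= 0) & (v < h)
    u, v, Xf, If = u[inside], v[inside], Xf[inside], If[inside]

    meas = depth_j[v, u]
    good = meas > 0                                   # skip depth holes in frame j
    r_z = 1.0 / Xf[good, 2] - 1.0 / meas[good]        # inverse depth residual: z(X_j) vs Z_j(x_j)
    r_i = image_j[v[good], u[good]].astype(float) - If[good]   # photometry vs the key frame

    E_z = huber(r_z).sum()
    E_I = huber(r_i, delta=10.0).sum()
    return E_z + alpha * E_I
```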
In this embodiment, for the local optimization intervals determined in step S3, fine-scale matching and alignment of the RGB-D information of the frames within each interval is realized, and the camera pose within each local optimization interval is solved by minimizing the inverse depth error and the photometric error. The ⌊n_i/2⌋-th RGB-D frame of the interval is selected as the key frame of the interval, and the 3D points of the adjacent RGB-D frames, after pose optimization, are transformed into the key frame camera coordinate system.
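A minimal sketch of how, once the per-frame poses are known, an interval can be collapsed onto its key frame; the ⌊n_i/2⌋ key-frame index follows the reading reconstructed above, and the function names are illustrative.

```python
import numpy as np

def keyframe_index(n_i):
    """Key-frame index for an interval of n_i frames (assumed reading of step S401)."""
    return n_i // 2      # floor(n_i / 2)

def merge_interval_to_keyframe(interval_points, poses_to_key):
    """Fuse an interval's point clouds into the key frame's camera coordinate system.

    interval_points: list of (N_k, 3) arrays, one per RGB-D frame of the interval, in each
    frame's own camera coordinates; poses_to_key: list of 4x4 matrices T_k mapping frame k's
    coordinates into the key frame's coordinates (identity for the key frame itself).
    Returns one merged (M, 3) point cloud expressed in the key frame.
    """
    merged = []
    for pts, T in zip(interval_points, poses_to_key):
        merged.append((T[:3, :3] @ pts.T).T + T[:3, 3])   # rigid transform of each cloud
    return np.vstack(merged)
```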
S5, extracting and matching feature points of the optimized RGB-D key frames by combining the RGB-D information to obtain pose estimation among the RGB-D key frames, and finishing estimation of the pose of the RGB-D camera for three-dimensional reconstruction of the indoor scene, wherein the implementation method comprises the following steps:
s501, extracting, from each optimized RGB-D key frame, key points that combine the RGB image and the depth image, i.e., the combined RGB-D information;
s502, combining a two-dimensional image feature descriptor SIFT and a three-dimensional point cloud feature descriptor FPFH according to the key points to generate a joint descriptor;
s503, matching corresponding points between the RGB-D key frames according to the joint descriptors;
s504, filtering the RGB-D key frames by utilizing a PnP algorithm, eliminating wrong matching point pairs to obtain pose estimation between the RGB-D key frames, and finishing estimation of the pose of the RGB-D camera for three-dimensional reconstruction of an indoor scene, wherein the implementation method comprises the following steps:
s5041, randomly selecting 8 groups of matching point pairs obtained in the step S503, and calculating by utilizing a PnP algorithm to obtain a rotation matrix R and a translational vector t of the pose of the camera;
s5042, forming a judgment function by using a 3D point reprojection error, an epipolar geometric model and a homography matrix error according to the rotation matrix R and the translational vector t of the camera pose;
s5043, judging whether the random matching pairs are eliminated or not according to the judgment function, if so, entering the step S5044, and if not, returning to the step S5041;
s5044, eliminating all matching point pairs which do not meet the judgment function, calculating the pose estimation between the RGB-D key frames by utilizing a PnP algorithm according to all matching point pairs which meet the judgment function, and finishing the estimation of the RGB-D camera pose of the indoor scene three-dimensional reconstruction.
The expression of the camera pose estimation E (R, t) in step S504 is as follows:
E(R, t) = Σ_i ‖ x_i - π( K ( R·g_i + t ) ) ‖²

wherein K denotes the camera intrinsic matrix, g_i denotes the 3D feature point of the i-th key frame, R denotes the rotation matrix, t denotes the translation vector, x_i denotes the 2D feature point of the i-th key frame, and π(·) denotes perspective projection, i.e., division by the third coordinate.
In this embodiment, for each optimized RGB-D key frame, key points combining the RGB and depth information are extracted, SIFT descriptors and FPFH descriptors are combined to generate joint descriptors, and corresponding points between RGB-D key frames are matched; the RANSAC algorithm is used to eliminate wrong matching point pairs and the PnP algorithm is used to estimate the camera pose, so that the point clouds of the RGB-D key frames are registered and a complete three-dimensional point cloud model of the indoor scene is obtained.
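A sketch of the key-frame matching and pose estimation of step S5. The equal weighting of the two descriptor parts, the brute-force matcher, and the use of OpenCV's built-in RANSAC inside solvePnPRansac (in place of the custom 8-sample judgment function of steps S5041 to S5044) are assumptions for illustration; the SIFT and FPFH descriptors are taken as already computed at the same key points, and at least 4 matches are required by PnP.

```python
import cv2
import numpy as np

def match_and_estimate_pose(sift_a, fpfh_a, pts3d_a, sift_b, fpfh_b, pts2d_b, K):
    """Joint-descriptor matching between two key frames followed by RANSAC PnP.

    sift_*: (N,128) SIFT descriptors; fpfh_*: (N,33) FPFH descriptors at the same key
    points; pts3d_a: (N,3) 3D key points of frame A; pts2d_b: (M,2) 2D key points of
    frame B; K: 3x3 camera intrinsic matrix.
    """
    # joint descriptor: L2-normalize each part, then concatenate (equal weighting assumed)
    def joint(sift, fpfh):
        s = sift / (np.linalg.norm(sift, axis=1, keepdims=True) + 1e-9)
        f = fpfh / (np.linalg.norm(fpfh, axis=1, keepdims=True) + 1e-9)
        return np.hstack([s, f]).astype(np.float32)

    matcher = cv2.BFMatcher(cv2.NORM_L2, crossCheck=True)
    matches = matcher.match(joint(sift_a, fpfh_a), joint(sift_b, fpfh_b))

    obj = np.float32([pts3d_a[m.queryIdx] for m in matches])    # 3D points in frame A
    img = np.float32([pts2d_b[m.trainIdx] for m in matches])    # their 2D matches in frame B

    # RANSAC-style outlier rejection + PnP pose estimate (minimizes reprojection error)
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(obj, img, K, None,
                                                 reprojectionError=3.0)
    R, _ = cv2.Rodrigues(rvec)
    return R, tvec, inliers
```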
According to the above design, joint optimization of camera poses and depth maps between local frames is combined with camera pose estimation based on combined RGB-D feature matching. Dense RGB-D alignment between local frames removes the influence of single-frame depth noise or holes on feature matching and on the camera poses estimated from those matches, and also reduces redundant RGB-D information; feature extraction and matching that combine RGB and depth information reduce camera pose estimation errors caused by repeated or weak RGB textures. This solves the problems of severe depth loss caused by distance limits or infrared interference, repeated textures and structures, weak textures, drastic illumination changes, and rapid camera motion.

Claims (9)

1. An indoor scene three-dimensional reconstruction RGB-D camera pose estimation method is characterized by comprising the following steps:
s1, acquiring each RGB-D frame in the RGB-D camera;
s2, aligning the RGB image and the depth image according to each RGB-D frame, preprocessing the depth image, and deleting abnormal depth data to obtain an aligned RGB-D frame;
s3, performing optical flow tracking on the RGB image according to the aligned RGB-D frame, and determining a local alignment and optimization interval of the pose estimation of the RGB-D camera;
s4, performing RGB-D camera pose estimation on the RGB-D frames in the local optimization interval, and converting the RGB-D information in the interval into an RGB-D key frame coordinate system of the interval to obtain optimized RGB-D key frames;
and S5, extracting and matching feature points of the optimized RGB-D key frames by combining the RGB-D information to obtain pose estimation among the RGB-D key frames, and finishing estimation of the RGB-D camera pose of the indoor scene three-dimensional reconstruction.
2. The RGB-D camera pose estimation method for three-dimensional reconstruction of indoor scenes according to claim 1, wherein the abnormal depth data in the step S2 includes:
points outside the RGB-D camera effective distance;
3D points with the distance from the closest point in the RGB-D frame point cloud being greater than a preset threshold value, wherein the threshold value is 0.9 times of the maximum point pair distance of the frame point cloud; and
3D points whose included angles with the horizontal and vertical principal optical axes respectively exceed a preset threshold, the principal-optical-axis angle threshold being 60-70 degrees.
3. The RGB-D camera pose estimation method for three-dimensional reconstruction of indoor scenes according to claim 1, wherein the step S3 includes the steps of:
s301, extracting an ORB corner point of an aligned RGB image of a first frame RGB-D, and extracting an ORB corner point of an aligned RGB image of a next frame RGB-D;
s302, performing optical flow tracking based on unchanged luminosity according to the extracted ORB angular points, and judging whether the optical flow tracking is successful, if so, entering a step S303, otherwise, entering a step S304;
s303, calculating by using an epipolar geometry method to obtain the relative pose of the adjacent RGB-D interframe cameras, judging whether the L-2 norm of a lie algebra with the changed relative pose is within a preset threshold value, if so, recording the RGB-D frames as frames to be selected of a local optimization interval, returning to the step S302, and otherwise, entering the step S304;
s304, judging whether the current RGB-D frame is the first frame and has no RGB-D frame to be selected, if so, entering the step S305, otherwise, entering the step S306;
s305, recording the RGB-D frame as a group of frames to be selected of a new local optimization interval, judging whether a next RGB-D frame exists, if so, returning to the step S304, otherwise, returning to the step S302;
s306, forming a group of RGB-D camera pose estimation local alignment and optimization intervals by all current RGB-D frames to be selected, entering step S4, recording the RGB-D frames as a group of new frames to be selected of local optimization intervals, judging whether a next RGB-D frame exists, if so, returning to step S304, otherwise, returning to step S302.
4. The RGB-D camera pose estimation method for three-dimensional reconstruction of indoor scene according to claim 3, wherein the threshold value in the step S303 is 10.
5. The RGB-D camera pose estimation method for three-dimensional reconstruction of indoor scenes according to claim 1, wherein the step S4 includes the steps of:
s401, according to the RGB-D sequence in the local optimization interval, selecting the ⌊n_i/2⌋-th RGB-D frame of the interval as the key frame of the local optimization interval, wherein n_i denotes the number of RGB-D frames in the i-th local optimization interval and ⌊·⌋ denotes rounding down;
s402, according to the key frame, calculating by utilizing the minimized inverse depth error and the photometric error to obtain the camera pose in each local optimization interval;
and S403, transforming the 3D points of the adjacent RGB-D frames in the camera pose in the local optimization interval to the camera coordinate system of the key frame to obtain the optimized RGB-D key frame.
6. The RGB-D camera pose estimation method for three-dimensional reconstruction of indoor scenes according to claim 5, wherein the camera pose T in each local optimization interval in the step S402 satisfies the following expression:
E_align(T) = E_z(T) + α·E_I(T)

E_z(T) = Σ_j ρ_Z( z(X_j) - Z_j(x_j) )

E_I(T) = Σ_i ρ_I( I_i(x_i) - I_k(x_k) )

wherein E_z is the inverse depth error, E_I is the photometric error, α is the relative weight that balances the inverse depth error with the photometric error, z(X_j) denotes the depth of key point X_j in the i-th frame, Z_j(x_j) denotes the depth at the projection position x_j of key point X_j on the depth image of the j-th frame, ρ_Z is the inverse depth error robustness function, I_i(x_i) denotes the photometry of point x_i on the i-th frame, I_k(x_k) denotes the photometry of the corresponding point x_k on the key frame, ρ_I is the photometric error robustness function, x_i denotes the 2D feature point position on the current key frame, and E_align denotes the total error.
7. The RGB-D camera pose estimation method for three-dimensional reconstruction of indoor scenes according to claim 1, wherein the step S5 includes the steps of:
s501, extracting key points of the optimized RGB-D key frame in combination with the RGB image and the depth image in combination with the RGB-D information;
s502, combining a two-dimensional image feature descriptor SIFT and a three-dimensional point cloud feature descriptor FPFH according to the key points to generate a joint descriptor;
s503, matching corresponding points between the RGB-D key frames according to the joint descriptors;
s504, filtering the RGB-D key frames by utilizing a PnP algorithm, eliminating wrong matching point pairs, obtaining pose estimation between the RGB-D key frames, and finishing estimation of the pose of the RGB-D camera for three-dimensional reconstruction of the indoor scene.
8. The RGB-D camera pose estimation method for three-dimensional reconstruction of indoor scenes according to claim 7, wherein the step S504 includes the steps of:
s5041, randomly selecting 8 groups of matching point pairs obtained in the step S503, and calculating by utilizing a PnP algorithm to obtain a rotation matrix R and a translational vector t of the pose of the camera;
s5042, forming a judgment function by using a 3D point reprojection error, an epipolar geometric model and a homography matrix error according to the rotation matrix R and the translational vector t of the camera pose;
s5043, judging whether the random matching pairs are eliminated or not according to the judgment function, if so, entering the step S5044, and if not, returning to the step S5041;
s5044, eliminating all matching point pairs which do not meet the judgment function, calculating the pose estimation between the RGB-D key frames by utilizing a PnP algorithm according to all matching point pairs which meet the judgment function, and finishing the estimation of the RGB-D camera pose of the indoor scene three-dimensional reconstruction.
9. The RGB-D camera pose estimation method for three-dimensional reconstruction of indoor scene according to claim 7, wherein the expression of the RGB-D camera pose estimation E (R, t) in step S504 is as follows:
E(R, t) = Σ_i ‖ x_i - π( K ( R·g_i + t ) ) ‖²

wherein K denotes the camera intrinsic matrix, g_i denotes the 3D feature point of the i-th key frame, R denotes the rotation matrix, t denotes the translation vector, x_i denotes the 2D feature point of the i-th key frame, and π(·) denotes perspective projection, i.e., division by the third coordinate.
CN201911361680.9A 2019-12-26 2019-12-26 RGB-D camera pose estimation method for three-dimensional reconstruction of indoor scene Active CN111105460B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911361680.9A CN111105460B (en) 2019-12-26 2019-12-26 RGB-D camera pose estimation method for three-dimensional reconstruction of indoor scene

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911361680.9A CN111105460B (en) 2019-12-26 2019-12-26 RGB-D camera pose estimation method for three-dimensional reconstruction of indoor scene

Publications (2)

Publication Number Publication Date
CN111105460A true CN111105460A (en) 2020-05-05
CN111105460B CN111105460B (en) 2023-04-25

Family

ID=70425095

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911361680.9A Active CN111105460B (en) 2019-12-26 2019-12-26 RGB-D camera pose estimation method for three-dimensional reconstruction of indoor scene

Country Status (1)

Country Link
CN (1) CN111105460B (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111915651A (en) * 2020-07-31 2020-11-10 西安电子科技大学 Visual pose real-time estimation method based on digital image map and feature point tracking
CN113284176A (en) * 2021-06-04 2021-08-20 深圳积木易搭科技技术有限公司 Online matching optimization method combining geometry and texture and three-dimensional scanning system
CN113610001A (en) * 2021-08-09 2021-11-05 西安电子科技大学 Indoor mobile terminal positioning method based on depth camera and IMU combination
CN113724365A (en) * 2020-05-22 2021-11-30 杭州海康威视数字技术股份有限公司 Three-dimensional reconstruction method and device
CN113724369A (en) * 2021-08-01 2021-11-30 国网江苏省电力有限公司徐州供电分公司 Scene-oriented three-dimensional reconstruction viewpoint planning method and system

Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080144925A1 (en) * 2006-08-15 2008-06-19 Zhiwei Zhu Stereo-Based Visual Odometry Method and System
CN102034267A (en) * 2010-11-30 2011-04-27 中国科学院自动化研究所 Three-dimensional reconstruction method of target based on attention
US20140104387A1 (en) * 2012-10-17 2014-04-17 DotProduct LLC Handheld portable optical scanner and method of using
CN105654492A (en) * 2015-12-30 2016-06-08 哈尔滨工业大学 Robust real-time three-dimensional (3D) reconstruction method based on consumer camera
CN105957017A (en) * 2016-06-24 2016-09-21 电子科技大学 Video splicing method based on adaptive key frame sampling
CN107025668A (en) * 2017-03-30 2017-08-08 华南理工大学 A kind of design method of the visual odometry based on depth camera
CN107292921A (en) * 2017-06-19 2017-10-24 电子科技大学 A kind of quick three-dimensional reconstructing method based on kinect cameras
KR101865173B1 (en) * 2017-02-03 2018-06-07 (주)플레이솔루션 Method for generating movement of motion simulator using image analysis of virtual reality contents
CN109387204A (en) * 2018-09-26 2019-02-26 东北大学 The synchronous positioning of the mobile robot of dynamic environment and patterning process in faced chamber
CN109961506A (en) * 2019-03-13 2019-07-02 东南大学 A kind of fusion improves the local scene three-dimensional reconstruction method of Census figure
US20190206078A1 (en) * 2018-01-03 2019-07-04 Baidu Online Network Technology (Beijing) Co., Ltd . Method and device for determining pose of camera
CN109993113A (en) * 2019-03-29 2019-07-09 东北大学 A kind of position and orientation estimation method based on the fusion of RGB-D and IMU information
CN110223348A (en) * 2019-02-25 2019-09-10 湖南大学 Robot scene adaptive bit orientation estimation method based on RGB-D camera
SG11201908974XA (en) * 2017-03-29 2019-10-30 Agency Science Tech & Res Real time robust localization via visual inertial odometry

Patent Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080144925A1 (en) * 2006-08-15 2008-06-19 Zhiwei Zhu Stereo-Based Visual Odometry Method and System
CN102034267A (en) * 2010-11-30 2011-04-27 中国科学院自动化研究所 Three-dimensional reconstruction method of target based on attention
US20140104387A1 (en) * 2012-10-17 2014-04-17 DotProduct LLC Handheld portable optical scanner and method of using
CN105654492A (en) * 2015-12-30 2016-06-08 哈尔滨工业大学 Robust real-time three-dimensional (3D) reconstruction method based on consumer camera
CN105957017A (en) * 2016-06-24 2016-09-21 电子科技大学 Video splicing method based on adaptive key frame sampling
KR101865173B1 (en) * 2017-02-03 2018-06-07 (주)플레이솔루션 Method for generating movement of motion simulator using image analysis of virtual reality contents
SG11201908974XA (en) * 2017-03-29 2019-10-30 Agency Science Tech & Res Real time robust localization via visual inertial odometry
CN107025668A (en) * 2017-03-30 2017-08-08 华南理工大学 A kind of design method of the visual odometry based on depth camera
CN107292921A (en) * 2017-06-19 2017-10-24 电子科技大学 A kind of quick three-dimensional reconstructing method based on kinect cameras
US20190206078A1 (en) * 2018-01-03 2019-07-04 Baidu Online Network Technology (Beijing) Co., Ltd . Method and device for determining pose of camera
CN109387204A (en) * 2018-09-26 2019-02-26 东北大学 The synchronous positioning of the mobile robot of dynamic environment and patterning process in faced chamber
CN110223348A (en) * 2019-02-25 2019-09-10 湖南大学 Robot scene adaptive bit orientation estimation method based on RGB-D camera
CN109961506A (en) * 2019-03-13 2019-07-02 东南大学 A kind of fusion improves the local scene three-dimensional reconstruction method of Census figure
CN109993113A (en) * 2019-03-29 2019-07-09 东北大学 A kind of position and orientation estimation method based on the fusion of RGB-D and IMU information

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
LÜ CHAOHUI;PAN JIAYING;: "Extraction technique of region of interest from stereoscopic video" *
蔡军 (CAI Jun); 陈科宇 (CHEN Keyu); 张毅 (ZHANG Yi): "Improved visual SLAM for mobile robots based on Kinect" (基于Kinect的改进移动机器人视觉SLAM) *
高成强 (GAO Chengqiang); 张云洲 (ZHANG Yunzhou); 王晓哲 (WANG Xiaozhe); 邓毅 (DENG Yi); 姜浩 (JIANG Hao): "Semi-direct RGB-D SLAM algorithm for indoor dynamic environments" (面向室内动态环境的半直接法RGB-D SLAM算法) *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113724365A (en) * 2020-05-22 2021-11-30 杭州海康威视数字技术股份有限公司 Three-dimensional reconstruction method and device
CN113724365B (en) * 2020-05-22 2023-09-26 杭州海康威视数字技术股份有限公司 Three-dimensional reconstruction method and device
CN111915651A (en) * 2020-07-31 2020-11-10 西安电子科技大学 Visual pose real-time estimation method based on digital image map and feature point tracking
CN111915651B (en) * 2020-07-31 2023-09-12 西安电子科技大学 Visual pose real-time estimation method based on digital image map and feature point tracking
CN113284176A (en) * 2021-06-04 2021-08-20 深圳积木易搭科技技术有限公司 Online matching optimization method combining geometry and texture and three-dimensional scanning system
CN113284176B (en) * 2021-06-04 2022-08-16 深圳积木易搭科技技术有限公司 Online matching optimization method combining geometry and texture and three-dimensional scanning system
WO2022252362A1 (en) * 2021-06-04 2022-12-08 深圳积木易搭科技技术有限公司 Geometry and texture based online matching optimization method and three-dimensional scanning system
CN113724369A (en) * 2021-08-01 2021-11-30 国网江苏省电力有限公司徐州供电分公司 Scene-oriented three-dimensional reconstruction viewpoint planning method and system
CN113610001A (en) * 2021-08-09 2021-11-05 西安电子科技大学 Indoor mobile terminal positioning method based on depth camera and IMU combination
CN113610001B (en) * 2021-08-09 2024-02-09 西安电子科技大学 Indoor mobile terminal positioning method based on combination of depth camera and IMU

Also Published As

Publication number Publication date
CN111105460B (en) 2023-04-25

Similar Documents

Publication Publication Date Title
CN111105460B (en) RGB-D camera pose estimation method for three-dimensional reconstruction of indoor scene
CN108986037B (en) Monocular vision odometer positioning method and positioning system based on semi-direct method
CN107025668B (en) Design method of visual odometer based on depth camera
CN110009732B (en) GMS feature matching-based three-dimensional reconstruction method for complex large-scale scene
CN107341814B (en) Four-rotor unmanned aerial vehicle monocular vision range measurement method based on sparse direct method
CN108776989B (en) Low-texture planar scene reconstruction method based on sparse SLAM framework
CN112001926B (en) RGBD multi-camera calibration method, system and application based on multi-dimensional semantic mapping
CN110288712B (en) Sparse multi-view three-dimensional reconstruction method for indoor scene
CN107862735B (en) RGBD three-dimensional scene reconstruction method based on structural information
CN108537848A (en) A kind of two-stage pose optimal estimating method rebuild towards indoor scene
CN111242991B (en) Method for quickly registering visible light and infrared camera
CN112734839B (en) Monocular vision SLAM initialization method for improving robustness
CN111553939B (en) Image registration algorithm of multi-view camera
CN112484746B (en) Monocular vision auxiliary laser radar odometer method based on ground plane
CN112381841A (en) Semantic SLAM method based on GMS feature matching in dynamic scene
KR101869605B1 (en) Three-Dimensional Space Modeling and Data Lightening Method using the Plane Information
CN112396595A (en) Semantic SLAM method based on point-line characteristics in dynamic environment
CN111797688A (en) Visual SLAM method based on optical flow and semantic segmentation
CN112652020B (en) Visual SLAM method based on AdaLAM algorithm
CN112541973B (en) Virtual-real superposition method and system
CN113744315B (en) Semi-direct vision odometer based on binocular vision
CN106408596A (en) Edge-based local stereo matching method
CN111882602A (en) Visual odometer implementation method based on ORB feature points and GMS matching filter
CN114693720A (en) Design method of monocular vision odometer based on unsupervised deep learning
CN116128966A (en) Semantic positioning method based on environmental object

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant