CN111105460B - RGB-D camera pose estimation method for three-dimensional reconstruction of indoor scene - Google Patents

RGB-D camera pose estimation method for three-dimensional reconstruction of indoor scene

Info

Publication number
CN111105460B
Authority
CN
China
Prior art keywords
rgb
frame
camera
pose
key
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911361680.9A
Other languages
Chinese (zh)
Other versions
CN111105460A (en)
Inventor
李纯明
方硕
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China filed Critical University of Electronic Science and Technology of China
Priority to CN201911361680.9A
Publication of CN111105460A
Application granted
Publication of CN111105460B

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/70Determining position or orientation of objects or cameras
    • G06T7/73Determining position or orientation of objects or cameras using feature-based methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • G06T7/269Analysis of motion using gradient-based methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10016Video; Image sequence
    • G06T2207/10021Stereoscopic video; Stereoscopic image sequence
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10024Color image
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10028Range image; Depth image; 3D point clouds
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Abstract

The invention provides an RGB-D camera pose estimation method for three-dimensional reconstruction of indoor scenes, which combines joint optimization of the camera pose and the depth map between local frames with camera pose estimation based on joint RGB-D feature matching. Dense RGB-D alignment between local frames removes the influence of single-frame depth noise or holes on the subsequent feature matching and camera pose estimation, and also reduces redundant RGB-D information; feature extraction and matching that use both RGB and depth information reduce the camera pose estimation errors caused by repeated and weak RGB textures. The invention addresses problems such as severe depth loss caused by distance limits or infrared interference, repeated textures and structures, weak textures, strong illumination changes, and rapid camera motion.

Description

RGB-D camera pose estimation method for three-dimensional reconstruction of indoor scene
Technical Field
The invention belongs to the technical field of positioning and tracking, and particularly relates to an RGB-D camera pose estimation method for three-dimensional reconstruction of an indoor scene.
Background
At present, with the rise of many consumer-grade RGB-D camera products, numerous research groups in China and abroad are working on more robust, accurate, efficient and large-scale three-dimensional reconstruction techniques for RGB-D cameras. Camera pose estimation, i.e. estimating the relative transformation matrix T (rotation matrix R and translation vector t) between frames, is the most important link in three-dimensional reconstruction based on RGB-D cameras.
Current camera pose estimation methods based on RGB-D cameras mainly include the feature point method, the direct method, the iterative closest point (ICP) algorithm and the RGB-D alignment method. The feature point method and the direct method estimate the camera pose from RGB information only and discard the depth information. The feature point method estimates the pose from feature point matches; it suits scenes that provide rich feature points and supports relocalization from those features, but it exploits too little information and its computation is time-consuming, so most of the information in the RGB images is lost, and pose estimation often fails in weak-texture and repeated-texture environments. The direct method can obtain a dense or semi-dense map without computing feature descriptors and therefore still works when features are missing; however, it relies on the grayscale-invariance assumption and requires that the camera does not move too fast and that automatic exposure is disabled, so large illumination changes and fast camera motion are unfavorable to it. The traditional iterative closest point (ICP) algorithm uses only depth information and no RGB information: it repeatedly selects corresponding point pairs, computes the optimal rigid transformation, applies it, searches for new correspondences and computes a new optimal transformation until the convergence accuracy required for correct registration is met. Although ICP fully exploits the geometric structure of the point cloud and does not depend on RGB features or photometry, it is sensitive to the initial pose and needs a good initial value; methods that use RGB feature point matching to provide a good initial value for the ICP algorithm therefore reintroduce the dependence on RGB features and still cannot handle weak textures well. The camera pose tracking components of the widely influential RGB-D three-dimensional reconstruction algorithms KinectFusion, ElasticFusion and many of their variants are mainly based on the ICP algorithm. The RGB-D alignment method uses RGB information and depth information simultaneously and solves the relative camera pose between two frames by minimizing the depth error and the photometric error; the BundleFusion algorithm is built on RGB-D alignment. However, the noise of depth cameras often degrades the quality of RGB-D alignment.
Therefore, how to cope with problems such as severe depth loss caused by distance limits or infrared interference, repeated textures and structures, weak textures, strong illumination changes and rapid camera motion, so as to estimate the camera pose change accurately and robustly and realize three-dimensional reconstruction of indoor scenes, is a question of great interest.
Disclosure of Invention
Aiming at the above defects in the prior art, the RGB-D camera pose estimation method for three-dimensional reconstruction of indoor scenes provided by the invention solves problems such as severe depth loss caused by distance limits or infrared interference, repeated textures and structures, weak textures, strong illumination changes and rapid camera motion.
In order to achieve the above purpose, the invention adopts the following technical scheme:
the scheme provides an RGB-D camera pose estimation method for three-dimensional reconstruction of an indoor scene, which comprises the following steps:
s1, acquiring each RGB-D frame in an RGB-D camera;
s2, aligning an RGB image with a depth image according to each RGB-D frame, preprocessing the depth image, deleting abnormal depth data, and obtaining aligned RGB-D frames;
s3, performing optical flow tracking on the RGB image according to the aligned RGB-D frame, and determining a local alignment and optimization interval of the RGB-D camera pose estimation;
s4, carrying out RGB-D camera pose estimation on the RGB-D frames in the local optimization interval, and converting the RGB-D information in the interval into an RGB-D key frame coordinate system of the interval to obtain optimized RGB-D key frames;
and S5, extracting and matching feature points of the optimized RGB-D key frames by combining with RGB-D information to obtain pose estimation among the RGB-D key frames, and completing the estimation of the pose of the RGB-D camera for three-dimensional reconstruction of the indoor scene.
Further, the abnormal depth data in step S2 includes:
points outside the RGB-D camera effective distance;
3D points with the distance to the nearest point in the RGB-D frame point cloud being larger than a preset threshold, wherein the threshold is 0.9 times of the maximum point-to-point distance of the frame point cloud; and
3D points in the RGB-D frame whose included angle with the principal optical axis, in either the horizontal or the vertical direction, exceeds a preset threshold, the angle threshold being 60-70 degrees.
Still further, the step S3 includes the steps of:
s301, extracting ORB corner points of the RGB image of the first frame of the aligned RGB-D, and extracting ORB corner points of the RGB image of the next frame of the aligned RGB-D;
s302, performing optical flow tracking based on unchanged luminosity according to the extracted ORB corner points, judging whether the optical flow tracking is successful, if yes, entering a step S303, otherwise, entering a step S304;
s303, calculating the relative camera pose between adjacent RGB-D frames by the epipolar geometry method, and judging whether the L2 norm of the Lie-algebra representation of the relative pose change is within a preset threshold, if so, recording the RGB-D frame as a frame to be selected of the local optimization interval and returning to the step S302, otherwise, entering the step S304;
s304, judging whether the current RGB-D frame is a first frame and no RGB-D frame to be selected exists, if so, entering a step S305, otherwise, entering a step S306;
s305, recording the RGB-D frame as a group of new frames to be selected in a local optimization interval, judging whether the next RGB-D frame exists, if so, returning to the step S304, otherwise, returning to the step S302;
s306, forming a group of local alignment and optimization intervals of RGB-D camera pose estimation by all current RGB-D frames to be selected, entering a step S4, recording the RGB-D frames as a group of new frames to be selected of the local optimization interval, judging whether the next RGB-D frame exists, if so, returning to the step S304, otherwise, returning to the step S302.
Still further, the threshold in step S303 is 10.
Still further, the step S4 includes the steps of:
s401, according to the RGB-D sequence in the local optimization interval, selecting the ⌊n_i/2⌋-th RGB-D frame of the interval as the key frame of the local optimization interval, where n_i denotes the number of RGB-D frames of the i-th local optimization interval and ⌊·⌋ denotes rounding down;
s402, calculating the camera pose in each local optimization interval by using a minimized inverse depth error and a luminosity error according to the key frame;
s403, transforming 3D points of adjacent RGB-D frames in the camera pose in the local optimization interval to the camera coordinate system of the key frame to obtain the optimized RGB-D key frame.
Still further, the camera pose T in each local optimization interval in step S402 satisfies the following expressions:
T = argmin_T E_align = argmin_T ( E_Z + α·E_I )
E_Z = Σ_j ρ_Z( 1/z(X_j) − 1/Z_j(x_j) )
E_I = Σ_j ρ_I( I_i(x_i) − I_j(x_j) )
wherein E_Z is the inverse depth error, E_I is the photometric (luminosity) error, α is the relative weight balancing the inverse depth error and the photometric error, z(X_j) denotes the depth of the key point X_j at the i-th frame, Z_j(x_j) denotes the depth on the depth image of the j-th frame at the projection position x_j of the key point X_j, ρ_Z is the inverse-depth-error robust function, I_i(x_i) denotes the luminosity of the point x_i on the i-th frame, I_j(x_j) denotes the luminosity at the projection position x_j on the j-th frame, ρ_I is the photometric-error robust function, x_i denotes the 2D feature point position of the current key frame, and E_align denotes the total error.
Still further, the step S5 includes the steps of:
s501, extracting key points combining an RGB image and a depth image from the optimized RGB-D key frame by combining RGB-D information;
s502, combining a two-dimensional image feature descriptor SIFT and a three-dimensional point cloud feature descriptor FPFH according to the key points to generate a joint descriptor;
s503, matching corresponding points between the RGB-D key frames according to the joint descriptors;
s504, filtering the RGB-D key frames by using a PnP algorithm, removing wrong matching point pairs, obtaining pose estimation among the RGB-D key frames, and completing the estimation of the pose of the RGB-D camera for three-dimensional reconstruction of the indoor scene.
Still further, the step S504 includes the steps of:
s5041, randomly selecting 8 groups of matching point pairs obtained in the step S503, and calculating a rotation matrix R and a translation vector t of the pose of the camera by using a PnP algorithm;
s5042, forming a judging function by utilizing the 3D point re-projection error, the epipolar geometric model and the homography matrix error according to the rotation matrix R and the translation vector t of the camera pose;
s5043, judging whether to reject the random matching pair according to the judging function, if so, entering a step S5044, otherwise, returning to the step S5041;
s5044, eliminating all matching point pairs which do not meet the judging function, calculating to obtain pose estimation among RGB-D key frames by utilizing a PnP algorithm according to all matching point pairs which meet the judging function, and completing estimation of the pose of the RGB-D camera for three-dimensional reconstruction of the indoor scene.
Still further, the expression of the RGB-D camera pose estimation E (R, t) in step S504 is as follows:
E(R, t) = Σ_i ‖ x_i − (1/s_i)·K·(R·g_i + t) ‖_2^2
wherein K represents the camera internal reference (intrinsic) matrix, g_i represents the 3D feature points of the i-th key frame, R represents the rotation matrix, t represents the translation vector, s_i represents the depth of the transformed point R·g_i + t, and x_i represents the 2D feature points of the i-th key frame.
The invention has the beneficial effects that:
the invention provides an RGB-D camera pose estimation method for three-dimensional reconstruction of indoor scenes, which combines the camera pose and depth map joint optimization between local frames with the camera pose estimation method for feature matching of joint RGB-D, eliminates the influence of single-frame depth noise or cavities on feature matching and camera pose estimation after the feature matching by using dense RGB-D alignment between the local frames, and can reduce redundant RGB-D information; the feature extraction and matching of RGB and depth information can reduce the camera pose estimation error caused by RGB repeated textures and weak textures. The invention solves the problems of serious depth loss, repeated textures and structures, weak textures, intense illumination change, intense camera movement and the like caused by distance limitation or infrared interference.
Drawings
FIG. 1 is a flow chart of the method of the present invention.
Detailed Description
The following description of the embodiments of the present invention is provided to help those skilled in the art understand the invention, but it should be understood that the invention is not limited to the scope of these embodiments; to those skilled in the art, all inventions that make use of the inventive concept fall within the protection scope of the invention as defined by the appended claims.
Examples
In order to solve the problems of serious depth loss, repeated textures and structures, weak textures, severe illumination change, severe camera movement and the like caused by distance limitation or infrared interference, as shown in fig. 1, the invention provides an RGB-D camera pose estimation method for three-dimensional reconstruction of an indoor scene, which comprises the following implementation steps:
s1, acquiring RGB-D information of each frame in an RGB-D camera;
s2, aligning an RGB image with a depth image according to the RGB-D information, preprocessing the depth image, and deleting abnormal depth data;
The abnormal depth data in step S2 includes points meeting any one of the following conditions:
The first condition:
points outside the effective distance range of the RGB-D camera;
The second condition:
3D points whose distance to the nearest point in the point cloud of the RGB-D frame is larger than a preset threshold, the threshold being 0.9 times the maximum point-to-point distance of the frame point cloud;
The third condition:
3D points in the RGB-D frame whose included angle with the principal optical axis, in either the horizontal or the vertical direction, exceeds a preset threshold, the angle threshold being 60-70 degrees.
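As an illustration only (not the patented implementation), the following Python sketch shows one possible form of the depth pre-processing of step S2; the pinhole intrinsics, the distance limits, and the reading of the 0.9 factor as 0.9 times the largest nearest-neighbour distance in the frame are assumptions made for this example.

```python
# Sketch of step S2 depth pre-processing: back-project the depth image and drop
# abnormal points according to the three conditions above. All parameter values
# (intrinsics, z_min/z_max, the 65-degree angle) are illustrative assumptions.
import numpy as np
from scipy.spatial import cKDTree

def preprocess_depth(depth, fx, fy, cx, cy, z_min=0.3, z_max=4.0, max_angle_deg=65.0):
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth.astype(np.float64)
    x = (u - cx) / fx * z
    y = (v - cy) / fy * z
    pts = np.stack([x, y, z], axis=-1).reshape(-1, 3)

    # Condition 1: points outside the effective distance range of the camera.
    keep = (pts[:, 2] > z_min) & (pts[:, 2] < z_max)

    # Condition 3: angle to the principal (z) axis, in the horizontal or the
    # vertical direction, exceeds the threshold (60-70 degrees in the patent).
    ang_h = np.degrees(np.arctan2(np.abs(pts[:, 0]), pts[:, 2]))
    ang_v = np.degrees(np.arctan2(np.abs(pts[:, 1]), pts[:, 2]))
    keep &= (ang_h < max_angle_deg) & (ang_v < max_angle_deg)
    pts = pts[keep]

    # Condition 2: isolated points; "0.9 times the maximum point-to-point distance"
    # is read here as 0.9 times the largest nearest-neighbour distance in the frame.
    nn_dist = cKDTree(pts).query(pts, k=2)[0][:, 1]
    return pts[nn_dist <= 0.9 * nn_dist.max()]
```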
S3, carrying out optical flow tracking on the RGB image according to the aligned RGB-D frame, and determining a local alignment and optimization interval of the RGB-D camera pose estimation, wherein the implementation method is as follows:
s301, extracting ORB corner points of the RGB image of the first frame of the aligned RGB-D, and extracting ORB corner points of the RGB image of the next frame of the aligned RGB-D;
s302, performing optical flow tracking based on unchanged luminosity according to the extracted ORB corner points, judging whether the optical flow tracking is successful, if yes, entering a step S303, otherwise, entering a step S304;
s303, calculating the relative camera pose between adjacent RGB-D frames by the epipolar geometry method, and judging whether the L2 norm of the Lie-algebra representation of the relative pose change is within a preset threshold, if so, recording the RGB-D frame as a frame to be selected of the local optimization interval and returning to the step S302, otherwise, entering the step S304;
s304, judging whether the current RGB-D frame is a first frame and no RGB-D frame to be selected exists, if so, entering a step S305, otherwise, entering a step S306;
s305, recording the RGB-D frame as a group of new frames to be selected in a local optimization interval, judging whether the next RGB-D frame exists, if so, returning to the step S304, otherwise, returning to the step S302;
s306, forming a group of local alignment and optimization intervals of RGB-D camera pose estimation by all current RGB-D frames to be selected, entering a step S4, recording the RGB-D frames as a group of new frames to be selected of the local optimization interval, judging whether the next RGB-D frame exists, if so, returning to the step S304, otherwise, returning to the step S302.
In this embodiment, for the preprocessed RGB-D data whose RGB and depth pixel coordinates are in one-to-one correspondence, optical flow tracking is performed on the RGB images to determine the local RGB-D information alignment and optimization intervals. For each RGB frame, the tracking consists of extracting ORB corners and tracking them by optical flow under the brightness-constancy assumption, so that the pose change between two frames can be estimated preliminarily. Consecutive frames for which optical flow tracking succeeds and the pose change stays within a given threshold form one local optimization interval; if optical flow tracking fails or the pose change exceeds the threshold, tracking is restarted in the next local optimization interval.
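For illustration, the following Python sketch (a sketch under stated assumptions, not the patented implementation; it assumes OpenCV is available and the pinhole intrinsic matrix K is known) shows one way to realize the per-frame tracking of step S3: ORB corners tracked by pyramidal Lucas-Kanade optical flow under the brightness-constancy assumption, the relative pose of adjacent frames recovered from epipolar geometry, and a check of the L2 norm of an approximate Lie-algebra vector of the pose change against the threshold (10 in the patent). Note that the translation recovered from the essential matrix is only defined up to scale, so the norm is an approximation.

```python
# Sketch of the per-frame optical-flow tracking and pose-change check of step S3.
import cv2
import numpy as np

def track_and_check(prev_gray, gray, K, pose_change_thresh=10.0):
    """True: frame stays in the current interval; False: pose change too large;
    None: optical flow tracking failed. prev_gray/gray are 8-bit grayscale images."""
    orb = cv2.ORB_create(nfeatures=1000)
    kps = orb.detect(prev_gray, None)
    if len(kps) < 8:
        return None  # too few corners for epipolar geometry
    p0 = cv2.KeyPoint_convert(kps).reshape(-1, 1, 2).astype(np.float32)

    # Pyramidal LK optical flow (brightness-constancy assumption).
    p1, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, gray, p0, None)
    good0, good1 = p0[status.ravel() == 1], p1[status.ravel() == 1]
    if len(good0) < 8:
        return None  # tracking failed

    # Relative pose of adjacent frames from epipolar geometry
    # (translation direction only, up to scale).
    E, mask = cv2.findEssentialMat(good0, good1, K, method=cv2.RANSAC, threshold=1.0)
    if E is None or E.shape != (3, 3):
        return None
    _, R, t, _ = cv2.recoverPose(E, good0, good1, K, mask=mask)

    # Approximate Lie-algebra vector of the pose change: rotation vector and translation.
    rvec, _ = cv2.Rodrigues(R)
    xi_norm = float(np.linalg.norm(np.concatenate([rvec.ravel(), t.ravel()])))
    return xi_norm <= pose_change_thresh
```

Consecutive frames for which this check returns True are collected into the same local optimization interval; a False or None result closes the interval and a new one is started.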
S4, carrying out RGB-D camera pose estimation on the RGB-D frame in the local optimization interval, and converting the RGB-D information in the interval into an RGB-D key frame coordinate system of the interval to obtain an optimized RGB-D key frame, wherein the implementation method is as follows:
s401, according to the RGB-D sequence in the local optimization interval, selecting the ⌊n_i/2⌋-th RGB-D frame of the interval as the key frame of the local optimization interval, where n_i denotes the number of RGB-D frames of the i-th local optimization interval and ⌊·⌋ denotes rounding down;
s402, calculating the camera pose in each local optimization interval by using a minimized inverse depth error and a luminosity error according to the key frame;
the camera pose T in each local optimization interval satisfies the following expression:
T = argmin_T E_align = argmin_T ( E_Z + α·E_I )
E_Z = Σ_j ρ_Z( 1/z(X_j) − 1/Z_j(x_j) )
E_I = Σ_j ρ_I( I_i(x_i) − I_j(x_j) )
wherein E_Z is the inverse depth error, E_I is the photometric (luminosity) error, α is the relative weight balancing the inverse depth error and the photometric error, z(X_j) denotes the depth of the key point X_j at the i-th frame, Z_j(x_j) denotes the depth on the depth image of the j-th frame at the projection position x_j of the key point X_j, ρ_Z is the inverse-depth-error robust function, I_i(x_i) denotes the luminosity of the point x_i on the i-th frame, I_j(x_j) denotes the luminosity at the projection position x_j on the j-th frame, ρ_I is the photometric-error robust function, x_i denotes the 2D feature point position of the current key frame, and E_align denotes the total error;
s403, transforming 3D points of adjacent RGB-D frames in the optimized camera pose to a camera coordinate system of the key frame to obtain the optimized RGB-D key frame.
In this embodiment, fine-scale matching and alignment of the RGB-D information of the frames within each local optimization interval determined in step S3 is performed, and the camera pose within each interval is solved by minimizing the inverse depth error and the photometric error. The ⌊n_i/2⌋-th RGB-D frame of the interval is selected as the key frame of the interval, and the 3D points of the adjacent RGB-D frames, after pose optimization, are transformed into the camera coordinate system of the key frame.
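The following Python sketch illustrates, under stated assumptions, the joint cost of step S4 for one neighbouring frame j relative to the key frame: it stacks inverse-depth residuals and photometric residuals weighted by α. The exact residual forms, the Huber robust functions, the bilinear sampling and all parameter values are illustrative assumptions, and the actual minimization over the pose (e.g. with a nonlinear least-squares solver) is left outside the sketch.

```python
# Sketch of the local alignment cost E_align = E_Z + alpha * E_I of step S4.
import numpy as np

def huber(r, delta=1.0):
    a = np.abs(r)
    return np.where(a <= delta, 0.5 * r * r, delta * (a - 0.5 * delta))

def bilinear(img, x, y):
    # Bilinear sampling; coordinates are assumed to lie inside the image.
    x0, y0 = np.floor(x).astype(int), np.floor(y).astype(int)
    wx, wy = x - x0, y - y0
    return ((1 - wx) * (1 - wy) * img[y0, x0] + wx * (1 - wy) * img[y0, x0 + 1] +
            (1 - wx) * wy * img[y0 + 1, x0] + wx * wy * img[y0 + 1, x0 + 1])

def align_cost(R, t, K, pts_key, x_key, gray_key, gray_j, depth_j, alpha=0.1):
    """Joint cost for key points: pts_key are 3D points in key-frame coordinates,
    x_key their pixel positions in the key frame, (R, t) the candidate pose of
    frame j with respect to the key frame."""
    h, w = depth_j.shape
    p_j = pts_key @ R.T + t                      # key points transformed into frame j
    uvw = p_j @ K.T
    x_j = uvw[:, :2] / uvw[:, 2:3]               # projections into frame j (pixels)

    inb = ((x_j[:, 0] >= 0) & (x_j[:, 0] < w - 1) &
           (x_j[:, 1] >= 0) & (x_j[:, 1] < h - 1) & (p_j[:, 2] > 1e-6))
    x_j, p_j, x_k = x_j[inb], p_j[inb], x_key[inb]

    # E_Z (one reading): inverse of the key point's predicted depth in frame j
    # minus the inverse of the depth measured at its projection position.
    z_meas = bilinear(depth_j, x_j[:, 0], x_j[:, 1])
    ok = z_meas > 1e-6
    r_z = 1.0 / p_j[ok, 2] - 1.0 / z_meas[ok]

    # E_I: photometric residuals between the key frame and frame j.
    r_i = (bilinear(gray_key, x_k[ok, 0], x_k[ok, 1]) -
           bilinear(gray_j, x_j[ok, 0], x_j[ok, 1]))

    return huber(r_z).sum() + alpha * huber(r_i).sum()
```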
S5, extracting and matching feature points of the optimized RGB-D key frames by combining with RGB-D information to obtain pose estimation among the RGB-D key frames, and completing the estimation of the pose of an RGB-D camera for three-dimensional reconstruction of an indoor scene, wherein the implementation method comprises the following steps:
s501, extracting key points combining an RGB image and a depth image from the optimized RGB-D key frame by combining RGB-D information;
s502, combining a two-dimensional image feature descriptor SIFT and a three-dimensional point cloud feature descriptor FPFH according to the key points to generate a joint descriptor;
s503, matching corresponding points between the RGB-D key frames according to the joint descriptors;
s504, filtering the RGB-D key frames by using a PnP algorithm, removing wrong matching point pairs, obtaining pose estimation among the RGB-D key frames, and completing the estimation of the pose of an RGB-D camera for three-dimensional reconstruction of an indoor scene, wherein the implementation method comprises the following steps:
s5041, randomly selecting 8 groups of matching point pairs obtained in the step S503, and calculating a rotation matrix R and a translation vector t of the pose of the camera by using a PnP algorithm;
s5042, forming a judging function by utilizing the 3D point re-projection error, the epipolar geometric model and the homography matrix error according to the rotation matrix R and the translation vector t of the camera pose;
s5043, judging whether to reject the random matching pair according to the judging function, if so, entering a step S5044, otherwise, returning to the step S5041;
s5044, eliminating all matching point pairs which do not meet the judging function, calculating to obtain pose estimation among RGB-D key frames by utilizing a PnP algorithm according to all matching point pairs which meet the judging function, and completing estimation of the pose of the RGB-D camera for three-dimensional reconstruction of the indoor scene.
The expression of the camera pose estimation E (R, t) in the step S504 is as follows:
E(R, t) = Σ_i ‖ x_i − (1/s_i)·K·(R·g_i + t) ‖_2^2
wherein K represents the camera internal reference (intrinsic) matrix, g_i represents the 3D feature points of the i-th key frame, R represents the rotation matrix, t represents the translation vector, s_i represents the depth of the transformed point R·g_i + t, and x_i represents the 2D feature points of the i-th key frame.
In this embodiment, for each optimized RGB-D key frame, key points combining the RGB image and the depth information are extracted, and the SIFT and FPFH descriptors are combined into a joint descriptor to match corresponding points between RGB-D key frames; a RANSAC scheme is used to remove erroneous matching point pairs and the PnP algorithm is used to estimate the camera pose, so that the point clouds of the RGB-D key frames are registered and a relatively complete three-dimensional point cloud model of the indoor scene is obtained.
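As a rough illustration (not the patented procedure), the following Python sketch combines SIFT descriptors computed on the RGB image with FPFH descriptors computed on the back-projected 3D key points into a joint descriptor, matches two key frames with mutual nearest neighbours, and estimates the relative pose with OpenCV's solvePnPRansac; that call stands in for the patent's own eight-point RANSAC loop with its combined reprojection/epipolar/homography decision function. The availability of OpenCV and Open3D and all parameter values are assumptions.

```python
# Sketch of step S5: joint SIFT + FPFH descriptors and RANSAC-PnP pose estimation.
import cv2
import numpy as np
import open3d as o3d

def backproject(kp_xy, depth, K):
    u, v = kp_xy[:, 0], kp_xy[:, 1]
    z = depth[v.astype(int), u.astype(int)]
    x = (u - K[0, 2]) / K[0, 0] * z
    y = (v - K[1, 2]) / K[1, 1] * z
    return np.stack([x, y, z], axis=1), z > 1e-6

def joint_descriptors(gray, depth, K):
    kps, d_sift = cv2.SIFT_create().detectAndCompute(gray, None)
    xy = np.array([kp.pt for kp in kps], dtype=np.float64)
    pts3d, valid = backproject(xy, depth, K)
    xy, pts3d, d_sift = xy[valid], pts3d[valid], d_sift[valid]

    pcd = o3d.geometry.PointCloud(o3d.utility.Vector3dVector(pts3d))
    pcd.estimate_normals(o3d.geometry.KDTreeSearchParamHybrid(radius=0.1, max_nn=30))
    fpfh = o3d.pipelines.registration.compute_fpfh_feature(
        pcd, o3d.geometry.KDTreeSearchParamHybrid(radius=0.25, max_nn=100))
    d_fpfh = np.asarray(fpfh.data).T                      # N x 33

    l2n = lambda a: a / (np.linalg.norm(a, axis=1, keepdims=True) + 1e-12)
    joint = np.hstack([l2n(d_sift), l2n(d_fpfh)]).astype(np.float32)
    return xy, pts3d, joint

def keyframe_relative_pose(gray1, depth1, gray2, depth2, K):
    xy1, p3d1, d1 = joint_descriptors(gray1, depth1, K)
    xy2, _, d2 = joint_descriptors(gray2, depth2, K)
    # Mutual nearest-neighbour (cross-checked) matching of the joint descriptors.
    matches = cv2.BFMatcher(cv2.NORM_L2, crossCheck=True).match(d1, d2)
    obj = np.float32([p3d1[m.queryIdx] for m in matches])
    img = np.float32([xy2[m.trainIdx] for m in matches])
    # RANSAC-PnP: rejects wrong matches and estimates R, t of key frame 1's points
    # expressed in key frame 2's camera.
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(obj, img, K.astype(np.float64), None,
                                                 reprojectionError=3.0,
                                                 iterationsCount=200)
    R, _ = cv2.Rodrigues(rvec)
    return R, tvec, inliers
```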
With the above design, the invention combines joint optimization of the camera pose and the depth map between local frames with camera pose estimation based on joint RGB-D feature matching. Dense RGB-D alignment between local frames removes the influence of single-frame depth noise or holes on the subsequent feature matching and camera pose estimation, and also reduces redundant RGB-D information; feature extraction and matching that use both RGB and depth information reduce the camera pose estimation errors caused by repeated and weak RGB textures. The problems of severe depth loss caused by distance limits or infrared interference, repeated textures and structures, weak textures, strong illumination changes and rapid camera motion are thereby addressed.

Claims (7)

1. An RGB-D camera pose estimation method for three-dimensional reconstruction of indoor scene is characterized by comprising the following steps:
s1, acquiring each RGB-D frame in an RGB-D camera;
s2, aligning an RGB image with a depth image according to each RGB-D frame, preprocessing the depth image, deleting abnormal depth data, and obtaining aligned RGB-D frames;
s3, performing optical flow tracking on the RGB image according to the aligned RGB-D frame, and determining a local alignment and optimization interval of the RGB-D camera pose estimation;
the step S3 includes the steps of:
s301, extracting ORB corner points of the RGB image of the first frame of the aligned RGB-D, and extracting ORB corner points of the RGB image of the next frame of the aligned RGB-D;
s302, performing optical flow tracking based on unchanged luminosity according to the extracted ORB corner points, judging whether the optical flow tracking is successful, if yes, entering a step S303, otherwise, entering a step S304;
s303, calculating the relative camera pose between adjacent RGB-D frames by the epipolar geometry method, and judging whether the L2 norm of the Lie-algebra representation of the relative pose change is within a preset threshold, if so, recording the RGB-D frame as a frame to be selected of the local optimization interval and returning to the step S302, otherwise, entering the step S304;
s304, judging whether the current RGB-D frame is a first frame and no RGB-D frame to be selected exists, if so, entering a step S305, otherwise, entering a step S306;
s305, recording the RGB-D frame as a group of new frames to be selected in a local optimization interval, judging whether the next RGB-D frame exists, if so, returning to the step S304, otherwise, returning to the step S302;
s306, forming a group of local alignment and optimization intervals of RGB-D camera pose estimation by all current RGB-D frames to be selected, entering a step S4, recording the RGB-D frames as a group of new frames to be selected of the local optimization interval, judging whether the next RGB-D frame exists, if so, returning to the step S304, otherwise, returning to the step S302;
s4, performing RGB-D camera pose estimation on the RGB-D frames in the local alignment and optimization interval, and converting the RGB-D information in the interval into an RGB-D key frame coordinate system of the interval to obtain an optimized RGB-D key frame;
s5, extracting and matching feature points of the optimized RGB-D key frames by combining with RGB-D information to obtain pose estimation among the RGB-D key frames, and completing the estimation of the pose of an RGB-D camera for three-dimensional reconstruction of the indoor scene;
the step S5 includes the steps of:
s501, extracting key points combining an RGB image and a depth image from the optimized RGB-D key frame by combining RGB-D information;
s502, combining a two-dimensional image feature descriptor SIFT and a three-dimensional point cloud feature descriptor FPFH according to the key points to generate a joint descriptor;
s503, matching corresponding points between the RGB-D key frames according to the joint descriptors;
s504, filtering the RGB-D key frames by using a PnP algorithm, removing wrong matching point pairs, obtaining pose estimation among the RGB-D key frames, and completing the estimation of the pose of the RGB-D camera for three-dimensional reconstruction of the indoor scene.
2. The method for estimating the pose of an RGB-D camera for three-dimensional reconstruction of an indoor scene according to claim 1, wherein the abnormal depth data in step S2 comprises:
points outside the RGB-D camera effective distance;
3D points with the distance to the nearest point in the RGB-D frame point cloud being larger than a preset threshold, wherein the threshold is 0.9 times of the maximum point-to-point distance of the frame point cloud; and
3D points in the RGB-D frame whose included angle with the principal optical axis, in either the horizontal or the vertical direction, exceeds a preset threshold, the angle threshold being 60-70 degrees.
3. The method for estimating the pose of an RGB-D camera for three-dimensional reconstruction of an indoor scene according to claim 2, wherein the threshold in step S303 is 10.
4. The method for estimating the pose of an RGB-D camera for three-dimensional reconstruction of an indoor scene according to claim 1, wherein said step S4 comprises the steps of:
s401, according to the RGB-D sequence in the local alignment and optimization interval, selecting the ⌊n_i/2⌋-th RGB-D frame of the interval as the key frame of the local optimization interval, where n_i denotes the number of RGB-D frames of the i-th local optimization interval and ⌊·⌋ denotes rounding down;
s402, calculating the camera pose in each local optimization interval by using a minimized inverse depth error and a luminosity error according to the key frame;
s403, transforming 3D points of adjacent RGB-D frames in the camera pose in the local optimization interval to the camera coordinate system of the key frame to obtain the optimized RGB-D key frame.
5. The method for estimating the pose of an RGB-D camera for three-dimensional reconstruction of an indoor scene according to claim 4, wherein the camera pose T in each local optimization interval in step S402 satisfies the following expression:
T = argmin_T E_align = argmin_T ( E_Z + α·E_I )
E_Z = Σ_j ρ_Z( 1/z(X_j) − 1/Z_j(x_j) )
E_I = Σ_j ρ_I( I_i(x_i) − I_j(x_j) )
wherein E_Z is the inverse depth error, E_I is the photometric (luminosity) error, α is the relative weight balancing the inverse depth error and the photometric error, z(X_j) denotes the depth of the key point X_j at the i-th frame, Z_j(x_j) denotes the depth on the depth image of the j-th frame at the projection position x_j of the key point X_j, ρ_Z is the inverse-depth-error robust function, I_i(x_i) denotes the luminosity of the point x_i on the i-th frame, I_j(x_j) denotes the luminosity at the projection position x_j on the j-th frame, ρ_I is the photometric-error robust function, x_i denotes the 2D feature point position of the current key frame, and E_align denotes the total error.
6. The method for estimating the pose of an RGB-D camera for three-dimensional reconstruction of an indoor scene according to claim 5, wherein said step S504 comprises the steps of:
s5041, randomly selecting 8 groups of matching point pairs obtained in the step S503, and calculating a rotation matrix R and a translation vector t of the pose of the camera by using a PnP algorithm;
s5042, forming a judging function by utilizing the 3D point re-projection error, the epipolar geometric model and the homography matrix error according to the rotation matrix R and the translation vector t of the camera pose;
s5043, judging whether to reject the random matching pair according to the judging function, if so, entering a step S5044, otherwise, returning to the step S5041;
s5044, eliminating all matching point pairs which do not meet the judging function, calculating to obtain pose estimation among RGB-D key frames by utilizing a PnP algorithm according to all matching point pairs which meet the judging function, and completing estimation of the pose of the RGB-D camera for three-dimensional reconstruction of the indoor scene.
7. The method for estimating the pose of the RGB-D camera for three-dimensional reconstruction of an indoor scene as set forth in claim 6, wherein the expression of the pose estimation E (R, t) of the RGB-D camera in step S504 is as follows:
E(R, t) = Σ_i ‖ x_i − (1/s_i)·K·(R·g_i + t) ‖_2^2
wherein K represents the camera internal reference (intrinsic) matrix, g_i represents the 3D feature points of the i-th key frame, R represents the rotation matrix, t represents the translation vector, s_i represents the depth of the transformed point R·g_i + t, and x_i represents the 2D feature points of the i-th key frame.
CN201911361680.9A 2019-12-26 2019-12-26 RGB-D camera pose estimation method for three-dimensional reconstruction of indoor scene Active CN111105460B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911361680.9A CN111105460B (en) 2019-12-26 2019-12-26 RGB-D camera pose estimation method for three-dimensional reconstruction of indoor scene

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911361680.9A CN111105460B (en) 2019-12-26 2019-12-26 RGB-D camera pose estimation method for three-dimensional reconstruction of indoor scene

Publications (2)

Publication Number Publication Date
CN111105460A CN111105460A (en) 2020-05-05
CN111105460B (en) 2023-04-25

Family

ID=70425095

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911361680.9A Active CN111105460B (en) 2019-12-26 2019-12-26 RGB-D camera pose estimation method for three-dimensional reconstruction of indoor scene

Country Status (1)

Country Link
CN (1) CN111105460B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113724365B (en) * 2020-05-22 2023-09-26 杭州海康威视数字技术股份有限公司 Three-dimensional reconstruction method and device
CN111915651B (en) * 2020-07-31 2023-09-12 西安电子科技大学 Visual pose real-time estimation method based on digital image map and feature point tracking
CN113284176B (en) * 2021-06-04 2022-08-16 深圳积木易搭科技技术有限公司 Online matching optimization method combining geometry and texture and three-dimensional scanning system
CN113724369A (en) * 2021-08-01 2021-11-30 国网江苏省电力有限公司徐州供电分公司 Scene-oriented three-dimensional reconstruction viewpoint planning method and system
CN113610001B (en) * 2021-08-09 2024-02-09 西安电子科技大学 Indoor mobile terminal positioning method based on combination of depth camera and IMU

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105654492A (en) * 2015-12-30 2016-06-08 哈尔滨工业大学 Robust real-time three-dimensional (3D) reconstruction method based on consumer camera
CN109993113A (en) * 2019-03-29 2019-07-09 东北大学 A kind of position and orientation estimation method based on the fusion of RGB-D and IMU information
CN110223348A (en) * 2019-02-25 2019-09-10 湖南大学 Robot scene adaptive bit orientation estimation method based on RGB-D camera

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7925049B2 (en) * 2006-08-15 2011-04-12 Sri International Stereo-based visual odometry method and system
CN102034267A (en) * 2010-11-30 2011-04-27 中国科学院自动化研究所 Three-dimensional reconstruction method of target based on attention
US9332243B2 (en) * 2012-10-17 2016-05-03 DotProduct LLC Handheld portable optical scanner and method of using
CN105957017B (en) * 2016-06-24 2018-11-06 电子科技大学 A kind of video-splicing method based on self adaptation key frame sampling
KR101865173B1 (en) * 2017-02-03 2018-06-07 (주)플레이솔루션 Method for generating movement of motion simulator using image analysis of virtual reality contents
WO2018182524A1 (en) * 2017-03-29 2018-10-04 Agency For Science, Technology And Research Real time robust localization via visual inertial odometry
CN107025668B (en) * 2017-03-30 2020-08-18 华南理工大学 Design method of visual odometer based on depth camera
CN107292921B (en) * 2017-06-19 2020-02-04 电子科技大学 Rapid three-dimensional reconstruction method based on kinect camera
CN108062776B (en) * 2018-01-03 2019-05-24 百度在线网络技术(北京)有限公司 Camera Attitude Tracking method and apparatus
CN109387204B (en) * 2018-09-26 2020-08-28 东北大学 Mobile robot synchronous positioning and composition method facing indoor dynamic environment
CN109961506B (en) * 2019-03-13 2023-05-02 东南大学 Local scene three-dimensional reconstruction method for fusion improved Census diagram

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105654492A (en) * 2015-12-30 2016-06-08 哈尔滨工业大学 Robust real-time three-dimensional (3D) reconstruction method based on consumer camera
CN110223348A (en) * 2019-02-25 2019-09-10 湖南大学 Robot scene adaptive bit orientation estimation method based on RGB-D camera
CN109993113A (en) * 2019-03-29 2019-07-09 东北大学 A kind of position and orientation estimation method based on the fusion of RGB-D and IMU information

Also Published As

Publication number Publication date
CN111105460A (en) 2020-05-05

Similar Documents

Publication Publication Date Title
CN111105460B (en) RGB-D camera pose estimation method for three-dimensional reconstruction of indoor scene
CN108986037B (en) Monocular vision odometer positioning method and positioning system based on semi-direct method
CN107025668B (en) Design method of visual odometer based on depth camera
CN108776989B (en) Low-texture planar scene reconstruction method based on sparse SLAM framework
CN108648161B (en) Binocular vision obstacle detection system and method of asymmetric kernel convolution neural network
CN110009732B (en) GMS feature matching-based three-dimensional reconstruction method for complex large-scale scene
CN106204574B (en) Camera pose self-calibrating method based on objective plane motion feature
CN110689008A (en) Monocular image-oriented three-dimensional object detection method based on three-dimensional reconstruction
CN108597009B (en) Method for detecting three-dimensional target based on direction angle information
CN107862735B (en) RGBD three-dimensional scene reconstruction method based on structural information
CN111242991B (en) Method for quickly registering visible light and infrared camera
CN112652020B (en) Visual SLAM method based on AdaLAM algorithm
CN115205489A (en) Three-dimensional reconstruction method, system and device in large scene
CN112484746B (en) Monocular vision auxiliary laser radar odometer method based on ground plane
CN112396595A (en) Semantic SLAM method based on point-line characteristics in dynamic environment
CN111797688A (en) Visual SLAM method based on optical flow and semantic segmentation
CN111681275B (en) Double-feature-fused semi-global stereo matching method
CN113744315B (en) Semi-direct vision odometer based on binocular vision
CN111998862B (en) BNN-based dense binocular SLAM method
CN106408596A (en) Edge-based local stereo matching method
CN116468786B (en) Semantic SLAM method based on point-line combination and oriented to dynamic environment
CN104240229A (en) Self-adaptation polarline correcting method based on infrared binocular camera
CN117011660A (en) Dot line feature SLAM method for fusing depth information in low-texture scene
CN114998532B (en) Three-dimensional image visual transmission optimization method based on digital image reconstruction
CN114399547B (en) Monocular SLAM robust initialization method based on multiframe

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant