WO2017220815A1 - Rgb-d camera based tracking system and method thereof - Google Patents

Publication number
WO2017220815A1
WO2017220815A1 PCT/EP2017/065677 EP2017065677W WO2017220815A1 WO 2017220815 A1 WO2017220815 A1 WO 2017220815A1 EP 2017065677 W EP2017065677 W EP 2017065677W WO 2017220815 A1 WO2017220815 A1 WO 2017220815A1
Authority
WO
WIPO (PCT)
Prior art keywords
keyframe
graph
constraint
loop
generating
Prior art date
Application number
PCT/EP2017/065677
Other languages
French (fr)
Inventor
Soohwan Kim
Benzun Pious Wisely BABU
Zhixin Yan
Liu Ren
Original Assignee
Robert Bosch Gmbh
Priority date
Filing date
Publication date
Application filed by Robert Bosch GmbH
Priority to ES17733432T (patent ES2911400T3)
Priority to CN201780038684.6A (patent CN109478330B)
Priority to JP2018567158A (patent JP2019522288A)
Priority to US16/312,039 (patent US11093753B2)
Priority to EP17733432.3A (patent EP3475917B1)
Priority to KR1020187037380A (patent KR102471301B1)
Publication of WO2017220815A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G06V20/46 Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70 Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/78 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/783 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G06F16/7837 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using objects detected or recognised in the video content
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/50 Depth or shape recovery
    • G06T7/55 Depth or shape recovery from multiple images
    • G06T7/579 Depth or shape recovery from multiple images from motion
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20072 Graph-based image processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20076 Probabilistic image processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30241 Trajectory
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30244 Camera pose

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Library & Information Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Studio Devices (AREA)
  • Image Analysis (AREA)
  • Length Measuring Devices By Optical Means (AREA)
  • Instructional Devices (AREA)

Abstract

A visual SLAM system comprises a plurality of keyframes including a keyframe, a current keyframe, and a previous keyframe; a dual dense visual odometry configured to provide a pairwise transformation estimate between two of the plurality of keyframes; a frame generator configured to create a keyframe graph; a loop constraint evaluator configured to add a constraint to the received keyframe graph; and a graph optimizer configured to produce a map with trajectory.

Description

RGB-D Camera Based Tracking System and Method thereof
CROSS-REFERENCE TO RELATED APPLICATION
[0001] This application claims priority to U.S. provisional patent application Ser. No. 62/354,251, filed June 24, 2016, the contents of which are incorporated herein by reference as if fully enclosed herein.
FIELD
[0002] This disclosure relates generally to tracking systems and, more particularly, to an RGB-D camera based tracking system and method thereof.
BACKGROUND
[0003] Unless otherwise indicated herein, the materials described in this section are not prior art to the claims in this application and are not admitted to be prior art by inclusion in this section.
SUMMARY
[0004] A summary of certain embodiments disclosed herein is set forth below. It should be understood that these aspects are presented merely to provide the reader with a brief summary of these certain embodiments and that these aspects are not intended to limit the scope of this disclosure. Indeed, this disclosure may encompass a variety of aspects that may not be set forth below.
[0005] Embodiments of the disclosure relate to a method for computing visual Simultaneous Localization and Mapping (SLAM). The method comprises generating, by a visual odometry module, a local odometry estimate; generating, by a keyframe generator, keyframes; creating a keyframe graph; adding a constraint to the keyframe graph using a loop constraint evaluator; and optimizing the keyframe graph with trajectory. The method further comprises generating a new keyframe between a keyframe and a current frame before generating a local odometry estimate. The adding of a constraint to the keyframe graph using the loop constraint evaluator is based on a loop closure, wherein the loop closure is a return to previously visited locations. The method further comprises adjusting a pose graph based on edge weights of different constraints in the keyframe graph after optimization.
[0006] According to another aspect of the disclosure, a method of applying a probabilistic sensor model for a dense visual odometry comprises generating, by a keyframe generator, keyframes; creating a keyframe graph; adding a constraint to the keyframe graph using a loop constraint evaluator; and optimizing the keyframe graph with trajectory. The method further comprises generating a new keyframe between a keyframe and a current frame before generating a local odometry estimate. The adding of a constraint to the keyframe graph using the loop constraint evaluator is based on a loop closure, wherein the loop closure is a return to previously visited locations. The method further comprises adjusting a pose graph based on edge weights of different constraints in the keyframe graph after optimization.
[0007] According to another aspect of the disclosure, a method of applying a t-distribution for photometric errors and a probabilistic sensor model for geometric errors comprises:
Figure imgf000004_0001
[0008] According to another aspect of the disclosure, a visual SLAM system comprises a plurality of keyframes including a keyframe, a current keyframe, and a previous keyframe; a dual dense visual odometry configured to provide a pairwise transformation estimate between two of the plurality of keyframes; a frame generator configured to create a keyframe graph; a loop constraint evaluator configured to add a constraint to the received keyframe graph; and a graph optimizer configured to produce a map with trajectory.
BRIEF DESCRIPTION OF THE DRAWINGS
[0009] These and other features, aspects, and advantages of this disclosure will become better understood when the following detailed description of certain exemplary embodiments is read with reference to the accompanying drawings, in which like characters represent like parts throughout the drawings, wherein:
[0010] FIG. 1 is a block diagram illustrating a visual SLAM system;
[0011] FIG. 2 is a block diagram illustrating the structure of an example keyframe graph and loop constraint evaluator;
[0012] FIG. 3 illustrates an RGB-D camera sensor model;
[0013] FIG. 4 is a block diagram of an uncertainty propagation; and
[0014] FIG. 5 illustrates an example of a map generated by a σ-DVO SLAM system.
DETAILED DESCRIPTION
[0015] The following description is presented to enable any person skilled in the art to make and use the described embodiments, and is provided in the context of a particular application and its requirements. Various modifications to the described embodiments will be readily apparent to those skilled in the art, and the general principles defined herein may be applied to other embodiments and applications without departing from the spirit and scope of the described embodiments. Thus, the described embodiments are not limited to the embodiments shown, but are to be accorded the widest scope consistent with the principles and features disclosed herein.
[0016] Various operations may be described as multiple discrete actions or operations in turn, in a manner that is most helpful in understanding the claimed subject matter. However, the order of description should not be construed as to imply that these operations are necessarily order dependent. In particular, these operations may not be performed in the order of presentation. Operations described may be performed in a different order than the described embodiment. Various additional operations may be performed and/or described operations may be omitted in additional embodiments.
[0017] FIG. 1 is a block diagram illustrating a visual Simultaneous Localization and Mapping (SLAM) system 100 divided into a front-end 100a and a back-end 100b. At the front-end 100a, the system 100 uses a visual odometry approach that makes full use of all pixel information from an RGB-D camera to generate a local transformation estimate 112. That is, dense visual odometry 108 or 110 provides a pairwise transformation estimate between two of the image frames 102, 104, 106. As illustrated, a first pairwise transformation estimate is performed between keyframe 102 and current frame 104 using dense visual odometry 108. A second pairwise transformation estimate is performed between current frame 104 and previous frame 106 using dense visual odometry 110. A keyframe generator 114 is used to generate a keyframe V_k based on the quality of the odometry estimate. At the back-end 100b of the system 100, a keyframe graph G ⊂ {V_k} 116 is created using the keyframe generator 114. At a loop constraint evaluator 118, constraints based on the return to previously visited locations, i.e. loop closures, are added to the keyframe graph to improve its connectivity. Graph optimizer 120 then optimizes the final graph with constraints to produce an optimized map with trajectory 122. More details on the keyframe graph 116 and the loop constraint evaluator 118 are described below. The probabilistic sensor model is used in the front-end 100a and also in keyframe generation 114, loop constraint detection 118, and graph optimization 120 in the back-end 100b.
[0018] FIG. 2 is a block diagram illustrating the structure of an example keyframe graph 200, which comprises a back-end graph optimization 202 and a local neighborhood 204. In the back-end graph optimization 202, the loop constraints L_{Ki,Kj}, combined with the odometry constraints weighted by [Figure imgf000007_0002], are optimized. The recent keyframe [Figure imgf000007_0004] and the frames tracked with respect to [Figure imgf000007_0001] are included in the local neighborhood 204. The keyframes [Figure imgf000007_0005] and the tracked frames [Figure imgf000007_0003] are determined based on the ratio of entropies [Figure imgf000007_0006]. When the current frame does not contain sufficient information to track a new frame, a new keyframe is generated using the entropy of the camera pose estimate. A new keyframe is generated when the estimated entropy between the keyframe and the current frame falls below a threshold normalized by the largest estimated entropy in the local neighborhood 204. The largest estimated entropy is assumed to be the one between the keyframe and the first frame. An additional keyframe generation strategy based on the curve estimate of the camera trajectory is proposed. The curve estimate [Figure imgf000008_0003] between frames i and k is defined as the ratio of the sum of the translations between consecutive frames (δ_{i,i-1}) in the local neighborhood N to the translation between the keyframe and the latest frame [Figure imgf000008_0004]:
Figure imgf000008_0001
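By way of illustration only, the two keyframe generation criteria of paragraph [0018] may be sketched in Python as follows. The function names, the threshold values, and the way the two tests are combined with a logical OR are assumptions made for this sketch; the equations of the disclosure survive only as images and are not reproduced here. Translations are taken to be 3-vectors, and the direction of the curve test (a longer path relative to the chord triggers a keyframe) is likewise an assumption.

```python
import math

ORIGIN = (0.0, 0.0, 0.0)


def entropy_ratio(h_current, h_reference):
    """Entropy of the current pose estimate divided by the reference entropy
    (assumed here to be the one between the keyframe and the first frame)."""
    return h_current / h_reference


def curve_estimate(consecutive_translations, keyframe_to_latest):
    """Sum of the consecutive-frame translation norms in the local
    neighborhood, divided by the norm of the keyframe-to-latest translation."""
    path_length = sum(math.dist(t, ORIGIN) for t in consecutive_translations)
    chord = math.dist(keyframe_to_latest, ORIGIN)
    return path_length / chord if chord > 0.0 else math.inf


def needs_new_keyframe(h_current, h_reference, consecutive_translations,
                       keyframe_to_latest, entropy_threshold=0.9,
                       curve_threshold=1.5):
    """Hypothetical decision rule: too little information OR too much curvature."""
    low_information = entropy_ratio(h_current, h_reference) < entropy_threshold
    curved_path = curve_estimate(consecutive_translations,
                                 keyframe_to_latest) > curve_threshold
    return low_information or curved_path
```

For a straight-line motion the curve estimate equals 1, so only the entropy test can trigger a new keyframe; the value grows as the camera path bends away from the chord between the keyframe and the latest frame.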
[0019] A return to a previously visited location, called a loop closure, allows the loop constraint evaluator 118 to identify additional constraints for the graph, as illustrated in FIG. 2. After optimization, the pose graph is adjusted based on the edge weights of the different constraints in the graph. An erroneous loop constraint can sometimes lead to a poorly optimized final trajectory. Extending previous loop constraint generation methods, two additional techniques can be used to reduce the impact of wrong loop constraints. Firstly, the loop closure constraints are weighted based on the inverse square of the metric distance between the keyframes that form the loop closure. This is based on the intuition that a loop constraint between distant frames is prone to a larger error than one between frames close to one another. Secondly, occlusion filtering is performed to remove false loop closure constraints. The depth image provides geometry information which can be used to perform occlusion filtering between two keyframes. The standard deviation of the sensor model uncertainty of a depth point provides a bound on the maximum possible depth shift, as given by the following equation:
Figure imgf000008_0002
equation (2)
[0020] All points which violate this assumption are considered occlusions.
[0021] On generation of a new keyframe, the back-end graph is updated with the previous keyframe information and a double window graph structure 200 is created. The pose graph in the back-end is optimized using, for example, the open source library g2o. A final optimization is performed on termination of the visual odometry to generate the optimized camera trajectory estimate.
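The two safeguards against wrong loop constraints described in paragraphs [0019] and [0020] may be sketched as follows. This is a minimal sketch under stated assumptions: equation (2) is not recoverable from the text, so the depth bound is assumed here to be a multiple k of the sensor-model standard deviation, and the function names, the factor k, and the distance floor `min_distance` are all hypothetical.

```python
import math


def loop_closure_weight(position_a, position_b, min_distance=1e-3):
    """Weight a loop closure by the inverse square of the metric distance
    between the two keyframes; min_distance avoids division by zero."""
    distance = max(math.dist(position_a, position_b), min_distance)
    return 1.0 / (distance * distance)


def is_occluded(predicted_depth, observed_depth, sigma, k=3.0):
    """Flag a point whose depth shift exceeds the assumed bound k * sigma."""
    return abs(predicted_depth - observed_depth) > k * sigma


def occlusion_fraction(predicted, observed, sigmas, k=3.0):
    """Fraction of points flagged as occluded between two keyframes."""
    flags = [is_occluded(p, o, s, k)
             for p, o, s in zip(predicted, observed, sigmas)]
    return sum(flags) / len(flags) if flags else 0.0
```

A candidate loop closure could then be rejected when the occluded fraction is high, and otherwise added to the graph with the distance-based weight; the rejection threshold is left to the implementer.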
[0022] Generally, RGB-D cameras project infra-red patterns and recover depth from correspondences between two image views with a small parallax. During this process, the disparity is quantized into sub-pixels. This introduces a quantization error in the depth measurement. The noise due to quantization error in depth measurement is defined as
Figure imgf000009_0001
equation (3)
[0023] where [Figure imgf000009_0003] is the sub-pixel resolution of the device, b is the baseline, and f is the focal length. This error increases quadratically with range Zi, thus preventing the use of depth observations from far objects. The 3D sensor noise of RGB-D cameras can be modeled with a zero-mean multivariate Gaussian distribution whose covariance matrix has the following as the diagonal components:
Figure imgf000009_0002
[0024] where the [Figure imgf000009_0005] direction is along the ray, and [Figure imgf000009_0004] denote the angular resolutions in the x and y directions.
[0025] FIG. 3 illustrates an RGB-D camera sensor model. The camera is located at the origin and is looking up in the z direction. For each range of 1, 2, and 3 meters, 80 points are sampled and their uncertainties are expressed with ellipsoids. The error in the ray direction increases quadratically.
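The qualitative behaviour described for the sensor model (noise along the ray growing quadratically with range, lateral noise set by the angular resolution) may be sketched as follows. The closed forms used here are assumptions, since the corresponding equation images are not reproduced in the text: the depth term uses the common first-order approximation sigma_z = m / (f b) · z², and the lateral terms use sigma = beta · z. The parameter names and any numeric values are illustrative only.

```python
def depth_sigma(z, subpixel, focal_px, baseline_m):
    """Assumed quantization noise along the ray: subpixel / (f * b) * z**2."""
    return subpixel / (focal_px * baseline_m) * z * z


def ray_covariance_diagonal(z, subpixel, focal_px, baseline_m, beta_x, beta_y):
    """Diagonal of the zero-mean Gaussian covariance in ray coordinates.

    beta_x and beta_y are the angular resolutions (radians) in x and y;
    the lateral standard deviations are assumed to be beta * z.
    """
    sigma_x = beta_x * z
    sigma_y = beta_y * z
    sigma_z = depth_sigma(z, subpixel, focal_px, baseline_m)
    return (sigma_x ** 2, sigma_y ** 2, sigma_z ** 2)


if __name__ == "__main__":
    # Hypothetical device parameters, evaluated at the ranges shown in FIG. 3.
    for z in (1.0, 2.0, 3.0):
        print(z, ray_covariance_diagonal(z, 0.125, 525.0, 0.075, 0.002, 0.002))
```

Under these assumptions, doubling the range multiplies the standard deviation along the ray by four but the lateral standard deviations only by two, which is why the ellipsoids elongate along the ray at larger ranges.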
[0026] FIG. 4 is a block diagram of an uncertainty propagation. Each 3D point p_i in FIG. 4 is associated with a Gaussian distribution, whose covariance matrices are, respectively,
Figure imgf000010_0007
Figure imgf000010_0001
[0027] where
Figure imgf000010_0002
[0028] Rray denotes the rotation matrix between the ray and camera coordinates.
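The role of the rotation between ray and camera coordinates may be illustrated with the standard covariance transformation Σ_cam = R Σ_ray Rᵀ. The sketch below is an assumption-laden illustration rather than the disclosure's own equations: `ray_rotation` is a hypothetical helper that builds one valid rotation whose third column is the unit ray through the point, and it assumes the point lies in front of the camera (positive depth), so the ray is never parallel to the camera y axis.

```python
import math


def normalize(v):
    """Scale a 3-vector to unit length."""
    norm = math.sqrt(sum(c * c for c in v))
    return tuple(c / norm for c in v)


def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def ray_rotation(point):
    """Rotation matrix whose columns are the ray-frame axes in camera
    coordinates; the third column is the unit ray through `point`."""
    z_axis = normalize(point)
    x_axis = normalize(cross((0.0, 1.0, 0.0), z_axis))
    y_axis = cross(z_axis, x_axis)
    return [[x_axis[i], y_axis[i], z_axis[i]] for i in range(3)]


def rotate_covariance(rotation, diagonal):
    """Return R * diag(diagonal) * R^T as a 3x3 nested list."""
    return [[sum(rotation[i][k] * diagonal[k] * rotation[j][k]
                 for k in range(3))
             for j in range(3)]
            for i in range(3)]
```

For a point on the optical axis the rotation reduces to the identity, so the covariance stays diagonal; for off-axis points the large variance along the ray is spread over the camera x, y, and z axes while the trace is preserved.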
[0029] A method of linearization is used to propagate the uncertainty to the residuals and the likelihood function can be expressed as a Gaussian distribution,
Figure imgf000010_0004
[0030] where
Figure imgf000010_0003
[0031] Here, [Figure imgf000010_0006] denotes the variance of the back-projected point [Figure imgf000010_0005] in the z axis of the current camera coordinates, as shown in FIG. 4. The maximum likelihood estimation is,
Figure imgf000011_0001
[0032] The individual precision matrix is split into two square roots, [Figure imgf000011_0008], and is normalized by applying the single precision matrix of the weighted residuals, [Figure imgf000011_0009], as
Figure imgf000011_0002
[0033] The photometric and geometric errors can be defined as,
Figure imgf000011_0003
[0034] where [Figure imgf000011_0007] denotes the z component of the vector.
[0035] To find the relative camera pose which minimizes the photometric and geometric errors, the energy function is defined as the sum of weighted squared errors,
Figure imgf000011_0004
[0036] where n is the total number of valid pixels, and [Figure imgf000011_0010] denotes the weights for different errors.
[0037] Since the energy function is non-linear with respect to the relative camera pose, the Gauss-Newton algorithm is usually applied to find the optimal solution numerically, and equation (14) is updated to:
Figure imgf000011_0005
[0038] where ⊗ denotes the Kronecker product, [Figure imgf000011_0006], and the Jacobian matrix is defined as
Figure imgf000012_0001
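A single Gauss-Newton update for a weighted sum of squared two-component residuals (photometric and geometric) may be sketched as follows. This is the textbook normal-equation form, (Σ Jᵀ W J) Δ = −Σ Jᵀ W r, offered as an assumption about the general structure; it is not a transcription of the update equation or Jacobian of the disclosure, which are available only as images. The function names are hypothetical, and the sketch is pure Python, so it suits small illustrations rather than per-pixel use.

```python
def solve_linear(matrix, rhs):
    """Solve matrix * x = rhs by Gauss-Jordan elimination with partial
    pivoting. The matrix is assumed to be non-singular."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(n):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                for c in range(col, n + 1):
                    aug[row][c] -= factor * aug[col][c]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def gauss_newton_step(jacobians, residuals, weights):
    """One update: solve (sum J^T W J) dx = -(sum J^T W r).

    jacobians[i] is 2 x p, residuals[i] has length 2, and weights[i] is a
    2 x 2 weight (precision) matrix for the i-th valid pixel.
    """
    p = len(jacobians[0][0])
    hessian = [[0.0] * p for _ in range(p)]
    gradient = [0.0] * p
    for jac, res, wgt in zip(jacobians, residuals, weights):
        # Pre-multiply by the weight matrix: WJ is 2 x p, Wr has length 2.
        w_jac = [[sum(wgt[a][k] * jac[k][c] for k in range(2))
                  for c in range(p)] for a in range(2)]
        w_res = [sum(wgt[a][k] * res[k] for k in range(2)) for a in range(2)]
        for row in range(p):
            gradient[row] += sum(jac[a][row] * w_res[a] for a in range(2))
            for col in range(p):
                hessian[row][col] += sum(jac[a][row] * w_jac[a][col]
                                         for a in range(2))
    return solve_linear(hessian, [-g for g in gradient])
```

In a pose estimation setting p would be 6 (a twist on SE(3)), and the step would be repeated, re-linearizing the residuals around the updated pose until the update becomes negligible.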
[0039] Eq. (14) is equivalent to maximum likelihood estimation in which each residual is independent and follows an identical Gaussian distribution,
Figure imgf000012_0004
[0040] where
Figure imgf000012_0008
Note that this corresponds to the case of in
Figure imgf000012_0010
can be rewritten as:
Figure imgf000012_0009
[0041] where
Figure imgf000012_0005
Note that this corresponds to the case of
Figure imgf000012_0002
[0042] A t-distribution is used for the photometric errors, and a sensor model with a Gaussian distribution is propagated for the geometric errors. Combining Eq. (11) and Eq. (18) gives what is defined here as σ-dense visual odometry (σ-DVO):
Figure imgf000012_0006
[0043] where the weight matrix and
Figure imgf000012_0007
Figure imgf000012_0003
Figure imgf000013_0001
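The effect of using a t-distribution for the photometric errors may be illustrated with the weight function widely used in robust dense visual odometry, w(r) = (ν + 1) / (ν + (r/σ)²), together with a fixed-point estimate of the scale σ. Both formulas, the choice of ν = 5 degrees of freedom, and the function names are assumptions for this sketch; the weight matrix of the disclosure is given only as an image and may differ.

```python
import math


def t_weight(residual, sigma, dof=5.0):
    """Student-t weight: large residuals are down-weighted."""
    return (dof + 1.0) / (dof + (residual / sigma) ** 2)


def estimate_t_scale(residuals, dof=5.0, iterations=50, tol=1e-9):
    """Fixed-point iteration for the scale of a zero-mean t-distribution:
    sigma^2 = mean(r^2 * (dof + 1) / (dof + r^2 / sigma^2))."""
    floor = 1e-12  # keeps the scale positive when all residuals are zero
    sigma2 = max(sum(r * r for r in residuals) / len(residuals), floor)
    for _ in range(iterations):
        updated = sum(r * r * (dof + 1.0) / (dof + r * r / sigma2)
                      for r in residuals) / len(residuals)
        updated = max(updated, floor)
        converged = abs(updated - sigma2) < tol
        sigma2 = updated
        if converged:
            break
    return math.sqrt(sigma2)
```

With such weights, an occasional large photometric residual (for example from a specular highlight) contributes far less to the pose update than it would under a Gaussian assumption, while the geometric residuals can keep the Gaussian weights derived from the sensor model.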
[0044] The σ-DVO algorithm can be implemented in any suitable client device, such as a smart phone, tablet, mobile phone, personal digital assistant (PDA), or any other device. Returning to FIG. 1, the SLAM system 100 with the integrated σ-DVO algorithm uses a smaller number of keyframes, owing to reduced drift in the system. A smaller number of keyframes means lower computational requirements in the back-end of the system.
[0045] FIG. 5 illustrates an example of a map generated by a σ-DVO SLAM system 100. As can be seen, a consistent trajectory is generated using the σ-DVO SLAM system 100.
[0046] The embodiments described above have been shown by way of example, and it should be understood that these embodiments may be susceptible to various modifications and alternative forms. It should be further understood that the claims are not intended to be limited to the particular forms disclosed, but rather to cover all modifications, equivalents, and alternatives falling within the spirit and scope of this disclosure.
[0047] Embodiments within the scope of the disclosure may also include non-transitory computer-readable storage media or machine-readable medium for carrying or having computer-executable instructions or data structures stored thereon. Such non-transitory computer-readable storage media or machine-readable medium may be any available media that can be accessed by a general purpose or special purpose computer. By way of example, and not limitation, such non-transitory computer-readable storage media or machine-readable medium can comprise RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to carry or store desired program code means in the form of computer-executable instructions or data structures. Combinations of the above should also be included within the scope of the non-transitory computer-readable storage media or machine-readable medium.
[0048] Embodiments may also be practiced in distributed computing environments where tasks are performed by local and remote processing devices that are linked (either by hardwired links, wireless links, or by a combination thereof) through a communications network.
[0049] Computer-executable instructions include, for example, instructions and data which cause a general purpose computer, special purpose computer, or special purpose processing device to perform a certain function or group of functions. Computer-executable instructions also include program modules that are executed by computers in stand-alone or network environments. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. Computer-executable instructions, associated data structures, and program modules represent examples of the program code means for executing steps of the methods disclosed herein. The particular sequence of such executable instructions or associated data structures represents examples of corresponding acts for implementing the functions described in such steps.
[0050] While the patent has been described with reference to various embodiments, it will be understood that these embodiments are illustrative and that the scope of the disclosure is not limited to them. Many variations, modifications, additions, and improvements are possible. More generally, embodiments in accordance with the patent have been described in the context of particular embodiments. Functionality may be separated or combined in blocks differently in various embodiments of the disclosure or described with different terminology. These and other variations, modifications, additions, and improvements may fall within the scope of the disclosure as defined in the claims that follow.

Claims

What is claimed is:
1. A method for computing visual Simultaneous Localization and Mapping (SLAM) comprising:
generating, by a visual odometry module, a local odometry estimate;
generating, by a keyframe generator, keyframes;
creating a keyframe graph;
adding a constraint to the keyframe graph using a loop constraint evaluator; and
optimizing the keyframe graph with trajectory.
2. The method of claim 1 further comprising:
generating a new keyframe between a keyframe and a current frame before generating a local odometry estimate.
3. The method of claim 2 wherein adding a constraint to the keyframe graph using a loop constraint evaluator is based on a loop closure;
wherein the loop closure is the return to previously visited locations.
4. The method of claim 3, further comprising adjusting a pose graph based on edge heights of different constraints in the keyframe graph after optimization.
5. A method of applying a probabilistic sensor model for a dense visual odometry comprising:
generating, by a keyframe generator, keyframes;
creating a keyframe graph;
adding a constraint to the keyframe graph using a loop constraint evaluator; and
optimizing the keyframe graph with trajectory.
6. The method of claim 5 further comprising:
generating a new keyframe between a keyframe and a current frame before generating a local odometry estimate.
7. The method of claim 6 wherein adding a constraint to the keyframe graph using a loop constraint evaluator is based on a loop closure;
wherein the loop closure is the return to previously visited locations.
8. The method of claim 7, further comprising adjusting a pose graph based on edge heights of different constraints in the keyframe graph after optimization.
9. A method of t-distribution for photometric errors and a probabilistic sensor model for geometric errors comprising:
Figure imgf000017_0001
10. A visual SLAM system comprising:
a plurality of keyframes including a keyframe, a current keyframe, and a previous keyframe;
a dual dense visual odometry configured to provide a pairwise transformation estimate between two of the plurality of keyframes;
a frame generator configured to create a keyframe graph;
a loop constraint evaluator that adds a constraint to the receiving keyframe graph; and
a graph optimizer configured to produce a map with trajectory.
PCT/EP2017/065677 2016-06-24 2017-06-26 Rgb-d camera based tracking system and method thereof WO2017220815A1 (en)

Priority Applications (6)

Application Number Priority Date Filing Date Title
ES17733432T ES2911400T3 (en) 2016-06-24 2017-06-26 SLAM system based on RGB-D camera and its method
CN201780038684.6A CN109478330B (en) 2016-06-24 2017-06-26 Tracking system based on RGB-D camera and method thereof
JP2018567158A JP2019522288A (en) 2016-06-24 2017-06-26 Tracking system and method based on RGB-D camera
US16/312,039 US11093753B2 (en) 2016-06-24 2017-06-26 RGB-D camera based tracking system and method thereof
EP17733432.3A EP3475917B1 (en) 2016-06-24 2017-06-26 Rgb-d camera based slam system and method thereof
KR1020187037380A KR102471301B1 (en) 2016-06-24 2017-06-26 RGB-D camera based tracking system and its method

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201662354251P 2016-06-24 2016-06-24
US62/354251 2016-06-24

Publications (1)

Publication Number Publication Date
WO2017220815A1 (en) 2017-12-28

Family

ID=59227732

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2017/065677 WO2017220815A1 (en) 2016-06-24 2017-06-26 Rgb-d camera based tracking system and method thereof

Country Status (7)

Country Link
US (1) US11093753B2 (en)
EP (1) EP3475917B1 (en)
JP (1) JP2019522288A (en)
KR (1) KR102471301B1 (en)
CN (1) CN109478330B (en)
ES (1) ES2911400T3 (en)
WO (1) WO2017220815A1 (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3474230B1 (en) * 2017-10-18 2020-07-22 Tata Consultancy Services Limited Systems and methods for edge points based monocular visual slam
TWI679511B (en) * 2018-08-22 2019-12-11 和碩聯合科技股份有限公司 Method and system for planning trajectory
CA3028708A1 (en) * 2018-12-28 2020-06-28 Zih Corp. Method, system and apparatus for dynamic loop closure in mapping trajectories
CN110956651B (en) * 2019-12-16 2021-02-19 哈尔滨工业大学 Terrain semantic perception method based on fusion of vision and vibrotactile sense
CN111292420B (en) * 2020-02-28 2023-04-28 北京百度网讯科技有限公司 Method and device for constructing map
CN112016612A (en) * 2020-08-26 2020-12-01 四川阿泰因机器人智能装备有限公司 Monocular depth estimation-based multi-sensor fusion SLAM method
KR102692572B1 (en) * 2021-08-23 2024-08-05 연세대학교 산학협력단 Method and apparatus for estimating location of a moving object and generating map using fusion of point feature and surfel feature
US11899469B2 (en) 2021-08-24 2024-02-13 Honeywell International Inc. Method and system of integrity monitoring for visual odometry

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120306847A1 (en) * 2011-05-31 2012-12-06 Honda Motor Co., Ltd. Online environment mapping

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2619742B1 (en) * 2010-09-24 2018-02-28 iRobot Corporation Systems and methods for vslam optimization
GB201202344D0 (en) * 2012-02-10 2012-03-28 Isis Innovation Method of locating a sensor and related apparatus
US9269003B2 (en) * 2013-04-30 2016-02-23 Qualcomm Incorporated Diminished and mediated reality effects from reconstruction
US9607401B2 (en) * 2013-05-08 2017-03-28 Regents Of The University Of Minnesota Constrained key frame localization and mapping for vision-aided inertial navigation
CN104374395A (en) * 2014-03-31 2015-02-25 南京邮电大学 Graph-based vision SLAM (simultaneous localization and mapping) method
EP3159121A4 (en) * 2014-06-17 2018-05-16 Yujin Robot Co., Ltd. Device for updating map of mobile robot and method therefor

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
BABU BENZUN WISELY ET AL: "[sigma]-DVO: Sensor Noise Model Meets Dense Visual Odometry", 2016 IEEE INTERNATIONAL SYMPOSIUM ON MIXED AND AUGMENTED REALITY (ISMAR), IEEE, 19 September 2016 (2016-09-19), pages 18 - 26, XP033023403, DOI: 10.1109/ISMAR.2016.11 *
HAUKE STRASDAT ET AL: "Double window optimisation for constant time visual SLAM", COMPUTER VISION (ICCV), 2011 IEEE INTERNATIONAL CONFERENCE ON, IEEE, 6 November 2011 (2011-11-06), pages 2352 - 2359, XP032101470, ISBN: 978-1-4577-1101-5, DOI: 10.1109/ICCV.2011.6126517 *
KERL CHRISTIAN ET AL: "Dense visual SLAM for RGB-D cameras", 2013 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS, IEEE, 3 November 2013 (2013-11-03), pages 2100 - 2106, XP032537192, ISSN: 2153-0858, [retrieved on 20131226], DOI: 10.1109/IROS.2013.6696650 *
KERL CHRISTIAN ET AL: "Robust odometry estimation for RGB-D cameras", 2013 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA); 6-10 MAY 2013; KARLSRUHE, GERMANY, IEEE, US, 6 May 2013 (2013-05-06), pages 3748 - 3754, XP032506020, ISSN: 1050-4729, ISBN: 978-1-4673-5641-1, [retrieved on 20131013], DOI: 10.1109/ICRA.2013.6631104 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108564625A (en) * 2018-04-27 2018-09-21 百度在线网络技术(北京)有限公司 Figure optimization method, device, electronic equipment and storage medium
CN108648274A (en) * 2018-05-10 2018-10-12 华南理工大学 A kind of cognition point cloud map creation system of vision SLAM
CN108648274B (en) * 2018-05-10 2020-05-22 华南理工大学 Cognitive point cloud map creating system of visual SLAM
CN110516527A (en) * 2019-07-08 2019-11-29 广东工业大学 A kind of vision SLAM winding detection improvement method of Case-based Reasoning segmentation

Also Published As

Publication number Publication date
KR102471301B1 (en) 2022-11-29
CN109478330B (en) 2022-03-29
KR20190021257A (en) 2019-03-05
EP3475917B1 (en) 2022-01-26
CN109478330A (en) 2019-03-15
JP2019522288A (en) 2019-08-08
US11093753B2 (en) 2021-08-17
US20190377952A1 (en) 2019-12-12
ES2911400T3 (en) 2022-05-19
EP3475917A1 (en) 2019-05-01

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 17733432; Country of ref document: EP; Kind code of ref document: A1)
ENP Entry into the national phase (Ref document number: 20187037380; Country of ref document: KR; Kind code of ref document: A) (Ref document number: 2018567158; Country of ref document: JP; Kind code of ref document: A)
NENP Non-entry into the national phase (Ref country code: DE)
ENP Entry into the national phase (Ref document number: 2017733432; Country of ref document: EP; Effective date: 20190124)