WO2021160098A1 - Error state kalman filter for visual slam by dynamically tuning measurement noise covariance - Google Patents

Error state kalman filter for visual slam by dynamically tuning measurement noise covariance

Info

Publication number
WO2021160098A1
Authority
WO
WIPO (PCT)
Prior art keywords
pose
slam
state
covariance
state variable
Prior art date
Application number
PCT/CN2021/076067
Other languages
French (fr)
Inventor
Peng HE
Original Assignee
Guangdong Oppo Mobile Telecommunications Corp., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Oppo Mobile Telecommunications Corp., Ltd. filed Critical Guangdong Oppo Mobile Telecommunications Corp., Ltd.
Priority to CN202180011363.3A priority Critical patent/CN115003983B/en
Publication of WO2021160098A1 publication Critical patent/WO2021160098A1/en

Links

Images

Classifications

    • GPHYSICS
    • G01MEASURING; TESTING
    • G01SRADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S17/00Systems using the reflection or reradiation of electromagnetic waves other than radio waves, e.g. lidar systems
    • G01S17/88Lidar systems specially adapted for specific applications
    • G01S17/89Lidar systems specially adapted for specific applications for mapping or imaging
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01CMEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C21/00Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
    • G01C21/10Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration
    • G01C21/12Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration executed aboard the object being navigated; Dead reckoning
    • G01C21/16Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration executed aboard the object being navigated; Dead reckoning by integrating acceleration or speed, i.e. inertial navigation
    • G01C21/165Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration executed aboard the object being navigated; Dead reckoning by integrating acceleration or speed, i.e. inertial navigation combined with non-inertial navigation instruments
    • G01C21/1656Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration executed aboard the object being navigated; Dead reckoning by integrating acceleration or speed, i.e. inertial navigation combined with non-inertial navigation instruments with passive imaging devices, e.g. cameras
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01SRADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S17/00Systems using the reflection or reradiation of electromagnetic waves other than radio waves, e.g. lidar systems
    • G01S17/86Combinations of lidar systems with systems other than lidar, radar or sonar, e.g. with direction finders
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01SRADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S7/00Details of systems according to groups G01S13/00, G01S15/00, G01S17/00
    • G01S7/48Details of systems according to groups G01S13/00, G01S15/00, G01S17/00 of systems according to group G01S17/00
    • G01S7/4808Evaluating distance, position or velocity data

Definitions

  • Inertial navigation is a classical method for localization. Inertial navigation uses one or more sensors, for example, motion sensors or rotation sensors, to continuously calculate and adjust the position, the orientation, and the velocity of a moving object. Motion and rotation sensing can be accomplished via an inertial measurement unit (IMU) .
  • An advantage of inertial navigation is that it relies only on the sensor itself without the need for external references such as global positioning systems, sonar location mapping, or camera based location estimation.
  • In an Extended Kalman Filter (EKF), the orientation needs to use a quaternion or a rotation matrix, which have many parameters.
  • Sometimes, EKF has high-order products, which makes the computation of Jacobians difficult and slow.
  • When state values are too large, computing Jacobians using EKF may have various computational problems, e.g., possible parameter singularities, gimbal lock issues, etc., which cannot guarantee that the linearization remains valid at all times.
  • Embodiments of the present invention are related to updating an ESKF filter with image rate measurements by interleaving measured updates to more tightly couple predicted state variables and measured state variables. Embodiments of the present invention are applicable to a variety of applications in augmented reality and computer-based display systems.
  • a method of updating a state variable in a simultaneous localization and mapping system includes receiving a computed position from an inertial measurement unit that includes a position, orientation, and velocity.
  • the method includes receiving multiple images from a camera.
  • the method further includes predicting an error state including a state variable and a state covariance.
  • the method includes processing a first subset of images from the multiple images using an optimization process to provide a first SLAM pose at a first frequency.
  • the method further includes processing a second subset of images from the multiple images using pure visual odometry to provide a second SLAM pose at a second frequency.
  • the method further includes updating the state variable and the state covariance using at least one of the first SLAM pose or the second SLAM pose.
  • the state variable includes a vector of at least velocity, an accelerometer metric, gyroscopic metric, or gravitational measurement.
  • a computing system for updating a state variable and state covariance of a SLAM system includes an inertial measurement unit.
  • the computing system further includes a processor and one or more memories storing instructions executable by the processor.
  • the processor executes instructions that configure the computing system to update the state variable and state covariance of the SLAM system.
  • the processor configures the computing system to receive a computed position from an inertial measurement unit that includes a position, orientation, and velocity.
  • the processor configures the computing system to receive multiple images from a camera.
  • the processor configures the computing system to predict an error state including a state variable and a state covariance.
  • the processor configures the computing system to use an optimization process for a first subset of images from the multiple images to provide a first SLAM pose at a first frequency.
  • the processor configures the computing system to use pure visual odometry on a second subset of images from the multiple images to provide a second SLAM pose at a second frequency.
  • the processor configures the computing system to update the state variable and the state covariance using at least one of the first SLAM pose or the second SLAM pose.
  • embodiments of the present invention provide a SLAM output for every image at the frame rate so that the SLAM pose can be used for measurement updates in an ESKF process. Further, the dynamic tuning of measurement noise covariance can make the SLAM pose more accurate, as this covariance value represents how confident the filter is in the measurement.
  • the measurement update may come directly from the output of visual SLAM.
  • FIG. 1 illustrates an example of a computer system that includes a depth sensor and red, green, and blue (RGB) optical sensors for AR applications according to an embodiment of the present invention.
  • FIG. 2 illustrates an example of a block diagram for using Error-State Kalman Filtering (ESKF) to update state variables in a simultaneous localization and mapping (SLAM) system according to an embodiment of the present invention.
  • FIG. 3 illustrates another example of a flow for updating state variables in a simultaneous localization and mapping (SLAM) system according to an embodiment of the present invention.
  • FIG. 4 is a simplified flowchart illustrating a method of updating state variables in a simultaneous localization and mapping (SLAM) system according to an embodiment of the present invention.
  • FIG. 5A illustrates an example of a plot of iris pose, predicted iris pose, and ESKF pose according to an embodiment of the present invention.
  • FIG. 5B is a plot of the derivative of the predicted iris pose and ESKF pose according to an embodiment of the present invention.
  • FIG. 6A illustrates an example of a plot of iris pose, predicted iris pose, and ESKF pose, according to an embodiment of the present invention.
  • FIG. 6B illustrates an x-derivative of the predicted iris pose and the ESKF pose, according to an embodiment of the present invention.
  • FIG. 7 illustrates an example computer system according to an embodiment of the present invention.
  • Embodiments of the present invention relate to tightly coupling inertial measurement sensors (IMU) to Visual-inertial Simultaneous Localization and Mapping (SLAM) systems used in Augmented Reality (AR) systems.
  • SLAM involves determining and mapping a previously-unknown real-world environment using image capture and determining a position in the real-world using visual information in the captured images.
  • embodiments of the present invention provide improvements over conventional techniques by providing SLAM pose measurement updates to an ESKF process at the image capture rate.
  • certain embodiments combine an optimized SLAM pose and a pure visual odometry SLAM pose to provide higher measurement update rates.
  • the ESKF process may be used to generate more accurate measurement updates and support higher rates of movement while maintaining the ESKF processing method. Dynamic tuning of measurement noise covariance can make the pose output more accurate, as this covariance value represents how confident the filter is with the measurement. The measurement comes directly from the output of visual SLAM.
  • the coupling of SLAM with optimized pose determination and SLAM using pure visual odometry to provide a higher frequency update (up to and including image rate) of the measurement position improves the predicted state and provides a solution for fusing IMU information with visual SLAM for higher velocity applications.
  • FIG. 1 illustrates an example of a computer system 110 that includes a depth sensor 112 and an RGB optical sensor 114 for augmented reality applications according to an embodiment of the present invention.
  • FIG. 1 includes a computer system 110 that includes a depth sensor 112, an Inertial Measurement Unit (IMU) 113, and an RGB optical sensor 114 for AR applications including calibration of one or more sensors.
  • the AR applications can be implemented by an AR module 116 of the computer system 110.
  • the computer system 110 represents a suitable user device that includes, in addition to the depth sensor 112 and the RGB optical sensor 114, one or more graphical processing units (GPUs) , one or more general purpose processors (GPPs) , and one or more memories storing computer-readable instructions that are executable by at least one of the processors to perform various functionalities of the embodiments of the present invention.
  • the computer system 110 can be any of a smartphone, a tablet, an AR headset, or a wearable AR device.
  • the depth sensor 112 has a known maximum depth range (e.g., a maximum working distance) and this maximum value may be stored locally and/or accessible to the AR module 116.
  • the depth sensor 112 can be a ToF camera.
  • the depth map generated by the depth sensor 112 includes a depth image.
  • the IMU 113 can detect inertial measurements and transmit measurements of the detected inertial measurements to computer system 110. Examples of measurements include position, angular velocity, and acceleration. IMU 113 can include an accelerometer and/or a gyroscope. An accelerometer can output a measurement of acceleration in one or more directions (e.g., x, y, and/or z) . A gyroscope can output a measurement of angular velocity. In turn, the computer system 110 receives the inertial measurements and adjusts the AR scene 120 based on the received measurements.
  • the RGB optical sensor 114 can be a color camera.
  • the depth image and the RGB image can have different resolutions. Typically, the resolution of the depth image is smaller than that of the RGB image. For instance, the depth image has a 640x180 resolution, whereas the RGB image has a 1920x1280 resolution.
  • the RGB optical sensor 114 generates an RGB image of a real-world environment that includes, for instance, a real-world object 130.
  • the depth sensor 112 generates depth data about the real-world environment, where this data includes, for instance, a depth map that shows depth (s) of the real-world object 130 (e.g., distance (s) between the depth sensor 112 and the real-world object 130) .
  • the depth sensor 112, the IMU 113, and/or the RGB optical sensor 114, as installed in the computer system 110, may be separated by a transformation (e.g., distance offset, field of view angle difference, etc. ) .
  • This transformation may be known and its value may be stored locally and/or accessible to the AR module 116.
  • the ToF camera and the color camera can have similar fields of view. But due to the transformation, the fields of view would partially, rather than fully, overlap.
  • the AR module 116 can generate a red, green, blue, and depth (RGBD) image from the RGB image and the depth map to detect an occlusion of the virtual object 124 by at least a portion of the real-world object representation 122 or vice versa.
  • the AR module 116 can additionally or alternatively generate a 3D model of the real-world environment based on the depth map, where the 3D model includes multi-level voxels. Such voxels are used to detect collision between the virtual object 124 and at least a portion of the real-world object representation 122.
  • the AR scene 120 can be rendered to properly show the occlusion and avoid the rendering of the collision.
  • the AR module 116 can be implemented as specialized hardware and/or a combination of hardware and software (e.g., general purpose processor and computer-readable instructions stored in memory and executable by the general purpose processor) . In addition to initializing an AR session and performing VIO, the AR module 116 can detect occlusion and collision to properly render the AR scene 120.
  • a smartphone is used for an AR session that shows the real-world environment.
  • the AR session includes rendering an AR scene that includes a representation of a real-world table on top of which a vase (or some other real-world object) is placed.
  • a virtual ball (or some other virtual object) is to be shown in the AR scene.
  • the virtual ball is to be shown on top of the table too.
  • the virtual ball can occlude the virtual vase when the virtual vase is behind the virtual ball relative to a change in the pose of the smartphone. And in remaining parts of the AR scene, no occlusion is present.
  • a user of the smartphone can interact with the virtual ball to move the virtual ball on the top surface of the virtual table (that represents the real-world table) . By tracking a possible collision between the virtual ball and the virtual object, any interaction that would cause collision would not be rendered. In other words, the collision tracking can be used to control where the virtual ball can be moved in the AR scene.
  • computer system 110 performs an ESKF process to provide a measurement update using one or more sensors such as depth sensor 112 and/or RGB optical sensor 114.
  • computer system 110 can perform measurement updates using both an optimized SLAM pose and a SLAM pose obtained using pure visual odometry.
  • FIG. 2 illustrates an example of a process for using an Error-State Kalman Filtering (ESKF) process 201 to update state variables in a simultaneous localization and mapping (SLAM) system according to an embodiment of the present invention.
  • the ESKF process 201 receives an input from an inertial measurement unit (IMU) and an input from a visual SLAM module.
  • IMU raw data 202 may include acceleration and angular velocity information of the camera system.
  • the ESKF process 201 predicts an error-state at block 204 using the IMU raw data 202 to generate a predicted state variable.
  • An example of the predicted state variable is a vector including matrices that include pose information including 3-axes position, the 3-axes velocity, 3-axes orientation (e.g., a quaternion) , and IMU sensor information including 3-axes accelerometer bias, 3-axes gyroscope bias.
  • the ESKF 201 predicts the predicted state variable.
  • the ESKF process 201 may also predict a predicted state covariance.
  • the ESKF process 201 predicts the error state δx (e.g., the state variable) by computing the predicted error state, represented by δx ← F_x(x, u_m) δx. The state transition model F_x of the state variable x and the control input u_m are used to predict the error state (the predicted state variable).
  • the ESKF process 201 may also predict the error state covariance (the predicted state covariance) as represented by P ← F_x P F_x^T + F_i Q_i F_i^T. The predicted covariance P is computed by combining the state transition model F_x, the state covariance P, the transpose of the transition model F_x^T, the initial transition model F_i, an estimated covariance Q_i, and the transpose of the initial transition model F_i^T.
  • the computing system 110 may capture multiple images of a real-world scene. For instance, the computing system 110 can receive image capture data 206 from a camera and perform visual SLAM 208. The computing system 110 can process the image capture data 206 using visual SLAM 208 to generate an image rate pose 210. The ESKF process 201 may receive the image rate pose 210 from computing system 110. Additional detail with regard to image rate pose 210 generation is likely best understood with regard to FIG. 5.
  • the ESKF process 201 performs a measurement update at block 212 by using a relationship between a measurement vector that is estimated from visual SLAM 208 and a noise factor.
  • the ESKF process 201 update equations can be represented as a relationship between the measurement vector, including position, velocity, orientation, accelerometer bias, gyroscope bias, gravitational acceleration, that are estimated from the visual SLAM 208 and a Gaussian white noise.
  • y = h(x_t) + v
  • y is the measurement vector, including position, velocity, orientation, accelerometer bias, gyroscope bias, and gravitational acceleration, estimated from our visual SLAM
  • h() is a nonlinear function of the system state (the true state)
  • v is a white Gaussian noise with covariance V.
  • the white Gaussian noise v can be represented as v ~ N(0, V), i.e., it is drawn from a zero-mean normal distribution with covariance V.
  • Jacobian matrix H can be used to linearize the h() function.
  • the Jacobian matrix H is a 19x19 identity matrix.
  • the ESKF correction or update equations can, therefore, be represented as K = P H^T (H P H^T + V)^{-1}, δx ← K (y - h(x_t)), and P ← (I - KH) P.
  • the ESKF sets the measurement noise covariance V, described above, to update the state variables and state covariance.
  • this measurement noise covariance is initialized and fixed during the ESKF update.
  • the measurement noise covariance changes due to the integration of visual SLAM to determine the measurement vector.
  • Performing visual SLAM 208 results in various quality levels of SLAM performance that can be characterized by various SLAM quality metrics (e.g., good, weak, lost, etc. ) .
  • the image rate pose 210 (e.g., a SLAM pose) is used by ESKF process 201 as the measurement vector.
  • the ESKF process 201 tunes the covariance dynamically according to the performance of visual SLAM.
  • a lower value of the measurement noise covariance increases a confidence metric of the filter relating to the measurement. For example, in a measurement that is characterized with visual SLAM performance that is good, the measurement noise covariance should be a lesser value. In another example of a measurement that is characterized with visual SLAM performance that is poor or lost, the measurement noise covariance should be a greater value.
  • the ESKF process 201 injects the measured error into the nominal state at block 214.
  • the ESKF process 201 performs a direct sum of the full state x with the measured error δx
  • a mathematical representation of the injection is x ← x ⊕ δx
  • the measured error δx may be a vector that includes one or more error terms or matrices.
  • the measured error δx is a vector that includes measured error for the 3-axes position δp, the 3-axes velocity δv, and the 3-axes orientation δθ (e.g., a quaternion); it may, however, include additional or alternative measured error components.
  • the ESKF process 201 resets the ESKF at block 216. For instance, the error state is reset to 0.
  • the ESKF process 201 tunes the covariance, in some cases using the reset formula P ← G P G^T, with G the ESKF reset Jacobian.
  • FIG. 3 illustrates another example of a flow for updating state variables in a simultaneous localization and mapping (SLAM) system according to an embodiment of the present invention.
  • FIG. 3 illustrates the operations in a manner similar to FIG. 2 and additionally is annotated with formulas for the operations that may be performed at one or more steps as described with regard to FIG. 2.
  • the ESKF process may perform various computations for each step.
  • the equations above the arrow between IMU raw data and the ESKF process step error state prediction represent the mathematical prediction of the state variable and the prediction of the covariance.
  • the ESKF process illustrates that the measurement update may be computed from the SLAM pose, as described with regard to block 212.
  • the full state x is computed by injecting the observed (i.e., measured) error state into the state variable. This can be represented by the equation x ← x ⊕ δx and the description of block 214.
  • Various elements from FIG. 2 have been maintained with the same reference numerals.
  • the ESKF generates a predicted state δx 301 (e.g., the error state predicted by the ESKF) for the full state variable x.
  • the predicted state 301 is shown by the computation δx ← F_x(x, u_m) δx
  • the ESKF may also generate a predicted state covariance 302.
  • the predicted state covariance 302 can be calculated by P ← F_x P F_x^T + F_i Q_i F_i^T. The ESKF performs a measurement update by generating the measured state using a non-linear function of the true state h() and adding a white Gaussian noise v.
  • the ESKF measurement update equations 303 can, therefore, be represented as K = P H^T (H P H^T + V)^{-1}, δx ← K (y - h(x_t)), and P ← (I - KH) P.
  • while the ESKF is performing the measurement update, the ESKF sets the measurement noise covariance V.
  • the ESKF sets the measurement noise based at least in part on the quality level of the visual SLAM.
  • the optimized SLAM pose and pure visual odometry SLAM pose may be impacted by performance of visual SLAM. Accordingly, dynamically tuning the measurement noise renders a more accurate SLAM pose.
  • the ESKF incorporates the measured error into the nominal state variable by performing a direct sum 304 of the full state x with the measured error δx
  • the ESKF resets the cycle by setting the predicted state to 0 and tuning the covariance for the next iteration of the ESKF by using reset equations 305.
  • FIG. 4 is a simplified flowchart illustrating a method of updating state variables in a simultaneous localization and mapping (SLAM) system according to an embodiment of the present invention.
  • Some or all of the operations of the flows can be implemented via specific hardware on the computer system and/or can be implemented as computer-readable instructions stored on a non-transitory computer-readable medium of the computer system.
  • the computer-readable instructions represent programmable modules that include code executable by a processor of the computer system. The execution of such instructions configures the computer system to perform the respective operations.
  • Each programmable module in combination with the processor represents a means for performing a respective operation (s) . While the operations are illustrated in a particular order, it should be understood that no particular order is necessary and that one or more operations may be omitted, skipped, and/or reordered.
  • the flow starts at operation 402, where the computer system receives a computed position from an inertial measurement unit.
  • An IMU is a navigation device that continuously calculates a position, orientation, or velocity using dead reckoning of the moving object without the need for external references or sensors.
  • the flow includes operation 404, where the computer system receives a plurality of images from a camera.
  • the computing system uses a camera system, such as RGB optical sensor 114 or depth sensor 112 to generate image data to represent a real world scene.
  • the flow includes operation 406, where the computer system predicts an error state comprising a state variable and a state covariance from the computed position as described with regard to FIGS. 2 and 3.
  • the computer system can predict the error state covariance (the predicted state covariance) as represented by P ← F_x P F_x^T + F_i Q_i F_i^T. The predicted covariance P is computed by combining the state transition model F_x, the state covariance P, the transpose of the transition model F_x^T, the initial transition model F_i, an estimated covariance Q_i, and the transpose of the initial transition model F_i^T.
  • the computer system may also compute the predicted error state as represented by δx ← F_x(x, u_m) δx. The state transition model F_x of the state variable x and the control input u_m are used to predict the error state (the predicted state variable).
  • the flow includes operation 408, where the computer system processes a first subset of images of the plurality of images to provide a first SLAM pose at a first frequency.
  • This first SLAM pose can be referred to as an optimized SLAM pose since it is generated, in some embodiments, by bundling images and performing an optimization process to generate an optimized SLAM pose .
  • keyframe-based visual SLAM methods cannot always output at the image rate because of the slow optimization process in the back-end. For example, for an image stream with a frequency of 30 Hz, the optimization process results in a 10 Hz SLAM pose.
  • the optimization process may be any optimization process that generates an optimized SLAM pose (e.g., Bayesian, graphical, non-linear, etc. ) .
  • the flow includes operation 410, where the computer system processes a second subset of images of the plurality of images with pure visual odometry to provide a second SLAM pose at a second frequency.
  • This second SLAM pose can be referred to as a pure visual odometry SLAM pose since it is generated, in some embodiments, by performing visual odometry without an optimization process as with the first SLAM pose.
  • the computer system processes a second subset of images of the plurality of images at 20 Hz. For instance, the computer system processes the second subset of images using only front-end processing techniques such as Perspective-n-Point (a hedged code sketch of such a front-end pose follows this list).
  • the flow includes operation 412, where the computer system updates the state variable and the state covariance using the first SLAM pose or the second SLAM pose.
  • the ESKF process 201 performs an update of the state variable and the state covariance using the first SLAM pose at the first frequency (i.e., the optimized SLAM pose) and/or the second SLAM pose at the second frequency (i.e., the pure visual odometry SLAM pose).
  • the computer system performs a measurement update using the first SLAM pose or the second SLAM pose to achieve an image pose rate (e.g., using the first frequency and the second frequency to achieve a sum of the frequencies) .
  • the computer system performs a measurement update at an image pose rate of 30 Hz.
  • FIG. 4 provides a particular method of updating state variables in a SLAM system according to an embodiment of the present invention.
  • Other sequences of steps may also be performed according to alternative embodiments.
  • alternative embodiments of the present invention may perform the steps outlined above in a different order.
  • the individual steps illustrated in FIG. 4 may include multiple sub-steps that may be performed in various sequences as appropriate to the individual step.
  • additional steps may be added or removed depending on the particular applications.
  • One of ordinary skill in the art would recognize many variations, modifications, and alternatives.
  • FIG. 5A illustrates an example of a plot 501 of iris pose, predicted pose, and ESKF pose according to an embodiment of the present invention.
  • FIG. 5B is a plot of the derivative of the iris pose and a Kalman-filtered (i.e., ESKF, EKF, etc.) pose according to an embodiment of the present invention.
  • plot 501 illustrates poses 503 including an iris pose 503A, a predicted iris pose 503B, and an ESKF pose 503C.
  • the poses 503 provide a visual representation, with reference to a y-axis 502 and an x-axis 504, of the difference between the actual position (iris pose 503A), the IMU pose (predicted iris pose 503B), and the smoothed ESKF pose (ESKF pose 503C).
  • the poses 503 represent the positions (or trajectories) of the iris performing the process as described with regards to embodiments of the present disclosure. During movements, the position (or trajectory) will vary based on the movements. In the example of FIG. 5A, the position/trajectory is shown as a linearization in two dimensions x (horizontal axis) 504 and y (vertical axis) 502.
  • the x-axis 504 and y-axis 502 may represent the trajectory intensity in the axis.
  • a value of (-3, -3) may represent a trajectory that is changing position at -3x per meter and -3y per meter.
  • FIG. 5B which is a plot 507 of the y-derivative of predicted iris pose 503B and ESKF pose 503C
  • the y-derivative plot depicts the change in the linearization of the predicted iris pose 503B and ESKF pose 503C in the y-axis trajectory.
  • FIG. 6A illustrates an example of a plot 601 of iris pose, predicted pose, and ESKF pose, according to an embodiment of the present invention.
  • FIG. 6A illustrates a plot 601 that illustrates poses 603 including an iris pose 603A, a predicted iris pose 603B, and an ESKF pose 603C.
  • the poses 603 illustrate a visual representation of the difference in the actual position (iris pose 603A) , IMU pose (predicted iris pose 603B) , and the smoothed ESKF pose (ESKF pose 603C) with respect to an x-axis 604 and a y-axis 602.
  • FIG. 6B which is a plot 607 that illustrates an x-derivative of the predicted iris pose 603B and the ESKF pose 603C
  • the derivatives reflect the changes in the x-axis.
  • the y-derivative plot depicts the change in the linearization of the predicted iris pose 603B and ESKF pose 603C in the x-axis trajectory with respect to a y-axis 605 and x-axis 606.
  • FIG. 7 illustrates examples of components of a computer system 700 according to certain embodiments.
  • the computer system 700 is an example of the computer system described herein above. Although these components are illustrated as belonging to a same computer system 700, the computer system 700 can also be distributed.
  • the computer system 700 includes at least a processor 702, a memory 704, a storage device 706, input/output peripherals (I/O) 708, communication peripherals 710, and an interface bus 712.
  • the interface bus 712 is configured to communicate, transmit, and transfer data, controls, and commands among the various components of the computer system 700.
  • the memory 704 and the storage device 706 include computer-readable storage media, such as RAM, ROM, electrically erasable programmable read-only memory (EEPROM) , hard drives, CD-ROMs, optical storage devices, magnetic storage devices, electronic non-volatile computer storage, for example memory, and other tangible storage media. Any of such computer readable storage media can be configured to store instructions or program codes embodying aspects of the disclosure.
  • the memory 704 and the storage device 706 also include computer readable signal media.
  • a computer readable signal medium includes a propagated data signal with computer readable program code embodied therein. Such a propagated signal takes any of a variety of forms including, but not limited to, electromagnetic, optical, or any combination thereof.
  • a computer readable signal medium includes any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use in connection with the computer system 700.
  • the memory 704 includes an operating system, programs, and applications.
  • the processor 702 is configured to execute the stored instructions and includes, for example, a logical processing unit, a microprocessor, a digital signal processor, and other processors.
  • the memory 704 and/or the processor 702 can be virtualized and can be hosted within another computer system of, for example, a cloud network or a data center.
  • the I/O peripherals 708 include user interfaces, such as a keyboard, screen (e.g., a touch screen) , microphone, speaker, other input/output devices, and computing components, such as graphical processing units, serial ports, parallel ports, universal serial buses, and other input/output peripherals.
  • the I/O peripherals 708 are connected to the processor 702 through any of the ports coupled to the interface bus 712.
  • the communication peripherals 710 are configured to facilitate communication between the computer system 700 and other computing devices over a communications network and include, for example, a network interface controller, modem, wireless and wired interface cards, antenna, and other communication peripherals.
  • a computing device can include any suitable arrangement of components that provide a result conditioned on one or more inputs.
  • Suitable computing devices include multipurpose microprocessor-based computer systems accessing stored software that programs or configures the computer system from a general-purpose computing apparatus to a specialized computing apparatus implementing one or more embodiments of the present subject matter. Any suitable programming, scripting, or other type of language or combinations of languages may be used to implement the teachings contained herein in software to be used in programming or configuring a computing device.
  • Embodiments of the methods disclosed herein may be performed in the operation of such computing devices.
  • the order of the blocks presented in the examples above can be varied; for example, blocks can be re-ordered, combined, and/or broken into sub-blocks. Certain blocks or processes can be performed in parallel.
  • use of "based on" is meant to be open and inclusive, in that a process, step, calculation, or other action "based on" one or more recited conditions or values may, in practice, be based on additional conditions or values beyond those recited.
  • use of “based at least in part on” is meant to be open and inclusive, in that a process, step, calculation, or other action “based at least in part on” one or more recited conditions or values may, in practice, be based on additional conditions or values beyond those recited. Headings, lists, and numbering included herein are for ease of explanation only and are not meant to be limiting.
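The Perspective-n-Point front-end mentioned in the flow above can be pictured with the following hedged sketch. It assumes OpenCV's solvePnP and that the 2D-3D correspondences for the current frame have already been established by feature matching; it is an illustration only, not the reference implementation of the disclosed system.

```python
import cv2
import numpy as np

def pnp_pose(points_3d, points_2d, K, dist=None):
    """Pure visual odometry pose for one frame from 2D-3D correspondences."""
    dist = np.zeros(5) if dist is None else dist
    ok, rvec, tvec = cv2.solvePnP(points_3d.astype(np.float64),
                                  points_2d.astype(np.float64),
                                  K, dist, flags=cv2.SOLVEPNP_ITERATIVE)
    if not ok:
        return None                      # treat as a "lost" SLAM quality level
    R, _ = cv2.Rodrigues(rvec)           # rotation vector -> rotation matrix
    return R, tvec                       # camera pose (world-to-camera)
```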

Landscapes

  • Engineering & Computer Science (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Electromagnetism (AREA)
  • Automation & Control Theory (AREA)
  • Image Analysis (AREA)
  • Navigation (AREA)

Abstract

Techniques for performing image rate pose measurement update are described. In an example, a computer system (110) receives a computed position from an inertial measurement unit (113). The computer system (110) receives multiple images from a camera (112, 114). The computer system (110) predicts an error state comprising a state variable and a state covariance. The computer system (110) uses an optimization process to provide a first SLAM pose at a first frequency and pure visual odometry to provide a second SLAM pose at a second frequency. The computing system (110) updates the state variable and the state covariance using the first SLAM pose or second SLAM pose.

Description

ERROR STATE KALMAN FILTER FOR VISUAL SLAM BY DYNAMICALLY TUNING MEASUREMENT NOISE COVARIANCE BACKGROUND OF THE INVENTION
Inertial navigation is a classical method for localization. Inertial navigation uses one or more sensors, for example, motion sensors or rotation sensors, to continuously calculate and adjust the position, the orientation, and the velocity of a moving object. Motion and rotation sensing can be accomplished via an inertial measurement unit (IMU) . An advantage of inertial navigation is that it relies only on the sensor itself without the need for external references such as global positioning systems, sonar location mapping, or camera based location estimation.
Another method of determining location is applying an Extended Kalman Filter (EKF), such as with a global positioning system. However, in EKF, the orientation needs to use a quaternion or a rotation matrix, which have many parameters. Sometimes, EKF has high-order products, which makes the computation of Jacobians difficult and slow. In addition, when state values are too large, computing Jacobians using EKF may have various computational problems, e.g., possible parameter singularities, gimbal lock issues, etc., which cannot guarantee that the linearization remains valid at all times.
SUMMARY OF THE INVENTION
Embodiments of the present invention are related to updating an ESKF filter with image rate measurements by interleaving measured updates to more tightly couple predicted state variables and measured state variables. Embodiments of the present invention are applicable to a variety of applications in augmented reality and computer-based display systems.
According to an embodiment of the present invention, a method of updating a state variable in a simultaneous localization and mapping system is provided. The method includes receiving a computed position from an inertial measurement unit that includes a position, orientation, and velocity. The method includes receiving multiple images from a camera. The method further includes predicting an error state including a state variable and a state covariance. The method includes processing a first subset of images from the multiple images using an optimization process to provide a first SLAM pose at a first frequency. The method further includes processing a second subset of images from the multiple images using pure visual odometry to provide a second SLAM pose at a second frequency. The method further includes updating the state variable and the state covariance using at least one of the first SLAM pose or the second SLAM pose. In one example, the state variable includes a vector of at least velocity, an accelerometer metric, gyroscopic metric, or gravitational measurement.
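As a rough orientation for the reader, the claimed steps can be arranged as in the following sketch. The function and attribute names (predict, measurement_update, slam_backend, visual_odometry) and the keyframe split are illustrative assumptions only, not elements of the claimed method.

```python
def update_slam_state(imu_data, images, eskf):
    """Illustrative arrangement of the claimed steps for one batch of frames."""
    # Predict the error state (state variable and state covariance) from the IMU.
    eskf.predict(imu_data)

    # First subset: keyframes sent to the (slower) optimization back-end,
    # yielding the first SLAM pose at a first frequency.
    keyframes = images[::3]            # assumed split, e.g. ~10 Hz of a 30 Hz stream
    for frame in keyframes:
        pose, quality = eskf.slam_backend.optimize(frame)
        eskf.measurement_update(pose, quality)

    # Second subset: remaining frames processed with pure visual odometry,
    # yielding the second SLAM pose at a second frequency.
    for i, frame in enumerate(images):
        if i % 3 != 0:
            pose, quality = eskf.visual_odometry(frame)
            eskf.measurement_update(pose, quality)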
According to an embodiment of the present invention, a computing system for updating a state variable and state covariance of a SLAM system is provided. The computing system includes an inertial measurement unit. The computing system further includes a processor and one or more memories storing instructions executable by the processor. The processor executes instructions that configure the computing system to update the state variable and state covariance of the SLAM system. The processor configures the computing system to receive a computed position from an inertial measurement unit that includes a position, orientation, and velocity. The processor configures the computing system to receive multiple images from a camera. The processor configures the computing system to predict an error state including a state variable and  a state covariance. The processor configures the computing system to use an optimization process for a first subset of images from the multiple images to provide a first SLAM pose at a first frequency. The processor configures the computing system to use pure visual odometry on a second subset of images from the multiple images to provide a second SLAM pose at a second frequency. The processor configures the computing system to update the state variable and the state covariance using at least one of the first SLAM pose or the second SLAM pose.
Numerous benefits are achieved by way of the present invention over conventional techniques. For example, embodiments of the present invention provide a SLAM output for every image at the frame rate so that the SLAM pose can be used for measurement updates in an ESKF process. Further, the dynamic tuning of measurement noise covariance can make the SLAM pose more accurate, as this covariance value represents how confident the filter is in the measurement. The measurement update may come directly from the output of visual SLAM. These and other embodiments of the invention along with many of its advantages and features are described in more detail in conjunction with the text below and attached figures.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 illustrates an example of a computer system that includes a depth sensor and red, green, and blue (RGB) optical sensors for AR applications according to an embodiment of the present invention.
FIG. 2 illustrates an example of a block diagram for using Error-State Kalman Filtering (ESKF) to update state variables in a simultaneous localization and mapping (SLAM) system according to an embodiment of the present invention.
FIG. 3 illustrates another example of a flow for updating state variables in a simultaneous localization and mapping (SLAM) system according to an embodiment of the present invention.
FIG. 4 is a simplified flowchart illustrating a method of updating state variables in a simultaneous localization and mapping (SLAM) system according to an embodiment of the present invention.
FIG. 5A illustrates an example of a plot of iris pose, predicted iris pose, and ESKF pose according to an embodiment of the present invention.
FIG. 5B is a plot of the derivative of the predicted iris pose and ESKF pose according to an embodiment of the present invention.
FIG. 6A illustrates an example of a plot of iris pose, predicted iris pose, and ESKF pose, according to an embodiment of the present invention.
FIG. 6B illustrates an x-derivative of the predicted iris pose and the ESKF pose, according to an embodiment of the present invention.
FIG. 7 illustrates an example computer system according to an embodiment of the present invention.
DETAILED DESCRIPTION OF THE INVENTION
In the following description, various embodiments will be described. For purposes of explanation, specific configurations and details are set forth in order to provide a thorough understanding of the embodiments. However, it will also be apparent to one skilled in the art that the embodiments may be practiced without the specific details. Furthermore, well-known features may be omitted or simplified in order not to obscure the embodiment being described.
Embodiments of the present invention relate to tightly coupling inertial measurement sensors (IMU) to Visual-inertial Simultaneous Localization and Mapping (SLAM) systems used in Augmented Reality (AR) systems. SLAM involves determining and mapping a previously-unknown real-world environment using image capture and determining a position in the real-world using visual information in the captured images.
In particular, embodiments of the present invention provide improvements over conventional techniques by providing SLAM pose measurement updates to an ESKF process at the image capture rate. For example, certain embodiments combine an optimized SLAM pose and a pure visual odometry SLAM pose to provide higher measurement update rates. In turn, the ESKF process may be used to generate more accurate measurement updates and support higher rates of movement while maintaining the ESKF processing method. Dynamic tuning of measurement noise covariance can make the pose output more accurate, as this covariance value represents how confident the filter is with the measurement. The measurement comes directly from the output of visual SLAM. The coupling of SLAM with optimized pose determination and SLAM using pure visual odometry to provide a higher frequency update (up to and including image rate) of the measurement position improves the predicted state and provides a solution for fusing IMU information with visual SLAM for higher velocity applications.
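The combination of the two pose sources can be pictured with a toy scheduler. The 10 Hz / 20 Hz / 30 Hz figures below are the example rates used later in this description; treating every third frame as a keyframe is an assumption made for illustration.

```python
def pose_source_for_frame(frame_index):
    """Which pose source covers a given 30 Hz camera frame."""
    # Every third frame is handled by the optimization back-end (~10 Hz);
    # the two frames in between come from pure visual odometry (~20 Hz),
    # so together every frame receives a SLAM pose measurement.
    return "optimized" if frame_index % 3 == 0 else "visual_odometry"

print([pose_source_for_frame(i) for i in range(6)])
# ['optimized', 'visual_odometry', 'visual_odometry',
#  'optimized', 'visual_odometry', 'visual_odometry']
```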
FIG. 1 illustrates an example of a computer system 110 that includes a depth sensor 112 and an RGB optical sensor 114 for augmented reality applications according to an embodiment of the present invention. FIG. 1 includes a computer system 110 that includes a depth sensor 112, an Inertial Measurement Unit (IMU) 113, and an RGB optical sensor 114 for AR applications including calibration of one or more sensors. The AR applications can be implemented by an AR module 116 of the computer system 110.
In an example, the computer system 110 represents a suitable user device that includes, in addition to the depth sensor 112 and the RGB optical sensor 114, one or more graphical processing units (GPUs) , one or more general purpose processors (GPPs) , and one or more memories storing computer-readable instructions that are executable by at least one of the processors to perform various functionalities of the embodiments of the present invention. For instance, the computer system 110 can be any of a smartphone, a tablet, an AR headset, or a wearable AR device.
The depth sensor 112 has a known maximum depth range (e.g., a maximum working distance) and this maximum value may be stored locally and/or accessible to the AR module 116. The depth sensor 112 can be a ToF camera. In this case, the depth map generated by the depth sensor 112 includes a depth image.
The IMU 113 can detect inertial measurements and transmit measurements of the detected inertial measurements to computer system 110. Examples of measurements include position, angular velocity, and acceleration. IMU 113 can include an accelerometer and/or a  gyroscope. An accelerometer can output a measurement of acceleration in one or more directions (e.g., x, y, and/or z) . A gyroscope can output a measurement of angular velocity. In turn, the computer system 110 receives the inertial measurements and adjusts the AR scene 120 based on the received measurements.
The RGB optical sensor 114 can be a color camera. The depth image and the RGB image can have different resolutions. Typically, the resolution of the depth image is smaller than that of the RGB image. For instance, the depth image has a 640x180 resolution, whereas the RGB image has a 1920x1280 resolution. Generally, the RGB optical sensor 114 generates an RGB image of a real-world environment that includes, for instance, a real-world object 130. The depth sensor 112 generates depth data about the real-world environment, where this data includes, for instance, a depth map that shows depth (s) of the real-world object 130 (e.g., distance (s) between the depth sensor 112 and the real-world object 130) .
In addition, the depth sensor 112, the IMU 113, and/or the RGB optical sensor 114, as installed in the computer system 110, may be separated by a transformation (e.g., distance offset, field of view angle difference, etc.). This transformation may be known and its value may be stored locally and/or accessible to the AR module 116. When cameras are used, the ToF camera and the color camera can have similar fields of view. But due to the transformation, the fields of view would partially, rather than fully, overlap.
The AR module 116 can generate a red, green, blue, and depth (RGBD) image from the RGB image and the depth map to detect an occlusion of the virtual object 124 by at least a portion of the real-world object representation 122 or vice versa. The AR module 116 can additionally or alternatively generate a 3D model of the real-world environment based on the depth map, where the 3D model includes multi-level voxels. Such voxels are used to detect collision between the virtual object 124 and at least a portion of the real-world object representation 122. The AR scene 120 can be rendered to properly show the occlusion and avoid the rendering of the collision. The AR module 116 can be implemented as specialized hardware and/or a combination of hardware and software (e.g., general purpose processor and computer-readable instructions stored in memory and executable by the general purpose processor) . In addition to initializing an AR session and performing VIO, the AR module 116 can detect occlusion and collision to properly render the AR scene 120.
In an illustrative example of FIG. 1, a smartphone is used for an AR session that shows the real-world environment. In particular, the AR session includes rendering an AR scene that includes a representation of a real-world table on top of which a vase (or some other real-world object) is placed. A virtual ball (or some other virtual object) is to be shown in the AR scene. In particular, the virtual ball is to be shown on top of the table too. By tracking the occlusion between the virtual ball and a virtual vase (that represents the real-world vase) , the virtual vase can occlude, in parts of the AR scene, when the virtual ball is behind the virtual vase relative to the pose of the smartphone. In other parts of the AR scene, the virtual ball can occlude the virtual vase when the virtual vase is behind the virtual ball relative to a change in the pose of the smartphone. And in remaining parts of the AR scene, no occlusion is present. In addition, a user of the smartphone can interact with the virtual ball to move the virtual ball on the top surface of the virtual table (that represents the real-world table) . By tracking a possible collision between the virtual ball and the virtual object, any interaction that would cause collision  would not be rendered. In other words, the collision tracking can be used to control where the virtual ball can be moved in the AR scene.
In another example of FIG. 1, computer system 110 performs an ESKF process to provide a measurement update using one or more sensors such as depth sensor 112 and/or RGB optical sensor 114. For example, computer system 110 can perform measurement updates using both an optimized SLAM pose and a SLAM pose obtained using pure visual odometry.
FIG. 2 illustrates an example of a process for using an Error-State Kalman Filtering (ESKF) process 201 to update state variables in a simultaneous localization and mapping (SLAM) system according to an embodiment of the present invention.
In an example, the ESKF process 201 receives an input from an inertial measurement unit (IMU) and an input from a visual SLAM module. For example, the ESKF process 201 receives the IMU raw data 202 from the IMU. IMU raw data 202 may include acceleration and angular velocity information of the camera system. The ESKF process 201 predicts an error-state at block 204 using the IMU raw data 202 to generate a predicted state variable. An example of the predicted state variable is a vector including matrices that include pose information including 3-axes position, the 3-axes velocity, 3-axes orientation (e.g., a quaternion) , and IMU sensor information including 3-axes accelerometer bias, 3-axes gyroscope bias. Using the IMU raw data, the ESKF 201 predicts the predicted state variable. The ESKF process 201 may also predict a predicted state covariance.
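The state vector described above can be held in a small structure such as the following sketch. The 19-element ordering (3-axes position, 3-axes velocity, quaternion orientation, accelerometer bias, gyroscope bias, gravity) is inferred from the 19x19 Jacobian mentioned below and is an assumption, as is the (w, x, y, z) quaternion convention.

```python
import numpy as np

def make_nominal_state():
    """One possible layout of the 19-element nominal state."""
    return {
        "p":   np.zeros(3),                     # 3-axes position
        "v":   np.zeros(3),                     # 3-axes velocity
        "q":   np.array([1.0, 0.0, 0.0, 0.0]),  # orientation quaternion (w, x, y, z)
        "a_b": np.zeros(3),                     # 3-axes accelerometer bias
        "w_b": np.zeros(3),                     # 3-axes gyroscope bias
        "g":   np.array([0.0, 0.0, -9.81]),     # gravity vector
    }
```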
The ESKF process 201 predicts the error state δx (e.g., the state variable) by computing the error in the state variable x, i.e., the predicted error state, as represented by δx ← F_x(x, u_m) δx. The state transition model F_x of the state variable x and the control input u_m are used to predict the error state (the predicted state variable). The ESKF process 201 may also predict the error state covariance (the predicted state covariance) as represented by P ← F_x P F_x^T + F_i Q_i F_i^T. The predicted covariance P is computed by combining the state transition model F_x, the state covariance P, the transpose of the transition model F_x^T, the initial transition model F_i, an estimated covariance Q_i, and the transpose of the initial transition model F_i^T.
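A minimal numerical sketch of this prediction step, written directly from the two equations above; F_x, F_i, and Q_i are assumed to be provided by the IMU propagation model for the current time step.

```python
import numpy as np

def eskf_predict(dx, P, F_x, F_i, Q_i):
    """Propagate the error state and its covariance over one IMU step."""
    dx_pred = F_x @ dx                               # error-state prediction
    P_pred = F_x @ P @ F_x.T + F_i @ Q_i @ F_i.T     # covariance prediction
    return dx_pred, P_pred
```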
Continuing with the present example, the computing system 110 may capture multiple images of a real-world scene. For instance, the computing system 110 can receive image capture data 206 from a camera and perform visual SLAM 208. The computing system 110 can process the image capture data 206 using visual SLAM 208 to generate an image rate pose 210. The ESKF process 201 may receive the image rate pose 210 from computing system 110. Additional detail with regard to image rate pose 210 generation is likely best understood with regard to FIG. 5.
The ESKF process 201 performs a measurement update at block 212 by using a relationship between a measurement vector that is estimated from visual SLAM 208 and a noise factor. For instance, the ESKF process 201 update equations can be represented as a relationship between the measurement vector, including position, velocity, orientation, accelerometer bias, gyroscope bias, and gravitational acceleration, that is estimated from the visual SLAM 208 and a Gaussian white noise. For instance, y = h(x_t) + v, where y is the measurement vector, including position, velocity, orientation, accelerometer bias, gyroscope bias, and gravitational acceleration, estimated from our visual SLAM, h() is a nonlinear function of the system state (the true state), and v is a white Gaussian noise with covariance V. The white Gaussian noise v can be represented as v ~ N(0, V), i.e., it is drawn from a zero-mean normal distribution with covariance V. In order to linearize the h() function, the Jacobian matrix H can be used. In the present example, the Jacobian matrix H is a 19x19 identity matrix. The ESKF correction or update equations can, therefore, be represented as K = P H^T (H P H^T + V)^{-1}, δx ← K (y - h(x_t)), and P ← (I - KH) P.
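A sketch of this correction step under the same assumptions, with H taken as the 19x19 identity matrix as in the present example; y is the SLAM measurement vector and h_xhat is the measurement predicted from the current state estimate.

```python
import numpy as np

def eskf_update(P, y, h_xhat, V):
    """Kalman correction: returns the measured error state and updated covariance."""
    n = P.shape[0]
    H = np.eye(n)                                    # 19x19 identity Jacobian
    S = H @ P @ H.T + V                              # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)                   # Kalman gain
    dx = K @ (y - h_xhat)                            # measured error state
    P_new = (np.eye(n) - K @ H) @ P                  # P <- (I - K H) P
    return dx, P_new
```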
In some examples, during the update process, the ESKF sets the measurement noise covariance V, described above, to update the state variables and state covariance. Generally, this measurement noise covariance is initialized and fixed during the ESKF update. However, the measurement noise covariance changes due to the integration of visual SLAM to determine the measurement vector. Performing visual SLAM 208 results in various quality levels of SLAM performance that can be characterized by various SLAM quality metrics (e.g., good, weak, lost, etc. ) . In the example of FIG. 2, the image rate pose 210 (e.g., a slam pose) is used by ESKF process 201 as the measurement vector. Due to the variations in SLAM quality, the ESKF process 201 tunes the covariance dynamically according to the performance of visual SLAM. In some embodiments, a lower value of the measurement noise covariance (or the narrower a normal distribution is) increases a confidence metric of the filter relating to the measurement. For example, in a measurement that is characterized with visual SLAM performance that is good, the measurement noise covariance should be a lesser value. In another example of a measurement that is characterized with visual SLAM performance that is poor or lost, the measurement noise covariance should be a greater value.
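The dynamic tuning described above amounts to scaling the measurement noise covariance with the reported SLAM quality. The scale factors in this sketch are placeholder values chosen only to show the direction of the adjustment (smaller V for good tracking, larger V when tracking is weak or lost); they are not values taken from this disclosure.

```python
QUALITY_SCALE = {"good": 0.01, "weak": 1.0, "lost": 100.0}   # assumed factors

def tune_measurement_noise(base_V, slam_quality):
    """Smaller covariance -> higher confidence in the SLAM measurement."""
    return QUALITY_SCALE.get(slam_quality, 1.0) * base_V
```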
The ESKF process 201 injects the measured error into the nominal state at block 214. For instance, at block 214, the ESKF process 201 performs a direct sum of the full state x with the measured error δx. A mathematical representation of the injection may be represented as x ← x ⊕ δx.
The measured error δx may be a vector that includes one or more error terms or matrices. In some embodiments, the measured error δx is a vector that includes measured error for the 3-axis position δp, the 3-axis velocity δv, and the 3-axis orientation (e.g., a quaternion) δθ; however, the vector may include additional or alternative measured error components.
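For illustration, and under the assumed state layout shown in the comments (position, velocity, quaternion, accelerometer bias, gyroscope bias, gravity), the injection can be sketched as a direct sum in which the additive error terms are summed component-wise and the orientation error is composed through a quaternion product:

    import numpy as np

    def quat_mul(q, r):
        """Hamilton product of two quaternions given as [w, x, y, z]."""
        w0, x0, y0, z0 = q
        w1, x1, y1, z1 = r
        return np.array([
            w0*w1 - x0*x1 - y0*y1 - z0*z1,
            w0*x1 + x0*w1 + y0*z1 - z0*y1,
            w0*y1 - x0*z1 + y0*w1 + z0*x1,
            w0*z1 + x0*y1 - y0*x1 + z0*w1,
        ])

    def inject_error(x, dx):
        """x <- x (+) dx, assuming x = [p(3), v(3), q(4), ab(3), wb(3), g(3)]
        and dx = [dp(3), dv(3), dtheta(3), dab(3), dwb(3), dg(3)]."""
        x_new = x.copy()
        x_new[0:3] = x[0:3] + dx[0:3]                 # position
        x_new[3:6] = x[3:6] + dx[3:6]                 # velocity
        dq = np.concatenate(([1.0], 0.5 * dx[6:9]))   # small-angle error quaternion
        q_new = quat_mul(x[6:10], dq)
        x_new[6:10] = q_new / np.linalg.norm(q_new)   # orientation
        x_new[10:] = x[10:] + dx[9:]                  # biases and gravity
        return x_new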
The ESKF process 201 resets the ESKF at block 216. For instance, the error state δx is reset to 0. The ESKF process 201 tunes the covariance, in some cases using the formula P ← G P G^T, where G is the Jacobian of the reset operation with respect to the error state.
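As an illustrative sketch, and under the common simplifying assumption that the reset Jacobian G is close to an identity matrix, the reset step may be written as:

    import numpy as np

    def eskf_reset(dx, P, G=None):
        """Reset the error state to zero and propagate the covariance through G."""
        dx = np.zeros_like(dx)
        if G is None:
            G = np.eye(P.shape[0])    # G ~ I is a common approximation
        P = G @ P @ G.T
        return dx, P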
FIG. 3 illustrates another example of a flow for updating state variables in a simultaneous localization and mapping (SLAM) system according to an embodiment of the present invention. FIG. 3 illustrates the operations in a manner similar to FIG. 2 and additionally is annotated with formulas for the operations that may be performed at one or more steps as described with regard to FIG. 2.
For instance, various equations are illustrated in FIG. 3 at various positions of the ESKF process. As described with regard to FIG. 2, the ESKF process may perform various computations for each step. In one example, the equations above the arrow between the IMU raw data and the error state prediction step of the ESKF process represent the mathematical prediction of the state variable and the prediction of the covariance. Similarly, the ESKF process illustrates that the measurement update may be computed from the SLAM pose, as described with regard to block 212. In another example, the full state x is computed by injecting the observed (i.e., measured) error state into the state variable. This can be represented by the equation x ← x ⊕ δx
and the description of block 214. Various elements from FIG. 2 have been maintained with the same reference numerals.
Referring to FIG. 3, the ESKF generates a predicted state δx 301 (i.e., the error state predicted by the ESKF) of the full state variable x. In the embodiment of FIG. 3, the predicted state 301 is shown by the computation δx ← F_x (x, u_m) · δx. In addition to the predicted state, the ESKF may also generate a predicted state covariance 302. In the example illustrated by FIG. 3, the predicted state covariance 302 can be calculated by P ← F_x P F_x^T + F_i Q_i F_i^T.
The ESKF performs a measurement update by generating the measured state using a non-linear function of the true state h () and adding a white Gaussian noise v. The ESKF measurement update equations 303 can, therefore, be represented as K = P H^T (H P H^T + V)^(-1), δx = K (y - h(x_t)) , and P ← (I-KH) P. While the ESKF is performing the measurement update, the ESKF sets the measurement noise covariance V. The ESKF sets the measurement noise based at least in part on the quality level of the visual SLAM. The optimized SLAM pose and the pure visual odometry SLAM pose may both be impacted by the performance of visual SLAM. Accordingly, dynamically tuning the measurement noise renders a more accurate SLAM pose. The ESKF incorporates the measured error into the nominal state variable by performing a direct sum 304 of the full state x with the measured error δx.
The ESKF resets the cycle by setting the predicted state to 0 and tuning the covariance for the next iteration of the ESKF by using reset equations 305.
FIG. 4 is a simplified flowchart illustrating a method of updating state variables in a simultaneous localization and mapping (SLAM) system according to an embodiment of the present invention. Some or all of the operations of the flows can be implemented via specific hardware on the computer system and/or can be implemented as computer-readable instructions stored on a non-transitory computer-readable medium of the computer system. As stored, the computer-readable instructions represent programmable modules that include code executable by a processor of the computer system. The execution of such instructions configures the computer system to perform the respective operations. Each programmable module in combination with the processor represents a means for performing a respective operation (s) . While the operations are illustrated in a particular order, it should be understood that no particular order is necessary and that one or more operations may be omitted, skipped, and/or reordered.
In an example, the flow starts at operation 402, where the computer system receives a computed position from an inertial measurement unit. An IMU is a navigation device that continuously calculates the position, orientation, and velocity of a moving object using dead reckoning, without the need for external references or external sensors.
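Purely as an illustration of dead reckoning (a textbook first-order integration, not necessarily the scheme used by any particular IMU), acceleration that has been rotated into the world frame and gravity-compensated can be integrated into velocity and position as follows:

    import numpy as np

    def dead_reckon_step(p, v, a_world, dt):
        """First-order integration of a gravity-compensated, world-frame
        acceleration sample into velocity and position."""
        p_new = p + v * dt + 0.5 * a_world * dt ** 2
        v_new = v + a_world * dt
        return p_new, v_new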
In an example, the flow includes operation 404, where the computer system receives a plurality of images from a camera. For instance, the computing system uses a camera system, such as the RGB optical sensor 114 or the depth sensor 112, to generate image data representing a real-world scene.
In an example, the flow includes operation 406, where the computer system predicts an error state comprising a state variable and a state covariance from the computed position, as described with regard to FIGS. 2 and 3. The computer system can predict the error state covariance (the predicted state covariance) as represented by P ← F_x P F_x^T + F_i Q_i F_i^T. The predicted covariance P is computed by combining the state transition model F_x, the state covariance P, the transpose of the transition model F_x^T, the initial transition model F_i, an estimated covariance Q_i, and the transpose of the initial transition model F_i^T. The computer system may also compute the predicted state δx as represented by δx ← F_x (x, u_m) · δx. The state transition model F_x of the state variable x and the state of prediction u_m are used to predict the error state (the predicted state variable) .
In an example, the flow includes operation 408, where the computer system processes a first subset of images of the plurality of images to provide a first SLAM pose at a first frequency. This first SLAM pose can be referred to as an optimized SLAM pose since it is generated, in some embodiments, by bundling images and performing an optimization process to generate an optimized SLAM pose. Generally, keyframe-based visual SLAM methods cannot always output poses at the image rate because of the slow optimization process in the back-end. For example, for an image stream with a frequency of 30 Hz, the optimization process results in a 10 Hz SLAM pose. The optimization process may be any optimization process that generates an optimized SLAM pose (e.g., Bayesian, graphical, non-linear, etc. ) .
In an example, the flow includes operation 410, where the computer system processes a second subset of images of the plurality of images with pure visual odometry to provide a second SLAM pose at a second frequency. This second SLAM pose can be referred to as a pure visual odometry SLAM pose since it is generated, in some embodiments, by performing visual odometry without an optimization process such as the one used for the first SLAM pose. Continuing the example above, for a computer system using an image frequency of 30 Hz, the computer system processes the second subset of images at 20 Hz. For instance, the computer system processes the second subset of images using only front-end processing techniques such as Perspective-n-Point, as illustrated in the sketch below.
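As a hedged sketch of such a front-end step, OpenCV's Perspective-n-Point solver can be used to recover a pose from matched 3D map points and 2D image observations; the feature matching and map-point bookkeeping are assumed to exist elsewhere and are not shown:

    import cv2
    import numpy as np

    def pnp_pose(points_3d, points_2d, K, dist_coeffs=None):
        """Estimate a camera pose from matched 3D map points and 2D detections.

        points_3d : (N, 3) array of map points
        points_2d : (N, 2) array of matched image observations
        K         : (3, 3) camera intrinsic matrix
        """
        if dist_coeffs is None:
            dist_coeffs = np.zeros(5)
        ok, rvec, tvec, inliers = cv2.solvePnPRansac(
            points_3d.astype(np.float64),
            points_2d.astype(np.float64),
            K, dist_coeffs)
        if not ok:
            return None                    # pose could not be recovered (e.g., tracking lost)
        R, _ = cv2.Rodrigues(rvec)         # rotation vector -> rotation matrix
        return R, tvec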
In an example, the flow includes operation 412, where the computer system updates the state variable and the state covariance using the first SLAM pose or the second SLAM pose. The ESKF process 201 performs an update of the state variable and the state covariance using the first SLAM pose at the first frequency (i.e., the optimized SLAM pose) and/or the second SLAM pose at the second frequency (i.e., the pure visual odometry SLAM pose) . The computer system performs a measurement update using the first SLAM pose or the second SLAM pose to achieve an image pose rate (e.g., the sum of the first frequency and the second frequency) . Thus, in the example of a first frequency of 10 Hz and a second frequency of 20 Hz, the computer system performs a measurement update at an image pose rate of 30 Hz.
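As a purely illustrative sketch of the interleaving (the scheduling convention and the measurement_update method on the filter object are assumptions of this example), each arriving pose, whether optimized or pure visual odometry, triggers one measurement update with an appropriately scaled noise covariance:

    def on_slam_pose(eskf, pose, source, V_base):
        """Feed either pose stream into the same ESKF measurement update.

        source is "optimized" (e.g., the 10 Hz back-end pose) or "vo" (e.g., the
        20 Hz pure visual odometry pose); together they update at the 30 Hz
        image pose rate.
        """
        # Assumed convention: the optimized pose is trusted more than the VO pose.
        scale = 1.0 if source == "optimized" else 4.0
        eskf.measurement_update(pose, V_base * scale)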
It should be appreciated that the specific steps illustrated in FIG. 4 provide a particular method of updating state variables in a SLAM system according to an embodiment of the present invention. Other sequences of steps may also be performed according to alternative embodiments. For example, alternative embodiments of the present invention may perform the steps outlined above in a different order. Moreover, the individual steps illustrated in FIG. 4 may include multiple sub-steps that may be performed in various sequences as appropriate to the  individual step. Furthermore, additional steps may be added or removed depending on the particular applications. One of ordinary skill in the art would recognize many variations, modifications, and alternatives.
FIG. 5A illustrates an example of a plot 501 of iris pose, predicted pose, and ESKF pose according to an embodiment of the present invention. FIG. 5B is a plot of the derivative of the iris pose and a Kalman-filtered (i.e., ESKF, EKF, etc. ) pose according to an embodiment of the present invention. Referring to FIG. 5A, plot 501 illustrates poses 503 including an iris pose 503A, a predicted iris pose 503B, and an ESKF pose 503C. The poses 503 provide a visual representation of the difference among the actual position (iris pose 503A) , the IMU pose (predicted iris pose 503B) , and the smoothed ESKF pose (ESKF pose 503C) with reference to a y-axis 502 and an x-axis 504. The poses 503 represent the positions (or trajectories) of the iris performing the process described with regard to embodiments of the present disclosure. During movements, the position (or trajectory) will vary based on the movements. In the example of FIG. 5A, the position/trajectory is shown as a linearization in two dimensions, x (horizontal axis) 504 and y (vertical axis) 502. In one embodiment that represents trajectory, the x-axis 504 and y-axis 502 may represent the trajectory intensity in each axis. For example, a value of (-3, -3) may represent a trajectory that is changing position at -3x per meter and -3y per meter.
Referring to FIG. 5B, which is a plot 507 of the y-derivative of the predicted iris pose 503B and the ESKF pose 503C with respect to a y-axis 506 and an x-axis 505, one can see that as the iris changes position and/or trajectory, the derivatives reflect the changes in the y-axis. In an embodiment where the predicted iris pose 503B and the ESKF pose 503C represent the trajectory of the iris, the y-derivative plot depicts the change in the linearization of the predicted iris pose 503B and the ESKF pose 503C in the y-axis trajectory.
FIG. 6A illustrates an example of a plot 601 of iris pose, predicted pose, and ESKF pose, according to an embodiment of the present invention. For example, FIG. 6A illustrates a plot 601 that illustrates poses 603 including an iris pose 603A, a predicted iris pose 603B, and an ESKF pose 603C. The poses 603 illustrate a visual representation of the difference in the actual position (iris pose 603A) , IMU pose (predicted iris pose 603B) , and the smoothed ESKF pose (ESKF pose 603C) with respect to an x-axis 604 and a y-axis 602.
Referring to FIG. 6B, which is a plot 607 that illustrates an x-derivative of the predicted iris pose 603B and the ESKF pose 603C with respect to a y-axis 605 and an x-axis 606, one can see that as the iris changes position and/or trajectory, the derivatives reflect the changes in the x-axis. In an embodiment where the predicted iris pose 603B and the ESKF pose 603C represent the trajectory of the iris, the x-derivative plot depicts the change in the linearization of the predicted iris pose 603B and the ESKF pose 603C in the x-axis trajectory.
FIG. 7 illustrates examples of components of a computer system 700 according to certain embodiments. The computer system 700 is an example of the computer system described herein above. Although these components are illustrated as belonging to a same computer system 700, the computer system 700 can also be distributed.
The computer system 700 includes at least a processor 702, a memory 704, a storage device 706, input/output peripherals (I/O) 708, communication peripherals 710, and an interface bus 712. The interface bus 712 is configured to communicate, transmit, and transfer data,  controls, and commands among the various components of the computer system 700. The memory 704 and the storage device 706 include computer-readable storage media, such as RAM, ROM, electrically erasable programmable read-only memory (EEPROM) , hard drives, CD-ROMs, optical storage devices, magnetic storage devices, electronic non-volatile computer storage, for example 
flash
memory, and other tangible storage media. Any of such computer readable storage media can be configured to store instructions or program codes embodying aspects of the disclosure. The memory 704 and the storage device 706 also include computer readable signal media. A computer readable signal medium includes a propagated data signal with computer readable program code embodied therein. Such a propagated signal takes any of a variety of forms including, but not limited to, electromagnetic, optical, or any combination thereof. A computer readable signal medium includes any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use in connection with the computer system 700.
Further, the memory 704 includes an operating system, programs, and applications. The processor 702 is configured to execute the stored instructions and includes, for example, a logical processing unit, a microprocessor, a digital signal processor, and other processors. The memory 704 and/or the processor 702 can be virtualized and can be hosted within another computer system of, for example, a cloud network or a data center. The I/O peripherals 708 include user interfaces, such as a keyboard, screen (e.g., a touch screen) , microphone, speaker, other input/output devices, and computing components, such as graphical processing units, serial ports, parallel ports, universal serial buses, and other input/output peripherals. The I/O peripherals 708 are connected to the processor 702 through any of the ports coupled to the interface bus 712. The communication peripherals 710 are configured to facilitate communication between the computer system 700 and other computing devices over a communications network and include, for example, a network interface controller, modem, wireless and wired interface cards, antenna, and other communication peripherals.
While the present subject matter has been described in detail with respect to specific embodiments thereof, it will be appreciated that those skilled in the art, upon attaining an understanding of the foregoing may readily produce alterations to, variations of, and equivalents to such embodiments. Accordingly, it should be understood that the present invention has been presented for purposes of example rather than limitation, and does not preclude inclusion of such modifications, variations, and/or additions to the present subject matter as would be readily apparent to one of ordinary skill in the art. Indeed, the methods and systems described herein may be embodied in a variety of other forms; furthermore, various omissions, substitutions and changes in the form of the methods and systems described herein may be made without departing from the spirit of the present invention. The accompanying claims and their equivalents are intended to cover such forms or modifications as would fall within the scope and spirit of the present invention.
Unless specifically stated otherwise, it is appreciated that throughout this specification discussions utilizing terms such as “processing, ” “computing, ” “calculating, ” “determining, ” and “identifying” or the like refer to actions or processes of a computing device, such as one or more computers or a similar electronic computing device or devices, that manipulate or transform data represented as physical, electronic or magnetic quantities within memories, registers, or other information storage devices, transmission devices, or display devices of the computing platform.
The system or systems discussed herein are not limited to any particular hardware architecture or configuration. A computing device can include any suitable arrangement of components that provide a result conditioned on one or more inputs. Suitable computing devices include multipurpose microprocessor-based computer systems accessing stored software that programs or configures the computer system from a general-purpose computing apparatus to a specialized computing apparatus implementing one or more embodiments of the present subject matter. Any suitable programming, scripting, or other type of language or combinations of languages may be used to implement the teachings contained herein in software to be used in programming or configuring a computing device.
Embodiments of the methods disclosed herein may be performed in the operation of such computing devices. The order of the blocks presented in the examples above can be varied-for example, blocks can be re-ordered, combined, and/or broken into sub-blocks. Certain blocks or processes can be performed in parallel.
Conditional language used herein, such as, among others, “can, ” “could, ” “might, ” “may, ” “e.g., ” and the like, unless specifically stated otherwise, or otherwise understood within the context as used, is generally intended to convey that certain examples include, while other examples do not include, certain features, elements, and/or steps. Thus, such conditional language is not generally intended to imply that features, elements and/or steps are in any way required for one or more examples or that one or more examples necessarily include logic for deciding, with or without author input or prompting, whether these features, elements and/or steps are included or are to be performed in any particular example.
The terms “including, ” “having, ” and the like are synonymous and are used inclusively, in an open-ended fashion, and do not exclude additional elements, features, acts, operations, and so forth. Also, the term “or” is used in its inclusive sense (and not in its exclusive sense) so that when used, for example, to connect a list of elements, the term “or” means one, some, or all of the elements in the list. The use of “adapted to” or “configured to” herein is meant as open and inclusive language that does not foreclose devices adapted to or configured to perform additional tasks or steps. Additionally, the use of “based on” is meant to be open and inclusive, in that a process, step, calculation, or other action “based on” one or more recited conditions or values may, in practice, be based on additional conditions or values beyond those recited. Similarly, the use of “based at least in part on” is meant to be open and inclusive, in that a process, step, calculation, or other action “based at least in part on” one or more recited conditions or values may, in practice, be based on additional conditions or values beyond those recited. Headings, lists, and numbering included herein are for ease of explanation only and are not meant to be limiting.
The various features and processes described above may be used independently of one another, or may be combined in various ways. All possible combinations and sub-combinations are intended to fall within the scope of the present invention. In addition, certain method or process blocks may be omitted in some implementations. The methods and processes described herein are also not limited to any particular sequence, and the blocks or states relating thereto can be performed in other sequences that are appropriate. For example, described blocks or states may be performed in an order other than that specifically disclosed, or multiple blocks or states may be combined in a single block or state. The example blocks or states may be performed in serial, in parallel, or in some other manner. Blocks or states may be added to or removed from  the disclosed examples. Similarly, the example systems and components described herein may be configured differently than described. For example, elements may be added to, removed from, or rearranged compared to the disclosed examples.

Claims (20)

  1. A method of updating state variables in a simultaneous localization and mapping (SLAM) system, the method comprising:
    receiving a computed position from an inertial measurement unit, wherein the computed position comprises a position, an orientation, and a velocity;
    receiving a plurality of images from a camera;
    predicting, using the computed position, an error state comprising a state variable and a state covariance, wherein the state variable comprises a vector comprising one or more of a position, a velocity, an accelerometer metric, a gyroscope metric, or a gravitational measurement;
    processing, using an optimization process, a first subset of images of the plurality of images to provide a first SLAM pose at a first frequency;
    processing, using pure visual odometry, a second subset of images of the plurality of images to provide a second SLAM pose at a second frequency; and
    updating the state variable and the state covariance using at least one of the first SLAM pose or the second SLAM pose.
  2. The method of claim 1 wherein providing the first SLAM pose comprises a first set of state parameters that are optimized by an optimization process and providing the second SLAM pose comprises a second set of state parameters that bypass the optimization process.
  3. The method of claim 1 wherein the plurality of images are captured at an image frequency equal to a sum of the first frequency and the second frequency.
  4. The method of claim 1 wherein updating the state variable and the state covariance using at least one of the first SLAM pose or the second SLAM pose comprises updating the state variable and the state covariance at an image rate pose comprising:
    using the first SLAM pose to update the state variable and the state covariance at a rate of 10 updates per second; and
    using the second SLAM pose to update the state variable and the state covariance at a rate of 20 updates per second.
  5. The method of claim 4 wherein the image rate pose further comprises an interleaved sequence of the first SLAM pose and the second SLAM pose.
  6. The method of claim 1 wherein the state covariance is computed based on at least a first quality metric of the first SLAM pose or a second quality metric of the second SLAM pose.
  7. The method of claim 1 wherein updating the state variable and the state covariance using at least the first SLAM pose or the second SLAM pose comprises updating the state variable and state covariance using the computed position when a SLAM quality metric is lost.
  8. A computer system comprising:
    a processor;
    an inertial measurement unit; and
    one or more memories storing computer-readable instructions that, upon execution by the processor, configure the computer system to:
    receive a computed position from an inertial measurement unit, wherein the computed position comprises a position, an orientation, and a velocity;
    receive a plurality of images from a camera;
    predict, using the computed position, an error state comprising a state variable and a state covariance, wherein the state variable comprises a vector comprising one or more of a position, a velocity, an accelerometer metric, a gyroscope metric, or a gravitational measurement;
    process, using an optimization process, a first subset of images of the plurality of images to provide a first SLAM pose at a first frequency;
    process, using pure visual odometry, a second subset of images of the plurality of images to provide a second SLAM pose at a second frequency; and
    update the state variable and the state covariance using at least one of the first SLAM pose or the second SLAM pose.
  9. The computer system of claim 8 wherein the plurality of images are captured at an image frequency equal to a sum of the first frequency and the second frequency.
  10. The computer system of claim 8 wherein updating the state variable and the state covariance using the first SLAM pose or second SLAM pose comprises updating the state variable and the state covariance at an image rate pose comprising:
    using the first SLAM pose to update the state variable and the state covariance at a rate of 10 updates per second; and
    using the second SLAM pose to update the state variable and the state covariance at a rate of 20 updates per second.
  11. The computer system of claim 10 wherein the image rate pose further comprises an interleaved sequence of the first SLAM pose and the second SLAM pose.
  12. The computer system of claim 8 wherein the state covariance is computed based on a first quality metric of the first SLAM pose or a second quality metric of the second SLAM pose.
  13. The computer system of claim 8 wherein updating the state variable and the state covariance using the first SLAM pose or second SLAM pose comprises updating the state variable and state covariance using the computed position when a SLAM quality metric is lost.
  14. One or more non-transitory computer-storage media storing instructions that, upon execution on a computer system, cause the computer system to perform operations including:
    receiving a computed position from an inertial measurement unit, wherein the computed position comprises a position, an orientation, and a velocity;
    receiving a plurality of images from a camera;
    predicting, using the computed position, an error state comprising a state variable and a state covariance, wherein the state variable comprises a vector comprising one or more of a position, a velocity, an accelerometer metric, a gyroscope metric, or a gravitational measurement;
    processing, using an optimization process, a first subset of images of the plurality of images to provide a first SLAM pose at a first frequency;
    processing, using pure visual odometry, a second subset of images of the plurality of images to provide a second SLAM pose at a second frequency; and
    updating the state variable and the state covariance using at least one of the first SLAM pose or the second SLAM pose.
  15. The one or more non-transitory computer-storage media of claim 14 wherein the plurality of images are captured at an image frequency equal to a sum of the first frequency and the second frequency.
  16. The one or more non-transitory computer-storage media of claim 14 wherein updating the state variable and the state covariance using at least one of the first SLAM pose or the second SLAM pose comprises updating the state variable and the state covariance at an image rate pose comprising:
    using the first SLAM pose to update the state variable and the state covariance at a rate of 10 updates per second; and
    using the second SLAM pose to update the state variable and the state covariance at a rate of 20 updates per second.
  17. The one or more non-transitory computer-storage media of claim 16 wherein the image rate pose further comprises an interleaved sequence of the first SLAM pose and the second SLAM pose.
  18. The one or more non-transitory computer-storage media of claim 14 wherein the state covariance is computed based on a first quality metric of the first SLAM pose or a second quality metric of the second SLAM pose.
  19. The one or more non-transitory computer-storage media of claim 14 wherein updating the state variable and the state covariance using the at least one of the first SLAM pose or the second SLAM pose comprises updating the state variable and state covariance using the computed position when a SLAM quality metric is lost.
  20. The one or more non-transitory computer-storage media of claim 14 wherein providing the first SLAM pose comprises a first set of state parameters that are optimized by an optimization process and providing the second SLAM pose comprises a second set of state parameters that bypass the optimization process.
PCT/CN2021/076067 2020-02-13 2021-02-08 Error state kalman filter for visual slam by dynamically tuning measurement noise covariance WO2021160098A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202180011363.3A CN115003983B (en) 2020-02-13 2021-02-08 Method for updating state variable in SLAM system, computer system and medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US202062975991P 2020-02-13 2020-02-13
US62/975,991 2020-02-13

Publications (1)

Publication Number Publication Date
WO2021160098A1 true WO2021160098A1 (en) 2021-08-19

Family

ID=77291392

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/076067 WO2021160098A1 (en) 2020-02-13 2021-02-08 Error state kalman filter for visual slam by dynamically tuning measurement noise covariance

Country Status (2)

Country Link
CN (1) CN115003983B (en)
WO (1) WO2021160098A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116908810A (en) * 2023-09-12 2023-10-20 天津大学四川创新研究院 Method and system for measuring earthwork of building by carrying laser radar on unmanned aerial vehicle
US11875615B2 (en) 2022-04-29 2024-01-16 Toyota Research Institute, Inc. Odometry noise model fitting from fleet-scale datasets

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160305784A1 (en) * 2015-04-17 2016-10-20 Regents Of The University Of Minnesota Iterative kalman smoother for robust 3d localization for vision-aided inertial navigation
US20160327395A1 (en) * 2014-07-11 2016-11-10 Regents Of The University Of Minnesota Inverse sliding-window filters for vision-aided inertial navigation systems
CN106679648A (en) * 2016-12-08 2017-05-17 东南大学 Vision-inertia integrated SLAM (Simultaneous Localization and Mapping) method based on genetic algorithm
CN108629843A (en) * 2017-03-24 2018-10-09 成都理想境界科技有限公司 A kind of method and apparatus for realizing augmented reality
CN108731670A (en) * 2018-05-18 2018-11-02 南京航空航天大学 Inertia/visual odometry combined navigation locating method based on measurement model optimization
CN110517324A (en) * 2019-08-26 2019-11-29 上海交通大学 Binocular VIO implementation method based on variation Bayesian adaptation
CN110610513A (en) * 2019-09-18 2019-12-24 郑州轻工业学院 Invariance center differential filter method for vision SLAM of autonomous mobile robot

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11466990B2 (en) * 2016-07-22 2022-10-11 Regents Of The University Of Minnesota Square-root multi-state constraint Kalman filter for vision-aided inertial navigation system
CN108242079B (en) * 2017-12-30 2021-06-25 北京工业大学 VSLAM method based on multi-feature visual odometer and graph optimization model
CN109035303B (en) * 2018-08-03 2021-06-08 百度在线网络技术(北京)有限公司 SLAM system camera tracking method and device, and computer readable storage medium

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160327395A1 (en) * 2014-07-11 2016-11-10 Regents Of The University Of Minnesota Inverse sliding-window filters for vision-aided inertial navigation systems
US20160305784A1 (en) * 2015-04-17 2016-10-20 Regents Of The University Of Minnesota Iterative kalman smoother for robust 3d localization for vision-aided inertial navigation
CN106679648A (en) * 2016-12-08 2017-05-17 东南大学 Vision-inertia integrated SLAM (Simultaneous Localization and Mapping) method based on genetic algorithm
CN108629843A (en) * 2017-03-24 2018-10-09 成都理想境界科技有限公司 A kind of method and apparatus for realizing augmented reality
CN108731670A (en) * 2018-05-18 2018-11-02 南京航空航天大学 Inertia/visual odometry combined navigation locating method based on measurement model optimization
CN110517324A (en) * 2019-08-26 2019-11-29 上海交通大学 Binocular VIO implementation method based on variation Bayesian adaptation
CN110610513A (en) * 2019-09-18 2019-12-24 郑州轻工业学院 Invariance center differential filter method for vision SLAM of autonomous mobile robot

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
FU DONG, XIA HAO, QIAO YANYOU: "Monocular Visual-Inertial Navigation for Dynamic Environment", REMOTE SENSING, vol. 25, no. 5, 21 April 2021 (2021-04-21), pages 1 - 19, XP055835645, DOI: 10.3390/rs13091610 *
ZHANG YULONG; ZHANG GUOSHAN: "Loop-Closing Detection Algorithm of Keyframe-Based Visual-Inertial SLAM", JOURNAL OF FRONTIERS OF COMPUTER SCIENCE & TECHNOLOGY, vol. 12, no. 11, 1 November 2018 (2018-11-01), pages 1777 - 1787, XP009529748, ISSN: 1673-9418, DOI: 10.3778/j.issn.1673-9418.1708042 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11875615B2 (en) 2022-04-29 2024-01-16 Toyota Research Institute, Inc. Odometry noise model fitting from fleet-scale datasets
CN116908810A (en) * 2023-09-12 2023-10-20 天津大学四川创新研究院 Method and system for measuring earthwork of building by carrying laser radar on unmanned aerial vehicle
CN116908810B (en) * 2023-09-12 2023-12-12 天津大学四川创新研究院 Method and system for measuring earthwork of building by carrying laser radar on unmanned aerial vehicle

Also Published As

Publication number Publication date
CN115003983B (en) 2024-10-01
CN115003983A (en) 2022-09-02

Similar Documents

Publication Publication Date Title
US11797083B2 (en) Head-mounted display system and 6-degree-of-freedom tracking method and apparatus thereof
KR102442780B1 (en) Method for estimating pose of device and thereof
CN108700947B (en) System and method for concurrent ranging and mapping
JP6198230B2 (en) Head posture tracking using depth camera
CN110084832B (en) Method, device, system, equipment and storage medium for correcting camera pose
WO2021160098A1 (en) Error state kalman filter for visual slam by dynamically tuning measurement noise covariance
WO2020253260A1 (en) Time synchronization processing method, electronic apparatus, and storage medium
US12073630B2 (en) Moving object tracking method and apparatus
US10452195B2 (en) Electronic system with gesture calibration mechanism and method of operation thereof
EP3366031B1 (en) Balancing exposure and gain at an electronic device based on device motion and scene distance
JP7274510B2 (en) VIRTUAL OBJECT DISPLAY DEVICE AND VIRTUAL OBJECT DISPLAY METHOD
KR20160020369A (en) Scene analysis for improved eye tracking
CN111578937A (en) Visual inertial odometer system capable of optimizing external parameters simultaneously
WO2014200625A1 (en) Systems and methods for feature-based tracking
US10740986B2 (en) Systems and methods for reconstructing a moving three-dimensional object
JP2023021994A (en) Data processing method and device for automatic driving vehicle, electronic apparatus, storage medium, computer program, and automatic driving vehicle
WO2021068799A1 (en) Occlusion and collision detection for augmented reality applications
CN110310304A (en) Monocular vision builds figure and localization method, device, storage medium and mobile device
CN108804161B (en) Application initialization method, device, terminal and storage medium
US20230005172A1 (en) Method and System for Implementing Adaptive Feature Detection for VSLAM Systems
KR102468355B1 (en) Apparatus and method for monitoring moving target based on complementary gimbal control through displacement partitioning
WO2021160070A1 (en) Imu static noise calibration scale adjustment for vislam applications
WO2021155828A1 (en) Method and system for implementing dynamic input resolution for vslam systems
WO2021160095A1 (en) Surface detection and tracking in augmented reality session based on sparse representation
US10636205B2 (en) Systems and methods for outlier edge rejection

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21754054

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21754054

Country of ref document: EP

Kind code of ref document: A1