WO2021035669A1 - Pose prediction method, map construction method, movable platform, and storage medium - Google Patents

Pose prediction method, map construction method, movable platform, and storage medium

Info

Publication number
WO2021035669A1
Authority
WO
WIPO (PCT)
Prior art keywords
image frame
current image
encoder
map
frame
Prior art date
Application number
PCT/CN2019/103599
Other languages
French (fr)
Chinese (zh)
Inventor
朱张豪
Original Assignee
深圳市大疆创新科技有限公司
Priority date
Filing date
Publication date
Application filed by 深圳市大疆创新科技有限公司
Priority to CN201980033455.4A (publication CN112219087A)
Priority to PCT/CN2019/103599
Publication of WO2021035669A1

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01C MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C 21/00 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
    • G01C 21/10 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration
    • G01C 21/12 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration executed aboard the object being navigated; Dead reckoning
    • G01C 21/16 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration executed aboard the object being navigated; Dead reckoning by integrating acceleration or speed, i.e. inertial navigation
    • G01C 21/165 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration executed aboard the object being navigated; Dead reckoning by integrating acceleration or speed, i.e. inertial navigation combined with non-inertial navigation instruments
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01C MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C 21/00 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
    • G01C 21/10 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration
    • G01C 21/12 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration executed aboard the object being navigated; Dead reckoning
    • G01C 21/16 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration executed aboard the object being navigated; Dead reckoning by integrating acceleration or speed, i.e. inertial navigation
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01C MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C 21/00 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
    • G01C 21/26 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 specially adapted for navigation in a road network
    • G01C 21/28 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 specially adapted for navigation in a road network with correlation of data from several navigational instruments
    • G01C 21/30 Map- or contour-matching
    • G01C 21/32 Structuring or formatting of map data
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01C MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C 3/00 Measuring distances in line of sight; Optical rangefinders
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01S RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S 17/00 Systems using the reflection or reradiation of electromagnetic waves other than radio waves, e.g. lidar systems
    • G01S 17/02 Systems using the reflection of electromagnetic waves other than radio waves
    • G01S 17/06 Systems determining position data of a target
    • G01S 17/08 Systems determining position data of a target for measuring distance only
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01S RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S 17/00 Systems using the reflection or reradiation of electromagnetic waves other than radio waves, e.g. lidar systems
    • G01S 17/86 Combinations of lidar systems with systems other than lidar, radar or sonar, e.g. with direction finders
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/70 Determining position or orientation of objects or cameras
    • G06T 7/73 Determining position or orientation of objects or cameras using feature-based methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 Image acquisition modality
    • G06T 2207/10016 Video; Image sequence

Definitions

  • the embodiment of the present invention relates to the field of machine vision, in particular to a pose prediction method, a map construction method, a movable platform and a storage medium.
  • Visual SLAM (Simultaneous Localization and Mapping): real-time positioning and map construction based on camera images
  • SLAM (Simultaneous Localization and Mapping): real-time positioning and map construction
  • the existing visual SLAM methods fail under special conditions such as sparse feature points in the image (no texture) or too much repetition with the previous image frame, blur or non-overlapping regions caused by high-speed motion, and accidental reduction of the camera frame rate.
  • under these conditions, the tracking function of visual SLAM fails, that is, the relative pose of the current image frame with respect to the previous image frame cannot be obtained, so visual SLAM cannot continue to work, and vision can only be restored by relocalization when a loop closure is triggered.
  • the existing visual SLAM methods therefore have poor stability and robustness. When a visual failure occurs, vision cannot be restored until a loop closure occurs, so visual SLAM is interrupted during the failure and the map points and trajectory of the failure period are missing, causing the map points and trajectories before and after the failure to fail to connect and visual SLAM to operate unstably.
  • the embodiments of the present invention provide a pose prediction method, a map construction method, a movable platform, and a storage medium to improve the accuracy of pose acquisition for the current image frame and the stability and robustness of visual SLAM, so that visual SLAM can still run stably when a visual failure occurs.
  • the first aspect of the embodiments of the present invention is to provide a pose prediction method suitable for a movable platform.
  • the movable platform includes a first rolling wheel, a second rolling wheel, a camera, a first encoder, and a second encoder.
  • the first rolling wheel and the second rolling wheel are used to displace the movable platform, and the axes of the first rolling wheel and the second rolling wheel coincide;
  • the camera is used to collect image frames, the first encoder is used to detect the speed of the first rolling wheel, and the second encoder is used to detect the speed of the second rolling wheel; the method includes:
  • the second aspect of the embodiments of the present invention is to provide a movable platform, including: a first rolling wheel, a second rolling wheel, a camera, a first encoder, a second encoder, a memory, and a processor;
  • the first rolling wheel and the second rolling wheel are used for displacing the movable platform, and the axes of the first rolling wheel and the second rolling wheel coincide;
  • the camera is used to collect image frames
  • the first encoder is used to detect the speed of the first rolling wheel
  • the second encoder is used to detect the speed of the second rolling wheel
  • the memory is used to store program codes
  • the processor calls the program code, and when the program code is executed, is used to perform the following operations:
  • a third aspect of the embodiments of the present invention is to provide a computer-readable storage medium on which a computer program is stored, and the computer program is executed by a processor to implement the method described in the first aspect.
  • the fourth aspect of the embodiments of the present invention is to provide a map construction method suitable for a movable platform, where the movable platform includes a first rolling wheel, a second rolling wheel, a camera, a first encoder, and a second encoder,
  • the first rolling wheel and the second rolling wheel are used to displace the movable platform, and the axes of the first rolling wheel and the second rolling wheel coincide;
  • the camera is used to collect image frames,
  • the first encoder is used to detect the speed of the first rolling wheel, and the second encoder is used to detect the speed of the second rolling wheel;
  • the method includes:
  • after the predicted pose of the current image frame is obtained, the constructed map is updated according to the current image frame on the basis of the constructed map; the constructed map is built from key frames determined before the current image frame.
  • the fifth aspect of the embodiments of the present invention is to provide a movable platform, including: a first rolling wheel, a second rolling wheel, a camera, a first encoder, a second encoder, a memory, and a processor;
  • the first rolling wheel and the second rolling wheel are used for displacing the movable platform, and the axes of the first rolling wheel and the second rolling wheel coincide;
  • the camera is used to collect image frames
  • the first encoder is used to detect the speed of the first rolling wheel
  • the second encoder is used to detect the speed of the second rolling wheel
  • the memory is used to store program codes
  • the processor calls the program code, and when the program code is executed, is used to perform the following operations:
  • after the predicted pose of the current image frame is obtained, the constructed map is updated according to the current image frame on the basis of the constructed map; the constructed map is built from key frames determined before the current image frame.
  • the sixth aspect of the embodiments of the present invention is to provide a computer-readable storage medium on which a computer program is stored, and the computer program is executed by a processor to implement the method described in the fourth aspect.
  • the first time period is acquired according to the time stamp of the current image frame and the time stamp of the previous image frame; the measurement data of the first encoder and the measurement data of the second encoder within the first time period are acquired; a predicted transformation matrix of the current image frame relative to the previous image frame is obtained according to the measurement data of the first encoder and the measurement data of the second encoder within the first time period; and the predicted pose of the current image frame is obtained according to the predicted transformation matrix and the pose information of the previous image frame.
  • the embodiments of the present invention apply the encoders to visual SLAM and predict the pose of the current image frame through the encoder measurement data, which can improve the accuracy of pose acquisition and the stability and robustness of visual SLAM, so that visual SLAM can still run stably when a visual failure occurs.
  • FIG. 1 is a flowchart of a map construction method provided by an embodiment of the present invention.
  • FIG. 2 is a flowchart of a pose prediction method provided by an embodiment of the present invention.
  • FIG. 3 is a flowchart of a map construction method provided by another embodiment of the present invention.
  • FIG. 5 is a flowchart of a map construction method provided by another embodiment of the present invention.
  • FIG. 6 is a flowchart of a map construction method provided by another embodiment of the present invention.
  • FIG. 7 is a schematic structural diagram of a movable platform provided by another embodiment of the present invention.
  • when a component is referred to as being "fixed to" another component, it can be directly on the other component or an intervening component may also exist. When a component is considered to be "connected to" another component, it can be directly connected to the other component or an intervening component may exist at the same time.
  • the embodiment of the present invention provides a map construction method.
  • the map construction method is applied to a movable platform, which may be a robot, an unmanned vehicle, etc. As shown in FIG. 7, the movable platform 700 specifically includes a first rolling wheel 701, a second rolling wheel 702, a first encoder 703, a second encoder 704, and a camera 705, and may further include a processor 706 and a memory 707.
  • the first rolling wheel 701 and the second rolling wheel 702 are used to displace the movable platform, and the axes of the first rolling wheel 701 and the second rolling wheel 702 coincide; the camera 705 is used to collect image frames, the first encoder 703 is used to detect the speed of the first rolling wheel 701, and the second encoder 704 is used to detect the speed of the second rolling wheel 702.
  • the camera 705 can be a monocular camera, a binocular camera, an RGBD depth camera, etc.
  • when the camera 705 is a monocular camera, it is usually tightly coupled with an IMU (Inertial Measurement Unit).
  • the IMU is a measurement unit integrating two types of sensors: a gyroscope for measuring three-dimensional rotation and an accelerometer for measuring three-dimensional acceleration. Of course, when the camera 705 is a binocular camera or an RGBD depth camera, it can also be used together with an IMU to further improve accuracy, but using an IMU imposes higher requirements on sensor calibration.
  • an encoder is an angular displacement sensor that measures the speed of a wheel by detecting the angle, in radians, through which the robot wheel has turned in a certain period of time. Encoders are mainly divided into three types: photoelectric, contact, and electromagnetic.
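As an illustration of how such a sensor is read out, the following is a minimal sketch, not taken from the patent, that converts an incremental encoder's tick count into a linear wheel speed; ticks_per_rev and wheel_radius are assumed example parameters.

```python
import math

def wheel_speed(delta_ticks: int, dt: float,
                ticks_per_rev: int = 1024, wheel_radius: float = 0.05) -> float:
    """Linear wheel speed in m/s from the ticks counted over dt seconds."""
    delta_angle = 2.0 * math.pi * delta_ticks / ticks_per_rev  # radians turned
    return (delta_angle / dt) * wheel_radius                   # v = omega * r
```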
  • the map construction method according to the embodiment of the present invention can be divided into a tracking thread S11, a local mapping thread S12, and a loop detection thread S13.
  • the tracking thread S11 includes predicting the pose of an image frame. As shown in FIG. 2, predicting the pose of an image frame specifically includes:
  • Step S101 Obtain a first time period according to the time stamp of the current image frame and the time stamp of the previous image frame.
  • the preprocessing process specifically includes:
  • for an RGBD camera, the collected images include an RGB image and a depth image. The RGB image can first be converted to a grayscale image, feature points are extracted from the grayscale image, and the stereo coordinates of the feature points are then calculated from the grayscale image and the depth image, that is, the coordinates of each feature point relative to a virtual right camera; in this way the image collected by the RGBD camera is equivalent to the images collected by a binocular camera.
  • for a binocular camera, the collected images are also converted into grayscale images, feature points are extracted from the grayscale images, and the feature points in the left-camera and right-camera images are matched to calculate the depth of the feature points and obtain their stereo coordinates.
  • the feature points extracted in this embodiment can be ORB (Oriented FAST and Rotated BRIEF) feature points.
  • the ORB feature is rotation-invariant, has high stability, is insensitive to lighting and moving objects, and has a small computational cost, so it can be processed in real time using only the CPU of a conventional notebook computer; of course, the feature points can also be Harris corners, SIFT (Scale-Invariant Feature Transform) feature points, SURF (Speeded Up Robust Features) feature points, etc.
  • Step S102 Obtain the measurement data of the first encoder and the measurement data of the second encoder in the first time period.
  • after the measurement data of the first encoder or the second encoder is acquired, it can be determined whether the measurement data is available, specifically as follows:
  • the acquisition frequency measured by the encoder is usually high, such as 200 Hz, while the acquisition frequency of the image is usually low, such as 30 Hz.
  • encoder measurement data collected at exactly the same time as the image therefore may not exist, so the encoder measurement data closest to the image acquisition time is found, that is, the time error between the acquisition time of the encoder measurement and the acquisition time of the current image frame must not exceed a preset time threshold.
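The nearest-sample lookup described above can be sketched as follows; this is an assumed illustration, with max_dt standing in for the preset time threshold and the encoder timestamps assumed sorted and non-empty.

```python
import bisect

def nearest_encoder_sample(encoder_stamps, image_stamp, max_dt=0.005):
    """Index of the encoder sample closest to image_stamp, or None when the
    time error exceeds the preset threshold max_dt (all times in seconds)."""
    i = bisect.bisect_left(encoder_stamps, image_stamp)
    candidates = [j for j in (i - 1, i) if 0 <= j < len(encoder_stamps)]
    best = min(candidates, key=lambda j: abs(encoder_stamps[j] - image_stamp))
    return best if abs(encoder_stamps[best] - image_stamp) <= max_dt else None
```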
  • Step S103 Obtain a prediction transformation matrix of the current image frame relative to the previous image frame according to the measurement data of the first encoder and the measurement data of the second encoder in the first time period.
  • Step S104 Obtain the predicted pose of the current image frame according to the predicted transformation matrix and the pose information of the previous image frame.
  • the camera pose of the current image frame can be predicted according to the encoder data to obtain the predicted pose of the current image frame.
  • it is assumed here that the movement of the movable platform is two-dimensional (on a plane or a curved surface) and that the ground provides sufficient friction.
  • for a curved surface, the idea of Taylor expansion can be adopted: the surface is approximated as a plane over a short time, so as long as the frame rate is high enough, curved-surface movement can also be handled.
  • the predicted transformation matrix T_10 may be derived through a plane motion model, such as the motion model of a differential two-wheeled vehicle.
  • the x-axis component v_E of the measured velocity and the z-axis component ω_E of the measured angular velocity satisfy the differential two-wheel motion model, as shown in formula (2), where E denotes the encoder coordinate system of the current image frame, with the midpoint of the two-wheel track as the origin and the x-axis pointing forward:
  • the first rolling wheel speed and the second rolling wheel speed measured by the encoders are all the data between the encoder samples closest to the acquisition times of the current image frame and the previous image frame. It is understandable that, in the process of obtaining the predicted transformation matrix T_10, either all of the wheel-speed data in the first time period or only part of it may be used.
  • E_i represents the encoder coordinate system of the previous image frame
  • [ω_E]_× represents the skew-symmetric matrix of the measured angular velocity of the current image frame, and Ṙ and ṗ respectively represent the first derivatives of the rotation matrix and the position of the current image frame with respect to time.
  • after converting the predicted rotation matrix and the predicted translation from the encoder coordinate system to the camera coordinate system, the predicted transformation matrix between frames i and j can be obtained as shown in formula (4), i.e. T_CjCi = T_CE · T_EjEi · T_CE⁻¹.
  • T_CjCi is the aforementioned predicted transformation matrix T_10.
  • the fixed transformation matrix T_CE of the camera coordinate system relative to the encoder coordinate system used in this conversion needs to be calibrated in advance.
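A minimal dead-reckoning sketch of this prediction step is given below. It assumes the standard differential-drive relations (forward speed is the mean of the two wheel speeds, yaw rate their difference over the wheel track) and simple Euler integration; the wheel_track value and the planar SE(2) state are illustrative assumptions, not the patent's exact pre-integration.

```python
import math

def predict_transform(samples, wheel_track=0.30):
    """samples: list of (dt, v1, v2) wheel-speed measurements covering the
    first time period. Returns (x, y, theta): the planar pose of the current
    encoder frame expressed in the previous encoder frame."""
    x = y = theta = 0.0
    for dt, v1, v2 in samples:
        v = 0.5 * (v1 + v2)              # forward speed of the axle midpoint
        w = (v2 - v1) / wheel_track      # yaw rate of the differential model
        x += v * math.cos(theta) * dt    # Euler step in the previous frame
        y += v * math.sin(theta) * dt
        theta += w * dt
    return x, y, theta
```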
  • in this way, the encoder is applied to visual SLAM and the pose of the current image frame is predicted through the encoder measurement data. This can overcome visual-failure problems such as sparse feature points (no texture) or too much repetition, blur or non-overlapping regions caused by high-speed motion, and accidental reduction of the camera frame rate, so that the map points and trajectories before and after the failure can be connected, the SLAM system is not interrupted, and the stability and robustness of visual SLAM are improved.
  • the tracking thread S11 further includes optimizing the predicted pose, specifically:
  • the predicted pose of the current image frame is optimized by a preset motion-only BA (motion-only bundle adjustment) to obtain the optimized pose of the current image frame;
  • the pure motion bundle adjustment includes a reprojection error and an encoder error.
  • the predicted pose of the current image frame is optimized by pure motion bundle adjustment. BA (bundle adjustment) is a nonlinear optimization method that adjusts the camera parameters (mainly the extrinsic parameters, i.e. the poses of many key frames, and possibly the intrinsic parameters, i.e. the focal length and optical-center pixel coordinates) and the structure (the positions of the map points seen by the camera) so as to minimize the reprojection error: an edge connecting two key frames corresponds to a set of matching point pairs, and after a point in key frame 1 is projected onto key frame 2 through the geometric relations, its coordinate difference from the matching point is the error; the sum of the squared norms of these errors weighted by the information matrix is minimized as the objective function. Motion-only BA adjusts only the camera parameters (mainly the poses of the key frames), but its objective function is still a nonlinear optimization based on reprojection errors.
  • the reprojection error is obtained by reprojecting the map points of the local map onto the current image frame using its predicted pose and computing the error between the reprojected points and the corresponding feature points of the current image frame.
  • the optimization formula of the objective function of the pure motion bundle adjustment may be as shown in formula (5):
  • the part to the right of min is the objective function.
  • the first term is the prior error, which is generally 0. However, if the poses of the two consecutive frames are optimized at the same time when tightly coupled with the IMU, the Hessian matrix from the Motion-only BA of the previous image frame can be used as the information matrix Σ0⁻¹ to keep the pose of the previous image frame as unchanged as possible; the second term is the reprojection error term corresponding to the visual observation z_il;
  • the last term is the error term formed by the encoder information E_ij (that is, the squared norm of the error vector weighted by the inverse of the covariance matrix).
  • the encoder error term can be obtained by the following formula (6):
  • e_R represents the error of the relative rotation matrix (expressed via the logarithm map in its Lie algebra, it is a three-dimensional vector)
  • e_p represents the error of the relative translation (a three-dimensional vector); the encoder error is therefore a six-dimensional vector, the corresponding covariance matrix is a 6×6 square matrix, and its inverse, the information matrix, is used here.
  • e_R, e_p, and the covariance matrix are all calculated from the relative rotation matrix and the relative translation obtained by the above-mentioned pre-integration.
  • R_CiW and p_CiW are the pose of the previous image frame relative to the world coordinate system, and R_CE and p_CE are the elements of the fixed transformation matrix T_CE involved in formula (4), which needs to be calibrated in advance.
  • the aforementioned Motion-only BA uses nonlinear optimization to calculate the optimized pose R_CjW and p_CjW of the current image frame that minimizes the objective function.
  • the Jacobian matrix used in the nonlinear optimization process in this embodiment can be numerical; of course, the analytical solution derived from the relevant Lie algebra formulas can also be used, and the latter often makes the optimization more stable.
  • in an embodiment, the pure motion bundle adjustment further includes an IMU error term.
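To make the structure of the encoder error term concrete, here is a schematic numpy sketch in the spirit of formula (6); the pose convention (world-to-camera rotation R and translation p) and the assumption that the pre-integrated encoder motion has already been converted to the camera frame via T_CE are illustrative, not the patent's exact formulation.

```python
import numpy as np

def so3_log(R):
    """Axis-angle (Lie algebra) vector of a rotation matrix."""
    cos_theta = np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0)
    theta = np.arccos(cos_theta)
    if theta < 1e-8:
        return np.zeros(3)
    w = np.array([R[2, 1] - R[1, 2], R[0, 2] - R[2, 0], R[1, 0] - R[0, 1]])
    return theta / (2.0 * np.sin(theta)) * w

def encoder_error(R_ciw, p_ciw, R_cjw, p_cjw, dR_enc, dp_enc):
    """6-D encoder error: e_R compares the relative rotation implied by the
    two camera poses with the pre-integrated rotation dR_enc; e_p does the
    same for the translation dp_enc."""
    R_rel = R_cjw @ R_ciw.T            # relative rotation, camera i -> j
    p_rel = p_cjw - R_rel @ p_ciw      # relative translation in frame j
    e_R = so3_log(dR_enc.T @ R_rel)    # 3-D rotation error
    e_p = p_rel - dp_enc               # 3-D translation error
    return np.concatenate([e_R, e_p])

def encoder_error_term(e, information_6x6):
    """Squared norm of the 6-D error weighted by the information matrix."""
    return float(e @ information_6x6 @ e)
```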
  • in the prior art, a predictive brute-force matching method is usually used: the feature points of the previous image frame are projected onto the current image frame using the predicted transformation matrix of a uniform motion model (comprising a rotation matrix and a translation vector), and a small rectangular range is then selected for fast brute-force matching.
  • that is, the pose and speed of the previous image frame are used to estimate the initial pose of the current image frame, and the initial pose is then optimized to obtain the optimized pose.
  • the present invention instead uses the wheel speeds measured by the encoders to estimate the initial pose of the current image frame and then optimizes the initial pose to obtain the optimized pose, which is more robust.
  • the method may further include:
  • Step S202 Judge whether the vision has failed.
  • If the vision has failed, go to step S203; if the vision has not failed, go to step S204.
  • Step S203 Use the predicted pose of the current image frame as the pose of the current image frame.
  • Step S204 Use the optimized pose of the current image frame as the pose of the current image frame.
  • when the pure motion bundle adjustment fails, that is, when the optimized pose cannot be obtained through the objective function of the pure motion bundle adjustment described above, it is determined that the vision has failed. A visual failure means that the Motion-only BA process is invalid, i.e. the optimized pose obtained by the final optimization is inaccurate and deviates greatly from the real pose, so it is not used, and the predicted pose is used as the pose of the current image frame instead. If the vision has not failed, that is, the Motion-only BA process succeeded, the optimized pose can be used as the pose of the current image frame.
  • the visual failure can be caused by the following: the feature points in the image are sparse (no texture) or repeat too much with the previous image frame; blur or non-overlapping regions are caused by high-speed motion; or the camera frame rate accidentally decreases.
  • when the optimization result cannot be obtained through the pure motion bundle adjustment model, it is determined that the current image frame is visually invalid relative to the previous image frame.
  • specifically, the optimized pose obtained by Motion-only BA can be used to re-match the feature points of the current image frame against the frames that share feature points with it (or to match the map points in the map against the feature points of the current image frame). During matching, the shared feature points in those frames can be projected into the current image frame.
  • if the distance between a projected feature point and the corresponding feature point of the current image frame is less than a threshold, it is determined to be an inlier; otherwise it is an outlier. Then judge whether the number of inliers in the current image frame reaches the preset number threshold: if it is greater than or equal to the preset number threshold, the Motion-only BA succeeded, that is, the vision has not failed; if it is less than the preset number threshold, the Motion-only BA failed, that is, the vision has failed.
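The inlier test can be sketched as follows; the pinhole projection, the 5-pixel distance threshold, and the minimum inlier count are assumed example values rather than the patent's.

```python
import numpy as np

def count_inliers(R_cw, p_cw, points_w, matched_px, K, px_thresh=5.0):
    """Project candidate map points with the optimized pose (R_cw, p_cw) and
    count projections that land within px_thresh pixels of their matches."""
    inliers = 0
    for X_w, z in zip(points_w, matched_px):
        X_c = R_cw @ np.asarray(X_w) + p_cw
        if X_c[2] <= 0:                      # behind the camera: outlier
            continue
        u = (K @ (X_c / X_c[2]))[:2]         # pinhole projection
        if np.linalg.norm(u - np.asarray(z)) < px_thresh:
            inliers += 1
    return inliers

def vision_failed(num_inliers, min_inliers=15):
    """Motion-only BA is declared failed when too few inliers remain."""
    return num_inliers < min_inliers
```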
  • in this way, the pose information of the current image frame relative to the previous image frame is obtained, and the movement track of the movable platform can be tracked according to the relative pose.
  • when the camera collects the image of the current image frame, the pose of the current image frame can be predicted according to the camera pose at the previous image frame and the speeds of the first rolling wheel and the second rolling wheel detected by the encoders, yielding the predicted pose of the current image frame; feature-point matching is then performed according to the predicted pose of the current image frame, the image of the current image frame, and the key frames sharing feature points with the current image frame, the predicted pose of the current image frame is optimized, and the optimized pose of the current image frame is obtained.
  • if the vision fails temporarily, that is, the optimized pose cannot be obtained from the current image,
  • the predicted pose of the current image frame can be used as the pose information of the current image frame to ensure that the subsequent process continues, avoiding the situation in which the map points and trajectory before and after the visual failure cannot be connected and work cannot continue.
  • since the predicted pose of the current image frame is obtained from the encoders, and encoders do not suffer from zero drift, the accumulated error over a short period of time stays small, which bridges the transition from a short visual failure to visual recovery and thereby improves the stability and robustness of visual SLAM.
  • the local mapping thread S12 can then be performed according to the image frame and the corresponding pose information. Specifically, it can first be determined whether the current image frame is a key frame; if it is, the feature points of the key frame can be matched at the back end to add new map points, realizing the construction of the local map and the optimization of the global map.
  • in summary, the visual SLAM method obtains camera images and encoder data, obtains camera pose information according to the image frames and the measurement data of the encoders, and constructs the map according to the image frames and the pose information.
  • by applying the encoders to visual SLAM, the stability and robustness of visual SLAM can be improved, and the situation in which a short visual failure leaves the map points and trajectories before and after the failure unconnected so that work cannot continue is avoided.
  • in the local mapping thread S12, after the pose of the current image frame is obtained, the method may further include:
  • Step S301 Determine whether the current image frame is a key frame
  • if so, the current image frame is added to the local mapping thread S12 to construct the map. Specifically, this may include:
  • Step S302 When it is determined that the current image frame is a key frame, on the basis of the constructed map, the constructed map is updated according to the current image frame;
  • the constructed map is constructed by key frames determined before the current image frame.
  • map points that are sufficiently close to each other can also be directly inserted into the map as new map points.
  • the current image frame will be used as the new key frame; or, when the number of matched feature points of the current image frame is greater than the preset matching-number threshold and sufficient new map points are provided, the current image frame will also be used as the new key frame.
  • the number of feature-point matches refers to the number of matches between the feature points of the current image frame and the map points in the map, or between the feature points of the current image frame and the feature points of a key frame that shares feature points with the current image frame;
  • the time interval between the current image frame and the previous key frame must not be too long, that is, when the time interval between the current image frame and the previous key frame exceeds the preset time-interval threshold, the current image frame must be used as the new key frame.
  • when the vision has not failed, the key frame judgment strategy of the prior art can be used. When the vision fails, since the predicted pose of the current image frame can be obtained through the encoder data, two frames are allowed to be connected through the encoder (that is, pure encoder edges are allowed); it is then not required that the distance between the current image frame and the previous key frame be greater than the minimum value of the preset distance range, nor that the number of matched feature points of the current image frame be greater than the preset matching-number threshold, so these two restrictions can be removed. However, for RGBD or binocular cameras, the minimum number of map points created by the current image frame needs to be increased, that is, the number of map points created by the current image frame needs to be greater than a preset map-point number threshold, to prevent the frame from contributing too few map points and losing the meaning of being added; for the case where a monocular camera is tightly coupled with an IMU, a minimum time-interval limit is added, and in the local mapping thread some map points are tentatively constructed by triangulation to restore visual tracking.
  • after the vision recovers, the key frame judgment strategy can be restored to the key frame judgment strategy of the prior art; a simplified sketch of the overall key-frame decision follows.
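This sketch combines the rules above with assumed example thresholds; the exact threshold values and the monocular-IMU special case are left out.

```python
def is_new_keyframe(dt_since_kf, n_matches, n_new_map_points, vision_failed,
                    max_dt=1.0, min_matches=50, min_new_points=30):
    """Simplified key-frame decision: force a key frame when too much time
    has passed; during visual failure allow a pure encoder edge but require
    the frame to contribute enough new map points (RGBD/binocular case)."""
    if dt_since_kf > max_dt:
        return True
    if vision_failed:
        return n_new_map_points >= min_new_points
    return n_matches > min_matches and n_new_map_points > 0
```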
  • updating the constructed map according to the current image frame includes:
  • Step S401 Remove redundant key frames and redundant map points according to a preset strategy, and perform a triangulation operation according to the current image frame to generate new map points;
  • Step S402 After generating new map points, construct the common view (covisibility graph) of the current image frame and perform local bundle adjustment to optimize the other key frames that are covisible with the current image frame, the map points that can be seen in the current image frame, and the map points that can be seen in those other key frames; the local bundle adjustment includes an encoder error term.
  • the redundant map points can be removed first.
  • the redundant map points can include map points that are not seen near the current key frame and in the previous several frames, and map points whose ratio of continuous observations is small; for feature points in the current image frame that can be fused with existing map points, the existing map points can be replaced by fusion.
  • redundant key frames can also be removed.
  • redundant key frames are key frames for which there exists at least one other key frame seeing the same map points in a proportion exceeding a threshold (for example, 90%); in this embodiment, it is determined whether the key frames (frame nodes in the common view) that can see the same map points as the current key frame are redundant.
  • in an embodiment, constructing the common view of the current image frame to perform local bundle adjustment includes:
  • Step S501 When any key frame is connected to the previous key frame by a pure encoder edge and is also connected to the next key frame by a pure encoder edge, remove the key frame, and acquire a second time period according to the time stamp of the previous key frame and the time stamp of the next key frame; a pure encoder edge is an edge created when the visual failure occurs between adjacent key frames;
  • Step S502 Obtain all measurement data of the first encoder and all measurement data of the second encoder in the second time period;
  • Step S503 Obtain a predicted transformation matrix of the next key frame relative to the previous key frame according to all the measurement data of the first encoder and all the measurement data of the second encoder in the second time period.
  • in this way, key frames connected by consecutive pure encoder edges can be removed. Consider a chain of key frames connected by consecutive pure encoder edges, such as "key frame 1 - encoder edge 1 - key frame 2 - encoder edge 2 - key frame 3".
  • the relative pose of key frame 2 with respect to key frame 1 is the predicted pose obtained from the encoder data (i.e. encoder edge 1), and the relative pose of key frame 3 with respect to key frame 2 is also the predicted pose obtained from the encoder data (i.e. encoder edge 2).
  • removing consecutive pure encoder edges can reduce the amount of computation in subsequent optimization and prevent triangulation from relying only on encoder data, whose error is large (the limited accuracy of the encoder and the friction with the ground make the encoder less accurate than vision).
  • therefore, key frame 2 in the chain of consecutive pure encoder edges can be deleted to obtain "key frame 1 - encoder edge - key frame 3", where the encoder edge needs to be recalculated, that is, the predicted pose of key frame 3 relative to key frame 1 is computed through the pre-integration process of the foregoing embodiment.
  • in the triangulation operation, a new map point can be created by triangulation: from two points (the relative pose of two key frames) and the rays passing through them (the pixel coordinates of a matching point pair, that is, the viewing directions), the position of the ray intersection relative to these two points (the depth or position of the matching point pair, i.e. the 3D point) is obtained.
  • for example, a 3D point X is observed by the left and right cameras of a binocular camera; the imaging position of the 3D point on the left camera is x, and the imaging position on the right camera is x'.
  • the images collected by the left and right cameras of the binocular camera can be replaced by two adjacent frames (the relative pose between the two frames is known) that simultaneously observe a certain fixed point in space.
  • the triangulation process for the 3D point X is then the same as that of a binocular camera, so it is not repeated here.
  • as mentioned above, the image collected by the RGBD camera is equivalent to the images collected by a binocular camera, so the same triangulation process as for the binocular camera can be used; it is not repeated here.
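The standard linear (DLT) triangulation can serve as a concrete reference for this step; the patent does not name a specific solver, so the following numpy sketch is only an assumed illustration. P1 and P2 are the 3x4 projection matrices of the two views, and x1, x2 the matched pixel coordinates.

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Return the 3D point X minimizing the algebraic error of
    x1 ~ P1 @ X_h and x2 ~ P2 @ X_h (X_h homogeneous)."""
    A = np.stack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)     # null-space direction of A
    X_h = Vt[-1]
    return X_h[:3] / X_h[3]         # dehomogenize
```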
  • after the triangulation operation, the key-frame poses and the map points can be optimized by local BA; the local BA process is similar to the Motion-only BA optimization in the foregoing embodiment, and the objective function of the local BA also includes an encoder error term.
  • the key frames participating in the local BA are usually those in the common view of the current key frame (that is, the graph contains two types of vertices, frames and map points, together with the reprojection edges between a frame and the map points it can see; these key frames can see the map points seen by the current key frame, and the map points in the graph are all the map points visible to these key frames).
  • in this embodiment, the key frames participating in the local BA are changed to the N consecutive key frames preceding the current image frame (N is greater than or equal to 1), and the objective function of the local BA is constructed from these N consecutive key frames so as to optimize the pose and map points of the current key frame, which reduces the amount of computation.
  • after local mapping, the current key frame is sent to the loop detection thread S13 for closed-loop detection, to determine whether the movable platform is at a previously visited position.
  • closed-loop detection in the prior art queries the map database, calculates the similarity transformation (SE3 or Sim3) between the current key frame and the closed-loop key frame through the RANSAC (Random Sample Consensus) algorithm, and merges, through closed-loop fusion, the map points seen by the current key frame and by the key frames near the closed-loop key frame.
  • the RANSAC algorithm repeatedly computes the transformation from several randomly selected pairs of matching points, uses probability statistics to find the most correct transformation, and distinguishes which matching point pairs are correct (correct matching point pairs are called inliers).
  • specifically, the bag-of-words (BoW) vector of each key frame can be obtained, and the distance between the bag-of-words vector of any key frame and the bag-of-words vector of the current key frame (for example, the Euclidean distance) is computed; whether there is a closed loop is judged from the distance between the two bag-of-words vectors. The bag-of-words vector is obtained from the feature points of the key frame and a preset feature-point vocabulary (bag of words); the preset vocabulary is constructed by clustering feature points and contains feature-point words arranged in a predetermined order, and the key frame's feature points are matched against each word in the preset vocabulary.
  • a word is set to 1 if it appears among the key frame's feature points and to 0 if it does not, so that the bag-of-words vector of the key frame is obtained.
  • the bag-of-words vector of each key frame is stored in the database.
  • the distance between the bag-of-words vector of the current key frame and the bag-of-words vector of each key frame in the database can then be used to judge whether the current key frame has a closed loop.
  • the key frame whose bag-of-words vector is close enough is the closed-loop key frame of the current image frame.
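A toy illustration of this binary bag-of-words scheme is given below; real systems typically use weighted vocabularies (for example DBoW2), and the distance threshold here is an assumed example.

```python
import numpy as np

def bow_vector(word_ids, vocab_size):
    """Binary BoW vector: entry 1 if the vocabulary word appears among the
    key frame's feature points, 0 otherwise."""
    v = np.zeros(vocab_size)
    v[sorted(set(word_ids))] = 1.0
    return v

def is_loop_candidate(v_current, v_candidate, dist_thresh=4.0):
    """Loop-closure candidate test via Euclidean distance of BoW vectors."""
    return np.linalg.norm(v_current - v_candidate) < dist_thresh
```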
  • in this embodiment, the thresholds for detecting a closed loop can be lowered after a visual failure.
  • the thresholds for detecting a closed loop include the number of repeated detections, the threshold on the number of matching points for calculating the similarity transformation, and the number of inliers.
  • the pose graph can be optimized.
  • the pose graph is a graph whose vertices contain no map points but only all the key frames, and whose edges contain only the relative transformations between frames on the spanning tree and the similarity transformations corresponding to all closed-loop edges.
  • the spanning tree is a loop-free graph connecting the key frames of the entire map.
  • in the spanning tree, two key frames consecutive in time become parent and child nodes, but when a parent node is deleted, the parent node of the child node is modified according to the number of commonly seen map points (at this time the corresponding key-frame removal operation updates the pre-integration of the child nodes); the root node is the initial key frame created when the visual SLAM system starts.
  • the pose graph optimization adjusts the pose of each key frame; the objective function is the sum of the squared errors between each edge's relative transformation and the relative pose of its two vertices, weighted by the covariance matrix (originally a 3×3 identity matrix I).
  • in this embodiment, a modified covariance matrix is generated according to the predicted transformation matrix to modify the covariance matrix of the relative transformation between adjacent key frames.
  • specifically, the covariance matrix obtained by pre-integration can be used to modify the covariance matrix of the relative transformation corresponding to the pure encoder edges on the spanning tree.
  • its diagonal elements are generally greater than 1, which means that the accuracy of a pure encoder edge is lower than that of an edge tightly coupling the encoder with vision; this reduces the error introduced by pure encoder edges during nonlinear optimization.
  • in this way, the error introduced by a pure encoder edge can be dispersed over all edges. For example, on the spanning tree "key frame 1 - key frame 2 - key frame 3", when a closed loop between key frame 3 and key frame 1 occurs, a closed-loop edge is formed between key frame 1 and key frame 3 (the similarity transformation SE3 or Sim3 between the current key frame and the closed-loop key frame), which disperses the accumulated error between key frame 1 and key frame 3 over the edges of "key frame 1 - key frame 2 - key frame 3", so the error of each edge is relatively small and within an acceptable range.
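The down-weighting of pure encoder edges can be sketched as follows; the 6-D edge dimension and the identity baseline are assumptions consistent with the description above, not the patent's exact matrices.

```python
import numpy as np

def edge_information(cov_preintegrated=None, dim=6):
    """Information matrix for one pose-graph edge: visual edges keep the
    original identity covariance, while a pure encoder edge uses the
    pre-integrated encoder covariance (diagonal generally > 1), which
    lowers its weight in the optimization."""
    cov = np.eye(dim) if cov_preintegrated is None else np.asarray(cov_preintegrated)
    return np.linalg.inv(cov)
```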
  • global BA optimization can also be performed.
  • the global BA is similar to the Motion-only BA and local BA optimization processes in the above embodiments.
  • the objective function of the global BA also includes the encoder error term; the difference from the local BA optimization process is that the key frames participating in the global BA are all the key frames and map points in the entire map library. After completing the global BA, the map can be updated: all map points are updated, and all key frames in the map library are updated up to the latest key frame.
  • for a monocular camera tightly coupled with an IMU, the IMU needs to be initialized to obtain the biases of the IMU, including the bias of the gyroscope and the bias of the accelerometer; the gravity vector can also be obtained.
  • for a monocular camera, the map scale needs to be determined through IMU initialization, while binocular cameras and RGBD cameras usually do not need IMU initialization to determine the map scale; when a binocular or RGBD camera determines the map scale, the initialization result is close to 1.
  • when the first predetermined number of key frames (for example, 20) has been collected,
  • the zero bias, gravity vector, and map scale of the IMU can be obtained according to these key frames.
  • the map scale is then updated to all the key frames, including the predetermined number of key frames and the new key frames obtained during the initialization process, and global BA optimization is performed to update the map; after that, visual SLAM can run at the updated map scale.
  • this embodiment also provides a lightweight positioning method.
  • when saving the map, the simplest saving method can be adopted, that is, only the key frame and map point information of the necessary map is saved, and binary file storage can be used.
  • the main stored content involves: the sensor type; for each key frame, its sequence number, pose, bag-of-words vector, feature-point sequence and inter-frame sensor information sequence, the sequence numbers of the map points observable from the frame, and the sequence numbers of the closed-loop key frames connected to the frame; for each map point, its sequence number, position, reference key frame sequence number, and the sequence numbers of the key frames from which it can be observed; and the spanning-tree structure (namely each node's parent sequence number).
  • the sequence numbers used when saving the key frames and map points may not be the original sequence numbers.
  • in this way, the map information of a long-running visual SLAM system can be stored in a small storage space, which increases the maximum working time of the visual SLAM system, but more time is needed to reconstruct the map and trajectory when reloading.
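As an illustration of such a compact binary layout, here is a minimal sketch using Python's struct packing; the field set and record layout are assumed for illustration and are not the patent's actual format.

```python
import struct

def save_map(path, keyframes, map_points):
    """keyframes: list of (kf_id, pose7), pose7 = (tx, ty, tz, qx, qy, qz, qw);
    map_points: list of (pt_id, x, y, z, ref_kf_id)."""
    with open(path, "wb") as f:
        f.write(struct.pack("<II", len(keyframes), len(map_points)))
        for kf_id, pose in keyframes:
            f.write(struct.pack("<I7d", kf_id, *pose))   # id + 7-DoF pose
        for pt_id, x, y, z, ref_kf in map_points:
            f.write(struct.pack("<I3dI", pt_id, x, y, z, ref_kf))
```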
  • for lightweight positioning, visual SLAM can start only the tracking thread S11 for positioning, without starting the local mapping thread S12 and the loop detection thread S13. It should be noted that, for the case where a monocular camera is tightly coupled with an IMU, the IMU still needs to be initialized (to obtain the initial zero bias and the gravitational acceleration).
  • FIG. 7 is a structural diagram of a movable platform provided by an embodiment of the present invention.
  • the movable platform 700 includes a first rolling wheel 701, a second rolling wheel 702, a first encoder 703, a second encoder 704, a camera 705, a memory 707, and a processor 706;
  • the first rolling wheel 701 and the second rolling wheel 702 are used for displacing the movable platform, and the axes of the first rolling wheel 701 and the second rolling wheel 702 coincide;
  • the camera 705 is used to collect image frames
  • the first encoder 703 is used to detect the speed of the first rolling wheel 701
  • the second encoder 704 is used to detect the speed of the second rolling wheel 702;
  • the memory 707 is used to store program codes
  • the processor 706 calls the program code, and when the program code is executed, is configured to perform the following operations:
  • when the processor 706 obtains the predicted transformation matrix of the current image frame relative to the previous image frame according to the measurement data of the first encoder and the measurement data of the second encoder in the first time period,
  • the processor is configured to:
  • obtain, according to the measurement data of the first encoder and the measurement data of the second encoder in the first time period, the first derivative of the rotation matrix of the current image frame with respect to time and the first derivative of the position of the current image frame with respect to time;
  • the predicted rotation matrix and the predicted translation matrix of the current image frame relative to the previous image frame are then obtained.
  • after the processor 706 obtains the predicted pose of the current image frame according to the predicted transformation matrix and the pose information of the previous image frame, the processor 706 is further configured to:
  • optimize the predicted pose of the current image frame according to a preset pure motion bundle adjustment model, where the pure motion bundle adjustment model includes an encoder error term, and the encoder error term is obtained from the measurement data of the first encoder and the measurement data of the second encoder.
  • processor 706 is further configured to:
  • if the vision has failed, the pose of the current image frame is the predicted pose;
  • if the vision has not failed, the pose of the current image frame is the optimization result of the pure motion bundle adjustment model.
  • processor 706 is further configured to:
  • the constructed map is updated according to the current image frame; the constructed map is built from key frames determined before the current image frame.
  • when the processor 706 updates the constructed map according to the current image frame, the processor 706 is configured to:
  • when the processor 706 constructs the common view of the current image frame to perform local bundle adjustment,
  • the processor 706 is configured to:
  • obtain the predicted transformation matrix of the next key frame relative to the previous key frame.
  • when the processor 706 constructs the common view of the current image frame to perform local bundle adjustment, the processor 706 is further configured to:
  • N is greater than or equal to 1;
  • optimize, according to the encoder error and the reprojection error, the current image frame and the map points it can see, as well as the N consecutive key frames and the map points they can see, and update the poses of all key frames and the positions of all map points in the local optimization map.
  • the processor 706 is further configured to:
  • construct a pose graph to optimize all the key frames and update the poses of all the key frames; the pose graph includes all the key frames and the relative transformations between pairs of key frames.
  • when the processor 706 constructs a pose graph to optimize all key frames, the processor 706 is configured to:
  • a modified covariance matrix is generated according to the predictive transformation matrix to modify the covariance matrix of the relative transformation between the adjacent key frames.
  • the processor 706 is further configured to:
  • processor 706 calls the program code, and when the program code is executed, is configured to perform the following operations:
  • after the predicted pose of the current image frame is obtained, the constructed map is updated according to the current image frame on the basis of the constructed map; the constructed map is built from key frames determined before the current image frame.
  • the movable platform provided in this embodiment obtains the first time period according to the time stamp of the current image frame and the time stamp of the previous image frame; obtains the measurement data of the first encoder and the measurement data of the second encoder in the first time period; obtains the predicted transformation matrix of the current image frame relative to the previous image frame according to the measurement data of the first encoder and the measurement data of the second encoder in the first time period; and obtains the predicted pose of the current image frame according to the predicted transformation matrix and the pose information of the previous image frame.
  • in this way, the encoder is applied to visual SLAM and the pose of the current image frame is predicted through the encoder measurement data, which can improve the accuracy of pose acquisition and the stability and robustness of visual SLAM, so that visual SLAM can still run stably when the vision fails.
  • this embodiment also provides a computer-readable storage medium on which a computer program is stored, and the computer program is executed by a processor to implement the pose prediction method and/or map construction method described in the foregoing embodiments.
  • this embodiment also provides a computer program, including program code.
  • when the program code runs, it executes the pose prediction method and/or the map construction method described in the foregoing embodiments.
  • the disclosed device and method can be implemented in other ways.
  • the device embodiments described above are merely illustrative. For example, the division of the units is only a logical function division, and there may be other divisions in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • the functional units in the various embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated unit may be implemented in the form of hardware, or may be implemented in the form of hardware plus software functional units.
  • the above-mentioned integrated unit implemented in the form of a software functional unit may be stored in a computer readable storage medium.
  • the above-mentioned software functional unit is stored in a storage medium and includes several instructions to make a computer device (which may be a personal computer, a server, a network device, etc.) or a processor execute part of the steps of the methods described in the embodiments of the present invention.
  • the aforementioned storage media include: USB flash drive, mobile hard disk, read-only memory (Read-Only Memory, ROM), random access memory (Random Access Memory, RAM), magnetic disk, optical disk, and other media that can store program code.

Abstract

A pose prediction method, a map construction method, a movable platform, and a storage medium. The method comprises: acquiring a first time period according to a timestamp of the current image frame and a timestamp of a previous image frame (S101); acquiring measurement data of a first encoder and measurement data of a second encoder within the first time period (S102); obtaining a prediction transformation matrix of the current image frame with respect to the previous image frame according to the measurement data of the first encoder and the measurement data of the second encoder within the first time period (S103); and obtaining a predicted pose of the current image frame according to the prediction transformation matrix and pose information of the previous image frame (S104). By applying the encoders to visual SLAM, and predicting the pose of the current image frame by means of the measurement data of the encoders, the accuracy of pose acquisition can be improved, the stability and robustness of visual SLAM is improved, and visual SLAM can still stably operate during visual failure.

Description

Pose prediction method, map construction method, movable platform and storage medium
Technical Field
Embodiments of the present invention relate to the field of machine vision, and in particular to a pose prediction method, a map construction method, a movable platform, and a storage medium.
Background
Visual SLAM (Simultaneous Localization and Mapping) refers to a robot, whose own position is uncertain, creating a map of a completely unknown environment from images collected by a camera while using the map for autonomous localization and navigation.
Existing visual SLAM methods lose tracking when special problems arise, such as sparse feature points in the image (no texture) or too much repetition with the previous image frame, blur or no overlapping region caused by high-speed motion, or an occasional drop in the camera frame rate: the relative pose of the current image frame with respect to the previous image frame cannot be obtained, visual SLAM cannot continue to work, and vision can only be recovered by relocalization when a loop closure is triggered.
Existing visual SLAM methods therefore have poor stability and robustness. When a visual failure occurs, vision cannot be recovered until a loop closure occurs, so visual SLAM is interrupted during the failure and the map points and trajectory during the failure are missing; as a result, the map points and trajectories before and after the visual failure cannot be connected, and visual SLAM cannot run stably.
Summary of the Invention
Embodiments of the present invention provide a pose prediction method, a map construction method, a movable platform, and a storage medium, so as to improve the accuracy of acquiring the pose of the current image frame and the stability and robustness of visual SLAM, such that visual SLAM can still run stably during a visual failure.
A first aspect of the embodiments of the present invention provides a pose prediction method applicable to a movable platform. The movable platform includes a first rolling wheel, a second rolling wheel, a camera, a first encoder, and a second encoder; the first rolling wheel and the second rolling wheel are used to displace the movable platform, and their axes coincide; the camera is used to collect image frames, the first encoder is used to detect the speed of the first rolling wheel, and the second encoder is used to detect the speed of the second rolling wheel. The method includes:
obtaining a first time period according to the timestamp of the current image frame and the timestamp of the previous image frame;
obtaining the measurement data of the first encoder and the measurement data of the second encoder within the first time period;
obtaining a prediction transformation matrix of the current image frame relative to the previous image frame according to the measurement data of the first encoder and the measurement data of the second encoder within the first time period;
obtaining a predicted pose of the current image frame according to the prediction transformation matrix and the pose information of the previous image frame.
A second aspect of the embodiments of the present invention provides a movable platform, including: a first rolling wheel, a second rolling wheel, a camera, a first encoder, a second encoder, a memory, and a processor.
The first rolling wheel and the second rolling wheel are used to displace the movable platform, and their axes coincide.
The camera is used to collect image frames.
The first encoder is used to detect the speed of the first rolling wheel, and the second encoder is used to detect the speed of the second rolling wheel.
The memory is used to store program code.
The processor calls the program code and, when the program code is executed, performs the following operations:
obtaining a first time period according to the timestamp of the current image frame and the timestamp of the previous image frame;
obtaining the measurement data of the first encoder and the measurement data of the second encoder within the first time period;
obtaining a prediction transformation matrix of the current image frame relative to the previous image frame according to the measurement data of the first encoder and the measurement data of the second encoder within the first time period;
obtaining a predicted pose of the current image frame according to the prediction transformation matrix and the pose information of the previous image frame.
A third aspect of the embodiments of the present invention provides a computer-readable storage medium on which a computer program is stored, the computer program being executed by a processor to implement the method described in the first aspect.
A fourth aspect of the embodiments of the present invention provides a map construction method applicable to a movable platform. The movable platform includes a first rolling wheel, a second rolling wheel, a camera, a first encoder, and a second encoder; the first rolling wheel and the second rolling wheel are used to displace the movable platform, and their axes coincide; the camera is used to collect image frames, the first encoder is used to detect the speed of the first rolling wheel, and the second encoder is used to detect the speed of the second rolling wheel. The method includes:
obtaining a first time period according to the timestamp of the current image frame and the timestamp of the previous image frame;
obtaining the measurement data of the first encoder and the measurement data of the second encoder within the first time period;
obtaining a prediction transformation matrix of the current image frame relative to the previous image frame according to the measurement data of the first encoder and the measurement data of the second encoder within the first time period;
obtaining a predicted pose of the current image frame according to the prediction transformation matrix and the pose information of the previous image frame;
after the predicted pose of the current image frame is obtained, updating a constructed map according to the current image frame on the basis of the constructed map, the constructed map being constructed from key frames determined before the current image frame.
A fifth aspect of the embodiments of the present invention provides a movable platform, including: a first rolling wheel, a second rolling wheel, a camera, a first encoder, a second encoder, a memory, and a processor.
The first rolling wheel and the second rolling wheel are used to displace the movable platform, and their axes coincide.
The camera is used to collect image frames.
The first encoder is used to detect the speed of the first rolling wheel, and the second encoder is used to detect the speed of the second rolling wheel.
The memory is used to store program code.
The processor calls the program code and, when the program code is executed, performs the following operations:
obtaining a first time period according to the timestamp of the current image frame and the timestamp of the previous image frame;
obtaining the measurement data of the first encoder and the measurement data of the second encoder within the first time period;
obtaining a prediction transformation matrix of the current image frame relative to the previous image frame according to the measurement data of the first encoder and the measurement data of the second encoder within the first time period;
obtaining a predicted pose of the current image frame according to the prediction transformation matrix and the pose information of the previous image frame;
after the predicted pose of the current image frame is obtained, updating a constructed map according to the current image frame on the basis of the constructed map, the constructed map being constructed from key frames determined before the current image frame.
A sixth aspect of the embodiments of the present invention provides a computer-readable storage medium on which a computer program is stored, the computer program being executed by a processor to implement the method described in the fourth aspect.
According to the pose prediction method, map construction method, movable platform, and storage medium provided by the embodiments of the present invention, a first time period is acquired according to the timestamp of the current image frame and the timestamp of the previous image frame; the measurement data of the first encoder and the measurement data of the second encoder within the first time period are acquired; a prediction transformation matrix of the current image frame relative to the previous image frame is obtained according to the measurement data of the first encoder and the measurement data of the second encoder within the first time period; and the predicted pose of the current image frame is obtained according to the prediction transformation matrix and the pose information of the previous image frame. By applying encoders to visual SLAM and predicting the pose of the current image frame from the encoder measurement data, the embodiments of the present invention can improve the accuracy of pose acquisition and the stability and robustness of visual SLAM, such that visual SLAM can still run stably during a visual failure.
Brief Description of the Drawings
In order to explain the technical solutions in the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings in the following description show some embodiments of the present invention, and those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of a map construction method provided by an embodiment of the present invention;
Fig. 2 is a flowchart of a pose prediction method provided by an embodiment of the present invention;
Fig. 3 is a flowchart of a map construction method provided by another embodiment of the present invention;
Fig. 4 is a flowchart of a map construction method provided by another embodiment of the present invention;
Fig. 5 is a flowchart of a map construction method provided by another embodiment of the present invention;
Fig. 6 is a flowchart of a map construction method provided by another embodiment of the present invention;
Fig. 7 is a schematic structural diagram of a movable platform provided by another embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are described clearly below with reference to the accompanying drawings. Obviously, the described embodiments are only some rather than all of the embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative effort fall within the protection scope of the present invention.
It should be noted that when a component is referred to as being "fixed to" another component, it can be directly on the other component or an intermediate component may exist. When a component is considered to be "connected to" another component, it can be directly connected to the other component or an intermediate component may exist at the same time.
Unless otherwise defined, all technical and scientific terms used herein have the same meanings as commonly understood by those skilled in the technical field of the present invention. The terms used in the specification of the present invention are only for the purpose of describing specific embodiments and are not intended to limit the present invention. The term "and/or" as used herein includes any and all combinations of one or more of the associated listed items.
Some embodiments of the present invention are described in detail below with reference to the accompanying drawings. In the case of no conflict, the following embodiments and the features in the embodiments can be combined with each other.
An embodiment of the present invention provides a map construction method. The map construction method is applied to a movable platform, which may be a robot, an unmanned vehicle, or the like. As shown in Fig. 7, the movable platform 700 specifically includes a first rolling wheel 701, a second rolling wheel 702, a first encoder 703, a second encoder 704, and a camera 705, and may further include a processor 706 and a memory 707. The first rolling wheel 701 and the second rolling wheel 702 are used to displace the movable platform, and their axes coincide; the camera 705 is used to collect image frames; the first encoder 703 is used to detect the speed of the first rolling wheel 701, and the second encoder 704 is used to detect the speed of the second rolling wheel 702.
The camera 705 may be a monocular camera, a binocular (stereo) camera, an RGB-D depth camera, or the like. When the camera 705 is a monocular camera, it is usually tightly coupled with an IMU (Inertial Measurement Unit) to determine the map scale, where the IMU is a measurement unit integrating two types of sensors: a gyroscope measuring three-dimensional rotation and an accelerometer measuring three-dimensional acceleration. Of course, when the camera 705 is a binocular camera or an RGB-D camera, it can also be used together with an IMU to further improve accuracy, but using an IMU places higher requirements on sensor calibration. An encoder is an angular displacement sensor that measures the speed of a wheel by detecting the number of radians the wheel turns through within a certain period of time; encoders are mainly divided into photoelectric, contact, and electromagnetic types.
As shown in Fig. 1, the map construction method according to the embodiment of the present invention can be divided into a tracking thread S11, a local mapping thread S12, and a loop closure detection thread S13, where the tracking thread S11 includes predicting the pose of an image frame. As shown in Fig. 2, predicting the pose of an image frame specifically includes:
Step S101: obtain a first time period according to the timestamp of the current image frame and the timestamp of the previous image frame.
After any image frame is acquired, the image may first be preprocessed. The preprocessing process specifically includes:
For an RGB-D camera, the collected images include an RGB image and a depth image. The RGB image can first be converted into a grayscale image, feature points are extracted from the grayscale image, and the stereo coordinates of the feature points, i.e. the coordinates of the feature points relative to a virtual right camera, are then calculated from the grayscale image and the depth image, so that the image collected by the RGB-D camera is equivalent to an image collected by a binocular camera.
For a binocular camera, the collected images are also converted into grayscale images, feature points are extracted from the grayscale images, and the depth of the feature points is calculated by matching the feature points in the left-camera and right-camera images, giving the stereo coordinates of the feature points.
It should be noted that the feature points extracted in this embodiment may be ORB (Oriented FAST and Rotated BRIEF) feature points. The ORB feature is rotation invariant, has high stability, is insensitive to illumination and moving objects, and is computationally cheap enough to be processed in real time on the CPU of an ordinary laptop. Of course, the feature points may also be Harris corners, SIFT (Scale-Invariant Feature Transform) feature points, SURF (Speeded Up Robust Features) feature points, and so on.
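To make the preprocessing concrete, the following is a minimal sketch (not taken from the patent itself) of the grayscale conversion and ORB extraction step using the standard OpenCV Python API; the function and parameter values are illustrative, and the stereo/depth handling that yields the virtual-right-camera coordinates is only indicated by a comment:

```python
import cv2

def preprocess_frame(bgr_image):
    """Convert a color frame to grayscale and extract ORB feature points.

    A minimal sketch of the preprocessing step; the stereo/depth handling
    (computing each feature's virtual-right-camera coordinate) is omitted
    here because it depends on the sensor used.
    """
    gray = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2GRAY)
    orb = cv2.ORB_create(nfeatures=1000)   # ORB: rotation-invariant, CPU-friendly
    keypoints, descriptors = orb.detectAndCompute(gray, None)
    return gray, keypoints, descriptors
```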
The use of the feature points is described in detail below.
Step S102: obtain the measurement data of the first encoder and the measurement data of the second encoder within the first time period.
Preferably, after the measurement data of the first encoder or the second encoder is acquired, whether the measurement data is usable can be determined as follows:
buffer the measurement data in a list, and record the collection time corresponding to the measurement data;
obtain from the buffer list the measurement data whose collection time is closest to the collection time of the current image frame, and obtain the time error between the collection time of that measurement data and the collection time of the current image frame;
if the time error exceeds a preset time threshold, the measurement data is unusable.
In this embodiment, the collection frequency of the encoder measurements is usually high, for example 200 Hz, while the collection frequency of images is usually low, for example 30 Hz. In general it is impossible to find an encoder measurement collected at exactly the same moment as the image, so the encoder measurement closest to the image collection time is looked up, i.e. the time error between the collection time of the encoder measurement and the collection time of the current image frame must not exceed the preset time threshold. When the system is unstable, encoder measurements may be missing, i.e. no encoder data can be found whose time error with respect to the current image frame collection time is within the preset time threshold. In that case the encoder measurements around the current image frame collection time are unreliable and can be discarded, and only pure visual SLAM from the prior art (relying on vision alone for pose prediction and map construction) is used. A similar judgment process can also be applied to IMU measurements.
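A minimal sketch of the buffering and lookup logic described above, assuming encoder readings are buffered as (timestamp, measurement) pairs sorted by time; all names are illustrative:

```python
from bisect import bisect_left

def nearest_measurement(buffer, frame_time, max_error):
    """Return the buffered encoder measurement closest in time to the
    image frame, or None if the time error exceeds the threshold.

    `buffer` is a list of (timestamp, measurement) tuples sorted by
    timestamp, e.g. encoder readings at ~200 Hz against ~30 Hz images.
    """
    times = [t for t, _ in buffer]
    i = bisect_left(times, frame_time)
    # Candidates: the readings just before and just after the frame time.
    candidates = [buffer[j] for j in (i - 1, i) if 0 <= j < len(buffer)]
    if not candidates:
        return None
    t, m = min(candidates, key=lambda tm: abs(tm[0] - frame_time))
    if abs(t - frame_time) > max_error:
        return None  # unusable: fall back to pure visual SLAM
    return m
```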
Step S103: obtain a prediction transformation matrix of the current image frame relative to the previous image frame according to the measurement data of the first encoder and the measurement data of the second encoder within the first time period.
Step S104: obtain the predicted pose of the current image frame according to the prediction transformation matrix and the pose information of the previous image frame.
In this embodiment, the camera pose of the current image frame can be predicted from the encoder data, giving the predicted pose of the current image frame. Specifically, the motion of the movable platform is assumed to be two-dimensional (on a plane or a curved surface), and the ground is assumed to provide sufficient friction. A curved surface can be handled with an idea similar to a Taylor expansion: over a short time it is approximated as a plane, i.e. motion on a curved surface can also be handled when the frame rate is high enough.
The prediction transformation matrix T_10 is obtained from the encoder data, where T_10 contains a 3×3 rotation matrix R and a three-dimensional translation vector t, subscript 1 denotes the camera coordinate system of the current image frame, and subscript 0 denotes the camera coordinate system of the previous image frame. Combined with the pose T_0w of the previous image frame (w being the world coordinate system), the predicted pose of the current image frame is obtained as T_1w = T_10 T_0w.
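As a small illustration of this composition, assuming poses are stored as 4×4 homogeneous matrices (the numeric values below are purely illustrative):

```python
import numpy as np

def compose_pose(T_10, T_0w):
    """Predicted pose of the current frame: T_1w = T_10 @ T_0w.

    Each argument is a 4x4 homogeneous transform [[R, t], [0, 1]], with
    T_0w the previous frame's pose (world to camera 0) and T_10 the
    encoder-predicted relative transform (camera 0 to camera 1).
    """
    return T_10 @ T_0w

# Example usage with placeholder values:
T_0w = np.eye(4)                       # previous pose
T_10 = np.eye(4); T_10[0, 3] = 0.1     # illustrative relative transform
T_1w = compose_pose(T_10, T_0w)        # predicted pose of current frame
```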
In this embodiment, the prediction transformation matrix T_10 can be derived from a planar motion model, for example the motion model of a differential-drive two-wheeled vehicle.
The measured velocity v_EE and the measured angular velocity ω_EE of the current image frame of the movable platform are given by formula (1):
$$v_{EE} = \begin{bmatrix} v \\ 0 \\ 0 \end{bmatrix}, \qquad \omega_{EE} = \begin{bmatrix} 0 \\ 0 \\ \omega \end{bmatrix} \tag{1}$$
where the x-axis component v of the measured velocity v_EE and the z-axis component ω of the measured angular velocity ω_EE satisfy the motion model of a differential-drive two-wheeled vehicle, as shown in formula (2) (where E denotes the encoder coordinate system of the current image frame, with its origin at the center of the wheel track and its x-axis pointing forward):
$$v = \frac{v_{1} + v_{2}}{2}, \qquad \omega = \frac{v_{2} - v_{1}}{b} \tag{2}$$
where v_1 and v_2 are the first rolling wheel speed and the second rolling wheel speed measured by the encoders (all the readings between the encoder data closest to the collection times of the current image frame and the previous image frame), and b is the wheel track, i.e. the distance between the two rolling wheels. It can be understood that, in the process of obtaining the prediction transformation matrix T_10, all of the v_1 and v_2 data within the first time period may be used, or only part of the v_1 and v_2 data may be used.
Substitute the measured velocity v_EE and the measured angular velocity ω_EE of the current image frame into the following differential formula (3) of the motion model:
$$\dot{R}_{E_iE} = R_{E_iE}\,\omega_{EE}^{\wedge}, \qquad \dot{p}_{E_iE} = R_{E_iE}\,v_{EE} \tag{3}$$
where E_i denotes the encoder coordinate system of the previous image frame, ω_EE^∧ denotes the antisymmetric (skew-symmetric) matrix of the measured angular velocity of the current image frame, and $\dot{R}_{E_iE}$ and $\dot{p}_{E_iE}$ denote the first derivatives with respect to time of the rotation matrix and position of the current image frame, respectively.
Integrating the differential formula of the motion model yields the predicted rotation matrix $\breve{R}_{E_jE_i}$ and the predicted translation $\breve{p}_{E_jE_i}$ between the two frames (the breve mark denotes an estimate obtained from the encoder information, under the assumption that the information contains no noise). This process is the pre-integration used to obtain the prediction relationship between the two frames.
Further, the predicted rotation matrix $\breve{R}_{E_jE_i}$ and the predicted translation $\breve{p}_{E_jE_i}$ are converted from the encoder coordinate system to the camera coordinate system, so that the prediction transformation matrix between frames i and j is obtained as shown in formula (4):
$$T_{C_jC_i} = T_{CE}\,\breve{T}_{E_jE_i}\,T_{CE}^{-1} \tag{4}$$
where $\breve{T}_{E_jE_i}$ is the homogeneous transform assembled from $\breve{R}_{E_jE_i}$ and $\breve{p}_{E_jE_i}$.
That is, T_CjCi is the aforementioned prediction transformation matrix T_10. The fixed transformation matrix T_CE of the camera coordinate system relative to the encoder coordinate system used in this conversion needs to be calibrated in advance.
Further, the predicted pose T_1w of the current image frame is obtained through the formula T_1w = T_10 T_0w. By applying encoders to visual SLAM and predicting the pose of the current image frame from the encoder measurement data, the embodiment of the present invention can cope with visual failures such as sparse feature points (no texture) or too much repetition, blur or no overlapping region caused by high-speed motion, and an occasional drop of the camera frame rate, so that the map points and trajectories before and after the failure can be connected, the SLAM system is not interrupted, and the stability and robustness of visual SLAM are improved.
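The following sketch illustrates, under the planar assumptions above, one way the wheel-speed samples could be integrated into a relative transform and converted into the camera frame; it is a simple Euler-integration approximation of the pre-integration described in the text, and all names (and the direction convention of the returned transform) are illustrative:

```python
import numpy as np

def preintegrate_encoder(samples, b):
    """Integrate wheel-speed samples into a planar relative motion.

    `samples` is a list of (dt, v1, v2): sample interval and the two
    wheel speeds from the encoders; `b` is the wheel track. Returns the
    yaw angle and 2D translation accumulated between the two frames.
    """
    theta, x, y = 0.0, 0.0, 0.0
    for dt, v1, v2 in samples:
        v = 0.5 * (v1 + v2)            # formula (2): forward speed
        w = (v2 - v1) / b              # formula (2): yaw rate
        x += v * np.cos(theta) * dt    # formula (3), integrated in the plane
        y += v * np.sin(theta) * dt
        theta += w * dt
    return theta, np.array([x, y])

def encoder_to_camera(theta, p_xy, T_CE):
    """Lift the planar encoder-frame motion to a 4x4 transform and express
    it in the camera frame via the pre-calibrated extrinsic T_CE, as in
    formula (4)."""
    c, s = np.cos(theta), np.sin(theta)
    T_E = np.eye(4)
    T_E[:2, :2] = [[c, -s], [s, c]]
    T_E[:2, 3] = p_xy
    return T_CE @ T_E @ np.linalg.inv(T_CE)
```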
The tracking thread S11 further includes optimizing the predicted pose, specifically:
The predicted pose of the current image frame is optimized through a preset motion-only bundle adjustment (Motion-only BA) to obtain the optimized pose of the current image frame, where the motion-only bundle adjustment includes a reprojection error and an encoder error.
In the embodiment of the present invention, the predicted pose of the current image frame is optimized through motion-only bundle adjustment. BA (bundle adjustment) is a nonlinear optimization method that adjusts the camera parameters (mainly the extrinsic parameters, i.e. the poses of many key frames; possibly also the intrinsic parameters: focal length and principal point pixel coordinates) and the structure (the positions of the map points seen by the cameras) so as to minimize the reprojection error, i.e. the sum of the squared norms (with respect to the information matrix) of the coordinate differences between the points of one key frame projected geometrically onto another key frame and their matching points (an edge connecting two key frames corresponds to a set of matching point pairs). Motion-only BA adjusts only the camera parameters (mainly the poses of many key frames), but its objective function is still a nonlinear optimization based on the reprojection error.
The reprojection error is computed by reprojecting the map points of the local map onto the image using the predicted pose of the current image frame and measuring the error between the reprojected points and the feature points of the current image frame.
The optimization formula of the objective function of the motion-only bundle adjustment can be as shown in formula (5):
$$\hat{x} = \arg\min_{x}\Big( \|e_{0}\|^{2}_{\Sigma_{0}^{-1}} + \sum_{l}\|e_{z_{il}}\|^{2}_{\Sigma_{z_{il}}^{-1}} + \|e_{E_{ij}}\|^{2}_{\Sigma_{E_{ij}}^{-1}} \Big) \tag{5}$$
where $\hat{x}$ is the maximum-likelihood estimate of the pose, and the expression inside min on the right is the objective function. The first term is the prior error, which is generally taken as 0; however, when tightly coupled with an IMU and optimizing the poses of the previous and current frames at the same time, the Hessian matrix from the Motion-only BA optimization of the previous image frame can be used as the information matrix Σ_0^-1 to keep the pose of the previous image frame as unchanged as possible. The second term is the reprojection error term corresponding to the visual observations z_il. The last term is the error term formed by the encoder information E_ij (i.e. the squared norm of the error vector with respect to the covariance matrix).
In this embodiment, the encoder error term can be obtained by the following formula (6):
$$e_{E_{ij}} = \begin{bmatrix} e_{R} \\ e_{p} \end{bmatrix} = \begin{bmatrix} \mathrm{Log}\big(\breve{R}_{E_jE_i}^{\,T}\, R_{E_jE_i}\big) \\ p_{E_jE_i} - \breve{p}_{E_jE_i} \end{bmatrix} \tag{6}$$
where e_R represents the error of the relative rotation matrix (expressed through the logarithmic map, i.e. its Lie algebra, as a three-dimensional vector) and e_p represents the error of the relative translation (a three-dimensional vector). The encoder error $e_{E_{ij}}$ is thus a six-dimensional vector, and the corresponding covariance matrix $\Sigma_{E_{ij}}$ is a 6×6 square matrix; what is used here is its inverse, i.e. the information matrix. e_R, e_p, and the covariance matrix $\Sigma_{E_{ij}}$ are all computed from the rotation matrix $\breve{R}_{E_jE_i}$ and relative translation $\breve{p}_{E_jE_i}$ obtained by the pre-integration described above.
When the encoder error term in the objective function is computed, the pre-integration result and the following formula are used, i.e. the relationship between the relative transformation in the encoder coordinate system and the relative transformation in the camera coordinate system:
$$T_{E_jE_i} = T_{CE}^{-1}\,T_{C_jC_i}\,T_{CE} = T_{CE}^{-1}\,T_{C_jW}\,T_{C_iW}^{-1}\,T_{CE}$$
that is:
$$R_{E_jE_i} = R_{CE}^{T}\,R_{C_jW}\,R_{C_iW}^{T}\,R_{CE}$$
$$p_{E_jE_i} = R_{CE}^{T}\Big(R_{C_jW}R_{C_iW}^{T}\big(p_{CE}-p_{C_iW}\big) + p_{C_jW} - p_{CE}\Big)$$
where R_CiW and p_CiW are the pose of the previous image frame in the world coordinate system, and R_CE and p_CE are the elements of the fixed transformation matrix T_CE involved in formula (4), which must be calibrated in advance.
The aforementioned Motion-only BA uses nonlinear optimization to compute the optimized pose R_CjW and p_CjW of the current image frame that minimizes the objective function.
In addition, the Jacobian matrices used in the nonlinear optimization process in this embodiment can be numerical; of course, analytical solutions derived from the Lie algebra formulas can also be used, and the latter often make the optimization more stable.
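As an illustration of such an optimization, the following sketch builds a reprojection residual plus a simplified encoder residual (the pose difference in a 6-vector parameterization, rather than the exact e_R/e_p terms of formula (6)) and solves it with SciPy's least_squares; all names, shapes, and weights are illustrative assumptions:

```python
import numpy as np
from scipy.optimize import least_squares
from scipy.spatial.transform import Rotation

def motion_only_ba(pose0, points_w, obs_px, K, pred_pose, w_enc):
    """Motion-only BA sketch: optimize one camera pose against
    reprojection residuals plus an encoder term pulling the pose
    toward the encoder-predicted pose.

    pose0, pred_pose: 6-vectors (rotation vector + translation);
    points_w: Nx3 local-map points; obs_px: Nx2 observed pixels;
    K: 3x3 intrinsics; w_enc: scalar weight standing in for the
    encoder information matrix.
    """
    def residuals(x):
        R = Rotation.from_rotvec(x[:3]).as_matrix()
        t = x[3:]
        pc = (R @ points_w.T).T + t          # world -> camera
        uv = (K @ pc.T).T
        uv = uv[:, :2] / uv[:, 2:3]          # pinhole projection
        r_proj = (uv - obs_px).ravel()       # reprojection error
        r_enc = w_enc * (x - pred_pose)      # simplified encoder error
        return np.concatenate([r_proj, r_enc])

    sol = least_squares(residuals, pose0, loss="huber")
    return sol.x
```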
In another preferred embodiment, when an IMU (inertial measurement unit) is also used for pose optimization, the motion-only bundle adjustment further includes an IMU error term.
The optimization formula of the objective function of the motion-only bundle adjustment can then be as shown in formula (8):
$$\hat{x} = \arg\min_{x}\Big( \|e_{0}\|^{2}_{\Sigma_{0}^{-1}} + \sum_{l}\|e_{z_{il}}\|^{2}_{\Sigma_{z_{il}}^{-1}} + \|e_{E_{ij}}\|^{2}_{\Sigma_{E_{ij}}^{-1}} + \|e_{I_{ij}}\|^{2}_{\Sigma_{I_{ij}}^{-1}} \Big) \tag{8}$$
where $e_{I_{ij}}$ is the IMU error term.
In the prior art, the pose of the current image frame is usually acquired by predictive brute-force matching: the prediction transformation matrix of a constant-velocity motion model (containing a rotation matrix and a translation vector) is used to project the feature points of the previous image frame onto the current image frame, a small rectangular region is selected for fast brute-force matching, the pose and velocity of the previous image frame are used to estimate the initial pose of the current image frame, and the initial pose is then optimized to obtain the optimized pose. The present invention instead uses the wheel speeds measured by the encoders to estimate the initial pose of the current image frame and then optimizes this initial pose, which is more robust.
Further, on the basis of the above embodiments, as shown in Fig. 3, after the motion-only bundle adjustment of step S201 is performed, the method may further include:
Step S202: determine whether the vision has failed;
if the vision has failed, perform step S203; if the vision has not failed, perform step S204;
Step S203: use the predicted pose of the current image frame as the pose of the current image frame;
Step S204: use the optimized pose of the current image frame as the pose of the current image frame.
In this embodiment, when the motion-only bundle adjustment fails, i.e. when the optimized pose cannot be obtained from the objective function of the motion-only bundle adjustment described above, a visual failure is determined. A visual failure means that the Motion-only BA process has failed, i.e. the final optimized pose is inaccurate and deviates considerably from the true pose, so it is not used; instead, the predicted pose is taken as the pose of the current image frame. If the vision has not failed, i.e. the Motion-only BA process succeeds, the optimized pose can be taken as the pose of the current image frame.
In this embodiment, a visual failure can be caused by the following: sparse feature points in the image (no texture) or too much repetition with the previous image frame, blur or no overlapping region caused by high-speed motion, an occasional drop of the camera frame rate, and so on.
In a preferred embodiment, when no optimization result can be obtained through the motion-only bundle adjustment model, it is determined that a visual failure has occurred for the current image frame relative to the previous image frame.
In another embodiment, when judging whether vision has failed, the optimized pose obtained by Motion-only BA can be used to re-match the feature points of the current image frame with the frames sharing feature points with it (or to match the map points in the map against the feature points of the current image frame). During matching, the shared feature points of those frames are projected into the current image frame; if the distance between a projected feature point and the corresponding feature point of the current image frame is smaller than a threshold, the point is determined to be an inlier, otherwise it is an outlier. Whether the number of inliers in the current image frame exceeds a preset number threshold is then judged: if the number is greater than or equal to the preset number threshold, the Motion-only BA has succeeded, i.e. the vision has not failed; if it is smaller than the preset number threshold, the Motion-only BA has failed, i.e. the vision has failed.
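A minimal sketch of the inlier-counting check described above (the threshold values are illustrative):

```python
import numpy as np

def count_inliers(proj_px, obs_px, px_threshold=5.0):
    """Count matches whose reprojected position lies within a pixel
    threshold of the observed feature point (Nx2 arrays)."""
    d = np.linalg.norm(proj_px - obs_px, axis=1)
    return int(np.sum(d < px_threshold))

def vision_failed(proj_px, obs_px, min_inliers=15):
    """Declare a visual failure when too few inliers survive."""
    return count_inliers(proj_px, obs_px) < min_inliers
```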
In this embodiment, in the tracking thread S11 of visual SLAM, once the pose information of the current image frame relative to the previous image frame is obtained, the motion trajectory of the movable platform can be tracked according to the relative poses.
In this embodiment, when the camera collects the image of the current image frame, the camera pose of the current image frame can be predicted from the camera pose of the previous image frame and the first and second rolling wheel speeds detected by the encoders, giving the predicted pose of the current image frame; feature points are then matched according to the predicted pose of the current image frame, the image of the current image frame, and the key frames sharing feature points with the current image frame, so as to optimize the predicted pose and obtain the optimized pose of the current image frame.
When vision fails briefly because of special problems such as sparse feature points in the image (no texture) or too much repetition with the previous image frame, blur or no overlapping region caused by high-speed motion, or an occasional drop of the camera frame rate, i.e. when the pose of the current image frame cannot be optimized by matching feature points between the current image frame and the key frames sharing feature points with it, the predicted pose of the current image frame can be used as its pose information, thereby guaranteeing that the subsequent pipeline continues and avoiding the situation where work cannot continue because the map points and trajectories before and after the visual failure cannot be connected. Since the predicted pose of the current image frame is obtained from the encoders, and encoders do not suffer from zero drift, the accumulated error over a short time is guaranteed to stay small, which handles the transition from a brief visual failure to visual recovery and thus improves the stability and robustness of visual SLAM.
In this embodiment, after the pose information of the current image frame is obtained, the local mapping thread S12 can be run on the image frames and the corresponding pose information. Specifically, whether the current image frame is a key frame can be determined first; if it is a key frame, new map points are added at the back end through matching of the feature points of the key frames, realizing the construction of the local map and the optimization of the global map.
The visual SLAM method provided by this embodiment acquires camera images and encoder data, acquires the pose information of the camera according to the image frames and the encoder measurements, and constructs the map according to the image frames and the pose information. Applying encoders to visual SLAM improves the stability and robustness of visual SLAM and avoids the situation where, during a brief visual failure, the map points and trajectories before and after the failure cannot be connected and work cannot continue.
In the local mapping thread S12, as shown in Fig. 4, after the pose of the current image frame is obtained, the method may further include:
Step S301: determine whether the current image frame is a key frame.
If the current image frame is a key frame, it is added to the local mapping thread S12 for map construction. Specifically, this may include:
Step S302: when the current image frame is determined to be a key frame, update the constructed map according to the current image frame on the basis of the constructed map;
wherein the constructed map is constructed from the key frames determined before the current image frame.
In this embodiment, in order to reduce the computational resource consumption of the map construction process, only key frames that can provide enough map points are added to the mapping thread; for an RGB-D camera or a binocular camera, nearby map points can also be directly inserted into the map as new map points.
In the prior art, it is usually judged whether the distance between the current image frame and the previous key frame is within a preset distance range, i.e. the current image frame is taken as a new key frame only when it is neither too far from nor too close to the previous key frame; alternatively, the current image frame is also taken as a new key frame when its number of feature point matches exceeds a preset matching number threshold and it provides enough new map points, where the number of feature point matches refers to the number of matches between the feature points of the current image frame and the map points in the map, or between the feature points of the current image frame and those of the key frames sharing feature points with it. For a monocular camera tightly coupled with an IMU, to prevent zero drift the time interval between the current image frame and the previous key frame must also not be too long, i.e. when this time interval exceeds a preset time interval threshold, the current image frame is always taken as a new key frame.
In this embodiment, when vision has not failed, the above prior-art key frame decision strategy can be used. When vision has failed, since the predicted pose of the current image frame can be obtained from the encoder data, two frames are allowed to be connected through the encoders (i.e. pure encoder edges are allowed): it is no longer required that the distance between the current image frame and the previous key frame exceed the minimum of the preset distance range, nor that the number of feature point matches of the current image frame exceed the preset matching number threshold, so these two restrictions can be removed. However, for an RGB-D camera or a binocular camera, a minimum number of map points created by the current image frame needs to be added, i.e. the number of map points created by the current image frame must exceed a preset map point number threshold, to prevent it from contributing so few map points that adding it is meaningless. For a monocular camera tightly coupled with an IMU, a minimum time interval restriction is added so that some map points are tentatively constructed by triangulation in the mapping thread (local mapping) to recover visual tracking, i.e. the time interval between the current image frame and the previous key frame needs to exceed a preset minimum time interval; and since redundant or unimportant key frames and map points are removed in the mapping thread (local mapping), the time criterion adopted for the monocular camera does not cause redundancy problems.
Further, after vision recovers, the key frame decision strategy used during the visual failure can be reverted to the prior-art key frame decision strategy described above.
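A sketch of this two-regime key frame policy; the cfg object and its thresholds are illustrative assumptions:

```python
def should_insert_keyframe(vision_ok, dist_to_last_kf, n_matches,
                           n_new_points, dt_since_last_kf, cfg):
    """Key frame policy sketch mirroring the two regimes described above."""
    if vision_ok:
        # Normal regime: neither too near nor too far from the last key
        # frame, or enough matches contributing enough new map points.
        dist_ok = cfg.d_min <= dist_to_last_kf <= cfg.d_max
        match_ok = (n_matches > cfg.min_matches
                    and n_new_points > cfg.min_new_points)
        return dist_ok or match_ok
    # Visual-failure regime: pure encoder edges are allowed, so the
    # distance and match-count limits are dropped; keep only the
    # minimum-new-map-points rule (RGB-D/stereo) and a minimum time gap
    # (monocular + IMU) to probe for recovery by triangulation.
    return (n_new_points > cfg.min_new_points
            or dt_since_last_kf > cfg.min_dt)
```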
Further, as shown in Fig. 5, updating the constructed map according to the current image frame includes:
Step S401: remove redundant key frames and redundant map points according to a preset strategy, and perform a triangulation operation according to the current image frame to generate new map points;
Step S402: after the new map points are generated, construct the covisibility graph of the current image frame and perform local bundle adjustment, so as to optimize the other key frames covisible with the current image frame, the map points visible to the current image frame, and the map points visible to those other key frames; wherein the local bundle adjustment includes an encoder error term.
In this embodiment, after a new key frame is obtained, redundant map points can first be removed, where redundant map points may include map points not near the current key frame that have been observed too few times, and map points with a small ratio of consecutive observations over the last few frames; feature points in the current image frame that can be fused with existing map points may replace those existing map points through fusion.
After the local BA optimizes the key frame poses and map points, redundant key frames can also be removed, where a redundant key frame is one for which at least one other key frame sees more than a threshold proportion (for example 90%) of the same map points; in this embodiment, the key frames that can see the same map points as the current key frame (the frame nodes in the covisibility graph) are checked for redundancy. Removing redundant map points and redundant key frames in this way avoids having too many map points in the map, reduces the amount of computation, and improves mapping efficiency.
In addition, as shown in Fig. 6, constructing the covisibility graph of the current image frame and performing local bundle adjustment includes:
Step S501: when any key frame is connected to the previous key frame by a pure encoder edge and is also connected to the next key frame by a pure encoder edge, remove that key frame, and obtain a second time period according to the timestamp of the previous key frame and the timestamp of the next key frame; wherein a pure encoder edge indicates that the visual failure occurred between adjacent key frames;
Step S502: obtain all the measurement data of the first encoder and all the measurement data of the second encoder within the second time period;
Step S503: obtain the prediction transformation matrix of the next key frame relative to the previous key frame according to all the measurement data of the first encoder and all the measurement data of the second encoder within the second time period.
Specifically, after a new key frame is obtained, key frames connected by consecutive pure encoder edges can also be removed. A chain of consecutive pure-encoder-edge connections looks like "key frame 1 - encoder edge 1 - key frame 2 - encoder edge 2 - key frame 3": the relative pose of key frame 2 with respect to key frame 1 is a predicted pose obtained from the encoder data (encoder edge 1), and the relative pose of key frame 3 with respect to key frame 2 is likewise a predicted pose obtained from the encoder data (encoder edge 2). Removing consecutive pure encoder edges reduces the amount of computation in subsequent optimization and prevents large errors from triangulation based only on encoder data (owing to encoder precision and friction with the ground, the encoders are less accurate than vision). For the three key frames in the example above, key frame 2, connected by consecutive pure encoder edges, can be deleted, yielding "key frame 1 - encoder edge - key frame 3", where the encoder edge needs to be recomputed, i.e. the predicted pose of key frame 3 relative to key frame 1 is calculated through the pre-integration process of the above embodiments.
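A sketch of collapsing consecutive pure encoder edges, reusing the preintegrate_encoder sketch shown earlier; the edge and key frame data structures (the pure_encoder flag, samples list, and wheel track b) are illustrative assumptions:

```python
def collapse_pure_encoder_edges(keyframes, edges):
    """Remove any key frame whose links to both neighbors are pure
    encoder edges, re-deriving the merged edge by pre-integrating over
    the combined time span (edges[k] connects keyframes[k] and
    keyframes[k+1])."""
    i = 1
    while i + 1 < len(keyframes):
        if edges[i - 1].pure_encoder and edges[i].pure_encoder:
            merged = edges[i - 1].samples + edges[i].samples
            edges[i - 1].samples = merged          # now spans kf[i-1] -> kf[i+1]
            edges[i - 1].relative = preintegrate_encoder(merged, b=edges[i - 1].b)
            del keyframes[i]
            del edges[i]
        else:
            i += 1
    return keyframes, edges
```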
In this embodiment, new map points can be created by triangulation, where triangulation obtains, from two points (the relative poses of the key frames) and the rays passing through them (the pixel coordinates, i.e. directions, of a matching point pair), the position of the ray intersection relative to those two points (the depth or position of the matched pair, i.e. of the 3D point).
Specifically, for a binocular camera, consider a fixed 3D point X in space observed by both the left camera and the right camera, with imaging position x in the left camera and x' in the right camera. According to the pinhole camera model and the camera projection equation, two rays can be computed from x and x' respectively, and both rays theoretically pass through the 3D point X; the intersection of the two rays is therefore the position of X in space, so the position of X can be determined. For a monocular camera, two adjacent frames (with a known relative pose between them) that simultaneously observe a fixed 3D point X can take the place of the left-camera and right-camera images of a binocular camera; the triangulation process is the same as for a binocular camera and is not repeated here. For an RGB-D camera, since the preprocessing has already computed the coordinates of the feature points relative to the virtual right camera, making the image collected by the RGB-D camera equivalent to one collected by a binocular camera, the same triangulation process as for a binocular camera can be used, which is likewise not repeated here.
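A minimal two-view triangulation sketch using OpenCV's triangulatePoints, which takes the 3x4 projection matrices of the two views and the matched pixel coordinates:

```python
import numpy as np
import cv2

def triangulate(P1, P2, pts1, pts2):
    """Recover 3D points from two views.

    P1, P2: 3x4 projection matrices (K [R|t]) of the two views;
    pts1, pts2: 2xN arrays of matched pixel coordinates.
    Returns an Nx3 array of Euclidean 3D points, the (approximate)
    intersections of the back-projected rays.
    """
    X_h = cv2.triangulatePoints(P1, P2, pts1, pts2)  # 4xN homogeneous
    return (X_h[:3] / X_h[3]).T                      # Nx3 Euclidean
```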
In this embodiment, key frame poses and map points can be optimized through local BA, where the local BA process is similar to the motion-only BA optimization of the foregoing embodiments, and the local BA objective function likewise includes an encoder error term. In prior-art local BA, the key frames participating in the optimization are usually those in the covisibility graph of the current key frame (a graph whose vertices are frames and map points and whose edges are the reprojection errors between each frame and the map points it can see; the participating key frames are those that observe map points seen by the current key frame, and the map points in the graph are all points visible to those key frames). In this embodiment, the key frames participating in local BA are instead the N consecutive key frames preceding the current image frame (N ≥ 1); the local BA objective function is constructed over these N consecutive key frames to optimize the pose of the current key frame and the map points, which reduces the amount of computation.
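Written out, a local BA objective of the kind described here can be sketched as follows; the notation (poses $T_k$, map points $p_m$, projection $\pi$, robust kernel $\rho$, pre-integrated encoder residual $e^{\mathrm{enc}}_{k-1,k}$) is assumed for illustration and is not the patent's exact formula:

$$
\min_{\{T_k\},\{p_m\}}\;
\sum_{k}\sum_{m\in\mathcal{V}(k)}
\rho\!\left(\bigl\|x_{km}-\pi(T_k,\,p_m)\bigr\\|^{2}_{\Sigma_{km}}\right)
\;+\;
\sum_{k}\bigl\|e^{\mathrm{enc}}_{k-1,k}\bigr\|^{2}_{\Sigma^{\mathrm{enc}}_{k-1,k}},
$$

where $k$ ranges over the current key frame and its $N$ preceding key frames, $\mathcal{V}(k)$ is the set of map points visible to key frame $k$, $x_{km}$ is the observed pixel of point $p_m$ in frame $k$, and the second sum is the encoder error term built from the pre-integrated relative transformations between consecutive key frames.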
Further, after the local map is obtained, the current key frame is sent to the loop detection thread S13 for loop-closure detection, to determine whether the movable platform is at a position it has previously visited.

In prior-art loop-closure detection, the map database is queried, the similarity transformation (SE3 or Sim3) between the current key frame and the loop-closure key frame is computed using the RANSAC (Random Sample Consensus) algorithm, and the map points seen by the current key frame and by the key frames near the loop-closure key frame are merged through loop fusion. The RANSAC algorithm randomly samples a few pairs of matching points to compute a transformation, uses probability and statistics to find the most correct transformation, and distinguishes which matching point pairs are correct (a correct matching pair is called an inlier).
In this embodiment, a bag-of-words (BoW) vector can be obtained for each key frame, and the distance (for example, the Euclidean distance) between the BoW vector of any key frame and that of the current key frame is computed; whether a loop closure exists is judged from the distance between the two BoW vectors. The BoW vector is obtained from the feature points of the key frame and a preset feature-point vocabulary, where the vocabulary is built by feature clustering and contains feature-point entries arranged in a predetermined order. The feature points of a key frame are matched against each entry in the vocabulary: if an entry appears in the key frame, the corresponding element is set to 1, otherwise to 0, yielding the BoW vector of that key frame. The BoW vector of every key frame is stored in the database, and whether the current key frame forms a loop closure can be judged by computing the distance between its BoW vector and the BoW vectors of the key frames in the database; the matched key frame is the loop-closure key frame of the current image frame.
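The binary BoW vector and the distance test described above can be sketched as follows; the vocabulary layout, the matcher `match_fn`, and the threshold are assumptions used only for illustration:

```python
import numpy as np

def bow_vector(frame_descriptors, vocabulary, match_fn):
    """Binary BoW vector: element i is 1 if vocabulary word i is matched
    by some feature descriptor of the key frame, else 0.
    vocabulary: word descriptors in their predetermined order.
    match_fn(word, descriptors) -> bool is an assumed matcher."""
    return np.array([1.0 if match_fn(word, frame_descriptors) else 0.0
                     for word in vocabulary])

def detect_loop(current_vec, database_vecs, distance_threshold):
    """Return the index of the closest stored key frame if the Euclidean
    distance between its BoW vector and the current one is below the
    threshold; otherwise return None (no loop closure)."""
    dists = [np.linalg.norm(current_vec - v) for v in database_vecs]
    best = int(np.argmin(dists))
    return best if dists[best] < distance_threshold else None
```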
Further, in this embodiment, the thresholds for loop detection can be lowered after a visual failure, where these thresholds include the number of repeated detections and the thresholds on the number of matching points and inliers used when computing the similarity transformation. By relaxing the loop-detection conditions, a loop closure can be detected quickly after a visual failure, so as to reduce the error accumulated by pure encoder edges during the failure. After vision recovers, the detection thresholds can be restored to their normal values, avoiding the computational cost of excessively frequent loop closures.

Further, when a loop closure is detected, pose graph optimization can be performed.

Here, the pose graph is a graph whose vertices contain no map points and consist only of all key frames, and whose edges contain only the relative transformations between frames on the spanning tree and the similarity transformations corresponding to all loop-closure edges. The spanning tree is a loop-free graph connecting the key frames of the entire map; in general, two temporally consecutive key frames become parent and child nodes, but when a parent node is deleted, the parent of each child node is reassigned according to the number of commonly observed map points (this corresponds to the key frame removal operation, which also updates the pre-integration of the child nodes). The root node is the initial key frame created when the visual SLAM system started mapping.
In this embodiment, the pose graph optimization optimizes the pose of each key frame. The objective function is the sum of squares of the errors $e_{ij}$ between the edge relative transformations $\Delta T_{ij}$ and the vertex poses, weighted by the covariance matrices $\Sigma_{ij}$ (originally 3×3 identity matrices $I$):

$$
\min_{\{T_k\}} \sum_{(i,j)} e_{ij}^{\top}\,\Sigma_{ij}^{-1}\,e_{ij},
\qquad
e_{ij} = \log\!\left(\Delta T_{ij}\,T_j^{-1}\,T_i\right)^{\vee},
$$

where $T_i$ and $T_j$ are the poses of the key frames joined by the edge.
In this embodiment, when a visual failure occurs between adjacent key frames, a corrected covariance matrix is generated according to the predicted transformation matrix to modify the covariance matrix of the relative transformation between those adjacent key frames. Specifically, the covariance matrix $\Sigma_{\mathrm{pre}}$ obtained from pre-integration can be used to modify the covariance matrix $\Sigma_{ij}$ of the relative transformation $\Delta T_{ij}$ corresponding to a pure encoder edge on the spanning tree, so that its diagonal elements are generally greater than 1. This expresses that a pure encoder edge is less accurate than an edge in which the encoder is tightly coupled with vision, allowing the nonlinear optimization to down-weight the error introduced by pure encoder edges more strongly. Through the above pose graph optimization, the error introduced by pure encoder edges can be distributed over all edges. For example, on the spanning tree "key frame 1 - key frame 2 - key frame 3", when a loop closure occurs between key frame 3 and key frame 1, a loop-closure edge (the similarity transformation SE3 or Sim3 between the current key frame and the loop-closure key frame) is formed between key frame 1 and key frame 3, and the accumulated error between key frame 1 and key frame 3 can be distributed over all the edges of "key frame 1 - key frame 2 - key frame 3", so that the error on each edge is relatively small and within an acceptable range.
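A sketch of the down-weighting described above, assuming planar 3×3 covariances (x, y, heading); the clamping value and function name are illustrative assumptions, not values from the patent:

```python
import numpy as np

def inflate_pure_encoder_covariance(sigma_preint, min_diag=1.0):
    """Replace the default 3x3 identity covariance of a pure encoder
    edge with the pre-integrated covariance, clamping the diagonal so
    each element is at least min_diag; diagonal elements above those of
    the identity make the optimizer trust this edge less than edges
    where the encoder is tightly coupled with vision."""
    sigma = np.array(sigma_preint, dtype=float).copy()
    d = np.arange(3)
    sigma[d, d] = np.maximum(sigma[d, d], min_diag)
    return sigma
```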
Further, after the pose graph optimization is completed, global BA optimization can also be performed. Global BA is similar to the motion-only BA and local BA optimization processes of the foregoing embodiments, and its objective function likewise includes an encoder error term; it differs from local BA in that the key frames participating in global BA are all the key frames and map points in the entire map library. After global BA is completed, the map can be updated: all map points are updated, and all key frames in the map library are updated up to the latest key frame.

On the basis of any of the foregoing embodiments, if the visual SLAM is tightly coupled with an IMU, the IMU needs to be initialized to obtain its biases, including the gyroscope bias and the accelerometer bias; the gravity vector can also be obtained. In addition, for a monocular camera tightly coupled with an IMU, the map scale also needs to be determined through IMU initialization, whereas stereo and RGBD cameras usually do not need IMU initialization to determine the map scale; of course, when IMU initialization is used to determine the map scale for a stereo or RGBD camera, the initialization result is close to 1.

In this embodiment, when the visual SLAM has just started running, a predetermined number of the earliest key frames (for example, 20) can be taken, and the IMU biases, the gravity vector, and the map scale are obtained from these key frames. The map scale is then propagated to all key frames, including the predetermined number of key frames and any new key frames obtained during initialization, after which global BA optimization is performed to update the map scale; thereafter the visual SLAM can run with the updated map scale.
On the basis of any of the foregoing embodiments, this embodiment further provides a lightweight positioning method.

Specifically, in this embodiment, after map construction is completed, a minimal saving scheme can be adopted: only the key frames and map point information necessary for the map are saved, and binary file storage can be used. The main stored content involves the sensor type; for each key frame, its index, pose, bag-of-words vector, feature point sequence and inter-frame sensor information sequence, the indices of the map points it observes, and the indices of the loop-closure key frames connected to it; for each map point, its index, position, reference key frame index, and the indices of the key frames from which it can be observed; and the spanning tree structure (i.e., the parent node indices).

When reading, the inverse of the saving operation is performed: key frames and map points are reconstructed, and related information is updated (operations such as rebuilding the spanning tree structure and the covisibility information, and adding key frames to the map database). When reconstructing key frames and map points, their indices can be re-created and need not be the original indices.
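A compact binary layout along these lines might look as follows; the field names, attribute names, and the use of `pickle` are assumptions used only to illustrate the "save the minimum, rebuild the rest" idea:

```python
import pickle

def save_minimal_map(path, sensor_type, keyframes, map_points):
    """Store only what is needed to rebuild the map; everything else
    (covisibility graph, database entries) is reconstructed on load."""
    payload = {
        "sensor_type": sensor_type,
        "keyframes": [{
            "id": kf.id, "pose": kf.pose, "bow": kf.bow_vector,
            "features": kf.features, "encoder_meas": kf.encoder_meas,
            "observed_points": kf.observed_point_ids,
            "loop_edges": kf.loop_keyframe_ids,
            "parent": kf.parent_id,              # spanning tree structure
        } for kf in keyframes],
        "map_points": [{
            "id": mp.id, "position": mp.position,
            "ref_keyframe": mp.ref_keyframe_id,
            "observers": mp.observer_keyframe_ids,
        } for mp in map_points],
    }
    with open(path, "wb") as f:
        pickle.dump(payload, f, protocol=pickle.HIGHEST_PROTOCOL)
```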
In this embodiment, the map information of a visual SLAM system that has been running for a long time can be stored in a very small storage space, which increases the maximum working duration of the visual SLAM system, although reloading requires more time to reconstruct the map and trajectory. When using the reconstructed map, the visual SLAM can start only the tracking thread S11 for positioning, without starting the local mapping thread S12 and the loop-closure thread, thereby achieving lightweight positioning. It should be noted that, in the case of a monocular camera tightly coupled with an IMU, IMU initialization (obtaining the initial biases and the gravitational acceleration) is still required.
An embodiment of the present invention provides a movable platform. FIG. 7 is a structural diagram of the movable platform provided by an embodiment of the present invention. As shown in FIG. 7, the movable platform 700 includes a first rolling wheel 701, a second rolling wheel 702, a first encoder 703, a second encoder 704, a camera 705, a memory 707, and a processor 706.

The first rolling wheel 701 and the second rolling wheel 702 are used to displace the movable platform, and the axes of the first rolling wheel 701 and the second rolling wheel 702 coincide.

The camera 705 is used to collect image frames.

The first encoder 703 is used to detect the rate of the first rolling wheel 701, and the second encoder 704 is used to detect the rate of the second rolling wheel 702.

The memory 707 is used to store program code.

The processor 706 calls the program code and, when the program code is executed, is configured to perform the following operations:
obtaining a first time period according to the timestamp of the current image frame and the timestamp of the previous image frame;

acquiring all measurement data of the first encoder 703 and all measurement data of the second encoder 704 within the first time period;

obtaining a predicted transformation matrix of the current image frame relative to the previous image frame according to all measurement data measured by the first encoder 703 and all measurement data of the second encoder 704 within the first time period;

obtaining the predicted pose of the current image frame according to the predicted transformation matrix and the pose information of the previous image frame.
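As a hedged illustration of these four operations for a platform with two coaxial wheels, the planar relative pose can be propagated by Euler-integrating standard differential-drive kinematics over the encoder samples in the first time period; the wheel radius, track width, and sample layout below are assumptions, not parameters from the patent:

```python
import numpy as np

def predict_relative_pose(encoder_samples, wheel_radius, track_width):
    """encoder_samples: time-ordered list of (t, w_left, w_right) wheel
    angular rates inside the first time period [t_prev_frame, t_cur_frame].
    Returns the 2D transform (R, t) of the current frame relative to the
    previous frame, by integrating differential-drive kinematics."""
    theta, x, y = 0.0, 0.0, 0.0
    for (t0, wl, wr), (t1, _, _) in zip(encoder_samples, encoder_samples[1:]):
        dt = t1 - t0
        v = wheel_radius * (wl + wr) / 2.0           # linear velocity
        w = wheel_radius * (wr - wl) / track_width   # angular velocity
        x += v * np.cos(theta) * dt                  # integrate position derivative
        y += v * np.sin(theta) * dt
        theta += w * dt                              # integrate rotation derivative
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    return R, np.array([x, y])

# The predicted pose of the current frame is then the previous pose
# composed with this relative transform:
#   R_cur = R_prev @ R;  t_cur = R_prev @ t + t_prev
```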
On the basis of the foregoing embodiment, when the processor 706 obtains the predicted transformation matrix of the current image frame relative to the previous image frame according to the measurement data of the first encoder and the measurement data of the second encoder within the first time period, the processor is configured to:

obtain, according to the measurement data of the first encoder and the measurement data of the second encoder within the first time period, a first derivative of the rotation matrix of the current image frame with respect to time and a second derivative of the position of the current image frame with respect to time;

integrate, based on the first time period, the first derivative and the second derivative respectively, to obtain a predicted rotation matrix and a predicted translation matrix of the current image frame relative to the previous image frame;

obtain the predicted transformation matrix of the current image frame relative to the previous image frame according to the predicted rotation matrix and the predicted translation matrix of the current image frame relative to the previous image frame.
On the basis of any of the foregoing embodiments, after the processor 706 obtains the predicted pose of the current image frame according to the predicted transformation matrix and the pose information of the previous image frame, the processor 706 is further configured to:

optimize the predicted pose of the current image frame according to a preset motion-only bundle adjustment model, wherein the motion-only bundle adjustment model includes an encoder error term obtained from the measurement data of the first encoder 703 and the measurement data of the second encoder 704.
On the basis of any of the foregoing embodiments, the processor 706 is further configured such that:

if a visual failure occurs in the current image frame relative to the previous image frame, the pose of the current image frame is the predicted pose; and

if no visual failure occurs in the current image frame relative to the previous image frame, the pose of the current image frame is the optimization result of the motion-only bundle adjustment model.
On the basis of any of the foregoing embodiments, the processor 706 is further configured to:

when an optimization result cannot be obtained through the motion-only bundle adjustment model, determine whether a visual failure has occurred in the current image frame relative to the previous image frame; or,

perform feature matching between the current image frame and the previous image frame according to the optimized pose to obtain the number of inliers; and,

when the number of inliers is less than a preset threshold, determine whether a visual failure has occurred in the current image frame relative to the previous image frame.
On the basis of any of the foregoing embodiments, the processor 706 is further configured to:

when it is determined that the current image frame is a key frame, update the constructed map according to the current image frame on the basis of the constructed map, wherein the constructed map is constructed from the key frames determined before the current image frame.
On the basis of any of the foregoing embodiments, when the processor 706 updates the constructed map according to the current image frame, the processor 706 is configured to:

remove redundant key frames and redundant map points according to a preset strategy, and perform a triangulation operation according to the current image frame to generate new map points; and

after the new map points are generated, construct the covisibility graph of the current image frame and perform local bundle adjustment, so as to optimize the other key frames sharing covisibility with the current image frame, the map points visible to the current image frame, and the map points visible to those other key frames, wherein the local bundle adjustment includes an encoder error term.
On the basis of any of the foregoing embodiments, when the processor 706 constructs the covisibility graph of the current image frame to perform local bundle adjustment, the processor 706 is configured to:

when any key frame is connected to the preceding key frame by a pure encoder edge and the key frame is also connected to the following key frame by a pure encoder edge, remove the key frame, and obtain a second time period according to the timestamp of the preceding key frame and the timestamp of the following key frame, wherein a pure encoder edge means that the visual failure occurred between the adjacent key frames;

acquire all measurement data of the first encoder 703 and all measurement data of the second encoder 704 within the second time period; and

obtain a predicted transformation matrix of the following key frame relative to the preceding key frame according to all measurement data measured by the first encoder 703 and all measurement data of the second encoder 704 within the second time period.
On the basis of any of the foregoing embodiments, when the processor 706 constructs the covisibility graph of the current image frame to perform local bundle adjustment, the processor 706 is further configured to:

acquire N consecutive key frames preset before the current image frame, and construct a local optimization graph according to the current image frame and the map points it can see, and the N consecutive key frames and the map points they can see, where N is greater than or equal to 1;

construct the encoder error term according to the predicted transformation matrix between any two adjacent key frames in the local optimization graph, and perform reprojection according to the poses of the map points on each key frame to obtain reprojection errors; and

optimize, according to the encoder errors and the reprojection errors, the current image frame and the map points it can see and the N consecutive key frames and the map points they can see, and update the poses of all key frames and the positions of all map points in the local optimization graph.
On the basis of any of the foregoing embodiments, the processor 706 is further configured to:

perform loop-closure detection according to the feature points of the current image frame; and

when a loop closure between the current image frame and any key frame is detected, fuse the map points seen by the current image frame and by the key frames near that key frame.
On the basis of any of the foregoing embodiments, after the processor 706 performs loop-closure detection, the processor 706 is further configured to:

construct a pose graph to optimize all key frames so as to update the poses of all key frames, wherein the pose graph includes all key frames and the relative transformations between pairs of key frames.
On the basis of any of the foregoing embodiments, when the processor 706 constructs the pose graph to optimize all key frames, the processor 706 is configured to:

when a visual failure occurs between adjacent key frames, generate a corrected covariance matrix according to the predicted transformation matrix to modify the covariance matrix of the relative transformation between the adjacent key frames.
On the basis of any of the foregoing embodiments, after the processor 706 constructs the pose graph to optimize all key frames, the processor 706 is further configured to:

update all key frames and all map points according to a preset global bundle adjustment.
In another embodiment, the processor 706 calls the program code and, when the program code is executed, is configured to perform the following operations:

obtaining a first time period according to the timestamp of the current image frame and the timestamp of the previous image frame;

acquiring all measurement data of the first encoder 703 and all measurement data of the second encoder 704 within the first time period;

obtaining a predicted transformation matrix of the current image frame relative to the previous image frame according to all measurement data measured by the first encoder 703 and all measurement data of the second encoder 704 within the first time period;

obtaining the predicted pose of the current image frame according to the predicted transformation matrix and the pose information of the previous image frame; and

after obtaining the predicted pose of the current image frame, updating the constructed map according to the current image frame on the basis of the constructed map, wherein the constructed map is constructed from the key frames determined before the current image frame.
The specific principles and implementations of the movable platform provided by the embodiments of the present invention are similar to those of the foregoing embodiments and are not repeated here.

In the movable platform provided by this embodiment, a first time period is obtained according to the timestamp of the current image frame and the timestamp of the previous image frame; the measurement data of the first encoder and the measurement data of the second encoder within the first time period are acquired; a predicted transformation matrix of the current image frame relative to the previous image frame is obtained according to the measurement data of the first encoder and the measurement data of the second encoder within the first time period; and the predicted pose of the current image frame is obtained according to the predicted transformation matrix and the pose information of the previous image frame. By applying encoders to visual SLAM and predicting the pose of the current image frame from the encoder measurement data, this embodiment can improve the accuracy of pose acquisition and the stability and robustness of visual SLAM, so that visual SLAM can still run stably when vision fails.
In addition, this embodiment further provides a computer-readable storage medium on which a computer program is stored, the computer program being executed by a processor to implement the pose prediction method and/or the map construction method described in the foregoing embodiments.

In addition, this embodiment further provides a computer program including program code, wherein, when a computer runs the computer program, the program code executes the pose prediction method and/or the map construction method described in the foregoing embodiments.
In the several embodiments provided by the present invention, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative; for instance, the division of the units is only a division by logical function, and there may be other divisions in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented. Furthermore, the mutual couplings or direct couplings or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.

The units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.

In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit. The above integrated unit may be implemented in the form of hardware, or in the form of hardware plus software functional units.

The above integrated unit implemented in the form of a software functional unit may be stored in a computer-readable storage medium. The software functional unit is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to execute some of the steps of the methods described in the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.

Those skilled in the art can clearly understand that, for convenience and brevity of description, only the division into the above functional modules is used as an example for illustration; in practical applications, the above functions may be assigned to different functional modules as required, that is, the internal structure of the apparatus may be divided into different functional modules to complete all or part of the functions described above. For the specific working process of the apparatus described above, reference may be made to the corresponding process in the foregoing method embodiments, which is not repeated here.

Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions recorded in the foregoing embodiments may still be modified, or some or all of their technical features may be equivalently replaced, and that these modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (54)

  1. A pose prediction method, applicable to a movable platform, wherein the movable platform comprises a first rolling wheel, a second rolling wheel, a camera, a first encoder, and a second encoder; the first rolling wheel and the second rolling wheel are used to displace the movable platform, and the axes of the first rolling wheel and the second rolling wheel coincide; the camera is used to collect image frames, the first encoder is used to detect the rate of the first rolling wheel, and the second encoder is used to detect the rate of the second rolling wheel; the method comprising:

    obtaining a first time period according to the timestamp of the current image frame and the timestamp of the previous image frame;

    acquiring the measurement data of the first encoder and the measurement data of the second encoder within the first time period;

    obtaining a predicted transformation matrix of the current image frame relative to the previous image frame according to the measurement data of the first encoder and the measurement data of the second encoder within the first time period; and

    obtaining the predicted pose of the current image frame according to the predicted transformation matrix and the pose information of the previous image frame.
  2. The method according to claim 1, wherein the obtaining a predicted transformation matrix of the current image frame relative to the previous image frame according to the measurement data of the first encoder and the measurement data of the second encoder within the first time period comprises:

    obtaining, according to the measurement data of the first encoder and the measurement data of the second encoder within the first time period, a first derivative of the rotation matrix of the current image frame with respect to time and a second derivative of the position of the current image frame with respect to time;

    integrating, based on the first time period, the first derivative and the second derivative respectively to obtain a predicted rotation matrix and a predicted translation matrix of the current image frame relative to the previous image frame; and

    obtaining the predicted transformation matrix of the current image frame relative to the previous image frame according to the predicted rotation matrix and the predicted translation matrix of the current image frame relative to the previous image frame.
  3. The method according to claim 1 or 2, wherein after the obtaining the predicted pose of the current image frame according to the predicted transformation matrix and the pose information of the previous image frame, the method further comprises:

    optimizing the predicted pose of the current image frame according to a preset motion-only bundle adjustment model, wherein the motion-only bundle adjustment model includes an encoder error term calculated based on the measurement data of the first encoder and the measurement data of the second encoder.
  4. The method according to claim 3, further comprising:

    if a visual failure occurs in the current image frame relative to the previous image frame, taking the pose of the current image frame to be the predicted pose; and

    if no visual failure occurs in the current image frame relative to the previous image frame, taking the pose of the current image frame to be the optimization result of the motion-only bundle adjustment model.
  5. The method according to claim 4, further comprising:

    when an optimization result cannot be obtained through the motion-only bundle adjustment model, determining that a visual failure occurs in the current image frame relative to the previous image frame; or,

    performing feature matching between the current image frame and the previous image frame according to the optimized pose to obtain the number of inliers; and,

    when the number of inliers is less than a preset threshold, determining that a visual failure occurs in the current image frame relative to the previous image frame.
  6. The method according to any one of claims 1-5, further comprising:

    when it is determined that the current image frame is a key frame, updating the constructed map according to the current image frame on the basis of the constructed map, wherein the constructed map is constructed from the key frames determined before the current image frame.
  7. The method according to claim 6, wherein the updating the constructed map according to the current image frame comprises:

    removing redundant key frames and redundant map points according to a preset strategy, and performing a triangulation operation according to the current image frame to generate new map points; and

    after the new map points are generated, constructing the covisibility graph of the current image frame and performing local bundle adjustment, so as to optimize the other key frames sharing covisibility with the current image frame, the map points visible to the current image frame, and the map points visible to the other key frames, wherein the local bundle adjustment includes an encoder error term.
  8. The method according to claim 7, wherein the constructing the covisibility graph of the current image frame and performing local bundle adjustment comprises:

    when any key frame is connected to the preceding key frame by a pure encoder edge and the key frame is also connected to the following key frame by a pure encoder edge, removing the key frame, and obtaining a second time period according to the timestamp of the preceding key frame and the timestamp of the following key frame, wherein a pure encoder edge means that the visual failure occurred between the adjacent key frames;

    acquiring all measurement data of the first encoder and all measurement data of the second encoder within the second time period; and

    obtaining a predicted transformation matrix of the following key frame relative to the preceding key frame according to all measurement data measured by the first encoder and all measurement data of the second encoder within the second time period.
  9. The method according to claim 7 or 8, wherein the constructing the covisibility graph of the current image frame and performing local bundle adjustment further comprises:

    acquiring N consecutive key frames preset before the current image frame, and constructing a local optimization graph according to the current image frame and the map points it can see, and the N consecutive key frames and the map points they can see, wherein N is greater than or equal to 1;

    constructing the encoder error term according to the predicted transformation matrix between any two adjacent key frames in the local optimization graph, and performing reprojection according to the poses of the map points on each key frame to obtain reprojection errors; and

    optimizing, according to the encoder errors and the reprojection errors, the current image frame and the map points it can see and the N consecutive key frames and the map points they can see, and updating the poses of all key frames and the positions of all map points in the local optimization graph.
  10. The method according to any one of claims 1-9, further comprising:

    performing loop-closure detection according to the feature points of the current image frame; and

    when a loop closure between the current image frame and any key frame is detected, fusing the map points seen by the current image frame and by the key frames near the key frame.
  11. The method according to claim 10, wherein after the loop-closure detection is performed, the method further comprises:

    constructing a pose graph to optimize all key frames so as to update the poses of all key frames, wherein the pose graph includes all key frames and the relative transformations between pairs of key frames.
  12. The method according to claim 11, wherein the constructing a pose graph to optimize all key frames comprises:

    when a visual failure occurs between adjacent key frames, generating a corrected covariance matrix according to the predicted transformation matrix to modify the covariance matrix of the relative transformation between the adjacent key frames.
  13. The method according to claim 11 or 12, wherein after the constructing a pose graph to optimize all key frames, the method further comprises:

    updating all key frames and all map points according to a preset global bundle adjustment.
  14. A movable platform, comprising: a first rolling wheel, a second rolling wheel, a camera, a first encoder, a second encoder, a memory, and a processor;

    wherein the first rolling wheel and the second rolling wheel are used to displace the movable platform, and the axes of the first rolling wheel and the second rolling wheel coincide;

    the camera is used to collect image frames;

    the first encoder is used to detect the rate of the first rolling wheel, and the second encoder is used to detect the rate of the second rolling wheel;

    the memory is used to store program code; and

    the processor calls the program code and, when the program code is executed, is configured to perform the following operations:

    obtaining a first time period according to the timestamp of the current image frame and the timestamp of the previous image frame;

    acquiring the measurement data of the first encoder and the measurement data of the second encoder within the first time period;

    obtaining a predicted transformation matrix of the current image frame relative to the previous image frame according to the measurement data of the first encoder and the measurement data of the second encoder within the first time period; and

    obtaining the predicted pose of the current image frame according to the predicted transformation matrix and the pose information of the previous image frame.
  15. The movable platform according to claim 14, wherein when the processor obtains the predicted transformation matrix of the current image frame relative to the previous image frame according to the measurement data of the first encoder and the measurement data of the second encoder within the first time period, the processor is configured to:

    obtain, according to the measurement data of the first encoder and the measurement data of the second encoder within the first time period, a first derivative of the rotation matrix of the current image frame with respect to time and a second derivative of the position of the current image frame with respect to time;

    integrate, based on the first time period, the first derivative and the second derivative respectively to obtain a predicted rotation matrix and a predicted translation matrix of the current image frame relative to the previous image frame; and

    obtain the predicted transformation matrix of the current image frame relative to the previous image frame according to the predicted rotation matrix and the predicted translation matrix of the current image frame relative to the previous image frame.
  16. The movable platform according to claim 14 or 15, wherein after the processor obtains the predicted pose of the current image frame according to the predicted transformation matrix and the pose information of the previous image frame, the processor is further configured to:

    optimize the predicted pose of the current image frame according to a preset motion-only bundle adjustment model, wherein the motion-only bundle adjustment model includes an encoder error term calculated based on the measurement data of the first encoder and the measurement data of the second encoder.
  17. The movable platform according to claim 16, wherein the processor is further configured such that:

    if a visual failure occurs in the current image frame relative to the previous image frame, the pose of the current image frame is the predicted pose; and

    if no visual failure occurs in the current image frame relative to the previous image frame, the pose of the current image frame is the optimization result of the motion-only bundle adjustment model.
  18. The movable platform according to claim 17, wherein the processor is further configured to:

    when an optimization result cannot be obtained through the motion-only bundle adjustment model, determine whether a visual failure has occurred in the current image frame relative to the previous image frame; or,

    perform feature matching between the current image frame and the previous image frame according to the optimized pose to obtain the number of inliers; and,

    when the number of inliers is less than a preset threshold, determine whether a visual failure has occurred in the current image frame relative to the previous image frame.
  19. The movable platform according to any one of claims 14-18, wherein the processor is further configured to:

    when it is determined that the current image frame is a key frame, update the constructed map according to the current image frame on the basis of the constructed map, wherein the constructed map is constructed from the key frames determined before the current image frame.
  20. The movable platform according to claim 19, wherein when the processor updates the constructed map according to the current image frame, the processor is configured to:

    remove redundant key frames and redundant map points according to a preset strategy, and perform a triangulation operation according to the current image frame to generate new map points; and

    after the new map points are generated, construct the covisibility graph of the current image frame and perform local bundle adjustment, so as to optimize the other key frames sharing covisibility with the current image frame, the map points visible to the current image frame, and the map points visible to the other key frames, wherein the local bundle adjustment includes an encoder error term.
  21. The movable platform according to claim 20, wherein when the processor constructs the covisibility graph of the current image frame to perform local bundle adjustment, the processor is configured to:

    when any key frame is connected to the preceding key frame by a pure encoder edge and the key frame is also connected to the following key frame by a pure encoder edge, remove the key frame, and obtain a second time period according to the timestamp of the preceding key frame and the timestamp of the following key frame, wherein a pure encoder edge means that the visual failure occurred between the adjacent key frames;

    acquire all measurement data of the first encoder and all measurement data of the second encoder within the second time period; and

    obtain a predicted transformation matrix of the following key frame relative to the preceding key frame according to all measurement data measured by the first encoder and all measurement data of the second encoder within the second time period.
  22. The movable platform according to claim 20 or 21, wherein when the processor constructs the covisibility graph of the current image frame to perform local bundle adjustment, the processor is further configured to:

    acquire N consecutive key frames preset before the current image frame, and construct a local optimization graph according to the current image frame and the map points it can see, and the N consecutive key frames and the map points they can see, wherein N is greater than or equal to 1;

    construct the encoder error term according to the predicted transformation matrix between any two adjacent key frames in the local optimization graph, and perform reprojection according to the poses of the map points on each key frame to obtain reprojection errors; and

    optimize, according to the encoder errors and the reprojection errors, the current image frame and the map points it can see and the N consecutive key frames and the map points they can see, and update the poses of all key frames and the positions of all map points in the local optimization graph.
  23. The movable platform according to any one of claims 14-22, wherein the processor is further configured to:

    perform loop-closure detection according to the feature points of the current image frame; and

    when a loop closure between the current image frame and any key frame is detected, fuse the map points seen by the current image frame and by the key frames near the key frame.
  24. The movable platform according to claim 23, wherein after the processor performs loop-closure detection, the processor is further configured to:

    construct a pose graph to optimize all key frames so as to update the poses of all key frames, wherein the pose graph includes all key frames and the relative transformations between pairs of key frames.
  25. The movable platform according to claim 24, wherein when the processor constructs the pose graph to optimize all key frames, the processor is configured to:

    when a visual failure occurs between adjacent key frames, generate a corrected covariance matrix according to the predicted transformation matrix to modify the covariance matrix of the relative transformation between the adjacent key frames.
  26. The movable platform according to claim 24 or 25, wherein after the processor constructs the pose graph to optimize all key frames, the processor is further configured to:

    update all key frames and all map points according to a preset global bundle adjustment.
  27. A computer-readable storage medium, on which a computer program is stored, wherein the computer program is executed by a processor to implement the method according to any one of claims 1-13.
  28. A map construction method, applicable to a movable platform, wherein the movable platform comprises a first rolling wheel, a second rolling wheel, a camera, a first encoder, and a second encoder; the first rolling wheel and the second rolling wheel are used to displace the movable platform, and the axes of the first rolling wheel and the second rolling wheel coincide; the camera is used to collect image frames, the first encoder is used to detect the rate of the first rolling wheel, and the second encoder is used to detect the rate of the second rolling wheel; the method comprising:

    obtaining a first time period according to the timestamp of the current image frame and the timestamp of the previous image frame;

    acquiring the measurement data of the first encoder and the measurement data of the second encoder within the first time period;

    obtaining a predicted transformation matrix of the current image frame relative to the previous image frame according to the measurement data of the first encoder and the measurement data of the second encoder within the first time period;

    obtaining the predicted pose of the current image frame according to the predicted transformation matrix and the pose information of the previous image frame; and

    after obtaining the predicted pose of the current image frame, updating the constructed map according to the current image frame on the basis of the constructed map, wherein the constructed map is constructed from the key frames determined before the current image frame.
29. The method according to claim 28, wherein updating the constructed map according to the current image frame on the basis of the constructed map comprises:
    when the current image frame is determined to be a key frame, updating the constructed map according to the current image frame on the basis of the constructed map.
30. The method according to claim 28 or 29, wherein obtaining the predicted transformation matrix of the current image frame relative to the previous image frame according to the measurement data of the first encoder and the measurement data of the second encoder within the first time period comprises:
    acquiring, according to the measurement data of the first encoder and the measurement data of the second encoder within the first time period, a first derivative, namely the time derivative of the rotation matrix of the current image frame, and a second derivative, namely the time derivative of the position of the current image frame;
    integrating the first derivative and the second derivative respectively over the first time period to obtain a predicted rotation matrix and a predicted translation matrix of the current image frame relative to the previous image frame;
    obtaining the predicted transformation matrix of the current image frame relative to the previous image frame according to the predicted rotation matrix and the predicted translation matrix.
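For illustration only (not part of the claims): for a two-wheel platform with coincident wheel axes, the encoder speeds v_l and v_r give a body linear velocity v = (v_l + v_r) / 2 and a yaw rate w = (v_r - v_l) / b for wheel separation b, which is one concrete reading of the two time derivatives in claim 30. The sketch below integrates these over the first time period; the sample format, the wheel_base parameter, and the SE(2) convention are assumptions of this sketch.

```python
import numpy as np

def predict_transform(encoder_samples, wheel_base):
    """Integrate differential-drive wheel speeds over the first time period.

    encoder_samples: list of (dt, v_left, v_right) tuples covering the
    interval between the previous and current image frames.
    Returns a 3x3 homogeneous SE(2) matrix: the predicted transformation
    of the current frame relative to the previous frame.
    """
    T = np.eye(3)  # accumulated predicted rotation and translation
    for dt, v_l, v_r in encoder_samples:
        v = 0.5 * (v_l + v_r)             # body linear velocity
        w = (v_r - v_l) / wheel_base      # body yaw rate
        dtheta = w * dt
        c, s = np.cos(dtheta), np.sin(dtheta)
        step = np.array([[c, -s, v * dt],  # first-order SE(2) increment
                         [s,  c, 0.0],
                         [0.0, 0.0, 1.0]])
        T = T @ step
    return T

def predict_pose(T_prev_world, T_rel):
    """Predicted pose of the current frame from the previous frame's pose."""
    return T_prev_world @ T_rel
```

predict_pose then composes the integrated transformation with the previous frame's pose, matching the prediction step of claim 28.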
31. The method according to any one of claims 28-30, further comprising, after the predicted pose of the current image frame is obtained:
    optimizing the predicted pose of the current image frame according to a preset motion-only bundle adjustment model, wherein the motion-only bundle adjustment model comprises an encoder error term, and the encoder error term is calculated based on the measurement data of the first encoder and the measurement data of the second encoder.
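For illustration only (not part of the claims): a motion-only bundle adjustment refines only the current frame's pose while the map points stay fixed. One plausible formulation, assumed here, parameterizes the pose as a rotation vector plus a translation and stacks reprojection errors with an encoder error term that penalizes deviation from the encoder-predicted relative motion; all names and the camera-to-world convention are assumptions of this sketch.

```python
import numpy as np
from scipy.optimize import least_squares
from scipy.spatial.transform import Rotation

def _residuals(x, pts_w, uvs, K, T_prev, T_pred_rel, w_enc):
    R = Rotation.from_rotvec(x[:3]).as_matrix()  # camera-to-world rotation
    t = x[3:]                                    # camera center in world
    p_cam = (pts_w - t) @ R                      # world points -> camera frame
    uv = p_cam @ K.T
    rep = (uv[:, :2] / uv[:, 2:3] - uvs).ravel() # reprojection errors
    T_curr = np.eye(4); T_curr[:3, :3] = R; T_curr[:3, 3] = t
    # encoder error term: optimized relative motion vs. encoder prediction
    E = np.linalg.inv(T_pred_rel) @ np.linalg.inv(T_prev) @ T_curr
    enc = np.concatenate([Rotation.from_matrix(E[:3, :3]).as_rotvec(),
                          E[:3, 3]])
    return np.concatenate([rep, w_enc * enc])

def motion_only_ba(pose_pred, pts_w, uvs, K, T_prev, T_pred_rel, w_enc=1.0):
    """Refine only the current-frame pose; map points stay fixed.

    pose_pred: 4x4 encoder-predicted pose used as the initial guess.
    pts_w / uvs: matched 3D map points and their pixel observations.
    """
    x0 = np.concatenate([Rotation.from_matrix(pose_pred[:3, :3]).as_rotvec(),
                         pose_pred[:3, 3]])
    sol = least_squares(_residuals, x0,
                        args=(pts_w, uvs, K, T_prev, T_pred_rel, w_enc))
    T = np.eye(4)
    T[:3, :3] = Rotation.from_rotvec(sol.x[:3]).as_matrix()
    T[:3, 3] = sol.x[3:]
    return T
```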
32. The method according to claim 31, further comprising, after the predicted pose of the current image frame is obtained:
    if a visual failure occurs in the current image frame relative to the previous image frame, taking the predicted pose as the pose of the current image frame;
    if no visual failure occurs in the current image frame relative to the previous image frame, taking the optimization result of the motion-only bundle adjustment model as the pose of the current image frame.
33. The method according to claim 32, further comprising, after the predicted pose of the current image frame is obtained:
    when no optimization result can be obtained through the motion-only bundle adjustment model, determining that a visual failure occurs in the current image frame relative to the previous image frame; or,
    performing feature matching between the current image frame and the previous image frame according to the optimized pose to obtain a number of inliers;
    when the number of inliers is less than a preset threshold, determining that a visual failure occurs in the current image frame relative to the previous image frame.
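For illustration only (not part of the claims): a minimal sketch of the inlier test, assuming matches are counted as inliers when the previous frame's points, transformed by the optimized relative pose, reproject within a pixel threshold of their matches in the current frame. The thresholds and point representation are assumptions of this sketch.

```python
import numpy as np

def count_inliers(K, T_rel, pts_prev_cam, uvs_curr, pix_thresh=2.0):
    """Count matches whose reprojection into the current frame, under the
    optimized relative pose T_rel (previous camera -> current camera),
    lands within pix_thresh pixels of the matched observation."""
    p = pts_prev_cam @ T_rel[:3, :3].T + T_rel[:3, 3]
    uv = p @ K.T
    uv = uv[:, :2] / uv[:, 2:3]
    err = np.linalg.norm(uv - uvs_curr, axis=1)
    return int(np.sum(err < pix_thresh))

def visual_failure(optim_ok, n_inliers, min_inliers=30):
    """Visual failure: the optimization failed, or too few inliers remain."""
    return (not optim_ok) or (n_inliers < min_inliers)
```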
34. The method according to any one of claims 28-33, wherein updating the constructed map according to the current image frame comprises:
    removing redundant key frames and redundant map points according to a preset strategy, and performing a triangulation operation according to the current image frame to generate new map points;
    after the new map points are generated, constructing a covisibility graph of the current image frame and performing local bundle adjustment to optimize the other key frames covisible with the current image frame, the map points observable by the current image frame, and the map points observable by the other key frames, wherein the local bundle adjustment comprises an encoder error term.
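For illustration only (not part of the claims): the claim does not fix a triangulation method; a standard two-view linear (DLT) triangulation, shown below, is one way to generate a new map point from a feature matched between the current image frame and a covisible key frame.

```python
import numpy as np

def triangulate(P1, P2, uv1, uv2):
    """Linear (DLT) triangulation of one feature match.

    P1, P2: 3x4 projection matrices (K @ [R|t]) of the two frames.
    uv1, uv2: the matched pixel observations in each frame.
    Returns the 3D map point in world coordinates.
    """
    A = np.stack([uv1[0] * P1[2] - P1[0],
                  uv1[1] * P1[2] - P1[1],
                  uv2[0] * P2[2] - P2[0],
                  uv2[1] * P2[2] - P2[1]])
    _, _, vt = np.linalg.svd(A)   # null space of A holds the solution
    X = vt[-1]
    return X[:3] / X[3]           # dehomogenize
```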
35. The method according to claim 34, wherein constructing the covisibility graph of the current image frame and performing the local bundle adjustment comprises:
    when any key frame is connected to its previous key frame by a pure encoder edge and is also connected to its next key frame by a pure encoder edge, removing the key frame, and obtaining a second time period according to the timestamp of the previous key frame and the timestamp of the next key frame, wherein a pure encoder edge indicates that the visual failure has occurred between the adjacent key frames it connects;
    acquiring all measurement data of the first encoder and all measurement data of the second encoder within the second time period;
    obtaining a predicted transformation matrix of the next key frame relative to the previous key frame according to all the measurement data of the first encoder and all the measurement data of the second encoder within the second time period.
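For illustration only (not part of the claims): a minimal sketch of this key-frame removal, assuming each key frame stores the encoder samples of the edge that ends at it; concatenating the two sample runs lets a single predicted transformation span the second time period. The attribute names and the integrate callback (for example predict_transform from the earlier sketch) are assumptions of this sketch.

```python
def merge_pure_encoder_edges(kf_mid, kf_next, integrate):
    """Drop a key frame connected on both sides by pure encoder edges.

    kf_mid / kf_next: the key frame to remove and the one after it; each
    is assumed to carry an encoder_samples list for the edge ending at it.
    integrate: maps a sample list to a relative transformation matrix,
    e.g. predict_transform() above.
    Returns the predicted transformation of the next key frame relative
    to the previous key frame over the whole second time period.
    """
    samples = kf_mid.encoder_samples + kf_next.encoder_samples
    return integrate(samples)
```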
36. The method according to claim 34 or 35, wherein constructing the covisibility graph of the current image frame and performing the local bundle adjustment further comprises:
    acquiring N consecutive key frames preceding the current image frame, and constructing a local optimization graph according to the current image frame and the map points it can observe and the N consecutive key frames and the map points they can observe, where N is greater than or equal to 1;
    constructing the encoder error term according to the predicted transformation matrix between any two adjacent key frames in the local optimization graph, and obtaining a reprojection error by reprojecting the map points onto each key frame;
    optimizing, according to the encoder error and the reprojection error, the current image frame and the map points it can observe and the N consecutive key frames and the map points they can observe, and updating the poses of all key frames and the positions of all map points in the local optimization graph.
37. The method according to any one of claims 28-36, further comprising, after the predicted pose of the current image frame is obtained:
    performing loop closure detection according to the feature points of the current image frame;
    when a loop closure between the current image frame and any key frame is detected, fusing the map points observed by the current image frame and by the key frames near that key frame.
38. The method according to claim 37, further comprising, after the loop closure detection is performed:
    constructing a pose graph to optimize all key frames so as to update the poses of all key frames, the pose graph comprising all of the key frames and the relative transformations between pairs of key frames.
39. The method according to claim 38, wherein constructing the pose graph to optimize all key frames comprises:
    when a visual failure occurs between adjacent key frames, generating a corrected covariance matrix according to the predicted transformation matrix to modify the covariance matrix of the relative transformation between the adjacent key frames.
40. The method according to claim 38 or 39, further comprising, after the pose graph is constructed to optimize all key frames:
    updating all key frames and all map points according to a preset global bundle adjustment.
41. A movable platform, comprising: a first rolling wheel, a second rolling wheel, a camera, a first encoder, a second encoder, a memory, and a processor;
    the first rolling wheel and the second rolling wheel being used to displace the movable platform, the axes of the first rolling wheel and the second rolling wheel coinciding;
    the camera being used to capture image frames;
    the first encoder being used to detect the speed of the first rolling wheel, and the second encoder being used to detect the speed of the second rolling wheel;
    the memory being used to store program code;
    the processor invoking the program code which, when executed, causes the processor to perform the following operations:
    obtaining a first time period according to the timestamp of the current image frame and the timestamp of the previous image frame;
    acquiring measurement data of the first encoder and measurement data of the second encoder within the first time period;
    obtaining a predicted transformation matrix of the current image frame relative to the previous image frame according to the measurement data of the first encoder and the measurement data of the second encoder within the first time period;
    obtaining a predicted pose of the current image frame according to the predicted transformation matrix and the pose information of the previous image frame;
    after the predicted pose of the current image frame is obtained, updating the constructed map according to the current image frame on the basis of the constructed map, the constructed map being constructed from key frames determined before the current image frame.
42. The movable platform according to claim 41, wherein when the processor updates the constructed map according to the current image frame on the basis of the constructed map, the processor is configured to:
    when the current image frame is determined to be a key frame, update the constructed map according to the current image frame on the basis of the constructed map.
43. The movable platform according to claim 41 or 42, wherein when the processor obtains the predicted transformation matrix of the current image frame relative to the previous image frame according to the measurement data of the first encoder and the measurement data of the second encoder within the first time period, the processor is configured to:
    acquire, according to the measurement data of the first encoder and the measurement data of the second encoder within the first time period, a first derivative, namely the time derivative of the rotation matrix of the current image frame, and a second derivative, namely the time derivative of the position of the current image frame;
    integrate the first derivative and the second derivative respectively over the first time period to obtain a predicted rotation matrix and a predicted translation matrix of the current image frame relative to the previous image frame;
    obtain the predicted transformation matrix of the current image frame relative to the previous image frame according to the predicted rotation matrix and the predicted translation matrix.
44. The movable platform according to any one of claims 41-43, wherein after the processor obtains the predicted pose of the current image frame, the processor is further configured to:
    optimize the predicted pose of the current image frame according to a preset motion-only bundle adjustment model, wherein the motion-only bundle adjustment model comprises an encoder error term, and the encoder error term is calculated based on the measurement data of the first encoder and the measurement data of the second encoder.
45. The movable platform according to claim 44, wherein after the processor obtains the predicted pose of the current image frame, the processor is further configured to:
    if a visual failure occurs in the current image frame relative to the previous image frame, take the predicted pose as the pose of the current image frame;
    if no visual failure occurs in the current image frame relative to the previous image frame, take the optimization result of the motion-only bundle adjustment model as the pose of the current image frame.
46. The movable platform according to claim 45, wherein after the processor obtains the predicted pose of the current image frame, the processor is further configured to:
    when no optimization result can be obtained through the motion-only bundle adjustment model, determine that a visual failure occurs in the current image frame relative to the previous image frame; or,
    perform feature matching between the current image frame and the previous image frame according to the optimized pose to obtain a number of inliers;
    when the number of inliers is less than a preset threshold, determine that a visual failure occurs in the current image frame relative to the previous image frame.
47. The movable platform according to any one of claims 41-46, wherein when the processor updates the constructed map according to the current image frame, the processor is configured to:
    remove redundant key frames and redundant map points according to a preset strategy, and perform a triangulation operation according to the current image frame to generate new map points;
    after the new map points are generated, construct a covisibility graph of the current image frame and perform local bundle adjustment to optimize the other key frames covisible with the current image frame, the map points observable by the current image frame, and the map points observable by the other key frames, wherein the local bundle adjustment comprises an encoder error term.
48. The movable platform according to claim 47, wherein when the processor constructs the covisibility graph of the current image frame and performs the local bundle adjustment, the processor is configured to:
    when any key frame is connected to its previous key frame by a pure encoder edge and is also connected to its next key frame by a pure encoder edge, remove the key frame and obtain a second time period according to the timestamp of the previous key frame and the timestamp of the next key frame, wherein a pure encoder edge indicates that the visual failure has occurred between the adjacent key frames it connects;
    acquire all measurement data of the first encoder and all measurement data of the second encoder within the second time period;
    obtain a predicted transformation matrix of the next key frame relative to the previous key frame according to all the measurement data of the first encoder and all the measurement data of the second encoder within the second time period.
49. The movable platform according to claim 47 or 48, wherein when the processor constructs the covisibility graph of the current image frame and performs the local bundle adjustment, the processor is further configured to:
    acquire N consecutive key frames preceding the current image frame, and construct a local optimization graph according to the current image frame and the map points it can observe and the N consecutive key frames and the map points they can observe, where N is greater than or equal to 1;
    construct the encoder error term according to the predicted transformation matrix between any two adjacent key frames in the local optimization graph, and obtain a reprojection error by reprojecting the map points onto each key frame;
    optimize, according to the encoder error and the reprojection error, the current image frame and the map points it can observe and the N consecutive key frames and the map points they can observe, and update the poses of all key frames and the positions of all map points in the local optimization graph.
50. The movable platform according to any one of claims 41-49, wherein after the processor obtains the predicted pose of the current image frame, the processor is further configured to:
    perform loop closure detection according to the feature points of the current image frame;
    when a loop closure between the current image frame and any key frame is detected, fuse the map points observed by the current image frame and by the key frames near that key frame.
51. The movable platform according to claim 50, wherein after the processor performs the loop closure detection, the processor is further configured to:
    construct a pose graph to optimize all key frames so as to update the poses of all key frames, the pose graph comprising all of the key frames and the relative transformations between pairs of key frames.
52. The movable platform according to claim 51, wherein when the processor constructs the pose graph to optimize all key frames, the processor is configured to:
    when a visual failure occurs between adjacent key frames, generate a corrected covariance matrix according to the predicted transformation matrix to modify the covariance matrix of the relative transformation between the adjacent key frames.
53. The movable platform according to claim 51 or 52, wherein after the processor constructs the pose graph to optimize all key frames, the processor is further configured to:
    update all key frames and all map points according to a preset global bundle adjustment.
54. A computer-readable storage medium having a computer program stored thereon, wherein the computer program is executed by a processor to implement the method according to any one of claims 28-40.
PCT/CN2019/103599 2019-08-30 2019-08-30 Pose prediction method, map construction method, movable platform, and storage medium WO2021035669A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201980033455.4A CN112219087A (en) 2019-08-30 2019-08-30 Pose prediction method, map construction method, movable platform and storage medium
PCT/CN2019/103599 WO2021035669A1 (en) 2019-08-30 2019-08-30 Pose prediction method, map construction method, movable platform, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2019/103599 WO2021035669A1 (en) 2019-08-30 2019-08-30 Pose prediction method, map construction method, movable platform, and storage medium

Publications (1)

Publication Number Publication Date
WO2021035669A1 (en)

Family

ID=74058715

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/103599 WO2021035669A1 (en) 2019-08-30 2019-08-30 Pose prediction method, map construction method, movable platform, and storage medium

Country Status (2)

Country Link
CN (1) CN112219087A (en)
WO (1) WO2021035669A1 (en)


Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112991515B (en) * 2021-02-26 2022-08-19 山东英信计算机技术有限公司 Three-dimensional reconstruction method, device and related equipment
CN113052855B (en) * 2021-02-26 2021-11-02 苏州迈思捷智能科技有限公司 Semantic SLAM method based on visual-IMU-wheel speed meter fusion
CN113223161B (en) * 2021-04-07 2022-04-12 武汉大学 Robust panoramic SLAM system and method based on IMU and wheel speed meter tight coupling
CN113094462B (en) * 2021-04-30 2023-10-24 腾讯科技(深圳)有限公司 Data processing method and device and storage medium


Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104374395A (en) * 2014-03-31 2015-02-25 南京邮电大学 Graph-based vision SLAM (simultaneous localization and mapping) method
CN108256574B (en) * 2018-01-16 2020-08-11 广东省智能制造研究所 Robot positioning method and device
CN108921898B (en) * 2018-06-28 2021-08-10 北京旷视科技有限公司 Camera pose determination method and device, electronic equipment and computer readable medium
CN109671120A (en) * 2018-11-08 2019-04-23 南京华捷艾米软件科技有限公司 A kind of monocular SLAM initial method and system based on wheel type encoder
CN110044354B (en) * 2019-03-28 2022-05-20 东南大学 Binocular vision indoor positioning and mapping method and device
CN110706248B (en) * 2019-08-20 2024-03-12 广东工业大学 Visual perception mapping method based on SLAM and mobile robot
CN115131420A (en) * 2022-06-24 2022-09-30 武汉依迅北斗时空技术股份有限公司 Visual SLAM method and device based on key frame optimization

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170337690A1 (en) * 2016-05-20 2017-11-23 Qualcomm Incorporated Predictor-corrector based pose detection
CN106679648A (en) * 2016-12-08 2017-05-17 东南大学 Vision-inertia integrated SLAM (Simultaneous Localization and Mapping) method based on genetic algorithm
CN106997688A (en) * 2017-06-08 2017-08-01 重庆大学 Parking position detecting method based on multi-sensor information fusion
CN109583290A (en) * 2017-09-28 2019-04-05 百度(美国)有限责任公司 The system and method for improving visual signature detection using motion-dependent data
CN110160527A (en) * 2019-05-06 2019-08-23 安徽红蝠智能科技有限公司 A kind of Mobile Robotics Navigation method and apparatus

Cited By (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112819744A (en) * 2021-02-26 2021-05-18 中国人民解放军93114部队 GNSS and visual SLAM fused track measuring method and device
CN113516692A (en) * 2021-05-18 2021-10-19 上海汽车集团股份有限公司 Multi-sensor fusion SLAM method and device
CN113379803A (en) * 2021-07-07 2021-09-10 上海谦尊升网络科技有限公司 Positioning method based on visual image
CN113379803B (en) * 2021-07-07 2024-02-02 上海谦尊升网络科技有限公司 Positioning method based on visual image
CN113587934A (en) * 2021-07-30 2021-11-02 深圳市普渡科技有限公司 Robot, indoor positioning method and device and readable storage medium
CN113587934B (en) * 2021-07-30 2024-03-19 深圳市普渡科技有限公司 Robot, indoor positioning method and device and readable storage medium
CN113763560A (en) * 2021-08-02 2021-12-07 纵目科技(上海)股份有限公司 Method, system and equipment for generating point cloud data and computer readable storage medium
CN113763560B (en) * 2021-08-02 2024-02-09 纵目科技(上海)股份有限公司 Method, system, equipment and computer readable storage medium for generating point cloud data
CN113763548A (en) * 2021-08-17 2021-12-07 同济大学 Poor texture tunnel modeling method and system based on vision-laser radar coupling
CN113763548B (en) * 2021-08-17 2024-02-27 同济大学 Vision-laser radar coupling-based lean texture tunnel modeling method and system
CN113781563B (en) * 2021-09-14 2023-10-24 中国民航大学 Mobile robot loop detection method based on deep learning
CN113781563A (en) * 2021-09-14 2021-12-10 中国民航大学 Mobile robot loop detection method based on deep learning
CN114088081A (en) * 2021-10-10 2022-02-25 北京工业大学 Map construction method for accurate positioning based on multi-segment joint optimization
CN114279456A (en) * 2021-12-06 2022-04-05 纵目科技(上海)股份有限公司 Picture construction/vehicle positioning method, system, terminal and computer storage medium
WO2024002065A1 (en) * 2022-06-30 2024-01-04 维沃移动通信有限公司 Video encoding method and apparatus, electronic device, and medium
CN115930971B (en) * 2023-02-01 2023-09-19 七腾机器人有限公司 Data fusion processing method for robot positioning and map building
CN115930971A (en) * 2023-02-01 2023-04-07 七腾机器人有限公司 Data fusion processing method for robot positioning and mapping
CN116252581A (en) * 2023-03-15 2023-06-13 吉林大学 System and method for estimating vertical and pitching motion information of vehicle body under straight running working condition
CN116252581B (en) * 2023-03-15 2024-01-16 吉林大学 System and method for estimating vertical and pitching motion information of vehicle body under straight running working condition

Also Published As

Publication number Publication date
CN112219087A (en) 2021-01-12

Similar Documents

Publication Publication Date Title
WO2021035669A1 (en) Pose prediction method, map construction method, movable platform, and storage medium
CN111561923B (en) SLAM (simultaneous localization and mapping) mapping method and system based on multi-sensor fusion
CN109307508B (en) Panoramic inertial navigation SLAM method based on multiple key frames
US11668571B2 (en) Simultaneous localization and mapping (SLAM) using dual event cameras
CN109166149B (en) Positioning and three-dimensional line frame structure reconstruction method and system integrating binocular camera and IMU
CN109993113B (en) Pose estimation method based on RGB-D and IMU information fusion
CN112634451B (en) Outdoor large-scene three-dimensional mapping method integrating multiple sensors
KR102427921B1 (en) Apparatus and method for real-time mapping and localization
CN109506642B (en) Robot multi-camera visual inertia real-time positioning method and device
Tanskanen et al. Live metric 3D reconstruction on mobile phones
KR101725060B1 (en) Apparatus for recognizing location mobile robot using key point based on gradient and method thereof
CN112304307A (en) Positioning method and device based on multi-sensor fusion and storage medium
WO2019157925A1 (en) Visual-inertial odometry implementation method and system
US9858640B1 (en) Device and method for merging 3D point clouds from sparsely distributed viewpoints
CN110726406A (en) Improved nonlinear optimization monocular inertial navigation SLAM method
CN112734852A (en) Robot mapping method and device and computing equipment
WO2020221307A1 (en) Method and device for tracking moving object
CN111127524A (en) Method, system and device for tracking trajectory and reconstructing three-dimensional image
CN112802096A (en) Device and method for realizing real-time positioning and mapping
CN111932674A (en) Optimization method of line laser vision inertial system
Zhang et al. Hand-held monocular SLAM based on line segments
CN114013449A (en) Data processing method and device for automatic driving vehicle and automatic driving vehicle
CN112731503A (en) Pose estimation method and system based on front-end tight coupling
CN116468786A (en) Semantic SLAM method based on point-line combination and oriented to dynamic environment
CN112767482B (en) Indoor and outdoor positioning method and system with multi-sensor fusion

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 19942755; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 19942755; Country of ref document: EP; Kind code of ref document: A1)