WO2019057179A1 - A visual SLAM method and apparatus based on point-line features - Google Patents

A visual SLAM method and apparatus based on point-line features

Info

Publication number
WO2019057179A1
WO2019057179A1 PCT/CN2018/107097
Authority
WO
WIPO (PCT)
Prior art keywords
line
feature
feature line
state vector
global
Prior art date
Application number
PCT/CN2018/107097
Other languages
English (en)
French (fr)
Inventor
李晚龙
李建飞
高亚军
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from CN201810184021.1A external-priority patent/CN109558879A/zh
Application filed by 华为技术有限公司
Priority to EP18859404.8A priority Critical patent/EP3680809A4/en
Publication of WO2019057179A1 publication Critical patent/WO2019057179A1/zh
Priority to US16/824,219 priority patent/US11270148B2/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/46Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462Salient features, e.g. scale invariant feature transforms [SIFT]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/243Classification techniques relating to the number of classes
    • G06F18/24323Tree-organised classifiers
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • G06T7/246Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/60Analysis of geometric attributes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/10Terrestrial scenes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • G06V20/46Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10016Video; Image sequence
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30244Camera pose

Definitions

  • the present application relates to the field of image processing technologies, and in particular, to a visual SLAM method and apparatus based on point-line features.
  • SLAM: simultaneous localization and mapping.
  • traditional visual SLAM mainly uses the feature points in the environment for mapping and localization.
  • the advantage is that feature points are easy to detect and track; the disadvantage is that for some man-made structures, such as corridor walls, considering only the feature point information of the environment seriously affects the accuracy of SLAM.
  • the embodiments of the present application provide a visual SLAM method and apparatus based on point-line features, which can fuse the feature point and feature line information in a visual image frame to improve the accuracy of visual SLAM.
  • an embodiment of the present application provides a visual SLAM method based on point-line features, applied to an imaging device that collects surrounding images, including: receiving a current visual image frame input by a camera; extracting feature points and feature lines of the current visual image frame; predicting a first pose of the imaging device by using the feature points; observing a first feature line to determine a feature line observation of the first feature line, where the first feature line is any one of the extracted feature lines; acquiring a global feature line state vector set in the current visual image frame, where the global feature line state vector set includes the feature line state vectors of N historical feature lines, N being a positive integer; and updating the first pose by using the feature line observation and the global feature line state vector set to obtain an updated first pose.
  • the above visual SLAM method combines feature-point-based motion estimation with the observations of feature lines in the environment to update the pose of the imaging device in real time.
  • compared with motion estimation from feature points alone in the prior art, the accuracy of the visual SLAM is improved; in addition, the observations of previously observed historical feature lines are taken into account, realizing a closed-loop constraint, improving robustness, and correspondingly improving the accuracy of the visual SLAM.
  • the method further includes: updating the global feature line state vector set by using the feature line observation and the first pose to obtain an updated global feature line state vector set.
  • the method further includes: traversing the N historical feature lines and sequentially calculating the Mahalanobis distance between each historical feature line and the first feature line to obtain N Mahalanobis distances; and the updating of the first pose by using the feature line observation and the global feature line state vector set to obtain the updated first pose includes: when the minimum of the N Mahalanobis distances is less than a preset threshold, updating the first pose by using the feature line observation and the global feature line state vector set to obtain the updated first pose.
  • the Mahalanobis distance is used in this design to determine whether the extracted feature line has already been observed; that is, when the minimum Mahalanobis distance is less than the preset threshold, the first feature line is considered to have been observed before, so that the feature line observation of the first feature line and the global feature line state vector set are used to update, in real time, the first pose estimated from the feature points.
  • the updating of the first pose by using the feature line observation and the global feature line state vector set to obtain the updated first pose includes: calculating a deviation between the feature line state vector of the feature line corresponding to the minimum Mahalanobis distance and the feature line observation, and updating the first pose and the global feature line state vector set by a filtering method based on the deviation.
  • the first pose and the global feature line state vector set can be updated to obtain an optimal value using an existing filtering method.
  • when the minimum of the N Mahalanobis distances is not less than the preset threshold, the extracted first feature line is a newly observed feature line; the feature line observation is therefore added to the global feature line state vector set to obtain an updated global feature line state vector set, optimizing the set.
  • the extracting of the feature lines of the current visual image frame may be implemented as follows: extract all line segments of the current visual image frame; if any two extracted line segments satisfy a first preset condition, merge the two line segments into a new line segment, until no line segments satisfying the first preset condition remain; if any two of the merged line segments satisfy a second preset condition, output the two merged line segments as the same feature line; if they do not satisfy the second preset condition, output the two line segments as two feature lines.
  • the above method for extracting feature lines can merge different line segments belonging to the same line and can eliminate repeated feature lines; compared with existing feature line extraction methods, it improves the accuracy and efficiency of feature line extraction and reduces redundant feature lines.
  • the merging of any two extracted line segments that satisfy the first preset condition into a new line segment includes: if the minimum distance between the endpoints of any two extracted line segments is less than a first preset value, the distance between the two line segments is less than a second preset value, and the angle between the two line segments is less than a third preset value, the two line segments are merged into a new line segment.
  • the outputting of any two merged line segments that satisfy the second preset condition as the same feature line includes: if the angle between any two merged line segments is less than a fourth preset value, the lengths of the two line segments are the same, the overlap of the two line segments is greater than a fifth preset value, and the distance between the two line segments is less than a sixth preset value, the two line segments are output as the same feature line.
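As a rough illustration of the first preset condition above, the following sketch checks whether a pair of 2-D segments should be merged; the threshold values and function names are hypothetical placeholders, not taken from the patent.

```python
import numpy as np

def endpoint_min_dist(s1, s2):
    """Minimum distance between any endpoint of s1 and any endpoint of s2."""
    return min(float(np.linalg.norm(p - q)) for p in s1 for q in s2)

def point_line_dist(p, seg):
    """Distance from point p to the infinite line through segment seg."""
    a, b = seg
    d = b - a
    # 2-D cross-product magnitude divided by length = perpendicular distance
    return abs(d[0] * (p - a)[1] - d[1] * (p - a)[0]) / float(np.linalg.norm(d))

def angle_deg(s1, s2):
    """Acute angle between the two segment directions, in degrees."""
    d1, d2 = s1[1] - s1[0], s2[1] - s2[0]
    c = abs(float(np.dot(d1, d2))) / (np.linalg.norm(d1) * np.linalg.norm(d2))
    return float(np.degrees(np.arccos(np.clip(c, 0.0, 1.0))))

def should_merge(s1, s2, d_end=5.0, d_line=2.0, ang=5.0):
    """True when the segments are close, nearly collinear, and nearly parallel."""
    mid2 = (s2[0] + s2[1]) / 2.0
    return (endpoint_min_dist(s1, s2) < d_end
            and point_line_dist(mid2, s1) < d_line
            and angle_deg(s1, s2) < ang)
```

Two nearly collinear segments whose endpoints are close pass all three tests and are merged; a perpendicular segment fails the angle test.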
  • the observing of the first feature line to determine the feature line observation of the first feature line includes: describing the extracted first feature line minimally by using orthogonalization parameters to obtain the feature line observation.
  • the acquiring of the global feature line state vector set in the current visual image frame includes: during the motion of the camera device, when the current visual image frame is a key frame and a feature line is observed in the current visual image frame, associating the currently observed feature line with previously observed historical feature lines, where a key frame is a frame in which a key action occurs during the motion of the camera device.
  • an embodiment of the present application provides a visual SLAM apparatus based on point-line features, applied to an imaging device for acquiring surrounding images, including: a receiving unit, configured to receive a current visual image frame input by a camera; an extracting unit, configured to extract the feature points and feature lines of the current visual image frame; a prediction unit, configured to predict a first pose of the imaging device by using the feature points; a determining unit, configured to observe a first feature line to determine a feature line observation of the first feature line, where the first feature line is any one of the extracted feature lines; an acquiring unit, configured to acquire a global feature line state vector set in the current visual image frame, where the global feature line state vector set includes the feature line state vectors of N historical feature lines, N being a positive integer; and an updating unit, configured to update the first pose by using the feature line observation and the global feature line state vector set to obtain the updated first pose.
  • the updating unit is further configured to: update the global feature line state vector set by using the feature line observation and the first pose to obtain an updated global feature line state vector set.
  • the determining unit is further configured to: traverse the N historical feature lines and sequentially calculate the Mahalanobis distance between each historical feature line and the first feature line to obtain N Mahalanobis distances.
  • the updating unit is specifically configured to: when the minimum of the N Mahalanobis distances is less than a preset threshold, update the first pose by using the feature line observation and the global feature line state vector set to obtain the updated first pose.
  • the updating unit is specifically configured to: calculate a deviation between the feature line state vector of the feature line corresponding to the minimum Mahalanobis distance and the feature line observation, and update the first pose and the global feature line state vector set by a filtering method based on the deviation.
  • the updating unit is further configured to: when the minimum of the N Mahalanobis distances is not less than a preset threshold, add the feature line observation to the global feature line state vector set to obtain the updated global feature line state vector set.
  • the extracting unit is specifically configured to: extract all line segments of the current visual image frame; if any two extracted line segments satisfy the first preset condition, merge the two line segments into a new line segment, until no line segments satisfying the first preset condition remain; if any two of the merged line segments satisfy the second preset condition, output the two merged line segments as the same feature line; if any two of the merged line segments do not satisfy the second preset condition, output the two line segments as two feature lines.
  • the extracting unit is specifically configured to: if the minimum distance between the endpoints of any two line segments is smaller than the first preset value, the distance between the two line segments is smaller than the second preset value, and the angle between the two line segments is smaller than the third preset value, merge the two line segments into one new line segment.
  • the extracting unit is specifically configured to: if the angle between any two merged line segments is less than the fourth preset value, the lengths of the two line segments are the same, the overlap between the two line segments is greater than the fifth preset value, and the distance between the two line segments is less than the sixth preset value, output the two line segments as the same feature line.
  • the determining unit is specifically configured to: describe the extracted feature line minimally by using orthogonalization parameters to obtain the feature line observation.
  • the acquiring unit is specifically configured to: during the motion of the imaging device, when the current visual image frame is a key frame and a feature line is observed in the current visual image frame, associate the currently observed feature line with previously observed historical feature lines, where a key frame is a frame in which a key action occurs during the motion of the camera device; for a successfully matched feature line, calculate the reprojection error between the currently observed feature line and each previously observed historical feature line, construct an objective function from the reprojection errors, and minimize the objective function to obtain the feature line state vector of the currently observed feature line, which is updated into the global feature line state vector set; for a feature line that fails to match, acquire the feature line state vector of the currently observed feature line and add it to the global feature line state vector set.
  • an embodiment of the present application provides a visual SLAM processing device based on point-line features, including a transceiver, a processor, and a memory, where the transceiver is configured to send and receive information, the memory is configured to store a program, instructions, or code, and the processor is configured to execute the program, instructions, or code in the memory to perform the method in the first aspect or any possible implementation of the first aspect.
  • the present application provides a computer readable storage medium having instructions stored therein that, when run on a computer, cause the computer to perform the method of the first aspect or any possible implementation of the first aspect.
  • the present application also provides a computer program product comprising instructions which, when run on a computer, cause the computer to perform the method of the first aspect or any possible implementation of the first aspect.
  • FIG. 1 is a block diagram of a visual SLAM system according to an embodiment of the present application.
  • FIG. 2 is a flowchart of a visual SLAM method based on point-line features in an embodiment of the present application.
  • FIG. 3A is a schematic diagram of the merging of line segments in an embodiment of the present application.
  • FIG. 3B is a schematic diagram of a method for extracting feature points of a current visual image frame in an embodiment of the present application.
  • FIG. 4A is a flowchart of a visual SLAM method in an embodiment of the present application.
  • FIG. 4B is a schematic diagram of the matching process of feature points and feature lines in an embodiment of the present application.
  • FIG. 5 is a structural diagram of a visual SLAM apparatus based on point-line features in an embodiment of the present application.
  • FIG. 6 is a structural diagram of a visual SLAM processing apparatus based on point-line features in an embodiment of the present application.
  • Visual SLAM refers to using external image information to determine the position of a robot, vehicle, or mobile camera in an environment and to build a representation of the explored area.
  • Visual odometry: also known as the front end, its task is to estimate the motion of the camera between adjacent images and to build a local map.
  • a feature point-based visual odometer first extracts the feature points of two successive key frames and then performs feature point matching; after matching, two one-to-one corresponding pixel point sets are obtained. The next step is to calculate the camera motion from the two sets of matched points. With monocular ORB features this is a typical 2D-2D case, solved with an epipolar geometry approach; in this process, the 3D spatial position information of these pixels is not used.
  • the spatial position of each feature point can be calculated based on the motion information, which is also called triangulation.
  • an essential matrix is estimated and then decomposed to obtain the rotation and translation between the two image frames, which realizes the visual odometer.
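The triangulation mentioned above can be sketched with the standard linear (DLT) method: each view contributes two rows to a homogeneous system, whose SVD null vector is the 3-D point. This is a generic illustration, not the patent's specific implementation.

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation of a single point from two views.
    P1, P2 are 3x4 projection matrices; x1, x2 are (u, v) pixel coordinates."""
    A = np.vstack([
        x1[0] * P1[2] - P1[0],   # u1 * row3 - row1 of view 1
        x1[1] * P1[2] - P1[1],   # v1 * row3 - row2 of view 1
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]                   # null vector = homogeneous 3-D point
    return X[:3] / X[3]
```

On noise-free synthetic data (a point projected into two cameras with a known baseline), the recovered point matches the original.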
  • Pose: position and attitude; the position is the translation along the x, y, z directions of the coordinate system, and the attitude is the rotation about the x, y, z directions of the coordinate system.
  • Key frame: a video frame in the sequence that differs greatly from the preceding frames and represents a new location. Key frames are also used to efficiently estimate the camera pose and reduce the redundancy of the information.
  • Visual bag-of-words model: visual SLAM uses the bag-of-words model to index feature points and quickly find similar images.
  • Mahalanobis distance: a covariance-weighted distance that effectively measures the similarity of two unknown sample sets. If the covariance matrix is the identity matrix, the Mahalanobis distance reduces to the Euclidean distance.
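A minimal sketch of the Mahalanobis distance as defined above; with the identity covariance it reduces to the Euclidean distance.

```python
import numpy as np

def mahalanobis(x, y, cov):
    """Mahalanobis distance between vectors x and y under covariance cov."""
    d = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(np.sqrt(d @ np.linalg.inv(cov) @ d))
```

With cov = I the result is the plain Euclidean distance; a larger covariance shrinks the distance along the corresponding directions.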
  • Graph optimization: a way to represent an optimization problem as a graph, here in the sense of graph theory.
  • a graph consists of several vertices and the edges connecting them; the vertices represent the optimization variables and the edges represent the error terms. Thus, for any nonlinear least squares problem of the above form, a corresponding graph can be constructed.
  • Graph optimization is also known as the back end.
  • Bundle adjustment refers to extracting the optimal 3D model and camera parameters (including intrinsic and extrinsic parameters) in visual reconstruction: after the camera poses and feature point spatial positions are optimally adjusted, the bundles of light rays reflected from each feature point finally converge at the camera optical center.
  • Extended Kalman filter: the Kalman filter is a highly efficient recursive filter that estimates the state of a dynamic system from a series of incomplete and noisy measurements. When the state equation or the measurement equation is nonlinear, the extended Kalman filter (EKF) is usually used to estimate the state of the dynamic system. The EKF truncates the Taylor expansion of the nonlinear function at first order, ignoring the higher-order terms, thereby transforming the nonlinear problem into a linear one so that the Kalman linear filtering algorithm can be applied to the nonlinear system.
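A minimal sketch of one EKF-style measurement update as described above (innovation, gain, state and covariance update); the function name and interface are illustrative, not taken from the patent.

```python
import numpy as np

def ekf_update(x, P, z, h, H, R):
    """One EKF measurement update.
    x, P : prior state estimate and covariance
    z    : observation; h(x) is the predicted measurement; H is the Jacobian
           of h evaluated at x (the first-order linearization)
    R    : measurement noise covariance"""
    y = z - h(x)                      # innovation
    S = H @ P @ H.T + R               # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)    # Kalman gain
    x_new = x + K @ y
    P_new = (np.eye(len(x)) - K @ H) @ P
    return x_new, P_new
```

For a scalar state with unit prior and measurement variance, the update splits the innovation evenly and halves the posterior variance.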
  • EKF Extended kalman filter
  • Drift error: owing to sensor measurement noise, the estimation error at a previous time is carried into the motion estimate at the next time; this phenomenon is called drift, and the resulting error is called drift error.
  • the embodiment of the present application is an improvement on the traditional SLAM, and provides a visual SLAM method and device based on the dotted line feature, which can integrate the feature points and feature line information in the visual image frame, and improve the accuracy of the visual SLAM.
  • the visual SLAM scheme based on the dotted line feature in the embodiment can be applied to the construction and positioning of the automatic driving, the mobile robot, the drone, and also can be used for the augmented reality and the virtual reality scene of the mobile terminal.
  • the embodiment of the present application provides a visual SLAM system, which specifically includes: a feature tracking module, a local map module, a closed loop detection module, and a global map module.
  • after receiving a video image frame, the feature tracking module reads and pre-processes it to extract the feature points and feature lines of the video image frame; similar feature points and feature lines in consecutive video image frames are searched for association matching, and the motion of the imaging device is estimated to achieve pose tracking of the camera device.
  • the main task of the feature tracking module is to output the pose of the camera device in real time and filter the key frames to complete the motion estimation of the camera device.
  • the local map module mainly selects the key frames in a local scope, calculates the point cloud information of the key frames, and constructs the local map; a heterogeneous map of feature points and feature lines is obtained through the local BA optimization algorithm.
  • the main task of the closed-loop detection module is to determine whether the current shooting scene of the camera device has been visited before.
  • the closed loop detection can effectively eliminate the accumulated drift error caused by the motion of the camera device.
  • the main steps are as follows:
  • the first step is to use the bag-of-words model to perform closed-loop detection on the observed feature points; the bag-of-words model is used to calculate the similarity between the current key frame and candidate key frames.
  • the second step is to determine the Mahalanobis distance of the feature lines from their covariance, realizing closed-loop recognition of the feature lines in the environment.
  • the third step is to combine the two closed-loop detections to obtain a more robust result.
  • the main task of the global map module is to obtain all key frames on the entire motion path, calculate a globally consistent trajectory and map, optimize all key frames, feature points, and feature lines after closed-loop detection by using the global BA optimization algorithm, and update the optimized global feature line state vector set into the feature tracking module.
  • the embodiment of the present application provides a visual SLAM method based on point-line features, which is applied to an imaging device for acquiring surrounding images and includes the following steps:
  • Step 20 Receive the current visual image frame input by the camera.
  • Step 21 Extract feature points and feature lines of the current visual image frame.
  • a feature point refers to an environment element in the form of a point in the environment where the imaging device is located; a feature line refers to an environment element in the form of a line in that environment.
  • in the embodiment of the present application, feature extraction and description of the input current visual image frame are performed by the oriented FAST and rotated BRIEF (ORB) algorithm: feature point extraction is implemented by an improved features from accelerated segment test (FAST) algorithm, and feature point description is implemented by the binary robust independent elementary features (BRIEF) algorithm.
  • the improved FAST algorithm is as follows:
  • Step 1 Rough extraction. This step can extract a large number of feature points, but a large part of the feature points are not of high quality.
  • the extraction method is described below.
  • a point P is selected from the image. To determine whether P is a feature point, a circle with a radius of 3 pixels is drawn with P as the center; if the gray values of n consecutive pixels on the circle are all larger or all smaller than the gray value of P, P is considered a feature point. Generally n is set to 12.
  • to speed up the extraction, the gray values at positions 1, 9, 5, and 13 on the circle are detected first; if P is a feature point, at least 3 of these four positions must have gray values all greater than or all smaller than the gray value of P, otherwise the point is directly discarded.
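The segment test described above can be sketched as follows. The 16 circle-pixel offsets follow the standard FAST formulation; the threshold t and the synthetic test image are illustrative assumptions (an isolated bright pixel is used only to exercise the test, not as a realistic corner).

```python
import numpy as np

# Offsets of the 16 pixels on the Bresenham circle of radius 3 used by FAST,
# as (row, col) displacements from the candidate pixel P.
CIRCLE = [(0, 3), (1, 3), (2, 2), (3, 1), (3, 0), (3, -1), (2, -2), (1, -3),
          (0, -3), (-1, -3), (-2, -2), (-3, -1), (-3, 0), (-3, 1), (-2, 2), (-1, 3)]

def is_fast_corner(img, r, c, t=20, n=12):
    """Segment test: P is a corner if n contiguous circle pixels are all
    brighter than I(P) + t or all darker than I(P) - t."""
    p = int(img[r, c])
    ring = [int(img[r + dr, c + dc]) for dr, dc in CIRCLE]
    doubled = ring + ring  # duplicate the ring so wrap-around runs are counted
    for test in (lambda v: v > p + t, lambda v: v < p - t):
        run = 0
        for v in doubled:
            run = run + 1 if test(v) else 0
            if run >= n:
                return True
    return False
```

A bright pixel on a dark background passes the "all darker" branch of the test; a flat patch passes neither.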
  • Step 2 Machine learning screens the optimal feature points. Simply put, an ID3 algorithm is used to train a decision tree, and the 16 pixels on the circle around a feature point are input into the decision tree to filter out the optimal FAST feature points.
  • Step 3 Non-maximum value suppression removes locally dense feature points.
  • a non-maximum suppression algorithm is used to remove multiple feature points in adjacent locations.
  • the response size is calculated for each feature point.
  • the response is computed as the sum of the absolute deviations between feature point P and the 16 pixels around it; among adjacent feature points, the one with the larger response value is retained and the rest are deleted.
  • Step 5 Rotation invariance of feature points.
  • the ORB algorithm proposes using the moment method to determine the direction of FAST feature points; that is, the centroid of the image patch within radius r of a feature point is computed from its moments, and the vector from the feature point to the centroid gives the direction of the feature point.
  • the moments are defined as m_pq = Σ_(x,y) x^p y^q I(x, y), p, q ∈ {0, 1}, where I(x, y) is the image grayscale at (x, y).
  • the centroid of the moments is C = (m_10 / m_00, m_01 / m_00), and the direction of the feature point is θ = arctan(m_01 / m_10).
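Assuming the standard ORB intensity-centroid formulation, the direction computed from the moments above can be sketched as follows (coordinates are taken relative to the patch center, with image y growing downward):

```python
import numpy as np

def orientation(patch):
    """Direction of a feature point from the intensity centroid of its patch:
    m_pq = sum over (x, y) of x^p * y^q * I(x, y), theta = atan2(m01, m10)."""
    h, w = patch.shape
    ys, xs = np.mgrid[0:h, 0:w].astype(float)
    xs -= (w - 1) / 2.0   # x relative to patch center
    ys -= (h - 1) / 2.0   # y relative to patch center
    m10 = float((xs * patch).sum())
    m01 = float((ys * patch).sum())
    return float(np.arctan2(m01, m10))
```

Intensity mass to the right of the center yields angle 0; mass below the center yields pi/2.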
  • in step 21, the extraction of the feature lines of the current visual image frame may be implemented by the following process:
  • S1 All line segments of the image frame are extracted.
  • all line segments of the image frame may be extracted by using an existing line segment detector (LSD) method.
  • LSD line segment detector
  • the LSD extraction algorithm is as follows:
  • any two line segments are combined into one new line segment.
  • the above steps can combine multiple line segments belonging to the same line.
  • l represents the minimum distance between the end points of the two line segments
  • d represents the distance between the middle point of one line segment and another line segment
  • the two line segments are considered to belong to the same line, and the line segments are merged into a new line segment.
  • any two line segments are output as the same feature line.
  • the distance between the line binary descriptors (LBD) of l 1 and l 2 is less than a certain threshold.
  • the line segments in the embodiment of the present application are described by the LBD method; therefore, the distance between two line segments is represented by the distance between their LBD descriptors.
  • Step 22 Predict the first pose of the imaging device by using the feature points.
  • the imaging device is a binocular camera
  • the motion of the camera is estimated by a perspective-n-point (PnP) algorithm and solved with a nonlinear optimization method to obtain the first pose, that is, an estimate of the rotation R and the translation T of the imaging device.
  • Step 23 Observing the first feature line to determine feature line observation of the first feature line, wherein the first feature line is any one of the extracted feature lines.
  • the extracted first feature line is described minimally by using orthogonalization parameters, and the feature line observation of the first feature line is obtained.
  • Step 24 Acquire a global feature line state vector set in the current visual image frame, where the global feature line state vector set includes a feature line state vector of the N historical feature line, where N is a positive integer.
  • the global feature line state vector set is obtained by performing closed-loop detection and global optimization on a feature line outputted by a key frame in a continuous visual image frame during the motion of the image capturing apparatus.
  • acquiring a set of global feature line state vectors in a current visual image frame includes: when a feature line is observed in a key frame during the motion of the camera device, associating and matching the newly observed feature line with previously observed historical feature lines; for a successfully matched feature line, calculating the reprojection error between the currently observed feature line and each previously observed historical feature line, constructing an objective function from the reprojection errors, and minimizing the objective function to obtain the feature line state vector of the currently observed feature line, which is updated into the global feature line state vector set; for a feature line that fails to match, acquiring the feature line state vector of the currently observed feature line and adding it to the set of global feature line state vectors.
  • The reprojection error refers to the error between the projected point (theoretical value) and the measured point on the image.
  • For example, in camera calibration, the physical point on the calibration board provides the theoretical value: projecting it yields a theoretical pixel point a, while the measured point, after distortion correction, is a pixel point a'; the Euclidean distance between a and a' gives the reprojection error.
  • A frame is a single picture in a video. A key frame, also called an I-frame, is the most important frame in interframe compression coding; a key frame is analogous to an original drawing in two-dimensional animation and refers to a frame in which a key motion or change of a character or object occurs.
  • Step 25: Update the first pose by using the feature line observation and the global feature line state vector set to obtain an updated first pose.
  • In addition to updating the first pose to obtain the updated first pose, the feature line observation and the first pose can also be used to update the global feature line state vector set, to obtain an updated global feature line state vector set.
  • Before step 25, it is necessary to determine whether the observed first feature line has already been observed, which can be implemented by the following process:
  • S51: Traverse the feature line state vectors in the global feature line state vector set, and sequentially calculate the Mahalanobis distance between each feature line state vector and the feature line observation, obtaining N Mahalanobis distances.
  • S52: When the minimum of the N Mahalanobis distances is less than a preset threshold, the first pose of the imaging device is updated by using the feature line observation and the global feature line state vector set: the deviation between the feature line state vector corresponding to the minimum Mahalanobis distance and the feature line observation is calculated, and based on the deviation, the first pose and the global feature line state vector set are updated by a filtering method.
  • Optionally, the residual between the feature line state vector and the feature line observation is used as the deviation between the feature line state vector and the feature line observation.
  • If it is determined in the foregoing step S52 that the observed first feature line is a newly observed feature line, the feature line observation is added to the global feature line state vector set, and the global feature line state vector set is updated.
  • The visual SLAM method in the embodiment of the present application combines feature-point-based motion estimation with the observations of feature lines observed in the environment to update the pose of the camera device in real time. It further takes into account the observations of previously observed historical feature lines, thereby realizing a closed-loop constraint, improving robustness, and improving the accuracy of visual SLAM.
  • The implementation process of FIG. 2 is described in detail below through an automatic driving scenario with a binocular camera; the specific implementation process is shown in FIG. 4A.
  • Step 1: Obtain the binocular visual image input for automatic driving.
  • Step 2: Extract feature points and feature lines from the acquired binocular visual images.
  • The feature tracking module takes the corrected binocular visual image sequence as input; for each input visual image frame, four threads are started simultaneously to extract feature points and feature lines from the left-eye and right-eye visual images.
  • The ORB method is used for feature point extraction and description.
  • The feature lines are extracted by an improved LSD-based method and described by the LBD descriptor. After that, two threads are started. One thread matches the extracted feature points: if the same feature point exists in both the left-eye and right-eye images, it is a binocular feature point, and the other feature points are monocular feature points. The other thread matches the feature lines: if the same feature line is found in both the left-eye and right-eye images, it is a binocular feature line, and the other, unmatched feature lines are monocular feature lines.
  • Step 3: Estimate the camera motion between adjacent images by using the feature points, to obtain the motion estimation of the camera.
  • Specifically, based on step 2, the feature points of the two successive video image frames are extracted to obtain two sets of feature points; the two sets of feature points are matched to estimate the motion of the camera and obtain the motion equation of the camera. A nonlinear optimization method is used to solve iteratively for the motion estimation of the camera. The motion estimation of the camera is represented by the rotation q and the translation p, that is, the pose of the camera.
  • The camera's equation of motion is expressed as follows, where x(t) is the pose of the camera motion, G denotes the global coordinate system, and C denotes the camera coordinate system. The pose of the camera in the global coordinate system is represented by a quaternion, G P C represents the position of the camera in the global coordinate system, and F is the state transition matrix of the camera. The rotation matrix R is initialized to the identity matrix at the first image frame, and the position P is initialized to 0.
  • Step 4: Construct the observation equation of the camera by observing the extracted feature lines. Specifically, step 41 and step 42 are included.
  • Step 41: For the feature lines extracted in step 2, use orthogonalization parameters to obtain a minimal observation description. The method of orthogonalizing the feature lines is as follows. The feature line is described by Plücker coordinates: L f is a six-dimensional vector composed of two three-dimensional vectors n and v, where v represents the direction vector X 1 - X 2 of the feature line and n represents the normal vector of the plane defined by the feature line and the camera center.
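The Plücker description above can be sketched in a few lines. This is a minimal illustration, not the patent's implementation; the helper name is hypothetical, and it follows the text's convention that v = X1 - X2 and n is the normal of the plane spanned by the line and the camera center (taken as the origin). The Plücker constraint n · v = 0 holds by construction.

```python
import numpy as np

def plucker_from_points(X1, X2):
    """Build the 6-D Plücker coordinates L_f = (n, v) of the 3-D line
    through points X1 and X2 (illustrative helper, not from the patent).
    v: direction vector X1 - X2; n: normal of the plane through the line
    and the camera center (origin)."""
    X1 = np.asarray(X1, dtype=float)
    X2 = np.asarray(X2, dtype=float)
    v = X1 - X2            # line direction, as in the text
    n = np.cross(X1, X2)   # plane normal; orthogonal to v by construction
    return np.hstack([n, v])
```

Because n is normal to a plane containing the line, n is always orthogonal to v, which is the defining constraint of Plücker line coordinates.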
  • The camera's observed state vector x (comprising the camera's pose and the feature line state vector) has the following expression, where the rotation and translation of the camera are given respectively, G L f denotes the feature line state vector, G in the superscripts and subscripts denotes the global coordinate system, C denotes the camera, f denotes the feature line, and L denotes the line itself (Line):
  • Step 42: Construct the observation equation of the feature line, where H L represents the Jacobian matrix of the feature line observation, H n represents the Jacobian matrix of the feature line observation noise, and K is the intrinsic parameter matrix of the camera; the observation Jacobian matrix constitutes the observation equation of the camera.
  • Step 5: Observe the feature lines in the environment, perform global closed-loop detection on the observed feature lines, and perform global optimization to obtain the global feature line state vector set, including the following steps.
  • Step 51: Perform data association on the feature lines by using the Mahalanobis distance between feature lines.
  • When a feature line is observed, it needs to be associated with the already-observed feature line state vectors to determine whether it is a new feature line or an already observed feature line.
  • The degree of association between two feature lines is quickly computed by the Mahalanobis distance; the specific algorithm is as follows, where R is the covariance matrix of the measurement noise, as shown below.
  • The already-observed feature line state vectors are traversed and the corresponding Mahalanobis distances are calculated, from which the minimum Mahalanobis distance is selected. If this Mahalanobis distance is less than a set threshold, it is confirmed that the feature line has been observed before and closed-loop detection is achieved; otherwise, a new feature line is initialized.
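The gating logic of step 51 can be sketched as follows. This is a hedged sketch: the patent does not give the exact formula, so the standard innovation-covariance form S = H P Hᵀ + R is assumed, and all names (residuals, Jacobians, gate threshold) are illustrative.

```python
import numpy as np

def mahalanobis_sq(residual, H, P, R):
    """Squared Mahalanobis distance of a feature-line residual.
    S = H P H^T + R is the assumed innovation covariance (H: observation
    Jacobian, P: state covariance, R: measurement-noise covariance)."""
    S = H @ P @ H.T + R
    r = np.asarray(residual, dtype=float)
    return float(r @ np.linalg.solve(S, r))

def associate(residuals, Hs, P, R, gate):
    """Return the index of the minimum-distance candidate if it passes the
    gate threshold, else None (meaning: initialize a new feature line)."""
    d = [mahalanobis_sq(r, H, P, R) for r, H in zip(residuals, Hs)]
    i = int(np.argmin(d))
    return i if d[i] < gate else None
```

A small residual against a candidate passes the gate (the feature line is a re-observation, closing the loop); a large residual fails it, triggering initialization of a new feature line.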
  • Step 52: Perform global optimization on the observed feature lines.
  • Specifically, an existing global optimizer (such as g2o) is used to estimate all the motion poses of the camera, and the observed feature lines are globally optimized.
  • The closed-loop detection result of step 51 is also one of the inputs to the optimizer.
  • The feature line observation of the present application is one of the constraints of the global optimizer; the global target optimization function and the Jacobian matrix of the feature line observation are calculated, and the derivation process is as follows:
  • With the pose of the camera denoted T kw , point-pose edges and line-pose edges are constructed as the two kinds of edges of the graph during front-end data association, and the reprojection error of a point edge is expressed as follows:
  • Ep k,i = x k,i - n(KT kw X w,i )
  • where x k,i is the position of the point in the image coordinate system, and n(.) is the transformation from homogeneous to inhomogeneous coordinates.
  • The global optimization objective function C can be obtained by the following equation, where Σ p -1 , Σ l -1 are the inverse covariance (information) matrices of the points and lines, respectively, and ρ p , ρ l are cost functions.
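The point reprojection residual that enters this objective can be sketched directly from the formula Ep k,i = x k,i - n(K T kw X w,i). A minimal sketch with illustrative names; T_kw is assumed to be a 4x4 homogeneous world-to-camera transform and K the 3x3 intrinsic matrix.

```python
import numpy as np

def hnorm(x):
    """n(.): homogeneous -> inhomogeneous coordinates."""
    return x[:-1] / x[-1]

def point_reproj_error(x_ki, K, T_kw, X_wi):
    """Residual Ep = x_{k,i} - n(K T_kw X_{w,i}) for one point edge."""
    Xc = (T_kw @ np.append(X_wi, 1.0))[:3]   # world point in camera frame
    return np.asarray(x_ki, dtype=float) - hnorm(K @ Xc)
```

A perfectly predicted measurement gives a zero residual; the optimizer sums (robustified, information-weighted) squares of such residuals over all point and line edges.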
  • By minimizing this objective function, the feature line state vector of the currently observed feature line can be optimized and updated into the global feature line state vector set.
  • To minimize the objective function, its Jacobian matrix needs to be solved.
  • where u i is the i-th column of U, and exp(δθ∧) is the Lie group element corresponding to the Lie algebra element δθ∧. The derivative with respect to δθ can then be calculated as follows:
  • Step 53: Update the global feature line state vector set according to the result of step 52.
  • Step 6: Using a filtering-based SLAM method, with the motion estimation from the feature points, update the observation state of the camera with the feature line observations to realize motion tracking of the camera.
  • The specific implementation of step 6 is as follows:
  • Step 61: Obtain the pose of the camera according to step 3. Specifically, according to the feature points observed at time T and the feature points observed at time T-1, the visual odometry method is used to match the feature points of the two successive visual image frames, obtaining the pose estimate from time T-1 to time T.
  • Step 62: If, at time T, the back end has an update to the global feature line state vector set, the global feature line state vector set is updated.
  • Step 63: The m feature lines observed at time T are expressed as shown in the corresponding formula. Using each observed feature line and the global feature line state vector set G L f , calculate the observation residual and the Jacobian matrix H L , and determine whether an observed feature line has already been observed from the degree of data association of the line segments, as measured by the Mahalanobis distance.
  • Step 64: If it is a newly observed feature line, add the feature line to the global feature line state vector set G L f to update the state vector of the camera; if it is an already observed feature line, calculate the observation residual z̃ and Jacobian matrix H L of the feature line.
  • Step 65: Based on the pose estimate at time T, using the observation residual z̃ and Jacobian matrix H L of the feature line, the covariance matrix R of the observation noise, and the Jacobian matrix H n , optimize the state vector of the camera.
  • Taking the EKF filtering method as an example, the state covariance estimate of the camera is calculated iteratively from the camera's state covariance matrix P X and the camera's state transition matrix F. Then, using the observation Jacobian matrix H L of the feature line, the state covariance estimate of the camera, the observation noise covariance matrix R, and the noise Jacobian matrix H n , an iteratively updated Kalman gain is obtained. The updated Kalman gain and the observation Jacobian matrix H L of the feature line are used to update the state covariance matrix, realizing the iterative update of the Kalman gain and the state covariance matrix P X ; the state vector of the camera is then optimized by the updated Kalman gain and the observation residual z̃ of the feature line.
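The EKF measurement update described above can be sketched as follows. This is a minimal, generic sketch: the noise Jacobian H_n is folded into a single measurement-noise covariance R here, the state is treated as a plain vector (a real implementation would handle quaternions on the manifold), and all names are illustrative.

```python
import numpy as np

def ekf_update(x, P, z_res, H_L, R):
    """One EKF measurement update with feature-line residual z_res.
    K = P H^T (H P H^T + R)^-1;  x <- x + K z_res;  P <- (I - K H) P."""
    S = H_L @ P @ H_L.T + R            # innovation covariance
    K = P @ H_L.T @ np.linalg.inv(S)   # Kalman gain
    x_new = x + K @ z_res              # corrected state
    P_new = (np.eye(len(x)) - K @ H_L) @ P   # covariance shrinks
    return x_new, P_new
```

After the update, the state moves toward the measurement and the covariance shrinks, which is exactly the iterative gain/covariance refresh the text describes.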
  • The existing visual SLAM schemes use the visual odometry method to estimate the camera motion and consider neither the closed-loop constraint nor the error of the map observation.
  • The feature tracking module of the present application combines feature-point-based inter-frame camera motion estimation and environment feature line observation vectors within a filtering framework. The feature line map is maintained in the filtering framework, and the feature line map and the camera pose observation are optimized simultaneously; real-time correlation between the line map and the camera motion is maintained to ensure real-time performance and robust closed-loop detection.
  • The closed-loop detection of SLAM improves the accuracy of the motion tracking estimation.
  • In addition, the orthogonal representation is used to minimally parameterize the projection error of the 3D line features, and the Jacobian matrix of the observed variables is calculated analytically, which reduces the number of optimization variables and improves the accuracy, stability, and efficiency of the back-end optimization.
  • The present application further provides a visual SLAM device based on point-line features, applied to an image capturing device that acquires surrounding images; the device can be used to execute the corresponding method embodiments in FIG. 2 to FIG. 4B above.
  • For the implementation of the visual SLAM device provided by this embodiment, reference may be made to the implementation of the method; repeated description is omitted.
  • An embodiment of the present application provides a visual SLAM device 500 based on point-line features, including a receiving unit 510, an extracting unit 520, a predicting unit 530, a determining unit 540, an obtaining unit 550, and an updating unit 560, where:
  • the receiving unit 510 is configured to receive a current visual image frame input by the camera
  • An extracting unit 520 configured to extract feature points and feature lines of the current visual image frame
  • a prediction unit 530 configured to predict, by using the feature point, a first pose of the imaging device
  • a determining unit 540, configured to observe a first feature line to determine the feature line observation of the first feature line, where the first feature line is any one of the extracted feature lines;
  • the obtaining unit 550 is configured to acquire a global feature line state vector set in the current visual image frame, where the global feature line state vector set includes a feature line state vector of N historical feature lines, where N is a positive integer;
  • the updating unit 560 is configured to update the first pose by using the feature line observation and the global feature line state vector set to obtain an updated first pose.
  • the updating unit 560 is further configured to:
  • update the global feature line state vector set by using the feature line observation and the first pose, to obtain an updated global feature line state vector set.
  • the determining unit 540 is further configured to:
  • The updating unit 560 is specifically configured to: when the minimum of the N Mahalanobis distances is less than a preset threshold, update the first pose by using the feature line observation and the global feature line state vector set, to obtain the updated first pose.
  • the updating unit 560 is specifically configured to:
  • calculate the deviation between the feature line state vector corresponding to the minimum Mahalanobis distance and the feature line observation, and based on the deviation, update the first pose and the global feature line state vector set by a filtering method.
  • The updating unit 560 is further configured to: when the minimum of the N Mahalanobis distances is not less than a preset threshold, add the feature line observation to the global feature line state vector set, to obtain the updated global feature line state vector set.
  • The extracting unit 520 is specifically configured to: extract all line segments of the current visual image frame; if any two extracted line segments satisfy a first preset condition, merge the two line segments into a new line segment, until no line segment satisfying the first preset condition remains; if any two merged line segments satisfy a second preset condition, output the two merged line segments as the same feature line; if any two merged line segments do not satisfy the second preset condition, output the two line segments as two feature lines.
  • In the aspect of merging, the extracting unit 520 is specifically configured to: if the minimum distance between the endpoints of any two extracted line segments is less than a first preset value, the distance between the two line segments is less than a second preset value, and the angle between the two line segments is less than a third preset value, merge the two line segments into a new line segment.
  • In the aspect of outputting the same feature line, the extracting unit 520 is specifically configured to: if the angle between any two merged line segments is less than a fourth preset value, the lengths of the two line segments are the same, the overlap of the two line segments is greater than a fifth preset value, and the distance between the two line segments is less than a sixth preset value, output the two line segments as the same feature line.
  • The determining unit 540 is specifically configured to: minimally describe the extracted feature line using orthogonalization parameters, to obtain the feature line observation.
  • The obtaining unit 550 is specifically configured to: during motion of the camera device, when the current visual image frame is a key frame and a feature line is observed in the current visual image frame, perform association matching between the currently observed feature line and previously observed historical feature lines, where a key frame is a frame in which a key action occurs during the motion of the camera device; for a feature line that matches successfully, calculate the reprojection error between the currently observed feature line and each previously observed historical feature line, construct an objective function from the reprojection errors, minimize the objective function to obtain the feature line state vector of the currently observed feature line, and update it into the global feature line state vector set; for a feature line that fails to match, acquire the feature line state vector of the currently observed feature line and add it to the global feature line state vector set.
  • An embodiment of the present application provides a visual SLAM processing device 600 based on point-line features, including a transceiver 610, a processor 620, and a memory 630.
  • the memory 630 is configured to store a program, an instruction, or a code.
  • the processor 620 is configured to execute a program, an instruction, or a code in the memory 630;
  • the transceiver 610 is configured to receive a current visual image frame input by the camera;
  • The processor 620 is configured to: extract feature points and feature lines of the current visual image frame; predict the first pose of the imaging device by using the feature points; observe a first feature line to determine the feature line observation of the first feature line, where the first feature line is any one of the extracted feature lines; acquire a global feature line state vector set in the current visual image frame, where the global feature line state vector set includes the feature line state vectors of N historical feature lines, N being a positive integer; and update the first pose by using the feature line observation and the global feature line state vector set, to obtain the updated first pose.
  • the processor 620 may be an integrated circuit chip with signal processing capabilities. In the implementation process, each step of the above method may be completed by an integrated logic circuit of hardware in the processor 620 or an instruction in a form of software.
  • The processor 620 described above may be a general purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • the methods, steps, and logical block diagrams disclosed in the embodiments of the present application can be implemented or executed.
  • The general purpose processor may be a microprocessor, or any conventional processor or the like.
  • the steps of the method disclosed in the embodiments of the present application may be directly implemented by the hardware decoding processor, or may be performed by a combination of hardware and software modules in the decoding processor.
  • the software module can be located in a conventional storage medium such as random access memory, flash memory, read only memory, programmable read only memory or electrically erasable programmable memory, registers, and the like.
  • the storage medium is located in memory 630, and processor 620 reads the information in memory 630 and performs the above method steps in conjunction with its hardware.
  • embodiments of the present application can be provided as a method, system, or computer program product. Therefore, the embodiments of the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, embodiments of the present application can take the form of a computer program product embodied on one or more computer usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) including computer usable program code.
  • Embodiments of the present application are described with reference to flowchart illustrations and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the present application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks therein, can be implemented by computer program instructions.
  • These computer program instructions can be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce an apparatus for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
  • The computer program instructions can also be stored in a computer readable memory that can direct a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer readable memory produce an article of manufacture including an instruction apparatus, the instruction apparatus implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
  • These computer program instructions can also be loaded onto a computer or other programmable data processing device, such that a series of operational steps are performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

Abstract

A visual SLAM method and device based on point-line features, capable of integrating the feature point and feature line information in visual image frames to improve the accuracy of visual SLAM. The method comprises: receiving a visual image frame input by a camera; extracting feature points and feature lines of the visual image frame; predicting a first pose of the imaging device by using the feature points; observing an extracted first feature line to determine the feature line observation of the first feature line; acquiring a global feature line state vector set of the imaging device, the global feature line state vector set including the feature line state vectors of N historical feature lines, N being a positive integer; and updating the first pose by using the feature line observation and the global feature line state vector set. In this way, feature-point-based motion estimation is fused with the observations of feature lines observed in the environment to update the pose of the imaging device in real time, improving the accuracy of visual SLAM.

Description

Visual SLAM method and device based on point-line features
This application claims priority to the Chinese patent application filed with the Chinese Patent Office on September 22, 2017, with application number 201710868034.6 and entitled "Visual SLAM Method and Device Based on Point-Line Features", the entire contents of which are incorporated herein by reference, and claims priority to the Chinese patent application filed with the Chinese Patent Office on March 6, 2018, with application number 201810184021.1 and entitled "Visual SLAM Method and Device Based on Point-Line Features", the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to the field of image processing technologies, and in particular, to a visual SLAM method and device based on point-line features.
Background
At present, simultaneous localization and mapping (SLAM) is a fundamental problem and research hotspot in autonomous navigation. Its goal is to solve the problem of how, after entering an unknown environment, to perceive the surroundings, build an incremental map, and perform self-localization at the same time. There are many kinds of sensors for perceiving the surrounding environment; thanks to the low cost, small size, and easy installation of cameras, visual SLAM methods have become an important research topic in the field.
Traditional visual SLAM theory mainly uses feature points in the environment for mapping and localization. The advantage is that feature points are easy to detect and track; the disadvantage is that for some man-made environments, such as corridor walls, considering only the feature point information of the environment severely affects the accuracy of SLAM.
Summary
Embodiments of the present application provide a visual SLAM method and device based on point-line features, capable of integrating the feature point and feature line information in visual image frames to improve the accuracy of visual SLAM.
The specific technical solutions provided by the embodiments of the present application are as follows:
In a first aspect, an embodiment of the present application provides a visual SLAM method based on point-line features, applied to an imaging device that acquires surrounding images, including: receiving a current visual image frame input by a camera; extracting feature points and feature lines of the current visual image frame; predicting a first pose of the imaging device by using the feature points; observing a first feature line to determine a feature line observation of the first feature line, where the first feature line is any one of the extracted feature lines; acquiring a global feature line state vector set in the current visual image frame, the global feature line state vector set including the feature line state vectors of N historical feature lines, N being a positive integer; and updating the first pose by using the feature line observation and the global feature line state vector set, to obtain an updated first pose.
The above visual SLAM method fuses feature-point-based motion estimation with the observations of feature lines observed in the environment to update the pose of the imaging device in real time. Compared with the prior art, which considers only feature-point-based motion estimation, this improves the accuracy of visual SLAM to a certain extent. In addition, the observations of already-observed historical feature lines are taken into account, realizing a closed-loop constraint, improving robustness, and correspondingly improving the accuracy of visual SLAM.
In a possible design, the method further includes: updating the global feature line state vector set by using the feature line observation and the first pose, to obtain an updated global feature line state vector set.
In a possible design, the method further includes: traversing the N historical feature lines and sequentially calculating the Mahalanobis distance between each historical feature line and the first feature line, obtaining N Mahalanobis distances. Updating the first pose by using the feature line observation and the global feature line state vector set to obtain the updated first pose includes: when the minimum of the N Mahalanobis distances is less than a preset threshold, updating the first pose by using the feature line observation and the global feature line state vector set, to obtain the updated first pose. It should be understood that in this design the Mahalanobis distance algorithm is used to determine that the extracted feature line has already been observed; that is, when the minimum Mahalanobis distance is less than the preset threshold, the first feature line is considered to have been observed, so that the feature line observation of the first feature line and the global feature line state vector set can be used to update, in real time, the first pose estimated from the feature points.
In a possible design, updating the first pose by using the feature line observation and the global feature line state vector set to obtain the updated first pose includes: calculating the deviation between the feature line state vector of the feature line corresponding to the minimum Mahalanobis distance and the feature line observation, and based on the deviation, updating the first pose and the global feature line state vector set by a filtering method.
In this design, an existing filtering method can be used to update the first pose and the global feature line state vector set to obtain optimal values.
In a possible design, when the minimum of the N Mahalanobis distances is not less than the preset threshold, the feature line observation is added to the global feature line state vector set to obtain the updated global feature line state vector set. In this design, when the minimum Mahalanobis distance is not less than the preset threshold, it indicates that the extracted first feature line is a newly observed feature line; the global feature line state vector set is accordingly updated, realizing optimization of the global feature line state vector set.
In a possible design, extracting the feature lines of the current visual image frame can be implemented by the following process: extracting all line segments of the current visual image frame; if any two extracted line segments satisfy a first preset condition, merging the two line segments into a new line segment, until no line segment satisfying the first preset condition remains; if any two merged line segments satisfy a second preset condition, outputting the two merged line segments as the same feature line; if any two merged line segments do not satisfy the second preset condition, outputting the two line segments as two feature lines.
It should be understood that the above feature line extraction method can merge different line segments belonging to the same straight line and can eliminate duplicate feature lines. Compared with existing feature line extraction methods, it improves the accuracy and efficiency of feature line extraction and reduces redundant feature lines.
In a possible design, merging any two extracted line segments into a new line segment when they satisfy the first preset condition includes: if the minimum distance between the endpoints of the two extracted line segments is less than a first preset value, the distance between the two line segments is less than a second preset value, and the angle between the two line segments is less than a third preset value, merging the two line segments into a new line segment.
In a possible design, outputting any two merged line segments as the same feature line when they satisfy the second preset condition includes: if the angle between the two merged line segments is less than a fourth preset value, the lengths of the two line segments are the same, the overlap of the two line segments is greater than a fifth preset value, and the distance between the two line segments is less than a sixth preset value, outputting the two line segments as the same feature line.
In a possible design, observing the first feature line to determine the feature line observation of the first feature line includes: minimally describing the extracted first feature line using orthogonalization parameters, to obtain the feature line observation.
In a possible design, acquiring the global feature line state vector set in the current visual image frame includes: during motion of the imaging device, when the current visual image frame is a key frame and a feature line is observed in the current visual image frame, performing association matching between the currently observed feature line and previously observed historical feature lines, where a key frame is a frame in which a key action occurs during the motion of the imaging device; for a feature line that matches successfully, calculating the reprojection error between the currently observed feature line and each previously observed historical feature line, constructing an objective function from the reprojection errors, minimizing the objective function to obtain the feature line state vector of the currently observed feature line, and updating it into the global feature line state vector set; for a feature line that fails to match, acquiring the feature line state vector of the currently observed feature line and adding it to the global feature line state vector set.
In a second aspect, an embodiment of the present application provides a visual SLAM device based on point-line features, applied to an imaging device that acquires surrounding images, including: a receiving unit, configured to receive a current visual image frame input by a camera; an extracting unit, configured to extract feature points and feature lines of the current visual image frame; a prediction unit, configured to predict a first pose of the imaging device by using the feature points; a determining unit, configured to observe a first feature line to determine a feature line observation of the first feature line, where the first feature line is any one of the extracted feature lines; an obtaining unit, configured to acquire a global feature line state vector set in the current visual image frame, the global feature line state vector set including the feature line state vectors of N historical feature lines, N being a positive integer; and an updating unit, configured to update the first pose by using the feature line observation and the global feature line state vector set, to obtain an updated first pose.
In a possible design, the updating unit is further configured to: update the global feature line state vector set by using the feature line observation and the first pose, to obtain an updated global feature line state vector set.
In a possible design, the determining unit is further configured to: traverse the N historical feature lines and sequentially calculate the Mahalanobis distance between each historical feature line and the first feature line, obtaining N Mahalanobis distances; the updating unit is specifically configured to: when the minimum of the N Mahalanobis distances is less than a preset threshold, update the first pose by using the feature line observation and the global feature line state vector set, to obtain the updated first pose.
In a possible design, the updating unit is specifically configured to: calculate the deviation between the feature line state vector of the feature line corresponding to the minimum Mahalanobis distance and the feature line observation, and based on the deviation, update the first pose and the global feature line state vector set by a filtering method.
In a possible design, the updating unit is further configured to: when the minimum of the N Mahalanobis distances is not less than the preset threshold, add the feature line observation to the global feature line state vector set, to obtain the updated global feature line state vector set.
In a possible design, in the aspect of extracting the feature lines of the current visual image frame, the extracting unit is specifically configured to: extract all line segments of the current visual image frame; if any two extracted line segments satisfy a first preset condition, merge the two line segments into a new line segment, until no line segment satisfying the first preset condition remains; if any two merged line segments satisfy a second preset condition, output the two merged line segments as the same feature line; if any two merged line segments do not satisfy the second preset condition, output the two line segments as two feature lines.
In a possible design, in the aspect of merging any two extracted line segments into a new line segment when they satisfy the first preset condition, the extracting unit is specifically configured to: if the minimum distance between the endpoints of the two extracted line segments is less than a first preset value, the distance between the two line segments is less than a second preset value, and the angle between the two line segments is less than a third preset value, merge the two line segments into a new line segment.
In a possible design, in the aspect of outputting any two merged line segments as the same feature line when they satisfy the second preset condition, the extracting unit is specifically configured to: if the angle between the two merged line segments is less than a fourth preset value, the lengths of the two line segments are the same, the overlap of the two line segments is greater than a fifth preset value, and the distance between the two line segments is less than a sixth preset value, output the two line segments as the same feature line.
In a possible design, the determining unit is specifically configured to: minimally describe the extracted feature line using orthogonalization parameters, to obtain the feature line observation.
In a possible design, the obtaining unit is specifically configured to: during motion of the imaging device, when the current visual image frame is a key frame and a feature line is observed in the current visual image frame, perform association matching between the currently observed feature line and previously observed historical feature lines, where a key frame is a frame in which a key action occurs during the motion of the imaging device; for a feature line that matches successfully, calculate the reprojection error between the currently observed feature line and each previously observed historical feature line, construct an objective function from the reprojection errors, minimize the objective function to obtain the feature line state vector of the currently observed feature line, and update it into the global feature line state vector set; for a feature line that fails to match, acquire the feature line state vector of the currently observed feature line and add it to the global feature line state vector set.
In a third aspect, an embodiment of the present application provides a visual SLAM processing device based on point-line features, including a transceiver, a processor, and a memory, where the transceiver is configured to send and receive information, the memory is configured to store programs, instructions, or code, and the processor is configured to execute the programs, instructions, or code in the memory to complete the method in the first aspect or any possible implementation of the first aspect.
In a fourth aspect, the present application provides a computer-readable storage medium storing instructions that, when run on a computer, cause the computer to execute the method of the first aspect or any possible design of the first aspect.
In a fifth aspect, the present application further provides a computer program product containing instructions that, when run on a computer, cause the computer to execute the method of the first aspect or any possible design of the first aspect.
Brief Description of the Drawings
FIG. 1 is an architecture diagram of a visual SLAM system according to an embodiment of the present application;
FIG. 2 is a flowchart of a visual SLAM method based on point-line features in an embodiment of the present application;
FIG. 3A is a schematic diagram of line segment merging in an embodiment of the present application;
FIG. 3B is a schematic diagram of a method of extracting feature points of the current visual image frame in an embodiment of the present application;
FIG. 4A is a flowchart of a visual SLAM method in an embodiment of the present application;
FIG. 4B is a schematic diagram of the matching process of feature points and feature lines in an embodiment of the present application;
FIG. 5 is a structural diagram of a visual SLAM device based on point-line features in an embodiment of the present application;
FIG. 6 is a structural diagram of a visual SLAM processing device based on point-line features in an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described clearly and completely below with reference to the accompanying drawings in the embodiments of the present application.
The relevant terms involved in the present application are introduced below.
Visual SLAM: refers to using external information in the form of images to determine the position of a robot, vehicle, or moving camera in an environment, while at the same time building a representation of the explored region.
Visual odometry (VO): also called the front end. The task of visual odometry is to estimate the camera motion between adjacent images, as well as the appearance of the local map. When feature-point-based visual odometry estimates the camera motion between adjacent images, it first extracts the feature points of the two successive key frames and then matches them; once the feature points are matched, two one-to-one corresponding sets of pixel points are obtained. The next task is to compute the camera motion from the two matched point sets. Using monocular ORB features is a typical 2D-2D case, handled with epipolar geometry. In this process, the 3D spatial position information of these pixels is not used. However, after the camera motion is obtained, the spatial position of each feature point can be computed from the motion information; this problem is also called triangulation. Through the epipolar constraint p'Ep = 0, an equation in the essential matrix E is constructed; an initial value of E is first solved, and it is then corrected according to the properties of E,
Figure PCTCN2018107097-appb-000001
obtaining a corrected essential matrix; this essential matrix is then decomposed to obtain the rotation and translation between the two image frames, thereby realizing visual odometry.
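The correction step can be sketched as follows. The original shows the correction formula only as an image, so this assumes the standard projection onto the essential-matrix manifold: an essential matrix has two equal singular values and one zero singular value, so the initial estimate's singular values (s1, s2, s3) are replaced by ((s1+s2)/2, (s1+s2)/2, 0).

```python
import numpy as np

def project_to_essential(E):
    """Correct an initial estimate of E according to the properties of an
    essential matrix: singular values (s, s, 0). Standard SVD projection;
    the patent's exact formula is shown only as an image."""
    U, s, Vt = np.linalg.svd(E)
    m = (s[0] + s[1]) / 2.0
    return U @ np.diag([m, m, 0.0]) @ Vt
```

The corrected matrix can then be decomposed (again via SVD) into the rotation and translation between the two frames, as the text describes.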
Pose: position and attitude; the position is the translation along the x, y, z directions of the coordinate system, and the attitude is the rotation about the x, y, z directions of the coordinate system.
Key frame: a video frame in a video sequence that differs greatly from the preceding sequence; it represents a new position. Key frames are also used to estimate the camera pose effectively and to reduce information redundancy.
Visual bag-of-words model: visual SLAM often uses the bag-of-words model to search for feature points, which can quickly find similar images.
Mahalanobis distance: represents the covariance distance of data; an effective method of computing the similarity of two unknown sample sets. If the covariance matrix is the identity matrix, the Mahalanobis distance reduces to the Euclidean distance.
Graph optimization: a way of expressing an optimization problem as a graph, in the graph-theoretic sense. A graph consists of several vertices and the edges connecting these vertices; the vertices represent the optimization variables and the edges represent the error terms. Thus, for any nonlinear least-squares problem of the above form, a corresponding graph can be constructed. Graph optimization is also called the back end.
Bundle adjustment (BA): refers to extracting the optimal 3D model and camera parameters (including intrinsic and extrinsic parameters) from a visual reconstruction; after the camera poses and the spatial positions of the feature points are optimally adjusted, the bundles of light rays reflected from each feature point finally converge at the camera's optical center, hence the name.
Extended Kalman filter (EKF): the Kalman filter is an efficient recursive filter that can estimate the state of a dynamic system from a series of incomplete and noisy measurements. When the state equation or measurement equation is nonlinear, the extended Kalman filter (EKF) is usually used to estimate the state of the dynamic system. The EKF performs a first-order linearized truncation of the Taylor expansion of the nonlinear function, ignoring the remaining higher-order terms, thereby converting the nonlinear problem into a linear one, so that the linear Kalman filtering algorithm can be applied to the nonlinear system.
Drift error: affected by sensor measurement noise, the estimation error at earlier times accumulates onto the motion at later times; this phenomenon is called drift, and the resulting error is called drift error.
Data association: the process of associating the sensor observations from different times; also called the re-observation process.
The embodiments of the present application improve on traditional SLAM and provide a visual SLAM method and device based on point-line features, capable of integrating the feature point and feature line information in visual image frames to improve the accuracy of visual SLAM. The point-line-feature-based visual SLAM solutions in the embodiments of the present application can be applied to mapping and localization for autonomous driving, mobile robots, and drones, and can also be used in augmented reality and virtual reality scenarios on mobile terminals.
An embodiment of the present application provides a visual SLAM system, specifically including a feature tracking module, a local mapping module, a closed-loop detection module, and a global mapping module. The function of each module is described below:
1) Feature tracking module
After receiving a video image frame, the feature tracking module reads and preprocesses it and extracts the feature points and feature lines of the video image frame; it searches consecutive video image frames for similar feature points and feature lines for association matching, estimates the motion of the imaging device, and realizes pose tracking of the imaging device. The main task of the feature tracking module is to output the pose of the imaging device in real time, select key frames, and complete the motion estimation of the imaging device.
2) Local mapping module
The local mapping module mainly selects key frames within a local range, computes the point cloud information of the key frames, and builds a local map; a heterogeneous feature point and feature line map is obtained through a local BA optimization algorithm.
3) Closed-loop detection module
The main task of the loop closing detection module is to determine whether the scene currently captured by the imaging device has been visited before. Closed-loop detection can effectively eliminate the accumulated drift error caused by the motion of the imaging device. The main steps are:
Step 1: Use the bag-of-words model to perform closed-loop detection on the observed feature points, computing the similarity between the current key frame and candidate key frames.
Step 2: Perform a Mahalanobis distance test on the feature lines via the covariance, to realize closed-loop recognition of the feature lines in the environment.
Step 3: Fuse the two kinds of closed-loop detection to obtain a more robust result.
4) Global mapping module
The main task of the global mapping module is to obtain all the key frames on the entire motion path and compute a globally consistent trajectory and map. After closed-loop detection, a global BA optimization algorithm is used to optimize all the key frames, feature points, and feature lines, and the optimized global feature line state vector set is updated into the feature tracking module.
Based on the visual SLAM system shown in FIG. 1, and referring to FIG. 2, an embodiment of the present application provides a visual SLAM method based on point-line features, applied to an imaging device that acquires surrounding images, including the following steps:
Step 20: Receive the current visual image frame input by the camera.
Step 21: Extract the feature points and feature lines of the current visual image frame.
Here, a feature point refers to an environment element that exists in the form of a point in the environment in which the imaging device is located; a feature line refers to an environment element that exists in the form of a line in that environment.
Specifically, when extracting the feature points of the current visual image frame in step 21, the embodiment of the present application uses the existing oriented FAST and rotated BRIEF (ORB) algorithm to extract and describe the feature points of the input current visual image frame. Feature point extraction is implemented with an improved features from accelerated segment test (FAST) algorithm, and feature point description is implemented with the binary robust independent elementary features (BRIEF) algorithm.
Specifically, the improved FAST algorithm is as follows:
步骤一:粗提取。该步能够提取大量的特征点,但是有很大一部分的特征点的质量不高。下面介绍提取方法。从图像中选取一点P,如图3A所示,判断该点是不是特征点的方法是,以P为圆心画一个半径为3pixel的圆。圆周上如果有连续n个像素点的灰度值比P点的灰度值大或者小,则认为P为特征点。一般n设置为12。为了加快特征点的提取,快速排除非特征点,首先检测1、9、5、13位置上的灰度值,如果P是特征点,那么这四个位置上有3个或3个以上的的像素值都大于或者小于P点的灰度值。如果不满足,则直接排出此点。
步骤二:机器学习的方法筛选最优特征点。简单来说就是使用ID3算法训练一个决策 树,将特征点圆周上的16个像素输入决策树中,以此来筛选出最优的FAST特征点。
步骤三:非极大值抑制去除局部较密集特征点。使用非极大值抑制算法去除临近位置多个特征点的问题。为每一个特征点计算出其响应大小。计算方式是特征点P和其周围16个特征点偏差的绝对值和。在比较临近的特征点中,保留响应值较大的特征点,删除其余的特征点。
步骤四:特征点的尺度不变性。建立图像金字塔,来实现特征点的多尺度不变性。设置一个比例因子scaleFactor(opencv默认为1.2)和金字塔的层数nlevels(opencv默认为8),将原图像按比例因子缩小成nlevels幅图像,缩放后的图像为:I′=I/scaleFactor^k(k=1,2,…,nlevels)。将nlevels幅不同比例图像上提取的特征点总和作为这幅图像的oFAST特征点。
步骤五:特征点的旋转不变性。ORB算法提出使用矩(moment)法来确定FAST特征点的方向。也就是说通过矩来计算特征点以r为半径范围内的质心,特征点坐标到质心形成一个向量作为该特征点的方向。矩定义如下:
m_pq = ∑_(x,y) x^p · y^q · I(x,y), 其中 p,q ∈ {0,1}
其中,I(x,y)为图像灰度表达式。该矩的质心为:
C = ( m_10 / m_00 , m_01 / m_00 )
假设角点坐标为O,则向量的角度即为该特征点的方向。计算公式如下:
θ = arctan2( m_01 , m_10 )
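步骤五中“以矩求质心方向”的计算可用如下草图说明(纯Python示意,并非本申请限定的实现;`patch`假定为以特征点为中心的方形灰度块,方向取 arctan2(m_01, m_10),与上文公式一致):

```python
import math

def orientation_by_moments(patch):
    """patch: 二维灰度列表,中心为特征点。
    以中心为原点计算一阶矩m10、m01,
    返回特征点指向灰度质心的方向角(弧度)。"""
    h, w = len(patch), len(patch[0])
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    m10 = m01 = 0.0
    for y, row in enumerate(patch):
        for x, I in enumerate(row):
            m10 += (x - cx) * I
            m01 += (y - cy) * I
    return math.atan2(m01, m10)
```

例如,灰度集中在块右侧时方向角约为0,集中在下侧时约为π/2,即方向始终指向灰度质心。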
具体的,步骤21中,提取所述当前视觉图像帧的特征线时,可以通过以下过程实现:
S1:提取所述图像帧的所有线段,可选的,本申请实施例中可以利用现有的线段检测器(line segment detector,LSD)方法提取出图像帧的所有线段。
具体的,LSD提取算法如下所示:
1.以s=0.8的尺度对输入图像进行高斯下采样。
2.计算每一个点的梯度值以及梯度方向(level-line orientation)。
3.根据梯度值对所有点进行伪排序(pseudo-ordered),建立状态列表,所有点设置为UNUSED。
4.将梯度值小于阈值ρ的点在状态列表中相应位置设置为USED。
5.取出列表中梯度最大(伪排列的首位)的点作为种子点(seed),状态列表中设为USED。
do:
a.以seed为起点,搜索周围UNUSED并且方向在阈值[-t,t]范围内的点,状态改为USED。
b.生成包含所有满足点的矩形R。
c.判断同性点(aligned points)密度是否满足阈值D,若不满足,截断(cut)R变为多个矩形框,直至满足。
d.计算NFA(number of false alarms)。
e.改变R使NFA的值更小直至NFA<=ε,R加入输出列表。
S2:若提取的任意两条线段满足第一预设条件时,将所述任意两条线段合并为一条新的线段,直到不存在满足所述第一预设条件的线段为止。
具体的,若提取的任意两条线段的端点之间的最小距离小于第一预设值且所述两条线段之间的距离小于第二预设值,且所述任意两条线段之间的夹角小于第三预设值时,将所述任意两条线段合并为一条新的线段。
上述步骤能够将属于同一个线条的多个线段进行合并,例如,图3B所示,l表示两条线段端点之间的最小距离;d表示一条线段的中间点距离另一条线段的距离;当l和d都小于设定的阈值,且两条线段之间的夹角比较小、也小于设定的阈值时,则认为这两条线段属于同一个线条,将这两条线段合并为一条新的线段。
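图3B描述的合并判定(端点最小距离l、中点到另一线段的距离d、夹角三者均小于阈值)可草拟如下(纯Python示意,阈值l_th、d_th、a_th均为示例假定值,并非本申请限定):

```python
import math

def _angle(seg):
    (x1, y1), (x2, y2) = seg
    return math.atan2(y2 - y1, x2 - x1)

def _point_to_line_dist(p, seg):
    """点p到线段seg所在直线的距离。"""
    (x1, y1), (x2, y2) = seg
    px, py = p
    dx, dy = x2 - x1, y2 - y1
    L = math.hypot(dx, dy)
    return abs(dy * (px - x1) - dx * (py - y1)) / L

def should_merge(s1, s2, l_th=5.0, d_th=2.0, a_th=math.radians(5)):
    """对应文中第一预设条件:端点最小距离 < l_th、
    一条线段中点到另一条线段的距离 < d_th、夹角 < a_th。"""
    l = min(math.dist(p, q) for p in s1 for q in s2)
    mid1 = ((s1[0][0] + s1[1][0]) / 2, (s1[0][1] + s1[1][1]) / 2)
    d = _point_to_line_dist(mid1, s2)
    da = abs(_angle(s1) - _angle(s2)) % math.pi
    da = min(da, math.pi - da)  # 线段方向不分正反
    return l < l_th and d < d_th and da < a_th
```

例如,两段近共线且端点相近的线段会被判定合并,而相距较远的平行线段不会。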
S3:若所述任意两条、经合并的线段满足第二预设条件时,将所述任意两条、经合并的线段作为同一条特征线输出;若所述任意两条、经合并的线段不满足第二预设条件时,将所述任意两条线段作为两条特征线输出。
具体的,若所述任意两条、经合并的线段之间的夹角小于第四预设值,且所述两条线段的长度相同,且所述两条线段的重叠度大于第五预设值,且所述两条线段之间的距离小于第六预设值时,将所述任意两条线段作为同一条特征线输出。
例如,当任意两条线段l 1和l 2满足如下条件,则认为l 1和l 2是同一条直线:
1)l 1和l 2的夹角小于给定阈值Φ;
2)两条线段的长度基本相同,
Figure PCTCN2018107097-appb-000005
3)l 1和l 2的重叠度大于某个阈值
Figure PCTCN2018107097-appb-000006
4)l 1和l 2的线二进制描述符(line binary descriptor,LBD)描述子之间的距离小于一定的阈值。
需要说明的是,本申请实施例中的线段采用LBD方法描述,因此,两条线段之间的距离用两条线段的LBD描述子之间的距离表征。
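上述四个同线判定条件可以草拟为一个谓词函数(示意实现,并非本申请限定:重叠度以x方向区间交叠近似,LBD二进制描述子以整数位串的汉明距离近似,阈值均为假定值):

```python
import math

def same_line(s1, s2, desc1, desc2,
              ang_th=math.radians(3), len_ratio=0.9,
              ov_th=0.8, desc_th=30):
    """四个条件同时成立时,判定两条线段为同一条特征线:
    1) 夹角小;2) 长度基本相同;3) 重叠度高;4) 描述子距离小。"""
    a1 = math.atan2(s1[1][1] - s1[0][1], s1[1][0] - s1[0][0])
    a2 = math.atan2(s2[1][1] - s2[0][1], s2[1][0] - s2[0][0])
    da = abs(a1 - a2) % math.pi
    da = min(da, math.pi - da)

    L1, L2 = math.dist(*s1), math.dist(*s2)
    ratio = min(L1, L2) / max(L1, L2)  # 长度基本相同

    # 以x方向区间的交叠占比示意重叠度(假设线段近似水平)
    lo = max(min(s1[0][0], s1[1][0]), min(s2[0][0], s2[1][0]))
    hi = min(max(s1[0][0], s1[1][0]), max(s2[0][0], s2[1][0]))
    overlap = max(0.0, hi - lo) / min(L1, L2)

    hamming = bin(desc1 ^ desc2).count("1")  # 二进制描述子距离
    return (da < ang_th and ratio > len_ratio
            and overlap > ov_th and hamming < desc_th)
```

描述子差异过大时,即使几何条件满足也不会判定为同一条特征线。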
步骤22:利用所述特征点预测所述摄像设备的第一位姿。
具体的,在提取到前后两个视频图像帧的特征点后,得到两组特征点集,对两组特征点集进行匹配,估计摄像设备的运动,据此预测所述摄像设备的第一位姿。进一步的,若摄像设备是双目相机,可选的,采用n点透视定位算法(perspective-n-point,PnP)估计相机的运动,并采用非线性优化的方法进行迭代求解,得到第一位姿,即摄像设备的旋转R和平移T的估计。
步骤23:对第一特征线进行观测,以确定所述第一特征线的特征线观测量,其中,所述第一特征线是提取到的所述特征线中的任意一条特征线。
具体的,针对提取的第一特征线采用正交化参数进行最小化描述,得到所述第一特征线的特征线观测量。
步骤24:获取所述当前视觉图像帧中的全局特征线状态向量集合,所述全局特征线状态向量集合中包括N条历史特征线的特征线状态向量,N为正整数。
需要说明的是,所述全局特征线状态向量集合是在所述摄像设备运动过程中,对连续的视觉图像帧中的关键帧输出的特征线进行闭环检测和全局优化得到的。所述获取所述当前视觉图像帧中的全局特征线状态向量集合,包括:在所述摄像设备运动过程中,在所述关键帧中观测到特征线时,针对新观测到的特征线与在先已观测到的历史特征线进行关联匹配;针对匹配成功的特征线,计算所述当前观测到的特征线与在先已观测到的每条历史特征线之间的重投影误差(reprojection error),利用所述重投影误差构造目标函数,最小化所述目标函数得到所述当前观测到的特征线的特征线状态向量,将其更新到所述全局特征线状态向量集合中;针对匹配失败的特征线,获取所述当前观测到的特征线的特征线状态向量,将其添加到所述全局特征线状态向量集合中。
其中,重投影误差是指投影的点(理论值)与图像上的测量点的误差。例如在标定的时候我们经常用到重投影误差作为最终标定效果的评价标准,我们认为标定板上的物理点是理论值,它经过投影变换后会得到理论的像素点a,而测量的点经过畸变矫正后的像素点为a′,它们的欧氏距离||a-a′|| 2即表示重投影误差。
值得一提的是,一帧就是视频中的一个画面,关键帧也叫作I帧,它是帧间压缩编码的最重要帧。关键帧相当于二维动画中的原画,指角色或者物体运动或变化中的关键动作所处的那一帧。
步骤25:利用所述特征线观测量和所述全局特征线状态向量集合,对所述第一位姿进行更新,以得到更新后的第一位姿。
值得注意的是,在利用所述特征线观测量和所述全局特征线状态向量集合对所述第一位姿进行更新得到更新后的第一位姿时,还可以利用所述特征线观测量、所述第一位姿对所述全局特征线状态向量集合进行更新,得到更新后的全局特征线状态向量集合。
需要说明的是,在执行步骤25之前,需要确定上述观测到的第一特征线是否是已经观测到的特征线,可以通过以下过程实现:
S51:遍历所述全局特征线状态向量集合中的特征线状态向量,依次计算与所述特征线观测量之间的马氏距离,得到N个马氏距离。
S52:当所述N个马氏距离中的最小马氏距离小于预设阈值时,确定上述观测到的第一特征线为已经观测到的特征线;否则,确定上述观测到的第一特征线为新观测到的特征线。
进一步的,所述利用所述特征线观测量和所述全局特征线状态向量集合更新所述摄像设备的第一位姿时,计算所述最小马氏距离对应的特征线状态向量与所述特征线观测量之间的偏差,基于所述偏差,利用滤波方法更新所述第一位姿和所述全局特征线状态向量集合。可选的,将所述特征线状态向量与所述特征线观测量之间的残差作为特征线状态向量与所述特征线观测量之间的偏差。
若在上述步骤S52中,确定上述观测到的第一特征线为新观测到的特征线时,将所述特征线观测量添加到所述全局特征线状态向量集合中,更新所述全局特征线状态向量集合。
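步骤S51、S52的数据关联判定可草拟为如下示意(numpy仅用于矩阵运算;示例中的二维状态向量与卡方阈值gate仅为演示假定,实际应为特征线的正交化参数及对应协方差):

```python
import numpy as np

def associate_line(z, line_states, covs, gate=9.21):
    """遍历全局特征线状态向量集合,计算观测z与每条历史特征线的
    马氏距离,取最小者与阈值gate比较:小于阈值则关联到该历史
    特征线(返回其索引),否则视为新观测到的特征线(返回-1)。"""
    best_idx, best_d2 = -1, float("inf")
    for i, (x, S) in enumerate(zip(line_states, covs)):
        r = z - x                                # 观测量与状态向量之差
        d2 = float(r @ np.linalg.inv(S) @ r)     # 马氏距离平方
        if d2 < best_d2:
            best_idx, best_d2 = i, d2
    return best_idx if best_d2 < gate else -1
```

距离门限内的最近历史特征线被判为“已观测”,实现一次闭环约束;门限外则初始化新特征线。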
通过上述过程可知,本申请实施例中视觉SLAM方法,融合了基于特征点的运动估计及环境中观测到的特征线的观测特征,来实时更新摄像设备的位姿,此外,同时考虑了已观测到的历史特征线的观测特征,实现了闭环约束,提高了鲁棒性,提高了视觉SLAM的准确度。
下面通过一个双目相机的自动驾驶场景来详细说明图2的实施过程,具体实现过程可参阅图4A所示。
步骤1:获取自动驾驶的双目视觉图像输入;
步骤2:对获取的双目视觉图像特征进行特征点和特征线的提取。
需要说明的是,针对双目视觉图像提取到特点和特征线之后,还需执行如图4B所示的特征点和特征线的匹配过程。
例如,特征跟踪模块将校正后的双目视觉图像序列作为输入,对于输入的每一帧视觉图像,同时启动四个线程来对左右目视觉图像提取特征点和特征线。本实施例中采用ORB方法实现特征点的检测提取和描述,特征线则采用基于LSD的改进方法进行提取,并用LBD描述子进行描述。之后,再启动两个线程:一个线程对提取的特征点进行匹配,如果在左右目图像中同时存在相同的特征点,则为双目特征点,其他特征点为单目特征点;另一个线程对特征线进行匹配,如果左右目图像中同时找到相同的特征线,则为双目特征线,其他未匹配成功的特征线为单目特征线。对于每一个没有匹配到的单目特征点和单目特征线,将它们与其他关键帧中未匹配的单目特征点和特征线进行匹配,一旦匹配成功,则之后的处理方式与双目特征点和双目特征线的处理方式相同。
步骤3:利用特征点估计相邻图像间的相机运动,得到相机的运动估计。
设定相机起始位置为全局坐标系原点,根据步骤2的结果,提取前后两个视频图像帧的特征点,得到两组特征点集,对两组特征点集进行匹配,估计相机的运动,得到相机的运动方程。
例如,对双目相机,采用PnP的方法估计相机的运动时,采用非线性优化的方法进行迭代求解,得到相机的运动估计,这里用旋转q和平移p表征相机的运动估计,即相机的位姿,相机的运动方程表示如下:
x(t)=F(t)x(t-1)
其中:x(t)为相机运动的位姿,
Figure PCTCN2018107097-appb-000007
G为全局坐标系,C为相机坐标系,
Figure PCTCN2018107097-appb-000008
为相机在全局坐标系下的姿态用四元数表示, GP C表示相机在全局坐标系的位置,F为相机的状态转移矩阵。
Figure PCTCN2018107097-appb-000009
Figure PCTCN2018107097-appb-000010
其中
Figure PCTCN2018107097-appb-000011
Figure PCTCN2018107097-appb-000012
四元数的矩阵表示形式,
Figure PCTCN2018107097-appb-000013
随着相机位置变化而不断更新,在首张图像帧时初始化旋转矩阵R为单位阵,P为0。
步骤4:通过对提取的特征线进行观测,构建相机的观测方程。
具体实现时,包括步骤41和步骤42。
步骤41:根据步骤2提取的特征线,采用正交化参数对其进行最小化观测描述。
其中特征线正交化表示方法如下:
采用普鲁克(plucker)坐标对特征线进行描述,根据步骤2提取的特征线,假设提取的特征线的两个端点三维坐标表示分别为:X 1=(x 1,y 1,z 1,1) T,X 2=(x 2,y 2,z 2,1) T,该特征线在普鲁克(plucker)坐标描述如下:
L_f = [ n ; v ] = [ X_1 × X_2 ; X_1 − X_2 ] ∈ R^6(叉积与差值均取端点的前三维坐标)
L f是由两个三维向量n和v组成的六维向量,v表示该特征线的直线向量X 1-X 2,n表示特征线和相机中心构成平面的法向量。通过正交化表示后,利用四个参数δ θ=[δθ T,δφ] T∈R 4更新特征线的正交化表示,三维向量θ∈R 3是特征线绕着三个坐标轴的旋转,用来更新n;φ表示中心点到特征直线的垂直距离,用来更新v。
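普鲁克坐标的构造可草拟如下(numpy示意,并非本申请限定的实现;按上文约定 v=X_1−X_2 为直线向量,n 取两端点前三维坐标的叉积,即特征线与相机中心所成平面的法向量):

```python
import numpy as np

def plucker_from_endpoints(X1, X2):
    """由两端点(非齐次三维坐标)构造特征线的普鲁克坐标 L_f=[n; v]:
    v = X1 - X2 为该特征线的直线向量,
    n = X1 × X2 为特征线和相机中心构成平面的法向量。"""
    X1, X2 = np.asarray(X1, float), np.asarray(X2, float)
    v = X1 - X2
    n = np.cross(X1, X2)
    return np.hstack([n, v])  # 六维向量
```

可验证 n 与 v 正交,这正是普鲁克坐标需满足的约束条件。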
定义相机的观测状态向量为x(包括相机的位姿及特征线状态向量)其表达式如下,其中
Figure PCTCN2018107097-appb-000015
Figure PCTCN2018107097-appb-000016
分别为相机的旋转和平移, GL f表示特征线状态向量,上下标中G代表全局坐标系,C代表相机,f表示特征线,L表示线本身(Line):
Figure PCTCN2018107097-appb-000017
其中
Figure PCTCN2018107097-appb-000018
为相机的状态估计误差向量,
Figure PCTCN2018107097-appb-000019
为特征线估计误差向量。
步骤42:构建特征线的观测方程;
对环境中观测到的特征线进行投影,l'表示投影后的相机平面直线:l 1u+l 2v+l 3=0,其中u,v为相机平面二维坐标表示,令l′=[l 1 l 2 l 3] T
设观测到的特征线的两个三维端点投影到相机平面坐标为x s和x e;特征线的观测残差:
Figure PCTCN2018107097-appb-000020
特征线的观测方程用如下式子表示:
令z=[x s x e] T
Figure PCTCN2018107097-appb-000021
其中d表示观测到的特征线z到l′的距离,距离越小代表估计误差越小。
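该观测残差即两个投影端点到投影直线的点线距离,可草拟如下(纯Python示意,xs、xe为齐次像素坐标[u, v, 1],l为投影直线参数[l1, l2, l3]):

```python
import math

def line_residual(l, xs, xe):
    """观测残差:两个投影端点到投影直线 l'=[l1,l2,l3] 的带符号
    点线距离 (l·x)/sqrt(l1^2+l2^2),距离越小代表估计误差越小。"""
    norm = math.hypot(l[0], l[1])
    ds = (l[0] * xs[0] + l[1] * xs[1] + l[2] * xs[2]) / norm
    de = (l[0] * xe[0] + l[1] * xe[1] + l[2] * xe[2]) / norm
    return ds, de
```

端点落在直线上时残差为0;残差向量即上文观测方程中被最小化的量。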
线性化观测方程:
Figure PCTCN2018107097-appb-000022
其中:
Figure PCTCN2018107097-appb-000023
为特征线观测误差,H L表示特征线观测的雅克比矩阵,H n表示特征线观测噪声的雅克比矩阵;
Figure PCTCN2018107097-appb-000024
特征线测量的雅克比矩阵为:
Figure PCTCN2018107097-appb-000025
Figure PCTCN2018107097-appb-000026
其中:
Figure PCTCN2018107097-appb-000027
x s=[u 1 v 1 1] T和x e=[u 2 v 2 1] T
Figure PCTCN2018107097-appb-000028
其中K为相机的内参矩阵;
进一步通过求导,得到观测的雅克比矩阵,即相机的观测方程,具体的通过下述公式表示,其中公式中
Figure PCTCN2018107097-appb-000029
代表参数的估计:
Figure PCTCN2018107097-appb-000030
步骤5:观测环境中的特征线,并对观测到的特征线进行全局闭环检测并进行全局优化,得到全局特征线状态向量集合,具体包括以下步骤。
步骤51:利用特征线之间的马氏距离对特征线进行数据关联。
随着相机运动,每个视频图像帧都有特征线输出,需要将该特征线与全局特征线状态向量集合中已观测到的特征线进行关联,确定此特征线为新特征线还是已观测到的特征线。
具体的,本申请实施例中通过马氏距离的计算方法快速计算两条特征线的关联程度,具体算法如下所示:
d_m = √( z̃ᵀ · S⁻¹ · z̃ )
上式d m表示马氏距离,S表示观测到的特征线的协方差矩阵,具体计算公式可表示为:
S = H_L · P_X · H_Lᵀ + H_n · R · H_nᵀ
其中Px是相机的观测状态向量
Figure PCTCN2018107097-appb-000033
的协方差矩阵,R为测量噪声的协方差矩阵,如下所示。
Figure PCTCN2018107097-appb-000034
遍历已观测到的特征线状态向量,计算对应的马氏距离,从中选择马氏距离最小者。如果该马氏距离小于一个设定的阈值,可以确认此特征线之前观测过,实现一次闭环检测;否则初始化一条新的特征线。
步骤52:对观测到的特征线进行全局优化。
采用现有的全局优化器(如g2o)对相机的所有运动姿态进行估计,并对观测的特征线进行全局优化,其中步骤51的闭环检测结果也是优化器的输入之一。本申请的特征线观测量作为全局优化器的约束之一,对全局目标优化函数及特征线观测的雅克比矩阵进行计算,求导过程如下所示:
1)确定全局目标优化函数:
选择3D点的位置X w,i和3D线的位置
Figure PCTCN2018107097-appb-000035
作为图的顶点,相机的位姿为T kw,点位姿的边和线位姿的边作为图的两种边在前端的数据关联时候被构造,边的重投影表示如下:
ep k,i=x k,i-n(KT kwX w,i)
Figure PCTCN2018107097-appb-000036
上式中x k,i是点在图像坐标系中的位置,n(.)是从齐次坐标到非齐次坐标的变换。全局优化目标函数C能够通过下式得到,其中,∑p⁻¹、∑l⁻¹分别是点和线的协方差矩阵的逆,ρ p、ρ l是鲁棒代价函数。
C = ∑_(k,i) ρ_p( ep_(k,i)ᵀ · ∑p⁻¹ · ep_(k,i) ) + ∑_(k,j) ρ_l( el_(k,j)ᵀ · ∑l⁻¹ · el_(k,j) )
最小化目标函数,能够优化当前观测到的特征线的特征线状态向量,从而将其更新到全局特征线状态向量集合中。为了利用非线性优化方法优化目标函数,需要对优化函数求解雅克比矩阵。
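目标函数中鲁棒代价函数(如Huber核)抑制外点的作用可用如下草图说明(示意实现,delta等参数为假定值;实际实现中通常直接使用g2o等优化器内置的鲁棒核):

```python
import math

def huber(r2, delta=1.0):
    """Huber代价 ρ(r²):残差小时保持二次增长,
    残差大时退化为线性增长,从而抑制外点的影响。"""
    r = math.sqrt(r2)
    if r <= delta:
        return r2
    return 2 * delta * r - delta * delta

def total_cost(point_res, line_res, w_p=1.0, w_l=1.0):
    """全局目标函数的最小化对象的示意:点边与线边重投影误差的
    鲁棒加权和。point_res/line_res为各条边的残差平方
    (此处假定已含协方差加权,仅为演示)。"""
    return (sum(huber(w_p * r2) for r2 in point_res)
            + sum(huber(w_l * r2) for r2 in line_res))
```

小残差按平方累加,大残差只按线性累加,因此个别误匹配不会主导整体优化方向。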
2)解析观测的雅克比矩阵:
首先需要计算线的重投影误差关于微小位姿变化δ ξ和描述线正交化表示更新的四维向量δ θ的雅克比矩阵。重投影误差关于投影到像素坐标系中的反投影线l'=[l 1,l 2,l 3]的导数如下:
Figure PCTCN2018107097-appb-000038
上式中
Figure PCTCN2018107097-appb-000039
x s=[u 1,v 1,1] T和x e=[u 2,v 2,1] T分别是图像坐标中要匹配的线段的两个端点。3D线的投影方程为:
Figure PCTCN2018107097-appb-000040
则l'关于相机坐标系下线段的导数如下:
Figure PCTCN2018107097-appb-000041
设线段在世界坐标系下的正交化表示如下:
Figure PCTCN2018107097-appb-000042
由变换方程
Figure PCTCN2018107097-appb-000043
可知
Figure PCTCN2018107097-appb-000044
关于
Figure PCTCN2018107097-appb-000045
的雅克比矩阵如下所示:
Figure PCTCN2018107097-appb-000046
上式中,u i是U的第i列。
直接计算导数
Figure PCTCN2018107097-appb-000047
比较困难,所以本申请把δ ξ分成平移变化的部分δ ρ和旋转变化的部分δ φ。计算关于平移量δ ρ的导数的时候,假设旋转量为0,同理,计算关于旋转量的导数的时候,假设平移量为0。首先计算关于平移量的导数。经过δ ρ平移量之后的变换矩阵T *,旋转矩阵R *,平移量t *,线的变换矩阵
Figure PCTCN2018107097-appb-000048
和变换后的线的坐标如下所示:
Figure PCTCN2018107097-appb-000049
上式中exp(δ ξ )是李代数δ ξ 对应的李群元素。然后可以计算出关于δ ρ的导数如下所示:
Figure PCTCN2018107097-appb-000050
计算
Figure PCTCN2018107097-appb-000051
和计算
Figure PCTCN2018107097-appb-000052
是相似的,计算结果如下所示:
Figure PCTCN2018107097-appb-000053
最后,关于δ ξ的雅克比矩阵如下所示:
Figure PCTCN2018107097-appb-000054
根据求导法则,综合以上,本申请中重投影误差关于线参数和位姿变化的雅克比矩阵如下:
Figure PCTCN2018107097-appb-000055
Figure PCTCN2018107097-appb-000056
计算出雅克比矩阵之后,就可以利用高斯牛顿等非线性方法来迭代求最优的特征线状态向量和相机的位姿。
步骤53:根据步骤52的结果,更新全局特征线状态向量集合。
步骤6:采用滤波的SLAM方式,利用特征点的运动估计,利用特征线的特征线观测量对相机的观测状态进行更新,实现相机的运动跟踪。
举例说明步骤6的具体实现过程:
步骤61:根据步骤3获取相机的位姿。
步骤3中能够根据T时刻观测的特征点及T-1时刻观测的特征点,采用视觉里程计方法对前后两个视觉图像的帧特征点进行匹配得到T-1时刻到T时刻的位姿估计
Figure PCTCN2018107097-appb-000057
步骤62:若T时刻,后端有全局特征线状态向量集合更新,更新全局特征线状态向量集合
Figure PCTCN2018107097-appb-000058
步骤63:T时刻观测到的m条特征线表示为
Figure PCTCN2018107097-appb-000059
将观测到的每条特征线与全局特征线状态向量集合 GL f计算观测残差
Figure PCTCN2018107097-appb-000060
及雅克比矩阵H L,通过马氏距离判定线段的数据关联程度,确定观测到的特征线是否为已经观测到的特征线。
步骤64:如果是新观测到的特征线,将该特征线更新到全局特征线状态向量集合 GL f中,更新相机的状态向量;如果是已经观测到的特征线,计算特征线的观测残差z~及雅克比矩阵H L
步骤65:基于T时刻的位姿估计
Figure PCTCN2018107097-appb-000061
利用特征线的观测残差z~及雅克比矩阵H L,观测噪声的协方差矩阵R及雅克比矩阵H n,优化相机的状态向量
Figure PCTCN2018107097-appb-000062
具体的,采用EKF滤波方法进行举例说明,通过相机的状态协方差矩阵P X、相机的状态转移矩阵F迭代计算相机的状态协方差估计
Figure PCTCN2018107097-appb-000063
然后利用特征线的观测雅克比矩阵H L,相机的状态协方差估计
Figure PCTCN2018107097-appb-000064
观测噪声协方差矩阵R及噪声的雅克比矩阵H n获得迭代更新卡尔曼增益,
Figure PCTCN2018107097-appb-000065
利用更新的卡尔曼增益及特征线的观测雅克比矩阵H L更新状态协方差矩阵,得到
Figure PCTCN2018107097-appb-000066
实现卡尔曼增益和状态协方差矩阵P X的不断迭代更新,通过更新的卡尔曼增益及特征线的观测残差z~优化相机的状态向量
Figure PCTCN2018107097-appb-000067
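步骤65的EKF更新可草拟为如下numpy示意(一个简化草图,并非本申请限定的实现:忽略噪声雅克比H_n对R的变形,示例中的一维状态仅为说明用):

```python
import numpy as np

def ekf_line_update(x, P, z_res, H_L, R):
    """EKF更新一步:
    S = H P Hᵀ + R(新息协方差)
    K = P Hᵀ S⁻¹ (卡尔曼增益)
    x ← x + K·z̃  (用特征线观测残差修正状态)
    P ← (I − K H) P(状态协方差收缩)"""
    S = H_L @ P @ H_L.T + R
    K = P @ H_L.T @ np.linalg.inv(S)
    x_new = x + K @ z_res
    P_new = (np.eye(len(x)) - K @ H_L) @ P
    return x_new, P_new
```

更新后状态协方差减小,说明特征线观测降低了位姿估计的不确定性;增益与协方差随观测不断迭代更新。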
综上,现有的视觉SLAM方案,采用视觉里程计方法对相机运动进行估计,未考虑闭环约束及地图观测的误差。本申请的特征跟踪模块基于滤波框架融合了基于特征点的相机帧间运动估计及环境特征线的观测向量;其中,在滤波框架中维护特征线地图,特征线地图和相机位姿观测同时优化;维护线地图和相机运动之间的实时关联,保证系统实时性能及鲁棒闭环检测。利用SLAM的闭环检测提升运动跟踪估计精度。
此外,特征线表示及观测误差的解析求解时,采用正交化表示法对3D线特征的误差投影进行最小参数化表示,通过数学解析方法计算观测变量的雅克比矩阵,减少优化变量数量,提升后端优化精度、稳定性及效率。
基于同一构思,本申请还提供了一种基于点线特征的视觉SLAM装置,应用于采集周围图像的摄像设备,该装置可以用于执行上述图2~图4B中对应的方法实施例,因此本申请实施例提供的视觉SLAM装置的实施方式可以参见该方法的实施方式,重复之处不再赘述。
参阅图5所示,本申请实施例提供基于点线特征的视觉SLAM装置500,包括:接收单元510,提取单元520,预测单元530,确定单元540,获取单元550和更新单元560,其中:
接收单元510,用于接收摄像头输入的当前视觉图像帧;
提取单元520,用于提取所述当前视觉图像帧的特征点和特征线;
预测单元530,用于利用所述特征点预测所述摄像设备的第一位姿;
确定单元540,用于对第一特征线进行观测,以确定所述第一特征线的特征线观测量,其中,所述第一特征线是提取到的所述特征线中的任意一条特征线;
获取单元550,用于获取所述当前视觉图像帧中的全局特征线状态向量集合,所述全局特征线状态向量集合中包括N条历史特征线的特征线状态向量,N为正整数;
更新单元560,用于利用所述特征线观测量和所述全局特征线状态向量集合,对所述第一位姿进行更新,以得到更新后的第一位姿。
可选的,所述更新单元560还用于:
利用所述特征线观测量、所述第一位姿对所述全局特征线状态向量集合进行更新,以得到更新后的全局特征线状态向量集合。
可选的,所述确定单元540还用于:
遍历所述N条历史特征线,依次计算每条历史特征线与所述第一特征线之间的马氏距离,得到N个马氏距离;
所述更新单元560具体用于:当所述N个马氏距离中的最小马氏距离小于预设阈值,利用所述特征线观测量和所述全局特征线状态向量集合,对所述第一位姿进行更新,以得到更新后的第一位姿。
可选的,所述更新单元560具体用于:
计算所述最小马氏距离对应的特征线的特征线状态向量与所述特征线观测量之间的偏差,
基于所述偏差,利用滤波方法更新所述第一位姿和所述全局特征线状态向量集合。
可选的,所述更新单元560还用于:当所述N个马氏距离中的最小马氏距离不小于预设阈值时,将所述特征线观测量添加到所述全局特征线状态向量集合中,以得到更新后的所述全局特征线状态向量集合。
可选的,在提取所述当前视觉图像帧的特征线的方面,所述提取单元520具体用于:
提取所述当前视觉图像帧的所有线段;
若提取的任意两条线段满足第一预设条件时,将所述任意两条线段合并为一条新的线段,直到不存在满足所述第一预设条件的线段为止;
若所述任意两条、经合并的线段满足第二预设条件时,将所述任意两条、经合并的线段作为同一条特征线输出;若所述任意两条、经合并的线段不满足第二预设条件时,将所述任意两条线段作为两条特征线输出。
可选的,在提取的任意两条线段满足第一预设条件时,将所述任意两条线段合并为一条新的线段的方面,所述提取单元520具体用于:
若提取的任意两条线段的端点之间的最小距离小于第一预设值且所述两条线段之间的距离小于第二预设值,且所述任意两条线段之间的夹角小于第三预设值时,将所述任意两条线段合并为一条新的线段。
可选的,在所述任意两条、经合并的线段满足第二预设条件时,将所述任意两条、经合并的线段作为同一条特征线输出的方面,所述提取单元520具体用于:
若所述任意两条、经合并的线段之间的夹角小于第四预设值,且所述两条线段的长度相同,且所述两条线段的重叠度大于第五预设值,且所述两条线段之间的距离小于第六预设值时,将所述任意两条线段作为同一条特征线输出。
可选的,所述确定单元540具体用于:
针对提取的特征线采用正交化参数进行最小化描述,得到所述特征线观测量。
可选的,所述获取单元550具体用于:
在所述摄像设备运动过程中,在所述当前视觉图像帧是关键帧且在所述当前视觉图像帧观测到特征线时,针对当前观测到的特征线与在先已经观测到的历史特征线进行关联匹配,所述关键帧是所述摄像设备运动过程中发生关键动作所处的帧;针对匹配成功的特征线,计算所述当前观测到的特征线与在先已观测到的每条历史特征线之间的重投影误差,利用所述重投影误差构造目标函数,最小化所述目标函数得到所述当前观测到的特征线的特征线状态向量,将其更新到所述全局特征线状态向量集合中;针对匹配失败的特征线,获取所述当前观测到的特征线的特征线状态向量,将其添加到所述全局特征线状态向量集合中。
基于同一构思,参阅图6所示,本申请实施例提供一种基于点线特征的视觉SLAM处理设备600,包括收发器610、处理器620、存储器630;存储器630用于存储程序、指令或代码;处理器620用于执行存储器630中的程序、指令或代码;
收发器610,用于接收摄像头输入的当前视觉图像帧;
处理器620,用于提取所述当前视觉图像帧的特征点和特征线;利用所述特征点预测所述摄像设备的第一位姿;对第一特征线进行观测,以确定所述第一特征线的特征线观测量,其中,所述第一特征线是提取到的所述特征线中的任意一条特征线;获取所述当前视觉图像帧中的全局特征线状态向量集合,所述全局特征线状态向量集合中包括N条历史特征线的特征线状态向量,N为正整数;利用所述特征线观测量和所述全局特征线状态向量集合,对所述第一位姿进行更新,以得到更新后的第一位姿。
其中,处理器620可能是一种集成电路芯片,具有信号的处理能力。在实现过程中,上述方法的各步骤可以通过处理器620中的硬件的集成逻辑电路或者软件形式的指令完成。上述的处理器620可以是通用处理器、数字信号处理器(DSP)、专用集成电路(ASIC)、现场可编程门阵列(FPGA)或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件。可以实现或者执行本申请实施例中公开的各方法、步骤及逻辑框图。通用处理器可以是微处理器,或者该处理器也可以是任何常规的处理器等。结合本申请实施例所公开的方法的步骤可以直接体现为硬件译码处理器执行完成,或者用译码处理器中的硬件及软件模块组合执行完成。软件模块可以位于随机存储器、闪存、只读存储器、可编程只读存储器或者电可擦写可编程存储器、寄存器等本领域成熟的存储介质中。该存储介质位于存储器630,处理器620读取存储器630中的信息,结合其硬件执行以上方法步骤。
本领域内的技术人员应明白,本申请实施例可提供为方法、系统、或计算机程序产品。因此,本申请实施例可采用完全硬件实施例、完全软件实施例、或结合软件和硬件方面的实施例的形式。而且,本申请实施例可采用在一个或多个其中包含有计算机可用程序代码的计算机可用存储介质(包括但不限于磁盘存储器、CD-ROM、光学存储器等)上实施的 计算机程序产品的形式。
本申请实施例是参照根据本申请实施例的方法、设备(系统)、和计算机程序产品的流程图和/或方框图来描述的。应理解可由计算机程序指令实现流程图和/或方框图中的每一流程和/或方框、以及流程图和/或方框图中的流程和/或方框的结合。可提供这些计算机程序指令到通用计算机、专用计算机、嵌入式处理机或其他可编程数据处理设备的处理器以产生一个机器,使得通过计算机或其他可编程数据处理设备的处理器执行的指令产生用于实现在流程图一个流程或多个流程和/或方框图一个方框或多个方框中指定的功能的装置。
这些计算机程序指令也可存储在能引导计算机或其他可编程数据处理设备以特定方式工作的计算机可读存储器中,使得存储在该计算机可读存储器中的指令产生包括指令装置的制造品,该指令装置实现在流程图一个流程或多个流程和/或方框图一个方框或多个方框中指定的功能。
这些计算机程序指令也可装载到计算机或其他可编程数据处理设备上,使得在计算机或其他可编程设备上执行一系列操作步骤以产生计算机实现的处理,从而在计算机或其他可编程设备上执行的指令提供用于实现在流程图一个流程或多个流程和/或方框图一个方框或多个方框中指定的功能的步骤。
显然,本领域的技术人员可以对本申请实施例进行各种改动和变型而不脱离本申请的精神和范围。这样,倘若本申请实施例的这些修改和变型属于本申请权利要求及其等同技术的范围之内,则本申请也意图包含这些改动和变型在内。

Claims (22)

  1. 一种基于点线特征的视觉SLAM方法,应用于采集周围图像的摄像设备,其特征在于,包括:
    接收摄像头输入的当前视觉图像帧;
    提取所述当前视觉图像帧的特征点和特征线;
    利用所述特征点预测所述摄像设备的第一位姿;
    对第一特征线进行观测,以确定所述第一特征线的特征线观测量,其中,所述第一特征线是提取到的所述特征线中的任意一条特征线;
    获取所述当前视觉图像帧中的全局特征线状态向量集合,所述全局特征线状态向量集合中包括N条历史特征线的特征线状态向量,N为正整数;
    利用所述特征线观测量和所述全局特征线状态向量集合,对所述第一位姿进行更新,以得到更新后的第一位姿。
  2. 如权利要求1所述的方法,其特征在于,所述方法还包括:
    利用所述特征线观测量、所述第一位姿对所述全局特征线状态向量集合进行更新,以得到更新后的全局特征线状态向量集合。
  3. 如权利要求1所述的方法,其特征在于,所述方法还包括:遍历所述N条历史特征线,依次计算每条历史特征线与所述第一特征线之间的马氏距离,得到N个马氏距离;
    所述利用所述特征线观测量、所述全局特征线状态向量集合,对所述第一位姿进行更新,以得到更新后的第一位姿,包括:
    当所述N个马氏距离中的最小马氏距离小于预设阈值时,利用所述特征线观测量、所述全局特征线状态向量集合,对所述第一位姿进行更新,以得到更新后的第一位姿。
  4. 如权利要求3所述的方法,其特征在于,所述利用所述特征线观测量、所述全局特征线状态向量集合,对所述第一位姿进行更新,以得到更新后的第一位姿,包括:
    计算所述最小马氏距离对应的特征线的特征线状态向量与所述特征线观测量之间的偏差,
    基于所述偏差,利用滤波方法更新所述第一位姿和所述全局特征线状态向量集合。
  5. 如权利要求3所述的方法,其特征在于,所述方法还包括:
    当所述N个马氏距离中的最小马氏距离不小于预设阈值时,将所述特征线观测量添加到所述全局特征线状态向量集合中,以得到更新后的所述全局特征线状态向量集合。
  6. 如权利要求1所述的方法,其特征在于,所述提取所述当前视觉图像帧的特征线,包括:
    提取所述当前视觉图像帧的所有线段;
    若提取的任意两条线段满足第一预设条件时,将所述任意两条线段合并为一条新的线段,直到不存在满足所述第一预设条件的线段为止;
    若所述任意两条、经合并的线段满足第二预设条件时,将所述任意两条、经合并的线段作为同一条特征线输出;若所述任意两条、经合并的线段不满足第二预设条件时,将所述任意两条线段作为两条特征线输出。
  7. 如权利要求6所述的方法,其特征在于,若提取的任意两条线段满足第一预设条件时,将所述任意两条线段合并为一条新的线段,包括:
    若提取的任意两条线段的端点之间的最小距离小于第一预设值且所述两条线段之间的距离小于第二预设值,且所述任意两条线段之间的夹角小于第三预设值时,将所述任意两条线段合并为一条新的线段。
  8. 如权利要求6所述的方法,其特征在于,若所述任意两条、经合并的线段满足第二预设条件时,将所述任意两条、经合并的线段作为同一条特征线输出,包括:
    若所述任意两条、经合并的线段之间的夹角小于第四预设值,且所述两条线段的长度相同,且所述两条线段的重叠度大于第五预设值,且所述两条线段之间的距离小于第六预设值时,将所述任意两条线段作为同一条特征线输出。
  9. 如权利要求1-8任一项所述的方法,其特征在于,所述对所述第一特征线进行观测,以确定所述第一特征线的特征线观测量,包括:
    针对提取的第一特征线采用正交化参数进行最小化描述,得到所述特征线观测量。
  10. 如权利要求1-9任一项所述的方法,其特征在于,所述获取所述当前视觉图像帧中的全局特征线状态向量集合,包括:
    在所述摄像设备运动过程中,在所述当前视觉图像帧是关键帧且观测到特征线时,针对当前观测到的特征线与在先已经观测到的历史特征线进行关联匹配,所述关键帧是所述摄像设备运动过程中发生关键动作所处的帧;
    针对匹配成功的特征线,计算所述当前观测到的特征线与在先已观测到的每条历史特征线之间的重投影误差,利用所述重投影误差构造目标函数,最小化所述目标函数得到所述当前观测到的特征线的特征线状态向量,将其更新到所述全局特征线状态向量集合中;
    针对匹配失败的特征线,获取所述当前观测到的特征线的特征线状态向量,将其添加到所述全局特征线状态向量集合中。
  11. 一种基于点线特征的视觉SLAM装置,应用于采集周围图像的摄像设备,其特征在于,包括:
    接收单元,用于接收摄像头输入的当前视觉图像帧;
    提取单元,用于提取所述当前视觉图像帧的特征点和特征线;
    预测单元,用于利用所述特征点预测所述摄像设备的第一位姿;
    确定单元,用于对第一特征线进行观测,以确定所述第一特征线的特征线观测量,其中,所述第一特征线是提取到的所述特征线中的任意一条特征线;
    获取单元,用于获取所述当前视觉图像帧中的全局特征线状态向量集合,所述全局特征线状态向量集合中包括N条历史特征线的特征线状态向量,N为正整数;
    更新单元,用于利用所述特征线观测量和所述全局特征线状态向量集合,对所述第一位姿进行更新,以得到更新后的第一位姿。
  12. 如权利要求11所述的装置,其特征在于,所述更新单元还用于:
    利用所述特征线观测量、所述第一位姿对所述全局特征线状态向量集合进行更新,以得到更新后的全局特征线状态向量集合。
  13. 如权利要求11所述的装置,其特征在于,所述确定单元还用于:
    遍历所述N条历史特征线,依次计算每条历史特征线与所述第一特征线之间的马氏距离,得到N个马氏距离;
    所述更新单元具体用于:当所述N个马氏距离中的最小马氏距离小于预设阈值,利用所述特征线观测量和所述全局特征线状态向量集合,对所述第一位姿进行更新,以得到更新后的第一位姿。
  14. 如权利要求11或12所述的装置,其特征在于,所述更新单元具体用于:
    计算所述最小马氏距离对应的特征线的特征线状态向量与所述特征线观测量之间的偏差,
    基于所述偏差,利用滤波方法更新所述第一位姿和所述全局特征线状态向量集合。
  15. 如权利要求13所述的装置,其特征在于,所述更新单元还用于:当所述N个马氏距离中的最小马氏距离不小于预设阈值时,将所述特征线观测量添加到所述全局特征线状态向量集合中,以得到更新后的所述全局特征线状态向量集合。
  16. 如权利要求11所述的装置,其特征在于,在提取所述当前视觉图像帧的特征线的方面,所述提取单元具体用于:
    提取所述当前视觉图像帧的所有线段;
    若提取的任意两条线段满足第一预设条件时,将所述任意两条线段合并为一条新的线段,直到不存在满足所述第一预设条件的线段为止;
    若所述任意两条、经合并的线段满足第二预设条件时,将所述任意两条、经合并的线段作为同一条特征线输出;若所述任意两条、经合并的线段不满足第二预设条件时,将所述任意两条线段作为两条特征线输出。
  17. 如权利要求16所述的装置,其特征在于,在提取的任意两条线段满足第一预设条件时,将所述任意两条线段合并为一条新的线段的方面,所述提取单元具体用于:
    若提取的任意两条线段的端点之间的最小距离小于第一预设值且所述两条线段之间的距离小于第二预设值,且所述任意两条线段之间的夹角小于第三预设值时,将所述任意两条线段合并为一条新的线段。
  18. 如权利要求16所述的装置,其特征在于,在所述任意两条、经合并的线段满足第二预设条件时,将所述任意两条、经合并的线段作为同一条特征线输出的方面,所述提取单元具体用于:
    若所述任意两条、经合并的线段之间的夹角小于第四预设值,且所述两条线段的长度相同,且所述两条线段的重叠度大于第五预设值,且所述两条线段之间的距离小于第六预设值时,将所述任意两条线段作为同一条特征线输出。
  19. 如权利要求11所述的装置,其特征在于,所述确定单元具体用于:
    针对提取的特征线采用正交化参数进行最小化描述,得到所述特征线观测量。
  20. 如权利要求11所述的装置,其特征在于,所述获取单元具体用于:
    在所述摄像设备运动过程中,在所述当前视觉图像帧是关键帧且在所述当前视觉图像帧观测到特征线时,针对当前观测到的特征线与在先已经观测到的历史特征线进行关联匹配,所述关键帧是所述摄像设备运动过程中发生关键动作所处的帧;
    针对匹配成功的特征线,计算所述当前观测到的特征线与在先已观测到的每条历史特征线之间的重投影误差,利用所述重投影误差构造目标函数,最小化所述目标函数得到所述当前观测到的特征线的特征线状态向量,将其更新到所述全局特征线状态向量集合中;
    针对匹配失败的特征线,获取所述当前观测到的特征线的特征线状态向量,将其添加到所述全局特征线状态向量集合中。
  21. 一种计算机存储介质,其特征在于,所述计算机存储介质存储有计算机可执行指令,当所述计算机可执行指令在计算机上运行时,使得所述计算机执行权利要求1-10任一 项所述的方法。
  22. 一种计算机程序产品,其特征在于,所述计算机程序产品包括存储在如权利要求21所述的计算机存储介质上的计算机可执行指令,当所述计算机可执行指令在计算机上运行时,使所述计算机执行权利要求1-10任一项所述的方法。
PCT/CN2018/107097 2017-09-22 2018-09-21 一种基于点线特征的视觉slam方法和装置 WO2019057179A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP18859404.8A EP3680809A4 (en) 2017-09-22 2018-09-21 METHOD AND APPARATUS FOR SIMULTANEOUS LOCATION AND MAPPING BY VISUAL SLAM BASED ON A CHARACTERISTIC OF POINTS AND LINES
US16/824,219 US11270148B2 (en) 2017-09-22 2020-03-19 Visual SLAM method and apparatus based on point and line features

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
CN201710868034 2017-09-22
CN201710868034.6 2017-09-22
CN201810184021.1 2018-03-06
CN201810184021.1A CN109558879A (zh) 2017-09-22 2018-03-06 一种基于点线特征的视觉slam方法和装置

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US16/824,219 Continuation US11270148B2 (en) 2017-09-22 2020-03-19 Visual SLAM method and apparatus based on point and line features

Publications (1)

Publication Number Publication Date
WO2019057179A1 true WO2019057179A1 (zh) 2019-03-28

Family

ID=65811029

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/107097 WO2019057179A1 (zh) 2017-09-22 2018-09-21 一种基于点线特征的视觉slam方法和装置

Country Status (2)

Country Link
US (1) US11270148B2 (zh)
WO (1) WO2019057179A1 (zh)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113012196A (zh) * 2021-03-05 2021-06-22 华南理工大学 一种基于双目相机与惯导传感器信息融合的定位方法
CN113536024A (zh) * 2021-08-11 2021-10-22 重庆大学 一种基于fpga的orb_slam重定位特征点检索加速方法
CN113570535A (zh) * 2021-07-30 2021-10-29 深圳市慧鲤科技有限公司 视觉定位方法及相关装置、设备

Families Citing this family (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019057179A1 (zh) * 2017-09-22 2019-03-28 华为技术有限公司 一种基于点线特征的视觉slam方法和装置
CN112013844B (zh) * 2019-05-31 2022-02-11 北京小米智能科技有限公司 建立室内环境地图的方法及装置
CN110930519B (zh) * 2019-11-14 2023-06-20 华南智能机器人创新研究院 基于环境理解的语义orb-slam感知方法及装置
US20220103831A1 (en) * 2020-09-30 2022-03-31 Alibaba Group Holding Limited Intelligent computing resources allocation for feature network based on feature propagation
CN112305554B (zh) * 2020-11-23 2021-05-28 中国科学院自动化研究所 基于有向几何点和稀疏帧的激光里程计方法、系统、装置
JP7451456B2 (ja) * 2021-03-22 2024-03-18 株式会社東芝 運動推定装置及びそれを用いた運動推定方法
CN113192140B (zh) * 2021-05-25 2022-07-12 华中科技大学 一种基于点线特征的双目视觉惯性定位方法和系统
CN113392909B (zh) * 2021-06-17 2022-12-27 深圳市睿联技术股份有限公司 数据处理方法、数据处理装置、终端及可读存储介质
CN113393524B (zh) * 2021-06-18 2023-09-26 常州大学 一种结合深度学习和轮廓点云重建的目标位姿估计方法
CN113608236A (zh) * 2021-08-03 2021-11-05 哈尔滨智兀科技有限公司 一种基于激光雷达及双目相机的矿井机器人定位建图方法
CN114170366B (zh) * 2022-02-08 2022-07-12 荣耀终端有限公司 基于点线特征融合的三维重建方法及电子设备
CN114627556B (zh) * 2022-03-15 2023-04-07 北京百度网讯科技有限公司 动作检测方法、动作检测装置、电子设备以及存储介质
CN116071538B (zh) * 2023-03-03 2023-06-27 天津渤海职业技术学院 一种基于slam的机器人定位系统

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103886107A (zh) * 2014-04-14 2014-06-25 苏州市华天雄信息科技有限公司 基于天花板图像信息的机器人定位与地图构建系统
CN106444757A (zh) * 2016-09-27 2017-02-22 成都普诺思博科技有限公司 基于直线特征地图的ekf‑slam算法
CN106897666A (zh) * 2017-01-17 2017-06-27 上海交通大学 一种室内场景识别的闭环检测方法

Family Cites Families (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5253066B2 (ja) * 2008-09-24 2013-07-31 キヤノン株式会社 位置姿勢計測装置及び方法
US8311285B2 (en) * 2009-06-30 2012-11-13 Mitsubishi Electric Research Laboratories, Inc. Method and system for localizing in urban environments from omni-direction skyline images
US8855406B2 (en) * 2010-09-10 2014-10-07 Honda Motor Co., Ltd. Egomotion using assorted features
KR101750340B1 (ko) * 2010-11-03 2017-06-26 엘지전자 주식회사 로봇 청소기 및 이의 제어 방법
US9031782B1 (en) * 2012-01-23 2015-05-12 The United States Of America As Represented By The Secretary Of The Navy System to use digital cameras and other sensors in navigation
US9183631B2 (en) * 2012-06-29 2015-11-10 Mitsubishi Electric Research Laboratories, Inc. Method for registering points and planes of 3D data in multiple coordinate systems
US9734586B2 (en) * 2012-09-21 2017-08-15 The Schepens Eye Research Institute, Inc. Collision prediction
US10306206B2 (en) * 2013-07-23 2019-05-28 The Regents Of The University Of California 3-D motion estimation and online temporal calibration for camera-IMU systems
US20150269436A1 (en) * 2014-03-18 2015-09-24 Qualcomm Incorporated Line segment tracking in computer vision applications
US20150371440A1 (en) * 2014-06-19 2015-12-24 Qualcomm Incorporated Zero-baseline 3d map initialization
CN104077809B (zh) 2014-06-24 2017-04-12 上海交通大学 基于结构性线条的视觉slam方法
WO2017076929A1 (en) * 2015-11-02 2017-05-11 Starship Technologies Oü Device and method for autonomous localisation
US9807365B2 (en) * 2015-12-08 2017-10-31 Mitsubishi Electric Research Laboratories, Inc. System and method for hybrid simultaneous localization and mapping of 2D and 3D data acquired by sensors from a 3D scene
WO2018111920A1 (en) * 2016-12-12 2018-06-21 The Charles Stark Draper Laboratory, Inc. System and method for semantic simultaneous localization and mapping of static and dynamic objects
CN106909877B (zh) 2016-12-13 2020-04-14 浙江大学 一种基于点线综合特征的视觉同时建图与定位方法
WO2019057179A1 (zh) * 2017-09-22 2019-03-28 华为技术有限公司 一种基于点线特征的视觉slam方法和装置
WO2019086465A1 (en) * 2017-11-02 2019-05-09 Starship Technologies Oü Visual localization and mapping in low light conditions
CN108665508B (zh) * 2018-04-26 2022-04-05 腾讯科技(深圳)有限公司 一种即时定位与地图构建方法、装置及存储介质
US10726264B2 (en) * 2018-06-25 2020-07-28 Microsoft Technology Licensing, Llc Object-based localization
US10740645B2 (en) * 2018-06-29 2020-08-11 Toyota Research Institute, Inc. System and method for improving the representation of line features
CN109903330B (zh) * 2018-09-30 2021-06-01 华为技术有限公司 一种处理数据的方法和装置
CN112013844B (zh) * 2019-05-31 2022-02-11 北京小米智能科技有限公司 建立室内环境地图的方法及装置

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103886107A (zh) * 2014-04-14 2014-06-25 苏州市华天雄信息科技有限公司 基于天花板图像信息的机器人定位与地图构建系统
CN106444757A (zh) * 2016-09-27 2017-02-22 成都普诺思博科技有限公司 基于直线特征地图的ekf‑slam算法
CN106897666A (zh) * 2017-01-17 2017-06-27 上海交通大学 一种室内场景识别的闭环检测方法

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
HYUKDOO CHOI: "CV-SLAM using Line and Point Features", 12TH INTERNATIONAL CONFERENCE ON CONTROL, AUTOMATION AND SYSTEMS 2012 12 31, 31 December 2012 (2012-12-31), XP032291427 *
XIE, XIAOJIA: "Stereo Visual SLAM Using Point and Line Features ", ELECTRONIC TECHNOLOGY & INFORMATION SCIENCE CHINA MATERS THESES, 15 August 2017 (2017-08-15), XP055684025 *


Also Published As

Publication number Publication date
US11270148B2 (en) 2022-03-08
US20200218929A1 (en) 2020-07-09

Similar Documents

Publication Publication Date Title
US11270148B2 (en) Visual SLAM method and apparatus based on point and line features
EP3680809A1 (en) Visual slam method and apparatus based on point and line characteristic
Yang et al. Monocular object and plane slam in structured environments
US20230258455A1 (en) Simultaneous location and mapping (slam) using dual event cameras
CN108564616B (zh) 快速鲁棒的rgb-d室内三维场景重建方法
Yang et al. Pop-up slam: Semantic monocular plane slam for low-texture environments
CN107980150B (zh) 对三维空间建模
CN110555901B (zh) 动静态场景的定位和建图方法、装置、设备和存储介质
WO2022188094A1 (zh) 一种点云匹配方法及装置、导航方法及设备、定位方法、激光雷达
CN112132897A (zh) 一种基于深度学习之语义分割的视觉slam方法
CN108229416B (zh) 基于语义分割技术的机器人slam方法
AU2016246024A1 (en) Method and device for real-time mapping and localization
CN110097584B (zh) 结合目标检测和语义分割的图像配准方法
Zhang et al. Building a partial 3D line-based map using a monocular SLAM
Li et al. Review of vision-based Simultaneous Localization and Mapping
Zhang et al. Hand-held monocular SLAM based on line segments
CN112419497A (zh) 基于单目视觉的特征法与直接法相融合的slam方法
CN114782499A (zh) 一种基于光流和视图几何约束的图像静态区域提取方法及装置
CN111998862A (zh) 一种基于bnn的稠密双目slam方法
CN113674400A (zh) 基于重定位技术的光谱三维重建方法、系统及存储介质
Rückert et al. Snake-SLAM: Efficient global visual inertial SLAM using decoupled nonlinear optimization
Zhang LILO: A Novel Lidar–IMU SLAM System With Loop Optimization
Elhashash et al. Cross-view SLAM solver: Global pose estimation of monocular ground-level video frames for 3D reconstruction using a reference 3D model from satellite images
Li-Chee-Ming et al. Augmenting visp’s 3d model-based tracker with rgb-d slam for 3d pose estimation in indoor environments
Ma et al. An Improved Feature-Based Visual Slam Using Semantic Information

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18859404

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2018859404

Country of ref document: EP

Effective date: 20200406