CN114972501B - Visual positioning method based on prior semantic map structure information and semantic information - Google Patents


Info

Publication number
CN114972501B
Authority
CN
China
Prior art keywords
semantic
visual
map
prior
landmark
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210423500.0A
Other languages
Chinese (zh)
Other versions
CN114972501A (en
Inventor
张云洲
梁世文
田瑞
杨凌昊
曹振中
Original Assignee
东北大学 (Northeastern University)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 东北大学 (Northeastern University)
Priority to CN202210423500.0A
Publication of CN114972501A
Application granted
Publication of CN114972501B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/70 Determining position or orientation of objects or cameras
    • G06T 7/73 Determining position or orientation of objects or cameras using feature-based methods
    • G06T 7/20 Analysis of motion
    • G06T 7/215 Motion-based segmentation
    • G06T 7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/30 Subject of image; Context of image processing
    • G06T 2207/30244 Camera pose

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention belongs to the field of visual SLAM and provides a visual positioning method based on the structural information and semantic information of a prior semantic map. A prior-semantic-map factor is introduced into visual positioning through a hybrid constraint that fuses the semantic and structural information of the prior map. Data associations between the visual landmarks and the prior semantic map are then established, and an expectation-maximization algorithm jointly optimizes the data associations and the camera pose, improving the accuracy and robustness of visual positioning. The algorithm effectively limits the drift error of the visual odometry and improves positioning accuracy for application scenarios such as navigation, achieving high positioning precision while meeting real-time requirements.

Description

Visual positioning method based on prior semantic map structure information and semantic information
Technical Field
The invention relates to the field of visual SLAM (Simultaneous Localization and Mapping), and in particular to a visual positioning algorithm based on the structural information and semantic information of a prior semantic map.
Background
With the ever wider application of mobile robots and autonomous vehicles, providing accurate and robust poses for such platforms is an urgent problem. Visual SLAM estimates the camera's own pose and builds a map from images captured by a camera. As one of the mainstream positioning technologies, visual SLAM has attracted a great deal of research because cameras are low-cost and lightweight. However, compared with lidar-based SLAM algorithms, traditional visual SLAM relies mainly on point, line and plane features in the environment to localize itself and build maps, and is therefore easily affected by factors such as ambient illumination and viewpoint changes. In autonomous driving and mobile robotics, one possible approach is to use a high-precision map built offline to provide constraints that limit the cumulative drift error of the visual odometry. Such a method can ensure the accuracy of visual positioning while improving its robustness, so that visual positioning can meet the requirements of practical applications.
The method in IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 4588-4594, 2020 extracts the 3D line features in a prior semantic map and matches them against 2D line features extracted from images. Based on this 2D-3D line-feature data association, a prior semantic map constraint is introduced into the visual positioning system to limit the accumulated drift error of the visual odometry. Journal of Field Robotics, 1003-1026, 2020 proposes the ProW-NDT point-cloud matching algorithm to estimate the coordinate transformation between a local visual map and a prior laser map, and fuses the point-cloud registration results with the visual odometry by optimizing the keyframe poses in a sliding window with a local pose-graph optimization algorithm.
The visual localization algorithms of IROS 2020 (4588-4594) and Journal of Field Robotics 2020 (1003-1026) use only the structural information of the prior semantic map and ignore its semantic information. When visual localization is performed in large outdoor scenes, using structural information alone may reduce robustness or even cause localization to fail. To ensure both accuracy and robustness, a joint constraint over the semantic and structural information of the prior semantic map needs to be introduced.
Disclosure of Invention
The invention aims to provide a visual positioning algorithm based on the structural information and semantic information of a prior semantic map, improving the accuracy and robustness of visual positioning.
The technical scheme of the invention is as follows. A visual positioning algorithm based on prior semantic map structural information and semantic information comprises the following steps:
(1) Acquire structural information and semantic information from the prior semantic map M, and take the visual image features U and the semantic segmentation images C as the system observation O = {U, C}. Let the camera pose be T and the coordinate transformation between the visual image and the prior semantic map be S; the system state X = {T, S} and the semantic landmark coordinates P are estimated from the prior semantic map M and the system observation O. The posterior probability of the camera state and semantic landmarks given the prior semantic map and the system observation is maximized, yielding a visual observation factor, a semantic tracking factor and a prior semantic map factor:
p(X, P | M, O) ∝ p(U | P, T) · p(C | Z) · p(P | M, S)    (1)
where Z denotes the semantic label of a semantic landmark, p(U | P, T) is the visual observation factor, p(C | Z) is the semantic tracking factor, and p(P | M, S) is the prior semantic map factor;
(2) Perform continuous visual tracking of the semantic landmarks by matching visual images with visual feature descriptors;
Let the camera pose of the k-th image frame be T_k and the coordinates of the i-th semantic landmark be P_i. The reprojection error of the feature-point method is used as the residual term E_visual of the visual observation factor p(U | P, T), and the initial pose T_k of each visual frame is estimated;
where π(·) denotes the reprojection function, u_{i,k} the image feature of the i-th semantic landmark on the k-th frame, Σ_{i,k} the covariance matrix of the corresponding visual data association, and Ω the set of visual data associations;
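Although the residual equation itself is not reproduced in this text, E_visual is the standard Mahalanobis-weighted reprojection error. A minimal sketch, assuming a pinhole camera with intrinsics K and a 4x4 world-to-camera pose T_k (the function names are illustrative, not from the patent):

```python
import numpy as np

def reproject(K, T, P):
    """Project a 3D landmark P (world frame) into the image.

    T is a 4x4 world-to-camera transform, K the 3x3 pinhole intrinsics.
    """
    P_cam = T[:3, :3] @ P + T[:3, 3]     # world -> camera frame
    uv = K @ (P_cam / P_cam[2])          # perspective division
    return uv[:2]

def visual_residual(K, T_k, P_i, u_ik, Sigma_ik):
    """Squared Mahalanobis reprojection error for one landmark observation."""
    r = u_ik - reproject(K, T_k, P_i)
    return float(r @ np.linalg.solve(Sigma_ik, r))   # r^T Sigma^{-1} r
```

Summing this residual over all associations in Ω gives the E_visual term minimized when estimating T_k.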
(3) Perform continuous semantic tracking of the semantic landmarks on the semantic segmentation images based on the Dirichlet distribution, estimating the semantic label Z of each landmark from the semantic tracking factor p(C | Z) and the segmentation images C;
Define g_i = [p(z_i = 1), p(z_i = 2), ..., p(z_i = H)]^T as the discrete probability of the semantic label z_i of the i-th semantic landmark; define the observation of the i-th landmark on the k-th segmentation image as c_{i,k}, and update the distribution of g_i with these observations;
The discrete probability distribution of z_i, i.e. the semantic state D_i, is estimated from the multi-frame semantic observations C_i = {c_{i,k} | k ∈ K}, where K denotes the set of segmentation images in which the i-th landmark is observed;
D_i = p(g_i | C_i) ∝ p(C_i | g_i) · Dir(g_i | α_i)    (3)
where Dir(·) denotes the Dirichlet distribution and α_i = [α_1, α_2, ..., α_H]^T its parameter vector; p(C_i | g_i) follows a multinomial distribution, so updating the semantic state of a landmark reduces to updating the Dirichlet parameters:
α_i^new = α_i^old + c_{i,k}    (4)
where α_i^new denotes α_i after the update and α_i^old before it;
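The conjugate update of formula (4) can be sketched as follows; a minimal illustration, where treating c_{i,k} as a one-hot count vector is an assumption for the case of a hard segmentation label:

```python
import numpy as np

def update_semantic_state(alpha, c_ik):
    """Formula (4): alpha_new = alpha_old + c_ik (conjugate Dirichlet update).

    alpha: current Dirichlet parameters over the H semantic classes.
    c_ik:  per-class observation counts from the k-th segmentation image
           (assumed one-hot here when the segmenter gives a hard label).
    """
    return alpha + c_ik

def label_distribution(alpha):
    """Posterior mean of Dir(alpha): the discrete label distribution g_i."""
    return alpha / alpha.sum()
```

Starting from a uniform prior alpha = np.ones(H), repeated observations of the same class concentrate the distribution on that class.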
(4) In the localization thread, the prior semantic map M is further divided into several sub-maps M_l according to its semantic labels, and a data association between each semantic landmark p_i and the sub-maps M_l is established. Let S_k be the coordinate transformation between the k-th visual frame and the prior semantic map. From the prior semantic map factor p(P | M, S) in formula (1), the structural information term and the semantic information term p(z_i | M, S_k, p_i) are extracted to construct a hybrid constraint, which serves as the prior constraint of the algorithm:
(4.1) The residual of the structural information term in formula (5) is defined as:
where the associated information matrix represents the probability distribution of the local point cloud; the structural information of the prior semantic map is divided into planar and non-planar structures;
For planar structures, the residual term of formula (6) is modeled as the point-to-plane distance, i.e.:
where n_l and q_l denote, respectively, the surface normal vector and the 3D point coordinates of the associated map element; for planar structures the information matrix is taken to be the identity matrix;
for non-planar structures, the point-to-point residual form of the ICP algorithm is adopted, i.e.:
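The two residual forms (the point-to-plane term of formula (7) and the ICP-style point-to-point term) can be sketched as follows. This is an illustrative reading of the text, with n_l and q_l the associated map element's normal and point as defined above:

```python
import numpy as np

def point_to_plane_residual(n_l, q_l, p):
    """Planar-structure residual: signed distance of landmark p to the
    plane through q_l with unit normal n_l (information matrix = identity)."""
    return float(n_l @ (p - q_l))

def point_to_point_residual(q_l, p, info):
    """Non-planar residual in ICP form: squared Mahalanobis distance of p
    to the matched map point q_l, weighted by the information matrix of
    the local point-cloud distribution."""
    d = p - q_l
    return float(d @ info @ d)
```

In the hybrid constraint, each landmark contributes one of the two residuals depending on whether its associated sub-map element is planar.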
(4.2) p(z_i | M, S_k, p_i) in formula (5) is the semantic weight w_i of the i-th semantic landmark, which is defined to follow a Dirichlet distribution: w_i ~ Dir(β_i), where β_i = [β_1, β_2, ..., β_H]^T is the Dirichlet parameter vector; it fuses the semantic states of the landmarks tracked in the localization thread into the optimization;
where γ is the balance coefficient;
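The definition of β (the patent's formula for it is not reproduced in this text) can be illustrated under an assumption: the sub-map's own label l is boosted by the balance coefficient γ on top of the tracked semantic state α, and w_i is taken as the Dirichlet expectation at class l:

```python
import numpy as np

def semantic_weight(alpha, l, gamma):
    """Illustrative weight of associating landmark i with sub-map label l.

    Assumption (the patent's exact beta definition is not shown here):
    beta fuses the tracked semantic state alpha with the sub-map's label
    via the balance coefficient gamma; w_i is E[Dir(beta)] at class l.
    """
    beta = np.asarray(alpha, dtype=float).copy()
    beta[l] += gamma                   # boost the sub-map's own class
    return float(beta[l] / beta.sum()) # Dirichlet expectation of class l
```

Landmarks whose tracked semantics agree with a sub-map's label thus receive larger weights in the hybrid constraint.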
(5) In the localization thread, based on the proposed hybrid constraint, the coordinate transformation S_k between the prior semantic map and the world coordinate system is solved with an expectation-maximization (EM) algorithm; from the solved S_k, the 6-degree-of-freedom pose of the camera in the prior semantic map is estimated.
The flow of estimating the coordinate transformation between the initial visual frame and the prior semantic map with the EM algorithm is as follows:
1) E-step: for each semantic landmark, build the data association between the landmark and the prior semantic map of label Z by nearest-neighbour search, and estimate the weight w_i of each association with formula (9);
2) M-step: given the data associations and the weights w_i estimated by formula (9), solve S_k from the following optimization model:
where S_k is the transformation to be solved; the coordinate transformation S_0 between the prior semantic map and the initial visual frame is initialized as the identity matrix.
The E-step and the M-step are repeated until convergence or until a set number of iterations is reached; finally, the pose of the current camera in the prior semantic map is computed from the solved S_k.
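The E/M alternation above can be sketched with a weighted closed-form rigid alignment in the M-step. This is a simplification: the patent optimizes the full hybrid constraint, whereas this sketch uses point-to-point terms only, and all function names are illustrative:

```python
import numpy as np

def weighted_kabsch(P, Q, w):
    """Closed-form rigid (R, t) minimizing sum_i w_i ||R p_i + t - q_i||^2."""
    w = np.asarray(w, float)
    w = w / w.sum()
    mu_p, mu_q = w @ P, w @ Q
    H = (P - mu_p).T @ ((Q - mu_q) * w[:, None])   # weighted cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))         # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return R, mu_q - R @ mu_p

def em_align(landmarks, submaps, labels, weight_of, iters=10):
    """E-step: nearest neighbour in the sub-map sharing the landmark's label;
    M-step: weighted rigid alignment. Returns the map alignment (R, t)."""
    R, t = np.eye(3), np.zeros(3)
    for _ in range(iters):
        P, Q, w = [], [], []
        for p, lbl in zip(landmarks, labels):
            m = submaps[lbl]                                  # same-label sub-map
            j = np.argmin(np.linalg.norm(m - (R @ p + t), axis=1))
            P.append(p); Q.append(m[j]); w.append(weight_of(lbl))
        R, t = weighted_kabsch(np.asarray(P), np.asarray(Q), np.asarray(w))
    return R, t
```

Restricting the nearest-neighbour search to the sub-map of the landmark's own label is what injects the semantic information into the data association.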
From the solved coordinate transformation S_k, the 3D coordinates of the semantic landmarks are further refined; let the landmark coordinates before optimization be p_i^old; the refined coordinates are:
where S^old denotes the transformation between the world coordinate system and the prior semantic map coordinate system before the expectation-maximization optimization; the system then further optimizes the camera pose T and the landmark coordinates P with a local bundle adjustment (BA) algorithm.
The planar structures include roads, sidewalks and walls.
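The landmark refinement can be illustrated as follows, under the assumption (not spelled out in the text) that S maps world coordinates into the prior-map frame and that the refined landmark is p_new = S_new^{-1} · S_old · p_old:

```python
import numpy as np

def refine_landmark(p_old, S_old, S_new):
    """Re-express a landmark under the refined map alignment.

    S_old, S_new: 4x4 world->map transforms before and after the EM
    refinement. The exact composition in the patent's refinement formula
    is not reproduced here, so p_new = S_new^{-1} @ S_old @ p_old is an
    illustrative assumption.
    """
    p_h = np.append(p_old, 1.0)                    # homogeneous coordinates
    return (np.linalg.inv(S_new) @ S_old @ p_h)[:3]
```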
The beneficial effects of the invention are as follows. Addressing the shortcoming that existing prior-map-based algorithms exploit only the geometric structure of the prior semantic map, the invention provides a visual positioning algorithm based on both the structural and the semantic information of the prior semantic map, extracting both as prior constraints. Unlike other approaches that constrain the camera pose with geometric features alone, the invention introduces the prior-semantic-map factor into visual positioning through a hybrid constraint fusing semantic and structural information. Data associations between the visual landmarks and the prior semantic map are then established, and an Expectation-Maximization (EM) algorithm jointly optimizes the data associations and the camera pose, improving the accuracy and robustness of visual positioning. The algorithm effectively limits the drift error of the visual odometry and improves positioning accuracy for application scenarios such as navigation, achieving high positioning precision while meeting real-time requirements.
Drawings
FIG. 1 is a flow chart of the visual localization algorithm based on prior semantic map structural information and semantic information.
Detailed Description
Fig. 1 shows the main flow of the technical scheme of the invention. The proposed visual positioning algorithm takes the prior semantic map M as input to acquire structural and semantic information, and takes the visual features U and the semantic segmentation images C as the system observation O = {U, C}. Let the camera pose be T and the coordinate transformation between the visual image and the prior semantic map be S; the system estimates the camera state X = {T, S} and the visual landmarks P from the prior semantic map M and the system observations. The whole problem is modeled as a maximum a posteriori estimate, which serves as the prior constraint of the visual positioning algorithm and limits the drift error of the visual odometry:
p(X, P | M, O) ∝ p(U | P, T) · p(C | Z) · p(P | M, S),    (1)
where Z denotes the semantic label of a visual landmark; p(U | P, T) is the visual observation factor, p(C | Z) the semantic tracking factor, and p(P | M, S) the prior semantic map factor.
As shown in fig. 1, the visual positioning algorithm based on prior semantic map structure information and semantic information provided by the invention comprises the following steps:
(1) In the tracking thread, the visual images and the semantic segmentation images are used as system observations, and the semantic landmarks are tracked continuously based on visual feature descriptors and the Dirichlet distribution.
In the front end, data associations between the visual landmarks and ORB (Oriented FAST and Rotated BRIEF) features are established by feature matching, and the initial pose T_k of each frame is estimated with a PnP (Perspective-n-Point) algorithm. For the visual observation factor p(U | P, T) in formula (1), the reprojection error of the feature-point method is used as the residual term:
where π(·) denotes the reprojection function; u_{i,k} denotes the image feature of the i-th semantic landmark on the k-th frame, Σ_{i,k} the covariance matrix of the corresponding visual data association, and Ω the set of visual data associations.
For the semantic tracking factor p(C | Z) in formula (1), the semantic label Z of each visual landmark is estimated from the semantic segmentation result, i.e. the segmentation image C. The system does not quantify the distribution p(C | Z) directly; instead, it estimates the probability distribution of the semantic attribute Z. We define g_i = [p(z_i = 1), p(z_i = 2), ..., p(z_i = H)]^T as the discrete probability of z_i and update the distribution of g_i with the corresponding semantic observations c_{i,k}. Based on this analysis, the system estimates the discrete probability distribution of the semantic label z_i of each landmark, i.e. the semantic state D_i, from the multi-frame semantic observations C_i. The semantic state D_i of each visual landmark is therefore represented as:
D_i = p(g_i | C_i) ∝ p(C_i | g_i) · Dir(g_i | α_i),    (3)
where Dir(·) denotes the Dirichlet distribution and α_i = [α_1, α_2, ..., α_H]^T its parameter vector. Assuming p(C_i | g_i) is a multinomial distribution, estimating the semantic state of a landmark reduces to updating the Dirichlet parameters:
α_i^new = α_i^old + c_{i,k},    (4)
where α_i^new denotes α_i after the update and α_i^old before it.
(2) In the localization thread, the structural and semantic information of the prior semantic map are extracted to construct a hybrid constraint, which serves as the prior constraint:
From the prior semantic map factor p(P | M, S) in formula (1), a hybrid constraint is derived that fuses the semantic information p(z_i | M, S, p_i) of the prior semantic map with its structural information, i.e.:
The prior semantic map M is further divided into several sub-maps M_l according to semantic labels. Then, for each semantic landmark, a data association between the landmark coordinates p_i and each sub-map M_l is established. Based on the data associations between the local semantic landmarks and the prior semantic map, the hybrid constraint E_hybrid is introduced.
On the other hand, since a large number of planar features exist in real outdoor scenes, the structural information of the prior semantic map is further divided into planar and non-planar structures.
For planar structures of the prior semantic map, such as roads and sidewalks, the structural information term in formula (5) is modeled as the point-to-plane distance, i.e.:
where n_l and q_l denote, respectively, the surface normal vector and the 3D point coordinates of the associated map element.
For the remaining non-planar structures, the point-to-point residual form of the ICP algorithm is adopted, i.e.:
(3) In the localization thread, the 6-degree-of-freedom pose of the camera in the prior semantic map is solved with an EM (Expectation-Maximization) algorithm based on the proposed hybrid constraint.
Based on the hybrid constraint of formula (5), for the k-th frame the EM algorithm solves the coordinate transformation S_k between the prior semantic map and the world coordinate system of the visual image; from the solved S_k, the 6-degree-of-freedom pose of the camera in the prior semantic map is estimated. p(z_i | M, S_k, p_i) in formula (5) can be regarded as the semantic weight of each landmark. Since p(z_i | M, S_k, p_i) is a discrete probability distribution, the system further defines the weight w_i and assumes it follows a Dirichlet distribution: w_i ~ Dir(β_i), where β_i = [β_1, β_2, ..., β_H]^T is the Dirichlet parameter vector. To fuse the semantic states of the front-end visual landmarks into the localization-thread optimization, we define β_l as:
where γ is the balance coefficient.
The weight w_i can thus be modeled as the expectation of the Dirichlet distribution. Finally, for the k-th frame, the flow of estimating the coordinate transformation S_k between the prior semantic map and the world coordinate system of the visual image with the EM algorithm is as follows:
1) E-step: for each visual landmark, build the data association between the landmark and the prior semantic map of label Z by nearest-neighbour search, and estimate the weight w_i of each association with formula (8).
2) M-step: given the data associations and the estimated weights, S can be solved from the following optimization model:
where the information matrix represents the probability distribution of the local point cloud for non-planar structures and is taken as the identity matrix for planar structures. The E-step and the M-step are repeated until convergence or until a set number of iterations is reached. Finally, the pose of the current camera in the prior semantic map is computed from the solved S_k.
In addition, to improve the accuracy of pose estimation in the tracking thread, the system further refines the 3D coordinates of the semantic landmarks based on the solved coordinate transformation S. Let the landmark coordinates before optimization be p_i^old; the refined coordinates are:
where S^old denotes the transformation between the world coordinate system and the prior semantic map coordinate system before the expectation-maximization optimization; the system then further optimizes the camera pose T and the landmark coordinates P with a local BA (Bundle Adjustment) algorithm.
The invention was tested with both stereo and monocular configurations on 9 sequences of the outdoor KITTI dataset. On this dataset, the average positioning error of the stereo system was 0.5216 m and that of the monocular system was 2.1838 m. Positioning time was also measured experimentally: localizing a single frame takes about 78.77 ms. These results show that the proposed system achieves high positioning precision while meeting real-time requirements.
Table 1 ATE error test results in meters for the system

Claims (10)

1. A visual positioning method based on prior semantic map structure information and semantic information is characterized by comprising the following specific steps:
(1) Acquiring structural information and semantic information from a prior semantic map M, and taking visual image features U and semantic segmentation images C as a system observation O = {U, C}; letting the camera pose be T and the coordinate transformation between the visual image and the prior semantic map be S, and estimating the system state X = {T, S} and the semantic landmark coordinates P from the prior semantic map M and the system observation O; maximizing the posterior probability of the camera state and semantic landmarks given the prior semantic map and the system observation to obtain a visual observation factor p(U | P, T), a semantic tracking factor p(C | Z) and a prior semantic map factor p(P | M, S);
(2) Performing continuous visual tracking of the semantic landmarks by matching visual images with visual feature descriptors;
letting the camera pose of the k-th image frame be T_k and the coordinates of the i-th semantic landmark be P_i, using the reprojection error of the feature-point method as the residual term E_visual of the visual observation factor p(U | P, T), and estimating the initial pose T_k of each visual frame;
(3) Performing continuous semantic tracking of the semantic landmarks on the semantic segmentation images based on the Dirichlet distribution, and estimating the semantic label Z of each landmark from the semantic tracking factor p(C | Z) and the segmentation images C;
defining g_i = [p(z_i = 1), p(z_i = 2), ..., p(z_i = H)]^T as the discrete probability of the semantic label z_i of the i-th semantic landmark; defining the observation of the i-th landmark on the k-th segmentation image as c_{i,k}, and updating the distribution of g_i with these observations;
estimating the discrete probability distribution of z_i, i.e. the semantic state D_i, from multi-frame semantic observations C_i = {c_{i,k} | k ∈ K}, where K denotes the set of segmentation images in which the i-th landmark is observed;
(4) In the localization thread, dividing the prior semantic map M into several sub-maps M_l according to its semantic labels; establishing a data association between the semantic landmark coordinates p_i and each sub-map M_l; letting S_k be the coordinate transformation between the k-th visual frame and the prior semantic map; and extracting, from the prior semantic map factor p(P | M, S) in formula (1), the structural information term and the semantic information term p(z_i | M, S_k, p_i) to construct a hybrid constraint as the prior constraint of the method:
(5) In the localization thread, based on the proposed hybrid constraint, solving the coordinate transformation S_k between the prior semantic map and the world coordinate system of the visual image with an expectation-maximization algorithm; and estimating, from the solved S_k, the 6-degree-of-freedom pose of the camera in the prior semantic map.
2. The visual positioning method based on prior semantic map structural information and semantic information according to claim 1, wherein in step (1) the posterior probability estimate over the camera state and semantic landmarks given the prior semantic map and the system observation is:
p(X,P|M,O)∝p(U|P,T)·p(C|Z)·p(P|M,S) (1)
Wherein Z represents the semantic label of the semantic landmark, P (U|P, T) is a visual observation factor, P (C|Z) is a semantic tracking factor, and P (P|M, S) is a priori semantic map factor.
3. The visual localization method based on prior semantic map structure information and semantic information according to claim 1, wherein in step (2) the residual term E_visual of the visual observation factor p(U | P, T) is:
where π(·) denotes the reprojection function, u_{i,k} the image feature of the i-th semantic landmark on the k-th frame, Σ_{i,k} the covariance matrix of the corresponding visual data association, and Ω the set of visual data associations.
4. The visual localization method based on prior semantic map structure information and semantic information according to claim 1, wherein the semantic state D_i in step (3) is
D_i = p(g_i | C_i) ∝ p(C_i | g_i) · Dir(g_i | α_i)    (3)
where Dir(·) denotes the Dirichlet distribution and α_i = [α_1, α_2, ..., α_H]^T its parameter vector; p(C_i | g_i) follows a multinomial distribution, and updating the semantic state of a landmark reduces to updating the Dirichlet parameters:
α_i^new = α_i^old + c_{i,k}    (4)
where α_i^new denotes α_i after the update and α_i^old before it.
5. The visual localization method based on prior semantic map structural information and semantic information according to claim 1, wherein the hybrid constraint in step (4) is:
6. The visual positioning method based on prior semantic map structural information and semantic information according to claim 5, wherein in step (4) the residual of the structural information term in formula (5) is defined as:
where the associated information matrix represents the probability distribution of the local point cloud; the structural information of the prior semantic map is divided into planar and non-planar structures;
for planar structures, including roads, sidewalks and walls, the residual term of formula (6) is modeled as the point-to-plane distance, i.e.:
where n_l and q_l denote, respectively, the surface normal vector and the 3D point coordinates of the associated map element; for planar structures the information matrix is taken to be the identity matrix;
for non-planar structures, the point-to-point residual form of the ICP algorithm is adopted, i.e.:
7. The visual localization method based on prior semantic map structure information and semantic information according to claim 5 or 6, wherein in step (4), p(z_i | M, S_k, p_i) in formula (5) is the semantic weight w_i of the i-th semantic landmark, defined to follow a Dirichlet distribution: w_i ~ Dir(β_i), where β_i = [β_1, β_2, ..., β_H]^T is the Dirichlet parameter vector; it fuses the semantic states of the landmarks tracked in the localization thread into the optimization;
where γ is the balance coefficient.
8. The visual positioning method based on prior semantic map structure information and semantic information according to claim 1, wherein the flow of solving the coordinate transformation S_k between the prior semantic map and the world coordinate system of the visual image with the expectation-maximization algorithm is as follows:
1) E-step: for each semantic landmark, building the data association between the landmark and the prior semantic map of label Z by nearest-neighbour search, and estimating the weight w_i of each association with formula (9);
2) M-step: given the data associations and the weights w_i estimated by formula (9), solving S_k from the following optimization model:
where S_k is the transformation to be solved;
and repeating the E-step and the M-step until convergence or until a set number of iterations is reached.
9. The visual positioning method based on prior semantic map structure information and semantic information according to claim 8, wherein in the coordinate transformation between the prior semantic map and the world coordinate system corresponding to the visual image, the coordinate transformation S 0 corresponding to the visual image of the initial frame is an identity matrix.
10. The visual positioning method based on prior semantic map structure information and semantic information according to claim 8, wherein the coordinate transformation based on solvingFurther refining the 3D coordinates of the semantic signpost; setting semantic landmark point coordinates before optimization asThe coordinates of the waypoints after refinement are:
wherein the transformation is between the world coordinate system before the expectation-maximization optimization and the prior semantic map coordinate system; the system then further optimizes the local camera pose T and the semantic landmark coordinates P using a local bundle adjustment (BA) optimization algorithm.
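The refinement formula of claim 10 is elided in the extracted text (it was an image), so the following is only an assumed formulation: each landmark keeps its prior-map coordinate fixed, and its world-frame estimate is re-expressed through the old and newly solved world-to-map transforms, P_new = S_new⁻¹ · S_old · P_old in homogeneous coordinates. The helper names `se3` and `refine_landmarks` are illustrative, not from the patent.

```python
import numpy as np


def se3(R, t):
    """Build a 4x4 homogeneous transform from rotation R and translation t."""
    S = np.eye(4)
    S[:3, :3], S[:3, 3] = R, t
    return S


def refine_landmarks(P_old, S_old, S_new):
    """Re-express landmark points after an updated world->map transform.

    Assumed formulation: P_new = inv(S_new) @ S_old @ P_old, so each landmark
    keeps its prior-map coordinate while its world-frame estimate is corrected
    by the newly solved transform.
    """
    P_h = np.hstack([P_old, np.ones((len(P_old), 1))])  # Nx4 homogeneous points
    P_new = (np.linalg.inv(S_new) @ S_old @ P_h.T).T
    return P_new[:, :3]


# toy check: if the new transform adds a +1 m x-shift, world points shift by -1 m
S_old = se3(np.eye(3), np.zeros(3))
S_new = se3(np.eye(3), np.array([1.0, 0.0, 0.0]))
refined = refine_landmarks(np.array([[2.0, 0.0, 0.0]]), S_old, S_new)
```

The refined points would then seed the local BA step mentioned in the claim, which jointly polishes camera poses T and landmark coordinates P.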
CN202210423500.0A 2022-04-21 2022-04-21 Visual positioning method based on prior semantic map structure information and semantic information Active CN114972501B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210423500.0A CN114972501B (en) 2022-04-21 2022-04-21 Visual positioning method based on prior semantic map structure information and semantic information

Publications (2)

Publication Number Publication Date
CN114972501A CN114972501A (en) 2022-08-30
CN114972501B true CN114972501B (en) 2024-07-02

Family

ID=82979847

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210423500.0A Active CN114972501B (en) 2022-04-21 2022-04-21 Visual positioning method based on prior semantic map structure information and semantic information

Country Status (1)

Country Link
CN (1) CN114972501B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110738673A (en) * 2019-10-21 2020-01-31 哈尔滨理工大学 Visual SLAM method based on example segmentation
CN112132897A (en) * 2020-09-17 2020-12-25 中国人民解放军陆军工程大学 Visual SLAM method based on deep learning semantic segmentation

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10452927B2 (en) * 2017-08-09 2019-10-22 Ydrive, Inc. Object localization within a semantic domain
CN114200481A (en) * 2020-08-28 2022-03-18 华为技术有限公司 Positioning method, positioning system and vehicle
CN114202579B (en) * 2021-11-01 2024-07-16 东北大学 Dynamic scene-oriented real-time multi-body SLAM system

Similar Documents

Publication Publication Date Title
CN108242079B (en) VSLAM method based on multi-feature visual odometer and graph optimization model
CN108225327B (en) Construction and positioning method of top mark map
KR20190082071A (en) Method, apparatus, and computer readable storage medium for updating electronic map
CN108615246B (en) Method for improving robustness of visual odometer system and reducing calculation consumption of algorithm
CN114323033B (en) Positioning method and equipment based on lane lines and feature points and automatic driving vehicle
CN109615698A (en) Multiple no-manned plane SLAM map blending algorithm based on the detection of mutual winding
CN108010081A (en) A kind of RGB-D visual odometry methods based on Census conversion and Local map optimization
CN110032965A (en) Vision positioning method based on remote sensing images
CN115420276A (en) Outdoor scene-oriented multi-robot cooperative positioning and mapping method
CN115031744A (en) Cognitive map positioning method and system based on sparse point cloud-texture information
CN116429116A (en) Robot positioning method and equipment
Tao et al. Automated processing of mobile mapping image sequences
CN113379915B (en) Driving scene construction method based on point cloud fusion
Zhang et al. Cross-modal monocular localization in prior lidar maps utilizing semantic consistency
CN113838129A (en) Method, device and system for obtaining pose information
CN113932796A (en) High-precision map lane line generation method and device and electronic equipment
CN114972501B (en) Visual positioning method based on prior semantic map structure information and semantic information
CN114323038B (en) Outdoor positioning method integrating binocular vision and 2D laser radar
CN114708321B (en) Semantic-based camera pose estimation method and system
CN116202487A (en) Real-time target attitude measurement method based on three-dimensional modeling
CN115031735A (en) Pose estimation method of monocular vision inertial odometer system based on structural features
Hu et al. Efficient Visual-Inertial navigation with point-plane map
CN113487741B (en) Dense three-dimensional map updating method and device
Sun et al. Accurate deep direct geo-localization from ground imagery and phone-grade gps
Wang et al. Improvement and experimental evaluation based on orb-slam-vi algorithm

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant