CN111795704A - Method and device for constructing visual point cloud map - Google Patents
- Publication number: CN111795704A
- Application number: CN202010615170.6A
- Authority
- CN
- China
- Prior art keywords: key frame, frame, feature points, map, image
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G—PHYSICS
- G01—MEASURING; TESTING
- G01C—MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
- G01C21/00—Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
- G01C21/26—Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 specially adapted for navigation in a road network
- G01C21/28—Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 specially adapted for navigation in a road network with correlation of data from several navigational instruments
- G01C21/30—Map- or contour-matching
- G01C21/32—Structuring or formatting of map data
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/29—Geographical information databases
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/46—Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/60—Type of objects
- G06V20/62—Text, e.g. of license plates, overlay texts or captions on TV images
Abstract
The method performs feature extraction on source image frames acquired from the space to be mapped to obtain the feature points of each source image frame, performs inter-frame tracking on the source image frames to determine key frames, matches the feature points in the current key frame with those in the previous key frame to obtain the matched feature points of the current key frame, calculates the spatial position information of the matched feature points in the current key frame, and uses that spatial position information as the map point information of the current key frame. The point cloud formed by the map point sets of all key frames is the first visual point cloud map. This process separates map construction from positioning, effectively removing their mutual influence, and offers better adaptability and stability in complex, changeable environments.
Description
Technical Field
The invention relates to the field of navigation and positioning, in particular to a construction method of a visual point cloud map.
Background
Map construction and positioning are key technologies in simultaneous localization and mapping (SLAM) research. Map construction is a precondition for positioning, and the quality of the map directly influences positioning accuracy. The visual point cloud map is one kind of constructed map: it describes the appearance, pose, and other information of points in the environment through a three-dimensional point set in space. Constructing a visual point cloud map therefore requires two types of data: key frames and map points. Key frames describe the appearance of points in the environment, map points describe their poses, and a large set of map points forms a point cloud.
SLAM starts from an unknown place in an unknown environment: during movement, the robot localizes its own position and posture through repeatedly observed map features, and incrementally constructs a map according to its position, thus achieving localization and map construction simultaneously. In terms of input, there is no special input before the robot runs; once it starts running, there is raw sensor data. In terms of output, both the pose and the map are estimated. That is, the robot localizes itself on the map model while the map model is being built or a known map is being improved, similar to placing a person in an unfamiliar city to become familiar with the environment. Existing SLAM mapping therefore typically couples the mapping and positioning problems, which interact with each other.
Disclosure of Invention
The invention provides a construction method of a visual point cloud map, which is used for avoiding the influence of positioning on map construction.
The construction method of the visual point cloud map provided by the invention is realized as follows:
carrying out feature extraction on a source image frame acquired by a space of a map to be built to obtain source image frame feature points,
performing inter-frame tracking on the source image frame, determining a key frame,
matching the feature points in the current key frame with the feature points in the previous key frame to obtain the matched feature points of the current key frame,
calculating the spatial position information of the matched feature points in the current key frame, taking the spatial position information of the matched feature points as the map point information of the current key frame,
the point cloud formed by the map point sets of all the key frames is a first visual point cloud map.
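The four steps above can be arranged into a minimal mapping loop. A sketch follows; the helper callables (`extract`, `is_key_frame`, `match`, `triangulate`) are hypothetical placeholders for the operations described in the patent, not its actual modules:

```python
def build_point_cloud_map(frames, extract, is_key_frame, match, triangulate):
    """Sketch of constructing the first visual point cloud map.

    extract(frame)            -> list of feature points
    is_key_frame(feats, prev) -> True if the frame is a key frame
    match(feats, prev)        -> matched feature-point pairs
    triangulate(pair)         -> (x, y, z) map point
    """
    point_cloud = []          # map point set over all key frames
    prev_key_feats = None
    for frame in frames:
        feats = extract(frame)                       # feature extraction
        if prev_key_feats is None:                   # first frame becomes a key frame
            prev_key_feats = feats
            continue
        if not is_key_frame(feats, prev_key_feats):  # inter-frame tracking
            continue
        pairs = match(feats, prev_key_feats)         # match with previous key frame
        point_cloud.extend(triangulate(p) for p in pairs)
        prev_key_feats = feats
    return point_cloud
```

With trivial stand-ins for the four callables, the loop yields one map point per matched pair per key frame.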
Preferably, the extracting the features of the source image frames collected from the space of the map to be created to obtain the feature points of the source image frames further comprises,
carrying out image preprocessing on the source image frame to obtain a target image frame,
performing feature extraction based on the target image frame to obtain feature points of the target image frame;
the method further comprises the step of enabling the user to select the target,
based on the first visual point cloud map,
according to the closed loop key frame with closed loop constraint, the least square method is adopted to carry out the image optimization of the key frame position,
and/or, optimizing the spatial position information of the map points according to the reprojection error,
and obtaining a second visual point cloud map.
The invention also provides a device for constructing a visual point cloud map, which comprises a feature extraction module and a map point generating module:
the characteristic extraction module is used for extracting the characteristics of the source image frames acquired by the space of the map to be built to obtain the characteristic points of the source image frames,
a map point generating module for performing inter-frame tracking on the source image frame, determining the key frame, matching the feature points in the current key frame with the feature points in the previous key frame to obtain the matching feature points of the current key frame, calculating the spatial position information of the matching feature points in the current key frame, using the spatial position information of the matching feature points as the map point information of the current key frame,
the point cloud formed by the map point sets of all the key frames is a first visual point cloud map.
According to this construction method, feature extraction is performed on the image frames acquired from the space to be mapped, the spatial position information of the matched feature points is obtained through inter-frame matching, and the matched feature points are used as map points, yielding a visual point cloud map formed by the map point set of all key frames and realizing the generation and description of three-dimensional points in the physical environment. The process separates map construction from positioning, effectively removing their mutual influence, and has better adaptability and stability in complex, changeable environments. Because the point cloud map is continuous, compared with a map formed of discrete map nodes it enables continuous positioning, avoids jumps during positioning, and reduces the probability of relocation. Furthermore, optimizing the key frame poses and/or map point poses improves map precision; the map can be corrected in time in the case of a wrong closed loop without losing the initial map data; and the method is highly extensible, which favors fusion with improved map construction methods.
Drawings
Fig. 1 is a schematic flow chart of a process for constructing a map based on image data acquired by a monocular camera.
FIG. 2 is a schematic diagram of feature point screening.
Fig. 3 is a schematic flow chart of a process for constructing a map based on forward-looking image data acquired by a monocular camera.
FIG. 4 is a schematic view of a process for constructing a map based on image data acquired by a binocular camera.
FIG. 5 is a diagram illustrating accumulated error.
Fig. 6 is a schematic flow chart illustrating optimization of the first visual point cloud map.
FIG. 7 is a schematic diagram of a visual dictionary.
Fig. 8 is a schematic diagram of a visual point cloud map construction apparatus according to the present application.
FIG. 9 is a diagram of an image pre-processing module.
Detailed Description
For the purpose of making the objects, technical means and advantages of the present application more apparent, the present application will be described in further detail with reference to the accompanying drawings.
The method comprises the steps that based on collected image data, a visual point cloud map is obtained through feature extraction and feature point matching of inter-frame tracking; further, pose map optimization is performed through closed-loop constraint and/or map point optimization is performed through reprojection errors, so that the map accuracy is improved. The constructed visual point cloud map at least comprises key frame position and posture information and map point position information, wherein each map point also comprises feature point descriptor information.
Example one
For ease of understanding, in this embodiment the image data is collected by a monocular camera and the images are of ground texture. It should be understood that this embodiment is not limited to ground texture images; other images are also suitable.
Referring to fig. 1, fig. 1 is a schematic diagram illustrating a process of constructing a map based on image data acquired by a monocular camera. The method mainly comprises image preprocessing, feature extraction, and inter-frame tracking, and specifically comprises the following steps, performed for each frame of source image:
And step 1011, performing distortion removal processing on the source image according to the distortion coefficient of the camera to obtain a distortion-removed image I (u, v), wherein u and v represent pixel coordinates, and I (u, v) represents the pixel value of the pixel coordinates.
The undistorted image is then filtered to obtain the background image:
I_b(u, v) = G ∗ I(u, v),
where G is the filter kernel, I_b(u, v) is the background image, and I(u, v) is the undistorted image;
preferably, the filter kernel is set relatively large, so that the filtered image is as close as possible to the background image.
If the texture region is a darker part of the image, the undistorted image I (u, v) may be inverted and expressed as: pixel maximum-I (u, v),
for example, if the maximum pixel value is 255, the inversion operation is: 255-I (u, v).
Subtracting the background image from the undistorted image gives the foreground image:
I_f(u, v) = I(u, v) − I_b(u, v)
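The background-subtraction step can be sketched in a few lines, using a box filter as a stand-in for the unspecified large kernel G; `box_blur` and `foreground` are illustrative names, and images are modeled as lists of rows:

```python
def box_blur(img, k):
    """Approximate the background image I_b = G * I with a (2k+1)x(2k+1) box filter,
    clamping at the image borders."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = [img[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                    for dy in range(-k, k + 1) for dx in range(-k, k + 1)]
            out[y][x] = sum(vals) / len(vals)
    return out

def foreground(img, k=2, invert=False, pixel_max=255):
    """I_f = I - I_b, optionally inverting a dark-texture image first (pixel_max - I)."""
    if invert:
        img = [[pixel_max - p for p in row] for row in img]
    bg = box_blur(img, k)
    return [[img[y][x] - bg[y][x] for x in range(len(img[0]))] for y in range(len(img))]
```

A flat image yields an all-zero foreground, since the blurred background equals the image itself.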
Step 1014, stretching the foreground image to obtain a target image;
In this step, since texture information is usually weak in the acquired image, the pixel values (gray levels) of the texture region are mostly distributed in a narrow gray-level interval, so the pixel values in this interval are stretched to the full pixel value range. Specifically:
when the foreground image pixel value is less than or equal to the minimum gray value, it is set to the minimum of the pixel value range, i.e., 0;
when the foreground image pixel value is between the minimum and maximum gray values, its contrast is increased: preferably, it is set to the pixel maximum scaled by a ratio, namely the ratio of the difference between the foreground pixel value and the minimum gray value to the difference between the maximum and minimum gray values;
when the foreground image pixel value is greater than or equal to the maximum gray value, it is set to the maximum of the pixel value range, e.g., 255.
Expressed mathematically, the stretched foreground image I_f'(u, v) is:
I_f'(u, v) = 0, if I_f(u, v) ≤ I_min;
I_f'(u, v) = 255 × (I_f(u, v) − I_min) / (I_max − I_min), if I_min < I_f(u, v) < I_max;
I_f'(u, v) = 255, if I_f(u, v) ≥ I_max;
where I_min is the minimum gray value, I_max is the maximum gray value, and in the above formula the pixel value range is 0–255.
Image preprocessing helps improve image contrast; in environments with weak texture information it increases the gray-level contrast of the texture region, so that more feature points can be extracted.
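The piecewise stretch described above can be sketched as follows (assuming a 0–255 pixel range; `stretch` is an illustrative name):

```python
def stretch(pix, i_min, i_max, pixel_max=255):
    """Piecewise linear contrast stretch of a foreground pixel value:
    clamp below i_min to 0, above i_max to pixel_max, and scale linearly between."""
    if pix <= i_min:
        return 0
    if pix >= i_max:
        return pixel_max
    return pixel_max * (pix - i_min) / (i_max - i_min)
```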
Step 102, extracting feature points from the current target image frame to convert image information into feature information, obtaining the feature point set of the current target image frame. Features such as ORB (Oriented FAST and Rotated BRIEF), Scale-Invariant Feature Transform (SIFT), or SURF (Speeded-Up Robust Features) may be used.
Taking the ORB algorithm as an example, based on a frame of target image:
Step 1023, determine the direction of each FAST feature point: taking the feature point as the center, compute the intensity centroid within a radius r; the vector from the feature point's coordinates to the centroid gives the feature point's direction.
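The intensity-centroid direction computation can be sketched as below, assuming a square grayscale patch stands in for the radius-r neighborhood (`orientation` is an illustrative name):

```python
import math

def orientation(patch, r):
    """Intensity-centroid direction of a FAST corner at the patch center.

    patch is a (2r+1)x(2r+1) grayscale window centered on the feature point.
    Image moments m10, m01 locate the centroid; atan2 gives the direction angle."""
    m10 = m01 = 0.0
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            p = patch[dy + r][dx + r]
            m10 += dx * p
            m01 += dy * p
    return math.atan2(m01, m10)
```

A patch whose intensity mass lies to the right of the center yields a direction of 0 radians.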
Step 103, inter-frame tracking: match the feature points in adjacent frames, calculate the coordinates of the matched feature points in the world coordinate system, and store them as map points carrying three-dimensional spatial position information.
In this step, for the target image frame for which the feature points are currently extracted:
For any feature point i of the current target image frame, calculate whether the distance between the descriptor of feature point i in the current target image frame and the descriptor of the corresponding feature point in the previous key frame is smaller than a set matching threshold; if so, the two feature points are judged to match, otherwise they do not match. The matching degree can be described by the Hamming distance, in which case the matching threshold is a Hamming distance threshold.
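Descriptor matching by Hamming distance can be sketched as a brute-force nearest-neighbor search; binary descriptors are modeled as integers, and the helper names are illustrative:

```python
def hamming(d1, d2):
    """Hamming distance between two binary descriptors given as integers."""
    return bin(d1 ^ d2).count("1")

def match_features(curr, prev, threshold):
    """For each current descriptor, find the nearest previous descriptor;
    accept the pair only when the distance is below the matching threshold."""
    pairs = []
    for i, d in enumerate(curr):
        j, dist = min(enumerate(hamming(d, p) for p in prev), key=lambda t: t[1])
        if dist < threshold:
            pairs.append((i, j))
    return pairs
```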
and 1033, judging whether the current target image frame is a key frame according to the key frame condition, if so, taking the current target image frame as the key frame, executing 1034 to update the map based on the key frame, otherwise, not updating the map.
In step 1033, when the number of matched feature points is greater than a set first threshold, it is determined that the current target image frame is a key frame;
the key frame condition may also be one of the following conditions:
the space distance between the key frame and the previous key frame is larger than a set second threshold;
the spatial angle between the current key frame and the previous key frame is larger than a set third threshold;
When the key frame condition is not the condition that the number of matched feature points is larger than the set first threshold, steps 1032 and 1033 are exchanged, that is: the key frame is determined first, and then the matched feature points are determined.
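The one-of key-frame conditions can be sketched as a single test; the threshold values here are illustrative placeholders, not values from the patent:

```python
def is_key_frame(n_matches, dist, angle, n_min=50, d_min=0.3, a_min=0.26):
    """Key-frame test: the frame qualifies when any one condition holds -
    enough matched feature points, or the frame has moved or rotated far
    enough relative to the previous key frame (thresholds are illustrative)."""
    return n_matches > n_min or dist > d_min or angle > a_min
```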
Step 1034, calculate the coordinates of each matched feature point of the current key frame and store them as map point information;
since the matched feature points of each current key frame differ from those of the previous key frame, the matched feature points in the previous key frame are updated with the calculation result of step 1034, while the coordinates of unmatched feature points are not updated; the obtained current map information thus includes both un-updated and updated map point information, with each map point corresponding to three-dimensional spatial position information.
In this embodiment, considering that the ground textures acquired by the monocular camera lie on the same plane and the camera is usually installed at the bottom of the mobile robot, the coordinates of any matched feature point i of the current key frame in the world coordinate system can be obtained from its image-plane position through the camera's extrinsic parameters, specifically:
the x coordinate is the ratio of the product of the pixel abscissa u of the current key frame matching feature point i and the camera installation height to the camera focal length,
the y coordinate is the ratio of the product of the pixel ordinate v of the current key frame matching feature point i and the camera installation height to the camera focal length,
the z coordinate is the camera mounting height.
Expressed mathematically:
X = uH / f
Y = vH / f
Z = H
where H is the camera mounting height, f is the camera focal length, and (u, v) are the pixel coordinates of matched feature point i in the image coordinate system.
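The ground-plane back-projection can be sketched in one function, assuming the downward-looking camera model described above (`ground_map_point` is an illustrative name):

```python
def ground_map_point(u, v, H, f):
    """Back-project a ground-texture pixel (u, v) to a 3-D map point:
    X = uH/f, Y = vH/f, Z = H, for a camera at mounting height H
    with focal length f looking straight down at a planar floor."""
    return (u * H / f, v * H / f, H)
```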
Steps 101 to 103 are executed repeatedly until all source images are processed, yielding a first visual point cloud map consisting of a large number of map points.
This embodiment provides a ground-texture-based visual point cloud construction method: the acquired source images are preprocessed, feature points of adjacent key frames are matched, and the spatial three-dimensional coordinates of map points are generated from the pixel coordinates of the matched feature points, thereby obtaining the visual point cloud map.
Example two
In this embodiment, the image data is collected by a monocular camera and the collected image points do not lie in the same plane. For example, the camera is mounted facing forward, i.e., the mobile robot captures images with a forward-looking camera.
Referring to fig. 3, fig. 3 is a schematic diagram illustrating a process of constructing a map based on forward-looking image data collected by a monocular camera. For each frame of image, the following steps are performed:
Step 303, subtract the background image from the undistorted image to obtain the foreground image I_f(u, v), expressed mathematically as:
I_f(u, v) = I(u, v) − I_b(u, v)
In this step, images of high quality are not subjected to stretching while images of low quality are; that is, the stretching processing is applied selectively according to image quality.
Step 305, extract feature points from the current target image frame to convert image information into feature information and obtain the feature point set of the current target image frame. Features such as ORB (Oriented FAST and Rotated BRIEF), Scale-Invariant Feature Transform (SIFT), or SURF (Speeded-Up Robust Features, an efficient variant of SIFT) may be used. This step 305 is the same as step 102.
For any feature point i of the current target image frame, calculate whether the distance between the descriptor of feature point i in the current target image frame and the descriptor of the corresponding feature point in the previous key frame is smaller than a set matching threshold; if so, the two feature points are judged to match, otherwise they do not match. The matching degree can be described by the Hamming distance, in which case the matching threshold is a Hamming distance threshold.
the number of the matched feature points is greater than a set first threshold value;
the space distance between the key frame and the previous key frame is larger than a set second threshold;
the spatial angle between the current key frame and the previous key frame is larger than a set third threshold;
In this embodiment, since the image points acquired by the monocular camera are not coplanar, the eight-point method is used to calculate the essential matrix between two frames from the pixel coordinates of the matched feature points; SVD decomposition of the essential matrix yields the relative pose of the camera; then, based on the triangulation principle, the least square method is used to calculate the coordinates of any matched feature point i in the world coordinate system from the poses of the two frames. Specifically:
Step 3091, according to the essential matrix E, the normalized plane coordinate p1 of matched feature point i in the current key frame and the normalized plane coordinate p2 of matched feature point i in the previous key frame satisfy: for any matched feature point, the product of the transpose of the normalized plane coordinate of the matched feature point in the previous key frame, the essential matrix, and the normalized plane coordinate of the matched feature point in the current key frame equals 0. Expressed mathematically:
p2^T · E · p1 = 0
The essential matrix E reflects the relationship between the representations of the image points of a spatial point P in the camera coordinate systems of cameras with different view angles. It is a 3 × 3 matrix; multiplying a point on the first image frame by the essential matrix gives the epipolar line of that point on the second image frame. The normalized plane coordinate of matched feature point i in the current key frame is p1 = (u1, v1, 1)^T, and in the previous key frame p2 = (u2, v2, 1)^T.
Substituting the normalized plane coordinates of eight pairs of matched feature points solves for the essential matrix. The matched feature point i in the current key frame and the matched feature point i in the previous key frame form a pair of matched feature points.
Step 3092, perform Singular Value Decomposition (SVD) on the essential matrix E to obtain the inter-frame relative pose between the current key frame and the previous key frame, i.e., the relative pose of the camera, comprising the translation vector t and the rotation matrix R.
Step 3093, based on the triangulation principle, the depth value s1 of matched feature point i in the current key frame and the depth value s2 of matched feature point i in the previous key frame satisfy:
s1 · p1 = s2 · R · p2 + t
Using the least square method, s1 and s2 can be solved.
Alternatively,
left-multiplying both sides of the above formula by the antisymmetric (skew-symmetric) matrix p1^ of p1 gives:
s1 · p1^ · p1 = 0 = s2 · p1^ · R · p2 + p1^ · t
from which s2 can be obtained; substituting the obtained s2 into the previous formula gives s1.
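Solving s1·p1 = s2·R·p2 + t for the two depths by least squares can be sketched via the 2×2 normal equations. Here `q2` stands for the rotated bearing R·p2, assumed to be precomputed, and the function name is illustrative:

```python
def triangulate_depths(p1, q2, t):
    """Least-squares depths (s1, s2) from s1*p1 = s2*q2 + t,
    where q2 = R @ p2 is the rotated previous-frame bearing vector.
    Treats [p1, -q2] as a 3x2 system A x = t and solves A^T A x = A^T t."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))
    a11, a12 = dot(p1, p1), -dot(p1, q2)   # entries of A^T A
    a22 = dot(q2, q2)
    b1, b2 = dot(p1, t), -dot(q2, t)       # entries of A^T t
    det = a11 * a22 - a12 * a12
    s1 = (b1 * a22 - a12 * b2) / det
    s2 = (a11 * b2 - a12 * b1) / det
    return s1, s2
```

For an exactly consistent pair of bearings the residual is zero and the true depths are recovered.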
Step 3094, according to the depth value s1 of matched feature point i in the current key frame, calculate the coordinates of matched feature point i in the world coordinate system:
the x coordinate is: the product of the pixel abscissa of the normalized plane of the matched feature points in the current keyframe and the depth value of the matched feature points,
the y coordinate is: the product of the pixel ordinate of the matching feature point normalization plane in the current keyframe and the matching feature point depth value,
the z coordinate is the camera focal length.
The mathematical formula is expressed as:
X = s1 · u1
Y = s1 · v1
Z = f
where f is the camera focal length, used to convert the normalized plane coordinates to the imaging plane.
Step 310, judge whether all source images have been processed. If so, end; otherwise process the next frame, returning to step 301, until all source images are processed and a first visual point cloud map consisting of a large number of map points is obtained.
This embodiment provides a visual point cloud construction method for non-coplanar, forward-looking images: the acquired source images are preprocessed, feature points of adjacent key frames are matched, and the spatial three-dimensional coordinates of map points are generated from the pixel coordinates of the matched feature points, thereby obtaining the visual point cloud map.
EXAMPLE III
In the present embodiment, the image data is collected by a binocular camera, and the collected images are non-coplanar.
Referring to fig. 4, fig. 4 is a schematic diagram illustrating a process of constructing a map based on image data collected by a binocular camera. For each binocular image frame, i.e., a first source image frame from the first camera and a second source image frame from the second camera captured at the same time, the following steps are performed:
in this step, the first image frame and the second image frame may be subjected to image preprocessing in parallel, or the first image frame and the second image frame may be subjected to image preprocessing in series respectively,
the specific processing is the same as steps 301 to 304,
in step 404, in order to improve the matching efficiency, any target image frame in the current binocular target image frames may be matched with the previous key frame, so as to obtain the matching feature points of the target image frames,
the specific matching method is the same as step 307.
when one of the following key frame conditions is satisfied, the current target image frame is determined to be a key frame:
the number of the matched feature points is greater than a set first threshold value;
the space distance between the key frame and the previous key frame is larger than a set second threshold;
the spatial angle between the current key frame and the previous key frame is larger than a set third threshold;
In this embodiment, the calculation process of the coordinates of any matching feature point (first matching feature point) i in the current keyframe is as follows:
Take the current key frame as the first frame of the current binocular target image frames, whose matched feature points were obtained in step 404, and take the other target image frame of the binocular pair as the second frame. Match the matched feature point i in the first frame with feature point j in the second frame, i.e., calculate the matching degree between the descriptors of matched feature point i and feature point j. If a match for feature point i is found in the second frame, the map point coordinates of matched feature point i are:
the x coordinate is: dividing the product of the pixel abscissa of the normalization plane of the matching feature point in the current key frame and the length of the binocular baseline by the absolute value of the difference between the pixel abscissa of the normalization plane of the matching feature point in the current key frame and the pixel abscissa of the normalization plane of the matching feature point in the second frame;
the y coordinate is: dividing the product of the pixel ordinate of the matching feature point normalization plane in the current key frame and the length of the binocular baseline by the absolute value of the difference between the pixel abscissa of the matching feature point normalization plane in the current key frame and the pixel abscissa of the matching feature point normalization plane in the second frame;
the z coordinate is: the product of the camera focal length and the binocular baseline length is divided by the absolute value of the difference between the pixel abscissa of the matched feature point normalized plane in the current keyframe and the pixel abscissa of the matched feature point normalized plane in the second frame.
The mathematical formula is expressed as:
X = u1 · b / |u1 − u2|
Y = v1 · b / |u1 − u2|
Z = f · b / |u1 − u2|
where (u1, v1) are the normalized plane pixel coordinates of the matched feature point in the first frame, (u2, v2) are the normalized plane pixel coordinates of the feature point in the second frame, f is the camera focal length, and b is the binocular baseline length;
if not, the coordinate calculation of the matching feature point i is abandoned.
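The binocular coordinate computation above can be sketched in one function; the name and argument order are illustrative, with coordinates as described in the step above:

```python
def stereo_map_point(u1, v1, u2, f, b):
    """Binocular map point from normalized-plane coordinates and disparity.

    u1, v1: matched feature point in the first (key) frame; u2: its abscissa
    in the second frame; f: camera focal length; b: binocular baseline length."""
    disparity = abs(u1 - u2)
    if disparity == 0:
        return None  # no usable disparity: abandon the point, as in the text
    return (u1 * b / disparity, v1 * b / disparity, f * b / disparity)
```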
Steps 401 to 406 are executed repeatedly until all source binocular images are processed, yielding a first visual point cloud map consisting of a large number of map points.
This embodiment provides a binocular-image-based visual point cloud construction method. The binocular images are used to obtain the spatial coordinates of the matched feature points with a simple calculation; the construction process requires no positioning information from the mobile robot, so map construction and positioning are separated, improving stability and adaptability in complex environments.
Considering that the first visual point cloud map is obtained by continuously matching inter-frame images and continuously recording the generated map points, this inter-frame matching style of map point generation accumulates error, and the error grows with the distance travelled. Referring to fig. 5, fig. 5 is a schematic diagram of accumulated error. In the figure, T_i represents the pose of the i-th frame of data; the left side shows the objectively real trajectory and the right side the calculated trajectory. In reality T_1 and T_19 are near the same location, but due to accumulated error the calculated trajectory does not place them near the same location. To eliminate the accumulated error and improve the precision of the map points' spatial position information in the first visual point cloud map, closed-loop constraints are constructed and optimized by the least square method.
Referring to fig. 6, fig. 6 is a schematic flow chart illustrating optimization of the first visual point cloud map. The optimization method mainly comprises closed-loop point identification, closed-loop constraint calculation and map optimization, wherein the map optimization comprises pose map optimization and/or map point optimization. The details are as follows.
In one embodiment, a method of manual marking is adopted: a unique identification pattern is placed in the environment during image data acquisition, so that a closed loop is formed between key frames that capture the same identification at different times. This approach has the advantage of high reliability.
In a second embodiment, a method of natural marking is adopted: whether a closed loop occurs is judged by calculating whether the similarity between two key frame images is larger than a set similarity threshold, where the similarity covers both the distribution of the feature points and the similarity of the image pixels. Specifically,
a visual dictionary of natural features is constructed: given N feature points, a tree of depth d with branching factor k is built as follows:
at the root node, all feature points are clustered into k classes by using a k-means clustering algorithm (k-means), so that a first-layer node is obtained.
For each node of the first layer, feature points belonging to the node are further aggregated into k types to obtain the next layer.
And so on until the leaf layer, with d layers counted from the root node to the leaf layer. The leaf layer contains the word feature points of the dictionary.
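As an illustrative sketch (not necessarily the patent's exact implementation), the hierarchical construction above can be written as a recursive k-means over the descriptors; `build_vocab_tree`, `kmeans`, and their parameters are assumed names introduced here for illustration:

```python
import numpy as np

def kmeans(X, k, rng, iters=20):
    """Plain Lloyd's k-means; returns a cluster label for each row of X."""
    centers = X[rng.choice(len(X), k, replace=False)]
    labels = np.zeros(len(X), dtype=int)
    for _ in range(iters):
        # assign each point to its nearest center, then recompute centers
        labels = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1).argmin(1)
        for c in range(k):
            m = labels == c
            if m.any():
                centers[c] = X[m].mean(0)
    return labels

def build_vocab_tree(features, k, d, rng):
    """Recursively cluster descriptors into a k-ary tree of depth d.

    Each node stores the mean descriptor of its members; the nodes at
    depth d (the leaf layer) play the role of the dictionary's words.
    """
    node = {"center": features.mean(axis=0), "children": []}
    if d == 0 or len(features) <= k:
        return node
    labels = kmeans(features, k, rng)
    for c in range(k):
        members = features[labels == c]
        if len(members):
            node["children"].append(build_vocab_tree(members, k, d - 1, rng))
    return node
```

With k = 3 and d = 2 the tree has at most 3^2 = 9 leaf words.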
Referring to fig. 7, fig. 7 is a schematic diagram of a visual dictionary.
For an image whose total number of features is n, if the word feature point ω_i appears c_i times, the weight of the word feature point is expressed as:

η_i = c_i / n
According to the dictionary model, any key frame A is described as a set whose elements are each word feature point ω_i and its weight η_i, with the mathematical expression:

A = {(ω_1, η_1), (ω_2, η_2), …, (ω_N, η_N)}

where N is the total number of feature points in the visual dictionary.
According to the above model, the similarity S between any two key frames A and B can be described in the form of an L1 norm:

S(A, B) = 1 − (1/2) Σ_{i=1}^{N} | v_Ai − v_Bi |

where v_Ai is an element of the set describing key frame A according to the dictionary model, and v_Bi is an element of the set describing key frame B according to the dictionary model.
And if the similarity S is larger than the set similarity threshold, judging that closed-loop constraint exists between the two frames.
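The weighting and scoring described above can be sketched as follows; `bow_vector` and `l1_similarity` are assumed names, and the score formula assumes the weight vectors are L1-normalized (which the TF weights η_i = c_i / n already are):

```python
import numpy as np

def bow_vector(word_ids, vocab_size):
    """TF-weighted bag-of-words vector: eta_i = c_i / n
    (occurrences of word i divided by the total feature count)."""
    v = np.zeros(vocab_size)
    for w in word_ids:
        v[w] += 1.0
    return v / max(len(word_ids), 1)

def l1_similarity(vA, vB):
    """L1-norm similarity S in [0, 1]; 1 for identical distributions."""
    vA = vA / max(np.abs(vA).sum(), 1e-12)   # guard against empty frames
    vB = vB / max(np.abs(vB).sum(), 1e-12)
    return 1.0 - 0.5 * np.abs(vA - vB).sum()
```

Two frames mapping to disjoint words score 0; a frame scored against itself gives 1, so the closed-loop decision reduces to comparing S against the set similarity threshold.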
In step 602, a closed-loop constraint is calculated based on the key frames judged to have a closed loop (hereinafter referred to as closed-loop key frames). Specifically,
if a closed loop exists between the first closed-loop key frame A and the second closed-loop key frame B, the matching degree between any map point i in A and any map point j in B is calculated; if the matching degree is smaller than the set matching-degree threshold, the two map points are judged to match. Matching each map point in this way yields m matched feature points, with the mathematical expression:
P = {p_1, p_2, …, p_m},  P′ = {p_1′, p_2′, …, p_m′}
where P is the set of the m matched feature points in the first closed-loop key frame and P′ is the set of the m matched feature points in the second closed-loop key frame.
The matching degree may be computed as the Hamming distance between the descriptors of the two map points, with a match determined when the Hamming distance is smaller than a set Hamming threshold.
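A minimal sketch of this descriptor matching, assuming binary descriptors stored as byte strings (e.g. 32-byte ORB descriptors); `hamming`, `match`, and the threshold value are illustrative:

```python
def hamming(d1: bytes, d2: bytes) -> int:
    """Hamming distance between two binary descriptors: XOR, then popcount."""
    return sum(bin(a ^ b).count("1") for a, b in zip(d1, d2))

def match(desc_a, desc_b, thresh=50):
    """Brute-force matcher: pair each descriptor in desc_a with its nearest
    neighbor in desc_b when the Hamming distance is below the threshold."""
    pairs = []
    for i, da in enumerate(desc_a):
        j, dist = min(((j, hamming(da, db)) for j, db in enumerate(desc_b)),
                      key=lambda t: t[1])
        if dist < thresh:
            pairs.append((i, j))
    return pairs
```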
Considering that any matching feature point in the closed-loop key frame satisfies the relationship:
p_i = R p_i′ + t
where (R, t) reflects the closed-loop constraint between the two closed-loop key frames, and the relative pose can be calculated through this relationship and used as an initial value; i is a natural number with 1 ≤ i ≤ m, and p_i, p_i′ are pixel coordinate matrices.
The closed-loop constraint can be solved with a least squares method, for example the nonlinear optimization LM algorithm: a first objective function is constructed, the pixel position information of all matched feature points in the closed-loop key frames is substituted in, and the ζ at which the first objective function attains its minimum is solved iteratively, thereby obtaining R and t. Expressed mathematically as:

ζ* = argmin_ζ (1/2) Σ_{i=1}^{m} ‖ p_i − exp(ζ^) p_i′ ‖²
where ζ is the lie algebra representation of (R, t).
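The relative pose satisfying p_i = R p_i′ + t can also be obtained in closed form by the standard SVD-based Kabsch/Umeyama alignment, which is a common way to produce the initial value that the iterative LM refinement then polishes. This is an illustrative alternative, not the patent's stated procedure:

```python
import numpy as np

def solve_rt(P, Pp):
    """Least-squares (R, t) with p_i = R p'_i + t, via SVD (Kabsch/Umeyama).

    P, Pp: (m, 3) arrays of matched points from the two closed-loop key frames.
    """
    cP, cPp = P.mean(0), Pp.mean(0)
    H = (Pp - cPp).T @ (P - cP)                          # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.linalg.det(Vt.T @ U.T)])   # guard against reflection
    R = Vt.T @ D @ U.T
    t = cP - R @ cPp
    return R, t
```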
It should be understood that steps 6031 and 6032 may also be performed selectively, optimizing only one of them, for example only the pose graph or only the map points.
where the symbol ∧ denotes mapping a vector to its antisymmetric matrix, and the symbol ∨ denotes the inverse operation.
Therefore, a second objective function for pose graph optimization of the key frame is constructed:
where Ω is the weight of the error term and the summation runs over the set of key frames.
The measured inter-frame relative pose error of key frame i and key frame j is substituted into the second objective function as an initial value, the relative pose between the closed-loop key frames obtained in step 6022 is taken as a constraint, and the Gauss-Newton or LM method is used to iteratively solve, at the minimum of the second objective function, the Lie algebra representation ζ_i of the pose T_i of key frame i and the Lie algebra representation ζ_j of the pose T_j of key frame j. In this way the accumulated error determined from the closed-loop key frames is distributed over the individual key frames, thereby correcting the key frame poses.
where the reprojected position of the map point in the image is expressed as:
where I is the identity matrix, [I_{3×3} 0_{3×1}] forms a 3×4 matrix, T_i is a 4×4 matrix, the homogeneous map point coordinate y_j is a 4×1 matrix, and K is the camera intrinsic parameter matrix.
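A minimal sketch of this reprojection, assuming T_i transforms world coordinates into the camera frame and y_j is a homogeneous 4-vector; `reproject` is an assumed name:

```python
import numpy as np

def reproject(K, T_i, y_j):
    """Reproject homogeneous 3D map point y_j (4x1) into key frame i.

    K: 3x3 camera intrinsics; T_i: 4x4 key-frame pose (world -> camera).
    Returns pixel coordinates (u, v) of K [I|0] T_i y_j after dehomogenizing.
    """
    P = np.hstack([np.eye(3), np.zeros((3, 1))])   # [I_{3x3} 0_{3x1}], 3x4
    z = K @ P @ T_i @ y_j                          # 3-vector, up to scale
    return z[:2] / z[2]
```

The difference between this reprojected position and the observed pixel position of the map point is the reprojection error accumulated in the third objective function.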
Constructing a third objective function of the reprojection error:
where Ω is the weight of the error term and j is the map point.
The reprojection error, obtained according to the pose T_i of key frame i, the coordinates of map point j, the camera intrinsics, and the pixel coordinates of map point j in key frame i, is substituted into the third objective function as an initial value, and the Gauss-Newton or LM method is used to iteratively solve the coordinate y_j of three-dimensional map point j at the minimum of the third objective function, thereby correcting the three-dimensional spatial position information of the map point.
In this step, the pose T_i of key frame i is preferably the pose optimized in step 6031.
And storing the optimized key frame pose and/or the optimized map point position as the map information of the visual point cloud, so that a second visual point cloud map is obtained by optimizing the pose of the key frame and/or the coordinates of the map points.
According to this method, the mapping process is separated into independent stages: a first visual point cloud map is constructed, and a second visual point cloud map is then obtained through closed-loop constraint calculation and map optimization. Each processing stage saves its output map, so even if the mapping result is not ideal, the original data of the previous stage is preserved; the method is highly extensible and easy to combine with various improved methods.
Referring to fig. 8, fig. 8 is a schematic diagram of a visual point cloud map building apparatus according to the present application. The device comprises a first visual point cloud map building unit 801, a closed-loop unit 802, a map optimization unit 803, and an input/output IO unit 804 for reading and saving map files, wherein an external source image is input to the first visual point cloud map building unit 801, the closed-loop unit 802 is used for adding closed-loop constraints to the first visual point cloud map generated by the first visual point cloud map building unit 801, and the map optimization unit 803 performs keyframe pose map optimization and map point position information optimization on the first visual point cloud map based on the closed-loop constraints.
The first visual point cloud mapping unit 801 includes,
an image preprocessing module 8011 configured to preprocess an image;
a feature extraction module 8012, configured to convert the image information into feature information;
a map point generating module 8013, configured to track image frames from frame to frame, determine a key frame, match feature points in the current key frame with feature points in the previous key frame to obtain matching feature points of the current key frame, calculate spatial location information of the matching feature points in the current key frame, use the spatial location information of the matching feature points as map point information of the current key frame,
the point cloud formed by the map point sets of all the key frames is a first visual point cloud map.
The closed-loop unit 802 includes,
a closed-loop key frame identification module 8021, which identifies a closed-loop key frame in the first visual point cloud map according to the artificial mark or the key frame similarity calculation;
the closed-loop constraint calculation module 8022 calculates the relative pose between the closed-loop key frames based on the closed-loop key frames as closed-loop constraints; and constructing a second objective function for optimizing the key frame pose graph, and solving the key frame pose when the second objective function obtains the minimum value by adopting a least square method.
The map optimization unit 803 includes a keyframe pose map optimization module 8031 and/or a map point optimization module 8032, wherein,
a key frame pose graph optimization module 8031, which performs graph optimization on the key frame pose by using a least square method according to the closed loop key frame with closed loop constraint based on the first visual point cloud map to obtain a second visual point cloud map;
the map point optimization module 8032 optimizes the spatial position information of the map points according to the reprojection error based on the first visual point cloud map to obtain a second visual point cloud map.
Referring to fig. 9, fig. 9 is a schematic diagram of the image preprocessing module. The image preprocessing module includes,
the image distortion removal submodule is used for carrying out distortion removal processing on the source image frame according to the distortion coefficient of the camera to obtain a distortion removal image,
the image filtering submodule carries out image filtering on the distortion-removed image to obtain a background image,
an image difference submodule for subtracting the background image from the de-distorted image to obtain a foreground image,
and the image stretching submodule is used for stretching the foreground image to obtain a target image frame.
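The submodule pipeline above can be sketched as follows. This is an illustrative sketch only: undistortion (which needs the camera's distortion coefficients) is assumed to have been applied already, the background is approximated with a simple box-blur filter, and `box_blur`, `preprocess`, and the gray thresholds are assumed names and parameters:

```python
import numpy as np

def box_blur(img, r=7):
    """Mean filter over a (2r+1)x(2r+1) window, computed via an integral image."""
    pad = np.pad(img.astype(np.float64), r, mode="edge")
    ii = np.zeros((pad.shape[0] + 1, pad.shape[1] + 1))
    ii[1:, 1:] = pad.cumsum(0).cumsum(1)
    n = 2 * r + 1
    s = ii[n:, n:] - ii[:-n, n:] - ii[n:, :-n] + ii[:-n, :-n]
    return s / (n * n)

def preprocess(img, r=7, g_min=10, g_max=200):
    """Background subtraction + piecewise-linear stretch on an undistorted image.

    Values at or below g_min map to 0, values at or above g_max map to 255,
    and values in between are scaled proportionally.
    """
    background = box_blur(img, r)
    foreground = np.clip(img.astype(np.float64) - background, 0, None)
    out = (foreground - g_min) / (g_max - g_min) * 255.0
    return np.clip(out, 0, 255).astype(np.uint8)
```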
The mapping process is separated into an independent first visual point cloud map construction unit, closed-loop unit, and map optimization unit, with no coupling among them; each processing stage saves its output map, so even if the mapping result is not ideal, the original data of the previous stage is preserved. The arrangement is highly extensible and easy to combine with various improved methods.
The present application may also provide a visual point cloud mapping apparatus comprising a memory storing executable computer instructions and a processor configured to execute the instructions in the memory to implement the steps of the visual point cloud mapping method.
The Memory may include a Random Access Memory (RAM) or a Non-Volatile Memory (NVM), such as at least one disk Memory. Optionally, the memory may also be at least one memory device located remotely from the processor.
The Processor may be a general-purpose Processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; but may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other Programmable logic device, discrete Gate or transistor logic device, discrete hardware component.
The embodiment of the invention also provides a computer readable storage medium, wherein a computer program is stored in the storage medium, and when being executed by a processor, the computer program realizes the steps of the visual point cloud map construction method.
For the device/network side device/storage medium embodiment, since it is basically similar to the method embodiment, the description is relatively simple, and for the relevant points, refer to the partial description of the method embodiment.
In this document, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.
Claims (15)
1. A construction method of a visual point cloud map is characterized by comprising the following steps,
carrying out feature extraction on a source image frame acquired by a space of a map to be built to obtain source image frame feature points,
performing inter-frame tracking on the source image frame, determining a key frame,
matching the feature points in the current key frame with the feature points in the previous key frame to obtain the matched feature points of the current key frame,
calculating the spatial position information of the matched feature points in the current key frame, taking the spatial position information of the matched feature points as the map point information of the current key frame,
the point cloud formed by the map point sets of all the key frames is a first visual point cloud map.
2. The method as claimed in claim 1, wherein said performing feature extraction on source image frames collected in a space to be mapped to obtain source image frame feature points, further comprises,
carrying out image preprocessing on the source image frame to obtain a target image frame,
performing feature extraction based on the target image frame to obtain feature points of the target image frame;
the method further comprises the step of enabling the user to select the target,
based on the first visual point cloud map,
according to the closed loop key frame with closed loop constraint, the least square method is adopted to carry out the image optimization of the key frame position,
and/or, optimizing the spatial position information of the map points according to the reprojection error,
and obtaining a second visual point cloud map.
3. The method of claim 2, wherein the image pre-processing comprises,
performing distortion removal processing on the source image frame according to the distortion coefficient of the camera to obtain a distortion-removed image,
judging whether the pixel value of each pixel point in the distortion-removed image is larger than a first pixel threshold value, if so, carrying out inversion operation on the pixel points with the pixel values larger than the first pixel threshold value, then carrying out image filtering to obtain a background image, otherwise, carrying out image filtering on the distortion-removed image to obtain the background image,
subtracting the background image from the de-distorted image to obtain a foreground image,
and judging whether the pixel values in the foreground image are uniformly distributed, if so, taking the foreground image as a target image frame, otherwise, stretching the foreground image to obtain the target image frame.
4. The method of claim 3, wherein the stretching the foreground image comprises,
if the foreground image pixel value is less than or equal to the set minimum gray value, taking the foreground image pixel value as the minimum value in the pixel value range;
if the foreground image pixel value is larger than the minimum gray value and smaller than the set maximum gray value, taking the pixel value which is in a certain proportion to the maximum pixel value as the foreground image pixel value; the ratio is the ratio of the difference between the pixel value and the minimum gray value of the foreground image to the difference between the maximum gray value and the minimum gray value;
if the pixel value of the foreground image is larger than or equal to the maximum gray value, taking the pixel value of the foreground image as the maximum value in a pixel value range;
the feature extraction is carried out based on the target image frame to obtain the feature points of the target image frame, including,
carrying out feature detection on the target image frame to obtain feature points,
the target image frame is divided into a number of grids,
for the feature points in any grid, arranging the feature points in the grid in a descending order according to the response values of the feature points, and reserving the first Q feature points to obtain screened feature points; q is determined according to the number of the feature points in the target image frame, the set upper limit of the total number of the feature points and the total number of the feature points in the grid;
and respectively calculating the feature descriptors of the screened feature points.
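The grid-based screening of claim 4 can be sketched as follows, under one reading of how Q is determined: each cell keeps a share of the frame's overall feature budget proportional to its own point count. `screen_by_grid`, `grid_of`, and `total_limit` are assumed names:

```python
import math
from collections import defaultdict

def screen_by_grid(keypoints, grid_of, total_limit):
    """Keep the top-Q keypoints per grid cell, ranked by descending response.

    keypoints: list of (x, y, response); grid_of maps (x, y) -> cell id;
    Q = floor(total_limit / n_frame * n_cell) for each cell, so the whole
    frame retains roughly total_limit points.
    """
    cells = defaultdict(list)
    for kp in keypoints:
        cells[grid_of(kp[0], kp[1])].append(kp)
    n = len(keypoints)
    kept = []
    for pts in cells.values():
        q = math.floor(total_limit / n * len(pts)) if n else 0
        pts.sort(key=lambda kp: kp[2], reverse=True)  # descending response
        kept.extend(pts[:q])
    return kept
```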
5. The method as claimed in claim 4, wherein determining Q according to the number of feature points in the target image frame and the set upper limit of the total number of feature points comprises: dividing the set upper limit of the total number of feature points by the number of feature points in the target image frame, multiplying the result by the total number of feature points in the grid, and rounding down to obtain Q;
the inter-frame tracking of the source image frame, the determination of the key frame, including,
for each target image frame: judging whether the target image frame is the first frame or not, if so, taking the target image frame as a key frame, otherwise, determining whether the target image frame is the key frame or not according to the key frame condition;
wherein the key frame condition satisfies at least one of the following conditions:
the number of the matched feature points is greater than a set first threshold value;
the space distance between the key frame and the previous key frame is larger than a set second threshold;
the spatial angle between the last key frame and the previous key frame is larger than a set third threshold value.
6. The method of claim 5, wherein the source image frames are image frames that originate from a monocular camera and are in the same plane;
the calculating the spatial location information of the matching feature points in the current key frame comprises,
for each matching feature point:
the x coordinate is: the ratio of the product of the pixel abscissa of the matched feature point in the current key frame and the camera installation height to the camera focal length,
the y coordinate is: the ratio of the product of the pixel ordinate of the matched feature point in the current key frame and the camera installation height to the camera focal length,
the z coordinate is the camera mounting height.
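The coordinate formulas of claim 6 transcribe directly; this illustrative sketch assumes (u, v) are pixel coordinates relative to the principal point, h is the camera mounting height, and f the focal length (`planar_point` is an assumed name):

```python
def planar_point(u, v, h, f):
    """Spatial coordinates of a matched feature point when all points lie
    in one plane observed by a monocular camera mounted at height h:
    x = u*h/f, y = v*h/f, z = h."""
    return (u * h / f, v * h / f, h)
```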
7. The method of claim 5, wherein the source image frames are image frames that originate from a monocular camera and are non-coplanar;
the calculating the spatial location information of the matching feature points in the current key frame comprises,
obtaining the essential matrix of the current key frame and the previous key frame according to the pixel coordinates of at least 8 pairs of matched feature points, each pair consisting of a matched feature point in the current key frame and the corresponding matched feature point in the previous key frame,
performing singular value decomposition on the essential matrix to obtain the inter-frame relative pose between the current key frame and the previous key frame;
for each matching feature point: at least obtaining the depth value of the matching feature point of the current key frame according to the inter-frame relative pose between the current key frame and the previous key frame and the triangularization calculation relationship; and obtaining the spatial position information of the matched feature points according to the depth value of the matched feature points of the current key frame.
8. The method of claim 7, wherein obtaining the essential matrix of the current key frame and the previous key frame according to the pixel coordinates of the at least 8 pairs of matched feature points comprises,
for any of the matched feature points, the feature points,
according to the relationship that the product of the transpose matrix of the normalized plane coordinates of the matching feature points in the previous key frame, the essential matrix and the matrix of the normalized plane coordinates of the matching feature points in the current key frame is equal to 0,
substituting 8 pairs of pixel coordinates of the matched characteristic points to obtain an essential matrix;
the method at least obtains the depth value of the matching feature point of the current key frame according to the relative pose between the current key frame and the previous key frame and the triangularization calculation relationship, and comprises the following steps,
for any of the matched feature points, the feature points,
based on the relationship that the product of the depth value of the matched feature point in the current key frame and the matrix of its normalized-plane coordinates is equal to the sum of (a) the product of the depth value of the matched feature point in the previous key frame, the rotation matrix of the relative pose, and the matrix of the normalized-plane coordinates of the matched feature point in the previous key frame, and (b) the translation matrix of the relative pose,
obtaining the depth value of the matching feature point in the current key frame according to the rotation matrix and the translation matrix in the relative pose and the matrixes of the normalized plane coordinates of the matching feature point in the current key frame and the previous key frame;
the obtaining the spatial position information of the matching feature point according to the depth value of the matching feature point of the current key frame includes,
the x coordinate is: the product of the pixel abscissa of the normalized plane of the matched feature points in the current keyframe and the depth value of the matched feature points,
the y coordinate is: the product of the pixel ordinate of the matching feature point normalization plane in the current keyframe and the matching feature point depth value,
the z coordinate is the camera focal length.
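The triangulation relation of claim 8, s2·x2 = s1·R·x1 + t (x1, x2 being normalized-plane coordinates in the previous and current key frames), can be sketched as a small least-squares solve; `triangulate_depth` is an assumed name:

```python
import numpy as np

def triangulate_depth(x1, x2, R, t):
    """Depths (s1 in the previous frame, s2 in the current frame) from
    s2 * x2 = s1 * R x1 + t, solved in least squares.

    x1, x2: normalized-plane coordinates [u, v, 1] of one matched point;
    (R, t): inter-frame relative pose from the essential matrix decomposition.
    """
    A = np.column_stack([x2, -(R @ x1)])          # 3x2 system in [s2, s1]
    s, *_ = np.linalg.lstsq(A, t, rcond=None)
    return s[1], s[0]
```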
9. The method of claim 5, wherein the source image frames are binocular image frames derived from a binocular camera and are non-coplanar;
performing image preprocessing on the source image frames to obtain target image frames, wherein the image preprocessing is performed on a first source image frame from a first destination and a second source image frame from a second destination respectively to obtain a first target image frame and a second target image frame which are used as binocular target image frames;
performing feature extraction based on the target image frame to obtain feature points of the target image frame, wherein the feature points of the first target image frame and the feature points of the second target image frame are respectively extracted;
judging whether the target image frame is the first frame or not comprises judging whether the binocular target image frame is the first frame or not, if so, taking any frame in the binocular target image frame as a key frame, and otherwise, determining whether any frame in the target image frame is the key frame or not according to the key frame condition;
the calculating the spatial location information of the matching feature points in the current key frame comprises,
for each matching feature point in the current keyframe:
taking the current key frame as a first frame in the current binocular target image frame and the other target image frame in the current binocular target image frame as a second frame, matching the matched feature points in the first frame with the feature points in the second frame; if the matching succeeds, then:
The x coordinate of the matched feature point in the current keyframe is: dividing the product of the pixel abscissa of the normalization plane of the matching feature point in the current key frame and the length of the binocular baseline by the absolute value of the difference between the pixel abscissa of the normalization plane of the matching feature point in the current key frame and the pixel abscissa of the normalization plane of the matching feature point in the second frame;
the y coordinate of the matching feature point in the current key frame is: dividing the product of the pixel ordinate of the matching feature point normalization plane in the current key frame and the length of the binocular baseline by the absolute value of the difference between the pixel abscissa of the matching feature point normalization plane in the current key frame and the pixel abscissa of the matching feature point normalization plane in the second frame;
the z-coordinate of the matched feature point in the current keyframe is: the product of the camera focal length and the binocular baseline length is divided by the absolute value of the difference between the pixel abscissa of the matched feature point normalized plane in the current keyframe and the pixel abscissa of the matched feature point normalized plane in the second frame.
10. The method according to any one of claims 2 to 9, wherein the performing the graph optimization on the key frame positions by using a least square method according to the closed-loop key frame with the closed-loop constraint comprises,
identifying a closed loop key frame in the first visual point cloud map according to the manual mark or key frame similarity calculation;
calculating the relative pose between the closed-loop key frames based on the closed-loop key frames to serve as closed-loop constraint;
and constructing a second objective function for optimizing the key frame pose graph, and solving the key frame pose when the second objective function obtains the minimum value by adopting a least square method.
11. The method of claim 10, wherein identifying closed-loop keyframes in the first visual point cloud map based on artificial markers or keyframe similarity calculations comprises,
acquiring key frames with the same identification at different times as closed-loop key frames;
alternatively,
calculating whether the similarity between the two key frames is greater than a set similarity threshold, if so, judging the two key frames to be closed-loop key frames, wherein the similarity comprises the similarity on the distribution of the feature points and the similarity of the image pixels;
the computing of relative poses between closed-loop keyframes as closed-loop constraints based on the closed-loop keyframes includes,
calculating matching feature points in the closed-loop key frame based on the closed-loop key frame,
for any matching feature point in the closed-loop key frame, calculating to obtain a relative pose as an initial value according to the relation that a pixel coordinate matrix of the matching feature point in the first closed-loop key frame is equal to, the product of a rotation matrix in the relative pose and a pixel coordinate matrix of the second closed-loop key frame and a translation matrix in the relative pose;
constructing a first objective function for accumulating pixel position information errors of all matched characteristic points in a closed-loop key frame, substituting the first objective function into pixel coordinate matrixes of all matched characteristic points, and iteratively solving to enable the first objective function to obtain a relative pose when the first objective function obtains a minimum value;
the second objective function for optimizing the key frame pose graph is constructed, the key frame pose when the second objective function obtains the minimum value is solved by adopting a least square method, and the method comprises the following steps of,
constructing a second objective function for accumulating the pose errors of the first key frame and the second key frame according to the relative pose errors between the pose of any first key frame and the pose of any second key frame,
and taking the inter-frame relative pose error of the first key frame and the second key frame as an initial value, taking the closed-loop constraint as a constraint, and iteratively solving the pose of the first key frame and the pose of the second key frame when the second objective function obtains the minimum value.
12. The method of claim 11, wherein said calculating whether the similarity between two key frames is greater than a set similarity threshold comprises,
at the root node, all the characteristic points are clustered into k classes by using a k-means clustering algorithm to obtain a first-layer node,
for each node of the first layer, clustering the characteristic points belonging to the node into k classes to obtain the next layer node,
and so on until the last leaf layer, obtaining a visual dictionary which comprises N characteristic points and is branched into k trees each time,
wherein d layers are counted from the root node to the leaf layer, and the leaf layer comprises word feature points in the dictionary; k. d and N are both natural numbers, and N is the total number of the feature points in the dictionary;
for any key frame, calculating the weight of the word feature point according to the quantity of all features in the key frame and the occurrence frequency of any word feature point, describing the key frame as a set taking each word feature point and the weight thereof as elements, wherein the set comprises N elements,
calculating the similarity of the first key frame and the second key frame according to all elements in the set described by the first key frame and all elements in the set described by the second key frame,
and if the similarity is greater than the set similarity threshold, judging that closed-loop constraint exists between the two key frames.
13. The method of any one of claims 2 to 9, wherein said optimizing the spatial location information of the map points based on the reprojection error comprises,
for any key frame, constructing a third objective function for the reprojection error from the reprojection error of each map point at its pixel position in the key frame, the key frame being captured at the pose of the key frame,
iteratively solving, with the initial value of the reprojection error as the starting point, for the spatial position information of the map points at which the third objective function reaches its minimum value,
wherein the initial value of the reprojection error is: the difference between the pixel position of the map point in the key frame and the reprojected position of the map point in the image,
and the reprojected position of the map point in the image is obtained from the camera intrinsic parameters, the pose of the key frame and the spatial position of the map point.
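The reprojection error defined in claim 13 can be written out directly for the standard pinhole model: project the map point through the key-frame pose and the camera intrinsics, then subtract from the observed pixel. This is a minimal sketch assuming a pose given as world-to-camera (R, t); the patent does not commit to a specific parameterization.

```python
import numpy as np

def project(K, R, t, X):
    """Pinhole projection of world point X to a pixel, given intrinsics K
    and a key-frame pose (R, t) that maps world to camera coordinates."""
    Xc = R @ X + t            # world -> camera frame
    uvw = K @ Xc              # apply intrinsics
    return uvw[:2] / uvw[2]   # perspective division

def reprojection_error(K, R, t, X, observed_px):
    """Initial value per the claim: the observed pixel position of the
    map point minus its reprojected position in the image."""
    return observed_px - project(K, R, t, X)
```

Bundle-adjustment style optimizers minimize the sum of squared such residuals over all map points and key frames.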
14. A device for constructing a visual point cloud map, characterized by comprising a first visual point cloud map construction unit, wherein the unit comprises,
a feature extraction module, configured to extract features from the source image frames captured in the space to be mapped, to obtain the feature points of the source image frames,
a map point generating module, configured to perform inter-frame tracking on the source image frames, determine key frames, match the feature points in the current key frame with the feature points in the previous key frame to obtain the matching feature points of the current key frame, calculate the spatial position information of the matching feature points in the current key frame, and take the spatial position information of the matching feature points as the map point information of the current key frame,
wherein the point cloud formed by the map point sets of all key frames is the first visual point cloud map.
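The map point generating module must turn matched 2-D feature points into 3-D spatial positions. A standard way to do this, sketched below, is linear (DLT) triangulation from two key-frame projection matrices; the patent does not specify this particular method, so treat it as one plausible implementation.

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one matched feature point.

    P1, P2: 3x4 projection matrices of two key frames.
    x1, x2: the matched pixel coordinates in each frame.
    Each observation contributes two linear rows to A; the homogeneous
    3-D point is the null vector of A, taken from the SVD.
    """
    A = np.vstack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]  # dehomogenize to a 3-D spatial position
```

Applied to every matching feature point of a key frame, this yields the map point set whose union over all key frames forms the first visual point cloud map.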
15. A computer-readable storage medium, in which a computer program is stored, which, when being executed by a processor, carries out the steps of the method for constructing a visual point cloud map according to any one of claims 1 to 13.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010615170.6A CN111795704B (en) | 2020-06-30 | 2020-06-30 | Method and device for constructing visual point cloud map |
PCT/CN2021/103653 WO2022002150A1 (en) | 2020-06-30 | 2021-06-30 | Method and device for constructing visual point cloud map |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010615170.6A CN111795704B (en) | 2020-06-30 | 2020-06-30 | Method and device for constructing visual point cloud map |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111795704A true CN111795704A (en) | 2020-10-20 |
CN111795704B CN111795704B (en) | 2022-06-03 |
Family
ID=72809796
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010615170.6A Active CN111795704B (en) | 2020-06-30 | 2020-06-30 | Method and device for constructing visual point cloud map |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN111795704B (en) |
WO (1) | WO2022002150A1 (en) |
Families Citing this family (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111311756B (en) * | 2020-02-11 | 2023-04-11 | Oppo广东移动通信有限公司 | Augmented reality AR display method and related device |
CN114745533A (en) * | 2022-02-28 | 2022-07-12 | 杭州小伴熊科技有限公司 | Spatial key point data acquisition extreme value calibration method and system |
CN114529705B (en) * | 2022-04-22 | 2022-07-19 | 山东捷瑞数字科技股份有限公司 | Interface layout processing method of three-dimensional engine editor |
CN116147618B (en) * | 2023-01-17 | 2023-10-13 | 中国科学院国家空间科学中心 | Real-time state sensing method and system suitable for dynamic environment |
CN116030136B (en) * | 2023-03-29 | 2023-06-09 | 中国人民解放军国防科技大学 | Cross-view visual positioning method and device based on geometric features and computer equipment |
CN116452776B (en) * | 2023-06-19 | 2023-10-20 | 国网浙江省电力有限公司湖州供电公司 | Low-carbon substation scene reconstruction method based on vision synchronous positioning and mapping system |
CN116567166B (en) * | 2023-07-07 | 2023-10-17 | 广东省电信规划设计院有限公司 | Video fusion method and device, electronic equipment and storage medium |
CN116681733B (en) * | 2023-08-03 | 2023-11-07 | 南京航空航天大学 | Near-distance real-time pose tracking method for space non-cooperative target |
CN116883251B (en) * | 2023-09-08 | 2023-11-17 | 宁波市阿拉图数字科技有限公司 | Image orientation splicing and three-dimensional modeling method based on unmanned aerial vehicle video |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107330373A (en) * | 2017-06-02 | 2017-11-07 | 重庆大学 | A kind of parking offense monitoring system based on video |
CN107341814A (en) * | 2017-06-14 | 2017-11-10 | 宁波大学 | The four rotor wing unmanned aerial vehicle monocular vision ranging methods based on sparse direct method |
CN107369183A (en) * | 2017-07-17 | 2017-11-21 | 广东工业大学 | Towards the MAR Tracing Registration method and system based on figure optimization SLAM |
CN109887029A (en) * | 2019-01-17 | 2019-06-14 | 江苏大学 | A kind of monocular vision mileage measurement method based on color of image feature |
CN110533722A (en) * | 2019-08-30 | 2019-12-03 | 的卢技术有限公司 | A kind of the robot fast relocation method and system of view-based access control model dictionary |
CN110570453A (en) * | 2019-07-10 | 2019-12-13 | 哈尔滨工程大学 | Visual odometer method based on binocular vision and closed-loop tracking characteristics |
CN110782494A (en) * | 2019-10-16 | 2020-02-11 | 北京工业大学 | Visual SLAM method based on point-line fusion |
CN111325842A (en) * | 2020-03-04 | 2020-06-23 | Oppo广东移动通信有限公司 | Map construction method, repositioning method and device, storage medium and electronic equipment |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10636198B2 (en) * | 2017-12-28 | 2020-04-28 | Beijing Jingdong Shangke Information Technology Co., Ltd. | System and method for monocular simultaneous localization and mapping |
CN110657803B (en) * | 2018-06-28 | 2021-10-29 | 深圳市优必选科技有限公司 | Robot positioning method, device and storage device |
CN109671120A (en) * | 2018-11-08 | 2019-04-23 | 南京华捷艾米软件科技有限公司 | A kind of monocular SLAM initial method and system based on wheel type encoder |
CN111322993B (en) * | 2018-12-13 | 2022-03-04 | 杭州海康机器人技术有限公司 | Visual positioning method and device |
CN110378345B (en) * | 2019-06-04 | 2022-10-04 | 广东工业大学 | Dynamic scene SLAM method based on YOLACT instance segmentation model |
CN111795704B (en) * | 2020-06-30 | 2022-06-03 | 杭州海康机器人技术有限公司 | Method and device for constructing visual point cloud map |
- 2020-06-30: CN application CN202010615170.6A (patent CN111795704B), status: Active
- 2021-06-30: WO application PCT/CN2021/103653 (publication WO2022002150A1), status: Application Filing
Cited By (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2022002150A1 (en) * | 2020-06-30 | 2022-01-06 | 杭州海康机器人技术有限公司 | Method and device for constructing visual point cloud map |
CN113761091B (en) * | 2020-11-27 | 2024-04-05 | 北京京东乾石科技有限公司 | Closed loop detection method, device, electronic equipment, system and storage medium |
CN113761091A (en) * | 2020-11-27 | 2021-12-07 | 北京京东乾石科技有限公司 | Closed loop detection method, device, electronic equipment, system and storage medium |
CN112614185B (en) * | 2020-12-29 | 2022-06-21 | 浙江商汤科技开发有限公司 | Map construction method and device and storage medium |
CN112614185A (en) * | 2020-12-29 | 2021-04-06 | 浙江商汤科技开发有限公司 | Map construction method and device and storage medium |
CN112767546A (en) * | 2021-01-22 | 2021-05-07 | 湖南大学 | Binocular image-based visual map generation method for mobile robot |
CN112767546B (en) * | 2021-01-22 | 2022-08-02 | 湖南大学 | Binocular image-based visual map generation method for mobile robot |
CN113063424A (en) * | 2021-03-29 | 2021-07-02 | 湖南国科微电子股份有限公司 | Method, device, equipment and storage medium for intra-market navigation |
CN113515536A (en) * | 2021-07-13 | 2021-10-19 | 北京百度网讯科技有限公司 | Map updating method, device, equipment, server and storage medium |
CN113674411A (en) * | 2021-07-29 | 2021-11-19 | 浙江大华技术股份有限公司 | Pose graph adjustment-based graph establishing method and related equipment |
CN113670293A (en) * | 2021-08-11 | 2021-11-19 | 追觅创新科技(苏州)有限公司 | Map construction method and device |
CN113536024B (en) * | 2021-08-11 | 2022-09-09 | 重庆大学 | ORB-SLAM relocation feature point retrieval acceleration method based on FPGA |
CN113536024A (en) * | 2021-08-11 | 2021-10-22 | 重庆大学 | ORB-SLAM relocation feature point retrieval acceleration method based on FPGA |
CN114088099A (en) * | 2021-11-18 | 2022-02-25 | 北京易航远智科技有限公司 | Semantic relocation method and device based on known map, electronic equipment and medium |
Also Published As
Publication number | Publication date |
---|---|
WO2022002150A1 (en) | 2022-01-06 |
CN111795704B (en) | 2022-06-03 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111795704B (en) | Method and device for constructing visual point cloud map | |
CN111780763B (en) | Visual positioning method and device based on visual map | |
CN111780764B (en) | Visual positioning method and device based on visual map | |
US11341738B2 (en) | Using a probabilistic model for detecting an object in visual data | |
CN113012212B (en) | Depth information fusion-based indoor scene three-dimensional point cloud reconstruction method and system | |
CN110287873B (en) | Non-cooperative target pose measurement method and system based on deep neural network and terminal equipment | |
CN106991388B (en) | Key point positioning method | |
CN110472585B (en) | VI-SLAM closed-loop detection method based on inertial navigation attitude track information assistance | |
Wu et al. | Metric learning based structural appearance model for robust visual tracking | |
CN110110694B (en) | Visual SLAM closed-loop detection method based on target detection | |
CN113298934A (en) | Monocular visual image three-dimensional reconstruction method and system based on bidirectional matching | |
JP2019185787A (en) | Remote determination of containers in geographical region | |
CN106611030B (en) | Object similarity comparison method and retrieval method based on video and system thereof | |
CN111368733B (en) | Three-dimensional hand posture estimation method based on label distribution learning, storage medium and terminal | |
Huang et al. | Tracking-by-detection of 3d human shapes: from surfaces to volumes | |
CN111709317A (en) | Pedestrian re-identification method based on multi-scale features under saliency model | |
CN113762049B (en) | Content identification method, content identification device, storage medium and terminal equipment | |
Srivastava et al. | Drought stress classification using 3D plant models | |
CN116894876A (en) | 6-DOF positioning method based on real-time image | |
Lin et al. | Scale alignment of 3D point clouds with different scales | |
CN116246119A (en) | 3D target detection method, electronic device and storage medium | |
CN115527050A (en) | Image feature matching method, computer device and readable storage medium | |
CN111414802B (en) | Protein data characteristic extraction method | |
CN114067128A (en) | SLAM loop detection method based on semantic features | |
CN111724438B (en) | Data processing method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CP01 | Change in the name or title of a patent holder | ||
Address after: 310051 room 304, B / F, building 2, 399 Danfeng Road, Binjiang District, Hangzhou City, Zhejiang Province
Patentee after: Hangzhou Hikvision Robot Co.,Ltd.
Address before: 310051 room 304, B / F, building 2, 399 Danfeng Road, Binjiang District, Hangzhou City, Zhejiang Province
Patentee before: HANGZHOU HIKROBOT TECHNOLOGY Co.,Ltd.