CN110889349A - VSLAM-based visual positioning method for a sparse three-dimensional point cloud map - Google Patents
- Publication number: CN110889349A
- Application number: CN201911127519.5A
- Authority: CN (China)
- Prior art keywords: image; coordinate system; point cloud; matrix; points
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06V20/10 — Terrestrial scenes (scenes; scene-specific elements)
- G06F18/22 — Matching criteria, e.g. proximity measures (pattern recognition; analysing)
- G06V10/462 — Salient features, e.g. scale invariant feature transforms [SIFT] (extraction of image or video features)
- G06V10/56 — Extraction of image or video features relating to colour
Abstract
The invention provides a VSLAM-based visual positioning method for a sparse three-dimensional point cloud map. By modifying the output of the open-source ORB-SLAM system, the method obtains the camera trajectory, the transfer matrix between the camera coordinate system and the global coordinate system, and a sparse three-dimensional point cloud map in the global coordinate system, where the three-dimensional point cloud consists of the three-dimensional coordinates of key landmark points. An initial image database is established from this information, and user positioning is then performed against the established database. The method also allows the image database to be further compressed, reducing its capacity.
Description
Technical Field
The invention belongs to the technical field of image processing, relates to an indoor visual positioning method based on Visual Simultaneous Localization and Mapping (VSLAM) and multi-view geometry from the field of computer vision, and particularly relates to a VSLAM-based visual positioning method for a sparse three-dimensional point cloud map.
Background
With the development of science and technology, people increasingly move through large, unfamiliar indoor environments such as shopping malls, art exhibitions and airports, so location-based services are receiving growing attention. In complex indoor environments, conventional GPS positioning fails because of signal attenuation, and various indoor positioning technologies have been proposed in recent years. Compared with indoor positioning methods that require additional equipment to be deployed, vision-based indoor positioning needs no extra facilities and offers advantages in both cost and accuracy. Meanwhile, indoor visual information is rich, image feature extraction techniques mature by the day, and mobile devices have developed rapidly, with every smartphone now integrating at least one camera; vision-based indoor positioning has therefore gradually become a research hotspot.
A vision-based indoor positioning system is mainly divided into an offline stage and an online stage: in the offline stage, database images of the indoor environment to be positioned are acquired together with the positions at which they were taken; in the online stage, the user takes a photograph and is positioned. In visual positioning, the user first captures an image of the indoor scene with the camera of a smart mobile terminal; visual features are then extracted from the captured image; finally, the user image is feature-matched against the database images, and the camera position is solved on that basis. The advantage of visual positioning is that no positioning facilities need to be deployed: an image database merely has to be established in the indoor scene in advance, and in the positioning stage the user's position can be estimated from the image they capture. Moreover, because visual positioning solves for the camera position at pixel granularity, it can in principle achieve high-accuracy position estimation.
The key technical points of visual positioning are the establishment of the image database and the image search and positioning methods, with the database as the technical foundation. Compared with the traditional approach of acquiring images point by point and computing each position individually, building the image database with VSLAM is more efficient. SLAM originated in robotics and aims to solve the problems of localization and mapping for a robot in an unknown environment; VSLAM refers to SLAM whose input signals are visual. A VSLAM algorithm can be broadly divided into four parts: front-end visual odometry, back-end optimization, loop closure detection, and mapping. First, image information is obtained from the vision sensor, and visual odometry estimates the camera motion between adjacent frames; then loop closure detection judges whether the camera has returned to a previously visited position; next, the outputs of the visual odometry and the loop detection are sent to the back end for optimization; finally, the three-dimensional scene is reconstructed from the estimated camera trajectory and poses.
Location services generally require real-time performance, and at present the factor that most limits the speed of visual positioning is that the image database must be traversed during image matching. Most current methods improve the image search algorithm or organize the image database hierarchically to speed up matching. However, there is often large redundancy between adjacent database images, and traversing the database repeatedly matches the same features across multiple images; when the positioning environment is large and the database capacity is high, the positioning speed suffers severely. In the present method, the established sparse three-dimensional point cloud map is matched against the user input image, reducing the number of repeated matches and improving the positioning speed. The image database can also be further compressed on this basis, reducing its capacity.
Disclosure of Invention
The invention aims to solve the problems in the prior art and provides a VSLAM-based visual positioning method for a sparse three-dimensional point cloud map. The output of the open-source ORB-SLAM system is modified to obtain the camera trajectory, the transfer matrix between the camera coordinate system and the global coordinate system, and a sparse three-dimensional point cloud map in the global coordinate system, where the three-dimensional point cloud consists of the three-dimensional coordinates of key landmark points. An initial image database is established from this information.
The invention is realized by the following technical scheme, and provides a visual positioning method of a VSLAM-based sparse three-dimensional point cloud picture, which comprises the following steps:
step one, establishing an image database based on VSLAM;
step two, extracting SURF characteristic points from the user input image;
step three, roughly matching the user input image with the representative image in the image database by using the SURF descriptor to find an image with the highest matching degree;
step four, counting the pixel distribution of SURF matching characteristic points of the matching image obtained in the step three and the user input image in a v axis under a pixel coordinate system, and determining a three-dimensional point cloud range to be searched according to the pixel distribution and IndexLists of the image;
step five, performing fine matching of the user input image and the three-dimensional point cloud in the search range obtained in the step four to obtain a well-matched 2D-3D matching pair;
and step six, calculating the position coordinates of the user by utilizing an EPnP algorithm according to the 2D-3D matching pair obtained in the step five, and completing indoor positioning.
Further, step one specifically comprises:

Substep 1.1: according to the indoor environment to be positioned, select a suitable coordinate origin p_{W,0} = (x_0, y_0, z_0) and establish a three-dimensional rectangular coordinate system; this three-dimensional rectangular coordinate system is the global coordinate system.

Substep 1.2: starting from the coordinate origin selected in substep 1.1, walk steadily through the environment to be positioned with a platform carrying a Kinect V2 device, collecting colour image and depth image information to form RGB-D data.

Substep 1.3: with the RGB-D data collected in substep 1.2 as input, modify the output of the open-source ORB-SLAM system to obtain the camera trajectory, the transfer matrix between the camera coordinate system and the global coordinate system, and the sparse three-dimensional point cloud map in the global coordinate system.

Substep 1.4: for each image, use the transfer matrix and the global three-dimensional point cloud obtained in substep 1.3 to find, through the projection relation, the pixels corresponding to the three-dimensional point cloud, and extract SURF feature descriptors at those pixels; sort the sparse three-dimensional point cloud according to a fixed rule.

Substep 1.5: extract a subset of representative images through an image key-frame extraction strategy to form the image database, establish IndexLists for the selected representative images, and assign feature descriptors to the three-dimensional point cloud, completing the establishment of the image database.
Further, in substep 1.3, the sparse three-dimensional point cloud map is the set of the global three-dimensional coordinates of the ORB feature points extracted from each image, denoted {p_{w,i}}. The transfer matrix T is a 4×4 matrix composed of a rotation matrix R and a translation vector t, as in (1):

$$T = \begin{bmatrix} R & t \\ 0^{T} & 1 \end{bmatrix} \quad (1)$$

where R denotes the rotation from the camera coordinate system of the current frame to the selected global coordinate system.
Further, the sorting rule stores each frame image I_i and its corresponding point cloud in the order of the image timestamps; within each frame, the three-dimensional points belonging to the frame are sorted by their pixel coordinates in the image, first by increasing u and then by increasing v.
Further, for each image, the pixels corresponding to the three-dimensional point cloud are obtained through the projection relation using the transfer matrix and the global three-dimensional point cloud from substep 1.3, as follows:

Let a point on the image have coordinates p_I = [u, v]^T, let K be the camera intrinsic matrix, and let p_w = [x, y, z]^T be a point cloud point in the global coordinate system. Since the transfer matrix maps the camera frame to the global frame, the point is first transformed into the camera frame, and the quantities satisfy relation (2):

$$z_c \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = K\, R^{T} (p_w - t) \quad (2)$$

where z_c is the depth of the point in the camera frame. The value computed from (2) is rounded to the nearest integer to obtain the pixel corresponding to the point cloud point.
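As a concrete illustration of the projection relation (2), the following sketch (the intrinsic matrix K and the identity camera pose are assumed example values, not taken from the patent) transforms a global point into the camera frame with the inverse of the transfer matrix and rounds the projected coordinates to the nearest pixel:

```python
import numpy as np

def project_to_pixel(p_w, K, T):
    """Project a global 3-D point to integer pixel coordinates.

    T is the 4x4 transfer matrix from the camera frame to the global
    frame, so the point is first moved into the camera frame with the
    inverse transform, then projected with the intrinsics K."""
    R, t = T[:3, :3], T[:3, 3]
    p_c = R.T @ (p_w - t)                # global -> camera frame
    uvw = K @ p_c                        # pinhole projection
    u, v = uvw[0] / uvw[2], uvw[1] / uvw[2]
    return int(round(u)), int(round(v))  # nearest-neighbour integer

# assumed example intrinsics: focal length 500 px, principal point (320, 240)
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
T = np.eye(4)                            # camera coinciding with the global origin
print(project_to_pixel(np.array([0.0, 0.0, 2.0]), K, T))  # point on the optical axis
```

A point on the optical axis projects exactly onto the principal point, which gives a quick sanity check of the sign conventions.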
Further, the fifth substep of step one is specifically as follows:

Substep 1.5.1: extract SURF feature points from all images; denoting a feature descriptor by s, each image is represented as the set of its feature-point pixels and their descriptors.
Substep 1.5.2: for any two images I_a and I_b (with sequence numbers a and b), compute their similarity S(I_a, I_b). The similarity is described by the degree of feature-point matching between the two images, and the matching degree of the feature points is measured by the Euclidean distance between the corresponding descriptors.

The Euclidean distance between any two feature points of the two images is

$$Ed_{ij} = \| s_i - s_j \|_2,\quad s_i \in I_a,\ s_j \in I_b \quad (5)$$

where s_i denotes the i-th feature point of image I_a and s_j the j-th feature point of image I_b. Two feature points are considered a match if they satisfy formula (6):

$$Ed_{min1} < \varepsilon \cdot Ed_{min2} \quad (6)$$

where Ed_{min1} is the nearest-neighbour Euclidean distance, Ed_{min2} is the second-nearest-neighbour Euclidean distance, and ε is the threshold for accepting a correct match.

Comparing the feature points of the two images one by one, the number of matching points between I_a and I_b is counted and denoted N_{(a,b)}; the image similarity is defined on this basis as follows:
Substep 1.5.3: images whose similarity S(I_a, I_b) is smaller than a preset threshold are clustered into one class, and the first image of each class is retained as the representative image to form the image database. Each three-dimensional point cloud point is assigned a feature descriptor, namely the SURF descriptor from the image in which the point first appears. For IndexLists, the minimum and maximum of the three-dimensional point cloud serial numbers within the class are taken.
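The nearest-to-second-nearest ratio test of formulas (5) and (6) can be sketched as follows (the descriptor values and the threshold ε = 0.7 are illustrative assumptions; real SURF descriptors are 64- or 128-dimensional):

```python
import numpy as np

def match_count(desc_a, desc_b, eps=0.7):
    """Count feature matches between two descriptor sets using the
    Euclidean distances Ed_ij and the ratio test Ed_min1 < eps * Ed_min2."""
    n = 0
    for s_i in desc_a:
        d = np.linalg.norm(desc_b - s_i, axis=1)  # Ed_ij for all j
        if len(d) < 2:
            continue
        d_sorted = np.sort(d)
        if d_sorted[0] < eps * d_sorted[1]:       # nearest vs. second nearest
            n += 1
    return n

# toy 2-D "descriptors": both points of a have one clear counterpart in b
a = np.array([[0.0, 0.0], [10.0, 10.0]])
b = np.array([[0.1, 0.0], [5.0, 5.0], [10.0, 10.1]])
print(match_count(a, b))
```

The resulting count plays the role of N_(a,b), from which the image similarity of substep 1.5.2 is derived.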
Further, the sixth step is specifically:
Substep 6.1: from the three-dimensional position coordinate set of the n_E (n_E ≥ 4) global spatial points obtained in step five, {p_{W,i} = (x, y, z), i = 1, 2, ..., n_E}, select 4 virtual control points p_{WV,i}, i = 1, 2, 3, 4.

The barycentre of the n_E spatial points is taken as the first virtual control point:

$$p_{WV,1} = \frac{1}{n_E} \sum_{i=1}^{n_E} p_{W,i} \quad (8)$$

and the matrix A is formed from the centred coordinates:

$$A = \begin{bmatrix} p_{W,1}^{T} - p_{WV,1}^{T} \\ \vdots \\ p_{W,n_E}^{T} - p_{WV,1}^{T} \end{bmatrix} \quad (9)$$

Let λ_i be the eigenvalues of A^T A with corresponding eigenvectors v_i; the remaining three virtual control points are unit steps along the three principal directions:

$$p_{WV,j+1} = p_{WV,1} + \sqrt{\lambda_j / n_E}\; v_j,\quad j = 1, 2, 3 \quad (10)$$
The spatial points in the global coordinate system can then be expressed in terms of the solved virtual control points:

$$p_{W,i} = \sum_{j=1}^{4} w_{ij}\, p_{WV,j} \quad (11)$$

where w_{ij} is the weight of the i-th spatial point with respect to virtual control point p_{WV,j}, and the four weights of each spatial point must satisfy $\sum_{j=1}^{4} w_{ij} = 1$.
Substep 6.2: solve for the coordinates of the virtual control points in the camera coordinate system, p_{CV,i}, i = 1, 2, 3, 4.

Once the coordinates of the virtual control points in the camera coordinate system are known, the position of any spatial point in the camera coordinate system can be written as the same weighted sum of the virtual control points:

$$p_{C,i} = \sum_{j=1}^{4} w_{ij}\, p_{CV,j} \quad (12)$$

where the weights w_{ij} are the same as in formula (11).
Let the homogeneous coordinates of the image point of a spatial point on the image plane be p_{I,i} = [u_i, v_i, 1]^T. From the camera model, p_{C,i} and p_{I,i} are related by

$$\alpha_s\, p_{I,i} = K\, p_{C,i} \quad (13)$$

where α_s is a scale factor and K is the intrinsic parameter matrix of the camera. Writing the virtual control points in the camera coordinate system as p_{CV,j} = [x_{CV,j}, y_{CV,j}, z_{CV,j}]^T, the scale factor is

$$\alpha_s = \sum_{j=1}^{4} w_{ij}\, z_{CV,j} \quad (14)$$
Expanding formula (13) with the camera parameters gives

$$\alpha_s u_i = f_c\, x_{C,i} + u_0\, z_{C,i},\qquad \alpha_s v_i = f_c\, y_{C,i} + v_0\, z_{C,i} \quad (15)$$

where f_c is the focal length of the camera and (u_0, v_0) are the coordinates of the intersection of the optical axis with the imaging plane.

Substituting formula (14) into formula (15), the relation between the spatial point coordinates and the corresponding image point coordinates is expressed as linear equations in the control-point coordinates:

$$\sum_{j=1}^{4}\big(w_{ij} f_c\, x_{CV,j} + w_{ij}(u_0 - u_i)\, z_{CV,j}\big) = 0,\qquad \sum_{j=1}^{4}\big(w_{ij} f_c\, y_{CV,j} + w_{ij}(v_0 - v_i)\, z_{CV,j}\big) = 0 \quad (16)$$

written in matrix form as

$$M_E\, z_P = 0 \quad (17)$$

where the matrix M_E, of size 2n_E × 12, is formed by arranging the coefficients of formula (16); solving it yields z_P, the stacked position coordinates p_{CV,j} of the virtual control points in the camera coordinate system.
Substep 6.3: from the results of substeps 6.1 and 6.2, solve for the final rotation matrix R and translation vector t.

Compute the matrix B of centred camera-frame coordinates, the analogue of A in the camera coordinate system:

$$B = \begin{bmatrix} p_{C,1}^{T} - \bar{p}_C^{T} \\ \vdots \\ p_{C,n_E}^{T} - \bar{p}_C^{T} \end{bmatrix} \quad (18)$$

Compute the matrix H = B^T A and apply the singular value decomposition H = U D V^T; the rotation and translation matrices are then

$$R = U V^{T},\qquad t = \bar{p}_C - R\, \bar{p}_W \quad (19)$$

where, if det(R) < 0, the sign is corrected by R(2,:) = −R(2,:).
The invention reduces the capacity of the image database as much as possible and improves the positioning speed while keeping the positioning accuracy acceptable. To this end, an indoor visual positioning method based on VSLAM is proposed, focusing on two aspects: establishing the image database, and positioning against the established database. In the offline stage, a Kinect V2 device collects colour and depth images; SLAM is used to obtain the camera trajectory, the transfer matrix between the camera coordinate system and the global coordinate system, and the sparse three-dimensional point cloud map in the global coordinate system; and the image database is established on this basis. In the online stage, a two-step matching strategy finds well-matched 2D-3D pairs, and the relative pose is obtained with EPnP (Efficient Perspective-n-Point), completing the user positioning.
Drawings
FIG. 1 is a block diagram of a visual positioning method of a VSLAM-based sparse three-dimensional point cloud graph according to the present invention;
FIG. 2 is a schematic diagram of selecting a coordinate origin on an indoor map and establishing a coordinate system;
FIG. 3 is a schematic diagram of a pixel coordinate system corresponding to each image;
FIG. 4 is a schematic diagram of an image database format;
fig. 5 is a schematic diagram of the EPnP algorithm.
Detailed Description
The technical solutions in the embodiments of the present invention will be described clearly and completely with reference to the accompanying drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In practical application, the indoor environment changes dynamically: there are non-rigid changes of state, lighting changes between morning and evening, and other factors. To improve indoor positioning accuracy, image information would need to be acquired many times at each reference point in order to exhaust the scenes a user might photograph. To reduce the workload of building the database, the invention provides an improved database-building method based on the VSLAM technique. At the same time, considering that redundant operations in image matching reduce the positioning speed, the invention provides a positioning method based on the improved image database: a two-step matching method obtains 2D-3D matching pairs between the feature points of the user input image and the global three-dimensional point cloud, from which the absolute position of the user is solved.
With reference to fig. 1, the present invention provides a visual positioning method for a VSLAM-based sparse three-dimensional point cloud map, the method includes the following steps:
step one, establishing an image database based on VSLAM;
Step one is specifically as follows:

Substep 1.1: according to the indoor environment to be positioned, select a suitable coordinate origin p_{W,0} = (x_0, y_0, z_0) and establish a three-dimensional rectangular coordinate system; as shown in fig. 2, this three-dimensional rectangular coordinate system is the global coordinate system, and a point in it is denoted p_{w,i}.

Substep 1.2: starting from the coordinate origin selected in substep 1.1, walk steadily through the environment to be positioned with a platform carrying a Kinect V2 device, collecting colour image and depth image information to form RGB-D data.

Substep 1.3: with the RGB-D data collected in substep 1.2 as input, modify the output of the open-source ORB-SLAM system to obtain the camera trajectory, the transfer matrix between the camera coordinate system and the global coordinate system, and the sparse three-dimensional point cloud map in the global coordinate system.
In substep 1.3, the sparse three-dimensional point cloud map is the set of the global three-dimensional coordinates of the ORB feature points extracted from each image, denoted {p_{w,i}}. The transfer matrix T is a 4×4 matrix composed of a rotation matrix R and a translation vector t, as in (1):

$$T = \begin{bmatrix} R & t \\ 0^{T} & 1 \end{bmatrix} \quad (1)$$

where R denotes the rotation from the camera coordinate system of the current frame to the selected global coordinate system.
Substep 1.4: for each image, use the transfer matrix and the global three-dimensional point cloud obtained in substep 1.3 to find, through the projection relation, the pixels corresponding to the three-dimensional point cloud, and extract SURF feature descriptors at those pixels; sort the sparse three-dimensional point cloud according to a fixed rule and store it in a sequential structure.

The sorting rule stores each frame image I_i and its corresponding point cloud in the order of the image timestamps; within each frame, the three-dimensional points belonging to the frame are sorted by their pixel coordinates in the image, first by increasing u and then by increasing v, as shown in fig. 3.
For each image, the pixels corresponding to the three-dimensional point cloud are obtained through the projection relation using the transfer matrix and the global three-dimensional point cloud from substep 1.3, as follows:

Let a point on the image have coordinates p_I = [u, v]^T, let K be the camera intrinsic matrix, and let p_w = [x, y, z]^T be a point cloud point in the global coordinate system. Since the transfer matrix maps the camera frame to the global frame, the point is first transformed into the camera frame, and the quantities satisfy relation (2):

$$z_c \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = K\, R^{T} (p_w - t) \quad (2)$$

where z_c is the depth of the point in the camera frame. The value computed from (2) is rounded to the nearest integer to obtain the pixel corresponding to the point cloud point.
Substep 1.5: extract a subset of representative images through an image key-frame extraction strategy to form the image database, establish IndexLists for the selected representative images, and assign feature descriptors to the three-dimensional point cloud, completing the establishment of the image database, as shown in fig. 4.
Step two, extracting SURF characteristic points from the user input image;
step three, roughly matching the user input image with the representative image in the image database by using the SURF descriptor to find an image with the highest matching degree;
Step four: in the pixel coordinate system, compute the distribution of the v coordinates of the SURF matching feature points between the matched image obtained in step three and the user input image,

$$P(a \le v \le b) = \frac{\left|\{\text{matched points with } a \le v \le b\}\right|}{N}$$

where N is the total number of matched feature points and each matched feature point has pixel coordinates p_I = [u, v]^T.

The three-dimensional point cloud range to search is determined from this pixel distribution and the IndexLists of the images. Suppose images I_{i-1}, I_i and I_{i+1} have IndexLists values [index1, index2], [index3, index4] and [index5, index6], and the image resolution is u_max × v_max. When P(v ≤ v_max/2) ≥ 0.5, the search range is [index1, index4]; when P(v ≤ v_max/2) < 0.5, the search range is [index3, index6].
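The range selection of step four can be sketched as follows (the IndexLists values and the 640×480 resolution are assumed example values, not from the patent):

```python
def point_cloud_search_range(v_coords, v_max, lists_prev, lists_cur, lists_next):
    """Choose the 3-D point-cloud index range to search (step four).

    v_coords : v pixel coordinates of the matched SURF feature points
    lists_*  : IndexLists [min, max] of the previous, matched and next image
    """
    frac_upper = sum(v <= v_max / 2 for v in v_coords) / len(v_coords)
    if frac_upper >= 0.5:
        # P(v <= v_max/2) >= 0.5: search the previous and current image ranges
        return [lists_prev[0], lists_cur[1]]
    # otherwise: search the current and next image ranges
    return [lists_cur[0], lists_next[1]]

# example: most matches lie in the upper half of a 640x480 image
print(point_cloud_search_range([10, 20, 400], 480,
                               [0, 99], [100, 199], [200, 299]))
```

Restricting fine matching to this index range is what avoids traversing the whole point cloud in the online stage.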
Step five, performing fine matching of the user input image and the three-dimensional point cloud in the search range obtained in the step four to obtain a well-matched 2D-3D matching pair;
and step six, calculating the position coordinates of the user by utilizing an EPnP algorithm according to the 2D-3D matching pair obtained in the step five, and completing indoor positioning.
The fifth substep of step one is specifically as follows:

Substep 1.5.1: extract SURF feature points from all images; denoting a feature descriptor by s, each image is represented as the set of its feature-point pixels and their descriptors.
Substep 1.5.2: for any two images I_a and I_b (with sequence numbers a and b), compute their similarity S(I_a, I_b). The similarity is described by the degree of feature-point matching between the two images, and the matching degree of the feature points is measured by the Euclidean distance between the corresponding descriptors.

The Euclidean distance between any two feature points of the two images is

$$Ed_{ij} = \| s_i - s_j \|_2,\quad s_i \in I_a,\ s_j \in I_b \quad (5)$$

where s_i denotes the i-th feature point of image I_a and s_j the j-th feature point of image I_b. Two feature points are considered a match if they satisfy formula (6):

$$Ed_{min1} < \varepsilon \cdot Ed_{min2} \quad (6)$$

where Ed_{min1} is the nearest-neighbour Euclidean distance, Ed_{min2} is the second-nearest-neighbour Euclidean distance, and ε is the threshold for accepting a correct match.

Comparing the feature points of the two images one by one, the number of matching points between I_a and I_b is counted and denoted N_{(a,b)}; the image similarity is defined on this basis as follows:
Substep 1.5.3: images whose similarity S(I_a, I_b) is smaller than a preset threshold are clustered into one class, and the first image of each class is retained as the representative image to form the image database. Each three-dimensional point cloud point is assigned a feature descriptor, namely the SURF descriptor from the image in which the point first appears. For IndexLists, the minimum and maximum of the three-dimensional point cloud serial numbers within the class are taken.
With reference to fig. 5, the sixth step specifically is:
Substep 6.1: from the three-dimensional position coordinate set of the n_E (n_E ≥ 4) global spatial points obtained in step five, {p_{W,i} = (x, y, z), i = 1, 2, ..., n_E}, select 4 virtual control points p_{WV,i}, i = 1, 2, 3, 4.

The barycentre of the n_E spatial points is taken as the first virtual control point:

$$p_{WV,1} = \frac{1}{n_E} \sum_{i=1}^{n_E} p_{W,i} \quad (8)$$

and the matrix A is formed from the centred coordinates:

$$A = \begin{bmatrix} p_{W,1}^{T} - p_{WV,1}^{T} \\ \vdots \\ p_{W,n_E}^{T} - p_{WV,1}^{T} \end{bmatrix} \quad (9)$$

Let λ_i be the eigenvalues of A^T A with corresponding eigenvectors v_i; the remaining three virtual control points are unit steps along the three principal directions:

$$p_{WV,j+1} = p_{WV,1} + \sqrt{\lambda_j / n_E}\; v_j,\quad j = 1, 2, 3 \quad (10)$$
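The selection of the four virtual control points, the barycentre plus one step along each principal direction of A^T A, can be sketched as follows (the symmetric point set is an assumed example):

```python
import numpy as np

def choose_control_points(pts_w):
    """Pick the 4 EPnP virtual control points: the barycentre of the
    spatial points plus sqrt(lambda_j / n) steps along the three
    principal directions (eigenvectors of A^T A)."""
    n = len(pts_w)
    c = pts_w.mean(axis=0)               # barycentre: first control point
    A = pts_w - c                        # centred coordinates (matrix A)
    lam, V = np.linalg.eigh(A.T @ A)     # eigenvalues / eigenvectors of A^T A
    ctrl = [c]
    for j in range(3):
        ctrl.append(c + np.sqrt(lam[j] / n) * V[:, j])
    return np.array(ctrl)

# symmetric example point set centred on the origin
pts = np.array([[1.0, 0, 0], [-1, 0, 0], [0, 2, 0],
                [0, -2, 0], [0, 0, 3], [0, 0, -3]])
ctrl = choose_control_points(pts)
print(ctrl[0])   # barycentre of the symmetric set: the origin
```

Any spatial point can then be written as a barycentric combination of these four control points, which is what makes the subsequent linear system (17) possible.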
The spatial points in the global coordinate system can then be expressed in terms of the solved virtual control points:

$$p_{W,i} = \sum_{j=1}^{4} w_{ij}\, p_{WV,j} \quad (11)$$

where w_{ij} is the weight of the i-th spatial point with respect to virtual control point p_{WV,j}, and the four weights of each spatial point must satisfy $\sum_{j=1}^{4} w_{ij} = 1$.
Substep 6.2: solve for the coordinates of the virtual control points in the camera coordinate system, p_{CV,i}, i = 1, 2, 3, 4.

Once the coordinates of the virtual control points in the camera coordinate system are known, the position of any spatial point in the camera coordinate system can be written as the same weighted sum of the virtual control points:

$$p_{C,i} = \sum_{j=1}^{4} w_{ij}\, p_{CV,j} \quad (12)$$

where the weights w_{ij} are the same as in formula (11).
Let the homogeneous coordinates of the image point of a spatial point on the image plane be p_{I,i} = [u_i, v_i, 1]^T. From the camera model, p_{C,i} and p_{I,i} are related by

$$\alpha_s\, p_{I,i} = K\, p_{C,i} \quad (13)$$

where α_s is a scale factor and K is the intrinsic parameter matrix of the camera. Writing the virtual control points in the camera coordinate system as p_{CV,j} = [x_{CV,j}, y_{CV,j}, z_{CV,j}]^T, the scale factor is

$$\alpha_s = \sum_{j=1}^{4} w_{ij}\, z_{CV,j} \quad (14)$$
Expanding formula (13) with the camera parameters gives

$$\alpha_s u_i = f_c\, x_{C,i} + u_0\, z_{C,i},\qquad \alpha_s v_i = f_c\, y_{C,i} + v_0\, z_{C,i} \quad (15)$$

where f_c is the focal length of the camera and (u_0, v_0) are the coordinates of the intersection of the optical axis with the imaging plane.

Substituting formula (14) into formula (15), the relation between the spatial point coordinates and the corresponding image point coordinates is expressed as linear equations in the control-point coordinates:

$$\sum_{j=1}^{4}\big(w_{ij} f_c\, x_{CV,j} + w_{ij}(u_0 - u_i)\, z_{CV,j}\big) = 0,\qquad \sum_{j=1}^{4}\big(w_{ij} f_c\, y_{CV,j} + w_{ij}(v_0 - v_i)\, z_{CV,j}\big) = 0 \quad (16)$$

written in matrix form as

$$M_E\, z_P = 0 \quad (17)$$

where the matrix M_E, of size 2n_E × 12, is formed by arranging the coefficients of formula (16); solving it yields z_P, the stacked position coordinates p_{CV,j} of the virtual control points in the camera coordinate system.
Substep 6.3: from the results of substeps 6.1 and 6.2, solve for the final rotation matrix R and translation vector t.

Compute the matrix B of centred camera-frame coordinates, the analogue of A in the camera coordinate system:

$$B = \begin{bmatrix} p_{C,1}^{T} - \bar{p}_C^{T} \\ \vdots \\ p_{C,n_E}^{T} - \bar{p}_C^{T} \end{bmatrix} \quad (18)$$

Compute the matrix H = B^T A and apply the singular value decomposition H = U D V^T; the rotation and translation matrices are then

$$R = U V^{T},\qquad t = \bar{p}_C - R\, \bar{p}_W \quad (19)$$

where, if det(R) < 0, the sign is corrected by R(2,:) = −R(2,:).
The meaning of each parameter in the present invention is shown in Table 1:
Table 1 Meanings of the parameters
The VSLAM-based visual positioning method for sparse three-dimensional point cloud maps provided by the present invention has been described in detail above; a specific example has been used to explain its principle and implementation, and the description of the embodiment is intended only to help in understanding the method and its core idea. Meanwhile, a person skilled in the art may, following the idea of the present invention, vary the specific embodiments and the scope of application. In summary, the content of this specification should not be construed as limiting the present invention.
Claims (7)
1. A VSLAM-based visual positioning method for a sparse three-dimensional point cloud map, characterized by comprising the following steps:
step one, establishing an image database based on VSLAM;
step two, extracting SURF characteristic points from the user input image;
step three, roughly matching the user input image against the representative images in the image database using SURF descriptors, to find the image with the highest matching degree;
step four, counting the distribution along the v axis, in the pixel coordinate system, of the SURF matching feature points between the matching image obtained in step three and the user input image, and determining the three-dimensional point cloud range to be searched according to this pixel distribution and the IndexLists of the image;
step five, performing fine matching of the user input image and the three-dimensional point cloud in the search range obtained in the step four to obtain a well-matched 2D-3D matching pair;
and step six, calculating the position coordinates of the user by utilizing an EPnP algorithm according to the 2D-3D matching pair obtained in the step five, and completing indoor positioning.
2. The method of claim 1, wherein: the first step is specifically as follows:
step one-one, selecting a suitable coordinate origin p_W,0(x_0, y_0, z_0) according to the indoor environment to be positioned, and establishing a three-dimensional rectangular coordinate system; this three-dimensional rectangular coordinate system is the global coordinate system;
step one-two, starting from the coordinate origin selected in step one-one, walking steadily through the environment to be positioned with a platform carrying a Kinect V2 device, and collecting color image information and depth image information to form RGB-D information;
step one-three, inputting the RGB-D information acquired in step one-two, modifying the output of the open-source ORB-SLAM system, and using it to obtain the camera trajectory, the transfer matrix between the camera coordinate system and the global coordinate system, and a sparse three-dimensional point cloud map in the global coordinate system;
step one-four, for each image, using the transfer matrix and the global three-dimensional point cloud obtained in step one-three, obtaining the pixels corresponding to the three-dimensional point cloud through the projection relationship, and extracting SURF feature descriptors at those pixels; ordering the sparse three-dimensional point cloud according to a certain rule;
step one-five, extracting some representative images through an image key frame extraction strategy to form an image database, establishing IndexLists for the selected representative images, and assigning feature descriptors to the three-dimensional point cloud, thereby completing the establishment of the image database.
3. The method of claim 2, wherein: in step one-three, the sparse three-dimensional point cloud map is the set consisting of the global three-dimensional coordinates of the ORB feature points extracted from each image, and is recorded as a set; the transfer matrix T is a square matrix of order 4, consisting of a rotation matrix R and a translation vector t, as shown in (1):
where R denotes the rotation matrix from the camera coordinate system of the current frame to the selected global coordinate system.
4. The method of claim 2, wherein: the ordering rule stores each frame image I_i and its corresponding point cloud in the order of the timestamps of the acquired images, and sorts the three-dimensional points belonging to each frame by their pixel coordinates in the image, arranged first by the u axis from small to large and then by the v axis from small to large.
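The ordering rule of claim 4 amounts to a lexicographic sort on pixel coordinates within each timestamp-ordered frame. A minimal sketch with hypothetical frame data:

```python
# Sketch of the ordering rule in claim 4: frames are kept in timestamp order,
# and the points of each frame are sorted by pixel coordinates, u ascending
# first, then v ascending. The frame data below are made-up examples.

def order_frame_points(points):
    """points: list of (u, v, xyz) tuples for one frame."""
    return sorted(points, key=lambda p: (p[0], p[1]))

frame = [(40, 10, (1.0, 2.0, 3.0)),
         (12, 88, (0.5, 1.1, 2.2)),
         (12, 30, (0.7, 0.9, 1.8))]
ordered = order_frame_points(frame)
print([(u, v) for u, v, _ in ordered])  # [(12, 30), (12, 88), (40, 10)]
```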
5. The method of claim 2, wherein: for each image, using the transfer matrix and the global three-dimensional point cloud obtained in step one-three, the pixels corresponding to the three-dimensional point cloud are obtained through the projection relationship, specifically:
Let the coordinates of a point on the image be p_I = [u, v]^T, the camera intrinsic matrix be K, and a point of the point cloud in the global coordinate system be p_W = [x, y, z]^T; the relationship they satisfy is shown in formula (2); taking the nearest-neighbor integer of the value calculated by formula (2) gives the feature point pixel value corresponding to the point cloud.
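The projection relation of formula (2) is the standard pinhole model: the global point is moved into the camera frame by R, t (from the transfer matrix of claim 3), projected through K, and rounded to the nearest pixel. A hedged sketch; K, R, t and the test point are example values, not the patent's calibration:

```python
import numpy as np

# Sketch of formula (2): project a global 3-D point to an integer pixel.
def project_to_pixel(p_w, K, R, t):
    p_c = R @ p_w + t                        # camera-frame coordinates
    uvw = K @ p_c                            # homogeneous pixel coordinates
    return (int(round(uvw[0] / uvw[2])),     # nearest-neighbor integer, as in the claim
            int(round(uvw[1] / uvw[2])))

K = np.array([[500.0,   0.0, 320.0],
              [  0.0, 500.0, 240.0],
              [  0.0,   0.0,   1.0]])
R, t = np.eye(3), np.zeros(3)
print(project_to_pixel(np.array([0.1, -0.05, 2.0]), K, R, t))  # (345, 228)
```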
6. The method of claim 2, wherein: step one-five is specifically as follows:
step one, five or one, extracting SURF characteristic points from all images, wherein the descriptor of the characteristic is s, and each image is expressed as a set of characteristic point pixels and the descriptor thereof
Step one, five or two, calculating any two images I with sequences of a and baAnd IbSimilarity between S (I)a,Ib) The similarity is described by the matching degree of the feature points between the two images, and the matching degree of the feature points is measured by the Euclidean distance between the corresponding descriptors of the feature points;
Suppose the Euclidean distance Ed_ij between any two feature points of the two images is:
Ed_ij = ||s_i − s_j||_2, s_i ∈ I_a, s_j ∈ I_b (5)
In the formula, s_i denotes the ith feature point in image a and s_j denotes the jth feature point in image b; if the two feature points satisfy formula (6), they are determined to match each other;
In the formula, Ed_min1 denotes the Euclidean distance to the nearest-neighbor feature point, Ed_min2 denotes the Euclidean distance to the second-nearest-neighbor feature point, and ε is the threshold for judging a correct match;
Comparing the feature points of the two images one by one, the number of matching points between images I_a and I_b is counted and denoted N_(a,b); the image similarity is defined as follows:
Step one-five-three, images whose similarity S(I_a, I_b) is smaller than a preset threshold are grouped into one class, and the first image in each class is retained as a representative image to form the image database; a feature descriptor is assigned to the three-dimensional point cloud, where the descriptor is the SURF feature descriptor from the image in which the three-dimensional point first appears; and for the IndexLists, the minimum and maximum values of the three-dimensional point cloud sequence numbers corresponding to the class are taken.
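The match counting of formulas (5) and (6) can be sketched as a nearest/next-nearest ratio test: a candidate match is accepted when the nearest-neighbor distance Ed_min1 is below ε times the second-nearest distance Ed_min2, and N_(a,b) counts the accepted matches. The toy descriptors and ε below are made-up values, not real SURF output:

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def count_matches(desc_a, desc_b, eps=0.7):
    """Count matches from image a to image b using the formula (6) ratio test."""
    n = 0
    for s_i in desc_a:
        d = sorted(euclidean(s_i, s_j) for s_j in desc_b)
        if len(d) >= 2 and d[0] < eps * d[1]:   # Ed_min1 < eps * Ed_min2
            n += 1
    return n

desc_a = [[0.0, 0.0], [5.0, 5.0]]
desc_b = [[0.1, 0.0], [4.0, 4.0], [10.0, 10.0]]
print(count_matches(desc_a, desc_b))  # 2
```

Dividing such a count by the number of feature points, as in the similarity of formula (7), then gives a value that can be thresholded for clustering.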
7. The method of claim 1, wherein: the sixth step is specifically as follows:
Step six-one, from the set of three-dimensional position coordinates of the n_E global spatial points obtained in step five, {p_W,i = (x, y, z), i = 1, 2, ..., n_E}, obtaining 4 virtual control points p_WV,i, i = 1, 2, 3, 4;
The center of gravity of the n_E spatial points is obtained and used as one virtual control point:
The matrix A is further obtained:
If the eigenvalues of A^T A are λ_i with corresponding eigenvectors v_i, then the other three virtual control points are unit points along the three principal directions:
In this case, the spatial points in the global coordinate system can be expressed in terms of the solved virtual control points:
In the formula, w_ij is the weight with which the ith spatial point corresponds to the virtual control point p_WV,j; the weights of the ith spatial point must satisfy:
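Step six-one and the weights of formula (11) can be sketched together: the first control point is the centroid, the other three are unit offsets along the principal directions of A^T A, and each point's weights are its barycentric coordinates with respect to the four control points (one standard way to obtain weights that sum to 1 and reproduce the point). The point set is synthetic:

```python
import numpy as np

# Sketch of step six-one: centroid + principal-direction control points,
# then the formula (11) weights for each spatial point. Synthetic data.
pts = np.array([[0.0, 0.0, 0.0],
                [2.0, 0.0, 0.0],
                [0.0, 1.0, 0.0],
                [2.0, 1.0, 0.5],
                [1.0, 0.5, 1.0]])
c = pts.mean(axis=0)                   # center of gravity: first control point
A = pts - c                            # centred points, the rows of matrix A
_, _, Vt = np.linalg.svd(A)            # rows of Vt: principal directions of A^T A
controls = np.vstack([c, c + Vt])      # 4 virtual control points

def barycentric_weights(p, controls):
    """Solve controls.T @ w = p with sum(w) = 1 (4 equations, 4 unknowns)."""
    M = np.vstack([controls.T, np.ones(4)])
    return np.linalg.solve(M, np.append(p, 1.0))

W = np.array([barycentric_weights(p, controls) for p in pts])
# every point is reproduced by its weighted sum of control points,
# and each weight row sums to 1 as required by formula (11)
print(np.allclose(W @ controls, pts), np.allclose(W.sum(axis=1), 1.0))
```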
Step six-two: solving the coordinates of the virtual control points in the camera coordinate system, namely p_CV,i, i = 1, 2, 3, 4;
After the coordinates of the virtual control points in the camera coordinate system are known, the position coordinates of any spatial point in the camera coordinate system can be represented as a weighted sum of the virtual control points:
In the formula, w_ij is consistent with formula (11);
If the homogeneous coordinates of the image point of a spatial point on the image plane are p_I,i = [u_i, v_i, 1]^T, then the camera model gives the relationship between p_C,i and p_I,i:
In the formula, α_s is a scale coefficient and K is the intrinsic parameter matrix of the camera; if the coordinates of a virtual control point in the camera coordinate system are written as p_CV,j = [x_CV,j, y_CV,j, z_CV,j]^T, the scale factor α_s is expressed as:
Further rearranging formula (14) with the camera parameters gives:
In the formula, f_c is the focal length of the camera and (u_0, v_0) are the coordinates of the intersection of the optical axis with the imaging plane;
Substituting formula (14) into formula (15), the relationship between the spatial point coordinates and the corresponding image point coordinates is expressed as a linear equation:
M_E z_P = 0 (17)
In the formula, the matrix M_E is formed by arranging the coefficients in formula (16) and has dimension 2n_E × 12; solving this system yields the position coordinates p_CV,j of the virtual control points in the camera coordinate system;
Step six-three: resolving the final rotation matrix R and translation vector t from the results obtained in step six-one and step six-two;
Calculating the matrix B:
Calculating the matrix H = B^T A, and performing SVD on H to obtain H = U D V^T; the rotation matrix and translation vector are then, respectively:
wherein if det(R) < 0, then R(2, :) = −R(2, :).
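Step six-three is an SVD-based rigid alignment (the Kabsch / orthogonal Procrustes step): with the two control-point sets centred, H = B^T A is decomposed and the rotation recovered, with a determinant check handling the reflection case that the claim's sign fix addresses. A hedged sketch with a synthetic rotation and translation, not the patent's data:

```python
import numpy as np

def solve_rt(world_pts, cam_pts):
    """Recover R, t with cam = R @ world + t from matched point sets."""
    cw, cc = world_pts.mean(axis=0), cam_pts.mean(axis=0)
    A, B = world_pts - cw, cam_pts - cc
    H = B.T @ A                          # as in the claim: H = B^T A
    U, _, Vt = np.linalg.svd(H)          # H = U D V^T
    R = U @ Vt
    if np.linalg.det(R) < 0:             # reflection guard (cf. the claim's sign fix)
        U[:, -1] *= -1
        R = U @ Vt
    t = cc - R @ cw
    return R, t

th = np.deg2rad(30.0)
R_true = np.array([[np.cos(th), -np.sin(th), 0.0],
                   [np.sin(th),  np.cos(th), 0.0],
                   [0.0,         0.0,        1.0]])
t_true = np.array([1.0, 2.0, 3.0])
world = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0],
                  [0, 0, 1], [1, 1, 1], [2, 0, 1.0]])
cam = world @ R_true.T + t_true
R, t = solve_rt(world, cam)
print(np.allclose(R, R_true) and np.allclose(t, t_true))  # True
```

With exact correspondences the recovery is exact up to numerical precision; with noisy 2D-3D matches this closed-form step yields the least-squares pose once both control-point sets are known.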
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911127519.5A CN110889349A (en) | 2019-11-18 | 2019-11-18 | VSLAM-based visual positioning method for sparse three-dimensional point cloud chart |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110889349A true CN110889349A (en) | 2020-03-17 |
Family
ID=69747849
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911127519.5A Pending CN110889349A (en) | 2019-11-18 | 2019-11-18 | VSLAM-based visual positioning method for sparse three-dimensional point cloud chart |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110889349A (en) |
Patent Citations (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20150070470A1 (en) * | 2013-09-10 | 2015-03-12 | Board Of Regents, The University Of Texas System | Apparatus, System, and Method for Mobile, Low-Cost Headset for 3D Point of Gaze Estimation |
JP2017053795A (en) * | 2015-09-11 | 2017-03-16 | 株式会社リコー | Information processing apparatus, position attitude measurement method, and position attitude measurement program |
CN106228538A (en) * | 2016-07-12 | 2016-12-14 | 哈尔滨工业大学 | Binocular vision indoor orientation method based on logo |
US20180061126A1 (en) * | 2016-08-26 | 2018-03-01 | Osense Technology Co., Ltd. | Method and system for indoor positioning and device for creating indoor maps thereof |
US20190234746A1 (en) * | 2016-09-14 | 2019-08-01 | Zhejiang University | Method for simultaneous localization and mapping |
WO2018049581A1 (en) * | 2016-09-14 | 2018-03-22 | 浙江大学 | Method for simultaneous localization and mapping |
CN106826815A (en) * | 2016-12-21 | 2017-06-13 | 江苏物联网研究发展中心 | Target object method of the identification with positioning based on coloured image and depth image |
CN107103056A (en) * | 2017-04-13 | 2017-08-29 | 哈尔滨工业大学 | A kind of binocular vision indoor positioning database building method and localization method based on local identities |
CN107830854A (en) * | 2017-11-06 | 2018-03-23 | 深圳精智机器有限公司 | Vision positioning method based on sparse cloud of ORB and Quick Response Code |
CN110360999A (en) * | 2018-03-26 | 2019-10-22 | 京东方科技集团股份有限公司 | Indoor orientation method, indoor locating system and computer-readable medium |
CN109960402A (en) * | 2018-12-18 | 2019-07-02 | 重庆邮电大学 | A kind of actual situation register method merged based on cloud and visual signature |
CN110097553A (en) * | 2019-04-10 | 2019-08-06 | 东南大学 | The semanteme for building figure and three-dimensional semantic segmentation based on instant positioning builds drawing system |
CN110097599A (en) * | 2019-04-19 | 2019-08-06 | 电子科技大学 | A kind of workpiece position and orientation estimation method based on partial model expression |
CN110443840A (en) * | 2019-08-07 | 2019-11-12 | 山东理工大学 | The optimization method of sampling point set initial registration in surface in kind |
Non-Patent Citations (3)
Title |
---|
GAO QIAN: "Monocular Vision Based Object Recognition and Tracking for Intelligent Robot" *
WANG Longhui; YANG Guang; YIN Fang; CHOU Wusheng: "Three-Dimensional Visual Simultaneous Localization and Mapping Based on Kinect 2.0" *
MA Lin; YANG Hao; TAN Xuezhi; FENG Guanyuan: "A Visual-Depth Map Building Method Based on Image Key Frames" *
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111624997A (en) * | 2020-05-12 | 2020-09-04 | 珠海市一微半导体有限公司 | Robot control method and system based on TOF camera module and robot |
CN111882590A (en) * | 2020-06-24 | 2020-11-03 | 广州万维创新科技有限公司 | AR scene application method based on single picture positioning |
CN112907745A (en) * | 2021-03-23 | 2021-06-04 | 北京三快在线科技有限公司 | Method and device for generating digital orthophoto map |
CN112907745B (en) * | 2021-03-23 | 2022-04-01 | 北京三快在线科技有限公司 | Method and device for generating digital orthophoto map |
CN113034600A (en) * | 2021-04-23 | 2021-06-25 | 上海交通大学 | Non-texture planar structure industrial part identification and 6D pose estimation method based on template matching |
CN113034600B (en) * | 2021-04-23 | 2023-08-01 | 上海交通大学 | Template matching-based texture-free planar structure industrial part identification and 6D pose estimation method |
CN113643422A (en) * | 2021-07-09 | 2021-11-12 | 北京三快在线科技有限公司 | Information display method and device |
CN113643422B (en) * | 2021-07-09 | 2023-02-03 | 北京三快在线科技有限公司 | Information display method and device |
CN113808273A (en) * | 2021-09-14 | 2021-12-17 | 大连海事大学 | Disordered incremental sparse point cloud reconstruction method for ship traveling wave numerical simulation |
CN113808273B (en) * | 2021-09-14 | 2023-09-12 | 大连海事大学 | Disordered incremental sparse point cloud reconstruction method for ship traveling wave numerical simulation |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20200317 |