CN110889349A - VSLAM-based visual positioning method for sparse three-dimensional point cloud map - Google Patents

VSLAM-based visual positioning method for sparse three-dimensional point cloud map

Info

Publication number
CN110889349A
Authority
CN
China
Prior art keywords
image
coordinate system
point cloud
matrix
points
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911127519.5A
Other languages
Chinese (zh)
Inventor
马琳
姜晗
谭学治
王彬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harbin Institute of Technology
Original Assignee
Harbin Institute of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Harbin Institute of Technology
Priority to CN201911127519.5A
Publication of CN110889349A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/10 Terrestrial scenes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/46 Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462 Salient features, e.g. scale invariant feature transforms [SIFT]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/56 Extraction of image or video features relating to colour

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a VSLAM-based visual positioning method using a sparse three-dimensional point cloud map. By modifying the output of the open-source ORB-SLAM system, the method obtains the camera trajectory, the transformation matrix between the camera coordinate system and the global coordinate system, and a sparse three-dimensional point cloud map in the global coordinate system, where the point cloud consists of the three-dimensional coordinates of key landmark points. An initial image database is built from this information, and user positioning is then performed against the established database. The method also allows the image database to be further compressed, reducing its capacity.

Description

VSLAM-based visual positioning method for sparse three-dimensional point cloud map
Technical Field
The invention belongs to the technical field of image processing. It relates to an indoor visual positioning method based on Visual Simultaneous Localization and Mapping (VSLAM) and multi-view geometry from the field of computer vision, and in particular to a visual positioning method using a VSLAM-based sparse three-dimensional point cloud map.
Background
With the development of science and technology, people frequently enter large, unfamiliar indoor environments such as superstores, art exhibitions and airports, so location-based services are receiving increasing attention. In complex indoor environments, conventional GPS positioning fails because of signal attenuation, and various indoor positioning technologies have therefore been proposed in recent years. Compared with indoor positioning methods that require additional equipment to be deployed, vision-based indoor positioning needs no extra facilities and offers advantages in both cost and accuracy. At the same time, indoor visual information is rich, image-based feature extraction techniques are increasingly mature, and mobile devices are developing rapidly, with every smartphone now integrating at least one camera, so vision-based indoor positioning has gradually become a research hotspot.
A vision-based indoor positioning system is mainly divided into an offline stage and an online stage: in the offline stage, database images of the indoor area to be positioned are acquired together with the positions from which they were taken; in the online stage, the user takes a picture and is positioned. In visual positioning, the user first captures an image of the indoor scene with the camera of a smart mobile terminal; visual features are then extracted from the user image; finally, the user image is matched against the database images and the camera position is solved on that basis. The advantage of visual positioning is that no positioning infrastructure needs to be deployed: an image database only has to be built in advance for the indoor scene, and in the positioning stage the user's position can be estimated from the image the user captures. Moreover, because the visual positioning algorithm solves for the camera position at pixel level, it can in theory achieve high-accuracy position estimation.
The key technical points of visual positioning are the construction of the image database and the image retrieval and positioning methods, with the image database as the technical foundation. Compared with the traditional approach of acquiring image information point by point and computing the position of each collection point, building the image database with VSLAM is more efficient. SLAM originated in robotics and aims to solve the problems of localization and mapping of a robot in an unknown environment; VSLAM refers to the case where the input signal is visual. A VSLAM algorithm can be broadly divided into four parts: the front-end visual odometry, back-end optimization, loop closure detection and mapping. First, image information is obtained from the visual sensor and the visual odometry estimates the camera motion between adjacent frames; loop closure detection then judges whether the camera has returned to a previously visited position; the outputs of the visual odometry and loop closure detection are sent to the back end for optimization; finally, the three-dimensional scene is reconstructed from the estimated camera trajectory and poses.
Location services generally require real-time performance, and at present the factor that most limits the speed of visual positioning is that the image database must be traversed during image matching. Most current methods improve the image search algorithm or organize the image database hierarchically to speed up matching. However, there is often large redundancy between adjacent database images, and the same features are matched repeatedly across multiple images when the database is traversed. When the environment is large and the database capacity is high, positioning speed is seriously affected. The present method instead matches the user input image against the constructed sparse three-dimensional point cloud map, which reduces the number of repeated matches and improves positioning speed. The image database can also be further compressed on this basis, reducing its capacity.
Disclosure of Invention
The invention aims to solve the above problems in the prior art and provides a visual positioning method using a VSLAM-based sparse three-dimensional point cloud map. By modifying the output of the open-source ORB-SLAM system, the invention obtains the camera trajectory, the transformation matrix between the camera coordinate system and the global coordinate system, and a sparse three-dimensional point cloud map in the global coordinate system, where the point cloud consists of the three-dimensional coordinates of key landmark points. An initial image database is established from this information.
The invention is realized by the following technical scheme: a visual positioning method using a VSLAM-based sparse three-dimensional point cloud map, comprising the following steps:
step one, establishing an image database based on VSLAM;
step two, extracting SURF characteristic points from the user input image;
step three, roughly matching the user input image with the representative image in the image database by using the SURF descriptor to find an image with the highest matching degree;
step four, counting the distribution, along the v-axis of the pixel coordinate system, of the SURF feature points matched between the image obtained in step three and the user input image, and determining the three-dimensional point cloud range to be searched according to this pixel distribution and the IndexLists of the images;
step five, performing fine matching of the user input image and the three-dimensional point cloud in the search range obtained in the step four to obtain a well-matched 2D-3D matching pair;
and step six, calculating the position coordinates of the user by utilizing an EPnP algorithm according to the 2D-3D matching pair obtained in the step five, and completing indoor positioning.
Further, the first step specifically comprises:
Step 1.1: selecting a suitable coordinate origin p_{W,0}(x_0, y_0, z_0) in the indoor environment to be positioned and establishing a three-dimensional rectangular coordinate system; this coordinate system is the global coordinate system;
Step 1.2: starting from the coordinate origin selected in Step 1.1, walking steadily through the environment to be positioned with a platform carrying a Kinect V2 device, and collecting color images and depth images to form RGB-D data;
Step 1.3: taking the RGB-D data acquired in Step 1.2 as input, modifying the output of the open-source ORB-SLAM system, and using it to obtain the camera trajectory, the transformation matrix between the camera coordinate system and the global coordinate system, and the sparse three-dimensional point cloud map in the global coordinate system;
Step 1.4: for each image, using the transformation matrix and the global three-dimensional point cloud obtained in Step 1.3, obtaining the pixels corresponding to the three-dimensional points through the projection relation and extracting SURF feature descriptors at those pixels; ordering the sparse three-dimensional point cloud according to a fixed rule;
Step 1.5: extracting a subset of representative images with an image key-frame extraction strategy to form the image database, establishing IndexLists for the selected representative images, and assigning feature descriptors to the three-dimensional point cloud, thereby completing the construction of the image database.
Further, in Step 1.3 the sparse three-dimensional point cloud map is the set of the global three-dimensional coordinates of the ORB feature points extracted from every image, denoted

$$P_W = \{\, p_{W,i} = (x_i, y_i, z_i),\ i = 1, 2, \ldots, N \,\}$$

where N is the total number of point cloud points. The transformation matrix T is a 4×4 matrix composed of a rotation matrix R and a translation vector t, as shown in equation (1):

$$T = \begin{bmatrix} R & t \\ \mathbf{0}^T & 1 \end{bmatrix} \tag{1}$$

where R denotes the rotation from the camera coordinate system of the current frame to the selected global coordinate system.
Further, the ordering rule stores each frame image I_i and its corresponding point cloud in the order of the image timestamps; within each frame, the three-dimensional points are sorted by their pixel coordinates in the image, first by increasing u and then by increasing v.
Further, for each image, the pixels corresponding to the three-dimensional point cloud are obtained through the projection relation using the transformation matrix and the global three-dimensional point cloud obtained in Step 1.3, specifically:
let the coordinates of a point on the image be p_I = [u, v]^T, the camera intrinsic matrix be K, and the point in the global coordinate system be p_w = [x, y, z]^T; they satisfy relation (2), and the values computed from (2) are rounded to the nearest integers to obtain the pixel of the feature point corresponding to the point cloud:

$$z_C \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = K \begin{bmatrix} I_{3\times 3} & \mathbf{0} \end{bmatrix} T^{-1} \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix} \tag{2}$$

where z_C is the depth of the point in the camera coordinate system.
further, the first step and the fifth step are specifically as follows:
step one, five or one, extracting SURF characteristic points from all images, wherein the descriptor of the characteristic is s, and each image is expressed as a set of characteristic point pixels and the descriptor thereof
Figure BDA0002277314370000034
Step one, five or two, calculating any two images I with sequences of a and baAnd IbSimilarity between S (I)a,Ib) The similarity is described by the matching degree of the feature points between the two images, and the matching degree of the feature points is measured by the Euclidean distance between the corresponding descriptors of the feature points;
suppose the Euclidean distance Ed between any two feature points between two imagesijComprises the following steps:
Edij=||si-sj||2si∈Ia,sj∈Ib(5)
in the formula, siRepresenting the image sequence as the ith feature point, s, in the a imagejRepresenting that the image sequence is the jth characteristic point in the b image, and if the two characteristic points satisfy the formula (6), determining that the two characteristic points are mutually matched;
Figure BDA0002277314370000041
in the formula, Edmin1Representing nearest neighbor feature point Euclidean distance, Edmin2Representing the Euclidean distance of the next neighbor feature points, wherein epsilon is a correct matching judgment threshold value;
comparing the characteristic points in the two images one by one, and counting two images IaAnd IbNumber of matching points between, noted as N(a,b)(ii) a Image similarity is defined as follows:
Figure BDA0002277314370000042
step one, five and three, similarity S (I)a,Ib) Images smaller than a preset threshold value are gathered into one class, and a first image is reserved in each class to serve as a representative image so as to form an image database; assigning a feature descriptor to the three-dimensional point cloud, wherein the descriptor is an image SURF feature descriptor of the three-dimensional point cloud appearing in the image for the first time; and for IndexLists, taking the minimum value and the maximum value of the corresponding three-dimensional point cloud serial number in the class.
Further, the sixth step is specifically:
Step 6.1: from the n_E global space points obtained in step five, with three-dimensional position coordinates {p_{W,i} = (x_i, y_i, z_i), i = 1, 2, ..., n_E}, select 4 virtual control points p_{WV,i}, i = 1, 2, 3, 4.
The centroid of the n_E space points is taken as the first virtual control point:

$$p_{WV,1} = \frac{1}{n_E} \sum_{i=1}^{n_E} p_{W,i} \tag{8}$$

and the matrix A is formed:

$$A = \begin{bmatrix} p_{W,1}^T - p_{WV,1}^T \\ \vdots \\ p_{W,n_E}^T - p_{WV,1}^T \end{bmatrix} \tag{9}$$

With λ_i the eigenvalues of A^T A and v_i the corresponding unit eigenvectors, the other three virtual control points are unit points along the three principal directions:

$$p_{WV,j+1} = p_{WV,1} + v_j, \quad j = 1, 2, 3 \tag{10}$$

The space points in the global coordinate system can then be expressed in terms of the solved virtual control points:

$$p_{W,i} = \sum_{j=1}^{4} w_{ij}\, p_{WV,j} \tag{11}$$

where w_{ij} is the weight of the i-th space point with respect to the virtual control point p_{WV,j}; the weights of the i-th space point must satisfy:

$$\sum_{j=1}^{4} w_{ij} = 1 \tag{12}$$
Step 6.2: solving for the coordinates of the virtual control points in the camera coordinate system, p_{CV,i}, i = 1, 2, 3, 4.
Once the coordinates of the virtual control points in the camera coordinate system are known, the position of any space point in the camera coordinate system can be expressed as the same weighted sum of the virtual control points:

$$p_{C,i} = \sum_{j=1}^{4} w_{ij}\, p_{CV,j} \tag{13}$$

where the weights w_{ij} are the same as in equation (11).
Let the homogeneous coordinates of the image point of a space point on the image plane be p_{I,i} = [u_i, v_i, 1]^T; from the camera model, the relationship between p_{C,i} and p_{I,i} is:

$$\alpha_s\, p_{I,i} = K\, p_{C,i} = K \sum_{j=1}^{4} w_{ij}\, p_{CV,j} \tag{14}$$

where α_s is a scale factor and K is the intrinsic parameter matrix of the camera. Writing the virtual control point coordinates in the camera coordinate system as p_{CV,j} = [x_{CV,j}, y_{CV,j}, z_{CV,j}]^T, the scale factor α_s is:

$$\alpha_s = \sum_{j=1}^{4} w_{ij}\, z_{CV,j} \tag{15}$$

Expanding equation (14) with the camera parameters gives:

$$\alpha_s \begin{bmatrix} u_i \\ v_i \\ 1 \end{bmatrix} = \begin{bmatrix} f_c & 0 & u_0 \\ 0 & f_c & v_0 \\ 0 & 0 & 1 \end{bmatrix} \sum_{j=1}^{4} w_{ij} \begin{bmatrix} x_{CV,j} \\ y_{CV,j} \\ z_{CV,j} \end{bmatrix}$$

where f_c is the focal length of the camera and (u_0, v_0) are the coordinates of the intersection of the optical axis with the imaging plane.
Substituting equation (15) into equation (14), the relationship between the space point coordinates and the corresponding image point coordinates is expressed as linear equations:

$$\sum_{j=1}^{4} \left( w_{ij} f_c\, x_{CV,j} + w_{ij}(u_0 - u_i)\, z_{CV,j} \right) = 0, \qquad \sum_{j=1}^{4} \left( w_{ij} f_c\, y_{CV,j} + w_{ij}(v_0 - v_i)\, z_{CV,j} \right) = 0 \tag{16}$$

Setting

$$z_P = \left[ p_{CV,1}^T,\ p_{CV,2}^T,\ p_{CV,3}^T,\ p_{CV,4}^T \right]^T,$$

equation (16) is organized into the linear system

$$M_E\, z_P = 0 \tag{17}$$

where the matrix M_E is formed by arranging the coefficients in equation (16) and has dimension 2n_E × 12; solving this system yields the position coordinates p_{CV,j} of the virtual control points in the camera coordinate system.
Step 6.3: computing the final rotation matrix R and translation vector t from the results of Steps 6.1 and 6.2.
The centroid of the space points in the camera coordinate system and the matrix B are computed:

$$\bar{p}_C = \frac{1}{n_E} \sum_{i=1}^{n_E} p_{C,i} \tag{18}$$

$$B = \begin{bmatrix} p_{C,1}^T - \bar{p}_C^T \\ \vdots \\ p_{C,n_E}^T - \bar{p}_C^T \end{bmatrix} \tag{19}$$

The matrix H = B^T A is computed and its singular value decomposition H = U D V^T is taken; the rotation matrix and translation vector are then:

$$R = U V^T, \qquad t = \bar{p}_C - R\, p_{WV,1} \tag{20}$$

where, if det(R) < 0, then R(2,:) = -R(2,:).
On the premise of acceptable positioning accuracy, the invention reduces the capacity of the image database as much as possible and improves positioning speed. To this end, an indoor visual positioning method based on VSLAM is proposed, focusing on two aspects: constructing the image database and positioning against the constructed database. In the offline stage, a Kinect V2 device collects color and depth images, SLAM is used to obtain the camera trajectory, the transformation matrix between the camera coordinate system and the global coordinate system, and the sparse three-dimensional point cloud map in the global coordinate system, and the image database is built on this basis. In the online stage, a two-step matching strategy finds well-matched 2D-3D pairs, and the relative pose is solved with EPnP (Efficient Perspective-n-Point), completing user positioning.
Drawings
FIG. 1 is a block diagram of the visual positioning method using a VSLAM-based sparse three-dimensional point cloud map according to the present invention;
FIG. 2 is a schematic diagram of selecting a coordinate origin on an indoor map and establishing a coordinate system;
FIG. 3 is a schematic diagram of a pixel coordinate system corresponding to each image;
FIG. 4 is a schematic diagram of an image database format;
fig. 5 is a schematic diagram of the EPnP algorithm.
Detailed Description
The technical solutions in the embodiments of the present invention will be described clearly and completely with reference to the accompanying drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In practical applications the indoor environment changes dynamically, with factors such as non-rigid scene changes and lighting changes between morning and evening. To improve indoor positioning accuracy, image information would have to be collected many times at each reference point to cover the scenes a user might photograph. To reduce the workload of building the database, the invention provides an improved image database construction method based on VSLAM. At the same time, considering that redundant operations in image matching slow down positioning, the invention provides a positioning method based on the improved image database: a two-step matching method obtains 2D-3D matching pairs between the feature points of the user input image and the global three-dimensional point cloud, from which the absolute position of the user is solved.
With reference to fig. 1, the present invention provides a visual positioning method for a VSLAM-based sparse three-dimensional point cloud map, the method includes the following steps:
step one, establishing an image database based on VSLAM;
the first step is specifically as follows:
Step 1.1: selecting a suitable coordinate origin p_{W,0}(x_0, y_0, z_0) in the indoor environment to be positioned and establishing a three-dimensional rectangular coordinate system, as shown in fig. 2; this coordinate system is the global coordinate system, and a point in it is denoted p_{W,i};
Step 1.2: starting from the coordinate origin selected in Step 1.1, walking steadily through the environment to be positioned with a platform carrying a Kinect V2 device, and collecting color images and depth images to form RGB-D data;
Step 1.3: taking the RGB-D data acquired in Step 1.2 as input, modifying the output of the open-source ORB-SLAM system, and using it to obtain the camera trajectory, the transformation matrix between the camera coordinate system and the global coordinate system, and the sparse three-dimensional point cloud map in the global coordinate system;
In Step 1.3, the sparse three-dimensional point cloud map is the set of the global three-dimensional coordinates of the ORB feature points extracted from every image, denoted

$$P_W = \{\, p_{W,i} = (x_i, y_i, z_i),\ i = 1, 2, \ldots, N \,\}$$

where N is the total number of point cloud points.
The transformation matrix T is a 4×4 matrix composed of a rotation matrix R and a translation vector t, as shown in equation (1):

$$T = \begin{bmatrix} R & t \\ \mathbf{0}^T & 1 \end{bmatrix} \tag{1}$$
where R denotes a rotation matrix from the camera coordinate system of the current frame to the selected global coordinate system.
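As an illustration of how T in equation (1) is used in the following steps, the sketch below assembles the matrix from R and t and inverts it to map global coordinates back into the camera frame; the function names are illustrative, not taken from the patent.

```python
import numpy as np

def make_transfer_matrix(R, t):
    """Assemble the 4x4 matrix T of equation (1) from the 3x3 rotation R
    (current camera frame -> global frame) and the translation vector t."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

def invert_transfer_matrix(T):
    """Closed-form inverse of T (global frame -> camera frame)."""
    R, t = T[:3, :3], T[:3, 3]
    T_inv = np.eye(4)
    T_inv[:3, :3] = R.T
    T_inv[:3, 3] = -R.T @ t
    return T_inv
```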
Step 1.4: for each image, using the transformation matrix and the global three-dimensional point cloud obtained in Step 1.3, obtaining the pixels corresponding to the three-dimensional points through the projection relation and extracting SURF feature descriptors at those pixels; the sparse three-dimensional point cloud is ordered according to a fixed rule and stored in a sequential structure.
The ordering rule stores each frame image I_i and its corresponding point cloud in the order of the image timestamps; within each frame, the three-dimensional points are sorted by their pixel coordinates in the image, first by increasing u and then by increasing v, as shown in fig. 3.
For each image, the pixels corresponding to the three-dimensional point cloud are obtained through the projection relation using the transformation matrix and the global three-dimensional point cloud obtained in Step 1.3, specifically:
let the coordinates of a point on the image be p_I = [u, v]^T, the camera intrinsic matrix be K, and the point in the global coordinate system be p_w = [x, y, z]^T; they satisfy relation (2), and the values computed from (2) are rounded to the nearest integers to obtain the pixel of the feature point corresponding to the point cloud:

$$z_C \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = K \begin{bmatrix} I_{3\times 3} & \mathbf{0} \end{bmatrix} T^{-1} \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix} \tag{2}$$

where z_C is the depth of the point in the camera coordinate system.
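A sketch of relation (2) as reconstructed above, assuming T maps camera coordinates to global coordinates so that its inverse is applied before projecting with K:

```python
import numpy as np

def project_to_pixel(p_w, K, T):
    """Project a global 3-D point p_w = [x, y, z] into integer pixel coordinates
    using the intrinsic matrix K and the camera-to-global matrix T (relation (2))."""
    p_c = np.linalg.inv(T) @ np.append(p_w, 1.0)   # global -> camera, homogeneous
    uvw = K @ p_c[:3]                              # pinhole projection
    u, v = uvw[0] / uvw[2], uvw[1] / uvw[2]
    return int(round(u)), int(round(v))            # nearest-neighbour integer pixel
```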
step five, extracting partial representative images through an image key frame extraction strategy, forming an image database, establishing IndexLists for the selected representative images, and giving a three-dimensional point cloud feature descriptor to complete the establishment of the image database, as shown in FIG. 4.
Step two, extracting SURF characteristic points from the user input image;
step three, roughly matching the user input image with the representative image in the image database by using the SURF descriptor to find an image with the highest matching degree;
Step four: counting the distribution P(a ≤ x ≤ b) of the v-coordinates, in the pixel coordinate system, of the SURF feature points matched between the image obtained in step three and the user input image,

$$P(a \le x \le b) = \frac{1}{N} \sum_{i=1}^{N} f(v_i), \qquad f(v_i) = \begin{cases} 1, & a \le v_i \le b \\ 0, & \text{otherwise} \end{cases}$$

where N is the total number of matched feature points and p_I = [u, v]^T is the pixel coordinate of each matched feature point.
The three-dimensional point cloud range to be searched is then determined from this distribution and the IndexLists of the images. Suppose images I_{i-1}, I_i, I_{i+1} have IndexLists values [index1, index2], [index3, index4], [index5, index6], and the image resolution is u_max × v_max. When P(x ≤ v_max/2) ≥ 0.5, the search range is [index1, index4]; when P(x ≤ v_max/2) < 0.5, the search range is [index3, index6].
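A sketch of this step-four decision rule; the variable names are illustrative and the 0.5 split simply restates the rule on the v-axis distribution given above.

```python
def choose_search_range(matched_v, v_max, prev_list, curr_list, next_list):
    """Pick the point-cloud index range to search.

    matched_v : v pixel coordinates of the matched SURF feature points
    v_max     : image height (resolution of the v axis)
    *_list    : IndexLists [min_index, max_index] of the previous / matched / next
                representative image, i.e. [index1, index2], [index3, index4], [index5, index6]
    """
    p_upper = sum(1 for v in matched_v if v <= v_max / 2) / len(matched_v)
    if p_upper >= 0.5:
        return [prev_list[0], curr_list[1]]   # [index1, index4]
    return [curr_list[0], next_list[1]]       # [index3, index6]
```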
Step five, performing fine matching of the user input image and the three-dimensional point cloud in the search range obtained in the step four to obtain a well-matched 2D-3D matching pair;
and step six, calculating the position coordinates of the user by utilizing an EPnP algorithm according to the 2D-3D matching pair obtained in the step five, and completing indoor positioning.
Step 1.5 is specifically as follows:
Step 1.5.1: extracting SURF feature points from all images, with feature descriptor s; each image is represented as the set of its feature point pixels and their descriptors, I_i = { (p_{I,k}, s_k), k = 1, 2, ..., N_i };
Step 1.5.2: calculating the similarity S(I_a, I_b) between any two images I_a and I_b with indices a and b; the similarity is described by the degree of feature point matching between the two images, and the matching degree is measured by the Euclidean distance between the corresponding feature descriptors.
Suppose the Euclidean distance Ed_{ij} between any two feature points of the two images is

$$Ed_{ij} = \| s_i - s_j \|_2, \quad s_i \in I_a,\ s_j \in I_b \tag{5}$$

where s_i denotes the i-th feature point of image a and s_j the j-th feature point of image b; if two feature points satisfy equation (6) they are considered a mutual match:

$$\frac{Ed_{min1}}{Ed_{min2}} < \varepsilon \tag{6}$$

where Ed_{min1} is the Euclidean distance to the nearest-neighbor feature point, Ed_{min2} is the distance to the second-nearest neighbor, and ε is the threshold for judging a correct match.
The feature points of the two images are compared one by one and the number of matching points between I_a and I_b is counted, denoted N_{(a,b)}; the image similarity S(I_a, I_b) is then defined from N_{(a,b)} by equation (7).
Step 1.5.3: images whose similarity S(I_a, I_b) is smaller than a preset threshold are grouped into one class, and the first image of each class is kept as the representative image, forming the image database; a feature descriptor is assigned to each three-dimensional point, namely the SURF descriptor from the image in which the point first appears; the IndexLists entry of a representative image is the minimum and maximum serial number of the three-dimensional points corresponding to its class.
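The match count N_{(a,b)} of equations (5)-(6) can be computed, for example, with OpenCV's SURF implementation (available in opencv-contrib builds with the non-free modules enabled); the ratio threshold below is only an illustrative value, since the patent does not fix ε.

```python
import cv2

def count_matches(img_a, img_b, eps=0.7):
    """Count ratio-test matches N_(a,b) between two 8-bit grayscale images (equations (5)-(6))."""
    surf = cv2.xfeatures2d.SURF_create()
    _, desc_a = surf.detectAndCompute(img_a, None)
    _, desc_b = surf.detectAndCompute(img_b, None)
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    knn = matcher.knnMatch(desc_a, desc_b, k=2)   # nearest and second-nearest neighbour
    return sum(1 for pair in knn
               if len(pair) == 2 and pair[0].distance / pair[1].distance < eps)
```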
With reference to fig. 5, the sixth step specifically is:
Step 6.1: from the n_E (n_E ≥ 4) global space points obtained in step five, with three-dimensional position coordinates {p_{W,i} = (x_i, y_i, z_i), i = 1, 2, ..., n_E}, select 4 virtual control points p_{WV,i}, i = 1, 2, 3, 4.
The centroid of the n_E space points is taken as the first virtual control point:

$$p_{WV,1} = \frac{1}{n_E} \sum_{i=1}^{n_E} p_{W,i} \tag{8}$$

and the matrix A is formed:

$$A = \begin{bmatrix} p_{W,1}^T - p_{WV,1}^T \\ \vdots \\ p_{W,n_E}^T - p_{WV,1}^T \end{bmatrix} \tag{9}$$

With λ_i the eigenvalues of A^T A and v_i the corresponding unit eigenvectors, the other three virtual control points are unit points along the three principal directions:

$$p_{WV,j+1} = p_{WV,1} + v_j, \quad j = 1, 2, 3 \tag{10}$$
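A sketch of equations (8)-(10) in NumPy. The unit-length offsets along the principal directions follow the wording "unit points in three main directions"; other EPnP formulations scale these offsets by the eigenvalues, so the exact scaling should be treated as an assumption.

```python
import numpy as np

def choose_control_points(P_w):
    """Select the 4 virtual control points from an (n_E, 3) array of global space points."""
    c0 = P_w.mean(axis=0)                       # equation (8): centroid
    A = P_w - c0                                # equation (9)
    _, eigvecs = np.linalg.eigh(A.T @ A)        # unit eigenvectors = principal directions
    return np.vstack([c0,
                      c0 + eigvecs[:, 0],       # equation (10): one control point
                      c0 + eigvecs[:, 1],       # per principal direction
                      c0 + eigvecs[:, 2]])
```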
The space points in the global coordinate system can then be expressed in terms of the solved virtual control points:

$$p_{W,i} = \sum_{j=1}^{4} w_{ij}\, p_{WV,j} \tag{11}$$

where w_{ij} is the weight of the i-th space point with respect to the virtual control point p_{WV,j}; the weights of the i-th space point must satisfy:

$$\sum_{j=1}^{4} w_{ij} = 1 \tag{12}$$
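Given the four control points, the weights w_{ij} of equations (11)-(12) follow from one 4×4 linear solve (the control points must not be coplanar); a sketch:

```python
import numpy as np

def barycentric_weights(P_w, C_w):
    """Solve p_W,i = sum_j w_ij * c_j with sum_j w_ij = 1 for every row of P_w (n_E x 3),
    where C_w (4 x 3) holds the virtual control points."""
    n = P_w.shape[0]
    M = np.vstack([C_w.T, np.ones((1, 4))])     # [c_1 .. c_4; 1 1 1 1], 4x4 system
    rhs = np.vstack([P_w.T, np.ones((1, n))])   # [p_i; 1] stacked column-wise
    return np.linalg.solve(M, rhs).T            # (n_E, 4) matrix of weights w_ij
```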
Step 6.2: solving for the coordinates of the virtual control points in the camera coordinate system, p_{CV,i}, i = 1, 2, 3, 4.
Once the coordinates of the virtual control points in the camera coordinate system are known, the position of any space point in the camera coordinate system can be expressed as the same weighted sum of the virtual control points:

$$p_{C,i} = \sum_{j=1}^{4} w_{ij}\, p_{CV,j} \tag{13}$$

where the weights w_{ij} are the same as in equation (11).
Let the homogeneous coordinates of the image point of a space point on the image plane be p_{I,i} = [u_i, v_i, 1]^T; from the camera model, the relationship between p_{C,i} and p_{I,i} is:

$$\alpha_s\, p_{I,i} = K\, p_{C,i} = K \sum_{j=1}^{4} w_{ij}\, p_{CV,j} \tag{14}$$

where α_s is a scale factor and K is the intrinsic parameter matrix of the camera. Writing the virtual control point coordinates in the camera coordinate system as p_{CV,j} = [x_{CV,j}, y_{CV,j}, z_{CV,j}]^T, the scale factor α_s is:

$$\alpha_s = \sum_{j=1}^{4} w_{ij}\, z_{CV,j} \tag{15}$$

Expanding equation (14) with the camera parameters gives:

$$\alpha_s \begin{bmatrix} u_i \\ v_i \\ 1 \end{bmatrix} = \begin{bmatrix} f_c & 0 & u_0 \\ 0 & f_c & v_0 \\ 0 & 0 & 1 \end{bmatrix} \sum_{j=1}^{4} w_{ij} \begin{bmatrix} x_{CV,j} \\ y_{CV,j} \\ z_{CV,j} \end{bmatrix}$$

where f_c is the focal length of the camera and (u_0, v_0) are the coordinates of the intersection of the optical axis with the imaging plane.
Substituting equation (15) into equation (14), the relationship between the space point coordinates and the corresponding image point coordinates is expressed as linear equations:

$$\sum_{j=1}^{4} \left( w_{ij} f_c\, x_{CV,j} + w_{ij}(u_0 - u_i)\, z_{CV,j} \right) = 0, \qquad \sum_{j=1}^{4} \left( w_{ij} f_c\, y_{CV,j} + w_{ij}(v_0 - v_i)\, z_{CV,j} \right) = 0 \tag{16}$$

Setting

$$z_P = \left[ p_{CV,1}^T,\ p_{CV,2}^T,\ p_{CV,3}^T,\ p_{CV,4}^T \right]^T,$$

equation (16) is organized into the linear system

$$M_E\, z_P = 0 \tag{17}$$

where the matrix M_E is formed by arranging the coefficients in equation (16) and has dimension 2n_E × 12; solving this system yields the position coordinates p_{CV,j} of the virtual control points in the camera coordinate system.
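A simplified sketch of equations (16)-(17): it builds M_E from the weights, the pixel coordinates and the intrinsics, and takes the right singular vector of the smallest singular value as z_P. The full EPnP algorithm additionally handles null spaces of dimension greater than one and resolves the overall scale and sign, which is omitted here.

```python
import numpy as np

def solve_control_points_camera(W, pixels, fc, u0, v0):
    """Build the 2*n_E x 12 matrix M_E of equation (17) and recover the stacked
    camera-frame control points z_P up to scale and sign."""
    n = W.shape[0]
    M = np.zeros((2 * n, 12))
    for i in range(n):
        u_i, v_i = pixels[i]
        for j in range(4):
            M[2 * i,     3 * j:3 * j + 3] = [W[i, j] * fc, 0.0, W[i, j] * (u0 - u_i)]
            M[2 * i + 1, 3 * j:3 * j + 3] = [0.0, W[i, j] * fc, W[i, j] * (v0 - v_i)]
    _, _, Vt = np.linalg.svd(M)
    return Vt[-1].reshape(4, 3)                 # rows are p_CV,1..4 (up to scale/sign)
```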
Step 6.3: computing the final rotation matrix R and translation vector t from the results of Steps 6.1 and 6.2.
The centroid of the space points in the camera coordinate system and the matrix B are computed:

$$\bar{p}_C = \frac{1}{n_E} \sum_{i=1}^{n_E} p_{C,i} \tag{18}$$

$$B = \begin{bmatrix} p_{C,1}^T - \bar{p}_C^T \\ \vdots \\ p_{C,n_E}^T - \bar{p}_C^T \end{bmatrix} \tag{19}$$

The matrix H = B^T A is computed and its singular value decomposition H = U D V^T is taken; the rotation matrix and translation vector are then:

$$R = U V^T, \qquad t = \bar{p}_C - R\, p_{WV,1} \tag{20}$$

where, if det(R) < 0, then R(2,:) = -R(2,:).
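Step 6.3 is the standard SVD-based alignment of the two centred point sets; the sketch below applies the sign rule stated in the text when det(R) < 0. In practice, the whole of step six can also be delegated to OpenCV's cv2.solvePnP with flags=cv2.SOLVEPNP_EPNP.

```python
import numpy as np

def recover_pose(P_w, P_c):
    """Recover R, t from matched global points P_w (n_E x 3) and their camera-frame
    coordinates P_c (n_E x 3), so that p_C is approximately R @ p_W + t."""
    c_w = P_w.mean(axis=0)                      # centroid of the global points (p_WV,1)
    c_c = P_c.mean(axis=0)                      # equation (18): camera-frame centroid
    A = P_w - c_w                               # equation (9)
    B = P_c - c_c                               # equation (19)
    H = B.T @ A                                 # 3x3 matrix of step 6.3
    U, _, Vt = np.linalg.svd(H)                 # H = U D V^T
    R = U @ Vt                                  # equation (20)
    if np.linalg.det(R) < 0:                    # sign rule from the text
        R[1, :] = -R[1, :]
    t = c_c - R @ c_w
    return R, t
```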
The meaning of each parameter in the present invention is shown in table 1:
TABLE 1 meanings of the parameters
The VSLAM-based visual positioning method using a sparse three-dimensional point cloud map has been described in detail above. Specific examples have been used to explain the principle and implementation of the invention, and the description of the embodiments is intended only to help understand the method and its core idea. A person skilled in the art may vary the specific embodiments and the application scope according to the idea of the invention; in summary, the contents of this specification should not be construed as limiting the invention.

Claims (7)

1. A VSLAM-based visual positioning method using a sparse three-dimensional point cloud map, characterized by comprising the following steps:
step one, establishing an image database based on VSLAM;
step two, extracting SURF characteristic points from the user input image;
step three, roughly matching the user input image with the representative image in the image database by using the SURF descriptor to find an image with the highest matching degree;
step four, counting the distribution, along the v-axis of the pixel coordinate system, of the SURF feature points matched between the image obtained in step three and the user input image, and determining the three-dimensional point cloud range to be searched according to this pixel distribution and the IndexLists of the images;
step five, performing fine matching of the user input image and the three-dimensional point cloud in the search range obtained in the step four to obtain a well-matched 2D-3D matching pair;
and step six, calculating the position coordinates of the user by utilizing an EPnP algorithm according to the 2D-3D matching pair obtained in the step five, and completing indoor positioning.
2. The method of claim 1, wherein: the first step is specifically as follows:
Step 1.1: selecting a suitable coordinate origin p_{W,0}(x_0, y_0, z_0) in the indoor environment to be positioned and establishing a three-dimensional rectangular coordinate system; this coordinate system is the global coordinate system;
Step 1.2: starting from the coordinate origin selected in Step 1.1, walking steadily through the environment to be positioned with a platform carrying a Kinect V2 device, and collecting color images and depth images to form RGB-D data;
Step 1.3: taking the RGB-D data acquired in Step 1.2 as input, modifying the output of the open-source ORB-SLAM system, and using it to obtain the camera trajectory, the transformation matrix between the camera coordinate system and the global coordinate system, and the sparse three-dimensional point cloud map in the global coordinate system;
Step 1.4: for each image, using the transformation matrix and the global three-dimensional point cloud obtained in Step 1.3, obtaining the pixels corresponding to the three-dimensional points through the projection relation and extracting SURF feature descriptors at those pixels; ordering the sparse three-dimensional point cloud according to a fixed rule;
Step 1.5: extracting a subset of representative images with an image key-frame extraction strategy to form the image database, establishing IndexLists for the selected representative images, and assigning feature descriptors to the three-dimensional point cloud, thereby completing the construction of the image database.
3. The method of claim 2, wherein: in Step 1.3 the sparse three-dimensional point cloud map is the set of the global three-dimensional coordinates of the ORB feature points extracted from every image, denoted

$$P_W = \{\, p_{W,i} = (x_i, y_i, z_i),\ i = 1, 2, \ldots, N \,\}$$

where N is the total number of point cloud points; the transformation matrix T is a 4×4 matrix composed of a rotation matrix R and a translation vector t, as shown in equation (1):

$$T = \begin{bmatrix} R & t \\ \mathbf{0}^T & 1 \end{bmatrix} \tag{1}$$

where R denotes the rotation from the camera coordinate system of the current frame to the selected global coordinate system.
4. The method of claim 2, wherein: the ordering rule stores each frame image I_i and its corresponding point cloud in the order of the image timestamps; within each frame, the three-dimensional points are sorted by their pixel coordinates in the image, first by increasing u and then by increasing v.
5. The method of claim 2, wherein: for each image, the pixels corresponding to the three-dimensional point cloud are obtained through the projection relation using the transformation matrix and the global three-dimensional point cloud obtained in Step 1.3, specifically:
let the coordinates of a point on the image be p_I = [u, v]^T, the camera intrinsic matrix be K, and the point in the global coordinate system be p_w = [x, y, z]^T; they satisfy relation (2), and the values computed from (2) are rounded to the nearest integers to obtain the pixel of the feature point corresponding to the point cloud:

$$z_C \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = K \begin{bmatrix} I_{3\times 3} & \mathbf{0} \end{bmatrix} T^{-1} \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix} \tag{2}$$

where z_C is the depth of the point in the camera coordinate system.
6. The method of claim 2, wherein: Step 1.5 is specifically as follows:
Step 1.5.1: extracting SURF feature points from all images, with feature descriptor s; each image is represented as the set of its feature point pixels and their descriptors, I_i = { (p_{I,k}, s_k), k = 1, 2, ..., N_i };
Step 1.5.2: calculating the similarity S(I_a, I_b) between any two images I_a and I_b with indices a and b; the similarity is described by the degree of feature point matching between the two images, and the matching degree is measured by the Euclidean distance between the corresponding feature descriptors;
suppose the Euclidean distance Ed_{ij} between any two feature points of the two images is

$$Ed_{ij} = \| s_i - s_j \|_2, \quad s_i \in I_a,\ s_j \in I_b \tag{5}$$

where s_i denotes the i-th feature point of image a and s_j the j-th feature point of image b; if two feature points satisfy equation (6) they are considered a mutual match:

$$\frac{Ed_{min1}}{Ed_{min2}} < \varepsilon \tag{6}$$

where Ed_{min1} is the Euclidean distance to the nearest-neighbor feature point, Ed_{min2} is the distance to the second-nearest neighbor, and ε is the threshold for judging a correct match;
the feature points of the two images are compared one by one and the number of matching points between I_a and I_b is counted, denoted N_{(a,b)}; the image similarity S(I_a, I_b) is then defined from N_{(a,b)} by equation (7);
Step 1.5.3: images whose similarity S(I_a, I_b) is smaller than a preset threshold are grouped into one class, and the first image of each class is kept as the representative image, forming the image database; a feature descriptor is assigned to each three-dimensional point, namely the SURF descriptor from the image in which the point first appears; the IndexLists entry of a representative image is the minimum and maximum serial number of the three-dimensional points corresponding to its class.
7. The method of claim 1, wherein: the sixth step is specifically as follows:
Step 6.1: from the n_E global space points obtained in step five, with three-dimensional position coordinates {p_{W,i} = (x_i, y_i, z_i), i = 1, 2, ..., n_E}, select 4 virtual control points p_{WV,i}, i = 1, 2, 3, 4;
the centroid of the n_E space points is taken as the first virtual control point:

$$p_{WV,1} = \frac{1}{n_E} \sum_{i=1}^{n_E} p_{W,i} \tag{8}$$

and the matrix A is formed:

$$A = \begin{bmatrix} p_{W,1}^T - p_{WV,1}^T \\ \vdots \\ p_{W,n_E}^T - p_{WV,1}^T \end{bmatrix} \tag{9}$$

with λ_i the eigenvalues of A^T A and v_i the corresponding unit eigenvectors, the other three virtual control points are unit points along the three principal directions:

$$p_{WV,j+1} = p_{WV,1} + v_j, \quad j = 1, 2, 3 \tag{10}$$

the space points in the global coordinate system can then be expressed in terms of the solved virtual control points:

$$p_{W,i} = \sum_{j=1}^{4} w_{ij}\, p_{WV,j} \tag{11}$$

where w_{ij} is the weight of the i-th space point with respect to the virtual control point p_{WV,j}, and the weights of the i-th space point must satisfy:

$$\sum_{j=1}^{4} w_{ij} = 1 \tag{12}$$
Step 6.2: solving for the coordinates of the virtual control points in the camera coordinate system, p_{CV,i}, i = 1, 2, 3, 4;
once the coordinates of the virtual control points in the camera coordinate system are known, the position of any space point in the camera coordinate system can be expressed as the same weighted sum of the virtual control points:

$$p_{C,i} = \sum_{j=1}^{4} w_{ij}\, p_{CV,j} \tag{13}$$

where the weights w_{ij} are the same as in equation (11);
let the homogeneous coordinates of the image point of a space point on the image plane be p_{I,i} = [u_i, v_i, 1]^T; from the camera model, the relationship between p_{C,i} and p_{I,i} is:

$$\alpha_s\, p_{I,i} = K\, p_{C,i} = K \sum_{j=1}^{4} w_{ij}\, p_{CV,j} \tag{14}$$

where α_s is a scale factor and K is the intrinsic parameter matrix of the camera; writing the virtual control point coordinates in the camera coordinate system as p_{CV,j} = [x_{CV,j}, y_{CV,j}, z_{CV,j}]^T, the scale factor α_s is:

$$\alpha_s = \sum_{j=1}^{4} w_{ij}\, z_{CV,j} \tag{15}$$

expanding equation (14) with the camera parameters gives:

$$\alpha_s \begin{bmatrix} u_i \\ v_i \\ 1 \end{bmatrix} = \begin{bmatrix} f_c & 0 & u_0 \\ 0 & f_c & v_0 \\ 0 & 0 & 1 \end{bmatrix} \sum_{j=1}^{4} w_{ij} \begin{bmatrix} x_{CV,j} \\ y_{CV,j} \\ z_{CV,j} \end{bmatrix}$$

where f_c is the focal length of the camera and (u_0, v_0) are the coordinates of the intersection of the optical axis with the imaging plane;
substituting equation (15) into equation (14), the relationship between the space point coordinates and the corresponding image point coordinates is expressed as linear equations:

$$\sum_{j=1}^{4} \left( w_{ij} f_c\, x_{CV,j} + w_{ij}(u_0 - u_i)\, z_{CV,j} \right) = 0, \qquad \sum_{j=1}^{4} \left( w_{ij} f_c\, y_{CV,j} + w_{ij}(v_0 - v_i)\, z_{CV,j} \right) = 0 \tag{16}$$

setting

$$z_P = \left[ p_{CV,1}^T,\ p_{CV,2}^T,\ p_{CV,3}^T,\ p_{CV,4}^T \right]^T,$$

equation (16) is organized into the linear system

$$M_E\, z_P = 0 \tag{17}$$

where the matrix M_E is formed by arranging the coefficients in equation (16) and has dimension 2n_E × 12; solving this system yields the position coordinates p_{CV,j} of the virtual control points in the camera coordinate system;
Step 6.3: computing the final rotation matrix R and translation vector t from the results of Steps 6.1 and 6.2;
the centroid of the space points in the camera coordinate system and the matrix B are computed:

$$\bar{p}_C = \frac{1}{n_E} \sum_{i=1}^{n_E} p_{C,i} \tag{18}$$

$$B = \begin{bmatrix} p_{C,1}^T - \bar{p}_C^T \\ \vdots \\ p_{C,n_E}^T - \bar{p}_C^T \end{bmatrix} \tag{19}$$

the matrix H = B^T A is computed and its singular value decomposition H = U D V^T is taken; the rotation matrix and translation vector are then:

$$R = U V^T, \qquad t = \bar{p}_C - R\, p_{WV,1} \tag{20}$$

where, if det(R) < 0, then R(2,:) = -R(2,:).
CN201911127519.5A 2019-11-18 2019-11-18 VSLAM-based visual positioning method for sparse three-dimensional point cloud chart Pending CN110889349A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911127519.5A CN110889349A (en) 2019-11-18 2019-11-18 VSLAM-based visual positioning method for sparse three-dimensional point cloud chart

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911127519.5A CN110889349A (en) 2019-11-18 2019-11-18 VSLAM-based visual positioning method for sparse three-dimensional point cloud chart

Publications (1)

Publication Number Publication Date
CN110889349A true CN110889349A (en) 2020-03-17

Family

ID=69747849

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911127519.5A Pending CN110889349A (en) 2019-11-18 2019-11-18 VSLAM-based visual positioning method for sparse three-dimensional point cloud chart

Country Status (1)

Country Link
CN (1) CN110889349A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111624997A (en) * 2020-05-12 2020-09-04 珠海市一微半导体有限公司 Robot control method and system based on TOF camera module and robot
CN111882590A (en) * 2020-06-24 2020-11-03 广州万维创新科技有限公司 AR scene application method based on single picture positioning
CN112907745A (en) * 2021-03-23 2021-06-04 北京三快在线科技有限公司 Method and device for generating digital orthophoto map
CN113034600A (en) * 2021-04-23 2021-06-25 上海交通大学 Non-texture planar structure industrial part identification and 6D pose estimation method based on template matching
CN113643422A (en) * 2021-07-09 2021-11-12 北京三快在线科技有限公司 Information display method and device
CN113808273A (en) * 2021-09-14 2021-12-17 大连海事大学 Disordered incremental sparse point cloud reconstruction method for ship traveling wave numerical simulation

Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150070470A1 (en) * 2013-09-10 2015-03-12 Board Of Regents, The University Of Texas System Apparatus, System, and Method for Mobile, Low-Cost Headset for 3D Point of Gaze Estimation
CN106228538A (en) * 2016-07-12 2016-12-14 哈尔滨工业大学 Binocular vision indoor orientation method based on logo
JP2017053795A (en) * 2015-09-11 2017-03-16 株式会社リコー Information processing apparatus, position attitude measurement method, and position attitude measurement program
CN106826815A (en) * 2016-12-21 2017-06-13 江苏物联网研究发展中心 Target object method of the identification with positioning based on coloured image and depth image
CN107103056A (en) * 2017-04-13 2017-08-29 哈尔滨工业大学 A kind of binocular vision indoor positioning database building method and localization method based on local identities
US20180061126A1 (en) * 2016-08-26 2018-03-01 Osense Technology Co., Ltd. Method and system for indoor positioning and device for creating indoor maps thereof
WO2018049581A1 (en) * 2016-09-14 2018-03-22 浙江大学 Method for simultaneous localization and mapping
CN107830854A (en) * 2017-11-06 2018-03-23 深圳精智机器有限公司 Vision positioning method based on sparse cloud of ORB and Quick Response Code
CN109960402A (en) * 2018-12-18 2019-07-02 重庆邮电大学 A kind of actual situation register method merged based on cloud and visual signature
CN110097553A (en) * 2019-04-10 2019-08-06 东南大学 The semanteme for building figure and three-dimensional semantic segmentation based on instant positioning builds drawing system
CN110097599A (en) * 2019-04-19 2019-08-06 电子科技大学 A kind of workpiece position and orientation estimation method based on partial model expression
CN110360999A (en) * 2018-03-26 2019-10-22 京东方科技集团股份有限公司 Indoor orientation method, indoor locating system and computer-readable medium
CN110443840A (en) * 2019-08-07 2019-11-12 山东理工大学 The optimization method of sampling point set initial registration in surface in kind

Patent Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150070470A1 (en) * 2013-09-10 2015-03-12 Board Of Regents, The University Of Texas System Apparatus, System, and Method for Mobile, Low-Cost Headset for 3D Point of Gaze Estimation
JP2017053795A (en) * 2015-09-11 2017-03-16 株式会社リコー Information processing apparatus, position attitude measurement method, and position attitude measurement program
CN106228538A (en) * 2016-07-12 2016-12-14 哈尔滨工业大学 Binocular vision indoor orientation method based on logo
US20180061126A1 (en) * 2016-08-26 2018-03-01 Osense Technology Co., Ltd. Method and system for indoor positioning and device for creating indoor maps thereof
US20190234746A1 (en) * 2016-09-14 2019-08-01 Zhejiang University Method for simultaneous localization and mapping
WO2018049581A1 (en) * 2016-09-14 2018-03-22 浙江大学 Method for simultaneous localization and mapping
CN106826815A (en) * 2016-12-21 2017-06-13 江苏物联网研究发展中心 Target object method of the identification with positioning based on coloured image and depth image
CN107103056A (en) * 2017-04-13 2017-08-29 哈尔滨工业大学 A kind of binocular vision indoor positioning database building method and localization method based on local identities
CN107830854A (en) * 2017-11-06 2018-03-23 深圳精智机器有限公司 Vision positioning method based on sparse cloud of ORB and Quick Response Code
CN110360999A (en) * 2018-03-26 2019-10-22 京东方科技集团股份有限公司 Indoor orientation method, indoor locating system and computer-readable medium
CN109960402A (en) * 2018-12-18 2019-07-02 重庆邮电大学 A kind of actual situation register method merged based on cloud and visual signature
CN110097553A (en) * 2019-04-10 2019-08-06 东南大学 The semanteme for building figure and three-dimensional semantic segmentation based on instant positioning builds drawing system
CN110097599A (en) * 2019-04-19 2019-08-06 电子科技大学 A kind of workpiece position and orientation estimation method based on partial model expression
CN110443840A (en) * 2019-08-07 2019-11-12 山东理工大学 The optimization method of sampling point set initial registration in surface in kind

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
GAO QIAN: "Monocular Vision Based Object Recognition and Tracking for Intelligent Robot" *
王龙辉; 杨光; 尹芳; 丑武胜: "Three-dimensional visual simultaneous localization and mapping based on Kinect 2.0" *
马琳; 杨浩; 谭学治; 冯冠元: "A Visual-Depth Map construction method based on image key frames" *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111624997A (en) * 2020-05-12 2020-09-04 珠海市一微半导体有限公司 Robot control method and system based on TOF camera module and robot
CN111882590A (en) * 2020-06-24 2020-11-03 广州万维创新科技有限公司 AR scene application method based on single picture positioning
CN112907745A (en) * 2021-03-23 2021-06-04 北京三快在线科技有限公司 Method and device for generating digital orthophoto map
CN112907745B (en) * 2021-03-23 2022-04-01 北京三快在线科技有限公司 Method and device for generating digital orthophoto map
CN113034600A (en) * 2021-04-23 2021-06-25 上海交通大学 Non-texture planar structure industrial part identification and 6D pose estimation method based on template matching
CN113034600B (en) * 2021-04-23 2023-08-01 上海交通大学 Template matching-based texture-free planar structure industrial part identification and 6D pose estimation method
CN113643422A (en) * 2021-07-09 2021-11-12 北京三快在线科技有限公司 Information display method and device
CN113643422B (en) * 2021-07-09 2023-02-03 北京三快在线科技有限公司 Information display method and device
CN113808273A (en) * 2021-09-14 2021-12-17 大连海事大学 Disordered incremental sparse point cloud reconstruction method for ship traveling wave numerical simulation
CN113808273B (en) * 2021-09-14 2023-09-12 大连海事大学 Disordered incremental sparse point cloud reconstruction method for ship traveling wave numerical simulation

Similar Documents

Publication Publication Date Title
CN110889349A (en) VSLAM-based visual positioning method for sparse three-dimensional point cloud chart
CN111968129B (en) Instant positioning and map construction system and method with semantic perception
CN110728717B (en) Positioning method and device, equipment and storage medium
CN110738143B (en) Positioning method and device, equipment and storage medium
CN109658445A (en) Network training method, increment build drawing method, localization method, device and equipment
CN113393522B (en) 6D pose estimation method based on monocular RGB camera regression depth information
CN110675457B (en) Positioning method and device, equipment and storage medium
CN109579825B (en) Robot positioning system and method based on binocular vision and convolutional neural network
CN108648240A (en) Based on a non-overlapping visual field camera posture scaling method for cloud characteristics map registration
CN108519102B (en) Binocular vision mileage calculation method based on secondary projection
CN106940186A (en) A kind of robot autonomous localization and air navigation aid and system
CN110070598B (en) Mobile terminal for 3D scanning reconstruction and 3D scanning reconstruction method thereof
CN110276768B (en) Image segmentation method, image segmentation device, image segmentation apparatus, and medium
CN111862213A (en) Positioning method and device, electronic equipment and computer readable storage medium
CN115205489A (en) Three-dimensional reconstruction method, system and device in large scene
CN111323024B (en) Positioning method and device, equipment and storage medium
CN111127524A (en) Method, system and device for tracking trajectory and reconstructing three-dimensional image
CN111860651B (en) Monocular vision-based semi-dense map construction method for mobile robot
CN102263957B (en) Search-window adaptive parallax estimation method
CN111899280A (en) Monocular vision odometer method adopting deep learning and mixed pose estimation
CN111709317B (en) Pedestrian re-identification method based on multi-scale features under saliency model
CN116772820A (en) Local refinement mapping system and method based on SLAM and semantic segmentation
CN116843754A (en) Visual positioning method and system based on multi-feature fusion
CN112615993A (en) Depth information acquisition method, binocular camera module, storage medium and electronic equipment
WO2024032101A1 (en) Feature map generation method and apparatus, storage medium, and computer device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
Application publication date: 20200317