WO2017114507A1 - Method and device for image positioning based on 3D reconstruction of ray model - Google Patents

Method and device for image positioning based on 3D reconstruction of ray model

Info

Publication number
WO2017114507A1
WO2017114507A1 (PCT/CN2016/113804)
Authority
WO
WIPO (PCT)
Prior art keywords
image
matching
dimensional
feature point
images
Prior art date
Application number
PCT/CN2016/113804
Other languages
English (en)
French (fr)
Inventor
周杰
邓磊
段岳圻
Original Assignee
清华大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 清华大学 filed Critical 清华大学
Priority to US16/066,168 priority Critical patent/US10580204B2/en
Publication of WO2017114507A1 publication Critical patent/WO2017114507A1/zh

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00Three dimensional [3D] modelling, e.g. data description of 3D objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/75Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
    • G06V10/757Matching configurations of points or features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T17/005Tree description, e.g. octree, quadtree
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/50Depth or shape recovery
    • G06T7/55Depth or shape recovery from multiple images
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/80Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10028Range image; Depth image; 3D point clouds
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20072Graph-based image processing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30244Camera pose

Definitions

  • the present invention relates to the field of image processing and pattern recognition technologies, and in particular, to an image positioning method and apparatus based on three-dimensional reconstruction of a ray model.
  • Image localization technology computes the pose of the capturing camera from a single image or a group of images.
  • the technology can be used for robot navigation, path planning, digital tourism, virtual reality, etc., and can be applied in areas where GPS (Global Positioning System) cannot work, such as indoors and underground.
  • compared with positioning technologies based on Bluetooth or WiFi (Wireless Fidelity), image localization does not depend on specialized equipment and its implementation cost is low.
  • in the image-point cloud (2D-3D) matching method, a large number of planar images of the target scene are first collected in advance and reconstructed offline into a three-dimensional feature point cloud of the scene; in the online positioning stage, features of the query image are extracted and matched 2D-3D against the 3D feature point cloud, and the matching results are used to estimate the pose of the target camera.
  • the object of the present invention is to solve at least one of the above technical problems to some extent.
  • a first object of the present invention is to propose an image localization method based on three-dimensional reconstruction of a ray model.
  • the method improves the reconstruction quality, reduces the acquisition cost of the reconstruction process, increases the computation speed, and improves the accuracy of image localization during the image localization process.
  • a second object of the present invention is to provide an image localization apparatus based on three-dimensional reconstruction of a ray model.
  • a third object of the present invention is to provide a storage medium.
  • an image localization method based on three-dimensional reconstruction with a ray model includes the following steps: pre-capturing a plurality of images of a plurality of scenes, and performing feature extraction on the plurality of images respectively to obtain a corresponding plurality of feature point sets.
  • in the ray-model-based three-dimensional reconstruction process, two-dimensional pixel coordinates are described by three-dimensional rays, so the ray model can express various camera models (such as panoramic, fisheye, and planar) without distortion; the method is therefore applicable to many types of cameras and makes full use of their intrinsic geometric properties, which improves the reconstruction quality, reduces the acquisition cost, and increases the computation speed.
  • in the image localization stage, the proposed localization framework based on pose graph optimization fuses the 2D-3D feature matches between the image and the point cloud with the pose information of neighboring cameras, which improves the accuracy of image localization.
  • an image localization apparatus based on three-dimensional reconstruction with a ray model includes: a first acquisition module, configured to pre-collect a plurality of images of a plurality of scenes and perform feature extraction on the images respectively to obtain a corresponding plurality of feature point sets; a generation module, configured to perform pairwise feature matching on the plurality of images, generate corresponding essential matrices according to the pairwise feature matches, and perform noise filtering on the essential matrices; a reconstruction module, configured to perform three-dimensional reconstruction based on the ray model according to the noise-filtered essential matrices to generate a three-dimensional feature point cloud and a reconstructed camera pose set; a second acquisition module, configured to acquire a query image and perform feature extraction on it to obtain a corresponding two-dimensional feature point set; and an image positioning module, configured to perform image localization based on a localization pose graph optimization framework according to the two-dimensional feature point set, the three-dimensional feature point cloud, and the reconstructed camera pose set.
  • In the image localization apparatus based on three-dimensional reconstruction with a ray model, two-dimensional pixel coordinates are described by three-dimensional rays during the ray-model-based reconstruction, so the ray model can express various camera models (such as panoramic, fisheye, and planar) without distortion; the apparatus is therefore applicable to many types of cameras and makes full use of their intrinsic geometric properties, which improves the reconstruction quality, reduces the acquisition cost, and increases the computation speed.
  • in the image localization stage, the proposed localization framework based on pose graph optimization fuses the 2D-3D feature matches between the image and the point cloud with the pose information of neighboring cameras, which improves the accuracy of image localization.
  • a storage medium configured to store an application for performing an image localization method based on a three-dimensional reconstruction of a ray model according to the first aspect of the present invention.
  • FIG. 1 is a flow chart of an image localization method based on three-dimensional reconstruction of a ray model, in accordance with one embodiment of the present invention
  • FIG. 2 is a flow chart of generating a three-dimensional feature point cloud and a reconstructed camera pose set according to an embodiment of the present invention
  • FIG. 3 is a flowchart of a specific implementation process of image positioning according to an embodiment of the present invention.
  • FIG. 4 is a diagram showing an example of an image localization method based on three-dimensional reconstruction of a ray model, in accordance with one embodiment of the present invention
  • FIG. 5 is a structural block diagram of an image localization apparatus based on three-dimensional reconstruction of a ray model according to an embodiment of the present invention
  • FIG. 6 is a structural block diagram of a reconstruction module according to an embodiment of the present invention.
  • FIG. 7 is a structural block diagram of a reconstruction module according to another embodiment of the present invention.
  • FIG. 8 is a block diagram showing the structure of an image positioning module according to an embodiment of the present invention.
  • the image localization method based on three-dimensional reconstruction of a ray model may include:
  • S101 Collect multiple images of multiple scenes in advance, and perform feature extraction on multiple images to obtain corresponding multiple feature point sets.
  • the term “plurality” is to be understood broadly here, i.e., as a sufficiently large number.
  • the type of image may include, but is not limited to, a panorama type, a fisheye type, a plane type, and the like.
  • a sufficient number of scene images may be collected in advance as the images mentioned in this embodiment, and SIFT (Scale-Invariant Feature Transform) features are extracted from each image, yielding the position of every feature point together with a descriptor set, where the descriptor set describes the surrounding-area information of the corresponding feature point.
  • S102: Perform pairwise feature matching on the multiple images, generate the corresponding essential matrices according to the pairwise feature matches, and perform noise filtering on the essential matrices.
  • the multiple images may be matched pairwise according to the feature point sets, and the feature point matches of each image pair are stored; the essential matrix can then be estimated based on the matched feature point sets.
  • all images may be matched pairwise according to the descriptor sets of the feature points, the feature point matches of each image pair stored, and the essential matrix then estimated from the matched feature points while simultaneously filtering its noise. It can be understood that, in the embodiment of the present invention, if the pairwise matched feature points are linked together, multiple tracks are formed, where each track corresponds to one 3D (three-dimensional) point to be reconstructed.
  • compared with the conventional pixel-based planar model, the present invention can accommodate and unify different camera types (such as panoramic, fisheye, and planar) through the ray model.
  • a pose graph may first be constructed, where the pose graph may include camera nodes, three-dimensional (3D) point nodes, camera-camera edges, and camera-3D point edges, which together describe the visibility relationship between the camera set and the 3D point set.
  • reconstruction then proceeds incrementally on the ray model: a pair of cameras whose relative pose is estimated with high quality is selected as the initial seed, new sample 3D points are found by ray-model-based triangulation, the new sample 3D points are used to register more cameras under the ray model, and the process iterates with denoising and optimization until no more cameras or 3D points can be found.
  • three-dimensional reconstruction is thus performed according to the noise-filtered feature matches and essential matrices to generate the three-dimensional feature point cloud and the reconstructed camera pose set.
  • the implementation process can include the following steps:
  • the corresponding pose graph can be constructed from the relative poses between the cameras and the feature points by a preset pose graph construction formula.
  • the preset pose graph construction formula may be G = (N_P, N_X, E_P, E_X), where:
  • N_P denotes the camera nodes;
  • N_X denotes the feature point (i.e., sample 3D point) nodes;
  • E_P denotes the camera-camera edges, and E_X the camera-feature point edges;
  • the camera model may be, for example, a panoramic, fisheye, or planar model;
  • each ray corresponds one-to-one with the image coordinates u(u, v) through a mapping function;
  • different camera models have different mapping functions; the mapping functions corresponding to the panoramic, fisheye, and planar cameras are described by equations (2)-(4), respectively;
  • u_c is the coordinate of the camera's principal point;
  • f is the focal length (the panoramic camera uses a special value of f);
  • p is the rotation angle about the y-axis;
  • t is the pitch angle about the x-axis;
  • u_1, v_1, φ, θ, and r are temporary variables.
  • S204: Incrementally reconstruct the pose graph based on the corresponding ray models to generate the three-dimensional feature point cloud and the reconstructed camera pose set.
  • a pair of cameras with a high-quality relative pose estimate may be selected as the initial seed; new 3D points are then found by triangulation based on the corresponding ray model, and the new 3D points are used to register more cameras under the ray model, iterating until no more cameras or 3D points are found. During this process, nonlinear optimization can be applied continuously to reduce the error of the three-dimensional reconstruction,
  • and a quality evaluation function is used to eliminate low-quality cameras and 3D points.
  • the distance metric, triangulation, camera pose estimation, nonlinear optimization, and quality evaluation modules in this process are all adapted to the ray model; compared with traditional reconstruction algorithms that apply only to planar images, the method has much wider applicability.
  • the ray model can express various camera models (such as panoramic, fisheye, and planar) without distortion, i.e., it is suitable for many types of cameras, which broadens the scope of application.
  • the image localization method may further include: building an index tree over the three-dimensional feature point clouds, and building an index tree of spatial positions for the cameras in the reconstructed camera pose set.
  • a point cloud feature index and a camera position index tree may be built; it can be understood that each point in the three-dimensional feature point cloud carries several features, which come from the images in which the point was observed.
  • the present invention builds a Kd-tree index over the feature point cloud to speed up retrieval; in addition, since the online positioning stage must retrieve the spatial neighbors of the query image, the present invention also builds a Kd-tree index of the spatial positions of the reconstructed cameras.
  • the foregoing steps S101-S103 may be performed offline. That is, an image library can be pre-established through steps S101-S103, and the corresponding three-dimensional feature point cloud and reconstructed camera pose set are generated in advance from the library and stored for use in the subsequent online image positioning stage.
  • S104 Acquire a query image, and perform feature extraction on the query image to obtain a corresponding two-dimensional feature point set.
  • a feature may be extracted from the acquired query image to obtain a two-dimensional feature point set of the query image.
  • each two-dimensional feature point corresponds to one feature descriptor
  • each 3D point in the 3D feature point cloud corresponds to multiple feature descriptors, which can be contributed by multiple images in the three-dimensional reconstruction stage.
  • S105: Based on the localization pose graph optimization framework, perform image localization according to the two-dimensional feature point set, the three-dimensional feature point cloud, and the reconstructed camera pose set.
  • the features of the query image may be matched against the features of the 3D point cloud generated offline (i.e., 2D-3D matching), and given a sufficient number of valid matches, the initial pose of the query image is estimated with a camera pose estimation algorithm.
  • the neighboring library cameras (i.e., the neighbor images) can then be queried according to the initial pose,
  • and the 2D-3D matches together with the relative poses with respect to the neighbor images are used to build the localization pose graph optimization framework, which is optimized to obtain a higher-precision positioning result.
  • the specific implementation of the image positioning based on the two-dimensional feature point set, the three-dimensional feature point cloud, and the reconstructed camera pose set based on the positioning posture map optimization framework can include the following steps:
  • the two-dimensional feature point set is matched against the three-dimensional feature point cloud according to the point cloud index trees to obtain a bidirectional 2D-3D match set.
  • the 2D-3D matches that do not satisfy the camera geometric constraints are eliminated by a camera pose estimation algorithm, yielding the inlier set I_2D-3D and an estimate of the initial pose of the query image, P_q^2D-3D = R_q^2D-3D [I | −C_q^2D-3D], where the camera matrix is composed of the rotation matrix R and the optical center position C.
  • the initial spatial position C_q^2D-3D of the query image q may be obtained from this initial pose; the reconstructed camera pose set associated with the 3D feature point cloud is then queried, according to the initial spatial position of the query image and the spatial-position index tree, to obtain the neighbor images.
  • the query image can be matched 2D-2D against each neighbor image to obtain the valid match sets between the two images.
  • the essential matrix is estimated from each valid match set, obtaining the inlier matches at the same time.
  • if the number of matches is below a threshold, the essential matrix is considered noisy and the neighbor image is removed; otherwise the essential matrix is decomposed to obtain the relative pose R_iq, C_iq with respect to the neighbor image, where the translation C_iq of the relative pose only provides a direction, not a magnitude.
  • the objective function (i.e., the localization pose graph optimization framework mentioned above) is built over the camera matrix of the query image, P_q = R_q [I | −C_q].
  • the cost function of the relative pose contains two mutually independent terms, the rotation cost and the translation-direction cost: the rotation cost is defined as the relative Euler angle of R_i and R_q, and the translation-direction cost is the chord distance between the observed translation direction R_i C_iq and the translation direction to be optimized.
  • the 2D-3D positioning result (i.e., the initial pose of the query image described above) P_q^2D-3D is used as the initial value,
  • and this initial pose of the query image is optimized by the Levenberg-Marquardt algorithm according to the localization pose graph optimization framework to obtain a higher-accuracy positioning result.
  • the present invention fuses the 2D-3D match information and the inter-image relative pose information by means of graph optimization, thereby improving the accuracy of the final positioning result.
  • steps S104-S105 are computed online: a query image is received, and the pre-generated three-dimensional feature point cloud and reconstructed camera pose set are then queried according to it to achieve image localization.
  • offline reconstruction may be performed in advance to obtain the three-dimensional feature point cloud and the reconstructed camera pose set; that is, images of enough scenes are collected offline, image features are extracted, and the images are matched pairwise.
  • the pose graph can then be constructed and three-dimensional reconstruction performed incrementally based on the ray model to obtain the three-dimensional feature point cloud and the reconstructed camera pose set, and the index tree of the feature point cloud and the camera spatial-position index tree are built.
  • online positioning can then be performed: features are first extracted from the acquired query image and matched 2D-3D against the 3D feature point cloud to obtain a bidirectional 2D-3D match set.
  • the bidirectional 2D-3D match set is estimated with a camera pose estimation algorithm to generate the initial pose of the query image, and the neighboring cameras are retrieved and the relative poses computed.
  • finally, the two sources of information are fused by building the localization pose graph, yielding a higher-precision positioning result, i.e., the position and attitude of the target camera.
  • in the ray-model-based three-dimensional reconstruction process, two-dimensional pixel coordinates are described by three-dimensional rays, so the ray model can express various camera models (such as panoramic, fisheye, and planar) without distortion; the method is therefore applicable to many types of cameras and makes full use of their intrinsic geometric properties, which improves the reconstruction quality, reduces the acquisition cost, and increases the computation speed.
  • in the image localization stage, the proposed localization framework based on pose graph optimization fuses the 2D-3D feature matches between the image and the point cloud with the pose information of neighboring cameras, which improves the accuracy of image localization.
  • the present invention also proposes an image positioning apparatus based on three-dimensional reconstruction of a ray model.
  • FIG. 5 is a structural block diagram of an image localization apparatus based on three-dimensional reconstruction of a ray model according to an embodiment of the present invention.
  • the image localization apparatus based on the ray model three-dimensional reconstruction may include: a first acquisition module 100, a generation module 200, a reconstruction module 300, a second acquisition module 400, and an image positioning module 500.
  • the first acquiring module 100 may be configured to collect multiple images of multiple scenes in advance, and perform feature extraction on multiple images to obtain corresponding multiple feature point sets.
  • the term "plurality" is to be understood broadly here, i.e., as a sufficiently large number.
  • the type of image may include, but is not limited to, a panorama type, a fisheye type, a plane type, and the like.
  • the first acquisition module 100 may pre-capture a sufficient number of scene images as the images mentioned in this embodiment, and extract SIFT (Scale-Invariant Feature Transform) features from each image, obtaining the position of every feature point together with a descriptor set, where the descriptor set describes the surrounding-area information of the corresponding feature point.
  • the generation module 200 can be configured to perform pairwise feature matching on the images, generate corresponding essential matrices according to the pairwise feature matches, and perform noise filtering on the essential matrices. Specifically, in the embodiment of the present invention, the generation module 200 may first match the multiple images pairwise according to the feature point sets and store the feature point matches of each image pair, and then estimate the essential matrix based on the matched feature point sets.
  • more specifically, the generation module 200 may match all images pairwise according to the descriptor sets of the feature points, store the feature point matches of each image pair, and then estimate the essential matrix from the matched feature points while simultaneously filtering its noise. It can be understood that, in the embodiment of the present invention, if the pairwise matched feature points are linked together, multiple tracks are formed, where each track corresponds to one 3D (three-dimensional) point to be reconstructed.
  • the reconstruction module 300 is configured to perform three-dimensional reconstruction based on the ray model according to the noise-filtered essential matrices to generate a three-dimensional feature point cloud and a reconstructed camera pose set. It can be understood that, compared with the conventional pixel-based planar model, the present invention can accommodate and unify different camera types (such as panoramic, fisheye, and planar) through the ray model.
  • the reconstruction module 300 may first construct a pose graph, where the pose graph may include camera nodes, three-dimensional (3D) point nodes, camera-camera edges, and camera-3D point edges, which together describe the visibility relationship between the camera set and the 3D point set.
  • the reconstruction module 300 may include: a decomposition unit 310, a construction unit 320, a definition unit 330, and a reconstruction unit 340. More specifically, the decomposition unit 310 can be configured to decompose the noise-filtered essential matrices to obtain the relative poses between the corresponding cameras.
  • the construction unit 320 can be configured to construct the corresponding pose graph according to the relative poses between the cameras and the feature points. More specifically, the corresponding pose graph can be constructed from the relative poses between the cameras and the feature points by a preset pose graph construction formula.
  • the preset pose graph construction formula may be formula (1) above.
  • the definition unit 330 can be configured to obtain the models of the cameras respectively, and define the corresponding ray models according to the models of the cameras respectively. More specifically, the definition unit 330 may first obtain the camera model, such as a panoramic, fisheye, or planar model, and then define the corresponding ray model for each model.
  • each ray corresponds one-to-one with the image coordinates u(u, v) through a mapping function;
  • the mapping functions of different camera models differ, and the mapping functions corresponding to the panoramic, fisheye, and planar cameras are described by formulas (2)-(4) above, respectively.
  • the reconstruction unit 340 can be configured to incrementally reconstruct the pose graph based on the corresponding ray models to generate the three-dimensional feature point cloud and the reconstructed camera pose set. More specifically, the reconstruction unit 340 may first select a pair of cameras with a high-quality relative pose estimate as the initial seed, then find new 3D points using triangulation based on the corresponding ray model, and then use the new 3D points to register more cameras under the ray model, iterating until no more cameras or 3D points are found. During this process, nonlinear optimization can be applied continuously to reduce the three-dimensional reconstruction error, and a quality evaluation function is used to eliminate low-quality cameras and 3D points.
  • the distance metric, triangulation, camera pose estimation, nonlinear optimization, and quality evaluation modules in this process are all adapted to the ray model; compared with traditional reconstruction algorithms that apply only to planar images, the method has much wider applicability.
  • the ray model can express various camera models (such as panoramic, fisheye, and planar) without distortion, i.e., it is suitable for many types of cameras, which broadens the scope of application.
  • the reconstruction module 300 may further include an establishing unit 350, which may be configured to, after the reconstruction unit 340 generates the three-dimensional feature point cloud and the reconstructed camera pose set, build an index tree over the three-dimensional feature point clouds and build an index tree of spatial positions for the cameras in the reconstructed camera pose set.
  • the establishing unit 350 may build a point cloud feature index and a camera position index tree; it can be understood that each point in the three-dimensional feature point cloud carries several features, which come from the images in which the point was observed. In the subsequent online positioning stage, features of the query image must be matched against this feature point cloud to achieve image localization; to accelerate the matching, the present invention builds a Kd-tree index over the feature point cloud to speed up retrieval. In addition, since the online positioning stage must retrieve the spatial neighbors of the query image, the present invention also builds a Kd-tree index of the spatial positions of the reconstructed cameras.
  • the second acquisition module 400 is configured to acquire a query image and perform feature extraction on it to obtain the corresponding two-dimensional feature point set. More specifically, the second acquisition module 400 may extract features from the acquired query image to obtain its two-dimensional feature point set. It should be noted that each two-dimensional feature point corresponds to one feature descriptor, whereas each 3D point in the 3D feature point cloud corresponds to multiple feature descriptors, contributed by multiple images in the three-dimensional reconstruction stage.
  • the image positioning module 500 can be configured to perform image localization based on the localization pose graph optimization framework according to the two-dimensional feature point set, the three-dimensional feature point cloud, and the reconstructed camera pose set. More specifically, the image positioning module 500 can match the features of the query image against the features of the 3D point cloud generated offline (i.e., 2D-3D matching) and, given a sufficient number of valid matches, estimate the initial pose of the query image with a camera pose estimation algorithm;
  • the neighboring library cameras (i.e., the neighbor images) can then be queried according to the initial pose, and the 2D-3D matches and the relative poses with respect to the neighbor images are used to build the localization pose graph optimization framework, which is optimized to obtain a higher-precision positioning result.
  • the image positioning module 500 may include: a first matching unit 510, a first generation unit 520, a query unit 530, a second matching unit 540, a second generation unit 550, an establishing unit 560, and an image positioning unit 570.
  • the first matching unit 510 is configured to match the two-dimensional feature point set against the three-dimensional feature point cloud according to the point cloud index trees to obtain a bidirectional 2D-3D match set.
  • if the ratio of the nearest to the second-nearest neighbor is below the threshold th_match, a one-way valid 2D-to-3D match is considered to be established between the two-dimensional feature point and its nearest-neighbor 3D point, and all such matches over F^2D constitute the one-way 2D-to-3D match set; its intersection with the reverse 3D-to-2D matches gives the bidirectional set.
  • the second matching unit 540 can be configured to perform feature matching between the query image and the neighbor images to obtain the corresponding valid match sets. More specifically, the second matching unit 540 can match the query image 2D-2D against a neighbor image to obtain the valid match sets between the two images.
  • the second generation unit 550 is configured to generate the relative poses with respect to the neighbor images according to the valid match sets. More specifically, the second generation unit 550 may estimate the essential matrix from each valid match set, obtaining the inlier matches at the same time; when the number of matches is below a threshold, the essential matrix is considered noisy and the neighbor image is removed, and otherwise the essential matrix is decomposed to obtain the relative pose R_iq, C_iq with respect to the neighbor image, where the translation C_iq of the relative pose only provides a direction, not a magnitude.
  • the image positioning unit 570 can be configured to optimize the initial pose of the query image according to the localization pose graph optimization framework to achieve image localization. More specifically, the image positioning unit 570 can take the 2D-3D positioning result (i.e., the initial pose of the query image described above) P_q^2D-3D as the initial value, and optimize the initial pose P_q^2D-3D of the query image by the Levenberg-Marquardt algorithm according to the localization pose graph optimization framework to obtain a higher-accuracy positioning result.
  • the present invention fuses the 2D-3D match information and the inter-image relative pose information by means of graph optimization, thereby improving the accuracy of the final positioning result.
  • In the image localization apparatus based on three-dimensional reconstruction with a ray model, two-dimensional pixel coordinates are described by three-dimensional rays during the ray-model-based reconstruction, so the ray model can express various camera models (such as panoramic, fisheye, and planar) without distortion; the apparatus is therefore applicable to many types of cameras and makes full use of their intrinsic geometric properties, which improves the reconstruction quality, reduces the acquisition cost, and increases the computation speed.
  • in the image localization stage, the proposed localization framework based on pose graph optimization fuses the 2D-3D feature matches between the image and the point cloud with the pose information of neighboring cameras, which improves the accuracy of image localization.
  • the present invention also provides a storage medium for storing an application for performing an image localization method based on three-dimensional reconstruction of a ray model according to any of the above embodiments of the present invention.
  • the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated.
  • thus, features defined with "first" or "second" may explicitly or implicitly include at least one of those features.
  • the meaning of "a plurality" is at least two, such as two, three, etc., unless specifically defined otherwise.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Software Systems (AREA)
  • Geometry (AREA)
  • Computer Graphics (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • Databases & Information Systems (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)
  • Length Measuring Devices By Optical Means (AREA)
  • Processing Or Creating Images (AREA)

Abstract

An image positioning method and device based on ray-model three-dimensional reconstruction, the method including: collecting multiple images of multiple scenes in advance, and performing feature extraction on the images respectively to obtain corresponding feature point sets (S101); performing pairwise feature matching on the images, generating corresponding essential matrices according to the pairwise feature matches, and performing noise filtering on the essential matrices (S102); performing three-dimensional reconstruction based on a ray model according to the noise-filtered feature matches and essential matrices to generate a three-dimensional feature point cloud and a reconstructed camera pose set (S103); acquiring a query image and performing feature extraction on it to obtain a corresponding two-dimensional feature point set (S104); and performing image localization based on a localization pose graph optimization framework according to the two-dimensional feature point set, the three-dimensional feature point cloud, and the reconstructed camera pose set (S105). The method improves the reconstruction quality, reduces the acquisition cost of the reconstruction process, increases the computation speed, and improves the accuracy of image localization during the image localization process.

Description

Method and device for image positioning based on 3D reconstruction of ray model
CROSS-REFERENCE TO RELATED APPLICATIONS
This application claims priority to Chinese Patent Application No. 201511026787.X, entitled "Method and device for image positioning based on 3D reconstruction of ray model", filed by Tsinghua University on December 31, 2015.
TECHNICAL FIELD
The present invention relates to the field of image processing and pattern recognition, and in particular to an image positioning method and device based on three-dimensional reconstruction with a ray model.
BACKGROUND
Image localization computes the pose of the capturing camera from one image or a group of images. The technology can be used for robot navigation, path planning, digital tourism, virtual reality, and so on, and is applicable in areas where GPS (Global Positioning System) cannot work, such as indoors and underground. Compared with positioning technologies based on Bluetooth or WiFi (Wireless Fidelity), image localization does not depend on specialized equipment and its implementation cost is low.
In the related art, there are two main families of image-based localization methods. The first is based on image retrieval: the nearest neighbors of the query image are found in a database, and their location is taken as the query's own location. The second combines three-dimensional reconstruction with image-point cloud (2D-3D) matching: a large number of planar images of the target scene are collected in advance and reconstructed offline into a three-dimensional feature point cloud of the scene; in the online positioning stage, features of the query image are extracted and matched 2D-3D against the point cloud, and the matching results are used to estimate the pose of the target camera.
The problems, however, are as follows. The image retrieval approach does not make full use of three-dimensional information, so it only applies when the pose difference between the query image and the library images is small, and its positioning accuracy is no better than the position accuracy and sampling interval of the library images themselves. Compared with image retrieval, the second family of methods can produce more accurate positioning results, but its three-dimensional reconstruction algorithms only work with planar cameras; limited by the small field of view of a planar camera, the same location usually has to be photographed from multiple angles, yielding a large set of planar images for reconstruction, so the reconstruction cost is high (e.g., large acquisition effort and heavy computation).
SUMMARY
The present invention aims to solve at least one of the above technical problems at least to some extent.
To this end, a first object of the present invention is to propose an image positioning method based on ray-model three-dimensional reconstruction. The method improves the reconstruction quality, reduces the acquisition cost of the reconstruction process, increases the computation speed, and improves the accuracy of image localization during the image localization process.
A second object of the present invention is to propose an image positioning device based on ray-model three-dimensional reconstruction.
A third object of the present invention is to propose a storage medium.
To achieve the above objects, an image positioning method based on ray-model three-dimensional reconstruction according to embodiments of the first aspect of the present invention includes the following steps: collecting multiple images of multiple scenes in advance, and performing feature extraction on the multiple images respectively to obtain corresponding feature point sets; performing pairwise feature matching on the multiple images, generating corresponding essential matrices according to the pairwise feature matches, and performing noise filtering on the essential matrices; performing three-dimensional reconstruction based on a ray model according to the noise-filtered feature matches and essential matrices to generate a three-dimensional feature point cloud and a reconstructed camera pose set; acquiring a query image and performing feature extraction on the query image to obtain a corresponding two-dimensional feature point set; and performing image localization based on a localization pose graph optimization framework according to the two-dimensional feature point set, the three-dimensional feature point cloud, and the reconstructed camera pose set.
According to the image positioning method based on ray-model three-dimensional reconstruction of the embodiments of the present invention, in the ray-model-based three-dimensional reconstruction, two-dimensional pixel coordinates are described by three-dimensional rays, so the ray model can express various camera models (such as panoramic, fisheye, and planar) without distortion; that is, it is applicable to many types of cameras and makes full use of their intrinsic geometric properties, yielding better reconstruction, lower acquisition cost, and higher computation speed. In the image localization process, the proposed localization framework based on pose graph optimization fuses the 2D-3D feature matches between the image and the point cloud with the pose information of neighboring cameras, improving the accuracy of image localization.
To achieve the above objects, an image positioning device based on ray-model three-dimensional reconstruction according to embodiments of the second aspect of the present invention includes: a first acquisition module configured to collect multiple images of multiple scenes in advance and perform feature extraction on the multiple images respectively to obtain corresponding feature points; a generation module configured to perform pairwise feature matching on the multiple images, generate corresponding essential matrices according to the pairwise feature matches, and perform noise filtering on the essential matrices; a reconstruction module configured to perform three-dimensional reconstruction based on a ray model according to the noise-filtered essential matrices to generate a three-dimensional feature point cloud and a reconstructed camera pose set; a second acquisition module configured to acquire a query image and perform feature extraction on the query image to obtain a corresponding two-dimensional feature point set; and an image positioning module configured to perform image localization based on a localization pose graph optimization framework according to the two-dimensional feature point set, the three-dimensional feature point cloud, and the reconstructed camera pose set.
According to the image positioning device based on ray-model three-dimensional reconstruction of the embodiments of the present invention, in the ray-model-based three-dimensional reconstruction, two-dimensional pixel coordinates are described by three-dimensional rays, so the ray model can express various camera models (such as panoramic, fisheye, and planar) without distortion; that is, it is applicable to many types of cameras and makes full use of their intrinsic geometric properties, yielding better reconstruction, lower acquisition cost, and higher computation speed. In the image localization process, the proposed localization framework based on pose graph optimization fuses the 2D-3D feature matches between the image and the point cloud with the pose information of neighboring cameras, improving the accuracy of image localization.
To achieve the above objects, a storage medium according to embodiments of the third aspect of the present invention is configured to store an application program for executing the image positioning method based on ray-model three-dimensional reconstruction according to embodiments of the first aspect of the present invention.
Additional aspects and advantages of the present invention will be given in part in the following description, and in part will become apparent from the following description or be learned through practice of the present invention.
BRIEF DESCRIPTION OF THE DRAWINGS
The above and/or additional aspects and advantages of the present invention will become apparent and readily understood from the following description of the embodiments in conjunction with the accompanying drawings, in which:
FIG. 1 is a flow chart of an image positioning method based on ray-model three-dimensional reconstruction according to an embodiment of the present invention;
FIG. 2 is a flow chart of generating a three-dimensional feature point cloud and a reconstructed camera pose set according to an embodiment of the present invention;
FIG. 3 is a flow chart of a specific implementation of image localization according to an embodiment of the present invention;
FIG. 4 is an example diagram of an image positioning method based on ray-model three-dimensional reconstruction according to an embodiment of the present invention;
FIG. 5 is a structural block diagram of an image positioning device based on ray-model three-dimensional reconstruction according to an embodiment of the present invention;
FIG. 6 is a structural block diagram of a reconstruction module according to an embodiment of the present invention;
FIG. 7 is a structural block diagram of a reconstruction module according to another embodiment of the present invention; and
FIG. 8 is a structural block diagram of an image positioning module according to an embodiment of the present invention.
DETAILED DESCRIPTION
Embodiments of the present invention are described in detail below; examples of the embodiments are shown in the accompanying drawings, in which the same or similar reference numerals throughout denote the same or similar elements or elements having the same or similar functions. The embodiments described below with reference to the drawings are exemplary, are intended to explain the present invention, and should not be construed as limiting the present invention.
An image positioning method and device based on ray-model three-dimensional reconstruction according to embodiments of the present invention are described below with reference to the accompanying drawings.
FIG. 1 is a flow chart of an image positioning method based on ray-model three-dimensional reconstruction according to an embodiment of the present invention. As shown in FIG. 1, the image positioning method based on ray-model three-dimensional reconstruction may include:
S101: Collect multiple images of multiple scenes in advance, and perform feature extraction on the multiple images respectively to obtain the corresponding feature point sets.
In embodiments of the present invention, the term "multiple" is to be understood broadly, i.e., as a sufficiently large number. Furthermore, in embodiments of the present invention, the image types may include, but are not limited to, panoramic, fisheye, and planar.
Specifically, enough scene images may be collected in advance as the images mentioned in this embodiment, and SIFT (Scale-Invariant Feature Transform) features are extracted from each of them, yielding the position of every feature point together with a descriptor set, where the descriptor set describes the surrounding-area information of the corresponding feature point.
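For illustration only (the patent specifies no code), this feature-extraction step can be sketched with OpenCV's SIFT implementation; the function name extract_features is an assumption of this sketch:

```python
import cv2
import numpy as np

def extract_features(image_path):
    """Extract SIFT keypoints and descriptors from one scene image."""
    image = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    sift = cv2.SIFT_create()
    # keypoints carry the feature positions; descriptors encode the
    # surrounding-area information of each feature point
    keypoints, descriptors = sift.detectAndCompute(image, None)
    positions = np.array([kp.pt for kp in keypoints], dtype=np.float32)
    return positions, descriptors
```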
S102: Perform pairwise feature matching on the multiple images, generate the corresponding essential matrices according to the pairwise feature matches, and perform noise filtering on the essential matrices.
Specifically, in embodiments of the present invention, the multiple images may first be matched pairwise according to the feature point sets, and the feature point matches of each image pair are stored. The essential matrix can then be estimated based on the matched feature point sets.
More specifically, all images may be matched pairwise according to the descriptor sets of the feature points, the feature point matches of each image pair stored, and the essential matrix then estimated from the matched feature points while simultaneously filtering its noise. It can be understood that, in embodiments of the present invention, if the pairwise matched feature points are linked together, multiple tracks are formed, where each track corresponds to one 3D (three-dimensional) point to be reconstructed.
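A hedged sketch of this pairwise matching and essential matrix estimation follows; note that cv2.findEssentialMat assumes a calibrated planar (pinhole) camera, whereas the patent estimates the essential matrix under general ray models, so this stands in only for the planar case, and the function names are assumptions:

```python
import cv2
import numpy as np

def match_pair(desc1, desc2, ratio=0.8):
    """Lowe ratio-test matching between two images' SIFT descriptors."""
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    knn = matcher.knnMatch(desc1, desc2, k=2)
    return [m for m, n in knn if m.distance < ratio * n.distance]

def estimate_essential(pts1, pts2, K):
    """Estimate the essential matrix with RANSAC, filtering noisy matches."""
    E, inlier_mask = cv2.findEssentialMat(
        pts1, pts2, K, method=cv2.RANSAC, threshold=1.0)
    return E, inlier_mask
```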
S103: Perform three-dimensional reconstruction based on the ray model according to the noise-filtered feature matches and essential matrices to generate a three-dimensional feature point cloud and a reconstructed camera pose set.
It can be understood that, compared with the conventional pixel-based planar model, the present invention can accommodate and unify different camera types (such as panoramic, fisheye, and planar) through the ray model.
Specifically, a pose graph may first be constructed, containing camera nodes, three-dimensional (3D) point nodes, camera-camera edges, camera-3D point edges, and so on, which together describe the visibility relationship between the camera set and the 3D point set. Reconstruction then proceeds incrementally on the ray model: a pair of cameras whose relative pose is estimated with high quality is selected as the initial seed, new sample 3D points are found by ray-model-based triangulation, the new sample 3D points are used to register more cameras under the ray model, and the process iterates with denoising and optimization until no more cameras or 3D points can be found.
Specifically, in one embodiment of the present invention, as shown in FIG. 2, the implementation of performing three-dimensional reconstruction based on the ray model according to the noise-filtered feature matches and essential matrices to generate the three-dimensional feature point cloud and the camera pose set may include the following steps:
S201: Decompose the noise-filtered essential matrices to obtain the relative poses between the corresponding cameras.
S202: Construct the corresponding pose graph according to the relative poses between the cameras and the feature points.
Specifically, the corresponding pose graph can be constructed from the relative poses between the cameras and the feature points by a preset pose graph construction formula. In embodiments of the present invention, the preset pose graph construction formula may be:
G = (N_P, N_X, E_P, E_X)          (1)
where N_P are the camera nodes and N_X are the feature point (i.e., sample 3D point) nodes; E_P are the camera-camera edges, each carrying the relative pose between cameras i and k, comprising the relative rotation R_ik and the relative translation direction C_ik, i.e., EP_relpose(i,k) = (R_ik, C_ik); E_X are the camera-feature point edges, each carrying the feature coordinates observed by the camera, EX_ox = x_ij. From this pose graph the visibility functions vis_X(X_j, P_s) and vis_P(P_i, X_s) can be defined, where vis_X(X_j, P_s) = {i : (i, j) ∈ E_X, i ∈ P_s} returns, given a feature point X_j and a camera set P_s, the cameras in P_s that observe X_j, and vis_P(P_i, X_s) = {j : (i, j) ∈ E_X, j ∈ X_s} returns, given a feature point set X_s and a camera P_i, the feature points in X_s observed by P_i.
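An illustrative reading of formula (1) as a small data structure is given below; this is not the patent's implementation, and the class name PoseGraph and the container choices are assumptions:

```python
from collections import defaultdict

class PoseGraph:
    """Minimal sketch of the pose graph G = (N_P, N_X, E_P, E_X)."""
    def __init__(self):
        self.cameras = {}              # N_P: camera id -> pose (R, C)
        self.points = {}               # N_X: 3D point id -> coordinates
        self.rel_pose = {}             # E_P: (i, k) -> (R_ik, C_ik)
        self.obs = defaultdict(dict)   # E_X: camera i -> {point j: x_ij}

    def vis_X(self, j, P_s):
        """Cameras in P_s that observe point j."""
        return {i for i in P_s if j in self.obs[i]}

    def vis_P(self, i, X_s):
        """Points in X_s observed by camera i."""
        return {j for j in X_s if j in self.obs[i]}
```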
S203: Obtain the models of the cameras respectively, and define the corresponding ray models according to the camera models respectively.
Specifically, the camera model (such as a panoramic, fisheye, or planar model) is obtained first, and the corresponding ray model is then defined for each model. It should be noted that the ray model is defined by the fact that each ray r can be specified by the origin together with another point x = (x, y, z) on the unit sphere, x² + y² + z² = 1; each ray corresponds one-to-one with the image coordinates u = (u, v) through a mapping function. The mapping function k is defined as x = k(u, K), u = k⁻¹(x, K), where K denotes the camera intrinsics. Different camera models have different mapping functions; those of the panoramic, fisheye, and planar cameras are described by equations (2)-(4), respectively:
Panoramic camera, equation (2): the angles p and t are computed from u − u_c and f (the defining expressions appear only as images in the source), and
k(u, (f, u_c)) = (cos t · sin p, −sin t, cos t · cos p)
Fisheye camera, equation (3): with temporary variables u_1 and v_1 derived from u − u_c, the angle φ = arctan2(v_1, u_1), and r and θ derived from u_1, v_1, and f (expressions shown only as images in the source),
k(u, (f, u_c)) = (cos φ · sin θ, −cos θ, sin φ · sin θ)
Planar camera, equation (4): with p and t derived from u − u_c and f (expressions shown only as images in the source),
k(u, (f, u_c)) = (cos t · sin p, −sin t, cos t · cos p)
In equations (2)-(4), u_c is the principal point of the camera and f is the focal length (the panoramic camera uses a special value of f, likewise given as an image in the source); p is the rotation angle about the y-axis, t is the pitch angle about the x-axis, and u_1, v_1, φ, θ, and r are temporary variables.
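As an illustration of equation (2), a sketch of the panoramic mapping function k and its inverse follows. Since the pixel-to-angle definitions of p and t appear only as images in the source, the convention p = (u − u_c,x)/f, t = (v − u_c,y)/f used here is an assumption (the common equirectangular convention), not the patent's authoritative definition:

```python
import numpy as np

def ray_from_pixel_panoramic(u, uc, f):
    """Map an equirectangular panorama pixel u = (u, v) to a unit ray.

    Assumes p = (u - uc_x)/f and t = (v - uc_y)/f; the patent's exact
    pixel-to-angle expressions are only available as images, so this is
    a sketch of equation (2), not the authoritative definition.
    """
    p = (u[0] - uc[0]) / f   # rotation about the y-axis
    t = (u[1] - uc[1]) / f   # pitch about the x-axis
    return np.array([np.cos(t) * np.sin(p),
                     -np.sin(t),
                     np.cos(t) * np.cos(p)])

def pixel_from_ray_panoramic(x, uc, f):
    """Inverse mapping u = k^{-1}(x, K) under the same convention."""
    t = -np.arcsin(x[1])
    p = np.arctan2(x[0], x[2])
    return np.array([uc[0] + f * p, uc[1] + f * t])
```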
S204: Incrementally reconstruct the pose graph based on the corresponding ray models to generate the three-dimensional feature point cloud and the reconstructed camera pose set.
Specifically, a pair of cameras with a high-quality relative pose estimate may first be selected as the initial seed; new 3D points are then found by triangulation based on the corresponding ray model, and the new 3D points are used to register more cameras under the ray model, iterating until no more cameras or 3D points can be found. During this process, nonlinear optimization can be applied continuously to reduce the error of the three-dimensional reconstruction, and a quality evaluation function is used to eliminate low-quality cameras and 3D points. It should be noted that the distance metric, triangulation, camera pose estimation, nonlinear optimization, and quality evaluation modules in this process are all adapted to the ray model; compared with traditional reconstruction algorithms that apply only to planar images, the method has much wider applicability.
Thus, in the ray-model-based three-dimensional reconstruction algorithm, by describing two-dimensional pixel coordinates with three-dimensional rays, the ray model can express various camera models (such as panoramic, fisheye, and planar) without distortion; that is, it is suitable for many types of cameras, which broadens the scope of application.
Further, in one embodiment of the present invention, after the three-dimensional feature point cloud and the reconstructed camera pose set are generated, the image positioning method may further include: building an index tree over the three-dimensional feature point cloud, and building an index tree of spatial positions for the cameras in the reconstructed camera pose set. Specifically, after the three-dimensional reconstruction is complete, a point cloud feature index and a camera position index tree may be built. It can be understood that each point in the three-dimensional feature point cloud carries several features, which come from the images in which the point was observed; in the subsequent online positioning stage, the features of the query image must be matched against this feature point cloud to achieve image localization. To accelerate the matching, the present invention builds a Kd-tree index over the feature point cloud to speed up retrieval; in addition, since the online positioning stage must retrieve the spatial neighbors of the query image, the present invention also builds a Kd-tree index of the spatial positions of the reconstructed cameras.
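A minimal sketch of building the two Kd-tree indexes with scipy is shown below; using an exact Kd-tree directly on 128-dimensional SIFT descriptors is a simplification (practical systems often use approximate nearest-neighbor structures), and the function and variable names are assumptions:

```python
from scipy.spatial import cKDTree

# point_descriptors: (N, 128) SIFT descriptors attached to the 3D points
# camera_centers:    (M, 3)   optical centers of the reconstructed cameras
def build_indexes(point_descriptors, camera_centers):
    """Build the two Kd-tree indexes used in the online stage."""
    feature_index = cKDTree(point_descriptors)  # for 2D-3D matching
    camera_index = cKDTree(camera_centers)      # for spatial-neighbor queries
    return feature_index, camera_index
```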
It should be noted that in embodiments of the present invention, steps S101-S103 above can all be performed offline. That is, an image library can be pre-established through steps S101-S103, and the corresponding three-dimensional feature point cloud and reconstructed camera pose set generated in advance from the library and stored for use in the subsequent online image positioning stage.
S104: Acquire a query image, and perform feature extraction on the query image to obtain the corresponding two-dimensional feature point set.
Specifically, features may be extracted from the acquired query image to obtain its two-dimensional feature point set. It should be noted that each two-dimensional feature point corresponds to one feature descriptor, whereas each 3D point in the 3D feature point cloud corresponds to multiple feature descriptors, contributed by the multiple images of the three-dimensional reconstruction stage.
S105: Perform image localization based on the localization pose graph optimization framework according to the two-dimensional feature point set, the three-dimensional feature point cloud, and the reconstructed camera pose set.
Specifically, the features of the query image may be matched against the features of the 3D point cloud generated offline (i.e., 2D-3D matching); given a sufficient number of valid matches, the initial pose of the query image is estimated with a camera pose estimation algorithm; the neighboring library cameras (i.e., the neighbor images) can then be queried according to the initial pose, and the 2D-3D matches together with the relative poses with respect to the neighbor images are fused to build the localization pose graph optimization framework, which is optimized to obtain a higher-precision positioning result.
Specifically, in one embodiment of the present invention, as shown in FIG. 3, the implementation of performing image localization based on the localization pose graph optimization framework according to the two-dimensional feature point set, the three-dimensional feature point cloud, and the reconstructed camera pose set may include the following steps:
S301: Match the two-dimensional feature point set against the three-dimensional feature point cloud according to the point cloud index trees to obtain a bidirectional 2D-3D match set.
Specifically, for a two-dimensional feature point F_i^2D, a k-nearest-neighbor query (e.g., k = 5) may first be performed in the 3D point cloud feature set F^3D. If, among the k nearest neighbors coming from different 3D points, the ratio of the nearest to the second-nearest distance is below a threshold th_match, a one-way valid 2D-to-3D match is considered to be established between the two-dimensional feature point and its nearest-neighbor 3D point; all such matches over F^2D constitute the one-way valid 2D-to-3D match set M_2D→3D(F^2D, F^3D). Next, for each 3D point in M_2D→3D(F^2D, F^3D), the nearest and second-nearest neighbors are queried in the reverse direction in the query image's feature set F^2D; if the ratio of the nearest to the second-nearest is below the threshold th_match, a valid one-way 3D-to-2D match is obtained, and these matches constitute the one-way 3D-to-2D match set M_2D←3D(F^2D, F^3D). The intersection of the two one-way match sets M_2D→3D(F^2D, F^3D) and M_2D←3D(F^2D, F^3D) is the bidirectional 2D-3D match set M_2D-3D(F^2D, F^3D).
S302: Estimate over the bidirectional 2D-3D match set with a camera pose estimation algorithm to generate the initial pose of the query image.
Specifically, based on the bidirectional 2D-3D match set M_2D-3D(F^2D, F^3D), a camera pose estimation algorithm eliminates the 2D-3D matches that do not satisfy the camera geometric constraints, yielding the inlier set I_2D-3D and an estimate of the initial pose of the query image, P_q^2D-3D = R_q^2D-3D [I | −C_q^2D-3D], where P_q^2D-3D is the camera matrix of the query camera, composed of the rotation matrix R and the optical center position C of that camera matrix.
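A sketch of this initial pose estimation under geometric-constraint filtering is shown below; OpenCV's solvePnPRansac is used here as a stand-in for the planar case, whereas the patent's camera pose estimation algorithm operates on rays and therefore also covers panoramic and fisheye query cameras:

```python
import cv2
import numpy as np

def initial_pose_from_matches(pts2d, pts3d, K):
    """Estimate P_q = R [I | -C] from 2D-3D matches with RANSAC (planar case)."""
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        pts3d.astype(np.float64), pts2d.astype(np.float64), K, None)
    if not ok:
        return None
    R, _ = cv2.Rodrigues(rvec)
    C = -R.T @ tvec            # optical center: t = -R C  =>  C = -R^T t
    return R, C, inliers
```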
S303: Query the reconstructed camera pose set according to the initial pose of the query image and the spatial-position index tree to obtain the neighbor images.
Specifically, the initial spatial position C_q^2D-3D of the query image q may first be obtained from its initial pose; then, according to this initial spatial position and the spatial-position index tree, the reconstructed camera pose set associated with the 3D feature point cloud is queried for the k nearest neighbors {P_i, i = 1, ..., k}, i.e., the neighbor images.
S304: Perform feature matching between the query image and the neighbor images to obtain the corresponding valid match sets.
Specifically, 2D-2D feature matching between the query image and a neighbor image yields the valid match sets between the two images.
S305: Generate the relative poses with respect to the neighbor images according to the valid match sets.
Specifically, the essential matrix is estimated from each valid match set, obtaining the inlier matches at the same time; when the number of matches is below a certain threshold, the essential matrix is considered too noisy and the neighbor image is removed, and otherwise the essential matrix is decomposed to obtain the relative pose R_iq, C_iq with respect to the neighbor image, where the translation C_iq of the relative pose only provides a direction, not a magnitude.
S306: Fuse the bidirectional 2D-3D match set and the relative poses with respect to the neighbor images to build the localization pose graph optimization framework.
Specifically, a pose graph G_q = (N_P, N_X, E_P, E_X) may be defined for the query image q, where N_P are the camera nodes, containing the camera P_q of the query image and the cameras {P_i, i = 1, ..., k} of its neighbor images; N_X are the 3D point nodes, corresponding to the 3D points obtained from the 2D-3D matches; E_P are the edges connecting the query camera P_q with the neighbor cameras {P_i, i = 1, ..., k}, each carrying the relative pose between i and q, including the relative rotation R_iq and the relative translation direction C_iq, i.e., EP_rel-pose(i,q) = (R_iq, C_iq); E_X are the edges connecting the query camera P_q with the 3D points X_j, each carrying the feature point ray coordinates observed by the query camera P_q, EX_ox = x_qj.
Then the sum of the back-projection errors and the relative-pose errors is optimized, and the objective function (i.e., the localization pose graph optimization framework described above) is built on the query image as equation (5), which appears only as an image in the source.
In it, P_q = R_q [I | −C_q] is the camera matrix of the query image to be optimized, with R_q, C_q the rotation and translation of that camera in the world coordinate system; {(x_qj, X_j), j = 1, ..., n} is the input bidirectional 2D-3D match set; {(P_i, R_iq, C_iq), i = 1, ..., m} are the neighbor images of the query image together with the corresponding relative poses; λ is the balance factor between the two kinds of cost; and d_rel() is the cost function on the relative-pose edges, defined by equation (6) (likewise shown only as an image in the source).
The cost function of the relative pose contains two mutually independent terms, the rotation cost and the translation-direction cost: the rotation cost is defined as the relative Euler angle of R_i and R_q, and the translation-direction cost is the chord distance between the observed translation direction R_i C_iq and the translation direction to be optimized.
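Since equation (5) appears only as an image in the source, the following sketch assembles a plausible residual vector from the quantities the text does define (a ray back-projection term, a relative Euler-angle rotation cost, a chord-distance translation cost, and the balance factor λ) and refines the pose with Levenberg-Marquardt via scipy. The exact residual forms and the composition convention R_q ≈ R_iq R_i are assumptions:

```python
import numpy as np
from scipy.optimize import least_squares
from scipy.spatial.transform import Rotation

def residuals(params, rays, X, neighbors, lam):
    """Sketch of the pose-graph objective (5); residual forms are assumed.

    params:    6-vector [rotvec(R_q), C_q] of the query pose
    rays:      (n, 3) observed unit rays x_qj
    X:         (n, 3) matched 3D points X_j
    neighbors: list of (R_i, C_i, R_iq, C_iq) for the k neighbor cameras
    """
    R_q = Rotation.from_rotvec(params[:3]).as_matrix()
    C_q = params[3:]

    # back-projection term: 1 - cos(angle) between observed ray and
    # the direction of X_j expressed in the query camera frame
    d = (X - C_q) @ R_q.T
    d /= np.linalg.norm(d, axis=1, keepdims=True)
    res = [1.0 - np.sum(rays * d, axis=1)]

    for R_i, C_i, R_iq, C_iq in neighbors:
        # rotation cost: angle of the rotation error, assuming R_q ~ R_iq R_i
        err = Rotation.from_matrix((R_iq @ R_i) @ R_q.T).magnitude()
        # translation cost: chord distance between the observed direction
        # R_i C_iq and the direction of C_q - C_i
        obs = R_i @ C_iq
        cur = (C_q - C_i) / np.linalg.norm(C_q - C_i)
        res.append(np.array([lam * err, lam * np.linalg.norm(obs - cur)]))
    return np.concatenate(res)

# Levenberg-Marquardt refinement starting from the 2D-3D initial pose x0:
# result = least_squares(residuals, x0, args=(rays, X, neighbors, lam),
#                        method="lm")
```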
S307: Optimize the initial pose of the query image according to the localization pose graph optimization framework to achieve image localization.
Specifically, the 2D-3D positioning result (i.e., the initial pose of the query image described above) P_q^2D-3D is taken as the initial value, and the initial pose P_q^2D-3D of the query image is optimized by the Levenberg-Marquardt algorithm according to the localization pose graph optimization framework to obtain a higher-accuracy positioning result.
Thus, compared with traditional localization methods that use only 2D-3D match information, the present invention fuses the 2D-3D match information and the inter-image relative pose information by means of graph optimization, improving the accuracy of the final positioning result.
It should be noted that steps S104-S105 above are computed online: a query image is received, and the pre-generated three-dimensional feature point cloud and reconstructed camera pose set are then queried according to it to achieve image localization.
The image positioning method based on ray-model three-dimensional reconstruction according to an embodiment of the present invention is described below with reference to FIG. 4.
For example, as shown in FIG. 4, offline reconstruction may be performed in advance to obtain the three-dimensional feature point cloud and the reconstructed camera pose set: images of enough scenes are collected offline, image features are extracted, and the images are matched pairwise; the pose graph is then constructed and incremental three-dimensional reconstruction based on the ray model is performed to obtain the three-dimensional feature point cloud and the reconstructed camera pose set, and the index tree of the feature point cloud and the camera spatial-position index tree are built. When a query image is acquired, online positioning can be performed: features are first extracted from the acquired query image and matched 2D-3D against the three-dimensional feature point cloud to obtain a bidirectional 2D-3D match set; this set is then estimated with a camera pose estimation algorithm to generate the initial pose of the query image, and the neighboring cameras are retrieved and the relative poses computed; finally, the two sources of information are fused by building the localization pose graph to obtain a higher-precision positioning result, i.e., the position and attitude of the target camera.
According to the image positioning method based on ray-model three-dimensional reconstruction of the embodiments of the present invention, in the ray-model-based three-dimensional reconstruction, two-dimensional pixel coordinates are described by three-dimensional rays, so the ray model can express various camera models (such as panoramic, fisheye, and planar) without distortion; that is, it is applicable to many types of cameras and makes full use of their intrinsic geometric properties, yielding better reconstruction, lower acquisition cost, and higher computation speed. In the image localization process, the proposed localization framework based on pose graph optimization fuses the 2D-3D feature matches between the image and the point cloud with the pose information of neighboring cameras, improving the accuracy of image localization.
To implement the above embodiments, the present invention further proposes an image positioning device based on ray-model three-dimensional reconstruction.
FIG. 5 is a structural block diagram of an image positioning device based on ray-model three-dimensional reconstruction according to an embodiment of the present invention. As shown in FIG. 5, the device may include: a first acquisition module 100, a generation module 200, a reconstruction module 300, a second acquisition module 400, and an image positioning module 500.
Specifically, the first acquisition module 100 may be configured to collect multiple images of multiple scenes in advance and perform feature extraction on the images respectively to obtain the corresponding feature point sets. In embodiments of the present invention, the term "multiple" is to be understood broadly, i.e., as a sufficiently large number. Furthermore, in embodiments of the present invention, the image types may include, but are not limited to, panoramic, fisheye, and planar.
More specifically, the first acquisition module 100 may collect enough scene images in advance as the images mentioned in this embodiment, and extract SIFT (Scale-Invariant Feature Transform) features from each of them, obtaining the position of every feature point together with a descriptor set, where the descriptor set describes the surrounding-area information of the corresponding feature point.
The generation module 200 may be configured to perform pairwise feature matching on the multiple images, generate the corresponding essential matrices according to the pairwise feature matches, and perform noise filtering on the essential matrices. Specifically, in embodiments of the present invention, the generation module 200 may first match the multiple images pairwise according to the feature point sets and store the feature point matches of each image pair, and then estimate the essential matrix based on the matched feature point sets.
More specifically, the generation module 200 may match all images pairwise according to the descriptor sets of the feature points, store the feature point matches of each image pair, and then estimate the essential matrix from the matched feature points while simultaneously filtering its noise. It can be understood that, in embodiments of the present invention, if the pairwise matched feature points are linked together, multiple tracks are formed, where each track corresponds to one 3D (three-dimensional) point to be reconstructed.
The reconstruction module 300 may be configured to perform three-dimensional reconstruction based on the ray model according to the noise-filtered essential matrices to generate the three-dimensional feature point cloud and the reconstructed camera pose set. It can be understood that, compared with the conventional pixel-based planar model, the present invention can accommodate and unify different camera types (such as panoramic, fisheye, and planar) through the ray model.
More specifically, the reconstruction module 300 may first construct a pose graph containing camera nodes, three-dimensional (3D) point nodes, camera-camera edges, camera-3D point edges, and so on, which together describe the visibility relationship between the camera set and the 3D point set; reconstruction then proceeds incrementally on the ray model: a pair of cameras whose relative pose is estimated with high quality is selected as the initial seed, new 3D points are found by ray-model-based triangulation, the new 3D points are used to register more cameras under the ray model, and the process iterates with denoising and optimization until no more cameras or 3D points can be found.
Specifically, in embodiments of the present invention, as shown in FIG. 6, the reconstruction module 300 may include: a decomposition unit 310, a construction unit 320, a definition unit 330, and a reconstruction unit 340. More specifically, the decomposition unit 310 may be configured to decompose the noise-filtered essential matrices to obtain the relative poses between the corresponding cameras.
The construction unit 320 may be configured to construct the corresponding pose graph according to the relative poses between the cameras and the feature points. More specifically, the corresponding pose graph can be constructed from the relative poses between the cameras and the feature points by a preset pose graph construction formula; in embodiments of the present invention, the preset pose graph construction formula may be formula (1) above.
The definition unit 330 may be configured to obtain the models of the cameras respectively and define the corresponding ray models according to the camera models respectively. More specifically, the definition unit 330 may first obtain the camera model, such as a panoramic, fisheye, or planar model, and then define the corresponding ray model for each model. It should be noted that the ray model is defined by the fact that each ray r can be specified by the origin together with another point x = (x, y, z) on the unit sphere, x² + y² + z² = 1; each ray corresponds one-to-one with the image coordinates u = (u, v) through a mapping function. The mapping function k is defined as x = k(u, K), u = k⁻¹(x, K), where K denotes the camera intrinsics. Different camera models have different mapping functions; those corresponding to the panoramic, fisheye, and planar cameras are described by equations (2)-(4) above, respectively.
The reconstruction unit 340 may be configured to incrementally reconstruct the pose graph based on the corresponding ray models to generate the three-dimensional feature point cloud and the reconstructed camera pose set. More specifically, the reconstruction unit 340 may first select a pair of cameras with a high-quality relative pose estimate as the initial seed, then find new 3D points by triangulation based on the corresponding ray model, and then use the new 3D points to register more cameras under the ray model, iterating until no more cameras or 3D points can be found. During this process, nonlinear optimization can be applied continuously to reduce the error of the three-dimensional reconstruction, and a quality evaluation function is used to eliminate low-quality cameras and 3D points. It should be noted that the distance metric, triangulation, camera pose estimation, nonlinear optimization, and quality evaluation modules in this process are all adapted to the ray model; compared with traditional reconstruction algorithms that apply only to planar images, the method has much wider applicability.
Thus, in the ray-model-based three-dimensional reconstruction algorithm, by describing two-dimensional pixel coordinates with three-dimensional rays, the ray model can express various camera models (such as panoramic, fisheye, and planar) without distortion; that is, it is suitable for many types of cameras, which broadens the scope of application.
Further, in one embodiment of the present invention, as shown in FIG. 7, the reconstruction module 300 may further include an establishing unit 350, which may be configured to, after the reconstruction unit 340 generates the three-dimensional feature point cloud and the reconstructed camera pose set, build an index tree over the three-dimensional feature point cloud and build an index tree of spatial positions for the cameras in the reconstructed camera pose set. Specifically, after the reconstruction unit 340 completes the three-dimensional reconstruction, the establishing unit 350 may build a point cloud feature index and a camera position index tree. It can be understood that each point in the three-dimensional feature point cloud carries several features, which come from the images in which the point was observed; in the subsequent online positioning stage, the features of the query image must be matched against this feature point cloud to achieve image localization. To accelerate the matching, the present invention builds a Kd-tree index over the feature point cloud to speed up retrieval; in addition, since the online positioning stage must retrieve the spatial neighbors of the query image, the present invention also builds a Kd-tree index of the spatial positions of the reconstructed cameras.
The second acquisition module 400 may be configured to acquire a query image and perform feature extraction on the query image to obtain the corresponding two-dimensional feature point set. More specifically, the second acquisition module 400 may extract features from the acquired query image to obtain its two-dimensional feature point set. It should be noted that each two-dimensional feature point corresponds to one feature descriptor, whereas each 3D point in the 3D feature point cloud corresponds to multiple feature descriptors, contributed by the multiple images of the three-dimensional reconstruction stage.
The image positioning module 500 may be configured to perform image localization based on the localization pose graph optimization framework according to the two-dimensional feature point set, the three-dimensional feature point cloud, and the reconstructed camera pose set. More specifically, the image positioning module 500 may match the features of the query image against the features of the 3D point cloud generated offline (i.e., 2D-3D matching); given a sufficient number of valid matches, it estimates the initial pose of the query image with a camera pose estimation algorithm; the neighboring library cameras (i.e., the neighbor images) can then be queried according to the initial pose, and the 2D-3D matches together with the relative poses with respect to the neighbor images are fused to build the localization pose graph optimization framework, which is optimized to obtain a higher-precision positioning result.
Specifically, in embodiments of the present invention, as shown in FIG. 8, the image positioning module 500 may include: a first matching unit 510, a first generation unit 520, a query unit 530, a second matching unit 540, a second generation unit 550, an establishing unit 560, and an image positioning unit 570.
Specifically, the first matching unit 510 may be configured to match the two-dimensional feature point set against the three-dimensional feature point cloud according to the point cloud index trees to obtain a bidirectional 2D-3D match set.
More specifically, the first matching unit 510 may first perform a k-nearest-neighbor query (e.g., k = 5) for a two-dimensional feature point F_i^2D in the 3D point cloud feature set F^3D. If, among the k nearest neighbors coming from different 3D points, the ratio of the nearest to the second-nearest distance is below a threshold th_match, a one-way valid 2D-to-3D match is considered to be established between the two-dimensional feature point and its nearest-neighbor 3D point; all such matches over F^2D constitute the one-way valid 2D-to-3D match set M_2D→3D(F^2D, F^3D). Next, for each 3D point in M_2D→3D(F^2D, F^3D), the nearest and second-nearest neighbors are queried in the reverse direction in the query image's feature set F^2D; if the ratio of the nearest to the second-nearest is below the threshold th_match, a valid one-way 3D-to-2D match is obtained, and these matches constitute the one-way 3D-to-2D match set M_2D←3D(F^2D, F^3D). The intersection of the two one-way match sets M_2D→3D(F^2D, F^3D) and M_2D←3D(F^2D, F^3D) is the bidirectional 2D-3D match set M_2D-3D(F^2D, F^3D).
The first generation unit 520 may be configured to estimate over the bidirectional 2D-3D match set with a camera pose estimation algorithm to generate the initial pose of the query image. More specifically, based on the bidirectional 2D-3D match set M_2D-3D(F^2D, F^3D), the first generation unit 520 may eliminate, with a camera pose estimation algorithm, the 2D-3D matches that do not satisfy the camera geometric constraints, yielding the inlier set I_2D-3D and an estimate of the initial pose of the query image, P_q^2D-3D = R_q^2D-3D [I | −C_q^2D-3D], where P_q^2D-3D is the camera matrix of the query camera, composed of the rotation matrix R and the optical center position C of that camera matrix.
The query unit 530 may be configured to query the reconstructed camera pose set according to the initial pose of the query image and the spatial-position index tree to obtain the neighbor images. More specifically, the query unit 530 may first obtain the initial spatial position C_q^2D-3D of the query image q from its initial pose; then, according to this initial spatial position and the spatial-position index tree, the reconstructed camera pose set associated with the 3D feature point cloud is queried for the k nearest neighbors {P_i, i = 1, ..., k}, i.e., the neighbor images.
The second matching unit 540 may be configured to perform feature matching between the query image and the neighbor images to obtain the corresponding valid match sets. More specifically, the second matching unit 540 may match the query image 2D-2D against a neighbor image to obtain the valid match sets between the two images.
The second generation unit 550 may be configured to generate the relative poses with respect to the neighbor images according to the valid match sets. More specifically, the second generation unit 550 may estimate the essential matrix from each valid match set, obtaining the inlier matches at the same time; when the number of matches is below a certain threshold, the essential matrix is considered too noisy and the neighbor image is removed, and otherwise the essential matrix is decomposed to obtain the relative pose R_iq, C_iq with respect to the neighbor image, where the translation C_iq of the relative pose only provides a direction, not a magnitude.
The establishing unit 560 may be configured to fuse the bidirectional 2D-3D match set and the relative poses with respect to the neighbor images to build the localization pose graph optimization framework. More specifically, the establishing unit 560 may define a pose graph G_q = (N_P, N_X, E_P, E_X) for the query image q, where N_P are the camera nodes, containing the camera P_q of the query image and the cameras {P_i, i = 1, ..., k} of its neighbor images; N_X are the 3D point nodes, corresponding to the 3D points obtained from the 2D-3D matches; E_P are the edges connecting the query camera P_q with the neighbor cameras {P_i, i = 1, ..., k}, each carrying the relative pose between i and q, including the relative rotation R_iq and the relative translation direction C_iq, i.e., EP_rel-pose(i,q) = (R_iq, C_iq); E_X are the edges connecting the query camera P_q with the 3D points X_j, each carrying the feature point ray coordinates observed by the query camera P_q, EX_ox = x_qj. Then the sum of the back-projection errors and the relative-pose errors is optimized, and the objective function (i.e., the localization pose graph optimization framework described above) is built on the query image as equation (5) above.
The image positioning unit 570 may be configured to optimize the initial pose of the query image according to the localization pose graph optimization framework to achieve image localization. More specifically, the image positioning unit 570 may take the 2D-3D positioning result (i.e., the initial pose of the query image described above) P_q^2D-3D as the initial value and optimize the initial pose P_q^2D-3D of the query image by the Levenberg-Marquardt algorithm according to the localization pose graph optimization framework to obtain a higher-accuracy positioning result.
Thus, compared with traditional localization methods that use only 2D-3D match information, the present invention fuses the 2D-3D match information and the inter-image relative pose information by means of graph optimization, improving the accuracy of the final positioning result.
According to the image positioning device based on ray-model three-dimensional reconstruction of the embodiments of the present invention, in the ray-model-based three-dimensional reconstruction, two-dimensional pixel coordinates are described by three-dimensional rays, so the ray model can express various camera models (such as panoramic, fisheye, and planar) without distortion; that is, it is applicable to many types of cameras and makes full use of their intrinsic geometric properties, yielding better reconstruction, lower acquisition cost, and higher computation speed. In the image localization process, the proposed localization framework based on pose graph optimization fuses the 2D-3D feature matches between the image and the point cloud with the pose information of neighboring cameras, improving the accuracy of image localization.
To implement the above embodiments, the present invention further proposes a storage medium for storing an application program, the application program being configured to execute the image positioning method based on ray-model three-dimensional reconstruction according to any one of the above embodiments of the present invention.
In the description of the present invention, it should be understood that the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, features defined with "first" or "second" may explicitly or implicitly include at least one of those features. In the description of the present invention, "a plurality" means at least two, such as two, three, etc., unless specifically and definitely stated otherwise.
In the description of this specification, reference to the terms "one embodiment", "some embodiments", "example", "specific example", or "some examples" means that a specific feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic expressions of the above terms do not necessarily refer to the same embodiment or example; moreover, the described specific features, structures, materials, or characteristics may be combined in a suitable manner in any one or more embodiments or examples, and, provided they do not contradict each other, those skilled in the art may combine the different embodiments or examples described in this specification and the features thereof.
Although embodiments of the present invention have been shown and described above, it can be understood that the above embodiments are exemplary and should not be construed as limiting the present invention; those of ordinary skill in the art may make changes, modifications, substitutions, and variations to the above embodiments within the scope of the present invention.

Claims (11)

  1. An image positioning method based on ray-model three-dimensional reconstruction, characterized by comprising the following steps:
    collecting multiple images of multiple scenes in advance, and performing feature extraction on the multiple images respectively to obtain corresponding feature point sets;
    performing pairwise feature matching on the multiple images, generating corresponding essential matrices according to the pairwise feature matches, and performing noise filtering on the essential matrices;
    performing three-dimensional reconstruction based on a ray model according to the noise-filtered feature matches and essential matrices to generate a three-dimensional feature point cloud and a reconstructed camera pose set;
    acquiring a query image, and performing feature extraction on the query image to obtain a corresponding two-dimensional feature point set; and
    performing image localization based on a localization pose graph optimization framework according to the two-dimensional feature point set, the three-dimensional feature point cloud, and the reconstructed camera pose set.
  2. The image positioning method based on ray-model three-dimensional reconstruction according to claim 1, characterized in that performing pairwise feature matching on the multiple images and generating corresponding essential matrices according to the pairwise feature matches specifically comprises:
    matching the multiple images pairwise according to the feature point sets, and storing the feature point matches of each image pair; and
    estimating the essential matrix based on the matched feature point sets.
  3. The image positioning method based on ray-model three-dimensional reconstruction according to claim 1 or 2, characterized in that performing three-dimensional reconstruction based on the ray model according to the noise-filtered feature matches and essential matrices to generate a sample three-dimensional feature point cloud and a reconstructed camera pose set specifically comprises:
    decomposing the noise-filtered essential matrices to obtain relative poses between the corresponding cameras;
    constructing a corresponding pose graph according to the relative poses between the cameras and the feature points;
    obtaining models of the cameras respectively, and defining corresponding ray models according to the models of the cameras respectively; and
    incrementally reconstructing the pose graph based on the corresponding ray models to generate the three-dimensional feature point cloud and the reconstructed camera pose set.
  4. The image positioning method based on ray-model three-dimensional reconstruction according to claim 3, characterized in that after generating the three-dimensional feature point cloud and the reconstructed camera pose set, the method further comprises:
    building an index tree of the three-dimensional feature point clouds in the three-dimensional feature point cloud, and building an index tree of spatial positions for the cameras in the reconstructed camera pose set.
  5. The image positioning method based on ray-model three-dimensional reconstruction according to claim 4, characterized in that performing image localization based on the localization pose graph optimization framework according to the two-dimensional feature point set, the three-dimensional feature point cloud, and the reconstructed camera pose set specifically comprises:
    matching the two-dimensional feature point set against the three-dimensional feature point cloud according to the index trees of the three-dimensional feature point clouds to obtain a bidirectional 2D-3D match set;
    estimating over the bidirectional 2D-3D match set by a camera pose estimation algorithm to generate an initial pose of the query image;
    querying the reconstructed camera pose set according to the initial pose of the query image and the index tree of spatial positions to obtain neighbor images;
    performing feature matching between the query image and the neighbor images to obtain corresponding valid match sets;
    generating relative poses with respect to the neighbor images according to the valid match sets;
    fusing the bidirectional 2D-3D match set and the relative poses with respect to the neighbor images to build the localization pose graph optimization framework; and
    optimizing the initial pose of the query image according to the localization pose graph optimization framework to achieve image localization.
  6. An image positioning device based on ray-model three-dimensional reconstruction, characterized by comprising:
    a first acquisition module configured to collect multiple images of multiple scenes in advance and perform feature extraction on the multiple images respectively to obtain corresponding feature point sets;
    a generation module configured to perform pairwise feature matching on the multiple images, generate corresponding essential matrices according to the pairwise feature matches, and perform noise filtering on the essential matrices;
    a reconstruction module configured to perform three-dimensional reconstruction based on a ray model according to the noise-filtered essential matrices to generate a three-dimensional feature point cloud and a reconstructed camera pose set;
    a second acquisition module configured to acquire a query image and perform feature extraction on the query image to obtain a corresponding two-dimensional feature point set; and
    an image positioning module configured to perform image localization based on a localization pose graph optimization framework according to the two-dimensional feature point set, the three-dimensional feature point cloud, and the reconstructed camera pose set.
  7. The image positioning device based on ray-model three-dimensional reconstruction according to claim 6, characterized in that the generation module is specifically configured to:
    match the multiple images pairwise according to the feature point sets and store the feature point matches of each image pair; and
    estimate the essential matrix based on the matched feature point sets.
  8. The image positioning device based on ray-model three-dimensional reconstruction according to claim 6 or 7, characterized in that the reconstruction module comprises:
    a decomposition unit configured to decompose the noise-filtered essential matrices to obtain relative poses between the corresponding cameras;
    a construction unit configured to construct a corresponding pose graph according to the relative poses between the cameras and the feature points;
    a definition unit configured to obtain models of the cameras respectively and define corresponding ray models according to the models of the cameras respectively; and
    a reconstruction unit configured to incrementally reconstruct the pose graph based on the corresponding ray models to generate the three-dimensional feature point cloud and the reconstructed camera pose set.
  9. The image positioning device based on ray-model three-dimensional reconstruction according to claim 8, characterized by further comprising:
    an establishing unit configured to, after the reconstruction unit generates the three-dimensional feature point cloud and the reconstructed camera pose set, build an index tree of the three-dimensional feature point clouds in the three-dimensional feature point cloud and build an index tree of spatial positions for the cameras in the reconstructed camera pose set.
  10. The image positioning device based on ray-model three-dimensional reconstruction according to claim 9, characterized in that the image positioning module comprises:
    a first matching unit configured to match the two-dimensional feature point set against the three-dimensional feature point cloud according to the index trees of the three-dimensional feature point clouds to obtain a bidirectional 2D-3D match set;
    a first generation unit configured to estimate over the bidirectional 2D-3D match set by a camera pose estimation algorithm to generate an initial pose of the query image;
    a query unit configured to query the reconstructed camera pose set according to the initial pose of the query image and the index tree of spatial positions to obtain neighbor images;
    a second matching unit configured to perform feature matching between the query image and the neighbor images to obtain corresponding valid match sets;
    a second generation unit configured to generate relative poses with respect to the neighbor images according to the valid match sets;
    an establishing unit configured to fuse the bidirectional 2D-3D match set and the relative poses with respect to the neighbor images to build the localization pose graph optimization framework; and
    an image positioning unit configured to optimize the initial pose of the query image according to the localization pose graph optimization framework to achieve image localization.
  11. A storage medium, characterized by being configured to store an application program for executing the image positioning method based on ray-model three-dimensional reconstruction according to any one of claims 1 to 5.
PCT/CN2016/113804 2015-12-31 2016-12-30 Method and device for image positioning based on 3D reconstruction of ray model WO2017114507A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US16/066,168 US10580204B2 (en) 2015-12-31 2016-12-30 Method and device for image positioning based on 3D reconstruction of ray model

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201511026787.XA 2015-12-31 Method and device for image positioning based on 3D reconstruction of ray model
CN201511026787.X 2015-12-31

Publications (1)

Publication Number Publication Date
WO2017114507A1 true WO2017114507A1 (zh) 2017-07-06

Family

ID=56580355

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/113804 WO2017114507A1 (zh) 2015-12-31 2016-12-30 Method and device for image positioning based on 3D reconstruction of ray model

Country Status (3)

Country Link
US (1) US10580204B2 (zh)
CN (1) CN105844696B (zh)
WO (1) WO2017114507A1 (zh)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110060334A * 2019-04-19 2019-07-26 吉林大学 Computational integral-imaging image reconstruction method based on scale-invariant feature transform
WO2020038386A1 * 2018-08-22 2020-02-27 杭州萤石软件有限公司 Determining the scale factor in monocular vision reconstruction
CN111860544A * 2020-07-28 2020-10-30 杭州优链时代科技有限公司 Projection-assisted clothing feature extraction method and system

Families Citing this family (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10726593B2 (en) * 2015-09-22 2020-07-28 Fyusion, Inc. Artificially rendering images using viewpoint interpolation and extrapolation
CN105844696B * 2015-12-31 2019-02-05 清华大学 Method and device for image positioning based on 3D reconstruction of ray model
US10546385B2 (en) 2016-02-25 2020-01-28 Technion Research & Development Foundation Limited System and method for image capture device pose estimation
CN107563366A * 2017-07-26 2018-01-09 安徽讯飞爱途旅游电子商务有限公司 Positioning method and apparatus, and electronic device
WO2019065784A1 (ja) * 2017-09-29 2019-04-04 Necソリューションイノベータ株式会社 画像処理装置、画像処理方法、及びコンピュータ読み取り可能な記録媒体
CN110785792A * 2017-11-21 2020-02-11 深圳市柔宇科技有限公司 3D modeling method, electronic device, storage medium, and program product
CN110044353B * 2019-03-14 2022-12-20 深圳先进技术研究院 Indoor positioning method and positioning system for a flight mechanism
CN110443907A * 2019-06-28 2019-11-12 北京市政建设集团有限责任公司 Inspection task processing method and inspection task processing server
US11436743B2 (en) * 2019-07-06 2022-09-06 Toyota Research Institute, Inc. Systems and methods for semi-supervised depth estimation according to an arbitrary camera
CN110728720B * 2019-10-21 2023-10-13 阿波罗智能技术(北京)有限公司 Method, apparatus, device, and storage medium for camera calibration
WO2021097744A1 * 2019-11-21 2021-05-27 北京机电研究所有限公司 Dynamic measurement device for three-dimensional dimensions and measurement method thereof
CN111640181A * 2020-05-14 2020-09-08 佳都新太科技股份有限公司 Interactive video projection method, apparatus, device, and storage medium
CN111599001B * 2020-05-14 2023-03-14 星际(重庆)智能装备技术研究院有限公司 UAV navigation map construction system and method based on image three-dimensional reconstruction technology
CN111649724B 2020-06-04 2022-09-06 百度在线网络技术(北京)有限公司 Visual positioning method and device based on mobile edge computing
CN111860225B * 2020-06-30 2023-12-12 阿波罗智能技术(北京)有限公司 Image processing method and apparatus, electronic device, and storage medium
CN111862351B * 2020-08-03 2024-01-19 字节跳动有限公司 Positioning model optimization method, positioning method, and positioning device
US11494927B2 (en) 2020-09-15 2022-11-08 Toyota Research Institute, Inc. Systems and methods for self-supervised depth estimation
US11615544B2 (en) * 2020-09-15 2023-03-28 Toyota Research Institute, Inc. Systems and methods for end-to-end map building from a video sequence using neural camera models
CN112750164B * 2021-01-21 2023-04-18 脸萌有限公司 Construction method of a lightweight positioning model, positioning method, and electronic device
CN113034600B * 2021-04-23 2023-08-01 上海交通大学 Template-matching-based recognition and 6D pose estimation method for texture-less planar industrial parts
CN115937722A * 2021-09-30 2023-04-07 华为技术有限公司 Device positioning method, device, and system
CN114359522B * 2021-12-23 2024-06-18 阿依瓦(北京)技术有限公司 AR model placement method and apparatus
CN114419272B * 2022-01-20 2022-08-19 盈嘉互联(北京)科技有限公司 Indoor positioning method based on a single photo and BIM

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090232355A1 (en) * 2008-03-12 2009-09-17 Harris Corporation Registration of 3d point cloud data using eigenanalysis
CN102075686A * 2011-02-10 2011-05-25 北京航空航天大学 Robust real-time online camera tracking method
CN102074015A * 2011-02-24 2011-05-25 哈尔滨工业大学 Three-dimensional reconstruction method of a target object based on a two-dimensional image sequence
CN103745498A * 2014-01-16 2014-04-23 中国科学院自动化研究所 Fast image-based positioning method
CN103824278A * 2013-12-10 2014-05-28 清华大学 Calibration method and system for surveillance cameras
CN105844696A * 2015-12-31 2016-08-10 清华大学 Method and device for image positioning based on 3D reconstruction of ray model

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7187809B2 (en) * 2004-06-10 2007-03-06 Sarnoff Corporation Method and apparatus for aligning video to three-dimensional point clouds
CN100533487C * 2007-04-19 2009-08-26 北京理工大学 Method for reconstructing a three-dimensional solid model of a smooth curved surface from a single symmetric image
KR102077498B1 * 2013-05-13 2020-02-17 한국전자통신연구원 Apparatus and method for extracting the movement path of a group of cameras with fixed mutual geometric relationships
CN103759716B * 2014-01-14 2016-08-17 清华大学 Method for measuring the position and attitude of a dynamic target based on monocular vision at the end of a robotic arm
JP2016057108A * 2014-09-08 2016-04-21 株式会社トプコン Computing device, computing system, computing method, and program
CN104316057A * 2014-10-31 2015-01-28 天津工业大学 UAV visual navigation method


Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020038386A1 * 2018-08-22 2020-02-27 杭州萤石软件有限公司 Determining the scale factor in monocular vision reconstruction
CN110060334A * 2019-04-19 2019-07-26 吉林大学 Computational integral-imaging image reconstruction method based on scale-invariant feature transform
CN110060334B * 2019-04-19 2022-02-22 吉林大学 Computational integral-imaging image reconstruction method based on scale-invariant feature transform
CN111860544A * 2020-07-28 2020-10-30 杭州优链时代科技有限公司 Projection-assisted clothing feature extraction method and system
CN111860544B * 2020-07-28 2024-05-17 杭州优链时代科技有限公司 Projection-assisted clothing feature extraction method and system

Also Published As

Publication number Publication date
US10580204B2 (en) 2020-03-03
CN105844696A (zh) 2016-08-10
US20190005718A1 (en) 2019-01-03
CN105844696B (zh) 2019-02-05

Similar Documents

Publication Publication Date Title
WO2017114507A1 (zh) Method and device for image positioning based on 3D reconstruction of ray model
EP2833322B1 (en) Stereo-motion method of three-dimensional (3-D) structure information extraction from a video for fusion with 3-D point cloud data
Teller et al. Calibrated, registered images of an extended urban area
CN110135455A Image matching method and apparatus, and computer-readable storage medium
Li et al. Large scale image mosaic construction for agricultural applications
Sun et al. RBA: Reduced Bundle Adjustment for oblique aerial photogrammetry
Tao et al. Massive stereo-based DTM production for Mars on cloud computers
WO2021004416A1 Method and device for building a beacon map based on visual beacons
JP2016194895A Method, device, and system for generating indoor 2D floor plans
Nilosek et al. Assessing geoaccuracy of structure from motion point clouds from long-range image collections
CN115423863A Camera pose estimation method and apparatus, and computer-readable storage medium
CN113129422A Three-dimensional model construction method and apparatus, storage medium, and computer device
Skuratovskyi et al. Outdoor mapping framework: from images to 3d model
Ding et al. Stereo vision SLAM-based 3D reconstruction on UAV development platforms
Feng et al. Research on Calibration Method of Multi-camera System without Overlapping Fields of View Based on SLAM
Cui et al. MMA: Multi-camera based global motion averaging
Yang et al. Three-dimensional panoramic terrain reconstruction from aerial imagery
Yu et al. Multi-view 2D–3D alignment with hybrid bundle adjustment for visual metrology
Chen et al. The power of indoor crowd: Indoor 3D maps from the crowd
CN116468878B AR device positioning method based on a positioning map
US20240135623A1 (en) Differentiable real-time radiance field rendering for large scale view synthesis
Tanner et al. Keep geometry in context: Using contextual priors for very-large-scale 3d dense reconstructions
WO2024083010A1 Visual positioning method and related device
Arth et al. Geospatial management and utilization of large-scale urban visual reconstructions
He 3d reconstruction from passive sensors

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16881296

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 16881296

Country of ref document: EP

Kind code of ref document: A1