CN115719407B - Large-scale aerial image-oriented distributed multi-view three-dimensional reconstruction method - Google Patents
- Publication number
- CN115719407B (application CN202310011438.9A)
- Authority
- CN
- China
- Prior art keywords
- image
- point cloud
- images
- calculating
- cloud model
- Prior art date: 2023-01-05
- Legal status: Active
Classifications
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Abstract
The invention discloses a distributed multi-view three-dimensional reconstruction method for large-scale aerial images. The method first calculates a sparse point cloud model and the camera poses of the scene, divides the sparse point cloud model into different regions, and calculates a depth map for each image contained in each region; two optimal depth images are then selected for each region as the initial fusion views, the depth images of each region are fused to obtain a dense point cloud model for that region, and the dense point clouds of the individual regions are finally merged into the dense point cloud model of the complete scene. The invention fully exploits the regional structure of large-scale aerial images and converts the multi-view three-dimensional reconstruction problem of a large-scale scene into small-scale multi-view three-dimensional reconstruction problems that can be solved on low-performance computers, thereby improving the time efficiency of three-dimensional reconstruction and reducing its cost.
Description
Technical Field
The invention relates to computer vision and image processing technology, and in particular to a distributed multi-view three-dimensional reconstruction method for large-scale aerial images.
Background
Multi-view Stereo (MVS) is a technique for computing a dense point cloud model of a scene from image data. It typically takes the output of Structure from Motion (SfM), i.e., a sparse point cloud model and camera parameters, as its input. The multi-view stereo reconstruction problem for small-scale image data (such as images of small scenes acquired with a handheld camera) has been studied extensively; for large-scale outdoor scenes, however, existing multi-view stereo reconstruction methods still need further improvement. Meanwhile, with the popularization of consumer-grade unmanned aerial vehicles, large-scale image data of outdoor scenes has become easy to acquire. Existing multi-view stereo reconstruction methods face the following challenges when processing large-scale aerial image data: (a) the reconstruction process is very time-consuming; in particular, when processing outdoor large-scale aerial image data, existing methods cannot compute a dense point cloud model within a bounded time, making it difficult to meet the timeliness requirements of high-level application systems; (b) single-machine multi-view stereo reconstruction methods can run out of memory, causing the reconstruction process to fail.
The above problems seriously hamper the development and application of multi-view stereo reconstruction technology and expose the shortcomings of single-machine methods in processing large-scale aerial image data. A distributed multi-view stereo reconstruction method for large-scale aerial images is therefore needed, so that a dense point cloud model of the scene can be computed quickly from large-scale aerial imagery.
Currently, the classical papers on multi-view stereo are mainly: [1] Accurate, Dense, and Robust Multi-View Stereopsis; [2] Pixelwise View Selection for Unstructured Multi-View Stereo; [3] BlendedMVS: A Large-scale Dataset for Generalized Multi-view Stereo Networks. Paper [1], published at the CVPR conference in 2007, is a multi-view stereo reconstruction method based on seed-point diffusion; paper [2], published at the ECCV conference in 2016, is a multi-view stereo reconstruction method based on image patch matching; paper [3], published at CVPR in 2020, is a multi-view stereo reconstruction method based on deep learning, which mainly uses a deep network to estimate the depth map of each image. These methods focus on improving the accuracy of the reconstructed model (the dense three-dimensional point cloud), and the data they process are images of small-scale scenes.
Therefore, when existing multi-view stereo reconstruction methods are applied to large-scale aerial image data, the following challenges remain: (1) existing single-machine methods require a large memory space; for example, processing 1000 images requires 64 GB of memory, and the requirement can even exceed the maximum memory supported by existing hardware: when the data set reaches 1500 images, 128 GB of memory is needed, far beyond the maximum memory of a single computer; (2) the running efficiency of existing methods is too low to meet the time-efficiency requirements of large-scale three-dimensional reconstruction from aerial images; for example, processing 1000 aerial images takes 10 days.
Disclosure of Invention
The invention aims to: overcome the defects of the prior art and provide a distributed multi-view three-dimensional reconstruction method for large-scale aerial images, which quickly computes a dense point cloud model of the scene from large-scale aerial image data in a distributed runtime environment, thereby advancing multi-view three-dimensional reconstruction technology for large-scale aerial images and achieving the goal of quickly computing a high-quality dense point cloud of a large-scale scene.
The technical scheme is as follows: the distributed multi-view three-dimensional reconstruction method for large-scale aerial images of the invention comprises the following steps:
S1, for a given large-scale aerial image data set UAV = {I_1, I_2, …, I_N}, where N represents the number of aerial images, calculate the sparse point cloud model and camera parameters of the corresponding scene:

the sparse point cloud model is S = {P_i | P_i = (x_i, y_i, z_i)}, i ∈ [1, M], where M represents the number of three-dimensional points in the sparse point cloud of the entire scene, (x_i, y_i, z_i) represents the position of the i-th three-dimensional point P_i in the world coordinate system, and i is the sequence number of the three-dimensional point;

the camera parameters are C = {Q_ii | Q_ii = (K_ii, R_ii, T_ii)}, ii ∈ [1, N], where N represents the number of aerial images, K_ii denotes the intrinsic parameter matrix of the ii-th camera, R_ii its rotation matrix, T_ii its translation vector, and ii is the sequence number of the camera;
S2, divide the large-scale sparse point cloud model S into different sub-regions, obtaining S = {S_j | S_j = {P_i | P_i = (x_i, y_i, z_i)}}, j ∈ [1, M′], i ∈ [1, M], where M′ represents the number of sub-regions, i the sequence number of the three-dimensional point P_i, and S_j the j-th sub-region;

divide the camera parameters C into M′ sub-regions corresponding to the sub-regions of the sparse point cloud model S, obtaining the camera parameters C_j = {Q_i* | Q_i* = (K_i*, R_i*, T_i*)} of each sub-region, where K_i*, R_i* and T_i* respectively denote the intrinsic parameter matrix, rotation matrix and translation vector of the i*-th camera in the sub-region;
S3, calculate a depth map for each image in each region; denoting by Q* the number of aerial images in region S_j, the depth images corresponding to S_j are D(S_j) = {d_k}, k ∈ [1, Q*], where k is the sequence number of the k-th depth map;
S4, for the depth image data D(S_j) of each region S_j, select two optimal initial fusion views, denoted D′(S_j) = {d_p, d_q}, p, q ∈ [1, Q*], where Q* represents the number of aerial images in region S_j, and p and q are subscripts used to distinguish different depth images; d_p and d_q represent the optimal initial fusion views;
S5, fuse all depth images D(S_j) of region S_j to obtain the dense point cloud model corresponding to S_j, denoted M_j = {G_a | G_a = (x_a, y_a, z_a)}, a ∈ [1, W], where W represents the number of three-dimensional points in the dense cloud after depth-map fusion within the sub-region and (x_a, y_a, z_a) represents the position of point G_a in the world coordinate system;
S6, merge the dense point cloud models M_j of all regions into a whole to obtain the dense point cloud model of the complete scene, denoted Model = {M_1, …, M_j, …, M_M′}, where M′ represents the number of sub-regions.
Further, in step S1, a hybrid Structure-from-Motion method is used to calculate the sparse point cloud model and camera parameters of the scene from the aerial image data set UAV, with the following specific steps:
step S1.1, aerial image matching
Firstly, feature points are detected and feature descriptors are computed with a local feature detection method based on deep learning; then the matching relations between feature descriptors are computed with a locality-sensitive hashing method; finally, wrong matches are eliminated according to the geometric consistency between images, obtaining correct feature matching points;
step S1.2, calculating camera parameters
According to the feature matching points obtained in step S1.1, the camera parameters are calculated with an incremental Structure-from-Motion method: first the relative pose of the cameras is computed with the five-point algorithm, then the absolute pose is computed with the three-point method, and finally the focal length of each image is computed with a camera self-calibration method;
step S1.3, calculating a sparse point cloud model of the regional image
According to the feature matching points obtained in step S1.1 and the camera parameters obtained in step S1.2, the sparse point cloud model of each regional scene is calculated with a global Structure-from-Motion method, which improves the time efficiency of the reconstruction. First, all three-dimensional points corresponding to the images of region S_j are registered in the world coordinate system; then the camera parameters and the three-dimensional points in the world coordinate system are optimized jointly with the bundle adjustment method until convergence, obtaining an accurate sparse point cloud model, where j denotes the sequence number of the region.
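As an illustration of the pose computation in step S1.2, the sketch below uses OpenCV's stock five-point and P3P solvers; it is a minimal sketch assuming a known intrinsic matrix K and matched, undistorted image points, not the patent's exact implementation:

```python
import cv2

def relative_pose(pts1, pts2, K):
    """Relative camera pose from 2D-2D matches via the five-point algorithm."""
    E, inliers = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC,
                                      prob=0.999, threshold=1.0)
    _, R, t, _ = cv2.recoverPose(E, pts1, pts2, K, mask=inliers)
    return R, t

def absolute_pose(obj_pts, img_pts, K):
    """Absolute camera pose from 2D-3D matches via P3P inside RANSAC."""
    ok, rvec, tvec, _ = cv2.solvePnPRansac(obj_pts, img_pts, K, None,
                                           flags=cv2.SOLVEPNP_P3P)
    R, _ = cv2.Rodrigues(rvec)  # rotation vector -> 3x3 rotation matrix
    return R, tvec
```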
Further, in the step S2, the dominant-set clustering method is used to divide the large-scale sparse point cloud into several sub-region scenes, as follows:
Let β denote the set containing the N aerial images and the sparse point cloud model S, and let J denote a square matrix with N rows and N columns that records the similarities between images; let V_i and V_j respectively denote the sets of all three-dimensional points observable in image I_i and image I_j. The similarity between image I_i and image I_j is then defined in terms of their commonly observed points and of the angle α between the viewing-ray vectors of the two images, α being computed from intermediate variables derived from the camera positions and the observed three-dimensional points;
From this, the similarity matrix J between images is calculated, and from the values of J a graph structure G = (V, E) is constructed, where V represents the vertices and E represents the edges;
Let x denote a vector containing N′ elements; then at any time t+1 the elements of x are updated as

x_g(t+1) = x_g(t) · (J x(t))_g / (x(t)^T J x(t)),

where the subscript g of x_g(t) denotes the g-th component of the vector x(t) at time t, and x(t)^T denotes the transpose of x(t);
Thus the error of the vector x between time t+1 and time t is

ε = x(t+1)^T J x(t+1) − x(t)^T J x(t),

where x(t)^T denotes the transpose of x(t);
If ε < 0.001, the iterative computation process terminates, and the large-scale sparse point cloud model is thereby divided into the sparse point cloud models of several sub-regions.
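For illustration, the sketch below implements the iteration just described as the standard replicator dynamics used by dominant-set clustering, assuming the similarity matrix J is already available; the 0.001 stopping threshold follows the method, while the uniform initialization and the membership rule in the last line are assumptions. A full partition would be obtained by repeatedly extracting one cluster and removing its images from J.

```python
import numpy as np

def dominant_set(J, eps=1e-3, max_iter=1000):
    """Extract one dominant set (image cluster) from similarity matrix J."""
    n = J.shape[0]
    x = np.full(n, 1.0 / n)            # start from the simplex barycentre
    obj = x @ J @ x
    for _ in range(max_iter):
        x = x * (J @ x)                # x_g <- x_g * (Jx)_g
        x /= x.sum()                   # renormalize: divides by x^T J x
        new_obj = x @ J @ x
        if abs(new_obj - obj) < eps:   # error below 0.001 terminates the loop
            break
        obj = new_obj
    return np.flatnonzero(x > 1.0 / n) # images with above-uniform support
```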
Further, in the step S3, a stereo matching method based on image patches is used to calculate a corresponding high-quality depth map for each image, laying the foundation for reconstructing a high-quality dense point cloud model. The detailed calculation steps are as follows:
Step S3.1, performing disparity propagation with odd-even iterations, the propagation strategies comprising spatial propagation, view propagation and temporal propagation;

Step S3.2, in odd and even iterations of spatial propagation, starting the propagation from the top-left corner and the bottom-right corner respectively; each pixel is compared with the disparity planes of its neighbouring pixels, and the matching cost of the neighbouring pixel's disparity is computed against that of the current disparity value;

Step S3.3, taking the disparity value with the minimum matching cost as the optimal disparity value;
Step S3.4, iterating steps S3.1 to S3.3 until the optimal disparity values of all pixels are calculated, obtaining the optimal depth image.
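The sketch below illustrates the odd-even spatial propagation of steps S3.1 to S3.4 in simplified form; view and temporal propagation and the random plane refinement are omitted, and `cost_fn` is a hypothetical per-pixel matching-cost function:

```python
import numpy as np

def propagate_disparity(disp, cost_fn, n_iters=4):
    """Odd-even spatial propagation over a disparity map (H x W array).

    Odd iterations sweep from the top-left corner and test the upper/left
    neighbours' disparities; even iterations sweep from the bottom-right
    corner and test the lower/right neighbours'.
    """
    H, W = disp.shape
    for it in range(1, n_iters + 1):
        odd = it % 2 == 1
        ys = range(H) if odd else range(H - 1, -1, -1)
        xs = range(W) if odd else range(W - 1, -1, -1)
        dy = dx = -1 if odd else 1
        for y in ys:
            for x in xs:
                for ny, nx in ((y + dy, x), (y, x + dx)):
                    if 0 <= ny < H and 0 <= nx < W:
                        cand = disp[ny, nx]
                        # adopt the neighbour's disparity if it costs less
                        if cost_fn(y, x, cand) < cost_fn(y, x, disp[y, x]):
                            disp[y, x] = cand
    return disp
```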
Further, in the step S4, two optimal initial fusion images are selected for each sub-region with a multiple-constraint method, ensuring the geometric consistency between the generated dense point cloud and the real scene; the detailed calculation steps are as follows:
Step S4.1, with the homography matrix constraint, calculating the feature matching points between any two input images that satisfy the homography constraint, denoted Matches_1;

Step S4.2, according to the fundamental matrix constraint relation, calculating the feature matching points between any two input images that satisfy it, denoted Matches_2;

Step S4.3, according to the essential matrix constraint relation, calculating the feature matching points between any two input images that satisfy it, denoted Matches_3;

Step S4.4, taking the intersection of Matches_1, Matches_2 and Matches_3 to obtain the matching points Matches_4;

Step S4.5, taking the two images that satisfy the Matches_4 constraint with the fewest error points as the initial fused images.
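A minimal sketch of the multiple-constraint test of steps S4.1 to S4.4 is given below, using OpenCV's robust estimators on matched point arrays; the inlier masks of the homography, fundamental-matrix and essential-matrix models are intersected to obtain Matches_4, and the thresholds are placeholder values:

```python
import cv2
import numpy as np

def constrained_matches(pts1, pts2, K, thresh=1.0):
    """Indices of matches satisfying all three geometric constraints."""
    _, m1 = cv2.findHomography(pts1, pts2, cv2.RANSAC, thresh)          # S4.1
    _, m2 = cv2.findFundamentalMat(pts1, pts2, cv2.FM_RANSAC, thresh)   # S4.2
    _, m3 = cv2.findEssentialMat(pts1, pts2, K, cv2.RANSAC,
                                 0.999, thresh)                         # S4.3
    keep = m1.ravel() & m2.ravel() & m3.ravel()                         # S4.4
    return np.flatnonzero(keep)
```

A driver loop would then evaluate every image pair in the sub-region and, as in step S4.5, keep the pair whose match set leaves the fewest error (outlier) points.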
Further, in the step S5, the depth images within a region are fused into a whole by fully utilizing the normal vector information of the depth maps, obtaining the dense point cloud model corresponding to the images of the region; the detailed steps are as follows:
Step S5.1, computing the confidence of each vertex of the depth images to be fused;

Step S5.2, according to the confidence of each vertex, deleting redundant overlapping points from the depth images to be fused, obtaining the topology information of each regional image in each depth image;

Step S5.3, weighting the vertices of the depth images according to the topology information, obtaining the geometric information of the images;
Step S5.4, stitching the regional images according to the topology and geometric information, thereby obtaining the dense point cloud model of the corresponding region.

Further, in the step S6, a global iterative closest point method is used to merge the several dense point cloud models with overlapping features into the dense point cloud model of the complete scene; the detailed steps are as follows:

Step S6.1, selecting a point set P_i′ from the target point cloud P, with P_i′ ∈ P;
Step S6.2, finding the corresponding point set Q_i′ in the source point cloud Q, with Q_i′ ∈ Q, such that the distance ||P_i′ − Q_i′|| between Q_i′ and P_i′ is minimal;

Step S6.3, calculating the rotation matrix R_i′ and the translation vector t_i′ between the point set P_i′ and the point set Q_i′;

Step S6.4, rotating and translating the points of P_i′ with the rotation matrix R_i′ and the translation vector t_i′, so as to calculate the new point set P′_i′ = R_i′P_i′ + t_i′;

Step S6.5, calculating the average distance d = (1/n)Σ‖P′_i′ − P_i′‖ between the point set P′_i′ and the point set P_i′, where n represents the number of three-dimensional points in the point set;

Step S6.6, if d is smaller than a preset threshold or the number of iterations is larger than a preset number, terminating the calculation process; otherwise returning to step S6.2 until the process converges.
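The sketch below mirrors steps S6.1 to S6.6 with a brute-force nearest-neighbour search and the SVD-based (Kabsch) solution for the rotation and translation; the threshold and iteration limit are placeholder values:

```python
import numpy as np

def icp_merge(P, Q, tol=1e-4, max_iter=50):
    """Align point cloud P (n x 3) to Q (m x 3); returns the transformed P."""
    for _ in range(max_iter):
        # S6.2: nearest point in Q for every point of P (brute force)
        d2 = ((P[:, None, :] - Q[None, :, :]) ** 2).sum(-1)
        Qc = Q[d2.argmin(axis=1)]
        # S6.3: rotation and translation via SVD (Kabsch)
        mu_p, mu_q = P.mean(0), Qc.mean(0)
        U, _, Vt = np.linalg.svd((P - mu_p).T @ (Qc - mu_q))
        R = Vt.T @ U.T
        if np.linalg.det(R) < 0:       # guard against a reflection
            Vt[-1] *= -1
            R = Vt.T @ U.T
        t = mu_q - R @ mu_p
        # S6.4 and S6.5: transform P and measure the average displacement d
        P_new = P @ R.T + t
        d = np.linalg.norm(P_new - P, axis=1).mean()
        P = P_new
        if d < tol:                    # S6.6: converged
            break
    return P
```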
The beneficial effects are that: compared with the prior art, the invention has the following advantages:
(1) The invention divides the sparse point cloud model of a large-scale scene into different sub-regions, avoiding the memory overflow and system crashes of single-machine multi-view stereo reconstruction methods and making large-scale multi-view stereo reconstruction possible.

(2) The invention processes the sparse point cloud models and camera parameters (camera focal length, rotation matrix and translation vector) of the different sub-regions independently on different nodes of a distributed system and rapidly calculates the dense point cloud models, improving the time efficiency of large-scale multi-view stereo reconstruction.

(3) The invention not only solves the memory-overflow problem of single-machine multi-view stereo reconstruction methods, but also improves the time efficiency of large-scale multi-view stereo reconstruction, laying an important foundation for the application of unmanned aerial vehicle aerial images in three-dimensional reconstruction and for the development and application of three-dimensional reconstruction technology.
Drawings
FIG. 1 is a schematic overall flow chart of the present invention;
FIG. 2 is an aerial image dataset in an embodiment;
FIG. 3 is a large-scale sparse point cloud model in an embodiment;
FIG. 4 is sparse point cloud model and camera pose information for a sub-region in an embodiment;
FIG. 5 is an original image (not a depth image) of an initial fusion view of a sub-region in an embodiment;
FIG. 6 is a dense point cloud model of a sub-region of an embodiment;
FIG. 7 is a prior art dense point cloud model;
FIG. 8 is a complete dense point cloud model according to an embodiment of the present invention.
Detailed Description
The technical scheme of the present invention is described in detail below, but the scope of the present invention is not limited to the embodiments.
The invention discloses a distributed multi-view three-dimensional reconstruction method for large-scale aerial images, which aims at reconstructing a dense scene point cloud from large-scale aerial imagery. Its core ideas are as follows: first, the sparse point cloud model corresponding to the large-scale aerial images is divided into several sub-regions; second, the aerial images, sparse point cloud models and camera parameters of the sub-regions are distributed to different nodes of a distributed environment, where the depth map of every image is calculated; then the depth images within each sub-region are fused on those nodes, yielding the dense point cloud model of each sub-region; finally, the dense point clouds of the child nodes are merged on the master machine of the distributed environment, yielding the dense point cloud model of the complete scene. In this way a high-quality dense point cloud model can be calculated rapidly from large-scale aerial image data.
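The sketch below illustrates this master/worker organization; a `multiprocessing` pool stands in for the nodes of the distributed environment, and the four helpers are hypothetical stand-ins for steps S3 to S6 rather than functions defined by the patent:

```python
from multiprocessing import Pool  # worker processes stand in for cluster nodes

# Hypothetical stand-ins for steps S3-S6; real implementations would go here.
def compute_depth_maps(images, cameras): ...
def select_initial_views(depths): ...
def fuse_depth_maps(depths, seeds): ...
def merge_point_clouds(parts): ...

def reconstruct_region(region):
    """Worker job for one sub-region: depth maps, seed views, fusion."""
    depths = compute_depth_maps(region["images"], region["cameras"])  # S3
    seeds = select_initial_views(depths)                              # S4
    return fuse_depth_maps(depths, seeds)                             # S5

def reconstruct_scene(regions):
    """Master driver: process sub-regions in parallel, then merge."""
    with Pool() as pool:
        parts = pool.map(reconstruct_region, regions)
    return merge_point_clouds(parts)                                  # S6
```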
As shown in fig. 1, the distributed multi-view stereoscopic reconstruction method for large-scale aerial images of the invention comprises the following steps:
S1, for a given large-scale aerial image data set UAV = {I_1, I_2, …, I_N}, where N represents the number of aerial images, calculate the sparse point cloud model and camera parameters of the corresponding scene:

the sparse point cloud model is S = {P_i | P_i = (x_i, y_i, z_i)}, i ∈ [1, M], where M represents the number of three-dimensional points, (x_i, y_i, z_i) represents the position of point P_i in the world coordinate system, and the subscript i is the sequence number of the three-dimensional point;

the camera parameters are C = {Q_ii | Q_ii = (K_ii, R_ii, T_ii)}, ii ∈ [1, N], where N represents the number of aerial images, K_ii denotes the intrinsic parameter matrix of the ii-th camera, R_ii its rotation matrix, T_ii its translation vector, and ii is the sequence number of the camera;
Step S1.1, detecting feature points and computing feature descriptors with the deep-learning-based SuperPoint method; then computing the matching relations between feature descriptors with a locality-sensitive hashing method; finally eliminating wrong matches according to the geometric consistency between images, obtaining correct feature matching points;

Step S1.2, according to the feature matching points obtained in step S1.1, calculating the camera parameters with an incremental Structure-from-Motion method: first computing the relative pose of the cameras with the five-point algorithm, then the absolute pose with the three-point method, and finally the focal length of each image with a camera self-calibration method;

Step S1.3, according to the feature matching points obtained in step S1.1 and the camera parameters obtained in step S1.2, calculating the sparse point cloud model of each regional scene with a global Structure-from-Motion method, improving the time efficiency of the reconstruction. First, all three-dimensional points corresponding to the images of region S_j are registered in the world coordinate system; then the camera parameters and the three-dimensional points in the world coordinate system are optimized jointly with the bundle adjustment method until convergence, obtaining an accurate sparse point cloud model, where j denotes the sequence number of the region;
S2, dividing the large-scale sparse point cloud model S into different regions, obtaining S = {S_j | S_j = {P_i | P_i = (x_i, y_i, z_i)}}, j ∈ [1, M′], i ∈ [1, M], where M represents the number of three-dimensional points in the sparse point cloud of the whole scene, (x_i, y_i, z_i) the position of point P_i in the world coordinate system, M′ the number of sub-regions, i the sequence number of the three-dimensional point, and S_j the j-th sub-region;

dividing the camera parameters C into M′ regions corresponding to the sub-regions of the sparse point cloud model S, obtaining the camera parameters C_j = {Q_i* | Q_i* = (K_i*, R_i*, T_i*)} of each region, where K_i*, R_i* and T_i* respectively denote the intrinsic parameter matrix, rotation matrix and translation vector of the i*-th camera in the region, and M′ represents the number of sub-regions;
The large-scale sparse point cloud is divided into several sub-region scenes with the dominant-set clustering method, as follows:
Let β denote the set containing the N aerial images and the sparse point cloud model S, and let J denote a square matrix with N rows and N columns that records the similarities between images; let V_i and V_j respectively denote the sets of all three-dimensional points observable in image I_i and image I_j. The similarity between image I_i and image I_j is then defined in terms of their commonly observed points and of the angle α between the viewing-ray vectors of the two images, α being computed from intermediate variables derived from the camera positions and the observed three-dimensional points;
From this, the similarity matrix J between images is calculated, and from the values of J a graph structure G = (V, E) is constructed, where V represents the vertices and E represents the edges;
Let x denote a vector containing N′ elements; then at any time t+1 the elements of x are updated as

x_g(t+1) = x_g(t) · (J x(t))_g / (x(t)^T J x(t)),

where the subscript g of x_g(t) denotes the g-th component of the vector x(t) at time t, and x(t)^T denotes the transpose of x(t);
Thus the error of the vector x between time t+1 and time t is

ε = x(t+1)^T J x(t+1) − x(t)^T J x(t),

where x(t)^T denotes the transpose of x(t);
If ε < 0.001, the iterative computation process is terminated, and the large-scale sparse point cloud model is thereby divided into the sparse point cloud models of several sub-regions;
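To complement the clustering sketch given earlier, the sketch below shows one plausible construction of the similarity matrix J from covisibility; since the patent's exact weighting formula is not reproduced in this text, scoring each shared point by the cosine of the viewing-ray angle α is an assumption:

```python
import numpy as np

def similarity_matrix(visible_sets, centers, points):
    """Covisibility-based image similarity.

    visible_sets[i] -- set of 3D point ids observed by image I_i
    centers[i]      -- camera centre of image I_i (3-vector)
    points[p]       -- 3D position of point p (3-vector)
    """
    n = len(visible_sets)
    J = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            score = 0.0
            for p in visible_sets[i] & visible_sets[j]:
                u = points[p] - centers[i]     # viewing ray of I_i
                v = points[p] - centers[j]     # viewing ray of I_j
                cos_a = u @ v / (np.linalg.norm(u) * np.linalg.norm(v))
                score += cos_a                 # small angle -> high weight
            J[i, j] = J[j, i] = score
    return J
```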
S3, calculating a depth map for each image in each region; denoting by Q* the number of aerial images in region S_j, the depth images corresponding to S_j are D(S_j) = {d_k}, k ∈ [1, Q*]. A corresponding high-quality depth map is calculated for each image with a stereo matching method based on image patches, laying the foundation for reconstructing a high-quality dense point cloud model; the specific process is as follows:
Step S3.1, performing disparity propagation with odd-even iterations, the propagation strategies comprising spatial propagation, view propagation and temporal propagation;

Step S3.2, in odd and even iterations of spatial propagation, starting the propagation from the top-left corner and the bottom-right corner respectively; each pixel is compared with the disparity planes of its neighbouring pixels, and the matching cost of the neighbouring pixel's disparity is computed against that of the current disparity value;

Step S3.3, taking the disparity value with the minimum matching cost as the optimal disparity value;
Step S3.4, iterating steps S3.1 to S3.3 until the optimal disparity values of all pixels are calculated, obtaining the optimal depth image;
S4, for each areaDepth image data +.>Selecting two optimal initial fusion views, and recording asWherein->Representation area->Number of aerial images in->And->Is a subscript for distinguishing between different depth images; selecting two optimal initial fusion images for each sub-region by using a multiple constraint method, and ensuring the geometric consistency of the generated dense point cloud and a real scene; the specific process is as follows:
Step S4.1, with the homography matrix constraint, calculating the feature matching points between any two input images that satisfy the homography constraint, denoted Matches_1;

Step S4.2, according to the fundamental matrix constraint relation, calculating the feature matching points between any two input images that satisfy it, denoted Matches_2;

Step S4.3, according to the essential matrix constraint relation, calculating the feature matching points between any two input images that satisfy it, denoted Matches_3;

Step S4.4, taking the intersection of Matches_1, Matches_2 and Matches_3 to obtain the matching points Matches_4;

Step S4.5, taking the two images that satisfy the Matches_4 constraint with the fewest error points as the initial fusion images;
S5, fusing all depth images D(S_j) of region S_j to obtain the dense point cloud model corresponding to S_j, denoted M_j = {G_a | G_a = (x_a, y_a, z_a)}, a ∈ [1, W], where W represents the number of three-dimensional points in the dense cloud generated by depth-map fusion and (x_a, y_a, z_a) represents the position of point G_a in the world coordinate system. The depth images within the region are fused into a whole by fully utilizing the normal vector information of the depth maps, obtaining the dense point cloud model corresponding to the images of the region; the specific process is as follows:
Step S5.1, computing the confidence of each vertex of the depth images to be fused;

Step S5.2, according to the confidence of each vertex, deleting redundant overlapping points from the depth images to be fused, obtaining the topology information of each regional image in each depth image;

Step S5.3, weighting the vertices of the depth images according to the topology information, obtaining the geometric information of the images;

Step S5.4, stitching the regional images according to the topology and geometric information, thereby obtaining the dense point cloud model of the corresponding region;
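A minimal sketch in the spirit of steps S5.1 and S5.2 is given below: each depth map is back-projected into the world frame and low-confidence vertices are discarded. The confidence measure used here, the agreement between the surface normal and the viewing ray, is an assumption, since the confidence formula is not spelled out in this text:

```python
import numpy as np

def fuse_region_depths(depths, normals, K, poses, conf_thresh=0.5):
    """Merge per-image depth maps of one region into a single point cloud.

    depths[i]  -- H x W depth map of image i
    normals[i] -- H x W x 3 unit normal map of image i (camera frame)
    poses[i]   -- (R, t) with p_cam = R @ p_world + t
    """
    Kinv = np.linalg.inv(K)
    chunks = []
    for depth, normal, (R, t) in zip(depths, normals, poses):
        H, W = depth.shape
        u, v = np.meshgrid(np.arange(W), np.arange(H))
        rays = np.stack([u, v, np.ones_like(u)], -1).reshape(-1, 3) @ Kinv.T
        # S5.1: confidence = cosine between normal and (reversed) viewing ray
        conf = -(normal.reshape(-1, 3) * rays).sum(1) / np.linalg.norm(rays, axis=1)
        keep = (depth.reshape(-1) > 0) & (conf > conf_thresh)   # S5.2
        cam_pts = rays * depth.reshape(-1, 1)
        chunks.append((cam_pts[keep] - t) @ R)  # world frame: R^T (p - t)
    return np.concatenate(chunks, axis=0)
```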
S6, merging the dense point cloud models M_j of all regions to obtain the complete dense point cloud model, denoted Model = {M_1, …, M_j, …, M_M′}, where M′ represents the number of sub-regions. The several dense point cloud models with overlapping features are merged into the dense point cloud model of the complete scene with a global iterative closest point method; the specific process is as follows:

Step S6.1, selecting a point set P_i′ from the target point cloud P, with P_i′ ∈ P;
Step S6.2, finding the corresponding point set Q_i′ in the source point cloud Q, with Q_i′ ∈ Q, such that the distance ||P_i′ − Q_i′|| between Q_i′ and P_i′ is minimal;

Step S6.3, calculating the rotation matrix R_i′ and the translation vector t_i′ between the point set P_i′ and the point set Q_i′;

Step S6.4, rotating and translating the points of P_i′ with the rotation matrix R_i′ and the translation vector t_i′, so as to calculate the new point set P′_i′ = R_i′P_i′ + t_i′;

Step S6.5, calculating the average distance d = (1/n)Σ‖P′_i′ − P_i′‖ between the point set P′_i′ and the point set P_i′, where n represents the number of three-dimensional points in the point set;

Step S6.6, if d is smaller than a preset threshold or the number of iterations is larger than a preset number, terminating the calculation process; otherwise returning to step S6.2 until the process converges.
Example 1:
The original aerial images of the embodiment are shown in fig. 2, and the final reconstruction result is shown in fig. 8; the dense point cloud model reconstructed from the large-scale aerial images has high geometric consistency with the real scene.
As can be seen from the above embodiment, the invention first calculates the sparse point cloud model of the complete scene (as shown in fig. 3) and then divides the large-scale sparse point cloud model into the sparse point cloud models of several sub-regions (as shown in fig. 4); next, the depth map of each image in every sub-region is calculated and an initial fusion view is selected for each sub-region (fig. 5 shows the original images, not the depth images, of an initial fusion view); then the depth images within each sub-region are fused to obtain the dense point cloud model of the sub-region (as shown in fig. 6); finally, the dense point clouds of all sub-regions are merged into the dense point cloud model of the complete scene (as shown in fig. 8). In addition, as can be seen from fig. 7, the dense point cloud model calculated by other prior art schemes differs considerably from the real scene in geometric consistency.
Moreover, the technical scheme of this embodiment needs only 8 hours and 24 GB of memory to process 1000 aerial images; that is, the invention not only improves the time efficiency of large-scale multi-view stereo reconstruction but also avoids the memory-overflow problem.
The invention can also be applied to other fields, such as the metaverse, digital-China construction, digital village construction, digital city construction, military simulation, unmanned driving, automatic navigation without satellite coverage, digital protection of cultural heritage, three-dimensional scene monitoring, large-scale film production, three-dimensional survey of natural disaster sites, three-dimensional popular-science creation, and virtual and augmented reality.
Claims (6)
1. A distributed multi-view three-dimensional reconstruction method for large-scale aerial images, characterized by comprising the following steps:
S1, for a given large-scale aerial image data set UAV = {I_1, I_2, …, I_N}, where N represents the number of aerial images, calculating a sparse point cloud model and camera parameters of the corresponding scene with a hybrid Structure-from-Motion method, with the following specific steps:
step S1.1, aerial image matching
firstly, detecting feature points and calculating feature descriptors with a deep-learning-based local feature method; then calculating the matching relations between the feature descriptors with a locality-sensitive hashing method; finally eliminating wrong matches according to the geometric consistency between images to obtain correct feature matching points;
step S1.2, calculating camera parameters
according to the feature matching points obtained in step S1.1, calculating the camera parameters with an incremental Structure-from-Motion method: firstly calculating the relative pose of the cameras with the five-point algorithm, then the absolute pose with the three-point method, and finally the focal length of each image with a camera self-calibration method;
wherein the sparse point cloud model is S = {P_i | P_i = (x_i, y_i, z_i)}, i ∈ [1, M], where M represents the number of three-dimensional points in the sparse point cloud of the entire scene, (x_i, y_i, z_i) represents the position of the i-th three-dimensional point P_i in the world coordinate system, and i represents the sequence number of the three-dimensional point;

the camera parameters are C = {Q_ii | Q_ii = (K_ii, R_ii, T_ii)}, ii ∈ [1, N], where N represents the number of aerial images, K_ii represents the intrinsic parameter matrix of the ii-th camera, R_ii the rotation matrix of the ii-th camera, T_ii the translation vector of the ii-th camera, and ii the sequence number of the camera;
S2, calculating a sparse point cloud model of each regional scene with a global Structure-from-Motion method according to the feature matching points obtained in step S1.1 and the camera parameters obtained in step S1.2:

dividing the large-scale sparse point cloud model S into different sub-regions to obtain S = {S_j | S_j = {P_i | P_i = (x_i, y_i, z_i)}}, j ∈ [1, M′], i ∈ [1, M], wherein M′ represents the number of sub-regions, i represents the sequence number of the three-dimensional point P_i, and j represents the j-th sub-region;

dividing the camera parameters C into M′ sub-regions corresponding to the sparse point cloud model S to obtain the camera parameters C_j = {Q_i* | Q_i* = (K_i*, R_i*, T_i*)} of each sub-region, wherein K_i*, R_i* and T_i* respectively represent the intrinsic parameter matrix, the rotation matrix and the translation vector of the i*-th camera in the sub-region;

registering all three-dimensional points corresponding to the images of sub-region S_j in the world coordinate system, and optimizing the camera parameters and the three-dimensional points in the world coordinate system simultaneously with the bundle adjustment method until convergence, obtaining an accurate sparse point cloud model;
S3, calculating a depth map for each image in each sub-region; denoting by Q* the number of aerial images in sub-region S_j, the depth images corresponding to S_j are D(S_j) = {d_k}, k ∈ [1, Q*], wherein k represents the sequence number of the k-th depth map;
S4, for the depth image data D(S_j) of each sub-region S_j, selecting two optimal initial fusion views, denoted D′(S_j) = {d_p, d_q}, p, q ∈ [1, Q*], wherein Q* represents the number of aerial images in sub-region S_j, p and q are subscripts used to distinguish different depth images, and d_p and d_q represent the optimal initial fusion views;
S5, fusing all depth images D(S_j) of sub-region S_j to obtain the dense point cloud model corresponding to S_j, denoted M_j = {G_a | G_a = (x_a, y_a, z_a)}, a ∈ [1, W], wherein W represents the number of three-dimensional points in the dense cloud after depth-map fusion in the sub-region, and (x_a, y_a, z_a) represents the position of point G_a in the world coordinate system;
S6, merging the dense point cloud models M_j of the sub-regions into a whole to obtain the dense point cloud model of the complete scene, namely Model = {M_1, …, M_j, …, M_M′}.
2. The large-scale aerial image-oriented distributed multi-view three-dimensional reconstruction method according to claim 1, wherein in the step S2 the sub-regions are divided with the dominant-set clustering method, as follows:
β denotes the set containing the N aerial images and the sparse point cloud model S, and J denotes a square matrix with N rows and N columns for recording the similarities between images; V_i and V_j respectively denote the sets of all three-dimensional points observable in image I_i and image I_j; the similarity between image I_i and image I_j is then defined in terms of the commonly observed points and of the angle α between the viewing-ray vectors of the two images, α being computed from intermediate variables derived from the camera positions and the observed three-dimensional points;
so far, the similarity matrix J between images is calculated, and a graph structure G = (V, E) is constructed according to the values of J, wherein V represents the vertices and E represents the edges;
the symbol x represents a vector containing N′ elements; at any time t+1, the value of each element in the vector x is updated as

x_g(t+1) = x_g(t) · (J x(t))_g / (x(t)^T J x(t))

wherein the subscript g of x_g(t) denotes the g-th component of the vector x(t) at time t, and x(t)^T represents the transpose of x(t);
therefore, the error of the vector x between time t+1 and time t is:

ε = x(t+1)^T J x(t+1) − x(t)^T J x(t) (5)

wherein x(t)^T represents the transpose of x(t);
if ε < 0.001, the iterative computation process is terminated, and the large-scale sparse point cloud model is thereby divided into the sparse point cloud models of a plurality of sub-regions.
3. The large-scale aerial image-oriented distributed multi-view three-dimensional reconstruction method according to claim 1, wherein in the step S3 a corresponding high-quality depth map is calculated for each image with a stereo matching method based on image patches, the detailed calculation steps being as follows:
Step S3.1, performing disparity propagation with odd-even iterations, the propagation strategies comprising spatial propagation, view propagation and temporal propagation;

Step S3.2, in odd and even iterations of spatial propagation, starting the propagation from the top-left corner and the bottom-right corner respectively, comparing each pixel with the disparity planes of its neighbouring pixels, and computing the matching cost of the neighbouring pixel's disparity against that of the current disparity value;

Step S3.3, taking the disparity value with the minimum matching cost as the optimal disparity value;

Step S3.4, iterating steps S3.1 to S3.3 until the optimal disparity values of all pixels are calculated, obtaining the optimal depth image.
4. The large-scale aerial image-oriented distributed multi-view three-dimensional reconstruction method according to claim 1, wherein in the step S4 two optimal initial fusion images are selected for each sub-region with a multiple-constraint method, the detailed calculation steps being as follows:
Step S4.1, with the homography matrix constraint, calculating the feature matching points between any two input images that satisfy the homography constraint, denoted Matches_1;

Step S4.2, according to the fundamental matrix constraint relation, calculating the feature matching points between any two input images that satisfy it, denoted Matches_2;

Step S4.3, according to the essential matrix constraint relation, calculating the feature matching points between any two input images that satisfy it, denoted Matches_3;

Step S4.4, taking the intersection of Matches_1, Matches_2 and Matches_3 to obtain the matching points Matches_4;

Step S4.5, taking the two images that satisfy the Matches_4 constraint with the fewest error points as the initial fused images.
5. The large-scale aerial image-oriented distributed multi-view three-dimensional reconstruction method according to claim 1, wherein in the step S5 the depth images in a region are fused into a whole by fully utilizing the normal vector information of the depth maps, obtaining the dense point cloud model corresponding to the images in the region; the detailed steps are as follows:
s5.1, calculating the confidence coefficient of each vertex of the depth images to be fused;
s5.2, deleting redundant overlapping points from the depth images to be fused according to the confidence degree of each vertex in the depth images, and obtaining topology information of each region image in each depth image;
s5.3, carrying out weighting operation on vertexes on the depth image according to the topology information to obtain geometric information of the image;
and S5.4, stitching the regional image according to the topology information and the geometric information, so as to obtain a dense point cloud model of the corresponding region.
6. The large-scale aerial image-oriented distributed multi-view three-dimensional reconstruction method according to claim 1, wherein in the step S6 a global iterative closest point method is used to merge a plurality of dense point cloud models with overlapping features into the dense point cloud model of the complete scene; the detailed steps are as follows:
Step S6.1, selecting a point set P_i′ from the target point cloud P, with P_i′ ∈ P;

Step S6.2, finding the corresponding point set Q_i′ in the source point cloud Q, with Q_i′ ∈ Q, such that the distance ||P_i′ − Q_i′|| between Q_i′ and P_i′ is minimal;

Step S6.3, calculating the rotation matrix R_i′ and the translation vector t_i′ between the point set P_i′ and the point set Q_i′;

Step S6.4, rotating and translating the points of P_i′ with the rotation matrix R_i′ and the translation vector t_i′, so as to calculate the new point set P′_i′ = R_i′P_i′ + t_i′;

Step S6.5, calculating the average distance d = (1/n)Σ‖P′_i′ − P_i′‖ between the point set P′_i′ and the point set P_i′, wherein n represents the number of three-dimensional points in the point set;

Step S6.6, if d is smaller than a preset threshold or the number of iterations is larger than a preset number, terminating the calculation process; otherwise returning to step S6.2 until the calculation process converges.
Priority Applications (1)

| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202310011438.9A | 2023-01-05 | 2023-01-05 | Large-scale aerial image-oriented distributed multi-view three-dimensional reconstruction method |
Publications (2)

| Publication Number | Publication Date |
|---|---|
| CN115719407A | 2023-02-28 |
| CN115719407B | 2023-06-27 |
Legal Events

| Date | Code | Title |
|---|---|---|
| | PB01 | Publication |
| | SE01 | Entry into force of request for substantive examination |
| | GR01 | Patent grant |