CN115719407B - Large-scale aerial image-oriented distributed multi-view three-dimensional reconstruction method - Google Patents
- Publication number
- CN115719407B (application CN202310011438.9A)
- Authority
- CN
- China
- Prior art keywords
- image
- point cloud
- images
- calculating
- cloud model
- Prior art date: 2023-01-05
- Legal status: Active
Classifications
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Abstract
The invention discloses a distributed multi-view three-dimensional reconstruction method for large-scale aerial images. The method first calculates a sparse point cloud model and the camera poses of the scene, divides the sparse point cloud model into different regions, and calculates a depth map for each image contained in each region; two optimal depth images are then selected for each region as the initial fusion views, the depth images of each region are fused to obtain a dense point cloud model for that region, and the dense point clouds of the individual regions are finally merged into the dense point cloud model of the complete scene. The invention fully exploits the regional structure of large-scale aerial images and converts the multi-view three-dimensional reconstruction problem of a large-scale scene into small-scale multi-view three-dimensional reconstruction problems that can be solved on low-performance computers, thereby improving the time efficiency of three-dimensional reconstruction and reducing its cost.
Description
Technical Field
The invention relates to computer vision and image processing technology, and in particular to a distributed multi-view three-dimensional reconstruction method for large-scale aerial images.
Background
Multi-view Stereo (MVS) is a technique for computing a dense point cloud model of a scene from image data. It typically takes the output of Structure from Motion (SfM), i.e., a sparse point cloud model and camera parameters, as its input. The multi-view stereo reconstruction problem for small-scale image data (such as images of small scenes acquired with a handheld camera) has been studied extensively; for large-scale outdoor scenes, however, existing multi-view stereo reconstruction methods still need further improvement. Meanwhile, with the popularization of consumer-grade unmanned aerial vehicles, large-scale image data of outdoor scenes has become easy to acquire. Existing multi-view stereo reconstruction methods face the following challenges when processing large-scale aerial image data: (a) the reconstruction process is very time-consuming; in particular, when processing outdoor large-scale aerial image data, existing methods cannot compute a dense point cloud model within a bounded time, making it difficult to meet the timeliness requirements of high-level application systems; (b) single-machine multi-view stereo reconstruction methods can run out of memory, causing the reconstruction process to fail.
The above problems seriously hamper the development and application of multi-view stereo reconstruction technology and expose the shortcomings of single-machine methods in processing large-scale aerial image data. A distributed multi-view stereo reconstruction method for large-scale aerial images is therefore needed, so that a dense point cloud model of the scene can be computed quickly from large-scale aerial imagery.
Currently, the classical papers on multi-view stereo are mainly: [1] Accurate, Dense, and Robust Multi-View Stereopsis; [2] Pixelwise View Selection for Unstructured Multi-View Stereo; [3] BlendedMVS: A Large-scale Dataset for Generalized Multi-view Stereo Networks. Paper [1], published at the CVPR conference in 2007, is a multi-view stereo reconstruction method based on seed-point diffusion; paper [2], published at the ECCV conference in 2016, is a multi-view stereo reconstruction method based on image patch matching; paper [3], published at CVPR in 2020, is a multi-view stereo reconstruction method based on deep learning, which mainly uses a deep network to estimate the depth map of each image. These methods focus on improving the accuracy of the reconstructed model (the dense three-dimensional point cloud), and the data they process are images of small-scale scenes.
Therefore, when existing multi-view stereo reconstruction methods are applied to large-scale aerial image data, the following challenges remain: (1) existing single-machine methods require a large memory space; for example, processing 1000 images requires 64 GB of memory, and the requirement can even exceed the maximum memory supported by existing hardware: when the data set reaches 1500 images, 128 GB of memory is needed, far beyond the maximum memory of a single computer; (2) the running efficiency of existing methods is too low to meet the time-efficiency requirements of large-scale three-dimensional reconstruction from aerial images; for example, processing 1000 aerial images takes 10 days.
Disclosure of Invention
The invention aims to: overcome the defects of the prior art and provide a distributed multi-view three-dimensional reconstruction method for large-scale aerial images, which quickly computes a dense point cloud model of the scene from large-scale aerial image data in a distributed runtime environment, thereby advancing multi-view three-dimensional reconstruction technology for large-scale aerial images and achieving the goal of quickly computing a high-quality dense point cloud of a large-scale scene.
The technical scheme is as follows: the distributed multi-view three-dimensional reconstruction method for large-scale aerial images of the invention comprises the following steps:
S1, for a given large-scale aerial image data set UAV = {I_1, I_2, …, I_N}, where N represents the number of aerial images, calculate the sparse point cloud model and camera parameters of the corresponding scene:

the sparse point cloud model is S = {P_i | P_i = (x_i, y_i, z_i)}, i ∈ [1, M], where M represents the number of three-dimensional points in the sparse point cloud of the entire scene, (x_i, y_i, z_i) represents the position of the i-th three-dimensional point P_i in the world coordinate system, and i is the sequence number of the three-dimensional point;

the camera parameters are C = {Q_ii | Q_ii = (K_ii, R_ii, T_ii)}, ii ∈ [1, N], where N represents the number of aerial images, K_ii denotes the intrinsic parameter matrix of the ii-th camera, R_ii its rotation matrix, T_ii its translation vector, and ii is the sequence number of the camera;
S2, divide the large-scale sparse point cloud model S into different sub-regions, obtaining S = {S_j | S_j = {P_i | P_i = (x_i, y_i, z_i)}}, j ∈ [1, M′], i ∈ [1, M], where M′ represents the number of sub-regions, i the sequence number of the three-dimensional point P_i, and S_j the j-th sub-region;

divide the camera parameters C into M′ sub-regions corresponding to the sub-regions of the sparse point cloud model S, obtaining the camera parameters C_j = {Q_i* | Q_i* = (K_i*, R_i*, T_i*)} of each sub-region, where K_i*, R_i* and T_i* respectively denote the intrinsic parameter matrix, rotation matrix and translation vector of the i*-th camera in the sub-region;
S3, calculate a depth map for each image in each region; denoting by Q* the number of aerial images in region S_j, the depth images corresponding to S_j are D(S_j) = {d_k}, k ∈ [1, Q*], where k is the sequence number of the k-th depth map;
S4, for the depth image data D(S_j) of each region S_j, select two optimal initial fusion views, denoted D′(S_j) = {d_p, d_q}, p, q ∈ [1, Q*], where Q* represents the number of aerial images in region S_j, and p and q are subscripts used to distinguish different depth images; d_p and d_q represent the optimal initial fusion views;
S5, fuse all depth images D(S_j) of region S_j to obtain the dense point cloud model corresponding to S_j, denoted M_j = {G_a | G_a = (x_a, y_a, z_a)}, a ∈ [1, W], where W represents the number of three-dimensional points in the dense cloud after depth-map fusion within the sub-region and (x_a, y_a, z_a) represents the position of point G_a in the world coordinate system;
S6, merge the dense point cloud models M_j of all regions into a whole to obtain the dense point cloud model of the complete scene, denoted Model = {M_1, …, M_j, …, M_M′}, where M′ represents the number of sub-regions.
Further, in step S1, a hybrid Structure-from-Motion method is used to calculate the sparse point cloud model and camera parameters of the scene from the aerial image data set UAV, with the following specific steps:
step S1.1, aerial image matching
Firstly, feature points are detected and feature descriptors are computed with a local feature detection method based on deep learning; then the matching relations between feature descriptors are computed with a locality-sensitive hashing method; finally, wrong matches are eliminated according to the geometric consistency between images, obtaining correct feature matching points;
step S1.2, calculating camera parameters
According to the feature matching points obtained in step S1.1, the camera parameters are calculated with an incremental Structure-from-Motion method: first the relative pose of the cameras is computed with the five-point algorithm, then the absolute pose is computed with the three-point method, and finally the focal length of each image is computed with a camera self-calibration method;
step S1.3, calculating a sparse point cloud model of the regional image
According to the feature matching points obtained in step S1.1 and the camera parameters obtained in step S1.2, the sparse point cloud model of each regional scene is calculated with a global Structure-from-Motion method, which improves the time efficiency of the reconstruction. First, all three-dimensional points corresponding to the images of region S_j are registered in the world coordinate system; then the camera parameters and the three-dimensional points in the world coordinate system are optimized jointly with the bundle adjustment method until convergence, obtaining an accurate sparse point cloud model, where j denotes the sequence number of the region.
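As an illustration of the pose computation in step S1.2, the sketch below uses OpenCV's stock five-point and P3P solvers; it is a minimal sketch assuming a known intrinsic matrix K and matched, undistorted image points, not the patent's exact implementation:

```python
import cv2

def relative_pose(pts1, pts2, K):
    """Relative camera pose from 2D-2D matches via the five-point algorithm."""
    E, inliers = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC,
                                      prob=0.999, threshold=1.0)
    _, R, t, _ = cv2.recoverPose(E, pts1, pts2, K, mask=inliers)
    return R, t

def absolute_pose(obj_pts, img_pts, K):
    """Absolute camera pose from 2D-3D matches via P3P inside RANSAC."""
    ok, rvec, tvec, _ = cv2.solvePnPRansac(obj_pts, img_pts, K, None,
                                           flags=cv2.SOLVEPNP_P3P)
    R, _ = cv2.Rodrigues(rvec)  # rotation vector -> 3x3 rotation matrix
    return R, tvec
```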
Further, in the step S2, the dominant-set clustering method is used to divide the large-scale sparse point cloud into several sub-region scenes, as follows:
Let β denote the set containing the N aerial images and the sparse point cloud model S, and let J denote a square matrix with N rows and N columns that records the similarities between images; let V_i and V_j respectively denote the sets of all three-dimensional points observable in image I_i and image I_j. The similarity between image I_i and image I_j is then defined in terms of their commonly observed points and of the angle α between the viewing-ray vectors of the two images, α being computed from intermediate variables derived from the camera positions and the observed three-dimensional points;
From this, the similarity matrix J between images is calculated, and from the values of J a graph structure G = (V, E) is constructed, where V represents the vertices and E represents the edges;
Let x denote a vector containing N′ elements; then at any time t+1 the elements of x are updated as

x_g(t+1) = x_g(t) · (J x(t))_g / (x(t)^T J x(t)),

where the subscript g of x_g(t) denotes the g-th component of the vector x(t) at time t, and x(t)^T denotes the transpose of x(t);
Thus the error of the vector x between time t+1 and time t is

ε = x(t+1)^T J x(t+1) − x(t)^T J x(t),

where x(t)^T denotes the transpose of x(t);
If ε < 0.001, the iterative computation process terminates, and the large-scale sparse point cloud model is thereby divided into the sparse point cloud models of several sub-regions.
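For illustration, the sketch below implements the iteration just described as the standard replicator dynamics used by dominant-set clustering, assuming the similarity matrix J is already available; the 0.001 stopping threshold follows the method, while the uniform initialization and the membership rule in the last line are assumptions. A full partition would be obtained by repeatedly extracting one cluster and removing its images from J.

```python
import numpy as np

def dominant_set(J, eps=1e-3, max_iter=1000):
    """Extract one dominant set (image cluster) from similarity matrix J."""
    n = J.shape[0]
    x = np.full(n, 1.0 / n)            # start from the simplex barycentre
    obj = x @ J @ x
    for _ in range(max_iter):
        x = x * (J @ x)                # x_g <- x_g * (Jx)_g
        x /= x.sum()                   # renormalize: divides by x^T J x
        new_obj = x @ J @ x
        if abs(new_obj - obj) < eps:   # error below 0.001 terminates the loop
            break
        obj = new_obj
    return np.flatnonzero(x > 1.0 / n) # images with above-uniform support
```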
Further, in the step S3, a stereo matching method based on image patches is used to calculate a corresponding high-quality depth map for each image, laying the foundation for reconstructing a high-quality dense point cloud model. The detailed calculation steps are as follows:
Step S3.1, performing disparity propagation with odd-even iterations, the propagation strategies comprising spatial propagation, view propagation and temporal propagation;

Step S3.2, in odd and even iterations of spatial propagation, starting the propagation from the top-left corner and the bottom-right corner respectively; each pixel is compared with the disparity planes of its neighbouring pixels, and the matching cost of the neighbouring pixel's disparity is computed against that of the current disparity value;

Step S3.3, taking the disparity value with the minimum matching cost as the optimal disparity value;
Step S3.4, iterating steps S3.1 to S3.3 until the optimal disparity values of all pixels are calculated, obtaining the optimal depth image.
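The sketch below illustrates the odd-even spatial propagation of steps S3.1 to S3.4 in simplified form; view and temporal propagation and the random plane refinement are omitted, and `cost_fn` is a hypothetical per-pixel matching-cost function:

```python
import numpy as np

def propagate_disparity(disp, cost_fn, n_iters=4):
    """Odd-even spatial propagation over a disparity map (H x W array).

    Odd iterations sweep from the top-left corner and test the upper/left
    neighbours' disparities; even iterations sweep from the bottom-right
    corner and test the lower/right neighbours'.
    """
    H, W = disp.shape
    for it in range(1, n_iters + 1):
        odd = it % 2 == 1
        ys = range(H) if odd else range(H - 1, -1, -1)
        xs = range(W) if odd else range(W - 1, -1, -1)
        dy = dx = -1 if odd else 1
        for y in ys:
            for x in xs:
                for ny, nx in ((y + dy, x), (y, x + dx)):
                    if 0 <= ny < H and 0 <= nx < W:
                        cand = disp[ny, nx]
                        # adopt the neighbour's disparity if it costs less
                        if cost_fn(y, x, cand) < cost_fn(y, x, disp[y, x]):
                            disp[y, x] = cand
    return disp
```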
Further, in the step S4, two optimal initial fusion images are selected for each sub-region with a multiple-constraint method, ensuring the geometric consistency between the generated dense point cloud and the real scene; the detailed calculation steps are as follows:
Step S4.1, with the homography matrix constraint, calculating the feature matching points between any two input images that satisfy the homography constraint, denoted Matches_1;

Step S4.2, according to the fundamental matrix constraint relation, calculating the feature matching points between any two input images that satisfy it, denoted Matches_2;

Step S4.3, according to the essential matrix constraint relation, calculating the feature matching points between any two input images that satisfy it, denoted Matches_3;

Step S4.4, taking the intersection of Matches_1, Matches_2 and Matches_3 to obtain the matching points Matches_4;

Step S4.5, taking the two images that satisfy the Matches_4 constraint with the fewest error points as the initial fused images.
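A minimal sketch of the multiple-constraint test of steps S4.1 to S4.4 is given below, using OpenCV's robust estimators on matched point arrays; the inlier masks of the homography, fundamental-matrix and essential-matrix models are intersected to obtain Matches_4, and the thresholds are placeholder values:

```python
import cv2
import numpy as np

def constrained_matches(pts1, pts2, K, thresh=1.0):
    """Indices of matches satisfying all three geometric constraints."""
    _, m1 = cv2.findHomography(pts1, pts2, cv2.RANSAC, thresh)          # S4.1
    _, m2 = cv2.findFundamentalMat(pts1, pts2, cv2.FM_RANSAC, thresh)   # S4.2
    _, m3 = cv2.findEssentialMat(pts1, pts2, K, cv2.RANSAC,
                                 0.999, thresh)                         # S4.3
    keep = m1.ravel() & m2.ravel() & m3.ravel()                         # S4.4
    return np.flatnonzero(keep)
```

A driver loop would then evaluate every image pair in the sub-region and, as in step S4.5, keep the pair whose match set leaves the fewest error (outlier) points.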
Further, in the step S5, the depth images within a region are fused into a whole by fully utilizing the normal vector information of the depth maps, obtaining the dense point cloud model corresponding to the images of the region; the detailed steps are as follows:
Step S5.1, computing the confidence of each vertex of the depth images to be fused;

Step S5.2, according to the confidence of each vertex, deleting redundant overlapping points from the depth images to be fused, obtaining the topology information of each regional image in each depth image;

Step S5.3, weighting the vertices of the depth images according to the topology information, obtaining the geometric information of the images;
Step S5.4, stitching the regional images according to the topology and geometric information, thereby obtaining the dense point cloud model of the corresponding region.

Further, in the step S6, a global iterative closest point method is used to merge the several dense point cloud models with overlapping features into the dense point cloud model of the complete scene; the detailed steps are as follows:

Step S6.1, selecting a point set P_i′ from the target point cloud P, with P_i′ ∈ P;
Step S6.2, finding the corresponding point set Q_i′ in the source point cloud Q, with Q_i′ ∈ Q, such that the distance ||P_i′ − Q_i′|| between Q_i′ and P_i′ is minimal;

Step S6.3, calculating the rotation matrix R_i′ and the translation vector t_i′ between the point set P_i′ and the point set Q_i′;

Step S6.4, rotating and translating the points of P_i′ with the rotation matrix R_i′ and the translation vector t_i′, so as to calculate the new point set P′_i′ = R_i′P_i′ + t_i′;

Step S6.5, calculating the average distance d = (1/n)Σ‖P′_i′ − P_i′‖ between the point set P′_i′ and the point set P_i′, where n represents the number of three-dimensional points in the point set;

Step S6.6, if d is smaller than a preset threshold or the number of iterations is larger than a preset number, terminating the calculation process; otherwise returning to step S6.2 until the process converges.
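The sketch below mirrors steps S6.1 to S6.6 with a brute-force nearest-neighbour search and the SVD-based (Kabsch) solution for the rotation and translation; the threshold and iteration limit are placeholder values:

```python
import numpy as np

def icp_merge(P, Q, tol=1e-4, max_iter=50):
    """Align point cloud P (n x 3) to Q (m x 3); returns the transformed P."""
    for _ in range(max_iter):
        # S6.2: nearest point in Q for every point of P (brute force)
        d2 = ((P[:, None, :] - Q[None, :, :]) ** 2).sum(-1)
        Qc = Q[d2.argmin(axis=1)]
        # S6.3: rotation and translation via SVD (Kabsch)
        mu_p, mu_q = P.mean(0), Qc.mean(0)
        U, _, Vt = np.linalg.svd((P - mu_p).T @ (Qc - mu_q))
        R = Vt.T @ U.T
        if np.linalg.det(R) < 0:       # guard against a reflection
            Vt[-1] *= -1
            R = Vt.T @ U.T
        t = mu_q - R @ mu_p
        # S6.4 and S6.5: transform P and measure the average displacement d
        P_new = P @ R.T + t
        d = np.linalg.norm(P_new - P, axis=1).mean()
        P = P_new
        if d < tol:                    # S6.6: converged
            break
    return P
```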
The beneficial effects are that: compared with the prior art, the invention has the following advantages:
(1) The invention divides the sparse point cloud model of a large-scale scene into different sub-regions, avoiding the memory overflow and system crashes of single-machine multi-view stereo reconstruction methods and making large-scale multi-view stereo reconstruction possible.

(2) The invention processes the sparse point cloud models and camera parameters (camera focal length, rotation matrix and translation vector) of the different sub-regions independently on different nodes of a distributed system and rapidly calculates the dense point cloud models, improving the time efficiency of large-scale multi-view stereo reconstruction.

(3) The invention not only solves the memory-overflow problem of single-machine multi-view stereo reconstruction methods, but also improves the time efficiency of large-scale multi-view stereo reconstruction, laying an important foundation for the application of unmanned aerial vehicle aerial images in three-dimensional reconstruction and for the development and application of three-dimensional reconstruction technology.
Drawings
FIG. 1 is a schematic overall flow chart of the present invention;
FIG. 2 is an aerial image dataset in an embodiment;
FIG. 3 is a large-scale sparse point cloud model in an embodiment;
FIG. 4 is sparse point cloud model and camera pose information for a sub-region in an embodiment;
FIG. 5 is an original image (not a depth image) of an initial fusion view of a sub-region in an embodiment;
FIG. 6 is a dense point cloud model of a sub-region of an embodiment;
FIG. 7 is a prior art dense point cloud model;
FIG. 8 is a complete dense point cloud model according to an embodiment of the present invention.
Detailed Description
The technical scheme of the present invention is described in detail below, but the scope of the present invention is not limited to the embodiments.
The invention discloses a distributed multi-view three-dimensional reconstruction method for large-scale aerial images, which aims at reconstructing a dense scene point cloud from large-scale aerial imagery. Its core ideas are as follows: first, the sparse point cloud model corresponding to the large-scale aerial images is divided into several sub-regions; second, the aerial images, sparse point cloud models and camera parameters of the sub-regions are distributed to different nodes of a distributed environment, where the depth map of every image is calculated; then the depth images within each sub-region are fused on those nodes, yielding the dense point cloud model of each sub-region; finally, the dense point clouds of the child nodes are merged on the master machine of the distributed environment, yielding the dense point cloud model of the complete scene. In this way a high-quality dense point cloud model can be calculated rapidly from large-scale aerial image data.
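The sketch below illustrates this master/worker organization; a `multiprocessing` pool stands in for the nodes of the distributed environment, and the four helpers are hypothetical stand-ins for steps S3 to S6 rather than functions defined by the patent:

```python
from multiprocessing import Pool  # worker processes stand in for cluster nodes

# Hypothetical stand-ins for steps S3-S6; real implementations would go here.
def compute_depth_maps(images, cameras): ...
def select_initial_views(depths): ...
def fuse_depth_maps(depths, seeds): ...
def merge_point_clouds(parts): ...

def reconstruct_region(region):
    """Worker job for one sub-region: depth maps, seed views, fusion."""
    depths = compute_depth_maps(region["images"], region["cameras"])  # S3
    seeds = select_initial_views(depths)                              # S4
    return fuse_depth_maps(depths, seeds)                             # S5

def reconstruct_scene(regions):
    """Master driver: process sub-regions in parallel, then merge."""
    with Pool() as pool:
        parts = pool.map(reconstruct_region, regions)
    return merge_point_clouds(parts)                                  # S6
```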
As shown in fig. 1, the distributed multi-view stereoscopic reconstruction method for large-scale aerial images of the invention comprises the following steps:
S1, for a given large-scale aerial image data set UAV = {I_1, I_2, …, I_N}, where N represents the number of aerial images, calculate the sparse point cloud model and camera parameters of the corresponding scene:

the sparse point cloud model is S = {P_i | P_i = (x_i, y_i, z_i)}, i ∈ [1, M], where M represents the number of three-dimensional points, (x_i, y_i, z_i) represents the position of point P_i in the world coordinate system, and the subscript i is the sequence number of the three-dimensional point;

the camera parameters are C = {Q_ii | Q_ii = (K_ii, R_ii, T_ii)}, ii ∈ [1, N], where N represents the number of aerial images, K_ii denotes the intrinsic parameter matrix of the ii-th camera, R_ii its rotation matrix, T_ii its translation vector, and ii is the sequence number of the camera;
Step S1.1, detecting feature points and computing feature descriptors with the deep-learning-based SuperPoint method; then computing the matching relations between feature descriptors with a locality-sensitive hashing method; finally eliminating wrong matches according to the geometric consistency between images, obtaining correct feature matching points;

Step S1.2, according to the feature matching points obtained in step S1.1, calculating the camera parameters with an incremental Structure-from-Motion method: first computing the relative pose of the cameras with the five-point algorithm, then the absolute pose with the three-point method, and finally the focal length of each image with a camera self-calibration method;

Step S1.3, according to the feature matching points obtained in step S1.1 and the camera parameters obtained in step S1.2, calculating the sparse point cloud model of each regional scene with a global Structure-from-Motion method, improving the time efficiency of the reconstruction. First, all three-dimensional points corresponding to the images of region S_j are registered in the world coordinate system; then the camera parameters and the three-dimensional points in the world coordinate system are optimized jointly with the bundle adjustment method until convergence, obtaining an accurate sparse point cloud model, where j denotes the sequence number of the region;
S2, dividing the large-scale sparse point cloud model S into different regions, obtaining S = {S_j | S_j = {P_i | P_i = (x_i, y_i, z_i)}}, j ∈ [1, M′], i ∈ [1, M], where M represents the number of three-dimensional points in the sparse point cloud of the whole scene, (x_i, y_i, z_i) the position of point P_i in the world coordinate system, M′ the number of sub-regions, i the sequence number of the three-dimensional point, and S_j the j-th sub-region;

dividing the camera parameters C into M′ regions corresponding to the sub-regions of the sparse point cloud model S, obtaining the camera parameters C_j = {Q_i* | Q_i* = (K_i*, R_i*, T_i*)} of each region, where K_i*, R_i* and T_i* respectively denote the intrinsic parameter matrix, rotation matrix and translation vector of the i*-th camera in the region, and M′ represents the number of sub-regions;
The large-scale sparse point cloud is divided into several sub-region scenes with the dominant-set clustering method, as follows:
Let β denote the set containing the N aerial images and the sparse point cloud model S, and let J denote a square matrix with N rows and N columns that records the similarities between images; let V_i and V_j respectively denote the sets of all three-dimensional points observable in image I_i and image I_j. The similarity between image I_i and image I_j is then defined in terms of their commonly observed points and of the angle α between the viewing-ray vectors of the two images, α being computed from intermediate variables derived from the camera positions and the observed three-dimensional points;
From this, the similarity matrix J between images is calculated, and from the values of J a graph structure G = (V, E) is constructed, where V represents the vertices and E represents the edges;
Let x denote a vector containing N′ elements; then at any time t+1 the elements of x are updated as

x_g(t+1) = x_g(t) · (J x(t))_g / (x(t)^T J x(t)),

where the subscript g of x_g(t) denotes the g-th component of the vector x(t) at time t, and x(t)^T denotes the transpose of x(t);
Thus the error of the vector x between time t+1 and time t is

ε = x(t+1)^T J x(t+1) − x(t)^T J x(t),

where x(t)^T denotes the transpose of x(t);
If ε < 0.001, the iterative computation process is terminated, and the large-scale sparse point cloud model is thereby divided into the sparse point cloud models of several sub-regions;
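To complement the clustering sketch given earlier, the sketch below shows one plausible construction of the similarity matrix J from covisibility; since the patent's exact weighting formula is not reproduced in this text, scoring each shared point by the cosine of the viewing-ray angle α is an assumption:

```python
import numpy as np

def similarity_matrix(visible_sets, centers, points):
    """Covisibility-based image similarity.

    visible_sets[i] -- set of 3D point ids observed by image I_i
    centers[i]      -- camera centre of image I_i (3-vector)
    points[p]       -- 3D position of point p (3-vector)
    """
    n = len(visible_sets)
    J = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            score = 0.0
            for p in visible_sets[i] & visible_sets[j]:
                u = points[p] - centers[i]     # viewing ray of I_i
                v = points[p] - centers[j]     # viewing ray of I_j
                cos_a = u @ v / (np.linalg.norm(u) * np.linalg.norm(v))
                score += cos_a                 # small angle -> high weight
            J[i, j] = J[j, i] = score
    return J
```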
S3, calculating a depth map for each image in each region; denoting by Q* the number of aerial images in region S_j, the depth images corresponding to S_j are D(S_j) = {d_k}, k ∈ [1, Q*]. A corresponding high-quality depth map is calculated for each image with a stereo matching method based on image patches, laying the foundation for reconstructing a high-quality dense point cloud model; the specific process is as follows:
Step S3.1, performing disparity propagation with odd-even iterations, the propagation strategies comprising spatial propagation, view propagation and temporal propagation;

Step S3.2, in odd and even iterations of spatial propagation, starting the propagation from the top-left corner and the bottom-right corner respectively; each pixel is compared with the disparity planes of its neighbouring pixels, and the matching cost of the neighbouring pixel's disparity is computed against that of the current disparity value;

Step S3.3, taking the disparity value with the minimum matching cost as the optimal disparity value;
Step S3.4, iterating steps S3.1 to S3.3 until the optimal disparity values of all pixels are calculated, obtaining the optimal depth image;
S4, for each areaDepth image data +.>Selecting two optimal initial fusion views, and recording asWherein->Representation area->Number of aerial images in->And->Is a subscript for distinguishing between different depth images; selecting two optimal initial fusion images for each sub-region by using a multiple constraint method, and ensuring the geometric consistency of the generated dense point cloud and a real scene; the specific process is as follows:
Step S4.1, with the homography matrix constraint, calculating the feature matching points between any two input images that satisfy the homography constraint, denoted Matches_1;

Step S4.2, according to the fundamental matrix constraint relation, calculating the feature matching points between any two input images that satisfy it, denoted Matches_2;

Step S4.3, according to the essential matrix constraint relation, calculating the feature matching points between any two input images that satisfy it, denoted Matches_3;

Step S4.4, taking the intersection of Matches_1, Matches_2 and Matches_3 to obtain the matching points Matches_4;

Step S4.5, taking the two images that satisfy the Matches_4 constraint with the fewest error points as the initial fusion images;
S5, fusing all depth images D(S_j) of region S_j to obtain the dense point cloud model corresponding to S_j, denoted M_j = {G_a | G_a = (x_a, y_a, z_a)}, a ∈ [1, W], where W represents the number of three-dimensional points in the dense cloud generated by depth-map fusion and (x_a, y_a, z_a) represents the position of point G_a in the world coordinate system. The depth images within the region are fused into a whole by fully utilizing the normal vector information of the depth maps, obtaining the dense point cloud model corresponding to the images of the region; the specific process is as follows:
Step S5.1, computing the confidence of each vertex of the depth images to be fused;

Step S5.2, according to the confidence of each vertex, deleting redundant overlapping points from the depth images to be fused, obtaining the topology information of each regional image in each depth image;

Step S5.3, weighting the vertices of the depth images according to the topology information, obtaining the geometric information of the images;

Step S5.4, stitching the regional images according to the topology and geometric information, thereby obtaining the dense point cloud model of the corresponding region;
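A minimal sketch in the spirit of steps S5.1 and S5.2 is given below: each depth map is back-projected into the world frame and low-confidence vertices are discarded. The confidence measure used here, the agreement between the surface normal and the viewing ray, is an assumption, since the confidence formula is not spelled out in this text:

```python
import numpy as np

def fuse_region_depths(depths, normals, K, poses, conf_thresh=0.5):
    """Merge per-image depth maps of one region into a single point cloud.

    depths[i]  -- H x W depth map of image i
    normals[i] -- H x W x 3 unit normal map of image i (camera frame)
    poses[i]   -- (R, t) with p_cam = R @ p_world + t
    """
    Kinv = np.linalg.inv(K)
    chunks = []
    for depth, normal, (R, t) in zip(depths, normals, poses):
        H, W = depth.shape
        u, v = np.meshgrid(np.arange(W), np.arange(H))
        rays = np.stack([u, v, np.ones_like(u)], -1).reshape(-1, 3) @ Kinv.T
        # S5.1: confidence = cosine between normal and (reversed) viewing ray
        conf = -(normal.reshape(-1, 3) * rays).sum(1) / np.linalg.norm(rays, axis=1)
        keep = (depth.reshape(-1) > 0) & (conf > conf_thresh)   # S5.2
        cam_pts = rays * depth.reshape(-1, 1)
        chunks.append((cam_pts[keep] - t) @ R)  # world frame: R^T (p - t)
    return np.concatenate(chunks, axis=0)
```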
S6, merging the dense point cloud models M_j of all regions to obtain the complete dense point cloud model, denoted Model = {M_1, …, M_j, …, M_M′}, where M′ represents the number of sub-regions. The several dense point cloud models with overlapping features are merged into the dense point cloud model of the complete scene with a global iterative closest point method; the specific process is as follows:

Step S6.1, selecting a point set P_i′ from the target point cloud P, with P_i′ ∈ P;
Step S6.2, finding the corresponding point set Q_i′ in the source point cloud Q, with Q_i′ ∈ Q, such that the distance ||P_i′ − Q_i′|| between Q_i′ and P_i′ is minimal;

Step S6.3, calculating the rotation matrix R_i′ and the translation vector t_i′ between the point set P_i′ and the point set Q_i′;

Step S6.4, rotating and translating the points of P_i′ with the rotation matrix R_i′ and the translation vector t_i′, so as to calculate the new point set P′_i′ = R_i′P_i′ + t_i′;

Step S6.5, calculating the average distance d = (1/n)Σ‖P′_i′ − P_i′‖ between the point set P′_i′ and the point set P_i′, where n represents the number of three-dimensional points in the point set;

Step S6.6, if d is smaller than a preset threshold or the number of iterations is larger than a preset number, terminating the calculation process; otherwise returning to step S6.2 until the process converges.
Example 1:
The original aerial images of the embodiment are shown in fig. 2, and the final reconstruction result is shown in fig. 8; the dense point cloud model reconstructed from the large-scale aerial images has high geometric consistency with the real scene.
As can be seen from the above embodiment, the invention first calculates the sparse point cloud model of the complete scene (as shown in fig. 3) and then divides the large-scale sparse point cloud model into the sparse point cloud models of several sub-regions (as shown in fig. 4); next, the depth map of each image in every sub-region is calculated and an initial fusion view is selected for each sub-region (fig. 5 shows the original images, not the depth images, of an initial fusion view); then the depth images within each sub-region are fused to obtain the dense point cloud model of the sub-region (as shown in fig. 6); finally, the dense point clouds of all sub-regions are merged into the dense point cloud model of the complete scene (as shown in fig. 8). In addition, as can be seen from fig. 7, the dense point cloud model calculated by other prior art schemes differs considerably from the real scene in geometric consistency.
Moreover, the technical scheme of this embodiment needs only 8 hours and 24 GB of memory to process 1000 aerial images; that is, the invention not only improves the time efficiency of large-scale multi-view stereo reconstruction but also avoids the memory-overflow problem.
The invention can also be applied to other fields, such as the metaverse, digital-China construction, digital village construction, digital city construction, military simulation, unmanned driving, automatic navigation without satellite coverage, digital protection of cultural heritage, three-dimensional scene monitoring, large-scale film production, three-dimensional survey of natural disaster sites, three-dimensional popular-science creation, and virtual and augmented reality.
Claims (6)
1. A distributed multi-view three-dimensional reconstruction method for large-scale aerial images, characterized by comprising the following steps:
S1, for a given large-scale aerial image data set UAV = {I_1, I_2, …, I_N}, where N represents the number of aerial images, calculating a sparse point cloud model and camera parameters of the corresponding scene with a hybrid Structure-from-Motion method, with the following specific steps:
step S1.1, aerial image matching
firstly, detecting feature points and calculating feature descriptors with a deep-learning-based local feature method; then calculating the matching relations between the feature descriptors with a locality-sensitive hashing method; finally eliminating wrong matches according to the geometric consistency between images to obtain correct feature matching points;
step S1.2, calculating camera parameters
according to the feature matching points obtained in step S1.1, calculating the camera parameters with an incremental Structure-from-Motion method: firstly calculating the relative pose of the cameras with the five-point algorithm, then the absolute pose with the three-point method, and finally the focal length of each image with a camera self-calibration method;
wherein the sparse point cloud model is S = {P_i | P_i = (x_i, y_i, z_i)}, i ∈ [1, M], where M represents the number of three-dimensional points in the sparse point cloud of the entire scene, (x_i, y_i, z_i) represents the position of the i-th three-dimensional point P_i in the world coordinate system, and i represents the sequence number of the three-dimensional point;

the camera parameters are C = {Q_ii | Q_ii = (K_ii, R_ii, T_ii)}, ii ∈ [1, N], where N represents the number of aerial images, K_ii represents the intrinsic parameter matrix of the ii-th camera, R_ii the rotation matrix of the ii-th camera, T_ii the translation vector of the ii-th camera, and ii the sequence number of the camera;
S2, calculating a sparse point cloud model of each regional scene with a global Structure-from-Motion method according to the feature matching points obtained in step S1.1 and the camera parameters obtained in step S1.2:

dividing the large-scale sparse point cloud model S into different sub-regions to obtain S = {S_j | S_j = {P_i | P_i = (x_i, y_i, z_i)}}, j ∈ [1, M′], i ∈ [1, M], wherein M′ represents the number of sub-regions, i represents the sequence number of the three-dimensional point P_i, and j represents the j-th sub-region;

dividing the camera parameters C into M′ sub-regions corresponding to the sparse point cloud model S to obtain the camera parameters C_j = {Q_i* | Q_i* = (K_i*, R_i*, T_i*)} of each sub-region, wherein K_i*, R_i* and T_i* respectively represent the intrinsic parameter matrix, the rotation matrix and the translation vector of the i*-th camera in the sub-region;

registering all three-dimensional points corresponding to the images of sub-region S_j in the world coordinate system, and optimizing the camera parameters and the three-dimensional points in the world coordinate system simultaneously with the bundle adjustment method until convergence, obtaining an accurate sparse point cloud model;
S3, calculating a depth map for each image in each sub-region; denoting by Q* the number of aerial images in sub-region S_j, the depth images corresponding to S_j are D(S_j) = {d_k}, k ∈ [1, Q*], wherein k represents the sequence number of the k-th depth map;
S4, for the depth image data D(S_j) of each sub-region S_j, selecting two optimal initial fusion views, denoted D′(S_j) = {d_p, d_q}, p, q ∈ [1, Q*], wherein Q* represents the number of aerial images in sub-region S_j, p and q are subscripts used to distinguish different depth images, and d_p and d_q represent the optimal initial fusion views;
S5, fusing all depth images D(S_j) of sub-region S_j to obtain the dense point cloud model corresponding to S_j, denoted M_j = {G_a | G_a = (x_a, y_a, z_a)}, a ∈ [1, W], wherein W represents the number of three-dimensional points in the dense cloud after depth-map fusion in the sub-region, and (x_a, y_a, z_a) represents the position of point G_a in the world coordinate system;
S6, merging the dense point cloud models M_j of the sub-regions into a whole to obtain the dense point cloud model of the complete scene, namely Model = {M_1, …, M_j, …, M_M′}.
2. The large-scale aerial image-oriented distributed multi-view three-dimensional reconstruction method according to claim 1, wherein in the step S2 the sub-regions are divided with the dominant-set clustering method, as follows:
β denotes the set containing the N aerial images and the sparse point cloud model S, and J denotes a square matrix with N rows and N columns for recording the similarities between images; V_i and V_j respectively denote the sets of all three-dimensional points observable in image I_i and image I_j; the similarity between image I_i and image I_j is then defined in terms of the commonly observed points and of the angle α between the viewing-ray vectors of the two images, α being computed from intermediate variables derived from the camera positions and the observed three-dimensional points;
so far, the similarity matrix J between images is calculated, and a graph structure G = (V, E) is constructed according to the values of J, wherein V represents the vertices and E represents the edges;
the symbol x represents a vector containing N′ elements; at any time t+1, the value of each element in the vector x is updated as

x_g(t+1) = x_g(t) · (J x(t))_g / (x(t)^T J x(t))

wherein the subscript g of x_g(t) denotes the g-th component of the vector x(t) at time t, and x(t)^T represents the transpose of x(t);
therefore, the error of the vector x between time t+1 and time t is:

ε = x(t+1)^T J x(t+1) − x(t)^T J x(t) (5)

wherein x(t)^T represents the transpose of x(t);
if ε < 0.001, the iterative computation process is terminated, and the large-scale sparse point cloud model is thereby divided into the sparse point cloud models of a plurality of sub-regions.
3. The large-scale aerial image-oriented distributed multi-view three-dimensional reconstruction method according to claim 1, wherein in the step S3 a corresponding high-quality depth map is calculated for each image with a stereo matching method based on image patches, the detailed calculation steps being as follows:
Step S3.1, performing disparity propagation with odd-even iterations, the propagation strategies comprising spatial propagation, view propagation and temporal propagation;

Step S3.2, in odd and even iterations of spatial propagation, starting the propagation from the top-left corner and the bottom-right corner respectively, comparing each pixel with the disparity planes of its neighbouring pixels, and computing the matching cost of the neighbouring pixel's disparity against that of the current disparity value;

Step S3.3, taking the disparity value with the minimum matching cost as the optimal disparity value;

Step S3.4, iterating steps S3.1 to S3.3 until the optimal disparity values of all pixels are calculated, obtaining the optimal depth image.
4. The large-scale aerial image-oriented distributed multi-view three-dimensional reconstruction method according to claim 1, wherein in the step S4 two optimal initial fusion images are selected for each sub-region with a multiple-constraint method, the detailed calculation steps being as follows:
Step S4.1, with the homography matrix constraint, calculating the feature matching points between any two input images that satisfy the homography constraint, denoted Matches_1;

Step S4.2, according to the fundamental matrix constraint relation, calculating the feature matching points between any two input images that satisfy it, denoted Matches_2;

Step S4.3, according to the essential matrix constraint relation, calculating the feature matching points between any two input images that satisfy it, denoted Matches_3;

Step S4.4, taking the intersection of Matches_1, Matches_2 and Matches_3 to obtain the matching points Matches_4;

Step S4.5, taking the two images that satisfy the Matches_4 constraint with the fewest error points as the initial fused images.
5. The large-scale aerial image-oriented distributed multi-view three-dimensional reconstruction method according to claim 1, wherein in the step S5 the depth images in a region are fused into a whole by fully utilizing the normal vector information of the depth maps, obtaining the dense point cloud model corresponding to the images in the region; the detailed steps are as follows:
s5.1, calculating the confidence coefficient of each vertex of the depth images to be fused;
s5.2, deleting redundant overlapping points from the depth images to be fused according to the confidence degree of each vertex in the depth images, and obtaining topology information of each region image in each depth image;
s5.3, carrying out weighting operation on vertexes on the depth image according to the topology information to obtain geometric information of the image;
and S5.4, stitching the regional image according to the topology information and the geometric information, so as to obtain a dense point cloud model of the corresponding region.
6. The large-scale aerial image-oriented distributed multi-view three-dimensional reconstruction method according to claim 1, wherein in the step S6 a global iterative closest point method is used to merge a plurality of dense point cloud models with overlapping features into the dense point cloud model of the complete scene; the detailed steps are as follows:
Step S6.1, selecting a point set P_i′ from the target point cloud P, with P_i′ ∈ P;

Step S6.2, finding the corresponding point set Q_i′ in the source point cloud Q, with Q_i′ ∈ Q, such that the distance ||P_i′ − Q_i′|| between Q_i′ and P_i′ is minimal;

Step S6.3, calculating the rotation matrix R_i′ and the translation vector t_i′ between the point set P_i′ and the point set Q_i′;

Step S6.4, rotating and translating the points of P_i′ with the rotation matrix R_i′ and the translation vector t_i′, so as to calculate the new point set P′_i′ = R_i′P_i′ + t_i′;

Step S6.5, calculating the average distance d = (1/n)Σ‖P′_i′ − P_i′‖ between the point set P′_i′ and the point set P_i′, wherein n represents the number of three-dimensional points in the point set;

Step S6.6, if d is smaller than a preset threshold or the number of iterations is larger than a preset number, terminating the calculation process; otherwise returning to step S6.2 until the calculation process converges.
Priority Applications (1)

| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202310011438.9A | 2023-01-05 | 2023-01-05 | Large-scale aerial image-oriented distributed multi-view three-dimensional reconstruction method |
Publications (2)

| Publication Number | Publication Date |
|---|---|
| CN115719407A | 2023-02-28 |
| CN115719407B | 2023-06-27 |
Legal Events

| Date | Code | Title |
|---|---|---|
| | PB01 | Publication |
| | SE01 | Entry into force of request for substantive examination |
| | GR01 | Patent grant |