CN115719407A - Distributed multi-view stereo reconstruction method for large-scale aerial images - Google Patents
Publication number: CN115719407A (application CN202310011438.9A)
Legal status: Granted
Abstract
The invention discloses a distributed multi-view stereo reconstruction method for large-scale aerial images. The method makes full use of the regional structure of large-scale aerial image sets, converting the multi-view stereo reconstruction of a large-scale scene into small-scale reconstruction problems that can each be solved on a low-performance computer, which improves the time efficiency of three-dimensional reconstruction and reduces its cost.
Description
Technical Field
The invention relates to computer vision and image processing technology, in particular to a distributed multi-view stereo reconstruction method for large-scale aerial images.
Background
Multi-view Stereo (MVS) is a technique for computing a dense point cloud model of a scene from image data; it usually takes the output of Structure from Motion (SfM), namely a sparse point cloud model and camera parameters, as its input. Great research progress has been made on multi-view stereo reconstruction for small-scale image data (e.g., small-scale scene images acquired by a handheld camera); for large-scale outdoor scenes, however, existing multi-view stereo reconstruction methods need further improvement. Moreover, with the popularization of consumer-grade unmanned aerial vehicles, large-scale image data of outdoor scenes can now be acquired easily. Existing multi-view stereo reconstruction methods face two main challenges when processing large-scale aerial image data: (a) excessive running time: the multi-view stereo reconstruction process is very time-consuming, and when processing outdoor large-scale aerial image data the existing methods cannot compute a dense point cloud model within a limited time, making it difficult to meet the timeliness requirements of higher-level application systems; (b) memory overflow: multi-view stereo reconstruction places a large demand on computer memory, and when the volume of aerial image data is large, single-machine methods run out of memory, causing the three-dimensional reconstruction process to fail.
The above problems seriously hinder the development and application of multi-view stereo reconstruction technology and expose the shortcomings of single-machine multi-view stereo reconstruction methods in processing large-scale aerial image data. A distributed multi-view stereo reconstruction method oriented to large-scale aerial images is therefore needed, so that dense point cloud models of scenes can be computed rapidly from large-scale aerial images.
At present, the classical papers on multi-view stereo research are mainly: [1] Accurate, Dense, and Robust Multiview Stereopsis; [2] Pixelwise View Selection for Unstructured Multi-View Stereo; [3] BlendedMVS: A Large-scale Dataset for Generalized Multi-view Stereo Networks. Paper [1], published at the CVPR conference in 2007, is a multi-view stereo reconstruction method based on seed-point diffusion; paper [2], published at the ECCV conference in 2016, is a multi-view stereo reconstruction method based on image patch matching; paper [3], published at CVPR in 2020, is a multi-view stereo reconstruction method based on deep learning that estimates a depth map for each image with deep-learning techniques. These methods focus on improving the precision of the reconstructed model (the three-dimensional dense point cloud), and their target data are images of small-scale scenes.
Therefore, when existing multi-view stereo reconstruction methods are applied to large-scale aerial image data, the following challenges remain: (1) existing single-machine methods require a large memory space; for example, processing 1000 images requires about 64 GB of memory, and when the image count reaches 1500, 128 GB is needed, far beyond the maximum memory supported by a single computer; (2) the operating efficiency of existing methods is too low to meet the time-efficiency requirements of large-scale three-dimensional reconstruction from aerial image data; for example, processing 1000 aerial images takes about 10 days.
Disclosure of Invention
The purpose of the invention is as follows: the invention aims to overcome the defects of the prior art and provides a distributed multi-view stereo reconstruction method for large-scale aerial images, which rapidly computes the dense point cloud model of a scene from large-scale aerial image data in a distributed operating environment, promotes multi-view stereo reconstruction technology for large-scale aerial images, and achieves the goal of rapidly computing a high-quality dense point cloud of a large-scale scene.
The technical scheme is as follows: the invention discloses a distributed multi-view stereo reconstruction method for large-scale aerial images, which comprises the following steps of:
S1, for a given large-scale aerial image data set I = {I_1, I_2, …, I_N}, where N represents the number of aerial images, calculate the sparse point cloud model and camera parameters of the corresponding scene:
The sparse point cloud model is S = {X_i | i = 1, 2, …, M}, where M represents the number of three-dimensional points in the sparse point cloud of the entire scene, X_i denotes the position of the i-th three-dimensional point in the world coordinate system, and i is the serial number of a three-dimensional point;
The camera parameters are C = {(K_j, R_j, t_j) | j = 1, 2, …, N}, where N is the number of aerial images, K_j denotes the intrinsic parameter matrix of the j-th camera, R_j denotes its rotation matrix, t_j denotes its translation vector, and j is the serial number of a camera;
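The camera parameters above describe a pinhole projection: a three-dimensional point X is mapped into image j by x = K_j (R_j X + t_j), followed by dehomogenization. A minimal sketch of that mapping (variable names and data layout are illustrative, not the patent's implementation):

```python
import numpy as np

def project_point(X, K, R, t):
    """Project world point X into the image of a camera with
    intrinsics K, rotation R and translation t (pinhole model)."""
    Xc = R @ X + t            # world -> camera coordinates
    x = K @ Xc                # camera -> homogeneous pixel coordinates
    return x[:2] / x[2]       # dehomogenize

# Toy camera: focal length 1000, principal point (500, 500),
# identity rotation, camera at the world origin.
K = np.array([[1000.0, 0, 500], [0, 1000.0, 500], [0, 0, 1]])
R = np.eye(3)
t = np.zeros(3)

# A point 10 units in front of the camera, on the optical axis,
# projects to the principal point.
uv = project_point(np.array([0.0, 0.0, 10.0]), K, R, t)
print(uv)  # -> [500. 500.]
```

Each camera in the set C carries exactly one such triple (K_j, R_j, t_j), which is what the depth-map and fusion steps below consume.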
S2, divide the large-scale sparse point cloud model S into different sub-regions, obtaining S = {S_1, S_2, …, S_K}, where K represents the number of sub-regions, i the serial number of a three-dimensional point, and S_k the k-th sub-region;
Divide the camera parameters C into K groups corresponding to the sub-regions of the sparse point cloud model S, obtaining the camera parameters C_k of each sub-region;
S3, compute a depth map for each image in every region; denote by R_k a region containing N_k aerial images, so that the depth images corresponding to region R_k are D_k = {D_{k,1}, D_{k,2}, …, D_{k,N_k}}, where D_{k,j} denotes the j-th depth map;
S4, for the depth image data D_k of each region R_k, select the two optimal initial fusion views, denoted (D_{k,a}, D_{k,b}), where N_k denotes the number of aerial images in region R_k, and a and b are subscripts distinguishing different depth images; D_{k,a} and D_{k,b} represent the optimal initial fusion views;
S5, fuse all depth images D_k in region R_k to obtain the dense point cloud model corresponding to region R_k, denoted P_k = {Y_i | i = 1, 2, …, M_k}, where M_k represents the number of three-dimensional points in the dense cloud after the depth maps in the sub-region are fused, and Y_i denotes the position of a point in the world coordinate system;
S6, merge the dense point cloud models P_k of all regions into a whole, obtaining the dense point cloud model of the complete scene, denoted P = P_1 ∪ P_2 ∪ … ∪ P_K, where K represents the number of sub-regions.
Further, in step S1, a hybrid Structure-from-Motion method is used to calculate the sparse point cloud model and camera parameters of the scene from the aerial image data set I, comprising the following steps:
step S1.1, aerial image matching
Firstly, feature points are detected and feature descriptors computed with a deep-learning-based local feature detection method; then the matching relations between feature descriptors are computed with a locality-sensitive hashing method; finally, wrong matches are eliminated according to the geometric consistency between images, yielding correct feature matching points;
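Step S1.1 can be illustrated with a simplified matcher. The patent uses deep local features and hashing-accelerated matching; this sketch substitutes a brute-force nearest-neighbour search with Lowe's ratio test over toy descriptors, which is only a stand-in for that pipeline:

```python
import numpy as np

def match_descriptors(desc_a, desc_b, ratio=0.8):
    """Brute-force nearest-neighbour matching with Lowe's ratio test.
    Returns (i, j) index pairs; ambiguous matches (best distance not
    clearly below the second-best) are rejected."""
    matches = []
    for i, d in enumerate(desc_a):
        dists = np.linalg.norm(desc_b - d, axis=1)
        order = np.argsort(dists)
        best, second = order[0], order[1]
        if dists[best] < ratio * dists[second]:  # unambiguous match only
            matches.append((i, best))
    return matches

desc_a = np.array([[1.0, 0.0], [0.0, 1.0]])
desc_b = np.array([[0.9, 0.1], [5.0, 5.0], [0.1, 0.9]])
print(match_descriptors(desc_a, desc_b))  # -> [(0, 0), (1, 2)]
```

The surviving pairs would then be filtered by the geometric-consistency check described above.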
step S1.2, calculating camera parameters
According to the feature matching points obtained in S1.1, the camera parameters are computed with an incremental Structure-from-Motion method: first, the relative pose of the cameras is computed with the five-point algorithm; then the absolute pose is computed with the three-point (PnP) method; finally, the focal length of each image is computed with a camera self-calibration method;
Step S1.3, calculating a sparse point cloud model of the regional images
According to the feature matching points obtained in step S1.1 and the camera parameters obtained in step S1.2, the sparse point cloud model of the regional scene is computed with a global Structure-from-Motion method, improving the time efficiency of three-dimensional reconstruction. First, the three-dimensional points corresponding to all images of region R_k are registered in the world coordinate system; then the camera parameters and the three-dimensional points in the world coordinate system are optimized simultaneously with the bundle adjustment method until convergence, yielding an accurate sparse point cloud model; here k denotes the serial number of a region.
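The bundle adjustment mentioned in step S1.3 minimizes the total reprojection error over camera parameters and three-dimensional points. A minimal sketch of that objective (the nonlinear least-squares solver itself is omitted, and the data layout is hypothetical):

```python
import numpy as np

def reprojection_error(points, cameras, observations):
    """Sum of squared reprojection errors: the objective that bundle
    adjustment minimizes over camera parameters and 3-D points.
    observations: list of (cam_idx, pt_idx, observed_uv)."""
    err = 0.0
    for c, p, uv in observations:
        K, R, t = cameras[c]
        x = K @ (R @ points[p] + t)          # pinhole projection
        err += np.sum((x[:2] / x[2] - uv) ** 2)
    return err

K = np.array([[100.0, 0, 0], [0, 100.0, 0], [0, 0, 1]])
cameras = [(K, np.eye(3), np.zeros(3))]
points = [np.array([1.0, 0.0, 10.0])]        # projects to (10, 0)
obs = [(0, 0, np.array([10.0, 0.0]))]
print(reprojection_error(points, cameras, obs))  # -> 0.0
```

A Levenberg-Marquardt style solver would perturb the camera and point parameters to drive this value to a minimum, which is the "optimize until convergence" step above.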
Further, in step S2, a dominant-set clustering method is used to divide the large-scale sparse point cloud into multiple sub-region scenes, as follows:
Consider the set containing the N_k aerial images and the sparse point cloud model S_k of a region, and let W be an N_k × N_k square matrix recording the similarity between images. Let V_a and V_b denote the sets of all three-dimensional points observable in images I_a and I_b, respectively. The similarity W(a, b) between images I_a and I_b is defined over the co-visible points V_a ∩ V_b, weighted by the angle θ_X between the vector (c_a − X) and the vector (c_b − X) from the two camera centers c_a and c_b to each co-visible point X; the remaining quantities in the similarity formula are intermediate calculation variables;
At this point the similarity matrix W between the images has been computed; according to the values of W, construct a graph structure G = (V, E), where V represents the vertices (one per image) and E the edges;
Denote by x(t) a vector containing N_k elements; then at any time t, each element of x(t+1) is given by the replicator-dynamics update
x_i(t+1) = x_i(t) · (W x(t))_i / (x(t)^T W x(t)),
where the subscript i of x_i(t) denotes the i-th component of the vector x(t) at time t, and x(t)^T denotes the transpose of x(t);
Therefore, the error between time t and time t+1 of the vector x is
e(t) = ‖x(t+1) − x(t)‖;
If e(t) < ε, the iterative calculation terminates, and the large-scale sparse point cloud model is thereby divided into the sparse point cloud models of multiple sub-regions.
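The iteration described above matches the standard replicator-dynamics formulation of dominant-set clustering; the update rule in the sketch below is that standard formulation, assumed here rather than quoted from the patent. A toy 4-image similarity matrix whose first two images clearly form the dominant cluster:

```python
import numpy as np

def dominant_set(W, eps=1e-8, max_iter=1000):
    """Replicator-dynamics iteration for dominant-set clustering:
    x_i(t+1) = x_i(t) * (W x(t))_i / (x(t)^T W x(t)).
    Stops when ||x(t+1) - x(t)|| < eps; the support of x is the cluster."""
    n = W.shape[0]
    x = np.full(n, 1.0 / n)            # uniform start on the simplex
    for _ in range(max_iter):
        Wx = W @ x
        x_new = x * Wx / (x @ Wx)
        if np.linalg.norm(x_new - x) < eps:
            break
        x = x_new
    return x_new

# Images 0,1 strongly overlap each other; images 2,3 barely overlap anything.
W = np.array([[0.0, 0.9, 0.1, 0.1],
              [0.9, 0.0, 0.1, 0.1],
              [0.1, 0.1, 0.0, 0.2],
              [0.1, 0.1, 0.2, 0.0]])
x = dominant_set(W)
cluster = np.where(x > 0.1)[0]         # images with non-negligible weight
print(cluster)  # -> [0 1]
```

Peeling off the returned cluster and repeating on the remaining images yields the multiple sub-region scenes.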
Further, in step S3, a stereo matching method based on image patches is used to compute a corresponding high-quality depth map for each image, laying the foundation for reconstructing a high-quality dense point cloud model. The detailed calculation steps are as follows:
S3.1, disparity propagation is performed with an odd-even iteration strategy comprising spatial propagation, view propagation and temporal propagation;
S3.2, in spatial propagation, odd and even iterations propagate from the upper-left corner and the lower-right corner respectively; each pixel compares the cost of its own disparity plane with those of its neighbouring pixels, computing the matching cost of the propagated disparity against the current disparity value;
S3.3, finally, the disparity value with the minimum matching cost is taken as the optimal disparity value;
Step S3.4, steps S3.1 to S3.3 are iterated until the optimal disparity values of all pixels have been computed, yielding the optimal depth image.
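Steps S3.1 to S3.4 can be sketched on a single scanline: each pixel adopts a neighbour's disparity whenever that disparity has lower matching cost at the pixel, with the sweep direction alternating between odd and even passes. A much-simplified one-dimensional stand-in for the patent's two-dimensional propagation:

```python
import numpy as np

def propagate_disparity(cost, disp, passes=2):
    """Simplified PatchMatch-style spatial propagation on a 1-D scanline.
    cost[p, d] is the matching cost of disparity d at pixel p. Even passes
    sweep left-to-right (adopt the left neighbour's disparity if cheaper),
    odd passes sweep right-to-left, mirroring the upper-left / lower-right
    alternation described above."""
    n = cost.shape[0]
    for it in range(passes):
        if it % 2 == 0:                       # left-to-right sweep
            for p in range(1, n):
                if cost[p, disp[p - 1]] < cost[p, disp[p]]:
                    disp[p] = disp[p - 1]
        else:                                 # right-to-left sweep
            for p in range(n - 2, -1, -1):
                if cost[p, disp[p + 1]] < cost[p, disp[p]]:
                    disp[p] = disp[p + 1]
    return disp

# 4-pixel scanline, 3 candidate disparities; true disparity is 1 everywhere.
cost = np.array([[5.0, 0.1, 5.0],
                 [5.0, 0.2, 5.0],
                 [5.0, 0.1, 5.0],
                 [5.0, 0.3, 5.0]])
disp = np.array([1, 0, 2, 0])                 # noisy initialization
print(propagate_disparity(cost, disp))        # -> [1 1 1 1]
```

In the full method the same adopt-if-cheaper rule runs over 2-D disparity planes and also across views and time.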
Further, in step S4, a multiple-constraint method is used to select the two optimal initial fusion images for each sub-region, ensuring the geometric consistency between the generated dense point cloud and the real scene; the detailed calculation steps are as follows:
Step S4.1, using the homography constraint, compute the feature matching points between any two input images that satisfy the homography matrix constraint, denoted P_H;
Step S4.2, according to the fundamental matrix constraint, compute the feature matching points between any two input images that satisfy it, denoted P_F;
Step S4.3, according to the essential matrix constraint, compute the feature matching points between any two input images that satisfy it, denoted P_E;
Step S4.4, take the intersection of the matching points P_H, P_F and P_E to obtain the matching points P* = P_H ∩ P_F ∩ P_E;
Step S4.5, take as the initial fusion images the two images that satisfy the constraint P* with the fewest erroneous points.
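Steps S4.1 to S4.5 reduce to intersecting the per-pair inlier sets and keeping the image pair with the most surviving matches. A sketch over hypothetical match sets (the homography / fundamental / essential inlier estimation itself, e.g. by RANSAC, is assumed done upstream):

```python
def select_initial_pair(PH, PF, PE):
    """Pick the initial fusion pair: for every image pair, intersect the
    matches surviving the homography (PH), fundamental (PF) and essential
    (PE) constraints, then keep the pair with the most surviving matches,
    i.e. the fewest matches rejected as erroneous."""
    best_pair, best_count = None, -1
    for pair in PH:
        common = PH[pair] & PF.get(pair, set()) & PE.get(pair, set())
        if len(common) > best_count:
            best_pair, best_count = pair, len(common)
    return best_pair, best_count

# Hypothetical inlier sets for two image pairs (match = feature-index tuple).
PH = {(0, 1): {(1, 1), (2, 2), (3, 3)}, (0, 2): {(1, 5), (4, 4)}}
PF = {(0, 1): {(1, 1), (2, 2)},         (0, 2): {(1, 5)}}
PE = {(0, 1): {(1, 1), (2, 2), (9, 9)}, (0, 2): {(8, 8)}}
print(select_initial_pair(PH, PF, PE))  # -> ((0, 1), 2)
```

Requiring all three constraints at once is what gives the selected pair its geometric consistency with the real scene.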
Further, in step S5, the normal vector information of the depth maps is fully utilized to fuse the depth images within a region into a whole, obtaining the dense point cloud model corresponding to the images of the region; the detailed steps are as follows:
s5.1, calculating the confidence coefficient of each vertex of the depth image to be fused;
s5.2, deleting some redundant overlapped points from the depth image to be fused according to the confidence coefficient of each vertex in the depth image to obtain the topological information of each area image in each depth image;
s5.3, carrying out weighting operation on the vertexes on the depth image according to the topological information to obtain the geometric information of the image;
And S5.4, stitch the region images according to the topological and geometric information, obtaining the dense point cloud model of the corresponding region.
Further, in step S6, a global iterative closest point method is used to merge the multiple dense point cloud models with overlapping features into the dense point cloud model of the complete scene; the detailed steps are as follows:
Step S6.1, take the dense point cloud of one sub-region as the target point cloud Q and the dense point cloud of an adjacent, overlapping sub-region as the source point cloud P;
Step S6.2, for each point p_i in the source point cloud P, find the corresponding point q_i in the target point cloud Q such that the distance ‖q_i − p_i‖ between q_i and p_i is minimal;
Step S6.3, from the corresponding point sets, compute the rotation matrix R and translation vector t that minimize the mean distance between corresponding points;
Step S6.4, apply the rotation matrix R and translation vector t to the points of P, computing the transformed point set P′ = {p′_i = R p_i + t};
Step S6.5, compute the average distance d = (1/n) Σ_{i=1}^{n} ‖p′_i − q_i‖ between the point set P′ and the point set Q, where n represents the number of three-dimensional points in the point set;
Step S6.6, if d is less than a preset threshold or the number of iterations exceeds a preset maximum, the calculation terminates; otherwise return to step S6.2 until the process converges.
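One iteration of the iterative-closest-point merging in step S6 can be sketched as follows: nearest-neighbour correspondences, a closed-form estimate of the rotation and translation, and application of the transform. The SVD (Kabsch) solution used for the transform is a standard choice assumed here, not stated in the patent:

```python
import numpy as np

def icp_step(P, Q):
    """One iteration of point-to-point ICP: match each point of P to its
    nearest neighbour in Q, then compute the rigid transform (R, t)
    aligning the correspondences via the SVD (Kabsch) solution."""
    # Nearest-neighbour correspondences (cf. step S6.2)
    d = np.linalg.norm(P[:, None, :] - Q[None, :, :], axis=2)
    Qc = Q[np.argmin(d, axis=1)]
    # Closed-form rigid transform between corresponding point sets
    mp, mq = P.mean(axis=0), Qc.mean(axis=0)
    H = (P - mp).T @ (Qc - mq)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:          # guard against reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = mq - R @ mp
    return R, t

# Target cloud: the 8 corners of a unit cube; source cloud: same, shifted.
Q = np.array([[x, y, z] for x in (0.0, 1.0) for y in (0.0, 1.0) for z in (0.0, 1.0)])
P = Q + np.array([0.05, -0.05, 0.02])
R, t = icp_step(P, Q)
P_new = (R @ P.T).T + t               # apply the transform (cf. step S6.4)
print(np.allclose(P_new, Q))          # -> True
```

With a pure small translation a single iteration aligns the clouds exactly; real sub-region clouds need the repeated match / solve / transform loop with the distance test of step S6.6.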
Beneficial effects: compared with the prior art, the invention has the following advantages:
(1) By dividing the sparse point cloud model of the large-scale scene into different sub-regions, the method avoids the memory overflow and system crashes of single-machine multi-view stereo reconstruction methods, making large-scale multi-view stereo reconstruction possible.
(2) On different nodes of a distributed system, the method independently processes the sparse point cloud models and camera parameters (camera focal length, rotation matrix and translation vector) of the different sub-regions and rapidly computes the dense point cloud models, improving the time efficiency of large-scale multi-view stereo reconstruction.
(3) The method not only solves the memory-overflow problem of single-machine multi-view stereo reconstruction methods but also improves the time efficiency of large-scale multi-view stereo reconstruction, laying an important foundation for applying unmanned-aerial-vehicle imagery to three-dimensional reconstruction and for the development and application of three-dimensional reconstruction technology.
Drawings
FIG. 1 is a schematic overall flow diagram of the present invention;
FIG. 2 is an aerial image dataset in an embodiment;
FIG. 3 is a large-scale sparse point cloud model in an embodiment;
FIG. 4 is a sparse point cloud model and camera pose information for a subregion in an embodiment;
FIG. 5 is the original image (not the depth image) of an initial fusion view of a sub-region in an embodiment;
FIG. 6 is a dense point cloud model of a sub-region in an embodiment;
FIG. 7 is a dense point cloud model computed by a prior-art method;
FIG. 8 is the complete dense point cloud model according to an embodiment of the present invention.
Detailed Description
The technical solution of the present invention is described in detail below, but the scope of the present invention is not limited to the embodiments.
The invention discloses a distributed multi-view stereo reconstruction method for large-scale aerial images, which aims to reconstruct a dense scene point cloud from large-scale aerial images. Its core idea is as follows: first, the sparse point cloud model corresponding to the large-scale aerial images is divided into multiple sub-regions; second, the aerial images, sparse point cloud models and camera parameters corresponding to the sub-regions are distributed to different nodes of a distributed environment for processing, and the depth image of each image is computed; then, the depth images within each sub-region are fused on the different nodes, obtaining the dense point cloud models of the sub-regions; finally, the dense point clouds on the sub-nodes are merged on a master machine of the distributed environment, obtaining the dense point cloud model of the complete scene. The method thus makes it possible to rapidly compute a high-quality dense point cloud model from large-scale aerial image data.
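The partition / per-node reconstruction / master-side merge pipeline described above can be sketched with a worker pool standing in for the distributed nodes (function names and the dummy per-region payload are illustrative; a real deployment dispatches over cluster nodes):

```python
from concurrent.futures import ThreadPoolExecutor

def reconstruct_subregion(region):
    """Stand-in for the per-node work: depth-map computation and fusion
    for one sub-region, returning its dense point cloud (here, a list of
    dummy points derived from the region id)."""
    region_id, n_images = region
    return [(region_id, i) for i in range(n_images)]

def distributed_mvs(regions, workers=4):
    """Dispatch each sub-region to a worker (a thread pool stands in for
    the distributed nodes), then merge the per-region dense clouds on the
    master into the complete model."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        clouds = list(pool.map(reconstruct_subregion, regions))
    merged = [p for cloud in clouds for p in cloud]   # master-side merge
    return merged

regions = [(0, 2), (1, 3), (2, 1)]        # (region id, image count) pairs
print(len(distributed_mvs(regions)))      # -> 6
```

Because the sub-regions are independent, each worker holds only its own region's data, which is exactly how the method keeps the per-node memory footprint small.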
As shown in fig. 1, the distributed multi-view stereo reconstruction method for large-scale aerial images of the present invention includes the following steps:
S1, for a given large-scale aerial image data set I = {I_1, I_2, …, I_N}, where N represents the number of aerial images, calculate the sparse point cloud model and camera parameters of the corresponding scene:
The sparse point cloud model is S = {X_i | i = 1, 2, …, M}, where M represents the number of three-dimensional points, X_i denotes the position of point i in the world coordinate system, and the subscript i of X_i is the serial number of a three-dimensional point;
The camera parameters are C = {(K_j, R_j, t_j) | j = 1, 2, …, N}, where N is the number of aerial images, K_j denotes the intrinsic parameter matrix of the j-th camera, R_j its rotation matrix, t_j its translation vector, and j the serial number of a camera;
S1.1, detect feature points and compute feature descriptors with the deep-learning-based SuperPoint method, then compute the matching relations between feature descriptors with a locality-sensitive hashing method, and finally eliminate wrong matches according to the geometric consistency between images, obtaining correct feature matching points;
S1.2, according to the feature matching points obtained in S1.1, compute the camera parameters with an incremental Structure-from-Motion method: first compute the relative pose of the cameras with the five-point algorithm, then compute the absolute pose with the three-point (PnP) method, and finally compute the focal length of each image with a camera self-calibration method;
S1.3, according to the feature matching points obtained in S1.1 and the camera parameters obtained in S1.2, compute the sparse point cloud model of the regional scene with a global Structure-from-Motion method, improving the time efficiency of three-dimensional reconstruction: first register the three-dimensional points corresponding to all images of region R_k in the world coordinate system, then optimize the camera parameters and the three-dimensional points in the world coordinate system with the bundle adjustment method until convergence, obtaining an accurate sparse point cloud model; here k denotes the serial number of a region;
S2, divide the large-scale sparse point cloud model S into different regions, obtaining S = {S_1, S_2, …, S_K}, where K represents the number of sub-regions, i the serial number of a three-dimensional point X_i in the world coordinate system, and S_k the k-th sub-region;
Divide the camera parameters C into the K corresponding regions, obtaining the camera parameters C_k = {(K_j, R_j, t_j)} of each region, where N_k denotes the number of aerial images in the region, K_j denotes the intrinsic parameter matrix of the j-th camera, R_j its rotation matrix, t_j its translation vector, and K the number of sub-regions;
A dominant-set clustering method is used to divide the large-scale sparse point cloud into multiple sub-region scenes, as follows:
Consider the set containing the N_k aerial images and the sparse point cloud model S_k of a region, and let W be an N_k × N_k square matrix recording the similarity between images. Let V_a and V_b denote the sets of all three-dimensional points observable in images I_a and I_b, respectively. The similarity W(a, b) between images I_a and I_b is defined over the co-visible points V_a ∩ V_b, weighted by the angle θ_X between the vector (c_a − X) and the vector (c_b − X) from the two camera centers c_a and c_b to each co-visible point X; the remaining quantities in the similarity formula are intermediate calculation variables.
At this point the similarity matrix W between the images has been computed; according to the values of W, construct a graph structure G = (V, E), where V represents the vertices and E the edges.
Denote by x(t) a vector containing N_k elements; then at any time t, each element of x(t+1) is given by the replicator-dynamics update x_i(t+1) = x_i(t) · (W x(t))_i / (x(t)^T W x(t)), where the subscript i denotes the i-th component of the vector x(t) at time t and x(t)^T denotes the transpose of x(t).
Therefore, the error between time t and time t+1 of the vector x is e(t) = ‖x(t+1) − x(t)‖.
If e(t) < ε, the iterative calculation ends, and the large-scale sparse point cloud model is divided into the sparse point cloud models of multiple sub-regions;
S3, compute a depth map for each image in every region; denote by R_k a region containing N_k aerial images, so that the depth images corresponding to region R_k are D_k = {D_{k,1}, …, D_{k,N_k}}. A stereo matching method based on image patches is used to compute a corresponding high-quality depth map for each image, laying the foundation for reconstructing a high-quality dense point cloud model; the specific process is as follows:
S3.1, disparity propagation is performed with an odd-even iteration strategy comprising spatial propagation, view propagation and temporal propagation;
S3.2, in spatial propagation, odd and even iterations propagate from the upper-left corner and the lower-right corner respectively; each pixel compares the cost of its own disparity plane with those of its neighbouring pixels, computing the matching cost of the propagated disparity against the current disparity value;
S3.3, finally, the disparity value with the minimum matching cost is taken as the optimal disparity value;
Step S3.4, steps S3.1 to S3.3 are iterated until the optimal disparity values of all pixels have been computed, yielding the optimal depth image;
S4, for the depth image data D_k of each region R_k, select the two optimal initial fusion views, denoted (D_{k,a}, D_{k,b}), where N_k denotes the number of aerial images in region R_k and a and b are subscripts distinguishing different depth images. A multiple-constraint method is used to select the two optimal initial fusion images for each sub-region, ensuring the geometric consistency between the generated dense point cloud and the real scene; the specific process is as follows:
S4.1, using the homography constraint, compute the feature matching points between any two input images that satisfy the homography matrix constraint, denoted P_H;
S4.2, according to the fundamental matrix constraint, compute the feature matching points between any two input images that satisfy it, denoted P_F;
S4.3, according to the essential matrix constraint, compute the feature matching points between any two input images that satisfy it, denoted P_E;
Step S4.4, take the intersection of the matching points P_H, P_F and P_E to obtain the matching points P* = P_H ∩ P_F ∩ P_E;
Step S4.5, take as the initial fusion images the two images that satisfy the constraint P* with the fewest erroneous points;
S5, fuse all depth images D_k in region R_k to obtain the dense point cloud model corresponding to region R_k, denoted P_k = {Y_i | i = 1, 2, …, M_k}, where M_k represents the number of three-dimensional points in the dense cloud generated by depth map fusion and Y_i denotes the position of a point in the world coordinate system. The normal vector information of the depth maps is fully utilized to fuse the depth images within a region into a whole, obtaining the dense point cloud model corresponding to the images of the region; the specific process is as follows:
S5.1, compute the confidence of each vertex of the depth images to be fused;
S5.2, delete redundant overlapping points from the depth images to be fused according to the confidence of each vertex, obtaining the topological information of each region image in each depth image;
S5.3, weight the vertices of the depth images according to the topological information, obtaining the geometric information of the images;
S5.4, stitch the region images according to the topological and geometric information, obtaining the dense point cloud models of the corresponding regions;
S6, merge the dense point cloud models P_k of all regions into a whole, obtaining the complete dense point cloud model, denoted P = P_1 ∪ P_2 ∪ … ∪ P_K, where K represents the number of sub-regions. A global iterative closest point method is used to merge the multiple dense point cloud models with overlapping features into the dense point cloud model of the complete scene; the specific process is as follows:
Step S6.1, take the dense point cloud of one sub-region as the target point cloud Q and the dense point cloud of an adjacent, overlapping sub-region as the source point cloud P;
Step S6.2, for each point p_i in the source cloud P, find the corresponding point q_i in the target cloud Q such that the distance ‖q_i − p_i‖ is minimal;
Step S6.3, from the corresponding point sets, compute the rotation matrix R and translation vector t that minimize the mean distance between corresponding points;
Step S6.4, apply the rotation matrix R and translation vector t to the points of P, computing the transformed point set P′ = {p′_i = R p_i + t};
Step S6.5, compute the average distance d = (1/n) Σ_{i=1}^{n} ‖p′_i − q_i‖ between the point set P′ and the point set Q, where n represents the number of three-dimensional points in the point set;
Step S6.6, if d is less than a preset threshold or the number of iterations exceeds a preset maximum, the calculation terminates; otherwise return to step S6.2 until the process converges.
Example 1:
The original aerial images of the embodiment are shown in fig. 2, the final reconstruction result is shown in fig. 8, and the dense point cloud model reconstructed from the large-scale aerial images has high geometric consistency with the real scene.
As can be seen from the above embodiment, the sparse point cloud model of the complete scene is computed first (as shown in fig. 3); then the large-scale sparse point cloud model is divided into the sparse point cloud models of multiple sub-regions (as shown in fig. 4); then the depth map of each image in every sub-region is computed and an initial fusion view is selected for each sub-region (as shown in fig. 5, which shows the original image, not the depth image, of the initial fusion view); fourth, the depth images within the sub-regions are fused to obtain the dense point cloud models of the corresponding sub-regions (as shown in fig. 6); finally, the dense point clouds of the sub-regions are merged into a whole, obtaining the dense point cloud model of the complete scene (as shown in fig. 8). In addition, as can be seen from fig. 7, the dense point cloud model computed by other prior-art schemes differs greatly from the real scene in geometric consistency.
In addition, the technical scheme of this embodiment needs only 8 hours and 24 GB of memory to process 1000 aerial images; that is, it improves the time efficiency of large-scale multi-view stereo reconstruction while avoiding the memory-overflow problem.
The invention can also be applied to other fields, such as the metaverse, Digital China construction, digital countryside construction, digital city construction, military simulation, autonomous driving, automatic navigation without satellite coverage, digital protection of cultural heritage, three-dimensional scene monitoring, film and television production, three-dimensional surveying of natural disaster sites, three-dimensional visualization for science popularization, virtual reality, and augmented reality.
Claims (7)
1. A distributed multi-view stereo reconstruction method for large-scale aerial images, characterized in that the method comprises the following steps:
S1, for a given large-scale aerial image data set I = {I_1, I_2, …, I_N}, where N represents the number of aerial images, calculate the sparse point cloud model and the camera parameters of the corresponding scene:
the sparse point cloud model is S = {X_1, X_2, …, X_m}, where m represents the number of three-dimensional points in the sparse point cloud of the entire scene, X_i represents the position of the i-th three-dimensional point in the world coordinate system, and i represents the serial number of a three-dimensional point;
the camera parameters are C = {(K_j, R_j, t_j) | j = 1, …, N}, where N indicates the number of aerial images, K_j represents the intrinsic parameter matrix of the j-th camera, R_j represents the rotation matrix of the j-th camera, t_j represents the translation vector of the j-th camera, and j represents the serial number of a camera;
S2, divide the large-scale sparse point cloud model S into different sub-regions to obtain S = {S_1, S_2, …, S_k}, where k represents the number of sub-regions, S_r represents the r-th sub-region, and r represents the serial number of a sub-region;
divide the camera parameters C into k groups corresponding to the sub-regions of the sparse point cloud model S, obtaining the camera parameters C_r of each sub-region;
S3, calculate a depth map for each image in every sub-region; denoting by n_r the number of aerial images in sub-region S_r, the depth images corresponding to S_r are D_r = {D_r,1, D_r,2, …, D_r,n_r}, where D_r,j denotes the j-th depth map;
S4, for the depth image data D_r of each sub-region S_r, select the two optimal initial fusion views, denoted D_r,a and D_r,b, where a and b are subscripts used to distinguish different depth images;
S5, fuse all depth images D_r in sub-region S_r to obtain the dense point cloud model corresponding to S_r, denoted P_r = {p_1, p_2, …, p_m_r}, where m_r represents the number of three-dimensional points in the dense point cloud after the depth maps of the sub-region are fused, and p_i indicates the position of a point in the world coordinate system;
S6, merge the dense point cloud models P_r of all sub-regions into the dense point cloud model of the complete scene.
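The data layout of steps S1 and S2 can be made concrete with a small sketch. The `Camera` record and the splitter below are illustrative stand-ins only (the patent partitions with dominant-set clustering, claim 3; here a simple sort on the x coordinate shows the data flow), assuming NumPy arrays for the point cloud.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class Camera:
    """One entry of the camera parameter set C of step S1."""
    K: np.ndarray  # 3x3 intrinsic parameter matrix
    R: np.ndarray  # 3x3 rotation matrix
    t: np.ndarray  # translation vector

def split_into_subregions(points, k):
    """Hypothetical stand-in for step S2: partition the sparse cloud S
    into k sub-regions by sorting on the x coordinate.  The patent uses
    dominant-set clustering instead; this only illustrates the data flow."""
    order = np.argsort(points[:, 0])
    return [points[idx] for idx in np.array_split(order, k)]
```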
2. The distributed multi-view stereo reconstruction method for large-scale aerial images according to claim 1, characterized in that: in step S1, a hybrid structure-from-motion method is used to calculate the sparse point cloud model and the camera parameters of the scene from the aerial image data set I, and the specific steps are as follows:
step S1.1, aerial image matching
First, detect feature points and compute feature descriptors with a deep-learning-based local feature method; then compute the matching relations between the feature descriptors with a locality-sensitive hashing method; finally, eliminate wrong matches according to the geometric consistency between images, obtaining correct feature matching points;
step S1.2, calculating camera parameters
Calculate the camera parameters with an incremental structure-from-motion method from the feature matching points obtained in step S1.1: first, compute the relative pose of the cameras with the five-point algorithm; then compute the absolute pose of the cameras with the three-point method; finally, compute the focal length of each image with a camera self-calibration method;
s1.3, calculating a sparse point cloud model of the regional image
According to the feature matching points obtained in step S1.1 and the camera parameters obtained in step S1.2, calculate the sparse point cloud model of the region scene with a global structure-from-motion method:
first, register the three-dimensional points corresponding to all images of region S_r in the world coordinate system; then optimize the camera parameters and the three-dimensional points in the world coordinate system with a bundle adjustment method until convergence, obtaining an accurate sparse point cloud model; where r denotes the serial number of the region.
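The bundle adjustment of step S1.3 minimises reprojection error over the camera parameters and the three-dimensional points. A minimal sketch of that residual under the pinhole model of the (K, R, t) parameters of claim 1 might look as follows; the names are illustrative, and a full solver would sum this residual over all observations.

```python
import numpy as np

def reprojection_error(K, R, t, X, uv):
    """Residual of one observation: project the 3D point X with the
    camera (K, R, t) and compare it with the observed pixel uv."""
    x = K @ (R @ X + t)      # homogeneous image coordinates
    proj = x[:2] / x[2]      # perspective division
    return np.linalg.norm(proj - uv)
```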
3. The distributed multi-view stereo reconstruction method for large-scale aerial images according to claim 1, characterized in that: in step S2, a dominant set clustering method is used to divide the sub-regions; the specific method is as follows:
let G denote the collection containing the N aerial images and the sparse point cloud model S, and let A denote an N × N square matrix used to record the similarity between images; let T_i and T_j respectively represent the sets of all three-dimensional points observable in image I_i and in image I_j; the similarity a_ij between image I_i and image I_j is then defined over the commonly observed points T_i ∩ T_j,
where θ represents the angle between the viewing vector v_i and the viewing vector v_j, θ is calculated from the camera centres and the commonly observed points, and the remaining quantities are intermediate calculation variables;
the similarity matrix A between the images is thus obtained; according to the values of A, a graph structure G = (V, E) is constructed, where V represents the vertices and E represents the edges;
let x(t) denote a vector containing N elements; then at any time t, each element of the vector is updated as x_i(t+1) = x_i(t)·(A·x(t))_i / (x(t)^T·A·x(t)), where the subscript i denotes the i-th component of the vector x(t) at time t, and x(t)^T denotes the transpose of x(t);
therefore, the error between the vector at time t+1 and at time t is ε(t) = ||x(t+1) − x(t)||.
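The x(t+1) recurrence in claim 3 matches the standard replicator-dynamics update used for dominant-set clustering; the sketch below is written under that assumption (the exact formula in the published text is not fully legible), with illustrative names throughout.

```python
import numpy as np

def replicator_dynamics(A, max_iter=1000, tol=1e-9):
    """Dominant-set style update: each component of x is re-weighted
    by its payoff (A x)_i relative to the total payoff x^T A x,
    stopping when the error ||x(t+1) - x(t)|| falls below tol."""
    n = A.shape[0]
    x = np.full(n, 1.0 / n)                  # uniform start on the simplex
    for _ in range(max_iter):
        x_new = x * (A @ x) / (x @ A @ x)    # component-wise update
        if np.linalg.norm(x_new - x) < tol:  # error between t+1 and t
            return x_new
        x = x_new
    return x
```

On a similarity matrix with a tight cluster of mutually similar images and one isolated image, the weight of the isolated image is driven to zero while the vector stays on the probability simplex.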
4. The distributed multi-view stereo reconstruction method for large-scale aerial images according to claim 1, characterized in that: in step S3, a stereo matching method based on image patches is used to calculate a corresponding high-quality depth map for each image; the detailed calculation steps are as follows:
S3.1, perform disparity propagation with an odd-even iteration strategy comprising spatial propagation, view propagation and temporal propagation;
S3.2, in the odd and even iterations of spatial propagation, the propagation starts from the top-left corner and the bottom-right corner respectively; for each pixel, compare the cost of its own disparity plane with those of its left and right neighbouring pixels, i.e. calculate the matching costs of the top-left/bottom-right disparities and of the current disparity value;
S3.3, finally, take the disparity value with the minimum matching cost as the optimal disparity value;
S3.4, iterate steps S3.1 to S3.3 until the optimal disparity values of all pixels are calculated, obtaining the optimal depth image.
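A toy one-scanline version of the odd/even propagation of steps S3.1 to S3.4 can be sketched as follows. Real PatchMatch-style methods propagate disparity planes in 2D and across views; everything here, including the precomputed cost table, is illustrative.

```python
import numpy as np

def spatial_propagation(cost, disp, iters=2):
    """cost[p, d] is the matching cost of pixel p at disparity d on one
    scanline.  Odd iterations sweep left-to-right, even iterations
    right-to-left; each pixel adopts its neighbour's disparity when
    that disparity has the lower matching cost."""
    n = cost.shape[0]
    for it in range(iters):
        if it % 2 == 0:                      # sweep from the "top-left"
            rng, step = range(1, n), -1
        else:                                # sweep from the "bottom-right"
            rng, step = range(n - 2, -1, -1), 1
        for p in rng:
            cand = disp[p + step]            # neighbour's disparity plane
            if cost[p, cand] < cost[p, disp[p]]:
                disp[p] = cand               # keep the cheaper disparity
    return disp
```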
5. The distributed multi-view stereo reconstruction method for large-scale aerial images according to claim 1, characterized in that: in step S4, two optimal initial fusion images are selected for each sub-region with a multiple-constraint method; the detailed calculation steps are as follows:
S4.1, with the homography-matrix constraint, calculate the feature matching points that satisfy the homography constraint between any two of the input images, recorded as M_H;
S4.2, according to the fundamental-matrix constraint relation, calculate the feature matching points that satisfy it between any two of the input images, recorded as M_F;
S4.3, according to the essential-matrix constraint relation, calculate the feature matching points that satisfy it between any two of the input images, recorded as M_E;
Step S4.4, take the intersection of the matching points M_H, M_F and M_E to obtain the final matching points M = M_H ∩ M_F ∩ M_E;
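Steps S4.1 to S4.4 amount to three inlier tests followed by a set intersection. The fragment below sketches the homography check of S4.1 and the intersection of S4.4; the fundamental- and essential-matrix checks would be analogous. All names, the threshold, and the given H are assumptions, and the estimation of H itself (e.g. by RANSAC) is omitted.

```python
import numpy as np

def homography_inliers(pts1, pts2, H, thresh=2.0):
    """Keep the matches whose transfer error under the homography H
    is below `thresh` pixels; returns the surviving match indices."""
    ones = np.ones((len(pts1), 1))
    proj = (H @ np.hstack([pts1, ones]).T).T   # map pts1 through H
    proj = proj[:, :2] / proj[:, 2:3]          # back to inhomogeneous
    err = np.linalg.norm(proj - pts2, axis=1)
    return {i for i in range(len(pts1)) if err[i] < thresh}

def multi_constraint_matches(m_h, m_f, m_e):
    """Step S4.4: intersect the three inlier sets."""
    return m_h & m_f & m_e
```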
6. The distributed multi-view stereo reconstruction method for large-scale aerial images according to claim 1, characterized in that: in step S5, the normal vector information of the depth images is fully utilized to fuse the depth images within a region into a whole, obtaining the dense point cloud model corresponding to the images of the region; the detailed steps are as follows:
S5.1, calculate the confidence of each vertex of the depth images to be fused;
S5.2, according to the confidence of each vertex, delete the redundant overlapping points from the depth images to be fused, obtaining the topological information of each region image in each depth image;
S5.3, weight the vertices of the depth images according to the topological information to obtain the geometric information of the images;
S5.4, stitch the region images according to the topological information and the geometric information, thereby obtaining the dense point cloud model of the corresponding region.
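The confidence-weighted removal of redundant overlapping points in steps S5.1 to S5.3 can be illustrated with a simple voxel merge. This is a stand-in only, since the claim's topology-based stitching is more involved; the voxel size and all names are assumptions.

```python
import numpy as np

def fuse_depth_points(points, confidences, voxel=0.05):
    """Quantise back-projected depth-map vertices into voxels and keep
    one confidence-weighted average point per voxel, which removes the
    redundant overlapping points between neighbouring depth maps."""
    keys = np.floor(points / voxel).astype(int)
    merged = {}
    for key, p, c in zip(map(tuple, keys), points, confidences):
        acc = merged.setdefault(key, [np.zeros(3), 0.0])
        acc[0] += c * p                  # confidence-weighted sum
        acc[1] += c                      # total confidence in the voxel
    return np.array([s / w for s, w in merged.values()])
```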
7. The distributed multi-view stereo reconstruction method for large-scale aerial images according to claim 1, characterized in that: in step S6, a global iterative closest point method is used to merge the multiple dense point cloud models with overlapping features into the dense point cloud model of the complete scene; the detailed steps are as follows:
Step S6.2, for the source point set P, find the corresponding point set Q in the target point cloud such that for each point p_i in P the corresponding point q_i in Q minimizes the distance ||q_i − p_i||;
Step S6.4, using the rotation matrix R and the translation vector t, apply the rotation and translation transformations to the points of the source point set P, and calculate the new point set P' = {R·p_i + t};
Step S6.5, calculate the average distance d between the point set P' and the corresponding point set Q, d = (1/n)·Σ_i ||p'_i − q_i||, where n represents the number of three-dimensional points in the point set.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310011438.9A CN115719407B (en) | 2023-01-05 | 2023-01-05 | Large-scale aerial image-oriented distributed multi-view three-dimensional reconstruction method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115719407A true CN115719407A (en) | 2023-02-28 |
CN115719407B CN115719407B (en) | 2023-06-27 |
Family
ID=85257835
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310011438.9A Active CN115719407B (en) | 2023-01-05 | 2023-01-05 | Large-scale aerial image-oriented distributed multi-view three-dimensional reconstruction method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115719407B (en) |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108416840A (en) * | 2018-03-14 | 2018-08-17 | 大连理工大学 | A kind of dense method for reconstructing of three-dimensional scenic based on monocular camera |
CN111968218A (en) * | 2020-07-21 | 2020-11-20 | 电子科技大学 | Three-dimensional reconstruction algorithm parallelization method based on GPU cluster |
CN112085845A (en) * | 2020-09-11 | 2020-12-15 | 中国人民解放军军事科学院国防科技创新研究院 | Outdoor scene rapid three-dimensional reconstruction device based on unmanned aerial vehicle image |
US20210183080A1 (en) * | 2019-12-13 | 2021-06-17 | Reconstruct Inc. | Interior photographic documentation of architectural and industrial environments using 360 panoramic videos |
CN113284227A (en) * | 2021-05-14 | 2021-08-20 | 安徽大学 | Distributed motion inference structure method for large-scale aerial images |
WO2021185322A1 (en) * | 2020-03-18 | 2021-09-23 | 广州极飞科技有限公司 | Image processing method and related device |
US20210358206A1 (en) * | 2020-05-14 | 2021-11-18 | Star Institute Of Intelligent Systems | Unmanned aerial vehicle navigation map construction system and method based on three-dimensional image reconstruction technology |
CN115205489A (en) * | 2022-06-06 | 2022-10-18 | 广州中思人工智能科技有限公司 | Three-dimensional reconstruction method, system and device in large scene |
Non-Patent Citations (5)
Title |
---|
BARBARA ROESSLE, ET AL.: "Dense Depth Priors for Neural Radiance Fields from Sparse Input Views", IEEE Xplore *
MINGWEI CAO, ET AL.: "Parallel surface reconstruction for large-scale scenes in the wild", Wiley *
JIANG HANQING; ZHAO CHANGFEI; ZHANG GUOFENG; WANG HUIYAN; BAO HUJUN: "3D reconstruction of natural scenes based on multi-view depth sampling", Journal of Computer-Aided Design & Computer Graphics, no. 10 *
ZHANG CONGYI ET AL.: "A fast global point cloud registration method with variable scale", Chinese Journal of Computers *
YAN SHEN ET AL.: "Research progress on 3D reconstruction techniques for large-scale outdoor images", Journal of Image and Graphics *
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116071504A (en) * | 2023-03-06 | 2023-05-05 | 安徽大学 | Multi-view three-dimensional reconstruction method for high-resolution image |
CN116805355A (en) * | 2023-08-25 | 2023-09-26 | 安徽大学 | Multi-view three-dimensional reconstruction method for resisting scene shielding |
CN116805355B (en) * | 2023-08-25 | 2023-11-21 | 安徽大学 | Multi-view three-dimensional reconstruction method for resisting scene shielding |
CN116993925A (en) * | 2023-09-25 | 2023-11-03 | 安徽大学 | Distributed bundling adjustment method for large-scale three-dimensional reconstruction |
CN116993925B (en) * | 2023-09-25 | 2023-12-01 | 安徽大学 | Distributed bundling adjustment method for large-scale three-dimensional reconstruction |
CN117408999A (en) * | 2023-12-13 | 2024-01-16 | 安格利(成都)仪器设备有限公司 | Method for automatically detecting corrosion pits of containers and pipelines by utilizing point cloud complement |
CN117408999B (en) * | 2023-12-13 | 2024-02-20 | 安格利(成都)仪器设备有限公司 | Method for automatically detecting corrosion pits of containers and pipelines by utilizing point cloud complement |
CN117437363A (en) * | 2023-12-20 | 2024-01-23 | 安徽大学 | Large-scale multi-view stereoscopic method based on depth perception iterator |
CN117437363B (en) * | 2023-12-20 | 2024-03-22 | 安徽大学 | Large-scale multi-view stereoscopic method based on depth perception iterator |
Also Published As
Publication number | Publication date |
---|---|
CN115719407B (en) | 2023-06-27 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||