CN115100535B - Satellite remote sensing image rapid reconstruction method and device based on affine camera model - Google Patents


Info

Publication number
CN115100535B
Authority
CN
China
Prior art keywords
image
affine
dimensional
subset
preset
Prior art date
Legal status
Active
Application number
CN202210173926.5A
Other languages
Chinese (zh)
Other versions
CN115100535A (en)
Inventor
董秋雷
陈豹
Current Assignee
Institute of Automation of Chinese Academy of Science
Original Assignee
Institute of Automation of Chinese Academy of Science
Priority date
Filing date
Publication date
Application filed by Institute of Automation of Chinese Academy of Science filed Critical Institute of Automation of Chinese Academy of Science
Priority to CN202210173926.5A
Publication of CN115100535A
Application granted
Publication of CN115100535B

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 - Scenes; Scene-specific elements
    • G06V 20/10 - Terrestrial scenes
    • G06V 20/13 - Satellite images
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 3/00 - Geometric image transformations in the plane of the image
    • G06T 3/02 - Affine transformations
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 - Image analysis
    • G06T 7/30 - Determination of transform parameters for the alignment of images, i.e. image registration
    • G06T 7/33 - Determination of transform parameters for the alignment of images, i.e. image registration using feature-based methods
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 - Image analysis
    • G06T 7/50 - Depth or shape recovery
    • G06T 7/55 - Depth or shape recovery from multiple images
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 - Arrangements for image or video recognition or understanding
    • G06V 10/40 - Extraction of image or video features
    • G06V 10/46 - Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V 10/462 - Salient features, e.g. scale invariant feature transforms [SIFT]
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 - Image acquisition modality
    • G06T 2207/10028 - Range image; Depth image; 3D point clouds
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 - Special algorithmic details
    • G06T 2207/20112 - Image segmentation details
    • G06T 2207/20132 - Image cropping
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T 2207/30 - Subject of image; Context of image processing
    • G06T 2207/30181 - Earth observation


Abstract

The invention provides a method and a device for rapid reconstruction of satellite remote sensing images based on an affine camera model, wherein the method comprises the following steps: classifying a remote sensing data set of a target scene to obtain N image subsets; cropping each image in each image subset into a group of preset-size image blocks with overlapping areas; respectively calculating the three-dimensional affine point cloud corresponding to each preset-size image block in each image subset; respectively calculating, based on those three-dimensional affine point clouds, the global affine motion matrix of the camera corresponding to each preset-size image block in each image subset under a unified global affine coordinate system; respectively obtaining the three-dimensional Euclidean structure corresponding to each image subset based on the global affine motion matrices and a preset number of ground control points; and obtaining the three-dimensional Euclidean structure of the target scene based on the three-dimensional Euclidean structures corresponding to the image subsets.

Description

Satellite remote sensing image rapid reconstruction method and device based on affine camera model
Technical Field
The invention relates to the technical field of affine imaging, in particular to a satellite remote sensing image rapid reconstruction method and device based on an affine camera model.
Background
With the rapid development of satellite imaging technology, three-dimensional reconstruction based on high-resolution optical remote sensing images has shown great application potential in fields such as city planning, geographical mapping, and positioning and navigation, and has attracted extensive attention from researchers.
Most existing satellite optical image reconstruction methods use a Rational Polynomial Camera (RPC) model to reconstruct the scene. As a general sensor model, an RPC model typically contains 78 polynomial coefficients and 10 normalization constants. Current reconstruction algorithms based on the RPC model first complete two-view reconstruction and then fuse the pairwise reconstruction results to complete multi-view reconstruction. The reconstruction process mainly comprises the following steps: (1) obtaining dense correspondences between images using disparity or optical flow; (2) optimizing the estimation residuals of the RPC models of the optical satellite images through bundle adjustment; and (3) triangulating based on the RPC model to recover the three-dimensional scene structure corresponding to the images. However, all of these reconstruction methods must solve the complex third-order polynomial coefficients of the RPC model, and estimating an RPC model requires at least 39 sets of ground control points together with their correspondences to the relevant images, so the reconstruction process consumes a large amount of computation time.
It is worth noting that a large number of three-dimensional scene reconstruction methods based on optical images have been proposed in the field of computer vision. However, these methods are usually only suitable for images captured by area-array cameras and cannot process images acquired by the push-broom line cameras mounted on satellites; that is, these computer-vision reconstruction methods cannot directly handle satellite-image three-dimensional reconstruction tasks. In view of this, researchers have proposed applying weak perspective projection models and affine models to the three-dimensional reconstruction of satellite remote sensing images, but these reconstruction methods all perform the reconstruction incrementally, which requires repeated bundle adjustment and therefore a large amount of computation time.
In summary, although many three-dimensional reconstruction methods based on optical satellite images have been proposed in recent years, enabling rapid development of the related technology, existing methods either need to repeatedly perform bundle adjustment optimization or need a large number of RPC model parameters to participate in the computation, which consumes a large amount of computation time. How to reduce the three-dimensional reconstruction time while maintaining the scene reconstruction accuracy has therefore become an urgent problem to be solved.
Disclosure of Invention
The invention provides a method and a device for rapid reconstruction of satellite remote sensing images based on an affine camera model, which overcome the defect that prior-art three-dimensional reconstruction of optical satellite images requires complicated and time-consuming optimization of model parameters, and which reduce the three-dimensional reconstruction time while maintaining the scene reconstruction accuracy.
The invention provides a satellite remote sensing image rapid reconstruction method based on an affine camera model, which comprises the following steps: classifying a remote sensing data set of a target scene to obtain N image subsets, wherein each image subset comprises remote sensing images of different visual angles in the same time period, and N is an integer greater than or equal to 1; cropping each image in each image subset into a group of preset-size image blocks with overlapping areas; respectively calculating the three-dimensional affine point cloud corresponding to each preset-size image block in each image subset; respectively calculating, based on the three-dimensional affine point cloud corresponding to each preset-size image block in each image subset, the global affine motion matrix of the camera corresponding to each preset-size image block under a unified global affine coordinate system; respectively obtaining the three-dimensional Euclidean structure corresponding to each image subset based on the global affine motion matrices and a preset number of ground control points; and obtaining the three-dimensional Euclidean structure of the target scene based on the three-dimensional Euclidean structures corresponding to the image subsets.
According to the method for rapid reconstruction of satellite remote sensing images based on an affine camera model provided by the invention, respectively calculating the three-dimensional affine point cloud corresponding to each preset-size image block in each image subset comprises: extracting the key points of each preset-size image block with the scale-invariant feature transform (SIFT) algorithm; determining a plurality of mutually matched image pairs based on the correspondences of the key points between different preset-size image blocks; and solving the three-dimensional affine point cloud corresponding to each preset-size image block in each image pair through a factorization algorithm from multi-view geometry.
According to the method for rapid reconstruction of satellite remote sensing images based on an affine camera model provided by the invention, respectively calculating the global affine motion matrix of the camera corresponding to each preset-size image block in each image subset under a unified global affine coordinate system, based on the three-dimensional affine point clouds corresponding to those image blocks, comprises: constructing an undirected graph according to the overlapping or common areas between the preset-size image blocks, wherein each node of the undirected graph represents one image pair, and an edge of the undirected graph indicates that an overlapping or common area exists between the image pairs corresponding to the two nodes it connects; according to the undirected graph, solving the absolute affine transformation matrix corresponding to each node by constructing a system of equations; registering the three-dimensional affine point clouds corresponding to the preset-size image blocks under a unified global affine coordinate system according to the absolute affine transformation matrices, to obtain a sparse affine structure of the ground scene corresponding to the image subset; calculating an initial global affine motion matrix of the camera corresponding to each preset-size image block from the projection relation between the sparse affine point cloud of the sparse affine structure and the preset-size image blocks; and optimizing the sparse affine point cloud and the initial global affine motion matrices via bundle adjustment to obtain the global affine motion matrix corresponding to each preset-size image block.
According to the method for rapid reconstruction of satellite remote sensing images based on an affine camera model provided by the invention, respectively obtaining the three-dimensional Euclidean structure corresponding to each image subset based on the global affine motion matrices and a preset number of ground control points comprises: obtaining the dense matching relations between all the preset-size image blocks in the image subset using existing affine stereo rectification and stereo matching algorithms; obtaining, by affine triangulation, the dense affine point cloud of the ground scene corresponding to the image subset, according to the dense matching relations among all the preset-size image blocks and the global affine motion matrix corresponding to each block; and upgrading the dense affine point cloud to the three-dimensional Euclidean structure by computing a global affine transformation matrix.
According to the method for rapid reconstruction of satellite remote sensing images based on an affine camera model provided by the invention, obtaining the three-dimensional Euclidean structure of the target scene based on the three-dimensional Euclidean structures corresponding to the image subsets comprises: generating a ground elevation map from the three-dimensional Euclidean structure corresponding to each image subset; and registering and fusing all the ground elevation maps to obtain the three-dimensional Euclidean structure of the target scene.
The invention also provides a satellite remote sensing image rapid reconstruction device based on the affine camera model, which comprises the following components:
the image classification module is used for carrying out image classification on the remote sensing data set of the target scene to obtain N image subsets; each image subset comprises remote sensing images with different view angles in the same time period;
the image cutting module is used for cutting each image in each image subset into a group of image blocks with preset sizes and overlapping areas;
the first calculation module is used for calculating three-dimensional affine point clouds corresponding to image blocks with preset sizes in the image subsets respectively;
the second calculation module is used for respectively calculating a global affine motion matrix of a camera corresponding to each image block with the preset size in each image subset under a unified global affine coordinate system based on the three-dimensional affine point cloud corresponding to each image block with the preset size in each image subset;
the first reconstruction module is used for respectively obtaining a three-dimensional Euclidean structure corresponding to each image subset based on the global affine motion matrix and a preset number of ground control points;
and the second reconstruction module is used for obtaining the three-dimensional Euclidean structure of the target scene based on the three-dimensional Euclidean structure corresponding to each image subset.
According to the device for rapid reconstruction of satellite remote sensing images based on an affine camera model provided by the invention, respectively calculating the three-dimensional affine point cloud corresponding to each preset-size image block in each image subset comprises: extracting the key points of each preset-size image block with the scale-invariant feature transform (SIFT) algorithm; determining a plurality of mutually matched image pairs based on the correspondences of the key points between different preset-size image blocks; and solving the three-dimensional affine point cloud corresponding to each preset-size image block in each image pair through a factorization algorithm from multi-view geometry.
According to the device for rapid reconstruction of satellite remote sensing images based on an affine camera model provided by the invention, respectively calculating the global affine motion matrix of the camera corresponding to each preset-size image block in each image subset under a unified global affine coordinate system, based on the three-dimensional affine point clouds corresponding to those image blocks, comprises: constructing an undirected graph according to the overlapping or common areas between the preset-size image blocks, wherein each node of the undirected graph represents one image pair, and an edge of the undirected graph indicates that an overlapping or common area exists between the image pairs corresponding to the two nodes it connects; according to the undirected graph, solving the absolute affine transformation matrix corresponding to each node by constructing a system of equations; registering the three-dimensional affine point clouds corresponding to the preset-size image blocks under a unified global affine coordinate system according to the absolute affine transformation matrices, to obtain a sparse affine structure of the ground scene corresponding to the image subset; calculating an initial global affine motion matrix of the camera corresponding to each preset-size image block from the projection relation between the sparse affine point cloud of the sparse affine structure and the preset-size image blocks; and optimizing the sparse affine point cloud and the initial global affine motion matrices via bundle adjustment to obtain the global affine motion matrix corresponding to each preset-size image block.
The invention also provides electronic equipment which comprises a memory, a processor and a computer program which is stored on the memory and can run on the processor, wherein when the processor executes the program, the processor realizes the method for quickly reconstructing the satellite remote sensing image based on the affine camera model.
The present invention also provides a non-transitory computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, implements the method for quickly reconstructing a satellite remote sensing image based on an affine camera model as described in any one of the above.
The invention also provides a computer program product comprising a computer program, wherein the computer program is used for realizing the method for quickly reconstructing the satellite remote sensing image based on the affine camera model.
According to the method and device for rapid reconstruction of satellite remote sensing images based on an affine camera model provided by the invention, a plurality of image subsets are obtained by classifying a satellite remote sensing image data set of a target scene; each image in each image subset is cropped to obtain a group of preset-size image blocks with overlapping areas; the three-dimensional Euclidean structure corresponding to each image subset is then computed from all the preset-size image blocks in that subset, and finally the three-dimensional Euclidean structure of the target scene is obtained. The method does not need to perform repeated bundle adjustment optimization, has strong robustness, can reduce the three-dimensional reconstruction time of satellite remote sensing images while guaranteeing completeness and accuracy, quickly obtains the Euclidean structure of the target scene, and thus effectively improves the reconstruction efficiency of remote sensing images.
Drawings
In order to more clearly illustrate the technical solutions of the present invention or the prior art, the drawings needed for the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.
FIG. 1 is a schematic flow chart of a method for rapidly reconstructing a satellite remote sensing image based on an affine camera model, provided by the invention;
FIG. 2 is a schematic diagram of a method for cropping an image according to the present invention;
FIG. 3 is a schematic structural diagram of a device for rapidly reconstructing a satellite remote sensing image based on an affine camera model, provided by the invention;
fig. 4 is a schematic structural diagram of an electronic device provided in the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is obvious that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
FIG. 1 is a schematic flow chart of a method for quickly reconstructing a satellite remote sensing image based on an affine camera model provided by the invention. As shown in fig. 1, the method includes:
step 101, performing image classification on a remote sensing data set of a target scene to obtain N image subsets.
And each image subset comprises remote sensing images with different visual angles in the same time period, and N is an integer greater than or equal to 1.
It should be noted that the three-dimensional reconstruction accuracy of paired optical remote sensing images is easily affected by factors such as season, weather, illumination, and shadow. In order to improve the reconstruction quality from multiple optical remote sensing images, the embodiment of the invention classifies the remote sensing data set according to image capture time to obtain multiple image subsets; for example, remote sensing images whose capture times fall within one week are classified into the same class.
It is understood that the remote sensing data set is a multi-view image set of a target scene in different shooting time periods, each image subset is a multi-view image set of the target scene in the same shooting time period, and different images in each image subset are shot from different views.
It can be understood that the electronic device classifies the remote sensing data set of the target scene according to the shooting time period, and then obtains at least one image subset.
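As a concrete illustration of this classification step, the image subsets can be formed by greedily grouping images whose capture dates fall within the same short window (one week in the example above). The sketch below is illustrative only; the file names, dates, and the `group_by_capture_time` helper are assumptions, not part of the patent:

```python
from datetime import date

def group_by_capture_time(images, max_span_days=7):
    """Greedily group (name, capture_date) pairs into subsets whose capture
    dates all fall within max_span_days of the subset's first image.
    An illustrative stand-in for classifying the remote sensing data set
    by shooting time period."""
    subsets = []
    for name, day in sorted(images, key=lambda item: item[1]):
        if subsets and (day - subsets[-1][0][1]).days <= max_span_days:
            subsets[-1].append((name, day))
        else:
            subsets.append([(name, day)])
    return subsets

imgs = [("a.tif", date(2021, 5, 1)), ("b.tif", date(2021, 5, 3)),
        ("c.tif", date(2021, 5, 20)), ("d.tif", date(2021, 5, 22))]
subsets = group_by_capture_time(imgs)
# early-May images form one subset, late-May images another
```

A production pipeline would read the capture time from the image metadata rather than from a hand-built list.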
Step 102, each image in each image subset is cut into a group of image blocks with preset sizes and overlapping areas.
In one embodiment, each image is cropped into 1000 x 1000 image blocks, with a 500 x 1000 overlap area between horizontally adjacent blocks (and, correspondingly, a 1000 x 500 overlap between vertically adjacent blocks). For example, a 2000 x 2000 image may be cropped into 9 image blocks of size 1000 x 1000.
It should be noted that the size of the overlapping area and the size of the image block may be customized according to actual needs.
It can be understood that the size of each image block obtained after cropping is consistent, and the size of the image block is smaller than that of the image.
Fig. 2 is a schematic diagram of a method for cropping an image according to the present invention. As shown in fig. 2, an image is cropped into a set of image blocks having an overlapping area.
It will be appreciated that the electronic device crops each image in each subset of images separately, cropping each image into a set of small size image blocks having overlapping areas.
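The cropping step amounts to sliding a fixed-size window over each image with a stride smaller than the window, so that neighbouring blocks share an overlap. A minimal numpy sketch, assuming the 1000-pixel block and 500-pixel stride of the embodiment above (border pixels of images not aligned to the grid are simply dropped in this simplification):

```python
import numpy as np

def crop_overlapping_blocks(image, block=1000, stride=500):
    """Cut a 2-D image into block x block tiles whose top-left corners lie
    on a stride-spaced grid; horizontally adjacent tiles then share a
    (block - stride)-column overlap. Returns (top_left, tile) pairs."""
    h, w = image.shape[:2]
    tiles = []
    for y in range(0, h - block + 1, stride):
        for x in range(0, w - block + 1, stride):
            tiles.append(((y, x), image[y:y + block, x:x + block]))
    return tiles

img = np.zeros((2000, 2000), dtype=np.uint8)
tiles = crop_overlapping_blocks(img)
# a 2000 x 2000 image yields a 3 x 3 grid of 1000 x 1000 blocks, i.e. 9 tiles
```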
And 103, respectively calculating three-dimensional affine point clouds corresponding to image blocks with preset sizes in each image subset.
In geometry, an affine transformation maps one vector space to another by composing a linear transformation with a translation.
The local affine motion matrix and a group of three-dimensional affine point clouds respectively corresponding to the two image blocks can be calculated by utilizing the two image blocks.
Each image block in an image subset corresponds to a local affine motion matrix and a set of three-dimensional affine point clouds.
It can be understood that the electronic device respectively calculates the three-dimensional affine point cloud corresponding to each image block with the preset size in each image subset.
In one embodiment, step 103 comprises:
extracting key points of each image block with the preset size according to a scale-invariant feature detection algorithm SIFT;
determining a plurality of mutually matched image pairs based on the corresponding relation of each key point between image blocks with different preset sizes;
and solving the three-dimensional affine point cloud corresponding to each image block with the preset size in each image pair through a factorization algorithm in the multi-view geometry.
It should be noted that the scale-invariant feature transform (SIFT) is a local feature descriptor used in the field of image processing; it is scale-invariant and can detect key points in an image. The SIFT algorithm realizes feature matching mainly through three steps: 1) extracting key points; 2) attaching detailed local-feature information, i.e. descriptors, to the key points; 3) finding multiple pairs of mutually matched feature points (key points with feature vectors) through pairwise comparison, thereby establishing correspondences between the scenes.
It is understood that each image block can extract a plurality of keypoints, and two keypoints that match each other can be referred to as a set of corresponding points.
It should be noted that, when two image blocks are respectively image blocks at different views in the same image subset and corresponding points between the two image blocks exceed a preset number, the two image blocks form an image pair that matches each other.
In one embodiment, when the corresponding points between image blocks at two different views exceed 4 groups, the two image blocks constitute one image pair.
It should be understood that a common area exists between two image blocks constituting an image pair, and this common area refers to a scene that is partially identical on the images corresponding to the two image blocks.
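The pairing rule above can be illustrated with mutual nearest-neighbour matching of keypoint descriptors. The sketch below substitutes random synthetic 128-D vectors for real SIFT descriptors; `mutual_matches` and `forms_image_pair` are hypothetical helper names, and a real pipeline would additionally apply a ratio test and geometric verification:

```python
import numpy as np

def mutual_matches(desc_a, desc_b):
    """Mutual nearest-neighbour matching of two descriptor sets (one row per
    descriptor, e.g. 128-D SIFT vectors). Returns (i, j) index pairs that
    are each other's closest match - a simplified stand-in for the pairwise
    comparison of key points described above."""
    d = np.linalg.norm(desc_a[:, None, :] - desc_b[None, :, :], axis=2)
    ab = d.argmin(axis=1)  # best match in b for each descriptor of a
    ba = d.argmin(axis=0)  # best match in a for each descriptor of b
    return [(i, j) for i, j in enumerate(ab) if ba[j] == i]

def forms_image_pair(desc_a, desc_b, min_points=4):
    """Two blocks form an image pair when their corresponding points exceed
    the preset number (4 groups in the embodiment above)."""
    return len(mutual_matches(desc_a, desc_b)) > min_points

rng = np.random.default_rng(0)
desc_a = rng.normal(size=(10, 128))
desc_b = desc_a + 0.01 * rng.normal(size=(10, 128))  # noisy copies: 10 true matches
```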
It can be understood that the three-dimensional affine point cloud corresponding to each image block under the local affine coordinate system can be calculated through the factorization algorithm in the multi-view geometry.
The following explains the three-dimensional affine point cloud corresponding to the image block calculated by the factorization algorithm in detail by combining multiple sets of formulas:
first, two image blocks having a common area in any two views are referred to as a 1 st image block and a 2 nd image block. Assume that the two image blocks have n corresponding points, and let X i ∈R 3 Representing the non-homogeneous coordinates of the ith three-dimensional imitation point,
Figure BDA0003519677750000091
is shown at three-dimensional imitation point X i Non-homogeneous coordinates of key points corresponding to the kth (k =1, 2) image block; local affine motion matrix a k ∈R 2×4 By linear transformation of matrix M k ∈R 2×3 And a displacement vector t k ∈R 2 And (4) forming.
Second, the affine camera model can be described as:
Figure BDA0003519677750000101
wherein the content of the first and second substances,
Figure BDA0003519677750000102
representing three-dimensional imitation points X i Non-homogeneous coordinates, M, of the keypoints corresponding to the kth (k =1, 2) image block k Representing a linear variation matrix, X i Representing the non-homogeneous coordinate, t, of the ith three-dimensional imitation point k Representing a displacement vector.
The objective of considering affine reconstruction is to accurately obtain the camera matrix M k ,t k And three-dimensional imitation point X i Thereby enabling the image points to be re-projected
Figure BDA0003519677750000103
And a value measured at the position of the image point>
Figure BDA0003519677750000104
The error between is minimal and therefore, the minimization objective function is written as:
Figure BDA0003519677750000105
wherein the content of the first and second substances,
Figure BDA0003519677750000106
represents a measurement value for the position of the image point, and->
Figure BDA0003519677750000107
Representing reprojected image points, M k Representing a linear variation matrix, X i Representing the non-homogeneous coordinate, t, of the ith three-dimensional imitation point k Representing a displacement vector.
To solve such minimization problems, it is common practice to choose the centroid of these points as the coordinate system origin, thereby eliminating the translation vector t. Since the projection mode of the affine camera is parallel projection, the centroids of the three-dimensional points can be mapped to the centroids of the projected images, the centroids of the image points can be used as the origin of the world coordinate system, that is, the image point coordinates are centered, and further, t =0 can be assumed in the new coordinate system, in this case, the minimization objective function can be written as:
Figure BDA0003519677750000108
equation (3) is expressed using a matrix:
Figure BDA0003519677750000111
wherein the content of the first and second substances,
Figure BDA0003519677750000112
the 4 xn matrix formed by centralizing n groups of matching points under two view angles is represented as follows:
Figure BDA0003519677750000113
w denotes passing through the camera matrix M k And three-dimensional imitation point X i A 4 × n matrix constructed by the image point coordinates obtained by projection is:
Figure BDA0003519677750000114
wherein, M 1 ,M 2 Local affine motion matrix, X, representing two image blocks obtained after factorization 1 ,X 2 ...X 3 A set of three-dimensional affine point clouds computed representing the two image blocks.
Comparing equation (4) with equation (3), it can be found that minimizing the geometric error is equivalent to finding a rank-3 matrix W that best approximates \hat{W}. Such a matrix can be obtained by performing Singular Value Decomposition (SVD) on \hat{W} and truncating it to rank 3. Specifically, if the SVD of \hat{W} is \hat{W} = U D V^T, then W = U_{4×3} D_{3×3} V_{n×3}^T is the rank-3 matrix closest to \hat{W}, where U_{4×3} is composed of the first three columns of U, V_{n×3} is composed of the first three columns of V, and D_{3×3} is a diagonal matrix containing the first three singular values.
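The rank-3 truncation and factorization step above can be sketched as follows (a minimal illustration on synthetic data; the split of W into a camera part M and a structure part X follows one common convention, M = U_{4×3} and X = D_{3×3} V_{n×3}^T, and is defined only up to a 3×3 affine ambiguity):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic rank-3 measurement matrix W_hat (4 x n): two stacked 2x3 affine
# cameras observing n centered 3-D points, plus a little noise.
n = 50
M_true = rng.standard_normal((4, 3))
X_true = rng.standard_normal((3, n))
W_hat = M_true @ X_true + 1e-3 * rng.standard_normal((4, n))

# SVD and truncation to rank 3: by the Eckart-Young theorem this yields the
# rank-3 matrix W closest to W_hat in Frobenius norm.
U, s, Vt = np.linalg.svd(W_hat, full_matrices=False)
W = U[:, :3] @ np.diag(s[:3]) @ Vt[:3, :]

# One possible factorization of W (up to a 3x3 affine ambiguity):
M = U[:, :3]                      # rows 0-1: M_1, rows 2-3: M_2
X = np.diag(s[:3]) @ Vt[:3, :]    # recovered 3-D affine point cloud
```

Since the ambiguity is only affine, this factorization is sufficient for the affine reconstruction stage; the upgrade to a Euclidean structure is handled later with ground control points.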
It can be understood that the key points of each image block are extracted by using the SIFT algorithm, a plurality of image pairs are determined according to the corresponding relation between the key points, and finally the three-dimensional affine point cloud corresponding to each image block is calculated by using the factorization algorithm, so that data support is provided for the subsequent three-dimensional reconstruction.
And 104, respectively calculating to obtain a global affine motion matrix of the camera corresponding to each image block with the preset size in each image subset under a unified global affine coordinate system based on the three-dimensional affine point cloud corresponding to each image block with the preset size in each image subset.
It can be understood that the electronic device can calculate a global affine motion matrix of each image block in a uniform global affine coordinate system according to the three-dimensional affine point cloud corresponding to the image block in each image subset.
In one embodiment, step 104 includes:
constructing an undirected graph according to an overlapping area or a common area between the image blocks with preset sizes; each node of the undirected graph represents one image pair, and each edge of the undirected graph indicates that an overlapping region or a common region exists between the image pairs corresponding to the two nodes it connects;
according to the undirected graph, solving an absolute affine transformation matrix corresponding to each node in the undirected graph by constructing an equation;
registering three-dimensional affine point clouds corresponding to image blocks with preset sizes under a unified global affine coordinate system according to an absolute affine transformation matrix corresponding to each node to obtain a sparse affine structure of a ground scene corresponding to the image subset;
calculating an initial global affine motion matrix of a camera corresponding to each image block with a preset size by using a projection relation between the sparse affine point cloud with the sparse affine structure and the image block with the preset size;
and performing bundle adjustment optimization on the sparse affine point cloud and the initial global affine motion matrix to obtain the global affine motion matrix corresponding to each image block with the preset size.
It should be understood that there is an overlapping area between adjacent image blocks in the same view, and a common area exists between two image blocks in different views when the number of corresponding points between the two image blocks exceeds a preset threshold.
The global affine coordinate system is a coordinate system corresponding to sparse affine point clouds that transform all three-dimensional affine point clouds into a ground scene corresponding to the image subset.
Based on the three-dimensional affine point clouds corresponding to the image blocks, a global affine motion estimation algorithm based on local point clouds is mainly used in the process of obtaining a global affine motion matrix of the camera corresponding to the image blocks through calculation under a unified global affine coordinate system, and the algorithm comprises two steps of relative affine transformation estimation between the local point clouds and global affine camera motion matrix estimation.
The estimation of the relative affine transformation between local point clouds is explained in detail below in conjunction with a set of formulas, assuming that at least 4 groups of corresponding points exist between the image pairs:
Firstly, both SIFT matches between small image blocks under different viewing angles and repeated SIFT keypoints in the overlapping regions of adjacent small image blocks under the same viewing angle can provide sufficient constraints for solving the relative affine transformation between locally reconstructed affine point clouds.
Secondly, given any pair of local point clouds A and B with a common area, let the affine matrix H_{mn} ∈ R^{4×4} represent the affine transformation relation between A and B. Using the indices of the corresponding SIFT keypoints between A and B, the corresponding points in the two groups of local point clouds are obtained and denoted X_m and X_n respectively. Ideally, these corresponding points satisfy the following equation:

H_{mn} X_m = X_n

H_{mn} can be obtained from any 4 groups of corresponding points. To account for outliers, the RANSAC algorithm can be used to compute the maximum consensus set satisfying the equation, after which the optimized relative affine transformation H_{mn} is obtained by the least squares method.
It should be noted that a set of relative affine transformation matrices between local point clouds can be obtained through estimating relative affine changes between the local point clouds.
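A minimal sketch of this RANSAC-plus-least-squares estimation (synthetic data; points are represented as homogeneous 4-vectors, the last row of H is fixed to [0, 0, 0, 1], and the iteration count and inlier threshold are illustrative choices, not values prescribed by the method):

```python
import numpy as np

def fit_affine(Xm, Xn):
    """Least-squares 4x4 affine transform H with H @ Xm ≈ Xn.
    Xm, Xn: (4, k) homogeneous point clouds (last coordinate 1), k >= 4."""
    A, *_ = np.linalg.lstsq(Xm.T, Xn[:3].T, rcond=None)   # top 3 rows of H
    return np.vstack([A.T, [0.0, 0.0, 0.0, 1.0]])

def ransac_affine(Xm, Xn, n_iters=200, thresh=0.05, seed=0):
    rng = np.random.default_rng(seed)
    n = Xm.shape[1]
    best = np.zeros(n, dtype=bool)
    for _ in range(n_iters):
        idx = rng.choice(n, size=4, replace=False)        # minimal sample
        H = fit_affine(Xm[:, idx], Xn[:, idx])
        inliers = np.linalg.norm((H @ Xm - Xn)[:3], axis=0) < thresh
        if inliers.sum() > best.sum():
            best = inliers
    # Refine on the maximum consensus set by least squares.
    return fit_affine(Xm[:, best], Xn[:, best]), best

rng = np.random.default_rng(1)
H_true = np.vstack([rng.standard_normal((3, 4)), [0, 0, 0, 1]])
Xm = np.vstack([rng.standard_normal((3, 100)), np.ones(100)])
Xn = H_true @ Xm
Xn[:3, :20] += 5.0                                        # 20 gross outliers
H_est, inliers = ransac_affine(Xm, Xn)
```

The minimal sample size of 4 matches the statement above that any 4 groups of corresponding points determine H_{mn}.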
The following explains the global affine camera motion matrix estimation in detail with reference to multiple sets of formulas:
First, an undirected graph G = (V, E) is constructed with all image pairs in the same image subset, where V denotes the set of nodes and E denotes the set of edges. Each node v_m ∈ V in the undirected graph G represents one image pair, and if the image pairs corresponding to nodes m and n have an overlapping region or a common region, an edge e_{mn} ∈ E exists between nodes m and n.

Secondly, assume that the absolute affine transformation matrices of the three-dimensional affine point clouds corresponding to nodes m and n are denoted H_m and H_n respectively, and that the relative affine transformation between nodes m and n is denoted H_{mn}; then they satisfy:

$$ H_{mn} H_m = H_n \tag{7} $$

Assuming that the graph G has N nodes and that relative affine transformations exist between M pairs of nodes, equation (7) can be expressed in matrix form, where for each of the M relative transformations H_{mn} the block row of the left-hand matrix contains H_{mn} in its m-th 4×4 block column, and the block row of the right-hand matrix contains I in its n-th block column:

$$ \begin{bmatrix} \cdots & H_{mn} & \cdots & 0 & \cdots \\ & & \vdots & & \end{bmatrix} \begin{bmatrix} H_1 \\ H_2 \\ \vdots \\ H_N \end{bmatrix} = \begin{bmatrix} \cdots & 0 & \cdots & I & \cdots \\ & & \vdots & & \end{bmatrix} \begin{bmatrix} H_1 \\ H_2 \\ \vdots \\ H_N \end{bmatrix} \tag{8} $$
wherein I represents a 4×4 identity matrix and 0 represents a 4×4 zero matrix. Equation (8) can further be expressed as:

$$ P Y = Q Y \tag{9} $$

wherein P ∈ R^{4M×4N} is a sparse matrix formed by the blocks H_{mn} and 0, Q ∈ R^{4M×4N} is a sparse matrix constructed from the blocks I and 0, and Y = [H_1^T, H_2^T, …, H_N^T]^T denotes the stacked absolute affine transformation matrices. However, since equation (9) generally does not hold exactly due to the presence of noise, it is converted into:

$$ \min_Y \left\| S Y \right\|_F^2 \tag{10} $$

wherein S = P − Q. By optimizing the Frobenius norm of equation (10) subject to a constraint that excludes the trivial solution Y = 0, the minimization problem is converted into a constrained least squares problem, whose solution gives the absolute affine transformation matrix corresponding to each node in graph G.
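As an illustrative sketch (not the patent's exact formulation), the gauge freedom can be removed by fixing the first node's transform to the identity instead of imposing a norm constraint; the remaining absolute transforms then follow from an ordinary least squares solve of the stacked block equations:

```python
import numpy as np

def absolute_from_relative(n_nodes, rel):
    """Recover absolute 4x4 affine transforms H_1..H_N from relative
    transforms rel = {(m, n): H_mn} satisfying H_mn @ H_m = H_n.
    Gauge freedom is removed by fixing H_1 = I (node index 0)."""
    N = n_nodes
    rows, rhs = [], []
    for (m, n), H_mn in rel.items():
        block = np.zeros((4, 4 * (N - 1)))   # unknowns: H_2..H_N stacked
        const = np.zeros((4, 4))
        if m == 0:
            const += H_mn                    # H_mn @ H_1 with H_1 = I
        else:
            block[:, 4*(m-1):4*m] = H_mn
        if n == 0:
            const -= np.eye(4)
        else:
            block[:, 4*(n-1):4*n] -= np.eye(4)
        rows.append(block)
        rhs.append(-const)                   # block @ Y = -const
    Y, *_ = np.linalg.lstsq(np.vstack(rows), np.vstack(rhs), rcond=None)
    return [np.eye(4)] + [Y[4*i:4*i+4] for i in range(N - 1)]

rng = np.random.default_rng(2)
H_true = [np.eye(4)] + [np.vstack([rng.standard_normal((3, 4)), [0, 0, 0, 1]])
                        for _ in range(3)]
rel = {(m, n): H_true[n] @ np.linalg.inv(H_true[m])     # consistent H_mn
       for (m, n) in [(0, 1), (1, 2), (2, 3), (0, 3)]}
H_est = absolute_from_relative(4, rel)
```

The redundant edge (0, 3) makes the system overdetermined, as in equation (10); with noisy relative transforms the same least squares solve averages the inconsistency over all edges.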
Suppose the three-dimensional affine point cloud corresponding to node i in the undirected graph G is X^{(i)} and the corresponding absolute affine transformation matrix is H_i; this group of point clouds can then be transformed into a common coordinate system through H_i X^{(i)}. The three-dimensional affine point clouds corresponding to all nodes in the undirected graph G are affine-transformed in this manner, so that the three-dimensional affine point clouds corresponding to the image blocks of the preset size are registered to a unified global affine coordinate system, thereby forming the sparse affine structure of the ground scene corresponding to the image subset.
And finally, calculating the initial global affine motion matrix of the camera corresponding to each image block by using the projection relation between the sparse affine point cloud in the sparse affine structure and the image blocks. This projection relation can be represented as:

$$ \tilde{x}_{ik} = \tilde{M}_k X_i \tag{11} $$

wherein the homogeneous coordinates of the i-th three-dimensional affine point corresponding to image block I_k are denoted X_i, the non-homogeneous coordinates of the corresponding image feature point are denoted \tilde{x}_{ik}, and the initial global affine motion matrix is denoted \tilde{M}_k. Since \tilde{M}_k has 8 degrees of freedom, only 4 groups of 2D-3D correspondences are needed to solve for it linearly.
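This linear solve can be sketched as follows (synthetic data; each 2D-3D correspondence contributes two linear equations in the 8 entries of the 2x4 matrix, so 4 correspondences suffice):

```python
import numpy as np

def solve_affine_camera(X_h, x):
    """Linearly solve a 2x4 affine camera matrix M with x ≈ M @ X_h.
    X_h: (4, n) homogeneous 3-D points (last row 1); x: (2, n) image
    points; n >= 4 correspondences determine the 8 unknowns."""
    M_T, *_ = np.linalg.lstsq(X_h.T, x.T, rcond=None)
    return M_T.T

rng = np.random.default_rng(3)
M_true = rng.standard_normal((2, 4))                        # 8 unknowns
X_h = np.vstack([rng.standard_normal((3, 4)), np.ones(4)])  # minimal: 4 points
x = M_true @ X_h
M_est = solve_affine_camera(X_h, x)
```

With more than 4 correspondences the same call returns the least-squares estimate, which is the natural choice for noisy sparse points.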
It should be noted that bundle adjustment can perform nonlinear optimization on the three-dimensional points and the camera poses at the same time.
In order to refine the sparse scene structure and solve for a more accurate global affine motion matrix, Bundle Adjustment (BA) optimization may be performed jointly on the solved three-dimensional affine points and the initial global affine motion matrices, where the optimization function may be expressed as:

$$ \min_{\tilde{M}_k, X_i} \sum_{i,k} \delta_{ik} \left\| \tilde{x}_{ik} - \tilde{M}_k X_i \right\|^2 \tag{12} $$

wherein \tilde{x}_{ik} represents the image point coordinates in the k-th image block corresponding to the three-dimensional affine point X_i, and \delta_{ik} represents the projection relation between the three-dimensional point and the image block. Because each image block in the image subset corresponds only to part of all the point clouds in the structure, \delta_{ik} is 1 when the camera corresponding to the k-th image block can observe the three-dimensional affine point X_i, and otherwise \delta_{ik} is 0. Both \tilde{M}_k and X_i need to be optimized, that is, the global affine motion matrices and the three-dimensional affine points are the variables to be optimized. The Levenberg-Marquardt algorithm in the Ceres Solver library can be used to solve the nonlinear problem of equation (12).
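The patent solves equation (12) with the Levenberg-Marquardt implementation in the Ceres Solver C++ library; purely as an illustrative stand-in, the same joint optimization over affine cameras and points can be sketched with SciPy's least_squares on a small synthetic problem:

```python
import numpy as np
from scipy.optimize import least_squares

def residuals(params, n_cams, n_pts, obs):
    """Reprojection residuals; obs lists (k, i, xy) pairs with delta_ik = 1."""
    M = params[:n_cams * 8].reshape(n_cams, 2, 4)   # 2x4 affine cameras
    X = params[n_cams * 8:].reshape(n_pts, 3)
    res = []
    for k, i, xy in obs:
        res.extend(M[k] @ np.append(X[i], 1.0) - xy)
    return np.asarray(res)

rng = np.random.default_rng(4)
n_cams, n_pts = 3, 15
M_true = rng.standard_normal((n_cams, 2, 4))
X_true = rng.standard_normal((n_pts, 3))
obs = [(k, i, M_true[k] @ np.append(X_true[i], 1.0))
       for k in range(n_cams) for i in range(n_pts)]

# BA refines an initial solution, so start from perturbed values.
x0 = np.concatenate([M_true.ravel(), X_true.ravel()])
x0 += 0.05 * rng.standard_normal(x0.shape)
sol = least_squares(residuals, x0, args=(n_cams, n_pts, obs))
```

A production solver would additionally exploit the sparsity of the Jacobian, which is what makes Ceres Solver practical at the scale of satellite image blocks.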
It is understood that the local affine motion matrices and the three-dimensional affine point clouds of the image blocks are registered in a unified global affine coordinate system. Specifically, an undirected graph is first constructed from the image pairs; the absolute affine transformation matrix corresponding to each node in the undirected graph is then solved by constructing an equation; the three-dimensional affine point clouds corresponding to the image blocks of the preset size are registered under a unified global affine coordinate system using the absolute affine transformation matrix corresponding to each node, yielding the sparse affine structure of the ground scene corresponding to the image subset; the initial global affine motion matrix of the camera corresponding to each image block is then calculated by using the projection relation between the sparse affine point cloud and the image blocks of the preset size; and finally the global affine motion matrix corresponding to each image block is obtained through bundle adjustment optimization. In this process, the global affine motion matrix corresponding to each image block can be obtained by solving the bundle adjustment optimization only once, without repeatedly executing it, which reduces the time of three-dimensional reconstruction and improves the reconstruction efficiency.
And 105, respectively obtaining a three-dimensional Euclidean structure corresponding to each image subset based on the global affine motion matrix and a preset number of ground control points.
It should be noted that the ground control point is an important data source for geometric correction and geographic positioning of the satellite remote sensing image, and is known accurate ground point information obtained through field measurement or through other channels.
It can be understood that, the electronic device can obtain the three-dimensional euclidean structure corresponding to an image subset through the global affine motion matrix of all image blocks in the image subset and the preset number of ground control points.
In one embodiment, step 105 comprises:
obtaining a dense matching relation between all the image blocks with the preset size in the image subset by utilizing an existing affine stereo correction algorithm and a stereo matching algorithm;
according to the dense matching relation among all the image blocks with the preset sizes and the global affine motion matrix corresponding to each image block with the preset sizes, calculating through an affine triangulation to obtain dense affine point clouds of the ground scene corresponding to the image subsets;
and for the dense affine point cloud, upgrading the dense affine point cloud into the three-dimensional Euclidean structure by calculating a global affine transformation matrix.
It can be understood that, for the image blocks in each image subset after classification, the existing affine stereo correction algorithm and stereo matching algorithm can be used to obtain the dense matching relationship between different image blocks.
It should be noted that the stereo correction is to make the epipolar lines of two image blocks parallel, so as to facilitate stereo matching, where stereo matching refers to performing pixel-by-pixel matching on different image blocks.
In one embodiment, the More Global Matching (MGM) algorithm may be used as the stereo matching algorithm.
It should be noted that dense matching is stereo matching based on regional correlation: the correlation value between the pixels in a small window of one image and the pixels in an equally sized window of the other image around each potentially corresponding feature is calculated; the window region with the largest correlation is taken as the corresponding region, and only corresponding regions satisfying the epipolar constraint are accepted as matching regions.
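A minimal sketch of such window-correlation matching, assuming already rectified image blocks so that the epipolar constraint restricts the search to the same row (normalized cross-correlation is used here as the correlation measure; the window size and search range are illustrative choices):

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation between two equally sized windows."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return (a * b).sum() / denom if denom > 0 else 0.0

def match_along_row(left, right, row, col, half=3, max_disp=20):
    """For pixel (row, col) of the rectified left image, search the same row
    of the right image (epipolar constraint) for the best-correlated window."""
    ref = left[row-half:row+half+1, col-half:col+half+1]
    best_c, best_s = col, -1.0
    for c in range(max(half, col - max_disp), min(right.shape[1] - half, col + 1)):
        cand = right[row-half:row+half+1, c-half:c+half+1]
        s = ncc(ref, cand)
        if s > best_s:
            best_s, best_c = s, c
    return best_c, best_s

rng = np.random.default_rng(5)
left = rng.random((40, 60))
right = np.roll(left, -5, axis=1)   # simulate a 5-pixel horizontal disparity
col, score = match_along_row(left, right, row=20, col=30)
```

Semi-global methods such as MGM add smoothness costs across multiple scanline directions on top of this per-pixel matching cost.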
Affine triangulation refers to calculating three-dimensional point clouds of local scenes corresponding to two image blocks through matching points between the two image blocks and an affine matrix corresponding to each image block; the local scene refers to a ground scene corresponding to each image block.
In one image subset, performing dense matching on any two image blocks under different viewing angles, performing affine triangulation on the two image blocks which are successfully subjected to dense matching to obtain three-dimensional point clouds of local scenes corresponding to the two image blocks, wherein the three-dimensional point clouds of the local scenes corresponding to all the successfully-matched image blocks form the dense affine point cloud of the ground scene corresponding to the image subset.
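Affine triangulation as described above can be sketched as a small linear solve (synthetic data; each of the two image points contributes two equations, giving an overdetermined 4x3 system in the point X):

```python
import numpy as np

def affine_triangulate(M1, M2, x1, x2):
    """Triangulate a 3-D point from matched image points x1, x2 (2-vectors)
    and 2x4 affine camera matrices M1, M2. The projection x_k = M_k @ [X; 1]
    is rewritten as the 4x3 linear system A @ X = b, solved by least squares."""
    A = np.vstack([M1[:, :3], M2[:, :3]])
    b = np.concatenate([x1 - M1[:, 3], x2 - M2[:, 3]])
    X, *_ = np.linalg.lstsq(A, b, rcond=None)
    return X

rng = np.random.default_rng(6)
M1, M2 = rng.standard_normal((2, 2, 4))
X_true = rng.standard_normal(3)
x1 = M1 @ np.append(X_true, 1.0)
x2 = M2 @ np.append(X_true, 1.0)
X_est = affine_triangulate(M1, M2, x1, x2)
```

Applying this solve to every dense match between two image blocks yields the local three-dimensional point cloud described above.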
It can be understood that a sparse affine point cloud contains a small number of widely spaced points, whereas a dense affine point cloud contains a larger number of densely distributed points; the three-dimensional Euclidean structure obtained from the dense point cloud is therefore more complete.
It should be noted that the affine structure and the Euclidean structure of the scene differ only by a single spatial affine transformation. This spatial affine transformation matrix, that is, the global affine transformation matrix, can be fitted through a preset number of ground control points, and the Euclidean structure of the ground scene corresponding to each image subset can then be obtained through the global affine transformation matrix.
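Fitting this upgrading transformation can be sketched as a least-squares problem over the ground control points (synthetic data; the transform is written as a 3x4 matrix acting on homogeneous coordinates, and at least 4 non-coplanar control points are assumed):

```python
import numpy as np

def fit_upgrade(affine_pts, gcp_pts):
    """Fit the 3x4 spatial affine transform T that upgrades the affine
    structure to a Euclidean one: gcp ≈ T @ [affine; 1].
    affine_pts, gcp_pts: (n, 3) arrays of n >= 4 ground control points."""
    n = affine_pts.shape[0]
    A_h = np.hstack([affine_pts, np.ones((n, 1))])   # homogeneous (n, 4)
    T_T, *_ = np.linalg.lstsq(A_h, gcp_pts, rcond=None)
    return T_T.T                                     # (3, 4)

rng = np.random.default_rng(7)
T_true = rng.standard_normal((3, 4))
P = rng.standard_normal((6, 3))                      # 6 GCPs in the affine frame
G = (T_true @ np.hstack([P, np.ones((6, 1))]).T).T   # their Euclidean coordinates
T_est = fit_upgrade(P, G)
```

Applying the fitted T to the whole dense affine point cloud then yields the three-dimensional Euclidean structure of the ground scene.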
It can be understood that the dense matching relationship between different image blocks in the same image subset is obtained through the existing affine stereo correction algorithm and stereo matching algorithm, and then the dense affine point cloud of the ground scene corresponding to the image subset is obtained, and then the dense affine point cloud is transformed into the three-dimensional Euclidean structure by using a global affine transformation matrix, so that the three-dimensional Euclidean structure of the ground scene corresponding to each image subset is obtained, the obtained three-dimensional Euclidean structure of each image subset is more complete and accurate, and more accurate data support is provided for the subsequent obtaining of the three-dimensional Euclidean structure of the target scene.
And 106, obtaining the three-dimensional Euclidean structure of the target scene based on the three-dimensional Euclidean structure corresponding to each image subset.

It can be understood that the electronic device can finally obtain the three-dimensional Euclidean structure of the target scene through the three-dimensional Euclidean structure corresponding to each image subset.
In one embodiment, step 106 includes:
generating a ground elevation map of a three-dimensional Euclidean structure corresponding to each image subset;
and registering and fusing all the ground elevation maps by using a ground elevation map fusion method to obtain the three-dimensional Euclidean structure of the target scene.
It should be understood that the ground elevation maps of the ground scene corresponding to the image subsets are registered, and the registered ground elevation maps are fused to obtain a complete three-dimensional Euclidean structure of the target scene.
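One simple fusion strategy for the registered elevation maps can be sketched as follows (the patent does not prescribe a specific fusion rule, so the per-cell median used here is an illustrative, outlier-robust assumption; NaN marks cells with no data):

```python
import numpy as np

def fuse_dems(dems):
    """Fuse registered ground elevation maps defined on the same grid by
    taking the per-cell median across maps, ignoring missing (NaN) cells."""
    return np.nanmedian(np.stack(dems), axis=0)

d1 = np.array([[1.0, np.nan], [3.0, 4.0]])
d2 = np.array([[1.2, 2.0], [np.nan, 40.0]])   # 40.0: a gross outlier
d3 = np.array([[0.8, 2.2], [3.2, 4.2]])
fused = fuse_dems([d1, d2, d3])
```

The median discards the outlier elevation of 40.0 while still filling cells that are missing in some maps, which matches the goal of a complete and accurate fused structure.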
The method has the advantages that the three-dimensional Euclidean structures of the target scene are obtained by registering and fusing the ground elevation maps of the three-dimensional Euclidean structures corresponding to the image subsets, so that the details of the target scene can be better reconstructed, and the completeness and the accuracy of reconstruction are guaranteed.
In this embodiment, a plurality of image subsets are obtained by classifying the satellite remote sensing image data set of a target scene; each image in each image subset is further cropped to obtain a group of preset-size image blocks with overlapping areas; the three-dimensional Euclidean structure corresponding to each image subset is finally calculated by using all the preset-size image blocks in that subset, and the three-dimensional Euclidean structure of the target scene is then obtained. The embodiment of the invention does not need to repeatedly execute bundle adjustment optimization, has strong robustness, can reduce the three-dimensional reconstruction time of the satellite remote sensing image on the premise of ensuring completeness and precision, quickly obtains the Euclidean structure of the target scene, and thus effectively improves the reconstruction efficiency of the remote sensing image.
The satellite remote sensing image rapid reconstruction device based on the affine camera model provided by the invention is described below, and the satellite remote sensing image rapid reconstruction device based on the affine camera model described below and the satellite remote sensing image rapid reconstruction method based on the affine camera model described above can be referred to correspondingly.
FIG. 3 is a schematic structural diagram of a device for rapidly reconstructing a satellite remote sensing image based on an affine camera model provided by the invention. As shown in fig. 3, the apparatus comprises an image classification module 301, an image cropping module 302, a first calculation module 303, a second calculation module 304, a first reconstruction module 305, and a second reconstruction module 306, wherein:
the image classification module 301 is configured to perform image classification on the remote sensing data set of the target scene to obtain N image subsets; each image subset comprises remote sensing images with different view angles in the same time period, and N is an integer greater than or equal to 1;
an image cropping module 302, configured to crop each image in each of the image subsets into a set of preset-size image blocks with an overlapping area;
the first calculating module 303 is configured to calculate a three-dimensional affine point cloud corresponding to each image block with a preset size in each image subset;
a second calculating module 304, configured to respectively calculate, based on the three-dimensional affine point cloud corresponding to each image block of a preset size in each image subset, a global affine motion matrix of a camera under a unified global affine coordinate system corresponding to each image block of a preset size in each image subset;
a first reconstruction module 305, configured to obtain a three-dimensional european structure corresponding to each image subset based on the global affine motion matrix and a preset number of ground control points;
and a second reconstruction module 306, configured to obtain a three-dimensional Euclidean structure of the target scene based on the three-dimensional Euclidean structure corresponding to each image subset.
The device for quickly reconstructing the satellite remote sensing image based on the affine camera model obtains a plurality of image subsets by classifying the satellite remote sensing image data set of a target scene, further crops each image in each image subset to obtain a group of preset-size image blocks with overlapping areas, finally calculates the three-dimensional Euclidean structure corresponding to each image subset by using all the preset-size image blocks in that subset, and then obtains the three-dimensional Euclidean structure of the target scene. The device does not need to repeatedly execute bundle adjustment optimization, has strong robustness, can reduce the three-dimensional reconstruction time of the satellite remote sensing image on the premise of ensuring completeness and precision, quickly obtains the Euclidean structure of the target scene, and thus effectively improves the reconstruction efficiency of the remote sensing image.
Optionally, the calculating a three-dimensional affine point cloud corresponding to each image block with a preset size in each image subset includes:
extracting key points of each image block with the preset size according to a scale-invariant feature detection algorithm SIFT;
determining a plurality of mutually matched image pairs based on the corresponding relation of each key point between image blocks with different preset sizes;
and solving the three-dimensional affine point cloud corresponding to each image block with the preset size in each image pair through a factorization algorithm in the multi-view geometry.
Optionally, the calculating, based on the three-dimensional affine point cloud corresponding to each preset-size image block in each image subset, a global affine motion matrix of a camera corresponding to each preset-size image block in each image subset under a unified global affine coordinate system includes:
constructing an undirected graph according to an overlapping area or a common area between the image blocks with preset sizes; each node of the undirected graph represents one image pair, and each edge of the undirected graph indicates that an overlapping region or a common region exists between the image pairs corresponding to the two nodes it connects;
according to the undirected graph, solving an absolute affine transformation matrix corresponding to each node in the undirected graph by constructing an equation;
registering three-dimensional affine point clouds corresponding to image blocks with preset sizes under a unified global affine coordinate system according to an absolute affine transformation matrix corresponding to each node to obtain a sparse affine structure of a ground scene corresponding to the image subset;
calculating an initial global affine motion matrix of a camera corresponding to each image block with the preset size by using the projection relation between the sparse affine point cloud with the sparse affine structure and the image blocks with the preset size;
and performing bundle adjustment optimization on the sparse affine point cloud and the initial global affine motion matrix to obtain the global affine motion matrix corresponding to each image block with the preset size.
Optionally, the obtaining, based on the global affine motion matrix and a preset number of ground control points, a three-dimensional Euclidean structure corresponding to each of the image subsets respectively includes:
obtaining a dense matching relation between all the image blocks with the preset size in the image subset by utilizing an existing affine stereo correction algorithm and a stereo matching algorithm;
according to the dense matching relation among all the image blocks with the preset size and the global affine motion matrix corresponding to each image block with the preset size, obtaining dense affine point clouds of the ground scene corresponding to the image subsets through affine triangulation calculation;
and for the dense affine point cloud, upgrading the dense affine point cloud into the three-dimensional Euclidean structure by calculating a global affine transformation matrix.
Optionally, the obtaining a three-dimensional Euclidean structure of the target scene based on the ground elevation map of the three-dimensional Euclidean structure corresponding to each image subset includes:
generating a ground elevation map of a three-dimensional Euclidean structure corresponding to each image subset;
and registering and fusing all the ground elevation maps by using a ground elevation map fusion method to obtain a three-dimensional Euclidean structure of the target scene.
It should be noted that, the apparatus for quickly reconstructing a satellite remote sensing image based on an affine camera model provided in the embodiment of the present invention can implement all the method steps implemented by the embodiment of the method for quickly reconstructing a satellite remote sensing image based on an affine camera model, and can achieve the same technical effects, and details of the same parts and beneficial effects as those of the embodiment of the method are not repeated herein.
Fig. 4 illustrates a physical structure diagram of an electronic device, which may include, as shown in fig. 4: a processor (processor) 410, a communication Interface 420, a memory (memory) 430 and a communication bus 440, wherein the processor 410, the communication Interface 420 and the memory 430 are communicated with each other via the communication bus 440. The processor 410 may call logic instructions in the memory 430 to perform a method for fast reconstruction of satellite remote sensing images based on an affine camera model, the method comprising: carrying out image classification on a remote sensing data set of a target scene to obtain N image subsets; each image subset comprises remote sensing images with different view angles in the same time period; cutting each image in each image subset into a group of image blocks with preset sizes and overlapping areas; respectively calculating three-dimensional affine point clouds corresponding to image blocks with preset sizes in each image subset; respectively calculating to obtain a global affine motion matrix of a camera corresponding to each preset size image block in each image subset under a uniform global affine coordinate system based on the three-dimensional affine point cloud corresponding to each preset size image block in each image subset; respectively obtaining a three-dimensional European structure corresponding to each image subset based on the global affine motion matrix and a preset number of ground control points; and obtaining the three-dimensional Euclidean structure of the target scene based on the three-dimensional Euclidean structure corresponding to each image subset.
In addition, the logic instructions in the memory 430 may be implemented in the form of software functional units and stored in a computer readable storage medium when the software functional units are sold or used as independent products. Based on such understanding, the technical solution of the present invention or a part thereof which substantially contributes to the prior art may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
The electronic device provided by the embodiment of the invention can realize each process of the affine camera model-based satellite remote sensing image rapid reconstruction method embodiment, can achieve the same technical effect, and is not repeated here to avoid repetition.
In another aspect, the present invention further provides a readable storage medium, where a program or an instruction is stored on the readable storage medium, and the program or the instruction, when executed by a processor, implements the processes of the above embodiment of the affine camera model based satellite remote sensing image fast reconstructing method, for example, including: carrying out image classification on a remote sensing data set of a target scene to obtain N image subsets; each image subset comprises remote sensing images with different view angles in the same time period; cutting each image in each image subset into a group of image blocks with preset sizes and overlapping areas; respectively calculating three-dimensional affine point clouds corresponding to image blocks with preset sizes in each image subset; respectively calculating to obtain a global affine motion matrix of a camera corresponding to each preset size image block in each image subset under a uniform global affine coordinate system based on the three-dimensional affine point cloud corresponding to each preset size image block in each image subset; respectively obtaining a three-dimensional European structure corresponding to each image subset based on the global affine motion matrix and a preset number of ground control points; and obtaining the three-dimensional Euclidean structure of the target scene based on the three-dimensional Euclidean structure corresponding to each image subset.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one position, or may be distributed on multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, and not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (6)

1. A satellite remote sensing image rapid reconstruction method based on an affine camera model is characterized by comprising the following steps:
carrying out image classification on a remote sensing data set of a target scene to obtain N image subsets; each image subset comprises remote sensing images with different view angles in the same time period, and N is an integer greater than or equal to 1;
cutting each image in each image subset into a group of preset-size image blocks with overlapping areas;
respectively calculating three-dimensional affine point clouds corresponding to image blocks with preset sizes in each image subset;
respectively calculating to obtain a global affine motion matrix of a camera corresponding to each preset size image block in each image subset under a uniform global affine coordinate system based on the three-dimensional affine point cloud corresponding to each preset size image block in each image subset;
respectively obtaining a three-dimensional Euclidean structure corresponding to each image subset based on the global affine motion matrix and a preset number of ground control points;
obtaining a three-dimensional Euclidean structure of the target scene based on the three-dimensional Euclidean structure corresponding to each image subset;
the calculating of the three-dimensional affine point cloud corresponding to each image block with the preset size in each image subset comprises the following steps:
extracting key points of each image block with the preset size by using the scale-invariant feature transform (SIFT) algorithm;
determining a plurality of mutually matched image pairs based on the corresponding relation of each key point between image blocks with different preset sizes;
solving a three-dimensional affine point cloud corresponding to each image block with the preset size in each image pair through a factorization algorithm in multi-view geometry;
the method for obtaining the global affine motion matrix of the camera corresponding to each preset size image block in each image subset under the unified global affine coordinate system through calculation based on the three-dimensional affine point cloud corresponding to each preset size image block in each image subset comprises the following steps:
constructing an undirected graph according to overlapping or common areas between the preset-size image blocks; each node of the undirected graph represents one image pair, and each edge of the undirected graph indicates that an overlapping or common area exists between the image pairs corresponding to the two nodes connected by that edge;
according to the undirected graph, solving an absolute affine transformation matrix corresponding to each node in the undirected graph by constructing an equation;
registering three-dimensional affine point clouds corresponding to image blocks with preset sizes under a unified global affine coordinate system according to an absolute affine transformation matrix corresponding to each node to obtain a sparse affine structure of a ground scene corresponding to the image subset;
calculating an initial global affine motion matrix of the camera corresponding to each image block with the preset size by using the projection relation between the sparse affine point cloud of the sparse affine structure and the image blocks with the preset size;
and performing bundle adjustment optimization on the sparse affine point cloud and the initial global affine motion matrices to obtain the global affine motion matrix corresponding to each image block with the preset size.
2. The affine camera model-based satellite remote sensing image fast reconstruction method according to claim 1, wherein the three-dimensional Euclidean structure corresponding to each image subset is respectively obtained based on the global affine motion matrix and a preset number of ground control points, and the method comprises the following steps:
obtaining a dense matching relation between all the image blocks with the preset size in the image subset by utilizing an existing affine stereo rectification algorithm and a stereo matching algorithm;
according to the dense matching relation among all the image blocks with the preset size and the global affine motion matrix corresponding to each image block with the preset size, obtaining a dense affine point cloud of the ground scene corresponding to the image subset through affine triangulation;
and upgrading the dense affine point cloud into the three-dimensional Euclidean structure by calculating a global affine transformation matrix.
3. The affine camera model-based satellite remote sensing image fast reconstruction method according to claim 1, wherein the obtaining of the three-dimensional Euclidean structure of the target scene based on the three-dimensional Euclidean structure corresponding to each of the image subsets comprises:
generating a ground elevation map of a three-dimensional Euclidean structure corresponding to each image subset;
and registering and fusing all the ground elevation maps by using a ground elevation map fusion method to obtain the three-dimensional Euclidean structure of the target scene.
4. A satellite remote sensing image fast reconstruction device based on an affine camera model comprises:
the image classification module is used for carrying out image classification on the remote sensing data set of the target scene to obtain N image subsets; each image subset comprises remote sensing images with different view angles in the same time period, and N is an integer greater than or equal to 1;
the image cutting module is used for cutting each image in each image subset into a group of preset-size image blocks with overlapping areas;
the first calculation module is used for calculating three-dimensional affine point clouds corresponding to image blocks with preset sizes in the image subsets respectively;
the second calculation module is used for respectively calculating a global affine motion matrix of a camera corresponding to each image block with the preset size in each image subset under a unified global affine coordinate system based on the three-dimensional affine point cloud corresponding to each image block with the preset size in each image subset;
the first reconstruction module is used for respectively obtaining a three-dimensional Euclidean structure corresponding to each image subset based on the global affine motion matrix and a preset number of ground control points;
the second reconstruction module is used for obtaining a three-dimensional Euclidean structure of the target scene based on the three-dimensional Euclidean structure corresponding to each image subset;
the calculating of the three-dimensional affine point cloud corresponding to each image block with the preset size in each image subset comprises the following steps:
extracting key points of each image block with the preset size by using the scale-invariant feature transform (SIFT) algorithm;
determining a plurality of mutually matched image pairs based on the corresponding relation of each key point between image blocks with different preset sizes;
solving a three-dimensional affine point cloud corresponding to each image block with the preset size in each image pair through a factorization algorithm in multi-view geometry;
the method for obtaining the global affine motion matrix of the camera corresponding to each preset size image block in each image subset under the unified global affine coordinate system through calculation based on the three-dimensional affine point cloud corresponding to each preset size image block in each image subset comprises the following steps:
constructing an undirected graph according to overlapping or common areas between the preset-size image blocks; each node of the undirected graph represents one image pair, and each edge of the undirected graph indicates that an overlapping or common area exists between the image pairs corresponding to the two nodes connected by that edge;
according to the undirected graph, solving an absolute affine transformation matrix corresponding to each node in the undirected graph by constructing an equation;
registering three-dimensional affine point clouds corresponding to image blocks with preset sizes under a unified global affine coordinate system according to an absolute affine transformation matrix corresponding to each node to obtain a sparse affine structure of a ground scene corresponding to the image subset;
calculating an initial global affine motion matrix of the camera corresponding to each image block with the preset size by using the projection relation between the sparse affine point cloud of the sparse affine structure and the image blocks with the preset size;
and performing bundle adjustment optimization on the sparse affine point cloud and the initial global affine motion matrices to obtain the global affine motion matrix corresponding to each image block with the preset size.
5. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the program, implements the affine camera model based satellite remote sensing image fast reconstruction method according to any one of claims 1 to 3.
6. A non-transitory computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the affine camera model based satellite remote sensing image fast reconstruction method according to any one of claims 1 to 3.
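The Euclidean upgrade recited in claim 2 — computing a global affine transformation from ground control points and applying it to the dense affine point cloud — can be sketched as a small least-squares estimate. All names and the synthetic data below are illustrative assumptions, not taken from the patent:

```python
import numpy as np

def affine_upgrade(cloud_affine, gcp_affine, gcp_world):
    """Estimate the global 3D affine transform (A, b) mapping
    affine-frame coordinates to Euclidean/world coordinates from
    at least 4 non-coplanar ground control points, then apply it
    to the whole point cloud. All point arrays are shaped (n, 3)."""
    n = gcp_affine.shape[0]
    # Homogeneous linear system: [x y z 1] @ T = world, T is (4, 3).
    P = np.hstack([gcp_affine, np.ones((n, 1))])
    T, *_ = np.linalg.lstsq(P, gcp_world, rcond=None)
    A, b = T[:3].T, T[3]
    return cloud_affine @ A.T + b, A, b

# Synthetic check: distort a known world cloud by a random affine map.
rng = np.random.default_rng(1)
A_true = rng.normal(size=(3, 3)) + 3 * np.eye(3)  # well-conditioned
b_true = rng.normal(size=3)
cloud_affine = rng.normal(size=(100, 3))
cloud_world = cloud_affine @ A_true.T + b_true
# Pretend the first 5 points are the surveyed ground control points.
upgraded, A, b = affine_upgrade(cloud_affine, cloud_affine[:5],
                                cloud_world[:5])
err = np.abs(upgraded - cloud_world).max()
```

With noiseless control points the recovery is exact; in practice more than the minimal four control points would be used and the same least-squares solve absorbs measurement noise.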
CN202210173926.5A 2022-02-24 2022-02-24 Satellite remote sensing image rapid reconstruction method and device based on affine camera model Active CN115100535B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210173926.5A CN115100535B (en) 2022-02-24 2022-02-24 Satellite remote sensing image rapid reconstruction method and device based on affine camera model

Publications (2)

Publication Number Publication Date
CN115100535A CN115100535A (en) 2022-09-23
CN115100535B true CN115100535B (en) 2023-03-28

Family

ID=83287403

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210173926.5A Active CN115100535B (en) 2022-02-24 2022-02-24 Satellite remote sensing image rapid reconstruction method and device based on affine camera model

Country Status (1)

Country Link
CN (1) CN115100535B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant