CN117011561A - Image matching optimization method and system based on geometric constraint and convolutional neural network - Google Patents
- Publication number: CN117011561A
- Application number: CN202310958186.0A
- Authority
- CN
- China
- Prior art keywords
- image
- matching
- neural network
- convolutional neural
- point
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/74—Image or video pattern matching; Proximity measures in feature spaces
- G06V10/75—Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
- G06V10/757—Matching configurations of points or features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T17/00—Three dimensional [3D] modelling, e.g. data description of 3D objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/50—Depth or shape recovery
- G06T7/55—Depth or shape recovery from multiple images
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/70—Determining position or orientation of objects or cameras
- G06T7/73—Determining position or orientation of objects or cameras using feature-based methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10028—Range image; Depth image; 3D point clouds
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Abstract
The invention discloses an image matching optimization method and system based on geometric constraints and a convolutional neural network. First, the three-dimensional points of the initial matches are projected onto the reference image and the search image respectively to obtain the corresponding image point coordinates, and a reference image window and a matching search range are determined centered on the initial matching image pair. Next, with the initial matching image window as the reference, a convolutional neural network searches the matching search range for the optimal matching point position. Finally, the position coordinates of the optimal matching image pair are substituted into the forward intersection error equations to compute the accurate three-dimensional coordinates of the corresponding ground object point. The invention constrains the matching search range with the spatial geometric projection relationship, which weakens mismatching caused by repeated textures, and replaces the traditional grey-level-based matching optimization with a convolutional neural network, which reduces the probability of matching failure and mismatching and improves both the matching optimization accuracy and the completeness of the reconstructed point cloud.
Description
Technical Field
The invention belongs to the technical field of image matching in remote sensing image processing, relates to an image matching optimization method and system for remote sensing image processing, and in particular relates to an image matching optimization method and system based on geometric constraints and a convolutional neural network.
Background
In recent years, pixel-by-pixel dense matching of overlapping oblique images to quickly recover three-dimensional models of objects or scenes has become an effective means of acquiring three-dimensional information about cities. However, conventional grey-level-based matching optimization methods perform poorly in the areas with depth discontinuities typical of man-made structures (building edges and corners, road boundaries) and in weakly textured areas. Traditional matching optimization is based on the cross-correlation of image-window grey levels: the larger the matching window, the more pixels participate in the matching process, the lower the probability of mismatching, and the higher the completeness of the reconstructed point cloud. This is because richer image information takes part in the matching, reducing the weight of image noise and thereby improving the robustness of image matching. However, as the matching window grows, the reconstructed point cloud deforms more with respect to the real surface: with too many pixels involved in the optimization, the weight of the pixels carrying the point and line features of the image shrinks, the features are blurred more severely after optimization, and the deformation of the reconstructed model increases.
Disclosure of Invention
To solve the above technical problems, the invention provides an optimization method and system that mine the deep pixel information of low-altitude remote sensing images to improve matching accuracy and the completeness of the reconstructed point cloud.
The method adopts the following technical scheme: an image matching optimization method based on geometric constraint and convolutional neural network comprises the following steps:
step 1: projecting the three-dimensional points of the initial matching image to the reference image and the search image respectively to obtain the image point coordinates p of the initial matching image r (x r ,y r )、p s (x s ,y s ) Determining a reference image window and a matching search range by taking the initial matching image pair as a center;
step 2: searching an optimal matching point position in a matching search range by using the convolution neural network with the initial matching image window as a reference;
step 3: and (3) taking the position coordinates of the optimal matching image pair into a front intersection error equation, and calculating the accurate three-dimensional coordinates of the corresponding shooting ground object points.
Preferably, in step 1, the image point coordinates of the initial matches are obtained with the following collinearity condition equations:

x_(r/s) = x_0 − f·[a_1(X − X_S(r/s)) + b_1(Y − Y_S(r/s)) + c_1(Z − Z_S(r/s))] / [a_3(X − X_S(r/s)) + b_3(Y − Y_S(r/s)) + c_3(Z − Z_S(r/s))]
y_(r/s) = y_0 − f·[a_2(X − X_S(r/s)) + b_2(Y − Y_S(r/s)) + c_2(Z − Z_S(r/s))] / [a_3(X − X_S(r/s)) + b_3(Y − Y_S(r/s)) + c_3(Z − Z_S(r/s))]

where (x_(r/s), y_(r/s)) are the position coordinates in I_r or I_s of the image point p_r or p_s obtained by projecting the three-dimensional point P(X, Y, Z) onto the reference image I_r or the search image I_s; (x_0, y_0) are the position coordinates of the principal point of the camera image in the image; f is the principal distance of the camera; (X_S(r/s), Y_S(r/s), Z_S(r/s)) are the coordinates, in the three-dimensional space coordinate system, of the optical centre of the camera that took the reference image I_r or the search image I_s; and a_i, b_i, c_i (i = 1, 2, 3) are the elements of the rotation matrix R_(r/s) of that camera.
Preferably, in step 1, an image window of n×n pixels centered on the projected image point in the reference image is selected as the window to be matched, and the m×m pixels centered on the projected image point in the search image are used as the matching search range, where m and n are preset values and m is greater than n.
Preferably, in step 2, (m−n)×(m−n) candidate image windows are extracted from the search range from left to right and from top to bottom, and the convolutional neural network is used to find the optimal match against the reference image window.
Preferably, in step 2, the matching optimization convolutional neural network includes a convolutional image information extraction module and a fully-connected optimal matching search module;
the convolution image information extraction module comprises 16 convolution layers which are arranged in series, wherein residual blocks are added to the sum of the 1 st layer to the 15 th layer of convolution layers, and normalization layers, an activation layer and the residual blocks are sequentially added to the 16 th layer;
the full-connection optimal matching search module comprises a window feature vector description sub-module and an optimal matching search sub-module, wherein the feature vector description sub-module consists of 48 multiplied by 1000 full-connection layers, and the output result is a 1000-dimensional feature vector of a matching image window; the optimal matching search sub-module consists of 1000 multiplied by 64 full connection layers, and an output result is a similarity measure of a reference image and a search image window.
Preferably, in step 3, the three-dimensional coordinates of the object point corresponding to the matching image pair are obtained from the following forward intersection error equations:

l_1·X + l_2·Y + l_3·Z = l_x
l_4·X + l_5·Y + l_6·Z = l_y

with

l_1 = f·a_1 + (x − x_0)·a_3,  l_2 = f·b_1 + (x − x_0)·b_3,  l_3 = f·c_1 + (x − x_0)·c_3
l_4 = f·a_2 + (y − y_0)·a_3,  l_5 = f·b_2 + (y − y_0)·b_3,  l_6 = f·c_2 + (y − y_0)·c_3
l_x = l_1·X_s + l_2·Y_s + l_3·Z_s,  l_y = l_4·X_s + l_5·Y_s + l_6·Z_s

where P(X, Y, Z) are the three-dimensional coordinates of the object point; l_1 to l_6 are the equation coefficients; l_x and l_y are the equation constant terms; f is the principal distance of the camera; (x_0, y_0) are the position coordinates of the principal point of the camera image in the image; p(x, y) are the coordinates of the best matching image point searched in step 2; (X_s, Y_s, Z_s) are the coordinates of the camera's optical centre in three-dimensional space; and a_i, b_i, c_i (i = 1, 2, 3) are the elements of the camera's rotation matrix R.
The system of the invention adopts the following technical scheme: an image matching optimization system based on geometric constraints and a convolutional neural network, comprising:
one or more processors;
and the storage device is used for storing one or more programs, and when the one or more programs are executed by the one or more processors, the one or more processors realize the image matching optimization method based on the geometric constraint and the convolutional neural network.
According to the above technical scheme, the matching search range is constrained by the geometric projection relationship, which improves matching efficiency while weakening the mismatching caused by repeated textures; the deep information extracted by the convolutional neural network then enriches the information content of the image pixels to achieve matching optimization; finally, the accurate three-dimensional coordinates of the object points are calculated by spatial forward intersection of the matching image pair, yielding three-dimensional point cloud data of high accuracy and high completeness.
Compared with the prior art, the invention has the beneficial effects that:
(1) The spatial geometric projection relationship constrains the matching search from the whole image down to a very small range along the homologous projection ray, greatly reducing the search time for matching pixels while weakening mismatching caused by repeated textures.
(2) A small image window with few pixels is used as input data, and a matching optimization convolutional neural network without pooling layers is constructed, balancing matching efficiency with high-precision matching.
(3) The matching optimization convolutional neural network mines the deep information in the image window, replacing the traditional matching optimization based on image grey levels; this enriches the information content of the image pixels, reduces the probability of matching failure and mismatching, and improves the matching optimization accuracy and the completeness of the reconstructed point cloud.
Drawings
The drawings described below are used, together with the embodiments, to further illustrate the technical solutions herein. For a person skilled in the art, other drawings and the intent of the present invention can be derived from these drawings without inventive effort.
FIG. 1 is a flow chart of an embodiment of the present invention.
Fig. 2 is a schematic diagram of a matched optimized convolutional neural network established in an embodiment of the present invention.
Detailed Description
To facilitate understanding and practice of the invention by those of ordinary skill in the art, the invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the embodiments described herein are for illustration and explanation only and are not intended to limit the invention.
Referring to fig. 1, the image matching optimization method based on geometric constraint and convolutional neural network provided by the invention comprises the following steps:
step 1: the three-dimensional points of the initial matching image are projected to the reference image and the search image respectively by utilizing a collineation conditional equation, and the image point coordinates p of the initial matching image are obtained r (x r ,y r )、p s (x s ,y s ) Determining a reference image window and a matching search range by taking the initial matching image pair as a center;
in the embodiment, firstly, image related parameters and initial matching three-dimensional point cloud data are input, projection point coordinates are calculated by utilizing a space geometrical projection relation, and a matching image window and a matching search range are determined by taking the projection point coordinates as the center.
In one embodiment, for each three-dimensional point P(X, Y, Z), the projected image point coordinates p_r(x_r, y_r) and p_s(x_s, y_s) are calculated with the following collinearity condition equations:

x_(r/s) = x_0 − f·[a_1(X − X_S(r/s)) + b_1(Y − Y_S(r/s)) + c_1(Z − Z_S(r/s))] / [a_3(X − X_S(r/s)) + b_3(Y − Y_S(r/s)) + c_3(Z − Z_S(r/s))]
y_(r/s) = y_0 − f·[a_2(X − X_S(r/s)) + b_2(Y − Y_S(r/s)) + c_2(Z − Z_S(r/s))] / [a_3(X − X_S(r/s)) + b_3(Y − Y_S(r/s)) + c_3(Z − Z_S(r/s))]

where (x_(r/s), y_(r/s)) are the position coordinates in I_r or I_s of the image point p_r or p_s obtained by projecting the three-dimensional point P(X, Y, Z) onto the reference image I_r or the search image I_s; (x_0, y_0) are the position coordinates of the principal point of the camera image in the image; f is the principal distance of the camera; (X_S(r/s), Y_S(r/s), Z_S(r/s)) are the coordinates, in the three-dimensional space coordinate system, of the optical centre of the camera that took the reference image I_r or the search image I_s; and a_i, b_i, c_i (i = 1, 2, 3) are the elements of the rotation matrix R_(r/s) of that camera.
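As a concrete illustration, the collinearity projection of step 1 can be sketched in plain Python. This is a minimal sketch, not part of the patent: the function name, argument layout and the toy camera below are illustrative assumptions.

```python
def project_point(P, Xs, Ys, Zs, R, f, x0=0.0, y0=0.0):
    """Project a 3-D object point P = (X, Y, Z) into an image via the
    collinearity condition equations.  R is the camera rotation matrix
    as nested lists [[a1, b1, c1], [a2, b2, c2], [a3, b3, c3]], f the
    principal distance, (x0, y0) the principal point position."""
    X, Y, Z = P
    dX, dY, dZ = X - Xs, Y - Ys, Z - Zs
    num_x = R[0][0] * dX + R[0][1] * dY + R[0][2] * dZ
    num_y = R[1][0] * dX + R[1][1] * dY + R[1][2] * dZ
    den   = R[2][0] * dX + R[2][1] * dY + R[2][2] * dZ
    return x0 - f * num_x / den, y0 - f * num_y / den

# Toy example: identity rotation, camera at the origin, point 10 units away.
I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
print(project_point((1.0, 2.0, -10.0), 0.0, 0.0, 0.0, I3, 1.0))  # (0.1, 0.2)
```

The same function serves for both the reference and the search image by passing the respective optical centre and rotation matrix.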
In one embodiment, an image window of 11×11 pixels centered on the projected image point p_r(x_r, y_r) in the reference image is selected as the window to be matched, and the 19×19 pixels centered on the projected image point p_s(x_s, y_s) in the search image serve as the matching search range. The 11×11 matching window is chosen to keep the window small and increase the weight of feature pixels; constraining the search range improves matching efficiency and reduces mismatching caused by repeated textures.
Step 2: searching an optimal matching point position in a matching search range by using the convolution neural network with the initial matching image window as a reference;
Referring to fig. 2, in one embodiment, 64 candidate image windows are extracted from the search range from left to right and from top to bottom and, together with the reference image window, fed to the matching optimization convolutional neural network shown in fig. 2.
in one embodiment, the input data is an 11×11 pixel RGB color image, and the matching optimized convolutional neural network includes a convolutional image information extraction module and a full-connection optimal matching search module.
The convolutional image information extraction module comprises 16 serial convolution layers, each with kernel size 3 and stride 1. Residual blocks are added across convolution layers 1 to 15 to weaken image noise, and a normalization layer, an activation layer and a residual block are added in sequence after layer 16 to suppress gradient vanishing and gradient explosion and to accelerate network operation. The purpose of the convolution layers is to extract deep information from the image, enriching the information content of the image pixels.
The fully connected optimal matching search module comprises a window feature vector description sub-module and an optimal matching search sub-module. The feature vector description sub-module consists of a 48×1000 fully connected layer whose output is a 1000-dimensional feature vector describing the global information of a matching image window; the optimal matching search sub-module consists of a 1000×64 fully connected layer whose output is the similarity measure between the reference image window and each of the 64 search image windows.
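The spatial bookkeeping of the convolution stack can be checked with the standard convolution output-size formula. The sketch below assumes padding 1, which the patent does not state; it is, however, the only padding that keeps an 11×11 input at 11×11 through 16 serial 3×3 stride-1 layers.

```python
def conv_out(size, kernel=3, stride=1, padding=1):
    """Spatial output size of one 2-D convolution layer."""
    return (size + 2 * padding - kernel) // stride + 1

size = 11                       # 11x11 matching window from step 1
for _ in range(16):             # 16 serial 3x3, stride-1 conv layers
    size = conv_out(size)
print(size)                     # 11: padding 1 preserves the window size

# Without padding each 3x3 layer shrinks the window by 2 pixels, so an
# 11x11 input would already be reduced to 1x1 after only five layers:
print(conv_out(11, padding=0))  # 9
```

This is why the pooling-free design of the network is consistent with its small fixed-size input.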
From the output of the matching optimization convolutional neural network, the centre pixel of the search image window with the highest similarity score is selected as the optimal matching point. The purpose of the network is to retain the deep information of each pixel in the image window so as to achieve high-precision matching.
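Selecting the optimal matching point from the 64 similarity scores is a simple argmax over the scan order described above. The helper below is an illustrative sketch: the scores would come from the matching network, and the function name and return convention are assumptions.

```python
def best_match_center(scores, m=19, n=11):
    """Given one similarity score per candidate window, scanned left to
    right and top to bottom over the m x m search range, return the
    (row, col) offset of the centre pixel of the best n x n window.
    `scores` is a flat list of length (m - n) ** 2."""
    k = m - n                           # candidates per row/column: 8 -> 64 windows
    assert len(scores) == k * k
    best = max(range(k * k), key=lambda i: scores[i])
    row, col = divmod(best, k)
    half = n // 2                       # centre offset inside an n x n window
    return row + half, col + half

scores = [0.0] * 64
scores[27] = 1.0                        # suppose window 27 scores highest
print(best_match_center(scores))        # (8, 8)
```

The returned offset, added to the top-left corner of the search range, gives the image coordinates of the optimal matching point used in step 3.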
Step 3: and (3) taking the position coordinates of the optimal matching image pair into a front intersection error equation, and calculating the accurate three-dimensional coordinates of the corresponding shooting ground object points.
In one embodiment, the three-dimensional coordinates of the object point corresponding to the matching image pair are obtained from the following forward intersection error equations:

l_1·X + l_2·Y + l_3·Z = l_x
l_4·X + l_5·Y + l_6·Z = l_y

with

l_1 = f·a_1 + (x − x_0)·a_3,  l_2 = f·b_1 + (x − x_0)·b_3,  l_3 = f·c_1 + (x − x_0)·c_3
l_4 = f·a_2 + (y − y_0)·a_3,  l_5 = f·b_2 + (y − y_0)·b_3,  l_6 = f·c_2 + (y − y_0)·c_3
l_x = l_1·X_s + l_2·Y_s + l_3·Z_s,  l_y = l_4·X_s + l_5·Y_s + l_6·Z_s

where P(X, Y, Z) are the three-dimensional coordinates of the object point; l_1 to l_6 are the equation coefficients; l_x and l_y are the equation constant terms; f is the principal distance of the camera; (x_0, y_0) are the position coordinates of the principal point of the camera image in the image; p(x, y) are the coordinates of the best matching image point searched in step 2; (X_s, Y_s, Z_s) are the coordinates of the camera's optical centre in three-dimensional space; and a_i, b_i, c_i (i = 1, 2, 3) are the elements of the camera's rotation matrix R.
In practice, a single image point coordinate p(x, y) yields 2 linear error equations in (X, Y, Z). Substituting the matched image points p_r(x_r, y_r) and p_s(x_s, y_s) of the overlapping images obtained after matching optimization into the error equations therefore produces a 4×3 linear system:

L·X = b

where L is the 4×3 coefficient matrix assembled from l_1 to l_6 for both images, X = (X, Y, Z)^T is the unknown vector, and b is the constant term vector assembled from l_x and l_y. Since the number of equations exceeds the number of unknowns, the system is solved by least squares:

X = (L^T·L)^(-1)·L^T·b

where L^T is the transpose of the coefficient matrix L and (L^T·L)^(-1) is the inverse of the product L^T·L.
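The least-squares solution for the three unknowns can be sketched in pure Python via the normal equations. This is an illustrative sketch only; a real implementation would use a linear-algebra library, and the toy 4×3 system below stands in for the coefficients derived from an actual matched image pair.

```python
def solve_forward_intersection(L, b):
    """Solve the overdetermined system L*X = b (e.g. 4 equations from
    two matched image points, 3 unknowns X, Y, Z) via the normal
    equations X = (L^T L)^(-1) L^T b."""
    # Normal matrix N = L^T L (3x3) and right-hand side u = L^T b.
    N = [[sum(row[i] * row[j] for row in L) for j in range(3)] for i in range(3)]
    u = [sum(row[i] * bi for row, bi in zip(L, b)) for i in range(3)]
    # Solve N*X = u by Gauss-Jordan elimination with partial pivoting.
    A = [N[i][:] + [u[i]] for i in range(3)]
    for c in range(3):
        p = max(range(c, 3), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        A[c] = [v / A[c][c] for v in A[c]]
        for r in range(3):
            if r != c:
                A[r] = [vr - A[r][c] * vc for vr, vc in zip(A[r], A[c])]
    return [A[i][3] for i in range(3)]

# Consistent toy system: the exact answer (1, 2, 3) satisfies all 4 rows.
print(solve_forward_intersection(
    [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 1]], [1, 2, 3, 6]))
```

With real data the four rows are inconsistent because of matching noise, and the least-squares result is the point minimizing the residual of all four error equations.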
The results of optimizing all the initial three-dimensional points one by one are recorded in a txt document. The document provides comprehensive results, including: the initial three-dimensional point coordinates, the corrected three-dimensional point coordinates, the reference image point coordinates, the search image point coordinates, the correlation coefficient of the optimized matching window, whether the optimization succeeded, and so on.
The invention also provides an image matching optimization system based on geometric constraint and convolutional neural network, which comprises:
one or more processors;
and the storage device is used for storing one or more programs, and when the one or more programs are executed by the one or more processors, the one or more processors realize the image matching optimization method based on the geometric constraint and the convolutional neural network.
The effectiveness of the present invention was verified by simulation experiments as follows:
the simulation experiment adopts two groups of city area unmanned aerial vehicle images with overlapped heading, and obtains accurate inside and outside azimuth elements, each group of images extracts two groups of point cloud data of regular point clouds and building edge characteristic point clouds which are uniformly distributed, the effectiveness of the invention is verified by comparing the least square matching optimization (Least Square Matching, LSM) based on the image gray level with the matching optimization result of the invention, and the detailed information of experimental data is shown in table 1:
table 1: experimental image and initial data basic parameters
Evaluation indices: the matching optimization results are evaluated from three aspects: matching optimization success rate, effective matching optimization probability, and the correlation coefficient between matching image windows.
(1) Matching optimization success rate: the percentage of three-dimensional points whose matching image points lie within the image extents of both the reference image and the search image after the optimization process completes.
(2) Effective matching optimization probability: the percentage of points whose final three-dimensional ground feature coordinates, obtained by forward intersection, lie within a reasonable range of the digital surface model (Digital Surface Model, DSM).
(3) Average normalized cross-correlation coefficient (Normalized Correlation Coefficient, NCC) between image windows: the correlation coefficient between two image windows is an important index for evaluating whether they are correlated; the higher the correlation coefficient, the higher the probability that the window centre points are homologous points.
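The NCC used as evaluation index (3) can be written directly from its definition. A minimal sketch over flat grey-value lists (window shape handling omitted; the function name is illustrative):

```python
import math

def ncc(w1, w2):
    """Normalized cross-correlation coefficient between two equally
    sized image windows given as flat lists of grey values.  Ranges
    from -1 (inversely correlated) to +1 (perfectly correlated)."""
    n = len(w1)
    m1, m2 = sum(w1) / n, sum(w2) / n
    num = sum((a - m1) * (b - m2) for a, b in zip(w1, w2))
    d1 = math.sqrt(sum((a - m1) ** 2 for a in w1))
    d2 = math.sqrt(sum((b - m2) ** 2 for b in w2))
    return num / (d1 * d2)

print(ncc([1, 2, 3, 4], [2, 4, 6, 8]))   # ~1.0: identical up to gain
print(ncc([1, 2, 3, 4], [4, 3, 2, 1]))   # ~-1.0: reversed
```

Because the mean and the norm are removed, NCC is invariant to linear brightness and contrast changes between the two windows, which is why it is a common window-similarity index.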
Experimental data of the simulation experiments according to the evaluation index are shown in tables 2 to 5:
table 2: image matching optimization result of northwest university (58014 points)
Table 3: yangjiang city image matching optimization result (165353 points)
Table 4: edge point cloud matching optimization result of northwest university (105619 points)
Table 5: yangjiang city image edge point cloud optimization result (147555 point)
As can be seen from the experimental results in tables 2-5, because the convolutional neural network mines the deep information of the image to enrich the information content of the image pixels, the matching optimization results clearly outperform the LSM method in matching success rate, effective matching optimization probability and the correlation coefficient between matching windows. The improvement is especially pronounced for edge point cloud matching optimization: edge point clouds carry richer information than ordinary point clouds, so the convolutional neural network can more easily suppress matching interference and obtain a better matching result. In theory, every three-dimensional point in the reconstructed point cloud should lie on the DSM surface of the object, so discrete points in the point cloud data indicate points of very low accuracy that are most likely mismatches. The enlarged point cloud details in the experimental comparison figures show that the optimized point cloud has fewer discrete points and richer detail. These results verify that the present invention is superior to the LSM algorithm in matching accuracy and point cloud completeness.
Compared with the traditional least squares matching method, the invention has obvious advantages: a higher matching optimization success rate, a higher effective matching rate, fewer abnormal matches and richer texture detail in the reconstructed point cloud. It is therefore a feasible image matching optimization method.
It should be understood that the foregoing description of the preferred embodiments is detailed but is not to be taken as limiting the scope of protection of the invention, which is defined by the appended claims; those of ordinary skill in the art may make substitutions or modifications without departing from the scope of the invention as claimed.
Claims (7)
1. An image matching optimization method based on geometric constraints and a convolutional neural network, characterized by comprising the following steps:
step 1: projecting the three-dimensional points of the initial matching image to the reference image and the search image respectively to obtain the image point coordinates p of the initial matching image r (x r ,y r )、p s (x s ,y s ) Determining a reference image window and a matching search range by taking the initial matching image pair as a center;
step 2: searching an optimal matching point position in a matching search range by using the initial matching image window as a reference and utilizing a matching optimization convolutional neural network;
step 3: and (3) taking the position coordinates of the optimal matching image pair into a front intersection error equation, and calculating the accurate three-dimensional coordinates of the corresponding shooting ground object points.
2. The image matching optimization method based on geometric constraints and a convolutional neural network according to claim 1, characterized in that: in step 1, the image point coordinates of the initial matches are obtained with the following collinearity condition equations:

x_(r/s) = x_0 − f·[a_1(X − X_S(r/s)) + b_1(Y − Y_S(r/s)) + c_1(Z − Z_S(r/s))] / [a_3(X − X_S(r/s)) + b_3(Y − Y_S(r/s)) + c_3(Z − Z_S(r/s))]
y_(r/s) = y_0 − f·[a_2(X − X_S(r/s)) + b_2(Y − Y_S(r/s)) + c_2(Z − Z_S(r/s))] / [a_3(X − X_S(r/s)) + b_3(Y − Y_S(r/s)) + c_3(Z − Z_S(r/s))]

where (x_(r/s), y_(r/s)) are the position coordinates in I_r or I_s of the image point p_r or p_s obtained by projecting the three-dimensional point P(X, Y, Z) onto the reference image I_r or the search image I_s; (x_0, y_0) are the position coordinates of the principal point of the camera image in the image; f is the principal distance of the camera; (X_S(r/s), Y_S(r/s), Z_S(r/s)) are the coordinates, in the three-dimensional space coordinate system, of the optical centre of the camera that took the reference image I_r or the search image I_s; and a_i, b_i, c_i (i = 1, 2, 3) are the elements of the rotation matrix R_(r/s) of that camera.
3. The image matching optimization method based on geometric constraints and a convolutional neural network according to claim 1, characterized in that: in step 1, an image window of n×n pixels centered on the projected image point in the reference image is selected as the window to be matched, and the m×m pixels centered on the projected image point in the search image are used as the matching search range, where m and n are preset values and m is greater than n.
4. The image matching optimization method based on geometric constraint and convolutional neural network according to claim 1, characterized in that: in step 2, the matching optimization convolutional neural network comprises a convolutional image information extraction module and a fully connected optimal matching search module;
the convolutional image information extraction module comprises 16 convolution layers arranged in series, wherein a residual block is added over the sum of convolution layers 1 to 15, and a normalization layer, an activation layer and a residual block are added in sequence after layer 16;
the fully connected optimal matching search module comprises a window feature vector description sub-module and an optimal matching search sub-module; the feature vector description sub-module consists of a 48×1000 fully connected layer whose output is a 1000-dimensional feature vector of the matching image window, and the optimal matching search sub-module consists of a 1000×64 fully connected layer whose output is the similarity measure between the reference image window and a search image window.
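The two fully connected sub-modules have concrete sizes (48×1000 and 1000×64), so their shapes can be sketched with random placeholder weights. How the 64-dimensional output is reduced to a scalar similarity is not specified in the claim, so the inner-product reading below is an assumption, as are all names and weight values:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical weights for the two fully connected sub-modules of claim 4:
# 48 -> 1000 (window feature description) and 1000 -> 64 (optimal matching search).
W_desc = rng.standard_normal((48, 1000)) * 0.05
W_match = rng.standard_normal((1000, 64)) * 0.05

def describe(conv_features):
    """Map a 48-d convolutional window feature to a 1000-d descriptor."""
    return np.tanh(np.asarray(conv_features, float) @ W_desc)

def similarity(ref_features, search_features):
    """One possible reading of the 1000x64 head: project both window
    descriptors through it and take their inner product as the similarity."""
    r = describe(ref_features) @ W_match
    s = describe(search_features) @ W_match
    return float(r @ s)
```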
5. The image matching optimization method based on geometric constraint and convolutional neural network according to claim 1, characterized in that: in step 2, (m−n)×(m−n) image windows are extracted within the search range from left to right and from top to bottom, and each is compared against the reference image window by the convolutional neural network to find the optimal match.
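A sketch of this left-to-right, top-to-bottom traversal, with plain normalised cross-correlation standing in for the CNN similarity of claim 4; the (m − n)² window count follows the claim's wording, and all names are illustrative:

```python
import numpy as np

def best_match(ref_win, search_area, score=None):
    """Slide an n-by-n reference window over an m-by-m search area and
    return the (row, col) offset with the highest similarity score."""
    if score is None:
        def score(a, b):  # normalised cross-correlation as a CNN stand-in
            a = a - a.mean()
            b = b - b.mean()
            d = np.linalg.norm(a) * np.linalg.norm(b)
            return float((a * b).sum() / d) if d > 0 else 0.0
    n, m = ref_win.shape[0], search_area.shape[0]
    best, best_ij = -np.inf, (0, 0)
    for i in range(m - n):          # top to bottom: m - n row offsets
        for j in range(m - n):      # left to right: m - n column offsets
            s = score(ref_win, search_area[i:i + n, j:j + n])
            if s > best:
                best, best_ij = s, (i, j)
    return best_ij, best
```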
6. The image matching optimization method based on geometric constraint and convolutional neural network according to claim 1, characterized in that: in step 3, the three-dimensional coordinates of the object point corresponding to the matched image pair are obtained with the forward intersection error equations; for each image, the collinearity condition yields

l_1·X + l_2·Y + l_3·Z = l_x
l_4·X + l_5·Y + l_6·Z = l_y

with coefficients and constant terms

l_1 = f·a_1 + (x − x_0)·a_3,  l_2 = f·b_1 + (x − x_0)·b_3,  l_3 = f·c_1 + (x − x_0)·c_3,  l_x = l_1·X_S + l_2·Y_S + l_3·Z_S
l_4 = f·a_2 + (y − y_0)·a_3,  l_5 = f·b_2 + (y − y_0)·b_3,  l_6 = f·c_2 + (y − y_0)·c_3,  l_y = l_4·X_S + l_5·Y_S + l_6·Z_S

and the equations from the reference and search images are stacked and solved for P(X, Y, Z) by least squares;
wherein P(X, Y, Z) is the three-dimensional coordinates of the object point; l_1 to l_6 are the coefficients of the equations; l_x and l_y are the constant terms of the equations; f denotes the principal distance of the camera; (x_0, y_0) denotes the position coordinates of the principal point of the camera image in the image; p(x, y) denotes the coordinates of the best matching image point searched in step 2; (X_S, Y_S, Z_S) denotes the coordinates of the optical centre of the camera in three-dimensional space; and a_i, b_i, c_i (i = 1, 2, 3) denote the elements of the rotation matrix R of the camera.
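A least-squares forward-intersection sketch consistent with the symbols defined in claim 6 (coefficients l_1 to l_6, constant terms l_x and l_y); the coefficient formulas follow the standard photogrammetric form with R = [[a1,a2,a3],[b1,b2,b3],[c1,c2,c3]], and the helper names are illustrative:

```python
import numpy as np

def forward_intersect(obs):
    """Intersect one ground point from two or more images by least squares.

    obs: iterable of (x, y, f, x0, y0, Xs, R) per image -- matched image
    point, principal distance, principal point, optical centre (3-vector),
    and 3x3 rotation matrix [[a1,a2,a3],[b1,b2,b3],[c1,c2,c3]].
    """
    A, b = [], []
    for x, y, f, x0, y0, Xs, R in obs:
        (a1, a2, a3), (b1, b2, b3), (c1, c2, c3) = np.asarray(R, float)
        row_x = np.array([f * a1 + (x - x0) * a3,     # l1
                          f * b1 + (x - x0) * b3,     # l2
                          f * c1 + (x - x0) * c3])    # l3
        row_y = np.array([f * a2 + (y - y0) * a3,     # l4
                          f * b2 + (y - y0) * b3,     # l5
                          f * c2 + (y - y0) * c3])    # l6
        A += [row_x, row_y]
        b += [row_x @ np.asarray(Xs, float),          # lx
              row_y @ np.asarray(Xs, float)]          # ly
    P, *_ = np.linalg.lstsq(np.array(A), np.array(b), rcond=None)
    return P
```

With four equations (two per image) in three unknowns, the system is overdetermined and the least-squares solution absorbs the residual matching error.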
7. An image matching optimization system based on geometric constraint and convolutional neural network, which is characterized by comprising:
one or more processors;
storage means for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the geometric constraint and convolutional neural network based image matching optimization method as claimed in any one of claims 1 to 6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310958186.0A CN117011561A (en) | 2023-07-31 | 2023-07-31 | Image matching optimization method and system based on geometric constraint and convolutional neural network |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117011561A true CN117011561A (en) | 2023-11-07 |
Family
ID=88570475
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310958186.0A Pending CN117011561A (en) | 2023-07-31 | 2023-07-31 | Image matching optimization method and system based on geometric constraint and convolutional neural network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117011561A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title
---|---|---|---|---
CN117670961A (en) * | 2024-02-01 | 2024-03-08 | 深圳市规划和自然资源数据管理中心(深圳市空间地理信息中心) | Low-altitude remote sensing image multi-view stereo matching method and system based on deep learning
CN117670961B (en) * | 2024-02-01 | 2024-04-16 | 深圳市规划和自然资源数据管理中心(深圳市空间地理信息中心) | Low-altitude remote sensing image multi-view stereo matching method and system based on deep learning
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN114782691B (en) | Robot target identification and motion detection method based on deep learning, storage medium and equipment | |
CN108010081B (en) | RGB-D visual odometer method based on Census transformation and local graph optimization | |
CN109934862A (en) | A kind of binocular vision SLAM method that dotted line feature combines | |
US9875404B2 (en) | Automated metric information network | |
US20180315221A1 (en) | Real-time camera position estimation with drift mitigation in incremental structure from motion | |
US20180315232A1 (en) | Real-time incremental 3d reconstruction of sensor data | |
CN108682027A (en) | VSLAM realization method and systems based on point, line Fusion Features | |
CN113256698B (en) | Monocular 3D reconstruction method with depth prediction | |
CN109829353B (en) | Face image stylizing method based on space constraint | |
CN113963117B (en) | Multi-view three-dimensional reconstruction method and device based on variable convolution depth network | |
CN112785636B (en) | Multi-scale enhanced monocular depth estimation method | |
CN112652020B (en) | Visual SLAM method based on AdaLAM algorithm | |
CN117011561A (en) | Image matching optimization method and system based on geometric constraint and convolutional neural network | |
CN111798373A (en) | Rapid unmanned aerial vehicle image stitching method based on local plane hypothesis and six-degree-of-freedom pose optimization | |
CN112634171B (en) | Image defogging method and storage medium based on Bayesian convolutional neural network | |
CN108805915A (en) | A kind of close-range image provincial characteristics matching process of anti-visual angle change | |
CN105225233B (en) | A kind of stereopsis dense Stereo Matching method and system based on the expansion of two classes | |
CN118229889B (en) | Video scene previewing auxiliary method and device | |
CN113112547A (en) | Robot, repositioning method thereof, positioning device and storage medium | |
CN117218201A (en) | Unmanned aerial vehicle image positioning precision improving method and system under GNSS refusing condition | |
CN117372460A (en) | Moon rover autonomous positioning method based on self-adaptive threshold edge registration visual odometer | |
CN113642397B (en) | Object length measurement method based on mobile phone video | |
CN117132737B (en) | Three-dimensional building model construction method, system and equipment | |
CN117726747A (en) | Three-dimensional reconstruction method, device, storage medium and equipment for complementing weak texture scene | |
CN113012084A (en) | Unmanned aerial vehicle image real-time splicing method and device and terminal equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||