CN114494372B - Remote sensing image registration method based on unsupervised deep learning - Google Patents
Remote sensing image registration method based on unsupervised deep learning
- Publication number
- CN114494372B CN114494372B CN202210026370.7A CN202210026370A CN114494372B CN 114494372 B CN114494372 B CN 114494372B CN 202210026370 A CN202210026370 A CN 202210026370A CN 114494372 B CN114494372 B CN 114494372B
- Authority
- CN
- China
- Prior art keywords
- image
- scale
- model network
- corrected
- transformation
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Links
- 238000000034 method Methods 0.000 title claims abstract description 34
- 238000013135 deep learning Methods 0.000 title claims abstract description 17
- 230000009466 transformation Effects 0.000 claims abstract description 113
- 230000006870 function Effects 0.000 claims abstract description 49
- 238000012549 training Methods 0.000 claims abstract description 20
- 238000000605 extraction Methods 0.000 claims abstract description 16
- 238000011524 similarity measure Methods 0.000 claims abstract description 14
- 239000011159 matrix material Substances 0.000 claims description 23
- 238000012937 correction Methods 0.000 claims description 14
- 230000001131 transforming effect Effects 0.000 claims description 3
- 238000005457 optimization Methods 0.000 abstract description 2
- 230000003287 optical effect Effects 0.000 description 7
- 238000004364 calculation method Methods 0.000 description 4
- 238000013527 convolutional neural network Methods 0.000 description 4
- 238000010586 diagram Methods 0.000 description 4
- 238000005516 engineering process Methods 0.000 description 3
- 238000013519 translation Methods 0.000 description 3
- 238000002474 experimental method Methods 0.000 description 2
- 238000012986 modification Methods 0.000 description 2
- 230000004048 modification Effects 0.000 description 2
- 238000012952 Resampling Methods 0.000 description 1
- 230000004913 activation Effects 0.000 description 1
- 230000009286 beneficial effect Effects 0.000 description 1
- 238000013461 design Methods 0.000 description 1
- 238000001514 detection method Methods 0.000 description 1
- 238000011161 development Methods 0.000 description 1
- 238000006073 displacement reaction Methods 0.000 description 1
- 230000000694 effects Effects 0.000 description 1
- 239000000284 extract Substances 0.000 description 1
- 230000002349 favourable effect Effects 0.000 description 1
- 238000003384 imaging method Methods 0.000 description 1
- 230000010354 integration Effects 0.000 description 1
- 238000013507 mapping Methods 0.000 description 1
- 230000007246 mechanism Effects 0.000 description 1
- 238000010606 normalization Methods 0.000 description 1
- 230000008569 process Effects 0.000 description 1
- 230000001681 protective effect Effects 0.000 description 1
- 230000004044 response Effects 0.000 description 1
- 238000001228 spectrum Methods 0.000 description 1
- 230000000007 visual effect Effects 0.000 description 1
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/30—Determination of transform parameters for the alignment of images, i.e. image registration
- G06T7/33—Determination of transform parameters for the alignment of images, i.e. image registration using feature-based methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10032—Satellite or aerial image; Remote sensing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Biomedical Technology (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- Life Sciences & Earth Sciences (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses a remote sensing image registration method based on unsupervised deep learning, which casts image registration as regression optimization and can integrate feature extraction networks, image similarity measures and feature descriptors of various forms and parameters. The method uses model networks to extract depth features of the images to be registered at multiple scales, obtains geometric transformation parameters through parameter regression, and geometrically corrects the images with these parameters, thereby realizing coarse-to-fine multi-scale progressive registration. No registration ground truth is required as training samples: loss functions based on inter-image similarity measures and feature descriptors are constructed and jointly trained over the multiple scales, the parameters of each model network are updated by back propagation, and the geometric transformation parameters are optimized, achieving high-precision, high-robustness multi-source remote sensing image registration.
Description
Technical Field
The invention belongs to the technical field of remote sensing, and particularly relates to a design of a remote sensing image registration method based on unsupervised deep learning.
Background
With the rapid development of aerospace and remote sensing technologies, the means of acquiring remote sensing images keep increasing and the image types keep diversifying. Because sensors differ in equipment technology and imaging mechanism, a remote sensing image from a single data source can hardly reflect ground-object characteristics comprehensively. To make full use of multi-source remote sensing data acquired by different types of sensors and to achieve data integration and information complementarity, the multi-source remote sensing images must first be registered.
Multi-source remote sensing image registration is the process of aligning and superimposing multi-sensor remote sensing images of the same area acquired at different times, from different viewing angles or by different sensors, so that corresponding (same-name) points on the aligned images share the same geographic coordinates. Existing methods for registering multi-source remote sensing images fall into traditional methods, which do not use deep learning, and deep-learning-based methods. Traditional methods are based on features or region templates and rely on manually designed features, which typically have to be redesigned for remote sensing images of different modalities from different sensors. Deep-learning-based methods extract deep features from the multi-source remote sensing images and generalize better than handcrafted features. However, current supervised deep-learning methods require a large number of samples with ground-truth labels as training data, and such large labeled datasets are not yet available in the remote sensing field, so cost factors limit the practical application of these methods.
Disclosure of Invention
The invention aims to solve the problem that large numbers of labeled training samples are difficult to obtain for existing remote sensing image registration methods based on supervised deep learning, and provides a remote sensing image registration method based on unsupervised deep learning that achieves accurate registration between remote sensing images without any registration ground truth for training.
The technical scheme of the invention is as follows: a remote sensing image registration method based on unsupervised deep learning comprises the following steps:
S1, a multi-source remote sensing image registration data set comprising two groups of image data is established, the images of the two groups corresponding to each other one by one, wherein one group of image data is used as a reference image data set and the other group is used as the image data set to be corrected.
S2, selecting a reference image f from the reference image data set, selecting the image m to be corrected corresponding to the reference image f from the image data set to be corrected, and taking the reference image f and the image m to be corrected as the end-to-end input on a training sample.
S3, calculating, on 3 scales, the transformation parameters μ1, μ2, μ3 of the image on the model network of each scale, gradually correcting the image m to be corrected to generate corrected images m1, m2, m3, back-propagating the loss function of the model network of each scale, and taking the corrected image m3 and the transformation parameter μ3 as the end-to-end output on the training sample.
S4, respectively initializing the model network parameters of the 3 scales.
S5, performing joint training on the model networks of the 3 scales in an end-to-end mode, and optimizing the joint loss function over the 3 scales.
S6, searching, through a deep learning optimizer, the direction in which the joint loss function value decreases most rapidly, carrying out back propagation on the model networks along that direction, iteratively updating the model network parameters, storing the network model parameters at the moment when the joint loss function has decreased to a preset threshold and converged, and outputting the registered reference image f and corrected image m3.
Further, step S3 includes the following sub-steps:
S3-1, inputting the reference image f and the image m to be corrected into the model network of the 1st scale to obtain the transformation parameter μ1 of the 1st scale.
S3-2, using the transformation parameter μ1 to geometrically correct the image m to be corrected and generate the corrected image m1.
S3-3, calculating the loss function of the model network of the 1st scale.
S3-4, inputting the reference image f and the corrected image m1 into the model network of the 2nd scale to obtain the residual Δμ1 of the transformation parameters, and combining it with the transformation parameter μ1 to obtain the 2nd-scale transformation parameter μ2.
S3-5, using the transformation parameter μ2 to geometrically correct the corrected image m1 and generate the corrected image m2.
S3-6, calculating the loss function of the model network of the 2nd scale.
S3-7, inputting the reference image f and the corrected image m2 into the model network of the 3rd scale to obtain the residual Δμ2 of the transformation parameters, and combining it with the transformation parameter μ2 to obtain the 3rd-scale transformation parameter μ3.
S3-8, using the transformation parameter μ3 to geometrically correct the corrected image m2 and generate the corrected image m3.
S3-9, calculating the loss function of the model network of the 3rd scale.
S3-10, taking the corrected image m3 and the transformation parameter μ3 as the end-to-end output on the training sample.
Further, step S3-1 includes the following sub-steps:
S3-1-1, downsampling the reference image f and the image m to be corrected to 1/4 of their original size respectively, and stacking the two downsampled images in the channel direction to generate a stacked image.
S3-1-2, inputting the stacked image into the feature extraction part of the model network of the 1st scale to generate depth features.
S3-1-3, passing the depth features through the parameter regression part of the model network of the 1st scale to obtain the transformation parameter μ1 of the 1st scale.
Further, step S3-2 includes the following sub-steps:
S3-2-1, composing the geometric transformation matrix T_μ1 from the transformation parameter μ1.
S3-2-2, geometrically transforming the image m to be corrected through the geometric transformation matrix T_μ1 to generate the corrected image m1.
Further, step S3-4 includes the following sub-steps:
S3-4-1, downsampling the reference image f and the corrected image m1 to 1/2 of their original size respectively, and stacking the two downsampled images in the channel direction to generate a stacked image.
S3-4-2, inputting the stacked image into the feature extraction part of the model network of the 2nd scale to generate depth features.
S3-4-3, passing the depth features through the parameter regression part of the model network of the 2nd scale to obtain the residual Δμ1 of the transformation parameters.
S3-4-4, combining the residual Δμ1 with the transformation parameter μ1 to obtain the 2nd-scale transformation parameter μ2.
Further, step S3-5 includes the following sub-steps:
S3-5-1, composing the geometric transformation matrix T_μ2 from the transformation parameter μ2.
S3-5-2, geometrically transforming the corrected image m1 through the geometric transformation matrix T_μ2 to generate the corrected image m2.
Further, step S3-7 includes the following sub-steps:
S3-7-1, stacking the reference image f and the corrected image m2 in the channel direction to generate a stacked image.
S3-7-2, inputting the stacked image into the feature extraction part of the model network of the 3rd scale to generate depth features.
S3-7-3, passing the depth features through the parameter regression part of the model network of the 3rd scale to obtain the residual Δμ2 of the transformation parameters.
S3-7-4, combining the residual Δμ2 with the transformation parameter μ2 to obtain the 3rd-scale transformation parameter μ3.
Further, step S3-8 includes the following sub-steps:
S3-8-1, composing the geometric transformation matrix T_μ3 from the transformation parameter μ3.
S3-8-2, geometrically transforming the corrected image m2 through the geometric transformation matrix T_μ3 to generate the corrected image m3.
Further, the loss function Loss_sim(f, m, μ1) of the model network of the 1st scale in step S3-3 is constructed from the similarity measure between the reference image f and the corrected image m1 = T_μ1(m).
The loss function Loss_sim(f, m1, μ2) of the model network of the 2nd scale in step S3-6 is constructed from the similarity measure between the reference image f and the corrected image m2 = T_μ2(m1).
The loss function Loss_sim(f, m2, μ3) of the model network of the 3rd scale in step S3-9 is constructed from the similarity measure between the reference image f and the corrected image m3 = T_μ3(m2).
The joint loss function Loss in step S5 is:
Loss = λ1 × Loss_sim(f, m, μ1) + λ2 × Loss_sim(f, m1, μ2) + λ3 × Loss_sim(f, m2, μ3)
where Sim(·) denotes the similarity measure and λ1, λ2, λ3 are the weight factors of the loss functions of the model networks of the respective scales.
Further, step S4 includes the following sub-steps:
S4-1, training the model network of the 1st scale by minimizing the loss function Loss_sim(f, m, μ1).
S4-2, fixing the parameters of the model network of the 1st scale and training the model network of the 2nd scale by minimizing the loss function Loss_sim(f, m1, μ2).
S4-3, fixing the parameters of the model networks of the 1st and 2nd scales and training the model network of the 3rd scale by minimizing the loss function Loss_sim(f, m2, μ3).
The beneficial effects of the invention are as follows:
(1) The invention converts image registration into regression optimization, can integrate feature extraction networks, image similarity measures and feature descriptors of various forms and parameters, and realizes accurate multi-scale image registration that is learned in a completely unsupervised manner and mapped end to end.
(2) According to the invention, the depth characteristics of the images to be registered are extracted on a plurality of scales by using a model network, geometric transformation parameters are obtained through parameter regression, the images are geometrically corrected by using the parameters, and the multi-scale gradual registration of the images from coarse to fine is realized.
(3) The invention requires no registration ground truth as training samples; by constructing loss functions based on inter-image similarity measures and feature descriptors, jointly training the loss functions over multiple scales, updating the parameters of each model network through back propagation and optimizing the geometric transformation parameters, it realizes high-precision and high-robustness multi-source remote sensing image registration.
Drawings
Fig. 1 is a flowchart of a remote sensing image registration method based on unsupervised deep learning according to an embodiment of the present invention.
Fig. 2 is a schematic diagram of a reference image, an image to be corrected, and a corrected image according to an embodiment of the present invention.
Fig. 3 is a schematic diagram of an overall frame of a remote sensing image registration method according to an embodiment of the present invention.
Fig. 4 is a schematic structural diagram of a model network 1 according to an embodiment of the present invention.
Fig. 5 is a schematic diagram of calculating a similarity measure of a multisource remote sensing image according to an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present invention will now be described in detail with reference to the accompanying drawings. It is to be understood that the embodiments shown and described in the drawings are merely illustrative of the principles and spirit of the invention and are not intended to limit the scope of the invention.
The embodiment of the invention provides a remote sensing image registration method based on unsupervised deep learning, which is shown in fig. 1 and comprises the following steps S1 to S6:
S1, a multi-source remote sensing image registration data set comprising two groups of image data is established, the images of the two groups corresponding to each other one by one, wherein one group of image data is used as the reference image data set and the other group is used as the image data set to be corrected.
In the embodiment of the present invention, the image to be corrected in the image data set to be corrected should be an image with geometric distortion and overlapping with the ground feature information contained in the reference image to a certain extent (in the embodiment of the present invention, greater than or equal to 70%).
In one embodiment of the present invention, step S1 is further described by taking the registration of an optical image with a synthetic aperture radar (Synthetic Aperture Radar, SAR) image as an example. As shown in fig. 2, an image a with a fixed resolution is used as the reference image, an image b that overlaps a partial region of image a and carries geometric distortion is used as the image to be corrected, and after registration and correction by the registration method provided by the invention, an image c aligned pixel by pixel with the overlapping region of image a is obtained. The multi-source remote sensing image dataset comprises many pairs of regional images similar to images a and b. It should be understood that other embodiments of the present invention include, but are not limited to, registration of multi-source optical images, registration of optical images with infrared images, registration of optical images with LiDAR (Light Detection and Ranging) intensity and elevation images, and registration of optical images with grid maps; registration with the method provided by the invention in such cases also falls within the protection scope of the present invention.
S2, selecting a reference image f from the reference image data set, selecting an image m to be corrected corresponding to the reference image f from the image data set to be corrected, and taking the reference image f and the image m to be corrected as end-to-end input on a training sample.
S3, calculating, on 3 scales, the transformation parameters μ1, μ2, μ3 of the image on the model network of each scale, gradually correcting the image m to be corrected to generate corrected images m1, m2, m3, back-propagating the loss function of the model network of each scale, and taking the corrected image m3 and the transformation parameter μ3 as the end-to-end output on the training sample.
The embodiment of the invention adopts a coarse-to-fine multi-scale matching strategy: the model networks on the 3 scales are jointly trained in an end-to-end framework to predict the transformation parameters and their residuals, thereby realizing accurate registration of the images. The end-to-end framework means that, in the embodiment of the invention, the reference image f and the image m to be corrected are taken as input and the corrected image m3 and the transformation parameter μ3 are produced as output, which together constitute an end-to-end mapping.
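For illustration only, the following is a minimal sketch of this coarse-to-fine, three-scale forward pass. A PyTorch implementation is assumed (the patent does not name a framework); net1, net2 and net3 stand for the three model networks, and warp_affine and compose are the geometric-correction and parameter-combination helpers sketched in the later steps.

```python
import torch
import torch.nn.functional as F

def multiscale_forward(f, m, net1, net2, net3, warp_affine, compose):
    """Coarse-to-fine forward pass over 3 scales: (f, m) -> (m3, mu3).
    f, m are (B, C, H, W) tensors for the reference image and the image
    to be corrected."""
    # Scale 1: work at 1/4 resolution and regress the initial parameters mu1.
    x1 = torch.cat([F.interpolate(f, scale_factor=0.25, mode='bilinear', align_corners=False),
                    F.interpolate(m, scale_factor=0.25, mode='bilinear', align_corners=False)], dim=1)
    mu1 = net1(x1)                      # (B, 2, 3) affine parameters
    m1 = warp_affine(m, mu1)            # corrected image m1

    # Scale 2: work at 1/2 resolution, regress the residual and combine it with mu1.
    x2 = torch.cat([F.interpolate(f, scale_factor=0.5, mode='bilinear', align_corners=False),
                    F.interpolate(m1, scale_factor=0.5, mode='bilinear', align_corners=False)], dim=1)
    mu2 = compose(mu1, net2(x2))        # mu2 = mu1 * d_mu1
    m2 = warp_affine(m1, mu2)           # corrected image m2

    # Scale 3: work at full resolution, regress the residual and combine it with mu2.
    x3 = torch.cat([f, m2], dim=1)
    mu3 = compose(mu2, net3(x3))        # mu3 = mu2 * d_mu2
    m3 = warp_affine(m2, mu3)           # corrected image m3
    return (m1, mu1), (m2, mu2), (m3, mu3)
```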
As shown in fig. 3, step S3 includes the following substeps S3-1 to S3-10:
S3-1, inputting the reference image f and the image m to be corrected into the model network of the 1st scale (abbreviated in this embodiment as "model network 1"; the 1st scale is abbreviated as "scale 1") to obtain the transformation parameter μ1 of the 1st scale.
Step S3-1 includes the following substeps S3-1-1 to S3-1-3:
S3-1-1, downsampling the reference image f and the image m to be corrected to 1/4 of their original size respectively, and stacking the two downsampled images in the channel direction to generate a stacked image.
In the embodiment of the present invention, the size of the reference image f is fixed; if the size of the image m to be corrected is inconsistent with that of the reference image f, the image m to be corrected is adjusted to the size of the reference image f by zero padding or cropping.
S3-1-2, inputting the stacked image into the feature extraction part of the model network of the 1st scale to generate depth features.
In one embodiment of the invention, as shown in fig. 4, the feature extraction part of the model network 1 consists of k groups of interconnected convolution blocks and downsampling layers; each convolution block comprises a convolution layer, a local response normalization layer and a linear-unit activation function layer, and each downsampling layer reduces the image resolution to 1/2 of its original value. Experiments show that reasonably choosing the value of k so that the size of the feature map generated by the last convolution block lies within [4, 7], with the number of convolution kernel channels of the convolution layers set to 1/4 of the size of the (downsampled) image to be corrected, is favourable for generating more accurate transformation parameters μ1 in the subsequent steps. In the embodiment of the present invention, if the size of the reference image f and the image m to be corrected is 512×512, the 1/4-downsampled images have size 128×128; the number of convolution kernel channels of each convolution block is set to 32, the value of k is set to 5, and the feature map generated from the stacked image after 5 groups of convolution blocks and downsampling layers has size 4×4.
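A minimal sketch of this feature-extraction part is given below, again assuming PyTorch; the 3×3 kernel size, the local-response-normalization neighborhood and the use of max pooling as the downsampling layer are assumptions not fixed by the patent.

```python
import torch.nn as nn

def make_feature_extractor(in_ch, out_ch, k):
    """k groups of (convolution block + 2x downsampling layer); each convolution
    block is convolution -> local response normalization -> linear-unit activation,
    as described above. Kernel size, LRN size and pooling type are assumptions."""
    layers, ch = [], in_ch
    for _ in range(k):
        layers += [nn.Conv2d(ch, out_ch, kernel_size=3, padding=1),
                   nn.LocalResponseNorm(size=5),
                   nn.ReLU(inplace=True),
                   nn.MaxPool2d(2)]     # halves the resolution
        ch = out_ch
    return nn.Sequential(*layers)

# Scale-1 numbers from the embodiment: a 2-channel stacked input (single-band
# f and m), 32 convolution kernel channels, k = 5; a 128x128 stacked image
# then yields a 4x4 feature map.
feat1 = make_feature_extractor(in_ch=2, out_ch=32, k=5)
```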
In another embodiment of the invention, the feature extraction portion of the model network 1 includes, but is not limited to, the use of a U-shaped structural network (U-Net), a fully convolutional neural network (FCN), and the like.
S3-1-3, passing the depth features through the parameter regression part of the model network of the 1st scale to obtain the transformation parameter μ1 of the 1st scale.
As shown in fig. 4, in one embodiment of the present invention the parameter regression part of the model network 1 consists of t fully connected layers connected in parallel; the value of t can be chosen by weighing computation speed against the expected range of image scale change and is not limited by the present invention. Experiments show that when the scale factor between the images lies between 0.5 and 2, setting 4 parallel fully connected layers gives better results. The parallel fully connected layers are similar to the pyramid strategy used in conventional image registration, except that the initial values of the output spatial transformation parameters differ in scale. Compared with outputting the parameters from a single fully connected layer, computing multiple fully connected layers in parallel greatly accelerates the convergence of the loss function.
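A sketch of such a parameter regression part with t parallel fully connected layers is shown below (PyTorch assumed). The patent does not state how the parallel branches are merged or exactly how their initial scales are chosen, so the averaging of the branches and the bias initialization to identity transforms at scales 0.5–2 are assumptions of this sketch.

```python
import torch
import torch.nn as nn

class ParallelAffineRegressor(nn.Module):
    """t fully connected layers in parallel, each regressing the 6 affine
    parameters; biases are initialized to identity transforms at different
    scales so the branches start from different scale hypotheses."""
    def __init__(self, feat_dim, t=4, init_scales=(0.5, 1.0, 1.5, 2.0)):
        super().__init__()
        self.branches = nn.ModuleList([nn.Linear(feat_dim, 6) for _ in range(t)])
        for fc, s in zip(self.branches, init_scales):
            nn.init.zeros_(fc.weight)
            fc.bias.data = torch.tensor([s, 0., 0., 0., s, 0.])      # scaled identity
    def forward(self, feat):
        v = feat.flatten(1)                                           # (B, feat_dim)
        mu = torch.stack([fc(v) for fc in self.branches]).mean(0)    # merge branches (assumed)
        return mu.view(-1, 2, 3)                                      # (B, 2, 3) affine parameters

# Scale-1 numbers from the embodiment: a 4x4 feature map with 32 channels.
head1 = ParallelAffineRegressor(feat_dim=32 * 4 * 4)
```

Starting the branches from differently scaled identity transforms mirrors the pyramid-like role the patent attributes to the parallel layers.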
It should be understood that the implementation of the feature extraction part and the parameter regression part of the model network 1 is not limited to a particular form or set of parameters; any scheme that takes images stacked in the channel direction as input, extracts depth features through convolutional neural networks (Convolutional Neural Network, CNN) of various forms and parameters, and outputs geometric transformation parameters falls within the protection scope of the present invention.
S3-2, using the transformation parameter μ1 to geometrically correct the image m to be corrected and generate the corrected image m1.
The step S3-2 comprises the following substeps S3-2-1 to S3-2-2:
S3-2-1, composing the geometric transformation matrix T_μ1 from the transformation parameter μ1.
In one embodiment of the present invention, as shown in FIG. 4, 6 geometric transformation parameters a1, a2, a3, a4, a5, a6 are output in step S3-1-3, which form the two-dimensional affine matrix T_μ1 = [a1, a2, a3; a4, a5, a6].
The 6 parameters of the affine transformation matrix represent operations such as translation, rotation, scaling and shearing of the pixel coordinates of the image. Suppose the geometric transformation of the image consists of: a translation Dx in the x direction and a translation Dy in the y direction; a scaling factor Sx in the x direction and a scaling factor Sy in the y direction; a clockwise rotation angle θ; and a shear (miscut) angle φ in the x direction and a shear angle ω in the y direction. The 6 parameters of the two-dimensional affine matrix T_μ1 are then obtained as compositions (permutations and combinations) of these elementary operations.
in one embodiment of the present invention, the parameter regression section of the model network 1 outputs a greater or lesser number of geometric transformation parameters to construct other geometric transformation matrices than affine transformation, such as perspective transformation, rigid transformation, etc., to which the present invention is not limited.
S3-2-2, geometrically transforming the image m to be corrected through the geometric transformation matrix T_μ1 to generate the corrected image m1:
m1 = T_μ1(m)
Specifically, for each pixel with coordinates (x, y) and gray value σ on the image m to be corrected, the coordinates (X, Y) of that pixel on the corrected image are computed through the spatial transformation, and the corrected image m1 is generated with a chosen resampling and interpolation method. In the affine transformation embodiment, X = a1·x + a2·y + a3 and Y = a4·x + a5·y + a6.
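The geometric correction m1 = T_μ1(m) with bilinear resampling can be sketched as follows (PyTorch assumed). Note that affine_grid works in normalized [−1, 1] image coordinates, so μ is interpreted in normalized coordinates here, whereas the formulas above are written in pixel coordinates.

```python
import torch
import torch.nn.functional as F

def warp_affine(img, mu):
    """Apply the 2x3 affine parameter matrix mu to img with bilinear
    resampling, i.e. return T_mu(img). mu is expressed in the normalized
    coordinates expected by affine_grid (an assumption of this sketch)."""
    grid = F.affine_grid(mu, img.size(), align_corners=False)
    return F.grid_sample(img, grid, mode='bilinear',
                         padding_mode='zeros', align_corners=False)

# Identity parameters leave the image unchanged.
m = torch.rand(1, 1, 512, 512)
mu_id = torch.tensor([[[1., 0., 0.], [0., 1., 0.]]])
m1 = warp_affine(m, mu_id)
```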
S3-3, calculating the loss function Loss_sim(f, m, μ1) of the model network of the 1st scale, which is constructed from the similarity measure between the reference image f and the corrected image m1 = T_μ1(m).
Sim(·) denotes a similarity measure, i.e. Sim(A, B) is some similarity measure computed between image A and image B. Common similarity measures include the sum of squared gray-level differences (Sum of Squared Difference, SSD), normalized cross correlation (Normalized Cross Correlation, NCC), phase correlation (Phase Correlation), and the like:
SSD(A, B) = Σ (A(i, j) − B(i, j))², summed over all pixels (i, j);
NCC(A, B) = Σ (A(i, j) − Ā)(B(i, j) − B̄) / sqrt( Σ (A(i, j) − Ā)² · Σ (B(i, j) − B̄)² )
where the dimensions of image A and image B are both w×w, and Ā and B̄ are the gray-level means of image A and image B, respectively.
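The two measures written above translate directly into code; a minimal sketch (PyTorch assumed):

```python
import torch

def ssd(a, b):
    """Sum of squared gray-level differences (lower means more similar)."""
    return ((a - b) ** 2).flatten(1).sum(dim=1)

def ncc(a, b, eps=1e-8):
    """Normalized cross correlation in [-1, 1] (higher means more similar)."""
    a = a.flatten(1); b = b.flatten(1)
    a = a - a.mean(dim=1, keepdim=True)
    b = b - b.mean(dim=1, keepdim=True)
    return (a * b).sum(dim=1) / (a.norm(dim=1) * b.norm(dim=1) + eps)
```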
Computing traditional similarity measures such as SSD or NCC is time-consuming; since the correlation or convolution of two images in the spatial domain equals a product of their transforms in the frequency domain, phase correlation, which is faster to compute, is adopted. The specific steps are as follows:
Suppose image A and image B are related by a displacement (x0, y0), i.e. B(x, y) = A(x − x0, y − y0), and denote their Fourier transforms by F_A(u, v) and F_B(u, v), respectively. Then the following relationship holds in the frequency domain:
F_B(u, v) = F_A(u, v) · exp(−i(u·x0 + v·y0))
The normalized cross-power spectrum of the two is expressed as:
F_A(u, v) · F_B*(u, v) / |F_A(u, v) · F_B*(u, v)| = exp(i(u·x0 + v·y0))
where the superscript * denotes the complex conjugate.
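A frequency-domain sketch of this phase correlation (PyTorch assumed): the normalized cross-power spectrum is inverted back to the spatial domain, where its peak sits at the displacement (x0, y0); taking the peak value as the scalar similarity score is an assumption of this sketch.

```python
import torch

def phase_correlation(a, b, eps=1e-8):
    """Correlation surface from the normalized cross-power spectrum of a and b;
    the surface peaks at the relative displacement (x0, y0)."""
    Fa = torch.fft.fft2(a)
    Fb = torch.fft.fft2(b)
    cross = Fa * torch.conj(Fb)
    surface = torch.fft.ifft2(cross / (cross.abs() + eps)).real
    return surface.flatten(1).max(dim=1).values   # peak value as similarity score
```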
In one embodiment of the present invention, image A and image B are multi-source optical remote sensing images of the same region acquired by the same type of sensor, and the gray values are used directly as the input for computing the similarity measure between image A and image B.
In another embodiment of the present invention, image A and image B are remote sensing images of the same region acquired by different types of sensors (e.g. optical, infrared, SAR); instead of using the gray values directly, local feature descriptors of image A and image B are computed pixel by pixel, such as channel features of orientated gradients (Channel Features of Orientated Gradients, CFOG), histograms of oriented gradients (Histogram of Oriented Gradients, HOG), local self-similarity descriptors (Local Self-Similarity Descriptor, LSS), or histograms of orientated phase congruency (Histogram of Orientated Phase Congruency, HOPC). As shown in fig. 5, the SSD, NCC or phase correlation between the feature descriptor images of the two images is then used as the similarity measure.
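For the multimodal case, a much simplified orientated-gradient feature channel in the spirit of CFOG/HOG (not the exact published descriptors) can be sketched as follows, with SSD between the two descriptor images serving as the similarity; the bin count and smoothing choices are assumptions.

```python
import torch
import torch.nn.functional as F

def orientated_gradient_channels(img, n_bins=8):
    """Simplified per-pixel orientated-gradient channels: gradient magnitude is
    softly assigned to n_bins orientation channels and lightly smoothed."""
    kx = torch.tensor([[[[-1., 0., 1.]]]])          # horizontal gradient kernel
    ky = kx.transpose(2, 3)                         # vertical gradient kernel
    gx = F.conv2d(img, kx, padding=(0, 1))
    gy = F.conv2d(img, ky, padding=(1, 0))
    mag = torch.sqrt(gx ** 2 + gy ** 2 + 1e-8)
    ang = torch.atan2(gy, gx)
    centers = torch.linspace(-torch.pi, torch.pi, n_bins + 1)[:-1]
    chans = [mag * torch.relu(torch.cos(ang - c)) for c in centers]
    desc = torch.cat(chans, dim=1)                  # (B, n_bins, H, W)
    return F.avg_pool2d(desc, 3, stride=1, padding=1)

def descriptor_ssd(a, b):
    """SSD between the descriptor images of a and b (single-band inputs)."""
    return ((orientated_gradient_channels(a) - orientated_gradient_channels(b)) ** 2).mean()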
Steps S3-1 to S3-3 above describe in detail the generation of the transformation parameters and the corrected image and the calculation of the loss function on scale 1. The subsequent steps (S3-4 to S3-9) repeat similar operations on the other scales and differ from the scale-1 operations only in parameter settings, so their flow is summarized briefly below without repeating the underlying principles.
S3-4, inputting the reference image f and the corrected image m1 into the model network of the 2nd scale (abbreviated in this embodiment as "model network 2"; the 2nd scale is abbreviated as "scale 2") to obtain the residual Δμ1 of the transformation parameters, and combining it with the transformation parameter μ1 to obtain the 2nd-scale transformation parameter μ2.
The step S3-4 comprises the following substeps S3-4-1 to S3-4-4:
S3-4-1, downsampling the reference image f and the corrected image m1 to 1/2 of their original size respectively, and stacking the two downsampled images in the channel direction to generate a stacked image.
S3-4-2, inputting the stacked image into the feature extraction part of the model network of the 2nd scale to generate depth features.
In the embodiment of the present invention, the network structure of the model network 2 is similar to that of the model network 1 described above and differs only in parameter settings. To further illustrate the feature extraction of step S3-4-2 with a specific embodiment: if the size of the reference image f and the corrected image m1 is 512×512, the 1/2-downsampled images have size 256×256; the number of convolution kernel channels of each convolution block is set to 64, the value of k is set to 6, and the feature map generated from the stacked image after 6 groups of convolution blocks and downsampling layers has size 4×4.
S3-4-3, passing the depth features through the parameter regression part of the model network of the 2nd scale to obtain the residual Δμ1 of the transformation parameters.
S3-4-4, combining the residual Δμ1 with the transformation parameter μ1 to obtain the 2nd-scale transformation parameter μ2:
μ2 = μ1 * Δμ1
where * denotes matrix multiplication.
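The combination μ2 = μ1 * Δμ1 can be sketched by lifting both 2×3 affine parameter sets to 3×3 homogeneous matrices, multiplying them, and dropping the last row again (PyTorch assumed):

```python
import torch

def compose(mu, d_mu):
    """mu_new = mu * d_mu as matrix multiplication of the corresponding
    3x3 homogeneous matrices; mu and d_mu are (B, 2, 3) tensors."""
    bottom = torch.tensor([[[0., 0., 1.]]]).expand(mu.size(0), 1, 3)
    A = torch.cat([mu, bottom], dim=1)        # (B, 3, 3)
    B = torch.cat([d_mu, bottom], dim=1)      # (B, 3, 3)
    return torch.bmm(A, B)[:, :2, :]          # back to (B, 2, 3)
```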
S3-5, using the transformation parameter μ2 to geometrically correct the corrected image m1 and generate the corrected image m2.
The step S3-5 comprises the following substeps S3-5-1 to S3-5-2:
S3-5-1, composing the geometric transformation matrix T_μ2 from the transformation parameter μ2.
S3-5-2, geometrically transforming the corrected image m1 through the geometric transformation matrix T_μ2 to generate the corrected image m2:
m2 = T_μ2(m1)
S3-6, calculating the loss function Loss_sim(f, m1, μ2) of the model network of the 2nd scale, which is constructed from the similarity measure between the reference image f and the corrected image m2 = T_μ2(m1).
S3-7, inputting the reference image f and the corrected image m2 into the model network of the 3rd scale (abbreviated in this embodiment as "model network 3"; the 3rd scale is abbreviated as "scale 3") to obtain the residual Δμ2 of the transformation parameters, and combining it with the transformation parameter μ2 to obtain the 3rd-scale transformation parameter μ3.
The step S3-7 comprises the following substeps S3-7-1 to S3-7-4:
S3-7-1, stacking the reference image f and the corrected image m2 in the channel direction to generate a stacked image.
S3-7-2, inputting the stacked image into the feature extraction part of the model network of the 3rd scale to generate depth features.
In the embodiment of the present invention, the network structure of the model network 3 is similar to those of the model network 1 and the model network 2 described above and differs only in parameter settings. To further illustrate the feature extraction of step S3-7-2 with a specific embodiment: if the size of the reference image f and the corrected image m2 is 512×512, the number of convolution kernel channels of each convolution block is set to 128, the value of k is set to 7, and the feature map generated from the stacked image after 7 groups of convolution blocks and downsampling layers has size 4×4.
S3-7-3, passing the depth features through the parameter regression part of the model network of the 3rd scale to obtain the residual Δμ2 of the transformation parameters.
S3-7-4, combining the residual Δμ2 with the transformation parameter μ2 to obtain the 3rd-scale transformation parameter μ3:
μ3 = μ2 * Δμ2
where * denotes matrix multiplication.
S3-8, using the transformation parameter μ3 to geometrically correct the corrected image m2 and generate the corrected image m3.
The step S3-8 comprises the following substeps S3-8-1 to S3-8-2:
S3-8-1, composing the geometric transformation matrix T_μ3 from the transformation parameter μ3.
S3-8-2, geometrically transforming the corrected image m2 through the geometric transformation matrix T_μ3 to generate the corrected image m3:
m3 = T_μ3(m2)
S3-9, calculating the loss function Loss_sim(f, m2, μ3) of the model network of the 3rd scale, which is constructed from the similarity measure between the reference image f and the corrected image m3 = T_μ3(m2).
S3-10, taking the corrected image m3 and the transformation parameter μ3 as the end-to-end output on the training sample.
S4, respectively initializing model network parameters of 3 scales.
Step S4 includes the following sub-steps S4-1 to S4-3:
S4-1, training the model network of the 1st scale by minimizing the loss function Loss_sim(f, m, μ1).
S4-2, fixing the parameters of the model network of the 1st scale and training the model network of the 2nd scale by minimizing the loss function Loss_sim(f, m1, μ2).
S4-3, fixing the parameters of the model networks of the 1st and 2nd scales and training the model network of the 3rd scale by minimizing the loss function Loss_sim(f, m2, μ3).
S5, performing joint training on the model network with 3 scales in an end-to-end mode, and optimizing joint loss functions on the 3 scales.
In the embodiment of the invention, before the model networks of the 3 scales are jointly trained, the parameters of all the model networks must be released from being fixed.
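The per-scale initialization of step S4 and the subsequent release of the fixed parameters can be sketched with requires_grad flags (PyTorch assumed); train_one_scale stands for any routine that minimizes the corresponding per-scale loss and is not specified by the patent.

```python
def set_trainable(module, flag):
    for p in module.parameters():
        p.requires_grad = flag

def initialize_scales(net1, net2, net3, train_one_scale):
    """S4: train scale 1 alone, then scale 2 with scale 1 fixed, then scale 3
    with scales 1 and 2 fixed; finally release all parameters for the joint
    training of step S5."""
    train_one_scale(net1)                 # minimize Loss_sim(f, m,  mu1)
    set_trainable(net1, False)
    train_one_scale(net2)                 # minimize Loss_sim(f, m1, mu2)
    set_trainable(net2, False)
    train_one_scale(net3)                 # minimize Loss_sim(f, m2, mu3)
    for net in (net1, net2, net3):        # un-fix everything before joint training
        set_trainable(net, True)
```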
In the embodiment of the invention, the joint Loss function Loss is:
Loss = λ1 × Loss_sim(f, m, μ1) + λ2 × Loss_sim(f, m1, μ2) + λ3 × Loss_sim(f, m2, μ3)
where λ1, λ2, λ3 are the weight factors of the loss functions of the model networks of the respective scales; in the embodiment of the invention the values of λ1, λ2, λ3 are 0.05, 0.05 and 0.9, respectively.
S6, searching, through a deep learning optimizer, the direction in which the joint loss function value decreases most rapidly, carrying out back propagation through the model networks along that direction and iteratively updating the model network parameters. When the joint loss function has decreased to a preset threshold and converged, all the end-to-end-mapped model networks hold globally optimal parameters and the reference image f and the corrected image m3 have the best similarity; the network model parameters at that moment are stored, and the registered reference image f and corrected image m3 are output.
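A sketch of this joint optimization (steps S5/S6) is given below, with PyTorch assumed. Adam as the deep learning optimizer, the learning rate, the iteration cap and the per-scale loss form 1 − Sim are assumptions of the sketch; forward_fn is a wrapper around the multi-scale forward pass sketched earlier and sim is the chosen similarity measure.

```python
import torch

def joint_train(net1, net2, net3, f, m, forward_fn, sim,
                weights=(0.05, 0.05, 0.9), lr=1e-4, threshold=0.05, max_iter=2000):
    """Jointly optimize the three model networks end to end on one sample pair."""
    params = list(net1.parameters()) + list(net2.parameters()) + list(net3.parameters())
    opt = torch.optim.Adam(params, lr=lr)
    l1, l2, l3 = weights
    for _ in range(max_iter):
        (m1, mu1), (m2, mu2), (m3, mu3) = forward_fn(f, m)
        loss = (l1 * (1 - sim(f, m1)).mean()       # Loss_sim(f, m,  mu1), assumed 1 - Sim
                + l2 * (1 - sim(f, m2)).mean()     # Loss_sim(f, m1, mu2)
                + l3 * (1 - sim(f, m3)).mean())    # Loss_sim(f, m2, mu3)
        opt.zero_grad()
        loss.backward()                            # back-propagate through all 3 scales
        opt.step()
        if loss.item() < threshold:                # preset convergence threshold
            break
    return m3.detach(), mu3.detach()
```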
The invention thus realizes accurate multi-scale remote sensing image registration that is learned in a completely unsupervised manner and mapped end to end.
Those of ordinary skill in the art will recognize that the embodiments described herein are intended to help the reader understand the principles of the present invention, and it should be understood that the scope of the invention is not limited to these specific statements and embodiments. Those of ordinary skill in the art can make various other specific modifications and combinations based on the teachings of the present disclosure without departing from its spirit, and such modifications and combinations remain within the scope of the present disclosure.
Claims (2)
1. The remote sensing image registration method based on the unsupervised deep learning is characterized by comprising the following steps of:
S1, establishing a multi-source remote sensing image registration data set comprising two groups of image data whose images correspond to each other one by one, wherein one group of image data is used as a reference image data set and the other group of image data is used as an image data set to be corrected;
S2, selecting a reference image f from the reference image data set, selecting the image m to be corrected corresponding to the reference image f from the image data set to be corrected, and taking the reference image f and the image m to be corrected as the end-to-end input on a training sample;
S3, calculating, on 3 scales, the transformation parameters μ1, μ2, μ3 of the image on the model network of each scale, gradually correcting the image m to be corrected to generate corrected images m1, m2, m3, back-propagating the loss function of the model network of each scale, and taking the corrected image m3 and the transformation parameter μ3 as the end-to-end output on the training sample;
S4, respectively initializing the model network parameters of the 3 scales;
S5, performing joint training on the model networks of the 3 scales in an end-to-end mode, and optimizing the joint loss function over the 3 scales;
S6, searching, through a deep learning optimizer, the direction in which the joint loss function value decreases most rapidly, carrying out back propagation on the model networks along that direction, iteratively updating the model network parameters, storing the network model parameters at the moment when the joint loss function has decreased to a preset threshold and converged, and outputting the registered reference image f and corrected image m3;
The step S3 comprises the following sub-steps:
S3-1, inputting the reference image f and the image m to be corrected into the model network of the 1st scale to obtain the transformation parameter μ1 of the 1st scale;
S3-2, using the transformation parameter μ1 to geometrically correct the image m to be corrected and generate the corrected image m1;
S3-3, calculating the loss function of the model network of the 1st scale;
S3-4, inputting the reference image f and the corrected image m1 into the model network of the 2nd scale to obtain the residual Δμ1 of the transformation parameters, and combining it with the transformation parameter μ1 to obtain the 2nd-scale transformation parameter μ2;
S3-5, using the transformation parameter μ2 to geometrically correct the corrected image m1 and generate the corrected image m2;
S3-6, calculating the loss function of the model network of the 2nd scale;
S3-7, inputting the reference image f and the corrected image m2 into the model network of the 3rd scale to obtain the residual Δμ2 of the transformation parameters, and combining it with the transformation parameter μ2 to obtain the 3rd-scale transformation parameter μ3;
S3-8, using the transformation parameter μ3 to geometrically correct the corrected image m2 and generate the corrected image m3;
S3-9, calculating the loss function of the model network of the 3rd scale;
S3-10, taking the corrected image m3 and the transformation parameter μ3 as the end-to-end output on the training sample;
the step S3-1 comprises the following sub-steps:
S3-1-1, downsampling the reference image f and the image m to be corrected to 1/4 of their original size respectively, and stacking the two downsampled images in the channel direction to generate a stacked image;
S3-1-2, inputting the stacked image into the feature extraction part of the model network of the 1st scale to generate depth features;
S3-1-3, passing the depth features through the parameter regression part of the model network of the 1st scale to obtain the transformation parameter μ1 of the 1st scale;
The step S3-2 comprises the following sub-steps:
S3-2-1, composing the geometric transformation matrix T_μ1 from the transformation parameter μ1;
S3-2-2, geometrically transforming the image m to be corrected through the geometric transformation matrix T_μ1 to generate the corrected image m1;
The step S3-4 comprises the following sub-steps:
S3-4-1, downsampling the reference image f and the corrected image m1 to 1/2 of their original size respectively, and stacking the two downsampled images in the channel direction to generate a stacked image;
S3-4-2, inputting the stacked image into the feature extraction part of the model network of the 2nd scale to generate depth features;
S3-4-3, passing the depth features through the parameter regression part of the model network of the 2nd scale to obtain the residual Δμ1 of the transformation parameters;
S3-4-4, combining the residual Δμ1 with the transformation parameter μ1 to obtain the 2nd-scale transformation parameter μ2;
The step S3-5 comprises the following sub-steps:
S3-5-1, composing the geometric transformation matrix T_μ2 from the transformation parameter μ2;
S3-5-2, geometrically transforming the corrected image m1 through the geometric transformation matrix T_μ2 to generate the corrected image m2;
The step S3-7 comprises the following sub-steps:
S3-7-1, stacking the reference image f and the corrected image m2 in the channel direction to generate a stacked image;
S3-7-2, inputting the stacked image into the feature extraction part of the model network of the 3rd scale to generate depth features;
S3-7-3, passing the depth features through the parameter regression part of the model network of the 3rd scale to obtain the residual Δμ2 of the transformation parameters;
S3-7-4, combining the residual Δμ2 with the transformation parameter μ2 to obtain the 3rd-scale transformation parameter μ3;
The step S3-8 comprises the following sub-steps:
S3-8-1, composing the geometric transformation matrix T_μ3 from the transformation parameter μ3;
S3-8-2, geometrically transforming the corrected image m2 through the geometric transformation matrix T_μ3 to generate the corrected image m3;
The loss function Loss_sim(f, m, μ1) of the model network of the 1st scale in step S3-3 is constructed from the similarity measure between the reference image f and the corrected image m1 = T_μ1(m);
The loss function Loss_sim(f, m1, μ2) of the model network of the 2nd scale in step S3-6 is constructed from the similarity measure between the reference image f and the corrected image m2 = T_μ2(m1);
The loss function Loss_sim(f, m2, μ3) of the model network of the 3rd scale in step S3-9 is constructed from the similarity measure between the reference image f and the corrected image m3 = T_μ3(m2);
The joint loss function Loss in step S5 is:
Loss = λ1 × Loss_sim(f, m, μ1) + λ2 × Loss_sim(f, m1, μ2) + λ3 × Loss_sim(f, m2, μ3)
where Sim(·) denotes the similarity measure and λ1, λ2, λ3 are the weight factors of the loss functions of the model networks of the respective scales.
2. The remote sensing image registration method according to claim 1, wherein the step S4 comprises the following sub-steps:
S4-1, training the model network of the 1st scale by minimizing the loss function Loss_sim(f, m, μ1);
S4-2, fixing the parameters of the model network of the 1st scale and training the model network of the 2nd scale by minimizing the loss function Loss_sim(f, m1, μ2);
S4-3, fixing the parameters of the model networks of the 1st scale and the 2nd scale and training the model network of the 3rd scale by minimizing the loss function Loss_sim(f, m2, μ3).
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210026370.7A CN114494372B (en) | 2022-01-11 | 2022-01-11 | Remote sensing image registration method based on unsupervised deep learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210026370.7A CN114494372B (en) | 2022-01-11 | 2022-01-11 | Remote sensing image registration method based on unsupervised deep learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114494372A CN114494372A (en) | 2022-05-13 |
CN114494372B true CN114494372B (en) | 2023-04-21 |
Family
ID=81509569
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210026370.7A Active CN114494372B (en) | 2022-01-11 | 2022-01-11 | Remote sensing image registration method based on unsupervised deep learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114494372B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114693755B (en) * | 2022-05-31 | 2022-08-30 | 湖南大学 | Non-rigid registration method and system for multimode image maximum moment and space consistency |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109345575B (en) * | 2018-09-17 | 2021-01-19 | 中国科学院深圳先进技术研究院 | Image registration method and device based on deep learning |
CN109711444B (en) * | 2018-12-18 | 2024-07-19 | 中国科学院遥感与数字地球研究所 | Novel remote sensing image registration method based on deep learning |
CN111414968B (en) * | 2020-03-26 | 2022-05-03 | 西南交通大学 | Multi-mode remote sensing image matching method based on convolutional neural network characteristic diagram |
CN113901900A (en) * | 2021-09-29 | 2022-01-07 | 西安电子科技大学 | Unsupervised change detection method and system for homologous or heterologous remote sensing image |
-
2022
- 2022-01-11 CN CN202210026370.7A patent/CN114494372B/en active Active
Also Published As
Publication number | Publication date |
---|---|
CN114494372A (en) | 2022-05-13 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Ye et al. | Fast and robust matching for multimodal remote sensing image registration | |
Ye et al. | A robust multimodal remote sensing image registration method and system using steerable filters with first-and second-order gradients | |
CN108765476B (en) | Polarized image registration method | |
Chen et al. | Convolutional neural network based dem super resolution | |
Fang et al. | SAR-optical image matching by integrating Siamese U-Net with FFT correlation | |
CN114693755B (en) | Non-rigid registration method and system for multimode image maximum moment and space consistency | |
KR101941878B1 (en) | System for unmanned aircraft image auto geometric correction | |
CN114494372B (en) | Remote sensing image registration method based on unsupervised deep learning | |
Zhang et al. | GPU-accelerated large-size VHR images registration via coarse-to-fine matching | |
CN117788296B (en) | Infrared remote sensing image super-resolution reconstruction method based on heterogeneous combined depth network | |
CN113034371B (en) | Infrared and visible light image fusion method based on feature embedding | |
CN117689702A (en) | Point cloud registration method and device based on geometric attention mechanism | |
CN107358625B (en) | SAR image change detection method based on SPP Net and region-of-interest detection | |
CN117765039A (en) | Point cloud coarse registration method, device and equipment | |
CN114998630B (en) | Ground-to-air image registration method from coarse to fine | |
Dong et al. | An Intelligent Detection Method for Optical Remote Sensing Images Based on Improved YOLOv7. | |
Ansari et al. | Curvelet Based U-Net Framework for Building Footprint Identification | |
Dou et al. | Object detection based on hierarchical visual perception mechanism | |
WO2024222610A1 (en) | Remote-sensing image change detection method based on deep convolutional network | |
Tang et al. | MU-NET: A multiscale unsupervised network for remote sensing image registration | |
Ge et al. | a Novel Remote Sensing Image Registration Algorithm Based on the Adaptive Pcnn Segmentation | |
Li | Accuracy improved image registration based on pre-estimation and compensation | |
CN117765045A (en) | Visible light-SAR image registration method and device based on multipoint matching constraint | |
Nie et al. | Sea-land segmentation based on template matching | |
Zhou et al. | Research on unsupervised learning-based image stitching |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |