CN114494372B - Remote sensing image registration method based on unsupervised deep learning - Google Patents
Remote sensing image registration method based on unsupervised deep learning
- Publication number
- CN114494372B (application CN202210026370.7A)
- Authority
- CN
- China
- Prior art keywords
- image
- scale
- model network
- corrected
- transformation
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Links
- 238000000034 method Methods 0.000 title claims abstract description 26
- 238000013135 deep learning Methods 0.000 title claims abstract description 16
- 230000009466 transformation Effects 0.000 claims abstract description 115
- 230000006870 function Effects 0.000 claims abstract description 49
- 238000012549 training Methods 0.000 claims abstract description 18
- 238000000605 extraction Methods 0.000 claims abstract description 16
- 238000011524 similarity measure Methods 0.000 claims abstract description 13
- 238000012937 correction Methods 0.000 claims description 42
- 239000011159 matrix material Substances 0.000 claims description 23
- 230000007423 decrease Effects 0.000 claims description 3
- 238000005070 sampling Methods 0.000 claims 2
- 238000005457 optimization Methods 0.000 abstract description 2
- 230000003287 optical effect Effects 0.000 description 7
- 238000004364 calculation method Methods 0.000 description 4
- 238000010586 diagram Methods 0.000 description 4
- 238000013507 mapping Methods 0.000 description 4
- 238000013527 convolutional neural network Methods 0.000 description 3
- 238000013519 translation Methods 0.000 description 3
- 230000000694 effects Effects 0.000 description 2
- 238000002474 experimental method Methods 0.000 description 2
- 230000008569 process Effects 0.000 description 2
- 238000010008 shearing Methods 0.000 description 2
- 238000012952 Resampling Methods 0.000 description 1
- 230000004913 activation Effects 0.000 description 1
- 230000009286 beneficial effect Effects 0.000 description 1
- 230000021615 conjugation Effects 0.000 description 1
- 238000013461 design Methods 0.000 description 1
- 238000001514 detection method Methods 0.000 description 1
- 238000011161 development Methods 0.000 description 1
- 238000006073 displacement reaction Methods 0.000 description 1
- 238000003384 imaging method Methods 0.000 description 1
- 230000010354 integration Effects 0.000 description 1
- 238000005259 measurement Methods 0.000 description 1
- 230000007246 mechanism Effects 0.000 description 1
- 238000010606 normalization Methods 0.000 description 1
- 230000004044 response Effects 0.000 description 1
- 238000001228 spectrum Methods 0.000 description 1
Images
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/30—Determination of transform parameters for the alignment of images, i.e. image registration
- G06T7/33—Determination of transform parameters for the alignment of images, i.e. image registration using feature-based methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10032—Satellite or aerial image; Remote sensing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Biomedical Technology (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- Life Sciences & Earth Sciences (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Image Analysis (AREA)
Abstract
Description
Technical Field
The present invention belongs to the field of remote sensing technology, and in particular relates to the design of a remote sensing image registration method based on unsupervised deep learning.
Background Art
With the rapid development of aerospace and remote sensing technology, the means of acquiring remote sensing images keep increasing and the types of imagery keep diversifying. Owing to differences in sensor hardware and imaging mechanisms, remote sensing images from a single data source can hardly reflect the characteristics of ground objects comprehensively. To make full use of the multi-source remote sensing data acquired by different types of sensors and to achieve data integration and information complementarity, multi-source remote sensing images need to be registered.
Multi-source remote sensing image registration refers to the process of aligning, and superimposing the information of, multi-sensor remote sensing images of the same area acquired at different times, from different viewing angles, or by different sensors, so that corresponding points on the aligned images share the same geographic coordinates. In the prior art, multi-source remote sensing image registration methods include traditional methods that do not use deep learning and deep-learning-based methods. Traditional methods are based on features or region templates and rely on hand-crafted features; for registering remote sensing images from different sensors and modalities, these hand-crafted features usually have to be redesigned. Deep-learning-based methods extract deep features from multi-source remote sensing images and generalize better than hand-crafted features. However, current supervised deep learning methods require a large number of samples with ground-truth labels as training data, and the remote sensing field currently lacks such large labeled datasets; this cost constraint limits the practical application of such methods.
Summary of the Invention
The purpose of the present invention is to solve the problem that existing remote sensing image registration methods based on supervised deep learning can hardly obtain a large number of training samples. It proposes a remote sensing image registration method based on unsupervised deep learning, which can achieve accurate registration between remote sensing images without labeled training samples.
The technical solution of the present invention is a remote sensing image registration method based on unsupervised deep learning, comprising the following steps:
S1. Establish a multi-source remote sensing image registration dataset comprising two sets of image data with a one-to-one correspondence between their images; one set serves as the reference image dataset and the other as the to-be-corrected image dataset.
S2. Select a reference image f from the reference image dataset and the corresponding to-be-corrected image m from the to-be-corrected image dataset, and use the reference image f and the to-be-corrected image m as the end-to-end input for one training sample.
S3. At three scales, compute the transformation parameters μ1, μ2 and μ3 of the image on the model network of each scale, correct the image m step by step to produce corrected images m1, m2 and m3, back-propagate the loss function of the model network of each scale, and use the corrected image m3 and the transformation parameters μ3 as the end-to-end output for this training sample.
S4. Initialize the model network parameters of the three scales separately.
S5. Jointly train the model networks of the three scales in an end-to-end manner, optimizing the joint loss function over the three scales.
S6. Use a deep learning optimizer to find the direction in which the joint loss function decreases fastest, back-propagate through the model networks along that direction, and iteratively update the model network parameters. When the joint loss function drops below a preset threshold and converges, save the network parameters at that point and output the registered reference image f and the corrected image m3.
Further, step S3 includes the following sub-steps:
S3-1. Input the reference image f and the to-be-corrected image m into the model network of the first scale to obtain the first-scale transformation parameters μ1.
S3-2. Geometrically correct the to-be-corrected image m with the transformation parameters μ1 to produce the corrected image m1.
S3-3. Compute the loss function of the model network of the first scale.
S3-4. Input the reference image f and the corrected image m1 into the model network of the second scale to obtain the transformation-parameter residual Δμ1, and combine it with μ1 to obtain the second-scale transformation parameters μ2.
S3-5. Geometrically correct the corrected image m1 with the transformation parameters μ2 to produce the corrected image m2.
S3-6. Compute the loss function of the model network of the second scale.
S3-7. Input the reference image f and the corrected image m2 into the model network of the third scale to obtain the transformation-parameter residual Δμ2, and combine it with μ2 to obtain the third-scale transformation parameters μ3.
S3-8. Geometrically correct the corrected image m2 with the transformation parameters μ3 to produce the corrected image m3.
S3-9. Compute the loss function of the model network of the third scale.
S3-10. Take the corrected image m3 and the transformation parameters μ3 as the end-to-end output for this training sample.
Further, step S3-1 includes the following sub-steps:
S3-1-1. Down-sample the reference image f and the to-be-corrected image m to 1/4 of their original size, and stack the two down-sampled images along the channel direction to produce a stacked image.
S3-1-2. Input the stacked image into the feature-extraction part of the model network of the first scale to produce deep features.
S3-1-3. Pass the deep features through the parameter-regression part of the model network of the first scale to obtain the first-scale transformation parameters μ1.
Further, step S3-2 includes the following sub-steps:
S3-2-1. Form the geometric transformation matrix Tμ1 from the transformation parameters μ1.
S3-2-2. Geometrically transform the to-be-corrected image m with the geometric transformation matrix Tμ1 to produce the corrected image m1.
Further, step S3-4 includes the following sub-steps:
S3-4-1. Down-sample the reference image f and the corrected image m1 to 1/2 of their original size, and stack the two down-sampled images along the channel direction to produce a stacked image.
S3-4-2. Input the stacked image into the feature-extraction part of the model network of the second scale to produce deep features.
S3-4-3. Pass the deep features through the parameter-regression part of the model network of the second scale to obtain the transformation-parameter residual Δμ1.
S3-4-4. Combine the residual Δμ1 with the transformation parameters μ1 to obtain the second-scale transformation parameters μ2.
Further, step S3-5 includes the following sub-steps:
S3-5-1. Form the geometric transformation matrix Tμ2 from the transformation parameters μ2.
S3-5-2. Geometrically transform the corrected image m1 with the geometric transformation matrix Tμ2 to produce the corrected image m2.
Further, step S3-7 includes the following sub-steps:
S3-7-1. Stack the reference image f and the corrected image m2 along the channel direction to produce a stacked image.
S3-7-2. Input the stacked image into the feature-extraction part of the model network of the third scale to produce deep features.
S3-7-3. Pass the deep features through the parameter-regression part of the model network of the third scale to obtain the transformation-parameter residual Δμ2.
S3-7-4. Combine the residual Δμ2 with the transformation parameters μ2 to obtain the third-scale transformation parameters μ3.
Further, step S3-8 includes the following sub-steps:
S3-8-1. Form the geometric transformation matrix Tμ3 from the transformation parameters μ3.
S3-8-2. Geometrically transform the corrected image m2 with the geometric transformation matrix Tμ3 to produce the corrected image m3.
Further, the loss function Losssim(f, m, μ1) of the first-scale model network in step S3-3, the loss function Losssim(f, m1, μ2) of the second-scale model network in step S3-6, and the loss function Losssim(f, m2, μ3) of the third-scale model network in step S3-9 are each defined in terms of the similarity measure Sim(·) between the reference image and the corresponding corrected image.
The joint loss function Loss in step S5 is:
Loss = λ1 × Losssim(f, m, μ1) + λ2 × Losssim(f, m1, μ2) + λ3 × Losssim(f, m2, μ3)
where Sim(·) denotes the similarity measure and λ1, λ2 and λ3 are the weight factors of the loss functions of the scale-specific model networks.
Further, step S4 includes the following sub-steps:
S4-1. Train the model network of the first scale by minimizing the loss function Losssim(f, m, μ1).
S4-2. Fix the parameters of the first-scale model network and train the model network of the second scale by minimizing the loss function Losssim(f, m1, μ2).
S4-3. Fix the parameters of the first-scale and second-scale model networks and train the model network of the third scale by minimizing the loss function Losssim(f, m2, μ3).
The beneficial effects of the present invention are:
(1) The present invention turns image registration into a regression optimization problem. It can integrate feature-extraction networks, image similarity measures and feature descriptors of various forms and parameters, and achieves accurate multi-scale image registration with fully unsupervised learning and end-to-end mapping.
(2) The present invention uses model networks at multiple scales to extract deep features of the images to be registered, obtains geometric transformation parameters through parameter regression, and uses these parameters to geometrically correct the image, thereby achieving coarse-to-fine, stage-by-stage multi-scale registration.
(3) The present invention does not require registration ground truth as training samples. By constructing loss functions based on inter-image similarity measures and feature descriptors, jointly training the losses over multiple scales, and updating the parameters of each model network by back-propagation, it optimizes the geometric transformation parameters and achieves high-precision, highly robust multi-source remote sensing image registration.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a flow chart of a remote sensing image registration method based on unsupervised deep learning provided by an embodiment of the present invention.
FIG. 2 is a schematic diagram of the reference image, the to-be-corrected image and the corrected image provided by an embodiment of the present invention.
FIG. 3 is a schematic diagram of the overall framework of the remote sensing image registration method provided by an embodiment of the present invention.
FIG. 4 is a schematic diagram of the structure of model network 1 provided by an embodiment of the present invention.
FIG. 5 is a schematic diagram of computing the similarity measure of multi-source remote sensing images provided by an embodiment of the present invention.
DETAILED DESCRIPTION
Exemplary embodiments of the present invention will now be described in detail with reference to the accompanying drawings. It should be understood that the embodiments shown and described in the drawings are merely exemplary, are intended to explain the principles and spirit of the present invention, and do not limit its scope.
An embodiment of the present invention provides a remote sensing image registration method based on unsupervised deep learning which, as shown in FIG. 1, comprises the following steps S1 to S6:
S1. Establish a multi-source remote sensing image registration dataset comprising two sets of image data with a one-to-one correspondence between their images; one set serves as the reference image dataset and the other as the to-be-corrected image dataset.
In this embodiment, a to-be-corrected image in the to-be-corrected image dataset should be an image that overlaps the ground-object content of the reference image over a certain proportion of its area (greater than or equal to 70% in this embodiment) and that carries geometric distortion.
In one embodiment of the present invention, step S1 is further illustrated by taking the registration of an optical image and a Synthetic Aperture Radar (SAR) image as an example. As shown in FIG. 2, a fixed-resolution image a is used as the reference image, and an image b that partially overlaps image a and carries geometric distortion is used as the to-be-corrected image. After registration and correction by the method provided by the present invention, an image c aligned pixel by pixel with the overlapping area of image a is obtained. The multi-source remote sensing image dataset contains multiple pairs of regional images similar to images a and b above. It should be understood that other embodiments of the present invention include, but are not limited to, the registration of multi-source optical images, of optical and infrared images, of optical images with Light Detection and Ranging (LiDAR) intensity and elevation images, and of optical images with raster maps; using the registration method provided by the present invention in these cases still falls within the protection scope of the present invention.
S2. Select a reference image f from the reference image dataset and the corresponding to-be-corrected image m from the to-be-corrected image dataset, and use the reference image f and the to-be-corrected image m as the end-to-end input for one training sample.
S3. At three scales, compute the transformation parameters μ1, μ2 and μ3 of the image on the model network of each scale, correct the image m step by step to produce corrected images m1, m2 and m3, back-propagate the loss function of the model network of each scale, and use the corrected image m3 and the transformation parameters μ3 as the end-to-end output for this training sample.
This embodiment adopts a coarse-to-fine multi-scale matching strategy and jointly trains the model networks at three scales within an end-to-end framework to predict the transformation parameters and their residuals, thereby achieving accurate image registration. "End-to-end framework" means that, in this embodiment, the reference image f and the to-be-corrected image m are the input and the corrected image m3 and the transformation parameters μ3 are the output, forming an end-to-end mapping.
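As an illustrative, non-limiting sketch of this end-to-end, coarse-to-fine flow, the following PyTorch-style code assumes three scale-specific networks net1, net2 and net3 (placeholders for model networks 1 to 3 described below) that each map a channel-stacked image pair to affine parameters, and uses standard grid sampling for the geometric correction; the helper names and the parameterization as 2×3 affine matrices are assumptions for illustration, not the literal implementation of the embodiment.

```python
import torch
import torch.nn.functional as F

def to_3x3(theta):
    """Expand a batch of 2x3 affine matrices (N, 2, 3) to 3x3 homogeneous form."""
    bottom = torch.tensor([0.0, 0.0, 1.0], device=theta.device).expand(theta.shape[0], 1, 3)
    return torch.cat([theta, bottom], dim=1)

def compose(mu_prev, d_mu):
    """mu_new = mu_prev * d_mu as a product of homogeneous matrices (returns the 2x3 part)."""
    return torch.bmm(to_3x3(mu_prev), to_3x3(d_mu))[:, :2, :]

def warp_affine(img, theta):
    """Geometrically correct img (N, C, H, W) with a batch of 2x3 affine matrices."""
    grid = F.affine_grid(theta, img.shape, align_corners=False)
    return F.grid_sample(img, grid, align_corners=False)

def down(img, factor):
    return F.interpolate(img, scale_factor=1.0 / factor, mode="bilinear", align_corners=False)

def register_pair(f, m, net1, net2, net3):
    """End-to-end mapping: input (f, m), output the corrected images and parameters."""
    mu1 = net1(torch.cat([down(f, 4), down(m, 4)], dim=1))     # scale 1, 1/4 resolution
    m1 = warp_affine(m, mu1)
    d_mu1 = net2(torch.cat([down(f, 2), down(m1, 2)], dim=1))  # scale 2, 1/2 resolution
    mu2 = compose(mu1, d_mu1)
    m2 = warp_affine(m1, mu2)
    d_mu2 = net3(torch.cat([f, m2], dim=1))                    # scale 3, full resolution
    mu3 = compose(mu2, d_mu2)
    m3 = warp_affine(m2, mu3)
    return (m1, m2, m3), (mu1, mu2, mu3)
```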
As shown in FIG. 3, step S3 includes the following sub-steps S3-1 to S3-10:
S3-1. Input the reference image f and the to-be-corrected image m into the model network of the first scale (referred to as "scale 1" and "model network 1" in this embodiment) to obtain the first-scale transformation parameters μ1.
Step S3-1 includes the following sub-steps S3-1-1 to S3-1-3:
S3-1-1. Down-sample the reference image f and the to-be-corrected image m to 1/4 of their original size, and stack the two down-sampled images along the channel direction to produce a stacked image.
In this embodiment, the size of the reference image f is fixed. If the size of the to-be-corrected image m differs from that of the reference image f, the to-be-corrected image m is usually zero-padded or cropped so that its size matches that of the reference image f.
S3-1-2. Input the stacked image into the feature-extraction part of the model network of the first scale to produce deep features.
As shown in FIG. 4, in one embodiment of the present invention the feature-extraction part of model network 1 consists of k interconnected groups of convolution blocks and down-sampling layers. Each convolution block contains a convolutional layer, a local response normalization layer and a linear-unit activation function layer, and each down-sampling layer halves the image resolution. Experiments show that choosing k so that the feature map produced by the last convolution block has a size between 4 and 7, and setting the number of convolution kernel channels of the convolutional layers to 1/4 of the to-be-corrected image size, helps the subsequent steps produce more accurate transformation parameters μ1. In this embodiment, if the reference image f and the to-be-corrected image m are 512×512, the 1/4-down-sampled images are 128×128; the number of convolution kernel channels of each convolution block is set to 32, k is set to 5, and the stacked image passed through the 5 groups of convolution blocks and down-sampling layers yields a feature map of size 4.
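A minimal PyTorch sketch of such a feature extractor is shown below, assuming the scale-1 configuration of this embodiment (k = 5 groups, 32 kernels per convolution block); the kernel size, the use of max pooling for the down-sampling layers, the ReLU as the linear-unit activation, and the LRN hyper-parameters are illustrative assumptions rather than values specified by the embodiment.

```python
import torch
import torch.nn as nn

class ConvBlock(nn.Module):
    """Convolution block: convolution + local response normalization + linear-unit activation."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1)  # kernel size assumed
        self.lrn = nn.LocalResponseNorm(size=5)                          # LRN parameters assumed
        self.act = nn.ReLU(inplace=True)                                 # linear-unit activation (ReLU assumed)

    def forward(self, x):
        return self.act(self.lrn(self.conv(x)))

class FeatureExtractor(nn.Module):
    """k groups of (convolution block + down-sampling layer); each group halves the resolution."""
    def __init__(self, in_ch=2, channels=32, k=5):
        super().__init__()
        layers = []
        for i in range(k):
            layers.append(ConvBlock(in_ch if i == 0 else channels, channels))
            layers.append(nn.MaxPool2d(2))   # down-sampling layer (max pooling assumed)
        self.body = nn.Sequential(*layers)

    def forward(self, stacked_pair):
        # stacked_pair: (N, 2, 128, 128) for the scale-1 network of this embodiment
        return self.body(stacked_pair)       # -> (N, 32, 4, 4) when k = 5

# Example: a 128x128 stacked pair produces a 4x4 feature map after 5 down-sampling steps.
features = FeatureExtractor()(torch.zeros(1, 2, 128, 128))
print(features.shape)  # torch.Size([1, 32, 4, 4])
```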
In another embodiment of the present invention, the feature-extraction part of model network 1 may instead adopt, without being limited to, a U-shaped network (U-Net), a fully convolutional network (FCN), or the like.
S3-1-3. Pass the deep features through the parameter-regression part of the model network of the first scale to obtain the first-scale transformation parameters μ1.
As shown in FIG. 4, in one embodiment of the present invention the parameter-regression part of model network 1 consists of t fully connected layers connected in parallel. The value of t can be chosen by balancing computation speed against the range of image scale change, which the present invention does not limit. Experiments show that when the scaling coefficient lies in [0.5, 2], using 4 parallel fully connected layers works well. The parallel fully connected layers are similar to the pyramid strategy used in traditional image registration, the difference being that the initial values of the output spatial transformation parameters differ in scale. Compared with outputting the parameters from a single fully connected layer, computing with multiple parallel fully connected layers greatly accelerates the convergence of the loss function.
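The regression head below is a sketch of this idea under explicit assumptions: each of the t parallel fully connected layers regresses a 6-parameter affine vector whose bias is initialized around a different scale, the input is the flattened 4×4×32 feature map of the scale-1 extractor above, and the head simply averages the parallel outputs. How the parallel outputs are actually merged is not specified by the embodiment, so the averaging is illustrative only.

```python
import torch
import torch.nn as nn

class ParallelAffineRegressor(nn.Module):
    """t parallel fully connected layers, each regressing 6 affine parameters (a1..a6)."""
    def __init__(self, feat_dim=32 * 4 * 4, t=4, init_scales=(0.5, 1.0, 1.5, 2.0)):
        super().__init__()
        self.heads = nn.ModuleList(nn.Linear(feat_dim, 6) for _ in range(t))
        identity = torch.tensor([1.0, 0.0, 0.0, 0.0, 1.0, 0.0])
        for head, s in zip(self.heads, init_scales):
            nn.init.zeros_(head.weight)       # start each head from a pure-bias prediction
            head.bias.data = identity * s     # each head biased towards a different initial scale

    def forward(self, feat):
        x = feat.flatten(1)                              # (N, 32*4*4)
        preds = torch.stack([h(x) for h in self.heads])  # (t, N, 6)
        return preds.mean(0).view(-1, 2, 3)              # merged prediction as a 2x3 matrix (assumption)
```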
It should be understood that the present invention does not restrict the form or the parameters of the feature-extraction part and the parameter-regression part of model network 1; any approach that takes the stacked images as input, extracts deep features along the channel direction with a convolutional neural network (CNN) of whatever form and parameters, and outputs geometric transformation parameters falls within the protection scope of the present invention.
S3-2. Geometrically correct the to-be-corrected image m with the transformation parameters μ1 to produce the corrected image m1.
Step S3-2 includes the following sub-steps S3-2-1 to S3-2-2:
S3-2-1. Form the geometric transformation matrix Tμ1 from the transformation parameters μ1.
In one embodiment of the present invention, as shown in FIG. 4, step S3-1-3 outputs six geometric transformation parameters a1, a2, a3, a4, a5, a6, which form the two-dimensional affine matrix Tμ1.
The six parameters of the affine transformation matrix encode translation, rotation, scaling and shear of the image pixel coordinates. Suppose the geometric transformation of the image consists of: a translation Dx in the x direction and Dy in the y direction; scaling factors Sx in the x direction and Sy in the y direction; a clockwise rotation by angle θ; and shear angles in the x and y directions (the latter denoted ω). The six parameters of the two-dimensional affine matrix Tμ1 are then obtained from an arbitrary composition of these operations.
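As a hedged illustration of how such a composition can be written out, the NumPy snippet below builds a 3×3 homogeneous affine matrix as translation · rotation · scale · shear; this composition order is only one of the arbitrary orderings mentioned above, and the shear-angle convention is an assumption, not the specific formula of the embodiment.

```python
import numpy as np

def affine_matrix(dx, dy, sx, sy, theta, shear_x, shear_y):
    """Compose translation, clockwise rotation, scaling and shear into one 3x3 affine matrix."""
    T = np.array([[1, 0, dx],
                  [0, 1, dy],
                  [0, 0, 1]], dtype=float)
    R = np.array([[np.cos(theta),  np.sin(theta), 0],
                  [-np.sin(theta), np.cos(theta), 0],
                  [0, 0, 1]], dtype=float)           # clockwise rotation by theta
    S = np.diag([sx, sy, 1.0])
    H = np.array([[1, np.tan(shear_x), 0],
                  [np.tan(shear_y), 1, 0],
                  [0, 0, 1]], dtype=float)           # shear angles in the x and y directions
    M = T @ R @ S @ H                                 # one possible composition order
    # The six affine parameters a1..a6 are the first two rows of M:
    a1, a2, a3 = M[0]
    a4, a5, a6 = M[1]
    return M, (a1, a2, a3, a4, a5, a6)

M, params = affine_matrix(dx=10, dy=-5, sx=1.1, sy=0.9, theta=np.deg2rad(3), shear_x=0.0, shear_y=0.0)
```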
In another embodiment of the present invention, the parameter-regression part of model network 1 outputs a larger or smaller number of geometric transformation parameters so as to form geometric transformation matrices other than the affine one, such as a perspective transformation or a rigid transformation; the present invention does not limit this.
S3-2-2. Geometrically transform the to-be-corrected image m with the geometric transformation matrix Tμ1 to produce the corrected image m1:
m1 = Tμ1(m)
Specifically, for each pixel of the to-be-corrected image m with coordinates (x, y) and gray value σ, compute its coordinates (X, Y) on the corrected image under the spatial transformation, and generate the corrected image m1 with a chosen resampling and interpolation method. In the affine embodiment, (X, Y) is obtained by applying the affine matrix Tμ1 to the homogeneous coordinates (x, y, 1).
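The sketch below illustrates this resampling step in NumPy using inverse mapping with bilinear interpolation (for every output pixel, the inverse affine matrix gives the source location, whose gray value is then interpolated); inverse mapping and bilinear interpolation are one common choice of resampling strategy and are assumptions here, not a prescription of the embodiment.

```python
import numpy as np

def warp_affine(image, M, out_shape=None):
    """Warp a 2-D gray image with the 3x3 affine matrix M (pixel coords -> pixel coords)."""
    h, w = image.shape
    out_h, out_w = out_shape if out_shape is not None else (h, w)
    Minv = np.linalg.inv(M)                                   # inverse mapping: output -> input
    ys, xs = np.mgrid[0:out_h, 0:out_w]
    ones = np.ones_like(xs)
    src = Minv @ np.stack([xs.ravel(), ys.ravel(), ones.ravel()])
    sx, sy = src[0].reshape(out_h, out_w), src[1].reshape(out_h, out_w)

    x0, y0 = np.floor(sx).astype(int), np.floor(sy).astype(int)
    fx, fy = sx - x0, sy - y0
    valid = (x0 >= 0) & (x0 < w - 1) & (y0 >= 0) & (y0 < h - 1)
    x0c, y0c = np.clip(x0, 0, w - 2), np.clip(y0, 0, h - 2)

    # Bilinear interpolation of the four neighbouring gray values
    tl = image[y0c, x0c]; tr = image[y0c, x0c + 1]
    bl = image[y0c + 1, x0c]; br = image[y0c + 1, x0c + 1]
    interp = (tl * (1 - fx) * (1 - fy) + tr * fx * (1 - fy)
              + bl * (1 - fx) * fy + br * fx * fy)
    return np.where(valid, interp, 0.0)                       # pixels mapped outside are set to 0
```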
S3-3. Compute the loss function Losssim(f, m, μ1) of the model network of the first scale. The loss is defined through the similarity measure between the reference image f and the corrected image, and involves the geometric inverse transformation Tμ1⁻¹ of Tμ1.
Here Sim(·) denotes a similarity measure, i.e. Sim(A, B) computes some similarity measure between image A and image B. Commonly used similarity measures include the sum of squared differences (SSD), normalized cross-correlation (NCC) and phase correlation, where image A and image B both have size w×w and the mean gray values of image A and image B enter the NCC computation.
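As a reference sketch of these standard measures (standard textbook definitions, shown for illustration rather than copied from the patent's equations), SSD and NCC between two equally sized gray images can be computed as follows.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared gray-value differences between two w x w images (lower = more similar)."""
    return float(np.sum((a.astype(float) - b.astype(float)) ** 2))

def ncc(a, b, eps=1e-12):
    """Normalized cross-correlation in [-1, 1] (higher = more similar)."""
    a = a.astype(float) - a.mean()          # subtract the mean gray value of image A
    b = b.astype(float) - b.mean()          # subtract the mean gray value of image B
    return float(np.sum(a * b) / (np.sqrt(np.sum(a ** 2) * np.sum(b ** 2)) + eps))
```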
Computing traditional similarity measures such as SSD or NCC is relatively time-consuming. Because the correlation or convolution of two images in the spatial domain equals their product in the frequency domain, phase correlation, which is faster to compute, is adopted. The specific steps are as follows:
Suppose image A and image B are related by a displacement (x0, y0) in the spatial domain, i.e. B(x, y) = A(x − x0, y − y0), and let their Fourier transforms be FA(u, v) and FB(u, v). In the frequency domain they satisfy:
FB(u, v) = FA(u, v) exp(−i(ux0 + vy0))
Their normalized cross-power spectrum is then formed from FA(u, v) and the complex conjugate FB*(u, v), where the superscript * denotes complex conjugation.
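A compact NumPy sketch of phase correlation under the above relationship is given below; it recovers the integer displacement (x0, y0) as the peak of the inverse transform of the normalized cross-power spectrum (a standard formulation, shown here for illustration).

```python
import numpy as np

def phase_correlation(a, b, eps=1e-12):
    """Return the integer displacement (x0, y0) such that b(x, y) ≈ a(x - x0, y - y0)."""
    FA = np.fft.fft2(a.astype(float))
    FB = np.fft.fft2(b.astype(float))
    cross = np.conj(FA) * FB                       # FA*(u, v) · FB(u, v)
    spectrum = cross / (np.abs(cross) + eps)       # normalized cross-power spectrum
    corr = np.fft.ifft2(spectrum).real             # correlation surface; peak at the displacement
    y0, x0 = np.unravel_index(np.argmax(corr), corr.shape)
    h, w = a.shape
    if y0 > h // 2: y0 -= h                        # wrap large indices to negative shifts
    if x0 > w // 2: x0 -= w
    return x0, y0
```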
In one embodiment of the present invention, image A and image B are multi-source optical remote sensing images acquired over the same area by the same type of sensor, and the gray values are used directly as the input for computing the similarity measure between image A and image B.
In another embodiment of the present invention, image A and image B are remote sensing images acquired over the same area by sensors of different types (e.g. optical, infrared, SAR). In that case the gray values are not used directly as the input of the similarity measure; instead, local feature descriptors of image A and image B are computed pixel by pixel, such as the Channel Feature of Orientated Gradients (CFOG), the Histogram of Orientated Gradients (HOG), the Local Self-Similarity descriptor (LSS) and the Histogram of Orientated Phase Congruency (HOPC). As shown in FIG. 5, the SSD, NCC or phase correlation between the feature-descriptor images of the two input images is then used as the similarity measure.
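To make the descriptor-then-compare idea concrete, the following simplified NumPy sketch builds a per-pixel orientated-gradient descriptor (a loose simplification in the spirit of CFOG, with an assumed number of orientation bins and assumed Gaussian smoothing) and compares the descriptor volumes with SSD; it is not the published CFOG/HOG/LSS/HOPC formulation.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def orientated_gradient_descriptor(img, n_bins=9, sigma=1.5):
    """Per-pixel descriptor: soft orientation-binned gradient magnitudes, Gaussian-smoothed per channel."""
    img = img.astype(float)
    gy, gx = np.gradient(img)
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), np.pi)                       # orientation in [0, pi)
    channels = np.zeros(img.shape + (n_bins,))
    bin_centers = (np.arange(n_bins) + 0.5) * np.pi / n_bins
    for i, c in enumerate(bin_centers):
        d = np.minimum(np.abs(ang - c), np.pi - np.abs(ang - c))
        weight = np.clip(1.0 - d / (np.pi / n_bins), 0.0, 1.0)    # soft assignment to nearby bins
        channels[..., i] = gaussian_filter(mag * weight, sigma)   # smooth each orientation channel
    norm = np.linalg.norm(channels, axis=-1, keepdims=True) + 1e-12
    return channels / norm                                        # per-pixel L2 normalization

def descriptor_ssd(img_a, img_b):
    """SSD between the descriptor volumes of two images (lower = more similar)."""
    return float(np.sum((orientated_gradient_descriptor(img_a)
                         - orientated_gradient_descriptor(img_b)) ** 2))
```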
Steps S3-1 to S3-3 produce the transformation parameters and the corrected image at scale 1 and compute the corresponding loss function, and the specific implementation of these operations has been described in detail above. The subsequent steps (S3-4 to S3-9) repeat similar operations at the other scales; they differ from the scale-1 operations only in their parameters, so their flow is outlined briefly without repeating the underlying principles.
S3-4. Input the reference image f and the corrected image m1 into the model network of the second scale (referred to as "scale 2" and "model network 2" in this embodiment) to obtain the transformation-parameter residual Δμ1, and combine it with the transformation parameters μ1 to obtain the second-scale transformation parameters μ2.
Step S3-4 includes the following sub-steps S3-4-1 to S3-4-4:
S3-4-1. Down-sample the reference image f and the corrected image m1 to 1/2 of their original size, and stack the two down-sampled images along the channel direction to produce a stacked image.
S3-4-2. Input the stacked image into the feature-extraction part of the model network of the second scale to produce deep features.
In this embodiment, the network structure of model network 2 is similar to that of model network 1 and differs only in its parameter settings. As a concrete example of the feature extraction in step S3-4-2: if the reference image f and the corrected image m1 are 512×512, the 1/2-down-sampled images are 256×256; the number of convolution kernel channels of each convolution block is set to 64, k is set to 6, and the stacked image passed through the 6 groups of convolution blocks and down-sampling layers yields a feature map of size 4.
S3-4-3. Pass the deep features through the parameter-regression part of the model network of the second scale to obtain the transformation-parameter residual Δμ1.
S3-4-4. Combine the residual Δμ1 with the transformation parameters μ1 to obtain the second-scale transformation parameters μ2:
μ2 = μ1 * Δμ1
where * denotes matrix multiplication.
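In matrix form this composition is simply a product of the corresponding homogeneous transformation matrices; a small NumPy illustration follows, with the 2×3 parameter sets promoted to 3×3 matrices (an assumed but conventional representation).

```python
import numpy as np

def to_homogeneous(mu):
    """Promote a 2x3 affine parameter matrix to a 3x3 homogeneous matrix."""
    return np.vstack([np.asarray(mu, dtype=float).reshape(2, 3), [0.0, 0.0, 1.0]])

def combine(mu_prev, d_mu):
    """mu_new = mu_prev * d_mu as a product of homogeneous matrices (returns the 2x3 part)."""
    return (to_homogeneous(mu_prev) @ to_homogeneous(d_mu))[:2, :]
```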
S3-5. Geometrically correct the corrected image m1 with the transformation parameters μ2 to produce the corrected image m2.
Step S3-5 includes the following sub-steps S3-5-1 to S3-5-2:
S3-5-1. Form the geometric transformation matrix Tμ2 from the transformation parameters μ2.
S3-5-2. Geometrically transform the corrected image m1 with the geometric transformation matrix Tμ2 to produce the corrected image m2:
m2 = Tμ2(m1)
S3-6. Compute the loss function Losssim(f, m1, μ2) of the model network of the second scale.
S3-7. Input the reference image f and the corrected image m2 into the model network of the third scale (referred to as "scale 3" and "model network 3" in this embodiment) to obtain the transformation-parameter residual Δμ2, and combine it with the transformation parameters μ2 to obtain the third-scale transformation parameters μ3.
Step S3-7 includes the following sub-steps S3-7-1 to S3-7-4:
S3-7-1. Stack the reference image f and the corrected image m2 along the channel direction to produce a stacked image.
S3-7-2. Input the stacked image into the feature-extraction part of the model network of the third scale to produce deep features.
In this embodiment, the network structure of model network 3 is similar to those of model networks 1 and 2 and differs only in its parameter settings. As a concrete example of the feature extraction in step S3-7-2: if image f and image m are 512×512, the number of convolution kernel channels of each convolution block is set to 128, k is set to 7, and the stacked image passed through the 7 groups of convolution blocks and down-sampling layers yields a feature map of size 4.
S3-7-3. Pass the deep features through the parameter-regression part of the model network of the third scale to obtain the transformation-parameter residual Δμ2.
S3-7-4. Combine the residual Δμ2 with the transformation parameters μ2 to obtain the third-scale transformation parameters μ3:
μ3 = μ2 * Δμ2
where * denotes matrix multiplication.
S3-8. Geometrically correct the corrected image m2 with the transformation parameters μ3 to produce the corrected image m3.
Step S3-8 includes the following sub-steps S3-8-1 to S3-8-2:
S3-8-1. Form the geometric transformation matrix Tμ3 from the transformation parameters μ3.
S3-8-2. Geometrically transform the corrected image m2 with the geometric transformation matrix Tμ3 to produce the corrected image m3:
m3 = Tμ3(m2)
S3-9. Compute the loss function Losssim(f, m2, μ3) of the model network of the third scale.
S3-10. Take the corrected image m3 and the transformation parameters μ3 as the end-to-end output for this training sample.
S4. Initialize the model network parameters of the three scales separately.
Step S4 includes the following sub-steps S4-1 to S4-3:
S4-1. Train the model network of the first scale by minimizing the loss function Losssim(f, m, μ1).
S4-2. Fix the parameters of the first-scale model network and train the model network of the second scale by minimizing the loss function Losssim(f, m1, μ2).
S4-3. Fix the parameters of the first-scale and second-scale model networks and train the model network of the third scale by minimizing the loss function Losssim(f, m2, μ3).
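A minimal PyTorch sketch of this staged initialization is shown below; the optimizer choice, learning rate and number of warm-up iterations are assumptions, and loss_scale1/2/3 are placeholders standing in for the scale-specific losses Losssim described above.

```python
import torch

def freeze(net, frozen=True):
    """Fix (or release) the parameters of a model network."""
    for p in net.parameters():
        p.requires_grad = not frozen

def train_stage(loader, loss_fn, opt, iters):
    for _, (f, m) in zip(range(iters), loader):
        opt.zero_grad()
        loss_fn(f, m).backward()
        opt.step()

def staged_initialization(net1, net2, net3, loader, loss_scale1, loss_scale2, loss_scale3, iters=200):
    # S4-1: train only the scale-1 network
    opt = torch.optim.Adam(net1.parameters(), lr=1e-4)
    train_stage(loader, loss_scale1, opt, iters)

    # S4-2: fix scale 1, train the scale-2 network
    freeze(net1)
    opt = torch.optim.Adam(net2.parameters(), lr=1e-4)
    train_stage(loader, loss_scale2, opt, iters)

    # S4-3: fix scales 1 and 2, train the scale-3 network
    freeze(net2)
    opt = torch.optim.Adam(net3.parameters(), lr=1e-4)
    train_stage(loader, loss_scale3, opt, iters)

    # Release all parameters before the joint training of step S5
    freeze(net1, frozen=False)
    freeze(net2, frozen=False)
```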
S5. Jointly train the model networks of the three scales in an end-to-end manner, optimizing the joint loss function over the three scales.
In this embodiment, before the joint training of the three scale-specific model networks, the parameters of all the model networks must be unfixed.
In this embodiment, the joint loss function Loss is:
Loss = λ1 × Losssim(f, m, μ1) + λ2 × Losssim(f, m1, μ2) + λ3 × Losssim(f, m2, μ3)
where λ1, λ2 and λ3 are the weight factors of the loss functions of the scale-specific model networks; in this embodiment λ1, λ2 and λ3 are set to 0.05, 0.05 and 0.9, respectively.
S6. Use a deep learning optimizer to find the direction in which the joint loss function decreases fastest, back-propagate through the model networks along that direction, and iteratively update the model network parameters. When the joint loss function drops below a preset threshold and converges, the end-to-end mapping formed by all the model networks has globally optimal parameters and the reference image f and the corrected image m3 are maximally similar; the network parameters at this point are saved, and the registered reference image f and corrected image m3 are output.
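The following PyTorch-style sketch shows one way to carry out this joint optimization, reusing the register_pair helper sketched earlier; the Adam optimizer, learning rate, convergence threshold and the use of a negative image similarity as a stand-in for the Losssim terms are assumptions for illustration rather than values or formulas fixed by the embodiment.

```python
import torch

def joint_training(f, m, net1, net2, net3, similarity, threshold=1e-3,
                   weights=(0.05, 0.05, 0.9), max_iters=2000):
    """Jointly optimize the three scale networks until the joint loss falls below the threshold."""
    params = list(net1.parameters()) + list(net2.parameters()) + list(net3.parameters())
    opt = torch.optim.Adam(params, lr=1e-4)
    l1, l2, l3 = weights
    for _ in range(max_iters):
        opt.zero_grad()
        (m1, m2, m3), _ = register_pair(f, m, net1, net2, net3)
        # Each scale loss is taken here as the negative similarity between the
        # reference image and the corresponding corrected image (assumption).
        loss = (l1 * -similarity(f, m1)
                + l2 * -similarity(f, m2)
                + l3 * -similarity(f, m3))
        loss.backward()
        opt.step()
        if loss.item() < threshold:          # joint loss has dropped below the preset threshold
            break
    return {"net1": net1.state_dict(), "net2": net2.state_dict(), "net3": net3.state_dict()}
```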
Thus, the present invention achieves accurate multi-scale registration of remote sensing images with fully unsupervised learning and end-to-end mapping.
Those of ordinary skill in the art will appreciate that the embodiments described here are intended to help the reader understand the principles of the present invention, and it should be understood that the protection scope of the present invention is not limited to these particular statements and embodiments. Those of ordinary skill in the art may, based on the technical teachings disclosed by the present invention, make various other specific variations and combinations that do not depart from the essence of the present invention, and such variations and combinations still fall within the protection scope of the present invention.
Claims (2)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210026370.7A CN114494372B (en) | 2022-01-11 | 2022-01-11 | Remote sensing image registration method based on unsupervised deep learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210026370.7A CN114494372B (en) | 2022-01-11 | 2022-01-11 | Remote sensing image registration method based on unsupervised deep learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114494372A CN114494372A (en) | 2022-05-13 |
CN114494372B true CN114494372B (en) | 2023-04-21 |
Family
ID=81509569
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210026370.7A Active CN114494372B (en) | 2022-01-11 | 2022-01-11 | Remote sensing image registration method based on unsupervised deep learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114494372B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114693755B (en) * | 2022-05-31 | 2022-08-30 | 湖南大学 | Non-rigid registration method and system for maximum moment and spatial consistency of multimodal images |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109345575B (en) * | 2018-09-17 | 2021-01-19 | 中国科学院深圳先进技术研究院 | Image registration method and device based on deep learning |
CN109711444B (en) * | 2018-12-18 | 2024-07-19 | 中国科学院遥感与数字地球研究所 | Novel remote sensing image registration method based on deep learning |
CN111414968B (en) * | 2020-03-26 | 2022-05-03 | 西南交通大学 | Multi-mode remote sensing image matching method based on convolutional neural network characteristic diagram |
CN113901900A (en) * | 2021-09-29 | 2022-01-07 | 西安电子科技大学 | An unsupervised change detection method and system for homologous or heterologous remote sensing images |
- 2022
  - 2022-01-11 CN CN202210026370.7A patent/CN114494372B/en active Active
Also Published As
Publication number | Publication date |
---|---|
CN114494372A (en) | 2022-05-13 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Ye et al. | A robust multimodal remote sensing image registration method and system using steerable filters with first-and second-order gradients | |
Fang et al. | SAR-optical image matching by integrating Siamese U-Net with FFT correlation | |
CN112085772B (en) | A remote sensing image registration method and device | |
CN104067312A (en) | Image Registration Method and System Robust to Noise | |
Xie et al. | A novel extended phase correlation algorithm based on Log-Gabor filtering for multimodal remote sensing image registration | |
CN112883850B (en) | Multi-view space remote sensing image matching method based on convolutional neural network | |
Zhu et al. | Robust registration of aerial images and LiDAR data using spatial constraints and Gabor structural features | |
CN104200461A (en) | Mutual information image selected block and sift (scale-invariant feature transform) characteristic based remote sensing image registration method | |
CN111626927A (en) | Binocular image super-resolution method, system and device adopting parallax constraint | |
CN111696196A (en) | Three-dimensional face model reconstruction method and device | |
Binaghi et al. | Neural adaptive stereo matching | |
CN114494372B (en) | Remote sensing image registration method based on unsupervised deep learning | |
Xian et al. | Super-resolved fine-scale sea ice motion tracking | |
Dalmiya et al. | A survey of registration techniques in remote sensing images | |
CN113223066A (en) | Multi-source remote sensing image matching method and device based on characteristic point fine tuning | |
Li et al. | Subpixel image registration algorithm based on pyramid phase correlation and upsampling | |
CN117788296B (en) | Infrared remote sensing image super-resolution reconstruction method based on heterogeneous combined depth network | |
Panigrahi et al. | Pre-processing algorithm for rectification of geometric distortions in satellite images | |
Jiang et al. | Semantic segmentation network combined with edge detection for building extraction in remote sensing images | |
CN113469003B (en) | Matching method of remote sensing images | |
CN116863285A (en) | Infrared and visible light image fusion method of multi-scale generative adversarial network | |
Chib et al. | A computational study on calibrated vgg19 for multimodal learning and representation in surveillance | |
CN115937704A (en) | Remote sensing image road segmentation method based on topology perception neural network | |
Zhang et al. | Superresolution approach of remote sensing images based on deep convolutional neural network | |
Naouai et al. | New approach for road extraction from high resolution remotely sensed images using the quaternionic wavelet |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |