CN114332181B - Remote sensing image automatic registration method and device based on non-rigid bidirectional registration network - Google Patents


Info

Publication number
CN114332181B
CN114332181B (application CN202111644476.5A, published as CN114332181A)
Authority
CN
China
Prior art keywords
image
flow field
layer
motion flow field prediction
Prior art date
Legal status
Active
Application number
CN202111644476.5A
Other languages
Chinese (zh)
Other versions
CN114332181A (en)
Inventor
陈浩
徐樱笑
杜春
彭双
伍江江
李军
熊伟
吴烨
杨岸然
马梦宇
贾庆仁
钟志农
陈荦
景宁
Current Assignee
National University of Defense Technology
Original Assignee
National University of Defense Technology
Priority date
Filing date
Publication date
Application filed by National University of Defense Technology filed Critical National University of Defense Technology
Priority to CN202111644476.5A priority Critical patent/CN114332181B/en
Publication of CN114332181A publication Critical patent/CN114332181A/en
Application granted granted Critical
Publication of CN114332181B publication Critical patent/CN114332181B/en

Classifications

    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Abstract

The invention relates to a remote sensing image automatic registration method based on a non-rigid bidirectional registration network. The method comprises the following steps: constructing a first multi-layer motion flow field prediction network; inputting a first moving image to be registered and a first fixed image into the first multi-layer motion flow field prediction network to obtain a first predicted image; taking the first predicted image output by the first multi-layer motion flow field prediction network as a second moving image and a similar fixed image corresponding to the first fixed image as a second fixed image; inputting them into a pre-constructed second multi-layer motion flow field prediction network; and outputting a transformed image.

Description

Remote sensing image automatic registration method and device based on non-rigid bidirectional registration network
Technical Field
The invention relates to the technical field of remote sensing image registration, in particular to a remote sensing image automatic registration method and device based on a non-rigid bidirectional registration network.
Background
Remote sensing image registration is the basis for change detection, environmental monitoring and image fusion. For massive remote sensing images acquired from different sources using different coordinate systems, registration of the massive remote sensing images is required as a precondition for subsequent applications.
The feature-based image matching approach describes the moving image and the fixed image with feature vectors and performs similarity matching according to a distance or similarity metric so as to solve for a global transformation matrix comprising rotation, scaling and translation. Feature-matching algorithms such as SIFT perform well in homologous image registration; however, hand-crafted feature descriptors produce a large number of mismatched pairs when the images differ in appearance or exhibit large displacements, causing matching to fail. Some researchers build sample sets from remote sensing images to train convolutional neural networks that generate a higher-dimensional feature description vector for each feature point, so as to produce more accurate corresponding points during feature-point matching and thereby improve the estimation accuracy of the geometric transformation model; even so, a large number of non-matching descriptors still prevent correct image matching. Various feature matching methods model the search for correct corresponding point pairs as seeking correspondences between sets of feature points in a high-dimensional space; for example, the classical RANSAC (random sample consensus) algorithm and the grid-based motion statistics method GMS (Grid-based Motion Statistics) use high-confidence matching results on a small hypothesis set to guide matching on a large hypothesis set. Such outlier-removal methods are difficult to apply to remote sensing image registration with complex local distortion and appearance change. Moreover, this two-stage approach of constructing descriptors and then matching features introduces considerable uncertainty.
In addition, many remote sensing images exhibit local geometric deformation, for example the complicated local displacements caused by terrain relief in mountain areas; furthermore, shadow occlusion caused by oblique shooting angles, different acquisition times or sun elevation angles means that some pixels cannot find correspondences at all. In such cases, non-rigid registration methods are more flexible than feature-based registration methods.
However, non-rigid registration methods tend to ignore the reversibility between images, resulting in registration errors and inconsistencies.
Disclosure of Invention
Based on this, there is a need to provide a remote sensing image automatic registration method and apparatus based on a non-rigid bidirectional registration network, addressing the problem that conventional non-rigid registration can be inaccurate.
A remote sensing image automatic registration method based on a non-rigid bidirectional registration network comprises the following steps:
constructing a first multi-layer motion flow field prediction network, and inputting a first moving image to be registered and a first fixed image into the first multi-layer motion flow field prediction network to obtain a first predicted image; each motion flow field prediction network in the first multi-layer motion flow field prediction network is connected in cascade, the input of the first motion flow field prediction network is the first moving image and the first fixed image of the remote sensing images to be registered, and the input of every other motion flow field prediction network is the first fixed image and the predicted image output by the preceding motion flow field prediction network;
taking the first predicted image output by the first multi-layer motion flow field prediction network as a second moving image and the similar fixed image corresponding to the first fixed image as a second fixed image, inputting them into the pre-constructed second multi-layer motion flow field prediction network, and outputting a transformed image; wherein each motion flow field prediction network in the second multi-layer motion flow field prediction network is connected in cascade, and the second fixed image satisfies style consistency with the first fixed image and corresponds one-to-one to the pixel positions of the first moving image.
In one embodiment, the motion flow field prediction network comprises: correlation layer, mutual matching layer, neighborhood correlation layer, mutual matching layer, decoding layer and resampling layer.
In one embodiment, the method further comprises:
calculating a first feature correlation map between the first moving image and a first fixed image through the correlation layer;
processing the feature correlation map through the mutual matching layer, the neighborhood correlation layer and the mutual matching layer to obtain a second feature map;
decoding the second feature map through the decoding layer to obtain a predicted flow field;
and warping the first moving image according to the predicted flow field through the resampling layer to obtain a first predicted image.
In one embodiment, the second fixed image satisfies style consistency, wherein Ĩ_B represents the similar fixed image, whose style matches the first fixed image while its pixel positions correspond one-to-one to those of the first moving image.
In one embodiment, the method further comprises: training the bidirectional registration network formed by the first multi-layer motion flow field prediction network and the second multi-layer motion flow field prediction network according to a pre-constructed loss function to obtain network parameters.
in one embodiment, the loss function includes: EPE penalty, IOU penalty, content feature penalty, and normalized cross-correlation penalty;
the EPE loss is:
wherein m represents the number of pixels of the image, F A→B Representing a first moving image I A Fixed image I relative to first B Is arranged in the motion flow field;
the IOU penalty is:
wherein L is B Representing a first fixed image I B Corresponding building semantic tags, T (L A ) Representing a building semantic tag corresponding to a predicted image output by a motion flow field prediction network;
The content feature loss is:

L_feat = ‖Feat(T(I_A)) − Feat(I_B)‖₂

wherein Feat(T(I_A)) represents the content features extracted, using a VGG-16 model pre-trained on ImageNet, from the predicted image output by the motion flow field prediction network, and Feat(I_B) represents the content features extracted from the first fixed image I_B using the same model;
The normalized cross-correlation loss is:

NCC(I, J) = Σ_{x_i ∈ Ω} (I(x_i) − Ī(x)) (J(x_i) − J̄(x)) / √( Σ_{x_i ∈ Ω} (I(x_i) − Ī(x))² · Σ_{x_i ∈ Ω} (J(x_i) − J̄(x))² )

wherein I and J are two input image blocks, Ī(x) and J̄(x) are respectively the local means of I and J at position x, and Ω represents all pixels in the image block.
A remote sensing image automatic registration device based on a non-rigid bi-directional registration network, the device comprising:
the first unidirectional registration module is used for constructing a first multilayer motion flow field prediction network, and inputting a first motion image and a first fixed image to be registered into the first multilayer motion flow field prediction network to obtain a first predicted image; the motion flow field prediction networks are connected in cascade, and the input of a first motion flow field prediction network is a first motion image and a first fixed image of a remote sensing image to be registered; the input of other motion flow field prediction networks is a first fixed image and a predicted image output by a previous stage motion flow field prediction network;
the second unidirectional registration module is used for inputting a second multi-layer motion flow field prediction network constructed in advance by taking a first prediction image output by the first multi-layer motion flow field prediction network as a second motion image and a similar fixed image corresponding to the first fixed image as a second fixed image, and outputting a conversion image; and each motion flow field prediction network in the second multi-layer motion flow field prediction network is connected in cascade, and the second fixed image and the first fixed image meet style consistency and correspond to pixel positions of the first motion image one by one.
According to the remote sensing image automatic registration method and device based on the non-rigid bidirectional registration network, registration reversibility and geometric consistency are enhanced through bidirectional registration.
Drawings
FIG. 1 is a schematic flow chart of a remote sensing image automatic registration method based on a non-rigid bi-directional registration network in one embodiment;
FIG. 2 is a schematic block diagram of a multi-layer motion flow field prediction network in one embodiment;
FIG. 3 is a schematic block diagram of a remote sensing image automatic registration method based on a non-rigid bi-directional registration network in one embodiment;
FIG. 4 is a block diagram of a computer device in one embodiment.
Detailed Description
For a better understanding of the objects, technical solutions and technical effects of the present invention, the invention is further explained below with reference to the drawings and examples. It should be noted that the embodiments described below are only intended to explain the present invention and not to limit it.
In one embodiment, as shown in fig. 1, a schematic flow chart of a remote sensing image automatic registration method based on a non-rigid bidirectional registration network is provided, comprising:
step 102, a first multi-layer motion flow field prediction network is constructed, and a first motion image and a first fixed image to be registered are input into the first multi-layer motion flow field prediction network to obtain a first prediction image.
The motion flow field prediction networks are connected in cascade, and the input of the first motion flow field prediction network is a first motion image and a first fixed image of the remote sensing image to be registered; the input of other motion flow field prediction networks is a first fixed image and a predicted image output by a previous stage motion flow field prediction network.
Step 104, taking the first predicted image output by the first multi-layer motion flow field prediction network as a second moving image and the similar fixed image corresponding to the first fixed image as a second fixed image, inputting them into the pre-constructed second multi-layer motion flow field prediction network, and outputting a transformed image.
Each motion flow field prediction network in the second multilayer motion flow field prediction network is connected in cascade, and the second fixed image and the first fixed image meet style consistency and correspond to pixel positions of the first motion image one by one.
According to the remote sensing image automatic registration method based on the non-rigid bidirectional registration network, registration reversibility and geometric consistency are enhanced through bidirectional registration.
In one embodiment, a motion flow field prediction network comprises: correlation layer, mutual matching layer, neighborhood correlation layer, mutual matching layer, decoding layer and resampling layer.
In one embodiment, a first feature correlation map between a first moving image and a first fixed image is calculated by a correlation layer; processing the feature correlation map through the mutual matching layer, the neighborhood correlation layer and the mutual matching layer to obtain a second feature map; decoding the second feature map through a decoding layer to obtain a predicted flow field; and changing the predicted flow field through a resampling layer to obtain a first predicted image.
Specifically, the correlation layer calculates the similarity between positions of two feature maps to evaluate the matching probability. Taking the features f_A and f_B as input, it outputs a feature correlation map:

c_ijkl = f_B(k, l)^T f_A(i, j)

where (i, j) and (k, l) represent the positions of individual features in f_A and f_B, respectively.
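The correlation computation above can be sketched in NumPy as follows. This is an illustrative sketch, not the patent's implementation; the function name `correlation_map` and the (H, W, C) feature layout are assumptions.

```python
import numpy as np

def correlation_map(f_a, f_b):
    """Dense correlation map c[i, j, k, l] = f_b[k, l]^T f_a[i, j].

    f_a, f_b: feature maps of shape (H, W, C) at the same resolution.
    """
    h, w, c = f_a.shape
    # flatten spatial positions and take all pairwise dot products
    corr = f_a.reshape(h * w, c) @ f_b.reshape(h * w, c).T  # (H*W, H*W)
    # index the result as [i, j, k, l]
    return corr.reshape(h, w, h, w)
```

Each entry scores one candidate match between a position in image A and a position in image B; downstream layers then filter these scores.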
Because remote sensing images exhibit large distortions and complex backgrounds, assigning matches by nearest neighbours produces a large number of false matches. Initial inconsistent matches are therefore eliminated through the mutual matching layer and the neighborhood correlation layer. The mutual matching layer is:

ĉ_ijkl = r_A · r_B · c_ijkl, with r_A = c_ijkl / max_{ab} c_abkl and r_B = c_ijkl / max_{cd} c_ijcd,

where r_A and r_B represent the ratio of a specific match c_ijkl to the largest score across the pair of dimensions (a, b) or (c, d) corresponding to image A or B.
The neighborhood correlation layer is:

c̃ = N(ĉ) + (N(ĉ^T))^T

where N(·) consists of a series of convolutional layers and activation layers applied to the 4D correlation map, and ĉ^T denotes swapping the roles of the two images.
The decoding module obtains a normalized predicted flow field through a series of convolutional layers, BN layers and ReLU activation layers, and the resampling layer adopts the grid_sample function in PyTorch to transform the moving image according to the predicted flow field.
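The resampling step can be illustrated with a plain NumPy bilinear warp. This mimics what grid_sample does conceptually, but it is not grid_sample's exact API (which expects a normalized sampling grid), so treat the function below as a hedged sketch with assumed conventions (pixel-unit flow, border clamping).

```python
import numpy as np

def warp(image, flow):
    """Warp a single-channel `image` (H, W) by a dense flow field (H, W, 2).

    flow[y, x] = (dx, dy): each output pixel samples image[y + dy, x + dx]
    with bilinear interpolation, clamping samples to the image border.
    """
    h, w = image.shape
    ys, xs = np.mgrid[0:h, 0:w].astype(float)
    sx = np.clip(xs + flow[..., 0], 0, w - 1)
    sy = np.clip(ys + flow[..., 1], 0, h - 1)
    x0, y0 = np.floor(sx).astype(int), np.floor(sy).astype(int)
    x1, y1 = np.minimum(x0 + 1, w - 1), np.minimum(y0 + 1, h - 1)
    wx, wy = sx - x0, sy - y0
    # bilinear blend of the four neighbouring pixels
    top = image[y0, x0] * (1 - wx) + image[y0, x1] * wx
    bot = image[y1, x0] * (1 - wx) + image[y1, x1] * wx
    return top * (1 - wy) + bot * wy
```

A zero flow field reproduces the input image; a constant flow translates it, which is the behaviour the cascaded prediction networks rely on.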
In another embodiment, the multi-layer motion flow field prediction network constructed by the present invention essentially iterates the image transformation. As shown in fig. 2, taking three iterations as an example:

The decoder adopts a coarse-to-fine strategy, outputting motion flow field estimates at resolutions of 32×32, 64×64, 128×128 and 256×256 respectively, and the first-round transformed image T_1(I_A) is obtained according to the highest-resolution motion flow field. The second round of transformation takes the fixed image I_B and the first-round transformed image T_1(I_A) as input, predicts the second-round motion flow field, and obtains the warped image T_2(I_A); the flow field predicted after the second round equals the first-round predicted flow field plus the second-round predicted flow field, and so on. The motion flow field finally predicted by the third iteration is F_{A→B}, and the transformed image is T_3(I_A). Bidirectional registration then continues with T_3(I_A) as the moving image and the similar fixed image Ĩ_B as the fixed image; the above steps are repeated to obtain the predicted motion flow field F_{B→A} and the transformed image T(T(I_A)).
In one embodiment, the second fixed image satisfies style consistency, wherein Ĩ_B represents the similar fixed image, which maintains the style of the first fixed image through the transformation described above.
In one embodiment, training the bidirectional registration network formed by the first multi-layer motion flow field prediction network and the second multi-layer motion flow field prediction network according to a pre-constructed loss function to obtain network parameters.
In the training process, a moving image I_A and a fixed image I_B are input to the network, and the moving image is transformed into the image T(I_A) according to the motion flow field F_{A→B} predicted by the network. Bidirectional registration then takes the image T(I_A) as the moving image and the image Ĩ_B as the fixed image, and transforms the moving image into the image T(T(I_A)) according to the motion flow field F_{B→A} predicted by the network. In the test process, only the first half is retained: the moving image I_A and the fixed image I_B are input to the network, and the moving image is transformed into the image T(I_A) according to the predicted motion flow field F_{A→B}. The image T(I_A) is aligned with the image I_B, and the images I_A, T(T(I_A)) and Ĩ_B are aligned with one another. The image Ĩ_B shares the modality of I_B; the other images share the modality of I_A.
In the training process, a pair consisting of a moving image and a fixed image serves as the network input, and a predicted motion flow field is output. In each iteration, a loss function is computed between the network output and the ground-truth flow field label; with minimizing the loss function as the objective, the parameters of the deep convolutional neural network are continuously optimized using the Adam optimization algorithm, with the learning rate set to 2e-4 and the weight decay set to 4e-4. When the loss value no longer decreases, the network parameters at that point are saved as the final network model parameters.
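The parameter update can be illustrated with a hand-written Adam step using the stated learning rate 2e-4 and decay 4e-4. Treating the decay as an L2 weight-decay term added to the gradient is an assumption; the patent only states the two rates.

```python
import numpy as np

def init_adam_state(shape):
    """Fresh first/second-moment accumulators and step counter."""
    return {"t": 0, "m": np.zeros(shape), "v": np.zeros(shape)}

def adam_step(param, grad, state, lr=2e-4, betas=(0.9, 0.999),
              eps=1e-8, weight_decay=4e-4):
    """One Adam update with L2 weight decay (lr=2e-4, decay=4e-4)."""
    grad = grad + weight_decay * param           # assumed L2 decay coupling
    state["t"] += 1
    state["m"] = betas[0] * state["m"] + (1 - betas[0]) * grad
    state["v"] = betas[1] * state["v"] + (1 - betas[1]) * grad ** 2
    m_hat = state["m"] / (1 - betas[0] ** state["t"])  # bias correction
    v_hat = state["v"] / (1 - betas[1] ** state["t"])
    return param - lr * m_hat / (np.sqrt(v_hat) + eps)
```

On the first step the bias-corrected update magnitude is approximately the learning rate, which is why the first parameter move is close to 2e-4.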
In one embodiment, the loss function includes: an EPE loss, an IOU loss, a content feature loss and a normalized cross-correlation loss;
The EPE loss is:

L_EPE = (1/m) Σ_x ‖F_{A→B}(x) − F^{GT}_{A→B}(x)‖₂

where m represents the number of pixels of the image (m = 256 × 256 for a 256 × 256 image), F_{A→B} represents the motion flow field of the first moving image I_A relative to the first fixed image I_B, and F^{GT}_{A→B} represents the ground-truth motion flow field;
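A minimal NumPy sketch of the end-point-error loss as reconstructed above: the mean over all pixels of the Euclidean distance between predicted and ground-truth flow vectors.

```python
import numpy as np

def epe_loss(flow_pred, flow_gt):
    """Mean end-point error between predicted and ground-truth flow fields.

    flow_pred, flow_gt: arrays of shape (H, W, 2); m = H * W pixels.
    """
    # per-pixel Euclidean distance between the two flow vectors, then mean
    return np.linalg.norm(flow_pred - flow_gt, axis=-1).mean()
```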
The IOU loss is:

L_IOU = 1 − |T(L_A) ∩ L_B| / |T(L_A) ∪ L_B|

wherein L_B represents the building semantic label corresponding to the first fixed image I_B, and T(L_A) represents the building semantic label corresponding to the predicted image output by the motion flow field prediction network;
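A minimal NumPy sketch of the IOU loss on binary building masks. A real training pipeline would need a differentiable soft-IoU; the hard set-operation version below is for illustration only.

```python
import numpy as np

def iou_loss(label_warped, label_fixed):
    """1 - IoU between the warped building label T(L_A) and L_B.

    label_warped, label_fixed: boolean masks of the same shape.
    """
    inter = np.logical_and(label_warped, label_fixed).sum()
    union = np.logical_or(label_warped, label_fixed).sum()
    # guard against two empty masks
    return 1.0 - inter / max(union, 1)
```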
The content feature loss is:

L_feat = ‖Feat(T(I_A)) − Feat(I_B)‖₂

wherein Feat(T(I_A)) represents the content features extracted, using a VGG-16 model pre-trained on ImageNet, from the predicted image output by the motion flow field prediction network, and Feat(I_B) represents the content features extracted from the first fixed image I_B using the same model;
The normalized cross-correlation loss is:

NCC(I, J) = Σ_{x_i ∈ Ω} (I(x_i) − Ī(x)) (J(x_i) − J̄(x)) / √( Σ_{x_i ∈ Ω} (I(x_i) − Ī(x))² · Σ_{x_i ∈ Ω} (J(x_i) − J̄(x))² )

wherein I and J are two input image blocks, Ī(x) and J̄(x) are respectively the local means of I and J at position x, and Ω represents all pixels in the image block.
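A minimal NumPy sketch of a local NCC score computed over non-overlapping blocks. The window size, the non-overlapping tiling, and the "1 − mean NCC" sign convention are assumptions; the patent does not reproduce these details, only the block-wise NCC definition.

```python
import numpy as np

def ncc_loss(img_i, img_j, win=3):
    """1 - mean local normalized cross-correlation over win x win blocks.

    Perfectly matched images give a loss near 0.
    """
    h, w = img_i.shape
    scores = []
    for y in range(0, h - win + 1, win):
        for x in range(0, w - win + 1, win):
            a = img_i[y:y + win, x:x + win]
            b = img_j[y:y + win, x:x + win]
            a = a - a.mean()  # subtract the local mean of each block
            b = b - b.mean()
            denom = np.sqrt((a * a).sum() * (b * b).sum()) + 1e-8
            scores.append((a * b).sum() / denom)
    return 1.0 - float(np.mean(scores))
```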
Specifically, the EPE loss measures the error between the predicted flow field F and the true motion flow field F_GT; the IOU loss measures the error between the warped building label T(L_A) and the building label L_B corresponding to the fixed image. To avoid distortion of the content of the warped image, constraints are further imposed through the content feature loss and the normalized cross-correlation (NCC) loss. The content feature loss uses a VGG-16 model pre-trained on ImageNet to extract features from the transformed moving image T(I_A) and the fixed image I_B.
Finally, the total loss function is expressed as:

L = Σ_i Σ_k ( L_EPE + L_IOU + Σ_j L_feat^(j) + L_NCC )

where i indexes the two directions of the bidirectional registration task, j indexes the five layers of features used in the content feature loss, and k indexes the 3 iterations.
A pair of optical remote sensing images to be registered, unseen by the network, is input into the model saved during training; only unidirectional registration from the moving image A to the fixed image B is adopted to obtain the transformed registration result T(I_A).
It should be understood that, although the steps in the flowchart of fig. 1 are shown in the sequence indicated by the arrows, they are not necessarily performed in that sequence. Unless explicitly stated herein, the order of execution is not strictly limited, and the steps may be executed in other orders. Moreover, at least some of the steps in fig. 1 may include multiple sub-steps or stages that are not necessarily performed at the same time but may be performed at different times, and these sub-steps or stages are not necessarily performed in sequence but may be performed in turns or alternately with at least a portion of other steps or of the sub-steps or stages of other steps.
In one embodiment, as shown in fig. 3, there is provided a remote sensing image automatic registration device based on a non-rigid bidirectional registration network, including: a first unidirectional registration module 302 and a second unidirectional registration module 304, wherein:
the first unidirectional registration module 302 is configured to construct a first multi-layer motion flow field prediction network, and input a first motion image and a first fixed image to be registered into the first multi-layer motion flow field prediction network to obtain a first predicted image; the motion flow field prediction networks are connected in cascade, and the input of a first motion flow field prediction network is a first motion image and a first fixed image of a remote sensing image to be registered; the input of other motion flow field prediction networks is a first fixed image and a predicted image output by a previous stage motion flow field prediction network;
the second unidirectional registration module 304 is configured to input a second multi-layer motion flow field prediction network pre-constructed for a second fixed image by using the first predicted image output by the first multi-layer motion flow field prediction network as the second motion image, and output a conversion image; and each motion flow field prediction network in the second multi-layer motion flow field prediction network is connected in cascade, and the second fixed image and the first fixed image meet style consistency and correspond to pixel positions of the first motion image one by one.
In one embodiment, the motion flow field prediction network comprises: correlation layer, mutual matching layer, neighborhood correlation layer, mutual matching layer, decoding layer and resampling layer.
In one embodiment, the first unidirectional registration module 302 is further configured to calculate a first feature correlation map between the first moving image and the first fixed image through the correlation layer; process the feature correlation map through the mutual matching layer, the neighborhood correlation layer and the second mutual matching layer to obtain a second feature map; decode the second feature map through the decoding layer to obtain a predicted flow field; and warp the first moving image according to the predicted flow field through the resampling layer to obtain a first predicted image.
In one embodiment, the second fixed image satisfies style consistency, wherein Ĩ_B represents the similar fixed image, whose style matches the first fixed image while its pixel positions correspond one-to-one to those of the first moving image.
In one embodiment, a training module is configured to train the bidirectional registration network formed by the first multi-layer motion flow field prediction network and the second multi-layer motion flow field prediction network according to a pre-constructed loss function to obtain network parameters.
In one embodiment, the loss function includes: an EPE loss, an IOU loss, a content feature loss and a normalized cross-correlation loss;
the EPE loss is:
wherein m represents the number of pixels of the image, F A→B Representing a first moving image I A Fixed image I relative to first B Is arranged in the motion flow field;
The IOU loss is:

L_IOU = 1 − |T(L_A) ∩ L_B| / |T(L_A) ∪ L_B|

wherein L_B represents the building semantic label corresponding to the first fixed image I_B, and T(L_A) represents the building semantic label corresponding to the predicted image output by the motion flow field prediction network;
The content feature loss is:

L_feat = ‖Feat(T(I_A)) − Feat(I_B)‖₂

wherein Feat(T(I_A)) represents the content features extracted, using a VGG-16 model pre-trained on ImageNet, from the predicted image output by the motion flow field prediction network, and Feat(I_B) represents the content features extracted from the first fixed image I_B using the same model;
The normalized cross-correlation loss is:

NCC(I, J) = Σ_{x_i ∈ Ω} (I(x_i) − Ī(x)) (J(x_i) − J̄(x)) / √( Σ_{x_i ∈ Ω} (I(x_i) − Ī(x))² · Σ_{x_i ∈ Ω} (J(x_i) − J̄(x))² )

wherein I and J are two input image blocks, Ī(x) and J̄(x) are respectively the local means of I and J at position x, and Ω represents all pixels in the image block.
In one embodiment, a computer device is provided, which may be a terminal, and the internal structure of which may be as shown in fig. 4. The computer device includes a processor, a memory, a network interface, a display screen, and an input device connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage media. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program when executed by the processor is used for realizing a remote sensing image automatic registration method based on a non-rigid bidirectional registration network. The display screen of the computer equipment can be a liquid crystal display screen or an electronic ink display screen, and the input device of the computer equipment can be a touch layer covered on the display screen, can also be keys, a track ball or a touch pad arranged on the shell of the computer equipment, and can also be an external keyboard, a touch pad or a mouse and the like.
Those skilled in the art will appreciate that the structure shown in FIG. 4 is merely a block diagram of part of the structure relevant to the present application and does not constitute a limitation on the computer device to which the present solution is applied; a particular computer device may include more or fewer components than shown, combine certain components, or have a different arrangement of components.
In an embodiment, a computer device is provided comprising a memory storing a computer program and a processor implementing the steps of the method of the above embodiments when the computer program is executed.
In an embodiment, a computer readable storage medium is provided, on which a computer program is stored which, when being executed by a processor, implements the steps of the method of the above embodiments.
Those skilled in the art will appreciate that implementing all or part of the above described methods may be accomplished by way of a computer program stored on a non-transitory computer readable storage medium, which when executed, may comprise the steps of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the various embodiments provided herein may include non-volatile and/or volatile memory. The nonvolatile memory can include Read Only Memory (ROM), programmable ROM (PROM), electrically Programmable ROM (EPROM), electrically Erasable Programmable ROM (EEPROM), or flash memory. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as Static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double Data Rate SDRAM (DDRSDRAM), enhanced SDRAM (ESDRAM), synchronous Link DRAM (SLDRAM), memory bus direct RAM (RDRAM), direct memory bus dynamic RAM (DRDRAM), and memory bus dynamic RAM (RDRAM), among others.
The technical features of the above embodiments may be arbitrarily combined, and all possible combinations of the technical features in the above embodiments are not described for brevity of description, however, as long as there is no contradiction between the combinations of the technical features, they should be considered as the scope of the description.
The above examples illustrate only a few embodiments of the invention, which are described in detail and are not to be construed as limiting the scope of the invention. It should be noted that it will be apparent to those skilled in the art that several variations and modifications can be made without departing from the spirit of the invention, which are all within the scope of the invention. Accordingly, the scope of protection of the present invention is to be determined by the appended claims.

Claims (4)

1. The remote sensing image automatic registration method based on the non-rigid bidirectional registration network is characterized by comprising the following steps of:
constructing a first multi-layer motion flow field prediction network, and inputting a first motion image and a first fixed image to be registered into the first multi-layer motion flow field prediction network to obtain a first predicted image; each motion flow field prediction network in the first multilayer motion flow field prediction network is connected in cascade, and the input of the first motion flow field prediction network is a first motion image and a first fixed image of a remote sensing image to be registered; the input of other motion flow field prediction networks is a first fixed image and a predicted image output by a previous stage motion flow field prediction network;
taking the first predicted image output by the first multi-layer motion flow field prediction network as a second moving image and a similar fixed image corresponding to the first fixed image as a second fixed image, inputting them into a pre-constructed second multi-layer motion flow field prediction network, and outputting a converted image; the second fixed image satisfies style consistency with the first fixed image and corresponds pixel-by-pixel to the first moving image;
wherein the motion flow field prediction network comprises: a correlation layer, a mutual matching layer, a neighborhood correlation layer, a mutual matching layer, a decoding layer, and a resampling layer;
wherein constructing the first multi-layer motion flow field prediction network and inputting the first moving image and the first fixed image to be registered into the first multi-layer motion flow field prediction network to obtain the first predicted image comprises:
calculating a first feature correlation map between the first moving image and the first fixed image through the correlation layer;
processing the first feature correlation map through the mutual matching layer, the neighborhood correlation layer, and the mutual matching layer to obtain a second feature map;
decoding the second feature map through the decoding layer to obtain a predicted flow field;
warping the input moving image according to the predicted flow field through the resampling layer to obtain the first predicted image;
wherein the second fixed image and the similar fixed image satisfy a style consistency constraint.
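For illustration only (outside the claim language), the per-stage correlation and resampling operations of a motion flow field prediction network can be sketched with NumPy. The function names, the toy images, and the constant stand-in for the decoded flow field are all hypothetical simplifications, not the patented network.

```python
import numpy as np

def correlation_map(feat_m, feat_f):
    """Correlation layer: dense dot-product correlation between two (H, W, C) feature maps."""
    h, w, c = feat_m.shape
    return feat_m.reshape(h * w, c) @ feat_f.reshape(h * w, c).T  # (H*W, H*W) volume

def warp(image, flow):
    """Resampling layer: bilinearly warp an (H, W) image by an (H, W, 2) flow field."""
    h, w = image.shape
    ys, xs = np.mgrid[0:h, 0:w].astype(float)
    xq = np.clip(xs + flow[..., 0], 0, w - 1)   # query coordinates, clamped to the image
    yq = np.clip(ys + flow[..., 1], 0, h - 1)
    x0, y0 = np.floor(xq).astype(int), np.floor(yq).astype(int)
    x1, y1 = np.minimum(x0 + 1, w - 1), np.minimum(y0 + 1, h - 1)
    wx, wy = xq - x0, yq - y0
    return ((1 - wx) * (1 - wy) * image[y0, x0] + wx * (1 - wy) * image[y0, x1] +
            (1 - wx) * wy * image[y1, x0] + wx * wy * image[y1, x1])

# Cascade: each stage re-warps the previous stage's prediction toward the fixed image.
moving = np.random.rand(8, 8)
predicted = moving
for _ in range(3):                 # three cascaded stages
    flow = np.zeros((8, 8, 2))     # stand-in for the flow field produced by the decoding layer
    flow[..., 0] = 1.0 / 3.0       # each stage contributes a fraction of the displacement
    predicted = warp(predicted, flow)
```

The cascade loop mirrors the claimed structure: only the first stage sees the original moving image; every later stage receives the fixed image and the previous stage's predicted image.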
2. The method according to claim 1, further comprising:
training the bidirectional registration network formed by the first multi-layer motion flow field prediction network and the second multi-layer motion flow field prediction network according to a pre-constructed loss function, and obtaining the network parameters.
3. The method of claim 2, wherein the loss function comprises: an EPE loss, an IOU loss, a content feature loss, and a normalized cross-correlation loss;
taking the first multi-layer motion flow field prediction network as an example, the EPE loss is:
L_EPE = (1/N) Σ_p ‖φ̂(p) − φ(p)‖₂
wherein N represents the number of pixels of the image, φ represents the motion flow field of the first moving image M relative to the first fixed image F, and φ̂ represents the flow field predicted by the network;
the IOU loss is:
L_IOU = 1 − |S_F ∩ S_P| / |S_F ∪ S_P|
wherein S_F represents the building semantic label corresponding to the first fixed image F, and S_P represents the building semantic label of the first predicted image;
the content feature loss is:
L_content = ‖Φ(P) − Φ(F)‖₂²
wherein Φ(P) represents the content features extracted, using a VGG-16 model pre-trained on ImageNet, from the predicted image P output by the motion flow field prediction network, and Φ(F) represents the content features extracted from the first fixed image F using the same pre-trained VGG-16 model;
the normalized cross-correlation loss is:
NCC(A, B) = Σ_p [ Σ_{p_i∈Ω} (A(p_i) − Ā(p)) (B(p_i) − B̄(p)) ]² / [ Σ_{p_i∈Ω} (A(p_i) − Ā(p))² · Σ_{p_i∈Ω} (B(p_i) − B̄(p))² ]
wherein A and B are two input image blocks, Ā(p) and B̄(p) are respectively the local means of A and B at location p, and Ω represents all pixels in an image block.
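For illustration only (outside the claim language), the four losses can be sketched in conventional textbook form with NumPy. The patent's exact equations are rendered as images in the source and may differ in detail; in particular, the global (rather than local-window) normalized cross-correlation below is a deliberate simplification, and all function names are hypothetical.

```python
import numpy as np

def epe_loss(flow_pred, flow_ref):
    """Average endpoint error between two (H, W, 2) flow fields."""
    return float(np.mean(np.linalg.norm(flow_pred - flow_ref, axis=-1)))

def iou_loss(label_fixed, label_pred):
    """1 - IoU between binary building-semantic label maps."""
    inter = np.logical_and(label_fixed, label_pred).sum()
    union = np.logical_or(label_fixed, label_pred).sum()
    return 1.0 - inter / max(union, 1)

def content_loss(feat_pred, feat_fixed):
    """Squared L2 distance between content features (e.g. from an
    ImageNet-pretrained VGG-16, which would be applied upstream)."""
    return float(np.mean((feat_pred - feat_fixed) ** 2))

def ncc_loss(a, b, eps=1e-8):
    """Negative global normalized cross-correlation of two image blocks
    (the claimed formulation uses local means over a window around each pixel)."""
    a0, b0 = a - a.mean(), b - b.mean()
    ncc = (a0 * b0).sum() / (np.sqrt((a0 ** 2).sum() * (b0 ** 2).sum()) + eps)
    return -ncc
```

A weighted sum of these terms would then drive training of the bidirectional registration network, with the NCC term rewarding intensity alignment and the IoU term rewarding overlap of building footprints.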
4. A remote sensing image automatic registration device based on a non-rigid bidirectional registration network, the device comprising:
a first unidirectional registration module, used for constructing a first multi-layer motion flow field prediction network, and inputting a first moving image and a first fixed image to be registered into the first multi-layer motion flow field prediction network to obtain a first predicted image; the motion flow field prediction networks are connected in cascade, the input of the first-stage motion flow field prediction network being the first moving image and the first fixed image of the remote sensing images to be registered, and the input of each subsequent motion flow field prediction network being the first fixed image and the predicted image output by the previous-stage motion flow field prediction network;
a second unidirectional registration module, used for taking the first predicted image output by the first multi-layer motion flow field prediction network as a second moving image and a similar fixed image corresponding to the first fixed image as a second fixed image, inputting them into a pre-constructed second multi-layer motion flow field prediction network, and outputting a converted image; the second fixed image satisfies style consistency with the first fixed image and corresponds pixel-by-pixel to the first moving image;
wherein the motion flow field prediction network comprises: a correlation layer, a mutual matching layer, a neighborhood correlation layer, a mutual matching layer, a decoding layer, and a resampling layer;
the first unidirectional registration module is further used for calculating a first feature correlation map between the first moving image and the first fixed image through the correlation layer; processing the first feature correlation map through the mutual matching layer, the neighborhood correlation layer, and the mutual matching layer to obtain a second feature map; decoding the second feature map through the decoding layer to obtain a predicted flow field; and warping the input moving image according to the predicted flow field through the resampling layer to obtain the first predicted image;
wherein the second fixed image and the similar fixed image satisfy a style consistency constraint.
CN202111644476.5A 2021-12-29 2021-12-29 Remote sensing image automatic registration method and device based on non-rigid bidirectional registration network Active CN114332181B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111644476.5A CN114332181B (en) 2021-12-29 2021-12-29 Remote sensing image automatic registration method and device based on non-rigid bidirectional registration network

Publications (2)

Publication Number Publication Date
CN114332181A CN114332181A (en) 2022-04-12
CN114332181B true CN114332181B (en) 2024-02-20

Family

ID=81017246

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111644476.5A Active CN114332181B (en) 2021-12-29 2021-12-29 Remote sensing image automatic registration method and device based on non-rigid bidirectional registration network

Country Status (1)

Country Link
CN (1) CN114332181B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110689060A (en) * 2019-09-16 2020-01-14 西安电子科技大学 Heterogeneous image matching method based on aggregation feature difference learning network
CN110838139A (en) * 2019-11-04 2020-02-25 上海联影智能医疗科技有限公司 Training method of image registration model, image registration method and computer equipment
CN112785542A (en) * 2021-02-07 2021-05-11 中国人民解放军国防科技大学 Method and device for converting remote sensing image into network map, computer equipment and medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7327865B2 (en) * 2004-06-30 2008-02-05 Accuray, Inc. Fiducial-less tracking with non-rigid image registration


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Research on hyperspectral image registration based on deep learning; Xu Dongli; Master's Electronic Journals (Issue 08, 2020); full text *


Similar Documents

Publication Publication Date Title
CN113902926B (en) General image target detection method and device based on self-attention mechanism
Lai et al. Deep recurrent regression for facial landmark detection
CN113673594B (en) Defect point identification method based on deep learning network
Piccinelli et al. iDisc: Internal discretization for monocular depth estimation
US20230055146A1 (en) Methods for recognizing small targets based on deep learning networks
EP3965071A2 (en) Method and apparatus for pose identification
CN111160288A (en) Gesture key point detection method and device, computer equipment and storage medium
Jiang et al. Learning for mismatch removal via graph attention networks
Fu et al. Learning to reduce scale differences for large-scale invariant image matching
Chen et al. A hierarchical consensus attention network for feature matching of remote sensing images
Chen et al. CSR-Net: Learning adaptive context structure representation for robust feature correspondence
Chen et al. StateNet: Deep state learning for robust feature matching of remote sensing images
Khan et al. A survey of the vision transformers and their CNN-transformer based variants
Zhang et al. DHNet: Salient object detection with dynamic scale-aware learning and hard-sample refinement
CN114550014A (en) Road segmentation method and computer device
Yang et al. Robust visual tracking using adaptive local appearance model for smart transportation
CN114332181B (en) Remote sensing image automatic registration method and device based on non-rigid bidirectional registration network
CN113159053A (en) Image recognition method and device and computing equipment
CN113158831A (en) Method and device for detecting movement of camera equipment, computer equipment and storage medium
Zeng et al. Adaptive Edge-aware Semantic Interaction Network for Salient Object Detection in Optical Remote Sensing Images
Dutta et al. Best pair formulation & accelerated scheme for non-convex principal component pursuit
CN114708436B (en) Training method of semantic segmentation model, semantic segmentation method, semantic segmentation device and semantic segmentation medium
CN115861384A (en) Optical flow estimation method and system based on generation of countermeasure and attention mechanism
Wang et al. EMAT: Efficient feature fusion network for visual tracking via optimized multi-head attention
Jun et al. Two-view correspondence learning via complex information extraction

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant