CN114612698A - Infrared and visible light image registration method and system based on hierarchical matching
- Publication number: CN114612698A
- Application number: CN202210191584.XA
- Authority: CN (China)
- Legal status: Granted
Classifications
- G06F18/22: Pattern recognition; matching criteria, e.g. proximity measures
- G06F18/214: Pattern recognition; generating training patterns, bootstrap methods, e.g. bagging or boosting
- G06F18/23: Pattern recognition; clustering techniques
- G06N3/045: Neural networks; combinations of networks
- G06N3/08: Neural networks; learning methods
- Y02T10/40: Engine management systems
Abstract
The invention belongs to the technical field of image processing and provides an infrared and visible light image registration method and system based on hierarchical matching. The method comprises: acquiring an infrared image and a visible light image; pre-screening pixels of the visible light image based on local aggregation features to obtain the pixel pre-screened visible light image; extracting and matching feature points between the infrared image and the pixel pre-screened visible light image; obtaining the transformation parameters from the infrared image to the visible light image with a progressive sample consensus algorithm, according to the pixel coordinates of the matched feature point pairs in the two images; and transforming the coordinates of the infrared image into the visible light image coordinate system according to the transformation parameters, thereby realizing hierarchical registration of the infrared and visible light images.
Description
Technical Field
The invention belongs to the technical field of image processing, and particularly relates to an infrared and visible light image registration method and system based on hierarchical matching.
Background
The statements in this section merely provide background information related to the present disclosure and may not necessarily constitute prior art.
Because the infrared lens and the visible light lens differ in position, focal length, distortion parameters and the like, deformations such as offset and scaling inevitably exist between the infrared and visible light modal images collected by the same thermal infrared imager. Fusion analysis of infrared and visible light images therefore first requires accurate registration of the two modalities, while overcoming the interference caused by certain objects in the visible light image; the registration process generally includes pixel screening, corner extraction, feature computation and feature matching.
In the traditional approach of performing corner extraction, feature computation and feature matching directly on the original images, the Scale-Invariant Feature Transform (SIFT) is often not robust to the modality change between infrared and visible light, and feature matching with the k-d tree nearest-neighbor query algorithm (Best Bin First, BBF) ignores the context of neighboring feature points and is prone to matching errors. Compared with traditional methods, deep-learning methods such as SuperPoint for feature point extraction and description, and deep graph convolutional networks such as SuperGlue for matching, tend to produce denser and, to a certain extent, more stable matching pairs. However, the inventors found that when feature point extraction and matching are performed directly on the infrared and visible light images, feature points or matches extracted from sky, clouds or other components irrelevant to registration frequently produce invalid matches that interfere with subsequent registration and fusion.
Disclosure of Invention
In order to solve the technical problems in the background art, the invention provides an infrared and visible light image registration method and system based on hierarchical matching, which achieves more accurate registration of infrared and visible light images than traditional methods and obtains accurate registration results for images photographed from different angles.
In order to achieve the purpose, the invention adopts the following technical scheme:
The first aspect of the invention provides an infrared and visible light image registration method based on hierarchical matching, which comprises the following steps:
acquiring an infrared image and a visible light image;
pre-screening pixels of the visible light image based on local aggregation features to obtain the pixel pre-screened visible light image;
extracting and matching feature points between the infrared image and the pixel pre-screened visible light image;
according to the pixel coordinates of the matched feature point pairs in the infrared image and the visible light image, obtaining the transformation parameters from the infrared image to the visible light image with a progressive sample consensus algorithm;
and transforming the coordinates of the infrared image into the visible light image coordinate system according to the transformation parameters, realizing hierarchical registration of the infrared and visible light images.
A second aspect of the present invention provides an infrared and visible light image registration system based on hierarchical matching, comprising:
the image acquisition module is used for acquiring an infrared image and a visible light image;
the pixel pre-screening module is used for pre-screening pixels of the visible light image based on local aggregation features to obtain the pixel pre-screened visible light image;
the feature point extraction and matching module is used for extracting and matching feature points between the infrared image and the pixel pre-screened visible light image;
the transformation parameter calculation module is used for obtaining the transformation parameters from the infrared image to the visible light image with a progressive sample consensus algorithm, according to the pixel coordinates of the matched feature point pairs in the infrared image and the visible light image;
and the hierarchical registration module is used for transforming the coordinates of the infrared image into the visible light image coordinate system according to the transformation parameters, realizing hierarchical registration of the infrared and visible light images.
A third aspect of the present invention provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the infrared and visible light image registration method based on hierarchical matching as described above.
A fourth aspect of the present invention provides an electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the program, implements the steps of the infrared and visible light image registration method based on hierarchical matching as described above.
Compared with the prior art, the invention has the beneficial effects that:
(1) On the basis of pixel pre-screening based on local aggregation features, feature point extraction with a self-supervised learning network, feature point matching with a deep graph convolutional network, and transformation parameter estimation with a progressive sample consensus algorithm, the invention realizes hierarchical-matching-based registration of infrared and visible light images, producing more accurate registration results and more effective matching between the infrared and visible light images.
(2) Compared with traditional image registration methods, the deep-learning-based local aggregation feature network NetVLAD filters out unreliable feature point selection areas, the deep self-supervised SuperPoint extracts more reliable feature points and descriptors, and the SuperGlue feature matching method based on a deep graph convolutional network makes effective use of feature point positions and context information to achieve more effective matching; on this basis, the adopted progressive sample consensus method obtains more accurate transformation parameter estimates, and accurate registration results are obtained for images photographed from different angles.
Advantages of additional aspects of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, provide a further understanding of the invention, illustrate exemplary embodiments of the invention, and together with the description serve to explain the invention without limiting it.
FIG. 1 is a flowchart of a method for infrared and visible image registration based on hierarchical matching according to an embodiment of the present invention;
FIG. 2 is a diagram illustrating NetVLAD pixel prescreening according to an embodiment of the invention;
FIG. 3 is a diagram showing the effect of the visible light image after pixel screening by NetVLAD according to the embodiment of the present invention;
FIG. 4 is a schematic diagram of the SuperPoint feature extraction and description network according to an embodiment of the present invention;
FIG. 5 is an effect diagram of SuperPoint feature point extraction on the infrared image in an embodiment of the invention;
FIG. 6 is an effect diagram of SuperPoint feature point extraction on the pixel pre-screened visible light image according to an embodiment of the present invention;
FIG. 7 is a schematic diagram of a SuperGlue feature matching network according to an embodiment of the present invention;
FIG. 8 is an effect diagram of matching SuperGlue feature points of an infrared image and a visible light image according to an embodiment of the present invention;
FIG. 9 is a final fused image effect diagram according to an embodiment of the present invention.
Detailed Description
The invention is further described with reference to the following figures and examples.
It is to be understood that the following detailed description is exemplary and is intended to provide further explanation of the invention as claimed. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit exemplary embodiments according to the invention. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise; the terms "comprises" and/or "comprising" specify the presence of the stated features, steps, operations, devices, components and/or combinations thereof.
Example one
As shown in fig. 1, the present embodiment provides a method for registering infrared and visible light images based on hierarchical matching, which specifically includes the following steps:
s101: an infrared image and a visible light image are acquired.
The infrared image can be acquired with an infrared image acquisition device, and the visible light image with a visible light image acquisition device.
S102: and pre-screening pixels of the visible light image based on the local aggregation characteristics to obtain the visible light image after the pixel pre-screening.
In this embodiment, pixel pre-screening is performed with the deep-learning-based local aggregation feature method NetVLAD. Specifically, the utility of each cluster of local aggregation features is calculated based on a convolutional neural network and a local aggregation feature network, and the pixels are pre-screened accordingly; here, the utility of a cluster is measured by the L2 distance, on that cluster, between a query's local aggregated feature and that of its matched negative sample.
hard-assigning the visible light image pixels to the clusters of the local aggregation features;
and retaining the pixels assigned to high-utility clusters as the pre-screened pixels for feature point extraction.
This is because existing methods that extract and match feature points directly on the original image easily pick up feature points in unstable regions that are difficult to match accurately, such as sky and clouds, or objects with periodically varying textures such as sleeves; pre-screening the pixels first avoids these regions.
As shown in fig. 2, the CNN feature extraction module in this example consists of a VGG16 network with the classification layer removed, and the NetVLAD part contains 16 cluster centers and their soft-assignment weight parameters. For an input image of size W×H, the CNN feature extraction module outputs a feature map of 512-dimensional local features, which NetVLAD soft-assigns to and aggregates over the 16 clusters by residual aggregation, yielding a 16×512-dimensional NetVLAD output feature:

$$V(k) = \sum_{i=1}^{N} \frac{e^{w_k^{T} x_i + b_k}}{\sum_{k'} e^{w_{k'}^{T} x_i + b_{k'}}}\,(x_i - c_k)$$

where N is the number of spatial locations of the feature map and x_i is a 512-dimensional feature column vector of the CNN output feature map; w_k and b_k are the learnable parameters of the k-th cluster, and c_k is the center of the k-th cluster. Before training begins, the cluster centers c_k must be initialized: all images are passed through the CNN feature extraction module to obtain initial features, and K-means clustering over all initial features yields the 16 cluster centers.
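As a minimal PyTorch sketch of the residual aggregation above (the 16 clusters and 512-dimensional features follow the text; the class and variable names are illustrative, not from the patent):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NetVLADLayer(nn.Module):
    """Soft-assigns D-dim local features to K clusters and aggregates residuals."""
    def __init__(self, num_clusters=16, dim=512):
        super().__init__()
        self.conv = nn.Conv2d(dim, num_clusters, kernel_size=1)        # w_k, b_k
        self.centroids = nn.Parameter(torch.randn(num_clusters, dim))  # c_k (K-means init in practice)

    def forward(self, x):                                      # x: (B, D, H', W') CNN feature map
        soft_assign = F.softmax(self.conv(x).flatten(2), dim=1)  # (B, K, N) soft assignments
        x_flat = x.flatten(2)                                     # (B, D, N)
        # residuals (x_i - c_k), weighted by soft assignment, summed over all N locations
        residual = x_flat.unsqueeze(1) - self.centroids[None, :, :, None]  # (B, K, D, N)
        vlad = (soft_assign.unsqueeze(2) * residual).sum(-1)      # (B, K, D)
        return F.normalize(vlad, p=2, dim=2)                      # per-cluster L2 normalization
```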
The CNN module and the NetVLAD module of this example are trained on an image matching training set, so that the features of images of the same location under slightly different viewing angles are pulled together and the features of images of different locations are pushed apart. After training, the utility of each cluster is computed from the L2 distances between a cluster's feature in a query image and the features of the corresponding cluster in the non-matching images of the validation set, averaged over all query images:

$$U_k = \frac{1}{N} \sum_{n=1}^{N} \left\| (V_k)_a - (V_k)_n \right\|_2$$

where U_k denotes the utility of the k-th cluster, N is the total number of query images, (V_k)_a is the query image's feature on the k-th cluster, and (V_k)_n is the feature on the k-th cluster of the negative sample matched to the query image.
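A hedged sketch of this utility computation, assuming for simplicity one matched negative per query (the text sums over all non-matching validation images):

```python
import torch

def cluster_utility(vlad_queries, vlad_negatives):
    """U_k: mean L2 distance between each query's and its negative's k-th cluster feature.
    vlad_queries, vlad_negatives: (N, K, D) NetVLAD outputs for N query/negative pairs."""
    return (vlad_queries - vlad_negatives).norm(p=2, dim=2).mean(dim=0)  # (K,)
```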
The visible light image is then fed into the trained CNN to extract its feature map, which is hard-assigned to the trained NetVLAD clusters: each point of the feature map is mapped to its single nearest cluster, and each such point corresponds to a pixel region of the original input image. High-utility clusters are selected, and the corresponding pixel regions of the input visible light image are used as the pre-screened pixels for feature point extraction and matching in subsequent steps. In this example, the effect after pixel screening through the local aggregation feature network is shown in fig. 3.
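The hard assignment and pixel selection could look like the following sketch; the stride of 16 between feature map and input pixels and the number of retained clusters are assumptions, not stated in the text:

```python
import torch
import torch.nn.functional as F

def prescreen_mask(feature_map, centroids, utility, keep=8, stride=16):
    """Hard-assign each feature-map location to its nearest cluster and keep
    only the locations belonging to high-utility clusters."""
    D, Hp, Wp = feature_map.shape
    feats = feature_map.flatten(1).T                          # (N, D) local features
    assign = torch.cdist(feats, centroids).argmin(dim=1).reshape(Hp, Wp)  # nearest cluster
    good = utility.argsort(descending=True)[:keep]            # indices of high-utility clusters
    mask = torch.isin(assign, good).float()
    # map the coarse feature-map mask back to input-image pixels
    return F.interpolate(mask[None, None], scale_factor=stride, mode="nearest")[0, 0]
```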
S103: and extracting and matching characteristic points of the infrared image and the visible light image subjected to pixel pre-screening.
In the specific implementation, a self-supervised learning network (the SuperPoint method) is used to extract feature points from both the infrared image and the pixel pre-screened visible light image.
The learning process of the self-supervised learning network is as follows:
constructing an image set containing unambiguous corner information for training an initial corner extraction network;
applying random homography transformations to the original images and adding noise, then obtaining the feature point positions in the transformed images with the trained initial network;
and supervising the network with the transformed images and their corner information, thereby learning a self-supervised network capable of detecting feature points in both infrared and visible light images.
Since traditional point feature extraction and description methods such as SIFT and SURF (Speeded-Up Robust Features) lack robustness to modality changes, this embodiment adopts the deep-learning-based SuperPoint method to extract and describe feature points of the images in the two modalities, infrared and visible light.
As shown in fig. 4, the SuperPoint feature extraction and description network in this embodiment consists of one encoder and two decoders. The encoder is a VGG-style convolutional network comprising several convolution and pooling layers, which encodes the W×H input image. The two decoders handle feature point extraction and feature point description, respectively: the feature point extraction decoder consists of a convolution layer, a Softmax layer and a Reshape layer and outputs a W×H×1 map in which each pixel's value is the probability that the pixel is a feature point; the feature description decoder consists of a convolution layer, a bilinear interpolation layer and an L2 normalization layer and outputs a W×H×D feature map in which each pixel corresponds to a D-dimensional feature vector.
Because ground-truth feature points are difficult to label manually, this embodiment uses a self-supervised learning strategy that requires no ground truth. Specifically: first, a synthetic image set of simple shapes such as triangles, quadrilaterals, cubes and checkerboards, whose corner information is unambiguous, is constructed, and an initial corner extraction network is trained on it; then, various random homography transformations are applied to real images and noise is added; the trained initial corner network extracts feature points from these images, which serve as supervision for learning the SuperPoint network. The feature point extraction results of this embodiment are shown in figs. 5 and 6.
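For illustration, the keypoint decoder's conv/Softmax/Reshape sequence can be sketched as follows; the 65-channel layout (an 8×8 pixel cell plus one "no keypoint" dustbin channel) is taken from the SuperPoint paper and is an assumption beyond what the text states:

```python
import torch.nn as nn
import torch.nn.functional as F

def detector_head(feats, head_conv):
    """feats: (B, C, H/8, W/8) encoder output; head_conv: 1x1 conv to 65 channels."""
    logits = head_conv(feats)                    # (B, 65, H/8, W/8)
    prob = F.softmax(logits, dim=1)[:, :-1]      # drop the dustbin channel
    B, _, Hc, Wc = prob.shape
    prob = prob.reshape(B, 8, 8, Hc, Wc).permute(0, 3, 1, 4, 2)  # re-tile the 8x8 cells
    return prob.reshape(B, Hc * 8, Wc * 8)       # W x H keypoint probability map

# e.g. head_conv = nn.Conv2d(128, 65, kernel_size=1)
```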
In the specific implementation, since traditional feature matching methods mostly match each feature point independently and are therefore error-prone, this embodiment adopts the SuperGlue method based on a deep graph neural network, which combines the spatial geometric relationships between feature points with their mutual correlation information to improve matching accuracy.
As shown in fig. 7, the deep graph convolutional network comprises an attentional graph convolutional network and an optimal matching layer. The input of the attentional graph convolutional network is the set of feature point positions and description vectors of a pair of images, and its output is the feature descriptors after spatial information aggregation; the input of the optimal matching layer is the feature descriptors output by the attentional graph convolutional network, and its output is the matching result.
In the attentional graph convolutional network, the feature point positions are first encoded and added to the feature descriptor vectors to obtain the initial representation of each feature point.
Specifically: the feature point positions are lifted to high-dimensional vectors by a keypoint encoder consisting of a multilayer perceptron (MLP) and then added to the feature descriptor vectors, giving the initial representation of each feature point.
Then a multivariate graph is constructed. Its vertices are all the feature points in the two images, and its edges include intra-image edges and cross-image edges: intra-image edges connect feature point pairs within a single image, while cross-image edges connect feature point pairs from the two images. After the graph is constructed, a message-passing mechanism alternately aggregates and updates information along the intra-image and cross-image edges at all vertices of the graph, producing an updated description vector for each feature point.
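One message-passing step along intra-image (self) or cross-image edges might be sketched as below; the residual MLP update follows the SuperGlue design, while the dimensions and names are illustrative:

```python
import torch
import torch.nn as nn

class AttentionalPropagation(nn.Module):
    """Updates keypoint features with messages attended from a source set:
    source = the same image (intra-image edges) or the other image (cross-image edges)."""
    def __init__(self, dim=256, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.mlp = nn.Sequential(nn.Linear(2 * dim, 2 * dim), nn.ReLU(),
                                 nn.Linear(2 * dim, dim))

    def forward(self, x, source):                 # x: (B, M, dim), source: (B, N, dim)
        msg, _ = self.attn(x, source, source)     # aggregate messages over the edges
        return x + self.mlp(torch.cat([x, msg], dim=-1))  # residual update
```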
The optimal matching layer then computes an M×N similarity matrix S from the updated features, in which each cell represents the similarity between a feature of image A and a feature of image B. Because of occlusion or differing fields of view, a feature point in one image may have no matching feature point in the other image; S is therefore expanded to an (M+1)×(N+1) matrix whose added row and column describe the case that a feature point has no match.
The feature point matching problem is thus converted into an optimal transport problem, which can be solved with the Sinkhorn algorithm; since the Sinkhorn algorithm is differentiable, it can be implemented as a network layer.
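A simplified, differentiable Sinkhorn normalization in log space (the full method additionally weights the dustbin row/column marginals; this sketch just alternates row and column normalization):

```python
import torch

def sinkhorn(scores, iters=20):
    """scores: augmented (M+1) x (N+1) log-score matrix including the dustbins."""
    log_P = scores
    for _ in range(iters):
        log_P = log_P - torch.logsumexp(log_P, dim=1, keepdim=True)  # normalize rows
        log_P = log_P - torch.logsumexp(log_P, dim=0, keepdim=True)  # normalize columns
    return log_P.exp()  # soft assignment matrix
```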
S104: and obtaining the conversion parameters from the infrared image to the visible light image by utilizing a progressive consistent sampling algorithm according to the pixel coordinates of the matched characteristic point pairs in the infrared image and the visible light image.
After the matched feature point pairs in the infrared and visible light images are obtained, the transformation parameters between the images are calculated from the pixel coordinates of those pairs: all matching pairs are given a quality score Q and sorted in descending order of Q, and each iteration samples randomly according to this descending order for model hypothesis and verification.
In this embodiment, the transformation parameters from the infrared image to the visible light image are estimated with the progressive sample consensus (PROSAC) algorithm from the pixel coordinates of the matched feature points in the two images.
Specifically: after the matched feature point pairs in the infrared and visible light images are obtained, this embodiment estimates the transformation parameters between the images from the pixel coordinates of those pairs. Because the offset between the infrared lens and the visible light lens is very small relative to the distance of the photographed object, the two lenses can be approximated as sharing a common optical center, and the infrared image can be transformed into the coordinate system of the visible light image with a homography matrix H. In homogeneous coordinates, the pixel coordinate transformation between the two images can be expressed as:

$$\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} \sim H \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}, \qquad H = \begin{bmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & h_{33} \end{bmatrix}$$

where h_{33} = 1 in the homography matrix H. The transformation matrix therefore has 8 degrees of freedom and can be estimated from four or more pairs of feature points.
Because matching noise and even mismatched outliers exist, least squares, RANSAC and similar methods are commonly used for parameter estimation. RANSAC, however, treats every feature point pair equally and samples randomly from the entire correspondence set, which makes the estimation result random and convergence slow. For this reason, this embodiment estimates the transformation parameters with the PROSAC algorithm.
PROSAC is a semi-random design: all matching point pairs are given a quality score Q, sorted in descending order of Q, and each iteration preferentially samples high-quality pairs for model hypothesis and verification. This reduces algorithm complexity, improves efficiency, and avoids the non-guaranteed convergence of the fully random RANSAC.
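As a hedged sketch, OpenCV's USAC framework exposes a PROSAC sampler (cv2.USAC_PROSAC, available in OpenCV >= 4.5; it expects the correspondences pre-sorted by quality), so the estimation step could look like:

```python
import cv2
import numpy as np

def estimate_homography(ir_pts, vis_pts, quality):
    """ir_pts, vis_pts: (M, 2) matched pixel coordinates; quality: (M,) Q scores."""
    order = np.argsort(-quality)                      # sort by descending quality
    src = np.float32(ir_pts)[order].reshape(-1, 1, 2)
    dst = np.float32(vis_pts)[order].reshape(-1, 1, 2)
    method = getattr(cv2, "USAC_PROSAC", cv2.RANSAC)  # fall back to RANSAC if absent
    H, inliers = cv2.findHomography(src, dst, method, ransacReprojThreshold=3.0)
    return H, inliers
```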
S105: and transforming the coordinates of the infrared image into a visible light image coordinate system according to the transformation parameters to realize hierarchical registration of the infrared image and the visible light image. Wherein, fig. 8 is an effect diagram of the infrared image and the visible light image after matching the SuperGlue feature points.
Specifically, after the transformation model is computed, the coordinates of the infrared image are transformed into the coordinate system of the visible light image, and the final fused image is obtained by averaging the pixels at corresponding positions, as shown in fig. 9.
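The final warp-and-average step, sketched with OpenCV (assuming the two images share the same channel layout):

```python
import cv2
import numpy as np

def register_and_fuse(ir, vis, H):
    """Warp the infrared image into the visible-light frame, then average the pixels."""
    h, w = vis.shape[:2]
    ir_warped = cv2.warpPerspective(ir, H, (w, h))    # infrared -> visible coordinates
    fused = (ir_warped.astype(np.float32) + vis.astype(np.float32)) / 2.0
    return fused.astype(np.uint8)
```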
Given the differences between the infrared and visible light modalities, this embodiment combines deep-learning-based NetVLAD pixel pre-screening of the visible light image, deep self-supervised SuperPoint feature point extraction and descriptor computation, SuperGlue feature matching based on a deep graph convolutional network, and the progressive sample consensus algorithm to match the infrared and visible light images. It achieves more accurate registration than traditional methods and obtains accurate registration results for images of different power transformation equipment photographed from different angles.
Example two
The embodiment provides an infrared and visible light image registration system based on hierarchical matching, which specifically comprises the following modules:
the image acquisition module is used for acquiring an infrared image and a visible light image;
the pixel pre-screening module is used for pre-screening pixels of the visible light image based on local aggregation features to obtain the pixel pre-screened visible light image;
the feature point extraction and matching module is used for extracting and matching feature points between the infrared image and the pixel pre-screened visible light image;
the transformation parameter calculation module is used for obtaining the transformation parameters from the infrared image to the visible light image with a progressive sample consensus algorithm, according to the pixel coordinates of the matched feature point pairs in the infrared image and the visible light image;
and the hierarchical registration module is used for transforming the coordinates of the infrared image into the visible light image coordinate system according to the transformation parameters, realizing hierarchical registration of the infrared and visible light images.
It should be noted that, each module in the present embodiment corresponds to each step in the first embodiment one to one, and the specific implementation process is the same, which is not described herein again.
Example three
This embodiment provides a computer-readable storage medium on which a computer program is stored which, when executed by a processor, implements the steps of the infrared and visible light image registration method based on hierarchical matching as described above.
Example four
The present embodiment provides an electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps of the infrared and visible image registration method based on hierarchical matching as described above when executing the program.
The present invention has been described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.
Claims (10)
1. A method for registering infrared and visible light images based on hierarchical matching is characterized by comprising the following steps:
acquiring an infrared image and a visible light image;
pre-screening pixels of the visible light image based on local aggregation features to obtain the pixel pre-screened visible light image;
extracting and matching feature points between the infrared image and the pixel pre-screened visible light image;
according to the pixel coordinates of the matched feature point pairs in the infrared image and the visible light image, obtaining transformation parameters from the infrared image to the visible light image with a progressive sample consensus algorithm;
and transforming the coordinates of the infrared image into the visible light image coordinate system according to the transformation parameters, realizing hierarchical registration of the infrared and visible light images.
2. The infrared and visible light image registration method based on hierarchical matching according to claim 1, wherein in the process of pre-screening the pixels of the visible light image, the utility of each cluster of local aggregation features is calculated based on a convolutional neural network and a local aggregation feature network, and the pixels are pre-screened accordingly; wherein the utility is measured by the L2 distance, on each cluster, between the corresponding local feature and that of the matched negative sample.
3. The infrared and visible light image registration method based on hierarchical matching according to claim 1, wherein feature point extraction is performed on both the infrared image and the pixel pre-screened visible light image using a self-supervised learning network.
4. The infrared and visible light image registration method based on hierarchical matching according to claim 3, wherein the learning process of the self-supervised learning network is as follows:
constructing an image set containing unambiguous corner information for training an initial corner extraction network;
applying random homography transformations to the original images and adding noise, then obtaining the feature point positions in the transformed images with the trained initial network;
and supervising the network with the transformed images and their corner information, thereby learning a self-supervised network capable of detecting feature points in both infrared and visible light images.
5. The infrared and visible light image registration method based on hierarchical matching according to claim 1, wherein feature point matching between the infrared image and the pixel pre-screened visible light image is performed using a deep graph convolutional network.
6. The infrared and visible light image registration method based on hierarchical matching according to claim 5, wherein the deep graph convolutional network comprises an attentional graph convolutional network and an optimal matching layer; the input of the attentional graph convolutional network is the set of feature point positions and description vectors of a pair of images, and its output is feature descriptors after spatial information aggregation; the input of the optimal matching layer is the feature descriptors output by the attentional graph convolutional network, and its output is the matching result.
7. The infrared and visible light image registration method based on hierarchical matching according to claim 6, wherein the optimal matching layer calculates a similarity matrix from the updated features, each cell of the matrix representing the similarity between features of the two images; the matrix is expanded, and the added row and column describe the case that a feature point has no match.
8. An infrared and visible image registration system based on hierarchical matching, comprising:
the image acquisition module is used for acquiring an infrared image and a visible light image;
the pixel pre-screening module is used for pre-screening pixels of the visible light image based on local aggregation features to obtain the pixel pre-screened visible light image;
the feature point extraction and matching module is used for extracting and matching feature points between the infrared image and the pixel pre-screened visible light image;
the transformation parameter calculation module is used for obtaining the transformation parameters from the infrared image to the visible light image with a progressive sample consensus algorithm, according to the pixel coordinates of the matched feature point pairs in the infrared image and the visible light image;
and the hierarchical registration module is used for transforming the coordinates of the infrared image into the visible light image coordinate system according to the transformation parameters, realizing hierarchical registration of the infrared and visible light images.
9. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of the infrared and visible light image registration method based on hierarchical matching according to any one of claims 1-7.
10. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the program, implements the steps of the infrared and visible light image registration method based on hierarchical matching according to any one of claims 1-7.