CN113628261A - Infrared and visible light image registration method in power inspection scene - Google Patents
- Publication number
- CN113628261A (application CN202110892727.5A)
- Authority
- CN
- China
- Prior art keywords
- image
- feature
- infrared
- visible light
- images
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Images
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/30—Determination of transform parameters for the alignment of images, i.e. image registration
- G06T7/33—Determination of transform parameters for the alignment of images, i.e. image registration using feature-based methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/13—Edge detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10024—Color image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10048—Infrared image
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- General Engineering & Computer Science (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Biomedical Technology (AREA)
- Life Sciences & Earth Sciences (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Image Analysis (AREA)
- Image Processing (AREA)
Abstract
The invention relates to an infrared and visible light image registration method in a power inspection scene, which comprises the following steps: step S1, acquiring an infrared image and a visible light image of the power equipment; step S2, respectively extracting edge information of the infrared and visible light images of the power equipment through a Sobel edge detection operator to obtain the infrared and visible light edge images; step S3, respectively detecting the feature points of the two edge images with a SuperPoint feature extraction network and calculating descriptors; step S4, matching the feature points through a SuperGlue feature matching network according to the feature points of the two edge images obtained in step S3, screening to obtain correct feature point matching pairs, and simultaneously removing unmatchable feature points; and step S5, calculating affine transformation model parameters according to the matched feature points, and performing spatial coordinate transformation on the image to be registered through bilinear interpolation to realize image registration. The method achieves accurate registration of the infrared and visible light images of the power equipment and acquires the temperature information of the power equipment against the background of the visible light image.
Description
Technical Field
The invention belongs to the technical field of image processing, and particularly relates to an infrared and visible light image registration method in a power inspection scene.
Background
The construction and maintenance of the power grid play an important role in the development of the country and the society, and the electric power industry is an important guarantee for the improvement of the comprehensive strength of the country and the high-speed development of the society. The power equipment is an important component in the power grid, and when the power equipment normally works, a certain amount of heat is generated under the action of current, but the temperature should be within a certain range. When the power equipment ages or fails, the abnormal condition of local heating of the equipment can occur, and the safe and stable operation of the power grid is endangered. Therefore, it is necessary to perform safety detection on the power equipment, find abnormal heating conditions of the equipment in time and maintain the equipment.
The existing heating detection of the power equipment is usually carried out by using a temperature measuring instrument or an infrared camera, the detection instrument needs to be arranged in a power system on a large scale, the cost is high, dead corners are easy to miss, the detection result needs to be analyzed and judged by technicians, the cost is high in manpower and material resources, and the efficiency is low. With the development of an image processing technology, a mode of cooperative processing of infrared and visible light images can be adopted, and the functions of acquiring the temperature information of the power equipment in the background of the visible light image are realized by combining the characteristics that the infrared image can detect the temperature of an object, the anti-interference capability is strong, the detail information of the visible light image is rich, and the resolution is high. The detection result information combining the infrared image and the visible light image is rich, easy to observe, and convenient for technical personnel to carry out abnormal heating detection on the power equipment.
The precondition of cooperative processing of infrared and visible light images is registration of the two images. Image registration is the process of identifying the same or similar structures and contents in two or more images acquired from different sensors, different viewing angles or different times, determining transformation parameters between the images according to some similarity measure, and transforming the images into the same coordinate system to obtain the best match at the pixel level. Infrared and visible light image registration belongs to multi-modal image registration. Because the imaging mechanisms of infrared and visible light images differ, the images differ markedly: the infrared image has lower resolution, is blurrier and carries poorer detail information than the visible light image, and the gray-level features of the two differ greatly, so registration is difficult. Most existing methods perform registration based on image point features, such as SIFT and SURF. However, most of these methods have low accuracy and precision when registering infrared and visible light images and cannot achieve reliable registration. To address this problem, a method is needed that can accurately register infrared and visible light images.
Disclosure of Invention
In view of this, the present invention provides an infrared and visible light image registration method in a power inspection scene, which is used for accurately registering infrared and visible light images of a power device and acquiring temperature information of the power device in a background of the visible light image.
In order to achieve the purpose, the invention adopts the following technical scheme:
an infrared and visible light image registration method under a power inspection scene comprises the following steps:
step S1, acquiring an infrared image and a visible light image of the power equipment;
step S2, respectively extracting edge information of the infrared and visible light images of the power equipment through a Sobel edge detection operator to obtain the edge images of the infrared and visible light;
step S3, respectively detecting the characteristic points of the two edge images by using a SuperPoint characteristic extraction network and calculating a descriptor;
s4, matching the feature points through a SuperGlue feature matching network according to the feature points of the two edge images obtained in the step S3, screening to obtain correct feature point matching pairs, and simultaneously removing unmatchable feature points;
and step S5, calculating affine transformation model parameters according to the matched feature points, and performing space coordinate transformation on the image to be registered through bilinear interpolation to realize image registration.
Further, the step S2 is specifically:
step S21, carrying out graying processing on the infrared image A and the visible light image B of the power equipment through a grayscale conversion function, and converting the color image into a grayscale image;
Step S22, extracting the infrared edge image and the visible light edge image of the power equipment by using the Sobel edge detection operator. Let Sobel_x and Sobel_y be the horizontal and vertical convolution factors respectively; performing a convolution operation of the factors with the image yields the horizontal and vertical edge detection result images G_x = Sobel_x * A and G_y = Sobel_y * A respectively. The final edge image is obtained by combining the two: G = sqrt(G_x^2 + G_y^2).
Further, the Sobel convolution factors are specifically:

Sobel_x = [ -1 0 +1 ; -2 0 +2 ; -1 0 +1 ], Sobel_y = [ -1 -2 -1 ; 0 0 0 ; +1 +2 +1 ]
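Step S22 can be sketched as follows; this is a minimal numpy illustration of the Sobel edge extraction described above (the function name and the use of edge-replication padding are choices of this sketch, not specified by the patent):

```python
import numpy as np

def sobel_edges(gray):
    """Extract an edge image with the Sobel operator (step S22).

    gray: 2-D float array (grayscale image A or B).
    Returns the gradient magnitude G = sqrt(Gx^2 + Gy^2).
    """
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)  # Sobel_x
    ky = kx.T                                                          # Sobel_y
    h, w = gray.shape
    padded = np.pad(gray, 1, mode="edge")  # replicate borders (a sketch choice)
    gx = np.zeros((h, w))
    gy = np.zeros((h, w))
    for i in range(3):
        for j in range(3):
            patch = padded[i:i + h, j:j + w]
            gx += kx[i, j] * patch
            gy += ky[i, j] * patch
    return np.sqrt(gx ** 2 + gy ** 2)
```

A vertical intensity step produces a strong horizontal gradient G_x and a zero vertical gradient G_y, so the magnitude localizes the edge.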
further, the step S3 is specifically:
Step S31, the acquired infrared edge image of size H × W is processed by an encoder network, so that the edge image changes from an H × W single-channel image into a tensor of spatial size Hc × Wc with greater channel depth, wherein Hc < H and Wc < W;
Step S32, the feature point detection network takes the encoder output tensor as input; it first performs two convolution operations through convolution layers to obtain the score of each pixel as a feature point; the score of each pixel is then mapped by the Softmax function to the probability in [0,1] that the corresponding pixel of the infrared edge image is a feature point; finally, the original size is restored through upsampling;
Step S33, the descriptor decoder network computes a coarse descriptor tensor from the encoder output and upsamples it to the original resolution; the output of the network is a fixed-length descriptor normalized by the L2 norm;
Step S34, the visible light edge image is processed in the same way to obtain the feature points and descriptors of the visible light edge image.
Further, the encoder network structure unit is composed of a convolutional layer Conv, a nonlinear activation function Relu, and a pooling layer Pool, and specifically as follows:
a. Convolutional layer: the convolutional layer first pads the boundary of the input image, then performs a convolution operation on the input image with convolution kernels, extracting the features of the image and outputting a feature map;
b. nonlinear activation function: after each convolutional layer, there is a ReLU nonlinear activation function, which increases the nonlinearity of the neural network.
c. A pooling layer: the pooling layer down-samples the feature map obtained by the convolutional layer, reduces the feature map size output by the convolutional layer, and reduces the amount of computation of the network.
Further, the step S32 is specifically:
a. The tensor output by the encoder network is first convolved twice, with kernel sizes 3 × 3 and 1 × 1 in sequence and stride 1; the output after the convolution operations has 65 channels, of which 64 channels correspond to the non-overlapping local 8 × 8 pixel grid regions in the image, plus 1 channel corresponding to no feature point being detected in an 8 × 8 region; the 1 no-feature-point channel is then removed, leaving 64 channels;
b. The Softmax function maps the score of each pixel to [0,1], giving the probability that each pixel is a feature point;
c. The smaller feature map is enlarged by sub-pixel convolution: first, one pixel is taken at the same position of each of the 64 feature maps and the 64 values are spliced into an 8 × 8 patch; the pixels at the other positions of the feature maps are processed in the same way; finally, the feature map is enlarged to 8 times its original size, and a result map whose size is consistent with the initial infrared edge image is output.
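The sub-pixel enlargement in step c is a depth-to-space rearrangement; a minimal numpy sketch (assuming the 64 channels are ordered row-major within each 8 × 8 patch, which the patent does not state explicitly):

```python
import numpy as np

def depth_to_space_8x(heatmap64):
    """Sub-pixel convolution (depth-to-space) step, as a sketch.

    heatmap64: array of shape (64, Hc, Wc) -- per-cell feature-point
    probabilities after the no-feature-point channel has been removed.
    Returns an (8*Hc, 8*Wc) map in which each 64-vector fills one 8x8 patch.
    """
    c, hc, wc = heatmap64.shape
    assert c == 64
    # (64, Hc, Wc) -> (8, 8, Hc, Wc) -> (Hc, 8, Wc, 8) -> (8*Hc, 8*Wc)
    x = heatmap64.reshape(8, 8, hc, wc)
    x = x.transpose(2, 0, 3, 1)
    return x.reshape(hc * 8, wc * 8)
```

Channel k of cell (i, j) lands at pixel (8·i + k//8, 8·j + k%8) of the full-resolution map.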
Further, the step S33 is specifically:
a. The tensor output by the encoder network is first convolved twice, with kernel sizes 3 × 3 and 1 × 1 in sequence and stride 1, yielding a coarse descriptor tensor;
b. The descriptors corresponding to the feature points are extracted: first the image size is normalized and the feature points are moved to the corresponding positions of the normalized image; then K groups of 1 × 1 × 2 tensors are constructed from the normalized feature points, where K denotes the number of feature points and 2 denotes the horizontal and vertical coordinates of each feature point; the feature point positions are inverse-normalized, and the descriptors at the positions of the corresponding key points are obtained by bilinear interpolation; finally, descriptors of uniform length are obtained through L2 norm normalization.
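The bilinear sampling plus L2 normalization of step b can be sketched as follows. For simplicity this sketch takes feature point coordinates directly in coarse-map pixel units, whereas the patent first normalizes full-image coordinates and inverse-normalizes them; the function name is illustrative:

```python
import numpy as np

def sample_descriptors(desc_map, pts):
    """Bilinearly sample per-point descriptors and L2-normalize them.

    desc_map: (D, Hc, Wc) coarse descriptor tensor.
    pts: (K, 2) feature point coordinates (x, y) in desc_map pixel units.
    Returns a (K, D) array of unit-length descriptors.
    """
    d, h, w = desc_map.shape
    out = np.zeros((len(pts), d))
    for k, (x, y) in enumerate(pts):
        x0, y0 = int(np.floor(x)), int(np.floor(y))
        x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
        ax, ay = x - x0, y - y0
        # weighted average of the four surrounding descriptor columns
        v = ((1 - ay) * (1 - ax) * desc_map[:, y0, x0]
             + (1 - ay) * ax * desc_map[:, y0, x1]
             + ay * (1 - ax) * desc_map[:, y1, x0]
             + ay * ax * desc_map[:, y1, x1])
        out[k] = v / (np.linalg.norm(v) + 1e-12)  # L2 norm normalization
    return out
```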
Further, the step S4 is specifically:
Step S41, after SuperPoint detects the feature point positions p and descriptors d of the infrared edge image and the visible light edge image, M and N feature points are extracted from the two images respectively. For the infrared edge image, the position p_i of each detected feature point and its descriptor d_i are combined to form a feature matching vector, whose initial representation is:

x_i^(0) = d_i + MLP(p_i)

wherein MLP is a multilayer perceptron encoder that embeds the feature point position p_i into a high-dimensional vector; the up-dimensioned vector is then added to the descriptor d_i. The positions and descriptors of the feature points of the visible light edge image are processed in the same way;
Step S42, using a multilayer graph neural network, for each feature point i in each layer, messages along the undirected edges ε_self or ε_cross are aggregated and the updated representation of the vector is calculated; ε_self connects a feature point to all other feature points in the same image, and ε_cross connects the feature point to all feature points in the other image; x_i^(l) denotes the intermediate representation of the vector of feature point i of the infrared edge image at layer l, and m_{ε→i} is the aggregation result over all feature points {j : (i, j) ∈ ε}, ε ∈ {ε_self, ε_cross};

The vector update is calculated as:

x_i^(l+1) = x_i^(l) + MLP([x_i^(l) || m_{ε→i}])

wherein [· || ·] denotes a concatenation operation; starting from layer l = 1, ε = ε_self when l is odd and ε = ε_cross when l is even; by alternately aggregating and updating along the intra-image edges ε_self and the inter-image edges ε_cross, the process by which human beings judge feature matches by browsing back and forth is simulated; after each feature point i of the infrared edge image passes through the L layers of the neural network, the feature matching vector is obtained as

f_i = W · x_i^(L) + b

wherein W and b are the weight and bias; the feature matching vectors f_j of the visible light edge image are obtained correspondingly;
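One common instantiation of the aggregation m_{ε→i} in step S42 is softmax attention over the source set (the other points of the same image for ε_self, or the points of the other image for ε_cross), followed by the residual MLP update. The sketch below uses random, untrained projection matrices purely for illustration; it is not the patent's trained network:

```python
import numpy as np

rng = np.random.default_rng(0)

def attention_aggregate(x_query, x_source):
    """Sketch of m_{eps->i}: softmax attention over source vectors.

    W_q, W_k, W_v are illustrative random projections, not trained weights.
    x_query: (M, D) vectors of the points being updated.
    x_source: (S, D) vectors of the points messages come from.
    """
    d = x_query.shape[1]
    wq, wk, wv = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))
    q, k, v = x_query @ wq, x_source @ wk, x_source @ wv
    scores = q @ k.T / np.sqrt(d)
    attn = np.exp(scores - scores.max(axis=1, keepdims=True))
    attn /= attn.sum(axis=1, keepdims=True)   # rows sum to 1
    return attn @ v                           # (M, D) aggregated messages

def gnn_layer(x, m, mlp):
    """Residual update x^(l+1) = x^(l) + MLP([x^(l) || m])."""
    return x + mlp(np.concatenate([x, m], axis=1))
```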
And step S43, the feature point correspondences of the two images must comply with preset physical constraints: 1) a feature point has at most one corresponding point in the other image; 2) due to factors such as occlusion, some feature points cannot be matched; therefore, the matching of feature points between the images forms a partial assignment between the two feature point sets. For the M and N feature points of the infrared and visible light edge images, a partial assignment matrix P ∈ [0,1]^(M×N) is established, representing all possible matches of the feature points of the two images; each possible correspondence has a confidence value representing the likelihood that it is a correct match, constrained as follows:
P · 1_N ≤ 1_M and P^T · 1_M ≤ 1_N
First, the inner product of the feature matching vectors f_i and f_j is calculated to obtain the feature matching score matrix S_{i,j} = <f_i, f_j>; the feature assignment matrix P is the one that maximizes the total score Σ_{i,j} S_{i,j} P_{i,j} under the constraints; this is treated as an optimal transport problem, and the final feature assignment matrix P is solved by iterating the Sinkhorn algorithm;
The score matrix S is expanded by one extra row and column (dustbin channels) into a matrix S̄ of size (M+1) × (N+1); the redundant channels are used to filter out unmatched feature points;
The constraints become:

P̄ · 1_{N+1} = a and P̄^T · 1_{M+1} = b

wherein a = [1_M^T, N]^T and b = [1_N^T, M]^T. After several iterations of the Sinkhorn algorithm, the extra 1 channel is removed and the feature assignment matrix is restored to P of size M × N.
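Step S43 can be sketched as follows: augment the score matrix with a dustbin row and column, exponentiate, and alternately rescale rows and columns to the marginals a and b. The dustbin score of 0 and the iteration count are illustrative choices of this sketch:

```python
import numpy as np

def sinkhorn_match(scores, iters=50, dustbin=0.0):
    """Sketch of the Sinkhorn step: solve for the partial assignment P.

    scores: (M, N) matching score matrix S.
    Returns the (M, N) assignment matrix with the dustbin removed.
    """
    m, n = scores.shape
    s_aug = np.full((m + 1, n + 1), dustbin)  # dustbin row/column
    s_aug[:m, :n] = scores
    p = np.exp(s_aug)
    a = np.concatenate([np.ones(m), [n]])     # row marginals [1_M, N]
    b = np.concatenate([np.ones(n), [m]])     # column marginals [1_N, M]
    for _ in range(iters):
        p *= (a / p.sum(axis=1))[:, None]     # row normalization
        p *= (b / p.sum(axis=0))[None, :]     # column normalization
    return p[:m, :n]                          # drop the dustbin channels
```

With a strongly diagonal score matrix the mass concentrates on the diagonal matches, while unmatched points drain into the dustbin.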
Further, the step S5 is specifically:
step S51, calculating affine transformation model parameters according to the feature point matching result;
step S52, firstly, establishing a zero matrix with the size consistent with that of the infrared image, and obtaining a corresponding point of each point on the visible light image by carrying out coordinate transformation on each point in the matrix; then, obtaining the pixel value of the point through a bilinear interpolation method, and taking the pixel value as the pixel value of the corresponding point on the image after registration;
and step S53, calculating affine transformation model parameters according to the matched feature points, and performing space coordinate transformation on the image to be registered through bilinear interpolation to obtain a final registered image.
Further, the affine transformation model is expressed as:

x' = a1·x + a2·y + a3
y' = a4·x + a5·y + a6

wherein (x, y) and (x', y') are the coordinates of corresponding points in the two images respectively; the six parameters of the affine transformation model can be determined from correct matching points (at least three non-collinear matching pairs).
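The six parameters above can be estimated from the matched pairs by linear least squares; a minimal sketch (the function name is illustrative, and in practice an outlier-robust estimator such as RANSAC would typically wrap this):

```python
import numpy as np

def fit_affine(src_pts, dst_pts):
    """Least-squares estimate of the 6 affine parameters (step S5).

    src_pts, dst_pts: (K, 2) arrays of matched points (x, y) -> (x', y'),
    with K >= 3 non-collinear correspondences.
    Returns [a1, a2, a3, a4, a5, a6].
    """
    k = len(src_pts)
    a = np.zeros((2 * k, 6))
    bvec = np.zeros(2 * k)
    for i, ((x, y), (xp, yp)) in enumerate(zip(src_pts, dst_pts)):
        a[2 * i] = [x, y, 1, 0, 0, 0]      # x' = a1*x + a2*y + a3
        a[2 * i + 1] = [0, 0, 0, x, y, 1]  # y' = a4*x + a5*y + a6
        bvec[2 * i], bvec[2 * i + 1] = xp, yp
    params, *_ = np.linalg.lstsq(a, bvec, rcond=None)
    return params
```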
Compared with the prior art, the invention has the following beneficial effects:
1. according to the invention, infrared and visible light images of the power equipment are cooperatively processed, and the advantages of strong anti-interference capability, rich visible light image detail information and high resolution of the infrared image, which can detect the temperature of an object, are combined, so that the function of acquiring the temperature information of the power equipment in the background of the visible light image is realized, the registered image information is rich, and the detection by technical personnel is facilitated;
2. according to the method, the edge information of the infrared image and the visible light image is respectively extracted through the edge detection operator, the outline of the image is highlighted, the similarity of the infrared image and the visible light image is improved, and meanwhile, the influence of distortion is better eliminated. By reasonably utilizing the edge information of the image, the image registration time is reduced, and the image registration accuracy is improved;
3. the invention can extract a large number of feature points and their descriptors with high precision; most mismatches can be identified and filtered out while correct matching between the feature point sets is achieved; the image registration accuracy and precision are high, and the algorithm is robust;
4. the invention can realize accurate registration of infrared and visible light images shot by different scenes and different cameras, and has strong applicability.
Drawings
FIG. 1 is a flow chart of the method of the present invention;
fig. 2 is a structure diagram of a SuperPoint network in an embodiment of the present invention;
FIG. 3 is a block diagram of an encoder network according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a maximum pooling layer in one embodiment of the present invention;
FIG. 5 is a diagram of a SuperGlue network architecture in accordance with an embodiment of the present invention;
FIG. 6 shows an example of the present invention.
Detailed Description
The invention is further explained below with reference to the drawings and the embodiments.
In the embodiment, a visual information acquisition part in the infrared and visible light image registration method in the power inspection scene is composed of binocular infrared and visible light cameras, wherein the resolution of the visible light camera is 2048 × 1536, the resolution of the infrared camera is 640 × 480, the two cameras are on the same optical axis, lenses are arranged on the same plane, and the baseline distance is 5-10 cm.
Referring to fig. 1, the invention provides a method for registering infrared and visible light images in a power inspection scene, which includes the following steps:
step S1, acquiring an infrared image A and a visible light image B of the power equipment;
step S2, utilizing a Sobel edge detection operator to carry out edge detection on the infrared and visible light images of the power equipment;
(1) Graying processing. The infrared image A and the visible light image B of the power equipment are grayed through a gray level conversion function, converting the captured color images into grayscale images;
(2) Edge detection. The infrared edge image and the visible light edge image of the power equipment are extracted through the Sobel edge detection operator. The Sobel convolution factors are:

Sobel_x = [ -1 0 +1 ; -2 0 +2 ; -1 0 +1 ], Sobel_y = [ -1 -2 -1 ; 0 0 0 ; +1 +2 +1 ]

These are the horizontal and vertical convolution factors respectively. Performing a convolution operation of the factors with the image yields the horizontal and vertical edge detection result images G_x = Sobel_x * A and G_y = Sobel_y * A respectively. The final edge image is obtained by combining the two: G = sqrt(G_x^2 + G_y^2).
And step S3, extracting the feature points of the infrared and visible light edge images with the SuperPoint feature extraction network and calculating the descriptors.
Preferably, in this embodiment, the structure diagram of the SuperPoint network is shown in fig. 2, and the specific process of extracting the features is as follows:
(2) Image processing. The infrared edge image is processed by the encoder network, so that the processed edge image changes from an H × W single-channel image into a tensor with a smaller spatial size Hc × Wc and a larger channel depth, wherein Hc = H/8 and Wc = W/8 (the encoder contains three stride-2 pooling stages).
Encoder network architecture as shown in fig. 3, the network structure unit is composed of a convolutional layer, a nonlinear activation function, and a pooling layer (Conv-Relu-Pool):
21) Convolutional layers. The convolutional layer first pads the boundary of the input image, then performs a convolution operation on the input image with convolution kernels, extracting the features of the image and outputting feature maps. The encoder network has eight convolutional layers in total; the first four contain 64 convolution kernels of size 3 × 3 with stride 1, so that an input image passing through these layers yields 64 output feature maps; the last four contain 128 convolution kernels of size 3 × 3 with stride 1, yielding 128 output feature maps.
22) Nonlinear activation function. Each convolutional layer is followed by a ReLU nonlinear activation function, which increases the nonlinearity of the neural network; the nonlinear activation function has stronger expressive power while avoiding the vanishing gradient problem. The ReLU function is as follows:

Relu(x) = max(0, x)

For input x, when x is positive the output is unchanged; when x is negative the output is 0. Through this unilateral suppression, the neurons in the network are sparsely activated, and the features of the image can be extracted better.
23) Pooling layers. The pooling layer down-samples the feature maps obtained from the convolutional layers, reducing the feature map size output by the convolutional layers and reducing the amount of computation of the network. The encoder network has three max pooling layers; a max pooling operation is performed after every two convolutional layers. The pooling kernel size is 2 × 2 with stride 2. As shown in fig. 4, the max pooling layer preserves the main features in each region while reducing the image size.
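The 2 × 2, stride-2 max pooling of step 23) can be sketched in a few lines of numpy (single-channel case for clarity):

```python
import numpy as np

def max_pool_2x2(fmap):
    """2 x 2 max pooling with stride 2, as used in the encoder network.

    fmap: 2-D feature map with even height and width.
    Returns a map of half the height and width, keeping each block's max.
    """
    h, w = fmap.shape
    assert h % 2 == 0 and w % 2 == 0
    # Group each non-overlapping 2x2 block, then take the block maximum.
    return fmap.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))
```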
(3) Feature point extraction. The feature point detection network takes the encoder output tensor as input; it first performs two convolution operations through convolution layers to obtain the score of each pixel as a feature point; the score of each pixel is then mapped by the Softmax function to the probability in [0,1] that the corresponding pixel of the infrared edge image is a feature point; finally, the original size is restored through upsampling. The process comprises the following steps:
31) The tensor output by the encoder network is first convolved twice, with kernel sizes 3 × 3 and 1 × 1 in sequence and stride 1; the output after the convolution operations has 65 channels, of which 64 channels correspond to the non-overlapping local 8 × 8 pixel grid regions in the image, plus 1 channel corresponding to no feature point being detected in an 8 × 8 region. The 1 no-feature-point channel is then removed, leaving 64 channels.
32) The Softmax function maps the score of each pixel to [0,1], obtaining the probability of each pixel being a feature point. The Softmax function is as follows:

Softmax(z_m) = e^(z_m) / Σ_{n} e^(z_n)

Among all n elements, the m-th element z_m is mapped into [0,1] by the Softmax formula.
33) The smaller feature map is enlarged by sub-pixel convolution: first, one pixel is taken at the same position of each of the 64 feature maps and the 64 values are spliced into an 8 × 8 patch; the pixels at the other positions of the feature maps are processed in the same way; finally, the feature map is enlarged to 8 times its original size, and a result map whose size is consistent with the initial infrared edge image is output.
(4) Descriptor calculation. The descriptor decoder network computes a coarse descriptor tensor from the encoder output and upsamples it to the original resolution; the output of the network is a fixed-length descriptor normalized by the L2 norm. The process is as follows:
41) The tensor output by the encoder network is first convolved twice, with kernel sizes 3 × 3 and 1 × 1 in sequence and stride 1, yielding a coarse descriptor tensor.
42) The descriptors corresponding to the feature points are extracted: the image size is normalized and the feature points are moved to the corresponding positions of the normalized image. Then K groups of 1 × 1 × 2 tensors are constructed from the normalized feature points, where K denotes the number of feature points and 2 denotes the horizontal and vertical coordinates of each feature point. The feature point positions are inverse-normalized, and the descriptors at the positions of the corresponding key points are obtained through bilinear interpolation. Finally, descriptors of uniform length are obtained through L2 norm normalization.
(5) The visible light edge image is processed in the same way to obtain the feature points and descriptors of the visible light edge image.
(6) Loss function. The final loss function is the sum of two intermediate losses, one for the feature point detector and the other for the descriptor.
In the network training process, given an image I, the positions of its feature points are known. An image I′ is generated by applying a random homography matrix to I. With the image pair thus generated, both losses are optimized simultaneously and balanced with a weight λ.
61) The feature point detector loss is a fully convolutional cross-entropy loss over the pixel cells; the set of correct feature points is denoted y_hw ∈ Y, and the image I′ generated by the homography matrix likewise has labels Y′;
Wherein:
62) The correspondence between the (h, w) cells of image I and the (h′, w′) cells of image I′ induced by the homography matrix is: s_hwh′w′ = 1 if ‖Ĥp_hw − p_h′w′‖ ≤ 8, and 0 otherwise;
where H denotes the homography transformation matrix; p_hw denotes the position of the center pixel in the (h, w) cell; and Ĥp_hw denotes the cell position p_hw multiplied by the homography transformation matrix H and then divided by the last coordinate.
The descriptor loss function is:
L_d(D, D′; S) = (1 / (H_c W_c)^2) Σ_{h,w} Σ_{h′,w′} l_d(d_hw, d′_h′w′; s_hwh′w′);
S represents the entire set of correspondences for a pair of images. A hinge loss l_d with positive margin m_p and negative margin m_n is used, with λ_d balancing the two terms:
l_d(d_hw, d′_h′w′; s_hwh′w′) = λ_d · s_hwh′w′ · max(0, m_p − d_hw^T d′_h′w′) + (1 − s_hwh′w′) · max(0, d_hw^T d′_h′w′ − m_n);
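The hinge loss above can be sketched directly. The default values m_p = 1, m_n = 0.2 and λ_d = 250 are the ones commonly used with SuperPoint; they are not stated in this patent and are assumptions here:

```python
import numpy as np

def descriptor_hinge_loss(d, d_prime, s, m_p=1.0, m_n=0.2, lambda_d=250.0):
    """Hinge loss between descriptors d and d' of two cells, with
    correspondence indicator s (1 if the cells match under the
    homography, 0 otherwise)."""
    dot = float(np.dot(d, d_prime))
    # Matched pairs are pushed above the positive margin m_p,
    # unmatched pairs below the negative margin m_n.
    positive = lambda_d * s * max(0.0, m_p - dot)
    negative = (1 - s) * max(0.0, dot - m_n)
    return positive + negative
```

λ_d up-weights the (rare) matched pairs so they are not swamped by the many unmatched ones.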
Step S4, a SuperGlue feature matching network is used to achieve correct matching of the feature points between the infrared and visible light edge images while eliminating unmatchable feature points.
In this embodiment, preferably, the structure of the SuperGlue network is shown in fig. 5; the detailed steps are as follows:
(1) Feature encoding. After SuperPoint has detected the feature point positions p and computed the descriptors d of the infrared edge image and visible light edge image, M and N feature points are extracted from the two images respectively. For the infrared edge image, the detected feature point positions and descriptors are combined into a feature matching vector, whose initial representation is:
x_i^{(0)} = d_i + MLP_enc(p_i);
where MLP_enc is a multilayer perceptron encoder that embeds the feature point positions into a high-dimensional vector; the lifted vector is then added to the descriptor. The feature point positions and descriptors of the visible light edge image are processed in the same way.
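A toy sketch of this keypoint encoder: a small two-layer MLP lifts the 2-D position to the descriptor dimension and the result is added to the visual descriptor. The two-layer shape and ReLU activation are assumptions; the text only says "multilayer perceptron encoder":

```python
import numpy as np

def encode_keypoints(positions, descriptors, w1, b1, w2, b2):
    """positions: (K, 2) keypoint coordinates; descriptors: (K, D).
    A tiny 2-layer MLP (ReLU) embeds each position into D dimensions,
    and the embedding is added to the descriptor."""
    h = np.maximum(positions @ w1 + b1, 0.0)   # (K, hidden)
    pos_embed = h @ w2 + b2                    # (K, D)
    return descriptors + pos_embed
```

This is the additive combination the text describes: descriptor plus position embedding, producing the initial feature matching vector for each point.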
(2) Multi-layer graph neural network. A graph neural network with 7 layers is used; each layer aggregates over the undirected edges ε_self or ε_cross for every feature point i and computes the updated vector representation. ε_self connects a feature point to all other feature points in the same image, and ε_cross connects it to all feature points in the other image. x_i^{(l)} denotes the intermediate representation of feature point i of the infrared edge image at layer l, and m_{ε→i} is the result of aggregating over all feature points {j : (i, j) ∈ ε}, ε ∈ {ε_self, ε_cross}. The vector update is computed as:
x_i^{(l+1)} = x_i^{(l)} + MLP([x_i^{(l)} ‖ m_{ε→i}]);
where [· ‖ ·] denotes concatenation. Layers are counted from 1, with ε = ε_self when l is odd and ε = ε_cross when l is even. By alternately aggregating and updating along the intra-image edges ε_self and the inter-image edges ε_cross, the network imitates the way humans judge feature matches by looking back and forth between the images. After each feature point i of the infrared edge image has passed through the L = 7-layer graph neural network, a feature matching vector is obtained:
f_i^A = W · x_i^{(L)} + b;
where W and b are weights and biases. The feature matching vectors f_j^B of the visible light edge image are obtained correspondingly.
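A deliberately simplified sketch of one alternating message-passing layer: odd layers aggregate over self edges (the point's own image), even layers over cross edges (the other image). SuperGlue uses attention for the aggregation; a plain mean is substituted here to keep the sketch short, so this is an illustration of the alternation, not the actual network:

```python
import numpy as np

def message_pass(x_a, x_b, layer_idx, w, b):
    """One layer of the alternating graph network for image A's points.
    x_a: (M, D) vectors of image A; x_b: (N, D) vectors of image B.
    Odd layer_idx -> self edges (aggregate over x_a);
    even layer_idx -> cross edges (aggregate over x_b)."""
    source = x_a if layer_idx % 2 == 1 else x_b
    # Attention-free mean aggregation (simplification of SuperGlue's attention).
    m = source.mean(axis=0, keepdims=True).repeat(len(x_a), axis=0)
    # Residual update: x_i + MLP([x_i || m]); one linear layer stands in
    # for the MLP here.
    update = np.concatenate([x_a, m], axis=1) @ w + b
    return x_a + update
```

Running 7 such layers with alternating edge sets reproduces the back-and-forth aggregation pattern the text describes.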
(3) Optimal matching network. The feature point correspondences of the two images must obey the following physical constraints: 1) a feature point has at most one correspondence in the other image; 2) some feature points will not match due to occlusion and other factors. The matching of feature points between the images is therefore a partial assignment between the two feature point sets. For the M and N feature points of the infrared and visible light edge images, a partial assignment matrix P ∈ [0,1]^{M×N} is established, representing all possible matches between the feature points of the two images; each possible correspondence has a confidence value representing the likelihood that it is a correct match, constrained as follows:
P · 1_N ≤ 1_M and P^T · 1_M ≤ 1_N.
First, the inner product of the feature matching vectors f_i^A and f_j^B is computed to obtain a feature matching score matrix S_{i,j}; the feature assignment matrix P is obtained by maximizing the total score Σ_{i,j} S_{i,j} P_{i,j} under the constraints. This can be regarded as an optimal transport problem, and the final feature assignment matrix P is solved iteratively by the Sinkhorn algorithm.
Since there may be unmatched points, the score matrix S is expanded by one channel into an (M+1) × (N+1) matrix S̄, whose redundant channel is used to filter out unmatched feature points. The constraints then become:
After 100 iterations of the Sinkhorn algorithm, the extra channel is removed and the feature assignment matrix is restored to P_{M×N}.
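The Sinkhorn step with the extra "dustbin" row and column can be sketched in log space as follows. In SuperGlue the dustbin score α is a learned scalar; a constant is assumed here, and the marginals give each real point unit mass while the dustbins absorb the remainder:

```python
import numpy as np

def logsumexp(a, axis):
    amax = a.max(axis=axis, keepdims=True)
    return np.squeeze(amax, axis=axis) + np.log(np.exp(a - amax).sum(axis=axis))

def sinkhorn_with_dustbin(scores, alpha=1.0, iters=100):
    """Solve the optimal-transport assignment for an (M, N) score matrix
    padded with a dustbin row/column (score alpha) so unmatched points
    can be assigned there; the dustbin is dropped before returning."""
    m, n = scores.shape
    s = np.full((m + 1, n + 1), alpha)
    s[:m, :n] = scores
    # Marginals: each real point carries mass 1; dustbins take the rest.
    log_mu = np.log(np.concatenate([np.ones(m), [n]]))
    log_nu = np.log(np.concatenate([np.ones(n), [m]]))
    u, v = np.zeros(m + 1), np.zeros(n + 1)
    for _ in range(iters):  # alternate row/column normalization in log space
        u = log_mu - logsumexp(s + v[None, :], axis=1)
        v = log_nu - logsumexp(s + u[:, None], axis=0)
    p = np.exp(s + u[:, None] + v[None, :])
    return p[:m, :n]
```

After the iterations, dropping the dustbin row and column recovers the (M, N) partial assignment matrix P, as in the text.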
(4) Loss function. Network training uses supervised learning: given ground-truth feature point matches and the sets of unmatchable feature points, the negative log-likelihood of the partial assignment matrix is minimized, which simultaneously maximizes the precision and recall of the matching. The loss function is defined as:
Step S5, affine transformation model parameters are calculated from the feature point matching result to realize image registration. The detailed steps are as follows:
(1) The affine transformation model can be expressed as:
x′ = a1·x + a2·y + a3, y′ = a4·x + a5·y + a6;
where (x, y) and (x′, y′) are the coordinates of corresponding points in the two images; the 6 parameters of the affine transformation model can be determined from 3 pairs of correct matching points;
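With at least 3 non-collinear matched pairs, the 6 affine parameters can be recovered by linear least squares; a minimal sketch (NumPy, names are illustrative):

```python
import numpy as np

def estimate_affine(src, dst):
    """Solve x' = a1*x + a2*y + a3, y' = a4*x + a5*y + a6 from K >= 3
    matched point pairs by linear least squares."""
    src = np.asarray(src, float)   # (K, 2) points in the first image
    dst = np.asarray(dst, float)   # (K, 2) corresponding points
    A = np.hstack([src, np.ones((len(src), 1))])      # (K, 3): [x, y, 1]
    params, *_ = np.linalg.lstsq(A, dst, rcond=None)  # (3, 2)
    return params.T                # rows: [a1 a2 a3], [a4 a5 a6]
```

Using all correct matches from SuperGlue (rather than exactly 3) makes the least-squares estimate more robust to localization noise.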
(2) First, a zero matrix the same size as the infrared image is established, and each point in the matrix is coordinate-transformed to obtain its corresponding point on the visible light image; the pixel value of that point is then obtained by bilinear interpolation and used as the pixel value of the corresponding point in the registered image. As shown in fig. 6, the affine transformation model parameters are calculated from the matched feature point pairs, and spatial coordinate transformation with bilinear interpolation is applied to the image to be registered to obtain the final registered image.
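The warping step above (zero matrix, coordinate transform, bilinear sampling) can be sketched as follows; out-of-range coordinates simply stay zero, matching the zero-matrix initialization:

```python
import numpy as np

def warp_affine_bilinear(img, M, out_shape):
    """For every pixel of the output (infrared-sized) grid, map its
    coordinates through the 2x3 affine model M into the source image
    and sample a value there by bilinear interpolation."""
    h_out, w_out = out_shape
    h, w = img.shape
    out = np.zeros(out_shape)
    ys, xs = np.mgrid[0:h_out, 0:w_out]
    xm = M[0, 0] * xs + M[0, 1] * ys + M[0, 2]
    ym = M[1, 0] * xs + M[1, 1] * ys + M[1, 2]
    valid = (xm >= 0) & (xm <= w - 1) & (ym >= 0) & (ym <= h - 1)
    x0 = np.clip(np.floor(xm).astype(int), 0, w - 2)
    y0 = np.clip(np.floor(ym).astype(int), 0, h - 2)
    ax, ay = xm - x0, ym - y0
    # Bilinear blend of the four neighbouring source pixels.
    val = ((1 - ay) * (1 - ax) * img[y0, x0]
           + (1 - ay) * ax * img[y0, x0 + 1]
           + ay * (1 - ax) * img[y0 + 1, x0]
           + ay * ax * img[y0 + 1, x0 + 1])
    out[valid] = val[valid]
    return out
```

Iterating over the output grid and mapping backwards into the source image (inverse mapping) avoids holes in the registered result.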
The above description is only a preferred embodiment of the present invention, and all equivalent changes and modifications made in accordance with the claims of the present invention should be covered by the present invention.
Claims (10)
1. An infrared and visible light image registration method under a power inspection scene is characterized by comprising the following steps:
step S1, acquiring an infrared image and a visible light image of the power equipment;
step S2, extracting edge information from the infrared and visible light images of the power equipment respectively through a Sobel edge detection operator to obtain the infrared and visible light edge images;
step S3, respectively detecting the characteristic points of the two edge images by using a SuperPoint characteristic extraction network and calculating a descriptor;
step S4, matching the feature points through a SuperGlue feature matching network according to the feature points of the two edge images obtained in step S3, screening to obtain correct feature point matching pairs, and simultaneously removing unmatchable feature points;
and step S5, calculating affine transformation model parameters according to the matched feature points, and performing space coordinate transformation on the image to be registered through bilinear interpolation to realize image registration.
2. The method for registering the infrared and visible light images in the power inspection scene according to claim 1, wherein the step S2 specifically includes:
step S21, carrying out graying processing on the infrared image A and the visible light image B of the power equipment through a grayscale conversion function, and converting the color image into a grayscale image;
step S22, extracting the infrared edge image and the visible light edge image of the power equipment through a Sobel edge detection operator: let Sobel_x and Sobel_y be the horizontal and vertical convolution factors respectively; convolving these factors with the image yields the horizontal and vertical edge detection result images G_x = Sobel_x * A and G_y = Sobel_y * A; the two are combined by G = sqrt(G_x^2 + G_y^2) to obtain the final edge image.
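A sketch of the Sobel step in claim 2 (the claim does not state the border mode or the exact combination formula; replicate padding and the gradient magnitude sqrt(Gx^2 + Gy^2) are assumed here):

```python
import numpy as np

def sobel_edges(gray):
    """Apply the horizontal and vertical Sobel factors to a grayscale
    image and combine the two responses into an edge magnitude map."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], float)  # horizontal factor
    ky = kx.T                                                   # vertical factor
    padded = np.pad(gray.astype(float), 1, mode='edge')         # replicate border
    h, w = gray.shape
    gx = np.zeros((h, w)); gy = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            win = padded[i:i + 3, j:j + 3]
            gx[i, j] = (win * kx).sum()
            gy[i, j] = (win * ky).sum()
    return np.sqrt(gx ** 2 + gy ** 2)
```

The loop form is for clarity only; in practice a library convolution (e.g. an FFT or im2col implementation) would be used for full-size inspection images.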
4. the method for registering the infrared and visible light images in the power inspection scene according to claim 1, wherein the step S3 specifically includes:
step S31, the acquired infrared edge image has size H × W; the infrared edge image is processed by an encoder network, and the spatial dimension of the processed edge image becomes Hc × Wc, where Hc < H and Wc < W;
step S32, the feature point extraction network takes the encoder output tensor as input; two convolution operations are first performed through convolutional layers to obtain a score for each pixel point as a feature point; the score of each pixel is then mapped to [0,1] by the Softmax function, giving the probability that the corresponding pixel point of the infrared edge image is a feature point; finally, the original size is restored through upsampling;
step S33, the descriptor decoder network transforms the encoder output tensor; the output of the network is a fixed-length descriptor normalized by the L2 norm;
5. The method for registering the infrared and visible light images in the power inspection scene according to claim 4, wherein the encoder network structural unit is composed of a convolutional layer Conv, a nonlinear activation function Relu and a pooling layer Pool, and specifically comprises the following steps:
a. Convolutional layer: the convolutional layer first pads the boundary of the input image, then performs convolution operations on the input image with convolution kernels, extracting image features and outputting a feature map;
b. nonlinear activation function: after each convolutional layer, there is a ReLU nonlinear activation function, which increases the nonlinearity of the neural network.
c. A pooling layer: the pooling layer down-samples the feature map obtained by the convolutional layer, reduces the feature map size output by the convolutional layer, and reduces the amount of computation of the network.
6. The method for registering the infrared and visible light images in the power inspection scene according to claim 4, wherein the step S33 specifically comprises:
a. The tensor output by the encoder network first undergoes two convolutions, with kernel sizes 3 × 3 and 1 × 1 in turn and stride 1; of the 65 output channels, 64 correspond to non-overlapping local 8 × 8 pixel grid regions of the image, plus 1 channel corresponding to no feature point being detected in the 8 × 8 region; the 1 channel without feature points is then removed;
b. The score of each pixel is mapped to [0,1] by the Softmax function, giving the probability that each pixel point is a feature point;
c. The smaller feature map is enlarged by sub-pixel convolution: one pixel point is taken at the same position in each of the 64 feature maps and the values are spliced into an 8 × 8 block; the same processing is then applied to the pixel points at every other position of the feature map; finally, the feature map is enlarged to 8 times its original size, and a result map the same size as the initial infrared edge image is output.
7. The method for registering the infrared and visible light images in the power inspection scene according to claim 4, wherein the step S34 specifically comprises:
a. The tensor output by the encoder network first undergoes two convolutions, with kernel sizes 3 × 3 and 1 × 1 in turn and stride 1;
b. Descriptors corresponding to the feature points are extracted: the image size is first normalized and the feature points are moved to the corresponding positions of the normalized image; K groups of 1 × 1 × 2 tensors are then constructed from the normalized feature points, where K is the number of feature points and the 2 entries are the horizontal and vertical coordinates of each point; the feature point positions are inverse-normalized, and the descriptors are sampled at the corresponding key point positions by bilinear interpolation; finally, L2-norm normalization yields descriptors of uniform length.
8. The method for registering the infrared and visible light images in the power inspection scene according to claim 1, wherein the step S4 specifically includes:
step S41, after SuperPoint has detected the feature point positions p and computed the descriptors d of the infrared edge image and visible light edge image, M and N feature points are extracted from the two images respectively; for the infrared edge image, the detected feature point positions and descriptors are combined into a feature matching vector, whose initial representation is x_i^{(0)} = d_i + MLP_enc(p_i);
where MLP_enc is a multilayer perceptron encoder that embeds the feature point positions into a high-dimensional vector, which is then added to the descriptor; the feature point positions and descriptors of the visible light edge image are processed in the same way;
step S42, using a multi-layer graph neural network, each layer aggregates over the undirected edges ε_self or ε_cross for every feature point i and computes the updated vector representation; ε_self connects a feature point to all other feature points in the same image, and ε_cross connects it to all feature points in the other image; x_i^{(l)} denotes the intermediate representation of feature point i of the infrared edge image at layer l, and m_{ε→i} is the result of aggregating over all feature points {j : (i, j) ∈ ε}, ε ∈ {ε_self, ε_cross};
The vector calculation is updated as:
where [· ‖ ·] denotes concatenation; layers are counted from 1, with ε = ε_self when l is odd and ε = ε_cross when l is even; by alternately aggregating and updating along the intra-image edges ε_self and the inter-image edges ε_cross, the network imitates the way humans judge feature matches by looking back and forth between the images; after each feature point i of the infrared edge image has passed through the L-layer graph neural network, a feature matching vector f_i^A is obtained;
where W and b are weights and biases; the feature matching vectors f_j^B of the visible light edge image are obtained correspondingly;
step S43, the feature point correspondences of the two images must comply with preset physical constraints: 1) a feature point has at most one correspondence in the other image; 2) some feature points cannot be matched due to occlusion and other factors; the matching of feature points between the images is therefore a partial assignment between the two feature point sets; for the M and N feature points of the infrared and visible light edge images, a partial assignment matrix P ∈ [0,1]^{M×N} is established, representing all possible matches between the feature points of the two images; each possible correspondence has a confidence value representing the likelihood that it is a correct match, constrained as follows:
P · 1_N ≤ 1_M and P^T · 1_M ≤ 1_N;
first, the inner product of the feature matching vectors f_i^A and f_j^B is computed to obtain a feature matching score matrix S_{i,j}; the feature assignment matrix P is obtained by maximizing the total score Σ_{i,j} S_{i,j} P_{i,j} under the constraints; this is regarded as an optimal transport problem, and the final feature assignment matrix P is solved iteratively by the Sinkhorn algorithm;
the score matrix S is expanded by one channel into an (M+1) × (N+1) matrix S̄, whose redundant channel is used to filter out unmatched feature points;
the constraint becomes:
9. The method for registering the infrared and visible light images in the power inspection scene according to claim 1, wherein the step S5 specifically includes:
step S51, calculating affine transformation model parameters according to the feature point matching result;
step S52, firstly, establishing a zero matrix with the size consistent with that of the infrared image, and obtaining a corresponding point of each point on the visible light image by carrying out coordinate transformation on each point in the matrix; then, obtaining the pixel value of the point through a bilinear interpolation method, and taking the pixel value as the pixel value of the corresponding point on the image after registration;
and step S53, calculating affine transformation model parameters according to the matched feature points, and performing space coordinate transformation on the image to be registered through bilinear interpolation to obtain a final registered image.
10. The infrared and visible light image registration method in the power inspection scene according to claim 9, wherein the affine transformation model is expressed as:
wherein (x, y) and (x ', y') are coordinates of corresponding points in the two images respectively, and parameters in the affine transformation model can be determined through correct matching points.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110892727.5A CN113628261B (en) | 2021-08-04 | 2021-08-04 | Infrared and visible light image registration method in electric power inspection scene |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113628261A true CN113628261A (en) | 2021-11-09 |
CN113628261B CN113628261B (en) | 2023-09-22 |
Family
ID=78382773
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110892727.5A Active CN113628261B (en) | 2021-08-04 | 2021-08-04 | Infrared and visible light image registration method in electric power inspection scene |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113628261B (en) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114066954A (en) * | 2021-11-23 | 2022-02-18 | 广东工业大学 | Feature extraction and registration method for multi-modal images |
CN114255197A (en) * | 2021-12-27 | 2022-03-29 | 西安交通大学 | Infrared and visible light image self-adaptive fusion alignment method and system |
CN114565781A (en) * | 2022-02-25 | 2022-05-31 | 中国人民解放军战略支援部队信息工程大学 | Image matching method based on rotation invariance |
CN114820733A (en) * | 2022-04-21 | 2022-07-29 | 北京航空航天大学 | Interpretable thermal infrared visible light image registration method and system |
CN116797660A (en) * | 2023-07-04 | 2023-09-22 | 广东工业大学 | Unmanned aerial vehicle all-weather geographic positioning method and system without GNSS work |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106257535A (en) * | 2016-08-11 | 2016-12-28 | 河海大学常州校区 | Electrical equipment based on SURF operator is infrared and visible light image registration method |
CN110263868A (en) * | 2019-06-24 | 2019-09-20 | 北京航空航天大学 | Image classification network based on SuperPoint feature |
CN110428008A (en) * | 2019-08-02 | 2019-11-08 | 深圳市唯特视科技有限公司 | A kind of target detection and identification device and method based on more merge sensors |
CN111369605A (en) * | 2020-02-27 | 2020-07-03 | 河海大学 | Infrared and visible light image registration method and system based on edge features |
CN111583315A (en) * | 2020-04-23 | 2020-08-25 | 武汉卓目科技有限公司 | Novel visible light image and infrared image registration method and device |
CN112396629A (en) * | 2019-08-14 | 2021-02-23 | 河海大学 | River course inspection tracking method based on infrared and visible light cooperation |
WO2021098080A1 (en) * | 2019-11-22 | 2021-05-27 | 大连理工大学 | Multi-spectral camera extrinsic parameter self-calibration algorithm based on edge features |
Non-Patent Citations (1)
Title |
---|
Wang Peng; Jin Lizuo: "Infrared and visible light image registration algorithm based on Canny edge SURF features", Industrial Control Computer, no. 04 *
Also Published As
Publication number | Publication date |
---|---|
CN113628261B (en) | 2023-09-22 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||