Background
Image local feature description and matching is one of the basic research problems in image processing and computer vision, and is widely applied in scenarios such as three-dimensional reconstruction, wide-baseline matching, panoramic stitching, and image retrieval. In recent years, researchers have proposed many methods for describing image features, which fall into two main categories: hand-crafted descriptors and learning-based descriptors. Most of these methods describe a local region as a distinctive feature descriptor, and the most classical hand-crafted method is the SIFT descriptor. It is generally believed that a well-performing feature descriptor should remain invariant for matching blocks under changes in illumination, blur, deformation, and the like, while being strongly discriminative against non-matching blocks.
In recent years, driven by the rapid development of hand-crafted descriptors, the revolutionary changes that deep learning has brought to various fields, and the large-scale point matching data sets made available in the literature, a series of learning-based point feature descriptors have been proposed. Zagoruyko et al. proposed several block matching neural network models, including twin networks and dual-channel networks, and compared their performance. In 2017, Tian et al. proposed L2-Net, a CNN model with a fully convolutional structure; L2-Net is trained with a progressive sampling strategy and a loss function composed of three error terms, optimizing the relative distances between descriptors in a batch, and the descriptors it outputs are matched by L2 distance. Inspired by Lowe's SIFT matching criterion, Anastasiya Mishchuk et al. proposed HardNet, a compact descriptor that applies a triplet loss to the L2-Net architecture. However, straight line features are also among the most important image features and are indispensable in many application scenarios. For example, in some low-texture scenes, local point features and region features alone are insufficient, whereas line features carry more information. Unfortunately, line feature descriptors in the literature have developed slowly compared with point feature descriptors and remain at the hand-crafted stage.
The main reasons are as follows: deep fully convolutional neural networks depend on a large number of labeled training samples, and constructing such a set requires considerable manpower and financial resources, while too few labeled training samples lead to overfitting; in addition, the end points of a straight line are uncertain, and the local neighborhood of a straight line often lacks rich texture.
Disclosure of Invention
To address the above problems and make the line descriptor more stable and robust under a wide range of image variations, the object of the present invention is to provide a learning-based line descriptor with greater stability and discriminative power. To achieve this object, a line feature description method based on a pseudo-twin network comprises the following steps:
Step S1: constructing a data set for training the line feature description network;
Step S11: acquiring image pairs of the same scene under different transformations;
Step S12: detecting the straight lines in the images;
Step S13: obtaining correctly matched straight line pairs;
Step S14: determining the image block corresponding to each straight line;
Step S2: constructing a fully convolutional pseudo-twin network for line feature description;
Step S3: training the network with the line matching data set;
Step S31: acquiring a training subset;
Step S32: calculating the network output feature vectors;
Step S33: adjusting the network parameters;
Step S4: updating the network parameter values;
Step S5: iterating the parameter update until the specified number of iterations is reached;
Step S6: acquiring the descriptor of an input straight line.
To address the above problems, the invention provides a line feature description method based on a pseudo-twin network. A data set for training the line feature description network is first constructed; then, following a transfer learning strategy, the parameters of the constructed fully convolutional pseudo-twin network are initialized with the parameters of an L2-Net model pre-trained on the large data set Liberty, so that line feature descriptors with stronger discriminative power and robustness are obtained on the line matching data. The method thereby mitigates the shortage of labeled training samples described above and offers greater stability and better performance.
Detailed Description
Fig. 1 is a flow chart of the line feature description method based on a pseudo-twin network. The method mainly comprises the following steps: acquiring image pairs of the same scene under different transformations, detecting straight lines in the images, obtaining correctly matched straight line pairs, determining the image block corresponding to each straight line, constructing a fully convolutional pseudo-twin network for line feature description, acquiring a training subset, calculating the network output feature vectors, adjusting the network parameters, updating the network parameter values, iterating the parameter update a specified number of times, and acquiring the descriptor of an input straight line. The specific implementation details of each step are as follows:
Step S1: constructing a data set for training the line feature description network, which specifically comprises steps S11, S12, S13 and S14;
Step S11: capturing images of different scenes from different viewing angles and rotation angles, and applying compression, illumination, noise and other transformations to the images, to form image pairs of the same scene under different transformations;
Step S12: extracting the straight lines in each image with an existing straight line detection method, for example one based on the Canny edge detection operator;
Step S13: obtaining correctly matched straight line pairs. Specifically, for any image pair, straight line matching is performed with the mean-standard deviation line descriptor described in "MSLD: A robust descriptor for line matching", Pattern Recognition, 2009, 42(5), to obtain the matching straight line pairs in the image pair; wrong matches are then eliminated manually, giving the set of correctly matched straight line pairs $\{(L_j, L'_j),\ j = 1, 2, \dots, N_L\}$ in the image pair, where $L_j$ denotes a straight line in the first image of the pair, $L'_j$ denotes the straight line in the second image of the pair that correctly matches $L_j$, and $N_L$ is the number of matched straight line pairs;
Step S14: determining the image block corresponding to each straight line. Specifically, for any straight line $L$ consisting of $Num(L)$ points, the pixel points on $L$ are denoted $P_k$, $k = 1, 2, \dots, Num(L)$. The square region centered at $P_k$, with side length 64 along the direction of $L$ and along the direction perpendicular to $L$, is defined as the support region of the point $P_k$, and the matrix of luminance values of this support region is denoted $I(P_k)$. The mean matrix $M(L) = Mean(I(P_1), I(P_2), \dots, I(P_{Num(L)}))$ and the standard deviation matrix $STD(L) = Std(I(P_1), I(P_2), \dots, I(P_{Num(L)}))$ are calculated, where $Mean$ denotes the mean of the matrices and $Std$ denotes the standard deviation of the matrices. The $64 \times 128$ normalized matrix corresponding to the straight line $L$ is written as $A_L$, where $A_L(:, 1{:}64) = M(L)$ and $A_L(:, 65{:}128) = STD(L)$;
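The construction in step S14 can be illustrated with a minimal NumPy sketch. It is an illustration under stated assumptions, not the patented implementation: the function names `support_region` and `normalized_matrix` are hypothetical, points are given as (x, y) pixel coordinates, columns of each patch run along the line and rows across it, and pixels are read by nearest-neighbour lookup with border clamping, none of which is specified in the text.

```python
import numpy as np

def support_region(image, point, direction, size=64):
    """Sample a size x size patch centred at `point` and aligned with the line.

    Assumptions: columns follow the line direction, rows follow its normal,
    and pixels are read by nearest-neighbour lookup with border clamping.
    """
    d = np.asarray(direction, dtype=float)
    d = d / np.linalg.norm(d)
    normal = np.array([-d[1], d[0]])          # unit vector perpendicular to the line
    offsets = np.arange(size) - size // 2     # -32 .. 31 when size is 64
    u, v = np.meshgrid(offsets, offsets)      # u: along the line, v: across it
    xs = point[0] + u * d[0] + v * normal[0]
    ys = point[1] + u * d[1] + v * normal[1]
    xs = np.clip(np.rint(xs).astype(int), 0, image.shape[1] - 1)
    ys = np.clip(np.rint(ys).astype(int), 0, image.shape[0] - 1)
    return image[ys, xs].astype(float)

def normalized_matrix(image, points, direction, size=64):
    """Build A_L = [M(L) | STD(L)] with shape (size, 2 * size)."""
    patches = np.stack([support_region(image, p, direction, size) for p in points])
    mean = patches.mean(axis=0)               # M(L): element-wise mean over all points
    std = patches.std(axis=0)                 # STD(L): element-wise standard deviation
    return np.hstack([mean, std])
```

For a horizontal line, `normalized_matrix(image, [(x, 100) for x in range(60, 140)], (1.0, 0.0))` returns a 64×128 array whose left half is M(L) and whose right half is STD(L).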
Step S2: constructing a fully convolutional pseudo-twin network for line feature description. Specifically, a fully convolutional neural network with two branches is built, in which each branch is an independent L2-Net whose last-layer convolution kernel size is changed from 8×8 to 8×16 and whose number of last-layer convolution kernels is changed from 128 to 256, all other settings being the same as in L2-Net. The features at the ends of the two branches are concatenated to obtain the fully convolutional pseudo-twin network for line feature description, denoted CS PSLTL-Net. The first six layers of both branches are initialized with the parameters of an L2-Net model pre-trained on the data set Liberty, and the parameters of the last layer of both branches use the default initialization of PyTorch;
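The change of the last kernel from 8×8 to 8×16 follows from the 32×64 branch input used in step S32, as the following shape trace suggests. This is a sketch in plain Python, not the network itself. The layer table assumes the commonly published L2-Net layout (six 3×3 convolutions with padding 1, the third and fifth with stride 2); that layout is not stated in this text, and the names `conv_out`, `L2NET_FIRST_SIX`, `BRANCH_LAST` and `trace_branch` are hypothetical.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output length of a convolution along one axis."""
    return (size + 2 * padding - kernel) // stride + 1

# Assumed L2-Net layout: (out_channels, kernel_h, kernel_w, stride, padding).
L2NET_FIRST_SIX = [
    (32, 3, 3, 1, 1),
    (32, 3, 3, 1, 1),
    (64, 3, 3, 2, 1),
    (64, 3, 3, 1, 1),
    (128, 3, 3, 2, 1),
    (128, 3, 3, 1, 1),
]
# Last layer as modified in step S2: 256 kernels of size 8x16 instead of 128 of size 8x8.
BRANCH_LAST = (256, 8, 16, 1, 0)

def trace_branch(height=32, width=64):
    """Return the (channels, height, width) shape after every layer of one branch."""
    shapes = []
    for channels, kh, kw, stride, pad in L2NET_FIRST_SIX + [BRANCH_LAST]:
        height = conv_out(height, kh, stride, pad)
        width = conv_out(width, kw, stride, pad)
        shapes.append((channels, height, width))
    return shapes

shapes = trace_branch()
# Two branches are concatenated, so the descriptor length is twice the branch output.
descriptor_length = 2 * shapes[-1][0]
```

Under these assumptions the feature map entering the last layer is 128×8×16, so an 8×16 kernel reduces it to a single 256-dimensional vector per branch, and concatenating the two branches gives a 512-dimensional descriptor.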
Step S3: training the network CS PSLTL-Net with the line matching data set, which specifically comprises steps S31, S32 and S33:
Step S31: acquiring a training subset. Specifically, $n$ pairs of matching straight lines are randomly selected from the line matching data set obtained in step S1, and the normalized matrices corresponding to these straight lines are combined into a training batch $\{(A_{L_j}, A_{L'_j}),\ j = 1, 2, \dots, n\}$, where $A_{L_j}$ is the normalized matrix corresponding to the straight line $L_j$, $A_{L'_j}$ is the normalized matrix corresponding to the straight line $L'_j$, and $L_j$ and $L'_j$ are a matched straight line pair;
Step S32: calculating the network output feature vectors. Specifically, for the normalized matrix $A_L$ of any straight line obtained in step S31, the mean matrix $M(L)$ and the standard deviation matrix $STD(L)$ contained in $A_L$ are each downsampled and then concatenated to obtain a $32 \times 64$ matrix, which is used as the input of the first branch of the network CS PSLTL-Net. The center regions of the mean matrix and the standard deviation matrix, $M_c(L) = M(L)(32-15{:}32+16,\ 32-15{:}32+16)$ and $STD_c(L) = STD(L)(32-15{:}32+16,\ 32-15{:}32+16)$, are extracted and concatenated to obtain a second $32 \times 64$ matrix, which is used as the input of the second branch of the network CS PSLTL-Net. The output feature vectors of the two branches are concatenated to obtain the output feature vector corresponding to the input straight line;
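The preparation of the two branch inputs in step S32 can be sketched in NumPy as follows. The text does not say how the downsampling is performed, so 2×2 average pooling is assumed here; the function names `downsample_2x` and `branch_inputs` are hypothetical. The 1-based index range 32-15:32+16 (rows and columns 17 to 48) becomes the 0-based slice 16:48.

```python
import numpy as np

def downsample_2x(matrix):
    """Halve both dimensions by 2x2 average pooling (an assumed choice)."""
    h, w = matrix.shape
    return matrix.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def branch_inputs(a_l):
    """Split a 64x128 normalized matrix A_L into the two 32x64 branch inputs."""
    mean, std = a_l[:, :64], a_l[:, 64:]
    # Branch 1: the whole support region at half resolution.
    first = np.hstack([downsample_2x(mean), downsample_2x(std)])
    # Branch 2: the central 32x32 crop, rows and columns 17..48 in 1-based indexing.
    centre = slice(16, 48)
    second = np.hstack([mean[centre, centre], std[centre, centre]])
    return first, second
```

The first branch therefore sees the full context of the line at coarse resolution, while the second sees only the area closest to the line at full resolution.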
Step S33: adjusting the network parameters. Specifically, for the normalized matrices $A_{L_i}$ and $A_{L'_i}$ of any matching straight line pair in step S31, the output feature vector $a_i$ corresponding to $A_{L_i}$ and the output feature vector $b_i$ corresponding to $A_{L'_i}$ are obtained according to step S32. A distance matrix $D$ of size $n \times n$ is calculated, whose entry in row $i$ and column $j$ is the distance $d(a_i, b_j)$, and the triplet loss function Loss is calculated from $D$, where $b_{j_{\min}}$ denotes the non-matching descriptor closest to $a_i$, with $j_{\min} = \arg\min_{j = 1 \dots n,\ j \neq i} d(a_i, b_j)$, and $a_{k_{\min}}$ denotes the non-matching descriptor closest to $b_i$, with $k_{\min} = \arg\min_{k = 1 \dots n,\ k \neq i} d(a_k, b_i)$. New network model parameters are then obtained from the loss function by stochastic gradient descent;
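The quantities named in step S33 match the hardest-in-batch triplet loss of HardNet, which the Background cites. The NumPy sketch below assumes that form: a Euclidean distance for $d$, a margin of 1, and an average over the batch. These choices, and the function names `distance_matrix` and `hardest_in_batch_triplet_loss`, are assumptions for illustration, not values taken from this text. The gradient descent update itself is not shown.

```python
import numpy as np

def distance_matrix(a, b):
    """D[i, j] = Euclidean distance between descriptors a_i and b_j (assumed metric)."""
    diff = a[:, None, :] - b[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))

def hardest_in_batch_triplet_loss(a, b, margin=1.0):
    """HardNet-style triplet loss; `margin` is an assumed hyperparameter."""
    d = distance_matrix(a, b)
    n = d.shape[0]
    positive = np.diag(d)                           # d(a_i, b_i) for matching pairs
    masked = np.where(np.eye(n, dtype=bool), np.inf, d)  # hide matching pairs
    nearest_b = masked.min(axis=1)                  # d(a_i, b_jmin)
    nearest_a = masked.min(axis=0)                  # d(a_kmin, b_i)
    hardest = np.minimum(nearest_b, nearest_a)      # closest non-matching descriptor
    return np.maximum(0.0, margin + positive - hardest).mean()
```

Row `i` of `a` and row `i` of `b` must hold the descriptors of a matched pair $(L_i, L'_i)$; every other row in the batch then serves as a candidate negative.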
Step S4: updating the parameter values of the network CS PSLTL-Net with the network parameters obtained in step S3;
Step S5: repeating steps S3 and S4 until the specified number of parameter updates is reached;
Step S6: acquiring the descriptor of an input straight line. Specifically, for any given image pair, the image block corresponding to any straight line in the images, obtained according to steps S12 and S14, is input into the fully convolutional pseudo-twin network obtained in step S5, which outputs the descriptor of that straight line.
To address the above problems, the invention provides a line feature description method based on a pseudo-twin network. A data set for training the line feature description network is first constructed; then, using deep transfer learning, the parameters of the two branches of the constructed fully convolutional pseudo-twin network are initialized with the parameters of an L2-Net model pre-trained on the large data set Liberty, so that line feature descriptors with stronger discriminative power are obtained on the line matching data. The method thereby mitigates the shortage of labeled training samples described above and offers greater stability and better performance.