WO2022142084A1 - Matching screening method and apparatus, electronic device, storage medium and computer program - Google Patents

Matching screening method and apparatus, electronic device, storage medium and computer program

Info

Publication number
WO2022142084A1
Authority
WO
WIPO (PCT)
Prior art keywords
matching
initial
module
match
local
Prior art date
Application number
PCT/CN2021/095170
Other languages
English (en)
French (fr)
Inventor
赵晨
葛艺潇
杨佳琪
朱烽
赵瑞
李鸿升
Original Assignee
上海商汤科技开发有限公司 (Shanghai SenseTime Technology Development Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 上海商汤科技开发有限公司 (Shanghai SenseTime Technology Development Co., Ltd.)
Publication of WO2022142084A1

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/23 - Clustering techniques
    • G06F18/232 - Non-hierarchical techniques
    • G06F18/2321 - Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 - Non-hierarchical techniques with fixed number of clusters, e.g. K-means clustering
    • G06F18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213 - Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G06F18/24 - Classification techniques
    • G06F18/241 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G06N3/048 - Activation functions
    • G06N3/08 - Learning methods

Definitions

  • The present application relates to the field of image processing and, in particular but not exclusively, to a matching screening method and apparatus, an electronic device, a computer-readable storage medium and a computer program product.
  • In computer vision and image processing, feature matching is one of the basic research problems.
  • The matches in the initial matching set are generally selected based on the Euclidean-distance similarity between the descriptors of matching points in the image pair; the matching points are selected from the detected feature points, and this matching method often produces a large number of false matches.
  • In the related art, a deep learning neural network model is generally trained on an initial matching set and then used to perform a corresponding image task. Since the distribution of samples in the initial matching set is often unbalanced, if the number of false matches in the initial matching set is much larger than the number of correct matches, the learning process of the neural network model is susceptible to interference from the false matches, resulting in poor performance of the neural network model on the image task.
  • Embodiments of the present application provide a matching screening method and apparatus, an electronic device, a computer-readable storage medium and a computer program product, which can improve the performance of a parametric transformation model when processing an image task.
  • the embodiment of the present application provides a matching screening method, including:
  • the initial matching set is derived from the initial matching results between the image pairs;
  • a matching subset is filtered out of the initial matching set by at least one pruning module, where the correct-matching ratio in the matching subset is higher than the correct-matching ratio in the initial matching set;
  • the at least one pruning module is used to obtain the consistency information of each initial match in the initial matching set;
  • the matching subset is used to process image tasks related to the image pair.
  • The initial matching result may be obtained by a matching algorithm that selects consistently matched points from two sets of feature points based on the ratio of the nearest-neighbour Euclidean distance to the second-nearest-neighbour Euclidean distance. Each initial match in the initial matching set may include the feature information of a pair of corresponding points in the image pair (for example, the feature information of a corresponding point may include a combination of at least one of the coordinates of the point, its pixel value, its grayscale value, and its red (R), green (G) and blue (B) values).
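The distance-ratio selection just described is essentially Lowe's ratio test. Below is a minimal brute-force NumPy sketch; the function name, the 0.8 threshold, and the exhaustive search are illustrative assumptions, not the patent's implementation:

```python
import numpy as np

def ratio_test_matches(desc1, desc2, ratio=0.8):
    """Build an initial matching set with the nearest-to-second-nearest
    distance ratio test: keep a match only if the nearest neighbour is
    clearly closer than the second-nearest (hypothetical helper)."""
    matches = []
    for i, d in enumerate(desc1):
        dists = np.linalg.norm(desc2 - d, axis=1)      # Euclidean distances
        order = np.argsort(dists)
        nearest, second = dists[order[0]], dists[order[1]]
        if nearest < ratio * second:                   # ratio test passes
            matches.append((i, int(order[0])))
    return matches
```

Each returned pair is an index into the first and second descriptor sets; these index pairs, together with the point coordinates, would form the initial matching set.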
  • The matches in the initial matching set are not necessarily all correct: there are both correct matches and false matches. The correct-matching ratio is the ratio of the number of correct matches in the initial matching set to the total number of matches in the set.
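The correct-matching ratio defined above is simple to compute when each match carries a ground-truth label (a minimal sketch; the 1 = correct, 0 = false encoding is an assumption):

```python
def correct_match_ratio(labels):
    """Ratio of correct matches to the total number of matches,
    given per-match labels: 1 for a correct match, 0 for a false one."""
    labels = list(labels)
    return sum(labels) / len(labels)
```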
  • The initial matching set can be screened so that the correct-matching ratio in the filtered matching subset is higher than that in the initial matching set. Because the matching subset is filtered out of the initial matching set and has a higher proportion of correct matches, the model parameters calculated from it are more reliable; this improves the calculation accuracy of the model parameters of the parametric transformation model and, in turn, the performance of the parametric transformation model when processing image tasks.
  • the method further includes:
  • the initial matching set is predicted by the parametric transformation model, and a prediction result for each initial match in the initial matching set is obtained; the prediction result is either correct match or false match.
  • The model parameters of the parametric transformation model in the embodiments of the present application are calculated using the matching subset, so the calculated model parameters are highly reliable and the parametric transformation model can better predict each initial match in the initial matching set; compared with a neural network model that directly predicts on the initial matching set, this improves the accuracy of the prediction results of the parametric transformation model.
  • Filtering out a matching subset from the initial matching set by the at least one pruning module includes: filtering the first matching set by a first pruning module to obtain the matching subset;
  • the first matching set is the initial matching set; or
  • the first matching set is obtained by the filtering of the pruning module preceding the first pruning module.
  • When a single pruning module is used in the embodiment of the present application, it can be applied to the case where there are few false matches in the initial matching set.
  • The at least two pruning modules in the embodiments of the present application are neural network learning modules, which can screen the initial matching set at least twice, so that the correct-matching ratio in the screened matching subset is higher; this improves the calculation accuracy of the model parameters of the parametric transformation model and makes the calculated model parameters more reliable when processing image tasks.
  • The embodiment of the present application can also be applied to the case where there are many false matches in the initial matching set. Because each pruning module learns different features during training, using at least two pruning modules achieves dynamic feature learning over at least two rounds of feature learning, which, compared with training on fixed features, increases the proportion of correct matches in the resulting matching subset.
  • Screening the first matching set by the first pruning module to obtain a matching subset includes:
  • determining, by the first pruning module, the local consistency information or the global consistency information of a first initial match, and determining, according to the local consistency information or the global consistency information of the first initial match, whether the first initial match is included in the matching subset; the first initial match is any item in the first matching set.
  • Screening the first matching set by the first pruning module to obtain a matching subset includes:
  • the first pruning module includes a first local consistency learning module, a first global consistency learning module and a first pruning submodule, and the consistency information of a match includes at least one of a local consistency score and a global consistency score;
  • determining, by the first pruning module, the local consistency information and the global consistency information of the first initial match, and determining, according to the local consistency information and the global consistency information of the first initial match, whether the first initial match is included in the matching subset, includes:
  • a first local dynamic graph for the first initial match is constructed by the first local consistency learning module, and the local consistency score of the first initial match in the first local dynamic graph is calculated; the first local dynamic graph includes the node of the first initial match and K related nodes; the K related nodes are determined for the node of the first initial match by the K-nearest-neighbour algorithm;
  • a first global dynamic graph is constructed by the first global consistency learning module, and the comprehensive consistency score of the first initial match is determined according to the local consistency score of the first initial match in the first local dynamic graph and the first global dynamic graph; the first global dynamic graph includes the nodes of all initial matches;
  • the first pruning submodule is used to determine, according to the comprehensive consistency score of the first initial match, whether the first initial match is included in the matching subset.
  • the first local consistency learning module includes a first feature dimension-raising module, a first local dynamic graph construction module, a first feature dimension-reduction module, and a first local consistency score calculation module;
  • constructing the first local dynamic graph for the first initial match by the first local consistency learning module and calculating the local consistency score of the first initial match in the first local dynamic graph includes:
  • the initial feature vector of the first initial match is raised in dimension by the first feature dimension-raising module to obtain a high-dimensional feature vector of the first initial match;
  • the first local dynamic graph construction module uses the K-nearest-neighbour algorithm to determine the K matches in the first matching set whose high-dimensional feature vectors are most correlated (by Euclidean distance) with that of the first initial match, constructs a first local dynamic graph for the first initial match based on the first initial match and the K related matches, and obtains an ultra-high-dimensional feature vector of the first initial match; the ultra-high-dimensional feature vector of the first initial match includes a combination of the high-dimensional feature vector of the first initial match and the correlation vectors between the first initial match and the K related matches;
  • the ultra-high-dimensional feature vector of the first initial match is reduced in dimension by the first feature dimension-reduction module to obtain a low-dimensional feature vector of the first initial match;
  • the local consistency score of the first initial match in the first local dynamic graph is calculated by the first local consistency score calculation module based on the low-dimensional feature vector of the first initial match.
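The local dynamic graph step above can be sketched as follows: find the K nearest matches in feature space and stack, for each related node, the match's own feature together with a correlation vector. Here the correlation vector is taken to be the feature residual (an EdgeConv-style choice); that residual form, and all names, are assumptions rather than the patent's exact construction:

```python
import numpy as np

def local_dynamic_graph_features(feats, k=2):
    """For each match feature, find its k nearest neighbours in feature
    space (the k related nodes) and stack [own feature, neighbour - own]
    pairs as the 'ultra-high-dimensional' representation (assumed form)."""
    n, d = feats.shape
    # pairwise Euclidean distances between all n match features
    dists = np.linalg.norm(feats[:, None, :] - feats[None, :, :], axis=-1)
    np.fill_diagonal(dists, np.inf)                   # a node is not its own neighbour
    knn = np.argsort(dists, axis=1, kind="stable")[:, :k]  # k related nodes per match
    out = np.empty((n, k, 2 * d))
    for i in range(n):
        for j, nb in enumerate(knn[i]):
            out[i, j] = np.concatenate([feats[i], feats[nb] - feats[i]])
    return knn, out
```

The graph is "dynamic" because it is rebuilt from whatever feature space the current pruning module has learned, rather than from fixed image coordinates.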
  • the first feature dimension reduction module includes a first annular convolution module and a second annular convolution module;
  • performing dimensionality reduction on the ultra-high-dimensional feature vector of the first initial match to obtain the low-dimensional feature vector of the first initial match includes:
  • the ultra-high-dimensional feature vector of the first initial match is grouped according to the degree of relevance by the first annular convolution module, and a first feature aggregation is performed on each group of feature vectors to obtain an initially aggregated feature vector;
  • a second feature aggregation is performed on the initially aggregated feature vector by the second annular convolution module to obtain the low-dimensional feature vector of the first initial match.
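A sketch of the two-stage annular aggregation: the neighbour features (assumed already sorted by relevance) are split into rings, pooled within each ring by the first stage, then pooled across rings by the second. The mean and max pooling operators here are illustrative assumptions; the text leaves the aggregation functions abstract:

```python
import numpy as np

def annular_aggregate(ultra, groups=2):
    """Two-stage aggregation sketch of the annular convolutions:
    split the k neighbour slots into `groups` rings, mean-pool within
    each ring (first aggregation), then max-pool across rings (second
    aggregation) to get one low-dimensional vector per match."""
    n, k, d = ultra.shape
    rings = np.array_split(np.arange(k), groups)
    pooled = np.stack([ultra[:, idx, :].mean(axis=1) for idx in rings], axis=1)
    return pooled.max(axis=1)   # shape (n, d): low-dimensional feature
```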
  • determining the comprehensive consistency score of the first initial match according to the local consistency score of the first initial match in the first local dynamic graph and the first global dynamic graph includes:
  • the global consistency score of the first initial match is determined according to the first global dynamic graph, and a comprehensive consistency score for the first initial match is determined according to the local consistency score and the global consistency score.
  • constructing the first global dynamic graph by the first global consistency learning module includes:
  • the first global dynamic graph is constructed by the first global consistency learning module according to the local consistency score, in the corresponding local dynamic graph, of each initial match in the first matching set;
  • determining the comprehensive consistency score of the first initial match according to its local consistency score in the first local dynamic graph and the first global dynamic graph includes:
  • a comprehensive consistency score of the first initial match is calculated according to the first global dynamic graph and the low-dimensional feature vector of the first initial match.
  • the first global dynamic graph is represented by an adjacency matrix
  • calculating the comprehensive consistency score of the first initial match according to the first global dynamic graph and the low-dimensional feature vector of the first initial match includes:
  • a graph convolutional network is used to calculate a comprehensive low-dimensional feature vector of the first initial match according to the adjacency matrix and the low-dimensional feature vector of the first initial match;
  • a comprehensive consistency score for the first initial match is calculated based on the comprehensive low-dimensional feature vector of the first initial match.
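The graph-convolution step can be sketched as one row-normalised propagation over the adjacency matrix of the global dynamic graph, followed by a linear scoring head. The normalisation scheme and the weight vector `w` are illustrative assumptions:

```python
import numpy as np

def gcn_consistency_scores(A, feats, w):
    """One-layer graph-convolution sketch: add self-loops to the
    adjacency matrix A, row-normalise it, propagate the low-dimensional
    features to get each match's comprehensive feature, then map it to
    a scalar comprehensive consistency score with weights w."""
    A_hat = A + np.eye(A.shape[0])                  # self-loops
    D_inv = 1.0 / A_hat.sum(axis=1, keepdims=True)  # inverse degrees
    fused = (D_inv * A_hat) @ feats                 # neighbourhood averaging
    return fused @ w                                # one score per match
```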
  • determining, by the first pruning submodule, whether the first initial match is included in the matching subset according to the comprehensive consistency score of the first initial match includes:
  • the first pruning submodule determines whether the comprehensive consistency score of the first initial match is greater than a first threshold, and if so, determines that the first initial match is included in the matching subset; or
  • the first pruning submodule ranks the comprehensive consistency scores of the first matching set in descending order, and if the rank of the first initial match is within a second threshold (i.e., the first initial match is among the top-ranked matches), determines that the first initial match is included in the matching subset.
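The two selection rules of the pruning submodule (comprehensive score above a first threshold, or descending-order rank within a second threshold) can be sketched in plain Python; parameter names are illustrative:

```python
def prune_by_score(scores, threshold=None, keep_top=None):
    """Pruning submodule sketch: keep a match if its comprehensive
    consistency score exceeds `threshold`, or keep the `keep_top`
    highest-scoring matches. Returns the kept indices in order."""
    if threshold is not None:
        return [i for i, s in enumerate(scores) if s > threshold]
    order = sorted(range(len(scores)), key=lambda i: -scores[i])  # descending rank
    return sorted(order[:keep_top])
```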
  • Before the at least one pruning module is used to filter out a matching subset from the initial matching set, the method further includes:
  • the training result is evaluated by a temperature-adaptive binary loss function, and the parameters of the pruning module are updated so as to minimize the binary loss function.
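One plausible reading of the temperature-adaptive binary loss is a binary cross-entropy whose logits are scaled by a temperature before the sigmoid; the text does not specify how the temperature adapts, so here `t` is simply a given scalar (an assumption):

```python
import math

def temperature_bce(logits, labels, t):
    """Temperature-scaled binary cross-entropy sketch: divide each
    logit by the temperature t, apply the sigmoid, then take the mean
    binary cross-entropy against the 0/1 labels."""
    eps = 1e-12
    loss = 0.0
    for z, y in zip(logits, labels):
        p = 1.0 / (1.0 + math.exp(-z / t))   # temperature-scaled sigmoid
        loss += -(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps))
    return loss / len(logits)
```

A smaller temperature sharpens the sigmoid, rewarding confident correct predictions more strongly.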
  • Before the at least one pruning module is used to filter out the matching subset from the initial matching set, the method further includes:
  • the method further includes:
  • the model parameters of the parametric transformation model are calculated using the matching subset in the case where the parametric transformation model uses a constraint relationship.
  • the image task includes any one of a line fitting task, a wide-baseline image matching task, an image localization task, an image stitching task, a three-dimensional reconstruction task, and a camera pose estimation task.
  • the embodiment of the present application provides a matching screening device, including:
  • an obtaining unit configured to obtain an initial matching set, the initial matching set is derived from the initial matching result between the image pairs;
  • a screening unit configured to filter out a matching subset from the initial matching set through at least one pruning module, where the correct-matching ratio in the matching subset is higher than the correct-matching ratio in the initial matching set;
  • the matching subset is used to calculate model parameters of a parametric transformation model used to process image tasks related to the image pair.
  • An embodiment of the present application provides an electronic device, including a processor and a memory, where the memory is configured to store a computer program including program instructions, and the processor is configured to call the program instructions to execute any one of the above methods.
  • An embodiment of the present application provides a computer-readable storage medium, wherein the computer-readable storage medium stores a computer program configured for electronic data exchange, wherein the computer program causes a computer to execute any one of the above methods.
  • An embodiment of the present application provides a computer program product, wherein the computer program product includes a non-transitory computer-readable storage medium storing a computer program, and the computer program is operable to cause a computer to execute any one of the above methods.
  • the computer program product may be a software installation package.
  • In the embodiments of the present application, the electronic device obtains an initial matching set derived from the initial matching results between image pairs; a matching subset is filtered out of the initial matching set by at least one pruning module, where the correct-matching ratio in the matching subset is higher than that in the initial matching set and the at least one pruning module is used to obtain the consistency information of each initial match in the initial matching set; and the matching subset is used to calculate the model parameters of a parametric transformation model for processing image tasks associated with the image pair.
  • The initial matching set can thus be screened so that the correct-matching ratio in the selected matching subset is higher than that in the initial matching set, which improves the calculation accuracy of the model parameters of the parametric transformation model and, in turn, the performance of the parametric transformation model on image tasks.
  • FIG. 1a is a schematic flowchart of a matching screening method provided by an embodiment of the present application;
  • FIG. 1b is a schematic structural diagram of a Consensus Learning framework (CLNet) for matching screening provided by an embodiment of the present application;
  • FIG. 2a is a schematic flowchart of another matching screening method provided by an embodiment of the present application.
  • FIG. 2b is a schematic structural diagram of another CLNet used for matching screening provided by an embodiment of the present application;
  • FIG. 3 is a schematic flowchart of a first pruning module screening an initial matching set according to an embodiment of the present application;
  • FIG. 4 is a schematic structural diagram of a first local consistency learning module provided by an embodiment of the present application.
  • FIG. 5 is a schematic structural diagram of a first feature dimension reduction module provided by an embodiment of the present application.
  • FIG. 6 is a schematic diagram of feature aggregation performed by a first annular convolution module and a second annular convolution module provided by an embodiment of the present application;
  • FIG. 7a is a schematic flowchart of calculating the comprehensive consistency score of each initial match in the initial matching set provided by an embodiment of the present application;
  • FIG. 7b is another schematic flowchart of calculating the comprehensive consistency score of each initial match in the initial matching set provided by an embodiment of the present application;
  • FIG. 8 is a schematic flowchart of another matching screening method provided by an embodiment of the present application.
  • FIG. 9 is a schematic diagram of a fitting effect on a line fitting task using the CLNet method of the embodiment of the present application and the PointCN method;
  • FIG. 10 is a comparison diagram of the L2 distance on the line fitting task using the CLNet method of the embodiment of the present application and the PointCN method, the OANet method, and the PointACN method;
  • FIG. 11 is a schematic structural diagram of a matching screening apparatus provided by an embodiment of the present application.
  • FIG. 12 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.
  • The electronic devices involved in the embodiments of the present application may include devices with computing capabilities, such as personal computers, mobile phones, servers, face recognition devices, face access-control devices, image processing devices, and virtual reality devices; the above-mentioned devices may be collectively referred to as electronic devices.
  • Feature matching screening is one of the fundamental research problems in computer vision and image processing; its purpose is to filter correct matches from an initial set of feature matches that contains false matches (noise or other disturbances). Since the distribution of false matches is arbitrary and false matches dominate the initial matching set, a matching screening method is needed that can identify correct matches under the interference of a large number of false matches. In addition, factors present in the real world, such as rotation, translation, scale, perspective change, and illumination change, increase the difficulty of feature matching screening.
  • Matching screening methods fall into two classes. Non-machine-learning-based methods perform feature matching screening with hand-designed rules that mostly rely on assumptions or prior knowledge and do not require complex learning and training; however, because the assumptions or prior knowledge fail under specific noise, this class of method is less robust to a variety of noises. Machine-learning-based methods model feature matching screening as a binary classification problem, using a deep learning network to learn to predict the category of every match in the initial matching set, i.e., correct match or false match.
  • When the number of false matches is much larger than the number of correct matches, the learning process of this class of method is easily disturbed, which makes it difficult to identify all potential correct matches at once.
  • embodiments of the present application propose a matching screening method, apparatus, electronic device, computer-readable storage medium, and computer program product.
  • FIG. 1a is a schematic flowchart of a matching screening method provided by an embodiment of the present application. As shown in Figure 1a, the matching screening method may include the following steps.
  • Step 101 the electronic device acquires an initial matching set, where the initial matching set is derived from an initial matching result between image pairs.
  • The initial matching set may include multiple initial matches, and each initial match in the initial matching set may include the feature information of a pair of corresponding points in the image pair (for example, the feature information of a corresponding point may include a combination of at least one of the coordinates of the point, its pixel value, its grayscale value, and its RGB values).
  • An image pair is a pair of images used in the image task, and generally includes two images: a first image and a second image.
  • the initial matching result may be pixel points with consistent matching selected from the first image and the second image respectively based on a pixel-by-pixel matching algorithm.
  • the matched pixels may be corresponding pixels in the first image and the second image.
  • For example, if the first image shows a building photographed from one angle and the second image shows the same building photographed from another angle, the matched pixel points can be the pixels in the first image and the pixels in the second image that correspond to the same position on the building.
  • Step 102 the electronic device filters out a matching subset from the initial matching set through at least one pruning module, where the correct-matching ratio in the matching subset is higher than the correct-matching ratio in the initial matching set.
  • The at least one pruning module is used for acquiring the consistency information of each initial match in the initial matching set, and the matching subset is used for processing image tasks related to the image pair.
  • the consistency information of the initial match is used to measure the consistency of the initial match with other initial matches in the entire image.
  • the consistency may include the consistency of the matching in dimensions such as orientation, rotation, and translation.
  • The matches in the initial matching set are not necessarily all correct: there are both correct matches and false matches. The correct-matching ratio refers to the ratio of the number of correct matches in the initial matching set to the total number of matches in the initial matching set.
  • Step 102 may use a trained neural network learning model (the at least one pruning module) to screen the initial matching set, so that the correct-matching ratio in the selected matching subset is higher than that in the initial matching set.
  • The at least one pruning module in the embodiments of the present application is a trained neural network learning model.
  • step 102 may include the following steps:
  • the electronic device filters the first matching set through the first pruning module to obtain a matching subset;
  • the first matching set is the initial matching set; or
  • the first matching set is obtained by the filtering of the pruning module preceding the first pruning module.
  • The electronic device filtering the first matching set through the first pruning module to obtain a matching subset may include the following steps:
  • the electronic device determines the local consistency information and the global consistency information of the first initial match through the first pruning module, and determines, according to the local consistency information and the global consistency information of the first initial match, whether the first initial match is included in the matching subset; the first initial match is any item in the first matching set.
  • the local consistency information is the consistency of the first initial matching in a local area of the image
  • the global consistency information is the consistency of the first initial matching in the entire image.
  • The electronic device determines, through the one pruning module, the local consistency information and the global consistency information of each initial match in the initial matching set, and filters out a matching subset from the initial matching set according to the local consistency information and the global consistency information of each initial match.
  • The embodiment of the present application adopts a trained pruning module, which can be applied to the case where there are few false matches in the initial matching set.
  • The pruning module is a neural network learning module; because it can learn features during training, it can increase the proportion of correct matches in the selected matching subset compared with training on fixed features.
  • The electronic device filtering the first matching set through the first pruning module to obtain a matching subset may include the following steps:
  • the electronic device determines the local consistency information and the global consistency information of the first initial match through the first pruning module, and determines, according to the local consistency information and the global consistency information of the first initial match, whether the first initial match is included in the matching subset; the first initial match is any item in the first matching set.
  • In some embodiments, the screening function of the first pruning module can also be implemented without considering the global consistency information; for images with little difference in global consistency, this saves the computation required for matching screening and achieves the screening quickly.
  • In some embodiments, the screening function of the first pruning module can also be implemented without considering the local consistency information, which likewise saves the computation required and achieves the matching screening quickly.
  • The electronic device performs at least two screenings through the at least two pruning modules, and each later pruning module further screens the matching set filtered by the preceding pruning module, until the last pruning module filters out the matching subset.
  • The first pruning module is the last of the at least two pruning modules.
  • The at least two pruning modules in the embodiments of the present application are trained neural network learning modules, which can screen the initial matching set at least twice, so that the correct-matching ratio in the screened matching subset is higher; this improves the calculation accuracy of the model parameters of the parametric transformation model and makes the calculated model parameters more reliable when processing image tasks. The parametric transformation model can be used for tasks such as image stitching and 3D reconstruction.
  • The pruning module can be trained with a large number of supervised samples (samples whose correct matches are known in advance), predicting each match and calculating the training loss; when the training loss is less than a set value, the pruning module is determined to be a well-trained pruning module.
  • the trained at least two cropping modules do not screen the initial matching set at the same time, but filter one by one in sequence, that is, the output result filtered by the previous cropping module is used as the input of the next cropping module.
  • for example, the at least two clipping modules include two clipping modules, clipping module 1 and clipping module 2: clipping module 1 performs a first screening on the initial matching set to obtain matching set 1; clipping module 2 performs a second screening on matching set 1 to obtain the matching subset.
  • alternatively, with three clipping modules: clipping module 1 performs a first screening on the initial matching set to obtain matching set 1; clipping module 2 performs a second screening on matching set 1 to obtain matching set 2; clipping module 3 performs a third screening on matching set 2 to obtain the matching subset.
  • for example, if the number of trimming modules is 3, the initial matching set contains 10,000 matches, and each trimming module filters out 50%, the number of matches in the resulting matching subset is 1,250. Since each trimming module fully considers the local consistency and global consistency of every match, the proportion of correct matches in the matching subset is much higher than in the initial matching set. The proportion of false matches in the matching subset is very small, so when the matching subset is used for a line fitting task, the interference from false matches is also small, thereby improving the processing effect of the line fitting task.
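The figures in this example can be checked with a short sketch (10,000 initial matches, three trimming modules, 50% filtered out per round; these are the example values from the paragraph above, not values fixed by the method):

```python
# Progressive pruning: each trimming module keeps a fixed fraction of the matches.
def pruned_size(n_initial: int, keep_rate: float, n_modules: int) -> int:
    """Number of matches surviving n_modules rounds of pruning."""
    n = n_initial
    for _ in range(n_modules):
        n = int(n * keep_rate)
    return n

print(pruned_size(10000, 0.5, 3))  # 10000 -> 5000 -> 2500 -> 1250
```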
  • the initial matching set is trimmed multiple times by the trimming modules, and false matches are gradually eliminated, thereby alleviating the problems of unbalanced sample distribution in the initial matching set and the arbitrary distribution of false matches.
  • FIG. 1b is a schematic structural diagram of a CLNet for matching screening provided by an embodiment of the present application.
  • the CLNet includes at least two cropping modules and a parametric transformation model, where N represents the number of initial matches in the initial matching set, and 4 represents the 4-dimensional coordinates of an initial match (for example, the 4-dimensional coordinates composed of the coordinate position of the first pixel in the first image and the coordinate position of the second pixel in the second image that matches the pixel in the first image).
  • each cropping module may include a local consistency learning module, a global consistency learning module and a cropping sub-module.
  • the embodiment of the present application can be applied to cases where the initial matching set contains many false matches. Since each cropping module learns different features during training, using at least two cropping modules achieves dynamic feature learning through at least two rounds of feature learning; compared with training on fixed features, this improves the proportion of correct matches in the resulting matching subset.
  • the electronic device can use the matching subset to calculate model parameters of a parametric transformation model, and the parametric transformation model is used to process image tasks related to image pairs.
  • the parametric transformation model may be used to predict each initial match in the initial matching set, that is, to predict whether each initial match is a correct match or a false match. The model parameters of the parametric transformation model are calculated based on the matching subset; for example, the model parameters can be an essential matrix.
  • the matching subset is selected from the initial matching set, and the proportion of correct matches in the matching subset is high, which makes the calculated model parameters more reliable, thereby improving the calculation accuracy of the model parameters of the parametric transformation model and improving the processing effect of the parametric transformation model on image tasks.
  • image tasks related to image pairs may include any of the following: line fitting tasks, wide-baseline image matching tasks, image localization tasks, image stitching tasks, and three-dimensional reconstruction tasks.
  • before step 102, the method in FIG. 1a may further perform the following steps:
  • Step 11 The electronic device uses the supervised data set to train the cropping module to obtain a training result.
  • Step 12 The electronic device evaluates the training result through a temperature-adaptive binary loss function, and updates the parameters of the cropping module by minimizing the binary loss function to obtain a trained cropping module.
  • d_thr represents the set distance threshold, which avoids label ambiguity (i.e., a match whose epipolar distance is near d_thr may be judged either as a correct match or as a false match). The confidence c_i of a match should be negatively correlated with the corresponding epipolar distance d_i; that is, the closer d_i is to 0, the more likely the match is to be judged correct. The loss therefore introduces an adaptive temperature, whose calculation can be represented by the Gaussian kernel τ_i shown in formula (1): τ_i = exp(-(d_i - d_thr)^2 / σ^2) (1), where σ is the kernel width of the Gaussian kernel.
  • the training target is described by formula (2): L = L_bce + λ·L_reg (2), where L_bce is the temperature-adaptive binary loss described above, L_reg represents the loss of the parametric transformation model, and λ is the weighting factor.
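As a rough illustration of the temperature-adaptive idea, the sketch below assumes (the exact formulas (1) and (2) are not fully reproduced in this text) that the temperature is a Gaussian kernel of the epipolar distance d_i centred at d_thr with kernel width sigma, and that it down-weights the binary cross-entropy of ambiguous samples near the threshold:

```python
import math

def gaussian_temperature(d_i: float, d_thr: float, sigma: float) -> float:
    # Gaussian kernel of the epipolar distance, centred at the threshold d_thr;
    # equals 1 exactly at d_i == d_thr (the most ambiguous case).
    return math.exp(-((d_i - d_thr) ** 2) / (sigma ** 2))

def binary_loss(confidence: float, d_i: float, d_thr: float, sigma: float) -> float:
    # Hypothetical use of the temperature: matches near d_thr get weight -> 0,
    # so ambiguous labels contribute little to the loss.
    label = 1.0 if d_i < d_thr else 0.0  # 1 = correct match, 0 = false match
    tau = gaussian_temperature(d_i, d_thr, sigma)
    eps = 1e-12
    bce = -(label * math.log(confidence + eps)
            + (1.0 - label) * math.log(1.0 - confidence + eps))
    return (1.0 - tau) * bce
```

This is only one plausible reading of "temperature-adaptive": the kernel could equally scale a sigmoid temperature rather than a sample weight.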
  • the initial matching set can be screened, so that the correct matching ratio in the selected matching subset is higher than the correct matching ratio in the initial matching set, and the matching subset is used to calculate the model parameters of the parametric transformation model.
  • the calculation accuracy of the model parameters of the parametric transformation model is improved, thereby improving the processing effect of the parametric transformation model for processing image tasks.
  • FIG. 2a is a schematic flowchart of another matching screening method provided by an embodiment of the present application.
  • Fig. 2a is obtained by further optimization on the basis of Fig. 1a.
  • the matching screening method may include the following steps.
  • Step 201 the electronic device acquires an initial matching set, where the initial matching set is derived from the initial matching result between the image pairs.
  • Step 202 the electronic device filters a matching subset out of the initial matching set through at least one clipping module; the at least one clipping module is used to obtain the consistency information of each initial match in the initial matching set; the proportion of correct matches in the matching subset is higher than the proportion of correct matches in the initial matching set, and the matching subset is used to process image tasks related to the image pair.
  • the matching subset is used to calculate model parameters of a parametric transformation model, which is used to process image tasks related to image pairs.
  • for the implementation of steps 201 to 202, reference may be made to steps 101 to 102 of FIG. 1a, which will not be repeated here.
  • Step 203 the electronic device uses the parametric transformation model to predict the initial matching set, and obtains a prediction result of each initial matching in the initial matching set, and the prediction result includes correct matching or incorrect matching.
  • the model parameters of the parametric transformation model are calculated using the matching subset, so the calculated model parameters are highly reliable and the parametric transformation model can better predict each initial match in the initial matching set; compared with a neural network model that predicts directly on the initial matching set, this improves the accuracy of the prediction results of the parametric transformation model.
  • FIG. 2b is a schematic structural diagram of another CLNet used for matching screening provided by an embodiment of the present application.
  • the CLNet includes at least two cropping modules, a parametric transformation model and a full-scale prediction module, where N represents the number of initial matches in the initial matching set, and 4 represents the 4-dimensional coordinates of an initial match (for example, a 4-dimensional coordinate consisting of the coordinate position of the first pixel in the first image and the coordinate position of the second pixel in the second image that matches the pixel in the first image).
  • the initial matching set is progressively filtered through K (K is greater than or equal to 2) clipping modules (clipping modules based on local-to-global consistency learning) to obtain a matching subset containing N1 candidate matches; the model parameters of the parametric transformation model are calculated based on the N1 candidate matches, and the full-scale prediction module is used to predict the N initial matches in the initial matching set (i.e., full-scale prediction), obtaining a prediction result for each initial matching pair in the set (the prediction result is a correct match or a false match).
  • each cropping module may include a local consistency learning module, a global consistency learning module and a cropping sub-module.
  • feature matching is widely used in tasks such as Structure From Motion (SfM), Simultaneous Localization And Mapping (SLAM), image stitching, visual localization, virtual reality, etc.
  • SfM refers to the process of obtaining 3D structural information by analyzing 2D moving images of objects.
  • real-world images often involve multiple factors such as rotation, translation, scale, perspective changes, and illumination changes, making the matching screening problem extremely challenging.
  • match screening is usually treated as a match classification task, in which an MLP is used to classify each match (correct match or false match); however, optimizing such a binary classification problem is not easy, because the matches may be extremely unbalanced: for example, outliers (false matches) can account for 90% or more. Therefore, the accuracy of directly predicting correct matches in the initial matching set with an MLP is low.
  • the reliability of the calculated model parameters is high, and the parametric transformation model can better predict each initial match in the initial matching set; compared with a neural network model that predicts directly, the accuracy of the prediction results of the parametric transformation model is improved.
  • FIG. 3 is a schematic flowchart of a first trimming module screening an initial matching set according to an embodiment of the present application. As shown in FIG. 3 , the method may include the following steps.
  • Step 301 the electronic device constructs a first local dynamic graph for the first initial match through the first local consistency learning module, and calculates the local consistency score of the first initial match in the first local dynamic graph; the first local dynamic graph includes the node where the first initial match is located.
  • the first trimming module includes a first local consistency learning module, a first global consistency learning module, and a first trimming sub-module.
  • the first cropping module is the first of the at least two cropping modules that have been trained.
  • the first local consistency learning module may construct a first local dynamic graph, and calculate a local consistency score of the first initial matching in the first local dynamic graph.
  • the first global consistency learning module may construct the first global dynamic graph, calculate the global consistency score of the first initial matching in the first global dynamic graph, and may also calculate the comprehensive consistency score of the first initial matching.
  • the first local dynamic graph is constructed according to the correlation between the high-dimensional feature vector of the first initial match and those of the other initial matches, after the initial feature vector of each initial match has been mapped to a high-dimensional feature vector.
  • Each initial match maps to a node in the first local dynamic graph.
  • the node where the first initial match is located is the node where the first initial match is mapped to the first local dynamic graph.
  • KNN refers to the K-nearest-neighbors algorithm.
  • the local consistency score of the first initial match in the first local dynamic graph is used to measure the local consistency of the first initial match. If the first initial match is a correct match, its local consistency is good and its local consistency score is high; if the first initial match is a false match, its local consistency is poor and its local consistency score is low.
  • the dynamic graph method is used to calculate the consistency score of the matching in the local area and the consistency score of the global area, which can ensure that only reliable matches with high consistency are retained during the cropping process.
  • FIG. 4 is a schematic structural diagram of a first local consistency learning module provided by an embodiment of the present application.
  • the first local consistency learning module includes a first feature dimension enhancement module, a first dynamic graph construction module, a first feature dimension reduction module, and a first local consistency score calculation module.
  • Step 301 may include the following steps:
  • Step 21 The electronic device performs dimension-upgrading processing on the initial feature vector of the first initial match through the first feature dimension-upgrading module to obtain the high-dimensional feature vector of the first initial match;
  • Step 22 The electronic device uses the first local dynamic graph construction module to determine, through the K-nearest-neighbors algorithm, the K matches in the first matching set whose high-dimensional feature vectors are most correlated with that of the first initial match (the correlation may be determined, for example, according to the Euclidean distance); builds the first local dynamic graph for the first initial match based on the first initial match and the K correlated matches; and obtains the ultra-high-dimensional feature vector of the first initial match, which is the combination of the high-dimensional feature vector of the first initial match and the correlation vectors between the first initial match and the K correlated matches;
  • Step 23 The electronic device uses the first feature dimensionality reduction module to perform dimensionality reduction processing on the first initial matched ultra-high-dimensional feature vector, and obtains the first initial matched low-dimensional feature vector;
  • Step 24 The electronic device calculates, through the first local consistency score calculation module, a local consistency score of the first initial match in the first local dynamic graph based on the low-dimensional feature vector of the first initial match.
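Steps 21 to 24 can be sketched as follows. The plain linear maps below stand in for the trained residual network and MLP, and all shapes (N=6 matches, an 8-dimensional up-projection, K=3 neighbours) are illustrative assumptions, not values from the text:

```python
import numpy as np

rng = np.random.default_rng(0)
N, D_HI, K = 6, 8, 3

coords = rng.normal(size=(N, 4))      # step 21 input: 4-D match coordinates
W_up = rng.normal(size=(4, D_HI))
z = coords @ W_up                     # step 21: up-projected features, N x D_HI

# Step 22: K nearest neighbours of match 0 by Euclidean distance in feature space.
dists = np.linalg.norm(z - z[0], axis=1)
neighbors = np.argsort(dists)[1:K + 1]  # exclude the match itself

# "Ultra-high-dimensional" feature: own feature plus its relation to each neighbour.
ultra = np.concatenate([np.tile(z[0], (K, 1)), z[neighbors] - z[0]], axis=1)

# Step 23: aggregate over neighbours (stand-in for the annular convolutions).
low = ultra.mean(axis=0)

# Step 24: a stand-in scalar head for the local consistency score.
w_score = rng.normal(size=low.shape)
local_score = float(low @ w_score)
print(neighbors.shape, ultra.shape, low.shape)
```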
  • the first feature dimension enhancement module may be a trained deep neural network module, such as a trained residual network; the residual network may include multiple residual modules (ResNet Blocks), for example, 4 residual modules.
  • the first feature dimension increasing module may perform dimension increasing processing on the initial feature vector of each initial match in the first matching set to obtain a high-dimensional feature vector of each initial match.
  • the first dynamic graph building module, the first feature dimension reduction module, and the first local consistency score calculation module may all be trained deep neural network modules.
  • the first feature dimension reduction module may include multiple residual modules (ResNet Blocks), and the first local consistency score calculation module may include an MLP.
  • the initial feature vector of the first initial match may be a four-dimensional vector, comprising the coordinates of the first pixel of the first initial match in the first image of the image pair and the coordinates of the second pixel of the first initial match in the second image of the image pair.
  • the high-dimensional feature vector of the first initial match may be a 128-dimensional vector.
  • FIG. 5 is a schematic structural diagram of a first feature dimension reduction module provided by an embodiment of the present application.
  • the first feature dimension reduction module includes a first annular convolution module and a second annular convolution module.
  • Step 23 may include the following steps:
  • Step 231 The electronic device groups the first initial match's ultra-high-dimensional feature vectors according to the degree of correlation through the first annular convolution module, and performs a first feature aggregation on each group of feature vectors to obtain a preliminarily aggregated feature vector;
  • Step 232 The electronic device performs a second feature aggregation process on the initially aggregated feature vector through the second annular convolution module to obtain the first initially matched low-dimensional feature vector.
  • the first annular convolution module groups the first initial match's ultra-high-dimensional feature vectors according to the degree of correlation, with each group of feature vectors having the same dimensions. For example, the matches ranked in the top 10% by correlation form one group, those ranked 10% to 20% form one group, and likewise for 20% to 30%, 30% to 40%, 40% to 50%, 50% to 60%, 60% to 70%, 70% to 80%, 80% to 90%, and 90% to 100%, for a total of 10 groups.
  • the first ring-shaped convolution module groups the first initially matched ultra-high-dimensional feature vectors according to the degree of correlation, and then aggregates each group of feature vectors into one feature vector.
  • for example, the ultra-high-dimensional feature vector is of dimension k×128; the k×128 feature can be divided into k/p groups of p vectors each; the first annular convolution module performs the first feature aggregation on the grouped feature, and the preliminarily aggregated feature vector is of dimension (k/p)×128; the second annular convolution module can then aggregate the (k/p)×128 feature into a low-dimensional feature vector of dimension 1×128.
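The dimension bookkeeping of the two annular convolutions can be sketched with mean pooling standing in for the learned aggregation (grouping here is by array order rather than learned correlation rank, and k=8, p=4 are illustrative values):

```python
import numpy as np

k, p, d = 8, 4, 128
ultra = np.arange(k * d, dtype=float).reshape(k, d)  # k x 128 neighbour features

# First annular convolution: k/p groups of p vectors, each aggregated to one vector.
groups = ultra.reshape(k // p, p, d)
prelim = groups.mean(axis=1)      # preliminarily aggregated: (k/p) x 128

# Second annular convolution: aggregate the remaining vectors into 1 x 128.
low = prelim.mean(axis=0)
print(groups.shape, prelim.shape, low.shape)
```

In the described module the two aggregation steps have separate learned parameters (the text notes they are not shared); mean pooling is used here only to make the shapes concrete.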
  • the parameters in the matrix learned in the first annular convolution module are not shared with the parameters in the matrix learned in the second annular convolution module.
  • a parameter in a matrix refers to the value of an element in the matrix.
  • FIG. 6 is a schematic diagram of feature aggregation performed by a first annular convolution module and a second annular convolution module provided by an embodiment of the present application.
  • the K correlation matches determined by the K-nearest neighbor algorithm are all reflected in the first local dynamic diagram of FIG. 6 .
  • the aggregated feature vectors are (k/p) ⁇ 128 dimensions.
  • the second annular convolution module can aggregate (k/p) ⁇ 128 into a low-dimensional feature vector of 1 ⁇ 128 dimensions.
  • the embodiment of the present application uses annular convolution modules to group the ultra-high-dimensional feature vector of the first initial match according to the degree of correlation before reducing its dimensionality, which fully considers the local consistency of the first initial match, so that the low-dimensional feature vector of the first initial match still retains that local consistency, thereby improving the accuracy of the calculated local consistency score of the first initial match in the first local dynamic graph.
  • Step 302 the electronic device constructs a first global dynamic graph through the first global consistency learning module, and determines the comprehensive consistency score of the first initial match according to the local consistency score of the first initial match in the first local dynamic graph and the first global dynamic graph.
  • the first global dynamic graph includes the nodes where all initial matches are located; the global consistency score of the first initial match in the first global dynamic graph can be determined through the first global dynamic graph, and the comprehensive consistency score of the first initial match is then determined according to the local consistency score of the first initial match in the first local dynamic graph and the global consistency score of the first initial match in the first global dynamic graph.
  • the comprehensive consistency score of the first initial match may also be determined according to the local consistency score of the first initial match in the first local dynamic graph and the first global dynamic graph.
  • the comprehensive consistency score of the first initial matching is obtained by synthesizing the local consistency score of the first initial matching in the first local dynamic graph and the global consistency score of the first initial matching in the first global dynamic graph.
  • in step 302, the electronic device determining the comprehensive consistency score of the first initial match according to the local consistency score of the first initial match in the first local dynamic graph and the first global dynamic graph may include the following steps:
  • Step 31 Calculate the global consistency score of the first initial match in the first global dynamic graph through the first global consistency learning module
  • Step 32 The first global consistency learning module determines a comprehensive consistency score of the first initial match according to the local consistency score and the global consistency score.
  • the first global consistency learning module may calculate the global consistency score of the first initial match in the first global dynamic graph, and determine the comprehensive consistency score of the first initial match according to the local consistency score and the global consistency score. In some embodiments of the present application, the first global consistency learning module may directly add the local consistency score and the global consistency score and use their sum as the comprehensive consistency score; it may also calculate the comprehensive consistency score according to a weighting algorithm.
  • step 302 the electronic device constructing the first global dynamic graph by using the first global consistency learning module may include the following steps:
  • Step 41 The electronic device constructs a first global dynamic graph through the first global consistency learning module according to the local consistency score of each initial match in the first matching set in the corresponding local dynamic graph;
  • step 302 the electronic device determines the comprehensive consistency score of the first initial matching according to the local consistency score of the first initial matching in the first local dynamic graph and the first global dynamic graph, which may include:
  • Step 42 The electronic device calculates a comprehensive consistency score of the first initial match according to the first global dynamic map and the low-dimensional feature vector of the first initial match.
  • the global consistency score is not calculated directly; instead, the comprehensive consistency score is calculated directly on the basis of the local consistency score and the first global dynamic graph, which omits the separate calculation of the global consistency score and improves the computational efficiency of the comprehensive consistency score.
  • the first global dynamic graph is represented by an adjacency matrix, and step 42 may include the following steps:
  • Step 421 The electronic device calculates the comprehensive low-dimensional feature vector of the first initial match based on the low-dimensional feature vector of the first initial match and the adjacency matrix using a graph convolution network (Graph Convolutional Network, GCN);
  • GCN Graph Convolutional Network
  • Step 422 The electronic device calculates a comprehensive consistency score of the first initial match based on the comprehensive low-dimensional feature vector of the first initial match.
  • the first global dynamic graph is constructed by the local consistency score of each initial match in the first matching set in the corresponding local dynamic graph.
  • the adjacency matrix is obtained by multiplying the matrix composed of the local consistency scores of each initial match in the corresponding local dynamic graph and the corresponding transposed matrix in the first matching set.
  • the matrix composed of the local consistency scores of each initial match in the first matching set in its corresponding local dynamic graph is N×1; its transpose is 1×N; multiplying the two yields an N×N matrix, which is the adjacency matrix.
  • the output result of the graph convolutional network is shown in formula (4): output = L·Z·W_g (4), where L = D^(-1/2)·A'·D^(-1/2), D is the diagonal degree matrix of A', A' = A + I_N, and I_N is the identity matrix introduced to ensure numerical stability; L is an N×N matrix, Z is an N×128 matrix, and W_g is a 128×128 matrix.
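Formula (4) and the adjacency construction described above can be sketched as follows. The score vector, features, and weight matrix are random stand-ins (in the described network, Z comes from the local consistency module and W_g is learned), and N=5 is illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
N, d = 5, 128

w = rng.random((N, 1))                 # N x 1 local consistency scores
A = w @ w.T                            # N x N adjacency matrix (scores times transpose)
A_prime = A + np.eye(N)                # A' = A + I_N, for numerical stability
D_inv_sqrt = np.diag(1.0 / np.sqrt(A_prime.sum(axis=1)))  # D^(-1/2) of the degree matrix
L = D_inv_sqrt @ A_prime @ D_inv_sqrt  # symmetric normalisation, formula (4)

Z = rng.normal(size=(N, d))            # per-match low-dimensional features
W_g = rng.normal(size=(d, d))          # learned weight matrix (random stand-in)
out = L @ Z @ W_g                      # N x d comprehensive features
print(out.shape)
```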
  • after the comprehensive low-dimensional feature vector of each initial match is processed by the residual module, it is input to an MLP; the MLP reduces the dimensionality of the comprehensive low-dimensional feature vector of each initial match in the first matching set and calculates the comprehensive consistency score of each initial match in the first matching set.
  • Step 303 the electronic device uses the first trimming sub-module to determine whether the first initial match is classified into the matching subset according to the comprehensive consistency score of the first initial match.
  • the higher the comprehensive consistency score of the first initial match, the higher the probability that the first initial match is a correct match.
  • the initial matches with higher comprehensive consistency scores in the first matching set can be classified into the matching subset; alternatively, the comprehensive consistency scores in the first matching set can be sorted in descending order and the top-ranked initial matches classified into the matching subset.
  • a comprehensive consistency score is calculated for each initial match, so each initial match in the first matching set can be classified by a single simple index (the comprehensive consistency score). Since this score takes both the local consistency and the global consistency of the initial match into account, more correct matches can be screened out of the first matching set through the first trimming module, which lays a good foundation for the subsequent screening of the matching subset.
  • the proportion of correct matches in the screened-out matching subset is higher than the proportion of correct matches in the first matching set.
  • step 303 may include the following steps:
  • the electronic device uses the first trimming submodule to determine whether the comprehensive consistency score of the first initial match is greater than a first threshold, and if so, determines that the first initial match is classified into the matching subset;
  • alternatively, the electronic device uses the first trimming submodule to rank the comprehensive consistency scores of the first matching set in descending order, and if the first initial match is ranked above a second threshold, determines that the first initial match is classified into the matching subset.
  • that is, the first matching set may be screened according to the comprehensive consistency score of each initial match, classifying into the matching subset those initial matches whose comprehensive consistency score is greater than the first threshold; it is also possible to rank the comprehensive consistency scores of the initial matches in the first matching set in descending order and classify into the matching subset those initial matches ranked above the second threshold.
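Both selection rules can be sketched as follows; the threshold value and top-k count are hypothetical example values:

```python
def prune_by_threshold(scores: list[float], first_threshold: float) -> list[int]:
    """Keep indices of matches whose comprehensive score exceeds the threshold."""
    return [i for i, s in enumerate(scores) if s > first_threshold]

def prune_by_rank(scores: list[float], top_k: int) -> list[int]:
    """Keep indices of the top_k matches by descending comprehensive score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(order[:top_k])

scores = [0.9, 0.2, 0.7, 0.4, 0.8]
print(prune_by_threshold(scores, 0.5))  # [0, 2, 4]
print(prune_by_rank(scores, 2))         # [0, 4]
```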
  • FIG. 7a is a schematic flowchart of calculating a comprehensive consistency score of each initial match in a first matching set (the first matching set is an example of an initial matching set) provided by an embodiment of the present application.
  • the initial matching set is (c_1, c_2, ..., c_N), where c_i represents the initial feature vector of an initial match; c_i can be a 4-dimensional vector (composed of the two-dimensional coordinates of the first pixel of the initial match in the first image of the image pair and the two-dimensional coordinates of the second pixel of the initial match in the second image of the image pair), and the initial matching set includes N initial matches.
  • after feature dimension upgrading, the initial matching set becomes (z_1, z_2, ..., z_N).
  • after the local dynamic graph is constructed, each match's feature vector is of dimension k×256.
  • the feature dimension reduction module can reduce each match from k ⁇ 256 dimensions to 128 dimensions, and the local consistency score calculation module can calculate the local consistency score of each match.
  • the global consistency learning module outputs a comprehensive consistency score for each match.
  • the feature dimension upgrading module can include 4 residual modules (for example, 1 residual module for dimension upgrading and the other 3 for alleviating deep-network degradation); the feature dimension reduction module can consist of 1 MLP (for dimensionality reduction, not shown in Figure 7a and Figure 7b; the MLP can reduce k×256 dimensions to k×128 dimensions), annular convolution (for dimensionality reduction, for example from k×128 dimensions to 128 dimensions), and 4 residual modules (to alleviate deep-network degradation); the local consistency score calculation can be realized by 1 MLP.
  • FIG. 7b is another schematic flowchart of calculating the comprehensive consistency score of each initial match in the first matching set (the first matching set is an example of an initial matching set) provided by an embodiment of the present application.
  • Figure 7b is further optimized on the basis of Figure 7a.
  • the local consistency calculation process in Figure 7b is similar to that in Figure 7a, and Figure 7b can describe the global consistency calculation process.
  • the N×1 score matrix is transposed to 1×N, and the two are multiplied to obtain an N×N adjacency matrix, completing the construction of the global dynamic graph. The adjacency matrix covers the consistency between each match and the other matches in the initial matching set, that is, it contains the global consistency information of each match.
  • a filter with shared parameters is used to form a feature map by calculating the weighted sum of the central pixel and adjacent pixels to achieve feature space extraction.
  • the graph convolutional network can modulate the information learned by the local consistency module into the spectral domain, and the feature filter in the spectral domain enables the propagated features to reflect the consistency in the Laplacian operator of the global dynamic graph.
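A minimal numpy sketch of the global dynamic composition and one propagation step, assuming the N×1 vector holds the per-match local consistency scores (the sizes, scores, and row-normalisation are illustrative assumptions; the actual module uses a learned graph convolutional network):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 6  # number of initial matches (illustrative)

# Local consistency scores of the N matches, shaped N x 1.
scores = rng.random((N, 1))

# Global dynamic composition: multiply the N x 1 score vector by its
# 1 x N transpose to obtain an N x N adjacency matrix that relates
# every match to every other match.
A = scores @ scores.T  # (N, N), symmetric

# One propagation step: row-normalise the adjacency matrix and mix the
# per-match low-dimensional features through it, so each feature
# reflects global consistency with the other matches.
A_norm = A / A.sum(axis=1, keepdims=True)
features = rng.standard_normal((N, 128))  # low-dim feature per match
propagated = A_norm @ features  # (N, 128)
```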
  • the screening function of the first cropping module can also be implemented without considering the global consistency information. For image pairs whose global consistency differs little, this saves the amount of computation required for match filtering and enables fast matching screening.
  • FIG. 8 is a schematic flowchart of another matching screening method provided by an embodiment of the present application.
  • Fig. 8 is obtained by further optimization on the basis of Fig. 2a.
  • the matching screening method may include the following steps.
  • Step 801 the electronic device obtains an initial matching set, the initial matching set is derived from the initial matching result between the image pairs.
  • Step 802 the electronic device filters out a matching subset from the initial matching set through at least one clipping module, and at least one clipping module is used to obtain the consistency information of each initial match in the initial matching set, and the correct matching ratio in the matching subset is higher than The proportion of correct matches in the initial match set.
  • the matching subset is used to process image tasks related to image pairs.
  • steps 801 to 802 reference may be made to steps 201 to 202 shown in FIG. 2a, and details are not repeated here.
  • Step 803 the electronic device determines the constraint relationship used by the parametric transformation model according to the image task related to the image pair, and the constraint relationship includes epipolar geometric constraint or reprojection error.
  • different image tasks may correspond to different constraint relationships.
  • the constraint relationship used is epipolar geometry constraint
  • the constraint relationship used is reprojection error.
  • step 803 is performed before step 804; step 803 may be performed before step 801 or step 802, after step 801 or step 802, or simultaneously with step 801 or step 802, which is not limited in this embodiment of the present application.
  • Step 804 in the case that the parametric transformation model uses the constraint relationship, the electronic device uses the matching subset to calculate model parameters of the parametric transformation model, and the parametric transformation model is used to process image tasks related to image pairs.
  • Step 805 the electronic device uses the parametric transformation model to predict the initial matching set, and obtains a prediction result of each initial matching in the initial matching set, and the prediction result includes correct matching or incorrect matching.
  • the electronic device uses a parametric transformation model to predict the initial matching set, and obtains the prediction result of each initial matching in the initial matching set, including:
  • the electronic device uses the parametric transformation model to calculate the epipolar distance or reprojection error of each match in the initial matching set, and then determines the prediction result of each initial match according to the epipolar distance or reprojection error of each match.
  • the model parameters of the parametric transformation model may be essential matrices.
  • the electronic device uses the parametric transformation model to calculate the epipolar distance of each match in the initial matching set, and then determines the prediction result of each initial match according to the epipolar distance of each match.
  • a match with an epipolar distance less than a third threshold may be predicted as a correct match, and a match with an epipolar distance greater than the third threshold may be predicted as an incorrect match.
  • the electronic device uses the parametric transformation model to calculate the reprojection error of each match in the initial matching set, and then determines the prediction result of each initial match according to the reprojection error of each match.
  • a match with a reprojection error less than a fourth threshold may be predicted as a correct match, and a match with a reprojection error greater than the fourth threshold may be predicted as an incorrect match.
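The thresholding rule in the two bullets above can be sketched as follows; the residual values and the threshold are invented for illustration:

```python
import numpy as np

def predict_matches(residuals, threshold):
    """Label each initial match: True = predicted correct match,
    False = predicted incorrect match.

    `residuals` holds the per-match epipolar distance (or reprojection
    error) computed from the parametric transformation model, and
    `threshold` plays the role of the third (or fourth) threshold.
    """
    return np.asarray(residuals) < threshold

# Illustrative residuals and threshold (the values are invented).
epipolar_dist = [0.001, 0.8, 0.002, 1.5]
print(predict_matches(epipolar_dist, threshold=0.01).tolist())
# prints [True, False, True, False]
```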
  • the constraint relationship corresponding to the image task can be selected before the model parameters of the parametric transformation model are calculated, so that the subsequent image task can be better completed by the calculated parametric transformation model.
  • the effects of using the CLNet method according to the embodiment of the present application and the PointCN method on the line fitting task are presented below with reference to FIG. 9 .
  • the PointCN method is used to directly perform straight line fitting on the initial matching set; in the method of the embodiment of the present application, the matching subset is first screened from the initial matching set by the clipping module, and then the straight line fitting is performed according to the matching subset.
  • the matching subset obtained by the method of the embodiment removes most of the wrong matches, so the straight line fitting is less affected by wrong matches, thereby improving the reliability of the straight line fitting.
  • Figure 9 provides two initial matching sets (the initial matching set in the first case and the initial matching set in the second case; the distributions of the initial matching sets in the two cases are different). The two initial matching sets contain matches drawn from random distributions in real scenes; the line fitting task requires the model to fit a given line.
  • the PointCN method is not very reliable.
  • the PointCN fitting fails under these conditions, but the method of the embodiment of the present application succeeds in both cases.
  • the cropping module may crop the initial matching set multiple times based on consistency learning from the local area to the global area, to obtain a matching subset with a higher degree of confidence.
  • a comparison of the L2 distance on the line fitting task between the CLNet method of the embodiment of the present application and the PointCN, OANet, and PointACN methods is presented below with reference to FIG. 10.
  • the ordinate of Figure 10 is the L2 distance error
  • the abscissa is the outlier rate (the proportion of false matches) of the test dataset.
  • the evaluation index in Figure 10 is the L2 distance between the predicted line parameters and the real line. The smaller the L2 distance, the higher the prediction accuracy.
  • the image task includes any one of a line fitting task, a wide-baseline image matching task, an image localization task, an image stitching task, a three-dimensional reconstruction task, and a camera pose estimation task.
  • by screening the initial matching set containing a large number of false matches with the matching screening method, high-precision feature matching results can be obtained, which can be used for line fitting tasks and wide-baseline image matching tasks; the screening results of the matching screening method can be used to calculate the parametric transformation model between images for image stitching, 3D reconstruction, and camera pose estimation tasks; and the number of feature matches obtained by the matching screening method can be used as a measure for image retrieval to locate the target image.
  • the matching screening method can be applied to VIPER platform products.
  • the electronic device includes corresponding hardware structures and/or software modules for executing each function.
  • in combination with the units and algorithm steps of each example described in the embodiments provided herein, the present application can be implemented in hardware or in a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends on the specific application and the design constraints of the technical solution. Skilled artisans may implement the described functionality in different ways for each particular application, but such implementations should not be considered beyond the scope of this application.
  • the electronic device may be divided into functional units according to the foregoing method examples.
  • each functional unit may be divided corresponding to each function, or two or more functions may be integrated into one processing unit.
  • the above-mentioned integrated units may be implemented in the form of hardware, or may be implemented in the form of software functional units. It should be noted that the division of units in the embodiments of the present application is illustrative, and is only a logical function division, and other division methods may be used in actual implementation.
  • FIG. 11 is a schematic structural diagram of a matching screening apparatus provided by an embodiment of the present application.
  • the matching screening apparatus 1100 is applied to electronic equipment.
  • the matching screening apparatus 1100 may include an acquisition unit 1101 and a screening unit 1102, where:
  • the obtaining unit 1101 is configured to obtain an initial matching set, the initial matching set is derived from the initial matching result between the image pairs;
  • the screening unit 1102 is configured to filter out a matching subset from the initial matching set through at least one cropping module, where the correct matching ratio in the matching subset is higher than the correct matching ratio in the initial matching set;
  • the at least one cropping module is used to obtain the consistency information of each initial match in the initial matching set;
  • the matching subset is used to process image tasks related to the image pair.
  • the matching screening apparatus 1100 may further include a prediction unit 1103;
  • the prediction unit 1103 is configured to, after the screening unit 1102 filters out a matching subset from the initial matching set through at least one cropping module, use the parametric transformation model to predict the initial matching set to obtain the prediction result of each initial match in the initial matching set, where the prediction result includes a correct match or an incorrect match.
  • the screening unit 1102 is configured to filter out a matching subset from the initial matching set through at least one clipping module, including:
  • in the case where the at least one clipping module includes one clipping module, the first matching set is the initial matching set;
  • in the case where the at least one clipping module includes at least two clipping modules, the first matching set is obtained by screening through the clipping module preceding the first clipping module.
  • the screening unit 1102 is configured to screen the first matching set through the first clipping module to obtain the matching subset, including: determining local consistency information or global consistency information of a first initial match through the first clipping module, and determining whether the first initial match is classified into the matching subset according to the local consistency information or global consistency information of the first initial match; the first initial match is any item in the first matching set.
  • the screening unit 1102 is configured to screen the first matching set through the first clipping module to obtain the matching subset, including: determining local consistency information and global consistency information of a first initial match through the first clipping module, and determining whether the first initial match is classified into the matching subset according to the local consistency information and global consistency information of the first initial match; the first initial match is any item in the first matching set.
  • the first trimming module includes a first local consistency learning module, a first global consistency learning module and a first trimming sub-module
  • the feature matching consistency information includes at least one of a local consistency score and a global consistency score
  • the screening unit 1102 is configured to determine the local consistency information and the global consistency information of the first initial match through the first trimming module, and to determine whether the first initial match is classified into the matching subset according to the local consistency information and global consistency information of the first initial match, including:
  • a first local dynamic graph for the first initial match is constructed by the first local consistency learning module, and the local consistency score of the first initial match in the first local dynamic graph is calculated; the first local dynamic graph includes the node where the first initial match is located and K related nodes related to that node; the K related nodes are obtained based on the node where the first initial match is located by using the K-nearest neighbor algorithm;
  • a first global dynamic graph is constructed by the first global consistency learning module, and the comprehensive consistency score of the first initial match is determined according to the local consistency score of the first initial match in the first local dynamic graph and the first global dynamic graph; the first global dynamic graph includes the nodes where all initial matches are located;
  • the first trimming submodule is used to determine whether the first initial match is classified into the matching subset according to the comprehensive consistency score of the first initial match.
  • the first local consistency learning module includes a first feature dimension enhancement module, a first dynamic graph construction module, a first feature dimension reduction module, and a first local consistency score calculation module;
  • the screening unit 1102 is configured to construct a first local dynamic graph for the first initial match through the first local consistency learning module, and calculate a local consistency score of the first initial match in the first local dynamic graph, including:
  • the initial feature vector of the first initial match is subjected to a dimensional upgrade process by the first feature dimensional increasing module to obtain a high-dimensional feature vector of the first initial match;
  • the first dynamic graph building module is used to determine, through the K-nearest neighbor algorithm, the top K related matches in the first matching set ranked by correlation (Euclidean distance) with the high-dimensional feature vector of the first initial match; a first local dynamic graph for the first initial match is constructed based on the first initial match and the K related matches, and an ultra-high-dimensional feature vector of the first initial match is obtained; the ultra-high-dimensional feature vector of the first initial match includes a combination of the high-dimensional feature vector of the first initial match and the correlation vectors between the first initial match and the K related matches;
  • the first feature dimension reduction module is used to perform dimensionality reduction processing on the ultra-high-dimensional feature vector of the first initial match to obtain the low-dimensional feature vector of the first initial match;
  • the local consistency score of the first initial match in the first local dynamic graph is calculated by the first local consistency score calculation module based on the low-dimensional feature vector of the first initial match.
  • the first feature dimension reduction module includes a first annular convolution module and a second annular convolution module; the screening unit 1102 is configured to use the first feature dimension reduction module to perform dimensionality reduction processing on the ultra-high-dimensional feature vector of the first initial match to obtain the low-dimensional feature vector of the first initial match, including:
  • the super-high-dimensional feature vectors of the first initial match are grouped according to the degree of relevancy by the first annular convolution module, and the first feature aggregation process is performed on each group of feature vectors to obtain initially aggregated feature vectors;
  • a second feature aggregation process is performed on the initially aggregated feature vector by the second annular convolution module to obtain the first initially matched low-dimensional feature vector.
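A sketch of the K-nearest-neighbor construction described above, in numpy. The patent does not specify the form of the correlation vectors, so the feature difference to each neighbour is used here as one plausible assumption:

```python
import numpy as np

def build_local_graph(high_dim, index, k):
    """For the match at `index`, find the K related matches whose
    high-dimensional feature vectors are closest in Euclidean distance,
    and assemble the 'ultra-high-dimensional' feature: the match's own
    vector combined with a relation vector to each neighbour (here the
    feature difference, one plausible choice)."""
    diffs = high_dim - high_dim[index]          # (N, d) residual vectors
    dists = np.linalg.norm(diffs, axis=1)
    dists[index] = np.inf                        # exclude the match itself
    neighbours = np.argsort(dists)[:k]           # top-K most related matches
    ultra = np.concatenate(
        [np.tile(high_dim[index], (k, 1)), diffs[neighbours]], axis=1)
    return neighbours, ultra                     # ultra has shape (k, 2d)

rng = np.random.default_rng(1)
feats = rng.standard_normal((10, 128))           # N = 10 matches, d = 128
nbrs, ultra = build_local_graph(feats, index=0, k=4)
print(nbrs.shape, ultra.shape)  # prints (4,) (4, 256)
```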
  • the screening unit 1102 is configured to determine the comprehensive consistency score of the first initial match according to the local consistency score of the first initial match in the first local dynamic graph and the first global dynamic graph, including:
  • calculating the global consistency score of the first initial match in the first global dynamic graph;
  • determining the comprehensive consistency score of the first initial match according to the local consistency score and the global consistency score.
  • the screening unit 1102 is configured to construct a first global dynamic graph through the first global consistency learning module, including:
  • the first global dynamic graph is constructed by the first global consistency learning module according to the local consistency score of each initial match in the first matching set in the corresponding local dynamic graph;
  • the determining of the comprehensive consistency score of the first initial match according to the local consistency score of the first initial match in the first local dynamic graph and the first global dynamic graph includes:
  • a comprehensive consistency score of the first initial match is calculated according to the first global dynamic map and the low-dimensional feature vector of the first initial match.
  • the first global dynamic graph is represented by an adjacency matrix
  • the screening unit 1102 is configured to calculate the comprehensive consistency score of the first initial match according to the first global dynamic graph and the low-dimensional feature vector of the first initial match, including:
  • based on the low-dimensional feature vector of the first initial match and the adjacency matrix, a graph convolutional network is used to calculate the comprehensive low-dimensional feature vector of the first initial match;
  • a composite consistency score for the first initial match is calculated based on the composite low-dimensional feature vector of the first initial match.
  • the screening unit 1102 is configured to use the first trimming sub-module to determine whether the first initial match is classified into the matching subset according to the comprehensive consistency score of the first initial match, including:
  • the first trimming sub-module is used to determine whether the comprehensive consistency score of the first initial match is greater than a first threshold, and if so, the first initial match is determined to be classified into the matching subset;
  • or, the first trimming sub-module is used to determine the rank of the comprehensive consistency score of the first initial match in the first matching set in descending order, and if the rank of the first initial match is greater than a second threshold, the first initial match is determined to be classified into the matching subset.
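The two alternative trimming rules above (score threshold, or descending-rank cutoff) can be sketched as:

```python
import numpy as np

def prune(scores, threshold=None, keep_top=None):
    """Indices of matches the trimming sub-module keeps: either every
    match whose comprehensive consistency score exceeds `threshold`,
    or the `keep_top` highest-scoring matches (descending ranking)."""
    scores = np.asarray(scores)
    if threshold is not None:
        return np.flatnonzero(scores > threshold)
    order = np.argsort(scores)[::-1]   # ranks, best score first
    return np.sort(order[:keep_top])

scores = [0.9, 0.2, 0.7, 0.05, 0.6]
print(prune(scores, threshold=0.5).tolist())  # prints [0, 2, 4]
print(prune(scores, keep_top=2).tolist())     # prints [0, 2]
```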
  • the matching screening apparatus 1100 further includes a training unit 1104;
  • the training unit 1104 is configured to, before the screening unit 1102 filters out a matching subset from the initial matching set through at least one cropping module, train the cropping module using a supervised data set to obtain a training result;
  • the training result is evaluated through an adaptive-temperature binary classification loss function, and the parameters of the cropping module are updated by minimizing the binary classification loss function.
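A hedged sketch of the training loss: a standard binary cross-entropy with a temperature-scaled sigmoid. The actual adaptive-temperature rule is not specified in the text, so a fixed temperature parameter stands in for it here:

```python
import numpy as np

def binary_loss(logits, labels, temperature=1.0):
    """Binary cross-entropy over per-match correctness predictions.
    The `temperature` divisor stands in for the adaptive-temperature
    mechanism (its exact adaptation rule is not specified here);
    smaller temperatures sharpen the sigmoid."""
    p = 1.0 / (1.0 + np.exp(-np.asarray(logits, dtype=float) / temperature))
    y = np.asarray(labels, dtype=float)
    eps = 1e-9
    return float(-np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps)))

# Illustrative training step: correct matches labelled 1, wrong ones 0.
loss = binary_loss([2.0, -1.5, 0.3], [1, 0, 1], temperature=0.5)
```

In a training loop, the cropping module's parameters would be updated in the direction that minimizes this loss, as the bullet above describes.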
  • the matching screening apparatus 1100 further includes a determining unit 1105 and a calculating unit 1106;
  • the determining unit 1105 is configured to, before the screening unit 1102 filters out a matching subset from the initial matching set through at least one cropping module, determine the constraint relationship used by the parametric transformation model according to the image task related to the image pair, where the constraint relationship includes epipolar geometric constraints or reprojection errors;
  • the calculating unit 1106 is configured to use the matching subset to calculate the model parameters of the parametric transform model when the parametric transform model uses the constraint relationship.
  • the image task includes any one of a line fitting task, a wide-baseline image matching task, an image localization task, an image stitching task, a three-dimensional reconstruction task, and a camera pose estimation task.
  • the acquisition unit 1101 in this embodiment of the present application may be a communication module in an electronic device, and the screening unit 1102, prediction unit 1103, training unit 1104, determination unit 1105, and calculation unit 1106 may be processors or chips in the electronic device.
  • the initial matching set can be screened, so that the correct matching ratio in the selected matching subset is higher than the correct matching ratio in the initial matching set, which can improve the calculation accuracy of the model parameters of the parametric transformation model , and then improve the processing effect of the parametric transformation model for processing image tasks.
  • FIG. 12 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.
  • the electronic device 1200 includes a processor 1201 and a memory 1202.
  • the processor 1201 and the memory 1202 may be connected to each other through a communication bus 1203.
  • the communication bus 1203 may be a peripheral component interconnect standard (Peripheral Component Interconnect, PCI) bus or an Extended Industry Standard Architecture (Extended Industry Standard Architecture, EISA) bus or the like.
  • the communication bus 1203 can be divided into an address bus, a data bus, a control bus, and the like. For ease of representation, only one thick line is shown in FIG. 12, but it does not mean that there is only one bus or one type of bus.
  • the memory 1202 is configured to store a computer program including program instructions
  • the processor 1201 is configured to invoke the program instructions to perform the methods shown in FIGS. 1a, 2a, and 3.
  • the processor 1201 may be a general-purpose central processing unit (Central Processing Unit, CPU), a microprocessor, an application-specific integrated circuit (Application-Specific Integrated Circuit, ASIC), or one or more integrated circuits for controlling the execution of the above programs.
  • the memory 1202 may be a read-only memory (Read-Only Memory, ROM) or another type of static storage device capable of storing static information and instructions, a random access memory (Random Access Memory, RAM) or another type of dynamic storage device capable of storing information and instructions, an electrically erasable programmable read-only memory (Electrically Erasable Programmable Read-Only Memory, EEPROM), a compact disc read-only memory (CD-ROM) or other optical disc storage (including compact discs, laser discs, optical discs, digital versatile discs, Blu-ray discs, etc.), a magnetic disk storage medium or other magnetic storage device, or any other medium that can carry or store desired program code in the form of instructions or data structures and can be accessed by a computer, without limitation.
  • the memory can exist independently and be connected to the processor through a bus.
  • the memory can also be integrated with the processor.
  • the electronic device 1200 may also include common components such as a communication module and an antenna, which will not be described in detail here.
  • the initial matching set can be screened, so that the correct matching ratio in the selected matching subset is higher than the correct matching ratio in the initial matching set, which can improve the calculation accuracy of the model parameters of the parametric transformation model , so as to improve the processing effect of the parametric transformation model for processing image tasks.
  • Embodiments of the present application further provide a computer-readable storage medium, where the computer-readable storage medium stores a computer program configured for electronic data exchange, and the computer program causes a computer to perform some or all of the steps of any matching screening method described in the foregoing method embodiments.
  • Embodiments of the present application further provide a computer program product, including a non-transitory computer-readable storage medium storing a computer program, where the computer program is operable to cause a computer to execute some or all of the steps of any matching screening method described in the foregoing method embodiments.
  • the disclosed apparatus may be implemented in other manners.
  • the apparatus embodiments described above are only illustrative; for example, the division of the units is only a logical function division, and there may be other division methods in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the shown or discussed mutual coupling, direct coupling, or communication connection may be implemented through some interfaces as indirect coupling or communication connections between devices or units, and may be in electrical or other forms.
  • the units described as separate components may or may not be physically separated, and components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution in this embodiment.
  • each functional unit in each embodiment may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit.
  • the above-mentioned integrated units can be implemented in the form of hardware, and can also be implemented in the form of software program modules.
  • the integrated unit, if implemented in the form of a software program module and sold or used as a stand-alone product, may be stored in a computer-readable memory.
  • in essence, the technical solution of the present application, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product; the computer software product is stored in a memory and includes instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods described in the embodiments of the present application.
  • the aforementioned memory includes various media that can store program codes, such as U disk, ROM, RAM, mobile hard disk, magnetic disk or optical disk.
  • Embodiments of the present application provide a matching screening method and apparatus, electronic device, computer-readable storage medium, and computer program product.
  • the method includes: acquiring an initial matching set, where the initial matching set is derived from an initial matching result between an image pair; and screening a matching subset from the initial matching set through at least one clipping module, where the correct matching ratio in the matching subset is higher than the correct matching ratio in the initial matching set.
  • the at least one clipping module is used to obtain the consistency information of each initial match in the initial matching set; the matching subset is used for processing image tasks related to the image pair.
  • the embodiments of the present application can improve the processing effect of the parametric transformation model for processing an image task.


Abstract

Embodiments of the present application provide a matching screening method and apparatus, an electronic device, and a computer-readable storage medium. The matching screening method includes: an electronic device acquires an initial matching set, where the initial matching set is derived from an initial matching result between an image pair; a matching subset is screened from the initial matching set through at least one clipping module, where the correct matching ratio in the matching subset is higher than the correct matching ratio in the initial matching set, and the at least one clipping module is used to obtain the consistency information of each initial match in the initial matching set; the matching subset is used to process image tasks related to the image pair.

Description

Matching screening method and apparatus, electronic device, storage medium, and computer program
CROSS-REFERENCE TO RELATED APPLICATIONS
This application is filed based on, and claims priority to, Chinese patent application No. 202011641201.1, filed on December 31, 2020 by the applicant Shanghai SenseTime Technology Development Co., Ltd. under the title "Matching screening method, apparatus, electronic device, and computer-readable storage medium", the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to the field of image processing, and relates to, but is not limited to, a matching screening method and apparatus, an electronic device, a computer-readable storage medium, and a computer program product.
Background
In the fields of computer vision and image processing, feature matching is one of the fundamental research problems. The matches in an initial matching set are generally obtained by selecting consistent points from the two sets of feature points of an image pair based on the Euclidean-distance similarity between the descriptors corresponding to the matching points of the image pair; such matching methods often produce a large number of incorrect matches.
In the related art, a deep-learning neural network model is generally trained on the initial matching set and then performs the corresponding image task. Because the sample distribution in the initial matching set is often unbalanced, if the incorrect matches in the initial matching set far outnumber the correct matches, the learning process of the neural network model is easily disturbed by the incorrect matches, so that the neural network model performs poorly on the image task.
Summary
Embodiments of the present application provide a matching screening method and apparatus, an electronic device, a computer-readable storage medium, and a computer program product, which can improve the processing effect of a parametric transformation model in processing image tasks.
An embodiment of the present application provides a matching screening method, including:
acquiring an initial matching set, where the initial matching set is derived from an initial matching result between an image pair;
screening a matching subset from the initial matching set through at least one clipping module, where the correct matching ratio in the matching subset is higher than the correct matching ratio in the initial matching set, and the at least one clipping module is used to obtain the consistency information of each initial match in the initial matching set;
where the matching subset is used to process image tasks related to the image pair.
In the embodiments of the present application, the initial matching result may be consistent points selected from two sets of feature points by a matching algorithm based on the ratio of the nearest-neighbor to the second-nearest-neighbor Euclidean distance, and each initial match in the initial matching set may include feature information of the corresponding points in the image pair (for example, the feature information of a corresponding point may include a combination of at least one of the coordinates of the point, the pixel value of the point, the grayscale value of the point, and the red (Red, R), green (Green, G), and blue (Blue, B) values of the point). The matches in the initial matching set are not necessarily all correct; there are both correct and incorrect matches, where the correct matching ratio refers to the proportion of all correct matches in the initial matching set to the total number of matches in the initial matching set. The embodiments of the present application can screen the initial matching set so that the correct matching ratio in the screened matching subset is higher than that in the initial matching set. Because the matching subset is screened from the initial matching set and its correct matching ratio is higher, the calculated model parameters are more reliable, which improves the calculation accuracy of the model parameters of the parametric transformation model and, in turn, improves the processing effect of the parametric transformation model in processing image tasks.
In some embodiments of the present application, after the matching subset is screened from the initial matching set through the at least one clipping module, the method further includes:
predicting the initial matching set by using the parametric transformation model to obtain a prediction result of each initial match in the initial matching set, where the prediction result includes a correct match or an incorrect match.
In the embodiments of the present application, the model parameters of the parametric transformation model are calculated using the matching subset, so that the calculated model parameters are more reliable and the parametric transformation model can better predict each initial match in the initial matching set; compared with a neural network model that directly predicts the initial matching set, the accuracy of the prediction result of the parametric transformation model can be improved.
In some embodiments of the present application, the screening of the matching subset from the initial matching set through the at least one clipping module includes:
screening a first matching set through a first clipping module to obtain the matching subset;
in a case where the at least one clipping module includes one clipping module, the first matching set is the initial matching set;
in a case where the at least one clipping module includes at least two clipping modules, the first matching set is obtained by screening through the clipping module preceding the first clipping module.
When one clipping module is used, the embodiments of the present application are applicable to cases where there are relatively few incorrect matches in the initial matching set.
The at least two clipping modules of the embodiments of the present application are neural network learning modules that can screen the initial matching set at least twice, so that the correct matching ratio in the screened matching subset is higher, which improves the calculation accuracy of the model parameters of the parametric transformation model and makes the calculated model parameters more reliable when processing image tasks. This is applicable to cases where there are many incorrect matches in the initial matching set. Because each clipping module learns different features during training, using at least two clipping modules enables dynamic feature learning through at least two rounds of feature learning; compared with training with fixed features, this can increase the proportion of correct matches in the screened matching subset.
In some embodiments of the present application, the screening of the first matching set through the first clipping module to obtain the matching subset includes:
determining local consistency information or global consistency information of a first initial match through the first clipping module, and determining whether the first initial match is classified into the matching subset according to the local consistency information or global consistency information of the first initial match; the first initial match is any item in the first matching set.
In some embodiments of the present application, the screening of the first matching set through the first clipping module to obtain the matching subset includes:
determining local consistency information and global consistency information of a first initial match through the first clipping module, and determining whether the first initial match is classified into the matching subset according to the local consistency information and global consistency information of the first initial match; the first initial match is any item in the first matching set.
在本申请一些实施例中,所述第一裁剪模块包括第一局部一致性学习模块、第一全局一致性学习模块和第一裁剪子模块,所述特征匹配一致性信息包括局部一致性分数和全局一致性分数中的至少一项;
所述通过所述第一裁剪模块确定第一初始匹配的局部一致性信息和全局一致性信息,根据所述第一初始匹配的局部一致性信息和全局一致性信息确定所述第一初始匹配是否为被归入所述匹配子集,包括:
通过所述第一局部一致性学习模块构建针对第一初始匹配的第一局部动态图,计算所述第一初始匹配在所述第一局部动态图的局部一致性分数;所述第一局部动态图包含所述第一初始匹配所在的节点以及与所述第一初始匹配所在的节点相关的K个相关节点;所述K个相关节点是利用K近邻算法基于所述第一初始匹配所在的节点得到的;
通过所述第一全局一致性学习模块构建第一全局动态图,根据所述第一初始匹配在所述第一局部动态图的局部一致性分数和所述第一全局动态图确定所述第一初始匹配的综合一致性分数;所述第一全局动态图包含所有初始匹配所在的节点;
利用所述第一裁剪子模块根据所述第一初始匹配的综合一致性分数确定所述第一初始匹配是否为被归入所述匹配子集。
在本申请一些实施例中,所述第一局部一致性学习模块包括第一特征升维模块、第一动态图构建模块、第一特征降维模块和第一局部一致性分数计算模块;
所述通过所述第一局部一致性学习模块构建针对第一初始匹配的第一局部动态图,计算所述第一初始匹配在所述第一局部动态图的局部一致性分数,包括:
通过所述第一特征升维模块对所述第一初始匹配的初始特征向量进行升维处理,得到所述第一初始匹配的高维特征向量;
利用所述第一局部动态图构建模块通过K近邻算法确定所述第一匹配集合中与所述第一初始匹配的高维特征向量的相关度(欧氏距离)排名靠前的K条相关匹配,基于所述第一初始匹配和所述K条相关匹配构建针对所述第一初始匹配的第一局部动态图,得到所述第一初始匹配的超高维特征向量;所述第一初始匹配的超高维特征向量包括所述第一初始匹配的高维特征向量以及所述第一初始匹配与所述K条相关匹配之间的相关度向量的组合;
利用所述第一特征降维模块对所述第一初始匹配的超高维特征向量进行降维处理，得到第一初始匹配的低维特征向量；
通过所述第一局部一致性分数计算模块基于所述第一初始匹配的低维特征向量计算所述第一初始匹配在所述第一局部动态图的局部一致性分数。
在本申请一些实施例中,所述第一特征降维模块包括第一环状卷积模块和第二环状卷积模块;所述利用所述第一特征降维模块对所述第一初始匹配的超高维特征向量进行降维处理,得到第一初始匹配的低维特征向量,包括:
通过所述第一环状卷积模块对所述第一初始匹配的超高维特征向量按照相关度进行分组,对每组特征向量进行第一次特征聚集处理,得到初步聚集的特征向量;
通过所述第二环状卷积模块对所述初步聚集的特征向量进行第二次特征聚集处理,得到所述第一初始匹配的低维特征向量。
在本申请一些实施例中,所述根据所述第一初始匹配在所述第一局部动态图的局部一致性分数和所述第一全局动态图确定所述第一初始匹配的综合一致性分数,包括:
计算所述第一初始匹配在所述第一全局动态图的全局一致性分数;
根据所述局部一致性分数和所述全局一致性分数确定所述第一初始匹配的综合一致性分数。
在本申请一些实施例中,所述通过所述第一全局一致性学习模块构建第一全局动态图,包括:
通过所述第一全局一致性学习模块根据所述第一匹配集合中每条初始匹配在对应的局部动态图的局部一致性分数构建第一全局动态图;
所述根据所述第一初始匹配在所述第一局部动态图的局部一致性分数和所述第一全局动态图确定所述第一初始匹配的综合一致性分数,包括:
根据所述第一全局动态图和所述第一初始匹配的低维特征向量计算所述第一初始匹配的综合一致性分数。
在本申请一些实施例中,所述第一全局动态图通过邻接矩阵表示,所述根据所述第一全局动态图和所述第一初始匹配的低维特征向量计算所述第一初始匹配的综合一致性分数,包括:
基于所述第一初始匹配的低维特征向量和所述邻接矩阵,利用图形卷积网络计算所述第一初始匹配的综合低维特征向量;
基于所述第一初始匹配的综合低维特征向量计算所述第一初始匹配的综合一致性分数。
在本申请一些实施例中，所述利用所述第一裁剪子模块根据所述第一初始匹配的综合一致性分数确定所述第一初始匹配是否被归入所述匹配子集，包括：
利用所述第一裁剪子模块确定所述第一初始匹配的综合一致性分数是否大于第一阈值,若是,确定所述第一初始匹配归入所述匹配子集;
或者,利用所述第一裁剪子模块确定所述第一初始匹配的综合一致性分数在所述第一匹配集合中按照从大到小的排名,若所述第一初始匹配的排名大于第二阈值,确定所述第一初始匹配归入所述匹配子集。
在本申请一些实施例中,所述通过至少一个裁剪模块从所述初始匹配集合中筛选出匹配子集之前,所述方法还包括:
利用有监督数据集对裁剪模块进行训练,得到训练结果;
通过自适应温度的二分类损失函数对所述训练结果进行评估,按照最小化所述二分类损失函数的方法对所述裁剪模块的参数进行更新。
在本申请一些实施例中,所述通过至少一个裁剪模块从所述初始匹配集合中筛选出匹配子集之前,所述方法还包括:
根据所述图像对相关的图像任务确定所述参数化变换模型所使用的约束关系,所述约束关系包括对极几何约束或重投影误差;
所述通过至少一个裁剪模块从所述初始匹配集合中筛选出匹配子集之后,所述方法还包括:
在所述参数化变换模型使用所述约束关系的情况下,利用所述匹配子集计算所述参数化变换模型的模型参数。
在本申请一些实施例中,所述图像任务包括直线拟合任务、宽基线图像匹配任务、图像定位任务、图像拼接任务、三维重建任务、相机姿态估计任务中的任一种。
本申请实施例提供了一种匹配筛选装置,包括:
获取单元,配置为获取初始匹配集合,所述初始匹配集合来源于图像对之间的初始匹配结果;
筛选单元,配置为通过至少一个裁剪模块从所述初始匹配集合中筛选出匹配子集,所述匹配子集中的正确匹配比例高于所述初始匹配集合中的正确匹配比例;
其中,所述匹配子集用于计算参数化变换模型的模型参数,所述参数化变换模型用于处理与所述图像对相关的图像任务。
本申请实施例提供了一种电子设备,包括处理器和存储器,所述存储器配置为存储计算机程序,所述计算机程序包括程序指令,所述处理器被配置为调用所述程序指令,执行如上述任意一种方法。
本申请实施例提供了一种计算机可读存储介质,其中,上述计算机可读存储介质存储配置为电子数据交换的计算机程序,其中,上述计算机程序使得计算机执行如上述任意一种方法。
本申请实施例提供了一种计算机程序产品,其中,上述计算机程序产品包括存储了计算机程序的非瞬时性计算机可读存储介质,上述计算机程序可操作来使计算机执行如上述任意一种方法。该计算机程序产品可以为一个软件安装包。
本申请实施例中,电子设备获取初始匹配集合,所述初始匹配集合来源于图像对之间的初始匹配结果;通过至少一个裁剪模块从所述初始匹配集合中筛选出匹配子集,所述匹配子集中的正确匹配比例高于所述初始匹配集合中的正确匹配比例,所述至少一个裁剪模块用于获取所述初始匹配集合中每条初始匹配的一致性信息;利用所述匹配子集计算参数化变换模型的模型参数,所述参数化变换模型用于处理与所述图像对相关的图像任务。本申请实施例可以对初始匹配集合进行筛选,使得筛选出的匹配子集中的正确匹配比例高于所述初始匹配集合中的正确匹配比例,可以提高参数化变换模型的模型参数的计算精度,进而提高参数化变换模型处理图像任务的处理效果。
应当理解的是,以上的一般描述和后文的细节描述仅是示例性和解释性的,而非限制本申请实施例。根据下面参考附图对示例性实施例的详细说明,本申请实施例的其它特征及方面将变得清楚。
附图说明
为了更清楚地说明本申请实施例的技术方案,下面将对实施例中所需要使用的附图作简单地介绍,此处的附图被并入说明书中并构成本说明书中的一部分,这些附图示出了符合本申请的实施例,并与说明书一起用于说明本申请的技术方案。应当理解,以下附图仅示出了本申请的某些实施例,因此不应被看作是对范围的限定,对于本领域普通技术人员来讲,在不付出创造性劳动的前提下,还可以根据这些附图获得其他相关的附图。
图1a是本申请实施例提供的一种匹配筛选方法的流程示意图;
图1b是本申请实施例提供的一种用于匹配筛选的一致性学习框架(Consensus Learning framework,CLNet)的结构示意图;
图2a是本申请实施例提供的另一种匹配筛选方法的流程示意图;
图2b是本申请实施例提供的另一种用于匹配筛选的CLNet的结构示意图;
图3是本申请实施例提供的一种第一裁剪模块对初始匹配集合进行筛选的流程示意图;
图4是本申请实施例提供的一种第一局部一致性学习模块的结构示意图;
图5是本申请实施例提供的一种第一特征降维模块的结构示意图;
图6是本申请实施例提供的一种第一环状卷积模块和第二环状卷积模块进行特征聚集的示意图;
图7a是本申请实施例提供的一种计算初始匹配集合中每条初始匹配的综合一致性分数的流程示意图;
图7b是本申请实施例提供的另一种计算初始匹配集合中每条初始匹配的综合一致性分数的流程示意图;
图8是本申请实施例提供的另一种匹配筛选方法的流程示意图;
图9是采用本申请实施例的CLNet方法与采用PointCN方法在直线拟合任务上的拟合效果示意图;
图10是采用本申请实施例的CLNet方法与采用PointCN方法、OANet方法、PointACN方法在直线拟合任务上的L2距离的对比图;
图11为本申请实施例提供的一种匹配筛选装置的结构示意图;
图12是本申请实施例提供的一种电子设备的结构示意图。
具体实施方式
下面将结合本申请实施例中的附图,对本申请实施例中的技术方案进行清楚、完整地描述,显然,所描述的实施例仅仅是本申请一部分实施例,而不是全部的实施例。基于本申请中的实施例,本领域普通技术人员在没有做出创造性劳动前提下所获得的所有其他实施例,都属于本申请保护的范围。
本申请的说明书和权利要求书及上述附图中的术语“第一”、“第二”等是用于区别不同对象,而不是用于描述特定顺序。此外,术语“包括”和“具有”以及它们任何变形,意图在于覆盖不排他的包含。例如包含了一系列步骤或单元的过程、方法、系统、产品或设备没有限定于已列出的步骤或单元,而是可选地还包括没有列出的步骤或单元,或可选地还包括对于这些过程、方法、产品或设备固有的其他步骤或单元。
在本申请中提及“实施例”意味着,结合实施例描述的特定特征、结构或特性可以包含在本申请的至少一个实施例中。在说明书中的各个位置出现该短语并不一定均是指相同的实施例,也不是与其它实施例互斥的独立的或备选的实施例。本领域技术人员显式地和隐式地理解的是,本申请所描述的实施例可以与其它实施例相结合。
本申请实施例所涉及到的电子设备可以包括具有运算能力的设备,比如,个人电脑、手机、服务器、人脸识别设备、人脸通行设备、图像处理设备、虚拟现实设备等。为方便描述,可以将上面提到的设备统称为电子设备。
特征匹配筛选是计算机视觉和图像处理领域的基础研究问题之一,其目的是从包含错误匹配(噪声或其它干扰)的初始特征匹配集合中筛选正确的匹配。由于错误匹配的分布具有任意性,且在初始匹配集合中占据了主导地位,因此,需要提供一种能够在大量错误匹配的干扰下识别正确匹配的匹配筛选方法。此外,真实世界中包含的旋转、平移、尺度、视角变化和光照变化等多种因素增加了特征匹配筛选的难度。
相关技术中,主要有基于非机器学习和基于机器学习这两种特征匹配筛选方法;对于基于非机器学习的方法,该类方法基于人工设计的规则进行特征匹配筛选,多依赖于假设或先验知识,不需要进行复杂的学习和训练;但由于所依赖的假设或先验知识在特定噪声下失效,导致该类方法对多种噪声的鲁棒性较差;对于基于机器学习的方法,该类方法将特征匹配筛选建模为二分类问题,使用深度学习的网络学习预测初始匹配集合中所有匹配的类别,即正确匹配或错误匹配。但由于初始匹配集合中样本分布不平衡,错误匹配的数量远多于正确匹配,该类方法的学习过程易受干扰,从而导致难以一次性识别出所有潜在的正确匹配。
针对上述技术问题,本申请实施例提出了一种匹配筛选方法、装置、电子设备、计算机可读存储介质和计算机程序产品。
请参阅图1a,图1a是本申请实施例提供的一种匹配筛选方法的流程示意图。如图1a所示,该匹配筛选方法可以包括如下步骤。
步骤101,电子设备获取初始匹配集合,初始匹配集合来源于图像对之间的初始匹配结果。
本申请实施例中,初始匹配集合可以包括多条初始匹配,初始匹配集合中的每条初始匹配可以包括图像对中对应的点的特征信息(比如,图像对中对应的点的特征信息可以包括对应的点的坐标、对应的点的像素值、对应的点的灰度值、对应的点的RGB值中的至少一种的组合)。图像对是图像任务中用到的一对图像,一般包括两张图像:第一图像和第二图像。举例来说,初始匹配结果可以是基于逐像素匹配算法从第一图像和第二图像中分别选择的匹配一致的像素点。匹配一致的像素点,可以是第一图像和第二图像中相对应的像素点。比如,第一图像是从一个角度拍摄的一栋大楼,第二图像是从另一个角度拍摄的该栋大楼,匹配一致的像素点可以是该栋大楼的同样的位置在第一图像的像素点以及在第二图像的像素点。
步骤102,电子设备通过至少一个裁剪模块从初始匹配集合中筛选出匹配子集,匹配子集中的正确匹配比例高于初始匹配集合中的正确匹配比例。
其中,所述至少一个裁剪模块用于获取所述初始匹配集合中每条初始匹配的一致性信息,所述匹配子集用于处理与所述图像对相关的图像任务。
初始匹配的一致性信息用于衡量初始匹配在整个图像中与其他初始匹配的一致性,在本申请一些实施例中,一致性可以包括匹配在朝向、旋转、平移等维度上的一致性。
本申请实施例中,初始匹配集合中的匹配不一定都是正确的,有正确匹配,也有错误匹配,其中,正确匹配比例指的是初始匹配集合中所有正确匹配的数量占初始匹配集合的总数量的比例。
比如,步骤102可以采用经过训练的神经网络学习模型(至少一个裁剪模块)对初始匹配集合进行筛选,使得筛选出的匹配子集中的正确匹配比例高于初始匹配集合中的正确匹配比例。本申请实施例中的至少一个裁剪模块都是训练好的神经网络学习模型。
在本申请一些实施例中,步骤102可以包括如下步骤:
电子设备通过第一裁剪模块对第一匹配集合进行筛选,得到匹配子集;
在所述至少一个裁剪模块包括一个裁剪模块的情况下,所述第一匹配集合为所述初始匹配集合;
在所述至少一个裁剪模块包括至少两个裁剪模块的情况下,所述第一匹配集合是通过所述第一裁剪模块的上一个裁剪模块筛选得到的。
在本申请一些实施例中,电子设备通过第一裁剪模块对第一匹配集合进行筛选,得到匹配子集,可以包括如下步骤:
电子设备通过所述第一裁剪模块确定第一初始匹配的局部一致性信息和全局一致性信息，根据所述第一初始匹配的局部一致性信息和全局一致性信息确定所述第一初始匹配是否被归入所述匹配子集；所述第一初始匹配为所述第一匹配集合中的任意一条。
其中,局部一致性信息是第一初始匹配在图像的局部区域的一致性,全局一致性信息是第一初始匹配在整幅图像的一致性。
在本申请一些实施例中,在所述至少一个裁剪模块包括一个裁剪模块的情况下,电子设备通过该一个裁剪模块确定所述初始匹配集合中每条初始匹配的局部一致性信息和全局一致性信息,根据每条初始匹配的局部一致性信息和全局一致性信息从所述初始匹配集合中筛选出匹配子集。
本申请实施例采用训练好的一个裁剪模块,可以适用于初始匹配集合中错误匹配较少的情况。裁剪模块是神经网络学习模块,由于裁剪模块在训练过程中可以学习到特征,与采用固定特征进行训练相比,可以提高筛选出的匹配子集中正确匹配的比例。
在本申请一些实施例中,电子设备通过第一裁剪模块对第一匹配集合进行筛选,得到匹配子集,可以包括如下步骤:
电子设备通过所述第一裁剪模块确定第一初始匹配的局部一致性信息或全局一致性信息，根据所述第一初始匹配的局部一致性信息或全局一致性信息确定所述第一初始匹配是否被归入所述匹配子集；所述第一初始匹配为所述第一匹配集合中的任意一条。
本申请实施例中,仅考虑第一初始匹配的局部一致性信息的情况下,也可以实现第一裁剪模块的筛选功能,无需考虑全局一致性信息,对于全局一致性差别不大的图像而言,可以节省匹配筛选所需的计算量,快速实现匹配筛选。
仅考虑第一初始匹配的全局一致性信息的情况下,也可以实现第一裁剪模块的筛选功能,无需考虑局部一致性信息,对于局部一致性差别不大的图像而言,可以节省匹配筛选所需的计算量,快速实现匹配筛选。
在所述至少一个裁剪模块包括至少两个裁剪模块的情况下,电子设备通过至少两个裁剪模块进行至少两次筛选,后一个裁剪模块用于对前一个裁剪模块筛选出的匹配集合进一步进行筛选,直到最后一个裁剪模块筛选出匹配集合为止。
本申请实施例中,第一裁剪模块是至少两个裁剪模块中的最后一个。
本申请实施例的至少两个裁剪模块是训练好的神经网络学习模块,可以对初始匹配集合进行至少两次筛选,从而使得筛选出的匹配子集中的正确匹配比例较高,进而提高参数化变换模型的模型参数的计算精度,使得计算的模型参数在处理图像任务时的可靠性较高,参数化变换模型可用于图像拼接和三维重建等任务。
裁剪模块可以通过大量的有监督样本(预先知道的正确匹配样本)进行训练,对每个匹配进行预测,并计算训练的损失,当训练的损失小于设定值时,确定该裁剪模块为训练好的裁剪模块。
其中,训练好的至少两个裁剪模块不是对初始匹配集合同时筛选,而是按顺序逐个筛选,即,上一个裁剪模块筛选后的输出结果作为下一个裁剪模块的输入。举例来说,如果至少两个裁剪模块包括2个裁剪模块:裁剪模块1和裁剪模块2,则裁剪模块1对初始匹配集合进行第一次筛选,得到匹配集合1;裁剪模块2对匹配集合1进行第二次筛选,得到匹配子集。举例来说,如果至少两个裁剪模块包括3个裁剪模块:裁剪模块1、裁剪模块2和裁剪模块3,则裁剪模块1对初始匹配集合进行第一次筛选,得到匹配集合1;裁剪模块2对匹配集合1进行第二次筛选,得到匹配集合2;裁剪模块3对匹配集合2进行第三次筛选,得到匹配子集。
在本申请一些实施例中,如果裁剪模块的数量为3个,初始匹配集合的数量为10000条,裁剪模块每次筛选50%,则筛选出来的匹配子集中的匹配数量为1250条。由于裁剪模块充分考虑了每条匹配的局部一致性和全局一致性,匹配子集中的正确匹配的比例远高于初始匹配集合中的正确匹配的比例。匹配子集中的错误匹配比例很小,当该匹配子集用于直线拟合任务时,受到错误匹配的干扰也较小,从而提高直线拟合任务的处理效果。
可见,本申请实施例通过裁剪模块对初始匹配集合进行多次裁剪,逐步剔除错误匹配的数量,进而,可以缓解初始匹配集合中样本分布不平衡以及错误匹配分布任意性的问题。
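上述"每次筛选50%、三个裁剪模块、10000条初始匹配"的数量变化，可以用如下示意代码复核（函数名pruned_count、参数keep_ratio均为本文补充的示例约定）：

```python
def pruned_count(n_initial, keep_ratio, num_modules):
    """逐级裁剪后剩余的匹配数量: 每个裁剪模块保留 keep_ratio 比例的匹配,
    上一个裁剪模块的输出作为下一个裁剪模块的输入。"""
    n = n_initial
    for _ in range(num_modules):
        n = int(n * keep_ratio)
    return n
```

可以看出，逐级裁剪在不直接求解二分类问题的情况下，把候选集合的规模快速收缩到原来的12.5%（三级、各保留50%时）。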
请参阅图1b,图1b是本申请实施例提供的一种用于匹配筛选的CLNet的结构示意图。如图1b所示,该CLNet包括至少两个裁剪模块和一个参数化变换模型,其中,N表示初始匹配集合中的初始匹配的数量,4表示初始匹配的4维坐标(比如,第一图像中的第一像素点的坐标位置以及第二图像中与第一图像中的该像素点匹配的第二像素点的坐标位置组成的4维坐标)。通过K(K大于或等于2)个裁剪模块(基于局部到全局的一致性学习的裁剪模块)逐步对初始匹配集合进行筛选,得到匹配子集(匹配子集包含N1个候选匹配),参数化变换模型的模型参数是基于N1个候选匹配计算得到的。其中,每个裁剪模块均可以包括局部一致性学习模块、全局一致性学习模块和裁剪子模块。
本申请实施例可以适用于初始匹配集合中错误匹配较多的情况。由于每个裁剪模块在训练过程中学习的特征都不一样，采用至少两个裁剪模块，通过至少两次的特征学习，可以实现动态特征学习；与采用固定特征训练相比，可以提高筛选出的匹配子集中正确匹配的比例。
其中，电子设备可以利用匹配子集计算参数化变换模型的模型参数，参数化变换模型用于处理与图像对相关的图像任务。
本申请实施例中，参数化变换模型可以用于对初始匹配集合中每条初始匹配进行预测，预测每条初始匹配是正确匹配或错误匹配。参数化变换模型的模型参数是基于匹配子集计算得到的，举例来说，模型参数可以是本质矩阵（essential matrix）。匹配子集是从初始匹配集合中筛选出来的，匹配子集中的正确匹配比例较高，使得计算的模型参数的可靠性较高，从而提高参数化变换模型的模型参数的计算精度，进而提高参数化变换模型处理图像任务的处理效果。
其中,与图像对相关的图像任务可以包括直线拟合(line fitting)任务、宽基线图像匹配(wide-baseline image matching)任务、图像定位(image localization)任务、图像拼接任务、三维重建任务中的任一种。
在本申请一些实施例中,在执行步骤102之前,图1的方法还可以执行如下步骤:
步骤11:电子设备利用有监督数据集对裁剪模块进行训练,得到训练结果;
步骤12:电子设备通过自适应温度的二分类损失函数对所述训练结果进行评估,按照最小化所述二分类损失函数的方法对所述裁剪模块的参数进行更新,得到训练好的裁剪模块。
本申请实施例中，尽管使用常规的二进制交叉熵损失进行的训练取得了令人满意的效果，但对于对极距离在d_thr（d_thr表示设定距离）附近的匹配，这种训练方式仍然存在不可避免的标签模糊（即，对极距离在d_thr附近的匹配，可能被判定为正确匹配，也可能被判定为错误匹配）。由于匹配c_i的置信度应与对应的极线距离d_i负相关，即d_i越接近0，越可能被判断为正确匹配；因此，本申请实施例对于推定的正确匹配（d_i<d_thr）引入了一个自适应温度，其计算公式可以用公式(1)所示的高斯核τ_i来表示。
τ_i=exp(-||d_i-d_thr||/(α·d_thr))          (1)
其中，α是高斯核的内核宽度；对于d_i>=d_thr的离群值c_i，将τ_i设为1。由于极点的固有歧义性，无法完全消除标签模糊的问题，本申请实施例将训练目标用公式(2)进行描述：
Figure PCTCN2021095170-appb-000001
其中，L_reg表示参数化变换模型
Figure PCTCN2021095170-appb-000002
的回归损失，λ是加权因子。本申请实施例提出的自适应温度的二分类损失函数如公式(3)所示：
Figure PCTCN2021095170-appb-000003
其中，
Figure PCTCN2021095170-appb-000004
是第j个裁剪模块的局部一致性学习层的输出，
Figure PCTCN2021095170-appb-000005
是第j个裁剪模块的全局一致性学习层的输出，
Figure PCTCN2021095170-appb-000006
是最后一个裁剪模块的最后一个多层感知器（Multi-Layer Perceptron，MLP）的输出（w=tanh(ReLU(o))）；H(o)=σ(τ·o)，其中σ为sigmoid激活函数；y_j和
Figure PCTCN2021095170-appb-000007
表示二进制的真实标签数据集；L_bce表示二进制交叉熵损失；K是裁剪模块的数量。因此，对于具有较小d_i的正确匹配c_i而言，以较小的温度进行模型优化，相当于施加更强的正则化。
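公式(1)的自适应温度以及置信度调制H(o)=σ(τ·o)可以用如下Python草图表示（α的取值与函数名均为本文补充的示例假设）：

```python
import math

def adaptive_temperature(d_i, d_thr, alpha=1.0):
    """公式(1): 对推定的正确匹配(d_i < d_thr)计算高斯核温度τ_i;
    对 d_i >= d_thr 的离群值, 温度设为1。alpha为高斯核的内核宽度。"""
    if d_i >= d_thr:
        return 1.0
    return math.exp(-abs(d_i - d_thr) / (alpha * d_thr))

def modulated_confidence(o, tau):
    """H(o) = sigmoid(tau * o): 温度tau越小, 输出越平缓(越接近0.5)。"""
    return 1.0 / (1.0 + math.exp(-tau * o))
```

d_i越小，温度τ_i越小，σ(τ·o)的输出越平缓，相当于对这类推定正确匹配施加更强的正则化，降低对模糊标签的过度自信。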
本申请实施例可以对初始匹配集合进行筛选,使得筛选出的匹配子集中的正确匹配比例高于所述初始匹配集合中的正确匹配比例,利用匹配子集计算参数化变换模型的模型参数,可以提高参数化变换模型的模型参数的计算精度,进而提高参数化变换模型处理图像任务的处理效果。
请参阅图2a,图2a是本申请实施例提供的另一种匹配筛选方法的流程示意图。图2a是在图1a的基础上进一步优化得到的,如图2a所示,该匹配筛选方法可以包括如下步骤。
步骤201,电子设备获取初始匹配集合,初始匹配集合来源于图像对之间的初始匹配结果。
步骤202,电子设备通过至少一个裁剪模块从初始匹配集合中筛选出匹配子集,至少一个裁剪模块用于获取初始匹配集合中每条初始匹配的一致性信息,匹配子集中的正确匹配比例高于初始匹配集合中的正确匹配比例,匹配子集用于处理与图像对相关的图像任务。
在本申请一些实施例中,匹配子集用于计算参数化变换模型的模型参数,参数化变换模型用于处理与图像对相关的图像任务。
其中,步骤201至步骤202的实施方式可以参见图1a的步骤101至步骤102,此处不再赘述。
步骤203,电子设备利用参数化变换模型对初始匹配集合进行预测,得到初始匹配集合中每条初始匹配的预测结果,预测结果包括正确匹配或错误匹配。
本申请实施例中,参数化变换模型的模型参数采用匹配子集进行计算,使得计算的模型参数的可靠性较高,参数化变换模型可以对初始匹配集合中的每条初始匹配进行更好的预测,与直接对初始匹配集合进行预测的神经网络模型相比,可以提高该参数化变换模型的预测结果的准确度。
请参阅图2b,图2b是本申请实施例提供的另一种用于匹配筛选的CLNet的结构示意图。如图2b所示,该CLNet包括至少两个裁剪模块、一个参数化变换模型和全尺寸预测模块,其中,N表示初始匹配集合中的初始匹配的数量,4表示初始匹配的4维坐标(比如,第一图像中的第一像素点的坐标位置以及第二图像中与第一图像中的该像素点匹配的第二像素点的坐标位置组成的4维坐标)。通过K(K大于或等于2)个裁剪模块(基于局部到全局的一致性学习的裁剪模块)逐步对初始匹配集合进行筛选,得到匹配子集(匹配子集包含N1个候选匹配),参数化变换模型的模型参数是基于N1个候选匹配计算得到的,全尺寸预测模块用于对初始匹配集合中的N条初始匹配进行预测(即,全尺寸预测),可得出初始匹配对集合中每个初始匹配对的预测结果(预测结果包括正确匹配或错误匹配)。其中,每个裁剪模块均可以包括局部一致性学习模块、全局一致性学习模块和裁剪子模块。
目前,准确的像素特征匹配是解决计算机视觉、机器学习等许多重要的图像任务的前提。例如,运动恢复结构(Structure From Motion,SfM)、同步定位和地图绘制(Simultaneous Location And Mapping,SLAM)、图像拼接、视觉定位和虚拟现实等。SfM在计算机视觉领域指的是,通过分析物体的2D运动图像得到3D结构信息的过程。然而,真实世界中的图片往往包含旋转、平移、尺度、视角变化和光照变化等多种因素,使得匹配筛选方法这一问题极具挑战性。
在目前的基于学习的方法中,通常将匹配筛选作为一种匹配分类任务,其中采用MLP对匹配进行分类(正确匹配或错误匹配),然而对此类二元分类问题的优化并非易事,匹配可能极不平衡,比如,离群值(错误匹配)占比高达90%以上。因此,通过MLP直接预测初始匹配集合中的正确匹配结果的准确性较低。
采用图2a所示的方法,使得计算的模型参数的可靠性较高,参数化变换模型可以对初始匹配集合中的每条初始匹配进行更好的预测,与直接对初始匹配集合进行预测的神经网络模型相比,可以提高该参数化变换模型的预测结果的准确度。
请参阅图3,图3是本申请实施例提供的一种第一裁剪模块对初始匹配集合进行筛选的流程示意图,如图3所示,该方法可以包括如下步骤。
步骤301,电子设备通过第一局部一致性学习模块构建针对第一初始匹配的第一局部动态图,计算第一初始匹配在第一局部动态图的局部一致性分数;第一局部动态图包含第一初始匹配所在的节点以及与第一初始匹配所在的节点相关的K个相关节点;K个相关节点是利用K近邻算法基于第一初始匹配所在的节点得到的。
本申请实施例中,第一裁剪模块包括第一局部一致性学习模块、第一全局一致性学习模块和第一裁剪子模块。其中,第一裁剪模块是训练好的至少两个裁剪模块中的第一个。
第一局部一致性学习模块可以构建第一局部动态图,计算第一初始匹配在第一局部动态图的局部一致性分数。第一全局一致性学习模块可以构建第一全局动态图,计算第一初始匹配在第一全局动态图的全局一致性分数,也可以计算第一初始匹配的综合一致性分数。
第一局部动态图是根据初始匹配的初始特征向量映射到高维特征向量后,根据第一初始匹配的高维特征向量与其他初始匹配之间的相关性构建的。每条初始匹配映射到第一局部动态图的一个节点。第一初始匹配所在的节点是该第一初始匹配映射到第一局部动态图的节点。比如,可以按照K近邻算法(K-Nearest Neighbor,KNN)找到与第一初始匹配所在的节点最接近的K个相关节点,将第一初始匹配所在的节点与这K个相关节点组成的图作为第一局部动态图。命名为动态图,主要是因为初始匹配从初始特征向量映射到高维特征向量后,每次通过K近邻算法找到的节点不一定相同,是动态变化的。
第一初始匹配在第一局部动态图的局部一致性分数,用于衡量第一初始匹配在局部的一致性,如果第一初始匹配是正确匹配,则其在局部的一致性较好,局部一致性分数较高;如果第一初始匹配是错误匹配,则其在局部的一致性较差,局部一致性分数较低。
本申请实施例,通过动态图的方法计算匹配在局部区域的一致性分数以及全局区域的一致性分数,可以确保在裁剪过程中仅保留一致性较高的可靠匹配。
在本申请一些实施例中,请参阅图4,图4是本申请实施例提供的一种第一局部一致性学习模块的结构示意图。如图4所示,所述第一局部一致性学习模块包括第一特征升维模块、第一动态图构建模块、第一特征降维模块和第一局部一致性分数计算模块。
步骤301可以包括如下步骤:
步骤21:电子设备通过所述第一特征升维模块对所述第一初始匹配的初始特征向量进行升维处理,得到所述第一初始匹配的高维特征向量;
步骤22:电子设备利用所述第一局部动态图构建模块通过K近邻算法确定所述第一匹配集合中与所述第一初始匹配的高维特征向量的相关度(比如,根据欧氏距离确定的相关度)排名靠前的K条相关匹配,基于所述第一初始匹配和所述K条相关匹配构建针对所述第一初始匹配的第一局部动态图,得到所述第一初始匹配的超高维特征向量;所述第一初始匹配的超高维特征向量包括所述第一初始匹配的高维特征向量以及所述第一初始匹配与所述K条相关匹配之间的相关度向量的组合;
步骤23:电子设备利用所述第一特征降维模块对所述第一初始匹配的超高维特征向量进行降维处理,得到第一初始匹配的低维特征向量;
步骤24:电子设备通过所述第一局部一致性分数计算模块基于所述第一初始匹配的低维特征向量计算所述第一初始匹配在所述第一局部动态图的局部一致性分数。
本申请实施例中，第一特征升维模块可以是训练好的深度神经网络模块，比如训练好的残差网络，残差网络可以包括多个残差模块（ResNet Block），比如，可以包括4个残差模块（ResNet Block）。第一特征升维模块可以将第一匹配集合中每个初始匹配的初始特征向量进行升维处理，得到每个初始匹配的高维特征向量。第一动态图构建模块、第一特征降维模块和第一局部一致性分数计算模块均可以是训练好的深度神经网络模块。举例来说，第一特征降维模块可以包括多个残差模块（ResNet Block），第一局部一致性分数计算模块可以包括MLP。
其中,第一初始匹配的初始特征向量可以是四维向量,包括第一初始匹配在图像对的第一图像中的第一像素点的坐标和第一初始匹配在图像对的第二图像中的第二像素点的坐标的组合。举例来说,如果第一像素点的坐标为(x1,y1),第二像素点的坐标为(x2,y2),则第一初始匹配的初始特征向量p1=(x1,y1,x2,y2)。第一初始匹配的高维特征向量可以是128维向量。
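步骤22中"基于高维特征向量用K近邻算法构图、并拼接得到超高维特征向量"的过程，可以用numpy大致示意如下（真实实现中升维由残差网络完成、特征在训练中学习得到，此处直接以给定的特征矩阵为输入，函数名与维度均为示例假设）：

```python
import numpy as np

def build_local_graph_features(Z, k):
    """对每条匹配的高维特征 z_i, 按欧氏距离找 k 条最相关匹配(K近邻),
    并拼接残差 Δz_i = z_i - z_i^j, 得到 (N, k, 2*D) 的超高维特征。"""
    N, D = Z.shape
    dist = np.linalg.norm(Z[:, None, :] - Z[None, :, :], axis=-1)  # (N, N) 两两欧氏距离
    np.fill_diagonal(dist, np.inf)                                 # 排除匹配自身
    knn_idx = np.argsort(dist, axis=1)[:, :k]                      # 每条匹配的k个相关节点
    neighbors = Z[knn_idx]                                         # (N, k, D)
    delta = Z[:, None, :] - neighbors                              # 残差 Δz_i
    center = np.broadcast_to(Z[:, None, :], (N, k, D))
    return np.concatenate([center, delta], axis=-1)                # (N, k, 2D)
```

输出的(N, k, 2D)张量中，前D维是匹配自身的特征z_i，后D维是与k个相关节点的残差Δz_i，对应上文[z_i, Δz_i]形式的升维方式。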
在本申请一些实施例中,请参阅图5,图5是本申请实施例提供的一种第一特征降维模块的结构示意图。所述第一特征降维模块包括第一环状卷积模块和第二环状卷积模块。
步骤23可以包括如下步骤:
步骤231:电子设备通过所述第一环状卷积模块对所述第一初始匹配的超高维特征向量按照相关度进行分组,对每组特征向量进行第一次特征聚集处理,得到初步聚集的特征向量;
步骤232:电子设备通过所述第二环状卷积模块对所述初步聚集的特征向量进行第二次特征聚集处理,得到所述第一初始匹配的低维特征向量。
本申请实施例中，第一环状卷积（annular convolution）模块对所述第一初始匹配的超高维特征向量按照相关度进行分组，每组特征向量的维度相同。比如，按照相关度排名每10%分为一组（排名前10%为一组，前10%~20%为一组，依此类推），总共分成10组。
第一环状卷积模块将第一初始匹配的超高维特征向量按照相关度进行分组后,将每组特征向量聚集成一个特征向量。比如,超高维特征向量是k*128维,可以将k*128维分成p组:(p×k/p)×128,第一环状卷积模块对(p×k/p)×128进行第一次特征聚集处理,得到初步聚集的特征向量为k/p×128维。第二环状卷积模块可以将k/p×128聚集成1×128维的低维特征向量。其中,第一环状卷积模块中学习到的矩阵中的参数与第二环状卷积模块中学习到的矩阵中的参数不共享。矩阵中的参数,指的是矩阵中的元素的值。
请参阅图6,图6是本申请实施例提供的一种第一环状卷积模块和第二环状卷积模块进行特征聚集的示意图。如图6所示,对于第一初始匹配c 1而言,通过K近邻算法确定的K条相关匹配都反映在图6的第一局部动态图中,图6以K等于12为例进行说明,第一初始匹配c 1所在的节点与这K=12个相关节点组成的图作为第一局部动态图,12个相关节点按照与第一初始匹配c 1所在的节点的相关度(比如,欧式距离)被分成3组,然后通过第一环状卷积模块进行第一次特征聚集处理,第一环状卷积模块对(p×k/p)×128进行第一次特征聚集处理,得到初步聚集的特征向量为(k/p)×128维。第二环状卷积模块可以将(k/p)×128聚集成1×128维的低维特征向量。
本申请实施例采用环状卷积模块根据第一初始匹配的超高维特征向量按照相关度进行分组后降维,充分考虑了第一初始匹配的局部一致性,使得降维后的第一初始匹配的低维特征向量依然保留了第一初始匹配的局部一致性,从而提高了第一初始匹配在所述第一局部动态图的局部一致性分数的计算结果的准确性。
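上述"按相关度分组、两次特征聚集"的降维过程可以示意如下（真实实现中的两个环状卷积各自带有可学习且互不共享的参数，此处以均值聚集代替卷积，仅作假设性草图）：

```python
import numpy as np

def annular_aggregate(feat, p):
    """feat: (k, C) 按相关度从高到低排序的k条相关匹配的特征。
    分组: k×C -> (p, k/p, C), 对应文中的 (p×k/p)×128;
    第一次聚集: 跨组聚集 -> (k/p, C), 即初步聚集的特征向量;
    第二次聚集: 聚成单个向量 -> (1, C), 即低维特征向量。"""
    k, C = feat.shape
    grouped = feat.reshape(p, k // p, C)          # 按相关度顺序分成p组
    stage1 = grouped.mean(axis=0)                 # 第一次特征聚集: (k/p, C)
    stage2 = stage1.mean(axis=0, keepdims=True)   # 第二次特征聚集: (1, C)
    return stage1, stage2
```

以图6的K=12、p=3为例，特征先聚集为4×C的初步聚集特征，再聚集为1×C的低维特征。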
步骤302,电子设备通过第一全局一致性学习模块构建第一全局动态图,根据第一初始匹配在第一局部动态图的局部一致性分数和第一全局动态图确定第一初始匹配的综合一致性分数。
本申请实施例中,第一全局动态图包含了所有初始匹配所在的节点,可以通过第一全局动态图确定第一初始匹配在第一全局动态图的全局一致性分数,根据第一初始匹配在第一局部动态图的局部一致性分数和第一初始匹配在第一全局动态图的全局一致性分数确定第一初始匹配的综合一致性分数。还可以根据第一初始匹配在第一局部动态图的局部一致性分数和第一全局动态图确定第一初始匹配的综合一致性分数。第一初始匹配的综合一致性分数是综合了第一初始匹配在第一局部动态图的局部一致性分数和第一初始匹配在第一全局动态图的全局一致性分数得到的。
在本申请一些实施例中,步骤302中,电子设备根据第一初始匹配在第一局部动态图的局 部一致性分数和第一全局动态图确定第一初始匹配的综合一致性分数可以包括如下步骤:
步骤31:通过第一全局一致性学习模块计算所述第一初始匹配在所述第一全局动态图的全局一致性分数;
步骤32:第一全局一致性学习模块根据所述局部一致性分数和所述全局一致性分数确定所述第一初始匹配的综合一致性分数。
本申请实施例中,第一全局一致性学习模块可以计算第一初始匹配在第一全局动态图的全局一致性分数,根据局部一致性分数和全局一致性分数确定所述第一初始匹配的综合一致性分数。在本申请一些实施例中,第一全局一致性学习模块可以将局部一致性分数和全局一致性分数直接相加,将局部一致性分数和全局一致性分数之和作为综合一致性分数。第一全局一致性学习模块还可以根据加权算法计算综合一致性分数。
在本申请一些实施例中,步骤302中,电子设备通过所述第一全局一致性学习模块构建第一全局动态图可以包括如下步骤:
步骤41:电子设备通过所述第一全局一致性学习模块根据所述第一匹配集合中每条初始匹配在对应的局部动态图的局部一致性分数构建第一全局动态图;
步骤302中,电子设备根据所述第一初始匹配在所述第一局部动态图的局部一致性分数和所述第一全局动态图确定所述第一初始匹配的综合一致性分数,可以包括:
步骤42:电子设备根据所述第一全局动态图和所述第一初始匹配的低维特征向量计算所述第一初始匹配的综合一致性分数。
本申请实施例中,不会直接计算全局一致性分数,而是在局部一致性分数和所述第一全局动态图的基础上直接计算综合一致性分数,减少了全局一致性分数的计算过程,提高了综合一致性分数的计算效率。
在本申请一些实施例中，所述第一全局动态图通过邻接矩阵表示，步骤42可以包括如下步骤：
步骤421:电子设备基于所述第一初始匹配的低维特征向量和所述邻接矩阵,利用图形卷积网络(Graph Convolutional Network,GCN)计算所述第一初始匹配的综合低维特征向量;
步骤422:电子设备基于所述第一初始匹配的综合低维特征向量计算所述第一初始匹配的综合一致性分数。
本申请实施例中,第一全局动态图是第一匹配集合中每条初始匹配在对应的局部动态图的局部一致性分数构建的。邻接矩阵为第一匹配集合中每条初始匹配在对应的局部动态图的局部一致性分数组成的矩阵与对应的转置矩阵相乘得到的。比如,第一匹配集合中每条初始匹配在对应的局部动态图的局部一致性分数组成的矩阵为N×1,对应的转置矩阵为1×N,二者相乘得到N×N的矩阵,即为邻接矩阵。
若图形卷积网络经过训练后学习到的矩阵为W_g，邻接矩阵为A，第一匹配集合中每条初始匹配的低维特征向量组成的矩阵为Z，则图形卷积网络输出的结果如公式(4)所示：
out=L·Z·W_g        (4)
其中，L=D^(-1/2)·A′·D^(-1/2)；D为A′的度数对角矩阵（diagonal degree matrix），A′=A+I_N，I_N是N×N的单位矩阵，用于保证数值稳定性。L为N×N的矩阵，Z为N×128的矩阵，W_g为128×128的矩阵。
得到图形卷积网络输出的结果out,将图形卷积网络输出的结果out加上第一匹配集合中每条初始匹配的低维特征向量组成的矩阵Z,即可得到第一匹配集合中每条初始匹配的综合低维特征向量,每条初始匹配的综合低维特征向量通过残差模块进行处理后,输入MLP,MLP对第一匹配集合中每条初始匹配的综合低维特征向量进行降维,计算第一匹配集合中每条初始匹配的综合一致性分数。
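公式(4)的图卷积传播过程可以用numpy示意如下（W_g在真实实现中是训练得到的128×128参数矩阵，此处以任意矩阵代入；函数名为示例假设，并假设局部一致性分数非负以保证度为正）：

```python
import numpy as np

def gcn_propagate(scores, Z, Wg):
    """scores: (N, 1) 每条匹配的局部一致性分数; Z: (N, C) 低维特征; Wg: (C, C).
    邻接矩阵 A = s·sᵀ, A' = A + I_N,
    L = D^(-1/2)·A'·D^(-1/2), 公式(4): out = L·Z·Wg,
    再与Z相加得到每条匹配的综合低维特征向量。"""
    N = Z.shape[0]
    A = scores @ scores.T                    # (N, N) 邻接矩阵(全局动态图)
    A_hat = A + np.eye(N)                    # A' = A + I_N, 保证数值稳定性
    deg = A_hat.sum(axis=1)                  # 每个节点的度
    D_inv_sqrt = np.diag(1.0 / np.sqrt(deg))
    L = D_inv_sqrt @ A_hat @ D_inv_sqrt      # 对称归一化
    return L @ Z @ Wg + Z                    # 加上Z, 得到综合低维特征向量
```

输出的综合低维特征向量随后经残差模块处理并输入MLP降维，计算每条匹配的综合一致性分数。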
步骤303，电子设备利用第一裁剪子模块根据第一初始匹配的综合一致性分数确定第一初始匹配是否被归入匹配子集。
本申请实施例中，第一初始匹配的综合一致性分数越高，表明第一初始匹配是正确匹配的可能性越大。可以将第一匹配集合中综合一致性分数较高的初始匹配归入匹配子集，也可以将第一匹配集合中的初始匹配按照综合一致性分数从大到小排序，将排序靠前的初始匹配归入匹配子集。本申请实施例针对每个初始匹配，都计算其综合一致性分数，可以通过一个简单的指标（综合一致性分数）对第一匹配集合中的每个初始匹配进行归类，综合考虑了每个初始匹配的局部一致性和全局一致性，可以通过第一裁剪模块从第一匹配集合中筛选出较多的正确匹配，为后续筛选出匹配子集打下较好的基础。其中，筛选后的匹配集合中的正确匹配比例高于所述第一匹配集合中的正确匹配比例。
在本申请一些实施例中,步骤303可以包括如下步骤:
电子设备利用所述第一裁剪子模块确定所述第一初始匹配的综合一致性分数是否大于第一阈值,若是,确定所述第一初始匹配归入所述匹配子集;
或者,电子设备利用所述第一裁剪子模块确定所述第一初始匹配的综合一致性分数在所述第一匹配集合中按照从大到小的排名,若所述第一初始匹配的排名大于第二阈值,确定所述第一初始匹配归入所述匹配子集。
本申请实施例中,可以根据第一匹配集合中每个初始匹配的综合一致性分数对第一匹配集合进行筛选,将第一匹配集合中综合一致性分数大于第一阈值的初始匹配归入匹配子集。也可以根据第一匹配集合每个初始匹配的综合一致性分数按照从大到小的排名进行筛选,将第一匹配集合中综合一致性分数的排名大于第一阈值的初始匹配归入匹配子集。
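上述两种筛选准则（综合一致性分数超过阈值、或按从大到小的排名保留靠前者）可以示意如下（其中"排名大于第二阈值"按"排名进入前top_n名"的方式理解，函数名均为本文补充的示例假设）：

```python
def select_by_threshold(scores, thr):
    """保留综合一致性分数大于阈值 thr 的匹配索引。"""
    return [i for i, s in enumerate(scores) if s > thr]

def select_by_rank(scores, top_n):
    """按分数从大到小排序, 保留排名前 top_n 的匹配索引。"""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(order[:top_n])
```

两种准则分别对应固定阈值与固定保留数量，后者更容易控制每级裁剪的比例（比如每次保留50%）。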
请参阅图7a，图7a是本申请实施例提供的一种计算第一匹配集合（第一匹配集合以初始匹配集合为例）中每条初始匹配的综合一致性分数的流程示意图。如图7a所示，初始匹配集合为(c_1,c_2,…c_N)，其中，c_1表示一条初始匹配的初始特征向量，c_1可以是4维向量（该条初始匹配在图像对的第一图像的第一像素点的二维坐标和该条初始匹配在图像对的第二图像的第二像素点的二维坐标组成的4维向量），初始匹配集合包括N条初始匹配。初始匹配集合经过特征升维模块升维后变为(z_1,z_2,…z_N)，z_1表示一条初始匹配的高维特征向量，z_1可以是128维的特征向量。动态图构建模块将每条初始匹配的高维特征向量以及通过K近邻算法确定的与该条初始匹配相关度最高的k条匹配进行构图，每条匹配z_i可以通过[z_i,Δz_i]进行升维，其中Δz_i=(z_i-z_i^j)，1≤j≤k，z_i^j为与匹配z_i相关的k条匹配中的任一条。经过动态构图后，每条匹配的特征向量为k×256维。特征降维模块可以将每条匹配从k×256维降至128维，局部一致性分数计算模块可以计算每条匹配的局部一致性分数。全局一致性学习模块输出每条匹配的综合一致性分数。其中，特征升维模块可以包括4个残差模块（比如1个残差模块用于升维，另外3个残差模块用于解决深度神经网络退化的问题），特征降维模块可以通过1个MLP（用于降维，图7a和图7b中均未示出，MLP可以将k×256维降至k×128维）、环状卷积（用于降维，比如，从k×128维降至128维）和4个残差模块（4个残差模块用于解决深度神经网络退化的问题）实现，局部一致性分数计算可以通过1个MLP实现。
请参阅图7b，图7b是本申请实施例提供的另一种计算第一匹配集合（第一匹配集合以初始匹配集合为例）中每条初始匹配的综合一致性分数的流程示意图。图7b是在图7a的基础上进一步优化得到的。图7b的局部一致性计算过程与图7a类似，图7b可以描述全局一致性计算过程。如图7b所示，得到每条匹配的局部一致性分数后，将N×1转置为1×N，将二者相乘，得到N×N的邻接矩阵，完成全局动态构图，邻接矩阵中涵盖了每条匹配与初始匹配集合中其他匹配之间的一致性，即包含了每条匹配的全局一致性信息。图形卷积网络本质上是利用一个共享参数的过滤器，通过计算中心像素点以及相邻像素点的加权和来构成特征图，实现特征空间的提取。图形卷积网络可以将局部一致性模块学习的信息调制到频谱中，频谱中的特征过滤器使得传播的特征能够反映全局动态图的拉普拉斯算子中的一致性。
本申请实施例中，仅考虑第一初始匹配的局部一致性信息，也可以实现第一裁剪模块的筛选功能，无需考虑全局一致性信息，对于全局一致性差别不大的图像而言，可以节省匹配筛选所需的计算量，快速实现匹配筛选。
请参阅图8,图8是本申请实施例提供的另一种匹配筛选方法的流程示意图。图8是在图2a的基础上进一步优化得到的,如图8所示,该匹配筛选方法可以包括如下步骤。
步骤801,电子设备获取初始匹配集合,初始匹配集合来源于图像对之间的初始匹配结果。
步骤802,电子设备通过至少一个裁剪模块从初始匹配集合中筛选出匹配子集,至少一个裁剪模块用于获取初始匹配集合中每条初始匹配的一致性信息,匹配子集中的正确匹配比例高于初始匹配集合中的正确匹配比例。
其中,匹配子集用于处理与图像对相关的图像任务。
其中,步骤801至步骤802可以参见图2a所示的步骤201至步骤202,此处不再赘述。
步骤803,电子设备根据图像对相关的图像任务确定参数化变换模型所使用的约束关系,约束关系包括对极几何约束或重投影误差。
本申请实施例中,不同的图像任务可能对应不同的约束关系。比如,若图像任务是三维重建任务,则使用的约束关系为对极几何约束(epipolar geometry constraint);若图像任务是直线拟合任务,则使用的约束关系为重投影误差(reprojection error)。
其中,步骤803在步骤804之前执行,步骤803可以在步骤801或步骤802之前执行,也可以在步骤801或步骤802之后执行,也可以与步骤801或步骤802同时执行,本申请实施例不作限定。
步骤804,在参数化变换模型使用所述约束关系的情况下,电子设备利用匹配子集计算参数化变换模型的模型参数,参数化变换模型用于处理与图像对相关的图像任务。
步骤805,电子设备利用参数化变换模型对初始匹配集合进行预测,得到初始匹配集合中每条初始匹配的预测结果,预测结果包括正确匹配或错误匹配。
本申请实施例中,电子设备利用参数化变换模型对初始匹配集合进行预测,得到初始匹配集合中每条初始匹配的预测结果,包括:
电子设备利用参数化变换模型计算初始匹配集中每条匹配的对极距离(epipolar distance)或重投影误差(reprojection error),然后根据每条匹配的对极距离或重投影误差确定每条初始匹配的预测结果。
其中,若参数化变换模型的模型参数是在对极几何约束下利用匹配子集计算得到的,模型参数可以是本质矩阵(essential matrix)。电子设备利用参数化变换模型计算初始匹配集中每条匹配的对极距离,然后根据每条匹配的对极距离确定每条初始匹配的预测结果。在本申请一些实施例中,可以根据每条匹配的对极距离,将对极距离小于第三阈值的匹配预测为正确匹配,将对极距离大于第三阈值的匹配预测为错误匹配。
其中,若参数化变换模型的模型参数是在重投影误差的约束下利用匹配子集计算得到的,电子设备利用参数化变换模型计算初始匹配集中每条匹配的重投影误差,然后根据每条匹配的重投影误差确定每条初始匹配的预测结果。在本申请一些实施例中,可以根据每条匹配的重投影误差,将重投影误差小于第四阈值的匹配预测为正确匹配,将重投影误差大于第四阈值的匹配预测为错误匹配。
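以对极几何约束为例，"利用本质矩阵计算每条匹配的对极距离并按阈值预测正误"可以示意如下（此处使用归一化坐标下点到对极线的距离，阈值thr与函数名均为本文补充的示例假设）：

```python
import numpy as np

def epipolar_distance(E, x1, x2):
    """点 x2 到对极线 l = E·x1 的距离(齐次坐标, 示意实现)。
    E: 3x3 本质矩阵; x1, x2: 图像对中一条匹配的两个2D点。"""
    l = E @ np.append(x1, 1.0)                       # 对极线 (a, b, c)
    return abs(l @ np.append(x2, 1.0)) / np.hypot(l[0], l[1])

def predict_match(E, x1, x2, thr=1e-2):
    """对极距离小于阈值 -> 预测为正确匹配, 否则为错误匹配。"""
    return epipolar_distance(E, x1, x2) < thr
```

对重投影误差约束而言思路相同，只需把对极距离替换为投影点与观测点之间的距离，再与对应阈值比较即可。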
本申请实施例中,可以在对参数化变换模型的模型参数进行计算之前,选择与图像任务对应的约束关系,从而通过计算好的参数化变换模型更好的完成后续的图像任务。
下面结合图9来呈现采用本申请实施例的CLNet方法与采用PointCN方法在直线拟合任务上的效果。采用PointCN方法,对初始匹配集合直接进行直线拟合;本申请实施例的方法,首先通过裁剪模块从初始匹配集合中筛选出匹配子集,然后根据匹配子集进行直线拟合,由于本申请实施例的方法的匹配子集筛除了大部分的错误匹配,直线拟合受到错误匹配的影响很小,从而提高直线拟合的可靠性。图9中提供了两种初始匹配集合(第一种情况下的初始匹配集合和第二种情况下的初始匹配集合,两种情况下的初始匹配集合的分布不同),两种初始匹配集合均来自真实场景中随机分布的匹配,对于给定的直线拟合任务,它需要模型拟合给定的一条直线,从图9可以看出,采用PointCN方法是不太可靠的,在第二种情况下拟合失败,而采用本申请实施例的方法在两种情况下都拟合成功。
本申请实施例中,裁剪模块可以基于由局部区域到全局局域学习的一致性对初始匹配集合进行多次裁剪,获得置信度较高的匹配子集。
下面结合图10来呈现采用本申请实施例的CLNet方法与采用PointCN方法、OANet方法、PointACN方法在直线拟合任务上的L2距离的对比图。图10的纵坐标是L2距离误差,横坐标是测试数据集的离群率(错误匹配所占的比例)。从图10可以看出,测试数据集的离群率在50%到90%之间变化时,本申请实施例的方法(CLNet)在所有五个噪声级别上都有很好的概括,并且在最困难的情况(即90%的离群率)取得了显着的优势。图10的评估指标是预测的直线参数与真实直线之间的L2距离,L2距离越小,预测的准确性越高。
在本申请一些实施例中,图像任务包括直线拟合任务、宽基线图像匹配任务、图像定位任务、图像拼接任务、三维重建任务、相机姿态估计任务中的任一种。
本申请实施例中,通过匹配筛选方法对包含大量错误匹配的初始匹配集合进行筛选,可以获得高精度的特征匹配结果,用于直线拟合任务和宽基线图像匹配任务;利用匹配筛选方法的筛选结果可以计算图像间的参数化变换模型,用于图像拼接、三维重建任务和相机姿态估计;利用匹配筛选方法筛选获得的特征匹配数量作为测度进行图像检索,对目标图像进行定位。
在本申请一些实施例中,匹配筛选方法可以应用于VIPER平台产品中。
上述主要从方法侧执行过程的角度对本申请实施例的方案进行了介绍。可以理解的是,电子设备为了实现上述功能,其包含了执行各个功能相应的硬件结构和/或软件模块。本领域技术人员应该很容易意识到,结合本文中所提供的实施例描述的各示例的单元及算法步骤,本申请能够以硬件或硬件和计算机软件的结合形式来实现。某个功能究竟以硬件还是计算机软件驱动硬件的方式来执行,取决于技术方案的特定应用和设计约束条件。专业技术人员可以对每个特定的应用使用不同方法来实现所描述的功能,但是这种实现不应认为超出本申请的范围。
本申请实施例可以根据上述方法示例对电子设备进行功能单元的划分,例如,可以对应各个功能划分各个功能单元,也可以将两个或两个以上的功能集成在一个处理单元中。上述集成的单元既可以采用硬件的形式实现,也可以采用软件功能单元的形式实现。需要说明的是,本申请实施例中对单元的划分是示意性的,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式。
与上述一致的,请参阅图11,图11为本申请实施例提供的一种匹配筛选装置的结构示意图,该匹配筛选装置1100应用于电子设备,该匹配筛选装置1100可以包括获取单元1101和筛选单元1102,其中:
获取单元1101,配置为获取初始匹配集合,所述初始匹配集合来源于图像对之间的初始匹配结果;
筛选单元1102,配置为通过至少一个裁剪模块从所述初始匹配集合中筛选出匹配子集,所述匹配子集中的正确匹配比例高于所述初始匹配集合中的正确匹配比例,所述至少一个裁剪模块用于获取所述初始匹配集合中每条初始匹配的一致性信息;
其中,所述匹配子集用于处理与所述图像对相关的图像任务。
在本申请一些实施例中,该匹配筛选装置1100还可以包括预测单元1103;
预测单元1103,配置为在筛选单元1102通过至少一个裁剪模块从所述初始匹配集合中筛选出匹配子集之后,利用所述参数化变换模型对所述初始匹配集合进行预测,得到所述初始匹配集合中每条初始匹配的预测结果,所述预测结果包括正确匹配或错误匹配。
在本申请一些实施例中,筛选单元1102配置为通过至少一个裁剪模块从所述初始匹配集合中筛选出匹配子集,包括:
通过第一裁剪模块对第一匹配集合进行筛选,得到匹配子集;
在所述至少一个裁剪模块包括一个裁剪模块的情况下,所述第一匹配集合为所述初始匹配集合;
在所述至少一个裁剪模块包括至少两个裁剪模块的情况下,所述第一匹配集合是通过所述第一裁剪模块的上一个裁剪模块筛选得到的。
在本申请一些实施例中，筛选单元1102配置为通过第一裁剪模块对第一匹配集合进行筛选，得到匹配子集，包括：通过所述第一裁剪模块确定第一初始匹配的局部一致性信息或全局一致性信息，根据所述第一初始匹配的局部一致性信息或全局一致性信息确定所述第一初始匹配是否被归入所述匹配子集；所述第一初始匹配为所述第一匹配集合中的任意一条。
在本申请一些实施例中，筛选单元1102配置为通过第一裁剪模块对第一匹配集合进行筛选，得到匹配子集，包括：通过所述第一裁剪模块确定第一初始匹配的局部一致性信息和全局一致性信息，根据所述第一初始匹配的局部一致性信息和全局一致性信息确定所述第一初始匹配是否被归入所述匹配子集；所述第一初始匹配为所述第一匹配集合中的任意一条。
在本申请一些实施例中，所述第一裁剪模块包括第一局部一致性学习模块、第一全局一致性学习模块和第一裁剪子模块，所述一致性信息包括局部一致性分数和全局一致性分数中的至少一项；所述筛选单元1102配置为通过所述第一裁剪模块确定第一初始匹配的局部一致性信息和全局一致性信息，根据所述第一初始匹配的局部一致性信息和全局一致性信息确定所述第一初始匹配是否被归入所述匹配子集，包括：
通过所述第一局部一致性学习模块构建针对第一初始匹配的第一局部动态图，计算所述第一初始匹配在所述第一局部动态图的局部一致性分数；所述第一局部动态图包含所述第一初始匹配所在的节点以及与所述第一初始匹配所在的节点相关的K个相关节点；所述K个相关节点是利用K近邻算法基于所述第一初始匹配所在的节点得到的；
通过所述第一全局一致性学习模块构建第一全局动态图，根据所述第一初始匹配在所述第一局部动态图的局部一致性分数和所述第一全局动态图确定所述第一初始匹配的综合一致性分数；所述第一全局动态图包含所有初始匹配所在的节点；
利用所述第一裁剪子模块根据所述第一初始匹配的综合一致性分数确定所述第一初始匹配是否被归入所述匹配子集。
在本申请一些实施例中,所述第一局部一致性学习模块包括第一特征升维模块、第一动态图构建模块、第一特征降维模块和第一局部一致性分数计算模块;
所述筛选单元1102配置为通过所述第一局部一致性学习模块构建针对第一初始匹配的第一局部动态图,计算所述第一初始匹配在所述第一局部动态图的局部一致性分数,包括:
通过所述第一特征升维模块对所述第一初始匹配的初始特征向量进行升维处理,得到所述第一初始匹配的高维特征向量;
利用所述第一局部动态图构建模块通过K近邻算法确定所述第一匹配集合中与所述第一初始匹配的高维特征向量的相关度(欧氏距离)排名靠前的K条相关匹配,基于所述第一初始匹配和所述K条相关匹配构建针对所述第一初始匹配的第一局部动态图,得到所述第一初始匹配的超高维特征向量;所述第一初始匹配的超高维特征向量包括所述第一初始匹配的高维特征向量以及所述第一初始匹配与所述K条相关匹配之间的相关度向量的组合;
利用所述第一特征降维模块对所述第一初始匹配的超高维特征向量进行降维处理,得到第一初始匹配的低维特征向量;
通过所述第一局部一致性分数计算模块基于所述第一初始匹配的低维特征向量计算所述第一初始匹配在所述第一局部动态图的局部一致性分数。
在本申请一些实施例中,所述第一特征降维模块包括第一环状卷积模块和第二环状卷积模块;所述筛选单元1102配置为利用所述第一特征降维模块对所述第一初始匹配的超高维特征向量进行降维处理,得到第一初始匹配的低维特征向量,包括:
通过所述第一环状卷积模块对所述第一初始匹配的超高维特征向量按照相关度进行分组,对每组特征向量进行第一次特征聚集处理,得到初步聚集的特征向量;
通过所述第二环状卷积模块对所述初步聚集的特征向量进行第二次特征聚集处理,得到所述第一初始匹配的低维特征向量。
在本申请一些实施例中,所述筛选单元1102配置为根据所述第一初始匹配在所述第一局部动态图的局部一致性分数和所述第一全局动态图确定所述第一初始匹配的综合一致性分数,包括:
计算所述第一初始匹配在所述第一全局动态图的全局一致性分数;
根据所述局部一致性分数和所述全局一致性分数确定所述第一初始匹配的综合一致性分数。
在本申请一些实施例中,所述筛选单元1102配置为通过所述第一全局一致性学习模块构建第一全局动态图,包括:
通过所述第一全局一致性学习模块根据所述第一匹配集合中每条初始匹配在对应的局部动态图的局部一致性分数构建第一全局动态图;
所述根据所述第一初始匹配在所述第一局部动态图的局部一致性分数和所述第一全局动态图确定所述第一初始匹配的综合一致性分数,包括:
根据所述第一全局动态图和所述第一初始匹配的低维特征向量计算所述第一初始匹配的综合一致性分数。
在本申请一些实施例中,所述第一全局动态图通过邻接矩阵表示,所述筛选单元1102配置为根据所述第一全局动态图和所述第一初始匹配的低维特征向量计算所述第一初始匹配的综合一致性分数,包括:
基于所述第一初始匹配的低维特征向量和所述邻接矩阵,利用图形卷积网络计算所述第一初始匹配的综合低维特征向量;
基于所述第一初始匹配的综合低维特征向量计算所述第一初始匹配的综合一致性分数。
在本申请一些实施例中，所述筛选单元1102配置为利用所述第一裁剪子模块根据所述第一初始匹配的综合一致性分数确定所述第一初始匹配是否被归入所述匹配子集，包括：
利用所述第一裁剪子模块确定所述第一初始匹配的综合一致性分数是否大于第一阈值,若是,确定所述第一初始匹配归入所述匹配子集;
或者,利用所述第一裁剪子模块确定所述第一初始匹配的综合一致性分数在所述第一匹配集合中按照从大到小的排名,若所述第一初始匹配的排名大于第二阈值,确定所述第一初始匹配归入所述匹配子集。
在本申请一些实施例中,该匹配筛选装置1100还包括训练单元1104;
所述训练单元1104,配置为在所述筛选单元1102通过至少一个裁剪模块从所述初始匹配集合中筛选出匹配子集之前,利用有监督数据集对裁剪模块进行训练,得到训练结果;通过自适应温度的二分类损失函数对所述训练结果进行评估,按照最小化所述二分类损失函数的方法对所述裁剪模块的参数进行更新。
在本申请一些实施例中,该匹配筛选装置1100还包括确定单元1105和计算单元1106;
所述确定单元1105,配置为在所述筛选单元1102通过至少一个裁剪模块从所述初始匹配集合中筛选出匹配子集之前,根据所述图像对相关的图像任务确定所述参数化变换模型所使用的约束关系,所述约束关系包括对极几何约束或重投影误差;
所述计算单元1106,配置为在所述参数化变换模型使用所述约束关系的情况下,利用所述匹配子集计算所述参数化变换模型的模型参数。
在本申请一些实施例中,所述图像任务包括直线拟合任务、宽基线图像匹配任务、图像定位任务、图像拼接任务、三维重建任务、相机姿态估计任务中的任一种。
其中,本申请实施例中的获取单元1101可以是电子设备中的通信模块,筛选单元1102、预测单元1103、训练单元1104、确定单元1105和计算单元1106可以是电子设备中的处理器或芯片。
本申请实施例中,可以对初始匹配集合进行筛选,使得筛选出的匹配子集中的正确匹配比例高于所述初始匹配集合中的正确匹配比例,可以提高参数化变换模型的模型参数的计算精度,进而提高参数化变换模型处理图像任务的处理效果。
请参阅图12，图12是本申请实施例提供的一种电子设备的结构示意图，如图12所示，该电子设备1200包括处理器1201和存储器1202，处理器1201、存储器1202可以通过通信总线1203相互连接。通信总线1203可以是外设部件互连标准（Peripheral Component Interconnect，PCI）总线或扩展工业标准结构（Extended Industry Standard Architecture，EISA）总线等。通信总线1203可以分为地址总线、数据总线、控制总线等。为便于表示，图12中仅用一条粗线表示，但并不表示仅有一根总线或一种类型的总线。存储器1202配置为存储计算机程序，计算机程序包括程序指令，处理器1201被配置为调用所述程序指令，以执行如图1a、图2a、图3所示的方法。
处理器1201可以是通用中央处理器(Central Processing Unit,CPU)、微处理器、特定应用集成电路(Application-Specific Integrated Circuit,ASIC)或一个或多个用于控制以上方案程序执行的集成电路。
存储器1202可以是只读存储器(Read-Only Memory,ROM)或可存储静态信息和指令的其他类型的静态存储设备,随机存取存储器(Random Access Memory,RAM)或者可存储信息和指令的其他类型的动态存储设备,也可以是电可擦可编程只读存储器(Electrically Erasable Programmable Read-Only Memory,EEPROM)、只读光盘(Compact Disc Read-Only Memory,CD-ROM)或其他光盘存储、光碟存储(包括压缩光碟、激光碟、光碟、数字通用光碟、蓝光光碟等)、磁盘存储介质或者其他磁存储设备、或者能够用于携带或存储具有指令或数据结构形式的期望的程序代码并能够由计算机存取的任何其他介质,但不限于此。存储器可以是独立存在,通过总线与处理器相连接。存储器也可以和处理器集成在一起。
此外,该电子设备1200还可以包括通信模块、天线等通用部件,在此不再详述。
本申请实施例中，可以对初始匹配集合进行筛选，使得筛选出的匹配子集中的正确匹配比例高于所述初始匹配集合中的正确匹配比例，可以提高参数化变换模型的模型参数的计算精度，进而提高参数化变换模型处理图像任务的处理效果。
本申请实施例还提供一种计算机可读存储介质,其中,该计算机可读存储介质存储配置为电子数据交换的计算机程序,该计算机程序使得计算机执行如上述方法实施例中记载的任何一种匹配筛选方法的部分或全部步骤。
本申请实施例还提供一种计算机程序产品,包括存储了计算机程序的非瞬时性计算机可读存储介质,计算机程序可操作来使计算机执行上述方法实施例中记载的任何一种匹配筛选方法的部分或全部步骤。
需要说明的是,对于前述的各方法实施例,为了简单描述,故将其都表述为一系列的动作组合,但是本领域技术人员应该知悉,本申请并不受所描述的动作顺序的限制,因为依据本申请,某些步骤可以采用其他顺序或者同时进行。其次,本领域技术人员也应该知悉,说明书中所描述的实施例均属于优选实施例,所涉及的动作和模块并不一定是本申请所必须的。
在上述实施例中,对各个实施例的描述都各有侧重,某个实施例中没有详述的部分,可以参见其他实施例的相关描述。
在本申请所提供的几个实施例中,应该理解到,所揭露的装置,可通过其它的方式实现。例如,以上所描述的装置实施例仅仅是示意性的,例如所述单元的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,例如多个单元或组件可以结合或者可以集成到另一个系统,或一些特征可以忽略,或不执行。另一点,所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口,装置或单元的间接耦合或通信连接,可以是电性或其它的形式。
所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。
另外，在本申请各个实施例中的各功能单元可以集成在一个处理单元中，也可以是各个单元单独物理存在，也可以两个或两个以上单元集成在一个单元中。上述集成的单元既可以采用硬件的形式实现，也可以采用软件程序模块的形式实现。
所述集成的单元如果以软件程序模块的形式实现并作为独立的产品销售或使用时,可以存储在一个计算机可读取存储器中。基于这样的理解,本申请的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的全部或部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储器中,包括若干指令用以使得一台计算机设备(可为个人计算机、服务器或者网络设备等)执行本申请各个实施例所述方法的全部或部分步骤。而前述的存储器包括:U盘、ROM、RAM、移动硬盘、磁碟或者光盘等各种可以存储程序代码的介质。
本领域普通技术人员可以理解上述实施例的各种方法中的全部或部分步骤是可以通过程序来指令相关的硬件来完成,该程序可以存储于一计算机可读存储器中,存储器可以包括:闪存盘、只读存储器、随机存取器、磁盘或光盘等。
以上对本申请实施例进行了详细介绍,本文中应用了具体个例对本申请的原理及实施方式进行了阐述,以上实施例的说明只是用于帮助理解本申请的方法及其核心思想;同时,对于本领域的一般技术人员,依据本申请的思想,在具体实施方式及应用范围上均会有改变之处,综上所述,本说明书内容不应理解为对本申请的限制。
工业实用性
本申请实施例提供一种匹配筛选方法及装置、电子设备、计算机可读存储介质和计算机程序产品，所述方法包括：获取初始匹配集合，所述初始匹配集合来源于图像对之间的初始匹配结果；通过至少一个裁剪模块从所述初始匹配集合中筛选出匹配子集，所述匹配子集中的正确匹配比例高于所述初始匹配集合中的正确匹配比例，所述至少一个裁剪模块用于获取所述初始匹配集合中每条初始匹配的一致性信息；其中，所述匹配子集用于处理与所述图像对相关的图像任务。本申请实施例可以提高参数化变换模型处理图像任务的处理效果。

Claims (33)

  1. 一种匹配筛选方法,应用于电子设备中,包括:
    获取初始匹配集合,所述初始匹配集合来源于图像对之间的初始匹配结果;
    通过至少一个裁剪模块从所述初始匹配集合中筛选出匹配子集,所述匹配子集中的正确匹配比例高于所述初始匹配集合中的正确匹配比例,所述至少一个裁剪模块用于获取所述初始匹配集合中每条初始匹配的一致性信息;
    其中,所述匹配子集用于处理与所述图像对相关的图像任务。
  2. 根据权利要求1所述的方法,其中,所述通过至少一个裁剪模块从所述初始匹配集合中筛选出匹配子集之后,所述方法还包括:
    利用所述参数化变换模型对所述初始匹配集合进行预测,得到所述初始匹配集合中每条初始匹配的预测结果,所述预测结果包括正确匹配或错误匹配。
  3. 根据权利要求1或2所述的方法,其中,所述通过至少一个裁剪模块从所述初始匹配集合中筛选出匹配子集,包括:
    通过第一裁剪模块对第一匹配集合进行筛选,得到匹配子集;
    在所述至少一个裁剪模块包括一个裁剪模块的情况下,所述第一匹配集合为所述初始匹配集合;
    在所述至少一个裁剪模块包括至少两个裁剪模块的情况下,所述第一匹配集合是通过所述第一裁剪模块的上一个裁剪模块筛选得到的。
  4. 根据权利要求3所述的方法,其中,所述通过第一裁剪模块对第一匹配集合进行筛选,得到匹配子集,包括:
    通过所述第一裁剪模块确定第一初始匹配的局部一致性信息或全局一致性信息，根据所述第一初始匹配的局部一致性信息或全局一致性信息确定所述第一初始匹配是否被归入所述匹配子集；所述第一初始匹配为所述第一匹配集合中的任意一条。
  5. 根据权利要求3所述的方法,其中,所述通过第一裁剪模块对第一匹配集合进行筛选,得到匹配子集,包括:
    通过所述第一裁剪模块确定第一初始匹配的局部一致性信息和全局一致性信息，根据所述第一初始匹配的局部一致性信息和全局一致性信息确定所述第一初始匹配是否被归入所述匹配子集；所述第一初始匹配为所述第一匹配集合中的任意一条。
  6. 根据权利要求5所述的方法，其中，所述第一裁剪模块包括第一局部一致性学习模块、第一全局一致性学习模块和第一裁剪子模块，所述一致性信息包括局部一致性分数和全局一致性分数中的至少一项；
    所述通过所述第一裁剪模块确定第一初始匹配的局部一致性信息和全局一致性信息，根据所述第一初始匹配的局部一致性信息和全局一致性信息确定所述第一初始匹配是否被归入所述匹配子集，包括：
    通过所述第一局部一致性学习模块构建针对第一初始匹配的第一局部动态图，计算所述第一初始匹配在所述第一局部动态图的局部一致性分数；所述第一局部动态图包含所述第一初始匹配所在的节点以及与所述第一初始匹配所在的节点相关的K个相关节点；所述K个相关节点是利用K近邻算法基于所述第一初始匹配所在的节点得到的；
    通过所述第一全局一致性学习模块构建第一全局动态图，根据所述第一初始匹配在所述第一局部动态图的局部一致性分数和所述第一全局动态图确定所述第一初始匹配的综合一致性分数；所述第一全局动态图包含所有初始匹配所在的节点；
    利用所述第一裁剪子模块根据所述第一初始匹配的综合一致性分数确定所述第一初始匹配是否被归入所述匹配子集。
  7. 根据权利要求6所述的方法,其中,所述第一局部一致性学习模块包括第一特征升维模块、第一动态图构建模块、第一特征降维模块和第一局部一致性分数计算模块;
    所述通过所述第一局部一致性学习模块构建针对第一初始匹配的第一局部动态图,计算所述第一初始匹配在所述第一局部动态图的局部一致性分数,包括:
    通过所述第一特征升维模块对所述第一初始匹配的初始特征向量进行升维处理,得到所述第一初始匹配的高维特征向量;
    利用所述第一局部动态图构建模块通过K近邻算法确定所述第一匹配集合中与所述第一初始匹配的高维特征向量的相关度排名靠前的K条相关匹配,基于所述第一初始匹配和所述K条相关匹配构建针对所述第一初始匹配的第一局部动态图,得到所述第一初始匹配的超高维特征向量;所述第一初始匹配的超高维特征向量包括所述第一初始匹配的高维特征向量以及所述第一初始匹配与所述K条相关匹配之间的相关度向量的组合;
    利用所述第一特征降维模块对所述第一初始匹配的超高维特征向量进行降维处理,得到第一初始匹配的低维特征向量;
    通过所述第一局部一致性分数计算模块基于所述第一初始匹配的低维特征向量计算所述第一初始匹配在所述第一局部动态图的局部一致性分数。
  8. 根据权利要求7所述的方法,其中,所述第一特征降维模块包括第一环状卷积模块和第二环状卷积模块;所述利用所述第一特征降维模块对所述第一初始匹配的超高维特征向量进行降维处理,得到第一初始匹配的低维特征向量,包括:
    通过所述第一环状卷积模块对所述第一初始匹配的超高维特征向量按照相关度进行分组,对每组特征向量进行第一次特征聚集处理,得到初步聚集的特征向量;
    通过所述第二环状卷积模块对所述初步聚集的特征向量进行第二次特征聚集处理,得到所述第一初始匹配的低维特征向量。
  9. 根据权利要求6~8任一项所述的方法,其中,所述根据所述第一初始匹配在所述第一局部动态图的局部一致性分数和所述第一全局动态图确定所述第一初始匹配的综合一致性分数,包括:
    计算所述第一初始匹配在所述第一全局动态图的全局一致性分数;
    根据所述局部一致性分数和所述全局一致性分数确定所述第一初始匹配的综合一致性分数。
  10. 根据权利要求7或8所述的方法,其中,所述通过所述第一全局一致性学习模块构建第一全局动态图,包括:
    通过所述第一全局一致性学习模块根据所述第一匹配集合中每条初始匹配在对应的局部动态图的局部一致性分数构建第一全局动态图;
    所述根据所述第一初始匹配在所述第一局部动态图的局部一致性分数和所述第一全局动态图确定所述第一初始匹配的综合一致性分数,包括:
    根据所述第一全局动态图和所述第一初始匹配的低维特征向量计算所述第一初始匹配的综合一致性分数。
  11. 根据权利要求10所述的方法,其中,所述第一全局动态图通过邻接矩阵表示,所述根据所述第一全局动态图和所述第一初始匹配的低维特征向量计算所述第一初始匹配的综合一致性分数,包括:
    基于所述第一初始匹配的低维特征向量和所述邻接矩阵,利用图形卷积网络计算所述第一初始匹配的综合低维特征向量;
    基于所述第一初始匹配的综合低维特征向量计算所述第一初始匹配的综合一致性分数。
  12. 根据权利要求6~11任一项所述的方法，其中，所述利用所述第一裁剪子模块根据所述第一初始匹配的综合一致性分数确定所述第一初始匹配是否被归入所述匹配子集，包括：
    利用所述第一裁剪子模块确定所述第一初始匹配的综合一致性分数是否大于第一阈值,若是,确定所述第一初始匹配归入所述匹配子集;
    或者,利用所述第一裁剪子模块确定所述第一初始匹配的综合一致性分数在所述第一匹配集合中按照从大到小的排名,若所述第一初始匹配的排名大于第二阈值,确定所述第一初始匹配归入所述匹配子集。
  13. 根据权利要求1~12任一项所述的方法，其中，所述通过至少一个裁剪模块从所述初始匹配集合中筛选出匹配子集之前，所述方法还包括：
    利用有监督数据集对裁剪模块进行训练,得到训练结果;
    通过自适应温度的二分类损失函数对所述训练结果进行评估,按照最小化所述二分类损失函数的方法对所述裁剪模块的参数进行更新。
  14. 根据权利要求1~13任一项所述的方法,其中,所述通过至少一个裁剪模块从所述初始匹配集合中筛选出匹配子集之前,所述方法还包括:
    根据所述图像对相关的图像任务确定所述参数化变换模型所使用的约束关系,所述约束关系包括对极几何约束或重投影误差;
    所述通过至少一个裁剪模块从所述初始匹配集合中筛选出匹配子集之后,所述方法还包括:
    在所述参数化变换模型使用所述约束关系的情况下,利用所述匹配子集计算所述参数化变换模型的模型参数。
  15. 根据权利要求1~14任一项所述的方法,其中,所述图像任务包括直线拟合任务、宽基线图像匹配任务、图像定位任务、图像拼接任务、三维重建任务、相机姿态估计任务中的任一种。
  16. 一种匹配筛选装置,包括:
    获取单元,配置为获取初始匹配集合,所述初始匹配集合来源于图像对之间的初始匹配结果;
    筛选单元,配置为通过至少一个裁剪模块从所述初始匹配集合中筛选出匹配子集,所述匹配子集中的正确匹配比例高于所述初始匹配集合中的正确匹配比例,所述至少一个裁剪模块用于获取所述初始匹配集合中每条初始匹配的一致性信息;
    其中,所述匹配子集用于处理与所述图像对相关的图像任务。
  17. 根据权利要求16所述的装置，其中，所述装置还包括：
    预测单元,配置为在通过至少一个裁剪模块从所述初始匹配集合中筛选出匹配子集之后,利用所述参数化变换模型对所述初始匹配集合进行预测,得到所述初始匹配集合中每条初始匹配的预测结果,所述预测结果包括正确匹配或错误匹配。
  18. 根据权利要求16或17所述的装置,其中,所述筛选单元,配置为通过至少一个裁剪模块从所述初始匹配集合中筛选出匹配子集,包括:
    通过第一裁剪模块对第一匹配集合进行筛选,得到匹配子集;
    在所述至少一个裁剪模块包括一个裁剪模块的情况下,所述第一匹配集合为所述初始匹配集合;
    在所述至少一个裁剪模块包括至少两个裁剪模块的情况下,所述第一匹配集合是通过所述第一裁剪模块的上一个裁剪模块筛选得到的。
  19. 根据权利要求18所述的装置,其中,所述筛选单元,配置为通过第一裁剪模块对第一匹配集合进行筛选,得到匹配子集,包括:
    通过所述第一裁剪模块确定第一初始匹配的局部一致性信息或全局一致性信息，根据所述第一初始匹配的局部一致性信息或全局一致性信息确定所述第一初始匹配是否被归入所述匹配子集；所述第一初始匹配为所述第一匹配集合中的任意一条。
  20. 根据权利要求18所述的装置，其中，所述筛选单元，配置为通过第一裁剪模块对第一匹配集合进行筛选，得到匹配子集，包括：
    通过所述第一裁剪模块确定第一初始匹配的局部一致性信息和全局一致性信息，根据所述第一初始匹配的局部一致性信息和全局一致性信息确定所述第一初始匹配是否被归入所述匹配子集；所述第一初始匹配为所述第一匹配集合中的任意一条。
  21. 根据权利要求20所述的装置，其中，所述第一裁剪模块包括第一局部一致性学习模块、第一全局一致性学习模块和第一裁剪子模块，所述一致性信息包括局部一致性分数和全局一致性分数中的至少一项；
    所述筛选单元，配置为通过所述第一裁剪模块确定第一初始匹配的局部一致性信息和全局一致性信息，根据所述第一初始匹配的局部一致性信息和全局一致性信息确定所述第一初始匹配是否被归入所述匹配子集，包括：
    通过所述第一局部一致性学习模块构建针对第一初始匹配的第一局部动态图，计算所述第一初始匹配在所述第一局部动态图的局部一致性分数；所述第一局部动态图包含所述第一初始匹配所在的节点以及与所述第一初始匹配所在的节点相关的K个相关节点；所述K个相关节点是利用K近邻算法基于所述第一初始匹配所在的节点得到的；
    通过所述第一全局一致性学习模块构建第一全局动态图，根据所述第一初始匹配在所述第一局部动态图的局部一致性分数和所述第一全局动态图确定所述第一初始匹配的综合一致性分数；所述第一全局动态图包含所有初始匹配所在的节点；
    利用所述第一裁剪子模块根据所述第一初始匹配的综合一致性分数确定所述第一初始匹配是否被归入所述匹配子集。
  22. The apparatus according to claim 21, wherein the first local consistency learning module comprises a first feature dimension-raising module, a first dynamic graph construction module, a first feature dimension-reduction module and a first local consistency score computation module;
    the screening unit being configured to construct, through the first local consistency learning module, the first local dynamic graph for the first initial match, and compute the local consistency score of the first initial match in the first local dynamic graph comprises:
    performing, through the first feature dimension-raising module, dimension-raising processing on an initial feature vector of the first initial match to obtain a high-dimensional feature vector of the first initial match;
    determining, by using the first local dynamic graph construction module through the K-nearest-neighbor algorithm, K related matches in the first matching set that rank highest in correlation (Euclidean distance) with the high-dimensional feature vector of the first initial match, and constructing the first local dynamic graph for the first initial match based on the first initial match and the K related matches to obtain a super-high-dimensional feature vector of the first initial match; wherein the super-high-dimensional feature vector of the first initial match comprises a combination of the high-dimensional feature vector of the first initial match and correlation vectors between the first initial match and the K related matches;
    performing, by using the first feature dimension-reduction module, dimension-reduction processing on the super-high-dimensional feature vector of the first initial match to obtain a low-dimensional feature vector of the first initial match;
    computing, through the first local consistency score computation module, the local consistency score of the first initial match in the first local dynamic graph based on the low-dimensional feature vector of the first initial match.
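The graph-construction steps of claim 22 can be sketched numerically. This is a hedged illustration: the random linear map `W_up` stands in for the learned dimension-raising module, the dimensions (N=50, K=8) are arbitrary, and the "correlation vector" is assumed here to be the feature difference to each neighbor.

```python
import numpy as np

rng = np.random.default_rng(1)
N, d_in, d_hi, K = 50, 4, 32, 8

feats = rng.normal(size=(N, d_in))      # one initial feature vector per match
W_up = rng.normal(size=(d_in, d_hi))    # stand-in for the learned up-projection

# Dimension raising: linear map plus ReLU as a placeholder nonlinearity.
hi = np.maximum(feats @ W_up, 0.0)

# K nearest neighbors in the high-dimensional space (Euclidean distance),
# excluding each match itself -- the K related matches of the local graph.
dist = np.linalg.norm(hi[:, None, :] - hi[None, :, :], axis=-1)
np.fill_diagonal(dist, np.inf)
knn = np.argsort(dist, axis=1)[:, :K]   # indices of the K related matches

# Super-high-dimensional vector: own feature concatenated with the
# feature differences to its K neighbors (one plausible correlation vector).
edge = hi[knn] - hi[:, None, :]                          # (N, K, d_hi)
own = np.repeat(hi[:, None, :], K, axis=1)               # (N, K, d_hi)
super_hi = np.concatenate([own, edge], axis=-1)          # (N, K, 2 * d_hi)
```

In a trained system the up-projection would be a learned MLP and the graph would be rebuilt ("dynamic") at every layer as features change.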
  23. The apparatus according to claim 22, wherein the first feature dimension-reduction module comprises a first annular convolution module and a second annular convolution module; and the screening unit being configured to perform, by using the first feature dimension-reduction module, dimension-reduction processing on the super-high-dimensional feature vector of the first initial match to obtain the low-dimensional feature vector of the first initial match comprises:
    grouping, through the first annular convolution module, the super-high-dimensional feature vector of the first initial match according to correlation, and performing first feature aggregation processing on each group of feature vectors to obtain preliminarily aggregated feature vectors;
    performing, through the second annular convolution module, second feature aggregation processing on the preliminarily aggregated feature vectors to obtain the low-dimensional feature vector of the first initial match.
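The two-stage aggregation of claim 23 can be sketched as grouped pooling. This is an assumption-laden sketch: max-pooling stands in for the claimed annular convolutions, the group count G=4 is arbitrary, and `W_down` is a random placeholder for the learned channel-reducing map.

```python
import numpy as np

rng = np.random.default_rng(2)
N, K, C = 50, 8, 64       # matches, neighbors per match, channel width
G = 4                     # neighbor groups, ordered by correlation rank

# Per-neighbor features from the local dynamic graph (neighbors assumed
# already sorted by decreasing correlation along axis 1).
super_hi = rng.normal(size=(N, K, C))

# First aggregation: split the K correlation-ranked neighbors into G
# contiguous groups and pool inside each group.
grouped = super_hi.reshape(N, G, K // G, C).max(axis=2)   # (N, G, C)

# Second aggregation: pool across the groups, then reduce channels with
# a stand-in linear map for the second learned convolution.
pooled = grouped.max(axis=1)                              # (N, C)
W_down = rng.normal(size=(C, 16))
low = pooled @ W_down                                     # low-dimensional vectors
```

Grouping by correlation rank lets the first stage treat near and far neighbors separately before the second stage fuses them into one compact vector per match.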
  24. The apparatus according to any one of claims 21 to 23, wherein the screening unit being configured to determine the comprehensive consistency score of the first initial match according to the local consistency score of the first initial match in the first local dynamic graph and the first global dynamic graph comprises:
    computing a global consistency score of the first initial match in the first global dynamic graph;
    determining the comprehensive consistency score of the first initial match according to the local consistency score and the global consistency score.
  25. The apparatus according to claim 22 or 23, wherein the screening unit being configured to construct the first global dynamic graph through the first global consistency learning module comprises:
    constructing, through the first global consistency learning module, the first global dynamic graph according to the local consistency score of each initial match in the first matching set in a corresponding local dynamic graph;
    the screening unit being configured to determine the comprehensive consistency score of the first initial match according to the local consistency score of the first initial match in the first local dynamic graph and the first global dynamic graph comprises:
    computing the comprehensive consistency score of the first initial match according to the first global dynamic graph and the low-dimensional feature vector of the first initial match.
  26. The apparatus according to claim 25, wherein the first global dynamic graph is represented by an adjacency matrix, and the screening unit being configured to compute the comprehensive consistency score of the first initial match according to the first global dynamic graph and the low-dimensional feature vector of the first initial match comprises:
    computing a comprehensive low-dimensional feature vector of the first initial match by using a graph convolutional network based on the low-dimensional feature vector of the first initial match and the adjacency matrix;
    computing the comprehensive consistency score of the first initial match based on the comprehensive low-dimensional feature vector of the first initial match.
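One standard graph-convolution step over an adjacency matrix, as invoked in claim 26, can be sketched as follows. The adjacency matrix here is a random symmetric placeholder; in the claimed apparatus it would be built from the per-match local consistency scores, and the weight matrices would be learned rather than random.

```python
import numpy as np

rng = np.random.default_rng(3)
N, d = 50, 16
low = rng.normal(size=(N, d))            # low-dimensional vector per match

# Placeholder adjacency matrix of the global dynamic graph:
# random, symmetric, with self-loops added.
A = rng.random((N, N))
A = (A + A.T) / 2 + np.eye(N)

# One graph-convolution layer: symmetric normalization D^-1/2 A D^-1/2,
# neighborhood propagation, linear projection, ReLU.
D_inv_sqrt = np.diag(1.0 / np.sqrt(A.sum(axis=1)))
W = rng.normal(size=(d, d))
combined = np.maximum(D_inv_sqrt @ A @ D_inv_sqrt @ low @ W, 0.0)

# Comprehensive consistency score per match via a linear head + sigmoid.
w_score = rng.normal(size=(d,))
score = 1.0 / (1.0 + np.exp(-(combined @ w_score)))
```

The propagation step mixes each match's feature with those of its graph neighbors, which is how global agreement across all matches can inform the score of a single match.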
  27. The apparatus according to any one of claims 21 to 26, wherein the screening unit being configured to determine, by using the first pruning sub-module, whether the first initial match is included in the matching subset according to the comprehensive consistency score of the first initial match comprises:
    determining, by using the first pruning sub-module, whether the comprehensive consistency score of the first initial match is greater than a first threshold, and if so, determining that the first initial match is included in the matching subset;
    or, determining, by using the first pruning sub-module, a ranking of the comprehensive consistency score of the first initial match in the first matching set in descending order, and if the ranking of the first initial match is higher than a second threshold, determining that the first initial match is included in the matching subset.
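The two pruning criteria of claim 27 (absolute score threshold vs. descending-rank cutoff) are straightforward; a minimal sketch, with the thresholds chosen arbitrarily for illustration:

```python
import numpy as np

def prune_by_threshold(scores, t1):
    """Keep matches whose comprehensive consistency score exceeds
    the first threshold."""
    return scores > t1

def prune_by_rank(scores, top_n):
    """Keep the top_n matches when scores are ranked from high to low
    (a rank-based reading of the second threshold)."""
    order = np.argsort(-scores)           # indices, best score first
    keep = np.zeros(len(scores), dtype=bool)
    keep[order[:top_n]] = True
    return keep

scores = np.array([0.9, 0.2, 0.7, 0.4, 0.95])
mask_t = prune_by_threshold(scores, 0.5)  # keeps 0.9, 0.7, 0.95
mask_r = prune_by_rank(scores, 2)         # keeps 0.95 and 0.9
```

The threshold variant yields a subset of variable size; the rank variant guarantees a fixed subset size, which is convenient when a downstream solver needs a minimum number of correspondences.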
  28. The apparatus according to any one of claims 16 to 27, wherein the apparatus further comprises:
    a training unit, configured to, before the matching subset is screened out from the initial matching set through the at least one pruning module, train the pruning module by using a supervised dataset to obtain a training result; evaluate the training result through a binary classification loss function with an adaptive temperature; and update parameters of the pruning module by minimizing the binary classification loss function.
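One plausible form of the temperature-scaled binary classification loss mentioned in claim 28 is binary cross-entropy over a temperature-scaled sigmoid. This is an assumption about the loss shape, not the claimed formulation: here the temperature is a fixed parameter, whereas the claim has it adapted during training.

```python
import numpy as np

def temp_scaled_bce(logits, labels, temperature):
    """Binary cross-entropy where the sigmoid is sharpened (T < 1) or
    softened (T > 1) by a temperature before computing the loss."""
    p = 1.0 / (1.0 + np.exp(-logits / temperature))
    eps = 1e-12                      # numerical guard for log(0)
    return -np.mean(labels * np.log(p + eps)
                    + (1 - labels) * np.log(1 - p + eps))

# Toy per-match logits with correct binary labels (1 = correct match).
logits = np.array([2.0, -1.5, 0.3, -0.2])
labels = np.array([1.0, 0.0, 1.0, 0.0])

loss_sharp = temp_scaled_bce(logits, labels, temperature=0.5)
loss_soft = temp_scaled_bce(logits, labels, temperature=2.0)
```

With correctly signed logits, a lower temperature pushes predictions toward 0/1 and lowers the loss; adapting the temperature lets training control how hard the classifier commits as optimization progresses.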
  29. The apparatus according to any one of claims 16 to 28, wherein the apparatus further comprises a determination unit and a computation unit;
    the determination unit is configured to, before the screening unit screens out the matching subset from the initial matching set through the at least one pruning module, determine, according to the image task related to the image pair, a constraint relationship used by the parametric transformation model, wherein the constraint relationship comprises an epipolar geometry constraint or a reprojection error;
    the computation unit is configured to, after the screening unit screens out the matching subset from the initial matching set through the at least one pruning module, compute model parameters of the parametric transformation model by using the matching subset in a case where the parametric transformation model uses the constraint relationship.
  30. The apparatus according to any one of claims 16 to 29, wherein the image task comprises any one of a line fitting task, a wide-baseline image matching task, an image localization task, an image stitching task, a three-dimensional reconstruction task, and a camera pose estimation task.
  31. An electronic device, comprising a processor and a memory, wherein the memory is configured to store a computer program, the computer program comprises program instructions, and the processor is configured to invoke the program instructions to perform the method according to any one of claims 1 to 15.
  32. A computer-readable storage medium, storing a computer program, wherein the computer program comprises program instructions which, when executed by a processor, cause the processor to perform the method according to any one of claims 1 to 15.
  33. A computer program product, comprising a non-transitory computer-readable storage medium storing a computer program, wherein the computer program is operable to cause a computer to perform the method according to any one of claims 1 to 15.
PCT/CN2021/095170 2020-12-31 2021-05-21 Matching screening method and apparatus, electronic device, storage medium and computer program WO2022142084A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011641201.1 2020-12-31
CN202011641201.1A CN112712123B (zh) 2020-12-31 2020-12-31 Matching screening method and apparatus, electronic device, and computer-readable storage medium

Publications (1)

Publication Number Publication Date
WO2022142084A1 true WO2022142084A1 (zh) 2022-07-07

Family

ID=75547994

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/095170 WO2022142084A1 (zh) 2020-12-31 2021-05-21 Matching screening method and apparatus, electronic device, storage medium and computer program

Country Status (3)

Country Link
CN (1) CN112712123B (zh)
TW (1) TWI776718B (zh)
WO (1) WO2022142084A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112712123B (zh) * 2020-12-31 2022-02-22 上海商汤科技开发有限公司 Matching screening method and apparatus, electronic device, and computer-readable storage medium

Citations (4)

Publication number Priority date Publication date Assignee Title
CN103544732A (zh) * 2013-09-29 2014-01-29 北京空间飞行器总体设计部 Three-dimensional reconstruction method for a lunar rover
US20150016723A1 (en) * 2012-01-02 2015-01-15 Telecom Italia S.P.A. Method and system for comparing images
CN110728296A (zh) * 2019-09-03 2020-01-24 华东师范大学 Two-step random sample consensus method and system for accelerating feature point matching
CN112712123A (zh) * 2020-12-31 2021-04-27 上海商汤科技开发有限公司 Matching screening method and apparatus, electronic device, and computer-readable storage medium

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
CN112102475B (zh) * 2020-09-04 2023-03-07 西北工业大学 Three-dimensional sparse reconstruction method for space targets based on image sequence trajectory tracking


Also Published As

Publication number Publication date
CN112712123B (zh) 2022-02-22
TW202228018A (zh) 2022-07-16
TWI776718B (zh) 2022-09-01
CN112712123A (zh) 2021-04-27

Similar Documents

Publication Publication Date Title
CN109522942B (zh) Image classification method and apparatus, terminal device and storage medium
US11670071B2 (en) Fine-grained image recognition
US11232286B2 (en) Method and apparatus for generating face rotation image
CN108427927B (zh) Target re-identification method and apparatus, electronic device, program and storage medium
CN109902548B (zh) Object attribute recognition method and apparatus, computing device and system
CN112288011B (zh) Image matching method based on self-attention deep neural network
US10984272B1 (en) Defense against adversarial attacks on neural networks
CN111627065A (zh) Visual localization method and apparatus, and storage medium
CN112257808B (zh) Ensemble co-training method and apparatus for zero-shot classification, and terminal device
US20220292728A1 (en) Point cloud data processing method and device, computer device, and storage medium
WO2021169160A1 (zh) Image normalization processing method and apparatus, and storage medium
JP2010157118A (ja) Pattern identification device, learning method for pattern identification device, and computer program
WO2023185925A1 (zh) Data processing method and related apparatus
CN114092963A (zh) Keypoint detection and model training method, apparatus, device and storage medium
CN114586078A (zh) Hand pose estimation method, apparatus, device and computer storage medium
CN112364747A (zh) Target detection method under limited samples
WO2022100607A1 (zh) Neural network structure determination method and apparatus
WO2022142084A1 (zh) Matching screening method and apparatus, electronic device, storage medium and computer program
US20230401670A1 (en) Multi-scale autoencoder generation method, electronic device and readable storage medium
CN111008992A (zh) Target tracking method, apparatus and system, and storage medium
CN115331021A (zh) Dynamic feature extraction and description method based on multi-layer feature self-difference fusion
CN113283469A (zh) Graph-embedding unsupervised feature learning method for view-based three-dimensional model retrieval
Yang et al. UP-Net: unique keyPoint description and detection net
CN114331827B (zh) Style transfer method, apparatus, device and storage medium
CN117152537A (zh) Image classification model construction and training method based on hard-easy sample association learning

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21912845

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21912845

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 24/11/2023)