CN115984587A - Image matching method combining mixed-scale feature descriptors and neighbor consistency - Google Patents

Image matching method combining mixed-scale feature descriptors and neighbor consistency

Info

Publication number
CN115984587A
Authority
CN
China
Prior art keywords
feature
matching
attention
descriptor
matrix
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211500472.4A
Other languages
Chinese (zh)
Inventor
Du Songlin (杜松林)
Li Dongyue (李东岳)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Institute Of Southeast University
Southeast University
Original Assignee
Shenzhen Institute Of Southeast University
Southeast University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Institute Of Southeast University, Southeast University filed Critical Shenzhen Institute Of Southeast University
Priority to CN202211500472.4A priority Critical patent/CN115984587A/en
Publication of CN115984587A publication Critical patent/CN115984587A/en
Pending legal-status Critical Current

Abstract

The invention discloses an image matching method combining mixed-scale feature descriptors and neighbor consistency. Features are passed through feature description networks on separate branches, and the resulting single-scale and multi-scale feature descriptors are concatenated along the feature dimension to produce mixed-scale feature descriptors; the mixed-scale descriptors are fed into an optimal-transport matching layer to obtain an initial assignment matrix. The initial matching point pairs are then passed through a shared-weight graph neural network that refines the assignment matrix to yield the final matches. By fusing single-scale and multi-scale descriptors, the mixed descriptors remain robust to a variety of geometric deformations while staying highly distinctive; at the same time, a geometric prior is used to remove incorrect matching point pairs, finally achieving highly accurate matching.

Description

Image matching method combining mixed-scale feature descriptors and neighbor consistency
Technical Field
The invention belongs to the technical field of computer vision based on deep learning, and mainly relates to an image matching method combining mixed-scale feature descriptors and neighbor consistency.
Background
Image feature matching establishes point-to-point correspondences between two-dimensional views of the same three-dimensional scene, and is the foundation of many downstream three-dimensional computer vision tasks, including three-dimensional reconstruction, visual localization, structure from motion (SfM), and simultaneous localization and mapping (SLAM). Given a pair of images, the conventional feature matching pipeline consists of (1) feature detection, (2) feature description, (3) feature matching, and (4) outlier rejection.
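For context only, this conventional pipeline can be sketched in a few lines of OpenCV, using SIFT for detection and description, a ratio test for matching, and RANSAC on the fundamental matrix for outlier rejection; file names and thresholds are illustrative, and this classical baseline is not part of the claimed method.

```python
# Minimal sketch of the classical pipeline: (1) detection, (2) description,
# (3) matching, (4) outlier rejection. Paths and thresholds are illustrative.
import cv2
import numpy as np

img1 = cv2.imread("view1.png", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("view2.png", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)   # (1) detection + (2) description
kp2, des2 = sift.detectAndCompute(img2, None)

knn = cv2.BFMatcher(cv2.NORM_L2).knnMatch(des1, des2, k=2)    # (3) matching
good = [m for m, n in knn if m.distance < 0.75 * n.distance]  # Lowe ratio test

src = np.float32([kp1[m.queryIdx].pt for m in good])
dst = np.float32([kp2[m.trainIdx].pt for m in good])
F, inliers = cv2.findFundamentalMat(src, dst, cv2.FM_RANSAC, 3.0)  # (4) RANSAC
```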
Early feature matching methods relied on hand-designed feature point extractors and descriptors, with some success. In recent years, deep learning methods have adopted a data-driven strategy and obtain descriptors that are more robust to changes in illumination and viewpoint; convolutional neural networks were first adopted as the tool for detecting and describing feature points. More recently, in order to enlarge the receptive field and aggregate wider context, Transformers have been widely used in feature matching. Detector-free methods tend first to establish dense matches between views and then to refine the extracted reliable matches. However, fine-grained detail is lost when the features extracted by a convolutional neural network undergo multiple layers of downsampling, so correct matches cannot be established on small objects in the scene. How to learn descriptors that retain rich fine-grained detail while remaining robust to a variety of geometric deformations has become an urgent problem for those skilled in the art.
Disclosure of Invention
Aiming at the above defects of detector-free methods in the prior art, the invention provides an image matching method combining mixed-scale feature descriptors and neighbor consistency. By fusing single-scale and multi-scale feature descriptors, the invention avoids the loss of detail caused by the downsampling operations in a convolutional neural network; at the same time, neighbor consistency is taken into account to guarantee the geometric consistency of the matches, finally achieving highly accurate matching.
In order to achieve this purpose, the invention adopts the following technical scheme: an image matching method combining mixed-scale feature descriptors and neighbor consistency, in which the images pass sequentially through networks based on mixed convolution-attention and on enhanced self-attention, the feature descriptors of different scales are concatenated along the feature dimension to obtain an initial assignment matrix, and the initial matching point pairs are passed through a shared-weight graph neural network that corrects the assignment matrix to realize image matching.
As an improvement of the invention, the method comprises the following steps:
s1, feature extraction: extracting features at different resolutions, through an FPN network, from input original pictures of the same scene shot from different perspectives; the resulting feature maps carry different spatial resolutions and semantic information, and the feature maps at 1/2 and 1/8 of the original resolution are used for the feature description of the next steps (a minimal sketch of this extraction follows the step list);
s2, single-scale feature description: after position encoding, the 1/8-size feature map obtained in step S1 is input into a neural network based on mixed convolution and attention to obtain single-scale feature descriptors; a convolution branch is added to the hybrid self-attention layer of this network while the cross-attention layer is kept unchanged; the convolution branch of the hybrid self-attention layer restores the local geometric structure of the original image, the attention branch performs information interaction within each feature, and the cross-attention layer realizes information interaction between different features and updates the features at each layer;
s3, multi-scale feature description: the original pictures shot from different viewpoints in step S1 are input into a network based on enhanced self-attention, which outputs multi-scale feature descriptors; in the enhanced self-attention, the key matrix (K) and the value matrix (V) are downsampled at different ratios in different self-attention heads, and each self-attention head performs information transfer of features at a different scale, generating the multi-scale feature descriptors;
s4, fusing features of different scales: concatenating the single-scale feature descriptors obtained in step S2 and the multi-scale feature descriptors obtained in step S3 along the feature dimension;
s5, inputting the mixed-scale descriptors obtained in step S4 into an optimal matching layer to obtain an initial assignment matrix, and selecting initial matching point pairs based on a set threshold;
s6, filtering outliers by neighbor consistency: modeling the initial matching point pairs obtained in step S5 as a graph structure, inputting it into a shared-weight graph neural network, and using the output of the graph neural network to correct the initial assignment matrix and obtain new matching point pairs;
s7, match refinement: inputting the 1/2-size feature map obtained in step S1 and the mixed descriptors obtained in step S4 into a fully-connected neural network to obtain an enhanced 1/2-size feature map; inputting the resulting feature map and the pixel-accurate new matching point pairs obtained in step S6 into a matching refinement network and outputting the final matches at sub-pixel accuracy, whereby a complete image matching model is constructed and image matching is realized.
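As referenced in step S1, a minimal PyTorch sketch of the two-resolution extraction is given below. The backbone depth, channel widths, and grayscale input are illustrative assumptions; the patent specifies only that an FPN produces 1/2- and 1/8-resolution maps.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyFPN(nn.Module):
    """Illustrative FPN returning 1/2- and 1/8-resolution feature maps."""
    def __init__(self, dims=(64, 128, 192, 256)):
        super().__init__()
        d1, d2, d3, d4 = dims
        self.c1 = nn.Sequential(nn.Conv2d(1, d1, 3, 2, 1), nn.ReLU())   # 1/2
        self.c2 = nn.Sequential(nn.Conv2d(d1, d2, 3, 2, 1), nn.ReLU())  # 1/4
        self.c3 = nn.Sequential(nn.Conv2d(d2, d3, 3, 2, 1), nn.ReLU())  # 1/8
        self.lat1 = nn.Conv2d(d1, d4, 1)   # lateral 1x1 projections
        self.lat3 = nn.Conv2d(d3, d4, 1)

    def forward(self, x):
        f2 = self.c1(x)
        f8 = self.c3(self.c2(f2))
        coarse = self.lat3(f8)                       # 1/8 map, used in S2/S3
        up = F.interpolate(coarse, scale_factor=4.0,
                           mode="bilinear", align_corners=False)
        fine = self.lat1(f2) + up                    # 1/2 map, used in S7
        return fine, coarse

fine, coarse = TinyFPN()(torch.randn(1, 1, 840, 840))  # (1,256,420,420), (1,256,105,105)
```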
As an improvement of the present invention, in step S2, the 1/8-size feature maps are position-encoded and rearranged into one-dimensional tensors, and the single-scale feature descriptors are obtained via the hybrid self-attention layers (fusing convolution and self-attention) and the cross-attention layers.
As an improvement of the present invention, the training process of the neural network based on mixed convolution and attention in step S2 is specifically as follows:
the hybrid self-attention mechanism and the cross-attention mechanism are used alternately at different layers of the network; when the hybrid self-attention mechanism is used, the similarity between pixels is learned within one feature map; when the cross-attention mechanism is used, the similarity of pixels between the two feature maps is learned; finally, the information passed between network layers is obtained through one layer of a fully-connected neural network.
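A minimal PyTorch sketch of this alternation is given below. The patent does not specify how the convolution branch and the attention branch are fused inside the hybrid self-attention layer; concatenation followed by a small feed-forward network is an assumption, and all dimensions are illustrative.

```python
import torch
import torch.nn as nn

class HybridSelfAttention(nn.Module):
    """Attention branch (intra-feature interaction) plus a depthwise-conv
    branch (local geometric structure); fusion by concat + FFN is assumed."""
    def __init__(self, dim=256, heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.conv = nn.Conv2d(dim, dim, 3, padding=1, groups=dim)
        self.ffn = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU(),
                                 nn.Linear(dim, dim))

    def forward(self, x, hw):                       # x: (B, H*W, C)
        h, w = hw
        a, _ = self.attn(x, x, x)                   # attention branch
        c = self.conv(x.transpose(1, 2).reshape(x.size(0), -1, h, w))
        c = c.flatten(2).transpose(1, 2)            # conv branch, back to (B, H*W, C)
        return x + self.ffn(torch.cat([a, c], dim=-1))

class CrossAttention(nn.Module):
    """Queries from one image, keys/values from the other."""
    def __init__(self, dim=256, heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x, y):
        out, _ = self.attn(x, y, y)
        return x + out

# Alternation over layers, as described in the text:
#   xA = hybrid(xA, hw); xB = hybrid(xB, hw)   # similarity within a feature map
#   xA = cross(xA, xB);  xB = cross(xB, xA)    # similarity between feature maps
```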
As another improvement of the present invention, the step S3 further includes:
s31: the key matrix (K) and the value matrix (V) are downsampled at different ratios in different self-attention heads,
K_i = MTA(XW_i^K, r_i),
V_i = MTA(XW_i^V, r_i),
V_i = V_i + LE(V_i),
where X denotes the input feature, W_i^K and W_i^V denote linear mapping matrices, r_i denotes the downsampling ratio of the i-th feature head, MTA(·) denotes the multi-scale aggregation operation, and LE(·) is a convolutional neural network;
s32: information transfer is performed with the query matrix (Q) and the key matrix (K) and value matrix (V) obtained in step S31,
Attention(Q_i, K_i, V_i) = softmax(Q_i K_i^T / √d_h) V_i,
where d_h denotes the feature dimension of each feature head.
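A sketch of one enhanced self-attention head follows. The internals of MTA(·) are not given in the text, so the sketch approximates it with strided average pooling; LE(·) is realized as a depthwise convolution, and all sizes are illustrative.

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiScaleHead(nn.Module):
    """One head: K and V downsampled by this head's ratio r_i, V locally
    enhanced by LE; MTA is approximated here by average pooling."""
    def __init__(self, dim=128, d_h=16, r_i=2):
        super().__init__()
        self.q = nn.Linear(dim, d_h)
        self.k = nn.Linear(dim, d_h)
        self.v = nn.Linear(dim, d_h)
        self.le = nn.Conv2d(d_h, d_h, 3, padding=1, groups=d_h)  # LE(.)
        self.r, self.d_h = r_i, d_h

    def forward(self, x, hw):                        # x: (B, H*W, dim)
        h, w = hw
        B = x.size(0)
        def down(t):                                 # stand-in for MTA(., r_i)
            t = t.transpose(1, 2).reshape(B, self.d_h, h, w)
            return F.avg_pool2d(t, self.r, self.r)
        Q = self.q(x)                                # (B, HW, d_h)
        K = down(self.k(x)).flatten(2).transpose(1, 2)
        Vm = down(self.v(x))
        Vm = Vm + self.le(Vm)                        # V_i = V_i + LE(V_i)
        V = Vm.flatten(2).transpose(1, 2)
        attn = torch.softmax(Q @ K.transpose(1, 2) / math.sqrt(self.d_h), -1)
        return attn @ V                              # softmax(Q K^T / sqrt(d_h)) V
```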
As another improvement of the present invention, the concatenation along the feature dimension in step S4 is specifically: the 256-dimensional single-scale feature descriptors and the 128-dimensional multi-scale feature descriptors are concatenated along the feature dimension to obtain 384-dimensional feature descriptors.
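In code this fusion is a single concatenation; the batch and point counts below are illustrative.

```python
import torch

desc_single = torch.randn(1, 4800, 256)  # single-scale descriptors from S2
desc_multi = torch.randn(1, 4800, 128)   # multi-scale descriptors from S3
desc_mixed = torch.cat([desc_single, desc_multi], dim=-1)  # (1, 4800, 384)
```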
As another improvement of the present invention, step S5 is specifically: first, the similarity matrix between the two mixed descriptors F^A and F^B is computed,
S(i, j) = ⟨F^A_i, F^B_j⟩ / τ,
where τ is a constant and ⟨·,·⟩ denotes the inner product; the similarity matrix serves as the cost matrix of a partial assignment problem, and solving this partial assignment problem yields the optimal confidence assignment matrix, from which the initial matches are obtained.
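A minimal sketch of the optimal matching layer follows. Iterated row/column normalization (Sinkhorn-style) is one standard way to approximate the assignment solution; a full implementation of a *partial* assignment would also add a dustbin row and column, which is omitted here, and τ, the iteration count, and the confidence threshold are illustrative.

```python
import torch

def assignment_matrix(descA, descB, tau=0.1, iters=20):
    """S(i,j) = <f_i, f_j> / tau, then Sinkhorn-style normalization to
    approximate the optimal confidence assignment matrix (sketch only)."""
    logP = torch.einsum("nd,md->nm", descA, descB) / tau
    for _ in range(iters):                 # alternate row/column normalization
        logP = logP - torch.logsumexp(logP, dim=1, keepdim=True)
        logP = logP - torch.logsumexp(logP, dim=0, keepdim=True)
    return logP.exp()

# Initial matches: mutual nearest neighbours above a set threshold.
P = assignment_matrix(torch.randn(500, 384), torch.randn(600, 384))
mutual = (P == P.max(1, keepdim=True).values) & (P == P.max(0, keepdim=True).values)
matches = ((P > 0.2) & mutual).nonzero()   # (k, 2) index pairs
```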
As a further improvement of the present invention, in step S6, the sparse descriptors of the corresponding point pairs are extracted and a sparse similarity matrix P is computed by inner products; the correspondence of the point sets between the images can be regarded as the correspondence of nodes in a graph structure, from which the node matrices R_A, R_B and the edge matrices E_A, E_B are constructed, where each node retains only the edges to the two nodes most similar to it; via a graph neural network with shared parameters,
d_A = Ψ(R_A, E_A),
d_B = Ψ(R_B, E_B),
where Ψ is the graph neural network; the difference between d_A and d_B can be used to correct the initial assignment matrix, obtaining new pixel-accurate matches that satisfy neighbor consistency.
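The architecture of Ψ is not detailed in the text; the sketch below uses one round of message passing over each node's two most similar neighbors, with shared weights applied to both images' graphs. Dimensions and the final consistency score are illustrative assumptions.

```python
import torch
import torch.nn as nn

class MatchGNN(nn.Module):
    """Shared-weight graph network Psi; R holds node (sparse descriptor)
    features, E the indices of each node's two most similar neighbours."""
    def __init__(self, dim=384):
        super().__init__()
        self.msg = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU(),
                                 nn.Linear(dim, dim))

    def forward(self, R, E):
        nbr = R[E]                                            # (N, 2, dim)
        own = R.unsqueeze(1).expand_as(nbr)
        return R + self.msg(torch.cat([own, nbr], -1)).mean(1)

def two_nn_edges(R):
    """Each node keeps only edges to its two most similar other nodes."""
    sim = R @ R.t()
    sim.fill_diagonal_(float("-inf"))
    return sim.topk(2, dim=1).indices                         # (N, 2)

RA = torch.randn(300, 384)   # sparse descriptors of matched points, image A
RB = torch.randn(300, 384)   # corresponding points in image B (same row order)
psi = MatchGNN()
dA = psi(RA, two_nn_edges(RA))            # d_A = Psi(R_A, E_A)
dB = psi(RB, two_nn_edges(RB))            # d_B = Psi(R_B, E_B)
consistency = (dA - dB).norm(dim=-1)      # large values flag inconsistent pairs
```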
As a further improvement of the present invention, in step S7, on the enhanced 1/2-original-size feature map a 5 × 5 local window is cropped around each matching point; after the windows are serialized, local fine-grained descriptors are obtained through the single-scale feature description network of step S2, and the peak response of the descriptor at each matching point over the local fine-grained descriptor in the other image is computed, giving the final matching result at sub-pixel accuracy.
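The refinement network itself is not spelled out; the sketch below implements one common reading of the "peak response" step, correlating the center descriptor of each window in one image against the corresponding 5 × 5 window in the other and taking the soft-argmax (expectation over the response) as the sub-pixel correction. It assumes matches lie at least half a window from the image border and omits the fine-grained description network of step S2.

```python
import torch

def refine_matches(featA, featB, ptsA, ptsB, win=5):
    """featA/featB: (C, H, W) enhanced 1/2-resolution maps; ptsA/ptsB: (K, 2)
    integer (x, y) match coordinates. Returns sub-pixel points in image B."""
    half = win // 2
    refined = []
    for (xa, ya), (xb, yb) in zip(ptsA.tolist(), ptsB.tolist()):
        wa = featA[:, ya - half:ya + half + 1, xa - half:xa + half + 1]
        wb = featB[:, yb - half:yb + half + 1, xb - half:xb + half + 1]
        center = wa[:, half, half]                       # query descriptor
        resp = torch.einsum("c,chw->hw", center, wb)     # response map
        prob = resp.flatten().softmax(0).view(win, win)
        ys, xs = torch.meshgrid(torch.arange(win), torch.arange(win),
                                indexing="ij")
        dy = (prob * (ys - half)).sum()                  # expectation over the
        dx = (prob * (xs - half)).sum()                  # response ("soft" peak)
        refined.append((xb + dx.item(), yb + dy.item()))
    return refined
```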
Compared with the prior art, the invention has the following beneficial effects:
1. the invention improves the detector-free feature matching method and supplements the single-scale feature descriptors with local geometric structure information.
2. The invention combines the mixed scale feature descriptors, thereby not only enhancing the significance of the feature descriptors, but also keeping the robustness of the descriptors to illumination and visual angle transformation.
3. The invention designs a novel outlier filtering method, which is used for detecting whether the obtained initial matching has neighbor consistency or not, enhancing the reliability of the matching result and having wide application prospects in the fields of three-dimensional reconstruction, visual positioning, navigation and the like.
Drawings
FIG. 1 is a flow chart of the steps of the method of the present invention;
fig. 2 is a schematic diagram of the picture matching obtained with the method of the present invention in embodiment 2.
Detailed Description
The present invention will be further illustrated with reference to the accompanying drawings and specific embodiments, which are to be understood as merely illustrative of the invention and not as limiting the scope of the invention.
Example 1
An image matching method combining mixed-scale feature descriptors and neighbor consistency, as shown in fig. 1, comprises the following steps:
s1, feature extraction: extracting features at different resolutions, through an FPN network, from input pictures of the same scene shot from different perspectives; the extracted feature maps carry different spatial resolutions and semantic information, and the feature maps at 1/2 and 1/8 of the original resolution are used for the feature description of the next steps;
s2, single-scale feature description: after position encoding, the 1/8-size feature map obtained in step S1 is input into a neural network based on mixed convolution and attention to obtain single-scale feature descriptors; a convolution branch is added to the hybrid self-attention layer of this network while the cross-attention layer is kept unchanged; the convolution branch of the hybrid self-attention layer restores the local geometric structure of the original image, the attention branch performs information interaction within each feature, and the cross-attention layer realizes information interaction between different features and updates the features at each layer;
s3, multi-scale feature description: the original pictures shot from different viewpoints in step S1 are input into a network based on enhanced self-attention, which outputs multi-scale feature descriptors; in the enhanced self-attention, the key matrix (K) and the value matrix (V) are downsampled at different ratios in different self-attention heads, and each self-attention head performs information transfer of features at a different scale, generating the multi-scale feature descriptors;
s4, fusing features of different scales: concatenating the single-scale feature descriptors obtained in step S2 and the multi-scale feature descriptors obtained in step S3 along the feature dimension;
s5, inputting the mixed-scale descriptors obtained in step S4 into an optimal matching layer to obtain an initial assignment matrix, and selecting initial matching point pairs based on a set threshold;
s6, neighbor consistency filtering of outliers: the sparse descriptors of the corresponding point pairs are extracted and a sparse similarity matrix P is computed by inner products; regarding the correspondence of the point sets between the images as the correspondence of nodes in a graph structure, the node matrices R_A, R_B and the edge matrices E_A, E_B are constructed, where each node retains only the edges to the two nodes most similar to it; via a graph neural network with shared parameters,
d_A = Ψ(R_A, E_A),
d_B = Ψ(R_B, E_B),
where Ψ is the graph neural network; the difference between d_A and d_B can be used to correct the initial assignment matrix, obtaining new pixel-accurate matches that satisfy neighbor consistency.
s7, inputting the 1/2-size feature map obtained in step S1 and the mixed descriptors obtained in step S4 into a fully-connected neural network to obtain an enhanced 1/2-size feature map; inputting the resulting feature map and the pixel-accurate new matching point pairs obtained in step S6 into a matching refinement network and outputting the final matches at sub-pixel accuracy.
Example 2
An image matching method combining mixed-scale feature descriptors and neighbor consistency comprises the following steps:
s1: and performing feature extraction on the input image pair, and obtaining feature maps with different resolution sizes by the input image pair through a feature extraction network.
The experimental dataset is MegaDepth, which consists of one million internet images of 196 different outdoor scenes.
Each picture is first cropped to 840 × 840 and converted to grayscale format as input.
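The preprocessing of this embodiment can be reproduced as below; a centre crop is one interpretation of "cropped to 840 × 840", and the helper name is illustrative.

```python
import cv2

def load_input(path, size=840):
    """Grayscale conversion and centre crop to size x size (one reading of
    the embodiment's preprocessing; assumes the image is at least that big)."""
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    h, w = img.shape
    top, left = (h - size) // 2, (w - size) // 2
    return img[top:top + size, left:left + size]
```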
S2: based on the neural network with mixed convolution and attention, the position-encoded 1/8-original-size feature map obtained by the feature extraction of step S1 is used as input; the network outputs single-scale feature descriptors of dimension 256.
S3: the network based on enhanced self-attention is trained, using the original images of step S1 as input; the output multi-scale descriptors have dimension 128;
S4: the single-scale feature descriptors obtained in step S2 and the multi-scale feature descriptors obtained in step S3 are concatenated along the feature dimension.
S5: the mixed-scale descriptors obtained in step S4 are input into the optimal matching layer to obtain an initial assignment matrix, and initial matching point pairs are selected based on a set threshold.
First, the similarity matrix between the two mixed descriptors F^A and F^B is computed,
S(i, j) = ⟨F^A_i, F^B_j⟩ / τ,
where τ is a constant and ⟨·,·⟩ denotes the inner product; the similarity matrix serves as the cost matrix of a partial assignment problem, and solving this partial assignment problem yields the optimal confidence assignment matrix, from which the initial matches are obtained.
S6: the sparse descriptors of the corresponding point pairs are extracted and a sparse similarity matrix P is computed by inner products; regarding the correspondence of the point sets between the images as the correspondence of nodes in a graph structure, the node matrices R_A, R_B and the edge matrices E_A, E_B are constructed, where each node retains only the edges to the two nodes most similar to it; via a graph neural network with shared parameters,
d_A = Ψ(R_A, E_A),
d_B = Ψ(R_B, E_B),
where Ψ is the graph neural network; the difference between d_A and d_B can be used to correct the initial assignment matrix, obtaining new pixel-accurate matches that satisfy neighbor consistency.
S7: the 1/2-size feature map obtained in step S1 and the mixed descriptors obtained in step S4 are input into a fully-connected neural network to obtain an enhanced 1/2-size feature map; the resulting feature map and the pixel-accurate new matching point pairs obtained in step S6 are input into the matching refinement network, which outputs the final matches at sub-pixel accuracy. FIG. 2 shows all the matching results obtained with the method of the invention; the method accurately matches small-scale objects in the image and is strongly robust to scale and viewpoint changes.
In summary, an image matching method combining mixed-scale feature descriptors and neighbor consistency is provided; the image matching model trained with this method can run on a computer or other device and outputs a sub-pixel-accurate set of corresponding points from an input pair of original images, and it is widely applicable in fields such as three-dimensional reconstruction, visual localization and navigation, and multi-object tracking.
It should be noted that the above contents only illustrate the technical idea of the present invention and do not thereby limit its scope of protection; it will be obvious to those skilled in the art that several modifications and refinements can be made without departing from the principle of the present invention, and these modifications and refinements fall within the protection scope of the claims of the present invention.

Claims (9)

1. An image matching method combining mixed-scale feature descriptors and neighbor consistency, characterized in that: the method passes sequentially through networks based on mixed convolution-attention and on enhanced self-attention, concatenates the feature descriptors of different scales along the feature dimension to obtain an initial assignment matrix, and, after the initial matching point pairs pass through a shared-weight graph neural network, corrects the assignment matrix to realize image matching.
2. The image matching method combining mixed-scale feature descriptors and neighbor consistency according to claim 1, characterized by comprising the following steps:
s1, feature extraction: extracting features at different resolutions, through an FPN network, from input original pictures of the same scene shot from different perspectives, the resulting feature maps carrying different spatial resolutions and semantic information, wherein the feature maps at 1/2 and 1/8 of the original resolution are used for the feature description of the next steps;
s2, single-scale feature description: after position encoding, the 1/8-size feature map obtained in step S1 is input into a neural network based on mixed convolution and attention to obtain single-scale feature descriptors; a convolution branch is added to the hybrid self-attention layer of this network while the cross-attention layer is kept unchanged; the convolution branch of the hybrid self-attention layer restores the local geometric structure of the original image, the attention branch performs information interaction within each feature, and the cross-attention layer realizes information interaction between different features and updates the features at each layer;
s3, multi-scale feature description: the original pictures shot from different viewpoints in step S1 are input into a network based on enhanced self-attention, which outputs multi-scale feature descriptors; in the enhanced self-attention, the key matrix (K) and the value matrix (V) are downsampled at different ratios in different self-attention heads, and each self-attention head performs information transfer of features at a different scale, generating the multi-scale feature descriptors;
s4, fusing features of different scales: concatenating the single-scale feature descriptors obtained in step S2 and the multi-scale feature descriptors obtained in step S3 along the feature dimension;
s5, inputting the mixed-scale descriptors obtained in step S4 into an optimal matching layer to obtain an initial assignment matrix, and selecting initial matching point pairs based on a set threshold;
s6, filtering outliers by neighbor consistency: modeling the initial matching point pairs obtained in step S5 as a graph structure, inputting it into a shared-weight graph neural network, and using the output of the graph neural network to correct the initial assignment matrix and obtain new matching point pairs;
s7, match refinement: inputting the 1/2-size feature map obtained in step S1 and the mixed descriptors obtained in step S4 into a fully-connected neural network to obtain an enhanced 1/2-size feature map; inputting the resulting feature map and the pixel-accurate new matching point pairs obtained in step S6 into a matching refinement network and outputting the final matches at sub-pixel accuracy, whereby a complete image matching model is constructed and image matching is realized.
3. The image matching method combining mixed-scale feature descriptors and neighbor consistency according to claim 2, characterized in that: in step S2, the 1/8-size feature maps are position-encoded and rearranged into one-dimensional tensors, and the single-scale feature descriptors are obtained via the hybrid self-attention layers (fusing convolution and self-attention) and the cross-attention layers.
4. The image matching method combining mixed-scale feature descriptors and neighbor consistency according to claim 3, characterized in that the training process of the neural network based on mixed convolution and attention in step S2 is specifically as follows: the hybrid self-attention mechanism and the cross-attention mechanism are used alternately at different layers of the network; when the hybrid self-attention mechanism is used, the similarity between pixels is learned within one feature map; when the cross-attention mechanism is used, the similarity of pixels between the two feature maps is learned; finally, the information passed between network layers is obtained through one layer of a fully-connected neural network.
5. The image matching method combining mixed-scale feature descriptors and neighbor consistency according to claim 3, characterized in that step S3 further comprises:
s31: the key matrix (K) and the value matrix (V) are downsampled at different ratios in different self-attention heads,
K_i = MTA(XW_i^K, r_i),
V_i = MTA(XW_i^V, r_i),
V_i = V_i + LE(V_i),
where X denotes the input feature, W_i^K and W_i^V denote linear mapping matrices, r_i denotes the downsampling ratio of the i-th feature head, MTA(·) denotes the multi-scale aggregation operation, and LE(·) is a convolutional neural network;
s32: information transfer is performed with the query matrix (Q) and the key matrix (K) and value matrix (V) obtained in step S31,
Attention(Q_i, K_i, V_i) = softmax(Q_i K_i^T / √d_h) V_i,
where d_h denotes the feature dimension of each feature head.
6. The image matching method combining mixed-scale feature descriptors and neighbor consistency according to claim 4 or 5, characterized in that the concatenation along the feature dimension in step S4 is specifically: the 256-dimensional single-scale feature descriptors and the 128-dimensional multi-scale feature descriptors are concatenated along the feature dimension to obtain 384-dimensional feature descriptors.
7. The image matching method combining mixed-scale feature descriptors and neighbor consistency according to claim 6, characterized in that step S5 is specifically: first, the similarity matrix between the two mixed descriptors F^A and F^B is computed,
S(i, j) = ⟨F^A_i, F^B_j⟩ / τ,
where τ is a constant and ⟨·,·⟩ denotes the inner product; the similarity matrix serves as the cost matrix of a partial assignment problem, and solving this partial assignment problem yields the optimal confidence assignment matrix, from which the initial matches are obtained.
8. The image matching method combining mixed-scale feature descriptors and neighbor consistency according to claim 7, characterized in that: in step S6, the sparse descriptors of the corresponding point pairs are extracted and a sparse similarity matrix P is computed by inner products; the correspondence of the point sets between the images can be regarded as the correspondence of nodes in a graph structure, from which the node matrices R_A, R_B and the edge matrices E_A, E_B are constructed, where each node retains only the edges to the two nodes most similar to it; via a graph neural network with shared parameters,
d_A = Ψ(R_A, E_A),
d_B = Ψ(R_B, E_B),
where Ψ is the graph neural network; the difference between d_A and d_B can be used to correct the initial assignment matrix, obtaining new pixel-accurate matches that satisfy neighbor consistency.
9. The image matching method combining mixed-scale feature descriptors and neighbor consistency according to claim 8, characterized in that: in step S7, on the enhanced 1/2-original-size feature map a 5 × 5 local window is cropped around each matching point; after the windows are serialized, local fine-grained descriptors are obtained through the single-scale feature description network of step S2, and the peak response of the descriptor at each matching point over the local fine-grained descriptor in the other image is computed, giving the final matching result at sub-pixel accuracy.
CN202211500472.4A 2022-11-28 2022-11-28 Image matching method for combining consistency of mixed scale feature descriptors and neighbors Pending CN115984587A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211500472.4A CN115984587A (en) 2022-11-28 2022-11-28 Image matching method for combining consistency of mixed scale feature descriptors and neighbors

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211500472.4A CN115984587A (en) 2022-11-28 2022-11-28 Image matching method for combining consistency of mixed scale feature descriptors and neighbors

Publications (1)

Publication Number Publication Date
CN115984587A (en) 2023-04-18

Family

ID=85965489

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211500472.4A Pending CN115984587A (en) 2022-11-28 2022-11-28 Image matching method for combining consistency of mixed scale feature descriptors and neighbors

Country Status (1)

Country Link
CN (1) CN115984587A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116129228A (en) * 2023-04-19 2023-05-16 中国科学技术大学 Training method of image matching model, image matching method and device thereof


Similar Documents

Publication Publication Date Title
CN109377530B (en) Binocular depth estimation method based on depth neural network
CN111339903B (en) Multi-person human body posture estimation method
CN111968121B (en) Three-dimensional point cloud scene segmentation method based on instance embedding and semantic fusion
CN114969405B (en) Cross-modal image-text mutual detection method
CN111259945B (en) Binocular parallax estimation method introducing attention map
CN111860651B (en) Monocular vision-based semi-dense map construction method for mobile robot
CN113283525B (en) Image matching method based on deep learning
CN113345082B (en) Characteristic pyramid multi-view three-dimensional reconstruction method and system
CN112560865B (en) Semantic segmentation method for point cloud under outdoor large scene
CN111127538A (en) Multi-view image three-dimensional reconstruction method based on convolution cyclic coding-decoding structure
CN113313732A (en) Forward-looking scene depth estimation method based on self-supervision learning
CN115984587A (en) Image matching method for combining consistency of mixed scale feature descriptors and neighbors
CN110348299B (en) Method for recognizing three-dimensional object
CN115482268A (en) High-precision three-dimensional shape measurement method and system based on speckle matching network
Hughes et al. A semi-supervised approach to SAR-optical image matching
CN112906675B (en) Method and system for detecting non-supervision human body key points in fixed scene
CN110751271A (en) Image traceability feature characterization method based on deep neural network
CN112489186A (en) Automatic driving binocular data perception algorithm
CN111950476A (en) Deep learning-based automatic river channel ship identification method in complex environment
Chen et al. Monocular image depth prediction without depth sensors: An unsupervised learning method
CN115330935A (en) Three-dimensional reconstruction method and system based on deep learning
CN115564888A (en) Visible light multi-view image three-dimensional reconstruction method based on deep learning
CN112419387A (en) Unsupervised depth estimation method for tomato plant image in sunlight greenhouse
CN114612734B (en) Remote sensing image feature matching method and device, storage medium and computer equipment
Jung et al. Single image depth estimation with integration of parametric learning and non-parametric sampling

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination