CN111951319A - Image stereo matching method - Google Patents


Info

Publication number
CN111951319A
Authority
CN
China
Prior art keywords
image
feature
parallax
stereo matching
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010847540.9A
Other languages
Chinese (zh)
Inventor
周杰
李永强
郭振华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen International Graduate School of Tsinghua University
Original Assignee
Shenzhen International Graduate School of Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen International Graduate School of Tsinghua University filed Critical Shenzhen International Graduate School of Tsinghua University
Priority to CN202010847540.9A priority Critical patent/CN111951319A/en
Publication of CN111951319A publication Critical patent/CN111951319A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/50 Depth or shape recovery
    • G06T 7/55 Depth or shape recovery from multiple images
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G06N 3/047 Probabilistic or stochastic networks
    • G06N 3/08 Learning methods
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 Image acquisition modality
    • G06T 2207/10028 Range image; Depth image; 3D point clouds
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20081 Training; Learning
    • G06T 2207/20084 Artificial neural networks [ANN]

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The application belongs to the technical field of image processing, and particularly relates to an image stereo matching method. Existing stereo matching methods based on deep learning generalize poorly. One main reason is that data in the stereo matching field suffer from defects such as strong reflections and occlusions; on the other hand, images also contain texture-less regions, so methods that add neighborhood constraints are prone to overfitting. Meanwhile, high-quality data sets for stereo matching are relatively scarce, and a network with generalization capability is difficult to obtain through simple training. The application provides an image stereo matching method comprising the following steps: constructing a training image library; enhancing the training images; extracting point (line) features for all images; training a feature point extraction network; extracting unary features of the binocular images; obtaining a coarse disparity map with an iteration module; aggregating the unary features; performing disparity regression; and refining the disparity. The method addresses the overfitting problem in deep stereo matching algorithms.

Description

Image stereo matching method
Technical Field
The application belongs to the technical field of image processing, and particularly relates to an image stereo matching method.
Background
The stereo matching can be applied to scenes such as automatic driving and three-dimensional reconstruction. In the stereo matching task, occlusion, specular reflection and lack of texture strongly affect the matching result (for example, as shown in FIG. 1, the car glass in an outdoor road scene is reflective). Ideally, rectified binocular images are aligned in the vertical direction and differ only by a horizontal disparity. In practice, however, camera rectification is imperfect and the parameters of the two cameras are not fully consistent; moreover, the scenes seen from the two viewpoints differ in content and illumination. The images captured by the binocular cameras therefore show some discrepancies, which poses challenges for robust stereo matching.
Existing schemes that perform stereo matching with a deep network generally extract a feature for each pixel or pixel region in one image, compute its similarity with all pixels or pixel regions along the same horizontal line in the other image, and produce a disparity estimate end to end. Some methods also borrow ideas from traditional stereo matching and apply neighborhood constraints to the matching of pixels or pixel regions; typical examples add edge constraints, neighborhood constraints and the like.
Existing deep-learning stereo matching methods generalize poorly. One main reason is that data in the stereo matching field suffer from defects such as strong reflections and occlusions; on the other hand, images also contain texture-less regions, so methods that add neighborhood constraints are prone to overfitting. Meanwhile, high-quality stereo matching data sets are relatively scarce, and a network with generalization capability is difficult to obtain through simple training.
Disclosure of Invention
1. Technical problem to be solved
Existing deep-learning stereo matching methods generalize poorly. One main reason is that data in the stereo matching field suffer from defects such as strong reflections and occlusions; on the other hand, images also contain texture-less regions, so methods that add neighborhood constraints are prone to overfitting. Meanwhile, high-quality stereo matching data sets are relatively scarce, and a network with generalization capability is difficult to obtain through simple training.
2. Technical scheme
In order to achieve the above object, the present application provides an image stereo matching method, including the steps of:
step 1: constructing a training image library;
step 2: enhancing a training image;
step 3: extracting point (line) features for all images;
step 4: training a feature point extraction network;
step 5: extracting unary features of the binocular images;
step 6: obtaining a coarse disparity map with an iteration module;
step 7: aggregating the unary features;
step 8: performing disparity regression;
step 9: refining the disparity.
Another embodiment provided by the present application is: the image library in step 1 comprises binocular data and monocular data. Each binocular sample comprises a left image and a right image; the data sets used include the Sceneflow synthetic data set, the Kitti autonomous driving data set, the Middlebury data set, the Eth3d data set and other binocular data sets. The monocular data sets used include general image data sets (treated as monocular images) such as Imagenet and Coco.
Another embodiment provided by the present application is: the enhancing in step 2 comprises applying random brightness variations, occlusions or random offsets in the vertical direction.
Another embodiment provided by the present application is: the point features extracted in step 3 use traditional feature detection operators such as ORB and SIFT. Considering that point features perform poorly in texture-less regions, line feature detection is added, and the regions obtained by line detection are included in the set of feature points; the extracted feature points are then used to train the feature point extraction network.
Another embodiment provided by the present application is: the feature point extraction module in step 4 comprises a series of 2D convolution layers and pooling layers, and the labels used for training are the feature points extracted by traditional operators, including the points obtained by the point- and line-feature methods. Part of the feature point extraction module may also directly use a traditional operator, in which case the dimensions of the unary feature descriptors produced by the two approaches must be kept consistent.
Another embodiment provided by the present application is: the features generated by the feature point extraction module in step 4 are divided into two parts: one part serves as the hidden vector h of the iteration module, and the other part is fed directly into the iteration module as input.
Another embodiment provided by the present application is: the unary feature extraction module in step 5 is based on a resnet structure. A unary feature is a feature vector describing a single pixel: stereo matching operates per pixel, and the goal is to find, for every pixel in the left image, the corresponding pixel in the right image.
Another embodiment provided by the present application is: in the iteration module of step 6, L denotes a Lookup operation on the four-dimensional matching cost: a search radius r is defined and used as an index to retrieve an r-dimensional slice from the matching cost. The matching cost, the input disparity value and the hidden vector h are fed into the iteration module, which outputs a disparity increment Δd and an updated hidden vector h'; the initial disparity value is set to 0.
Another embodiment provided by the present application is: the deformable convolution module in step 7 aggregates the unary features. Feature aggregation binds the disparity value of a single pixel, to some extent, to the disparity values of other pixels in the image, which prevents erroneous results in abnormal regions and improves the robustness of the system.
Another embodiment provided by the present application is: the disparity regression in step 8 computes, over the disparity range, the weighted sum of each disparity value d_i with its corresponding weight. The disparity values d_i are the positive integers from d_min to d_max; the matching costs within this range are converted to probability values, and the weighted sum yields the corresponding disparity value.
Another embodiment provided by the present application is: the disparity refinement in step 9 further refines the computed disparity values using the input image. In particular, when speed is a priority, the network may output a low-resolution disparity map; the refinement module then up-samples it, guided by the original input image, to obtain a disparity map whose resolution matches the original input.
3. Advantageous effects
Compared with the prior art, the image stereo matching method provided by the application has the beneficial effects that:
the image stereo matching method provided by the application is a binocular image stereo matching method with high robustness and high speed.
The image stereo matching method provided by the application addresses the overfitting problem in deep stereo matching algorithms.
Depth information has wide application prospects in various fields, so the image stereo matching method provided by the application has strong theoretical significance and practical value.
With the image stereo matching method of the application, a large-scale training image database is built, real and synthetic data are used for joint training, a traditional feature detection algorithm is incorporated, iterative updating is carried out, and deformable convolution replaces the 3D convolution module, so that a disparity map is obtained quickly and robustly.
With the image stereo matching method of the application, a depth map can be obtained when the camera parameters are known, and such depth information has wide application prospects in autonomous driving, navigation and the like.
Drawings
FIG. 1 is an exemplary schematic diagram of a Kitti dataset image of the present application;
FIG. 2 is an exemplary schematic diagram of a Sceneflow dataset image of the present application;
FIG. 3 is a schematic diagram illustrating an example of a Middlebury dataset image of the present application;
FIG. 4 is a diagram illustrating the results of ORB feature point detection in the present application;
FIG. 5 is a schematic diagram of line feature detection results of the present application;
fig. 6 is a schematic network structure diagram of the image stereo matching method of the present application;
FIG. 7 is a schematic view of the parallax resulting from FIG. 1 of the present application;
fig. 8 is a flowchart illustrating an image stereo matching method according to the present application.
Detailed Description
Hereinafter, specific embodiments of the present application are described in detail with reference to the accompanying drawings, so that those skilled in the art can practice the application from this description. Without departing from the principles of the present application, features from different embodiments may be combined to yield new embodiments, or certain features may be substituted to yield further preferred embodiments.
ORB (Oriented FAST and Rotated BRIEF) is an algorithm for fast feature point extraction and description. It was published by Ethan Rublee, Vincent Rabaud, Kurt Konolige and Gary R. Bradski in 2011 in the paper "ORB: An Efficient Alternative to SIFT or SURF" (www.willowgarage.com/sites/default/files/orb_final.pdf). The ORB algorithm has two parts: feature point extraction and feature point description. Feature extraction builds on the FAST (Features from Accelerated Segment Test) algorithm, and feature description improves on the BRIEF (Binary Robust Independent Elementary Features) descriptor. ORB thus combines the FAST keypoint detector with the BRIEF descriptor and improves and optimizes both. The ORB algorithm is notable for its computation speed. This is owed first to the use of FAST for detecting feature points, which is famously fast, and second to the BRIEF descriptor, whose binary-string representation both saves storage space and greatly shortens matching time.
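As an illustration of the above (a minimal sketch, not part of the patent; the file names are placeholders), ORB keypoints can be detected and matched with OpenCV as follows:

```python
# Minimal ORB demo: detect keypoints, compute binary descriptors, and
# match them with Hamming distance. Illustrative only; "left.png" and
# "right.png" are placeholder paths.
import cv2

left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

orb = cv2.ORB_create(nfeatures=2000)           # FAST keypoints + rotated BRIEF
kp_l, des_l = orb.detectAndCompute(left, None)
kp_r, des_r = orb.detectAndCompute(right, None)

# BRIEF descriptors are binary strings, so Hamming distance is the natural
# metric and brute-force matching stays fast.
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des_l, des_r), key=lambda m: m.distance)
print(f"{len(matches)} ORB matches")
```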
Referring to fig. 1 to 8, the present application provides an image stereo matching method, including the following steps:
step 1: constructing a training image library;
step 2: enhancing each training image;
step 3: training a point (line) feature extraction network on single images;
step 4: extracting feature points and training the network;
step 5: extracting unary features and constructing the four-dimensional matching cost;
step 6: obtaining a coarse disparity map with the iterative network module;
step 7: aggregating the unary features with the deformable convolution module;
step 8: performing disparity regression;
step 9: refining the disparity.
Further, the image library in step 1 comprises binocular data and monocular data. Each binocular sample comprises a left image and a right image; the data sets used include the Sceneflow synthetic data set, the Kitti autonomous driving data set, the Middlebury data set, the Eth3d data set and other binocular data sets. The monocular data sets used include general image data sets (treated as monocular images) such as Imagenet and Coco.
Further, the enhancing in step 2 includes applying random brightness variation, occlusion or random offset in the vertical direction.
Further, the point features extracted in step 3 use traditional feature detection operators such as ORB and SIFT. Considering that point features perform poorly in texture-less regions, line feature detection is added here: the regions obtained by line detection are included in the set of feature points, and the extracted feature points are used to train the feature point extraction network.
Further, the feature point extraction module in step 4 comprises a series of 2D convolution layers and pooling layers, and the labels used for training are the feature points extracted by traditional operators, including the points obtained by the point- and line-feature methods. Part of the feature point extraction module may also directly use a traditional operator, in which case the dimensions of the unary feature descriptors produced by the two approaches must be kept consistent.
Further, the unary feature extraction module in step 5 is based on a resnet structure. A unary feature is a feature vector describing a single pixel: stereo matching operates per pixel, and the goal is to find, for every pixel in the left image, the corresponding pixel in the right image.
Further, in the iteration module of step 6, L denotes a Lookup operation on the four-dimensional matching cost: a search radius r is defined and used as an index to retrieve an r-dimensional slice from the matching cost. The matching cost, the input disparity value and the hidden vector h are fed into the iteration module, which outputs a disparity increment Δd and an updated hidden vector h'; the initial disparity value is set to 0.
Further, the deformable convolution module in step 7 aggregates the unary features. Feature aggregation binds the disparity value of a single pixel, to some extent, to the disparity values of other pixels in the image, which prevents erroneous results in abnormal regions and improves the robustness of the system.
Further, the disparity regression in step 8 computes, over the disparity range, the weighted sum of each disparity value d_i with its corresponding weight. The disparity values d_i are the positive integers from d_min to d_max; the matching costs within this range are converted to probability values, and the weighted sum yields the corresponding disparity value.
Further, the disparity refinement in step 9 further refines the obtained disparity values using the input image. In particular, when speed is a priority, the network may output a low-resolution disparity map; the refinement module then up-samples it, guided by the original input image, to obtain a disparity map whose resolution matches the original input.
Examples
(1) Constructing a training image library. Acquiring binocular images with depth is difficult, so few binocular data sets are currently available. There are two main ways to obtain a depth map: first, using lidar; second, using an infrared depth sensor, which obtains a sparse depth map and cannot work effectively outdoors. The lack of data limits the universality of the algorithm; the common practice is therefore to pre-train on a synthetic data set and fine-tune on a small amount of real data. Owing to this data insufficiency, the generalization capability of deep-learning stereo matching algorithms is greatly limited, and completely wrong results are often produced on unseen scenes.
This part jointly uses data such as Sceneflow, Kitti and Middlebury, providing sufficient data for training the model. Further, considering that binocular data can hardly cover a wide variety of scenes, monocular data sets such as Imagenet and Coco are also collected to train the feature point extraction module, improving the generalization capability of the model.
(2) Image enhancement. For each training image, random brightness variations, occlusions and random offsets in the vertical direction are applied. In a binocular camera the two images may differ in brightness, occlusion makes the two views non-identical, and the rectification used for stereo matching may contain errors, so a random vertical offset is added for data enhancement.
The significance of this part lies in enhancing the data and overcoming some ill-posed aspects of the stereo matching problem. A data-enhancement sketch follows.
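A minimal sketch of such enhancement, assuming float32 image arrays; the parameter ranges and the helper name augment_pair are illustrative, not taken from the patent:

```python
import numpy as np

def augment_pair(left, right, rng=np.random.default_rng()):
    """Apply the step-2 enhancements to one binocular pair (float32 arrays)."""
    # Random brightness variation, applied per view to mimic exposure
    # differences between the two cameras.
    left = left * rng.uniform(0.8, 1.2)
    right = right * rng.uniform(0.8, 1.2)

    # Random occlusion: overwrite a rectangle in the right image with its
    # mean value, simulating content visible in only one view.
    h, w = right.shape[:2]
    y, x = int(rng.integers(0, h - 50)), int(rng.integers(0, w - 100))
    right[y:y + 50, x:x + 100] = right.mean()

    # Random vertical offset of the right image (imperfect rectification).
    dy = int(rng.integers(-2, 3))
    right = np.roll(right, dy, axis=0)
    return left, right
```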
(3) Point (line) feature extraction. Mainstream stereo matching methods currently generalize poorly, mainly because binocular data acquisition is demanding and can hardly cover all kinds of scenes. Feature detection operators are therefore adopted to provide feature point labels, and training on broad data improves the generalization capability of the model, giving it better universality on unseen scenes.
The significance of this part lies in providing labels for the feature point extraction network, so that it can be trained on broad data to yield a model with generalization capability; a label-generation sketch follows.
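A hedged sketch of how such labels might be produced, merging ORB point detections and line-segment detections into one supervision mask; the FastLineDetector assumes the opencv-contrib package, and feature_label_mask is an illustrative name:

```python
import cv2
import numpy as np

def feature_label_mask(gray):
    """Binary mask marking traditional point and line features (labels)."""
    mask = np.zeros(gray.shape[:2], dtype=np.uint8)

    # Point features from a traditional operator (ORB here; SIFT also works).
    for kp in cv2.ORB_create(nfeatures=5000).detect(gray, None):
        x, y = map(int, kp.pt)
        mask[y, x] = 1

    # Line features: rasterise detected segments into the same mask so that
    # texture-poor but structured regions also provide labels.
    fld = cv2.ximgproc.createFastLineDetector()
    lines = fld.detect(gray)
    if lines is not None:
        for x1, y1, x2, y2 in lines.reshape(-1, 4).astype(int):
            cv2.line(mask, (x1, y1), (x2, y2), 1)
    return mask
```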
(4) Feature point extraction and network training. The features extracted in this part are divided into two parts: one serves as the hidden vector h input to the iterative update module, and the other is fed directly into the iterative update module.
The significance of this part lies in training the feature point extraction network with data covering rich scenes, improving the universality and effectiveness of feature extraction and thereby the generalization capability of the model.
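A minimal PyTorch sketch of such a network (an assumption about the architecture, not the patented module): a stack of 2D convolutions and pooling whose output is split into the hidden vector h and a context feature fed directly to the iteration module:

```python
import torch
import torch.nn as nn

class FeaturePointNet(nn.Module):
    """2D conv + pooling encoder; output split into hidden state and context."""
    def __init__(self, hidden_dim=128, context_dim=128):
        super().__init__()
        self.hidden_dim = hidden_dim
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 64, 7, stride=2, padding=3), nn.ReLU(inplace=True),
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(inplace=True),
            nn.MaxPool2d(2),
            nn.Conv2d(128, hidden_dim + context_dim, 3, padding=1),
        )

    def forward(self, image):
        feat = self.encoder(image)
        h, ctx = feat[:, :self.hidden_dim], feat[:, self.hidden_dim:]
        # tanh bounds the recurrent hidden state; relu for the context input.
        return torch.tanh(h), torch.relu(ctx)
```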
(5) Unary feature extraction and construction of the four-dimensional matching cost. The unary feature extraction module is based on a resnet structure; a unary feature is a feature vector describing a single pixel, since stereo matching operates per pixel and the goal is to find, for every pixel in the left image, the corresponding pixel in the right image. After the unary features are extracted, the features of each pixel in the left image are multiplied with those of the pixels within the specified disparity range in the right image, constructing a four-dimensional matching cost of size c × d × h × w, where c denotes the number of channels, d the disparity range (d = d_max − d_min), h the image height and w the image width.
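A hedged sketch of this construction: left-image features are correlated with right-image features shifted across the disparity range, yielding the c × d × h × w volume described above (build_cost_volume is an illustrative name):

```python
import torch

def build_cost_volume(feat_l, feat_r, d_min=0, d_max=64):
    """feat_l, feat_r: (c, h, w) unary features; returns (c, d, h, w) cost."""
    c, h, w = feat_l.shape
    cost = feat_l.new_zeros(c, d_max - d_min, h, w)
    for i, disp in enumerate(range(d_min, d_max)):
        if disp == 0:
            cost[:, i] = feat_l * feat_r
        else:
            # Left pixel x matches right pixel x - disp.
            cost[:, i, :, disp:] = feat_l[:, :, disp:] * feat_r[:, :, :-disp]
    return cost
```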
(6) The iterative network module obtains a coarse disparity map. In mainstream methods, d_max denotes the maximum disparity between the left and right images, which makes disparity aggregation computationally heavy. The iterative module is used to obtain a disparity value accurate to within a few pixels, yielding a narrow range [d_min, d_max]. The iterative module thus significantly reduces the search range and increases the execution speed of the algorithm.
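A rough sketch of the Lookup operation L, assuming a RAFT-style indexing scheme (the patent text specifies only the radius r and the h/Δd update): a (2r+1)-dimensional cost slice is gathered around the current disparity estimate:

```python
import torch

def lookup(cost, disp, r=4):
    """cost: (d, h, w) matching cost; disp: (h, w) current disparity.
    Returns a (2r+1, h, w) slice of costs around the current estimate."""
    d, h, w = cost.shape
    idx = disp.round().long().clamp(0, d - 1)
    slices = []
    for o in range(-r, r + 1):
        j = (idx + o).clamp(0, d - 1)                  # neighbouring disparity
        slices.append(cost.gather(0, j.unsqueeze(0)))  # (1, h, w)
    return torch.cat(slices, dim=0)
```

Per the description, the update loop starts from disp = 0 and repeatedly applies disp = disp + Δd, with the hidden vector h carried between iterations.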
(7) The deformable convolution module aggregates the unary features. Mainstream methods use 3D convolution for feature aggregation; however, 3D convolution requires a large amount of graphics memory and long computation time. We introduce a deformable convolution instead, using two independent convolution layers to output a sampling offset Δp_k and a modulation weight m_k. The aggregated cost is given by formula (1):

\tilde{C}(d, p) = \sum_{k=1}^{K} \omega_k \cdot C(d, p + p_k + \Delta p_k) \cdot m_k    (1)

where C(d, p) is the matching cost at pixel p and disparity d, p_k is the k-th fixed sampling offset, K represents the number of sampling points, and ω_k represents the weight resulting from the softmax operation.
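A hedged sketch of formula (1) built on torchvision's deform_conv2d; here the offsets Δp_k and modulation weights m_k come from two plain convolutions, while the aggregation weights are learned convolution kernels (an assumption; the patent obtains ω_k from a softmax):

```python
import torch
import torch.nn as nn
from torchvision.ops import deform_conv2d

class DeformableAggregation(nn.Module):
    """Aggregate a (n, c, h, w) cost slice with modulated deformable conv."""
    def __init__(self, channels, k=3):
        super().__init__()
        self.offset = nn.Conv2d(channels, 2 * k * k, 3, padding=1)  # Δp_k
        self.mask = nn.Conv2d(channels, k * k, 3, padding=1)        # m_k
        self.weight = nn.Parameter(0.01 * torch.randn(channels, channels, k, k))

    def forward(self, cost):
        offset = self.offset(cost)               # per-location sampling shifts
        mask = torch.sigmoid(self.mask(cost))    # per-sample modulation in [0, 1]
        return deform_conv2d(cost, offset, self.weight, padding=1, mask=mask)
```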
(8) Disparity regression. The aggregated matching cost c_d is negated (for each disparity d) and converted into a probability by the softmax operation (denoted σ here); each disparity value is then weighted by its probability and summed, as shown in formula (2):

\hat{d} = \sum_{d = d_{min}}^{d_{max}} d \cdot \sigma(-c_d)    (2)

Here, we merge the point and line features into the loss function to obtain the disparity map quickly and robustly.
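A small sketch of formula (2), a standard soft-argmin regression consistent with the description above (disparity_regression is an illustrative name):

```python
import torch
import torch.nn.functional as F

def disparity_regression(cost, d_min=0, d_max=64):
    """cost: (n, d, h, w) aggregated matching cost with d = d_max - d_min."""
    prob = F.softmax(-cost, dim=1)                        # sigma(-c_d)
    disp = torch.arange(d_min, d_max, dtype=cost.dtype,
                        device=cost.device).view(1, -1, 1, 1)
    return (prob * disp).sum(dim=1)                       # (n, h, w)
```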
(9) Disparity refinement. The disparity refinement module up-samples the obtained disparity map, guided by features of the input image, to produce a disparity map whose resolution matches that of the original image.
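A hedged sketch of such a refinement module: the low-resolution disparity is bilinearly up-sampled (values rescaled by the size ratio), concatenated with the original image, and corrected by a small residual network (the architecture is an assumption):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DisparityRefinement(nn.Module):
    """Up-sample a coarse disparity map to full resolution, guided by the image."""
    def __init__(self):
        super().__init__()
        self.refine = nn.Sequential(
            nn.Conv2d(4, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 1, 3, padding=1),
        )

    def forward(self, disp_low, image):
        # disp_low: (n, 1, h/s, w/s); image: (n, 3, h, w) original input.
        scale = image.shape[-1] / disp_low.shape[-1]
        disp = F.interpolate(disp_low, size=image.shape[-2:],
                             mode="bilinear", align_corners=False) * scale
        residual = self.refine(torch.cat([disp, image], dim=1))
        return disp + residual
```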
Although the present application has been described above with reference to specific embodiments, those skilled in the art will recognize that many changes may be made in the configuration and details of the present application within the principles and scope of the present application. The scope of protection of the application is determined by the appended claims, and all changes that come within the meaning and range of equivalency of the technical features are intended to be embraced therein.

Claims (10)

1. An image stereo matching method, characterized in that it comprises the following steps:
step 1: constructing a training image library;
step 2: enhancing each training image;
step 3: training a point or line feature extraction network on single images;
step 4: extracting feature points or lines and training the network;
step 5: performing unary feature extraction on the images to construct the four-dimensional matching cost;
step 6: obtaining a coarse disparity map with the iterative network module;
step 7: aggregating the unary features with the deformable convolution module;
step 8: performing disparity regression;
step 9: refining the disparity.
2. The image stereo matching method according to claim 1, characterized in that: the image library in step 1 comprises binocular data and monocular data; each binocular sample comprises a left image and a right image; the binocular data are obtained from data sets including the Sceneflow synthetic data set, the Kitti autonomous driving data set, the Middlebury data set, the Eth3d data set and other binocular data sets; the monocular data are obtained from data sets including the Imagenet image data set and the Coco image data set.
3. The image stereo matching method according to claim 1, characterized in that: the enhancing in step 2 comprises applying random brightness variations, occlusions or random offsets in the vertical direction.
4. The image stereo matching method according to claim 1, characterized in that the point features in step 3 are extracted with a traditional feature detection operator, the line feature extraction includes the regions obtained by line detection in the set of feature points, and the extracted feature points are used for subsequent training of the feature point extraction network.
5. The image stereo matching method according to claim 1, characterized in that: the feature point extraction module in step 4 comprises a series of 2D convolution layers and pooling layers, and the labels used for training are the feature points extracted by traditional operators, including the points obtained by the point- and line-feature methods; part of the feature point extraction module may also directly use a traditional operator, in which case the dimensions of the unary feature descriptors produced by the two approaches must be kept consistent;
the features generated by the feature point extraction module in step 4 are divided into two parts: one part serves as the hidden vector h of the iteration module, and the other part is fed directly into the iteration module.
6. The image stereo matching method according to claim 1, characterized in that: the unary feature extraction module in the step 5 is based on a resnet structure.
7. The image stereo matching method according to claim 1, characterized in that: the deformable convolution module in step 7 aggregates the unary features, the feature aggregation binding the disparity value of a single pixel, to some extent, to the disparity values of other pixels in the image.
8. The image stereo matching method according to claim 1, characterized in that: the disparity regression in step 8 computes, over the disparity range, the weighted sum of each disparity value d_i with its corresponding weight; the disparity values d_i are the positive integers from d_min to d_max, the matching costs within this range are converted to probability values, and the weighted sum yields the corresponding disparity value.
9. The image stereo matching method according to claim 1, characterized in that: the disparity refinement in step 9 further refines the obtained disparity values using the input image; when speed is a priority, the network may output a low-resolution disparity map, and the refinement module up-samples it, guided by the original input image, to obtain a disparity map whose resolution matches the original input image.
10. The image stereo matching method according to any one of claims 1 to 9, characterized in that: network training jointly uses synthetic data and real data, the feature extraction network is trained with richer monocular data, and unary feature fusion is performed with deformable convolution.
CN202010847540.9A 2020-08-21 2020-08-21 Image stereo matching method Pending CN111951319A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010847540.9A CN111951319A (en) 2020-08-21 2020-08-21 Image stereo matching method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010847540.9A CN111951319A (en) 2020-08-21 2020-08-21 Image stereo matching method

Publications (1)

Publication Number Publication Date
CN111951319A true CN111951319A (en) 2020-11-17

Family

ID=73358714

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010847540.9A Pending CN111951319A (en) 2020-08-21 2020-08-21 Image stereo matching method

Country Status (1)

Country Link
CN (1) CN111951319A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112598722A (en) * 2021-01-08 2021-04-02 北京深睿博联科技有限责任公司 Image stereo matching method and system based on deformable convolution network

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108764320A (en) * 2018-05-21 2018-11-06 深圳信息职业技术学院 Feature extracting method based on fractional order feature line analysis
CN108764248A (en) * 2018-04-18 2018-11-06 广州视源电子科技股份有限公司 The extracting method and device of image characteristic point
CN109005398A (en) * 2018-07-27 2018-12-14 杭州电子科技大学 A kind of stereo image parallax matching process based on convolutional neural networks
CN109409443A (en) * 2018-11-28 2019-03-01 北方工业大学 Multi-scale deformable convolution network target detection method based on deep learning
CN109934272A (en) * 2019-03-01 2019-06-25 大连理工大学 A kind of image matching method based on full convolutional network
CN110533712A (en) * 2019-08-26 2019-12-03 北京工业大学 A kind of binocular solid matching process based on convolutional neural networks
CN111210443A (en) * 2020-01-03 2020-05-29 吉林大学 Deformable convolution mixing task cascading semantic segmentation method based on embedding balance
CN111508013A (en) * 2020-04-21 2020-08-07 中国科学技术大学 Stereo matching method

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108764248A (en) * 2018-04-18 2018-11-06 广州视源电子科技股份有限公司 The extracting method and device of image characteristic point
CN108764320A (en) * 2018-05-21 2018-11-06 深圳信息职业技术学院 Feature extracting method based on fractional order feature line analysis
CN109005398A (en) * 2018-07-27 2018-12-14 杭州电子科技大学 A kind of stereo image parallax matching process based on convolutional neural networks
CN109409443A (en) * 2018-11-28 2019-03-01 北方工业大学 Multi-scale deformable convolution network target detection method based on deep learning
CN109934272A (en) * 2019-03-01 2019-06-25 大连理工大学 A kind of image matching method based on full convolutional network
CN110533712A (en) * 2019-08-26 2019-12-03 北京工业大学 A kind of binocular solid matching process based on convolutional neural networks
CN111210443A (en) * 2020-01-03 2020-05-29 吉林大学 Deformable convolution mixing task cascading semantic segmentation method based on embedding balance
CN111508013A (en) * 2020-04-21 2020-08-07 中国科学技术大学 Stereo matching method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
HAOFEI XU et al.: "AANet: Adaptive Aggregation Network for Efficient Stereo Matching", arXiv

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112598722A (en) * 2021-01-08 2021-04-02 北京深睿博联科技有限责任公司 Image stereo matching method and system based on deformable convolution network
CN112598722B (en) * 2021-01-08 2022-02-11 北京深睿博联科技有限责任公司 Image stereo matching method and system based on deformable convolution network

Similar Documents

Publication Publication Date Title
CN101388115B (en) Depth image autoegistration method combined with texture information
CN110188835B (en) Data-enhanced pedestrian re-identification method based on generative confrontation network model
CN111553845B (en) Quick image stitching method based on optimized three-dimensional reconstruction
CN113486887B (en) Target detection method and device in three-dimensional scene
CN110706269A (en) Binocular vision SLAM-based dynamic scene dense modeling method
CN116612468A (en) Three-dimensional target detection method based on multi-mode fusion and depth attention mechanism
CN113159043A (en) Feature point matching method and system based on semantic information
CN113705796A (en) Light field depth acquisition convolutional neural network based on EPI feature enhancement
CN115511759A (en) Point cloud image depth completion method based on cascade feature interaction
CN114708475A (en) Point cloud multi-mode feature fusion network method for 3D scene understanding
CN111951319A (en) Image stereo matching method
Song et al. Voxelnextfusion: A simple, unified and effective voxel fusion framework for multi-modal 3d object detection
Tao et al. F-pvnet: Frustum-level 3-d object detection on point–voxel feature representation for autonomous driving
CN112270701A (en) Packet distance network-based parallax prediction method, system and storage medium
CN110390336B (en) Method for improving feature point matching precision
CN115294371B (en) Complementary feature reliable description and matching method based on deep learning
CN116612235A (en) Multi-view geometric unmanned aerial vehicle image three-dimensional reconstruction method and storage medium
CN115330935A (en) Three-dimensional reconstruction method and system based on deep learning
Xiao et al. Instance-Aware Monocular 3D Semantic Scene Completion
CN106056599A (en) Object depth data-based object recognition algorithm and device
Zhang et al. 3D Object Detection Based on Multi-view Adaptive Fusion
Ai et al. MVTr: multi-feature voxel transformer for 3D object detection
CN115906007B (en) Intelligent driving characteristic parameter generation method, device and computer readable medium
CN113963335B (en) Road surface obstacle detection method based on image and point cloud data
Li et al. Enhancing depth quality of stereo vision using deep learning-based prior information of the driving environment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination