CN111739144A - Method and device for simultaneously positioning and mapping based on depth feature optical flow - Google Patents

Method and device for simultaneously positioning and mapping based on depth feature optical flow

Info

Publication number
CN111739144A
Authority
CN
China
Prior art keywords
image
optical flow
characteristic
feature
frame
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010565428.6A
Other languages
Chinese (zh)
Inventor
向坤
陶文源
闫野
唐荣富
陶雨薇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin University
Original Assignee
Tianjin University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin University
Priority to CN202010565428.6A
Publication of CN111739144A
Legal status: Pending

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06T — IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T15/00 — 3D [Three Dimensional] image rendering
    • G06T15/50 — Lighting effects
    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06N — COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 — Computing arrangements based on biological models
    • G06N3/02 — Neural networks
    • G06N3/08 — Learning methods
    • G06N3/084 — Backpropagation, e.g. using gradient descent
    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06T — IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 — Image analysis
    • G06T7/80 — Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • G06T7/85 — Stereo camera calibration
    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06T — IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 — Indexing scheme for image analysis or image enhancement
    • G06T2207/10 — Image acquisition modality
    • G06T2207/10004 — Still image; Photographic image
    • G06T2207/10012 — Stereo images

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Graphics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a method and a device for simultaneous localization and mapping based on depth feature optical flow. The method comprises the following steps: estimating the feature optical flow information between two frames of images according to a feature optical flow mapping model; designing a feature-optical-flow-based visual odometer used for image feature extraction, feature tracking, pose estimation of the camera-carrying body, and recovery of three-dimensional image features; when tracking by the visual odometer fails, completing the re-matching of visual features through a relocation technique; detecting whether the body has moved to a previously visited region, thereby achieving pose optimization at a certain scale; iteratively optimizing the three-dimensional image feature information and body pose information provided by the visual odometer so as to reduce errors; and constructing a three-dimensional map of the environment using the back-end-optimized three-dimensional image feature information and body pose information. The device comprises a memory and a processor; the processor implements the method steps when executing a program stored on the memory. The present invention provides a robust and accurate solution.

Description

Method and device for simultaneously positioning and mapping based on depth feature optical flow
Technical Field
The invention relates to the field of computing vision and deep learning, in particular to a method and a device for simultaneously positioning and mapping based on a depth feature optical flow.
Background
Optical flow is essentially the change in brightness of pixel points that results when the motion of objects in a three-dimensional scene is projected onto a two-dimensional image plane. The optical flow method is an image motion analysis technique developed in the field of computer vision and is an important research topic in machine vision. Optical-flow-based motion analysis is the basis of many visual tasks.
Conventional optical flow methods mainly include the Horn & Schunck (HS) and Lucas & Kanade (LK) methods. Both are based on the same basic assumption: the brightness (gray value) of the same pixel point in two adjacent frames is unchanged. Denote by I(x, y, t) the gray value of the pixel at (x, y) at time t, and suppose that at time t + dt it has moved to (x + dx, y + dy). Since the gray value is unchanged, I(x + dx, y + dy, t + dt) = I(x, y, t).
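As a concrete illustration of the LK method in practice (not part of the claimed invention; the file names below are placeholders), the pyramidal Lucas-Kanade tracker provided by OpenCV can be invoked as follows:

import cv2
import numpy as np

# Two consecutive grayscale frames (file names are placeholders)
prev_img = cv2.imread("frame_0.png", cv2.IMREAD_GRAYSCALE)
next_img = cv2.imread("frame_1.png", cv2.IMREAD_GRAYSCALE)

# Detect corners in the first frame, then track them with pyramidal Lucas-Kanade
prev_pts = cv2.goodFeaturesToTrack(prev_img, maxCorners=200, qualityLevel=0.01, minDistance=10)
next_pts, status, err = cv2.calcOpticalFlowPyrLK(prev_img, next_img, prev_pts, None)

# Keep only the points whose tracking succeeded (status == 1)
good_prev = prev_pts[status.ravel() == 1]
good_next = next_pts[status.ravel() == 1]
print("tracked", len(good_next), "points")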
Simultaneous Localization and Mapping (SLAM) is a technique in which a body carrying specific sensors starts to move without any prior information about the environment, builds a model of the environment during motion, localizes itself according to the pose estimate and the map, and incrementally builds the map on that basis. When the sensor is mainly a camera, this is called visual SLAM.
In visual SLAM, the system extracts visual feature information from two adjacent frames, completes feature tracking and matching through the feature optical flow between the images, and estimates the pose transformation of the camera through an established geometric model.
In real-world environments, when the acquired image sequence undergoes obvious illumination changes or when objects in the three-dimensional scene move drastically, conventional optical flow estimation has large errors, which affects the accuracy of visual feature tracking and matching.
Disclosure of Invention
The invention provides a method and a device for simultaneous localization and mapping based on depth feature optical flow. A deep neural network model is designed that takes adjacent frame images as input and maps them to depth feature optical flow information, providing a robust and accurate solution to the problem of simultaneous localization and mapping based on feature optical flow. The details are described below:
a method for simultaneous localization and mapping based on depth-feature optical flow, the method comprising:
estimating the feature optical flow information between two frames of images according to a feature optical flow mapping model;
designing a feature-optical-flow-based visual odometer used for image feature extraction, feature tracking, pose estimation of the camera-carrying body, and recovery of three-dimensional image features;
when tracking by the visual odometer fails, completing the re-matching of visual features through a relocation technique;
detecting whether the body has moved to a previously visited region, thereby achieving pose optimization at a certain scale;
iteratively optimizing the three-dimensional image feature information and body pose information provided by the visual odometer so as to reduce errors;
and constructing a three-dimensional map of the environment using the back-end-optimized three-dimensional image feature information and body pose information.
Wherein, the design of the feature-optical-flow-based visual odometer is specifically as follows:
S1.1, collecting a large data-set sample of image data and ground-truth feature optical flow, and dividing it into a training set and a test set;
S1.2, establishing an image pyramid {I_t^l}, in which the I_t^{l+1} layer is downscaled by a factor of 2 relative to the I_t^l layer, and performing optical flow feature extraction and iterative optimization at each pyramid level;
S1.3, performing 2× upsampling on the estimated optical flow f^{l+1} and, combining it with I_2^l, performing an image warping operation to obtain the warped image Ĩ_2^l;
S1.4, performing a correlation calculation between the obtained I_1^l and Ĩ_2^l to obtain cv^l;
S1.5, establishing an optical flow estimation local network whose inputs are the level-l pyramid image I_1^l, the image correlation value cv^l, and the upsampled optical flow up_2(f^{l+1}) of level l+1; first a 5-layer convolutional network is established whose output is a preliminary feature optical flow w^l, then a sixth convolutional layer is established to output a refined feature optical flow v^l;
S1.6, establishing an optical flow optimization local network whose inputs are the preliminary feature optical flow w^l and the refined feature optical flow v^l; first a 5-layer convolutional network is established, then a sixth convolutional layer with 2 convolution kernels is established, outputting the optimized optical flow f^l;
S1.7, performing operations S1.3-S1.6 iteratively at each pyramid level; the optical flow map output at level 0 is the obtained feature optical flow f of the two frames;
S1.8, training the feature optical flow mapping model with the training set, and performing error detection with the test set;
S1.9, for the first frame image I_1 of the test image sequence, defining the distance between feature points as d pixels and the number of extracted feature points as n;
S1.10, inputting the first and second frame images I_1 and I_2 into the optical flow network to predict the feature optical flow map f_1 between the two frames;
S1.11, the feature optical flow map f_1 is a two-channel image: the value stored at coordinate (x, y) in the first channel is the x-direction displacement u of the point of image I_1 at (x, y), and the value stored at coordinate (x, y) in the second channel is the y-direction displacement v of that point;
S1.12, analyzing the feature optical flow map f_1 and calculating the corresponding feature points p_i^2 on the second frame image I_2;
S1.13, detecting the tracked feature points p_i^2 and marking invalid points;
S1.14, for the successfully tracked point pairs (p_i^1, p_i^2), solving the corresponding spatial 3D points, which are defined as map points;
S1.15, setting the body pose of the first frame image I_1 as the reference pose T_1;
S1.16, solving the body pose T_2 at the second frame image I_2 through the correspondence between map points and image feature points;
S1.17, performing a threshold judgment on the feature point pairs (p_i^1, p_i^2); when the number of feature points is less than the threshold n, extracting Fast feature points again on the second frame image I_2 and replenishing the number of feature points to n;
S1.18, inputting the next frame image I_j; inputting images I_{j-1} and I_j into the feature optical flow mapping model to predict the feature optical flow map f_{j-1} between the two frames;
S1.19, judging whether the last frame has been reached; if not, repeating steps S1.12-S1.19.
Further, the completion of visual feature re-matching through the relocation technique is specifically:
setting the I_j frame as the starting frame and setting the body pose corresponding to the I_j frame as the reference pose T_j; on the basis of the I_j starting frame, inputting the next frame image and repeating the tracking process of the visual odometer.
Wherein, detecting whether the body has moved to a previously visited region and achieving pose optimization at a certain scale is specifically:
for any input image frame I_j and its corresponding feature points, extracting the ORB descriptor of each feature point;
setting a descriptor matching threshold; after tracking any image frame I_l, comparing the descriptors corresponding to its feature points with the descriptors of each preceding frame image I_k, computing the Hamming distance dis, and defining loop detection as successful when dis < th2, i.e. I_l and I_k correspond to the same position of the body in space; pose optimization is then performed with a nonlinear optimization algorithm.
Further, the iterative optimization of the three-dimensional image feature information and body pose information provided by the visual odometer to reduce errors is specifically:
taking the map points of the two obtained adjacent frame images I_{j-1} and I_j and the pose T_j as the optimization input;
reprojecting the map points onto image I_j to obtain the reprojected image points, and computing the error between the reprojected image points and the original feature points as the optimization quantity;
optimizing the map points and the pose T_j by minimizing the optimization quantity with a nonlinear optimization algorithm.
An apparatus for simultaneous localization and mapping based on depth feature optical flow, the apparatus comprising: a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor implementing the above method steps when executing the program.
The technical scheme provided by the invention has the beneficial effects that:
1) the specific deep neural network is used for extracting the optical flow characteristics, so that the robustness on the conditions of illumination change, unobvious texture and the like is better;
2) the optical flow information is used for constructing a simultaneous positioning and mapping system, the calculation cost is low, and the real-time performance is good.
Drawings
FIG. 1 is an architectural diagram of a method for simultaneous localization and mapping based on a characteristic optical flow;
FIG. 2 is an optical flow network model;
FIG. 3 is a schematic diagram of an Image pyramid (Image pyramid);
FIG. 4 is a flow estimator local network architecture in a network model;
FIG. 5 is a context layer local network architecture in a network model;
fig. 6 is a schematic structural diagram of a visual odometer module.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention are described in further detail below.
Example 1
A method for simultaneous localization and mapping based on depth-feature optical flow, see fig. 1 and 2, the method comprising the steps of:
firstly, constructing a characteristic optical flow mapping model based on a deep neural network
In a first aspect of the present invention, a deep neural network-based feature optical flow mapping model is provided, which includes the following steps:
(1) collecting large samples of image data and characteristic light stream information, and dividing the large samples into a training set and a test set;
the number of the large samples is set according to the needs in practical application, which is not described in detail in the embodiments of the present invention.
(2) Designing an image pyramid network for extracting depth features between two frames of images;
(3) designing a depth convolution network to carry out iterative optimization on the extracted depth features;
(4) and training a feature network mapping model from the image to the optical flow information based on the result of the iterative optimization by combining a training set and a test set.
Estimating the characteristic optical flow information between two frames of images through a characteristic optical flow mapping model, and simultaneously positioning and mapping
In the second aspect of the invention, the characteristic optical flow information between two frames of images is estimated through the characteristic optical flow mapping model constructed in the first step, and further the positioning and mapping are carried out simultaneously, and the method comprises the following steps:
(1) developing a feature-based optical flow visual odometer module for feature extraction of images, feature tracking between the images, pose estimation of a camera body and three-dimensional feature recovery of the images;
(2) developing a repositioning module, and completing the re-matching of the visual characteristics through a repositioning technology when the tracking of the visual odometer fails;
(3) developing a closed loop detection module, detecting whether the main body moves to a region which is passed by before, and realizing pose optimization on a certain scale;
(4) a back-end optimization module is developed to perform iterative optimization on the image three-dimensional characteristic information and the main body pose information provided by the visual odometer, so that errors are reduced;
(5) and the map building module is used for building a three-dimensional map of the environment by using the image three-dimensional characteristic information subjected to the back-end optimization and the pose information of the main body.
Example 2
The scheme of example 1 is further described with reference to fig. 1 to 6 and specific calculation formulas, which are described in detail below:
in step S1, the optical flow network shown in fig. 2 is constructed, and further, the visual odometer module is constructed, the key steps are as follows:
s1.1, collecting a large data set sample of image data and a real value of a characteristic light stream, and dividing the large data set sample into a training set and a testing set;
S1.2, establishing an image pyramid (shown in FIG. 3) {I_t^l} (where t denotes the frame and l denotes the pyramid level), in which the I_t^0 layer is the original image; convolution with stride 2 is applied between adjacent levels, so that the I_t^{l+1} layer is downscaled by a factor of 2 relative to the I_t^l layer. A 7-level image pyramid is established, optical flow features are extracted and iteratively optimized at each pyramid level, and leaky ReLU is used as the activation function;
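The patent text fixes only the structure of this step (7 levels, stride-2 convolutions, leaky ReLU); the following PyTorch sketch illustrates that structure, with the channel counts chosen arbitrarily as assumptions:

import torch
import torch.nn as nn

class FeaturePyramid(nn.Module):
    """7-level pyramid: level 0 is the input image, each following level halves the resolution."""
    def __init__(self, levels=7, channels=(3, 16, 32, 64, 96, 128, 196)):  # channel counts are assumptions
        super().__init__()
        self.stages = nn.ModuleList()
        for i in range(levels - 1):
            self.stages.append(nn.Sequential(
                nn.Conv2d(channels[i], channels[i + 1], kernel_size=3, stride=2, padding=1),
                nn.LeakyReLU(0.1),
                nn.Conv2d(channels[i + 1], channels[i + 1], kernel_size=3, stride=1, padding=1),
                nn.LeakyReLU(0.1),
            ))

    def forward(self, img):
        feats = [img]                       # level 0: the original image
        for stage in self.stages:
            feats.append(stage(feats[-1]))  # each stage downscales by 2
        return feats                        # [I^0, I^1, ..., I^6]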
S1.3, for the optical flow f^{l+1} estimated at level l+1 of the image pyramid, performing 2× upsampling and then combining it with I_2^l in an image warping (warping layer) operation: Ĩ_2^l(x) = I_2^l(x + up_2(f^{l+1})(x)), where x is the coordinate of each point in the image I_2^l, to obtain Ĩ_2^l. For the top pyramid level, up_2(f^{l+1}) is 0;
S1.4, performing a correlation calculation (correlation layer) between the obtained I_1^l and Ĩ_2^l to obtain cv^l:
cv^l(x_1, x_2) = (1/N) (I_1^l(x_1))^T Ĩ_2^l(x_2),
where N is the length of the feature vector and x_2 takes values in the range x_{2x} ∈ [x_{1x} − d, x_{1x} + d], x_{2y} ∈ [x_{1y} − d, x_{1y} + d] (d is generally set to 4).
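A sketch of this correlation layer over the (2d+1)×(2d+1) search window; averaging the per-channel products (as the normalization by N) is an assumption:

import torch
import torch.nn.functional as F

def cost_volume(feat1, feat2_warped, d=4):
    """Correlation / cost volume between feat1 and the warped feat2 over a (2d+1)^2 search window.
    feat1, feat2_warped: (B, C, H, W). Returns (B, (2d+1)**2, H, W)."""
    b, c, h, w = feat1.shape
    feat2_pad = F.pad(feat2_warped, (d, d, d, d))
    cost = []
    for dy in range(2 * d + 1):
        for dx in range(2 * d + 1):
            shifted = feat2_pad[:, :, dy:dy + h, dx:dx + w]
            # normalized inner product over channels (normalization by C is an assumption)
            cost.append((feat1 * shifted).mean(dim=1, keepdim=True))
    return torch.cat(cost, dim=1)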
S1.5, establishing the optical flow estimation local network shown in FIG. 4, whose inputs are the level-l pyramid image I_1^l, the image correlation value cv^l, and the upsampled optical flow up_2(f^{l+1}) of level l+1. First, a 5-layer convolutional network is established with the numbers of convolution kernels set to (128, 128, 96, 64, 32) in turn and leaky ReLU as the activation function, outputting a preliminary feature optical flow w^l. Then a sixth convolutional layer with 2 convolution kernels is established, outputting a refined feature optical flow v^l;
S1.6, establishing the optical flow optimization local network shown in FIG. 5, whose inputs are the preliminary feature optical flow w^l and the refined feature optical flow v^l. First, a 5-layer convolutional network is established with the numbers of convolution kernels set to (128, 128, 96, 64, 32) in turn and leaky ReLU as the activation function, then a sixth convolutional layer with 2 convolution kernels is established, outputting the optimized optical flow f^l;
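A minimal sketch of the two local networks under the stated channel counts (128, 128, 96, 64, 32) and the 2-channel outputs; kernel sizes, dilations, and the residual connection in the refiner are assumptions not fixed by the text:

import torch
import torch.nn as nn

def conv(cin, cout, k=3, dilation=1):
    return nn.Sequential(
        nn.Conv2d(cin, cout, k, padding=dilation * (k // 2), dilation=dilation),
        nn.LeakyReLU(0.1),
    )

class FlowEstimator(nn.Module):
    """5 conv layers (128,128,96,64,32) producing w^l, plus a 2-channel conv producing v^l."""
    def __init__(self, cin):
        super().__init__()
        self.body = nn.Sequential(conv(cin, 128), conv(128, 128), conv(128, 96),
                                  conv(96, 64), conv(64, 32))
        self.head = nn.Conv2d(32, 2, 3, padding=1)

    def forward(self, x):
        w = self.body(x)      # preliminary feature optical flow w^l (32 channels here)
        v = self.head(w)      # refined 2-channel feature optical flow v^l
        return w, v

class FlowRefiner(nn.Module):
    """Refinement network: 5 conv layers (128,128,96,64,32) plus a 2-channel output f^l."""
    def __init__(self, cin):
        super().__init__()
        self.body = nn.Sequential(conv(cin, 128), conv(128, 128), conv(128, 96),
                                  conv(96, 64), conv(64, 32))
        self.head = nn.Conv2d(32, 2, 3, padding=1)

    def forward(self, w, v):
        # residual refinement of v^l is an assumption
        return v + self.head(self.body(torch.cat([w, v], dim=1)))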
S1.7, performing operations S1.3-S1.6 iteratively at each pyramid level; the optical flow map output at level 0 is the obtained feature optical flow f of the two frames;
S1.8, training the feature optical flow mapping model with the training set, performing error detection with the test set after the best fitting effect is achieved, and then building the visual odometer shown in FIG. 6;
S1.9, for the first frame image I_1 of the test image sequence, invoking the OpenCV library to extract Fast[1] feature points p_i^1, limiting the distance between feature points to d pixels and the number of extracted feature points to n, so as to ensure that the feature points are sufficient in number and uniformly distributed (d is constrained by the size of the input image, and n is generally 200);
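One possible realization of this step (the greedy enforcement of the minimum distance d is an assumption, since OpenCV's FAST detector does not itself take a spacing parameter):

import cv2
import numpy as np

def detect_fast_features(img_gray, n=200, d=20):
    """Detect FAST corners, then greedily keep at most n corners that are at least d pixels apart."""
    fast = cv2.FastFeatureDetector_create(threshold=20)
    kps = fast.detect(img_gray, None)
    kps = sorted(kps, key=lambda k: k.response, reverse=True)    # strongest corners first
    kept = []
    for kp in kps:
        p = np.array(kp.pt)
        if all(np.linalg.norm(p - q) >= d for q in kept):
            kept.append(p)
        if len(kept) == n:
            break
    return np.array(kept, dtype=np.float32)                      # (m, 2) pixel coordinates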
S1.10, inputting the first frame image I_1 and the second frame image I_2 into the optical flow network to predict the feature optical flow map f_1 between the two frames;
S1.11, the feature optical flow map f_1 is a two-channel image: the value stored at coordinate (x, y) in the first channel is the x-direction displacement u of the point of image I_1 at (x, y), and the value stored at coordinate (x, y) in the second channel is the y-direction displacement v of that point;
S1.12, analyzing the feature optical flow map f_1 and calculating the corresponding feature points p_i^2 on the second frame image I_2, where p_i^2 = p_i^1 + f_1(p_i^1);
S1.13, invoking the OpenCV library random sample consensus algorithm (RANSAC)[2] to check the tracked feature points p_i^2; the algorithm marks the invalid points in the sample, i.e. the tracking-failure points, which are then removed;
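An illustrative sketch of steps S1.12-S1.13; the nearest-pixel flow lookup and the use of a RANSAC-fitted fundamental matrix as the outlier test are assumptions, since the text only states that RANSAC marks and removes tracking-failure points:

import cv2
import numpy as np

def track_with_flow(pts1, flow, img_shape):
    """pts1: (m, 2) points in frame 1; flow: (H, W, 2) with channels (u, v). Returns tracked pts2."""
    h, w = img_shape
    xs = np.clip(pts1[:, 0].round().astype(int), 0, w - 1)
    ys = np.clip(pts1[:, 1].round().astype(int), 0, h - 1)
    disp = flow[ys, xs]                  # per-point displacement (u, v)
    return pts1 + disp

def filter_with_ransac(pts1, pts2):
    """Reject tracking failures as outliers of a RANSAC-fitted fundamental matrix."""
    F, mask = cv2.findFundamentalMat(pts1, pts2, cv2.FM_RANSAC, 1.0, 0.99)
    if mask is None:
        return pts1, pts2
    inliers = mask.ravel() == 1
    return pts1[inliers], pts2[inliers]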
S1.14, for the successfully tracked point pairs (p_i^1, p_i^2), calling the OpenCV library triangulation algorithm[3] to obtain the corresponding spatial 3D points, which are defined as the map points {m_i | i = 1, ..., n};
S1.15, setting the body pose of the first frame image I_1 as the reference pose T_1;
S1.16, from the correspondence between map points and image feature points {m_i ↔ p_i^2}, solving the body pose T_2 at the second frame image I_2 by calling the OpenCV library PnP algorithm[4];
S1.17, performing a threshold judgment on the feature point pairs (p_i^1, p_i^2); when the number of feature points is less than the threshold n, extracting Fast feature points again on the second frame image I_2 and replenishing the number of feature points to n;
S1.18, inputting the next frame image I_j; inputting images I_{j-1} and I_j into the feature optical flow mapping model to predict the feature optical flow map f_{j-1} between the two frames;
And S1.19, judging whether the last frame is reached, ending the tracking when the last frame is reached, and otherwise, repeating the steps S1.12-S1.19.
In step S2, a relocation module based on image feature matching detection is constructed, and the key steps are as follows:
S2.1, setting a feature point matching threshold th1; in the visual odometer, when the number of feature points tracked between adjacent frame images I_{j-1} and I_j is smaller than the threshold th1, tracking is considered to have failed and relocation is performed;
S2.2, setting the I_j frame as the starting frame, and setting the body pose corresponding to the I_j frame as the reference pose T_j;
S2.3, on the basis of the I_j starting frame, inputting the next frame image and repeating the tracking process of steps S1.10-S1.19 of the visual odometer.
In step S3, a loop detection module based on an image feature matching algorithm is constructed; the key steps are as follows:
S3.1, for any input image frame I_j and its corresponding feature points p_i^j, calling the OpenCV library to extract the ORB descriptor[5] d_i^j of each feature point in image I_j;
S3.2, setting a descriptor matching degree threshold th 2;
S3.3, after tracking to any image frame I_l, comparing the descriptors d_i^l corresponding to its feature points with the descriptors of each preceding frame image I_k and computing the Hamming distance dis; loop detection is defined as successful when dis < th2, i.e. I_l and I_k correspond to the same position of the body in space, and pose optimization is performed using a nonlinear optimization algorithm[6] (Ceres).
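A sketch of steps S3.1-S3.3; the frame-level decision rule (requiring a minimum ratio of descriptors whose Hamming distance falls below th2) is an assumption, since the text only specifies the per-descriptor test dis < th2:

import cv2
import numpy as np

orb = cv2.ORB_create()
bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)

def describe(img_gray, keypoints_xy):
    """Compute ORB descriptors at the already-tracked feature point locations."""
    kps = [cv2.KeyPoint(float(x), float(y), 31) for x, y in keypoints_xy]
    kps, desc = orb.compute(img_gray, kps)
    return desc

def loop_detected(desc_l, desc_k, th2=50, min_ratio=0.3):
    """Frames I_l and I_k are declared a loop if enough descriptors match with Hamming distance < th2."""
    if desc_l is None or desc_k is None:
        return False
    matches = bf.match(desc_l, desc_k)
    good = [m for m in matches if m.distance < th2]
    return len(good) > min_ratio * min(len(desc_l), len(desc_k))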
In step S4, a back-end optimization module based on a nonlinear optimization algorithm is constructed, which includes the following key steps:
S4.1, taking the map points {m_i | i = 1, ..., n} of the two obtained adjacent frame images I_{j-1} and I_j and the pose T_j as the optimization input;
S4.2, reprojecting the map points {m_i | i = 1, ..., n} onto image I_j to obtain the reprojected image points p̂_i^j, and computing the error e_i = p̂_i^j − p_i^j between the reprojected points and the original feature points p_i^j as the optimization quantity;
s4.3, using a non-linear optimization algorithm, by minimizing
Figure BDA0002547471950000078
Optimized map points { mi1,. n } and pose Tj
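A minimal sketch of this step, using SciPy's least_squares in place of the Ceres solver cited in the text; the pinhole projection via cv2.projectPoints and the Rodrigues-vector pose parameterization are assumptions:

import numpy as np
import cv2
from scipy.optimize import least_squares

def residuals(params, pts2d, K):
    """params = [rvec(3), tvec(3), m_1(3), ..., m_n(3)]; returns stacked reprojection errors e_i."""
    rvec, tvec = params[:3], params[3:6]
    map_points = params[6:].reshape(-1, 3)
    proj, _ = cv2.projectPoints(map_points, rvec, tvec, K, None)
    return (proj.reshape(-1, 2) - pts2d).ravel()

def optimize_frame(rvec0, tvec0, map_points0, pts2d, K):
    """Jointly refine the pose T_j and the map points by minimizing the reprojection error."""
    x0 = np.hstack([rvec0.ravel(), tvec0.ravel(), map_points0.ravel()])
    result = least_squares(residuals, x0, args=(pts2d, K))
    rvec, tvec = result.x[:3], result.x[3:6]
    return rvec, tvec, result.x[6:].reshape(-1, 3)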
In step S5, a mapping module is constructed, which includes the following key steps:
S5.1, obtaining the optimized map points {m_i | i = 1, ..., n} and pose T_j information of each image;
S5.2, calling the drawing library Pangolin[7] to draw the pose T_j with the corresponding camera model and to draw the spatial map points.
Reference to the literature
[1] E. Rosten, R. Porter and T. Drummond, "Faster and Better: A Machine Learning Approach to Corner Detection," in IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 32, no. 1, pp. 105-119, Jan. 2010, doi: 10.1109/TPAMI.2008.275.
[2] Nister D. Preemptive RANSAC for Live Structure and Motion Estimation[C]//Proceedings Ninth IEEE International Conference on Computer Vision. IEEE, 2008.
[3] OpenCV Online documentation: https://docs.opencv.org/3.4/d0/dbd/group__triangulation.html
[4] OpenCV Online documentation: https://docs.opencv.org/3.4/d9/d0c/group__calib3d.html#ga549c2075fac14829ff4a58bc931c033
[5] Rublee E, Rabaud V, Konolige K, et al. ORB: An efficient alternative to SIFT or SURF[C]//International Conference on Computer Vision. IEEE, 2012.
[6] Ceres Solver: http://www.ceres-solver.org
[7] Pangolin: https://github.com/stevenlovegrove/Pangolin
In the embodiment of the present invention, except for the specific description of the model of each device, the model of other devices is not limited, as long as the device can perform the above functions.
Those skilled in the art will appreciate that the drawings are only schematic illustrations of preferred embodiments, and the above-described embodiments of the present invention are merely provided for description and do not represent the merits of the embodiments.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims (6)

1. A method for simultaneous localization and mapping based on depth-feature optical flow, the method comprising:
estimating the feature optical flow information between two frames of images according to a feature optical flow mapping model;
designing a feature-optical-flow-based visual odometer used for image feature extraction, feature tracking, pose estimation of the camera-carrying body, and recovery of three-dimensional image features;
when tracking by the visual odometer fails, completing the re-matching of visual features through a relocation technique;
detecting whether the body has moved to a previously visited region, thereby achieving pose optimization at a certain scale;
iteratively optimizing the three-dimensional image feature information and body pose information provided by the visual odometer so as to reduce errors;
and constructing a three-dimensional map of the environment using the back-end-optimized three-dimensional image feature information and body pose information.
2. The method for simultaneous localization and mapping based on depth feature optical flow according to claim 1, wherein the design of the feature-optical-flow-based visual odometer is specifically:
S1.1, collecting a large data-set sample of image data and ground-truth feature optical flow, and dividing it into a training set and a test set;
S1.2, establishing an image pyramid {I_t^l}, in which the I_t^{l+1} layer is downscaled by a factor of 2 relative to the I_t^l layer, and performing optical flow feature extraction and iterative optimization at each pyramid level;
S1.3, performing 2× upsampling on the estimated optical flow f^{l+1} and, combining it with I_2^l, performing an image warping operation to obtain the warped image Ĩ_2^l;
S1.4, performing a correlation calculation between the obtained I_1^l and Ĩ_2^l to obtain cv^l;
S1.5, establishing an optical flow estimation local network whose inputs are the level-l pyramid image I_1^l, the image correlation value cv^l, and the upsampled optical flow up_2(f^{l+1}) of level l+1; first a 5-layer convolutional network is established whose output is a preliminary feature optical flow w^l, then a sixth convolutional layer is established to output a refined feature optical flow v^l;
S1.6, establishing an optical flow optimization local network whose inputs are the preliminary feature optical flow w^l and the refined feature optical flow v^l; first a 5-layer convolutional network is established, then a sixth convolutional layer with 2 convolution kernels is established, outputting the optimized optical flow f^l;
S1.7, performing operations S1.3-S1.6 iteratively at each pyramid level; the optical flow map output at level 0 is the obtained feature optical flow f of the two frames;
S1.8, training the feature optical flow mapping model with the training set, and performing error detection with the test set;
S1.9, for the first frame image I_1 of the test image sequence, defining the distance between feature points as d pixels and the number of extracted feature points as n;
S1.10, inputting the first and second frame images I_1 and I_2 into the optical flow network to predict the feature optical flow map f_1 between the two frames;
S1.11, the feature optical flow map f_1 is a two-channel image: the value stored at coordinate (x, y) in the first channel is the x-direction displacement u of the point of image I_1 at (x, y), and the value stored at coordinate (x, y) in the second channel is the y-direction displacement v of that point;
S1.12, analyzing the feature optical flow map f_1 and calculating the corresponding feature points p_i^2 on the second frame image I_2;
S1.13, detecting the tracked feature points p_i^2 and marking invalid points;
S1.14, for the successfully tracked point pairs (p_i^1, p_i^2), solving the corresponding spatial 3D points, which are defined as map points;
S1.15, setting the body pose of the first frame image I_1 as the reference pose T_1;
S1.16, solving the body pose T_2 at the second frame image I_2 through the correspondence between map points and image feature points;
S1.17, performing a threshold judgment on the feature point pairs (p_i^1, p_i^2); when the number of feature points is less than the threshold n, extracting Fast feature points again on the second frame image I_2 and replenishing the number of feature points to n;
S1.18, inputting the next frame image I_j; inputting images I_{j-1} and I_j into the feature optical flow mapping model to predict the feature optical flow map f_{j-1} between the two frames;
S1.19, judging whether the last frame has been reached; if not, repeating steps S1.12-S1.19.
3. The method for simultaneous localization and mapping based on depth feature optical flow according to claim 1, wherein the re-matching of visual features through the relocation technique is specifically:
setting the I_j frame as the starting frame and setting the body pose corresponding to the I_j frame as the reference pose T_j; on the basis of the I_j starting frame, inputting the next frame image and repeating the tracking process of the visual odometer.
4. The method for simultaneous localization and mapping based on depth feature optical flow according to claim 1, wherein detecting whether the body has moved to a previously visited region and achieving pose optimization at a certain scale is specifically:
for any input image frame I_j and its corresponding feature points, extracting the ORB descriptor of each feature point;
setting a descriptor matching threshold; after tracking any image frame I_l, comparing the descriptors corresponding to its feature points with the descriptors of each preceding frame image I_k, computing the Hamming distance dis, and defining loop detection as successful when dis < th2, i.e. I_l and I_k correspond to the same position of the body in space; pose optimization is then performed with a nonlinear optimization algorithm.
5. The method for simultaneous localization and mapping based on depth feature optical flow according to claim 1, wherein the iterative optimization of the three-dimensional image feature information and body pose information provided by the visual odometer to reduce errors is specifically:
taking the map points of the two obtained adjacent frame images I_{j-1} and I_j and the pose T_j as the optimization input;
reprojecting the map points onto image I_j to obtain the reprojected image points, and computing the error between the reprojected image points and the original feature points as the optimization quantity;
optimizing the map points and the pose T_j by minimizing the optimization quantity with a nonlinear optimization algorithm.
6. An apparatus for simultaneous localization and mapping based on depth feature optical flow, the apparatus comprising: a memory, a processor, and a computer program stored on the memory and executable on the processor, characterized in that the method steps of claim 1 are implemented when the processor executes the program.
CN202010565428.6A 2020-06-19 2020-06-19 Method and device for simultaneously positioning and mapping based on depth feature optical flow Pending CN111739144A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010565428.6A CN111739144A (en) 2020-06-19 2020-06-19 Method and device for simultaneously positioning and mapping based on depth feature optical flow

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010565428.6A CN111739144A (en) 2020-06-19 2020-06-19 Method and device for simultaneously positioning and mapping based on depth feature optical flow

Publications (1)

Publication Number Publication Date
CN111739144A true CN111739144A (en) 2020-10-02

Family

ID=72650338

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010565428.6A Pending CN111739144A (en) 2020-06-19 2020-06-19 Method and device for simultaneously positioning and mapping based on depth feature optical flow

Country Status (1)

Country Link
CN (1) CN111739144A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112132871A (en) * 2020-08-05 2020-12-25 天津(滨海)人工智能军民融合创新中心 Visual feature point tracking method and device based on feature optical flow information, storage medium and terminal
CN112967228A (en) * 2021-02-02 2021-06-15 中国科学院上海微系统与信息技术研究所 Method and device for determining target optical flow information, electronic equipment and storage medium
CN113052750A (en) * 2021-03-31 2021-06-29 广东工业大学 Accelerator and accelerator for task tracking in VSLAM system
CN113066103A (en) * 2021-03-18 2021-07-02 鹏城实验室 Camera interframe motion determining method
CN113724379A (en) * 2021-07-08 2021-11-30 中国科学院空天信息创新研究院 Three-dimensional reconstruction method, device, equipment and storage medium
US12002253B2 (en) 2021-11-29 2024-06-04 Automotive Research & Testing Center Feature point integration positioning system, feature point integration positioning method and non-transitory computer-readable memory

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107025668A (en) * 2017-03-30 2017-08-08 华南理工大学 A kind of design method of the visual odometry based on depth camera
CN110108258A (en) * 2019-04-09 2019-08-09 南京航空航天大学 A kind of monocular vision odometer localization method

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107025668A (en) * 2017-03-30 2017-08-08 华南理工大学 A kind of design method of the visual odometry based on depth camera
CN110108258A (en) * 2019-04-09 2019-08-09 南京航空航天大学 A kind of monocular vision odometer localization method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
DEQING SUN et al.: "PWC-Net: CNNs for Optical Flow Using Pyramid, Warping, and Cost Volume", IEEE, page 3 *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112132871A (en) * 2020-08-05 2020-12-25 天津(滨海)人工智能军民融合创新中心 Visual feature point tracking method and device based on feature optical flow information, storage medium and terminal
CN112132871B (en) * 2020-08-05 2022-12-06 天津(滨海)人工智能军民融合创新中心 Visual feature point tracking method and device based on feature optical flow information, storage medium and terminal
CN112967228A (en) * 2021-02-02 2021-06-15 中国科学院上海微系统与信息技术研究所 Method and device for determining target optical flow information, electronic equipment and storage medium
CN112967228B (en) * 2021-02-02 2024-04-26 中国科学院上海微系统与信息技术研究所 Determination method and device of target optical flow information, electronic equipment and storage medium
CN113066103A (en) * 2021-03-18 2021-07-02 鹏城实验室 Camera interframe motion determining method
CN113066103B (en) * 2021-03-18 2023-02-21 鹏城实验室 Camera interframe motion determining method
CN113052750A (en) * 2021-03-31 2021-06-29 广东工业大学 Accelerator and accelerator for task tracking in VSLAM system
CN113724379A (en) * 2021-07-08 2021-11-30 中国科学院空天信息创新研究院 Three-dimensional reconstruction method, device, equipment and storage medium
CN113724379B (en) * 2021-07-08 2022-06-17 中国科学院空天信息创新研究院 Three-dimensional reconstruction method and device for fusing image and laser point cloud
US12002253B2 (en) 2021-11-29 2024-06-04 Automotive Research & Testing Center Feature point integration positioning system, feature point integration positioning method and non-transitory computer-readable memory

Similar Documents

Publication Publication Date Title
CN108764048B (en) Face key point detection method and device
US11762475B2 (en) AR scenario-based gesture interaction method, storage medium, and communication terminal
US11030525B2 (en) Systems and methods for deep localization and segmentation with a 3D semantic map
CN111739144A (en) Method and device for simultaneously positioning and mapping based on depth feature optical flow
CN110108258B (en) Monocular vision odometer positioning method
US9418480B2 (en) Systems and methods for 3D pose estimation
US20210350560A1 (en) Depth estimation
Peng et al. Model and context‐driven building extraction in dense urban aerial images
CN103854283A (en) Mobile augmented reality tracking registration method based on online study
Taketomi et al. Real-time and accurate extrinsic camera parameter estimation using feature landmark database for augmented reality
CN113609896A (en) Object-level remote sensing change detection method and system based on dual-correlation attention
CN113807361B (en) Neural network, target detection method, neural network training method and related products
CN111914756A (en) Video data processing method and device
CN116524062B (en) Diffusion model-based 2D human body posture estimation method
CN110910375A (en) Detection model training method, device, equipment and medium based on semi-supervised learning
CN110827320A (en) Target tracking method and device based on time sequence prediction
Gao et al. A general deep learning based framework for 3D reconstruction from multi-view stereo satellite images
CN114494150A (en) Design method of monocular vision odometer based on semi-direct method
Liu et al. D-vpnet: A network for real-time dominant vanishing point detection in natural scenes
Chen et al. I2D-Loc: Camera localization via image to lidar depth flow
Xu et al. 6d-diff: A keypoint diffusion framework for 6d object pose estimation
CN114862866A (en) Calibration plate detection method and device, computer equipment and storage medium
CN114663917A (en) Multi-view-angle-based multi-person three-dimensional human body pose estimation method and device
CN114639013A (en) Remote sensing image airplane target detection and identification method based on improved Orient RCNN model
Kourbane et al. Skeleton-aware multi-scale heatmap regression for 2D hand pose estimation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination