CN111693047B - Visual navigation method for micro unmanned aerial vehicle in high-dynamic scene - Google Patents

Visual navigation method for micro unmanned aerial vehicle in high-dynamic scene Download PDF

Info

Publication number
CN111693047B
CN111693047B (application CN202010381182.7A)
Authority
CN
China
Prior art keywords
image
frame
map
optimization
error
Prior art date
Legal status
Active
Application number
CN202010381182.7A
Other languages
Chinese (zh)
Other versions
CN111693047A (en
Inventor
张磊
刘硕
文鹏程
Current Assignee
Xian Aeronautics Computing Technique Research Institute of AVIC
Original Assignee
Xian Aeronautics Computing Technique Research Institute of AVIC
Priority date
Filing date
Publication date
Application filed by Xian Aeronautics Computing Technique Research Institute of AVIC filed Critical Xian Aeronautics Computing Technique Research Institute of AVIC
Priority to CN202010381182.7A priority Critical patent/CN111693047B/en
Publication of CN111693047A publication Critical patent/CN111693047A/en
Application granted granted Critical
Publication of CN111693047B publication Critical patent/CN111693047B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • GPHYSICS
    • G01MEASURING; TESTING
    • G01CMEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C21/00Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
    • G01C21/20Instruments for performing navigational calculations
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/22Matching criteria, e.g. proximity measures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/70Determining position or orientation of objects or cameras
    • G06T7/73Determining position or orientation of objects or cameras using feature-based methods


Abstract

The invention provides a visual navigation method for a micro unmanned aerial vehicle in a high-dynamic scene, comprising the following steps: 1. photometric preprocessing of the input image through exposure control and compensation, to overcome random illumination changes in high-dynamic scenes; 2. lightweight image semantic segmentation to accurately separate highly dynamic people or objects from the static scene, overcoming the interfering factors of high-dynamic scenes; 3. joint construction of an objective function from the visual reprojection error, the image photometric error and the pixel semantic error, with the optimal solution computed by nonlinear optimization, improving the accuracy of local map optimization; 4. support for visual-inertial integrated navigation, improving navigation robustness and the data update rate; 5. a deep-learning-based loop detection algorithm that improves loop detection accuracy; 6. global pose-graph optimization that further improves navigation accuracy. The method offers high-precision, real-time, robust and extensible navigation capability.

Description

Visual navigation method for micro unmanned aerial vehicle in high-dynamic scene
Technical Field
The invention belongs to the field of computer vision navigation, and particularly relates to a visual navigation method for a micro unmanned aerial vehicle in a high dynamic scene.
Background
Micro unmanned aerial vehicles (UAVs) combine low cost, light weight, flexibility, maneuverability, ease of concealment and ease of penetration, and have become a research hotspot in recent years.
A micro UAV has very limited payload, computing and energy capacity and therefore cannot carry a large sensor suite to improve positioning and navigation accuracy. Current UAV positioning and navigation relies mainly on inertial navigation, global satellite navigation, radio navigation with dead reckoning, and visual navigation. Inertial navigation is accurate, but high-accuracy inertial units are too large and heavy for a micro UAV; global satellite navigation depends on satellite signals, has poor autonomy and real-time performance, and easily fails inside buildings and in similar scenes; radio navigation requires ground stations for accurate positioning and is likewise limited to large, open scenes. Visual navigation, by contrast, is autonomous, accurate, intuitive and intelligent, depends little on external sensing sources, and generalizes well to all kinds of complex scenes, so it is favored in the UAV navigation field. With the continuing progress of visual sensor technology, machine vision and artificial intelligence, visual navigation has rapidly become a research hotspot in this field.
Visual navigation can be divided into passive and active visual navigation according to the sensor type. Active visual navigation senses the environment with active detection devices such as lidar, millimeter-wave radar and sonar; although these offer high detection accuracy and direct scene depth information, they are bulky, heavy and expensive.
A monocular camera, in contrast, is small, light, inexpensive and has a long detection range, which makes it better suited to micro UAV applications. Moreover, most micro UAVs already carry video equipment as a mission payload; with monocular visual navigation, the same video stream can also be used for autonomous navigation and positioning, so one device serves two purposes, further reducing the size, power consumption and payload weight of the micro UAV while achieving passive visual navigation.
Micro UAVs are mainly used in complex environments such as building interiors, urban blocks and jungles. These scenes generally feature complex three-dimensional structure and obstacles of many kinds and shapes, and they place high demands on the accuracy and real-time performance of positioning and navigation, which makes self-positioning and safe navigation by visual means a major challenge. Moreover, the problems that visual navigation must solve differ with the application scene of the micro UAV. In narrow indoor environments the illumination is stable, but image features are hard to extract from white, textureless wall regions, and indoor structures are so self-similar that semantic constraints contribute little; outdoors, the flight range grows substantially and the accumulated error of the visual odometer increases, while map points multiply rapidly, raising the cost of global map optimization and making real-time operation hard to guarantee; in urban blocks, fast-moving vehicles and pedestrians cause image matching errors, so camera motion and the map cannot be estimated correctly; and in the field, illumination and seasonal changes prevent both feature-based and direct methods from accurately recovering the camera motion and the three-dimensional scene map from image sequences.
Therefore, facing the limitations and constraints of complex scenes, a new monocular visual navigation method must be developed for the safe-navigation needs of micro UAVs, providing technical support and design guidance for safe navigation of micro UAVs in complex environments.
Disclosure of Invention
In order to solve the above problems, the invention provides a visual navigation method for a micro unmanned aerial vehicle in a high-dynamic scene, which adopts the following technical scheme:
The visual navigation method for a micro unmanned aerial vehicle in a high-dynamic scene comprises the following steps:
[1] Image preprocessing
First feature extraction, semantic segmentation and deep neural network feature extraction are performed on the input image. Preferably, the semantic segmentation of targets in the image uses the DFANet or Mask R-CNN semantic segmentation algorithm, and the deep neural network feature extraction uses a convolutional neural network; of course, other algorithms or networks may be substituted according to the real-time computation requirements, provided the data requirements are met;
[2] Visual odometry
Inter-frame image matching and local map optimization are performed according to the first features of the image and the semantic segmentation result, thereby updating the map and the state;
preferably, the first feature extraction uses the FAST, Corner, ORB, SURF or SIFT algorithm;
[3] Loop detection
The deep neural network features extracted from the input image are first reduced in dimensionality to simplify them and are recorded as the current frame; historical keyframes are then searched in the global map and the similarity between the current frame and the historical frames is computed. If a loop exists, a similarity transformation is applied to the current frame and the corrected keyframe pose is output; if no loop exists, processing continues with the deep neural network features extracted from new input images;
[4] Global optimization and map creation
Local optimization is performed according to the locally updated map and state obtained in step 2 and the corrected keyframe pose obtained in step 3; all map points are then optimized over a tree or graph through a filtering theory or an optimization theory, finally yielding the optimal global pose estimate and map.
Preferably, in step 1, before the first feature extraction, semantic segmentation and deep neural network feature extraction, photometric correction is applied to each input frame to eliminate the influence of illumination changes on the image. Other algorithms or correction methods may be used to adjust the image, provided they satisfy at least the accuracy and other requirements that the subsequent steps place on the relevant parameters.
Preferably, the photometric correction of each input frame is specifically:
the imaging model of each input frame is
I_i(x) = G(t_i V(x) B_i(x))    (1)
where G: R → [0,255] is the nonlinear response function, V: Ω → [0,1] is the lens attenuation, B_i and I_i are respectively the irradiance and the observed intensity of image frame i, and t_i is the exposure time;
the photometric correction of each input frame is
I'_i(x) = t_i B_i(x) = G⁻¹(I_i(x)) / V(x)    (2)
The correction result obtained by applying equation (2) to the model of equation (1) is used for photometric parameter estimation during the local map optimization of step 2.
This correction method has a comparatively small computational load and a high computation speed, and is therefore particularly suitable for on-board computation on a micro unmanned aerial vehicle.
Preferably, in step 2, the inter-frame image matching specifically judges through frame tracking whether the current frame is a keyframe. If it is a keyframe, map points and semantics are updated so as to complete the local map optimization and update the map; if not, triangle matching is carried out and the semantic and depth information is updated so as to update the state.
There are many keyframe judgment methods, but considering the computing capability of the micro unmanned aerial vehicle and the maximum bearable on-board weight, a method with a smaller computational load and higher accuracy should be chosen wherever possible, for example:
under the geometric, photometric and semantic constraints between two image frames, a reprojection error constraint term within a sliding window, a pixel photometric error constraint term within a neighborhood, and a semantic error constraint term are established;
wherein (a) reprojection error
the reprojection error is the positional deviation in the image between an observed point (x_j, y_j)^T and the back-projection of its corresponding map point, and is defined by equation (3) (given as an image in the original),
where T_i is the pose of image frame i in the world coordinate system, T_j is the pose of image frame j in the world coordinate system, K is the camera intrinsic calibration matrix, π denotes the conversion from homogeneous to Cartesian coordinates, and d is the estimated depth of the key point;
(b) pixel photometric error in the neighborhood
for any point p ∈ Ω_i on image frame i, the photometric error on another frame j is defined as
E_photo = ∑_{p∈N_p} ω_p ‖ (I_j[p'] − b_j) − (t_j e^{a_j})/(t_i e^{a_i}) · (I_i[p] − b_i) ‖_γ    (4)
where N_p is the set of pixels in the neighborhood around point p, I_i and I_j are the two adjacent image frames, t_i and t_j are their exposure times, a_i, b_i, a_j and b_j are the affine illumination correction coefficients, ‖·‖_γ denotes the Huber norm, ω_p is a gradient-dependent weight, and p' is the projection of p onto image frame I_j;
(c) semantic error
the semantic error measures, for a map point labelled with semantic class c, the probability that the corresponding pixel in the image also carries the semantic label c. When a semantic image S_k is observed with camera pose T_k, this probability is taken to be inversely related to the distance between the two-dimensional pixel coordinates π(T_k, p_i), obtained by re-projecting the associated three-dimensional point p_i into S_k, and the nearest image region labelled c. The semantic error of the map point is defined by equations (5)-(7) (given as images in the original), in which DT denotes a distance-transform function on the binary image of class c, σ is the uncertainty of the semantic image segmentation, and the probability vector of the map point p_i is computed from all of its observations: if p_i is observed by camera T_i and F is the set of keyframes to be optimized, the vector accumulates the observation likelihoods over F;
defining the sum of the reprojection error constraint term of the first feature points within the sliding window, the pixel photometric error constraint term within the neighborhood, and the semantic error constraint term as the visual error sum function gives
E_vis = η_R · E_reproj + η_P · E_photo + η_S · E_sem    (8)
where η_R, η_P and η_S are the weighting coefficients of the reprojection error constraint term, the neighborhood pixel photometric error constraint term and the semantic error constraint term respectively, determined by analyzing the image semantic information of the current scene;
visual and inertial measurements are combined for integrated navigation: assuming that the relative pose of the camera and the inertial measurement unit has been accurately calibrated, local map optimization is achieved by simultaneously solving, within the sliding window, for the minimum of the visual error sum function and the inertial error sum function. Estimating the camera motion from the information of several keyframes in the sliding window is equivalent to finding the solution at which equation (8) attains its minimum, so the estimation of the camera motion state becomes a nonlinear optimization problem, as in equation (9):
X* = argmin_X E_vis(X)    (9)
When the Gauss-Newton iteration is used to solve this problem, the normal equations
J^T W J · Δx = −J^T W r
must be computed, where W is the weight matrix, r is the residual vector, and J is the Jacobian matrix.
Preferably, the dimensionality-reduction processing of the extracted deep neural network features in the loop detection of step 3, which simplifies the features, is specifically:
on the premise that the high-dimensional data and the dimensionality-reduced low-dimensional data share a similar or identical probability distribution, distances measured in the high-dimensional space remain similar after the data are mapped into the low-dimensional space. Using an energy-based model, the probability distributions of the samples in the high-dimensional and the low-dimensional space can be defined as
p_{j|i} = exp(−‖x_i − x_j‖²) / ∑_{k≠i} exp(−‖x_i − x_k‖²),  q_{j|i} = exp(−‖y_i − y_j‖²) / ∑_{k≠i} exp(−‖y_i − y_k‖²)    (10)
where y is the vector to which x is mapped in the low-dimensional space.
To keep the distributions of the feature vectors consistent between the high-dimensional and the low-dimensional space, the distance between the two distributions is measured with the KL divergence:
H = ∑_i KL(P_i ‖ Q_i) = ∑_i ∑_j p_{j|i} (log p_{j|i} − log q_{j|i})    (11)
where P_i and Q_i are the probability distributions of sample i in the high-dimensional and the low-dimensional space respectively; by minimizing the KL divergence measure, a corresponding dimension-reduction mapping is learned, i.e. the features are simplified without losing data information.
Preferably, the computation of the similarity between the current frame and the historical frames is specifically:
the current frame set is recorded as P = {p1, p2, …, pn};
the historical frame set is recorded as F = {f1, f2, …, fn};
the dimensionality-reduced feature vectors of keyframe 1 and keyframe 2 are recorded respectively as
V1 = (v1,1, v1,2, …, v1,m),  V2 = (v2,1, v2,2, …, v2,m)    (12)
where m is the feature dimension after the dimensionality-reduction operation;
each pair of reduced feature values is compared through a Gaussian function, giving
s_k = exp(−(v1,k − v2,k)² / (2σ²)),  k = 1, …, m    (13)
and the similarity score between the two keyframes is then the average of the sum over all features,
Score = (1/m) ∑_{k=1}^{m} s_k    (14)
where m is the feature dimension after dimensionality reduction.
Whether the similarity score between the two keyframes reaches a similarity threshold is then judged: if the similarity score is greater than the threshold, a loop is detected, a similarity transformation is applied to the current keyframe image, and the corrected keyframe pose is output; if the similarity score is less than the threshold, no loop is detected.
Preferably, the optimization of all map points over a tree or graph through a filtering theory or an optimization theory in step 4 is specifically: a pose optimization method based on nonlinear optimization that realizes local pose optimization and global pose optimization at the same time. That is, for monocular camera image input, the objective function is determined by the state variables and the observation equation:
(x, y)* = argmin_{x,y} ∑_{(k,j)∈C} ‖ e(x_k, y_i, z_{k,j}) ‖²    (15)
where C is the optimal consistent sample set; x_k is the motion pose of the unmanned aerial vehicle at time k, called a pose node in the graph optimization; y_i is a landmark observable by the robot's visual sensor at time k, called a point node in the graph optimization; and z_{k,j} represents the constraint between a pose node and a point node. When the optimization step is loop detection, z_{k,j} represents the constraint between two pose nodes. e(x_k, y_i, z_{k,j}) is a vector error function reflecting the degree to which the two nodes satisfy the constraint; it is represented in the graph optimization as the connecting line between the two nodes, a smaller value indicating that the matching constraint is better satisfied, with the value 0 meaning that the two nodes satisfy the matching constraint exactly;
assuming that the error function follows a Gaussian distribution, the Bayesian formulation is: solve for the (x, y) that makes the system most likely to have produced the current observations Z, i.e. the maximum likelihood estimate
(x, y)* = argmax_{x,y} P(Z | x, y)    (16)
so that minimizing e(x_k, y_i, z_{k,j}) is in essence solving for the maximum likelihood.
The beneficial effects of the invention are:
the visual navigation method of the micro unmanned aerial vehicle in the high dynamic scene provided by the invention is an improvement on the SLAM method aiming at the characteristics of a complex application environment on the basis of comprehensively analyzing the technical advantages and disadvantages of various typical SLAM methods, can meet the safety navigation requirement of the micro unmanned aerial vehicle in the low altitude high dynamic scene and without depending on a global navigation satellite system and prior geographic information, and has the navigation capabilities of high precision, strong real-time performance, robustness, autonomy, safety and expandability.
Drawings
FIG. 1 is a flow chart of a visual navigation method of a micro unmanned aerial vehicle in a high dynamic scene;
FIG. 2 is a block flow diagram of the visual odometer of the present invention.
Detailed Description
The present invention will be described in detail below with reference to the accompanying drawings and specific embodiments.
A typical visual SLAM system consists essentially of four modules, namely, visual odometry, loopback detection, back-end non-linear optimization, and mapping.
The visual odometer establishes data correspondences between two observed image frames and solves for the relative transformation between them, so that observations made in different coordinate systems can be placed in a common coordinate system. Because the two frames are each expressed in their own sensor coordinate system, the resulting relative transformation is the relative pose of the sensor.
Loop detection means detecting whether the unmanned aerial vehicle has entered the same place again, i.e. judging through a loop whether it has returned to a previously visited location. Because errors accumulate continuously, the estimated trajectory of the unmanned aerial vehicle differs from the true one by the time a loop is detected. Once a loop is detected, global optimization is started to optimize the pose along the whole loop according to the error between the estimated and true values, improving the accuracy of the unmanned aerial vehicle's pose estimation and map construction.
The back-end nonlinear optimization receives the camera poses measured by the visual odometer at different times together with the loop detection information, optimizes them, and obtains a globally consistent trajectory and map.
Map construction builds a map matching the task requirements from the estimated trajectory.
The invention provides a visual navigation method for a micro unmanned aerial vehicle in a high-dynamic scene, aimed at the need for autonomous navigation of micro unmanned aerial vehicles in low-altitude, high-dynamic scenes.
First, to handle random illumination changes in low-altitude, high-dynamic scenes, an exposure control and compensation algorithm preprocesses the input image; to handle the weak resistance of the first features to viewpoint transformation in image-frame matching, a fast, viewpoint-transformation-resistant image matching of the first features is used, making the matching faster and more robust; and to handle the interference of highly dynamic people or objects with the accuracy of camera motion estimation and map construction, a lightweight image semantic segmentation algorithm accurately separates the highly dynamic people or objects from the static scene.
Second, to address the limited accuracy of local optimization, an objective function is constructed jointly from the visual reprojection error, the image photometric error and the pixel semantic error, a joint constraint relation is established, and local optimization is realized by a nonlinear optimization algorithm. To satisfy the multi-sensor fusion needs of the micro unmanned aerial vehicle, the method supports integrated navigation from visual and inertial measurements during local map optimization.
Third, to address the low accuracy and positioning precision of existing loop detection algorithms, a deep-learning-based loop detection algorithm is adopted, raising the image matching and positioning accuracy to sub-pixel level.
Finally, the global pose graph is optimized and the global map is constructed, further improving the accuracy of camera motion estimation and map construction.
The invention provides a visual navigation method of a micro unmanned aerial vehicle in a high dynamic scene, which is shown in a flow chart in figure 1 and mainly comprises the following four steps:
step 1, image preprocessing
The method comprises the steps of obtaining an input image, and carrying out first feature extraction, semantic segmentation and deep neural network feature extraction.
The first feature extraction may be performed by using FAST, Corner, ORB, SURF, or SIFT algorithms, and the ORB feature extraction is described in this embodiment.
Before the ORB feature extraction, semantic segmentation and deep neural network feature extraction, camera photometric calibration can be used to eliminate the influence of illumination changes; the photometric parameters comprise the lens attenuation, the gamma correction and the exposure time. The photometric camera model maps the true energy (irradiance) received by a pixel on the imaging sensor to the corresponding intensity value. The imaging model is
I_i(x) = G(t_i V(x) B_i(x))    (1)
where G: R → [0,255] is the nonlinear response function, V: Ω → [0,1] is the lens attenuation (vignetting), B_i and I_i are respectively the irradiance and the observed intensity of image frame i, and t_i is the exposure time. This model is used as the first step of the photometric correction of each video frame, computed as
I'_i(x) = t_i B_i(x) = G⁻¹(I_i(x)) / V(x)    (2)
The correction result obtained by applying equation (2) to the model of equation (1) is used to estimate the photometric parameters during the local map optimization.
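By way of illustration only, the following minimal sketch shows how the per-frame photometric correction of equations (1)-(2) could be applied. It assumes that an offline photometric calibration has provided the inverse response curve G⁻¹ as a 256-entry lookup table and the vignette (lens attenuation) map V; the function and variable names are illustrative and not taken from the patent.

```python
import numpy as np

def photometric_correction(frame_u8: np.ndarray,
                           g_inv: np.ndarray,       # shape (256,): G^-1 as a lookup table
                           vignette: np.ndarray     # shape (H, W): V(x), values in (0, 1]
                           ) -> np.ndarray:
    """Return t_i * B_i(x) = G^-1(I_i(x)) / V(x) for an 8-bit grayscale frame."""
    irradiance = g_inv[frame_u8.astype(np.uint8)]          # undo the response function G
    return irradiance / np.clip(vignette, 1e-6, None)      # undo the lens attenuation V(x)
```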
To address the problems that the ORB image matching algorithm has a high rate of erroneous points, that the pixel contrast is easily affected by noise, and that its resistance to viewpoint transformation is weak, a fast image matching algorithm based on improved viewpoint-transformation-resistant ORB can be adopted.
Semantic features further generalize and organize the local features up to the classification level understood by humans. Although changes of viewpoint, illumination and scale affect the low-level feature representation of an object, its semantic representation does not change.
After comparison with the current mainstream image segmentation algorithms, the DFANet or Mask R-CNN algorithm is adopted as the real-time semantic segmentation scheme (see the sketch after this list), which has two advantages:
1) the high-level feature information of the network is fully utilized through a deep feature aggregation structure;
2) lightweight encoding is achieved through the DFA lightweight feature aggregation structure.
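As an illustrative sketch only, the fragment below uses the off-the-shelf Mask R-CNN of torchvision to obtain per-pixel masks of potentially dynamic objects (people, vehicles) so that features falling on them can be discarded; the preferred lightweight network in this embodiment is DFANet, and the COCO class list, score threshold and torchvision ≥ 0.13 weights argument below are assumptions, not values from the patent.

```python
import torch
import torchvision
from torchvision.transforms.functional import to_tensor

DYNAMIC_COCO_IDS = {1, 2, 3, 4, 6, 8}   # person, bicycle, car, motorcycle, bus, truck

model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT").eval()

@torch.no_grad()
def dynamic_mask(image_rgb, score_thr=0.5):
    """Return a boolean (H, W) mask that is True on likely dynamic objects."""
    out = model([to_tensor(image_rgb)])[0]
    mask = torch.zeros(image_rgb.shape[:2], dtype=torch.bool)
    for m, label, score in zip(out["masks"], out["labels"], out["scores"]):
        if score > score_thr and int(label) in DYNAMIC_COCO_IDS:
            mask |= (m[0] > 0.5)                 # each m has shape (1, H, W)
    return mask
```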
Step 2, visual odometry
Inter-frame image matching and local map optimization are performed according to the ORB features of the image and the semantic segmentation result, thereby updating the map and the state.
As shown in fig. 2, the method specifically comprises the following steps:
step 2.1, inter-frame image matching
Whether the current frame is a keyframe is judged through frame tracking. If it is a keyframe, map points and semantics are updated so as to complete the local map optimization and update the map; if not, triangle matching is carried out and the semantic and depth information is updated so as to update the state.
On the first flight, the image is matched against the pose of a pre-stored base map.
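The branching logic of step 2.1 can be summarised by the following skeleton; the tracker, keyframe test and map-update routines are deliberately left abstract, since the patent does not specify them at code level, and only the decision flow of the text is shown.

```python
class VisualOdometry:
    def __init__(self, local_map, sliding_window):
        self.local_map = local_map
        self.window = sliding_window

    def process_frame(self, frame):
        pose, matches = self.track(frame)            # frame tracking against the last keyframe
        if self.is_keyframe(frame, matches):
            self.update_map_points(frame, pose)      # insert new map points and their semantics
            self.local_map.optimize(self.window)     # local map optimization (steps 2.2-2.3)
        else:
            self.triangulate(frame, pose)            # triangle matching; refresh semantics/depth
        return pose

    # Placeholders for the concrete algorithms used in the embodiment.
    def track(self, frame): raise NotImplementedError
    def is_keyframe(self, frame, matches): raise NotImplementedError
    def update_map_points(self, frame, pose): raise NotImplementedError
    def triangulate(self, frame, pose): raise NotImplementedError
```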
Step 2.2: under the geometric, photometric and semantic constraints between the two image frames, a reprojection error constraint term within a sliding window, a pixel photometric error constraint term within a neighborhood, and a semantic error constraint term are established.
(a) Reprojection error
The reprojection error is the positional deviation in the image between an observed point (x_j, y_j)^T and the back-projection of its corresponding map point, and is defined by equation (3) (given as an image in the original),
where T_i is the pose of image frame i in the world coordinate system, T_j is the pose of image frame j in the world coordinate system, K is the camera intrinsic calibration matrix, π denotes the conversion from homogeneous to Cartesian coordinates, and d is the estimated depth of the key point;
(b) pixel photometric error in the neighborhood
For any point p ∈ Ω_i on image frame i, the photometric error on another frame j is defined as
E_photo = ∑_{p∈N_p} ω_p ‖ (I_j[p'] − b_j) − (t_j e^{a_j})/(t_i e^{a_i}) · (I_i[p] − b_i) ‖_γ    (4)
where N_p is the set of pixels in the neighborhood around point p, I_i and I_j are the two adjacent image frames, t_i and t_j are their exposure times, a_i, b_i, a_j and b_j are the affine illumination correction coefficients, ‖·‖_γ denotes the Huber norm, ω_p is a gradient-dependent weight, and p' is the projection of p onto image frame I_j;
(c) semantic error
The semantic error measures, for a map point labelled with semantic class c, the probability that the corresponding pixel in the image also carries the semantic label c. When a semantic image S_k is observed with camera pose T_k, this probability is taken to be inversely related to the distance between the two-dimensional pixel coordinates π(T_k, p_i), obtained by re-projecting the associated three-dimensional point p_i into S_k, and the nearest image region labelled c. The semantic error of the map point is defined by equations (5)-(7) (given as images in the original), in which DT denotes a distance-transform function on the binary image of class c, σ is the uncertainty of the semantic image segmentation, and the probability vector of the map point p_i is computed from all of its observations: if p_i is observed by camera T_i and F is the set of keyframes to be optimized, the vector accumulates the observation likelihoods over F.
Defining the sum of the reprojection error constraint term of the first feature points within the sliding window, the pixel photometric error constraint term within the neighborhood, and the semantic error constraint term as the visual error sum function gives
E_vis = η_R · E_reproj + η_P · E_photo + η_S · E_sem    (8)
where η_R, η_P and η_S are the weighting coefficients of the reprojection error constraint term, the neighborhood pixel photometric error constraint term and the semantic error constraint term respectively; these coefficients are determined by analyzing the image semantic information of the current scene.
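A minimal sketch of the joint visual objective of equation (8) follows; the three per-term callables stand in for evaluations of equations (3), (4) and (5)-(7), and the default weight values are assumptions, since the text only states that the weights are chosen from the semantic content of the current scene.

```python
def visual_error(frame_pairs, reproj_term, photo_term, semantic_term,
                 eta_r=1.0, eta_p=0.5, eta_s=0.25):
    """frame_pairs: iterable of (frame_i, frame_j) drawn from the sliding window;
    each *_term callable returns the summed error of one constraint term."""
    total = 0.0
    for fi, fj in frame_pairs:
        total += eta_r * reproj_term(fi, fj)      # reprojection error of matched first features
        total += eta_p * photo_term(fi, fj)       # photometric error of neighbourhood pixels
        total += eta_s * semantic_term(fi, fj)    # semantic error of labelled map points
    return total
```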
Step 2.3: visual and inertial measurements are combined for integrated navigation. Assuming that the relative pose of the camera and the inertial measurement unit has been accurately calibrated, local map optimization is achieved by simultaneously solving, within the sliding window, for the minimum of the visual error sum function and the inertial error sum function.
Estimating the camera motion from the information of several keyframes in the sliding window is equivalent to finding the solution at which equation (8) attains its minimum, so the estimation of the camera motion state becomes a nonlinear optimization problem, as in equation (9):
X* = argmin_X E_vis(X)    (9)
When the Gauss-Newton iteration is used to solve this problem, the normal equations
J^T W J · Δx = −J^T W r
must be computed, where W is the weight matrix, r is the residual vector, and J is the Jacobian matrix.
Step 3, loop detection
Loop detection mainly addresses the pose drift that accumulates in the visual odometer over time while the unmanned aerial vehicle platform moves during positioning and mapping; in essence it is an image matching problem.
The method adopts a deep-learning-based loop detection approach, making full use of the excellent feature-learning ability of deep networks to obtain a good feature representation of the images acquired by the unmanned aerial vehicle platform and to improve the effectiveness of loop detection. After the deep neural network features of the image are obtained with a convolutional neural network, the learned features are reduced in dimensionality and loop detection is realized through similarity computation.
In the method, a pre-trained and widely used VGG16 network serves as the backbone; different convolutional and fully connected layers are added to detect targets in each image frame while acquiring higher-level semantic information, as in the classic target detection network SSD (Single Shot MultiBox Detector). In the SSD, a convolutional network and auxiliary network structure are added after the VGG16 network so that, after layer-by-layer convolution and pooling, the extracted feature maps keep shrinking as the network deepens.
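As an illustrative sketch, a fixed-length keyframe descriptor can be taken from the convolutional part of a pre-trained VGG16 (torchvision), as suggested above; the global average pooling used to obtain a 512-dimensional vector is an assumed choice, ImageNet normalisation is omitted for brevity, and torchvision ≥ 0.13 is assumed for the weights argument.

```python
import torch
import torchvision
from torchvision.transforms.functional import to_tensor

backbone = torchvision.models.vgg16(weights="DEFAULT").features.eval()

@torch.no_grad()
def keyframe_descriptor(image_rgb):
    """Return a (1, 512) global descriptor for one keyframe image."""
    feat = backbone(to_tensor(image_rgb).unsqueeze(0))           # (1, 512, h, w) conv features
    return torch.flatten(torch.nn.functional.adaptive_avg_pool2d(feat, 1), 1)
```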
Generally, the highest layer of a convolutional neural network is a fully connected layer, and the dimensionality of the image features it produces is usually high, which imposes a heavy computational burden on training the corresponding classifier and regression method. To suit the low-power unmanned aerial vehicle platform, an affine transformation is used to reduce the dimensionality of the features obtained by the convolutional neural network, which is necessary for the subsequent similarity computation and positioning. On the premise that the high-dimensional data and the dimensionality-reduced low-dimensional data share a similar or identical probability distribution, distances measured in the high-dimensional space remain similar after the data are mapped into the low-dimensional space. Using an energy-based model, the probability distributions of the samples in the high-dimensional and the low-dimensional space can be defined as
p_{j|i} = exp(−‖x_i − x_j‖²) / ∑_{k≠i} exp(−‖x_i − x_k‖²),  q_{j|i} = exp(−‖y_i − y_j‖²) / ∑_{k≠i} exp(−‖y_i − y_k‖²)    (10)
where y is the vector to which x is mapped in the low-dimensional space.
Since the distributions of the feature vectors in the high-dimensional and the low-dimensional space are to be kept consistent, the distance between the two distributions can be measured with the KL divergence:
H = ∑_i KL(P_i ‖ Q_i) = ∑_i ∑_j p_{j|i} (log p_{j|i} − log q_{j|i})    (11)
where P_i and Q_i are the probability distributions of sample i in the high-dimensional and the low-dimensional space respectively. By minimizing the KL divergence measure, a corresponding dimension-reduction mapping is learned, i.e. the features are simplified without losing data information, speeding up the processing of the image data.
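The dimensionality reduction described by equations (10)-(11) minimises the KL divergence between the high- and low-dimensional distributions; as a stand-in sketch only, t-SNE (scikit-learn) performs this kind of KL minimisation and can be used offline to verify the idea, although it does not by itself provide the parametric (affine) mapping mentioned above.

```python
import numpy as np
from sklearn.manifold import TSNE

def reduce_descriptors(descriptors: np.ndarray, dim: int = 3) -> np.ndarray:
    """descriptors: (n_keyframes, D) CNN features -> (n_keyframes, dim);
    the perplexity must stay below the number of keyframes."""
    return TSNE(n_components=dim, init="pca", perplexity=10).fit_transform(descriptors)
```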
In visual SLAM, the unmanned aerial vehicle acquires keyframes as it flies continuously. Each time a keyframe is acquired, the required features are extracted with the deep convolutional network and then vectorized into the corresponding feature vector, and loop detection is performed by computing the similarity between the current frame and the historical frames. The set of keyframes acquired by the unmanned aerial vehicle platform is recorded as P = {p1, p2, …, pn}, and loop detection is computed for each keyframe image; that is, for every keyframe image the convolutional neural network yields a corresponding vectorized feature, the set of which is expressed as F = {f1, f2, …, fn}. After the feature vectors have been reduced in dimensionality, their similarity is computed. In the similarity computation, the dimensionality-reduced feature vectors of keyframe 1 and keyframe 2 are recorded respectively as
V1 = (v1,1, v1,2, …, v1,m),  V2 = (v2,1, v2,2, …, v2,m)    (12)
where m is the feature dimension after the dimensionality-reduction operation.
In the method, each pair of reduced feature values is compared through a Gaussian function, giving
s_k = exp(−(v1,k − v2,k)² / (2σ²)),  k = 1, …, m    (13)
and the similarity score between the two keyframes is then the average of the sum over all features,
Score = (1/m) ∑_{k=1}^{m} s_k    (14)
where m is the feature dimension after dimensionality reduction.
Loop detection judges whether the similarity score between the two keyframes reaches a similarity threshold. If the similarity score is greater than the threshold, a loop is detected, a similarity transformation is applied to the current keyframe image, and the corrected keyframe pose is output; if the similarity score is less than the threshold, no loop is detected.
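The similarity test of equations (13)-(14) reduces to a few lines; the Gaussian bandwidth sigma and the similarity threshold are assumed tuning parameters, not values given in the patent.

```python
import numpy as np

def similarity(v1: np.ndarray, v2: np.ndarray, sigma: float = 1.0) -> float:
    s = np.exp(-((v1 - v2) ** 2) / (2.0 * sigma ** 2))   # Gaussian comparison per feature
    return float(np.mean(s))                             # average over the m dimensions

def detect_loop(current: np.ndarray, history, threshold: float = 0.85):
    """Return the index of the best-matching historical keyframe, or None."""
    scores = [similarity(current, h) for h in history]
    if not scores:
        return None
    best = int(np.argmax(scores))
    return best if scores[best] > threshold else None
```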
Step 4, global optimization and map creation
The global map is optimized, and a flight path is planned for navigation, according to the map and state updated in step 2 and the keyframe poses corrected in step 3.
As the visual mileage keeps increasing, map points become more and more numerous; to guarantee real-time operation of the visual odometer, local map optimization processes only the map points near the current position. All map points are optimized over a tree or graph through a filtering theory (EKF, PF, etc.) or an optimization theory to obtain the optimal pose estimate and map. Given the repetitive nature of unmanned aerial vehicle missions, the constructed map is fully reused, and semantic information is added to the map to support inter-frame image matching, local map optimization and loop detection based on semantic constraints.
On the basis of a comprehensive analysis of the technical strengths and weaknesses of the typical SLAM methods, the method improves the SLAM method for the characteristics of complex application environments. To overcome the drawbacks that the point-cloud maps constructed by mainstream visual SLAM algorithms are sparse and that point features struggle to express environmental structure accurately, straight lines are used as features to build the map, and a graph optimization method is used to improve the positioning accuracy and the map construction accuracy.
To overcome the linearization error, low update efficiency and other problems of the traditional filtering-based pose estimation, the invention designs a pose optimization method based on nonlinear optimization, following the graph-optimization idea in SLAM research, which realizes local pose optimization and global pose optimization at the same time. The classical SLAM mathematical model consists of three parts: the state variables, the motion equation and the observation equation. For monocular camera input, the objective function of the method uses only the state variables and the observation equation, as shown in equation (15):
(x, y)* = argmin_{x,y} ∑_{(k,j)∈C} ‖ e(x_k, y_i, z_{k,j}) ‖²    (15)
where C is the optimal consistent sample set; x_k is the motion pose of the unmanned aerial vehicle at time k, called a pose node in the graph optimization; y_i is a landmark observable by the robot's visual sensor at time k, called a point node in the graph optimization; and z_{k,j} represents the constraint between a pose node and a point node. When the optimization step is loop detection, z_{k,j} represents the constraint between two pose nodes. e(x_k, y_i, z_{k,j}) is a vector error function reflecting the degree to which the two nodes satisfy the constraint; it is represented in the graph optimization as the connecting line between the two nodes, also called an edge. The smaller its value, the better the two nodes satisfy the matching constraint, and a value of 0 indicates that the matching constraint is satisfied exactly.
Assuming that the error function follows a Gaussian distribution, the Bayesian formulation of the SLAM problem is: solve for the (x, y) that makes the system most likely to have produced the current observed data Z, i.e. the maximum likelihood estimate shown in equation (16)
(x, y)* = argmax_{x,y} P(Z | x, y)    (16)
Minimizing the quadratic form of e(x_k, y_i, z_{k,j}) after solving for the maximum likelihood, i.e. after assuming Gaussian noise, is essentially equivalent to a least-squares problem. This nonlinear least-squares problem can be solved with the graph optimization tool g2o.
The SLAM problem is thus fully abstracted into nodes and edges: nodes are the optimization variables, comprising pose nodes and point nodes, and edges are the error constraints, comprising pose-point constraints and pose-pose constraints. Pose-point constraints come from the camera observations, and pose-pose constraints come from the loop detection in the system. Combining the two effectively suppresses the system's accumulated error and yields a map with consistent information.
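The node-and-edge formulation of equations (15)-(16) is illustrated below on a deliberately tiny 2D toy problem: pose nodes and point nodes are the variables, pose-point edges come from camera observations, and pose-pose edges come from loop closure. In practice the least-squares problem is solved with a graph optimization tool such as g2o; scipy is used here only to show the structure, and all numbers are illustrative.

```python
import numpy as np
from scipy.optimize import least_squares

def residuals(params, n_poses, obs_edges, loop_edges):
    poses = params[:2 * n_poses].reshape(-1, 2)        # x_k: 2D pose nodes (toy example)
    points = params[2 * n_poses:].reshape(-1, 2)       # y_i: 2D point nodes (landmarks)
    res = []
    for k, i, z in obs_edges:                          # pose-point constraints z_{k,i}
        res.append(points[i] - poses[k] - z)
    for k, j, z in loop_edges:                         # pose-pose constraints from loop closure
        res.append(poses[j] - poses[k] - z)
    return np.concatenate(res)

# toy usage: two poses, one landmark, one loop-closure edge
obs = [(0, 0, np.array([1.0, 0.0])), (1, 0, np.array([0.0, 0.0]))]
loops = [(0, 1, np.array([1.0, 0.0]))]
x0 = np.zeros(2 * 2 + 2)                               # initial guess for poses and landmark
sol = least_squares(residuals, x0, args=(2, obs, loops))
```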
The method uses an NVIDIA Jetson TX2 embedded computing module as its computing platform; this is an AI supercomputer-on-a-module based on the NVIDIA Pascal architecture, with the technical parameters detailed in Table 1.
TABLE 1 NVIDIA Jetson TX2 computing module hardware resources
[Table 1 is reproduced as an image in the original publication.]
Algorithm testing and verification are performed on a ground experimental platform using four widely used public data sets and one self-collected data set; the ground experimental platform is an NVIDIA Jetson TX2 developer kit. Test data sets are selected to match typical flight environments of the micro unmanned aerial vehicle, and the verification items include navigation accuracy, computational complexity, robustness to illumination change, robustness to scene change, and so on, as detailed in Table 2.
TABLE 2 Test items corresponding to each data set
Data set | Test environment | Test characteristics
KITTI data set | Indoors and city blocks | Dynamic scenes
EuRoC data set | Indoor rooms, industrial areas | Dynamic scenes
TUM data set | Indoor corridors and jungle | Indoor loop paths
Oxford data set | City blocks (buildings and roads) | Outdoor loop paths
ICL-NUIM data set | Indoor rooms | Indoor loop paths
Self-collected data set | Roads | Homogeneous scenes

Claims (6)

1. A visual navigation method for a microminiature unmanned aerial vehicle in a high dynamic scene is characterized by comprising the following steps:
[1] Image preprocessing
Photometric correction is first applied to each input frame to eliminate the influence of illumination changes on the image, after which first feature extraction, semantic segmentation and deep neural network feature extraction are performed on the input image;
the photometric correction of each input frame is specifically:
the imaging model of each input frame is
I_i(x) = G(t_i V(x) B_i(x))    (1)
where G: R → [0,255] is the nonlinear response function, V: Ω → [0,1] is the lens attenuation, B_i and I_i are respectively the irradiance and the observed intensity of image frame i, and t_i is the exposure time;
the photometric correction of each input frame is
I'_i(x) = t_i B_i(x) = G⁻¹(I_i(x)) / V(x)    (2)
the correction result obtained by applying equation (2) to the model of equation (1) is used for photometric parameter estimation during the local map optimization of step 2;
[2] Visual odometry
Inter-frame image matching and local map optimization are performed according to the first features of the image and the semantic segmentation result, thereby updating the map and the state;
the inter-frame image matching specifically comprises: judging through frame tracking whether the current frame is a keyframe; if it is a keyframe, updating map points and semantics so as to complete the local map optimization and update the map; if it is not a keyframe, performing triangle matching and updating the semantic and depth information to update the state;
the specific steps of judging whether the current frame is a keyframe and, for a keyframe, updating map points and semantics so as to complete the local map optimization are:
under the geometric, photometric and semantic constraints between two image frames, establishing a reprojection error constraint term within a sliding window, a pixel photometric error constraint term within a neighborhood, and a semantic error constraint term;
wherein (a) reprojection error
the reprojection error is the positional deviation in the image between an observed point (x_j, y_j)^T and the back-projection of its corresponding map point, and is defined by equation (3) (given as an image in the original),
where T_i is the pose of image frame i in the world coordinate system, T_j is the pose of image frame j in the world coordinate system, K is the camera intrinsic calibration matrix, π denotes the conversion from homogeneous to Cartesian coordinates, and d is the estimated depth of the key point;
(b) pixel photometric error in the neighborhood
for any point p ∈ Ω_i on image frame i, the photometric error on another frame j is defined as
E_photo = ∑_{p∈N_p} ω_p ‖ (I_j[p'] − b_j) − (t_j e^{a_j})/(t_i e^{a_i}) · (I_i[p] − b_i) ‖_γ    (4)
where N_p is the set of pixels in the neighborhood around point p, I_i and I_j are the two adjacent image frames, t_i and t_j are their exposure times, a_i, b_i, a_j and b_j are the affine illumination correction coefficients, ‖·‖_γ denotes the Huber norm, ω_p is a gradient-dependent weight, and p' is the projection of p onto image frame I_j;
(c) semantic error
the semantic error measures, for a map point labelled with semantic class c, the probability that the corresponding pixel in the image also carries the semantic label c; when a semantic image S_k is observed with camera pose T_k, this probability is inversely related to the distance between the two-dimensional pixel coordinates π(T_k, p_i), obtained by re-projecting the associated three-dimensional point p_i into S_k, and the nearest image region labelled c; the semantic error of the map point is defined by equations (5)-(7) (given as images in the original), in which DT denotes a distance-transform function on the binary image of class c, σ is the uncertainty of the semantic image segmentation, and the probability vector of the map point p_i is computed from all of its observations: if p_i is observed by camera T_i and F is the set of keyframes to be optimized, the vector accumulates the observation likelihoods over F;
defining the sum of the reprojection error constraint term of the first feature points within the sliding window, the pixel photometric error constraint term within the neighborhood, and the semantic error constraint term as the visual error sum function gives
E_vis = η_R · E_reproj + η_P · E_photo + η_S · E_sem    (8)
where η_R, η_P and η_S are the weighting coefficients of the reprojection error constraint term, the neighborhood pixel photometric error constraint term and the semantic error constraint term respectively, determined by analyzing the image semantic information of the current scene;
visual and inertial measurements are combined for integrated navigation: assuming that the relative pose of the camera and the inertial measurement unit has been accurately calibrated, local map optimization is achieved by simultaneously solving, within the sliding window, for the minimum of the visual error sum function and the inertial error sum function; estimating the camera motion from the information of several keyframes in the sliding window is equivalent to finding the solution at which equation (8) attains its minimum, so the estimation of the camera motion state becomes a nonlinear optimization problem, as in equation (9):
X* = argmin_X E_vis(X)    (9)
when the Gauss-Newton iteration is used to solve this problem, the normal equations
J^T W J · Δx = −J^T W r
must be computed, where W is the weight matrix, r is the residual vector, and J is the Jacobian matrix;
[3] Loop detection
The deep neural network features extracted from the input image are first reduced in dimensionality to simplify them and are recorded as the current frame; historical keyframes are then searched in the global map and the similarity between the current frame and the historical frames is computed; if a loop exists, a similarity transformation is applied to the current frame and the corrected keyframe pose is output; if no loop exists, processing continues with the deep neural network features extracted from new input images;
[4] Global optimization and map creation
Local optimization is performed according to the locally updated map and state obtained in step 2 and the corrected keyframe pose obtained in step 3; all map points are optimized over a tree or graph through a filtering theory or an optimization theory, finally yielding the optimal global pose estimate and map.
2. The visual navigation method for the micro unmanned aerial vehicle in the high dynamic scene as claimed in claim 1, wherein:
the first feature extraction in step 1 is performed by using FAST, conner, ORB, SURF, or SIFT algorithm.
3. The visual navigation method for the micro unmanned aerial vehicle in the high dynamic scene as claimed in claim 2, wherein:
the dimensionality-reduction processing of the extracted deep neural network features in the loop detection of step 3, which simplifies the features, is specifically:
on the premise that the high-dimensional data and the dimensionality-reduced low-dimensional data share a similar or identical probability distribution, distances measured in the high-dimensional space remain similar after the data are mapped into the low-dimensional space;
using an energy-based model, the probability distributions of the samples in the high-dimensional and the low-dimensional space can be defined as
p_{j|i} = exp(−‖x_i − x_j‖²) / ∑_{k≠i} exp(−‖x_i − x_k‖²),  q_{j|i} = exp(−‖y_i − y_j‖²) / ∑_{k≠i} exp(−‖y_i − y_k‖²)    (10)
where y is the vector to which x is mapped in the low-dimensional space;
to keep the distributions of the feature vectors consistent between the high-dimensional and the low-dimensional space, the distance between the two distributions is measured with the KL divergence:
H = ∑_i KL(P_i ‖ Q_i) = ∑_i ∑_j p_{j|i} (log p_{j|i} − log q_{j|i})    (11)
where P_i and Q_i are the probability distributions of sample i in the high-dimensional and the low-dimensional space respectively; by minimizing the KL divergence measure, a corresponding dimension-reduction mapping is learned, i.e. the features are simplified without losing data information.
4. The visual navigation method for the micro unmanned aerial vehicle in the high dynamic scene as claimed in claim 3, wherein:
the calculating the similarity between the current frame and the historical frame specifically comprises:
the current frame set is recorded as P ═ { P1, P2, …, pn };
the historical frame set is recorded as F ═ { F1, F2, …, fn };
respectively recording the feature vectors of the key frame 1 and the key frame 2 after dimensionality reduction as
Figure FDA0003568487030000052
Wherein m is the characteristic dimension after the dimension reduction operation;
calculating each characteristic value after dimension reduction through a Gaussian function to obtain
Figure FDA0003568487030000053
Then, the similarity score between two key frames is the average of the sum of all features
(formula image FDA0003568487030000054 in the original, giving the similarity score as the mean over all m features)
where m is the feature dimension after dimensionality reduction;
judging whether the similarity score between the two key frames reaches a similarity threshold: if the score exceeds the threshold, a loop is detected, a similarity transformation is applied to the current key frame image, and the corrected key frame pose is output; if the score is below the threshold, no loop is detected.
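As an illustrative sketch only, the similarity scoring and threshold test described in this claim might look as follows in numpy; the Gaussian bandwidth sigma and the threshold value are assumptions, since the text leaves them unspecified.

```python
import numpy as np

def keyframe_similarity(v1, v2, sigma=1.0):
    """Per-feature Gaussian similarity, averaged over the m dimension-reduced features."""
    g = np.exp(-np.square(np.asarray(v1) - np.asarray(v2)) / (2.0 * sigma ** 2))
    return float(g.mean())

def detect_loop(current, history, threshold=0.85):
    """Return (index, score) of the best-matching historical key frame, or None if below threshold."""
    if len(history) == 0:
        return None
    scores = [keyframe_similarity(current, h) for h in history]
    best = int(np.argmax(scores))
    return (best, scores[best]) if scores[best] > threshold else None
```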
5. The visual navigation method for the micro unmanned aerial vehicle in the high dynamic scene as claimed in claim 2, wherein:
the optimization of the tree or the map over all map points through filtering theory or optimization theory in step 4 specifically comprises:
the pose optimization method based on nonlinear optimization realizes local pose optimization and global pose optimization simultaneously; that is, when the monocular camera inputs images, the objective function is determined by the state variables and the observation equation:
(formula image FDA0003568487030000061 in the original, giving the objective function over the consistent set C)
where C represents the optimal consistent set of samples; x_k represents the pose of the unmanned aerial vehicle at the k-th moment and is called a pose node in the graph optimization; y_i represents a landmark observable by the robot vision sensor at the k-th moment and is called a point node in the graph optimization; z_{k,j} represents the constraint between a pose node and a point node; when the optimization step is loop detection, z_{k,j} represents the constraint between two pose nodes; e(x_k, y_i, z_{k,j}) is a vector error function reflecting the degree to which the two nodes satisfy the constraint, represented in the graph as the edge connecting them: the smaller its value, the better the two nodes satisfy the matching constraint, and a value of 0 indicates that the constraint is fully satisfied;
assuming that the error function follows a Gaussian distribution, the Bayesian formulation is: solve for the (x, y) that makes the system most likely to produce the current observation z, i.e. the maximum likelihood estimate is as follows:
(formula image FDA0003568487030000062 in the original, giving the maximum likelihood estimate)
minimizing e(x_k, y_i, z_{k,j}) is therefore essentially solving for the maximum likelihood.
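For illustration, under the Gaussian assumption the maximum-likelihood view of the error terms e(x_k, y_i, z_{k,j}) reduces to a nonlinear least-squares problem. The toy sketch below shows the residual-stacking formulation on a 2-D position-and-landmark example using scipy's least_squares; the scenario, variable names, and noise level are all illustrative, not taken from the patent.

```python
import numpy as np
from scipy.optimize import least_squares

# Toy scene: two unknown 2-D camera positions x_k and two known landmarks y_j.
landmarks = np.array([[2.0, 0.0], [0.0, 2.0]])
true_poses = np.array([[0.0, 0.0], [1.0, 0.0]])
rng = np.random.default_rng(0)
# z_{k,j}: landmark position observed relative to pose k, with a little noise.
measurements = [(k, j, landmarks[j] - true_poses[k] + 0.01 * rng.standard_normal(2))
                for k in range(2) for j in range(2)]

def residuals(flat_poses):
    """Stack e(x_k, y_j, z_kj) = (y_j - x_k) - z_kj over all constraints."""
    poses = flat_poses.reshape(-1, 2)
    return np.concatenate([(landmarks[j] - poses[k]) - z for k, j, z in measurements])

result = least_squares(residuals, x0=np.zeros(4))   # Gauss-Newton style minimization
estimated_poses = result.x.reshape(-1, 2)           # close to true_poses
```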
6. The visual navigation method for the micro unmanned aerial vehicle in the high dynamic scene as claimed in any one of claims 1 to 5, wherein:
the semantic segmentation of targets in the image is performed using the DFANet or Mask R-CNN segmentation algorithm, and the deep neural network feature extraction is performed using a convolutional neural network.
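As an illustrative sketch (not the patent's implementation), segmentation of potentially dynamic targets with Mask R-CNN could use torchvision's pretrained detector, assuming a recent torchvision version; the score threshold is an assumption.

```python
import torch
from torchvision.models.detection import maskrcnn_resnet50_fpn

# Pretrained Mask R-CNN instance segmentation model (COCO classes).
model = maskrcnn_resnet50_fpn(weights="DEFAULT").eval()

@torch.no_grad()
def target_masks(image_chw, score_thresh=0.5):
    """Binary masks of detected objects in a float CHW tensor scaled to [0, 1]."""
    output = model([image_chw])[0]                 # dict with boxes, labels, scores, masks
    keep = output["scores"] > score_thresh
    return output["masks"][keep, 0] > 0.5          # (N, H, W) boolean masks
```

Such masks would typically be used to exclude image features that fall on the segmented targets before pose estimation.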
CN202010381182.7A 2020-05-08 2020-05-08 Visual navigation method for micro unmanned aerial vehicle in high-dynamic scene Active CN111693047B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010381182.7A CN111693047B (en) 2020-05-08 2020-05-08 Visual navigation method for micro unmanned aerial vehicle in high-dynamic scene

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010381182.7A CN111693047B (en) 2020-05-08 2020-05-08 Visual navigation method for micro unmanned aerial vehicle in high-dynamic scene

Publications (2)

Publication Number Publication Date
CN111693047A CN111693047A (en) 2020-09-22
CN111693047B true CN111693047B (en) 2022-07-05

Family

ID=72477337

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010381182.7A Active CN111693047B (en) 2020-05-08 2020-05-08 Visual navigation method for micro unmanned aerial vehicle in high-dynamic scene

Country Status (1)

Country Link
CN (1) CN111693047B (en)

Families Citing this family (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112461228B (en) * 2020-11-03 2023-05-09 南昌航空大学 IMU and vision-based secondary loop detection positioning method in similar environment
CN112396596A (en) * 2020-11-27 2021-02-23 广东电网有限责任公司肇庆供电局 Closed loop detection method based on semantic segmentation and image feature description
CN112697158A (en) * 2020-12-03 2021-04-23 南京工业大学 Man-made loop-back instant positioning and picture building method and system for indoor and outdoor scenes
CN114723779A (en) * 2021-01-06 2022-07-08 广州汽车集团股份有限公司 Vehicle positioning method and device and computer readable storage medium
CN112767481B (en) * 2021-01-21 2022-08-16 山东大学 High-precision positioning and mapping method based on visual edge features
CN113052297B (en) * 2021-03-04 2022-11-22 吉林大学 Towing cable attitude calculation method and system based on convolution neural network fusion EKF
CN113108771B (en) * 2021-03-05 2022-08-16 华南理工大学 Movement pose estimation method based on closed-loop direct sparse visual odometer
CN113156942B (en) * 2021-03-26 2023-07-21 北京师范大学 Wide area environment coding method and system based on spatial memory neural mechanism
CN113239936B (en) * 2021-04-26 2024-05-28 大连理工大学 Unmanned aerial vehicle visual navigation method based on deep learning and feature point extraction
CN113188557B (en) * 2021-04-28 2023-10-20 江苏方天电力技术有限公司 Visual inertial integrated navigation method integrating semantic features
CN113587933B (en) * 2021-07-29 2024-02-02 山东山速机器人科技有限公司 Indoor mobile robot positioning method based on branch-and-bound algorithm
CN113673462B (en) * 2021-08-27 2023-12-19 西安电子科技大学广州研究院 Logistics AGV positioning method based on lane lines
CN114199250A (en) * 2021-12-03 2022-03-18 清华大学 Scene matching navigation method and device based on convolutional neural network
CN114241050B (en) * 2021-12-20 2024-05-07 东南大学 Camera pose optimization method based on Manhattan world assumption and factor graph
CN114460943B (en) * 2022-02-10 2023-07-28 山东大学 Self-adaptive target navigation method and system for service robot
CN116823949B (en) * 2023-06-13 2023-12-01 武汉天进科技有限公司 Miniaturized unmanned aerial vehicle airborne real-time image processing device
CN117824664B (en) * 2024-03-05 2024-05-28 河海大学 Active SLAM method of autonomous unmanned system based on multi-beam sounding sonar

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109307508A (en) * 2018-08-29 2019-02-05 中国科学院合肥物质科学研究院 A kind of panorama inertial navigation SLAM method based on more key frames
CN109816686A (en) * 2019-01-15 2019-05-28 山东大学 Robot semanteme SLAM method, processor and robot based on object example match
CN110060277A (en) * 2019-04-30 2019-07-26 哈尔滨理工大学 A kind of vision SLAM method of multiple features fusion
CN110853085A (en) * 2018-08-21 2020-02-28 深圳地平线机器人科技有限公司 Semantic SLAM-based mapping method and device and electronic equipment

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190080166A1 (en) * 2017-09-13 2019-03-14 TuSimple Data acquistion and input of neural network method for deep odometry assisted by static scene optical flow
CN110345944A (en) * 2019-05-27 2019-10-18 浙江工业大学 Merge the robot localization method of visual signature and IMU information
CN110827415B (en) * 2019-11-11 2022-08-23 吉林大学 All-weather unknown environment unmanned autonomous working platform

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110853085A (en) * 2018-08-21 2020-02-28 深圳地平线机器人科技有限公司 Semantic SLAM-based mapping method and device and electronic equipment
CN109307508A (en) * 2018-08-29 2019-02-05 中国科学院合肥物质科学研究院 A kind of panorama inertial navigation SLAM method based on more key frames
CN109816686A (en) * 2019-01-15 2019-05-28 山东大学 Robot semanteme SLAM method, processor and robot based on object example match
CN110060277A (en) * 2019-04-30 2019-07-26 哈尔滨理工大学 A kind of vision SLAM method of multiple features fusion

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Research on Visual SLAM Method Based on Semantic Segmentation in Dynamic Environments; Zhang Jianbo; China Masters' Theses Full-text Database (Information Science and Technology); 2020-02-15 (No. 2); main text chapters 2-5, figures 2-1 to 5-12 *
Loop Closure Detection Algorithm Based on Convolutional Neural Network; Luo Shunxin, Zhang Sunjie; Computer & Digital Engineering; 2019-05-31; Vol. 47 (No. 5); main text sections 1-5, figures 1-14 *
A Survey of Monocular V-SLAM for Multi-rotor Unmanned Aerial Vehicles; Zhao Liangyu, Zhu Yeqing, Jin Rui; Aero Weaponry; 2020-04-30; Vol. 27 (No. 2); main text sections 1-5, figures 1-4 *

Also Published As

Publication number Publication date
CN111693047A (en) 2020-09-22

Similar Documents

Publication Publication Date Title
CN111693047B (en) Visual navigation method for micro unmanned aerial vehicle in high-dynamic scene
Wang et al. F-loam: Fast lidar odometry and mapping
Huang et al. Visual odometry and mapping for autonomous flight using an RGB-D camera
Li et al. Deep sensor fusion between 2D laser scanner and IMU for mobile robot localization
Kümmerle et al. Large scale graph-based SLAM using aerial images as prior information
Konolige et al. FrameSLAM: From bundle adjustment to real-time visual mapping
CN102426019B (en) Unmanned aerial vehicle scene matching auxiliary navigation method and system
CN110717927A (en) Indoor robot motion estimation method based on deep learning and visual inertial fusion
He et al. Vision-based UAV flight control and obstacle avoidance
CN113776519B (en) AGV vehicle mapping and autonomous navigation obstacle avoidance method under lightless dynamic open environment
Li et al. Indoor multi-sensor fusion positioning based on federated filtering
Graf et al. Optimization-based terrain analysis and path planning in unstructured environments
He et al. Online semantic-assisted topological map building with LiDAR in large-scale outdoor environments: Toward robust place recognition
CN111739066B (en) Visual positioning method, system and storage medium based on Gaussian process
Wang et al. A submap joining based RGB-D SLAM algorithm using planes as features
CN117367427A (en) Multi-mode slam method applicable to vision-assisted laser fusion IMU in indoor environment
CN116563341A (en) Visual positioning and mapping method for processing dynamic object in complex environment
CN115950414A (en) Adaptive multi-fusion SLAM method for different sensor data
CN114202701A (en) Unmanned aerial vehicle vision repositioning method based on object semantics
CN113947636B (en) Laser SLAM positioning system and method based on deep learning
CN115962773A (en) Method, device and equipment for synchronous positioning and map construction of mobile robot
Li et al. Laser-based SLAM with efficient occupancy likelihood map learning for dynamic indoor scenes
Yoshisada et al. Indoor map generation from multiple LiDAR point clouds
CN115187614A (en) Real-time simultaneous positioning and mapping method based on STDC semantic segmentation network
Hernández-García et al. 3d city models: Mapping approach using lidar technology

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant