CN113808152A - Unmanned aerial vehicle autonomous navigation method based on ORB_SLAM2 - Google Patents
- Publication number
- CN113808152A (application CN202111075423.6A)
- Authority
- CN
- China
- Prior art keywords
- camera
- point
- points
- orb
- pixel
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS; G06—COMPUTING; CALCULATING OR COUNTING; G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/13—Edge detection
- G06T7/246, G06T7/248—Analysis of motion using feature-based methods, e.g. the tracking of corners or segments, involving reference images or patches
- G06T7/73, G06T7/74—Determining position or orientation of objects or cameras using feature-based methods involving reference images or patches
- G06T7/80—Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
- G06T2207/20164—Salient point detection; Corner detection
- G06T2207/30241—Trajectory
- G06T2207/30244—Camera pose
Abstract
The invention discloses an unmanned aerial vehicle autonomous navigation method based on ORB_SLAM2, belonging to the field of visual simultaneous localization and mapping. The method comprises a visual front end, visual back-end global optimization and mapping, and three-dimensional path planning. The visual front end comprises ORB feature point extraction and matching and a visual odometer that calculates the current pose of the unmanned aerial vehicle; this part realizes the localization function. The visual back-end global optimization and mapping comprise BA global map optimization and dense point cloud mapping, realizing the global optimization and mapping functions. The three-dimensional path planning enables the unmanned aerial vehicle to avoid obstacles indoors and outdoors, realizing the path planning function. The method uses improved FAST corners to extract ORB feature points and optimizes the feature points block by block, and performs pose estimation with an improved ICP-PnP algorithm.
Description
Technical Field
The invention relates to an unmanned aerial vehicle autonomous navigation method based on ORB_SLAM2, and belongs to the field of visual simultaneous localization and mapping (SLAM).
Background
An Unmanned Aerial Vehicle (UAV) is a mobile robot that carries an onboard controller, allowing an operator on the ground to control its flight. Simultaneous Localization and Mapping (SLAM) algorithms aim to localize a robot without GPS signals while measuring a map of the environment. UAV autonomous navigation based on an RGB-D camera divides into three questions: "Where am I?", "What environment am I in?", and "How do I reach the task destination?". These three questions correspond to localization, mapping, and path planning. SLAM solves the first two, so SLAM-based localization and mapping is widely regarded by researchers in the field as the key to realizing UAV autonomous navigation.
Patent CN108520554A discloses a binocular three-dimensional dense mapping method based on ORB_SLAM2 that proposes four ORB (Oriented FAST and Rotated BRIEF) SLAM mapping threads: a tracking thread, a local mapping thread, a loop-closure detection thread, and a dense mapping thread. ORB feature point extraction and matching take place in the tracking thread; if the ORB feature points appear in piles, the pose judged by the visual odometer deviates greatly, and without block-wise optimization of the ORB feature points a large localization error results.
In a conventional UAV SLAM system, the localization module does not consider, after extracting the ORB feature points, the problem that piled-up feature points affect the pose estimated by the visual odometer; the mapping module performs no global optimization after generating the dense point cloud image, so errors arise when transforming pixel coordinates; and the conventional RRT algorithm run by the path planning module may leave the UAV unable to avoid obstacles at corners, or make it detour too widely around them.
Disclosure of Invention
Aiming at the problems in the prior art, the invention provides an unmanned aerial vehicle autonomous navigation method based on ORB_SLAM2, making the navigation process of the unmanned aerial vehicle smoother and related tasks better completed.
The invention adopts the following technical scheme for solving the technical problems:
An unmanned aerial vehicle autonomous navigation method based on ORB_SLAM2 comprises the following steps:
(1) ORB feature point extraction
The ORB feature point extraction consists of two parts: FAST corners and BRIEF descriptors;
When detecting corners, for each candidate pixel the gray levels of the pixels at positions 1, 5, 9, and 13 on the surrounding circle are checked directly; only if at least three of these four pixels pass the brightness test can the current pixel be a corner;
the rotational invariance of the ORB features is realized by the gray centroid method, as follows: select a small image block B centered on the corner point; the moments of the image block are defined as:
m_pq = Σ_{(x,y)∈B} x^p y^q I(x,y),  p, q ∈ {0, 1}
wherein: m_pq represents the moment of the image block, x^p is the p-th power of the x coordinate, y^q the q-th power of the y coordinate, and I(x, y) is the gray value at pixel (x, y);
the centroid of the image block is found from the moments:
C = (m_10 / m_00, m_01 / m_00)
wherein: C denotes the centroid of the image block, m_00 the zeroth-order moment, and m_01, m_10 the two first-order moments of the image block;
assuming the geometric center of image block B is O, connect O and C; the direction of the key point is then the vector OC, expressed as an angle:
θ = arctan(m_01 / m_10)
after the ORB feature key points are extracted, the other part of the feature, namely the descriptor of each key point, is calculated;
after graying the image, it is divided into 64x64-pixel local blocks; key point detection is performed in each block, Harris response values are computed for the candidate key points that pass detection, and finally the two key points with the largest response values in each block are kept as the final key point set;
(2) ORB brute-force matching
After the descriptors of the feature points are extracted, they are matched by brute force;
After feature matching is completed, using only the depth value of each feature point and one inverse transformation of the camera pinhole imaging model, the three-dimensional coordinates of the feature point in the camera frame are obtained as follows:
x = (u - c_x) d / f_x,  y = (v - c_y) d / f_y,  z = d
wherein d represents the depth value of the feature point, (u, v) the pixel coordinates, c_x and c_y the offsets between the origins of the two coordinate systems, and f_x, f_y the focal lengths in pixels;
let a point P in space have coordinates [x, y, z]^T, its image P' = [x', y', z']^T, and f the camera focal length; by the similar-triangle relation of pinhole imaging:
z / f = x / x' = y / y'
rearranging slightly gives:
x' = f x / z,  y' = f y / z
this is only the transformation from a space point to the imaging plane; the camera outputs discrete pixels, so one more coordinate transformation yields the pixel coordinates of P:
u = α x' + c_x,  v = β y' + c_y
where α and β are the scaling factors of the two pixel axes and c_x, c_y the offsets of the origin of the pixel coordinate system; merging α f and β f into f_x and f_y, the whole imaging process becomes:
u = f_x x / z + c_x,  v = f_y y / z + c_y
we have thus obtained the camera pinhole imaging model; writing the pixel coordinates in homogeneous form, the whole projection becomes the matrix equation:
z [u, v, 1]^T = K [x, y, z]^T,  K = [[f_x, 0, c_x], [0, f_y, c_y], [0, 0, 1]]
in the formula, z is the depth of the pixel;
solving the PnP problem with the direct linear transformation method: let the space point have homogeneous coordinates P = (x, y, z, 1)^T and its corresponding homogeneous pixel coordinates p = (u, v, 1)^T; to solve the camera pose R and t, write the rotation matrix and translation vector together as the 3x4 augmented matrix [R | t];
expanding with the imaging projection model of the camera yields the relations:
u = (t_1^T P) / (t_3^T P),  v = (t_2^T P) / (t_3^T P)
where t_1^T, t_2^T, t_3^T denote the row vectors of the augmented matrix [R | t]; thus each 3D-to-2D pair of matched feature points provides two linear constraints on the augmented matrix, which simplify to:
t_1^T P - u t_3^T P = 0,  t_2^T P - v t_3^T P = 0
assuming a total of N matching point pairs, the linear system is:
A [t_1; t_2; t_3] = 0,  A ∈ R^(2N x 12)
the matrix [R | t] has 12 unknowns in total and each feature match provides two linear constraints, so the pose can be solved once there are at least 6 matching point pairs;
(3) BA-based global map optimization and three-dimensional dense point cloud map construction
The visual SLAM back-end optimization and mapping comprise BA-based global map optimization and three-dimensional dense point cloud map construction;
the SLAM process is a process of continuously observing the environment, where the observation is made at a specific time, and the motion of the drone is converted into the motion of the camera at a discrete time t 1,2 within a continuous time..., k-1, k, and the poses of the cameras at the corresponding times are marked as x1, x2, and xk, and the camera poses at the times are connected to form the track of the cameras in the time; recording the environmental characteristic points observed in the period of time, namely the landmarks to finish the construction of the sparse map, assuming that N observed landmarks exist, and each landmark is marked as m1,...,mN(ii) a Considering the motion of the camera from time k-1 to time k, i.e. the change of pose of the drone, this function is abstracted as f:
x_k = f(x_{k-1}, u_k, ω_k)
in the above formula, x_{k-1} and x_k are the camera poses at times k-1 and k, u_k is the reading at time k of a motion measurement sensor carried by the unmanned aerial vehicle, such as an encoder or an IMU, and ω_k is the noise present in the motion measurement; the above formula, describing the state change of the unmanned aerial vehicle, is its motion equation;
the process of the camera observation environment is abstracted into a function h:
z_{k,j} = h(m_j, x_k, v_{k,j})
in the above formula, m_j denotes the j-th landmark, v_{k,j} the noise generated in the observation process, and z_{k,j} the observation of landmark m_j made by the camera at pose x_k; this formula is the observation equation of SLAM;
ignoring the noise first, the observation equation is written as:
z=h(x,y)
wherein x refers to the current pose of the camera, represented by R and t, with corresponding Lie algebra form ε; denoting the position of landmark y as p, the observed error is written as:
e = z - h(ε, p)
accumulating all error terms gives the global BA optimization objective function:
min (1/2) Σ_i Σ_j || e_ij ||^2 = (1/2) Σ_i Σ_j || z_ij - h(ε_i, p_j) ||^2
define the camera poses and all landmarks as the argument:
x = [ε_1, ε_2, ..., ε_m, p_1, p_2, ..., p_n]^T
adding a small increment Δx to the argument and performing a Taylor expansion, the objective function is written as:
(1/2) || e(x + Δx) ||^2 ≈ (1/2) Σ_i Σ_j || e_ij + F_ij Δε_i + E_ij Δp_j ||^2
where F_ij represents the partial derivative of the objective-function error with respect to the camera-pose part of the argument, and E_ij the partial derivative with respect to the feature point position;
then the incremental equation of the overall objective function is assembled from F and E, giving the Jacobian matrix:
J = [F E]
the H matrix of the incremental equation is then:
H = J^T J = [[F^T F, F^T E], [E^T F, E^T E]]
thus the incremental-equation representation of the global BA optimization is obtained;
a three-dimensional dense point cloud map is constructed by building four threads on the ORB_SLAM2 framework to form a SLAM system that constructs a dense map; the first thread implements the visual odometer, estimating an accurate pose and performing feature point extraction and brute-force feature matching for each frame of image shot by the camera; the second thread completes local map construction, mainly used for localization; the third thread accelerates feature matching between the map and the current frame; finally, the fourth thread completes the construction of the three-dimensional dense point cloud map;
(4) improved RRT algorithm in three-dimensional environment
During initial sampling, the range of sampling points is reduced to lower the memory requirement, and connected-domain sampling and target-biased bounded sampling are added; after an initial path is found, elliptical sampling is used to continuously and rapidly optimize the path.
The invention has the following beneficial effects:
1. The improved FAST (Features from Accelerated Segment Test) corner extraction method of the invention extracts corners with gray change more accurately. Because feature points tend to pile up during extraction, the image is segmented before the feature points are extracted, which disperses them and facilitates feature matching.
2. The invention uses the improved ICP-PnP (Iterative Closest Point and Perspective-n-Point) algorithm to estimate the camera pose, solving the problem that no effective pose estimate can be given when feature points are scarce, and making the camera pose under the 3D-to-2D feature matching relationship more accurate.
3. The improved UAV path planning algorithm R-RRT of the invention solves the problem that the complete randomness of the conventional RRT produces too many exploration directions, slowing convergence and degrading the quality of the planned path; the random tree gains directionality and the convergence speed is greatly improved.
Drawings
Fig. 1 is a general principle block diagram of a visual camera.
Fig. 2 is a flow chart of an algorithm for determining pose by a visual odometer.
FIG. 3 is a four thread flow diagram of the creation of a graph.
FIG. 4 is a flow chart of the unmanned aerial vehicle path planning R-RRT* algorithm.
Fig. 5(a) is a FAST corner extraction effect diagram; fig. 5(b) is an ORB descriptor brute-force matching effect diagram; fig. 5(c) is a diagram of the pinhole imaging model underlying the visual odometer.
FIG. 6 is a three-dimensional simulated dense map.
FIG. 7(a) is a schematic diagram of the random tree RRT*; FIG. 7(b) is a simulation diagram of the improved R-RRT* algorithm.
Detailed Description
The invention is described in further detail below with reference to the accompanying drawings.
An autonomous unmanned aerial vehicle navigation method based on ORB_SLAM2 includes, as shown in FIG. 1, the visual SLAM front-end design, the visual SLAM back-end global optimization and mapping, and the design of the unmanned aerial vehicle path planning algorithm.
(1) Visual SLAM front end design
The visual SLAM front-end design comprises ORB feature point extraction, brute-force matching, and pose estimation based on the ICP algorithm; the overall front-end design flow is shown in FIG. 2. The ORB feature point extraction consists of two parts: FAST corners and BRIEF descriptors.
A FAST corner is a relatively special pixel: local pixel blocks with obvious gray change represent the features of the image. The idea of FAST corner detection is to compare the gray level of a pixel with those of its neighboring pixels; if there is an obvious difference (much brighter or much darker), the pixel is likely to be a corner. The invention improves this test: when detecting corners, for each candidate pixel the gray levels of the pixels at positions 1, 5, 9, and 13 on the surrounding circle are checked directly, and only if at least three of these four pixels pass the brightness test can the current pixel be a corner. The specific effect of FAST corner extraction is shown in fig. 5(a).
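As a minimal sketch of the improved pre-test above (the threshold t, the circle radius of 3 pixels, and the function name are illustrative assumptions, not the patent's exact implementation), the four compass pixels can be checked as follows:

```python
import numpy as np

def fast_pretest(img, x, y, t=20):
    """FAST pre-test: compare the candidate pixel at (x, y) against the
    four compass points 1, 5, 9, 13 on a radius-3 circle around it.
    The pixel can only be a corner if at least three of the four are
    all brighter than I + t or all darker than I - t."""
    center = int(img[y, x])
    # circle points 1, 5, 9, 13: top, right, bottom, left of the pixel
    pts = [img[y - 3, x], img[y, x + 3], img[y + 3, x], img[y, x - 3]]
    brighter = sum(int(p) > center + t for p in pts)
    darker = sum(int(p) < center - t for p in pts)
    return brighter >= 3 or darker >= 3
```

Candidates that fail this cheap test are discarded before any fuller segment test, which is where the speed-up of the improved detector comes from.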
The rotational invariance of the ORB features can be achieved by the gray centroid method, as follows: select a small image block B centered on the corner point; the moments of the image block can be defined as:
m_pq = Σ_{(x,y)∈B} x^p y^q I(x,y),  p, q ∈ {0, 1}
wherein: m_pq represents the moment of the image block, x^p is the p-th power of the x coordinate, y^q the q-th power of the y coordinate, and I(x, y) is the gray value at pixel (x, y);
the centroid of the image block can be found from the moments:
C = (m_10 / m_00, m_01 / m_00)
wherein: C denotes the centroid of the image block, m_00 the zeroth-order moment, and m_01, m_10 the two first-order moments;
assuming the geometric center of image block B is O, connect O and C; the direction of the key point is then the vector OC, which can be expressed using an angle:
θ = arctan(m_01 / m_10)
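The moment and centroid computation above can be sketched as follows (numpy only; the patch is assumed square and centered on the key point, and atan2 is used so the quadrant of the direction is preserved):

```python
import numpy as np

def orientation(patch):
    """Gray-centroid direction of an image patch: compute the first-order
    moments m10 = sum(x * I) and m01 = sum(y * I) in coordinates relative
    to the geometric centre O, then return the angle of the vector OC."""
    h, w = patch.shape
    ys, xs = np.mgrid[0:h, 0:w]
    xs = xs - (w - 1) / 2.0   # x relative to the geometric centre O
    ys = ys - (h - 1) / 2.0   # y relative to the geometric centre O
    m10 = float(np.sum(xs * patch))
    m01 = float(np.sum(ys * patch))
    return np.arctan2(m01, m10)  # atan2 keeps the correct quadrant
```

A patch whose mass sits on its right edge yields θ = 0, and one whose mass sits on its bottom edge yields θ = π/2.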
after the ORB key points are added with scale invariance and rotation invariance, the robustness of the representation of the ORB key points among different images is greatly improved. After the ORB feature key point extraction is completed, another part of the content of the feature point, that is, the descriptor of the key point, needs to be calculated. The descriptor is the key of feature matching, can ensure the rotation invariance of feature points, and the calculation method of the descriptor comprises the following steps: determining the key point as an origin, establishing a coordinate axis according to the direction of the key point, randomly selecting two pixels p and q around the key point, comparing the gray values of the two pixels, if p is greater than q, the value of the current bit is 1, otherwise, 0 is selected. Repeating the operation 256 times can calculate a descriptor of the ORB feature point.
Key points of ORB features tend to appear where pixel values change sharply and thus to pile up, because the neighborhood of a pixel with obvious gray change more easily satisfies the FAST corner test; the extracted ORB feature points are therefore particularly concentrated, which increases the camera pose error. To solve the feature point piling problem, the invention partitions the image into regions, with the following specific operation: after graying the image, it is divided into 64x64-pixel local blocks; key point detection is performed in each block, Harris response values are computed for the candidate key points that pass detection, and finally the two key points with the largest response values in each block are kept as the final key point set.
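A sketch of the block-wise selection; a simple gradient-product response stands in here for the full Harris score (an assumption made for brevity), while the 64x64 blocks and the keep-two rule follow the text:

```python
import numpy as np

def grid_keypoints(img, block=64, per_block=2):
    """Split the image into block x block cells and keep, per cell, the
    per_block pixels with the largest corner response, so key points
    are spread over the image instead of piling up."""
    gy, gx = np.gradient(img.astype(np.float64))
    response = (gx ** 2) * (gy ** 2)  # crude stand-in for the Harris R
    kps = []
    h, w = img.shape
    for by in range(0, h, block):
        for bx in range(0, w, block):
            cell = response[by:by + block, bx:bx + block]
            # indices of the strongest per_block responses in this cell
            top = np.argsort(cell.ravel())[::-1][:per_block]
            for idx in top:
                cy, cx = np.unravel_index(int(idx), cell.shape)
                kps.append((bx + cx, by + cy))
    return kps
```

With a 128x128 image this yields 4 cells and therefore 8 key points in total.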
Brute-force feature point matching, which extracts descriptors and then matches them, is useful when there are few feature points. The idea is as follows: given two sets of descriptors P = {p_1, ..., p_M} and Q = {q_1, ..., q_N}, for each descriptor in P find the descriptor in Q at minimum distance, which yields a candidate match. Because this alone does not eliminate feature points that fail the detection requirement, a distance threshold d_max is usually set: the distance between feature points considered matched should not exceed d_max. The effect of ORB feature point brute-force matching is shown in fig. 5(b).
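The brute-force matcher with the d_max cut-off can be sketched as follows (binary descriptors represented as 0/1 arrays; the default threshold value is an arbitrary example):

```python
import numpy as np

def brute_force_match(desc1, desc2, d_max=64):
    """For every descriptor in desc1 find the minimum-Hamming-distance
    descriptor in desc2; keep the pair only if the distance <= d_max."""
    matches = []
    for i, d in enumerate(desc1):
        dists = np.count_nonzero(desc2 != d, axis=1)  # Hamming distances
        j = int(np.argmin(dists))
        if dists[j] <= d_max:
            matches.append((i, j, int(dists[j])))
    return matches
```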
Pose estimation based on the ICP-PnP algorithm: because of the complexity of the environment, the depth map captured by the RGB-D camera contains noise, which can leave too few inliers among the feature matches. The invention therefore fuses the PnP algorithm into the front end to improve the estimated pose; the fused and improved pose calculation method is named the ICP-PnP algorithm.
After feature matching is completed, using only the depth value of each feature point and one inverse transformation of the camera pinhole imaging model, the three-dimensional coordinates of the feature point in the camera frame can be obtained as follows:
x = (u - c_x) d / f_x,  y = (v - c_y) d / f_y,  z = d
wherein d represents the depth value of the feature point, (u, v) the pixel coordinates, c_x and c_y the offsets between the origins of the two coordinate systems, and f_x, f_y the focal lengths in pixels.
Let a point P in space have coordinates [x, y, z]^T, its image P' = [x', y', z']^T, and f the camera focal length. By the similar-triangle relation of pinhole imaging:
z / f = x / x' = y / y'
Rearranging slightly gives:
x' = f x / z,  y' = f y / z
This is only the transformation from a space point to the imaging plane; the camera outputs discrete pixels, so one more coordinate transformation yields the pixel coordinates of P:
u = α x' + c_x,  v = β y' + c_y
where α and β are the scaling factors of the two pixel axes and c_x, c_y the offsets of the origin of the pixel coordinate system. Merging α f and β f into f_x and f_y, the whole imaging process can be written as:
u = f_x x / z + c_x,  v = f_y y / z + c_y
We have thus obtained the camera pinhole imaging model; writing the pixel coordinates in homogeneous form, the whole projection becomes the matrix equation:
z [u, v, 1]^T = K [x, y, z]^T,  K = [[f_x, 0, c_x], [0, f_y, c_y], [0, 0, 1]]
in the formula, z is the depth of the pixel.
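The projection model above and the inverse transform used earlier to lift matched features into the camera frame can be sketched together (the intrinsic values in K are arbitrary example numbers, not calibration results from the patent):

```python
import numpy as np

# example intrinsics: fx, fy on the diagonal, principal point (cx, cy)
K = np.array([[520.9, 0.0, 325.1],
              [0.0, 521.0, 249.7],
              [0.0, 0.0, 1.0]])

def project(P, K):
    """Pinhole projection z * [u, v, 1]^T = K * [x, y, z]^T."""
    uvw = K @ np.asarray(P, dtype=float)
    return uvw[:2] / uvw[2]  # divide by the depth to get pixel (u, v)

def back_project(u, v, d, K):
    """Inverse transform: pixel (u, v) plus the RGB-D depth d back to
    the 3-D point [x, y, z]^T in the camera frame."""
    fx, fy, cx, cy = K[0, 0], K[1, 1], K[0, 2], K[1, 2]
    return np.array([(u - cx) * d / fx, (v - cy) * d / fy, d])
```

Projecting a point and back-projecting the resulting pixel with the same depth recovers the original 3-D coordinates, which is exactly the round trip the front end relies on.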
The present invention solves the PnP problem using the Direct Linear Transformation (DLT) method: let the space point have homogeneous coordinates P = (x, y, z, 1)^T and its corresponding homogeneous pixel coordinates p = (u, v, 1)^T. In order to solve the camera poses R and t, the rotation matrix and translation vector are written together as the 3x4 augmented matrix [R | t].
Using the imaging projection model of the camera, as shown in fig. 5(c), the augmented matrix can be expanded to obtain the relations:
u = (t_1^T P) / (t_3^T P),  v = (t_2^T P) / (t_3^T P)
where t_1^T, t_2^T, t_3^T denote the row vectors of the augmented matrix [R | t]. Thus each pair of 3D-to-2D matched feature points provides two linear constraints on the augmented matrix, which can be simplified to:
t_1^T P - u t_3^T P = 0,  t_2^T P - v t_3^T P = 0
Assuming a total of N matching point pairs, the linear system can be set up:
A [t_1; t_2; t_3] = 0,  A ∈ R^(2N x 12)
The matrix [R | t] has 12 unknowns in total and each feature match provides two linear constraints, so the pose can be solved once there are at least 6 matching point pairs.
(2) Visual SLAM back-end optimization and mapping
The visual SLAM back-end optimization and mapping comprise BA-based global map optimization and three-dimensional dense point cloud map construction; the overall process is shown in FIG. 3.
The SLAM process continuously observes the environment, each observation occurring at a specific moment; over a continuous period, the motion of the unmanned aerial vehicle is converted into the motion of the camera at discrete times t = 1, 2, ..., k-1, k, with the camera poses at these times denoted x_1, x_2, ..., x_k; connecting these poses forms the trajectory of the camera over the period. Recording the environmental feature points observed during this time, i.e. the landmarks, completes the construction of the sparse map; assuming N landmarks are observed, denote them m_1, ..., m_N. Considering the motion of the camera from time k-1 to time k, i.e. the pose change of the drone, this is abstracted as a function f:
x_k = f(x_{k-1}, u_k, ω_k)
In the above formula, x_{k-1} and x_k are the camera poses at times k-1 and k, u_k is the reading at time k of a motion measurement sensor carried by the unmanned aerial vehicle, such as an encoder or an IMU, and ω_k is the noise present when measuring the motion of the drone. The above formula, describing the state change of the drone, is its motion equation.
The process of the camera observation environment is abstracted into a function h:
z_{k,j} = h(m_j, x_k, v_{k,j})
In the above formula, m_j denotes the j-th landmark, v_{k,j} the noise generated in the observation process, and z_{k,j} the observation of landmark m_j made by the camera at pose x_k; the above equation is the observation equation of SLAM.
BA plays a crucial role in back-end global optimization, ignoring noise first, then the observation equation can be written as:
z=h(x,y)
wherein x is the current pose of the camera, which can be represented by R and t, with corresponding Lie algebra form ε. Denoting the position of landmark y as p, the observed error can be written as:
e=z-h(ε,p)
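The motion and observation equations can be exercised with a toy model (the 2-D state and the additive-noise forms are illustrative assumptions, not the patent's exact sensor model):

```python
import numpy as np

rng = np.random.default_rng(1)

def motion(x_prev, u, sigma_w=0.0):
    """Motion equation x_k = f(x_{k-1}, u_k, w_k): a 2-D pose that adds
    the odometry increment u_k plus Gaussian noise w_k."""
    return x_prev + u + rng.normal(0.0, sigma_w, size=x_prev.shape)

def observe(m, x, sigma_v=0.0):
    """Observation equation z_{k,j} = h(m_j, x_k, v_{k,j}): the landmark
    m_j seen from pose x_k as a relative offset plus noise v_{k,j}."""
    return m - x + rng.normal(0.0, sigma_v, size=m.shape)

def error(z, x, m):
    """BA residual e = z - h(x, m), using the noise-free h."""
    return z - (m - x)
```

With the noise deviations set to zero the equations are deterministic, so the residual of a noise-free observation is exactly zero.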
Accumulating all error terms, we can get the global BA optimization objective function:
min (1/2) Σ_i Σ_j || e_ij ||^2 = (1/2) Σ_i Σ_j || z_ij - h(ε_i, p_j) ||^2
define camera pose and all landmarks as arguments:
x = [ε_1, ε_2, ..., ε_m, p_1, p_2, ..., p_n]^T
By adding a small increment Δx to the argument and performing a Taylor expansion, the objective function can be written as:
(1/2) || e(x + Δx) ||^2 ≈ (1/2) Σ_i Σ_j || e_ij + F_ij Δε_i + E_ij Δp_j ||^2
where F_ij represents the partial derivative of the objective-function error with respect to the camera-pose part of the argument, and E_ij the partial derivative with respect to the feature point position.
Then, the incremental equation of the overall objective function is pieced together by F and E, and adjusted to be the jacobian matrix:
J=[F E]
The H matrix of the incremental equation is then:
H = J^T J = [[F^T F, F^T E], [E^T F, E^T E]]
Thus, the incremental-equation representation of the global BA optimization is obtained.
The three-dimensional dense point cloud map is constructed by building four threads on the ORB_SLAM2 framework to form a SLAM system that constructs a dense map. First, one thread implements the visual odometer, estimating an accurate pose and performing feature point extraction and brute-force feature matching for each frame of image shot by the camera. A second thread completes local map construction, mainly for localization. The third thread accelerates feature matching between the map and the current frame. Finally, the fourth thread completes the construction of the three-dimensional dense point cloud map; the map constructed by the four threads is shown in FIG. 6.
(3) Improved RRT algorithm in three-dimensional environment
The main idea of RRT path planning is as follows: take the task starting point as the root node of a random tree in the map, obtain random points by random sampling in the planning space, and let the tree node closest to each random point grow towards it; the schematic diagram of RRT is shown in FIG. 7(a). The specific planning process of the RRT algorithm can be divided into five steps:
The first step takes the task starting point as the root node q_init of the random tree; the second step obtains a random point z_rand by random sampling in the planning space and then finds the tree node z_nearest in the random tree closest to the random point;
the third step moves from the tree node z_nearest along the direction z_nearest→z_rand by one random-tree step length to obtain the candidate tree node z_new closest to z_rand;
the fourth step connects z_nearest and z_new into a line segment and performs a collision test on it; if no obstacle lies on the segment, z_new becomes a real tree node, otherwise a new candidate tree node is sought; the fifth step keeps searching for tree nodes until the tree grows onto or near the target point. However, the RRT algorithm has the following disadvantages:
(1) in the initial iteration, the beneficial samples near the target area are ignored.
(2) The sampling is explored throughout the configuration space, so the sampling space is too large.
(3) The random tree growth direction has more and more tree nodes, and a large amount of time is spent for exploring the space beyond the target point, so that the memory requirement is increased.
(4) The random tree converges slowly.
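The five steps can be sketched as a minimal 2-D RRT. The sampling bounds, step length, goal tolerance, and point-wise collision check are simplifying assumptions; a full planner would collision-check the entire edge rather than only the new point:

```python
import numpy as np

rng = np.random.default_rng(2)

def rrt(start, goal, is_free, step=0.5, goal_tol=0.5, max_iter=5000):
    """Minimal 2-D RRT following the five steps above: root the tree at
    the start, sample z_rand, find the nearest node z_nearest, steer one
    step towards the sample to get z_new, check it for collision, and
    stop once a node lands within goal_tol of the goal."""
    goal = np.asarray(goal, dtype=float)
    nodes = [np.asarray(start, dtype=float)]
    parents = [-1]
    for _ in range(max_iter):
        z_rand = rng.uniform(0.0, 10.0, size=2)       # sample the space
        i = int(np.argmin([np.linalg.norm(n - z_rand) for n in nodes]))
        direction = z_rand - nodes[i]
        norm = np.linalg.norm(direction)
        if norm < 1e-9:
            continue
        z_new = nodes[i] + step * direction / norm    # steer one step
        if not is_free(z_new):
            continue                                  # discard colliding candidate
        nodes.append(z_new)
        parents.append(i)
        if np.linalg.norm(z_new - goal) < goal_tol:
            path, j = [], len(nodes) - 1              # walk parents back
            while j != -1:
                path.append(nodes[j])
                j = parents[j]
            return path[::-1]
    return None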
In order to solve the above problems, the invention provides a new offline algorithm, the reset-boundary rapidly-exploring random tree (R-RRT); the flow chart of the UAV path planning R-RRT algorithm is shown in FIG. 4. During initial sampling, the range of sampling points is reduced to lower the memory requirement, and connected-domain sampling and target-biased bounded sampling are added so that the algorithm converges quickly to an optimal solution. These two methods let valuable samples near the target area be acquired rapidly, so an optimal path can be obtained in a relatively short time; after the initial path is found, elliptical sampling is used to continuously and rapidly optimize the path.
The main operation of the process is explained as follows:
GoalFirstNotFound: this function determines whether the first path planning has been completed; once connected-domain sampling is finished, the elliptical sampling region is planned.
ConnectivityReg: this function uses D_scale to identify the connected region, denoted C_Reg, which covers the search space between Z_init and Z_goal.
BoundedSample: this function randomly selects a point Z_rand ∈ C_Reg ⊆ Z_free.
Nearest: the function Nearest(T, Z_rand) returns the point of T = (V, E) closest to Z_rand according to a cost function.
Steer: the steering function Steer(Z_nearest, Z_rand) provides a control input U_new to the drive system that moves from Z(0) = Z_nearest toward Z(1) = Z_rand along the path z[0, 1], yielding a point Z_new at a fixed distance Δq.
CollisionCheck: this function determines whether the path Z(t) ∈ Z_free is feasible in Z from t = 0 to t = 1.
Near: the function Near(T, Z_new, V) returns the neighboring points within a given range.
ChooseParent: this function selects the minimum-cost tree node Z_min from the nearby points.
InsertPoint: this function adds the point Z_new to V of the tree T = (V, E) and connects it to the point Z_min as its parent node. The cost assigned to Z_new equals the cost of its parent node plus the Euclidean distance between Z_new and Z_min.
Rewire: the function Rewire(T, Z_near, Z_min, Z_new) checks, for each point in Z_near, whether reaching it through Z_new lowers its cost; if so, that point's parent node is redefined to be Z_new.
CompleteScan: returns true if the scan of C_Reg is complete. If no path is found after a complete scan, the algorithm must grow C_Reg.
MinAxis: this function defines the length of the major axis of the ellipse. After sampling, the ellipse iteratively changes the length of its major axis to optimize the path.
EllipseSample: this function defines and samples an elliptical region.
The algorithm takes Z_init as the root node and then uses an intelligently bounded sampling random tree that randomly selects points within a limited region called C_Reg.
C_Reg limits the search space in which new random points are sampled and is set according to the size of the search map; that is, the extension distance D_scale is used to define the search space between Z_init and Z_goal.
D_scale = E/m, where E is the size of the environment map and m is the spreading factor. By default m = 8, which limits the size of the environment-map region from which C_Reg is generated; m may be increased to obtain a narrower region or decreased to obtain a wider one. The target deviation introduced by intelligent bounded sampling heuristically selects random points within C_Reg.
When bounded sampling completes a single scan of C_Reg without finding a path, D_scale is gradually increased and C_Reg is enlarged incrementally until a path is found.
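The D_scale = E/m rule and the incremental growth of C_Reg can be illustrated with a tiny sketch; treating the map size E as a scalar and the `path_found` callback are assumptions made for illustration only:

```python
def d_scale(E, m=8):
    """Extension distance D_scale = E / m (m is the spreading factor)."""
    return E / m

def grow_region(E, m=8, path_found=lambda scale: False, max_steps=8):
    """Enlarge the C_Reg bound in D_scale increments until a path is found."""
    scale = d_scale(E, m)
    for _ in range(max_steps):
        if path_found(scale):          # a complete scan of C_Reg succeeded
            return scale
        scale += d_scale(E, m)         # failed scan: increment C_Reg
    return scale

# For a 100 m map, the region bound starts at 12.5 m and grows in 12.5 m steps.
bound = grow_region(100.0, path_found=lambda scale: scale >= 40.0)
```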
Once an initial path is found in C_Reg, the C_best function returns the length of the path under the Euclidean metric, i.e. the sum of the Euclidean lengths of the path segments: C_best = Σ_(i=1)^(n-1) ‖z_(i+1) − z_i‖.
After the first path has been found by connected-domain sampling, the sampling region continues to be set and elliptical sampling is carried out. An ellipse subset of states is added to the connected domain and sampled uniformly; as sampling proceeds, the planned region shrinks toward the theoretical minimum path solution.
The elliptical sampling space is defined from the optimal path obtained after connected-domain sampling: c is the path-cost function, and f(z) is the cost of the best path that connects Z_init to Z_goal through the state z in the given free space. A state can improve the current solution C_best only if f(z) is smaller than C_best, so the subset Z_f of promising states is expressed as:
Z_f = {z ∈ Z | f(z) < C_best}
The heuristic sampling domain Z_f is an ellipse with the starting point Z_init and the target point Z_goal as its foci, used to solve the minimum-path problem. The shape of the ellipse depends on the initial and target points, the theoretical minimum distance C_min between them, and the cost C_best of the best solution found so far.
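The ellipse Z_f can be sampled directly. The 2-D uniform-disk mapping below is one standard way to realize such sampling and is an illustrative assumption, not the patent's exact procedure:

```python
import math
import random

def sample_ellipse(z_init, z_goal, c_best, rng=None):
    """Uniformly sample the ellipse with foci z_init, z_goal and major axis c_best."""
    rng = rng or random.Random(0)
    c_min = math.dist(z_init, z_goal)               # theoretical minimum path length
    a = c_best / 2.0                                # semi-major axis
    b = math.sqrt(c_best ** 2 - c_min ** 2) / 2.0   # semi-minor axis
    cx = (z_init[0] + z_goal[0]) / 2.0
    cy = (z_init[1] + z_goal[1]) / 2.0
    ang = math.atan2(z_goal[1] - z_init[1], z_goal[0] - z_init[0])
    # Uniform point in the unit disk, scaled to the axes, then rotated.
    r = math.sqrt(rng.random())
    t = rng.uniform(0.0, 2.0 * math.pi)
    x, y = a * r * math.cos(t), b * r * math.sin(t)
    return (cx + x * math.cos(ang) - y * math.sin(ang),
            cy + x * math.sin(ang) + y * math.cos(ang))

pt = sample_ellipse((0.0, 0.0), (10.0, 0.0), c_best=12.0)
```

Every point sampled this way satisfies ‖z − Z_init‖ + ‖z − Z_goal‖ ≤ C_best, i.e. z ∈ Z_f.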
The unmanned aerial vehicle path planning algorithm R-RRT adds heuristic planning based on the RRT algorithm, and the specific effect is shown in fig. 7 (b).
Claims (1)
1. An unmanned aerial vehicle autonomous navigation method based on ORB _ SLAM2 is characterized by comprising the following steps:
(1) ORB feature point extraction
The ORB feature point extraction consists of two parts: the FAST corner and the BRIEF descriptor.
When detecting corner points, the gray levels of the four pixels at positions 1, 5, 9 and 13 on the circle around each candidate pixel are checked directly; the pixel currently under test can be a corner only if at least three of these four pixels satisfy the gray-level difference requirement;
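The pre-test on circle positions 1, 5, 9 and 13 can be sketched as follows; the radius-3 circle offsets and the threshold value are standard FAST conventions assumed here for illustration:

```python
def fast_pretest(image, r, c, threshold=20):
    """FAST corner pre-test: compare pixels 1, 5, 9, 13 on the radius-3
    circle with the center pixel; the center can only be a corner if at
    least 3 of the 4 are all brighter or all darker than it by threshold."""
    center = image[r][c]
    # Positions 1, 5, 9, 13 of the 16-pixel circle: top, right, bottom, left.
    ring = [image[r - 3][c], image[r][c + 3], image[r + 3][c], image[r][c - 3]]
    brighter = sum(p > center + threshold for p in ring)
    darker = sum(p < center - threshold for p in ring)
    return brighter >= 3 or darker >= 3

# A bright dot on a dark background passes the pre-test.
img = [[10] * 7 for _ in range(7)]
img[3][3] = 100
is_candidate = fast_pretest(img, 3, 3)
```

Pixels that pass this cheap test are then checked against the full 16-pixel circle criterion.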
The rotational invariance of the ORB features is realized by the gray centroid method. Specifically, a small image block B centered on the corner point is selected, and the moments of the image block are defined as:
m_pq = Σ_((x,y)∈B) x^p · y^q · I(x, y), p, q ∈ {0, 1}
where m_pq denotes a moment of the image block, x^p is the pixel x-coordinate raised to the power p, y^q is the pixel y-coordinate raised to the power q, and I(x, y) is the gray value at pixel (x, y);
The centroid of the image block is found from the moments:
C = (m_10/m_00, m_01/m_00)
where C denotes the centroid of the image block, m_00 is the zeroth-order moment, and m_10 and m_01 are the first-order moments in the x and y directions respectively;
let the geometric center of the image block B be O; connecting O and C gives the direction vector OC of the key point, which is expressed as an angle:
θ=arctan(m01/m10)
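The moments and the orientation angle θ can be computed directly. Measuring (x, y) from the block center, so that the geometric center O is the origin, is an assumption that matches the O→C construction above:

```python
import math

def moment(patch, p, q):
    """m_pq = sum of x^p * y^q * I(x, y), with (x, y) measured from the block center."""
    h, w = len(patch), len(patch[0])
    cy, cx = h // 2, w // 2
    return sum(((x - cx) ** p) * ((y - cy) ** q) * patch[y][x]
               for y in range(h) for x in range(w))

def orientation(patch):
    """Key-point direction theta = arctan(m01 / m10), via atan2 for the full range."""
    return math.atan2(moment(patch, 0, 1), moment(patch, 1, 0))

# Intensity concentrated on the right edge -> the direction points along +x;
# intensity concentrated on the bottom row -> it points along +y.
patch_right = [[0, 0, 1], [0, 0, 1], [0, 0, 1]]
patch_down = [[0, 0, 0], [0, 0, 0], [1, 1, 1]]
```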
After the ORB key points have been extracted, the other component of the feature points, the descriptor of each key point, is computed;
after graying the image, it is divided into 64x64 local pixel blocks; key-point detection is then performed on each pixel block, the Harris response value of every candidate key point that meets the detection conditions is computed, and finally the two key points with the largest response values in each block are retained as the final key-point set;
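Retaining the strongest responses per block reduces to a small selection step; the candidate representation `(position, harris_response)` is an assumed format used here for illustration:

```python
def retain_top_keypoints(candidates, k=2):
    """Keep the k candidates with the largest Harris response within one pixel block."""
    return sorted(candidates, key=lambda kp: kp[1], reverse=True)[:k]

# Three candidates in one block; the two strongest survive.
kept = retain_top_keypoints([((3, 4), 0.9), ((1, 2), 0.5), ((5, 6), 0.7)])
```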
(2) ORB brute-force matching
After the feature points have been extracted and their descriptors computed, the descriptors are matched by brute force;
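Brute-force matching of binary descriptors compares every descriptor pair by Hamming distance; packing BRIEF descriptors into integers and the `max_dist` cutoff are illustrative assumptions:

```python
def hamming(a, b):
    """Hamming distance between two binary descriptors packed into ints."""
    return bin(a ^ b).count("1")

def brute_force_match(desc1, desc2, max_dist=64):
    """For every descriptor in desc1, keep its nearest neighbour in desc2."""
    matches = []
    for i, d1 in enumerate(desc1):
        j, dist = min(((j, hamming(d1, d2)) for j, d2 in enumerate(desc2)),
                      key=lambda t: t[1])
        if dist <= max_dist:
            matches.append((i, j, dist))
    return matches

# Toy 4-bit descriptors: both query descriptors are closest to desc2[0].
matches = brute_force_match([0b1010, 0b1111], [0b1110, 0b0000])
```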
After feature matching is completed, only the depth value of each feature point is additionally needed: applying one inverse transformation of the camera pinhole imaging model to a feature point yields its three-dimensional coordinates in the camera frame. The transformation is:
x = (u − c_x) · d / f_x, y = (v − c_y) · d / f_y, z = d
where d is the depth value of the feature point, u and v are its pixel coordinates along the x and y axes, c_x and c_y are the offsets of the origin between the two coordinate systems, and f_x, f_y are the camera focal lengths in pixels;
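The inverse transformation is only a few lines; the numeric intrinsics in the example call are made-up values for illustration:

```python
def backproject(u, v, d, fx, fy, cx, cy):
    """Inverse pinhole projection: pixel (u, v) at depth d -> 3D point in the camera frame."""
    return ((u - cx) * d / fx, (v - cy) * d / fy, d)

# A pixel at the principal point maps onto the optical axis.
p0 = backproject(320.0, 240.0, 2.0, fx=500.0, fy=500.0, cx=320.0, cy=240.0)
p1 = backproject(420.0, 290.0, 2.0, fx=500.0, fy=500.0, cx=320.0, cy=240.0)
```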
Suppose a point P in space has coordinates [x, y, z]^T and its imaging position P′ is [x′, y′, z′]^T, with f the camera focal length; by the similar-triangle theorem the following proportional relationship is obvious:
z/f = x/x′ = y/y′
Rearranging slightly gives:
x′ = f · x/z, y′ = f · y/z
These formulas give only the transformation from a space point to the imaging plane, while the camera actually produces discrete pixels, so one more coordinate transformation is needed to obtain the pixel coordinates of the space point P. The transformation is:
u = α · x′ + c_x, v = β · y′ + c_y
α and β are the scaling factors of the two coordinate axes, and c_x and c_y are the offsets of the origin between the two coordinate systems; writing αf as f_x and βf as f_y, the whole imaging process becomes:
u = f_x · x/z + c_x, v = f_y · y/z + c_y
Thus we have obtained the camera pinhole imaging model; writing the pixel coordinates in homogeneous form, the whole projection process becomes the matrix equation:
z · [u, v, 1]^T = K · [x, y, z]^T, with K = [[f_x, 0, c_x], [0, f_y, c_y], [0, 0, 1]]
in the formula, z is the depth of the pixel;
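The homogeneous projection collapses to two scalar formulas once the depth z is divided out; the sketch below applies the intrinsics directly (the values in the call are made up):

```python
def project(point, fx, fy, cx, cy):
    """Pinhole projection u = fx*x/z + cx, v = fy*y/z + cy (divide by the depth z)."""
    x, y, z = point
    return (fx * x / z + cx, fy * y / z + cy)

# A point on the optical axis lands on the principal point.
uv = project((0.0, 0.0, 2.0), fx=500.0, fy=500.0, cx=320.0, cy=240.0)
```

This is the exact inverse of the depth-based back-projection used for the feature points.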
The PnP problem is solved by the direct linear transform (DLT) method. Suppose a space point has homogeneous coordinates P = (x, y, z, 1)^T and its corresponding homogeneous pixel coordinates are p = (u, v, 1)^T. To solve for the camera pose R and t, the rotation matrix and translation vector are written together in matrix form, i.e. the augmented matrix [R | t] is constructed;
expanding with the camera's imaging projection model gives the relation:
s · (u, v, 1)^T = [R | t] · (x, y, z, 1)^T
so each 3D-to-2D pair of matched feature points provides two linear constraints on the augmented matrix. Defining the row vectors of the augmented matrix [R | t] as t_1^T, t_2^T, t_3^T, the formula simplifies to:
t_1^T · P − u · t_3^T · P = 0, t_2^T · P − v · t_3^T · P = 0
Assuming a total of N matched point pairs, the equations are stacked into the linear system:
A · (t_1; t_2; t_3) = 0, where A has 2N rows
the matrix [ R | t ] has 12 dimensions in total, and one feature matching point provides two linear constraints for the matrix [ R | t ], so that the pose is solved as long as the number of the feature matching point pairs is more than 6 pairs;
(3) global map optimization and three-dimensional dense point cloud map construction of BA
Visual SLAM back-end optimization and mapping comprise BA-based global map optimization and construction of a three-dimensional dense point-cloud map;
The SLAM process continuously observes the environment, with observations made at specific times: the motion of the UAV over a continuous period is converted into camera motion at discrete times t = 1, 2, ..., k−1, k, and the camera poses at these times are denoted x_1, x_2, ..., x_k; connecting these poses forms the camera trajectory over the period. The environmental feature points observed during this period, i.e. the landmarks, are recorded to complete the construction of the sparse map; assuming N landmarks are observed, each landmark is denoted m_1, ..., m_N. Considering the camera motion from time k−1 to time k, i.e. the change of the UAV's pose, this is abstracted as a function f:
x_k = f(x_(k−1), u_k, ω_k)
In the above formula, x_(k−1) and x_k are the camera poses at times k−1 and k, u_k is the reading at time k of a motion-measurement sensor carried by the UAV, such as an encoder or an IMU, and ω_k is the noise present in the UAV's motion measurement; the formula describes the state change of the UAV and is its motion equation;
the process of the camera observation environment is abstracted into a function h:
z_(k,j) = h(m_j, x_k, v_(k,j))
In the above formula, m_j denotes the j-th landmark, v_(k,j) the noise generated in the observation, and z_(k,j) the camera's observation of the environment at pose x_k; this formula is the observation equation of SLAM;
ignoring the noise first, the observation equation is written as:
z=h(x,y)
where x is the current pose of the camera, represented by R and t with corresponding Lie-algebra form ε; the pixel coordinates of the landmark y are written as p, and the observation error is written as:
e=z-h(ε,p)
Accumulating all error terms gives the objective function of global BA optimization:
min (1/2) Σ_i Σ_j ‖e_(i,j)‖² = (1/2) Σ_i Σ_j ‖z_(i,j) − h(ε_i, p_j)‖²
define camera pose and all landmarks as arguments:
x = [ε_1, ε_2, ..., ε_m, p_1, p_2, ..., p_n]^T
Adding a small increment Δx to the argument and performing a Taylor expansion, the objective function is written as:
(1/2) ‖f(x + Δx)‖² ≈ (1/2) Σ_i Σ_j ‖e_(i,j) + F_(i,j) Δε_i + E_(i,j) Δp_j‖²
F_(i,j) denotes the partial derivative of the objective function with respect to the camera-pose part of the argument, and E_(i,j) the partial derivative with respect to the feature-point positions;
The incremental equation of the overall objective function is then assembled from F and E into the Jacobian matrix:
J=[F E]
The H matrix of the incremental equation is then:
H = JᵀJ = [[FᵀF, FᵀE], [EᵀF, EᵀE]]
thus obtaining incremental function representation of global optimization of BA;
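The block structure J = [F E], H = JᵀJ can be seen in a one-observation toy example. To keep the sketch short, the pose is reduced to a pure translation (real BA uses the full SE(3) pose ε), the Jacobians are taken numerically, and all numeric values are made up:

```python
import numpy as np

def h(pose_t, point, fx=1.0, fy=1.0, cx=0.0, cy=0.0):
    """Observation function: project a landmark after a translation-only pose."""
    p = point + pose_t
    return np.array([fx * p[0] / p[2] + cx, fy * p[1] / p[2] + cy])

def numeric_jacobian(f, x0, eps=1e-6):
    """Central-difference Jacobian of a vector function f at x0."""
    x0 = np.asarray(x0, dtype=float)
    cols = []
    for i in range(x0.size):
        d = np.zeros_like(x0)
        d[i] = eps
        cols.append((f(x0 + d) - f(x0 - d)) / (2 * eps))
    return np.stack(cols, axis=1)

pose = np.array([0.1, 0.0, 0.0])
landmark = np.array([0.5, 0.5, 2.0])
z = np.array([0.31, 0.26])                                    # made-up observation
e = z - h(pose, landmark)                                     # e = z - h(eps, p)
F = numeric_jacobian(lambda tr: z - h(tr, landmark), pose)    # d e / d pose block
E = numeric_jacobian(lambda p: z - h(pose, p), landmark)      # d e / d point block
J = np.hstack([F, E])                                         # J = [F  E]
H = J.T @ J                                                   # incremental-equation matrix
```

In full BA, the four blocks FᵀF, FᵀE, EᵀF, EᵀE of H form the sparse structure that back-end solvers exploit when solving for the increment.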
To construct the three-dimensional dense point-cloud map, four threads are built with the ORB_SLAM2 framework to design a SLAM system that builds a dense map: the first thread completes the visual odometry, estimating an accurate pose and performing feature-point extraction and brute-force feature matching for each image frame captured by the camera; the second thread completes local map construction and is mainly used for positioning; the third thread accelerates feature matching between the map and the current frame; finally, the fourth thread completes construction of the three-dimensional dense point-cloud map;
(4) improved RRT algorithm in three-dimensional environment
In the initial sampling, reducing the range of sampling points to reduce the memory requirement, and adding connected domain sampling and target deviation bounded sampling; after the initial path is found, continuous and rapid optimization of the path is realized by utilizing elliptical sampling.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111075423.6A CN113808152A (en) | 2021-09-14 | 2021-09-14 | Unmanned aerial vehicle autonomous navigation method based on ORB _ SLAM2 |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113808152A true CN113808152A (en) | 2021-12-17 |