CN105809687B - Monocular vision ranging method based on edge point information in an image - Google Patents

Monocular vision ranging method based on edge point information in an image

Info

Publication number
CN105809687B
CN105809687B (application CN201610131438.2A)
Authority
CN
China
Prior art keywords
depth
frame
map
image
point
Prior art date
Legal status
Active
Application number
CN201610131438.2A
Other languages
Chinese (zh)
Other versions
CN105809687A (en)
Inventor
程农
杨盛
李清
田振
Current Assignee
Tsinghua University
Original Assignee
Tsinghua University
Priority date
Filing date
Publication date
Application filed by Tsinghua University filed Critical Tsinghua University
Priority to CN201610131438.2A priority Critical patent/CN105809687B/en
Publication of CN105809687A publication Critical patent/CN105809687A/en
Application granted granted Critical
Publication of CN105809687B publication Critical patent/CN105809687B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01C MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C21/00 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
    • G01C21/20 Instruments for performing navigational calculations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10016 Video; Image sequence
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30181 Earth observation

Landscapes

  • Engineering & Computer Science (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Automation & Control Theory (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The present invention relates to a UAV monocular vision ranging method based on edge point information in images and belongs to the technical field of unmanned aerial vehicle navigation and positioning. The method comprises: selecting two frames from the image sequence captured by a downward-looking monocular camera fixedly attached to the UAV to construct an initial map and an initial depth map, taking the first frame as the first key frame in the map and its camera coordinate system as the world coordinate system, thereby completing initialization; then running three threads in parallel for motion estimation, map construction and depth map estimation: the motion estimation thread aligns the current frame against the known map and depth map information to obtain the ranging result and optimizes the existing map information according to that result, while the map construction and depth map estimation threads run concurrently to maintain the map and depth map information. The invention makes full use of the multi-core architecture of modern processors, effectively exploits the edge point information in the image in combination with corner information, improves the efficiency of the algorithm, and has stronger adaptability.

Description

Monocular vision ranging method based on edge point information in image
Technical Field
The invention belongs to the technical field of unmanned aerial vehicle navigation and positioning, and particularly relates to a monocular vision range finding method utilizing edge point information in an image.
Background
In many application scenarios GPS cannot be relied upon for accurate positioning, for example indoors, between urban buildings, in jungles or valleys, or even on extraterrestrial bodies. Navigating by vision is very intuitive: every sighted animal in nature uses vision in some form to navigate; insects such as flies and bees judge their motion relative to a target through optical flow, and humans judge their position most of the time from the scenery they see.
Without GPS, most micro unmanned aerial vehicles adopt, for reasons of cost and weight, a navigation scheme that combines an inertial measurement unit with a single camera, possibly supplemented by an ultrasonic sensor and a barometric sensor to obtain absolute height information. A drone in this configuration can achieve fully autonomous waypoint tracking.
The process of estimating the motion of a carrier equipped with one or more cameras using only their image input is called visual odometry (VO). Its application fields include robotics, augmented reality, autonomous driving and the like. The process is analogous to conventional wheel odometry: wheel odometry integrates wheel rotations, while visual odometry incrementally estimates the pose of the carrier by sensing changes in the input images. Effective operation of a visual ranging algorithm requires sufficient illumination and sufficient scene texture.
Monocular visual ranging uses only a single camera as input, so the system configuration is simple and the ability to adapt to changes of environmental scale is stronger than that of a binocular (stereo) vision system. Existing monocular visual ranging methods generally perform inter-frame matching using corner features in the image and therefore cannot adapt to scenes lacking corner features (corner features are important image features, i.e. pixels that stand out in some attribute; in the scene of FIG. 1 they appear mostly at intersections of straight lines, are few in number and structurally repetitive, so an existing corner-based monocular visual ranging algorithm is likely to fail). In addition, corner features account for only a small fraction of all pixels in the image, so monocular visual ranging performance can also be improved from the viewpoint of image information utilization.
The terms to which the present invention relates are described below:
frame: in the field of visual ranging it is customary to call each acquired image a frame; for example, the image obtained by the camera at the previous moment is called the previous frame, the image obtained at the current moment is called the current frame, and two consecutive images obtained by the camera are called adjacent frames;
key frame: because the frame rate of current cameras is high, the pose change between adjacent frames is often small. To improve the accuracy of motion estimation a key frame strategy is generally adopted: within a certain range of pose change, each newly obtained image is aligned only against one specific frame to estimate the current pose, and only after that range is exceeded is a new specific frame adopted for the next stage of image alignment; these specific frames used for image alignment are called key frames;
reference frame: the frame against which the current image is aligned is called the reference frame of the current image;
map: in the field of visual ranging, known environmental information (e.g., the positions of points that have been calculated, images that have been acquired, etc.) is stored and referred to as a map. The map can be used as prior information of subsequent image matching and motion estimation to increase the accuracy of the measuring range.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides a monocular vision ranging method based on edge point information in an image, which can effectively utilize the edge point information in the image and, by combining corner information as an aid, has stronger adaptability to scenes lacking corner features.
The invention provides a UAV monocular vision ranging method based on edge point information in an image, characterized by comprising initialization and the parallel processing of motion estimation, map construction and depth map estimation; the method specifically comprises the following steps:
1) initialization: two frames are selected from the image sequence captured by a downward-looking monocular camera fixedly attached to the UAV to construct an initial map and an initial depth map. The map is represented by a set of key frames {r_m} and a set of three-dimensional feature points {p_i}, where r denotes a key frame, the subscript m denotes the m-th key frame (m a positive integer), p denotes a three-dimensional feature point, and the subscript i denotes the i-th three-dimensional feature point (i a positive integer); the three-dimensional feature points correspond to corner features in the image, and the corresponding set of two-dimensional corners is written {u_{c,i}}, where u_c denotes a corner point and the subscript i denotes the i-th corner (i a positive integer). The depth map consists of edge point coordinates, edge point depth values and the uncertainty of the depth values; depth maps correspond one-to-one to key frames and are written D^m = {(u_{e,i}, d_{e,i}, σ_{e,i})}, where D denotes the depth map of the edge points, the superscript m denotes the m-th depth map (a positive integer), u_e denotes an edge point, d_e denotes the depth value of an edge point, σ_e denotes the uncertainty of the inverse depth of an edge point, and the subscript i denotes the i-th edge point (a positive integer);
taking the first frame as a first key frame in the map, taking a camera coordinate system corresponding to the first frame as a world coordinate system, and finishing initialization;
2) after map and depth map information is initialized, three threads of motion estimation, map construction and depth map estimation are performed in parallel: the motion estimation thread aligns with the current frame by using the known map and depth map information to obtain a ranging result, the existing map information is optimized according to the ranging result, and the map construction and depth map estimation thread simultaneously operate to maintain the map and depth map information.
The invention has the characteristics and beneficial effects that:
the monocular vision ranging method based on the edge point information in the image takes an image captured by a downward-looking monocular camera fixedly connected with an unmanned aerial vehicle as input, and outputs a ranging result of the unmanned aerial vehicle through on-board computer operation; the method is realized by three parallel threads, namely a motion estimation thread, a map construction thread and a depth map estimation thread; the motion estimation thread aligns the current motion pose with the current frame by using the existing information in the map and the depth map, is the core of a ranging algorithm and must ensure strong real-time performance; the map construction thread is used for maintaining the key frame information and the three-dimensional feature map corresponding to the image corner features; the depth map estimation thread is used to maintain a depth estimate of edge point information in the key frame.
The invention respectively runs the motion estimation, the map construction and the depth map estimation in three threads, fully utilizes the multi-core architecture of the modern processor and improves the algorithm efficiency. The motion estimation algorithm has no processes of feature extraction and matching, a rapid feature point extraction algorithm is adopted during map construction, and pixel points with large brightness gradient are directly extracted in depth map estimation, so that the calculated amount is effectively reduced. Meanwhile, in the initial estimation process of the motion estimation algorithm, the edge point information in the image is utilized, and a depth map propagation method of the edge point is correspondingly designed. The edge points in the image are image features which are much richer than the corner points, and the algorithm can reduce the dependence of the algorithm on the corner point features by adopting edge point information in the initial estimation and enhance the environment adaptability of the algorithm.
Drawings
FIG. 1 is a scene in which a monocular visual ranging method based on corner features may fail;
FIG. 2 is a general flow chart of the improved monocular vision ranging method based on edge points according to the present invention;
FIG. 3 is a flow chart of a motion estimation thread according to an embodiment of the present invention;
FIG. 4 is a flow diagram of a mapping thread according to an embodiment of the invention;
FIG. 5 is a flowchart of a depth map estimation thread according to an embodiment of the present invention;
Detailed Description
The present invention will be described in detail below with reference to the accompanying drawings and examples.
The general flow of the UAV monocular vision ranging method based on edge point information in an image is shown in FIG. 2; it comprises initialization and the parallel execution of motion estimation, map construction and depth map estimation, specifically including the following steps:
1) initialization: two frames are selected from the image sequence captured by a downward-looking monocular camera fixedly attached to the UAV to construct an initial map and an initial depth map. In this embodiment the map is represented by a set of key frames {r_m} and a set of three-dimensional feature points {p_i}, where r denotes a key frame, the subscript m denotes the m-th key frame (a positive integer), p denotes a three-dimensional feature point, and the subscript i denotes the i-th three-dimensional feature point (a positive integer); the three-dimensional feature points correspond to corner features in the image, and the corresponding set of two-dimensional corners is written {u_{c,i}}, where u_c denotes a corner point and the subscript i denotes the i-th corner (a positive integer);
in this embodiment a depth map is computed for every key frame in the map; the depth map consists of edge point coordinates, edge point depth values and the uncertainty of the depth values, i.e. it is written D^m = {(u_{e,i}, d_{e,i}, σ_{e,i})}, where D denotes the depth map of the edge points, the superscript m denotes the m-th depth map, a positive integer (depth maps correspond one-to-one to key frames), u_e denotes an edge point, d_e denotes the depth value of an edge point, σ_e denotes the uncertainty of the inverse depth of an edge point, and the subscript i denotes the i-th edge point (a positive integer).
The first frame is taken as the first key frame in the map, and the camera coordinate system of the first frame is taken as the world coordinate system (the world coordinate system is a coordinate system with a definite transformation relative to a real-world coordinate system such as a north-east-down frame; the camera coordinate system of the first frame is chosen as the world coordinate system for convenience of computation), which completes the initialization;
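For illustration only, the containers implied by this initialization can be sketched as follows in Python; the class and field names (EdgeDepthEntry, KeyFrame, SparseMap) are hypothetical and not prescribed by the patent.

```python
# Hedged sketch of the map / depth-map containers described above.
# All names are illustrative assumptions, not the patent's own API.
from dataclasses import dataclass, field
from typing import List
import numpy as np

@dataclass
class EdgeDepthEntry:          # one element (u_e, d_e, sigma_e) of a depth map D^m
    u: np.ndarray              # 2D edge-point coordinate u_e (pixels)
    d: float                   # depth value d_e
    sigma: float               # uncertainty of the inverse depth, sigma_e

@dataclass
class KeyFrame:                # one key frame r_m with its associated data
    image: np.ndarray                                              # grey-scale image
    T_m_w: np.ndarray                                              # 4x4 pose in the world frame
    corners: List[np.ndarray] = field(default_factory=list)        # 2D corners {u_c,i}
    depth_map: List[EdgeDepthEntry] = field(default_factory=list)  # depth map D^m

@dataclass
class SparseMap:               # the map: key frames {r_m} plus 3D feature points {p_i}
    keyframes: List[KeyFrame] = field(default_factory=list)
    points: List[np.ndarray] = field(default_factory=list)

def initialize(first_frame: np.ndarray) -> SparseMap:
    """First frame becomes key frame 1; its camera frame is taken as the world frame."""
    m = SparseMap()
    m.keyframes.append(KeyFrame(image=first_frame, T_m_w=np.eye(4)))
    return m
```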
2) After the map and depth map information has been initialized, the three threads of motion estimation, map construction and depth map estimation run in parallel: the motion estimation thread aligns the current frame against the known map and depth map information to obtain the ranging result and optimizes the existing map information accordingly, while the map construction thread and the depth map estimation thread run concurrently to maintain the map and depth map information. The three threads are described in detail as follows:
21) Motion estimation thread: for the current frame image newly acquired by the camera, image alignment based on a motion model is carried out using the edge point depth information in the current key frame, yielding an initial estimate T^-_{k,w} of the pose T_{k,w} of the current frame in the world coordinate system, where T denotes a pose, the subscript k indicates that the current frame is the k-th image obtained by the camera (a positive integer), the subscript w denotes the world coordinate system, and the superscript '-' denotes an initial estimate. Using this initial estimate, the existing three-dimensional feature points {p_i} in the map are projected into the current image, and block matching yields the two-dimensional corners {u_{c,i}} corresponding to {p_i}. With {u_{c,i}} as measurements and T^-_{k,w} as the initial value, the pose T_{k,w} of the current frame in the world coordinate system w and the positions of the relevant three-dimensional feature points in the map are optimized in turn. The motion estimation embodiment is shown in FIG. 3 and comprises:
S11: denote the current frame image by I_k, where I denotes an image acquired by the camera. The current pose of the drone is first initialized to the pose of the previous moment, i.e. T^-_{k,w} = T_{k-1,w}. With the current key frame r_m as the reference frame, a Gauss-Newton iterative algorithm is used to minimize the weighted gray-scale residual between I_k and r_m, as in (1-1):

T_{k,m} = arg min_{T_{k,m}} Σ_i w_i(u_i) · δR(T_{k,m}, u_i)^2    (1-1)

which yields the pose T_{k,m} of the current frame I_k relative to the reference frame r_m. The sum runs over the set of edge points of the reference frame r_m whose inverse-depth uncertainty is below a set uncertainty threshold (1/200 of the maximum estimate of the inverse depth) and which are visible in the current frame; u denotes a pixel and the subscript i denotes the i-th pixel (a positive integer); w_i(u_i) is the weight of pixel u_i, determined by σ_i, the uncertainty of the inverse depth of the i-th pixel. The gray-scale residual δR(T_{k,m}, u_i) of pixel u_i is given by (1-2):

δR(T_{k,m}, u_i) = I_k( π( T_{k,m} · π^{-1}(u_i, d_i) ) ) − r_m(u_i)    (1-2)

where π is the projection function that projects three-dimensional points in the camera coordinate system into the two-dimensional image coordinate system, π^{-1} is the inverse of the projection function (back-projection with depth d_i), and I_k(·) and r_m(·) denote the intensities of the pixel in frame I_k and frame r_m, respectively. Combining this with the pose T_{m,w} of the reference frame r_m relative to the world coordinate system, the initial estimate of the pose of the current frame is computed as in (1-3):

T^-_{k,w} = T_{k,m} · T_{m,w}    (1-3)

If the number of edge points participating in the weighting is smaller than the set edge point threshold (300 in this embodiment), go to step S12; otherwise go directly to step S13;
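For illustration, the weighted photometric objective of (1-1)/(1-2) can be sketched as below; the pinhole model, the 1/σ weighting and all function names are assumptions made for this example rather than the patent's exact formulation, and a Gauss-Newton loop would minimize this cost over T_km.

```python
# Hedged sketch of the weighted gray-scale residual of eqs. (1-1)/(1-2).
import numpy as np

def bilinear(img, uv):
    """Bilinear intensity lookup at sub-pixel position uv = (u, v)."""
    u, v = uv
    x0, y0 = int(np.floor(u)), int(np.floor(v))
    a, b = u - x0, v - y0
    return ((1 - a) * (1 - b) * img[y0, x0]     + a * (1 - b) * img[y0, x0 + 1] +
            (1 - a) * b       * img[y0 + 1, x0] + a * b       * img[y0 + 1, x0 + 1])

def project(K, p_cam):
    """pi: 3D point in the camera frame -> pixel coordinates."""
    uvw = K @ p_cam
    return uvw[:2] / uvw[2]

def backproject(K, uv, depth):
    """pi^{-1}: pixel plus depth -> 3D point in the camera frame."""
    return depth * np.linalg.inv(K) @ np.array([uv[0], uv[1], 1.0])

def photometric_cost(T_km, I_k, r_m, edges, K):
    """edges: iterable of (u_i, d_i, sigma_i) taken from the key frame r_m."""
    cost = 0.0
    h, w = I_k.shape
    for u_i, d_i, sigma_i in edges:
        p_m = backproject(K, u_i, d_i)                 # 3D point in the key-frame camera
        p_k = (T_km @ np.append(p_m, 1.0))[:3]         # transform into the current frame
        if p_k[2] <= 0:
            continue
        u_k = project(K, p_k)
        if not (0 <= u_k[0] < w - 1 and 0 <= u_k[1] < h - 1):
            continue                                   # point not visible in I_k
        r = bilinear(I_k, u_k) - bilinear(r_m, u_i)    # gray residual, eq. (1-2)
        cost += (1.0 / sigma_i) * r * r                # weighted squared residual, eq. (1-1)
    return cost
```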
S12: using the pose estimate obtained in step S11 as the initial value, take the previous frame I_{k-1} as the reference frame of the current frame and perform image alignment, minimizing the gray-scale residual between I_k and I_{k-1} with a Gauss-Newton iterative optimization to obtain the pose T_{k,k-1} of the current frame relative to the previous frame, as in (1-4):

T_{k,k-1} = arg min_{T_{k,k-1}} Σ_i δR(T_{k,k-1}, u_{c,i})^2    (1-4)

where the sum runs over the set of corner points of frame k-1 (projected from the three-dimensional feature points in the map) whose depth is known and which are visible in the current frame. Combining this with the pose T_{k-1,w} of the previous frame relative to the world coordinate system, the initial estimate of the pose of the current frame is computed as in (1-5):

T^-_{k,w} = T_{k,k-1} · T_{k-1,w}    (1-5)

At this point the current frame I_k is selected as a new key frame and serves as the reference frame for the subsequent images; it also serves as input to the map construction thread and the depth map estimation thread;
S13: (using the initial pose estimate T^-_{k,w} of the current frame, the three-dimensional feature points in the map can be projected into the current frame to obtain a set of two-dimensional coordinates; however, because T^-_{k,w} contains errors, these coordinates are not necessarily the true imaging positions of the three-dimensional feature points in the current frame.) From the initial estimate T^-_{k,w}, determine the set {p_i} of three-dimensional feature points of the map observed by the current frame and obtain the estimated coordinates of the corresponding corner points with respect to the reference frame r_m. Using these estimates as initial values, the true imaging positions of {p_i} in the current frame are computed with the image tracking algorithm KLT (Kanade-Lucas-Tomasi feature tracker, a commonly used open algorithm), updating the two-dimensional corner set {u_{c,i}}, as in (1-6), where A_i denotes the transformation matrix that warps the reference image block into the current image, and the average gray-level difference between the reference block and the matching block in the current image is included to remove illumination effects (this step establishes, with sub-pixel precision, the correspondence between the three-dimensional feature points {p_i} in the map and the two-dimensional corners in the image);
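As a rough stand-in for this feature-alignment step, OpenCV's pyramidal KLT tracker can refine the projected corner positions; note that the patent's formulation additionally uses an affine warp A_i and a mean-intensity offset, which plain calcOpticalFlowPyrLK does not model, so the sketch below is an approximation.

```python
# Hedged sketch: refine projected corner positions in the current frame with
# OpenCV's pyramidal Lucas-Kanade tracker (an approximation of step S13).
import numpy as np
import cv2

def refine_corners_klt(ref_img, cur_img, ref_corners, projected_corners):
    """ref_corners: Nx2 corner positions in the reference key frame.
    projected_corners: Nx2 initial guesses in the current frame (from T^-_{k,w})."""
    p0 = ref_corners.astype(np.float32).reshape(-1, 1, 2)
    p1_init = projected_corners.astype(np.float32).reshape(-1, 1, 2)
    p1, status, err = cv2.calcOpticalFlowPyrLK(
        ref_img, cur_img, p0, p1_init,
        flags=cv2.OPTFLOW_USE_INITIAL_FLOW,        # start from the projected guesses
        winSize=(21, 21), maxLevel=3)
    good = status.reshape(-1) == 1
    return p1.reshape(-1, 2), good                 # refined positions + validity mask
```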
S14: the two-dimensional coordinates obtained in step S13 and the initial pose estimate T^-_{k,w} of the current frame no longer satisfy the geometric projection constraint, i.e. equation (1-7):

u_{c,i} = π( T_{k,w} · p_i )    (1-7)

With T^-_{k,w} as the initial value, the pose of the current frame is updated by minimizing the reprojection error with respect to T_{k,w}, i.e. the final result of the motion estimation, as in (1-8):

T_{k,w} = arg min_{T_{k,w}} Σ_i || u_{c,i} − π( T_{k,w} · p_i ) ||^2    (1-8)

T_{k,w} is the final ranging result. Go to step S15, and pass the current frame pose T_{k,w} obtained in step S14 to the map construction thread and the depth map estimation thread, respectively, to compute corner depth values and edge point depth values;
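A hedged sketch of the reprojection-error minimization in (1-8) is given below, using SciPy's least-squares solver and an axis-angle parameterization via cv2.Rodrigues; the parameterization and the solver choice are assumptions, not taken from the patent.

```python
# Hedged sketch of eq. (1-8): refine the current-frame pose by minimizing the
# reprojection error of the matched corners, starting from T^-_{k,w}.
import numpy as np
import cv2
from scipy.optimize import least_squares

def refine_pose(T_init, points_3d, corners_2d, K):
    """T_init: 4x4 initial pose (world -> camera). points_3d: Nx3 world points p_i.
    corners_2d: Nx2 refined corner positions u_{c,i}. K: 3x3 intrinsics."""
    rvec0, _ = cv2.Rodrigues(T_init[:3, :3])
    x0 = np.hstack([rvec0.ravel(), T_init[:3, 3]])       # 6-vector [axis-angle, t]

    def residuals(x):
        R, _ = cv2.Rodrigues(x[:3])
        p_cam = points_3d @ R.T + x[3:]                  # transform into the camera frame
        uv = p_cam @ K.T
        uv = uv[:, :2] / uv[:, 2:3]                      # pinhole projection pi(.)
        return (uv - corners_2d).ravel()                 # reprojection errors

    sol = least_squares(residuals, x0, method="lm")
    R, _ = cv2.Rodrigues(sol.x[:3])
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = sol.x[3:]
    return T                                             # refined T_{k,w}
```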
S15: using the current frame pose T_{k,w} obtained in step S14, each three-dimensional feature point in the map that can be observed by the current frame is optimized individually, so that the sum of squared errors between its projections and its true imaging positions in all key frames that observe it is minimized, i.e. as in (1-9):

p_i = arg min_{p_i} Σ_j || u^j_{c,i} − π( T_{j,w} · p_i ) ||^2    (1-9)

where the superscript j denotes the j-th key frame in which the three-dimensional feature point p_i can be observed;
22) The map construction thread handles two situations. If the newly acquired image was selected as a new key frame in step S12 of the motion estimation thread, step S21 is executed: corner features are extracted from the newly added key frame and a probability-based depth filter is created for every newly extracted corner (i.e. the depth of a corner is described by a depth value and an uncertainty, and each new observation updates the depth value and the uncertainty of the corner through the depth filter). If the newly acquired image was not selected as a key frame, steps S22 to S24 are executed using the pose of the current frame obtained in step S14 of the motion estimation thread: the probabilistic depth filters of the corner features are updated with the information of the new image, and when the depth uncertainty of a corner falls below the set uncertainty threshold the corresponding three-dimensional feature point is added to the map. The embodiment of this thread is shown in FIG. 4; the specific steps are as follows: first judge whether the new image is a key frame; if yes, execute step S21, otherwise execute steps S22-S24;
S21: when a frame is selected as a key frame, corner features are extracted from it with the FAST corner extraction algorithm. The image is first divided evenly by a grid with equal spacing (e.g. 30x30 pixels); at most one corner is extracted per grid cell, so that the corners are distributed uniformly over the image and their total number does not exceed the number of cells. For each new corner u_{c,i}, a probabilistic depth filter is created; the joint posterior distribution of the inverse depth λ_i (where λ denotes the inverse of the depth of a corner and the subscript i denotes the i-th corner in the current key frame, a positive integer) and the probability ρ_i that an observation of this corner is valid (where ρ denotes the valid-observation probability of a corner and the subscript i denotes the i-th corner in the current key frame, a positive integer) is defined as in (2-1):

q( λ_i, ρ_i | a_n, b_n, μ_n, σ_n^2 ) = Beta( ρ_i | a_n, b_n ) · N( λ_i | μ_n, σ_n^2 )    (2-1)

where Beta and N denote the Beta distribution and the normal distribution, respectively, the subscript n indicates that the current probabilistic depth filter has undergone n parameter updates (n a positive integer), the parameters a_n and b_n are the numbers of valid and invalid observations accumulated during the recursive update, and μ_n and σ_n^2 are the mean and variance of the Gaussian distribution of the inverse depth. When the parameters of (2-1) are initialized, a_n and b_n take the initial value 10, μ_n is initialized to the inverse of the mean depth of the image, and σ_n^2 is initialized to a maximum value.
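For illustration, grid-based FAST extraction and the filter initialization described above might look as follows with OpenCV; the grid size, the FAST threshold, the DepthFilter fields and the choice of the "maximum" variance are assumptions.

```python
# Hedged sketch of step S21: one FAST corner per grid cell, plus
# initialization of a probabilistic depth filter (eq. 2-1) for each corner.
import numpy as np
import cv2
from dataclasses import dataclass

@dataclass
class DepthFilter:
    a: float       # accumulated "valid observation" count (Beta parameter a_n)
    b: float       # accumulated "invalid observation" count (Beta parameter b_n)
    mu: float      # mean of the inverse-depth Gaussian, mu_n
    sigma2: float  # variance of the inverse-depth Gaussian, sigma_n^2

def extract_grid_corners(img, cell=30, fast_threshold=20):
    """At most one FAST corner (the strongest) per cell x cell grid square."""
    fast = cv2.FastFeatureDetector_create(threshold=fast_threshold)
    keypoints = fast.detect(img, None)
    best = {}
    for kp in keypoints:
        key = (int(kp.pt[0]) // cell, int(kp.pt[1]) // cell)
        if key not in best or kp.response > best[key].response:
            best[key] = kp
    return [np.array(kp.pt) for kp in best.values()]

def init_depth_filters(corners, mean_scene_depth, max_inv_depth):
    """One filter per corner: a = b = 10, mu = 1/mean depth, sigma2 = 'maximum'.
    The range^2 / 36 choice for the maximum variance is an assumption; the
    patent only states that sigma_n^2 is initialized to a maximum value."""
    sigma2_max = (max_inv_depth ** 2) / 36.0
    return {tuple(u): DepthFilter(10.0, 10.0, 1.0 / mean_scene_depth, sigma2_max)
            for u in corners}
```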
S22: using the image I_k with known camera pose obtained in step S14, the depth estimates of the corner features are updated. Denote the corner being updated by u_{c,i}, located in key frame r_m. According to the relative pose between the current frame I_k and r_m, u_{c,i} determines a straight line in I_k on which the pixel of I_k corresponding to u_{c,i} is guaranteed to lie. A search is performed along this line (this process is commonly called an epipolar search); after the coordinates of the matching pixel have been obtained by block matching with sub-pixel precision, the depth of the corner u_{c,i} is computed by triangulation (a standard computation in stereo vision);
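The triangulation at the end of S22 can be illustrated with OpenCV's linear triangulation; the projection-matrix construction below is a standard, assumed implementation detail and is not taken from the patent.

```python
# Hedged sketch: triangulate the depth of a corner from its position in the
# key frame r_m and the matched position found by epipolar search in I_k.
import numpy as np
import cv2

def triangulate_depth(K, T_m_w, T_k_w, u_ref, u_cur):
    """T_*_w: 4x4 world->camera poses. u_ref / u_cur: 2D pixel in r_m / I_k.
    Returns the depth of the point expressed in the key frame r_m."""
    P_ref = K @ T_m_w[:3, :]                      # 3x4 projection matrix of r_m
    P_cur = K @ T_k_w[:3, :]                      # 3x4 projection matrix of I_k
    X_h = cv2.triangulatePoints(P_ref, P_cur,
                                np.asarray(u_ref, float).reshape(2, 1),
                                np.asarray(u_cur, float).reshape(2, 1))
    X_w = (X_h[:3] / X_h[3]).ravel()              # homogeneous -> 3D world point
    p_ref = T_m_w[:3, :3] @ X_w + T_m_w[:3, 3]    # express in the key-frame camera
    return p_ref[2]                               # depth along the optical axis
```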
S23: take the reciprocal λ_i^n of the depth obtained in step S22, where the subscript i denotes the i-th corner of the current key frame (a positive integer) and the superscript n indicates that this is the n-th measurement update of the depth filter of this corner (a positive integer). The measurement of the inverse depth is modeled as in (2-2):

p( λ_i^n | λ_i, ρ_i ) = ρ_i · N( λ_i^n | λ_i, τ_n^2 ) + (1 − ρ_i) · U( λ_i^n | λ_min, λ_max )    (2-2)

The meaning of this formula is: for the i-th corner of key frame r_m, if the measurement obtained through image I_k is valid, the reciprocal λ_i^n of the measured depth follows the normal distribution N(λ_i, τ_n^2), where τ_n^2 is the inverse-depth variance caused by a one-pixel error in the image plane; if it is an invalid measurement, λ_i^n follows the uniform distribution U(λ_min, λ_max), where λ_min and λ_max are the minimum and maximum of the inverse depth, set from prior information about the current scene. After the measurement update with λ_i^n, the posterior probability distribution of the inverse depth λ_i is as in (2-3):

p( λ_i, ρ_i | λ_i^1, …, λ_i^n ) = C · p( λ_i^n | λ_i, ρ_i ) · q( λ_i, ρ_i | a_{n-1}, b_{n-1}, μ_{n-1}, σ_{n-1}^2 )    (2-3)

where C is a constant that ensures the normalization of the probability distribution. The parameters a_n, b_n, μ_n and σ_n^2 of the approximating distribution (2-1) are then computed by moment matching (a statistical technique that approximates one distribution by another);
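As an illustration, the recursive update of this Gaussian x Beta filter admits a closed-form moment-matching step; the sketch below follows the update commonly attributed to Vogiatzis and Hernández (2011), which the filter described here resembles, and is an assumption rather than the patent's exact derivation.

```python
# Hedged sketch of the S23 filter update (eqs. 2-2/2-3): fuse one inverse-depth
# measurement x with variance tau2 into the Gaussian x Beta posterior by
# moment matching (Vogiatzis/Hernandez-style closed form).
import numpy as np

def update_depth_filter(f, x, tau2, inv_depth_range):
    """f: object with fields a, b, mu, sigma2 (see the S21 sketch).
    x: new inverse-depth measurement, tau2: its variance,
    inv_depth_range: width of the uniform outlier distribution (lambda_max - lambda_min)."""
    norm_pdf = lambda v, m, s2: np.exp(-0.5 * (v - m) ** 2 / s2) / np.sqrt(2 * np.pi * s2)

    s2 = 1.0 / (1.0 / f.sigma2 + 1.0 / tau2)           # fused Gaussian variance
    m = s2 * (f.mu / f.sigma2 + x / tau2)              # fused Gaussian mean

    c1 = f.a / (f.a + f.b) * norm_pdf(x, f.mu, f.sigma2 + tau2)   # inlier weight
    c2 = f.b / (f.a + f.b) / inv_depth_range                      # outlier weight
    c1, c2 = c1 / (c1 + c2), c2 / (c1 + c2)

    # First and second moments of rho under the (unnormalized) true posterior.
    fm = c1 * (f.a + 1) / (f.a + f.b + 1) + c2 * f.a / (f.a + f.b + 1)
    e = (c1 * (f.a + 1) * (f.a + 2) / ((f.a + f.b + 1) * (f.a + f.b + 2)) +
         c2 * f.a * (f.a + 1) / ((f.a + f.b + 1) * (f.a + f.b + 2)))

    mu_new = c1 * m + c2 * f.mu
    sigma2_new = c1 * (s2 + m * m) + c2 * (f.sigma2 + f.mu * f.mu) - mu_new * mu_new
    a_new = (e - fm) / (fm - e / fm)                   # match the Beta moments
    b_new = a_new * (1.0 - fm) / fm

    f.a, f.b, f.mu, f.sigma2 = a_new, b_new, mu_new, sigma2_new
    return f
```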
S24: corners whose valid-measurement probability is too low (less than 0.1) are rejected, while corners whose uncertainty σ_n meets the requirement (σ_n less than 1/200 of the maximum estimate of the inverse depth of the corner) are added to the map;
23) The depth map estimation thread handles two situations. If the newly acquired image was selected as a new key frame in step S12 of the motion estimation thread, edge points are extracted from the newly added key frame and steps S31 to S33 are executed. If the newly input image was not selected as a key frame, steps S34 to S36 are executed using the pose of the current frame obtained in step S14 of the motion estimation thread: the probabilistic depth filters of the edge points are updated with the new image information, and edge points whose valid-observation probability is too low are removed. The execution steps are shown in FIG. 5 and detailed as follows: first judge whether the new image is a key frame; if yes, execute steps S31-S33, otherwise execute steps S34-S36;
S31: in the newly added key frame, the Sobel operator (a common method for edge point detection) is used to compute the horizontal and vertical gray-level gradients G_x and G_y of the image as an approximation of the gray-level gradient map. The 2500 to 4000 pixels with the largest gray-level gradient are selected from the image; when the number of pixels with a significant gradient is below 2500, the gradient threshold (initially 450) is reduced to 0.95 times its previous value, and when the number exceeds 4000 the threshold is increased to 1.05 times its previous value;
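A hedged sketch of this edge-point selection with OpenCV follows; the gradient-magnitude approximation and the kernel size are assumptions, while the adaptation of the threshold follows the description above.

```python
# Hedged sketch of step S31: pick high-gradient pixels as edge points and
# adapt the gradient threshold so their count stays within [2500, 4000].
import numpy as np
import cv2

def extract_edge_points(img, threshold=450.0):
    gx = cv2.Sobel(img, cv2.CV_32F, 1, 0, ksize=3)   # horizontal gradient G_x
    gy = cv2.Sobel(img, cv2.CV_32F, 0, 1, ksize=3)   # vertical gradient G_y
    mag = np.abs(gx) + np.abs(gy)                    # one common magnitude approximation
    ys, xs = np.nonzero(mag > threshold)
    count = len(xs)
    if count < 2500:
        threshold *= 0.95                            # too few strong-gradient pixels
    elif count > 4000:
        threshold *= 1.05                            # too many strong-gradient pixels
    order = np.argsort(mag[ys, xs])[::-1][:4000]     # keep at most the 4000 strongest
    points = np.stack([xs[order], ys[order]], axis=1)
    return points, threshold                         # edge points + updated threshold
```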
S32: denote the new key frame by r_m and the previous key frame by r_{m-1}; this step propagates the depth map D^{m-1} of frame r_{m-1} to the depth map D^m of the new key frame. In r_{m-1} the edge point u_{e,i} has inverse-depth mean μ_i and inverse-depth variance σ_i^2, where μ_i denotes the mean of the inverse depth of the edge point, σ_i^2 denotes the variance of the inverse depth, and the subscript i denotes the i-th edge point of frame r_{m-1}. The relative pose from r_{m-1} to r_m is T_{m,m-1}; the position u'_{e,i} of the edge point corresponding to u_{e,i} in frame r_m can be computed from the projection relation, as in (3-1):

u'_{e,i} = π( T_{m,m-1} · π^{-1}( u_{e,i}, 1/μ_i ) )    (3-1)

Denote the inverse depth of u'_{e,i} by μ'_i and its inverse-depth variance by σ'_i^2; then (3-2) gives the propagated inverse depth, where t_z denotes the translation of the camera along the optical axis, and from this the inverse-depth variance σ'_i^2 of u'_{e,i} follows as in (3-3), where σ_p^2 denotes the prediction uncertainty, set empirically. Because of projection errors, u'_{e,i} is generally not at an integer position; it is bound to the nearest pixel of r_m.
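The following sketch illustrates one common way such a propagation can be implemented: the depth is shifted by the motion along the optical axis and the variance is inflated by a prediction term, in the style of LSD-SLAM depth-map propagation. Since the patent's equations (3-2)/(3-3) are not reproduced here, the exact expressions below are assumptions consistent with the surrounding text.

```python
# Hedged sketch of step S32: propagate an edge point's inverse-depth estimate
# from the previous key frame r_{m-1} into the new key frame r_m.
import numpy as np

def propagate_edge_point(u, mu, sigma2, T_new_old, K, sigma_p2=0.05 ** 2):
    """u: 2D edge point in r_{m-1}; mu, sigma2: its inverse-depth mean/variance there.
    T_new_old: 4x4 relative pose T_{m,m-1}. sigma_p2: prediction uncertainty (assumed value).
    Returns (u_new, mu_new, sigma2_new) in r_m, or None if the point leaves the image plane."""
    d_old = 1.0 / mu
    p_old = d_old * np.linalg.inv(K) @ np.array([u[0], u[1], 1.0])   # back-project
    p_new = T_new_old[:3, :3] @ p_old + T_new_old[:3, 3]             # into r_m's frame (eq. 3-1)
    if p_new[2] <= 0:
        return None
    uvw = K @ p_new
    u_new = np.round(uvw[:2] / uvw[2]).astype(int)      # bind to the nearest pixel

    d_new = p_new[2]          # new depth; differs from d_old by the motion along the optical axis
    mu_new = 1.0 / d_new
    # Variance propagation: scale by the squared Jacobian of the inverse-depth change,
    # then add the empirical prediction uncertainty sigma_p^2 (assumed form).
    sigma2_new = (mu_new / mu) ** 4 * sigma2 + sigma_p2
    return u_new, mu_new, sigma2_new
```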
S33: a probabilistic depth filter is created for every edge point extracted in step S31. If a prior estimate propagated in step S32 is stored within the 3x3 pixel neighbourhood of an edge point, the mean μ of its inverse depth is initialized from that prior estimate as in (3-4), and the inverse-depth uncertainty of the filter is initialized to the minimum uncertainty among the neighbouring prior estimates. If no prior estimate exists, the probabilistic depth filter is initialized to the average depth and the maximum uncertainty. A probabilistic depth filter is then constructed for every edge point according to (2-1), as in (3-5), where each parameter of (3-5) is defined as in (2-1), now referring to the probabilistic depth filter of an edge point: a_n and b_n take the initial value 10, and μ_n, σ_n are initialized to the values obtained above.
S34: the new key frame r_m is set as the reference frame for the subsequent images, which are used to perform measurement updates on the depth map D^m of frame r_m. When an image I_k is used to update D^m, denote the edge point being updated by u_{e,i}. According to the relative pose between I_k and r_m, u_{e,i} determines a straight line in I_k on which the pixel of I_k corresponding to u_{e,i} is guaranteed to lie. If the angle between the gray-gradient direction of the edge point and the direction of this line is smaller than a threshold (set to 25 degrees in this embodiment), block matching with sub-pixel precision is performed to obtain the coordinates of the corresponding pixel, the depth of the edge point is computed by triangulation, and step S35 is executed; if the angle is larger than the threshold, no search is performed and the procedure returns to the beginning of step S34 to check the next edge point.
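The angle test in S34 can be sketched as follows; the representation of the epipolar direction as a 2D unit vector is an assumed implementation detail.

```python
# Hedged sketch of the S34 gating test: only search along the epipolar line if
# the edge point's gradient direction is well aligned with that line.
import numpy as np

def epipolar_search_allowed(grad, epi_dir, max_angle_deg=25.0):
    """grad: (G_x, G_y) gray gradient at the edge point.
    epi_dir: 2D direction of the epipolar line in the current image."""
    g = np.asarray(grad, float)
    e = np.asarray(epi_dir, float)
    g /= np.linalg.norm(g)
    e /= np.linalg.norm(e)
    cos_angle = abs(float(g @ e))                    # sign-free alignment measure
    return cos_angle >= np.cos(np.radians(max_angle_deg))
```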
S35: take the reciprocal λ_i^n of the depth obtained in step S34, where the subscript i denotes the i-th edge point of the current key frame (a positive integer) and the superscript n indicates that this is the n-th measurement update of the depth filter of this edge point (a positive integer). The measurement of the inverse depth is modeled as in (3-6), in the same form as (2-2): for the i-th edge point of key frame r_m, if the measurement obtained through image I_k is valid, the reciprocal λ_i^n of the measured depth follows the normal distribution N(λ_i, τ_n^2); if it is an invalid measurement, λ_i^n follows the uniform distribution U(λ_min, λ_max), where λ_min and λ_max are the minimum and maximum of the inverse depth, set from prior information about the current scene. After the measurement update with λ_i^n, the posterior probability distribution of the inverse depth λ_i has the same form as (2-3), and a_n, b_n, μ_n and σ_n^2 are computed by the moment-matching method.
S36: edge points whose valid-measurement probability is too low (less than 0.1) are rejected.
The above description only illustrates preferred embodiments of the present invention and is not to be construed as limiting the invention. The scope of the invention is not restricted to these embodiments; those skilled in the art may make various changes, modifications, alterations and substitutions without departing from the spirit, principle and scope of the invention, and all equivalent technical means shall fall within the scope of the invention, which is defined by the appended claims and their equivalents.

Claims (4)

1. A UAV monocular vision ranging method based on edge point information in an image, characterized by comprising initialization and the parallel processing of motion estimation, map construction and depth map estimation; the method specifically comprises the following steps:
1) initialization: two frames are selected from the image sequence captured by a downward-looking monocular camera fixedly attached to the unmanned aerial vehicle to construct an initial map and an initial depth map. The map is represented by a set of key frames {r_m} and a set of three-dimensional feature points {p_i}, where r_m denotes the m-th key frame (m a positive integer) and p_i denotes the i-th three-dimensional feature point (i a positive integer); the three-dimensional feature points correspond one-to-one to the two-dimensional corner features in the image, and the set of two-dimensional corners is written {u_{c,i}}, where u_{c,i} denotes the i-th corner (i a positive integer). The depth map consists of edge point coordinates, edge point depth values and the uncertainty of the inverse depth; depth maps correspond one-to-one to key frames and are written D^m = {(u_{e,i}, d_{e,i}, σ_{e,i})}, where D^m denotes the depth map corresponding to the m-th key frame, u_{e,i} denotes the i-th edge point, d_{e,i} denotes the depth value of the i-th edge point, and σ_{e,i} denotes the uncertainty of the inverse depth of the i-th edge point;
taking a first frame as a first key frame in a map, taking a camera coordinate system corresponding to the first frame as a world coordinate system, and finishing initialization;
2) after map and depth map information is initialized, three threads of motion estimation, map construction and depth map estimation are performed in parallel: the method comprises the steps that a motion estimation thread aligns current frames newly acquired by a camera by using known map and depth map information to obtain a range finding result, existing map information is optimized according to the range finding result, and the map construction and depth map estimation thread runs simultaneously to maintain the map and depth map information.
2. The method as claimed in claim 1, wherein the motion estimation thread among the three parallel threads of motion estimation, map construction and depth map estimation in step 2) comprises the following specific steps:
image alignment based on a motion model is carried out for the current frame image newly acquired by the camera using the edge point depth information in the current key frame, yielding an initial estimate T^-_{k,w} of the pose T_{k,w} of the current frame I_k in the world coordinate system, where I_k denotes the k-th image acquired by the camera, i.e. the current frame (k a positive integer), T_{k,w} denotes the pose of the current frame in the world coordinate system, and T^-_{k,w} denotes its initial estimate; using this initial estimate, the existing three-dimensional feature points in the map are projected into the current image to obtain the set {p_i} of three-dimensional feature points observable by the current frame, and block matching yields the corresponding two-dimensional corner set {u_{c,i}}; with {u_{c,i}} as observations and T^-_{k,w} as the initial value, the pose T_{k,w} of the current frame in the world coordinate system w and the positions of the relevant three-dimensional feature points in the map are optimized in turn;
S11: denote the current frame image by I_k; the current pose of the drone is first initialized to the pose of the previous moment, i.e. T^-_{k,w} = T_{k-1,w}; with the current key frame r_m as the reference frame, a Gauss-Newton iterative algorithm is used to minimize the weighted gray-scale residual between I_k and r_m, as in (1-1):

T_{k,m} = arg min_{T_{k,m}} Σ_i w_i(u_{e,i}) · δR(T_{k,m}, u_{e,i})^2    (1-1)

which yields the pose T_{k,m} of the current frame I_k relative to the reference frame r_m, where the sum runs over the set of edge points of the reference frame r_m whose inverse-depth uncertainty is below the set uncertainty threshold and which are visible in the current frame, u_{e,i} denotes the i-th visible edge point, w_i(u_{e,i}) denotes its weight, determined by σ_{e,i}, the uncertainty of the inverse depth of the i-th edge point, and δR(T_{k,m}, u_{e,i}) denotes the gray-scale residual of the edge point u_{e,i}, given by (1-2):

δR(T_{k,m}, u_{e,i}) = I_k( π( T_{k,m} · π^{-1}(u_{e,i}, d_{e,i}) ) ) − r_m(u_{e,i})    (1-2)

where π is the projection function that projects three-dimensional points in the camera coordinate system into the two-dimensional image coordinate system, π^{-1} is the inverse of the projection function, and I_k(·) and r_m(·) denote the intensities of the pixel in frame I_k and frame r_m, respectively; combining this with the pose T_{m,w} of the reference frame r_m relative to the world coordinate system, the initial estimate of the pose of the current frame is computed as in (1-3):

T^-_{k,w} = T_{k,m} · T_{m,w}    (1-3)
if the number of the edge points participating in the weighting is less than the set edge point threshold, turning to the step S12, otherwise, directly entering the step S13;
S12: using the pose estimate obtained in step S11 as the initial value, take the previous frame I_{k-1} as the reference frame of the current frame and perform image alignment, minimizing the gray-scale residual between I_k and I_{k-1} with a Gauss-Newton iterative optimization to obtain the pose T_{k,k-1} of the current frame relative to the previous frame, as in (1-4):

T_{k,k-1} = arg min_{T_{k,k-1}} Σ_i δR(T_{k,k-1}, u_{c,i})^2    (1-4)

where the sum runs over the set of corner points of frame k-1 whose depth is known and which are visible in the current frame, u_{c,i} denoting the i-th such corner; combining this with the pose T_{k-1,w} of the previous frame relative to the world coordinate system, the initial estimate of the pose of the current frame is computed as in (1-5):

T^-_{k,w} = T_{k,k-1} · T_{k-1,w}    (1-5)

at this point the current frame I_k is selected as a new key frame and serves as the reference frame for the subsequent images; it also serves as input to the map construction thread and the depth map estimation thread;
S13: from the initial estimate T^-_{k,w}, determine the set {p_i} of three-dimensional feature points of the map observed by the current frame and obtain the estimated coordinates of the corresponding corner points with respect to the reference frame r_m; using these estimates as initial values, the true imaging positions of {p_i} in the current frame are computed with the image tracking algorithm KLT, updating the two-dimensional corner set {u_{c,i}}, as in (1-6), where A_i denotes the transformation matrix warping the reference image block into the current image, and the average gray-level difference between the reference block and the matching block in the current image is included to remove illumination effects;
S14: the two-dimensional coordinates obtained in step S13 and the initial pose estimate T^-_{k,w} of the current frame no longer satisfy equation (1-7):

u_{c,i} = π( T_{k,w} · p_i )    (1-7)

with T^-_{k,w} as the initial value, the pose of the current frame is updated by minimizing the reprojection error with respect to T_{k,w}, i.e. the final result of the motion estimation, as in (1-8):

T_{k,w} = arg min_{T_{k,w}} Σ_i || u_{c,i} − π( T_{k,w} · p_i ) ||^2    (1-8)

T_{k,w} is the final ranging result; go to step S15, and pass the current frame pose T_{k,w} obtained in step S14 as input to the map construction thread and the depth map estimation thread to compute corner depth values and edge point depth values;
S15: using the current frame pose T_{k,w} obtained in step S14, each three-dimensional feature point in the map that can be observed by the current frame is optimized individually, so that the sum of squared errors between its projections and its true imaging positions in all key frames that observe it is minimized, i.e. as in (1-9):

p_i = arg min_{p_i} Σ_j || u^j_{c,i} − π( T_{j,w} · p_i ) ||^2    (1-9)

where the superscript j denotes the j-th key frame in which the three-dimensional feature point p_i can be observed.
3. The method as claimed in claim 2, wherein the map construction thread among the three parallel threads of motion estimation, map construction and depth map estimation in step 2) comprises the following specific steps:
the map construction thread handles two situations: if the newly acquired image was selected as a new key frame in step S12 of the motion estimation thread, step S21 is executed; otherwise, steps S22-S24 are executed using the pose of the current frame obtained in step S14 of the motion estimation thread; the specific steps are as follows:
S21: when a frame is selected as a key frame, corner features are extracted from it with the FAST algorithm; the image is first divided evenly by a grid with equal spacing, and at most one corner is extracted per grid cell so that the corners are distributed uniformly over the image and their total number does not exceed the number of cells; for each new corner u_{c,i}, a probabilistic depth filter is created, and the joint posterior distribution of the inverse depth λ_i and the probability ρ_i that an observation is valid is defined as in (2-1):

q( λ_i, ρ_i | a_n, b_n, μ_n, σ_n^2 ) = Beta( ρ_i | a_n, b_n ) · N( λ_i | μ_n, σ_n^2 )    (2-1)

where Beta and N denote the Beta distribution and the normal distribution, respectively, the subscript n indicates that the current probabilistic depth filter has undergone n parameter updates (n a positive integer), the parameters a_n and b_n are the numbers of valid and invalid observations accumulated during the recursive update, and μ_n and σ_n^2 are the mean and variance of the Gaussian distribution of the inverse depth; when the parameters of (2-1) are initialized, a_n and b_n take the initial value 10, μ_n is initialized to the inverse of the mean depth of the image, and σ_n^2 is initialized to a maximum value;
S22: using the image I_k with known camera pose obtained in step S14, the depth estimates of the corner features are updated; denote the corner being updated by u_{c,i}, located in key frame r_c; according to the relative pose between the current frame I_k and r_c, u_{c,i} determines a straight line in I_k on which the pixel of I_k corresponding to u_{c,i} is guaranteed to lie; an epipolar search is performed along this line, and after the coordinates of the matching pixel have been obtained by block matching with sub-pixel precision, the depth of the corner u_{c,i} is computed by triangulation;
S23: take the reciprocal λ_i^n of the depth obtained in step S22, where the superscript n indicates that this is the n-th observation update of the depth filter of the current corner; the measurement of the inverse depth is modeled as in (2-2):

p( λ_i^n | λ_i, ρ_i ) = ρ_i · N( λ_i^n | λ_i, τ_n^2 ) + (1 − ρ_i) · U( λ_i^n | λ_min, λ_max )    (2-2)

the meaning of which is: for the i-th corner of key frame r_c, if the measurement obtained through image I_k is valid, the reciprocal λ_i^n of the measured depth follows the normal distribution N(λ_i, τ_n^2), where τ_n^2 is the inverse-depth variance caused by a one-pixel error in the image plane; if it is an invalid measurement, λ_i^n follows the uniform distribution U(λ_min, λ_max), where λ_min and λ_max are the minimum and maximum of the inverse depth, set from prior information about the current scene; after the observation update with λ_i^n, the posterior probability distribution of the inverse depth λ_i is as in (2-3):

p( λ_i, ρ_i | λ_i^1, …, λ_i^n ) = C · p( λ_i^n | λ_i, ρ_i ) · q( λ_i, ρ_i | a_{n-1}, b_{n-1}, μ_{n-1}, σ_{n-1}^2 )    (2-3)

where C is a constant ensuring the normalization of the probability distribution, and a_n, b_n, μ_n and σ_n^2 are computed by the moment-matching method;
S24: corners whose valid-measurement probability is less than 0.1 are rejected, and corners whose uncertainty σ_n meets the requirement are added to the map.
4. The method as claimed in claim 3, wherein the depth map estimation thread among the three parallel threads of motion estimation, map construction and depth map estimation in step 2) comprises the following specific steps:
if the newly acquired image was selected as a new key frame in step S12 of the motion estimation thread, edge points are extracted from the newly added key frame and steps S31-S33 are executed; otherwise, steps S34-S36 are executed using the pose of the current frame obtained in step S14 of the motion estimation thread, the probabilistic depth filters of the edge points are updated with the new image information, and edge points whose valid-observation probability is too low are removed; the specific steps are as follows:
S31: in the newly added key frame, edge points are extracted: the Sobel operator is used to compute the horizontal and vertical gray-level gradients G_x and G_y of the image as an approximation of the gray-level gradient map; the 2500 to 4000 pixels with the largest gray-level gradient are selected from the image; when the number of pixels with a significant gradient is below 2500, the gradient threshold (initially 450) is reduced to 0.95 times its previous value, and when the number exceeds 4000 the threshold is increased to 1.05 times its previous value;
S32: denote the new key frame by r_v and the previous key frame by r_{v-1}; the depth map D^{v-1} of frame r_{v-1} is propagated to the depth map D^v of the new key frame; in r_{v-1}, the edge point u_{e,i} has inverse-depth mean μ_i and inverse-depth variance σ_i^2, the subscript i denoting the i-th edge point of frame r_{v-1}; the relative pose from r_{v-1} to r_v is T_{v,v-1}, and the position u'_{e,i} of the edge point corresponding to u_{e,i} in frame r_v is computed from the projection relation, as in (3-1):

u'_{e,i} = π( T_{v,v-1} · π^{-1}( u_{e,i}, 1/μ_i ) )    (3-1)

denote the inverse depth of u'_{e,i} by μ'_i and its inverse-depth variance by σ'_i^2; then (3-2) gives the propagated inverse depth, where t_z denotes the translation of the camera along the optical axis, and from this the inverse-depth variance σ'_i^2 of u'_{e,i} follows as in (3-3), where σ_p^2 denotes the prediction uncertainty; u'_{e,i} is bound to the nearest integer pixel of r_v;
S33: a probabilistic depth filter is created for every edge point extracted in step S31; if a prior estimate is stored within the 3x3 pixel neighbourhood of an edge point, the mean μ of its inverse depth is initialized from that prior estimate as in (3-4), and the inverse-depth uncertainty of the edge point's depth filter is initialized to the minimum uncertainty among the prior estimates; if no prior estimate exists, the probabilistic depth filter of the edge point is initialized to the average depth and the maximum uncertainty; a probabilistic depth filter is then constructed for each edge point according to (2-1), as in (3-5), where each parameter of (3-5) is defined as in (2-1), now referring to the depth filter of an edge point: a_n and b_n take the initial value 10, and μ_n, σ_n are initialized to the values obtained above;
S34: the new key frame r_v is set as the reference frame for the subsequent images, which are used to perform observation updates on the depth map D^v of frame r_v; when an image I_k is used to update D^v, denote the edge point being updated by u_{e,i}; according to the relative pose between I_k and r_v, u_{e,i} determines a straight line in I_k on which the pixel of I_k corresponding to u_{e,i} is guaranteed to lie; if the angle between the gray-gradient direction of the edge point and the direction of this line is smaller than the threshold, block matching with sub-pixel precision is performed to obtain the coordinates of the corresponding pixel, the depth of the edge point is computed by triangulation, and step S35 is executed; if the angle is larger than the threshold, no search is performed and the procedure returns to step S34 to check the next edge point;
S35: take the reciprocal λ_i^n of the depth obtained in step S34; the measurement of the inverse depth is modeled as in (3-6):

p( λ_i^n | λ_i, ρ_i ) = ρ_i · N( λ_i^n | λ_i, τ_n^2 ) + (1 − ρ_i) · U( λ_i^n | λ_min, λ_max )    (3-6)

the meaning of which is: for the i-th edge point of key frame r_v, if the measurement obtained through image I_k is valid, the reciprocal λ_i^n of the measured depth follows the normal distribution N(λ_i, τ_n^2); if it is an invalid measurement, λ_i^n follows the uniform distribution U(λ_min, λ_max), where λ_min and λ_max are the minimum and maximum of the inverse depth, set from prior information about the current scene; after the observation update with λ_i^n, the posterior probability distribution of the inverse depth λ_i is as in (2-3), and a_n, b_n, μ_n and σ_n^2 are computed by the moment-matching method;
S36: edge points whose valid-measurement probability is less than 0.1 are rejected.
CN201610131438.2A 2016-03-08 2016-03-08 A kind of monocular vision ranging method based on point information in edge in image Active CN105809687B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610131438.2A CN105809687B (en) 2016-03-08 2016-03-08 A kind of monocular vision ranging method based on point information in edge in image

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610131438.2A CN105809687B (en) 2016-03-08 2016-03-08 A kind of monocular vision ranging method based on point information in edge in image

Publications (2)

Publication Number Publication Date
CN105809687A CN105809687A (en) 2016-07-27
CN105809687B true CN105809687B (en) 2019-09-27

Family

ID=56467948

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610131438.2A Active CN105809687B (en) 2016-03-08 2016-03-08 A kind of monocular vision ranging method based on point information in edge in image

Country Status (1)

Country Link
CN (1) CN105809687B (en)

Families Citing this family (31)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106570820B (en) * 2016-10-18 2019-12-03 浙江工业大学 A kind of monocular vision three-dimensional feature extracting method based on quadrotor drone
CN106652028B (en) * 2016-12-28 2020-07-03 深圳乐动机器人有限公司 Environment three-dimensional mapping method and device
CN110177532A (en) * 2017-01-22 2019-08-27 四川金瑞麒智能科学技术有限公司 A kind of intelligent wheelchair system based on big data and artificial intelligence
CN110099656A (en) * 2017-01-22 2019-08-06 四川金瑞麒智能科学技术有限公司 System and method for controlling intelligent wheel chair
CN106920279B (en) * 2017-03-07 2018-06-19 百度在线网络技术(北京)有限公司 Three-dimensional map construction method and device
CN107341814B (en) * 2017-06-14 2020-08-18 宁波大学 Four-rotor unmanned aerial vehicle monocular vision range measurement method based on sparse direct method
CN107687850B (en) * 2017-07-26 2021-04-23 哈尔滨工业大学深圳研究生院 Unmanned aerial vehicle pose estimation method based on vision and inertia measurement unit
CN107527366B (en) * 2017-08-23 2020-04-10 上海视智电子科技有限公司 Camera tracking method for depth camera
CN107888828B (en) * 2017-11-22 2020-02-21 杭州易现先进科技有限公司 Space positioning method and device, electronic device, and storage medium
WO2019104571A1 (en) * 2017-11-30 2019-06-06 深圳市大疆创新科技有限公司 Image processing method and device
CN108151728A (en) * 2017-12-06 2018-06-12 华南理工大学 A kind of half dense cognitive map creation method for binocular SLAM
CN108398139B (en) * 2018-03-01 2021-07-16 北京航空航天大学 Dynamic environment vision mileometer method fusing fisheye image and depth image
CN108615244B (en) * 2018-03-27 2019-11-15 中国地质大学(武汉) A kind of image depth estimation method and system based on CNN and depth filter
CN108615246B (en) * 2018-04-19 2021-02-26 浙江大承机器人科技有限公司 Method for improving robustness of visual odometer system and reducing calculation consumption of algorithm
CN108759833B (en) * 2018-04-25 2021-05-25 中国科学院合肥物质科学研究院 Intelligent vehicle positioning method based on prior map
CN108986037B (en) * 2018-05-25 2020-06-16 重庆大学 Monocular vision odometer positioning method and positioning system based on semi-direct method
CN108765481B (en) * 2018-05-25 2021-06-11 亮风台(上海)信息科技有限公司 Monocular video depth estimation method, device, terminal and storage medium
GB2577062B (en) * 2018-09-11 2021-04-28 Advanced Risc Mach Ltd Methods, apparatus and processor for producing a higher resolution frame
CN111260698B (en) * 2018-12-03 2024-01-02 北京魔门塔科技有限公司 Binocular image feature matching method and vehicle-mounted terminal
CN109798897B (en) * 2019-01-22 2022-07-01 广东工业大学 Method for improving monocular vision positioning reliability through environment model integrity evaluation
CN110044358B (en) * 2019-04-29 2020-10-02 清华大学 Mobile robot positioning method based on field line characteristics
CN110135376A (en) * 2019-05-21 2019-08-16 北京百度网讯科技有限公司 Determine method, equipment and the medium of the coordinate system conversion parameter of imaging sensor
CN110349212B (en) * 2019-06-28 2023-08-25 Oppo广东移动通信有限公司 Optimization method and device for instant positioning and map construction, medium and electronic equipment
CN110428462B (en) * 2019-07-17 2022-04-08 清华大学 Multi-camera stereo matching method and device
CN112364677A (en) * 2020-11-23 2021-02-12 盛视科技股份有限公司 Robot vision positioning method based on two-dimensional code
CN112632426B (en) * 2020-12-22 2022-08-30 新华三大数据技术有限公司 Webpage processing method and device
CN112907742B (en) * 2021-02-18 2024-07-16 湖南国科微电子股份有限公司 Visual synchronous positioning and mapping method, device, equipment and medium
CN113689400B (en) * 2021-08-24 2024-04-19 凌云光技术股份有限公司 Method and device for detecting profile edge of depth image section
CN113793417A (en) * 2021-09-24 2021-12-14 东北林业大学 Monocular SLAM method capable of creating large-scale map
CN115273538A (en) * 2022-08-29 2022-11-01 王炜程 GNSS-RTK technology-based parking space detection system and deployment and working methods thereof
CN116993740B (en) * 2023-09-28 2023-12-19 山东万世机械科技有限公司 Concrete structure surface defect detection method based on image data

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101377405A (en) * 2008-07-11 2009-03-04 北京航空航天大学 Vision measuring method of space round gesture parameter and geometric parameter
CN102128617A (en) * 2010-12-08 2011-07-20 中国科学院自动化研究所 Vision real-time measuring method based on color code block
CN103926933A (en) * 2014-03-29 2014-07-16 北京航空航天大学 Indoor simultaneous locating and environment modeling method for unmanned aerial vehicle
CN105157708A (en) * 2015-10-10 2015-12-16 南京理工大学 Unmanned aerial vehicle autonomous navigation system and method based on image processing and radar
CN105225241A (en) * 2015-09-25 2016-01-06 广州极飞电子科技有限公司 The acquisition methods of unmanned plane depth image and unmanned plane

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9981605B2 (en) * 2014-05-16 2018-05-29 GM Global Technology Operations LLC Surround-view camera system (VPM) and vehicle dynamic

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101377405A (en) * 2008-07-11 2009-03-04 北京航空航天大学 Vision measuring method of space round gesture parameter and geometric parameter
CN102128617A (en) * 2010-12-08 2011-07-20 中国科学院自动化研究所 Vision real-time measuring method based on color code block
CN103926933A (en) * 2014-03-29 2014-07-16 北京航空航天大学 Indoor simultaneous locating and environment modeling method for unmanned aerial vehicle
CN105225241A (en) * 2015-09-25 2016-01-06 广州极飞电子科技有限公司 The acquisition methods of unmanned plane depth image and unmanned plane
CN105157708A (en) * 2015-10-10 2015-12-16 南京理工大学 Unmanned aerial vehicle autonomous navigation system and method based on image processing and radar

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Sayanan Sivaraman et al.; "Combining Monocular and Stereo-Vision for Real-Time Vehicle Ranging and Tracking on Multilane Highways"; 2011 14th International IEEE Conference on Intelligent Transportation Systems; 2011-10-07; pp. 1249-1254 *
刘艳丽; "Research on three-dimensional simultaneous localization and mapping fusing color and depth information"; China Doctoral Dissertations Full-text Database, Information Science and Technology; 2014-12-15; Chapter 5 *

Also Published As

Publication number Publication date
CN105809687A (en) 2016-07-27

Similar Documents

Publication Publication Date Title
CN105809687B (en) A kind of monocular vision ranging method based on point information in edge in image
CN110070615B (en) Multi-camera cooperation-based panoramic vision SLAM method
CN109211241B (en) Unmanned aerial vehicle autonomous positioning method based on visual SLAM
CN108242079B (en) VSLAM method based on multi-feature visual odometer and graph optimization model
WO2020113423A1 (en) Target scene three-dimensional reconstruction method and system, and unmanned aerial vehicle
CN108229416B (en) Robot SLAM method based on semantic segmentation technology
CN108519102B (en) Binocular vision mileage calculation method based on secondary projection
CN114018236A (en) Laser vision strong coupling SLAM method based on adaptive factor graph
CN111812978B (en) Cooperative SLAM method and system for multiple unmanned aerial vehicles
CN110749308A (en) SLAM-oriented outdoor positioning method using consumer-grade GPS and 2.5D building models
CN109871024A (en) A kind of UAV position and orientation estimation method based on lightweight visual odometry
CN117367427A (en) Multi-mode slam method applicable to vision-assisted laser fusion IMU in indoor environment
CN110598370A (en) Robust attitude estimation of multi-rotor unmanned aerial vehicle based on SIP and EKF fusion
CN114529585A (en) Mobile equipment autonomous positioning method based on depth vision and inertial measurement
CN112945233A (en) Global drift-free autonomous robot simultaneous positioning and map building method
CN114812601A (en) State estimation method and device of visual inertial odometer and electronic equipment
CN117237789A (en) Method for generating texture information point cloud map based on panoramic camera and laser radar fusion
KR101766823B1 (en) Robust visual odometry system and method to irregular illumination changes
CN117330052A (en) Positioning and mapping method and system based on infrared vision, millimeter wave radar and IMU fusion
CN116704032A (en) Outdoor visual SLAM method based on monocular depth estimation network and GPS
CN117671022B (en) Mobile robot vision positioning system and method in indoor weak texture environment
CN113744301B (en) Motion trail estimation method and device for mobile robot and storage medium
Jia et al. Mobile robot vision odometer based on point-line features and graph optimization
CN116399350B (en) Method for determining semi-direct method visual odometer fused with YOLOv5
Macesanu et al. Computer vision based Mobile robot navigation in unknown environments

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant