CN111899276A - SLAM method and system based on binocular event camera - Google Patents
SLAM method and system based on binocular event camera
- Publication number
- CN111899276A (application CN202010647021.8A)
- Authority
- CN
- China
- Prior art keywords
- event
- camera
- time
- points
- imu
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Links
- 238000000034 method Methods 0.000 title claims abstract description 46
- 230000033001 locomotion Effects 0.000 claims abstract description 42
- 238000001514 detection method Methods 0.000 claims abstract description 12
- 238000005457 optimization Methods 0.000 claims abstract description 11
- 230000010354 integration Effects 0.000 claims abstract description 9
- 238000013507 mapping Methods 0.000 claims abstract description 8
- 238000004364 calculation method Methods 0.000 claims abstract description 6
- 239000011159 matrix material Substances 0.000 claims description 16
- 230000008859 change Effects 0.000 abstract description 6
- 238000005286 illumination Methods 0.000 abstract description 3
- 230000001133 acceleration Effects 0.000 description 8
- 230000003287 optical effect Effects 0.000 description 7
- 238000013519 translation Methods 0.000 description 6
- 238000010586 diagram Methods 0.000 description 5
- 238000003384 imaging method Methods 0.000 description 4
- 230000008569 process Effects 0.000 description 4
- 238000009825 accumulation Methods 0.000 description 3
- 238000010276 construction Methods 0.000 description 3
- 230000009466 transformation Effects 0.000 description 3
- 230000001186 cumulative effect Effects 0.000 description 2
- 238000000354 decomposition reaction Methods 0.000 description 2
- 230000000694 effects Effects 0.000 description 2
- 238000005259 measurement Methods 0.000 description 2
- 230000008447 perception Effects 0.000 description 2
- 230000004075 alteration Effects 0.000 description 1
- 230000009286 beneficial effect Effects 0.000 description 1
- 238000013461 design Methods 0.000 description 1
- 238000011161 development Methods 0.000 description 1
- 230000006870 function Effects 0.000 description 1
- 230000004807 localization Effects 0.000 description 1
- 238000012986 modification Methods 0.000 description 1
- 230000004048 modification Effects 0.000 description 1
- 238000012545 processing Methods 0.000 description 1
- 238000003672 processing method Methods 0.000 description 1
- 238000005295 random walk Methods 0.000 description 1
- 238000012827 research and development Methods 0.000 description 1
- 210000001525 retina Anatomy 0.000 description 1
- 230000001960 triggered effect Effects 0.000 description 1
Images
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/50—Depth or shape recovery
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/16—Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/207—Analysis of motion for motion estimation over a hierarchy of resolutions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/246—Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/70—Determining position or orientation of objects or cameras
- G06T7/73—Determining position or orientation of objects or cameras using feature-based methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/80—Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
- G06T7/85—Stereo camera calibration
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Multimedia (AREA)
- Mathematical Physics (AREA)
- Mathematical Analysis (AREA)
- Mathematical Optimization (AREA)
- Computational Mathematics (AREA)
- Pure & Applied Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Computing Systems (AREA)
- Algebra (AREA)
- Databases & Information Systems (AREA)
- Software Systems (AREA)
- General Engineering & Computer Science (AREA)
- Image Analysis (AREA)
Abstract
The invention provides a binocular event camera-based SLAM method and system, which comprises: performing motion compensation on input left and right event camera data with IMU assistance to obtain corresponding reconstructed images, wherein the coordinates of the event points are projected into a reference coordinate system through the relative pose obtained by IMU integration, and the depth of each event point is replaced by the median depth of its neighboring three-dimensional space points; performing feature point detection and tracking on the corresponding reconstructed images input by the left and right event cameras, respectively; triangulating the detected and tracked feature points to obtain the three-dimensional coordinate points of the target and the pose changes between images, and calculating the camera pose with the PnP method; and performing back-end BA optimization combined with IMU pre-integration to obtain the camera motion trajectory and scene mapping information. The technical scheme of the invention can cope with scenes with large illumination changes and high-speed motion, and can solve the problem that a robot in a conventional SLAM system is prone to failure when the motion or the environment is too complex.
Description
Technical Field
The invention belongs to the field of image processing, and particularly relates to a technical scheme for realizing SLAM by using a binocular event camera.
Background
In the past decades, interest in robot perception has grown steadily with the research and development of computer vision methods. Conventional optical cameras can capture rich information about the camera's surroundings and have become the most popular sensors in various applications due to their low cost and wide availability.
Simultaneous Localization and Mapping (SLAM) is one of the most important milestones in the field of robot perception and has achieved remarkable success over the last 30 years. A monocular event camera SLAM cannot truly recover scale, and a monocular camera needs a certain time to initialize, otherwise an incorrect trajectory and mapping result is obtained. Existing SLAM systems are typically built on conventional optical cameras, which exhibit some limitations by design. On the one hand, they output images at a fixed frame rate regardless of how much new information each image contains, so the incoming information is usually heavily redundant, and the redundant data wastes valuable computational resources. On the other hand, highly dynamic scenes or camera motion may introduce motion blur into conventional image frames and leave insufficient overlap between subsequent frames, so that corner detection and tracking based on conventional cameras degrade, which also limits the further development of SLAM. However, due to the special data format of the event camera, many existing mature SLAM methods cannot be directly applied to the event camera, which limits its application.
An event camera or Dynamic Vision Sensor (DVS) simulates the retina with a chip and responds with pulses to pixel-level illumination changes caused by motion. FIG. 1 shows an event camera and a normal camera shooting a rotating disk with a dot: the standard camera output is the brightness image of the scene at specific time points, while the DVS output is a stream of event data. More specifically, when the brightness increment at pixel position u_j = (x_j, y_j) at time t_j reaches a threshold ±c (c > 0), an event e_j = (x_j, y_j, t_j, p_j) is triggered, where p_j ∈ {+1, -1} is the polarity of the event, a positive sign indicating an increase in brightness and a negative sign a decrease. The event camera therefore outputs an asynchronous event stream, as shown in FIG. 1, and the absolute brightness of the scene is no longer directly visible, since events only record incremental changes. In contrast to conventional frame-based cameras, event cameras can capture brightness changes at an almost unlimited frame rate and record events at specific points in time and image locations. Especially for moving scenes, the event camera has great advantages in data rate, speed and dynamic range, and is expected to solve the failure caused by too fast motion in conventional SLAM systems. Newer event cameras, such as the DAVIS (Dynamic and Active-pixel Vision Sensor), carry an IMU (Inertial Measurement Unit) module; the IMU measures three-axis linear acceleration and angular velocity, is often used to acquire the three-dimensional motion of the camera for self-localization in SLAM (Simultaneous Localization and Mapping), navigation and other applications, and can be time-synchronized with the event points and brightness images. However, technical difficulties such as image reconstruction and time alignment of binocular images still exist in localization and map construction.
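For illustration only (not part of the patent), the event trigger condition described above can be sketched in Python as follows; the Event structure, the function name and the threshold value 0.15 are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Event:
    x: int      # pixel column
    y: int      # pixel row
    t: float    # timestamp (seconds)
    p: int      # polarity: +1 brightness increase, -1 decrease

def maybe_trigger(prev_log_i: float, cur_log_i: float,
                  x: int, y: int, t: float, c: float = 0.15) -> Optional[Event]:
    """Emit an event e = (x, y, t, p) when the log-brightness change at a pixel
    reaches the contrast threshold c (c > 0), as described above."""
    delta = cur_log_i - prev_log_i
    if abs(delta) >= c:
        return Event(x, y, t, +1 if delta > 0 else -1)
    return None
```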
Disclosure of Invention
The invention provides a SLAM scheme based on a binocular event camera, aiming at the problem that the real depth information of a scene cannot be directly recovered in the conventional SLAM method based on a monocular event camera.
The technical scheme of the invention provides a SLAM method based on a binocular event camera, which comprises the following steps:
step 1, performing motion compensation on input left and right event camera data by utilizing IMU assistance to obtain corresponding reconstructed images; the motion compensation using IMU assistance is implemented as follows,
setting the event frame start time asWhen motion compensation is performed, the start of an event frame is taken as a reference system, and a certain event point e in an accumulation window is aimed atjNoting the corresponding time stamp as tjObtained by IMU integrationTo tjRelative position and attitude ofE is to bejCoordinate x ofjProjected coordinate x 'in reference coordinate system'jComprises the following steps:
wherein K is the camera internal reference matrix, K-1Is its inverse matrix, Z (x)j) The depth of the event point is replaced by a median value of the depths of the adjacent three-dimensional space points;
step 2, respectively performing feature point detection and tracking on the corresponding reconstructed images input by the left and right event cameras;

step 3, according to the result obtained in step 2, triangulating the detected and tracked feature points to obtain the three-dimensional coordinate points of the target and the pose changes between images, and calculating the camera pose with the PnP method;
step 4, performing back-end BA optimization combined with IMU pre-integration to obtain the camera motion trajectory and scene mapping information.
Furthermore, the time alignment is performed when the image is reconstructed in step 1, which is realized as follows,
1) the left event camera accumulates event points over a 30 ms window to form a binary image frame, performs motion compensation, and takes the time of the first event point within the 30 ms window as the timestamp of the left event camera's reconstructed image frame;

2) in the event stream data of the right event camera, the event point whose time is closest to the timestamp of the left event camera's reconstructed image frame is searched for; taking the time of the found event point as the starting time, the right event camera accumulates event points over a 30 ms window to form a binary image frame and performs motion compensation.
In step 2, the feature point detection is realized by adopting a Shi-Tomasi method.
Furthermore, in step 2, tracking is realized by adopting the Kanade-Lucas-Tomasi method.
The invention also provides a binocular event camera based SLAM system, which is used to implement the above binocular event camera based SLAM method.
The method mainly exploits the high temporal resolution and high dynamic range of the event camera, together with the ability of a binocular camera to acquire the true depth of the scene; it thereby avoids the problem that a conventional camera cannot acquire enough image features in high-dynamic-range scenes, and computes the true depth information of the scene through the binocular camera. The binocular event camera based method can achieve higher trajectory estimation accuracy in SLAM. The technical scheme of the invention can cope with scenes with large illumination changes and high-speed motion, and can solve the problem that a robot in a conventional SLAM system is prone to failure when the motion or the environment is too complex.
Drawings
Fig. 1 is a schematic diagram comparing data of a conventional camera and a DVS camera.
Fig. 2 is a diagram of image reconstruction results according to an embodiment of the present invention, in which fig. 2(a) is an event image generated at a fixed time interval of a slow moving scene, fig. 2(b) is an event image generated at a fixed time interval of a fast moving scene, fig. 2(c) is an event image generated by a fixed number of event points of a scene with a simple environment, and fig. 2(d) is an event image generated by a fixed number of event points of a scene with a complex environment.
Fig. 3 is a schematic diagram of motion compensation as used in the present invention.
Fig. 4 shows the result of image reconstruction after motion compensation according to an embodiment of the present invention, in which fig. 4(a) is the image before motion compensation and fig. 4(b) is the image after motion compensation.
FIG. 5 is a schematic diagram of triangulation according to an embodiment of the present invention.
Fig. 6 is a schematic diagram of back-end BA optimization according to an embodiment of the present invention.
FIG. 7 is a block diagram of a flow chart according to an embodiment of the present invention.
Detailed Description
In order to more clearly understand the present invention, the technical solutions of the present invention are specifically described below with reference to the accompanying drawings and examples.
The method takes the left and right event cameras as two independent threads that simultaneously perform motion compensation, time alignment, and feature point detection and tracking; the three-dimensional coordinates of the feature points are then solved by triangulation, the camera pose is solved from the two-dimensional image coordinates of the feature points and their corresponding three-dimensional space coordinates, and finally the back end, combined with IMU pre-integration, optimizes and outputs the motion trajectory of the cameras and the construction of the surrounding map.
Referring to fig. 7, an embodiment of the present invention provides a method for SLAM based on a binocular event camera, including the following steps:
step 1, performing motion compensation on input left and right event camera data by using IMU assistance to obtain a reconstructed gray value image.
First, the left and right event camera data and the event camera IMU data are input. The data format of an event camera differs from that of an ordinary optical camera, so images must be reconstructed from the events; event images can be generated by accumulating event points either by a fixed number of events or over fixed time intervals.
The event stream data recorded when the camera shoots is shown in formula 1: the set {e_i(x, t)} represents all event points generated by the camera during shooting, where x denotes the pixel coordinate of an event point and t its generation time. The information of the i-th event point e_i(x, t) in the set comprises its generation time t_i, its pixel coordinate x_i and its polarity σ_i, and δ(·) is the Dirac function. The data output by an event camera is an asynchronous event stream, whereas conventional image processing methods operate on framed images. In order to process the images or visualize the event stream, event image frames need to be generated from the event stream; generally, event images are formed by directly accumulating events.

e_i(x, t) = σ_i · δ(x − x_i) · δ(t − t_i),  i ∈ {1, 2, 3, ...}   (1)
Directly accumulating event data over a period of time and then drawing it on an image in a fixed color is the simplest way to generate an image frame. In other words, in a blank image, the pixels at the coordinates of events received within a certain period of time are all set to 255, and pixels where no event was received are set to 0, which yields a binary event image. The accumulation window can be chosen either as a fixed length of time or as a fixed number of events. Event images generated at fixed time intervals are shown in fig. 2(a) and (b), two event images generated using 30 ms intervals, where fig. 2(a) is a slowly moving scene and fig. 2(b) a fast moving scene. It can be seen that if accumulation is performed over a fixed time interval, the edges in the event image become thick and an obvious smear appears when the camera moves too fast. Fig. 2(c) and (d) show event images generated with a fixed number of event points (10000 points each); fig. 2(c) is a scene with a simple environment and fig. 2(d) a scene with a complex environment. The more complex the environment, the more events are activated, so with a fixed event count the time interval between adjacent frames grows in simple scenes. Accumulating events into images therefore requires choosing appropriate means and parameters according to the scene and the motion pattern. This patent preferably employs event images generated at fixed time intervals; the embodiment accumulates events over 30 ms intervals.
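As an illustrative sketch (not taken from the patent), accumulating a 30 ms window of events into a binary frame as described above might look as follows; the array-based event representation, the 346×260 DAVIS-style resolution and the function name are assumptions.

```python
import numpy as np

def accumulate_event_frame(ts, xs, ys, t_start, window=0.030, height=260, width=346):
    """Build a binary event frame: pixels that received at least one event within
    [t_start, t_start + window) are set to 255, all other pixels stay 0.
    ts: float timestamps; xs, ys: integer pixel coordinates (same length)."""
    frame = np.zeros((height, width), dtype=np.uint8)
    mask = (ts >= t_start) & (ts < t_start + window)
    frame[ys[mask], xs[mask]] = 255
    return frame
```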
Since an image formed by accumulating events over a period of time suffers from thick edges, which is not conducive to subsequent feature point detection, the event points need to be motion compensated.
As shown in FIG. 3, on the time axis t the dots represent event points and the squares represent IMU data, while I_1, I_2, I_3, I_4 above indicate the corresponding sequence of image frames. Integrating the IMU data within the corresponding time interval yields the transformation matrix between the two frames I_2 and I_3, namely equation 2, in which the translation increment is obtained by double integration of the linear acceleration and the rotation increment by integration of the angular velocity. For an arbitrary event point e_j with timestamp t_j, transformation matrices of the form of equation 2 are computed relative to the reference timestamp for the two IMU measurements adjacent to e_j, and linear interpolation between them gives the transformation of e_j, i.e. the relative pose from t_0^f to t_j.
Setting the event frame start time as t_0^f and fixing the size of the accumulation window, the event frame start is taken as the reference frame when motion compensation is performed. For an event point e_j in the accumulation window with timestamp t_j, the relative pose T_j from t_0^f to t_j computed by IMU integration can project the coordinate x_j of e_j into the reference coordinate system. The projected coordinate x'_j is:

x'_j = K · T_j · ( Z(x_j) · K⁻¹ · x̃_j )

wherein x̃_j is the homogeneous form of x_j, K is the camera intrinsic matrix known from calibration, K⁻¹ is its inverse, and Z(x_j) is the depth of the event point, which in general comes from the projection depth of the optimized three-dimensional space points in the region and is given by the BA optimization in step 4; to avoid the influence of errors in the computed depths of the space points, the median depth of the neighboring three-dimensional space points is used instead. The effect of motion compensation is shown in fig. 4, where fig. 4(a) is the image before motion compensation and fig. 4(b) is the image after motion compensation; it can be seen that the edges of the motion-compensated image are thinner, which facilitates subsequent feature point detection and tracking.
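A minimal sketch of the per-event warp described above, assuming the relative pose is already split into a rotation R_rel and translation t_rel and that the median neighbour depth Z_med is available; the names are illustrative, not from the patent.

```python
import numpy as np

def motion_compensate_event(x_j, Z_med, K, R_rel, t_rel):
    """Warp one event pixel x_j = (u, v) into the reference frame at the event-frame
    start time: back-project with the median neighbour depth Z_med, apply the
    IMU-derived relative pose (R_rel, t_rel), and reproject with the intrinsics K."""
    K_inv = np.linalg.inv(K)
    p_hom = np.array([x_j[0], x_j[1], 1.0])
    P_cam = Z_med * (K_inv @ p_hom)          # 3D point in the event-time camera frame
    P_ref = R_rel @ P_cam + t_rel            # transform into the reference frame
    p_proj = K @ P_ref
    return p_proj[:2] / p_proj[2]            # projected pixel coordinate x'_j
```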
For time alignment, the specific reconstruction method in step 1 in the embodiment is as follows:
1) The left event camera accumulates event points over a 30 ms window to form a binary image frame, i.e. in a blank image the pixels at the coordinates of events received within the 30 ms window are all set to 255 and pixels where no event was received are set to 0; motion compensation is then performed, and the time of the first event point within the 30 ms window is taken as the timestamp of the left event camera's reconstructed image frame;

2) In the event stream data of the right event camera, the event point whose time is closest to the timestamp of the left event camera's reconstructed image frame is searched for (a minimal search sketch is given after this list); taking the time of the found event point as the starting time, the right event camera accumulates event points over a 30 ms window in the same way to form a binary image frame, i.e. the pixels of received events are set to 255 and the rest to 0, and performs motion compensation. Because the temporal resolution of event cameras is extremely high, the reconstructed images of the left and right event cameras are very similar, which achieves the purpose of time alignment.
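The nearest-event search mentioned in 2) could be sketched as below, assuming the right-camera timestamps are sorted in ascending order; the function name and the 30 ms default are illustrative.

```python
import numpy as np

def right_window_start(right_ts, left_frame_ts, window=0.030):
    """Find the right-camera event whose timestamp is closest to the left
    reconstructed-frame timestamp and use it as the start of the right 30 ms window.
    right_ts: sorted 1-D array of right-camera event timestamps."""
    i = np.searchsorted(right_ts, left_frame_ts)
    candidates = [j for j in (i - 1, i) if 0 <= j < len(right_ts)]
    start = min(candidates, key=lambda j: abs(right_ts[j] - left_frame_ts))
    return right_ts[start], right_ts[start] + window
```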
In step 2, the left and right camera images are detected and tracked separately.

The embodiment adopts the Shi-Tomasi method to detect feature points in the reconstructed image: a fixed window is moved over the image in arbitrary directions, and whether a pixel is a corner point is judged from the grayscale change of the image inside the window, which realizes feature point detection. The Kanade-Lucas-Tomasi method, abbreviated KLT, is used to track optical flow: under the assumption that the brightness of the same object remains constant over a short time, the optical flow of the feature points is computed and used to track them.
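A possible OpenCV-based sketch of this detection-and-tracking step on two consecutive reconstructed event frames; the parameter values (corner count, quality level, window size) are assumptions, not values specified by the patent.

```python
import cv2
import numpy as np

def detect_and_track(prev_frame, cur_frame):
    """Shi-Tomasi corner detection on the previous reconstructed event frame,
    then KLT (pyramidal Lucas-Kanade) optical-flow tracking into the current frame.
    Both frames are single-channel uint8 images."""
    pts_prev = cv2.goodFeaturesToTrack(prev_frame, maxCorners=200,
                                       qualityLevel=0.01, minDistance=7)
    if pts_prev is None:
        return np.empty((0, 1, 2), np.float32), np.empty((0, 1, 2), np.float32)
    pts_cur, status, _err = cv2.calcOpticalFlowPyrLK(prev_frame, cur_frame,
                                                     pts_prev, None,
                                                     winSize=(21, 21), maxLevel=3)
    good = status.ravel() == 1
    return pts_prev[good], pts_cur[good]
```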
Step 3, triangulating the detected and tracked feature points to calculate the corresponding three-dimensional coordinate points of the target and the pose change between images, and calculating the pose of the camera by using a PnP method:
more accurate scene depth values are obtained more efficiently by binocular camera initialization at step 3. In an embodiment, step 3 comprises the following sub-steps:
step 3.1, calculating three-dimensional coordinates of the feature points by using triangulation:
since the binocular camera knows the baseline distance between the two cameras, the absolute scale information can be obtained by triangulating the three-dimensional coordinates of the feature points detected in the reconstructed images of the left and right event cameras, and therefore, more accurate scene depth values can be calculated when the three-dimensional coordinates of the triangulated target from the feature point coordinates of the left and right cameras are calculated. As shown in FIG. 5, the coordinate of a point P in space in the world coordinate system is X, which is represented by O1The coordinate in the imaging plane of the camera 1 being the optical center is X1In the presence of O2The coordinate in the imaging plane of the camera 2 being the optical center is X2,R1、T1Is a rotation matrix and a translation matrix of the camera 1 relative to the initial pose, and has the same principle of R2、T2For the rotation matrix and translation matrix of camera 2 with respect to the initial pose, R, T is the rotation and translation matrix between cameras 1 and 2, and equation 3 can be derived from the camera imaging model.
Wherein K is a camera internal reference matrix s1And s2The distances from the optical centers of the camera 1 and the camera 2 to the target point P are approximately equal, the formula 3 is transformed to obtain a formula 4, and the three-dimensional point coordinates of the target point P can be obtained by utilizing SVD (singular value decomposition) decomposition solution. Wherein, the simplification mark K-1X2=X′2
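A sketch of linear (DLT) triangulation solved with SVD, in the spirit of equations 3 and 4, assuming the 3×4 projection matrices of the two event cameras are known from calibration; this is a generic formulation, not the patent's exact equations.

```python
import numpy as np

def triangulate_point(P1, P2, x1, x2):
    """Linear (DLT) triangulation: P1, P2 are the 3x4 projection matrices K[R|T] of the
    left and right event cameras, x1, x2 the matched pixel coordinates (u, v). The
    homogeneous solution is the right singular vector of the smallest singular value."""
    A = np.vstack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)
    X_hom = Vt[-1]
    return X_hom[:3] / X_hom[3]   # 3D coordinates of the target point P
```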
Step 3.2, calculating the pose of the camera by using a PnP method:
the PnP (Passive-n-Point) method is used for solving the problem of estimation of the camera pose when three-dimensional space Point coordinates under a known partial world coordinate system and two-dimensional camera coordinate systems of the three-dimensional space Point coordinates and the two-dimensional camera coordinate system coordinates are known. In the invention, the two-dimensional image coordinates and the three-dimensional space coordinates of the known feature points in the step 2 and the step 3.1 are used for solving the pose change of a left camera continuously inputting two frames of images by triangulation, when the camera moves to a new position to obtain a new third frame of event frame, because the translation vector T in the relative transformation obtained by triangulation does not have a real scale, if the pose of the new camera is continuously solved by the triangulation, only a certain relative pose can be obtained, and thus the scales between the 3 camera poses (the camera poses corresponding to the first frame, the second frame and the third frame) are inconsistent. Therefore, for the subsequent camera pose, the relation between the three-dimensional coordinates and the two-dimensional pixel coordinates of the feature points can be utilized for solving, namely the PnP method. In the invention, a PnP method is utilized, three-dimensional coordinates of feature points are calculated by triangularization of left and right camera images of a first frame, two-dimensional image coordinates corresponding to the feature points of two continuous frames are detected by the left event camera feature in step 2, at least 6 groups of three-dimensional and two-dimensional matched points are selected for PnP pose resolving, and then camera pose transformation can be obtained.
Step 4, back-end BA (Bundle Adjustment) optimization combined with IMU pre-integration to obtain the camera motion trajectory and scene mapping information:
step 4.1, IMU pre-integration, wherein the IMU can output the three-axis acceleration of the sensor at a higher frequencyAnd angular velocityHowever, due to the bias and noise of the IMU itself, there is a certain difference between the output measured value and the true value. The relationship between the IMU measurement value and the true value can be represented by equation 5.
Wherein,andmeasured values of acceleration and angular velocity, atAnd ωtIs the corresponding true value.Andand (4) bias of acceleration and angular velocity, and a random walk model is obeyed.Andnoise of acceleration and angular velocity, respectively, where naNoise representing three-axis acceleration, subject to mean 0 and variance ofGaussian normal distribution of (1), nwIs the noise at angular velocity, which follows a mean of 0 and a variance ofGaussian normal distribution of (a). WhileIs a rotation matrix of the world coordinate system to the camera coordinate system at time t, gwIs the gravitational acceleration under the world coordinate system.
Integrating the IMU observations between the event frame b_k at time t_k and the event frame b_{k+1} at time t_{k+1} yields the translation, velocity and rotation of frame b_{k+1} in the world coordinate system, from which the pose change of the camera can be obtained.
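A simplified Euler-integration sketch of propagating the pose through the IMU samples between two event frames; a full pre-integration as in step 4.1 would additionally keep pre-integrated deltas and covariances, which are omitted here. The sample format and the gravity value are assumptions.

```python
import numpy as np
from scipy.spatial.transform import Rotation

def integrate_imu(p, v, R, imu_samples, g_w=np.array([0.0, 0.0, -9.81])):
    """Propagate position p, velocity v and body-to-world rotation R through the
    bias-corrected IMU samples between two event frames; each sample is (dt, acc, gyro)."""
    for dt, acc, gyro in imu_samples:
        a_w = R @ acc + g_w                       # specific force rotated into the world frame
        p = p + v * dt + 0.5 * a_w * dt * dt      # double integration -> translation
        v = v + a_w * dt
        dR = Rotation.from_rotvec(gyro * dt).as_matrix()
        R = R @ dR                                # integrate angular velocity -> rotation
    return p, v, R
```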
Step 4.2, back-end BA optimization. As more and more new camera poses appear, the accumulated error of the solved relative poses produces a certain drift. At this point the overall poses and the three-dimensional point cloud are usually adjusted by the BA optimization method. BA is in essence an optimization model: it optimizes the poses obtained by the PnP solution and the three-dimensional world coordinates of the feature points obtained by triangulation by minimizing the reprojection error.
Assuming there are n camera poses and m feature points whose three-dimensional world coordinates are to be solved, the BA optimization problem can be represented by equation 6:

min_{P,X} Σ_{i=1..n} Σ_{j=1..m} θ_ij · ‖ x_ij − π(K, P_i, X_j) ‖²   (6)

where the coefficient θ_ij is 1 when camera c_i observes feature point j and 0 otherwise, x_ij is the two-dimensional pixel coordinate of feature point j observed by camera c_i, K is the camera intrinsic matrix, P_i is the pose of the i-th camera c_i, X_j is the three-dimensional world coordinate of feature point j, and π(·) denotes projecting X_j into the image of camera c_i. FIG. 6 shows the case of 3 camera poses and 3 three-dimensional feature points, where C_1, C_2, C_3 are the camera observation points, X_1, X_2, X_3 are the three-dimensional coordinates of the feature points, x_11 is the two-dimensional image coordinate at C_1 of a three-dimensional feature point, and the other two-dimensional image coordinates x_12, x_13 are analogous.
The BA optimization problem is usually solved with the LM (Levenberg-Marquardt) method, optimizing the camera poses P and the three-dimensional coordinates X of the feature points. At this point all the steps of this patent are complete, and the camera's own motion poses and the map built from the feature points are obtained.
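A minimal sketch of the reprojection-error objective of equation 6, solved with a Levenberg-Marquardt least-squares routine; the parameter packing and the observation format are assumptions made for the example.

```python
import cv2
import numpy as np
from scipy.optimize import least_squares

def ba_residuals(params, K, observations, n_cams):
    """Reprojection residuals of equation (6). `params` packs n_cams poses
    (rvec, tvec: 6 values each) followed by the 3D points (3 values each);
    `observations` is a list of (cam_idx, pt_idx, u, v) entries with theta_ij = 1."""
    dist = np.zeros(5)
    res = []
    for cam_idx, pt_idx, u, v in observations:
        rvec = params[cam_idx * 6: cam_idx * 6 + 3]
        tvec = params[cam_idx * 6 + 3: cam_idx * 6 + 6]
        X = params[n_cams * 6 + pt_idx * 3: n_cams * 6 + pt_idx * 3 + 3]
        proj, _ = cv2.projectPoints(X.reshape(1, 3), rvec, tvec, K, dist)
        res.extend(proj.ravel() - np.array([u, v]))
    return np.asarray(res)

# Usage sketch: x0 stacks the PnP poses and triangulated points as the initial guess.
# result = least_squares(ba_residuals, x0, method="lm", args=(K, observations, n_cams))
```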
Fig. 7 is the flowchart of this patent: the event streams input by the left and right cameras are motion compensated and time aligned to reconstruct an image form similar to that of a conventional camera; based on the reconstructed images, feature point detection and tracking, PnP pose calculation, triangulated depth calculation, IMU pre-integration and back-end optimization are then completed with the existing methods, and finally the optimized map composed of the camera's own poses and the feature point coordinates is output.
In specific implementation, the method can adopt a computer software technology to realize an automatic operation process, and a corresponding system device for implementing the method process is also in the protection scope of the invention.
It should be understood that the above embodiments are described in some detail, which is not intended to limit the scope of the invention; those skilled in the art may make replacements and modifications without departing from the scope of the invention as defined by the appended claims.
Claims (5)
1. A SLAM method based on a binocular event camera is characterized by comprising the following steps:
step 1, performing motion compensation on input left and right event camera data by utilizing IMU assistance to obtain corresponding reconstructed images; the motion compensation using IMU assistance is implemented as follows,
setting the event frame start time as t_0^f, and taking the event frame start as the reference frame when performing motion compensation; for an event point e_j in the accumulation window, denoting its corresponding timestamp by t_j and letting T_j be the relative pose from t_0^f to t_j obtained by IMU integration, the coordinate x_j of e_j is projected into the reference coordinate system, and the projected coordinate x'_j is:

x'_j = K · T_j · ( Z(x_j) · K⁻¹ · x̃_j )

wherein x̃_j is the homogeneous form of x_j, K is the camera intrinsic matrix, K⁻¹ is its inverse, and Z(x_j) is the depth of the event point, which is replaced by the median depth of the neighboring three-dimensional space points;
step 2, respectively carrying out feature point detection and tracking on corresponding reconstructed images input by the left event camera and the right event camera;
step 3, according to the result obtained in step 2, triangulating the detected and tracked feature points to obtain the three-dimensional coordinate points of the target and the pose changes between images, and calculating the camera pose with the PnP method;
step 4, performing back-end BA optimization combined with IMU pre-integration to obtain the camera motion trajectory and scene mapping information.
2. The binocular event camera based SLAM method of claim 1, wherein: the time alignment is performed when the image is reconstructed in step 1, which is realized as follows,
1) the left event camera accumulates event points over a 30 ms window to form a binary image frame, performs motion compensation, and takes the time of the first event point within the 30 ms window as the timestamp of the left event camera's reconstructed image frame;

2) in the event stream data of the right event camera, the event point whose time is closest to the timestamp of the left event camera's reconstructed image frame is searched for; taking the time of the found event point as the starting time, the right event camera accumulates event points over a 30 ms window to form a binary image frame and performs motion compensation.
3. The binocular event camera based SLAM method of claim 1 or 2, wherein: in step 2, a Shi-Tomasi method is adopted to realize feature point detection.
4. The binocular event camera based SLAM method of claim 1 or 2, wherein: in the step 2, tracking is realized by adopting a Kanade-Lucas-Tomasi method.
5. A binocular event camera based SLAM system, characterized in that: it is used to perform the binocular event camera based SLAM method of any one of claims 1 to 4.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010647021.8A CN111899276A (en) | 2020-07-07 | 2020-07-07 | SLAM method and system based on binocular event camera |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010647021.8A CN111899276A (en) | 2020-07-07 | 2020-07-07 | SLAM method and system based on binocular event camera |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111899276A true CN111899276A (en) | 2020-11-06 |
Family
ID=73191664
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010647021.8A Pending CN111899276A (en) | 2020-07-07 | 2020-07-07 | SLAM method and system based on binocular event camera |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111899276A (en) |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112631314A (en) * | 2021-03-15 | 2021-04-09 | 季华实验室 | Robot control method and system based on multi-line laser radar and event camera SLAM |
CN112809679A (en) * | 2021-01-25 | 2021-05-18 | 清华大学深圳国际研究生院 | Method and device for grabbing deformable object and computer readable storage medium |
CN112967316A (en) * | 2021-03-05 | 2021-06-15 | 中国科学技术大学 | Motion compensation optimization method and system for 3D multi-target tracking |
CN114022949A (en) * | 2021-09-27 | 2022-02-08 | 中国电子科技南湖研究院 | Event camera motion compensation method and device based on motion model |
CN115997234A (en) * | 2020-12-31 | 2023-04-21 | 华为技术有限公司 | Pose estimation method and related device |
CN116389682A (en) * | 2023-03-07 | 2023-07-04 | 华中科技大学 | Dual-event camera synchronous acquisition system and noise event suppression method |
CN117372548A (en) * | 2023-12-06 | 2024-01-09 | 北京水木东方医用机器人技术创新中心有限公司 | Tracking system and camera alignment method, device, equipment and storage medium |
CN117739996A (en) * | 2024-02-21 | 2024-03-22 | 西北工业大学 | Autonomous positioning method based on event camera inertial tight coupling |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108665540A (en) * | 2018-03-16 | 2018-10-16 | 浙江工业大学 | Robot localization based on binocular vision feature and IMU information and map structuring system |
CN110415344A (en) * | 2019-06-24 | 2019-11-05 | 武汉大学 | Motion compensation process based on event camera |
CN111340851A (en) * | 2020-05-19 | 2020-06-26 | 北京数字绿土科技有限公司 | SLAM method based on binocular vision and IMU fusion |
-
2020
- 2020-07-07 CN CN202010647021.8A patent/CN111899276A/en active Pending
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108665540A (en) * | 2018-03-16 | 2018-10-16 | 浙江工业大学 | Robot localization based on binocular vision feature and IMU information and map structuring system |
CN110415344A (en) * | 2019-06-24 | 2019-11-05 | 武汉大学 | Motion compensation process based on event camera |
CN111340851A (en) * | 2020-05-19 | 2020-06-26 | 北京数字绿土科技有限公司 | SLAM method based on binocular vision and IMU fusion |
Cited By (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115997234A (en) * | 2020-12-31 | 2023-04-21 | 华为技术有限公司 | Pose estimation method and related device |
CN112809679A (en) * | 2021-01-25 | 2021-05-18 | 清华大学深圳国际研究生院 | Method and device for grabbing deformable object and computer readable storage medium |
CN112967316A (en) * | 2021-03-05 | 2021-06-15 | 中国科学技术大学 | Motion compensation optimization method and system for 3D multi-target tracking |
CN112967316B (en) * | 2021-03-05 | 2022-09-06 | 中国科学技术大学 | Motion compensation optimization method and system for 3D multi-target tracking |
CN112631314A (en) * | 2021-03-15 | 2021-04-09 | 季华实验室 | Robot control method and system based on multi-line laser radar and event camera SLAM |
CN112631314B (en) * | 2021-03-15 | 2021-06-04 | 季华实验室 | Robot control method and system based on multi-line laser radar and event camera SLAM |
CN114022949A (en) * | 2021-09-27 | 2022-02-08 | 中国电子科技南湖研究院 | Event camera motion compensation method and device based on motion model |
CN116389682A (en) * | 2023-03-07 | 2023-07-04 | 华中科技大学 | Dual-event camera synchronous acquisition system and noise event suppression method |
CN116389682B (en) * | 2023-03-07 | 2024-02-06 | 华中科技大学 | Dual-event camera synchronous acquisition system and noise event suppression method |
CN117372548A (en) * | 2023-12-06 | 2024-01-09 | 北京水木东方医用机器人技术创新中心有限公司 | Tracking system and camera alignment method, device, equipment and storage medium |
CN117372548B (en) * | 2023-12-06 | 2024-03-22 | 北京水木东方医用机器人技术创新中心有限公司 | Tracking system and camera alignment method, device, equipment and storage medium |
CN117739996A (en) * | 2024-02-21 | 2024-03-22 | 西北工业大学 | Autonomous positioning method based on event camera inertial tight coupling |
CN117739996B (en) * | 2024-02-21 | 2024-04-30 | 西北工业大学 | Autonomous positioning method based on event camera inertial tight coupling |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111899276A (en) | SLAM method and system based on binocular event camera | |
CN110070615B (en) | Multi-camera cooperation-based panoramic vision SLAM method | |
Zhu et al. | The multivehicle stereo event camera dataset: An event camera dataset for 3D perception | |
JP6768156B2 (en) | Virtually enhanced visual simultaneous positioning and mapping systems and methods | |
CN107888828B (en) | Space positioning method and device, electronic device, and storage medium | |
US10825197B2 (en) | Three dimensional position estimation mechanism | |
US10260862B2 (en) | Pose estimation using sensors | |
CN110533719B (en) | Augmented reality positioning method and device based on environment visual feature point identification technology | |
WO2018142496A1 (en) | Three-dimensional measuring device | |
CN110310362A (en) | High dynamic scene three-dimensional reconstruction method, system based on depth map and IMU | |
CN109540126A (en) | A kind of inertia visual combination air navigation aid based on optical flow method | |
CN108700946A (en) | System and method for parallel ranging and fault detect and the recovery of building figure | |
WO2015134795A2 (en) | Method and system for 3d capture based on structure from motion with pose detection tool | |
US11262837B2 (en) | Dual-precision sensor system using high-precision sensor data to train low-precision sensor data for object localization in a virtual environment | |
CN110139031B (en) | Video anti-shake system based on inertial sensing and working method thereof | |
KR20150013709A (en) | A system for mixing or compositing in real-time, computer generated 3d objects and a video feed from a film camera | |
US20110249095A1 (en) | Image composition apparatus and method thereof | |
CN111798485B (en) | Event camera optical flow estimation method and system enhanced by IMU | |
KR20180030446A (en) | Method and device for blurring a virtual object in a video | |
CN110544278B (en) | Rigid body motion capture method and device and AGV pose capture system | |
Bapat et al. | Towards kilo-hertz 6-dof visual tracking using an egocentric cluster of rolling shutter cameras | |
CN118135526A (en) | Visual target recognition and positioning method for four-rotor unmanned aerial vehicle based on binocular camera | |
Huttunen et al. | A monocular camera gyroscope | |
CN112432653B (en) | Monocular vision inertial odometer method based on dotted line characteristics | |
US10540809B2 (en) | Methods and apparatus for tracking a light source in an environment surrounding a device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
RJ01 | Rejection of invention patent application after publication |
Application publication date: 20201106 |