CN111899276A - SLAM method and system based on binocular event camera - Google Patents

SLAM method and system based on binocular event camera

Info

Publication number
CN111899276A
Authority
CN
China
Prior art keywords
event
camera
time
points
imu
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010647021.8A
Other languages
Chinese (zh)
Inventor
余磊
周游龙
杨公宇
杨文
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan University WHU
Original Assignee
Wuhan University WHU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan University WHU filed Critical Wuhan University WHU
Priority to CN202010647021.8A priority Critical patent/CN111899276A/en
Publication of CN111899276A publication Critical patent/CN111899276A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/50 Depth or shape recovery
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/16 Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G06T7/207 Analysis of motion for motion estimation over a hierarchy of resolutions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G06T7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G06T7/73 Determining position or orientation of objects or cameras using feature-based methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/80 Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • G06T7/85 Stereo camera calibration

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Mathematical Physics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Computational Mathematics (AREA)
  • Pure & Applied Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Computing Systems (AREA)
  • Algebra (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a binocular event camera based SLAM method and system. Motion compensation is performed on the input left and right event camera data with IMU assistance to obtain corresponding reconstructed images; in the IMU-assisted motion compensation, the coordinates of the event points are projected into a reference coordinate system through the relative pose obtained by IMU integration, and the depth of each event point is replaced by the median of the depths of the adjacent three-dimensional space points. Feature point detection and tracking are then carried out separately on the reconstructed images of the left and right event cameras. The detected and tracked feature points are triangulated to obtain the three-dimensional coordinate points of the target and the pose changes between images, and the camera pose is calculated with a PnP method. Finally, back-end BA optimization combined with IMU pre-integration yields the camera motion trajectory and scene mapping information. The technical scheme of the invention can cope with scenes with large illumination changes and high-speed motion, and addresses the problem that the robot in a conventional SLAM system easily fails when the motion or the environment is too complex.

Description

SLAM method and system based on binocular event camera
Technical Field
The invention belongs to the field of image processing, and particularly relates to a technical scheme for realizing SLAM by using a binocular event camera.
Background
In the past decades, interest in robotic perception has grown steadily thanks to the research and development of computer vision methods. Conventional optical cameras can capture rich information about the camera's surroundings and, owing to their low cost and wide availability, have become the most popular sensors in a variety of applications.
Simultaneous Localization and Mapping (SLAM) is one of the most important milestones in the field of robot perception and has achieved significant success over the last 30 years. Monocular event camera SLAM cannot recover the true scale of the scene, and a monocular camera needs a certain amount of time to initialize; otherwise an incorrect trajectory and mapping result are obtained. Existing SLAM systems are typically built on conventional optical cameras, which exhibit some limitations by design. On the one hand, they output images at a fixed frame rate regardless of how much new information each image contains, so the incoming information is usually heavily redundant, and the redundant data wastes valuable computational resources. On the other hand, highly dynamic scenes or fast camera motion may introduce motion blur into conventional image frames and may leave insufficient overlap between subsequent frames, so that corner detection and tracking based on conventional cameras degrade, which also limits the further development of SLAM. At the same time, because of the special data format of the event camera, many existing mature SLAM methods cannot be applied directly to the event camera, which limits the application of event cameras.
An event camera, or Dynamic Vision Sensor (DVS), simulates the retina with a chip and responds with pulses to pixel-level illumination changes caused by motion. FIG. 1 shows an event camera and a normal camera shooting a rotating disk with a dot: the standard camera output is the luminance image of the scene at a specific time point, while the DVS output is a stream of event data. More specifically, when the brightness increment at pixel position u_j = (x_j, y_j) at time t_j reaches a threshold ±c (c > 0), an event e_j = (x_j, y_j, t_j, p_j) is triggered, where p_j ∈ {+1, -1} is the polarity of the event, a positive sign indicating an increase in brightness and a negative sign a decrease. The event camera therefore outputs an asynchronous event stream, as shown in FIG. 1, and since events only record incremental changes, the absolute brightness of the scene is no longer directly visible. In contrast to conventional frame-based cameras, event cameras can capture brightness changes at an almost unlimited frame rate and record events at specific points in time and image locations. Especially for moving scenes, the event camera has great advantages in data rate, speed and dynamic range, and is expected to solve the failures caused by overly fast motion in conventional SLAM systems. Newer event cameras, such as the DAVIS (Dynamic and Active-pixel Vision Sensor), integrate an IMU (Inertial Measurement Unit) module. The IMU measures three-axis linear acceleration and angular velocity, is often used to acquire three-dimensional motion information of the camera for self-localization in SLAM (Simultaneous Localization and Mapping), navigation and other applications, and can be time-synchronized with the event points and brightness images. However, technical difficulties such as image reconstruction and time alignment of the binocular images still remain for localization and map construction.
Disclosure of Invention
The invention provides a SLAM scheme based on a binocular event camera, aiming at the problem that the real depth information of a scene cannot be directly recovered in the conventional SLAM method based on a monocular event camera.
The technical scheme of the invention provides a SLAM method based on a binocular event camera, which comprises the following steps:
step 1, performing motion compensation on input left and right event camera data by utilizing IMU assistance to obtain corresponding reconstructed images; the motion compensation using IMU assistance is implemented as follows,
setting the event frame start time as
Figure BDA0002573438640000021
When motion compensation is performed, the start of an event frame is taken as a reference system, and a certain event point e in an accumulation window is aimed atjNoting the corresponding time stamp as tjObtained by IMU integration
Figure BDA0002573438640000022
To tjRelative position and attitude of
Figure BDA0002573438640000023
E is to bejCoordinate x ofjProjected coordinate x 'in reference coordinate system'jComprises the following steps:
Figure BDA0002573438640000024
wherein K is the camera internal reference matrix, K-1Is its inverse matrix, Z (x)j) The depth of the event point is replaced by a median value of the depths of the adjacent three-dimensional space points;
step 2, respectively carrying out feature point detection and tracking on corresponding reconstructed images input by the left event camera and the right event camera;
step 3, according to the result obtained in the step 2, triangularization calculation is carried out on the detected and tracked feature points to obtain three-dimensional coordinate points corresponding to the target and pose changes among images, and the camera pose is calculated by using a PnP method;
and 4, performing back-end BA optimization by combining IMU pre-integration to obtain a camera motion track and scene mapping information.
Furthermore, the time alignment is performed when the image is reconstructed in step 1, which is realized as follows,
1) the left event camera accumulates the event points within a 30 ms window into a binary image frame, motion compensation is performed, and the time of the first event point within the 30 ms window is taken as the timestamp of the left event camera's reconstructed image frame;
2) in the event stream data of the right event camera, the event point whose time is closest to the timestamp of the left camera's reconstructed image frame is searched for; taking the time of the found event point as the start time, the right event camera accumulates the event points of a 30 ms window into a binary image frame, and motion compensation is performed.
In step 2, the feature point detection is realized by adopting a Shi-Tomasi method.
In step 2, tracking is realized by adopting the Kanade-Lucas-Tomasi method.
The invention also provides a SLAM system based on the binocular event camera, which is used to implement the above SLAM method based on the binocular event camera.
The method mainly exploits the event camera's high temporal resolution and high dynamic range, together with the binocular camera's ability to acquire the real depth of the scene; it thereby avoids the problem that a traditional camera cannot capture enough image features in high dynamic range scenes, and computes the true depth information of the scene through the binocular setup. The binocular event camera based method can achieve higher trajectory estimation precision in SLAM. The technical scheme of the invention can cope with scenes with large illumination changes and high-speed motion, and addresses the problem that the robot in a conventional SLAM system easily fails when the motion or the environment is too complex.
Drawings
Fig. 1 is a schematic diagram comparing data of a conventional camera and a DVS camera.
Fig. 2 is a diagram of image reconstruction results according to an embodiment of the present invention, in which fig. 2(a) is an event image generated at a fixed time interval of a slow moving scene, fig. 2(b) is an event image generated at a fixed time interval of a fast moving scene, fig. 2(c) is an event image generated by a fixed number of event points of a scene with a simple environment, and fig. 2(d) is an event image generated by a fixed number of event points of a scene with a complex environment.
Fig. 3 is a schematic diagram of motion compensation as used in the present invention.
Fig. 4 shows the result of image reconstruction after motion compensation according to an embodiment of the present invention, in which fig. 4(a) is the image before motion compensation and fig. 4(b) is the image after motion compensation.
FIG. 5 is a schematic diagram of triangulation according to an embodiment of the present invention.
Fig. 6 is a schematic diagram of back-end BA optimization according to an embodiment of the present invention.
FIG. 7 is a block diagram of a flow chart according to an embodiment of the present invention.
Detailed Description
In order to more clearly understand the present invention, the technical solutions of the present invention are specifically described below with reference to the accompanying drawings and examples.
In the method, the left and right event cameras are treated as two independent threads that simultaneously perform motion compensation, time alignment, and feature point detection and tracking. The three-dimensional coordinates of the feature points are then solved by triangulation, the camera pose is solved from the two-dimensional image coordinates of the feature points together with the corresponding three-dimensional space coordinates, and finally the back end, combined with IMU pre-integration, optimizes and outputs the motion trajectory of the cameras and the construction of the surrounding map.
Referring to fig. 7, an embodiment of the present invention provides a method for SLAM based on a binocular event camera, including the following steps:
step 1, performing motion compensation on input left and right event camera data by using IMU assistance to obtain a reconstructed gray value image.
Left and right event camera data and the event camera IMU data are first input. Because the data format of an event camera differs from that of an ordinary optical camera, images have to be reconstructed from the events; event images can be generated by accumulating event points either by a fixed number of events or over fixed time intervals.
The event stream data recorded while the camera shoots is given by formula (1). The set {e_i(x, t)} represents all event points generated by the camera during shooting, where x is the pixel coordinate of an event point and t is its generation time. The i-th event point e_i(x, t) carries its generation time t_i, its pixel coordinate x_i and its polarity σ_i, and δ(·) is the Dirac function:

$$e_i(\mathbf{x}, t) = \sigma_i\, \delta(\mathbf{x} - \mathbf{x}_i)\, \delta(t - t_i), \quad i \in 1, 2, 3, \ldots \qquad (1)$$

The data output by the event camera is an asynchronous event stream, whereas conventional image processing methods operate on framed images. In order to process images or visualize the event stream, event image frames need to be generated from the event stream; usually an event image is formed by directly accumulating events.
Accumulating event data directly over a period of time and drawing it onto an image in a fixed color is the simplest way to generate an event frame. In other words, in a blank image, the values of the coordinate pixels of all events received within a certain period are set to 255, and locations that received no event are set to 0, giving a binary event image. The accumulation window can be chosen either as a fixed length of time or as a fixed number of events. Event images generated at fixed time intervals are shown in fig. 2(a) and (b), two event images generated with 30 ms intervals, where fig. 2(a) is a slowly moving scene and fig. 2(b) is a fast moving scene. It can be seen that if accumulation is performed over a fixed time interval, the edges of the event image become thick when the camera moves too fast, and an obvious smearing phenomenon appears. Fig. 2(c) and (d) show event images generated with a fixed number of event points, here 10000 points each; fig. 2(c) is a scene with a simple environment and fig. 2(d) a scene with a complex environment. The more complex the environment, the more events are activated, and with this fixed-event-count imaging method the time interval between adjacent frames grows in simple scenes. Building images from accumulated events therefore requires choosing appropriate means and parameters according to the scene and motion pattern. This patent preferably employs event images generated at fixed time intervals, and the embodiment accumulates events over 30 ms intervals to generate the event images.
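As an illustration of the fixed-time-interval accumulation described above, the following Python sketch builds a binary event frame from a 30 ms window; the event array layout (columns x, y, t) and the sensor resolution used here are assumptions for the example, not values taken from the patent.

```python
import numpy as np

def accumulate_event_frame(events, t_start, window=0.030, height=260, width=346):
    """Binary event image from all events with t in [t_start, t_start + window)."""
    frame = np.zeros((height, width), dtype=np.uint8)
    mask = (events[:, 2] >= t_start) & (events[:, 2] < t_start + window)
    xs = events[mask, 0].astype(int)
    ys = events[mask, 1].astype(int)
    frame[ys, xs] = 255        # pixels that received at least one event
    return frame
```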
Since an image formed by directly accumulating events over a period of time has coarse event-image edges, which is detrimental to subsequent feature point detection, the event points need to be motion-compensated.
As shown in FIG. 3, on the time axis t the dots represent event points, the squares represent IMU data, and $I_1, I_2, I_3, I_4$ above denote the corresponding image frame sequence. Integrating the IMU data within the time interval $[t_{I_2}, t_{I_3}]$ yields the transformation matrix $T_{I_2 \to I_3}$ between the two frames $I_2$ and $I_3$, namely:

$$T_{I_2 \to I_3} = \begin{bmatrix} R_{I_2 \to I_3} & p_{I_2 \to I_3} \\ 0 & 1 \end{bmatrix} \qquad (2)$$

where double integration of the linear acceleration gives the translation increment $p_{I_2 \to I_3}$ and integration of the angular velocity gives the rotation increment $R_{I_2 \to I_3}$. For an arbitrary event point e_j with timestamp t_j inside this interval, linear interpolation of $T_{I_2 \to I_3}$ over time gives the transformation of e_j relative to the reference timestamp, i.e. the relative pose between the reference time and t_j.
Let the event frame start time be t_ref and the accumulation window size be Δt. When motion compensation is performed, the event frame start t_ref is taken as the reference frame. For an event point e_j in the accumulation window with timestamp t_j, the relative pose between t_ref and t_j calculated by IMU integration, applied here as the transform $T_{\mathrm{ref} \leftarrow t_j}$ that maps coordinates at time t_j into the reference frame, allows the coordinate x_j of e_j to be projected into the reference coordinate system. The projected coordinate x'_j is:

$$x'_j = \pi\!\left(K\, T_{\mathrm{ref} \leftarrow t_j}\, Z(x_j)\, K^{-1}\, \tilde{x}_j\right)$$

where K is the camera intrinsic matrix, known after camera calibration, K^{-1} is its inverse, $\tilde{x}_j$ is the homogeneous pixel coordinate of x_j, $\pi(\cdot)$ is the perspective projection onto the image plane, and Z(x_j) is the depth of the event point. This depth is normally derived from the projection depth of the optimized three-dimensional space points in the region, which is given by the BA optimization in step 4; to avoid the influence of errors in the computed depths of individual space points, the median of the depths of the adjacent three-dimensional space points is used instead. The effect of motion compensation is shown in fig. 4, where fig. 4(a) is the image before motion compensation and fig. 4(b) the image after motion compensation; it can be seen that the edges of the motion-compensated image are thinner, which facilitates subsequent feature point detection and tracking.
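The warping of a single event point into the reference frame can be sketched as below; the function and variable names are ours, and the 4x4 transform T_ref_from_j (mapping points expressed at the event time into the reference frame) and the median depth z_med are assumed to be available from the IMU interpolation and depth lookup described above.

```python
import numpy as np

def compensate_event(x_j, K, T_ref_from_j, z_med):
    """Warp event pixel x_j = (u, v), observed at time t_j, into the reference frame."""
    K_inv = np.linalg.inv(K)
    p_j = z_med * (K_inv @ np.array([x_j[0], x_j[1], 1.0]))   # back-project with the median depth
    p_ref = T_ref_from_j @ np.append(p_j, 1.0)                # express the 3D point in the reference frame
    uvw = K @ p_ref[:3]
    return uvw[:2] / uvw[2]                                   # compensated pixel coordinate x'_j
```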
For time alignment, the specific reconstruction method of step 1 in the embodiment is as follows:
1) the left event camera accumulates the event points within a 30 ms window into a binary image frame, i.e., in a blank image the coordinate pixels of all events received during that 30 ms are set to 255 and locations receiving no event are set to 0; motion compensation is performed, and the time of the first event point within the 30 ms window is used as the timestamp of the left event camera's reconstructed image frame;
2) in the event stream data of the right event camera, the event point whose time is closest to the timestamp of the left camera's reconstructed image frame is searched for; taking the time of that event point as the start time, the right event camera accumulates the event points of a 30 ms window into a binary image frame in the same way (pixels of received events set to 255, all others to 0), and motion compensation is performed. Because the temporal resolution of the event cameras is extremely high, the reconstructed images of the left and right event cameras end up very similar, achieving the goal of time alignment.
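A hedged sketch of the time-alignment rule for the right camera follows: the right camera's 30 ms accumulation window starts at the event whose timestamp is closest to the left frame's timestamp. The sorted timestamp array and function name are illustrative assumptions.

```python
import numpy as np

def align_right_window(right_timestamps, left_frame_timestamp):
    """Start time of the right camera's window: the event time closest to the left frame timestamp."""
    idx = np.searchsorted(right_timestamps, left_frame_timestamp)
    candidates = right_timestamps[max(idx - 1, 0): idx + 1]
    return candidates[np.argmin(np.abs(candidates - left_frame_timestamp))]
```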
Step 2, carrying out Shi-Tomasi method characteristic point detection and Kanade-Lucas-Tomasi method tracking on corresponding reconstructed images input by the left and right event cameras:
the left and right camera images are tracked separately in step 2.
The embodiment adopts the Shi-Tomasi method to detect feature points in the reconstructed image: a fixed window is moved through the image along arbitrary directions, and whether a pixel is a corner point is judged from the grey-level change of the image inside the window, thereby realizing feature point detection. The Kanade-Lucas-Tomasi method, abbreviated KLT, is used for optical-flow tracking: under the assumption that the brightness of the same object remains constant over a short time, the optical flow of the feature points is computed and used to track them.
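A possible realization of step 2 with OpenCV's standard Shi-Tomasi detector and pyramidal Lucas-Kanade (KLT) tracker is sketched below on two consecutive reconstructed event frames; the parameter values are illustrative and not taken from the patent.

```python
import cv2
import numpy as np

def detect_and_track(prev_frame, curr_frame):
    """Shi-Tomasi corners on prev_frame, tracked into curr_frame with pyramidal KLT."""
    pts_prev = cv2.goodFeaturesToTrack(prev_frame, maxCorners=200,
                                       qualityLevel=0.01, minDistance=7)
    if pts_prev is None:
        return np.empty((0, 2)), np.empty((0, 2))
    pts_curr, status, _ = cv2.calcOpticalFlowPyrLK(prev_frame, curr_frame, pts_prev, None,
                                                   winSize=(21, 21), maxLevel=3)
    good = status.ravel() == 1
    return pts_prev[good].reshape(-1, 2), pts_curr[good].reshape(-1, 2)
```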
Step 3, triangulating the detected and tracked feature points to calculate the corresponding three-dimensional coordinate points of the target and the pose change between images, and calculating the pose of the camera by using a PnP method:
more accurate scene depth values are obtained more efficiently by binocular camera initialization at step 3. In an embodiment, step 3 comprises the following sub-steps:
step 3.1, calculating three-dimensional coordinates of the feature points by using triangulation:
since the binocular camera knows the baseline distance between the two cameras, the absolute scale information can be obtained by triangulating the three-dimensional coordinates of the feature points detected in the reconstructed images of the left and right event cameras, and therefore, more accurate scene depth values can be calculated when the three-dimensional coordinates of the triangulated target from the feature point coordinates of the left and right cameras are calculated. As shown in FIG. 5, the coordinate of a point P in space in the world coordinate system is X, which is represented by O1The coordinate in the imaging plane of the camera 1 being the optical center is X1In the presence of O2The coordinate in the imaging plane of the camera 2 being the optical center is X2,R1、T1Is a rotation matrix and a translation matrix of the camera 1 relative to the initial pose, and has the same principle of R2、T2For the rotation matrix and translation matrix of camera 2 with respect to the initial pose, R, T is the rotation and translation matrix between cameras 1 and 2, and equation 3 can be derived from the camera imaging model.
Figure BDA0002573438640000061
Wherein K is a camera internal reference matrix s1And s2The distances from the optical centers of the camera 1 and the camera 2 to the target point P are approximately equal, the formula 3 is transformed to obtain a formula 4, and the three-dimensional point coordinates of the target point P can be obtained by utilizing SVD (singular value decomposition) decomposition solution. Wherein, the simplification mark K-1X2=X′2
Figure BDA0002573438640000062
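The SVD-based linear triangulation can be sketched as follows, assuming the two projection matrices P1 = K[R1 T1] and P2 = K[R2 T2] have been formed; this is the standard DLT formulation, and the exact arrangement used in the patent may differ.

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """3D point from pixel match x1 (camera 1) and x2 (camera 2); P1, P2 are 3x4 projection matrices."""
    A = np.vstack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]        # inhomogeneous coordinates of the target point P
```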
Step 3.2, calculating the pose of the camera by using a PnP method:
the PnP (Passive-n-Point) method is used for solving the problem of estimation of the camera pose when three-dimensional space Point coordinates under a known partial world coordinate system and two-dimensional camera coordinate systems of the three-dimensional space Point coordinates and the two-dimensional camera coordinate system coordinates are known. In the invention, the two-dimensional image coordinates and the three-dimensional space coordinates of the known feature points in the step 2 and the step 3.1 are used for solving the pose change of a left camera continuously inputting two frames of images by triangulation, when the camera moves to a new position to obtain a new third frame of event frame, because the translation vector T in the relative transformation obtained by triangulation does not have a real scale, if the pose of the new camera is continuously solved by the triangulation, only a certain relative pose can be obtained, and thus the scales between the 3 camera poses (the camera poses corresponding to the first frame, the second frame and the third frame) are inconsistent. Therefore, for the subsequent camera pose, the relation between the three-dimensional coordinates and the two-dimensional pixel coordinates of the feature points can be utilized for solving, namely the PnP method. In the invention, a PnP method is utilized, three-dimensional coordinates of feature points are calculated by triangularization of left and right camera images of a first frame, two-dimensional image coordinates corresponding to the feature points of two continuous frames are detected by the left event camera feature in step 2, at least 6 groups of three-dimensional and two-dimensional matched points are selected for PnP pose resolving, and then camera pose transformation can be obtained.
Step 4, back-end BA (Bundle Adjustment) optimization is performed in combination with IMU pre-integration to obtain the camera motion trajectory and scene mapping information:
Step 4.1, IMU pre-integration. The IMU outputs the three-axis acceleration $\hat{a}_t$ and angular velocity $\hat{\omega}_t$ of the sensor at a high frequency. However, because of the IMU's own bias and noise, there is a certain difference between the output measurements and the true values. The relationship between the IMU measurements and the true values can be represented by formula (5):

$$\hat{a}_t = a_t + b_{a_t} + R_w^t\, g^w + n_a, \qquad \hat{\omega}_t = \omega_t + b_{\omega_t} + n_\omega \qquad (5)$$

where $\hat{a}_t$ and $\hat{\omega}_t$ are the measured acceleration and angular velocity and $a_t$ and $\omega_t$ are the corresponding true values. $b_{a_t}$ and $b_{\omega_t}$ are the biases of the acceleration and angular velocity, which obey a random-walk model. $n_a$ and $n_\omega$ are the noise of the acceleration and angular velocity: $n_a$ is the three-axis acceleration noise, obeying a Gaussian normal distribution with mean 0 and variance $\sigma_a^2$, and $n_\omega$ is the angular velocity noise, obeying a Gaussian normal distribution with mean 0 and variance $\sigma_\omega^2$. $R_w^t$ is the rotation matrix from the world coordinate system to the camera coordinate system at time t, and $g^w$ is the gravitational acceleration in the world coordinate system.
Integrating the IMU observations between the event frame b_k at time t_k and the event frame b_{k+1} at time t_{k+1} gives the translation $p^w_{b_{k+1}}$, velocity $v^w_{b_{k+1}}$ and rotation $q^w_{b_{k+1}}$ of frame b_{k+1} in the world coordinate system, from which the pose change of the camera can be obtained.
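The following sketch illustrates plain numeric integration of bias-corrected IMU samples between two event frames, in the spirit of equation (5) and the paragraph above; a full pre-integration with noise propagation is omitted, and the sample layout (t, ax, ay, az, wx, wy, wz) is an assumption of this example.

```python
import numpy as np
from scipy.spatial.transform import Rotation

def integrate_imu(samples, R_wb, p_wb, v_wb, b_a, b_w, g_w=np.array([0.0, 0.0, -9.81])):
    """Propagate rotation R_wb, position p_wb and velocity v_wb over IMU samples."""
    for k in range(len(samples) - 1):
        dt = samples[k + 1, 0] - samples[k, 0]
        acc = samples[k, 1:4] - b_a            # bias-corrected specific force (body frame)
        gyr = samples[k, 4:7] - b_w            # bias-corrected angular velocity
        a_w = R_wb @ acc + g_w                 # acceleration in the world frame
        p_wb = p_wb + v_wb * dt + 0.5 * a_w * dt ** 2
        v_wb = v_wb + a_w * dt
        R_wb = R_wb @ Rotation.from_rotvec(gyr * dt).as_matrix()
    return R_wb, p_wb, v_wb
```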
Step 4.2, back-end BA optimization. As more new camera poses keep appearing, the relative poses obtained by solving accumulate a certain error. At this point the overall poses and the three-dimensional point cloud are usually adjusted by the BA optimization method. In essence, BA is an optimization model: it refines the poses obtained by PnP solving and the three-dimensional world coordinates of the feature points obtained by triangulation by minimizing the reprojection error.
Assuming there are n camera poses and m feature points whose three-dimensional world coordinates are to be solved, the BA optimization problem can be represented by formula (6):

$$\min_{P_i,\,X_j} \; \sum_{i=1}^{n} \sum_{j=1}^{m} \chi_{ij} \,\bigl\| x_{ij} - \pi(K, P_i, X_j) \bigr\|^2 \qquad (6)$$

where the coefficient $\chi_{ij}$ is 1 when camera c_i observes feature point j and 0 otherwise, $x_{ij}$ is the two-dimensional pixel coordinate of feature point j observed by camera c_i, K is the camera intrinsic matrix, $P_i$ is the pose of the i-th camera c_i, $X_j$ is the three-dimensional world coordinate of feature point j, and $\pi(\cdot)$ denotes its projection into camera c_i. FIG. 6 illustrates the case of 3 camera poses observing three-dimensional feature points, where C_1, C_2, C_3 are the camera observation points, X_1, X_2, X_3 are the three-dimensional coordinates of the feature points, x_11 is the two-dimensional image coordinate of a three-dimensional feature point observed at C_1, and the other two-dimensional image coordinates x_12, x_13 are defined likewise.
The BA optimization problem is usually solved with the LM (Levenberg-Marquardt) method, optimizing the camera poses P and the three-dimensional feature point coordinates X. At this point all the steps of this patent are complete, and the camera's own motion poses and the map constructed from the feature points are obtained.
Fig. 7 is the flowchart of this patent: the left and right input event camera data are motion-compensated and time-aligned to reconstruct image forms similar to those of a conventional camera; the detection and tracking of the corresponding feature points on the reconstructed images, the PnP pose calculation, the triangulated depth calculation, the IMU pre-integration and the back-end optimization can then be completed with existing methods, and finally the map consisting of the camera's own poses and the optimized feature point coordinates is output.
In specific implementation, the method can be realized as an automatic process using computer software technology, and a corresponding system device implementing the method process also falls within the protection scope of the invention.
It should be understood that the above-mentioned embodiments are described in some detail, and not intended to limit the scope of the invention, and those skilled in the art will be able to make alterations and modifications without departing from the scope of the invention as defined by the appended claims.

Claims (5)

1. A SLAM method based on a binocular event camera is characterized by comprising the following steps:
step 1, performing motion compensation on input left and right event camera data by utilizing IMU assistance to obtain corresponding reconstructed images; the motion compensation using IMU assistance is implemented as follows,
setting the event frame start time as t_ref; when motion compensation is performed, the start of the event frame is taken as the reference frame; for an event point e_j in the accumulation window, noting its corresponding timestamp as t_j, the relative pose between t_ref and t_j is obtained by IMU integration and applied as the transform $T_{\mathrm{ref} \leftarrow t_j}$ that maps coordinates at time t_j into the reference frame; the coordinate x_j of e_j is projected into the reference coordinate system as x'_j:

$$x'_j = \pi\!\left(K\, T_{\mathrm{ref} \leftarrow t_j}\, Z(x_j)\, K^{-1}\, \tilde{x}_j\right)$$

wherein K is the camera intrinsic matrix, K^{-1} is its inverse, $\tilde{x}_j$ is the homogeneous pixel coordinate of x_j, $\pi(\cdot)$ is the perspective projection onto the image plane, and Z(x_j) is the depth of the event point, which is replaced by the median of the depths of the adjacent three-dimensional space points;
step 2, respectively carrying out feature point detection and tracking on corresponding reconstructed images input by the left event camera and the right event camera;
step 3, according to the result obtained in the step 2, triangularization calculation is carried out on the detected and tracked feature points to obtain three-dimensional coordinate points corresponding to the target and pose changes among images, and the camera pose is calculated by using a PnP method;
and 4, performing back-end BA optimization by combining IMU pre-integration to obtain a camera motion track and scene mapping information.
2. The binocular event camera based SLAM method of claim 1, wherein: the time alignment is performed when the image is reconstructed in step 1, which is realized as follows,
1) the left event camera accumulates the event points within a 30 ms window into a binary image frame, motion compensation is performed, and the time of the first event point within the 30 ms window is taken as the timestamp of the left event camera's reconstructed image frame;
2) in the event stream data of the right event camera, the event point whose time is closest to the timestamp of the left camera's reconstructed image frame is searched for; taking the time of the found event point as the start time, the right event camera accumulates the event points of a 30 ms window into a binary image frame, and motion compensation is performed.
3. The binocular event camera based SLAM method of claim 1 or 2, wherein: in step 2, a Shi-Tomasi method is adopted to realize feature point detection.
4. The binocular event camera based SLAM method of claim 1 or 2, wherein: in the step 2, tracking is realized by adopting a Kanade-Lucas-Tomasi method.
5. A SLAM system based on a binocular event camera, characterized in that: the system is configured to perform the binocular event camera based SLAM method according to any one of claims 1 to 4.
CN202010647021.8A 2020-07-07 2020-07-07 SLAM method and system based on binocular event camera Pending CN111899276A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010647021.8A CN111899276A (en) 2020-07-07 2020-07-07 SLAM method and system based on binocular event camera

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010647021.8A CN111899276A (en) 2020-07-07 2020-07-07 SLAM method and system based on binocular event camera

Publications (1)

Publication Number Publication Date
CN111899276A true CN111899276A (en) 2020-11-06

Family

ID=73191664

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010647021.8A Pending CN111899276A (en) 2020-07-07 2020-07-07 SLAM method and system based on binocular event camera

Country Status (1)

Country Link
CN (1) CN111899276A (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112631314A (en) * 2021-03-15 2021-04-09 季华实验室 Robot control method and system based on multi-line laser radar and event camera SLAM
CN112809679A (en) * 2021-01-25 2021-05-18 清华大学深圳国际研究生院 Method and device for grabbing deformable object and computer readable storage medium
CN112967316A (en) * 2021-03-05 2021-06-15 中国科学技术大学 Motion compensation optimization method and system for 3D multi-target tracking
CN114022949A (en) * 2021-09-27 2022-02-08 中国电子科技南湖研究院 Event camera motion compensation method and device based on motion model
CN115997234A (en) * 2020-12-31 2023-04-21 华为技术有限公司 Pose estimation method and related device
CN116389682A (en) * 2023-03-07 2023-07-04 华中科技大学 Dual-event camera synchronous acquisition system and noise event suppression method
CN117372548A (en) * 2023-12-06 2024-01-09 北京水木东方医用机器人技术创新中心有限公司 Tracking system and camera alignment method, device, equipment and storage medium
CN117739996A (en) * 2024-02-21 2024-03-22 西北工业大学 Autonomous positioning method based on event camera inertial tight coupling

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108665540A (en) * 2018-03-16 2018-10-16 浙江工业大学 Robot localization based on binocular vision feature and IMU information and map structuring system
CN110415344A (en) * 2019-06-24 2019-11-05 武汉大学 Motion compensation process based on event camera
CN111340851A (en) * 2020-05-19 2020-06-26 北京数字绿土科技有限公司 SLAM method based on binocular vision and IMU fusion

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108665540A (en) * 2018-03-16 2018-10-16 浙江工业大学 Robot localization based on binocular vision feature and IMU information and map structuring system
CN110415344A (en) * 2019-06-24 2019-11-05 武汉大学 Motion compensation process based on event camera
CN111340851A (en) * 2020-05-19 2020-06-26 北京数字绿土科技有限公司 SLAM method based on binocular vision and IMU fusion

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115997234A (en) * 2020-12-31 2023-04-21 华为技术有限公司 Pose estimation method and related device
CN112809679A (en) * 2021-01-25 2021-05-18 清华大学深圳国际研究生院 Method and device for grabbing deformable object and computer readable storage medium
CN112967316A (en) * 2021-03-05 2021-06-15 中国科学技术大学 Motion compensation optimization method and system for 3D multi-target tracking
CN112967316B (en) * 2021-03-05 2022-09-06 中国科学技术大学 Motion compensation optimization method and system for 3D multi-target tracking
CN112631314A (en) * 2021-03-15 2021-04-09 季华实验室 Robot control method and system based on multi-line laser radar and event camera SLAM
CN112631314B (en) * 2021-03-15 2021-06-04 季华实验室 Robot control method and system based on multi-line laser radar and event camera SLAM
CN114022949A (en) * 2021-09-27 2022-02-08 中国电子科技南湖研究院 Event camera motion compensation method and device based on motion model
CN116389682A (en) * 2023-03-07 2023-07-04 华中科技大学 Dual-event camera synchronous acquisition system and noise event suppression method
CN116389682B (en) * 2023-03-07 2024-02-06 华中科技大学 Dual-event camera synchronous acquisition system and noise event suppression method
CN117372548A (en) * 2023-12-06 2024-01-09 北京水木东方医用机器人技术创新中心有限公司 Tracking system and camera alignment method, device, equipment and storage medium
CN117372548B (en) * 2023-12-06 2024-03-22 北京水木东方医用机器人技术创新中心有限公司 Tracking system and camera alignment method, device, equipment and storage medium
CN117739996A (en) * 2024-02-21 2024-03-22 西北工业大学 Autonomous positioning method based on event camera inertial tight coupling
CN117739996B (en) * 2024-02-21 2024-04-30 西北工业大学 Autonomous positioning method based on event camera inertial tight coupling

Similar Documents

Publication Publication Date Title
CN111899276A (en) SLAM method and system based on binocular event camera
CN110070615B (en) Multi-camera cooperation-based panoramic vision SLAM method
Zhu et al. The multivehicle stereo event camera dataset: An event camera dataset for 3D perception
JP6768156B2 (en) Virtually enhanced visual simultaneous positioning and mapping systems and methods
CN107888828B (en) Space positioning method and device, electronic device, and storage medium
US10825197B2 (en) Three dimensional position estimation mechanism
US10260862B2 (en) Pose estimation using sensors
CN110533719B (en) Augmented reality positioning method and device based on environment visual feature point identification technology
WO2018142496A1 (en) Three-dimensional measuring device
CN110310362A (en) High dynamic scene three-dimensional reconstruction method, system based on depth map and IMU
CN109540126A (en) A kind of inertia visual combination air navigation aid based on optical flow method
CN108700946A (en) System and method for parallel ranging and fault detect and the recovery of building figure
WO2015134795A2 (en) Method and system for 3d capture based on structure from motion with pose detection tool
US11262837B2 (en) Dual-precision sensor system using high-precision sensor data to train low-precision sensor data for object localization in a virtual environment
CN110139031B (en) Video anti-shake system based on inertial sensing and working method thereof
KR20150013709A (en) A system for mixing or compositing in real-time, computer generated 3d objects and a video feed from a film camera
US20110249095A1 (en) Image composition apparatus and method thereof
CN111798485B (en) Event camera optical flow estimation method and system enhanced by IMU
KR20180030446A (en) Method and device for blurring a virtual object in a video
CN110544278B (en) Rigid body motion capture method and device and AGV pose capture system
Bapat et al. Towards kilo-hertz 6-dof visual tracking using an egocentric cluster of rolling shutter cameras
CN118135526A (en) Visual target recognition and positioning method for four-rotor unmanned aerial vehicle based on binocular camera
Huttunen et al. A monocular camera gyroscope
CN112432653B (en) Monocular vision inertial odometer method based on dotted line characteristics
US10540809B2 (en) Methods and apparatus for tracking a light source in an environment surrounding a device

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20201106