CN115375581A - Dynamic visual event stream noise reduction effect evaluation method based on event time-space synchronization


Info

Publication number
CN115375581A
Authority
CN
China
Prior art keywords
event
pixel
noise reduction
time
event stream
Prior art date
Legal status
Pending
Application number
CN202211076662.8A
Other languages
Chinese (zh)
Inventor
王立辉
许宁徽
Current Assignee
Southeast University
Original Assignee
Southeast University
Priority date
Filing date
Publication date
Application filed by Southeast University
Priority to CN202211076662.8A
Publication of CN115375581A
Pending legal-status Current

Classifications

    • G06T5/73
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T17/05Geographic models
    • G06T3/06
    • G06T5/70
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T90/00Enabling technologies or technologies with a potential or indirect contribution to GHG emissions mitigation

Abstract

The dynamic visual event stream noise reduction effect evaluation method based on event time-space synchronization comprises the following steps: 1. read the event stream output by a dynamic vision sensor and acquire the pose information of the sensor; 2. perform three-dimensional reconstruction using the event stream and the pose information, synchronizing events triggered at different moments to a reference moment in space and time to obtain a confidence map; 3. convert the confidence map into an event probability map, which represents the response probability of the dynamic vision sensor to the scene under ideal conditions; 4. compute the rationality of the event stream from its consistency with the event probability map; 5. compute the improvement in event-stream rationality achieved by a noise reduction algorithm to obtain a noise-reduction precision index for evaluating and comparing the noise reduction effects of different algorithms. By exploiting the high-frequency advantage of the dynamic vision sensor together with pose information, the method achieves an objective evaluation of event-stream noise reduction precision even when the specific noise distribution and the reference event stream are unknown.

Description

Dynamic visual event stream noise reduction effect evaluation method based on event time-space synchronization
Technical Field
The invention belongs to the technical field of sensor signal processing, and particularly relates to a dynamic visual event stream noise reduction effect evaluation method based on event time-space synchronization.
Background
A Dynamic Vision Sensor (DVS), also called an event camera, is a biologically inspired sensor that mimics the visual system of natural organisms and operates in a completely different manner from conventional cameras. Instead of outputting images at a fixed rate, each pixel of a dynamic vision sensor responds asynchronously to illumination changes and fires an "event" with ultra-low latency (less than 1 microsecond) when the logarithmic change in brightness reaches a preset threshold. Each event is represented by a four-dimensional vector e(x, y, t, p) containing the pixel coordinates (x, y) of the event, the trigger time t, and a polarity p ∈ {1, -1} indicating an increase or decrease in brightness at that pixel. Because the DVS outputs only information related to local brightness changes, it offers fast response, ultra-low latency, high dynamic range, capture of dynamic changes only, and low power consumption, overcoming the high redundancy, low frame rate, large delay, and low dynamic range of conventional cameras; it is therefore widely applied in fields such as autonomous driving and robotics.
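As a concrete illustration of this four-tuple event representation (not part of the patent), the sketch below stores events as a structured NumPy array; the field names, dtypes, and sensor resolution are assumptions.

```python
import numpy as np

# Hypothetical event record e = (x, y, t, p): pixel coordinates, timestamp, polarity.
event_dtype = np.dtype([("x", np.uint16),   # pixel column
                        ("y", np.uint16),   # pixel row
                        ("t", np.float64),  # timestamp in seconds
                        ("p", np.int8)])    # polarity, +1 (brighter) or -1 (darker)

# Three example events on an assumed 346x260 sensor.
events = np.array([(120, 80, 0.000001, 1),
                   (121, 80, 0.000004, -1),
                   (120, 81, 0.000007, 1)], dtype=event_dtype)
```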
Due to its circuit structure, the dynamic vision sensor is very sensitive to changes in ambient brightness and is limited by the hardware level, so the asynchronous event stream it outputs contains a large amount of noise interference. The noise may originate from impulse noise during digital signal transmission, Gaussian noise caused by the photodiodes, and other sources, and it greatly affects further application and visualization of the event stream. Noise reduction of the event stream is therefore a crucial step and has become an important research topic in the field of dynamic vision sensors.
However, because the data volume of the DVS is extremely large (millions of events output per second), it is difficult to manually label the validity of each event, and neither the specific noise distribution nor a reference event stream can be obtained. There is consequently no effective method for measuring the noise reduction effect of an event stream or for comparing the noise reduction effects of different algorithms, which restricts further development and application of event-stream noise reduction algorithms and dynamic vision sensors.
Disclosure of Invention
To address these problems, the invention provides a dynamic visual event stream noise reduction effect evaluation method based on event time-space synchronization, which exploits the high-frequency advantage of the dynamic vision sensor together with pose information to achieve an objective evaluation of the event-stream noise reduction effect even when the specific noise distribution and the reference event stream are unknown.
The invention provides a dynamic visual event stream noise reduction effect evaluation method based on event time-space synchronization, which comprises the following specific steps:
step 1: read the event stream output by a dynamic vision sensor DVS, and acquire the pose information of the DVS from a motion capture system, visual odometry, inertial navigation, or an indoor positioning method;
step 2: based on an Event-based Multi View Stereo algorithm, reconstruct the actual scene in three dimensions using the event stream together with the pose information, projecting events triggered at different moments to a reference moment for space-time synchronization to obtain a confidence map; this sharpens real events and blurs noise events, and serves as the reference basis for noise reduction evaluation;
step 3: since local maxima and edge regions of the DSI usually correspond to intensity gradients in the scene and are therefore more likely to trigger events, after obtaining the confidence map c(x, y) at the reference moment t_r, convert it into an event probability map by computing, for each pixel, the probability of being a local maximum or an edge; this map represents, under ideal conditions, the probability that the real scene triggers an event on each DVS pixel at time t_r;
step 4: compute the rationality of the event stream based on its consistency with the event probability map at the reference moment t_r;
step 5: a high-precision event-stream denoising method removes noise events of low rationality and keeps valid events of high rationality, thereby improving the overall rationality of the event stream; therefore, the improvement in event-stream rationality brought by a noise reduction algorithm is calculated by comparing the rationality of the event stream before and after noise reduction, yielding a noise-reduction precision index for evaluating and comparing the noise reduction effects of different algorithms:
Figure BDA0003831744010000021
where e_original and e_denoised denote the event streams before and after noise reduction, respectively; the higher the precision index, the more markedly the noise reduction algorithm improves the rationality of the event stream and the better its noise reduction effect.
As a further improvement of the invention, the three-dimensional reconstruction using the event stream and the pose information in step 2 comprises the following processes:
before three-dimensional reconstruction, repeated events on each pixel are detected and only the first trigger event is retained:
IE = {e_i(x_i, y_i, t_i) | (t_i - t_{i-1}) > τ_IE ∧ (t_{i+1} - t_i) < τ_IE}
where IE denotes the first event of a repeated-trigger sequence, t_i denotes the timestamp of the i-th event triggered on a given pixel, and the time threshold parameter τ_IE is set to 20 ms;
then, three-dimensional reconstruction based on the event is carried out to generate a confidence map, and the method specifically comprises the following steps:
(2-1) select the observation view at the reference moment as the reference view, and discretize the observed space along the optical-axis direction of the reference view into a grid to construct a disparity space image DSI; the DSI discretizes the reference view into N depth planes
Figure BDA0003831744010000022
each depth plane is divided into w × h spatial cells, consistent with the pixel resolution of the DVS, so the DSI is divided into w × h × N voxel cells, where N is set to 100;
(2-2) project all events from the pixel plane into the disparity space image DSI according to the poses at their corresponding moments, and count, for each voxel of the DSI, the number of intersections with the event back-projection rays; the more intersections, the more often the DVS has observed and responded to the corresponding region, the greater the probability that the voxel contains a scene edge, and correspondingly the greater the probability of triggering an event on the DVS at the reference view;
during event projection, valid events triggered continuously at high frequency by the actual scene are automatically synchronized to their corresponding spatial positions, whereas noise in the event stream lacks spatio-temporal persistence, cannot accumulate votes in any fixed region of the DSI, and is diluted by the valid events;
finally, record, for each pixel of the reference view, the maximum DSI value along the optical-axis direction to obtain the confidence map under the reference view.
As a further improvement of the invention, the step of projecting the event from the pixel plane to the DSI in step 2 is as follows:
the intersection cells between the event back-projection ray and the depth planes of the DSI are solved by homography; each depth plane
Figure BDA0003831744010000031
is given by:
Z_i = [n, d_i]^T = [(0, 0, 1), z_i]^T
where n and z_i are the normal vector and the depth of each plane, respectively;
in the projection process, for each event e_i(x_i, y_i), the relative pose [R | t] between its observation moment and the reference moment is first used to compute the homography matrix relative to the Z_0 plane
Figure BDA0003831744010000032
Figure BDA0003831744010000033
then, combined with the projection matrix P of the DVS, the projection coordinates of the event on the Z_0 plane are obtained from its pixel coordinates through the homography transformation:
Figure BDA0003831744010000034
Figure BDA0003831744010000035
where (x_i, y_i) and (x(z_0), y(z_0)) are the pixel coordinates of the event and its projection coordinates on the Z_0 plane, respectively;
the projection coordinates of the event on the remaining depth planes of the DSI are again obtained by homography transformation from its coordinates on the Z_0 plane:
Figure BDA0003831744010000036
where
Figure BDA0003831744010000037
which simplifies to the coordinates of the event on the Z_i plane:
Figure BDA0003831744010000038
where (c_x, c_y, c_z)^T = -R^T t is the position of the DVS relative to the reference moment.
As a further improvement of the present invention, the step 3 of converting the confidence map into the event probability map comprises the following processes:
(3-1) for each pixel (x, y) in the confidence map, use its spatial proximity and pixel similarity to each pixel (x_i, y_i) ∈ Ω in the neighborhood window Ω to construct a spatial-domain Gaussian kernel G_d and a range-domain Gaussian kernel G_r:
Figure BDA0003831744010000041
Figure BDA0003831744010000042
where
Figure BDA0003831744010000043
(3-2) then take the normalized product of the spatial-domain and range-domain Gaussian kernels as the weight W(x_i, y_i), perform a weighted fusion of all pixels in the window Ω to obtain an adaptive threshold T(x, y), compare it with the confidence value c(x, y) at the central pixel, and compute the probability that pixel (x, y) is a DSI local maximum or edge region, which represents the event trigger probability p(x, y):
Figure BDA0003831744010000044
Figure BDA0003831744010000045
Figure BDA0003831744010000046
the window size is set to 7 × 7, and the above steps are repeated for all pixels of the confidence map to obtain the event probability map at the reference moment.
As a further improvement of the present invention, the event stream rationality calculation in step 4 includes the following processes:
(4-1) for the event stream e_i(x_i, y_i, t_i), use I: Z^2 → {0, 1} to represent the event-trigger status on the dynamic vision sensor pixel plane Z^2 within a time range around the reference moment:
Figure BDA0003831744010000047
where τ denotes the time range, and 1 and 0 indicate, respectively, the presence and absence of an event at the pixel during [t_r - τ, t_r + τ];
(4-2) when I(x, y) = 1, construct an exponential decay kernel from the time distance between the event timestamp at the corresponding pixel and the reference moment, representing the temporal correlation between the event stream at that pixel and the event probability map:
Figure BDA0003831744010000051
where the decay rate parameter δ_t is set to 20 ms; the rationality of the event stream triggering an event at a pixel is quantified by the product of the event probability p(x, y) and Γ(x, y) at that location, and the greater the rationality, the more likely the event is a valid signal triggered by the actual scene;
when I(x, y) = 0, the inverse event probability on the event probability map is used:
Figure BDA0003831744010000052
to represent the rationality of the pixel not triggering an event;
(4-3) the log rationality of the event stream on pixel (x, y) during [t_r - τ, t_r + τ] is thus obtained:
Figure BDA0003831744010000053
the log rationality is computed separately for all pixel positions on the pixel plane Z^2 to obtain the rationality of the event stream e_i:
Figure BDA0003831744010000054
the smaller logP(e_i) is, the better the consistency between the event stream e_i and the event probability map, and the higher its rationality.
Beneficial effects:
The invention fully exploits the high-frequency characteristic of the dynamic vision sensor and synchronizes the valid events continuously triggered by the actual scene on the pixel plane into the three-dimensional space at the reference moment, so that valid events are sharpened and highlighted while noise events are blurred and filtered; the rationality of the event stream and the noise-reduction precision of an algorithm can thus be quantified, and the noise reduction effect of the event stream can be evaluated objectively even when the specific noise distribution and the reference event stream are unknown.
Drawings
Fig. 1 is a flowchart of an event stream noise reduction effect evaluation method provided by the present invention.
Detailed Description
The invention is described in further detail below with reference to the following detailed description and accompanying drawings:
the invention discloses a dynamic visual event stream noise reduction effect evaluation method based on event time-space synchronization, the flow of the method is shown in figure 1, and the method specifically comprises the following steps:
step 1: and reading an event stream output by a Dynamic Vision Sensor (DVS), and acquiring the pose information of the DVS by using a motion capture system, a visual odometer, inertial navigation or indoor positioning and other methods.
Step 2: based on an Event-based multiview Stereo (EMVS) algorithm, an actual scene is reconstructed in a three-dimensional mode by using Event streams and pose information, events triggered at different moments are projected to reference moments to be subjected to space-time synchronization, a confidence map is obtained, sharpening of real events and blurring of noise events are achieved, and the sharpened events and the blurred events are used as a reference basis for noise reduction.
Since repeated events on the same pixel can be repeatedly projected to a reference moment in a short time, so that the construction of a confidence map and the subsequent evaluation of event effectiveness are influenced, before three-dimensional reconstruction is carried out, the repeated events on each pixel are detected, and only the first trigger event is reserved:
IE={e i (x i ,y i ,t i )(t i -t i-1 )>τ IE ^(t i+1 -t i )<τ IE }
wherein IE represents the first event in the repeated trigger event and represents t i Time stamp of the i-th event triggered on a certain pixel, time threshold parameter t IE Set to 20ms.
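As a minimal illustration of this pre-filtering rule (a sketch, not the patent's implementation), the snippet below groups events by pixel and keeps an event only when the gap to the previous event on that pixel exceeds τ_IE while the gap to the next one is below it, with τ_IE = 20 ms as stated above; the event tuple layout and the handling of sequence boundaries are assumptions.

```python
from collections import defaultdict

def keep_first_of_repeats(events, tau_ie=0.020):
    """events: iterable of (x, y, t, p) tuples with t in seconds, sorted by t."""
    per_pixel = defaultdict(list)
    for e in events:
        per_pixel[(e[0], e[1])].append(e)

    kept = []
    for evs in per_pixel.values():
        for i, e in enumerate(evs):
            gap_prev = e[2] - evs[i - 1][2] if i > 0 else float("inf")
            gap_next = evs[i + 1][2] - e[2] if i + 1 < len(evs) else float("inf")
            # First event of a burst: long quiet period before it, another event soon after.
            if gap_prev > tau_ie and gap_next < tau_ie:
                kept.append(e)
    return sorted(kept, key=lambda e: e[2])
```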
Then, three-dimensional reconstruction based on the event is carried out to generate a confidence map, and the method specifically comprises the following steps:
and (2-1) selecting an observation visual angle at a reference moment as a reference visual angle, and discretizing an observation camera system along the optical axis direction into a grid map to construct a Disparity Space Image (DSI). DSI discretizes reference views into N depth planes
Figure BDA0003831744010000061
Each depth plane is divided into w × h spatial cells, consistent with the pixel resolution of DVS, and thus DSI is divided into w × h × N spatial voxel cells. Where N is set to 100.
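For concreteness, a small sketch of allocating such a DSI is given below; the sensor resolution, depth range, and inverse-depth sampling of the planes are illustrative assumptions, since the text only specifies N = 100 depth planes and a w × h × N voxel grid.

```python
import numpy as np

w, h, N = 346, 260, 100                      # assumed pixel resolution, N depth planes
z_min, z_max = 0.5, 5.0                      # assumed scene depth range in metres
# Depth planes sampled uniformly in inverse depth (a common EMVS choice, assumed here).
depths = 1.0 / np.linspace(1.0 / z_max, 1.0 / z_min, N)
dsi = np.zeros((h, w, N), dtype=np.uint32)   # vote count for each voxel cell
```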
(2-2) Project all events from the pixel plane into the DSI according to the poses at their corresponding moments, and count, for each voxel of the DSI, the number of intersections (also called "votes") with the event back-projection rays; the more intersections, the more often the DVS has observed and responded to the corresponding region, the greater the probability that the voxel contains a scene edge, and correspondingly the greater the probability of triggering an event on the DVS at the reference view.
The intersection cells between the event back-projection ray and the depth planes of the DSI are solved by homography; each depth plane
Figure BDA0003831744010000062
is given by:
Z_i = [n, d_i]^T = [(0, 0, 1), z_i]^T
where n and z_i are the normal vector and the depth of each plane, respectively.
In the projection process, for each event e_i(x_i, y_i), the relative pose [R | t] between its observation moment and the reference moment is first used to compute the homography matrix H_Z0 relative to the Z_0 plane:
Figure BDA0003831744010000063
Then, combined with the projection matrix P of the DVS, the projection coordinates of the event on the Z_0 plane are obtained from its pixel coordinates through the homography transformation:
Figure BDA0003831744010000064
Figure BDA0003831744010000071
where (x_i, y_i) and (x(z_0), y(z_0)) are the pixel coordinates of the event and its projection coordinates on the Z_0 plane, respectively.
The projection coordinates of the event on the remaining depth planes of the DSI are again obtained by homography transformation from its coordinates on the Z_0 plane:
Figure BDA0003831744010000072
where
Figure BDA0003831744010000073
which simplifies to the coordinates of the event on the Z_i plane:
Figure BDA0003831744010000074
where (c_x, c_y, c_z)^T = -R^T t is the position of the DVS relative to the reference moment.
During event projection and composition, valid events triggered continuously at high frequency by the actual scene are automatically synchronized to their corresponding spatial positions, whereas noise in the event stream lacks spatio-temporal persistence, cannot accumulate votes in any fixed region of the DSI, and is diluted by the valid events; the algorithm is therefore robust to noise and provides a reference for objectively evaluating the noise reduction effect of the event stream.
Finally, record, for each pixel of the reference view, the maximum DSI value along the optical-axis direction to obtain the confidence map under the reference view.
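The voting and confidence-map step can be sketched as follows; project_to_plane() is a hypothetical helper standing in for the homography chain above, and nearest-pixel rounding of the projected coordinates is an assumption.

```python
import numpy as np

def build_confidence_map(events, dsi, project_to_plane):
    """Accumulate back-projection votes in the DSI and take the per-pixel maximum
    along the depth axis to obtain the confidence map c(x, y)."""
    h, w, n_planes = dsi.shape
    for e in events:
        for i in range(n_planes):
            u, v = project_to_plane(e, i)          # event coordinates on depth plane Z_i
            u, v = int(round(u)), int(round(v))
            if 0 <= u < w and 0 <= v < h:
                dsi[v, u, i] += 1                  # one vote per intersected voxel
    return dsi.max(axis=2).astype(np.float64)      # confidence map under the reference view
```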
Step 3: since local maxima and edge regions of the DSI usually correspond to intensity gradients in the scene and are therefore more likely to trigger events, after obtaining the confidence map c(x, y) at the reference moment t_r, each pixel is converted into an event probability map by computing its probability of being a local maximum or an edge; this map represents, under ideal conditions, the probability that the real scene triggers an event on each DVS pixel at time t_r. The conversion comprises the following steps:
(3-1) For each pixel (x, y) in the confidence map, use its spatial proximity and pixel similarity to each pixel (x_i, y_i) ∈ Ω in the neighborhood window Ω to construct a spatial-domain Gaussian kernel G_d and a range-domain Gaussian kernel G_r:
Figure BDA0003831744010000075
Figure BDA0003831744010000076
where
Figure BDA0003831744010000077
(3-2) Then take the normalized product of the spatial-domain and range-domain Gaussian kernels as the weight W(x_i, y_i), perform a weighted fusion of all pixels in the window Ω to obtain an adaptive threshold T(x, y), compare it with the confidence value c(x, y) at the central pixel, and compute the probability that pixel (x, y) is a DSI local maximum or edge region, which represents the event trigger probability p(x, y):
Figure BDA0003831744010000081
Figure BDA0003831744010000082
Figure BDA0003831744010000083
The window size is set to 7 × 7. Repeating the above steps for all pixels of the confidence map yields the event probability map at the reference moment. By introducing spatial proximity and pixel similarity, the confidence map is thus converted into an event probability map that incorporates local scene information in accordance with the event-triggering characteristics.
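A hedged sketch of this conversion is shown below. The exact Gaussian-kernel expressions and the mapping from the difference c(x, y) − T(x, y) to a probability are given only as images in the original, so the forms used here (standard bilateral-filter kernels and a logistic squashing) are assumptions for illustration.

```python
import numpy as np

def event_probability_map(c, win=7, sigma_d=2.0, sigma_r=None, beta=1.0):
    """c: confidence map; returns an assumed event trigger probability p(x, y)."""
    h, w = c.shape
    r = win // 2
    sigma_r = sigma_r if sigma_r is not None else max(float(c.std()), 1e-6)
    # Spatial-domain Gaussian kernel over the window offsets (same for every pixel).
    yy, xx = np.mgrid[-r:r + 1, -r:r + 1]
    G_d = np.exp(-(xx**2 + yy**2) / (2 * sigma_d**2))
    p = np.zeros_like(c, dtype=np.float64)
    cp = np.pad(c, r, mode="edge")
    for y in range(h):
        for x in range(w):
            patch = cp[y:y + win, x:x + win]
            G_r = np.exp(-(patch - c[y, x])**2 / (2 * sigma_r**2))   # range-domain kernel
            W = G_d * G_r
            W /= W.sum()                                             # normalized weights
            T = (W * patch).sum()                                    # adaptive threshold T(x, y)
            p[y, x] = 1.0 / (1.0 + np.exp(-beta * (c[y, x] - T)))    # assumed c-vs-T mapping
    return p
```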
Step 4: compute the rationality of the event stream around the reference moment t_r based on its consistency with the event probability map, specifically as follows:
(4-1) For the event stream e_i(x_i, y_i, t_i), use I: Z^2 → {0, 1} to represent the event-trigger status on the dynamic vision sensor pixel plane Z^2 within a time range around the reference moment:
Figure BDA0003831744010000084
where τ denotes the time range, and 1 and 0 indicate, respectively, the presence and absence of an event at the pixel during [t_r - τ, t_r + τ].
(4-2) When I(x, y) = 1, construct an exponential decay kernel from the time distance between the event timestamp at the corresponding pixel and the reference moment, representing the temporal correlation between the event stream at that pixel and the event probability map:
Figure BDA0003831744010000085
The decay rate parameter δ_t is set to 20 ms. The rationality of the event stream triggering an event at a pixel is quantified by the product of the event probability p(x, y) and Γ(x, y) at that location; the greater the rationality, the more likely the event is a valid signal triggered by the actual scene.
When I(x, y) = 0, the inverse event probability on the event probability map is used:
Figure BDA0003831744010000086
to represent the rationality of the pixel not triggering an event.
(4-3) The log rationality of the event stream on pixel (x, y) during [t_r - τ, t_r + τ] is thus obtained:
Figure BDA0003831744010000091
The log rationality is computed separately for all pixel positions on the pixel plane Z^2 to obtain the rationality of the event stream e_i:
Figure BDA0003831744010000092
The smaller logP(e_i) is, the better the consistency between the event stream e_i and the event probability map, and the higher its rationality.
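The rationality computation can be sketched as below. τ and δ_t follow the values in the text (δ_t = 20 ms); how I(x, y), p(x, y), and Γ(x, y) are combined into logP(e_i) is given only as images in the original, so the negative-log form used here is an assumption consistent with the description (triggered pixels contribute the log of p·Γ, silent pixels the log of 1 − p, and smaller totals indicate better consistency).

```python
import numpy as np

def event_stream_rationality(events, p_map, t_r, tau=0.010, delta_t=0.020):
    """events: iterable of (x, y, t, p); p_map: event probability map at t_r."""
    h, w = p_map.shape
    nearest_dt = np.full((h, w), np.inf)          # time distance of closest event to t_r
    for x, y, t, _ in events:
        if abs(t - t_r) <= tau:
            nearest_dt[y, x] = min(nearest_dt[y, x], abs(t - t_r))

    triggered = np.isfinite(nearest_dt)           # I(x, y)
    gamma = np.exp(-nearest_dt / delta_t)         # exp(-inf) -> 0 where no event fired
    eps = 1e-12
    nll = -np.where(triggered,
                    np.log(p_map * gamma + eps),  # assumed term for triggered pixels
                    np.log(1.0 - p_map + eps))    # assumed term for silent pixels
    # Negative-log form assumed so that a smaller total indicates better consistency
    # with the event probability map, matching the description of logP(e_i).
    return float(nll.sum())
```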
Step 5: a high-precision event-stream denoising method removes noise events of low rationality and keeps valid events of high rationality, thereby improving the overall rationality of the event stream. Therefore, the improvement in event-stream rationality brought by a noise reduction algorithm is calculated by comparing the rationality of the event stream before and after noise reduction, yielding a noise-reduction precision index for evaluating and comparing the noise reduction effects of different algorithms:
Figure BDA0003831744010000093
where e_original and e_denoised denote the event streams before and after noise reduction, respectively; the higher the precision index, the more markedly the noise reduction algorithm improves the rationality of the event stream and the better its noise reduction effect.
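As a final illustration, one plausible form of the precision index is sketched below; the exact formula is given only as an image in the original, so this relative-improvement expression is an assumption that merely matches the stated behaviour (a larger index when denoising lowers logP(e_i) more).

```python
def denoising_precision_index(logP_original, logP_denoised):
    """Assumed relative-improvement form of the noise-reduction precision index."""
    return (logP_original - logP_denoised) / abs(logP_original)

# Usage with the rationality values computed above (hypothetical numbers):
# index = denoising_precision_index(logP_original=5.2e4, logP_denoised=3.9e4)
```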
The above description is only a preferred embodiment of the present invention, and is not intended to limit the present invention in any way, but any modifications or equivalent variations made according to the technical spirit of the present invention are within the scope of the present invention as claimed.

Claims (5)

1. The method for evaluating the noise reduction effect of the dynamic visual event stream based on event time-space synchronization comprises the following specific steps:
step 1: reading an event stream output by a dynamic vision sensor DVS, and acquiring the pose information of the DVS from a motion capture system, visual odometry, inertial navigation, or an indoor positioning method;
step 2: based on an Event-based Multi View Stereo algorithm, reconstructing the actual scene in three dimensions using the event stream together with the pose information, and projecting events triggered at different moments to a reference moment for space-time synchronization to obtain a confidence map, which sharpens real events and blurs noise events and serves as the reference basis for noise reduction evaluation;
step 3: since local maxima and edge regions of the DSI usually correspond to intensity gradients in the scene and are therefore more likely to trigger events, after obtaining the confidence map c(x, y) at the reference moment t_r, converting each pixel into an event probability map by computing its probability of being a local maximum or an edge, the map representing, under ideal conditions, the probability that the real scene triggers an event on each DVS pixel at time t_r;
step 4: computing the rationality of the event stream based on its consistency with the event probability map at the reference moment t_r;
step 5: a high-precision event-stream denoising method removes noise events of low rationality and keeps valid events of high rationality, thereby improving the overall rationality of the event stream; therefore, the improvement in event-stream rationality brought by the noise reduction algorithm is calculated by comparing the rationality of the event stream before and after noise reduction, yielding a noise-reduction precision index for evaluating and comparing the noise reduction effects of different algorithms:
Figure FDA0003831743000000011
where e_original and e_denoised denote the event streams before and after noise reduction, respectively; the higher the precision index, the more markedly the noise reduction algorithm improves the rationality of the event stream and the better its noise reduction effect.
2. The method for evaluating the noise reduction effect of the dynamic visual event stream based on event space-time synchronization as claimed in claim 1, wherein the three-dimensional reconstruction using the event stream and the pose information in step 2 comprises the following processes:
before three-dimensional reconstruction, repeated events on each pixel are detected and only the first trigger event is retained:
IE = {e_i(x_i, y_i, t_i) | (t_i - t_{i-1}) > τ_IE ∧ (t_{i+1} - t_i) < τ_IE}
where IE denotes the first event of a repeated-trigger sequence, t_i denotes the timestamp of the i-th event triggered on a given pixel, and the time threshold parameter τ_IE is set to 20 ms;
then, event-based three-dimensional reconstruction is carried out to generate a confidence map, with the following specific steps:
(2-1) select the observation view at the reference moment as the reference view, and discretize the observed space along the optical-axis direction of the reference view into a grid to construct a disparity space image DSI; the DSI discretizes the reference view into N depth planes
Figure FDA0003831743000000021
each depth plane is divided into w × h spatial cells, consistent with the pixel resolution of the DVS, so the DSI is divided into w × h × N voxel cells, where N is set to 100;
(2-2) project all events from the pixel plane into the disparity space image DSI according to the poses at their corresponding moments, and count, for each voxel of the DSI, the number of intersections with the event back-projection rays; the more intersections, the more often the DVS has observed and responded to the corresponding region, the greater the probability that the voxel contains a scene edge, and correspondingly the greater the probability of triggering an event on the DVS at the reference view;
during event projection, valid events triggered continuously at high frequency by the actual scene are automatically synchronized to their corresponding spatial positions, whereas noise in the event stream lacks spatio-temporal persistence, cannot accumulate votes in any fixed region of the DSI, and is diluted by the valid events;
finally, record, for each pixel of the reference view, the maximum DSI value along the optical-axis direction to obtain the confidence map under the reference view.
3. The method for evaluating the noise reduction effect of the dynamic visual event stream based on event space-time synchronization as claimed in claim 2, wherein the step of projecting the event from the pixel plane to the DSI in step 2 is as follows:
the intersection cells between the event back-projection ray and the depth planes of the DSI are solved by homography; each depth plane
Figure FDA0003831743000000022
is given by:
Z_i = [n, d_i]^T = [(0, 0, 1), z_i]^T
where n and z_i are the normal vector and the depth of each plane, respectively;
in the projection process, for each event e_i(x_i, y_i), the relative pose [R | t] between its observation moment and the reference moment is first used to compute the homography matrix relative to the Z_0 plane
Figure FDA0003831743000000023
Figure FDA0003831743000000024
then, combined with the projection matrix P of the DVS, the projection coordinates of the event on the Z_0 plane are obtained from its pixel coordinates through the homography transformation:
Figure FDA0003831743000000025
Figure FDA0003831743000000026
where (x_i, y_i) and (x(z_0), y(z_0)) are the pixel coordinates of the event and its projection coordinates on the Z_0 plane, respectively;
the projection coordinates of the event on the remaining depth planes of the DSI are again obtained by homography transformation from its coordinates on the Z_0 plane:
Figure FDA0003831743000000031
where
Figure FDA0003831743000000032
which simplifies to the coordinates of the event on the Z_i plane:
Figure FDA0003831743000000033
where (c_x, c_y, c_z)^T = -R^T t is the position of the DVS relative to the reference moment.
4. The method for evaluating the noise reduction effect of the dynamic visual event stream based on the event space-time synchronization as claimed in claim 1, wherein: the step 3 of converting the confidence map into the event probability map comprises the following processes:
(3-1) for each pixel (x, y) in the confidence map, use its spatial proximity and pixel similarity to each pixel (x_i, y_i) ∈ Ω in the neighborhood window Ω to construct a spatial-domain Gaussian kernel G_d and a range-domain Gaussian kernel G_r:
Figure FDA0003831743000000034
Figure FDA0003831743000000035
where
Figure FDA0003831743000000036
(3-2) then take the normalized product of the spatial-domain and range-domain Gaussian kernels as the weight W(x_i, y_i), perform a weighted fusion of all pixels in the window Ω to obtain an adaptive threshold T(x, y), compare it with the confidence value c(x, y) at the central pixel, and compute the probability that pixel (x, y) is a DSI local maximum or edge region, which represents the event trigger probability p(x, y):
Figure FDA0003831743000000037
Figure FDA0003831743000000038
Figure FDA0003831743000000039
the window size is set to 7 × 7, and the above steps are repeated for all pixels of the confidence map to obtain the event probability map at the reference moment.
5. The method for evaluating the noise reduction effect of the dynamic visual event stream based on the event space-time synchronization as claimed in claim 1, wherein: the event stream rationality calculation in step 4 includes the following processes:
(4-1) for the event stream e_i(x_i, y_i, t_i), use I: Z^2 → {0, 1} to represent the event-trigger status on the dynamic vision sensor pixel plane Z^2 within a time range around the reference moment:
Figure FDA0003831743000000041
where τ denotes the time range, and 1 and 0 indicate, respectively, the presence and absence of an event at the pixel during [t_r - τ, t_r + τ];
(4-2) when I(x, y) = 1, construct an exponential decay kernel from the time distance between the event timestamp at the corresponding pixel and the reference moment, representing the temporal correlation between the event stream at that pixel and the event probability map:
Figure FDA0003831743000000042
where the decay rate parameter δ_t is set to 20 ms; the rationality of the event stream triggering an event at a pixel is quantified by the product of the event probability p(x, y) and Γ(x, y) at that location, and the greater the rationality, the more likely the event is a valid signal triggered by the actual scene;
when I(x, y) = 0, the inverse event probability on the event probability map is used:
Figure FDA0003831743000000043
to represent the rationality of the pixel not triggering an event;
(4-3) the log rationality of the event stream on pixel (x, y) during [t_r - τ, t_r + τ] is thus obtained:
Figure FDA0003831743000000044
the log rationality is computed separately for all pixel positions on the pixel plane Z^2 to obtain the rationality of the event stream e_i:
Figure FDA0003831743000000045
the smaller logP(e_i) is, the better the consistency between the event stream e_i and the event probability map, and the higher its rationality.
CN202211076662.8A 2022-09-05 2022-09-05 Dynamic visual event stream noise reduction effect evaluation method based on event time-space synchronization Pending CN115375581A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211076662.8A CN115375581A (en) 2022-09-05 2022-09-05 Dynamic visual event stream noise reduction effect evaluation method based on event time-space synchronization

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211076662.8A CN115375581A (en) 2022-09-05 2022-09-05 Dynamic visual event stream noise reduction effect evaluation method based on event time-space synchronization

Publications (1)

Publication Number Publication Date
CN115375581A true CN115375581A (en) 2022-11-22

Family

ID=84069308

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211076662.8A Pending CN115375581A (en) 2022-09-05 2022-09-05 Dynamic visual event stream noise reduction effect evaluation method based on event time-space synchronization

Country Status (1)

Country Link
CN (1) CN115375581A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115996320A (en) * 2023-03-22 2023-04-21 深圳市九天睿芯科技有限公司 Event camera adaptive threshold adjustment method, device, equipment and storage medium
CN116957973A (en) * 2023-07-25 2023-10-27 上海宇勘科技有限公司 Data set generation method for event stream noise reduction algorithm evaluation
CN116957973B (en) * 2023-07-25 2024-03-15 上海宇勘科技有限公司 Data set generation method for event stream noise reduction algorithm evaluation
CN117115451A (en) * 2023-08-31 2023-11-24 上海宇勘科技有限公司 Adaptive threshold event camera denoising method based on space-time content correlation
CN117115451B (en) * 2023-08-31 2024-03-26 上海宇勘科技有限公司 Adaptive threshold event camera denoising method based on space-time content correlation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination