CN111582300A - High-dynamic target detection method based on event camera - Google Patents

High-dynamic target detection method based on event camera

Info

Publication number
CN111582300A
CN111582300A CN202010198845.1A
Authority
CN
China
Prior art keywords
event
image
camera
prediction
high dynamic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010198845.1A
Other languages
Chinese (zh)
Inventor
蔡志浩
曾逸文
赵江
王英勋
陈文军
强祺昌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beihang University
Original Assignee
Beihang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beihang University filed Critical Beihang University
Priority to CN202010198845.1A
Publication of CN111582300A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a high-dynamic target detection method based on an event camera, which comprises the steps of obtaining event stream data from the event camera and converting the event stream into an event image; resizing the event image to a uniform resolution; feeding the event image into a convolutional neural network to obtain a plurality of prediction boxes; and obtaining detection results with higher confidence through non-maximum suppression. By combining the event camera with deep learning, the invention improves the robustness of high-dynamic target detection against illumination interference and complex-background interference.

Description

High-dynamic target detection method based on event camera
Technical Field
The invention belongs to the technical field of computer vision, and particularly relates to a high-dynamic target detection method based on an event camera.
Background
Object detection is a computer vision technique that finds the location and size of an object in an image. The technology can be applied to the fields of artificial intelligence systems, vehicle driving assistance systems, target behavior analysis, intelligent video monitoring and the like.
Unlike a normal camera, an event camera is a biologically inspired vision sensor that outputs pixel-level brightness changes. The event camera behaves like a traditional vision sensor running at thousands of frames per second, but with a much smaller data volume. It has low overall power consumption, small data storage, low computational requirements, and a dynamic range several orders of magnitude larger than that of conventional cameras. Such cameras are less susceptible to motion blur and can provide reliable visual information during high-speed motion.
Current high-dynamic moving-target detection methods mainly use common cameras. These methods are susceptible to illumination: when the light intensity is weak, it is difficult to detect a moving target. In addition, the detection effect degrades considerably when the background is very complex or when the target moves so fast that the image blurs.
Disclosure of Invention
In order to solve the problems that existing moving-target detection methods are easily affected by external illumination, complex background environments, and motion blur, the invention provides a high-dynamic target detection method based on an event camera. The specific technical scheme of the invention is as follows:
a high dynamic target detection method based on an event camera is characterized by comprising the following steps:
s1: obtaining event stream data from an event camera and converting the event stream into an event image;
s2: resizing the event image to a uniform resolution;
s3: feeding the event image into a convolutional neural network to obtain a plurality of prediction boxes;
s4: obtaining a detection result with higher confidence through non-maximum suppression.
Further, the step S1 includes the following steps:
s11: acquiring event stream data from an event camera;
the event camera represents its data in address-event representation (AER): each datum consists of the event address, i.e. the position of the event on the image, the timestamp of the event, and the polarity of the event;
s12: generating an event image;
after acquiring the event stream data from the event camera, events within a fixed time interval Δt are accumulated to generate an event image, which is defined as

AE_frame = { AE_xy | t_start ≤ t_ev ≤ t_end }

wherein AE_frame is the event image, t_start is the start time, t_end is the end time, t_ev is the timestamp of the current event, AE_xy is a triggered event at pixel (x, y), and the time interval Δt = t_end - t_start = 10 ms;
S13: removing noise;
setting a threshold N: for each event in the event image, if the number of events around it is less than the threshold N, the event is treated as noise and removed; processing every event in this way yields the denoised event image.
Further, in step S2, the obtained event image is converted into an image with a resolution of 416 × 416.
Further, the convolutional neural network of step S3 is Tiny YOLO, and step S3 includes the following steps:
s31: extracting 13 x 13 and 26 x 26 feature maps through the convolutional and pooling layers;
s32: predicting on the feature maps to obtain x, y, w, h and the confidence of the target, wherein the coordinates (x, y) represent the offset of the box centre relative to the grid-cell boundary, w and h are the width and height of the box, and the confidence is the IoU (intersection over union, i.e. the ratio of the intersection area to the union area) between the predicted bounding box and the ground-truth bounding box.
Further, in step S4, the prediction box with the highest confidence is first selected from the prediction boxes obtained in step S3; the IoU between this box and each remaining prediction box is then calculated, and any prediction box whose IoU exceeds 0.4 is removed. The process is then repeated on the remaining prediction boxes until all prediction boxes have been processed.
The invention has the beneficial effects that:
1. By combining the event camera with deep learning, the invention improves the robustness of high-dynamic target detection against illumination interference and complex-background interference.
2. Compared with the prior art, the method has a small data volume: the event camera produces far less data than a traditional camera, so less storage space is required than with a common camera. Power consumption is low: the event camera itself consumes little power, and the selected neural network model is small and occupies few computing resources, further reducing power consumption.
3. The method of the invention resists both strong and weak light: the dynamic range of the event camera is larger than that of a common camera, so even when the illumination intensity is extremely high or extremely low, the event camera still works normally and detects the high-dynamic target. The method also resists complex background interference: the event camera senses changes in illumination intensity, i.e., when nothing moves in the camera's field of view, the camera produces no output. When detecting a moving target, the camera captures only the moving person or other moving objects; no matter how complex the background is, the static background is not captured and therefore cannot affect the event camera's detection of the moving target.
4. The method of the invention can detect rapidly moving targets: processing the raw data of the event camera yields event images at a high frame rate, and such images are not prone to motion blur, so a moving target can be detected well even when it moves at high speed.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed in the embodiments are briefly described below so that the features and advantages of the present invention can be understood more clearly with reference to them. The drawings are schematic and should not be construed as limiting the present invention in any way; a person skilled in the art can obtain other drawings from them without inventive effort. Wherein:
FIG. 1 is a flow chart of a method for high dynamic object detection based on an event camera according to the present invention;
FIG. 2 is an event image of the present invention;
FIG. 3 is a network architecture diagram of the present invention;
FIG. 4 is a graph showing the detection effect of the method of the present invention.
Detailed Description
In order that the above objects, features and advantages of the present invention can be more clearly understood, a more particular description of the invention will be rendered by reference to the appended drawings. It should be noted that the embodiments of the present invention and features of the embodiments may be combined with each other without conflict.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention, however, the present invention may be practiced in other ways than those specifically described herein, and therefore the scope of the present invention is not limited by the specific embodiments disclosed below.
As shown in fig. 1, a high dynamic target detection method based on an event camera includes the following steps:
s1: obtaining event stream data from an event camera and converting the event stream into an event image;
s2: resizing the event image to a uniform resolution;
s3: feeding the event image into a convolutional neural network to obtain a plurality of prediction boxes;
s4: obtaining a detection result with higher confidence through non-maximum suppression.
Step S1 includes the following steps:
s11: acquiring event stream data from an event camera;
the event camera represents its data in address-event representation (AER): each datum consists of the event address, i.e. the position of the event on the image, the timestamp of the event, and the polarity of the event;
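As a concrete illustration, an AER stream like the one described in step S11 can be modelled as a structured array. The field names, the microsecond timestamp unit, and the sample values below are illustrative assumptions, not fixed by the patent:

```python
import numpy as np

# Hypothetical sketch of the address-event representation (AER): each
# datum carries the pixel address (x, y), a timestamp, and a polarity
# (+1 for a brightness increase, -1 for a decrease).
event_dtype = np.dtype([("x", np.uint16),   # column of the event on the image
                        ("y", np.uint16),   # row of the event on the image
                        ("t", np.uint64),   # timestamp in microseconds (assumed unit)
                        ("p", np.int8)])    # polarity: +1 or -1

# A tiny synthetic event stream standing in for real camera output.
events = np.array([(10, 20, 1000, 1),
                   (11, 20, 1500, -1),
                   (10, 21, 2400, 1)], dtype=event_dtype)
```

Each record is one event; a real sensor emits such records asynchronously rather than as frames.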
s12: generating an event image;
after acquiring the event stream data from the event camera, events within a fixed time interval Δt are accumulated to generate an event image, which is defined as

AE_frame = { AE_xy | t_start ≤ t_ev ≤ t_end }

wherein AE_frame is the event image, t_start is the start time, t_end is the end time, t_ev is the timestamp of the current event, AE_xy is a triggered event at pixel (x, y), and the time interval Δt = t_end - t_start = 10 ms;
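A minimal sketch of the step-S12 accumulation follows. The structured event layout and the 346 × 260 sensor resolution are assumptions (the patent fixes only Δt = 10 ms):

```python
import numpy as np

# Assumed AER layout: x, y address, timestamp (microseconds), polarity.
event_dtype = np.dtype([("x", np.uint16), ("y", np.uint16),
                        ("t", np.uint64), ("p", np.int8)])

def events_to_image(events, t_start, delta_t_us=10_000, shape=(260, 346)):
    """Accumulate all events with t_start <= t_ev < t_start + delta_t
    into a 2-D event image, as in step S12 (delta_t = 10 ms)."""
    img = np.zeros(shape, dtype=np.int32)
    mask = (events["t"] >= t_start) & (events["t"] < t_start + delta_t_us)
    # Count every triggered event AE_xy at its pixel address.
    np.add.at(img, (events["y"][mask], events["x"][mask]), 1)
    return img

stream = np.array([(5, 3, 0, 1), (5, 3, 4000, -1), (7, 2, 20000, 1)],
                  dtype=event_dtype)
img = events_to_image(stream, t_start=0)
# The third event lies outside the 10 ms window and is not accumulated.
```

`np.add.at` is used instead of fancy-index assignment so that repeated events at the same pixel are all counted.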
S13: removing noise;
setting a threshold N: for each event in the event image, if the number of events around it is less than the threshold N, the event is treated as noise and removed; processing every event in this way yields the denoised event image.
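The step-S13 neighbourhood filter can be sketched as follows; the 8-neighbourhood and the example threshold N = 2 are assumptions, since the patent fixes neither the neighbourhood size nor N:

```python
import numpy as np

def denoise_event_image(img, threshold_n=2):
    """Remove isolated events: a pixel's event is kept only if at least
    threshold_n events fired among its 8 neighbours (step S13 sketch)."""
    padded = np.pad(img, 1)
    h, w = img.shape
    # Sum the 8 neighbours of every pixel (excluding the pixel itself).
    neighbours = sum(padded[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
                     for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                     if (dy, dx) != (0, 0))
    out = img.copy()
    out[(img > 0) & (neighbours < threshold_n)] = 0  # isolated -> noise
    return out
```

Events clustered along a moving edge survive the filter, while single-pixel sensor noise is suppressed.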
In step S2, the obtained event image is converted into an image with a resolution of 416 × 416 to facilitate processing by the neural network.
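The step-S2 resize can be sketched with a nearest-neighbour mapping. In practice a library call such as OpenCV's resize would be used; this pure-NumPy version only illustrates the idea:

```python
import numpy as np

def resize_event_image(img, size=416):
    """Nearest-neighbour resize of an event image to the fixed
    size x size network input of step S2."""
    h, w = img.shape[:2]
    rows = np.arange(size) * h // size   # source row for each output row
    cols = np.arange(size) * w // size   # source column for each output column
    return img[rows[:, None], cols]

small = np.arange(12, dtype=np.int32).reshape(3, 4)
resized = resize_event_image(small)      # (3, 4) -> (416, 416)
```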
The convolutional neural network of step S3 is Tiny YOLO; its structure is shown in fig. 3. Step S3 includes the following steps:
s31: extracting 13 x 13 and 26 x 26 feature maps through the convolutional and pooling layers;
s32: predicting on the feature maps to obtain x, y, w, h and the confidence of the target, wherein the coordinates (x, y) represent the offset of the box centre relative to the grid-cell boundary, w and h are the width and height of the box, and the confidence is the IoU (intersection over union, i.e. the ratio of the intersection area to the union area) between the predicted bounding box and the ground-truth bounding box.
In step S4, the prediction box with the highest confidence is first selected from the prediction boxes obtained in step S3; the IoU between this box and each remaining prediction box is then calculated, and any prediction box whose IoU exceeds 0.4 is removed. The process is then repeated on the remaining prediction boxes until all prediction boxes have been processed.
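The step-S4 procedure corresponds to standard non-maximum suppression; a minimal sketch with the patent's 0.4 IoU threshold (the corner box format (x1, y1, x2, y2) is an assumption):

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union else 0.0

def nms(boxes, scores, iou_thresh=0.4):
    """Keep the highest-confidence box, drop every remaining box whose
    IoU with it exceeds iou_thresh, and repeat (step S4)."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) <= iou_thresh]
    return keep

boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)   # the second box overlaps the first too heavily
```

With these sample boxes the second box has IoU ≈ 0.68 with the first, so it is suppressed while the disjoint third box survives.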
Fig. 4 shows the detection results of the method of the present invention. As can be seen, the method detects both the category of the moving target captured by the event camera and its position in the image; the generated event image contains no redundant background information besides the moving target and is therefore not easily disturbed by a complex background. Tests show that the method achieves good detection results even when the target moves rapidly and the illumination is weak.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (5)

1. A high dynamic target detection method based on an event camera is characterized by comprising the following steps:
s1: obtaining event stream data from an event camera and converting the event stream into an event image;
s2: resizing the event image to a uniform resolution;
s3: feeding the event image into a convolutional neural network to obtain a plurality of prediction boxes;
s4: obtaining a detection result with higher confidence through non-maximum suppression.
2. The method for detecting high dynamic objects based on event camera as claimed in claim 1, wherein said step S1 includes the following steps:
s11: acquiring event stream data from an event camera;
the event camera represents its data in address-event representation (AER): each datum consists of the event address, i.e. the position of the event on the image, the timestamp of the event, and the polarity of the event;
s12: generating an event image;
after acquiring event stream data from the event camera, accumulating events within a fixed time interval Δt to generate an event image, wherein the event image is defined as

AE_frame = { AE_xy | t_start ≤ t_ev ≤ t_end }

wherein AE_frame is the event image, t_start is the start time, t_end is the end time, t_ev is the timestamp of the current event, AE_xy is a triggered event at pixel (x, y), and the time interval Δt = t_end - t_start = 10 ms;
S13: removing noise;
setting a threshold N: for each event in the event image, if the number of events around it is less than the threshold N, the event is treated as noise and removed; processing every event in this way yields the denoised event image.
3. The method for detecting high dynamic objects based on an event camera as claimed in claim 1, wherein step S2 converts the obtained event image into an image with a resolution of 416 × 416.
4. The method for detecting high dynamic objects based on an event camera as claimed in claim 1, wherein the convolutional neural network of step S3 is Tiny YOLO, and step S3 comprises the following steps:
s31: extracting 13 x 13 and 26 x 26 feature maps through the convolutional and pooling layers;
s32: predicting on the feature maps to obtain x, y, w, h and the confidence of the target, wherein the coordinates (x, y) represent the offset of the box centre relative to the grid-cell boundary, w and h are the width and height of the box, and the confidence is the IoU (intersection over union, i.e. the ratio of the intersection area to the union area) between the predicted bounding box and the ground-truth bounding box.
5. The method for detecting high dynamic objects based on an event camera as claimed in claim 1, wherein in step S4 the prediction box with the highest confidence is first selected from the prediction boxes obtained in step S3; the IoU between this box and each remaining prediction box is then calculated, and any prediction box whose IoU exceeds 0.4 is removed; the process is then repeated on the remaining prediction boxes until all prediction boxes have been processed.
CN202010198845.1A 2020-03-20 2020-03-20 High-dynamic target detection method based on event camera Pending CN111582300A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010198845.1A CN111582300A (en) 2020-03-20 2020-03-20 High-dynamic target detection method based on event camera

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010198845.1A CN111582300A (en) 2020-03-20 2020-03-20 High-dynamic target detection method based on event camera

Publications (1)

Publication Number Publication Date
CN111582300A true CN111582300A (en) 2020-08-25

Family

ID=72122441

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010198845.1A Pending CN111582300A (en) 2020-03-20 2020-03-20 High-dynamic target detection method based on event camera

Country Status (1)

Country Link
CN (1) CN111582300A (en)

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111931752A (en) * 2020-10-13 2020-11-13 中航金城无人系统有限公司 Dynamic target detection method based on event camera
CN112816995A (en) * 2020-12-25 2021-05-18 北京灵汐科技有限公司 Target detection method and device, fusion processing unit and computer readable medium
CN113160218A (en) * 2021-05-12 2021-07-23 深圳龙岗智能视听研究院 Method for detecting object motion intensity based on event camera
CN113159217A (en) * 2021-05-12 2021-07-23 深圳龙岗智能视听研究院 Attention mechanism target detection method based on event camera
CN113506229A (en) * 2021-07-15 2021-10-15 清华大学 Neural network training and image generation method and device
CN113537071A (en) * 2021-07-19 2021-10-22 深圳龙岗智能视听研究院 Static and dynamic target detection method and device based on event camera
CN113572998A (en) * 2021-09-22 2021-10-29 中科南京智能技术研究院 Data collection method and system based on event camera
CN113660455A (en) * 2021-07-08 2021-11-16 深圳宇晰科技有限公司 Method, system and terminal for fall detection based on DVS data
CN114037741A (en) * 2021-10-12 2022-02-11 中国电子科技南湖研究院 Adaptive target detection method and device based on event camera
WO2022237591A1 (en) * 2021-05-08 2022-11-17 北京灵汐科技有限公司 Moving object identification method and apparatus, electronic device, and readable storage medium
WO2023273952A1 (en) * 2021-06-29 2023-01-05 华为技术有限公司 Data processing method and apparatus and electronic device

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108182670A (en) * 2018-01-15 2018-06-19 清华大学 A kind of resolution enhancement methods and system of event image
WO2019099337A1 (en) * 2017-11-14 2019-05-23 Kaban Technologies Llc Event camera-based deformable object tracking
CN110399908A (en) * 2019-07-04 2019-11-01 西北工业大学 Classification method and device based on event mode camera, storage medium, electronic device
CN110599414A (en) * 2019-08-28 2019-12-20 武汉大学 Event camera data processing-oriented time-space normalization method


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
张姝 (Zhang Shu): "Dynamic gesture recognition method based on address-event representation signals", China Master's Theses Full-text Database *



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200825