CN113132658B - Data processing method, device, equipment and medium based on bionic image sensor - Google Patents

Data processing method, device, equipment and medium based on bionic image sensor

Info

Publication number
CN113132658B
Authority
CN
China
Prior art keywords
dimensional array
event
preset
data value
value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110413403.9A
Other languages
Chinese (zh)
Other versions
CN113132658A (en)
Inventor
汪辉
万吉祥
黄尊恺
祝永新
田犁
李全泽
冯英奇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Advanced Research Institute of CAS
Original Assignee
Shanghai Advanced Research Institute of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Advanced Research Institute of CAS filed Critical Shanghai Advanced Research Institute of CAS
Priority to CN202110413403.9A
Publication of CN113132658A
Application granted
Publication of CN113132658B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 25/00 Circuitry of solid-state image sensors [SSIS]; Control thereof
    • H04N 25/70 SSIS architectures; Circuits associated therewith
    • H04N 25/76 Addressed sensors, e.g. MOS or CMOS sensors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F 16/58 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F 16/583 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually, using metadata automatically derived from the content
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F 16/58 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F 16/5866 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually, using information manually generated, e.g. tags, keywords, comments, manually generated location and time information
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 25/00 Circuitry of solid-state image sensors [SSIS]; Control thereof
    • H04N 25/70 SSIS architectures; Circuits associated therewith
    • H04N 25/71 Charge-coupled device [CCD] sensors; Charge-transfer registers specially adapted for CCD sensors
    • H04N 25/74 Circuitry for scanning or addressing the pixel array
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 25/00 Circuitry of solid-state image sensors [SSIS]; Control thereof
    • H04N 25/70 SSIS architectures; Circuits associated therewith
    • H04N 25/76 Addressed sensors, e.g. MOS or CMOS sensors
    • H04N 25/77 Pixel circuitry, e.g. memories, A/D converters, pixel amplifiers, shared circuits or shared components

Abstract

The application provides a data processing method, apparatus, device and medium based on a bionic image sensor. An event stream acquired by the bionic image sensor and expressed by pixel position, polarity and timestamp is read; initial values are set for a first preset three-dimensional array and a second preset three-dimensional array; a target address is determined in each array according to the pixel position and polarity of each event; according to the target address, the corresponding data value in the first three-dimensional array is updated to the initial value, a preset value is subtracted from the neighborhood data values, and the other values are kept unchanged; and/or the corresponding data value in the second three-dimensional array is updated to the timestamp of the current event; when a preset condition is met, the latest updated first three-dimensional array, alone or combined with the latest updated second three-dimensional array, is output as an event frame, and reading of the event stream resumes. Compared with the prior art, the method and device encode the inherent spatio-temporal characteristics of the event stream output by the bionic image sensor more faithfully, and can clearly encode the contours of objects moving at different speeds.

Description

Data processing method, device, equipment and medium based on bionic image sensor
Technical Field
The invention relates to the technical field of digital image processing, and in particular to a data processing method, apparatus, device and medium based on a bionic image sensor.
Background
In computer vision, data acquisition relies primarily on image sensors. Conventional CMOS image sensors capture a scene as a series of successive image frames. These image frames typically contain a large amount of redundant data, such as repeated background. Reading, transmitting and processing this redundant data adds unnecessary transmission, storage and computation overhead. In addition, owing to the limitations of the exposure mechanism, information between two consecutive frames is undersampled, which causes motion blur and the loss of motion trajectories. Conventional image sensors therefore commonly suffer from bandwidth bottlenecks, wasted power, information loss, insufficient computing capability and high system complexity.
Inspired by the principles of biological vision, researchers have imitated the biological retina at the pixel level and in circuit structure, and designed event-based bionic image sensors. Unlike conventional cameras, which continuously measure the absolute brightness of all pixels at a fixed frame rate, a bionic image sensor captures the brightness change of each pixel asynchronously and generates an event output only when a transient change in the scene is captured. The bionic image sensor can therefore output sensitive motion information asynchronously, and offers high temporal resolution, high dynamic range, low latency, low bandwidth requirements and other advantages.
However, the asynchronous event-stream representation output by the bionic image sensor differs greatly from the data structure of a conventional image frame, and mature image processing techniques (such as convolutional neural networks) are difficult to apply directly to event-stream data.
Processing technologies based on event data therefore require new algorithms.
Disclosure of Invention
In view of the above drawbacks of the prior art, an object of the present application is to provide a data processing method, apparatus, device and medium based on a bionic image sensor, so as to solve the prior-art problem that the asynchronous event stream differs greatly from the data structure of a conventional image frame and is difficult to apply directly.
To achieve the above and other related objects, the present application provides a data processing method based on a bionic image sensor, the method comprising: reading an event stream acquired by a bionic image sensor and expressed by pixel position, polarity and timestamp; setting an initial value N for the data value corresponding to each address in a preset first three-dimensional array and a preset second three-dimensional array; determining target addresses in the first and second three-dimensional arrays according to the pixel position and polarity of each arriving event; updating the data value corresponding to the target address determined in the first three-dimensional array to the initial value N, subtracting a preset value k from the data value corresponding to each neighborhood address within a preset neighborhood window range around the target address, and keeping the data values of non-neighborhood addresses outside that range unchanged; and/or updating the data value corresponding to the target address determined in the second three-dimensional array to the timestamp of the current event; and, when a preset condition is met, outputting the latest updated first three-dimensional array, alone or combined with the second three-dimensional array, as the encoded event frame, and returning to the first step to continue reading the event stream.
In an embodiment of the application, updating the data value corresponding to the target address determined in the second three-dimensional array to the timestamp of the current event includes: comparing the timestamp of the current event with the latest updated data value of the target address in the second three-dimensional array, and judging whether the difference between them is greater than a preset time value; if so, updating the target addresses in the first and second three-dimensional arrays respectively; if not, performing no update.
In an embodiment of the present application, the predetermined neighborhood window range is (2R + 1) × (2R + 1); where R represents the radius of the neighborhood range of interest.
In an embodiment of the present application, the data value at each address of the first three-dimensional array lies in the range [N − k(2R+1)², N].
In an embodiment of the present application, outputting the newly updated first three-dimensional array combined with the second three-dimensional array as the encoded event frame includes: multiplying the latest updated first three-dimensional array by the normalized latest updated second three-dimensional array, and outputting the product as the encoded event frame; the encoding method may be frequency-based or time-based.
In an embodiment of the present application, the preset condition includes: the timestamp span reaching the duration of a preset time window, or the total number of events reaching a preset value.
To achieve the above and other related objects, the present application provides a data processing apparatus based on a bionic image sensor, the apparatus comprising: a reading module for reading an event stream acquired by the bionic image sensor and expressed by pixel position, polarity and timestamp; an initialization module for setting an initial value for the data value corresponding to each address in the preset first and second three-dimensional arrays; and a processing module for determining target addresses in the first and second three-dimensional arrays according to the pixel position and polarity of each arriving event; updating the data value corresponding to the target address determined in the first three-dimensional array to the initial value, subtracting a preset value from the data value corresponding to each neighborhood address within a preset neighborhood window range around the target address, and keeping the data values of non-neighborhood addresses outside that range unchanged; and/or updating the data value corresponding to the target address determined in the second three-dimensional array to the timestamp of the current event; and, when a preset condition is met, outputting the latest updated first three-dimensional array, alone or combined with the second three-dimensional array, as the encoded event frame, and returning to the reading step to continue reading the event stream.
To achieve the above and other related objects, the present application provides a computer apparatus, comprising: a memory, a processor, and a communicator; the memory is to store computer instructions; the processor executes computer instructions to implement the method as described above; the communicator is in communication with the biomimetic image sensor to acquire the stream of events collected by the biomimetic image sensor as expressed by pixel position, polarity, and time stamp.
To achieve the above and other related objects, the present application provides a computer readable storage medium storing computer instructions which, when executed, perform the method as described above.
In summary, the data processing method, apparatus, device and medium based on the bionic image sensor of the present application read the event stream acquired by the bionic image sensor and expressed by pixel position, polarity and timestamp; set an initial value for the data value corresponding to each address in the preset first and second three-dimensional arrays; determine target addresses in the first and second three-dimensional arrays according to the pixel position and polarity of each arriving event; update the data value corresponding to the target address determined in the first three-dimensional array to the initial value, subtract a preset value from the data value corresponding to each neighborhood address within the preset neighborhood window range around the target address, and keep the data values of non-neighborhood addresses outside that range unchanged; and/or update the data value corresponding to the target address determined in the second three-dimensional array to the timestamp of the current event; and, when a preset condition is met, output the latest updated first three-dimensional array, alone or combined with the second three-dimensional array, as the encoded event frame, and return to the first step to continue reading the event stream.
Has the following beneficial effects:
compared with the prior art, the method and device encode the inherent spatio-temporal characteristics of the event stream output by the bionic image sensor more faithfully, and can clearly encode the contours of objects moving at different speeds.
Drawings
Fig. 1 is a flowchart illustrating a data processing method based on a bionic image sensor according to an embodiment of the present disclosure.
FIG. 2 is a schematic diagram of a three-dimensional array and simplified process according to an embodiment of the present invention.
Fig. 3 is a schematic processing flow chart illustrating a data processing method based on a bionic image sensor according to an embodiment of the present disclosure.
FIG. 4 is a block diagram of a data processing apparatus based on a bionic image sensor according to an embodiment of the present disclosure.
Fig. 5 is a schematic structural diagram of a computer device according to an embodiment of the present application.
Detailed Description
The following embodiments of the present application are described by specific examples, and other advantages and effects of the present application will be readily apparent to those skilled in the art from the disclosure of the present application. The present application is capable of other and different embodiments and its several details are capable of modifications and/or changes in various respects, all without departing from the spirit of the present application. It is to be noted that the features in the following embodiments and examples may be combined with each other without conflict.
It should be noted that the drawings provided in the following embodiments are only schematic and illustrate the basic idea of the present application. The drawings show only the components related to the present application rather than the actual number, shape and size of components in implementation; in practice the type, quantity and proportion of components may vary, and the layout may be more complex.
Throughout the specification, when a part is referred to as being "connected" to another part, this includes not only a case of being "directly connected" but also a case of being "indirectly connected" with another element interposed therebetween. In addition, when a certain part is referred to as "including" a certain component, unless otherwise stated, other components are not excluded, but it means that other components may be included.
The terms first, second, third, etc. are used herein to describe various elements, components, regions, layers and/or sections, but are not limited thereto. These terms are only used to distinguish one element, component, region, layer or section from another element, component, region, layer or section. Thus, a first portion, component, region, layer or section discussed below could be termed a second portion, component, region, layer or section without departing from the scope of the present application.
Also, as used herein, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context indicates otherwise. It will be further understood that the terms "comprises," "comprising," and/or "including," when used in this specification, specify the presence of stated features, operations, elements, components, items, species, and/or groups, but do not preclude the presence or addition of one or more other features, operations, elements, components, items, species, and/or groups thereof. The terms "or" and "and/or" as used herein are to be construed as inclusive, meaning any one or any combination. Thus, "A, B or C" or "A, B and/or C" means any of the following: A; B; C; A and B; A and C; B and C; A, B and C. An exception to this definition occurs only when a combination of elements, functions or operations is inherently mutually exclusive in some way.
Fig. 1 is a schematic flow chart of a data processing method based on a bionic image sensor according to an embodiment of the present application. As shown, the method comprises:
Step S101: reading an event stream acquired by the bionic image sensor and expressed by pixel position (x, y), polarity p and timestamp t.
Typically the output of a bionic image sensor is a sequence with a variable data rate, whose elements are called digital "events" or "spikes". Each event represents a change of predefined magnitude in the logarithmic intensity at a pixel at a particular time. When the logarithmic intensity increases by a threshold (the ON threshold), an ON event indicating a brightness increase is produced. Conversely, when the logarithmic intensity decreases by a threshold (the OFF threshold), an OFF event indicating a brightness decrease is produced. The sensor output is represented by one (x, y, t, p) quadruple per event: x, y denote the pixel location where the event is sensed, t is the timestamp recording when the event occurred, and the polarity p denotes the ON/OFF nature of the event, typically encoded as 1 or 0. As the environment changes continuously, the series of events sensed by the bionic image sensor is output in sequence, forming an event stream.
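For illustration, such an event stream can be represented in code as follows (a minimal Python sketch; the type and field names are our own, not taken from the patent):

    from typing import NamedTuple, List

    class Event(NamedTuple):
        x: int    # pixel column where the change was sensed
        y: int    # pixel row where the change was sensed
        t: float  # timestamp recording when the event occurred
        p: int    # polarity: 1 for ON (brightness increase), 0 for OFF (decrease)

    # An event stream is simply an ordered sequence of such quadruples.
    event_stream: List[Event] = [Event(5, 3, 0.0020, 1), Event(6, 3, 0.0021, 0)]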
Step S102: setting the data value corresponding to each address in the preset first and second three-dimensional arrays to an initial value N.
Alternatively, the first three-dimensional array may be represented as a time plane S (x, y, p) and the second three-dimensional array may be represented as a time stamp plane T (x, y, p). Here the first two dimensions (x, y) represent the pixel location where the event occurred in the event frame and the third dimension (p) represents the polarity of the event.
Here a time plane can be interpreted as a plane in which the value at each pixel location is a quantity computed from event timing. The timestamp is the time at which each event occurred; each event is represented by (x, y, p, t), where t is its timestamp.
Each data value at a specific (x, y, p) position in the first or second three-dimensional array is initially set to the initial value N, for example N = 0; as the event stream continues to arrive in the subsequent process, the S or T data value at the corresponding position is updated.
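A minimal initialization sketch, assuming numpy arrays indexed as [y, x, p] and a sensor of height H and width W with two polarity channels (the resolution below is an assumed example, not specified by the patent):

    import numpy as np

    H, W = 260, 346   # assumed sensor resolution, for illustration only
    N = 0             # initial value for every data value

    # First three-dimensional array: time plane S(x, y, p)
    S = np.full((H, W, 2), N, dtype=np.int32)
    # Second three-dimensional array: timestamp plane T(x, y, p)
    T = np.full((H, W, 2), float(N))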
Step S103: determining target addresses in the first and second three-dimensional arrays according to the pixel position and polarity of each arriving event.
It should be noted here that the pixel position (x, y) and the polarity p together constitute the addressing address of the three-dimensional arrays; that is, the first and second three-dimensional arrays are indexed by the (x, y, p) coordinates. In other words, whenever an event arrives, its (x, y, p) data determine a specific position, or address, in each of the two arrays.
Step S104: updating the data value corresponding to the target address determined in the first three-dimensional array to the initial value, subtracting a preset value k from the data value corresponding to each neighborhood address within a preset neighborhood window range around the target address, and keeping the data values of non-neighborhood addresses outside that range unchanged; and/or updating the data value corresponding to the target address determined in the second three-dimensional array to the timestamp of the current event.
In short, once the target address in the first three-dimensional array is determined, the corresponding data value is updated to the initial value N (for example 0), regardless of its current value.
Next, given a preset radius R representing the neighborhood range of interest, a neighborhood window of size (2R+1) × (2R+1) is formed, and the preset value k (for example k = 1) is subtracted from the data value at every position of the first three-dimensional array inside the (2R+1) × (2R+1) window centered on the target position (x, y). Experimental comparison shows the output is best when R is 3.
Finally, the data values of the non-neighborhood addresses outside the neighborhood window range are kept unchanged. This ensures that the values in S lie in the range [N − k(2R+1)², N], i.e. [−(2R+1)², 0] for N = 0 and k = 1. The update for each arriving event at position (x, y) with polarity p can be written as:

S(x′, y′, p) ← S(x′, y′, p) − k for every neighborhood address (x′, y′) with |x′ − x| ≤ R and |y′ − y| ≤ R;
S(x, y, p) = 0.
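A sketch of this update, under the numpy conventions above (clamping the window at the sensor borders is our assumption; the text does not specify how the window is handled at array boundaries):

    def update_time_plane(S, x, y, p, N=0, k=1, R=3):
        """Apply one event to the time plane S in place."""
        H, W, _ = S.shape
        # (2R+1) x (2R+1) neighborhood window, clipped to the array bounds
        y0, y1 = max(y - R, 0), min(y + R + 1, H)
        x0, x1 = max(x - R, 0), min(x + R + 1, W)
        S[y0:y1, x0:x1, p] -= k   # suppress every address inside the window
        S[y, x, p] = N            # then reset the target address to the initial value

Decrementing the center and then resetting it to N gives the same final value as excluding the center from the subtraction.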
In an embodiment of the application, updating the data value corresponding to the target address determined in the second three-dimensional array to the timestamp of the current event includes:

A. comparing the timestamp of the current event with the latest updated data value of the target address in the second three-dimensional array, and judging whether the difference between them is greater than a preset time value T_tr;

B. if so, updating the target addresses in the first and second three-dimensional arrays respectively; if not, performing no update.
In brief, each time an event (x, y, p, t) arrives, the event timestamp t is first compared with the latest time T(x, y, p) recorded at the (x, y) pixel position in the second three-dimensional array (timestamp plane) T. Only if the difference between the two is greater than the preset time value T_tr is the first three-dimensional array (time plane) S updated; at the same time, T(x, y, p) in the timestamp plane is updated to the new timestamp t. Otherwise, when the difference between the two is less than or equal to the preset time value T_tr, neither the time plane S nor the timestamp plane T is updated.
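This comparison can be sketched as follows (same conventions as above; T_tr is written t_tr):

    def passes_time_filter(T, x, y, p, t, t_tr):
        """Return True if the event (x, y, p, t) should be processed, updating T."""
        if t - T[y, x, p] > t_tr:   # only sufficiently spaced events pass
            T[y, x, p] = t          # record the new timestamp
            return True
        return False                # treated as a duplicate: neither S nor T changes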
It should be noted that the number of events captured by the bionic image sensor is related to how fast objects move. More events occur around fast-moving objects; these may be called "dense events". Fewer events occur around slow-moving objects; these are called "sparse events".
Consider a time plane in which the value of each pixel is related to the events that occurred at the corresponding location. When an event arrives, the pixel value at the corresponding position on the time plane is updated to a preset value, and the pixel values at the surrounding positions are reduced by a certain amount. This mainly expresses the suppressive effect of the position receiving the signal on its neighborhood.
For a given pixel position, the more events occur at neighboring positions, the more that pixel is suppressed and the larger the reduction of its value. Conversely, pixels with fewer events in their neighborhood are penalized less. When no event occurs in the neighborhood, the pixel value is retained until the deadline of the time plane.
In general, the "dense event" areas of an event frame have larger pixel values and more prominent contours. After the neighborhood suppression mechanism is introduced, the pixel weights of these areas are suppressed more, while the contours of "sparse event" areas are suppressed less. With this balance, the event contours mapped onto the time plane are similar whether a region is a "dense event" region or a "sparse event" region, and the "sparse event" regions are not overly suppressed.
To further reduce the influence of dense events, a temporal filter is added before the event integration step. Specifically, when an event arrives, the time interval between it and the previous event at the same pixel position is computed. By comparing this interval against a preset threshold, it is decided whether the event undergoes subsequent encoding. When the timestamp interval is smaller than the threshold, the two events are treated as duplicates and the event is not processed, while events with larger time intervals pass through the filter smoothly, and the encoded event frame changes accordingly. Such a simple temporal filter can remove some events in "dense event" regions, reducing their contribution to the time plane, while "sparse event" regions are unaffected by the filter.
To further understand the technical content of step S104 in the present application, the following detailed description will be made for the present method:
As shown in fig. 2, first, to simplify the first three-dimensional array (time plane) S and the second three-dimensional array (timestamp plane) T, the three-dimensional coordinates (x, y, p) may be reduced to a single index m, i.e., S(x, y, p) is written S(m) and T(x, y, p) is written T(m).
Next, assuming that m = 8, each value in the first three-dimensional array (time plane) S and the second three-dimensional array (timestamp plane) T is initialized to the initial value 0, namely: S = [0 0 0 0 0 0 0 0], T = [0 0 0 0 0 0 0 0].
Assume the 1st event (x, y, p, t), simplified to (m, t), arrives with (m, t) = (5, 2); the determined position, or address, in the array is 5 and the corresponding timestamp is 2. For ease of understanding, assume the neighborhood radius is 1, so the window covers addresses 4 to 6; then:
S=[0 0 0 -1 0 -1 0 0],T=[0 0 0 0 2 0 0 0];
Assume the 2nd event arrives with (m, t) = (2, 6); the determined position in the array is 2 and the corresponding timestamp is 6, giving:
S=[-1 0 -1 -1 0 -1 0 0],T=[0 6 0 0 2 0 0 0];
Assume the 3rd event arrives with (m, t) = (3, 10); the determined position in the array is 3 and the corresponding timestamp is 10, giving:
S=[-1 -1 0 -2 0 -1 0 0],T=[0 6 10 0 2 0 0 0];
Assume the 4th event arrives with (m, t) = (3, 11); the determined position in the array is 3 and the corresponding timestamp is 11. For ease of understanding, assume the preset time value T_tr is 1 (or larger); since the timestamp difference 11 − 10 = 1 at the same position 3 is not greater than T_tr, S and T are not updated, so that:
S=[-1 -1 0 -2 0 -1 0 0],T=[0 6 10 0 2 0 0 0]。
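This simplified example can be reproduced with a few lines of code (a sketch of the 1-D simplification; the 1-based addresses used in the text are converted to 0-based indices):

    import numpy as np

    m, N, k, R, t_tr = 8, 0, 1, 1, 1
    S, T = np.full(m, N), np.full(m, N)

    for addr, t in [(5, 2), (2, 6), (3, 10), (3, 11)]:
        i = addr - 1                           # 1-based address -> 0-based index
        if t - T[i] > t_tr:                    # temporal filter
            T[i] = t                           # update the timestamp plane
            lo, hi = max(i - R, 0), min(i + R + 1, m)
            S[lo:hi] -= k                      # suppress the neighborhood window
            S[i] = N                           # reset the target address
    print(S, T)  # S = [-1 -1 0 -2 0 -1 0 0], T = [0 6 10 0 2 0 0 0]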
And so on; subsequent events are encoded in the same way. When the duration of the preset time window is reached, the newly updated first three-dimensional array (time plane S) is output, or the newly updated time plane S is combined with the second three-dimensional array (timestamp plane T) and output as the encoded event frame, and the process returns to the first step to continue reading the event stream.
Step S105: when the preset condition is met, outputting the latest updated first three-dimensional array, alone or combined with the second three-dimensional array, as the encoded event frame, and returning to the first step to continue reading the event stream.
In some embodiments, the preset condition includes: the timestamp span reaching the duration of a preset time window, or the total number of events reaching a preset value.
It should be noted that the time window described here may simply correspond to a conventional image-frame time window, for example 20 ms or 40 ms, or it may be self-adjusted according to the event-stream output. The method integrates all events within the preset time window into one output event frame.
Alternatively, when processing video data of largely static scenes or other special scenarios, the total number of events may be used as the condition for segmenting the event stream, so as to focus on the contours of objects at the moments when events occur. Either condition can be checked with a simple predicate, as sketched below.
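A sketch of such a predicate (the parameter names are illustrative, not taken from the patent):

    def frame_ready(t_first, t_current, n_events, window=0.02, max_events=None):
        """Decide whether to emit an event frame: either the timestamp span
        reaches the preset time window, or the event count reaches a preset total."""
        if max_events is not None:
            return n_events >= max_events
        return (t_current - t_first) >= window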
In summary, the present application assigns larger weights to recently occurring events within the time window; the first and second three-dimensional arrays therefore preferentially reflect the latest events, i.e., each position's data value in the arrays is updated toward the most recent event. The latest events are thus highlighted within the time window, and the contours of objects moving at different speeds can be clearly encoded.
In an embodiment of the present application, the latest updated first three-dimensional array may be output as the encoded event frame; or the latest updated first three-dimensional array may be multiplied by the normalized latest updated second three-dimensional array and the product output as the encoded event frame; the encoding method may be frequency-based or time-based.
In brief, only the most recently updated first three-dimensional array (time plane) S may be output as the event frame; in that case the second three-dimensional array (timestamp plane) T is mainly used to compare timestamps and decide whether the time plane S is updated. The time plane S can also be merged with the timestamp plane T, meaning that the most recent events are given greater weight, so multiplying by the normalized T can describe the events better. Using the timestamp plane T alone as the event frame is an existing method, but fusing it with the time plane S gives a better result.
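A sketch of this fusion (the normalization scheme is not fixed by the text; global min-max scaling is our assumption):

    def encode_event_frame(S, T, eps=1e-9):
        """Fuse the time plane S with the normalized timestamp plane T."""
        T_norm = (T - T.min()) / (T.max() - T.min() + eps)  # scale timestamps to [0, 1]
        return S * T_norm  # recent events (larger T_norm) keep more weight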
However, it should be noted that using only the time plane S as the event frame, or using the two together, gives better results than using only the timestamp plane T as the event frame.
The encoding method of the output event frame of the present application may adopt, for example, a frequency-based encoding method or a time-based encoding method.
It should be noted that the event stream output by the hardware can be regarded as a sequence, and the work of the present application is to encode this data into a plane similar to an image frame, so that image processing algorithms based on image frames can be applied. The final output pixel value has a different meaning from that of a conventional image: it contains information such as the time and intensity of events. The output event frame is similar to an image frame only in its data layout: both are planes in which the value corresponding to (x, y) represents a datum for that location. In an image frame this is typically a grayscale pixel value between 0 and 255; in an event frame this value is not a gray level but a quantity related to the events. In the present method, it is related to the time and frequency of event occurrence.
In some embodiments, a detailed process flow example of the methods described herein can be found in fig. 3.
Firstly, reading the event stream through the bionic image sensor;
then, for each arriving event (x, y, p, t), determining the target addresses in the first three-dimensional array S and the second three-dimensional array T according to the pixel position (x, y) and the polarity p, comparing the timestamp t of the event with the latest time T(x, y, p) recorded at the (x, y) pixel position in the second three-dimensional array (timestamp plane) T, and judging whether the difference is greater than the time threshold T_tr; if so, updating T(x, y, p) in the second three-dimensional array (timestamp plane) to the new timestamp t; if not, waiting for the next event;
secondly, according to the update rule S(x′, y′, p) ← S(x′, y′, p) − k for every neighborhood address (x′, y′) in the window around the target, followed by S(x, y, p) = N, updating the data value corresponding to the target address determined in the first three-dimensional array S to the initial value N (set to 0), subtracting the preset value k (set to 1) from the data value corresponding to each neighborhood address within the preset neighborhood window range around the target address, and keeping the data values of non-neighborhood addresses outside the preset neighborhood window range unchanged;
finally, judging whether a preset condition is met, where the preset condition may be that the timestamp span reaches the duration of a preset time window, or that the total number of events reaches a preset value; if the condition is met, outputting the latest updated first three-dimensional array, alone or combined with the second three-dimensional array, as the encoded event frame, and returning to the first step to continue reading the event stream and obtain the next event frame; if not, continuing to read the next event until the condition is met.
In some implementations, the method described herein may be embodied in code; the original publication presents this code only as an embedded figure.
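The following Python sketch reconstructs that flow under the assumptions used throughout this section (numpy arrays indexed [y, x, p]; N = 0, k = 1, R = 3; the function and parameter names are our own, and only the time-window output condition is shown):

    import numpy as np

    def encode_event_stream(events, H, W, N=0, k=1, R=3, t_tr=0.001, window=0.02):
        """Encode an event stream into event frames, following the flow of Fig. 3.

        events: iterable of (x, y, p, t) tuples in arrival order.
        Yields the encoded event frames (time plane S fused with normalized T).
        """
        S = np.full((H, W, 2), N, dtype=np.int32)  # time plane
        T = np.zeros((H, W, 2))                    # timestamp plane
        t_start = None
        for x, y, p, t in events:
            t_start = t if t_start is None else t_start
            if t - T[y, x, p] > t_tr:              # temporal filter
                T[y, x, p] = t                     # update the timestamp plane
                y0, y1 = max(y - R, 0), min(y + R + 1, H)
                x0, x1 = max(x - R, 0), min(x + R + 1, W)
                S[y0:y1, x0:x1, p] -= k            # suppress the neighborhood window
                S[y, x, p] = N                     # reset the target address
            if t - t_start >= window:              # preset time window reached
                T_norm = (T - T.min()) / (T.max() - T.min() + 1e-9)
                yield S * T_norm                   # output the encoded event frame
                S.fill(N)                          # re-initialize and keep reading
                T.fill(0.0)
                t_start = None

As a usage example, frames = list(encode_event_stream(stream, H=260, W=346)) would collect every encoded frame from a recorded stream.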
In conclusion, compared with the prior art, the present method encodes the inherent spatio-temporal characteristics of the event stream output by the bionic image sensor more faithfully, and can clearly encode the contours of objects moving at different speeds.
Fig. 4 is a block diagram of a data processing apparatus for encoding an event stream into an event frame according to an embodiment of the present application. As shown, the apparatus 400 includes:
a reading module 401, configured to read an event stream expressed by a pixel position, a polarity, and a timestamp, acquired by a bionic image sensor;
an initialization module 402, configured to set an initial value N for a data value corresponding to each address in a preset array of a first three-dimensional array and a preset array of a second three-dimensional array;
a processing module 403, configured to determine target addresses in the first and second three-dimensional arrays according to the pixel position and polarity of each arriving event; update the data value corresponding to the target address determined in the first three-dimensional array to the initial value, subtract a preset value k from the data value corresponding to each neighborhood address within a preset neighborhood window range around the target address, and keep the data values of non-neighborhood addresses outside that range unchanged; and/or update the data value corresponding to the target address determined in the second three-dimensional array to the timestamp of the current event; and, when a preset condition is met, output the latest updated first three-dimensional array, alone or combined with the second three-dimensional array, as the encoded event frame, and return to the reading step to continue reading the event stream.
It should be noted that, because the contents of information interaction, execution process, and the like between the modules/units of the apparatus are based on the same concept as the method embodiment described in the present application, the technical effect brought by the contents is the same as the method embodiment of the present application, and specific contents may refer to the description in the foregoing method embodiment of the present application, and are not described herein again.
It should be further noted that the above division of the modules of the apparatus 400 is only a logical division; in actual implementation the modules may be wholly or partially integrated into one physical entity or physically separated. These units may be implemented entirely in software invoked by a processing element, entirely in hardware, or partly in software called by the processing element and partly in hardware. For example, the processing module 403 may be a separate processing element, may be integrated into a chip of the apparatus, or may be stored in a memory of the apparatus in the form of program code that a processing element of the apparatus calls to execute the functions of the processing module 403. The other modules are implemented similarly. In addition, all or some of the modules may be integrated together or implemented independently. The processing element described here may be an integrated circuit with signal processing capability. In implementation, each step of the above method, or each of the above modules, may be completed by an integrated logic circuit of hardware in the processor element or by instructions in the form of software.
For example, the above modules may be one or more integrated circuits configured to implement the above methods, such as one or more application-specific integrated circuits (ASICs), one or more digital signal processors (DSPs), or one or more field-programmable gate arrays (FPGAs). For another example, when one of the above modules is implemented in the form of program code scheduled by a processing element, the processing element may be a general-purpose processor, such as a central processing unit (CPU) or another processor capable of calling program code. For another example, these modules may be integrated together and implemented in the form of a system-on-a-chip (SoC).
Fig. 5 is a schematic structural diagram of a computer device according to an embodiment of the present application. As shown, the computer device 500 includes: a memory 501, a processor 502, and a communicator 503; the memory 501 is used for storing computer instructions; the processor 502 executes computer instructions to implement the method described in FIG. 1; the communicator 503 is communicatively coupled to the biomimetic image sensor to obtain the stream of events it collects, as expressed by pixel location, polarity, and time stamp.
In some embodiments, the number of the memory 501 in the computer device 500 may be one or more, the number of the processor 502 may be one or more, the number of the communicator 503 may be one or more, and fig. 5 is taken as an example.
In an embodiment of the present application, the processor 502 in the computer device 500 loads one or more instructions corresponding to processes of an application program into the memory 501 according to the steps described in fig. 1, and the processor 502 executes the application program stored in the memory 501, thereby implementing the method described in fig. 1.
The memory 501 may include a Random Access Memory (RAM), or may also include a non-volatile memory (non-volatile memory), such as at least one disk memory. The memory 501 stores an operating system and operating instructions, executable modules or data structures, or a subset thereof, or an expanded set thereof, wherein the operating instructions may include various operating instructions for implementing various operations. The operating system may include various system programs for implementing various basic services and for handling hardware-based tasks.
The processor 502 may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP) and the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or discrete hardware components.
In some specific applications, the various components of the computer device 500 are coupled together by a bus system, which may include a power bus, a control bus and a status signal bus in addition to a data bus; for clarity of explanation, however, the various buses are shown in fig. 5 as one bus system.
In an embodiment of the present application, a computer-readable storage medium is provided, on which a computer program is stored, which when executed by a processor implements the method described in fig. 1.
The present application may be embodied as a system, a method, and/or a computer program product. The computer program product may include a computer-readable storage medium having computer-readable program instructions thereon for causing a processor to implement various aspects of the present application.
The computer readable storage medium may be a tangible device that can hold and store the instructions for use by the instruction execution device. The computer readable storage medium may be, for example, but is not limited to, an electronic memory device, a magnetic memory device, an optical memory device, an electromagnetic memory device, a semiconductor memory device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a Static Random Access Memory (SRAM), a portable compact disc read-only memory (CD-ROM), a Digital Versatile Disc (DVD), a memory stick, a floppy disk, a mechanical coding device, such as a punch card or an in-groove protruding structure with instructions stored thereon, and any suitable combination of the foregoing. Computer-readable storage media as used herein is not to be interpreted as a transitory signal per se, such as a radio wave or other freely propagating electromagnetic wave, an electromagnetic wave propagating through a waveguide or other transmission medium (e.g., optical pulses through a fiber optic cable), or an electrical signal transmitted through an electrical wire.
The computer-readable program instructions described herein may be downloaded from a computer-readable storage medium to respective computing/processing devices, or to an external computer or external storage device via a network, such as the Internet, a local area network, a wide area network and/or a wireless network. The network may comprise copper transmission cables, optical transmission fibers, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers. A network adapter card or network interface in each computing/processing device receives computer-readable program instructions from the network and forwards them for storage in a computer-readable storage medium within the respective computing/processing device. Computer program instructions for carrying out operations of the present application may be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state-setting data, integrated circuit configuration data, or source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk or C++, and procedural programming languages such as the "C" programming language or similar programming languages. The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, electronic circuitry such as a programmable logic circuit, a field-programmable gate array (FPGA) or a programmable logic array (PLA) may execute computer-readable program instructions by utilizing state information of the instructions to personalize the electronic circuitry, thereby implementing aspects of the present application.
In summary, the data processing method, apparatus, device and medium based on the bionic image sensor provided by the present application read the event stream acquired by the bionic image sensor and expressed by pixel position, polarity and timestamp; set an initial value N for the data value corresponding to each address in the preset first and second three-dimensional arrays; determine target addresses in the first and second three-dimensional arrays according to the pixel position and polarity of each arriving event; update the data value corresponding to the target address determined in the first three-dimensional array to the initial value, subtract a preset value k from the data value corresponding to each neighborhood address within the preset neighborhood window range around the target address, and keep the data values of non-neighborhood addresses outside that range unchanged; and/or update the data value corresponding to the target address determined in the second three-dimensional array to the timestamp of the current event; and, when a preset condition is met, output the latest updated first three-dimensional array, alone or combined with the second three-dimensional array, as the encoded event frame, and return to the first step to continue reading the event stream.
The application effectively overcomes various defects in the prior art and has high industrial utilization value.
The foregoing embodiments merely illustrate the principles and effects of the present application and are not intended to limit it. Any person skilled in the art may modify or change the above embodiments without departing from the spirit and scope of the present application. Accordingly, all equivalent modifications or changes made by persons of ordinary skill in the art without departing from the spirit and technical ideas disclosed in the present application shall still be covered by the claims of the present application.

Claims (8)

1. A data processing method based on a bionic image sensor is characterized by comprising the following steps:
reading an event stream which is acquired by a bionic image sensor and is expressed by pixel position, polarity and time stamp;
setting an initial value N for a data value corresponding to each address in the preset arrays of the first three-dimensional array and the second three-dimensional array;
respectively determining target addresses in the first three-dimensional array and the second three-dimensional array according to the pixel position and the polarity of each arriving event;
updating a corresponding data value to an initial value according to the determined target address in the first three-dimensional array, subtracting a preset value k from the data value corresponding to each neighborhood address in a preset neighborhood window range around the target address, and keeping the data value of a non-neighborhood address outside the preset neighborhood window range unchanged; and/or updating the corresponding data value of the target address determined in the second three-dimensional array into a timestamp corresponding to the current event;
when a preset condition is met, outputting the latest updated first three-dimensional array, alone or combined with the second three-dimensional array, as the encoded event frame, and returning to the first step to continue reading the event stream; wherein the preset condition includes: the timestamp span reaching the duration of a preset time window, or the total number of events reaching a preset value.
2. The method of claim 1, wherein updating the data value corresponding to the target address determined in the second three-dimensional array to the timestamp corresponding to the current event comprises:
comparing the timestamp of the current event with the latest updated data value of the target address in the second three-dimensional array, and judging whether the difference value between the timestamp of the current event and the latest updated data value of the target address in the second three-dimensional array is greater than a preset time value or not;
if so, respectively updating the target addresses in the first three-dimensional array and the second three-dimensional array; if not, the updating is not carried out.
3. The method of claim 1, wherein the preset neighborhood window range is (2R + 1) × (2R + 1); where R represents the radius of the neighborhood range of interest.
4. The method of claim 1, wherein outputting the newly updated first three-dimensional array in combination with the second three-dimensional array as an encoded event frame comprises:
multiplying the newly updated first three-dimensional array by the normalized newly updated second three-dimensional array, and outputting the result as an encoded event frame; the encoding method may adopt a frequency-based encoding method or a time-based encoding method.
5. The method according to claim 1, wherein the preset condition comprises: when the timestamp span meets the duration of a preset time window, or when the total number of the events meets a preset value.
6. A data processing apparatus based on a biomimetic image sensor, the apparatus comprising:
the reading module is used for reading an event stream expressed by pixel positions, polarities and time stamps acquired by the bionic image sensor;
the initialization module is used for respectively setting an initial value for a data value corresponding to each address in the preset arrays of the first three-dimensional array and the second three-dimensional array;
the processing module is used for determining target addresses in the first and second three-dimensional arrays according to the pixel position and polarity of each arriving event; updating the data value corresponding to the target address determined in the first three-dimensional array to the initial value N, subtracting a preset value k from the data value corresponding to each neighborhood address within a preset neighborhood window range around the target address, and keeping the data values of non-neighborhood addresses outside the preset neighborhood window range unchanged; and/or updating the data value corresponding to the target address determined in the second three-dimensional array to the timestamp corresponding to the current event; and, when a preset condition is met, outputting the latest updated first three-dimensional array, alone or combined with the second three-dimensional array, as the encoded event frame, and returning to the reading step to continue reading the event stream; wherein the preset condition includes: the timestamp span reaching the duration of a preset time window, or the total number of events reaching a preset value.
7. A computer device, the device comprising: a memory, a processor, and a communicator; the memory is to store computer instructions; the processor executes computer instructions to implement the method of any one of claims 1 to 5; the communicator is in communication connection with the bionic image sensor to acquire the event stream which is acquired by the bionic image sensor and expressed by pixel position, polarity and time stamp.
8. A computer-readable storage medium having stored thereon computer instructions which, when executed, perform the method of any one of claims 1 to 5.
CN202110413403.9A 2021-04-16 2021-04-16 Data processing method, device, equipment and medium based on bionic image sensor Active CN113132658B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110413403.9A CN113132658B (en) 2021-04-16 2021-04-16 Data processing method, device, equipment and medium based on bionic image sensor

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110413403.9A CN113132658B (en) 2021-04-16 2021-04-16 Data processing method, device, equipment and medium based on bionic image sensor

Publications (2)

Publication Number Publication Date
CN113132658A CN113132658A (en) 2021-07-16
CN113132658B (en) 2022-11-22

Family

ID=76777433

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110413403.9A Active CN113132658B (en) 2021-04-16 2021-04-16 Data processing method, device, equipment and medium based on bionic image sensor

Country Status (1)

Country Link
CN (1) CN113132658B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113810478B (en) * 2021-08-30 2022-06-21 北京理工大学 Event stream data processing method and device, electronic equipment and storage medium
CN114466153B (en) * 2022-04-13 2022-09-09 深圳时识科技有限公司 Self-adaptive pulse generation method and device, brain-like chip and electronic equipment

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112529944A (en) * 2020-12-05 2021-03-19 东南大学 End-to-end unsupervised optical flow estimation method based on event camera

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20180014992A (en) * 2016-08-02 2018-02-12 삼성전자주식회사 Event signal processing method and apparatus
CN107105147B (en) * 2017-06-05 2019-06-21 北京理工大学 A kind of bionical super-resolution imaging sensor and imaging method
EP3694202A1 (en) * 2019-02-11 2020-08-12 Prophesee Method of processing a series of events received asynchronously from an array of pixels of an event-based light sensor
US11288818B2 (en) * 2019-02-19 2022-03-29 The Trustees Of The University Of Pennsylvania Methods, systems, and computer readable media for estimation of optical flow, depth, and egomotion using neural network trained using event-based learning
CN110176028B (en) * 2019-06-05 2020-12-15 中国人民解放军国防科技大学 Asynchronous corner detection method based on event camera
CN111931752B (en) * 2020-10-13 2021-01-01 中航金城无人系统有限公司 Dynamic target detection method based on event camera

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112529944A (en) * 2020-12-05 2021-03-19 东南大学 End-to-end unsupervised optical flow estimation method based on event camera

Also Published As

Publication number Publication date
CN113132658A (en) 2021-07-16

Similar Documents

Publication Publication Date Title
CN113132658B (en) Data processing method, device, equipment and medium based on bionic image sensor
US20200005468A1 (en) Method and system of event-driven object segmentation for image processing
US20200027202A1 (en) Techniques for reducing noise in video
CN111292337A (en) Image background replacing method, device, equipment and storage medium
CN108510451B (en) Method for reconstructing license plate based on double-layer convolutional neural network
TWI714397B (en) Method, device for video processing and computer storage medium thereof
CN111064865B (en) Background activity noise filter of dynamic vision sensor and processor
KR20210036319A (en) Method, apparatus and electronic device for identifying text content
CN109711241B (en) Object detection method and device and electronic equipment
CN112967196A (en) Image restoration method and device, electronic device and medium
Glover et al. The event-driven software library for YARP—With algorithms and iCub applications
CN111951192A (en) Shot image processing method and shooting equipment
JP2022173321A (en) Object detection method, apparatus, device, medium, and program
CN114781605A (en) Defect detection model training method and device, electronic equipment and storage medium
US20220067883A1 (en) Dynamic image smoothing based on network conditions
CN116091337B (en) Image enhancement method and device based on event signal nerve coding mode
CN111833262A (en) Image noise reduction method and device and electronic equipment
CN116485682A (en) Image shadow removing system and method based on potential diffusion model
CN116309158A (en) Training method, three-dimensional reconstruction method, device, equipment and medium of network model
CN108093153B (en) Target tracking method and device, electronic equipment and storage medium
CN114078097A (en) Method and device for acquiring image defogging model and electronic equipment
CN114286024A (en) Optical polarization information model construction method and device based on dynamic vision sensor
CN114549535A (en) Image segmentation method, device, equipment, storage medium and product
CN114066841A (en) Sky detection method and device, computer equipment and storage medium
CN113610856A (en) Method and device for training image segmentation model and image segmentation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant