WO2024101210A1 - Information processing device - Google Patents

Information processing device

Publication number: WO2024101210A1
Application number: PCT/JP2023/039183
Authority: WO (WIPO, PCT)
Prior art keywords: data, unit, event, pixel, information processing
Other languages: French (fr), Japanese (ja)
Inventor: 裕貴 町野
Original Assignee: ソニーセミコンダクタソリューションズ株式会社 (Sony Semiconductor Solutions Corporation)
Application filed by ソニーセミコンダクタソリューションズ株式会社
Publication of WO2024101210A1

  • This disclosure relates to an information processing device.
  • An event-based vision sensor (EVS) has been proposed that quickly acquires event information from photoelectric conversion elements when some event, such as a change in brightness, occurs in the imaged scene.
  • The EVS detects changes in brightness, caused for example by the movement of a subject, as events.
  • A method has also been proposed that visualizes detected events in a format that is easy for humans to understand by converting them into images on a frame-by-frame basis.
  • In such a method, frame images are generated that contain the events occurring within a certain period, called a time slice. If the time slice is long, a frame image that is easy for humans to understand is obtained, but event information that occurs multiple times at the same pixel is compressed, compromising the advantage of the high speed of the EVS. Conversely, if the time slice is short, the advantage of high speed is retained, but a frame image that is difficult for humans to understand is obtained. Therefore, a method has been proposed that overlaps time slices to generate frame images with improved visibility while maintaining the high speed of the EVS (see, for example, Non-Patent Document 1).
  • However, with the method of Non-Patent Document 1, the occurrence time and frequency of each event cannot be grasped from a frame image. As a result, it is not possible to determine which of multiple events is newer and which is older, nor whether events occurred frequently at the same pixel position within a time slice.
  • This technology was developed in light of these circumstances, and provides an information processing device that can visualize the time and frequency of events in a format that is easy for humans to understand.
  • According to the present disclosure, an information processing device is provided that includes: a pixel having a light detection element that detects an event based on a change in the amount of incident light; and a pixel data generating unit configured to generate pixel data including information on the event detected in each of a plurality of unit periods.
  • the pixel data may have a pixel value that corresponds to the detection time and detection frequency of the event.
  • the pixel data may have a larger pixel value the more recently the event was detected and the more frequently the event was detected.
  • the pixel data may have a number of gradations or brightness according to the number of the plurality of unit periods.
  • the pixel data is a bit string data of a plurality of bits
  • Information about the events of progressively older unit periods may be arranged in order from the most significant bit side toward the least significant bit side of the bit string data; that is, the newest unit period may correspond to the most significant bit.
  • the event information may be represented by one or more bits for each unit period.
  • the pixel data may include information representing the polarity of the event for each unit period.
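  • As a minimal sketch of this encoding (not taken from the disclosure), the following Python function packs per-unit-period event flags into a single pixel value, assuming eight unit periods, one presence/absence bit per period, and the newest period at the MSB; the function name and period count are illustrative assumptions.

```python
def encode_pixel_history(events_per_period):
    """Pack per-unit-period event flags (oldest first) into one pixel value.

    The newest unit period is mapped to the MSB, so a pixel value grows
    both with how recently and with how frequently events were detected.
    """
    value = 0
    for i, flag in enumerate(events_per_period):  # i = 0 is the oldest period
        value |= (flag & 1) << i                  # oldest -> LSB, newest -> MSB
    return value

# Example: events in unit periods 1, 2, 5, 6 and 7 (7 = newest) give 0b11100110 = 230.
print(encode_pixel_history([0, 1, 1, 0, 0, 1, 1, 1]))  # 230
```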
  • the multiple unit periods may have the same time duration.
  • the multiple unit periods may have longer time widths the older they are.
  • the pixel data generating section may generate the pixel data for each of the plurality of pixels based on information of the event of the corresponding pixel, the information being included in each of the plurality of divided frame data.
  • Two adjacent frames of data in the time axis direction may contain information about events that occurred within overlapping time ranges.
  • the overlapping time ranges may have a length that is an integer multiple of the unit period.
  • the pixel data generating unit may correspond each bit of the pixel data to a different one of the divided frame data, and generate a corresponding bit value of the pixel data based on the event information of the corresponding pixel in the corresponding divided frame data.
  • the pixel data generating unit may generate the bit value of the most significant bit of the pixel data based on the event information of the corresponding pixel in the newer divided frame data.
  • the divided frame generation unit may divide the frame data into the plurality of divided frame data of unit periods having the same time length.
  • the divided frame generation unit may divide the frame data into a plurality of divided frame data whose unit periods each have a different time length.
  • the divided frame generation unit may shorten the time length of the divided frame data for newer times.
  • the information processing device may further include a frame image generating unit that generates a frame image based on the pixel data corresponding to the pixels.
  • the frame image generating unit may generate the frame images such that each pixel has a gradation corresponding to the detection time and detection frequency of the event, and two adjacent frame images in the time axis direction contain information about the event within an overlapping time range.
  • the information processing device may further include an information processing unit that performs predetermined information processing based on the frame images and a neural network that has undergone a learning process.
  • FIG. 1 is a block diagram showing a configuration example of an information processing device according to a first embodiment of the present disclosure.
  • FIG. 2 is a block diagram showing a configuration example of a sensor.
  • FIG. 3A is a circuit diagram illustrating a first example of a pixel.
  • FIG. 3B is a circuit diagram illustrating a second example of a pixel.
  • FIG. 4A is a diagram showing a first example of a laminated structure of the sensor.
  • FIG. 4B is a diagram showing a second example of a laminated structure of the sensor.
  • FIG. 5 is a flowchart of event visualization in the information processing device.
  • FIG. 6 is a diagram illustrating a framing process.
  • FIG. 7A is a diagram illustrating an image generation process in a comparative example.
  • FIG. 7B is a diagram illustrating an image generation process according to the present disclosure.
  • FIG. 8 is a block diagram showing a detailed configuration of an image generating unit according to the first embodiment of the present disclosure.
  • FIG. 9 is a flowchart of image generation according to the first embodiment of the present disclosure.
  • FIG. 10 is a schematic diagram showing division of frame data in the first embodiment of the present disclosure.
  • FIG. 11 is a diagram illustrating generation of pixel data.
  • FIG. 12 is a diagram showing a correspondence relationship between frame data and pixel data in the first embodiment of the present disclosure.
  • FIG. 13 is a diagram showing an output of image data in the first embodiment of the present disclosure.
  • FIG. 14 is a diagram showing two pieces of pixel data generated based on two frame data adjacent in the time axis direction.
  • FIG. 15 is a schematic diagram showing division of frame data in a second embodiment of the present disclosure.
  • FIG. 16 is a diagram showing the correspondence between frame data and pixel data in the second embodiment of the present disclosure.
  • FIG. 17 is a block diagram showing a detailed configuration of an image generating unit according to a third embodiment of the present disclosure.
  • FIG. 18 is a flowchart of image generation according to the third embodiment of the present disclosure.
  • FIG. 19 is a block diagram showing a configuration of an information processing device according to a fourth embodiment of the present disclosure.
  • FIG. 20 is a schematic diagram showing an example of information processing by an information processing unit.
  • FIG. 21 is a block diagram showing an example of a schematic configuration of a vehicle control system.
  • FIG. 22 is an explanatory diagram showing an example of the installation positions of an outside-vehicle information detection unit and an imaging unit.
  • First Embodiment: FIG. 1 is a block diagram showing an example of the configuration of an information processing device 1 according to the first embodiment of the present disclosure.
  • the information processing device 1 detects the movement and brightness change of an arbitrary subject.
  • the information processing device 1 can be applied to both a stationary device such as an industrial robot and a portable device such as a smartphone.
  • the information processing device 1 in Fig. 1 includes a sensor 2, an event processing unit 3, and an application unit 4.
  • the sensor 2 has the function of detecting the movement or brightness change of a subject.
  • the sensor 2 is, for example, an event-based vision sensor (EVS).
  • the sensor 2 detects events based on the amount of change in the amount of incident light.
  • the events detected by the sensor 2 are supplied to the event processing unit 3, for example, via MIPI (Mobile Industry Processor Interface).
  • the event processing unit 3 generates frame data including information on events that have occurred within a predetermined period (hereinafter, event information) in units of frames each including a plurality of pixels.
  • the event processing unit 3 is configured, for example, with an FPGA (Field Programmable Gate Array).
  • the event processing unit 3 includes an event acquisition unit 11, a decoding unit 12, and a frame generation unit 13.
  • the event acquisition unit 11 acquires event information output synchronously or asynchronously from the sensor 2.
  • the event processing unit 3 may temporarily store the event information acquired from the sensor 2 in the event storage unit 14.
  • the decoding unit 12 decodes the event information acquired by the event acquisition unit 11.
  • the event information output from the sensor 2 is compressed data including information such as the time of event occurrence and polarity.
  • the decoding unit 12 decompresses the event information acquired by the event acquisition unit 11 to generate event data including information such as the time of event occurrence and polarity.
  • the event data decoded by the decoding unit 12 is supplied to the frame generation unit 13.
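  • As an illustrative sketch only: the disclosure does not specify the compressed event format, so the decoding step can be pictured as unpacking a hypothetical packed word into position, polarity, and time fields. The 64-bit layout below (14-bit x, 14-bit y, 1-bit polarity, remaining bits for the timestamp) is an assumption made purely for illustration.

```python
def decode_event_word(word: int):
    """Unpack one hypothetical packed event word into (x, y, p, t)."""
    x = word & 0x3FFF           # bits 0-13  : X address of the event
    y = (word >> 14) & 0x3FFF   # bits 14-27 : Y address of the event
    p = (word >> 28) & 0x1      # bit  28    : polarity (1 = increase in light)
    t = word >> 29              # bits 29-.. : timestamp of the detection
    return x, y, p, t
```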
  • the frame generation unit 13 generates frame data including event data generated within a predetermined period.
  • the predetermined period is sometimes referred to as a time slice.
  • the frame generation unit 13 supplies the frame data to the application unit 4, for example, via a USB (Universal Serial Bus).
  • the application unit 4 generates image data on a frame-by-frame basis, and is configured, for example, by an FPGA.
  • the event processing unit 3 and application unit 4 may also be configured on a single semiconductor chip.
  • the sensor 2, event processing unit 3, and application unit 4 may also be configured on a single semiconductor chip.
  • the application unit 4 includes an image generation unit 15.
  • the image generation unit 15 generates image data on a frame-by-frame basis based on the frame data supplied from the frame generation unit 13.
  • the image data generated by the image generation unit 15 is used, for example, to display a frame image on the display unit 5.
  • the image data generated by the application unit 4 may be used for processing such as object tracking, recognition, or motion prediction, as described below.
  • FIG. 2 is a block diagram showing an example of the configuration of the sensor 2.
  • the sensor 2 includes a pixel array section 21, a vertical drive section 22, and a signal processing section 23.
  • the pixel array section 21 includes a plurality of pixels 30 arranged one-dimensionally or two-dimensionally in a matrix.
  • a plurality of pixels 30 are arranged two-dimensionally.
  • the horizontal direction in FIG. 2 is called the row direction X
  • the vertical direction is called the column direction Y.
  • a plurality of pixels 30 are arranged in the row direction X and the column direction Y.
  • Event detection is performed by each of the multiple pixels 30.
  • the pixels 30 have a photoelectric conversion element 31 and a pixel circuit 32.
  • the photoelectric conversion element 31 receives subject light and generates an electric charge according to the amount of light received. The generated electric charge is detected as an event by the pixel circuit 32.
  • the vertical drive unit 22 generates multiple vertical drive signals that control whether or not to drive multiple pixels arranged in the column direction Y.
  • the vertical drive unit 22 can select a pixel block in any range in the column direction Y and sequentially drive the pixels in the selected pixel block.
  • the sensor 2 may also include a horizontal drive unit that controls whether or not to drive a plurality of pixels arranged in the row direction X.
  • the signal processing unit 23 performs a predetermined signal processing on the events detected from each pixel 30.
  • the events after the signal processing are sequentially output to the downstream event processing unit 3 etc. via an output circuit etc. (not shown).
  • FIG. 3A is a circuit diagram showing a first example of a pixel 30.
  • the pixel 30 includes a photoelectric conversion element 31 and a pixel circuit 32.
  • the pixel circuit 32 includes a transfer transistor TRG, a charge-voltage conversion unit 33, a buffer 34, a differentiation circuit 35, and a quantizer 36.
  • the charge-voltage conversion unit 33, the photoelectric conversion element 31, and the transfer transistor TRG constitute a logarithmic response unit 37.
  • the logarithmic response unit 37 performs logarithmic conversion on the charge photoelectrically converted by the photoelectric conversion element 31 to generate a voltage signal Vlog.
  • the reason for the logarithmic conversion is to expand the dynamic range of the pixel 30 that acquires the luminance information.
  • the photoelectric conversion element 31 accumulates electric charges (photocharges) based on incident light that is incident on the corresponding pixel 30.
  • a photodiode is used as the photoelectric conversion element 31.
  • the photoelectric conversion element 31 has an anode and a cathode. Either the anode or the cathode (for example, the cathode) is connected to the source of the transfer transistor TRG, and the other (for example, the anode) is connected to a predetermined reference voltage node such as a ground voltage.
  • the transfer transistor TRG is used to switch the transfer of photocharges.
  • the transfer transistor TRG is turned on when, for example, a high-level transfer signal is applied to the gate of the transfer transistor TRG.
  • the drain of the transfer transistor TRG is connected to the input node n1 of the charge-voltage conversion unit 33.
  • the charge-voltage conversion unit 33 converts the charge stored in the photoelectric conversion element 31 into a voltage.
  • the charge-voltage conversion unit 33 includes transistors Q1 to Q5.
  • Transistors Q1 and Q2 are cascode-connected between the power supply voltage node and the transfer transistor TRG.
  • the source of transistor Q1 is connected to the drain of the transfer transistor TRG and the gate of transistor Q3.
  • the gate of transistor Q1 is connected to the drain of transistor Q3 and the source of transistor Q4.
  • the drain of transistor Q1 is connected to the source of transistor Q2 and the gate of transistor Q4.
  • the drain of transistor Q2 is connected to the power supply voltage node.
  • the gate of transistor Q2 is connected to the output node n2 of the charge-voltage conversion unit 33, the drain of transistor Q4, and the drain of transistor Q5.
  • Transistor Q3 and transistor Q4 are cascode-connected between node n2 and the reference voltage (ground) node.
  • the source of transistor Q3 is connected to the reference voltage (ground) node.
  • Transistor Q4 is disposed between transistor Q3 and transistor Q5.
  • the source of transistor Q5 is connected to the power supply voltage node, and a bias voltage Vblog is applied to its gate.
  • Transistor Q5 adjusts the voltage level of output node n2 according to the voltage level of bias voltage Vblog.
  • the voltage signal Vlog logarithmically converted by the charge-voltage converter 33 is input to the buffer 34.
  • the buffer 34 includes a transistor Q6 and a transistor Q7 that are cascode-connected between a power supply voltage node and a reference voltage (ground) node.
  • A PMOS transistor, for example, is used as the transistor Q6.
  • An NMOS transistor, for example, is used as the transistor Q7.
  • Transistor Q6 in buffer 34 forms a source follower circuit.
  • a pixel voltage Vsf corresponding to the voltage signal Vlog output from charge-voltage conversion unit 33 is output from buffer 34.
  • the voltage signal Vlog is input to the gate of transistor Q6 from output node n2 of charge-voltage conversion unit 33.
  • the source of transistor Q6 is connected to the power supply voltage node.
  • the drain of transistor Q6 is connected to the drain of transistor Q7 and differentiation circuit 35 via output node n3 of buffer 34.
  • the source of transistor Q7 is connected to the reference voltage (ground) node.
  • a bias voltage Vbsf is applied to the gate of transistor Q7.
  • Transistor Q7 adjusts the voltage level of output node n3 according to the voltage level of bias voltage Vbsf.
  • the pixel voltage Vsf output from the buffer 34 is input to the differentiation circuit 35.
  • the buffer 34 can improve the driving force of the pixel voltage Vsf. Furthermore, by providing the buffer 34, it is possible to ensure isolation so that noise generated when the downstream differentiation circuit 35 performs a switching operation is not transmitted to the charge-voltage conversion unit 33.
  • the differentiation circuit 35 generates a differentiation signal Vout according to the change in the voltage signal Vlog converted by the charge-voltage converter 33.
  • the differentiation circuit 35 includes a capacitor C1 and transistors Q8 to Q10.
  • an NMOS transistor is used for the transistor Q10, and for example, PMOS transistors are used for the transistors Q8 and Q9.
  • Capacitor C1 is disposed between a connection node n4 of the source of transistor Q8 and the gate of transistor Q9, and an output node n3 of buffer 34. Capacitor C1 accumulates charge based on the pixel voltage Vsf supplied from buffer 34. Capacitor C2 is disposed between the gate of transistor Q9 and the drain of transistor Q10. Capacitor C2 supplies charge to the source of transistor Q8 and the gate of transistor Q9 according to the amount of change over time (the time derivative) of the pixel voltage Vsf.
  • Transistor Q8 switches whether or not to short the gate and drain of transistor Q9 according to auto-zero signal XAZ.
  • Auto-zero signal XAZ is a signal that indicates initialization, and for example, goes from high level to low level every time an event detection signal described below is output from pixel 30.
  • When transistor Q8 turns on, the differentiated signal Vout is reset to its initial value, and the charge of capacitor C1 is initialized.
  • the source of transistor Q10 is connected to the reference voltage (ground) node, and a bias voltage Vbdiff is applied to its gate.
  • Transistor Q10 adjusts the voltage level of output node n5 of differentiation circuit 35 according to the voltage level of bias voltage Vbdiff.
  • Transistor Q9 and transistor Q10 function as an inverting circuit with connection node n4 on the gate side of transistor Q9 as the input node and connection node n5 of transistor Q9 and transistor Q10 as the output node.
  • the differentiation circuit 35 detects the amount of change in the pixel voltage Vsf by differential calculation.
  • the amount of change in the pixel voltage Vsf indicates the amount of change in the amount of light incident on the pixel 30.
  • the differentiation circuit 35 supplies the differentiation signal Vout to the quantizer 36 via the output node n5.
  • the quantizer 36 performs a comparison operation to compare the differential signal Vout with a threshold voltage. Based on the result of the comparison operation, the quantizer 36 detects an event indicating that the absolute value of the change in the amount of incident light has exceeded the threshold voltage, and outputs an event detection signal COMP+ and an event detection signal COMP-.
  • the quantizer 36 includes transistors Q11 to Q14 and an inverter K1. For example, PMOS transistors are used as the transistors Q11 and Q13. Furthermore, for example, NMOS transistors are used as the transistors Q12 and Q14.
  • Transistors Q11 and Q12 are cascode-connected between a power supply voltage node and a reference voltage (ground) node.
  • the source of transistor Q11 is connected to the power supply voltage node.
  • the drain of transistor Q11 is connected to inverter K1 and the drain of transistor Q12.
  • the source of transistor Q12 is connected to the reference voltage (ground) node.
  • a differentiated signal Vout from differentiation circuit 35 is applied to the gate of transistor Q11.
  • a threshold voltage Vhigh is applied to the gate of transistor Q12.
  • Transistors Q11 and Q12 compare the differentiated signal Vout with the threshold voltage Vhigh. Specifically, when the differentiated signal Vout of the differentiation circuit 35 is lower than the threshold voltage Vhigh, the transistor Q11 turns on, and the event detection signal COMP+ output from the drain of the transistor Q11 via the inverter K1 becomes low level.
  • Transistors Q13 and Q14 are cascode-connected between the power supply voltage node and the reference voltage (ground) node.
  • the source of transistor Q13 is connected to the power supply voltage node.
  • the drain of transistor Q13 is connected to the output node of quantizer 36 and the drain of transistor Q14.
  • the differentiated signal Vout of differentiation circuit 35 is applied to the gate of transistor Q13.
  • the threshold voltage Vlow is applied to the gate of transistor Q14.
  • Transistors Q13 and Q14 compare the differentiated signal Vout with the threshold voltage Vlow. Specifically, when the differentiated signal Vout of the differentiation circuit 35 is higher than the threshold voltage Vlow, transistor Q13 turns off, and the event detection signal COMP- output from the drain of transistor Q13 becomes low level.
  • the pixel 30 can detect an increase or decrease in the amount of incident light as an event.
  • When the amount of incident light increases, a photoelectric charge is generated by the photoelectric conversion element 31, and the voltage of the input node n1 connected to the cathode of the photoelectric conversion element 31 decreases.
  • the output voltage Vlog of the charge-voltage conversion unit 33 decreases, and the pixel voltage Vsf of the buffer 34 also decreases.
  • the differential signal Vout output from the differentiation circuit 35 decreases in response to the amount of decrease in the pixel voltage Vsf, and when it falls below the threshold voltage Vhigh, a low-level event detection signal COMP+ is output.
  • a low-level event detection signal COMP+ indicates that the increase in the amount of incident light exceeds the threshold value determined by the threshold voltage Vhigh.
  • a low-level event detection signal COMP- indicates that the amount of decrease in the amount of incident light is below the threshold determined by the threshold voltage Vlow.
  • An event also has polarity information that indicates whether the change in the luminance of the incident light is positive (an increase) or negative (a decrease).
  • FIG. 3B is a circuit diagram showing a second example of the pixel 30.
  • the quantizer 36a in the pixel circuit 32a shown in FIG. 3B differs from the quantizer 36 in FIG. 3A in that it does not include transistors Q13 and Q14. Therefore, the pixel 30a (and pixel circuit 32a) in FIG. 3B detects only the increase in the amount of light incident on the photoelectric conversion element 31, and outputs the event detection signal COMP+.
  • Alternatively, pixel 30 may be configured with transistors Q11, Q12 and inverter K1 removed from quantizer 36 in FIG. 3A. In that case, pixel 30 detects only a decrease (not an increase) in the amount of light received by photoelectric conversion element 31, and outputs event detection signal COMP-.
  • the sensor 2 can also be configured, for example, with two stacked chips.
  • FIG. 4A is a diagram showing a first example of the stacked structure of the sensor 2.
  • This sensor 2 comprises a pixel chip 41 and a logic chip 42 stacked on the pixel chip 41. These chips are joined by vias or the like. Note that in addition to vias, they can also be joined by Cu-Cu bonding or bumps.
  • the pixel chip 41 has, for example, the photoelectric conversion element 31 and a part of the pixel circuit 32 (for example, the transfer transistor TRG and the charge-voltage conversion unit 33) arranged thereon.
  • the logic chip 42 has, for example, the remaining part of the pixel circuit 32 (for example, the buffer 34, the differentiation circuit 35, and the quantizer 36), the vertical drive unit 22, and the signal processing unit 23 arranged thereon.
  • the sensor 2 may be composed of three or more stacked chips.
  • FIG. 4B is a diagram showing a second example of the stacked structure of the sensor 2.
  • a first pixel chip 41a and a second pixel chip 41b are stacked instead of the pixel chip 41.
  • In the first pixel chip 41a, for example, the photoelectric conversion element 31 and the transfer transistor TRG are arranged.
  • In the second pixel chip 41b, the charge-voltage conversion unit 33 is arranged.
  • the sensor 2a in FIG. 4B has a configuration in which the charge-voltage conversion unit 33 is removed from the first pixel chip 41a and placed on the second pixel chip 41b. This ensures that the area of the photoelectric conversion element 31 is sufficient in the first pixel chip 41a, and that the area of the charge-voltage conversion unit 33 is sufficient in the second pixel chip 41b, even when the chip area is miniaturized.
  • the information processing device 1 is characterized by generating frame images capable of visualizing the detection time and detection frequency of an event.
  • the series of processes for this purpose is called event visualization.
  • FIG. 5 is a flowchart showing an outline of the processing procedure for event visualization in the information processing device 1.
  • an event is detected in each pixel 30 in the sensor 2 (step S1).
  • the event acquisition unit 11 in the event processing unit 3 acquires compressed event information at predetermined intervals.
  • the event information acquired within a predetermined period may be referred to as a data chunk.
  • the data chunk acquired by the event acquisition unit 11 may be temporarily stored in the event storage unit 14 (step S2).
  • the decoding unit 12 decodes the data chunks stored in the event storage unit 14 and outputs the event data (step S3).
  • the event data output by the decoding unit 12 includes, for example, the following information:
    x: X address indicating the detection position (e.g., pixel position) of the event
    y: Y address indicating the detection position (e.g., pixel position) of the event
    t: time information (time stamp) indicating the detection time of the event
    p: polarity information of the event
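  • A hypothetical container for one decoded event, mirroring the four fields listed above, might look as follows (the class itself is an assumption for illustration; the disclosure only specifies the information content).

```python
from dataclasses import dataclass

@dataclass
class Event:
    x: int  # X address indicating the detection position of the event
    y: int  # Y address indicating the detection position of the event
    t: int  # time information (time stamp) indicating the detection time
    p: int  # polarity information (e.g., +1 for an increase, -1 for a decrease)
```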
  • the frame generating unit 13 performs a framing process to generate frame data based on the event data output from the decoding unit 12 (step S4).
  • the image generating unit 15 performs a frame image generation process based on the frame data supplied by the frame generating unit 13 (step S5). The image generation process will be described in detail later.
  • FIG. 6 is a diagram explaining the framing process.
  • the decoding unit 12 outputs event data decoded in units of data chunks dc.
  • One data chunk dc contains, for example, event information for all pixels that occurred during a predetermined unit detection period Δt.
  • Each piece of event information is represented by event data e(x, y, p, t) that contains the event detection positions x, y, event detection time t, and event polarity information p described above.
  • the frame generation unit 13 outputs multiple event data included in a predetermined time slice Ts consisting of multiple unit detection periods Δt as one frame of data.
  • In the example of FIG. 6, the time slice Ts has a time width of 8Δt.
  • the frame data F1 includes eight data chunks dc.
  • the frame generation unit 13 generates frame data including event information generated within a predetermined period (time slice Ts) in frame units including multiple pixels 30.
  • the frame generation unit 13 generates frame data such that two adjacent frame data in the time axis direction have event data with overlapping time ranges.
  • the overlapping time ranges are sometimes called overlapping sections.
  • an overlap section with a time width of 6Δt is provided between frame data F1 and F2.
  • an overlap section with a time width of 6Δt is provided between frame data F2 and F3.
  • the duration of the time slice Ts and the duration of the overlap section Tor are not limited to those shown in FIG. 6, and any duration may be set. If the time slice Ts is too short, fewer events will be included in one frame of data, making it difficult to visually grasp the characteristics of the event information. If the time slice Ts is too long compared to the overlap section Tor, the interval between frame data generation will be longer, and the advantage of the high speed of EVS will be lost. Therefore, it is desirable to adjust the duration of the time slice Ts and the duration of the overlap section Tor according to the occurrence status of events, etc.
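  • A minimal sketch of this framing step, assuming Ts = 8Δt and Tor = 6Δt as in FIG. 6 and the hypothetical Event container sketched earlier; the function and variable names are illustrative assumptions, not part of the disclosure.

```python
def frame_events(events, ts, tor, t_start, n_frames):
    """Group events into frames of width ts whose time ranges overlap by tor."""
    step = ts - tor                        # each frame starts Ts - Tor after the previous one
    frames = []
    for k in range(n_frames):
        lo = t_start + k * step
        hi = lo + ts                       # one time slice Ts
        frames.append([e for e in events if lo <= e.t < hi])
    return frames

# Example with dt = 1: frames of width 8*dt advancing by 2*dt, overlapping by 6*dt.
# frames = frame_events(events, ts=8, tor=6, t_start=0, n_frames=3)
```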
  • FIG. 7A is a diagram showing an image generation process in a comparative example.
  • FIG. 7A shows an example in which subject A moves from position Ps to position Pe in time slice Ts. A change in luminance occurs while subject A moves from position Ps to position Pe, and an event is detected by sensor 2. As a result, image data G0 including event data is generated.
  • Image data G0 contains event information that occurred within time slice Ts, but the time at which each event occurred is unknown. Therefore, it is not possible to determine from image data G0 whether subject A is moving from position Ps to position Pe, or from position Pe to position Ps.
  • image data G0 of the comparative example does not include information on the time when an event occurred, so the movement of the subject cannot be visually interpreted or analyzed.
  • the information processing device 1 disclosed herein is characterized by its ability to solve this problem.
  • FIG. 7B is a diagram showing the image generation process in the present disclosure. Like FIG. 7A, FIG. 7B shows an example in which subject A moves from position Ps to position Pe in time slice Ts.
  • Image data G output by information processing device 1 of the present disclosure has the feature that the gradation changes according to the time at which an event occurs. Specifically, new events are represented in dark colors, and older events are represented in light colors. This makes it possible to determine from the change in gradation of image data G that subject A is moving from position Ps to position Pe. In this way, information processing device 1 of the present disclosure can output image data G that enables visual interpretation or analysis of the subject's movement.
  • FIG. 8 is a block diagram showing a detailed configuration of the image generation unit 15 in the first embodiment of the present disclosure.
  • frame data is supplied to the image generation unit 15 from the frame generation unit 13.
  • the image generation unit 15 also generates image data based on the frame data, and outputs the image data to the display unit 5, for example.
  • the image generation unit 15 includes a divided frame generation unit 51, a pixel data generation unit 52, and a frame image generation unit 53.
  • the divided frame generation unit 51 divides the frame data supplied from the frame generation unit 13, and outputs multiple block data (divided frame data), each having a predetermined time width (hereinafter, unit period), to the pixel data generation unit 52.
  • the pixel data generating unit 52 generates pixel data including event information detected in each of a plurality of unit periods.
  • the pixel data generated by the pixel data generating unit 52 is generated for each pixel 30 in the pixel array unit 21, i.e., for each individual pixel 30 that constitutes a frame image.
  • the pixel data generating unit 52 includes a mapping unit 54, a shifting unit 55, and a storage unit 56.
  • the mapping unit 54 links pixel position information of the event data contained in the block data D with the corresponding coordinates (hereinafter, pixel coordinates) on the drawing canvas.
  • the mapping unit 54 obtains block data from the divided frame generation unit 51 and extracts the event data within the block data.
  • the mapping unit 54 assigns the event data to pixel coordinates based on the pixel position information of the event data.
  • the mapping unit 54 determines whether or not an event corresponding to each pixel exists.
  • the shifting unit 55 acquires the event data for each pixel coordinate from the mapping unit 54, and generates pixel data in a binary data format consisting of multiple bits for each pixel coordinate.
  • the storage unit 56 stores, arranged in pixel order for each pixel coordinate, pixel data in binary data format whose number of bits equals the number of block data obtained by dividing the frame data.
  • the shift unit 55 reads out the pixel data in the storage unit 56, bit-shifts it toward the LSB side, discards the old binary data at the LSB, and adds new binary data at the MSB. The detailed operation of the shift unit 55 will be described later.
  • the shift unit 55 repeats the shifting operation of the pixel data based on the event data of each block data constituting the frame data.
  • the pixel data generating unit 52 outputs the pixel data to the frame image generating unit 53.
  • the frame image generating unit 53 generates image data (frame image) in frame units based on the pixel data of each pixel coordinate.
  • image generation unit 15 in the first embodiment of the present disclosure is not limited to the configuration shown in FIG. 8 as long as it can realize the image generation process shown in FIG. 9.
  • FIG. 9 is a flowchart of image generation in the first embodiment of the present disclosure.
  • the process shown in FIG. 9 corresponds to the image generation process of step S5 shown in FIG. 5.
  • the divided frame generation unit 51 divides the frame data into N pieces of block data (step S11).
  • N is a number corresponding to the number of gradations.
  • FIG. 10 is a schematic diagram showing the division of frame data in the first embodiment of the present disclosure. As shown in FIG. 10, the divided frame generation unit 51 divides the frame data F in the time axis direction to generate multiple pieces of block data D. In other words, the divided frame generation unit 51 divides the frame data into data for multiple unit periods.
  • the unit periods of the eight block data D all have the same time width. The following explanation will be given using this example.
  • the frame data F includes multiple event data e(x, y, p, t). These event data are assigned to the respective block data D.
  • Each piece of block data D is assigned an identification number i.
  • the identification numbers i are assigned 0 to N-1 in chronological order from oldest to newest.
  • the block data D in FIG. 10 are assigned numbers 0, 1, 2, ... 7 in chronological order.
  • the frame data F may be divided into the same time intervals as the data chunks dc in FIG. 6.
  • the data chunks dc included in the overlap section Tor can be reused when acquiring the next frame data F. Details will be described later.
  • In step S12, the identification number i is initialized to 0.
  • the processing in steps S13 to S16 is performed on one piece of block data D at a time.
  • In steps S13 to S16, the identification number i is incremented, and the eight pieces of block data D are processed sequentially, starting from the oldest.
  • In step S13, it is determined whether the identification number i is smaller than the division number N. If the identification number i is smaller than the division number N, the processing from step S14 onward is carried out.
  • the pixel data generator 52 acquires all event data included in the block data D with the identification number i (step S14).
  • One block data D may contain multiple event data for multiple events.
  • the pixel data generation unit 52 generates pixel data by mapping each event data acquired in step S14 onto a drawing canvas for the image data (step S15).
  • the drawing canvas is a data area in which pixel data for each pixel 30 is arranged in the order in which the pixels 30 are arranged, and is provided in correspondence with, for example, a storage area of the storage unit 56.
  • a frame image is generated based on each pixel data mapped onto the drawing canvas.
  • FIG. 11 is a diagram showing the generation of pixel data.
  • the mapping unit 54 in the pixel data generation unit 52 extracts the event detection position (x1, y1) from, for example, event data e1 (x1, y1, p1, t1) in the block data D. Based on this, the event data e1 is assigned to the corresponding coordinates pxa (x1, y1) on the drawing canvas cp.
  • Other event data in the block data D are likewise assigned to the corresponding coordinates on the drawing canvas cp.
  • For a pixel position such as (x2, y2) for which no event data exists in the block data D, data indicating that no event exists is assigned to the corresponding coordinate pxb (x2, y2) on the drawing canvas cp.
  • the pixel data generation unit 52 determines whether or not a corresponding event exists in the block data D for each coordinate in the drawing canvas cp, and generates binary data of one or more bits (one bit in the example of FIG. 11).
  • the binary data is, for example, one or more bits of binary data indicating the presence or absence of an event and its polarity. In this specification, an example in which the binary data is one bit of binary data indicating the presence or absence of an event will be mainly described.
  • For the coordinate pxa, the pixel data generation unit 52 generates binary data bnewa1 (e.g., 1) indicating that an event exists, and for the coordinate pxb, it generates binary data bnewb1 (e.g., 0) indicating that no event exists.
  • the pixel data generation unit 52 generates binary data for other coordinates in the drawing canvas cp based on the presence or absence of a corresponding event in the block data D.
  • the memory unit 56 stores pixel data corresponding to each coordinate in the drawing canvas cp, for example as bit string data.
  • the memory unit 56 stores pixel data Arra0 for the coordinate pxa before mapping.
  • the pixel data Arra0 has binary data bolda0 in its least significant bit (hereinafter, LSB: Least Significant Bit) and binary data bnewa0 in its most significant bit (hereinafter, MSB: Most Significant Bit).
  • the shift unit 55 shifts each bit of the pixel data Arra0 by one bit toward the LSB side to generate pixel data Arra1. As a result, the LSB of the pixel data Arra0 is discarded.
  • the shift unit 55 also adds the corresponding event data bnewa1 in the partition data D to the MSB. As a result, new pixel data Arra1 is generated that has binary data bolda1 in the LSB and binary data bnewa1 in the MSB.
  • the binary data bnewa1 has a bit value of 1, which indicates that an event exists.
  • the memory unit 56 stores pixel data Arrb0, which corresponds to the coordinate pxb before mapping and has binary data boldb0 in the LSB and binary data bnewb0 in the MSB.
  • the shift unit 55 discards the binary data boldb0 from the LSB of the pixel data Arrb0, shifts the binary data from binary data boldb1 to binary data bnewb0 one bit at a time toward the LSB side, and adds binary data bnewb1 to the MSB side.
  • the binary data bnewb1 is 0, which indicates that no event exists. In this way, the pixel data Arrb1 is generated.
  • the shift unit 55 shifts each bit of the pixel data stored in the memory unit 56 for each coordinate of the drawing canvas cp toward the LSB by one bit at a time, and adds one bit of information indicating the presence or absence of an event in the block data D to the MSB, thereby generating pixel data for each pixel 30 for each block data D.
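  • The mapping and shifting described above can be pictured with the following sketch, assuming 8 unit periods per frame, a NumPy array holding one bit string per pixel, and the hypothetical Event container from earlier; the names and shapes are illustrative assumptions.

```python
import numpy as np

N_BITS = 8  # number of block data D per frame (= number of unit periods)

def update_pixel_data(pixel_data, block_events, height, width):
    """Shift every pixel's bit string one bit toward the LSB (discarding the
    oldest bit) and set the MSB from the newest block data D: 1 where at
    least one event exists for that pixel, 0 otherwise."""
    occupancy = np.zeros((height, width), dtype=np.uint8)
    for e in block_events:                    # mapping step (mapping unit 54)
        occupancy[e.y, e.x] = 1
    return (pixel_data >> 1) | (occupancy << (N_BITS - 1))  # shift step (shift unit 55)

# usage: process the block data of one frame from oldest to newest
# pixel_data = np.zeros((height, width), dtype=np.uint8)   # storage unit 56
# for block in blocks:
#     pixel_data = update_pixel_data(pixel_data, block, height, width)
```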
  • When the processing of step S15 in FIG. 9 is completed, the identification number i of the block data D is incremented (step S16), and the processing from step S13 onward is repeated. If the identification number i is smaller than the division number N in step S13, the processing of steps S14 to S15 is performed for the next block data D. When the identification number i reaches the division number N, the pixel data generation processing for all block data D of the frame data F is complete.
  • the pixel data generation unit 52 outputs the new pixel data generated in step S15 to the frame image generation unit 53.
  • FIG. 12 is a diagram showing the correspondence between frame data and pixel data in the first embodiment of the present disclosure.
  • Frame data F includes event data corresponding to all pixels 30 in the pixel array section 21. Each event data in the frame data F is assigned to each coordinate in the drawing canvas cp.
  • the event data of a specific pixel in the frame data F is shown with a thick line, and the event data of the other pixels is shown with a dashed line.
  • FIG. 12 also shows an example in which the event data of a specific pixel in the frame data F is assigned to pixel data pd of coordinate pxc in the drawing canvas cp.
  • the pixel data generation unit 52 associates each bit of the pixel data pd with a different block data D in the frame data F, and generates a corresponding bit value of the pixel data pd based on the event information of the corresponding pixel 30 in the block data D.
  • Figure 12 shows an example in which a corresponding bit value of the pixel data pd is generated based on the event data epxc of a specific pixel.
  • pixel data pd has older binary data on the LSB side and newer binary data on the MSB side.
  • pixel data generator 52 generates the bit value on the higher order bit side of pixel data pd based on the event information of the corresponding pixel 30 in the newer block data D.
  • the binary data of each bit of pixel data pd is, for example, 1 if event data (event data epxc in FIG. 12) of the corresponding pixel 30 exists in the corresponding block data D, and is, for example, 0 if no event data epxc exists.
  • FIG. 12 shows an example in which the binary data is 1 if at least one event data epxc exists in the corresponding block data D, but the conditions for setting the binary data of each bit of pixel data to 1 may be set arbitrarily.
  • the binary data of each bit of pixel data may be 1 if the number of events of the corresponding pixel 30 exists in the corresponding block data D equal to or exceeds a predetermined threshold, and 0 otherwise.
  • the pixel data generating unit 52 generates pixel data pd including event information detected in each of a plurality of unit periods.
  • the pixel data pd is bit string data of a plurality of bits.
  • event information of progressively older unit periods is arranged in order from the MSB side toward the LSB side of the bit string data; that is, the newest unit period corresponds to the MSB.
  • the event information is represented by one or more bits for each unit period.
  • the pixel data pd may also include polarity information p of the event for each unit period.
  • the pixel data generating unit 52 similarly generates pixel data for other coordinates within the drawing canvas cp. That is, the pixel data generating unit 52 generates pixel data for each of all pixels 30 within the pixel array unit 21 based on the event information of the corresponding pixel 30 contained in each of the multiple block data D.
  • the frame image generating unit 53 draws a frame image based on the pixel data of each pixel for one frame generated by the processing of steps S13 to S16 (step S17).
  • the pixel data of each pixel constituting the frame image is collectively referred to as image data.
  • FIG. 13 is a diagram showing an example of image data in the first embodiment of the present disclosure. Three pieces of pixel data pd1, pd2, and pd3 that are part of the image data are shown in FIG. 13. These three pieces of pixel data correspond to coordinates pxd, pxe, and pxf, respectively, within the drawing canvas cp.
  • the pixel data pd1 has an MSB of 1, indicating that an event has occurred most recently.
  • the pixel data pd2 has more bits that are 1 than the pixel data pd3, indicating that an event has occurred more frequently.
  • the pixel values of the pixel data pd1, pd2, and pd3 are 206, 124, and 97, respectively. Therefore, the pixel data pd1 has the highest gradation (brightness), the pixel data pd2 has the next highest gradation (brightness), and the pixel data pd3 has the lowest gradation (brightness).
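  • For reference, a quick check of the three pixel values quoted above (a sketch, assuming 8-bit pixel data with the MSB as the newest unit period):

```python
for name, value in [("pd1", 206), ("pd2", 124), ("pd3", 97)]:
    bits = format(value, "08b")          # MSB (newest unit period) first
    print(name, bits, "-> events in", bits.count("1"), "of 8 unit periods")
# pd1 11001110 -> events in 5 of 8 unit periods (MSB is 1: event in the newest period)
# pd2 01111100 -> events in 5 of 8 unit periods
# pd3 01100001 -> events in 3 of 8 unit periods
```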
  • the gradation (brightness) of each pixel in the frame image generated by the frame image generating unit 53 makes it possible to determine the occurrence (detection) time and occurrence (detection) frequency of an event.
  • the number of gradations represented by the pixel values of the pixel data pd1, pd2, and pd3 is a number that corresponds to the division number N of the frame data F, and also corresponds to the number of unit periods.
  • the frame image generating unit 53 may reflect the polarity information p of each event in the image data. For example, events of one polarity (positive or negative) may be represented by a grayscale gradation that expresses the time and frequency of the event, while events of the other polarity may be represented by a gradation of one of the RGB colors, as sketched below.
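  • A minimal sketch of such a polarity-aware rendering, assuming two separate per-pixel bit-string arrays (one per polarity) and arbitrarily choosing the red channel for negative events; the array names and channel choice are illustrative assumptions.

```python
import numpy as np

def render_frame(pos_pixel_data, neg_pixel_data):
    """Return an RGB image: positive events appear as a grayscale gradation,
    negative events as a red gradation (both encode time and frequency)."""
    h, w = pos_pixel_data.shape
    img = np.zeros((h, w, 3), dtype=np.uint8)
    img[..., 0] = np.maximum(pos_pixel_data, neg_pixel_data)  # red channel
    img[..., 1] = pos_pixel_data                              # green channel
    img[..., 2] = pos_pixel_data                              # blue channel
    return img          # R = G = B (gray) where only positive events occurred
```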
  • After the frame image drawing process is performed in step S17 of FIG. 9, it is determined whether or not to end the framing process (step S18). Ending the framing process means that no new frame images are to be drawn. If the framing process is to end, the process of FIG. 9 ends.
  • If the framing process is to continue in step S18, the process from step S11 onward is repeated to draw a new frame image.
  • the frame generation unit 13 generates each frame data such that two frame data adjacent in the time axis direction include an overlapping time range. While FIG. 6 shows an example in which each frame data is divided into multiple data chunks dc at equal intervals, the divided frame generation unit 51 generates multiple block data by dividing each frame data generated by the frame generation unit 13 into unit periods. As described above, pixel data is generated based on the frame data that includes the multiple block data obtained by this division.
  • FIG. 14 is a diagram showing two pixel data generated based on two frame data adjacent in the time axis direction.
  • the two frame data adjacent in the time axis direction supplied to the divided frame generation unit 51 have an overlapping portion where they overlap with each other, and the time width of the overlapping portion is the same as, for example, the time width of the overlap section Tor in FIG. 6.
  • the two pixel data pda and pdb generated based on two frame data adjacent in the time axis direction also have an overlapping portion bor as shown in FIG. 14.
  • the time width of the overlapping portion bor is the same as the time width of the overlap section Tor, and is an integer multiple of the unit period, which is the time width of the block data.
  • the number of unit periods of the overlapping portion does not necessarily match the number of data chunks in the overlapping section Tor.
  • the direction in which a moving object moves can be expressed as a trajectory in a frame image.
  • the binary data in the overlapping area with the previously generated pixel data can be used as valid data as is, so new pixel data can be generated quickly and the frame rate can be increased.
  • the frame image generating unit 53 generates image data such that each pixel 30 has a gradation according to the detection time and detection frequency of the event, and two frame images adjacent in the time axis direction contain event information within an overlapping time range.
  • the information processing device 1 in the first embodiment of the present disclosure generates pixel data having pixel values according to the detection time and detection frequency of an event, and can generate a frame image in which not only the location of the event but also the time and frequency of the event can be easily visually recognized.
  • the direction of the event and the time of the event can be expressed by, for example, brightness or gradation, making it easier to visually grasp when and in which direction a moving object moved. It is also possible to visually grasp at what time and for how long an event occurred.
  • the image generation process of the present disclosure which is described in FIG. 9, expresses the time of occurrence of an event in a form that mimics the afterimage effect that occurs in the human eye.
  • the information processing device 1 of the present disclosure has the characteristic of being able to visualize events in a form that is easier for humans to understand than conventional methods.
  • the image generation process of the present disclosure makes it possible to distinguish between multiple events by gradation or brightness, even when multiple events occur at the same pixel position in one time slice.
  • the divided frame generating unit 51 in the first embodiment divides the frame data F into a plurality of block data D with the same time width, but the time widths of the plurality of block data do not necessarily have to be the same.
  • FIG. 15 is a schematic diagram showing the division of frame data in the second embodiment of the present disclosure.
  • the divided frame generation unit 51 in the second embodiment divides the frame data F into a plurality of block data Da, each having a different time width.
  • For example, a time range within the frame data F in which the occurrence of events is to be observed in more detail can be divided into block data Da with narrower time widths, and other time ranges can be divided into block data Da with wider time widths.
  • In the example of FIG. 15, the frame data F is divided so that the time width (unit period) of the block data becomes logarithmically longer for older times and shorter for newer times, as sketched below.
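  • One way to picture this division (a sketch under the assumption that each step into the past doubles the unit period; the concrete widths are not taken from the disclosure):

```python
def split_boundaries(t_end, n_blocks, newest_width):
    """Return (start, end) time ranges of the block data Da, oldest first,
    with the time width doubling for each step into the past."""
    bounds = []
    hi = t_end
    width = newest_width
    for _ in range(n_blocks):
        bounds.append((hi - width, hi))   # one block data Da
        hi -= width
        width *= 2                        # older block data cover wider ranges
    return list(reversed(bounds))         # oldest first, as in FIG. 15

# Example: split_boundaries(t_end=150, n_blocks=4, newest_width=10)
# -> [(0, 80), (80, 120), (120, 140), (140, 150)]
```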
  • FIG. 16 is a diagram showing the correspondence between frame data F and pixel data in the second embodiment of the present disclosure.
  • Frame data F in FIG. 16 contains the same event data as frame data F shown in FIG. 12.
  • the block data Da in FIG. 16 differ from the block data D in FIG. 12 in the time ranges and time widths that they occupy within frame data F.
  • As a result, block data Da in FIG. 16 with a given identification number i contains different event data than block data D in FIG. 12 with the same identification number. For this reason, pixel data pdc in FIG. 16 is different data from pixel data pd shown in FIG. 12.
  • the pixel data pdc in FIG. 16 is more sensitive to the presence or absence of event data in the new time range. This makes it possible to more precisely track changes in the brightness of the subject, particularly in the new time range, and to more precisely identify the time when the event is occurring.
  • In the second embodiment, the time width of each block data Da obtained by dividing frame data F is made different, so that, for example, pixel data can be generated that contains more information about events that occurred at newer times than about events that occurred at older times.
  • the frame image generated by the information processing device 1 according to the second embodiment can visually represent information about the occurrence location, occurrence time, and occurrence frequency of events that occurred at more recent times in more detail.
  • Fig. 17 is a block diagram showing a detailed configuration of image generation unit 15 in the third embodiment of the present disclosure.
  • Image generation unit 15a in Fig. 17 has a configuration in which a weighting unit 60 is newly added to image generation unit 15 in Fig. 8.
  • Weighting unit 60 is disposed between pixel data generation unit 52 and frame image generation unit 53.
  • the weighting unit 60 applies a predetermined weighting to the pixel data.
  • the specific weighting method is arbitrary. For example, the bit value of a specific bit of the pixel data may be inverted, or each bit of the pixel data may be shifted toward the MSB or LSB side, as sketched below.
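  • Two such weightings might be sketched as follows, applied to an 8-bit per-pixel bit-string array; which weighting is appropriate is application-specific, and the function names are illustrative assumptions.

```python
import numpy as np

def invert_bit(pixel_data, bit_index):
    """Invert the bit value of one specific bit in every pixel's bit string."""
    return pixel_data ^ np.uint8(1 << bit_index)

def emphasize_newest(pixel_data, shift=1):
    """De-emphasize all but the newest binary data (the MSB) by shifting the
    remaining bits toward the LSB side."""
    msb = pixel_data & np.uint8(0x80)             # keep the newest-period bit
    rest = (pixel_data & np.uint8(0x7F)) >> shift
    return msb | rest
```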
  • the pixel data weighted by the weighting unit 60 is supplied to the frame image generating unit 53.
  • FIG. 18 is a flowchart of image generation in the third embodiment of the present disclosure.
  • a weighting process (step S20) is added between steps S13 and S17.
  • the weighting unit 60 weights the pixel data output from the pixel data generation unit 52.
  • the frame image generation unit 53 outputs image data based on the weighted pixel data.
  • For example, new event data can be emphasized by shifting all of the pixel data except the newest binary data (e.g., the MSB) toward the LSB side.
  • the time range of interest can be changed by shifting the binary data of a specific time range to the MSB side.
  • the image generation unit 15a in the third embodiment of the present disclosure weights pixel data as necessary, making it possible to change the display format of event information in a frame image in various ways according to the characteristics of the event, and to provide a frame image with high visibility. Note that the weighting of pixel data according to the third embodiment can be applied to any of the first and second embodiments.
  • Fig. 19 is a block diagram showing a configuration of the information processing device 1 according to the fourth embodiment of the present disclosure.
  • the information processing device 1a shown in Fig. 19 includes an application unit 4a.
  • the application unit 4a includes a neural network unit 61, a learning unit 62, and an information processing unit 63.
  • the image generating unit 15 (or the image generating unit 15a) generates image data by the process of FIG. 9 (or FIG. 18).
  • the neural network unit 61 acquires the image data from the image generating unit 15 and stores it internally.
  • the learning unit 62 performs a learning process that updates the weights of the neural network used for predetermined information processing based on the image data stored in the neural network unit 61, and generates a trained model for the predetermined information processing.
  • the predetermined information processing includes at least one of tracking, recognition, and motion prediction of an object.
  • the information processing unit 63 inputs the image data generated by the image generating unit 15 to the trained neural network unit 61, and performs predetermined information processing based on the data output from the trained neural network unit 61.
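  • As an illustrative sketch only (the disclosure does not specify a network architecture or training framework), the learning and inference steps might look like the following minimal PyTorch classifier over single-channel frame images; the architecture, class count, and hyperparameters are assumptions.

```python
import torch
import torch.nn as nn

model = nn.Sequential(                      # hypothetical network held by neural network unit 61
    nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(32, 10),                      # e.g., 10 hypothetical object classes
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

def train_step(frame_images, labels):
    """One weight update from frame images and labels (cf. learning unit 62)."""
    optimizer.zero_grad()
    loss = loss_fn(model(frame_images), labels)  # frame_images: (B, 1, H, W) float tensor
    loss.backward()
    optimizer.step()
    return loss.item()

def infer(frame_images):
    """Run the trained model on new frame images (cf. information processing unit 63)."""
    with torch.no_grad():
        return model(frame_images).argmax(dim=1)
```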
  • FIG. 20 is a schematic diagram showing an example of information processing by the information processing unit 63.
  • the neural network unit 61 has an information processing model M.
  • the information processing unit 63 performs information processing based on the information processing model M and image data G1 in which subjects H1, H2, and H3 are captured.
  • the image data of the present disclosure can represent the movement of a subject, as explained in FIG. 7B.
  • Using the information processing model M, it is possible to identify or predict, for example, the movement line L of subject H1 in the image data G1.
  • the image data of the present disclosure can identify events based on differences in gradation even when multiple events occur at the same pixel 30. This makes it possible to identify the event of subject H2, for example, even when an event of subject H2 and an event of subject H3 occur at the same time.
  • the weights of the neural network are learned using image data generated by the information processing device 1 according to the first to third embodiments, and new image data is input to the trained neural network, so that data reflecting the learning results can be output from the neural network, and various information processing can be performed using this data.
  • information processing such as object tracking, recognition, or motion prediction can be performed using the image data generated by the information processing device 1 according to the first to third embodiments.
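  • As a hedged sketch of this flow (not the embodiment's actual network architecture, loss, or data, all of which are placeholders here), the following Python/PyTorch snippet trains a tiny CNN on gradation-coded event frames and then runs inference on a new frame image.

```python
# Hedged sketch of the fourth-embodiment flow: train a small network on
# gradation-coded frame images, then run inference. Model, loss, and data are
# placeholder assumptions, not the embodiment's actual configuration.
import torch
import torch.nn as nn

class EventFrameNet(nn.Module):
    """Tiny CNN mapping one 8-bit-gradation event frame to class scores."""
    def __init__(self, num_classes=3):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(8, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(16, num_classes)

    def forward(self, x):
        return self.head(self.features(x).flatten(1))

model = EventFrameNet()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

# Dummy training batch: frames normalized from 0..255 gradations to 0..1.
frames = torch.randint(0, 256, (4, 1, 64, 64)).float() / 255.0
labels = torch.randint(0, 3, (4,))

model.train()
optimizer.zero_grad()
loss = criterion(model(frames), labels)   # learning step: updates the weights
loss.backward()
optimizer.step()

model.eval()
with torch.no_grad():
    pred = model(frames[:1]).argmax(dim=1)  # inference on a new frame image
```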
  • the technology according to the present disclosure can be applied to various products.
  • the technology according to the present disclosure may be realized as a device mounted on any type of moving object, such as an automobile, an electric vehicle, a hybrid electric vehicle, a motorcycle, a bicycle, a personal mobility device, an airplane, a drone, a ship, a robot, a construction machine, or an agricultural machine (tractor).
  • FIG. 21 is a block diagram showing a schematic configuration example of a vehicle control system 7000, which is an example of a mobile control system to which the technology disclosed herein can be applied.
  • the vehicle control system 7000 includes a plurality of electronic control units connected via a communication network 7010.
  • the vehicle control system 7000 includes a drive system control unit 7100, a body system control unit 7200, a battery control unit 7300, an outside vehicle information detection unit 7400, an inside vehicle information detection unit 7500, and an integrated control unit 7600.
  • the communication network 7010 connecting these multiple control units may be, for example, an in-vehicle communication network conforming to any standard such as CAN (Controller Area Network), LIN (Local Interconnect Network), LAN (Local Area Network), or FlexRay (registered trademark).
  • Each control unit includes a microcomputer that performs arithmetic processing according to various programs, a storage unit that stores the programs executed by the microcomputer or parameters used in various calculations, and a drive circuit that drives various devices to be controlled.
  • Each control unit includes a network I/F for communicating with other control units via a communication network 7010, and a communication I/F for communicating with devices or sensors inside and outside the vehicle by wired or wireless communication.
  • the functional configuration of the integrated control unit 7600 includes a microcomputer 7610, a general-purpose communication I/F 7620, a dedicated communication I/F 7630, a positioning unit 7640, a beacon receiving unit 7650, an in-vehicle device I/F 7660, an audio/image output unit 7670, an in-vehicle network I/F 7680, and a storage unit 7690.
  • Other control units also include a microcomputer, a communication I/F, a storage unit, and the like.
  • the drive system control unit 7100 controls the operation of devices related to the drive system of the vehicle according to various programs.
  • the drive system control unit 7100 functions as a control device for a drive force generating device for generating a drive force for the vehicle, such as an internal combustion engine or a drive motor, a drive force transmission mechanism for transmitting the drive force to the wheels, a steering mechanism for adjusting the steering angle of the vehicle, and a braking device for generating a braking force for the vehicle.
  • the drive system control unit 7100 may also function as a control device such as an ABS (Antilock Brake System) or ESC (Electronic Stability Control).
  • the drive system control unit 7100 is connected to a vehicle state detection unit 7110.
  • the vehicle state detection unit 7110 includes at least one of the following: a gyro sensor that detects the angular velocity of the axial rotational motion of the vehicle body, an acceleration sensor that detects the acceleration of the vehicle, or a sensor for detecting the amount of operation of the accelerator pedal, the amount of operation of the brake pedal, the steering angle of the steering wheel, the engine speed, or the rotation speed of the wheels.
  • the drive system control unit 7100 performs arithmetic processing using the signal input from the vehicle state detection unit 7110, and controls the internal combustion engine, the drive motor, the electric power steering device, the brake device, etc.
  • the body system control unit 7200 controls the operation of various devices installed in the vehicle body according to various programs.
  • the body system control unit 7200 functions as a control device for a keyless entry system, a smart key system, a power window device, or various lamps such as headlamps, tail lamps, brake lamps, turn signals, and fog lamps.
  • radio waves or signals from various switches transmitted from a portable device that replaces a key can be input to the body system control unit 7200.
  • the body system control unit 7200 accepts the input of these radio waves or signals and controls the vehicle's door lock device, power window device, lamps, etc.
  • the battery control unit 7300 controls the secondary battery 7310, which is the power supply source for the drive motor, according to various programs. For example, information such as the battery temperature, battery output voltage, or remaining capacity of the battery is input to the battery control unit 7300 from a battery device equipped with the secondary battery 7310. The battery control unit 7300 performs calculations using these signals, and controls the temperature regulation of the secondary battery 7310 or a cooling device or the like equipped in the battery device.
  • the outside vehicle information detection unit 7400 detects information outside the vehicle equipped with the vehicle control system 7000.
  • At least one of the imaging unit 7410 and the outside vehicle information detection unit 7420 is connected to the outside vehicle information detection unit 7400.
  • the imaging unit 7410 includes at least one of a ToF (Time Of Flight) camera, a stereo camera, a monocular camera, an infrared camera, and other cameras.
  • the outside vehicle information detection unit 7420 includes at least one of an environmental sensor for detecting the current weather or climate, or a surrounding information detection sensor for detecting other vehicles, obstacles, pedestrians, etc., around the vehicle equipped with the vehicle control system 7000.
  • the environmental sensor may be, for example, at least one of a raindrop sensor that detects rain, a fog sensor that detects fog, a sunshine sensor that detects the level of sunlight, and a snow sensor that detects snowfall.
  • the surrounding information detection sensor may be at least one of an ultrasonic sensor, a radar device, and a LIDAR (Light Detection and Ranging, Laser Imaging Detection and Ranging) device.
  • the imaging unit 7410 and the outside vehicle information detection unit 7420 may each be provided as an independent sensor or device, or may be provided as a device in which multiple sensors or devices are integrated.
  • FIG. 22 shows an example of the installation positions of the imaging unit 7410 and the outside vehicle information detection unit 7420.
  • the imaging units 7910, 7912, 7914, 7916, and 7918 are provided, for example, at least one of the front nose, side mirrors, rear bumper, back door, and upper part of the windshield inside the vehicle cabin of the vehicle 7900.
  • the imaging unit 7910 provided on the front nose and the imaging unit 7918 provided on the upper part of the windshield inside the vehicle cabin mainly acquire images of the front of the vehicle 7900.
  • the imaging units 7912 and 7914 provided on the side mirrors mainly acquire images of the sides of the vehicle 7900.
  • the imaging unit 7916 provided on the rear bumper or back door mainly acquires images of the rear of the vehicle 7900.
  • the imaging unit 7918 provided on the upper part of the windshield inside the vehicle cabin is mainly used to detect leading vehicles, pedestrians, obstacles, traffic lights, traffic signs, lanes, etc.
  • FIG. 22 shows an example of the imaging ranges of the imaging units 7910, 7912, 7914, and 7916.
  • Imaging range a indicates the imaging range of the imaging unit 7910 provided on the front nose
  • imaging ranges b and c indicate the imaging ranges of the imaging units 7912 and 7914 provided on the side mirrors
  • imaging range d indicates the imaging range of the imaging unit 7916 provided on the rear bumper or back door.
  • image data captured by the imaging units 7910, 7912, 7914, and 7916 are superimposed to obtain an overhead image of the vehicle 7900.
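  • One common way to compose such an overhead view is sketched below in Python with OpenCV, under the assumption of pre-calibrated ground-plane homographies: warp each camera image onto the ground plane and blend the warped results. This is only an illustrative approach with dummy inputs, not the specific processing of the system described here.

```python
# Illustrative sketch of composing a bird's-eye view from several cameras by
# warping each image with a pre-calibrated homography and superimposing the
# results. Homographies and images below are dummies.
import cv2
import numpy as np

def overhead_view(images, homographies, out_size=(800, 800)):
    """Warp each camera image into the ground plane and average the overlaps."""
    canvas = np.zeros((out_size[1], out_size[0], 3), np.float32)
    weight = np.zeros((out_size[1], out_size[0], 1), np.float32)
    for img, H in zip(images, homographies):
        warped = cv2.warpPerspective(img.astype(np.float32), H, out_size)
        mask = (warped.sum(axis=2, keepdims=True) > 0).astype(np.float32)
        canvas += warped
        weight += mask
    return (canvas / np.maximum(weight, 1)).astype(np.uint8)

# Dummy inputs standing in for the front, left, right, and rear cameras.
images = [np.full((480, 640, 3), v, np.uint8) for v in (60, 120, 180, 240)]
homographies = [np.eye(3, dtype=np.float32) for _ in images]
birds_eye = overhead_view(images, homographies)
```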
  • External information detection units 7920, 7922, 7924, 7926, 7928, and 7930 provided on the front, rear, sides, corners, and upper part of the windshield inside the vehicle 7900 may be, for example, ultrasonic sensors or radar devices.
  • External information detection units 7920, 7926, and 7930 provided on the front nose, rear bumper, back door, and upper part of the windshield inside the vehicle 7900 may be, for example, LIDAR devices. These external information detection units 7920 to 7930 are mainly used to detect preceding vehicles, pedestrians, obstacles, etc.
  • the outside-vehicle information detection unit 7400 causes the imaging unit 7410 to capture an image outside the vehicle, and receives the captured image data.
  • the outside-vehicle information detection unit 7400 also receives detection information from the connected outside-vehicle information detection unit 7420. If the outside-vehicle information detection unit 7420 is an ultrasonic sensor, a radar device, or a LIDAR device, the outside-vehicle information detection unit 7400 transmits ultrasonic waves or electromagnetic waves, and receives information on the received reflected waves.
  • the outside-vehicle information detection unit 7400 may perform object detection processing or distance detection processing for people, cars, obstacles, signs, or characters on the road surface, based on the received information.
  • the outside-vehicle information detection unit 7400 may perform environmental recognition processing for recognizing rainfall, fog, road surface conditions, etc., based on the received information.
  • the outside-vehicle information detection unit 7400 may calculate the distance to an object outside the vehicle based on the received information.
  • the outside vehicle information detection unit 7400 may also perform image recognition processing or distance detection processing to recognize people, cars, obstacles, signs, or characters on the road surface based on the received image data.
  • the outside vehicle information detection unit 7400 may perform processing such as distortion correction or alignment on the received image data, and may also generate an overhead image or a panoramic image by synthesizing image data captured by different imaging units 7410.
  • the outside vehicle information detection unit 7400 may also perform viewpoint conversion processing using image data captured by different imaging units 7410.
  • the in-vehicle information detection unit 7500 detects information inside the vehicle.
  • a driver state detection unit 7510 that detects the state of the driver is connected to the in-vehicle information detection unit 7500.
  • the driver state detection unit 7510 may include a camera that captures an image of the driver, a biosensor that detects the driver's biometric information, or a microphone that collects sound inside the vehicle.
  • the biosensor is provided, for example, on the seat or steering wheel, and detects the biometric information of a passenger sitting in the seat or a driver gripping the steering wheel.
  • the in-vehicle information detection unit 7500 may calculate the degree of fatigue or concentration of the driver based on the detection information input from the driver state detection unit 7510, or may determine whether the driver is dozing off.
  • the in-vehicle information detection unit 7500 may perform processing such as noise canceling on the collected sound signal.
  • the integrated control unit 7600 controls the overall operation of the vehicle control system 7000 according to various programs.
  • the input unit 7800 is connected to the integrated control unit 7600.
  • the input unit 7800 is realized by a device that can be operated by the passenger, such as a touch panel, a button, a microphone, a switch, or a lever. Data obtained by voice recognition of a voice input by a microphone may be input to the integrated control unit 7600.
  • the input unit 7800 may be, for example, a remote control device using infrared or other radio waves, or an externally connected device such as a mobile phone or a PDA (Personal Digital Assistant) that supports the operation of the vehicle control system 7000.
  • the input unit 7800 may be, for example, a camera, in which case the passenger can input information by gestures. Alternatively, data obtained by detecting the movement of a wearable device worn by the passenger may be input. Furthermore, the input unit 7800 may include, for example, an input control circuit that generates an input signal based on information input by the passenger using the above-mentioned input unit 7800 and outputs the input signal to the integrated control unit 7600. Passengers and others can operate the input unit 7800 to input various data and instruct processing operations to the vehicle control system 7000.
  • the memory unit 7690 may include a ROM (Read Only Memory) that stores various programs executed by the microcomputer, and a RAM (Random Access Memory) that stores various parameters, calculation results, sensor values, etc.
  • the memory unit 7690 may also be realized by a magnetic memory device such as a HDD (Hard Disc Drive), a semiconductor memory device, an optical memory device, or a magneto-optical memory device, etc.
  • the general-purpose communication I/F 7620 is a general-purpose communication I/F that mediates communication between various devices present in the external environment 7750.
  • the general-purpose communication I/F 7620 may implement cellular communication protocols such as GSM (registered trademark) (Global System of Mobile communications), WiMAX (registered trademark), LTE (registered trademark) (Long Term Evolution) or LTE-A (LTE-Advanced), or other wireless communication protocols such as wireless LAN (also called Wi-Fi (registered trademark)) and Bluetooth (registered trademark).
  • the general-purpose communication I/F 7620 may connect to devices (e.g., application servers or control servers) present on an external network (e.g., the Internet, a cloud network, or an operator-specific network) via, for example, a base station or an access point.
  • the general-purpose communication I/F 7620 may connect to a terminal located near the vehicle (e.g., a driver's, pedestrian's, or store's terminal, or an MTC (Machine Type Communication) terminal) using, for example, P2P (Peer To Peer) technology.
  • the dedicated communication I/F 7630 is a communication I/F that supports a communication protocol developed for use in a vehicle.
  • The dedicated communication I/F 7630 may implement a standard protocol such as WAVE (Wireless Access in Vehicle Environment), which is a combination of the lower-layer IEEE 802.11p and the upper-layer IEEE 1609, DSRC (Dedicated Short Range Communications), or a cellular communication protocol.
  • the dedicated communication I/F 7630 typically performs V2X communication, which is a concept that includes one or more of vehicle-to-vehicle communication, vehicle-to-infrastructure communication, vehicle-to-home communication, and vehicle-to-pedestrian communication.
  • the positioning unit 7640 performs positioning by receiving, for example, GNSS signals from GNSS (Global Navigation Satellite System) satellites (for example, GPS signals from GPS (Global Positioning System) satellites), and generates position information including the latitude, longitude, and altitude of the vehicle.
  • the positioning unit 7640 may determine the current position by exchanging signals with a wireless access point, or may obtain position information from a terminal such as a mobile phone, PHS, or smartphone that has a positioning function.
  • the beacon receiver 7650 receives, for example, radio waves or electromagnetic waves transmitted from radio stations installed on the road, and acquires information such as the current location, congestion, road closures, and travel time.
  • the functions of the beacon receiver 7650 may be included in the dedicated communication I/F 7630 described above.
  • the in-vehicle device I/F 7660 is a communication interface that mediates the connection between the microcomputer 7610 and various in-vehicle devices 7760 present in the vehicle.
  • the in-vehicle device I/F 7660 may establish a wireless connection using a wireless communication protocol such as wireless LAN, Bluetooth (registered trademark), NFC (Near Field Communication), or WUSB (Wireless USB).
  • the in-vehicle device I/F 7660 may also establish a wired connection such as USB (Universal Serial Bus), HDMI (High-Definition Multimedia Interface), or MHL (Mobile High-definition Link) via a connection terminal (and a cable, if necessary) not shown.
  • the in-vehicle device 7760 may include, for example, at least one of a mobile device or wearable device owned by a passenger, or an information device carried into or attached to the vehicle.
  • the in-vehicle device 7760 may also include a navigation device that searches for a route to an arbitrary destination.
  • the in-vehicle device I/F 7660 exchanges control signals or data signals with these in-vehicle devices 7760.
  • the in-vehicle network I/F 7680 is an interface that mediates communication between the microcomputer 7610 and the communication network 7010.
  • the in-vehicle network I/F 7680 transmits and receives signals in accordance with a specific protocol supported by the communication network 7010.
  • the microcomputer 7610 of the integrated control unit 7600 controls the vehicle control system 7000 according to various programs based on information acquired through at least one of the general-purpose communication I/F 7620, the dedicated communication I/F 7630, the positioning unit 7640, the beacon receiving unit 7650, the in-vehicle device I/F 7660, and the in-vehicle network I/F 7680.
  • the microcomputer 7610 may calculate the control target value of the driving force generating device, the steering mechanism, or the braking device based on the acquired information inside and outside the vehicle, and output a control command to the drive system control unit 7100.
  • the microcomputer 7610 may perform cooperative control for the purpose of realizing the functions of an ADAS (Advanced Driver Assistance System), including vehicle collision avoidance or impact mitigation, following driving based on the distance between vehicles, vehicle speed maintenance driving, vehicle collision warning, vehicle lane departure warning, etc.
  • the microcomputer 7610 may control the driving force generating device, steering mechanism, braking device, etc. based on the acquired information about the surroundings of the vehicle, thereby performing cooperative control for the purpose of automatic driving, which allows the vehicle to travel autonomously without relying on the driver's operation.
  • the microcomputer 7610 may generate three-dimensional distance information between the vehicle and objects such as surrounding structures and people based on information acquired via at least one of the general-purpose communication I/F 7620, the dedicated communication I/F 7630, the positioning unit 7640, the beacon receiving unit 7650, the in-vehicle equipment I/F 7660, and the in-vehicle network I/F 7680, and may create local map information including information about the surroundings of the vehicle's current position.
  • the microcomputer 7610 may also predict dangers such as vehicle collisions, the approach of pedestrians, or entry into closed roads based on the acquired information, and generate warning signals.
  • the warning signals may be, for example, signals for generating warning sounds or turning on warning lights.
  • the audio/image output unit 7670 transmits at least one of audio and image output signals to an output device capable of visually or audibly notifying the passengers of the vehicle or the outside of the vehicle of information.
  • an audio speaker 7710, a display unit 7720, and an instrument panel 7730 are illustrated as output devices.
  • the display unit 7720 may include, for example, at least one of an on-board display and a head-up display.
  • the display unit 7720 may have an AR (Augmented Reality) display function.
  • the output device may be other devices such as headphones, a wearable device such as a glasses-type display worn by the passenger, a projector, or a lamp, in addition to these devices.
  • When the output device is a display device, the display device visually displays the results obtained by the various processes performed by the microcomputer 7610, or the information received from other control units, in various formats such as text, images, tables, and graphs.
  • When the output device is an audio output device, the audio output device converts an audio signal consisting of reproduced audio data or acoustic data into an analog signal and outputs it audibly.
  • At least two control units connected via the communication network 7010 may be integrated into one control unit.
  • each control unit may be composed of multiple control units.
  • the vehicle control system 7000 may include another control unit not shown.
  • some or all of the functions performed by any control unit may be provided by another control unit.
  • a specified calculation process may be performed by any control unit.
  • a sensor or device connected to any control unit may be connected to another control unit, and multiple control units may transmit and receive detection information to each other via the communication network 7010.
  • a computer program for implementing each function of the event processing unit 3 and application unit 4 according to this embodiment described with reference to FIG. 1 can be implemented in any of the control units, etc.
  • a computer-readable recording medium on which such a computer program is stored can also be provided.
  • the recording medium is, for example, a magnetic disk, an optical disk, a magneto-optical disk, a flash memory, etc.
  • the above computer program may be distributed, for example, via a network, without using a recording medium.
  • The present technology can be configured as follows.
  • (1) An information processing device comprising: a pixel having a light detection element that detects an event based on a change in the amount of incident light; and a pixel data generation unit configured to generate pixel data including information on the event detected in each of a plurality of unit periods.
  • (2) The pixel data has a pixel value corresponding to a detection time and a detection frequency of the event. The information processing device according to (1).
  • (3) The pixel data has a larger pixel value as the event is detected more recently and as the event is detected more frequently. The information processing device according to (2).
  • (4) The pixel data has a number of gradations or a luminance corresponding to the number of the plurality of unit periods. The information processing device according to (2) or (3).
  • (5) The pixel data is bit string data having a plurality of bits, and information on the events of older unit periods is arranged from the most significant bit side toward the least significant bit side of the bit string data. The information processing device according to any one of (1) to (4).
  • (6) The information on the event is represented by one or more bits for each unit period.
  • (7) The pixel data includes information representing a polarity of the event for each unit period.
  • (8) The plurality of unit periods have the same time width.
  • (9) The plurality of unit periods have a longer time width as the unit periods are older. The information processing device according to any one of (1) to (7).
  • (10) The information processing device comprises a plurality of the pixels arranged in a one-dimensional or two-dimensional direction, a frame generation unit that generates, for each frame including the plurality of pixels, frame data including information on the event that has occurred within a predetermined period, and a divided frame generation unit that divides the frame data into a plurality of pieces in the time axis direction to generate a plurality of divided frame data, and the pixel data generation unit generates the pixel data for each of the plurality of pixels based on the information on the event of the corresponding pixel included in each of the plurality of divided frame data.
  • (11) Two pieces of frame data adjacent to each other in the time axis direction include information on the events occurring within an overlapping time range.
  • (12) The overlapping time range has a length that is an integer multiple of the unit period.
  • (13) The pixel data generation unit associates each bit of the pixel data with a different one of the divided frame data, and generates the corresponding bit value of the pixel data based on the information on the event of the corresponding pixel in the corresponding divided frame data. The information processing device according to any one of (10) to (12).
  • (14) The pixel data generation unit generates a bit value on the higher-order bit side of the pixel data based on the information on the event of the corresponding pixel in newer divided frame data. The information processing device according to (13).
  • (15) The divided frame generation unit divides the frame data into the plurality of divided frame data of the unit periods having the same time length. The information processing device according to any one of (10) to (14).
  • (16) The divided frame generation unit divides the frame data into the plurality of divided frame data of the unit periods each having a different time length.
  • (17) The divided frame generation unit makes the time length of the divided frame data shorter as the time becomes newer. The information processing device according to (16).
  • (18) The information processing device further comprises a frame image generation unit that generates a frame image based on a plurality of the pixel data corresponding to the plurality of pixels. The information processing device according to any one of (10) to (17).
  • (19) The frame image generation unit generates the frame images such that each pixel has a gradation corresponding to a detection time and a detection frequency of the event, and such that two frame images adjacent to each other in the time axis direction contain information on the event within an overlapping time range. The information processing device according to (18).
  • (20) The information processing device further comprises a learning unit that performs a learning process to update weights of a neural network used for predetermined information processing including at least one of object tracking, object recognition, and object motion prediction, based on the frame image, and an information processing unit that performs the predetermined information processing based on the neural network that has performed the learning process and the frame image. The information processing device according to (18) or (19).
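  • As a concrete, hedged reading of items (10) to (14) above, the Python sketch below divides overlapping time slices into equal unit periods and packs each pixel's event presence into a bit string with newer unit periods on the higher-order bit side; the array layout and names (events, make_pixel_frame) are assumptions for illustration, not the claimed implementation.

```python
# Hedged sketch of items (10)-(14): frame data covering overlapping time slices
# is divided into equal unit periods, and each pixel's bits are filled from the
# divided frame data with newer periods on the higher-order bit side.
import numpy as np

def make_pixel_frame(events, t_start, unit, n_units, shape):
    """events: list of (t, y, x); returns an n_units-bit image for the slice
    [t_start, t_start + n_units * unit)."""
    frame = np.zeros(shape, dtype=np.uint16)
    for t, y, x in events:
        k = int((t - t_start) // unit)          # index of the unit period
        if 0 <= k < n_units:
            # The newest unit period (largest k) goes to the most significant bit.
            frame[y, x] |= 1 << k
    return frame

events = [(0.001, 5, 5), (0.007, 5, 5), (0.0035, 2, 9)]
unit, n_units = 0.001, 8            # 8 unit periods of 1 ms per time slice
stride = 4 * unit                   # adjacent slices overlap by 4 unit periods
for i in range(3):
    t0 = i * stride
    img = make_pixel_frame(events, t0, unit, n_units, (16, 16))
```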

Landscapes

  • Image Analysis (AREA)

Abstract

[Problem] To enable visualization of the occurrence time and occurrence frequency of an event, in a format that is easy for people to understand. [Solution] Provided is an information processing device comprising: a pixel having a light detection element for detecting an event on the basis of the amount of change in the quantity of incident light; and a pixel data generation unit for generating pixel data including information about the event detected during each of a plurality of unit periods of time.

Description

情報処理装置 Information processing device
 本開示は、情報処理装置に関する。 This disclosure relates to an information processing device.
 撮像シーンの中で、輝度変化などの何らかのイベントが発生した光電変換素子のイベント情報を高速に取得するEVS(Event-based Vision Sensor)が提案されている。このEVSは、例えば被写体の移動などで生じる光の輝度変化を、イベントとして検出する動作を行う。 An event-based vision sensor (EVS) has been proposed that quickly acquires event information from photoelectric conversion elements when some event, such as a change in brightness, occurs in the imaging scene. This EVS detects changes in light brightness caused by, for example, the movement of a subject, as events.
 検出されたイベントをフレーム単位で画像化することにより、人間が理解しやすい形式に可視化する手法が知られている。フレームによる可視化では、タイムスライスと呼ばれる一定期間内に発生したイベントを含むフレーム画像を生成する。タイムスライスが長いと、人間が理解しやすいフレーム画像が得られるが、同一画素で複数回にわたって発生したイベント情報が圧縮されてしまい、EVSの高速性の利点が損なわれる。逆に、タイムスライスが短いと、高速性の利点は生かせるが、人間に理解しにくいフレーム画像が得られる。そこで、タイムスライスをオーバラップさせて、EVSの高速性を保ちながら、視認性を向上させたフレーム画像を生成する手法が提案されている(例えば、非特許文献1参照)。 A method is known that visualizes detected events in a format that is easy for humans to understand by imaging them on a frame-by-frame basis. In frame visualization, frame images are generated that contain events that occur within a certain period of time, called a time slice. If the time slice is long, a frame image that is easy for humans to understand is obtained, but the event information that occurs multiple times in the same pixel is compressed, compromising the advantage of the high speed of EVS. Conversely, if the time slice is short, the advantage of high speed is retained, but a frame image that is difficult for humans to understand is obtained. Therefore, a method has been proposed that overlaps time slices to generate frame images with improved visibility while maintaining the high speed of EVS (see, for example, non-patent document 1).
 しかしながら、非特許文献1の手法では、フレーム画像内の各イベントの発生時刻及び発生頻度を把握できない。このため、複数のイベントのうち、どれが新しくて、どれが古いのかを判別できない。また、タイムスライス内の同じ画素位置でイベントが頻発しているのか否かを判別できない。 However, the technique in Non-Patent Document 1 cannot grasp the occurrence time and frequency of each event in a frame image. As a result, it is not possible to determine which of multiple events is new and which is old. In addition, it is not possible to determine whether events occur frequently at the same pixel position within a time slice.
 本技術はこのような状況に鑑みて生み出されたものであり、イベントの発生時刻と発生頻度を人間が理解しやすい形式で可視化できる情報処理装置を提供するものである。 This technology was developed in light of these circumstances, and provides an information processing device that can visualize the time and frequency of events in a format that is easy for humans to understand.
 上記の課題を解決するために、本開示によれば、入射光の光量の変化量に基づくイベントを検出する光検出素子を有する画素と、
 複数の単位期間のそれぞれで検出された前記イベントの情報を含む画素データを生成する画素データ生成部と、を備える、
 情報処理装置が提供される。
In order to solve the above problems, according to the present disclosure, a pixel having a light detection element that detects an event based on a change in the amount of light of incident light;
a pixel data generating unit configured to generate pixel data including information on the event detected in each of a plurality of unit periods;
An information processing device is provided.
 前記画素データは、前記イベントの検出時期及び検出頻度に応じた画素値を有してもよい。 The pixel data may have a pixel value that corresponds to the detection time and detection frequency of the event.
 前記画素データは、前記イベントの検出時期が新しいほど、及び前記イベントの検出頻度が多いほど、より大きな画素値を有してもよい。 The pixel data may have a larger pixel value the more recently the event was detected and the more frequently the event was detected.
 前記画素データは、前記複数の単位期間の数に応じた階調数又は輝度を有してもよい。 The pixel data may have a number of gradations or brightness according to the number of the plurality of unit periods.
 前記画素データは、複数ビットのビット列データであり、
 前記ビット列データの上位ビット側から下位ビット側に向かって、より古い前記単位期間の前記イベントの情報が配置されてもよい。
the pixel data is a bit string data of a plurality of bits,
Information about the events in the unit period that are older may be arranged from the most significant bit side to the least significant bit side of the bit string data.
 前記イベントの情報は、前記単位期間ごとに1ビット以上で表されてもよい。 The event information may be represented by one or more bits for each unit period.
 前記画素データは、前記単位期間ごとに、前記イベントの極性を表す情報を含んでもよい。 The pixel data may include information representing the polarity of the event for each unit period.
 前記複数の単位期間は同一の時間幅を有してもよい。 The multiple unit periods may have the same time duration.
 前記複数の単位期間は、古い時間ほどより長い時間幅を有してもよい。 The multiple unit periods may have a longer time span the older the time.
 一次元又は二次元方向に配列された複数の前記画素と、
 前記複数の画素を含むフレーム単位で、所定の期間内に発生された前記イベントの情報を含むフレームデータを生成するフレーム生成部と、
 前記フレームデータを時間軸方向に複数に分割して複数の分割フレームデータを生成する分割フレーム生成部と、を備え、
 前記画素データ生成部は、前記複数の画素のそれぞれごとに、前記複数の分割フレームデータのそれぞれに含まれる、対応する画素の前記イベントの情報に基づいて前記画素データを生成してもよい。
A plurality of the pixels arranged in a one-dimensional or two-dimensional direction;
a frame generation unit that generates frame data including information on the event that has occurred within a predetermined period, for each frame including the plurality of pixels;
a divided frame generating unit that divides the frame data into a plurality of pieces in a time axis direction to generate a plurality of divided frame data,
The pixel data generating section may generate the pixel data for each of the plurality of pixels based on information of the event of the corresponding pixel, the information being included in each of the plurality of divided frame data.
 時間軸方向に隣り合う2つの前記フレームデータ同士は、重複する時間範囲内に発生された前記イベントの情報を含んでもよい。 Two adjacent frames of data in the time axis direction may contain information about events that occurred within overlapping time ranges.
 前記重複する時間範囲は、前記単位期間の整数倍の長さを有してもよい。 The overlapping time ranges may have a length that is an integer multiple of the unit period.
 前記画素データ生成部は、前記画素データの各ビットを、それぞれ異なる前記分割フレームデータに対応づけて、対応する前記分割フレームデータ内の対応する画素の前記イベントの情報に基づいて前記画素データの対応するビット値を生成してもよい。 The pixel data generating unit may correspond each bit of the pixel data to a different one of the divided frame data, and generate a corresponding bit value of the pixel data based on the event information of the corresponding pixel in the corresponding divided frame data.
 前記画素データ生成部は、前記画素データの上位ビット側のビット値を、より新しい前記分割フレームデータ内の対応する画素の前記イベントの情報に基づいて生成してもよい。 The pixel data generating unit may generate the bit value of the most significant bit of the pixel data based on the event information of the corresponding pixel in the newer divided frame data.
 前記分割フレーム生成部は、前記フレームデータを同一の時間長さの前記単位期間の前記複数の分割フレームデータに分割してもよい。 The split frame generation unit may split the frame data into the plurality of split frame data of the unit period having the same time length.
 前記分割フレーム生成部は、前記フレームデータをそれぞれ異なる時間長さの前記単位期間の前記複数の分割フレームデータに分割してもよい。 The split frame generation unit may split the frame data into a plurality of split frame data each having a different time length for the unit period.
 前記分割フレーム生成部は、より新しい時間ほど、前記分割フレームデータの時間長さをより短くしてもよい。 The split frame generation unit may shorten the time length of the split frame data for newer times.
 前記複数の画素に対応する複数の前記画素データに基づいてフレーム画像を生成するフレーム画像生成部を備えてもよい。 The image processing device may further include a frame image generating unit that generates a frame image based on the pixel data corresponding to the pixels.
 前記フレーム画像生成部は、画素ごとに、前記イベントの検出時刻及び検出頻度に応じた階調を有し、かつ、時間軸方向に隣り合う2つの前記フレーム画像に、重複する時間範囲内の前記イベントの情報が含まれるように前記フレーム画像を生成してもよい。 The frame image generating unit may generate the frame images such that each pixel has a gradation corresponding to the detection time and detection frequency of the event, and two adjacent frame images in the time axis direction contain information about the event within an overlapping time range.
 前記フレーム画像に基づいて、物体の追跡、認識、又は動作予測の少なくとも一つの処理を含む所定の情報処理に用いられるニューラルネットワークの重みを更新する学習処理を行う学習部と、
 前記学習処理を行った前記ニューラルネットワークと、前記フレーム画像とに基づいて、前記所定の情報処理を行う情報処理部と、を備えてもよい。
a learning unit that performs a learning process to update weights of a neural network used for predetermined information processing including at least one of object tracking, object recognition, and object motion prediction, based on the frame images;
The image processing device may further include an information processing unit that performs the predetermined information processing based on the neural network that has performed the learning process and the frame images.
A block diagram showing a configuration example of an information processing device according to the first embodiment of the present disclosure.
A block diagram showing a configuration example of the sensor.
A circuit diagram showing a first example of a pixel.
A circuit diagram showing a second example of a pixel.
A diagram showing a first example of a laminated structure of the sensor.
A diagram showing a second example of a laminated structure of the sensor.
A flowchart of event visualization in the information processing device.
A diagram illustrating the framing process.
A diagram showing image generation processing in a comparative example.
A diagram showing image generation processing in the present disclosure.
A block diagram showing a detailed configuration of the image generation unit in the first embodiment of the present disclosure.
A flowchart of image generation in the first embodiment of the present disclosure.
A schematic diagram showing division of frame data in the first embodiment of the present disclosure.
A diagram showing generation of pixel data.
A diagram showing the correspondence between frame data and pixel data in the first embodiment of the present disclosure.
A diagram showing the output of image data in the first embodiment of the present disclosure.
A diagram showing two pieces of pixel data generated based on two pieces of frame data adjacent in the time axis direction.
A schematic diagram showing division of frame data in the second embodiment of the present disclosure.
A diagram showing the correspondence between frame data and pixel data in the second embodiment of the present disclosure.
A block diagram showing a detailed configuration of the image generation unit in the third embodiment of the present disclosure.
A flowchart of image generation in the third embodiment of the present disclosure.
A block diagram showing the configuration of the information processing device according to the fourth embodiment of the present disclosure.
A schematic diagram showing an example of information processing by the information processing unit.
A block diagram showing an example of a schematic configuration of a vehicle control system.
An explanatory diagram showing an example of installation positions of the outside-vehicle information detection unit and the imaging unit.
 以下、図面を参照して、情報処理装置の実施形態について説明する。以下では、情報処理装置の主要な構成部分を中心に説明するが、情報処理装置には、図示又は説明されていない構成部分や機能が存在しうる。以下の説明は、図示又は説明されていない構成部分や機能を除外するものではない。 Below, an embodiment of an information processing device will be described with reference to the drawings. The following description will focus on the main components of the information processing device, but the information processing device may have components and functions that are not shown or described. The following description does not exclude components and functions that are not shown or described.
 (第1の実施形態)
 図1は、本開示の第1の実施形態における情報処理装置1の一構成例を示すブロック図である。情報処理装置1は、任意の被写体の動き及び輝度変化を検出するものである。情報処理装置1は、例えば産業用ロボット等の据置型装置とスマートフォン等の携帯機器のいずれにも適用可能である。図1の情報処理装置1は、センサ2、イベント処理部3、及びアプリケーション部4を備える。
First Embodiment
Fig. 1 is a block diagram showing an example of a configuration of an information processing device 1 according to a first embodiment of the present disclosure. The information processing device 1 detects the movement and brightness change of an arbitrary subject. The information processing device 1 can be applied to both a stationary device such as an industrial robot and a portable device such as a smartphone. The information processing device 1 in Fig. 1 includes a sensor 2, an event processing unit 3, and an application unit 4.
 センサ2は、被写体の移動又は輝度変化を検出する機能を有する。センサ2は、例えばEVS(Event-based Vision Sensor)である。センサ2は、入射光の光量の変化量に基づくイベントを検出する。センサ2が検出するイベントは、例えばMIPI(Mobile Industry Processor Interface)を介して、イベント処理部3に供給される。 The sensor 2 has the function of detecting the movement or brightness change of a subject. The sensor 2 is, for example, an event-based vision sensor (EVS). The sensor 2 detects events based on the amount of change in the amount of incident light. The events detected by the sensor 2 are supplied to the event processing unit 3, for example, via MIPI (Mobile Industry Processor Interface).
 イベント処理部3は、複数の画素を含むフレーム単位で、所定の期間内に発生されたイベントの情報(以下、イベント情報)を含むフレームデータを生成する。イベント処理部3は、例えばFPGA(Field Programmable Gate Array)で構成される。イベント処理部3は、イベント取得部11、デコード部12及びフレーム生成部13を備えている。 The event processing unit 3 generates frame data including information on events that have occurred within a predetermined period (hereinafter, event information) in units of frames each including a plurality of pixels. The event processing unit 3 is configured, for example, with an FPGA (Field Programmable Gate Array). The event processing unit 3 includes an event acquisition unit 11, a decoding unit 12, and a frame generation unit 13.
 イベント取得部11は、センサ2から同期又は非同期に出力されるイベント情報を取得する。イベント処理部3は、センサ2から取得されたイベント情報をイベント蓄積部14にいったん蓄積してもよい。 The event acquisition unit 11 acquires event information output synchronously or asynchronously from the sensor 2. The event processing unit 3 may temporarily store the event information acquired from the sensor 2 in the event storage unit 14.
 デコード部12は、イベント取得部11で取得されたイベント情報をデコードする。センサ2から出力されるイベント情報は、イベントの発生時刻と極性などの情報を含む圧縮データである。デコード部12は、イベント取得部11で取得されたイベント情報を解凍して、イベントの発生時刻と極性などの情報を含むイベントデータを生成する。デコード部12でデコードされたイベントデータは、フレーム生成部13に供給される。 The decoding unit 12 decodes the event information acquired by the event acquisition unit 11. The event information output from the sensor 2 is compressed data including information such as the time of event occurrence and polarity. The decoding unit 12 decompresses the event information acquired by the event acquisition unit 11 to generate event data including information such as the time of event occurrence and polarity. The event data decoded by the decoding unit 12 is supplied to the frame generation unit 13.
 フレーム生成部13は上述したように、所定の期間内に発生されたイベントデータを含むフレームデータを生成する。本明細書では、所定の期間をタイムスライスと呼ぶことがある。フレーム生成部13は、例えばUSB(Universal Serial Bus)を介して、フレームデータをアプリケーション部4に供給する。 As described above, the frame generation unit 13 generates frame data including event data generated within a predetermined period. In this specification, the predetermined period is sometimes referred to as a time slice. The frame generation unit 13 supplies the frame data to the application unit 4, for example, via a USB (Universal Serial Bus).
 アプリケーション部4は、フレーム単位で画像データを生成するものであり、例えばFPGAで構成される。また、イベント処理部3及びアプリケーション部4を、1つの半導体チップで構成してもよい。また、センサ2、イベント処理部3、及びアプリケーション部4を一つの半導体チップで構成してもよい。 The application unit 4 generates image data on a frame-by-frame basis, and is configured, for example, by an FPGA. The event processing unit 3 and application unit 4 may also be configured on a single semiconductor chip. The sensor 2, event processing unit 3, and application unit 4 may also be configured on a single semiconductor chip.
 アプリケーション部4は、画像生成部15を備える。画像生成部15は、フレーム生成部13から供給されるフレームデータに基づき、フレーム単位で画像データを生成する。画像生成部15で生成された画像データは、例えば表示部5にフレーム画像を表示するために用いられる。あるいは、アプリケーション部4で生成された画像データは、後述するように、物体の追跡、認識、又は動作予測などの処理に使用されてもよい。 The application unit 4 includes an image generation unit 15. The image generation unit 15 generates image data on a frame-by-frame basis based on the frame data supplied from the frame generation unit 13. The image data generated by the image generation unit 15 is used, for example, to display a frame image on the display unit 5. Alternatively, the image data generated by the application unit 4 may be used for processing such as object tracking, recognition, or motion prediction, as described below.
 図2は、センサ2の一構成例を示すブロック図である。センサ2は、画素アレイ部21、垂直駆動部22、及び信号処理部23を備える。 FIG. 2 is a block diagram showing an example of the configuration of the sensor 2. The sensor 2 includes a pixel array section 21, a vertical drive section 22, and a signal processing section 23.
 画素アレイ部21は、1次元配列、又は行列状に2次元配列した、複数の画素30を備えている。本明細書では、複数の画素30が2次元配列されている例を説明する。また、図2の水平方向を行方向X、垂直方向を列方向Yと呼ぶ。図2において、画素30は行方向Xと列方向Yとに複数配列されている。 The pixel array section 21 includes a plurality of pixels 30 arranged one-dimensionally or two-dimensionally in a matrix. In this specification, an example in which a plurality of pixels 30 are arranged two-dimensionally will be described. The horizontal direction in FIG. 2 is called the row direction X, and the vertical direction is called the column direction Y. In FIG. 2, a plurality of pixels 30 are arranged in the row direction X and the column direction Y.
 イベントの検出は、複数の画素30でそれぞれ行われる。画素30は、光電変換素子31と画素回路32を有する。光電変換素子31は、被写体光を受光して、受光量に応じた電荷を生成する。生成された電荷は、画素回路32によりイベントとして検出される。 Event detection is performed by each of the multiple pixels 30. The pixels 30 have a photoelectric conversion element 31 and a pixel circuit 32. The photoelectric conversion element 31 receives subject light and generates an electric charge according to the amount of light received. The generated electric charge is detected as an event by the pixel circuit 32.
 垂直駆動部22は、列方向Yに配置される複数の画素を駆動するか否かを制御する複数の垂直駆動信号を生成する。垂直駆動部22は、列方向Yの任意の範囲の画素ブロックを選択して、選択した画素ブロック内の画素を順に駆動できる。 The vertical drive unit 22 generates multiple vertical drive signals that control whether or not to drive multiple pixels arranged in the column direction Y. The vertical drive unit 22 can select a pixel block in any range in the column direction Y and sequentially drive the pixels in the selected pixel block.
 また、センサ2は、行方向Xに配置される複数の画素を駆動するか否かを制御する水平駆動部を備えてもよい。 The sensor 2 may also include a horizontal drive unit that controls whether or not to drive a plurality of pixels arranged in the row direction X.
 信号処理部23は、各画素30から検出されるイベントに対して、所定の信号処理を実行する。信号処理後のイベントは、不図示の出力回路等を介して、順次、後段のイベント処理部3等に出力される。 The signal processing unit 23 performs a predetermined signal processing on the events detected from each pixel 30. The events after the signal processing are sequentially output to the downstream event processing unit 3 etc. via an output circuit etc. (not shown).
 図3Aは、画素30の第1例を示す回路図である。画素30は、光電変換素子31及び画素回路32を備える。画素回路32は、転送トランジスタTRG、電荷電圧変換部33、バッファ34、微分回路35及び量子化器36を備える。電荷電圧変換部33、光電変換素子31及び転送トランジスタTRGは、対数応答部37を構成する。 FIG. 3A is a circuit diagram showing a first example of a pixel 30. The pixel 30 includes a photoelectric conversion element 31 and a pixel circuit 32. The pixel circuit 32 includes a transfer transistor TRG, a charge-voltage conversion unit 33, a buffer 34, a differentiation circuit 35, and a quantizer 36. The charge-voltage conversion unit 33, the photoelectric conversion element 31, and the transfer transistor TRG constitute a logarithmic response unit 37.
 対数応答部37は、光電変換素子31で光電変換された電荷を、対数変換して電圧信号Vlogを生成する。対数変換する理由は、輝度情報を取得する画素30のダイナミックレンジを広げるためである。 The logarithmic response unit 37 performs logarithmic conversion on the charge photoelectrically converted by the photoelectric conversion element 31 to generate a voltage signal Vlog. The reason for the logarithmic conversion is to expand the dynamic range of the pixel 30 that acquires the luminance information.
 光電変換素子31は、対応する画素30に入射される入射光に基づく電荷(光電荷)を蓄積する。光電変換素子31としては、例えばフォトダイオードが用いられる。光電変換素子31はアノード及びカソードを有する。アノード又はカソードのいずれか一方(例えば、カソード)は転送トランジスタTRGのソースに接続され、他方(例えば、アノード)は、接地電圧等の所定の基準電圧ノードに接続される。 The photoelectric conversion element 31 accumulates electric charges (photocharges) based on incident light that is incident on the corresponding pixel 30. For example, a photodiode is used as the photoelectric conversion element 31. The photoelectric conversion element 31 has an anode and a cathode. Either the anode or the cathode (for example, the cathode) is connected to the source of the transfer transistor TRG, and the other (for example, the anode) is connected to a predetermined reference voltage node such as a ground voltage.
 転送トランジスタTRGは、光電荷の転送をスイッチングするために用いられる。転送トランジスタTRGは、転送トランジスタTRGはゲートに、例えばハイレベルの転送信号を印加されることでオンする。転送トランジスタTRGのドレインは、電荷電圧変換部33の入力ノードn1に接続されている。 The transfer transistor TRG is used to switch the transfer of photocharges. The transfer transistor TRG is turned on when, for example, a high-level transfer signal is applied to the gate of the transfer transistor TRG. The drain of the transfer transistor TRG is connected to the input node n1 of the charge-voltage conversion unit 33.
 電荷電圧変換部33は、光電変換素子31に蓄積された電荷を電圧に変換する。電荷電圧変換部33は、トランジスタQ1~Q5を備える。トランジスタQ1~Q4には、例えばNMOS(N channel Metal-Oxide-Semiconductor)トランジスタが用いられる。トランジスタQ5には、例えばPMOS(P channel Metal-Oxide-Semiconductor)トランジスタが用いられる。 The charge-voltage conversion unit 33 converts the charge stored in the photoelectric conversion element 31 into a voltage. The charge-voltage conversion unit 33 includes transistors Q1 to Q5. For example, NMOS (N-channel Metal-Oxide-Semiconductor) transistors are used for the transistors Q1 to Q4. For example, a PMOS (P-channel Metal-Oxide-Semiconductor) transistor is used for the transistor Q5.
 トランジスタQ1及びQ2は、電源電圧ノードと転送トランジスタTRGとの間にカスコード接続されている。トランジスタQ1のソースは、転送トランジスタTRGのドレインと、トランジスタQ3のゲートに接続されている。トランジスタQ1のゲートはトランジスタQ3のドレインと、トランジスタQ4のソースに接続されている。トランジスタQ1のドレインはトランジスタQ2のソースと、トランジスタQ4のゲートに接続されている。トランジスタQ2のドレインは電源電圧ノードに接続されている。トランジスタQ2のゲートは電荷電圧変換部33の出力ノードn2と、トランジスタQ4のドレインと、トランジスタQ5のドレインに接続されている。 Transistors Q1 and Q2 are cascode-connected between the power supply voltage node and the transfer transistor TRG. The source of transistor Q1 is connected to the drain of the transfer transistor TRG and the gate of transistor Q3. The gate of transistor Q1 is connected to the drain of transistor Q3 and the source of transistor Q4. The drain of transistor Q1 is connected to the source of transistor Q2 and the gate of transistor Q4. The drain of transistor Q2 is connected to the power supply voltage node. The gate of transistor Q2 is connected to the output node n2 of the charge-voltage conversion unit 33, the drain of transistor Q4, and the drain of transistor Q5.
 トランジスタQ3及びトランジスタQ4は、ノードn2と基準電圧(接地)ノードとの間にカスコード接続されている。トランジスタQ3のソースは基準電圧(接地)ノードに接続されている。トランジスタQ4は、トランジスタQ3とトランジスタQ5の間に配置されている。 Transistor Q3 and transistor Q4 are cascode-connected between node n2 and the reference voltage (ground) node. The source of transistor Q3 is connected to the reference voltage (ground) node. Transistor Q4 is disposed between transistor Q3 and transistor Q5.
 トランジスタQ5のソースは電源電圧ノードに接続され、ゲートにはバイアス電圧Vblogが印加される。トランジスタQ5は、バイアス電圧Vblogの電圧レベルによって、出力ノードn2の電圧レベルを調整する。 The source of transistor Q5 is connected to the power supply voltage node, and a bias voltage Vblog is applied to its gate. Transistor Q5 adjusts the voltage level of output node n2 according to the voltage level of bias voltage Vblog.
 電荷電圧変換部33が対数変換した電圧信号Vlogは、バッファ34に入力される。バッファ34は、電源電圧ノードと基準電圧(接地)ノードの間にカスコード接続される、トランジスタQ6及びトランジスタQ7を備える。トランジスタQ6には、例えばPMOSトランジスタが用いられる。トランジスタQ7には、例えばNMOSトランジスタが用いられる。 The voltage signal Vlog logarithmically converted by the charge-voltage converter 33 is input to the buffer 34. The buffer 34 includes a transistor Q6 and a transistor Q7 that are cascode-connected between a power supply voltage node and a reference voltage (ground) node. A PMOS transistor, for example, is used as the transistor Q6. An NMOS transistor, for example, is used as the transistor Q7.
 バッファ34内のトランジスタQ6は、ソースフォロワ回路を構成する。電荷電圧変換部33から出力された電圧信号Vlogに応じた画素電圧Vsfが、バッファ34から出力される。トランジスタQ6のゲートには、電荷電圧変換部33の出力ノードn2から、電圧信号Vlogが入力される。トランジスタQ6のソースは、電源電圧ノードに接続されている。トランジスタQ6のドレインはバッファ34の出力ノードn3を介し、トランジスタQ7のドレイン及び微分回路35に接続されている。 Transistor Q6 in buffer 34 forms a source follower circuit. A pixel voltage Vsf corresponding to the voltage signal Vlog output from charge-voltage conversion unit 33 is output from buffer 34. The voltage signal Vlog is input to the gate of transistor Q6 from output node n2 of charge-voltage conversion unit 33. The source of transistor Q6 is connected to the power supply voltage node. The drain of transistor Q6 is connected to the drain of transistor Q7 and differentiation circuit 35 via output node n3 of buffer 34.
 トランジスタQ7のソースは基準電圧(接地)ノードに接続されている。トランジスタQ7のゲートにはバイアス電圧Vbsfが印加される。トランジスタQ7は、バイアス電圧Vbsfの電圧レベルに応じて、出力ノードn3の電圧レベルを調整する。 The source of transistor Q7 is connected to the reference voltage (ground) node. A bias voltage Vbsf is applied to the gate of transistor Q7. Transistor Q7 adjusts the voltage level of output node n3 according to the voltage level of bias voltage Vbsf.
 バッファ34から出力された画素電圧Vsfは、微分回路35に入力される。バッファ34は、画素電圧Vsfの駆動力を向上させることができる。また、バッファ34を設けることで、後段の微分回路35がスイッチング動作を行う際に発生するノイズが電荷電圧変換部33に伝達しないようにするアイソレーションを確保することができる。 The pixel voltage Vsf output from the buffer 34 is input to the differentiation circuit 35. The buffer 34 can improve the driving force of the pixel voltage Vsf. Furthermore, by providing the buffer 34, it is possible to ensure isolation so that noise generated when the downstream differentiation circuit 35 performs a switching operation is not transmitted to the charge-voltage conversion unit 33.
 The differentiation circuit 35 generates a differentiated signal Vout corresponding to changes in the voltage signal Vlog converted by the charge-voltage conversion unit 33. The differentiation circuit 35 includes a capacitor C1 and transistors Q8 to Q10. For example, an NMOS transistor is used for transistor Q10, and PMOS transistors are used for transistors Q8 and Q9.
 The capacitor C1 is disposed between a connection node n4, which joins the source of transistor Q8 and the gate of transistor Q9, and the output node n3 of the buffer 34. The capacitor C1 accumulates charge based on the pixel voltage Vsf supplied from the buffer 34. A capacitor C2 is disposed between the gate of transistor Q9 and the drain of transistor Q10. The capacitor C2 supplies, to the source of transistor Q8 and the gate of transistor Q9, a charge corresponding to the amount of change in the pixel voltage Vsf, i.e., the time derivative of the pixel voltage Vsf.
 Transistor Q8 switches whether to short the gate and drain of transistor Q9 according to an auto-zero signal XAZ. The auto-zero signal XAZ is a signal that instructs initialization and goes from high level to low level, for example, every time an event detection signal (described below) is output from the pixel 30. When the auto-zero signal XAZ goes low, transistor Q8 turns on, the differentiated signal Vout is reset to its initial value, and the charge of the capacitor C1 is initialized.
 The source of transistor Q10 is connected to the reference voltage (ground) node, and a bias voltage Vbdiff is applied to its gate. Transistor Q10 adjusts the voltage level of the output node n5 of the differentiation circuit 35 according to the voltage level of the bias voltage Vbdiff.
 Transistors Q9 and Q10 function as an inverting circuit whose input node is the connection node n4 on the gate side of transistor Q9 and whose output node is the connection node n5 between transistors Q9 and Q10.
 As described above, the differentiation circuit 35 detects the amount of change in the pixel voltage Vsf by differential operation. The amount of change in the pixel voltage Vsf indicates the amount of change in the light incident on the pixel 30. The differentiation circuit 35 supplies the differentiated signal Vout to the quantizer 36 via the output node n5.
 The quantizer 36 performs a comparison operation that compares the differentiated signal Vout with threshold voltages. Based on the result of the comparison, the quantizer 36 detects an event indicating that the absolute value of the change in the amount of incident light has exceeded a threshold, and outputs an event detection signal COMP+ and an event detection signal COMP-. The quantizer 36 includes transistors Q11 to Q14 and an inverter K1. For example, PMOS transistors are used as transistors Q11 and Q13, and NMOS transistors are used as transistors Q12 and Q14.
 Transistors Q11 and Q12 are cascode-connected between the power supply voltage node and the reference voltage (ground) node. The source of transistor Q11 is connected to the power supply voltage node. The drain of transistor Q11 is connected to the inverter K1 and to the drain of transistor Q12. The source of transistor Q12 is connected to the reference voltage (ground) node. The differentiated signal Vout of the differentiation circuit 35 is applied to the gate of transistor Q11, and a threshold voltage Vhigh is applied to the gate of transistor Q12.
 Transistors Q11 and Q12 compare the differentiated signal Vout with the threshold voltage Vhigh. Specifically, transistor Q11 turns on when the differentiated signal Vout of the differentiation circuit 35 is lower than the threshold voltage Vhigh, and the event detection signal COMP+ output from the drain of transistor Q11 via the inverter K1 goes low.
 Transistors Q13 and Q14 are cascode-connected between the power supply voltage node and the reference voltage (ground) node. The source of transistor Q13 is connected to the power supply voltage node. The drain of transistor Q13 is connected to the output node of the quantizer 36 and to the drain of transistor Q14. The differentiated signal Vout of the differentiation circuit 35 is applied to the gate of transistor Q13, and a threshold voltage Vlow is applied to the gate of transistor Q14.
 Transistors Q13 and Q14 compare the differentiated signal Vout with the threshold voltage Vlow. Specifically, transistor Q13 turns off when the differentiated signal Vout of the differentiation circuit 35 is higher than the threshold voltage Vlow, and the event detection signal COMP- output from the drain of transistor Q13 goes low.
 The pixel 30 can detect both an increase and a decrease in the amount of incident light as events. When the amount of light incident on the pixel 30 increases, photocharge is generated by the photoelectric conversion element 31, and the voltage of the input node n1 connected to the cathode of the photoelectric conversion element 31 falls. In response to the drop in the voltage of the input node n1, the output voltage Vlog of the charge-voltage conversion unit 33 falls, and the pixel voltage Vsf of the buffer 34 also falls. The differentiated signal Vout output from the differentiation circuit 35 falls according to the amount of decrease in the pixel voltage Vsf, and when it falls below the threshold voltage Vhigh, a low-level event detection signal COMP+ is output. That is, a low-level event detection signal COMP+ indicates that the increase in the amount of incident light exceeds the threshold determined by the threshold voltage Vhigh.
 Similarly, when the amount of light incident on the pixel 30 decreases, the differentiated signal Vout output from the differentiation circuit 35 rises, and when it exceeds the threshold voltage Vlow, a low-level event detection signal COMP- is output. That is, a low-level event detection signal COMP- indicates that the change in the amount of incident light has fallen below the threshold determined by the threshold voltage Vlow.
 In this specification, detection of either a low-level event detection signal COMP+ or a low-level event detection signal COMP- is referred to as detection of an event. An event also has polarity information indicating whether the change in luminance of the incident light is positive or negative: the polarity is positive when a low-level event detection signal COMP+ is detected, and negative when a low-level event detection signal COMP- is detected.
 The pixel 30 does not need to detect both the event detection signal COMP+ and the event detection signal COMP-; it may detect only one of them. FIG. 3B is a circuit diagram showing a second example of the pixel 30. The quantizer 36a in the pixel circuit 32a shown in FIG. 3B differs from the quantizer 36 in FIG. 3A in that it does not include transistors Q13 and Q14. Accordingly, the pixel 30a (and pixel circuit 32a) in FIG. 3B detects only an increase in the amount of light incident on the photoelectric conversion element 31 and outputs the event detection signal COMP+.
 Similarly, the pixel 30 may be configured by removing transistors Q11 and Q12 and the inverter K1 from the quantizer 36 in FIG. 3A. In that case, the pixel 30 detects only a decrease in the amount of light received by the photoelectric conversion element 31 and outputs the event detection signal COMP-.
 The sensor 2 can also be configured as, for example, a two-layer stacked chip. FIG. 4A is a diagram showing a first example of the stacked structure of the sensor 2. This sensor 2 includes a pixel chip 41 and a logic chip 42 stacked on the pixel chip 41. These chips are bonded by vias or the like; besides vias, they can also be bonded by Cu-Cu bonding or bumps.
 On the pixel chip 41, for example, the photoelectric conversion element 31 and a part of the pixel circuit 32 (for example, the transfer transistor TRG and the charge-voltage conversion unit 33) are arranged. On the logic chip 42, for example, the remaining part of the pixel circuit 32 (for example, the buffer 34, the differentiation circuit 35, and the quantizer 36), the vertical drive unit 22, and the signal processing unit 23 are arranged.
 The sensor 2 may also be composed of three or more stacked chips. FIG. 4B is a diagram showing a second example of the stacked structure of the sensor 2. In the sensor 2a of FIG. 4B, a first pixel chip 41a and a second pixel chip 41b are stacked in place of the pixel chip 41. On the first pixel chip 41a, for example, the photoelectric conversion element 31 and the transfer transistor TRG are arranged. On the second pixel chip 41b, for example, the charge-voltage conversion unit 33 is arranged.
 In the sensor 2a of FIG. 4B, the charge-voltage conversion unit 33 is removed from the first pixel chip 41a and placed on the second pixel chip 41b. As a result, even when the chip area is reduced, the area of the photoelectric conversion element 31 can be secured on the first pixel chip 41a and the area of the charge-voltage conversion unit 33 can be secured on the second pixel chip 41b.
 The information processing device 1 according to this embodiment is characterized by generating frame images in which the detection time and detection frequency of events can be visualized. The series of processes for this purpose is called event visualization. FIG. 5 is a flowchart showing an outline of the event visualization procedure in the information processing device 1. First, events are detected in each pixel 30 of the sensor 2 (step S1). The event acquisition unit 11 in the event processing unit 3 acquires compressed event information at predetermined intervals. In this specification, the event information acquired within one predetermined period may be referred to as a data chunk. The data chunks acquired by the event acquisition unit 11 may be temporarily stored in the event storage unit 14 (step S2).
 The decoding unit 12 decodes the data chunks stored in the event storage unit 14 and outputs event data (step S3).
 The event data output by the decoding unit 12 includes, for example, the following information:
 x: X address indicating the detection position (e.g., pixel position) of the event
 y: Y address indicating the detection position (e.g., pixel position) of the event
 t: time information (time stamp) indicating the detection time of the event
 p: polarity information of the event
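 As an illustration only, and not part of the disclosed device, an event record e(x, y, p, t) of this form could be represented in software as follows; the class and field names are assumptions chosen for readability.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Event:
    """One decoded event e(x, y, p, t)."""
    x: int    # X address of the detecting pixel
    y: int    # Y address of the detecting pixel
    p: int    # polarity: +1 (brightness increase) or -1 (brightness decrease)
    t: float  # time stamp of detection

# Example: a positive-polarity event detected at pixel (120, 45) at t = 0.0032 s
ev = Event(x=120, y=45, p=+1, t=0.0032)
```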
 The frame generation unit 13 performs framing processing that generates frame data based on the event data output from the decoding unit 12 (step S4). The image generation unit 15 performs frame image generation processing based on the frame data supplied from the frame generation unit 13 (step S5). The image generation processing is described in detail later.
 FIG. 6 is a diagram explaining the framing processing. The decoding unit 12 outputs decoded event data in units of data chunks dc. One data chunk dc contains, for example, the event information of all pixels that occurred during a predetermined unit detection period Δt. Each piece of event information is represented as event data e(x, y, p, t) containing the event detection position x, y, the event detection time t, and the event polarity information p described above.
 The frame generation unit 13 outputs, as one frame of data, the multiple pieces of event data contained in a predetermined time slice Ts consisting of multiple unit detection periods Δt. In the example of FIG. 6, the time slice Ts has a width of 8Δt, and the frame data F1 contains eight data chunks dc. That is, the frame generation unit 13 generates, in units of frames containing the multiple pixels 30, frame data containing the event information generated within a predetermined period (the time slice Ts).
 The frame generation unit 13 generates the frame data such that two frame data adjacent in the time axis direction share event data in an overlapping time range. In this specification, the overlapping time range may be called an overlap section. Providing an overlap section can improve the accuracy of object tracking.
 In the example of FIG. 6, an overlap section with a width of 6Δt is provided between frame data F1 and F2. Similarly, an overlap section with a width of 6Δt is provided between frame data F2 and F3.
 The width of the time slice Ts and the width of the overlap section Tor are not limited to those shown in FIG. 6; any widths may be set. If the time slice Ts is too short, each frame of data contains few events, making it difficult to grasp the characteristics of the event information visually. If the time slice Ts is too long relative to the overlap section Tor, the interval at which frame data is generated becomes long, and the advantage of the high speed of the EVS is lost. It is therefore desirable to adjust the width of the time slice Ts and the width of the overlap section Tor according to, for example, how events are occurring.
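 The framing step can be pictured with the following minimal sketch, which groups a time-ordered list of events into overlapping time slices. The function name and the choice of a fixed stride Ts - Tor between frames are assumptions made for illustration, not the claimed implementation; events are plain (x, y, p, t) tuples here for brevity.

```python
def frame_events(events, t0, ts, tor, n_frames):
    """Group (x, y, p, t) event tuples into overlapping time slices.

    t0  : start time of the first frame
    ts  : time-slice width, e.g. 8 * dt
    tor : overlap width, e.g. 6 * dt; consecutive frames start every (ts - tor)
    """
    stride = ts - tor                      # new, non-overlapping time per frame
    frames = []
    for k in range(n_frames):
        start = t0 + k * stride
        end = start + ts                   # each slice spans [start, start + Ts)
        frames.append([e for e in events if start <= e[3] < end])
    return frames

# Example with dt = 1 ms, Ts = 8*dt, Tor = 6*dt as in FIG. 6
dt = 1e-3
frames = frame_events(events=[(10, 5, +1, 0.0005), (10, 5, +1, 0.0031)],
                      t0=0.0, ts=8 * dt, tor=6 * dt, n_frames=3)
```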
 FIG. 7A is a diagram showing image generation processing in a comparative example. FIG. 7A shows an example in which a subject A moves from a position Ps to a position Pe within the time slice Ts. While the subject A moves from the position Ps to the position Pe, a change in luminance occurs and events are detected by the sensor 2. As a result, image data G0 containing the event data is generated.
 The image data G0 contains the event information that occurred within the time slice Ts, but the occurrence time of each event is unknown. Therefore, it cannot be determined from the image data G0 whether the object A is moving from the position Ps to the position Pe or from the position Pe to the position Ps.
 As shown in FIG. 7A, the image data G0 of the comparative example does not contain information on the occurrence times of the events, so the movement of the subject cannot be visually interpreted or analyzed. The information processing device 1 of the present disclosure is characterized by its ability to solve this problem.
 FIG. 7B is a diagram showing the image generation processing of the present disclosure. Like FIG. 7A, FIG. 7B shows an example in which the subject A moves from the position Ps to the position Pe within the time slice Ts. The image data G output by the information processing device 1 of the present disclosure is characterized in that its gradation changes according to the occurrence time of each event. Specifically, new events are rendered in dark tones and old events in light tones. The change in gradation of the image data G therefore makes it possible to determine that the object A is moving from the position Ps to the position Pe. In this way, the information processing device 1 of the present disclosure can output image data G that allows the movement of a subject to be visually interpreted and analyzed.
 FIG. 8 is a block diagram showing the detailed configuration of the image generation unit 15 in the first embodiment of the present disclosure. As described above, frame data is supplied to the image generation unit 15 from the frame generation unit 13. The image generation unit 15 generates image data based on the frame data and outputs the image data to, for example, the display unit 5. The image generation unit 15 includes a divided frame generation unit 51, a pixel data generation unit 52, and a frame image generation unit 53.
 The divided frame generation unit 51 divides the frame data supplied from the frame generation unit 13 and outputs, to the pixel data generation unit 52, a plurality of partition data (divided frame data) each having a predetermined time width (hereinafter, a unit period).
 The pixel data generation unit 52 generates pixel data containing the event information detected in each of the plurality of unit periods. The pixel data generated by the pixel data generation unit 52 is generated for each pixel 30 in the pixel array unit 21, that is, for each individual pixel 30 constituting the frame image. The pixel data generation unit 52 includes a mapping unit 54, a shift unit 55, and a storage unit 56.
 The mapping unit 54 links the pixel position information of the event data contained in the partition data D with the corresponding coordinates on a drawing canvas (hereinafter, pixel coordinates). The mapping unit 54 acquires the partition data from the divided frame generation unit 51 and extracts the event data in the partition data. The mapping unit 54 assigns the event data to pixel coordinates based on the pixel position information of the event data, and determines whether an event exists for each pixel.
 The shift unit 55 acquires the event data for each pixel coordinate from the mapping unit 54 and generates, for each pixel coordinate, pixel data in a binary format consisting of a plurality of bits. The storage unit 56 stores, for each pixel coordinate and arranged in pixel order, pixel data in binary format whose number of bits equals the number of partition data obtained by dividing the frame data.
 The shift unit 55 reads out the pixel data in the storage unit 56, bit-shifts it toward the LSB side, discards the old binary data on the LSB side, and adds new binary data on the MSB side. The detailed operation of the shift unit 55 is described later.
 The shift unit 55 repeats this shift operation on the pixel data based on the event data of each partition data constituting the frame data. When the series of shift operations is completed, the pixel data generation unit 52 outputs the pixel data to the frame image generation unit 53. The frame image generation unit 53 generates image data in frame units (a frame image) based on the pixel data of each pixel coordinate.
 Note that the image generation unit 15 in the first embodiment of the present disclosure is not limited to the configuration shown in FIG. 8, as long as it can realize the image generation processing shown in FIG. 9.
 FIG. 9 is a flowchart of image generation in the first embodiment of the present disclosure. The processing shown in FIG. 9 corresponds to the image generation processing of step S5 shown in FIG. 5.
 The divided frame generation unit 51 divides the frame data into N partition data (step S11), where N is a number corresponding to the number of gradations. FIG. 10 is a schematic diagram showing the division of frame data in the first embodiment of the present disclosure. As shown in FIG. 10, the divided frame generation unit 51 divides the frame data F into a plurality of pieces in the time axis direction to generate a plurality of partition data D. That is, the divided frame generation unit 51 divides the frame data into data for a plurality of unit periods.
 FIG. 10 shows an example in which the frame data F is divided into eight partition data D (N = 8), where the unit periods of the eight partition data D all have the same time width. The following description uses this example.
 As described above, the frame data F contains a plurality of event data e(x, y, p, t). These event data are assigned to the respective partition data D.
 Each partition data D is assigned an identification number i. The identification numbers i are assigned from 0 to N-1 in chronological order, from oldest to newest. The partition data D in FIG. 10 are assigned 0, 1, 2, ..., 7 in chronological order, from oldest to newest.
 The frame data F may also be divided at the same time width as the data chunks dc in FIG. 6. In that case, the data chunks dc contained in the overlap section Tor can be reused when the next frame data F is acquired. Details are described later. A sketch of this division step is shown below.
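 A minimal sketch of step S11, under the assumption of equal-width unit periods: each event of one time slice is assigned the identification number i of the partition it falls into. The function name is illustrative only, and events are plain (x, y, p, t) tuples.

```python
def split_into_partitions(events_in_slice, t_start, ts, n_partitions=8):
    """Assign each (x, y, p, t) event of one time slice to a partition 0..N-1.

    Partition 0 is the oldest unit period and N-1 the newest, as in FIG. 10.
    """
    dt_unit = ts / n_partitions                  # width of one unit period
    partitions = [[] for _ in range(n_partitions)]
    for e in events_in_slice:
        i = int((e[3] - t_start) // dt_unit)     # e[3] is the time stamp t
        i = min(max(i, 0), n_partitions - 1)     # clamp events on the boundary
        partitions[i].append(e)
    return partitions
```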
 Returning to FIG. 9, the identification number i is initialized to 0 (step S12). The processing in steps S13 to S16 is performed on one partition data D at a time. In steps S13 to S16, the eight partition data D are processed in order from the oldest while the identification number i is incremented. In step S13, it is determined whether the identification number i is smaller than the number of partitions N. If the identification number i is smaller than the number of partitions N, the processing from step S14 onwards is performed.
 The pixel data generation unit 52 acquires all the event data contained in the partition data D with the identification number i (step S14). One partition data D may contain a plurality of event data for a plurality of events.
 The pixel data generation unit 52 maps each event data acquired in step S14 onto a drawing canvas for the image data to generate pixel data (step S15). The drawing canvas is a data area in which the pixel data of each pixel 30 are arranged in the order of the pixels 30, and is provided, for example, in correspondence with a storage area of the storage unit 56. A frame image is generated based on the pixel data mapped onto the drawing canvas.
 FIG. 11 is a diagram showing the generation of pixel data, using the partition data D with identification number i = 5 as an example. The mapping unit 54 in the pixel data generation unit 52 extracts the event detection position (x1, y1) from, for example, event data e1(x1, y1, p1, t1) in the partition data D. Based on this, the event data e1 is assigned to the corresponding coordinate pxa(x1, y1) on the drawing canvas cp.
 Similarly, for the other events in the partition data D, the event data are assigned to the corresponding coordinates on the drawing canvas cp. For pixels for which no event exists in the partition data D, data indicating that no event data exists is assigned to the corresponding coordinate, such as pxb(x2, y2), on the drawing canvas cp.
 For each coordinate in the drawing canvas cp, the pixel data generation unit 52 determines whether a corresponding event exists in the partition data D and generates binary data of one or more bits (one bit in the example of FIG. 11). The binary data is, for example, binary data of one or more bits indicating the presence or absence of an event and its polarity. This specification mainly describes an example in which the binary data is one-bit data indicating the presence or absence of an event. For the coordinate pxa, the pixel data generation unit 52 generates binary data bnewa1 indicating that an event exists (for example, 1), and for the coordinate pxb, it generates binary data bnewb1 indicating that no event exists (for example, 0). Similarly, for the other coordinates in the drawing canvas cp, the pixel data generation unit 52 generates binary data based on the presence or absence of a corresponding event in the partition data D.
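 The mapping step can be sketched as building one presence bit per pixel for a single partition; using a NumPy array as a stand-in for the drawing canvas is an assumption made purely for illustration.

```python
import numpy as np

def partition_bitmap(partition_events, width, height):
    """Return a (height, width) array of 0/1: 1 where at least one event occurred."""
    bitmap = np.zeros((height, width), dtype=np.uint8)
    for x, y, p, t in partition_events:
        bitmap[y, x] = 1   # presence only; the polarity p could be encoded in a second bit
    return bitmap
```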
 The storage unit 56 stores the pixel data corresponding to each coordinate in the drawing canvas cp, for example, as bit string data. For example, before the mapping, the storage unit 56 stores pixel data Arra0 for the coordinate pxa. The pixel data Arra0 has binary data bolda0 in its least significant bit (LSB) and binary data bnewa0 in its most significant bit (MSB).
 The storage unit 56 stores, for each coordinate of the drawing canvas cp, pixel data consisting of, for example, N bits. In the example of FIG. 11, N = 8.
 The shift unit 55 shifts each bit of the pixel data Arra0 by one bit toward the LSB side to generate pixel data Arra1; the LSB of the pixel data Arra0 is thereby discarded. The shift unit 55 also adds the corresponding event data bnewa1 of the partition data D at the MSB. As a result, new pixel data Arra1 is generated, having binary data bolda1 in the LSB and binary data bnewa1 in the MSB. The binary data bnewa1 has the bit value 1, indicating that an event exists.
 Likewise, before the mapping, the storage unit 56 stores pixel data Arrb0 corresponding to the coordinate pxb, having binary data boldb0 in the LSB and binary data bnewb0 in the MSB. The shift unit 55 discards the binary data boldb0 from the LSB of the pixel data Arrb0, shifts the binary data from boldb1 through bnewb0 by one bit toward the LSB side, and adds binary data bnewb1 on the MSB side. The binary data bnewb1 is 0, indicating that no event exists. Pixel data Arrb1 is thereby generated.
 The pixel data Arra0 and Arrb0 stored in the storage unit 56 are the pixel data generated immediately before. Specifically, immediately before the processing of step S15 is performed on the partition data D with identification number i = 5, the storage unit 56 holds the pixel data generated in the processing of the partition data D with identification number i = 4.
 In this way, the shift unit 55 shifts each bit of the pixel data stored in the storage unit 56 for each coordinate of the drawing canvas cp by one bit toward the LSB side, and adds one-bit information representing the presence or absence of an event in the partition data D at the MSB, thereby generating the pixel data of each pixel 30 for each partition data D.
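 A minimal sketch of this shift operation on one pixel's N-bit value, assuming presence-only one-bit information per unit period; masking keeps the word at N bits so the oldest bit is discarded.

```python
N = 8  # number of unit periods per frame, and bits per pixel word

def shift_in_event(pixel_word: int, event_present: bool, n_bits: int = N) -> int:
    """Drop the oldest (LSB) bit and place the newest unit period's bit at the MSB."""
    word = pixel_word >> 1                      # shift all bits toward the LSB
    if event_present:
        word |= 1 << (n_bits - 1)               # set the MSB for the newest unit period
    return word & ((1 << n_bits) - 1)

# Example: events present in the 1st and 3rd of three consecutive unit periods
w = 0
for present in (True, False, True):
    w = shift_in_event(w, present)
print(format(w, "08b"))   # -> 10100000 (newest event at the MSB)
```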
 When the processing of step S15 in FIG. 9 is completed, the identification number i of the partition data D is incremented (step S16), and the processing from step S13 onwards is repeated. If the identification number i is smaller than the number of partitions N in step S13, the processing of steps S14 and S15 is performed on the next partition data D. In this way, the pixel data generation processing is performed for all the partition data D of the frame data F. The pixel data generation unit 52 outputs the new pixel data generated in step S15 to the frame image generation unit 53.
 FIG. 12 is a diagram showing the correspondence between frame data and pixel data in the first embodiment of the present disclosure. The frame data F contains event data corresponding to all the pixels 30 in the pixel array unit 21. Each event data in the frame data F is assigned to a coordinate in the drawing canvas cp. In FIG. 12, the event data of a specific pixel in the frame data F is shown with thick lines, and the event data of the other pixels with broken lines. FIG. 12 also shows an example in which the event data of the specific pixel in the frame data F is assigned to the pixel data pd of the coordinate pxc in the drawing canvas cp.
 The pixel data generation unit 52 associates each bit of the pixel data pd with a different partition data D in the frame data F, and generates the corresponding bit value of the pixel data pd based on the event information of the corresponding pixel 30 in that partition data D. FIG. 12 shows an example in which the corresponding bit values of the pixel data pd are generated based on the event data epxc of the specific pixel.
 As shown in FIG. 11, the pixel data pd has older binary data toward the LSB side and newer binary data toward the MSB side. In particular, when the length of the pixel data pd matches the number of partition data, the LSB of the pixel data pd corresponds to the partition data D with identification number i = 0, and the MSB corresponds to the partition data D with identification number i = N-1 (i = 7 in the case of FIG. 12). That is, the pixel data generation unit 52 generates the bit values on the higher-order bit side of the pixel data pd based on the event information of the corresponding pixel 30 in newer partition data D.
 The binary data of each bit of the pixel data pd is, for example, 1 if event data of the corresponding pixel 30 (event data epxc in FIG. 12) exists in the corresponding partition data D, and, for example, 0 if it does not. FIG. 12 shows an example in which a bit is set to 1 if at least one event data epxc exists in the corresponding partition data D, but the condition for setting each bit of the pixel data to 1 may be set arbitrarily. For example, each bit of the pixel data may be set to 1 when the number of events of the corresponding pixel 30 in the corresponding partition data D is equal to or greater than a predetermined threshold, and to 0 otherwise.
 As shown in FIG. 12, the pixel data generation unit 52 generates pixel data pd containing the event information detected in each of the plurality of unit periods. The pixel data pd is bit string data of a plurality of bits. In the pixel data pd, event information of progressively older unit periods is arranged going from the MSB side toward the LSB side of the bit string data. The event information is represented by one or more bits for each unit period. The pixel data pd may also contain the polarity information p of the events for each unit period.
 The pixel data generation unit 52 similarly generates pixel data for the other coordinates in the drawing canvas cp. That is, for each of all the pixels 30 in the pixel array unit 21, the pixel data generation unit 52 generates pixel data based on the event information of the corresponding pixel 30 contained in each of the plurality of partition data D.
 Returning to FIG. 9, the frame image generation unit 53 draws a frame image based on the pixel data of each pixel for one frame generated by the processing of steps S13 to S16 (step S17). In this specification, the pixel data of the pixels constituting a frame image are collectively referred to as image data.
 FIG. 13 is a diagram showing an example of the image data in the first embodiment of the present disclosure. FIG. 13 shows three pixel data pd1, pd2, and pd3 that form part of the image data. These three pixel data correspond to the coordinates pxd, pxe, and pxf in the drawing canvas cp, respectively.
 The MSB of the pixel data pd1 in FIG. 13 is 1, indicating that an event occurred most recently. The pixel data pd2 has more bits set to 1 than the pixel data pd3, indicating a larger number of event occurrences. The pixel values of the pixel data pd1, pd2, and pd3 are 206, 124, and 97, respectively. Accordingly, the gradation (luminance) of the pixel data pd1 is the highest, followed by that of the pixel data pd2, and the gradation (luminance) of the pixel data pd3 is the lowest.
 In this way, the pixel data has a larger pixel value the more recently events occurred (were detected) and the more frequently events occurred (were detected). The gradation (luminance) of each pixel of the frame image generated by the frame image generation unit 53 therefore makes it possible to grasp when and how frequently the events occurred (were detected).
 The number of gradations represented by the pixel values of the pixel data pd1, pd2, and pd3 corresponds to the division number N of the frame data F, that is, to the number of the plurality of unit periods.
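 As a worked check of the example in FIG. 13, interpreting the 8-bit string with the newest unit period at the MSB gives the pixel value directly. The concrete bit patterns below are assumptions consistent with the stated values, not taken from the figure itself.

```python
def pixel_value(bits):
    """bits[0] is the newest unit period (MSB), bits[-1] the oldest (LSB)."""
    value = 0
    for b in bits:
        value = (value << 1) | b
    return value

pd1 = [1, 1, 0, 0, 1, 1, 1, 0]   # newest period has an event, several recent events -> 206
pd2 = [0, 1, 1, 1, 1, 1, 0, 0]   # many events, but none in the newest period        -> 124
pd3 = [0, 1, 1, 0, 0, 0, 0, 1]   # fewer events, mostly older                        -> 97
print(pixel_value(pd1), pixel_value(pd2), pixel_value(pd3))   # 206 124 97
```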
 The frame image generation unit 53 may reflect the polarity information p of each event in the image data. For example, one of positive-polarity events and negative-polarity events may be expressed in grayscale, with the gradation representing the occurrence time and frequency of the events, while the other may be expressed using the gradation of one of the R, G, and B colors.
 After the frame image drawing processing in step S17 of FIG. 9, it is determined whether to end the framing processing (step S18). Ending the framing processing means that no new frame image is drawn. If the framing processing is to end, the processing of FIG. 9 ends.
 If the framing processing is to continue in step S18, the processing from step S11 onwards is repeated to draw a new frame image.
 As shown in FIG. 6, the frame generation unit 13 generates each frame data such that two frame data adjacent in the time axis direction include an overlapping time range. FIG. 6 shows an example in which each frame data is divided into a plurality of data chunks dc of equal width; the divided frame generation unit 51 generates a plurality of partition data by dividing each frame data generated by the frame generation unit 13 into unit periods. As described above, the pixel data are generated based on the frame data containing the plurality of partition data.
 FIG. 14 is a diagram showing two pixel data generated based on two frame data adjacent in the time axis direction. The two adjacent frame data supplied to the divided frame generation unit 51 have an overlap portion in which they overlap each other, and the time width of the overlap portion is, for example, the same as the time width of the overlap section in FIG. 6. The two pixel data pda and pdb generated based on the two adjacent frame data also have an overlap portion bor, as shown in FIG. 14. The time width of the overlap portion bor is the same as the time width of the overlap section Tor and is an integer multiple of the unit period, i.e., the time width of the partition data. The number of unit periods in the overlap portion does not necessarily match the number of data chunks in the overlap section Tor.
 As shown in FIG. 14, providing an overlap portion between two pixel data adjacent in the time axis direction makes it possible to capture accurately how events change. For example, the direction in which a moving object moves can be expressed as a trajectory in the frame image. In addition, when new pixel data is generated, the binary data in the overlap portion with the immediately preceding pixel data can be used as-is as valid data, so new pixel data can be generated quickly and the frame rate can be increased.
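 This reuse of the overlap portion can be sketched as follows: when advancing to the next frame, only the bits for the new, non-overlapping unit periods need to be computed, while the overlapping bits are simply shifted. Treating the non-overlapping part as a short list of new bits is an assumption made for illustration.

```python
def advance_frame(pixel_word: int, new_bits, n_bits: int = 8) -> int:
    """Reuse the overlap portion of the previous frame's pixel word.

    new_bits: 0/1 values for the non-overlapping (newest) unit periods,
              ordered from oldest to newest.
    """
    word = pixel_word
    for b in new_bits:
        word >>= 1                           # drop the oldest bit
        word |= (b & 1) << (n_bits - 1)      # append the new bit at the MSB
    return word & ((1 << n_bits) - 1)

# Example: Ts = 8 unit periods with a 6-period overlap -> only 2 new bits per frame
next_word = advance_frame(0b10100000, new_bits=[0, 1])
print(format(next_word, "08b"))              # -> 10101000
```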
 In the above case, the frame image generation unit 53 generates the image data such that each pixel 30 has a gradation corresponding to the detection time and detection frequency of events, and such that two frame images adjacent in the time axis direction contain the event information of the overlapping time range.
 As described above, the information processing device 1 according to the first embodiment of the present disclosure generates pixel data having pixel values corresponding to the detection time and detection frequency of events, and can therefore generate frame images in which not only the location but also the time and frequency of event occurrence can be recognized visually with ease. More specifically, in the frame image according to this embodiment, the direction and time of event occurrence can be expressed by, for example, luminance or gradation, making it easier to grasp visually when and in which direction a moving object traveled. It is also possible to grasp visually at what time and for how long events continued to occur.
 In this way, the image generation processing of the present disclosure described with reference to FIG. 9 expresses the occurrence time of events in a form that mimics the afterimage effect of the human eye. The information processing device 1 of the present disclosure can therefore visualize events in a form that is easier for humans to understand than before. Furthermore, with the image generation processing of the present disclosure, even when a plurality of events occur at the same pixel position within one time slice, they can be distinguished by gradation or luminance.
 (Second Embodiment)
 The divided frame generation unit 51 in the first embodiment divides the frame data F into a plurality of partition data D of the same time width, but the time widths of the plurality of partition data D do not necessarily have to be the same.
 FIG. 15 is a schematic diagram showing the division of frame data in the second embodiment of the present disclosure. In step S11 of FIG. 9, the divided frame generation unit 51 in the second embodiment divides the frame data into a plurality of partition data Da each having a different time width. In the second embodiment, for example, a time range of the frame data F in which the occurrence of events is to be detected in finer detail can be divided into partition data Da of narrow time width, while the other time ranges are divided into partition data Da of wide time width.
 In the example of FIG. 15, in order to detect in more detail the occurrence of events in time ranges closer to the current time, the frame data F is divided such that the time width (unit period) of the partition data becomes logarithmically longer for older time ranges and shorter for newer time ranges.
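 One possible realization of such a division, shown purely as an assumed example since the disclosure does not fix a specific formula, is to make each unit period twice as long as the next newer one:

```python
def log_partition_edges(t_end: float, ts: float, n_partitions: int = 8):
    """Return partition boundaries [t_end - Ts, ..., t_end] where the newest
    partition is the shortest and each older partition is twice as long."""
    total = (1 << n_partitions) - 1                       # 1 + 2 + 4 + ... + 2^(N-1)
    widths = [ts * (1 << k) / total for k in reversed(range(n_partitions))]
    edges = [t_end - ts]
    for w in widths:
        edges.append(edges[-1] + w)
    return edges

print(log_partition_edges(t_end=8.0, ts=8.0, n_partitions=4))
# [0.0, 4.266..., 6.4, 7.466..., 8.0]  -- oldest period widest, newest narrowest
```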
 FIG. 16 is a diagram showing the correspondence between the frame data F and pixel data in the second embodiment of the present disclosure. The frame data F in FIG. 16 contains the same event data as the frame data F shown in FIG. 12. However, as described above, the partition data Da in FIG. 16 differ from the partition data D in FIG. 12 in the time ranges and time widths they occupy within the frame data F. The partition data Da in FIG. 16 therefore contain different event data from the partition data D in FIG. 12 with the same identification number i. For this reason, the pixel data pdc in FIG. 16 differs from the pixel data pd shown in FIG. 12.
 Compared with the pixel data pd shown in FIG. 12, the pixel data pdc in FIG. 16 detects the presence or absence of event data in recent time ranges more sensitively. This makes it possible to track changes in the luminance of a subject in more detail, particularly in recent time ranges, and to identify more precisely the times at which events occurred.
 In this way, in the second embodiment of the present disclosure, the partition data D obtained by dividing the frame data F have different time widths, so that, for example, pixel data can be generated that contain more information about events that occurred at recent times than about events that occurred at older times. As a result, the frame image generated by the information processing device 1 according to the second embodiment can visually express in more detail the occurrence position, time, and frequency of events that occurred at more recent times.
 (Third Embodiment)
 Weighting may be applied to the pixel data output by the pixel data generation unit 52. FIG. 17 is a block diagram showing the detailed configuration of an image generation unit 15a in the third embodiment of the present disclosure. The image generation unit 15a of FIG. 17 has a configuration in which a weighting unit 60 is newly added to the image generation unit 15 of FIG. 8. The weighting unit 60 is disposed between the pixel data generation unit 52 and the frame image generation unit 53.
 The weighting unit 60 applies predetermined weighting to the pixel data. The specific weighting method is arbitrary. For example, the bit value of a specific bit of the pixel data may be inverted, or the bits of the pixel data may be shifted toward the MSB side or the LSB side. The pixel data weighted by the weighting unit 60 is supplied to the frame image generation unit 53.
 FIG. 18 is a flowchart of image generation in the third embodiment of the present disclosure. In FIG. 18, a weighting step (step S20) is added between steps S13 and S17. In step S20, the weighting unit 60 weights the pixel data output from the pixel data generation unit 52. In step S17, the frame image generation unit 53 outputs image data based on the weighted pixel data.
 For example, by shifting all of the pixel data except the newest binary data (for example, the MSB binary data) toward the LSB side, new event data can be emphasized. The time range of interest can also be changed by shifting the binary data of a specific time range toward the MSB side.
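 A minimal sketch of one such weighting, assuming 8-bit presence-only pixel data: the newest bit is kept in place while the older bits are shifted one step toward the LSB, so a recent event dominates the resulting pixel value. This particular rule is an illustrative assumption, not the claimed implementation.

```python
def emphasize_newest(pixel_word: int, n_bits: int = 8) -> int:
    """Keep the MSB (newest unit period) and de-emphasize the older bits."""
    msb_mask = 1 << (n_bits - 1)
    newest = pixel_word & msb_mask            # newest unit period, left in place
    older = (pixel_word & ~msb_mask) >> 1     # older bits shifted toward the LSB
    return newest | older

print(format(emphasize_newest(0b10110010), "08b"))   # -> 10011001
```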
 In this way, the image generation unit 15a in the third embodiment of the present disclosure weights the pixel data as necessary, so the way event information is displayed in the frame image can be varied to suit the characteristics of the events, providing frame images with high visibility. The weighting of pixel data according to the third embodiment can be applied to either of the first and second embodiments.
 (Fourth Embodiment)
 The information processing device 1 of the present disclosure may also have a machine learning function. FIG. 19 is a block diagram showing the configuration of an information processing device 1a according to the fourth embodiment of the present disclosure. The information processing device 1a shown in FIG. 19 includes an application unit 4a. The application unit 4a includes a neural network unit 61, a learning unit 62, and an information processing unit 63.
 The image generation unit 15 (or the image generation unit 15a) generates image data by the processing of FIG. 9 (or FIG. 18). The neural network unit 61 acquires the image data from the image generation unit 15 and stores it internally.
 The learning unit 62 performs learning processing that updates the weights of a neural network used for predetermined information processing based on the image data stored in the neural network unit 61, and generates a trained information processing model. The predetermined information processing includes at least one of tracking, recognition, and motion prediction of an object.
 The information processing unit 63 inputs the image data generated by the image generation unit 15 to the trained neural network unit 61 and performs the predetermined information processing based on the data output from the trained neural network unit 61.
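 As an illustration of the data flow only, and not of any architecture specified by the disclosure, the frame images could be flattened and fed to a small classifier; the logistic-regression-style model and the toy data below are stand-in assumptions for the neural network unit and its training.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed toy data: 100 frame images of 32x32 pixels (8-bit gradation) with binary labels
images = rng.integers(0, 256, size=(100, 32, 32)).astype(np.float32) / 255.0
labels = rng.integers(0, 2, size=100).astype(np.float32)

X = images.reshape(len(images), -1)           # flatten each frame image
w = np.zeros(X.shape[1])                      # weights updated by the "learning unit"
b = 0.0

for _ in range(200):                          # simple gradient-descent training loop
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))    # predicted probabilities
    grad_w = X.T @ (p - labels) / len(X)
    grad_b = np.mean(p - labels)
    w -= 0.5 * grad_w
    b -= 0.5 * grad_b

# Inference on a new frame image (stand-in for the information processing unit)
new_frame = rng.integers(0, 256, size=(32, 32)).astype(np.float32) / 255.0
score = 1.0 / (1.0 + np.exp(-(new_frame.reshape(-1) @ w + b)))
```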
 FIG. 20 is a schematic diagram showing an example of information processing by the information processing unit 63. In FIG. 20, the neural network unit 61 has an information processing model M. The information processing unit 63 performs information processing based on the information processing model M and image data G1 in which subjects H1, H2, and H3 are captured.
 As described with reference to FIG. 7B, the image data of the present disclosure can express the movement of a subject. By incorporating this into the information processing model M, it becomes possible, for example, to identify or predict the movement line L of the subject H1 in the image data G1. Furthermore, with the image data of the present disclosure, even when a plurality of events occur at the same pixel 30, they can be distinguished by differences in gradation. As a result, the events of the subject H2 can be identified even when, for example, the events of the subject H2 and the events of the subject H3 overlap.
 In this way, by learning the weights of a neural network using image data generated by the information processing device 1 according to the first to third embodiments and inputting new image data into the trained neural network, data reflecting the learning results can be output from the neural network, and various kinds of information processing can be performed using that data. This makes it possible to perform information processing such as tracking, recognition, and motion prediction of objects using the image data generated by the information processing device 1 according to the first to third embodiments.
(Application example)
The technology according to the present disclosure can be applied to various products. For example, the technology according to the present disclosure may be realized as a device mounted on any type of moving object, such as an automobile, an electric vehicle, a hybrid electric vehicle, a motorcycle, a bicycle, a personal mobility device, an airplane, a drone, a ship, a robot, a construction machine, or an agricultural machine (tractor).
 Fig. 21 is a block diagram showing a schematic configuration example of a vehicle control system 7000, which is an example of a moving body control system to which the technology according to the present disclosure can be applied. The vehicle control system 7000 includes a plurality of electronic control units connected via a communication network 7010. In the example shown in Fig. 21, the vehicle control system 7000 includes a drive system control unit 7100, a body system control unit 7200, a battery control unit 7300, an outside-vehicle information detection unit 7400, an in-vehicle information detection unit 7500, and an integrated control unit 7600. The communication network 7010 connecting these control units may be, for example, an in-vehicle communication network conforming to any standard such as CAN (Controller Area Network), LIN (Local Interconnect Network), LAN (Local Area Network), or FlexRay (registered trademark).
 Each control unit includes a microcomputer that performs arithmetic processing according to various programs, a storage unit that stores the programs executed by the microcomputer and the parameters used in various calculations, and a drive circuit that drives the devices to be controlled. Each control unit includes a network I/F for communicating with other control units via the communication network 7010, and a communication I/F for communicating with devices or sensors inside and outside the vehicle by wired or wireless communication. In Fig. 21, a microcomputer 7610, a general-purpose communication I/F 7620, a dedicated communication I/F 7630, a positioning unit 7640, a beacon receiving unit 7650, an in-vehicle device I/F 7660, an audio/image output unit 7670, an in-vehicle network I/F 7680, and a storage unit 7690 are illustrated as the functional configuration of the integrated control unit 7600. The other control units similarly include a microcomputer, a communication I/F, a storage unit, and the like.
 The drive system control unit 7100 controls the operation of devices related to the drive system of the vehicle according to various programs. For example, the drive system control unit 7100 functions as a control device for a driving force generating device for generating the driving force of the vehicle, such as an internal combustion engine or a drive motor, a driving force transmission mechanism for transmitting the driving force to the wheels, a steering mechanism for adjusting the steering angle of the vehicle, and a braking device for generating the braking force of the vehicle. The drive system control unit 7100 may also function as a control device such as an ABS (Antilock Brake System) or ESC (Electronic Stability Control).
 A vehicle state detection unit 7110 is connected to the drive system control unit 7100. The vehicle state detection unit 7110 includes, for example, at least one of a gyro sensor that detects the angular velocity of the axial rotational motion of the vehicle body, an acceleration sensor that detects the acceleration of the vehicle, and sensors for detecting the amount of operation of the accelerator pedal, the amount of operation of the brake pedal, the steering angle of the steering wheel, the engine speed, the rotation speed of the wheels, and the like. The drive system control unit 7100 performs arithmetic processing using the signals input from the vehicle state detection unit 7110, and controls the internal combustion engine, the drive motor, the electric power steering device, the braking device, and so on.
 The body system control unit 7200 controls the operation of various devices mounted on the vehicle body according to various programs. For example, the body system control unit 7200 functions as a control device for a keyless entry system, a smart key system, a power window device, or various lamps such as headlamps, back lamps, brake lamps, turn signals, and fog lamps. In this case, radio waves transmitted from a portable device that substitutes for a key, or signals from various switches, can be input to the body system control unit 7200. The body system control unit 7200 accepts the input of these radio waves or signals and controls the door lock device, the power window device, the lamps, and the like of the vehicle.
 The battery control unit 7300 controls a secondary battery 7310, which is the power supply source for the drive motor, according to various programs. For example, information such as the battery temperature, the battery output voltage, or the remaining capacity of the battery is input to the battery control unit 7300 from a battery device including the secondary battery 7310. The battery control unit 7300 performs arithmetic processing using these signals, and controls the temperature regulation of the secondary battery 7310 or a cooling device or the like provided in the battery device.
 The outside-vehicle information detection unit 7400 detects information about the outside of the vehicle equipped with the vehicle control system 7000. For example, at least one of an imaging unit 7410 and an outside-vehicle information detection unit 7420 is connected to the outside-vehicle information detection unit 7400. The imaging unit 7410 includes at least one of a ToF (Time of Flight) camera, a stereo camera, a monocular camera, an infrared camera, and other cameras. The outside-vehicle information detection unit 7420 includes, for example, at least one of an environmental sensor for detecting the current weather or meteorological conditions, and a surrounding information detection sensor for detecting other vehicles, obstacles, pedestrians, and the like around the vehicle equipped with the vehicle control system 7000.
 The environmental sensor may be, for example, at least one of a raindrop sensor that detects rain, a fog sensor that detects fog, a sunshine sensor that detects the degree of sunshine, and a snow sensor that detects snowfall. The surrounding information detection sensor may be at least one of an ultrasonic sensor, a radar device, and a LIDAR (Light Detection and Ranging, Laser Imaging Detection and Ranging) device. The imaging unit 7410 and the outside-vehicle information detection unit 7420 may each be provided as an independent sensor or device, or may be provided as a device in which a plurality of sensors or devices are integrated.
 Here, Fig. 22 shows an example of the installation positions of the imaging unit 7410 and the outside-vehicle information detection unit 7420. Imaging units 7910, 7912, 7914, 7916, and 7918 are provided at, for example, at least one of the positions of the front nose, the side mirrors, the rear bumper, the back door, and the upper part of the windshield inside the cabin of a vehicle 7900. The imaging unit 7910 provided on the front nose and the imaging unit 7918 provided on the upper part of the windshield inside the cabin mainly acquire images of the area ahead of the vehicle 7900. The imaging units 7912 and 7914 provided on the side mirrors mainly acquire images of the sides of the vehicle 7900. The imaging unit 7916 provided on the rear bumper or the back door mainly acquires images of the area behind the vehicle 7900. The imaging unit 7918 provided on the upper part of the windshield inside the cabin is mainly used to detect preceding vehicles, pedestrians, obstacles, traffic lights, traffic signs, lanes, and the like.
 Note that Fig. 22 shows an example of the imaging ranges of the imaging units 7910, 7912, 7914, and 7916. Imaging range a indicates the imaging range of the imaging unit 7910 provided on the front nose, imaging ranges b and c indicate the imaging ranges of the imaging units 7912 and 7914 provided on the side mirrors, and imaging range d indicates the imaging range of the imaging unit 7916 provided on the rear bumper or the back door. For example, by superimposing the image data captured by the imaging units 7910, 7912, 7914, and 7916, an overhead image of the vehicle 7900 viewed from above is obtained.
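 As a rough illustration of how image data from the imaging units 7910, 7912, 7914, and 7916 could be superimposed into such an overhead view, the following sketch warps each camera image onto a common ground plane with a precomputed homography and averages the overlapping regions. The use of OpenCV, the placeholder images, the identity homographies, and the canvas size are assumptions made for this example; in practice the homographies would come from per-camera calibration, which is not described here.

```python
import numpy as np
import cv2

# Placeholder camera images (front, left, right, rear); in practice these
# would be the frames captured by imaging units 7910, 7912, 7914, and 7916.
cameras = [np.full((480, 640, 3), v, dtype=np.uint8) for v in (60, 120, 180, 240)]

# Placeholder ground-plane homographies, normally obtained by calibrating
# each camera against known markers on the road surface.
homographies = [np.eye(3, dtype=np.float64) for _ in cameras]

bev_size = (800, 800)  # width, height of the bird's-eye-view canvas
accumulator = np.zeros((bev_size[1], bev_size[0], 3), dtype=np.float32)
coverage = np.zeros((bev_size[1], bev_size[0], 1), dtype=np.float32)

for image, H in zip(cameras, homographies):
    warped = cv2.warpPerspective(image, H, bev_size)
    mask = (warped.sum(axis=2, keepdims=True) > 0).astype(np.float32)
    accumulator += warped.astype(np.float32) * mask
    coverage += mask

# Average the overlapping regions to obtain the composite overhead image.
overhead = (accumulator / np.maximum(coverage, 1.0)).astype(np.uint8)
```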
 Outside-vehicle information detection units 7920, 7922, 7924, 7926, 7928, and 7930 provided on the front, rear, sides, and corners of the vehicle 7900 and on the upper part of the windshield inside the cabin may be, for example, ultrasonic sensors or radar devices. The outside-vehicle information detection units 7920, 7926, and 7930 provided on the front nose, the rear bumper, the back door, and the upper part of the windshield inside the cabin of the vehicle 7900 may be, for example, LIDAR devices. These outside-vehicle information detection units 7920 to 7930 are mainly used to detect preceding vehicles, pedestrians, obstacles, and the like.
 Returning to Fig. 21, the description will be continued. The outside-vehicle information detection unit 7400 causes the imaging unit 7410 to capture an image of the outside of the vehicle, and receives the captured image data. The outside-vehicle information detection unit 7400 also receives detection information from the connected outside-vehicle information detection unit 7420. When the outside-vehicle information detection unit 7420 is an ultrasonic sensor, a radar device, or a LIDAR device, the outside-vehicle information detection unit 7400 transmits ultrasonic waves, electromagnetic waves, or the like, and receives information on the received reflected waves. The outside-vehicle information detection unit 7400 may perform object detection processing or distance detection processing for people, vehicles, obstacles, signs, characters on the road surface, and the like based on the received information. The outside-vehicle information detection unit 7400 may perform environment recognition processing for recognizing rainfall, fog, road surface conditions, and the like based on the received information. The outside-vehicle information detection unit 7400 may calculate the distance to an object outside the vehicle based on the received information.
 The outside-vehicle information detection unit 7400 may also perform image recognition processing or distance detection processing for recognizing people, vehicles, obstacles, signs, characters on the road surface, and the like based on the received image data. The outside-vehicle information detection unit 7400 may perform processing such as distortion correction or alignment on the received image data, and may synthesize image data captured by different imaging units 7410 to generate an overhead image or a panoramic image. The outside-vehicle information detection unit 7400 may also perform viewpoint conversion processing using image data captured by different imaging units 7410.
 The in-vehicle information detection unit 7500 detects information about the inside of the vehicle. For example, a driver state detection unit 7510 that detects the state of the driver is connected to the in-vehicle information detection unit 7500. The driver state detection unit 7510 may include a camera that images the driver, a biosensor that detects biometric information of the driver, a microphone that collects sound inside the cabin, and the like. The biosensor is provided, for example, on the seat surface or the steering wheel, and detects biometric information of a passenger sitting in the seat or of the driver gripping the steering wheel. The in-vehicle information detection unit 7500 may calculate the degree of fatigue or the degree of concentration of the driver based on the detection information input from the driver state detection unit 7510, or may determine whether the driver is dozing off. The in-vehicle information detection unit 7500 may perform processing such as noise canceling on the collected audio signal.
 The integrated control unit 7600 controls the overall operation of the vehicle control system 7000 according to various programs. An input unit 7800 is connected to the integrated control unit 7600. The input unit 7800 is realized by a device that can be operated by a passenger, such as a touch panel, buttons, a microphone, a switch, or a lever. Data obtained by performing voice recognition on speech input through the microphone may be input to the integrated control unit 7600. The input unit 7800 may be, for example, a remote control device using infrared rays or other radio waves, or an externally connected device such as a mobile phone or a PDA (Personal Digital Assistant) that supports the operation of the vehicle control system 7000. The input unit 7800 may be, for example, a camera, in which case a passenger can input information by gestures. Alternatively, data obtained by detecting the movement of a wearable device worn by a passenger may be input. Furthermore, the input unit 7800 may include, for example, an input control circuit that generates an input signal based on the information input by a passenger or the like using the above-described input unit 7800 and outputs the input signal to the integrated control unit 7600. By operating the input unit 7800, a passenger or the like inputs various data to the vehicle control system 7000 and instructs it to perform processing operations.
 The storage unit 7690 may include a ROM (Read Only Memory) that stores various programs executed by the microcomputer, and a RAM (Random Access Memory) that stores various parameters, calculation results, sensor values, and the like. The storage unit 7690 may also be realized by a magnetic storage device such as an HDD (Hard Disc Drive), a semiconductor storage device, an optical storage device, a magneto-optical storage device, or the like.
 The general-purpose communication I/F 7620 is a general-purpose communication I/F that mediates communication with various devices present in an external environment 7750. The general-purpose communication I/F 7620 may implement a cellular communication protocol such as GSM (registered trademark) (Global System of Mobile communications), WiMAX (registered trademark), LTE (registered trademark) (Long Term Evolution), or LTE-A (LTE-Advanced), or another wireless communication protocol such as wireless LAN (also called Wi-Fi (registered trademark)) or Bluetooth (registered trademark). The general-purpose communication I/F 7620 may connect, for example, to a device (for example, an application server or a control server) present on an external network (for example, the Internet, a cloud network, or an operator-specific network) via a base station or an access point. The general-purpose communication I/F 7620 may also connect to a terminal present in the vicinity of the vehicle (for example, a terminal of the driver, a pedestrian, or a store, or an MTC (Machine Type Communication) terminal) using, for example, P2P (Peer To Peer) technology.
 The dedicated communication I/F 7630 is a communication I/F that supports a communication protocol designed for use in vehicles. The dedicated communication I/F 7630 may implement a standard protocol such as WAVE (Wireless Access in Vehicle Environment), which is a combination of IEEE 802.11p for the lower layer and IEEE 1609 for the upper layer, DSRC (Dedicated Short Range Communications), or a cellular communication protocol. The dedicated communication I/F 7630 typically carries out V2X communication, which is a concept including one or more of vehicle-to-vehicle communication, vehicle-to-infrastructure communication, vehicle-to-home communication, and vehicle-to-pedestrian communication.
 The positioning unit 7640 performs positioning by receiving, for example, GNSS signals from GNSS (Global Navigation Satellite System) satellites (for example, GPS signals from GPS (Global Positioning System) satellites), and generates position information including the latitude, longitude, and altitude of the vehicle. The positioning unit 7640 may identify the current position by exchanging signals with a wireless access point, or may acquire position information from a terminal having a positioning function, such as a mobile phone, a PHS, or a smartphone.
 The beacon receiving unit 7650 receives, for example, radio waves or electromagnetic waves transmitted from radio stations or the like installed on the road, and acquires information such as the current position, traffic congestion, road closures, or required travel time. The function of the beacon receiving unit 7650 may be included in the dedicated communication I/F 7630 described above.
 The in-vehicle device I/F 7660 is a communication interface that mediates connections between the microcomputer 7610 and various in-vehicle devices 7760 present in the vehicle. The in-vehicle device I/F 7660 may establish a wireless connection using a wireless communication protocol such as wireless LAN, Bluetooth (registered trademark), NFC (Near Field Communication), or WUSB (Wireless USB). The in-vehicle device I/F 7660 may also establish a wired connection such as USB (Universal Serial Bus), HDMI (registered trademark) (High-Definition Multimedia Interface), or MHL (Mobile High-definition Link) via a connection terminal (and a cable, if necessary) not shown. The in-vehicle devices 7760 may include, for example, at least one of a mobile device or a wearable device carried by a passenger, and an information device carried into or attached to the vehicle. The in-vehicle devices 7760 may also include a navigation device that searches for a route to an arbitrary destination. The in-vehicle device I/F 7660 exchanges control signals or data signals with these in-vehicle devices 7760.
 The in-vehicle network I/F 7680 is an interface that mediates communication between the microcomputer 7610 and the communication network 7010. The in-vehicle network I/F 7680 transmits and receives signals and the like in accordance with a predetermined protocol supported by the communication network 7010.
 The microcomputer 7610 of the integrated control unit 7600 controls the vehicle control system 7000 according to various programs based on information acquired via at least one of the general-purpose communication I/F 7620, the dedicated communication I/F 7630, the positioning unit 7640, the beacon receiving unit 7650, the in-vehicle device I/F 7660, and the in-vehicle network I/F 7680. For example, the microcomputer 7610 may calculate control target values for the driving force generating device, the steering mechanism, or the braking device based on the acquired information about the inside and outside of the vehicle, and output control commands to the drive system control unit 7100. For example, the microcomputer 7610 may perform cooperative control for the purpose of realizing ADAS (Advanced Driver Assistance System) functions including collision avoidance or impact mitigation of the vehicle, following driving based on the inter-vehicle distance, vehicle speed maintenance driving, vehicle collision warning, vehicle lane departure warning, and the like. The microcomputer 7610 may also perform cooperative control for the purpose of automated driving or the like, in which the vehicle travels autonomously without relying on the driver's operation, by controlling the driving force generating device, the steering mechanism, the braking device, and the like based on the acquired information about the surroundings of the vehicle.
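 As a simplified, non-limiting illustration of the kind of cooperative control the microcomputer 7610 might perform for distance-based following driving, the sketch below computes a bounded target acceleration from the measured gap and relative speed to a preceding vehicle. The gains, time headway, and limits are placeholder values chosen for the example, not parameters given in the present disclosure.

```python
def following_acceleration(gap_m: float, relative_speed_mps: float,
                           ego_speed_mps: float,
                           time_headway_s: float = 1.8,
                           min_gap_m: float = 5.0,
                           k_gap: float = 0.25, k_speed: float = 0.6,
                           accel_limit: float = 2.0) -> float:
    """Return a bounded target acceleration [m/s^2] for distance-based following.

    gap_m: measured distance to the preceding vehicle (e.g. from radar or LIDAR).
    relative_speed_mps: preceding-vehicle speed minus ego speed.
    """
    desired_gap = min_gap_m + time_headway_s * ego_speed_mps
    accel = k_gap * (gap_m - desired_gap) + k_speed * relative_speed_mps
    return max(-accel_limit, min(accel_limit, accel))

# Example: 30 m gap, closing at 2 m/s while driving at 20 m/s.
target_accel = following_acceleration(gap_m=30.0, relative_speed_mps=-2.0,
                                      ego_speed_mps=20.0)
```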
 The microcomputer 7610 may generate three-dimensional distance information between the vehicle and objects such as surrounding structures and people based on information acquired via at least one of the general-purpose communication I/F 7620, the dedicated communication I/F 7630, the positioning unit 7640, the beacon receiving unit 7650, the in-vehicle device I/F 7660, and the in-vehicle network I/F 7680, and may create local map information including information about the surroundings of the vehicle's current position. The microcomputer 7610 may also predict dangers such as a vehicle collision, the approach of a pedestrian, or entry into a closed road based on the acquired information, and generate a warning signal. The warning signal may be, for example, a signal for generating a warning sound or turning on a warning lamp.
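 As one hedged example of such danger prediction, the following sketch derives a time-to-collision estimate from distance and closing speed and raises a warning flag when the estimate falls below a threshold. The threshold value and the function interface are assumptions introduced for this example; the disclosure does not prescribe a specific prediction method.

```python
def time_to_collision_s(distance_m: float, closing_speed_mps: float) -> float:
    """Time to collision; returns infinity when the object is not closing."""
    if closing_speed_mps <= 0.0:
        return float("inf")
    return distance_m / closing_speed_mps

def should_warn(distance_m: float, closing_speed_mps: float,
                threshold_s: float = 2.5) -> bool:
    """True if a warning-sound or warning-lamp signal should be generated."""
    return time_to_collision_s(distance_m, closing_speed_mps) < threshold_s

# Example: pedestrian 12 m ahead, closing at 6 m/s -> TTC = 2 s -> warn.
warn = should_warn(distance_m=12.0, closing_speed_mps=6.0)
```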
 The audio/image output unit 7670 transmits an output signal of at least one of audio and images to an output device capable of visually or audibly notifying passengers of the vehicle or the outside of the vehicle of information. In the example of Fig. 21, an audio speaker 7710, a display unit 7720, and an instrument panel 7730 are illustrated as output devices. The display unit 7720 may include, for example, at least one of an on-board display and a head-up display. The display unit 7720 may have an AR (Augmented Reality) display function. The output device may be a device other than these, such as headphones, a wearable device such as a glasses-type display worn by a passenger, a projector, or a lamp. When the output device is a display device, the display device visually displays the results obtained by the various processes performed by the microcomputer 7610 or the information received from other control units in various formats such as text, images, tables, and graphs. When the output device is an audio output device, the audio output device converts an audio signal consisting of reproduced voice data, acoustic data, or the like into an analog signal and outputs it audibly.
 In the example shown in Fig. 21, at least two control units connected via the communication network 7010 may be integrated into one control unit. Alternatively, each individual control unit may be composed of a plurality of control units. Furthermore, the vehicle control system 7000 may include another control unit not shown. In the above description, some or all of the functions performed by any of the control units may be provided to another control unit. In other words, as long as information is transmitted and received via the communication network 7010, predetermined arithmetic processing may be performed by any of the control units. Similarly, a sensor or device connected to one of the control units may be connected to another control unit, and a plurality of control units may mutually transmit and receive detection information via the communication network 7010.
 Note that a computer program for realizing each function of the event processing unit 3 and the application unit 4 according to the present embodiment described with reference to Fig. 1 can be implemented in any of the control units or the like. It is also possible to provide a computer-readable recording medium storing such a computer program. The recording medium is, for example, a magnetic disk, an optical disc, a magneto-optical disc, a flash memory, or the like. The above computer program may also be distributed, for example, via a network without using a recording medium.
The present technology can be configured as follows.
(1) A pixel having a light detection element that detects an event based on a change in the amount of incident light;
a pixel data generating unit configured to generate pixel data including information on the event detected in each of a plurality of unit periods;
Information processing device.
(2) the pixel data has a pixel value corresponding to a detection time and a detection frequency of the event;
An information processing device as described in (1).
(3) the pixel data has a larger pixel value as the event is detected more recently and as the event is detected more frequently;
An information processing device as described in (2).
(4) The pixel data has a number of gradations or a luminance corresponding to the number of the plurality of unit periods.
An information processing device according to (2) or (3).
(5) The pixel data is bit string data having a plurality of bits,
and information of the events of progressively older unit periods is arranged from the most significant bit side toward the least significant bit side of the bit string data;
An information processing device according to any one of (1) to (4).
(6) The event information is represented by one or more bits for each unit period.
An information processing device according to (5).
(7) The pixel data includes information representing a polarity of the event for each unit period.
An information processing device according to any one of (1) to (6).
(8) The plurality of unit periods have the same time width.
An information processing device according to any one of (1) to (7).
(9) The plurality of unit periods have a longer time width as the unit periods are older.
An information processing device according to any one of (1) to (7).
(10) A plurality of the pixels arranged in a one-dimensional or two-dimensional direction;
a frame generation unit that generates frame data including information on the event that has occurred within a predetermined period, for each frame including the plurality of pixels;
a divided frame generating unit that divides the frame data into a plurality of pieces in a time axis direction to generate a plurality of divided frame data,
the pixel data generation unit generates, for each of the plurality of pixels, the pixel data based on information of the event of the corresponding pixel, the information being included in each of the plurality of divided frame data;
An information processing device according to any one of (1) to (9).
(11) Two pieces of frame data adjacent to each other in the time axis direction include information of the events occurring within an overlapping time range.
An information processing device according to (10).
(12) The overlapping time ranges have a length that is an integer multiple of the unit period.
An information processing device according to (11).
(13) The pixel data generation unit associates each bit of the pixel data with a different one of the divided frame data, and generates a corresponding bit value of the pixel data based on information of the event of a corresponding pixel in the corresponding divided frame data.
An information processing device according to any one of (10) to (12).
(14) The pixel data generation unit generates a bit value on a higher-order bit side of the pixel data based on information of the event of a corresponding pixel in newer divided frame data.
An information processing device according to (13).
(15) The divided frame generation unit divides the frame data into the plurality of divided frame data of the unit period having the same time length.
An information processing device according to any one of (10) to (14).
(16) The divided frame generation unit divides the frame data into the plurality of divided frame data of the unit periods each having a different time length.
An information processing device according to any one of (10) to (14).
(17) The divided frame generation unit shortens the time length of the divided frame data as the time becomes newer.
An information processing device according to (16).
(18) A frame image generating unit that generates a frame image based on a plurality of the pixel data corresponding to the plurality of pixels.
An information processing device according to any one of (10) to (17).
(19) The frame image generating unit generates the frame images such that each pixel has a gradation corresponding to a detection time and a detection frequency of the event, and two frame images adjacent to each other in a time axis direction contain information of the event within an overlapping time range.
An information processing device according to (18).
(20) A learning unit that performs a learning process to update weights of a neural network used for predetermined information processing including at least one of object tracking, object recognition, and object motion prediction, based on the frame images; and
an information processing unit that performs the predetermined information processing based on the neural network that has performed the learning process and the frame images,
An information processing device according to (18) or (19).
 The aspects of the present disclosure are not limited to the individual embodiments described above, but include various modifications that may be conceived by a person skilled in the art, and the effects of the present disclosure are not limited to the above. In other words, various additions, modifications, and partial deletions are possible without departing from the conceptual idea and spirit of the present disclosure derived from the contents defined in the claims and their equivalents.
 1, 1a Information processing device, 2, 2a Sensor, 3 Event processing unit, 4, 4a Application unit, 5 Display unit, 11 Event acquisition unit, 12 Decode unit, 13 Frame generation unit, 14 Event storage unit, 15, 15a Image generation unit, 21 Pixel array unit, 22 Vertical drive unit, 23 Signal processing unit, 30, 30a Pixel, 31 Photoelectric conversion element, 32, 32a Pixel circuit, 33 Charge-to-voltage conversion unit, 34 Buffer, 35 Differential circuit, 36, 36a Quantizer, 37 Logarithmic response unit, 41 Pixel chip, 41a First pixel chip, 41b Second pixel chip, 42 Logic chip, 51 Split frame generation unit, 52 Pixel data generation unit, 53 Frame image generation unit, 54 Mapping unit, 55 Shift unit, 56 Memory unit, 60 Weighting unit, 61 Neural network unit, 62 Learning unit, 63 Information processing unit

Claims (20)

  1.  An information processing device comprising:
      a pixel having a light detection element that detects an event based on a change in the amount of incident light; and
      a pixel data generating unit configured to generate pixel data including information on the event detected in each of a plurality of unit periods.
  2.  The information processing device according to claim 1, wherein the pixel data has a pixel value corresponding to a detection time and a detection frequency of the event.
  3.  The information processing device according to claim 2, wherein the pixel data has a larger pixel value as the event is detected more recently and as the event is detected more frequently.
  4.  The information processing device according to claim 2, wherein the pixel data has a number of gradations or a luminance corresponding to the number of the plurality of unit periods.
  5.  The information processing device according to claim 1, wherein the pixel data is bit string data of a plurality of bits, and information of the events of progressively older unit periods is arranged from the most significant bit side toward the least significant bit side of the bit string data.
  6.  The information processing device according to claim 5, wherein the information of the event is represented by one or more bits for each unit period.
  7.  The information processing device according to claim 1, wherein the pixel data includes, for each unit period, information representing a polarity of the event.
  8.  The information processing device according to claim 1, wherein the plurality of unit periods have the same time width.
  9.  The information processing device according to claim 1, wherein the plurality of unit periods have a longer time width as the unit periods are older.
  10.  The information processing device according to claim 1, further comprising:
      a plurality of the pixels arranged in a one-dimensional or two-dimensional direction;
      a frame generating unit that generates frame data including information on the event that has occurred within a predetermined period, for each frame including the plurality of pixels; and
      a divided frame generating unit that divides the frame data into a plurality of pieces in a time axis direction to generate a plurality of divided frame data,
      wherein the pixel data generating unit generates, for each of the plurality of pixels, the pixel data based on the information of the event of the corresponding pixel included in each of the plurality of divided frame data.
  11.  The information processing device according to claim 10, wherein two pieces of the frame data adjacent to each other in the time axis direction include information of the events occurring within an overlapping time range.
  12.  The information processing device according to claim 11, wherein the overlapping time range has a length that is an integer multiple of the unit period.
  13.  The information processing device according to claim 10, wherein the pixel data generating unit associates each bit of the pixel data with a different one of the divided frame data, and generates the corresponding bit value of the pixel data based on the information of the event of the corresponding pixel in the corresponding divided frame data.
  14.  The information processing device according to claim 13, wherein the pixel data generating unit generates a bit value on the higher-order bit side of the pixel data based on the information of the event of the corresponding pixel in newer divided frame data.
  15.  The information processing device according to claim 10, wherein the divided frame generating unit divides the frame data into the plurality of divided frame data of unit periods having the same time length.
  16.  The information processing device according to claim 10, wherein the divided frame generating unit divides the frame data into the plurality of divided frame data of unit periods each having a different time length.
  17.  The information processing device according to claim 16, wherein the divided frame generating unit makes the time length of the divided frame data shorter for newer times.
  18.  The information processing device according to claim 10, further comprising a frame image generating unit that generates a frame image based on the plurality of pixel data corresponding to the plurality of pixels.
  19.  The information processing device according to claim 18, wherein the frame image generating unit generates the frame images such that each pixel has a gradation corresponding to a detection time and a detection frequency of the event, and such that two frame images adjacent to each other in the time axis direction contain information of the event within an overlapping time range.
  20.  The information processing device according to claim 18, further comprising:
      a learning unit that performs a learning process to update weights of a neural network used for predetermined information processing including at least one of object tracking, recognition, and motion prediction, based on the frame image; and
      an information processing unit that performs the predetermined information processing based on the neural network that has undergone the learning process and the frame image.
PCT/JP2023/039183 2022-11-11 2023-10-31 Information processing device WO2024101210A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2022-181414 2022-11-11
JP2022181414 2022-11-11

Publications (1)

Publication Number Publication Date
WO2024101210A1 true WO2024101210A1 (en) 2024-05-16

Family

ID=91032894

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2023/039183 WO2024101210A1 (en) 2022-11-11 2023-10-31 Information processing device

Country Status (1)

Country Link
WO (1) WO2024101210A1 (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210279890A1 (en) * 2018-11-27 2021-09-09 Omnivision Sensor Solution (Shanghai) Co., Ltd Target tracking method and computing device
JP2022111437A (en) * 2021-01-20 2022-08-01 キヤノン株式会社 Motion vector calculation device, imaging apparatus, and motion vector calculation method
WO2022190598A1 (en) * 2021-03-09 2022-09-15 ソニーグループ株式会社 Information processing apparatus, information processing method, program, and imaging system
