WO2024106132A1 - Solid-state imaging device and information processing system - Google Patents

Solid-state imaging device and information processing system

Info

Publication number
WO2024106132A1
Authority
WO
WIPO (PCT)
Prior art keywords
event data
event
unit
octave
information processing
Prior art date
Application number
PCT/JP2023/037961
Other languages
French (fr)
Japanese (ja)
Inventor
Motonari Honda (本田 元就)
Original Assignee
Sony Semiconductor Solutions Corporation
Priority date
Filing date
Publication date
Application filed by Sony Semiconductor Solutions Corporation
Publication of WO2024106132A1

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N25/00: Circuitry of solid-state image sensors [SSIS]; Control thereof
    • H04N25/47: Image sensors with pixel address output; Event-driven image sensors; Selection of pixels to be read out based on image data

Definitions

  • This disclosure relates to a solid-state imaging device and an information processing system.
  • An event sensor is a sensor for detecting changes in a subject. Such a sensor is also called an EVS (Event-based Vision Sensor).
  • An event sensor can be set to a higher frame rate than an image sensor.
  • An event sensor is realized by a solid-state imaging device such as a CCD (Charge Coupled Device) sensor or a CMOS (Complementary Metal Oxide Semiconductor) sensor.
  • EVSs output event data with a certain spatial resolution.
  • When an information processing system receives this event data and performs information processing such as object recognition, it applies processing such as scaling, filtering, and cropping to the event data in order to extract useful information from it.
  • One method that has been considered is to spatially downsample event data by dropping the least significant bits of the event address information. This reduces the spatial resolution of the event data, thereby preventing increases in data mining costs and suppressing delays in the information processing performed by information processing systems.
  • However, the reduction in spatial resolution may decrease the accuracy of target tasks such as object recognition.
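  • As a minimal illustration of the bit-dropping approach just described, the following sketch halves the spatial resolution of an event address by discarding its least significant address bits. This is an assumption-level sketch; the function name and the integer representation are illustrative and not part of the disclosure.

```python
def downsample_address(x: int, y: int, bits: int = 1):
    """Spatially downsample an event address by dropping its least significant bits.

    Dropping 1 bit maps each 2x2 block of pixel addresses onto a single address,
    i.e. the spatial resolution is halved in each direction.
    """
    return x >> bits, y >> bits

# Example: pixel addresses (10, 21), (11, 20), (11, 21) all map to the coarse address (5, 10).
```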
  • The present disclosure therefore provides a solid-state imaging device and an information processing system that can achieve high resolution while suppressing delays in information processing.
  • The solid-state imaging device according to a first aspect of the present disclosure includes a plurality of pixels that detect an event and output event data indicating the detection result of the event, an event generation unit that treats the event data output from the plurality of pixels as event data of a first octave and generates event data of a second to i-th octave (i is an integer of 2 or more) from the event data of the first octave, and an event output unit that outputs at least a portion of the event data of the first to i-th octaves.
  • the solid-state imaging device of the first aspect may further include an octave information adding unit that adds octave information, which is identification information for the first to i-th octaves, to the event data of the first to i-th octaves, respectively, and the event output unit may output the event data to which the octave information has been added. This makes it possible, for example, to identify which octave a certain event data belongs to.
  • the event output unit may output the event data in an image representation for each octave. This makes it possible to display event data for various octaves as images, for example.
  • The event generation unit may generate event data of the (j+1)-th octave from event data of the j-th octave (j is an integer satisfying 1 ≤ j ≤ i - 1). This makes it possible to sequentially generate event data of various octaves from the event data of the first octave, for example, by generating event data of the second octave from event data of the first octave, and generating event data of the third octave from event data of the second octave.
  • The event generation unit may generate one row of event data of the (j+1)-th octave from m rows (m is an integer equal to or greater than 2) of event data of the j-th octave. This makes it possible to easily generate event data of the (j+1)-th octave from event data of the j-th octave, for example.
  • The event generation unit may generate one row and one column of event data of the (j+1)-th octave from m rows and n columns (n is an integer of 2 or more) of event data of the j-th octave. This makes it possible to more easily generate event data of the (j+1)-th octave from event data of the j-th octave, for example.
  • The event generation unit may fire an event in the one row and one column of event data of the (j+1)-th octave when the m rows and n columns of event data of the j-th octave include k event firings (k is an integer satisfying 1 ≤ k ≤ m × n). This makes it possible to easily replace, for example, m × n regions of the j-th octave with one region of the (j+1)-th octave (a sketch of this rule is given after this summary of aspects).
  • m may be 2
  • n may be 2
  • k may be 1, 2, 3, or 4. This makes it possible to distinguish, for example, between a case where no events are fired in the m × n regions of the j-th octave and a case where at least one event is fired in the m × n regions of the j-th octave.
  • m may be 2
  • n may be 2
  • k may be 2, 3, or 4. This makes it possible to suppress the effects of noise events, for example.
  • m may be 2
  • n may be 2
  • k may be 3 or 4. This makes it possible to suppress the effects of noise events and flicker events, for example.
  • the solid-state imaging device of the first aspect may further include a frame memory that stores the event data output from the plurality of pixels, and the event generating unit may treat the event data output from the frame memory as event data of the first octave. This makes it possible to employ, for example, an arbiter-type event sensor as the solid-state imaging device.
  • the information processing system is an information processing system including a solid-state imaging device and an information processing unit, in which the solid-state imaging device includes a plurality of pixels that detect an event and output event data indicating the detection result of the event, an event generation unit that treats the event data output from the plurality of pixels as event data of a first octave and generates event data of a second to i-th octave (i is an integer of 2 or more) from the event data of the first octave, and an event output unit that outputs at least a portion of the event data of the first to i-th octave, and the information processing unit displays the event data output from the event output unit on a display screen.
  • the information processing unit may include an extraction unit that extracts the event data of a predetermined number of octaves from the event data output from the event output unit, and the information processing unit may display the event data extracted by the extraction unit on the display screen. This makes it possible, for example, to use the event data of various octaves output from a solid-state imaging device for display on an octave basis.
  • the information processing unit may include an extraction unit that extracts the event data of a predetermined number of octaves from the event data output from the event output unit, and the information processing unit may perform image recognition using the event data extracted by the extraction unit.
  • the image recognition may be user gesture recognition. This makes it possible, for example, to use event data of various octaves output from a solid-state imaging device for gesture recognition for each octave.
  • the information processing unit may include a selection unit that selects the event data for a number of octaves specified by a user from the event data output from the event output unit, and the information processing unit may record the event data selected by the selection unit on a recording medium. This makes it possible to record, for example, event data for various octaves output from a solid-state imaging device for each octave.
  • the information processing system may be an electronic device including the solid-state imaging device and the information processing unit. This makes it possible, for example, to perform the output of event data and the subsequent information processing in the same electronic device.
  • the electronic device may further include a display unit having the display screen. This makes it possible, for example, to output the event data and display the event data on the same electronic device.
  • the information processing system may include an electronic device including the information processing unit, and an imaging device that is provided outside the electronic device and includes the solid-state imaging device. This makes it possible, for example, to perform information processing using event data in an electronic device outside the imaging device.
  • the information processing system may further include a display device provided outside the electronic device and having the display screen. This makes it possible, for example, to display event data on an electronic device external to the imaging device.
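  • The aggregation rule referenced above (firing one event of the (j+1)-th octave when at least k of the m × n regions of the j-th octave have fired) can be sketched as follows. This is an illustrative sketch only; it assumes event firings are held as a two-dimensional boolean grid, and the names block_fires and downsample_octave are hypothetical, not part of the disclosure.

```python
from typing import List

def block_fires(block: List[List[bool]], k: int) -> bool:
    """Fire one (j+1)-th octave event when at least k events fired in an m x n block."""
    return sum(cell for row in block for cell in row) >= k

def downsample_octave(grid: List[List[bool]], m: int = 2, n: int = 2, k: int = 1) -> List[List[bool]]:
    """Generate (j+1)-th octave event data from j-th octave event data.

    Each m x n region of the j-th octave is replaced by one region of the
    (j+1)-th octave. The grid dimensions are assumed to be multiples of m and n.
    """
    return [
        [block_fires([row[x:x + n] for row in grid[y:y + m]], k)
         for x in range(0, len(grid[0]), n)]
        for y in range(0, len(grid), m)
    ]
```

  • With m = n = 2, choosing k = 1 keeps any activity in a 2 × 2 region, while k = 2 or larger trades sensitivity for robustness against noise events, and k = 3 or larger additionally against flicker events, mirroring the variants listed above.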
  • FIG. 1 is a block diagram showing the configuration of a vehicle 1 according to the first embodiment.
  • FIG. 2 is a plan view showing the sensing area of the vehicle 1 according to the first embodiment.
  • FIG. 3 is a block diagram showing the configuration of a solid-state imaging device 100 according to the first embodiment.
  • FIG. 4 is a diagram for explaining a pixel array 101 according to the first embodiment.
  • FIG. 5 is a diagram for explaining the operation of the solid-state imaging device 100 according to the first embodiment.
  • FIG. 6 is a diagram for explaining the operation of an event generation unit 103 in the first embodiment.
  • FIG. 7 is another diagram for explaining the operation of the event generation unit 103 in the first embodiment.
  • FIG. 8 is another diagram for explaining the operation of the event generation unit 103 in the first embodiment.
  • FIG. 9 is a diagram for explaining the operation of an event output unit 105 according to the first embodiment.
  • FIG. 10 is a diagram illustrating an example of an electronic device 200 according to the first embodiment.
  • FIG. 11 is a diagram illustrating another example of the electronic device 200 according to the first embodiment.
  • FIG. 12 is a diagram for explaining details of the electronic device 200 shown in FIG. 11.
  • FIG. 13 is a block diagram showing the configuration of a solid-state imaging device 100 according to a second embodiment.
  • FIG. 14 is a diagram for explaining the operation of an event generation unit 103 in the second embodiment.
  • FIG. 15 is a perspective view illustrating a schematic configuration of a solid-state imaging device 100 according to a third embodiment.
  • FIG. 16 is a plan view illustrating a schematic configuration of a light-receiving chip 120 according to the third embodiment.
  • FIG. 17 is a plan view illustrating a schematic configuration of a detection chip 130 according to the third embodiment.
  • FIG. 18 is a circuit diagram showing the configuration of each address event detection circuit 131a according to the third embodiment.
  • FIG. 19 is a circuit diagram showing the configuration of a current-voltage conversion circuit 310 according to the third embodiment.
  • FIG. 20 is a circuit diagram showing the configurations of a subtractor 330 and a quantizer 340 according to the third embodiment.
  • FIG. 21 is a circuit diagram showing the configurations of the light-receiving chip 120 and the detection chip 130 according to a modified example of the third embodiment.
  • Vehicle 1 of the first embodiment: FIG. 1 is a block diagram showing the configuration of the vehicle 1 according to the first embodiment.
  • Fig. 1 shows an example of the configuration of a vehicle control system 11, which is an example of a mobility device control system.
  • the vehicle control system 11 is installed in the vehicle 1 and performs processing related to driving assistance and autonomous driving of the vehicle 1.
  • The vehicle control system 11 includes a vehicle control ECU (Electronic Control Unit) 21, a communication unit 22, a map information storage unit 23, a position information acquisition unit 24, an external recognition sensor 25, an in-vehicle sensor 26, a vehicle sensor 27, a storage unit 31, a driving assistance/automated driving control unit 32, a DMS (Driver Monitoring System) 33, an HMI (Human Machine Interface) 34, and a vehicle control unit 35.
  • The vehicle control ECU 21, the communication unit 22, the map information storage unit 23, the position information acquisition unit 24, the external recognition sensor 25, the in-vehicle sensor 26, the vehicle sensor 27, the storage unit 31, the driving assistance/automated driving control unit 32, the DMS 33, the HMI 34, and the vehicle control unit 35 are connected to one another so as to be able to communicate with each other via a communication network 41.
  • the communication network 41 is composed of an in-vehicle communication network or bus that complies with a digital two-way communication standard such as CAN (Controller Area Network), LIN (Local Interconnect Network), LAN (Local Area Network), FlexRay (registered trademark), or Ethernet (registered trademark).
  • the communication network 41 may be used differently depending on the type of data being transmitted.
  • CAN may be applied to data related to vehicle control
  • Ethernet may be applied to large-volume data.
  • each part of the vehicle control system 11 may be directly connected without going through the communication network 41, using wireless communication intended for communication over relatively short distances, such as near field communication (NFC) or Bluetooth (registered trademark).
  • the vehicle control ECU 21 is configured with various processors such as a CPU (Central Processing Unit), an MPU (Micro Processing Unit), etc.
  • the vehicle control ECU 21 controls the entire or part of the functions of the vehicle control system 11.
  • the communication unit 22 communicates with various devices inside and outside the vehicle, other vehicles, servers, base stations, etc., and transmits and receives various types of data. At this time, the communication unit 22 can communicate using a plurality of communication methods.
  • the communication unit 22 communicates with servers (hereinafter referred to as external servers) on an external network via base stations or access points using wireless communication methods such as 5G (fifth generation mobile communication system), LTE (Long Term Evolution), and DSRC (Dedicated Short Range Communications).
  • the external network with which the communication unit 22 communicates is, for example, the Internet, a cloud network, or an operator-specific network.
  • The communication method that the communication unit 22 uses with the external network is not particularly limited as long as it is a wireless communication method that allows digital two-way communication at a communication speed equal to or higher than a predetermined speed.
  • the communication unit 22 can communicate with a terminal present in the vicinity of the vehicle using P2P (Peer To Peer) technology.
  • the terminal present in the vicinity of the vehicle can be, for example, a terminal attached to a mobile object moving at a relatively slow speed, such as a pedestrian or a bicycle, a terminal installed at a fixed position in a store, or an MTC (Machine Type Communication) terminal.
  • the communication unit 22 can also perform V2X communication.
  • V2X communication refers to communication between the vehicle and others, such as vehicle-to-vehicle communication with other vehicles, vehicle-to-infrastructure communication with roadside devices, vehicle-to-home communication with a home, and vehicle-to-pedestrian communication with a terminal carried by a pedestrian, etc.
  • the communication unit 22 can, for example, receive from the outside a program for updating the software that controls the operation of the vehicle control system 11 (Over the Air).
  • the communication unit 22 can further receive map information, traffic information, information about the surroundings of the vehicle 1, etc. from the outside.
  • the communication unit 22 can also transmit information about the vehicle 1 and information about the surroundings of the vehicle 1 to the outside.
  • Information about the vehicle 1 that the communication unit 22 transmits to the outside includes, for example, data indicating the state of the vehicle 1, the recognition results by the recognition unit 73, etc.
  • the communication unit 22 performs communication corresponding to a vehicle emergency notification system such as e-Call.
  • the communication unit 22 receives electromagnetic waves transmitted by a road traffic information and communication system (VICS (Vehicle Information and Communication System) (registered trademark)) such as a radio beacon, optical beacon, or FM multiplex broadcasting.
  • the communication unit 22 can communicate with each device in the vehicle using, for example, wireless communication.
  • the communication unit 22 can perform wireless communication with each device in the vehicle using a communication method that allows digital two-way communication at a communication speed equal to or higher than a predetermined speed via wireless communication, such as wireless LAN, Bluetooth, NFC, or WUSB (Wireless USB).
  • the communication unit 22 can also communicate with each device in the vehicle using wired communication.
  • the communication unit 22 can communicate with each device in the vehicle using wired communication via a cable connected to a connection terminal (not shown).
  • the communication unit 22 can communicate with each device in the vehicle using a communication method that allows digital two-way communication at a communication speed equal to or higher than a predetermined speed via wired communication, such as USB (Universal Serial Bus), HDMI (High-Definition Multimedia Interface) (registered trademark), or MHL (Mobile High-definition Link).
  • devices in the vehicle refers to devices that are not connected to the communication network 41 in the vehicle.
  • Examples of devices in the vehicle include mobile devices and wearable devices carried by passengers such as the driver, and information devices that are brought into the vehicle and temporarily installed.
  • the map information storage unit 23 stores one or both of a map acquired from an external source and a map created by the vehicle 1.
  • the map information storage unit 23 stores a three-dimensional high-precision map, a global map that has lower precision than a high-precision map and covers a wide area, and the like.
  • High-precision maps include, for example, dynamic maps, point cloud maps, and vector maps.
  • a dynamic map is, for example, a map consisting of four layers of dynamic information, semi-dynamic information, semi-static information, and static information, and is provided to the vehicle 1 from an external server or the like.
  • a point cloud map is a map composed of a point cloud (point group data).
  • a vector map is, for example, a map that associates traffic information such as the positions of lanes and traffic lights with a point cloud map, and is adapted for ADAS (Advanced Driver Assistance System) and AD (Autonomous Driving).
  • the point cloud map and vector map may be provided, for example, from an external server, or may be created by the vehicle 1 based on sensing results from the camera 51, radar 52, LiDAR 53, etc. as a map for matching with a local map described below, and stored in the map information storage unit 23.
  • Map data of, for example, an area of several hundred meters square along the planned route on which the vehicle 1 will travel is acquired from the external server or the like in order to reduce the amount of communication.
  • the position information acquisition unit 24 receives GNSS signals from Global Navigation Satellite System (GNSS) satellites and acquires position information of the vehicle 1.
  • the acquired position information is supplied to the driving assistance/automated driving control unit 32.
  • the position information acquisition unit 24 is not limited to a method using GNSS signals, and may acquire position information using a beacon, for example.
  • the external recognition sensor 25 includes various sensors used to recognize the situation outside the vehicle 1, and supplies sensor data from each sensor to each unit of the vehicle control system 11.
  • the type and number of sensors included in the external recognition sensor 25 are arbitrary.
  • the external recognition sensor 25 includes a camera 51, a radar 52, a LiDAR (Light Detection and Ranging, Laser Imaging Detection and Ranging) 53, and an ultrasonic sensor 54.
  • the external recognition sensor 25 may be configured to include one or more types of sensors among the camera 51, the radar 52, the LiDAR 53, and the ultrasonic sensor 54.
  • the number of cameras 51, radars 52, LiDAR 53, and ultrasonic sensors 54 is not particularly limited as long as it is a number that can be realistically installed on the vehicle 1.
  • the types of sensors included in the external recognition sensor 25 are not limited to this example, and the external recognition sensor 25 may include other types of sensors. Examples of the sensing areas of each sensor included in the external recognition sensor 25 will be described later.
  • the imaging method of camera 51 is not particularly limited.
  • cameras of various imaging methods such as a ToF (Time Of Flight) camera, a stereo camera, a monocular camera, and an infrared camera, which are imaging methods capable of distance measurement, can be applied to camera 51 as necessary.
  • Without being limited to this, the camera 51 may simply be used to obtain a captured image, regardless of distance measurement.
  • the external recognition sensor 25 can be equipped with an environmental sensor for detecting the environment relative to the vehicle 1.
  • the environmental sensor is a sensor for detecting the environment such as the weather, climate, brightness, etc., and can include various sensors such as a raindrop sensor, a fog sensor, a sunlight sensor, a snow sensor, an illuminance sensor, etc.
  • the external recognition sensor 25 includes a microphone that is used to detect sounds around the vehicle 1 and the location of sound sources.
  • the in-vehicle sensor 26 includes various sensors for detecting information inside the vehicle, and supplies sensor data from each sensor to each unit of the vehicle control system 11.
  • the types and number of the various sensors included in the in-vehicle sensor 26 are not particularly limited as long as they are of the types and number that can be realistically installed in the vehicle 1.
  • the in-vehicle sensor 26 may be equipped with one or more types of sensors including a camera, radar, a seating sensor, a steering wheel sensor, a microphone, and a biometric sensor.
  • the camera equipped in the in-vehicle sensor 26 may be a camera using various imaging methods capable of measuring distances, such as a ToF camera, a stereo camera, a monocular camera, or an infrared camera. Without being limited to this, the camera equipped in the in-vehicle sensor 26 may be a camera simply for acquiring captured images, regardless of distance measurement.
  • the biometric sensor equipped in the in-vehicle sensor 26 is provided, for example, on a seat, steering wheel, etc., and detects various types of biometric information of passengers such as the driver.
  • the vehicle sensor 27 includes various sensors for detecting the state of the vehicle 1, and supplies sensor data from each sensor to each unit of the vehicle control system 11.
  • the types and number of the various sensors included in the vehicle sensor 27 are not particularly limited as long as they are types and numbers that can be realistically installed in the vehicle 1.
  • the vehicle sensor 27 includes a speed sensor, an acceleration sensor, an angular velocity sensor (gyro sensor), and an inertial measurement unit (IMU) that integrates these.
  • the vehicle sensor 27 includes a steering angle sensor that detects the steering angle of the steering wheel, a yaw rate sensor, an accelerator sensor that detects the amount of accelerator pedal operation, and a brake sensor that detects the amount of brake pedal operation.
  • the vehicle sensor 27 includes a rotation sensor that detects the number of rotations of the engine or motor, an air pressure sensor that detects the air pressure of the tires, a slip ratio sensor that detects the slip ratio of the tires, and a wheel speed sensor that detects the rotation speed of the wheels.
  • the vehicle sensor 27 includes a battery sensor that detects the remaining charge and temperature of the battery, and an impact sensor that detects external impacts.
  • the storage unit 31 includes at least one of a non-volatile storage medium and a volatile storage medium, and stores data and programs.
  • The storage unit 31 includes, for example, an electrically erasable programmable read only memory (EEPROM) and a random access memory (RAM); as the storage medium, a magnetic storage device such as a hard disk drive (HDD), a semiconductor storage device, an optical storage device, or a magneto-optical storage device can be used.
  • the storage unit 31 stores various programs and data used by each part of the vehicle control system 11.
  • the storage unit 31 includes an event data recorder (EDR) and a data storage system for automated driving (DSSAD), and stores information on the vehicle 1 before and after an event such as an accident, and information acquired by the in-vehicle sensor 26.
  • the driving assistance/automatic driving control unit 32 controls driving assistance and automatic driving of the vehicle 1.
  • the driving assistance/automatic driving control unit 32 includes an analysis unit 61, an action planning unit 62, and an operation control unit 63.
  • the analysis unit 61 performs analysis processing of the vehicle 1 and the surrounding conditions.
  • the analysis unit 61 includes a self-position estimation unit 71, a sensor fusion unit 72, and a recognition unit 73.
  • the self-position estimation unit 71 estimates the self-position of the vehicle 1 based on the sensor data from the external recognition sensor 25 and the high-precision map stored in the map information storage unit 23. For example, the self-position estimation unit 71 generates a local map based on the sensor data from the external recognition sensor 25, and estimates the self-position of the vehicle 1 by matching the local map with the high-precision map.
  • The position of the vehicle 1 is based on, for example, the center of the rear axle.
  • the local map is, for example, a three-dimensional high-precision map or an occupancy grid map created using technology such as SLAM (Simultaneous Localization and Mapping).
  • the three-dimensional high-precision map is, for example, the point cloud map described above.
  • the occupancy grid map is a map in which the three-dimensional or two-dimensional space around the vehicle 1 is divided into grids of a predetermined size, and the occupancy state of objects is shown on a grid-by-grid basis.
  • the occupancy state of objects is indicated, for example, by the presence or absence of an object and the probability of its existence.
  • the local map is also used, for example, in detection processing and recognition processing of the situation outside the vehicle 1 by the recognition unit 73.
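  • As a rough illustration of the occupancy grid map described above, the sketch below divides the two-dimensional space around the vehicle into grid cells of a predetermined size and keeps an occupancy probability per cell. The cell size, the initial "unknown" probability, and the update step are arbitrary assumptions for illustration.

```python
class OccupancyGrid:
    """Minimal 2-D occupancy grid: the space is divided into fixed-size cells."""

    def __init__(self, width_m: float, height_m: float, cell_m: float):
        self.cell_m = cell_m
        self.cols = int(width_m / cell_m)
        self.rows = int(height_m / cell_m)
        # Occupancy state per cell, expressed as a probability (0.5 = unknown).
        self.prob = [[0.5] * self.cols for _ in range(self.rows)]

    def observe(self, x_m: float, y_m: float, occupied: bool) -> None:
        """Update the occupancy probability of the cell containing (x_m, y_m)."""
        col = int(x_m / self.cell_m)
        row = int(y_m / self.cell_m)
        p = self.prob[row][col]
        self.prob[row][col] = min(p + 0.2, 1.0) if occupied else max(p - 0.2, 0.0)
```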
  • the self-position estimation unit 71 may estimate the self-position of the vehicle 1 based on the position information acquired by the position information acquisition unit 24 and the sensor data from the vehicle sensor 27.
  • the sensor fusion unit 72 performs sensor fusion processing to combine multiple different types of sensor data (e.g., image data supplied from the camera 51 and sensor data supplied from the radar 52) to obtain new information.
  • Methods for combining different types of sensor data include integration, fusion, and association.
  • the recognition unit 73 executes a detection process to detect the situation outside the vehicle 1, and a recognition process to recognize the situation outside the vehicle 1.
  • the recognition unit 73 performs detection and recognition processing of the situation outside the vehicle 1 based on information from the external recognition sensor 25, information from the self-position estimation unit 71, information from the sensor fusion unit 72, etc.
  • the recognition unit 73 performs detection processing and recognition processing of objects around the vehicle 1.
  • Object detection processing is, for example, processing to detect the presence or absence, size, shape, position, movement, etc. of an object.
  • Object recognition processing is, for example, processing to recognize attributes such as the type of object, and to identify a specific object.
  • detection processing and recognition processing are not necessarily clearly separated, and there may be overlap.
  • the recognition unit 73 detects objects around the vehicle 1 by performing clustering to classify a point cloud based on sensor data from the radar 52, the LiDAR 53, or the like into clusters of points. This allows the presence or absence, size, shape, and position of objects around the vehicle 1 to be detected.
  • the recognition unit 73 detects the movement of objects around the vehicle 1 by performing tracking to follow the movement of clusters of point clouds classified by clustering. This allows the speed and direction of travel (movement vector) of objects around the vehicle 1 to be detected.
  • the recognition unit 73 detects or recognizes vehicles, people, bicycles, obstacles, structures, roads, traffic lights, traffic signs, road markings, etc. based on image data supplied from the camera 51.
  • the recognition unit 73 may also recognize the types of objects around the vehicle 1 by performing recognition processing such as semantic segmentation.
  • the recognition unit 73 can perform recognition processing of traffic rules around the vehicle 1 based on the map stored in the map information storage unit 23, the result of self-location estimation by the self-location estimation unit 71, and the result of recognition of objects around the vehicle 1 by the recognition unit 73. Through this processing, the recognition unit 73 can recognize the positions and states of traffic lights, the contents of traffic signs and road markings, the contents of traffic regulations, and lanes on which travel is possible, etc.
  • the recognition unit 73 can perform recognition processing of the environment around the vehicle 1.
  • the surrounding environment that the recognition unit 73 recognizes may include weather, temperature, humidity, brightness, and road surface conditions.
  • the behavior planning unit 62 creates a behavior plan for the vehicle 1. For example, the behavior planning unit 62 creates the behavior plan by performing route planning and route following processing.
  • Global path planning is a process of planning a rough route from the start to the goal. Route planning also includes trajectory planning (local path planning), which takes into account the motion characteristics of the vehicle 1 on the planned route and generates a trajectory that allows safe and smooth progress in the vicinity of the vehicle 1.
  • Path following is a process of planning operations for traveling safely and accurately along a route planned by a route plan within a planned time.
  • the action planning unit 62 can, for example, calculate the target speed and target angular velocity of the vehicle 1 based on the results of this path following process.
  • the operation control unit 63 controls the operation of the vehicle 1 to realize the action plan created by the action planning unit 62.
  • the operation control unit 63 controls the steering control unit 81, the brake control unit 82, and the drive control unit 83 included in the vehicle control unit 35 described below, and performs acceleration/deceleration control and directional control so that the vehicle 1 proceeds along the trajectory calculated by the trajectory plan.
  • the operation control unit 63 performs cooperative control aimed at realizing ADAS functions such as collision avoidance or impact mitigation, following driving, maintaining vehicle speed, collision warning for the vehicle itself, and lane departure warning for the vehicle itself.
  • the operation control unit 63 performs cooperative control aimed at automatic driving, which drives autonomously without the driver's operation.
  • the DMS 33 performs authentication processing of the driver and recognition processing of the driver's state based on the sensor data from the in-vehicle sensor 26 and input data input to the HMI 34 (described later), etc.
  • Examples of the driver's state to be recognized include physical condition, alertness level, concentration level, fatigue level, line of sight direction, level of intoxication, driving operation, posture, etc.
  • the DMS 33 may also perform authentication processing for passengers other than the driver and recognition processing for the status of the passengers.
  • the DMS 33 may also perform recognition processing for the situation inside the vehicle based on sensor data from the in-vehicle sensor 26. Examples of the situation inside the vehicle that may be recognized include temperature, humidity, brightness, odor, etc.
  • HMI 34: The HMI 34 inputs various data and instructions, and presents various data to the driver and other occupants.
  • the HMI 34 is equipped with an input device that allows a person to input data.
  • the HMI 34 generates input signals based on data and instructions input by the input device, and supplies the signals to each part of the vehicle control system 11.
  • the HMI 34 is equipped with input devices such as a touch panel, buttons, switches, and levers. Without being limited to these, the HMI 34 may further be equipped with an input device that allows information to be input by a method other than manual operation, such as voice or gestures.
  • the HMI 34 may use, as an input device, an externally connected device such as a remote control device that uses infrared rays or radio waves, or a mobile device or wearable device that supports the operation of the vehicle control system 11.
  • the HMI 34 generates visual information, auditory information, and tactile information for the occupant or the outside of the vehicle.
  • the HMI 34 also performs output control to control the output, output content, output timing, output method, etc. of each piece of generated information.
  • the HMI 34 generates and outputs, as visual information, information indicated by images or light, such as an operation screen, a status display of the vehicle 1, a warning display, and a monitor image showing the situation around the vehicle 1.
  • the HMI 34 also generates and outputs, as auditory information, information indicated by sounds, such as voice guidance, warning sounds, and warning messages.
  • the HMI 34 also generates and outputs, as tactile information, information that is imparted to the occupant's sense of touch by, for example, force, vibration, movement, etc.
  • the output device from which the HMI 34 outputs visual information may be, for example, a display device that presents visual information by displaying an image itself, or a projector device that presents visual information by projecting an image.
  • the display device may be a device that displays visual information within the field of vision of the passenger, such as a head-up display, a transmissive display, or a wearable device with an AR (Augmented Reality) function, in addition to a display device having a normal display.
  • the HMI 34 may also use display devices such as a navigation device, an instrument panel, a CMS (Camera Monitoring System), an electronic mirror, or lamps provided in the vehicle 1 as output devices that output visual information.
  • the output device through which the HMI 34 outputs auditory information can be, for example, an audio speaker, headphones, or earphones.
  • Haptic elements using haptic technology can be used as output devices for the HMI 34 to output haptic information.
  • Haptic elements are provided on parts of the vehicle 1 that are in contact with passengers, such as the steering wheel and the seat.
  • the vehicle control unit 35 controls each unit of the vehicle 1.
  • the vehicle control unit 35 includes a steering control unit 81, a brake control unit 82, a drive control unit 83, a body system control unit 84, a light control unit 85, and a horn control unit 86.
  • the steering control unit 81 detects and controls the state of the steering system of the vehicle 1.
  • the steering system includes, for example, a steering mechanism including a steering wheel, an electric power steering, etc.
  • the steering control unit 81 includes, for example, a steering ECU that controls the steering system, an actuator that drives the steering system, etc.
  • the brake control unit 82 detects and controls the state of the brake system of the vehicle 1.
  • the brake system includes, for example, a brake mechanism including a brake pedal, an ABS (Antilock Brake System), a regenerative brake mechanism, etc.
  • the brake control unit 82 includes, for example, a brake ECU that controls the brake system, and an actuator that drives the brake system.
  • the drive control unit 83 detects and controls the state of the drive system of the vehicle 1.
  • the drive system includes, for example, an accelerator pedal, a drive force generating device for generating drive force such as an internal combustion engine or a drive motor, and a drive force transmission mechanism for transmitting the drive force to the wheels.
  • the drive control unit 83 includes, for example, a drive ECU for controlling the drive system, and an actuator for driving the drive system.
  • the body system control unit 84 detects and controls the state of the body system of the vehicle 1.
  • the body system includes, for example, a keyless entry system, a smart key system, a power window device, a power seat, an air conditioning system, an airbag, a seat belt, a shift lever, etc.
  • the body system control unit 84 includes, for example, a body system ECU that controls the body system, an actuator that drives the body system, etc.
  • the light control unit 85 detects and controls the state of various lights of the vehicle 1. Examples of lights to be controlled include headlights, backlights, fog lights, turn signals, brake lights, projection, and bumper displays.
  • the light control unit 85 includes a light ECU that controls the lights, an actuator that drives the lights, and the like.
  • the horn control unit 86 detects and controls the state of the car horn of the vehicle 1.
  • the horn control unit 86 includes, for example, a horn ECU that controls the car horn, an actuator that drives the car horn, etc.
  • FIG. 2 is a plan view showing the sensing area of the vehicle 1 in the first embodiment.
  • FIG. 2 shows an example of the sensing area of the camera 51, radar 52, LiDAR 53, ultrasonic sensor 54, etc. of the external recognition sensor 25 in FIG. 1.
  • FIG. 2 shows a schematic view of the vehicle 1 as seen from above, with the left end side being the front end of the vehicle 1 and the right end side being the rear end of the vehicle 1.
  • Sensing area 1-1F and sensing area 1-1B are examples of sensing areas of the ultrasonic sensors 54.
  • the sensing area 1-1F covers the periphery of the front end of the vehicle 1 with a plurality of ultrasonic sensors 54.
  • the sensing area 1-1B covers the periphery of the rear end of the vehicle 1 with a plurality of ultrasonic sensors 54.
  • sensing results in sensing area 1-1F and sensing area 1-1B are used, for example, for parking assistance for vehicle 1.
  • Sensing area 1-2F to sensing area 1-2B show examples of sensing areas of a short-range or medium-range radar 52.
  • Sensing area 1-2F covers a position farther in front of the vehicle 1 than sensing area 1-1F.
  • Sensing area 1-2B covers a position farther in the rear of the vehicle 1 than sensing area 1-1B.
  • Sensing area 1-2L covers the rear periphery of the left side of the vehicle 1.
  • Sensing area 1-2R covers the rear periphery of the right side of the vehicle 1.
  • sensing results in sensing area 1-2F are used, for example, to detect vehicles, pedestrians, etc. in front of vehicle 1.
  • the sensing results in sensing area 1-2B are used, for example, for collision prevention functions behind vehicle 1.
  • the sensing results in sensing area 1-2L and sensing area 1-2R are used, for example, to detect objects in blind spots to the sides of vehicle 1.
  • Sensing area 1-3F to sensing area 1-3B show examples of sensing areas sensed by camera 51. Sensing area 1-3F covers a position farther in front of vehicle 1 than sensing area 1-2F. Sensing area 1-3B covers a position farther in the rear of vehicle 1 than sensing area 1-2B. Sensing area 1-3L covers the periphery of the left side of vehicle 1. Sensing area 1-3R covers the periphery of the right side of vehicle 1.
  • the sensing results in sensing area 1-3F can be used, for example, for recognizing traffic lights and traffic signs, lane departure prevention support systems, and automatic headlight control systems.
  • the sensing results in sensing area 1-3B can be used, for example, for parking assistance and surround view systems.
  • the sensing results in sensing area 1-3L and sensing area 1-3R can be used, for example, for surround view systems.
  • a sensing area 1-4 shows an example of a sensing area of the LiDAR 53.
  • the sensing area 1-4 covers a position farther in front of the vehicle 1 than the sensing area 1-3F.
  • the sensing area 1-4 has a narrower range in the left-right direction than the sensing area 1-3F.
  • the sensing results in sensing areas 1-4 are used, for example, to detect objects such as nearby vehicles.
  • a sensing area 1-5 shows an example of a sensing area of a long-range radar 52.
  • the sensing area 1-5 covers a position farther ahead of the vehicle 1 than the sensing area 1-4.
  • the sensing area 1-5 has a narrower range in the left-right direction than the sensing area 1-4.
  • sensing results in sensing areas 1-5 are used, for example, for ACC (Adaptive Cruise Control), emergency braking, collision avoidance, etc.
  • the sensing areas of the cameras 51, radar 52, LiDAR 53, and ultrasonic sensors 54 included in the external recognition sensor 25 may have various configurations other than those shown in FIG. 2. Specifically, the ultrasonic sensor 54 may also sense the sides of the vehicle 1, and the LiDAR 53 may sense the rear of the vehicle 1.
  • the installation positions of the sensors are not limited to the examples described above. The number of sensors may be one or more.
  • FIG. 3 is a block diagram showing the configuration of the solid-state imaging device 100 according to the first embodiment.
  • the solid-state imaging device 100 is provided in the vehicle 1 shown in FIG. 1, and is included in the external recognition sensor 25, for example.
  • the solid-state imaging device 100 is an EVS for detecting changes in a subject. Examples of subjects include a person, a vehicle, and an obstacle in front of the vehicle 1.
  • the solid-state imaging device 100 may be built into an electronic device 200 such as a smartphone, as in an example described later, or may be electrically connected to an electronic device 200 such as a game console (see FIGS. 10 and 11).
  • the solid-state imaging device 100 includes a pixel array 101, an event acquisition unit 102, an event generation unit 103, an event synthesis unit 104, and an event output unit 105.
  • the pixel array 101 includes a plurality of pixels 101a.
  • the event generation unit 103 includes a first filter unit 103a, a second filter unit 103b, and a third filter unit 103c.
  • the event synthesis unit 104 includes an octave information addition unit 104a and an output timing adjustment unit 104b.
  • the event output unit 105 includes an event data selection unit 105a and an event data formation unit 105b.
  • the pixel array 101 includes a plurality of pixels 101a arranged in a two-dimensional array (matrix).
  • the horizontal direction on the paper corresponds to the row direction of the pixel array 101
  • the vertical direction on the paper corresponds to the column direction of the pixel array 101.
  • Each pixel 101a has the function of detecting events such as on events and off events.
  • An on event is fired when the luminance of the pixel 101a increases and the absolute value of the amount of change (increase) in luminance is greater than a threshold value.
  • An off event is fired when the luminance of the pixel 101a decreases and the absolute value of the amount of change (decrease) in luminance is greater than a threshold value. For example, an on event is fired when a subject enters the pixel 101a, and an off event is fired when a subject leaves the pixel 101a.
  • Each pixel 101a then outputs event data indicating the event detection result.
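  • As a behavioral sketch of this per-pixel detection (the actual pixel 101a is an analog circuit; the function and parameter names here are assumptions), the on/off decision can be modeled as thresholding the luminance change:

```python
from typing import Optional

def detect_event(prev_luminance: float, luminance: float, threshold: float) -> Optional[str]:
    """Return '+' for an on event, '-' for an off event, or None if no event fires."""
    change = luminance - prev_luminance
    if change > threshold:
        return "+"   # on event: luminance increased and the increase exceeds the threshold
    if -change > threshold:
        return "-"   # off event: luminance decreased and the decrease exceeds the threshold
    return None
```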
  • The event acquisition unit 102 acquires events (event data) from the pixels 101a in the pixel array 101.
  • Since the solid-state imaging device 100 of this embodiment is a scan-type EVS, the event acquisition unit 102 sequentially scans the rows of the pixel array 101 and acquires event data from the pixel array 101 row by row.
  • the event data for each row of the pixel array 101 is treated as event data V1 of the first octave.
  • the event acquisition unit 102 outputs the event data V1 of the first octave to the event generation unit 103 and the event synthesis unit 104.
  • the event data V1 of the first octave acquired by the event acquisition unit 102 is sent to the event synthesis unit 104, and a copy of the event data V1 is sent to the event generation unit 103.
  • FIG. 3 shows the first octave event data V1 as a series of multiple regions P1 arranged in a row.
  • Each region P1 represents the event data for one pixel 101a in the first octave event data V1.
  • four regions P1 correspond to event data for four pixels 101a.
  • The first octave event data V1 is also called the event firing presence/absence sequence of the first octave.
  • the first octave event data V1 is character string data representing the event data for one row of pixels 101a. In this embodiment, it is possible to know whether an event has fired at a pixel 101a by acquiring the event data for that pixel 101a.
  • the event generating unit 103 regards the event data output from the event acquiring unit 102 as event data V1 of the first octave, and generates event data V2 to Vi of the second to i-th octaves (i is an integer equal to or greater than 2) from the event data V1 of the first octave.
  • When the first filter unit 103a receives the first octave event data V1, it waits until m1 rows (m1 is an integer equal to or greater than 2) of the first octave event data V1 have accumulated.
  • the m1 rows of the first octave event data V1 correspond to the event data of m1 rows of the pixels 101a in the pixel array 101.
  • When m1 rows of the first octave event data V1 have accumulated, the first filter unit 103a generates r1 rows and s1 columns of the second octave event data V2 (r1 is an integer satisfying r1 < m1, and s1 is an integer satisfying s1 < n1) from m1 rows and n1 columns of the first octave event data V1 (n1 is an integer equal to or greater than 2). The first filter unit 103a then discards the m1 rows and n1 columns of the first octave event data V1. The first filter unit 103a repeats this process for the first octave event data V1 of all rows in order.
  • The first filter unit 103a outputs the r1 rows and s1 columns of the second octave event data V2 to the second filter unit 103b and the event synthesis unit 104.
  • the second octave event data V2 generated by the first filter unit 103a is sent to the event synthesis unit 104, and a copy thereof is sent to the second filter unit 103b.
  • FIG. 3 shows the second octave event data V2 as a schematic representation of a plurality of regions P2 arranged in a row.
  • The pixel array 101 of this embodiment includes pixels 101a of M rows and N columns. Therefore, the process of generating the event data V2 of the second octave from the event data V1 of the first octave is repeated M × N / 4 times in total.
  • each region P2 corresponds to event data obtained by aggregating the event data of four pixels 101a.
  • The second octave event data V2 is also called the event firing presence/absence sequence of the second octave.
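  • A minimal sketch of this row-buffered behavior of the first filter unit 103a, assuming m1 = n1 = 2, r1 = s1 = 1, a firing threshold k, and a boolean-list representation of each row (names are illustrative, not part of the disclosure):

```python
from typing import Iterable, Iterator, List

def filter_unit(rows: Iterable[List[bool]], m: int = 2, n: int = 2, k: int = 1) -> Iterator[List[bool]]:
    """Accumulate m rows of j-th octave data, emit one (j+1)-th octave row, then discard them."""
    held: List[List[bool]] = []
    for row in rows:
        held.append(row)
        if len(held) == m:
            yield [
                sum(held[dy][x + dx] for dy in range(m) for dx in range(n)) >= k
                for x in range(0, len(row), n)
            ]
            held.clear()   # the processed m x n regions of the j-th octave are discarded
```

  • Because every 2 × 2 region of the first octave is reduced to one region of the second octave, a pixel array 101 of M rows and N columns is processed in M × N / 4 such regions, as noted above.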
  • When the second filter unit 103b receives the second octave event data V2, it waits until m2 rows (m2 is an integer equal to or greater than 2) of the second octave event data V2 have accumulated. When m2 rows of the second octave event data V2 have accumulated, the second filter unit 103b generates r2 rows and s2 columns of the third octave event data V3 (r2 is an integer satisfying r2 < m2, and s2 is an integer satisfying s2 < n2) from m2 rows and n2 columns of the second octave event data V2 (n2 is an integer equal to or greater than 2). The second filter unit 103b then discards the m2 rows and n2 columns of the second octave event data V2. The second filter unit 103b repeats this process for the second octave event data V2 of all rows in order.
  • The second filter unit 103b outputs the r2 rows and s2 columns of the third octave event data V3 to the third filter unit 103c and the event synthesis unit 104.
  • The third octave event data V3 generated by the second filter unit 103b is sent to the event synthesis unit 104, and a copy thereof is sent to the third filter unit 103c.
  • FIG. 3 shows the third octave event data V3 as a schematic representation of a plurality of regions P3 arranged in a row.
  • The process of generating the third octave event data V3 from the second octave event data V2 is repeated M × N / 16 times in total.
  • four regions P2 are aggregated into one region P3, and therefore each region P3 corresponds to event data obtained by aggregating event data for 16 pixels 101a.
  • The third octave event data V3 is also called the event firing presence/absence sequence of the third octave.
  • the operation of the third filter unit 103c is the same as that of the first filter unit 103a and the second filter unit 103b.
  • The third filter unit 103c generates r3 rows and s3 columns of the fourth octave event data V4 (r3 is an integer satisfying r3 < m3, and s3 is an integer satisfying s3 < n3) from m3 rows and n3 columns of the third octave event data V3 (m3 and n3 are integers equal to or greater than 2).
  • The third filter unit 103c outputs the r3 rows and s3 columns of the fourth octave event data V4 to the event synthesis unit 104.
  • FIG. 3 shows the fourth octave event data V4 as a plurality of regions P4 arranged in a row.
  • The fourth octave event data V4 is also called the event firing presence/absence sequence of the fourth octave.
  • The event generation unit 103 generates event data Vj+1 of the (j+1)-th octave from event data Vj of the j-th octave (j is an integer satisfying 1 ≤ j ≤ i - 1). This makes it possible to sequentially generate the event data V2 to Vi of the second to i-th octaves from the event data V1 of the first octave.
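  • The sequential generation described above (V2 from V1, V3 from V2, and so on) amounts to cascading the same downsampling step. The sketch below assumes a boolean-grid representation whose dimensions are divisible by m and n at every level; it is an illustration, not the hardware pipeline of the event generation unit 103.

```python
from typing import List

def next_octave(grid: List[List[bool]], m: int, n: int, k: int) -> List[List[bool]]:
    """One step: each m x n region of octave j fires in octave j+1 if it holds >= k firings."""
    return [
        [sum(grid[y + dy][x + dx] for dy in range(m) for dx in range(n)) >= k
         for x in range(0, len(grid[0]), n)]
        for y in range(0, len(grid), m)
    ]

def generate_octaves(v1: List[List[bool]], i: int, m: int = 2, n: int = 2, k: int = 1) -> List[List[List[bool]]]:
    """Return [V1, V2, ..., Vi], generating V(j+1) from Vj for 1 <= j <= i-1."""
    octaves = [v1]
    for _ in range(i - 1):
        octaves.append(next_octave(octaves[-1], m, n, k))
    return octaves
```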
  • the term "octave” in this embodiment is used by analogy with the musical term "octave” because the differences between the event data V1 to Vi of the first to i-th octaves correspond to differences in frequency.
  • the event synthesis unit 104 acquires the event data V1 of the first octave from the event acquisition unit 102, and acquires the event data V2 to Vi of the second to i-th octaves from the event generation unit 103.
  • The octave information addition unit 104a adds octave information to the event data V1 to Vi of the first to i-th octaves and holds the event data to which the octave information has been added.
  • the octave information is identification information for the event data V1 to Vi of the first to i-th octaves.
  • The octave information is based on the octave number of the event data and is, for example, a value obtained by subtracting 1 from the octave number. Therefore, the octave information for the event data V1 to Vi of the first to i-th octaves is "0" to "i-1", respectively.
  • the octave information for the event data V1 of the first octave is "0"
  • the octave information for the event data V2 of the second octave is "1”
  • the octave information for the event data V3 of the third octave is "2".
  • the output timing adjustment unit 104b adjusts the timing at which the event data held by the octave information addition unit 104a is output to the event output unit 105.
  • The output timing adjustment unit 104b outputs the event data V1 to Vi of the first to i-th octaves to the event output unit 105 in order from the event data with the largest octave number to the event data with the smallest octave number.
  • the event data V1 to Vi of the first to i-th octaves are output with octave information added.
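  • A sketch of the event synthesis unit 104 behavior described above: each octave's event data is tagged with its octave information (octave number minus one) and the held data are then released in octave order. The dictionary representation is an assumption, and the output direction shown follows the ordering described above.

```python
from typing import Any, Dict, List

def synthesize(octaves: List[Any]) -> List[Dict[str, Any]]:
    """Tag V1..Vi with octave information (octave number - 1) and order them for output."""
    tagged = [{"octave_info": number - 1, "event_data": data}
              for number, data in enumerate(octaves, start=1)]
    # Output timing adjustment: release the event data in order of octave number,
    # from the largest octave number down to the smallest.
    return sorted(tagged, key=lambda entry: entry["octave_info"], reverse=True)
```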
  • Event output unit 105: The event output unit 105 outputs at least a part of the event data V1 to Vi of the first to i-th octaves to the outside of the solid-state imaging device 100.
  • the event output unit 105 outputs the event data acquired from the event synthesis unit 104.
  • the event output unit 105 of the present embodiment outputs the event data to the vehicle control system 11 shown in Fig. 1, but may instead output the event data to the electronic device 200 shown in Fig. 10 or 11.
  • the event data selection unit 105a selects the event data to be output to the event data formation unit 105b from the event data V1 to Vi of the first to i-th octaves. For example, when the event data V1 to V3 of the first to third octaves are selected from the event data V1 to Vi of the first to i-th octaves, the event data V1 to V3 of the first to third octaves are output to the event data formation unit 105b.
  • the event data forming unit 105b converts the event data selected by the event data selecting unit 105a into an event output data format.
  • the event data forming unit 105b then outputs the event data converted into the event output data format to the outside of the solid-state imaging device 100 with the octave information added.
  • FIG. 4 is a diagram for explaining the pixel array 101 of the first embodiment.
  • Each of A to C in Figure 4 shows a pixel array 101 that outputs event data, and an image E obtained by converting this event data into an image representation.
  • Image E corresponds to an image captured in a situation in which the letter "A" newly appears.
  • Image E also corresponds to an image obtained from event data V1 of the first octave.
  • The values (x0, y0, t0, p0) shown in A to C of FIG. 4 represent the event data of each pixel 101a.
  • x0 and y0 represent the coordinates of each pixel 101a.
  • t0 represents the time when the event data was obtained.
  • p0 represents the polarity of the event data. For example, the polarity when an on-event is fired is "+", and the polarity when an off-event is fired is "-". The above also applies to the other x, y, t, and p.
  • A in FIG. 4 shows a pixel array 101 with a small number of pixels and low resolution.
  • B in FIG. 4 shows a pixel array 101 with a medium number of pixels and medium resolution.
  • C in FIG. 4 shows a pixel array 101 with a large number of pixels and high resolution.
  • However, if the resolution of the pixel array 101 is simply increased, the data mining costs will increase and the information processing of the information processing system that uses the event data output from the solid-state imaging device 100 will be delayed. As a result, the performance of the entire information processing system will actually decrease. Examples of such information processing systems are the vehicle control system 11 shown in FIG. 1, the electronic device 200 shown in FIG. 10, and the system shown in FIG. 11 (a system including the electronic device 200).
  • the solid-state imaging device 100 of this embodiment therefore outputs event data of various octaves, as described with reference to FIG. 3. This makes it possible to achieve high resolution for the solid-state imaging device 100 while suppressing delays in information processing using the event data. For example, even if the pixel array 101 shown in FIG. 4C is employed, it is possible to suppress delays in information processing using the event data. Further details of such effects will be described later.
  • FIG. 5 is a diagram for explaining the operation of the solid-state imaging device 100 of the first embodiment.
  • Arrow A1 in FIG. 5 indicates the process of generating second-octave event data V2 of one row and one column from first-octave event data V1 of two rows and two columns. In this process, a block of regions P1 of two rows and two columns is replaced with one region P2 of one row and one column.
  • Arrow A2 in FIG. 5 indicates the process of generating third-octave event data V3 of one row and one column from second-octave event data V2 of two rows and two columns. In this process, a block of regions P2 of two rows and two columns is replaced with one region P3 of one row and one column. The presence or absence of a check in the regions P1 to P3 indicates whether or not an event has been fired.
  • FIG. 5 further shows an image E1 obtained by converting the event data V1 of the first octave into an image representation, an image E2 obtained by converting the event data V2 of the second octave into an image representation, and an image E3 obtained by converting the event data V3 of the third octave into an image representation.
  • the images E1, E2, and E3 are similar to the image E shown in C, B, and A of FIG. 4, respectively.
  • it is possible to generate a low-resolution image from a high-resolution image by generating event data V2 to Vi of the second to i-th octaves from the event data V1 of the first octave.
  • the event output unit 105 outputs the event data in an image representation for each octave.
  • It is possible to display the event data for each octave in a form such as images E1 to E3 by using the event data output in an image representation.
  • the event data is expressed in the form of (oc, x, y, t, p) by adding octave information.
  • x and y represent the coordinates of the pixels 101a corresponding to the regions P1 to P3, for example, the coordinates of one pixel 101a corresponding to the region P1, the average coordinates of four pixels 101a corresponding to the region P2, or the average coordinates of 16 pixels 101a corresponding to the region P3.
  • t represents the time when the event data was obtained.
  • p represents the polarity of the event data.
  • oc represents the octave information of the event data.
  • the octave information oc of the event data V1 in the first octave is "0"
  • the octave information oc of the event data V2 in the second octave is "1”
  • the octave information oc of the event data V3 in the third octave is "2".
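  • To make the (oc, x, y, t, p) form concrete, the sketch below (hypothetical helper names; the averaging of coordinates follows the description above, while taking t and p from the first contributing event is an illustrative assumption) builds such a tuple for one merged region.

        def merged_event(oc, block_events):
            """Build an (oc, x, y, t, p) tuple for one merged region with octave
            information `oc` from the (x, y, t, p) events of the pixels it covers.
            Coordinates are averaged; t and p are taken from the first event here
            (for t, the second embodiment instead uses a statistical value)."""
            xs = [e[0] for e in block_events]
            ys = [e[1] for e in block_events]
            x = sum(xs) / len(xs)
            y = sum(ys) / len(ys)
            t = block_events[0][2]
            p = block_events[0][3]
            return (oc, x, y, t, p)

        # Four first-octave pixel events merged into one second-octave event (oc = 1).
        pixels = [(10, 20, 0.001, "+"), (11, 20, 0.001, "+"),
                  (10, 21, 0.001, "+"), (11, 21, 0.001, "+")]
        print(merged_event(1, pixels))  # (1, 10.5, 20.5, 0.001, '+')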
  • FIG. 6 is a diagram for explaining the operation of the event generation unit 103 in the first embodiment.
  • A in FIG. 6 shows the process of generating r rows and s columns of second octave event data V2 from m rows and n columns of first octave event data V1.
  • 1 row and 1 column of j+1th octave event data Vj+1 is generated from 2 rows and 2 columns of jth octave event data Vj.
  • k represents the number of event firings contained in the m ⁇ n regions P1 (k is an integer satisfying 1 ⁇ k ⁇ m ⁇ n). For example, if the number of event firings contained in the 2 ⁇ 2 regions P1 is 3, the value of k is 3. In this case, three of the 2 ⁇ 2 regions P1 correspond to the "checked" region P1 shown in FIG. 5, and the remaining one of the 2 ⁇ 2 regions P1 corresponds to the "unchecked" region P1 shown in FIG. 5. This indicates that events were fired in three of the four pixels 101a, and no event was fired in the remaining one of the four pixels 101a.
  • FIG. 7 is another diagram for explaining the operation of the event generating unit 103 in the first embodiment.
  • A to C in FIG. 7 correspond to A to C in FIG. 6, respectively.
  • one region P2 is "checked” (C in FIG. 7)
  • one region P2 is "unchecked” (B in FIG. 7).
  • an event is fired in two or more of the four regions P1
  • it is treated as if an event was fired in one region P2.
  • it is treated as if an event was not fired in one region P2.
  • the same is true when generating event data Vj+1 in the j+1th octave from event data Vj in the jth octave.
  • FIG. 8 is another diagram for explaining the operation of the event generating unit 103 in the first embodiment.
  • A to C in FIG. 8 correspond to A to C in FIG. 6, respectively.
  • one region P2 is "checked” (C in FIG. 8)
  • one region P2 is "unchecked” (B in FIG. 8).
  • an event is fired in two or three of the four regions P1
  • it is treated as if an event was fired in one region P2.
  • it is treated as if an event was not fired in one region P2. This is also the case when generating event data Vj+1 in the j+1th octave from event data Vj in the jth octave.
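  • The difference between FIGs. 6 to 8 can be summarised as a firing-count threshold. The snippet below is a small self-contained sketch (hypothetical names, not from this publication) that treats one region P2 as fired only when at least `k_min` of the four underlying regions P1 fired.

        def block_fires(p1_block, k_min):
            """p1_block: four 0/1 firing flags of a 2x2 group of regions P1.
            Returns 1 if the corresponding region P2 is treated as fired."""
            return 1 if sum(p1_block) >= k_min else 0

        block = [1, 0, 1, 0]          # events fired in two of the four regions P1
        print(block_fires(block, 1))  # 1: FIG. 6 behaviour (any firing propagates)
        print(block_fires(block, 2))  # 1: FIG. 7 behaviour (isolated noise suppressed)
        print(block_fires(block, 3))  # 0: FIG. 8 behaviour (noise and flicker suppressed)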
  • FIG. 9 is a diagram for explaining the operation of the event output unit 105 in the first embodiment.
  • A in FIG. 9 is a diagram for explaining the operation of the event data selection unit 105a.
  • the event data selection unit 105a selects the event data to be output to the event data formation unit 105b from the event data V1 to Vi of the first to i-th octaves.
  • the event data selection unit 105a may select the event data V1 to Vi of all octaves, or may select only the event data V1 to Vi of some octaves.
  • For example, when the event data V1 to V3 of the first to third octaves are selected from the event data V1 to Vi of the first to i-th octaves, the event data V1 to V3 of the first to third octaves are output to the event data formation unit 105b.
  • the event data V1 to V3 of the first to third octaves are output with the octave information "0 to 2" added, respectively.
  • the event data formation unit 105b converts the event data selected by the event data selection unit 105a into an event output data format.
  • For example, the event data formation unit 105b converts the event data selected by the event data selection unit 105a into an address event representation or an image representation for each octave.
  • the event data formation unit 105b then outputs the event data converted into the event output data format to the outside of the solid-state imaging device 100 with octave information added.
  • the event data output in the image representation can be used to display the event data for each octave in a format such as images E1 to E3.
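  • As one way to picture the event output data format, the sketch below (assuming numpy is available; the flat (oc, x, y, t, p) stream as the address event representation and integer per-octave coordinates are illustrative assumptions, not the publication's exact format) converts the selected octaves either into a flat address-event stream or into one binary image per octave.

        import numpy as np

        def to_address_events(selected):
            """selected: {oc: [(x, y, t, p), ...]} for the selected octaves.
            Returns a flat list of (oc, x, y, t, p) tuples."""
            return [(oc, *ev) for oc, evs in selected.items() for ev in evs]

        def to_images(selected, base_shape):
            """Render each selected octave as a binary image; octave oc is assumed
            to be downscaled by 2**oc relative to the first-octave resolution,
            and x, y are integer coordinates in that octave's own grid."""
            images = {}
            for oc, evs in selected.items():
                h, w = base_shape[0] >> oc, base_shape[1] >> oc
                img = np.zeros((h, w), dtype=np.uint8)
                for x, y, _t, _p in evs:
                    img[y, x] = 1
                images[oc] = img
            return images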
  • FIG. 10 is a diagram illustrating an example of the electronic device 200 according to the first embodiment.
  • A in FIG. 10 shows a smartphone with a camera function as an example of the electronic device 200.
  • this electronic device 200 includes an imaging unit 201, a display unit 202, an information processing unit 203, a storage unit 204, and an input unit 205.
  • the information processing unit 203 includes an extraction unit 203a and a selection unit 203b.
  • the imaging unit 201 is a functional block for implementing a camera function.
  • the imaging unit 201 includes the solid-state imaging device 100 shown in FIG. 3.
  • This electronic device 200 functions as an information processing system that performs information processing using event data output from the solid-state imaging device 100 (event output unit 105).
  • the display unit 202 has a display screen for displaying characters and images.
  • the display unit 202 displays event data output from the solid-state imaging device 100 on the display screen.
  • the event output unit 105 outputs the event data in an image representation
  • the display unit 202 displays this event data in the form of an image on the display screen (B in FIG. 10).
  • This image may be a still image or a moving image.
  • In this embodiment, the display screen can display an image captured by the solid-state imaging device 100 operating as an image sensor, and an image obtained by converting the event data output from the solid-state imaging device 100 operating as an event sensor (EVS) into an image representation.
  • In a mode in which event data is displayed on the display screen in this way, the display screen is also called a viewer.
  • the information processing unit 203 performs various information processes such as controlling the electronic device 200. For example, the information processing unit 203 receives event data from the solid-state imaging device 100 and displays the event data on the display screen of the display unit 202.
  • the storage unit 204 includes a recording medium such as a semiconductor memory.
  • the information processing unit 203 can read information necessary for information processing from the storage unit 204, and record information generated by information processing in the storage unit 204.
  • the information processing unit 203 receives event data from the solid-state imaging device 100, and records this event data in the storage unit 204.
  • the input unit 205 accepts input operations from the user.
  • the information processing unit 203 performs information processing according to the input operations.
  • the input unit 205 includes, for example, a touch panel and hard buttons.
  • the event data selection unit 105a selects the event data to be output to the event data formation unit 105b from the event data V1 to Vi of the first to i-th octaves. For example, when the event data V1 to V3 of the first to third octaves is selected from the event data V1 to Vi of the first to i-th octaves, the event data V1 to V3 of the first to third octaves is output to the event data formation unit 105b.
  • the event data formation unit 105b outputs the event data selected by the event data selection unit 105a in an image representation for each octave. For example, the event data V1 to V3 of the first to third octaves is output in an image representation for each octave.
  • the extraction unit 203a extracts event data of a predetermined number of octaves from the event data output from the solid-state imaging device 100 (event data formation unit 105b). For example, the extraction unit 203a extracts event data V2 of the second octave from event data V1 to V3 of the first to third octaves. Event data of a predetermined number of octaves can be extracted based on the octave information of the event data.
  • the information processing unit 203 displays the event data extracted by the extraction unit 203a on the display screen. For example, when event data V2 of the second octave is extracted, the event data V2 of the second octave is displayed on the display screen in the form of an image.
  • the extraction unit 203a automatically extracts event data for the number of octaves that match the resolution of the viewer. For example, when event data is first displayed in the viewer, event data V2 of the second octave is extracted and displayed. Thereafter, when the user performs an operation to increase the resolution of the viewer, event data V1 of the first octave is extracted and displayed. On the other hand, when the user performs an operation to decrease the resolution of the viewer, event data V3 of the third octave is extracted and displayed.
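  • The publication does not spell out how the extraction unit 203a maps the viewer resolution to an octave number; the following is one plausible sketch (hypothetical names and logic) that picks the octave whose resolution is closest to the viewer width.

        def pick_octave(viewer_width, base_width, max_octave):
            """Pick the octave whose horizontal resolution (base_width / 2**oc)
            best matches the viewer width."""
            best_oc, best_err = 0, float("inf")
            for oc in range(max_octave + 1):
                err = abs(base_width / (2 ** oc) - viewer_width)
                if err < best_err:
                    best_oc, best_err = oc, err
            return best_oc

        # e.g. a 1280-pixel-wide pixel array and a 320-pixel-wide viewer
        print(pick_octave(320, 1280, 3))  # 2 -> extract the third-octave event data V3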
  • the selection unit 203b selects the event data for the number of octaves specified by the user from the event data output from the solid-state imaging device 100 (event data formation unit 105b). For example, when the event data V2 for the second octave is displayed on the viewer, if the user touches the "recording start button" on the touch panel, the selection unit 203b selects the event data V2 for the second octave from the event data V1 to V3 for the first to third octaves. Then, the selection unit 203b starts recording the event data V2 for the second octave in the storage unit 204.
  • When the user subsequently performs an operation to end the recording, the recording of the event data V2 for the second octave ends.
  • the user can specify the number of octaves of the event data to be recorded and the timing to start and end recording.
  • a video from the start of recording to the end of recording is recorded in the storage unit 204 (recording medium).
  • the number of octaves of the event data recorded in the storage unit 204 may be different from the number of octaves of the event data displayed in the viewer. For example, when the event data V2 of the second octave is displayed in the viewer, if the user specifies "event data V1 of the first octave" as "to be recorded" on the touch panel, the event data V1 of the first octave may be recorded in the storage unit 204. Furthermore, even if no such specification is made, the event data V1 of the first octave may be recorded in the storage unit 204.
  • the information processing unit 203 may use the extracted event data for information processing other than display. For example, after the event data V2 of the second octave is extracted, the information processing unit 203 may use the event data V2 of the second octave for image recognition.
  • image recognition is recognition of a user's gesture.
  • the information processing unit 203 may perform image recognition to recognize a user's gesture using an image of the user included in the event data V2 of the second octave. At this time, the information processing unit 203 may use event data of multiple octaves for image recognition.
  • the solid-state imaging device 100 outputs event data of various octaves. Therefore, the information processing unit 203 can display event data of various octaves by extracting event data of a predetermined number of octaves from the output event data. In addition, the information processing unit 203 can record event data of various octaves by selecting event data of a predetermined number of octaves from the output event data. If the information processing unit 203 were to generate event data of a different resolution from event data of a certain resolution, the information processing by the information processing unit 203 would be delayed.
  • the solid-state imaging device 100 is responsible for the process of generating event data of various resolutions (number of octaves), so that the delay in information processing by the information processing unit 203 can be suppressed.
  • the process of generating event data of various resolutions is performed by the solid-state imaging device 100 as hardware, instead of by the information processing unit 203 as software.
  • FIG. 11 is a diagram showing another example of the electronic device 200 of the first embodiment.
  • A in FIG. 11 shows a game console as an example of the electronic device 200.
  • this electronic device 200 is used by being connected to an imaging device 201' and a display device 202' via wired or wireless connection.
  • this electronic device 200 includes an information processing unit 203, a storage unit 204, and an input unit 205.
  • the information processing unit 203 includes an extraction unit 203a and a selection unit 203b.
  • the imaging device 201' is, for example, a camera that is an accessory to a game console.
  • the imaging device 201' includes the solid-state imaging device 100 shown in FIG. 3, similar to the imaging unit 201 described above.
  • This electronic device 200 together with the imaging device 201' and the display device 202', constitutes an information processing system that performs information processing using event data output from the solid-state imaging device 100 (event output unit 105).
  • the display device 202' is, for example, a large LCD television.
  • the display device 202' has a display screen for displaying characters and images, similar to the display unit 202 described above.
  • the imaging device 201' captures an image of a user playing on a game console.
  • the display device 202' displays the event data obtained by this imaging in the form of an image on the display screen.
  • the display screen is also called a viewer.
  • the functions of the information processing unit 203, storage unit 204, and input unit 205 shown in FIG. 11B are generally similar to the functions of the information processing unit 203, storage unit 204, and input unit 205 shown in FIG. 10A.
  • FIG. 12 is a diagram for explaining the details of the electronic device 200 shown in FIG. 11.
  • the imaging device 201' captures the entire body of the user. Therefore, the event data V1 to V3 of the first to third octaves output from the imaging device 201' (solid-state imaging device 100) includes event data related to the entire body of the user.
  • the display screen of the display device 202' displays the user's entire body at low resolution using the third octave event data V3.
  • Image E3 in the area enclosed by the dotted line in A of FIG. 12 includes the user's entire body.
  • This image E3 is obtained from the event data V3 of the third octave.
  • the display screen of the display device 202' displays the user's entire body at medium resolution using the second octave event data V2.
  • Image E2 in the area enclosed by the dotted line in FIG. 12B includes the user's hands. This image E2 corresponds to a portion of the second octave event data V2.
  • the display screen of the display device 202' displays the user's hand in high resolution using the event data V1 of the first octave.
  • Image E1 in the area enclosed by the dotted line in FIG. 12C includes the user's hand. This image E1 corresponds to a portion of the event data V1 of the first octave.
  • the information processing unit 203 can use these images E1 to E3 to enlarge or reduce the displayed view of the user. For example, by transitioning the display content of the display screen from A in FIG. 12 to B in FIG. 12, the entire body of the user can be enlarged and displayed while increasing the image resolution. Also, by transitioning the display content of the display screen from B in FIG. 12 to C in FIG. 12, the hand of the user can be enlarged and displayed while increasing the image resolution. This makes it possible to confirm the gesture of the user's hand on the display screen. The transition from A in FIG. 12 to B in FIG. 12 can be realized by switching the event data extracted by the extraction unit 203a from the event data V3 of the third octave to the event data V2 of the second octave.
  • Similarly, the transition from B in FIG. 12 to C in FIG. 12 can be realized by switching the event data extracted by the extraction unit 203a from the event data V2 of the second octave to the event data V1 of the first octave.
  • the information processing unit 203 may perform image recognition using the event data extracted by the extraction unit 203a to automate the confirmation of gestures. For example, the information processing unit 203 extracts the area of image E3, i.e., the user's entire body, from the event data V3 of the third octave by image recognition. Next, the information processing unit 203 extracts the area of image E2, i.e., the user's hand part, from the event data V2 of the second octave by image recognition. Next, the information processing unit 203 identifies the user's hand gesture from the event data V1 of the first octave by image recognition. This makes it possible for the information processing unit 203 to automatically recognize the user's hand gesture. If the information processing unit 203 itself had to generate the event data of these different resolutions from the highest-resolution event data, this image recognition would be delayed. According to this embodiment, by having the solid-state imaging device 100 take charge of the process of generating event data of various resolutions (numbers of octaves), it is possible to suppress such delays in image recognition.
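  • The following is a minimal sketch of this coarse-to-fine flow (the detector and classifier are outside the scope of this publication, so they appear only as placeholder callbacks; all names are hypothetical).

        def recognize_gesture(octaves, find_body, find_hand, classify):
            """octaves: {0: V1, 1: V2, 2: V3} per-octave event images.
            find_body, find_hand, classify: image-recognition callbacks."""
            body_roi = find_body(octaves[2])            # whole body in coarse V3
            if body_roi is None:
                return None
            hand_roi = find_hand(octaves[1], body_roi)  # hand region in medium V2
            if hand_roi is None:
                return None
            return classify(octaves[0], hand_roi)       # gesture from detailed V1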
  • the solid-state imaging device 100 of this embodiment treats the event data output from the pixel array 101 as event data V1 of the first octave, and generates event data V2 to Vi of the second to i-th octaves from the event data V1 of the first octave. Therefore, according to this embodiment, by outputting event data of various octaves for information processing using the event data, it is possible to achieve high resolution of the solid-state imaging device 100 while suppressing delays in information processing.
  • FIG. 13 is a block diagram showing the configuration of a solid-state imaging device 100 according to the second embodiment.
  • the solid-state imaging device 100 of this embodiment, like the solid-state imaging device of the first embodiment, includes a pixel array 101, an event acquisition unit 102, an event generation unit 103, an event synthesis unit 104, and an event output unit 105.
  • the solid-state imaging device 100 of this embodiment further includes a frame memory 111.
  • the event acquisition unit 102 of this embodiment also includes an arbiter unit 102a and a time stamp unit 102b.
  • the solid-state imaging device 100 of this embodiment is an arbiter-type EVS. Therefore, the event acquisition unit 102 of this embodiment acquires event data from the multiple pixels 101a in the pixel array 101 in a random order.
  • the event data acquired by the event acquisition unit 102 is stored in the frame memory 111 for a certain period of time, and is then output from the frame memory 111 to the event generation unit 103 and the event synthesis unit 104.
  • the frame memory 111 includes multiple memory cells arranged in a two-dimensional array (matrix), similar to the pixel array 101.
  • the event data of each row of the frame memory 111 is treated as event data V1 of the first octave.
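  • A minimal sketch of this step, assuming the frame memory is read out as a binary firing map accumulated over a fixed time window (the window-based readout and all names are assumptions for illustration):

        def accumulate_frame(async_events, height, width, t_start, t_end):
            """Collect asynchronously arriving (x, y, t, p) events into one
            frame-memory snapshot covering [t_start, t_end); the snapshot is then
            treated as first-octave event data V1."""
            frame = [[0] * width for _ in range(height)]
            for x, y, t, p in async_events:
                if t_start <= t < t_end:
                    frame[y][x] = 1
            return frame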
  • the event acquisition unit 102 of this embodiment includes an arbiter unit 102a and a time stamp unit 102b as functional blocks for an arbiter-type EVS.
  • the arbiter unit 102a arbitrates multiple events (request signals) output from multiple pixels 101a.
  • the time stamp unit 102b assigns a time stamp to an event fired from each pixel 101a.
  • the value of t in the event data (x, y, t, p) of each pixel 101a becomes the value of the time stamp.
  • FIG. 14 is a diagram for explaining the operation of the event generation unit 103 in the second embodiment.
  • FIG. 14, like A of FIG. 6, A of FIG. 7, and A of FIG. 8, shows the process of generating r rows and s columns of second octave event data V2 from m rows and n columns of first octave event data V1.
  • the processing in the example shown in FIG. 14 can be performed in the same manner as the processing in the example shown in FIG. 6A, FIG. 7A, or FIG. 8A.
  • the value of t in the event data (x, y, t, p) of each pixel 101a becomes the value of the timestamp, so the four regions P1 shown in FIG. 14 have different values of t.
  • the values of t in the four regions P1 are ta, tb, tc, and td, respectively.
  • the event data of this embodiment is expressed in the form (oc, x, y, t, p) with the addition of octave information.
  • x and y represent the coordinates of the pixels 101a corresponding to regions P1 to P3, etc., and represent, for example, the coordinates of one pixel 101a corresponding to region P1, the average coordinates of four pixels 101a corresponding to region P2, or the average coordinates of 16 pixels 101a corresponding to region P3.
  • t represents the time when the event data was obtained.
  • p represents the polarity of the event data.
  • the t of one region P2 is taken as the statistical value of the t of four regions P1.
  • In other words, the t of one region P2 is taken as the value obtained by performing statistical processing on the t of the four regions P1.
  • the statistical value of the t of four regions P1 is, for example, the average value, maximum value, minimum value, etc. of the t of the four regions P1.
  • the t of one region Pj+1 is taken as the statistical value of the t of the four regions Pj.
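  • For illustration, this statistical processing of the timestamps can be sketched as follows (hypothetical names; average, maximum, and minimum are the examples given above):

        def merge_timestamps(ts, mode="average"):
            """Statistical value of the t of the four regions P1 (ta, tb, tc, td)."""
            if mode == "average":
                return sum(ts) / len(ts)
            if mode == "max":
                return max(ts)
            if mode == "min":
                return min(ts)
            raise ValueError(mode)

        ta, tb, tc, td = 0.010, 0.012, 0.011, 0.013
        print(merge_timestamps([ta, tb, tc, td]))         # about 0.0115
        print(merge_timestamps([ta, tb, tc, td], "max"))  # 0.013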
  • the solid-state imaging device 100 of this embodiment treats the event data output from the frame memory 111 as event data V1 of the first octave, and generates event data V2 to Vi of the second to i-th octaves from the event data V1 of the first octave. Therefore, according to this embodiment, as in the first embodiment, by outputting event data of various octaves for information processing using the event data, it is possible to achieve high resolution of the solid-state imaging device 100 while suppressing delays in information processing.
  • FIG. 15 is a perspective view that illustrates a schematic configuration of a solid-state imaging device 100 according to the third embodiment.
  • the solid-state imaging device 100 of this embodiment includes a detection chip 130 and a light-receiving chip 120 stacked on the detection chip 130.
  • the detection chip 130 and the light-receiving chip 120 are electrically connected through connections such as via plugs, metal pads, and metal bumps.
  • the solid-state imaging device 100 of this embodiment functions as the solid-state imaging device 100 of the first or second embodiment.
  • FIG. 15 shows the X-axis, Y-axis, and Z-axis, which are perpendicular to each other.
  • the X-axis and Y-axis correspond to the horizontal direction
  • the Z-axis corresponds to the vertical direction.
  • the +Z direction corresponds to the upward direction
  • the -Z direction corresponds to the downward direction. Note that the -Z direction may or may not strictly coincide with the direction of gravity.
  • FIG. 16 is a plan view that shows a schematic configuration of the light receiving chip 120 of the third embodiment.
  • A in FIG. 16 shows an example of the planar structure of the light-receiving chip 120.
  • the light-receiving chip 120 includes a light-receiving section 121 and multiple via arrangement sections 122 to 124.
  • B in FIG. 16 shows an example of the planar structure of the light-receiving section 121.
  • the light-receiving section 121 includes multiple photodiodes 121a.
  • multiple photodiodes 121a are arranged in an array (two-dimensional lattice).
  • a pixel address consisting of a row address and a column address is assigned to each photodiode 121a, and each photodiode 121a is treated as a pixel.
  • Each photodiode 121a photoelectrically converts incident light to generate a photocurrent.
  • Via plugs electrically connected to the detection chip 130 are arranged in the via arrangement sections 122 to 124.
  • FIG. 17 is a plan view that shows a schematic configuration of the detection chip 130 of the third embodiment.
  • A in FIG. 17 shows an example of the planar structure of the detection chip 130.
  • the detection chip 130 includes an address event detection unit 131, multiple via placement units 132 to 134, a row driving circuit 135, a column driving circuit 136, and a signal processing circuit 137.
  • B in FIG. 17 shows an example of the planar structure of the address event detection unit 131.
  • the address event detection unit 131 includes multiple address event detection circuits 131a.
  • multiple address event detection circuits 131a are arranged in an array (two-dimensional lattice).
  • a pixel address is assigned to each address event detection circuit 131a, and each address event detection circuit 131a is electrically connected to the photodiode 121a at the same address.
  • Each address event detection circuit 131a quantizes a voltage signal corresponding to the photocurrent from the corresponding photodiode 121a and outputs it as a detection signal.
  • This detection signal is a one-bit signal indicating whether or not an address event has been detected in which the amount of incident light has exceeded a predetermined threshold, and is output to the signal processing circuit 137.
  • Via plugs electrically connected to the light receiving chip 120 are arranged in the via arrangement sections 132 to 134.
  • the row driving circuit 135 selects a row address and causes the address event detection unit 131 to output the detection signals corresponding to that row address.
  • the column driving circuit 136 selects a column address and causes the address event detection unit 131 to output the detection signals corresponding to that column address.
  • the signal processing circuit 137 performs predetermined signal processing on the detection signal from the address event detection unit 131.
  • the signal processing circuit 137 arranges the detection signals as pixel signals in a two-dimensional lattice pattern and acquires image data having one bit of information for each pixel.
  • the signal processing circuit 137 performs signal processing such as image recognition processing on this image data.
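  • As an illustration of this arrangement, the sketch below (hypothetical names) places per-pixel 1-bit detection signals into a two-dimensional lattice with one bit of information per pixel.

        def detection_signals_to_image(signals, height, width):
            """signals: {(row, col): 0 or 1} detection results per pixel address.
            Returns image data with one bit of information for each pixel."""
            image = [[0] * width for _ in range(height)]
            for (row, col), bit in signals.items():
                image[row][col] = bit
            return image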
  • FIG. 18 is a circuit diagram showing the configuration of each address event detection circuit 131a in the third embodiment.
  • Each address event detection circuit 131a includes a current-voltage conversion circuit 310, a buffer 320, a subtractor 330, a quantizer 340, and a transfer circuit 350.
  • the current-voltage conversion circuit 310 converts the photocurrent from the corresponding photodiode 121a into a voltage signal.
  • the current-voltage conversion circuit 310 supplies this voltage signal to the buffer 320.
  • the buffer 320 corrects the voltage signal from the current-voltage conversion circuit 310.
  • the buffer 320 outputs the corrected voltage signal to the subtractor 330.
  • the subtractor 330 reduces the level of the voltage signal from the buffer 320 in accordance with the row drive signal from the row drive circuit 135.
  • the subtractor 330 supplies the reduced voltage signal to the quantizer 340.
  • the quantizer 340 quantizes the voltage signal from the subtractor 330 into a digital signal and outputs it as a detection signal.
  • the quantizer 340 outputs this detection signal to the transfer circuit 350.
  • the transfer circuit 350 transfers the detection signal from the quantizer 340 to the signal processing circuit 137 in accordance with the column drive signal from the column drive circuit 136.
  • FIG. 19 is a circuit diagram showing the configuration of a current-voltage conversion circuit 310 according to the third embodiment.
  • the current-voltage conversion circuit 310 includes an N-type transistor 311, a P-type transistor 312, and an N-type transistor 313.
  • These N-type and P-type transistors 311 to 313 are, for example, MOS (Metal-Oxide-Semiconductor) transistors.
  • the source of the N-type transistor 311 is electrically connected to the cathode of the photodiode 121a, and the drain of the N-type transistor 311 is electrically connected to the power supply terminal (VDD).
  • the P-type transistor 312 and the N-type transistor 313 are connected in series between the power supply terminal and the ground terminal (GND).
  • the node between the P-type transistor 312 and the N-type transistor 313 is electrically connected to the gate of the N-type transistor 311 and the input terminal of the buffer 320.
  • a predetermined bias voltage Vbias1 is applied to the gate of the P-type transistor 312.
  • the node between the N-type transistor 311 and the photodiode 121a is electrically connected to the gate of the N-type transistor 313.
  • The drain of the N-type transistor 311 and the drain of the N-type transistor 313 are placed on the power supply side, and this type of circuit is called a source follower.
  • the photocurrent from photodiode 121a is converted into a voltage signal by the source follower.
  • P-type transistor 312 supplies a constant current to N-type transistor 313. Note that the ground of the light receiving chip 120 and the ground of the detection chip 130 are separated from each other to prevent interference.
  • FIG. 20 is a circuit diagram showing the configuration of the subtractor 330 and quantizer 340 of the third embodiment.
  • the subtractor 330 includes a capacitor 331, an inverter 332, a capacitor 333, and a switch 334.
  • the quantizer 340 includes a comparator 341.
  • One electrode of the capacitor 331 is electrically connected to the output terminal of the buffer 320, and the other electrode of the capacitor 331 is electrically connected to the input terminal of the inverter 332.
  • the inverter 332 inverts the voltage signal input via the capacitor 331 and outputs the inverted signal to the non-inverting input terminal (+) of the comparator 341.
  • the capacitor 333 is connected in parallel to the inverter 332.
  • the switch 334 opens and closes the path that electrically connects both electrodes of the capacitor 333 in accordance with the row drive signal.
  • Equation 5 represents the subtraction operation of the voltage signal, and the gain of the subtraction result is C1/C2. Since it is usually desired to maximize the gain, it is preferable to set C1 large and design C2 small. On the other hand, if C2 is too small, kTC noise increases and noise characteristics may deteriorate, so the reduction in the capacitance of C2 is limited to a range in which noise can be tolerated. Furthermore, since an address event detection circuit 131a including a subtractor 330 is mounted for each pixel, there are area restrictions on C1 and C2. Taking these into consideration, for example, C1 is set to a value of 20 to 200 femtofarads (fF), and C2 is set to a value of 1 to 20 femtofarads (fF).
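  • Equations 1 to 5 referred to above are not reproduced in this text. The relation below is a standard reconstruction of the behaviour of this switched-capacitor subtractor, written to be consistent with the stated gain of C1/C2; it is not a quotation of the publication's own numbered equations.

        % After the switch 334 is opened at reset level V_init, a change of the
        % buffer output to V_in produces, by charge conservation on C_1 and C_2,
        \[
          V_{\mathrm{out}} \;=\; -\,\frac{C_1}{C_2}\,\bigl(V_{\mathrm{in}} - V_{\mathrm{init}}\bigr),
        \]
        % so the magnitude of the gain of the subtraction result is C_1 / C_2.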
  • Comparator 341 compares the voltage signal from subtractor 330 with a predetermined threshold voltage Vth applied to the inverting input terminal (-). Comparator 341 outputs a signal indicating the comparison result as a detection signal to transfer circuit 350.
  • In a synchronous solid-state imaging device, a simple pixel circuit including a photodiode and three to four transistors is provided for each pixel.
  • In the solid-state imaging device 100 of this embodiment, on the other hand, a more complex pixel circuit including a photodiode 121a and an address event detection circuit 131a is provided for each pixel. Therefore, if both the photodiode 121a and the address event detection circuit 131a were placed on the same chip, the mounting area would be larger than in the case of the synchronous type. Therefore, in the solid-state imaging device 100 of this embodiment, the photodiode 121a and the address event detection circuit 131a are placed on the light receiving chip 120 and the detection chip 130, respectively. According to this embodiment, the mounting area can be reduced by distributing the photodiode 121a and the address event detection circuit 131a in this manner.
  • FIG. 21 is a circuit diagram showing the configuration of the light receiving chip 120 and the detection chip 130 of a modified example of the third embodiment.
  • In the third embodiment described above, the N-type transistor 311, the P-type transistor 312, and the N-type transistor 313 in the current-voltage conversion circuit 310 are arranged in the detection chip 130.
  • In this modified example, on the other hand, the N-type transistor 311 and the N-type transistor 313 in the current-voltage conversion circuit 310 are arranged in the light-receiving chip 120, and the P-type transistor 312 in the current-voltage conversion circuit 310 is arranged in the detection chip 130.
  • the configuration in FIG. 21 is adopted, for example, when there is a risk that the circuit scale of detection chip 130 will increase with an increase in the number of pixels. According to this modified example, by arranging N-type transistor 311 and N-type transistor 313 in light-receiving chip 120, it is possible to reduce the circuit scale of detection chip 130.
  • If one of the N-type transistors 311 and 313 were placed in the light receiving chip 120 and the other were placed in the detection chip 130, both a process of forming an N-type transistor in the light receiving chip 120 and a process of forming an N-type transistor in the detection chip 130 would be required, increasing the number of processes for manufacturing the light receiving chip 120 and the detection chip 130.
  • According to this modified example, by placing both N-type transistors 311 and 313 in the light receiving chip 120, it is possible to reduce the number of processes for manufacturing the light receiving chip 120 and the detection chip 130. This makes it possible to reduce the manufacturing costs of the solid-state imaging device 100.
  • a solid-state imaging device comprising:
  • an octave information adding unit that adds octave information, which is identification information of the first to i-th octaves, to the event data of the first to i-th octaves, respectively;
  • the solid-state imaging device wherein the event output unit outputs the event data to which the octave information is added.
  • The solid-state imaging device according to (1), further comprising a frame memory for storing the event data output from the plurality of pixels, wherein the event generating unit treats the event data output from the frame memory as the event data of the first octave.
  • An information processing system including a solid-state imaging device and an information processing unit
  • the solid-state imaging device includes: a plurality of pixels that detect an event and output event data indicative of the detection result of the event; an event generating unit that sets the event data output from the plurality of pixels as event data of a first octave and generates event data of a second to i-th octave (i is an integer equal to or greater than 2) from the event data of the first octave; an event output unit that outputs at least a part of the event data of the first to i-th octaves;
  • the information processing unit displays the event data output from the event output unit on a display screen.
  • The information processing system according to (12), wherein the information processing unit includes an extraction unit that extracts the event data of a predetermined number of octaves from the event data output from the event output unit, and the information processing unit displays the event data extracted by the extraction unit on the display screen.
  • The information processing system according to (12), wherein the information processing unit includes an extraction unit that extracts the event data of a predetermined number of octaves from the event data output from the event output unit, and the information processing unit performs image recognition using the event data extracted by the extraction unit.
  • The information processing system according to (12), wherein the information processing unit includes a selection unit that selects the event data for a number of octaves designated by a user from the event data output from the event output unit, and the information processing unit records the event data selected by the selection unit on a recording medium.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Transforming Light Signals Into Electric Signals (AREA)
  • Traffic Control Systems (AREA)

Abstract

[Problem] To provide a solid-state imaging device and an information processing system which can achieve high resolution while suppressing delay in information processing. [Solution] A solid-state imaging device of the present disclosure comprises: a plurality of pixels that detect an event and output event data indicating detection results of the event; an event generation unit that establishes the event data output from the plurality of pixels as event data of a first octave, and generates event data of a second to i-th octave (i is an integer of two or greater) from the event data of the first octave; and an event output unit that outputs at least a portion of the event data of the first to i-th octave.

Description

固体撮像装置および情報処理システムSolid-state imaging device and information processing system
 本開示は、固体撮像装置および情報処理システムに関する。 This disclosure relates to a solid-state imaging device and an information processing system.
 イメージセンサが、被写体を含む画像を取得するためのセンサであるのに対し、イベントセンサ(EVS:Event-based Vision Sensor)は、被写体の変化を検出するためのセンサである。イベントセンサは、センシング対象を「画像」から「被写体の変化」に限定することで、イメージセンサに比べてフレームレートを高く設定することができる。イベントセンサは例えば、イメージセンサと同様に、CCD(Charge Coupled Device)センサやCMOS(Complementary Metal Oxide Semiconductor)センサなどの固体撮像装置により実現される。 While an image sensor is a sensor for acquiring images that include a subject, an event sensor (EVS: Event-based Vision Sensor) is a sensor for detecting changes in the subject. By limiting the sensing target from "images" to "changes in the subject," an event sensor can set a higher frame rate than an image sensor. Like an image sensor, an event sensor is realized by a solid-state imaging device such as a CCD (Charge Coupled Device) sensor or a CMOS (Complementary Metal Oxide Semiconductor) sensor.
特表2017-533497号公報JP 2017-533497 A
 従来のEVSは、一定の空間解像度のイベントデータを出力する。このイベントデータを受信した情報処理システムが物体認識などの情報処理を行う場合には、イベントデータに対してスケーリング、フィルタリング、クロッピングなどの処理を行いつつ、イベントデータから有用な情報を抽出していく。 Conventional EVSs output event data with a certain spatial resolution. When an information processing system receives this event data and performs information processing such as object recognition, it performs processes such as scaling, filtering, and cropping on the event data to extract useful information from it.
 しかしながら、EVSの性能を高めようとしてEVSを高解像度化すると、データマイニングコストが増大し、情報処理システムの情報処理が遅延してしまう。その結果、情報処理システム全体の性能はむしろ低下してしまう。 However, increasing the resolution of an EVS in an attempt to improve its performance increases data mining costs and slows down the information processing system's information processing. As a result, the performance of the entire information processing system actually decreases.
 例えば、イベントのアドレス情報の下位ビットをドロップすることで、イベントデータを空間的にダウンサンプリングするという手法が考えられている。これにより、イベントデータの空間解像度が低下するため、データマイニングコストの増大を抑制し、情報処理システムの情報処理の遅延を抑制することができる。しかしながら、空間解像度の低下により、物体認識などの目的タスクの精度が低下してしまうおそれがある。 For example, one method that has been considered is to spatially downsample event data by dropping the least significant bits of the event address information. This reduces the spatial resolution of the event data, thereby preventing increases in data mining costs and reducing delays in information processing in information processing systems. However, the reduction in spatial resolution may result in a decrease in the accuracy of target tasks such as object recognition.
 そこで、本開示は、情報処理の遅延を抑制しつつ、高解像度化を実現することが可能な固体撮像装置および情報処理システムを提供する。 The present disclosure provides a solid-state imaging device and information processing system that can achieve high resolution while suppressing delays in information processing.
 本開示の第1の側面の固体撮像装置は、イベントを検出し、前記イベントの検出結果を示すイベントデータを出力する複数の画素と、前記複数の画素から出力された前記イベントデータを第1オクターブのイベントデータとし、前記第1オクターブのイベントデータから第2~第iオクターブ(iは2以上の整数)のイベントデータを生成するイベント生成部と、前記第1~第iオクターブのイベントデータのうちの少なくとも一部のイベントデータを出力するイベント出力部とを備える。これにより例えば、イベントデータを利用した情報処理用に様々なオクターブのイベントデータを出力することで、情報処理の遅延を抑制しつつ、固体撮像装置の高解像度化を実現することが可能となる。 The solid-state imaging device according to the first aspect of the present disclosure includes a plurality of pixels that detect an event and output event data indicating the detection result of the event, an event generation unit that converts the event data output from the plurality of pixels into event data of a first octave and generates event data of a second to i-th octave (i is an integer of 2 or more) from the event data of the first octave, and an event output unit that outputs at least a portion of the event data of the first to i-th octave. This makes it possible to achieve high resolution of the solid-state imaging device while suppressing delays in information processing, for example, by outputting event data of various octaves for information processing using the event data.
 また、この第1の側面の固体撮像装置は、前記第1~第iオクターブのイベントデータにそれぞれ、前記第1~第iオクターブの識別情報であるオクターブ情報を付加するオクターブ情報付加部をさらに備え、前記イベント出力部は、前記オクターブ情報が付加された前記イベントデータを出力してもよい。これにより例えば、あるイベントデータがどのオクターブのイベントデータかを識別することが可能となる。 The solid-state imaging device of the first aspect may further include an octave information adding unit that adds octave information, which is identification information for the first to i-th octaves, to the event data of the first to i-th octaves, respectively, and the event output unit may output the event data to which the octave information has been added. This makes it possible, for example, to identify which octave a certain event data belongs to.
 また、この第1の側面において、前記イベント出力部は、前記イベントデータをオクターブごとに画像表現で出力してもよい。これにより例えば、様々なオクターブのイベントデータを画像として表示することが可能となる。 Furthermore, in this first aspect, the event output unit may output the event data in an image representation for each octave. This makes it possible to display event data for various octaves as images, for example.
 また、この第1の側面において、前記イベント生成部は、第jオクターブ(jは1≦j≦i-1を満たす整数)のイベントデータから第j+1オクターブのイベントデータを生成してもよい。これにより例えば、第1オクターブのイベントデータから第2オクターブのイベントデータを生成し、第2オクターブのイベントデータから第3オクターブのイベントデータを生成するといったように、様々なオクターブのイベントデータを第1オクターブのイベントデータから順次生成することが可能となる。 In addition, in this first aspect, the event generating unit may generate event data for the j+1th octave from event data for the jth octave (j is an integer satisfying 1≦j≦i−1). This makes it possible to sequentially generate event data for various octaves from event data for the first octave, for example, by generating event data for the second octave from event data for the first octave, and generating event data for the third octave from event data for the second octave.
 また、この第1の側面において、前記イベント生成部は、m行分(mは2以上の整数)の前記第jオクターブのイベントデータから、1行分の前記第j+1オクターブのイベントデータを生成してもよい。これにより例えば、第jオクターブのイベントデータから第j+1オクターブのイベントデータを簡単に生成することが可能となる。 In addition, in this first aspect, the event generation unit may generate one line of event data for the j+1 octave from m lines (m is an integer equal to or greater than 2) of event data for the j octave. This makes it possible to easily generate event data for the j+1 octave from event data for the j octave, for example.
 また、この第1の側面において、前記イベント生成部は、m行n列分(nは2以上の整数)の前記第jオクターブのイベントデータから、1行1列分の前記第j+1オクターブのイベントデータを生成してもよい。これにより例えば、第jオクターブのイベントデータから第j+1オクターブのイベントデータをさらに簡単に生成することが可能となる。 In addition, in this first aspect, the event generation unit may generate one row and one column of event data for the j+1 octave from m rows and n columns (n is an integer of 2 or more) of event data for the j octave. This makes it possible to more easily generate event data for the j+1 octave from event data for the j octave, for example.
 また、この第1の側面において、前記イベント生成部は、前記m行n列分の前記第jオクターブのイベントデータがk個のイベント発火(kは1≦k≦m×nを満たす整数)を含む場合に、前記1行1列分の前記第j+1オクターブのイベントデータにてイベントを発火させてもよい。これにより例えば、第jオクターブのm×n個の領域を第j+1オクターブの1個の領域に簡単に置換することが可能となる。 In addition, in this first aspect, the event generating unit may ignite an event with the event data of the j+1 octave of the 1 row and 1 column when the event data of the j octave of the m rows and n columns includes k event firings (k is an integer satisfying 1≦k≦m×n). This makes it possible to easily replace, for example, m×n areas of the j th octave with one area of the j+1 th octave.
 また、この第1の側面において、mは2であり、nは2であり、かつkは1、2、3、および4でもよい。これにより例えば、第jオクターブのm×n個の領域でイベントが1個も発火しなかった場合と、第jオクターブのm×n個の領域でイベントが少なくとも1個発火した場合とを区別することが可能となる。 In addition, in this first aspect, m may be 2, n may be 2, and k may be 1, 2, 3, and 4. This makes it possible to distinguish, for example, between a case where no events are fired in the m×n regions of the jth octave and a case where at least one event is fired in the m×n regions of the jth octave.
 また、この第1の側面において、mは2であり、nは2であり、かつkは2、3、および4でもよい。これにより例えば、ノイズイベントの影響を抑制することが可能となる。 Also, in this first aspect, m may be 2, n may be 2, and k may be 2, 3, or 4. This makes it possible to suppress the effects of noise events, for example.
 また、この第1の側面において、mは2であり、nは2であり、かつkは3および4でもよい。これにより例えば、ノイズイベントおよびフリッカーイベントの影響を抑制することが可能となる。 In addition, in this first aspect, m may be 2, n may be 2, and k may be 3 and 4. This makes it possible to suppress the effects of noise events and flicker events, for example.
 また、この第1の側面の固体撮像装置は、前記複数の画素から出力された前記イベントデータを格納するフレームメモリをさらに備え、前記イベント生成部は、前記フレームメモリから出力された前記イベントデータを前記第1オクターブのイベントデータとしてもよい。これにより例えば、固体撮像装置としてアービタ型のイベントセンサを採用することが可能となる。 The solid-state imaging device of the first aspect may further include a frame memory that stores the event data output from the plurality of pixels, and the event generating unit may treat the event data output from the frame memory as event data of the first octave. This makes it possible to employ, for example, an arbiter-type event sensor as the solid-state imaging device.
 本開示の第2の側面の情報処理システムは、固体撮像装置と情報処理部とを備える情報処理システムであって、前記固体撮像装置は、イベントを検出し、前記イベントの検出結果を示すイベントデータを出力する複数の画素と、前記複数の画素から出力された前記イベントデータを第1オクターブのイベントデータとし、前記第1オクターブのイベントデータから第2~第iオクターブ(iは2以上の整数)のイベントデータを生成するイベント生成部と、前記第1~第iオクターブのイベントデータのうちの少なくとも一部のイベントデータを出力するイベント出力部とを備え、前記情報処理部は、前記イベント出力部から出力された前記イベントデータを表示画面に表示する。これにより例えば、イベントデータを利用した情報処理(例えば表示)用に様々なオクターブのイベントデータを出力することで、情報処理の遅延を抑制しつつ、固体撮像装置の高解像度化を実現することが可能となる。 The information processing system according to a second aspect of the present disclosure is an information processing system including a solid-state imaging device and an information processing unit, in which the solid-state imaging device includes a plurality of pixels that detect an event and output event data indicating the detection result of the event, an event generation unit that treats the event data output from the plurality of pixels as event data of a first octave and generates event data of a second to i-th octave (i is an integer of 2 or more) from the event data of the first octave, and an event output unit that outputs at least a portion of the event data of the first to i-th octave, and the information processing unit displays the event data output from the event output unit on a display screen. As a result, for example, by outputting event data of various octaves for information processing (e.g., display) using the event data, it is possible to achieve high resolution of the solid-state imaging device while suppressing delays in information processing.
 また、この第2の側面において、前記情報処理部は、前記イベント出力部から出力された前記イベントデータから、所定のオクターブ数の前記イベントデータを抽出する抽出部を含み、前記情報処理部は、前記抽出部により抽出された前記イベントデータを前記表示画面に表示してもよい。これにより例えば、固体撮像装置から出力された様々なオクターブのイベントデータを、オクターブごとに表示用に利用することが可能となる。 Also, in this second aspect, the information processing unit may include an extraction unit that extracts the event data of a predetermined number of octaves from the event data output from the event output unit, and the information processing unit may display the event data extracted by the extraction unit on the display screen. This makes it possible, for example, to use the event data of various octaves output from a solid-state imaging device for display on an octave basis.
 また、この第2の側面において、前記情報処理部は、前記イベント出力部から出力された前記イベントデータから、所定のオクターブ数の前記イベントデータを抽出する抽出部を含み、前記情報処理部は、前記抽出部により抽出された前記イベントデータを利用して画像認識を行ってもよい。これにより例えば、固体撮像装置から出力された様々なオクターブのイベントデータを、オクターブごとに画像認識用に利用することが可能となる。 Furthermore, in this second aspect, the information processing unit may include an extraction unit that extracts the event data of a predetermined number of octaves from the event data output from the event output unit, and the information processing unit may perform image recognition using the event data extracted by the extraction unit. This makes it possible, for example, to use the event data of various octaves output from a solid-state imaging device for image recognition on an octave-by-octave basis.
 また、この第2の側面において、前記画像認識は、ユーザーのジェスチャー認識でもよい。これにより例えば、固体撮像装置から出力された様々なオクターブのイベントデータを、オクターブごとにジェスチャー認識用に利用することが可能となる。 In addition, in this second aspect, the image recognition may be user gesture recognition. This makes it possible, for example, to use event data of various octaves output from a solid-state imaging device for gesture recognition for each octave.
 また、この第2の側面において、前記情報処理部は、前記イベント出力部から出力された前記イベントデータから、ユーザーにより指定されたオクターブ数の前記イベントデータを選択する選択部を含み、前記情報処理部は、前記選択部により選択された前記イベントデータを記録媒体に記録してもよい。これにより例えば、固体撮像装置から出力された様々なオクターブのイベントデータを、オクターブごとに記録することが可能となる。 In addition, in this second aspect, the information processing unit may include a selection unit that selects the event data for a number of octaves specified by a user from the event data output from the event output unit, and the information processing unit may record the event data selected by the selection unit on a recording medium. This makes it possible to record, for example, event data for various octaves output from a solid-state imaging device for each octave.
 また、この第2の側面において、前記情報処理システムは、前記固体撮像装置と前記情報処理部とを備える電子機器でもよい。これにより例えば、イベントデータの出力から、その後の情報処理までを、同じ電子機器で行うことが可能となる。 In addition, in this second aspect, the information processing system may be an electronic device including the solid-state imaging device and the information processing unit. This makes it possible, for example, to perform the output of event data and the subsequent information processing in the same electronic device.
 また、この第2の側面において、前記電子機器はさらに、前記表示画面を有する表示部を備えてもいてよい。これにより例えば、イベントデータの出力から、イベントデータの表示までを、同じ電子機器で行うことが可能となる。 In addition, in this second aspect, the electronic device may further include a display unit having the display screen. This makes it possible, for example, to output the event data and display the event data on the same electronic device.
 また、この第2の側面において、前記情報処理システムは、前記情報処理部を含む電子機器と、前記電子機器の外部に設けられ、前記固体撮像装置を含む撮像装置とを備えていてもよい。これにより例えば、イベントデータを利用した情報処理を、撮像装置の外部の電子機器で行うことが可能となる。 In addition, in this second aspect, the information processing system may include an electronic device including the information processing unit, and an imaging device that is provided outside the electronic device and includes the solid-state imaging device. This makes it possible, for example, to perform information processing using event data in an electronic device outside the imaging device.
 また、この第2の側面において、前記情報処理システムはさらに、前記電子機器の外部に設けられ、前記表示画面を有する表示装置を備えていてもよい。これにより例えば、イベントデータの表示を、撮像装置の外部の電子機器で行うことが可能となる。 In addition, in this second aspect, the information processing system may further include a display device provided outside the electronic device and having the display screen. This makes it possible, for example, to display event data on an electronic device external to the imaging device.
FIG. 1 is a block diagram showing the configuration of the vehicle 1 of the first embodiment.
FIG. 2 is a plan view showing the sensing areas of the vehicle 1 of the first embodiment.
FIG. 3 is a block diagram showing the configuration of the solid-state imaging device 100 of the first embodiment.
FIG. 4 is a diagram for explaining the pixel array 101 of the first embodiment.
FIG. 5 is a diagram for explaining the operation of the solid-state imaging device 100 of the first embodiment.
FIG. 6 is a diagram for explaining the operation of the event generating unit 103 of the first embodiment.
FIG. 7 is another diagram for explaining the operation of the event generating unit 103 of the first embodiment.
FIG. 8 is another diagram for explaining the operation of the event generating unit 103 of the first embodiment.
FIG. 9 is a diagram for explaining the operation of the event output unit 105 of the first embodiment.
FIG. 10 is a diagram showing an example of the electronic device 200 of the first embodiment.
FIG. 11 is a diagram showing another example of the electronic device 200 of the first embodiment.
FIG. 12 is a diagram for explaining the details of the electronic device 200 shown in FIG. 11.
FIG. 13 is a block diagram showing the configuration of the solid-state imaging device 100 of the second embodiment.
FIG. 14 is a diagram for explaining the operation of the event generating unit 103 of the second embodiment.
FIG. 15 is a perspective view schematically showing the configuration of the solid-state imaging device 100 of the third embodiment.
FIG. 16 is a plan view schematically showing the configuration of the light-receiving chip 120 of the third embodiment.
FIG. 17 is a plan view schematically showing the configuration of the detection chip 130 of the third embodiment.
FIG. 18 is a circuit diagram showing the configuration of each address event detection circuit 131a of the third embodiment.
FIG. 19 is a circuit diagram showing the configuration of the current-voltage conversion circuit 310 of the third embodiment.
FIG. 20 is a circuit diagram showing the configurations of the subtractor 330 and the quantizer 340 of the third embodiment.
FIG. 21 is a circuit diagram showing the configurations of the light-receiving chip 120 and the detection chip 130 of a modified example of the third embodiment.
 以下、本開示の実施形態を、図面を参照して説明する。 Embodiments of the present disclosure will be described below with reference to the drawings.
 (第1実施形態)
 (1)第1実施形態の車両1
 図1は、第1実施形態の車両1の構成を示すブロック図である。図1は、移動装置制御システムの一例である車両制御システム11の構成例を示している。
First Embodiment
(1) Vehicle 1 of the First Embodiment
Fig. 1 is a block diagram showing the configuration of a vehicle 1 according to the first embodiment. Fig. 1 shows an example of the configuration of a vehicle control system 11, which is an example of a mobility device control system.
 車両制御システム11は、車両1に設けられ、車両1の走行支援及び自動運転に関わる処理を行う。 The vehicle control system 11 is installed in the vehicle 1 and performs processing related to driving assistance and autonomous driving of the vehicle 1.
 車両制御システム11は、車両制御ECU(Electronic Control Unit)21、通信部22、地図情報蓄積部23、位置情報取得部24、外部認識センサ25、車内センサ26、車両センサ27、記憶部31、走行支援・自動運転制御部32、DMS(Driver Monitoring System)33、HMI(Human Machine Interface)34、及び、車両制御部35を備える。 The vehicle control system 11 includes a vehicle control ECU (Electronic Control Unit) 21, a communication unit 22, a map information storage unit 23, a location information acquisition unit 24, an external recognition sensor 25, an in-vehicle sensor 26, a vehicle sensor 27, a memory unit 31, a driving assistance/automated driving control unit 32, a DMS (Driver Monitoring System) 33, an HMI (Human Machine Interface) 34, and a vehicle control unit 35.
 車両制御ECU21、通信部22、地図情報蓄積部23、位置情報取得部24、外部認識センサ25、車内センサ26、車両センサ27、記憶部31、走行支援・自動運転制御部32、ドライバモニタリングシステム(DMS)33、ヒューマンマシーンインタフェース(HMI)34、及び、車両制御部35は、通信ネットワーク41を介して相互に通信可能に接続されている。通信ネットワーク41は、例えば、CAN(Controller Area Network)、LIN(Local Interconnect Network)、LAN(Local Area Network)、FlexRay(登録商標)、イーサネット(登録商標)といったディジタル双方向通信の規格に準拠した車載通信ネットワークやバス等により構成される。通信ネットワーク41は、伝送されるデータの種類によって使い分けられてもよい。例えば、車両制御に関するデータに対してCANが適用され、大容量データに対してイーサネットが適用されるようにしてもよい。なお、車両制御システム11の各部は、通信ネットワーク41を介さずに、例えば近距離無線通信(NFC(Near Field Communication))やBluetooth(登録商標)といった比較的近距離での通信を想定した無線通信を用いて直接的に接続される場合もある。 The vehicle control ECU 21, communication unit 22, map information storage unit 23, position information acquisition unit 24, external recognition sensor 25, in-vehicle sensor 26, vehicle sensor 27, memory unit 31, driving assistance/automatic driving control unit 32, driver monitoring system (DMS) 33, human machine interface (HMI) 34, and vehicle control unit 35 are connected to each other so as to be able to communicate with each other via a communication network 41. The communication network 41 is composed of an in-vehicle communication network or bus that complies with a digital two-way communication standard such as CAN (Controller Area Network), LIN (Local Interconnect Network), LAN (Local Area Network), FlexRay (registered trademark), or Ethernet (registered trademark). The communication network 41 may be used differently depending on the type of data being transmitted. For example, CAN may be applied to data related to vehicle control, and Ethernet may be applied to large-volume data. In addition, each part of the vehicle control system 11 may be directly connected without going through the communication network 41, using wireless communication intended for communication over relatively short distances, such as near field communication (NFC) or Bluetooth (registered trademark).
 なお、以下、車両制御システム11の各部が、通信ネットワーク41を介して通信を行う場合、通信ネットワーク41の記載を省略するものとする。例えば、車両制御ECU21と通信部22が通信ネットワーク41を介して通信を行う場合、単に車両制御ECU21と通信部22とが通信を行うと記載する。 Note that, hereinafter, when each part of the vehicle control system 11 communicates via the communication network 41, the description of the communication network 41 will be omitted. For example, when the vehicle control ECU 21 and the communication unit 22 communicate via the communication network 41, it will simply be described as the vehicle control ECU 21 and the communication unit 22 communicating with each other.
 [車両制御ECU21]
 車両制御ECU21は、例えば、CPU(Central Processing Unit)、MPU(Micro Processing Unit)といった各種のプロセッサにより構成される。車両制御ECU21は、車両制御システム11全体又は一部の機能の制御を行う。
[Vehicle control ECU 21]
The vehicle control ECU 21 is configured with various processors such as a CPU (Central Processing Unit), an MPU (Micro Processing Unit), etc. The vehicle control ECU 21 controls the entire or part of the functions of the vehicle control system 11.
 [通信部22]
 通信部22は、車内及び車外の様々な機器、他の車両、サーバ、基地局等と通信を行い、各種のデータの送受信を行う。このとき、通信部22は、複数の通信方式を用いて通信を行うことができる。
[Communication unit 22]
The communication unit 22 communicates with various devices inside and outside the vehicle, other vehicles, servers, base stations, etc., and transmits and receives various types of data. At this time, the communication unit 22 can communicate using a plurality of communication methods.
　通信部22が実行可能な車外との通信について、概略的に説明する。通信部22は、例えば、5G(第5世代移動通信システム)、LTE(Long Term Evolution)、DSRC(Dedicated Short Range Communications)等の無線通信方式により、基地局又はアクセスポイントを介して、外部ネットワーク上に存在するサーバ(以下、外部のサーバと呼ぶ)等と通信を行う。通信部22が通信を行う外部ネットワークは、例えば、インターネット、クラウドネットワーク、又は、事業者固有のネットワーク等である。通信部22が外部ネットワークに対して行う通信方式は、所定以上の通信速度、且つ、所定以上の距離間でディジタル双方向通信が可能な無線通信方式であれば、特に限定されない。 The following provides an overview of the communications with the outside of the vehicle that can be performed by the communication unit 22. The communication unit 22 communicates with servers (hereinafter referred to as external servers) on an external network via base stations or access points using wireless communication methods such as 5G (fifth generation mobile communication system), LTE (Long Term Evolution), and DSRC (Dedicated Short Range Communications). The external network with which the communication unit 22 communicates is, for example, the Internet, a cloud network, or an operator-specific network. The communication method that the communication unit 22 uses with the external network is not particularly limited as long as it is a wireless communication method capable of digital two-way communication at or above a predetermined communication speed and over at least a predetermined distance.
 また例えば、通信部22は、P2P(Peer To Peer)技術を用いて、自車の近傍に存在する端末と通信を行うことができる。自車の近傍に存在する端末は、例えば、歩行者や自転車等の比較的低速で移動する移動体が装着する端末、店舗等に位置が固定されて設置される端末、又は、MTC(Machine Type Communication)端末である。さらに、通信部22は、V2X通信を行うこともできる。V2X通信とは、例えば、他の車両との間の車車間(Vehicle to Vehicle)通信、路側器等との間の路車間(Vehicle to Infrastructure)通信、家との間(Vehicle to Home)の通信、及び、歩行者が所持する端末等との間の歩車間(Vehicle to Pedestrian)通信等の、自車と他との通信をいう。 Furthermore, for example, the communication unit 22 can communicate with a terminal present in the vicinity of the vehicle using P2P (Peer To Peer) technology. The terminal present in the vicinity of the vehicle can be, for example, a terminal attached to a mobile object moving at a relatively slow speed, such as a pedestrian or a bicycle, a terminal installed at a fixed position in a store, or an MTC (Machine Type Communication) terminal. Furthermore, the communication unit 22 can also perform V2X communication. V2X communication refers to communication between the vehicle and others, such as vehicle-to-vehicle communication with other vehicles, vehicle-to-infrastructure communication with roadside devices, vehicle-to-home communication with a home, and vehicle-to-pedestrian communication with a terminal carried by a pedestrian, etc.
 通信部22は、例えば、車両制御システム11の動作を制御するソフトウエアを更新するためのプログラムを外部から受信することができる(Over The Air)。通信部22は、さらに、地図情報、交通情報、車両1の周囲の情報等を外部から受信することができる。また例えば、通信部22は、車両1に関する情報や、車両1の周囲の情報等を外部に送信することができる。通信部22が外部に送信する車両1に関する情報としては、例えば、車両1の状態を示すデータ、認識部73による認識結果等がある。さらに例えば、通信部22は、eコール等の車両緊急通報システムに対応した通信を行う。 The communication unit 22 can, for example, receive from the outside a program for updating the software that controls the operation of the vehicle control system 11 (Over the Air). The communication unit 22 can further receive map information, traffic information, information about the surroundings of the vehicle 1, etc. from the outside. For example, the communication unit 22 can also transmit information about the vehicle 1 and information about the surroundings of the vehicle 1 to the outside. Information about the vehicle 1 that the communication unit 22 transmits to the outside includes, for example, data indicating the state of the vehicle 1, the recognition results by the recognition unit 73, etc. Furthermore, for example, the communication unit 22 performs communication corresponding to a vehicle emergency notification system such as e-Call.
 例えば、通信部22は、電波ビーコン、光ビーコン、FM多重放送等の道路交通情報通信システム(VICS(Vehicle Information and Communication System)(登録商標))により送信される電磁波を受信する。 For example, the communication unit 22 receives electromagnetic waves transmitted by a road traffic information and communication system (VICS (Vehicle Information and Communication System) (registered trademark)) such as a radio beacon, optical beacon, or FM multiplex broadcasting.
 通信部22が実行可能な車内との通信について、概略的に説明する。通信部22は、例えば無線通信を用いて、車内の各機器と通信を行うことができる。通信部22は、例えば、無線LAN、Bluetooth、NFC、WUSB(Wireless USB)といった、無線通信により所定以上の通信速度でディジタル双方向通信が可能な通信方式により、車内の機器と無線通信を行うことができる。これに限らず、通信部22は、有線通信を用いて車内の各機器と通信を行うこともできる。例えば、通信部22は、図示しない接続端子に接続されるケーブルを介した有線通信により、車内の各機器と通信を行うことができる。通信部22は、例えば、USB(Universal Serial Bus)、HDMI(High-Definition Multimedia Interface)(登録商標)、MHL(Mobile High-definition Link)といった、有線通信により所定以上の通信速度でディジタル双方向通信が可能な通信方式により、車内の各機器と通信を行うことができる。 The following provides an overview of the communication that the communication unit 22 can perform with the inside of the vehicle. The communication unit 22 can communicate with each device in the vehicle using, for example, wireless communication. The communication unit 22 can perform wireless communication with each device in the vehicle using a communication method that allows digital two-way communication at a communication speed equal to or higher than a predetermined speed via wireless communication, such as wireless LAN, Bluetooth, NFC, or WUSB (Wireless USB). Not limited to this, the communication unit 22 can also communicate with each device in the vehicle using wired communication. For example, the communication unit 22 can communicate with each device in the vehicle using wired communication via a cable connected to a connection terminal (not shown). The communication unit 22 can communicate with each device in the vehicle using a communication method that allows digital two-way communication at a communication speed equal to or higher than a predetermined speed via wired communication, such as USB (Universal Serial Bus), HDMI (High-Definition Multimedia Interface) (registered trademark), or MHL (Mobile High-definition Link).
 ここで、車内の機器とは、例えば、車内において通信ネットワーク41に接続されていない機器を指す。車内の機器としては、例えば、運転者等の搭乗者が所持するモバイル機器やウェアラブル機器、車内に持ち込まれ一時的に設置される情報機器等が想定される。 Here, the term "devices in the vehicle" refers to devices that are not connected to the communication network 41 in the vehicle. Examples of devices in the vehicle include mobile devices and wearable devices carried by passengers such as the driver, and information devices that are brought into the vehicle and temporarily installed.
 [地図情報蓄積部23]
 地図情報蓄積部23は、外部から取得した地図及び車両1で作成した地図の一方又は両方を蓄積する。例えば、地図情報蓄積部23は、3次元の高精度地図、高精度地図より精度が低く、広いエリアをカバーするグローバルマップ等を蓄積する。
[Map information storage unit 23]
The map information storage unit 23 stores one or both of a map acquired from an external source and a map created by the vehicle 1. For example, the map information storage unit 23 stores a three-dimensional high-precision map, a global map that has lower precision than a high-precision map and covers a wide area, and the like.
 高精度地図は、例えば、ダイナミックマップ、ポイントクラウドマップ、ベクターマップ等である。ダイナミックマップは、例えば、動的情報、準動的情報、準静的情報、静的情報の4層からなる地図であり、外部のサーバ等から車両1に提供される。ポイントクラウドマップは、ポイントクラウド(点群データ)により構成される地図である。ベクターマップは、例えば、車線や信号機の位置といった交通情報等をポイントクラウドマップに対応付け、ADAS(Advanced Driver Assistance System)やAD(Autonomous Driving)に適合させた地図である。 High-precision maps include, for example, dynamic maps, point cloud maps, and vector maps. A dynamic map is, for example, a map consisting of four layers of dynamic information, semi-dynamic information, semi-static information, and static information, and is provided to the vehicle 1 from an external server or the like. A point cloud map is a map composed of a point cloud (point group data). A vector map is, for example, a map that associates traffic information such as the positions of lanes and traffic lights with a point cloud map, and is adapted for ADAS (Advanced Driver Assistance System) and AD (Autonomous Driving).
 ポイントクラウドマップ及びベクターマップは、例えば、外部のサーバ等から提供されてもよいし、カメラ51、レーダ52、LiDAR53等によるセンシング結果に基づいて、後述するローカルマップとのマッチングを行うための地図として車両1で作成され、地図情報蓄積部23に蓄積されてもよい。また、外部のサーバ等から高精度地図が提供される場合、通信容量を削減するため、車両1がこれから走行する計画経路に関する、例えば数百メートル四方の地図データが外部のサーバ等から取得される。 The point cloud map and vector map may be provided, for example, from an external server, or may be created by the vehicle 1 based on sensing results from the camera 51, radar 52, LiDAR 53, etc. as a map for matching with a local map described below, and stored in the map information storage unit 23. In addition, when a high-precision map is provided from an external server, etc., map data of, for example, an area of several hundred meters square regarding the planned route along which the vehicle 1 will travel is acquired from the external server, etc., in order to reduce communication capacity.
 [位置情報取得部24]
 位置情報取得部24は、GNSS(Global Navigation Satellite System)衛星からGNSS信号を受信し、車両1の位置情報を取得する。取得した位置情報は、走行支援・自動運転制御部32に供給される。なお、位置情報取得部24は、GNSS信号を用いた方式に限定されず、例えば、ビーコンを用いて位置情報を取得してもよい。
[Location information acquisition unit 24]
The position information acquisition unit 24 receives GNSS signals from Global Navigation Satellite System (GNSS) satellites and acquires position information of the vehicle 1. The acquired position information is supplied to the driving assistance/automated driving control unit 32. Note that the position information acquisition unit 24 is not limited to a method using GNSS signals, and may acquire position information using a beacon, for example.
 [外部認識センサ25]
 外部認識センサ25は、車両1の外部の状況の認識に用いられる各種のセンサを備え、各センサからのセンサデータを車両制御システム11の各部に供給する。外部認識センサ25が備えるセンサの種類や数は任意である。
[External Recognition Sensor 25]
The external recognition sensor 25 includes various sensors used to recognize the situation outside the vehicle 1, and supplies sensor data from each sensor to each unit of the vehicle control system 11. The type and number of sensors included in the external recognition sensor 25 are arbitrary.
 例えば、外部認識センサ25は、カメラ51、レーダ52、LiDAR(Light Detection and Ranging、Laser Imaging Detection and Ranging)53、及び、超音波センサ54を備える。これに限らず、外部認識センサ25は、カメラ51、レーダ52、LiDAR53、及び、超音波センサ54のうち1種類以上のセンサを備える構成でもよい。カメラ51、レーダ52、LiDAR53、及び、超音波センサ54の数は、現実的に車両1に設置可能な数であれば特に限定されない。また、外部認識センサ25が備えるセンサの種類は、この例に限定されず、外部認識センサ25は、他の種類のセンサを備えてもよい。外部認識センサ25が備える各センサのセンシング領域の例は、後述する。 For example, the external recognition sensor 25 includes a camera 51, a radar 52, a LiDAR (Light Detection and Ranging, Laser Imaging Detection and Ranging) 53, and an ultrasonic sensor 54. Without being limited to this, the external recognition sensor 25 may be configured to include one or more types of sensors among the camera 51, the radar 52, the LiDAR 53, and the ultrasonic sensor 54. The number of cameras 51, radars 52, LiDAR 53, and ultrasonic sensors 54 is not particularly limited as long as it is a number that can be realistically installed on the vehicle 1. Furthermore, the types of sensors included in the external recognition sensor 25 are not limited to this example, and the external recognition sensor 25 may include other types of sensors. Examples of the sensing areas of each sensor included in the external recognition sensor 25 will be described later.
 なお、カメラ51の撮影方式は、特に限定されない。例えば、測距が可能な撮影方式であるToF(Time Of Flight)カメラ、ステレオカメラ、単眼カメラ、赤外線カメラといった各種の撮影方式のカメラを、必要に応じてカメラ51に適用することができる。これに限らず、カメラ51は、測距に関わらずに、単に撮影画像を取得するためのものであってもよい。 The imaging method of camera 51 is not particularly limited. For example, cameras of various imaging methods, such as a ToF (Time Of Flight) camera, a stereo camera, a monocular camera, and an infrared camera, which are imaging methods capable of distance measurement, can be applied to camera 51 as necessary. However, the present invention is not limited to this, and camera 51 may simply be used to obtain a photographed image, without being related to distance measurement.
 また、例えば、外部認識センサ25は、車両1に対する環境を検出するための環境センサを備えることができる。環境センサは、天候、気象、明るさ等の環境を検出するためのセンサであって、例えば、雨滴センサ、霧センサ、日照センサ、雪センサ、照度センサ等の各種センサを含むことができる。 Furthermore, for example, the external recognition sensor 25 can be equipped with an environmental sensor for detecting the environment relative to the vehicle 1. The environmental sensor is a sensor for detecting the environment such as the weather, climate, brightness, etc., and can include various sensors such as a raindrop sensor, a fog sensor, a sunlight sensor, a snow sensor, an illuminance sensor, etc.
 さらに、例えば、外部認識センサ25は、車両1の周囲の音や音源の位置の検出等に用いられるマイクロフォンを備える。 Furthermore, for example, the external recognition sensor 25 includes a microphone that is used to detect sounds around the vehicle 1 and the location of sound sources.
 [車内センサ26]
 車内センサ26は、車内の情報を検出するための各種のセンサを備え、各センサからのセンサデータを車両制御システム11の各部に供給する。車内センサ26が備える各種センサの種類や数は、現実的に車両1に設置可能な種類や数であれば特に限定されない。
[In-vehicle sensor 26]
The in-vehicle sensor 26 includes various sensors for detecting information inside the vehicle, and supplies sensor data from each sensor to each unit of the vehicle control system 11. The types and number of the various sensors included in the in-vehicle sensor 26 are not particularly limited as long as they are of the types and number that can be realistically installed in the vehicle 1.
 例えば、車内センサ26は、カメラ、レーダ、着座センサ、ステアリングホイールセンサ、マイクロフォン、生体センサのうち1種類以上のセンサを備えることができる。車内センサ26が備えるカメラとしては、例えば、ToFカメラ、ステレオカメラ、単眼カメラ、赤外線カメラといった、測距可能な各種の撮影方式のカメラを用いることができる。これに限らず、車内センサ26が備えるカメラは、測距に関わらずに、単に撮影画像を取得するためのものであってもよい。車内センサ26が備える生体センサは、例えば、シートやステアリングホイール等に設けられ、運転者等の搭乗者の各種の生体情報を検出する。 For example, the in-vehicle sensor 26 may be equipped with one or more types of sensors including a camera, radar, a seating sensor, a steering wheel sensor, a microphone, and a biometric sensor. The camera equipped in the in-vehicle sensor 26 may be a camera using various imaging methods capable of measuring distances, such as a ToF camera, a stereo camera, a monocular camera, or an infrared camera. Without being limited to this, the camera equipped in the in-vehicle sensor 26 may be a camera simply for acquiring captured images, regardless of distance measurement. The biometric sensor equipped in the in-vehicle sensor 26 is provided, for example, on a seat, steering wheel, etc., and detects various types of biometric information of passengers such as the driver.
 [車両センサ27]
 車両センサ27は、車両1の状態を検出するための各種のセンサを備え、各センサからのセンサデータを車両制御システム11の各部に供給する。車両センサ27が備える各種センサの種類や数は、現実的に車両1に設置可能な種類や数であれば特に限定されない。
[Vehicle sensor 27]
The vehicle sensor 27 includes various sensors for detecting the state of the vehicle 1, and supplies sensor data from each sensor to each unit of the vehicle control system 11. The types and number of the various sensors included in the vehicle sensor 27 are not particularly limited as long as they are types and numbers that can be realistically installed in the vehicle 1.
 例えば、車両センサ27は、速度センサ、加速度センサ、角速度センサ(ジャイロセンサ)、及び、それらを統合した慣性計測装置(IMU(Inertial Measurement Unit))を備える。例えば、車両センサ27は、ステアリングホイールの操舵角を検出する操舵角センサ、ヨーレートセンサ、アクセルペダルの操作量を検出するアクセルセンサ、及び、ブレーキペダルの操作量を検出するブレーキセンサを備える。例えば、車両センサ27は、エンジンやモータの回転数を検出する回転センサ、タイヤの空気圧を検出する空気圧センサ、タイヤのスリップ率を検出するスリップ率センサ、及び、車輪の回転速度を検出する車輪速センサを備える。例えば、車両センサ27は、バッテリの残量及び温度を検出するバッテリセンサ、並びに、外部からの衝撃を検出する衝撃センサを備える。 For example, the vehicle sensor 27 includes a speed sensor, an acceleration sensor, an angular velocity sensor (gyro sensor), and an inertial measurement unit (IMU) that integrates these. For example, the vehicle sensor 27 includes a steering angle sensor that detects the steering angle of the steering wheel, a yaw rate sensor, an accelerator sensor that detects the amount of accelerator pedal operation, and a brake sensor that detects the amount of brake pedal operation. For example, the vehicle sensor 27 includes a rotation sensor that detects the number of rotations of the engine or motor, an air pressure sensor that detects the air pressure of the tires, a slip ratio sensor that detects the slip ratio of the tires, and a wheel speed sensor that detects the rotation speed of the wheels. For example, the vehicle sensor 27 includes a battery sensor that detects the remaining charge and temperature of the battery, and an impact sensor that detects external impacts.
 [記憶部31]
 記憶部31は、不揮発性の記憶媒体及び揮発性の記憶媒体のうち少なくとも一方を含み、データやプログラムを記憶する。記憶部31は、例えばEEPROM(Electrically Erasable Programmable Read Only Memory)及びRAM(Random Access Memory)として用いられ、記憶媒体としては、HDD(Hard Disc Drive)といった磁気記憶デバイス、半導体記憶デバイス、光記憶デバイス、及び、光磁気記憶デバイスを適用することができる。記憶部31は、車両制御システム11の各部が用いる各種プログラムやデータを記憶する。例えば、記憶部31は、EDR(Event Data Recorder)やDSSAD(Data Storage System for Automated Driving)を備え、事故等のイベントの前後の車両1の情報や車内センサ26によって取得された情報を記憶する。
[Memory unit 31]
The storage unit 31 includes at least one of a non-volatile storage medium and a volatile storage medium, and stores data and programs. The storage unit 31 is used, for example, as an electrically erasable programmable read only memory (EEPROM) and a random access memory (RAM), and the storage medium may be a magnetic storage device such as a hard disk drive (HDD), a semiconductor storage device, an optical storage device, or a magneto-optical storage device. The storage unit 31 stores various programs and data used by each part of the vehicle control system 11. For example, the storage unit 31 includes an event data recorder (EDR) and a data storage system for automated driving (DSSAD), and stores information on the vehicle 1 before and after an event such as an accident, and information acquired by the in-vehicle sensor 26.
 [走行支援・自動運転制御部32]
 走行支援・自動運転制御部32は、車両1の走行支援及び自動運転の制御を行う。例えば、走行支援・自動運転制御部32は、分析部61、行動計画部62、及び、動作制御部63を備える。
[Driving assistance/automatic driving control unit 32]
The driving assistance/automatic driving control unit 32 controls driving assistance and automatic driving of the vehicle 1. For example, the driving assistance/automatic driving control unit 32 includes an analysis unit 61, an action planning unit 62, and an operation control unit 63.
 分析部61は、車両1及び周囲の状況の分析処理を行う。分析部61は、自己位置推定部71、センサフュージョン部72、及び、認識部73を備える。 The analysis unit 61 performs analysis processing of the vehicle 1 and the surrounding conditions. The analysis unit 61 includes a self-position estimation unit 71, a sensor fusion unit 72, and a recognition unit 73.
　自己位置推定部71は、外部認識センサ25からのセンサデータ、及び、地図情報蓄積部23に蓄積されている高精度地図に基づいて、車両1の自己位置を推定する。例えば、自己位置推定部71は、外部認識センサ25からのセンサデータに基づいてローカルマップを生成し、ローカルマップと高精度地図とのマッチングを行うことにより、車両1の自己位置を推定する。車両1の位置は、例えば、後輪対車軸の中心が基準とされる。 The self-position estimation unit 71 estimates the self-position of the vehicle 1 based on the sensor data from the external recognition sensor 25 and the high-precision map stored in the map information storage unit 23. For example, the self-position estimation unit 71 generates a local map based on the sensor data from the external recognition sensor 25, and estimates the self-position of the vehicle 1 by matching the local map with the high-precision map. The position of the vehicle 1 is referenced, for example, to the center of the rear axle.
 ローカルマップは、例えば、SLAM(Simultaneous Localization and Mapping)等の技術を用いて作成される3次元の高精度地図、占有格子地図(Occupancy Grid Map)等である。3次元の高精度地図は、例えば、上述したポイントクラウドマップ等である。占有格子地図は、車両1の周囲の3次元又は2次元の空間を所定の大きさのグリッド(格子)に分割し、グリッド単位で物体の占有状態を示す地図である。物体の占有状態は、例えば、物体の有無や存在確率により示される。ローカルマップは、例えば、認識部73による車両1の外部の状況の検出処理及び認識処理にも用いられる。 The local map is, for example, a three-dimensional high-precision map or an occupancy grid map created using technology such as SLAM (Simultaneous Localization and Mapping). The three-dimensional high-precision map is, for example, the point cloud map described above. The occupancy grid map is a map in which the three-dimensional or two-dimensional space around the vehicle 1 is divided into grids of a predetermined size, and the occupancy state of objects is shown on a grid-by-grid basis. The occupancy state of objects is indicated, for example, by the presence or absence of an object and the probability of its existence. The local map is also used, for example, in detection processing and recognition processing of the situation outside the vehicle 1 by the recognition unit 73.
 なお、自己位置推定部71は、位置情報取得部24により取得される位置情報、及び、車両センサ27からのセンサデータに基づいて、車両1の自己位置を推定してもよい。 The self-position estimation unit 71 may estimate the self-position of the vehicle 1 based on the position information acquired by the position information acquisition unit 24 and the sensor data from the vehicle sensor 27.
 センサフュージョン部72は、複数の異なる種類のセンサデータ(例えば、カメラ51から供給される画像データ、及び、レーダ52から供給されるセンサデータ)を組み合わせて、新たな情報を得るセンサフュージョン処理を行う。異なる種類のセンサデータを組合せる方法としては、統合、融合、連合等がある。 The sensor fusion unit 72 performs sensor fusion processing to combine multiple different types of sensor data (e.g., image data supplied from the camera 51 and sensor data supplied from the radar 52) to obtain new information. Methods for combining different types of sensor data include integration, fusion, and association.
 認識部73は、車両1の外部の状況の検出を行う検出処理、及び、車両1の外部の状況の認識を行う認識処理を実行する。 The recognition unit 73 executes a detection process to detect the situation outside the vehicle 1, and a recognition process to recognize the situation outside the vehicle 1.
 例えば、認識部73は、外部認識センサ25からの情報、自己位置推定部71からの情報、センサフュージョン部72からの情報等に基づいて、車両1の外部の状況の検出処理及び認識処理を行う。 For example, the recognition unit 73 performs detection and recognition processing of the situation outside the vehicle 1 based on information from the external recognition sensor 25, information from the self-position estimation unit 71, information from the sensor fusion unit 72, etc.
 具体的には、例えば、認識部73は、車両1の周囲の物体の検出処理及び認識処理等を行う。物体の検出処理とは、例えば、物体の有無、大きさ、形、位置、動き等を検出する処理である。物体の認識処理とは、例えば、物体の種類等の属性を認識したり、特定の物体を識別したりする処理である。ただし、検出処理と認識処理とは、必ずしも明確に分かれるものではなく、重複する場合がある。 Specifically, for example, the recognition unit 73 performs detection processing and recognition processing of objects around the vehicle 1. Object detection processing is, for example, processing to detect the presence or absence, size, shape, position, movement, etc. of an object. Object recognition processing is, for example, processing to recognize attributes such as the type of object, and to identify a specific object. However, detection processing and recognition processing are not necessarily clearly separated, and there may be overlap.
 例えば、認識部73は、レーダ52又はLiDAR53等によるセンサデータに基づくポイントクラウドを点群の塊毎に分類するクラスタリングを行うことにより、車両1の周囲の物体を検出する。これにより、車両1の周囲の物体の有無、大きさ、形状、位置が検出される。 For example, the recognition unit 73 detects objects around the vehicle 1 by performing clustering to classify a point cloud based on sensor data from the radar 52, the LiDAR 53, or the like into clusters of points. This allows the presence or absence, size, shape, and position of objects around the vehicle 1 to be detected.
 例えば、認識部73は、クラスタリングにより分類された点群の塊の動きを追従するトラッキングを行うことにより、車両1の周囲の物体の動きを検出する。これにより、車両1の周囲の物体の速度及び進行方向(移動ベクトル)が検出される。 For example, the recognition unit 73 detects the movement of objects around the vehicle 1 by performing tracking to follow the movement of clusters of point clouds classified by clustering. This allows the speed and direction of travel (movement vector) of objects around the vehicle 1 to be detected.
 例えば、認識部73は、カメラ51から供給される画像データに基づいて、車両、人、自転車、障害物、構造物、道路、信号機、交通標識、道路標示等を検出又は認識する。また、認識部73は、セマンティックセグメンテーション等の認識処理を行うことにより、車両1の周囲の物体の種類を認識してもよい。 For example, the recognition unit 73 detects or recognizes vehicles, people, bicycles, obstacles, structures, roads, traffic lights, traffic signs, road markings, etc. based on image data supplied from the camera 51. The recognition unit 73 may also recognize the types of objects around the vehicle 1 by performing recognition processing such as semantic segmentation.
 例えば、認識部73は、地図情報蓄積部23に蓄積されている地図、自己位置推定部71による自己位置の推定結果、及び、認識部73による車両1の周囲の物体の認識結果に基づいて、車両1の周囲の交通ルールの認識処理を行うことができる。認識部73は、この処理により、信号機の位置及び状態、交通標識及び道路標示の内容、交通規制の内容、並びに、走行可能な車線等を認識することができる。 For example, the recognition unit 73 can perform recognition processing of traffic rules around the vehicle 1 based on the map stored in the map information storage unit 23, the result of self-location estimation by the self-location estimation unit 71, and the result of recognition of objects around the vehicle 1 by the recognition unit 73. Through this processing, the recognition unit 73 can recognize the positions and states of traffic lights, the contents of traffic signs and road markings, the contents of traffic regulations, and lanes on which travel is possible, etc.
 例えば、認識部73は、車両1の周囲の環境の認識処理を行うことができる。認識部73が認識対象とする周囲の環境としては、天候、気温、湿度、明るさ、及び、路面の状態等が想定される。 For example, the recognition unit 73 can perform recognition processing of the environment around the vehicle 1. The surrounding environment that the recognition unit 73 recognizes may include weather, temperature, humidity, brightness, and road surface conditions.
 行動計画部62は、車両1の行動計画を作成する。例えば、行動計画部62は、経路計画、経路追従の処理を行うことにより、行動計画を作成する。 The behavior planning unit 62 creates a behavior plan for the vehicle 1. For example, the behavior planning unit 62 creates the behavior plan by performing route planning and route following processing.
　なお、経路計画(Global path planning)とは、スタートからゴールまでの大まかな経路を計画する処理である。この経路計画には、軌道計画と言われ、計画した経路において、車両1の運動特性を考慮して、車両1の近傍で安全かつ滑らかに進行することが可能な軌道生成(Local path planning)を行う処理も含まれる。 Global path planning is a process that plans a rough route from the start to the goal. This route planning also includes a process called trajectory planning (local path planning), which, taking into account the motion characteristics of the vehicle 1 on the planned route, generates a trajectory along which the vehicle 1 can proceed safely and smoothly in its vicinity.
 経路追従とは、経路計画により計画された経路を計画された時間内で安全かつ正確に走行するための動作を計画する処理である。行動計画部62は、例えば、この経路追従の処理の結果に基づき、車両1の目標速度と目標角速度を計算することができる。 Path following is a process of planning operations for traveling safely and accurately along a route planned by a route plan within a planned time. The action planning unit 62 can, for example, calculate the target speed and target angular velocity of the vehicle 1 based on the results of this path following process.
 動作制御部63は、行動計画部62により作成された行動計画を実現するために、車両1の動作を制御する。 The operation control unit 63 controls the operation of the vehicle 1 to realize the action plan created by the action planning unit 62.
 例えば、動作制御部63は、後述する車両制御部35に含まれる、ステアリング制御部81、ブレーキ制御部82、及び、駆動制御部83を制御して、軌道計画により計算された軌道を車両1が進行するように、加減速制御及び方向制御を行う。例えば、動作制御部63は、衝突回避又は衝撃緩和、追従走行、車速維持走行、自車の衝突警告、自車のレーン逸脱警告等のADASの機能実現を目的とした協調制御を行う。例えば、動作制御部63は、運転者の操作によらずに自律的に走行する自動運転等を目的とした協調制御を行う。 For example, the operation control unit 63 controls the steering control unit 81, the brake control unit 82, and the drive control unit 83 included in the vehicle control unit 35 described below, and performs acceleration/deceleration control and directional control so that the vehicle 1 proceeds along the trajectory calculated by the trajectory plan. For example, the operation control unit 63 performs cooperative control aimed at realizing ADAS functions such as collision avoidance or impact mitigation, following driving, maintaining vehicle speed, collision warning for the vehicle itself, and lane departure warning for the vehicle itself. For example, the operation control unit 63 performs cooperative control aimed at automatic driving, which drives autonomously without the driver's operation.
 [DMS33]
 DMS33は、車内センサ26からのセンサデータ、及び、後述するHMI34に入力される入力データ等に基づいて、運転者の認証処理、及び、運転者の状態の認識処理等を行う。認識対象となる運転者の状態としては、例えば、体調、覚醒度、集中度、疲労度、視線方向、酩酊度、運転操作、姿勢等が想定される。
[DMS33]
The DMS 33 performs authentication processing of the driver and recognition processing of the driver's state based on the sensor data from the in-vehicle sensor 26 and input data input to the HMI 34 (described later), etc. Examples of the driver's state to be recognized include physical condition, alertness level, concentration level, fatigue level, line of sight direction, level of intoxication, driving operation, posture, etc.
 なお、DMS33が、運転者以外の搭乗者の認証処理、及び、当該搭乗者の状態の認識処理を行うようにしてもよい。また、例えば、DMS33が、車内センサ26からのセンサデータに基づいて、車内の状況の認識処理を行うようにしてもよい。認識対象となる車内の状況としては、例えば、気温、湿度、明るさ、臭い等が想定される。 The DMS 33 may also perform authentication processing for passengers other than the driver and recognition processing for the status of the passengers. For example, the DMS 33 may also perform recognition processing for the situation inside the vehicle based on sensor data from the in-vehicle sensor 26. Examples of the situation inside the vehicle that may be recognized include temperature, humidity, brightness, odor, etc.
 [HMI34]
 HMI34は、各種のデータや指示等の入力と、各種のデータの運転者等への提示を行う。
[HMI34]
The HMI 34 inputs various data and instructions, and presents various data to the driver, etc.
 HMI34によるデータの入力について、概略的に説明する。HMI34は、人がデータを入力するための入力デバイスを備える。HMI34は、入力デバイスにより入力されたデータや指示等に基づいて入力信号を生成し、車両制御システム11の各部に供給する。HMI34は、入力デバイスとして、例えばタッチパネル、ボタン、スイッチ、及び、レバーといった操作子を備える。これに限らず、HMI34は、音声やジェスチャ等により手動操作以外の方法で情報を入力可能な入力デバイスをさらに備えてもよい。さらに、HMI34は、例えば、赤外線又は電波を利用したリモートコントロール装置や、車両制御システム11の操作に対応したモバイル機器又はウェアラブル機器等の外部接続機器を入力デバイスとして用いてもよい。 The following provides an overview of data input by the HMI 34. The HMI 34 is equipped with an input device that allows a person to input data. The HMI 34 generates input signals based on data and instructions input by the input device, and supplies the signals to each part of the vehicle control system 11. The HMI 34 is equipped with input devices such as a touch panel, buttons, switches, and levers. Without being limited to these, the HMI 34 may further be equipped with an input device that allows information to be input by a method other than manual operation, such as voice or gestures. Furthermore, the HMI 34 may use, as an input device, an externally connected device such as a remote control device that uses infrared rays or radio waves, or a mobile device or wearable device that supports the operation of the vehicle control system 11.
 HMI34によるデータの提示について、概略的に説明する。HMI34は、搭乗者又は車外に対する視覚情報、聴覚情報、及び、触覚情報の生成を行う。また、HMI34は、生成された各情報の出力、出力内容、出力タイミング及び出力方法等を制御する出力制御を行う。HMI34は、視覚情報として、例えば、操作画面、車両1の状態表示、警告表示、車両1の周囲の状況を示すモニタ画像等の画像や光により示される情報を生成及び出力する。また、HMI34は、聴覚情報として、例えば、音声ガイダンス、警告音、警告メッセージ等の音により示される情報を生成及び出力する。さらに、HMI34は、触覚情報として、例えば、力、振動、動き等により搭乗者の触覚に与えられる情報を生成及び出力する。 The presentation of data by the HMI 34 will be briefly described below. The HMI 34 generates visual information, auditory information, and tactile information for the occupant or the outside of the vehicle. The HMI 34 also performs output control to control the output, output content, output timing, output method, etc. of each piece of generated information. The HMI 34 generates and outputs, as visual information, information indicated by images or light, such as an operation screen, a status display of the vehicle 1, a warning display, and a monitor image showing the situation around the vehicle 1. The HMI 34 also generates and outputs, as auditory information, information indicated by sounds, such as voice guidance, warning sounds, and warning messages. The HMI 34 also generates and outputs, as tactile information, information that is imparted to the occupant's sense of touch by, for example, force, vibration, movement, etc.
 HMI34が視覚情報を出力する出力デバイスとしては、例えば、自身が画像を表示することで視覚情報を提示する表示装置や、画像を投影することで視覚情報を提示するプロジェクタ装置を適用することができる。なお、表示装置は、通常のディスプレイを有する表示装置以外にも、例えば、ヘッドアップディスプレイ、透過型ディスプレイ、AR(Augmented Reality)機能を備えるウエアラブルデバイスといった、搭乗者の視界内に視覚情報を表示する装置であってもよい。また、HMI34は、車両1に設けられるナビゲーション装置、インストルメントパネル、CMS(Camera Monitoring System)、電子ミラー、ランプ等が有する表示デバイスを、視覚情報を出力する出力デバイスとして用いることも可能である。 The output device from which the HMI 34 outputs visual information may be, for example, a display device that presents visual information by displaying an image itself, or a projector device that presents visual information by projecting an image. Note that the display device may be a device that displays visual information within the field of vision of the passenger, such as a head-up display, a transmissive display, or a wearable device with an AR (Augmented Reality) function, in addition to a display device having a normal display. The HMI 34 may also use display devices such as a navigation device, an instrument panel, a CMS (Camera Monitoring System), an electronic mirror, or lamps provided in the vehicle 1 as output devices that output visual information.
 HMI34が聴覚情報を出力する出力デバイスとしては、例えば、オーディオスピーカ、ヘッドホン、イヤホンを適用することができる。 The output device through which the HMI 34 outputs auditory information can be, for example, an audio speaker, headphones, or earphones.
 HMI34が触覚情報を出力する出力デバイスとしては、例えば、ハプティクス技術を用いたハプティクス素子を適用することができる。ハプティクス素子は、例えば、ステアリングホイール、シートといった、車両1の搭乗者が接触する部分に設けられる。 Haptic elements using haptic technology can be used as output devices for the HMI 34 to output haptic information. Haptic elements are provided on parts of the vehicle 1 that are in contact with passengers, such as the steering wheel and the seat.
 [車両制御部35]
 車両制御部35は、車両1の各部の制御を行う。車両制御部35は、ステアリング制御部81、ブレーキ制御部82、駆動制御部83、ボディ系制御部84、ライト制御部85、及び、ホーン制御部86を備える。
[Vehicle control unit 35]
The vehicle control unit 35 controls each unit of the vehicle 1. The vehicle control unit 35 includes a steering control unit 81, a brake control unit 82, a drive control unit 83, a body system control unit 84, a light control unit 85, and a horn control unit 86.
 ステアリング制御部81は、車両1のステアリングシステムの状態の検出及び制御等を行う。ステアリングシステムは、例えば、ステアリングホイール等を備えるステアリング機構、電動パワーステアリング等を備える。ステアリング制御部81は、例えば、ステアリングシステムの制御を行うステアリングECU、ステアリングシステムの駆動を行うアクチュエータ等を備える。 The steering control unit 81 detects and controls the state of the steering system of the vehicle 1. The steering system includes, for example, a steering mechanism including a steering wheel, an electric power steering, etc. The steering control unit 81 includes, for example, a steering ECU that controls the steering system, an actuator that drives the steering system, etc.
 ブレーキ制御部82は、車両1のブレーキシステムの状態の検出及び制御等を行う。ブレーキシステムは、例えば、ブレーキペダル等を含むブレーキ機構、ABS(Antilock Brake System)、回生ブレーキ機構等を備える。ブレーキ制御部82は、例えば、ブレーキシステムの制御を行うブレーキECU、ブレーキシステムの駆動を行うアクチュエータ等を備える。 The brake control unit 82 detects and controls the state of the brake system of the vehicle 1. The brake system includes, for example, a brake mechanism including a brake pedal, an ABS (Antilock Brake System), a regenerative brake mechanism, etc. The brake control unit 82 includes, for example, a brake ECU that controls the brake system, and an actuator that drives the brake system.
 駆動制御部83は、車両1の駆動システムの状態の検出及び制御等を行う。駆動システムは、例えば、アクセルペダル、内燃機関又は駆動用モータ等の駆動力を発生させるための駆動力発生装置、駆動力を車輪に伝達するための駆動力伝達機構等を備える。駆動制御部83は、例えば、駆動システムの制御を行う駆動ECU、駆動システムの駆動を行うアクチュエータ等を備える。 The drive control unit 83 detects and controls the state of the drive system of the vehicle 1. The drive system includes, for example, an accelerator pedal, a drive force generating device for generating drive force such as an internal combustion engine or a drive motor, and a drive force transmission mechanism for transmitting the drive force to the wheels. The drive control unit 83 includes, for example, a drive ECU for controlling the drive system, and an actuator for driving the drive system.
 ボディ系制御部84は、車両1のボディ系システムの状態の検出及び制御等を行う。ボディ系システムは、例えば、キーレスエントリシステム、スマートキーシステム、パワーウインドウ装置、パワーシート、空調装置、エアバッグ、シートベルト、シフトレバー等を備える。ボディ系制御部84は、例えば、ボディ系システムの制御を行うボディ系ECU、ボディ系システムの駆動を行うアクチュエータ等を備える。 The body system control unit 84 detects and controls the state of the body system of the vehicle 1. The body system includes, for example, a keyless entry system, a smart key system, a power window device, a power seat, an air conditioning system, an airbag, a seat belt, a shift lever, etc. The body system control unit 84 includes, for example, a body system ECU that controls the body system, an actuator that drives the body system, etc.
 ライト制御部85は、車両1の各種のライトの状態の検出及び制御等を行う。制御対象となるライトとしては、例えば、ヘッドライト、バックライト、フォグライト、ターンシグナル、ブレーキライト、プロジェクション、バンパーの表示等が想定される。ライト制御部85は、ライトの制御を行うライトECU、ライトの駆動を行うアクチュエータ等を備える。 The light control unit 85 detects and controls the state of various lights of the vehicle 1. Examples of lights to be controlled include headlights, backlights, fog lights, turn signals, brake lights, projection, and bumper displays. The light control unit 85 includes a light ECU that controls the lights, an actuator that drives the lights, and the like.
 ホーン制御部86は、車両1のカーホーンの状態の検出及び制御等を行う。ホーン制御部86は、例えば、カーホーンの制御を行うホーンECU、カーホーンの駆動を行うアクチュエータ等を備える。 The horn control unit 86 detects and controls the state of the car horn of the vehicle 1. The horn control unit 86 includes, for example, a horn ECU that controls the car horn, an actuator that drives the car horn, etc.
 図2は、第1実施形態の車両1のセンシング領域を示す平面図である。図2は、図1の外部認識センサ25のカメラ51、レーダ52、LiDAR53、及び、超音波センサ54等によるセンシング領域の例を示している。なお、図2において、車両1を上面から見た様子が模式的に示され、左端側が車両1の前端(フロント)側であり、右端側が車両1の後端(リア)側となっている。 FIG. 2 is a plan view showing the sensing area of the vehicle 1 in the first embodiment. FIG. 2 shows an example of the sensing area of the camera 51, radar 52, LiDAR 53, ultrasonic sensor 54, etc. of the external recognition sensor 25 in FIG. 1. Note that FIG. 2 shows a schematic view of the vehicle 1 as seen from above, with the left end side being the front end of the vehicle 1 and the right end side being the rear end of the vehicle 1.
 [センシング領域1-1F,B]
 センシング領域1-1F及びセンシング領域1-1Bは、超音波センサ54のセンシング領域の例を示している。センシング領域1-1Fは、複数の超音波センサ54によって車両1の前端周辺をカバーしている。センシング領域1-1Bは、複数の超音波センサ54によって車両1の後端周辺をカバーしている。
[Sensing area 1-1F, B]
Sensing area 1-1F and sensing area 1-1B are examples of sensing areas of the ultrasonic sensors 54. The sensing area 1-1F covers the periphery of the front end of the vehicle 1 with a plurality of ultrasonic sensors 54. The sensing area 1-1B covers the periphery of the rear end of the vehicle 1 with a plurality of ultrasonic sensors 54.
 センシング領域1-1F及びセンシング領域1-1Bにおけるセンシング結果は、例えば、車両1の駐車支援等に用いられる。 The sensing results in sensing area 1-1F and sensing area 1-1B are used, for example, for parking assistance for vehicle 1.
 [センシング領域1-2F,B,L,R]
　センシング領域1-2F乃至センシング領域1-2Rは、短距離又は中距離用のレーダ52のセンシング領域の例を示している。センシング領域1-2Fは、車両1の前方において、センシング領域1-1Fより遠い位置までカバーしている。センシング領域1-2Bは、車両1の後方において、センシング領域1-1Bより遠い位置までカバーしている。センシング領域1-2Lは、車両1の左側面の後方の周辺をカバーしている。センシング領域1-2Rは、車両1の右側面の後方の周辺をカバーしている。
[Sensing areas 1-2F, B, L, R]
Sensing areas 1-2F to 1-2R show examples of sensing areas of a short-range or medium-range radar 52. Sensing area 1-2F covers a position farther in front of the vehicle 1 than sensing area 1-1F. Sensing area 1-2B covers a position farther in the rear of the vehicle 1 than sensing area 1-1B. Sensing area 1-2L covers the rear periphery of the left side of the vehicle 1. Sensing area 1-2R covers the rear periphery of the right side of the vehicle 1.
 センシング領域1-2Fにおけるセンシング結果は、例えば、車両1の前方に存在する車両や歩行者等の検出等に用いられる。センシング領域1-2Bにおけるセンシング結果は、例えば、車両1の後方の衝突防止機能等に用いられる。センシング領域1-2L及びセンシング領域1-2Rにおけるセンシング結果は、例えば、車両1の側方の死角における物体の検出等に用いられる。 The sensing results in sensing area 1-2F are used, for example, to detect vehicles, pedestrians, etc. in front of vehicle 1. The sensing results in sensing area 1-2B are used, for example, for collision prevention functions behind vehicle 1. The sensing results in sensing area 1-2L and sensing area 1-2R are used, for example, to detect objects in blind spots to the sides of vehicle 1.
 [センシング領域1-3F,B,L,R]
　センシング領域1-3F乃至センシング領域1-3Rは、カメラ51によるセンシング領域の例を示している。センシング領域1-3Fは、車両1の前方において、センシング領域1-2Fより遠い位置までカバーしている。センシング領域1-3Bは、車両1の後方において、センシング領域1-2Bより遠い位置までカバーしている。センシング領域1-3Lは、車両1の左側面の周辺をカバーしている。センシング領域1-3Rは、車両1の右側面の周辺をカバーしている。
[Sensing areas 1-3F, B, L, R]
Sensing areas 1-3F to 1-3R show examples of sensing areas sensed by camera 51. Sensing area 1-3F covers a position farther in front of vehicle 1 than sensing area 1-2F. Sensing area 1-3B covers a position farther in the rear of vehicle 1 than sensing area 1-2B. Sensing area 1-3L covers the periphery of the left side of vehicle 1. Sensing area 1-3R covers the periphery of the right side of vehicle 1.
 センシング領域1-3Fにおけるセンシング結果は、例えば、信号機や交通標識の認識、車線逸脱防止支援システム、自動ヘッドライト制御システムに用いることができる。センシング領域1-3Bにおけるセンシング結果は、例えば、駐車支援、及び、サラウンドビューシステムに用いることができる。センシング領域1-3L及びセンシング領域1-3Rにおけるセンシング結果は、例えば、サラウンドビューシステムに用いることができる。 The sensing results in sensing area 1-3F can be used, for example, for recognizing traffic lights and traffic signs, lane departure prevention support systems, and automatic headlight control systems. The sensing results in sensing area 1-3B can be used, for example, for parking assistance and surround view systems. The sensing results in sensing area 1-3L and sensing area 1-3R can be used, for example, for surround view systems.
 [センシング領域1-4]
 センシング領域1-4は、LiDAR53のセンシング領域の例を示している。センシング領域1-4は、車両1の前方において、センシング領域1-3Fより遠い位置までカバーしている。一方、センシング領域1-4は、センシング領域1-3Fより左右方向の範囲が狭くなっている。
[Sensing area 1-4]
A sensing area 1-4 shows an example of a sensing area of the LiDAR 53. The sensing area 1-4 covers a position farther in front of the vehicle 1 than the sensing area 1-3F. On the other hand, the sensing area 1-4 has a narrower range in the left-right direction than the sensing area 1-3F.
 センシング領域1-4におけるセンシング結果は、例えば、周辺車両等の物体検出に用いられる。 The sensing results in sensing areas 1-4 are used, for example, to detect objects such as nearby vehicles.
 [センシング領域1-5]
 センシング領域1-5は、長距離用のレーダ52のセンシング領域の例を示している。センシング領域1-5は、車両1の前方において、センシング領域1-4より遠い位置までカバーしている。一方、センシング領域1-5は、センシング領域1-4より左右方向の範囲が狭くなっている。
[Sensing Areas 1-5]
A sensing area 1-5 shows an example of a sensing area of a long-range radar 52. The sensing area 1-5 covers a position farther ahead of the vehicle 1 than the sensing area 1-4. On the other hand, the sensing area 1-5 has a narrower range in the left-right direction than the sensing area 1-4.
 センシング領域1-5におけるセンシング結果は、例えば、ACC(Adaptive Cruise Control)、緊急ブレーキ、衝突回避等に用いられる。 The sensing results in sensing areas 1-5 are used, for example, for ACC (Adaptive Cruise Control), emergency braking, collision avoidance, etc.
 なお、外部認識センサ25が含むカメラ51、レーダ52、LiDAR53、及び、超音波センサ54の各センサのセンシング領域は、図2以外に各種の構成をとってもよい。具体的には、超音波センサ54が車両1の側方もセンシングするようにしてもよいし、LiDAR53が車両1の後方をセンシングするようにしてもよい。また、各センサの設置位置は、上述した各例に限定されない。また、各センサの数は、1つでもよいし、複数であってもよい。 The sensing areas of the cameras 51, radar 52, LiDAR 53, and ultrasonic sensors 54 included in the external recognition sensor 25 may have various configurations other than those shown in FIG. 2. Specifically, the ultrasonic sensor 54 may also sense the sides of the vehicle 1, and the LiDAR 53 may sense the rear of the vehicle 1. The installation positions of the sensors are not limited to the examples described above. The number of sensors may be one or more.
 (2)第1実施形態の固体撮像装置100
 図3は、第1実施形態の固体撮像装置100の構成を示すブロック図である。
(2) Solid-state imaging device 100 according to the first embodiment
FIG. 3 is a block diagram showing the configuration of the solid-state imaging device 100 according to the first embodiment.
 固体撮像装置100は、図1に示す車両1に設けられており、例えば、外部認識センサ25に含まれている。固体撮像装置100は、被写体の変化を検出するためのEVSである。被写体の例は、車両1の前方に存在する人間、車両、障害物などである。なお、固体撮像装置100は、後述する例のように、スマートフォンなどの電子機器200に内蔵されていてもよいし、ゲームコンソールなどの電子機器200に電気的に接続されていてもよい(図10および図11を参照)。 The solid-state imaging device 100 is provided in the vehicle 1 shown in FIG. 1, and is included in the external recognition sensor 25, for example. The solid-state imaging device 100 is an EVS for detecting changes in a subject. Examples of subjects include a person, a vehicle, and an obstacle in front of the vehicle 1. The solid-state imaging device 100 may be built into an electronic device 200 such as a smartphone, as in an example described later, or may be electrically connected to an electronic device 200 such as a game console (see FIGS. 10 and 11).
 固体撮像装置100は、図3に示すように、画素アレイ101と、イベント取得部102と、イベント生成部103と、イベント合成部104と、イベント出力部105とを備えている。画素アレイ101は、複数の画素101aを含んでいる。イベント生成部103は、第1フィルタ部103aと、第2フィルタ部103bと、第3フィルタ部103cとを含んでいる。イベント合成部104は、オクターブ情報付加部104aと、出力タイミング調整部104bとを含んでいる。イベント出力部105は、イベントデータ選択部105aと、イベントデータ形成部105bとを含んでいる。 As shown in FIG. 3, the solid-state imaging device 100 includes a pixel array 101, an event acquisition unit 102, an event generation unit 103, an event synthesis unit 104, and an event output unit 105. The pixel array 101 includes a plurality of pixels 101a. The event generation unit 103 includes a first filter unit 103a, a second filter unit 103b, and a third filter unit 103c. The event synthesis unit 104 includes an octave information addition unit 104a and an output timing adjustment unit 104b. The event output unit 105 includes an event data selection unit 105a and an event data formation unit 105b.
 [画素アレイ101]
 画素アレイ101は、2次元アレイ状(マトリクス状)に配置された複数の画素101aを含んでいる。図3において、紙面上の横方向(水平方向)は、画素アレイ101の行方向に対応しており、紙面上の縦方向(垂直方向)は、画素アレイ101の列方向に対応している。
[Pixel array 101]
The pixel array 101 includes a plurality of pixels 101a arranged in a two-dimensional array (matrix). In FIG. 3, the horizontal direction on the paper corresponds to the row direction of the pixel array 101, and the vertical direction on the paper corresponds to the column direction of the pixel array 101.
 各画素101aは、オンイベントやオフイベントなどのイベントを検出する機能を有する。オンイベントは、画素101aの輝度が増加した場合において、輝度の変化量(増加量)の絶対値が閾値よりも大きい場合に発火する。オフイベントは、画素101aの輝度が減少した場合において、輝度の変化量(減少量)の絶対値が閾値よりも大きい場合に発火する。例えば、オンイベントは、画素101aに被写体が入ってきた際に発火し、オフイベントは、画素101aから被写体が出ていった際に発火する。そして、各画素101aは、イベントの検出結果を示すイベントデータを出力する。 Each pixel 101a has the function of detecting events such as on events and off events. An on event is fired when the luminance of the pixel 101a increases and the absolute value of the amount of change (increase) in luminance is greater than a threshold value. An off event is fired when the luminance of the pixel 101a decreases and the absolute value of the amount of change (decrease) in luminance is greater than a threshold value. For example, an on event is fired when a subject enters the pixel 101a, and an off event is fired when a subject leaves the pixel 101a. Each pixel 101a then outputs event data indicating the event detection result.
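 As a rough illustration of the on-event/off-event rule described above, the following is a minimal sketch in Python. The function name, the use of a plain luminance difference, and the single shared threshold are assumptions made for illustration only and do not describe the actual pixel circuit of the device.

# Minimal sketch of the on-event / off-event rule described above.
# detect_event, the plain luminance difference, and the single shared
# threshold are illustrative assumptions, not the device's pixel circuit.

def detect_event(prev_luminance: float, curr_luminance: float, threshold: float) -> int:
    """Return +1 for an on-event, -1 for an off-event, 0 when no event fires."""
    delta = curr_luminance - prev_luminance
    if abs(delta) <= threshold:
        return 0                      # change not larger than the threshold: no event
    return 1 if delta > 0 else -1     # sign of the change selects on-event / off-event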
 [イベント取得部102]
 イベント取得部102は、画素アレイ101内の各画素101aからイベント(イベントデータ)を取得する。本実施形態では、固体撮像装置100がスキャン型EVSであるため、イベント取得部102は、画素アレイ101の複数の行を順番にスキャンし、画素アレイ101からイベントデータを行ごとに取得する。
[Event Acquisition Unit 102]
The event acquiring unit 102 acquires an event (event data) from each pixel 101a in the pixel array 101. In this embodiment, since the solid-state imaging device 100 is a scan-type EVS, the event acquiring unit 102 sequentially scans multiple rows of the pixel array 101 and acquires event data from the pixel array 101 for each row.
 画素アレイ101の各行のイベントデータは、第1オクターブのイベントデータV1として取り扱われる。イベント取得部102は、第1オクターブのイベントデータV1を、イベント生成部103およびイベント合成部104に出力する。本実施形態では、イベント取得部102が取得した第1オクターブのイベントデータV1が、イベント合成部104に送られ、その複製が、イベント生成部103に送られる。 The event data for each row of the pixel array 101 is treated as event data V1 of the first octave. The event acquisition unit 102 outputs the event data V1 of the first octave to the event generation unit 103 and the event synthesis unit 104. In this embodiment, the event data V1 of the first octave acquired by the event acquisition unit 102 is sent to the event synthesis unit 104, and a copy of the event data V1 is sent to the event generation unit 103.
 図3は、第1オクターブのイベントデータV1を、一列に並んだ複数の領域P1で模式的に示している。各領域P1は、第1オクターブのイベントデータV1における、1画素101a分のイベントデータを示している。例えば、4つの領域P1は、4画素101a分のイベントデータに対応している。第1オクターブのイベントデータV1を、第1オクターブのイベント発火有無列とも呼ぶ。第1オクターブのイベントデータV1は、1行分の画素101aのイベントデータを表す文字列のデータとなっている。本実施形態では、ある画素101aでイベントが発火したか否かを、その画素101aのイベントデータを取得することで知ることができる。 FIG. 3 shows the first octave event data V1 as a series of multiple regions P1 arranged in a row. Each region P1 represents the event data for one pixel 101a in the first octave event data V1. For example, four regions P1 correspond to event data for four pixels 101a. The first octave event data V1 is also called the first octave event firing/non-firing column. The first octave event data V1 is character string data representing the event data for one row of pixels 101a. In this embodiment, it is possible to know whether an event has fired at a pixel 101a by acquiring the event data for that pixel 101a.
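 To make the notion of an event firing/non-firing sequence concrete, here is a small hypothetical encoding of one scanned row. The '+', '-', '.' symbols and the function name are illustrative choices, not a data format defined by the device.

# Hypothetical encoding of one scanned row as a first-octave firing/non-firing
# sequence; symbols and names are illustrative only.

def encode_row(events: list[int]) -> str:
    """events holds +1 / -1 / 0 per pixel of one row (e.g. from detect_event above)."""
    symbols = {1: '+', -1: '-', 0: '.'}
    return ''.join(symbols[e] for e in events)

# Example: a 4-pixel row in which only the second pixel fired an on-event.
assert encode_row([0, 1, 0, 0]) == '.+..'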
 [イベント生成部103]
 イベント生成部103は、イベント取得部102から出力されたイベントデータを第1オクターブのイベントデータV1とし、第1オクターブのイベントデータV1から第2~第iオクターブ(iは2以上の整数)のイベントデータV2~Viを生成する。図3は、i=4の例、すなわち、第1オクターブのイベントデータV1から第2~第4オクターブのイベントデータV2~V4を生成する例を示している。
[Event generating unit 103]
The event generating unit 103 regards the event data output from the event acquiring unit 102 as event data V1 of the first octave, and generates event data V2 to Vi of the second to i-th octaves (i is an integer equal to or greater than 2) from the event data V1 of the first octave. Fig. 3 shows an example where i = 4, that is, an example where event data V2 to V4 of the second to fourth octaves are generated from the event data V1 of the first octave.
　第1フィルタ部103aは、第1オクターブのイベントデータV1を受信すると、m1行分(m1は2以上の整数)の第1オクターブのイベントデータV1が溜まるまで待機する。m1行分の第1オクターブのイベントデータV1は、画素アレイ101におけるm1行分の画素101aのイベントデータに対応している。第1フィルタ部103aは、m1行分の第1オクターブのイベントデータV1が溜まると、m1行n1列分(n1は2以上の整数)の第1オクターブのイベントデータV1から、r1行s1列分(r1はr1<m1を満たす整数、s1はs1<n1を満たす整数)の第2オクターブのイベントデータV2を生成する。第1フィルタ部103aはその後、上記のm1行n1列分の第1オクターブのイベントデータV1を破棄する。第1フィルタ部103aは、このような処理をすべての行の第1オクターブのイベントデータV1について順番に繰り返す。 When the first filter unit 103a receives the first octave event data V1, it waits until m1 rows (m1 is an integer equal to or greater than 2) of the first octave event data V1 have accumulated. The m1 rows of the first octave event data V1 correspond to the event data of m1 rows of pixels 101a in the pixel array 101. When m1 rows of the first octave event data V1 have accumulated, the first filter unit 103a generates r1 rows and s1 columns of the second octave event data V2 (r1 is an integer satisfying r1 < m1, s1 is an integer satisfying s1 < n1) from m1 rows and n1 columns (n1 is an integer equal to or greater than 2) of the first octave event data V1. The first filter unit 103a then discards these m1 rows and n1 columns of the first octave event data V1. The first filter unit 103a repeats this process in order for the first octave event data V1 of all rows.
　第1フィルタ部103aは、r1行s1列分の第2オクターブのイベントデータV2を、第2フィルタ部103bおよびイベント合成部104に出力する。本実施形態では、第1フィルタ部103aが生成した第2オクターブのイベントデータV2が、イベント合成部104に送られ、その複製が、第2フィルタ部103bに送られる。 The first filter unit 103a outputs the r1 rows and s1 columns of the second octave event data V2 to the second filter unit 103b and the event synthesis unit 104. In this embodiment, the second octave event data V2 generated by the first filter unit 103a is sent to the event synthesis unit 104, and a copy thereof is sent to the second filter unit 103b.
　図3は、第2オクターブのイベントデータV2を、一列に並んだ複数の領域P2で模式的に示している。図3は、m1=2、n1=2、r1=1、s1=1の例を示している。そのため、2行2列分の領域P1が1行1列分の領域P2に置換されている。ここで、本実施形態の画素アレイ101は、M行N列分の画素101aを含んでいるものとする。そのため、第1オクターブのイベントデータV1から第2オクターブのイベントデータV2を生成する処理は、全部でM×N/4回繰り返される。図3では、4つの領域P1が1つの領域P2に集約されており、従って、各領域P2が、4画素101a分のイベントデータを集約したイベントデータに対応している。第2オクターブのイベントデータV2を、第2オクターブのイベント発火有無列とも呼ぶ。 FIG. 3 schematically shows the second octave event data V2 as a plurality of regions P2 arranged in a row. FIG. 3 shows an example where m1=2, n1=2, r1=1, and s1=1. Therefore, a region P1 of 2 rows and 2 columns is replaced with a region P2 of 1 row and 1 column. Here, it is assumed that the pixel array 101 of this embodiment includes pixels 101a in M rows and N columns. Therefore, the process of generating the second octave event data V2 from the first octave event data V1 is repeated M×N/4 times in total. In FIG. 3, four regions P1 are aggregated into one region P2, and therefore each region P2 corresponds to event data obtained by aggregating the event data of four pixels 101a. The second octave event data V2 is also called the second octave event firing/non-firing column.
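 The block reduction performed by the first filter unit 103a can be pictured with the short sketch below, using the m1=n1=2, r1=s1=1 example from the text. The logical-OR style aggregation (a block fires if any of its pixels fired) is an assumption made for illustration, since the text only states that the events of a block are aggregated.

import numpy as np

# Sketch of one filter stage: each m x n block of octave-j event data becomes a
# single octave-(j+1) value. The OR-style aggregation rule is an assumption;
# the text only says the block's event data is aggregated.

def downsample_octave(v: np.ndarray, m: int = 2, n: int = 2) -> np.ndarray:
    """v: (rows, cols) array of 0/1 event flags; returns a (rows//m, cols//n) array."""
    rows, cols = v.shape
    v = v[:rows - rows % m, :cols - cols % n]      # drop any incomplete border blocks
    blocks = v.reshape(rows // m, m, cols // n, n)
    return blocks.max(axis=(1, 3))                 # a block fires if any pixel in it fired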
　第2フィルタ部103bは、第2オクターブのイベントデータV2を受信すると、m2行分(m2は2以上の整数)の第2オクターブのイベントデータV2が溜まるまで待機する。第2フィルタ部103bは、m2行分の第2オクターブのイベントデータV2が溜まると、m2行n2列分(n2は2以上の整数)の第2オクターブのイベントデータV2から、r2行s2列分(r2はr2<m2を満たす整数、s2はs2<n2を満たす整数)の第3オクターブのイベントデータV3を生成する。第2フィルタ部103bはその後、上記のm2行n2列分の第2オクターブのイベントデータV2を破棄する。第2フィルタ部103bは、このような処理をすべての行の第2オクターブのイベントデータV2について順番に繰り返す。 When the second filter unit 103b receives the second octave event data V2, it waits until m2 rows (m2 is an integer equal to or greater than 2) of the second octave event data V2 have accumulated. When m2 rows of the second octave event data V2 have accumulated, the second filter unit 103b generates r2 rows and s2 columns of the third octave event data V3 (r2 is an integer satisfying r2 < m2, s2 is an integer satisfying s2 < n2) from m2 rows and n2 columns (n2 is an integer equal to or greater than 2) of the second octave event data V2. The second filter unit 103b then discards these m2 rows and n2 columns of the second octave event data V2. The second filter unit 103b repeats this process in order for the second octave event data V2 of all rows.
　第2フィルタ部103bは、r2行s2列分の第3オクターブのイベントデータV3を、第3フィルタ部103cおよびイベント合成部104に出力する。本実施形態では、第2フィルタ部103bが生成した第3オクターブのイベントデータV3が、イベント合成部104に送られ、その複製が、第3フィルタ部103cに送られる。 The second filter unit 103b outputs the r2 rows and s2 columns of the third octave event data V3 to the third filter unit 103c and the event synthesis unit 104. In this embodiment, the third octave event data V3 generated by the second filter unit 103b is sent to the event synthesis unit 104, and a copy thereof is sent to the third filter unit 103c.
FIG. 3 schematically shows the third octave event data V3 as a plurality of regions P3 arranged in a row. FIG. 3 shows an example where m2 = 2, n2 = 2, r2 = 1, and s2 = 1. Therefore, a region P2 of 2 rows and 2 columns is replaced with a region P3 of 1 row and 1 column. The process of generating the third octave event data V3 from the second octave event data V2 is repeated M×N/16 times in total. In FIG. 3, four regions P2 are aggregated into one region P3, and therefore each region P3 corresponds to event data obtained by aggregating the event data of 16 pixels 101a. The third octave event data V3 is also called the third octave event firing presence/absence sequence.
The operation of the third filter unit 103c is similar to that of the first filter unit 103a and the second filter unit 103b. The third filter unit 103c generates r3 rows and s3 columns (r3 is an integer satisfying r3 < m3, and s3 is an integer satisfying s3 < n3) of fourth octave event data V4 from m3 rows and n3 columns (m3 is an integer of 2 or more, and n3 is an integer of 2 or more) of third octave event data V3. The third filter unit 103c outputs the r3 rows and s3 columns of fourth octave event data V4 to the event synthesis unit 104. FIG. 3 schematically shows the fourth octave event data V4 as a plurality of regions P4 arranged in a row. FIG. 3 shows an example where m3 = 2, n3 = 2, r3 = 1, and s3 = 1. Therefore, a region P3 of 2 rows and 2 columns is replaced with a region P4 of 1 row and 1 column. The fourth octave event data V4 is also called the fourth octave event firing presence/absence sequence.
In this way, the event generation unit 103 generates the (j+1)-th octave event data Vj+1 from the j-th octave event data Vj (j is an integer satisfying 1≦j≦i-1). This makes it possible to sequentially generate the second to i-th octave event data V2 to Vi from the first octave event data V1. Note that the term "octave" in this embodiment is used by analogy with the musical term "octave", because the differences between the first to i-th octave event data V1 to Vi correspond to differences in frequency.
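As a concrete illustration of this octave generation (not part of the original disclosure; the function names, the choice of m = n = 2, and the use of NumPy are illustrative assumptions), the following Python sketch aggregates a boolean event-firing map block by block, using the "fired if at least one cell fired" rule described later with FIG. 6, and chains the step to build V1 through Vi:

```python
import numpy as np

def downsample_octave(events: np.ndarray, m: int = 2, n: int = 2) -> np.ndarray:
    """Aggregate an H x W boolean event-firing map into blocks of m x n cells.

    A block of the j-th octave map is marked as "fired" in the (j+1)-th octave
    map if at least one of its m x n cells fired.
    """
    h, w = events.shape
    blocks = events[: h - h % m, : w - w % n].reshape(h // m, m, w // n, n)
    return blocks.any(axis=(1, 3))

def build_octave_pyramid(v1: np.ndarray, num_octaves: int) -> list[np.ndarray]:
    """Return [V1, V2, ..., Vi], generated sequentially from V1."""
    pyramid = [v1]
    for _ in range(num_octaves - 1):
        pyramid.append(downsample_octave(pyramid[-1]))
    return pyramid

# Example: a 640 x 480 first-octave map yields 320 x 240, 160 x 120, 80 x 60 maps.
v1 = np.random.rand(480, 640) < 0.01          # sparse random event firings
octaves = build_octave_pyramid(v1, num_octaves=4)
print([v.shape for v in octaves])
```

Note that the filter units described above operate in a streaming, row-buffered manner (accumulate m rows, emit one output row, discard), whereas this sketch processes a whole frame at once for brevity.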
[Event synthesis unit 104]
The event synthesis unit 104 acquires the first octave event data V1 from the event acquisition unit 102, and acquires the second to i-th octave event data V2 to Vi from the event generation unit 103. FIG. 3 shows an example where i = 4, that is, an example in which the event synthesis unit 104 acquires the first to fourth octave event data V1 to V4.
The octave information adding unit 104a adds octave information to the first to i-th octave event data V1 to Vi and holds the result. The octave information is identification information for the first to i-th octave event data V1 to Vi. In this embodiment, the octave information is based on the octave number of the event data, and is, for example, a value obtained by subtracting 1 from the octave number of the event data. Therefore, the octave information of the first to i-th octave event data V1 to Vi is "0" to "i-1", respectively. Specifically, the octave information of the first octave event data V1 is "0", the octave information of the second octave event data V2 is "1", and the octave information of the third octave event data V3 is "2".
The output timing adjustment unit 104b adjusts the timing at which the event data held by the octave information adding unit 104a is output to the event output unit 105. In this embodiment, the output timing adjustment unit 104b outputs the first to i-th octave event data V1 to Vi to the event output unit 105 in order from the event data with the largest octave number to the event data with the smallest octave number. At this time, the first to i-th octave event data V1 to Vi are output with the octave information added.
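A minimal sketch of this tagging and ordering might look as follows (illustrative Python only; the TaggedEvent record and the reading that "coarsest first" means descending octave number are assumptions, not taken from the disclosure):

```python
from dataclasses import dataclass

@dataclass
class TaggedEvent:
    oc: int   # octave information: octave number minus 1
    x: float  # column coordinate (average coordinate for aggregated octaves)
    y: float  # row coordinate
    t: float  # time stamp
    p: int    # polarity: +1 for an on-event, -1 for an off-event

def tag_and_order(events_per_octave: dict[int, list[tuple]]) -> list[TaggedEvent]:
    """Attach octave information (octave number - 1) and emit the octaves in
    descending order of octave number (coarsest data first)."""
    out: list[TaggedEvent] = []
    for octave in sorted(events_per_octave, reverse=True):
        for (x, y, t, p) in events_per_octave[octave]:
            out.append(TaggedEvent(oc=octave - 1, x=x, y=y, t=t, p=p))
    return out
```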
[Event output unit 105]
The event output unit 105 outputs at least a part of the first to i-th octave event data V1 to Vi to the outside of the solid-state imaging device 100. The event output unit 105 outputs the event data acquired from the event synthesis unit 104. The event output unit 105 of this embodiment outputs the event data to the vehicle control system 11 shown in FIG. 1, but may instead output the event data to the electronic device 200 shown in FIG. 10 or FIG. 11.
The event data selection unit 105a selects, from the first to i-th octave event data V1 to Vi, the event data to be output to the event data formation unit 105b. For example, when the first to third octave event data V1 to V3 are selected from the first to i-th octave event data V1 to Vi, the first to third octave event data V1 to V3 are output to the event data formation unit 105b.
The event data formation unit 105b converts the event data selected by the event data selection unit 105a into an event output data format. The event data formation unit 105b then outputs the event data converted into the event output data format, with the octave information added, to the outside of the solid-state imaging device 100.
FIG. 4 is a diagram for explaining the pixel array 101 of the first embodiment.
Each of A to C in FIG. 4 shows the pixel array 101 that outputs event data, and an image E obtained by converting this event data into an image representation. The image E corresponds to an image captured in a situation in which the letter "A" newly appears. The image E also corresponds to an image obtained from the first octave event data V1.
(x0, y0, t0, p0), (x1, y1, t1, p1), and (x2, y2, t2, p2) shown in A to C of FIG. 4 represent the event data of individual pixels 101a. x0 and y0 represent the coordinates of a pixel 101a. t0 represents the time at which the event data was obtained. p0 represents the polarity of the event data. For example, the polarity when an on-event fires is "+", and the polarity when an off-event fires is "-". The same applies to the other x, y, t, and p values.
A of FIG. 4 shows a pixel array 101 with a small number of pixels and a low resolution. B of FIG. 4 shows a pixel array 101 with a medium number of pixels and a medium resolution. C of FIG. 4 shows a pixel array 101 with a large number of pixels and a high resolution.
If the resolution of the solid-state imaging device 100 is increased in an attempt to improve its performance, the data mining cost increases, and the information processing of an information processing system that uses the event data output from the solid-state imaging device 100 is delayed. As a result, the performance of the information processing system as a whole actually decreases. Examples of such an information processing system are the vehicle control system 11 shown in FIG. 1, the electronic device 200 shown in FIG. 10, and the system shown in FIG. 11 (a system including the electronic device 200).
Therefore, the solid-state imaging device 100 of this embodiment outputs event data of various octaves, as described with reference to FIG. 3. This makes it possible to achieve a higher resolution for the solid-state imaging device 100 while suppressing delays in information processing that uses the event data. For example, even if the pixel array 101 shown in C of FIG. 4 is employed, delays in information processing using the event data can be suppressed. Further details of this effect will be described later.
FIG. 5 is a diagram for explaining the operation of the solid-state imaging device 100 of the first embodiment.
Arrow A1 in FIG. 5 indicates the process of generating 1 row and 1 column of second octave event data V2 from 2 rows and 2 columns of first octave event data V1. In this process, a region P1 of 2 rows and 2 columns is replaced with a region P2 of 1 row and 1 column. Arrow A2 in FIG. 5 indicates the process of generating 1 row and 1 column of third octave event data V3 from 2 rows and 2 columns of second octave event data V2. In this process, a region P2 of 2 rows and 2 columns is replaced with a region P3 of 1 row and 1 column. The presence or absence of a check mark in the regions P1 to P3 indicates whether or not an event has fired.
FIG. 5 further shows an image E1 obtained by converting the first octave event data V1 into an image representation, an image E2 obtained by converting the second octave event data V2 into an image representation, and an image E3 obtained by converting the third octave event data V3 into an image representation. The images E1, E2, and E3 are similar to the images E shown in C, B, and A of FIG. 4, respectively. Thus, according to this embodiment, generating the second to i-th octave event data V2 to Vi from the first octave event data V1 makes it possible to generate low-resolution images from a high-resolution image. As will be described later, the event output unit 105 outputs the event data in an image representation for each octave. By using the event data output in the image representation, the event data can be displayed for each octave in a form such as the images E1 to E3.
The event data of this embodiment is expressed in the form (oc, x, y, t, p) by adding the octave information. x and y represent the coordinates of the pixels 101a corresponding to the regions P1 to P3, for example, the coordinates of the one pixel 101a corresponding to a region P1, the average coordinates of the four pixels 101a corresponding to a region P2, or the average coordinates of the 16 pixels 101a corresponding to a region P3. t represents the time at which the event data was obtained. p represents the polarity of the event data. oc represents the octave information of the event data. For example, the octave information oc of the first octave event data V1 is "0", the octave information oc of the second octave event data V2 is "1", and the octave information oc of the third octave event data V3 is "2".
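As a rough illustration of how one aggregated (oc, x, y, t, p) record could be formed (hypothetical Python; only the averaged coordinates are specified at this point, so the handling of t and p below is an assumption, with t averaged as in the later FIG. 14 example and p taken as the majority polarity):

```python
import numpy as np

def aggregate_region(pixel_events: list[tuple[float, float, float, int]], oc: int):
    """Form one aggregated event (oc, x, y, t, p) from the events of the pixels
    101a that make up one region (4 pixels for P2, 16 pixels for P3)."""
    xs, ys, ts, ps = zip(*pixel_events)
    x, y, t = float(np.mean(xs)), float(np.mean(ys)), float(np.mean(ts))
    p = 1 if sum(ps) >= 0 else -1          # illustrative choice, not from the text
    return (oc, x, y, t, p)

# Four first-octave events aggregated into one second-octave event (oc = 1).
print(aggregate_region([(10, 20, 0.001, 1), (11, 20, 0.002, 1),
                        (10, 21, 0.003, 1), (11, 21, 0.004, -1)], oc=1))
```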
FIG. 6 is a diagram for explaining the operation of the event generation unit 103 of the first embodiment.
A of FIG. 6 shows the process of generating r rows and s columns of second octave event data V2 from m rows and n columns of first octave event data V1. A of FIG. 6 shows an example where m = 2, n = 2, r = 1, and s = 1. Therefore, 1 row and 1 column of second octave event data V2 is generated from 2 rows and 2 columns of first octave event data V1, and 2 × 2 regions P1 are replaced with a 1 × 1 region P2. Similarly, in the example shown in A of FIG. 6, 1 row and 1 column of (j+1)-th octave event data Vj+1 is generated from 2 rows and 2 columns of j-th octave event data Vj.
In A of FIG. 6, k represents the number of event firings contained in the m × n regions P1 (k is an integer satisfying 0≦k≦m×n). For example, when the number of event firings contained in the 2 × 2 regions P1 is 3, the value of k is 3. In this case, three of the 2 × 2 regions P1 correspond to the "checked" regions P1 shown in FIG. 5, and the remaining one of the 2 × 2 regions P1 corresponds to the "unchecked" region P1 shown in FIG. 5. This indicates that an event fired in three of the four pixels 101a and no event fired in the remaining one of the four pixels 101a.
In the example shown in A of FIG. 6, when k for the four regions P1 is 1, 2, 3, or 4, the one region P2 is "checked" (C of FIG. 6), and when k for the four regions P1 is 0, the one region P2 is "unchecked" (B of FIG. 6). That is, when an event fires in any of the four regions P1, the event is treated as having fired in the one region P2. On the other hand, when no event fires in any of the four regions P1, the event is treated as not having fired in the one region P2. This makes it possible to reflect information on the presence or absence of event firing from the first octave event data V1 in the second octave event data V2. The same applies when generating the (j+1)-th octave event data Vj+1 from the j-th octave event data Vj. In B and C of FIG. 6, the checked regions P1 and P2 are shown in black, and the unchecked regions P1 and P2 are shown in white.
FIG. 7 is another diagram for explaining the operation of the event generation unit 103 of the first embodiment.
A to C of FIG. 7 correspond to A to C of FIG. 6, respectively. However, in the example shown in A of FIG. 7, when k for the four regions P1 is 2, 3, or 4, the one region P2 is "checked" (C of FIG. 7), and when k for the four regions P1 is 0 or 1, the one region P2 is "unchecked" (B of FIG. 7). That is, when an event fires in two or more of the four regions P1, the event is treated as having fired in the one region P2. In other cases, the event is treated as not having fired in the one region P2. The same applies when generating the (j+1)-th octave event data Vj+1 from the j-th octave event data Vj.
When an event fires in only one of the four regions P1, this is highly likely to be the influence of a noise event. Therefore, by treating the case where an event fires in only one of the four regions P1 as if no event fired in the one region P2, it is possible to suppress the influence of noise events.
FIG. 8 is another diagram for explaining the operation of the event generation unit 103 of the first embodiment.
A to C of FIG. 8 also correspond to A to C of FIG. 6, respectively. However, in the example shown in A of FIG. 8, when k for the four regions P1 is 2 or 3, the one region P2 is "checked" (C of FIG. 8), and when k for the four regions P1 is 0, 1, or 4, the one region P2 is "unchecked" (B of FIG. 8). That is, when an event fires in two or three of the four regions P1, the event is treated as having fired in the one region P2. In other cases, the event is treated as not having fired in the one region P2. The same applies when generating the (j+1)-th octave event data Vj+1 from the j-th octave event data Vj.
When an event fires in only one of the four regions P1, this is highly likely to be the influence of a noise event. When events fire in all four of the four regions P1, this is highly likely to be the influence of a flicker event. Therefore, by treating the case where events fire in only one or in all four of the four regions P1 as if no event fired in the one region P2, it is possible to suppress the influence of noise events and flicker events.
Note that, in the example shown in A of FIG. 8, the one region P2 may instead be "checked" when k for the four regions P1 is 1, 2, or 3, and "unchecked" when k for the four regions P1 is 0 or 4. This makes it possible to suppress the influence of flicker events.
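The rules of FIG. 6, FIG. 7, and FIG. 8 differ only in which counts k mark the aggregated region as fired. The following Python sketch parameterizes that choice (illustrative only; the dictionary keys and the whole-frame processing are assumptions, not from the disclosure):

```python
import numpy as np

# Sets of event-firing counts k (out of a 2 x 2 block) that mark the aggregated
# region as "fired" in the next octave, for the three variants in the text.
FIRE_IF_K_IN = {
    "fig6_any":           {1, 2, 3, 4},   # any firing propagates
    "fig7_noise_filter":  {2, 3, 4},      # k = 1 treated as a noise event
    "fig8_noise_flicker": {2, 3},         # k = 1 noise, k = 4 flicker
}

def downsample_with_rule(events: np.ndarray, rule: str) -> np.ndarray:
    """Aggregate 2 x 2 blocks of a boolean firing map, deciding each output
    cell from the count k of fired cells according to the chosen rule."""
    h, w = events.shape
    k = events[: h - h % 2, : w - w % 2].reshape(h // 2, 2, w // 2, 2).sum(axis=(1, 3))
    return np.isin(k, list(FIRE_IF_K_IN[rule]))

v1 = np.random.rand(8, 8) < 0.3
print(downsample_with_rule(v1, "fig8_noise_flicker").astype(int))
```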
FIG. 9 is a diagram for explaining the operation of the event output unit 105 of the first embodiment.
A of FIG. 9 is a diagram for explaining the operation of the event data selection unit 105a. The event data selection unit 105a selects, from the first to i-th octave event data V1 to Vi, the event data to be output to the event data formation unit 105b. At this time, the event data selection unit 105a may select the event data V1 to Vi of all octaves, or may select only the event data of some of the octaves. For example, when the first to third octave event data V1 to V3 are selected from the first to i-th octave event data V1 to Vi, the first to third octave event data V1 to V3 are output to the event data formation unit 105b. The first to third octave event data V1 to V3 are output with the octave information "0" to "2" added, respectively.
B of FIG. 9 is a diagram for explaining the operation of the event data formation unit 105b. The event data formation unit 105b converts the event data selected by the event data selection unit 105a into an event output data format. For example, the event output unit 105 converts the event data selected by the event data selection unit 105a into an address event representation or an image representation for each octave. The event data formation unit 105b then outputs the event data converted into the event output data format, with the octave information added, to the outside of the solid-state imaging device 100. When information processing using the event data is performed outside the solid-state imaging device 100, using the event data output in the image representation makes it possible to display the event data for each octave in a form such as the images E1 to E3.
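The exact event output data format is not detailed here, but a minimal sketch of converting one octave's (oc, x, y, t, p) list into a simple image representation might look as follows (assumed Python; one int8 cell per position holding the polarity of the latest event at that position):

```python
import numpy as np

def events_to_frame(events, height: int, width: int) -> np.ndarray:
    """Convert a list of (oc, x, y, t, p) events of a single octave into an
    image representation: 0 where no event fired, +1 / -1 for the polarity
    of the most recent event at each position."""
    frame = np.zeros((height, width), dtype=np.int8)
    for oc, x, y, t, p in sorted(events, key=lambda e: e[3]):  # time order
        frame[int(round(y)), int(round(x))] = p
    return frame

# One frame per octave; octave j covers roughly (H / 2**(j-1)) x (W / 2**(j-1)) cells.
second_octave_events = [(1, 3, 2, 0.001, 1), (1, 5, 4, 0.002, -1)]
print(events_to_frame(second_octave_events, height=8, width=8))
```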
(3) Electronic device 200 of the first embodiment
FIG. 10 is a diagram showing an example of the electronic device 200 of the first embodiment.
A of FIG. 10 shows a smartphone having a camera function as an example of the electronic device 200. As shown in B of FIG. 10, this electronic device 200 includes an imaging unit 201, a display unit 202, an information processing unit 203, a storage unit 204, and an input unit 205. The information processing unit 203 includes an extraction unit 203a and a selection unit 203b.
The imaging unit 201 is a functional block for realizing the camera function. The imaging unit 201 includes the solid-state imaging device 100 shown in FIG. 3. This electronic device 200 functions as an information processing system that performs information processing using the event data output from the solid-state imaging device 100 (the event output unit 105).
The display unit 202 has a display screen for displaying characters and images. For example, the display unit 202 displays the event data output from the solid-state imaging device 100 on the display screen. In this embodiment, the event output unit 105 outputs the event data in an image representation, and the display unit 202 displays this event data in the form of an image on the display screen (B of FIG. 10). This image may be a still image or a moving image. In A of FIG. 10, the display screen displays an image captured by the solid-state imaging device 100 functioning as an image sensor, and an image in which the event data output from the solid-state imaging device 100 functioning as an event sensor (EVS) is displayed in the form of an image. In the mode in which event data is displayed on the display screen, the display screen is also called a viewer.
The information processing unit 203 performs various kinds of information processing, such as control of the electronic device 200. For example, the information processing unit 203 receives event data from the solid-state imaging device 100 and displays this event data on the display screen of the display unit 202.
The storage unit 204 includes a recording medium such as a semiconductor memory. The information processing unit 203 can read information necessary for information processing from the storage unit 204, and can record information generated by information processing in the storage unit 204. For example, the information processing unit 203 receives event data from the solid-state imaging device 100 and records this event data in the storage unit 204.
The input unit 205 accepts input operations from the user. The information processing unit 203 performs information processing according to these input operations. The input unit 205 includes, for example, a touch panel and hard buttons.
In the event output unit 105, the event data selection unit 105a selects, from the first to i-th octave event data V1 to Vi, the event data to be output to the event data formation unit 105b. For example, when the first to third octave event data V1 to V3 are selected from the first to i-th octave event data V1 to Vi, the first to third octave event data V1 to V3 are output to the event data formation unit 105b. The event data formation unit 105b outputs the event data selected by the event data selection unit 105a in an image representation for each octave. For example, the first to third octave event data V1 to V3 are output in an image representation for each octave.
The extraction unit 203a extracts event data of a predetermined octave number from the event data output from the solid-state imaging device 100 (the event data formation unit 105b). For example, the extraction unit 203a extracts the second octave event data V2 from the first to third octave event data V1 to V3. The event data of the predetermined octave number can be extracted on the basis of the octave information of the event data. The information processing unit 203 displays the event data extracted by the extraction unit 203a on the display screen. For example, when the second octave event data V2 is extracted, the second octave event data V2 is displayed on the display screen in the form of an image.
The extraction unit 203a automatically extracts the event data of the octave number that matches the resolution of the viewer. For example, when event data is first displayed in the viewer, the second octave event data V2 is extracted and displayed. Thereafter, when the user performs an operation to increase the resolution of the viewer, the first octave event data V1 is extracted and displayed. On the other hand, when the user performs an operation to decrease the resolution of the viewer, the third octave event data V3 is extracted and displayed.
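A possible way to realize this viewer-matched extraction is sketched below (hypothetical Python; the "closest resolution not smaller than the viewer" criterion and the per-octave halving of resolution are assumptions based on the m = n = 2 example):

```python
def select_octave_for_viewer(viewer_width: int, sensor_width: int, max_octave: int) -> int:
    """Pick the octave whose horizontal resolution (sensor_width / 2**(octave-1))
    is closest to, but not smaller than, the viewer width."""
    octave = 1
    while octave < max_octave and sensor_width / (2 ** octave) >= viewer_width:
        octave += 1
    return octave

def extract_octave(events, octave: int):
    """Filter tagged (oc, x, y, t, p) events down to one octave,
    using the octave information oc = octave number - 1."""
    return [e for e in events if e[0] == octave - 1]

print(select_octave_for_viewer(viewer_width=320, sensor_width=1280, max_octave=4))  # -> 3
```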
The selection unit 203b selects, from the event data output from the solid-state imaging device 100 (the event data formation unit 105b), the event data of the octave number designated by the user. For example, when the second octave event data V2 is displayed in the viewer and the user touches a "start recording button" on the touch panel, the selection unit 203b selects the second octave event data V2 from the first to third octave event data V1 to V3. The selection unit 203b then starts recording the second octave event data V2 in the storage unit 204. Thereafter, when the user touches a "stop recording button" on the touch panel, the recording of the second octave event data V2 ends. In this way, the user can designate the octave number of the event data to be recorded and the timing at which recording starts and ends. As a result, a moving image from the start of recording to the end of recording is recorded in the storage unit 204 (the recording medium).
Note that the octave number of the event data recorded in the storage unit 204 may differ from the octave number of the event data displayed in the viewer. For example, when the second octave event data V2 is displayed in the viewer and the user designates the "first octave event data V1" as the "recording target" on the touch panel, the first octave event data V1 may be recorded in the storage unit 204. The first octave event data V1 may also be recorded in the storage unit 204 even without such a designation.
The information processing unit 203 may also use the extracted event data for information processing other than display. For example, after the second octave event data V2 is extracted, the information processing unit 203 may use the second octave event data V2 for image recognition. An example of image recognition is recognition of a user's gesture. For example, the information processing unit 203 may perform image recognition for recognizing the user's gesture using an image of the user included in the second octave event data V2. In this case, the information processing unit 203 may use event data of a plurality of octave numbers for the image recognition.
In this embodiment, the solid-state imaging device 100 outputs event data of various octave numbers. Therefore, the information processing unit 203 can display event data of various octave numbers by extracting event data of a predetermined octave number from the output event data. The information processing unit 203 can also record event data of various octave numbers by selecting event data of a predetermined octave number from the output event data. If the information processing unit 203 had to generate event data of another resolution from event data of a certain resolution, the information processing by the information processing unit 203 would be delayed. According to this embodiment, having the solid-state imaging device 100 handle the process of generating event data of various resolutions (octave numbers) makes it possible to suppress delays in the information processing by the information processing unit 203. In this embodiment, the process of generating event data of various resolutions is performed in hardware by the solid-state imaging device 100, instead of in software by the information processing unit 203.
FIG. 11 is a diagram showing another example of the electronic device 200 of the first embodiment.
A of FIG. 11 shows a game console as an example of the electronic device 200. As shown in A of FIG. 11, this electronic device 200 is used while connected to an imaging device 201' and a display device 202' by wire or wirelessly. As shown in B of FIG. 11, this electronic device 200 includes an information processing unit 203, a storage unit 204, and an input unit 205. The information processing unit 203 includes an extraction unit 203a and a selection unit 203b.
The imaging device 201' is, for example, a camera that is an accessory of the game console. Like the imaging unit 201 described above, the imaging device 201' includes the solid-state imaging device 100 shown in FIG. 3. This electronic device 200, together with the imaging device 201' and the display device 202', constitutes an information processing system that performs information processing using the event data output from the solid-state imaging device 100 (the event output unit 105).
The display device 202' is, for example, a large-screen liquid crystal television. Like the display unit 202 described above, the display device 202' has a display screen for displaying characters and images. In A of FIG. 11, the imaging device 201' captures an image of a user playing with the game console. In B of FIG. 11, the display device 202' displays the event data obtained by imaging this user on the display screen in the form of an image. In the mode in which event data is displayed on the display screen, the display screen is also called a viewer.
The functions of the information processing unit 203, the storage unit 204, and the input unit 205 shown in B of FIG. 11 are substantially the same as the functions of the information processing unit 203, the storage unit 204, and the input unit 205 shown in B of FIG. 10.
FIG. 12 is a diagram for explaining details of the electronic device 200 shown in FIG. 11.
As shown in A of FIG. 11, the imaging device 201' captures an image of the user's entire body. Therefore, the first to third octave event data V1 to V3 output from the imaging device 201' (the solid-state imaging device 100) include event data relating to the user's entire body.
In A of FIG. 12, the display screen of the display device 202' displays the user's entire body at a low resolution using the third octave event data V3. The image E3 in the area enclosed by the dotted line in A of FIG. 12 includes the user's entire body. When all of the third octave event data V3 is displayed on the display screen, the image E3 is obtained.
In B of FIG. 12, the display screen of the display device 202' displays the user's entire body at a medium resolution using the second octave event data V2. The image E2 in the area enclosed by the dotted line in B of FIG. 12 includes the user's hand. This image E2 corresponds to a part of the second octave event data V2.
In C of FIG. 12, the display screen of the display device 202' displays the user's hand at a high resolution using the first octave event data V1. The image E1 in the area enclosed by the dotted line in C of FIG. 12 includes the user's hand. This image E1 corresponds to a part of the first octave event data V1.
By using such images E1 to E3, the information processing unit 203 can display the user's entire body enlarged or reduced. For example, by transitioning the content of the display screen from A of FIG. 12 to B of FIG. 12, the user's entire body can be displayed enlarged while increasing the resolution of the image. By transitioning the content of the display screen from B of FIG. 12 to C of FIG. 12, the user's hand can be displayed enlarged while increasing the resolution of the image. This makes it possible to check the gesture of the user's hand on the display screen. The transition from A of FIG. 12 to B of FIG. 12 can be realized by switching the event data extracted by the extraction unit 203a from the third octave event data V3 to the second octave event data V2. The transition from B of FIG. 12 to C of FIG. 12 can be realized by switching the event data extracted by the extraction unit 203a from the second octave event data V2 to the first octave event data V1.
The information processing unit 203 may perform image recognition using the event data extracted by the extraction unit 203a in order to automate the checking of gestures. For example, the information processing unit 203 extracts, by image recognition, the area of the image E3, that is, the user's entire body, from the third octave event data V3. Next, the information processing unit 203 extracts, by image recognition, the area of the image E2, that is, the user's hand, from the second octave event data V2. Next, the information processing unit 203 identifies, by image recognition, the gesture of the user's hand from the first octave event data V1. This makes it possible for the information processing unit 203 to automatically recognize the gesture of the user's hand. According to this embodiment, having the solid-state imaging device 100 handle the process of generating event data of various resolutions (octave numbers) makes it possible to suppress delays in such image recognition.
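A coarse-to-fine pipeline of this kind could be sketched as follows (illustrative Python only; the coordinate doubling assumes each octave halves the resolution, and detect_roi is a toy stand-in for real recognition, not the disclosed method):

```python
import numpy as np

def coarse_to_fine_roi(frames: dict[int, np.ndarray], detect_roi) -> np.ndarray:
    """Walk from the coarsest octave frame to the finest, narrowing the region
    of interest (ROI) at each step; moving one octave finer doubles coordinates.

    frames maps octave number -> binary event frame (octave 3 is coarsest here);
    detect_roi(frame) is a stand-in detector returning (y0, y1, x0, x1).
    """
    octaves = sorted(frames, reverse=True)            # e.g. [3, 2, 1]
    y0, y1, x0, x1 = detect_roi(frames[octaves[0]])   # whole body on octave 3
    for oc in octaves[1:]:
        y0, y1, x0, x1 = 2 * y0, 2 * y1, 2 * x0, 2 * x1   # map ROI to finer grid
        dy0, dy1, dx0, dx1 = detect_roi(frames[oc][y0:y1, x0:x1])
        y0, y1, x0, x1 = y0 + dy0, y0 + dy1, x0 + dx0, x0 + dx1
    return frames[octaves[-1]][y0:y1, x0:x1]          # finest-octave patch (e.g. hand)

def detect_roi(frame: np.ndarray):
    """Toy detector: bounding box of all fired cells (assumes at least one firing)."""
    ys, xs = np.nonzero(frame)
    return ys.min(), ys.max() + 1, xs.min(), xs.max() + 1
```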
As described above, the solid-state imaging device 100 of this embodiment treats the event data output from the pixel array 101 as the first octave event data V1, and generates the second to i-th octave event data V2 to Vi from the first octave event data V1. Therefore, according to this embodiment, by outputting event data of various octaves for information processing that uses the event data, it is possible to achieve a higher resolution for the solid-state imaging device 100 while suppressing delays in the information processing.
(Second embodiment)
FIG. 13 is a block diagram showing the configuration of the solid-state imaging device 100 of the second embodiment.
Like the solid-state imaging device of the first embodiment, the solid-state imaging device 100 of this embodiment includes the pixel array 101, the event acquisition unit 102, the event generation unit 103, the event synthesis unit 104, and the event output unit 105. The solid-state imaging device 100 of this embodiment further includes a frame memory 111. The event acquisition unit 102 of this embodiment also includes an arbiter unit 102a and a time stamp unit 102b.
The solid-state imaging device 100 of this embodiment is an arbiter-type EVS. Therefore, the event acquisition unit 102 of this embodiment acquires event data from the plurality of pixels 101a in the pixel array 101 in a random order. The event data acquired by the event acquisition unit 102 is stored in the frame memory 111 for a certain period and is then output from the frame memory 111 to the event generation unit 103 and the event synthesis unit 104. Like the pixel array 101, the frame memory 111 includes a plurality of memory cells arranged in a two-dimensional array (matrix). The event data of each row of the frame memory 111 is treated as the first octave event data V1.
The event acquisition unit 102 of this embodiment includes the arbiter unit 102a and the time stamp unit 102b as functional blocks for the arbiter-type EVS. The arbiter unit 102a arbitrates a plurality of events (request signals) output from the plurality of pixels 101a. The time stamp unit 102b assigns a time stamp to each event fired from each pixel 101a. In this embodiment, the value of t in the event data (x, y, t, p) of each pixel 101a is the time stamp value.
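One way to picture how the arbitrated, time-stamped events could be framed into first octave event data V1 is sketched below (a speculative Python model; the per-cell contents and the read-and-clear behavior are assumptions, not taken from the disclosure):

```python
import numpy as np

class FrameMemory:
    """Minimal model of the frame memory 111: asynchronous, arbitrated events
    (x, y, t, p) are written into a 2-D cell array for a fixed accumulation
    period, after which the rows are read out as first octave event data V1."""

    def __init__(self, height: int, width: int):
        self.fired = np.zeros((height, width), dtype=bool)
        self.t = np.zeros((height, width))
        self.p = np.zeros((height, width), dtype=np.int8)

    def write(self, x: int, y: int, t: float, p: int) -> None:
        # Events arrive in random order; each cell keeps its latest event.
        self.fired[y, x], self.t[y, x], self.p[y, x] = True, t, p

    def read_and_clear(self) -> np.ndarray:
        v1 = self.fired.copy()          # row-wise readout feeds the filter units
        self.fired[:] = False
        return v1
```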
FIG. 14 is a diagram for explaining the operation of the event generation unit 103 of the second embodiment.
Like A of FIG. 6, A of FIG. 7, and A of FIG. 8, FIG. 14 shows the process of generating r rows and s columns of second octave event data V2 from m rows and n columns of first octave event data V1. FIG. 14 shows an example where m = 2, n = 2, r = 1, and s = 1. Therefore, 1 row and 1 column of second octave event data V2 is generated from 2 rows and 2 columns of first octave event data V1, and 2 × 2 regions P1 are replaced with a 1 × 1 region P2. Similarly, in the example shown in FIG. 14, 1 row and 1 column of (j+1)-th octave event data Vj+1 is generated from 2 rows and 2 columns of j-th octave event data Vj.
The processing in the example shown in FIG. 14 can be performed in the same manner as the processing in the example shown in A of FIG. 6, A of FIG. 7, or A of FIG. 8. However, in this embodiment, the value of t in the event data (x, y, t, p) of each pixel 101a is the time stamp value, so the four regions P1 shown in FIG. 14 have mutually different values of t. In FIG. 14, the values of t of the four regions P1 are ta, tb, tc, and td, respectively.
Like the event data of the first embodiment, the event data of this embodiment is expressed in the form (oc, x, y, t, p) by adding the octave information. x and y represent the coordinates of the pixels 101a corresponding to the regions P1 to P3 and so on, for example, the coordinates of the one pixel 101a corresponding to a region P1, the average coordinates of the four pixels 101a corresponding to a region P2, or the average coordinates of the 16 pixels 101a corresponding to a region P3. t represents the time at which the event data was obtained. p represents the polarity of the event data.
In FIG. 14, the value of t of one region P2 is a statistical value of the values of t of the four regions P1. That is, the value of t of one region P2 is a value obtained by applying statistical processing to the values of t of the four regions P1. The statistical value of the values of t of the four regions P1 is, for example, the average value, the maximum value, or the minimum value of the values of t of the four regions P1. FIG. 14 shows an example in which the value t' of one region P2 is the average value of ta to td of the four regions P1 (t' = (ta + tb + tc + td)/4). Similarly, when four regions Pj are replaced with one region Pj+1, the value of t of the one region Pj+1 is a statistical value of the values of t of the four regions Pj.
The solid-state imaging device 100 of this embodiment treats the event data output from the frame memory 111 as the first octave event data V1, and generates the second to i-th octave event data V2 to Vi from the first octave event data V1. Therefore, according to this embodiment, as in the first embodiment, by outputting event data of various octaves for information processing that uses the event data, it is possible to achieve a higher resolution for the solid-state imaging device 100 while suppressing delays in the information processing.
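A small sketch of this time stamp statistic (illustrative Python; the function name and the string-keyed choice of statistic are assumptions):

```python
import statistics

def aggregate_timestamp(ts: list[float], statistic: str = "mean") -> float:
    """Combine the time stamps ta..td of the aggregated regions into the time
    stamp t' of the next-octave region, using one of the statistics named in
    the text (average, maximum, or minimum)."""
    if statistic == "mean":
        return statistics.fmean(ts)   # t' = (ta + tb + tc + td) / 4 for 4 regions
    if statistic == "max":
        return max(ts)
    if statistic == "min":
        return min(ts)
    raise ValueError(f"unknown statistic: {statistic}")

print(aggregate_timestamp([0.010, 0.012, 0.011, 0.015]))  # -> 0.012
```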
(Third embodiment)
FIG. 15 is a perspective view schematically showing the configuration of the solid-state imaging device 100 of the third embodiment.
The solid-state imaging device 100 of this embodiment includes a detection chip 130 and a light receiving chip 120 stacked on the detection chip 130. The detection chip 130 and the light receiving chip 120 are electrically connected via connection parts such as via plugs, metal pads, or metal bumps. The solid-state imaging device 100 of this embodiment functions as the solid-state imaging device 100 of the first or second embodiment.
FIG. 15 shows an X axis, a Y axis, and a Z axis that are perpendicular to one another. The X direction and the Y direction correspond to lateral directions, and the Z direction corresponds to the vertical direction. The +Z direction corresponds to the upward direction, and the -Z direction corresponds to the downward direction. Note that the -Z direction may or may not strictly coincide with the direction of gravity.
FIG. 16 is a plan view schematically showing the configuration of the light receiving chip 120 of the third embodiment.
A of FIG. 16 shows an example of the planar structure of the light receiving chip 120. The light receiving chip 120 includes a light receiving unit 121 and a plurality of via arrangement parts 122 to 124. B of FIG. 16 shows an example of the planar structure of the light receiving unit 121. The light receiving unit 121 includes a plurality of photodiodes 121a.
In the light receiving unit 121, the plurality of photodiodes 121a are arranged in an array (a two-dimensional lattice). A pixel address consisting of a row address and a column address is assigned to each photodiode 121a, and each photodiode 121a is treated as a pixel. Each photodiode 121a photoelectrically converts incident light to generate a photocurrent. Via plugs electrically connected to the detection chip 130 are arranged in the via arrangement parts 122 to 124.
FIG. 17 is a plan view schematically showing the configuration of the detection chip 130 of the third embodiment.
A of FIG. 17 shows an example of the planar structure of the detection chip 130. The detection chip 130 includes an address event detection unit 131, a plurality of via arrangement parts 132 to 134, a row driving circuit 135, a column driving circuit 136, and a signal processing circuit 137. B of FIG. 17 shows an example of the planar structure of the address event detection unit 131. The address event detection unit 131 includes a plurality of address event detection circuits 131a.
In the address event detection unit 131, the plurality of address event detection circuits 131a are arranged in an array (a two-dimensional lattice). A pixel address is assigned to each address event detection circuit 131a, and each address event detection circuit 131a is electrically connected to the photodiode 121a of the same address. Each address event detection circuit 131a quantizes a voltage signal corresponding to the photocurrent from the corresponding photodiode 121a and outputs the result as a detection signal. This detection signal is a 1-bit signal indicating whether or not the fact that the amount of incident light has exceeded a predetermined threshold has been detected as an address event, and is output to the signal processing circuit 137. Via plugs electrically connected to the light receiving chip 120 are arranged in the via arrangement parts 132 to 134.
The row driving circuit 135 selects a row address and causes the address event detection unit 131 to output the detection signals corresponding to that row address. The column driving circuit 136 selects a column address and causes the address event detection unit 131 to output the detection signals corresponding to that column address. The signal processing circuit 137 performs predetermined signal processing on the detection signals from the address event detection unit 131. The signal processing circuit 137 arranges the detection signals as pixel signals in a two-dimensional lattice and acquires image data having 1 bit of information per pixel. The signal processing circuit 137 performs signal processing such as image recognition processing on this image data.
FIG. 18 is a circuit diagram showing the configuration of each address event detection circuit 131a of the third embodiment.
Each address event detection circuit 131a includes a current-voltage conversion circuit 310, a buffer 320, a subtractor 330, a quantizer 340, and a transfer circuit 350.
The current-voltage conversion circuit 310 converts the photocurrent from the corresponding photodiode 121a into a voltage signal. The current-voltage conversion circuit 310 supplies this voltage signal to the buffer 320.
The buffer 320 corrects the voltage signal from the current-voltage conversion circuit 310. The buffer 320 outputs the corrected voltage signal to the subtractor 330.
The subtractor 330 lowers the level of the voltage signal from the buffer 320 in accordance with a row driving signal from the row driving circuit 135. The subtractor 330 supplies the lowered voltage signal to the quantizer 340.
The quantizer 340 quantizes the voltage signal from the subtractor 330 into a digital signal and outputs the result as a detection signal. The quantizer 340 outputs this detection signal to the transfer circuit 350.
The transfer circuit 350 transfers the detection signal from the quantizer 340 to the signal processing circuit 137 in accordance with a column driving signal from the column driving circuit 136.
 図19は、第3実施形態の電流電圧変換回路310の構成を示す回路図である。 FIG. 19 is a circuit diagram showing the configuration of a current-voltage conversion circuit 310 according to the third embodiment.
 電流電圧変換回路310は、N型トランジスタ311と、P型トランジスタ312と、N型トランジスタ313とを備えている。これらのN型およびP型トランジスタ311~313は例えば、MOS(Metal-Oxide-Semiconductor)トランジスタである。 The current-voltage conversion circuit 310 includes an N-type transistor 311, a P-type transistor 312, and an N-type transistor 313. These N-type and P-type transistors 311 to 313 are, for example, MOS (Metal-Oxide-Semiconductor) transistors.
 N型トランジスタ311のソースは、フォトダイオード121aのカソードに電気的に接続されており、N型トランジスタ311のドレインは、電源端子(VDD)に電気的に接続されている。P型トランジスタ312とN型トランジスタ313は、電源端子と接地端子(GND)との間で直列に接続されている。P型トランジスタ312とN型トランジスタ313との間のノードは、N型トランジスタ311のゲートおよびバッファ320の入力端子に電気的に接続されている。P型トランジスタ312のゲートには、所定のバイアス電圧Vbias1が印加される。N型トランジスタ311とフォトダイオード121aとの間のノードは、N型トランジスタ313のゲートと電気的に接続されている。 The source of the N-type transistor 311 is electrically connected to the cathode of the photodiode 121a, and the drain of the N-type transistor 311 is electrically connected to the power supply terminal (VDD). The P-type transistor 312 and the N-type transistor 313 are connected in series between the power supply terminal and the ground terminal (GND). The node between the P-type transistor 312 and the N-type transistor 313 is electrically connected to the gate of the N-type transistor 311 and the input terminal of the buffer 320. A predetermined bias voltage Vbias1 is applied to the gate of the P-type transistor 312. The node between the N-type transistor 311 and the photodiode 121a is electrically connected to the gate of the N-type transistor 313.
 N型トランジスタ311のドレインと、N型トランジスタ313のドレインは、電源側に配置されており、このような回路はソースフォロワと呼ばれる。フォトダイオード121aからの光電流は、ソースフォロワにより電圧信号に変換される。P型トランジスタ312は、一定の電流をN型トランジスタ313に供給する。なお、受光チップ120のグランドと検出チップ130のグランドは、干渉対策のために互いに分離されている。 The drain of N-type transistor 311 and the drain of N-type transistor 313 are placed on the power supply side, and this type of circuit is called a source follower. The photocurrent from photodiode 121a is converted into a voltage signal by the source follower. P-type transistor 312 supplies a constant current to N-type transistor 313. Note that the ground of the light receiving chip 120 and the ground of the detection chip 130 are separated from each other to prevent interference.
 図20は、第3実施形態の減算器330と量子化器340の構成を示す回路図である。 FIG. 20 is a circuit diagram showing the configuration of the subtractor 330 and quantizer 340 of the third embodiment.
 減算器330は、コンデンサ331と、インバータ332と、コンデンサ333と、スイッチ334とを備えている。量子化器340は、コンパレータ341を備えている。 The subtractor 330 includes a capacitor 331, an inverter 332, a capacitor 333, and a switch 334. The quantizer 340 includes a comparator 341.
 コンデンサ331の一方の電極は、バッファ320の出力端子に電気的に接続されており、コンデンサ331の他方の電極は、インバータ332の入力端子に電気的に接続されている。インバータ332は、コンデンサ331を介して入力された電圧信号を反転し、反転した信号をコンパレータ341の非反転入力端子(+)に出力する。コンデンサ333は、インバータ332に並列に接続されている。スイッチ334は、コンデンサ333の両電極を電気的に接続する経路を、行駆動信号に従って開閉する。 One electrode of the capacitor 331 is electrically connected to the output terminal of the buffer 320, and the other electrode of the capacitor 331 is electrically connected to the input terminal of the inverter 332. The inverter 332 inverts the voltage signal input via the capacitor 331 and outputs the inverted signal to the non-inverting input terminal (+) of the comparator 341. The capacitor 333 is connected in parallel to the inverter 332. The switch 334 opens and closes the path that electrically connects both electrodes of the capacitor 333 in accordance with the row drive signal.
 スイッチ334をオンした際に、コンデンサ331のバッファ320側の電極に電圧信号Vinitが入力され、コンデンサ331のインバータ332側の電極は仮想接地端子となる。この仮想接地端子の電位を、便宜上ゼロとする。このとき、コンデンサ331に蓄積される電荷Qinitは、コンデンサ331の容量をC1とすると、次の式1により表される。 When switch 334 is turned on, a voltage signal Vinit is input to the electrode of capacitor 331 on the buffer 320 side, and the electrode of capacitor 331 on the inverter 332 side becomes a virtual ground terminal. For convenience, the potential of this virtual ground terminal is set to zero. At this time, the charge Qinit stored in capacitor 331 is expressed by the following equation 1, where the capacitance of capacitor 331 is C1.
   Qinit=C1×Vinit ・・・ 式1   Qinit = C1 × Vinit ... Equation 1
 一方、コンデンサ333の両電極は短絡されているため、コンデンサ333に蓄積される電荷はゼロとなる。 On the other hand, since both electrodes of the capacitor 333 are short-circuited, the charge stored in the capacitor 333 becomes zero.
 次に、スイッチ334がオフされて、コンデンサ331のバッファ320側の電極の電圧が、Vafterに変化した場合を考える。このとき、コンデンサ331に蓄積される電荷Qafterは、次の式2により表される。 Next, consider the case where switch 334 is turned off and the voltage of the electrode of capacitor 331 on the buffer 320 side changes to Vafter. At this time, the charge Qafter stored in capacitor 331 is expressed by the following equation 2.
   Qafter=C1×Vafter ・・・ 式2   Qafter = C1 × Vafter ... Equation 2
 一方、コンデンサ333に蓄積される電荷Q2は、出力電圧をVoutとし、コンデンサ333の容量をC2とすると、次の式3により表される。 On the other hand, the charge Q2 stored in the capacitor 333 is expressed by the following equation 3, where the output voltage is Vout and the capacitance of the capacitor 333 is C2.
   Q2=C2×Vout ・・・ 式3   Q2 = C2 × Vout ... Equation 3
 このとき、コンデンサ331およびコンデンサ333の総電荷量は変化しないため、次の式4が成立する。 At this time, the total charge amount of the capacitors 331 and 333 does not change, so the following equation 4 holds.
   Qinit=Qafter+Q2 ・・・ 式4   Qinit = Qafter + Q2 ... Equation 4
 式4に式1~3を代入して変形すると、次の式5が得られる。 By substituting equations 1 to 3 into equation 4 and rearranging it, the following equation 5 is obtained.
   Vout=(C1/C2)×(Vinit-Vafter) ・・・ 式5   Vout = (C1/C2) × (Vinit - Vafter) ... Equation 5
 式5は、電圧信号の減算動作を表し、減算結果の利得はC1/C2となる。通常、利得を最大化することが望まれるため、C1を大きく設定し、C2を小さく設計することが好ましい。一方、C2が小さすぎると、kTCノイズが増大し、ノイズ特性が悪化するおそれがあるため、C2の容量削減は、ノイズを許容することができる範囲に制限される。さらには、画素ごとに減算器330を含むアドレスイベント検出回路131aが搭載されるため、C1およびC2には、面積上の制約がある。これらを考慮して、例えば、C1は、20~200フェムトファラッド(fF)の値に設定され、C2は、1~20フェムトファラッド(fF)の値に設定される。 Equation 5 represents the subtraction operation of the voltage signal, and the gain of the subtraction result is C1/C2. Since it is usually desired to maximize the gain, it is preferable to set C1 large and design C2 small. On the other hand, if C2 is too small, kTC noise increases and noise characteristics may deteriorate, so the reduction in the capacitance of C2 is limited to a range in which noise can be tolerated. Furthermore, since an address event detection circuit 131a including a subtractor 330 is mounted for each pixel, there are area restrictions on C1 and C2. Taking these into consideration, for example, C1 is set to a value of 20 to 200 femtofarads (fF), and C2 is set to a value of 1 to 20 femtofarads (fF).
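A short numerical check of Equations 1 to 5, with capacitance values chosen from the ranges mentioned above (the specific numbers are illustrative only):

```python
# Illustrative values from the ranges given above
C1 = 100e-15    # capacitor 331: 100 fF
C2 = 10e-15     # capacitor 333: 10 fF
V_init = 1.20   # buffer-side voltage while switch 334 is on
V_after = 1.15  # buffer-side voltage after switch 334 is turned off

Q_init = C1 * V_init          # Equation 1
Q_after = C1 * V_after        # Equation 2
Q2 = Q_init - Q_after         # charge conservation, Equation 4 rearranged
V_out = Q2 / C2               # Equation 3 rearranged

# Equation 5: the 50 mV input change appears amplified by the gain C1/C2 = 10
assert abs(V_out - (C1 / C2) * (V_init - V_after)) < 1e-9
print(V_out)  # 0.5 (volts)
```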
 コンパレータ341は、減算器330からの電圧信号と、反転入力端子(-)に印加された所定の閾値電圧Vthとを比較する。コンパレータ341は、比較結果を示す信号を検出信号として転送回路350に出力する。 Comparator 341 compares the voltage signal from subtractor 330 with a predetermined threshold voltage Vth applied to the inverting input terminal (-). Comparator 341 outputs a signal indicating the comparison result as a detection signal to transfer circuit 350.
 同期型の固体撮像素子では、フォトダイオードと3個~4個のトランジスタとを含む簡易な画素回路が、画素ごとに設けられる。これに対し、非同期型の固体撮像装置100では、図18~図20に例示したように、フォトダイオード121aとアドレスイベント検出回路131aとを含む複雑な画素回路が、画素ごとに設けられる。よって、仮にフォトダイオード121aとアドレスイベント検出回路131aの両方を同じチップに配置すると、実装面積が同期型の場合よりも広くなってしまう。そこで、本実施形態の固体撮像装置100では、フォトダイオード121aおよびアドレスイベント検出回路131aをそれぞれ受光チップ120および検出チップ130に配置している。本実施形態によれば、このようなフォトダイオード121aおよびアドレスイベント検出回路131aの分散配置により、実装面積を削減することが可能となる。 In a synchronous solid-state imaging device, a simple pixel circuit including a photodiode and three to four transistors is provided for each pixel. In contrast, in the asynchronous solid-state imaging device 100, as illustrated in Figures 18 to 20, a complex pixel circuit including a photodiode 121a and an address event detection circuit 131a is provided for each pixel. Therefore, if both the photodiode 121a and the address event detection circuit 131a were placed on the same chip, the mounting area would be larger than in the synchronous case. For this reason, in the solid-state imaging device 100 of this embodiment, the photodiode 121a and the address event detection circuit 131a are placed on the light receiving chip 120 and the detection chip 130, respectively. According to this embodiment, the mounting area can be reduced by distributing the photodiode 121a and the address event detection circuit 131a in this manner.
 図21は、第3実施形態の変形例の受光チップ120と検出チップ130の構成を示す回路図である。 FIG. 21 is a circuit diagram showing the configuration of the light receiving chip 120 and the detection chip 130 of a modified example of the third embodiment.
 図19では、電流電圧変換回路310内のN型トランジスタ311、P型トランジスタ312、およびN型トランジスタ313が、検出チップ130内に配置されている。一方で、図21では、電流電圧変換回路310内のN型トランジスタ311およびN型トランジスタ313が、受光チップ120内に配置され、電流電圧変換回路310内のP型トランジスタ312が、検出チップ130内に配置されている。図21の構成は例えば、画素数の増大に伴って、検出チップ130の回路規模が増大するおそれがある場合に採用される。本変形例によれば、N型トランジスタ311およびN型トランジスタ313を受光チップ120内に配置することにより、検出チップ130の回路規模を削減することが可能となる。 19, N-type transistor 311, P-type transistor 312, and N-type transistor 313 in current-voltage conversion circuit 310 are arranged in detection chip 130. On the other hand, in FIG. 21, N-type transistor 311 and N-type transistor 313 in current-voltage conversion circuit 310 are arranged in light-receiving chip 120, and P-type transistor 312 in current-voltage conversion circuit 310 is arranged in detection chip 130. The configuration in FIG. 21 is adopted, for example, when there is a risk that the circuit scale of detection chip 130 will increase with an increase in the number of pixels. According to this modified example, by arranging N-type transistor 311 and N-type transistor 313 in light-receiving chip 120, it is possible to reduce the circuit scale of detection chip 130.
 また、仮にN型トランジスタ311、313の一方を受光チップ120内に配置し、N型トランジスタ311、313の他方を検出チップ130内に配置すると、受光チップ120内にN型トランジスタを形成する工程と、検出チップ130内にN型トランジスタを形成する工程とを行う必要があり、受光チップ120および検出チップ130を製造する工程数が多くなる。本変形例によれば、N型トランジスタ311、313の両方を受光チップ120内に配置することで、受光チップ120および検出チップ130を製造する工程数を少なくすることが可能となる。これにより、固体撮像装置100の製造コストを削減することが可能となる。 Furthermore, if one of the N-type transistors 311 and 313 were placed in the light receiving chip 120 and the other were placed in the detection chip 130, a process of forming an N-type transistor in the light receiving chip 120 and a process of forming an N-type transistor in the detection chip 130 would both be required, increasing the number of processes for manufacturing the light receiving chip 120 and the detection chip 130. According to this modified example, by placing both of the N-type transistors 311 and 313 in the light receiving chip 120, it is possible to reduce the number of processes for manufacturing the light receiving chip 120 and the detection chip 130. This makes it possible to reduce the manufacturing cost of the solid-state imaging device 100.
 以上のように、本実施形態によれば、第1または第2実施形態の固体撮像装置100の実装面積や製造コストを削減することが可能となる。 As described above, according to this embodiment, it is possible to reduce the mounting area and manufacturing costs of the solid-state imaging device 100 of the first or second embodiment.
 以上、本開示の実施形態について説明したが、これらの実施形態は、本開示の要旨を逸脱しない範囲内で、種々の変更を加えて実施してもよい。例えば、2つ以上の実施形態を組み合わせて実施してもよい。 The above describes embodiments of the present disclosure, but these embodiments may be implemented with various modifications without departing from the spirit of the present disclosure. For example, two or more embodiments may be implemented in combination.
 なお、本開示は、以下のような構成を取ることもできる。 In addition, this disclosure can also be configured as follows:
 (1)
 イベントを検出し、前記イベントの検出結果を示すイベントデータを出力する複数の画素と、
 前記複数の画素から出力された前記イベントデータを第1オクターブのイベントデータとし、前記第1オクターブのイベントデータから第2~第iオクターブ(iは2以上の整数)のイベントデータを生成するイベント生成部と、
 前記第1~第iオクターブのイベントデータのうちの少なくとも一部のイベントデータを出力するイベント出力部と、
 を備える固体撮像装置。
(1)
a plurality of pixels that detect an event and output event data indicative of the detection result of the event;
an event generating unit that sets the event data output from the plurality of pixels as event data of a first octave and generates event data of a second to i-th octave (i is an integer equal to or greater than 2) from the event data of the first octave;
an event output unit that outputs at least a part of the event data of the first to i-th octaves;
A solid-state imaging device comprising:
 (2)
 前記第1~第iオクターブのイベントデータにそれぞれ、前記第1~第iオクターブの識別情報であるオクターブ情報を付加するオクターブ情報付加部をさらに備え、
 前記イベント出力部は、前記オクターブ情報が付加された前記イベントデータを出力する、(1)に記載の固体撮像装置。
(2)
an octave information adding unit that adds octave information, which is identification information of the first to i-th octaves, to the event data of the first to i-th octaves, respectively;
The solid-state imaging device according to (1), wherein the event output unit outputs the event data to which the octave information is added.
 (3)
 前記イベント出力部は、前記イベントデータをオクターブごとに画像表現で出力する、(1)に記載の固体撮像装置。
(3)
The solid-state imaging device according to (1), wherein the event output unit outputs the event data in an image representation for each octave.
 (4)
 前記イベント生成部は、第jオクターブ(jは1≦j≦i-1を満たす整数)のイベントデータから第j+1オクターブのイベントデータを生成する、(1)に記載の固体撮像装置。
(4)
The solid-state imaging device according to (1), wherein the event generating unit generates event data of a j+1-th octave from event data of a j-th octave (j is an integer satisfying 1≦j≦i−1).
 (5)
 前記イベント生成部は、m行分(mは2以上の整数)の前記第jオクターブのイベントデータから、1行分の前記第j+1オクターブのイベントデータを生成する、(4)に記載の固体撮像装置。
(5)
The solid-state imaging device according to (4), wherein the event generation unit generates one row of event data for the j+1 octave from m rows of event data for the j octave (m is an integer equal to or greater than 2).
 (6)
 前記イベント生成部は、m行n列分(nは2以上の整数)の前記第jオクターブのイベントデータから、1行1列分の前記第j+1オクターブのイベントデータを生成する、(5)に記載の固体撮像装置。
(6)
The solid-state imaging device according to (5), wherein the event generation unit generates event data for the j+1 octave for 1 row and 1 column from event data for the j octave for m rows and n columns (n is an integer equal to or greater than 2).
 (7)
 前記イベント生成部は、前記m行n列分の前記第jオクターブのイベントデータがk個のイベント発火(kは1≦k≦m×nを満たす整数)を含む場合に、前記1行1列分の前記第j+1オクターブのイベントデータにてイベントを発火させる、(6)に記載の固体撮像装置。
(7)
The solid-state imaging device according to (6), wherein the event generation unit fires an event in the one-row-by-one-column event data of the (j+1)-th octave when the m-row-by-n-column event data of the j-th octave includes k event firings (k is an integer satisfying 1≦k≦m×n).
 (8)
 mは2であり、nは2であり、かつkは1、2、3、および4である、(7)に記載の固体撮像装置。
(8)
The solid-state imaging device according to (7), wherein m is 2, n is 2, and k is 1, 2, 3, or 4.
 (9)
 mは2であり、nは2であり、かつkは2、3、および4である、(7)に記載の固体撮像装置。
(9)
The solid-state imaging device according to (7), wherein m is 2, n is 2, and k is 2, 3, or 4.
 (10)
 mは2であり、nは2であり、かつkは3および4である、(7)に記載の固体撮像装置。
(10)
The solid-state imaging device according to (7), wherein m is 2, n is 2, and k is 3 or 4.
 (11)
 前記複数の画素から出力された前記イベントデータを格納するフレームメモリをさらに備え、
 前記イベント生成部は、前記フレームメモリから出力された前記イベントデータを前記第1オクターブのイベントデータとする、(1)に記載の固体撮像装置。
(11)
A frame memory for storing the event data output from the plurality of pixels,
The solid-state imaging device according to (1), wherein the event generating unit generates the event data output from the frame memory as event data of the first octave.
 (12)
 固体撮像装置と情報処理部とを備える情報処理システムであって、
 前記固体撮像装置は、
 イベントを検出し、前記イベントの検出結果を示すイベントデータを出力する複数の画素と、
 前記複数の画素から出力された前記イベントデータを第1オクターブのイベントデータとし、前記第1オクターブのイベントデータから第2~第iオクターブ(iは2以上の整数)のイベントデータを生成するイベント生成部と、
 前記第1~第iオクターブのイベントデータのうちの少なくとも一部のイベントデータを出力するイベント出力部とを備え、
 前記情報処理部は、前記イベント出力部から出力された前記イベントデータを表示画面に表示する、情報処理システム。
(12)
An information processing system including a solid-state imaging device and an information processing unit,
The solid-state imaging device includes:
a plurality of pixels that detect an event and output event data indicative of the detection result of the event;
an event generating unit that sets the event data output from the plurality of pixels as event data of a first octave and generates event data of a second to i-th octave (i is an integer equal to or greater than 2) from the event data of the first octave;
an event output unit that outputs at least a part of the event data of the first to i-th octaves;
The information processing unit displays the event data output from the event output unit on a display screen.
 (13)
 前記情報処理部は、前記イベント出力部から出力された前記イベントデータから、所定のオクターブ数の前記イベントデータを抽出する抽出部を含み、
 前記情報処理部は、前記抽出部により抽出された前記イベントデータを前記表示画面に表示する、(12)に記載の情報処理システム。
(13)
the information processing unit includes an extraction unit that extracts the event data of a predetermined number of octaves from the event data output from the event output unit,
The information processing system according to (12), wherein the information processing unit displays the event data extracted by the extraction unit on the display screen.
 (14)
 前記情報処理部は、前記イベント出力部から出力された前記イベントデータから、所定のオクターブ数の前記イベントデータを抽出する抽出部を含み、
 前記情報処理部は、前記抽出部により抽出された前記イベントデータを利用して画像認識を行う、(12)に記載の情報処理システム。
(14)
the information processing unit includes an extraction unit that extracts the event data of a predetermined number of octaves from the event data output from the event output unit,
The information processing system according to (12), wherein the information processing unit performs image recognition using the event data extracted by the extraction unit.
 (15)
 前記画像認識は、ユーザーのジェスチャー認識である、(14)に記載の情報処理システム。
(15)
The information processing system according to (14), wherein the image recognition is user gesture recognition.
 (16)
 前記情報処理部は、前記イベント出力部から出力された前記イベントデータから、ユーザーにより指定されたオクターブ数の前記イベントデータを選択する選択部を含み、
 前記情報処理部は、前記選択部により選択された前記イベントデータを記録媒体に記録する、(12)に記載の情報処理システム。
(16)
the information processing unit includes a selection unit that selects the event data for a number of octaves designated by a user from the event data output from the event output unit,
The information processing system according to (12), wherein the information processing unit records the event data selected by the selection unit on a recording medium.
 (17)
 前記情報処理システムは、前記固体撮像装置と前記情報処理部とを備える電子機器である、(12)に記載の情報処理システム。
(17)
The information processing system according to (12), wherein the information processing system is an electronic device including the solid-state imaging device and the information processing unit.
 (18)
 前記電子機器はさらに、前記表示画面を有する表示部を備える、(17)に記載の情報処理システム。
(18)
The information processing system according to (17), wherein the electronic device further includes a display unit having the display screen.
 (19)
 前記情報処理システムは、前記情報処理部を含む電子機器と、前記電子機器の外部に設けられ、前記固体撮像装置を含む撮像装置とを備える、(12)に記載の情報処理システム。
(19)
The information processing system according to (12), further comprising: an electronic device including the information processing section; and an imaging device provided outside the electronic device and including the solid-state imaging device.
 (20)
 前記情報処理システムはさらに、前記電子機器の外部に設けられ、前記表示画面を有する表示装置を備える、(19)に記載の情報処理システム。
(20)
The information processing system according to (19), further comprising a display device provided outside the electronic device and having the display screen.
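As an illustrative note on configurations (5) to (10) above, the following sketch shows one possible software expression of generating octave j+1 event data from m-row-by-n-column blocks of octave j event data, firing an event when a block contains at least k firings (interpreting the enumerated k values as a minimum count); the function name, the NumPy-based image representation, and the default m = n = 2 are assumptions for illustration and are not limiting:

```python
import numpy as np

def generate_next_octave(frame_j, m=2, n=2, k=1):
    """Generate octave j+1 event data from octave j event data.

    frame_j : 2-D array of 0/1 event firings for octave j, in image representation
    m, n    : each m-row-by-n-column block of octave j maps to one element of octave j+1
    k       : an event fires at octave j+1 when the block contains at least
              k firings (1 <= k <= m * n)
    """
    rows, cols = frame_j.shape
    out = np.zeros((rows // m, cols // n), dtype=np.uint8)
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            block = frame_j[r * m:(r + 1) * m, c * n:(c + 1) * n]
            out[r, c] = 1 if int(block.sum()) >= k else 0
    return out

# Example: three octaves from a first-octave frame, with k = 1 (logical OR of each 2x2 block)
octave1 = np.random.randint(0, 2, size=(8, 8), dtype=np.uint8)
octave2 = generate_next_octave(octave1, k=1)
octave3 = generate_next_octave(octave2, k=1)
print(octave1.shape, octave2.shape, octave3.shape)  # (8, 8) (4, 4) (2, 2)
```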
 1:車両、11:車両制御システム、21:車両制御ECU、
 22:通信部、23:地図情報蓄積部、24:位置情報取得部、
 25:外部認識センサ、26:車内センサ、27:車両センサ、
 31:記憶部、32:走行支援・自動運転制御部、33:DMS、
 34:HMI、35:車両制御部、41:通信ネットワーク、
 51:カメラ、52:レーダ、53:LiDAR、54:超音波センサ、
 61:分析部、62:行動計画部、63:動作制御部、
 71:自己位置推定部、72:センサフュージョン部、73:認識部、
 81:ステアリング制御部、82:ブレーキ制御部、83:駆動制御部、
 84:ボディ系制御部、85:ライト制御部、86:ホーン制御部、
 100:固体撮像装置、101:画素アレイ、101a:画素、
 102:イベント取得部、102a:アービタ部、102b:タイムスタンプ部、
 103:イベント生成部、103a:第1フィルタ部、103b:第2フィルタ部、
 103c:第3フィルタ部、104:イベント合成部、
 104a:オクターブ情報付加部、104b:出力タイミング調整部、
 105:イベント出力部、105a:イベントデータ選択部、
 105b:イベントデータ形成部、111:フレームメモリ、
 120:受光チップ、121:受光部、121a:フォトダイオード、
 122:ビア配置部、123:ビア配置部、124:ビア配置部、
 130:検出チップ、131:アドレスイベント検出部、
 131a:アドレスイベント検出回路、132:ビア配置部、
 133:ビア配置部、134:ビア配置部、
 135:行駆動回路、136:列駆動回路、137:信号処理回路、
 200:電子機器、201:撮像部、201’:撮像装置、
 202:表示部、202’:表示装置、203:情報処理部、
 203a:抽出部、203b:選択部、204:記憶部、205:入力部、
 310:電流電圧変換回路、311:N型トランジスタ、
 312:P型トランジスタ、313:N型トランジスタ、
 320:バッファ、330:減算器、331:コンデンサ、
 332:インバータ、333:コンデンサ、334:スイッチ、
 340:量子化器、341:コンパレータ、350:転送回路
1: vehicle, 11: vehicle control system, 21: vehicle control ECU,
22: communication unit, 23: map information storage unit, 24: location information acquisition unit,
25: external recognition sensor, 26: in-vehicle sensor, 27: vehicle sensor,
31: memory unit, 32: driving assistance/automatic driving control unit, 33: DMS,
34: HMI, 35: vehicle control unit, 41: communication network,
51: camera, 52: radar, 53: LiDAR, 54: ultrasonic sensor,
61: analysis unit, 62: action planning unit, 63: operation control unit,
71: self-position estimation unit, 72: sensor fusion unit, 73: recognition unit,
81: steering control unit, 82: brake control unit, 83: drive control unit,
84: body control unit, 85: light control unit, 86: horn control unit,
100: solid-state imaging device, 101: pixel array, 101a: pixel,
102: event acquisition unit, 102a: arbiter unit, 102b: time stamp unit,
103: event generating unit, 103a: first filter unit, 103b: second filter unit,
103c: third filter unit, 104: event synthesis unit,
104a: octave information adding unit, 104b: output timing adjusting unit,
105: event output unit, 105a: event data selection unit,
105b: event data forming unit, 111: frame memory,
120: light receiving chip, 121: light receiving unit, 121a: photodiode,
122: via placement unit, 123: via placement unit, 124: via placement unit,
130: detection chip, 131: address event detection unit,
131a: address event detection circuit, 132: via placement unit,
133: via arrangement section, 134: via arrangement section,
135: row driving circuit, 136: column driving circuit, 137: signal processing circuit,
200: electronic device, 201: imaging unit, 201': imaging device,
202: display unit, 202': display device, 203: information processing unit,
203a: extraction unit, 203b: selection unit, 204: storage unit, 205: input unit,
310: current-voltage conversion circuit, 311: N-type transistor,
312: P-type transistor, 313: N-type transistor,
320: buffer, 330: subtractor, 331: capacitor,
332: inverter, 333: capacitor, 334: switch,
340: quantizer, 341: comparator, 350: transfer circuit

Claims (20)

  1.  イベントを検出し、前記イベントの検出結果を示すイベントデータを出力する複数の画素と、
     前記複数の画素から出力された前記イベントデータを第1オクターブのイベントデータとし、前記第1オクターブのイベントデータから第2~第iオクターブ(iは2以上の整数)のイベントデータを生成するイベント生成部と、
     前記第1~第iオクターブのイベントデータのうちの少なくとも一部のイベントデータを出力するイベント出力部と、
     を備える固体撮像装置。
    a plurality of pixels that detect an event and output event data indicative of the detection result of the event;
    an event generating unit that sets the event data output from the plurality of pixels as event data of a first octave and generates event data of a second to i-th octave (i is an integer equal to or greater than 2) from the event data of the first octave;
    an event output unit that outputs at least a part of the event data of the first to i-th octaves;
    A solid-state imaging device comprising:
  2.  前記第1~第iオクターブのイベントデータにそれぞれ、前記第1~第iオクターブの識別情報であるオクターブ情報を付加するオクターブ情報付加部をさらに備え、
     前記イベント出力部は、前記オクターブ情報が付加された前記イベントデータを出力する、請求項1に記載の固体撮像装置。
    an octave information adding unit that adds octave information, which is identification information of the first to i-th octaves, to the event data of the first to i-th octaves, respectively;
    The solid-state imaging device according to claim 1 , wherein the event output section outputs the event data to which the octave information is added.
  3.  前記イベント出力部は、前記イベントデータをオクターブごとに画像表現で出力する、請求項1に記載の固体撮像装置。 The solid-state imaging device according to claim 1, wherein the event output unit outputs the event data in an image representation for each octave.
  4.  前記イベント生成部は、第jオクターブ(jは1≦j≦i-1を満たす整数)のイベントデータから第j+1オクターブのイベントデータを生成する、請求項1に記載の固体撮像装置。 The solid-state imaging device according to claim 1, wherein the event generating unit generates event data for the j+1th octave from event data for the jth octave (j is an integer satisfying 1≦j≦i-1).
  5.  前記イベント生成部は、m行分(mは2以上の整数)の前記第jオクターブのイベントデータから、1行分の前記第j+1オクターブのイベントデータを生成する、請求項4に記載の固体撮像装置。 The solid-state imaging device according to claim 4, wherein the event generation unit generates one row of event data for the j+1 octave from m rows of event data for the j octave (m is an integer equal to or greater than 2).
  6.  前記イベント生成部は、m行n列分(nは2以上の整数)の前記第jオクターブのイベントデータから、1行1列分の前記第j+1オクターブのイベントデータを生成する、請求項5に記載の固体撮像装置。 The solid-state imaging device according to claim 5, wherein the event generation unit generates 1 row and 1 column of event data for the j+1 octave from m rows and n columns of event data for the j octave (n is an integer equal to or greater than 2).
  7.  前記イベント生成部は、前記m行n列分の前記第jオクターブのイベントデータがk個のイベント発火(kは1≦k≦m×nを満たす整数)を含む場合に、前記1行1列分の前記第j+1オクターブのイベントデータにてイベントを発火させる、請求項6に記載の固体撮像装置。 The solid-state imaging device according to claim 6, wherein the event generating unit fires an event in the one-row-by-one-column event data of the (j+1)-th octave when the m-row-by-n-column event data of the j-th octave includes k event firings (k is an integer satisfying 1≦k≦m×n).
  8.  mは2であり、nは2であり、かつkは1、2、3、および4である、請求項7に記載の固体撮像装置。 The solid-state imaging device of claim 7, wherein m is 2, n is 2, and k is 1, 2, 3, and 4.
  9.  mは2であり、nは2であり、かつkは2、3、および4である、請求項7に記載の固体撮像装置。 The solid-state imaging device of claim 7, wherein m is 2, n is 2, and k is 2, 3, and 4.
  10.  mは2であり、nは2であり、かつkは3および4である、請求項7に記載の固体撮像装置。 The solid-state imaging device of claim 7, wherein m is 2, n is 2, and k is 3 and 4.
  11.  前記複数の画素から出力された前記イベントデータを格納するフレームメモリをさらに備え、
     前記イベント生成部は、前記フレームメモリから出力された前記イベントデータを前記第1オクターブのイベントデータとする、請求項1に記載の固体撮像装置。
    A frame memory for storing the event data output from the plurality of pixels,
    2. The solid-state imaging device according to claim 1, wherein the event generating section generates the event data output from the frame memory as the event data of the first octave.
  12.  固体撮像装置と情報処理部とを備える情報処理システムであって、
     前記固体撮像装置は、
     イベントを検出し、前記イベントの検出結果を示すイベントデータを出力する複数の画素と、
     前記複数の画素から出力された前記イベントデータを第1オクターブのイベントデータとし、前記第1オクターブのイベントデータから第2~第iオクターブ(iは2以上の整数)のイベントデータを生成するイベント生成部と、
     前記第1~第iオクターブのイベントデータのうちの少なくとも一部のイベントデータを出力するイベント出力部とを備え、
     前記情報処理部は、前記イベント出力部から出力された前記イベントデータを表示画面に表示する、情報処理システム。
    An information processing system including a solid-state imaging device and an information processing unit,
    The solid-state imaging device includes:
    a plurality of pixels that detect an event and output event data indicative of the detection result of the event;
    an event generating unit that sets the event data output from the plurality of pixels as event data of a first octave and generates event data of a second to i-th octave (i is an integer equal to or greater than 2) from the event data of the first octave;
    an event output unit that outputs at least a part of the event data of the first to i-th octaves;
    The information processing unit displays the event data output from the event output unit on a display screen.
  13.  前記情報処理部は、前記イベント出力部から出力された前記イベントデータから、所定のオクターブ数の前記イベントデータを抽出する抽出部を含み、
     前記情報処理部は、前記抽出部により抽出された前記イベントデータを前記表示画面に表示する、請求項12に記載の情報処理システム。
    the information processing unit includes an extraction unit that extracts the event data of a predetermined number of octaves from the event data output from the event output unit,
    The information processing system according to claim 12 , wherein the information processing unit displays the event data extracted by the extraction unit on the display screen.
  14.  前記情報処理部は、前記イベント出力部から出力された前記イベントデータから、所定のオクターブ数の前記イベントデータを抽出する抽出部を含み、
     前記情報処理部は、前記抽出部により抽出された前記イベントデータを利用して画像認識を行う、請求項12に記載の情報処理システム。
    the information processing unit includes an extraction unit that extracts the event data of a predetermined number of octaves from the event data output from the event output unit,
    The information processing system according to claim 12 , wherein the information processing unit performs image recognition using the event data extracted by the extraction unit.
  15.  前記画像認識は、ユーザーのジェスチャー認識である、請求項14に記載の情報処理システム。 The information processing system according to claim 14, wherein the image recognition is user gesture recognition.
  16.  前記情報処理部は、前記イベント出力部から出力された前記イベントデータから、ユーザーにより指定されたオクターブ数の前記イベントデータを選択する選択部を含み、
     前記情報処理部は、前記選択部により選択された前記イベントデータを記録媒体に記録する、請求項12に記載の情報処理システム。
    the information processing unit includes a selection unit that selects the event data for a number of octaves designated by a user from the event data output from the event output unit,
    The information processing system according to claim 12 , wherein the information processing unit records the event data selected by the selection unit on a recording medium.
  17.  前記情報処理システムは、前記固体撮像装置と前記情報処理部とを備える電子機器である、請求項12に記載の情報処理システム。 The information processing system according to claim 12, wherein the information processing system is an electronic device including the solid-state imaging device and the information processing unit.
  18.  前記電子機器はさらに、前記表示画面を有する表示部を備える、請求項17に記載の情報処理システム。 The information processing system according to claim 17, wherein the electronic device further comprises a display unit having the display screen.
  19.  前記情報処理システムは、前記情報処理部を含む電子機器と、前記電子機器の外部に設けられ、前記固体撮像装置を含む撮像装置とを備える、請求項12に記載の情報処理システム。 The information processing system according to claim 12, comprising an electronic device including the information processing section, and an imaging device provided outside the electronic device and including the solid-state imaging device.
  20.  前記情報処理システムはさらに、前記電子機器の外部に設けられ、前記表示画面を有する表示装置を備える、請求項19に記載の情報処理システム。 The information processing system according to claim 19, further comprising a display device provided outside the electronic device and having the display screen.
PCT/JP2023/037961 2022-11-14 2023-10-20 Solid-state imaging device and information processing system WO2024106132A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2022-182108 2022-11-14
JP2022182108A JP2024071260A (en) 2022-11-14 2022-11-14 Solid-state imaging device and information processing system

Publications (1)

Publication Number Publication Date
WO2024106132A1 true WO2024106132A1 (en) 2024-05-23

Family

ID=91084220

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2023/037961 WO2024106132A1 (en) 2022-11-14 2023-10-20 Solid-state imaging device and information processing system

Country Status (2)

Country Link
JP (1) JP2024071260A (en)
WO (1) WO2024106132A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2017533497A (en) * 2014-09-16 2017-11-09 クゥアルコム・インコーポレイテッドQualcomm Incorporated Event-based downsampling
JP2020136958A (en) * 2019-02-21 2020-08-31 ソニーセミコンダクタソリューションズ株式会社 Event signal detection sensor and control method
JP2020161992A (en) * 2019-03-27 2020-10-01 ソニーセミコンダクタソリューションズ株式会社 Imaging system and object recognition system

Also Published As

Publication number Publication date
JP2024071260A (en) 2024-05-24
