WO2022221994A1 - Event-driven integrated circuit with interface system - Google Patents

Event-driven integrated circuit with interface system

Info

Publication number
WO2022221994A1
Authority
WO
WIPO (PCT)
Prior art keywords
event
module
driven
address
interface system
Prior art date
Application number
PCT/CN2021/088143
Other languages
English (en)
French (fr)
Inventor
图芭•代米尔吉
西克萨迪克•尤艾尔阿明
乔宁
里克特奥勒•树里
Original Assignee
成都时识科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 成都时识科技有限公司 filed Critical 成都时识科技有限公司
Priority to KR1020237028111A priority Critical patent/KR20230134548A/ko
Priority to PCT/CN2021/088143 priority patent/WO2022221994A1/zh
Priority to CN202180004244.5A priority patent/CN115500090A/zh
Priority to JP2023552014A priority patent/JP2024507400A/ja
Priority to US18/010,486 priority patent/US20240107187A1/en
Priority to EP21937245.5A priority patent/EP4207761A4/en
Publication of WO2022221994A1 publication Critical patent/WO2022221994A1/zh

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N25/00Circuitry of solid-state image sensors [SSIS]; Control thereof
    • H04N25/47Image sensors with pixel address output; Event-driven image sensors; Selection of pixels to be read out based on image data
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/049Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00Digital computers in general; Data processing equipment in general
    • G06F15/76Architectures of general purpose stored program computers
    • G06F15/78Architectures of general purpose stored program computers comprising a single central processing unit
    • G06F15/7807System on chip, i.e. computer system on a single chip; System in package, i.e. computer system on one or more chips in a single package
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/06Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N25/00Circuitry of solid-state image sensors [SSIS]; Control thereof
    • H04N25/50Control of the SSIS exposure
    • H04N25/57Control of the dynamic range
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N25/00Circuitry of solid-state image sensors [SSIS]; Control thereof
    • H04N25/70SSIS architectures; Circuits associated therewith
    • H04N25/79Arrangements of circuitry being divided between different or multiple substrates, chips or circuit boards, e.g. stacked image sensors
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]

Definitions

  • the present invention relates to an event-driven integrated circuit, and in particular to a low-power integrated circuit with an interface module for asynchronously processing events.
  • Event-driven sensors exist in the prior art.
  • One class of event-driven sensors is the event-driven camera, which includes an array of pixels. When the brightness at a pixel changes, the event-driven camera generates an event, where the event includes an identifier of the change, such as -1 for darker and +1 for brighter. Such cameras are called dynamic vision sensors (DVS).
  • There are other known event-driven sensors, such as one-dimensional sensors and sound sensors.
  • A DVS generates events in an asynchronous manner; it is an event-driven sensor.
  • Traditional clock-based cameras need to read out all frames or lines of all pixels and therefore cannot match a DVS.
  • A DVS provides ultra-fast image processing while still maintaining a low data rate, because it only records changes.
  • The term "processing pipeline" refers not only to wiring, such as the interconnections between different components, but also to the data processing performed by components and the data transfer between different components.
  • A processing pipeline also refers to the particular manner in which the various output ports or first components of the system are connected to the various input ports or second components of the system.
  • The journal Science first reported IBM's brain-inspired chip TrueNorth, which has 5.4 billion transistors, 4,096 synaptic cores, 1 million programmable spiking neurons and 256 million configurable synapses.
  • The chip adopts an event-driven design and is an asynchronous-synchronous hybrid chip: the routing, scheduler and controller use a quasi-delay-insensitive, clockless asynchronous design, while the neurons use traditional clocked synchronous logic.
  • The clock is generated by an asynchronous controller, and the global clock frequency is 1 kHz. Calculated with a video input of 30 frames per second at 400*240 pixels, the power consumption of the chip is 63 mW.
  • Prior art 1 "A million spiking-neuron integrated circuit with a scalable communication network and interface", Paul A.Merolla,John V.Arthur etal,Vol.345,Issue 6197,SCIENCE,8Aug 2014.
  • A gesture recognition system based on the IBM TrueNorth chip was disclosed: refer to Fig. 1 and Fig. 4 of that article (or Fig. 11 of the present invention; for the details of this figure refer to the original text, which is not repeated here). The TrueNorth processor located on the NS1e development board receives events output from the DVS128 through USB 2.0. In other words, the connection between the DVS and the processor is via a USB cable. For details, please refer to:
  • Prior art 2 "A Low Power, Fully Event-Based Gesture Recognition System", Arnon Amir, Brian Taba et al, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 21-26 July 2017.
  • Prior art 3 "A hybrid and scalable brain-inspired robotic platform", Zhe Zou, Rong Zhao etal, Scientific Reports, 23 Oct 2020.
  • USB cables or other cable-based implementations all have a certain connection length, so the system may suffer from signal loss and from noise coupled into the cables. Moreover, because a cable is used, each handshake between devices, such as the communication required before and after data transmission, consumes more energy and slows down the processing speed of the system, which adversely affects the performance of brain-like chips.
  • These leading designers are not aware of the adverse effects of this factor and believe that their proposed technical solutions have already exhausted all efforts in pursuit of extremely low power consumption and satisfy the various requirements.
  • In order to obtain better-quality image information, those skilled in the art need to choose a special semiconductor process to manufacture a DVS, such as a CIS-CMOS image sensor process.
  • A standard CMOS process is not suitable for manufacturing high-quality image sensors (the imaging results are not ideal).
  • Meanwhile, an AI processor, such as the sCNN processor described later, would occupy a large amount of chip area and increase chip cost; given the development trend toward ever-smaller chip footprints, such a solution loses commercial competitiveness. Therefore, how to eliminate signal loss and noise interference, and preferably to further reduce chip footprint and manufacturing cost, is an important problem to be solved for the industrialization/commercialization of brain-like chips.
  • The present invention is proposed to solve one of the above technical problems or a combination of several of them; the technical solution of the invention can solve or alleviate one or more of the above technical problems.
  • The technology mentioned in the above background may, in whole or in part, be undisclosed technology; that is, the applicant does not admit that the technology mentioned in the background necessarily constitutes prior art in the sense of patent law, unless there is substantial evidence to prove it.
  • The technical solutions and technical features described in the above background are disclosed together with the publication of the present patent document.
  • An integrated circuit comprising an event-driven sensor (10), an event-driven interface system (20) and an event-driven processor (30), wherein the event-driven sensor (10), the event-driven interface system (20) and the event-driven processor (30) are coupled to a single chip (3).
  • The event-driven sensor (10) is configured to asynchronously generate and asynchronously output an event (100) after an input device (11) of the event-driven sensor (10) detects an event-generating signal or/and a change in the event-generating signal, the event (100) comprising or being associated with an event address indicative of said input device (11), and the output of said event-driven sensor (10) is coupled to an input of said event-driven interface system (20);
  • the event-driven interface system (20) is configured to asynchronously receive the event (100) and preprocess the received event (100), and an output of the event-driven interface system (20) is coupled to an input of the event-driven processor (30);
  • the event-driven processor (30) is configured to: receive an event (101) preprocessed by the event-driven interface system (20), and process the received event (101) in an asynchronous manner;
  • the event-driven sensor (10), the event-driven interface system (20) and the event-driven processor (30) are coupled to a single chip (3) through a transition board (40).
  • Both the event-driven interface system (20) and the event-driven processor (30) are located on the first die (1-1); or both the event-driven sensor (10) and the event-driven interface system (20) are located on the second die (1-2); or one part of the event-driven interface system (20) and the event-driven processor (30) are located on the first die (1-1), while the other part of the event-driven interface system (20) and the event-driven sensor (10) are located on the second die (1-2).
  • Both the event-driven interface system (20) and the event-driven processor (30) are located on the first die (1-1), and the event-driven sensor (10) is located on the second die (1-2), which is stacked over the first die (1-1) where the event-driven interface system (20) and the event-driven processor (30) are located.
  • the interposer (40) is a silicon interposer or a glass interposer.
  • The event-driven sensor (10), the event-driven interface system (20) and the event-driven processor (30) are packaged into a single chip (3) by the above 2.5D or 3D packaging technology.
  • the event-driven sensor (10) is of one or a combination of one or more of the following types: point sensor, 1D sensor, 2D sensor.
  • the event-driven sensor (10) is of one or a combination of the following types: sound/vibration sensor, dynamic vision sensor.
  • the event-driven processor (30) is configured with a spiking neural network.
  • the event-driven processor (30) is configured with a spiking convolutional neural network.
  • the first die and the second die are fabricated using different processes.
  • The event-driven interface system (20) includes at least one interface module (200); the interface modules (200) form a programmable daisy chain and asynchronously process events (100) received from the event-driven sensor (10).
  • The at least one interface module (200) includes a replication module (201) configured to: receive an event (100) and perform a replication operation to obtain a replicated event (100c), the event (100) coming from the event-driven sensor (10) or from another interface module (200) of the event-driven interface system (20) (which is a fusion module in certain classes of embodiments); send the replicated event (100c) to an external processing pipeline; and send the event (100) along the daisy chain.
  • The at least one interface module (200) includes a fusion module (202) configured to: receive events (100, 100e) from at least two different sources, wherein the event (100) comes from another interface module (200) of the event-driven interface system (20) (which is a replication module in certain classes of embodiments) or from the event-driven sensor (10), and the event (100e) comes from other components/modules of this or another integrated circuit or from other event-driven sensors; and send some or all of the received events (100, 100e) along the programmable daisy chain to a subsequent interface module (200).
  • the at least one interface module (200) includes a subsampling module (203) configured to assign a single address to a number of events (100) received.
  • The subsampling module (203) comprises a separation module (203-4) configured to route the event (100), according to the address value of the received event (100), to an associated scaling register, and an address reassembly module (203-5) configured to adjust the event address according to the scaled address value and then send the event (100) with the adjusted address along the programmable daisy chain.
  • The at least one interface module (200) includes a region of interest module (204) configured to: adjust an attribute of at least one event address, the adjustment including one or more of the following: shifting, flipping, transposing or/and rotating at least one attribute of the event address; or/and
  • The at least one interface module (200) includes an event routing module (205) configured to: receive the event (100), add header information to the received event (100), and send the event (100) together with its header information to the event-driven processor (30) or/and to other event-driven processors or other processing pipelines.
  • The at least one interface module (200) includes a rate control module configured to send only a portion of the events (100) along the programmable daisy chain once a maximum rate is exceeded, so as to limit the event rate to not exceed the maximum rate.
  • the at least one interface module (200) includes a mapping module (206) configured to map one event address to another event address.
  • The mapping module (206) includes one or a combination of the following: a region of interest module, a lookup table module, and a flip or/and rotation module; wherein the flip or/and rotation module is configured to flip or/and rotate the event address of the event (100).
  • The at least one interface module (200) includes an event address rewriting module (207) configured to convert a received event address into a uniform address format, whereby a uniform event address format is passed along the programmable daisy chain.
  • the at least one interface module (200) includes an event address filtering module (208) configured to filter out a series of events (100) having a particular selected event address.
  • The event address filtering module (208) is specifically a hot pixel filtering module (208') configured to filter events (100) having specific event addresses, and a preset list of event addresses to be filtered is stored in a CAM memory (208'-3).
  • any one or more interface modules (200) of the event-driven interface system (20) may be bypassed by programmable switches.
  • The present invention also provides an event-driven interface system (20), which is coupled to an event-driven sensor (10) and an event-driven processor (30) to form an integrated circuit, the event-driven sensor (10) generating and asynchronously outputting an event (100), said event (100) including or being associated with an event address indicating the input device (11) on said event-driven sensor (10) that generated the event; said event-driven interface system (20) comprises at least one interface module (200), and the interface modules (200) form a programmable daisy chain and asynchronously process events (100) received from the sensor (10).
  • the at least one interface module (200) includes one or more of the following: a replication module (201), a fusion module (202), a subsampling module (203), a region of interest module (204) and an event routing module (205); where:
  • The replication module (201) is configured to: receive an event (100) and perform a replication operation to obtain a replicated event (100c), the event (100) coming from the event-driven sensor (10) or from another interface module (200) of the event-driven interface system (20); send said replicated event (100c) to an external processing pipeline; and send said event (100) along said daisy chain;
  • the fusion module (202) is configured to: receive events (100, 100e) from at least two different sources, wherein the event (100) comes from another interface module (200) of the event-driven interface system (20) or from said event-driven sensor (10), and the event (100e) comes from a component/module of this or another integrated circuit or from another event-driven sensor; and send part or all of the received events (100, 100e) along said programmable daisy chain to the subsequent interface module (200);
  • the subsampling module (203) is configured to: assign a single address to a number of received events (100);
  • the region of interest module (204) is configured to: adjust an attribute of at least one event address in one or more of the following ways: shifting, flipping, transposing or/and rotating the attribute of the at least one event address; or/and
  • the at least one interface module (200) has the following interface module coupling sequence:
  • a replication module (201), a fusion module (202), a subsampling module (203), a region of interest module (204), and an event routing module (205); or
  • a fusion module (202), a replication module (201), a subsampling module (203), a region of interest module (204), and an event routing module (205).
  • For the replication module (201), the event (100) comes from another interface module (200) of the event-driven interface system (20), specifically the fusion module (202); or/and for the fusion module (202), the event (100) comes from another interface module (200) of the event-driven interface system (20), specifically the replication module (201).
  • Upstream of the interface module coupling sequence there is further included: an event address rewriting module (207) or/and an event address filtering module (208); wherein the event address rewriting module (207) is configured to: convert the received event address into a unified address format, thereby transmitting a unified event address format on the programmable daisy chain;
  • the event address filtering module (208) is configured to filter out a series of events (100) having a particular selected event address.
  • the event (100) is first processed by the event address rewriting module (207) and then processed by the event address filtering module (208).
  • The event address filtering module (208) is specifically a hot pixel filtering module (208') configured to: filter events (100) having specific event addresses, wherein a preset list of event addresses to be filtered is stored in the CAM memory (208'-3).
  • The at least one interface module (200) further includes a mapping module (206) including one or a combination of the following: a region of interest module, a lookup table module, and a flip or/and rotation module; wherein the flip or/and rotation module is configured to flip or/and rotate the event address of the event (100).
  • The at least one interface module (200) further includes a rate control module configured to send only a portion of the events (100) along the programmable daisy chain once a maximum rate is exceeded, so as to limit the event rate to not exceed the maximum rate.
  • any one or more interface modules (200) of the event-driven interface system (20) may be bypassed by programmable switches.
  • The event-driven interface system (20) and the event-driven processor (30) are coupled to a single chip (3) through an interposer (40), or are fabricated in the same die.
  • An event-driven interface system which can transmit events efficiently, flexibly, and with low power consumption, and provides an event preprocessing function for a processor to process events efficiently and conveniently.
  • FIG. 1 is a schematic diagram of an event-driven circuit system according to a certain class of embodiments of the invention.
  • FIG. 2 is a schematic diagram of a circuit system according to another embodiment of the invention.
  • FIG. 3 is a cross-sectional view of a chip in accordance with an embodiment of the invention.
  • FIG. 4 is a schematic cross-sectional view of a 3D chip geometry according to an embodiment of the invention.
  • FIG. 5 is a schematic diagram of a circuit system including a sound/vibration sensor according to an embodiment of the invention.
  • FIG. 6 is a flowchart of processing events generated by the sensor.
  • FIG. 7 is a flowchart of a sound sensor recording vibration-signal events.
  • FIG. 8 is a schematic diagram of a daisy chain of an interface system.
  • FIG. 9 is a schematic diagram of a hot pixel filtering module having the ability to filter events with selected event addresses.
  • FIG. 10 is a schematic diagram of a subsampling module.
  • FIG. 11 is a gesture recognition system based on IBM TrueNorth in the prior art.
  • The embodiments of the method class and the product class may each describe some technical features separately.
  • The present document implies that the corresponding embodiments of the other class also have the corresponding/matching technical features; the matching device/step is merely not explicitly described in the text.
  • Method-type embodiments implicitly include the steps/instructions/functions performed/implemented by the devices/modules/components of the product-type embodiments, and product-type embodiments implicitly include the devices/modules/components that execute the steps/instructions/functions of the method-type embodiments.
  • module refers to a product or part of a product that is implemented solely by hardware, only by software, or by a combination of software and hardware. Unless clearly indicated by the context, it is not implied in the present invention that the above-mentioned terms can only be implemented by hardware or software.
  • The multi-scheme descriptions "A and/or B" and "A or/and B" both include three parallel technical schemes: (1) A; (2) A and B; (3) B.
  • The intended meaning is: the technical solution is allowed to be implemented with tolerances, provided that this does not affect the solution of the technical problem; that is, it is not required that data obtained after strict measurement of actual parameters conform exactly to the general mathematical definition (because no physical entity fully conforms to a mathematical definition). Such a term is therefore not ambiguous and does not render the technical solution unclear.
  • Whether something falls within the limited/expressed range should, in fact, be judged by whether the technical problem can still be solved.
  • B corresponding to A means that B is associated with A, and B can be determined according to A. However, it should also be understood that determining B according to A does not mean that B is only determined according to A, and B may also be determined according to A and/or other information.
  • The terms "first", "second", etc. are usually used to identify and distinguish objects, but this does not constitute a limitation on the number of objects of the same type. Although such a term usually refers to a single object, it does not mean that there is only one object of that type; for example, several may be used for effect enhancement, load sharing, or equivalent replacement.
  • the field described in the scheme belongs to event-driven integrated circuit systems, so its sensors, interface systems, and processors are all event-driven.
  • "integrated circuit system" and "integrated circuit" have basically the same meaning.
  • "interface system" and "interface circuit" have basically the same meaning.
  • "system" here carries the sense of a product attribute.
  • "coupled" refers to an electrical connection between the two or more components in that relationship.
  • An event-driven integrated circuit system 1 includes an event-driven sensor 10 (hereinafter referred to as a sensor), an event-driven interface system 20 (hereinafter referred to as an interface system) and an event-driven processor 30 (hereinafter referred to as a processor).
  • The event-driven sensor 10, the event-driven interface system 20 and the event-driven processor 30 are all divided according to function for convenience of description, but this does not mean that the above components are necessarily physically independent. They can be implemented as three separate independent components, or multiple components can be combined so that multiple functions are integrated in a single component; for example, the sensor 10 and the interface system 20 can be combined, and in particular the interface system 20 and the processor 30 can be combined on the same die (also called a bare die or bare chip). Some of these choices may reduce performance in a certain respect, but the present invention does not limit the physical partitioning.
  • In a certain class of embodiments of the present invention, at least the three components described above, namely the event-driven sensor 10, the event-driven interface system 20 and the event-driven processor 30, are coupled to a single chip (not shown in FIG. 1).
  • the three components described above are coupled to a single chip (in the case of only a single die).
  • the sensor 10 and the processor 30 use the same fabrication process, such as a conventional 65nm CMOS process, but this suboptimal solution comes at the expense of the image quality of the sensor 10.
  • the interposer includes but is not limited to: silicon interposer and glass interposer.
  • The present invention does not limit the material type of the interposer.
  • single chip is meant to include more than one die coupled through an interposer, or to include only one die without the need for an interposer. It should be noted that in some cases the context of the term may imply/restrict that the term represents only one of the above meanings.
  • event-driven sensor 10 is an event-driven 2D array sensor (such as an event-driven camera), and sensor 10 typically includes one or more event-driven sensor input devices 11 .
  • An event-driven camera includes a large number of pixels, and each pixel is an event-driven input device 11 .
  • the input device 11 of the event-driven sensor is configured to asynchronously generate an event upon detection of an event-generating signal or/and a change in the event-generating signal, such as a change in light intensity on a pixel.
  • each event is associated or includes an event address that includes/indicates an identifier of the input device 11, such as the X and Y coordinates of the pixels in the 2D array.
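  • For orientation only, the following Python sketch models such an address-event for a 2D array; the field names (x, y, polarity) are illustrative assumptions rather than terminology taken from the patent, and the hardware described here emits these values as signals rather than software objects.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Event:
    """Address-event emitted by an input device 11 of a 2D event-driven sensor (illustrative)."""
    x: int          # column of the originating pixel
    y: int          # row of the originating pixel
    polarity: int   # +1 for brighter, -1 for darker

# Example: the pixel at (12, 7) became brighter.
ev = Event(x=12, y=7, polarity=+1)
```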
  • the event-driven sensor is a 1D, 2D, 3D or other type of sensor.
  • the sensor 10 is coupled to the input 21 of the event-driven interface system 20 through the output 12 of the sensor 10, and the sensor 10 outputs the event 100 in an asynchronous manner.
  • interface system 20 may include a series of interface modules 200, where each interface module 200 is configured to process incoming events 100 in a programmable manner. In this way, all events 100 processed by the processor 30 have the same format from the perspective of a unified event address structure and possible event headers.
  • The interface module 200 may be configured to perform 1) a filtering step, or/and 2) an address manipulation step, so as to limit the incoming event rate to within the processing capability of the processor 30, or/and to provide a predefined event address format.
  • The interface system 20 includes a series of parallel inputs 21 (e.g., 21-1, 21-2) for receiving events 100 from the sensor 10, and also includes a series of parallel outputs 22 (e.g., 22-1, 22-2) coupled to the input of the processor 30 and configured to transmit the preprocessed events 101 to the processor 30 in a parallel manner. This setup allows multiple events to be transmitted simultaneously, resulting in reduced power consumption and fast event processing.
  • FIG. 2 shows an alternative class of embodiments.
  • the event-driven sensor 10 is a 1D event-driven sensor, such as an event-driven mechanical pressure sensor, which is used to detect mechanical vibrations.
  • sensor 10 , interface system 20 and processor 30 are assembled on a single chip 3 and coupled through an interposer board 40 .
  • In one class of embodiments, the sensor 10, the interface system 20 and the processor 30 are coupled to the same side of the chip 3; in another class of embodiments, they are coupled on both sides of the chip 3.
  • the present invention does not limit whether the above-mentioned three components are assembled on the same side.
  • sensor 10 and interface system 20 are in the same die; while in another class of embodiments, interface system 20 and processor 30 are in the same die.
  • The sensor 10 may also be a point sensor, in which case the event addresses of the point sensor are all the same address.
  • For the sound sensor type, in some embodiments the system may include at least two sound sensors at different physical positions to realize stereo sound collection.
  • This event-driven system is designed such that the application power consumption is extremely low and it can operate asynchronously, making it especially suitable for battery-powered application scenarios with long operating times.
  • the processor 30 is configured as an event-driven spiking artificial neural network (or simply an event-driven spiking neural network, also known in the art as a spiking neural network SNN).
  • the SNN includes a variety of network algorithms.
  • The above-mentioned neural network is configured as an event-driven spiking convolutional neural network (sCNN), which is particularly suitable for ultra-fast application requirements such as object recognition.
  • the specific implementation of the sCNN can at least refer to the prior art (PCT patent application document, title: Event-driven spiking convolutional neural network, publication date: 15, Oct, 2020):
  • the event-driven processor 30 configured as an event-driven spiking neural network or sCNN, combined with the circuit-integrated geometry, can further accommodate the needs of long-term, low-power application scenarios.
  • The integrated circuit system thus configured can output only relevant information about detected objects, such as "[table]", "[chair]", "[** is approaching]" and the like. Compared with traditional technology, it does not need to record and upload large amounts of data, and it avoids the transmission delay of connecting to cloud data, massive computing-power requirements and the associated power consumption. It is therefore well suited to application scenarios requiring low power consumption, low latency, low data-storage pressure and long battery life, such as IoT and edge computing.
  • Under a 65nm process, the average power consumption of the solution adapted to a 64*64 DVS is as low as 0.1mW with a peak power consumption of only 1mW; the average power consumption of the solution adapted to a 128*128 DVS is as low as 0.3mW with a peak power consumption of only 3mW.
  • the processor 30 includes at least two (or more) processors, in particular each processor is configured to perform a different task. Processor 30 is also configured to process events asynchronously.
  • a C4 bump 43 may be used to connect the interposer board 40 , and the interposer board 40 is provided with a number of through holes 42 .
  • The interposer 40 is provided with a number of micro-bumps 41, and two bare dies are arranged on the micro-bumps 41: a first bare die (or first integrated circuit structure) 1-1 and a second bare die (or second integrated circuit structure) 1-2.
  • the coupling of the first die 1-1 and the second die 1-2 can be realized by some optional specific means such as micro bumps ( ⁇ bumps) 41 and through holes 42 .
  • both the event-driven interface system (20) and the event-driven processor (30) are located on the first die (1-1).
  • both the event-driven sensor (10) and the event-driven interface system (20) are located on the second die (1-2).
  • a portion of the event-driven interface system (20) and the event-driven processor (30) are both located on the first die (1-1) and another portion of the event-driven interface system (20) and the event-driven sensor (10) are all located in the second bare die (1-2).
  • the through holes 42 in the present invention include but are not limited to: through silicon vias (TSVs) and through glass vias (TGVs).
  • the different dies described above are coupled through Cu-Cu technology.
  • The interposer 40 of the present invention includes, but is not limited to, a silicon interposer or a glass interposer.
  • Figure 4 is a cross-sectional view of a 3D chip geometry in certain types of embodiments.
  • The interface system 20 and the processor 30 are mounted on the first die 1-1, while the sensor 10 is mounted on the second die 1-2; the interposer 40 makes it possible to spatially separate the first circuit structure 1-1 and the second circuit structure 1-2.
  • The interposer 40 includes a plurality of through holes 42.
  • Micro-bumps 41 are provided between the first bare die 1-1 and the interposer 40; since the interface system 20 and the processor 30 are assembled on the same bare die, namely the first bare die 1-1, electrical coupling between the interface system 20 and the processor 30 can be achieved through the micro-bumps 41 or/and the vias 42 and the like.
  • A through hole 42 is provided in the interface system 20, and the second die 1-2 where the sensor 10 is located and the first die 1-1 are coupled through micro-bumps 41, so that the second die 1-2 is stacked over the first die 1-1.
  • the chip with the 3D structure arranged in this way greatly reduces the occupied area of the chip, improves the information transfer speed between different components, and reduces the overall power consumption of the system.
  • The feature size/technology node of the second die 1-2 and the first die 1-1 may be different; for example, the process of the second die 1-2 can be larger than 65nm, while the process of the interface system and the processor is smaller than 65nm, such as 22/14/10/7/5nm or smaller. This allows a more cost-effective manufacturing process to be selected for each part of the chip, and the present invention does not limit the selection and combination.
  • the processor 30 and the interface system 20 may also use the same manufacturing process and be fabricated on the same integrated circuit structure/die or on different integrated circuit structures.
  • The system includes a sound/vibration sensor 10' connected to an event-driven amplifier 13 configured to amplify the sensor signal and to output events 100 indicative of the varying intensity of the recorded sound/vibration power spectrum, in particular where the events are generated asynchronously for each frequency in the power spectrum.
  • the amplifier 13 is connected to an interface system 20 , wherein the interface system 20 is configured to process any event 100 generated by the amplifier 13 and pass the processed event to the processor 30 .
  • the sensor 10' and the amplifier 13 are arranged on the same chip, which can achieve the advantage of maintaining a single chip structure.
  • the event-driven sensor 10 is of one or a combination of one or more of the following types: point sensor, 1D sensor, 2D sensor.
  • the event-driven sensor 10 is of one or a combination of the following types: sound/vibration sensor, dynamic vision sensor.
  • A certain class of embodiments is depicted in FIG. 6, showing a flowchart for processing an event 100 generated by the sensor 10 in the circuit system 1.
  • The event 100 is preprocessed by the interface system 20 and then output to the processor 30.
  • upstream refers to the side closer to the sensor
  • downstream refers to the side closer to the processor, both of which are related to the sequence of event processing.
  • For example, the replication module 201 is located upstream of the event routing module 205.
  • each geometry between the sensor 10 and the processor 30 represents at least one processing step, which is performed by the interface system 20 .
  • the interface system 20 includes a series of interface modules 200 (including but not limited to 201, 202, 203, 204, 205, 206, 207, 208, 208', etc.).
  • the interface modules 200 can be independently programmable, and any one or more interface modules (200) can be bypassed by programmable switches to support more network configurations.
  • The interface modules 200 may be implemented as hardware circuits and form a programmable daisy chain configured to process incoming events 100 from the sensor 10 in an event-driven, asynchronous manner.
  • Replication module 201: the first interface module 201 is a replication module, which is arranged at a stage following the sensor 10 and includes an input 201-1 coupled to the output 10-1 of the sensor 10, as well as a first output 201-2a and a second output 201-2b.
  • The replication module 201 is configured to receive the event 100 at its input 201-1 and perform a replication operation, i.e. to replicate the received addressed event 100; it forwards the replicated event 100c together with its replicated event address to the first output 201-2a of the replication module 201, and sends the received event 100 along the daisy chain via the second output 201-2b of the replication module 201.
  • Replication events 100c may be fed into external processing pipelines (not shown) of different systems.
  • the replication module 201 allows coupling into other systems before the event is processed in any way.
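  • As a behavioural sketch only (the module itself is a hardware circuit), the replication step can be pictured as below, reusing the Event dataclass from the earlier sketch; the callback names are assumptions, not part of the patent.

```python
class ReplicationModule:
    """Sketch of module 201: copy each incoming event to an external pipeline
    (output 201-2a) and pass the original along the daisy chain (output 201-2b)."""

    def __init__(self, to_external, to_daisy_chain):
        self.to_external = to_external        # models output 201-2a
        self.to_daisy_chain = to_daisy_chain  # models output 201-2b

    def on_event(self, ev):
        copy = Event(ev.x, ev.y, ev.polarity)  # replicated event 100c
        self.to_external(copy)                 # fed into an external processing pipeline
        self.to_daisy_chain(ev)                # event 100 continues along the daisy chain
```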
  • Fusion module 202: the fusion module 202 is coupled to the replication module 201, and the first input 202-1a of the fusion module 202 is coupled to the second output 201-2b of the replication module 201.
  • The fusion module 202 also has a second input 202-1b configured to receive events 100e from other components/modules, which may come from this circuit system or from other circuit systems (not shown), such as events from a second sensor or from the output of another daisy chain.
  • The fusion module 202 also includes an output 202-2 configured to transmit all received events 100 or/and events 100e along the daisy chain.
  • the event 100e is merged into the stream formed by the event 100, so the event 100e and the event 100 will not be distinguished and collectively referred to as the event 100 thereafter.
  • This embodiment allows more components or/and information to be integrated in the processing pipeline of the circuit system, with more flexible system configuration capabilities.
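  • A corresponding sketch of the fusion behaviour, again purely illustrative and reusing the Event type above: events arriving on either input are forwarded into a single daisy-chain stream.

```python
class FusionModule:
    """Sketch of module 202: merge events from two inputs into one stream."""

    def __init__(self, to_daisy_chain):
        self.to_daisy_chain = to_daisy_chain  # models output 202-2

    def on_chain_event(self, ev):     # input 202-1a: events 100 from the chain/sensor
        self.to_daisy_chain(ev)

    def on_external_event(self, ev):  # input 202-1b: events 100e from other pipelines
        self.to_daisy_chain(ev)       # after merging, 100e is no longer distinguished from 100
```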
  • Sub-sampling module 203 (sub-sampling/sum pooling module): The output terminal 202-2 of the fusion module 202 is coupled to the input terminal 203-1 of the sub-sampling module 203.
  • The subsampling module 203 is configured to assign a single address to a plurality of received events 100, so that the number of different event addresses can be reduced. In this way, several event addresses representing several different pixels 11 of the 2D array sensor 10, for example, can be subsampled to effectively fewer pixels.
  • the process of subsampling, in some specific application scenarios, is called binning.
  • the subsampling module 203 may be bypassed by a programmable switch (not shown).
  • Region of interest module 204 (ROI): the output 203-2 of the subsampling module 203 is coupled to the input 204-1 of the region of interest module 204, wherein the region of interest module 204 is configured to adjust a property of at least one event address; specifically, within the ROI module at least one property of the event address can be shifted, flipped, swapped or/and rotated.
  • The operation performed may be implemented as a rewrite operation on the event address.
  • the region of interest module 204 may be further configured to discard events whose address attribute values are outside the programmable address attribute value range.
  • the region of interest module 204 is programmable and is configured to store programmable ranges of address attribute values as described above, the range of address attribute values being set for each address attribute.
  • The region of interest module 204 is also configured to send received events 100, as long as they are not discarded, to the next stage along the daisy chain together with the adjusted addresses.
  • The region of interest module 204 thus allows image cropping or/and other basic geometric operations on the pixel coordinates of events 100.
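  • The following sketch illustrates the kind of address manipulation described above, under the assumption of a rectangular, programmable coordinate range; it is not the patented circuit, only a behavioural model reusing the Event type.

```python
class RegionOfInterestModule:
    """Sketch of module 204: adjust event-address attributes and drop events
    whose coordinates fall outside a programmable range."""

    def __init__(self, to_next, x_range=(0, 63), y_range=(0, 63),
                 flip_x=False, flip_y=False, transpose=False):
        self.to_next = to_next
        self.x_range, self.y_range = x_range, y_range
        self.flip_x, self.flip_y, self.transpose = flip_x, flip_y, transpose

    def on_event(self, ev):
        (x0, x1), (y0, y1) = self.x_range, self.y_range
        if not (x0 <= ev.x <= x1 and y0 <= ev.y <= y1):
            return                          # discard: outside the programmable address range
        x, y = ev.x - x0, ev.y - y0         # shift so the ROI starts at (0, 0)
        if self.flip_x:
            x = (x1 - x0) - x               # flip the X attribute within the ROI
        if self.flip_y:
            y = (y1 - y0) - y
        if self.transpose:
            x, y = y, x                     # swap/transpose the coordinates
        self.to_next(Event(x, y, ev.polarity))
```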
  • Event routing module 205: the input 205-1 of the event routing module 205 is coupled to the output 204-2 of the region of interest module 204 to receive the event 100.
  • The event routing module 205 is configured to optionally associate header information with the received event 100 and to output the event 100 together with its header information at the first output 205-2a.
  • The event routing module 205 is further configured to replicate the event 100, including the header information and the adjusted event address, and to output the replicated event 100 together with the replicated header information at the second output 205-2b of the event routing module 205, which can be coupled to other event-driven processors or other processing pipelines.
  • The event routing module 205 thus configured adds to the circuit system 1 the ability to provide preprocessed event information to any type of processor, or to a processor with whatever input format is required by the running program.
  • The first output 205-2a of the event routing module 205 is coupled to the processor 30, and the processed event 100 with its event address and header information is then passed to the processor 30, which executes processing tasks such as pattern or feature recognition tasks or other applications.
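  • A minimal sketch of this routing behaviour is given below; the dictionary "header" with its single "dest" field is purely an illustrative assumption about what header information might contain.

```python
class EventRoutingModule:
    """Sketch of module 205: attach header information and route the event to the
    processor (output 205-2a) and, optionally, to another pipeline (output 205-2b)."""

    def __init__(self, to_processor, to_other_pipeline=None, destination_id=0):
        self.to_processor = to_processor
        self.to_other_pipeline = to_other_pipeline
        self.destination_id = destination_id          # illustrative header field

    def on_event(self, ev):
        packet = {"header": {"dest": self.destination_id}, "event": ev}
        self.to_processor(packet)                      # output 205-2a -> processor 30
        if self.to_other_pipeline is not None:
            self.to_other_pipeline(dict(packet))       # replicated packet on output 205-2b
```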
  • The circuit system 1 may further include other interface modules 200 that may be configured to perform tasks on the events 100 received from the sensor 10 or the like, such as rate-control tasks, hot-pixel filtering tasks and event-address rewriting tasks (refer to FIGS. 8-9).
  • Rate control module: the rate control module is configured to limit the event rate so that it does not exceed a maximum rate, for example in particular by rate-limiting events with the same event address. When the maximum rate is exceeded, only a fraction of the events are sent along the daisy chain; for example, every n-th received event is not sent along the daisy chain, where n is a value determined according to the current event rate. The maximum rate may be programmable and adjustable in a memory, which may belong to the rate control module or may be located outside the module.
  • the rate control module may include or be connected to a processing unit with a clock for determining the event reception rate of the module. The event rate on such a daisy chain will not exceed the maximum rate, while also limiting the rate of data fed to processor 30.
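  • A possible software model of this rate limiting is sketched below; the windowed rate estimate and the rule for choosing n are assumptions chosen only to mirror the "drop every n-th event above the maximum rate" behaviour described above.

```python
import time

class RateControlModule:
    """Sketch of the rate control module: once the estimated event rate exceeds
    max_rate_hz, drop every n-th event, with n derived from the current rate."""

    def __init__(self, to_next, max_rate_hz, window_s=0.01):
        self.to_next = to_next
        self.max_rate_hz = max_rate_hz      # programmable maximum rate
        self.window_s = window_s            # measurement window of the clocked unit
        self._count = 0
        self._window_start = time.monotonic()
        self._seq = 0

    def on_event(self, ev):
        now = time.monotonic()
        if now - self._window_start >= self.window_s:
            self._count, self._window_start = 0, now
        self._count += 1
        self._seq += 1
        rate = self._count / self.window_s  # crude estimate of the current event rate
        if rate > self.max_rate_hz:
            n = max(2, int(rate / self.max_rate_hz) + 1)
            if self._seq % n == 0:
                return                      # this n-th event is not sent along the chain
        self.to_next(ev)
```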
  • Since the interface modules 200 are programmable, any interface module can easily be bypassed by issuing appropriate commands to it. For example, when coordinate flipping is not required, the entire region of interest module 204 can be bypassed, thereby directly connecting the subsampling module 203 to the event routing module 205.
  • FIG. 7 shows a certain class of embodiments in which the sensor is a sound sensor recording vibrations.
  • The sensor includes an amplifier (refer to FIG. 5) and a filter, and is configured to generate events from the sound sensor, asynchronously encoding the events 100 from a power spectrum recorded by the sensor. Additionally, the amplifier can be configured as a shift channel. As a combined unit, the sound sensor and the amplifier can be viewed collectively as an event-driven sensor 10.
  • the event-driven sensor 10 delivers the event 100 to the fusion module 202 , and the event 100e is delivered from a different processing pipeline and fused into the event 100 generated by the event-driven sensor 10 .
  • the fused event 100 is then passed to the replication module 201, so the fused event 100 is replicated.
  • Replicated events 100c are then passed to the same processing pipeline from which the event 100e is received at the fusion module 202, or to other processing pipelines (not shown). The effect of this is to allow a great deal of freedom in designing or handling the daisy chain. It is not difficult to see that in this class of embodiments, many events 100, 100e may already have been fed in early in the daisy chain.
  • The advantage of the programmable modules of the interface system 20, including the daisy chain, is that the processor 30 can process the events 100 based on the unified event format or/and event address format and perform its intended purpose without further preprocessing.
  • Mapping module 206: between the subsampling module 203 of the interface modules 200, which is responsible for pooling (meaning basically equivalent to subsampling), and the event routing module 205, which is responsible for routing events (refer to FIG. 6), the mapping module 206 (which itself includes the ROI module 204) may be placed in order to enable rich event-address mapping operations.
  • an interface module is/includes a mapping module (such as the mapping module 206 described above), wherein the mapping module 206 is configured to map one event address to another event address.
  • The mapping module 206 includes one or a combination of the following:
  • the region of interest module (ROI) 204;
  • a flip or/and rotation module, which is configured to flip or/and rotate the event address of events.
  • all interface modules of the present invention can be bypassed by their internal programmable switches.
  • FIG. 8 shows a daisy chain implemented by programmable interface modules 200, which are incorporated into interface system 20, in a certain type of embodiment.
  • Event address rewriting module 207: the event-driven sensor 10 provides a stream of events 100 which are sent to the optional event address rewriting module 207.
  • The event address rewriting module 207 is configured to rewrite the event format into a common format for subsequent processing steps, namely converting the event address received from the sensor into a unified address format, so that a unified event address format is passed along the daisy chain.
  • the event address rewrite module 207 may be programmed for a particular sensor model in order to accommodate virtually any type of sensor 10 that provides any event format.
  • the format of the event may be related to the byte-order and format of the event address stored in the generated event.
  • The unified event address format is a predefined data format; subsequent processing can rely on this predefined format, so double checks of the event format can be omitted, achieving faster processing.
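  • Purely as an illustration of what such a rewrite can look like in software, the sketch below unpacks a hypothetical 32-bit little-endian raw event word into the unified (x, y, polarity) form; the bit layout is an assumption, since the actual raw format is sensor-specific and programmable.

```python
import struct

class EventAddressRewritingModule:
    """Sketch of module 207: convert a sensor-specific raw event word into the
    unified address format used on the daisy chain (bit layout is hypothetical)."""

    def __init__(self, to_next):
        self.to_next = to_next

    def on_raw_event(self, raw: bytes):
        word, = struct.unpack("<I", raw)        # assumed little-endian 32-bit word
        x = (word >> 20) & 0x3FF                # assumed bits 29..20: X coordinate
        y = (word >> 10) & 0x3FF                # assumed bits 19..10: Y coordinate
        polarity = +1 if (word & 0x1) else -1   # assumed bit 0: brighter / darker
        self.to_next(Event(x, y, polarity))     # unified event passed down the chain
```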
  • Event address filtering module 208: once event and event-address formatting has been completed by the event address rewriting module 207, the events are further processed by the event address filtering module 208.
  • the event address filtering module 208 is configured to filter out a series of events having a particular selected event address.
  • the selected event address can be stored, read and written in CAM memory (Content-addressable memory). These filters allow filtering out hot pixels or the like. So the event address filtering module 208 may be a hot pixel filtering module.
  • The event address filtering module 208, as the first module or a part of it, can reduce the number of events transmitted on the daisy chain at an early stage, and this reduces the energy consumption of the daisy chain. If the processor 30 also has the ability to filter addresses, the event address rewriting module 207 can be bypassed.
  • The filtered events 100 are then sent to the replication module 201 or/and the fusion module 202, which respectively provide replicated sensor events 100c to external systems and incorporate external event sources 100e.
  • Replication module 201 and fusion module 202 can be bypassed independently by programming.
  • The order of the two may follow two different processing sequences: the replication module 201 first and then the fusion module 202, or vice versa.
  • Subsampling module 203: after the replication module 201 or/and the fusion module 202, the subsampling module 203 processes the incoming events in the manner of the previous embodiment. Placing the subsampling module 203 at this position in the daisy chain allows it to handle all events, even those that originate externally.
  • The region of interest module 204 follows the subsampling module 203 and processes all events sent by the subsampling module 203.
  • The region of interest module 204 reduces the number of event addresses, thereby reducing the workload.
  • The region of interest module 204 may likewise be configured to flip or/and rotate the X, Y coordinates of the event address.
  • Event routing module 205: arranged after the region of interest module 204 is an event routing module 205 configured to prepare an event, for example by providing header information for the event 100, which is then sent to the processor 30.
  • the daisy chain shown in FIG. 8 provides a unified approach for efficient, fast, and flexible processing of events from sensor 10 or other sources.
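  • To summarize how such a chain can be composed and how bypassing fits in, here is a simplified software model; it represents each module as a function returning either the (possibly modified) event or None when the event is dropped, which is an abstraction of the hardware handshaking rather than a description of it.

```python
class DaisyChain:
    """Sketch of a programmable daisy chain of interface modules with bypass switches."""

    def __init__(self, stages):
        self.stages = stages        # list of (name, stage) pairs, applied in order
        self.bypassed = set()       # names of modules currently bypassed

    def bypass(self, name, enabled=True):
        (self.bypassed.add if enabled else self.bypassed.discard)(name)

    def on_event(self, ev):
        for name, stage in self.stages:
            if name in self.bypassed:
                continue            # programmable switch: skip this module
            ev = stage(ev)
            if ev is None:
                return None         # event was filtered out somewhere in the chain
        return ev

# Example: a two-stage chain where coordinate shifting is switched off.
chain = DaisyChain([
    ("hot_pixel", lambda ev: None if (ev.x, ev.y) == (3, 3) else ev),
    ("roi_shift", lambda ev: Event(ev.x - 8, ev.y - 8, ev.polarity)),
])
chain.bypass("roi_shift")
result = chain.on_event(Event(10, 10, +1))   # passes the filter, shifting is bypassed
```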
  • Hot pixel filtering module 208': FIG. 9 shows a specific embodiment of a certain class of the event address filtering module 208: the hot pixel filtering module 208'.
  • The function of the hot pixel filtering module 208' is to filter events with specific event addresses. This allows, for example, such events to be reduced or completely removed from the daisy chain. Such events are removed because the corresponding pixels of an input device, such as a 2D array sensor, are defective or damaged.
  • the hot pixel filtering module 208' includes an input 208'-1 for receiving the event 100.
  • After receiving the event 100, a hot-pixel-filtering enable determination step S800 is performed. If filtering is determined to be disabled (No, Disabled), the hot pixel filtering module 208' can be bypassed directly through the programmable switch (S801) and the event 100 is fed directly to the output 208'-2 of the hot pixel filtering module 208'. If S800 determines filtering to be enabled (Yes, Enabled), then preferably the preset list of event addresses to be filtered is read from the CAM memory 208'-3, and in the address comparison/matching step S802 it is verified whether the address of the event 100 belongs to one of the addresses to be filtered in the list.
  • If there is a match, the event 100 is filtered out of the daisy chain (S803) and dropped. If no address in the list matches the address of the event 100, the event 100 is output to the pipeline at the output 208'-2 of the hot pixel filtering module 208'.
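  • The decision flow above can be modelled as follows, with the CAM memory 208'-3 approximated by a Python set of (x, y) addresses; this is a behavioural sketch only, not the circuit itself.

```python
class HotPixelFilteringModule:
    """Sketch of module 208': drop events whose address is on a preset list."""

    def __init__(self, to_next, hot_pixels=(), enabled=True):
        self.to_next = to_next
        self.cam = set(hot_pixels)      # stands in for the CAM memory 208'-3
        self.enabled = enabled          # S800: filtering enabled or bypassed

    def on_event(self, ev):
        if not self.enabled:            # S801: bypass, feed event straight to output 208'-2
            self.to_next(ev)
            return
        if (ev.x, ev.y) in self.cam:    # S802: address matches the preset list
            return                      # S803: event is filtered out and dropped
        self.to_next(ev)                # no match: event continues on the pipeline
```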
  • FIG. 10 shows the working flow chart of the sub-sampling module 203 in some embodiments.
  • An incoming event 100 is processed by evaluating its associated address, particularly its address coordinates, such as X, Y, Z coordinates.
  • the address of event 100 is split into three different addresses by splitting module 203-4 .
  • In operation S901 the coordinates X, Y, Z are shifted or divided, so that the coordinates X, Y and Z are subsampled into a coordinate set with a lower data volume, which effectively reduces the number of pixel coordinates.
  • the event addresses thus processed are then merged at the address reorganization module 203-5 , and then the adjusted addresses are sent to the subsequent stage along with the event 100 for further processing.
  • Specifically, the splitting module 203-4 is configured to route the event 100 to a scaling register in the associated subsampling module according to the address values (e.g., X, Y, Z coordinates) of the received event. The scaling register is configured to split, subsample, pool or/and shift the received address values and to output them to the address reassembly module 203-5 in the subsampling module. The address reassembly module 203-5 is configured to adjust the event address according to the scaled address values and then send the adjusted event along the daisy chain.
  • For example, by pooling the event addresses of adjacent input devices, the subsampling module can adjust the pixel resolution of a 2D array sensor so that the number of pixels along the X and Y axes is reduced.
  • For example, the resolution of the image processed by the processor can be reduced from 256*256 to 64*64 (see the subsampling sketch after this list).
  • A 1D sensor can likewise be processed by a subsampling module configured as described above.
  • In some embodiments, the event 100 may also include a channel identifier, which is not split and processed as described above but merely loops through (S900) the subsampling module 203 unchanged.
  • any module, component or device exemplified herein that executes instructions may include or otherwise have access to a non-transitory computer/processor readable storage medium or media for storing information, such as computer/processor readable instructions, data structures, program modules and/or other data. Any such non-transitory computer/processor storage medium may be part of or accessible or connectable to the device. Any application or module described herein may be implemented using computer/processor readable/executable instructions, which may be stored or otherwise maintained by such a non-transitory computer/processor readable storage medium.
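As referenced above, the following is a minimal software sketch of a region-of-interest stage such as module 204. It is illustrative only: the Event layout, the coordinate bounds, and the 256*256 array size are assumptions for exposition, not the disclosed hardware implementation.

```python
from dataclasses import dataclass, replace
from typing import Optional

@dataclass(frozen=True)
class Event:
    x: int          # column address of the originating pixel
    y: int          # row address of the originating pixel
    polarity: int   # e.g. +1 for brighter, -1 for darker

def roi_stage(ev: Event,
              x_range=(32, 223), y_range=(32, 223),
              flip_x=False, flip_y=False, swap_xy=False,
              width=256, height=256) -> Optional[Event]:
    """Drop events whose address falls outside a programmable window,
    then optionally flip or swap the surviving X, Y coordinates."""
    if not (x_range[0] <= ev.x <= x_range[1] and
            y_range[0] <= ev.y <= y_range[1]):
        return None                      # event discarded, not forwarded
    x, y = ev.x, ev.y
    if flip_x:
        x = (width - 1) - x
    if flip_y:
        y = (height - 1) - y
    if swap_xy:
        x, y = y, x
    return replace(ev, x=x, y=y)         # forwarded along the daisy chain

print(roi_stage(Event(40, 200, +1), flip_y=True))   # Event(x=40, y=55, polarity=1)
```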
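The programmable daisy chain itself can be pictured as a list of independently bypassable stages applied in order, with a final routing stage attaching header information before the event is handed to the processor. This is only a software model under assumed stage functions and header fields; the actual modules 201-208 are hardware circuits.

```python
def run_daisy_chain(event, stages, bypass=frozenset()):
    """Pass one event through the chained stages in order. A stage that
    returns None drops the event; stage names listed in `bypass` are
    skipped, mimicking the programmable bypass switches."""
    for name, stage in stages:
        if name in bypass:
            continue
        event = stage(event)
        if event is None:
            return None                  # filtered out somewhere along the chain
    return event

def add_header(event, dest=0):
    """Illustrative routing stage: wrap the event with header information
    before it reaches the processor."""
    return {"header": {"dest": dest, "fmt": "x,y,polarity"}, "event": event}

# A toy chain in the FIG. 8 order (events as (x, y, polarity) tuples).
stages = [
    ("rewrite",   lambda e: e),                                   # pass-through placeholder
    ("hot_pixel", lambda e: None if (e[0], e[1]) == (10, 10) else e),
    ("roi",       lambda e: e if e[0] < 128 and e[1] < 128 else None),
    ("route",     add_header),
]
print(run_daisy_chain((5, 7, 1), stages, bypass={"rewrite"}))
# {'header': {'dest': 0, 'fmt': 'x,y,polarity'}, 'event': (5, 7, 1)}
```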
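A software analogue of the S800/S802/S803 flow of the hot pixel filter module 208'. The list of damaged pixel addresses below is invented for the example, and a real CAM compares addresses in hardware rather than through a Python set lookup.

```python
HOT_PIXELS = {(17, 230), (64, 64), (201, 5)}     # assumed addresses of compromised pixels

def hot_pixel_stage(event, enabled=True, hot_pixels=HOT_PIXELS):
    """If filtering is disabled the event bypasses the module (S800 -> 801);
    if its (x, y) address matches an entry of the stored list (S802) it is
    dropped (S803); otherwise it is forwarded at the output 208'-2."""
    if not enabled:
        return event                     # bypass: straight to the output
    x, y, _polarity = event
    if (x, y) in hot_pixels:             # address match against the stored list
        return None                      # filtered out of the daisy chain
    return event

assert hot_pixel_stage((64, 64, +1)) is None
assert hot_pixel_stage((3, 9, -1)) == (3, 9, -1)
```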
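A software analogue of the shift-based subsampling of FIG. 10. The event tuple layout and the 2-bit shift are illustrative assumptions; a right shift by 2 in each of X and Y is one way to map a 256*256 address space onto 64*64, since 256 >> 2 = 64.

```python
def subsample_stage(event, shift_x=2, shift_y=2):
    """Mimic the splitting module 203-4 and the address reassembly module
    203-5: separate the address coordinates, shift each one (S901), and
    rebuild the event with the coarser address."""
    x, y, polarity = event               # split the address fields
    x_sub = x >> shift_x                 # scaling: shift = divide by 2**shift
    y_sub = y >> shift_y
    return (x_sub, y_sub, polarity)      # reassembled, adjusted address

# Several fine-grained addresses now share one coarse address ("binning"):
assert subsample_stage((255, 12, +1)) == (63, 3, +1)
assert subsample_stage((252, 15, -1)) == (63, 3, -1)
```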

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Signal Processing (AREA)
  • Multimedia (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biophysics (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Microelectronics & Electronic Packaging (AREA)
  • Neurology (AREA)
  • Advance Control (AREA)
  • Multi Processors (AREA)
  • Testing Or Calibration Of Command Recording Devices (AREA)

Abstract

The present invention relates to an event-driven integrated circuit comprising a sensor, an interface system, and a processor. Events carrying addresses are generated and processed asynchronously, and the interface system includes a copy module, a merge module, a subsampling module, a region-of-interest module, an event routing module, and the like, arranged as a programmable daisy chain. The sensor, the interface system, and the processor are coupled on a single chip via an interposer, and the different dies can be manufactured using different processes. The solution eliminates the signal loss and noise interference of the prior art, enables high-speed signal processing, and achieves a smaller chip footprint and lower manufacturing cost, thereby solving the prior-art problems of large chip area and weak signal-processing capability. In addition, the carefully designed interface system enriches the functionality and configurability of the interface, providing advantages in power consumption, functionality, and speed for subsequent processing.

Description

具有接口系统的事件驱动集成电路 技术领域
本发明涉及一种事件驱动集成电路,并具体涉及一种具有接口模块的异步处理事件的低功耗集成电路。
背景技术
在现有技术中已有事件驱动(event-driven)传感器。有一类事件驱动传感器是事件驱动相机,其包括具有像素的像素阵列。根据像素的明度变化,事件驱动相机产生事件(event),其中事件包括变化的标识符(identifier),比如-1表示更暗,+1表示更亮,这类相机被称为动态视觉传感器(DVS,Dynamic vision sensors)。还有其它已知的事件驱动传感器,比如一维的传感器、声音传感器。
DVS产生事件是以异步的方式进行的,它是事件驱动的传感器。传统基于时钟(clock-based)的相机需要读取全部的像素的全部帧(frames)或线(lines),其是无法与DVS比拟的。DVS提供超快图像处理,却依然能保持低速率,其原因在于DVS仅仅记录变化(changes)。
但是与基于传统的冯诺依曼(Von-Neumann)架构所依赖的时钟、同步操作处理器相比,后续的事件处理步骤,需要与之根本上不同的处理架构。
为事件驱动系统设计通用(versatile)的处理结构,是一项极具挑战性的任务,特别是考虑到各种各样的预处理(preprocessing)和处理部件之间的互连。与时钟系统相比,事件驱动系统必须根据不同事件处理流程,特别是在不同部件之间交换传输数据时涉及AD-HOC握手机制。这样的握手机制包括必须的交换数据请求、请求的确认、以及后续的数据交换。
在事件驱动系统中的部件,尤其是传感器,考虑到像素阵列(pixel array)尺寸、不同的热像素/噪点(hot-pixels,后称热像素)、事件地址(event addressing)等,可能有不同的规格。当组装不同的部件至系统中时,为适应系统的处理管道(processing pipeline),这将会是一个冗长、耗时、昂贵的任务。术语“处理管道”特指布线(wring),比如不同部件之间的互连,也指部件的数据处理和不同部件之间的数据传输。术语“处理管道(processing pipeline)”也特指多种(various)输出端口(ports)或系统的多种第一部件如何连接至多种输入端口或系统的多种 第二部件的特定方式。
事件驱动系统最根本的设计理念就是追求极致的低功耗以适应边缘计算。为了获得极致的低功耗,本领域技术人员已想尽办法从各个角度来尝试。
在现有技术中,2014年8月8日《科学》杂志首次介绍了IBM的类脑(brain-inspired)芯片TrueNorth,该芯片具有54亿晶体管、4096个神经突触核心、100万个可编程脉冲神经元、256万个可配置突触。芯片结构采用了事件驱动设计,并且是异步-同步混合芯片:路由、调度器、控制器采用的是准延迟(quasi-delay)非敏感无时钟异步设计,而神经元则是传统的有时钟同步电路方案,其时钟是由异步的控制器产生,全局时钟频率为1kHz。若以30帧/秒、400*240像素的视频输入测算,该芯片的功耗为63mW。具体可参考:
现有技术1:“A million spiking-neuron integrated circuit with a scalable communication network and interface”,Paul A.Merolla,John V.Arthur etal,Vol.345,Issue 6197,SCIENCE,8Aug 2014.
在2017年7月21-26日召开的CVPR会议中公开了基于IBM TrueNorth芯片手势识别系统:参考该文章的Fig-1和Fig-4(或本发明的附图11,关于该图细节具体可参考原文,本发明不再赘述),位于NSIe开发板上的TrueNorth处理器通过USB2.0接收来自DVS128输出的事件。换言之,DVS和处理器之间是通过USB线缆连接的。具体可以参考:
现有技术2:“A Low Power,Fully Event-Based Gesture Recognition System”,Arnon Amir,Brian Taba etal,2017 IEEE Conference on Computer Vision and Pattern Recognition(CVPR),21-26 July 2017.
2017年11月,Intel披露了其研发的类脑芯片Loihi,从公开的演讲资料获知该芯片功耗为74mW。在Intel官方网站中公开的神经形态(Neuromorphic)计算进展中指出:在INRC成员用于需要直接访问硬件的领域(例如机器人技术)研究时,Loihi硬件可供选择。这些系统包括一个代号为“Kapoho Bay”的USB外形尺寸。除了为Loihi提供USB接口外,Kapoho Bay还提供了来自iniVation公司的DAVIS 240C DVS硅视网膜相机的事件驱动硬件接口。换言之,对于Loihi而言,其同样是通过USB与DVS进行连接。具体可以参考网络链接(链接内容还可在Internet archive中获取):
https://newsroom.intel.com/news/intel-announces-neuromorphic-computing-research-collabo  rators/#gs.vj5gmb,6 Dec 2018.
2019年8月1日《自然》杂志以封面文章报道了清华大学研发的类脑芯片天机芯(Tianjic),在0.9V典型电压下其功耗为400mW(1.2V时为950mW)。在该团队随后其它文章中,披露了天机芯其它技术细节。该文章指出:单芯片PCB配备了Altera Cyclone 4 FPGA和通信接口USB、SPI,参考该文章Fig-2b,相机通过USB连接至FPGA并最终连接至Tianjic。换言之,该公开的技术中视觉传感器与处理器也是通过USB线缆连接的。该文章可以参考:
现有技术3:“A hybrid and scalable brain-inspired robotic platform”,Zhe Zou,Rong Zhao etal,Scientific Reports,23 Oct 2020.
以上是全球顶尖事件驱动系统设计者为追求极致低功耗而做出努力的举例,相比于基于冯诺伊曼架构的传统CPU约100-200W的功耗,其能效比已获得极大的提升。为了连接传感器和AI处理器,上述现有技术均是通过USB线缆/接口或其它类似接口技术来实现其连接,这些全球顶尖事件驱动设计者并未意识到这有何不妥之处。
然而发明人独立发现并意识到:在上述现有技术中,USB线缆(或其它线缆实现技术)都具有一定连线长度,因此系统可能承受耦合至线缆的信号损失和噪声干扰,并且由于是线缆,因此设备间的每次的握手,比如数据传输前后必要通信,会消耗更多的能量且拖慢系统的处理速度,这些对类脑芯片的性能都产生不利影响。在该领域中的现有技术中,这些顶尖设计者们均未意识到该因素的不利影响而认为所提出的技术方案已用尽了追求极致低功耗的各种努力、方案已满足各项技术指标的需要,也未指出面对上述不利因素应当采取或暗示应当采取行动以消弭该不利影响,因此在无本领域其它技术方案的明确启示下,这些全球顶尖设计者缺乏对上述技术手段需要进一步改进的认识/动机。
此外,发明人在为了解决上述问题的研究中/某些解决方案中进一步发现,对于DVS而言,为了获得质量更佳的图像信息,本领域技术人员需要选用一种特殊的半导体工艺来制造DVS,如CIS-CMOS工艺图像传感器。但是对于AI处理器(如后述的sCNN处理器),其如若采用常规的CMOS工艺,而这种常规CMOS工艺又不适合制造高质量图像传感器(成像效果不理想)。而如果全部使用CIS-CMOS工艺去制造传感器和AI处理器,尤其是在同一芯片/裸片(die)上,AI处理器将会占用大量的芯片面积、增加芯片成本,这对于追求越来越小 芯片占用面积的发展趋势而言,将失去商业竞争力。因此,如何消弭信号损失和噪声干扰,优选地还能进一步追求更低的芯片占用面积及制造成本,对于类脑芯片产业化/商业化而言,是一重要的待解决问题。
由于完全不同于传统的冯诺伊曼架构,事件产生到事件处理之间涉及到事件的传递过程,该过程可以视为一个接口系统,该接口系统的设计将是完全不同以往的技术挑战。如何设计该接口系统,使之能高效、灵活、低功耗地传递事件,同样是本领域技术人员面临的综合挑战之一。
本发明正是为解决上述技术问题之一或若干技术问题的组合而提出的,发明的技术方案能解决或缓解上述一个或多个技术问题的组合。
除明确给出技术来源外,以上背景技术中所提到的技术,可能全部或部分属于或未公开的技术,即申请人不承认背景技术所提及的技术一定属于专利法意义上的现有技术(prior art),除非有实质性的证据能予以证明。与此同时,上述背景技术所公开的技术方案、技术特征,已随本发明专利文件的公开而公开。
发明内容
为解决或缓解上述一个或多个技术问题的组合,本发明提出的技术方案是通过如下方式实现的:
一种集成电路,其包括事件驱动传感器(10)和事件驱动接口系统(20)和事件驱动处理器(30),所述事件驱动传感器(10)和所述事件驱动接口系统(20)以及所述事件驱动处理器(30)耦接于单芯片(3)。
在某类实施例中,所述事件驱动传感器(10),被配置为:在所述事件驱动传感器(10)的输入设备(11)检测到事件产生信号或/和事件产生信号的变化后异步生成和异步输出事件(100),所述事件(100)包括或被关联指示所述输入设备(11)的事件地址,所述事件驱动传感器(10)的输出端耦接至所述事件驱动接口系统(20)的输入端;
所述事件驱动接口系统(20),被配置为:异步接收所述事件(100)并对所接收的事件(100)预处理,所述事件驱动接口系统(20)的输出端耦接至所述事件驱动处理器(30)的输入端;
所述事件驱动处理器(30),被配置为:接收所述事件驱动接口系统(20)预处理后的事件(101),并以异步的方式处理所接收的事件(101);
所述事件驱动传感器(10)和所述事件驱动接口系统(20)以及所述事件驱动处理器(30)之间通过转接板(40)而耦接于单芯片(3)。
在某类实施例中,所述事件驱动接口系统(20)以及所述事件驱动处理器(30)均位于第一裸晶(1-1);或,所述事件驱动传感器(10)和所述事件驱动接口系统(20)均位于第二裸晶(1-2);或,所述事件驱动接口系统(20)的一部分与所述事件驱动处理器(30)均位于第一裸晶(1-1)且所述事件驱动接口系统(20)的另一部分与所述事件驱动传感器(10)均位于第二裸晶(1-2)。
在某类实施例中,所述事件驱动接口系统(20)以及所述事件驱动处理器(30)均位于第一裸晶(1-1),且所述事件驱动传感器(10)所在的第二裸晶(1-2)堆叠在所述事件驱动接口系统(20)以及所述事件驱动处理器(30)所在的第一裸晶(1-1)之上。
在某类实施例中,所述转接板(40)是硅转接板或玻璃转接板。
在某类实施例中,所述事件驱动传感器(10)和所述事件驱动接口系统(20)以及所述事件驱动处理器(30)是通过2.5D或3D封装技术封装于单芯片(3)之上。
在某类实施例中,所述事件驱动传感器(10)属于以下类型中的一种或多种的组合:点传感器、1D传感器、2D传感器。
在某类实施例中,所述事件驱动传感器(10)属于以下类型中的一种或多种的组合:声音/震动传感器、动态视觉传感器。
在某类实施例中,事件驱动处理器(30)被配置有脉冲神经网络。
在某类实施例中,事件驱动处理器(30)被配置有脉冲卷积神经网络。
在某类实施例中,所述的第一裸晶和所述的第二裸晶采用不同的工艺制造。
在某类实施例中,所述的事件驱动接口系统(20)包括至少一个接口模块(200),所述的接口模块(200)构成可编程菊花链形式,异步处理从所述事件驱动传感器(10)接收到的事件(100)。
在某类实施例中,所述至少一个接口模块(200)包括复制模块(201),其被配置为:接收事件(100)并且执行复制操作得到复制事件(100c),所述的事件(100)来自所述事件驱动传感器(10)或来自所述事件驱动接口系统(20)的其它接口模块(200)(在某类实施例中,其是融合模块),并且发送所述复制事件(100c)至外部处理管道,以及沿着所述的菊花链发送所述事件(100)。
在某类实施例中,所述至少一个接口模块(200)包括融合模块(202),其被配置为:至少从两处不同的地方接收事件(100,100e),其中所述的事件(100)来自所述事件驱动接口系统(20)的其它接口模块(200)(在某类实施例中,其是复制模块)或所述的事件驱动传感器(10);所述的事件(100e)还来自所述集成电路或其它集成电路的部件/模块或其它事件驱动传感器,并沿着所述可编程菊花链发送所述接收到的事件(100,100e)的部分或全部至后续的接口模块(200)。
在某类实施例中,所述至少一个接口模块(200)包括二次采样模块(203),其被配置为:为接收到的若干事件(100)分派成单一的地址。
在某类实施例中,所述二次采样模块(203)包括的分离模块(203-4)被配置为:根据接收事件(100)的地址值路由所述事件(100)至关联的所述二次采样模块(203)中的缩放寄存器;所述缩放寄存器被配置为:分割、二次采样、池化或/和移位接收到的地址值,并输出地址值至所述二次采样模块(203)中的地址重组模块(203-5),所述地址重组模块(203-5)被配置为:根据缩放后的地址值来调整事件地址,然后沿着所述可编程菊花链发送调整地址后的事件(100)。
在某类实施例中,所述至少一个接口模块(200)包括兴趣区域模块(204),其被配置为:调整至少一个事件地址的属性,所述的调整方式包括如下方式的一种或多种:移位、翻转、调换或/和旋转至少一个事件地址的属性;或/和
抛弃地址属性值在可编程的地址属性值范围之外的事件(100),沿着所述可编程菊花链发送未被抛弃的事件(100)。
在某类实施例中,所述至少一个接口模块(200)包括事件路由模块(205),其被配置为:接收事件(100),为接收到的事件(100)添加头信息,并连同所述事件(100)的所述头信息发送所述事件(100)至所述事件驱动处理器(30)或/和其它事件驱动处理器或其它处理管道。
在某类实施例中,所述至少一个接口模块(200)包括速率控制模块,其被配置为:当超过最大速度后,仅沿着所述可编程菊花链发送部分所述事件(100),以限制事件的速率不超过最大速率。
在某类实施例中,所述至少一个接口模块(200)包括映射模块(206),其被配置为:将一个事件地址映射成另外一个事件地址。
在某类实施例中,所述映射模块(206)包括如下内容的之一或组合:
兴趣区域模块、查找表模块、翻转或/和旋转模块;其中翻转或/和旋转模块被配置为翻转或/和旋转所述事件(100)的事件地址。
在某类实施例中,所述至少一个接口模块(200)包括事件地址重写模块(207),其被配置为:为接收到的事件地址转换为统一的地址格式,由此在所述可编程菊花链上传递统一的事件地址格式。
在某类实施例中,所述至少一个接口模块(200)包括事件地址过滤模块(208),其被配置为:过滤掉一系列具有特定挑选过的事件地址的事件(100)。
在某类实施例中,所述事件地址过滤模块(208)具体为热像素过滤模块(208’),其被配置为:过滤具有特定事件地址的事件(100),且通过CAM存储器(208’-3)存储预设的待过滤的事件地址列表。
在某类实施例中,所述的事件驱动接口系统(20)的任意一个或多个接口模块(200)可以被可编程开关旁路。
此外本发明还提供一种事件驱动接口系统(20),其被耦接于事件驱动传感器(10)和事件驱动处理器(30)之中,构成集成电路,所述的事件驱动传感器(10)生成和异步输出事件(100),所述事件(100)包括或被关联指示产生事件的所述的事件驱动传感器(10)上的输入设备(11)的事件地址;所述的事件驱动接口系统(20)包括至少一个接口模块(200),所述的接口模块(200)构成可编程菊花链形式,异步处理从所述传感器(10)接收到的事件(100)。
在某类实施例中,所述至少一个接口模块(200)包括以下的一个或多个:复制模块(201)、融合模块(202)、二次采样模块(203)、兴趣区域模块(204)和事件路由模块(205);其中:
所述复制模块(201),其被配置为:接收事件(100)并且执行复制操作得到复制事件(100c),所述的事件(100)来自所述事件驱动传感器(10)或来自所述事件驱动接口系统(20)的其它接口模块(200),并且发送所述复制事件(100c)至外部处理管道,以及沿着所述的菊花链发送所述事件(100);
所述融合模块(202),其被配置为:至少从两处不同的地方接收事件(100,100e),其中所述的事件(100)来自所述事件驱动接口系统(20)的其它接口模块(200)或所述的事件驱动传感器(10);所述的事件(100e)还来自所述集成电路或其它集成电路的部件/模块或其它事件驱动传感器,并沿着所述可编程菊 花链发送所述接收到的事件(100,100e)的部分或全部至后续的接口模块(200);
所述二次采样模块(203),其被配置为:为接收到的若干事件(100)分派成单一的地址;
所述兴趣区域模块(204),其被配置为:
调整至少一个事件地址的属性,所述的调整方式包括如下方式的一种或多种:移位、翻转、调换或/和旋转至少一个事件地址的属性;或/和
抛弃地址属性值在可编程的地址属性值范围之外的事件(100),沿着所述可编程菊花链发送未被抛弃的事件(100);
所述事件路由模块(205),其被配置为:
接收事件(100),为接收到的事件(100)添加头信息,并连同所述事件(100)的所述头信息发送所述事件(100)至所述事件驱动处理器(30)或/和其它事件驱动处理器或其它处理管道。
在某类实施例中,沿着所述可编程菊花链的事件传递方向,所述至少一个接口模块(200)具有如下接口模块耦接顺序:
复制模块(201)、融合模块(202)、二次采样模块(203)、兴趣区域模块(204)和事件路由模块(205);或
融合模块(202)、复制模块(201)、二次采样模块(203)、兴趣区域模块(204)和事件路由模块(205)。
在某类实施例中,对于所述复制模块(201),所述的事件(100)来自所述事件驱动接口系统(20)的其它接口模块(200)具体是所述融合模块(202);或/和对于所述融合模块(202),所述的事件(100)来自所述事件驱动接口系统(20)的其它接口模块(200)具体是所述复制模块(201)。
在某类实施例中,所述接口模块耦接顺序的上游还包括:事件地址重写模块(207)或/和事件地址过滤模块(208);其中的事件地址重写模块(207)被配置为:为接收到的事件地址转换为统一的地址格式,由此在所述可编程菊花链上传递统一的事件地址格式;
其中的事件地址过滤模块(208)被配置为:过滤掉一系列具有特定挑选过的事件地址的事件(100)。
在某类实施例中,所述事件(100)先经过事件地址重写模块(207)的处理,然后经过事件地址过滤模块(208)的处理。
在某类实施例中,所述事件地址过滤模块(208)具体为热像素过滤模块(208’),其被配置为:
过滤具有特定事件地址的事件(100),且通过CAM存储器(208’-3)存储预设的待过滤的事件地址列表。
在某类实施例中,所述至少一个接口模块(200)还包括映射模块(206),所述映射模块(206)包括如下内容的之一或组合:兴趣区域模块、查找表模块、翻转或/和旋转模块;其中翻转或/和旋转模块被配置为翻转或/和旋转所述事件(100)的事件地址。
在某类实施例中,所述至少一个接口模块(200)还包括速率控制模块,其被配置为:当超过最大速度后,仅沿着所述可编程菊花链发送部分所述事件(100),以限制事件的速率不超过最大速率。
在某类实施例中,所述的事件驱动接口系统(20)的任意一个或多个接口模块(200)可以被可编程开关旁路。
在某类实施例中,所述事件驱动传感器(10)和所述事件驱动接口系统(20)以及所述事件驱动处理器(30)之间:通过转接板(40)而耦接于单芯片(3);或被制造在同一个裸晶中。
本发明技术方案与现有技术相比所具有的有益效果包括但不限于:
1、提供了一种传感器、接口、事件驱动处理器的集成解决方案,其能够消弭信号损失和噪声干扰,还能进一步追求更低的芯片占用面积及制造成本,且方案能实现亚毫瓦级的功耗。
2、提供了一种事件驱动接口系统,其能够高效、灵活、低功耗地传递事件,并为处理器高效、方便处理事件而提供事件预处理功能。
以上披露的技术方案、技术特征、技术手段,与后续的具体实施方式部分中所描述的技术方案、技术特征、技术手段之间可能不完全相同、一致。但是该部分披露的这些新的技术方案同样属于本发明文件所公开的众多技术方案的一部分,该部分披露的这些新的技术特征、技术手段与后续具体实施方式部分公开的技术特征、技术手段是以相互合理组合的方式,披露更多的技术方案,是具体实施方式部分的有益补充。与此相同,说明书附图中的部分细节内容可能在说明书中未被明确描述,但是如果本领域技术人员基于本发明其它相关文字或附图的描述、本领域的普通技术知识、其它现有技术(如会议、期刊论文等),可以推知 其技术含义,那么该部分未明确被文字记载的技术方案、技术特征、技术手段,同样属于本发明所披露的技术内容,且如上描述的一样可以被用于组合,以获得相应的新的技术方案。本发明任意位置所披露的所有技术特征所组合出的技术方案,用于支撑对技术方案的概括、专利文件的修改、技术方案的披露。
附图说明
图1是根据发明的某类实施例的事件驱动电路系统示意图;
图2是根据发明的另一实施例的电路系统示意图;
图3是根据发明的某实施例中芯片的剖视图;
图4是根据发明的某实施例的3D芯片几何结构的剖视示意图;
图5是根据发明的某实施例的包括声音/震动传感器的电路系统示意图;
图6是传感器生产的事件处理流程图;
图7是声音传感器记录震动信号事件处理流程图;
图8是接口系统的菊花链(daisy chain)示意图;
图9是拥有过滤具有选定事件地址的事件能力的热像素过滤模块示意图;
图10是二次采样/下采样(sub-sampling)模块的示意图;
图11是现有技术中基于IBM TrueNorth的手势识别系统。
具体实施方式
(一)、关于技术方案描述方式的声明
为清晰描述、充分理解本发明披露的技术方案,在本发明的任意专利文件中,均作如下约定:
该部分内容所描述的实施例,即便在同一附图、同一区域的文字描述,但其也并非仅仅是针对某一个具体实施例的描述,而是对于具有某类技术特征的潜在的实施例的选择性描述。本发明文件公开的实施例是下面某些技术特征选择性的全部合理组合,只要这种组合不是逻辑上的相互矛盾或者无意义的,因此某些技术特征在某些类具体实施方式中并非是必须存在的。
方法类、产品类的实施例可能会单独描述了某些技术特征,通常情况下,本发明文件暗含对应的其它类实施例也同样存在对应的该技术特征/与该技术特征相匹配/相对应/相配合的装置/步骤,仅仅只是未明确文字描述而已。举例而言, 方法类实施例暗含包括了产品类实施例某装置/模块/部件执行/实现的某些步骤/指令/功能、产品类实施例暗含包括了实现方法类实施例执行某步骤/指令/功能的装置/模块/部件等。该些暗含的技术特征,同样用于支撑对技术方案的概括、专利文件的修改、技术方案的披露。
本发明任何位置所涉及的非英文术语括号内的英文单词/词组,是对该术语的含义进一步补充解释,被用于辅助解释其在本领域的通用英文表达方式。当非英文术语无法被正确理解、解读、存在矛盾时,该英文含义可以辅助理解此处的含义,并在必要的时候以该英文含义为准。此外,并非任意位置出现的上述非英文术语均明确采用了上述方式解释,但某处被解释的含义,应当同样可以解释在其它位置处出现的相同的术语。
在说明书内,可能会针对某个具体的术语予以特定的解释,在任意与本发明相关的文件中,如所附的权利要求,均可以根据该解释对相同术语予以解读。
本发明任意文件中所提的“模块”、“模组”、“module”、“component”、“部件”、“组件”“某某部”、“系统”(如有,后同)等,是指仅由硬件、仅由软件、软件和硬件相结合的方式来实施的产品或产品的一部分。除非有明确的上下文指示,本发明中未暗示上述术语只能由硬件或软件作为实施形式。
汉语中的顿号“、”通常表达并列词或词组之间的停顿,分割同类的并列的事,或者分隔用汉字作为序号的序号和内文,因此通常其本身并不明确决定逻辑“或/与”的表达,其所表达的逻辑通常需要根据上下文语境分析。
本发明任意语言的任意专利文件中约定:类似“A、B、C”写法表示:通常表达A或B或C,而根据语境必须是“和”逻辑的则除外;类似“A、B或C”写法表示:A或B或C;类似“A、B和C”写法表示:A和B和C。类似“A,B or C”表示:A或B或C;类似“A,B and C”表示:A和B和C。
而多方案描述方式“A and/or B”、“A or/and B”、“A和/或B”、“A或/和B”均包括三种并列的技术方案:(1)A;(2)A和B;(3)B。
在非数学公式中,“/”一般表示逻辑“或”。
本发明任意专利文件中,类似“第一某某模块输入端”描述方式与“某某模块第一输入端”含义等同,均指同一对象,除非“某某模块”本身不足以区分具体指代对象或引起歧义时应按语境做出最大合理解释。
本发明文件任意位置出现的“可以/可以是”(may、may be、might,表示选 择,如果语境中表达“能力”的则除外),是一种描述优选实施例的方式,其暗示还可以存在潜在的、其它的类型的合理替代方式。本发明任意位置出现的技术术语“大致”、“近似”、“接近”等表达近似含义描述词时,其所要表达的含义是:在不影响解决技术问题的前提下技术方案允许存在误差地实施,即并非要求在严格的实际参数测量后,得出的数据严格符合一般的数学定义(因为不存在完全符合数学定义的物理实体),这种术语并非含糊其辞、模棱两可从而导致不清楚技术方案所限定/表达的范围,事实上应当以是否还能解决技术问题为判定是否落入限定范围的标准。
在本发明任意位置,可能会有表达同一对象但是术语并非完全一样的情况。通常情况下,这是简称与全称的区别、助词等虚词的插入等原因导致的。在上下文语境中,若无其它明确的指代,二者均指同一对象。
表述方式“与A相应的B”,表示B与A相关联,根据A可以确定B。但还应理解,根据A确定B并不意味着仅仅根据A确定B,还可以根据A和/或其它信息确定B。
表述方式“第一”、“第二”等通常用于对象的标识及区分,但这并不构成对同类对象的数量上的限制。虽然其本身通常单指某一对象,但这并不意味着该类对象仅只有一个,比如可能是出于效果增强、压力分摊、等效替换等目的。
即便前文并未强调,在具体技术方案中提到某类实施例时会强调更进一步的有益的技术效果,该进一步的技术效果同样是发明人付出创造性的劳动后获得的。
本发明的附图是为了方便对待公开的技术方案进行描述而绘制的,因而难以客观地、全貌地展示技术方案。示意图展示的尺寸、比例、数量等很可能不是对实际产品的精确刻画,而未展示的技术特征也未必代表实际技术方案没有该技术特征,因而附图细节不应当构成对本发明真实意图的不当限制。
专利制度的立法宗旨并非是要求申请人对其欲披露的技术方案事无巨细地描述。虽然申请人已尽可能多地披露更多的技术细节,但仍不可能且无人可能对任何技术特征均做出无限详尽、无限深入底层地描述。对于本发明文件一些未做过多篇幅介绍的技术特征/目标/手段,其并非本领域技术人员想实施却一直无法实施的技术特征/手段、想实现却一直无法实现的技术目标,即便现有技术也许还不能提供完美的解决方案。申请人认为本领域技术人员基于本领域或其它相关领域普通技术知识、专利、期刊、书籍、技术手册、互联网技术文档等一切现有 技术,即可获知/组合/挖掘/实验出某种/某些可实施方法。
(二)、技术方案描述
方案所描述的领域属于事件驱动的集成电路系统,因此其传感器、接口系统、处理器均是事件驱动的。本发明中的“集成电路系统”与“集成电路”含义基本等同,“接口系统”与“接口电路”含义基本等同,此处的“系统”具有产品属性含义。术语“耦接”(couple),表示两个或多个具有该关系的部件之间电气/电性连接。
参考图1,其展示的是某类实施例的集成电路布局示意图。事件驱动集成电路系统1,其包括事件驱动传感器10(后简称传感器)、事件驱动接口系统20(后简称接口系统)和事件驱动处理器30(后简称处理器)。
需要说明的是,上述事件驱动传感器10、事件驱动接口系统20、事件驱动处理器30,均是为便于描述而根据功能而划分的,但这并不意味着以上部件一定是物理独立的。它们可以被实施为三个分离的独立部件,也可以将多个部件组合在一起,以单独的一个部件实现多个功能的整合/组合,比如传感器10和接口系统20可以组合在一起(尤指在同一个裸晶(die,也称裸芯片、裸片)),接口系统20和处理器30可以组合在一起。或许其中的某些选择方式会造成某方面性能的降低,但本发明对其物理划分形式不作限定。
为了解决USB线缆带来的信号损失、噪声干扰和传输延迟,追求更极致的低功耗,本发明的某类实施例为:将上述事件驱动传感器10、事件驱动接口系统20、事件驱动处理器30至少三个部件耦接于单个芯片(chip)(图1未示出)。在某类实施例中,将上述三个部件耦接于单芯片(仅有单个裸晶情形)。此时,传感器10和处理器30使用相同的制造工艺,比如常规65nm CMOS工艺,但这种次优方案以牺牲传感器10的图像质量为代价。
在某类更优实施例中,所述的传感器10、接口系统20、处理器30至少三个部件是通过转接板(interposer,图1未示出)耦接于单芯片(不止一个裸晶情形)。本发明所述的转接板包括但不限于:硅转接板(silicon interposer)、玻璃转接板(glass interposer)。本发明不对转接板的材质类型进行限定。
在本发明任意位置中,术语“单芯片”是指:包括通过转接板而耦接的不止一个裸晶,或者是:仅有一个裸晶但不需要转接板。需要注意,某些情况下该术 语上下文可能会暗示/限制该术语仅代表上述含义中的一种。
在某类实施例中,事件驱动传感器10是一个事件驱动2D阵列传感器(比如事件驱动相机),且传感器10通常包括一个或多个事件驱动传感器的输入设备11。事件驱动相机包括大量的像素,而每一个像素则是一个事件驱动输入设备11。事件驱动传感器的输入设备11被配置为在检测到事件产生信号或/和事件产生信号的变化(比如像素上的光强度的变化)后异步生成一事件。其中,每一事件被关联或包括一事件地址,该地址包括/指示输入设备11的标识符,比如2D阵列中像素的X和Y坐标。
在某类实施例中,所述的事件驱动传感器是1D、2D、3D或其它类型传感器。
关于事件驱动2D阵列传感器的设计方案,至少可以参考如下现有技术(欧洲专利,题目:Photoarray for detecting time-dependent image data,公开日June,27,2018):
现有技术4:EP1958433B1;
本发明以全文引入的方式,将其纳入本发明公开内容的一部分。限于篇幅,本文不再赘述具体内容。本发明的传感器的实施方式不限于此。
在某类实施例中,传感器10通过传感器10的输出端12耦接至事件驱动接口系统20的输入端21,传感器10通过异步的方式输出事件100。事件驱动接口系统20的接口系统输入端21接收到的每一事件经过各种预处理任务处理后,可以实现统一的后续处理,而这将实现不依赖传感器10的个性。特别地,接口系统20可包括一系列的接口模块200,其中每个接口模块200被配置为以可编程方式来处理进来的事件100。如此以来,从统一的事件地址结构(event address structure)和可能存在的事件头(event header)角度,处理器30处理的所有的事件100都具有相同的格式。再则,在某类实施例中,接口模块200可配置为包括执行1)过滤步骤;或/和2)地址操纵步骤。示例地,为限制突发事件率(incident event rate)在处理器30处理能力之内,或/和提供预定义的事件地址格式。上述每一步操作,都可以有效降低电路系统1的能量消耗。
为并行信号处理,接口系统20包括一系列/多个并行输入端21(如21-1、21-2)以用于从传感器10接收事件100,也包括一系列/多个并行输出端22(如22-1、22-2)耦接至处理器30的输入端,且被配置为将被预处理后的事件101以并行 的方式传输给处理器30。这种设置方式允许同时地传输多个事件,实现了降低功耗和事件快速处理。
图2展示的是某类替代的实施例。在这种情况下,事件驱动传感器10是一个1D事件驱动传感器,比如是一个事件驱动机械压力传感器,其被用于检测机械振动。类似地,传感器10、接口系统20和处理器30被装配在单芯片3上,且通过转接板40耦接。
在某类实施例中,处理器10、接口系统20和处理器30被耦接于芯片3的同一侧;在另一类实施例中,处理器10、接口系统20和处理器30被耦接于芯片3的两侧。本发明不对上述三个部件是否被装配在同一侧进行限定。
在某类实施例中,传感器10和接口系统20在同一个裸晶中;而在另一类实施例中,接口系统20和处理器30在同一个裸晶中。
此外,传感器10还可以是点传感器(point sensor),此时该点传感器的事件地址均是同一个地址。对于声音传感器类型,某类实施例中,其可以包括至少两个物理位置不同的该声音传感器以实现立体音效采集。
与非事件驱动或/和不在单芯片3上的传统电路系统相比,这种事件驱动系统的设计方式可以实现特别低的应用能量消耗,且可以异步操作。尤其是适用基于电池供电、超长工作时间的应用场景。
将多个事件驱动部件装配在单芯片3上的协同作用,会致使电路系统1能够以极低的能量足迹(energy footprint),允许高速处理操作。这些操作可以是特征检测和分类方法。为了这些目的,处理器30被配置为事件驱动脉冲人工神经网络(event-driven spiking artificial neural network,或简称事件驱动脉冲神经网络,也即本领域常称的脉冲神经网络SNN)。SNN包括多种网络算法,优选地,上述神经网络被配置为事件驱动脉冲卷积(convolutional)神经网络(sCNN),该种网络尤其适用超快应用需要,比如物体识别。
所述sCNN的具体实现方式,至少可以参考现有技术(PCT专利申请文件,题目:Event-driven spiking convolutional neural network,公开日:15,Oct,2020):
现有技术5:WO2020/207982A1;
本发明以全文引入的方式,将其纳入本发明公开内容的一部分。限于篇幅,本文不再赘述具体内容。本发明的sCNN的实施方式不限于此,且本发明不排除某具体实施例中还可以是SNN+ANN等异构融合网络。
被配置为事件驱动脉冲神经网络或sCNN的事件驱动处理器30与电路集成的几何结构相结合,可进一步适应长时间、低功耗的应用场景的需要。
被如此配置的集成电路系统,可以实现只输出包括被检测到的物体的相关信息,比如“[桌子]”、“[椅子]”、“[**正在靠近]”等类似的消息。与传统技术相比,其不需要记录和上传大量的数据信息、免去了连接云端数据的信息传输延时、海量的算力需求、功耗消耗,非常适用低功耗、低延迟、低数据存储压力、长续航的应用场景,如IoT、边缘计算领域。在根据本发明的某具体实施例,65nm工艺下适配64*64的DVS的方案的平均功耗低至0.1mW,峰值功耗仅为1mW;适配128*128的DVS的方案平均功耗低至0.3mW,峰值功耗仅为3mW。
在某类实施例中,处理器30包括至少两个(或更多)处理器,特别地每个处理器被配置为执行不同的任务。处理器30还被配置为异步处理事件。
参考图3,其是某类实施例中芯片的剖视图。在封装衬底4上,举例地,可以通过C4凸点(bump)43连接转接板40,转接板40中设置有若干的通孔42。而转接板40上设置有若干微凸点41,在微凸点41上设置有两块裸晶:第一裸晶(或称第一集成电路结构)1-1、第二裸晶(或称第二集成电路结构)1-2。基于转接板40的设置,可以通过微凸点(μbump)41、通孔42等一些可选的具体手段可实现第一裸晶1-1和第二裸晶1-2的耦接。
在某类实施例中,事件驱动接口系统(20)以及事件驱动处理器(30)均位于第一裸晶(1-1)。
在某类实施例中,事件驱动传感器(10)和事件驱动接口系统(20)均位于第二裸晶(1-2)。
在某类实施例中,事件驱动接口系统(20)的一部分与事件驱动处理器(30)均位于第一裸晶(1-1)且事件驱动接口系统(20)的另一部分与事件驱动传感器(10)均位于第二裸晶(1-2)。
本发明所述通孔42包括但不限于:硅通孔(through silicon via,TSV)、玻璃通孔(through glass via,TGV)。
在某些实施例中,通过Cu-Cu技术耦接上述不同的裸晶。
本发明所述转接板40包括但不限于:硅转接板或玻璃转接板。
图4是某类实施例中,3D芯片几何结构的剖视图。示例地,接口系统20和处理器30被装配在第一裸晶1-1上,而传感器10则被装配于第二裸晶1-2上, 这是通过转接板40来实现第一电路结构1-1和第二电路结构1-2空间隔离的可能性。
举例地,封装衬底4与转接板40之间可以是C4凸点43。转接板40内包括若干通孔42。
第一裸晶1-1与转接板40之间包括微凸点41,接口系统20和处理器30被装配于同一裸晶,即第一裸晶1-1,通过微凸点41或/和通孔42等可实现接口系统20和处理器30之间的电气/电性耦接。在接口系统20中设有通孔42,且在传感器10所在第二裸晶1-2与上述第一裸晶1-1之间通过微凸点41相耦接,因而第二裸晶1-2堆叠在第一裸晶1-1之上。
该种设置的3D结构的芯片,极大地缩小了芯片的占用面积,提升了不同部件之间的信息传递速度,降低了系统总体功耗。
特别且有利地,具有如此结构设计的芯片,第二裸晶1-2与第一裸晶1-1的特征尺寸/制造工艺(technology node)可以不同,比如第二裸晶1-2的工艺可以是大于65nm的,而接口系统和处理器的工艺是小于65nm的,比如22/14/10/7/5nm或更小。这允许芯片的生产制造过程中,选择一个更具性价比的制造工艺,本发明对此选择组合方式不做限定。
可选地,处理器30和接口系统20也可以采用相同的制造工艺,且被制造于同一集成电路结构/裸晶上或者不同集成电路结构上。
在图5中示例的某类实施例中,系统包括声音/震动传感器10’,其与事件驱动放大器13相连,且被配置为放大传感器信号,且输出指示记录的声音/震动能量谱(power spectrum)强度(intensity)变化的事件100,尤其是其中的事件是为能量谱中每一频率而异步产生的。放大器13和声音传感器10’于是构成了事件驱动传感器10。放大器13和接口系统20相连接,其中的接口系统20被配置为处理放大器13产生的任意事件100并将其处理后的事件传递给处理器30。在该类实施例中,传感器10’和放大器13被装置于同一芯片上,这可以实现维持单芯片结构的优点。
在某类实施例中,所述事件驱动传感器10属于以下类型中的一种或多种的组合:点传感器、1D传感器、2D传感器。
在某类实施例中,所述事件驱动传感器10属于以下类型中的一种或多种的组合:声音/震动传感器、动态视觉传感器。
在图6中描述的是某类实施例,展示的是处理电路系统1中传感器10生成的事件100的流程图。比如,包括一系列像素11的2D传感器10,预处理事件100是通过接口系统20来实现的,预处理后并输出至处理器30。
本发明中术语“上游”是指更靠近传感器一侧,术语“下游”是指更靠近处理器一侧,二者均与事件处理的先后顺序有关,如图6中复制模块201位于事件路由模块205的上游。
该图中,在传感器10和处理器30之间的每一个几何图形都代表至少一个处理步骤,这些处理步骤由接口系统20执行。为了实现这个目的,接口系统20包括一系列的接口模块200(包括但不限于201、202、203、204、205、206、207、208、208’等)。接口模块200可以是独立可编程的,任意一个或多个接口模块(200)可以被可编程开关旁路,以支持更多的网络配置。优选地,接口模块200可特别地被实施为硬件电路,且构成可编程菊花链(programmable daisy chain),该菊花链被配置为以事件驱动和异步的方式,异步处理从传感器10传入的事件100。
复制模块201(copy module):在图6的某类示例中,第一个接口模块201是复制模块,其被配置在传感器10的后续阶段,且包括的一个输入端201-1耦接至传感器10的输出端10-1,还包括第一输出端201-2a和第二输出端201-2b。复制模块201的输入端201-1被配置为接收事件100并且执行复制操作,即复制接收到的带址的事件100,并且被配置为转送/发送(forward)复制得到的事件100c及其复制的事件地址至复制模块201的第一输出端201-2a,以及沿着菊花链发送接收到的事件100至复制模块201的第二输出端201-2b。
复制事件100c可以被送入(feed)不同系统的外部处理管道(未示出)。复制模块201允许在该事件被任何处理之前就被耦合至其它系统之中。
融合模块202(merge module):在某类实施例中,融合模块202耦接复制模块201,融合模块202的第一输入端202-1a与复制模块201的第二输出端201-1b相耦合。融合模块202还设置有第二输入端202-1b,其被配置为从其它部件/模块中接收事件100e,而这些部件、模块可能来自本电路系统或其它电路系统(未示出),比如第二传感器或者菊花链输出的事件。耦合模块202还包括输出端202-2,其被配置为沿着菊花链发送所有的接收到的事件100或/和事件100e。此后,事件100e被融合到事件100所构成的流(stream)中,因而此后不再区分事件100e和事件100而统称为事件100。该实施例允许在电路系统的处理管道中 集成更多的部件或/和信息,具有更灵活的系统配置能力。
二次采样模块203(sub-sampling/sum pooling module):融合模块202的输出端202-2耦接至二次采样模块203的输入端203-1。二次采样模块203被配置为:为接收到的若干事件100分派成单一的地址,如此即可实现不同事件地址的个数的降低。通过这种方法,举例而言,代表2D阵列传感器10的若干不同像素11的若干事件地址,可以被二次采样成事实上较少的像素。二次采样的处理,在某些特定应用场景下,被称为像素组合(binning)。
二次采样模块203可以通过可编程开关(未示出)而被旁路(bypass)。
兴趣区域模块204(Region Of Interest module,ROI):二次采样模块203的输出端203-2与兴趣区域模块204的输入端204-1耦接,其中兴趣区域模块204被配置为调整(adjust)至少一个事件地址的属性(property),具体而言调整的方式可以通过移位(shift)、翻转(flip)、调换(swap)或/和旋转(rotate)至少一个事件地址的属性,ROI模块内执行的操作可以是通过事件地址重写操作。此外,兴趣区域模块204还可以进一步被配置为抛弃(discard)那些地址属性值在可编程的地址属性值范围之外的事件。兴趣区域模块204是可被编程的,且被配置为存储可编程的上述地址属性值范围,该地址属性值范围是针对每个地址属性而设置的。此外,兴趣区域模块204还被配置为发送接收到的事件100,只要这些事件100没有被抛弃,连同调整后的地址沿着菊花链一同被向下一级发送。
在某类实施中,从2D图像角度,该兴趣区域模块204允许在事件100像素坐标上对裁剪图像或/和其它基本的几何学上的操作。
事件路由模块205(event routing module):事件路由模块205的输入端205-1与兴趣区域模块204的输出端204-2耦接,接收事件100。事件路由模块205被配置为:可选地,为接收到的事件100添加(associate)头信息(header information),并将事件100连同其头信息输出至事件路由模块205的第一输出端205-2a。可选地,事件路由模块205被配置为复制包括头信息以及调整后的事件地址的事件100,并连同其复制的头信息输出该复制后的事件100至事件路由模块205的第二输出端205-2b,该第二输出端205-2b可以耦接至其它事件驱动处理器或其它处理管道。
如此配置后的事件路由模块205由此为电路系统1增添了如下能力:可以为任意类型处理器、处理器或处理器用运行的程序需要的任意类型输入格式,从事 件中提供预处理信息。
事件路由模块205的第一输出端205-2a与处理器30耦合,带有事件地址和头信息的处理后的事件100之后就被传入处理器30,并被执行处理任务,比如模式或特征识别任务或者其它应用。
可选地,电路系统1可能进一步包括其它的接口模块200,这些接口模块200可以被配置为在从传感器10或类似地方接收的事件100上执行任务,比如速率控制任务、热像素过滤任务、事件地址重写任务(参考图8-9)。
速率控制模块(rating limit module):被配置为限制事件的速率不超过最大速率。举例地:尤其是限制具有相同事件地址的速率;当超过最大速度后,仅沿着菊花链发送部分(fraction)事件,比如每第n个接收到的事件就不会被沿着菊花链发送,n是根据当前事件速率而确定的一个值;该最大速率可以是可编程地、可调地存储于一个存储器中,该存储器可属于速率控制模块,也可来自该模块外部。速率控制模块可包括或连接至一个带有时钟的处理单元,以用于确定该模块的事件接收速率。如此菊花链上的事件速率将不会超过最大速率,同时也限制了送入处理器30的数据速率。
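A minimal software sketch of the rate control idea described above. The sliding-window rate estimate and the formula for n are assumptions added for illustration; the disclosure only specifies that n is determined from the current event rate and that the programmable maximum rate is stored in a memory.

```python
def rate_limit(events, max_rate_hz, window=1000):
    """Forward events unchanged while the measured rate stays below the
    programmable maximum; once it is exceeded, drop every n-th received
    event so that only a fraction continues along the daisy chain."""
    out, times, received = [], [], 0
    for t, payload in events:            # events as (timestamp_seconds, payload)
        times.append(t)
        if len(times) > window:
            times.pop(0)
        if len(times) >= 2:
            rate = (len(times) - 1) / (times[-1] - times[0] + 1e-12)
        else:
            rate = 0.0
        received += 1
        if rate > max_rate_hz:
            # crude heuristic for n: the closer the rate is to the limit,
            # the larger n becomes and the fewer events are dropped
            n = max(2, int(rate / (rate - max_rate_hz)))
            if received % n == 0:
                continue                 # this event is not forwarded
        out.append((t, payload))
    return out
```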
对于每个接口模块200,其是可编程的,很显然这很容易通过向模块发出合适的指令即可旁路该接口模块。比如坐标翻转并非需要的时候,可以通过旁路整个兴趣区域模块204来实现,这样就实现了二次采样模块203和事件路由模块205之间的直连。
图7展示的是传感器记录震动的声音传感器时的某类实施例。其中传感器包括一个放大器(参考图5)和滤波器(filter),并且传感器被配置为从声音传感器生成事件、根据传感器记录的能量谱(power spectrum)异步编码事件100。此外,放大器可以被配置为移位通道(shift a channel)。作为一个合并单元,声音传感器和放大器可以整体被视为一个事件驱动传感器10。
如前所述的方式,事件驱动传感器10传递事件100至融合模块202,事件100e从不同的处理管道传递过来,并被融合至事件驱动传感器10产生的事件100中。同样如前例所述的那样,融合后的事件100然后被传递至复制模块201,所以融合后的事件100被复制了。复制事件100c然后被传递至相同的处理管道,这就是事件100e在融合模块202接收的来源,或者被传递至其它的处理管道(未示出)。这实现的效果是允许在设计或处理菊花链时有很大的自由度。不难发现, 在该类实施例中,在菊花链早期许多事件100、100e可能就已被送入了。包括菊花链的接口系统20的可编程模块的优势在于:在处理器30中,基于统一的事件格式或/和事件地址格式,事件100可以被处理且处理器执行它的预计目的且不需要进一步的处理。
复制模块201和融合模块202的协同合作,尤其是在对事件100进行处理的早期,很大程度上增强了系统的互用性(interoperability)。
映射模块206(mapping module):在接口模块200的负责池化(pooling,含义与二次采样基本等同)的二次采样模块203、负责事件的路由的事件路由模块205之间(可参考图6),映射模块206(其本身就包括ROI模块204)可被置于其间,目的在于实现丰富的事件地址映射操作。
根据某类实施例,某一个接口模块是/包括映射模块(如上述的映射模块206),其中的映射模块206被配置为将一个事件地址映射成另外一个事件地址。映射模块206包括如下内容的之一或组合:
1、兴趣区域模块(ROI 204);
2、查找表(LUT)模块;
3、翻转或/和旋转模块;其被配置为翻转或/和旋转事件的事件地址。
优选地,本发明所有的接口模块(包括后述的)均可以被其内部的可编程开关旁路。
图8展示的某类实施例中是通过可编程接口模块200实现的菊花链,其被并入到接口系统20中。
事件地址重写模块207(event address rewrite module):事件驱动传感器10提供事件10的流(stream),并被发送至可选的事件地址重写模块207。该事件地址重写模块207被配置为重写事件格式为通用/统一格式(common format)以用于后续的处理步骤,即将从传感器接收到的事件地址转换为统一的地址格式,由此在菊花链上传递统一的事件地址格式。事件地址重写模块207可以是针对特定传感器模型而编程的,目的在于适应实际上提供了任意事件格式的任意类型传感器10。事件的格式可能与字节顺序(byte-order)和存储在生成事件的事件地址格式有关。统一的事件地址格式是预先定义的数据格式,这实现了后续的处理可以依赖该预定义的数据格式,因此可以省略关于事件格式的双重检查(double check)实例,实现更快速的处理。
事件地址过滤模块208(event address filter module):一旦普通事件和事件地址格式化(formatting)被事件地址重写模块207完成,事件将进一步被事件地址过滤模块208处理。事件地址过滤模块208被配置为过滤掉一系列具有特定挑选过的事件地址的事件。所述的挑选过的事件地址可以被存储、读、写于CAM存储器(Content-addressable memory)。这些过滤允许过滤掉热像素或类似物。所以事件地址过滤模块208可以是热像素过滤模块。事件地址重写模块207之后,事件地址过滤模块208作为第一模块或其一部分,它可以在早期就降低在菊花链上传递的事件的数量,而这种处理方式将降低菊花链的能量消耗。如果处理器30也具有过滤地址的能力时,事件地址重写模块207可以被旁路掉。
过滤后的事件100之后将被发送至复制模块201或/和融合模块202,该两个模块可分别实现向外部系统100c提供传感器的复制事件100c和并入外部的事件100e源。复制模块201和融合模块202可以被独立地通过编程而被旁路掉。
在同时包括复制模块201和融合模块202的技术方案中,二者的顺序可以具有两种不同的先后处理顺序:先复制模块201后融合模块202,或者反之。具体可参考前述图6-7实施例的描述。
在复制模块201或/和融合模块202之后,二次采样模块203将以前述实施例的方式来处理输入的事件。在菊花链特定的位置上放置二次采样模块203可以处理所有的事件,即便事件是来源于外部。
兴趣区域模块204处于二次采样模块203的后续部分,且处理二次采样模块203送来的一切事件。兴趣区域模块204降低了事件地址的数量,由此降低了工作压力负载。兴趣区域模块204可以同样被配置为翻转或/和旋转事件地址的X、Y坐标。
在兴趣区域模块204的后续部分安排的是事件路由模块205,其被配置为:准备事件,比如提供事件10的头信息,该事件并被发送至处理器30。
图8所示的菊花链提供了一种为实现高效、快速、灵活地处理来自传感器10或其它源头的事件的统一处理方式。
热像素处理模块208’(hot pixel filter module):图9是事件地址过滤模块208的某类具体实施例:热像素过滤模块208’。该热像素过滤模块208’的功能是过滤具有特定事件地址的事件。举例而言,这允许减少或彻底移除菊花链上的该类事件。这类事件被移除是因为输入设备,比如2D阵列传感器的像素被削弱 (compromised)或者损坏。热像素过滤模块208’包括输入端208’-1,用于接收事件100。
接收到事件100后,在热像素过滤使能(enable)判断步骤S800后,如果判断为非使能(No,Disabled),则可以通过编程开关而直接旁路掉热线像素过滤模块208’而将事件100直接送入801热像素过滤模块208’的输出端208’-2。如果S800判断为使能(Yes,Enabled),优选地,那么从CAM存储器208’-3中读取预设的待过滤的事件地址列表。在地址比对/匹配(match)步骤S802中,验证事件100的地址是否属于列表中待过滤的地址之一。如果接收到的事件100地址与列表中的某个地址相匹配,该事件100将在菊花链中被滤除S803(filtered out),且被丢弃(dropped)。而如果上述列表中不存在与事件100的地址相匹配的地址,那么事件100将在热像素过滤模块208’的输出端208’-2输出至管道(pipeline)。
在图10中,其展示的是某类实施例中二次采样模块203的工作流程图。输入的事件100通过评估其关联的地址而被处理,尤其指的是其地址坐标,比如X,Y,Z坐标。
举例而言,事件100的地址被 分离模块203-4分离为三路不同的地址。上述的坐标(X,Y,Z)通过移位或分割(division)操作S901后,坐标X,Y,Z被二次采样成低数据量的坐标集合,这有效地降低了像素坐标的数量。如此处理后的事件地址,之后在 地址重组模块203-5处融合,然后调整后的地址随同事件100被送入后级以待进一步的处理。
具体地,分离模块203-4被配置为:根据接收事件的地址值(如X,Y,Z坐标)路由事件100至关联的二次采样模块中的缩放寄存器(scaling register),该缩放寄存器被配置为分割、二次采样、池化或/和移位接收到的地址值,并输出地址值至二次采样模块中的地址重组模块203-5,该地址重组模块203-5被配置为根据缩放后的地址值来调整事件地址,然后沿着菊花链发送调整地址后的事件。
举例地,通过池化相邻输入设备的事件地址,二次采样模块可以调整2D阵列传感器像素分辨率,由此X,Y轴的像素数量将会降低。比如,处理器处理图像的分辨率可由256*256至64*64。对于1D传感器,其同样可以被如上类似配置的二次采样模块处理。
在某类实施例中,事件100可能还包括通道标识符(channel identifier),其并不会被上述分离处理,而仅仅是独自在二次采样模块203中遍历通过(loop through)S900。
(三)、对规避设计技术方案的声明
尽管已经参考本发明的具体特征和实施例描述了本发明,但是在不脱离本发明的情况下可以对其进行各种修改和组合。因此,说明书和附图应简单地视为由所附权利要求限定的本发明的一些实施例的说明,并且预期涵盖落入本发明范围内的任何和所有修改、变化、组合或等同物。因此,尽管已经详细描述了本发明及其优点,但是在不脱离由所附权利要求限定的本发明的情况下,可以进行各种改变、替换和变更。此外,本申请的范围不旨在限于说明书中描述的过程、机器、制造、物质组成、装置、方法和步骤的特定实施例。
本领域普通技术人员从本发明的公开内容将容易理解,可以根据本发明应用执行与本文描述的相应实施例实质上相同功能或达到实质上相同的结果的当前存在或稍后开发的过程、机器、制造、物质组成、装置、方法或步骤。因此,所附权利要求目的在于在其范围内包括这样的过程、机器、制造、物质组成、装置、方法或步骤。
为了实现更好的技术效果或出于某些应用的需求,本领域技术人员可能在本发明的基础之上,对技术方案作出进一步的改进。然而,即便该部分改进/设计具有创造性或/和进步性,只要利用了本发明权利要求所覆盖的技术特征,依据“全面覆盖原则”,该技术方案同样应落入本发明的保护范围之内。
所附的权利要求中所提及的若干技术特征可能存在替代的技术特征,或者对某些技术流程的顺序、物质组织顺序可以重组。本领域普通技术人员知晓本发明后,容易想到该些替换手段,或者改变技术流程的顺序、物质组织顺序,然后采用了基本相同的手段,解决基本相同的技术问题,达到了基本相同的技术效果,因此即便权利要求中明确限定了上述手段或/和顺序,然而该些修饰、改变、替换,均应依据“等同原则”而落入权利要求的保护范围。
对于权利要求中有明确的数值限定的,比如1.5V电压,其不应被理解为严格限定了电压值必须是1.5V。通常情况下,本领域技术人员能够理解,1.2V、1.8V电压同样能够应用于某具体的实施方式中。这些未脱离本发明构思的通过 细节规避的设计方案,同样落入该权利要求的保护范围。
结合本文中所公开的实施例中描述的各方法步骤和单元,能够以电子硬件、计算机软件或者二者的结合来实现,为了清楚地说明硬件和软件的可互换性,在上述说明中已经按照功能一般性地描述了各实施例的步骤及组成。这些功能究竟以硬件还是软件方式来执行,取决于技术方案的特定应用和设计约束条件。本领域普通技术人员可以对每个特定的应用来使用不同方法来实现所描述的功能,但是这种实现不应认为超出本发明所要求保护的范围。
此外,本文示例的执行指令的任何模块、组件或设备可以包括或以其它方式访问用于存储信息的非暂时性计算机/处理器可读存储介质或介质,诸如,计算机/处理器可读指令、数据结构、程序模块和/或其它数据。任何这种非暂时性计算机/处理器存储介质可以是设备的一部分或者可访问或可连接到设备。本文描述的任何应用或模块可以使用计算机/处理器可读/可执行指令来实现,该指令可以由这种非暂时性计算机/处理器可读存储介质存储或以其它方式保持。
Table 1: List of reference numerals (published as images PCTCN2021088143-appb-000001 and PCTCN2021088143-appb-000002)

Claims (38)

  1. 一种集成电路,其包括事件驱动传感器(10)和事件驱动接口系统(20)和事件驱动处理器(30),其特征在于:
    所述事件驱动传感器(10)和所述事件驱动接口系统(20)以及所述事件驱动处理器(30)耦接于单芯片(3)。
  2. 根据权利要求1所述的集成电路,其特征在于:
    所述事件驱动传感器(10),被配置为:在所述事件驱动传感器(10)的输入设备(11)检测到事件产生信号或/和事件产生信号的变化后异步生成和异步输出事件(100),所述事件(100)包括或被关联指示所述输入设备(11)的事件地址,所述事件驱动传感器(10)的输出端耦接至所述事件驱动接口系统(20)的输入端;
    所述事件驱动接口系统(20),被配置为:异步接收所述事件(100)并对所接收的事件(100)预处理,所述事件驱动接口系统(20)的输出端耦接至所述事件驱动处理器(30)的输入端;
    所述事件驱动处理器(30),被配置为:接收所述事件驱动接口系统(20)预处理后的事件(101),并以异步的方式处理所接收的事件(101);
    所述事件驱动传感器(10)和所述事件驱动接口系统(20)以及所述事件驱动处理器(30)之间通过转接板(40)而耦接于单芯片(3)。
  3. 如权利要求2所述的集成电路,其特征在于:
    所述事件驱动接口系统(20)以及所述事件驱动处理器(30)均位于第一裸晶(1-1);或,所述事件驱动传感器(10)和所述事件驱动接口系统(20)均位于第二裸晶(1-2);或,所述事件驱动接口系统(20)的一部分与所述事件驱动处理器(30)均位于第一裸晶(1-1)且所述事件驱动接口系统(20)的另一部分与所述事件驱动传感器(10)均位于第二裸晶(1-2)。
  4. 如权利要求2所述的集成电路,其特征在于:
    所述事件驱动接口系统(20)以及所述事件驱动处理器(30)均位于第一裸晶(1-1),且所述事件驱动传感器(10)所在的第二裸晶(1-2)堆叠在所述事件驱动接口系统(20)以及所述事件驱动处理器(30)所在的第一裸晶(1-1)之上。
  5. 如权利要求2所述的集成电路,其特征在于:
    所述转接板(40)是硅转接板或玻璃转接板。
  6. 如权利要求2所述的集成电路,其特征在于:
    所述事件驱动传感器(10)和所述事件驱动接口系统(20)以及所述事件驱动处理器(30)是通过2.5D或3D封装技术封装于单芯片(3)之上。
  7. 如权利要求2所述的集成电路,其特征在于:
    所述事件驱动传感器(10)属于以下类型中的一种或多种的组合:点传感器、1D传感器、2D传感器、3D传感器。
  8. 如权利要求2所述的集成电路,其特征在于:
    所述事件驱动传感器(10)属于以下类型中的一种或多种的组合:声音/震动传感器、动态视觉传感器。
  9. 如权利要求2所述的集成电路,其特征在于:
    事件驱动处理器(30)被配置有脉冲神经网络。
  10. 如权利要求2所述的集成电路,其特征在于:
    事件驱动处理器(30)被配置有脉冲卷积神经网络。
  11. 如权利要求3或4所述的集成电路,其特征在于:
    所述的第一裸晶和所述的第二裸晶采用不同的工艺制造。
  12. 如权利要求2至10任意一项所述的集成电路,其特征在于:
    所述的事件驱动接口系统(20)包括至少一个接口模块(200),所述的接口模块(200)构成可编程菊花链形式,异步处理从所述事件驱动传感器(10)接收到的事件(100)。
  13. 如权利要求12所述的集成电路,其特征在于:
    所述至少一个接口模块(200)包括复制模块(201),其被配置为:接收事件(100)并且执行复制操作得到复制事件(100c),所述的事件(100)来自所述事件驱动传感器(10)或来自所述事件驱动接口系统(20)的其它接口模块(200),并且发送所述复制事件(100c)至外部处理管道,以及沿着所述的菊花链发送所述事件(100)。
  14. 如权利要求12所述的集成电路,其特征在于:
    所述至少一个接口模块(200)包括融合模块(202),其被配置为:至少从两处不同的地方接收事件(100,100e),其中所述的事件(100)来自所述事件驱 动接口系统(20)的其它接口模块(200)或所述的事件驱动传感器(10);所述的事件(100e)还来自所述集成电路或其它集成电路的部件/模块或其它事件驱动传感器,并沿着所述可编程菊花链发送所述接收到的事件(100,100e)的部分或全部至后续的接口模块(200)。
  15. 如权利要求13所述的集成电路,其特征在于:
    所述集成电路的其它接口模块(200)是融合模块(202)。
  16. 如权利要求14所述的集成电路,其特征在于:
    所述集成电路的其它接口模块(200)是复制模块(201)。
  17. 如权利要求12所述的集成电路,其特征在于:
    所述至少一个接口模块(200)包括二次采样模块(203),其被配置为:为接收到的若干事件(100)分派成单一的地址。
  18. 如权利要求17所述的集成电路,其特征在于:
    所述二次采样模块(203)包括的分离模块(203-4)被配置为:根据接收事件(100)的地址值路由所述事件(100)至关联的所述二次采样模块(203)中的缩放寄存器;所述缩放寄存器被配置为:分割、二次采样、池化或/和移位接收到的地址值,并输出地址值至所述二次采样模块(203)中的地址重组模块(203-5),所述地址重组模块(203-5)被配置为:根据缩放后的地址值来调整事件地址,然后沿着所述可编程菊花链发送调整地址后的事件(100)。
  19. 如权利要求12所述的集成电路,其特征在于:
    所述至少一个接口模块(200)包括兴趣区域模块(204),其被配置为:
    调整至少一个事件地址的属性,所述的调整方式包括如下方式的一种或多种:移位、翻转、调换或/和旋转至少一个事件地址的属性;或/和
    抛弃地址属性值在可编程的地址属性值范围之外的事件(100),沿着所述可编程菊花链发送未被抛弃的事件(100)。
  20. 如权利要求12所述的集成电路,其特征在于:
    所述至少一个接口模块(200)包括事件路由模块(205),其被配置为:
    接收事件(100),为接收到的事件(100)添加头信息,并连同所述事件(100)的所述头信息发送所述事件(100)至所述事件驱动处理器(30)或/和其它事件驱动处理器或其它处理管道。
  21. 如权利要求12所述的集成电路,其特征在于:
    所述至少一个接口模块(200)包括速率控制模块,其被配置为:
    当超过最大速度后,仅沿着所述可编程菊花链发送部分所述事件(100),以限制事件的速率不超过最大速率。
  22. 如权利要求12所述的集成电路,其特征在于:
    所述至少一个接口模块(200)包括映射模块(206),其被配置为:
    将一个事件地址映射成另外一个事件地址。
  23. 如权利要求22所述的集成电路,其特征在于:
    所述映射模块(206)包括如下内容的之一或组合:
    兴趣区域模块、查找表模块、翻转或/和旋转模块;其中翻转或/和旋转模块被配置为翻转或/和旋转所述事件(100)的事件地址。
  24. 如权利要求12所述的集成电路,其特征在于:
    所述至少一个接口模块(200)包括事件地址重写模块(207),其被配置为:为接收到的事件地址转换为统一的地址格式,由此在所述可编程菊花链上传递统一的事件地址格式。
  25. 如权利要求12所述的集成电路,其特征在于:
    所述至少一个接口模块(200)包括事件地址过滤模块(208),其被配置为:过滤掉一系列具有特定挑选过的事件地址的事件(100)。
  26. 如权利要求25所述的集成电路,其特征在于:
    所述事件地址过滤模块(208)具体为热像素过滤模块(208’),其被配置为:过滤具有特定事件地址的事件(100),且通过CAM存储器(208’-3)存储预设的待过滤的事件地址列表。
  27. 如权利要求12所述的集成电路,其特征在于:
    所述的事件驱动接口系统(20)的任意一个或多个接口模块(200)可以被可编程开关旁路。
  28. 一种事件驱动接口系统(20),其被耦接于事件驱动传感器(10)和事件驱动处理器(30)之中,构成集成电路,所述的事件驱动传感器(10)生成和异步输出事件(100),所述事件(100)包括或被关联指示产生事件的所述的事件驱动传感器(10)上的输入设备(11)的事件地址;其特征在于:
    所述的事件驱动接口系统(20)包括至少一个接口模块(200),所述的接口模块(200)构成可编程菊花链形式,异步处理从所述传感器(10)接收到的事件(100)。
  29. 如权利要求28所述的事件驱动接口系统(20),其特征在于:
    所述至少一个接口模块(200)包括以下的一个或多个:复制模块(201)、融合模块(202)、二次采样模块(203)、兴趣区域模块(204)和事件路由模块(205);其中:
    所述复制模块(201),其被配置为:接收事件(100)并且执行复制操作得到复制事件(100c),所述的事件(100)来自所述事件驱动传感器(10)或来自所述事件驱动接口系统(20)的其它接口模块(200),并且发送所述复制事件(100c)至外部处理管道,以及沿着所述的菊花链发送所述事件(100);
    所述融合模块(202),其被配置为:至少从两处不同的地方接收事件(100,100e),其中所述的事件(100)来自所述事件驱动接口系统(20)的其它接口模块(200)或所述的事件驱动传感器(10);所述的事件(100e)还来自所述集成电路或其它集成电路的部件/模块或其它事件驱动传感器,并沿着所述可编程菊花链发送所述接收到的事件(100,100e)的部分或全部至后续的接口模块(200);
    所述二次采样模块(203),其被配置为:为接收到的若干事件(100)分派成单一的地址;
    所述兴趣区域模块(204),其被配置为:
    调整至少一个事件地址的属性,所述的调整方式包括如下方式的一种或多种:移位、翻转、调换或/和旋转至少一个事件地址的属性;或/和
    抛弃地址属性值在可编程的地址属性值范围之外的事件(100),沿着所述可编程菊花链发送未被抛弃的事件(100);
    所述事件路由模块(205),其被配置为:
    接收事件(100),为接收到的事件(100)添加头信息,并连同所述事件(100)的所述头信息发送所述事件(100)至所述事件驱动处理器(30)或/和其它事件驱动处理器或其它处理管道。
  30. 如权利要求29所述的事件驱动接口系统(20),其特征在于:
    沿着所述可编程菊花链的事件传递方向,所述至少一个接口模块(200)具 有如下接口模块耦接顺序:
    复制模块(201)、融合模块(202)、二次采样模块(203)、兴趣区域模块(204)和事件路由模块(205);或
    融合模块(202)、复制模块(201)、二次采样模块(203)、兴趣区域模块(204)和事件路由模块(205)。
  31. 如权利要求30所述的事件驱动接口系统(20),其特征在于:
    对于所述复制模块(201),所述的事件(100)来自所述事件驱动接口系统(20)的其它接口模块(200)具体是所述融合模块(202);或/和
    对于所述融合模块(202),所述的事件(100)来自所述事件驱动接口系统(20)的其它接口模块(200)具体是所述复制模块(201)。
  32. 如权利要求30所述的事件驱动接口系统(20),其特征在于:所述接口模块耦接顺序的上游还包括:事件地址重写模块(207)或/和事件地址过滤模块(208);其中的事件地址重写模块(207)被配置为:为接收到的事件地址转换为统一的地址格式,由此在所述可编程菊花链上传递统一的事件地址格式;
    其中的事件地址过滤模块(208)被配置为:过滤掉一系列具有特定挑选过的事件地址的事件(100)。
  33. 如权利要求32所述的事件驱动接口系统(20),其特征在于:
    所述事件(100)先经过事件地址重写模块(207)的处理,然后经过事件地址过滤模块(208)的处理。
  34. 如权利要求32所述的事件驱动接口系统(20),其特征在于:
    所述事件地址过滤模块(208)具体为热像素过滤模块(208’),其被配置为:过滤具有特定事件地址的事件(100),且通过CAM存储器(208’-3)存储预设的待过滤的事件地址列表。
  35. 如权利要求29所述的事件驱动接口系统(20),其特征在于:所述至少一个接口模块(200)还包括映射模块(206),所述映射模块(206)包括如下内容的之一或组合:兴趣区域模块、查找表模块、翻转或/和旋转模块;其中翻转或/和旋转模块被配置为翻转或/和旋转所述事件(100)的事件地址。
  36. 如权利要求29所述的事件驱动接口系统(20),其特征在于:
    所述至少一个接口模块(200)还包括速率控制模块,其被配置为:当超过 最大速度后,仅沿着所述可编程菊花链发送部分所述事件(100),以限制事件的速率不超过最大速率。
  37. 如权利要求28-36任意一项所述的事件驱动接口系统(20),其特征在于:
    所述的事件驱动接口系统(20)的任意一个或多个接口模块(200)可以被可编程开关旁路。
  38. 如权利要求28-36任意一项所述的事件驱动接口系统(20),其特征在于:
    所述事件驱动传感器(10)和所述事件驱动接口系统(20)以及所述事件驱动处理器(30)之间:通过转接板(40)而耦接于单芯片(3);或被制造在同一个裸晶中。
PCT/CN2021/088143 2021-04-19 2021-04-19 具有接口系统的事件驱动集成电路 WO2022221994A1 (zh)

Priority Applications (6)

Application Number Priority Date Filing Date Title
KR1020237028111A KR20230134548A (ko) 2021-04-19 2021-04-19 인터페이스 시스템을 갖는 이벤트 구동 집적 회로
PCT/CN2021/088143 WO2022221994A1 (zh) 2021-04-19 2021-04-19 具有接口系统的事件驱动集成电路
CN202180004244.5A CN115500090A (zh) 2021-04-19 2021-04-19 具有接口系统的事件驱动集成电路
JP2023552014A JP2024507400A (ja) 2021-04-19 2021-04-19 インターフェースシステムを備えたイベント駆動型集積回路
US18/010,486 US20240107187A1 (en) 2021-04-19 2021-04-19 Event-driven integrated circuit having interface system
EP21937245.5A EP4207761A4 (en) 2021-04-19 2021-04-19 EVENT-DRIVEN INTEGRATED CIRCUIT WITH INTERFACE SYSTEM

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2021/088143 WO2022221994A1 (zh) 2021-04-19 2021-04-19 具有接口系统的事件驱动集成电路

Publications (1)

Publication Number Publication Date
WO2022221994A1 true WO2022221994A1 (zh) 2022-10-27

Family

ID=83723647

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/088143 WO2022221994A1 (zh) 2021-04-19 2021-04-19 具有接口系统的事件驱动集成电路

Country Status (6)

Country Link
US (1) US20240107187A1 (zh)
EP (1) EP4207761A4 (zh)
JP (1) JP2024507400A (zh)
KR (1) KR20230134548A (zh)
CN (1) CN115500090A (zh)
WO (1) WO2022221994A1 (zh)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107302695A (zh) * 2017-05-31 2017-10-27 天津大学 一种基于仿生视觉机理的电子复眼系统
EP1958433B1 (en) 2005-06-03 2018-06-27 Universität Zürich Photoarray for detecting time-dependent image data
WO2020116416A1 (ja) * 2018-12-05 2020-06-11 株式会社ソニー・インタラクティブエンタテインメント 信号処理装置、電子機器、信号処理方法およびプログラム
WO2020207982A1 (en) 2019-04-09 2020-10-15 Aictx Ag Event-driven spiking convolutional neural network
CN112534816A (zh) * 2018-08-14 2021-03-19 华为技术有限公司 用于视频图像编码的编码参数的基于事件自适应
CN112597980A (zh) * 2021-03-04 2021-04-02 之江实验室 一种面向动态视觉传感器的类脑手势序列识别方法
CN112598700A (zh) * 2019-10-02 2021-04-02 传感器无限公司 用于目标检测和追踪的神经形态视觉与帧速率成像

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108574793B (zh) * 2017-03-08 2022-05-10 三星电子株式会社 被配置为重新生成时间戳的图像处理设备及包括其在内的电子设备
EP3506622A1 (en) * 2017-12-26 2019-07-03 Prophesee Method for outputting a signal from an event based sensor, and event-based sensor using such method
US20200169681A1 (en) * 2018-11-26 2020-05-28 Bae Systems Information And Electronic Systems Integration Inc. Ctia based pixel for simultaneous synchronous frame-based & asynchronous event-driven readouts
KR20210000985A (ko) * 2019-06-26 2021-01-06 삼성전자주식회사 비전 센서, 이를 포함하는 이미지 처리 장치 및 비전 센서의 동작 방법
CN111190647B (zh) * 2019-12-25 2021-08-06 杭州微纳核芯电子科技有限公司 一种事件驱动型常开唤醒芯片
CN111031266B (zh) * 2019-12-31 2021-11-23 中国人民解放军国防科技大学 基于哈希函数的动态视觉传感器背景活动噪声过滤方法、系统及介质

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1958433B1 (en) 2005-06-03 2018-06-27 Universität Zürich Photoarray for detecting time-dependent image data
CN107302695A (zh) * 2017-05-31 2017-10-27 天津大学 一种基于仿生视觉机理的电子复眼系统
CN112534816A (zh) * 2018-08-14 2021-03-19 华为技术有限公司 用于视频图像编码的编码参数的基于事件自适应
WO2020116416A1 (ja) * 2018-12-05 2020-06-11 株式会社ソニー・インタラクティブエンタテインメント 信号処理装置、電子機器、信号処理方法およびプログラム
WO2020207982A1 (en) 2019-04-09 2020-10-15 Aictx Ag Event-driven spiking convolutional neural network
CN112598700A (zh) * 2019-10-02 2021-04-02 传感器无限公司 用于目标检测和追踪的神经形态视觉与帧速率成像
CN112597980A (zh) * 2021-03-04 2021-04-02 之江实验室 一种面向动态视觉传感器的类脑手势序列识别方法

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
ARNON AMIR, BRIAN TABA ET AL.: "A Low Power, Fully Event-Based Gesture Recognition System", 2017 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 21 July 2017 (2017-07-21)
PAUL A. MEROLLA, JOHN V. ARTHUR ET AL.: "A million spiking-neuron integrated circuit with a scalable communication network and interface", SCIENCE, vol. 345, 8 August 2014 (2014-08-08)
See also references of EP4207761A4
ZHE ZOU, RONG ZHAO ET AL.: "A hybrid and scalable brain-inspired robotic platform", SCIENTIFIC REPORTS, 23 October 2020 (2020-10-23)

Also Published As

Publication number Publication date
US20240107187A1 (en) 2024-03-28
KR20230134548A (ko) 2023-09-21
CN115500090A (zh) 2022-12-20
EP4207761A4 (en) 2024-06-19
JP2024507400A (ja) 2024-02-19
EP4207761A1 (en) 2023-07-05

Similar Documents

Publication Publication Date Title
US11677662B2 (en) FPGA-efficient directional two-dimensional router
TWI746878B (zh) 高頻寬記憶體系統以及邏輯裸片
EP3298740B1 (en) Directional two-dimensional router and interconnection network for field programmable gate arrays
US7155554B2 (en) Methods and apparatuses for generating a single request for block transactions over a communication fabric
EP3557488A1 (en) Neuromorphic circuit having 3d stacked structure and semiconductor device having the same
WO2017173755A1 (zh) 片上数据划分读写方法、系统及其装置
Yuan et al. 14.2 A 65nm 24.7 µJ/Frame 12.3 mW Activation-Similarity-Aware Convolutional Neural Network Video Processor Using Hybrid Precision, Inter-Frame Data Reuse and Mixed-Bit-Width Difference-Frame Data Codec
TW201040962A (en) Configurable bandwidth memory devices and methods
US7277975B2 (en) Methods and apparatuses for decoupling a request from one or more solicited responses
US20210232902A1 (en) Data Flow Architecture for Processing with Memory Computation Modules
CN112805727A (zh) 分布式处理用人工神经网络运算加速装置、利用其的人工神经网络加速系统、及该人工神经网络的加速方法
US20220308935A1 (en) Interconnect-based resource allocation for reconfigurable processors
CN108256643A (zh) 一种基于hmc的神经网络运算装置和方法
WO2022221994A1 (zh) 具有接口系统的事件驱动集成电路
WO2020087276A1 (zh) 大数据运算加速系统和芯片
CN101562544B (zh) 一种数据包生成器和数据包生成方法
CN116246963A (zh) 一种可重构3d芯片及其集成方法
KR20200040165A (ko) 분산처리용 인공신경망 연산 가속화 장치, 이를 이용한 인공신경망 가속화 시스템, 및 그 인공신경망의 가속화 방법
US20220222194A1 (en) On-package accelerator complex (ac) for integrating accelerator and ios for scalable ran and edge cloud solution
US11436185B2 (en) System and method for transaction broadcast in a network on chip
US20230017778A1 (en) Efficient communication between processing elements of a processor for implementing convolution neural networks
WO2022088171A1 (en) Neural processing unit synchronization systems and methods
Lee et al. Mini Pool: Pooling hardware architecture using minimized local memory for CNN accelerators
US11349782B2 (en) Stream processing interface structure, electronic device and electronic apparatus
US10353455B2 (en) Power management in multi-channel 3D stacked DRAM

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21937245

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 18010486

Country of ref document: US

ENP Entry into the national phase

Ref document number: 2021937245

Country of ref document: EP

Effective date: 20230331

ENP Entry into the national phase

Ref document number: 20237028111

Country of ref document: KR

Kind code of ref document: A

WWE Wipo information: entry into national phase

Ref document number: 1020237028111

Country of ref document: KR

WWE Wipo information: entry into national phase

Ref document number: 2023552014

Country of ref document: JP

NENP Non-entry into the national phase

Ref country code: DE