CN113537471A - Improved spiking neural network - Google Patents

Improved spiking neural network

Info

Publication number
CN113537471A
Authority
CN
China
Prior art keywords
spike
spiking neuron
spiking
spikes
logical
Prior art date
Legal status
Granted
Application number
CN202110835587.8A
Other languages
Chinese (zh)
Other versions
CN113537471B (en)
Inventor
P. A. van der Made
A. S. Mankar
Current Assignee
A. S. Mankar
P. A. van der Made
Original Assignee
A. S. Mankar
P. A. van der Made
Priority date
Filing date
Publication date
Application filed by A. S. Mankar, P. A. van der Made
Priority to CN202110835587.8A
Publication of CN113537471A
Application granted
Publication of CN113537471B
Legal status: Active
Anticipated expiration


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks
    • G06N 3/049: Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G06N 3/06: Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N 3/063: Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
    • G06N 3/08: Learning methods
    • G06N 3/084: Backpropagation, e.g. using gradient descent
    • G06N 3/088: Non-supervised learning, e.g. competitive learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Neurology (AREA)
  • Image Analysis (AREA)
  • Memory System (AREA)
  • Error Detection And Correction (AREA)
  • Television Systems (AREA)
  • Feedback Control In General (AREA)

Abstract

Disclosed herein are system, method, and computer program product embodiments for an improved Spiking Neural Network (SNN) configured to learn and perform unsupervised extraction of features from an input stream. Embodiments operate by receiving, at a spiking neuron circuit, a set of spike bits corresponding to a set of synapses associated with the spiking neuron circuit. The embodiment applies a first logical AND function to a first spike bit in the set of spike bits and a first synaptic weight of a first synapse in the set of synapses. Based on the application, the embodiment increments a membrane potential value associated with the spiking neuron circuit. The embodiment determines that the membrane potential value associated with the spiking neuron circuit has reached a learning threshold. The embodiment then performs a spike-time dependent plasticity (STDP) learning function based on the determination that the membrane potential value has reached the learning threshold.

Description

Improved spiking neural network
This application is a divisional of Chinese national phase patent application No. 201980004791.6, which entered the Chinese national phase on March 30, 2020 from international application No. PCT/US2019/059032, filed on October 31, 2019 and entitled "Improved spiking neural network".
Cross Reference to Related Applications
The present application claims the benefit of U.S. provisional application No. 62/754,348, entitled "Improved spiking neural network" and filed on November 1, 2018, which is hereby incorporated by reference in its entirety for all purposes.
Technical Field
The present methods relate generally to neural circuit engineering and, more particularly, to systems and methods for on-chip low-power high-density autonomous learning artificial neural networks.
Background
Artificial neural networks have long aimed to replicate the function of biological neural networks (the brain), but have met with limited success. The "brute force" hardware approach to artificial neural network design is cumbersome and inadequate, and falls far short of replicating human brain function. Therefore, there is a need for an autonomous, reconfigurable spiking neural network that can scale to very large networks, yet fits on a chip while performing fast inference from a variety of possible input data and/or sensor sources.
Disclosure of Invention
Provided herein are systems, apparatus, articles of manufacture, methods, and/or computer program product embodiments, and/or combinations and subcombinations thereof, for an improved Spiking Neural Network (SNN) configured to learn and perform unsupervised extraction of features from an input stream. Some embodiments include a neuromorphic integrated circuit including a spike converter, a reconfigurable neuron structure, a memory, and a processor. The spike converter is configured to generate spikes from input data. The reconfigurable neuron structure includes a neural processor comprising a plurality of spiking neuron circuits. The spiking neuron circuits are configured to perform a task based on the spikes received from the spike converter and a neural network configuration. The memory stores the neural network configuration, which includes a potential array and a plurality of synapses. The neural network configuration also defines connections between the plurality of spiking neuron circuits and the plurality of synapses. The processor is configured to modify the neural network configuration based on a configuration file.
Embodiments for learning and performing unsupervised extraction of features from an input stream are also described herein. Some embodiments provide for receiving, at a spiking neuron circuit, a set of spike bits corresponding to a set of synapses. The spiking neuron circuit applies a logical AND function to a spike bit in the set of spike bits and a synaptic weight of a synapse in the set of synapses. Based on the application of the logical AND function, the spiking neuron circuit increments a membrane potential value. A neural processor then determines that the membrane potential value associated with the spiking neuron circuit has reached a learning threshold. Thereafter, based on the determination that the membrane potential value has reached the learning threshold, the neural processor performs a spike-time dependent plasticity (STDP) learning function.
This summary is provided merely for purposes of illustrating some example embodiments to provide an understanding of the subject matter described herein. Accordingly, the above-described features are merely examples and should not be construed to narrow the scope or spirit of the subject matter in the present disclosure. Other features, aspects, and advantages of the disclosure will become apparent from the following detailed description, the drawings, and the claims.
Drawings
The accompanying drawings are incorporated herein and form a part of the specification.
Fig. 1 is a block diagram of a neural model according to some embodiments.
Fig. 2A is a block diagram of a neuromorphic integrated circuit according to some embodiments.
Fig. 2B is a block diagram of the neuromorphic integrated circuit in fig. 2A, according to some embodiments.
Fig. 3 is a flow diagram of input spike buffering, grouping, and output spike buffering for a next layer according to some embodiments.
Fig. 4 is a block diagram of a neural processor configured as a spiking convolutional neural processor, in accordance with some embodiments.
Fig. 5 is a block diagram of a neural processor configured as a spike fully connected neural processor, in accordance with some embodiments.
Fig. 6A is an example of grouping spikes into groups of spikes, according to some embodiments.
Fig. 6B is an example representation of the spike grouping in fig. 6A according to some embodiments.
FIG. 7 is an example of a method of selecting which bits to increment or decrement a membrane potential counter, according to some embodiments.
Fig. 8 illustrates weight exchange steps of an STDP learning method according to some embodiments.
FIG. 9 illustrates a convolution method used in a spiking convolution neural processor, in accordance with some embodiments.
FIG. 10 illustrates a symbolic representation of convolution in an 8 × 8 matrix of pixels with a depth of 1, according to some embodiments.
FIG. 11 illustrates a symbolic representation of a convolution involving 2 spike channels, two 3 × 3 inverse convolution kernels, and the resulting membrane potential values, according to some embodiments.
Fig. 12 illustrates the resulting spikes generated in the neurons on channel 1 and channel 2 of fig. 11, in accordance with some embodiments.
FIG. 13 illustrates a spiking neural network convolution operation, according to some embodiments.
FIG. 14 illustrates the result of applying eight directional filter neuron convolutions to an input image according to some embodiments.
Fig. 15 illustrates the similarity between DVS spike-based convolution and frame-based convolution according to some embodiments.
FIG. 16 illustrates an example of a YAML configuration file, in accordance with some embodiments.
Fig. 17 illustrates a configuration register including scan chains defining the configuration and connectivity of each spiking neuron circuit and each layer of spiking neuron circuits, according to some embodiments.
In the drawings, like reference numbers generally indicate the same or similar elements. Additionally, in general, the left-most digit(s) of a reference number identifies the figure in which the reference number first appears.
DETAILED DESCRIPTION OF EMBODIMENT(S) OF THE INVENTION
Provided herein are system, apparatus, device, method, and/or computer program product embodiments, and/or combinations and subcombinations thereof, for an improved Spiking Neural Network (SNN) configured to learn and perform unsupervised extraction of features from an input stream. Embodiments herein include a stand-alone neuromorphic integrated circuit that provides an improved SNN. The neuromorphic integrated circuit has several advantages. First, it is compact. For example, the neuromorphic integrated circuit integrates a processor complex, one or more sensor interfaces, one or more data interfaces, a spike converter, and a memory on a single silicon chip, which makes efficient use of silicon area in a hardware implementation. Second, the neuromorphic integrated circuit can be reprogrammed for many different tasks using a user-defined configuration file. For example, the connections between layers and neural processors in the neuromorphic integrated circuit can be reprogrammed using a user-defined configuration file. Third, the neuromorphic integrated circuit provides low-latency output. Fourth, the neuromorphic integrated circuit consumes little power. For example, it consumes two orders of magnitude less power than a comparable Artificial Neural Network (ANN) performing the same task. Moreover, the neuromorphic integrated circuit can provide accuracy approaching or equaling that of the prior art. Finally, the neuromorphic integrated circuit provides an improved learning method with both built-in dynamic balancing and fast convergence of synaptic weights to patterns in the incoming data.
An ANN is typically composed of artificial neurons characterized by an architecture that is determined at design time. Neurons can loosely mimic neurons in the biological brain. The neurons may be hardware circuits or may be defined programmatically. The function of an ANN may be defined by connections between neurons in a layer, connections between neuron layers, synaptic weights, and preprocessing of input data to fit into a predefined input range.
In an ANN, inference can be performed using multiply-accumulate (MAC) operations. In a MAC operation, an incoming data value is multiplied by a plurality of synaptic weights stored in memory. For example, an ANN may perform many MAC operations (e.g., 156 million MAC operations) on each image to classify objects in the image. The results of these multiplications are then integrated by addition in each neuron in the network. After the MAC operations are performed, a nonlinear function may be applied to the integrated value of a neuron, producing an output value. The output value may be a floating point value.
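As a point of reference for the MAC-based inference just described, the following minimal sketch in Python/NumPy shows one layer computing multiply-accumulate sums followed by a nonlinearity; the layer sizes, random weights, and ReLU activation are illustrative assumptions rather than details taken from the patent.

```python
import numpy as np

def ann_layer(inputs, weights, bias):
    """One ANN layer: multiply-accumulate (MAC) followed by a nonlinearity.

    Each output neuron multiplies every incoming data value by a stored
    synaptic weight, accumulates the products, then applies a nonlinear
    function (ReLU here, as an illustrative choice).
    """
    accumulated = weights @ inputs + bias      # the MAC step
    return np.maximum(accumulated, 0.0)        # nonlinear activation

# Illustrative sizes only: 4 inputs feeding 3 neurons.
rng = np.random.default_rng(0)
x = rng.random(4)                 # incoming data values
w = rng.standard_normal((3, 4))   # synaptic weights stored in memory
b = np.zeros(3)
print(ann_layer(x, w, b))         # floating point output values
```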
Multiple layers may be used to create an ANN to perform a particular task. Many neurons can be used in parallel in each layer of the ANN. Pooling operations may also be performed between inference layers.
A deep ANN refers to a software or hardware implementation of an ANN having many layers. Deep ANNs can perform image classification very successfully. For example, deep ANNs have been very successful at classifying the images in ImageNet, a collection of a large number of manually labeled images.
Deep ANNs are built on the premise that biological neurons communicate data through their firing rate (e.g., the rate at which electrical impulses are received and generated by a neuron). Neurons in a deep ANN communicate in floating point numbers or multi-bit integers.
Convolutional Neural Networks (CNNs) are a class of deep ANN in which neurons are trained using a number of labeled examples to extract features that appear in a dataset. For example, the data set may be an image. CNN may perform convolution operations on the image. The convolution operation may act on a small portion of the image and convey a value indicative of the appearance of a feature in the image to the next layer in the CNN. The feature may be contained in a small rectangular area. This small rectangle can be moved around the larger input image programmatically. When a feature in the image matches the feature information stored in the neuron synaptic weights, a value is sent to the next layer in the CNN. In CNN, synaptic weights are shared between neurons that respond to similar features at different locations of an image. Neurons in CNN can act as filters for defined features. Training of CNNs can be accomplished by a mathematical optimization technique called back-propagation.
Although CNNs have been successful in detecting features and classifying images, they suffer from a number of technical problems, including high computational intensity, catastrophic forgetting, and misclassification of adversarial samples. CNNs also suffer from high latency. Although multi-core processors and massively parallel processing may be used in CNNs to offset the latency caused by their high computational requirements, this tends to result in high power requirements. For example, a CNN used to classify images in ImageNet may use up to 2000 watts of power, because it may have to employ a high-performance Central Processing Unit (CPU) and one or more Graphics Processing Units (GPUs) implemented on peripheral component interconnect express (PCIe) add-in boards.
SNNs may address some of the technical problems associated with CNNs. SNNs are based on the proposition from biomedical research that biological neurons communicate data in the timing of pulses emitted by sensory organs and between neural layers. A pulse is a brief burst of energy called a spike. SNNs are a class of ANN in which information is expressed between neurons using spikes. Spikes can express information through their temporal and spatial distribution. A spiking neuron in an SNN spikes, and consumes power, only when a series of events at its inputs is recognized as a previously learned sequence. This is similar to the process that occurs in the biological brain. This technique of simulating brain function in an SNN and obtaining results, such as classification of objects in an image or identification of specific features in a data stream, may be referred to as neuromorphic computing.
SNNs consume orders of magnitude less power than other types of ANN because neurons in an SNN do not perform the continuous MAC processing required by an ANN. Instead, neurons consume power only when spikes occur. In an SNN, neural function may be simulated by adding a variable non-zero synaptic weight value to the simulated membrane potential value of a neuron whenever an input spike is received. The simulated membrane potential value is then compared to one or more thresholds. A spike is generated when the membrane potential value of a neuron reaches or exceeds a threshold. SNNs do not exhibit catastrophic forgetting and can continue to learn after they are trained. Moreover, there is no evidence that SNNs suffer from misclassification caused by adversarial samples.
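The event-driven accumulation described in this paragraph can be sketched as follows; the reset-to-zero behavior, variable names, and numeric values are illustrative assumptions rather than the patent's exact scheme.

```python
def process_input_spike(membrane_potential, synaptic_weight, spike_threshold):
    """Add the non-zero synaptic weight of an arriving spike to the simulated
    membrane potential, then compare against the spike threshold."""
    membrane_potential += synaptic_weight
    fired = membrane_potential >= spike_threshold
    if fired:
        membrane_potential = 0   # reset after firing (illustrative choice)
    return membrane_potential, fired

v = 0
for weight in [2, 3, -1, 4]:     # weights of synapses that received spikes
    v, fired = process_input_spike(v, weight, spike_threshold=6)
    print(v, fired)
```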
However, conventional SNNs suffer from some technical problems of their own. First, conventional SNNs cannot switch between convolutional and fully connected operation. For example, a conventional SNN may be configured at design time to learn features and classify data using a fully connected feed-forward architecture. Embodiments herein (e.g., a neuromorphic integrated circuit) address this technical problem by combining features of CNNs and SNNs into a Spiking Convolutional Neural Network (SCNN) that can be configured to switch between convolutional operation and fully connected neural network operation. The SCNN can also reduce the number of synaptic weights per neuron. This in turn allows the SCNN to be deeper (e.g., have more layers) than a conventional SNN, with fewer synaptic weights per neuron. Embodiments herein further improve the convolution operation by using a winner-take-all (WTA) approach for each neuron acting as a filter at a particular location in the input space. This can improve the selectivity and invariance of the network; in other words, it can improve the accuracy of the inference operation.
Second, conventional SNNs are not reconfigurable. Embodiments herein address this technical problem by allowing the connection between a neuron and a synapse of an SNN to be reprogrammed based on a user-defined configuration. For example, the connection between the layer and the neural processor may be reprogrammed using a user-defined configuration file.
Third, conventional SNNs do not provide buffering between the different layers of the SNN. Buffering may allow for a time delay in passing the output spike to the next layer. Embodiments herein address this technical problem by adding input and output spike buffers between layers of the SCNN.
Fourth, conventional SNNs do not support synaptic weight sharing. Embodiments herein address this technical problem by allowing the kernels of the SCNN to share synaptic weights when performing convolution. This may reduce the memory requirements of the SCNN.
Fifth, conventional SNNs typically use 1-bit synaptic weights. However, 1-bit synaptic weights provide no way to inhibit connections. Embodiments herein address this technical problem by using ternary synaptic weights. For example, embodiments herein may use two-bit synaptic weights. These ternary synaptic weights may have positive, zero, or negative values. The use of negative weights provides a way to inhibit connections, which can improve selectivity; in other words, it can improve the accuracy of the inference operation.
Sixth, conventional SNNs do not perform pooling. This results in increased memory requirements for conventional SNNs. Embodiments herein address this technical problem by performing pooling on previous layer outputs. For example, embodiments herein may perform pooling on the array of potentials output by a previous layer. This pooling operation reduces the dimensionality of the potential array while retaining the most important information.
Seventh, conventional SNNs typically store spikes in bit arrays. Embodiments herein provide improved ways to represent and handle spikes. For example, embodiments herein may use a connection list rather than a bit array. This connection list is optimized so that each input layer neuron has an offset index set that must be updated. This allows embodiments herein to update all membrane potential values of connected neurons in the current layer by only considering a single connection list.
Eighth, conventional SNNs process spikes one after another. In contrast, embodiments herein can process packets of spikes. This allows the potential array to be updated once a spike packet has been processed, which permits greater hardware parallelism.
Finally, conventional SNNs fail to provide a way to import learning (e.g., synaptic weights) from external sources. For example, SNN does not provide a way to import learning performed offline using back propagation. Embodiments herein address this technical problem by allowing a user to import offline performed learning into a neuromorphic integrated circuit.
In some embodiments, the SCNN may include one or more neural processors. Each neural processor may be interconnected by a reprogrammable fabric. Each neural processor is reconfigurable. Each neuron processor may be configured to perform convolution or classification in fully connected layers.
Each neural processor may include a plurality of neurons and a plurality of synapses. The neurons may be simplified Integrate-and-Fire (I&F) neurons. The neurons and synapses may be interconnected by the reprogrammable fabric. Each neuron of the neural processor may be implemented in hardware or software. A neuron implemented in hardware may be referred to as a neuron circuit.
In some embodiments, each neuron may use an increment or decrement function to set the membrane potential value of the neuron. This is more efficient than the adder function used in conventional I&F neurons.
In some embodiments, the SCNN may use different learning functions. For example, SCNN may use an STDP learning function. In other embodiments, the SCNN may use synaptic weight exchange to implement an improved version of the STDP learning function. Such an improved STDP learning function may provide built-in dynamic balancing (e.g., stable learning weights) and increased efficiency.
In some embodiments, the input to the SCNN is from an audio stream. An analog-to-digital (a/D) converter may convert the audio stream into digital data. The a/D converter may output digital data in the form of Pulse Code Modulation (PCM) data. The data-to-spike converter may convert the digital data into a series of spatially and temporally distributed spikes representing the spectrum of the audio stream.
In some embodiments, the input to the SCNN is from a video stream. The a/D converter may convert the video stream into digital data. For example, an a/D converter may convert a video stream into pixel information, where the intensity of each pixel is represented as a digital value. A digital camera may provide such pixel information. For example, a digital camera may provide pixel information for red, green, and blue pixels in the form of three 8-bit values. The pixel information may be captured and stored in memory. The data-to-spike converter can convert pixel information into spatially and temporally distributed spikes by means of sensory neurons that simulate the action of human visual nerve bundles.
In some embodiments, the input to the SCNN is data in the form of binary values. The data-to-spike converter may convert the binary-valued data into spikes by passing it through Gaussian receptive fields. As will be appreciated by one of ordinary skill in the art, the data-to-spike converter may convert binary-valued data into spikes in other ways.
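A rough sketch of a Gaussian receptive field conversion is shown below; the number of sensory neurons, the field centres, the width, and the firing threshold are all illustrative assumptions, since the patent does not specify them here.

```python
import numpy as np

def gaussian_receptive_field_spikes(value, num_neurons=8, lo=0.0, hi=1.0,
                                    sigma=0.1, firing_threshold=0.5):
    """Convert one scalar value into spikes via overlapping Gaussian
    receptive fields: each sensory neuron responds most strongly when the
    value falls near the centre of its field, and emits a spike when that
    response exceeds a threshold."""
    centres = np.linspace(lo, hi, num_neurons)
    responses = np.exp(-0.5 * ((value - centres) / sigma) ** 2)
    return (responses > firing_threshold).astype(int)   # 1 = spike, 0 = none

print(gaussian_receptive_field_spikes(0.37))
```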
In some embodiments, a digital vision sensor (e.g., a Dynamic Vision Sensor (DVS) supplied by the iniVation AG or other manufacturer) is connected to the spike input interface of the SCNN. The digital visual sensor may transmit pixel event information in the form of spikes. The digital vision sensor may encode spikes on an Address Event Representation (AER) bus. Pixel events occur when the pixel intensity increases or decreases.
In some embodiments, the input format of the SCNN is spikes distributed in space and time. A spike may be defined as a brief burst of electrical energy.
In some embodiments, the SCNN may be composed of one or more layers of spiking neurons. Spiking neurons can mimic the function of neurons. Spiking neurons may be interconnected by circuits that mimic synaptic function.
The spiking neurons may be implemented in hardware or software. A hardware-implemented spiking neuron may be referred to as a spiking neuron circuit. However, as will be understood by one of ordinary skill in the art, in any of the embodiments herein, software-implemented spiking neurons may be used instead of spiking neuron circuits.
In some embodiments, the SCNN may be configured from a stored configuration. The stored configuration may be modified using YAML Ain't Markup Language (YAML) files. A YAML file may define the functions of components in the neural structure to form an SCNN for a specific task. For example, a YAML file may configure the SCNN to classify images in the Canadian Institute For Advanced Research 10 (CIFAR-10) dataset (a collection of images commonly used to train machine learning and computer vision algorithms).
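The patent's actual YAML format is shown only in FIG. 16; the snippet below is a hypothetical illustration of how such a layer description might be parsed with the PyYAML package, and every key and value in it is invented for illustration.

```python
import yaml  # requires the PyYAML package

# Hypothetical configuration text; the key names are invented and are not
# taken from the patent's YAML format.
config_text = """
network:
  layers:
    - type: convolutional
      kernel: [3, 3]
      pooling: max
      shared_weights: true
    - type: fully_connected
      neurons: 10
"""

config = yaml.safe_load(config_text)
for layer in config["network"]["layers"]:
    print(layer["type"], layer)
```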
In some embodiments, each layer in the SCNN may be defined, connected, and configured as a convolutional layer with maximum pooling and shared synaptic weights, or a fully connected layer with a single synapse.
In some embodiments, the convolutional layer may be used in conjunction with one or more max-pooling layers for dimensionality reduction of the input signal by extracting certain features and passing those features as metadata to the next layer in the SCNN. The metadata delivered by each neuron may take the form of a neuron membrane potential value or spike. A spike may indicate that a threshold has been reached. The spike may trigger a learning event or an output spike. The neuron membrane potential value is a potential value of a neuron. The neuron membrane potential value can be read independently of the threshold value.
In some embodiments, a convolutional layer in the SCNN may include a plurality of spiking neuron circuits. Each spiking neuron circuit may include an integrator and a plurality of synapses shared with other neurons in the layer. Each spiking neuron circuit may be configured as a feature detector. The convolutional layer may be followed by a pooling layer. As one of ordinary skill in the art will appreciate, the pooling layer may be a max pooling layer, an average pooling layer, or another type of pooling layer. A max pooling layer may receive the outputs of the spiking neuron circuits (e.g., feature detectors) and pass only the neuron output with the highest potential value (e.g., the neuron membrane potential value or spike) to the next layer. Average pooling may be performed by dividing the input into rectangular pooling regions and computing the average of each region.
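A minimal sketch of max pooling over a potential array follows; the 2x2 window and the NumPy representation are illustrative assumptions.

```python
import numpy as np

def max_pool_2x2(potential_array):
    """Reduce an (H, W) array of membrane potential values by keeping only
    the highest value in each non-overlapping 2x2 region."""
    h, w = potential_array.shape
    trimmed = potential_array[: h - h % 2, : w - w % 2]
    blocks = trimmed.reshape(h // 2, 2, w // 2, 2)
    return blocks.max(axis=(1, 3))

potentials = np.arange(16).reshape(4, 4)   # illustrative potential values
print(max_pool_2x2(potentials))            # [[ 5  7] [13 15]]
```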
In some embodiments, fully connected layers in the SCNN may be used for classification, autonomous feature learning, and feature extraction. The fully connected layer may include a plurality of spiking neuron circuits. Each spiking neuron circuit may include an integrator and a plurality of synapses. The plurality of synapses may not be shared with other neuron circuits in the fully connected layer.
In some embodiments, learning in SCNN may be performed by a method called spike-time dependent plasticity (STDP). In STDP learning, an input spike preceding an output spike indicates that the input spike caused the output spike. In STDP, this may result in an enhancement of synaptic weight.
In some embodiments, synaptic weight swapping is used to improve the STDP learning method. Synaptic weight values may be swapped between synapses to strengthen synaptic inputs that contribute to output spike events and to weaken synapses that do not. This causes spiking neuron circuits to become increasingly selective to specific input features.
In some embodiments, the STDP learning method is further improved using ternary synaptic weights. A ternary synaptic weight may have a positive, zero, or negative value. A synapse storing a positive weight may be referred to as an excitatory synapse. A synapse storing a negative weight may be referred to as an inhibitory synapse. A synapse storing a zero weight takes no part in the selection process.
In some embodiments, a spike input buffer is present at the input of each layer of the neural network. The spike input buffer may receive and store spike information. The spike information may be transmitted as digital bits to a spike input buffer. A "1" may be used to indicate the presence of a spike. A "0" may be used to indicate the absence of a spike.
In some embodiments, the grouper may classify spikes in the spike input buffer into one or more spike groups. The spike packet may be stored in a packet register.
In some embodiments, a first logical AND function may be applied to the bit pattern stored in the packet register AND the positive weight bits stored in the synapses. A logical "1" at the output of the first logical AND function increments the membrane potential counter of the spiking neuron circuit. A second logical AND function may be applied to the bit pattern stored in the packet register AND the negative weight bits stored in the synapses. A logical "1" at the output of the second logical AND function decrements the membrane potential counter of the spiking neuron circuit.
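The two AND operations can be sketched with integers used as bit patterns; the register contents below are invented for illustration.

```python
def update_membrane_counter(spike_bits, positive_weight_bits,
                            negative_weight_bits, counter=0):
    """Increment the membrane potential counter once for every bit position
    where a spike coincides with a positive weight bit, and decrement it for
    every position where a spike coincides with a negative weight bit."""
    increments = bin(spike_bits & positive_weight_bits).count("1")
    decrements = bin(spike_bits & negative_weight_bits).count("1")
    return counter + increments - decrements

# Spikes at bit positions 0, 2 and 5; positive weights at 0 and 5;
# a negative weight at 2 (all values illustrative).
spikes  = 0b100101
pos_wts = 0b100001
neg_wts = 0b000100
print(update_membrane_counter(spikes, pos_wts, neg_wts))   # 2 - 1 = 1
```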
In some embodiments, a layer in the SCNN is a collection of neurons that share parameters. The layer may receive spike information from a previous layer and propagate the spike information to a subsequent layer.
In some embodiments, the SCNN may support both feed-forward and feedback architectures. This may be a connection topology where each layer receives input from a local bus structure and passes output to the same local bus.
In some embodiments, each layer may receive and transmit a data structure containing a header and event addresses in an Address Event Representation (AER) format. This information is received into a spike input buffer. An AER event contains three components: x, y, and f, where f is the feature (e.g., channel) and x, y are the coordinates of the spiking neuron circuit where the spike occurs. The input spike buffer may be processed to create the spike packets that are processed by the layer. A layer may output spikes to an output spike buffer, which is then sent to the next layer for processing.
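A sketch of the (x, y, f) event structure and of draining an input buffer into a spike packet is given below; the class name, the ordering rule, and the sample events are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class AEREvent:
    """One address-event: x, y are the coordinates of the spiking neuron
    circuit where the spike occurs and f is the feature (channel)."""
    x: int
    y: int
    f: int

def to_spike_packet(input_buffer: List[AEREvent]) -> List[AEREvent]:
    """Drain the layer's input spike buffer into one spike packet
    (the ordering rule here is an illustrative assumption)."""
    packet = sorted(input_buffer, key=lambda e: (e.f, e.y, e.x))
    input_buffer.clear()
    return packet

buffer = [AEREvent(3, 1, 0), AEREvent(0, 2, 1), AEREvent(5, 0, 0)]
print(to_spike_packet(buffer))
```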
In some embodiments, all layer types other than the input layer type may have an array of potentials. The potential array may store a membrane potential value for each spiking neuron circuit.
In some embodiments, each layer may include two data structures that describe the connectivity between the spiking neuron circuits in that layer to the input of the neuron. The first data structure may be referred to as a connection list array. The entries in the connection list array may correspond to a list of spiking neuron circuits to which a particular input is connected. The connection list array may contain connectivity information from the source to the target.
The second data structure may be referred to as a weight vector array. Each entry in the weight vector array corresponds to a vector of inputs to which a particular spiking neuron circuit is connected. The weight vector array may contain destination-to-source information.
In some embodiments, each spiking neuron circuit in a fully connected layer type has a single entry in the potential array. In contrast, in some embodiments, the spiking neuron circuits of the convolutional layers may share a single set of synaptic weights that is applied to the x-y coordinates across each input channel. Synaptic weights may be stored in a destination-to-source format in a weight vector array.
In ANNs with computationally derived neuron functions (e.g., deep learning neural networks (DNNs)), training and inference can be two independent operations performed in different environments or on different machines. In the training phase, the DNN learns from a large training data set by computing the synaptic weight values of the neural network by means of back-propagation. Conversely, no learning is performed during the inference phase of a DNN.
In some embodiments, the SCNN has no explicit partition between training and inference operations. Inference operates through event propagation. Event propagation refers to the processing of an input by an SCNN layer to update the potential array and generate an output spike buffer containing the spikes fired by that layer. The spiking neuron circuits in a layer may first perform an event propagation step (e.g., inference) and then perform a learning step. In some embodiments, when learning is disabled in a layer of the SCNN, that layer performs only the event propagation step, which is in effect the inference phase.
In some embodiments involving convolution, spiking neuron circuits may share synaptic weights. These neuron circuits may be referred to as filters. This is because these spiking neuron circuits can filter out certain features from the input stream.
Fig. 1 is a block diagram of a neural network model according to some embodiments. In fig. 1, the spikes may be communicated over a local bus 101. For example, the local bus 101 may be a network on chip (NoC) bus. The spikes may be communicated in the form of network packets. The network packet may contain one or more spikes and a code indicating a source address and a destination address.
In fig. 1, spike decoder 102 may decode the spikes in a network packet. Spike decoder circuit 102 can send a spike to a particular spiking neuron circuit based on a source address in the network packet. For example, spike decoder circuit 102 may store a spike in the spike input buffer 103 of the corresponding spiking neuron circuit, together with the address at which the bit is to end up.
Spike input buffer 103 may store one or more spikes. A "1" bit may indicate the presence of a spike, while a "0" bit may indicate the absence of a spike. Spike input buffer 103 may also contain the address where the bit will end up.
In fig. 1, grouper 114 may sort the spikes in spike input buffer 103 into one or more spike packets. A spike packet may be stored in the packet register 104. For example, where the spiking neuron circuit has 1024 synapses, the packet register 104 may be 1024 bits in length. The grouper 114 sorts the bits in the spike input buffer 103 into the correct positions along the 1024-bit packet register 104. This sorting process is further described in fig. 6.
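The sorting step performed by the grouper can be sketched as follows; representing the packet register as a Python list of bits is an illustrative simplification.

```python
def build_packet_register(spike_input_buffer, register_bits=1024):
    """Sort the synapse index numbers held in the spike input buffer into
    their positions along the packet register: bit i becomes 1 if a spike
    arrived for synapse i, otherwise it stays 0."""
    packet = [0] * register_bits
    for synapse_index in spike_input_buffer:
        packet[synapse_index] = 1
    return packet

# Indices as in the example of FIG. 6A (a repeated index sets the same bit).
register = build_packet_register([1, 6, 23, 1, 19, 18])
print([i for i, bit in enumerate(register) if bit])   # [1, 6, 18, 19, 23]
```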
In fig. 1, the synaptic weight values may be stored in the synaptic weight memory 105. In some embodiments, the synapse weight memory 105 may be implemented using Static Random Access Memory (SRAM). As will be appreciated by one of ordinary skill in the art, the synapse weight memory 105 may be implemented using various other storage techniques.
The synaptic weight values in the synaptic weight memory 105 may be positive or negative. In some embodiments, the synaptic weight values in the synaptic weight memory 105 may be transferred into the weight register 106 for processing. A positive synaptic weight value in the weight register 106 may be ANDed with the corresponding bit in the packet register 104 in logical AND circuit 107. For each positive result of the AND function, the output of logical AND circuit 107 increments the counter 109. The counter 109 represents the membrane potential value of the neuron.
A negative synaptic weight value in the weight register 106 may be ANDed with the corresponding bit in the packet register 104 in logical AND circuit 108. The output of logical AND circuit 108 decrements the counter 109. This process continues until all bits in the packet register 104 have been processed.
After all bits in the packet register 104 have been processed, the counter 109 contains a value reflecting how many bits in the packet register 104 coincided with positive synaptic weight values in the weight register 106, less how many coincided with negative synaptic weight values. The value in the counter 109 may be compared to at least one threshold using the threshold comparator 110.
In some embodiments, the threshold comparator 110 may compare the value in the counter 109 to two thresholds. For example, threshold comparator circuit 110 may compare the value in counter 109 with the value in learning threshold register 111 and the value in spike threshold register 112.
In some embodiments, the value in the learning threshold register 111 may be initially set to a low value to allow neuron learning. During the learning process, synaptic weights may be assigned to incoming spikes using weight exchanger 113. This process is shown in fig. 8 and 9. In some embodiments, as neurons learn, the value in the counter 109 increases, and the value in the learning threshold register 111 also increases. This process may continue until the neuron responds strongly to a particular learning pattern.
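A sketch of the two threshold comparisons follows; the amount by which the learning threshold is raised after a learning event is an illustrative assumption, as the text only states that the value increases as the neuron learns.

```python
def check_thresholds(counter, learning_threshold, spike_threshold):
    """Compare the membrane potential counter with the learning threshold
    and the spike threshold. Reaching the learning threshold triggers the
    weight-swap learning step and raises the threshold; reaching the spike
    threshold produces an output spike."""
    learn = counter >= learning_threshold
    fire = counter >= spike_threshold
    if learn:
        learning_threshold += 1   # neuron becomes more selective over time
    return learn, fire, learning_threshold

learn, fire, new_lt = check_thresholds(counter=5, learning_threshold=3,
                                       spike_threshold=8)
print(learn, fire, new_lt)   # True False 4
```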
Fig. 2A is a block diagram of a neuromorphic integrated circuit 200 according to some embodiments. The neuromorphic integrated circuit 200 may include a neuron structure 201, a conversion complex 202, a sensor interface 203, a processor complex 204, one or more data interfaces 205, one or more memory interfaces 206, a multi-chip expansion interface 207 that may provide a high-speed chip-to-chip interface, a power management unit 213, and one or more Direct Memory Access (DMA) engines 214.
In some embodiments, the neuron structure 201 may include a plurality of reconfigurable neural processors 208. The neural processor 208 may include a plurality of neurons. For example, the neural processor 208 may include a plurality of spiking neuron circuits and a plurality of synapses. As described above, the spiking neuron circuit may be implemented using the input spike buffer 103, the grouper 114, the grouping register 104, the logical AND circuit 107, the logical AND circuit 108, the counter 109, the threshold comparator 110, the learning threshold 111, AND the spike threshold 112. The plurality of synapses may be implemented using a weight register 106, a synapse weight memory 105 and a weight exchanger 113. Each neural processor 208 may include a plurality of reprogrammable spiking neuron circuits that may be connected to any portion of the neural structure 201.
In some embodiments, conversion complex 202 may include one or more of a pixel-to-spike converter 209, an audio-to-spike converter 210, a Dynamic Visual Sensor (DVS) -to-spike converter 211, and a data-to-spike converter 212. Pixel-to-spike converter 209 may convert the image to a spike event.
In some embodiments, sensor interface 203 may include one or more interfaces for pixel data, audio data, analog data, and digital data. The sensor interface 203 may also include an AER interface for DVS pixel data.
In some embodiments, the processor complex 204 may include at least one programmable processor core, memory, and input output peripherals. The processor complex 204 may be implemented on the same chip as the neuromorphic integrated circuit 200.
In some embodiments, the data interface 205 may include one or more interfaces for input and output peripherals. One or more interfaces may use the peripheral component interconnect express (PCIe) bus standard, the Universal Serial Bus (USB) bus standard, the ethernet bus standard, the Controller Area Network (CAN) bus standard, and a Universal Asynchronous Receiver and Transmitter (UART) for transmitting and receiving serial data.
In some embodiments, memory interface 206 may include one or more interfaces for dynamic random access memory (DRAM) expansion. The one or more interfaces may use the double data rate synchronous dynamic random access memory (DDR SDRAM) standard. For example, the one or more interfaces may use the DDR3 or DDR4 standard.
In some embodiments, the multi-chip expansion interface 207 may carry spike information to enable the neuron structure 201 to expand across multiple chips. The multi-chip expansion interface 207 may use AER to carry the spike information. AER is a standard for communicating spike events over a system bus: when a spike occurs, the address of the particular neuron that produced it is transmitted.
In some embodiments, neuromorphic integrated circuit 200 may take spike information as input and generate AER spike events as output. In addition to outputting spikes from the last layer of the SCNN, AER spike events may also transmit the membrane potential value of each spiking neuron circuit.
In some embodiments, the neural structure 201 may process spikes in a feed-forward manner. AER formatted data may be used to send spikes between layers. Each layer may have an input spike buffer (e.g., input spike buffer 103) that converts spikes stored in the input spike buffer into a set of spike packets. Each layer can completely process all spikes in the input spike buffer before each layer sends its output spike to the next layer.
Fig. 2B is another block diagram of the neuromorphic integrated circuit 200 in fig. 2A, according to some embodiments. Fig. 2B illustrates the interconnection of components of neuromorphic integrated circuit 200 using a local bus 220 (e.g., a NoC bus). In fig. 2B, the neuromorphic integrated circuit 200 may include a neuron structure 201, a processor complex 204, one or more data interfaces 205, a pixel-to-spike converter 209, an audio-to-spike converter 210, and a DMA engine 214 as shown in fig. 2A. The neuromorphic integrated circuit 200 may further include a synapse weight memory 222, a memory 224, a serial Read Only Memory (ROM)226, configuration registers 228, a PCIe interface module 230, a PCIe bus 232, a UART interface 234, a CAN interface 236, a USB interface 238, and an ethernet interface 240.
In some embodiments, the synapse weight memory 222 may be identical to the synapse weight memory 105 of FIG. 1. The synapse weight memory 222 may be connected to the neuron structure 201. The synapse weight memory 222 may store the weights of all synapses and the membrane potential values of all spiking neuron circuits. The synapse weight memory 222 may be accessed externally through one or more DMA engines 214 from the PCIe interface module 230, which may be connected to a PCIe bus 232.
In some embodiments, a configuration register 228 may be connected to the neuron structure 201. During initialization of the neuron structure 201, the processor complex 204 may read the serial ROM 226 and configure the neuron structure 201 for an externally defined function by writing values to the configuration registers 228 and the synapse weight memory 222.
In some embodiments, the processor complex 204 is available externally through the PCIe interface 230. The program may be stored in the memory 224. The program may determine the functionality of the UART interface 234, the CAN interface 236, the USB interface 238, and the Ethernet interface 240. One or more of these interfaces may communicate data to be processed by the neuron structure 201, the processor complex 204, or both.
The audio-to-spike converter 210 may pass the spikes directly onto the local bus 220 for processing by the neuron structure 201. The pixel-to-spike converter 209 may be connected to an external image sensor and convert the pixel information into spike packets that are distributed over the local bus 220 for processing by the neuron structure 201. The processed spikes may be grouped (e.g., inserted into network packets) and placed on the local bus 220.
Fig. 3 is a flow diagram of input spike buffering, grouping, and output spike buffering for a next layer according to some embodiments. Fig. 3 includes an input spike buffer 301, one or more spike groupings 302, a neuron structure 303, and an output spike buffer 304. In fig. 3, the spikes in the input spike buffer 301 may be classified into one or more spike groupings 302 for particular neurons (e.g., spiking neuron circuits) in the neuron structure 303. After processing in the neuron structure 303, any resulting spikes may be stored in an output spike buffer 304, which is sent to the next layer. The resulting spikes in output spike buffer 304 may be grouped for processing by subsequent layers.
In some embodiments, one layer of the neuron structure 303 may process the entire input spike buffer 301. The layer may process each spike packet 302 sequentially. The resulting output spikes are placed in the output spike buffer 304. The output spike buffer 304 is not sent to the next layer for processing until all spike packets 302 have been processed. In some embodiments, all layers of the neuron structure 303 follow this workflow.
In some embodiments, the neuron structure 303 may process many spikes at once. In some embodiments, different spike buffer types may be used for layers in the neuron structure 303. The type of spike input buffer may depend on the nature of the input data. The difference between spike buffer types may be: how the spike buffer type generates spike packets from the input spike buffer 301.
In some embodiments, the packet buffer type may be used to process continuous or ongoing data types (e.g., the spike stream generated by a DVS camera). The user may configure different layers of the neuron structure 303 to use this buffer type. The packet buffer type can handle many spikes at a time, arriving singly or in very large bursts. The packet buffer stores spikes in the order in which they are received until the number of spikes reaches a size (e.g., a packet size) defined by a parameter specified in a configuration file (e.g., a YAML file). Once the packet buffer reaches this size, the spike packet is passed to the neuron structure 303 for processing. The packet buffer is then emptied, after which it continues to store spikes.
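A sketch of the packet buffer behaviour described above is shown below; the packet size and the spike representation are illustrative assumptions (in practice the size would come from the configuration file).

```python
class PacketBuffer:
    """Stores spikes in arrival order and hands them to the layer as a
    packet once the configured packet size is reached, then empties itself
    and continues accumulating."""

    def __init__(self, packet_size=4):
        self.packet_size = packet_size
        self.spikes = []

    def add(self, spike):
        self.spikes.append(spike)
        if len(self.spikes) >= self.packet_size:
            packet, self.spikes = self.spikes, []
            return packet      # ready for the neuron structure to process
        return None            # keep buffering

buf = PacketBuffer(packet_size=3)
for s in [7, 2, 9, 4]:
    packet = buf.add(s)
    if packet:
        print("process packet:", packet)   # process packet: [7, 2, 9]
```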
In some embodiments, a refresh buffer type may be used to process data that has a defined size (e.g., a conventional video image frame or a defined set of values). For example, a video frame may have a defined size, such as 640 x 480 pixels. In this case, the spikes received at one time may be sent out immediately as a single packet for processing. The length of the spike packets may vary.
In some embodiments, each layer type may implement the functionality to handle all spike input buffers (e.g., spike input buffer 301) by first generating packets from the spike input buffer. This function can process all spike packets after all spike input buffers are grouped. The function may then delete the processed spike packet and push an output spike from the spike packet to an output spike buffer (e.g., output spike buffer 304). This function may then fetch the next spike packet for processing. The difference between buffer types may be: how buffer type generates spike packets from an input spike buffer.
Fig. 4 is a block diagram of a neural processor 400 configured as a spiking convolutional neural processor, in accordance with some embodiments. The neural processor 400 may include a local bus 401 (e.g., a NoC bus), a spike decoder 402, a synaptic weight memory 403, a neuron location generator, a pooling circuit 404, a neuron structure 405, a potential update and check circuit 406, and a spike generator 407. The neuron structure 405 may be identical to the neuron structure 201 of fig. 2. The synaptic weight memory 403 may store the synaptic weight values and the membrane potential values (e.g., the potential array) of the neurons. As one of ordinary skill in the art will appreciate, the pooling circuit 404 may perform a max pooling operation, an average pooling operation, or another type of pooling operation. The one-to-many spike generator circuit 407 may generate spike packets that can be transmitted one-to-many across the local bus 401.
Fig. 5 is a block diagram of a neural processor 500 configured as a spike fully connected neural processor, in accordance with some embodiments. The neural processor 500 includes a local bus 501 (e.g., a NoC bus), a spike decoder 502, a synaptic weight memory 503, a neuron location generator, a packet former 504, a neuron structure 505, a potential update and check circuit 506, and a potential and spike output circuit 507. The neuron structure 505 may be identical to the neuron structure 201 of fig. 2. The synaptic weight memory 503 may store synaptic weight values and membrane potential values (e.g., an array of potentials) for the neurons. In fig. 5, spikes may be received into a spike input buffer and distributed as a spike packet using a spike decoder 502.
In some embodiments, the synaptic weights may be ternary. These ternary synaptic weights may be 2 bits wide and may take positive and negative values, unlike the weights in conventional SNNs. A positive value in a 2-bit-wide synaptic weight increases the membrane potential value of a neuron. A negative value in a 2-bit-wide synaptic weight decreases the membrane potential value of a neuron.
In some embodiments, the spikes in a spike packet are distributed according to their synaptic destination numbers. In some embodiments, during processing, the ternary synaptic weights are logically ANDed with the spikes represented in the spike packet. A "1" at a bit position may be used to represent a spike in the spike packet, and a "0" may be used to indicate that no spike is present. Synaptic weights may be negative or positive. A negative synaptic weight decrements the neuron's counter 109 (e.g., a membrane potential register). A positive synaptic weight increments the neuron's counter 109 (e.g., a membrane potential register).
In some embodiments, the learning process may be implemented by examining the input when a learning threshold of the neuron (e.g., a value in learning threshold register 111) is reached. The learning threshold of the neuron may be initially set to a very low value. The learning threshold may increase as neurons learn and more synaptic weights match. In some embodiments, the learning process may involve exchanging unused synaptic weights (e.g., positive synaptic weights at locations where no spikes occurred) and unused spikes (e.g., spikes in a spike grouping at locations relative to a synaptic weight having a value of zero). Unused synaptic weights may be swapped to locations containing unused spikes.
In some embodiments, a spike is generated if the neuron membrane potential value (e.g., represented by counter 109) exceeds a spike threshold (e.g., a value in spike threshold register 112). The spike is placed on the local bus.
Fig. 6A is an example of grouping spikes into groups of spikes, according to some embodiments. In fig. 6A, spike input buffer 601 (e.g., equivalent to spike input buffer 103) receives spikes from the local bus that have been processed by spike decoding circuitry. The grouper 602 may classify spikes in the spike input buffer 601 into spike groupings 603 according to the synaptic index number of the spike. For example, in fig. 6A, the received spike trains are 1, 6, 23, 1, 19, 18. As one of ordinary skill in the art will appreciate, the spike train may be much larger than the small number of spikes shown in fig. 6A. For example, a spike train may include thousands of spikes distributed to multiple synapses.
Fig. 6B is an example representation of the spike grouping 603 in fig. 6A according to some embodiments. In fig. 6B, spike packet 603 includes classified spikes from spike input buffer 601. In spike grouping 603, locations 1, 6, 18, 19, and 23 are highlighted, indicating that these locations contain a logical "1" value. The remaining locations within the spike grouping 603 contain zeros (e.g., indicating that no spikes are present).
In some embodiments, the spikes may be organized in the same order as the synaptic weights are located in memory (e.g., synaptic weight memory 105). This may allow an AND operation to be performed between the synaptic weight value AND a spike in the incoming spike grouping to determine whether to increment or decrement a membrane potential counter (e.g., counter 109). When a spike occurs at a position where the synaptic weight value is zero, the counter does not change for that bit position.
Fig. 7 is an example of a method of selecting whether to increment or decrement a membrane potential value (e.g., counter 109) according to some embodiments. In fig. 7, a logical AND operation is performed between spike grouping 701 AND weight register 702 (e.g., weight register 702 is equivalent to weight register 106). In fig. 7, spike bits 1, 6, 18, 19, and 23 of spike grouping 701 are highlighted, indicating that they contain a logical "1" value (e.g., indicating the presence of a spike). The remaining locations within spike grouping 701 contain zeros (e.g., indicating that no spikes are present).
The weight register 702 may contain logic bits indicating a positive or negative value. In fig. 7, bits 1, 4, 5, 14, and 22 contain positive values, while bit 18 contains negative values. Positive values may indicate excitatory effects, while negative values may indicate inhibitory effects. The bits in the weight register 702 may be labeled EXC for excitatory weights and INH for inhibitory weights. A logical AND is performed between the bits in the weight register 702 AND the bits in the spike packet 701. Thus, the spike occurring at position 1 increments the value of the membrane potential of the neuron (e.g., counter 109). Conversely, a spike occurring at location 18 decrements the value of the neuron's membrane potential (e.g., counter 109).
Fig. 7 is an example of a spike-time dependent plasticity (STDP) learning method according to some embodiments. In STDP learning, a spike that causes an output event/spike may have its representative synaptic weight strengthened, while a spike that does not cause an output event/spike may have its synaptic weight weakened.
In some embodiments, the STDP learning method is modified such that unused synaptic weights are swapped to locations containing unused spikes. For example, synaptic weight positions that have a value of zero but have received a spike are exchanged with synaptic weight positions that hold a logical "1" but have not received any spike.
In some embodiments, when a logical AND operation is performed at a position where the spike bit in the spike grouping is "1" and the synaptic weight is zero, the result is zero. Such a spike may be referred to as an "unused spike." When a logical AND operation is performed at a position where the spike bit in the spike grouping is "0" and the synaptic weight is "1", the result is also zero. Such a weight may be referred to as an "unused synaptic weight." Learning circuitry (e.g., weight exchanger 113) may swap randomly selected unused synaptic weights into positions where unused spikes occur.
In fig. 7, position 1 in spike grouping 701 contains a used spike, and position 1 in weight register 702 contains a used weight. This may result in an increment of the membrane potential value (e.g., counter 109) of the neuron.
Positions 4 and 5 of weight register 702 contain unused synaptic weights. These synaptic weights are candidates for swapping. Position 6 of spike grouping 701 contains an unused spike: position 6 of spike grouping 701 contains a one, but position 6 of weight register 702 contains a zero. An unused synaptic weight may be swapped to that location. Position 14 of weight register 702 contains an unused synaptic weight. Position 18 of spike grouping 701 contains a used spike, and position 18 of weight register 702 contains a used synaptic weight (in this case, inhibitory). This may result in a decrement of the membrane potential value (e.g., counter 109) of the neuron. Position 19 of spike grouping 701 contains an unused spike. Position 22 of weight register 702 contains an unused synaptic weight. Position 23 of spike grouping 701 contains an unused spike.
The inspiration for this STDP learning approach comes from learning that occurs in the biological brain. In some embodiments, a modified form of the STDP learning method is used to perform learning. This modified approach is similar to the mechanism by which biological neurons learn.
In some embodiments, a spiking neuron circuit spikes when its input drives its membrane potential value (e.g., counter 109) to a threshold. This may mean that when a neuron is driven to a threshold and spikes are generated, the connection from its most recently activated input is strengthened, while many of its other connections are weakened. This may result in neurons learning to respond to input patterns they see repeatedly, learning features that characterize the input data set autonomously.
In some embodiments, depending on other characteristics of this STDP method, such as natural competition between neurons caused by changes in learning thresholds, the population of neurons within a layer can learn extensive coverage of the input feature space. Thus, the response of a population of neurons to a given input will carry information about the features present.
In the brain, perception is usually hierarchical, occurring over a series of layers. Early layers extract information about simple features, while higher layers learn to respond to combinations of those features, making their responses more selective for complex shapes or objects and, at the same time, more general with respect to spatial position or orientation, i.e., invariant to spatial position or orientation.
In some embodiments, the modified STDP learning method is completely unsupervised. This is in contrast to the supervised training approaches conventionally used in neural networks. This means that embodiments herein can be presented with an unlabeled dataset and can learn to respond to different features present in the data without any other information. Learning may be an ongoing process.
In some embodiments, when a new class is added to an already trained dataset, the entire neural network (e.g., neuron structure 201) need not be retrained. This can eliminate the technical problem of catastrophic forgetting. By allowing learning to continue, new classes can be added to the features the network recognizes.
Unsupervised learning can extract features. However, without labeled data, unsupervised learning cannot directly "label" its output. In a classification task, the neural network (e.g., neuron structure 201) may learn a set of features that distinguish the classes present in the stimulus dataset. A method of linking the feature-representing responses to input labels may then be employed at the user's discretion.
Fig. 8 illustrates weight exchange steps of an STDP learning method according to some embodiments. FIG. 8 shows an example of the next step in the modified STDP learning process, in which "unused synaptic weights" are exchanged for "unused inputs" to enhance the response of neurons to the same or similar input spike patterns in the future.
In fig. 8, spike bits 1, 6, 18, 19, and 23 of spike packet 801 are highlighted, indicating that these spike bits contain a logical "1" value (e.g., indicating the presence of a spike). The remaining locations within spike packet 801 contain zeros (e.g., indicating that no spikes are present). Bits 1, 4, 5, 14, and 22 of synaptic weights 802 contain a "+1" value, while bit 18 contains a "-1" value. Bit 19 of spike packet 801 represents an unused spike. Bits 5 and 14 of synaptic weights 802 represent unused synaptic weights. In fig. 8, new synaptic weights 805 represent the result of exchanging unused synaptic weights (e.g., positive synaptic weights at locations where no spikes occurred) with unused spikes (e.g., spikes in the spike packet at locations where the synaptic weight has a value of zero). For example, bit 14 of the new synaptic weights 805 contains the value that was at bit 19 of the synaptic weights 802, and vice versa.
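The following Python sketch illustrates this weight-swap step under the definitions given above. It is an assumption-laden illustration only: the function name, the list representation of weights, and the use of random shuffling to pick swap partners are not taken from the disclosed hardware.

import random

# Sketch of the swap step of figs. 7 and 8: an "unused spike" has spike bit 1
# and weight 0; an "unused synaptic weight" is a nonzero weight where the spike
# bit is 0. Randomly chosen unused weights are moved to unused-spike positions.
def swap_unused_weights(spike_packet, weights, rng=random):
    weights = list(weights)
    unused_spikes = [i for i, (s, w) in enumerate(zip(spike_packet, weights)) if s == 1 and w == 0]
    unused_weights = [i for i, (s, w) in enumerate(zip(spike_packet, weights)) if s == 0 and w != 0]
    rng.shuffle(unused_weights)
    for spike_pos, weight_pos in zip(unused_spikes, unused_weights):
        # the nonzero weight moves to the spiking position; a zero takes its place
        weights[spike_pos], weights[weight_pos] = weights[weight_pos], weights[spike_pos]
    return weights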
FIG. 9 illustrates convolution in a neural processor configured as a spiking convolutional neural processor, in accordance with some embodiments. For example, fig. 9 shows the modified STDP method that learns by weight swapping, applied in a convolutional layer of a neural processor configured as a spiking convolutional neural processor.
Convolution may be a mathematical operation with the purpose of extracting features from data (e.g., an image). The result of the convolution between two data sets, whether image data or another data type, is a third data set.
In some embodiments, the convolution may operate on spike and potential values. Each neural processor (e.g., neural processor 208) may identify the neuron with the highest potential value and broadcast that neuron to the other neural processors in the same layer. In some embodiments, if the potential value of a neuron is above a learning threshold (e.g., the value in learning threshold register 111), the synaptic weights of all kernels that output to that neuron are updated. The same event packet may be retransmitted from the layer above. The neural processor may only act on spikes within the neuron's receptive field. For example, in FIG. 9, the receptive field is the area within square 902. Neural processor 208 identifies unused spikes in the receptive field (shown as U) and unused weights in the kernels (e.g., kernels 901 and 903) (also shown as U).
In some embodiments, the modified STDP learning method may determine the total number of bits swapped across all kernels between unused spikes and unused synaptic weights according to a rule. For example, the rule may be: number of swapped bits = min(swap number, number of unused spikes, number of unused weights), where the swap number may be a configuration field. In this example, min(5, 3, 4) = 3. The modified STDP learning method may then randomly swap that number of bits across all kernels. In fig. 9, three bits are swapped. Thereafter, the modified STDP learning method may update the synaptic weights of the filters of all other neural processors in the same layer accordingly.
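A small Python sketch of this swap-count rule follows; the function and parameter names are assumptions used only for illustration.

# Sketch of the rule above: the number of bits exchanged across all kernels is
# the minimum of a configured swap limit, the number of unused spikes, and the
# number of unused synaptic weights.
def swap_budget(max_swaps, num_unused_spikes, num_unused_weights):
    return min(max_swaps, num_unused_spikes, num_unused_weights)

# Example from fig. 9: a configured limit of 5, 3 unused spikes, 4 unused weights.
assert swap_budget(5, 3, 4) == 3   # three bits are swapped across the kernels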
The neural network may have a training phase. The training phase may use known samples. The neural network may also have an inference phase, during which previously unseen samples may be identified. During the training phase, the output neurons are labeled according to the stimulus class to which they respond most strongly. During the inference phase, the inputs are labeled according to the features to which the neurons respond most strongly. The unsupervised learning approach of embodiments herein may be useful where there is a large dataset of which only a small portion is labeled. In this case, embodiments herein may be trained on the entire dataset, after which a supervised phase is performed to label the network outputs using the smaller labeled dataset.
In some embodiments, a supervised learning algorithm may be used. The inference component of embodiments herein can be kept completely separate from the learning algorithm, preserving its advantages of fast, efficient computation. Embodiments herein are designed such that synaptic weights learned offline using an algorithm of the user's choice can easily be uploaded. The network design may be limited to binary or ternary synaptic weights and activation levels. A growing number of third-party techniques for supervised learning are suited to these constraints.
While unsupervised learning can perform these tasks well, there are situations in which a supervised learning approach would be advantageous. Conversely, unsupervised learning can perform tasks that are not possible with supervised learning methods, such as finding unknown and unexpected patterns in data for which no labeled results are available for supervision; such patterns are easily missed by supervised methods.
FIG. 10 illustrates a symbolic representation of convolution in an 8 × 8 matrix of pixels with a depth of 1, according to some embodiments. In fig. 10, an exemplary 5 × 5 convolution filter 1002 is applied. In fig. 10, the filter 1002 is allowed to extend beyond the edge of the original input. This can be done by padding the original input 1001 with zeros.
Four convolution types are supported: valid, same, full, and padded. Fig. 10 shows the resulting convolutions. A "full" convolution (e.g., full convolution 1003) maximally increases the size of the output by using a padding of 4. A "same" convolution (e.g., same convolution 1004) uses a padding of 2 so that the output has the same dimensions as the original input (e.g., 8 × 8 × 1). A "valid" convolution (e.g., valid convolution 1005) uses no padding and produces an output that is smaller than the original input.
In some embodiments, the SCNN may use full, same, or valid convolutions. The SCNN may also use a custom convolution type called "padded." The programmer may specify a padded convolution by indicating the amount of padding around each side of the original input 1001.
In some embodiments, the different convolution types may be defined by equations 2 through 4, which give the size of the convolution output in terms of the original input size and the filter size. In equations 2 through 4, I_w represents the width of the original input, C_w represents the width of the convolution output (e.g., the potential array), and k_w represents the width of the filter.
Valid type: C_w = I_w - (k_w - 1)    (2)
Same type: C_w = I_w    (3)
Full type: C_w = I_w + (k_w - 1)    (4)
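The following Python sketch evaluates equations 2 through 4 for the 8 × 8 input and 5 × 5 filter of fig. 10. The function name is illustrative, and the "padded" branch is an assumed generalization for the user-specified padding described above, not a formula stated in the disclosure.

# Output (potential array) width C_w as a function of input width I_w and
# filter width k_w for the supported convolution types.
def conv_output_width(i_w, k_w, mode, pad=0):
    if mode == "valid":
        return i_w - (k_w - 1)            # eq. (2)
    if mode == "same":
        return i_w                        # eq. (3)
    if mode == "full":
        return i_w + (k_w - 1)            # eq. (4)
    if mode == "padded":
        return i_w + 2 * pad - (k_w - 1)  # assumed form for explicit per-side padding
    raise ValueError(mode)

# Fig. 10: an 8x8 input with a 5x5 filter.
assert conv_output_width(8, 5, "full") == 12   # padding of 4
assert conv_output_width(8, 5, "same") == 8    # padding of 2
assert conv_output_width(8, 5, "valid") == 4   # no padding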
FIG. 11 illustrates a symbolic representation of a convolution involving two spike channels, two 3 × 3 inverse convolution kernels, and the resulting membrane potential values, according to some embodiments. Fig. 11 shows two channels of spikes, 1101 and 1102, two 3 × 3 inverse convolution kernels, 1103 and 1104, and the resulting neuron membrane potential values 1105. A two-channel example of the present embodiments is shown here; it differs in that spiking neuron circuits, rather than a programmed processor, perform these operations. First, all potentials in a neural processor configured as an SCNN processor are cleared to zero. When a spike packet arrives, processing of the spike packet changes the membrane potential values of the affected neuron circuits.
Fig. 12 illustrates the resulting spikes generated by the spiking neuron circuits on channels 1101 and 1102 of fig. 11, in accordance with some embodiments. In channel 1101, the spiking neuron circuits that have fired are shown as "1" entries in the matrix. All other locations in channel 1101 are filled with zeros. The spike maps can be convolved with the two inverse kernels 1103 and 1104 shown in fig. 11 for channels 1101 and 1102, producing the neuron membrane potential values 1105 shown in fig. 11.
FIG. 13 illustrates a spiking neural network convolution operation, according to some embodiments. Fig. 13 shows an input (e.g., an image) with three channels (e.g., channels 1301, 1302, and 1303) processed by two filters (e.g., filters 1304 and 1305). The blank entries in filters 1304 and 1305 may correspond to zero values.
In fig. 13, the dimension of the filter 1304 is 5 × 5 × 3 (e.g., filter width × filter height × number of channels). The filter 1304 is centered on the coordinates (2, 2) of the input image. The upper left corner of the input image has coordinates (0, 0). The width and height of the filter may be smaller than the input image. As will be understood by those of ordinary skill in the art, filters typically have a 3 x 3, 5 x 5, or 7 x 7 configuration.
In some embodiments, the filters may have different sets of weights corresponding to particular channels of the input. Each set of weights may be referred to as a kernel of a filter. In FIG. 13, filter 1304 has three cores (e.g., cores 1306, 1307, and 1308). The number of kernels in each filter may match the number of channels in the input. Each input event may have (x, y) coordinates and channel coordinates.
In some embodiments, the results of the convolution operation are summed into a single entry in the potential array. In fig. 13, the dashed box shows where the filter 1304 convolution occurs over the input. The smaller dashed box in the potential array 1309 of the filter 1304 shows where these inputs are summed.
In fig. 13, the convolution operation performed by the filter 1304 can be described as a 3D dot product. The dashed box shows the position where the filter 1304 is aligned with the input. The 3D dot product is summed across the x, y, and channel dimensions, and the scalar sum is placed into a third matrix, referred to as a potential array (or activation map). In fig. 13, potential array 1309 represents the potential array for the filter 1304. As will be understood by those skilled in the art, each element of the potential array may be considered the membrane potential value of a neuron (e.g., a spiking neuron circuit).
In fig. 13, potential arrays 1309 and 1310 correspond to filters 1304 and 1305, respectively. The dashed box shows the result of the current convolution. The dimensions of a potential array define the total number of neurons (e.g., spiking neuron circuits). In fig. 13, filters 1304 and 1305 each simulate nine neurons. Each of filters 1304 and 1305 may be centered at a different x-y position within the three input channels. The example of convolution in fig. 13 shows binary values for the elements in the input image and the weights in filters 1304 and 1305. However, as will be understood by those of ordinary skill in the art, the elements in the input image and the weights in filters 1304 and 1305 may also be positive and negative floating point values.
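A minimal Python sketch of the multi-channel "valid" convolution of fig. 13 follows. It is illustrative only: the function name, the NumPy array layout (channels, height, width), and the stride-1, valid-mode assumptions are choices made for the example, consistent with the description above.

import numpy as np

# Sketch: slide a filter (one kernel per input channel) over the input; at each
# position, a 3D dot product across x, y, and channel is summed into one entry
# of the potential array.
def convolve_valid(inputs, filt):
    """inputs: (C, H, W) array; filt: (C, kH, kW) array -> 2D potential array."""
    c, h, w = inputs.shape
    _, kh, kw = filt.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            patch = inputs[:, y:y + kh, x:x + kw]
            out[y, x] = np.sum(patch * filt)   # 3D dot product -> one membrane potential
    return out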
In some embodiments, the discrete convolution may be performed using equation 1. In equation 1, f may represent the input and g may represent the filter (e.g., filter 1304). As will be understood by those of ordinary skill in the art, equation 1 is analogous to computing a dot product centered at a different input position for each output value. However, the filter must be "flipped" before being "slid" across the input for each dot product; this flipping of the filter indices is what gives convolution its useful mathematical properties.
(f * g)(x, y) = Σ_i Σ_j f(i, j) · g(x - i, y - j)    (1)
In some embodiments, the stride of a convolutional layer may be defined as how far the filter (e.g., filter 1304) is shifted between subsequent dot-product operations. In some embodiments, the convolution stride may be hard-coded to 1.
In fig. 13, the filters 1304 and 1305 are not allowed to extend beyond the original input image during the convolution operation. This type of convolution may be referred to as a "valid" convolution. It results in potential arrays (e.g., potential arrays 1309 and 1310) that are smaller than the original input.
FIG. 14 illustrates the result of applying eight directional filter neuron convolutions to an input image, according to some embodiments. In fig. 14, a single-channel input image 1401 containing a cat is converted into spike maps in eight channels 1402, 1403, 1404, 1405, 1406, 1407, 1408, and 1409, each having the same width and height as the original image.
Fig. 15 illustrates the similarity between DVS spike-based convolution and frame-based convolution according to some embodiments. A frame may refer to a frame transmitted by a standard camera. Events (or spikes) are transmitted by the spike or event based camera. Event-based convolution can perform a convolution operation at each event (or spike) and place the result in an array of output membrane potentials. Frame-based convolution 1502 shows a classic frame-based convolution in which the convolution is performed on the entire image. The parameters and results of the convolution are shown. Event-based convolution 1504 shows an event-based convolution operation in which an event (or spike) at (3, 3) is processed at time 0 nanoseconds (ns), then an event at (2, 3) is processed at time 10ns, then another event at (3, 3) is processed at 20ns, and finally an event at (3, 2) is processed at 30 ns. The resulting array after each event has been processed is shown above the kernel. The end result is the same.
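The following Python sketch illustrates the equivalence described for fig. 15: each event adds one shifted copy of the kernel into the membrane potential array, so processing events one at a time ends in the same state as a frame-based convolution of the accumulated spike frame. The event coordinates are taken from fig. 15; the kernel values, array size, and helper names are assumptions for illustration.

import numpy as np
from scipy.signal import convolve2d

# Event-based convolution: stamp the kernel around each incoming event.
def event_based_convolution(events, kernel, shape):
    pot = np.zeros(shape)
    r = kernel.shape[0] // 2
    for (y, x) in events:
        for dy in range(kernel.shape[0]):
            for dx in range(kernel.shape[1]):
                py, px = y + dy - r, x + dx - r
                if 0 <= py < shape[0] and 0 <= px < shape[1]:
                    pot[py, px] += kernel[dy, dx]
    return pot

events = [(3, 3), (2, 3), (3, 3), (3, 2)]      # event stream from fig. 15
kernel = np.arange(9.0).reshape(3, 3)          # placeholder kernel values
frame = np.zeros((6, 6))
for y, x in events:
    frame[y, x] += 1                           # accumulate events into a frame
# The end result of the two approaches is the same.
assert np.allclose(event_based_convolution(events, kernel, (6, 6)),
                   convolve2d(frame, kernel, mode="same"))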
FIG. 16 illustrates an example of a YAML configuration file 1600, in accordance with some embodiments. YAML is a configuration file format commonly used with the Python programming language. In some embodiments, the YAML configuration file 1600 can be used to program and initialize the neuromorphic integrated circuit to handle events (or spikes) in a defined application. An event is indicated by the occurrence of a spike and may indicate a color shift in an image, an increase or decrease in a measured analog value, a change in contrast, the occurrence of particular data in a packet, or another real-world phenomenon.
In some embodiments, a neuromorphic integrated circuit or a software simulation thereof is configured by the YAML configuration file 1600 to process the CIFAR10 dataset in an SCNN with eight different neural layers. The first layer (layer 1602) is configured as an input layer with a 32 × 32 bit organization to match the resolution of the dataset. This layer may convert the pixel information into spikes and is connected to layer 1604.
Layer 1604 is configured as a convolutional layer with ternary synaptic weights and is defined as the "convolutional ternary" layer type. The weights may be preloaded into layer 1604 from a file named "scnn_conv2_wts.dat" that exists in a particular directory. Layer 1604 is defined to use a "flush buffer," which may be defined elsewhere in the YAML configuration file 1600. The spike packet size of layer 1604 is defined as 131,072 spikes, which may be equivalent to a full frame of 32 × 32 pixels with a depth of 8. Layer 1604 is defined to have 128 outputs to the next layer (e.g., layer 1606), with a convolution size of 3 × 3 pixels, pooled across a 2 × 2 field.
A similar configuration may be made for each convolutional layer in the SCNN. The last layer may be of the "fully connected ternary" type, indicating that it is a fully connected layer with ternary synaptic weights. This last layer may have ten outputs, corresponding to the ten classes contained in the CIFAR10 dataset. The last layer may have a packet size of 1024, which may be equivalent to the number of features returned by the previous convolutional layer.
In some embodiments, the YAML configuration file 1600 may be processed during initialization of the neuromorphic integrated circuit. The constructor task may generate parameter objects from the parameters specified in the YAML configuration file 1600. The constructor task may assign a separate layer for each parameter object in the YAML configuration file 1600. Each layer may be created with parameters specific to each layer. The data structure may be used to sequentially iterate through all layers. A buffer type object may be created and initialized for each layer in the SCNN. Each layer may be initialized by registers organized as scan chains and connected to the previous layer, except for the input layer, which may be connected to input signals and output to the next layer. During layer initialization, the connection list, the weight vector array, and the potential array are initialized. The connection list may contain information about which neuron circuits are connected. Each neuron circuit in the SCNN may have a defined number of synapses, including synapse weight values. The membrane potential value for each neuron may be defined as the sum of all synaptic weight vectors connected to the neuron circuit and specified by spiking values in the spike packet.
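The following Python sketch suggests how a constructor task might turn parsed YAML parameters into a chain of layer objects, in the spirit of the initialization sequence described above. The field names ("layers" and the per-layer parameters), the class name, and the data structures are illustrative assumptions, not the actual schema of the YAML configuration file 1600.

import yaml

class Layer:
    def __init__(self, params, previous=None):
        self.params = params
        self.previous = previous   # link to the previous layer (input layer has none)
        self.connections = []      # connection list: which neuron circuits are connected
        self.weights = []          # synaptic weight vectors
        self.potentials = []       # membrane potential array

def build_network(yaml_text):
    config = yaml.safe_load(yaml_text)
    layers, previous = [], None
    for params in config["layers"]:    # iterate layer parameter objects in order
        layer = Layer(params, previous)
        layers.append(layer)
        previous = layer
    return layers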
FIG. 17 illustrates a configuration register that includes scan chains defining the configuration and connectivity of each neuron circuit and each layer of neuron circuits, according to some embodiments. The configuration data may be sent to the neural processor in a sequential manner to build a processing sequence.

Claims (32)

1. A neuromorphic integrated circuit comprising:
a spike converter configured to generate a spike from input data;
a grouper configured to generate a grouping of spikes comprising spike positions, the spike positions representing the spikes, wherein each spike position in the grouping of spikes corresponds to a synapse of a plurality of synapses;
a reconfigurable neuron structure comprising a neural processor comprising a plurality of spiking neuron circuits configured to perform tasks based on the spiking and neural network configurations;
a memory comprising the neural network configuration, wherein the neural network configuration comprises an array of potentials and the plurality of synapses, the neural network configuration defines connections between the plurality of spiking neuron circuits and the plurality of synapses, the array of potentials comprises membrane potential values for the plurality of spiking neuron circuits, and the plurality of synapses have respective synaptic weights; and
a processor configured to:
selecting a spiking neuron circuit among the plurality of spiking neuron circuits based on the spiking neuron circuit having a membrane potential value that is the highest value of the membrane potential values of the plurality of spiking neuron circuits;
determining that the membrane potential value of the selected spiking neuron circuit has reached a learning threshold associated with the spiking neuron circuit; and
performing a spike-time dependent plasticity (STDP) learning function based on the determination that the membrane potential value of the selected spiking neuron circuit has reached the learning threshold associated with the spiking neuron circuit.
2. The neuromorphic integrated circuit of claim 1, wherein each of the synaptic weights has a weight value selected from the group consisting of negative one, zero, and one.
3. The neuromorphic integrated circuit of claim 1, wherein each spike position in the spike grouping is represented using a digital bit.
4. The neuromorphic integrated circuit of claim 1, further comprising: a spike input buffer configured to store the spikes from the spike converter.
5. The neuromorphic integrated circuit of claim 1, wherein to perform the spike-time-dependent plasticity (STDP) learning function, the neural processor is configured to:
determining that a first spike bit in the grouping of spikes represents an unused spike, wherein the first spike bit corresponds to a first synapse of the plurality of synapses, a value of the first spike bit is one, and a value of a first synaptic weight of the first synapse is zero;
determining that a second synaptic weight of a second synapse of the plurality of synapses is an unused synaptic weight, wherein the second synaptic weight corresponds to a second spike position in the grouping of spikes, a value of the second synaptic weight is one, and a value of the second spike position in the grouping of spikes is zero; and
exchanging a value of the second synaptic weight with a value of the first synaptic weight.
6. The neuromorphic integrated circuit of claim 1, wherein a spiking neuron circuit of the plurality of spiking neuron circuits is configured to:
applying a first logical AND function to a first spike position in the grouping of spikes and a first synaptic weight of a first synapse corresponding to the first spike position, wherein the first synaptic weight has a value of one, and the applying the first logical AND function outputs a logical one; and
incrementing a membrane potential value associated with the spiking neuron circuit in response to the applying the first logical AND function outputting a logical one.
7. The neuromorphic integrated circuit of claim 1, wherein a spiking neuron circuit of the plurality of spiking neuron circuits is configured to:
applying a second logical AND function to a first spike bit in the grouping of spikes and a second synaptic weight corresponding to the first spike bit, wherein the second synaptic weight has a negative value, and the applying the second logical AND function outputs a logical one; and
decrementing a membrane potential value associated with the spiking neuron circuit in response to the applying the second logical AND function outputting a logical one.
8. The neuromorphic integrated circuit of claim 1, wherein a spiking neuron circuit of the plurality of spiking neuron circuits is configured to:
determining that a membrane potential value associated with the spiking neuron circuit has reached a spiking threshold of the spiking neuron circuit; and
generating an output spike based on the determination that the value of membrane potential of the spiking neuron circuit has reached the spike threshold.
9. The neuromorphic integrated circuit of claim 8, further comprising an output spike buffer and a network-on-chip (NoC) bus, wherein the spiking neuron circuit of the plurality of spiking neuron circuits is configured to insert the output spike in the output spike buffer for transmission on the network-on-chip (NoC) bus.
10. The neuromorphic integrated circuit of claim 1, wherein the reconfigurable neuron structure is configured to perform the task using a convolution operation.
11. The neuromorphic integrated circuit of claim 1, further comprising:
a plurality of communication interfaces including at least one of a Universal Serial Bus (USB) interface, an Ethernet interface, a Controller Area Network (CAN) bus interface, a serial interface using a Universal Asynchronous Receiver Transmitter (UART), a peripheral component interconnect express (PCIe) interface, or a Joint Test Action Group (JTAG) interface.
12. The neuromorphic integrated circuit of claim 1, wherein the input data comprises pixel data, audio data, or perceptual data.
13. The neuromorphic integrated circuit of claim 1, further comprising a plurality of sensor interfaces.
14. The neuromorphic integrated circuit of claim 1, wherein the processor is configured as at least one of a spiking fully connected neural processor and a spiking convolutional neural processor.
15. A method of spike-time dependent plasticity (STDP) learning, comprising:
generating a spike from the input data at a spike converter;
generating, at a grouper, a grouping of spikes comprising spike positions, the spike positions representing the spikes, wherein each spike position in the grouping of spikes corresponds to a synapse of a plurality of synapses;
constructing a reconfigurable neuron structure comprising a neural processor comprising a plurality of spiking neuron circuits to perform tasks based on the spikes;
selecting, at the neural processor, a spiking neuron circuit among the plurality of spiking neuron circuits based on the spiking neuron circuit having a membrane potential value that is the highest value of the membrane potential values of the plurality of spiking neuron circuits;
determining, at the neural processor, that a membrane potential value of the selected spiking neuron circuit has reached a learning threshold associated with the spiking neuron circuit; and
performing a spike-time dependent plasticity (STDP) learning function at the neural processor based on the determination that the membrane potential value of the selected spiking neuron circuit has reached the learning threshold associated with the selected spiking neuron circuit.
16. The spike-time dependent plasticity (STDP) learning method of claim 15, wherein the set of spike positions represents input data comprising pixel data, audio data, or perceptual data.
17. The method of spike-time dependent plasticity (STDP) learning according to claim 15, wherein the plurality of synapses have respective synaptic weights, and each synaptic weight has a weight value selected from the group consisting of negative one, zero, and one.
18. The spike-time dependent plasticity (STDP) learning method of claim 15, further comprising: storing the spikes from the spike converter at a spike input buffer.
19. The spike-time dependent plasticity (STDP) learning method of claim 15, further comprising:
determining, by the neural processor, that a first spike bit in the grouping of spikes represents an unused spike, wherein the first spike bit corresponds to a first synapse of a plurality of synapses, a value of the first spike bit is one, and a value of a first synaptic weight of the first synapse is zero;
determining, by the neural processor, that a second synaptic weight of a second synapse of the plurality of synapses is an unused synaptic weight, wherein the second synaptic weight corresponds to a second spike position in the grouping of spikes, a value of the second synaptic weight is one, and a value of the second spike position in the grouping of spikes is zero; and
exchanging, by the neural processor, a value of the second synaptic weight with a value of the first synaptic weight.
20. The spike-time dependent plasticity (STDP) learning method of claim 15, wherein each spike position in the spike packet is represented using a digital bit.
21. The method of spike-time dependent plasticity (STDP) learning according to claim 15, wherein the performing of the spike-time dependent plasticity (STDP) learning function further comprises:
at a spiking neuron circuit of the plurality of spiking neuron circuits, applying a first logical AND function to a first spike position in the grouping of spikes and a first synaptic weight of a first synapse corresponding to the first spike position, wherein the first synaptic weight has a value of one, and the applying the first logical AND function outputs a logical one; and
incrementing, at the spiking neuron circuit of the plurality of spiking neuron circuits, a membrane potential value associated with the spiking neuron circuit in response to the applying the first logical AND function outputting a logical one.
22. The spike-time dependent plasticity (STDP) learning method of claim 15, further comprising:
at a spiking neuron circuit of the plurality of spiking neuron circuits, applying a second logical AND function to a first spike bit in the grouping of spikes and a second synaptic weight corresponding to the first spike bit, wherein the second synaptic weight has a negative value, and the applying the second logical AND function outputs a logical one; and
decrementing, at the spiking neuron circuit of the plurality of spiking neuron circuits, a membrane potential value associated with the spiking neuron circuit in response to the applying the second logical AND function outputting a logical one.
23. The method of spike-time dependent plasticity (STDP) learning according to claim 15, wherein the processor is configured as at least one of a spiking fully connected neural processor and a spiking convolutional neural processor.
24. A neuromorphic integrated circuit comprising:
a spike converter configured to generate a spike from input data;
a reconfigurable neuron structure comprising a neural processor comprising a plurality of spiking neuron circuits configured to perform tasks based on the spiking and neural network configurations;
a memory comprising the neural network configuration, wherein the neural network configuration comprises an array of potentials and a plurality of synapses, the neural network configuration defines connections between the plurality of spiking neuron circuits and the plurality of synapses, the array of potentials comprises membrane potential values for the plurality of spiking neuron circuits, and the plurality of synapses have respective synaptic weights; and
a processor configured to modify the neural network configuration.
25. The neuromorphic integrated circuit of claim 24, wherein each of the synaptic weights has a weight value selected from the group consisting of negative one, zero, and one.
26. The neuromorphic integrated circuit of claim 24, wherein the neuromorphic integrated circuit further comprises:
a spike input buffer configured to store the spikes from the spike converter; and
a grouper configured to generate a grouping of spikes including spike positions representing the spikes in the spike input buffer, wherein each spike position in the grouping of spikes corresponds to a synapse of the plurality of synapses.
27. The neuromorphic integrated circuit of claim 26, wherein each spike position in the spike grouping is represented using a digital bit.
28. The neuromorphic integrated circuit of claim 26, wherein a spiking neuron circuit of the plurality of spiking neuron circuits is configured to:
applying a first logical AND function to a first spike position in the grouping of spikes and a first synaptic weight of a first synapse corresponding to the first spike position, wherein the first synaptic weight has a value of one, and the applying the first logical AND function outputs a logical one; and
incrementing a membrane potential value associated with the spiking neuron circuit in response to the applying the first logical AND function outputting a logical one.
29. The neuromorphic integrated circuit of claim 26, wherein a spiking neuron circuit of the plurality of spiking neuron circuits is configured to:
applying a second logical AND function to a first spike bit in the grouping of spikes and a second synaptic weight corresponding to the first spike bit, wherein the second synaptic weight has a negative value, and the applying the second logical AND function outputs a logical one; and
decrementing a membrane potential value associated with the spiking neuron circuit in response to the applying the second logical AND function outputting a logical one.
30. The neuromorphic integrated circuit of claim 24, wherein a spiking neuron circuit of the plurality of spiking neuron circuits is configured to:
determining that a membrane potential value associated with the spiking neuron circuit has reached a spiking threshold of the spiking neuron circuit; and
generating an output spike based on the determination that the value of membrane potential of the spiking neuron circuit has reached the spike threshold.
31. The neuromorphic integrated circuit of claim 24, wherein the reconfigurable neuron structure is configured to perform the task using a convolution operation.
32. The neuromorphic integrated circuit of claim 24, wherein to modify the neural network configuration, the processor is configured to modify the connections between the plurality of spiking neuron circuits and the plurality of synapses based on a configuration file.
CN202110835587.8A 2018-11-01 2019-10-31 Improved spiking neural network Active CN113537471B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110835587.8A CN113537471B (en) 2018-11-01 2019-10-31 Improved spiking neural network

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
US201862754348P 2018-11-01 2018-11-01
US62/754,348 2018-11-01
CN201980004791.6A CN111417963B (en) 2018-11-01 2019-10-31 Improved spiking neural network
CN202110835587.8A CN113537471B (en) 2018-11-01 2019-10-31 Improved spiking neural network
PCT/US2019/059032 WO2020092691A1 (en) 2018-11-01 2019-10-31 An improved spiking neural network

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
CN201980004791.6A Division CN111417963B (en) 2018-11-01 2019-10-31 Improved spiking neural network

Publications (2)

Publication Number Publication Date
CN113537471A true CN113537471A (en) 2021-10-22
CN113537471B CN113537471B (en) 2024-04-02

Family

ID=70458523

Family Applications (2)

Application Number Title Priority Date Filing Date
CN201980004791.6A Active CN111417963B (en) 2018-11-01 2019-10-31 Improved spiking neural network
CN202110835587.8A Active CN113537471B (en) 2018-11-01 2019-10-31 Improved spiking neural network

Family Applications Before (1)

Application Number Title Priority Date Filing Date
CN201980004791.6A Active CN111417963B (en) 2018-11-01 2019-10-31 Improved spiking neural network

Country Status (7)

Country Link
US (2) US11468299B2 (en)
EP (1) EP3874411A4 (en)
JP (1) JP2022509754A (en)
KR (1) KR20210098992A (en)
CN (2) CN111417963B (en)
AU (2) AU2019372063B2 (en)
WO (1) WO2020092691A1 (en)

Families Citing this family (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AU2019372063B2 (en) 2018-11-01 2022-12-01 Brainchip, Inc. An improved spiking neural network
US11204740B2 (en) * 2018-12-04 2021-12-21 Electronics And Telecommunications Research Institute Neuromorphic arithmetic device and operating method thereof
US20200401876A1 (en) * 2019-06-24 2020-12-24 Washington University Method for designing scalable and energy-efficient analog neuromorphic processors
CN112766511A (en) * 2019-11-01 2021-05-07 伊姆西Ip控股有限责任公司 Method, apparatus and computer program product for model adaptation
KR20210063721A (en) * 2019-11-25 2021-06-02 삼성전자주식회사 Neuromorphic device and neuromorphic system including the same
KR20220119697A (en) * 2019-12-27 2022-08-30 마이크론 테크놀로지, 인크. Neuromorphic memory device and method
WO2021138329A1 (en) 2019-12-30 2021-07-08 Micron Technology, Inc. Memory device interface and method
CN114902332A (en) 2019-12-31 2022-08-12 美光科技公司 Memory module multiport buffering technique
US11176043B2 (en) * 2020-04-02 2021-11-16 International Business Machines Corporation Distributed memory-augmented neural network architecture
US20220143820A1 (en) * 2020-11-11 2022-05-12 Sony Interactive Entertainment Inc. Domain adaptation for simulated motor backlash
FR3123748A1 (en) * 2021-06-04 2022-12-09 Commissariat A L'energie Atomique Et Aux Energies Alternatives ANNOTATION-FREE MACHINE LEARNING ENHANCED BY ADAPTIVE GROUPING IN OPEN SET OF CLASSES
CN113902106B (en) * 2021-12-06 2022-02-22 成都时识科技有限公司 Pulse event decision device, method, chip and electronic equipment
WO2023158352A1 (en) * 2022-02-21 2023-08-24 Telefonaktiebolaget Lm Ericsson (Publ) Communication of spiking data over a wireless communication network
WO2023163619A1 (en) * 2022-02-22 2023-08-31 Telefonaktiebolaget Lm Ericsson (Publ) Communication of spiking data on radio resources
CN114692681B (en) * 2022-03-18 2023-08-15 电子科技大学 SCNN-based distributed optical fiber vibration and acoustic wave sensing signal identification method
WO2023250093A1 (en) * 2022-06-22 2023-12-28 Brainchip, Inc. Method and system for implementing temporal convolution in spatiotemporal neural networks
WO2023250092A1 (en) * 2022-06-22 2023-12-28 Brainchip, Inc. Method and system for processing event-based data in event-based spatiotemporal neural networks
CN116030535B (en) * 2023-03-24 2023-06-20 深圳时识科技有限公司 Gesture recognition method and device, chip and electronic equipment

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100076916A1 (en) * 2008-09-21 2010-03-25 Van Der Made Peter Aj Autonomous Learning Dynamic Artificial Neural Computing Device and Brain Inspired System
US20120259804A1 (en) * 2011-04-08 2012-10-11 International Business Machines Corporation Reconfigurable and customizable general-purpose circuits for neural networks
TW201541372A (en) * 2014-03-24 2015-11-01 Qualcomm Inc Artificial neural network and perceptron learning using spiking neurons
US20160019457A1 (en) * 2008-09-21 2016-01-21 Brainchip, Inc. Method and a system for creating dynamic neural function libraries
CN105659262A (en) * 2013-11-08 2016-06-08 高通股份有限公司 Implementing synaptic learning using replay in spiking neural networks
US20170024644A1 (en) * 2015-07-24 2017-01-26 Brainchip Inc. Neural processor based accelerator system and method
US20170236027A1 (en) * 2016-02-16 2017-08-17 Brainchip Inc. Intelligent biomorphic system for pattern recognition with autonomous visual feature extraction
US20180082176A1 (en) * 2016-09-22 2018-03-22 Intel Corporation Neuromorphic computing device, memory device, system, and method to maintain a spike history for neurons in a neuromorphic computing environment
EP3343457A1 (en) * 2016-12-30 2018-07-04 Intel Corporation Neural network with reconfigurable sparse connectivity and online learning

Family Cites Families (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2259214B1 (en) * 2009-06-04 2013-02-27 Honda Research Institute Europe GmbH Implementing a neural associative memory based on non-linear learning of discrete synapses
US9129220B2 (en) * 2010-07-07 2015-09-08 Qualcomm Incorporated Methods and systems for digital neural processing with discrete-level synapes and probabilistic STDP
US9269042B2 (en) * 2010-09-30 2016-02-23 International Business Machines Corporation Producing spike-timing dependent plasticity in a neuromorphic network utilizing phase change synaptic devices
US9460387B2 (en) 2011-09-21 2016-10-04 Qualcomm Technologies Inc. Apparatus and methods for implementing event-based updates in neuron networks
US20130325766A1 (en) 2012-06-04 2013-12-05 Csaba Petre Spiking neuron network apparatus and methods
US9367798B2 (en) * 2012-09-20 2016-06-14 Brain Corporation Spiking neuron network adaptive control apparatus and methods
EP3324343A1 (en) 2016-11-21 2018-05-23 Centre National de la Recherche Scientifique Unsupervised detection of repeating patterns in a series of events
US10846590B2 (en) * 2016-12-20 2020-11-24 Intel Corporation Autonomous navigation using spiking neuromorphic computers
US20180174042A1 (en) 2016-12-20 2018-06-21 Intel Corporation Supervised training and pattern matching techniques for neural networks
US10565500B2 (en) 2016-12-20 2020-02-18 Intel Corporation Unsupervised learning using neuromorphic computing
US11062203B2 (en) 2016-12-30 2021-07-13 Intel Corporation Neuromorphic computer with reconfigurable memory mapping for various neural network topologies
US11295204B2 (en) * 2017-01-06 2022-04-05 International Business Machines Corporation Area-efficient, reconfigurable, energy-efficient, speed-efficient neural network substrate
US11151441B2 (en) * 2017-02-08 2021-10-19 Brainchip, Inc. System and method for spontaneous machine learning and feature extraction
US10878313B2 (en) 2017-05-02 2020-12-29 Intel Corporation Post synaptic potential-based learning rule
US11080592B2 (en) * 2017-10-05 2021-08-03 International Business Machines Corporation Neuromorphic architecture for feature learning using a spiking neural network
US11195079B2 (en) * 2017-11-22 2021-12-07 Intel Corporation Reconfigurable neuro-synaptic cores for spiking neural network
US11568241B2 (en) 2017-12-19 2023-01-31 Intel Corporation Device, system and method for varying a synaptic weight with a phase differential of a spiking neural network
US20190228285A1 (en) * 2018-01-24 2019-07-25 The Regents Of The University Of Michigan Configurable Convolution Neural Network Processor
AU2019372063B2 (en) 2018-11-01 2022-12-01 Brainchip, Inc. An improved spiking neural network

Also Published As

Publication number Publication date
WO2020092691A1 (en) 2020-05-07
EP3874411A1 (en) 2021-09-08
KR20210098992A (en) 2021-08-11
US20200143229A1 (en) 2020-05-07
US11468299B2 (en) 2022-10-11
AU2021254524A1 (en) 2021-11-11
US11657257B2 (en) 2023-05-23
JP2022509754A (en) 2022-01-24
CN111417963A (en) 2020-07-14
CN113537471B (en) 2024-04-02
AU2019372063A1 (en) 2021-06-10
EP3874411A4 (en) 2022-08-03
AU2019372063B2 (en) 2022-12-01
CN111417963B (en) 2021-06-22
US20230026363A1 (en) 2023-01-26
AU2021254524B2 (en) 2022-07-28

Similar Documents

Publication Publication Date Title
CN111417963B (en) Improved spiking neural network
US11227210B2 (en) Event-based classification of features in a reconfigurable and temporally coded convolutional spiking neural network
KR20160034814A (en) Client device with neural network and system including the same
Shrestha et al. In-hardware learning of multilayer spiking neural networks on a neuromorphic processor
Ramasinghe et al. A context-aware capsule network for multi-label classification
Li et al. Performance analysis of fine-tune transferred deep learning
CN112598119B (en) On-chip storage compression method of neuromorphic processor facing liquid state machine
AU2022287647B2 (en) An improved spiking neural network
Schaefer et al. Memory organization for energy-efficient learning and inference in digital neuromorphic accelerators
Panda Learning and Design Methodologies for Efficient, Robust Neural Networks
AU2023214246A1 (en) Event-based extraction of features in a convolutional spiking neural network
Shrestha Inference and Learning in Spiking Neural Networks for Neuromorphic Systems
Xing Prime feature extraction in pyramid network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant