CN111656360A - System and method for sparsity utilization - Google Patents


Info

Publication number
CN111656360A
Authority
CN
China
Prior art keywords
integrated circuit
weight values
zero
multiplier
multipliers
Prior art date
Legal status
Granted
Application number
CN201880061175.XA
Other languages
Chinese (zh)
Other versions
CN111656360B
Inventor
K. F. Busch
J. H. Holleman III
P. Vorenkamp
S. W. Bailey
Current Assignee
Morita Co
Original Assignee
Morita Co
Priority date
Filing date
Publication date
Application filed by Morita Co filed Critical Morita Co
Priority to CN202410173562.XA priority Critical patent/CN117993448A/en
Priority claimed from PCT/US2018/043168 external-priority patent/WO2019018811A1/en
Publication of CN111656360A publication Critical patent/CN111656360A/en
Application granted granted Critical
Publication of CN111656360B publication Critical patent/CN111656360B/en
Legal status: Active
Anticipated expiration

Abstract

Disclosed is a neuromorphic integrated circuit that, in some embodiments, includes a multilayer neural network disposed in an analog multiplier array of two-quadrant multipliers. Each of the multipliers is wired to ground and draws a negligible amount of current when the input signal value at the multiplier's transistors is approximately zero, the weight value of the multiplier's transistors is approximately zero, or both. Also disclosed is a method of operating a neuromorphic integrated circuit, the method comprising, in some embodiments: training a neural network; tracking the rate of change of the weight values; determining whether, and how quickly, certain weight values are trending toward zero; and driving those weight values toward zero, thereby encouraging sparsity in the neural network. Sparsity in the neural network, combined with multipliers wired to ground, minimizes the power consumption of the neuromorphic integrated circuit, such that battery power is sufficient to operate it.

Description

System and method for sparsity utilization
Priority
This application claims priority to U.S. Provisional Patent Application No. 62/535,705, entitled "Systems and Methods for Sparsity Exploitation," filed July 21, 2017, and U.S. Patent Application No. 16/041,565, filed July 20, 2018, each of which is hereby incorporated by reference in its entirety.
Technical Field
Embodiments of the present disclosure relate to the field of neuromorphic computing. More particularly, embodiments of the present disclosure relate to systems and methods for encouraging sparsity in a neural network of a neuromorphic integrated circuit and minimizing power consumption of the neuromorphic integrated circuit.
Background
Conventional central processing units ("CPUs") process instructions based on "clocked time." Specifically, CPUs operate such that information is transmitted at regular intervals. Based on complementary metal-oxide-semiconductor ("CMOS") technology, silicon-based chips can be fabricated with more than five billion transistors per die, with features as small as 10 nm. Advances in CMOS technology have been successfully exploited in the evolution of parallel computing, which is used ubiquitously in personal computers and cellular phones containing multiple processors.
However, as machine learning is becoming prevalent for numerous applications including bioinformatics, computer vision, video games, marketing, medical diagnostics, online search engines, and the like, conventional CPUs often fail to supply a sufficient amount of processing power while maintaining low power consumption. In particular, machine learning is a sub-part of computer science, which is directed to software that has the ability to learn from and make predictions about data. Further, one branch of machine learning includes deep learning, which is directed to utilizing deep (multi-layer) neural networks.
Currently, research is being conducted to develop direct hardware implementations of deep neural networks, which may include systems that attempt to simulate neurons in silicon (e.g., "neuromorphic computing"). Neuromorphic chips (e.g., silicon computing chips designed for neuromorphic computing) operate by processing instructions in parallel (e.g., in contrast to conventional sequential computers) using bursts of electric current transmitted at non-uniform intervals. As a result, neuromorphic chips require far less power to process information, particularly artificial intelligence ("AI") algorithms. To accomplish this, a neuromorphic chip may contain as many as five times as many transistors as a conventional processor while consuming as little as 1/2000th of the power. Therefore, the development of neuromorphic chips is directed to providing chips with tremendous processing power that consume far less power than conventional processors. In addition, neuromorphic chips are designed to support dynamic learning in the context of complex and unstructured data.
There is a continuing need to develop neuromorphic chips with significant processing power that consume much less power than conventional processors. Provided herein are systems and methods for encouraging sparsity in a neural network of a neuromorphic chip and minimizing power consumption of the neuromorphic chip.
Disclosure of Invention
Disclosed herein is a neuromorphic integrated circuit that, in some embodiments, includes a multilayer neural network disposed in an analog multiplier array of a plurality of two-quadrant multipliers arranged in a memory sector of the neuromorphic integrated circuit. Each of the multipliers is wired to ground and draws a negligible amount of current when the input signal value of the input signal to the transistor of the multiplier is approximately zero, the weight value of the transistor of the multiplier is approximately zero, or a combination thereof. Sparsity in a neural network combined with a plurality of multipliers wired to ground minimizes power consumption of the neuromorphic integrated circuit.
In some embodiments, each of the multipliers draws no current when the input signal value of the input signal to the transistor of the multiplier is zero, the weight value of the transistor of the multiplier is zero, or a combination thereof.
In some embodiments, the weight values correspond to synaptic weight values set between neural nodes in a neural network in the neuromorphic integrated circuit.
In some embodiments, multiplying the input signal values by the weight values provides output signal values that are combined to arrive at a decision for the neural network.
In some embodiments, the transistors of the two-quadrant multiplier comprise metal oxide semiconductor field effect transistors ("MOSFETs").
In some embodiments, each of the two-quadrant multipliers has a differential structure configured to allow for programmed compensation for overshoot if either of the two cells is set to have a weight value that is greater than the target.
In some embodiments, the neuromorphic integrated circuit is configured for one or more application-specific standard products ("ASSPs") selected from keyword spotting, speaker identification, one or more audio filters, gesture recognition, image recognition, video object classification and segmentation, and autonomous vehicles including drones.
In some embodiments, the neuromorphic integrated circuit is configured to operate on battery power.
Also disclosed herein is a method of operating a neuromorphic integrated circuit, the method comprising, in some embodiments: training a multilayer neural network in an analog multiplier array of a plurality of two-quadrant multipliers disposed in a memory sector of the neuromorphic integrated circuit, and encouraging sparsity in the neural network during training. Each of the multipliers is wired to ground and draws a negligible amount of current when the input signal value of the input signal to the transistors of the multiplier is approximately zero, the weight values of the transistors of the multiplier are approximately zero, or a combination thereof. Encouraging sparsity in the neural network includes training with a training algorithm configured to drive a large number of input signal values, weight values, or a combination thereof toward zero across the multipliers, thereby enabling minimal power consumption of the neuromorphic integrated circuit.
In some embodiments, each of the multipliers draws no current when the input signal value of the input signal to the transistor of the multiplier is zero, the weight value of the transistor of the multiplier is zero, or a combination thereof.
In some embodiments, the method further comprises tracking the rate of change of the weight values of each of the multipliers during training, and determining whether certain weight values are trending toward zero, and how quickly those particular weight values are trending toward zero.
In some embodiments, the method further comprises, as part of encouraging sparsity in the neural network, driving toward zero those weight values that are trending toward zero during training.
In some embodiments, the weight values correspond to synaptic weight values between neural nodes in a neural network of the neuromorphic integrated circuit.
In some embodiments, the method further comprises incorporating the neuromorphic integrated circuit into one or more ASSPs selected from the group consisting of keyword spotting, speaker identification, one or more audio filters, gesture recognition, image recognition, video object classification and segmentation, and autonomous vehicles comprising drones.
In some embodiments, the neuromorphic integrated circuit is configured to operate on battery power.
Also disclosed herein is a method of operating a neuromorphic integrated circuit, the method comprising, in some embodiments: training a multilayer neural network in an analog multiplier array of a plurality of two-quadrant multipliers disposed in a memory sector of the neuromorphic integrated circuit; tracking the rate of change of the weight value of each of the multipliers during training; determining whether certain weight values are trending toward zero, and how quickly those weight values are trending toward zero; and, for those weight values that are trending toward zero, driving the weight values toward zero, thereby encouraging sparsity in the neural network. Each of the multipliers is wired to ground and draws a negligible amount of current when the input signal value of the input signal to the transistors of the multiplier is approximately zero, the weight values of the transistors of the multiplier are approximately zero, or a combination thereof.
In some embodiments, each of the multipliers draws no current when the input signal value of the input signal to the transistor of the multiplier is zero, the weight value of the transistor of the multiplier is zero, or a combination thereof.
In some embodiments, the method further comprises setting a subset of the weight values to zero prior to training the neural network, thereby further encouraging sparsity in the neural network.
In some embodiments, training utilizes a training algorithm configured to drive a large number of input signal values, weight values, or combinations thereof toward zero for a multiplier, thereby enabling minimal power consumption of the neuromorphic integrated circuit.
In some embodiments, the training encourages sparsity in the neural network by minimizing a cost function that includes the number of non-zero weight values.
In some embodiments, the method further comprises minimizing the cost function with an optimization function comprising gradient descent, backpropagation, or both gradient descent and backpropagation. A power-consumption estimate of the neuromorphic integrated circuit may be used as a component of the cost function.
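As an illustrative sketch of the cost function just described (all names and coefficients here are hypothetical choices of ours, not taken from the patent), the following Python fragment combines a task loss, a count of non-zero weight values, and a stand-in power-consumption estimate:

```python
def power_aware_cost(task_loss, weights, alpha=0.01, beta=0.001):
    """Illustrative cost: task loss, plus a penalty on the number of
    non-zero weights, plus a crude power-consumption estimate.
    `alpha` and `beta` are hypothetical trade-off coefficients."""
    nonzero_count = sum(1 for w in weights if w != 0.0)
    power_estimate = sum(abs(w) for w in weights)  # stand-in power model
    return task_loss + alpha * nonzero_count + beta * power_estimate

# A sparser weight vector yields a lower cost for the same task loss.
sparse_cost = power_aware_cost(1.0, [0.0, 0.5, 0.0])
dense_cost = power_aware_cost(1.0, [0.1, 0.5, -0.2])
```

Minimizing such a cost with gradient descent and backpropagation would, as the passage notes, push many weight values toward zero.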
In some embodiments, the weight values correspond to synaptic weight values between neural nodes in a neural network of the neuromorphic integrated circuit.
In some embodiments, the method further comprises incorporating the neuromorphic integrated circuit into one or more ASSPs selected from the group consisting of keyword spotting, speaker identification, one or more audio filters, gesture recognition, image recognition, video object classification and segmentation, and autonomous vehicles comprising drones.
In some embodiments, the neuromorphic integrated circuit is configured to operate on battery power.
Drawings
Embodiments of the present disclosure are illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings and in which like reference numerals refer to similar elements and in which:
Fig. 1 provides a schematic diagram illustrating a system 100 for designing and updating a neuromorphic integrated circuit ("IC"), according to some embodiments.
Fig. 2 provides a schematic diagram illustrating an analog multiplier array according to some embodiments.
Fig. 3 provides a schematic diagram illustrating an analog multiplier array according to some embodiments.
Fig. 4 provides a schematic diagram illustrating an unbiased, two-quadrant multiplier of an analog multiplier array, according to some embodiments.
Detailed Description
Terminology
In the following description, certain terminology is used to describe features of the invention. For example, in some cases, the term "logic" may represent hardware, firmware, and/or software configured to perform one or more functions. As hardware, logic may comprise circuitry with data processing or storage functionality. Examples of such circuitry may include, but are not limited or restricted to, a microprocessor, one or more processor cores, a programmable gate array, a microcontroller, a controller, an application-specific integrated circuit, wireless receiver, transmitter and/or transceiver circuitry, semiconductor memory, or combinational logic.
The term "process" may include an instance of a computer program (e.g., a set of instructions, which is also referred to herein as an application). In one embodiment, a process may include one or more threads executing concurrently (e.g., each thread may be executing the same or different instructions concurrently).
The term "processing" may include executing a binary or script, or launching an application in which an object is processed, where launching should be interpreted as placing the application in an open state, and in some implementations, performing an emulation of an action typical of human interaction with the application.
The term "object" generally refers to a collection of data, whether in transit (e.g., over a network) or stationary (e.g., stored), that often has a logical structure or organization that enables it to be classified or typed. Herein, the terms "binary file" and "binary" will be used interchangeably.
The term "file" is used broadly to refer to a group or collection of data, information, or other content for use with a computer program. A file may be accessed, opened, stored, manipulated, or otherwise processed as a single entity, object, or unit. The files may contain other files and may contain related or unrelated content or no content at all. The file may also have a logical format, or be part of a file system having a logical structure or organization of files in complex form. Files may have names (sometimes simply referred to as "file names") and often have additional attributes or other metadata. There are many types of files such as data files, text files, program files, and directory files. The file may be generated by a user of the computing device or generated by the computing device. Access to and/or operation on the file may be mediated by an operating system and/or one or more applications of the computing device. The file system may organize files of the computing devices of the storage device. The file system may enable tracking of files and access to those files. The file system may also enable operations on files. In some embodiments, operations on a file may include file creation, file modification, file opening, file reading, file writing, file closing, and file deletion.
Finally, the terms "or" and "and/or" as used herein are to be interpreted as inclusive, meaning any one or any combination. Thus, "A, B or C" or "A, B and/or C" means "any one of the following: A; B; C; A and B; A and C; B and C; A, B and C." An exception to this definition will occur only when a combination of elements, functions, steps, or acts is in some way inherently mutually exclusive.
Referring now to fig. 1, a schematic diagram illustrating a system 100 for designing and updating a neuromorphic IC is provided, in accordance with some embodiments. As shown, the system 100 may include a simulator 110 configured to design and update a neuromorphic IC (such as the neuromorphic IC 102), a neuromorphic synthesizer 120, and a cloud 130. As further shown, designing and updating the neuromorphic IC may include creating a machine-learning architecture with the simulator 110 based on a particular problem. The neuromorphic synthesizer 120 may then transform the machine-learning architecture into a netlist describing the electronic components of the neuromorphic IC 102 and the nodes to which the electronic components are connected. In addition, the neuromorphic synthesizer 120 may transform the machine-learning architecture into a graphic database system ("GDS") file detailing the IC layout of the neuromorphic IC 102. From the netlist and the GDS file for the neuromorphic IC 102, the neuromorphic IC 102 itself may be fabricated in accordance with current IC fabrication technology. Once the neuromorphic IC 102 is fabricated, it may be deployed to work on the particular problem for which it was designed. While the initially fabricated neuromorphic IC 102 may include initial firmware with customized synaptic weights between the nodes, the firmware may be updated as needed by the cloud 130 to adjust the weights. Because the cloud 130 is only needed to update the firmware, the cloud 130 is not required for day-to-day use of the neuromorphic IC 102.
Neuromorphic ICs, such as neuromorphic IC102, may be up to 100 times more energy efficient than graphics processing unit ("GPU") solutions, and up to 280 times more energy efficient than digital CMOS solutions, with an accuracy that meets or exceeds comparable software solutions. This makes such neuromorphic ICs suitable for battery-powered applications.
A neuromorphic IC such as the neuromorphic IC 102 may be configured for ASSPs including, but not limited to, keyword spotting, speaker identification, one or more audio filters, gesture recognition, image recognition, video object classification and segmentation, or autonomous vehicles including drones. For example, if the particular problem is one of keyword spotting, the simulator 110 may create a machine-learning architecture with respect to one or more aspects of keyword spotting. The neuromorphic synthesizer 120 may subsequently transform the machine-learning architecture into a netlist and a GDS file corresponding to a neuromorphic IC for keyword spotting, which may be fabricated in accordance with current IC fabrication technology. Once the neuromorphic IC for keyword spotting is fabricated, it may be deployed to work on keyword spotting in, for example, a system or device.
Neuromorphic ICs, such as neuromorphic IC102, may be deployed in toys, sensors, wearable devices, augmented reality ("AR") systems or devices, mobile systems or devices, appliances, internet of things ("IoT") devices, or audible devices.
Referring now to fig. 2, a schematic diagram illustrating an analog multiplier array 200 is provided, according to some embodiments. Such an analog multiplier array may be based on a digital NOR flash array, in that the core of the analog multiplier array may be similar or identical to the core of a digital NOR flash array. However, at least the selection and readout circuitry of the analog multiplier array differ from those of a digital NOR flash array. For example, the output current is routed as an analog signal to the next layer, rather than through a bit line to a sense amplifier/comparator to be converted into a bit. The word lines are driven by analog input signals rather than by a digital address decoder. Furthermore, the analog multiplier array 200 may be used in a neuromorphic IC such as the neuromorphic IC 102. For example, a neural network may be disposed in the analog multiplier array 200 in a memory sector of the neuromorphic IC.
Since the analog multiplier array 200 is an analog circuit, the input and output current values (or signal values) may vary in a continuous range, rather than simply being on or off. This is useful for storing weights (also known as coefficients) of the neural network, as opposed to digital bits. In operation, the weights are multiplied by the input current values to provide output current values that are combined to arrive at a decision for the neural network.
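The multiply-accumulate behavior described above can be sketched in Python as a purely illustrative model (function and variable names are ours, not the patent's): each stored weight scales an input value, and the per-output products are summed, with zero inputs contributing nothing.

```python
def mac_layer(inputs, weights):
    """Multiply each input value by each stored weight and sum per output.

    inputs:  list of input current values (continuous, not just on/off)
    weights: weights[i][j] is the coefficient from input i to output j
    """
    n_out = len(weights[0])
    outputs = [0.0] * n_out
    for i, x in enumerate(inputs):
        if x == 0.0:          # a zero input contributes nothing (and, in the
            continue          # analog array, the multiplier draws no current)
        for j in range(n_out):
            outputs[j] += x * weights[i][j]
    return outputs

# Example: two inputs (one of them zero), two outputs.
print(mac_layer([0.5, 0.0], [[1.0, -2.0], [3.0, 4.0]]))  # [0.5, -1.0]
```

The combined outputs then stand in for the values used to arrive at the network's decision.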
The analog multiplier array 200 may utilize standard program and erase circuitry to generate the tunneling and erase voltages.
Referring now to fig. 3, a schematic diagram illustrating an analog multiplier array 300 is provided, according to some embodiments. The analog multiplier array 300 may use two transistors (e.g., a positive metal-oxide-semiconductor field-effect transistor ["MOSFET"] and a negative MOSFET) to perform two-quadrant multiplication of a signed weight (e.g., a positive weight or a negative weight) and a non-negative input current value. If an input current value is multiplied by a positive or a negative weight, the product or output current value may be positive or negative accordingly. The positively weighted products may be stored in a first column of the analog multiplier array 300, and the negatively weighted products may be stored in a second column of the analog multiplier array 300. The positively weighted and negatively weighted products or output signal values may be taken together as a differential current value to provide useful information for making a decision.
Because each output current from the positive or negative transistor is wired to ground and is proportional to the product of the input current value and the positive or negative weight, respectively, the power consumption of the positive or negative transistor approaches zero when the input current value or the weight is at or near zero. That is, if the input signal value is zero, or if the weight value is zero, the corresponding transistor of the analog multiplier array 300 consumes no power. This is significant because, in many neural networks, most of the input values or weights are zero, especially after training. Therefore, energy is saved when there is nothing to be done or nothing occurring. This differs from a differential-pair-based multiplier, which consumes a constant current (e.g., by way of a tail bias current) regardless of the input signal.
Referring now to fig. 4, a schematic diagram illustrating an unbiased, two-quadrant multiplier 400 of an analog multiplier array (such as the analog multiplier array 300) is provided, according to some embodiments. As previously set forth, because each output current from the positive transistor (e.g., M1 of the two-quadrant multiplier 400) or the negative transistor (e.g., M2 of the two-quadrant multiplier 400) is proportional to the product of the input current value and the positive or negative weight, respectively, the power consumption of the positive or negative transistor approaches zero (or is zero) when the input current value or the weight approaches zero (or is zero). This differs from a differential-pair-based multiplier, which consumes a constant current (e.g., by way of a tail bias current) regardless of the input signal.
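This contrast can be made concrete with a toy numeric model (the power figures, names, and the unit tail bias current are illustrative assumptions, not circuit measurements):

```python
def grounded_multiplier_power(x, w):
    # In the grounded two-quadrant multiplier, the drawn power in this toy
    # model scales with the product, so it is zero whenever the input or
    # the weight is zero.
    return abs(x * w)

def diff_pair_power(x, w, tail_bias=1.0):
    # A differential-pair multiplier sinks its tail bias current
    # regardless of the input signal, burning power even for zero inputs.
    return tail_bias

inputs = [0.0, 0.0, 0.0, 0.7]   # sparse activations (mostly zero)
weight = 0.5
print(sum(grounded_multiplier_power(x, weight) for x in inputs))  # 0.35
print(sum(diff_pair_power(x, weight) for x in inputs))            # 4.0
```

With sparse activations, the grounded multipliers spend power only on the single non-zero input, while the differential pairs pay the tail-bias cost on every multiplier.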
When sparsity (many zeros) is encouraged in a neural network composed of such unbiased two-quadrant multipliers via training, substantial power savings can be achieved. That is, the neural network in the analog multiplier array of the plurality of two-quadrant multipliers disposed in the memory sector of the neuromorphic integrated circuit may be trained to encourage sparsity in the neural network, thereby minimizing power consumption of the neuromorphic IC. The subset of weight values may even be set to zero before the neural network is trained, further encouraging sparsity in the neural network and minimizing power consumption of the neuromorphic IC. In fact, the power consumption of the neuromorphic IC may be minimized so that the neuromorphic IC may operate on battery power.
Training the neural network may include training with a training algorithm configured to drive a large number of input current values, weight values, or a combination thereof toward zero across the plurality of multipliers, thereby encouraging sparsity in the neural network and minimizing the power consumption of the neuromorphic IC. The training may be iterative, and the weight values may be adjusted with each iteration of the training. The training algorithm may be further configured to track the rate of change of the weight value of each of the plurality of multipliers in order to drive the weight values toward zero. The rate of change of the weight values may be used to determine whether certain weight values are trending toward zero and how quickly they are trending toward zero, which may be used during training to drive those weight values toward zero more quickly, such as by programming them to approximately zero or to zero. Further, training the neural network and encouraging sparsity therein may include minimizing a cost function that includes the number of non-zero weight values. Minimizing the cost function may include using an optimization function comprising gradient descent, backpropagation, or both. A power-consumption estimate of the neuromorphic integrated circuit may be used as a component of the cost function.
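A minimal sketch of the tracking-and-snapping behavior described above, assuming a simple gradient update (the threshold value and all names are our own illustrative choices, not from the patent):

```python
def sparsify_step(weights, grads, prev_weights, lr=0.01, snap=0.05):
    """One hypothetical training iteration: apply a gradient update,
    check whether each weight's magnitude is shrinking across iterations
    (i.e., trending toward zero), and drive small, shrinking weights all
    the way to zero. The `snap` threshold is illustrative."""
    updated = []
    for w, g, w_prev in zip(weights, grads, prev_weights):
        w_new = w - lr * g                        # ordinary gradient-descent step
        trending_to_zero = abs(w_new) < abs(w_prev)
        if trending_to_zero and abs(w_new) < snap:
            w_new = 0.0                           # program the cell to zero weight
        updated.append(w_new)
    return updated

# Example: the first weight is small and shrinking, so it is snapped to zero.
print(sparsify_step([0.04, 0.5], [0.0, 0.0], [0.06, 0.5]))  # [0.0, 0.5]
```

Snapping such weights to exactly zero is what lets the grounded multipliers draw no current for those cells.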
When programming a two-quadrant multiplier such as the unbiased, two-quadrant multiplier 400, it is common to erase each of its programmable cells (e.g., the cell including transistor M1 and the cell including transistor M2) to set the cells to one extreme weight value before setting each cell to its target weight value. Extended to a complete array, such as the analog multiplier array 300, all of the programmable cells in the array are set to one extreme weight value before each cell is set to its target weight value. When setting the cells to their target weight values, an overshoot problem arises if one or more of the cells is programmed to a weight value higher than its target: ordinarily, all of the cells in the array would have to be reset to the extreme weight value before being reprogrammed to their target weight values. However, the differential structure of each unbiased, two-quadrant multiplier of the analog multiplier arrays provided herein allows such overshoot to be compensated for by further programming, thereby avoiding the time-consuming process of erasing and resetting all of the cells in the array.
In an example of compensating for overshoot by programming, both programmable cells of the two-quadrant multiplier 400 may be erased to set each cell to an extreme weight value. After erasing the cells, if the first cell is programmed to an excessive weight value, the second cell can be programmed to a weight value greater than its initial target to compensate for the first cell's overshoot and achieve the initially targeted effective weight. Thus, the differential structure can be utilized to compensate for programming overshoot without having to erase any one or more of the cells and start over.
The foregoing systems and methods encourage sparsity in the neural network of a neuromorphic IC and minimize the power consumption of the neuromorphic IC, such that the neuromorphic IC can operate on battery power.
In the foregoing specification, the invention has been described with reference to specific exemplary embodiments thereof. It will, however, be evident that various modifications and changes may be made thereto without departing from the broader spirit and scope of the invention as set forth in the appended claims.

Claims (20)

1. A neuromorphic integrated circuit comprising:
a multi-layer neural network disposed in an analog multiplier array of a plurality of two-quadrant multipliers arranged in a memory sector of the neuromorphic integrated circuit,
wherein each of the multipliers is wired to ground and draws a negligible amount of current when an input signal value of an input signal to the transistors of the multiplier is approximately zero, a weight value of the transistors of the multiplier is approximately zero, or a combination thereof, and
wherein sparsity in a neural network combined with a plurality of multipliers wired to ground minimizes power consumption of the neuromorphic integrated circuit.
2. The neuromorphic integrated circuit of claim 1, wherein each of the multipliers draws no current when an input signal value of an input signal to a transistor of the multiplier is zero, a weight value of a transistor of the multiplier is zero, or a combination thereof.
3. The neuromorphic integrated circuit of claim 1, wherein the weight values correspond to synaptic weight values between neural nodes in a neural network disposed in the neuromorphic integrated circuit.
4. The neuromorphic integrated circuit of claim 3, wherein the multiplication of the input signal values by the weight values provides output signal values that are combined to arrive at a decision for the neural network.
5. The neuromorphic integrated circuit of claim 1, wherein the transistors of the two-quadrant multiplier comprise metal-oxide-semiconductor field-effect transistors ("MOSFETs").
6. The neuromorphic integrated circuit of claim 1, wherein each of the two-quadrant multipliers has a differential structure configured to allow for programmed compensation for overshoot if either of the two cells is set to have a weight value greater than a target value.
7. The neuromorphic integrated circuit of claim 1, wherein the neuromorphic integrated circuit is configured for one or more application-specific standard products ("ASSPs") selected from keyword spotting, speaker identification, one or more audio filters, gesture recognition, image recognition, video object classification and segmentation, and autonomous vehicles including drones.
8. The neuromorphic integrated circuit of claim 1, wherein the neuromorphic integrated circuit is configured for operation on battery power.
9. A method of operating a neuromorphic integrated circuit, the method comprising:
training a multi-layer neural network in an analog multiplier array of a plurality of two-quadrant multipliers disposed in a memory sector of the neuromorphic integrated circuit;
wherein each of the multipliers is wired to ground and draws a negligible amount of current when the input signal value of the input signal to the transistors of the multiplier is approximately zero, the weight values of the transistors of the multiplier are approximately zero, or a combination thereof; and
encouraging sparsity in the neural network by training with a training algorithm configured to drive a large number of input signal values, weight values, or combinations thereof toward zero for the multipliers, thereby enabling minimal power consumption of the neuromorphic integrated circuit.
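One conventional way to realize a training algorithm of this kind (a sketch, not the claimed method itself) is an L1-regularized gradient update, whose proximal "soft-thresholding" step sets small weights exactly to zero. The function name and the hyperparameters `lr` and `l1` below are illustrative assumptions:

```python
import numpy as np

def train_step_l1(weights, grad, lr=0.1, l1=0.05):
    """One gradient-descent step with an L1 penalty that drives a large
    number of weight values toward (and exactly to) zero.

    lr (learning rate) and l1 (penalty strength) are illustrative; the
    claims do not specify particular hyperparameters."""
    w = weights - lr * grad          # ordinary step on the task loss
    # Soft-thresholding (the proximal operator of the L1 penalty): any
    # weight whose magnitude falls below lr*l1 becomes exactly zero,
    # which is what produces sparsity in the trained network.
    return np.sign(w) * np.maximum(np.abs(w) - lr * l1, 0.0)

w = np.array([0.004, -0.8, 0.002, 0.6])
g = np.zeros_like(w)                 # zero task gradient, for illustration
w_new = train_step_l1(w, g)          # the two small weights become exactly 0
```

Repeating this step over many training iterations drives the weights that matter least to exactly zero while only slightly shrinking the large ones.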
10. The method of claim 9, wherein each of the multipliers draws no current when the input signal value of the input signal to the transistors of the multipliers is zero, the weight values of the transistors of the multipliers are zero, or a combination thereof.
11. The method of claim 9, further comprising:
tracking a rate of change of the weight value of each of the multipliers during training; and
determining whether certain weight values are trending toward zero, and how quickly those weight values are trending toward zero.
12. The method of claim 9, further comprising:
driving toward zero those weight values that are trending toward zero during training, as part of encouraging sparsity in the neural network.
13. The method of claim 9, wherein weight values correspond to synaptic weight values between neural nodes in a neural network of the neuromorphic integrated circuit.
14. A method of operating a neuromorphic integrated circuit, comprising:
training a multi-layer neural network in an analog multiplier array of a plurality of two-quadrant multipliers disposed in a memory sector of the neuromorphic integrated circuit;
wherein each of the multipliers is wired to ground and draws a negligible amount of current when the input signal value of the input signal to the transistors of the multiplier is approximately zero, the weight values of the transistors of the multiplier are approximately zero, or a combination thereof;
tracking a rate of change of the weight value of each of the multipliers during training;
determining whether certain weight values are trending toward zero, and how quickly those weight values are trending toward zero; and
driving toward zero those weight values that are trending toward zero, thereby encouraging sparsity in the neural network.
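The tracking, determining, and driving steps of claim 14 can be sketched as follows. This is an illustrative reading only: the thresholds `zero_band` and `min_rate`, and all names, are assumptions, since the claim leaves the exact criterion for "trending toward zero" open:

```python
import numpy as np

def prune_trending_weights(weight_history, zero_band=0.05, min_rate=1e-3):
    """Given a (training steps x weights) record of weight values, return
    the final weights with every weight that is both small and shrinking
    driven to exactly zero.

    zero_band and min_rate are illustrative thresholds; the claim does not
    fix a particular criterion for 'trending toward zero'."""
    hist = np.asarray(weight_history, dtype=float)
    final = hist[-1].copy()
    # Rate of change of each weight's magnitude, averaged over training.
    rate = np.diff(np.abs(hist), axis=0).mean(axis=0)
    # 'Trending toward zero' = small magnitude and steadily decreasing.
    trending = (np.abs(final) < zero_band) & (rate < -min_rate)
    final[trending] = 0.0            # drive those weight values to zero
    return final, trending

# Weight 0 shrinks toward zero across training; weight 1 grows slightly.
history = [[0.10, 0.50], [0.06, 0.51], [0.03, 0.52]]
pruned, trending = prune_trending_weights(history)
```

Only the weight that was both small and shrinking is zeroed; the growing weight is left untouched, so accuracy-critical weights survive the pruning.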
15. The method of claim 14, wherein each of the multipliers draws no current when the input signal value of the input signal to the transistors of the multipliers is zero, the weight values of the transistors of the multipliers are zero, or a combination thereof.
16. The method of claim 14, further comprising:
setting a subset of the weight values to zero prior to training the neural network, thereby further encouraging sparsity in the neural network.
17. The method of claim 14, wherein training utilizes a training algorithm configured to drive a large number of input signal values, weight values, or combinations thereof toward zero for a multiplier, thereby enabling minimal power consumption of the neuromorphic integrated circuit.
18. The method of claim 14, wherein training encourages sparsity in the neural network by minimizing a cost function, the cost function comprising a count of non-zero weight values among the weight values.
19. The method of claim 14, further comprising:
minimizing a cost function with an optimization function comprising gradient descent, backpropagation, or both gradient descent and backpropagation;
wherein a power consumption estimate of the neuromorphic integrated circuit is used as a component of the cost function.
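A minimal sketch of a cost function of the kind recited in claims 18 and 19, using the count of non-zero weight values as a crude power-consumption estimate (each non-zero weight stands in for one multiplier that draws current). The coefficient `power_coeff` and all names are assumptions; in practice a differentiable surrogate such as an L1 term would replace the raw count so that gradient descent and backpropagation can minimize it:

```python
import numpy as np

def power_estimate(weights, eps=1e-6):
    """Power-consumption estimate: the count of non-zero weight values.
    A multiplier with a zero weight draws no current, so each non-zero
    weight stands in for one powered multiplier; eps defines 'non-zero'
    for floating-point weights."""
    return int(np.sum(np.abs(np.asarray(weights, dtype=float)) > eps))

def total_cost(weights, task_loss, power_coeff=0.01):
    """Cost function with a power-consumption estimate as one component.
    power_coeff trades accuracy against power and is an assumed value."""
    return task_loss + power_coeff * power_estimate(weights)

dense = [0.3, -0.2, 0.1, 0.4]
sparse = [0.3, 0.0, 0.0, 0.4]
# With equal task loss, the sparser weight set has strictly lower cost,
# so minimizing this cost function favors sparse networks.
```

Because the power term grows with every non-zero weight, an optimizer minimizing this cost is pushed toward solutions in which as many multipliers as possible are inactive.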
20. The method of claim 14, further comprising:
incorporating the neuromorphic integrated circuit into one or more application-specific standard products ("ASSPs") selected from keyword spotting, speaker identification, one or more audio filters, gesture recognition, image recognition, video object classification and segmentation, and autonomous vehicles including drones.
CN201880061175.XA 2017-07-21 2018-07-20 System and method for sparsity utilization Active CN111656360B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410173562.XA CN117993448A (en) 2017-07-21 2018-07-20 System and method for sparsity utilization

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US201762535705P 2017-07-21 2017-07-21
US62/535705 2017-07-21
PCT/US2018/043168 WO2019018811A1 (en) 2017-07-21 2018-07-20 Systems and methods of sparsity exploiting

Related Child Applications (1)

Application Number Title Priority Date Filing Date
CN202410173562.XA Division CN117993448A (en) 2017-07-21 2018-07-20 System and method for sparsity utilization

Publications (2)

Publication Number Publication Date
CN111656360A true CN111656360A (en) 2020-09-11
CN111656360B CN111656360B (en) 2024-02-20

Family

ID=72348817

Family Applications (2)

Application Number Title Priority Date Filing Date
CN201880061175.XA Active CN111656360B (en) 2017-07-21 2018-07-20 System and method for sparsity utilization
CN202410173562.XA Pending CN117993448A (en) 2017-07-21 2018-07-20 System and method for sparsity utilization

Country Status (1)

Country Link
CN (2) CN111656360B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5336937A (en) * 1992-08-28 1994-08-09 State University Of New York Programmable analog synapse and neural networks incorporating same
CN106529670A (en) * 2016-10-27 2017-03-22 中国科学院计算技术研究所 Neural network processor based on weight compression, design method, and chip
CN106650924A (en) * 2016-10-27 2017-05-10 中国科学院计算技术研究所 Processor based on time dimension and space dimension data flow compression and design method
CN106796668A (en) * 2016-03-16 2017-05-31 香港应用科技研究院有限公司 For the method and system that bit-depth in artificial neural network is reduced

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Brandon Reagen et al., "Minerva: Enabling Low-Power, Highly-Accurate Deep Neural Network Accelerators", 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA)
Yu-Hsin Chen et al., "Eyeriss: A Spatial Architecture for Energy-Efficient Dataflow for Convolutional Neural Networks", 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA)

Also Published As

Publication number Publication date
CN111656360B (en) 2024-02-20
CN117993448A (en) 2024-05-07

Similar Documents

Publication Publication Date Title
CN111742330B (en) Always-on keyword detector
US11373091B2 (en) Systems and methods for customizing neural networks
US20220157384A1 (en) Pulse-Width Modulated Multiplier
US11868876B2 (en) Systems and methods for sparsity exploiting
US20220414439A1 (en) Neuromorphic Synthesizer
CN111164687B (en) Digitally supported flash memory refresh
US20190065962A1 (en) Systems And Methods For Determining Circuit-Level Effects On Classifier Accuracy
CN111656363A (en) Microcontroller interface for audio signal processing
US20190026629A1 (en) Systems and Methods for Overshoot Compensation
WO2020243922A1 (en) Automatic machine learning policy network for parametric binary neural networks
CN111527502B (en) System and method for partial digital retraining
CN111656360B (en) System and method for sparsity utilization

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant