US20230206077A1 - Analog learning engine and method - Google Patents

Analog learning engine and method Download PDF

Info

Publication number
US20230206077A1
Authority
US
United States
Prior art keywords
neural network
charge
error
input
generation mechanism
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
US18/107,082
Inventor
David Schie
Sergey Gaitukevich
Peter Drabos
Andreas Sibrai
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Aistorm Inc
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual filed Critical Individual
Priority to US18/107,082
Assigned to AISTORM INC. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: DRABOS, Peter; GAITUKEVICH, Sergey; SCHIE, David; SIBRAI, Andreas
Publication of US20230206077A1
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/048Activation functions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/06Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
    • G06N3/065Analogue means
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks


Abstract

A neural network learning mechanism has a device which perturbs analog neurons to measure an error which results from perturbations at different points within the neural network and modifies weights and biases to converge to a target.

Description

    RELATED APPLICATIONS
  • This patent application is related to U.S. Provisional Application No. 62/663,125 filed Apr. 26, 2018, entitled “ANALOG LEARNING ENGINE” in the name of David Schie, and which is incorporated herein by reference in its entirety. The present patent application claims the benefit under 35 U.S.C. § 119(e).
  • TECHNICAL FIELD
  • The present invention generally relates to an analog mathematical computing device and, more particularly to, a device which is capable of performing machine based learning.
  • BACKGROUND
  • Machine learning is an application of artificial intelligence (AI) that provides systems the ability to automatically learn and improve from experience without being explicitly programmed. Machine learning focuses on the development of systems that can access data and use it to learn for themselves.
  • Digital machine learning is a cumbersome process: the number of neurons in a neural network can be large, and the calculation of derivatives and weight adjustments required to follow an error contour may represent an enormous number of calculations. The calculation of derivatives is difficult for digital systems, and the generation of an error contour, which requires the calculation of a chain of derivatives, can take significant time and mathematical computing power.
  • Therefore, it would be desirable to provide a system and method that overcome the above problems.
  • SUMMARY
  • In accordance with one embodiment, a neural network error contour generation mechanism is disclosed. The neural network error contour generation mechanism has a device which perturbs weights & biases associated with analog neurons to measure an error which results from perturbations at different points within the neural network.
  • In accordance with one embodiment, a backpropagation mechanism is disclosed. The backpropagation mechanism has a neural network error contour generation mechanism comprising a device which perturbs weights & biases associated with analog neurons to measure an error which results from perturbations at different points within the neural network. A set of m mini-batches of n training samples are inputted to the neural network error contour generation mechanism, wherein n is the number of training examples and m the number of mini-batches of training examples. In one type of machine learning, each weight and/or bias in the neural network may be modified according to an average of an error function multiplied by the derivatives of local activation to move towards a target.
  • In accordance with one embodiment, a backpropagation mechanism is disclosed. The backpropagation mechanism has an error contour comprising a change in a neural network resulting from perturbation of each weighted sum. A set of m mini-batches of n training samples are inputted to the neural network error contour generation mechanism, wherein n is the number of training examples and m the number of mini-batches of training examples. Each weight in the neural network is modified according to an average of an error function multiplied by information related to its local activation to move towards a target.
  • In accordance with one embodiment, a weight tuning circuit is disclosed. The weight tuning circuit has one of a gated current or charge input representing an input value to match. A ΣΔ modulator using two switched charge reservoirs in an inverter configuration is provided (i.e., an integrator with gain coupled to a switched charge reference). An output of the ΣΔ modulator adjusts current sources feeding a node (which controls a weight current) between the two switched charge reservoirs against a comparator reference voltage to increase accuracy using oversampling.
  • In accordance with one embodiment, a weight tuning circuit is disclosed. The weight tuning circuit has one of a gated current or charge input representing an input value to match. A ΣΔ modulator using two switched charge reservoirs in an inverter configuration is provided. A current representing a weight is subtracted from a node between the two switched charge reservoirs. A resulting integrated value is compared to a comparator reference to generate an average accurate value over multiple cycles.
  • In accordance with one embodiment a charge domain implementation of the four equations of backpropagation is implemented. The weights and biases are adjusted according to calculations made in the charge domain utilizing charge domain multipliers and memory storage devices.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • The present application is further detailed with respect to the following drawings. These figures are not intended to limit the scope of the present application but rather illustrate certain attributes thereof. The same reference numbers will be used throughout the drawings to refer to the same or like parts.
  • FIG. 1 is a block diagram showing an exemplary embodiment of a neural network architecture that relies upon a multiplier within each connection in accordance with one aspect of the present application;
  • FIG. 2 is block diagram showing an exemplary embodiment of details of the neural network architecture of FIG. 1 in accordance with one aspect of the present application;
  • FIG. 3 is block diagram showing an exemplary embodiment of details of the neural network architecture of FIG. 1 in accordance with one aspect of the present application;
  • FIG. 4 is a block diagram showing an exemplary embodiment of an analog multiplier cell in accordance with one aspect of the present application;
  • FIG. 5 is a block diagram showing an exemplary embodiment of a multiply and accumulate circuit in accordance with one aspect of the present application;
  • FIG. 5A is a block diagram showing an exemplary embodiment of a systolic multiply and accumulate circuit in accordance with one aspect of the present application;
  • FIG. 6 depicts an exemplary embodiment of a depleted junction with minimized junction overlap capacitance in accordance with one aspect of the present application;
  • FIG. 7 depicts an exemplary embodiment of a time view of a neuron network cycle in accordance with one aspect of the present application;
  • FIG. 8 is a block diagram showing an exemplary embodiment of a short-term memory cell in accordance with one aspect of the present application;
  • FIG. 9 is a block diagram showing an exemplary embodiment of a decision circuit in accordance with one aspect of the present application;
  • FIG. 10 shows an exemplary embodiment of equations used for backpropagation in accordance with one aspect of the present application;
  • FIG. 11 shows an exemplary embodiment of an equation used for cost function in accordance with one aspect of the present application;
  • FIG. 12 shows an exemplary embodiment of equations used for error calculations in accordance with one aspect of the present application;
  • FIG. 13 is a block diagram showing an exemplary embodiment of a short term memory cell with pulse adjustment in accordance with one aspect of the present application;
  • FIG. 14 is a timing diagram showing an exemplary embodiment of a short term memory cell output having its control register and therefore magnitude increased by positive pulses in accordance with one aspect of the present application;
  • FIG. 15 shows an exemplary embodiment of equations used to develop an output layer error contour in accordance with one aspect of the present application;
  • FIG. 16 is a block diagram showing an exemplary embodiment of a circuit for generating a curl of cost function in accordance with one aspect of the present application;
  • FIG. 17 is a block diagram showing an exemplary embodiment of a circuit for generation of σ′ (z) and (a-y) in accordance with one aspect of the present application;
  • FIG. 18 is a block diagram showing an exemplary embodiment of a ΔΣ MOD1 converter in accordance with one aspect of the present application;
  • FIG. 19 is a block diagram showing an exemplary embodiment of a ZCELL as a MOD1 ΔΣ feedback in accordance with one aspect of the present application;
  • FIG. 20 is a block diagram showing an exemplary embodiment of a ZCELL as a MOD1 ΔΣ feedback with inverter in accordance with one aspect of the present application; and
  • FIG. 21 is a block diagram showing an exemplary embodiment of a ZCELL as a MOD1 ΔΣ current mode feedback in accordance with one aspect of the present application.
  • FIG. 22 is a charge mode multiply and add circuit or “memory cell” which may be used to temporarily store one or more charges or to perform addition or to temporarily store information for later distribution;
  • FIG. 23 is an example of a multiple input multiply and add circuit accepting pulse inputs, current source weights and further incorporating a ReLU decision circuit;
  • FIG. 24 shows an example of weighted inputs where charge may be added or removed to implement a subtract function.
  • DESCRIPTION OF THE APPLICATION
  • The description set forth below in connection with the appended drawings is intended as a description of presently preferred embodiments of the disclosure and is not intended to represent the only forms in which the present disclosure may be constructed and/or utilized. The description sets forth the functions and the sequence of steps for constructing and operating the disclosure in connection with the illustrated embodiments. It is to be understood, however, that the same or equivalent functions and sequences may be accomplished by different embodiments that are also intended to be encompassed within the spirit and scope of this disclosure.
  • Referring to FIG. 1 , a neural network 10 may be seen. In the neural network 10, the circles 12 are neurons having input data. The lines 14 are multipliers which multiply the input data by a weight (w). The result may be fed to a decision circuit or to subsequent layers, and that output in turn fed to the next layer. As each neuron, containing a summer of the weighted inputs and in some cases a decision circuit, may be connected to many neurons in the following layers, the number of weights can be very large. Although a simple view of the neural network is shown in FIG. 1 , functions such as pooling (decimation), expansion, decisions, convolution and other functions may be constructed from these basic elements using the same or related techniques.
  • FIGS. 2 and 3 show additional details of the neural network 10 (FIG. 1 ). For a given neuron there are multiple inputs from a previous layer, except for the input layer, where the input comes from an external source. The output of each layer is the activation, a. For a given neuron, each of the activations from the previous layer is multiplied by a weight and modified by a fixed bias (if any), and the result, z, is fed to a decision circuit to create the activation for the next stage.
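  • As a numerical point of reference for the analog circuits that follow, a minimal sketch of this forward pass (weighted sum z = w·a + b fed to a decision function) is given below. The layer sizes, random values and sigmoid decision function are illustrative assumptions, not taken from the figures.

```python
import numpy as np

def sigma(z):
    # Example decision (activation) function; the circuits described later
    # can also implement ReLU or ramp functions.
    return 1.0 / (1.0 + np.exp(-z))

def forward(a, weights, biases):
    """Propagate an input activation vector a through the layers.

    Each layer computes z = w @ a + b and feeds z to the decision
    function to produce the activation for the next stage.
    """
    activations, zs = [a], []
    for w, b in zip(weights, biases):
        z = w @ a + b
        a = sigma(z)
        zs.append(z)
        activations.append(a)
    return zs, activations

# Illustrative 3-2-1 network (sizes and values are assumptions).
rng = np.random.default_rng(0)
weights = [rng.standard_normal((2, 3)), rng.standard_normal((1, 2))]
biases = [rng.standard_normal(2), rng.standard_normal(1)]
zs, acts = forward(np.array([0.5, -0.2, 0.8]), weights, biases)
print(acts[-1])   # network output activation
```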
  • A neural network operates on mathematics and as such can be implemented in a number of competing ways. In general, the different methods focus upon optimizing the movement of information, and on the performance and efficiency of the multiplication. One option for a neural network multiplier is an analog multiplier based upon a switched charge approach such as that shown in FIG. 4 and disclosed in co-pending application having a Ser. No. 16/291,311, a filing date of Mar. 4, 2019 and which is incorporated in its entirety herewith. This multiplier may be extended to a serial multiplier and summer such as the one shown in FIG. 5 , or may be extended to a parallel/serial systolic summer which can multiply and sum multiple operands simultaneously and implement systolic functions as shown in FIG. 5A , disclosed in co-pending application filed concurrently herewith, in the name of the same inventor as the present application and incorporated in its entirety herewith. The capacitors used in these multipliers may be made extremely small through the use of depleted junction transfer gates as illustrated in FIG. 6 , disclosed in co-pending application having a Ser. No. 16/291,311, a filing date of Mar. 4, 2019 and which is incorporated in its entirety herewith.
  • The analog multipliers described above operate very quickly. FIG. 7 illustrates that during a given cycle, the multipliers and other neuron circuitry are active only for a very short portion of the cycle. During the rest of the cycle the circuitry may be re-used for learning or be off, while returning to execution during the required portions of the cycle. This is a particularly useful aspect of the analog switched charge implementation of a neural network.
  • To utilize circuitry during the off time it may be necessary to save current information. To do this, a short-term memory cell 80 may be used as shown in FIG. 8 . This short-term memory cell 80 might retain at least 8-bit accuracy for 10's of ms to 1000's of ms depending upon process and leakage. The short-term memory cell 80 may accept currents and temporarily store their value.
  • For example, an existing value could be introduced with the switch in series with “training updated current” switched on. Assuming the sample and hold transistor MNs is on, the current will be mirrored through the mirror pair MN5/MN1 and MN30. MN1 will send this current to MP1 which will mirror it to MP2 and finally it will flow through the diode connected MP4. MP1 also mirrors the information to MP3. Now we can open the “training updated current” switch and close the switch in series with MP3. This will equalize all the junctions on MNs such that the leakage current will be extremely small. Now we can open MNs and the Vgs required to maintain the stored voltage will be stored upon the gate capacitor C1, which could be the parasitic capacitance of MN1. This circuitry may be replaced with charge domain circuitry such that MNs is an optimized charge domain transfer gate.
  • Now that we have a means by which to store information, we need a decision circuit which provides activations as current information. One way to translate the activations to current is through the decision circuit. FIG. 9 shows a four quadrant transconductor 90 that could be used as such a decision circuit, accepting voltage input and outputting current which can be integrated by the next input stage. This four quadrant transconductor 90 also is compatible with time based charge transfer which can be normalized for errors. The four quadrant transconductor 90 can implement multiple functions such as sigmoid, ReLU or ramp. Those skilled in the art will understand that a current based storage such as that shown is only one way by which information might be stored temporarily and by which information might be transferred between neuron stages.
  • FIG. 10 shows an example of a learning algorithm, backpropagation. Backpropagation is a method used in AI networks to calculate a gradient that is needed in the calculation of the weights and biases to be used in the AI network. Backpropagation is shorthand for “the backward propagation of errors,” since an error is computed at the output and distributed backwards throughout the network's layers. It is commonly used to train deep neural networks.
  • FIG. 11 illustrates a cost function. A cost function measures whether we are converging to a target y(x) or diverging so that we can decide whether to keep adjusting weights in a given direction or not. In this case the quadratic cost function is measuring the square of the difference between a target and the output of the neural network for a given training example x and averaging these over n training examples.
  • Choosing the quadratic function allows one to simplify the error function as shown in FIG. 12 . The curl of the cost function becomes simply the difference between the neural network activation output and the target. BP2 (FIG. 10 ) illustrates how we can create the remainder of the error function using the chain rule. In other words, one can calculate the error of the previous layer by moving the error backwards through the network. Eventually one may end up with an error contour according to BP4 (FIG. 10 ), which allows one to express the rate of change of the cost function against the change in the weight connecting the jth neuron of the lth layer to the kth neuron of the (l−1)th layer.
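  • Since FIGS. 10 and 11 are not reproduced here, the quadratic cost and the four backpropagation equations the text refers to as BP1-BP4 are restated below in their commonly used form; these are assumed to match the figures, with layer superscripts and ⊙ denoting the elementwise product.

```latex
% Quadratic cost over n training examples (standard form of FIG. 11)
C = \frac{1}{2n} \sum_x \lVert y(x) - a^{L}(x) \rVert^2

% BP1: output-layer error
\delta^{L} = \nabla_a C \odot \sigma'(z^{L})

% BP2: error propagated backwards through the layers
\delta^{l} = \left( (w^{l+1})^{T} \delta^{l+1} \right) \odot \sigma'(z^{l})

% BP3: rate of change of the cost with respect to a bias
\frac{\partial C}{\partial b^{l}_{j}} = \delta^{l}_{j}

% BP4: rate of change of the cost with respect to a weight (the error contour)
\frac{\partial C}{\partial w^{l}_{jk}} = a^{l-1}_{k}\,\delta^{l}_{j}
```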
  • Propagating this error backwards through the network, one can calculate the error function of previous layers. For example, we can calculate the error caused by the first neuron in the second layer, δ_1^2, by multiplying δ_1^3 by σ′(z_1^2) and w_11^3, i.e., δ_1^2 = w_11^3 · δ_1^3 · σ′(z_1^2).
  • The general way to perform backpropagation training of the neural network is usually described as follows:
      • 1. Input a set of training examples.
      • 2. For each training example x, set the corresponding input activation vector a^{x,1}, and perform the following steps (note these equations manipulate matrices):
        • a. Feedforward: For each l = 2, 3, . . . , L, compute z^{x,l} = w^l a^{x,l−1} + b^l and a^{x,l} = σ(z^{x,l}).
        • b. Output error δ^{x,L}: Compute the vector δ^{x,L} = ∇_a C_x ⊙ σ′(z^{x,L}).
        • c. Backpropagate the error: For each l = L−1, L−2, . . . , 2 compute δ^{x,l} = ((w^{l+1})^T δ^{x,l+1}) ⊙ σ′(z^{x,l}).
      • 3. Gradient Descent: For each l = L, L−1, . . . , 2 update the weights according to the rule
  • w^l → w^l − (n/m) Σ_x δ^{x,l} (a^{x,l−1})^T
  • and the biases according to the rule
  • b^l → b^l − (n/m) Σ_x δ^{x,l},
  • where n is the number of training examples and m the number of mini-batches of training examples.
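  • A minimal numerical sketch of steps 1-3 is given below. It applies the update rules exactly as written above, with the n/m factor treated as the step-size scaling stated in the text (many treatments instead use a learning rate divided by the mini-batch size); the network shapes, sigmoid activation and random data are assumptions for illustration only.

```python
import numpy as np

def sigma(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigma_prime(z):
    s = sigma(z)
    return s * (1.0 - s)

def backprop_update(weights, biases, batch, scale):
    """One gradient-descent update over a mini-batch of (x, y) pairs.

    'scale' plays the role of the n/m factor in the update rule above.
    """
    grad_w = [np.zeros_like(w) for w in weights]
    grad_b = [np.zeros_like(b) for b in biases]
    for x, y in batch:
        # Step 2a, feedforward: z^l = w^l a^{l-1} + b^l, a^l = sigma(z^l)
        a, activations, zs = x, [x], []
        for w, b in zip(weights, biases):
            z = w @ a + b
            zs.append(z)
            a = sigma(z)
            activations.append(a)
        # Step 2b, output error: delta^L = grad_a C ⊙ sigma'(z^L), quadratic cost
        delta = (activations[-1] - y) * sigma_prime(zs[-1])
        grad_w[-1] += np.outer(delta, activations[-2])
        grad_b[-1] += delta
        # Step 2c, backpropagate: delta^l = ((w^{l+1})^T delta^{l+1}) ⊙ sigma'(z^l)
        for l in range(2, len(weights) + 1):
            delta = (weights[-l + 1].T @ delta) * sigma_prime(zs[-l])
            grad_w[-l] += np.outer(delta, activations[-l - 1])
            grad_b[-l] += delta
    # Step 3, gradient descent with the n/m scaling factor
    new_w = [w - scale * gw for w, gw in zip(weights, grad_w)]
    new_b = [b - scale * gb for b, gb in zip(biases, grad_b)]
    return new_w, new_b

# Example usage with an assumed 3-2-1 network and a mini-batch of 4 samples.
rng = np.random.default_rng(1)
weights = [rng.standard_normal((2, 3)), rng.standard_normal((1, 2))]
biases = [rng.standard_normal(2), rng.standard_normal(1)]
batch = [(rng.standard_normal(3), rng.standard_normal(1)) for _ in range(4)]
weights, biases = backprop_update(weights, biases, batch, scale=0.1)
```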
  • In a digital system, one would literally need to perform the derivative calculations and multiplies to generate this error contour. This takes a significant amount of computing power due to the large number of multiplies and derivatives involved. In the switched charge system, it is much easier to generate the error contour, as one can generate derivatives by actually perturbing the network.
  • Perturbation theory is a mathematical method for finding an approximate solution to a problem, by starting from the exact solution of a related, simpler problem. A critical feature of the technique is a middle step that breaks the problem into “solvable” and “perturbation” parts. Perturbation theory is applicable if the problem at hand cannot be solved exactly, but can be formulated by adding a “small” term to the mathematical description of the exactly solvable problem.
  • In the multipliers shown, weights are provided as a ratio of the currents. Assuming one has stored the weights temporarily in the short-term memory cell 80 shown in FIG. 8 , the value of the current may be adjusted slightly by modifying the voltage stored on C1. One can do this by adding or subtracting a current pulse with a current source as shown in FIG. 13 . FIG. 14 shows the output current stepping up with repeated pulses. The change may also be created by capacitively coupling a voltage pulse to C1.
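  • A rough behavioral sketch of this adjustment follows: each pulse deposits a packet of charge on C1, the stored gate voltage moves by ΔV = I·Δt/C1, and the mirrored weight current steps accordingly, as in FIG. 14. All component values and the linear transconductance below are assumptions; the real MOSFET transfer characteristic is not linear.

```python
# Behavioral sketch of perturbing a stored weight current with pulses.
# Component values and the linearized transconductance are assumptions.
C1 = 50e-15          # storage capacitance on the gate node, farads (assumed)
I_PULSE = 1e-9       # magnitude of the adjustment current source, amps (assumed)
T_PULSE = 10e-9      # width of each adjustment pulse, seconds (assumed)
GM = 20e-6           # linearized transconductance of the mirror device (assumed)

def apply_pulses(v_c1, n_pulses, sign=+1):
    """Step the voltage stored on C1 by n_pulses charge packets.

    sign=+1 adds charge (weight current steps up, as in FIG. 14),
    sign=-1 removes charge (weight current steps down).
    """
    dv_per_pulse = sign * I_PULSE * T_PULSE / C1   # dV = I * dt / C1
    return v_c1 + n_pulses * dv_per_pulse

def weight_current(v_c1, v_c1_nominal, i_nominal):
    # Small-signal view: dI ~ gm * dV around the nominal operating point.
    return i_nominal + GM * (v_c1 - v_c1_nominal)

v0 = 0.6                                   # nominal stored voltage (assumed)
v1 = apply_pulses(v0, n_pulses=5)          # five positive pulses
print(weight_current(v1, v0, i_nominal=1e-6))
```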
  • One can now explain how one can perform the equations of backpropagation using analog current mode mathematics. FIG. 15 shows the math that one may have to perform for the calculation of the output layer error function. Analog calculation of this math is shown in FIGS. 16 & 17 . One may use the short-term memory cell 80 to sample and hold the change in y_1 that results from adding a small perturbation at the input to z_1^3. This can be accomplished by summing in an additional current source which is pulsed for a short period of time into the neuron summer input capacitor (and removing it again, or reloading it from “training updated current” after learning use, to be ready for the next calculation) in another short-term memory cell 80, and storing that delta, σ′(z_1^3), in a memory cell. Similarly, assuming that the targets are saved as currents, we can subtract the target y_1 from the final activation a_1^3 as shown in FIG. 16 , assuming we saved it in a memory cell after the forward propagation. We can now use a multiplier to generate δ_1^3 = (a_1^3 − y_1)·σ′(z_1^3), then use the transconductor 90 (FIG. 9 ) to convert it into a current, and save it in a memory cell.
  • One can similarly perturb the previous layer to find σ′(z_1^2) and δ_1^2. FIG. 16 shows how one might use the memory cells and current mode mathematics to generate, sample and hold the derivative of the cost function and store the result in our memory cell. One can work backwards to generate the error contour per step 2c above and store it in the short-term memory cell 80. One can repeat this for a given training set x, over a number of training examples n, in a number of mini-batches m, and then modify the weights according to step 3 by using the multiplier shown in FIG. 4 . Recall that if current is subtracted from our summing node then one multiplies; if it is added, one divides. One can also apply normalizations with additional stages or by scaling at the same time.
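  • The numerical idea behind the perturbation is sketched below: instead of computing σ′(z) analytically, a small Δz is injected and the resulting change in activation is sampled, so σ′(z) ≈ (σ(z+Δz) − σ(z))/Δz. The sigmoid, the perturbation size and the example values are illustrative assumptions.

```python
import numpy as np

def sigma(z):
    return 1.0 / (1.0 + np.exp(-z))

def perturbed_derivative(z, dz=1e-3):
    """Estimate sigma'(z) by perturbation, as the analog circuit does by
    pulsing a small extra charge into the neuron summer and sampling the
    change in the activation."""
    return (sigma(z + dz) - sigma(z)) / dz

# Output-layer error for one output neuron: delta = (a - y) * sigma'(z)
z_out = 0.7            # weighted sum z_1^3 held in the neuron (assumed value)
a_out = sigma(z_out)   # activation a_1^3 saved after forward propagation
y_target = 1.0         # target stored as a current (assumed value)
delta_out = (a_out - y_target) * perturbed_derivative(z_out)

# Backpropagating one step: delta_1^2 = w_11^3 * delta_1^3 * sigma'(z_1^2)
w3_11, z_12 = 0.4, -0.2          # assumed weight and hidden-layer sum
delta_12 = w3_11 * delta_out * perturbed_derivative(z_12)
print(delta_out, delta_12)
```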
  • The above discloses the use of analog mathematics to accelerate the learning process. Due to the perturbations and the minimization of the number of multiplies and derivatives, one can dramatically increase the speed and reduce the power required for training.
  • As an alternative to current mode mathematics, a charge domain mechanism is presented. In FIG. 22 , the concept of the charge domain mathematics is shown as 110. Initially, the trip point of the common source comparator has been stored on the gate by pulling the gate node below its trip point (with read_out enabled but not pulse_out), releasing it, and allowing current source 140 to charge it up until said current source is turned off by the switching of said common source comparator. Thereafter, the drain of the MOSFET 120 is held low by the low side switch of the first inverter 160 which is coupled to the signal Read out 130. The output of the second inverter 150 is also disabled until requested by a signal “pulse out” 170. One or more current sources, proportional to current source 140, are coupled to the gate of said MOSFET through switches further coupled to input pulses. The width of each of said pulses represents an input value and the magnitude of the current source represents a weight. The pulse gating said current source weights results in a multiplication of weight × input time, which reduces the voltage on the gate of said MOSFET by a voltage of Q/C, where C is the total capacitance on said gate node and Q is the total charge removed from said node. The accumulated charge from said one or more weighted inputs is replaced when the “Read out” circuit 130 enables the circuit and simultaneously a pulse is initiated by enabling “pulse out”. “pulse_out” will end when the gate of the comparator returns to its trip point and turns off current source 140.
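  • A behavioral model of this multiply-and-accumulate memory cell is sketched below: each weighted input removes charge Q = I_weight·t_pulse from the gate node, dropping the gate voltage by Q/C, and the read-out pulse lasts until current source 140 has replaced that charge, so the output pulse width encodes Σ(weight × input time). The component values are assumptions, not taken from the figures.

```python
# Behavioral sketch of the FIG. 22 charge-domain multiply-and-add cell.
# Values are illustrative assumptions.
C_GATE = 10e-15      # total capacitance on the comparator gate node (assumed)
I_READ = 100e-9      # current source 140 used to restore the trip point (assumed)

def accumulate(weighted_inputs):
    """Each input is (weight_current_amps, pulse_width_seconds).

    Returns the total charge removed from the gate node; the gate voltage
    drops by q_total / C_GATE below the stored trip point.
    """
    return sum(i_w * t_in for i_w, t_in in weighted_inputs)

def read_out_pulse(q_total):
    # The output pulse ends when current source 140 has replaced the
    # accumulated charge, so its width is q_total / I_READ.
    return q_total / I_READ

inputs = [(50e-9, 2e-6), (20e-9, 1e-6)]   # (weight, input-time) pairs, assumed
q = accumulate(inputs)
print("gate droop (V):", q / C_GATE, "output pulse (s):", read_out_pulse(q))
```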
  • In FIG. 23 , this concept is extended to add a bias input, which works in a similar manner to the weighted current sources except that the current source is of the same magnitude as 140. The devices connected to the MOSFET gates are drivers, not inverters. Also shown is some additional charge domain mathematics; in this case it is a ReLU decision circuit. The circuit works by ORing the pulse produced by the switching of the common source comparator 180 with the output of a second common source comparator 190 previously loaded to 50% of the dynamic range of the circuit. By ORing the pulse output of the weighted input summer with a 50% pulse, we are ensured at least a 50% pulse, but if the pulse exceeds 50% then the pulse will be the linear value of the multiply & add circuit (weighted summer), which is the definition of a ReLU.
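  • In pulse terms, ORing the weighted-sum pulse with a fixed 50% pulse keeps the longer of the two, so the 50% point plays the role of zero while longer pulses pass through linearly. A one-line behavioral sketch, with pulse widths normalized to the full dynamic range (an assumption for illustration), follows.

```python
def relu_pulse(sum_pulse, zero_point=0.5):
    """Pulse-domain ReLU: ORing the weighted-summer pulse with a pulse
    preloaded to 50% of the dynamic range yields the longer of the two."""
    return max(sum_pulse, zero_point)

# Pulses below the 50% 'zero' clamp to 0.5; longer pulses pass linearly.
print([relu_pulse(p) for p in (0.1, 0.5, 0.8)])   # -> [0.5, 0.5, 0.8]
```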
  • We will now illustrate specifically a backpropagation algorithm implemented using the described charge domain methods. Note that with respect to the circuits described in FIG. 22 and FIG. 23 , although we only show n-channel switch and current source arrangements which add to the pulse width, we can also couple gated p-channel switch and series current source arrangements that would reduce the output pulse width as shown in FIG. 24 . We start at the input with an input training set a^1 and propagate forward using circuits similar to FIG. 23 (neuron cell) and record in circuits similar to that shown in FIG. 22 (memory cell), except there will be only one input to the cell and its weight will be 1 (identical to the upper current source 140), which when multiplied by the input pulse will store the value for each activation. As a secondary operation we will add a small charge which will be used as the Δz perturbation and will record the σ(z_j + Δz_j) value which will be used to determine σ′(z_j^l). As we will soon illustrate, it is necessary to have this information as a current source weight if we are to multiply it against a pulse input. We therefore need to generate the equivalent of a single input FIG. 22 memory cell (MDLL) with the weighted current source producing a pulse on the memory cell output equal to that stored on the original σ′(z_j^l) memory cell when the input pulse value coupled to the input current source gating the MOSFET in series with the current source is of maximal value (full pulse corresponding to a 1). We can do this with a delay locked loop which adjusts the magnitude of the current source in the MDLL to lock the pulse width from said memory cell to the memory cell saving the pulse information from the original σ′(z_1^3). A DLL is well known to those skilled in the art and there are many efficient architectures for quickly locking a pulse magnitude by adjusting a parameter such as the current source value. These values will therefore be stored as currents using a circuit similar to that shown in FIG. 13 .
  • After loading each activation we do not enable the output pulse but just store all of the results in separate memory cells. Next, we move to the output and compute (a_1^3 − y_1)·σ′(z_1^3). Firstly, we will load our target y_1 into the circuit shown in FIG. 22 , except there will be only one input to the cell and its weight will be 1 (identical to the upper current source). After loading y_1 we do not enable the read out pulse. We have already loaded a_1^3 into a memory cell, so we now couple the output of y_1 to a complementary input (see FIG. 24 ) of the a_1^3 memory cell and enable the output of the y_1 cell to perform the subtraction (a_1^3 − y_1). Using the same method we subtract a_1^3 from the previously stored σ(z_1^3 + Δz_1^3) and store the resulting σ′(z_1^3) in a memory cell. We then use an MDLL to convert σ′(z_1^3) from a pulse memory input to a current source memory input and use this current to calculate δ^{x,L}.
  • Once we have the output layer δ^{x,L} computed in this way, we multiply the output layer δ^{x,L} by the incoming weights and then by σ′(z^{x,l}). Specifically, δ^{x,l} = ((w^{l+1})^T δ^{x,l+1}) ⊙ σ′(z^{x,l}). The first multiply, ((w^{l+1})^T δ^{x,l+1}), is easy since we simply use the weights and the just-created δ^{x,L} pulse as an input to construct the next value. Assuming we used our memory cell to generate ((w^{l+1})^T δ^{x,l+1}) by introducing δ^{x,l+1} as an input pulse and the weights in their original current source form, we can now couple the output pulse to another memory cell where the current source was adjusted using the delay locked loop technique to produce a current input to match the memory cell which stored the value of σ′(z^{x,l}) which we stored earlier. In this way we can generate our error contour δ^{x,l} and store it on memory cells.
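  • A simple iterative sketch of the MDLL idea follows: the loop adjusts a replica current source until the single-input memory cell reproduces, for a full-scale input pulse, the pulse width stored in the original σ′ memory cell, so the stored pulse value becomes available as a current-source weight. The loop gain, cell model and values are assumptions; as noted above, practical DLL architectures lock far more efficiently.

```python
# Iterative sketch of locking a replica current source (MDLL) so that a
# full-scale input pulse reproduces a target stored pulse width.
I_READ = 100e-9      # read-out current source, as in FIG. 22 (assumed)
T_FULL = 1e-6        # pulse width corresponding to a full-scale input of 1 (assumed)

def cell_output_pulse(i_weight, t_input):
    # Output pulse width of a single-input memory cell: (i_weight*t_input)/I_READ
    return (i_weight * t_input) / I_READ

def mdll_lock(target_pulse, i_init=10e-9, gain=0.5, iters=50):
    """Adjust the replica current source until the cell's output pulse for a
    full-scale input matches the target pulse stored in the sigma' cell."""
    i_weight = i_init
    for _ in range(iters):
        error = target_pulse - cell_output_pulse(i_weight, T_FULL)
        i_weight += gain * error * I_READ / T_FULL   # proportional adjustment
    return i_weight

target = 0.3e-6                      # stored sigma'(z) pulse width (assumed)
print(mdll_lock(target))             # converges to ~30 nA for these values
```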
  • Now we can finally start our gradient descent. We learned that we should update the weights according to the rule
  • w^l → w^l − (n/m) Σ_x δ^{x,l} (a^{x,l−1})^T
  • and the biases according to the rule
  • b^l → b^l − (n/m) Σ_x δ^{x,l},
  • where n is the number of training examples, m the number of mini-batches and x the present training sample. We have already generated δ^{x,l} and a^{x,l−1} for a single set of samples (each x). We will need to multiply those terms, so it will be necessary to convert a^{x,l−1} from our pulse output memory cell to a current using an MDLL, but we are not going to do that quite yet. First it would be better to store δ^{x,l} and a^{x,l−1} over our dataset x in our memory cell. We already know how many training examples n and how many mini-batches m we are using and therefore know n/m. We are going to use a multiplier where, instead of the current source in series with S0 in FIG. 22 being proportional to the input weight current sources such that a full pulse input would produce the same output pulse, we scale the S0 current sources by n/m to build in that scaling factor. Next, for each weight w^l, we will use a local DLL to develop a memory cell with a current source equivalent of a^{x,l−1} as described above. Now we can use our modified memory cell to output a weight adjustment which will be applied using the method in FIGS. 13 and 14 , where the magnitude of the pulses we subtract is the value we just calculated, summed together across all of x using another weighted summer, i.e.
  • (n/m) Σ_x (δ^{x,l} a^{x,l−1}).
  • The result will be an adjustment of the weight
  • w^l → w^l − (n/m) Σ_x (δ^{x,l} a^{x,l−1}).
  • Similarly, we will adjust the biases using the same method according to
  • b^l → b^l − (n/m) Σ_x δ^{x,l},
  • except that the biases are easier to adjust, since we do not have to do a DLL and multiply, and simply need to use a weighted summer for each δ^{x,l}.
  • The above switched-charge, charge-domain circuitry, accepting weighted charge inputs rather than voltage inputs, produces several distinct advantages compared with switched capacitor circuits. Firstly, it is easier to transmit input information as a time pulse, which, when gating a current source representing a weight, produces a weighted charge input. Additionally, we can accept an arbitrary number of such weighted inputs asynchronously. The gate of the common source MOSFET acts as a memory. The noise of the circuit is a fraction of that of a similar switched capacitor circuit, since the noise bandwidth is modified by the extremely small conduction time relative to the half period used in a switched capacitor circuit. By accepting charge there is no requirement to trim capacitors as there would be in a switched capacitor circuit accepting voltage. There is no operational amplifier, with its resulting noise and settling time; as such the system is much faster and the propagation time is dependent upon component values. It is scalable into smaller lithography processes since it does not rely on analog circuitry such as the aforementioned operational amplifier. Switched charge based decision circuits and other neural network building blocks can easily be created using similar means, with time based mathematics used to easily implement a result. The common source multiplier approach implements correlated double sampling (CDS), which removes flicker noise, offsets, and temperature, process or leakage variations.
  • In some instances, it may be useful to improve the accuracy of the currents holding the weights or the accuracy of input values. It is known to those skilled in the art that a ΣΔ modulator allows improvement in accuracy by oversampling the Δ of a quantizer output vs. an input value, filtering that result with gain, and modifying the quantizer output in conformance with said filter output over multiple cycles. The result, averaged over multiple cycles, will have an accuracy improved in proportion to the gain and the oversampling ratio.
  • This concept is illustrated in FIGS. 18-21. FIG. 18 shows the concept of a ΣΔ MOD1 modulator 100 (hereinafter modulator 100). The modulator works by taking the Δ between a quantizer output and an input value to match, using an integrator as a filter driving a single-bit quantizer. In this case the quantizer applies its output by moving the switched capacitor reference around ground. The resulting bitstream averages to the desired input value.
  • The modulator 100 is a switched capacitor integrator with a digital reference fed to one side of the input capacitor C. FIG. 19 shows a comparator based version of the modulator 100 shown in FIG. 18. FIG. 20 shows how an inverter with level shift (“inverter configuration”) could replace the comparator (although more filter gain may be required), provided the inverter contains a level shift which allows operation around ground similar to the comparator. Finally, FIG. 21 shows a charge or current based version of the modulator in which the input is a charge input and the charge is adjusted against gated currents rather than voltages. This circuit produces a continuous waveform around the input value which averages, over multiple cycles, to the input value. It is known to those skilled in the art that this technique may utilize higher order filters, more gain, and higher order quantizers to achieve higher levels of accuracy. Additionally, it may be applied over the entire neural network operation, such that the output of the neural network is taken as a multiple cycle average. Alternatively, it could be applied only locally, for example to a control register which maintains a current weight input or to match a pulse input. (A behavioral sketch of this MOD1 averaging appears after this description.)
  • Factory loading and calibration of a neural network system can take significant time due to the large number of weights and biases that must be loaded into a large network. By integrating the neural network into a pixel array and providing local charge domain memory, the data can be parallel loaded very quickly, saving time and cost during factory calibration or during learning situations. Specifically, if a test fixture is developed which provides pixel data corresponding to the data one wishes to load, then as many weights or biases can be loaded in parallel as there are pixels. For example, a 12M pixel array could parallel load 12M values at a time. With a sub-microsecond charge accumulation time, this means it is possible to load 12M×1e6 = 12e12 values per second. At an assumed accuracy of 12 bits per value, this is equivalent to a loading rate of 12×12 = 144 terabits of data per second, which is difficult to match using other means. Target data may be loaded in a similar way. (A worked version of this estimate appears after this description.)
  • The data can then be parallel stored in non-volatile memory located in each neuron without burdening the parallel loading system.
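The following is a minimal software sketch, not part of the patented circuitry, of the weight and bias update described above: the per-sample products δ_{x,l}·a_{x,l−1} are summed with the n/m scaling factor built in, mirroring what the scaled S0 current sources and weighted summers compute in the charge domain. The function name, array shapes, and mini-batch sizes are illustrative assumptions.

```python
import numpy as np

def update_layer(w_l, b_l, deltas, a_prev, n, m):
    """Apply w_l -> w_l - (n/m) * sum_x(outer(delta_{x,l}, a_{x,l-1})) and
    b_l -> b_l - (n/m) * sum_x(delta_{x,l}) over one mini-batch of samples x."""
    scale = n / m  # the scaling factor built into the S0 current sources on chip
    grad_w = sum(np.outer(d, a) for d, a in zip(deltas, a_prev))
    grad_b = deltas.sum(axis=0)
    return w_l - scale * grad_w, b_l - scale * grad_b

# Hypothetical sizes: layer l has 4 neurons, layer l-1 has 3; one mini-batch of
# 8 samples drawn from n = 64 training examples split into m = 8 mini-batches.
rng = np.random.default_rng(0)
w, b = rng.normal(size=(4, 3)), rng.normal(size=4)
deltas = rng.normal(size=(8, 4))    # delta_{x,l} for each sample x
a_prev = rng.normal(size=(8, 3))    # a_{x,l-1} for each sample x
w, b = update_layer(w, b, deltas, a_prev, n=64, m=8)
print(w.shape, b.shape)             # (4, 3) (4,)
```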
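Below is a behavioral sketch, assuming an ideal first-order loop, of the ΣΔ MOD1 averaging described for FIGS. 18-21: the integrator accumulates the Δ between the input and the fed-back single-bit quantizer output, and the resulting bitstream averages to the input value over many cycles. It models only the discrete-time arithmetic, not the switched capacitor or switched charge implementations; the function name and cycle count are arbitrary choices.

```python
def mod1_bitstream(x_in, n_cycles=4096, ref=1.0):
    """Ideal first-order sigma-delta: integrate the delta between the input and
    the fed-back 1-bit quantizer output; return the resulting bitstream."""
    integ, bits = 0.0, []
    for _ in range(n_cycles):
        fb = ref if (bits and bits[-1]) else -ref   # previous quantizer output
        integ += x_in - fb                          # take the delta, integrate
        bits.append(1 if integ >= 0.0 else 0)       # single-bit quantizer
    return bits

bits = mod1_bitstream(0.37)
# Map the bitstream back to +/-ref and average: the result approaches the input,
# with accuracy improving as the number of cycles (the oversampling) grows.
avg = sum(1.0 if b else -1.0 for b in bits) / len(bits)
print(round(avg, 3))   # approximately 0.37
```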
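As a worked version of the parallel-loading estimate above (assuming, per the text, a 12M pixel array, roughly one load per microsecond, and 12-bit values):

```python
pixels = 12e6            # weights or biases loaded in parallel, one per pixel
loads_per_sec = 1e6      # assumed ~1 us charge accumulation time per load
bits_per_value = 12      # assumed stored accuracy per value

values_per_sec = pixels * loads_per_sec              # 1.2e13 values per second
terabits_per_sec = values_per_sec * bits_per_value / 1e12
print(values_per_sec, terabits_per_sec)              # 1.2e13 values/s, 144 Tb/s
```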
  • While embodiments of the disclosure have been described in terms of various specific embodiments, those skilled in the art will recognize that the embodiments of the disclosure may be practiced with modifications within the spirit and scope of the claims.

Claims (25)

What is claimed is:
1. A neural network error contour generation mechanism comprising a device which perturbs analog neurons to measure an error which results from perturbations at different points within the neural network.
2. The neural network error contour generation mechanism of claim 1, comprising
a neuron summer to integrate a perturbation; and
an analog circuit sampling and holding an activation result of the perturbation to calculate σ′(z).
3. The neural network error contour generation mechanism of claim 2, comprising a multiplier circuit multiplying σ′(z) by a curl of a cost function to generate output layer errors.
4. The neural network error contour generation mechanism of claim 3, wherein the neurons are comprised of one or more of: switched charge multiplier, division and current mode summations, decision or similar circuits.
5. The neural network error contour generation mechanism of claim 3, comprising means for generating errors for layers below an output layer by using one of switched charge multiplier or division and current or charge domain summations to generate remaining error values working backwards from the output layer errors.
6. The neural network error contour generation mechanism of claim 1, wherein the error caused by a perturbation at each weighted input to a neuron is measured at a respective output of the neural network.
7. The neural network error contour generation mechanism of claim 1, comprising a neural network update circuit modifying analog weight values to direct the neural network error contour generation mechanism towards a target in response to an error contour generated.
8. The neural network error contour generation mechanism of claim 6, comprising a neural network update circuit modifying analog weight values to direct the neural network error contour generation mechanism towards a target in response to an error contour generated.
9. The neural network error contour generation mechanism of claim 1, wherein an error contour generated is stored in an analog memory.
10. The neural network error contour generation mechanism of claim 8, wherein the error contour generated is stored in an analog memory.
11. The neural network error contour generation mechanism of claim 1, comprising a circuit modifying analog bias values to direct the neural network error contour generation mechanism towards a target.
12. The neural network error contour generation mechanism of claim 1, wherein the error is a quadratic difference.
13. A backpropagation mechanism comprising:
a neural network error contour generation mechanism comprising a device which perturbs analog neurons to measure an error which results from perturbations at different points within the neural network;
wherein a set of m mini-batches of n training samples are inputted to the neural network error contour generation mechanism, wherein n is the number of training examples and m the number of mini-batches of training examples;
wherein each weight in the neural network is modified in part according to an average over n samples of the curl of an error function multiplied by a local activation derivative to move towards a target.
14. A backpropagation means comprising:
an error contour comprising a change in a neural network resulting from perturbation of each weighted input sum;
wherein a set of m mini-batches of n training samples are inputted to the neural network, wherein n is the number of training examples and m the number of mini-batches of training examples;
wherein each weight in the neural network is modified according to an average over n samples of the curl of an error function multiplied by a local activation derivative to move towards a target.
15. A weight tuning circuit comprising:
one of a gated current or charge input representing an input value to match;
a ΣΔ modulator using two switched charge reservoirs in inverter configuration;
wherein an output of the ΣΔ modulator adjusts current sources feeding a node between the two switched charge reservoirs against a comparator reference voltage.
16. The weight tuning circuit of claim 15, wherein multiple weights use the comparator reference voltage, the comparator reference voltage adjusted to eliminate offset errors.
17. A weight tuning means comprising:
one of a gated current or charge input representing an input value to match;
a ΣΔ modulator using two switched charge reservoirs in inverter configuration;
wherein a current representing a weight is set from a node between the two switched charge reservoirs; and
wherein a resulting integrated value is compared to a comparator reference to generate an average accurate value over multiple cycles.
18. The weight tuning circuit of claim 17, wherein multiple weights use the comparator reference voltage, the comparator reference voltage adjusted to eliminate offset errors.
19. A learning device whereby an incremental charge is propagated through a neural network to determine its sensitivity on the output of the neural network and where those weights which produce a significant change towards a target are modified by a controller to minimize said error.
20. The learning device of claim 19, wherein said modifications to said weights are made by said controller in conformance with the fourth rule of backpropagation by adjustment of said weight current source magnitude.
21. A charge domain switched charge multiply and sum circuit, accepting weighted charge inputs, where the limited pulse duration of the charge transfer modifies the noise bandwidth compared to a switched capacitor circuit such that the noise is dramatically reduced and thus the required capacitance is dramatically reduced.
22. A charge domain neuron, comprising the switched charge multiply and sum circuit of claim 21 and further comprising a charge domain ReLU decision circuit wherein a weighted input charge is introduced to a weighted summer and a threshold level for a ReLU circuit is loaded as charge into a second weighted summer and the output pulses of said summer circuits are simultaneously OR'ed to produce a ReLU output.
23. An input operand converting device by which to convert a pulse input value to a current source input value in a charge domain memory cell comprising:
a charge domain memory circuit which accepts a pulse input which gates a current input so as to store a weighted charge on a summing node;
a controller which applies a maximal current input value to said memory circuit while accepting a pulse input value so as to generate said weighted charge on said summing node;
a delay lock loop;
wherein said delay lock loop accepts the output pulse of said memory circuit and then the controller applies a maximal input pulse, after which said delay lock loop adjusts the magnitude of said current weight value until the output of said memory circuit matches the pulse width initially introduced.
24. A backpropagation mechanism consisting of:
a controller;
the charge domain neurons of claim 22;
the input operand converting device of claim 23;
wherein the four equations of backpropagation are implemented by utilizing combinations of said charge domain neurons and operand converting devices;
wherein said controller adjusts weights and biases in conformance with the fourth rule of backpropagation.
25. An oversampled neural network where the inputs to neurons in the neural network are oversampled and where each comprises a ΣΔ modulator wherein the Δ of each input operand versus a quantizer output value is filtered with gain and then reapplied to said quantizer such that accurate values of the neural network operation are the average of said oversampled values.
US18/107,082 2018-04-26 2023-02-08 Analog learning engine and method Pending US20230206077A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US18/107,082 US20230206077A1 (en) 2018-04-26 2023-02-08 Analog learning engine and method

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US201862663125P 2018-04-26 2018-04-26
US16/396,583 US11604996B2 (en) 2018-04-26 2019-04-26 Neural network error contour generation circuit
US18/107,082 US20230206077A1 (en) 2018-04-26 2023-02-08 Analog learning engine and method

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US16/396,583 Division US11604996B2 (en) 2018-04-26 2019-04-26 Neural network error contour generation circuit

Publications (1)

Publication Number Publication Date
US20230206077A1 true US20230206077A1 (en) 2023-06-29

Family

ID=68292511

Family Applications (3)

Application Number Title Priority Date Filing Date
US16/396,583 Active 2041-11-08 US11604996B2 (en) 2018-04-26 2019-04-26 Neural network error contour generation circuit
US18/107,082 Pending US20230206077A1 (en) 2018-04-26 2023-02-08 Analog learning engine and method
US18/107,066 Pending US20230376770A1 (en) 2018-04-26 2023-02-08 Analog learning engine and method

Family Applications Before (1)

Application Number Title Priority Date Filing Date
US16/396,583 Active 2041-11-08 US11604996B2 (en) 2018-04-26 2019-04-26 Neural network error contour generation circuit

Family Applications After (1)

Application Number Title Priority Date Filing Date
US18/107,066 Pending US20230376770A1 (en) 2018-04-26 2023-02-08 Analog learning engine and method

Country Status (2)

Country Link
US (3) US11604996B2 (en)
WO (1) WO2019210276A1 (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111954888A (en) * 2018-03-02 2020-11-17 艾斯多姆有限公司 Charge domain math engine and method
US10832014B1 (en) * 2018-04-17 2020-11-10 Ali Tasdighi Far Multi-quadrant analog current-mode multipliers for artificial intelligence
US11772663B2 (en) * 2018-12-10 2023-10-03 Perceptive Automata, Inc. Neural network based modeling and simulation of non-stationary traffic objects for testing and development of autonomous vehicle systems
US11526285B2 (en) * 2019-04-03 2022-12-13 Macronix International Co., Ltd. Memory device for neural networks
CN113205048B (en) * 2021-05-06 2022-09-09 浙江大学 Gesture recognition method and system
US11789857B2 (en) * 2021-08-11 2023-10-17 International Business Machines Corporation Data transfer with continuous weighted PPM duration signal

Family Cites Families (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5052043A (en) 1990-05-07 1991-09-24 Eastman Kodak Company Neural network with back propagation controlled through an output confidence measure
US5224203A (en) 1990-08-03 1993-06-29 E. I. Du Pont De Nemours & Co., Inc. On-line process control neural network using data pointers
US5150450A (en) * 1990-10-01 1992-09-22 The United States Of America As Represented By The Secretary Of The Navy Method and circuits for neuron perturbation in artificial neural network memory modification
US5222193A (en) 1990-12-26 1993-06-22 Intel Corporation Training system for neural networks and the like
US5640494A (en) * 1991-03-28 1997-06-17 The University Of Sydney Neural network with training by perturbation
US5226092A (en) 1991-06-28 1993-07-06 Digital Equipment Corporation Method and apparatus for learning in a neural network
US5479579A (en) * 1992-09-04 1995-12-26 The United States Of America As Represented By The Administrator Of The National Aeronautics And Space Administration Cascaded VLSI neural network architecture for on-line learning
US5987444A (en) * 1997-09-23 1999-11-16 Lo; James Ting-Ho Robust neutral systems
WO2016179533A1 (en) * 2015-05-06 2016-11-10 Indiana University Research And Technology Corporation Sensor signal processing using an analog neural network
US10885429B2 (en) * 2015-07-06 2021-01-05 University Of Dayton On-chip training of memristor crossbar neuromorphic processing systems
WO2017068491A1 (en) * 2015-10-23 2017-04-27 Semiconductor Energy Laboratory Co., Ltd. Semiconductor device and electronic device
US20200143240A1 (en) * 2017-06-12 2020-05-07 D5Ai Llc Robust anti-adversarial machine learning

Also Published As

Publication number Publication date
US20230376770A1 (en) 2023-11-23
US11604996B2 (en) 2023-03-14
US20190332459A1 (en) 2019-10-31
WO2019210276A1 (en) 2019-10-31

Similar Documents

Publication Publication Date Title
US20230206077A1 (en) Analog learning engine and method
US11132176B2 (en) Non-volatile computing method in flash memory
KR930002792B1 (en) Neuron architecture
Miao et al. Automated digital controller design for switching converters
US8014879B2 (en) Methods and systems for adaptive control
Kang et al. An on-chip-trainable Gaussian-kernel analog support vector machine
US20170368682A1 (en) Neural network apparatus and control method of neural network apparatus
US11467984B2 (en) System and methods for mixed-signal computing
US11573792B2 (en) Method and computing device with a multiplier-accumulator circuit
Gu et al. ROQ: A noise-aware quantization scheme towards robust optical neural networks with low-bit controls
US20230359571A1 (en) System and methods for mixed-signal computing
KR20190085785A (en) Neuromorphic arithmetic device and operating method thereof
CN114268221A (en) AI-based error prediction for power conversion regulators
Shaikh et al. Novel product ANFIS‐PID hybrid controller for buck converters
US20230186089A1 (en) Analog learning engine and method
KR20200076083A (en) Neuromorphic system performing supervised training using error back propagation
CN109946966B (en) Magnetic fluid motion control method based on uncertain parameter quantification
CN117616427A (en) System, method and computer device for transistor-based neural network
US20060222128A1 (en) Analog signals sampler providing digital representation thereof
Figueroa et al. On-chip compensation of device-mismatch effects in analog VLSI neural networks
TW202131627A (en) Neural amplifier, neural network and sensor device
US20230117488A1 (en) Interpretable Neural Networks for Nonlinear Control
Sargolzaei et al. Assessment of He’s homotopy perturbation method for optimal control of linear time-delay systems
Karban et al. The principle of prediction of complex time-dependent nonlinear problems using RNN
JPH1091604A (en) Function learning device

Legal Events

Date Code Title Description
AS Assignment

Owner name: AISTORM INC., CALIFORNIA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:SCHIE, DAVID;DRABOS, PETER;GAITUKEVICH, SERGEY;AND OTHERS;REEL/FRAME:062624/0433

Effective date: 20191119

STPP Information on status: patent application and granting procedure in general

Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION