WO2021116743A1 - Gradient flow emulation using drift-diffusion processes - Google Patents

Gradient flow emulation using drift-diffusion processes

Info

Publication number
WO2021116743A1
WO2021116743A1 PCT/IB2019/060782 IB2019060782W WO2021116743A1 WO 2021116743 A1 WO2021116743 A1 WO 2021116743A1 IB 2019060782 W IB2019060782 W IB 2019060782W WO 2021116743 A1 WO2021116743 A1 WO 2021116743A1
Authority
WO
WIPO (PCT)
Prior art keywords
type charge
charge carrier
regions
type
terminals
Prior art date
Application number
PCT/IB2019/060782
Other languages
English (en)
Inventor
Sanket DIWALE
Original Assignee
Ecole Polytechnique Federale De Lausanne (Epfl)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ecole Polytechnique Federale De Lausanne (Epfl)
Priority to PCT/IB2019/060782
Publication of WO2021116743A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/11 Complex mathematical operations for solving equations, e.g. nonlinear equations, general mathematical optimization problems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06G ANALOGUE COMPUTERS
    • G06G7/00 Devices in which the computing operation is performed by varying electric or magnetic quantities
    • G06G7/12 Arrangements for performing computing operations, e.g. operational amplifiers
    • G06G7/122 Arrangements for performing computing operations, e.g. operational amplifiers for optimisation, e.g. least square fitting, linear programming, critical path analysis, gradient method
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N7/00 Computing arrangements based on specific mathematical models
    • G06N7/01 Probabilistic graphical models, e.g. probabilistic networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06G ANALOGUE COMPUTERS
    • G06G7/00 Devices in which the computing operation is performed by varying electric or magnetic quantities
    • G06G7/12 Arrangements for performing computing operations, e.g. operational amplifiers
    • G06G7/32 Arrangements for performing computing operations, e.g. operational amplifiers for solving of equations or inequations; for matrices
    • G06G7/38 Arrangements for performing computing operations, e.g. operational amplifiers for solving of equations or inequations; for matrices of differential or integral equations

Definitions

  • the present invention relates to a hardware accelerator for solving for instance optimal transport and/or Bayesian inference problems using drift-diffusion of charge carriers in the accelerator.
  • the invention also relates to a method of operating the accelerator and to a computer program product.
  • Bayesian inference and optimal transport problems play a central role in mathematical modelling of various real-world phenomena.
  • the computational tractability of such problems is limited to special cases for exact computation due to the requirement to compute intractable integrals. Even in the cases where exact computation is possible, the computational cost is exponential in the number of integral variables, making the problem NP-hard.
  • Most practical applications resort to iterative numerical approximation of the solution to such problems via numerical optimisation or stochastic sampling algorithms, such as Monte Carlo approaches.
  • methods stemming from the differential form of the optimisation step in optimal transport and Bayesian inference, as its special case
  • the differential form of the gradient flow PDE avoids the computation of the intractable integrals in the original problem.
  • the PDE itself requires expensive numerical computation to solve over a discretised domain as required in traditional numerical methods for PDE solvers. All the above approaches, namely the numerical optimisation approximations, Monte Carlo approaches, exact computations or the PDE approach require some form of parallelisation and hardware acceleration to obtain practical solve times.
  • Such acceleration is currently provided by specialised software implementations of these algorithms on a graphics processing unit (GPU), tensor processing unit (TPU) or cluster computing systems.
  • the performance in terms of energy efficiency and acceleration capability of currently available acceleration techniques is not satisfactory.
  • the proposed device can be used to emulate the solution to the optimal transport and/or Bayesian inference problems.
  • the emulation using device or semiconductor physics makes solving the gradient flow PDE extremely fast and energy efficient.
  • the emulation happens in the order of microseconds while numerical solution to the same PDE can take tens of minutes to hours on a multicore processor.
  • the device emulation also is a zero current consuming process and thus does not bear any resistive energy loss. Any energy consumed comes from the capacitive losses that may occur during reconfiguration of the hardware to different voltage levels or resistive losses during the output current measurement for the time duration of the measurement.
  • compared with existing solutions, the proposed solution also has the advantages of lower cost and increased miniaturisation.
  • the proposed device operating as a hardware accelerator differs significantly from state-of-the-art hardware accelerators that rely on emulating computer algorithms by processing the required computations in parallel using floating point operations on binary digital data.
  • the proposed device uses the inherent physical processes occurring within a semiconductor or charge carrying material to emulate a PDE using an analogue process.
  • a device system comprising a set of devices according to the first aspect of the present invention.
  • Other aspects of the invention are recited in the dependent claims attached hereto.
  • Figure 1 is a simplified schematic illustration of the device according to an embodiment of the present invention
  • Figure 2 is a simplified schematic illustration of the device according to a variant of the present invention
  • Figure 3 shows a network of devices of Figure 1 connected together
  • Figures 4a to 4c are flow charts illustrating method steps for solving a given gradient flow related problem
  • Figure 5 is a schematic illustration of a computational space according to an example of the present invention
  • Figure 6 is a schematic illustration showing how line segments together with their related vertices of a computational space can be mapped into a device space according to an example of the present invention.
  • the proposed solution provides an energy efficient hardware accelerator for these computations, addressing the needs of all such fields and provides a superior alternative to energy hungry computations using GPUs or similar units.
  • the embodiment describing the semiconductor accelerator device is also extended to cover a reconfigurable semiconductor accelerator device system comprising a set of interconnected semiconductor devices.
  • the teachings of the invention are not limited to the above applications.
  • the teachings of the invention are equally applicable in various other technical fields. Identical or corresponding functional and structural elements which appear in the different drawings are assigned the same reference numerals.
  • the present invention proposes a semiconductor accelerator and an in-hardware physical process to solve the gradient flow PDE for obtaining solutions to any one of the above-mentioned problems as charge distributions in a semiconductor.
  • the invention thus covers the design of the physical hardware device, and a process of operating the device, which involves the physical process (drift-diffusion) occurring in the semiconductor as a means to solve the gradient flow PDE, a process to configure the device or device system to solve different optimal transport and/or Bayesian inference problems, and/or a measurement technique to retrieve the solution from the physical process.
  • a new approach was also developed to solve an N-dimensional gradient flow using a collection of R-dimensional gradient flow equations combined using additional consensus terms, where R < N. This approach was used to emulate the N-dimensional gradient flow using a collection of one- to three-dimensional gradient flow equations using the physical device.
  • Figure 1 shows an example semiconductor apparatus or device 1 that can be used to emulate gradient flow for solving various problems, such as optimal transport and Bayesian inference, as detailed later.
  • the device comprises in a bottom or lower region or on a first side 3, which in this case is the bottom side, a set of first type charge carrier, doped or semiconductor regions 5 (indicated with a backward oriented pattern in Figure 1) separated from each other by a respective separating or isolating element or region 7 (indicated with a forward oriented pattern in Figure 1), which in this example is an insulating region.
  • the separating regions may be made of any suitable insulating material or they could be second type doped or semiconductor regions or any combination of them.
  • the first type doped regions 5 and the separating regions 7 are in this example both longitudinal regions extending in this example substantially orthogonally from the first side 3 towards a second, opposing side 9 of the device, which in this example is the top or upper side.
  • the first type doped regions 5 and the separating regions 7 thus extend between a first region end and a second, opposing region end, which in this example is located substantially in a centre region of the device extending longitudinally along an imaginary reference line extending across the device or along a longitudinal axis of the device (which does not necessarily cross the centre of the device), which in this example forms a straight axis, but the longitudinal axis could be curved instead.
  • the first type doped regions 5 and the separating regions 7, which are arranged between the first type doped regions 5 to isolate the first type doped regions from each other are arranged along an imaginary first type semiconductor region reference line extending across the device 1 between its extremities. It is to be noted that the first type doped regions 5 and the separating regions 7 do not have to form a 90-degree angle with respect to the longitudinal device axis, but these regions could be instead angled with respect to the longitudinal device axis.
  • the device length along the longitudinal axis of the device is typically between 10 µm and 1000 µm, or more specifically between 30 µm and 300 µm, or between 50 µm and 200 µm.
  • the device dimension along an axis perpendicular to the longitudinal axis is typically between 3 µm and 300 µm, or more specifically between 10 µm and 100 µm, or between 15 µm and 70 µm.
  • the first type doped regions 5 and the separating regions 7 face at their second ends a second type charge carrier, doped or semiconductor region 11, which in this example forms one continuous region and extends along the longitudinal device axis across the entire device between a third side 13 and a fourth side 15 of the device, which in this example are the lateral sides of the device 1.
  • the region is continuous, connected or uninterrupted in the sense that it allows charge carriers to flow or move within the continuous region.
  • the separating regions 7 are longer than the first type doped regions 5 and thus protrude into the second type doped region 11 to ensure that the first type doped regions are properly insulated from each other.
  • the first type doped regions 5 are p-type regions, while the second type doped region 11 is an n-type region.
  • the doping types in these regions could be easily reversed without affecting the functionality of the device 1.
  • the device is thus split into two different types of semiconductor regions, one which has a dominant concentration of n-type charge carriers and another which has a dominant concentration of p-type charge carriers.
  • the second type doped regions 11 form a set or series of pockets, which in this example form another set of first type doped regions, which are separated from the first type doped regions in the lower part of the device by the second type doped region 11.
  • the p-type regions 5 at the bottom are separated into several semiconductor channels by an insulating material, while the p-type regions 5 at the top are separated into channels by a respective channel of n-type material.
  • the number of p-type regions may be between 2 and 1000, or more specifically between 20 and 1000, or between 3 and 500. However, it has been discovered that reliable results can be obtained if the number of p-type regions is between 10 and 300. It is to be noted that typically fewer p-type regions makes manufacturing easier at the cost of lower granularity of the input voltage distribution.
  • the device 1 further comprises a set of input terminals 17 forming conductive metal-semiconductor interfaces, which can be used to apply bias or input signals, which in this example are voltages, at various points.
  • the input terminals have a conductive interface with the semiconductor to allow current to flow in or out of the device.
  • the input terminals are in this example located at the first end of a respective p-type region. However, other alternative terminal locations would also be possible.
  • the input terminals 17 are in this example placed both in the lower and upper parts of the device 1, although it would be possible to have them e.g.
  • the device also comprises a set or series of output terminals 19 connected to the n-type region 11.
  • the output terminals 19 are in this example structurally substantially identical to the input terminals 17, and are arranged along the longitudinal axis of the device.
  • the output terminals are located close to the p-n interface, the location of which is however not well defined. Each bottom p-channel is thus in this example terminated at the first end with an input terminal, which can be used to apply input signals (which may be considered to function as input biasing signals) as required for the emulation process.
  • the output terminals are placed along each p-type channel in the n-type material where output voltages (which may also be considered to function as output biasing signals) can be applied and the output signals (in this example currents) from these terminals can be used to characterise the solution for the emulated problem.
  • the input signals and the output biasing signals are in the present example voltage signals.
  • the first type doped regions 5, the input terminals 17 and the output terminals 19 are evenly spread along the device between the two lateral sides.
  • the positions or the number of the output terminals 19 does not depend on the number of the input terminals 17 and/or the positions or number of the p-type regions.
  • the arrangement of the input and output terminals remains unaffected if the p-type and n-type regions are inverted.
  • the interface of the p-type and n-type regions forms a series or set of semiconductor p-n-junctions or unidirectional charge-flow barriers 20 within the device 1 that allow for n-type carriers to be contained within the device 1.
  • a p-n junction can be understood to be a boundary or interface between two types of semiconductor materials, namely p-type and n-type, inside a single crystal of semiconductor.
  • the p-side (positive side) contains an excess of holes, while the n-side (negative side) contains an excess of electrons in the outer shells of the electrically neutral atoms there. This allows electric current to pass through the junction 20 in one direction only (assuming a closed-circuit configuration).
  • the charge-flow barriers 20 thus prevent the flow of first type charge carriers from the first type doped regions 5 to the second type doped region 11, and prevent the flow of second type charge carriers from the second type doped region 11 to the first type doped regions 5 in the absence of an external force to help the first and second type charge carriers to overcome the charge-flow barrier.
  • the first type doped regions 5 comprise predominantly first type charge carriers (i.e., the first type charge carrier regions comprise more first type charge carriers than second type charge carriers) while the second type doped region 11 comprises predominantly second type charge carriers (i.e., the second type charge carrier region comprises more second type charge carriers than first type charge carriers) having an opposite electric charge to the first type charge carriers.
  • Various imaginary reference lines may be defined in the device 1.
  • one or more charge-flow barrier reference lines RL1 are defined such that each of them crosses the p-n junctions and extends across the device 1.
  • One or more first type charge carrier region reference lines RL2 are defined such that the p-type regions are arranged along the first type charge carrier region reference lines RL2.
  • An output terminal reference line RL3 is defined such that the output terminal reference line RL3 crosses the output terminals 19. In the example of Figure 1, all these three reference lines are parallel. Furthermore, in this example all these reference lines are straight lines.
  • during an emulation phase, in which input signals are applied to the device 1 but no direct signals are applied to the n-region, the charge carriers rearrange in the device, and some transient current (i.e., positive charge carriers) moves into the n-region across the p-n junction 20.
  • the emulation takes place in the n-region, but close to the p-n junctions 20.
  • the distribution of the n-type carriers under such a closed system while applying input signals to the input terminals 17 allows for emulation of the required gradient flow equation.
  • At the end of the emulation phase, the device reaches a steady state or equilibrium during which substantially no charge carrier rearrangement occurs.
  • during a measurement phase, in which output biasing signals are applied to the output terminals 19 for a short duration of time (e.g., a duration between 1 µs and 100 µs) while input signals are continuously applied to the input terminals 17, output signals are measured to obtain the emulation result.
  • the measurement phase may even be repeated several times so that it repeatedly and briefly interrupts the emulation phase.
  • the final solution to the problem can be obtained only when the steady state has been reached.
  • the respective output signal which in this example is current, is proportional to the n-type charge density near the respective output terminal 19 and can thus be used to indirectly infer the charge distribution in the device, which in turn gives the solution to the emulated problem.
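The inference step above can be sketched in a few lines of Python. This is a minimal illustration, not the measurement technique of the invention: it assumes that every output terminal has the same proportionality constant between current and n-type charge density, that all currents are non-negative, and that the terminal positions are known. The function name `density_from_currents` is hypothetical.

```python
def density_from_currents(positions, currents):
    """Turn measured output currents into a normalised density estimate.

    Assumption: current at each terminal = k * local charge density, with
    the same unknown constant k for every terminal, so k cancels when the
    profile is normalised to unit area (trapezoidal rule).
    """
    if len(positions) != len(currents) or len(positions) < 2:
        raise ValueError("need matching lists with at least two terminals")
    area = sum(
        0.5 * (currents[i] + currents[i + 1]) * (positions[i + 1] - positions[i])
        for i in range(len(positions) - 1)
    )
    if area <= 0:
        raise ValueError("currents must integrate to a positive value")
    return [c / area for c in currents]


# Example: three terminals at positions 0, 1 and 2 (arbitrary units).
print(density_from_currents([0.0, 1.0, 2.0], [0.0, 2.0, 0.0]))
```

Normalising removes the unknown proportionality constant, which is why only the relative sizes of the currents matter in this sketch.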
  • the device of Figure 1 can be used to emulate gradient flow problems in one dimension.
  • Figure 2 shows a variant of the device of Figure 1.
  • the device 1 of Figure 2 is able to emulate two-dimensional flows as two spatial dimensions are available. If a further dimension is added, then the device would be able to emulate three-dimensional flows. Thus, a single device would be able to emulate at most three-dimensional flows as at-most three spatial device dimensions are available.
  • Figure 3 shows an arrangement that can be used to solve multi-dimensional gradient flow problems. More specifically, multiple devices 1 can be interconnected as shown in Figure 3 to emulate the gradient flow for such problems.
  • the example device system 21 of Figure 3 includes five devices 1. As shown in Figure 3, multiple devices can be dynamically connected together by using a bus or channels or interconnectors 23, and more specifically semiconductor channels. These channels are in the following description called consensus channels denoting their mathematical function as explained later.
  • the consensus channels comprise an n-type element and a p-type element in direct contact with each other and form one or more longitudinal channels for connecting various devices 1 together. Any given device may thus be connected to one or more consensus channels 23.
  • the channels are thus used to connect the various devices at one or more consensus points or terminals 25 to one or more other devices 1.
  • the consensus terminals 25 may thus be considered to be a subset of the input terminals 17.
  • the consensus channels further comprise control terminals 27, operating as biasing terminals for applying biasing signals, which in this example are voltages, to the consensus channel 23.
  • the control terminals are used in a consensus control process as explained later.
  • Figure 3 also shows multiplexers or switches 29 for dynamically changing the configuration of the channel connections to configure the device system 21 to emulate different problems.
  • the purpose of the channels 23 is to equalise the carrier concentration in all the connected devices 1 near the points of interconnection (i.e., near the consensus terminals 25) to bring all the connected devices in charge carrier consensus with each other.
  • a series of biasing voltages can be applied at the control terminals 27 placed along the consensus channel. These biasing voltages can be used to control the flow of carriers across the consensus channel to drive the devices into consensus.
  • spatially varying biasing signals are applied along a respective consensus channel to actively and quickly bring the devices into consensus.
  • the biasing signals also advantageously vary over time.
  • control terminals 27 may all be connected to equal voltage potential or substantially equal voltage potential to allow for passive equalisation of the carrier concentrations using the inherent carrier diffusion mechanism of a semiconductor. According to this passive consensus process, the equal voltage potential along a respective consensus channel 23 would be the same potential as applied to the consensus terminal 25 of that channel.
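The passive equalisation described above can be pictured with a toy diffusion model. The sketch below is a purely mathematical illustration under stated assumptions: the carrier concentrations near the consensus terminals of the connected devices are treated as a chain of values that exchange carriers with their neighbours at a fixed rate. The function name `passive_consensus`, the chain topology and the `rate` parameter are hypothetical and do not model the actual semiconductor channel.

```python
def passive_consensus(concentrations, rate=0.25, steps=500):
    """Equalise values along a chain by neighbour-to-neighbour diffusion.

    Each step moves an amount proportional to the concentration difference
    from the higher neighbour to the lower one, so the total is conserved
    and all values approach the common mean.
    """
    c = list(concentrations)
    for _ in range(steps):
        # Compute all exchanges from the current state before applying them.
        exchange = [rate * (c[i + 1] - c[i]) for i in range(len(c) - 1)]
        for i, flow in enumerate(exchange):
            c[i] += flow
            c[i + 1] -= flow
    return c


print(passive_consensus([1.0, 3.0, 5.0]))
```

With the default rate the update is stable for chains of any length, and the values converge to the mean of the initial concentrations.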
  • the device design and interconnections as shown in Figure 3 allow the emulation of the gradient flow PDE for N-dimensional problems using drift-diffusion of charges in a network of such semiconductor devices.
  • drift-diffusion refers to the movement of charges due to two separate influences, (i) drift of charges, due to the influence of a spatially varying voltage on a charged particle, and (ii) diffusion, which refers to the net movement of particles from a region of higher concentration to a region of lower concentration of the particles.
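The two influences can be illustrated numerically. The following is a minimal software sketch, not the device itself: it assumes a one-dimensional domain, unit diffusivity and mobility, a hypothetical potential-energy profile `potential`, and zero carrier flux at both ends. Under these assumptions the density should settle close to a profile proportional to exp(-U(x)).

```python
import math


def simulate_drift_diffusion(potential, n_cells=21, length=1.0, steps=4000):
    """Explicit finite-volume drift-diffusion with zero-flux boundaries."""
    h = length / (n_cells - 1)
    dt = 0.2 * h * h                      # stability margin for unit diffusivity
    xs = [i * h for i in range(n_cells)]
    u = [potential(x) for x in xs]
    rho = [1.0] * n_cells                 # uniform initial carrier density
    for _ in range(steps):
        # flux[i] is the carrier flux from cell i to cell i + 1.
        flux = []
        for i in range(n_cells - 1):
            diffusion = -(rho[i + 1] - rho[i]) / h
            drift = -0.5 * (rho[i] + rho[i + 1]) * (u[i + 1] - u[i]) / h
            flux.append(diffusion + drift)
        new_rho = rho[:]
        for i in range(n_cells):
            inflow = flux[i - 1] if i > 0 else 0.0             # closed left end
            outflow = flux[i] if i < n_cells - 1 else 0.0      # closed right end
            new_rho[i] += dt * (inflow - outflow) / h
        rho = new_rho
    return xs, u, rho


xs, u, rho = simulate_drift_diffusion(lambda x: 4.0 * (x - 0.5) ** 2)
print(rho[10] / rho[0], math.exp(u[0] - u[10]))
```

The printed pair compares the simulated density ratio between the centre and the edge with the ratio predicted by exp(-U); the two should agree to within the discretisation error.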
  • each device 1 is used to emulate one-dimensional gradient flow PDE.
  • the device system 21 is used to emulate a general N-dimensional gradient flow by allowing each device to emulate the flow for a different dimension and/or part of the computational space; and making interconnections between the devices to enforce consensus between the devices to obtain the global N-dimensional solution.
  • it would also be possible to use each device to emulate two-dimensional or three-dimensional flows and to make similar networks.
  • a new consensus-based algorithm will be explained later in more detail to solve an N-dimensional gradient flow problem using a collection of one-dimensional (or R < N dimensional) gradient flow problems. This allows for numerical PDE solvers to solve the problem using meshes in lower dimensional spaces making the numerical solver faster than directly solving on an N-dimensional mesh. It also allows the above emulation using a network of semiconductor devices.
  • semiconductor physics restricts the kind of electric fields that can exist within the substrate. This makes it impossible to use the drift-diffusion part of the physics in a bulk substrate to emulate a gradient flow PDE by applying an appropriate electric field/potential field. Instead, a sequence of p-n junctions 20 is used in the substrate according to the present invention.
  • the p-n bulk regions act as reserve for excess charge carriers that can supply p or n carriers as required to support any desired electric field at the p-n junction interface.
  • a desired electric field for the gradient flow emulation can be established and the resultant equilibrium of carrier densities at the junction provides the solution to the gradient flow PDE.
  • drift currents (i.e., in this case the output currents during the measurement phase(s)) are proportional to charge densities.
  • a probing electric field (vertical in the configuration of Figure 1), which is generated by the applied output biasing signals and which is orthogonal to the gradient flow emulation electric field (horizontal in the configuration of Figure 1), can be applied to measure the output current and thus indirectly infer the charge density at the given point.
  • Another challenge for emulating the gradient flow PDE is in imposing a zero-current flow at the external boundaries of the device (the current flow during the measurement is only a small perturbation to the process, and which is removed after a short duration).
  • Applying an electric field to a substrate requires electric contacts to be made (by means of the input and output terminals) in order to apply appropriate spatially varying voltages to the input and output terminals. If the input terminals are directly connected to the substrate that emulates the gradient flow (in this case the n-region), this would result in a current flow through the substrate and thus violate the required boundary condition for the gradient flow PDE. Instead, the p-n junction 20 is used to prevent such current flow.
  • any current flow out of the device can be prevented.
  • the process of applying the input signals (or voltages) to the input terminals applies the same input voltage at the p-n junction.
  • p and n are interchangeable depending on the choice of which type of carrier is to be used for the gradient flow emulation.
  • in step 101, problem variables are defined.
  • a definition is obtained for the variables whose value is to be solved for, henceforth called the problem variables.
  • the problem variables could be a parameter X whose value depends on matrix A and vector B, where A and B are in this example external data.
  • a problem variable space referred to as a computational space
  • a definition is obtained for the collection of all possible values taken by the variables. This collection is assumed to be representable using a continuous subset of numeric values in $\mathbb{R}^N$, henceforth called the computational space. $\mathbb{R}^N$ is understood to be the N-dimensional real number space, or the Euclidean space.
  • problem functions are defined. In other words, a set of functions is obtained that can be evaluated for any given value of the variables, henceforth called problem functions.
  • a problem function takes as inputs values for the problem variables and optionally can take inputs for a distribution of the problem variables, previous output values of the problem function and external problem related data, like the matrices A, B in the linear algebra problems mentioned above or external data used for training an artificial intelligence system.
  • a computer programming language may be used to provide such a description.
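As an illustration of such a description, the sketch below expresses a problem function in Python. It is only an example under stated assumptions: the problem variable is a vector `x`, the external data are a matrix `a` and a vector `b` (in the spirit of the A and B mentioned above), and the squared residual of `a x = b` is chosen as the quantity to evaluate. The name `make_problem_function` and the choice of a least-squares form are hypothetical, not taken from the source.

```python
def make_problem_function(a, b):
    """Bind external data (matrix a, vector b) into a problem function."""

    def problem_function(x):
        # Residual of each row: (a @ x)[i] - b[i], written without NumPy.
        residuals = [
            sum(a_ij * x_j for a_ij, x_j in zip(row, x)) - b_i
            for row, b_i in zip(a, b)
        ]
        # Sum of squared residuals; zero exactly when a x = b.
        return sum(r * r for r in residuals)

    return problem_function


f = make_problem_function([[2.0, 0.0], [0.0, 1.0]], [2.0, 3.0])
print(f([1.0, 3.0]), f([0.0, 0.0]))
```

The closure keeps the external data separate from the problem variables, so the same function can be evaluated for any point of the computational space.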
  • Other ways to provide the description may include pen-paper descriptions or hardware implementations of the functions.
  • the user can also specify a few hardware configuration options, described in connection with steps 107 to 113.
  • in step 107, the computational space in $\mathbb{R}^N$ is partitioned using a collection of n-dimensional polytopes, where the partition may cover the entire computational space or be an approximate cover that maximises the coverage of the space according to a given metric (for example a computational space of a sphere shape could be approximated by using a set of cubes).
  • a wide number of choices for partitioning methods and coverage metrics are available from the state-of-the-art for computational geometry that include methods for partitioning subsets in $\mathbb{R}^N$ using simplexes (generalisation of triangles in $\mathbb{R}^N$) and gridding using hypercubes (for instance according to the teachings of S. Bettayeb, Z. Miller and I. H. Sudborough, “Embedding grids into hypercubes”, Journal of Computer and System Sciences, vol. 45, no. 3, pp. 340-366, 1992), as examples.
  • a coverage metric is understood to be a function that gives an absolute value as an output describing the difference between the real computational space and its approximation.
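A coverage metric of this kind can be sketched for the two-dimensional analogue of the sphere-and-cubes example. The code below assumes a disc of a given radius approximated by the square cells of a regular grid whose centres lie inside the disc, and returns the absolute difference between the two areas. The function name `disc_coverage_error` and the centre-inside selection rule are assumptions made for illustration.

```python
import math


def disc_coverage_error(radius, cell):
    """Absolute area difference between a disc and its square-cell cover."""
    n = math.ceil(radius / cell)
    covered = 0
    for i in range(-n, n):
        for j in range(-n, n):
            # Centre of the grid cell with lower-left corner (i*cell, j*cell).
            cx, cy = (i + 0.5) * cell, (j + 0.5) * cell
            if cx * cx + cy * cy <= radius * radius:
                covered += 1
    return abs(math.pi * radius ** 2 - covered * cell ** 2)


print(disc_coverage_error(1.0, 0.5))
print(disc_coverage_error(1.0, 0.05))
```

A partitioning method could compare such values for several candidate grids and keep the one with the smallest error.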
  • the partitioning provides a collection of edges 31, referred to as line segments, and vertices 33 in $\mathbb{R}^N$, belonging to the partitioning polytopes, which are used in steps 109 and 111 to map the partitioned computational space to a collection or set of devices, and to configure interconnections between the devices, respectively.
  • a vertex is understood to be a point where two or more curves, lines or edges meet.
  • Figure 5 shows an example of a partitioned square in a two-dimensional computational space as an example of the output of step 107.
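The output of such a partitioning step can be sketched as a list of vertices and a list of line segments. The example below assumes a regular grid of square cells over a square computational space; whether Figure 5 uses this particular partition is not stated here, and the function name `partition_square` is hypothetical.

```python
def partition_square(side, cells_per_side):
    """Partition a square into grid cells; return vertices and segments.

    Vertices are (x, y) tuples; each segment is a pair of vertex indices.
    """
    n = cells_per_side
    h = side / n
    vertices = [(i * h, j * h) for j in range(n + 1) for i in range(n + 1)]

    def index(i, j):
        return j * (n + 1) + i

    segments = []
    for j in range(n + 1):
        for i in range(n + 1):
            if i < n:                      # horizontal edge to the right
                segments.append((index(i, j), index(i + 1, j)))
            if j < n:                      # vertical edge upwards
                segments.append((index(i, j), index(i, j + 1)))
    return vertices, segments


vertices, segments = partition_square(1.0, 2)
print(len(vertices), len(segments))
```

For n cells per side this yields (n + 1)² vertices and 2n(n + 1) line segments, which are the objects later mapped to devices.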
  • the method and coverage metric for partitioning can be specified using default settings or user input.
  • the partition of the computational space from step 107 provides a collection of line segments 31 and vertices 33 that in step 109 can be mapped to a collection of devices 1 of the type described earlier with reference to Figure 1 or 2, for instance.
  • the mapping takes each line segment 31 in the partition and assigns to it (i) a physical device 1, and (ii) a continuous region or segment (which has not yet been assigned to any line segment in the computation space) within the physical device along a device solution axis 35.
  • the device solution axis may be defined by an imaginary (straight or substantially straight) reference line passing through the p-n junctions 20 in a given device, and thus it coincides with the charge-flow barrier reference line RL1.
  • the device solution axis 35 is defined in the device space.
  • Geometrical objects like points, lines or curves can be mapped from the computational space to a representation in the device space with the help of mathematical transformations like projection, rotation, translation and scaling. These transformations can also be inverted to map geometrical objects given in the device space to geometrical objects in the computational space.
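A minimal sketch of such an invertible transformation is given below for a single line segment. It assumes that a point is first projected onto the segment and then scaled and translated onto an interval of the device solution axis; the function names, the parameter names and the use of micrometres for the axis interval are assumptions made for illustration.

```python
def to_device_axis(point, seg_start, seg_end, axis_start, axis_end):
    """Map a point of the computational space to a position on the axis."""
    direction = [e - s for s, e in zip(seg_start, seg_end)]
    length_sq = sum(d * d for d in direction)
    # Projection parameter: 0 at seg_start, 1 at seg_end.
    t = sum((p - s) * d for p, s, d in zip(point, seg_start, direction)) / length_sq
    return axis_start + t * (axis_end - axis_start)


def to_computational_space(position, seg_start, seg_end, axis_start, axis_end):
    """Inverse map: a position on the axis back to a point on the segment."""
    t = (position - axis_start) / (axis_end - axis_start)
    return tuple(s + t * (e - s) for s, e in zip(seg_start, seg_end))


# Segment from (0, 0) to (2, 2) mapped onto axis positions 10 to 110.
pos = to_device_axis((1.0, 1.0), (0.0, 0.0), (2.0, 2.0), 10.0, 110.0)
print(pos, to_computational_space(pos, (0.0, 0.0), (2.0, 2.0), 10.0, 110.0))
```

The inverse map is what allows a charge density measured at a device position to be attributed to a point of the computational space.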
  • one single device may comprise several parallel device solution axes. Each device solution axis 35 may thus be understood to define one geometrical object in the device space.
  • a given device solution axis representing a line also represents a line in the computational space, through its inverted transformation.
  • the device configuration of Figure 2 defines six vertical and six horizontal device solution axes in addition to 9 + 9 diagonal device solution axes.
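These counts can be reproduced with a short enumeration, assuming that the p-n junctions of Figure 2 lie on a 6-by-6 square lattice and that a device solution axis is any horizontal, vertical or 45-degree line through at least two lattice points. Both the lattice size and this definition are assumptions chosen because they match the stated numbers; the function name `count_solution_axes` is hypothetical.

```python
from collections import Counter


def count_solution_axes(k):
    """Count axis-aligned and 45-degree lines through >= 2 points of a k x k lattice."""
    points = [(i, j) for i in range(k) for j in range(k)]
    diag = Counter(i - j for i, j in points)   # lines of slope +1
    anti = Counter(i + j for i, j in points)   # lines of slope -1
    return {
        "vertical": k,
        "horizontal": k,
        "diagonal": sum(1 for n in diag.values() if n >= 2),
        "anti_diagonal": sum(1 for n in anti.values() if n >= 2),
    }


print(count_solution_axes(6))
```

Under these assumptions the two corner diagonals in each direction contain a single point and are excluded, which leaves nine diagonals per direction.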
  • the mapping in step 109 is done by first assigning to each line segment obtained from step 107, a new device and one complete or substantially complete device solution axis of that device.
  • Since the individual line segments can be freely scaled, translated and/or rotated when doing the mapping, it is possible to map each line segment from the computational space into a given device 1.
  • A mapping table is advantageously created to store the mapping result. Further optimisation of such an assignment can be done to reduce the number of devices in the configuration as follows:
  • (i) A collection of collinear and connected line segments 31 from the computational space can be assigned to the same device 1 and the same device solution axis if the total length of the connected segments can be scaled by multiplying with a numeric constant, such that the vertices 33 of the segments overlap (or coincide) with the positions of the input terminals 17 and/or consensus terminals 25 (each position of a respective terminal has a given length along the respective device) on the device solution axis. It is to be noted that the positions of the input and consensus terminals have already been defined earlier when the devices were manufactured or optimised for a particular problem description in a separate step before the manufacturing.
  • (ii) The optimisation from (i) can be extended as follows: a collection of connected segments can be assigned to the same device 1 if all the segments can be scaled by multiplying with a numeric constant (maintaining congruency of the angles between the segments) and the vertices of the segments overlap with positions of the input terminals 17 and/or consensus terminals 25 of the device. This is thus an additional condition to condition (i).
  • If a line segment 31 cannot be grouped together with other line segments 31 according to optimisations (i) or (ii), the line segment retains its initial assignment of a separate device and its complete device solution axis. Furthermore, grouping together larger collections of line segments on the same device solution axis leads to a reduction in the spatial resolution with which points can be represented using a device solution axis.
  • An additional condition can thus optionally be checked before applying optimisations (i) or (ii): if the grouping of a collection of segments onto a single axis due to optimisation (i) or (ii) reduces the spatial resolution below a minimum threshold (specified via a default setting or user input), then the optimisation is not applied to that collection.
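The grouping test of optimisation (i) can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure of the application: vertices and terminal positions are given as sorted one-dimensional coordinates along a line, the first vertex is pinned to the first terminal, and the names `find_scale` and `tol` are hypothetical.

```python
def find_scale(vertices, terminals, tol=1e-9):
    """Return a scale factor that maps every vertex onto a terminal, or None.

    vertices  -- sorted 1-D coordinates of the segment vertices (computational space)
    terminals -- sorted 1-D positions of input/consensus terminals on the axis
    Assumption: the first vertex is placed on the first terminal.
    """
    v0, t0 = vertices[0], terminals[0]
    span = vertices[-1] - v0
    # Try each remaining terminal as the image of the last vertex.
    for t_end in terminals[1:]:
        scale = (t_end - t0) / span
        mapped = [t0 + scale * (v - v0) for v in vertices]
        # Every mapped vertex must coincide with some terminal position.
        if all(any(abs(m - t) <= tol for t in terminals) for m in mapped):
            return scale
    return None
```

If no scale factor is found, the segments keep their initial assignment of one device per segment, as described above.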
  • each line segment 31 is assigned to a device 1 and a device solution axis, and the vertices 33 of the line segments 31 are represented by some input terminals and/or consensus terminals on that device. It is to be noted that additional input terminals and/or consensus terminals not coinciding with the vertices may be present. These terminals are mapped back to the computational space in step 111. Some input terminals and/or consensus terminals in a device have a fixed representation in the computational space, corresponding to the line segment vertices.
  • each device solution axis has been assigned to a collection of collinear line segments in step 109 by scaling the collection with a constant multiplication factor.
  • any input or consensus terminals on a given axis can be mapped back to a point on the collinear line segments in the computational space by an inverse transformation by using the constant multiplication factor or its inverse value.
  • This process is also illustrated in Figure 6.
  • all consensus and input terminals are assigned a corresponding point in the computational space, which is used in step 113 to configure the interconnections between devices 1. It is to be noted that in the present description, all or at least some of the constant factors may be replaced by constant transformation matrices.
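The mapping of a terminal back to the computational space in step 111 can be sketched as a simple affine inversion. This is a minimal illustration assuming one straight segment per axis; the function name and its parameters are hypothetical, and a constant transformation matrix could replace the scalar factor.

```python
def axis_to_computational(s, s_start, s_end, p_start, p_end):
    """Map a terminal position s on a device solution axis back to a point.

    s_start, s_end -- axis positions of the two segment vertices
    p_start, p_end -- the same two vertices in the computational space
    """
    # Fractional position of the terminal between the two vertices.
    frac = (s - s_start) / (s_end - s_start)
    # Interpolate each coordinate between the vertices in the computational space.
    return tuple(a + frac * (b - a) for a, b in zip(p_start, p_end))
```

For example, a terminal a quarter of the way along the axis maps to the point a quarter of the way along the corresponding line segment.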
  • In step 113, the device interconnections are configured.
  • the consensus terminals 25 on two or more devices are configured to be connected together using a consensus channel 23 if the consensus terminals 25 represent the same point in the computational space.
  • Such connections can be made using a permanent consensus channel interconnection between the devices if the chip containing the devices is to be permanently configured to solve a single defined problem.
  • Alternatively, the interconnections can be implemented via a reconfigurable bus interconnection system that uses the switches 29 to connect or disconnect the consensus channels 23 to different consensus terminals 25, when the chip is meant to be reconfigurable to solve different problem definitions.
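The rule for configuring the interconnections (connect consensus terminals on two or more devices that represent the same point) can be sketched as a grouping step. This is an illustrative sketch only: the `(device_id, terminal_id)` keys, the rounding tolerance `decimals` and the function name are assumptions.

```python
from collections import defaultdict

def group_into_channels(terminal_points, decimals=9):
    """Group consensus terminals that represent the same computational-space point.

    terminal_points maps (device_id, terminal_id) -> point (tuple of floats).
    Returns one sorted list of terminals per consensus channel.
    """
    by_point = defaultdict(list)
    for key, point in terminal_points.items():
        # Round coordinates so that nearly identical points share one key.
        by_point[tuple(round(c, decimals) for c in point)].append(key)
    channels = []
    for members in by_point.values():
        devices = {device_id for device_id, _ in members}
        # A channel is only needed when at least two devices share the point.
        if len(devices) >= 2:
            channels.append(sorted(members))
    return channels
```

Each returned group would correspond to one consensus channel 23, either wired permanently or selected through the switches 29.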
  • In step 115, it is determined whether or not an updated distribution of variables is available from a feedback loop (between the output terminals of the device system 21 and the input terminals 17). If this distribution is available, then in step 117, this distribution is obtained and is set as a distribution of variables for the subsequent processing. In device-level operations involving feedback from output measurements, a distribution of variables can be set using the measured output distribution, after a first or any subsequent solution iteration. If, on the other hand, no distribution is available from the feedback loop, for example if no feedback loop exists, then in step 119, an initial distribution of variables is defined. An initial distribution of variables in the computational space can be defined, for example, as a uniform distribution over the entire computational space.
  • In step 121, external problem data, such as a learning data set, is obtained.
  • the defined problem functions can depend on the external data to be fed into the problem solver (i.e., the system solving the defined problem) during execution.
  • the data can be collected at once, made available in batches or be streamed as a continuous time signal.
  • In step 123, the values of the problem functions at device terminal points (in this example the locations of the input terminals 17 and the consensus terminals 25 collectively defining the terminal points) are determined or computed.
  • The computation of the values of the problem functions takes the external data, the terminal points representation in the computational space from step 111, and the current distribution of the problem variables, and evaluates the problem functions obtained in step 105, giving an output stream P of problem function values evaluated at the input and consensus terminals.
  • steps 115 to 125 describe the preprocessing operation of the device system 21. However, steps 115 to 121 are optional.
  • In step 127, the input voltage signals obtained for all the device input terminals 17 and optionally also for one or more consensus terminals 25 from step 125 are applied at the corresponding device terminals.
  • The drift-diffusion processes in the device group are evolved for k seconds before taking an output measurement, where k is a constant greater than or equal to 0, specified using a default settings value or a user input.
  • In step 129, the charge distribution in the device group is measured. More specifically, in this step the output biasing signals are applied to the output terminals 19 to bias the p-n junctions 20 and the output current is measured.
  • Since the output current driven by a voltage difference between the input and output terminals is proportional to the charge density between the terminals, the output currents provide a means to estimate the charge distribution at various points in the device.
  • The p-n junctions 20 for the measurement can then take advantage of forward or reverse biasing, where a p-n junction is said to be forward biased when the voltage applied to the p-region is greater than the voltage applied to the n-region, and reverse biased when the voltage applied to the n-region is greater than the voltage applied to the p-region.
  • the measurement process is next explained in more detail.
  • the output terminals 19 are assigned to one or more device solution axes during the manufacturing process of the device and stored in the form of a table.
  • the table is accessible to the preprocessing and postprocessing operations.
  • In step 129, for each output terminal, one device solution axis assigned to the terminal is randomly selected.
  • The device solution axis also had one or more line segments assigned to it in step 109.
  • Using the assignment from step 109, the position of the output terminal 19 on the line segments assigned to that device solution axis is determined, and the translation, rotation and scaling operations applied for that axis in step 109 are inverted, to recover the position of the output terminal 19 in the computational space.
  • the maximum voltage value V_max (or more broadly maximum signal amplitude) applied at any input or consensus terminals in the device is found.
  • the measurement offset voltage V_offset (or output biasing signal value) is obtained from a default settings value for the device or from a user input.
  • a voltage V_offset+V_max is applied at all the output terminals 19 from which a measurement is desired. Now the current flowing in or out of the output terminals 19 can be measured, giving the output signals at those terminals.
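The choice of the output biasing voltage described above amounts to a one-line computation. The sketch below assumes the applied input and consensus voltages are available as a plain list; the function name is hypothetical.

```python
def output_bias_voltage(applied_voltages, v_offset):
    """Return V_offset + V_max for the output terminals to be measured.

    applied_voltages -- voltages currently applied at input/consensus terminals
    v_offset         -- measurement offset voltage (default setting or user input)
    """
    v_max = max(applied_voltages)  # largest amplitude applied anywhere in the device
    return v_offset + v_max
```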
  • step 131 the measured charge distribution at the output terminals 19 is transformed into a variable distribution in the computational space. This distribution along with the corresponding applied input signal is made available to the preprocessing block or operation as feedback signals for future iterations.
  • The p nearest input and/or consensus terminals to the output terminal are determined (p is an integer constant taken from a default value or from a user input).
  • The voltage differences V1, V2, ..., Vp between each of the p nearest terminals and the output terminal are determined.
  • the distances in the device d1, d2, ..., dp between each of the p nearest terminals and the output terminal are determined.
  • Let I_meas denote the current measured flowing out from the terminal.
  • The charge density at the output terminal, Q_est, is then estimated from these quantities and μ, where μ represents the mobility constant for the predominant charge carrier in the continuous region (in this case the n-type region, with the predominant charge carrier being the n-type carrier, i.e., μ being the mobility constant for the n-type carrier).
  • The value for μ is known from physical properties of the semiconductor material and can be obtained as a default settings value or as a user input.
  • Q_nominal represents the charge density for the n-type region when no input, consensus or output terminal voltages are applied to the device.
  • Q_nominal is approximately equal to the doping density in the continuous region (n-type region in this case).
  • The value for Q_nominal can be taken from a default settings value or as a user input. Let there be a total of N output terminals 19, each labelled with a unique number between 1 and N.
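The inputs to the charge density estimate (the p nearest terminals, their voltage differences and their distances to the output terminal) can be gathered as sketched below. The sketch deliberately stops before the estimate itself. It assumes Euclidean distances in the device, a "terminal minus output" sign convention for the voltage differences, and a hypothetical data layout of `(position, voltage)` pairs.

```python
import math

def nearest_terminal_inputs(output_pos, output_voltage, terminals, p):
    """Collect V1..Vp and d1..dp for the p terminals nearest to an output terminal.

    terminals -- list of (position, voltage) pairs for input/consensus terminals,
                 with positions given as coordinate tuples in the device
    """
    # Rank terminals by Euclidean distance to the output terminal, keep p of them.
    ranked = sorted(terminals, key=lambda t: math.dist(t[0], output_pos))[:p]
    # Assumed sign convention: terminal voltage minus output-terminal voltage.
    voltage_diffs = [v - output_voltage for _, v in ranked]
    distances = [math.dist(pos, output_pos) for pos, _ in ranked]
    return voltage_diffs, distances
```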
  • The output of the postprocessing block or operation is the variable distribution in the computational space and is taken as the final output when the device system 21 is observed to have reached a steady state, i.e., when its change from one iteration to another over a predefined number of iterations, or a predefined length of time, is seen to be smaller than a threshold value (defined using a default setting or user input).
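The steady-state test can be sketched as a check on the most recent iterations. The change metric used here (largest absolute change between consecutive distributions) is an assumption, as are the names `window` and `threshold`.

```python
import numpy as np

def reached_steady_state(history, window, threshold):
    """Return True if the last `window` iteration-to-iteration changes are small.

    history   -- list of measured distributions, oldest first
    window    -- predefined number of consecutive changes to inspect
    threshold -- maximum allowed change (default setting or user input)
    """
    if len(history) < window + 1:
        return False  # not enough iterations observed yet
    recent = [np.asarray(h, dtype=float) for h in history[-(window + 1):]]
    # Largest absolute change between each pair of consecutive distributions.
    changes = [np.max(np.abs(b - a)) for a, b in zip(recent, recent[1:])]
    return bool(max(changes) < threshold)
```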
  • Let X_boundary denote the boundary of X.
  • Consider a function that, for any point x ∈ X_boundary, maps x to a vector perpendicular to X_boundary at x.
  • The solution to the Bayesian inference problem is given by the steady-state distribution ρ_SS satisfying the above PDEs with their boundary and initial conditions.
  • The conditions on the problem function V also ensure that the steady-state distribution does not depend, up to a constant multiplication factor, on the initial distribution ρ_0 used for the initial time condition.
  • For any choice of ρ_0, we can thus obtain an approximation to the solution of the Bayesian inference problem using the evolution of ρ in finite time, as described above, so no explicit choice of ρ_0 needs to be imposed.
  • The resulting approximation ρ is then a distribution on X proportional to the solution (probability) distribution of the Bayesian inference problem, and the proportionality constant can be deduced by integrating ρ over the set X.
  • the term denotes the vector dot product of the gradient terms with the vector perpendicular to the boundary.
  • The Bayesian inference problem and its system of PDEs and boundary conditions can be viewed as a special case of the more general optimal transport problem, which has been shown in the book by Ambrosio, Luigi, Nicola Gigli, and Giuseppe Savaré, “Gradient flows: in metric spaces and in the space of probability measures”, Springer Science & Business Media, 2008, to be solvable using a gradient flow PDE together with boundary and initial conditions as given by BC-I and IC-I, respectively.
  • F is a given function of ρ, derived from the optimal transport problem desired to be solved.
  • The system of devices 1 allows the emulation of gradient flow PDEs of the form described in PDE-I and PDE-II with boundary and initial time conditions as described in BC-I and IC-I, respectively.
  • The charge distribution at the p-n junctions 20 in the device system 21 approximates the steady-state distribution ρ_SS for the problem the system is configured to solve, and thus provides a solution to the above problems by simply measuring the spatial distribution of charges along the various imaginary reference lines passing through the p-n junctions, as described above.
  • the input signals to the devices 1 are used to establish a spatial distribution of the voltages along the imaginary reference lines of the device, corresponding to the values of V computed at the corresponding points of the imaginary line in the computational space and multiplied by some constant of choice Z.
  • The device drift-diffusion physics equation that dictates the charge movement along this imaginary line (PDE-III) involves D, the diffusion constant for the charge carriers in the n-type region, and A, the thermal voltage of the device (both depend on the physical material used to build the device and on environmental factors like temperature, and are known from experimental data for many semiconductor materials).
  • Any ρ_SS that is a steady-state solution for PDE-III is also a steady-state solution for the same equation without the factor D, i.e., the constant multiplication factor D does not affect the steady-state solution itself.
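The independence of the steady state from D can be checked numerically. The sketch below assumes the textbook one-dimensional drift-diffusion form ∂ρ/∂t = D ∂/∂x(∂ρ/∂x + (ρ/A) ∂V/∂x) with zero flux at both ends of the axis, whose steady state is proportional to exp(-V/A). This form, its sign convention, the explicit finite-difference scheme and all names are assumptions for illustration and may differ from PDE-III.

```python
import numpy as np

def evolve_drift_diffusion(V, h, D=1.0, A=1.0, t_end=5.0):
    """Evolve a uniform initial distribution under 1-D drift-diffusion.

    V -- potential sampled on a uniform grid with spacing h
    D -- diffusion constant, A -- thermal voltage (assumed units are consistent)
    """
    V = np.asarray(V, dtype=float)
    rho = np.full(V.size, 1.0 / V.size)   # uniform initial distribution
    dt = 0.4 * h * h / D                  # time step within the explicit stability limit
    steps = int(np.ceil(t_end / dt))
    dV = np.diff(V)
    for _ in range(steps):
        # Flux between neighbouring grid points: diffusion plus drift.
        flux = -D * (np.diff(rho) / h + 0.5 * (rho[1:] + rho[:-1]) * dV / (h * A))
        # No charge enters or leaves through the two ends of the axis.
        flux = np.concatenate(([0.0], flux, [0.0]))
        rho = rho - dt * np.diff(flux) / h
    return rho
```

Running this for two different values of D on the same potential gives the same long-time distribution, close to the normalised exp(-V/A) profile; only the speed of convergence changes.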
  • The input signals are used to create a spatial distribution of voltages in the system of the form given by Z F(ρ_k(x)) at various points x in the computational space corresponding to the input and/or consensus terminals, where ρ_k is the current distribution estimate for the variables as obtained from blocks 117 or 119 in Fig. 4b.
  • the system of devices emulates a PDE with the charge carriers as given by By choosing a large constant Z, such that is much larger, (say
  • the consensus process in the device system to solve multidimensional gradient flows is next explained in more detail.
  • An interconnection of multiple devices is used to emulate the gradient flow for any N-dimensional space (N ≥ 1).
  • groups of line segments 31 from the computational space are assigned to a group of devices 1 as described earlier.
  • the device emulates a one-dimensional gradient flow PDE as described above.
  • If the consensus terminals 25 in one or more devices 1 represent the same point in the computational space, we connect those consensus terminals using a consensus channel 23 (which can be a multi-branch channel).
  • the system of interconnected devices thus formed has one or more consensus channels, each consensus channel corresponding to a unique point in the computational space.
  • Each consensus terminal 25 is connected to the device 1 along a physical direction that is in this example orthogonal to every imaginary reference line passing through the p-n junctions 20 in the device. By making such an orthogonal connection, the flow of the charge carriers along the consensus channel is not affected by the electric field induced by spatial voltage distribution input along the reference lines.
  • the objective of connecting two or more devices using consensus channels is to equalise (make equal) the value of the charge distributions, at the locations of the consensus terminals 25 connected using the channel.
  • Each consensus channel achieves this objective using one of two possible mechanisms, that we denote as ‘passive’ and ‘active’ mechanisms.
  • The passive mechanism applies a constant voltage throughout the consensus channels 23 using one or more consensus control terminals 27.
  • The active mechanism of consensus uses, in addition to the diffusion currents of (J-1), the drift currents induced by an applied voltage difference across two or more consensus control terminals 27 in a consensus channel 23.
  • The active mechanism of consensus uses prior estimates of the charge density at the locations of the consensus terminals 25 connected using the channel. Let the consensus channel be connected to M consensus terminals 25 and let n (in general a set of continuous-time signals) denote the previously observed estimates of the charge distribution values at these terminals. Let there be a total of K consensus control terminals 27 attached to the consensus channel. The active mechanism then specifies a function H that takes the previous estimates given by n and computes K voltage values to be applied to the consensus control terminals (one value for each terminal).
  • A feedback loop, which may have one or more associated processing elements, to actively control the voltages at the consensus control terminals 27 can then be obtained by repeating the following steps: (i) measure the charge distribution values in the vicinity of the M consensus terminals 25 to get n; (ii) compute the voltages H(n) and apply them to the K consensus control terminals 27; (iii) wait for a time Δt to allow the charge distributions to evolve; (iv) go to (i) and repeat.
  • a respective feedback loop is thus connected between at least some of the output terminals 19 of a respective device and at least some of the input terminals 17 and/or at least some of the control terminals 27 of the channel 23 associated with the respective device 1 to dynamically adjust the input signals and/or the biasing signals based on the output signals.
  • the control mechanism is repeated indefinitely until the final output measurement for the solution from the output terminals is made.
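The active consensus loop, steps (i) to (iv), can be sketched with the hardware access abstracted behind callables. Everything below is illustrative: the callables, the fixed iteration count (the description repeats until the final output measurement), the toy channel model and the proportional rule `toy_h` standing in for H are assumptions.

```python
def run_active_consensus(measure, compute_voltages, apply_voltages, wait, iterations):
    """Repeat measure -> compute H -> apply -> wait, then return the last estimates."""
    for _ in range(iterations):
        estimates = measure()                    # (i) charge estimates at the terminals
        voltages = compute_voltages(estimates)   # (ii) control voltages from H
        apply_voltages(voltages)
        wait()                                   # (iii) let the charge distributions evolve
    return measure()

class ToyChannel:
    """Toy stand-in for a channel: each wait shifts charges by the applied voltages."""
    def __init__(self, charges):
        self.charges = list(charges)
        self.voltages = [0.0] * len(self.charges)
    def measure(self):
        return list(self.charges)
    def apply(self, voltages):
        self.voltages = list(voltages)
    def wait(self):
        self.charges = [q + v for q, v in zip(self.charges, self.voltages)]

def toy_h(estimates, gain=0.5):
    """Hypothetical H: push each terminal towards the mean of the estimates."""
    mean = sum(estimates) / len(estimates)
    return [gain * (mean - n) for n in estimates]
```

With `ToyChannel([1.0, 3.0, 8.0])` the measured values converge towards their common mean, which is the equalisation objective stated above. A real H would depend on the channel geometry and carrier physics.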
  • the consensus mechanism allows for a system of interconnected devices existing in an at-most three-dimensional physical space to emulate PDEs in any general N-dimensional space, where N may even be greater than three.
  • the invention also relates to a computer program product comprising instructions for implementing at least some of the steps of the method when loaded and run on computing means of a computing device.
  • The voltage configuration (i.e., all or some of the applied signals to the device system) can be changed over time as the problem input changes or as part of a controlled feedback scheme where the output is measured and the voltage configuration is changed accordingly in order to emulate the solution process for certain optimal transport problems.
  • Instead of using semiconductor materials to build the device, there exist other mechanisms and materials, such as ionic solutions, that could implement the device.
  • The two types of charge carrier regions with a charge-flow barrier interface, as described before and used for the invention, can be implemented in other materials as well, as done with all-carbon materials following the teachings of Feng, X., Zhao, X., Yang, L. et al., “All carbon materials pn diode”, Nat Commun, vol. 9, 2018, 3750.

Abstract

The present invention relates to a gradient flow emulation method for solving a given problem as a charge distribution in a device (1) comprising: first-type charge carrier regions (5) interfacing with a second-type charge carrier region (11), thereby forming charge-flow barriers (20); separation regions (7) for separating the first-type charge carrier regions (5) from each other; input terminals (17) connected to the first-type charge carrier regions (5); and one or more output terminals (19) connected to the second-type charge carrier region (11), for measuring output signals and for receiving biasing signals for biasing the charge-flow barriers (20) during a measurement phase.
PCT/IB2019/060782 2019-12-13 2019-12-13 Émulation d'écoulement de gradient à l'aide de procédés de diffusion de dérive WO2021116743A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/IB2019/060782 WO2021116743A1 (fr) 2019-12-13 2019-12-13 Émulation d'écoulement de gradient à l'aide de procédés de diffusion de dérive

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/IB2019/060782 WO2021116743A1 (fr) 2019-12-13 2019-12-13 Émulation d'écoulement de gradient à l'aide de procédés de diffusion de dérive

Publications (1)

Publication Number Publication Date
WO2021116743A1 true WO2021116743A1 (fr) 2021-06-17

Family

ID=69137948

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IB2019/060782 WO2021116743A1 (fr) 2019-12-13 2019-12-13 Émulation d'écoulement de gradient à l'aide de procédés de diffusion de dérive

Country Status (1)

Country Link
WO (1) WO2021116743A1 (fr)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US2704818A (en) 1947-04-24 1955-03-22 Gen Electric Asymmetrically conductive device
US4300151A (en) * 1978-07-19 1981-11-10 Zaidan Hojin Handotai Kenkyu Shinkokai Change transfer device with PN Junction gates
US6949401B2 (en) * 1997-06-03 2005-09-27 Daimler Chrysler Ag Semiconductor component and method for producing the same
US20120187498A1 (en) * 2009-08-05 2012-07-26 Ning Qu Field-Effect Transistor with Integrated TJBS Diode

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
FENG, X.; ZHAO, X.; YANG, L. ET AL.: "All carbon materials pn diode", NAT COMMUN, vol. 9, 2018, pages 3750
HUANG YIPENG ET AL: "Analog Computing in a Modern Context: A Linear Algebra Accelerator Case Study", IEEE MICRO, vol. 37, no. 3, 14 June 2017 (2017-06-14), pages 30 - 38, XP011652837, ISSN: 0272-1732, [retrieved on 20170614], DOI: 10.1109/MM.2017.55 *
HUANG YIPENG ET AL: "Hybrid Analog-Digital Solution of Nonlinear Partial Differential Equations", 2017 50TH ANNUAL IEEE/ACM INTERNATIONAL SYMPOSIUM ON MICROARCHITECTURE (MICRO), ACM, 14 October 2017 (2017-10-14), pages 665 - 678, XP033536666 *
IVAN VLASSIOUK; SERGEI SMIRNOV; ZUZANNA SIWY: "Nanofluidic Ionic Diodes. Comparison of Analytical and Numerical Solutions", ACS NANO, vol. 2, no. 8, 2008, pages 1589 - 1602
S. BETTAYEB; Z. MILLER; I. H. SUDBOROUGH: "Embedding grids into hypercubes", JOURNAL OF COMPUTER AND SYSTEM SCIENCES, vol. 45, no. 3, 1992, pages 340 - 366

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19832450

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19832450

Country of ref document: EP

Kind code of ref document: A1