WO2020234900A1 - Ganaka-3: A computer and architecture operating on models

Info

Publication number
WO2020234900A1
Authority
WO
WIPO (PCT)
Prior art keywords
models
ganaka
data
computer
operands
Prior art date
Application number
PCT/IN2020/050450
Other languages
English (en)
Inventor
G. N. Srinivasa Prasanna
Original Assignee
Prasanna G N Srinivasa
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Prasanna G N Srinivasa filed Critical Prasanna G N Srinivasa
Publication of WO2020234900A1


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/29 Geographical information databases

Definitions

  • NAME ABHILASHA ASWAL, ADDRESS C/O IIIT-B, 26/C,
  • NAME ANUSHKA CHANDRABABU, ADDRESS C/O IIIT-B, 26/C,
  • NAME SUNIL KUMAR VUPPALA, ADDRESS C/O IIIT-B, 26/C,
  • the material is in a continuous state of revision, and some portions have been updated since 17th May 2019. It is up to the respected Patent Office to decide whether this update went beyond what can readily be inferred from what was intended to be disclosed on 17th May (in which case the update will have 19th May 2019 as its priority date).
  • an initial version of the original material is also enclosed as Ganaka chap3 may17 in the "drawings" category, and should have 17th May as its priority date (as communicated to the Patent Office on May 17th, and discussed by email with the Patent Office on May 18th). This initial version was itself under update on May 17th, and may have contained some of this updated material, had the upload succeeded seamlessly on May 17th.
  • This invention relates to the software/hardware architecture of computers. It enables a general computer and memory system to operate in a time- and resource-efficient fashion, using data and models of data. It is an application of the RISC philosophy to big data and machine learning.
  • this invention is an extension of ideas in POLYTOPE AND CONVEX BODY DATABASE as claimed in "POLYTOPE AND CONVEX BODY DATABASE" 2464/CHE/2012 (now US Patent 10,255,299), "CONVEX MODEL DATABASES: AN EXTENSION OF POLYTOPE AND CONVEX BODY DATABASE" 201641038613, "DECISION SUPPORT METHODS UNDER UNCERTAINTY" 1677/CHE/2008, "Motion Control Using Electromagnetic Forces", US Patent 7,348,754, the patent application 3083/CHE/2009 "Electrical Mechanisms, Design Methods and Properties" (also US 13/515,034), and related applications, and incorporates all the ideas therein by reference.
  • This invention further extends convex model databases in representing uncertainty, and in handling "big-data", both of which are major issues in data processing (exemplarily in machine learning). Some of the description is repeated here for completeness, and this extension in part provides more details, and further generalizations.
  • the invention addresses two significant issues with existing computer systems: the difficulty of handling huge volumes of data (big data), and the dependence of answers on specific realizations of data.
  • This invention deals with a data modelling computer and memory system (extended to a database), which will be referred to as a Ganaka (computer in Sanskrit).
  • Ganaka is especially useful in processing uncertain data and Big Data, both of which are major issues in data processing today; following the previously referred applications, conventional data will be referred to as point data in all that follows.
  • Figure 1 Ganaka ALU's WCS Tightly Integrated with Enhanced 1-Structure (Namespace Translator not shown for simplicity)
  • the WCS and EIS can, exemplarily, interrupt each other.
  • a general view of this invention is as a computer system handling both point data and models as the basic data objects. Instead of bytes or words, we additionally talk about models. Performance has to be exemplarily measured in models analyzed per second for the ALU (here called the inference engine, IE), models stored per Megabyte of memory, and models transmitted per second over the interconnect. Most standard conventional architectural features (parallelism, pipelining, out-of-order execution, speculative execution, ...) have analogs in Ganaka, referring to models instead of atomic data.
  • Ganaka's IE is an ML/AI system, and is described in Figure 6 and associated text, of the second provisional application No 201941018933 "Ganaka-2: A Computer and Architecture Operating on Models, Cont'd, filed on 12/05/2019". The flow of control is described in Figure 7 and associated text of the same application. The same figures are Figures 12 and 13 of PCT/IB2019/001296 respectively.
  • Relational algebra on sets is richer than on point data.
  • the resource usage (time, area, memory, ...) of an operation depends not only on its operands, but also on operations on other data involving those operands/results, due to transitivity and other properties. This is analogous to, but not identical to, the examination of commonalities amongst algebraic expressions (and some of those methods can also be used here).
  • the Inference engine (IE) in the ALU has to have links/cache to/of related calculations, and access them as a CAM.
  • Each operand has links to "siblings" subsets/supersets/intersections/..., which are used in parallel during calculations.
  • This is an enhanced I-structure (EIS), as it maintains both relationship links, as in I-structures, and can be accessed by the IE.
  • EIS: enhanced I-structure
  • the IE functions by loading an operator/operand specific routine into the WCS depending on operand type(s).
  • the WCS exemplarily is executed by a fast conventional ALU.
  • the IE does pointer chasing using either/both operands as the root (either in software or hardware), in the enhanced I-structure (EIS).
  • the exact pointers to chase depend on the operation and the operands. For example, to determine whether model MA is a subset of model MB,
  • a superset of MA may be found to be a subset of a subset of MB, in which case MA ⊆ MB follows by transitivity, and the computation in the WCS is stopped, exemplarily using an interrupt,
  • the results of the WCS calculation are used when it completes execution, and may be optionally stored in the EIS, depending on storage available.
  • New subsets/supersets can be generated on the fly, depending on the operation statistics. For example, bounding boxes can be generated for some/all of the sets in the EIS, and inferences may use them initially for a quick check.
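As a minimal sketch (not the patent's implementation), the transitive subset inference over stored EIS links described above can be illustrated in Python; the class name `EIS` and its method names are hypothetical:

```python
from collections import defaultdict, deque

class EIS:
    """Minimal enhanced I-structure: a graph of known subset links.
    supersets[m] holds models known to be supersets of m (m ⊆ s)."""
    def __init__(self):
        self.supersets = defaultdict(set)

    def add_subset_link(self, sub, sup):
        """Record a known relationship sub ⊆ sup."""
        self.supersets[sub].add(sup)

    def infer_subset(self, ma, mb):
        """Return True if MA ⊆ MB follows by transitivity from stored
        links alone, without invoking the (expensive) WCS computation."""
        seen, frontier = {ma}, deque([ma])
        while frontier:
            m = frontier.popleft()
            if m == mb:
                return True
            for sup in self.supersets[m]:
                if sup not in seen:
                    seen.add(sup)
                    frontier.append(sup)
        return False  # unknown: fall back to the full WCS calculation

eis = EIS()
eis.add_subset_link("MA", "M1")   # MA ⊆ M1
eis.add_subset_link("M1", "MB")   # M1 ⊆ MB
assert eis.infer_subset("MA", "MB")      # inferred; WCS run can be skipped
assert not eis.infer_subset("MB", "MA")  # not derivable from stored links
```

A `False` result only means the relationship is not derivable from cached links; the full WCS calculation then decides it, and its result may optionally be stored back into the EIS, as the text describes.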
  • Subsets/supersets are exemplarily generated using but not limited to the methods below (no inferencing is needed for this - the relationship is by construction itself)
  • Disjoint sets can be formed by removing the intersection from both sets.
  • Intersecting sets can be formed by determining a pre-existing point common to both sets, or adding a new point.
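The two by-construction rules above can be illustrated on finite point sets standing in for models; `make_disjoint` and `make_intersecting` are hypothetical names for this sketch:

```python
def make_disjoint(a: set, b: set):
    """Disjoint pair formed by removing the intersection from both sets;
    the disjointness holds by construction, so no inference is needed."""
    common = a & b
    return a - common, b - common

def make_intersecting(a: set, b: set):
    """Intersecting pair: reuse a pre-existing common point if one exists,
    otherwise add a fresh shared point to both sets."""
    if a & b:
        return a, b
    p = ("shared", len(a) + len(b))  # hypothetical new common point
    return a | {p}, b | {p}

da, db = make_disjoint({1, 2, 3}, {3, 4})
assert da & db == set()          # disjoint by construction
ia, ib = make_intersecting({1}, {2})
assert ia & ib                   # non-empty intersection by construction
```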
  • The EIS is a high-speed cache, and can impact the datapath clock rate.
  • the program executed by the EIS is relatively small (pointer chasing, plus the above and similar heuristics for generating new supersets and subsets). It can be implemented in hardware, or in another small special-purpose control store.
  • the EIS is exercised for all operations in the buffer, and all possible inferences based on the existing results are made. Then, in priority based on the utility of an operation (utility depends partly on how many additional inferences can be made, as per the referred prior art), the full-blown calculation in the WCS is made.
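The utility-priority ordering before full-blown WCS calculations might be sketched as follows, assuming the utility of each pending operation has already been estimated (e.g. as a count of additional inferences its result would enable); the function name is hypothetical:

```python
import heapq

def schedule_wcs(ops):
    """Order pending operations by descending utility before full WCS
    runs. `ops` is a list of (name, utility) pairs; utility stands in
    for the number of extra EIS inferences a result would enable."""
    heap = [(-u, name) for name, u in ops]  # max-heap via negation
    heapq.heapify(heap)
    order = []
    while heap:
        _, name = heapq.heappop(heap)
        order.append(name)
    return order

# Highest-utility operation reaches the WCS first
assert schedule_wcs([("opA", 1), ("opB", 5), ("opC", 3)]) == ["opB", "opC", "opA"]
```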
  • Ganaka has specific facilities for seamless operation with existing ISA's, without interference to existing resource utilizations. These include but are not limited to:
  • semaphores can be exemplarily used to indicate times when the models in the model memory are not consistent with the data in standard RAM.
  • Model Memory can also be organized as constraint windows, since duals (in the sense of optimization) of "variables" are "constraints", which are atomic constituents of models.
  • the conventional ALU/bus/memory/io structure can be used as is, without performance degradation for normal operations, and with extensive improvement for model-based calculations.
    o If the IE is a separate block, it can cycle-steal time on the system bus/network/io.
  • Voluminous models can be partially loaded, and computation starts immediately, with partial results stored for future completion.
  • a VGG16-type neural network has tens of millions of parameters, and can be loaded layer by layer during bus-idle times, with multiple data items piped through, and partial results at the input to the first missing layer stored for future completion.
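The layer-by-layer partial loading described above can be sketched as follows; `run_partial` is a hypothetical helper, and the toy "layers" are simple functions standing in for network layers:

```python
def run_partial(layers_loaded, all_layer_fns, x):
    """Pipe input through the layers loaded so far; return either a
    final result or a partial activation stored at the first missing
    layer, to be resumed once that layer arrives over the bus."""
    for i, fn in enumerate(all_layer_fns):
        if i >= layers_loaded:
            return ("partial", i, x)  # stored for future completion
        x = fn(x)
    return ("done", len(all_layer_fns), x)

# Toy 3-layer "network" with only 2 layers loaded so far
layers = [lambda v: v + 1, lambda v: v * 2, lambda v: v - 3]
assert run_partial(2, layers, 10) == ("partial", 2, 22)  # (10+1)*2 stored
assert run_partial(3, layers, 10) == ("done", 3, 19)     # resumed: 22-3
```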
  • the IE can work exemplarily in a master-slave fashion, with its operation/access dictated by the normal CPU, using a programmable/machine-learnable IE controller (which may be part of the WCS referred to earlier).
  • the canonical variable names are organized in a linear order, as in physical addresses in standard computers. Actual polytopes may have different variable names, "x", "y", "z", which can be mapped to the canonical variable namespace. Hence the following polytope models all specify the same polytope, assuming that the variable names in Model A do not appear in Model B or C, and so on.
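Namespace translation to a canonical linear variable order might look as follows; the constraint encoding and the function name are assumptions of this sketch, not the patent's data layout:

```python
def to_canonical(constraints, name_map):
    """Rewrite a polytope's constraints from local variable names to the
    canonical linear namespace (v0, v1, ...). Each constraint is encoded
    as ({var: coeff}, bound), meaning sum(coeff * var) <= bound."""
    return [({name_map[v]: c for v, c in lhs.items()}, b)
            for lhs, b in constraints]

# Model A uses x, y; Model B uses u, w; both map to the same canonical polytope
model_a = [({"x": 1, "y": 1}, 4), ({"x": -1}, 0)]
model_b = [({"u": 1, "w": 1}, 4), ({"u": -1}, 0)]
can_a = to_canonical(model_a, {"x": "v0", "y": "v1"})
can_b = to_canonical(model_b, {"u": "v0", "w": "v1"})
assert can_a == can_b  # identical after namespace translation
```

A general affine map, as mentioned below, would additionally transform the coefficients and bounds rather than only renaming variables.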
  • the namespace translation can be replaced by a general linear, affine, or nonlinear map, with scaling, translation, rotation, and distortion inbuilt.
  • the set-theoretic relationships are not guaranteed to hold under general nonlinear transformations.
  • Models are summaries/generalizations of data, and can be mutated/updated as in PCT/IB2019/001296 and successors, with the following extensions:
  • any associated I-structure relationships may have to be re-evaluated. If the update is small (say, only in a single constraint of a model), then the relationship update can use high-speed incremental methods, e.g. incremental linear programming for linear models,
  • out-of-order execution (OOE)
  • hazard detection, well known in the state of the art, has to be used.
  • the resource hazards considered have to include the EIS, since relationships satisfied by a mutable model can themselves mutate, and should not be read before they change when the final value is required, or written before they are read when the initial value is needed (unless the changed relationship can be predicted/compensated for using alternative means).
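A minimal sketch of hazard detection extended to EIS relationship entries, treating each stored relationship like a register checked for read/write conflicts (`eis_hazards` is a hypothetical helper, not the patent's mechanism):

```python
def eis_hazards(schedule):
    """Detect RAW/WAR/WAW hazards on EIS relationship entries.
    `schedule` is a list of (op, entry) pairs in issue order, with op
    in {'R', 'W'}; returns the offending (earlier, later) index pairs."""
    hazards = []
    for i, (op_i, e_i) in enumerate(schedule):
        for j in range(i + 1, len(schedule)):
            op_j, e_j = schedule[j]
            if e_i == e_j and (op_i, op_j) in {("W", "R"), ("R", "W"), ("W", "W")}:
                hazards.append((i, j))
    return hazards

# Mutating model M invalidates its stored relation before a dependent read
sched = [("W", "M_subset_N"), ("R", "M_subset_N")]
assert eis_hazards(sched) == [(0, 1)]
```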
  • Intersects P means selecting all polytopes which intersect with P.
  • the order is not even a priori defined, and has to be optimized to reduce computation and maximize EIS usage (along with other standard database optimization issues).
  • Models are summaries/generalizations of data, and can be mutated/updated as in PCT/IB2019/001296 and successors, with the following extensions:
  • models can be split into multiple models, which may be a more accurate representation, with similar issues on Hazard detection and mitigation.
  • Figure 3 shows details of a 64-bit embodiment of Ganaka, with block labels following our convention <Figure#_ShortName>, but with the Figure numbers of the Ganaka-2 applications, for easy reference to the prior description - the extensions here include the automatic generation of convenient
  • W and K are parameters which can be tuned as per Ganaka's instruction stream. Other choices based exemplarily on machine learning algorithms using the aforesaid set of operands as input can be devised.
  • the high-speed data interface F38_HSI is 4-way, with each way being 64 bits, clocked at the DDR5 rate of 6.4 gigatransfers per second (25.6×64 Gbits/second in aggregate). This voluminous datastream is modelled, and reduced to a single 64-bit multiplexed model bus operating at 1 GHz, with pre-emption and 8 priority levels.
  • the Global Data memory F38_GDM is 32-way interleaved, with each way being 256 GB, for a total of 8 TB of memory (possibly an SSD).
  • the data modeler F38_CHCHAH reduces this to a sequence of models in F38_GMM (one model, say, for each 10 GB datablock), occupying only 1 GB, a reduction in data volume of 8000x (nearly four orders of magnitude). This reduction may exemplarily use the reduced convex hull of a block of datapoints, or other means. While the same cores as in the IE can do the modeling, due to the high incoming data rates a dedicated separate core is preferable.
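The quoted reduction can be checked arithmetically, and a crude axis-aligned bounding box can stand in for the reduced convex hull of a data block (the actual modeler may use other means, as the text notes; both function names are hypothetical):

```python
def reduction_factor(data_bytes, model_bytes):
    """Data volume divided by model volume: the compression achieved."""
    return data_bytes / model_bytes

def bounding_box(points):
    """Crude stand-in for a reduced convex hull: the axis-aligned
    bounding box of a data block, stored as (mins, maxs) per dimension."""
    dims = len(points[0])
    mins = [min(p[d] for p in points) for d in range(dims)]
    maxs = [max(p[d] for p in points) for d in range(dims)]
    return mins, maxs

# 8 TB of raw data summarized into 1 GB of models: an 8000x reduction
assert reduction_factor(8e12, 1e9) == 8000.0
mins, maxs = bounding_box([(0, 5), (2, 1), (1, 3)])
assert (mins, maxs) == ([0, 1], [2, 5])
```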
  • the execution engines F38_IENB (multicore, RISC, along with digital/analog accelerators F38_AA in this embodiment) work at 1 GHz, and access operands over the model bus, and a kernel WCS of just 16 MB. For polyhedral models, the smallest rows are sent first.
  • the IXC is scheduled by the EE Execution Controller F38_CON, which has the responsibility of scheduling the IE, Model memory, and IO.
  • F32_EMEC is an electrical mechanism (emec), connected to the non-blocking interconnect, through which Ganaka schedules emec control operations in real time.
  • Figure 5 shows the operation of Ganaka's controller.
  • Model parameters including priority are accessed from F39_CHCHAH, F39_GMM, ....
  • an exemplary data layout of a model is shown in the referred prior art, Figure 21 (a) and (b) of Ganaka-2 (the model can be in the primal or the dual space).
  • An optimal control of the data modeler, global model memory, interconnect, and execution engines is computed, and control signals are sent to each of the modules, analogous to a VLIW word.
  • the control path can be tens to hundreds of bits in width.
  • new data can be generated to fill the data memory, using the inverter F38_INV (also shown as F39_INV).
  • This embodiment allows answers to be given over the entire set of possible inputs consistent with models, as well as reducing data volumes by four orders of magnitude.
  • model sizes vary from a single 64-bit word for atomic operands (singleton models), to 100 MB for Inception and 500 MB for VGG16 models of ImageNet.
  • An end-to-end application utilizing Ganaka's facilities can exemplarily be a robotic device/self driven car, having a vision system, and a (semi) automatically controlled actuator.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Remote Sensing (AREA)
  • Executing Machine-Instructions (AREA)

Abstract

Ganaka is a computer that operates using models as operands, and is suited to big-data and machine-learning applications. The present invention describes how succinct descriptions of data based on convex hulls, and approximations thereof, can be derived, and how efficient computations can be performed while taking into account the extreme variability of resource requirements. A variety of architectural optimizations is described, including the choice of unseen operands to simplify future operations.
PCT/IN2020/050450 2019-05-19 2020-05-19 Ganaka-3: A computer and architecture operating on models WO2020234900A1

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
IN201941019709 2019-05-19

Publications (1)

Publication Number Publication Date
WO2020234900A1 2020-11-26

Family

ID=73458517

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IN2020/050450 WO2020234900A1 2019-05-19 Ganaka-3: A computer and architecture operating on models

Country Status (1)

Country Link
WO (1) WO2020234900A1 (fr)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120239164A1 (en) * 2011-03-18 2012-09-20 Rockwell Automation Technologies, Inc. Graphical language for optimization and use
WO2013190577A2 * 2012-06-21 2013-12-27 Bhatia Neha Polytope and convex body database



Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20810145

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2020810145

Country of ref document: EP

Effective date: 20211220

122 Ep: pct application non-entry in european phase

Ref document number: 20810145

Country of ref document: EP

Kind code of ref document: A1