WO2015019364A2 - Graph based ontology modeling system - Google Patents

Info

Publication number
WO2015019364A2
Authority
WO
WIPO (PCT)
Prior art keywords
node
nodes
edges
rti
graph
Prior art date
Application number
PCT/IN2014/000508
Other languages
French (fr)
Other versions
WO2015019364A3 (en)
Inventor
Subramanian JAYAKUMAR
Original Assignee
Subramanian JAYAKUMAR
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Subramanian JAYAKUMAR
Publication of WO2015019364A2
Publication of WO2015019364A3

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/901 Indexing; Data structures therefor; Storage structures
    • G06F16/9024 Graphs; Linked lists

Definitions

  • This invention relates to a graph based ontology modeling system.
  • Ontology designates a data model that represents domain knowledge and is used to query and reason about the properties of the objects in that domain and the relations between them.
  • the syntax and semantics of the specific ontology language used govern the expressiveness and utility of the language to model knowledge and derive new knowledge.
  • ontologies have used a declarative language with first order logic like semantics along with a reasoning/inference system to handle queries.
  • RDF Resource Description Framework
  • ontologies essentially capture relationship between entities, they have evolved from a textual representation to a graph based representation with the same underlying characteristics. With the advent of graph databases, it is now easier to store graphs directly instead of converting them into relational tables.
  • US 2005/0097561 discloses a system and method for managing data, such as in a data warehousing, analysis, or similar applications, where dataflow graphs are expressed as reusable map components, at least some of which are selected from a library of components, and map components are assembled to create an integrated dataflow application.
  • Composite map components encapsulate a dataflow pattern using other maps as subcomponents. Ports are used as link points to assemble map components and are hierarchical and composite allowing ports to contain other ports.
  • the dataflow application may be executed in a parallel processing environment by recognizing the linked data processes within the map components and assigning threads to the linked data processes.
  • US 7299458 discloses a method of forming a control-dataflow graph that includes separating a control flow graph into two or more basic blocks, and converting said two or more basic blocks into code blocks, where the code blocks are formed into the control-dataflow graph.
  • Another embodiment of the invention includes a method of forming a control-dataflow graph that includes separating a control flow graph into two or more basic blocks, forming a load node in at least one of said basic blocks, forming a store node in at least one of said code blocks, inserting a delay node in at least one of said code blocks, segregating external hardware logic modules from said control flow graph, and converting said two or more basic blocks into code blocks, wherein the code blocks are formed into the control-dataflow graph.
  • US 7703085 describes a system and method for compiling computer code written to conform to a high-level language standard to generate a unified executable containing the hardware logic for a reconfigurable processor, the instructions for a traditional processor (instruction processor), and the associated support code for managing execution on a hybrid hardware platform.
  • Explicit knowledge of writing hardware-level design code is not required since the problem can be represented in a high-level language syntax.
  • a top-level driver invokes a standard-conforming compiler that provides syntactic and semantic analysis. The driver invokes a compilation phase that translates the CFG representation being generated into a hybrid controlflow-dataflow graph representation representing optimized pipelined logic which may be processed into a hardware description representation.
  • the driver invokes a hardware description language (HDL) compiler to produce a netlist file that can be used to start the place-and-route compilation needed to produce a bitstream for the reconfigurable computer.
  • HDL hardware description language
  • the programming environment then provides support for taking the output from the compilation driver and combining all the necessary components together to produce a unified executable capable of running on both the instruction processor and reconfigurable processor.
  • US 7316001 describes a software system including an Object Process Graph for defining applications and a Dynamic Graph Interpreter that interprets Object Process Graphs.
  • An Object Process Graph defines all of an application's manipulations and processing steps and all of the application's data.
  • An Object Process Graph is dynamic, making it possible to change any aspect of an application's data entry, processing or information display at any time.
  • When an Object Process Graph is interpreted it functions to accept data, process the data and produce information output. Modifications made to an Object Process Graph while it is being interpreted take effect immediately and can be saved.
  • Object Process Graphs and Dynamic Graph Interpreters can be deployed on single user workstation computers or on distributed processing environments where central servers store Object Process Graphs and run Dynamic Graph Interpreters, and workstation computers access the servers via the Internet or local intranets.
  • the present invention is contrived in consideration of the circumstances mentioned hereinbefore, and is intended to provide an ontology that has imperative as well as declarative characteristics including recursive-traversing interpreter (or RTI or Recursive Traversing Interpreter) that enables running imperative queries on this ontology.
  • recursive-traversing interpreter or RTI or Recursive Traversing Interpreter
  • the implementation of both the Ontology and the recursive-traversing interpreter in a single processor based computer system is described. Its extension to multiple computers connected over a network to enable both representing 'big data' as well as parallel processing is described.
  • a graph based ontology modeling system comprising a knowledge base server containing information in the form of graph comprising a plurality of nodes and a plurality of edges; a client system having a recursive-traversing interpreter (RTI) enabling queries on said graph using a combination of eager and lazy evaluation method and updating values of various nodes in said graph; wherein said graph comprises: dataflow edges to capture the flow of data required in the ontology; controlflow edges to specify the next node to be traversed by said RTI once the current node has been evaluated; property edges to form any relation other than those specified by dataflow edges or controlflow edges and expressing all other relationships in the ontology; data nodes having data and defined by name, value, argument, type and description and function nodes having function or a reference to a function and defined by name, value, argument, type and description wherein said data nodes and function nodes may serve as terminal nodes for said RTI; combiner nodes which enable said RTI to evaluate a function
  • agent nodes are linked to other nodes through dataflow and/or controlflow edges to form a sub-graph representing computation wherein said sub-graph is compiled into a function or a sub-routine wherein RTI invokes said function or sub-routine to evaluate said computation.
  • compilation of sub-graph into a function or a subroutine is either completed when such sub-graph is first encountered during a query or can be a pre-compiled link to the function or sub-routine, which is stored as an additional property of the initial node of the sub-graph.
  • RTI enables a recursive search within object and agent nodes to find required input nodes and/or output nodes.
  • RTI is associated with an internal data structure in the form of a stack to keep track of nodes to be visited.
  • Another embodiment of the invention includes a computer program product including instructions recorded on a non-transitory computer readable storage media, which, when executed by at least one processor, cause the at least one processor to perform a method as described herein.
  • base server and client system may be one computer.
  • FIG. 1 is an exemplary embodiment of 'Atom Node' according to present invention showing a data node with name 'N';
  • FIG. 2 is an exemplary embodiment of 'Atom Node' according to present invention showing a function node with name 'Add';
  • FIG. 3 is an exemplary embodiment of 'Combiner Node' with name 'C1' according to present invention
  • Fig. 4 is an exemplary embodiment of 'Branch Node' with name 'B1' according to present invention
  • Fig. 5 is an exemplary embodiment of 'Object Node' according to present invention
  • Fig. 6 is an exemplary embodiment of 'Agent Node' according to present invention.
  • Fig. 7 is an exemplary embodiment of 'Model Node' with name 'M1' according to present invention.
  • FIG. 8 and Fig. 9 show block diagrams illustrating implementation of the ontology according to embodiments of the present invention.
  • Fig. 10 shows a graph illustrating an example for 'Euler GCD Computation with Objects as Output' according to present invention
  • Fig. 11 shows a graph illustrating an example for 'Food Network' according to present invention
  • FIGs. 12, 13 and 14 show graphs illustrating an example for 'Legal Document' according to present invention
  • Figs. 15 and 16 show graphs illustrating an example for 'Financial Modelling' according to present invention.
  • Figs. 17 and 18 show graphs illustrating an example for 'Probabilistic Graphical Models' according to present invention.
  • DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
  • Seven types of nodes and three types of edges form the basic elements that provide imperative and declarative expressive power.
  • the node types have been defined loosely based on recursive function theory.
  • in recursive function theory, the building blocks can be simplified as comprising a given function, the principle of composition of functions and the principle of primitive recursion.
  • lambda calculus an alternate system to recursive function theory, but shown to be equivalent by Alonzo Church
  • composition is achieved through an 'eval' operation applied to a set of elements, each of which is a part of the basic functions or data types described in the programming language. Recursion is achieved through the ability of a function to call itself.
  • composition is represented by two types of nodes (called 'combiner' and 'branch' in the present description) and recursion is achieved through looping using the fact that the two types of nodes representing composition permit two different types of traversal behaviour.
  • Two more types of nodes are used to represent data and functions (called 'data' and 'function' in the present description), which are the building blocks or 'atoms' as referred to in this specification.
  • the final three types of nodes are used for declarative specifications such as constructing compound entities from the atoms mentioned above and the nodes that serve to represent composition (these are called 'object', 'agent' and 'model' in the present description).
  • Another kind of declarative specification refers to representing declarative relations in the form of functions and one of the final three types of nodes is used in such cases.
  • 'compounds' refer to all nodes that are not atoms, that is, they are either combiners, branches, agents, objects or models.
  • node 'atom' is shown according to present invention with only partial list of properties.
  • These nodes are called atoms for two different reasons. The first is that they refer to the basic building blocks for any imperative specification. The second reason is that these nodes serve as terminal nodes for the recursive-traversing interpreter. This point is covered in detail in the description of the recursive-traversing interpreter.
  • each atom (data or function) is represented as a node with the following properties:
  • a unique feature in representing atoms in this work is the use of the property ArgValues. This is an array, primarily used to keep value inputs from other nodes. The utility of this is explained in the section describing the recursive-traversing interpreter.
  • Another key representational feature, in the case of function nodes is the fact that these nodes too have a value property. This value represents the operation/function to be performed by the recursive-traversing interpreter.
  • the name of the function node does not affect this process and hence a function node can also be used to represent dynamically evolving processes. It may be noted that only a partial list of properties is shown in both the figures.
  • a function node could also have a graph traversal operation as its value, thereby adding to the expressive power of the ontology.
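
As a minimal sketch of the atom representation described above, the following Python dataclass holds the listed properties. It assumes a plain in-memory object rather than any particular graph database; the `Atom` class, its snake_case field names and the sample value 5 are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, List
import operator


@dataclass
class Atom:
    """Hypothetical in-memory stand-in for a data or function node."""
    name: str
    type: str                 # "data" or "function"
    value: Any = None         # a data value, or the callable of a function node
    arg_values: List[Any] = field(default_factory=list)  # the ArgValues array
    description: str = ""


# A data node named 'N' and a function node named 'Add' (compare Figs. 1 and 2).
n = Atom(name="N", type="data", value=5)
add = Atom(name="Add", type="function", value=operator.add)

# The operation is taken from the value property, not from the name,
# so renaming the node leaves its behaviour unchanged.
add.name = "Plus"
result = add.value(n.value, 2)
```

Keeping the callable in `value` is what lets a function node stand for an evolving process: replacing `add.value` changes what the interpreter would execute without touching the rest of the graph.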
  • the second type of nodes described in the present invention is 'compound'. All nodes that are not atoms are called compounds. Broadly they can be divided into three groups - the first comprising combiners and branches, the second comprising agents and objects and the final group comprising models. The division of the compounds into these three groups is merely for ease in understanding their semantic role and does not play any other role in the ontology described in the present invention.
  • exemplary embodiment of 'combiner' is shown according to present invention.
  • the function of the combiners is to enable the recursive-traversing interpreter (RTI) to evaluate a function by providing it with the function and its arguments.
  • RTI recursive-traversing interpreter
  • They provide objects to agents to generate output.
  • the input and output agents, objects, functions, data (or other combiners or branches) are indicated by dataflow edges. (These are one of the three types of edges, the others being controlflow and property. These edges are described in detail in a later part of the description.)
  • the RTI computes the value of the combiner based on its input (incoming) dataflow edges and writes the computed value to the nodes indicated by its output (outgoing) dataflow edges.
  • the node 300 is a combiner.
  • the nodes 200 and 100, with names Func (atom - function) and x (atom - data) respectively, are the inputs to the combiner C1 300.
  • the output of the combiner is written to node 100' named y (atom - data).
  • traversal control is transferred to the unnamed node.
  • edges 304 represent data flow and edge 302 represents control flow.
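
The combiner of Fig. 3 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the patented implementation: node values live in a plain dictionary, the squaring function and the value 3 are made up, and the rule "the first callable among the inputs is the function to apply" is an assumption of this sketch.

```python
from typing import Any, Dict, List


def evaluate_combiner(inputs: List[Any]) -> Any:
    """Apply the first callable among the ordered inputs to the other inputs."""
    func = next(v for v in inputs if callable(v))
    args = [v for v in inputs if not callable(v)]
    return func(*args)


# Node values keyed by name: 'Func' and 'x' feed combiner C1, 'y' receives the output.
values: Dict[str, Any] = {"Func": lambda v: v * v, "x": 3, "y": None}
incoming = ["Func", "x"]   # sources of the incoming dataflow edges, in order
outgoing = ["y"]           # targets of the outgoing dataflow edges

result = evaluate_combiner([values[name] for name in incoming])
for target in outgoing:
    values[target] = result   # write the computed value along each outgoing edge
```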
  • exemplary embodiment of 'branch' is shown according to present invention. Branches too serve the same purpose as combiners, but with a slight change in their behaviour post output writing.
  • the next node to be traversed is indicated by one of the multiple outgoing controlflow edges from the branch node, with the value computed by the RTI for the branch node being used to select the appropriate controlflow edge.
  • These branch nodes help to create a loop in the present ontology.
  • Both combiners and branches have the same types of properties as described in the previous section.
  • the node 400, B1, is a branch. Its inputs are a logical function L 200 and a data variable x 100.
  • the logical function 200 can have a value of 0 or 1 in this case. Similar to the combiner, at the branch node 400, the RTI computes its output and writes it to y. Then, based on L(x) being 0 or 1, it transfers control to node N1 or N2 respectively. It can be seen that in the general case, this can be extended to any number of possible choices beyond 0 and 1.
  • Referring to Fig. 5, exemplary embodiment of 'Object node' is shown according to present invention. Object nodes are used to group together other nodes. These other nodes could be nodes of any type, including other object nodes.
  • nodes 'part' of an object are linked to an object node with the name of the object using property edges. Any other specific property of this relation can be captured in the edge using other appropriate key-value pairs as properties.
  • object nodes thus require only three properties - name, type and description/comments (which have the same meaning as described in the section on Atoms).
  • node 500 is an object. It has two data nodes 100 named height and weight. These two are atoms. Note that the object node 500 is like a 'tag' that identifies the nodes that 'belong' to it. Instead of the data nodes 100, it could also have other object nodes to form more complex objects. The object node itself does not have any value.
  • the edges 502 in Fig. 5 are neither dataflow nor controlflow edges. Rather they are property edges.
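
A small sketch of object membership, assuming property edges are stored as a dictionary from an object's name to the names of its members. The object names `person` and `address`, the extra `city` atom and all values are hypothetical. The recursive lookup shows how a nested object can be searched for an atom, and it assumes the membership structure is acyclic.

```python
from typing import Dict, List, Optional

# Property edges: object name -> names of member nodes (hypothetical layout).
members: Dict[str, List[str]] = {
    "person": ["height", "weight", "address"],
    "address": ["city"],
}
# Only atoms carry values; object nodes themselves have none.
atom_values: Dict[str, object] = {"height": 170, "weight": 65, "city": "Anytown"}


def find_atom(obj: str, wanted: str) -> Optional[object]:
    """Depth-first search through property edges for an atom named `wanted`."""
    for child in members.get(obj, []):
        if child in atom_values:
            if child == wanted:
                return atom_values[child]
        else:
            found = find_atom(child, wanted)
            if found is not None:
                return found
    return None
```

For example, `find_atom("person", "city")` descends through the nested `address` object, while a name that belongs to no member yields `None`.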
  • Each agent 600 has at least one combiner node 300, one input node 100 and one output node 100', with the latter two being nodes of any type.
  • the property edge 602 joining the necessary combiner node 300 to the agent has an added property called 'Sub-type' which has the value 'Initial'.
  • the edge 604 joining the input node has 'Sub-type' 'Input'
  • the edge 606 joining the output node has 'Sub-type' 'Output'. It is to be noted that there can be multiple combiner nodes and input and output nodes linked to an agent.
  • Every agent can have only one node with the edge label initial and this node must be a combiner node. If an agent node has links to multiple input and/or output nodes, these can be ordered using additional properties on the edges (property edges) linking them to the agent. Note that these nodes can also be linked (and generally are linked) through dataflow and/or controlflow edges to form an imperative sub-graph. Agent nodes too require the three properties that are needed by objects with an added property called 'Status'. This property takes on two values (agent-NI and agent-I to specify whether the agent is not invoked or invoked respectively), the use of which is explained in the section describing the recursive-traversing interpreter.
  • Referring to Fig. 7, model node 700 M1 is a model. It represents the fact that B is the brother of A.
  • the three types of edges mentioned hereinbefore are dataflow edges, control flow edges and property edges.
  • Dataflow edges capture the flow of data required for evaluations in the ontology. These edges have a property called 'Order' which has an integer value. This value represents the order of this input as required by the node that this edge points to. Additionally, instead of using the 'value' property for every node as its value, one could specify the property name to be used as value on the dataflow edge that links the current node to other nodes.
  • Controlflow edges are used to specify the next node to be traversed by the recursive-traversing interpreter once the current node has been evaluated. They too have the 'Order' property. In case of branch nodes being the source of these edges, the value of the branch node and the order of the outgoing controlflow edge are matched to determine the next node to be traversed. In case of all other nodes, multiple outgoing controlflow edges may be sorted by order or their traversal may be parallelized in case the hardware used for the implementation supports parallel execution.
  • Property edges are used to form any relation other than those specified by the above two edge types. Primarily they are used to create objects and agents by linking an object or an agent node to other nodes. These edges form the residual class of edges and by adding further properties on these edges, all other relationships can be expressed in this ontology.
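
The three edge types and their 'Order' property can be sketched as follows. The `Edge` dataclass, the node names and the `quantity` property value are assumptions made for illustration; the branch B1 with targets N1 and N2 mirrors the branch example described for Fig. 4.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Edge:
    source: str
    target: str
    kind: str                      # "dataflow", "controlflow" or "property"
    props: Dict[str, object] = field(default_factory=dict)


edges: List[Edge] = [
    Edge("L", "B1", "dataflow", {"Order": 0}),
    Edge("x", "B1", "dataflow", {"Order": 1}),
    Edge("B1", "N1", "controlflow", {"Order": 0}),
    Edge("B1", "N2", "controlflow", {"Order": 1}),
    Edge("obj", "x", "property", {"quantity": 2}),
]


def ordered_inputs(node: str) -> List[str]:
    """Sources of incoming dataflow edges, sorted by their Order property."""
    incoming = [e for e in edges if e.target == node and e.kind == "dataflow"]
    return [e.source for e in sorted(incoming, key=lambda e: e.props["Order"])]


def next_after_branch(branch: str, branch_value: int) -> Optional[str]:
    """Pick the controlflow edge whose Order equals the value of the branch."""
    for e in edges:
        if (e.source == branch and e.kind == "controlflow"
                and e.props.get("Order") == branch_value):
            return e.target
    return None
```

With this layout a branch value of 0 selects N1 and a value of 1 selects N2; supporting more than two choices only requires more controlflow edges with further Order values.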
  • the present invention provides an ontology that has imperative as well as declarative characteristics including a recursive-traversing interpreter that enables running imperative queries on the ontology hereinbefore described.
  • recursive-traversing interpreter is defined herein in detail.
  • the recursive-traversing interpreter updates the values of various nodes in the graph as specified in the ontology model described hereinbefore. In order to do this, it uses its internal library of functions, which in this case is the standard function library provided by most common high-level programming languages. However, this library can be the Instruction Set Architecture of the microprocessor and thus the hardware would be directly linked to the present ontology. Once a starting node for traversal is specified by the user query, this interpreter does the following steps:
  • this interpreter updates all other nodes which are linked from this node via outgoing dataflow edges with the value of this atom.
  • the interpreter checks if it requires any arguments for its value to be computed. The 'argValues' property of the node serves to hold this data. If any value is required, then it recursively traverses those nodes; if not, it computes the value of the present node and writes its output to all the nodes linked from this node via outgoing dataflow edges.
  • the interpreter recursively searches the object by traversing through its property edges to find the atom whose value is sought by the combiner or any other node that led to the object being traversed.
  • the interpreter identifies the input nodes and passes on the requirement to the combiner and then fetches this data (from other input nodes to the combiner).
  • the agent node's status is 'agent-NI', that is, not invoked.
  • the agent node's status is changed to 'agent-I', that is, invoked. This status then enables the RTI to fetch the value/s from the output node/s of the agent and then update all nodes linked by outgoing dataflow edges to the combiner or branch that led to the traversal of the agent node, with these output values.
  • the interpreter fetches the value/s specified by the output node/s (that is, the node/s that is/are connected by outgoing dataflow edges from the model node) and provides this value/s to the node that led to the traversal of the model node.
  • the interpreter then traverses the next node indicated by the outgoing controlflow edge from the current node. Only in the case of a branch node, the interpreter chooses the appropriate node to be traversed next based on the value of the branch node and the order of the outgoing controlflow edges from the branch node.
  • Since this interpreter recursively traverses the graph, it is called a recursive-traversing interpreter. As indicated in point 1 above, the atoms form the base case of the recursion.
  • a combination of eager and lazy evaluation is used for this recursive interpreter evaluation.
  • the eager part ensures that when the interpreter computes a node's value, all the nodes that receive input from this node are immediately updated.
  • the lazy part of the interpreter is used in combiners and branches. In case of these nodes, their argValues property is erased after every computation. These values are then sought only when the node is traversed again.
  • This ensures that once a set of values is used for a computation, it is not reused. Also the next fetching of these values is done only when they are required.
  • This strategy plays a key role in enabling looping which simulates recursion in the present ontology.
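
The eager and lazy parts described above can be illustrated with a hedged sketch. The `Combiner` class and its method names are hypothetical, and the "NA" marker for a missing argument is borrowed from the worked example later in the text.

```python
NA = "NA"  # marker for a missing argument in an argValues array


class Combiner:
    """Hypothetical combiner holding its own argValues and downstream targets."""

    def __init__(self, func, arity):
        self.func = func
        self.arg_values = [NA] * arity
        self.value = None
        self.targets = []          # (node, order) pairs for outgoing dataflow

    def ready(self):
        return NA not in self.arg_values

    def compute(self):
        self.value = self.func(*self.arg_values)
        # Eager part: push the new value to every downstream node at once.
        for node, order in self.targets:
            node.arg_values[order] = self.value
        # Lazy part: forget the inputs so they are fetched afresh next time.
        self.arg_values = [NA] * len(self.arg_values)
        return self.value


inc = Combiner(lambda v: v + 1, arity=1)
double = Combiner(lambda v: v * 2, arity=1)
inc.targets.append((double, 0))

inc.arg_values[0] = 4
inc.compute()
```

After `inc.compute()` the downstream node `double` already holds its argument, while `inc` itself has no stored inputs, so traversing it again inside a loop forces a fresh fetch rather than reusing stale values.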
  • Another key feature of this interpreter is its ability to reduce Application Programming Interface (API) specification requirements by enabling a recursive search (exact or approximate search) within objects and agents to find required input nodes.
  • API Application Programming Interface
  • the sub-graph representing the computation could be compiled into a function or a subroutine in any programming language. Then the RTI could invoke this function or subroutine for the result of the computation, instead of following its normal traversal based procedure.
  • This work of compiling subgraphs into functions or sub-routines could be done either when they are first encountered in the course of a query or they can be pre-compiled and the link to the compiled target can be stored as an additional property of the initial node of the sub-graph.
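
One way to picture compile-on-first-encounter is shown below. The sketch assumes a sub-graph that is a simple chain of unary steps, and it stores the compiled callable under a hypothetical `compiled` key in the property dictionary of the initial node; neither detail comes from the source.

```python
from typing import Callable, Dict, List

# Hypothetical sub-graph: an ordered chain of unary steps starting at 'start'.
subgraph_steps: Dict[str, List[Callable[[float], float]]] = {
    "start": [lambda v: v + 1, lambda v: v * 10],
}
node_props: Dict[str, dict] = {"start": {}}
compile_count = 0


def compile_subgraph(initial: str) -> Callable[[float], float]:
    """Fold the chain of steps into one ordinary Python function."""
    global compile_count
    compile_count += 1
    steps = list(subgraph_steps[initial])

    def compiled(v: float) -> float:
        for step in steps:
            v = step(v)
        return v

    return compiled


def run(initial: str, v: float) -> float:
    """Compile on first encounter, then reuse the link stored on the node."""
    props = node_props[initial]
    if "compiled" not in props:
        props["compiled"] = compile_subgraph(initial)
    return props["compiled"](v)


first = run("start", 1)
second = run("start", 2)
```

The second call reuses the stored function, so `compile_count` stays at 1. A pre-compiled variant would simply populate `node_props["start"]["compiled"]` before any query runs.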
  • the RTI implementation has an internal data structure, which is a stack, which it uses to keep track of nodes to be visited.
  • This stack can also be moved to the graph database as a separate node with property edges leading out from this node indicating the nodes to be visited and the 'Order' property on these edges indicating the order in which the nodes need to be visited by the RTI.
  • some data and function nodes can also be declared to be non-atomic so that the behaviour of the RTI when traversing these nodes becomes similar to that when it is traversing a combiner. This flexibility helps in some specific domains and is more of a convenience rather than a requirement.
  • multiple dataflow edges could provide input to the same node with the same order (that is, as possible candidates for the same value required by the node).
  • a choice can be made amongst these edges based on any predetermined criterion (say better reliability if the multiple inputs represent data from different types of sensors or lower cost if the multiple inputs are from different agents, each of which has a different computational cost etc.) or any criterion specified by the user dynamically at run-time.
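
Choosing among competing dataflow edges can be sketched as a scoring function supplied at run time. The `reliability` and `cost` edge properties, their numbers and the sensor names are hypothetical.

```python
from typing import Callable, Dict, List

# Two candidate inputs for the same argument slot (same Order).
candidates: List[Dict[str, object]] = [
    {"source": "sensor_a", "Order": 0, "reliability": 0.90, "cost": 5.0},
    {"source": "sensor_b", "Order": 0, "reliability": 0.97, "cost": 8.0},
]


def choose(edges: List[dict], score: Callable[[dict], float]) -> dict:
    """Return the edge with the highest score under the given criterion."""
    return max(edges, key=score)


most_reliable = choose(candidates, lambda e: e["reliability"])
cheapest = choose(candidates, lambda e: -e["cost"])
```

Passing the criterion as a callable matches the text's point that the rule can be fixed in advance or supplied by the user dynamically.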
  • Structural Inversion: In Structural Inversion, the inversion is done at the node level with each combiner or branch being replaced by a new combiner or branch with the dataflow edges appropriately changed and the function nodes replaced with the inverses of the same. If the inverses of the function nodes are not known a priori or the sub-graph being traversed is cyclic, then this approach would not be feasible.
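
A minimal sketch of structural inversion for the simplest case, an acyclic chain of unary functions whose inverses are all known. The chain, the function names and the `known_inverses` table are assumptions of this example; a real graph would also need its dataflow edges rewired node by node.

```python
from typing import Callable, Dict, List, Tuple

Step = Tuple[str, Callable[[float], float]]

# Hypothetical forward chain computing y = (x + 3) * 2.
forward: List[Step] = [("add3", lambda v: v + 3), ("double", lambda v: v * 2)]
# Inverses must be known a priori for this approach to apply.
known_inverses: Dict[str, Callable[[float], float]] = {
    "add3": lambda v: v - 3,
    "double": lambda v: v / 2,
}


def invert(chain: List[Step]) -> List[Step]:
    """Reverse the dataflow and swap each function for its inverse."""
    missing = [name for name, _ in chain if name not in known_inverses]
    if missing:
        raise ValueError(f"no known inverse for: {missing}")
    return [("inv_" + name, known_inverses[name]) for name, _ in reversed(chain)]


def run(chain: List[Step], v: float) -> float:
    for _, func in chain:
        v = func(v)
    return v


y = run(forward, 4)
x_back = run(invert(forward), y)
```

The `ValueError` branch corresponds to the infeasible case named in the text, where some inverse is not available.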
  • the domains of the input and output variables are discretized (if they are not already sufficiently discrete) and then the conditional probability relation between the new input bins and the output bins are computed.
  • This conditional probability along with a prior distribution for the input variables (estimated or assumed) enables us to compute the inverse by specifying the posterior probabilities of the various output bins using Bayes' rule.
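
The Bayes' rule step can be sketched with two input bins and two output bins. All bin names and probabilities are invented for illustration, and the sketch reads the inversion as recovering a distribution over input bins from an observed output bin, which is an interpretation rather than something the source spells out.

```python
from typing import Dict

# prior[i] = P(input bin i); likelihood[i][o] = P(output bin o | input bin i).
prior: Dict[str, float] = {"low": 0.5, "high": 0.5}
likelihood: Dict[str, Dict[str, float]] = {
    "low": {"small": 0.8, "large": 0.2},
    "high": {"small": 0.3, "large": 0.7},
}


def posterior(observed_output: str) -> Dict[str, float]:
    """P(input bin | observed output bin) by Bayes' rule."""
    joint = {i: prior[i] * likelihood[i][observed_output] for i in prior}
    evidence = sum(joint.values())
    return {i: p / evidence for i, p in joint.items()}


post = posterior("large")
```

In practice the likelihood table would be estimated by running the forward sub-graph over samples from each input bin and counting which output bin each sample lands in.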
  • an added advantage of the present ontology is the ability to compute inverses of (at least some) imperative processes to enable answering a much larger class of user queries, permitting optimization, system design etc.
  • Optimization can be of two types in querying the present ontology.
  • the first type is called a structural one, where the best agent to compute a particular output is to be identified. This problem can be solved similar to an edge-weighted shortest path problem.
  • different agents can give the same output, but may require different inputs that differentiate them based on their accuracy and precision, the number of inputs needed by the agent too could add to the cost of the objective function. On the other hand, the accuracy and precision provided by the agent could then contribute favourably to the objective function.
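
The structural optimization above can be approximated with Dijkstra's algorithm over a graph whose edges are agents. The graph, the agent names and the single scalar cost per agent are hypothetical; in practice the weight might combine the number of inputs with accuracy and precision, as the text suggests.

```python
import heapq
from typing import Dict, List, Tuple

# Hypothetical weighted graph: nodes are data items, each edge is an agent
# that derives the target from the source at the given cost.
graph: Dict[str, List[Tuple[str, float, str]]] = {
    "raw": [("clean", 2.0, "agent_fast"), ("clean", 1.0, "agent_cheap"),
            ("report", 10.0, "agent_direct")],
    "clean": [("report", 3.0, "agent_summarise")],
    "report": [],
}


def cheapest_agents(start: str, goal: str) -> Tuple[float, List[str]]:
    """Dijkstra's algorithm returning the total cost and the agents used."""
    queue: List[Tuple[float, str, List[str]]] = [(0.0, start, [])]
    best: Dict[str, float] = {}
    while queue:
        cost, node, agents = heapq.heappop(queue)
        if node in best:
            continue                      # already settled at a lower cost
        best[node] = cost
        if node == goal:
            return cost, agents
        for target, weight, agent in graph[node]:
            if target not in best:
                heapq.heappush(queue, (cost + weight, target, agents + [agent]))
    return float("inf"), []


total, chosen = cheapest_agents("raw", "report")
```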
  • the second type of optimization involves choosing input values for a node or a set of nodes that optimize the value/s for another node or set of nodes. This is a standard black-box optimization problem, with the black box being replaced by the sub-graph and thus permitting structural inferences too.
  • the implementation of the ontology described hereinbefore uses a property graph database, which is one of the most common types of graph databases.
  • This database can be implemented in a single computer with sufficient hard disk space, and for scalability it can be stored across multiple machines.
  • the storing and data retrieval mechanisms are specific to the graph database implementation and are generally provided by the database provider.
  • the recursive-traversing interpreter can be implemented in any modern computer system with single or multiple processors. Theoretically, it only requires access to the functions specified as part of the processor architecture. However, for ease of use, the present work uses high-level functions provided by most standard programming languages that then interface with the processor's hardware-software interface.
  • an Apple Macbook Air laptop computer with an Intel 1.86 GHz Core 2 Duo Processor, running Mac OS X 10.7.5 (Lion) was used.
  • the computer had a memory of 2GB 1067 MHz DDR3 and a flash based hard disk of 128 GB capacity.
  • For the graph database a free open-source graph database was used.
  • the Recursive Traversing interpreter was implemented in Python 2.7.5.
  • the system can be used over a network of computers in two ways.
  • the first approach which is to handle big data, is to partition the graph over several computers in a network.
  • the use of the present ontology is to aid the partitioning logic whereby a single agent or object is not partitioned into different machines whenever possible.
  • Another approach to scaling this ontological system is to allow multiple computers (clients) to connect to the computer (or cluster or network) with the graph database (knowledge base server), but each client computer can have its own recursive-traversing interpreter.
  • the advantage here is that, in this case, each client can have its own implementation of the atomic functions and thus can exhibit different behaviour while accessing the same graph.
  • This architecture is shown in Fig. 9.
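
The idea that clients share one graph but bring their own atomic functions can be sketched as below. The `Client` class, the dictionary-shaped graph and the function atom named `round` are hypothetical; a real deployment would query the knowledge base server instead of a local dictionary.

```python
import math
from typing import Callable, Dict

# Shared graph: a combiner that applies the function atom named 'round' to x.
shared_graph = {"combiner": {"func": "round", "input": "x"}, "x": 2.5}


class Client:
    """A client-side interpreter with its own implementation of the atoms."""

    def __init__(self, library: Dict[str, Callable[[float], float]]):
        self.library = library

    def query(self, graph: dict) -> float:
        spec = graph["combiner"]
        return self.library[spec["func"]](graph[spec["input"]])


client_a = Client({"round": math.floor})
client_b = Client({"round": math.ceil})
```

Both clients read the same graph, yet `client_a.query(shared_graph)` and `client_b.query(shared_graph)` differ because each binds the atom name to a different local function.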
  • the agent 1002 is used to compute the GCD of the two input nodes to the combiner node 1004.
  • the lists of nodes and edges in this example are given in Tables 1 and 2 below.
  • the main combiner in this application is the node numbered 13.
  • the Recursive Traversing Interpreter (RTI) starts from this node. In order to understand the computation, we track the internal stack of nodes to be visited by the RTI and also the argValues and value property of each node that the RTI visits.
  • RTI is asked to start by exploring node 13. Its internal stack (called fringe) has only node 13 at this stage. It pops the fringe and traverses to node 13. It checks the type of the node and finds that it is a combiner. It hence checks if all its arguments are present (i.e. there are no "NA" in its argValues array). Initially, it finds that all the required arguments for node 13 are missing and hence, it adds the required nodes to the fringe. It then proceeds to pop the fringe again and continue the same algorithm. The various steps involved in this are given in the table below. Whenever a node is evaluated, its value is updated to the argValues property of all its outgoing DATAFLOW neighbours with the index of the argValues property array being determined by the 'Order' property of the DATAFLOW edge.
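
The fringe-based procedure just described can be sketched on a deliberately tiny graph. This is not the graph of Tables 1 and 2: only node 13 keeps its number, the two data nodes and their values are invented, and Python's `math.gcd` stands in for the looped GCD sub-graph. Re-pushing the combiner before its missing inputs is an assumption about how the stack is managed.

```python
import math

NA = "NA"
# Hypothetical three-node graph; node 13 plays the role of the main combiner.
nodes = {
    1: {"type": "data", "value": 48},
    2: {"type": "data", "value": 18},
    13: {"type": "combiner", "func": math.gcd,
         "argValues": [NA, NA], "value": None},
}
# DATAFLOW edges as (source, target, Order).
dataflow = [(1, 13, 0), (2, 13, 1)]


def run(start):
    fringe = [start]                      # the RTI's internal stack
    trace = []
    while fringe:
        nid = fringe.pop()
        node = nodes[nid]
        trace.append(nid)
        if node["type"] == "combiner":
            missing = [s for s, t, o in dataflow
                       if t == nid and node["argValues"][o] == NA]
            if missing:
                fringe.append(nid)        # revisit once the inputs have arrived
                fringe.extend(missing)
                continue
            node["value"] = node["func"](*node["argValues"])
            node["argValues"] = [NA] * len(node["argValues"])
        # Write the value into argValues of every outgoing DATAFLOW neighbour.
        for s, t, o in dataflow:
            if s == nid:
                nodes[t]["argValues"][o] = node["value"]
    return trace


visited = run(13)
```

The returned trace shows node 13 being visited twice, once to discover its missing arguments and once to compute, which is the pattern the step table in the source tracks.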
  • the representation in the format prescribed in the present work allows one to capture the effect of the mode of the preparation as well as other parameters involved in the preparation. If the function atoms are chosen based on what a robotic system can do, then the specification also becomes an executable recipe for that system. Further, this representation also illustrates use of additional properties on edges such as 'quantity' as described in the figure.
  • nodes 200 represent functions
  • nodes 100 represent data (food ingredients)
  • nodes 500 represent objects
  • nodes 300 represent combiners (steps in the recipe).
  • the edges 304 represent data flow (flow of ingredients)
  • edges 302 represent flow of control (sequence of steps)
  • edges 502 represent property edges which denote membership of an object and also have a property on them which defines the quantity of each ingredient to be used in the particular step.
  • queries such as:
  • EXAMPLE 3 Legal Document (Act/Regulations/Agreement etc.)
  • the graphs illustrate the representation of the above three clauses.
  • the advantage of this representation is that one can query about the parameters appearing in various clauses as well as one can compute the effect of execution of the action statement clauses.
  • simulation of various scenarios as well as identifying the optimum choice of parameters that comply with the specified regulations can be done.
  • action statements can also be functions with only side effects, such as 'Filing of a document'.
  • the output of the combiner would be the status of completion of the function (i.e. success or failure along with time of execution, acknowledgement etc.)
  • Cost of equity can be computed using two different methods: the Capital Asset Pricing Model (risk free rate + beta * (market return - risk free rate)), or a (hypothetical) thumb rule such as (risk free rate + inflation rate + 5 %).
  • Fig. 16 illustrates a graph that permits both these options with two agents (the nodes belonging to the agents (connected via Property edges) are not shown to reduce clutter).
  • the CAPM agent requires inputs - risk free rate (r_f), market return (r_m) and beta.
  • the ThumbRule Agent requires r_f and inflation rate (r_i). Note that the 'parameters' Object has the inputs required for both these agents. The multiple choices of agents are indicated by both the data edges having the same 'Order' property.
  • EXAMPLE 5 Probabilistic Graphical Models
  • the two main types of probabilistic graphical models are Bayesian Networks and Markov Networks represented by directed graphs and undirected graphs respectively.
  • the theory of Probabilistic Graphical Models demonstrates the construction and equivalence of a graph from a joint probability distribution such that the independence relations specified in the graph are valid in the specification of the joint probability distribution. This graph-based factorization permits one to exploit such independencies and consequently reduce the number of parameters required to state the joint probability distribution.
  • the difference between Bayesian Networks and Markov Networks is that in the former, each factor (that governs dependence of one node on the values of the others) represents a conditional probability distribution (CPD) and thus can be interpreted as possibly representing a causal link.
  • in the latter (Markov Networks), the factors are just functions that assign values to every value combination of the variables in their domain or scope.
  • the concepts of active trails in such graphs help us determine independencies and/or conditional independencies between any sets of variables.
  • local structure of the conditional probability distribution or factor specification permits further reduction in number of required parameters.
  • In Fig. 17 and Fig. 18, a simple Bayesian Network and a simple Markov Network, respectively, are represented in the ontology specified in the present invention.
  • the concepts of active trails for both the types need to be modified.
  • this representation enables representation of multiple CPDs for the same set of variables as can be seen in Fig. 17 for Bayesian Networks.
  • agents 1 and 2 represent two different CPDs that relate the same three variables A, B and C.
  • the same graph can be used to capture multiple relationships, which is not possible in the traditional approach.
  • local structure can be easily captured in the Agent definition.
  • In Fig. 18, which depicts a Markov Network, the explicit relationship between factors and variables is captured in the present representation. Again, as in the Bayesian Network case, multiple choices for the factor function between the same set of variables can be represented here. This explicit relationship between factors and variables is not captured in the traditional approach.
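The cost-of-equity excerpt above, in which two alternative agents (CAPM and the thumb rule) read from one shared 'parameters' object, can be illustrated with a minimal sketch. The function names, the dictionary keys and the numeric inputs are assumptions made for illustration only; they do not appear in the specification.

```python
from typing import Callable, Dict

Parameters = Dict[str, float]


def capm(p: Parameters) -> float:
    """CAPM agent: risk free rate + beta * (market return - risk free rate)."""
    return p["r_f"] + p["beta"] * (p["r_m"] - p["r_f"])


def thumb_rule(p: Parameters) -> float:
    """Hypothetical thumb-rule agent: risk free rate + inflation rate + 5 %."""
    return p["r_f"] + p["r_i"] + 0.05


# Both agents are alternatives for the same input slot of the combiner,
# mirroring the two data edges that share one 'Order' value.
AGENTS: Dict[str, Callable[[Parameters], float]] = {
    "CAPM": capm,
    "ThumbRule": thumb_rule,
}


def cost_of_equity(agent_name: str, parameters: Parameters) -> float:
    """Evaluate the chosen agent against the shared parameters object."""
    return AGENTS[agent_name](parameters)


# Illustrative inputs only; the 'parameters' object carries the inputs
# needed by both agents, and each agent reads just what it requires.
parameters = {"r_f": 0.04, "r_m": 0.10, "beta": 1.5, "r_i": 0.03}
```

Selecting the agent by name is one possible way to resolve the choice; the specification leaves the selection mechanism to the graph and the interpreter.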

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A graph based ontology modeling system comprising a knowledge base server containing information in the form of graph comprising a plurality of nodes and a plurality of edges; a client system having a recursive-traversing interpreter (RTI) enabling queries on said graph using a combination of eager and lazy evaluation method and updating values of various nodes in said graph; wherein said graph comprises: dataflow edges; controlflow edges; property edges; data nodes; function nodes; combiner nodes; branch nodes; agent nodes; and model nodes; wherein based on query of a user specifying the starting node for traversal, RTI updates the values of various nodes.

Description

TITLE OF THE INVENTION
Graph Based Ontology Modeling System
FIELD OF THE INVENTION
[001] This invention relates to a graph based ontology modeling system.
BACKGROUND OF THE INVENTION
[002] Ontology designates a data model that represents domain knowledge and is used to query and reason about the properties of the objects in that domain and the relations between them. The syntax and semantics of the specific ontology language used govern the expressiveness and utility of the language to model knowledge and derive new knowledge. Traditionally, ontologies have used a declarative language with first order logic like semantics along with a reasoning/inference system to handle queries. At present, most of these systems use Resource Description Framework (RDF) or similar approaches at their core. Since ontologies essentially capture relationship between entities, they have evolved from a textual representation to a graph based representation with the same underlying characteristics. With the advent of graph databases, it is now easier to store graphs directly instead of converting them into relational tables. This also helps in scaling as most queries in graphs can be viewed as graph traversals, which allow replacing costly join operations in equivalent relational algebra tables. However, this declarative language based approach is not very useful when an ontology is to be used for dynamical computations like updating the state of a system with many agents and objects. For this purpose, imperative or functional language characteristics would prove useful. This has been explored to some extent in the system called OpenCog, though for a different purpose. The OpenCog system uses a hypergraph database and to specify functions it uses a combinator-based approach. The disadvantage of such approaches is that for evaluating recursive functions using only graphs, they require creation of new nodes in the graph for every recursion instance and thus would run into scalability issues. 
[003] Another approach is the graph transformation approach, which has been discussed in "Graph Transformation in a Nutshell", Reiko Heckel, Electronic Notes in Theoretical Computer Science 148 (2006) 187-198. In this approach, rules are used to transform the graph database. Typically, these rules have a left-hand side and a right-hand side, where both are subgraphs. The left-hand side of a rule is matched with the nodes in the graph to identify a subgraph that has the same pattern. This subgraph is then replaced with the subgraph specified by the right-hand side of the rule. The OpenCog system has its foundations in a hypergraph re-writing system (a type of Graph Transformation).
[004] Graph based functional and imperative approaches have been used primarily in the form of visual programming languages and not for ontological engineering.
[005] US 2005/0097561 discloses a system and method for managing data, such as in data warehousing, analysis, or similar applications, where dataflow graphs are expressed as reusable map components, at least some of which are selected from a library of components, and map components are assembled to create an integrated dataflow application. Composite map components encapsulate a dataflow pattern using other maps as subcomponents. Ports are used as link points to assemble map components and are hierarchical and composite, allowing ports to contain other ports. The dataflow application may be executed in a parallel processing environment by recognizing the linked data processes within the map components and assigning threads to the linked data processes.
[006] US 7299458 discloses a method of forming a control-dataflow graph that includes separating a control flow graph into two or more basic blocks, and converting said two or more basic blocks into code blocks, where the code blocks are formed into the control-dataflow graph. Another embodiment of the invention includes a method of forming a control-dataflow graph that includes separating a control flow graph into two or more basic blocks, forming a load node in at least one of said basic blocks, forming a store node in at least one of said code blocks, inserting a delay node in at least one of said code blocks, segregating external hardware logic modules from said control flow graph, and converting said two or more basic blocks into code blocks, wherein the code blocks are formed into the control-dataflow graph.
[007] US 7703085 describes a system and method for compiling computer code written to conform to a high-level language standard to generate a unified executable containing the hardware logic for a reconfigurable processor, the instructions for a traditional processor (instruction processor), and the associated support code for managing execution on a hybrid hardware platform. Explicit knowledge of writing hardware-level design code is not required since the problem can be represented in a high-level language syntax. A top-level driver invokes a standard-conforming compiler that provides syntactic and semantic analysis. The driver invokes a compilation phase that translates the CFG representation being generated into a hybrid controlflow-dataflow graph representation representing optimized pipelined logic which may be processed into a hardware description representation. The driver invokes a hardware description language (HDL) compiler to produce a netlist file that can be used to start the place-and-route compilation needed to produce a bitstream for the reconfigurable computer. The programming environment then provides support for taking the output from the compilation driver and combining all the necessary components together to produce a unified executable capable of running on both the instruction processor and reconfigurable processor.
[008] US 7316001 describes a software system including an Object Process Graph for defining applications and a Dynamic Graph Interpreter that interprets Object Process Graphs. An Object Process Graph defines all of an application's manipulations and processing steps and all of the application's data. An Object Process Graph is dynamic, making it possible to change any aspect of an application's data entry, processing or information display at any time. When an Object Process Graph is interpreted, it functions to accept data, process the data and produce information output. Modifications made to an Object Process Graph while it is being interpreted take effect immediately and can be saved. Object Process Graphs and Dynamic Graph Interpreters can be deployed on single user workstation computers or on distributed processing environments where central servers store Object Process Graphs and run Dynamic Graph Interpreters, and workstation computers access the servers via the Internet or local intranets.
[009] In all the aforementioned approaches, the emphasis is on graphical specification of a programming language and hence they have very limited expressive power to handle meta-queries (i.e. structural queries) and to model complex agents, objects and their interactions. In a Dataflow Graph, as explained in US 2005/0097561, the nodes of the graph are data transformation processes that have input and output ports. The edges are data pipes that connect output ports to input ports. There are also parent to child edges denoting component ownership. These graphs could either be flat graphs or composite graphs. The execution of a dataflow graph is done by producing an execution plan, which is an internal and private data structure. This data structure is a flat (non-hierarchical) data structure. All the structural information of the graph, which is important for human understanding but is not needed for execution, is deleted while creating the execution plan.
[010] The first two types (RDF based and Graph Transformation based) of approaches described previously do not exploit the local structure of the graph for imperative computations. The third type of approach, though not an ontology, suffers from drawbacks in expressive power as well as in querying ability. To address these limitations, an approach is required that utilizes local structure as well as uses a recursive traversal based approach for computation (or interpretation). This recursive traversal provides better expressive power as well as querying ability. The other key benefit provided by a graph structure that has not been used in the past is the power of traversal to relax the API (Application Programming Interface) specification requirements used in imperative approaches. Besides, the ability to express declarative structures in imperative form, finding the inverse of imperative specifications (at least in some cases), utilizing structural properties such as shortest path to construct new imperative structures, optimization and using structural properties for transferring knowledge between different domains are requirements that have not been explored in the prior approaches.
[011] The present invention is contrived in consideration of the circumstances mentioned hereinbefore, and is intended to provide an ontology that has imperative as well as declarative characteristics, including a recursive-traversing interpreter (RTI) that enables running imperative queries on this ontology. The implementation of both the ontology and the recursive-traversing interpreter in a single-processor computer system is described. Its extension to multiple computers connected over a network, to enable both the representation of 'big data' and parallel processing, is also described.
SUMMARY OF THE INVENTION
[012] Disclosed herein is a graph based ontology modeling system comprising a knowledge base server containing information in the form of a graph comprising a plurality of nodes and a plurality of edges; and a client system having a recursive-traversing interpreter (RTI) enabling queries on said graph using a combination of eager and lazy evaluation methods and updating values of various nodes in said graph; wherein said graph comprises:
dataflow edges to capture the flow of data required in the ontology;
controlflow edges to specify the next node to be traversed by said RTI once the current node has been evaluated;
property edges to form any relation other than those specified by dataflow edges or controlflow edges, expressing all other relationships in the ontology;
data nodes having data and defined by name, value, argument, type and description, and function nodes having a function or a reference to a function and defined by name, value, argument, type and description, wherein said data nodes and function nodes may serve as terminal nodes for said RTI;
combiner nodes which enable said RTI to evaluate a function by providing it with the function and its arguments, wherein the RTI computes the value of a combiner node based on its input dataflow edges, writes the computed value to the nodes indicated by its output dataflow edges and proceeds to traverse the next node indicated by the controlflow edge leading out of the combiner node;
branch nodes which enable said RTI to evaluate a function by providing it with the function and its arguments, wherein the RTI computes the value of a branch node based on its input dataflow edges, writes the computed value to the nodes indicated by its output dataflow edges and proceeds to traverse the next node indicated by the computed value and one of the multiple outgoing controlflow edges leading out of the branch node;
object nodes defined by name, type and description and having property edges, wherein said object nodes identify and group together other nodes forming part of said object nodes;
agent nodes defined by name, type, description and status and having property edges, at least one combiner node, one input node and one output node of any type, wherein the property edge joining each such node to the agent node has an added property called sub-type which helps the RTI in its traversal; and
model nodes having an output node with a pre-populated value, wherein, when the RTI evaluates a model node, it fetches the value from the output node indicated by the outgoing dataflow edge from the model node;
wherein, based on a query of a user specifying the starting node for traversal, said recursive-traversing interpreter performs the following steps:
if the selected node is a data node or a function node, the RTI updates all other nodes which are linked from this node via outgoing dataflow edges with the value of the selected node;
if the selected node is a combiner node or a branch node, the RTI checks whether any arguments are required to compute the value of said node and, if required, traverses those nodes, else proceeds to compute the value of the present node and writes the output to all the nodes linked from this node via outgoing dataflow edges;
if the selected node is an object node, the RTI searches the object by traversing through the property edges associated thereto to find the data node whose value is sought by the combiner node or any other node that led to the object being traversed;
if the selected node is an agent node, the RTI identifies the input nodes and passes on the requirement to the combiner and then fetches this data from other input nodes to the combiner;
if the selected node is a model node, the RTI fetches the value(s) specified by the output node(s) and provides the value(s) to the node that led to the traversal of the model node;
wherein, once the value of any node is evaluated, the RTI traverses the next node indicated by the outgoing controlflow edge from the current node; however, only in the case of a branch node, the RTI chooses the appropriate node to be traversed next based on the value of the branch node and the order of the outgoing controlflow edges from the branch node.
[013] In some embodiments, the system comprises agent nodes linked to multiple combiner nodes and multiple input and output nodes of any type.
[014] In some embodiments, agent nodes are linked to other nodes through dataflow and/or controlflow edges to form a sub-graph representing a computation, wherein said sub-graph is compiled into a function or a sub-routine and the RTI invokes said function or sub-routine to evaluate said computation.
[015] In some embodiments, compilation of sub-graph into a function or a subroutine is either completed when such sub-graph is first encountered during a query or can be a pre-compiled link to the function or sub-routine, which is stored as an additional property of the initial node of the sub-graph.
[016] In some embodiments, RTI enables a recursive search within object and agent nodes to find required input nodes and/or output nodes.
[017] In some embodiments, RTI is associated with an internal data structure in the form of a stack to keep track of nodes to be visited.
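The traversal steps of paragraph [012], together with the stack of paragraph [017], can be sketched for the simplest case of atoms and combiners. This is a minimal illustration under stated assumptions, not the claimed implementation: the `Node` class, the `FUNCTIONS` table, the convention that argument slot 0 holds the function, and writing a combiner's output into the `value` of a data node are all choices made here for brevity. Branch, object, agent and model nodes are omitted.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Tuple

NA = "NA"  # marker for a value that has not been supplied yet


@dataclass
class Node:
    name: str
    type: str  # "data", "function" or "combiner" in this sketch
    value: Any = NA
    arg_values: List[Any] = field(default_factory=list)


# Hypothetical table of atomic functions, keyed by a function node's value.
FUNCTIONS: Dict[str, Callable[..., Any]] = {"add": lambda a, b: a + b}


def run_rti(nodes: Dict[str, Node],
            dataflow: List[Tuple[str, str, int]],  # (source, target, order)
            controlflow: Dict[str, str],
            start: str) -> List[str]:
    """Traverse from `start`; return node names in evaluation order."""
    fringe = [start]  # stack of nodes still to be visited
    evaluated: List[str] = []
    while fringe:
        name = fringe.pop()
        node = nodes[name]
        if node.type == "combiner":
            missing = [src for src, dst, order in dataflow
                       if dst == name and node.arg_values[order] == NA]
            if missing:
                fringe.append(name)     # come back once inputs are known
                fringe.extend(missing)  # visit the missing inputs first
                continue
            func = FUNCTIONS[node.arg_values[0]]  # slot 0: the function
            node.value = func(*node.arg_values[1:])
        elif node.value == NA:
            raise ValueError(f"terminal node {name!r} has no value")
        # Write the value to every outgoing dataflow neighbour.
        for src, dst, order in dataflow:
            if src == name:
                target = nodes[dst]
                if target.type == "combiner":
                    target.arg_values[order] = node.value
                else:
                    target.value = node.value
        evaluated.append(name)
        if name in controlflow:  # follow the outgoing controlflow edge
            fringe.append(controlflow[name])
    return evaluated


nodes = {n.name: n for n in [
    Node("add", "function", "add"),
    Node("x", "data", 2),
    Node("z", "data", 3),
    Node("C1", "combiner", arg_values=[NA, NA, NA]),
    Node("y", "data"),
]}
dataflow = [("add", "C1", 0), ("x", "C1", 1), ("z", "C1", 2), ("C1", "y", 0)]
controlflow = {"C1": "y"}
order = run_rti(nodes, dataflow, controlflow, "C1")
```

Starting at the combiner with every argument slot empty, the interpreter defers it, visits its three inputs, evaluates it, writes the result to `y` and then follows the controlflow edge.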
[018] Another embodiment of the invention includes a computer program product including instructions recorded on a non-transitory computer readable storage media, which, when executed by at least one processor, cause the at least one processor to perform a method as described herein. In some embodiments, base server and client system may be one computer.
[019] Additional novel features shall be set forth in part in the description that follows, and in part will become apparent to those skilled in the art upon examination of the following specification or may be learned by the practice of the invention. The invention may be better understood and further advantages and uses thereof more readily apparent when considered in view of the following detailed description of exemplary embodiments, taken with the accompanying drawings. These embodiments describe only a few of the various ways in which the principles of various other embodiments may be realized and the described embodiments are intended to include all such embodiments and their equivalents and the reference numerals used in the accompanying drawings correspond to the like elements throughout the description. The features and advantages of the invention may be realized and attained by means of the instrumentalities, combinations and methods pointed out in the appended claims.
A BRIEF DESCRIPTION OF THE DRAWINGS
[020] The above-mentioned and other features and advantages of the various embodiments of the invention, and the manner of attaining them, will become more apparent and will be better understood by reference to the accompanying drawings, wherein:
[021] Fig. 1 is an exemplary embodiment of 'Atom Node' according to present invention showing a data node with name 'N';
[022] Fig. 2 is an exemplary embodiment of 'Atom Node' according to present invention showing a function node with name 'Add';
[023] Fig. 3 is an exemplary embodiment of 'Combiner Node' with name 'C1' according to present invention;
[024] Fig. 4 is an exemplary embodiment of 'Branch Node' with name 'B1' according to present invention;
[025] Fig. 5 is an exemplary embodiment of 'Object Node' with name 'O1' according to present invention;
[026] Fig. 6 is an exemplary embodiment of 'Agent Node' with name 'A1' according to present invention;
[027] Fig. 7 is an exemplary embodiment of 'Model Node' with name 'M1' according to present invention;
[028] Fig. 8 and Fig. 9 show block diagrams illustrating implementation of the ontology according to embodiments of the present invention;
[029] Fig. 10 shows a graph illustrating an example for 'Euler GCD Computation with Objects as Output' according to present invention;
[030] Fig. 11 shows a graph illustrating an example for 'Food Network' according to present invention;
[031] Figs. 12, 13 and 14 show graphs illustrating an example for 'Legal Document' according to present invention;
[032] Figs. 15 and 16 show graphs illustrating an example for 'Financial Modelling' according to present invention; and
[033] Figs. 17 and 18 show graphs illustrating an example for 'Probabilistic Graphical Models' according to present invention.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
[034] In the Ontology specified in the present invention, seven types of nodes and three types of edges form the basic elements that provide imperative and declarative expressive power. The node types have been defined loosely based on recursive function theory. In recursive function theory, the building blocks can be simplified as comprising a given function, the principle of composition of functions and the principle of primitive recursion. In most modern functional programming languages based on lambda calculus (an alternate system to recursive function theory, but shown to be equivalent by Alonzo Church), composition is achieved through an 'eval' operation applied to a set of elements, each of which is a part of the basic functions or data types described in the programming language. Recursion is achieved through the ability of a function to call itself. In imperative programming languages, loops serve the same purpose; recursion is also permitted to increase expressive power, although it is generally the less efficient approach. In the present invention, composition is represented by two types of nodes (called 'combiner' and 'branch' in the present description) and recursion is achieved through looping, using the fact that the two types of nodes representing composition permit two different types of traversal behaviour. Two more types of nodes are used to represent data and functions (called 'data' and 'function' in the present description), which are the building blocks or 'atoms' as referred to in this specification. The final three types of nodes are used for declarative specifications such as constructing compound entities from the atoms mentioned above and the nodes that serve to represent composition (these are called 'object', 'agent' and 'model' in the present description). Another kind of declarative specification refers to representing declarative relations in the form of functions, and one of the final three types of nodes is used in such cases. In the present specification, 'compounds' refer to all nodes that are not atoms, that is, they are either combiners, branches, agents, objects or models.
[035] Referring to Figs. 1 and 2, an exemplary embodiment of node 'atom' is shown according to present invention with only a partial list of properties. There are two kinds of atoms in the present ontology: data 100, as shown in Fig. 1, and function 200, as shown in Fig. 2. These nodes are called atoms for two different reasons. The first is that they are the basic building blocks for any imperative specification. The second reason is that these nodes serve as terminal nodes for the recursive-traversing interpreter. This point is covered in detail in the description of the recursive-traversing interpreter. In the present implementation, using a single processor computer and a property graph database, each atom (data or function) is represented as a node with the following properties:
1. Name: (A string of alphanumeric symbols)
2. Value: (A string of alphanumeric symbols or an integer or a real number)
3. ArgValues: (An array of strings of alphanumeric symbols or integers or real numbers)
4. Type: (A string label which is one of the seven types of nodes)
5. Description/Comments: (Optional)
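The five properties listed above map naturally onto a small record type. The sketch below is one possible in-memory representation, assuming Python dataclasses rather than the property graph database of the described implementation; the field names, the lowercase type labels and the example values are illustrative assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Union

Scalar = Union[str, int, float]

# The seven node types of the ontology; atoms use only the first two.
NODE_TYPES = ("data", "function", "combiner", "branch",
              "object", "agent", "model")


@dataclass
class Atom:
    name: str                       # 1. Name
    value: Scalar                   # 2. Value
    type: str                       # 4. Type
    arg_values: List[Scalar] = field(default_factory=list)  # 3. ArgValues
    description: str = ""           # 5. Description/Comments (optional)

    def __post_init__(self) -> None:
        # An atom is either a data node or a function node.
        if self.type not in ("data", "function"):
            raise ValueError(f"{self.type!r} is not an atom type")


# Nodes named as in Figs. 1 and 2; the values are made up for illustration.
n = Atom(name="N", value=10, type="data")
add = Atom(name="Add", value="add", type="function")
```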
[036] A unique feature in representing atoms in this work is the use of the property ArgValues. This is an array, primarily used to keep value inputs from other nodes. The utility of this is explained in the section describing the recursive-traversing interpreter. Another key representational feature, in the case of function nodes, is the fact that these nodes too have a value property. This value represents the operation/function to be performed by the recursive-traversing interpreter. The name of the function node does not affect this process and hence a function node can also be used to represent dynamically evolving processes. It may be noted that only a partial list of properties is shown in both the figures.
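The point that a function node is resolved through its value, and not through its name, can be illustrated as follows. The lookup table approach and every identifier below are assumptions made for this sketch; the specification does not prescribe how an interpreter maps a value to executable code. Two tables are shown because each client may hold its own implementation of the atomic functions.

```python
import operator
from typing import Any, Callable, Dict


def resolve(function_node: Dict[str, Any],
            atomic_functions: Dict[str, Callable[..., Any]]) -> Callable[..., Any]:
    """Pick the operation from the node's value; the name is never consulted."""
    return atomic_functions[function_node["value"]]


node = {"name": "Add", "value": "plus", "type": "function"}
renamed = {**node, "name": "Anything"}  # same value, different name

# Two hypothetical clients with different implementations of 'plus'.
client_numeric = {"plus": operator.add}
client_symbolic = {"plus": lambda a, b: f"{a}+{b}"}

numeric_result = resolve(node, client_numeric)(2, 3)
symbolic_result = resolve(node, client_symbolic)(2, 3)
```

Renaming the node leaves the behaviour unchanged, while swapping the client table changes it, even though both clients read the same node.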
[037] It should be noted that in addition to standard functions, a function node could also have a graph traversal operation as its value, thereby adding to the expressive power of the ontology.
[038] The second type of nodes described in the present invention is 'compound'. All nodes that are not atoms are called compounds. Broadly they can be divided into three groups: the first comprising combiners and branches, the second comprising agents and objects and the final group comprising models. The division of the compounds into these three groups is merely for ease in understanding their semantic role and does not play any other role in the ontology described in the present invention.
[039] Referring to Fig. 3, an exemplary embodiment of 'combiner' is shown according to present invention. The function of the combiners is to enable the recursive-traversing interpreter (RTI) to evaluate a function by providing it with the function and its arguments. In a more general sense, they provide objects to agents to generate output. The input and output agents, objects, functions, data (or other combiners or branches) are indicated by dataflow edges. (These are one of the three types of edges, the others being controlflow and property. These edges are described in detail in a later part of the description.) The RTI computes the value of the combiner based on its input (incoming) dataflow edges and writes the computed value to the nodes indicated by its output (outgoing) dataflow edges. Once the output is written, the next node to be traversed by the recursive-traversing interpreter is given by the controlflow edge leading out from the combiner node. As shown in Fig. 3, the node 300 is a combiner. The nodes 200 and 100, named Func (atom - function) and x (atom - data) respectively, are the inputs to the combiner C1 300. The output of the combiner is written to node 100' named y (atom - data). Once the RTI writes the output to node 100' y, traversal control is transferred to the unnamed node. It should be noted that edges 304 represent data flow and edge 302 represents control flow.
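The arrangement of Fig. 3 can be written down as a typed edge list and evaluated in a few lines. This is a sketch under assumptions: the tuple layout, the name `next` for the unnamed node, the `square` function and the rule that the lowest-order input is the function are all illustrative choices, not details taken from the figure.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

Edge = Tuple[str, str, str, int]  # (source, target, edge type, order)

# Fig. 3 as an edge list; "next" stands in for the unnamed node.
EDGES: List[Edge] = [
    ("Func", "C1", "dataflow", 0),
    ("x", "C1", "dataflow", 1),
    ("C1", "y", "dataflow", 0),
    ("C1", "next", "controlflow", 0),
]


def sources(node: str, kind: str) -> List[str]:
    """Nodes with an edge of the given type into `node`, sorted by order."""
    found = [(order, src) for src, dst, k, order in EDGES
             if dst == node and k == kind]
    return [src for _, src in sorted(found)]


def targets(node: str, kind: str) -> List[str]:
    """Nodes reached from `node` by an edge of the given type, sorted by order."""
    found = [(order, dst) for src, dst, k, order in EDGES
             if src == node and k == kind]
    return [dst for _, dst in sorted(found)]


def evaluate_combiner(name: str, values: Dict[str, Any],
                      functions: Dict[str, Callable[..., Any]]) -> Optional[str]:
    """Compute the combiner, write its outputs, return the next node."""
    inputs = sources(name, "dataflow")
    func = functions[values[inputs[0]]]  # first input: the function atom
    result = func(*(values[n] for n in inputs[1:]))
    for out in targets(name, "dataflow"):
        values[out] = result
    following = targets(name, "controlflow")
    return following[0] if following else None


values = {"Func": "square", "x": 4, "y": None}
next_node = evaluate_combiner("C1", values, {"square": lambda v: v * v})
```

Because the structure is held as plain edges, the same list also answers structural queries such as `sources("C1", "dataflow")` without running any computation.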
[040] Referring to Fig. 4, an exemplary embodiment of a 'branch' is shown according to the present invention. Branches serve the same purpose as combiners, but behave slightly differently after the output is written. In this case, the next node to be traversed is indicated by one of the multiple outgoing controlflow edges from the branch node, with the value computed by the RTI for the branch node being used to select the appropriate controlflow edge. Branch nodes make it possible to create loops in the present ontology. Both combiners and branches have the same types of properties as described in the previous section. As shown in Fig. 4, the node 400, B1, is a branch. Its inputs are a logical function L 200 and a data variable x 100. Its output is written to the data node y 100'. The logical function 200 can have a value of 0 or 1 in this case. Similar to the combiner, at the branch node 400, the RTI computes its output and writes it to y. Then, based on L(x) being 0 or 1, it transfers control to node N1 or N2 respectively. In the general case, this can be extended to any number of possible choices beyond 0 and 1.
[041] Referring to Fig. 5, an exemplary embodiment of an 'object node' is shown according to the present invention. Object nodes are used to group together other nodes. These other nodes could be nodes of any type, including other object nodes. The nodes that are 'part' of an object are linked, using property edges, to an object node bearing the name of the object. Any other specific property of this relation can be captured in the edge using other appropriate key-value pairs as properties. Object nodes thus require only three properties: name, type and description/comments (which have the same meaning as described in the section on Atoms). As shown in Fig. 5, node 500 is an object. It has two data nodes 100 named height and weight. These two are atoms. Note that the object node 500 is like a 'tag' that identifies the nodes that 'belong' to it. Instead of the data nodes 100, it could also have other object nodes to form more complex objects. The object node itself does not have any value. The edges 502 in Fig. 5 are neither dataflow nor controlflow edges. Rather, they are property edges.
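The branch rule of paragraph [040] can be sketched as a small lookup: the computed value of the branch is matched against the 'Order' of each outgoing controlflow edge. The function name and the pair-based edge representation are assumptions for illustration only.

```python
def next_after_branch(branch_value, controlflow_edges):
    """controlflow_edges: (order, target) pairs leaving the branch node."""
    wanted = int(branch_value)        # True/False map to 1/0
    for order, target in controlflow_edges:
        if order == wanted:
            return target
    raise LookupError(f"no controlflow edge with Order {wanted}")

# Fig. 4: L(x) == 0 leads to N1, L(x) == 1 leads to N2.
edges_from_b1 = [(0, "N1"), (1, "N2")]
print(next_after_branch(0, edges_from_b1))     # N1
print(next_after_branch(True, edges_from_b1))  # N2
```

Adding further `(order, target)` pairs extends the same lookup to more than two choices.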
[042] Referring to Fig. 6, an exemplary embodiment of an 'agent' is shown according to the present invention. Agents are similar to objects but have some added features. Each agent 600 has at least one combiner node 300, one input node 100 and one output node 100', with the latter two being nodes of any type. The property edge 602 joining the necessary combiner node 300 to the agent has an added property called 'Sub-type', which has the value 'Initial'. Similarly, the edge 604 joining the input node has 'Sub-type' 'Input' and the edge 606 joining the output node has 'Sub-type' 'Output'. It is to be noted that there can be multiple combiner nodes and input and output nodes linked to an agent. However, every agent can have only one node with the edge label 'Initial', and this node must be a combiner node. If an agent node has links to multiple input and/or output nodes, these can be ordered using additional properties on the property edges linking them to the agent. Note that these nodes can also be linked (and generally are linked) through dataflow and/or controlflow edges to form an imperative sub-graph. Agent nodes require the three properties that are needed by objects, with an added property called 'Status'. This property takes one of two values (agent-NI and agent-I, specifying whether the agent is not invoked or invoked respectively), the use of which is explained in the section describing the recursive-traversing interpreter.
[043] Referring to Fig. 7, an exemplary embodiment of a 'model node' is shown according to the present invention. A model node is very similar to a combiner or a branch node. The only difference is that a model node already has its output node value populated. When the recursive-traversing interpreter evaluates a model node, it does not compute any function; rather, it fetches the value from the output node indicated by the outgoing dataflow edge from the model node. Thus, a model node helps to capture declarative specifications in this ontology. It is to be noted that this is significantly different from an RDF-like representation, as all the necessary imperative steps (functions as well as data) are captured as input dataflow edges to the model node. This makes it possible to define new functions (i.e. values inside function nodes), which can then be used as input to a combiner or a branch node. As shown in Fig. 7, node 700 M1 is a model. It represents the fact that B is the brother of A.
[044] The three types of edges mentioned hereinbefore will now be described in detail. The three types of edges used in the present invention are dataflow edges, control flow edges and property edges.
[045] Dataflow edges capture the flow of data required for evaluations in the ontology. These edges have a property called 'Order', which has an integer value. This value represents the position of this input among the inputs required by the node that the edge points to. Additionally, instead of using the 'value' property of every node as its value, one could specify, on the dataflow edge that links the current node to other nodes, the name of the property to be used as the value.
[046] Controlflow edges are used to specify the next node to be traversed by the recursive-traversing interpreter once the current node has been evaluated. They too have the 'Order' property. When a branch node is the source of these edges, the value of the branch node and the order of the outgoing controlflow edge are matched to determine the next node to be traversed. For all other nodes, multiple outgoing controlflow edges may be sorted by order, or their traversal may be parallelized if the hardware used for the implementation supports parallel execution.
[047] Property edges are used to form any relation other than those specified by the above two edge types. Primarily they are used to create objects and agents by linking an object or an agent node to other nodes. These edges form the residual class of edges and by adding further properties on these edges, all other relationships can be expressed in this ontology.
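One possible in-memory shape for the three edge types, sketched in Python. The `Edge` class, the `incoming` helper and the node ids are hypothetical; only the edge kinds and the 'Order' property come from the description.

```python
from dataclasses import dataclass, field

@dataclass
class Edge:
    kind: str                  # "DATAFLOW", "CONTROLFLOW" or "PROPERTY"
    start: int
    end: int
    props: dict = field(default_factory=dict)

def incoming(edges, node_id, kind):
    """Edges of one kind that point at node_id, sorted by their 'Order' property."""
    hits = [e for e in edges if e.end == node_id and e.kind == kind]
    return sorted(hits, key=lambda e: e.props.get("Order", 0))

# Hypothetical graph: nodes 1-3 feed combiner 10, which belongs to object 20.
edges = [
    Edge("DATAFLOW", 1, 10, {"Order": 2}),
    Edge("DATAFLOW", 2, 10, {"Order": 0}),
    Edge("DATAFLOW", 3, 10, {"Order": 1}),
    Edge("CONTROLFLOW", 10, 11, {"Order": 0}),
    Edge("PROPERTY", 10, 20, {"note": "member of object 20"}),
]

print([e.start for e in incoming(edges, 10, "DATAFLOW")])  # [2, 3, 1]
```

Keeping arbitrary key-value pairs in `props` mirrors the statement that further relationships are expressed by adding properties to property edges.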
[048] As mentioned earlier, the present invention provides an ontology that has imperative as well as declarative characteristics, including a recursive-traversing interpreter that enables running imperative queries on the ontology described hereinbefore. The recursive-traversing interpreter is now described in detail.
[049] The recursive-traversing interpreter updates the values of various nodes in the graph as specified in the ontology model described hereinbefore. In order to do this, it uses its internal library of functions, which in this case is the standard function library provided by most common high-level programming languages. However, this library can be the Instruction Set Architecture of the microprocessor and thus the hardware would be directly linked to the present ontology. Once a starting node for traversal is specified by the user query, this interpreter does the following steps:
1. If that node is an atom, this interpreter updates all other nodes which are linked from this node via outgoing dataflow edges with the value of this atom.
2. If that node is a combiner or a branch, the interpreter checks whether it requires any arguments for its value to be computed. The 'argValues' property of the node serves to hold this data. If any value is required, the interpreter recursively traverses the nodes that supply it; if not, it computes the value of the present node and writes its output to all the nodes linked from this node via outgoing dataflow edges.
3. If the node is an object, the interpreter recursively searches the object by traversing through its property edges to find the atom whose value is sought by the combiner or any other node that led to the object being traversed.
4. If the node is an agent, the interpreter identifies the input nodes and passes on the requirement to the combiner and then fetches this data (from other input nodes to the combiner). During this first call, the agent node's status is 'agent-NI', that is, not invoked. Once this call is made and the inputs to the agent are provided, the agent node's status is changed to 'agent-I', that is, invoked. This status then enables the RTI to fetch the value/s from the output node/s of the agent and then update, with these output values, all nodes linked by outgoing dataflow edges to the combiner or branch that led to the traversal of the agent node.
5. If the node is a model, the interpreter fetches the value/s specified by the output node/s (that is, the node/s that is/are connected by outgoing dataflow edges from the model node) and provides this value/s to the node that led to the traversal of the model node.
6. Once the value of any node is evaluated, the interpreter then traverses the next node indicated by the outgoing controlflow edge from the current node. Only in the case of a branch node, the interpreter chooses the appropriate node to be traversed next based on the value of the branch node and the order of the outgoing controlflow edges from the branch node.
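Steps 1, 2 and 6 above can be sketched as a stack-driven loop. This is a simplified illustration, not the implementation referred to in the description: objects, agents and models (steps 3 to 5) are omitted, the names `run_rti`, `atom` and `comb` are hypothetical, and the 'Order' values are assumed. For concreteness the demo graph reuses node ids 0 to 12 of Example 1 (Table 2) later in this description; the edge from node 3 to node 0 is an assumption, because the start node of edge 2 is missing from that table.

```python
import operator

NA = object()  # marker for an argument that has not been supplied yet

def run_rti(nodes, dataflow, controlflow, start):
    """nodes: id -> {"type", "value", "args"}; edges are (start, end, order) triples."""
    fringe = [start]                          # stack of nodes still to be visited
    while fringe:
        nid = fringe[-1]
        node = nodes[nid]
        if node["type"] in ("combiner", "branch"):
            ins = sorted((o, s) for s, e, o in dataflow if e == nid)
            missing = [s for o, s in ins if node["args"].get(o, NA) is NA]
            if missing:                       # step 2: traverse the suppliers first
                fringe.extend(missing)
                continue
            func, *args = (node["args"][o] for o, _ in ins)
            node["value"] = func(*args)
            node["args"] = {}                 # arguments are used once, then erased
        fringe.pop()
        for s, e, o in dataflow:              # steps 1 and 2: update all consumers
            if s == nid:
                target = nodes[e]
                if target["type"] in ("combiner", "branch"):
                    target["args"][o] = node["value"]
                else:
                    target["value"] = node["value"]
        outs = sorted((o, e) for s, e, o in controlflow if s == nid)
        if node["type"] == "branch":          # step 6: the value selects the edge
            outs = [(o, e) for o, e in outs if o == int(node["value"])]
        fringe.extend(e for _, e in outs[:1])
    return nodes

def atom(kind, value):
    return {"type": kind, "value": value, "args": {}}

def comb(kind="combiner"):
    return {"type": kind, "value": None, "args": {}}

nodes = {
    0: comb("branch"), 1: atom("function", lambda op, a, b: op(a, b)),
    2: atom("data", operator.eq), 3: atom("data", 0), 4: comb(),
    5: atom("data", 42), 6: atom("data", 24), 7: atom("function", operator.mod),
    8: atom("function", lambda v: v), 9: comb(), 10: comb(), 11: comb(),
    12: atom("data", None),
}
dataflow = [(1, 0, 0), (2, 0, 1), (3, 0, 2), (4, 0, 3), (7, 4, 0), (5, 4, 1),
            (6, 4, 2), (10, 5, 0), (11, 6, 0), (9, 12, 0), (8, 9, 0), (6, 9, 1),
            (8, 10, 0), (6, 10, 1), (8, 11, 0), (4, 11, 1)]
controlflow = [(0, 10, 0), (0, 9, 1), (10, 11, 0), (11, 0, 0)]

run_rti(nodes, dataflow, controlflow, start=0)
print(nodes[12]["value"])  # 6
```

Erasing `args` after each computation is what forces the loop through nodes 10, 11 and 0 to fetch fresh values on every pass.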
[050] Since this interpreter recursively traverses the graph, it is called a recursive-traversing interpreter. As indicated in point 1 above, the atoms form the base case of the recursion. In this work, a combination of eager and lazy evaluation is used for the interpreter's recursive evaluation. The eager part ensures that when the interpreter computes a node's value, all the nodes that receive input from this node are immediately updated. The lazy part of the interpreter is used in combiners and branches. For these nodes, the argValues property is erased after every computation. These values are then sought only when the node is traversed again. This ensures that once a set of values is used for a computation, it is not reused. Also, the next fetching of these values is done only when they are required. This strategy plays a key role in enabling looping, which simulates recursion in the present ontology. Another key feature of this interpreter is its ability to reduce Application Programming Interface (API) specification requirements by enabling a recursive search (exact or approximate) within objects and agents to find required input nodes. Thus, objects created by different people can be used by agents created by different people without each knowing the structure of the other, as long as they use a common name or description keywords to represent the same variable.
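The recursive search within objects can be sketched as a depth-first walk over property edges. The helper name `find_atom` is hypothetical, only exact name matching is shown, and the property edges are assumed to be acyclic. The sample data uses nodes 16, 17 and 19 of Example 1 later in this description, where property edges run from the part to the object that owns it.

```python
def find_atom(nodes, property_edges, object_id, name):
    """Return the id of the first atom called `name` inside the object, or None.
    property_edges: (part_id, owner_id) pairs."""
    for part, owner in property_edges:
        if owner != object_id:
            continue
        node = nodes[part]
        if node["type"] in ("object", "agent"):      # nested container: recurse
            found = find_atom(nodes, property_edges, part, name)
            if found is not None:
                return found
        elif node["name"] == name:
            return part
    return None

nodes = {
    16: {"name": "input-Object", "type": "object"},
    17: {"name": "second object", "type": "object"},
    19: {"name": "input-m", "type": "data"},
}
property_edges = [(17, 16), (19, 17)]

print(find_atom(nodes, property_edges, 16, "input-m"))  # 19
```

An approximate search could replace the equality test with a match on description keywords, as the paragraph above suggests.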
[051] Additionally, in order to make queries involving computation more efficient, the sub-graph representing the computation could be compiled into a function or a subroutine in any programming language. Then the RTI could invoke this function or subroutine for the result of the computation, instead of following its normal traversal based procedure. This work of compiling subgraphs into functions or sub-routines could be done either when they are first encountered in the course of a query or they can be pre-compiled and the link to the compiled target can be stored as an additional property of the initial node of the sub-graph.
[052] Furthermore, the RTI implementation has an internal data structure, a stack, which it uses to keep track of nodes to be visited. This stack can also be moved to the graph database as a separate node, with property edges leading out from this node indicating the nodes to be visited and the 'Order' property on these edges indicating the order in which the nodes need to be visited by the RTI.
[053] Furthermore, some data and function nodes can also be declared to be non-atomic, so that the behaviour of the RTI when traversing these nodes becomes similar to that when it is traversing a combiner. This flexibility helps in some specific domains and is a convenience rather than a requirement.
[054] Another point to be noted is that multiple dataflow edges could provide input to the same node with the same order (that is, as possible candidates for the same value required by the node). A choice can be made amongst these edges based on any predetermined criterion (say better reliability if the multiple inputs represent data from different types of sensors or lower cost if the multiple inputs are from different agents, each of which has a different computational cost etc.) or any criterion specified by the user dynamically at run-time.
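A minimal sketch of choosing among dataflow edges that share the same order. The edge dictionaries, the `reliability` field and the function name `resolve_inputs` are assumptions for illustration; any criterion, fixed in advance or supplied at run time, can be passed as `cost`.

```python
from itertools import groupby

def resolve_inputs(edges, cost):
    """Keep one edge per 'order' value: the candidate with the lowest cost."""
    ordered = sorted(edges, key=lambda e: e["order"])
    chosen = []
    for _, group in groupby(ordered, key=lambda e: e["order"]):
        chosen.append(min(group, key=cost))
    return chosen

edges = [
    {"order": 0, "source": "function"},
    {"order": 1, "source": "sensor-A", "reliability": 0.90},
    {"order": 1, "source": "sensor-B", "reliability": 0.99},
]

# Criterion: prefer the most reliable source (higher reliability, lower cost).
picked = resolve_inputs(edges, cost=lambda e: -e.get("reliability", 1.0))
print([e["source"] for e in picked])  # ['function', 'sensor-B']
```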
[055] The ability of an ontological system to generate new knowledge in response to user queries is a key factor that determines its utility. The system described so far lends itself to any query that requires an evaluation. It can also handle all declarative-type queries and inferences that are possible in other RDF-like declarative systems. However, an added advantage here is the ability to compute inverses of (at least some) imperative processes, which enables answering a much larger class of user queries and permits optimization, system design etc. There are three distinct strategies for this automatic inverse computation, described as follows:
[056] Structural Inversion: In Structural Inversion, the inversion is done at the node level, with each combiner or branch being replaced by a new combiner or branch with the dataflow edges appropriately changed and the function nodes replaced with their inverses. If the inverses of the function nodes are not known a priori or the sub-graph being traversed is cyclic, then this approach is not feasible.
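For a linear, acyclic chain of single-argument functions, structural inversion amounts to reversing the chain and swapping each function for its known inverse. The sketch below assumes such a chain; the function names and the `INVERSE_OF` table are hypothetical.

```python
FUNCS = {
    "add3": lambda x: x + 3, "sub3": lambda x: x - 3,
    "double": lambda x: x * 2, "halve": lambda x: x / 2,
}
INVERSE_OF = {"add3": "sub3", "double": "halve"}   # inverses known a priori

def invert_chain(chain):
    """Reverse the order of the steps and replace each by its inverse."""
    try:
        return [INVERSE_OF[name] for name in reversed(chain)]
    except KeyError as missing:
        raise ValueError(f"no known inverse for {missing}") from None

def run_chain(chain, x):
    for name in chain:
        x = FUNCS[name](x)
    return x

forward = ["add3", "double"]
y = run_chain(forward, 4)                     # (4 + 3) * 2 = 14
print(run_chain(invert_chain(forward), y))    # 4.0
```

The `ValueError` branch corresponds to the infeasible case named above, where an inverse is not known in advance.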
[057] Value-based Inversion: Two approaches are possible here:
[058] Numerical: In this a numerical approximation method such as Picard's iteration or Newton-Raphson's method is used, provided the sub-graph being traversed is numerical, acyclic and has all the inputs relevant for the numerical method being used.
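A minimal sketch of the numerical approach using the Newton-Raphson method: the sub-graph is treated as a numeric function `f`, and the input `x` with `f(x) = y` is sought. The central-difference slope, the tolerances and the function name `newton_inverse` are choices made for this illustration.

```python
def newton_inverse(f, y, x0, tol=1e-10, max_iter=50, h=1e-6):
    """Find x such that f(x) is approximately y, starting from the guess x0."""
    x = x0
    for _ in range(max_iter):
        residual = f(x) - y
        if abs(residual) < tol:
            return x
        slope = (f(x + h) - f(x - h)) / (2 * h)   # numerical derivative
        if slope == 0:
            raise ZeroDivisionError("flat slope; pick another starting point")
        x -= residual / slope
    raise RuntimeError("Newton-Raphson did not converge")

def forward(x):
    # Stand-in for evaluating an acyclic numerical sub-graph.
    return x ** 3 + x

x = newton_inverse(forward, 10.0, x0=1.0)
print(round(x, 6))  # 2.0
```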
[059] Pseudo: Here, the history of various agents is used to group agents with their inverses for particular sub-domains of these agents. These are not actual inverses, as they only appear to be inverses within a particular sub-domain. Hence, they are called pseudo inverses. It is to be noted that this approach can be adopted only if a history of the agents' actions has been recorded and pseudo-inverse pairs have been identified prior to evaluating the query.
[060] Probabilistic Inversion: This appears to be the most promising strategy for general cases. It uses Bayes' Theorem to compute inverses. The domains of the input and output variables are discretized (if they are not already sufficiently discrete) and then the conditional probability relation between the new input bins and the output bins is computed. This conditional probability, along with a prior distribution for the input variables (estimated or assumed), enables us to compute the inverse by specifying the posterior probabilities of the various output bins using Bayes' rule.
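A minimal sketch of one reading of probabilistic inversion: given a prior over input bins and the conditional probability of each output bin for each input bin, Bayes' rule yields a distribution over input bins for an observed output bin. The bin names, the numbers and the function name are invented for illustration, and the direction of the posterior (over input bins) is an interpretation of the paragraph above.

```python
def posterior_over_inputs(prior, likelihood, observed_output):
    """prior: P(input bin); likelihood[input][output]: P(output bin | input bin)."""
    joint = {i: prior[i] * likelihood[i].get(observed_output, 0.0) for i in prior}
    total = sum(joint.values())
    if total == 0:
        raise ValueError("observed output has zero probability under the model")
    return {i: p / total for i, p in joint.items()}

prior = {"low": 0.5, "high": 0.5}
likelihood = {
    "low": {"small": 0.8, "large": 0.2},
    "high": {"small": 0.3, "large": 0.7},
}

post = posterior_over_inputs(prior, likelihood, "large")
print({k: round(v, 3) for k, v in post.items()})
```

With these numbers, observing a "large" output makes the "high" input bin the more probable one.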
[061] In the present ontology, machine learning is possible, which is similar to tuning a neural network. Here, learning could also involve changing the values of the function nodes in addition to changing the values of the data nodes. However, the main strength of neural networks, which is the back propagation algorithm, does not seem to have a ready analogue here. However, this system is suited to another key problem - that of transferring learning/knowledge.
[062] The ability to transfer learning or knowledge from one domain to another is a key challenge in Artificial Intelligence today. One limitation to this is the multitude of data structures used in various domains making it a huge combinatorial exercise in finding all potential isomorphisms between the two systems. It must further be noted here that these isomorphisms need only be approximate in most cases. One of the promising applications of the use of the ontology model according to present invention is to enable transferring knowledge between two different domains. A key step in knowledge transfer, particularly in fields such as reinforcement learning is to find similar tasks or functions in two different domains. The Ontology model proposed in this work, enables comparing agents, objects or models in different domains by comparing only their graph structure and not their exact values. This structural similarity finding can be automated, thus providing one option for possible automated mapping between different domains in Transfer Learning.
[063] As described hereinbefore, an added advantage of the present ontology is the ability to compute inverses of (at least some) imperative processes to enable answering a much larger class of user queries, permitting optimization, system design etc. Optimization can be of two types in querying the present ontology. The first type is called a structural one, where the best agent to compute a particular output is to be identified. This problem can be solved similar to an edge-weighted shortest path problem. Further, as different agents can give the same output, but may require different inputs that differentiate them based on their accuracy and precision, the number of inputs needed by the agent too could add to the cost of the objective function. On the other hand, the accuracy and precision provided by the agent could then contribute favourably to the objective function.
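The structural optimization described above can be sketched as a cheapest-path search with Dijkstra's algorithm. The graph, the agent names and the edge weights are hypothetical; in practice each weight would combine the cost of the inputs an agent needs with a credit for the accuracy and precision it provides.

```python
import heapq

def cheapest_path(graph, source, target):
    """graph: node -> list of (neighbour, cost). Returns (total_cost, path) or None."""
    best = {source: 0.0}
    queue = [(0.0, source, [source])]
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == target:
            return cost, path
        if cost > best.get(node, float("inf")):
            continue                          # stale queue entry
        for neighbour, weight in graph.get(node, []):
            new_cost = cost + weight
            if new_cost < best.get(neighbour, float("inf")):
                best[neighbour] = new_cost
                heapq.heappush(queue, (new_cost, neighbour, path + [neighbour]))
    return None

graph = {
    "inputs": [("agent-1", 3.0), ("agent-2", 2.0)],
    "agent-1": [("output", 0.5)],             # needs more inputs, but is accurate
    "agent-2": [("output", 2.0)],             # needs fewer inputs, less accurate
}

print(cheapest_path(graph, "inputs", "output"))
```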
[064] The second type of optimization involves choosing input values for a node or a set of nodes that optimizes the value or value/s for another node or set of nodes. This is a standard black-box optimization problem, with the black box being replaced by the sub-graph and thus permitting structural inferences too.
[065] The implementation of the ontology described hereinbefore uses a property graph database, which is one of the most common types of graph databases. This database can be implemented in a single computer with sufficient hard disk space, and for scalability it can be stored across multiple machines. The storing and data retrieval mechanisms are specific to the graph database implementation and are generally provided by the database provider. The recursive-traversing interpreter can be implemented in any modern computer system with single or multiple processors. Theoretically, it only requires access to the functions specified as part of the processor architecture. However, for ease of use, the present work uses high-level functions provided by most standard programming languages, which then interface with the processor's hardware-software interface. For the purpose of illustration of the system presented, an Apple Macbook Air laptop computer with an Intel 1.86 GHz Core 2 Duo Processor, running Mac OS X 10.7.5 (Lion), was used. The computer had a memory of 2 GB 1067 MHz DDR3 and a flash-based hard disk of 128 GB capacity. For the graph database, a free open-source graph database was used. The recursive-traversing interpreter was implemented in Python 2.7.5.
[066] Besides single-computer operation, the system can be used over a network of computers in two ways. The first approach, which is intended for big data, is to partition the graph over several computers in a network. The present ontology aids the partitioning logic, whereby a single agent or object is not partitioned across different machines whenever possible. Since the recursive-traversing interpreter focuses only on local data (neighbouring nodes and edges) for execution, a separate copy of these distributed elements in the graph database could be made in a new machine and, after running the interpreter, these (sub-)graphs could be updated in their original locations, such as a database cluster or a knowledge base server containing information in the form of a graph comprising a plurality of nodes and a plurality of edges. This implementation is depicted in Fig. 8.
[067] Another approach to scaling this ontological system is to allow multiple computers (clients) to connect to the computer (or cluster or network) with the graph database (knowledge base server), but each client computer can have its own recursive-traversing Interpreter. The advantage here is that, in this case, each client can have its own implementation of the atomic functions and thus can exhibit different behaviour while accessing the same graph. This architecture is shown in Fig. 9.
[068] Following are the examples carried out on the system and graph database mentioned hereinbefore.
[069] EXAMPLE 1: Euler GCD Computation with Object as Output
[070] Referring to Fig. 10, in this example, the agent 1002 is used to compute the GCD of the two input nodes to the combiner node 1004. There are 23 nodes in this example numbered from 0 to 22. There are 42 relationships numbered from 0 to 41. The lists of nodes and edges in this example are given in Tables 1 and 2 below. The main combiner in this application is the node numbered 13. The Recursive Traversing Interpreter (RTI) starts from this node. In order to understand the computation, we track the internal stack of nodes to be visited by the RTI and also the argValues and value property of each node that the RTI visits. Initially the values of all combiners are set to " " (blank) and the argValues array is set to "NA" for each element, with the number of elements in the array determined by the number of incoming data edges to the combiner with unique order property. All the data and function nodes are set to initial values as given in Table 3.
Table 1: List of Nodes
Node_Id Name Type
0 "if-branch" "branch"
1 "if function" "function"
2 "equals" "data"
3 "zero" "data"
4 "mod-combiner" "combiner"
5 "input-m" "data"
6 "input-n" "data"
7 "mod-function" "function"
8 "set function" "function"
9 "set-combiner-1" "combiner"
10 "set-combiner-2" "combiner"
11 "set-combiner-3" "combiner"
12 "output" "data"
13 "new-combiner-main" "combiner"
14 "output object" "object"
15 "input-n" "data"
16 "input-Object" "object"
17 "second object" "object"
18 "EulerAgent" "agent"
19 "input-m" "data"
20 "equals" "data"
21 "small output object" "object"
22 "output" "data"
Table 2: List of Edges
Edge_Id Type Start_Node_Id End_Node_Id
0 "DATAFLOW" 1 0
1 "DATAFLOW" 2 0
2 "DATAFLOW" 0
3 "DATAFLOW" 4 0
4 "DATAFLOW" 7 4
5 "DATAFLOW" 5 4
6 "DATAFLOW" 6 4
7 "DATAFLOW" 10 5
8 "DATAFLOW" 11 6
9 "DATAFLOW" 9 12
10 "DATAFLOW" 8 9
11 "DATAFLOW" 6 9
12 "DATAFLOW" 8 10
13 "DATAFLOW" 6 10
14 "DATAFLOW" 8 11
15 "DATAFLOW" 4 11
16 "CONTROLFLOW" 0 9
17 "CONTROLFLOW" 0 10
18 "CONTROLFLOW" 10 11
19 "CONTROLFLOW" 11 0
20 "DATAFLOW" 18 13
21 "DATAFLOW" 16 13
22 "DATAFLOW" 15 13
23 "DATAFLOW" 13 14
24 "PROPERTY" 17 16
25 "PROPERTY" 0 18
26 "PROPERTY" 1 18
27 "PROPERTY" 2 18
28 "PROPERTY" 3 18
29 "PROPERTY" 4 18
30 "PROPERTY" 5 18
31 "PROPERTY" 6 18
32 "PROPERTY" 7 18
33 "PROPERTY" 8 18
34 "PROPERTY" 9 18
35 "PROPERTY" 10 18
36 "PROPERTY" 11 18
37 "PROPERTY" 12 18
38 "PROPERTY" 19 17
39 "PROPERTY" 20 14
40 "PROPERTY" 21 14
41 "PROPERTY" 22 21
[071] In this example, the RTI is asked to start by exploring node 13. Its internal stack (called the fringe) has only node 13 at this stage. It pops the fringe and traverses to node 13. It checks the type of the node and finds that it is a combiner. It hence checks whether all its arguments are present (i.e. there are no "NA" entries in its argValues array). Initially, it finds that all the required arguments for node 13 are missing and hence it adds the required nodes to the fringe. It then proceeds to pop the fringe again and continue the same algorithm. The various steps involved are given in the table below. Whenever a node is evaluated, its value is written to the argValues property of all its outgoing DATAFLOW neighbours, with the index in the argValues array being determined by the 'Order' property of the DATAFLOW edge.
Table 3
Step No. | Fringe at Start of Step | Current Node | Current Node Type | Current Node ArgValues | Current Node Status | Current Node Value | Fringe at End of Step
1 | 13 | 13 | Combiner | [NA, NA, NA] | Not Evaluated | | 13, 15, 16, 18
2 | 13, 15, 16, 18 | 18 | Agent | | Not Evaluated | | 13, 15, 16
3 | 13, 15, 16 | 16 | Object | | Evaluated | Object Id | 13, 15
4 | 13, 15 | 15 | Data | 24 | Evaluated | 24 | 13
5 | 13 | 13 | Combiner | [Agent-Id, Object-Id, 24] | Not Evaluated | | 13, 18
6 | 13, 18 | 18 | Agent | | Evaluated, Initial Node and Input Nodes Data Written (5 and 6 from Object Node 16 and Data Node 15) | Agent Id | 13, 0
7 | 13, 0 | 0 | Combiner | [NA, NA, NA, NA] | Not Evaluated | | 13, 0, 4, 3, 2, 1
8 | 13, 0, 4, 3, 2, 1 | 1 | Function | [jkIf] | Evaluated | jkIf | 13, 0, 4, 3, 2
9 | 13, 0, 4, 3, 2 | 2 | Data | [==] | Evaluated | == | 13, 0, 4, 3
10 | 13, 0, 4, 3 | 3 | Data | [0] | Evaluated | 0 | 13, 0, 4
11 | 13, 0, 4 | 4 | Combiner | [NA, NA, NA] | Not Evaluated | | 13, 0, 4, 6, 5, 7
12 | 13, 0, 4, 6, 5, 7 | 7 | Function | [jkMod] | Evaluated | jkMod | 13, 0, 4, 6, 5
13 | 13, 0, 4, 6, 5 | 5 | Data | [42] | Evaluated | 42 | 13, 0, 4, 6
14 | 13, 0, 4, 6 | 6 | Data | [24] | Evaluated | 24 | 13, 0, 4
15 | 13, 0, 4 | 4 | Combiner | [jkMod, 42, 24] | Evaluated | 18 | 13, 0
16 | 13, 0 | 0 | Combiner | [if == 0 18] | Evaluated, Controlflow edge to 10 | False | 13, 10
17 | 13, 10 | 10 | Combiner | [NA, 24] | Not Evaluated | | 13, 10, 8
18 | 13, 10, 8 | 8 | Function | [jkSet] | Evaluated | jkSet | 13, 10
19 | 13, 10 | 10 | Combiner | [jkSet, 24] | Evaluated, Controlflow edge to 11 | 24 | 13, 11
20 | 13, 11 | 11 | Combiner | [jkSet, 18] | Evaluated, Controlflow edge to 0 | 18 | 13, 0
21 | 13, 0 | 0 | Combiner | [NA, NA, NA, NA] | Not Evaluated | | 13, 0, 4, 3, 2, 1
22 | 13, 0, 4, 3, 2, 1 | 1 | Function | [jkIf] | Evaluated | jkIf | 13, 0, 4, 3, 2
23 | 13, 0, 4, 3, 2 | 2 | Data | [==] | Evaluated | == | 13, 0, 4, 3
24 | 13, 0, 4, 3 | 3 | Data | [0] | Evaluated | 0 | 13, 0, 4
25 | 13, 0, 4 | 4 | Combiner | [NA, NA, NA] | Not Evaluated | | 13, 0, 4, 6, 5, 7
26 | 13, 0, 4, 6, 5, 7 | 7 | Function | [jkMod] | Evaluated | jkMod | 13, 0, 4, 6, 5
27 | 13, 0, 4, 6, 5 | 5 | Data | [24] | Evaluated | 24 | 13, 0, 4, 6
28 | 13, 0, 4, 6 | 6 | Data | [18] | Evaluated | 18 | 13, 0, 4
29 | 13, 0, 4 | 4 | Combiner | [jkMod, 24, 18] | Evaluated | 6 | 13, 0
30 | 13, 0 | 0 | Combiner | [if == 0 6] | Evaluated, Controlflow edge to 10 | False | 13, 10
31 | 13, 10 | 10 | Combiner | [NA, 18] | Not Evaluated | | 13, 10, 8
32 | 13, 10, 8 | 8 | Function | [jkSet] | Evaluated | jkSet | 13, 10
33 | 13, 10 | 10 | Combiner | [jkSet, 18] | Evaluated, Controlflow edge to 11 | 18 | 13, 11
34 | 13, 11 | 11 | Combiner | [jkSet, 6] | Evaluated, Controlflow edge to 0 | 6 | 13, 0
35 | 13, 0 | 0 | Combiner | [NA, NA, NA, NA] | Not Evaluated | | 13, 0, 4, 3, 2, 1
36 | 13, 0, 4, 3, 2, 1 | 1 | Function | [jkIf] | Evaluated | jkIf | 13, 0, 4, 3, 2
37 | 13, 0, 4, 3, 2 | 2 | Data | [==] | Evaluated | == | 13, 0, 4, 3
38 | 13, 0, 4, 3 | 3 | Data | [0] | Evaluated | 0 | 13, 0, 4
39 | 13, 0, 4 | 4 | Combiner | [NA, NA, NA] | Not Evaluated | | 13, 0, 4, 6, 5, 7
40 | 13, 0, 4, 6, 5, 7 | 7 | Function | [jkMod] | Evaluated | jkMod | 13, 0, 4, 6, 5
41 | 13, 0, 4, 6, 5 | 5 | Data | [18] | Evaluated | 18 | 13, 0, 4, 6
42 | 13, 0, 4, 6 | 6 | Data | [6] | Evaluated | 6 | 13, 0, 4
43 | 13, 0, 4 | 4 | Combiner | [jkMod, 18, 6] | Evaluated | 0 | 13, 0
44 | 13, 0 | 0 | Combiner | [jkIf == 0 0] | Evaluated, Controlflow edge to 9 | 0 | 13, 9
45 | 13, 9 | 9 | Combiner | [jkSet, 6] | Evaluated | 6 | 13
46 | 13 | 13 | Combiner | [Agent, Object, 24] | Evaluated, output (marked as output in Agent Node, here Nodes 2 and 12) written to Nodes given by outgoing Dataflow edges | ==, 6 |
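The values in Table 3 can be cross-checked with a plain Python loop that mirrors the roles of the nodes in the agent's sub-graph. This is only a check on the arithmetic, not the graph-based evaluation itself; the function name `euclid_trace` is hypothetical.

```python
def euclid_trace(m, n):
    """Return (gcd, rows), where each row is (input-m, input-n, m mod n)."""
    rows = []
    while True:
        remainder = m % n            # node 4, mod-combiner
        rows.append((m, n, remainder))
        if remainder == 0:           # node 0, if-branch: remainder equals zero
            return n, rows           # node 9 writes input-n to the output node
        m, n = n, remainder          # nodes 10 and 11 update input-m and input-n

gcd, rows = euclid_trace(42, 24)
print(gcd)   # 6
print(rows)  # [(42, 24, 18), (24, 18, 6), (18, 6, 0)]
```

The three rows correspond to the three passes through the if-branch (node 0) in the trace.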
[072] EXAMPLE 2: Food Network
[073] Food networks, which have been described in "Flavor Network and the principles of food pairing", Yong-Yeol Ahn, Sebastian E. Ahnert, James P. Bagrow and Albert-Laszlo Barabasi, Nature, Scientific Reports, published 15 December 2011, and which have traditionally been modelled in a declarative (RDF / first order logic type) form, are modelled in the proposed system. One limitation mentioned in the said reference is addressed using the representation developed in the present invention: the representation used in said reference is not able to address the effect of the mode of preparation on the flavour of a dish. Fig. 11 shows two steps in a food recipe given in "http://www.epicurious.com/articlesguides/bestof/toprecipes/bestpastarecipes/recipes/food/views/Lemon-Gnocchi-with-Spinach-and-Peas-240959". In this figure, the representation in the format prescribed in the present work allows one to capture the effect of the mode of preparation as well as other parameters involved in the preparation. If the function atoms are chosen based on what a robotic system can do, then the specification also becomes an executable recipe for that system. Further, this representation also illustrates the use of additional properties on edges, such as 'quantity', as described in the figure.
[074] In the above figure, nodes 200 represent functions, nodes 100 represent data (food ingredients), nodes 500 represent objects and nodes 300 represent combiners (steps in the recipe). The edges 304 represent data flow (flow of ingredients), edges 302 represent flow of control (sequence of steps) and edges 502 represent property edges which denote membership of an object and also have a property on them which defines the quantity of each ingredient to be used in the particular step. Using this representation, several additional properties of recipes can be learned. For example, queries such as:
• Which ingredient occurs in maximum number of recipes and is also the single largest component by weight (or cost or volume or importance) of those recipes?
• Which ingredients are used in multiple steps in the same recipe?
• Which ingredients are generally used only in the last few steps of recipes?
• What are the relationships between functions and ingredients? Do some functions always need a particular ingredient (for a trivial example, frying and oil generally occur together) and do some ingredients only pair with specific functions?
• Is there a relationship between nature of ingredients and number of steps in a recipe?
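One of the queries listed above, finding ingredients used in multiple steps of the same recipe, can be sketched over a list of dataflow edges from ingredient nodes to step (combiner) nodes. The ingredient and step names below are invented for illustration and are not taken from Fig. 11.

```python
from collections import defaultdict

def ingredients_in_multiple_steps(dataflow_edges):
    """dataflow_edges: (ingredient, step) pairs within a single recipe."""
    steps_by_ingredient = defaultdict(set)
    for ingredient, step in dataflow_edges:
        steps_by_ingredient[ingredient].add(step)
    return sorted(name for name, steps in steps_by_ingredient.items()
                  if len(steps) > 1)

recipe_edges = [
    ("butter", "saute"), ("spinach", "saute"),
    ("butter", "toss"), ("gnocchi", "toss"),
]

print(ingredients_in_multiple_steps(recipe_edges))  # ['butter']
```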
[075] EXAMPLE 3: Legal Document (Act/Regulations/Agreement etc.)
[076] The statements (clauses) in any legal document can be classified into three types - definitions, facts and action statements. All these three can be represented in the ontological model described in the present invention. For example, we can consider three typical clauses in a hypothetical legal agreement:
a. 'Group' means Company A and Company B
b. The PAT for FY 2013 is INR 1 million
c. Equity Share Conversion is to be done at a value of 5 times the EBITDA for FY 2014
[077] Referring to Figs. 12, 13 & 14, the graphs illustrate the representation of the above three clauses. The advantage of this representation is that one can query the parameters appearing in various clauses and also compute the effect of executing the action statement clauses. Using inversion and optimization techniques on action statement clauses, various scenarios can be simulated and the optimum choice of parameters that comply with the specified regulations can be identified. Note that action statements can also be functions with only side effects, such as 'Filing of a document'. In this case, the output of the combiner would be the status of completion of the function (i.e. success or failure along with time of execution, acknowledgement etc.).
[078] EXAMPLE 4: Financial Modelling
[079] One of the advantages of using the present work in Financial Modelling is the ability to attach multiple data sources for the same parameter and then choose the required one during query execution. Also, the method of processing various financial parameters could be chosen during query execution based on the accuracy and other requirements of the user. Two examples representing these two types are shown in Fig. 15 & Fig. 16.
[080] As shown in Fig. 15, gross profit is computed as the difference of revenues and Cost of Goods Sold (COGS). Here, however, there are two different sources from which to compute revenue. The multiple revenue options are indicated by the fact that both the data edges have the same 'Order' property.
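A minimal sketch of both selection patterns in this example: choosing a revenue source for the gross profit of Fig. 15, and choosing between the two cost-of-equity methods that the description of Fig. 16 sets out (CAPM and a hypothetical thumb rule). All numbers, source names and function names are invented for illustration.

```python
def gross_profit(revenue_sources, cogs, source):
    """Revenue from the chosen source minus Cost of Goods Sold."""
    return revenue_sources[source] - cogs

def capm(r_f, r_m, beta):
    return r_f + beta * (r_m - r_f)

def thumb_rule(r_f, r_i):
    return r_f + r_i + 0.05           # risk free rate + inflation rate + 5 %

# Each agent lists the inputs it needs from the shared 'parameters' object.
AGENTS = {
    "CAPM": (capm, ("r_f", "r_m", "beta")),
    "ThumbRule": (thumb_rule, ("r_f", "r_i")),
}

def cost_of_equity(parameters, agent):
    func, needed = AGENTS[agent]
    return func(*(parameters[name] for name in needed))

revenue_sources = {"ledger": 1200.0, "invoices": 1180.0}
parameters = {"r_f": 0.04, "r_m": 0.10, "beta": 1.2, "r_i": 0.03}

print(gross_profit(revenue_sources, cogs=700.0, source="ledger"))  # 500.0
print(round(cost_of_equity(parameters, "CAPM"), 4))
print(round(cost_of_equity(parameters, "ThumbRule"), 4))
```

The `source` and `agent` arguments stand in for the choice made during query execution among edges that share the same 'Order'.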
[081] Cost of equity can be computed using two different methods: the Capital Asset Pricing Model (risk free rate + beta * (market return - risk free rate)) or a (hypothetical) thumb rule such as (risk free rate + inflation rate + 5 %). Fig. 16 illustrates a graph that permits both these options with two agents (the nodes belonging to the agents, connected via Property edges, are not shown to reduce clutter). The CAPM agent requires the inputs risk free rate (r_f), market return (r_m) and beta. The ThumbRule agent requires r_f and inflation rate (r_i). Note that the 'parameters' object has the inputs required for both these agents. The multiple choices of agents are indicated by both the data edges having the same 'Order' property.
[082] EXAMPLE 5: Probabilistic Graphical Models
[083] The two main types of probabilistic graphical models are Bayesian Networks and Markov Networks, represented by directed graphs and undirected graphs respectively. The theory of Probabilistic Graphical Models demonstrates the construction and equivalence of a graph from a joint probability distribution such that the independence relations specified in the graph are valid in the specification of the joint probability distribution. This graph-based factorization permits one to exploit such independencies and consequently reduce the number of parameters required to state the joint probability distribution. The difference between Bayesian Networks and Markov Networks is that in the former, each factor (that governs dependence of one node on the values of the others) represents a conditional probability distribution (CPD) and thus can be interpreted as possibly representing a causal link. In the latter type, the factors are just functions that assign values to every value combination of the variables in their domain or scope. The concept of active trails in such graphs helps determine independencies and/or conditional independencies between any sets of variables. In addition, local structure of the conditional probability distribution or factor specification permits further reduction in the number of required parameters. Referring to Fig. 17 and Fig. 18, a simple Bayesian Network and a simple Markov Network, respectively, are represented in the ontology specified in the present invention. In this representation, the concepts of active trails for both the types need to be modified. However, this representation enables representation of multiple CPDs for the same set of variables, as can be seen in Fig. 17 for Bayesian Networks. Here, agents 1 and 2 represent two different CPDs that relate the same three variables A, B and C. Thus, the same graph can be used to capture multiple relationships, which is not possible in the traditional approach. Also, local structure can be easily captured in the Agent definition. In Fig. 18, which depicts a Markov Network, the explicit relationship between factors and variables is captured in the present representation. Again, as in the Bayesian Network case, multiple choices for the factor function between the same set of variables can be represented here. This explicit relationship between factors and variables is not captured in the traditional approach.
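The idea of attaching more than one CPD to the same variables can be sketched as follows. This is only an illustration under stated assumptions: the structure (A and B as independent parents of C), the probability values, and the names `cpd_agent1`, `cpd_agent2`, `CPD_AGENTS`, `joint` and `marginal_c` are hypothetical and are not taken from Fig. 17.

```python
from itertools import product
from typing import Callable, Dict

# Assumed priors for the two parent variables (illustrative values).
P_A: Dict[bool, float] = {True: 0.3, False: 0.7}
P_B: Dict[bool, float] = {True: 0.6, False: 0.4}


def cpd_agent1(c: bool, a: bool, b: bool) -> float:
    """First candidate P(C | A, B): C is likely when A or B holds."""
    p_true = 0.9 if (a or b) else 0.1
    return p_true if c else 1.0 - p_true


def cpd_agent2(c: bool, a: bool, b: bool) -> float:
    """Second candidate P(C | A, B): C is likely only when A and B both hold."""
    p_true = 0.8 if (a and b) else 0.05
    return p_true if c else 1.0 - p_true


# Two agents relate the same three variables; the graph itself is unchanged.
CPD_AGENTS: Dict[str, Callable[[bool, bool, bool], float]] = {
    "agent1": cpd_agent1,
    "agent2": cpd_agent2,
}


def joint(a: bool, b: bool, c: bool, agent: str) -> float:
    """Factorized joint P(A) * P(B) * P(C | A, B) using the chosen agent's CPD."""
    return P_A[a] * P_B[b] * CPD_AGENTS[agent](c, a, b)


def marginal_c(agent: str) -> float:
    """P(C = True), obtained by summing the joint over A and B."""
    return sum(joint(a, b, True, agent) for a, b in product([True, False], repeat=2))
```

Under these assumed numbers, `marginal_c("agent1")` is approximately 0.676 and `marginal_c("agent2")` approximately 0.185: the same three variables yield different distributions depending on which agent is selected.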
[084] Since other modifications and changes to fit particular requirements and environments will be apparent to those skilled in the art, the invention is not considered limited as described by the present preferred embodiments which have been chosen for purposes of disclosure, and covers all changes and modifications which do not constitute departure from the spirit and scope of this invention.

Claims

1. A graph based ontology modeling system comprising a knowledge base server containing information in the form of graph comprising a plurality of nodes and a plurality of edges; a client system having a recursive-traversing interpreter (RTI) enabling queries on said graph using a combination of eager and lazy evaluation method and updating values of various nodes in said graph; wherein said graph comprises:
dataflow edges to capture the flow of data required in the ontology;
controlflow edges to specify the next node to be traversed by said RTI once the current node has been evaluated;
property edges to form any relation other than those specified by dataflow edges or controlflow edges and expressing all other relationships in the ontology;
data nodes having data and defined by name, value, argument, type and description and function nodes having function or a reference to a function and defined by name, value, argument, type and description wherein said data nodes and function nodes may serve as terminal nodes for said RTI;
combiner nodes which enable said RTI to evaluate a function by providing it with function and its arguments wherein RTI computes value of combiner nodes based on its input dataflow edges and writes the computed value to nodes indicated by its output dataflow edges and proceeds to traverse the next node indicated by controlflow edge leading out of combiner node;
branch nodes which enable said RTI to evaluate a function by providing it with function and its arguments wherein RTI computes value of branch node based on its input dataflow edges and writes the computed value to nodes indicated by its output dataflow edges and proceeds to traverse the next node indicated by the computed value and one of the multiple outgoing controlflow edges leading out of branch node;
object nodes defined by name, type and description having property edges wherein said object nodes identify and group together other nodes forming part of said object nodes;
agent nodes defined by name, type, description and status having property edges, at least one combiner node, one input and one output node of any type wherein said property edges joining respective node to agent node have an added property called sub-type which helps the RTI in its traversal; and
model nodes having output node with pre-populated value wherein RTI, when it evaluates a model node, fetches the value from the output node indicated by the outgoing dataflow edge from the model node;
wherein based on query of a user specifying the starting node for traversal, said recursive-traversing interpreter performs the following steps:
if selected node is a data node or a function node, RTI updates all other nodes which are linked from this node via outgoing dataflow edges with the value of selected data node;
if selected node is a combiner node or a branch node, RTI checks if any arguments are required to compute the value of said node and, if required, RTI traverses those nodes, else proceeds to compute the value of the present node and writes the output to all the nodes linked from this node via outgoing dataflow edges;
if selected node is an object node, RTI searches the object by traversing through property edges associated thereto to find the data node whose value is sought by the combiner node or any other node that led to the object being traversed;
if selected node is an agent node, RTI identifies the input nodes and passes on the requirement to the combiner and then fetches this data from other input nodes to the combiner;
if selected node is a model, RTI fetches the value/s specified by the output node/s and provides this value/s to the node that led to the traversal of the model node; wherein
once the value of any node is evaluated, RTI then traverses the next node indicated by the outgoing controlflow edge from the current node, however, only in the case of a branch node, RTI chooses the appropriate node to be traversed next based on the value of the branch node and the order of the outgoing controlflow edges from the branch node.
2. A graph based ontology modeling system as claimed in claim 1, comprising agent nodes linked to multiple combiner nodes and multiple input and output nodes of any type.
3. A graph based ontology modeling system as claimed in claim 2, wherein said agent nodes are linked to other nodes through dataflow and/or controlflow edges to form a sub-graph representing computation.
4. A graph based ontology modeling system as claimed in claim 3, wherein said sub-graph is compiled into a function or a sub-routine wherein RTI invokes said function or sub-routine to evaluate said computation.
5. A graph based ontology modeling system as claimed in claim 4, wherein compilation of sub-graph into a function or a sub-routine is either completed when such sub-graph is first encountered during a query or can be a precompiled link to the function or sub-routine, which is stored as an additional property of the initial node of the sub-graph.
6. A graph based ontology modeling system as claimed in claim 1 or any of the aforesaid claims, wherein RTI enables a recursive search within object nodes and agent nodes to find required input nodes and/or output nodes.
7. A graph based ontology modeling system as claimed in claim 1 or any of the aforesaid claims, wherein RTI is associated with an internal data structure in the form of a stack to keep track of nodes to be visited.
8. A computer program product including instructions recorded on a non-transitory computer readable storage media, which, when executed by at least one processor, cause the at least one processor to perform a method as claimed in claim 1.
9. A graph based ontology modeling system as claimed in claim 1, wherein said knowledge base server and said client system are located on one computer.
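The traversal rules recited in claim 1 for data, combiner and branch nodes, together with the stack of claim 7, can be illustrated by a minimal sketch. It is not the claimed interpreter: the `Node` class, its field names, the `evaluate` and `run` functions and the small example graph are hypothetical, object, agent and model nodes are omitted, and the branch rule (the computed value indexes the ordered outgoing controlflow edges) is an assumption.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List, Optional


@dataclass
class Node:
    name: str
    kind: str                                   # "data", "combiner" or "branch"
    value: Any = None
    func: Optional[Callable[..., Any]] = None   # used by combiner and branch nodes
    args: List["Node"] = field(default_factory=list)        # incoming dataflow edges
    outputs: List["Node"] = field(default_factory=list)     # outgoing dataflow edges
    next_nodes: List["Node"] = field(default_factory=list)  # ordered controlflow edges


def evaluate(node: Node) -> Any:
    """Compute a node's value, lazily traversing arguments that have no value yet."""
    if node.kind in ("combiner", "branch"):
        arg_values = [a.value if a.value is not None else evaluate(a) for a in node.args]
        node.value = node.func(*arg_values)
    # Write the value to every node reached by an outgoing dataflow edge.
    for target in node.outputs:
        target.value = node.value
    return node.value


def run(start: Node) -> List[str]:
    """Traverse from `start` along controlflow edges; returns the visit order."""
    visited: List[str] = []
    stack = [start]                  # nodes still to be visited
    while stack:
        node = stack.pop()
        visited.append(node.name)
        value = evaluate(node)
        if not node.next_nodes:
            continue                 # terminal node: nothing further to traverse
        if node.kind == "branch":
            # Assumption: the branch value selects one ordered controlflow edge.
            stack.append(node.next_nodes[int(value)])
        else:
            stack.append(node.next_nodes[0])
    return visited


# Example graph: total = x + y, then branch on whether the result exceeds 10.
x = Node("x", "data", value=4)
y = Node("y", "data", value=10)
result = Node("result", "data")
small = Node("small", "data", value="small")
large = Node("large", "data", value="large")
is_large = Node("is_large", "branch", func=lambda t: t > 10, args=[result],
                next_nodes=[small, large])
total = Node("total", "combiner", func=lambda a, b: a + b, args=[x, y],
             outputs=[result], next_nodes=[is_large])

order = run(total)   # ["total", "is_large", "large"]
```

In this sketch arguments are evaluated only when a combiner or branch node needs them and they have no value yet, which mirrors the lazy part of the evaluation; writing a computed value to the dataflow targets immediately mirrors the eager part.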
PCT/IN2014/000508 2013-08-08 2014-07-31 Graph based ontology modeling system WO2015019364A2 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
IN2617MU2013 IN2013MU02617A (en) 2013-08-08 2013-08-08
IN2617/MUM/2013 2013-08-08

Publications (2)

Publication Number Publication Date
WO2015019364A2 (en) 2015-02-12
WO2015019364A3 (en) 2015-11-26

Family

ID=54199345

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IN2014/000508 WO2015019364A2 (en) 2013-08-08 2014-07-31 Graph based ontology modeling system

Country Status (2)

Country Link
IN (1) IN2013MU02617A (en)
WO (1) WO2015019364A2 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105912656B (en) * 2016-04-07 2020-03-17 桂林电子科技大学 Method for constructing commodity knowledge graph

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AU2003210803A1 (en) * 2002-02-01 2003-09-02 John Fairweather A system and method for real time interface translation
US20090254510A1 (en) * 2006-07-27 2009-10-08 Nosa Omoigui Information nervous system
US7834875B2 (en) * 2007-04-02 2010-11-16 International Business Machines Corporation Method and system for automatically assembling stream processing graphs in stream processing systems
US8453127B2 (en) * 2010-09-20 2013-05-28 Sap Ag Systems and methods providing a token synchronization gateway for a graph-based business process model

Cited By (29)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102265092B1 (en) 2017-03-16 2021-06-14 레이던 컴퍼니 Robustness quantification by attribute graph data model analysis
US10430462B2 (en) 2017-03-16 2019-10-01 Raytheon Company Systems and methods for generating a property graph data model representing a system architecture
WO2018170101A1 (en) * 2017-03-16 2018-09-20 Raytheon Company Weighted property graph data model representing system architecture
US10430463B2 (en) 2017-03-16 2019-10-01 Raytheon Company Systems and methods for generating a weighted property graph data model representing a system architecture
KR102274803B1 (en) 2017-03-16 2021-07-07 레이던 컴퍼니 Quantifying the consistency of system architecture
KR20190121371A (en) * 2017-03-16 2019-10-25 레이던 컴퍼니 Quantify Consistency of System Architecture
KR20190121372A (en) * 2017-03-16 2019-10-25 레이던 컴퍼니 Attribute graph data model representing system architecture
KR20190121373A (en) * 2017-03-16 2019-10-25 레이던 컴퍼니 Weighted attribute graph data model representing system architecture
KR20190121844A (en) * 2017-03-16 2019-10-28 레이던 컴퍼니 Robust Quantification by Analyzing Attribute Graph Data Model
US10459929B2 (en) 2017-03-16 2019-10-29 Raytheon Company Quantifying robustness of a system architecture by analyzing a property graph data model representing the system architecture
US10496704B2 (en) 2017-03-16 2019-12-03 Raytheon Company Quantifying consistency of a system architecture by comparing analyses of property graph data models representing different versions of the system architecture
KR102285862B1 (en) 2017-03-16 2021-08-03 레이던 컴퍼니 Property graph data model representing system architecture
KR102185553B1 (en) 2017-03-16 2020-12-02 레이던 컴퍼니 A weighted attribute graph data model representing the system architecture
AU2018235926B2 (en) * 2017-03-16 2022-02-17 Raytheon Company Property graph data model representing system architecture
AU2018235930B2 (en) * 2017-03-16 2022-02-03 Raytheon Company Weighted property graph data model representing system architecture
US11734632B2 (en) 2017-09-22 2023-08-22 Integer, Llc Systems and methods for investigating and evaluating financial crime and sanctions-related risks
US10373091B2 (en) 2017-09-22 2019-08-06 1Nteger, Llc Systems and methods for investigating and evaluating financial crime and sanctions-related risks
US10997541B2 (en) 2017-09-22 2021-05-04 1Nteger, Llc Systems and methods for investigating and evaluating financial crime and sanctions-related risks
EP3460682A1 (en) * 2017-09-22 2019-03-27 1nteger, LLC Systems and methods
US11734633B2 (en) 2017-09-22 2023-08-22 Integer, Llc Systems and methods for investigating and evaluating financial crime and sanctions-related risks
US11948116B2 (en) 2017-09-22 2024-04-02 1Nteger, Llc Systems and methods for risk data navigation
US11714992B1 (en) * 2018-12-13 2023-08-01 Amazon Technologies, Inc. Neural network processing based on subgraph recognition
US20200226181A1 (en) * 2019-01-10 2020-07-16 International Business Machines Corporation Semantic queries based on semantic representation of programs and data source ontologies
US11520830B2 (en) * 2019-01-10 2022-12-06 International Business Machines Corporation Semantic queries based on semantic representation of programs and data source ontologies
US11874875B2 (en) * 2019-06-24 2024-01-16 Thatdot, Inc. Graph processing system
US20220269730A1 (en) * 2019-06-24 2022-08-25 Thatdot, Inc. Graph processing system
CN114207620A (en) * 2019-07-29 2022-03-18 国立研究开发法人理化学研究所 Data interpretation device, method and program, data integration device, method and program, and digital city construction system
CN114207620B (en) * 2019-07-29 2023-08-15 国立研究开发法人理化学研究所 Data interpretation device, method, storage medium, data integration device, method, storage medium, and digital city construction system
US11782706B1 (en) 2021-06-29 2023-10-10 Amazon Technologies, Inc. Reconfigurable neural network processing based on subgraph recognition

Also Published As

Publication number Publication date
IN2013MU02617A (en) 2015-06-12
WO2015019364A3 (en) 2015-11-26

Similar Documents

Publication Publication Date Title
WO2015019364A2 (en) Graph based ontology modeling system
Delaware et al. Fiat: Deductive synthesis of abstract data types in a proof assistant
Hegedüs et al. A model-driven framework for guided design space exploration
Hearnden et al. Incremental model transformation for the evolution of model-driven systems
Mazairac et al. BIMQL–An open query language for building information models
Barbierato et al. Performance evaluation of NoSQL big-data applications using multi-formalism models
Eiter et al. A model building framework for answer set programming with external computations
Pellier et al. PDDL4J: a planning domain description library for java
Rivera et al. Formal specification and analysis of domain specific models using Maude
Martínez et al. Reactive model transformation with ATL
Venetis et al. Query extensions and incremental query rewriting for OWL 2 QL ontologies
Wuillemin et al. Structured probabilistic inference
Li et al. Performance benchmark on semantic web repositories for spatially explicit knowledge graph applications
Bobek et al. HEARTDROID—Rule engine for mobile and context‐aware expert systems
Neele et al. Solving parameterised boolean equation systems with infinite data through quotienting
Hinkel Implicit incremental model analyses and transformations
Castelltort et al. Handling scalable approximate queries over NoSQL graph databases: Cypherf and the Fuzzy4S framework
Sequeda Integrating relational databases with the semantic web: A reflection
Vlasenko et al. A Saturation-based Algebraic Reasoner for ELQ.
Liu et al. Modeling and validating temporal rules with semantic Petri net for digital twins
Hogan et al. In-database graph analytics with recursive SPARQL
Claus Jensen et al. Symbolic model checking of weighted PCTL using dependency graphs
Selvaraj Improving Program Analysis using Efficient Semantic and Deductive Techniques
Altmeyer et al. On design formalization and retrieval of reuse candidates
Yang et al. Lifted model checking for relational MDPs

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 14834077

Country of ref document: EP

Kind code of ref document: A2

NENP Non-entry into the national phase in:

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 14834077

Country of ref document: EP

Kind code of ref document: A2