WO2020039342A2

WO2020039342A2 - System and method for learning knowledge graph schema via dialog

Info

Publication number: WO2020039342A2
Application number: PCT/IB2019/057003
Authority: WO
Inventors: Indrajit Bhattacharya; Gautam Shroff; Moumita Saha
Original assignee: Tata Consultancy Services Limited
Priority date: 2018-08-20
Filing date: 2019-08-20
Publication date: 2020-02-27
Also published as: WO2020039342A3

Abstract

A processor implemented method for learning knowledge graph schema via dialog is provided. The method includes receiving a set of statements pertaining to a natural language and an initial schema of at least one knowledge graph; determining a plurality of possible interpretations associated with at least one statement from the set of statements; identifying at least one of (i) mentions of types, and (ii) instances of initial schema entities, associated with at least one statement from the set of statements; identifying a set of possible statement graphs and corresponding probabilities associated with the at least one statement from the set of statements; generating a set of candidate statement graphs by sampling the set of possible statement graphs from a distribution; identifying a plurality of expert preferred statement graphs from the generated set of candidate statement graphs; and identifying a subsequent preferred question from a set of possible questions to query a domain expert.

Description

SYSTEM AND METHOD FOR LEARNING KNOWLEDGE GRAPH SCHEMA VIA DIALOG

CROSS-REFERENCE TO RELATED APPLICATIONS AND PRIORITY

[001] The present application claims priority from Indian provisional patent application no. 201821031052, filed on August 20, 2018. The entire contents of the aforementioned application are incorporated herein by reference.

TECHNICAL FIELD

[002] The disclosure herein generally relates to field of knowledge graphs, more particularly, to system and method for learning knowledge graph schema via dialog.

BACKGROUND

[003] Personal assistants that interact in natural language are becoming an increasingly popular and useful instrument for automation of enterprise operations. Such personal assistants or agents need large repositories of open domain and domain specific knowledge to be able to answer questions. Typically, such knowledge is represented and stored in structured knowledge bases or knowledge graphs. Populating such knowledge bases is never a one-time affair. No knowledge graph is even close to being complete in terms of the facts that it stores. Even stored facts need to be frequently updated to reflect the changing world knowledge. Automated knowledge extraction from natural language text, including user utterances, is still far from achieving a satisfactory mix of precision and recall.

[004] Current use of dialog with a human teacher or expert for augmenting knowledge graphs to accommodate new facts only considers open knowledge-graphs, which are free of predetermined types. Incompleteness in such knowledge graphs can extend to the schema as well. The schema is typically designed by a human expert ahead of time based on his best guess of what questions can be posed to the question-answering agent. This schema design as well is unlikely to be a one-shot affair, since the expert becomes better aware of user needs once the system is deployed with an initial schema.

[005] Existing work on schema augmentation for knowledge graphs uses structured data and unstructured corpora and suffers from precision issues. Further, such approaches cannot capture implicit knowledge that resides only in the mind of an expert.

[006] Current use of dialog with expert to enhance knowledge also does not account for the possibility that the expert may not provide an answer to every question. In reality, the more complex the question, the less likely is the user to provide an answer. Current dialog systems for knowledge acquisition do not consider this no-response probability. Current systems to explore new (terrain) graphs in the context of mobile agents do not involve interaction with a human expert. Current systems of noisy generalized binary search that ask as few questions as possible do not have different response probabilities for different queries. They also cannot search over the space of complex objects like graphs.

SUMMARY

[007] Embodiments of the present disclosure present technological improvements as solutions to one or more of the above-mentioned technical problems recognized by the inventors in conventional systems. For example, in one embodiment, a method for learning knowledge graph schema via dialog is provided. The processor implemented method includes receiving, via one or more hardware processors, a set of statements pertaining to a natural language and an initial schema of at least one knowledge graph; determining, via the one or more hardware processors, a plurality of possible interpretations associated with at least one statement from the set of statements; identifying, via the one or more hardware processors, at least one of (i) mentions of types, (ii) instances of initial schema entities, and combination thereof, associated with at least one statement from the set of statements; identifying, via the one or more hardware processors, a set of possible statement graphs and corresponding probabilities associated with the at least one statement from the set of statements; generating, via the one or more hardware processors, a set of candidate statement graphs by sampling the set of possible statement graphs from a distribution; identifying, via the one or more hardware processors, a plurality of expert preferred statement graphs from the generated set of candidate statement graphs; and identifying, via the one or more hardware processors, a subsequent preferred question with an expected utility from a set of possible questions to query a domain expert. In an embodiment, the plurality of possible interpretations corresponds to at least one statement graph.

[008] In an embodiment, the set of possible statement graphs and corresponding probabilities may be identified by augmenting a current schema based on defining the distribution over statement graphs for a new statement. In an embodiment, an initial knowledge graph schema may be a labeled multi-graph comprising multiple relations of different types between same pair of entities. In an embodiment, at least one statement graph for each of the statement may be at least one of (i) a tree, (ii) a cyclic graph, (iii) an acyclic graph, and combination thereof. In an embodiment, the at least one statement from the set of statements may include two mentions of entity types or associated instances. In an embodiment, the processor implemented method may further include sampling by defining a Markov Chain over space of statement graphs and using joint distribution as the stationary distribution. In an embodiment, neighbors of any statement graph may be defined using operations comprising at least one of (i) a vertex insertion, (ii) a vertex collapse and combination thereof. In an embodiment, the at least one sample from the Markov chain may be selected as the candidate statement graphs after a suitable burn-in period.

[009] In another aspect, there is provided a processor implemented system for learning knowledge graph schema via dialog. The system comprises a memory storing instructions; one or more communication interfaces; and one or more hardware processors coupled to the memory via the one or more communication interfaces, wherein the one or more hardware processors are configured by the instructions to: receive a set of statements pertaining to a natural language and an initial schema of at least one knowledge graph; determine a plurality of possible interpretations associated with at least one statement from the set of statements; identify at least one of (i) mentions of types, (ii) instances of initial schema entities, and combination thereof, associated with at least one statement from the set of statements; identify a set of possible statement graphs and corresponding probabilities associated with the at least one statement from the set of statements; generate a set of candidate statement graphs by sampling the set of possible statement graphs from a distribution; identify a plurality of expert preferred statement graphs from the generated set of candidate statement graphs; and identify a subsequent preferred question with an expected utility from a set of possible questions to query a domain expert. In an embodiment, the plurality of possible interpretations corresponds to at least one statement graph.

[010] In an embodiment, the set of possible statement graphs and corresponding probabilities may be identified by augmenting a current schema based on defining the distribution over statement graphs for a new statement. In an embodiment, an initial knowledge graph schema may be a labeled multi-graph comprising multiple relations of different types between same pair of entities. In an embodiment, at least one statement graph for each of the statement may be at least one of (i) a tree, (ii) a cyclic graph, (iii) an acyclic graph, and combination thereof. In an embodiment, the at least one statement from the set of statements may include two mentions of entity types or associated instances. In an embodiment, the one or more hardware processors are further configured by the instructions to sample by defining a Markov Chain over space of statement graphs and using joint distribution as the stationary distribution. In an embodiment, neighbors of any statement graph may be defined using operations comprising at least one of (i) a vertex insertion, (ii) a vertex collapse and combination thereof. In an embodiment, the at least one sample from the Markov chain may be selected as the candidate statement graphs after a suitable burn-in period.

[011] In yet another aspect, there are provided one or more non-transitory machine readable information storage mediums comprising one or more instructions which when executed by one or more hardware processors causes at least one of: receiving, via one or more hardware processors, a set of statements pertaining to a natural language and an initial schema of at least one knowledge graph; determining, via the one or more hardware processors, a plurality of possible interpretations associated with at least one statement from the set of statements; identifying, via the one or more hardware processors, at least one of (i) mentions of types, (ii) instances of initial schema entities, and combination thereof, associated with at least one statement from the set of statements; identifying, via the one or more hardware processors, a set of possible statement graphs and corresponding probabilities associated with the at least one statement from the set of statements; generating, via the one or more hardware processors, a set of candidate statement graphs by sampling the set of possible statement graphs from a distribution; identifying, via the one or more hardware processors, a plurality of expert preferred statement graphs from the generated set of candidate statement graphs; and identifying, via the one or more hardware processors, a subsequent preferred question with an expected utility from a set of possible questions to query a domain expert. In an embodiment, the plurality of possible interpretations corresponds to at least one statement graph.

[012] In an embodiment, the set of possible statement graphs and corresponding probabilities may be identified by augmenting a current schema based on defining the distribution over statement graphs for a new statement. In an embodiment, an initial knowledge graph schema may be a labeled multi-graph comprising multiple relations of different types between same pair of entities. In an embodiment, at least one statement graph for each of the statement may be at least one of (i) a tree, (ii) a cyclic graph, (iii) an acyclic graph, and combination thereof. In an embodiment, the at least one statement from the set of statements may include two mentions of entity types or associated instances. In an embodiment, the processor implemented method may further include sampling by defining a Markov Chain over space of statement graphs and using joint distribution as the stationary distribution. In an embodiment, neighbors of any statement graph may be defined using operations comprising at least one of (i) a vertex insertion, (ii) a vertex collapse and combination thereof. In an embodiment, the at least one sample from the Markov chain may be selected as the candidate statement graphs after a suitable burn-in period.

[013] It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention, as claimed.

BRIEF DESCRIPTION OF THE DRAWINGS

[014] The accompanying drawings, which are incorporated in and constitute a part of this disclosure, illustrate exemplary embodiments and, together with the description, serve to explain the disclosed principles:

[015] FIG. 1 illustrates an exemplary block diagram illustrating a system for learning knowledge graph schema via dialog, according to embodiments of the present disclosure.

[016] FIG. 2 is an exemplary flow diagram illustrating a method for learning knowledge graph schema via dialog, according to embodiments of the present disclosure.

DETAILED DESCRIPTION OF EMBODIMENTS

[017] Exemplary embodiments are described with reference to the accompanying drawings. In the figures, the left-most digit(s) of a reference number identifies the figure in which the reference number first appears. Wherever convenient, the same reference numbers are used throughout the drawings to refer to the same or like parts. While examples and features of disclosed principles are described herein, modifications, adaptations, and other implementations are possible without departing from the scope of the disclosed embodiments. It is intended that the following detailed description be considered as exemplary only, with the true scope being indicated by the following claims.

[018] The embodiments of the present disclosure provide a system and method for learning knowledge graph schema via dialog. The embodiments provide a mechanism for augmenting a knowledge graph schema on-the-fly to accommodate new questions by entering into a dialog with a domain expert. The augmentation mechanism involves a search over possible enhancements to the current schema, which are generated using, but not limited to, a Markov chain Monte Carlo (MCMC) technique. For complex queries, the embodiments allow the one or more experts not to respond. Under this response uncertainty, the problem of identifying the expert’s preferred augmentation to the schema with the minimum number of queries is analyzed. The problem is formulated as a noisy generalized binary search over the space of schema graphs. The embodiments iteratively split a set of candidate graph enhancements using the questions with the highest expected utility. The resulting dialog strategy significantly reduces dialog complexity while engaging the expert in meaningful dialog.

[019] Referring now to the drawings, and more particularly to FIG. 1 through 2, where similar reference characters denote corresponding features consistently throughout the figures, there are shown preferred embodiments and these embodiments are described in the context of the following exemplary system and/or method.

[020] FIG. 1 illustrates an exemplary block diagram of a system for learning knowledge graph schema via dialog, according to embodiments of the present disclosure. The system 100 includes or is otherwise in communication with one or more hardware processors such as a processor 106, an I/O interface 104, and at least one memory such as a memory 102. The memory 102 further includes a knowledge graph learning module 108. In an embodiment, the knowledge graph learning module 108 can be implemented as a standalone unit in the system 100. In another embodiment, the knowledge graph learning module 108 can be implemented as a module in the memory 102. The processor 106, the I/O interface 104, and the memory 102 may be coupled by a system bus.

[021] The I/O interface 104 may include a variety of software and hardware interfaces, for example, a web interface, a graphical user interface, and the like. The interfaces 104 may include a variety of software and hardware interfaces, for example, interfaces for peripheral device(s), such as a keyboard, a mouse, an external memory, a camera device, and a printer. The I/O interfaces 104 can facilitate multiple communications within a wide variety of networks and protocol types, including wired networks, for example, local area network (LAN), cable, etc., and wireless networks, such as Wireless LAN (WLAN), cellular, or satellite. For the purpose, the interfaces 104 may include one or more ports for connecting a number of computing systems with one another or to another server computer. The I/O interface 104 may include one or more ports for connecting a number of devices to one another or to another server.

[022] The hardware processor 106 may be implemented as one or more microprocessors, microcomputers, microcontrollers, digital signal processors, central processing units, state machines, logic circuitries, and/or any devices that manipulate signals based on operational instructions. Among other capabilities, the hardware processor 106 is configured to fetch and execute computer-readable instructions stored in the memory 102.

[023] The memory 102 may include any computer-readable medium known in the art including, for example, volatile memory, such as static random access memory (SRAM) and dynamic random access memory (DRAM), and/or non-volatile memory, such as read only memory (ROM), erasable programmable ROM, flash memories, hard disks, optical disks, and magnetic tapes. In an embodiment, the memory 102 includes a plurality of modules 108 and a repository 110 for storing data processed, received, and generated by the plurality of modules 108. The plurality of modules 108 may include routines, programs, objects, components, data structures, and so on, which perform particular tasks or implement particular abstract data types.

[024] The repository 110, amongst other things, includes a system database and other data. The system database may include any and all information about various knowledge graph schema comprising of different entities that may be formed by capturing the domain knowledge and a set of statements pertaining to natural language. The other data may include data generated as a result of the execution of one or more modules in the plurality of modules 108.

[025] The system 100 is a dialog system for learning knowledge graph schema via dialog (for example, for question-answering agents). The dialog system takes a collection of natural language statements and an initial schema of a knowledge graph, which may be empty, and engages in dialog with an expert to interpret each of the statements by finding an appropriate enhancement to the knowledge graph schema. The dialog system enhances the knowledge graph by augmenting the initial schema. The knowledge graph learning module 108 of the system 100 is configured to automatically extract knowledge from natural language text and construct an enhanced knowledge graph schema.

[026] The knowledge graph learning module 108 of the system 100 is configured to identify types and instances (e.g., mentions and mention types) of the schema entities for each of a set of natural language statements, one statement at a time. In an embodiment, natural language processing (NLP) techniques, knowledge of the existing entity types in the schema, and their associated mentions are used to identify the entity mentions in each statement and their corresponding types. The knowledge graph learning module 108 is further configured to generate possible statement interpretations, or statement graphs, for each statement by augmenting the current schema. In an embodiment, this is done by defining a distribution over statement graphs for each new statement, given the already identified statement graphs for historical statements, and then sampling candidate statement graphs from the distribution using an MCMC technique so that the more likely statement graphs are generated.
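The sampling idea can be illustrated with a minimal sketch, assuming statement graphs are simple paths between two fixed entity types and using a Metropolis-Hastings chain, which is one common MCMC technique and not necessarily the one used in the embodiments. The neighbor moves are the vertex insertion and vertex collapse operations named in the summary. The names `proposals`, `sample_paths`, `toy_score`, the weight table and the cap `max_inner` are all hypothetical, and `toy_score` merely stands in for the unnormalized joint distribution.

```python
import random
from collections import Counter, defaultdict


def proposals(path, vocab, max_inner=3):
    """Return {neighbor: proposal probability} for insertion/collapse moves.

    Endpoints stay fixed; only intermediate vertices are inserted or removed.
    max_inner (a hypothetical cap) keeps the state space finite.
    """
    inner = len(path) - 2
    moves = {}
    if inner < max_inner:
        moves["insert"] = [path[:i] + (v,) + path[i:]
                           for i in range(1, len(path)) for v in vocab]
    if inner > 0:
        moves["collapse"] = [path[:i] + path[i + 1:]
                             for i in range(1, len(path) - 1)]
    probs = defaultdict(float)
    for kind in sorted(moves):
        for neighbor in moves[kind]:
            # Different moves may reach the same neighbor, so probabilities add up.
            probs[neighbor] += 1.0 / (len(moves) * len(moves[kind]))
    return dict(probs)


def sample_paths(start, end, vocab, score, steps=6000, burn_in=1000, seed=1):
    """Metropolis-Hastings over paths; score(path) must be strictly positive."""
    rng = random.Random(seed)
    current = (start, end)
    samples = []
    for step in range(steps):
        q_fwd = proposals(current, vocab)
        candidates = list(q_fwd)
        proposal = rng.choices(candidates, weights=[q_fwd[c] for c in candidates])[0]
        q_back = proposals(proposal, vocab)[current]
        ratio = (score(proposal) * q_back) / (score(current) * q_fwd[proposal])
        if rng.random() < min(1.0, ratio):
            current = proposal
        if step >= burn_in:  # discard the burn-in period
            samples.append(current)
    return samples


# Hypothetical transition weights standing in for the real distribution.
WEIGHTS = {("Researcher", "Paper"): 5.0, ("Paper", "Topic"): 4.0}


def toy_score(path):
    score = 1.0
    for a, b in zip(path, path[1:]):
        score *= WEIGHTS.get((a, b), 0.2)
    return score


samples = sample_paths("Researcher", "Topic", ("Paper", "Project"), toy_score)
print(Counter(samples).most_common(3))
```

Because insertion and collapse are not symmetric moves, the acceptance ratio includes the forward and backward proposal probabilities; dropping that correction would bias the chain away from the intended stationary distribution.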

[027] The knowledge graph learning module 108 of the system 100 is configured to identify the right statement graph from the generated possible statement graphs by entering into a dialog (asking queries) with a domain expert. In an embodiment, the queries are binary queries framed about entities and relationships that are present in the right statement graph. The dialog system asks as few queries as possible, keeps each query as simple as possible, and avoids asking complex queries. In an embodiment, the complexity of a query is evaluated based on the number of entities and relationships contained in the query. The dialog system first asks the query having the highest expected utility, which eliminates as many possible statement graphs as possible. If the expert answers the query, the dialog system proceeds with the remaining statement graphs based on their expected utility. If the expert does not answer the query, the dialog system asks the query having the next highest expected utility. The dialog continues until only one correct statement graph is left. The knowledge graph learning module 108 is further configured to introduce the names of any new entities and relationships from each correct statement graph into the schema by asking the expert, thereby building the enhanced knowledge graph schema.

[028] FIG. 2 is an exemplary flow diagram illustrating a method for learning knowledge graph schema via dialog, according to embodiments of the present disclosure. With reference to FIG. 2, the method includes iteratively performing steps 206 through 214. At step 204, the method 200 includes receiving, by a processor, a set of statements pertaining to natural language and an initial schema of at least one knowledge graph. In an embodiment, the statements pertaining to the natural language include, but are not limited to, queries and questions received from a user. In an embodiment, an initial knowledge graph schema is a labeled multi-graph G = {E, R}, where E is the set of entity types and R is the set of binary relation types. In an embodiment, this is a multi-graph since there may be multiple relations of different types between the same pair of entities. The knowledge graph schema is further described by a non-limiting example, wherein the example includes a set of statements providing a dialog between a human (H) and an agent (A) that answers questions about the Research domain. The example is as follows:

H1: Does Tom Mitchell work on "never ending learning"?

A1: I know that Tom Mitchell is a researcher. But I do not understand "never ending learning". Is it the name of some entity?

H2: It is the name of a topic.

A2: I know that researchers write papers about topics. Is this what you mean?

H3: No. I want to know about projects.

A3: Is project a new kind of entity?

H4: Yes.

A4: Are researchers related to projects, and projects to topics?

H5: Yes, that is right.

A5: Could you tell me how projects and researchers are related?

H6: Researchers lead projects.

A6: I understand. And how are projects related to topics?

H7: Projects are about topics.

A7: Thank you. I now understand that your question was about researcher Tom Mitchell leading projects about topic "never ending learning".

[029] The above example of the research domain includes a plurality of entity types such as Researcher, Paper, Institute and Topic, and relation types such as AuthorOf(Researcher, Paper), AffiliatedWith(Researcher, Institute) and About(Paper, Topic). Further, a typed knowledge graph K = (G, I), or KG, is defined using a schema G and examples or instances I of the entity and relation types in G. For example, the research KG includes entity instances such as Researcher(T. Mitchell), Institute(CMU) and Paper(Coupled semi-supervised learning), and relation instances such as AffiliatedWith(T. Mitchell, CMU). Usually, each entity type has its own attributes, which are inherited by its instances. In an embodiment, the knowledge graphs can be used for, but are not limited to, natural language question answering.
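The schema and typed knowledge graph definitions above can be sketched as plain data structures. This is a minimal illustration, not the disclosed implementation: the class names `Schema` and `KnowledgeGraph`, their methods, and the `ReviewerOf` relation (added only to show why the schema is a multi-graph) are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Schema:
    """Labeled multi-graph G = (E, R): entity types and typed binary relations."""
    entity_types: set = field(default_factory=set)
    # Each relation is (name, source_type, target_type). Storing triples lets
    # several differently named relations connect the same pair of types.
    relation_types: set = field(default_factory=set)

    def add_relation(self, name, source, target):
        self.entity_types.update({source, target})
        self.relation_types.add((name, source, target))

    def relations_between(self, source, target):
        return sorted(n for n, s, t in self.relation_types if (s, t) == (source, target))


@dataclass
class KnowledgeGraph:
    """Typed knowledge graph K = (G, I): a schema plus its instances."""
    schema: Schema
    entity_instances: dict = field(default_factory=dict)  # instance name -> entity type
    relation_instances: set = field(default_factory=set)  # (relation, subject, object)

    def add_entity(self, name, entity_type):
        if entity_type not in self.schema.entity_types:
            raise ValueError(f"unknown entity type: {entity_type}")
        self.entity_instances[name] = entity_type

    def add_fact(self, relation, subject, obj):
        key = (relation, self.entity_instances[subject], self.entity_instances[obj])
        if key not in self.schema.relation_types:
            raise ValueError(f"relation not in schema: {key}")
        self.relation_instances.add((relation, subject, obj))


schema = Schema()
schema.add_relation("AuthorOf", "Researcher", "Paper")
schema.add_relation("AffiliatedWith", "Researcher", "Institute")
schema.add_relation("About", "Paper", "Topic")
schema.add_relation("ReviewerOf", "Researcher", "Paper")  # hypothetical second edge

kg = KnowledgeGraph(schema)
kg.add_entity("T. Mitchell", "Researcher")
kg.add_entity("CMU", "Institute")
kg.add_fact("AffiliatedWith", "T. Mitchell", "CMU")
print(schema.relations_between("Researcher", "Paper"))
```

Checking each fact against the schema on insertion is what makes the graph typed: an instance of a relation is only accepted when its endpoint types match a relation type in G.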

[030] At step 206, a plurality of possible interpretations, or one or more statement graphs, for each statement from the set of statements is determined. In an embodiment, the one or more statement graphs for each statement can be a tree, a cyclic graph, an acyclic graph and the like. In an embodiment, for a given set of statements S and an initial schema G_0, possible interpretations or statement graphs are searched to find an enhancement G^* of G_0 such that the statement graphs G_{s_i} for all statements s_i ∈ S are contained in the enhanced schema G^*. In an embodiment, along with G^*, the statement graphs for the individual statements are unknown as well and need to be identified. In an embodiment, the statements are, but are not restricted to, questions whose statement graph is a directed path of arbitrary length. In an embodiment, such questions form a significant fraction of the real questions asked to a question answering (QA) agent. In an embodiment, a dialog with the expert is proposed for each question statement in S. In the i-th dialog, a statement s_i ∈ S is considered, and a sequence of queries is asked to identify the true statement graph G_{s_i} for s_i.

[031] The queries that can be asked to the expert about the statement graph G_{s_i} need to be modelled. In an embodiment, one or more types of queries are posed to the expert. In an embodiment, one type of query includes, but is not restricted to, binary queries of the form "Is the question about p_1 OR p_2 OR ... OR p_k?", where each p_j is a possible path in G^*. In an embodiment, such a path may involve new entity and relation types that are not in the current schema graph. As a result, such binary queries can only recover the structure of G_{s_i}. In another embodiment, another type of query asks for the names of new entity and relation types in G_{s_i}. In an embodiment, each such query q(s_i) has a utility. Intuitively, the utility measures the extent to which the response to the query prunes the space of candidates for G_{s_i}. But this utility is uncertain, since the query may not get a response. The response probability depends on the complexity of the query. The disclosure uses a purely graph-theoretic notion of query complexity: the total number of entities in the paths p_1, ..., p_k mentioned in the query. Considering this response uncertainty, a dialog strategy is designed that recovers the statement graph G_{s_i} for each s_i ∈ S, and as a result the complete schema graph G^*, using as short a dialog with the expert as possible. In an embodiment, the length of the dialog is the aggregated complexity of the one or more queries posed to the expert during the dialog.
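The trade-off between pruning power and response probability can be sketched as follows. Only the complexity measure (total number of entities over the paths in the query) comes from the description; the exponential response model, the `decay` parameter, the definition of utility as the expected number of eliminated candidates, and the `max_paths` cap are assumptions made for illustration.

```python
import math
from itertools import combinations


def complexity(query):
    """Total number of entities over all paths named in the query."""
    return sum(len(path) for path in query)


def response_probability(query, decay=0.15):
    # ASSUMPTION: the text only says the probability depends on complexity;
    # an exponential decay is used here as one plausible choice.
    return math.exp(-decay * complexity(query))


def expected_utility(query, candidates):
    """Expected number of candidates eliminated, discounted by no-response.

    candidates maps each candidate path to its (unnormalized) probability;
    query is a tuple of paths joined by OR in a binary question.
    """
    total = sum(candidates.values())
    p_yes = sum(candidates[p] for p in query) / total
    n, k = len(candidates), len(query)
    # "Yes" keeps the k queried paths, "no" keeps the other n - k.
    pruned = p_yes * (n - k) + (1.0 - p_yes) * k
    return response_probability(query) * pruned


def best_query(candidates, max_paths=2):
    """Greedy choice: the OR-query with the highest expected utility."""
    paths = sorted(candidates)
    queries = [q for k in range(1, max_paths + 1) for q in combinations(paths, k)]
    return max(queries, key=lambda q: expected_utility(q, candidates))


candidates = {
    ("Researcher", "Paper", "Topic"): 0.6,
    ("Researcher", "Project", "Topic"): 0.3,
    ("Researcher", "Topic"): 0.1,
}
print(best_query(candidates))
```

Under these assumptions a two-path query covers more probability mass but is penalized twice: it names more entities, so it is less likely to be answered, and a "yes" leaves two candidates standing instead of one.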

[032] At step 208, mentions of types and instances of initial schema entities are identified for each statement from the set of statements. In an embodiment, each statement s_i from the set of statements includes two mentions of entity types or their associated instances, whose entity types are denoted as e_1(s_i) and e_2(s_i), and exactly one relation path mention, which is denoted as p(s_i). In another embodiment, for a given natural language question statement s, the goal of mention type identification is two-fold. In an embodiment, the step 208 is further explained with the help of the previously mentioned non-limiting example with the query "Does Tom Mitchell work on never ending learning?". First, the two mentions m_e1(s) and m_e2(s) of entity types or entity instances and the single mention m_r(s) of a relation path contained in the statement are identified. Here, ‘Tom Mitchell’ and "never ending learning" are the two entity instance mentions and ‘work on’ is the relation path mention. Second, the entity types of the two mentions m_e1(s) and m_e2(s), which are denoted as e_1(s) and e_2(s), are identified. Here, the entity type for ‘Tom Mitchell’ is Researcher and that for "never ending learning" is Topic.

[033] In an embodiment, mention identification is performed using simple unsupervised NLP techniques, which can be replaced by supervised techniques when needed without affecting the remaining components of the solution. It is assumed that m_e1(s) and m_e2(s) are separated by m_r(s) in the statement s. In an embodiment, a combination of noun phrase detection and named entity detection from NLTK is used to identify m_e1(s) and m_e2(s). Alternatively, entity mentions can be in the form of ‘Wh-’ questions. For example, for the question "Who works on never ending learning?", ‘who’ is a mention of entity type Researcher.

[034] For identifying the relation path mention m_r(s), the verb phrase between m_e1(s) and m_e2(s) is checked. Further, the entity types e_1(s) and e_2(s) are identified. In an embodiment, the entity mentions can be of two different forms. The first possibility is a type mention, where the type name is mentioned, for example "paper", "researcher", "institution", or some variations or synonyms. The second possibility is an instance mention, where an instance of one of these types is mentioned, for example "Coupled semi-supervised learning", "Tom Mitchell", or "Carnegie Mellon". The type Researcher needs to be identified from both "researcher" and "Tom Mitchell". In an embodiment, to address this, two distributions are considered for each entity type e ∈ E: a type mention distribution and an instance mention distribution P_m(· | e). In an embodiment, these are estimated from mention type annotations of previously seen statements. For previously encountered mentions, these two distributions are used to identify the most likely type. For new mentions, the type, which could be one of the existing types in E or a new type, is identified with the help of the expert.
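A minimal sketch of this type lookup, assuming the two mention distributions are estimated as relative frequencies from annotated mentions. The class `MentionModel`, its methods, and the rule of scoring a type by the larger of its two conditional probabilities are hypothetical choices, since the description does not state how the two distributions are combined.

```python
from collections import Counter, defaultdict


class MentionModel:
    """Per-type mention counts, kept separately for type and instance mentions."""

    def __init__(self):
        self.type_mentions = defaultdict(Counter)      # entity type -> type-name mentions
        self.instance_mentions = defaultdict(Counter)  # entity type -> instance mentions

    def observe(self, mention, entity_type, is_instance):
        """Record one annotated mention from a previously seen statement."""
        table = self.instance_mentions if is_instance else self.type_mentions
        table[entity_type][mention.lower()] += 1

    @staticmethod
    def _prob(table, entity_type, mention):
        counts = table.get(entity_type)
        if not counts:
            return 0.0
        return counts[mention] / sum(counts.values())

    def most_likely_type(self, mention):
        """Return the best type for a known mention, or None for a new mention."""
        mention = mention.lower()
        types = set(self.type_mentions) | set(self.instance_mentions)
        # ASSUMPTION: score each type by the larger of its two probabilities.
        scored = {
            t: max(self._prob(self.type_mentions, t, mention),
                   self._prob(self.instance_mentions, t, mention))
            for t in types
        }
        best = max(scored, key=scored.get, default=None)
        if best is None or scored[best] == 0.0:
            return None  # unseen mention: defer to the expert
        return best


model = MentionModel()
model.observe("researcher", "Researcher", is_instance=False)
model.observe("Tom Mitchell", "Researcher", is_instance=True)
model.observe("never ending learning", "Topic", is_instance=True)
print(model.most_likely_type("Tom Mitchell"))
```

Returning `None` for an unseen mention mirrors the description: the dialog, not the model, decides whether a new mention belongs to an existing type or introduces a new one.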

[035] Further, at step 210, a set of possible statement graphs and their corresponding probabilities for each statement from the set of statements is identified by augmenting the current schema based on defining the distribution over statement graphs for a new statement. In an embodiment, the step 210 requires defining a probability distribution for the statement graph G_s of a new statement, given the already identified statement graphs for historical statements. The defined probability distribution is provided in equation 1 as follows:
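A form of equation 1 that is consistent with the two terms described in the next paragraph can be sketched as follows. This is a reconstruction rather than the filed equation, and the symbol \mathcal{H} for the set of previously identified statement graphs is an assumption introduced here.

```latex
P\bigl(G_s \mid s, \mathcal{H}\bigr) \;\propto\;
\underbrace{P\bigl(G_s \mid e_1(s),\, e_2(s),\, \mathcal{H}\bigr)}_{\text{prior over paths}}
\;\times\;
\underbrace{P\bigl(m_r(s) \mid G_s\bigr)}_{\text{likelihood of the relation mention}}
```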

[036] Here, the first term can be regarded as the prior probability of a statement graph (path) for the statement s starting at entity e_1(s) and ending at entity e_2(s), given the statement graphs for previous statements. The second term is the probability of the relation mention in the statement given the statement graph, and can be regarded as the likelihood.

[037] Further, the distribution is defined over statement graphs G_s for a statement s. Here, the statement graph is assumed to be a random path in the true schema graph, with start probabilities over entity types and transition probabilities over relations between entity types. The path needs to start at e_1(s) and end at e_2(s). The previously seen statement graphs serve as historical paths that provide evidence for these start and transition probabilities. This is similar to random walks used for link prediction in knowledge graphs, extended with the possibility of previously unseen entity types and relations. The random-walk probability of a statement path is defined as:
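Given the start distribution and transition distribution named in the next paragraph, the random-walk probability of a path plausibly takes the standard product form below. This is a reconstruction under that assumption; here x_1, ..., x_n (a notation introduced for this sketch) are the entities along the path, with x_1 = e_1(s) and x_n = e_2(s).

```latex
P\bigl(G_s\bigr) \;=\; \pi(x_1) \prod_{j=1}^{n-1} Q\bigl(x_j,\, x_{j+1}\bigr),
\qquad x_1 = e_1(s), \quad x_n = e_2(s)
```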

[038] where π(·) is the distribution over start entities of the random walk, and Q(e_j, ·) is the transition distribution from entity e_j to other entities. The definitions of these two distributions need to account for new entities and relationships.
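Written out, a random-walk probability consistent with these definitions is given below. The indexing $e_1, \dots, e_T$ of the entities along the path and the subscript on $P_{rw}$ are notation assumed here for illustration.

$$P_{rw}(G_s) = \pi(e_1) \prod_{j=1}^{T-1} Q(e_j, e_{j+1})$$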

[039] Further, the probability π(k) of the random path starting at entity k is defined as follows:

$$\pi(k) \propto \begin{cases} n_k + \alpha_e & \text{for existing } k \\ \alpha_n & \text{for new } k \end{cases}$$

[040] Here, n_k is the number of times previous statement graphs have started at entity k, and α_e > 0 allows new start entities. Similarly, the probability Q(j, k) of entity k following entity j in the path is defined as follows:

$$Q(j, k) \propto \begin{cases} n_{jk} + \beta_{ee} & \text{for existing } j \text{ and } k \\ \beta_{en} & \text{for existing } j \text{, new } k \\ \beta_{ne} & \text{for new } j \text{, existing } k \\ \beta_{nn} & \text{for new } j \text{ and } k \end{cases}$$

[041] Here, n_jk is the number of times previous statement graphs have transitioned from entity j to entity k, and β_ee > 0 allows transitions from and to previously unrelated and unseen entities. In an embodiment, the intuition is that (a) more frequently seen transitions are more likely, and (b) while encountering new entities and relationships is possible, that probability progressively reduces as the number of historical statements increases. In the previously considered example statement, the true statement path Researcher → Paper → Topic involves two new transitions, one to a new entity (Paper) and another from the same new entity. Conditioned on the statement graph, the likelihood of the relation path mention in the statement is defined as follows:

$$P\big(m_r(s) \mid G_s\big) = P\big(m_r(s) \mid E(G_s)\big)$$

[042] where E(G_s) is the sequence of intermediate relations in G_s. In the example, this is the probability that the relation sequence (AuthorOf(Researcher, Paper), About(Paper, Topic)) is mentioned as 'works on' in a statement. The probabilities are again estimated from the historical mentions of relation paths in the previously seen statements with already identified statement graphs, with a small probability reserved for unseen mentions. In an embodiment, together, the two terms define the probability of a specific statement graph for a statement.
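The start and transition weights defined in paragraphs [039] and [040] can be sketched as follows. The weights are left unnormalized, matching the "proportional to" form of the definitions. The hyperparameter values, the `known` set of existing entities, and the function names are assumptions chosen only for illustration.

```python
def start_weight(k, known, start_counts, alpha_e=0.5, alpha_n=0.1):
    """Unnormalized start weight: n_k + alpha_e for an existing entity, alpha_n for a new one."""
    if k in known:
        return start_counts.get(k, 0) + alpha_e
    return alpha_n


def transition_weight(j, k, known, trans_counts,
                      beta_ee=0.5, beta_en=0.1, beta_ne=0.1, beta_nn=0.05):
    """Unnormalized weight of entity k following entity j in a statement path."""
    if j in known and k in known:
        return trans_counts.get((j, k), 0) + beta_ee
    if j in known:
        return beta_en  # existing j, new k
    if k in known:
        return beta_ne  # new j, existing k
    return beta_nn      # new j and new k


def path_weight(path, known, start_counts, trans_counts):
    """Unnormalized random-walk weight: start weight times the chain of transition weights."""
    weight = start_weight(path[0], known, start_counts)
    for j, k in zip(path, path[1:]):
        weight *= transition_weight(j, k, known, trans_counts)
    return weight
```

Under these assumed values, a path through a new entity such as Researcher → Paper → Topic receives a much smaller weight than a frequently seen direct transition, which reflects intuition (a) and (b) of paragraph [041]. Multiplying this prior weight by an estimate of the mention likelihood would give the unnormalized probability of a candidate statement graph.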

[043] Upon defining the distribution over statement graphs G_s for a statement s, as depicted in step 212, a set of candidate statement graphs is generated by sampling the possible statement graphs from the distribution. In an embodiment, sampling could be performed by using, but is not limited to, a Markov chain Monte Carlo (MCMC) technique. In an embodiment, the MCMC technique performs sampling by defining a Markov chain over the space of statement graphs and using the joint distribution in equation 1 as the stationary distribution. In an embodiment, the neighbors of any statement graph are defined using operations such as vertex insertion and vertex collapse. In an embodiment, the vertex insertion operation introduces a vertex between currently adjacent entities in G_s. For example, Researcher → Topic has Researcher → Paper → Topic as a vertex-insertion neighbor. In an embodiment, the inserted entity can be an existing entity from previous statement graphs or a new one. In another embodiment, vertex insertion can operate on any currently adjacent pair of entities in the statement graph. In an embodiment, the vertex collapse operation is the inverse of vertex insertion.

[044] Vertex collapse removes an intermediate vertex of the statement graph G_s by introducing a direct edge between its neighbors. In an embodiment, vertex collapse can operate on any current intermediate vertex of the statement graph. In an embodiment, at each sampling step, all the neighbors of the current statement graph defined by these operations are considered, and the next statement graph is sampled from them using their probabilities defined by equation 1. The statement graph itself is included among its neighbors, and it can be easily shown that this definition of neighborhood ensures ergodicity of the Markov chain. In an embodiment, samples from the Markov chain are selected as the candidate statement graphs after a suitable burn-in period.
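The neighborhood and sampling step described above can be sketched as follows. Statement graphs are represented as tuples of entity names, and the caller supplies a `weight` function standing in for the probability of equation 1. The function names, the `entity_pool` argument, the rule that an entity is not inserted twice into the same path, and the default step counts are assumptions for illustration only.

```python
import random


def neighbors(path, entity_pool):
    """Statement paths reachable by one vertex insertion or one vertex collapse, plus the path itself."""
    path = tuple(path)
    result = [path]
    # Vertex insertion: place an entity between any currently adjacent pair.
    for i in range(len(path) - 1):
        for entity in entity_pool:
            if entity not in path:  # assumption: no repeated entity within a path
                result.append(path[:i + 1] + (entity,) + path[i + 1:])
    # Vertex collapse: drop any intermediate vertex, joining its neighbors directly.
    for i in range(1, len(path) - 1):
        result.append(path[:i] + path[i + 1:])
    return result


def sample_candidates(start, weight, entity_pool, steps=200, burn_in=50, seed=0):
    """Walk over statement paths, sampling each next path from the current neighborhood."""
    rng = random.Random(seed)
    current, samples = tuple(start), []
    for step in range(steps):
        options = neighbors(current, entity_pool)
        weights = [weight(p) for p in options]
        current = rng.choices(options, weights=weights, k=1)[0]
        if step >= burn_in:  # keep samples only after the burn-in period
            samples.append(current)
    return samples
```

Both operations leave the first and last entity untouched, so every sampled path still starts at e_1(s) and ends at e_2(s). A placeholder name in `entity_pool` could stand for a not yet named new entity.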

[045] At step 214, a plurality of expert preferred statement graphs is identified from the generated set of candidate statement graphs. In an embodiment, the expert preferred statement graphs G^* are identified with as short a dialog as possible, with the knowledge that the expert may not answer all questions. In an embodiment, for identifying expert preferred statement graphs from the generated set of candidate statement graphs, the method 200 includes utilizing a generalized binary search and a utility adapted to an unsupervised setting. In an embodiment, the proposed strategy is motivated by simple binary search, which requires O(log n) steps to search a set of size n. For example, let G^k be the set of remaining candidates at step k of the dialog for statement s. Further, a 2-way partition π^k(s) is considered that splits G^k such that the resultant cumulative probabilities of the two splits are P_1^k and P_2^k. In an embodiment, the utility u(π^k(s)) of the partition π^k(s) is defined as min(P_1^k, P_2^k). In one embodiment, the step 214 considers the utility of a query to be the number of possible candidates eliminated by that query, and combines it with the uncertainty of the response so as to ask, at each step, the query that has the highest expected utility. The disclosure selects a partition that maximizes the utility at each step of the dialog.

[046] Further, response uncertainty and expected utility are determined. For instance, let Q(π^k(s)) denote the query to the expert for partition π^k(s) of the remaining candidate set G^k. A query is unlikely to receive a response if it is very complex. For such cases, the present disclosure defines the complexity c(Q(π^k(s))) of a query Q(π^k(s)) purely in graph theoretic terms as the total number of nodes (entities) mentioned in the query. For the no-response probability for a query on π^k(s), various "squashing functions" can be applied to the query complexity. In an embodiment, complexity may consider the number of entities and relationships contained in the query. For example, the query "Is the statement about Person-Paper?" has complexity 2. The generalized logistic function used in the present disclosure is provided as:

[047] The parameters w, t are selected such that [1, n] covers most of the probability mass for some positive integer n. Using this definition of no-response probability, the expected utility of a partition is defined which is provided as:
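One common form of a logistic squashing function, and one natural way to combine it with the partition utility, are sketched below. Both functional forms, the roles assigned to w (steepness) and t (midpoint), and the default values are assumptions for illustration; the exact expressions used in the disclosure may differ.

```python
import math


def no_response_probability(complexity, w=1.0, t=4.0):
    """Assumed logistic squashing of query complexity into a no-response probability.

    w controls how sharply the probability rises and t is the complexity at
    which the expert is equally likely to answer or not (both hypothetical).
    """
    return 1.0 / (1.0 + math.exp(-w * (complexity - t)))


def expected_utility(utility, complexity, w=1.0, t=4.0):
    """Assumed expected utility: partition utility weighted by the chance of a response."""
    return (1.0 - no_response_probability(complexity, w, t)) * utility
```

Under these assumptions, a partition with utility 0.4 whose query mentions four entities has an expected utility of 0.2, while the same partition expressed through a simpler query scores higher.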

[048] The partition at step k of the dialog that maximizes the expected utility is selected. In an embodiment, a query for a partition is determined. In one embodiment, the queries can be binary queries about the entities and relationships present in the right statement. For example, 'Is the statement about Person-Paper?' is a binary query about the entities and relationships. A plurality of path features whose presence or absence distinguishes the two splits of the partition, such as nodes, edges, length 2 paths, and the like, is identified. For example, if all the paths in one of the splits of the partition contain the entity Paper, while no path in the other split contains it, then a possible query for the partition is the following: "Is the statement about papers?". Being a simpler question, this is much more likely to get a response. If no such feature exists, then the fall-back query is enumeration. Further, the present disclosure brings everything together to propose a dialog strategy.
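A minimal sketch of the feature search described above, restricted to node features, is given below. It assumes both splits are non-empty and that paths are sequences of entity names; the function name is hypothetical, and edges or length 2 paths could be handled in the same way.

```python
def distinguishing_entities(split_a, split_b):
    """Entities present in every path of one split and in no path of the other.

    split_a, split_b: non-empty lists of paths, each path a sequence of entity names.
    """
    def in_all(paths):
        return set.intersection(*(set(p) for p in paths))

    def in_any(paths):
        return set.union(*(set(p) for p in paths))

    only_a = in_all(split_a) - in_any(split_b)
    only_b = in_all(split_b) - in_any(split_a)
    return only_a | only_b
```

If the result is empty, no single entity separates the two splits, which corresponds to the case where another feature type or the enumeration fall-back is needed.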

[049] In an embodiment, the proposed dialog strategy for statement s starts with the entire set of candidate statement graphs G_s, and iteratively splits and prunes it by asking queries to the expert until one candidate is confirmed. At each step, the partition π^k(s) that has the highest expected utility is searched for, and its corresponding query Q(π^k(s)) is asked to the expert. If the expert provides a response, then the candidate set is split and the dialog proceeds to the next step. If no response is received, then the query with the next highest expected utility is asked. Since the number of possible partitions is exponential, exhaustively considering all partitions is not an option. At step 216, a subsequent preferred question with an expected utility (e.g., the next possible question to a domain expert) is identified from a set of possible questions to query a domain expert.

[050] The present disclosure considers only a linear number of possibilities by ordering the candidate statement graphs by probability, and considering each position in the order as a possible splitting point. Further, the structure of the statement graph is identified. If the identified statement graph includes new entities and relationships, then their names are additionally obtained from the expert. In such circumstances, a second kind of query is required, which asks for the name of a specific entity or relationship by providing its known context. An example name query may ask "What is the name of the new entity related to both researchers and topics in this statement?".
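The linear scan over splitting points described at the start of this paragraph can be sketched as follows. For brevity the sketch scores each split by the plain utility min(P_1, P_2) and ignores the no-response probability; the function name and the dictionary input are assumptions for illustration.

```python
def best_linear_split(candidate_probs):
    """Rank candidates by probability and pick the split position with the highest utility.

    candidate_probs: {statement_graph: probability}.
    Returns (utility, first_split, second_split), or None with fewer than two candidates.
    """
    ranked = sorted(candidate_probs.items(), key=lambda item: item[1], reverse=True)
    total = sum(prob for _, prob in ranked)
    best, running = None, 0.0
    for position in range(1, len(ranked)):
        running += ranked[position - 1][1]      # cumulative probability of the first split
        value = min(running, total - running)   # utility of splitting at this position
        if best is None or value > best[0]:
            first = [graph for graph, _ in ranked[:position]]
            second = [graph for graph, _ in ranked[position:]]
            best = (value, first, second)
    return best
```

Only n - 1 split positions are examined for n candidates, in contrast to the exponential number of arbitrary 2-way partitions.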

[051] Experimental Results:

[052] In an embodiment, the proposed dialog strategy based method is evaluated using data from an agent that answers research related questions, developed and deployed in house within our research organization. It includes a schema of ten entity types and fifteen relationship types. This agent currently answers about a hundred questions a day from different users. From the system's log, the most common patterns of questions are identified, and from these, twenty real questions are selected such that their (expert-labeled) statement graphs cover the entire schema and are paths in the schema. Then various experiments are performed to see if the agent provided in the disclosure can conduct a meaningful dialog with a real user to successfully recover the individual statement graphs for these questions. To simulate no-responses from the expert, a random decision is taken on whether the expert answers a specific query by sampling a binary random variable with the corresponding no-response probability. First, it is observed that the agent is able to recover the correct statement graphs for all questions with a meaningful dialog. Below, a snippet from a real conversation for a question similar to the previous example is shown. The snippet substitutes steps A2 to H5.

A1’: Does this statement involve a new entity type?

H1’: Yes.

A2’: Is this statement about papers?

H2’: No.

A3’: I think this statement is about Researcher related to a new entity which is related to Topic. Am I right?

H3’: Yes, you are right.

A4’: What is the name of this new entity?

H4’: It is project.

[053] In an embodiment, the differences from the motivating dialog in the introduction are noted. Since the present disclosure models the expert as providing only yes-no answers, a response of the form H3 in the previous example is currently not possible, which necessitated a slightly longer dialog. Also, based on the proposed dialog strategy, the query about a new entity came before confirming the existing path between Researcher and Topic via Paper. This leads to a total dialog complexity of just two, since the agent queried about two entities in total (the new entity in A1’ and Paper in A2’). By step A3’, the dialog had already converged to the right statement path.

[054] Further, the proposed dialog strategy is compared with a naive sequential strategy that presents each of the candidate statement graphs to the expert in sorted order of their probabilities until the right one is reached. The two strategies are compared using total dialog complexity. For the example question currently considered, the naive strategy leads to a dialog complexity of sixteen (compared with two for the proposed strategy), since the right path appears in the 5th position in the ranked list and the ones above it need to be completely specified in the query sequence before it is reached. Thus, the present disclosure shows an improvement in dialog complexity. Further, the proposed strategy leads to an average dialog complexity of 5.0 over a set of questions, compared to 7.9 for the sequential strategy.

[055] The embodiments of the present disclosure provide a dialog strategy that reduces the dialog complexity significantly while engaging the expert in meaningful dialog. The embodiments of the present disclosure address the problem of updating the knowledge graph schema based on new needs in a structured manner, so that an enhanced knowledge graph schema is maintained over time. Also, the implicit knowledge that resides in the mind of the expert is captured in the enhanced knowledge graph schema, thereby achieving improved precision and recall.

[056] The embodiments of the present disclosure are able to engage a human expert in a meaningful dialog and recover statement graphs for questions with very low query complexity and improved accuracy. The embodiments of the present disclosure propose a dialog mechanism for a non-technical domain expert to easily construct or augment a domain schema as required, based on data from an end application. The embodiments of the present disclosure minimize human expert involvement. In an embodiment, the present disclosure can be extended to provide a better model for more complex statements where the statement graphs are trees instead of paths, to handle incorrect responses, to handle forms of queries other than question statements and responses, to interleave candidate generation and dialog, to estimate no-response probabilities from historical dialog, and to provide an algorithm which is resilient to a small probability of such erroneous responses.

[057] The written description describes the subject matter herein to enable any person skilled in the art to make and use the embodiments. The scope of the subject matter embodiments is defined by the claims and may include other modifications that occur to those skilled in the art. Such other modifications are intended to be within the scope of the claims if they have similar elements that do not differ from the literal language of the claims or if they include equivalent elements with insubstantial differences from the literal language of the claims.

[058] It is to be understood that the scope of the protection is extended to such a program and in addition to a computer-readable means having a message therein; such computer-readable storage means contain program-code means for implementation of one or more steps of the method, when the program runs on a server or mobile device or any suitable programmable device. The hardware device can be any kind of device which can be programmed including e.g. any kind of computer like a server or a personal computer, or the like, or any combination thereof. The device may also include means which could be e.g. hardware means like e.g. an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or a combination of hardware and software means, e.g. an ASIC and an FPGA, or at least one microprocessor and at least one memory with software processing components located therein. Thus, the means can include both hardware means and software means. The method embodiments described herein could be implemented in hardware and software. The device may also include software means. Alternatively, the embodiments may be implemented on different hardware devices, e.g. using a plurality of CPUs.

[059] The embodiments herein can comprise hardware and software elements. The embodiments that are implemented in software include but are not limited to, firmware, resident software, microcode, etc. The functions performed by various components described herein may be implemented in other components or combinations of other components. For the purposes of this description, a computer-usable or computer readable medium can be any apparatus that can comprise, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.

[060] The illustrated steps are set out to explain the exemplary embodiments shown, and it should be anticipated that ongoing technological development will change the manner in which particular functions are performed. These examples are presented herein for purposes of illustration, and not limitation. Further, the boundaries of the functional building blocks have been arbitrarily defined herein for the convenience of the description. Alternative boundaries can be defined so long as the specified functions and relationships thereof are appropriately performed. Alternatives (including equivalents, extensions, variations, deviations, etc., of those described herein) will be apparent to persons skilled in the relevant art(s) based on the teachings contained herein. Such alternatives fall within the scope of the disclosed embodiments. Also, the words “comprising,” “having,” “containing,” and “including,” and other similar forms are intended to be equivalent in meaning and be open ended in that an item or items following any one of these words is not meant to be an exhaustive listing of such item or items, or meant to be limited to only the listed item or items. It must also be noted that as used herein and in the appended claims, the singular forms “a,” “an,” and “the” include plural references unless the context clearly dictates otherwise.

[061] Furthermore, one or more computer-readable storage media may be utilized in implementing embodiments consistent with the present disclosure. A computer-readable storage medium refers to any type of physical memory on which information or data readable by a processor may be stored. Thus, a computer-readable storage medium may store instructions for execution by one or more processors, including instructions for causing the processor(s) to perform steps or stages consistent with the embodiments described herein. The term “computer-readable medium” should be understood to include tangible items and exclude carrier waves and transient signals, i.e., be non-transitory. Examples include random access memory (RAM), read-only memory (ROM), volatile memory, nonvolatile memory, hard drives, CD ROMs, DVDs, flash drives, disks, and any other known physical storage media.

[062] It is intended that the disclosure and examples be considered as exemplary only, with a true scope of disclosed embodiments being indicated by the following claims.

Claims

CLAIMS:

1. A processor implemented method for learning knowledge graph schema via dialog, comprising:

receiving, via one or more hardware processors, a set of statements pertaining to a natural language and an initial schema of at least one knowledge graph (204);

determining, via the one or more hardware processors, a plurality of possible interpretations associated with at least one statement from the set of statements (206), wherein the plurality of possible interpretations corresponds to at least one statement graph;

identifying, via the one or more hardware processors, at least one of (i) mentions of types, (ii) instances of initial schema entities, and combination thereof, associated with at least one statement from the set of statements (208);

identifying, via the one or more hardware processors, a set of possible statement graphs and corresponding probabilities associated with the at least one statement from the set of statements (210);

generating, via the one or more hardware processors, a set of candidate statement graphs by sampling the set of possible statement graphs from a distribution (212);

identifying, via the one or more hardware processors, a plurality of expert preferred statement graphs from the generated set of candidate statement graphs (214); and

identifying, via the one or more hardware processors, a subsequent preferred question with an expected utility from a set of possible questions to query a domain expert (216).

2. The processor implemented method of claim 1, wherein the set of possible statement graphs and corresponding probabilities are identified by augmenting a current schema based on defining the distribution over statement graphs for a new statement.

3. The processor implemented method of claim 1, wherein an initial knowledge graph schema is a labeled multi-graph comprising multiple relations of different types between same pair of entities.

4. The processor implemented method of claim 1, wherein the at least one statement graph for each of the statement can be at least one of (i) a tree, (ii) a cyclic graph, (iii) an acyclic graph, and combination thereof.

5. The processor implemented method of claim 1, wherein the at least one statement from the set of statements comprises two mentions of entity types or associated instances.

6. The processor implemented method of claim 1, further comprising, sampling, via the one or more hardware processors, by defining a Markov Chain over space of statement graphs and using joint distribution as the stationary distribution, wherein neighbors of any statement graph are defined using operations comprising at least one of (i) a vertex insertion, (ii) a vertex collapse and combination thereof, wherein the at least one sample from the Markov chain is selected as the candidate statement graphs after a suitable burn-in period.

7. A system (100) to learn knowledge graph schema via dialog, comprising:

a memory (102) storing instructions;

one or more communication interfaces (104); and

one or more hardware processors (106) coupled to the memory (102) via the one or more communication interfaces (104), wherein the one or more hardware processors (106) are configured by the instructions to:

receive, a set of statements pertaining to a natural language and an initial schema of at least one knowledge graph;

determine, a plurality of possible interpretations associated with at least one statement from the set of statements, wherein the plurality of possible interpretations corresponds to at least one statement graph;

identify, at least one of (i) mentions of types, (ii) instances of initial schema entities, and combination thereof, associated with at least one statement from the set of statements;

identify, a set of possible statement graphs and corresponding probabilities associated with the at least one statement from the set of statements;

generate, a set of candidate statement graphs by sampling the set of possible statement graphs from a distribution;

identify, a plurality of expert preferred statement graphs from the generated set of candidate statement graphs; and

identify, via the one or more hardware processors, a subsequent preferred question with an expected utility from a set of possible questions to query a domain expert.

8. The system of claim 7, wherein the set of possible statement graphs and corresponding probabilities are identified by augmenting a current schema based on defining the distribution over statement graphs for a new statement.

9. The system of claim 7, wherein an initial knowledge graph schema is a labeled multi-graph comprising multiple relations of different types between same pair of entities.

10. The system of claim 7, wherein the at least one statement graph for each of the statement can be at least one of (i) a tree, (ii) a cyclic graph, (iii) an acyclic graph, and combination thereof.

11. The system of claim 7, wherein the at least one statement from the set of statements comprises two mentions of entity types or associated instances.

12. The system of claim 7, wherein one or more hardware processors is further configured to sample, by defining a Markov Chain over space of statement graphs and using joint distribution as the stationary distribution, wherein neighbors of any statement graph are defined using operations comprising at least one of (i) a vertex insertion, (ii) a vertex collapse and combination thereof, wherein the at least one sample from the Markov chain is selected as the candidate statement graphs after a suitable burn-in period.