WO2020178626A1

WO2020178626A1 - Systems and methods for adaptive question answering

Info

Publication number: WO2020178626A1
Application number: PCT/IB2019/053080
Authority: WO
Inventors: Neha PRABHUGAONKAR; Abhay PARAB; Natwar Mall
Original assignee: Cuddle Artificial Intelligence Private Limited
Priority date: 2019-03-01
Filing date: 2019-04-15
Publication date: 2020-09-10
Also published as: JP2022523601A; CN111886601A; CN111886601B

Abstract

The invention relates to systems and methods for adaptive question answering which is adaptive to user's characteristics, goals and needs by continuously learning from user interactions and adapting both the context and data visualization. The system for adaptive question answering comprises software modules embodied on a computer network, and the software modules comprise an Interpretation Engine, an Answering Engine and a Learning Engine.

Description

SYSTEMS AND METHODS FOR ADAPTIVE QUESTION ANSWERING

RELATED APPLICATIONS

[0001] This application claims priority benefit of Indian Patent Application No. 201921008186, filed March 1, 2019, which is incorporated entirely by reference herein for all purposes.

FIELD

[0002] The invention relates to systems and methods in the field of computer science, including hardware and software, and artificial intelligence.

BACKGROUND ART

[0003] Data-driven decision-making situations, such as Business Intelligence, involve complex high-dimensional data sets. This often requires looking at various data sources, slicing them appropriately, examining results and discovering the most meaningful insights. Business users often spend a disproportionate amount of time on inefficient data busywork. They have a very simple requirement: being able to ask business queries in the most natural way and get relevant business answers without worrying about the query language and other technical details.

[0004] In most Question Answering systems, the output remains independent of the user’s characteristics, goals and needs. Typically, these systems are static and rarely interact with users, and hence are incapable of learning and adapting answers based on the context of the question. Thus, there is a need for an adaptive learning system which adjusts its answers with respect to a user’s characteristics, goals and needs.

SUMMARY OF THE INVENTION

[0005] The present disclosure describes an Adaptive Question Answering Engine (AQUAE) system which is adaptive to the user’s characteristics, goals and needs by continuously learning from user interactions and adapting both the context and data visualization, thereby improving quality and experience of the user. Furthermore, the natural language interface allows a more natural flow of business queries for non-technical business users, who do not need to face discomfort and difficulty while using technical terminology.

[0006] One exemplary system embodiment herein provides an adaptive question answering engine system comprising software modules embodied on a computer network, and the software modules comprise an Interpretation Engine, an Answering Engine and a Learning Engine. The Interpretation Engine receives questions in natural language from a user and processes the question for holistic understanding of the user’s question by incorporating semantic and usage knowledge from a Learning Engine. The question understanding is not restricted to question text, but also identifies the user’s intent, makes intelligent assumptions in case of insufficiently elucidated questions, and performs disambiguation in case of ambiguities. The Interpretation Engine generates an Interpretation which is passed to an Answering Engine for generation of relevant answer(s). An Answering Engine formulates various intermediate queries based on the Interpretation and retrieves appropriate answers and metadata associated with the answers for individual intermediate query. The Answering Engine determines visualization preference by incorporating semantic and usage knowledge from a Learning Engine and aggregates and ranks answers as appropriate. The Answering Engine also recommends follow-up actions that the user can perform to aid the user on his information needs and further analysis. A Learning Engine augments, adapts and improves knowledge based on user interactions which are fed back to the Learning Engine. User interactions comprise data enquiry, correction of ambiguous entities, actions on interpretation, actions on answer, tracking of answer, drill-down part of data, visualization changes, up-vote/down-vote on answers, actions on suggested analysis, and follow-up on suggested questions.

[0007] One exemplary method embodiment herein provides a method for adaptive question answering, comprising steps of

receiving a question by an Interpretation Engine in natural language from a user;

processing the question by the Interpretation Engine for holistic understanding of the user’s question by incorporating semantic and usage knowledge from a Learning Engine;

generating an Interpretation by the Interpretation Engine passing to an Answering Engine for generation of relevant answers;

formulating various intermediate queries by the Answering Engine based on the Interpretation;

retrieving appropriate answers and metadata associated with the answers for individual intermediate query;

determining visualization preference by the Answering Engine by incorporating semantic and usage knowledge from the Learning Engine;

aggregating and ranking answers as appropriate; and

recommending follow-up actions by the Answering Engine to aid user on his information needs and further analysis;

wherein the Learning Engine augments, adapts and improves knowledge based on user interactions which comprise data enquiry, correction of ambiguous entities, actions on interpretation, actions on answer, tracking of answer, drill-down part of data, visualization changes, up-vote/down-vote on answers, actions on suggested analysis, or follow-up on suggested questions.

[0008] An additional embodiment herein provides a computer network for adaptive question answering which comprises a first subnetwork for data processing and a second subnetwork for data storage. An embodiment for the first subnetwork for data processing comprises at least one virtual or physical server node for implementing an Interpretation Engine, an Answering Engine, a Learning Engine, data synchronization or other modules. Another embodiment for the first subnetwork for data processing comprises a multi-server-node cluster which gets deployed with Interpretation Engine, Answering Engine, Learning Engine and all other required modules, and a second server node for data synchronization. A further embodiment for the first subnetwork for data processing provides serverless architectures. An embodiment for the second subnetwork for data storage comprises a big data framework and a database system for data inquiry and retrieval.

BRIEF DESCRIPTION OF THE DRAWINGS

[0009] The invention will be described in more detail below on the basis of a drawing, which illustrates exemplary embodiments. In the drawing, in each case schematically:

[0010] FIG. 1 shows the high-level process flow of AQUAE.

[0011] FIG. 2 depicts the hardware details of AQUAE.

[0012] FIG. 3 shows high-level representation of what constitutes a question.

[0013] FIG. 4 details high-level building blocks of Analytics specific Meta Ontology (AMO).

[0014] FIG. 5 details high-level user knowledge captured from various user/system interaction.

[0015] FIG. 6 depicts the details of the Interpretation Engine.

[0016] FIG. 7 depicts the details of the Answering Engine.

[0017] FIG. 8 shows high level representation of what constitutes Answer in AQUAE.

DETAILED DESCRIPTION OF THE EMBODIMENTS

[0018] It should be understood that this invention is not limited to the particular methodology, protocols, and systems, etc., described herein and as such may vary. The terminology used herein is for the purpose of describing particular embodiments only, and is not intended to limit the scope of the present invention, which is defined solely by the claims.

[0019] The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used herein, the singular forms “a”, “an” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms “comprises” and/or “comprising,” when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.

[0020] The following description and the drawings sufficiently illustrate specific embodiments to enable those skilled in the art to practice them. Other embodiments may incorporate structural, logical, electrical, process, and other changes. Portions and features of some embodiments may be included in, or substituted for, those of other embodiments.

[0021] FIG. 1 details the high-level process flow of AQUAE. A user can interact with the AQUAE in natural language using any system of interaction, such as mobile applications, desktop applications, web applications, voice-based hardware, etc. The input from the user is captured as a question. The question is further processed by the Interpretation Engine. The Interpretation Engine incorporates organization and usage knowledge for holistic understanding of the user’s question. The question understanding is not restricted to question text, but also identifies the user’s intent, makes intelligent assumptions in case of insufficiently elucidated questions, performs disambiguation in case of ambiguities and so on. The semantic understanding of the question context is termed the Interpretation.

[0022] The Interpretation is further passed to the Answering Engine for generation of relevant answer(s). The Answering Engine is also responsible for determining which data source(s) and which slice of data the user might be interested in. It also retrieves data from the underlying data cluster, determines visualization preference based on past interactions, and builds on additional contexts as deemed fit. To aid the user on his information needs and further analysis, the Answering Engine also recommends follow-up actions that the user can perform.

[0023] All these answers are presented to the user on the system of interaction in an appropriate manner, which also gives the user an opportunity to interact with the system. All these user interactions are fed back to the Learning Engine. User interactions can be correction of assumptions made while answering a question, choosing an alternative visualisation or even simple feedback like an upvote or downvote on an answer. AQUAE adapts from these interactions, thereby enriching the usage knowledge as well as the semantic knowledge.
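The loop of FIG. 1 (question, Interpretation, answer, feedback) can be pictured with a minimal Python sketch. It is an illustration only and not the disclosed implementation: every class, method and field name below is hypothetical, and the "enterprise data" is a toy dictionary.

```python
from dataclasses import dataclass, field


@dataclass
class LearningEngine:
    """Hypothetical store of usage knowledge built from user feedback."""
    measure_votes: dict = field(default_factory=dict)

    def preferred_measure(self, term, candidates):
        # Pick the candidate most often confirmed by users; ties keep list order.
        return max(candidates, key=lambda c: self.measure_votes.get((term, c), 0))

    def record_correction(self, term, chosen):
        # A user correction is fed back and strengthens the chosen sense.
        key = (term, chosen)
        self.measure_votes[key] = self.measure_votes.get(key, 0) + 1


@dataclass
class InterpretationEngine:
    learning: LearningEngine
    senses: dict  # ambiguous term -> candidate measures (toy semantic knowledge)

    def interpret(self, question):
        interpretation = {"question": question, "measures": []}
        for term, candidates in self.senses.items():
            if term.lower() in question.lower():
                interpretation["measures"].append(
                    self.learning.preferred_measure(term, candidates))
        return interpretation


@dataclass
class AnsweringEngine:
    data: dict  # measure -> value (toy stand-in for the enterprise data)

    def answer(self, interpretation):
        return [{"measure": m, "value": self.data.get(m)}
                for m in interpretation["measures"]]


learning = LearningEngine()
interpreter = InterpretationEngine(
    learning, {"Sales": ["Total Unit Sales", "Total Dollar Sales"]})
answering = AnsweringEngine({"Total Unit Sales": 1200, "Total Dollar Sales": 54000.0})

question = "Sales of Region East for this month"
before = answering.answer(interpreter.interpret(question))
learning.record_correction("Sales", "Total Dollar Sales")  # user fixes the assumption
after = answering.answer(interpreter.interpret(question))
```

In this sketch the same question is answered with a different measure once a correction has been recorded, which is the adaptive behaviour the paragraph above describes in prose.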

[0024] FIG. 2 depicts the exemplary hardware details of a computer network of AQUAE. The computer network for adaptive question answering comprises a first subnetwork for data processing and a second subnetwork for data storage. The first subnetwork for data processing comprises a multi-server-node cluster which gets deployed with Interpretation Engine, Answering Engine, Learning Engine and all other required modules, and a second server node for data synchronization. The second subnetwork for data storage comprises a big data framework and a database system for data inquiry and retrieval. Table 1 provides the exemplary hardware of AQUAE.

Table 1. Exemplary hardware details of AQUAE

[0025] Another embodiment for the first subnetwork for data processing comprises at least one virtual or physical server node for implementing an Interpretation Engine, an Answering Engine, a Learning Engine, data synchronization or other modules. A further embodiment for the first subnetwork for data processing provides serverless architectures. Serverless architectures are application designs that incorporate third-party “Backend as a Service” (BaaS) services, and/or that include custom code run in managed, ephemeral containers on a “Functions as a Service” (FaaS) platform. Serverless architectures remove much of the need for a traditional always-on server component and may benefit from significantly reduced operational cost, complexity, and engineering lead time.

Question

[0026] FIG. 3 shows high-level representation of what constitutes a question.

Knowledge

[0027] An intelligent Business Intelligence (BI) system can help you understand the implications of various organizational processes better and enhance your ability to identify suitable opportunities for your organization, thus enabling you to plan for a successful future. BI Analytics is widely used by organizations for providing actionable insights from a disparate and complex data landscape. This data may be scattered within and outside of the organization. Moreover, each organization has its own nomenclature, and its data is unique. An organization-agnostic Adaptive Question Answering Engine needs to understand this organization-specific knowledge. Building an organization-specific ontology may not suffice to create a domain- or organization-agnostic AQUAE.

Semantic knowledge

[0028] Several efforts have been made to create a meta-ontology, but they were unsuccessful in catering to the analytics need. This invention solves this problem by building an Analytics specific Meta Ontology (AMO). FIG. 4 details high-level building blocks of AMO. Each concept has a name, label, glossary, synonym and other relevant properties. This invention captures this understanding of domain- and organization-specific knowledge in semantic knowledge. Consider semantic knowledge as an organization-specific ontology derived from AMO. For example, ‘Brand’ is an Attribute, ‘Colgate’ is an entity of type Brand, and ‘Total Number of Unit’ is a measure with glossary “total numbers of items being sold”.
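As an illustration of how concepts with a name, label, glossary and synonyms might be held in memory, the following Python sketch stores the three examples given above. It is a sketch under stated assumptions: the class names, the `parent` field and the synonyms "units" and "unit sales" are hypothetical and do not come from the disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    name: str
    kind: str            # "attribute", "entity" or "measure"
    label: str = ""
    glossary: str = ""
    synonyms: tuple = ()
    parent: str = ""     # for an entity, the attribute it is a value of


class SemanticKnowledge:
    """Toy organization-specific ontology indexed by name and synonym."""

    def __init__(self):
        self._by_term = {}

    def add(self, concept):
        # Index the concept under its name and under every synonym.
        for term in (concept.name, *concept.synonyms):
            self._by_term.setdefault(term.lower(), []).append(concept)

    def lookup(self, term):
        # Case-insensitive lookup; an unknown term yields an empty list.
        return self._by_term.get(term.lower(), [])


knowledge = SemanticKnowledge()
knowledge.add(Concept("Brand", "attribute"))
knowledge.add(Concept("Colgate", "entity", parent="Brand"))
knowledge.add(Concept(
    "Total Number of Unit", "measure",
    glossary="total numbers of items being sold",
    synonyms=("units", "unit sales"),  # hypothetical synonyms
))
```

A lookup of "colgate" returns the entity whose parent attribute is "Brand", and a lookup of a synonym returns the measure together with its glossary.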

Usage Knowledge

[0029] FIG. 5 details high-level user knowledge captured from various user/system interaction. The user model captures all the user-specific interactions, such as feedback provided by the user on answers, corrections of assumed entities, etc., while the organization model captures meta information across the organization based on similar feedback. The visualization model captures the user’s visualization preferences for a given insight. Insight knowledge is a repository of all the insights which were served to users and links to AMO for all valid contexts. The user session keeps track of what users are doing by monitoring the interactive information interchanged between user and system. The system builds up user-specific interest from every user interaction.

Interpretation Engine

[0030] The role of the Interpretation Engine is to convert the user’s question into an intermediate structured representation called the Interpretation. FIG. 6 depicts the details of the Interpretation Engine. The Interpretation essentially captures the key entities from the question after analysing and understanding the context of the question.

Entity Identification

[0031] The Interpretation Engine uses a semantic-parser algorithm to parse the question and identify the key constituent phrases and tokens from the question. Named Entity Identification plays a pivotal role in the Interpretation Engine. Examples of named entities are person or organization names, locations, dates and times. Named entities can then be organized under predefined categories, such as “period” (relative, specific and periodic), “business objects” (column values), “measure” (numerical columns), “filters & conditions”, and other important features from the question and user context.
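The categorization step can be illustrated with a deliberately simple stand-in. The sketch below does not implement a semantic parser; it swaps in a longest-match dictionary lookup purely to show what tagged entities could look like. The lexicon contents and the function name are hypothetical.

```python
import re

# Hypothetical lexicon mapping surface phrases to the categories named above.
LEXICON = {
    "this month": "period",
    "last year": "period",
    "region east": "business object",
    "sales": "measure",
}


def identify_entities(question, lexicon=LEXICON):
    """Tag lexicon phrases in the question, preferring longer phrases."""
    text = question.lower()
    taken = [False] * len(text)   # characters already claimed by a match
    found = []
    for phrase in sorted(lexicon, key=len, reverse=True):
        pattern = r"\b" + re.escape(phrase) + r"\b"
        for match in re.finditer(pattern, text):
            start, end = match.span()
            if not any(taken[start:end]):
                found.append((start, question[start:end], lexicon[phrase]))
                for i in range(start, end):
                    taken[i] = True
    # Return entities in the order in which they appear in the question.
    return [(surface, category) for _, surface, category in sorted(found)]


entities = identify_entities("Sales of Region East for this month")
```

For the example question, `entities` holds the measure "Sales", the business object "Region East" and the period "this month", in that order.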

Lexical Entity Disambiguation

[0032] The next step after entity identification is the disambiguation of lexical entities, or Word Sense Disambiguation (WSD). WSD is the task of determining the correct meaning of an ambiguous word in a given context. In natural language text, words can be polysemous (having more than one sense). In a business context, measures and business objects are often polysemous. In such cases, we use the context and usage knowledge to disambiguate the entities. The measures are disambiguated and ranked using an inferencing algorithm and a weighted context similarity approach. For example, in the question “Sales of Region East for this month”, the word “Sales” has more than one sense (“Total Unit Sales”, “Total Dollar Sales”). In such a scenario, as per usage knowledge and question context, ‘Sales’ could be associated with ‘Total Dollar Sales’.
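One way to picture a weighted context similarity ranking is sketched below in Python. The disclosure does not specify the scoring formula, so the Jaccard overlap, the usage-frequency term, the two weights and all data values are assumptions made for illustration.

```python
def rank_senses(candidates, context_terms, sense_context, usage_counts,
                context_weight=0.7, usage_weight=0.3):
    """Rank candidate senses by context overlap blended with past usage."""
    context = {t.lower() for t in context_terms}
    total_usage = sum(usage_counts.get(c, 0) for c in candidates) or 1
    scored = []
    for sense in candidates:
        related = {t.lower() for t in sense_context.get(sense, ())}
        union = context | related
        # Jaccard similarity between question context and the sense's context.
        similarity = len(context & related) / len(union) if union else 0.0
        # Share of past interactions in which this sense was the one chosen.
        usage = usage_counts.get(sense, 0) / total_usage
        scored.append((context_weight * similarity + usage_weight * usage, sense))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [sense for _, sense in scored]


ranking = rank_senses(
    candidates=["Total Unit Sales", "Total Dollar Sales"],
    context_terms=["region", "east", "month"],
    sense_context={                      # hypothetical co-occurring terms
        "Total Unit Sales": ["product", "units", "store"],
        "Total Dollar Sales": ["region", "revenue", "month"],
    },
    usage_counts={"Total Unit Sales": 2, "Total Dollar Sales": 8},
)
```

With these toy inputs "Total Dollar Sales" ranks first, matching the example in the paragraph above; different weights or usage counts would of course change the ranking.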

Semantic Disambiguation

[0033] Once we disambiguate ambiguous entities, the next step is Semantic Disambiguation. While lexical disambiguation is all about disambiguating entities at word level, semantic disambiguation deals with disambiguation of entities considering the entire context of the question. This involves disambiguating entities considering the data source information of measures and also with respect to the other entities in the question. Once all the measures and business objects are disambiguated, appropriate filters and conditions are applied on the measure entities.
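A simplified reading of the data-source check is that a measure candidate is kept only if some data source holds both the measure and the other attributes mentioned in the question. The Python sketch below illustrates that reading; the catalog layout, source names and function name are hypothetical.

```python
def resolve_by_data_source(measure_candidates, required_attributes, catalog):
    """Keep (measure, source) pairs whose source covers every required attribute."""
    required = set(required_attributes)
    resolved = []
    for measure in measure_candidates:
        for source, schema in catalog.items():
            has_measure = measure in schema["measures"]
            covers_attributes = required <= set(schema["attributes"])
            if has_measure and covers_attributes:
                resolved.append((measure, source))
    return resolved


# Hypothetical catalog describing which source holds which columns.
catalog = {
    "pos_sales": {
        "measures": ["Total Dollar Sales", "Total Unit Sales"],
        "attributes": ["Region", "Brand"],
    },
    "shipments": {
        "measures": ["Total Unit Sales"],
        "attributes": ["Warehouse"],
    },
}

resolved = resolve_by_data_source(["Total Unit Sales"], ["Region"], catalog)
```

Here the measure exists in two sources, but only "pos_sales" also carries the "Region" attribute, so only that pairing survives.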

[0034] An enterprise application without access control is a no-go for almost all (if not all) organizations. Access control plays a considerable role in ensuring that a user gets access only to authorized information.

[0035] The last step in the Interpretation Engine is to identify the user’s intent, which can be further utilized by the Answering Engine.

Answering Engine

[0036] The main task of the Answering Engine is to generate appropriate answer(s) using the semantic Interpretation considering the user intent as deduced from the question. FIG. 7 depicts the details of the Answering Engine.

[0037] Once the question is interpreted, considering the context, the Engine formulates various intermediate queries required to answer the question. The queries can range from one to many based on the user’s intent. Each query formed yields a corresponding answer.

[0038] As a second step, for individual intermediate query, period and measures are inferred using a Bayesian formulation in case they are not mentioned in the question. Once the period, measures and other entities are in place, the Answering Engine consults the enterprise data to obtain the appropriate answer and metadata associated with the answer for individual intermediate query. Using the Interpretation, the Answering Engine identifies and recommends the most frequent and relevant information to the user along with the answer(s).
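The disclosure does not give the Bayesian formulation used to infer a missing period or measure. As one possible illustration, the Python sketch below applies a naive Bayes posterior with Laplace smoothing over past interactions; the history format, the smoothing constant `alpha` and the function name are assumptions.

```python
from collections import Counter, defaultdict


def infer_missing(history, context, alpha=1.0):
    """Posterior over candidate values given the entities present in a question.

    history: list of (value, entities) pairs from past interactions.
    context: entities that are present in the current question.
    """
    prior = Counter(value for value, _ in history)
    likelihood = defaultdict(Counter)
    vocabulary = set(context)
    for value, entities in history:
        likelihood[value].update(entities)
        vocabulary.update(entities)

    scores = {}
    for value, count in prior.items():
        score = count / len(history)                      # P(value)
        denominator = sum(likelihood[value].values()) + alpha * len(vocabulary)
        for entity in context:                            # P(entity | value)
            score *= (likelihood[value][entity] + alpha) / denominator
        scores[value] = score

    total = sum(scores.values())
    return {value: score / total for value, score in scores.items()}


# Hypothetical usage knowledge: which period accompanied which entities.
history = [
    ("this month", ["Total Dollar Sales", "Region"]),
    ("this month", ["Total Dollar Sales", "Brand"]),
    ("last quarter", ["Total Unit Sales", "Brand"]),
]
posterior = infer_missing(history, ["Total Dollar Sales", "Region"])
best_period = max(posterior, key=posterior.get)
```

In this toy history, a question about "Total Dollar Sales" by "Region" with no period stated would default to "this month", since that period co-occurred with those entities most often.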

[0039] The next step is to determine the visualization for the answer. Visualization helps the user understand hidden information in a more constructive way. Business leaders need the ability to easily drill down into the data to see where they can improve, take action and grow their business. Data visualization brings business intelligence to life. Depending on the answer data and past user interactions, the AQUAE provides the user with the best visualization along with alternate visualizations supported for the answer(s).
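A minimal sketch of choosing a best visualization plus alternates from the answer data and past interactions might look as follows. The chart types, the thresholds and the function names are hypothetical; the disclosure does not enumerate supported visualizations.

```python
from collections import Counter


def supported_visualizations(has_time_axis, n_categories):
    """Hypothetical rules mapping the shape of answer data to chart types."""
    if has_time_axis:
        return ["line", "bar", "table"]
    if n_categories <= 1:
        return ["number", "table"]
    if n_categories <= 6:
        return ["bar", "pie", "table"]
    return ["bar", "table"]


def choose_visualization(has_time_axis, n_categories, past_choices):
    """Return (best, alternates) using past user choices among supported types."""
    options = supported_visualizations(has_time_axis, n_categories)
    counts = Counter(choice for choice in past_choices if choice in options)
    # max() keeps the first option on ties, so the default order is the fallback.
    best = max(options, key=lambda option: counts[option])
    return best, [option for option in options if option != best]


best, alternates = choose_visualization(
    has_time_axis=False, n_categories=4, past_choices=["pie", "pie", "line"])
```

Past choices that the answer data cannot support (here "line") are ignored, so the preference is learned only among visualizations that are valid for the answer.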

[0040] Giving an answer may trigger the next set of questions that the user might ask. AQUAE recommends follow-up actions to ease discovery and effective analysis. As a last step, the user is served with assembled and ranked answer(s) for the question asked.

Answer

[0041] FIG. 8 shows a high-level representation of what constitutes an Answer in AQUAE.

Learning Engine & Interactions

[0042] The Learning Engine is responsible for augmentation/adaptation of knowledge based on user interactions. The following user interactions are supported by AQUAE. The Learning Engine improves knowledge based on each interaction to make AQUAE smarter.

Data enquiry

[0043] The first interaction starts with the user asking a question in natural language. The Learning Engine learns the context for a given user at a given time, building the user session. This user session, along with recent questions, plays a considerable role in understanding the user’s intent.

Correction of ambiguous entities

[0044] As detailed earlier, entities can be ambiguous in nature, and in such cases AQUAE makes intelligent assumptions in disambiguating entities based on the context and usage knowledge. Correction of ambiguous entities provides an opportunity for the Learning Engine to enrich its knowledge, thereby improving the entity disambiguation for subsequent data enquiries (for all the users in an organization).

Actions on Interpretation

[0045] In case of incomplete questions, the system infers necessary and mandatory entities (e.g. measure) using knowledge and context for generation of answers. AQUAE allows the user to change this context. This interaction is considered as feedback for subsequent data enquiries (for all the users in an organization).

Actions on Answers

[0046] AQUAE considers period entities as a special case and allows the user to change them when an answer is being served. This allows the user to explore data with respect to different time frames. The Learning Engine learns the relevant time period for a given context from this interaction. The knowledge is later used by AQUAE to infer time periods in case of incomplete questions (for all the users in an organization).

Tracking of answer

[0047] Answers can be tracked/untracked based on user’s changing business preferences. Learning Engine captures user’s interest from these interactions. This knowledge helps AQUAE to better rank the answers as well as helps in disambiguating entities.

Drill-down part of data

[0048] Users typically have a need to drill-down part of data for better understanding of business. AQUAE enables the same and learns user’s interest areas. Also, these interactions allow AQUAE to predict and pre-empt follow-up questions that users might have, thereby improving suggested analysis based on answers.

Visualization changes

[0049] A change in visualization helps AQUAE to understand the preferred visualization for a given answer. This helps AQUAE to learn and recommend the best visualization for subsequent answers (for all the users in an organization) in case of data enquiry.

Up-vote /down-vote on answers

[0050] AQUAE allows the user to provide feedback on the relevancy and validity of an answer using up-vote/down-vote actions. This helps AQUAE to learn and adapt the user model, thereby improving the experience with subsequent data enquiries (for all the users in an organization).
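How votes might feed both a user-level and an organization-level model can be sketched as follows. The blending weight, the net-vote scoring and all names are assumptions for illustration; the disclosure only states that the feedback adapts the user model and benefits all users in an organization.

```python
from collections import defaultdict


class FeedbackModel:
    """Toy user and organization models updated by up-votes and down-votes."""

    def __init__(self, user_weight=0.7):
        self.user_weight = user_weight
        self.user_votes = defaultdict(int)  # (user, insight) -> net votes
        self.org_votes = defaultdict(int)   # insight -> net votes across users

    def vote(self, user, insight, up):
        delta = 1 if up else -1
        self.user_votes[(user, insight)] += delta
        self.org_votes[insight] += delta

    def score(self, user, insight):
        # Blend the user's own feedback with the organization-wide signal.
        own = self.user_votes[(user, insight)]
        shared = self.org_votes[insight]
        return self.user_weight * own + (1 - self.user_weight) * shared

    def rank(self, user, insights):
        return sorted(insights, key=lambda i: self.score(user, i), reverse=True)


model = FeedbackModel()
model.vote("alice", "sales by region", up=True)
model.vote("bob", "sales by region", up=True)
model.vote("alice", "sales by brand", up=False)
ranked_for_carol = model.rank("carol", ["sales by brand", "sales by region"])
```

A user with no history of her own ("carol" here) still receives a ranking shaped by the organization-wide votes, which mirrors the "for all the users in an organization" behaviour noted above.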

Actions on suggested analysis

[0051] AQUAE pre-empts follow-up questions that user might have when presented with an answer by recommending related analysis. For example,“benchmark across all region” might be recommended for the answer,“sales of West region in 2018”. Invocation or non-invocation of these recommended analyses along with context allows Learning Engine to learn about user’s way of interaction with the data, giving opportunity to improve the same.

Follow-up on suggested questions

[0052] AQUAE recommends suggested questions to the user, when data is not available for a given context. This recommendation is based on usage knowledge.

Claims

1. An adaptive question answering engine system comprising software modules embodied on a computer network, wherein the software modules comprise:

an Interpretation Engine for receiving a question in natural language from a user, processing the question for holistic understanding by incorporating semantic and usage knowledge from a Learning Engine, and generating an Interpretation which is passed to an Answering Engine;

the Answering Engine for formulating various intermediate queries based on the Interpretation, retrieving answers and metadata associated with the answers for individual intermediate query, determining visualization preference by incorporating semantic and usage knowledge from the Learning Engine, recommending follow-up actions that the user can perform to aid the user on his information needs and further analysis, aggregating and ranking answers; and

the Learning Engine for augmenting, adapting and improving knowledge based on user interactions which are fed back to the Learning Engine.

2. The adaptive question answering engine system of claim 1, wherein the user interactions comprise data enquiry, correction of ambiguous entities, actions on interpretation, actions on answer, tracking of answer, drill-down part of data, visualization changes, up-vote/down-vote on answers, actions on suggested analysis, and follow-up on suggested questions.

3. The adaptive question answering engine system of claim 1, wherein the holistic understanding of the question comprises question text and user’s intent.

4. The adaptive question answering engine system of claim 1, wherein the Interpretation Engine uses a semantic-parser algorithm to parse the question and identify entities from the question.

5. The adaptive question answering engine system of claim 4, wherein the identification of entities comprises period identification, business object identification, measure inference, condition identification, and other important features from question and user context.

6. The adaptive question answering engine system of claim 1, wherein the holistic understanding of the question makes intelligent assumptions in case of insufficiently elucidated questions and performs disambiguation in case of ambiguities.

7. The adaptive question answering engine system of claim 6, wherein the disambiguation comprises Lexical entity disambiguation and semantic disambiguation.

8. The adaptive question answering engine system of claim 1, wherein the computer network comprises a first subnetwork for data processing and a second subnetwork for data storage.

9. The adaptive question answering engine system of claim 8, wherein the first subnetwork for data processing comprises at least one virtual or physical server node for implementing the Interpretation Engine, the Answering Engine, the Learning Engine, data synchronization or other modules.

10. The adaptive question answering engine system of claim 8, wherein the first subnetwork for data processing comprises serverless architectures.

11. The adaptive question answering engine system of claim 8, wherein the first subnetwork for data processing comprises a multi-server-node cluster which gets deployed with the Interpretation Engine, the Answering Engine, the Learning Engine and all other required modules, and a second server node for data synchronization.

12. The adaptive question answering engine system of claim 8, wherein the second subnetwork for data storage comprises a big data framework and a database system for data inquiry and retrieval.

13. A method for adaptive question answering, comprising steps of

formulating various intermediate queries by the Answering Engine based on the Interpretation;

retrieving appropriate answers and metadata associated with the answers for individual intermediate query;

determining visualization preference by the Answering Engine by incorporating semantic and usage knowledge from the Learning Engine;

recommending follow-up actions by the Answering Engine to aid user on his information needs and further analysis; and

aggregating and ranking answers as appropriate;

14. The method of claim 13, wherein the holistic understanding of the question comprises question text and user’s intent.

15. The method of claim 13, wherein the Interpretation Engine uses a semantic-parser algorithm to parse the question and identify entities from the question.

16. The method of claim 13, wherein the identification of entities comprises period identification, business object identification, measure inference, condition identification, and other important features from question and user context.

17. The method of claim 13, wherein the holistic understanding of the question makes intelligent assumptions in case of insufficiently elucidated questions and performs disambiguation in case of ambiguities.

18. The method of claim 17, wherein the disambiguation comprises Lexical entity disambiguation and semantic disambiguation.