US11238231B2 - Data relationships in a question-answering environment - Google Patents

Data relationships in a question-answering environment

Info

Publication number
US11238231B2
Authority
US
United States
Prior art keywords
subset
data
influence
characteristic
conditions
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active, expires
Application number
US16/275,798
Other versions
US20190179904A1 (en)
Inventor
David L. Johnson
Brian R. Muras
Daniel J. Strauss
Eric G. Thiemann
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
International Business Machines Corp
Original Assignee
International Business Machines Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by International Business Machines Corp filed Critical International Business Machines Corp
Priority to US16/275,798
Assigned to INTERNATIONAL BUSINESS MACHINES CORPORATION (Assignors: JOHNSON, DAVID L.; MURAS, BRIAN R.; STRAUSS, DANIEL J.; THIEMANN, ERIC G.)
Publication of US20190179904A1
Application granted
Publication of US11238231B2
Legal status: Active, adjusted expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28Databases characterised by their database models, e.g. relational or object models
    • G06F16/284Relational databases
    • G06F16/288Entity relationship models
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/332Query formulation
    • G06F16/3329Natural language query formulation or dialogue systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/205Parsing
    • G06F40/211Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/40Processing or translation of natural language

Definitions

  • the present disclosure relates to question-answering techniques, and more specifically, to establishing relationships between data in a question-answering environment.
  • Question-answering (QA) systems can be designed to receive input questions, analyze them, and return applicable answers. Using various techniques, QA systems can provide mechanisms for searching corpora (e.g., databases of source items containing relevant content) and analyzing the corpora to determine answers to an input question.
  • a computer-implemented method of establishing influence relationships between data in a question-answering environment can include determining a set of conditions indicating a set of user statuses, and analyzing, using a first natural language processing technique, a corpus of data including a set of user data.
  • the method can include identifying, based on analyzing the corpus of data, a set of influence factors corresponding to a subset of the set of user data and to a subset of the set of conditions.
  • the method can include establishing, based on the set of influence factors, a set of influence relationships between the subset of the set of user data and the subset of the set of conditions.
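Taken together, the steps recited above (determine a set of conditions, analyze a corpus of user data, identify influence factors, establish influence relationships) can be sketched in miniature. Everything in this sketch is invented for illustration — the function names, keyword lists, and characteristic table are not part of the claimed method:

```python
# Hypothetical sketch of the claimed flow; all names and data are
# illustrative, not from the patent or any real library.

def determine_conditions(question_text):
    # Toy stand-in for condition determination: match known condition names.
    known_conditions = ["common cold", "influenza", "paresthesia", "fever"]
    return [c for c in known_conditions if c in question_text.lower()]

def analyze_corpus(user_data, conditions):
    # Stand-in for the first NLP pass: pair each user-data item with each
    # condition and record any shared characteristic as an influence factor.
    characteristics = {
        "works at an elementary school": {"exposure to children", "stress"},
        "common cold": {"exposure to children", "contagious"},
    }
    factors = []
    for datum in user_data:
        for cond in conditions:
            shared = characteristics.get(datum, set()) & characteristics.get(cond, set())
            for factor in shared:
                factors.append((datum, cond, factor))
    return factors

def establish_relationships(factors):
    # Each (user datum, condition) pair with at least one influence factor
    # gets an influence relationship listing those factors.
    relationships = {}
    for datum, cond, factor in factors:
        relationships.setdefault((datum, cond), []).append(factor)
    return relationships

conditions = determine_conditions("What might explain my common cold?")
factors = analyze_corpus(["works at an elementary school"], conditions)
print(establish_relationships(factors))
# {('works at an elementary school', 'common cold'): ['exposure to children']}
```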
  • FIG. 1 depicts a block diagram of an example computing environment for use with a question-answering (QA) system, according to embodiments of the present disclosure.
  • FIG. 3 depicts an example system architecture configured to establish a set of influence relationships between data, according to embodiments of the present disclosure.
  • FIG. 4 depicts a diagram of influence factors and influence relationships between data in a question-answering environment, according to embodiments of the present disclosure.
  • Embodiments of the present disclosure are directed towards a method of establishing influence relationships between data in a question-answering environment.
  • answers can be generated in response to input queries (e.g., questions).
  • the QA system can be configured to receive an input query, analyze one or more data sources, and based on the analysis, generate answers.
  • answers can be data in various forms including, but not limited to, text, documents, images, video, and audio.
  • a QA system could be configured to provide answers including explanations of how various types of data, such as patient data and a set of conditions, are connected.
  • the system could be configured to establish relationships between user data and conditions, and, based on established relationships, provide explanations on how the data is connected.
  • the system could be configured to determine a causal relationship between the user data and the set of conditions, such that the system indicates that user data can cause one or more of the set of conditions.
  • the method can include determining a set of conditions indicating a set of user statuses and analyzing, using a natural language processing technique, a corpus of data including a set of user data.
  • the set of conditions can indicate various statuses of a user.
  • the set of conditions can indicate various actual or possible statuses of the user with regard to the user's health or medical state.
  • the set of conditions could include various illnesses such as influenza, food poisoning, the common cold, giardia, etc.
  • the set of conditions could include various symptoms such as a fever, cough, headache, etc.
  • conditions could include other various statuses such as busy, stressed, elated, etc.
  • the set of user data can include various data related to a user.
  • the set of user data can include electronic user information such as user accounts (bank accounts, credit cards, etc.), social media information, public records, and other electronic information associated with the user.
  • the set of user data can include financial information such as spending habits, bank statements, credit card statements, credit history, and other financial information.
  • user data can include travel information, including locations and durations of user travel.
  • user data can include social media data including social network posts, pictures, video, or other information posted on various social networks.
  • user data can include geographic data including the user's home address, work address, or other information related to the geographic location of the user.
  • the method can include identifying, based on analyzing the corpus of data, a set of influence factors corresponding to a subset of the set of user data and to a subset of the set of conditions.
  • an influence factor can be one or more shared characteristics identified in the set of user data and in the set of conditions, where the one or more shared characteristics are understood as possible consequences of the data. For example, user data indicating that a user is a schoolteacher could have an influence factor associated with it of “exposure to children”, since exposure to children is a possible consequence of being a schoolteacher.
  • a system could analyze a corpus and determine from various medical texts and other data that a condition of a gastro-intestinal discomfort could have an influence factor of eating food in a developing country. Further the system could analyze the set of user data and identify a subset of the user data corresponding with travel to a developing country. For example, the system could identify a trip from the user's social media page and/or bank accounts showing purchases in developing countries. Described further herein, the system can use natural language processing techniques to analyze user data and identify data which is associated with one or more of the influence factors.
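As a toy illustration of that last step (and not the patent's actual implementation), user records could be scanned for text matching keywords associated with an influence factor; the records, keyword list, and factor name below are all invented:

```python
# Illustrative only: scan user records for text matching an influence
# factor's keyword list. Records and keywords are invented.
influence_factor_keywords = {
    "eating food in a developing country": ["restaurant", "street food", "trip"],
}

user_records = [
    "Social media post: amazing street food on our trip!",
    "Card statement: grocery purchase, hometown",
]

def matching_records(factor, records):
    # Return the subset of user data associated with the influence factor.
    keywords = influence_factor_keywords[factor]
    return [r for r in records if any(k in r.lower() for k in keywords)]

print(matching_records("eating food in a developing country", user_records))
# ['Social media post: amazing street food on our trip!']
```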
  • the computing environment 100 can include one or more remote devices 102 , 112 and one or more host devices 122 .
  • Remote devices 102 , 112 and host device 122 can be distant from each other and communicate over a network 150 .
  • the host device 122 can be a central hub from which remote devices 102 , 112 establish a communication connection.
  • the host device and remote devices can be configured in various suitable relationships (e.g., in a peer-to-peer or other relationship).
  • the network 150 can be implemented by suitable communications media (e.g., wide area network (WAN), local area network (LAN), Internet, and Intranet).
  • remote devices 102 , 112 and host devices 122 can be local to each other, and communicate via appropriate local communication medium (e.g., local area network (LAN), hardwire, wireless link, Intranet).
  • the network 150 can be implemented within a cloud computing environment, or using one or more cloud computing services.
  • a cloud computing environment can include a network-based, distributed data processing system that provides one or more cloud computing services.
  • the host device 122 may be hosted in a cloud environment, for example on a virtual machine running in the cloud.
  • a cloud computing environment can include multiple computers (e.g., hundreds or thousands of them or more) distributed among one or more data centers and configured to share resources over the network 150.
  • remote devices 102 , 112 can enable users to submit input queries (e.g., search requests or other user queries) to host devices 122 to retrieve search results.
  • the remote devices 102 , 112 can include a query module 110 , 120 (e.g., in the form of a web browser or other suitable software module) and present a graphical user interface or other interface (command line prompts, menu screens, etc.) to solicit queries from users for submission to one or more host devices 122 and to display answers/results obtained from the host devices 122 in relation to such user queries.
  • host device 122 and remote devices 102 , 112 can be computer systems, and can each be equipped with a display or monitor.
  • the computer systems can include at least one processor 106 , 116 , 126 ; memories 108 , 118 , 128 ; internal or external network interface or communications devices 104 , 114 , 124 (e.g., modem, network interface cards); optional input devices (e.g., a keyboard, mouse, touchscreen, or other input device); and commercially available or custom software (e.g., browser software, communications software, server software, natural language processing software, search engine and/or web crawling software, filter modules for filtering content based upon predefined criteria).
  • the computer systems can include servers, desktops, laptops, and hand-held devices.
  • the answer module 132 can include one or more modules or units to perform the various functions of embodiments as described below, and can be implemented by a combination of software and/or hardware modules or units.
  • In FIG. 2, a block diagram of a QA system can be seen, according to embodiments of the present disclosure. Aspects of FIG. 2 are directed toward a system architecture 200, including a QA system 212 to generate a group of answers (e.g., one or more answers) in response to an input query.
  • one or more users can send requests for information to QA system 212 using a remote device (such as remote devices 102 , 112 of FIG. 1 ).
  • the remote device can include a client application 208 which can include one or more entities operable to generate information that is dispatched to QA system 212 via network 215 .
  • QA system 212 can perform methods and techniques for responding to the requests sent by the client application 208.
  • the information received at QA system 212 can correspond to input queries received from users, where the input queries can be expressed in natural language, or images, or other forms.
  • questions can be audio-type (e.g., spoken-word recordings, music, scientific sound recordings), video-type (e.g., a film, a silent movie, a video of a person asking a detailed question), image-type (e.g., a picture, a photograph, a drawing), or other type that can be received and processed by the QA system.
  • client application 208 can operate on a variety of devices. Such devices can include, but are not limited to, mobile and hand-held devices (e.g., laptops, mobile phones, personal or enterprise digital assistants, and the like), personal computers, servers, or other computer systems that can access the services and functionality provided by QA system 212 .
  • client application 208 can include one or more components, such as a mobile client 210 .
  • Mobile client 210, acting as an agent of client application 208, can dispatch user query requests to QA system 212.
  • client application 208 can also include a search application 202, either as part of mobile client 210 or separately, that can perform several functions, including some or all of the functions of mobile client 210 listed above.
  • search application 202 can dispatch requests for information to QA system 212 .
  • search application 202 can be a client application to QA system 212 .
  • Search application 202 can send requests for answers to QA system 212 .
  • Search application 202 can be installed on a personal computer, a server, or other computer system.
  • search application 202 can include a search graphical user interface (GUI) 204 and session manager 206 .
  • search GUI 204 can be a search box or other GUI component, the content of which can represent a question to be submitted to QA system 212 .
  • Users can authenticate to QA system 212 via session manager 206 .
  • session manager 206 can keep track of user activity across sessions of interaction with the QA system 212 . Session manager 206 can also keep track of what questions are submitted within the lifecycle of a session of a user. For example, session manager 206 can retain a succession of questions posed by a user during a session. In some embodiments, answers produced by QA system 212 in response to questions posed throughout the course of a user session can also be retained.
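The session-tracking behavior described for session manager 206 can be sketched as a small class; the class name, method names, and storage layout below are assumptions for illustration only:

```python
# A minimal sketch of session tracking as described for session manager
# 206; names and structure are assumptions, not from the patent.
class SessionManager:
    def __init__(self):
        self.sessions = {}  # session id -> list of (question, answer) pairs

    def record(self, session_id, question, answer=None):
        # Retain the succession of questions (and optionally the answers
        # produced for them) posed during a user session.
        self.sessions.setdefault(session_id, []).append((question, answer))

    def history(self, session_id):
        return self.sessions.get(session_id, [])

mgr = SessionManager()
mgr.record("user-1", "What causes a common cold?", "exposure to children")
mgr.record("user-1", "And paresthesia?")
print([q for q, _ in mgr.history("user-1")])
# ['What causes a common cold?', 'And paresthesia?']
```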
  • Information for sessions managed by session manager 206 can be shared between various computer systems and devices.
  • client application 208 and QA system 212 can be communicatively coupled through network 215 , e.g., the Internet, intranet, or other public or private computer network.
  • QA system 212 and client application 208 can communicate by using Hypertext Transfer Protocol (HTTP) or Representational State Transfer (REST) calls.
  • QA system 212 can reside on a server node.
  • Client application 208 can establish server-client communication with QA system 212 or vice versa.
  • the network 215 can be implemented within a cloud computing environment, or using one or more cloud computing services.
  • QA system 212 can respond to a request for information sent by client applications 208 (e.g., question posed by a user). QA system 212 can generate a group of answers in response to the request.
  • QA system 212 can include a question analyzer 214 , data sources 224 , and answer generator 228 .
  • Question analyzer 214 can be a computer module that analyzes the received questions. Question analyzer 214 can perform various methods and techniques for analyzing the questions (syntactic analysis, semantic analysis, image recognition analysis, etc.). In some embodiments, question analyzer 214 can parse received questions. Question analyzer 214 can include various modules to perform analyses of received questions.
  • computer modules that question analyzer 214 can encompass include, but are not limited to, a tokenizer 216 , part-of-speech (POS) tagger 218 , semantic relationship identifier 220 , and syntactic relationship identifier 222 .
  • tokenizer 216 can be a computer module that performs lexical analysis. Tokenizer 216 can convert a sequence of characters into a sequence of tokens. A token can be a string of characters typed by a user and categorized as a meaningful symbol. Further, in some embodiments, tokenizer 216 can identify word boundaries in an input query and break the question or text into its component parts such as words, multiword tokens, numbers, and punctuation marks. In some embodiments, tokenizer 216 can receive a string of characters, identify the lexemes in the string, and categorize them into tokens.
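A toy lexical analyzer in the spirit of tokenizer 216 might split a question into word, number, and punctuation tokens; the regular expression here is an assumption for illustration, not the patent's tokenizer:

```python
import re

# Toy tokenizer: words (with optional apostrophe contractions), numbers
# (with optional decimal part), and single punctuation marks.
TOKEN_PATTERN = re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)?|\d+(?:\.\d+)?|[^\w\s]")

def tokenize(text):
    # Identify word boundaries and break the question into component parts.
    return TOKEN_PATTERN.findall(text)

print(tokenize("What's causing my fever of 101.3 degrees?"))
# ["What's", 'causing', 'my', 'fever', 'of', '101.3', 'degrees', '?']
```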
  • POS tagger 218 can be a computer module that marks up a word in a text to correspond to a particular part of speech.
  • POS tagger 218 can read a question or other text in natural language and assign a part of speech to each word or other token.
  • POS tagger 218 can determine the part of speech to which a word corresponds based on the definition of the word and the context of the word.
  • the context of a word can be based on its relationship with adjacent and related words in a phrase, sentence, question, or paragraph. In some embodiments, the context of a word can be dependent on one or more previously posed questions.
  • parts of speech that can be assigned to words include, but are not limited to, nouns, verbs, adjectives, adverbs, and the like.
  • parts of speech categories that POS tagger 218 can assign include, but are not limited to, comparative or superlative adverbs, wh-adverbs, conjunctions, determiners, negative particles, possessive markers, prepositions, wh-pronouns, and the like.
  • POS tagger 218 can tag or otherwise annotate tokens of a question with part of speech categories.
  • POS tagger 218 can tag tokens or words of a question to be parsed by QA system 212 .
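A deliberately tiny rule-based tagger can illustrate what POS tagger 218 does; real taggers use trained statistical models, and the word lists and suffix rules below are invented for the example only:

```python
# Toy rule-based POS tagging; word lists and suffix rules are assumptions.
DETERMINERS = {"a", "an", "the"}
WH_WORDS = {"what", "which", "who", "where", "when", "why", "how"}

def tag(tokens):
    tagged = []
    for tok in tokens:
        low = tok.lower()
        if low in DETERMINERS:
            label = "DET"
        elif low in WH_WORDS:
            label = "WH"            # wh-pronouns / wh-adverbs
        elif low.endswith("ing") or low.endswith("ed"):
            label = "VERB"          # crude morphological guess
        elif low.endswith("ly"):
            label = "ADV"
        else:
            label = "NOUN"          # fallback category
        tagged.append((tok, label))
    return tagged

print(tag(["What", "caused", "the", "fever"]))
# [('What', 'WH'), ('caused', 'VERB'), ('the', 'DET'), ('fever', 'NOUN')]
```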
  • semantic relationship identifier 220 can be a computer module that can identify semantic relationships of recognized entities (e.g., words, phrases) in questions posed by users. In some embodiments, semantic relationship identifier 220 can determine functional dependencies between entities and other semantic relationships.
  • syntactic relationship identifier 222 can be a computer module that can identify syntactic relationships in a question composed of tokens posed by users to QA system 212 .
  • Syntactic relationship identifier 222 can determine the grammatical structure of sentences, for example, which groups of words are associated as “phrases” and which word is the subject or object of a verb.
  • Syntactic relationship identifier 222 can conform to a formal grammar.
  • question analyzer 214 can be a computer module that can parse a received user query and generate a corresponding data structure of the user query. For example, in response to receiving a question at QA system 212 , question analyzer 214 can output the parsed question as a data structure. In some embodiments, the parsed question can be represented in the form of a parse tree or other graph structure. To generate the parsed question, question analyzer 214 can trigger computer modules 216 - 222 . Additionally, in some embodiments, question analyzer 214 can use external computer systems for dedicated tasks that are part of the question parsing process.
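One possible data structure for the parsed question — a parse tree, as the bullet above mentions — can be sketched as a simple node class; the node layout and labels are assumptions, not the patent's representation:

```python
# A toy parse-tree data structure for a parsed question; the node layout
# and labels are illustrative assumptions.
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class ParseNode:
    label: str                          # phrase or POS label
    token: Optional[str] = None         # leaf token, if any
    children: list = field(default_factory=list)

# "What caused the fever" as a toy parse tree.
tree = ParseNode("ROOT", children=[
    ParseNode("WH", token="What"),
    ParseNode("VP", children=[
        ParseNode("VERB", token="caused"),
        ParseNode("NP", children=[
            ParseNode("DET", token="the"),
            ParseNode("NOUN", token="fever"),
        ]),
    ]),
])

def leaves(node):
    # Read the original token sequence back out of the tree.
    if node.token is not None:
        return [node.token]
    return [t for child in node.children for t in leaves(child)]

print(leaves(tree))  # ['What', 'caused', 'the', 'fever']
```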
  • the output of question analyzer 214 can be used by QA system 212 to perform a search of a set of (i.e., one or more) corpora to retrieve information to answer a question posed by a user.
  • a corpus can refer to one or more data sources.
  • data sources 224 can include databases, information corpora, data models, and document repositories.
  • the data source 224 can include an information corpus 226 .
  • the information corpus 226 can enable data storage and retrieval.
  • the information corpus 226 can be a storage mechanism that houses a standardized, consistent, clean and integrated form of data. The data can be sourced from various operational systems.
  • Data stored in the information corpus 226 can be structured in a way to specifically address reporting and analytic requirements.
  • the information corpus can be a relational database.
  • data sources 224 can include one or more document repositories.
  • answer generator 228 can be a computer module that generates a group of answers in response to posed questions.
  • Examples of answers generated by answer generator 228 can include, but are not limited to, natural language sentences, reports, charts, or other analytic representation, raw data, web pages, and the like.
  • answers can be of audio type, image type, or other suitable medium type.
  • answer generator 228 can include query processor 230 , visualization processor 232 , and feedback handler 234 .
  • When information in the data source 224 matching a parsed question is located, a technical query associated with the pattern can be executed by query processor 230.
  • Based on data retrieved by a technical query executed by query processor 230, visualization processor 232 can be configured to render visualization of the retrieved answers as described herein. The rendered visualization of the answers can represent the answer to the input query.
  • visualization processor 232 can render visualization in various forms including, but not limited to, images, charts, tables, dashboards, maps, and the like.
  • feedback handler 234 can be a computer module that processes feedback from users on answers generated by answer generator 228 .
  • users can be engaged in dialog with the QA system 212 to evaluate the relevance of received answers.
  • the answer generator 228 could produce the group of answers corresponding to a question submitted by a user. The user could rank each answer according to its relevance to the question.
  • the feedback of users on generated answers can be used for future question answering sessions.
  • the client application 208 could be used to receive an input query from a user.
  • the question analyzer 214 could, in some embodiments, be used to analyze input queries.
  • the input queries can include a question asking for explanations for a set of conditions.
  • the answer generator 228 in embodiments, could be used to analyze the data sources 224 to determine influence factors between user data in the information corpus 226 and one or more of the set of conditions.
  • In FIG. 3, a system architecture 300 for establishing influence relationships between data in a question-answering environment can be seen, according to embodiments of the present disclosure.
  • the system architecture 300 can represent an example architecture for executing embodiments of the present disclosure.
  • the system architecture 300 could be an example representation of aspects of the answer generator 228 ( FIG. 2 ) and/or the question analyzer 214 ( FIG. 2 ).
  • system architecture 300 can include a relationship analyzer 306 and an answer generator 314 .
  • the relationship analyzer 306 can be a computer module configured to establish influence relationships between data in a QA environment.
  • the relationship analyzer 306 can be configured to determine a set of conditions 301 .
  • the set of conditions 301 can be the same or substantially similar as described herein.
  • the relationship analyzer can receive the set of conditions 301 as inputs. For example, in embodiments a user could enter the set of conditions manually as text. The relationship analyzer 306 could then use natural language processing techniques as described herein to parse the text to determine the set of conditions 301 .
  • Relationship analyzer 306 can be communicatively connected to database 312 .
  • Database 312 can store various types of information including text, images, audio, video, and other suitable information.
  • the database 312 can include a large quantity of various kinds of data related to various subjects.
  • the database could include various medical information including journals, medical texts, clinical research, doctor's notes, and other information.
  • the database 312 could include information related to various additional subject matter.
  • the database 312 can be accessed and parsed by the relationship analyzer 306 to establish relationships between data based on the stored information.
  • database 312 can be a corpus of information. In some embodiments, database 312 can substantially correspond to information corpus 226 ( FIG. 2 ). In embodiments database 312 can include a set of user data 313 . User data can be the same or substantially similar as described herein. For example, in embodiments, the set of user data 313 includes one or more types of content including economic data, medical data, personal data, family history, and historical user data.
  • the relationship analyzer 306 can include a characteristic identifier 308 .
  • the characteristic identifier 308 can be configured to identify characteristics in the set of conditions 301 and in the set of user data 313.
  • characteristics are elements, features, traits, themes, etc. that can be related to or correspond to data.
  • a condition of the common cold could have characteristics including, but not limited to, “contagious”, “sore throat”, “nasal congestion”, and “common in children”.
  • user data indicating travel abroad could have characteristics including but not limited to, “exposure to people”, “stress”, and “unusual food and beverage”. Described further herein, characteristic relationships can be used to establish influence factors and influence relationships between two or more pieces of data.
  • the characteristic identifier 308 can identify characteristics in data using natural language processing techniques as described herein.
  • the characteristic identifier 308 can employ a natural language processor 309 .
  • the natural language processor 309 can be configured to perform various methods and techniques for natural language analysis of data in the QA environment.
  • the natural language processor 309 can be configured to perform syntactic analysis, semantic analysis, image recognition analysis, concept matching and other suitable methods and techniques.
  • characteristics can be determined by concept matching techniques.
  • Concept matching techniques can include, but are not limited to, semantic similarity analysis, syntactic analysis, and ontological matching.
  • the natural language processor 309 could be configured to parse data in the QA environment to determine semantic features (e.g., repeated words and/or keywords) and/or syntactic features (e.g., location of semantic features in headings and/or title).
  • Ontological matching could be used to map semantic and/or syntactic features to a particular concept.
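The concept-matching flow just described — count repeated keywords as semantic features, then map features onto concepts — can be sketched with a small hand-built ontology; the keyword-to-concept table is invented for illustration:

```python
# Toy concept matching: semantic features (repeated keywords) mapped onto
# concepts via a hand-built ontology. The ontology is an assumption.
from collections import Counter

ONTOLOGY = {
    "children": "exposure to children",
    "kids": "exposure to children",
    "school": "exposure to children",
    "overtime": "stress",
    "deadline": "stress",
}

def match_concepts(text, min_count=1):
    # Count word occurrences, then accumulate evidence per concept.
    words = Counter(text.lower().split())
    concepts = Counter()
    for word, concept in ONTOLOGY.items():
        if words[word] >= min_count:
            concepts[concept] += words[word]
    # Strongest concepts first.
    return [c for c, _ in concepts.most_common()]

doc = "school nurse notes: children at the school had colds; children coughing"
print(match_concepts(doc))
# ['exposure to children']
```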
  • a QA system could receive a question asking for possible causes in user data for a set of conditions including a common cold and paresthesia (a tingling sensation) in the appendages.
  • characteristic identifier 308 using natural language processor 309 , could parse the set of conditions 301 and the database 312 to determine characteristics of the set of conditions 301 .
  • the natural language processor could identify various concepts from a corpus, such as the database 312 , corresponding to the common cold. For example, the natural language processor could identify in various medical texts that exposure to young children can increase the chances of contracting the common cold.
  • the natural language processor 309 could then select “exposure to children” as the concept.
  • the natural language processor 309 could identify from medical journals or other sources that high stress levels can result in paresthesia. Thus, the natural language processor 309 could select “stress” as another concept. Thus, in embodiments, the characteristic identifier 308 could be configured to select the concept of “stress” as a characteristic of paresthesia and the concept of “exposure to children” as a characteristic of the common cold.
  • the characteristic identifier 308 could parse the set of user data and identify characteristics of the set of user data. For example, the characteristic identifier 308 could parse financial records, such as paystubs and tax information that shows that the user works at an elementary school and has been putting in overtime. As described herein, the natural language processor 309 could identify “exposure to children” and “stress” as concepts from analysis of this information. Thus, the characteristic identifier 308 could select the concepts of “stress” and “exposure to children” as characteristics of a subset of the user data.
  • the influence factor identifier 310 can be configured to identify influence factors between data in the QA environment.
  • the influence factor identifier 310 can be configured to identify influence factors based on comparing characteristics identified by the characteristic identifier 308 . In embodiments, comparisons can be made between data having common (e.g., shared) characteristics and different (e.g., non-shared) characteristics. Based on the comparisons of these characteristic relationships, the influence factor identifier can identify influence factors between data. In embodiments, if characteristics are the same or substantially similar then the influence factor identifier can identify the characteristics as one or more influence factors.
  • the influence factor identifier 310 can use natural language processor 309 to compare characteristics.
  • natural language processor 309 can use various techniques such as syntactic analysis, semantic analysis, image recognition analysis, concept matching and other suitable methods and techniques as described herein.
  • natural language processor can determine whether characteristics are the same or substantially similar. In embodiments, characteristics are substantially similar if they are identical. In some embodiments, characteristics are substantially similar if they are related. For example, in embodiments, related characteristics could be a first characteristic describing a genus and a second characteristic describing a species of that genus. For example, if a characteristic of the common cold was “exposure to children” and a characteristic of user data was “exposure to people”, the concepts could be considered substantially similar as “exposure to people” includes the characteristic of “exposure to children”.
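The "substantially similar" test above — identical characteristics match, and so do genus/species pairs — can be sketched with a small is-a table; the table entries are invented for the example:

```python
# Toy substantial-similarity check: characteristics match if identical,
# or if one is a species of the other in a small is-a ontology (invented).
IS_A = {
    "exposure to children": "exposure to people",    # species -> genus
    "exposure to coworkers": "exposure to people",
}

def substantially_similar(a, b):
    if a == b:
        return True
    # Related if one characteristic's genus is the other characteristic.
    return IS_A.get(a) == b or IS_A.get(b) == a

print(substantially_similar("exposure to children", "exposure to people"))  # True
print(substantially_similar("exposure to children", "stress"))              # False
```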
  • the characteristics can be weighted.
  • the influence factors identified from the characteristics can be weighted based on the weights of the characteristics. For example, in embodiments, an influence factor identified from a highly weighted characteristic will be a highly weighted influence factor.
  • the characteristics can be weighted based on the source of the characteristic, such as the type of user data. For example, in embodiments, a characteristic parsed from financial data could be assigned a weight reflecting that it was located in financial data. Additionally, characteristics from financial data could have higher weights than characteristics from other types of data, such as social networking data. In some embodiments, characteristics can be weighted based on the format of the user data.
  • characteristics parsed from textual data could be weighted higher than characteristics parsed from audio data.
  • the characteristic could be weighted based on the NLP analysis that detected it. For example, in embodiments, NLP could detect urgency, which could give the characteristic a higher weight. For example, a high-urgency characteristic could be parsed from social network data that says “wow! I really feel sick after eating at that restaurant!”
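The weighting schemes above (by source type, by data format, with an urgency boost) can be combined as in the following sketch. The weight tables and the urgency multiplier are assumptions for demonstration; the disclosure does not prescribe particular values.

```python
# Hypothetical weight tables: financial data outweighs social networking
# data, and textual data outweighs audio data.
SOURCE_WEIGHT = {"financial": 1.0, "social": 0.6}
FORMAT_WEIGHT = {"text": 1.0, "audio": 0.7}

def characteristic_weight(source: str, fmt: str, urgent: bool = False) -> float:
    """Combine source- and format-based weights; NLP-detected urgency
    (e.g., "wow! I really feel sick...") raises the weight."""
    weight = SOURCE_WEIGHT.get(source, 0.5) * FORMAT_WEIGHT.get(fmt, 0.5)
    if urgent:
        weight *= 1.5
    return weight

# A characteristic from textual financial data outweighs one parsed from
# audio on a social network.
print(characteristic_weight("financial", "text"))
```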
  • influence factor identifier 310 can be configured to group influence factors based on a data pair to which each influence factor belongs. For example, in a situation having a set of conditions including conditions A, B, and C, and a set of user data including user data D and E, there can be, in some embodiments, as many as six different data pairs (A-D, A-E, B-D, B-E, C-D, and C-E) and, therefore, as many as six different sets of influence factors.
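The pairing above can be sketched as a Cartesian product of the two sets. The factor lists below are hypothetical; in the disclosed system they would come from the influence factor identifier 310.

```python
from itertools import product

conditions = ["A", "B", "C"]
user_data = ["D", "E"]

# Hypothetical influence factors already identified for some pairs.
factors = {
    ("A", "D"): ["exposure to children"],
    ("B", "E"): ["stress", "travel"],
}

# One (possibly empty) group of influence factors per data pair.
groups = {pair: factors.get(pair, []) for pair in product(conditions, user_data)}

print(len(groups))        # six data pairs: A-D, A-E, B-D, B-E, C-D, C-E
print(groups[("C", "D")]) # no factors for this pair
```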
  • the relationship analyzer 306 can be configured to establish influence relationships using a set of influence factors.
  • Each influence relationship can represent a composite of a particular set of influence factors.
  • influence relationships can be measures or indicators as to how the data of a data pair are likely to interact or influence each other. Further, in some embodiments, for data pairs having no influence factors, there can be deemed to be no influence relationship between the data forming the pair, or there can be deemed to be a null or neutral influence relationship. For instance, if there are no influence factors corresponding to the A-D pair, then the relationship between condition A and user data D could be deemed a neutral influence relationship.
  • the answer generator 314 can be configured to generate answers based on influence relationships. For example, in response to a question about how two sets of data influence each other, the answer generator could generate one or more explanations detailing the influence relationships established by the relationship analyzer 306 . For example, the answer generator could present text describing that an influence relationship between working as a schoolteacher and the common cold was obtained. In embodiments, the answer generator could include evidence used to arrive at the influence relationship. For example, the answer generator could present elements in the database 312 used to establish the influence relationship.
  • the answer generator 314 can be configured to generate text based explanations of the influence relationship.
  • explanations can be generated in various formats including images, text, audio, video, tables, charts, and in other suitable formats.
  • answer generator 314 can be configured to use natural language processing techniques as described herein, to generate the explanations.
  • example diagram 400 of data relationships between data in a QA environment can be seen, according to embodiments of the present disclosure.
  • example diagram 400 includes two types of data including condition A 401 A and user data B 401 B.
  • various amounts of data can be compared for data relationships.
  • diagram 400 could be a representation of a QA system's response to a question of how types of data influence one another (such as condition A 401 A and user data B 401 B).
  • influence relationship A/B 404 exists corresponding to condition A 401 A and user data B 401 B.
  • influence relationship A/B 404 can be a composite of one or more influence factors.
  • there are two influence factors corresponding to condition A 401 A and user data B 401 B (characteristic b-based influence factor 402 A and characteristic d-based influence factor 402 B).
  • influence factors can be based on characteristics (a, b, c, and d).
  • Characteristics can be associated with the data as described herein. For example, condition A 401 A is associated with characteristic a 400 A, b 400 B, and d 400 D.
  • User data B 401 B is associated with characteristics b 400 B, c 400 C, and d 400 D.
  • Two characteristics (b and d) are common characteristics, which is indicated by lines from condition A 401 A and user data B 401 B to characteristics b 400 B and d 400 D.
  • characteristic b-based influence factor 402 A corresponding to both condition A 401 A and user data B 401 B
  • characteristic d-based influence factor 402 B corresponding to both condition A 401 A and user data B 401 B.
  • influence relationship A/B 404 can be generated based on the sets of influence factors. Specifically, a first set of influence factors (including the characteristic b-based influence factor and the characteristic d-based influence factor) can be used to generate an influence relationship A/B 404 between condition A 401 A and user data B 401 B.
  • Referring now to FIG. 5, a flowchart diagram of a method 500 of establishing influence relationships between data in a question-answering environment can be seen, according to embodiments of the present disclosure.
  • a set of conditions can be determined that indicate a set of user statuses.
  • the set of conditions can be the same or substantially similar as described herein.
  • the set of conditions can be various statuses for a user.
  • the set of conditions could include various illnesses and/or symptoms.
  • a corpus can be analyzed that includes a set of user data.
  • the corpus can be the same or substantially similar as described herein.
  • the corpus includes large quantities of information on various subject matter.
  • the set of user data can be the same or substantially similar as described herein.
  • the set of user data can include various types of electronic information accessible by a QA system for analysis.
  • characteristics can be identified that correspond to a subset of user data and to a subset of the conditions to identify influence factors. Characteristics can be the same or substantially similar as described herein. In embodiments, characteristics can be identified using natural language processing techniques. For example, in embodiments, concept matching techniques, as described herein, can be used to identify characteristics.
  • a set of influence factors can be identified based on a comparison of characteristics. Influence factors can be the same or substantially similar as described herein. In embodiments, the set of influence factors can be identified by determining that characteristics corresponding to the subset of user data and the subset of conditions are substantially similar.
  • a set of influence relationships can be established based on the set of influence factors.
  • Influence relationships can be the same or substantially similar as described herein.
  • influence relationships can be composites of groups of influence factors.
  • the method 500 can include evaluating the influence relationships based on the influence factors.
  • the influence relationships can be evaluated by calculating a relationship score that indicates the relative strength of the influence relationship.
  • the calculated relationship score can be based on the number of influence factors that make up the influence relationship.
  • the relationship analyzer 306 can be configured to determine the number of influence factors that make up the influence relationship. In some embodiments, the greater the number of influence factors that make up the influence relationship, the stronger the influence relationship.
  • the relationship score can be based on the weight of the influence factors, as described herein. For example, in embodiments, the higher the weight of the influence factors in the influence relationship, the greater the relationship score.
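One simple scoring rule consistent with the two properties above (more factors raise the score, and higher-weighted factors raise the score) is to sum the factor weights. This is a hedged sketch; the disclosure does not prescribe a specific formula.

```python
def relationship_score(factor_weights):
    """Sum of influence factor weights: more factors and higher weights
    both increase the score. An empty list (no influence factors)
    yields 0, corresponding to a neutral influence relationship."""
    return sum(factor_weights)

# Hypothetical weights for the influence factors of one data pair.
print(relationship_score([0.8, 0.5]))  # stronger than a single-factor pair
print(relationship_score([]))          # neutral relationship
```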
  • the method can include generating a set of explanations for the set of conditions using relevant influence relationships.
  • the set of explanations can be text based descriptions of the influence relationships established by embodiments of the present disclosure.
  • an explanation could include text describing that an influence relationship between working as a schoolteacher and the common cold was obtained.
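A text-based explanation like the schoolteacher example above could be generated from a simple template. The template and parameter names here are illustrative assumptions; the disclosed answer generator 314 would use the natural language processing techniques described herein.

```python
def explain(user_datum, condition, shared_characteristics):
    """Render a text explanation of an established influence relationship."""
    via = ", ".join(shared_characteristics)
    return (f"An influence relationship between {user_datum} and "
            f"{condition} was established via shared characteristics: {via}.")

print(explain("working as a schoolteacher", "the common cold",
              ["exposure to children"]))
```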
  • the present invention may be a system, a method, and/or a computer program product.
  • the computer program product may include a computer readable storage medium (or media) having computer readable program instructions thereon for causing a processor to carry out aspects of the present invention.
  • the computer readable storage medium can be a tangible device that can retain and store instructions for use by an instruction execution device.
  • the computer readable storage medium may be, for example, but is not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing.
  • a non-exhaustive list of more specific examples of the computer readable storage medium includes the following: a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), a static random access memory (SRAM), a portable compact disc read-only memory (CD-ROM), a digital versatile disk (DVD), a memory stick, a floppy disk, a mechanically encoded device such as punch-cards or raised structures in a groove having instructions recorded thereon, and any suitable combination of the foregoing.
  • a computer readable storage medium is not to be construed as being transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission media (e.g., light pulses passing through a fiber-optic cable), or electrical signals transmitted through a wire.
  • Computer readable program instructions described herein can be downloaded to respective computing/processing devices from a computer readable storage medium or to an external computer or external storage device via a network, for example, the Internet, a local area network, a wide area network and/or a wireless network.
  • the network may comprise copper transmission cables, optical transmission fibers, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers.
  • a network adapter card or network interface in each computing/processing device receives computer readable program instructions from the network and forwards the computer readable program instructions for storage in a computer readable storage medium within the respective computing/processing device.
  • Computer readable program instructions for carrying out operations of the present invention may be assembler instructions, instruction-set-architecture (ISA) instructions, machine instructions, machine dependent instructions, microcode, firmware instructions, state-setting data, or either source code or object code written in any combination of one or more programming languages, including an object oriented programming language such as Smalltalk, C++ or the like, and conventional procedural programming languages, such as the “C” programming language or similar programming languages.
  • the computer readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server.
  • the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider).
  • electronic circuitry including, for example, programmable logic circuitry, field-programmable gate arrays (FPGA), or programmable logic arrays (PLA) may execute the computer readable program instructions by utilizing state information of the computer readable program instructions to personalize the electronic circuitry, in order to perform aspects of the present invention.
  • These computer readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
  • These computer readable program instructions may also be stored in a computer readable storage medium that can direct a computer, a programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer readable storage medium having instructions stored therein comprises an article of manufacture including instructions which implement aspects of the function/act specified in the flowchart and/or block diagram block or blocks.
  • the computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other device to cause a series of operational steps to be performed on the computer, other programmable apparatus or other device to produce a computer implemented process, such that the instructions which execute on the computer, other programmable apparatus, or other device implement the functions/acts specified in the flowchart and/or block diagram block or blocks.
  • each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s).
  • the functions noted in the block may occur out of the order noted in the figures.
  • two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved.

Abstract

A computer-implemented method of establishing influence relationships between data in a question-answering environment is disclosed. Establishing influence relationships can include determining a set of conditions indicating a set of user statuses and analyzing, using a first natural language processing technique, a corpus of data including a set of user data. Establishing influence relationships between data can include identifying, based on analyzing the corpus of data, a set of influence factors corresponding to a subset of the set of user data and to a subset of the set of conditions. In embodiments, establishing influence relationships can include establishing, based on the set of influence factors, a set of influence relationships between the subset of the set of user data and the subset of the set of conditions.

Description

BACKGROUND
The present disclosure relates to question-answering techniques, and more specifically, to establishing relationships between data in a question-answering environment.
Question-answering (QA) systems can be designed to receive input questions, analyze them, and return applicable answers. Using various techniques, QA systems can provide mechanisms for searching corpora (e.g., databases of source items containing relevant content) and analyzing the corpora to determine answers to an input question.
SUMMARY
According to embodiments of the present disclosure, a computer-implemented method of establishing influence relationships between data in a question-answering environment is disclosed. The method can include determining a set of conditions indicating a set of user statuses, and analyzing, using a first natural language processing technique, a corpus of data including a set of user data. The method can include identifying, based on analyzing the corpus of data, a set of influence factors corresponding to a subset of the set of user data and to a subset of the set of conditions. The method can include establishing, based on the set of influence factors, a set of influence relationships between the subset of the set of user data and the subset of the set of conditions.
Embodiments of the present disclosure are directed towards a system for establishing influence relationships between data in a question-answering environment. The system can include a processor, a computer readable storage medium having program instructions embodied therewith. The program instructions can be executable by the processor to cause the system to determine a set of conditions indicating a set of user statuses and analyze, using a first natural language processing technique, a corpus of data including a set of user data. The program instructions can cause the system to identify, based on analyzing the corpus of data, a set of influence factors corresponding to a subset of the set of user data and to a subset of the set of conditions. The program instructions can cause the system to establish, based on the set of influence factors, a set of influence relationships between the subset of the set of user data and the subset of the set of conditions.
Embodiments of the present disclosure are directed towards a computer program product for establishing influence relationships between data in a question-answering environment. The computer program product can include a computer readable storage medium having program instructions embodied therewith. The program instructions can be executable by a computer to cause the computer to perform a method. In embodiments, the method can include determining a set of conditions indicating a set of user statuses and analyzing, using a first natural language processing technique, a corpus including a set of user data. The method can include identifying, based on analyzing the corpus, a set of influence factors corresponding to a subset of the set of user data and to a subset of the set of conditions. In embodiments, the method can include establishing, based on the set of influence factors, a set of influence relationships between the subset of the set of user data and the subset of the set of conditions.
The above summary is not intended to describe each illustrated embodiment or every implementation of the present disclosure.
BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWINGS
The drawings included in the present application are incorporated into, and form part of, the specification. They illustrate embodiments of the present disclosure and, along with the description, serve to explain the principles of the disclosure. The drawings are only illustrative of certain embodiments and do not limit the disclosure.
FIG. 1 depicts a block diagram of an example computing environment for use with a question-answering (QA) system, according to embodiments of the present disclosure.
FIG. 2 depicts a block diagram of an example QA system configured to generate answers in response to one or more input queries, according to embodiments of the present disclosure.
FIG. 3 depicts an example system architecture configured to establish a set of influence relationships between data, according to embodiments of the present disclosure.
FIG. 4 depicts a diagram of influence factors and influence relationships between data in a question-answering environment, according to embodiments of the present disclosure.
FIG. 5 depicts a flowchart diagram of a method of establishing influence relationships between data in a question-answering environment, according to embodiments of the present disclosure.
While the invention is amenable to various modifications and alternative forms, specifics thereof have been shown by way of example in the drawings and will be described in detail. It should be understood, however, that the intention is not to limit the invention to the particular embodiments described. On the contrary, the intention is to cover all modifications, equivalents, and alternatives falling within the spirit and scope of the invention.
DETAILED DESCRIPTION
Aspects of the present disclosure relate to question-answering techniques; more particular aspects relate to establishing relationships between a set of user data and a set of conditions indicating various user statuses. While the present disclosure is not necessarily limited to such applications, various aspects of the disclosure may be appreciated through a discussion of various examples using this context.
Embodiments of the present disclosure are directed towards a method of establishing influence relationships between data in a question-answering environment. In a QA system, answers can be generated in response to input queries (e.g., questions). For example, the QA system can be configured to receive an input query, analyze one or more data sources, and based on the analysis, generate answers. In embodiments, answers can be data in various forms including, but not limited to, text, documents, images, video, and audio.
In embodiments, answers could include possible explanations (e.g., causes) for various conditions. For example, the QA system could receive a question asking for possible explanations for a particular illness based on patient data.
In some instances in the medical field, when a patient seeks medical care, data about the patient can be collected by a health care provider to help explain possible conditions either currently affecting the patient or to identify possible future conditions. For example, a health care provider could survey the patient with questions related to the patient's spending habits, travel habits, medical history, or other suitable patient data to attempt to ascertain explanations or causes for potential conditions. However, in some instances patient data can be difficult to collect. For example, the patient may not know or remember sought after data. Further, the number of questions used to obtain patient data can be limited due to time constraints and/or customer service concerns.
Thus, a QA system could be configured to provide answers including explanations of how various types of data, such as patient data and a set of conditions, are connected. For example, the system could be configured to establish relationships between user data and conditions, and, based on established relationships, provide explanations on how the data is connected. In embodiments, the system could be configured to determine a causal relationship between the user data and the set of conditions, such that the system indicates that user data can cause one or more of the set of conditions.
The method can include determining a set of conditions indicating a set of user statuses and analyzing, using a natural language processing technique, a corpus of data including a set of user data. The set of conditions can indicate various statuses of a user. In embodiments, the set of conditions can indicate various actual or possible statuses of the user with regard to the user's health or medical state. For example, the set of conditions could include various illnesses such as influenza, food poisoning, cold, giardia, etc. In some examples, the set of conditions could include various symptoms such as a fever, cough, headache, etc. In some embodiments, conditions could include other various statuses such as, busy, stressed, elated, etc.
The set of user data can include various data related to a user. In embodiments, the set of user data can include electronic user information such as user accounts (bank accounts, credit cards, etc.), social media information, public records, and other electronic information associated with the user. In embodiments, the set of user data can include financial information such as spending habits, bank statements, credit card statements, credit history, and other financial information. In some embodiments, user data can include travel information, including locations and durations of user travel. In some embodiments, user data can include social media data including social network posts, pictures, video, or other information posted on various social networks. In some embodiments, user data can include geographic data including the user's home address, work address, or other information related to the geographic location of the user.
The method can include identifying, based on analyzing the corpus of data, a set of influence factors corresponding to a subset of the set of user data and to a subset of the set of conditions. Described further herein, an influence factor can be one or more shared characteristics identified in the set of user data and in the set of conditions, where the one or more shared characteristics are understood as possible consequences of the data. For example, user data indicating that a user is a schoolteacher could have an influence factor associated with it of “exposure to children”, since exposure to children is a possible consequence of being a schoolteacher.
In an additional example, a system could analyze a corpus and determine from various medical texts and other data that a condition of gastro-intestinal discomfort could have an influence factor of eating food in a developing country. Further, the system could analyze the set of user data and identify a subset of the user data corresponding with travel to a developing country. For example, the system could identify a trip from the user's social media page and/or bank accounts showing purchases in developing countries. Described further herein, the system can use natural language processing techniques to analyze user data and identify data which is associated with one or more of the influence factors.
The method can include establishing, based on the identified influence factors, a set of influence relationships between the subset of the set of user data and the subset of the set of conditions. Described further herein, an influence relationship can be a composite of influence factors for data. For example, one or more influence factors could exist between elements of the set of user data and the set of conditions. The influence relationship could be a composite of the one or more influence factors for those elements.
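The method summarized above can be sketched end to end: shared characteristics between a user-data subset and a condition subset become influence factors, and the set of factors for a data pair forms the influence relationship. All characteristic values below are hypothetical examples; a real system would extract them with the NLP techniques described herein.

```python
def influence_relationship(user_characteristics, condition_characteristics):
    """Influence factors are the shared characteristics; the relationship
    is a composite of those factors (neutral when there are none)."""
    factors = sorted(set(user_characteristics) & set(condition_characteristics))
    return {"factors": factors, "neutral": not factors}

rel = influence_relationship(
    {"exposure to children", "stress"},   # e.g., parsed from schoolteacher data
    {"exposure to children", "fatigue"})  # e.g., characteristics of the common cold

print(rel)
```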
Referring now to FIG. 1 a block diagram of an example computing environment 100 for use with a QA system can be seen, according to embodiments of the present disclosure. In some embodiments, the computing environment 100 can include one or more remote devices 102, 112 and one or more host devices 122. Remote devices 102, 112 and host device 122 can be distant from each other and communicate over a network 150. In embodiments, the host device 122 can be a central hub from which remote devices 102, 112 establish a communication connection. In embodiments, the host device and remote devices can be configured in various suitable relationships (e.g., in a peer-to-peer or other relationship).
In some embodiments, the network 150 can be implemented by suitable communications media (e.g., wide area network (WAN), local area network (LAN), Internet, and Intranet). In some embodiments, remote devices 102, 112 and host devices 122 can be local to each other, and communicate via appropriate local communication medium (e.g., local area network (LAN), hardwire, wireless link, Intranet). In some embodiments, the network 150 can be implemented within a cloud computing environment, or using one or more cloud computing services. Consistent with various embodiments, a cloud computing environment can include a network-based, distributed data processing system that provides one or more cloud computing services. In some embodiments, host device 122 may be hosted in a cloud environment, and may be hosted on a virtual machine running in the cloud. Further, a cloud computing environment can include multiple computers (e.g., hundreds or thousands or more) located among one or more data centers and configured to share resources over the network 150.
In some embodiments, host device 122 can include a QA system 130 having a search application 134 and an answer module 132. The search application 134 can be configured to search one or more databases or other computer systems for content that is related to an input query submitted by a user at a remote device 102, 112.
In some embodiments, remote devices 102, 112 can enable users to submit input queries (e.g., search requests or other user queries) to host devices 122 to retrieve search results. For example, the remote devices 102, 112 can include a query module 110, 120 (e.g., in the form of a web browser or other suitable software module) and present a graphical user interface or other interface (command line prompts, menu screens, etc.) to solicit queries from users for submission to one or more host devices 122 and to display answers/results obtained from the host devices 122 in relation to such user queries.
Consistent with various embodiments, host device 122 and remote devices 102, 112 can be computer systems, and can each be equipped with a display or monitor. The computer systems can include at least one processor 106, 116, 126; memories 108, 118, 128; internal or external network interface or communications devices 104, 114, 124 (e.g., modem, network interface cards); optional input devices (e.g., a keyboard, mouse, touchscreen, or other input device); and commercially available or custom software (e.g., browser software, communications software, server software, natural language processing software, search engine and/or web crawling software, filter modules for filtering content based upon predefined criteria). In some embodiments, the computer systems can include servers, desktops, laptops, and hand-held devices. In addition, the answer module 132 can include one or more modules or units to perform the various functions of embodiments as described below, and can be implemented by a combination of software and/or hardware modules or units.
Referring now to FIG. 2 a block diagram of a QA system can be seen, according to embodiments of the present disclosure. Aspects of FIG. 2 are directed toward a system architecture 200, including a QA system 212 to generate a group of answers (e.g., one or more answers) in response to an input query. In some embodiments, one or more users can send requests for information to QA system 212 using a remote device (such as remote devices 102, 112 of FIG. 1). The remote device can include a client application 208 which can include one or more entities operable to generate information that is dispatched to QA system 212 via network 215. QA system 212 can be able to perform methods and techniques for responding to the requests sent by the client application 208. In some embodiments, the information received at QA system 212 can correspond to input queries received from users, where the input queries can be expressed in natural language, or images, or other forms.
An input query (similarly referred to herein as a question) can be one or more words that form a search term or request for data, information, or knowledge. A question can be expressed in the form of one or more keywords. Questions can include various selection criteria and search terms. A question can be composed of complex linguistic features in addition to keywords. However, a keyword-based search for answers can also be possible. In some embodiments, questions posed by users can be restricted to a limited syntax. The use of restricted syntax can result in a variety of alternative expressions that assist users in better stating their needs. In some embodiments, questions can be implied (rather than explicit) questions. Furthermore, in some embodiments, questions can be audio-type (e.g., spoken-word recordings, music, scientific sound recordings), video-type (e.g., a film, a silent movie, a video of a person asking a detailed question), image-type (e.g., a picture, a photograph, a drawing), or other type that can be received and processed by the QA system.
In some embodiments, client application 208 can operate on a variety of devices. Such devices can include, but are not limited to, mobile and hand-held devices (e.g., laptops, mobile phones, personal or enterprise digital assistants, and the like), personal computers, servers, or other computer systems that can access the services and functionality provided by QA system 212. In some embodiments, client application 208 can include one or more components, such as a mobile client 210. Mobile client 210, acting as an agent of client application 208, can dispatch user query requests to QA system 212.
Consistent with various embodiments, client application 208 can also include a search application 202, either as part of mobile client 210 or separately, that can perform several functions, including some or all of the functions of mobile client 210 listed above. For example, in some embodiments, search application 202 can dispatch requests for information to QA system 212. In some embodiments, search application 202 can be a client application to QA system 212. Search application 202 can send requests for answers to QA system 212. Search application 202 can be installed on a personal computer, a server, or other computer system.
In some embodiments, search application 202 can include a search graphical user interface (GUI) 204 and session manager 206. In such situations, users can be able to enter questions in search GUI 204. In some embodiments, search GUI 204 can be a search box or other GUI component, the content of which can represent a question to be submitted to QA system 212. Users can authenticate to QA system 212 via session manager 206. In some embodiments, session manager 206 can keep track of user activity across sessions of interaction with the QA system 212. Session manager 206 can also keep track of what questions are submitted within the lifecycle of a session of a user. For example, session manager 206 can retain a succession of questions posed by a user during a session. In some embodiments, answers produced by QA system 212 in response to questions posed throughout the course of a user session can also be retained. Information for sessions managed by session manager 206 can be shared between various computer systems and devices.
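The session bookkeeping attributed to session manager 206 (retaining the succession of questions posed by a user, optionally with the answers produced) can be sketched as follows. This is a minimal in-memory illustration; the class and method names are hypothetical, not taken from the disclosure.

```python
class SessionManager:
    """Sketch of per-user session tracking for a QA system."""

    def __init__(self):
        # user identifier -> ordered list of (question, answer) pairs
        self.sessions = {}

    def record(self, user, question, answer=None):
        """Retain a question (and, if available, its answer) for a session."""
        self.sessions.setdefault(user, []).append((question, answer))

    def history(self, user):
        """Return the succession of questions posed during the session."""
        return list(self.sessions.get(user, []))
```

In this sketch the retained history could also be shared between systems by serializing the `sessions` mapping, consistent with the sharing described above.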
In some embodiments, client application 208 and QA system 212 can be communicatively coupled through network 215, e.g., the Internet, intranet, or other public or private computer network. In some embodiments, QA system 212 and client application 208 can communicate by using Hypertext Transfer Protocol (HTTP) or Representational State Transfer (REST) calls. In some embodiments, QA system 212 can reside on a server node. Client application 208 can establish server-client communication with QA system 212 or vice versa. In some embodiments, the network 215 can be implemented within a cloud computing environment, or using one or more cloud computing services.
Consistent with various embodiments, QA system 212 can respond to a request for information sent by client applications 208 (e.g., question posed by a user). QA system 212 can generate a group of answers in response to the request. In some embodiments, QA system 212 can include a question analyzer 214, data sources 224, and answer generator 228. Question analyzer 214 can be a computer module that analyzes the received questions. Question analyzer 214 can perform various methods and techniques for analyzing the questions (syntactic analysis, semantic analysis, image recognition analysis, etc.). In some embodiments, question analyzer 214 can parse received questions. Question analyzer 214 can include various modules to perform analyses of received questions. For example, computer modules that question analyzer 214 can encompass include, but are not limited to, a tokenizer 216, part-of-speech (POS) tagger 218, semantic relationship identifier 220, and syntactic relationship identifier 222.
In some embodiments, tokenizer 216 can be a computer module that performs lexical analysis. Tokenizer 216 can convert a sequence of characters into a sequence of tokens. A token can be a string of characters typed by a user and categorized as a meaningful symbol. Further, in some embodiments, tokenizer 216 can identify word boundaries in an input query and break the question or text into its component parts such as words, multiword tokens, numbers, and punctuation marks. In some embodiments, tokenizer 216 can receive a string of characters, identify the lexemes in the string, and categorize them into tokens.
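The lexical analysis attributed to tokenizer 216 can be sketched with a regular-expression scanner that identifies word boundaries and categorizes each lexeme as a word, number, or punctuation mark. The pattern and category names are illustrative assumptions, not taken from the disclosure.

```python
import re

# Alternation order matters: try decimal numbers first, then words,
# then any single non-space, non-word character (punctuation).
TOKEN_PATTERN = re.compile(r"\d+(?:\.\d+)?|\w+|[^\w\s]")

def tokenize(text):
    """Convert a sequence of characters into (lexeme, category) tokens."""
    tokens = []
    for lexeme in TOKEN_PATTERN.findall(text):
        if lexeme[0].isdigit():
            category = "NUMBER"
        elif lexeme[0].isalnum() or lexeme[0] == "_":
            category = "WORD"
        else:
            category = "PUNCT"
        tokens.append((lexeme, category))
    return tokens
```

A fuller tokenizer would also handle multiword tokens, which this sketch omits for brevity.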
Consistent with various embodiments, POS tagger 218 can be a computer module that marks up a word in a text to correspond to a particular part of speech. POS tagger 218 can read a question or other text in natural language and assign a part of speech to each word or other token. POS tagger 218 can determine the part of speech to which a word corresponds based on the definition of the word and the context of the word. The context of a word can be based on its relationship with adjacent and related words in a phrase, sentence, question, or paragraph. In some embodiments, the context of a word can be dependent on one or more previously posed questions. Examples of parts of speech that can be assigned to words include, but are not limited to, nouns, verbs, adjectives, adverbs, and the like. Examples of other part of speech categories that POS tagger 218 can assign include, but are not limited to, comparative or superlative adverbs, wh-adverbs, conjunctions, determiners, negative particles, possessive markers, prepositions, wh-pronouns, and the like. In some embodiments, POS tagger 218 can tag or otherwise annotate tokens of a question with part of speech categories. In some embodiments, POS tagger 218 can tag tokens or words of a question to be parsed by QA system 212.
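The tagging behavior attributed to POS tagger 218 can be illustrated with a toy lexicon plus suffix heuristics; a production tagger would also use the context of adjacent words, as the paragraph above notes. The lexicon entries, tag names, and heuristics here are assumptions for illustration only.

```python
# Small closed-class lexicon; everything else falls to suffix heuristics.
LEXICON = {"the": "DET", "a": "DET", "is": "VERB", "have": "VERB",
           "why": "WH-ADV", "what": "WH-PRON", "not": "NEG"}

def pos_tag(tokens):
    """Assign a coarse part-of-speech tag to each token."""
    tagged = []
    for tok in tokens:
        word = tok.lower()
        if word in LEXICON:
            tag = LEXICON[word]
        elif word.endswith("ly"):
            tag = "ADV"
        elif word.endswith(("ing", "ed")):
            tag = "VERB"
        else:
            tag = "NOUN"  # default open-class guess
        tagged.append((tok, tag))
    return tagged
```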
In some embodiments, semantic relationship identifier 220 can be a computer module that can identify semantic relationships of recognized entities (e.g., words, phrases) in questions posed by users. In some embodiments, semantic relationship identifier 220 can determine functional dependencies between entities and other semantic relationships.
Consistent with various embodiments, syntactic relationship identifier 222 can be a computer module that can identify syntactic relationships in a question composed of tokens posed by users to QA system 212. Syntactic relationship identifier 222 can determine the grammatical structure of sentences, for example, which groups of words are associated as “phrases” and which word is the subject or object of a verb. Syntactic relationship identifier 222 can conform to formal grammar.
In some embodiments, question analyzer 214 can be a computer module that can parse a received user query and generate a corresponding data structure of the user query. For example, in response to receiving a question at QA system 212, question analyzer 214 can output the parsed question as a data structure. In some embodiments, the parsed question can be represented in the form of a parse tree or other graph structure. To generate the parsed question, question analyzer 214 can trigger computer modules 216-222. Additionally, in some embodiments, question analyzer 214 can use external computer systems for dedicated tasks that are part of the question parsing process.
In some embodiments, the output of question analyzer 214 can be used by QA system 212 to perform a search of a set of (i.e., one or more) corpora to retrieve information to answer a question posed by a user. As used herein, a corpus can refer to one or more data sources. In some embodiments, data sources 224 can include databases, information corpora, data models, and document repositories. In some embodiments, the data source 224 can include an information corpus 226. The information corpus 226 can enable data storage and retrieval. In some embodiments, the information corpus 226 can be a storage mechanism that houses a standardized, consistent, clean and integrated form of data. The data can be sourced from various operational systems. Data stored in the information corpus 226 can be structured in a way to specifically address reporting and analytic requirements. In some embodiments, the information corpus can be a relational database. In some example embodiments, data sources 224 can include one or more document repositories.
In some embodiments, answer generator 228 can be a computer module that generates a group of answers in response to posed questions. Examples of answers generated by answer generator 228 can include, but are not limited to, natural language sentences, reports, charts, or other analytic representation, raw data, web pages, and the like. In some embodiments, answers can be of audio type, image type, or other suitable medium type.
In some embodiments, answer generator 228 can include query processor 230, visualization processor 232, and feedback handler 234. When information in the data source 224 matching a parsed question is located, a technical query associated with the pattern can be executed by query processor 230. Based on data retrieved by a technical query executed by query processor 230, visualization processor 232 can be configured to render visualization of the retrieved answers as described herein. The rendered visualization of the answers can represent the answer to the input query. In some embodiments, visualization processor 232 can render visualization in various forms including, but not limited to, images, charts, tables, dashboards, maps, and the like.
In some embodiments, feedback handler 234 can be a computer module that processes feedback from users on answers generated by answer generator 228. In some embodiments, users can be engaged in dialog with the QA system 212 to evaluate the relevance of received answers. For example, the answer generator 228 could produce the group of answers corresponding to a question submitted by a user. The user could rank each answer according to its relevance to the question. In some embodiments, the feedback of users on generated answers can be used for future question answering sessions.
The various components of the QA system 212 described herein can be used to implement various aspects of the present disclosure. For example, the client application 208 could be used to receive an input query from a user. The question analyzer 214 could, in some embodiments, be used to analyze input queries. In embodiments, the input queries can include a question asking for explanations for a set of conditions. The answer generator 228, in embodiments, could be used to analyze the data sources 224 to determine influence factors between user data in the information corpus 226 and one or more of the set of conditions.
Referring now to FIG. 3, a block diagram of a system architecture 300 for establishing influence relationships between data in a question-answering environment can be seen, according to embodiments of the present disclosure. In embodiments, the system architecture 300 can represent an example architecture for executing embodiments of the present disclosure. For example, in some instances, the system architecture 300 could be an example representation of aspects of the answer generator 228 (FIG. 2) and/or the question analyzer 214 (FIG. 2).
In embodiments, the system architecture 300 can include a relationship analyzer 306 and an answer generator 314.
The relationship analyzer 306 can be a computer module configured to establish influence relationships between data in a QA environment. In embodiments, the relationship analyzer 306 can be configured to determine a set of conditions 301. The set of conditions 301 can be the same or substantially similar as described herein. In embodiments, the relationship analyzer can receive the set of conditions 301 as inputs. For example, in embodiments a user could enter the set of conditions manually as text. The relationship analyzer 306 could then use natural language processing techniques as described herein to parse the text to determine the set of conditions 301.
Relationship analyzer 306 can be communicatively connected to database 312. Database 312 can store various types of information including text, images, audio, video, and other suitable information. In embodiments, database 312 can include a mass quantity of various kinds of data related to various subjects. For example, in embodiments, the database could include various medical information including journals, medical texts, clinical research, doctor's notes, and other information. In embodiments, the database 312 could include information related to various additional subject matter. The database 312 can be accessed and parsed by the relationship analyzer 306 to establish relationships between data based on the stored information.
In embodiments, database 312 can be a corpus of information. In some embodiments, database 312 can substantially correspond to information corpus 226 (FIG. 2). In embodiments, database 312 can include a set of user data 313. User data can be the same or substantially similar as described herein. For example, in embodiments, the set of user data 313 includes one or more types of content including economic data, medical data, personal data, family history, and historical user data.
In embodiments, the relationship analyzer 306 can include a characteristic identifier 308. The characteristic identifier 308 can be configured to identify characteristics of the set of conditions 301 and in the set of user data 313. In embodiments, characteristics are elements, features, traits, themes, etc. that can be related to or correspond to data. For example, a condition of the common cold could have characteristics including, but not limited to, “contagious”, “sore throat”, “nasal congestion”, and “common in children”. In an additional example, user data indicating travel abroad could have characteristics including but not limited to, “exposure to people”, “stress”, and “unusual food and beverage”. Described further herein, characteristic relationships can be used to establish influence factors and influence relationships between two or more pieces of data.
In embodiments, the characteristic identifier 308 can identify characteristics in data using natural language processing techniques as described herein. For example, in embodiments, the characteristic identifier 308 can employ a natural language processor 309. The natural language processor 309 can be configured to perform various methods and techniques for natural language analysis of data in the QA environment. For example, the natural language processor 309 can be configured to perform syntactic analysis, semantic analysis, image recognition analysis, concept matching and other suitable methods and techniques.
In embodiments, characteristics can be determined by concept matching techniques. Concept matching techniques can include, but are not limited to, semantic similarity analysis, syntactic analysis, and ontological matching. For example, in embodiments, the natural language processor 309 could be configured to parse data in the QA environment to determine semantic features (e.g., repeated words and/or keywords) and/or syntactic features (e.g., location of semantic features in headings and/or title). Ontological matching could be used to map semantic and/or syntactic features to a particular concept.
For example, in some embodiments, the natural language processor 309 can be configured to parse the database 312, the set of user data 313, and the set of conditions 301. The natural language processor 309 could identify, in the data, repeated words corresponding to a particular concept. Additionally, the natural language processor 309 could identify the location of the repeated words in headings and titles, which can indicate the relative importance of the repeated words. Based on the semantic and syntactic features the natural language processor 309 could map a subset of the set of user data 313 and a subset of the set of conditions 301 to various concepts. In embodiments, the characteristic identifier 308 could be configured to select the concepts as characteristics.
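The concept-matching heuristic described above (repeated words as semantic features, with placement in headings indicating relative importance, then ontological mapping to concepts) might be sketched as follows. The scoring weights, threshold, and the small ontology map are illustrative assumptions, not taken from the disclosure.

```python
from collections import Counter

# Hypothetical ontology mapping semantic features to concepts.
ONTOLOGY = {"children": "exposure to children", "stress": "stress"}

def extract_concepts(sections, min_score=2):
    """sections: list of (heading, body) text pairs.
    Returns the set of ontology concepts supported by the data."""
    scores = Counter()
    for heading, body in sections:
        for word in heading.lower().split():
            scores[word] += 3          # heading occurrences weigh more
        for word in body.lower().split():
            scores[word] += 1
    # Map sufficiently repeated features onto ontology concepts.
    return {ONTOLOGY[w] for w, s in scores.items()
            if s >= min_score and w in ONTOLOGY}
```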
For example, in embodiments, a QA system could receive a question asking for possible causes in user data for a set of conditions including a common cold, and paresthesia (tingling sensation) in appendages. In response, characteristic identifier 308, using natural language processor 309, could parse the set of conditions 301 and the database 312 to determine characteristics of the set of conditions 301. Based on concept matching techniques, the natural language processor could identify various concepts from a corpus, such as the database 312, corresponding to the common cold. For example, the natural language processor could identify in various medical texts that exposure to young children can increase the chances of contracting the common cold. The natural language processor 309 could then select “exposure to children” as the concept. Similarly, in some examples, the natural language processor 309 could identify from medical journals or other sources that high stress levels can result in paresthesia. Thus, the natural language processor 309 could select “stress” as another concept. Thus, in embodiments, the characteristic identifier 308 could be configured to select the concept of “stress” as a characteristic of paresthesia and the concept of “exposure to children” as a characteristic of the common cold.
In embodiments, the characteristic identifier 308 could parse the set of user data and identify characteristics of the set of user data. For example, the characteristic identifier 308 could parse financial records, such as paystubs and tax information that shows that the user works at an elementary school and has been putting in overtime. As described herein, the natural language processor 309 could identify “exposure to children” and “stress” as concepts from analysis of this information. Thus, the characteristic identifier 308 could select the concepts of “stress” and “exposure to children” as characteristics of a subset of the user data.
The influence factor identifier 310 can be configured to identify influence factors between data in the QA environment. The influence factor identifier 310 can be configured to identify influence factors based on comparing characteristics identified by the characteristic identifier 308. In embodiments, comparisons can be made between data having common (e.g., shared) characteristics and different (e.g., non-shared) characteristics. Based on the comparisons of these characteristic relationships, the influence factor identifier can identify influence factors between data. In embodiments, if characteristics are the same or substantially similar then the influence factor identifier can identify the characteristics as one or more influence factors.
In embodiments, the influence factor identifier 310 can use natural language processor 309 to compare characteristics. In embodiments, natural language processor 309 can use various techniques such as syntactic analysis, semantic analysis, image recognition analysis, concept matching and other suitable methods and techniques as described herein. In embodiments, natural language processor can determine whether characteristics are the same or substantially similar. In embodiments, characteristics are substantially similar if they are identical. In some embodiments, characteristics are substantially similar if they are related. For example, in embodiments, related characteristics could be a first characteristic describing a genus and a second characteristic describing a species of that genus. For example, if a characteristic of the common cold was “exposure to children” and a characteristic of user data was “exposure to people”, the concepts could be considered substantially similar as “exposure to people” includes the characteristic of “exposure to children”.
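The "substantially similar" test, including the genus/species example above, can be sketched with a hypothetical genus map standing in for the ontological matching described herein:

```python
# Hypothetical species -> genus map; "exposure to children" is a
# species of the broader genus "exposure to people".
GENUS_OF = {"exposure to children": "exposure to people"}

def substantially_similar(a, b):
    """Characteristics match if identical or related by genus/species."""
    if a == b:
        return True
    # Check the genus/species relation in either direction.
    return GENUS_OF.get(a) == b or GENUS_OF.get(b) == a
```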
In some embodiments, the characteristics can be weighted. Similarly, the influence factors identified from the characteristics can be weighted based on the weights of the characteristics. For example, in embodiments, an influence factor identified from a highly weighted characteristic will be a highly weighted influence factor. In embodiments, the characteristics can be weighted based on the source of the characteristic, such as the type of user data. For example, in embodiments, if a characteristic was parsed from financial data it could have an assigned weight based on the characteristic having been located in financial data. Additionally, characteristics from financial data could have higher weights than characteristics from other types of data, such as social networking data. In some embodiments, characteristics can be weighted based on the format of the user data. For example, characteristics parsed from textual data could be weighted higher than characteristics parsed from audio data. Further, in some embodiments, the influence factor could be weighted based on the NLP analysis that detected the characteristic. For example, in embodiments, NLP could detect urgency, which could give the characteristic a higher weight. For example, a high-urgency characteristic could be parsed from social network data that says "wow! I really feel sick after eating at that restaurant!"
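One possible weighting scheme consistent with the paragraph above is sketched below; the numeric weights, source-type names, and urgency multiplier are illustrative assumptions (the disclosure specifies only that such weights exist).

```python
# Weight by source type: e.g., financial data weighted higher than
# social networking data, and textual data higher than audio data.
SOURCE_WEIGHTS = {"financial": 1.0, "medical": 1.0,
                  "social": 0.5, "audio": 0.3}

def characteristic_weight(source_type, urgent=False):
    """Return a weight for a characteristic based on its source and
    an NLP-detected urgency flag."""
    weight = SOURCE_WEIGHTS.get(source_type, 0.5)
    if urgent:
        weight *= 1.5   # detected urgency boosts the weight
    return weight
```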
In some embodiments, influence factor identifier 310 can be configured to group influence factors based on a data pair to which each influence factor belongs. For example, in a situation having a set of conditions including conditions A, B, and C, and a set of user data including user data D and E, there can be, in some embodiments, as many as six different data pairs (A-D, A-E, B-D, B-E, C-D, and C-E) and, therefore, as many as six different sets of influence factors.
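The grouping of influence factors by data pair can be sketched as a cross product of conditions and user data, with the shared characteristics of each pair forming its set of influence factors. The dictionary-of-sets representation is an assumption for illustration.

```python
from itertools import product

def group_influence_factors(condition_chars, user_data_chars):
    """Both arguments map an item name to its set of characteristics.
    Returns {(condition, user_datum): shared characteristics}, one
    entry per data pair."""
    groups = {}
    for cond, datum in product(condition_chars, user_data_chars):
        # Shared characteristics become the pair's influence factors.
        groups[(cond, datum)] = condition_chars[cond] & user_data_chars[datum]
    return groups
```

With conditions A, B, C and user data D, E this yields the six pairs enumerated above, some of which may have empty influence-factor sets.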
In embodiments, the relationship analyzer 306 can be configured to establish influence relationships using a set of influence factors. Each influence relationship can represent a composite of a particular set of influence factors. In some embodiments, influence relationships can be measures or indicators as to how the data of a data pair are likely to interact or influence each other. Further, in some embodiments, for data pairs having no influence factors, there can be deemed to be no influence relationship between the data forming the pair, or there can be deemed to be a null or neutral influence relationship. For instance, if there are no influence factors corresponding to the A-D pair, then the relationship between condition A and user data D could be deemed a neutral influence relationship.
In embodiments, the relationship analyzer 306 can be configured to evaluate influence relationships. In embodiments, the relationship analyzer 306 can be configured to evaluate the influence relationships by calculating a relationship score that indicates the relative strength of the influence relationship. In some embodiments, the calculated relationship score can be based on the number of influence factors that make up the influence relationship. For example, in embodiments, the relationship analyzer 306 can be configured to determine the number of influence factors that make up the influence relationship. In some embodiments, the greater the number of influence factors that make up the influence relationship, the stronger the influence relationship. Similarly, the fewer the number of influence factors, the weaker the influence relationship. In some embodiments, the strength of the influence relationship can be inversely proportional to the number of influence factors in the influence relationship. In some embodiments, the relationship score can be based on the weight of the influence factors, as described herein. For example, in embodiments, the higher the weight of the influence factors in the influence relationship, the greater the relationship score. In some embodiments, the relationship score can be inversely proportional to the weight of the influence factors in the influence relationship.
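One of the scoring variants described above (the score grows with both the number and the weights of the influence factors, and an empty set yields a neutral relationship) might be sketched as:

```python
def relationship_score(influence_factors):
    """influence_factors: dict mapping factor name -> weight.
    Summing the weights makes the score increase with both the count
    and the individual weights of the factors."""
    if not influence_factors:
        return 0.0          # no factors: null/neutral relationship
    return sum(influence_factors.values())
```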
The answer generator 314 can be configured to generate answers based on influence relationships. For example, in response to a question about how two sets of data influence each other, the answer generator could generate one or more explanations detailing the influence relationships established by the relationship analyzer 306. For example, the answer generator could present text describing that an influence relationship between working as a schoolteacher and the common cold was obtained. In embodiments, the answer generator could include evidence used to arrive at the influence relationship. For example, the answer generator could present elements in the database 312 used to establish the influence relationship.
For example, in embodiments, the answer generator 314 can be configured to generate text based explanations of the influence relationship. In some embodiments, explanations can be generated in various formats including images, text, audio, video, tables, charts, and in other suitable formats. In embodiments, answer generator 314 can be configured to use natural language processing techniques as described herein, to generate the explanations.
Referring now to FIG. 4, an example diagram 400 of data relationships between data in a QA environment can be seen, according to embodiments of the present disclosure. As seen in FIG. 4, example diagram 400 includes two pieces of data: condition A 401A and user data B 401B. In embodiments, various amounts of data can be compared for data relationships. As described herein, diagram 400 could be a representation of a QA system's response to a question of how types of data influence one another (such as condition A 401A and user data B 401B).
As seen in FIG. 4, an influence relationship A/B 404 exists corresponding to condition A 401A and user data B 401B. As described herein, influence relationship A/B 404 can be a composite of one or more influence factors. As seen in FIG. 4, there are two influence factors corresponding to condition A 401A and user data B 401B (characteristic b-based influence factor 402A and characteristic d-based influence factor 402B).
As described herein, influence factors can be based on characteristics (a, b, c, and d). Characteristics can be associated with the data as described herein. For example, condition A 401A is associated with characteristics a 400A, b 400B, and d 400D. User data B 401B is associated with characteristics b 400B, c 400C, and d 400D. Two characteristics (b and d) are common characteristics, which is indicated by lines from condition A 401A and user data B 401B to characteristics b 400B and d 400D. By comparing these characteristics as described herein, two characteristic-based influence factors can be identified, namely, characteristic b-based influence factor 402A corresponding to both condition A 401A and user data B 401B and characteristic d-based influence factor 402B corresponding to both condition A 401A and user data B 401B.
Further, as shown, influence relationship A/B 404 can be generated based on the set of influence factors. Specifically, the set of influence factors (including the characteristic b-based influence factor and the characteristic d-based influence factor) can be used to generate influence relationship A/B 404 between condition A 401A and user data B 401B.
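The FIG. 4 example reduces to a set intersection: the characteristics shared by condition A and user data B yield the diagram's two characteristic-based influence factors.

```python
# Characteristics from FIG. 4: condition A has {a, b, d},
# user data B has {b, c, d}.
condition_a = {"a", "b", "d"}
user_data_b = {"b", "c", "d"}

# Common characteristics become the influence factors of the A/B pair.
influence_factors = condition_a & user_data_b
```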
Referring now to FIG. 5, a flowchart diagram of a method 500 of establishing influence relationships between data in a question-answering environment can be seen, according to embodiments of the present disclosure.
In operation 502, a set of conditions can be determined that indicate a set of user statuses. The set of conditions can be the same or substantially similar as described herein. In embodiments, the set of conditions can be various statuses for a user. For example, in the medical field, the set of conditions could include various illnesses and/or symptoms. In operation 504, a corpus can be analyzed that includes a set of user data. The corpus can be the same or substantially similar as described herein. In embodiments, the corpus includes mass quantities of information on various subject matter. The set of user data can be the same or substantially similar as described herein. In embodiments, the set of user data can include various types of electronic information accessible by a QA system for analysis.
In operation 506, characteristics can be identified that correspond to a subset of user data and to a subset of the conditions to identify influence factors. Characteristics can be the same or substantially similar as described herein. In embodiments, characteristics can be identified using natural language processing techniques. For example, in embodiments, concept matching techniques, as described herein, can be used to identify characteristics.
In operation 508, a set of influence factors can be identified based on a comparison of characteristics. Influence factors can be the same or substantially similar as described herein. In embodiments, the set of influence factors can be identified by determining that characteristics corresponding to the subset of user data and the subset of conditions are substantially similar.
In operation 510, a set of influence relationships can be established based on the set of influence factors. Influence relationships can be the same or substantially similar as described herein. In embodiments, influence relationships can be composites of groups of influence factors. In operation 512, the method 500 can include evaluating the influence relationships based on the influence factors. In embodiments, the influence relationships can be evaluated by calculating a relationship score that indicates the relative strength of the influence relationship. In some embodiments, the calculated relationship score can be based on the number of influence factors that make up the influence relationship. For example, in embodiments, the relationship analyzer 306 can be configured to determine the number of influence factors that make up the influence relationship. In some embodiments, the greater the number of influence factors that make up the influence relationship, the stronger the influence relationship. Similarly, the fewer the number of influence factors, the weaker the influence relationship. In some embodiments, the relationship score can be based on the weight of the influence factors, as described herein. For example, in embodiments, the higher the weight of the influence factors in the influence relationship, the greater the relationship score.
In operation 514, the method can include generating a set of explanations for the set of conditions using relevant influence relationships. In embodiments, the set of explanations can be text based descriptions of the influence relationships established by embodiments of the present disclosure. For example, an explanation could include text describing that an influence relationship between working as a schoolteacher and the common cold was obtained.
The present invention may be a system, a method, and/or a computer program product. The computer program product may include a computer readable storage medium (or media) having computer readable program instructions thereon for causing a processor to carry out aspects of the present invention.
The computer readable storage medium can be a tangible device that can retain and store instructions for use by an instruction execution device. The computer readable storage medium may be, for example, but is not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. A non-exhaustive list of more specific examples of the computer readable storage medium includes the following: a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), a static random access memory (SRAM), a portable compact disc read-only memory (CD-ROM), a digital versatile disk (DVD), a memory stick, a floppy disk, a mechanically encoded device such as punch-cards or raised structures in a groove having instructions recorded thereon, and any suitable combination of the foregoing. A computer readable storage medium, as used herein, is not to be construed as being transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission media (e.g., light pulses passing through a fiber-optic cable), or electrical signals transmitted through a wire.
Computer readable program instructions described herein can be downloaded to respective computing/processing devices from a computer readable storage medium or to an external computer or external storage device via a network, for example, the Internet, a local area network, a wide area network and/or a wireless network. The network may comprise copper transmission cables, optical transmission fibers, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers. A network adapter card or network interface in each computing/processing device receives computer readable program instructions from the network and forwards the computer readable program instructions for storage in a computer readable storage medium within the respective computing/processing device.
Computer readable program instructions for carrying out operations of the present invention may be assembler instructions, instruction-set-architecture (ISA) instructions, machine instructions, machine dependent instructions, microcode, firmware instructions, state-setting data, or either source code or object code written in any combination of one or more programming languages, including an object oriented programming language such as Smalltalk, C++ or the like, and conventional procedural programming languages, such as the “C” programming language or similar programming languages. The computer readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider). In some embodiments, electronic circuitry including, for example, programmable logic circuitry, field-programmable gate arrays (FPGA), or programmable logic arrays (PLA) may execute the computer readable program instructions by utilizing state information of the computer readable program instructions to personalize the electronic circuitry, in order to perform aspects of the present invention.
Aspects of the present invention are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer readable program instructions.
These computer readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer readable program instructions may also be stored in a computer readable storage medium that can direct a computer, a programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer readable storage medium having instructions stored therein comprises an article of manufacture including instructions which implement aspects of the function/act specified in the flowchart and/or block diagram block or blocks.
The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other device to cause a series of operational steps to be performed on the computer, other programmable apparatus or other device to produce a computer implemented process, such that the instructions which execute on the computer, other programmable apparatus, or other device implement the functions/acts specified in the flowchart and/or block diagram block or blocks.
The flowchart and block diagrams in the Figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts or carry out combinations of special purpose hardware and computer instructions.
The descriptions of the various embodiments of the present disclosure have been presented for purposes of illustration, but are not intended to be exhaustive or limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein was chosen to best explain the principles of the embodiments, the practical application or technical improvement over technologies found in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

Claims (4)

What is claimed is:
1. A computer-implemented method of establishing influence relationships between data in a question-answering environment, the method comprising:
receiving an electronic text document from a user over a network;
parsing, using a natural language processor, the electronic text document to determine a set of conditions indicating a set of user statuses, wherein parsing, using the natural language processor, the electronic text document to determine the set of conditions includes:
converting sequences of characters within the electronic text document into tokens;
determining a set of repeated words in the electronic text document;
determining a location of each of the repeated words in the set of repeated words; and
selecting, based on the location of each of the repeated words of the set of repeated words, a subset of repeated words, wherein the set of conditions is selected based on the subset of repeated words;
accessing, over the network, a corpus of data including a set of user data;
analyzing, using the natural language processor, the corpus of data including the set of user data;
identifying, based on analyzing the corpus of data, a set of influence factors corresponding to a subset of the set of user data and to a subset of the set of conditions, wherein identifying the set of influence factors corresponding to the subset of the set of user data and to the subset of the set of conditions includes:
identifying, by the natural language processor, a first characteristic of a first data type of the subset of the set of user data, the first data type in a first format;
assigning a first weight to the first characteristic based on the first data type and the first format;
identifying, by the natural language processor, a second characteristic of a second data type of the subset of the set of user data, the second data type in a second format;
assigning a second weight to the second characteristic based on the second data type and the second format;
identifying a first influence factor of the set of influence factors using the first characteristic and the second characteristic; and
assigning a third weight to the first influence factor based on the first weight of the first characteristic and the second weight of the second characteristic;
establishing, based on the set of influence factors, a set of influence relationships between the subset of the set of user data and the subset of the set of conditions, wherein a first influence relationship is established using the first influence factor; and
generating an explanation for the subset of conditions using at least the first influence relationship based on the third weight of the first influence factor.
2. A system for establishing influence relationships between data in a question-answering environment, the system comprising:
a processor; and
a computer readable storage medium having program instructions embodied therewith, the program instructions executable by the processor to cause the system to:
receive an electronic text document from a user over a network;
parse, using a natural language processor, the electronic text document to determine a set of conditions indicating a set of user statuses, wherein parsing, using the natural language processor, the electronic text document to determine the set of conditions includes:
converting sequences of characters within the electronic text document into tokens;
determining a set of repeated words in the electronic text document;
determining a location of each of the repeated words in the set of repeated words; and
selecting, based on the location of each of the repeated words of the set of repeated words, a subset of repeated words, wherein the set of conditions is selected based on the subset of repeated words;
access, over the network, a corpus of data including a set of user data;
analyze, using the natural language processor, the corpus of data including the set of user data;
identify, based on analyzing the corpus of data, a set of influence factors corresponding to a subset of the set of user data and to a subset of the set of conditions, wherein identifying the set of influence factors corresponding to the subset of the set of user data and to the subset of the set of conditions includes:
identifying, by the natural language processor, a first characteristic of a first data type of the subset of the set of user data, the first data type in a first format;
assigning a first weight to the first characteristic based on the first data type and the first format;
identifying, by the natural language processor, a second characteristic of a second data type of the subset of the set of user data, the second data type in a second format;
assigning a second weight to the second characteristic based on the second data type and the second format;
identifying a first influence factor of the set of influence factors using the first characteristic and the second characteristic; and
assigning a third weight to the first influence factor based on the first weight of the first characteristic and the second weight of the second characteristic;
establish, based on the set of influence factors, a set of influence relationships between the subset of the set of user data and the subset of the set of conditions, wherein a first influence relationship is established using the first influence factor; and
generate an explanation for the subset of conditions using at least the first influence relationship based on the third weight of the first influence factor.
3. A computer program product for establishing influence relationships between data in a question-answering environment, the computer program product comprising a computer readable storage medium having program instructions embodied therewith, the program instructions executable by a computer to cause the computer to perform a method comprising:
receiving an electronic text document from a user over a network;
parsing, using a natural language processor, the electronic text document to determine a set of conditions indicating a set of user statuses, wherein parsing, using the natural language processor, the electronic text document to determine the set of conditions includes:
converting sequences of characters within the electronic text document into tokens;
determining a set of repeated words in the electronic text document;
determining a location of each of the repeated words in the set of repeated words; and
selecting, based on the location of each of the repeated words of the set of repeated words, a subset of repeated words, wherein the set of conditions is selected based on the subset of repeated words;
accessing, over the network, a corpus of data including a set of user data;
analyzing, using the natural language processor, the corpus of data including the set of user data;
identifying, based on analyzing the corpus of data, a set of influence factors corresponding to a subset of the set of user data and to a subset of the set of conditions, wherein identifying the set of influence factors corresponding to the subset of the set of user data and to the subset of the set of conditions includes:
identifying, by the natural language processor, a first characteristic of a first data type of the subset of the set of user data, the first data type in a first format;
assigning a first weight to the first characteristic based on the first data type and the first format;
identifying, by the natural language processor, a second characteristic of a second data type of the subset of the set of user data, the second data type in a second format;
assigning a second weight to the second characteristic based on the second data type and the second format;
identifying a first influence factor of the set of influence factors using the first characteristic and the second characteristic; and
assigning a third weight to the first influence factor based on the first weight of the first characteristic and the second weight of the second characteristic;
establishing, based on the set of influence factors, a set of influence relationships between the subset of the set of user data and the subset of the set of conditions, wherein a first influence relationship is established using the first influence factor; and
generating an explanation for the subset of conditions using at least the first influence relationship based on the third weight of the first influence factor.
4. The computer program product of claim 3, wherein:
the set of conditions is a set of medical conditions indicating a set of medical related user statuses.
US16/275,798 2014-12-10 2019-02-14 Data relationships in a question-answering environment Active 2035-09-16 US11238231B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US16/275,798 US11238231B2 (en) 2014-12-10 2019-02-14 Data relationships in a question-answering environment

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US14/566,111 US10289679B2 (en) 2014-12-10 2014-12-10 Data relationships in a question-answering environment
US16/275,798 US11238231B2 (en) 2014-12-10 2019-02-14 Data relationships in a question-answering environment

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US14/566,111 Continuation US10289679B2 (en) 2014-12-10 2014-12-10 Data relationships in a question-answering environment

Publications (2)

Publication Number Publication Date
US20190179904A1 US20190179904A1 (en) 2019-06-13
US11238231B2 true US11238231B2 (en) 2022-02-01

Family

ID=56111326

Family Applications (3)

Application Number Title Priority Date Filing Date
US14/566,111 Expired - Fee Related US10289679B2 (en) 2014-12-10 2014-12-10 Data relationships in a question-answering environment
US14/663,503 Abandoned US20160170962A1 (en) 2014-12-10 2015-03-20 Data relationships in a question-answering environment
US16/275,798 Active 2035-09-16 US11238231B2 (en) 2014-12-10 2019-02-14 Data relationships in a question-answering environment

Family Applications Before (2)

Application Number Title Priority Date Filing Date
US14/566,111 Expired - Fee Related US10289679B2 (en) 2014-12-10 2014-12-10 Data relationships in a question-answering environment
US14/663,503 Abandoned US20160170962A1 (en) 2014-12-10 2015-03-20 Data relationships in a question-answering environment

Country Status (1)

Country Link
US (3) US10289679B2 (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10289679B2 (en) 2014-12-10 2019-05-14 International Business Machines Corporation Data relationships in a question-answering environment
US10467291B2 (en) * 2016-05-02 2019-11-05 Oath Inc. Method and system for providing query suggestions
US9996526B2 (en) 2016-10-19 2018-06-12 International Business Machines Corporation System and method for supplementing a question answering system with mixed-language source documents
US9996525B2 (en) 2016-10-19 2018-06-12 International Business Machines Corporation System and method for supplementing a question answering system with mixed-language source documents
CN106844344B (en) * 2017-02-06 2020-06-05 厦门快商通科技股份有限公司 Contribution calculation method for conversation and theme extraction method and system
CN111201523A (en) * 2017-10-05 2020-05-26 链睿有限公司 Search term extraction and optimization in natural language text files
US11526552B2 (en) * 2020-08-18 2022-12-13 Lyqness Inc. Systems and methods of optimizing the use of user questions to identify similarities among a large network of users

Citations (57)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6482156B2 (en) * 1996-07-12 2002-11-19 First Opinion Corporation Computerized medical diagnostic and treatment advice system including network access
WO2003060750A2 (en) 2002-01-10 2003-07-24 Siemens Medical Solutions Health Services Corporation A system for supporting clinical decision-making
US6641532B2 (en) * 1993-12-29 2003-11-04 First Opinion Corporation Computerized medical diagnostic system utilizing list-based processing
US20030233250A1 (en) 2002-02-19 2003-12-18 David Joffe Systems and methods for managing biological data and providing data interpretation tools
US20040049505A1 (en) * 2002-09-11 2004-03-11 Kelly Pennock Textual on-line analytical processing method and system
US20040260577A1 (en) * 1999-11-15 2004-12-23 Recare, Inc. Electronic healthcare information and delivery management system with an integrated medical search architecture and capability
US20050065813A1 (en) * 2003-03-11 2005-03-24 Mishelevich David J. Online medical evaluation system
US20050108001A1 (en) 2001-11-15 2005-05-19 Aarskog Brit H. Method and apparatus for textual exploration discovery
US20050177403A1 (en) 2004-02-06 2005-08-11 Johnson Thomas D. System and method for measuring and controlling the quality of medical consulting reports
US20060253300A1 (en) * 2005-05-03 2006-11-09 Somberg Benjamin L System and method for managing patient triage in an automated patient management system
WO2007024617A2 (en) 2005-08-25 2007-03-01 Siemens Medical Solutions Usa, Inc. Medical ontologies for computer assisted clinical decision support
US20070061393A1 (en) * 2005-02-01 2007-03-15 Moore James F Management of health care data
WO2007067078A1 (en) 2005-12-07 2007-06-14 Synergenz Bioscience Limited Methods of analysis of polymorphisms and uses thereof
US20070197882A1 (en) * 2006-02-17 2007-08-23 Medred, Llc Integrated method and system for diagnosis determination
US20080146334A1 (en) 2006-12-19 2008-06-19 Accenture Global Services Gmbh Multi-Player Role-Playing Lifestyle-Rewarded Health Game
US20090089086A1 (en) * 2007-10-01 2009-04-02 American Well Systems Enhancing remote engagements
US20090112623A1 (en) * 2007-10-22 2009-04-30 American Well Systems Connecting Consumers with Service Providers
US20090119095A1 (en) 2007-11-05 2009-05-07 Enhanced Medical Decisions. Inc. Machine Learning Systems and Methods for Improved Natural Language Processing
WO2010042947A2 (en) 2008-10-10 2010-04-15 Cardiovascular Decision Technologies, Inc. Automated management of medical data using expert knowledge and applied complexity science for risk assessment and diagnoses
US7725328B1 (en) 1996-10-30 2010-05-25 American Board Of Family Practice, Inc. Computer architecture and process of patient generation evolution, and simulation for computer based testing system
US20100222649A1 (en) * 2009-03-02 2010-09-02 American Well Systems Remote medical servicing
US20100324927A1 (en) 2009-06-17 2010-12-23 Tinsley Eric C Senior care navigation systems and methods for using the same
US20100332511A1 (en) 2009-06-26 2010-12-30 Entanglement Technologies, Llc System and Methods for Units-Based Numeric Information Retrieval
US20110196704A1 (en) 2010-02-11 2011-08-11 Eclipsys Corporation Intelligent tokens for automated health care information systems
US8050938B1 (en) * 2002-04-19 2011-11-01 Greenway Medical Technologies, Inc. Integrated medical software system with enhanced portability
US20120096391A1 (en) * 2010-10-18 2012-04-19 Smith William K Knowledge base data generation and management to support automated e-health diagnosis systems
US20120102405A1 (en) * 2010-10-25 2012-04-26 Evidence-Based Solutions, Inc. System and method for matching person-specific data with evidence resulting in recommended actions
US20120197876A1 (en) 2011-02-01 2012-08-02 Microsoft Corporation Automatic generation of an executive summary for a medical event in an electronic medical record
US20120232743A1 (en) * 2011-03-10 2012-09-13 GM Global Technology Operations LLC Developing fault model from service procedures
US20130085781A1 (en) * 2011-09-29 2013-04-04 Eclinicalworks, Llc Systems and methods for generating and updating electronic medical records
US20130204876A1 (en) * 2011-09-07 2013-08-08 Venio Inc. System, Method and Computer Program Product for Automatic Topic Identification Using a Hypertext Corpus
US20130210866A1 (en) * 2012-02-13 2013-08-15 Acorda Therapeutics, Inc. Methods for treating an impairment in gait and/or balance in patients with multiple sclerosis using an aminopyridine
US20130218884A1 (en) 2012-02-21 2013-08-22 Salesforce.Com, Inc. Method and system for providing a review from a customer relationship management system
US20130226601A1 (en) * 2011-08-24 2013-08-29 Acupera, Inc. Remote clinical care system
US20130325508A1 (en) * 2012-05-30 2013-12-05 Covidien Lp Systems and methods for providing transparent medical treatment
US20140081659A1 (en) * 2012-09-17 2014-03-20 Depuy Orthopaedics, Inc. Systems and methods for surgical and interventional planning, support, post-operative follow-up, and functional recovery tracking
US20140122185A1 (en) 2012-10-31 2014-05-01 Tata Consultancy Services Limited Systems and methods for engagement analytics for a business
US20140136216A1 (en) * 2012-11-12 2014-05-15 Hartford Fire Insurance Company System and method for processing data related to case management for injured individuals
US20140156308A1 (en) 2012-11-30 2014-06-05 Dacadoo Ag Automated Health Data Acquisition, Processing and Communication System
US20140195168A1 (en) * 2013-01-06 2014-07-10 Yahya Shaikh Constructing a differential diagnosis and disease ranking in a list of differential diagnosis
US20140229293A1 (en) 2013-02-13 2014-08-14 Sandra Liu Huang Techniques for facilitating the promotion of organic content
US20140257852A1 (en) 2013-03-05 2014-09-11 Clinton Colin Graham Walker Automated interactive health care application for patient care
US20140271721A1 (en) * 2013-03-14 2014-09-18 Allergen Research Corporation Peanut formulations and uses thereof
US8856156B1 (en) * 2011-10-07 2014-10-07 Cerner Innovation, Inc. Ontology mapper
US20150066539A1 (en) * 2013-09-05 2015-03-05 A-Life Medical, Llc Automated clinical indicator recognition with natural language processing
US9026529B1 (en) 2010-04-22 2015-05-05 NetBase Solutions, Inc. Method and apparatus for determining search result demographics
US20150161331A1 (en) * 2013-12-04 2015-06-11 Mark Oleynik Computational medical treatment plan method and system with mass medical analysis
US20150193583A1 (en) 2014-01-06 2015-07-09 Cerner Innovation, Inc. Decision Support From Disparate Clinical Sources
US9092802B1 (en) 2011-08-15 2015-07-28 Ramakrishna Akella Statistical machine learning and business process models systems and methods
US20160098387A1 (en) 2014-10-06 2016-04-07 International Business Machines Corporation Natural Language Processing Utilizing Propagation of Knowledge through Logical Parse Tree Structures
US20160117485A1 (en) 2014-10-27 2016-04-28 International Business Machines Corporation Criteria Conditional Override Based on Patient Information and Supporting Evidence
US20160171088A1 (en) 2014-12-10 2016-06-16 International Business Machines Corporation Data relationships in a question-answering environment
US20160188648A1 (en) 2013-07-17 2016-06-30 Douglass Malcolm A residential management system
US20160216274A1 (en) * 2013-09-05 2016-07-28 University Health Network Biomarkers for early determination of a critical or life threatening response to illness and/or treatment response
US20160275253A1 (en) * 2014-02-12 2016-09-22 Akiyoshi Shimura Disease detecting system and disease detecting method
US9536051B1 (en) * 2012-07-25 2017-01-03 Azad Alamgir Kabir High probability differential diagnoses generator
US9715542B2 (en) 2005-08-03 2017-07-25 Search Engine Technologies, Llc Systems for and methods of finding relevant documents by analyzing tags

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9442149B2 (en) * 2013-07-08 2016-09-13 Stmicroelectronics S.R.L. Measuring leakage currents and measuring circuit for carrying out such measuring

Patent Citations (58)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6641532B2 (en) * 1993-12-29 2003-11-04 First Opinion Corporation Computerized medical diagnostic system utilizing list-based processing
US6482156B2 (en) * 1996-07-12 2002-11-19 First Opinion Corporation Computerized medical diagnostic and treatment advice system including network access
US7725328B1 (en) 1996-10-30 2010-05-25 American Board Of Family Practice, Inc. Computer architecture and process of patient generation evolution, and simulation for computer based testing system
US20040260577A1 (en) * 1999-11-15 2004-12-23 Recare, Inc. Electronic healthcare information and delivery management system with an integrated medical search architecture and capability
US20050108001A1 (en) 2001-11-15 2005-05-19 Aarskog Brit H. Method and apparatus for textual exploration discovery
WO2003060750A2 (en) 2002-01-10 2003-07-24 Siemens Medical Solutions Health Services Corporation A system for supporting clinical decision-making
US20030233250A1 (en) 2002-02-19 2003-12-18 David Joffe Systems and methods for managing biological data and providing data interpretation tools
US8050938B1 (en) * 2002-04-19 2011-11-01 Greenway Medical Technologies, Inc. Integrated medical software system with enhanced portability
US20040049505A1 (en) * 2002-09-11 2004-03-11 Kelly Pennock Textual on-line analytical processing method and system
US20050065813A1 (en) * 2003-03-11 2005-03-24 Mishelevich David J. Online medical evaluation system
US20050177403A1 (en) 2004-02-06 2005-08-11 Johnson Thomas D. System and method for measuring and controlling the quality of medical consulting reports
US20070061393A1 (en) * 2005-02-01 2007-03-15 Moore James F Management of health care data
US20060253300A1 (en) * 2005-05-03 2006-11-09 Somberg Benjamin L System and method for managing patient triage in an automated patient management system
US9715542B2 (en) 2005-08-03 2017-07-25 Search Engine Technologies, Llc Systems for and methods of finding relevant documents by analyzing tags
WO2007024617A2 (en) 2005-08-25 2007-03-01 Siemens Medical Solutions Usa, Inc. Medical ontologies for computer assisted clinical decision support
WO2007067078A1 (en) 2005-12-07 2007-06-14 Synergenz Bioscience Limited Methods of analysis of polymorphisms and uses thereof
US20070197882A1 (en) * 2006-02-17 2007-08-23 Medred, Llc Integrated method and system for diagnosis determination
US20080146334A1 (en) 2006-12-19 2008-06-19 Accenture Global Services Gmbh Multi-Player Role-Playing Lifestyle-Rewarded Health Game
US20090089086A1 (en) * 2007-10-01 2009-04-02 American Well Systems Enhancing remote engagements
US20090112623A1 (en) * 2007-10-22 2009-04-30 American Well Systems Connecting Consumers with Service Providers
US20090119095A1 (en) 2007-11-05 2009-05-07 Enhanced Medical Decisions. Inc. Machine Learning Systems and Methods for Improved Natural Language Processing
WO2010042947A2 (en) 2008-10-10 2010-04-15 Cardiovascular Decision Technologies, Inc. Automated management of medical data using expert knowledge and applied complexity science for risk assessment and diagnoses
US20100222649A1 (en) * 2009-03-02 2010-09-02 American Well Systems Remote medical servicing
US20100324927A1 (en) 2009-06-17 2010-12-23 Tinsley Eric C Senior care navigation systems and methods for using the same
US20100332511A1 (en) 2009-06-26 2010-12-30 Entanglement Technologies, Llc System and Methods for Units-Based Numeric Information Retrieval
US20110196704A1 (en) 2010-02-11 2011-08-11 Eclipsys Corporation Intelligent tokens for automated health care information systems
US9026529B1 (en) 2010-04-22 2015-05-05 NetBase Solutions, Inc. Method and apparatus for determining search result demographics
US20120096391A1 (en) * 2010-10-18 2012-04-19 Smith William K Knowledge base data generation and management to support automated e-health diagnosis systems
US20120102405A1 (en) * 2010-10-25 2012-04-26 Evidence-Based Solutions, Inc. System and method for matching person-specific data with evidence resulting in recommended actions
US20120197876A1 (en) 2011-02-01 2012-08-02 Microsoft Corporation Automatic generation of an executive summary for a medical event in an electronic medical record
US20120232743A1 (en) * 2011-03-10 2012-09-13 GM Global Technology Operations LLC Developing fault model from service procedures
US9092802B1 (en) 2011-08-15 2015-07-28 Ramakrishna Akella Statistical machine learning and business process models systems and methods
US20130226601A1 (en) * 2011-08-24 2013-08-29 Acupera, Inc. Remote clinical care system
US20130204876A1 (en) * 2011-09-07 2013-08-08 Venio Inc. System, Method and Computer Program Product for Automatic Topic Identification Using a Hypertext Corpus
US20130085781A1 (en) * 2011-09-29 2013-04-04 Eclinicalworks, Llc Systems and methods for generating and updating electronic medical records
US8856156B1 (en) * 2011-10-07 2014-10-07 Cerner Innovation, Inc. Ontology mapper
US20130210866A1 (en) * 2012-02-13 2013-08-15 Acorda Therapeutics, Inc. Methods for treating an impairment in gait and/or balance in patients with multiple sclerosis using an aminopyridine
US20130218884A1 (en) 2012-02-21 2013-08-22 Salesforce.Com, Inc. Method and system for providing a review from a customer relationship management system
US20130325508A1 (en) * 2012-05-30 2013-12-05 Covidien Lp Systems and methods for providing transparent medical treatment
US9536051B1 (en) * 2012-07-25 2017-01-03 Azad Alamgir Kabir High probability differential diagnoses generator
US20140081659A1 (en) * 2012-09-17 2014-03-20 Depuy Orthopaedics, Inc. Systems and methods for surgical and interventional planning, support, post-operative follow-up, and functional recovery tracking
US20140122185A1 (en) 2012-10-31 2014-05-01 Tata Consultancy Services Limited Systems and methods for engagement analytics for a business
US20140136216A1 (en) * 2012-11-12 2014-05-15 Hartford Fire Insurance Company System and method for processing data related to case management for injured individuals
US20140156308A1 (en) 2012-11-30 2014-06-05 Dacadoo Ag Automated Health Data Acquisition, Processing and Communication System
US20140195168A1 (en) * 2013-01-06 2014-07-10 Yahya Shaikh Constructing a differential diagnosis and disease ranking in a list of differential diagnosis
US20140229293A1 (en) 2013-02-13 2014-08-14 Sandra Liu Huang Techniques for facilitating the promotion of organic content
US20140257852A1 (en) 2013-03-05 2014-09-11 Clinton Colin Graham Walker Automated interactive health care application for patient care
US20140271721A1 (en) * 2013-03-14 2014-09-18 Allergen Research Corporation Peanut formulations and uses thereof
US20160188648A1 (en) 2013-07-17 2016-06-30 Douglass Malcolm A residential management system
US20160216274A1 (en) * 2013-09-05 2016-07-28 University Health Network Biomarkers for early determination of a critical or life threatening response to illness and/or treatment response
US20150066539A1 (en) * 2013-09-05 2015-03-05 A-Life Medical, Llc Automated clinical indicator recognition with natural language processing
US20150161331A1 (en) * 2013-12-04 2015-06-11 Mark Oleynik Computational medical treatment plan method and system with mass medical analysis
US20150193583A1 (en) 2014-01-06 2015-07-09 Cerner Innovation, Inc. Decision Support From Disparate Clinical Sources
US20160275253A1 (en) * 2014-02-12 2016-09-22 Akiyoshi Shimura Disease detecting system and disease detecting method
US20160098387A1 (en) 2014-10-06 2016-04-07 International Business Machines Corporation Natural Language Processing Utilizing Propagation of Knowledge through Logical Parse Tree Structures
US20160117485A1 (en) 2014-10-27 2016-04-28 International Business Machines Corporation Criteria Conditional Override Based on Patient Information and Supporting Evidence
US20160170962A1 (en) 2014-12-10 2016-06-16 International Business Machines Corporation Data relationships in a question-answering environment
US20160171088A1 (en) 2014-12-10 2016-06-16 International Business Machines Corporation Data relationships in a question-answering environment

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
IBM, List of IBM Patents or Patent Applications Treated as Related, Feb. 14, 2019, 2 pages.
Müller et al., "The diagnosis related groups enhanced electronic medical record", International Journal of Medical Informatics (2003) 70, 221-228, © 2003 Elsevier Ireland Ltd. DOI: 10.1016/S1386-5056(03)00050-9.
Zheng, K., "Clinical Decision-Support Systems", Encyclopedia of Library and Information Sciences, Third Edition, pp. 1-9, Copyright © 2010 Taylor & Francis. DOI: 10.1081/E-ELIS3-120044944.

Also Published As

Publication number Publication date
US20160170962A1 (en) 2016-06-16
US10289679B2 (en) 2019-05-14
US20190179904A1 (en) 2019-06-13
US20160171088A1 (en) 2016-06-16

Similar Documents

Publication Publication Date Title
US11238231B2 (en) Data relationships in a question-answering environment
US10957214B2 (en) Managing answer feasibility
US10936956B2 (en) Cognitive question answering pipeline blending
US9471689B2 (en) Managing documents in question answering systems
US10169490B2 (en) Query disambiguation in a question-answering environment
US11681932B2 (en) Cognitive question answering pipeline calibrating
US9495387B2 (en) Images for a question answering system
US9613093B2 (en) Using question answering (QA) systems to identify answers and evidence of different medium types
US10019673B2 (en) Generating responses to electronic communications with a question answering system
US11586940B2 (en) Generating answers to text input in an electronic communication tool with a question answering system
US20160364374A1 (en) Visual indication for images in a question-answering system
US9886480B2 (en) Managing credibility for a question answering system
US20160364391A1 (en) Demographic-based learning in a question answering system

Legal Events

Date Code Title Description
AS Assignment

Owner name: INTERNATIONAL BUSINESS MACHINES CORPORATION, NEW YORK

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:JOHNSON, DAVID L.;MURAS, BRIAN R.;STRAUSS, DANIEL J.;AND OTHERS;REEL/FRAME:048334/0726

Effective date: 20141208


FEPP Fee payment procedure

Free format text: ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: BIG.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

STPP Information on status: patent application and granting procedure in general

Free format text: APPLICATION DISPATCHED FROM PREEXAM, NOT YET DOCKETED

STPP Information on status: patent application and granting procedure in general

Free format text: NON FINAL ACTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER

STPP Information on status: patent application and granting procedure in general

Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER

STPP Information on status: patent application and granting procedure in general

Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER

STPP Information on status: patent application and granting procedure in general

Free format text: FINAL REJECTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: RESPONSE AFTER FINAL ACTION FORWARDED TO EXAMINER

STPP Information on status: patent application and granting procedure in general

Free format text: ADVISORY ACTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION

STPP Information on status: patent application and granting procedure in general

Free format text: NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS

STPP Information on status: patent application and granting procedure in general

Free format text: PUBLICATIONS -- ISSUE FEE PAYMENT RECEIVED

STPP Information on status: patent application and granting procedure in general

Free format text: PUBLICATIONS -- ISSUE FEE PAYMENT VERIFIED

STCF Information on status: patent grant

Free format text: PATENTED CASE