CA3026936A1

CA3026936A1 - Systems and methods for performing automated interactive conversation with a user

Info

Publication number: CA3026936A1
Application number: CA3026936A
Authority: CA
Inventors: Eric Charton; Matthew Bonnell; Louis Marceau; Jonathan Guymont
Original assignee: Banque Nationale du Canada
Current assignee: Banque Nationale du Canada
Priority date: 2018-12-07
Filing date: 2018-12-07
Publication date: 2020-06-07
Also published as: CA3171020A1; CA3064116A1

Abstract

A dialogue system is a computer system that converses with a human via a user interface. In some embodiments, a dialogue system is provided that may increase the probability of finding a satisfactory response with relatively little iteration of dialogue between a user and the dialogue system. The number of responses (e.g. in the form of alternative questions) may be progressively increased during the interaction with a user, and this may have the effect of increasing the overall robustness of the dialogue system.

Description

S&B Ref: 84678114 (8500010-481)
SYSTEMS AND METHODS FOR PERFORMING AUTOMATED INTERACTIVE CONVERSATION WITH A USER
FIELD
[1] The following relates to a computer-implemented dialogue system for conversing with a human.
BACKGROUND

[2] A dialogue system is a computer system that converses with a human via a user interface. Many dialogue systems utilize a computer program to conduct the conversation using auditory and/or textual methods. A common name for such a computer program is a 'chatbot'. A chatbot may be implemented using a natural language processing system.

[3] An organization may use a dialogue system to help support and scale their customer relation efforts. A dialogue system may be used to provide a wide variety of information to many different users. For example, the dialogue system may be used to perform automated interactive conversation with users in order to provide answers to questions posed by the users. Questions originating from different users may be very different in nature, and the questions may be received and answered at any time of day or night. The users of the dialogue system may be customers or potential customers of the organization.

[4] Current dialogue systems have a technical problem in that they are often not robust. 'Robustness' refers to the ability of the dialogue system to satisfactorily answer a question posed by a human user. Some current dialogue systems may provide less than 50% correct/satisfactory answers. If the dialogue system returns an incorrect or unsatisfactory answer too often, then the dialogue system will not be adopted by human users. Also, the organization's reputation may be negatively impacted.
SUMMARY
[5] One way to try to increase the robustness of a dialogue system is to invest significant resources in the writing of questions and answers, and/or to invest significant resources in enriching the ability of the dialogue system to recognize intentions. Technical implementations often focus less on linguistics and more on improvements to algorithms / models. Achieving satisfactory results may be expensive. In some cases, the results are not even satisfactory because of a technical challenge: the combinatory complexity of human language is boundless, and so it is difficult in technical implementation to predict the natural language a user could use to ask a particular question. The system may be highly based on recursion, and manual dialogue tree manufacture may not be a viable solution.
[6] Instead, in some embodiments disclosed herein, a dialogue system is provided that may increase the probability of finding a satisfactory response with relatively little iteration of dialogue between a user and the system. An interactive process is introduced to help facilitate the exchange between the user and the dialogue system to try to increase the level of robustness of the dialogue system.
[7] In some embodiments, the number of responses in the form of questions may be progressively increased during an interaction with a user. This may have the effect of increasing the overall robustness of the dialogue system. For example, the responses that are progressively increased may be questions that the system determines the user may be asking.
[8] Another problem with dialogue systems is that a response formulated by a dialogue system is not customized based on the user. For example, if two different users ask the exact same question, e.g. "What is the monthly fee for your savings account", the answer would be the same. However, one user may actually be entitled to a preferable monthly fee compared to another user, e.g. based on the volume of monthly financial transactions associated with the user's bank accounts or based on the number of accounts held by the user.
[9] In some embodiments, a technical solution is provided in which a response returned by a dialogue system is generated based on financial information specific to the user.
The response may be in reply to a user's finance related question or finance related action. The response may be a question or an answer or an action.
BRIEF DESCRIPTION OF THE DRAWINGS
[10] Embodiments will be described, by way of example only, with reference to the accompanying figures wherein:

[11] FIG. 1 is a block diagram of a computer implemented system for performing automated interactive conversation with a user, according to one embodiment;
[12] FIG. 2 illustrates a flowchart of a computer-implemented method for performing automated interactive conversation with a user, according to one embodiment;
[13] FIGs. 3 and 4 illustrate example message exchanges on a user interface;
[14] FIG. 5 illustrates a flowchart of a computer-implemented method for interacting with a user, according to one embodiment;
[15] FIG. 6 illustrates a flowchart of a computer-implemented method for performing automated interactive conversation with a user, according to another embodiment; and
[16] FIG. 7 illustrates a flowchart of a computer-implemented method for interacting with a user, according to another embodiment.
DETAILED DESCRIPTION
[17] For illustrative purposes, specific embodiments and examples will be explained in greater detail below in conjunction with the figures.
[18] FIG. 1 is a block diagram of a computer implemented system 102 for performing automated interactive conversation with a user, according to one embodiment.
The system 102 implements a dialogue system.
[19] The system 102 includes a user interface 104 for receiving a natural language input originating from the user, and for providing a response to the user. The attributes of the user interface 104 are implementation specific and depend on how the user is interacting with the system 102. Two examples of a user interface 104 are illustrated in FIG. 1. In one example, the user interface 104 interfaces with a telephone handset belonging to the user.
The telephone handset includes a transmitter through which the user speaks and a receiver through which the user hears the response. A speech recognition module 106 is included as part of the system 102 in order to convert from speech to text. As another example, the user interface 104 may interface with a graphical user interface (GUI) on a computing device, such as on the user's mobile device. The user may use a keyboard or touchscreen to provide a text input, and the response would be presented as text on the display screen hosting the GUI. The user interface 104 is the component of the system 102 that interfaces with users, and is meant to refer to the components of the interface that belong to the system 102, rather than to the user device.
[20] The system 102 further includes a data processing unit 110, which may implement a natural language processing system. The data processing unit 110 includes a keyword extractor 112, an intent classifier 114, response generator 116, and a learning component 118.
[21] The keyword extractor 112 receives a natural language input originating from the user. The input received at the keyword extractor 112 is a string of text. In general, the string of text includes multiple words, although in some cases it could be that the string of text is only a single word. The string of text may convey a question asked by the user, or a user instruction, or a user's answer to a question that was asked by the system 102. The keyword extractor 112 attempts to extract words and/or phrases from the string of text. If any keywords are extracted, the extracted keywords are stored in a memory, e.g. memory 122.
[22] In some embodiments, the keyword extractor 112 may recognize properties that indicate a particular word in the string of text may be a keyword, such as the use of a date, capital letter, brand name, recognized phrase, etc. Examples of keyword extraction algorithms that may be implemented by the keyword extractor 112 are described in:
(1) Jean-Louis, L., Gagnon, M., and Charton, E., "A knowledge-base oriented approach for automatic keyword extraction" Computacion y Sistemas, 2013, vol. 17, no 2, p.
187-196; and (2) Bechet, F., and Charton, E., "Unsupervised knowledge acquisition for extracting named entities from speech", in Acoustics Speech and Signal Processing (ICASSP), International Conference on, pp. 5338-5341, March 2010.
[23] In some embodiments, a keyword extraction algorithm is used involving named entity recognition based on knowledge representation of the semantic domain covered by the dialog system application.
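The property-based recognition described in paragraphs [21] to [23] can be illustrated with a minimal Python sketch. It is not the algorithm of the cited papers: the vocabulary `KNOWN_PHRASES`, the date pattern, and the function name `extract_keywords` are all hypothetical, chosen only to show how capital letters, dates, and recognized phrases might each flag a keyword.

```python
import re

# Hypothetical domain vocabulary; the source does not list one.
KNOWN_PHRASES = {"cashback", "savings account", "credit card", "mastercard"}
# Assumed date format (YYYY-MM-DD), for illustration only.
DATE_PATTERN = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")


def extract_keywords(text: str) -> list[str]:
    """Return lower-cased keywords found in text, without duplicates."""
    found: list[str] = []
    lowered = text.lower()

    # Property 1: recognized phrases from the domain vocabulary.
    for phrase in sorted(KNOWN_PHRASES):
        if re.search(r"\b" + re.escape(phrase) + r"\b", lowered):
            found.append(phrase)

    # Property 2: dates.
    for date in DATE_PATTERN.findall(text):
        if date not in found:
            found.append(date)

    # Property 3: capitalized words, skipping the first word of the input,
    # which is capitalized regardless of whether it is a keyword.
    tokens = re.findall(r"[A-Za-z]+", text)
    for token in tokens[1:]:
        if token[0].isupper() and token.lower() not in found:
            found.append(token.lower())

    return found
```

An empty result corresponds to the case in which no keywords are extracted, which matters later in the method of FIG. 2 (steps 211 and 221).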
[24] The intent classifier 114 also receives the natural language input originating from the user in the form of a string of text, and analyses the string of text to determine the intent of the user. In some embodiments, the words in the string of text are compared to a library of intents and entity values. For example, if the user asked the question "What is the rate on your cashback Mastercard?", then the intent classifier 114 may match the word "rate" to an intent "get rate" that is stored in a library of intents. The intent classifier 114 may determine that the entity value relating to that intent is "cashback" by the presence of the word "cashback". In such a scenario, the intent classifier 114 therefore determines that the user is asking for a cashback rate. The presence of the word "Mastercard" may cause the intent classifier 114 to determine that the cashback rate requested by the user is the cashback rate for the Mastercard™ brand credit card.
The intent classifier 114 may associate a confidence value with the determined intent. The confidence value will be referred to as a "confidence score", and it quantifies how confident the intent classifier 114 is regarding the correctness of its determined intent.
For example, the intent determined by the intent classifier 114 may be "get cashback rate for Mastercard™ brand credit card". However, this intent is not necessarily correct, e.g. there is some ambiguity from the string of text as to whether the rate requested is cashback rate or another type of rate instead (e.g. interest rate for the Mastercard™ brand credit card). Therefore, the confidence score may not be 100%, but may instead have a lower value, e.g. 75%.
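As a rough illustration of comparing input words to a library of intents and attaching a confidence score, the following Python sketch scores each intent by the fraction of its cue words present in the input. The library contents, the intent names, and the scoring rule are assumptions made for this example. The source does not specify how the confidence score is computed, and this toy rule does not reproduce the 75% figure used above.

```python
import re

# Hypothetical intent library: intent name -> cue words.
INTENT_LIBRARY = {
    "get_cashback_rate": {"rate", "cashback"},
    "get_interest_rate": {"rate", "interest"},
    "get_monthly_fee": {"monthly", "fee"},
}


def classify(text: str) -> list[tuple[str, float]]:
    """Return (intent, confidence) pairs with non-zero confidence, best first."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    scored = []
    for intent, cues in INTENT_LIBRARY.items():
        # Assumed scoring rule: share of the intent's cue words seen in the input.
        confidence = len(cues & words) / len(cues)
        if confidence > 0:
            scored.append((intent, confidence))
    # Highest confidence first; ties broken by name so the order is deterministic.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored
```

The first entry of the returned list plays the role of the determined intent, and the remaining entries are the less likely candidates that the method can fall back on.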
[25] In other embodiments, the intent classifier 114 instead works by simply looking for matches between words in the natural language input and words in prewritten questions that are stored in memory 122.
[26] One example of an algorithm that may be implemented by the intent classifier 114 is described in: Serban, I. V., Sordoni, A., Bengio, Y., Courville, A. C., and Pineau, J., "Building End-To-End Dialogue Systems Using Generative Hierarchical Neural Network Models", in Association for the Advancement of Artificial Intelligence (AAAI), Vol. 16, pp. 3776-3784, February 2016.
[27] The response generator 116 receives the intent from the intent classifier 114, determines the question that the user is possibly asking based on the intent, and returns the possible question to the user for verification. Once the user verifies the question, the response generator 116 formulates and returns the answer. In some embodiments, the answer may be stored in memory and simply retrieved using a mapping between the verified question and the answer. In other embodiments, the response generator 116 may need to send a request over a network to obtain the answer. For example, if the verified question is "what is the cashback rate for Mastercard™ brand credit card", then the response generator 116 may query a database storing the cashback rate in order to obtain the cashback rate, and then formulate and send the response to the user, e.g. "The cashback rate for our Mastercard™ is 1%".
[28] The learning component 118 adapts the keyword extractor 112 and/or intent classifier 114 based on information provided by the user, as discussed in more detail later.
[29] Operation of the intent classifier 114, response generator 116, and learning component 118 will be explained in more detail below in relation to FIG. 2.
[30] The system 102 further includes a memory 122 for storing information used by the data processing unit 110. For example, the memory 122 may store a library of intents, the extracted keywords from the keyword extractor 112, responses or partial responses preprogrammed for use by the response generator 116, etc.
[31] The data processing unit 110 and its components (e.g. the keyword extractor 112, intent classifier 114, response generator 116, and learning component 118) may be implemented by one or more processors that execute instructions (software) stored in memory. The memory in which the instructions are stored may be memory 122 or another memory not illustrated. The instructions, when executed, cause the data processing unit 110 and its components to perform the operations described herein, e.g. extracting keywords from the user input, classifying intent, computing a confidence score, formulating the response to send to the user, updating one or more algorithms based on input from the user, etc. In some embodiments, the one or more processors consist of a central processing unit (CPU).
[32] Alternatively, some or all of the data processing unit 110 and its components may be implemented using dedicated circuitry, such as an application specific integrated circuit (ASIC), a graphics processing unit (GPU), or a programmed field programmable gate array (FPGA) for performing the operations of the data processing unit 110 and its components.
[33] In some embodiments, in order to try to increase the robustness of the dialogue system, an interactive process is used in which the number of responses may be progressively increased during an interaction with a user. Example embodiments are provided below.

[34] FIG. 2 illustrates a flowchart of a computer-implemented method for performing automated interactive conversation with a user, e.g. in order to provide an answer to a question from the user, according to one embodiment.
[35] In step 202, a natural language input originating from a user is received via the user interface 104. The natural language input is a string of text that conveys a question.
[36] In step 204, the keyword extractor 112 attempts to extract keywords from the natural language input. If one or more keywords are extracted, then they are stored in memory 122.
[37] In step 206 the intent classifier 114 determines an intent from the natural language input. The intent classifier 114 also determines the confidence score for its determined intent. In step 207, if the confidence score is below a threshold, then the method proceeds to step 221, explained later. Otherwise, if the confidence score is above the threshold, it indicates that the system 102 is confident enough in its determined intent to return a single question for verification, and the method proceeds to step 208.
[38] In step 208, the response generator 116 returns the question for verification by the user.
[39] In step 209, the data processing unit 110 determines whether the question returned in step 208 was verified as correct by the user. Step 209 may include receiving a natural language input from the user, and the intent classifier 114 determining from the intent of the natural language input whether or not the user has verified the correctness of the question.
[40] For example, the original natural language input received in step 202 may ask "What is the rate on your cashback MasterCard?". The intent classifier 114 may determine with high enough confidence that the intent is "get cashback rate for MasterCard", and so in step 208 the response generator 116 returns "I think I understand your question. Can you verify for me that your question is: What is the cashback rate for the MasterCard credit card?" The user replies "Yes". The input "Yes" is determined in step 209 to be verifying that the question is correct.

[41] If the question is verified as correct, then in step 210 the response generator 116 returns the answer to the question, and the method ends. If the question is not verified as correct, then the method proceeds to step 211.
[42] When a question is derived by the intent classifier 114 from the natural language input, and the confidence score is above the threshold, then the derived question is referred to as a "likely question". A "likely question" is a question that the system 102 determines was likely conveyed by the natural language input. In step 208, it is the likely question that is returned.
However, it is only a "likely" question because it is not necessarily the actual question that was asked, e.g. if the intent determined by the intent classifier 114 does not correctly reflect the user's intent.
[43] If step 211 is reached, it means that the initial question presented to the user for verification in steps 208 and 209 is not verified as correct. In step 211, the data processing unit 110 determines whether one or more keywords were recognized and extracted by the keyword extractor 112 in step 204. If no keywords were recognized, then the method proceeds from step 211 to step 230. Step 230 is explained later. Otherwise, if one or more keywords were recognized and extracted, then the method proceeds from step 211 to 212.
[44] In step 212, the intent classifier 114 identifies n alternative intents based on the keywords extracted in step 204, where n is a natural number. n may vary depending upon how many alternative intents can be determined, and n may also be capped. For example, if only one alternative intent is determined by the intent classifier 114, then n is limited to n = 1. As another example, if five alternative intents are determined by the intent classifier 114, then n may be capped at four, e.g. only the top four alternative intents are identified.
[45] An alternative intent is identified by the intent classifier 114 as follows: the keywords are processed, but instead of identifying the most likely intent (identified in step 206), a different intent is identified that is determined to be less likely, e.g. has a lower confidence score. For example, the user's question may be "What is the rate on your cashback Mastercard?" The intent classifier 114 determines two possible intents: (1) the user is requesting the cashback rate for the Mastercard™ brand credit card, and the confidence score of this determined intent is 75%; or (2) the user is requesting the interest rate for the Mastercard™ brand credit card, and the confidence score of this determined intent is 65%. The intent identified in step 206 is the one with the higher confidence score, which in this example is cashback rate. The (n = 1) alternative intent identified in step 212 is the one with the lower confidence score, which in this example is interest rate. Each alternative intent corresponds to an alternative question the user might be asking, which is derived from the natural language input conveying the question that was received at step 202.
[46] The n alternative intents correspond to n alternative questions, and in step 214 the n alternative questions are returned to the user via the user interface 104.
[47] In step 215 it is determined whether one of the n alternative questions is identified as correct by the user. Step 215 may be performed by determining the intent of an input received from the user after the n alternative questions are presented to the user. For example, if the user responds "First question", then the system determines that the first question of the n alternative questions is the correct question.
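The source performs step 215 by determining the intent of the user's reply. As a much simpler stand-in, the Python sketch below maps a reply such as "First question" to an index into the list of presented questions. The ordinal table and the function name `parse_selection` are hypothetical, and a real system would need to handle far more phrasings.

```python
import re
from typing import Optional

# Hypothetical table of ordinal words; extend as needed.
ORDINALS = {"first": 0, "second": 1, "third": 2, "fourth": 3}


def parse_selection(reply: str, num_options: int) -> Optional[int]:
    """Return the zero-based index of the selected question, or None."""
    for word in re.findall(r"[a-z0-9]+", reply.lower()):
        # Ordinal words such as "first" or "second".
        if word in ORDINALS and ORDINALS[word] < num_options:
            return ORDINALS[word]
        # Bare numbers such as "2", interpreted as one-based positions.
        if word.isdigit() and 1 <= int(word) <= num_options:
            return int(word) - 1
    # No recognizable selection: treated as "none of the questions is correct".
    return None
```

A `None` result corresponds to the "No" branch of step 215, after which the method proceeds to step 230.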
[48] If one of the n alternative questions is identified as correct, then in step 216 the response generator 116 returns the corresponding answer and the method ends.
If none of the n alternative questions is identified as correct, then the method proceeds to step 230. Step 230 is described later.
[49] Returning to step 207, if the confidence score of the intent determined by the intent classifier 114 in step 206 is below the threshold, then the method proceeds to step 221. If step 221 is reached, it means that an intent has been determined from the natural language input, but the intent classifier 114 is not particularly confident that the determined intent is correct.
[50] Therefore, in step 221, the data processing unit 110 determines whether one or more keywords were recognized and extracted by the keyword extractor 112 in step 204. If no keywords were recognized, then the method proceeds from step 221 to step 230.
Step 230 is explained later. Otherwise, if one or more keywords were recognized and extracted, then the method proceeds from step 221 to step 222. In step 222, the intent classifier 114 identifies the k most likely intents, where k is a natural number greater than or equal to one.
k does not need to have any relation to n, but in some embodiments k = n or k = n + 1. k may vary depending upon how many intents can be determined, and k may also be capped. The k intents returned may be the k intents having the highest confidence scores. For example, the user's question may be "What is the big deal about your Mastercard?" The intent classifier 114 determines two possible intents: (1) the user is requesting a summary of the features of the Mastercard™ brand credit card, and the confidence score of this determined intent is 45%; or (2) the user is requesting information on promotional offers for signing up for the Mastercard™ brand credit card, and the confidence score of this determined intent is 35%. Neither intent has a high enough confidence score to proceed to step 208, but in step 222 both intents (k = 2) are identified. Each intent corresponds to an alternative question the user might be asking, which is derived from the natural language input conveying the question that was received at step 202.
[51] The k intents correspond to k questions, and in step 224 the k questions are returned to the user via the user interface 104.
[52] In step 225 it is determined whether one of the k questions is identified as correct by the user. Step 225 may be performed by determining the intent of an input received from the user after the k questions are presented to the user. If one of the k questions is identified as correct, then in step 226 the response generator 116 returns the corresponding answer and the method ends. If none of the k questions is identified as correct, then the method proceeds to step 230.
[53] If step 230 is reached in the method of FIG. 2 it means that the system 102 is not able to determine the question the user is asking. In step 230 the response generator 116 sends a reply to the user indicating this, e.g. "Sorry, I do not understand your question. Please try to rephrase your question".
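The branching of FIG. 2 described in paragraphs [35] to [53] can be summarized in a short Python sketch. This is one possible reading of the flowchart, not the patented implementation. The function name `run_dialogue`, the callbacks `ask_user` and `answer_for`, the default threshold of 0.7, and the caps of four are all assumptions. The source also does not say what happens when the score equals the threshold; this sketch treats that case as confident.

```python
from typing import Callable, Optional

FALLBACK = (
    "Sorry, I do not understand your question. "
    "Please try to rephrase your question"
)


def run_dialogue(
    ranked: list[tuple[str, float]],   # (question, confidence), best first
    keywords: list[str],               # output of step 204
    ask_user: Callable[[list[str]], Optional[int]],  # index confirmed, or None
    answer_for: Callable[[str], str],
    threshold: float = 0.7,            # assumed value
    n_cap: int = 4,                    # assumed cap on alternative questions
    k_cap: int = 4,                    # assumed cap on low-confidence questions
) -> str:
    if not ranked:
        return FALLBACK                          # step 230 (assumed: no intent found)
    likely_question, score = ranked[0]
    if score >= threshold:                       # step 207
        # Steps 208 and 209: present the single likely question.
        if ask_user([likely_question]) == 0:
            return answer_for(likely_question)   # step 210
        if not keywords:                         # step 211
            return FALLBACK                      # step 230
        # Step 212: up to n less likely intents.
        candidates = [q for q, _ in ranked[1:1 + n_cap]]
    else:
        if not keywords:                         # step 221
            return FALLBACK                      # step 230
        # Step 222: the k most likely intents.
        candidates = [q for q, _ in ranked[:k_cap]]
    if not candidates:
        return FALLBACK                          # step 230
    choice = ask_user(candidates)                # steps 214/215 or 224/225
    if choice is None or not 0 <= choice < len(candidates):
        return FALLBACK                          # step 230
    return answer_for(candidates[choice])        # step 216 or 226
```

Passing `ask_user` as a callback keeps the control flow independent of the user interface, so the same function can be driven by a GUI, a telephone front end, or a scripted test.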
[54] FIG. 3 illustrates an example message exchange on a user interface 104, according to one embodiment. The message exchange corresponds to steps 202, 204, 206, 207, 208, 209, 211, 212, 214, 215, and 216 of FIG. 2. The number of responses (in the form of questions) is progressively increased during the interaction with the user. In particular, initially only one question is presented for verification at 382. However, upon receiving user feedback indicating that the initial question is incorrect, n = 3 alternative questions are provided at 384.
The user indicates that the first one of the three alternative questions is correct at 386, and the answer corresponding to that question is returned at 388.

[55] FIG. 4 illustrates an example message exchange on a user interface 104, according to another embodiment. The message exchange corresponds to steps 202, 204, 206, 207, 221, 222, 224, 225, and 226 of FIG. 2. The confidence score relating to the most likely intent does not exceed the threshold, and so the k = 3 most likely responses are returned at 392.
The user indicates that the first one of the three questions is correct at 396, and the answer corresponding to that question is returned at 398.
[56] Returning to FIG. 2, optionally, in step 234, the learning component 118 updates the keyword extractor 112 and/or the intent classifier 114 to reflect the user's response that indicates which question is the correct question. For example, the learning component 118 may receive the output of the "Yes" branch of step 215 and/or step 225, which indicates the correct question, and the learning component 118 may use this indication to update or train the intent classifier 114 and/or the keyword extractor 112. Two examples follow.
[57] One example: The user initially asks the question "What is the big deal about your Mastercard?" The system does not determine an intent with a high enough confidence score and so three questions are returned to the user, as shown at 392 of FIG. 4. The user replies that the first question is correct, i.e. the correct question is "What are the features of the Mastercard credit card?". The learning component 118 then updates the keyword extractor 112 and/or intent classifier 114 to add the vocabulary "big deal" and to indicate that "big deal" is a synonym of "features". Then, if in the future a user asks a question including "big deal", e.g. "What is the big deal regarding your savings account", then the intent classifier 114 will more confidently determine that the user intent is that the user wants to learn about the features of the savings account.
[58] Another example: The user initially asks the question "What is the rate on your cashback Mastercard?" The system initially returns the incorrect question, as shown at 382 of FIG. 3, and so three alternative questions are returned to the user, as shown at 384 of FIG. 3. The user replies at 386 that the first question is correct, i.e. the correct question is "What is the interest rate on the Mastercard credit card". The learning component 118 then updates the intent classifier 114 to increase the confidence score of the entity value "interest rate" when "rate" is used in the user's question. Then, if in the future a user asks a similar question, e.g. "What is the rate on your Visa card", then the intent classifier will more confidently determine that the user intent is that the user wants to know the interest rate for the Visa™ brand credit card.
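The two updates described in paragraphs [57] and [58] can be sketched as a small Python class. This is a toy model only: the class name `LearningComponent`, the default weight of 0.5, and the fixed increment of 0.1 are assumptions. The source does not state how the confidence score is adjusted, and the reinforcement-learning techniques surveyed in the reference cited in paragraph [59] are considerably more involved.

```python
class LearningComponent:
    """Toy store of synonyms and cue-word-to-entity-value weights."""

    def __init__(self, default_weight: float = 0.5, step: float = 0.1):
        self.synonyms: dict[str, str] = {}
        self.weights: dict[tuple[str, str], float] = {}
        self.default_weight = default_weight  # assumed starting confidence
        self.step = step                      # assumed fixed increment

    def add_synonym(self, new_term: str, known_term: str) -> None:
        """Record that new_term should be read as known_term (paragraph [57])."""
        self.synonyms[new_term.lower()] = known_term.lower()

    def canonical(self, term: str) -> str:
        """Map a term to its known synonym, or return it unchanged."""
        return self.synonyms.get(term.lower(), term.lower())

    def reinforce(self, cue: str, entity_value: str) -> float:
        """Raise the weight linking cue to entity_value (paragraph [58])."""
        key = (cue.lower(), entity_value.lower())
        current = self.weights.get(key, self.default_weight)
        # Cap at 1.0 so the weight remains a valid confidence value.
        self.weights[key] = min(1.0, current + self.step)
        return self.weights[key]
```

For instance, after the exchange of FIG. 4 one might call `add_synonym("big deal", "features")`, and after the exchange of FIG. 3 one might call `reinforce("rate", "interest rate")`.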
[59] An example of a learning algorithm that may be implemented by the learning component 118 is: Schatzmann, J., Weilhammer, K., Stuttle, M., & Young, S. (2006), "A survey of statistical user simulation techniques for reinforcement-learning of dialogue management strategies", The Knowledge Engineering Review, 21(2), 97-126.
[60] In alternative embodiments, steps 208 and 209 of FIG. 2 may be modified to instead just return an answer to the determined question, and ask for validation that the returned answer is correct, in which case step 210 is not needed. For example, box 382 in FIG. 3 may instead be: "The cashback rate on the MasterCard credit card is 1%. Did that answer your question?". If the user answers "Yes" then the method ends, whereas if the user answers "No, that did not answer my question", then the method proceeds to step 211.
[61] In some embodiments, the original question asked in the natural language input received at step 202 may actually consist of more than one question, in which case the system 102 may extract and process each question separately, or the intent classifier 114 may try to determine an overall intent. For example, if the natural language input from the user in step 202 is "Does your bank offer multiple credit cards? What is the rate of each one?", then the intent classifier 114 may determine that the intent is that the user wants a comparison of the rate of each of the bank's credit cards.
[62] In some embodiments, the natural language input received at step 202 may not be a question, but may instead be a request or an instruction to perform an action. For example, the input may be a request for information. The reply may then be a question that confirms whether particular information is being requested. For example, the natural language input received in step 202 may be "Provide me with the rate on your cashback MasterCard", and the initial question returned in step 208 may be "Please confirm that you are asking: What is the cashback rate on the MasterCard credit card?" Similarly, the alternative questions in steps 214 and 224 may ask whether particular information is being requested.
[63] In some embodiments, the natural language input received at step 202 may be an instruction to perform an action, e.g. "open a new account", in which case the question(s) returned may relate to clarification or confirmation before proceeding, e.g. in step 214 "Do you mean any one of the following actions: (1) Open a new savings account?; or (2) open a new chequing account?; or (3) open a new student account?".
[64] In some embodiments, when a user asks a question or requests an action, the response returned by the system 102 may be formulated based on information specific to the user. In some embodiments, the response may be in reply to a user's finance related question or finance related action. The response may be a function of the user's financial information, e.g. the user's prior financial transactions. The response may be a question or an answer or an action.
[65] FIG. 5 illustrates a flowchart of a computer-implemented method for interacting with a user, according to one embodiment. In step 452, the data processing unit 110 receives, in text form, a natural language input originating from a user via the user interface 104. The natural language input conveys a finance related question or a finance related action to be performed. As an example, the user may be asking "What is the monthly fee for your savings account?" (a finance related question), or the user may be instructing "Please open a new savings account" (a finance related action).
[66] In step 454, an intent is determined from the natural language input, possibly using keywords extracted from the natural language input. In step 456, the response generator 116 formulates a response (e.g. a question, an answer, or an action) based on the intent.
However, the response formulated by the response generator 116 is based on user-specific financial information, as explained below.
[67] Stored in memory 122 is the identity of the user. The system 102 knows and stores the identity of the user because the identity of the user has been previously provided to the system 102. As one example, the user may have previously provided their bank card number to the system 102, which is used to uniquely identify the user. As another example, the system 102 may be part of an online banking platform, and the user is signed into their online banking, such that the system 102 is aware of the identity of the user.
[68] Stored in a data structure, e.g. a database, is user-specific financial information. User-specific financial information is financial information that is specific or unique to the user. A non-exhaustive list of user-specific financial information includes any one, some, or all of the following: prior financial transactions performed by the user, e.g. a stored record of previous financial transactions; and/or quantity, quality, or type of financial transactions performed by the user; and/or user account balances; and/or number or type of accounts held by a user (examples of accounts include banking accounts, mortgage accounts, investment accounts, etc.); and/or credit information for the user; and/or information relating to banking products utilized by the user, e.g. whether the user has a mortgage, a credit card, investments, etc. The data structure may be stored in memory 122 or at another location, e.g. a database connected to data processing unit 110 via a network.
[69] There are multiple candidate responses that may be returned to the user, which are selected or weighted based on the user-specific financial information. Some examples are provided below.
[70] Example: The natural language input originating from the user in step 452 conveys the following finance-related question: "What is the monthly fee for your savings account?" The intent determined in step 454 is that the user is requesting the monthly fee for a savings account. The response generator 116 determines the following, e.g. by querying a database: the standard monthly fee is $10 per month, but the fee is reduced to $5 per month if the user has a mortgage account or an investment account with the bank, and the fee is reduced to $0 per month if the user has both a mortgage account and an investment account with the bank. Therefore, there are three candidate responses: $10, $5, or $0. The response generator 116 uses the user identification stored in memory 122 to query a database that lists the accounts held by the user. The accounts held by the user include a mortgage account, but not an investment account, and so the response returned to the user in step 456 is that the monthly fee is $5, or the response may be a question, e.g. "We can offer you a savings account for a monthly fee of only $5, are you interested?".
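As a non-limiting illustration, the fee selection described in this example may be sketched as follows. The `UserProfile` record, the account labels, and the function names are assumptions introduced only for illustration and do not appear in the embodiments above; the fee tiers are those given in the example.

```python
from dataclasses import dataclass, field


@dataclass
class UserProfile:
    """Hypothetical record of user-specific financial information."""
    user_id: str
    accounts: set = field(default_factory=set)  # e.g. {"mortgage", "investment"}


def monthly_savings_fee(profile: UserProfile) -> int:
    """Select one of the three candidate fees ($10, $5, $0) from the accounts held."""
    has_mortgage = "mortgage" in profile.accounts
    has_investment = "investment" in profile.accounts
    if has_mortgage and has_investment:
        return 0   # both accounts held: fee waived
    if has_mortgage or has_investment:
        return 5   # exactly one of the two accounts held: reduced fee
    return 10      # standard fee


def fee_response(profile: UserProfile) -> str:
    """Formulate the answer returned to the user (step 456)."""
    return f"The monthly fee for a savings account is ${monthly_savings_fee(profile)}."
```

For the user of the example, who holds a mortgage account but no investment account, `monthly_savings_fee` selects the $5 candidate.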
[71] Another example: The natural language input originating from the user in step 452 conveys the following finance-related question: "What is this month's fee for my savings account?" The intent determined in step 454 is that the user wants to know this month's fee for the user's savings account. The fee is a function of the number of financial transactions performed by the user involving the user's savings account, e.g. $1 fee for every transfer into or out of the savings account in the month. The response generator 116 uses the user identification stored in memory 122 to query a database that lists the number of transactions that month. The database returns a value indicating that there were three transfers since the beginning of the month, and so the response returned to the user in step 456 is that the fee will be $3.
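A minimal sketch of this per-transfer fee computation is given below. The in-memory dictionary stands in for the database query, and the identifier `"user-123"` and all names are hypothetical; only the $1-per-transfer rate and the count of three transfers come from the example.

```python
# Hypothetical stand-in for the database listing each user's transfers
# into or out of the savings account since the beginning of the month.
TRANSFERS_THIS_MONTH = {"user-123": 3}

FEE_PER_TRANSFER = 1  # dollars per transfer, the example rate from the text


def fee_this_month(user_id: str) -> int:
    """Fee as a function of the number of transfers performed this month."""
    transfers = TRANSFERS_THIS_MONTH.get(user_id, 0)  # unknown user: no transfers
    return transfers * FEE_PER_TRANSFER


def fee_response(user_id: str) -> str:
    """Formulate the answer returned to the user (step 456)."""
    return f"This month's fee for your savings account will be ${fee_this_month(user_id)}."
```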
[72] Another example: The natural language input originating from the user in step 452 conveys the following finance-related action: "transfer $100 from my savings account to my chequing account". The intent determined in step 454 is that $100 is to be transferred from the user's savings account to the user's chequing account. The response generator 116 determines that the user has two savings accounts ("A" and "B"), and so there are two candidate responses: either transfer the $100 from the user's savings account A or transfer the $100 from the user's savings account B. The response generator 116 uses the user identification stored in memory 122 to query the account balances for savings accounts A and B and determines that savings account B has no money in it. In response, the response generator 116 performs the transfer from savings account A, perhaps after sending a question to the user confirming that the money is to be transferred from savings account A.
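One possible way to choose between the two candidate source accounts is sketched below. The rule used here (keep only accounts whose balance covers the requested amount) is an assumption made for illustration; the example above states only that savings account B has no money in it. The function name and the returned labels are likewise hypothetical.

```python
def plan_transfer(balances: dict, amount: float) -> tuple:
    """Decide how to respond to a transfer request given savings balances.

    Returns a (decision, account) pair:
      ("confirm", name)      exactly one account can fund the transfer
      ("ask", None)          several accounts qualify, so ask the user which one
      ("insufficient", None) no account can fund the transfer
    """
    # Assumed rule: an account is a viable candidate only if it covers the amount.
    eligible = [name for name, balance in balances.items() if balance >= amount]
    if len(eligible) == 1:
        return ("confirm", eligible[0])
    if not eligible:
        return ("insufficient", None)
    return ("ask", None)
```

With balances of `{"A": 250.0, "B": 0.0}` and a requested amount of 100, the sketch selects account A and leaves room for a confirmation question before the transfer is performed.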
[73] In some embodiments, the method of FIG. 2 may be modified to incorporate generating a response based on user-specific financial information. For example, the answer returned in step 210, 216, and/or 226 may be based on user-specific financial information. In a variation of FIG. 2, answers (instead of questions) may be returned in steps 208/209, 214/215, and 224/225 (in which case steps 210, 216 and 226 are not needed). The initial answer returned in step 208/209 may be formulated based on financial information specific to the user. If in step 209 the user found the answer to be unsatisfactory (e.g. incorrect), then the alternative intents or answers (e.g. of step 214/215) may or may not be based on the user's financial information.
[74] An example: The natural language input originating from the user conveys the following finance-related question: "What is the rate of your savings account?" The intent determined is that the user is requesting the interest rate of a savings account, and the confidence score is high enough to immediately supply an answer to the question. The standard interest rate for a savings account is 1%, but can be offered at 1.5% if the user has a mortgage account with the bank. The response generator 116 uses the user identification stored in memory 122 to query a database that lists the accounts held by the user. The accounts held by the user include a mortgage account, and so the response returned to the user is that the interest rate is 1.5%. It is then determined that the user is not satisfied with the answer, e.g. the user actually wanted to know the fee for the savings account. n alternative intents are therefore identified, and n corresponding alternative answers are returned to the user. However, the n corresponding alternative answers are not formulated based on user-specific financial information because the system 102 is now not as confident about whether the alternative answers even reflect the question actually asked by the user. This is because the confidence scores associated with the alternative intents are lower than the confidence score associated with the intent initially determined.
[75] In some embodiments, the response may only be formulated based on user-specific financial information if the confidence score of the intent associated with the response is above a particular threshold. For example, if the intent has a confidence score of 90% or above, then the corresponding response is modified based on the user-specific financial information; otherwise, the corresponding response is not modified based on the user-specific financial information.
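The threshold test may be sketched as follows. The 90% value is the example threshold given above; the function name, its parameters, and the representation of the confidence score as a fraction between 0 and 1 are assumptions made for illustration.

```python
PERSONALIZATION_THRESHOLD = 0.90  # example threshold: 90% or above


def formulate_response(generic: str, personalized: str, confidence: float,
                       threshold: float = PERSONALIZATION_THRESHOLD) -> str:
    """Return the user-specific response only when the intent is confident enough."""
    if confidence >= threshold:
        return personalized  # modified using user-specific financial information
    return generic           # left unmodified for lower-confidence intents
```

Under this sketch, the initial answer of the preceding example (a high-confidence intent) would be personalized to 1.5%, whereas answers for lower-confidence alternative intents would be returned in their generic form.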
[76] FIG. 6 illustrates a flowchart of a computer-implemented method for performing automated interactive conversation with a user, according to one embodiment. The automated interactive conversation may be performed in order to provide an answer to a question from the user.
[77] In step 502, a user interface 104 is provided at which the user can provide a natural language input. The natural language input is processed by the data processing unit 110. The data processing unit 110 comprises at least one processor executing instructions. The instructions are configured to cause the data processing unit 110 to perform the remaining steps of FIG. 6.
[78] In step 504, the data processing unit 110 derives, from the natural language input, a possible question the user might be asking. An example of step 504 is described earlier in relation to steps 202 to 208 of FIG. 2.
[79] In step 506, the data processing unit 110 conveys the possible question to the user through the user interface 104 for verification by the user. An example of step 506 is described earlier in relation to steps 208 and 209 of FIG. 2.

[80] In step 508, the data processing unit 110 processes user input at the user interface 104 indicating that the possible question is incorrect (e.g. the "No" branch of step 209 of FIG. 2).
[81] In step 510, the data processing unit 110 derives a series of alternate questions that the user might be asking (e.g. step 212 of FIG. 2). In some embodiments, step 510 may only be performed if at least one keyword was recognized and extracted from the natural language input. In some embodiments, at least one keyword is recognized and extracted from the natural language input, and step 510 includes deriving the series of alternate questions based on the at least one keyword.
[82] In step 512, the data processing unit 110 presents the series of alternate questions to the user through the user interface 104.
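Steps 504 to 512 may be sketched, under simplifying assumptions, as shown below. The question catalogue, the keyword-overlap ranking, and all names are hypothetical stand-ins: the embodiments do not prescribe how the possible question or the alternate questions are derived, and here a single keyword score is used for both purely to keep the sketch short.

```python
# Hypothetical catalogue of answerable questions, each tagged with keywords.
CATALOGUE = {
    "What is the monthly fee for a savings account?": {"fee", "savings"},
    "What is the interest rate of a savings account?": {"rate", "interest", "savings"},
    "How do I open a savings account?": {"open", "savings"},
    "What is the monthly fee for a chequing account?": {"fee", "chequing"},
}


def extract_keywords(text: str) -> set:
    """Keep only the words of the input that appear in the catalogue vocabulary."""
    words = {w.strip("?.,!").lower() for w in text.split()}
    vocabulary = set().union(*CATALOGUE.values())
    return words & vocabulary


def rank_questions(text: str) -> list:
    """Rank catalogue questions by keyword overlap with the input, best first."""
    keywords = extract_keywords(text)
    scored = [(len(keywords & kws), q) for q, kws in CATALOGUE.items()]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: -pair[0])  # stable sort keeps catalogue order on ties
    return [q for _, q in scored]


def converse(text: str, confirm, max_alternates: int = 3) -> list:
    """Return the question(s) presented to the user last."""
    ranked = rank_questions(text)
    if not ranked:
        return []                         # no keyword recognized: nothing to derive
    possible = ranked[0]                  # step 504: possible question
    if confirm(possible):                 # step 506: convey for verification
        return [possible]
    return ranked[1:1 + max_alternates]   # steps 508 to 512: alternate questions
```

Here `confirm` is a callable standing in for the user's "Yes"/"No" reply at the user interface 104.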
[83] In some embodiments, deriving the possible question the user might be asking in step 504 includes determining a user intent from the natural language input. In some embodiments, deriving the possible question the user might be asking is performed without an extracted keyword.
[84] In some embodiments, an algorithm for extracting at least one keyword and/or an algorithm for determining user intent is modified based on an indication from the user of which one of the alternate questions is a correct question.
[85] In some embodiments, the method further includes receiving an indication, from the user, that a particular question of the series of alternate questions is correct, and presenting to the user through the user interface an answer to the particular question. In some embodiments, the method includes generating the answer to the particular question using user-specific financial information. In some embodiments, the particular question is a finance-related question, and the user-specific financial information relates to financial transactions previously performed by the user and/or accounts held by the user.
[86] FIG. 7 illustrates a flowchart of a computer-implemented method for interacting with a user, according to one embodiment.
[87] In step 552, a user interface 104 is provided at which the user can provide a natural language input conveying a finance related question or a finance related action to be performed. The natural language input is processed by the data processing unit 110. The data processing unit 110 comprises at least one processor executing instructions. The instructions are configured to cause the data processing unit 110 to perform the remaining steps of FIG. 7.
[88] In step 554, the data processing unit 110 derives, from the natural language input, a possible finance related question or possible finance related action. In step 556, the data processing unit 110 obtains a series of candidate responses, each of which is a response to the possible finance related question or the possible finance related action. In step 558, the data processing unit 110 selects one of the candidate responses on the basis of user-specific financial information. In step 560, the data processing unit 110 presents the selected candidate response to the user through the user interface 104. Examples are provided earlier when describing FIG. 5 and related embodiments.
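Steps 556 to 560 may be sketched as follows. Pairing each candidate response with a predicate over the user-specific financial information is one possible design, assumed here for illustration only; the `Candidate` type, the dictionary layout of the user information, and the function name are hypothetical. The interest rates are those of the savings-rate example given earlier.

```python
from typing import Callable, NamedTuple


class Candidate(NamedTuple):
    """A candidate response and the (assumed) condition under which it applies."""
    applies: Callable[[dict], bool]
    response: str


def select_response(candidates: list, user_info: dict) -> str:
    """Step 558: select the first candidate whose condition the user satisfies."""
    for candidate in candidates:
        if candidate.applies(user_info):
            return candidate.response
    raise LookupError("no candidate response applies to this user")


# Step 556: candidate responses to "What is the rate of your savings account?"
RATE_CANDIDATES = [
    Candidate(lambda info: "mortgage" in info["accounts"], "The interest rate is 1.5%."),
    Candidate(lambda info: True, "The interest rate is 1%."),  # standard rate
]
```

Ordering the candidates from most specific to most general lets the final entry act as the default response that is presented in step 560 when no user-specific condition is met.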
[89] In some embodiments, the candidate responses are a series of answers, each answer corresponding to a respective possible finance related question. In other embodiments, the candidate responses are a series of actions, each action corresponding to a respective possible finance related action instructed by the user. In other embodiments, the candidate responses are a series of questions. The questions may each correspond to a possible finance related question being asked. In some embodiments, the user-specific financial information relates to financial transactions previously performed by the user and/or accounts held by the user. In some embodiments, the user-specific financial information is retrieved using an identifier of the user stored in memory.
[90] Although the foregoing has been described with reference to certain specific embodiments, various modifications thereof will be apparent to those skilled in the art without departing from the scope of the claims appended hereto.
[91] Moreover, any module, component, or device exemplified herein that executes instructions may include or otherwise have access to a non-transitory computer/processor readable storage medium or media for storage of information, such as computer/processor readable instructions, data structures, program modules, and/or other data. A non-exhaustive list of examples of non-transitory computer/processor readable storage media includes magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, optical disks such as compact disc read-only memory (CD-ROM), digital video discs or digital versatile discs (DVDs), Blu-ray Disc™, or other optical storage, volatile and non-volatile, removable and non-removable media implemented in any method or technology, random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology. Any such non-transitory computer/processor storage media may be part of a device or accessible or connectable thereto. Any application or module herein described may be implemented using computer/processor readable/executable instructions that may be stored or otherwise held by such non-transitory computer/processor readable storage media.

Claims

CLAIMS:

1. A computer-implemented method for performing automated interactive conversation with a user, the method comprising:
a. providing a user interface at which the user can provide a natural language input;
b. processing the natural language input with a data processing unit comprising at least one processor executing instructions, the instructions configured for:
i. deriving from the natural language input a possible question the user might be asking;
ii. conveying the possible question to the user through the user interface;
iii. processing user input at the user interface indicating that the possible question is incorrect;
iv. deriving a series of alternate questions that the user might be asking;
v. presenting the series of alternate questions to the user through the user interface.

2. The computer-implemented method of claim 1, wherein the instructions are further configured for: extracting at least one keyword from the natural language input, and deriving the series of alternate questions based on the at least one keyword.

3. The computer-implemented method of claim 2, wherein deriving the possible question the user might be asking comprises determining a user intent from the natural language input.

4. The computer-implemented method of claim 3, wherein deriving the possible question the user might be asking is performed without the at least one keyword.

5. The computer-implemented method of claim 3 or claim 4, wherein an algorithm for extracting the at least one keyword and/or an algorithm for determining the user intent is modified based on an indication from the user of which one of the alternate questions is a correct question.

6. The computer-implemented method of any one of claims 1 to 5, wherein the instructions are further configured for: receiving an indication, from the user, that a particular question of the series of alternate questions is correct, and presenting to the user through the user interface an answer to the particular question.

7. The computer-implemented method of claim 6, further comprising generating the answer to the particular question using user-specific financial information.

8. The computer-implemented method of claim 7, wherein the particular question is a finance-related question, and the user-specific financial information relates to financial transactions previously performed by the user and/or accounts held by the user.

9. A system for performing automated interactive conversation with a user, the system comprising:
a. a user interface configured to receive a natural language input from the user;
b. a data processing unit to process the natural language input, the data processing unit configured to:
i. derive from the natural language input a possible question the user might be asking;
ii. convey the possible question to the user through the user interface;
iii. process user input at the user interface indicating that the possible question is incorrect;
iv. derive a series of alternate questions that the user might be asking;
v. present the series of alternate questions to the user through the user interface.

10. The system of claim 9, wherein the data processing unit is further configured to: extract at least one keyword from the natural language input, and derive the series of alternate questions based on the at least one keyword.

11. The system of claim 10, wherein the data processing unit is further configured to derive the possible question the user might be asking by performing operations including determining a user intent from the natural language input.

12. The system of claim 11, wherein the data processing unit is configured to derive the possible question the user might be asking without using the at least one keyword.

13. The system of claim 11 or claim 12, wherein the data processing unit is configured to modify an algorithm for extracting the at least one keyword and/or modify an algorithm for determining the user intent based on an indication from the user of which one of the alternate questions is a correct question.

14. The system of any one of claims 9 to 13, wherein the data processing unit is configured to receive an indication, from the user, that a particular question of the series of alternate questions is correct, and present to the user through the user interface an answer to the particular question.

15. The system of claim 14, wherein the data processing unit is configured to generate the answer to the particular question using user-specific financial information.

16. The system of claim 15, wherein the particular question is a finance-related question, and the user-specific financial information relates to financial transactions previously performed by the user and/or accounts held by the user.

17. A computer-implemented method for interacting with a user, the method comprising:
a. providing a user interface at which a user can provide a natural language input conveying a finance related question or a finance related action to be performed;
b. processing the natural language input with a data processing unit comprising at least one processor executing instructions, the instructions configured for:
i. deriving from the natural language input a possible finance related question or possible finance related action;
ii. obtaining a series of candidate responses, each of which is a response to the possible finance related question or the possible finance related action;
iii. selecting one of the candidate responses on the basis of user-specific financial information;
iv. presenting the selected candidate response to the user through the user interface.

18. The computer-implemented method of claim 17, further comprising retrieving the user-specific financial information using an identifier of the user stored in memory.

19. The computer-implemented method of claim 17 or claim 18, wherein the user-specific financial information relates to financial transactions previously performed by the user and/or accounts held by the user.

20. A system for interacting with a user, the system comprising:
a. a user interface configured to receive a natural language input conveying a finance related question or a finance related action to be performed;
b. a data processing unit configured to process the natural language input, the data processing unit configured to:
i. derive from the natural language input a possible finance related question or possible finance related action;
ii. obtain a series of candidate responses, each of which is a response to the possible finance related question or the possible finance related action;
iii. select one of the candidate responses on the basis of user-specific financial information;
iv. present the selected candidate response to the user through the user interface.

21. The system of claim 20, wherein the data processing unit is further configured to retrieve the user-specific financial information using an identifier of the user stored in memory.

22. The system of claim 20 or claim 21, wherein the user-specific financial information relates to financial transactions previously performed by the user and/or accounts held by the user.