WO2020185880A1 - Conversational artificial intelligence for automated self-service account management - Google Patents

Conversational artificial intelligence for automated self-service account management

Info

Publication number
WO2020185880A1
Authority
WO
WIPO (PCT)
Prior art keywords
caller
account
data
speech
conversational
Prior art date
Application number
PCT/US2020/022074
Other languages
English (en)
Inventor
Kevin Michael GILLESPIE
Original Assignee
Beguided, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beguided, Inc.
Publication of WO2020185880A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/01 Customer relationship services
    • G06Q30/015 Providing customer assistance, e.g. assisting a customer within a business location or via helpdesk
    • G06Q30/016 After-sales
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M3/00 Automatic or semi-automatic exchanges
    • H04M3/42 Systems providing special services or facilities to subscribers
    • H04M3/487 Arrangements for providing information services, e.g. recorded voice services or time announcements
    • H04M3/493 Interactive information services, e.g. directory enquiries; Arrangements therefor, e.g. interactive voice response [IVR] systems or voice portals
    • H04M3/4936 Speech interaction details
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/284 Lexical analysis, e.g. tokenisation or collocates
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/18 Speech classification or search using natural language modelling
    • G10L15/1822 Parsing for meaning understanding

Definitions

  • This process is typically described as a “decision tree” wherein a defined number of possible caller/customer experiences is fully defined, and the entry points required to achieve the desired outcomes are limited to a specific set of inputs that the caller must use to trigger the intended treatment of the decision tree to route them to the desired outcome.
  • Existing technologies lack a conversational ability with the caller and cannot use previous experience with other callers to augment their responses by correlating unclear answers with their corresponding intended meaning. Additionally, the existing technologies cannot take free-form speech and analyze the intent of the speech to establish meaning as it relates to the various tasks the device is designed to perform.
  • Existing applications also cannot perform multiple tasks in parallel; the caller cannot engage the application in conversation while it is performing the previous tasks.
  • The technologies described herein can take free-form human speech and transcribe it into text, analyze the intent of that speech in relation to the defined tasks for which the device is purposed, establish a corollary to communicate back to the caller for affirmation or next steps, analyze host data relating to the caller’s intent, execute any number of parallel tasks relating to account management or retrieval of decisioning data points relating to the caller, competently respond to the caller in a truly conversational format with understanding of the caller’s intended goal with needed attributes that allow the caller to complete the caller’s intended objective in a self-directed fashion, and respond comprehensively to answer specific caller inquiries throughout the call experience.
  • This new technology enables intent-driven communication automation tools on the phone between automated technologies (e.g., bots, etc.) and human beings.
  • a service provider, vendor or other creditor may have several options for contacting a consumer or account-holder about their account.
  • a service provider, vendor or other creditor will seek to contact the borrower, consumer or account-holder, or their co-signor, as applicable, via telephone, physical mail, SMS text, and email in order to communicate with the borrower, consumer or account-holder, or their co-signor, as applicable, about, for example, past due amounts payable, pending or anticipated changes to the party’s account details or services, confirmation of changes to services or lending terms, credit authorization of new services or to otherwise renegotiate the payment terms agreement or take legal action to enforce the agreement, among other purposes. All of these tasks involve additional work and expense on the part of the vendor, service-provider or creditor. Contacting a borrower may require investment of additional time to locate the borrower’s contact information, if such information is not readily available.
  • Telecommunication Consumer Protection Act
  • restrictions on outbound contact methods from service providers require an emphasis on direct inbound communication strategies (i.e., creditors are restricted on initiating communications with consumers and often must wait for a consumer to contact them before they can take a desired corrective action related to an account).
  • the means that vendors, service-providers, and creditors use to engage with consumers who are trying to engage them include: call center operations, physical mail, and digital platforms such as social media.
  • the borrower may be uncooperative.
  • Embodiments of the present subject matter comprise systems and methods for the engineering, development, management, implementation and use of telephony-based applications for automated self-service account management.
  • Some embodiments of the subject matter disclosed herein comprise methods and systems for design and use of web-based applications for call flow management and routing. Some embodiments comprise an intelligent voice response (IVR) as part of a call flow management device to save time within the caller’s call routing experience.
  • IVR intelligent voice response
  • Some embodiments may comprise a static decisioning intelligence that can ascertain the purpose of the call directly from alpha-numeric entries made by the caller through the telephony experience, or from dual-tone multi-frequency (DTMF) signaling, in order to receive direct input from the caller in response to automated questions, apply understanding to those direct entries, and establish the intent of the caller; this creates a simplified self-service device that lets callers resolve their issue before reaching a live call center agent or hastens their hold experience before they reach a live call center agent.
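The static decisioning described above can be sketched as a lookup from key presses to intents. This is a minimal illustration only: the menu entries, the intent names, and the `route_dtmf` helper are hypothetical and do not come from the disclosure, and a standard 12-key keypad is assumed.

```python
from typing import Optional

# Hypothetical menu: each DTMF key maps to one caller intent (assumed names).
DTMF_MENU = {
    "1": "get_account_info",
    "2": "make_payment",
    "0": "transfer_to_agent",
}

# Keys on a standard 12-key telephone keypad (assumption).
VALID_KEYS = set("0123456789*#")


def route_dtmf(digits: str) -> Optional[str]:
    """Return the intent for the first key press found in the menu, or None."""
    for key in digits:
        if key not in VALID_KEYS:
            raise ValueError(f"not a DTMF key: {key!r}")
        if key in DTMF_MENU:
            return DTMF_MENU[key]
    return None
```

A caller pressing `2` would be routed to the assumed `make_payment` intent, while a key with no menu entry yields `None` so the application can re-prompt.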
  • Some embodiments of the subject matter disclosed herein comprise methods for speech-to-text (STT) conversion for the real-time processing of recorded speech through textual analysis tools that create accurate text conversion as an output.
  • STT speech-to-text
  • Some embodiments of the subject matter disclosed herein comprise a Conversational Artificial Intelligence application layer, the method comprising: data parsing which allows for the tokenization of relevant textual data to prepare it for machine learning models and auto-detection of cluster typos or incorrect speech-to-text results; and machine learning model techniques such as a Markov Decision Process (MDP).
  • MDP forms the basis for many reinforcement learning problem techniques as it provides the flexibility to implement a wide range of machine learning algorithms including deep learning (neural nets) and classification and an interpretable and auditable environment which provides for continuous human assisted improvement.
  • the conversational artificial intelligence system; the host data store comprising historic call data and account data; the telephony system configured to: receive free-form speech from a caller;
  • the conversational artificial intelligence system configured to: parse and tokenize the speech utterance; query the host data store to retrieve historic call data and account data for the caller; and apply a machine learning model to the tokenized speech utterance and the historic call data and account data for the caller to:
  • the machine learning model comprises a Markov Decision Process (MDP).
  • MDP Markov Decision Process
  • the MDP enables the artificial intelligence system to move freely across states and perform a plurality of tasks simultaneously.
  • the machine learning model comprises an artificial neural network (ANN).
  • the account is a credit account.
  • the one or more pre-defined tasks comprises: get information about an account or make a payment on an account.
  • the conversational artificial intelligence system is further configured to execute any number of parallel tasks relating to account management or retrieval of decisioning data points relating to the caller.
  • the platform is configured to comprehensively answer specific caller inquiries throughout the call experience.
  • the conversational artificial intelligence system is further configured to provide a log tracking process for a person-in-the-loop procedure in order to create supervised transition probabilities.
  • the telephony system is further configured to provide an interactive voice response (IVR) system.
  • the telephony system is further configured to receive dual-tone multi-frequency (DTMF) signaling.
  • a host data store comprising historic call data and account data
  • receiving, via a telephony system, free-form speech from a caller; transcribing the free-form speech to generate a speech utterance; parsing and tokenizing the speech utterance; querying the host data store to retrieve historic call data and account data for the caller; and applying a machine learning model to the tokenized speech utterance and the historic call data and account data for the caller to: identify an intended objective of the caller in relation to one or more pre-defined account management tasks; and execute the intended objective; or establish a corollary to respond to the caller in a conversational format via the telephony system for affirmation or additional data needed to execute the intended objective.
  • the machine learning model comprises a Markov Decision Process (MDP).
  • MDP Markov Decision Process
  • the MDP enables the artificial intelligence system to move freely across states and perform a plurality of tasks simultaneously.
  • the machine learning model comprises an artificial neural network (ANN).
  • ANN artificial neural network
  • the account is a credit account.
  • the one or more pre-defined tasks comprises: get information about an account or make a payment on an account.
  • applying the machine learning model comprises executing any number of parallel tasks relating to account management or retrieval of decisioning data points relating to the caller.
  • the method further comprises comprehensively answering specific caller inquiries throughout the call experience.
  • the method further comprises providing a log tracking process for a person-in-the-loop procedure in order to create supervised transition probabilities.
  • the method further comprises receiving, via the telephony system, interactive voice responses (IVR).
  • IVR interactive voice responses
  • the method further comprises receiving, via the telephony system, dual-tone multi-frequency
  • Fig. 1 shows a non-limiting example of a high-level schematic diagram of the
  • FIG. 2 shows a non-limiting example of a topological diagram of the flow of information through the various systems and stages described herein, which derive successful output responses delivered as automated communications;
  • FIG. 3 shows a non-limiting example of a digital processing device; in this case, a device with one or more CPUs, a memory, a communication interface, and a display;
  • FIG. 4 shows a non-limiting example of a web/mobile application provision system; in this case, a system providing browser-based and/or native mobile user interfaces; and
  • Fig. 5 shows a non-limiting example of a cloud-based web/mobile application provision system; in this case, a system comprising an elastically load balanced, auto-scaling web server and application server resources as well as synchronously replicated databases.
  • Described herein, in certain embodiments, is a computer-implemented platform for automated self-service account management comprising a host data store, a telephony system, and a conversational artificial intelligence system: the host data store comprising historic call data and account data; the telephony system configured to: receive free-form speech from a caller; transcribe the free-form speech to generate a speech utterance; and provide the speech utterance to the conversational artificial intelligence system; the conversational artificial intelligence system configured to: parse and tokenize the speech utterance; query the host data store to retrieve historic call data and account data for the caller; and apply a machine learning model to the tokenized speech utterance and the historic call data and account data for the caller to:
  • the machine learning model comprises a Markov Decision Process (MDP).
  • MDP Markov Decision Process
  • the MDP enables the artificial intelligence system to move freely across states and perform a plurality of tasks simultaneously.
  • the account is a credit account.
  • the one or more pre-defined tasks comprises: get information about an account or make a payment on an account.
  • the conversational artificial intelligence system is further configured to execute any number of parallel tasks relating to account management or retrieval of decisioning data points relating to the caller.
  • the platform is configured to comprehensively answer specific caller inquiries throughout the call experience.
  • the conversational artificial intelligence system is further configured to provide a log tracking process for a person-in-the-loop procedure in order to create supervised transition probabilities.
  • a host data store comprising historic call data and account data
  • receiving, via a telephony system, free-form speech from a caller; transcribing the free-form speech to generate a speech utterance; parsing and tokenizing the speech utterance; querying the host data store to retrieve historic call data and account data for the caller; and applying a machine learning model to the tokenized speech utterance and the historic call data and account data for the caller to: identify an intended objective of the caller in relation to one or more pre-defined account management tasks; and execute the intended objective; or establish a corollary to respond to the caller in a conversational format via the telephony system for affirmation or additional data needed to execute the intended objective.
  • the machine learning model comprises a Markov Decision Process (MDP).
  • MDP enables the artificial intelligence system to move freely across states and perform a plurality of tasks simultaneously.
  • the machine learning model comprises an artificial neural network (ANN).
  • ANN artificial neural network
  • the account is a credit account.
  • the one or more pre-defined tasks comprises: get information about an account or make a payment on an account.
  • applying the machine learning model comprises executing any number of parallel tasks relating to account management or retrieval of decisioning data points relating to the caller.
  • the method further comprises comprehensively answering specific caller inquiries throughout the call experience.
  • the method further comprises providing a log tracking process for a person-in-the-loop procedure in order to create supervised transition probabilities.
  • the method further comprises receiving interactive voice responses (IVR).
  • the method further comprises receiving dual-tone multi-frequency (DTMF) signaling.
  • IVR interactive voice responses
  • DTMF dual-tone multi-frequency
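The method summarized in the embodiments above (tokenize the transcribed utterance, query the host data store, then either execute the caller's objective or respond with a corollary asking for affirmation) might be sketched as follows. This is a simplified illustration under stated assumptions: the keyword rules stand in for the machine learning model, and `Decision`, `TASK_KEYWORDS`, `decide`, and `handle_utterance` are hypothetical names that do not appear in the disclosure.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Decision:
    objective: Optional[str]  # identified pre-defined task, if any
    execute: bool             # True: run the task; False: ask the caller
    reply: str                # text to speak back to the caller


def tokenize(utterance: str) -> List[str]:
    """Lowercase, replace punctuation with spaces, split on whitespace."""
    cleaned = "".join(c if c.isalnum() or c.isspace() else " " for c in utterance)
    return cleaned.lower().split()


# Hypothetical keyword rules standing in for the machine learning model.
TASK_KEYWORDS = {
    "make_payment": {"pay", "payment"},
    "get_account_info": {"balance", "due", "information"},
}


def decide(tokens: List[str], account: Dict[str, object]) -> Decision:
    """Execute when exactly one task matches; otherwise ask for clarification."""
    matches = [task for task, kws in TASK_KEYWORDS.items() if kws & set(tokens)]
    if len(matches) != 1:
        return Decision(
            None, False,
            "Would you like account information or to make a payment?",
        )
    task = matches[0]
    if task == "get_account_info":
        reply = f"Your balance is {account.get('balance', 'unavailable')}."
    else:
        reply = "Okay, let's set up your payment."
    return Decision(task, True, reply)


def handle_utterance(
    utterance: str, caller_id: str, host_data_store: Dict[str, dict]
) -> Decision:
    tokens = tokenize(utterance)
    account = host_data_store.get(caller_id, {})  # account data for the caller
    return decide(tokens, account)
```

With a store such as `{"c1": {"balance": 42.0}}`, the utterance "I'd like to make a payment, please." resolves to the assumed `make_payment` task, while an ambiguous or unrecognized utterance produces a clarifying question instead of an action.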
  • a caller is connected via telephony (e.g., a provided phone number to reach the telephony application platform; or the telephony environment).
  • the caller speaks into their telephone and the telephony application platform receives a “speech utterance” in a speech gather process. That speech utterance is sent via API in an MP3 format from the telephony platform to existing technologies for speech-to-text transcription. The transcription is then relayed as input to AVA in a textual format 110.
  • the speech-to-text input for AVA is created via existing technologies for speech-to-text transcription and delivered to the AVA machine learning models 120.
  • AVA 120 uses data parsing to parse and tokenize the relevant textual data of the message in order to ready it for machine learning models (e.g., the Markov Decision Process).
  • Data parsing includes a combination of open source tools such as the Natural Language Toolkit (NLTK 3.4) and a machine learning-based tokenizer which can auto-detect and cluster typos together to reduce the training vocabulary size.
  • NLTK 3.4 Natural Language Toolkit
  • machine learning-based tokenizer which can auto-detect and cluster typos together to reduce the training vocabulary size.
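To illustrate how clustering typos can shrink the training vocabulary, here is a minimal sketch. It is not the machine learning-based tokenizer described above: it swaps in a plain string-similarity heuristic from Python's standard `difflib` module, and the `cluster_typos` helper and the 0.8 threshold are assumptions made for illustration.

```python
import re
from difflib import SequenceMatcher
from typing import Dict, List


def tokenize(text: str) -> List[str]:
    """Split lowercase text into alphanumeric tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def cluster_typos(tokens: List[str], threshold: float = 0.8) -> Dict[str, str]:
    """Map each distinct token to a canonical form.

    The first token seen in a cluster becomes its canonical form, so a
    typo that appears before the correct spelling would be kept as canonical.
    """
    canonical: List[str] = []
    mapping: Dict[str, str] = {}
    for tok in tokens:
        if tok in mapping:
            continue
        for rep in canonical:
            # Similarity ratio in [0, 1]; near-identical spellings score high.
            if SequenceMatcher(None, tok, rep).ratio() >= threshold:
                mapping[tok] = rep
                break
        else:
            canonical.append(tok)
            mapping[tok] = tok
    return mapping
```

For example, `cluster_typos(tokenize("payment paymnet balance"))` maps both `payment` and `paymnet` to `payment`, leaving two vocabulary entries instead of three.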
  • the Markov Decision Process and other Natural Language Processing (NLP) and/or Natural Language Understanding (NLU) techniques seek to understand intent of the speech utterance and to follow the Markov Decision Process state/actions as well as execute probability function approximations resulting in: responses to the caller with information 160 and making requests via API 130 of the source data 140 required for specific tasks related to the specific state-actions linked to the initial speech (e.g., authentication of a caller, source data relating to the caller, etc.).
  • NLP Natural Language Processing
  • NLU Natural Language Understanding
  • The probability distributions in AVA’s Markov Decision Process 120 move freely across the various states and perform many tasks simultaneously, such as speaking to the caller 160 while performing a look up of information 130, 140, 150, interpreting that data, and executing the resultant state action.
  • the systems and methods described herein include a Markov Decision Process.
  • a Markov Decision Process expressed as mathematical formulae.
  • An MDP is a tuple (S, A, P, R, γ), where S is a state space, A is a finite set of actions, P is the state transition probability function, R is the reward function, and γ is a discount factor (γ ∈ [0, 1]).
  • S - The State Space is an exhaustive set of states that the model understands and links to “state-actions,” which include responding to the customer, sending a task out to a client system (making a note, retrieving billing information), ending the call, transferring the call, and more.
  • A - The set of actions (different from state-actions) defines the movements possible throughout the network. These actions (as well as the states) were built based on the
  • P - The state transition probability function is approximated by machine learning models; a semi-supervised pipeline consisting of Doc2Vec and a classification model trained on the movements between States learned from our analysis of the voice samples.
  • the inputs to the models consist of both caller inputs as well as the sourced client data required for decisioning.
  • Doc2Vec models for understanding text in a numerical format use an additional vector (e.g., “document ID”) that will expand upon the broader ML concept of feature vectors as with Word2Vec.
  • the additional vector trained in addition to the word vectors establishes a concept of a “document” (e.g., a complete transcription of a speech utterance from a caller) through creating a numerical representation of the document (or label), in lieu of the Word2Vec establishment of the concept of a word with a numerical representation of each word.
  • Word2Vec is a two-layer neural net that processes text using Continuous Bag of Words (CBOW) and Skip-Gram (skip-gram). Continuous Bag of Words predicts a word from a series of surrounding (context) words, and skip-gram predicts all surrounding words (or context) by using just one word.
  • Whereas Word2Vec trains word vectors to predict the next word by giving a numerical representation to each word through its use of CBOW and skip-gram, Doc2Vec trains word vectors and additionally trains a document vector to create a numerical representation of the document, regardless of its length.
  • Doc2Vec is an extension of Word2Vec with unsupervised learning of continuous representations of larger blocks of text (e.g., sentences, labels, documents, etc.).
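The difference between the two Word2Vec training schemes is easiest to see in the training pairs each one generates. In the standard formulation, CBOW uses the surrounding context words to predict the center word, and skip-gram uses the center word to predict each context word. The sketch below only builds those pairs; it trains no model, and the function names and `window` parameter are illustrative assumptions.

```python
from typing import List, Tuple


def cbow_pairs(tokens: List[str], window: int = 2) -> List[Tuple[List[str], str]]:
    """(context words, target word): the context predicts the center word."""
    pairs = []
    for i, target in enumerate(tokens):
        context = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
        if context:
            pairs.append((context, target))
    return pairs


def skipgram_pairs(tokens: List[str], window: int = 2) -> List[Tuple[str, str]]:
    """(center word, context word): one word predicts each of its neighbors."""
    pairs = []
    for i, center in enumerate(tokens):
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs
```

For the utterance tokens `["make", "a", "payment"]` with `window=1`, CBOW yields one pair per center word, such as `(["make", "payment"], "a")`, while skip-gram yields one pair per neighbor, such as `("a", "make")` and `("a", "payment")`.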
  • R - In the first version of AVA, the reward function at this state is fixed upon whether or not the customer was able to make a payment, payment arrangement or restore their services. In future versions of AVA, customer interactions will provide a better reward estimation. The reward function helps to update the probability function defined in the previous bullet point.
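The tuple described above can be written down as a small data structure. The sketch below is an illustration under stated assumptions, not AVA's implementation: it estimates the transition function P from simple frequency counts of logged transitions rather than from the Doc2Vec and classification pipeline the text describes, and the state names, the `MDP` class, and `SUCCESS_STATES` are hypothetical.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class MDP:
    states: List[str]    # S: state space
    actions: List[str]   # A: finite set of actions
    gamma: float = 0.9   # discount factor, must lie in [0, 1]
    # counts[(state, action)][next_state] = number of logged transitions
    counts: Dict[Tuple[str, str], Counter] = field(
        default_factory=lambda: defaultdict(Counter)
    )

    def __post_init__(self) -> None:
        if not 0.0 <= self.gamma <= 1.0:
            raise ValueError("gamma must lie in [0, 1]")

    def log_transition(self, state: str, action: str, next_state: str) -> None:
        """Record one observed (e.g., human-reviewed) transition."""
        self.counts[(state, action)][next_state] += 1

    def transition_prob(self, state: str, action: str, next_state: str) -> float:
        """P: empirical estimate of P(next_state | state, action)."""
        seen = self.counts.get((state, action), Counter())
        total = sum(seen.values())
        return seen[next_state] / total if total else 0.0


# Hypothetical terminal states matching the fixed reward described above:
# a payment, a payment arrangement, or restored services.
SUCCESS_STATES = {"payment_made", "payment_arranged", "service_restored"}


def reward(state: str) -> float:
    """R: 1.0 when the call reached a successful outcome, else 0.0."""
    return 1.0 if state in SUCCESS_STATES else 0.0
```

Logging three transitions from an assumed `collect_payment` state to `payment_made` and one back to `greeting` gives an estimated probability of 0.75 for the successful transition, which is the kind of quantity a reviewed call log could supply.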
  • the AVA environment encapsulates the telephony application platform for the auditory caller experience, where speech-to-text (STT) conversion and text-to-speech (TTS) conversion occur via existing technologies reached via API 205; the AI environment 210, where data parsing and the Markov Decision Process functions are executed 220, 225; and the retrieval of source data, which communicates via API with the source data systems as required by specific state/actions (e.g., authenticating a caller, retrieval of account management information, etc.).
  • STT speech-to-text conversion
  • TTS text-to-speech
  • the telephony environment contains existing technologies for speech-to-text conversions via existing REST APIs for STT
  • the source data is reached via API for the AI environment to access and interpret 220, and sometimes relay to the customer 225 based on the approximated state/action that applies.
  • The probability distributions in AVA’s Markov Decision Process 225 move freely across the various states and perform many tasks simultaneously, such as speaking to the caller 225, 205 while performing a look up of information 220, 215, 225, interpreting that data, and executing the resultant state action.
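Speaking to the caller while a lookup is in flight can be sketched with Python's standard `asyncio` module. This is an illustration only: the coroutine names are hypothetical, and the short sleeps stand in for text-to-speech playback and an API round trip to the source data system.

```python
import asyncio
from typing import Dict, List, Tuple


async def speak_to_caller(text: str, log: List[str]) -> None:
    log.append(f"say: {text}")
    await asyncio.sleep(0.01)  # stands in for text-to-speech playback


async def look_up_account(
    caller_id: str, store: Dict[str, dict], log: List[str]
) -> dict:
    log.append(f"lookup: {caller_id}")
    await asyncio.sleep(0.01)  # stands in for an API round trip
    return store.get(caller_id, {})


async def handle_turn(
    caller_id: str, store: Dict[str, dict]
) -> Tuple[dict, List[str]]:
    """Run the spoken response and the data lookup concurrently."""
    log: List[str] = []
    _, account = await asyncio.gather(
        speak_to_caller("One moment while I pull up your account.", log),
        look_up_account(caller_id, store, log),
    )
    return account, log
```

Calling `asyncio.run(handle_turn("c1", {"c1": {"balance": 42.0}}))` returns the account record together with a log showing that both tasks were started before either finished.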
  • the platforms, systems, media, and methods described herein include a digital processing device, or use of the same.
  • the digital processing device includes one or more hardware central processing units (CPUs) or general purpose graphics processing units (GPGPUs) that carry out the device’s functions.
  • the digital processing device further comprises an operating system configured to perform executable instructions.
  • the digital processing device is optionally connected to a computer network.
  • the digital processing device is optionally connected to the Internet such that it accesses the World Wide Web.
  • the digital processing device is optionally connected to a cloud computing infrastructure.
  • the digital processing device is optionally connected to an intranet.
  • the digital processing device is optionally connected to a data storage device.
  • suitable digital processing devices include, by way of non-limiting examples, server computers, desktop computers, laptop computers, notebook computers, sub-notebook computers, netbook computers, netpad computers, set-top computers, media streaming devices, handheld computers, Internet appliances, mobile smartphones, tablet computers, personal digital assistants, video game consoles, and vehicles.
  • server computers, desktop computers, laptop computers, notebook computers, sub-notebook computers, netbook computers, netpad computers, set-top computers, media streaming devices, handheld computers, Internet appliances, mobile smartphones, tablet computers, personal digital assistants, video game consoles, and vehicles.
  • smartphones are suitable for use in the system described herein.
  • Suitable tablet computers include those with booklet, slate, and convertible
  • the digital processing device includes an operating system configured to perform executable instructions.
  • the operating system is, for example, software, including programs and data, which manages the device’s hardware and provides services for execution of applications.
  • suitable server operating systems include, by way of non-limiting examples, FreeBSD, OpenBSD, NetBSD®, Linux, Apple® Mac OS X Server®, Oracle® Solaris®, Windows Server®, and Novell® NetWare®.
  • suitable personal computer operating systems include, by way of non-limiting examples, Microsoft® Windows®, Apple® Mac OS X®, UNIX®, and UNIX-like operating systems such as GNU/Linux®.
  • the operating system is provided by cloud computing.
  • suitable mobile smartphone operating systems include, by way of non-limiting examples, Nokia® Symbian® OS, Apple® iOS®, Research In Motion® BlackBerry OS®, Google® Android®, Microsoft® Windows Phone® OS, Microsoft® Windows Mobile® OS, Linux®, and Palm® WebOS®.
  • suitable media streaming device operating systems include, by way of non-limiting examples, Apple TV®, Roku®, Boxee®, Google TV®, Google Chromecast®, Amazon Fire®, and Samsung® HomeSync®.
  • suitable video game console operating systems include, by way of non-limiting examples, Sony® PS3®, Sony® PS4®, Microsoft® Xbox 360®, Microsoft Xbox One, Nintendo® Wii®, Nintendo® Wii U®, and Ouya®.
  • the device includes a storage and/or memory device.
  • the storage and/or memory device is one or more physical apparatuses used to store data or programs on a temporary or permanent basis.
  • the device is volatile memory and requires power to maintain stored information.
  • the device is non-volatile memory and retains stored information when the digital processing device is not powered.
  • the non-volatile memory comprises flash memory.
  • the volatile memory comprises dynamic random-access memory (DRAM).
  • the non-volatile memory comprises ferroelectric random access memory (FRAM).
  • the non-volatile memory comprises phase-change random access memory (PRAM).
  • the device is a storage device including, by way of non-limiting examples, CD-ROMs, DVDs, flash memory devices, magnetic disk drives, magnetic tapes drives, optical disk drives, and cloud computing based storage.
  • the storage and/or memory device is a combination of devices such as those disclosed herein.
  • the digital processing device includes a display to send visual information to a user.
  • the display is a cathode ray tube (CRT).
  • the display is a liquid crystal display (LCD).
  • the display is a thin film transistor liquid crystal display (TFT-LCD).
  • the display is an organic light emitting diode (OLED) display.
  • OLED organic light emitting diode
  • an OLED display is a passive-matrix OLED (PMOLED) or active-matrix OLED (AMOLED) display.
  • the display is a plasma display. In other embodiments, the display is a video projector.
  • the digital processing device includes an input device to receive information from a user.
  • the input device is a keyboard.
  • the input device is a pointing device including, by way of non-limiting examples, a mouse, trackball, track pad, joystick, game controller, or stylus.
  • the input device is a touch screen or a multi-touch screen.
  • the input device is a microphone to capture voice or other sound input.
  • the input device is a video camera or other sensor to capture motion or visual input.
  • the input device is a Kinect, Leap Motion, or the like.
  • the input device is a combination of devices such as those disclosed herein.
  • an exemplary digital processing device 301 is programmed or otherwise configured to conduct telephony, store and retrieve caller data, and apply machine learning algorithms.
  • the digital processing device 301 includes a central processing unit (CPU, also “processor” and “computer processor” herein) 305, which can be a single-core or multi-core processor, or a plurality of processors for parallel processing.
  • the digital processing device 301 also includes memory or memory location 310 (e.g., random-access memory, read-only memory, flash memory), electronic storage unit 315 (e.g., hard disk), communication interface 320 (e.g., network adapter) for communicating with one or more other systems, and peripheral devices 325, such as cache, other memory, data storage and/or electronic display adapters.
  • the memory 310, storage unit 315, interface 320 and peripheral devices 325 are in communication with the CPU 305 through a communication bus (solid lines), such as a motherboard.
  • the storage unit 315 can be a data storage unit (or data repository) for storing data.
  • the digital processing device 301 can be operatively coupled to a computer network (“network”) 330 with the aid of the communication interface 320.
  • the network 330 can be the Internet, an internet and/or extranet, or an intranet and/or extranet that is in communication with the Internet.
  • the network 330 in some cases is a telecommunication and/or data network.
  • the network 330 can include one or more computer servers, which can enable distributed computing, such as cloud computing.
  • the network 330, in some cases with the aid of the device 301, can implement a peer-to-peer network, which may enable devices coupled to the device 301 to behave as a client or a server.
  • the CPU 305 can execute a sequence of machine-readable instructions, which can be embodied in a program or software.
  • the instructions may be stored in a memory location, such as the memory 310.
  • the instructions can be directed to the CPU 305, which can subsequently program or otherwise configure the CPU 305 to implement methods of the present disclosure. Examples of operations performed by the CPU 305 can include fetch, decode, execute, and write back.
  • the CPU 305 can be part of a circuit, such as an integrated circuit. One or more other components of the device 301 can be included in the circuit. In some cases, the circuit is an application specific integrated circuit (ASIC) or a field programmable gate array (FPGA).
  • ASIC application specific integrated circuit
  • FPGA field programmable gate array
  • the storage unit 315 can store files, such as drivers, libraries and saved programs.
  • the storage unit 315 can store user data, e.g., user preferences and user programs.
  • the digital processing device 301 in some cases can include one or more additional data storage units that are external, such as located on a remote server that is in communication through an intranet or the Internet.
  • the digital processing device 301 can communicate with one or more remote computer systems through the network 330.
  • the device 301 can communicate with a remote computer system of a user.
  • remote computer systems include servers, personal computers (e.g., portable PC), slate or tablet computers (e.g., Apple ® iPad, Samsung ® Galaxy Tab), telephones, smartphones (e.g., Apple ® iPhone, Android-enabled device, Blackberry ® ), or personal digital assistants.
  • Methods as described herein can be implemented by way of machine (e.g., computer processor) executable code stored on an electronic storage location of the digital processing device 301, such as, for example, on the memory 310 or electronic storage unit 315.
  • the machine executable or machine readable code can be provided in the form of software.
  • the code can be executed by the processor 305.
  • the code can be retrieved from the storage unit 315 and stored on the memory 310 for ready access by the processor 305.
  • the electronic storage unit 315 can be omitted, in which case machine-executable instructions are stored on memory 310.
  • Non-transitory computer readable storage medium
  • the platforms, systems, media, and methods disclosed herein include one or more non-transitory computer readable storage media encoded with a program including instructions executable by the operating system of an optionally networked digital processing device.
  • a computer readable storage medium is a tangible component of a digital processing device.
  • a computer readable storage medium is optionally removable from a digital processing device.
  • a computer readable storage medium includes, by way of non-limiting examples, CD-ROMs, DVDs, flash memory devices, solid state memory, magnetic disk drives, magnetic tape drives, optical disk drives, distributed computing systems including cloud computing systems and services, and the like.
  • the program and instructions are permanently, substantially permanently, semi-permanently, or non-transitorily encoded on the media.
  • the platforms, systems, media, and methods disclosed herein include at least one computer program, or use of the same.
  • a computer program includes a sequence of instructions, executable in the digital processing device’s CPU, written to perform a specified task.
  • Computer readable instructions may be implemented as program modules, such as functions, objects, Application Programming Interfaces (APIs), data structures, and the like, that perform particular tasks or implement particular abstract data types.
  • a computer program may be written in various versions of various languages.
  • a computer program comprises one sequence of instructions. In some embodiments, a computer program comprises a plurality of sequences of instructions. In some embodiments, a computer program is provided from one location. In other embodiments, a computer program is provided from a plurality of locations. In various embodiments, a computer program includes one or more software modules. In various embodiments, a computer program includes, in part or in whole, one or more web applications, one or more mobile applications, one or more standalone applications, one or more web browser plug-ins, extensions, add-ins, or add-ons, or combinations thereof.
  • a computer program includes a web application.
  • a web application in various embodiments, utilizes one or more software frameworks and one or more database systems.
  • a web application is created upon a software framework such as Microsoft ® .NET or Ruby on Rails (RoR).
  • a web application utilizes one or more database systems including, by way of non-limiting examples, relational, non-relational, object oriented, associative, and XML database systems.
  • suitable relational database systems include, by way of non-limiting examples, Microsoft ® SQL Server, MySQL™, and Oracle ® .
  • a web application in various embodiments, is written in one or more versions of one or more languages.
  • a web application may be written in one or more markup languages, presentation definition languages, client-side scripting languages, server-side coding languages, database query languages, or combinations thereof.
  • a web application is written to some extent in a markup language such as Hypertext Markup Language (HTML), Extensible Hypertext Markup Language (XHTML), or Extensible Markup Language (XML).
  • a web application is written to some extent in a presentation definition language such as Cascading Style Sheets (CSS).
  • a web application is written to some extent in a client-side scripting language such as Asynchronous JavaScript and XML (AJAX), Flash ® ActionScript, JavaScript, or Silverlight ® .
  • a web application is written to some extent in a server-side coding language such as Active Server Pages (ASP), ColdFusion ® , Perl, Java™, JavaServer Pages (JSP), Hypertext Preprocessor (PHP), Python™, Ruby, Tcl, Smalltalk, WebDNA ® , or Groovy.
  • a web application is written to some extent in a database query language such as Structured Query Language (SQL).
  • a web application integrates enterprise server products such as IBM ® Lotus Domino ® .
  • a web application includes a media player element.
  • a media player element utilizes one or more of many suitable multimedia technologies including, by way of non-limiting examples, Adobe ® Flash ® , HTML 5, Apple ® QuickTime ® , Microsoft ® Silverlight ® , Java™, and Unity ® .
  • an application provision system comprises one or more databases 400 accessed by a relational database management system (RDBMS) 410.
  • RDBMSs include Firebird, MySQL, PostgreSQL, SQLite, Oracle Database, Microsoft SQL Server, IBM DB2, IBM Informix, SAP Sybase, Teradata, and the like.
  • the application provision system further comprises one or more application servers 420 (such as Java servers, .NET servers, PHP servers, and the like) and one or more web servers 430 (such as Apache, IIS, GWS and the like).
  • the web server(s) optionally expose one or more web services via application programming interfaces (APIs) 440.
  • an application provision system alternatively has a distributed, cloud-based architecture 500 and comprises elastically load balanced, auto-scaling web server resources 510 and application server resources 520 as well as synchronously replicated databases 530.
  • a computer program includes a mobile application provided to a mobile digital processing device.
  • the mobile application is provided to a mobile digital processing device at the time it is manufactured.
  • the mobile application is provided to a mobile digital processing device via the computer network described herein.
  • a mobile application is created by techniques known to those of skill in the art using hardware, languages, and development environments known to the art. Those of skill in the art will recognize that mobile applications are written in several languages. Suitable programming languages include, by way of non-limiting examples,
  • Suitable mobile application development environments are available from several sources. Commercially available development environments include, by way of non-limiting examples, AirplaySDK, alcheMo, Appcelerator ® , Celsius, Bedrock, Flash Lite, .NET Compact Framework, Rhomobile, and WorkLight Mobile Platform. Other development environments are available without cost including, by way of non-limiting examples, Lazarus, MobiFlex, MoSync, and Phonegap. Also, mobile device manufacturers distribute software developer kits including, by way of non-limiting examples, iPhone and iPad (iOS) SDK, Android™ SDK, BlackBerry ® SDK, BREW SDK, Palm ® OS SDK, Symbian SDK, webOS SDK, and Windows ® Mobile SDK.
  • a computer program includes a standalone application, which is a program that is run as an independent computer process, not an add-on to an existing process, e.g., not a plug-in.
  • standalone applications are often compiled.
  • a compiler is a computer program(s) that transforms source code written in a programming language into binary object code such as assembly language or machine code. Suitable compiled programming languages include, by way of non-limiting examples, C, C++, Objective-C, COBOL, Delphi, Eiffel, Java™, Lisp, Python™, Visual Basic, and VB .NET, or combinations thereof. Compilation is often performed, at least in part, to create an executable program.
  • a computer program includes one or more executable compiled applications.
  • the computer program includes a web browser plug-in (e.g., extension, etc.).
  • a plug-in is one or more software components that add specific functionality to a larger software application. Makers of software applications support plug-ins to enable third-party developers to create abilities which extend an application, to support easily adding new features, and to reduce the size of an application. When supported, plug-ins enable customizing the functionality of a software application. For example, plug-ins are commonly used in web browsers to play video, generate interactivity, scan for viruses, and display particular file types.
  • the toolbar comprises one or more web browser extensions, add-ins, or add-ons.
  • the toolbar comprises one or more explorer bars, tool bands, or desk bands.
  • plug-in frameworks are available that enable development of plug-ins in various programming languages, including, by way of non-limiting examples, C++, Delphi, Java™, PHP, Python™, and VB .NET, or combinations thereof.
  • Web browsers are software applications, designed for use with network-connected digital processing devices, for retrieving, presenting, and traversing information resources on the World Wide Web. Suitable web browsers include, by way of non-limiting examples, Microsoft ® Internet Explorer ® , Mozilla ® Firefox ® , Google ® Chrome, Apple ® Safari ® , Opera Software ® Opera ® , and KDE Konqueror. In some embodiments, the web browser is a mobile web browser.
  • Mobile web browsers are designed for use on mobile digital processing devices including, by way of non-limiting examples, handheld computers, tablet computers, netbook computers, subnotebook computers, smartphones, music players, personal digital assistants (PDAs), and handheld video game systems.
  • Suitable mobile web browsers include, by way of non-limiting examples, Google ® Android ® browser, RIM BlackBerry ® Browser, Apple ® Safari ® , Palm ® Blazer, Palm ® WebOS ® Browser, Mozilla ® Firefox ® for mobile, Microsoft ® Internet Explorer ® Mobile, Amazon ® Kindle ® Basic Web, Nokia ® Browser, Opera Software ® Opera ® Mobile, and Sony ® PSP™ browser.
  • the platforms, systems, media, and methods disclosed herein include software, server, and/or database modules, or use of the same.
  • software modules are created by techniques known to those of skill in the art using machines, software, and languages known to the art.
  • the software modules disclosed herein are implemented in a multitude of ways.
  • a software module comprises a file, a section of code, a programming object, a programming structure, or combinations thereof.
  • a software module comprises a plurality of files, a plurality of sections of code, a plurality of programming objects, a plurality of programming structures, or combinations thereof.
  • the one or more software modules comprise, by way of non-limiting examples, a web application, a mobile application, and a standalone application.
  • software modules are in one computer program or application. In other embodiments, software modules are in more than one computer program or application. In some embodiments, software modules are hosted on one machine. In other embodiments, software modules are hosted on more than one machine. In further embodiments, software modules are hosted on cloud computing platforms. In some embodiments, software modules are hosted on one or more machines in one location. In other embodiments, software modules are hosted on one or more machines in more than one location.
  • the platforms, systems, media, and methods disclosed herein include one or more databases, or use of the same.
  • suitable databases include, by way of non-limiting examples, relational databases, non-relational databases, object-oriented databases, object databases, entity-relationship model databases, associative databases, and XML databases. Further non-limiting examples include SQL, PostgreSQL, MySQL, Oracle, DB2, and Sybase.
  • a database is internet-based.
  • a database is web-based.
  • a database is cloud computing-based.
  • a database is based on one or more local computer storage devices.
  • a telephony environment which can be described as a software application platform for further development is first established.
  • This environment has a receiving phone number (or phone numbers) by which a caller or transfer application can reach the telephony environment.
  • the software application environment is where all the transcription, voice and necessary experiential data points will flow through.
  • Data flows to the AI model which is in the cloud, flows to host systems that hold account management data, and flows back through the telephony with voice applications providing an auditory experience to the caller.
  • the organization of this flow is essential and key to the specific design of the conversational AI.
  • Source caller data from the client containing existing historical caller experiences in their entirety is required. This data must be at a specific level of scale in order to successfully inform the AI for its initial approximations, establishment of states/actions and the development of Representative Turn Groups which impact how the set of actions can define the movements across the network.
  • For the MDP, potential values are provided for each component of the tuple.
  • the state space values can be created, at first, through a manual process. Specifically, by establishing the total potential variety of entries that all equate to the same state.
  • a definitive list for affirmation or “yes” states might include: “Yup,” “ok,” “Okie,” “alrighty,” “absolutely,” “of course,” “no problem,” “sure,” “alright,” etc.
  • These and all utterances that might be colloquial or otherwise regional, or that might be specific parlance relating to anachronistic or business-related (task-related) language (such as 401K management, for example), or acronyms that relate to business language (such as “HSI,” standing for “High Speed Internet,” for example), are logged and placed as specific state-space values.
  • the set of actions (different from state-actions) defines the movements possible throughout the network.
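  • The manual construction of state-space values described above can be sketched as a lookup table that collapses many surface forms into one state. This is a minimal illustration only: the state names (`AFFIRM`, `TOPIC_HIGH_SPEED_INTERNET`, `UNKNOWN`) and the helper functions are hypothetical and are not taken from the disclosure; only the example utterances come from the text.

```python
import re

# Hypothetical state-space table: each state lists the surface forms that
# should all resolve to it. The utterances come from the examples in the
# text; the state names are assumptions made for this sketch.
STATE_SPACE = {
    "AFFIRM": ["yup", "ok", "okie", "alrighty", "absolutely", "of course",
               "no problem", "sure", "alright"],
    "TOPIC_HIGH_SPEED_INTERNET": ["high speed internet"],
}


def normalize(utterance: str) -> str:
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9\s]", " ", utterance.lower())
    return " ".join(cleaned.split())


# Invert the table once so each lookup is a single dictionary access.
SURFACE_TO_STATE = {
    normalize(form): state
    for state, forms in STATE_SPACE.items()
    for form in forms
}


def to_state(utterance: str, default: str = "UNKNOWN") -> str:
    """Map a transcribed utterance to its state, or to a default state."""
    return SURFACE_TO_STATE.get(normalize(utterance), default)
```

  • For instance, `to_state("Of course!")` and `to_state("Alrighty")` both resolve to the same affirmation state, while an utterance that has not yet been logged falls through to the default and could be queued for manual review.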
  • the state transition probability function is approximated by machine learning models; a semi-supervised pipeline consisting of Doc2Vec and a classification model trained on the movements between States learned from the analysis of the voice samples.
  • the inputs to the models consist of both caller inputs as well as the retrieved source data through state actions for decisioning that relate to account management.
  • the probabilities are learned at the same time as the states/actions from the voice samples.
  • a person in the loop is used, via a log tracking process, in order to create supervised transition probabilities.
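  • To make the notion of a state transition probability function concrete, the sketch below estimates transition probabilities by simple counting over logged state sequences. This is a deliberately simplified stand-in and not the Doc2Vec-plus-classifier pipeline described above; the state names and the logged calls are invented for illustration.

```python
from collections import Counter, defaultdict


def estimate_transitions(logged_calls):
    """Estimate P(next_state | current_state) from logged state sequences.

    Count-based (maximum-likelihood) estimate, used here only as a simple
    stand-in for a learned transition model.
    """
    counts = defaultdict(Counter)
    for states in logged_calls:
        # Pair each state with the state that followed it in the same call.
        for current, nxt in zip(states, states[1:]):
            counts[current][nxt] += 1
    probs = {}
    for current, next_counts in counts.items():
        total = sum(next_counts.values())
        probs[current] = {s: n / total for s, n in next_counts.items()}
    return probs


# Hypothetical reviewed call logs, one list of states per call.
LOGGED_CALLS = [
    ["GREETING", "BILLING_QUESTION", "AUTHENTICATE", "EXPLAIN_BALANCE"],
    ["GREETING", "BILLING_QUESTION", "AUTHENTICATE", "TRANSFER_TO_AGENT"],
    ["GREETING", "AUTHENTICATE", "EXPLAIN_BALANCE"],
]

TRANSITIONS = estimate_transitions(LOGGED_CALLS)
```

  • With these three invented calls, two of the three transitions out of `GREETING` lead to `BILLING_QUESTION`, so that entry is estimated at 2/3. A learned classifier would replace the counting step but would expose the same kind of per-state probability distribution.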
  • In execution of these functions, a caller reaches the specific telephonic number of the telephony environment hosting the application.
  • the telephony application accepts the call and executes a state action related to “basic greeting without prior knowledge of caller” defined by client policy and historical call analysis.
  • This state action sends a textual output to the telephony environment for text-to-speech conversion, such as an MP3 signal broadcast on the live phone call: “Thank you for calling us. My name is AVA can I have your name please.”
  • the caller’s initial speech reaction to this greeting then becomes the first variable by which the remainder of MDP results can occur.
  • the caller then states,“My name is Tim Cowherd and I want to understand my bill, why did my balance go up?”
  • the telephony environment will deliver the MP3 of this speech utterance for speech-to-text conversion, which transcribes the exact result and then delivers the textual input to the AI environment.
  • the AI environment utilizes data parsing to prioritize and categorize specific words, documents/sentences, or groups of words, such as “bill,” “understand,” “Name,” “Tim Cowherd,” “bill go up,” to provide the ML models with numerical representations of the words, documents, sentences and groups of words.
  • the numerical representations of the text thus submitted to the machine learning models leverage probability distributions in AVA’s Markov Decision Process to move freely across the various states and perform many tasks simultaneously. For example, the system can simultaneously execute the state action to deliver text back to the telephony environment for text-to-speech conversion to MP3 (“I understand you would like to know more about your account, could you please give me the account holder name and account number to start?”), the state action to understand the name “Tim Cowherd” and associate it with the caller, and the state action to authenticate the caller with the information in the caller’s next speech utterance, containing the account holder name and account number.
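  • The single turn walked through above (transcript in, several state actions out) can be sketched as follows. This is an illustrative assumption only: keyword rules stand in for the machine learning models, speech-to-text and text-to-speech are outside the sketch, and the action names and helper functions are hypothetical.

```python
import re
from collections import Counter

ACCOUNT_PROMPT = (
    "I understand you would like to know more about your account, could you "
    "please give me the account holder name and account number to start?"
)


def bag_of_words(text):
    """Turn a transcript into word counts, a simple numerical representation."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def extract_name(text):
    """Return the capitalized words following 'my name is', if any."""
    match = re.search(r"[Mm]y name is ([A-Z][a-z]+(?: [A-Z][a-z]+)*)", text)
    return match.group(1) if match else None


def handle_turn(transcript):
    """Select every state action triggered by one transcribed utterance."""
    features = bag_of_words(transcript)
    actions = []
    name = extract_name(transcript)
    if name:
        # Associate the stated name with the caller.
        actions.append(("ASSOCIATE_NAME", name))
    if features["bill"] or features["balance"]:
        # Send text back for text-to-speech and expect credentials next.
        actions.append(("SPEAK", ACCOUNT_PROMPT))
        actions.append(("AWAIT_AUTHENTICATION", None))
    return actions


ACTIONS = handle_turn(
    "My name is Tim Cowherd and I want to understand my bill, "
    "why did my balance go up?"
)
```

  • Here one utterance yields three actions at once: associating the name “Tim Cowherd” with the caller, speaking the account prompt, and waiting for authentication details in the next utterance.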

Abstract

Disclosed are a computer-implemented platform, systems, and methods for automated self-service account management using a telephony system and a conversational artificial intelligence system to identify and execute intended account management tasks.
PCT/US2020/022074 2019-03-12 2020-03-11 Conversational artificial intelligence for automated self-service account management WO2020185880A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201962817423P 2019-03-12 2019-03-12
US62/817,423 2019-03-12

Publications (1)

Publication Number Publication Date
WO2020185880A1 (fr) 2020-09-17

Family

ID=72427637

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2020/022074 WO2020185880A1 (fr) 2019-03-12 2020-03-11 Conversational artificial intelligence for automated self-service account management

Country Status (1)

Country Link
WO (1) WO2020185880A1 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113159901A (zh) * 2021-04-29 2021-07-23 天津狮拓信息技术有限公司 Method and device for implementing a financial leasing business session

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070043574A1 (en) * 1998-10-02 2007-02-22 Daniel Coffman Conversational computing via conversational virtual machine
US20130111348A1 (en) * 2010-01-18 2013-05-02 Apple Inc. Prioritizing Selection Criteria by Automated Assistant
US20140247927A1 (en) * 2010-04-21 2014-09-04 Angel.Com Incorporated Dynamic speech resource allocation
US20150189085A1 (en) * 2013-03-15 2015-07-02 Genesys Telecommunications Laboratories, Inc. Customer portal of an intelligent automated agent for a contact center

Similar Documents

Publication Publication Date Title
US20240046374A1 (en) Systems, media, and methods for automated response to queries made by interactive electronic chat
US11206229B2 (en) Directed acyclic graph based framework for training models
US11735157B2 (en) Systems and methods for providing automated natural language dialogue with customers
US11134152B2 (en) System and method for managing a dialog between a contact center system and a user thereof
US10410633B2 (en) System and method for customer interaction management
JP7381579B2 (ja) Semantic artificial intelligence agent
US9516126B1 (en) Call center call-back push notifications
CN113228606A (zh) Semantic CRM transcripts from mobile communication sessions
US10192569B1 (en) Informing a support agent of a paralinguistic emotion signature of a user
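CN115917553A (zh) Entity-level data augmentation in chatbots for robust named entity recognition
CN116724305A (zh) Integration of context tags with named entity recognition models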
US20230043528A1 (en) Using backpropagation to train a dialog system
US11114092B2 (en) Real-time voice processing systems and methods
US20200211029A1 (en) Methods and systems for processing customer inquiries
Subudhi Banking on artificial intelligence: Opportunities & challenges for banks in India
US20210233090A1 (en) Systems and methods for automated discrepancy determination, explanation, and resolution
WO2020185880A1 (fr) Conversational artificial intelligence for automated self-service account management
US10832255B2 (en) Systems and methods for understanding and solving customer problems by extracting and reasoning about customer narratives
US20210125612A1 (en) Systems and methods for automated discrepancy determination, explanation, and resolution with personalization
Boonstra Introduction to conversational AI
WO2023076756A1 (fr) Rule-based techniques for extracting question-answer pairs from data

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20770867

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20770867

Country of ref document: EP

Kind code of ref document: A1