WO2019207597A1 - System and method of operating open ended interactive voice response in any spoken languages - Google Patents

Publication number
WO2019207597A1
WO2019207597A1 PCT/IN2019/050325 IN2019050325W WO2019207597A1 WO 2019207597 A1 WO2019207597 A1 WO 2019207597A1 IN 2019050325 W IN2019050325 W IN 2019050325W WO 2019207597 A1 WO2019207597 A1 WO 2019207597A1
Authority
WO
WIPO (PCT)
Prior art keywords
open ended
interactive voice
voice response
users
user
Prior art date
Application number
PCT/IN2019/050325
Other languages
French (fr)
Inventor
Zubair Ahmed
MD. Rezwanul HOQUE
Sadat Sakif AHMED
Original Assignee
Zubair Ahmed
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zubair Ahmed
Priority to AU2019260038A (AU2019260038A1)
Publication of WO2019207597A1
Priority to PH12020551761A (PH12020551761A1)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/40 Processing or translation of natural language
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/40 Processing or translation of natural language
    • G06F40/53 Processing of non-Latin text
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M3/00 Automatic or semi-automatic exchanges
    • H04M3/42 Systems providing special services or facilities to subscribers
    • H04M3/487 Arrangements for providing information services, e.g. recorded voice services or time announcements
    • H04M3/493 Interactive information services, e.g. directory enquiries; Arrangements therefor, e.g. interactive voice response [IVR] systems or voice portals
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M2201/00 Electronic components, circuits, software, systems or apparatus used in telephone systems
    • H04M2201/39 Electronic components, circuits, software, systems or apparatus used in telephone systems using speech synthesis
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M2203/00 Aspects of automatic or semi-automatic exchanges
    • H04M2203/20 Aspects of automatic or semi-automatic exchanges related to features of supplementary services
    • H04M2203/2061 Language aspects
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M3/00 Automatic or semi-automatic exchanges
    • H04M3/42 Systems providing special services or facilities to subscribers
    • H04M3/42382 Text-based messaging services in telephone networks such as PSTN/ISDN, e.g. User-to-User Signalling or Short Message Service for fixed networks

Abstract

A system and method of operating an open-ended interactive voice response (IVR) system in any spoken language, such as languages of the Austroasiatic, Austronesian, Dravidian, Indo-Aryan, Afroasiatic, Sino-Tibetan and Tai-Kadai families, is disclosed. Customers only have to speak their desired intention in a spoken language and the system takes care of the rest, making it efficient, user friendly, secure and unified. The IVR also serves businesses by eliminating their need to create and maintain their own IVR systems, to secure integration with other businesses and with the system itself, and to worry about the scalability of their systems.

Description

SYSTEM AND METHOD OF OPERATING OPEN ENDED
INTERACTIVE VOICE RESPONSE IN ANY SPOKEN LANGUAGES
FIELD OF THE INVENTION
The invention relates to the field of operating Interactive Voice Response (IVR). More particularly, the invention relates to operating an open-ended IVR system.
BACKGROUND OF THE INVENTION
Interactive Voice Response (IVR) is a system where a call, usually a GSM call or a call made using other means of wireless communication for mobile devices, such as cellular or data networks, is picked up by the system and a static menu is presented to the caller in the form of audio or text. Based on the user’s keypad input, a certain task is performed. Businesses set up IVR to cut down costs and to serve customers effectively before any human intervention is required.
Normally, when a user calls an entity, like a bank, the call is first received by an automated audio IVR system. The user is presented with some audio menus whose items are bound to the keys 0-9 of the keypad. Most menus provide instant service or access to an endpoint of a specific division in the entity. For a bank, instant services could be, for instance, checking a balance or changing the PIN of a card, while endpoint access could mean getting in touch with the loan department, contacting the security department, etc. Menus are always specific to the entity and might contain services specific to the individual caller. For example, a bank’s IVR system will only serve account menus to a caller who has previously registered an account with the phone number, whereas the IVR system of a pizza delivery service will only take in the order and place of delivery. An IVR system may or may not use the caller’s information to process a request.
In most cases the caller has to listen to a set of menus and perform certain keystrokes to reach an intended service or endpoint. This makes the process very time intensive. In addition, the process is quite error prone: if the caller makes a wrong input, the whole process might have to be repeated. Moreover, the menu of the IVR may be changed or updated; as a result, a regular caller will not be able to reach a desired service or endpoint just by remembering the keystrokes. Furthermore, the endpoint or service may not fulfill the caller’s desired intent, and the caller may have to call the IVR again to perform the task, which is not at all user friendly and might frustrate the caller. Additionally, a user might have to call multiple IVR systems to perform related tasks since most business IVRs are not interconnected, generating a negative user experience. Former technology, U.S. Pat. No. 8041575B2, tried to solve the IVRs’ intercommunication problem by providing a menu-driven top-level IVR system where the user can connect to different IVR systems using the menu, and data is bundled up by the main IVR and passed to secondary IVR systems.
However, the back and forth between menus poses a great challenge for the users. Additionally, the menu hierarchy grows drastically, so the user is deprived of a user-friendly solution.
Another technique is provided in US Patent 20050033684A1, assigned to Tekelec, which provides a method of payment through an IVR system using either keypad entry or voice-based commands. The transaction only occurs when a local Point of Sale (POS) device generates a sale transaction.
However, the users are limited to only one task: transferring money from one point to another. Moreover, the user cannot initiate a transaction; only a POS device can do this, which makes the service very limiting. The technique integrates multiple services with the same agenda into one and does not serve the purpose of streamlining several tasks under one call.
US Patent No. 8155280B1, assigned to Zvi Or-Bach and Tal Lavian, tried to tackle the problem of disconnected IVR systems by maintaining a database of its own and linking to each menu of the different IVR systems.
However, the technique is not efficient enough: the user can switch from one IVR to another, but the IVRs do not share data. Moreover, the users have to press menu buttons to reach the desired destination, which makes the system very cumbersome. Although that technology made multiple IVR systems accessible under one phone call, it was unable to make them interconnected.
Some prior inventions tried to address this problem by providing a visual form of IVR. These prior arts display the IVR menu graphically on a caller device. U.S. Pat. No. 7,215,743, assigned to International Business Machines Corporation, and published U.S. patent application Ser. No. 11/957,605, filed Dec. 17, 2007 and assigned to Motorola Inc., provide the IVR menu of the destination in a visual form to the caller. The caller can select the options from the IVR menu without listening to the complete audio representation of the IVR menu. However, the IVR menu displayed on the caller device is stored on an IVR server at the destination end.
The visual IVR menu is therefore specific to the destination, and only the IVR of the destination dialed is displayed. These techniques require each destination to set up hardware, software and other facilities for providing visual IVR servers. Moreover, the caller must be literate enough to read and respond to the menu, or else the caller has to listen through all the menus before proceeding.
Another existing technique, as disclosed in U.S. Pat. No. 6,560,320 assigned to International Business Machines Corporation, enables an operator of the IVR to send customized signals to the caller for generating and displaying graphical elements on the device of the caller. Thereafter, the caller can respond by selecting options through the touch-screen interface of the device by utilising Dual-tone multi-frequency (DTMF) signals of the IVR.
However, this technique requires a specifically configured device to interpret the codes sent as DTMF signals for generating the graphics. Moreover, an operator is required to present the graphics to the caller. Furthermore, specialized software and hardware are required at the operator to design and generate DTMF codes. Therefore, the technique faces various practical limitations.
Generally, the IVR menus of organizations are presented as audible menus. Moreover, a large number of organizations use IVR menus. Therefore, converting the audible menus to visual IVR menus can be time consuming. An existing technique, as disclosed in U.S. Pat. No. 6,920,425 assigned to Nortel Networks Limited, discloses an automated script to convert audible menu scripts to visual IVR menu scripts.
However, the audible menu scripts must be available in a particular format to enable the conversion. Furthermore, the audio menu scripts must be available or downloadable for the program to function. As a result, only the audio menu scripts that are available can be converted to visual IVR menu scripts. Furthermore, the device of the caller must be designed or programmed to be able to display the visual IVR menu scripts.
The effectiveness of providing the IVR in visual form is discussed in a technical paper titled ‘The Benefits of Augmenting Telephone Voice Menu Navigation with Visual Browsing and Search’ by Min Yin et al. The paper discusses a setup, where visual content of the IVR is sent from a service provider to a computer connected to a mobile phone.
However, the technique discussed in the paper is limited to the visual content provided by the service provider's end, after the connection is established. Moreover, the providers are required to individually set up the hardware and services for providing visual content.
As discussed above, the existing technologies have various limitations. Hence, techniques for providing enhanced IVR systems are desired.
SUMMARY OF THE INVENTION
The invention provides an enhanced IVR system which lets the user perform multiple tasks across multiple services under one streamlined phone call with just voice input, without using any menus. The user is authenticated via voice or passcode and the credentials are fetched. The user tells the system the intended tasks. An automatic speech recognition (ASR) system converts the user’s utterances to text. A natural language processing (NLP) engine corrects any mistakes the ASR made during the conversion, in languages from families such as Austroasiatic, Austronesian, Dravidian, Indo-Aryan, Afroasiatic, Sino-Tibetan and Tai-Kadai. Afterwards the logic system takes over: it fetches the set of services needed by the tasks, checks whether the tasks are valid, fetches the required information and generates responses to be given to the user. Thereupon the response system takes charge and presents the responses to the user through speech synthesis or in text form (e.g. through SMS messages).
As a result, the invention tackles a wide array of problems not dealt with previously. The elimination of both acoustic and visual menus leads to more natural and time-effective user interaction. There is no need to type in any information, since all data is extracted from voice; thus less literate users are able to interact with the IVR system effortlessly. In addition, the interconnection across multiple services allows the user to perform similar tasks in a single phone call session.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a diagrammatic view of the system organization for an exemplary embodiment of the presented invention; and
FIGS. 2, 4, 5 and 6 are segmented flow charts of the principal steps involved in using the presented invention.
DETAILED DESCRIPTION OF PREFERRED EMBODIMENT(S)
Referring to FIG. 1, a diagrammatic view of the system organization for an embodiment of the presented invention is shown. A caller 101 (there can be multiple callers) can initiate a call 104 to the open ended IVR system 200. The caller 101 can make the call 104 using any cellular device 102, which could be a smartphone or a feature phone. The call 104 travels over the cellular network 103 used by the device 102 and lands on the system 200, which runs on cloud servers. Once the call 104 has landed on the system 200, the caller 101 is greeted with an audio message and asked to speak the desired intent. Once the system 200 receives the caller’s 101 intent, the system 200 fetches the services 106 necessary to perform the intent via internet 105 based APIs (application programming interfaces). The system 200 checks whether the caller’s 101 intent is executable and, based on the intent, responds to the caller 101 in one of the following ways:
1. The system 200 reports the requested information to the caller 101 in audio, or in text via SMS gateway servers.
2. The system 200 might ask for the caller’s 101 consent to execute the requested task, presenting the task to be executed in audio form.
3. The system 200 might report an error in the intent to the caller 101 in audio form.
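The three ways of responding listed above can be sketched as a small decision function. This is only an illustrative sketch: the names ResponseKind, IntentCheck and choose_response are hypothetical and do not appear in the specification, and the two boolean flags are an assumed simplification of the executability check.

```python
from dataclasses import dataclass
from enum import Enum, auto


class ResponseKind(Enum):
    INFORMATION = auto()   # way 1: report the requested information
    CONFIRMATION = auto()  # way 2: ask for consent before executing a task
    ERROR = auto()         # way 3: report an error in the intent


@dataclass
class IntentCheck:
    executable: bool     # assumed flag: did the intent pass the executability check?
    executes_task: bool  # assumed flag: does the intent execute a task rather than only read data?


def choose_response(check: IntentCheck) -> ResponseKind:
    """Map the outcome of the executability check to one of the three responses."""
    if not check.executable:
        return ResponseKind.ERROR
    if check.executes_task:
        return ResponseKind.CONFIRMATION
    return ResponseKind.INFORMATION
```

An intent that fails the check is always reported as an error, regardless of whether it would have executed a task.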
If the caller 101 requested an informative intent or a task-based intent, the caller 101 might ask to execute another intent or end the call. If the caller 101 receives an error, the caller 101 will be asked to repeat the corrected intent. For example: A caller 101 calls the system 200 and the system 200 greets the caller 101 with an audio message and asks the caller 101 for the intent. The caller 101 speaks “Buy 2 dozen eggs from EGGY Store and pay the bill from my AB bank account and deliver the eggs to my home by EATS”. The system 200 processes the intent and responds to the caller 101 in audio with “From EGGY Store purchase 24 eggs. An amount BDT 50 will be debited from your AB bank account number XXXXXX by EGGY. Deliver eggs to house 16, flat A2, Jefferson Street by EATS. An amount BDT 8 will be debited from your AB bank account number XXXXXX by EATS. Confirm task?”. The caller 101 responds with “YES” and the system 200 replies “Your request was successfully executed”. The caller 101 ends the call. The system 200 sends two receipts via SMS, one from EGGY Store and another from the EATS delivery service.
The system may also invoke a particular service multiple times, based on the caller’s 101 needs. For example: A caller 101 calls the system 200. The system 200 greets the caller 101 with an audio message and asks the caller 101 for the intent. The caller 101 speaks “Transfer BDT 5000 from my account in ABC bank to account number XXXXX in ABC Bank and pay my Electric bill ID XXXX with my ABC bank account”. The system 200 processes the intent and responds to the caller 101 in audio with “From your account XXXX in ABC bank transfer BDT 5000 to account XXXXXX in ABC. An amount BDT 4200 will be deducted from your ABC bank account XXXXX with reference to the Electric bill ID XXXX. Confirm task?”. The caller 101 responds with “YES” and the system 200 replies “Your request was successfully executed”. The caller 101 ends the call. The system 200 sends two SMS logs, one for the account transfer and another for the electric bill payment.
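The read-back in the two examples, where several tasks are summarised and followed by a single confirmation question, might be assembled as in the sketch below. The Task type and both function names are hypothetical, and counting invocations per service is just one assumed way to illustrate that the same service can be invoked more than once in a call.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Task:
    service: str  # service the task is routed to, e.g. "ABC bank"
    summary: str  # sentence read back to the caller for this task


def build_confirmation(tasks: List[Task]) -> str:
    """Join the per-task read-backs and end with the confirmation question."""
    sentences = [t.summary.rstrip(".") + "." for t in tasks]
    return " ".join(sentences + ["Confirm task?"])


def service_invocations(tasks: List[Task]) -> Dict[str, int]:
    """Count how many times each service is invoked within one call."""
    return dict(Counter(t.service for t in tasks))


tasks = [
    Task("ABC bank", "From your account XXXX in ABC bank transfer BDT 5000 to account XXXXXX in ABC"),
    Task("ABC bank", "An amount BDT 4200 will be deducted from your ABC bank account XXXXX"),
]
print(build_confirmation(tasks))
print(service_invocations(tasks))
```

In this sketch both tasks of the second example are routed to the same bank service, so that service is counted twice.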
Referring to FIG. 2, a diagrammatic view of the open ended IVR system 200 for an exemplary embodiment of the presented invention is shown. When a call 104 enters the system 200, the Call Landing server 201 handles the call 104 and routes it to the authentication system 300. The authentication system 300 identifies the caller 101 and fetches the caller’s 101 credentials. The audio speech of the caller 101 is converted to a machine-readable intent by the intent handler system 400 and passed to the logic system 500 for further processing. The logic system 500 fetches the required services, checks the viability of the intent and gives feedback to the response system 600. The response system 600 converts the feedback into a human-understandable format and passes it to the caller 101 via the available channels.
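The hand-off between the subsystems described for FIG. 2 can be pictured as a chain of four stages. The following is a minimal sketch under stated assumptions: the Caller type, the handle_call function and the dictionary shapes are hypothetical, and each stage is passed in as a plain callable so that no real authentication, speech or service API is implied.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Caller:
    phone_number: str
    credentials: dict  # assumed shape: whatever the authentication stage fetched


def handle_call(
    phone_number: str,
    audio: bytes,
    authenticate: Callable[[str], Caller],     # stands in for authentication system 300
    to_intent: Callable[[bytes], dict],        # stands in for intent handler system 400
    evaluate: Callable[[Caller, dict], dict],  # stands in for logic system 500
    render: Callable[[dict], str],             # stands in for the response system
) -> str:
    """Run one caller utterance through the four stages in order."""
    caller = authenticate(phone_number)
    intent = to_intent(audio)
    feedback = evaluate(caller, intent)
    return render(feedback)
```

Passing the stages as parameters keeps the sketch self-contained; a deployment would substitute real components for each callable.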
Referring to FIG. 3, a diagrammatic view of the open ended IVR system 200 for an embodiment of the present invention is shown.
Referring to FIG. 4, a diagrammatic view of the intent handler system 400 for an exemplary embodiment of the present invention is shown. The audio from the caller 202 is passed to the automatic speech recognition (ASR) system 401, which converts the audio speech to text 402. The text 402 is passed to the natural language processing (NLP) system 403, which corrects any errors generated during the audio-to-text conversion, in languages from families such as Austroasiatic, Austronesian, Dravidian, Indo-Aryan, Afroasiatic, Sino-Tibetan and Tai-Kadai. The corrected text is then passed to the contextual AI system, which reads the text and collects the tasks 406 the caller 101 wants to perform. The machine-readable intent 406 is passed to the logic system 500 for further processing.
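The three stages of the intent handler (speech to text, correction, task collection) can be illustrated with toy stand-ins. Everything below is an assumption made for illustration: the specification does not say how the NLP system or the contextual AI system work internally, so a word-replacement table stands in for the former and a split on "and" before a small list of assumed verbs stands in for the latter. The asr argument is a hypothetical hook, not a real ASR API.

```python
import re
from typing import Callable, Dict, List


def correct_text(text: str, fixes: Dict[str, str]) -> str:
    """Toy stand-in for the NLP system 403: word-level replacement of known ASR slips."""
    return " ".join(fixes.get(word.lower(), word) for word in text.split())


def collect_tasks(text: str) -> List[str]:
    """Toy stand-in for the contextual AI system: split where 'and' precedes an assumed task verb."""
    pattern = r"\s+and\s+(?=(?:pay|deliver|transfer|buy)\b)"
    parts = re.split(pattern, text, flags=re.IGNORECASE)
    return [part.strip() for part in parts if part.strip()]


def handle_intent(audio: bytes, asr: Callable[[bytes], str], fixes: Dict[str, str]) -> List[str]:
    """Audio -> text 402 -> corrected text -> list of tasks 406."""
    return collect_tasks(correct_text(asr(audio), fixes))
```

A real system would replace both toy functions with trained language-specific models; the sketch only shows the order in which the data flows.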
Referring to FIG. 5, a diagrammatic view of the logic system 500 for an exemplary embodiment of the presented invention is shown. Intent 406 can be one of the following 3 types:
1. New intent 501
2. Correction intent 502
3. Confirmation intent 503
In case of a new intent 501 or a correction intent 502, the logic system 500 retrieves the services required to perform the intent (504). Then the logic system 500 performs operations to evaluate whether the intent is valid and executable (506). If the intent 507 requires only information, the data is retrieved and passed to the response system 600; otherwise, either the intent waits for the caller’s 101 confirmation, or the intent contains an error, which has to be corrected by the caller 101 by providing a new intent 406. Upon receiving the confirmation intent 503, the tasks are executed and the result is passed on to the response system 600.
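The handling of the three intent types can be sketched as a small stateful dispatcher. This is a hedged illustration only: the class and field names are hypothetical, is_valid and execute are assumed hooks standing in for the validity check and the service calls, and keeping the unconfirmed tasks in a pending list is an assumption about how the wait for confirmation could be modelled.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import Callable, List


class IntentType(Enum):
    NEW = auto()           # new intent 501
    CORRECTION = auto()    # correction intent 502
    CONFIRMATION = auto()  # confirmation intent 503


@dataclass
class Intent:
    kind: IntentType
    tasks: List[str] = field(default_factory=list)
    info_only: bool = False  # assumed flag: intent only asks for information


@dataclass
class LogicSystem:
    is_valid: Callable[[str], bool]  # hypothetical validity and executability check
    execute: Callable[[str], str]    # hypothetical call into the required service
    pending: List[str] = field(default_factory=list)

    def handle(self, intent: Intent) -> str:
        if intent.kind is IntentType.CONFIRMATION:
            if not self.pending:
                return "error: nothing to confirm"
            results = [self.execute(task) for task in self.pending]
            self.pending = []
            return "executed: " + "; ".join(results)
        # New and correction intents follow the same path.
        invalid = [task for task in intent.tasks if not self.is_valid(task)]
        if invalid:
            return "error: " + "; ".join(invalid)
        if intent.info_only:
            return "information: " + "; ".join(self.execute(t) for t in intent.tasks)
        self.pending = list(intent.tasks)
        return "confirm: " + "; ".join(self.pending)
```

Tasks are executed only after a confirmation intent arrives, which mirrors the consent step in the worked examples.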
Referring to FIG. 6, a diagrammatic view of the response system 600 for an exemplary embodiment of the present invention is shown. The logic system 500 generates 2 types of responses:
1. A confirmation request 502, issued when the system needs a confirmation on the task list.
2. A report 503 to be delivered to the caller 101, which can be information or an error.
The response is converted to human readable text 504 based on the logic system response 501. The human readable text 504 is passed either to the text to speech (TTS) system 505 or to an SMS gateway server before it is delivered to the caller 101.
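The conversion and delivery steps of FIG. 6 can be illustrated by the following minimal Python sketch. The response dictionaries, the message wording, the channel names "tts" and "sms" and the callables standing in for the TTS system and the SMS gateway server are all assumptions made for the example; no real telephony or SMS API is used.

```python
from typing import Callable, Dict, List


def to_human_text(response: Dict[str, object]) -> str:
    # Convert a logic system response into human readable text.
    kind = response["kind"]
    if kind == "confirmation":
        tasks: List[str] = list(response["tasks"])  # type: ignore[arg-type]
        return "You asked for: " + ", ".join(tasks) + ". Do you confirm?"
    if kind == "information":
        return "Here is your result: " + str(response["detail"])
    if kind == "error":
        return "There was a problem: " + str(response["detail"])
    raise ValueError("unknown response kind: " + str(kind))


def deliver(
    response: Dict[str, object],
    channel: str,
    tts: Callable[[str], None],
    sms: Callable[[str], None],
) -> str:
    # Pass the human readable text to exactly one of the two channels.
    text = to_human_text(response)
    if channel == "tts":
        tts(text)
    elif channel == "sms":
        sms(text)
    else:
        raise ValueError("unknown channel: " + channel)
    return text
```

Passing the channels in as callables keeps the sketch testable: in a trial run, list.append can stand in for both the TTS system and the SMS gateway server, and the captured lists show which channel received the text.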

Claims

We Claim:
1. System and method of operating open ended interactive voice responses in any spoken languages such as Austroasiatic, Austronesian, Dravidian, Indo-Aryan, Afroasiatic, Sino-Tibetan and Tai-Kadai, comprising the steps of
a system of call landing via wireless communication for mobile devices, such as cellular or data networks where a call landing server will pick up the call and pass the call to the authentication server, wherein the server verifies the user and fetches stored information associated with the corresponding user; and
a system of input voice in any language by the user, wherein the voice intent is converted by the ASR, which is passed through an NLP engine to correct any grammatical mistakes. Further, the corrected text is passed to the contextual AI engine, which picks up the user’s desired tasks and passes the tasks to the logic system, wherein the said logic system checks, validates and fetches the services required to complete the tasks with user confirmation, which may be required by the logic system to execute the task; and
a response system, which generates a reply to the user’s audio input using either TTS (Text to Speech) or an SMS notification to ask for or provide feedback, wherein, with a combination of modular servers and services, the system works as one streamlined input mechanism.
2. System and method of operating open ended interactive voice response in any spoken languages according to claim 1, wherein, after authentication, the system fetches from storage the list of services that the users are affiliated with, and also fetches from the data center the users’ credentials required to operate the corresponding services.
3. System and method of operating open ended interactive voice response in any spoken languages according to claim 1, wherein the system takes analog input from users in audio form in any language, which is converted to text by an ASR and NLP engine.
4. System and method of operating open ended interactive voice response in any spoken languages according to claim 1, wherein the system is connected to multiple services via APIs, wherein the system asks affiliated business/es to provide services to the users within periodical time intervals or one at a time with the user’s consent.
5. System and method of operating open ended interactive voice response in any spoken languages according to claim 1, wherein the system invokes multiple services or a single service multiple times based on the user’s task as coded in the logic server.
6. System and method of operating open ended interactive voice response in any spoken languages according to claim 1, wherein the system asks the users to provide inputs multiple times based on the task requirements.
7. System and method of operating open ended interactive voice response in any spoken languages according to claim 1, wherein the system passes data from one service to another or within the service if necessary for the execution of the required task.
8. System and method of operating open ended interactive voice response in any spoken languages according to claim 1, wherein the system responds to users with TTS (Text to Speech) to ask users to provide further information, to report an error that has been generated or to provide the requested information; and a system that uses an SMS gateway server as a form of record keeping or log generation.
PCT/IN2019/050325 2018-04-23 2019-04-22 System and method of operating open ended interactive voice response in any spoken languages WO2019207597A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
AU2019260038A AU2019260038A1 (en) 2018-04-23 2019-04-22 System and method of operating open ended interactive voice response in any spoken languages
PH12020551761A PH12020551761A1 (en) 2018-04-23 2020-10-22 System and method of operating open ended interactive voice response in any spoken languages

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
BD1192018 2018-04-23
BD119/2018 2018-04-23

Publications (1)

Publication Number Publication Date
WO2019207597A1 (en) 2019-10-31

Family

ID=68295833

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IN2019/050325 WO2019207597A1 (en) 2018-04-23 2019-04-22 System and method of operating open ended interactive voice response in any spoken languages

Country Status (3)

Country Link
AU (1) AU2019260038A1 (en)
PH (1) PH12020551761A1 (en)
WO (1) WO2019207597A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021135548A1 (en) * 2020-06-05 2021-07-08 平安科技(深圳)有限公司 Voice intent recognition method and device, computer equipment and storage medium
CN113392847A (en) * 2021-06-17 2021-09-14 拉萨搻若文化艺术产业开发有限公司 OCR (optical character recognition) handheld scanning translation device and translation method for Tibetan Chinese and English

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113905135B (en) * 2021-10-14 2023-10-20 天津车之家软件有限公司 User intention recognition method and device of intelligent outbound robot

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6418199B1 (en) * 1997-12-05 2002-07-09 Jeffrey Perrone Voice control of a server
US6944592B1 (en) * 1999-11-05 2005-09-13 International Business Machines Corporation Interactive voice response system
US20170228367A1 (en) * 2012-04-20 2017-08-10 Maluuba Inc. Conversational agent

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
AMAZON: "Voice Shopping with Alexa", YOUTUBE, 18 November 2016 (2016-11-18), XP054979967, Retrieved from the Internet <URL:https://youtu.be/mCjvV3iFsuw> *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021135548A1 (en) * 2020-06-05 2021-07-08 平安科技(深圳)有限公司 Voice intent recognition method and device, computer equipment and storage medium
CN113392847A (en) * 2021-06-17 2021-09-14 拉萨搻若文化艺术产业开发有限公司 OCR (optical character recognition) handheld scanning translation device and translation method for Tibetan Chinese and English
CN113392847B (en) * 2021-06-17 2023-12-05 拉萨搻若文化艺术产业开发有限公司 Tibetan Chinese-English three-language OCR handheld scanning translation device and translation method

Also Published As

Publication number Publication date
PH12020551761A1 (en) 2021-07-12
AU2019260038A1 (en) 2020-12-24

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19791474

Country of ref document: EP

Kind code of ref document: A1

DPE1 Request for preliminary examination filed after expiration of 19th month from priority date (pct application filed from 20040101)
NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2019260038

Country of ref document: AU

Date of ref document: 20190422

Kind code of ref document: A

122 Ep: pct application non-entry in european phase

Ref document number: 19791474

Country of ref document: EP

Kind code of ref document: A1
