US20220129242A1 - System and method for voice-directed website walk-through - Google Patents
- Publication number
- US20220129242A1 (Application US17/467,538)
- Authority
- US
- United States
- Prior art keywords
- text
- speech
- user
- voice
- website
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/16—Sound input; Sound output
- G06F3/167—Audio in a user interface, e.g. using voice commands for navigating, audio feedback
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/16—Speech classification or search using artificial neural networks
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L2015/223—Execution procedure of a spoken command
Abstract
A system and method for a user to receive voice prompts and to talk to a website stating what they desire to do. Speech-to-text (speech recognition) and text-to-speech or pre-recorded voice, along with graphic overlay, provide user guidance. Upon accessing a website, special code is transferred to the client browser from the site server. If the client computer's application interface supports speech recognition and/or text-to-speech, some or all speech conversions can be performed on the client side. If not, the speech processing can be performed on a dedicated private control site, by an external site that provides speech processing services, or as a distributed service in an on-premises installation. After speech is converted to text, an artificial intelligence module attempts to determine intent. Once intent is determined, correct commands can be sent to the website to bring up proper pages and/or walkthroughs and/or answers.
Description
- This application is a continuation of application Ser. No. 16/214,051 filed Dec. 8, 2018 which claimed priority to U.S. Provisional Patent Application No. 62/596,626 filed Dec. 8, 2017. Application Ser. No. 16/214,051 and 62/596,626 are hereby incorporated by reference in their entirety.
- The present invention relates generally to the field of website control and more particularly to a system and method for voice-directed walk-throughs of particular websites.
- Users often have difficulty navigating websites. In particular, they may not be able to immediately get to the page they desire even though they know exactly what they want to do. For example, they might want to pay a telephone bill. They may log onto the telephone provider's website and be barraged with ads for new telephones and new services. They may have to search at length to find a tab or button that allows them to simply pay their telephone bill. Another example might be an airline reservation site. The user wants to fly from their hometown to San Francisco on the 20th of the month. It may take considerable time on the airline's site to get that information into its server and search engine. Even the simple act of logging off of a site when one has signed on may be difficult. It would be extremely advantageous if the user could simply speak what they want and have the "computer" understand and immediately bring up the correct page from a site.
- The present invention provides a system and method for a user to receive voice prompts and to talk to a website stating what they desire to do. The present invention uses conversational speech-to-text (speech recognition) and text-to-speech or pre-recorded voice-over sounds, along with graphic overlay, to provide a general user guidance experience. When a website is accessed, special code is transferred to the client browser from the site server. If the client computer's application program interface (API) supports speech recognition and/or text-to-speech, some or all speech conversions can be performed on the client side. If not, the speech processing can be performed on a dedicated private control site, by an external public site that provides speech processing services, or as a distributed service in an on-premises installation. After speech is converted to text, an artificial intelligence module, usually on the control site, attempts to determine intent; determining intent is not mandatory, but it helps in many use cases. Once intent is determined, the correct commands can be sent to the website to bring up proper pages and/or walkthroughs and/or answers. In addition, follow-up questions can be asked to clarify the user's intent and, if needed, to guide the user through the actions they desire in the context of the last sentence, for example:
-
- User: “I want a device”
- Machine: "Do you want a new device or to upgrade your own device?"
- User: "new one". Without context this answer means nothing, but with context it is sufficient.
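The context-dependent interpretation shown in this dialog can be sketched as follows. This is a toy illustration: the function name, the dictionary, and the keyword lists are assumptions invented for the example, not part of the patent.

```javascript
// Sketch: resolving a short answer ("new one") against the pending question's
// context. The context narrows the dictionary to the answers that question expects.
function resolveAnswer(answer, context) {
  const expected = context ? context.options : null;
  if (!expected) return null; // no context: "new one" alone is ambiguous
  const text = answer.toLowerCase();
  for (const [intent, keywords] of Object.entries(expected)) {
    if (keywords.some((k) => text.includes(k))) return intent;
  }
  return null;
}

// The "I want a device" follow-up question defines the context:
const deviceQuestion = {
  options: {
    newDevice: ['new'],
    upgradeDevice: ['upgrade', 'own', 'existing'],
  },
};
```

With this sketch, `resolveAnswer('new one', deviceQuestion)` resolves to the new-device intent, while the same answer with no context resolves to nothing and would trigger a clarifying question.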
- Attention is now directed to several drawings that illustrate features of the present invention.
-
FIG. 1 shows an embodiment of the present invention where the User's computer/browser has speech-to-text and text-to-speech capability. -
FIG. 2 shows an embodiment where the control site performs speech-to-text and text-to-speech. -
FIG. 3 shows an embodiment where a public site or site other than the control site performs speech-to-text and text-to-speech. - Several figures and illustrations have been provided to aid in understanding the present invention. The scope of the present invention is not limited to what is shown in the figures.
- The present invention relates to a system and method for speech-controlled walk-through of websites.
FIG. 1 shows a block diagram of an exemplary embodiment of the invention. A user 3 accesses a website 2 of a company or location he or she is interested in using their standard computer/browser 1. This can be any type of computer/browser, including cellular telephones, for both web and mobile native applications (apps). By prior agreement with the website owner, special code 7 is embedded in the website. Upon access, the special code 7 is transferred to the user's browser, where it directs the process of voice-controlled walk-through. Once the user 3 clicks on any button or play-trigger that initiates a voice walkthrough, a voice and text prompt is given to the user 3 through a speaker or earphone 4. A play-trigger can be anything that initiates a walk-through, such as a button on the page, a timer, a user interaction, or an automatic start.
- Once there is a play-trigger, there can be a prompt that asks the user 3 to either say or type what he or she would like to do. If the user 3 types the request, the text is captured and sent for intent determination. If the user speaks (say, through a microphone 7), speech recognition 5 converts the user's words to text. That text is then sent for intent determination. In the embodiment of FIG. 1, speech recognition 6 is performed by the user's computer using an application program interface (API) that exists on the user's computer. Questions or further comments coming back from either the website or a control website are usually printed on the screen and also converted to speech by a text-to-speech 6 module. In the embodiment of FIG. 1, this module is also located on the user's computer.
- Text created by the user 3, either by speaking or by freeform typing, is sent from the user's browser 1 to a control location or walk-through site 8 for intent determination. A text-to-intent engine 9 is typically located on that site. This is some form of artificial intelligence that can use dictionaries of expected words. If intent cannot be determined, text that asks more questions can be sent to the user, where it is either printed or presented as speech. Even when intent is determined, further questions may be necessary. Also, further questions and statements can be used in the form of a running conversation to help the user complete the desired task. - For example, a user may enter a telephone provider's website. The site may ask (through text-to-speech): "How may I help you?". The user might answer: "I want to upgrade". Since this could mean several different things, i.e. upgrade service or upgrade a phone, a further question may be necessary: "Please state if you want to upgrade your service or your telephone." The user can then respond and be taken to the correct page on the site, where the conversation can continue either by text alone or by speech exchange.
- As previously stated, the embodiment of
FIG. 1 assumes that the speech recognition and text-to-speech capability rests on the user's computer through an API that code sent to the browser from the site can access (such as Java™). -
FIG. 2 shows an embodiment of the present invention where the user's computer/browser 1 does not have speech processing capability, and where the audio from incoming speech is streamed to the walk-through control site 8, where speech recognition 5, text-to-speech 6, and intent determination 9 take place. -
FIG. 3 shows another embodiment of the present invention where one or more external sites 10 is/are used to perform speech processing, namely speech recognition 6 and text-to-speech conversion 5. In this case, audio is streamed to the external site either directly to and from the user's browser 1 or to and from the walk-through control site 8. Such external sites provide APIs that allow either public users or subscribers to perform speech processing. - If the browser that the visitor is using supports the Speech Recognition API, the present invention prefers to do the speech recognition on the client side, at the user's end. If the API does not exist, the system will automatically fall back to either option 2 or option 3. -
- Another option that the present invention supports is sending the audio directly to a SaaS speech to text service to get the sentence with or without the intent.
- Once the walk-through control server has the text from the speech-to-text engines (
option - By using both the intent and the sentence, the system displays (via audio and/or text) the next part of the conversation, whether it is another question that will clarify the visitor's intent, or just navigating the visitor to the right place/section and guiding him or her through the process while in a conversational process with the visitor by going through a walkthrough tree/graph like instruction set.
- The walkthrough can still contain conditions, triggers, actions, custom JAVASCRIPT™ conditions and actions.
- For each part of the walkthrough, where the system waits for a visitor to input or say his or her answer out loud, the system can change the dictionary and the configuration to the context of the answer that is applicable to the question.
- For example in the telephone company world:
System: "How can I help you?"
Visitor: "I want a device"
System: If it is not clear whether the visitor wants a new device or an upgrade to an existing device, the system can ask: "Do you want a new device, or do you want to upgrade an existing device?"
Visitor: "Upgrade" or "upgrade existing device". In this context, just the word "upgrade" is enough to lead the visitor to upgrading a device, whereas if the visitor said the word "upgrade" in the main context, the system could have other options for the word, such as upgrading a mobility data plan, upgrading a device, and more. - If at any point the visitor's answer doesn't match any of the expected intents, the system can ask the visitor if they mind repeating their answer, and might even let the visitor choose from a textual representation of the options, to avoid having the visitor repeat several times.
- The system's part of the dialog can always be represented in text, sound, or both. The user/visitor's part of the dialog can be provided by speaking or by typing what he or she wants into a text input (div, text input, textarea, or any other HTML text input method).
- Bank sample scenario with sample script:
User clicks on a Help button in the financial services website.
System: “How can I help you?”
User: “I would like to transfer money”
System: “Would you like to transfer it to someone in the USA or someone out of the country?”
User: “In the USA please”
System: “Thank you, I am taking you to the right place”—The system then redirects the user to the right area of the website. After redirect is completed:
System: “In this area you can transfer money to anyone that has a US bank account. Where would you like the funds to be transferred, and for how much? Feel free to say it out loud or please enter the recipient bank account here (while highlighting the bank account field) and the amount here (while highlighting the amount field)”.
User either enters the fields or says: "I want to transfer two hundred fifty-three dollars to Jessica Smith, account number two two five three six three, at Chase".
System (if user interaction was performed with voice, the system enters all the fields): “Please go over these details. If they are correct, either click here (highlight the next button) or say continue to proceed”
User: looks, verifies, and says "Continue please"
System: The system then takes the user to the next page of the confirmation by clicking on the next button programmatically.
System: "Please review this transfer. If you approve it, it might take up to two business days for the recipient to see it in their balance, since it is after business hours.
If you want to complete this transfer to Chase, click here or say: Yes, I approve this wire."
User: “Yes, I approve this wire”
System: clicks on approve for the user
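The voice-driven field filling and programmatic button clicking in this bank scenario could be sketched as follows. The selector names and the shape of `doc` are illustrative assumptions, not part of the patent.

```javascript
// Sketch: enter values extracted from the recognized sentence into the form
// fields, then click the next/approve button programmatically on the user's behalf.
function fillTransferForm(doc, { account, amount }) {
  const accountField = doc.querySelector('#recipient-account');
  const amountField = doc.querySelector('#amount');
  accountField.value = account;
  amountField.value = amount;
  return { account: accountField.value, amount: amountField.value };
}

function confirmTransfer(doc) {
  // "say continue to proceed" -> click the next button programmatically
  doc.querySelector('#next-button').click();
}
```

In a browser, `doc` would simply be `document`; the highlighting of each field while it is spoken about could be added with a CSS class toggle.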
System: "Thanks for using our guided wire transfer process at "MyBank". If you need to do anything else today, I'd be happy to offer you more guidance; if not, it was a pleasure serving you." - A sequence might start with (for example):
Visitor/User clicks on a Help button in the telco/cables/etc website.
System: “How can I help you?”
Sample user inquiries for help might be: -
- “I forgot my password”
- “I would like a new device”
- “I would like to see which mobility plans you offer”
- “I want a fast unlimited data plan”
- A sequence might start with (for example):
Visitor/User clicks on a Help button in the insurance company's website.
System: “How can I help you?”
Sample user inquiries for help might be: -
- “I forgot my password”
- “I would like to get insurance for ski traveling”
- “How do I know when my insurance policy is over”
- “How much allowance will I get at retirement age?”
- A sequence might start with (for example):
Visitor/User clicks on a Help button in the utility company's website.
System: “How can I help you?”
Sample user inquiries for help might be: -
- “I forgot my password”
- “I want to pay my electricity bill”
- “I want to know how come I paid so much this month vs the last month”
- “How do I set up auto payments for my charges”
- A sequence might start with (for example):
Visitor/User clicks on a Help button in the healthcare company's website.
System: “How can I help you?”
Sample user inquiries for help might be: -
- "I want to set an appointment for an orthopedic surgeon on May 5th"
- “I want to see my blood test results”
- A sequence might start with (for example):
Visitor/User clicks on a Help button in the travel/hospitality company's website.
System: “How can I help you?”
Sample user inquiries for help might be: -
- “I want to fly out to Boston Logan international airport on the 25th of July and return six days after”
- “I want to book a hotel in the radius of 5 km from the center of Barcelona”
- A sequence might start with (for example):
Visitor/User clicks on a Help button in the education organization's website.
System: “How can I help you?”
Sample user inquiries for help might be: -
- “I want to request a student scholarship”
- "I want to enroll in 'Advanced Chemistry 2' in July"
- "I want to know the opening hours of the main library"
- “I want to upload my homework to the portal”
- A sequence might start with (for example):
Visitor/User clicks on a Help button in the public sector organization's website.
System: “How can I help you?”
Sample user inquiries for help might be: -
- “I want to fill in my yearly taxes”
- “I want to apply for a visa to fly out to Moscow”
- A sequence might start with (for example):
Visitor/User clicks on a Help button in the web hosting/domain services company's website.
System: “How can I help you?”
Sample user inquiries for help might be: -
- “I want to renew my SSL certificate”
- “I want to register a new domain”
- "I want a report of how many users visited my website"
- The present invention provides a system and method that makes it easy for a user or visitor to a website to navigate to correct pages and accomplish one or more tasks by voice interaction in a conversational mode.
- Several embodiments of the system can accomplish this. Namely, speech can be processed at the client end on the user's computer/browser if that capability exists. If not, audio can be streamed mono-directionally or bidirectionally to either a walk-through control site or to an external remote API that provides speech processing services. The totality of the invention provides a convenient system and method for navigating websites using voice.
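The end-to-end loop summarized here (prompt, convert speech to text, determine intent, then either ask a clarifying question or navigate) can be sketched as one function. The speech-to-text step is stubbed, and all names, intents, and page paths are illustrative assumptions:

```javascript
// End-to-end sketch of the described loop. speechToText stands in for any of
// the three processing options (client API, control site, or external service).
function walkthroughStep(utterance, { speechToText, intents, pages }) {
  const text = speechToText(utterance);
  const words = text.toLowerCase().split(/\W+/);
  const intent = Object.keys(intents).find((name) =>
    intents[name].some((k) => words.includes(k))
  );
  if (!intent) {
    // Intent not determined: ask the visitor to repeat or pick from options.
    return { ask: 'Could you repeat that, or choose one of the options shown?' };
  }
  // Intent determined: navigate the visitor to the corresponding page.
  return { navigate: pages[intent] };
}
```

For example, with an identity speech-to-text stub, a "pay my bill" utterance would resolve to a navigation result, while an unrecognized utterance would produce a clarifying question instead.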
- Several descriptions and illustrations have been presented to aid in understanding the present invention. One with skill in the art will realize that numerous changes and variations may be made without departing from the spirit of the invention. Each of these changes and variations is within the scope of the present invention.
Claims (9)
1. A method for providing voice-controlled website walk-through comprising:
providing a voice prompt to a user upon entering the website using text to voice conversion or a recorded voice prompt;
receiving a voice request from the user;
converting the voice request to text using voice recognition;
determining intent from the text;
from intent, either asking the user a question, or presenting a particular webpage to the user.
2. The method of claim 1 wherein voice recognition and text-to-voice conversion is performed on the user's computer.
3. The method of claim 1 wherein voice recognition and text-to-voice conversion is performed at a website server.
4. The method of claim 1 wherein voice recognition and text-to-voice conversion is performed on a third party computer.
5. The method of claim 1 wherein determining intent includes asking the user one or more questions using the voice prompt, receiving verbal answers to the questions, converting the verbal answers to response text for analysis.
6. The method of claim 1 wherein determining intent includes processing of the text by an artificial intelligence module.
7. The method of claim 5 wherein determining intent includes processing the response text with an artificial intelligence module.
8. The method of claim 6 wherein the artificial intelligence module includes dictionaries of words, a neural network or an expert system.
9. The method of claim 7 wherein the artificial intelligence module includes dictionaries of words, a neural network or an expert system.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US17/467,538 US20220129242A1 (en) | 2017-12-08 | 2021-09-07 | System and method for voice-directed website walk-through |
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201762596626P | 2017-12-08 | 2017-12-08 | |
US16/214,051 US11113026B2 (en) | 2017-12-08 | 2018-12-08 | System and method for voice-directed website walk-through |
US17/467,538 US20220129242A1 (en) | 2017-12-08 | 2021-09-07 | System and method for voice-directed website walk-through |
Related Parent Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US16/214,051 Continuation US11113026B2 (en) | 2017-12-08 | 2018-12-08 | System and method for voice-directed website walk-through |
Publications (1)
Publication Number | Publication Date |
---|---|
US20220129242A1 true US20220129242A1 (en) | 2022-04-28 |
Family
ID=68763865
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US16/214,051 Active US11113026B2 (en) | 2017-12-08 | 2018-12-08 | System and method for voice-directed website walk-through |
US17/467,538 Abandoned US20220129242A1 (en) | 2017-12-08 | 2021-09-07 | System and method for voice-directed website walk-through |
Family Applications Before (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US16/214,051 Active US11113026B2 (en) | 2017-12-08 | 2018-12-08 | System and method for voice-directed website walk-through |
Country Status (1)
Country | Link |
---|---|
US (2) | US11113026B2 (en) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11113026B2 (en) * | 2017-12-08 | 2021-09-07 | Toonimo, Inc. | System and method for voice-directed website walk-through |
US11164562B2 (en) * | 2019-01-10 | 2021-11-02 | International Business Machines Corporation | Entity-level clarification in conversation services |
US11429793B2 (en) * | 2019-05-28 | 2022-08-30 | Dell Products L.P. | Site ambient audio collection |
CA3199655A1 (en) * | 2020-11-23 | 2022-05-27 | Andrei PAPANCEA | Method for multi-channel audio synchronization for task automation |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20030104839A1 (en) * | 2001-11-27 | 2003-06-05 | Christian Kraft | Communication terminal having a text editor application with a word completion feature |
US20040030556A1 (en) * | 1999-11-12 | 2004-02-12 | Bennett Ian M. | Speech based learning/training system using semantic decoding |
US20070100635A1 (en) * | 2005-10-28 | 2007-05-03 | Microsoft Corporation | Combined speech and alternate input modality to a mobile device |
US20160203002A1 (en) * | 2015-01-09 | 2016-07-14 | Microsoft Technology Licensing, Llc | Headless task completion within digital personal assistants |
US11113026B2 (en) * | 2017-12-08 | 2021-09-07 | Toonimo, Inc. | System and method for voice-directed website walk-through |
Also Published As
Publication number | Publication date |
---|---|
US11113026B2 (en) | 2021-09-07 |
US20190377544A1 (en) | 2019-12-12 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20220129242A1 (en) | System and method for voice-directed website walk-through | |
US11283926B2 (en) | System and method for omnichannel user engagement and response | |
US9521255B1 (en) | Systems and methods for visual presentation and selection of IVR menu | |
US8687777B1 (en) | Systems and methods for visual presentation and selection of IVR menu | |
US8155280B1 (en) | Systems and methods for visual presentation and selection of IVR menu | |
US8903073B2 (en) | Systems and methods for visual presentation and selection of IVR menu | |
US8681951B1 (en) | Systems and methods for visual presentation and selection of IVR menu | |
US20190082043A1 (en) | Systems and methods for visual presentation and selection of ivr menu | |
US11792313B1 (en) | System and method for calling a service representative using an intelligent voice assistant | |
JP2007527640A (en) | An action adaptation engine for identifying action characteristics of a caller interacting with a VXML compliant voice application | |
US20170289332A1 (en) | Systems and Methods for Visual Presentation and Selection of IVR Menu | |
US11889023B2 (en) | System and method for omnichannel user engagement and response | |
US11347525B1 (en) | System and method for controlling the content of a device in response to an audible request | |
US11012573B2 (en) | Interactive voice response using a cloud-based service | |
US20100217603A1 (en) | Method, System, and Apparatus for Enabling Adaptive Natural Language Processing | |
JP2008507187A (en) | Method and system for downloading an IVR application to a device, executing the application and uploading a user response | |
US11783836B2 (en) | Personal electronic captioning based on a participant user's difficulty in understanding a speaker | |
US20050100142A1 (en) | Personal home voice portal | |
US7558733B2 (en) | System and method for dialog caching | |
US20220046127A1 (en) | Interactive voice response (IVR) for text-based virtual assistance | |
US20090163188A1 (en) | Method and system of providing an audio phone card | |
US11145289B1 (en) | System and method for providing audible explanation of documents upon request | |
US20230409616A1 (en) | Hybrid guided communication session with artificial intelligence | |
CN117424960A (en) | Intelligent voice service method, device, terminal equipment and storage medium | |
Griol et al. | Providing Interactive and User-Adapted E-City Services by Means of Voice Portals |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
STPP | Information on status: patent application and granting procedure in general |
Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: NON FINAL ACTION MAILED |
|
STCB | Information on status: application discontinuation |
Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |