EP1062798A1 - A system for browsing the world wide web with a traditional telephone - Google Patents

A system for browsing the world wide web with a traditional telephone

Info

Publication number
EP1062798A1
EP1062798A1 EP99904341A EP99904341A EP1062798A1 EP 1062798 A1 EP1062798 A1 EP 1062798A1 EP 99904341 A EP99904341 A EP 99904341A EP 99904341 A EP99904341 A EP 99904341A EP 1062798 A1 EP1062798 A1 EP 1062798A1
Authority
EP
European Patent Office
Prior art keywords
telephone
audio
command
world wide
wide web
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
EP99904341A
Other languages
German (de)
French (fr)
Inventor
Michael J. Wynblatt
Stuart Goose
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Siemens Corporate Research Inc
Original Assignee
Siemens Corporate Research Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Siemens Corporate Research Inc
Publication of EP1062798A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M3/00 Automatic or semi-automatic exchanges
    • H04M3/42 Systems providing special services or facilities to subscribers
    • H04M3/487 Arrangements for providing information services, e.g. recorded voice services or time announcements
    • H04M3/493 Interactive information services, e.g. directory enquiries; Arrangements therefor, e.g. interactive voice response [IVR] systems or voice portals
    • H04M3/4938 Interactive information services, e.g. directory enquiries; Arrangements therefor, e.g. interactive voice response [IVR] systems or voice portals comprising a voice browser which renders and interprets, e.g. VoiceXML
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M3/00 Automatic or semi-automatic exchanges
    • H04M3/42 Systems providing special services or facilities to subscribers
    • H04M3/487 Arrangements for providing information services, e.g. recorded voice services or time announcements
    • H04M3/493 Interactive information services, e.g. directory enquiries; Arrangements therefor, e.g. interactive voice response [IVR] systems or voice portals
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M2201/00 Electronic components, circuits, software, systems or apparatus used in telephone systems
    • H04M2201/40 Electronic components, circuits, software, systems or apparatus used in telephone systems using speech recognition
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M2201/00 Electronic components, circuits, software, systems or apparatus used in telephone systems
    • H04M2201/60 Medium conversion
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M7/00 Arrangements for interconnection between switching centres
    • H04M7/006 Networks other than PSTN/ISDN providing telephone service, e.g. Voice over Internet Protocol (VoIP), including next generation networks with a packet-switched transport layer
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M7/00 Arrangements for interconnection between switching centres
    • H04M7/12 Arrangements for interconnection between switching centres for working between exchanges having different types of switching equipment, e.g. power-driven and step by step or decimal and non-decimal

Definitions

  • The present invention relates to interfacing with the world wide web and more particularly to browsing the world wide web using a telephone.
  • WWW World Wide Web
  • Traditional WWW browsers, such as Netscape's Navigator and Microsoft's Internet Explorer, offer complex visual renditions of WWW documents. These are not suitable for telephones because telephones lack sophisticated visual display mechanisms.
  • Some telephones have small visual displays and some vendors offer WWW browsers targeted to these limited displays. The best example of such a browser is Unwired Planet's UP.Browser. These systems are still dependent on a visual display, however, and thus are not usable on traditional telephones which have no visual display.
  • The Web-On-Call system from Netphonic allows telephonic access to the WWW by providing an audio rendering of WWW documents.
  • This system requires the WWW content-provider to modify, or pre-process, all of the content which they wish to make available.
  • Such a system is called a server-side solution and does not enable individual WWW users to browse to arbitrary WWW sites. Users may only browse sites that have the special modifications.
  • A client-side solution is needed, one which renders arbitrary WWW documents into audio on the fly.
  • The present invention is a system which employs the WIRE system and other components to allow browsing the WWW with a traditional telephone.
  • The system also enables access to e-mail.
  • This system works on the client-side, and thus requires no special preparation by the WWW content-provider. It uses only audio to render WWW documents and thus requires no visual display.
  • The traditional telephone is utilized to contact a host computer which has a voice-capable modem, a telephone-driven audio WWW browser (TAWB) and a connection to the Internet, called a network interface.
  • The TAWB comprises a telephony interface, a digital voice processing module (DVP), an interchange between the telephony interface and the DVP, an audio document renderer, a command and control module, and an Internet interface.
  • DVP digital voice processing module
  • The system contains a friendly server for storing information.
  • The system relies on the presence of the WWW which can be described as a collection of WWW servers connected to the Internet, an Intranet or an Extranet.
  • Figure 1 illustrates a block diagram of the present invention.
  • Figure 2 illustrates an overview of a telephone-driven audio WWW browser of the present invention.
  • Figure 3 illustrates a block diagram of the command and control module of the present invention.
  • Figure 4 illustrates an example of a touch-tone to user command map of the present invention.
  • Figure 5 illustrates the high level operation of the command and control logic of the present invention.
  • Figure 6 illustrates a block diagram of the main loop of the command and control logic of the present invention.
  • Figure 7 illustrates a block diagram of the digital voice processing module of the present invention.
  • Figure 8 illustrates a block diagram of the friendly server of the present invention.
  • Figure 9 illustrates an example of a CGI program.
  • Figure 1 shows the high-level architecture of the present invention.
  • The user of the system must have a telephone handset 10 that has touch-tone capability.
  • The telephone handset 10 may be a mobile telephone, such as a cellular telephone.
  • The user dials into a host computer system 12.
  • This host computer system 12 has a voice-capable modem 14, a telephone-driven audio WWW browser (TAWB) 16, and a connection to the Internet, called a network interface 18.
  • TAWB telephone-driven audio WWW browser
  • Typical network interfaces are Ethernet cards or modems.
  • The system contains a friendly server 20 for storing information.
  • The system relies on the presence of the WWW, which can be described as a collection of WWW servers 22 connected to the Internet.
  • At a high level, the system works in the following manner.
  • The user employs his telephone handset to dial into the host computer.
  • The host computer's voice modem accepts the call and acts as an interface between the TAWB and the public telephone network.
  • The user then issues commands consisting of telephone touch-tones, spoken voice commands, or both.
  • The TAWB interprets these commands, downloads the appropriate WWW documents from the Internet and renders them to an audio stream.
  • The TAWB sends the audio stream, via the public telephone network, to the telephone handset where the user listens to it.
  • As the user listens, he or she may issue additional commands which the TAWB will capture and interpret.
  • The system allows hyperlinks to be followed, skips to be made forward and backward through the current document, and pauses in the rendering.
  • Figure 2 shows a block diagram of the TAWB 16.
  • The Internet interface 28 is to be distinguished from the network interface 18 shown in Figure 1 in the following manner.
  • The Internet interface 28 provides application level network protocols such as HTTP and FTP while the network interface (18 of Figure 1) provides lower level network protocols such as TCP/IP.
  • The command and control module (CCM) 27 directs the action of the other modules, interpreting user commands and directing the appropriate response.
  • The Internet interface 28 provides access to WWW servers, from which WWW documents are obtained, and access to the friendly server.
  • The audio document renderer 26 converts structured documents from the WWW into an audio rendition. This rendition includes audio signals sent directly to the telephony interface 23, but consists primarily of a specially prepared structured text stream that is sent to the digital voice processing module (DVP) 25.
  • The DVP module 25 converts the text stream to an audio "voice" stream and sends this to the telephony interface 23 by way of the DVP/telephony interchange 24.
  • The DVP 25 also converts voice commands received from the telephony interface 23 by way of the DVP/telephony interchange 24 into commands that are passed to the CCM 27.
  • The DVP/telephony interchange 24 is necessary to convert between the dissimilar formats used by the DVP module 25 and the telephony interface 23.
  • The telephony interface 23 captures and delivers audio streams and touch-tones to and from the public telephone network. The following will describe the TAWB 16 of the present invention in more detail.
  • The Internet interface 28 is responsible for providing application level network services to the CCM 27. Specifically, the Internet interface 28 must provide the following well known services.
  • Hypertext Transfer Protocol (HTTP): This serves to download WWW documents from remote WWW servers and also to upload information back to the servers under some circumstances.
  • Post Office Protocol (POP): This serves to download e-mail documents from remote WWW servers.
  • File Transfer Protocol (FTP): This serves to transfer files to and from the friendly server.
  • The Internet interface 28 takes its direction from the CCM 27 as described below. It returns all documents which it downloads to the CCM 27.
  • The CCM 27 directs the operation of the TAWB system 16.
  • A block diagram of the CCM 27 is shown in Figure 3.
  • The touch-tone to user-command map (TTUCM) 30 accepts touch-tone digits from the telephony interface and determines which user command they represent. As commands may consist of varying numbers of tones, the TTUCM 30 constitutes a finite-state machine in which receiving a touch-tone acts to move the machine along an edge.
  • Many TTUCMs are possible, depending on the set of commands which the TAWB is to support.
  • An example of a TTUCM 30 is shown in Figure 4. To reduce complexity, not all possible commands are shown. A complete TTUCM would be more complex but would follow the same principles.
  • In Figure 4, if the first touch-tone is a 1, 2 or 3, the TTUCM decides that the command is "Follow", "Skip Back", or "Skip Ahead", respectively. If the first touch-tone is 0, #, or *, the TTUCM accepts the next touch-tone and then uses this second tone as an index for either the "Get Favorite", "Set Favorite" or "Select from History List" commands, respectively.
  • The local flags cache 32 is a data-store that keeps a local copy of the addresses of documents that the user has flagged. The purpose of the flagging process is described below.
  • The local favorites cache 34 is a data-store that keeps a local copy of the addresses of the user's favorite WWW documents.
  • The history list 36 is a data-store which stores addresses of pages recently visited by the user, in the manner directed by the command and control logic 38. It contains a position pointer which marks one address as the current address.
  • The command and control logic (CCL) 38 directs the operations of the CCM.
  • Figure 5 shows the high-level operation of the CCL 38. Upon activation, the CCL 38, through the wait state module 50, is in a wait state.
  • The CCL directs the Internet interface to retrieve the user's favorites from the friendly server through the Internet interface retrieval director 52. These favorites are then stored in the local favorites cache.
  • The CCL, through the command and control enter module 54, enters the command and control main loop, which is described in detail below.
  • Upon receiving a quit command from the user via the TTUCM, the CCL directs the telephony interface to terminate the telephone connection. This is performed in the telephony interface termination director 56.
  • The CCL directs the Internet interface to take the user favorites in the local favorites cache and store them on the friendly server. This is performed in the Internet interface store favorites director 58.
  • The main loop of the command and control logic is the normal operating logic for the TAWB system while it is in use.
  • Figure 6 shows a block diagram of the main loop.
  • The loop consists of two actions repeated continuously: first get a command, then execute that command.
  • Commands come either from the touch-tone to user-command map (TTUCM) 62 or from the voice to user-command map (VUCM) 64 of the DVP module (25 of Figure 2).
  • TTUCM touch-tone to user-command map
  • VUCM voice to user-command map
  • The main loop acts asynchronously from the rest of the modules. This means that once the "wait for user command" 65 state is reached, new commands may be accepted and processed even if other modules have not finished their last assigned task. Notably, this means that the user may interrupt the rendering of a document with new commands, thus allowing a high degree of interactivity.
  • After receiving a command, the CCL determines which command it is 66. The following describes the handling of a set of relevant commands.
  • The term "URL" refers to the Uniform Resource Locator, which is an addressing standard used on the WWW.
  • Skip Ahead, Skip Back or Restart: The CCM directs 67 the audio document renderer to adjust its rendering accordingly.
  • Reset Favorite N: The CCM stores 68 the current URL in the local favorites cache at index N. This process overwrites the previous contents of index N in the local favorites cache.
  • Follow Link: The CCM gets 69 the URL of the current active link from the audio document renderer, directs 70 the Internet interface to retrieve the document stored at that URL, directs 71 the audio document renderer to render this document, and updates 72 the history list by adding the URL above the current position pointer and advancing the position pointer.
  • The CCM looks up 73 the URL stored in the local favorites cache at index N, directs 70 the Internet interface to retrieve the document stored at that URL, directs 71 the audio document renderer to render this document, and updates 72 the history list by adding the URL above the current position pointer and advancing the position pointer.
  • The CCM gets 74 the appropriate hyperlink from the history list, directs 70 the Internet interface to retrieve the document stored at that URL, directs 71 the audio document renderer to render this document, and updates 72 the history list by advancing or retreating the position pointer as appropriate.
  • Flag Current Page: The CCM gets 75 the URL of the current document from the history list and stores it in the local flags cache.
  • ADR audio document renderer
  • WIRE accepts a structured document and outputs a text stream suitable for rendering by the text-to-speech synthesizer (TTS) component of the DVP.
  • TTS text-to-speech synthesizer
  • The text stream produced by the ADR is sent to the TTS component of the DVP.
  • This text stream is not necessarily linear, or ordered in the same way as the original WWW document, but instead the ADR may be directed by the CCM to send arbitrary parts of the document.
  • The CCM determines which part of the document to send based on the "navigation" commands sent by the user, such as "Skip Ahead" or "Skip Back".
  • Systems such as WIRE may offer other "rendering modes" in which certain parts of a document are summarized or skipped based on the user's preferences.
  • The ADR must provide the CCM with the "active link" when requested. This is the hyperlink corresponding to the anchor most recently rendered.
  • The digital voice processing module (25 of Figure 2) consists of two components as shown in Figure 7.
  • The text-to-speech synthesizer (TTS) 78 accepts marked up text from the ADR and generates waveform audio which is sent to the DVP/telephony interchange.
  • The voice recognizer component 79 accepts waveform audio from the DVP/telephony interchange and sends user command strings to the CCM.
  • SAPI Microsoft Speech API
  • For the SAPI interface, there are many commercial packages available which can serve the role of the text-to-speech synthesizer, such as Lernout & Hauspie's TruVoice, and of the voice recognizer, such as AT&T's Watson.
  • DVP/Telephony Interchange (24 of Figure 2)
  • Many commercial DVP packages produce or consume audio data in a different format than can be accepted by computer telephony interfaces.
  • The job of the DVP/telephony interchange is to convert audio data from one format to another so that these two TAWB components can share the data. Since the audio is in the form of real-time streams, the format conversion must be fast and must be done on the fly. For example, if the sampling rate produced by the TTS engine
  • A TAWB PC needs an interface to the public telephone system. This interface essentially must allow the computer to control the modem which acts as the local telephony client. Specifically, the interface must be able to send and receive waveform audio and touch-tones and deliver them to both the TAWB system and the public telephony network in an understandable form.
  • Many commercial modems provide software drivers which conform to Microsoft's Telephony API (TAPI). Since the TAPI definition includes all of the services that the TAWB requires, these software drivers may serve the role of the telephony interface.
  • TAPI Microsoft's Telephony API
  • The friendly server (20 of Figure 1) is a repository for information that is accessible by any WWW-enabled device, including TAWB browsers and visual browsers. As some kinds of information are difficult to comprehend in an audio-only environment, the ability to share addresses with a visual browser can be seen as a required feature of an audio-only browser.
  • Figure 8 shows a block diagram of the friendly server 20.
  • The user favorites store 82 is a persistent data repository for addresses of the user's favorite documents, that is, the documents he or she has chosen to associate with preset commands in the TAWB browser. The purpose of the user's favorites is to allow the user to access certain sites quickly with a minimum of browsing.
  • The Internet interface for TAWB browser 84 allows the TAWB system to upload replacement favorites which may have been set by the user while using the TAWB system.
  • This interface is simply an FTP server as is known in the art.
  • The Internet interface for traditional browser 86 allows the user to modify his or her favorites while using a traditional WWW browser.
  • This interface consists of a CGI program within which the user can modify the favorites store and an HTTP server as is known in the art which allows the browser to access the CGI program.
  • An example of a suitable CGI program is shown in Figure 9. This program allows the user to modify the values of the favorites by typing new URLs in the boxes. Pressing the update button completes the modification. In this way, the user may easily transfer documents found with a traditional browser to his or her TAWB browser.
  • The user flags store 88 is a persistent data repository for the user's flagged URLs.
  • The purpose of the flags is to make documents found with the TAWB browser available to a traditional browser.
  • The Internet interface for TAWB browser 84 allows the TAWB system to upload addresses of documents which have been flagged by the user while using the TAWB system. This interface is simply an FTP server as is known in the art.
  • The addresses are stored in the form of a WWW document containing hyperlinks to the flagged documents.
  • The Internet interface for traditional browser 86 allows the user to view this document and follow the hyperlinks with a traditional browser.
  • This interface is simply an HTTP server as is known in the art.
  • The present invention is a system which allows a user with an ordinary telephone to browse the World Wide Web.
  • The present invention is an improvement over prior art systems because any WWW documents can be obtained, not just documents that were specially prepared for audio access.
  • The present invention works with a traditional telephone and does not require the phone to have a visual display or special internal electronics.
  • The present invention is particularly valuable to people who need mobile access, are visually impaired, cannot afford browsers which require special hardware, or who work in environments where visual displays are not practical.
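
The main-loop command handling listed in the definitions above (Follow Link, favorites, history navigation, Flag Current Page) can be illustrated with a minimal Python sketch. It is not taken from the patent: the class names `HistoryList` and `Session`, the command string "History", and the choice to keep later history entries when a new URL is inserted are all assumptions made for illustration. Retrieval and rendering are left out; `execute` only returns the URL that the Internet interface would be asked to fetch.

```python
from typing import Dict, List, Optional


class HistoryList:
    """Visited addresses plus a pointer marking the current one (hypothetical)."""

    def __init__(self) -> None:
        self.urls: List[str] = []
        self.pos: int = -1  # -1 means nothing has been visited yet

    def visit(self, url: str) -> None:
        # Add the URL just above the pointer, then advance the pointer onto it.
        # Assumption: entries beyond the pointer are kept, not discarded.
        self.pos += 1
        self.urls.insert(self.pos, url)

    def move(self, step: int) -> Optional[str]:
        # Advance (step > 0) or retreat (step < 0); stay put at either end.
        target = self.pos + step
        if 0 <= target < len(self.urls):
            self.pos = target
            return self.urls[target]
        return None

    def current(self) -> Optional[str]:
        return self.urls[self.pos] if self.pos >= 0 else None


class Session:
    """Holds the history list, local favorites cache and local flags cache."""

    def __init__(self, favorites: Dict[int, str]) -> None:
        self.history = HistoryList()
        self.favorites = favorites
        self.flags: List[str] = []

    def execute(self, command: str, arg=None) -> Optional[str]:
        """Return the URL that should be fetched and rendered, if any."""
        if command == "Follow Link":        # arg: URL of the active link
            self.history.visit(arg)
            return arg
        if command == "Get Favorite":       # arg: index N
            url = self.favorites.get(arg)
            if url is not None:
                self.history.visit(url)
            return url
        if command == "Set Favorite":       # arg: index N, overwrites old value
            current = self.history.current()
            if current is not None:
                self.favorites[arg] = current
            return None
        if command == "History":            # arg: +1 or -1 (hypothetical name)
            return self.history.move(arg)
        if command == "Flag Current Page":
            current = self.history.current()
            if current is not None:
                self.flags.append(current)
            return None
        return None                         # unknown commands are ignored
```

In this sketch a caller would loop over incoming commands, call `execute`, and pass any returned URL to the retrieval and rendering steps; because `execute` returns immediately, a new command can be accepted while an earlier document is still being rendered.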

Abstract

Access to the world wide web is achieved by using a traditional telephone to contact a host computer which has a voice-capable modem, a telephone-driven audio WWW browser (TAWB) and a connection to the Internet. The TAWB comprises a telephony interface, a digital voice processing module (DVP), an interchange between the telephony interface and the DVP, an audio document renderer, a command and control module, and an Internet interface. Additionally, the system contains a friendly server for storing information. The system relies on the presence of the WWW which can be described as a collection of WWW servers connected to the Internet.

Description

A System For Browsing The World Wide Web With A Traditional Telephone
Background of the Invention
Field of the Invention
The present invention relates to interfacing with the world wide web and more particularly to browsing the world wide web using a telephone.
Description of the Prior Art
The World Wide Web (WWW) is rapidly becoming the single most important source of information for businesses and consumers. As individuals rely increasingly on the information available on the WWW, they will require ubiquitous access to this information. One device that is readily available in almost any environment is the telephone; thus, it seems natural to consider the telephone as a WWW access device. Traditional WWW browsers, such as Netscape's Navigator and Microsoft's Internet Explorer, offer complex visual renditions of WWW documents. These are not suitable for telephones because telephones lack sophisticated visual display mechanisms. Some telephones have small visual displays and some vendors offer WWW browsers targeted to these limited displays. The best example of such a browser is Unwired Planet's UP.Browser. These systems are still dependent on a visual display, however, and thus are not usable on traditional telephones which have no visual display.
The Web-On-Call system from Netphonic allows telephonic access to the WWW by providing an audio rendering of WWW documents. However, this system requires the WWW content-provider to modify, or pre-process, all of the content which they wish to make available. Such a system is called a server-side solution and does not enable individual WWW users to browse to arbitrary WWW sites. Users may only browse sites that have the special modifications. In order to allow arbitrary browsing, a client-side solution is needed, one which renders arbitrary WWW documents into audio on the fly.
The Web-Based Interactive Radio Environment (WIRE) system developed by Siemens Corporate Research Inc., described in United States Patent Application Number 08/768,046, filed on December 13, 1996 and assigned to the same assignee as the present invention, offers a mechanism for rendering arbitrary WWW documents using audio. As the WIRE system offers broad rendering support for structured documents as well as audio data, it is a natural tool for building client-side, audio-only WWW browsers.
Summary of the Invention
The present invention is a system which employs the WIRE system and other components to allow browsing the WWW with a traditional telephone. The system also enables access to e-mail. This system works on the client-side, and thus requires no special preparation by the WWW content-provider. It uses only audio to render WWW documents and thus requires no visual display.
The traditional telephone is utilized to contact a host computer which has a voice-capable modem, a telephone-driven audio WWW browser (TAWB) and a connection to the Internet, called a network interface. The TAWB comprises a telephony interface, a digital voice processing module (DVP), an interchange between the telephony interface and the DVP, an audio document renderer, a command and control module, and an Internet interface. Additionally, the system contains a friendly server for storing information. Finally, the system relies on the presence of the WWW which can be described as a collection of WWW servers connected to the Internet, an Intranet or an Extranet.
Brief Description of the Drawings
Figure 1 illustrates a block diagram of the present invention.
Figure 2 illustrates an overview of a telephone-driven audio WWW browser of the present invention.
Figure 3 illustrates a block diagram of the command and control module of the present invention.
Figure 4 illustrates an example of a touch-tone to user command map of the present invention.
Figure 5 illustrates the high level operation of the command and control logic of the present invention.
Figure 6 illustrates a block diagram of the main loop of the command and control logic of the present invention.
Figure 7 illustrates a block diagram of the digital voice processing module of the present invention.
Figure 8 illustrates a block diagram of the friendly server of the present invention.
Figure 9 illustrates an example of a CGI program.
Detailed Description of the Invention
Figure 1 shows the high-level architecture of the present invention. The user of the system must have a telephone handset 10 that has touch-tone capability. There are no other requirements placed on the telephone handset 10. Notably, the telephone handset 10 may be a mobile telephone, such as a cellular telephone. To utilize the system, the user dials into a host computer system 12. This host computer system 12 has a voice-capable modem 14, a telephone-driven audio WWW browser (TAWB) 16, and a connection to the Internet, called a network interface 18. (Throughout this specification, it should be understood that the term "Internet" could be readily replaced by "Intranet" or "Extranet" with minimal additional changes to the rest of the specification.) Typical network interfaces are Ethernet cards or modems. Additionally, the system contains a friendly server 20 for storing information. Finally, the system relies on the presence of the WWW, which can be described as a collection of WWW servers 22 connected to the Internet.
At a high level, the system works in the following manner. The user employs his telephone handset to dial into the host computer. The host computer's voice modem accepts the call and acts as an interface between the TAWB and the public telephone network. The user then issues commands consisting of telephone touch-tones, spoken voice commands, or both. The TAWB interprets these commands, downloads the appropriate WWW documents from the Internet and renders them to an audio stream. The TAWB sends the audio stream, via the public telephone network, to the telephone handset where the user listens to it. As the user listens, he or she may issue additional commands which the TAWB will capture and interpret. For example, the system allows hyperlinks to be followed, skips to be made forward and backward through the current document, and pauses in the rendering.
The following will describe the Telephone-driven Audio World Wide Web Browser (TAWB) 16. Figure 2 shows a block diagram of the TAWB 16. It consists of a telephony interface 23, a digital voice processing module (DVP) 25, an interchange 24 between the telephony interface and the DVP, an audio document renderer 26, a command and control module 27, and an Internet interface 28. The Internet interface 28 is to be distinguished from the network interface 18 shown in Figure 1 in the following manner. The Internet interface 28 provides application level network protocols such as HTTP and FTP while the network interface (18 of Figure 1) provides lower level network protocols such as TCP/IP.
An overview of the TAWB 16 follows. The command and control module (CCM) 27 directs the action of the other modules, interpreting user commands and directing the appropriate response. The Internet interface 28 provides access to WWW servers, from which WWW documents are obtained, and access to the friendly server. The audio document renderer 26 converts structured documents from the WWW into an audio rendition. This rendition includes audio signals sent directly to the telephony interface 23, but consists primarily of a specially prepared structured text stream that is sent to the digital voice processing module (DVP) 25. The DVP module 25 converts the text stream to an audio "voice" stream and sends this to the telephony interface 23 by way of the DVP/telephony interchange 24. The DVP 25 also converts voice commands received from the telephony interface 23 by way of the DVP/telephony interchange 24 into commands that are passed to the CCM 27. The DVP/telephony interchange 24 is necessary to convert between the dissimilar formats used by the DVP module 25 and the telephony interface 23. The telephony interface 23 captures and delivers audio streams and touch-tones to and from the public telephone network.
The following will describe the TAWB 16 of the present invention in more detail. The Internet interface 28 is responsible for providing application level network services to the CCM 27. Specifically, the Internet interface 28 must provide the following well known services.
Hypertext Transfer Protocol (HTTP): This serves to download WWW documents from remote WWW servers and also to upload information back to the servers under some circumstances.
Post Office Protocol (POP): This serves to download e-mail documents from remote WWW servers.
File Transfer Protocol (FTP): This serves to transfer files to and from the friendly server.
The Internet interface 28 takes its direction from the CCM 27 as described below. It returns all documents which it downloads to the CCM 27.
The CCM 27 directs the operation of the TAWB system 16. A block diagram of the CCM 27 is shown in Figure 3. Within the CCM 27, the touch-tone to user-command map (TTUCM) 30 accepts touch-tone digits from the telephony interface and determines which user command they represent. As commands may consist of varying numbers of tones, the TTUCM 30 constitutes a finite-state machine in which receiving a touch-tone acts to move the machine along an edge. Many TTUCMs are possible, depending on the set of commands which the TAWB is to support. An example of a TTUCM 30 is shown in Figure 4. To reduce complexity, not all possible commands are shown. A complete TTUCM would be more complex but would follow the same principles.
In Figure 4, if the first touch-tone is a 1, 2 or 3, the TTUCM decides that the command is "Follow", "Skip Back", or "Skip Ahead", respectively. If the first touch-tone is 0, #, or *, the TTUCM accepts the next touch-tone and then uses this second tone as an index for either the "Get Favorite", "Set Favorite" or "Select from History List" commands, respectively.
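The Figure 4 example can be illustrated with a minimal finite-state machine sketch. The class name `TouchToneMap`, the method `feed`, and the choice to ignore tones outside the example are assumptions made for illustration; only the tone-to-command assignments come from the description above.

```python
from typing import Optional

# First tones that complete a command on their own (per the Figure 4 description).
SINGLE_TONE = {"1": "Follow", "2": "Skip Back", "3": "Skip Ahead"}
# First tones that require a second tone, which is then used as an index.
INDEXED = {"0": "Get Favorite", "#": "Set Favorite", "*": "Select from History List"}


class TouchToneMap:
    """Hypothetical two-state machine: idle, or waiting for an index tone."""

    def __init__(self) -> None:
        self.pending: Optional[str] = None  # command still waiting for its index

    def feed(self, tone: str) -> Optional[tuple]:
        """Consume one tone; return (command, index) once a command is complete."""
        if self.pending is not None:
            command, self.pending = self.pending, None
            return (command, tone)
        if tone in SINGLE_TONE:
            return (SINGLE_TONE[tone], None)
        if tone in INDEXED:
            self.pending = INDEXED[tone]
            return None
        return None  # assumption: tones outside the example are ignored
```

Each call to `feed` moves the machine along one edge; a command is reported only when an accepting state is reached, which is how commands of varying length can share one tone stream.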
The local flags cache 32 is a data-store that keeps a local copy of the addresses of documents that the user has flagged. The purpose of the flagging process is described below. The local favorites cache 34 is a data-store that keeps a local copy of the addresses of the user's favorite WWW documents. The history list 36 is a data-store which stores addresses of pages recently visited by the user, in the manner directed by the command and control logic 38. It contains a position pointer which marks one address as the current address.
The command and control logic (CCL) 38 directs the operations of the CCM. Figure 5 shows the high-level operation of the CCL 38. Upon activation, the CCL 38 enters a wait state through the wait state module 50. It waits for notification from the telephony interface that a call has been received and a connection established with a user. Next, the CCL directs the Internet interface to retrieve the user's favorites from the friendly server through the Internet interface retrieval director 52. These favorites are then stored in the local favorites cache. Next, the CCL through the command and control enter module 54 enters the command and control main loop, which is described in detail below. Upon receiving a quit command from the user via the TTUCM, the CCL directs the telephony interface to terminate the telephone connection. This is performed in the telephony interface termination director 56. Next, the CCL directs the Internet interface to take the user favorites in the local favorites cache and store them on the friendly server. This is performed in the Internet interface store favorites director 58. Finally, the CCL, through the Internet interface store user flags director 60, directs the Internet interface to store the user flags from the local flags cache to the friendly server.
The following will describe the Command and Control Main Loop.
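The Figure 5 sequence described above can be sketched as a single function that calls its collaborators in order. Every name here (`run_session`, `Recorder`, and the method names) is hypothetical; the sketch only fixes the ordering of the six steps and uses a logging stand-in in place of real telephony and Internet interfaces.

```python
class Recorder:
    """Hypothetical stand-in for the telephony and Internet interfaces."""

    def __init__(self, favorites):
        self.log = []
        self.favorites = dict(favorites)  # plays the role of the friendly server
        self.flags = []

    def wait_for_call(self):
        self.log.append("wait_for_call")

    def fetch_favorites(self):
        self.log.append("fetch_favorites")
        return dict(self.favorites)

    def hang_up(self):
        self.log.append("hang_up")

    def store_favorites(self, favorites):
        self.log.append("store_favorites")
        self.favorites = dict(favorites)

    def store_flags(self, flags):
        self.log.append("store_flags")
        self.flags = list(flags)


def run_session(telephony, internet, favorites_cache, flags_cache, main_loop):
    """Run one call from connection to teardown, in the Figure 5 order."""
    telephony.wait_for_call()                           # wait state module 50
    favorites_cache.update(internet.fetch_favorites())  # retrieval director 52
    main_loop()                                         # enter module 54; returns on quit
    telephony.hang_up()                                 # termination director 56
    internet.store_favorites(favorites_cache)           # store favorites director 58
    internet.store_flags(flags_cache)                   # store user flags director 60
```

Passing the main loop in as a callable keeps the lifecycle independent of the command set, so the same skeleton would apply whichever commands the loop supports.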
The main loop of the command and control logic is the normal operating logic for the TAWB system while it is in use. Figure 6 shows a block diagram of the main loop. At a high level, the loop consists of two actions repeated continuously: first get a command, then execute that command. Commands come either from the touch-tone to user-command map (TTUCM) 62 or from the voice to user-command map (VUCM) 64 of the DVP module (25 of Figure 2). Note that while either a TTUCM or VUCM is required, and both are desirable, only one is necessary for the system to function. A working TAWB system can be made with TTUCM only, VUCM only, or both.
The main loop acts asynchronously from the rest of the modules. This means that once the "wait for user command" 65 state is reached, new commands may be accepted and processed even if other modules have not finished their last assigned task. Notably, this means that the user may interrupt the rendering of a document with new commands, thus allowing a high degree of interactivity.
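A minimal sketch of this get-then-execute loop follows, assuming commands arrive as `(name, argument)` tuples on a thread-safe queue that both the TTUCM and the VUCM can write to. The function name `main_loop`, the tuple shape, and the handler dictionary are illustrative assumptions, not part of the described system.

```python
import queue


def main_loop(commands, handlers):
    """Repeat: wait for a command, then execute it, until "Quit" arrives.

    commands: a queue.Queue of (name, argument) tuples (assumed shape).
    handlers: dict mapping a command name to a callable taking the argument.
    Returns the names of the commands that were executed, in order.
    """
    executed = []
    while True:
        name, arg = commands.get()  # the "wait for user command" state
        if name == "Quit":
            break
        handler = handlers.get(name)
        if handler is not None:
            handler(arg)  # should only hand work to another module and return
            executed.append(name)
    return executed
```

In this arrangement interactivity depends on each handler returning promptly after directing another module, so that the loop is back in its wait state while, for example, a document is still being rendered.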
There are a large number of potential user-level commands which may be offered by the TAWB system. These commands are those of a traditional WWW browser and are well known. There are additionally some commands which are new and relevant in telephone-based browsers. The following will describe examples of the most critical set of commands. The scope of the present invention is not limited to the specific set of commands described but includes any reasonable set of commands and the new commands described.
After receiving a command, the CCL determines which command it is 66. The following describes the handling of a set of relevant commands. In the discussion that follows, the term "URL" refers to the Uniform Resource Locator, which is an addressing standard used on the WWW.
Skip Ahead, Skip Back or Restart. In the case of one of these commands, the CCM directs 67 the audio document renderer to adjust its rendering accordingly.
Reset Favorite N. The CCM stores 68 the current URL in the local favorites cache at index N. This process overwrites the previous contents of index N in the local favorites cache.
Follow Link. The CCM gets 69 the URL of the current active link from the audio document renderer, directs 70 the Internet interface to retrieve the document stored at that URL, directs 71 the audio document renderer to render this document, and updates 72 the history list by adding the URL above the current position pointer and advancing the position pointer.
Load Favorite N. The CCM looks up 73 the URL stored in the local favorites cache at index N, directs 70 the Internet interface to retrieve the document stored at that URL, directs 71 the audio document renderer to render this document, and updates 72 the history list by adding the URL above the current position pointer and advancing the position pointer.
Back or Forward. The CCM gets 74 the appropriate hyperlink from the history list, directs 70 the Internet interface to retrieve the document stored at that URL, directs 71 the audio document renderer to render this document, and updates 72 the history list by advancing or retreating the position pointer as appropriate.
Flag Current Page. The CCM gets 75 the URL of the current document from the history list and stores it in the local flags cache.
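The history-list bookkeeping used by the commands above can be sketched as a list with a position pointer. The class and method names are hypothetical, and one point is an assumption the text does not settle: entries that lie ahead of the pointer are kept when a new URL is added, rather than discarded.

```python
class HistoryList:
    """Hypothetical history list: URLs plus a pointer to the current address."""

    def __init__(self):
        self.urls = []
        self.pos = -1  # index of the current address; -1 while the list is empty

    def visit(self, url):
        """Add a URL just above the pointer, then advance the pointer onto it."""
        self.pos += 1
        self.urls.insert(self.pos, url)  # assumption: later entries are kept

    def back(self):
        """Retreat the pointer and return that URL, or None at the oldest entry."""
        if self.pos <= 0:
            return None
        self.pos -= 1
        return self.urls[self.pos]

    def forward(self):
        """Advance the pointer and return that URL, or None at the newest entry."""
        if self.pos + 1 >= len(self.urls):
            return None
        self.pos += 1
        return self.urls[self.pos]

    def current(self):
        """Return the current address, as used when flagging the current page."""
        return self.urls[self.pos] if self.pos >= 0 else None
```

Under this sketch, Follow Link and Load Favorite would call `visit`, Back and Forward would call `back` or `forward`, and Flag Current Page would copy `current()` into the local flags cache.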
Quit. The main loop of the CCL terminates. The CCL then continues as specified in Figure 5.
An essential element of the TAWB is the audio document renderer (ADR). The ADR is responsible for taking a WWW document and creating an audio rendition. Some types of documents are primarily audio in nature and these may be rendered directly. For example, RealAudio streams, waveform audio data, and structured audio data like MIDI can all be rendered directly. For documents of these types, the ADR can send the rendered output directly to the telephony interface.
Most WWW documents, however, are not inherently audible in nature. Two significant examples are e-mail messages and HTML documents which are composed of structured text. To convert documents such as these into audio requires a tool such as the WIRE system (described in US Patent Application Serial Number 08/768,046). WIRE accepts a structured document and outputs a text stream suitable for rendering by the text-to-speech synthesizer (TTS) component of the DVP. WIRE converts abstract, semantic mark-up into the literal, syntactic mark-up needed to convey the abstractions using audio and in a form which can be interpreted by a TTS.
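The general idea of turning semantic mark-up into literal cues for a TTS can be illustrated with a small sketch built on Python's standard `html.parser` module. This is not the WIRE method, which is not detailed here; the spoken cue words "Heading:" and "Link:" and the names `SpokenTextBuilder` and `to_spoken_text` are assumptions chosen for the example.

```python
from html.parser import HTMLParser


class SpokenTextBuilder(HTMLParser):
    """Collects page text and announces headings and links with literal cues."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in ("h1", "h2", "h3"):
            self.parts.append("Heading:")  # assumed cue for a heading
        elif tag == "a":
            self.parts.append("Link:")     # assumed cue for a hyperlink anchor

    def handle_data(self, data):
        text = " ".join(data.split())      # collapse runs of whitespace
        if text:
            self.parts.append(text)


def to_spoken_text(html):
    """Return a flat text stream that a TTS engine could read aloud."""
    builder = SpokenTextBuilder()
    builder.feed(html)
    builder.close()
    return " ".join(builder.parts)
```

A fuller renderer would likely use distinct voices or tones instead of cue words, but the sketch shows the conversion from structure that is seen to structure that is heard.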
The text stream produced by the ADR is sent to the TTS component of the DVP. This text stream is not necessarily linear, or ordered in the same way as the original WWW document, but instead the ADR may be directed by the CCM to send arbitrary parts of the document to the TTS. The CCM determines which part of the document to send based on the "navigation" commands sent by the user, such as "Skip Ahead" or "Skip Back". Systems such as WIRE may offer other "rendering modes" in which certain parts of a document are summarized or skipped based on the user's preferences.
An additional requirement is that the ADR must provide the CCM with the "active link" when requested. This is the hyperlink corresponding to the anchor most recently rendered.
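A minimal sketch of a renderer that tracks the active link while honoring navigation commands is given below. It assumes the document has already been split into `(text, href)` segments, with `href` set to `None` for plain text; that segmentation, the class name `AudioRenderer`, and the one-segment skip granularity are assumptions for illustration.

```python
class AudioRenderer:
    """Hypothetical renderer over pre-split (text, href or None) segments."""

    def __init__(self, segments):
        self.segments = segments
        self.cursor = 0          # index of the next segment to render
        self.active_link = None  # href of the anchor most recently rendered

    def render_next(self):
        """Return the next segment's text for the TTS, or None at the end."""
        if self.cursor >= len(self.segments):
            return None
        text, href = self.segments[self.cursor]
        self.cursor += 1
        if href is not None:
            self.active_link = href
        return text

    def skip_ahead(self):
        """Pass over one segment without rendering it."""
        self.cursor = min(self.cursor + 1, len(self.segments))

    def skip_back(self):
        """Step back one segment so that it is rendered again next."""
        self.cursor = max(self.cursor - 1, 0)
```

Because `active_link` changes only when an anchor is actually rendered, a Follow command issued shortly after a link is heard resolves to that link even if plain text has been rendered since.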
The digital voice processing module (DVP) (25 of Figure 2) consists of two components as shown in Figure 7. The text-to-speech synthesizer (TTS) 78 accepts marked up text from the ADR and generates waveform audio which is sent to the DVP/telephony interchange. The voice recognizer component 79 accepts waveform audio from the DVP/telephony interchange and sends user command strings to the CCM. In a prototype system, Microsoft Speech API (SAPI) was used as the interface through which the ADR and CCM modules could direct the operation of the DVP. Using the SAPI interface, there are many commercial packages available which can serve the role of the text-to-speech synthesizer, such as Lernout & Hauspie's TruVoice, and of the voice recognizer, such as AT&T's Watson.
The following will describe the DVP/Telephony Interchange (24 of Figure 2). Many commercial DVP packages produce or consume audio data in a different format than can be accepted by computer telephony interfaces. The job of the DVP/telephony interchange is to convert audio data from one format to another so that these two TAWB components can share the data. Since the audio is in the form of real-time streams, the format conversion must be fast and must be done on the fly. For example, if the sampling rate produced by the TTS engine is different from that accepted by the telephony interface, the interchange must make the conversion. A suitable system is described in US Patent Application Serial Number ##########, entitled "A Real-Time Down-Sampling System For Digital Audio Waveform Data", assigned to the same assignee as the present invention, filed concurrently with this application and incorporated herein by reference.
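As a rough illustration of a sampling-rate conversion, the sketch below resamples a buffer of integer samples by linear interpolation. It is not the method of the referenced application, which is not described here; it applies no anti-aliasing filter, and the function name `resample_linear` and the example rates are assumptions.

```python
def resample_linear(samples, src_rate, dst_rate):
    """Resample a list of integer samples from src_rate to dst_rate.

    Simplified sketch: linear interpolation between neighbouring samples,
    with no low-pass filtering, so it is suitable for illustration only.
    """
    if not samples:
        return []
    n_out = len(samples) * dst_rate // src_rate
    out = []
    for i in range(n_out):
        pos = i * src_rate / dst_rate      # position in the source buffer
        j = int(pos)
        frac = pos - j
        nxt = samples[min(j + 1, len(samples) - 1)]
        out.append(round(samples[j] * (1 - frac) + nxt * frac))
    return out
```

For a real-time stream the same arithmetic would be applied chunk by chunk as audio arrives, carrying the fractional position over from one chunk to the next; that bookkeeping is omitted here.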
The following will describe the telephony interface (23 of Figure 2). A TAWB PC needs an interface to the public telephone system. This interface essentially must allow the computer to control the modem which acts as the local telephony client. Specifically, the interface must be able to send and receive waveform audio and touch-tones and deliver them to both the TAWB system and the public telephony network in an understandable form. In practice, many commercial modems provide software drivers which conform to Microsoft's Telephony API (TAPI). Since the TAPI definition includes all of the services that the TAWB requires, these software drivers may serve the role of the telephony interface.
Although an audio-only WWW browser can be extremely useful under many circumstances, it is clear that many users will continue to use their visually based browsers in circumstances which permit it. Therefore, it is convenient to support the sharing of information between the two types of browsers. The friendly server (20 of Figure 1) is a repository for information that is accessible by any WWW-enabled device, including TAWB browsers and visual browsers. As some kinds of information are difficult to comprehend in an audio-only environment, the ability to share addresses with a visual browser can be seen as a required feature of an audio-only browser.
Figure 8 shows a block diagram of the friendly server 20. The user favorites store 82 is a persistent data repository for addresses of the user's favorite documents, that is, the documents he or she has chosen to associate with preset commands in the TAWB browser. The purpose of the user's favorites is to allow the user to access certain sites quickly with a minimum of browsing. The Internet interface for TAWB browser 84 allows the TAWB system to upload replacement favorites which may have been set by the user while using the TAWB system.
This interface is simply an FTP server as is known in the art. The Internet interface for traditional browser 86 allows the user to modify his or her favorites while using a traditional WWW browser. This interface consists of a CGI program within which the user can modify the favorites store and an HTTP server as is known in the art which allows the browser to access the CGI program. An example of a suitable CGI program is shown in Figure 9. This program allows the user to modify the values of the favorites by typing new URLs in the boxes. Pressing the update button completes the modification. In this way, the user may easily transfer documents found with a traditional browser to his or her TAWB browser.
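The update step of such a CGI program can be sketched as a function that applies a submitted form to the favorites store. The field names `fav0` through `fav9`, the dictionary representation of the store, and the rule that blank boxes leave a favorite unchanged are all assumptions; Figure 9 is not reproduced here. The parsing uses `urllib.parse.parse_qs` from the Python standard library.

```python
from urllib.parse import parse_qs


def apply_favorites_form(favorites, form_body):
    """Return a copy of the favorites store updated from a URL-encoded form.

    favorites: dict mapping an integer index to a URL (assumed layout).
    form_body: string such as "fav1=http%3A%2F%2Fexample.com%2F&fav2=".
    """
    fields = parse_qs(form_body)  # blank values are dropped by default
    updated = dict(favorites)
    for name, values in fields.items():
        if name.startswith("fav") and name[3:].isdigit():
            updated[int(name[3:])] = values[-1]  # last value wins if repeated
    return updated
```

Returning a new dictionary rather than mutating the store makes it simple to write the result to persistent storage only after the whole form has been processed.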
The user flags store 88 is a persistent data repository for the user's flagged URLs. The purpose of the flags is to make documents found with the TAWB browser available to a traditional browser. The Internet interface for TAWB browser 84 allows the TAWB system to upload addresses of documents which have been flagged by the user while using the TAWB system. This interface is simply an FTP server as is known in the art. The addresses are stored in the form of a WWW document containing hyperlinks to the flagged documents. The Internet interface for traditional browser 86 allows the user to view this document and follow the hyperlinks with a traditional browser. This interface is simply an HTTP server as is known in the art.
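Storing the flagged addresses as a WWW document containing hyperlinks can be sketched as follows. The page layout, the title text, and the function name `flags_document` are assumptions; the only requirement taken from the description is that each flagged address appears as a hyperlink.

```python
from html import escape


def flags_document(urls):
    """Build a small HTML page with one hyperlink per flagged URL."""
    items = "\n".join(
        f'<li><a href="{escape(u, quote=True)}">{escape(u)}</a></li>'
        for u in urls
    )
    return (
        "<html><body>\n<h1>Flagged pages</h1>\n<ul>\n"
        + items
        + "\n</ul>\n</body></html>\n"
    )
```

Escaping each address before it is written keeps characters such as `&` in a query string from corrupting the page, so that the links remain usable from a traditional browser.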
In summary, the present invention is a system which allows a user with an ordinary telephone to browse the World Wide Web. The present invention is an improvement over prior art systems because any WWW documents can be obtained, not just documents that were specially prepared for audio access. The present invention works with a traditional telephone and does not require the phone to have a visual display or special internal electronics.
The present invention is particularly valuable to people who need mobile access, are visually impaired, cannot afford browsers which require special hardware, or who work in environments where visual displays are not practical.
It is not intended that this invention be limited to the hardware or software arrangement, or operational procedures shown and disclosed. This invention includes all of the alterations and variations thereto as encompassed within the scope of the claims as follows.

Claims

1. A system for browsing the world wide web with a traditional telephone comprising: a host computer system capable of connecting to a telephone network and Internet, wherein said host computer system comprises: a voice modem; a telephone-driven audio WWW browser (TAWB) connected to said voice modem; and a network interface connected to said telephone-driven audio WWW browser.
2. A system for browsing the world wide web with a traditional telephone as claimed in claim 1 wherein said telephone-driven audio WWW browser comprises: a telephony interface; a DVP/telephony interchange connected to said telephony interface; a digital voice processing unit connected to said DVP/telephony interchange; an audio document renderer connected to said digital voice processing unit; a command and control module connected to said audio document renderer; and an internet interface connected to said command and control module.
3. A system for browsing the world wide web with a traditional telephone as claimed in claim 2 wherein said digital voice processing unit comprises: a text-to-speech synthesizer.
4. A system for browsing the world wide web with a traditional telephone as claimed in claim 3 wherein said digital voice processing unit further comprises: a voice recognizer component; wherein said digital voice processing unit converts voice commands received from said telephony interface by way of said DVP/telephony interchange into commands that are passed to said command and control module.
5. A system for browsing the world wide web with a traditional telephone as claimed in claim 2 wherein said command and control module comprises: a touch-tone to user-command map; command and control logic; a local flags cache; a local favorites cache; and a history list module.
6. A system for browsing the world wide web with a traditional telephone as claimed in claim 5 wherein said command and control logic comprises: a wait state module; an internet interface retrieval director; a command and control enter module; a telephony interface termination director; an internet interface store favorites director; and an internet interface store user flags director.
7. A system for browsing the world wide web with a traditional telephone as claimed in claim 6 wherein said command and control enter module comprises: a voice to user-command map; a touch-tone to user-command map; a user-command wait module; and a command decision module.
8. A system for browsing the world wide web with a traditional telephone as claimed in claim 1 further comprising: a friendly server for storing information.
9. A system for browsing the world wide web with a traditional telephone as claimed in claim 8 wherein said friendly server comprises: an internet interface for said telephone-driven audio WWW browser; an internet interface for a traditional browser; a user favorites store; and a user flags store.
10. A system for browsing the world wide web with a traditional telephone as claimed in claim 1 wherein said telephone-driven audio WWW browser comprises: telephony interface means for capturing and delivering audio streams and touch tones to and from said telephone network; DVP/telephony interchange means for converting between dissimilar formats used by said telephony interface means; digital voice processing means for converting a text stream to an audio voice stream and for sending said audio voice stream to said telephony interface means by way of said DVP/telephony interchange means; audio document renderer means for converting structured documents from said world wide web into an audio rendition; command and control module means for interpreting user commands and for directing an appropriate response; and internet interface means for providing access to WWW servers.
11. A system for browsing the world wide web with a traditional telephone as claimed in claim 10 wherein said audio rendition comprises: audio signals sent directly to said telephony interface means.
12. A system for browsing the world wide web with a traditional telephone as claimed in claim 10 wherein said audio rendition comprises: a specially prepared structured stream that is sent to said digital voice processing means.
13. A method for browsing the world wide web with a traditional telephone comprising the steps of: utilizing a host computer system connected to a telephone network and Internet; wherein utilizing a host computer system comprises the steps of: communicating through a voice modem from said telephone network; utilizing a telephone-driven audio WWW browser (TAWB) connected to said voice modem; and using a network interface between said Internet and said telephone-driven audio WWW browser.
14. A method for browsing the world wide web with a traditional telephone as claimed in claim 13 wherein utilizing a telephone-driven audio WWW browser comprises the steps of: using a telephony interface; using a digital voice processing unit; interfacing said telephony interface and said digital voice processing unit with a DVP/telephony interchange; utilizing an audio document renderer connected to said digital voice processing unit; utilizing a command and control module connected to said audio document renderer; and using an internet interface connected to said command and control module.
15. A method for browsing the world wide web with a traditional telephone as claimed in claim 14 wherein using a digital voice processing unit comprises the step of: performing text-to-speech synthesizing.
16. A method for browsing the world wide web with a traditional telephone as claimed in claim 15 wherein using a digital voice processing unit further comprises the step of: recognizing a voice component.
17. A method for browsing the world wide web with a traditional telephone as claimed in claim 14 wherein utilizing a command and control module comprises the steps of: converting a touch-tone to user-commands; utilizing command and control logic; utilizing a local flags cache; utilizing a local favorites cache; and accessing a history list module.
18. A method for browsing the world wide web with a traditional telephone as claimed in claim 17 wherein utilizing command and control logic comprises the steps of: entering a wait state; directing said internet interface to retrieve a user's favorites from a friendly server; entering a command and control main loop; directing said telephony interface to terminate connection; directing said internet interface to store favorites; and directing said internet interface to store user flags.
19. A method for browsing the world wide web with a traditional telephone as claimed in claim 18 wherein entering a command and control main loop comprises the steps of: converting voice to user-commands; converting touch-tone to user-commands; waiting for a user-command; and deciding which user command.
20. A method for browsing the world wide web with a traditional telephone as claimed in claim 13 further comprising the step of: storing information in a friendly server.
21. A method for browsing the world wide web with a traditional telephone as claimed in claim 20 wherein storing information comprises the steps of: interfacing said friendly server with said telephone-driven audio WWW browser; interfacing said friendly server with a traditional browser; storing user favorites; and storing user flags.
22. A method for browsing the world wide web with a traditional telephone as claimed in claim 13 wherein utilizing said telephone-driven audio WWW browser comprises the steps of: capturing and delivering audio streams and touch tones to and from said telephone network with a telephony interface; converting between dissimilar formats used by said telephony interface means with a DVP/telephony interchange; converting a text stream to an audio voice stream and sending said audio voice stream to said telephony interface by way of said DVP/telephony interchange with a digital voice processing unit; converting structured documents from said world wide web into an audio rendition with an audio document renderer; interpreting user commands and directing an appropriate response with a command and control module; and providing access to WWW servers with an internet interface.
23. A system for browsing the world wide web with a traditional telephone comprising: a computer system capable of connecting to a telephone network and Internet; and a friendly server for exchanging information between a telephone based browser and a visual based browser.
24. A system for browsing the world wide web with a traditional telephone as claimed in claim 23 wherein said friendly server comprises: an internet interface for a telephone-driven audio WWW browser; an internet interface for a traditional browser; a user favorites store; and a user flags store.
25. A system for exchanging information between a telephone based browser and a visual based browser comprising: a computer system capable of connecting to a telephone network and Internet; and a friendly server connected to said computer system.
26. A system for exchanging information between a telephone based browser and a visual based browser as claimed in claim 25 wherein said friendly server comprises: an internet interface for a telephone-driven audio WWW browser; an internet interface for a traditional browser; a user favorites store; and a user flags store.
EP99904341A 1998-03-10 1999-01-28 A system for browsing the world wide web with a traditional telephone Ceased EP1062798A1 (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US3795198A 1998-03-10 1998-03-10
US37951 1998-03-10
PCT/US1999/001751 WO1999046920A1 (en) 1998-03-10 1999-01-28 A system for browsing the world wide web with a traditional telephone

Publications (1)

Publication Number Publication Date
EP1062798A1 true EP1062798A1 (en) 2000-12-27

Family

ID=21897244

Family Applications (1)

Application Number Title Priority Date Filing Date
EP99904341A Ceased EP1062798A1 (en) 1998-03-10 1999-01-28 A system for browsing the world wide web with a traditional telephone

Country Status (2)

Country Link
EP (1) EP1062798A1 (en)
WO (1) WO1999046920A1 (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AU2928801A (en) * 2000-01-04 2001-07-16 Heyanita, Inc. Interactive voice response system
SG110999A1 (en) * 2000-03-14 2005-05-30 Comease Pte Ltd Client-server system for controlling internet browsing and method thereof
SG98374A1 (en) * 2000-03-14 2003-09-19 Egis Comp Systems Pte Ltd A client and method for controlling communications thereof
US8131555B1 (en) 2000-03-21 2012-03-06 Aol Inc. System and method for funneling user responses in an internet voice portal system to determine a desired item or service
US20020072916A1 (en) * 2000-12-08 2002-06-13 Philips Electronics North America Corporation Distributed speech recognition for internet access
US8238881B2 (en) 2001-08-07 2012-08-07 Waloomba Tech Ltd., L.L.C. System and method for providing multi-modal bookmarks
GB0121160D0 (en) 2001-08-31 2001-10-24 Mitel Knowledge Corp Split browser
DE10201623C1 (en) * 2002-01-16 2003-09-11 Mediabeam Gmbh Method for data acquisition of data made available on an Internet page and method for data transmission to an Internet page
US8213917B2 (en) 2006-05-05 2012-07-03 Waloomba Tech Ltd., L.L.C. Reusable multimodal application

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB9523759D0 (en) * 1995-11-21 1996-01-24 Pollitt Alexander J World wide web information retrieval system
WO1997023973A1 (en) * 1995-12-22 1997-07-03 Rutgers University Method and system for audio access to information in a wide area computer network
WO1997040611A1 (en) * 1996-04-22 1997-10-30 At & T Corp. Method and apparatus for information retrieval using audio interface
US6282511B1 (en) * 1996-12-04 2001-08-28 At&T Voiced interface with hyperlinked information

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See references of WO9946920A1 *

Also Published As

Publication number Publication date
WO1999046920A1 (en) 1999-09-16

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 20000831

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): DE FR GB IT

17Q First examination report despatched

Effective date: 20080227

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION HAS BEEN REFUSED

18R Application refused

Effective date: 20091110