EP1413100A1

EP1413100A1 - Interactive voice response system

Info

Publication number: EP1413100A1
Application number: EP02748983A
Authority: EP
Inventors: James Teague Morris; Bahul Neel Upadhyaya (Eckoh Technologies PLC)
Original assignee: Eckoh Technologies (UK) Ltd
Current assignee: Eckoh Technologies (UK) Ltd
Priority date: 2001-06-27
Filing date: 2002-06-27
Publication date: 2004-04-28
Also published as: WO2003003676A1; GB2377119A; GB0115715D0

Abstract

A method of providing telephony services to a user in receipt of a textual message, said method comprising: accessing personal contact information on behalf of the user, said personal contact information comprising telephone numbers; receiving a textual message on behalf of the user, said textual message not including a telephone number; receiving a command from the user to respond to the textual message by setting up a telephone call; and setting up a telephone call to a telephone number selected from said personal contact information at least in part on the basis of information received in said textual message.

Description

INTERACTIVE VOICE RESPONSE SYSTEM

Field of the Invention

This invention relates to methods of performing voice response interactions with a user, methods of storing and processing information on behalf of a user, methods of providing telephony services to a user, apparatus and computer software adapted to perform such methods.

Background

WO-A-0106489 describes a telephony system providing text to speech conversion of a text message, for example e-mail, to a user accessing the system by telephone. In addition the system can provide a selectable foreign language translation in the text to speech conversion.

WO-A-0018100 describes a telephony system whereby a user is able to access and listen to their personal e-mail messages via text to speech conversion. In addition, users can set up telephone calls using contacts from a preconfigured personal address book. In order to set up a call, the user utters the command "dial «name»".

WO-A-9965256 describes a telephony system in which a user is notified of a new email or v-mail message on their mobile phone. The user is able to listen to the message in full by calling a service from their phone, and reply with a similar message if desired, via a voice recognition system.

US-A-6246983 also describes a telephony system in which a user can access and listen to their email messages via voice commands over a telephone and reply by means of a recording saved as a digital audio file. In addition, a user can navigate within messages based upon a user-selected "granularity", for example by paragraphs, words or even characters.

Thus, network telephony services are known which allow a user to store contact information entries on a network node and to set up a telephone call to a selected contact from the stored entries by voicing a name tag stored in the selected entry. The user's voiced command is detected by a voice recognition engine, but reliability of the recognition process can cause difficulties, causing the user to have to repeat the command or generating errors in the selection of the correct entry.

Known systems for playing email messages to a user over a voice telephony connection also allow a user to respond to an email message with a voice message. A user, on commanding the service to generate a response, records a voice message over the telephony connection which is sent to the original sender of the email message as an email response, with the voice message attached to the message as an audio file.

It is an object of the invention to provide improvements in relation to these known methods and systems.

Summary of the Invention

In accordance with one aspect of the invention, there is provided a method of providing telephony services to a user in receipt of a textual message, said method comprising: storing personal contact information on behalf of the user, said personal contact information comprising telephone numbers; receiving a textual message on behalf of the user, said textual message not including a telephone number; receiving a command from the user to respond to the textual message by setting up a telephone call; and setting up a telephone call to a telephone number selected from said personal contact information at least in part on the basis of information received in said textual message.

This aspect provides added convenience to the user. For example, by looking up a telephone directory number from an e-mail address in a received message, the reply telephone call can be made without the user needing to specify or select the telephone directory number of the recipient.

In accordance with a further aspect of the invention, there is provided a method of storing and processing information on behalf of a user, said method comprising: accessing a set of personal information entries on behalf of the user; enabling a user to select a subset, having less members than said set, from said set of entries, which subset is to be enabled for voice recognition; receiving a voiced command from the user; conducting voice recognition to select one of said subset in response to said voiced command; and processing data from the selected entry.

By use of this aspect of the invention, voice recognition of name tags in a personal contacts directory can be made more reliable and thus more convenient to a user.

In accordance with a further aspect of the invention there is provided a method of communicating a message to a user from an interactive voice response engine, said method comprising: conducting a call with a first user over a first telephony connection; recording a message from the first user over said first telephony connection; storing said message in a first audio file; transmitting said message in a second audio file sent to a selected second user; conducting a call with said second user over a second telephony connection; playing audio signals from said first audio file to said second user over said second telephony connection.

Using this aspect allows a compressed version of a voice message to be transmitted, for example by email, but a higher quality version to be replayed to a user dialling in to the telephony system.

Further features and advantages of the different aspects of the invention will be apparent from the following detailed description of preferred embodiments of the invention, made with reference to the accompanying drawings.

Brief Description of the Drawings

Fig. 1 is a schematic illustration of an interactive voice response system arranged in accordance with an embodiment of the invention;

Fig. 2 is a flow diagram illustrating steps conducted by the voice response system of Fig. 1 in an embodiment of the invention; and

Fig. 3 is a schematic illustration of a Web page arranged in accordance with an embodiment of the invention.

Detailed Description

Referring now to Fig. 1, an interactive voice response system arranged in accordance with an embodiment of the invention includes various hardware components. These hardware components are themselves known and will not be described in detail; the functioning of the components, and in particular the functioning of the computer software running on the components is new and will be described in further detail.

The system includes an interactive voice response (IVR) engine 2 capable of recognising commands over a voice telephony connection; these commands include commands voiced by users and recognised by a voice recognition component of the IVR engine 2, and may also include dual tone multi-frequency (DTMF) commands transmitted from users by pressing keys on their telephone handsets. The IVR engine is also capable of playing audio content, both in the form of persistently stored audio files and temporary audio files generated by a text to speech (TTS) engine 20 in the system. These audio files are stored in audio file store 14.

The IVR engine has voice telephony connections provided by circuit switching matrix 4, whereby a number of circuit switched telephone lines, connected to a voice telephony network, are terminated at the IVR engine 2. Users may access the system via different types of telephone terminal, including an exemplified fixed line telephone handset 6, connected to the voice telephony network via a fixed line 8, and mobile telephony handset 10, connected to the voice telephony network via a cellular radio communications link 12.

The IVR engine 2 has access to a user data store 16, in which user-specific information is stored, including logging in information, personal contacts information and email messages downloaded by the system on behalf of users. The IVR interactions with the user are personalised by means of the data stored in user data store 16. In addition, a user may add data to user data store 16 by means of voice and/or DTMF interaction with the IVR engine 2.

The IVR engine 2 is also connected to an email server 18 used to send outgoing emails via a connection to a data communications network 24, typically the Internet. Email server 18 is also capable of downloading emails from remote email servers, exemplified by remote email server 28, which receive, process and transmit email messages on behalf of the users of the IVR system of the present invention. Remote email server 28 may for example be one provided by the user's internet service provider such as AOL™, or by a Web portal such as Yahoo!™. Email servers 18, 28 communicate over data communications network 24 using standard Internet protocols such as the SMTP protocol and the POP protocol.

The IVR engine 2 is also connected to a World Wide Web (herein "Web") server 22 providing Web resources to users via the data communications network. Thus, a user may access these Web resources by using a browser application, for example Microsoft Internet Explorer™ or Netscape Navigator™, running on a computer terminal such as an exemplified desktop computer workstation 26 connected to the data communications network 24. The content provided by Web server 22 is personalised by means of data stored in user data store 16. In addition, a user may add data to user data store 16 by means of Web resources, such as Web pages containing forms to be filled in by users and posted back to Web server 22.

A user may register with the system by interaction with Web server 22. Namely, the user may fill out a Web page form giving details of a unique ID, preferably corresponding to the number of the telephone from which the user will most often access the service so that the service may recognise the user automatically by caller line identification (CLI), a password to be used at login, other personal information, such as personal contact information in the form of name tags, telephone numbers and email addresses for a number of personal contacts of the user, and details of the user's personal email account held with remote email server 28, including the user's email address, their account password and the address of the mail server. In a preferred embodiment, the system stores addresses of well-known mail servers and automatically presents this information to the user as a form prefill for confirmation, based on the email address the user provides. For example, the mail server address for all users with email addresses in the domain "yahoo.co.uk" may be recognised and prefilled on behalf of the user.
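By way of illustration only, the mail server prefill described above might be sketched as a lookup keyed on the domain of the email address the user provides. The table name, the function name and the server address shown are hypothetical placeholders and do not appear in the description.

```python
# Hypothetical table of well-known mail servers, keyed by email domain.
# The server address is a placeholder for illustration, not a real host.
KNOWN_MAIL_SERVERS = {
    "yahoo.co.uk": "pop.mail-server.example",
}


def prefill_mail_server(email_address):
    """Return a suggested mail server address for the form prefill,
    or None if the domain is not one the system knows about."""
    if "@" not in email_address:
        return None
    # Take everything after the last "@" as the domain.
    domain = email_address.rpartition("@")[2].strip().lower()
    return KNOWN_MAIL_SERVERS.get(domain)
```

The value returned would only be presented to the user for confirmation, consistent with the prefill being offered rather than imposed.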

Whilst a Web-based interaction is preferred in order to provide user convenience, a similar registration procedure, or at least parts thereof and/or updates thereto may also be performed via the IVR interface, by the user providing information in the form of voiced and/or DTMF responses to queries by the IVR engine when in data acquisition mode, which can be entered for example by the user uttering a predetermined keyword or keyword sequence when conducting a voice call with the IVR engine 2.

Once the user has registered with the system, the user may access their emails via IVR engine 2. First, the user places a voice call, via their selected telephone handset, to the IVR engine 2, and logs in by supplying their user ID (if not automatically recognised via CLI) and user password. Next, the user may enter email retrieval mode by uttering a predetermined keyword or keyword sequence, such as "mailbox, retrieve mail". On receiving this command, IVR engine 2 instructs email server 18 to download all new emails from remote email server 28, stores same in user data store 16 and then proceeds to conduct the procedure illustrated in Fig. 2.

In step 100, IVR engine 2 retrieves the first of the stored email messages. The message typically includes email text and may have an attachment, which may also include text of a readable format. In the following, the email text is described as being played to the user as a voice message, however the same procedure may be applied to a readable text attachment. Furthermore, a similar procedure may be applied to a recognisable audio file attachment, by playing back the audio file over the telephony connection and conducting fast forward and rewind of the audio file during playback in accordance with user commands. Such an audio file may contain speech, music or any other type of audio data. If the attachment is not a readable textual attachment (being of an unknown format or containing nontextual data), or if the user selects not to have the attachment played as a voice message, the procedure is applied only to the email text.

In step 102 IVR engine 2 divides the message text into two or more parts, depending on the message length, and selects the first part in the sequence to be played to the user. The division is carried out to generate parts of generally similar size, for example by tending towards a preset average using rules to determine the locations of breaks in the text. The preset average may for example be set at between 10 and 50 words, or may be measured in terms of characters providing similar length sections. The parts are separated at natural language breaks in the text, for example at the end of paragraphs and/or sentences. An excessively large paragraph may be divided into two parts, whilst two or more relatively small paragraphs may be retained in a single part.
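The division of step 102 may be sketched as follows, purely as an illustrative assumption about one possible rule set. The function name, the default target of 30 words and the sentence-splitting regular expression are not taken from the description; the sketch only breaks text at paragraph and sentence ends and closes a part when the next sentence would exceed the target size.

```python
import re


def split_into_parts(text, target_words=30):
    """Divide message text into parts of roughly target_words words,
    breaking only at paragraph or sentence boundaries."""
    parts, current, count = [], [], 0
    # Paragraphs are assumed to be separated by one or more blank lines.
    for paragraph in re.split(r"\n\s*\n", text.strip()):
        # Sentences are assumed to end with ., ! or ? followed by whitespace.
        for sentence in re.split(r"(?<=[.!?])\s+", paragraph.strip()):
            n = len(sentence.split())
            if n == 0:
                continue
            # Close the current part if this sentence would overshoot the target.
            if current and count + n > target_words:
                parts.append(" ".join(current))
                current, count = [], 0
            current.append(sentence)
            count += n
    if current:
        parts.append(" ".join(current))
    return parts
```

Under these rules a long paragraph is divided between its sentences, whilst several short paragraphs may be retained in a single part. A single sentence longer than the target is kept whole rather than cut mid-sentence.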

In step 104 IVR engine 2 passes the first part to TTS engine 20 for the generation of an audio file containing speech in a sequence corresponding to the text in the part selected. On completion of the conversion, the audio file is stored in audio store 14 and played to the user over the voice telephony connection, step 106. During playback, the user may utter a voice, or transmit a DTMF, command, step 108. If the command relates to the playback sequence of the email message, such as a "skip" or "back" command, step 110, IVR engine 2 returns to step 102 to select a part of the text to be played back next in response to the command.

In response to a "skip" command, IVR engine 2 selects the next part in the text sequence and proceeds to generate and play back the corresponding audio file, steps 104, 106, before the completion of the part currently being played back. Thus, the next part is played out of sequence in that an intervening section is omitted from normal voice playback. A similar out-of-sequence effect could be achieved by "fast forward" playback of the intervening part. The "skip" command is thus useful to the user in order to avoid playback of message text which is not of interest to the user at that time.

In response to a "back" command, IVR engine 2 selects the currently playing or previous part in the text sequence and proceeds to retrieve and play back the corresponding audio file from its start, steps 104, 106, before the completion of the part currently being played back. Thus, the currently playing or previous part is played out of sequence in that an already played part is replayed by normal voice playback. A similar out-of-sequence effect could be achieved by "reverse" or "quick reverse" playback of the already played part. The "back" command is thus useful to the user in order to obtain playback of message text which is of interest to the user and which the user wishes to hear again.
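The selection of the next part in response to "skip" and "back" commands can be illustrated with a small sketch. The function name and the back_to_previous flag are hypothetical; the flag merely reflects that the description allows "back" to select either the currently playing part or the previous part.

```python
def select_part(current, command, total, back_to_previous=False):
    """Return the index of the part to play next in response to a
    playback command, or None if the end of the message is passed.

    current: index of the part now playing (0-based)
    total:   number of parts in the message
    """
    if command == "skip":
        target = current + 1
    elif command == "back":
        # Replay the current part from its start, or step back one part.
        target = current - 1 if back_to_previous else current
        target = max(target, 0)
    else:
        raise ValueError("not a playback command: " + command)
    return target if target < total else None
```

A return value of None would correspond to the end of the email text being reached, at which point the procedure moves on to the next message.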

If the command given by the user is not one relating to the playback of the currently-selected email message, the appropriate response is carried out by IVR engine 2, step 112. Such commands include "next message" to allow the user to move to the next message for playback thereof, "reply", in which a user records a voice message to reply to the currently playing email in an email response sent by email server 18, and "phone", in which the IVR engine 2 sets up a telephone call to the message sender.

When a user utters "phone" in step 108, or enters a corresponding DTMF command, IVR engine 2 parses the email header to determine the originating email address, finds the email address in the user's personal contacts information, looks up the corresponding telephone number, confirms it with the user by playback (or if more than one is stored for the same user, conducts an IVR interaction to determine which of the numbers to use), sets up a call to the selected telephone number via switching matrix 4, and on answer, bridges the two call legs at switching matrix 4 to connect the user to the selected respondent. If the originating email address is not recognised, or if no corresponding telephone number is stored, the user is prompted to enter, by voice or DTMF commands, the telephone number to use in replying to the message.
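The lookup performed on a "phone" command may be sketched as below, as a minimal illustration rather than the system's actual implementation. The function name and the contact record keys ("name", "email", "phones") are assumptions; the header parsing uses the Python standard library email package. Call set-up and bridging at the switching matrix are outside the scope of the sketch.

```python
from email import message_from_string
from email.utils import parseaddr


def numbers_for_sender(raw_email, contacts):
    """Return the telephone numbers stored for the sender of raw_email.

    contacts is a list of dicts with the hypothetical keys
    "name", "email" and "phones". An empty list is returned when the
    sender is not recognised or has no stored number, in which case
    the user would be prompted to supply a number.
    """
    message = message_from_string(raw_email)
    # Extract the bare address from the From header, if present.
    _, address = parseaddr(message.get("From", ""))
    address = address.lower()
    if not address:
        return []
    for entry in contacts:
        if entry["email"].lower() == address:
            return list(entry["phones"])
    return []
```

If exactly one number is returned it would be confirmed with the user by playback; if several are returned, a further IVR interaction would determine which to use.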

In step 114, if the end of the currently playing audio file is reached, the next sequential part of the message text is selected for playback, step 102, unless the end of the email text has been reached, step 116. If so, the next email in the sequence is selected for playback, unless the just completed email was the last, step 118, in which case the IVR engine 2 returns to a start menu awaiting a further voice command from the user, step 120. In the start menu, the user may for example opt to send an email message, or call, to any recipient in their contacts list by uttering a voice command followed by the appropriate name tag, for example "send mail, Joe Bloggs" or "phone, Joe Bloggs", in which case IVR engine 2 carries out the appropriate action to generate an email addressed to the selected recipient or set up a call to the selected recipient's telephone number.

When a user sends a voice mail message via the email function of the system, for example when the user utters "reply" in step 108, or enters a corresponding DTMF command, or when the user opts to send an email in a different part of the menu structure, IVR engine 2 records the message of the user as a high-quality, relatively large, audio file, in a preferred audio file format compatible for playback via the IVR system such as a VOX file format, and stores same in audio file store 14 before converting the file to a lower quality, relatively small, audio file suitable for transmission by email and in a preferred format suitable for playback on most user terminals, preferably a widely used audio file format such as a .wav file format. The high-quality file is stored for a limited period of time, for example at least a given minimum period lasting a number of days in case the recipient is to request playback of the message via the IVR engine 2. IVR engine 2 then adds the .wav file to an email to be transmitted via email server 18 to the selected recipient. Furthermore, IVR engine 2 stores a unique identifying code for the voice message against the file stored in file store 14 and inserts a corresponding code in the header information of the email message. Furthermore, a corresponding code is inserted in email text instructing the recipient to enter the code in an IVR interaction upon calling the IVR engine. If a recipient of the email message calls the IVR system rather than playing back the attached audio file, then on the recipient entering the given code, the corresponding high quality stored audio file is played back to the recipient.

If a recipient of the email message is a user of the IVR system for the personal email management and replay function of the system, it is also possible that the recipient may opt to listen to the attached voice message whilst browsing through their emails via the IVR engine. Thus, the system parses the header information accompanying emails recognised to contain .wav files in order to identify emails containing voice messages originating from the IVR system. If a code generated by the system is found in the header of the email, the corresponding high quality stored audio file is played back to the recipient, rather than the .wav file in the received email message. Hence, lowering of the quality of the played back audio signal due to the conversion between different formats and the relatively lower quality of the audio files attached to the original email messages is avoided.
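The association between an identifying code and the stored high quality audio file may be sketched as follows. This is an assumption-laden illustration: the header name X-Voice-Message-ID, the class and method names, and the use of a random hexadecimal code are all hypothetical, and the limited retention period for stored files is not modelled.

```python
import uuid

# Hypothetical header name; the description does not specify one.
HEADER_NAME = "X-Voice-Message-ID"


class VoiceMessageStore:
    """Maps identifying codes to stored high quality audio files."""

    def __init__(self):
        self._files = {}

    def register(self, high_quality_path):
        """Store the file path against a new unique code and return the
        code, which would be inserted in the outgoing email header."""
        code = uuid.uuid4().hex
        self._files[code] = high_quality_path
        return code

    def file_for_playback(self, headers, attached_wav_path):
        """Prefer the stored high quality file when the email header
        carries a known code; otherwise fall back to the attachment."""
        code = headers.get(HEADER_NAME)
        return self._files.get(code, attached_wav_path)
```

For example, an email whose header carries a code returned by register would be played from the stored file, whilst an email without a recognised code would be played from its attached .wav file.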

Figure 3 shows a Web page 200 illustrating an aspect of the invention used to improve the accuracy of the voice recognition performed by IVR engine 2 when working with the user's personal contacts list. Web page 200 is sent to the user by Web server 22 after the user has logged into the system by supplying an appropriate user ID and password combination. The Web page includes a contacts list 202, containing a number of entries, preferably over 20 and more preferably over 50, containing name tag records 204, telephone number records 206 and email address records 208. The list may be scrolled by scroll button 210, an entry may be edited by clicking on the name tag, and a new contact may be added by means of "add new contact" button 212. Also provided are form boxes 214, of which selected boxes 216 are checked, by user selection, to indicate that the selected entry is enabled in IVR engine 2 for voice recognition. Such enablement allows the entry to be accessed or used by the user uttering the name tag held in the entry, for example to allow the user to send an email or telephone a selected contact over a voice connection with IVR engine 2 without having to remember or supply the email address or telephone number.

Generally, the fewer entries selected for activation on the IVR engine 2, the more accurate and effective the voice recognition process will be.
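The enablement of a subset of contact entries for voice recognition, optionally capped at a preset maximum, may be sketched as follows. The class and method names and the default cap of 20 are assumptions made for illustration; the recognition step itself is not modelled, only the vocabulary of name tags that would be passed to it.

```python
class ContactList:
    """A user's contact entries, of which a capped subset is enabled
    for voice recognition."""

    def __init__(self, max_enabled=20):
        self.max_enabled = max_enabled
        self.entries = {}     # name tag -> contact details
        self.enabled = set()  # name tags enabled for voice recognition

    def add(self, name_tag, phone=None, email=None):
        self.entries[name_tag] = {"phone": phone, "email": email}

    def enable(self, name_tag):
        """Enable an entry; return False if the cap is already reached."""
        if name_tag not in self.entries:
            raise KeyError(name_tag)
        if name_tag not in self.enabled and len(self.enabled) >= self.max_enabled:
            return False
        self.enabled.add(name_tag)
        return True

    def disable(self, name_tag):
        self.enabled.discard(name_tag)

    def recognition_vocabulary(self):
        """Name tags the recogniser would need to distinguish between."""
        return sorted(self.enabled)
```

Checking or unchecking a box on the contacts Web page would correspond to a call to enable or disable, so the subset can be changed dynamically while the full set remains stored.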

Preferably, the user is limited to a preset maximum, preferably 50 or less and more preferably 20 or less, of entries which may be enabled for voice recognition in IVR engine 2. Thus, the user may select a subset from their entire set of contacts, themselves accessible via the Web interface, for enablement in voice recognition. The subset may be dynamically changed as desired by the user by checking and unchecking the appropriate boxes on their personal contacts Web page. Alternatively, or in addition, the subset of contacts enabled for voice recognition may be selected and dynamically changed by interaction with IVR engine 2.

The above embodiments are to be understood as illustrative examples of the invention. Further embodiments of the invention are envisaged. For example, whilst in the above embodiments, the user has access to the system via a Web interface, wireless transfer protocols such as WAP™ and I-mode™ may alternatively be used to allow a user to similarly interact with the system by means of graphical interface on a mobile telephony device. Other hardware configurations are also envisaged; furthermore the use of pluralities of the different hardware components described would be useful in order to scale up the number of users the system is able to serve. It is to be understood that any feature described in relation to one embodiment may also be used in other of the embodiments. Furthermore, equivalents and modifications not described above may also be employed without departing from the scope of the invention, which is defined in the accompanying claims.

Claims

1. A method of providing telephony services to a user in receipt of a textual message, said method comprising: a) accessing personal contact information on behalf of the user, said personal contact information comprising telephone numbers; b) receiving a textual message on behalf of the user, said textual message not including a telephone number; c) receiving a command from the user to respond to the textual message by setting up a telephone call; and d) setting up a telephone call to a telephone number selected from said personal contact information at least in part on the basis of information received in said textual message.

2. A method according to claim 1, wherein said textual message is an email message.

3. A method according to claim 1 or 2, wherein said textual message comprises an email address, said selection being performed at least in part on the basis of said email address.

4. A method according to claim 3, wherein said personal contact information comprises email addresses, said selection being performed by selecting a telephone number corresponding to an email address in an entry in said personal contact information.

5. A method according to any of claims 1 to 4, comprising converting text from said email message to speech and playing said speech to said user.

6. A method according to any of claims 1 to 5, said method comprising receiving said command via an initial telephone call, and bridging the initial telephone call and the subsequently set up telephone call to connect the user to the selected telephone number.

7. A method of storing and processing information on behalf of a user, said method comprising: a) storing a set of personal information entries on behalf of the user; b) enabling a user to select a subset, having less members than said set, from said set of entries, which subset are to be enabled for voice recognition; c) receiving a voiced command from the user; d) conducting voice recognition to select one of said subset in response to said voiced command; and e) processing data from the selected entry.

8. A method according to claim 7, wherein said entries comprise a name record, and said voiced command comprises a name from the selected entry.

9. A method according to claim 7 or 8, wherein said entries comprise personal contact information.

10. A method according to claim 9, wherein said contact information comprises telephone numbers.

11. A method according to claim 10, wherein step (e) comprises setting up a telephone call to a telephone number from the selected entry.

12. A method according to claim 11, wherein said contact information comprises email addresses.

13. A method according to claim 12, wherein step (e) comprises generating an email message addressed to an email address from the selected entry.

14. A method according to any of claims 7 to 13, comprising limiting a number of entries in said subset to a maximum number less than an allowed number of entries in said set.

15. A method according to claim 14, wherein said maximum number is less than 50.

16. A method according to claim 15, wherein said maximum number is less than 20.

17. A method according to any of claims 14 to 16, wherein said allowed number is greater than 20.

18. A method according to claim 17, wherein said allowed number is greater than 50.

19. A method according to any of claims 7 to 18, comprising allowing a user to dynamically alter the members of said subset by altering selections from said set.

20. A method according to any of claims 7 to 19, comprising providing a graphical interface for performing the selection of said subset.

21. A method according to claim 20, wherein said graphical interface comprises a Web page or suchlike.

22. A method according to any of claims 7 to 21, wherein said method is carried out at least in part by an interactive voice response engine, said voiced command being received over a telecommunications link.

23. A method of communicating a message to a user from an interactive voice response engine, said method comprising: a) conducting a call with a first user over a first telephony connection; b) recording a voice message from the first user over said first telephony connection; c) storing said message in a first audio file; d) transmitting said voice message in a second audio file sent to a selected second user; e) conducting a call with said second user over a second telephony connection; and f) playing audio signals from said first audio file to said second user over said second telephony connection.

24. A method according to claim 23, wherein said step (c) comprises converting said voice message from said first audio file to said second audio file in order to perform data compression.

25. A method according to claim 23 or 24, wherein the second audio file is transmitted in an email message.

26. Apparatus adapted to carry out the method of any of the preceding claims.

27. Computer software adapted to carry out the method of any of claims 1 to 25.