EP1155552A4 - Speech-recognition-based phone numbering plan

Info

Publication number: EP1155552A4
Application number: EP00905896A
Authority: EP
Inventors: Alex Kurganov
Original assignee: Webley Systems Inc
Current assignee: Webley Systems Inc
Priority date: 1999-02-01
Filing date: 2000-02-01
Publication date: 2004-12-29
Also published as: EP1155552A1; CA2362195A1; AU2750100A; WO2000049790A1

Abstract

In accordance with one aspect of the present invention, a network comprised of, but not limited to redundant telephone call processing computers (servers), database servers and voice recognition servers is provisioned and connected in a Centrex-like way with a telephone company to receive multiple simultaneous telephone calls placed to a set of phone numbers. Each telephone number terminated on the network will belong to a group of, for example, several thousand subscribers. When a caller dials this number, she or he is asked by the call-processing server to state the name of the person that is trying to be reached. The voice recognition server recognizes the particular subscriber name with high accuracy, confirms it back to the caller and directs the call to the subscriber.

Description

SPEECH-RECOGNITION-BASED PHONE NUMBERING PLAN

TECHNICAL FIELD

The present invention relates to telecommunications. In particular, the invention relates to a method and system for speech-recognition-based phone numbering plan.

DISCLOSURE OF INVENTION

As the need for phone numbers increases, phone companies' available numbers become exhausted. The exhaustion of available numbers in various area codes in the United States has led to the introduction of additional area codes. The introduction of new area codes causes inconvenience and confusion for local businesses and individuals. Once new area codes are implemented, the area covered by each area code becomes progressively smaller, which often necessitates dialing ten-digit numbers to reach businesses or individuals in the same city. The boundaries for various area codes are often difficult to remember, which makes finding and reaching businesses increasingly difficult. As the numbers in the new area codes become exhausted, the telephone companies' only alternative may be to increase the number of digits. Consideration has been given to switching from the current seven-digit local numbers to eleven-digit numbers. Such a switch would be extremely expensive and would significantly increase the inconvenience of dialing and keeping track of the eleven-digit numbers. The present invention addresses this problem and provides a simple and elegant solution to it. The solution eliminates the need for exorbitant expenses associated with switching to eleven-digit numbers and makes reaching businesses and individuals easier than at the present time.

The present invention also provides a new approach to paying for telephone calls. The cost of the telephone calls is underwritten by advertisers and by internet companies which desire callers to use their portals.

Thus, one object of the present invention is to provide a method and a system for reducing inconvenience by allowing the use of a single common speech-enabled telephone system number and eliminating the necessity of remembering phone numbers for various parties.

Another object of the present invention is to provide a method and a system for making reaching individuals easier and quicker. A further object of the present invention is to obviate the need for storing and remembering telephone numbers of businesses or individuals associated with known businesses or groups.

A still further object of the present invention is to provide a system and a method for quickly reaching businesses and individuals who are a part of a defined group.

Still another object of the present invention is to provide a system and method which allows subscribers to manage, and route incoming phone calls, and store, access and forward messages.

Still another object of the present invention is to provide a better and more efficient service for phone users.

Still another object of the present invention is to make it more difficult for random sales callers to reach the subscribers.

Other objects will become apparent to those skilled in the art upon studying this disclosure.

SUMMARY OF THE INVENTION

In accordance with one aspect of the present invention, a network comprised of, but not limited to redundant telephone call processing computers (servers), database servers and voice recognition servers is provisioned and connected with a telephone company to receive multiple simultaneous telephone calls placed to a set of phone numbers. Each telephone number terminated on the network will belong to a group of, for example, several thousand subscribers. When a caller dials this number, the caller is asked by the call-processing server to state the name of the person trying to be reached. The voice recognition server recognizes the particular subscriber name with high accuracy and confirms it back to the caller and subsequently directs the call to the subscriber.

In accordance with another aspect of the present invention, the subscribers for each phone number are selected based on dissimilarity of their names to minimize or eliminate any ambiguity and resulting connection errors.

In accordance with a further aspect of the present invention, the subscribers for each number are selected based on their association or profession.

In accordance with still another aspect of the present invention, the subscribers having the same phone number can reach each other by dialing a single digit and identifying the name of the party whom they are calling.

In accordance with a further aspect of the present invention, businesses or associations having the same phone number can be efficiently reached without the need for an operator.

In accordance with a still further aspect of the present invention, advertisers and/or internet access providers will allow subscribers to receive calls to their toll-free numbers and to receive a predetermined number and size of voice mail messages and further to allow for merchants to pay for outbound telephone calls.

Other aspects of the present invention will become apparent to those skilled in the art upon studying this disclosure.

BRIEF DESCRIPTION OF DRAWINGS

Other objects and advantages of the present invention will become apparent upon reading the following description of illustrative embodiments and upon reference to these drawings.

FIG. 1 is a schematic of the preferred embodiment of the present invention for serving subscribers having the same phone number.

FIG. 2 is a schematic of the preferred embodiment of the present invention for handling calls to subscribers having the same phone number.

While the present invention is susceptible to various modifications and alternative forms, several embodiments will herein be described in detail. It should be understood, however, that it is not intended to limit the invention to the particular forms disclosed, but, on the contrary, the invention is to cover all modifications, equivalents, and alternatives falling within the scope and spirit of the invention as defined by the appended claims.

MODES FOR CARRYING OUT THE INVENTION

The present invention provides a large number of subscribers assigned to the same telephone number. A network comprised of, but not limited to, redundant telephone call processing servers, database servers and voice or speech recognition servers is provisioned with a telephone company to receive multiple simultaneous telephone calls placed to a single common phone number. The subscriber names and pronunciations and their corresponding telephone numbers and identifications are stored in the database. When a caller dials the common number, a call-processing server receives the dialed number from the telephone company. Based on that number, the call-processing server loads the binary representation of the subscribers' name list into the voice recognition server. The call-processing server initiates an action whereby the caller is asked to state the name of the person trying to be reached. The voice recognition server recognizes the particular subscriber name with high accuracy, confirms it back to the caller and directs the call to that subscriber.
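The call flow just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Subscriber` record, the `LISTINGS` table, the fictional numbers and the exact-match `recognize` function are hypothetical stand-ins, and a real recognition server would match phonemes and return a probability rather than compare strings. Falling back to an operator when no name matches follows the operator transfer mentioned later in the description.

```python
from dataclasses import dataclass


@dataclass
class Subscriber:
    account: str       # unique subscriber account number
    name: str          # normalized subscriber name
    destination: str   # where calls are currently routed


# Hypothetical stand-in for the subscriber database server:
# common dialed number -> subscribers who share that number.
LISTINGS = {
    "8005550100": [
        Subscriber("A1", "alice moreno", "tel:+13125550111"),
        Subscriber("A2", "victor chen", "tel:+13125550122"),
    ],
}


def load_name_list(dialed_number):
    """Load the name list for the listing behind the dialed number."""
    return {s.name: s for s in LISTINGS.get(dialed_number, [])}


def recognize(utterance, name_list):
    """Toy recognizer (assumption): exact match on a normalized transcript."""
    return name_list.get(utterance.strip().lower())


def handle_call(dialed_number, utterance):
    """Route one call: load the list, recognize the name, pick a destination."""
    name_list = load_name_list(dialed_number)
    match = recognize(utterance, name_list)
    if match is None:
        return "operator"  # assumed fallback when no name matches
    return match.destination
```

For example, `handle_call("8005550100", "Victor Chen")` selects the second subscriber's destination, because the dialed number determines which name list is loaded before recognition starts.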

A preferred embodiment of the present invention for subscribers having the same telephone numbers is shown in FIGS. 1-2 of the drawings. Referring now to FIG. 1, a public switched telephone network, generally designated by a numeral 4, receives calls from outside telephones such as telephone 5. The available numbers in an exchange are subdivided into listings (otherwise known as batches). Three representative listings are generally designated, in FIG. 1, by numerals 10, 12 and 13. Each listing 10, 12 and 13 contains several thousand subscriber numbers. Each listing of subscriber numbers 10, 12 and 13 is provided with a call processing server 6, a speech recognition server 7 and a subscriber database server 8. The call-processing server 6 receives the phone call and detects input (either voice input or input from a touch-tone pad) from the caller. The speech recognition server interprets the vocal input from the caller and compares it to a binary representation of a subscriber list disposed in the subscriber database server 8.

FIG. 2 generally depicts the handling of a telephone call that arrives at the telephone number of listing 10 in FIG. 1. As shown in FIG. 2, the telephone call is directed to a call-processing server 6 that is operated by a CPU (not shown). Specifically, the inbound call arrives (as indicated by arrow 50) at a line interface card 32. The line interface card 32 includes a voice resource module 34, a touch tone detection module 36 and a speech detection and echo cancellation module 38. The voice resource module 34 detects the voice or party the caller is attempting to reach. The CPU is also able to detect caller input from the touch-tone numerals on the caller's phone pad. The touch-tone detection module 36 detects this type of input. The speech detection and echo cancellation module 38 is the transfer means by which the caller's utterance is transferred from the call-processing server 6 to the speech recognition server 7.

After a call is received, the call-processing server 6 activates the voice resource module 34 and the speech detection and echo cancellation module 38 and invokes a command to play a message to the caller (as indicated by arrow 52). By sending this message, the CPU attempts to identify the subscriber of the telephone network that the caller is attempting to reach. An example of a message played to the caller is, "Please state the name of the person who you are calling." If either speech or a touch-tone button input is detected, the CPU interrupts the message if it was still playing. This interruption is referred to as "barging in". The call-processing server 6 locates and communicates with (as indicated by arrow 70) the subscriber database server 8 to load a binary representation of the name grammars 42 and the subscriber telephone number(s) and location list 44 into the speech recognition server 7 (data transfer as indicated by arrow 72). After receiving the party information that the caller is attempting to reach, the CPU compares that information to the binary representation of the subscribers in order to determine the proper subscriber to reach.

The CPU establishes a connection from the call-processing server 6 to the automatic speech recognition server 7 through the echo cancellation module 38 (as indicated by arrow 60). The speech recognition server 7 attempts to locate the best match between the phonemes in the caller's utterance and the subscriber's name phonemes found in the subscriber name database 40 of the speech recognition server. The unique subscriber account number, the name of the subscriber, and the probability of the name located by the speech recognition server 7 to be the correct one are returned to the CPU within about one second of when the caller stated the name of the subscriber. The speech recognition server 7 recognizes the subscriber's name based on a comparison between the name recognized in the speech recognition server 7 and subscriber name grammars 42 previously loaded into the speech recognition server 7. The results are transferred to the call processing server 6 (as indicated by arrow 62) and back to the caller (as indicated by arrow 52). The CPU confirms the subscriber's name back to the caller through either speech synthesis or playback of a prerecorded audio file. The CPU will confirm the subscriber name that most nearly matches the name stated by the caller. Once the speech recognition server 7 locates a match to the caller's input, the CPU will transfer the call to the subscriber based on the data located in the name grammar 42 of the subscriber database server 8. The CPU transfers the call to the location identified by the subscriber as his current location. If the CPU is not able to locate a match to the input entered by the caller, the CPU may return multiple names to the caller. The caller chooses a name from the multiple names listed or can indicate that none of the names returned by the CPU are the desired person to be called. Alternatively, the CPU may return the single name that is most similar to that uttered by the caller. 
Further in the alternative, the CPU may ask the caller to spell (either using the touch-tone phone pad or by speaking) the name of the party the caller is attempting to reach. The call transfer method will be determined by the CPU based on the type of the subscriber location, e.g. main extension, home number, cellular phone number, e-mail address, IP address of the H.323 VoIP module, etc. In any case, the subscriber will have an option to use touch-tone commands or to request an operator transfer.
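The alternatives described above (confirm the best match, offer several names, or ask the caller to spell the name) can be sketched as a small decision routine. This is a hedged illustration only: the thresholds `accept` and `offer`, the limit `max_offered` and the action labels are assumptions, not values from this disclosure. The spelling helper assumes the standard letter layout of a touch-tone keypad and treats the entered digits as a prefix of the name.

```python
# Standard touch-tone keypad letter layout.
KEYPAD = {"2": "abc", "3": "def", "4": "ghi", "5": "jkl",
          "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz"}
LETTER_TO_DIGIT = {ch: d for d, letters in KEYPAD.items() for ch in letters}


def next_action(candidates, accept=0.85, offer=0.50, max_offered=3):
    """Choose the next step from (name, probability) pairs.

    The thresholds are hypothetical; a deployment would tune them.
    """
    ranked = sorted(candidates, key=lambda c: c[1], reverse=True)
    if ranked and ranked[0][1] >= accept:
        # Confident single match: confirm it and transfer the call.
        return ("confirm_and_transfer", [ranked[0][0]])
    plausible = [name for name, p in ranked if p >= offer][:max_offered]
    if plausible:
        # No confident match: read several names back to the caller.
        return ("offer_choices", plausible)
    # Nothing plausible: ask the caller to spell the name.
    return ("ask_to_spell", [])


def to_digits(name):
    """Convert a name to the keypad digits a caller would press."""
    return "".join(LETTER_TO_DIGIT[ch] for ch in name.lower()
                   if ch in LETTER_TO_DIGIT)


def spelled_matches(digits, names):
    """Return names whose keypad spelling starts with the entered digits."""
    return [n for n in names if to_digits(n).startswith(digits)]
```

Keeping the decision separate from the recognizer means the same routine can serve spoken and touch-tone input alike.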

The speech recognition based telephone system of the present invention presumes that the caller knows the party or person being called, but may not know the telephone number of the entity being contacted. In most instances, the telephone system performs a certain amount of navigation in order to locate the party being contacted.

The listing in which the party being called may be found can be obtained as either a dynamic listing or a static listing of subscriber names. A dynamic listing is one in which the subscribers are not fixed or may fluctuate, so that a new search should be performed substantially each time a party is contacted. This type of navigation can be performed using information from the Internet and is similar to a search performed on an Internet search engine. A static listing is one in which the information will not change over a certain period of time. For example, a simple static listing could provide the names of the fifty states of the United States. It is conceivable in accordance with the present invention that one or more dynamic searches may be combined with one or more static searches to locate the party being called. For example, suppose a caller desires to locate a doctor in Chicago, Illinois, named Dr. Ben Smith. The speech recognition system may first ask the caller for the country in which the caller wants to locate a party. The system may then ask the caller for the state and the city. These three searches (to determine the appropriate country, state and city) can be described as static searches. Then, a dynamic search can be performed to find Dr. Ben Smith using an Internet directory-type web page or a prearranged custom directory of doctors in Chicago.
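The combination of static and dynamic searches in the Dr. Ben Smith example can be illustrated as follows. This is a minimal sketch under assumptions: `STATIC_TREE` and `DIRECTORY` are hypothetical in-memory data with fictional numbers, whereas in practice the dynamic step would query an Internet directory or a prearranged custom directory that may change between calls.

```python
# Static listings (assumption): fixed country -> state -> cities.
STATIC_TREE = {
    "united states": {
        "illinois": {"chicago"},
    },
}

# Hypothetical directory for the dynamic search; in practice it would be
# queried afresh on each call because its contents may fluctuate.
DIRECTORY = [
    {"name": "Dr. Ben Smith", "city": "chicago",
     "profession": "doctor", "number": "312-555-0142"},
    {"name": "Dr. Ann Jones", "city": "chicago",
     "profession": "doctor", "number": "312-555-0167"},
    {"name": "Ben Smith", "city": "chicago",
     "profession": "lawyer", "number": "312-555-0189"},
]


def static_path_valid(country, state, city):
    """Three static searches: country, then state, then city."""
    states = STATIC_TREE.get(country.lower(), {})
    return city.lower() in states.get(state.lower(), set())


def locate(country, state, city, profession, name, directory):
    """Static navigation followed by a dynamic directory search."""
    if not static_path_valid(country, state, city):
        return []
    wanted = name.lower()
    return [entry["number"] for entry in directory
            if entry["city"] == city.lower()
            and entry["profession"] == profession
            and wanted in entry["name"].lower()]
```

The static steps narrow the search space cheaply, so the dynamic search only has to look within one city and one profession.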

A preferred speech recognition engine is developed by Nuance Communications of 1380 Willow Road, Menlo Park, California 94025 (www.nuance.com). A preferred line interface card is a Dialogic D/480 SC-2T1 card with 48 voice resources. A currently preferred database is a Sybase Adaptive Server 11.9. A preferred call-processing server is based on the Intel Pentium II 450 MHz chip. A preferred speech recognition server is based on the Intel Dual Pentium II 450 MHz chip running a Natural Speech Recognition Engine from Nuance Communications. The engine delivers 28.5 recognition units as defined in the vendor specification. A preferred echo cancellation module is DSP-based with embedded software. A preferred DSP is Antares manufactured by Dialogic Corporation. The embedded echo cancellation software is developed by Nuance Communications.

The subscribers are placed in specific listings based on dissimilarities in their vocalized names unless they are related by association, profession or location. For example, subscribers who have a common name, such as Bob Smith, are placed in different listings. A caller who wishes to avoid going through the voice recognition procedure is able to reach a subscriber through a personal identification number (PIN) given to each subscriber.

INDUSTRIAL APPLICABILITY

In order to enter new subscribers onto the subscriber name database, an operator enters the new subscriber name in the front-end application. The back-end application finds the listing number that contains no or fewest similar sounding names out of thousands of listings. The new subscriber names are vocalized, to create phonetic pronunciation, and added to the subscriber name database. A new binary representation of the list of subscriber names can be created. Finally, various World Wide Web-based self-provisioning front-end applications can be developed for subscribers to enter their phone numbers, e-mail addresses, etc. into the system database.
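The back-end step of finding the listing with no or the fewest similar sounding names can be sketched as below. This is an illustration under stated assumptions: the disclosure compares vocalized, phonetic pronunciations, whereas this sketch substitutes a plain string similarity ratio from Python's standard `difflib` module. The `threshold` value and the tie-break on listing size are also assumptions.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Stand-in for phonetic similarity (assumption): string ratio in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def pick_listing(new_name, listings, threshold=0.8):
    """Return the listing number with the fewest similar-sounding names.

    listings maps a listing number to the names already placed in it.
    Ties are broken in favor of the smaller listing (assumption).
    """
    def clashes(names):
        return sum(1 for n in names if similarity(new_name, n) >= threshold)

    return min(listings,
               key=lambda num: (clashes(listings[num]), len(listings[num])))


listings = {
    "10": ["Bob Smith", "Alice Moreno"],
    "12": ["Victor Chen"],
    "13": ["Rob Smith", "Bob Smyth"],
}
chosen = pick_listing("Bob Smith", listings)
listings[chosen].append("Bob Smith")  # the new subscriber joins listing 12
```

Here listings 10 and 13 already hold names close to "Bob Smith", so the new subscriber is placed in listing 12, which keeps similar names on separate numbers.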

The subscribers in a particular listing can be selected on the basis of dissimilarity of their vocalized names. In the alternative, subscribers having the same profession or association can be placed in the same telephone number. For example, one telephone number can be assigned to all lawyers practicing in a particular area code. A caller attempting to contact a particular lawyer would dial, for example, LAW-YERS, and, when prompted, identify the particular lawyer to be reached. For lawyers who have the same or similar sounding names, the system would provide additional identification. The voice recognition system of the present application is expected to be approximately 98 percent accurate. As the technologies discussed within the present application develop, this percent accuracy should increase. If the voice recognition system is unable to match the name given by the caller with a name in the subscriber name database or is unable to match the name with confidence, the call can be transferred to the operator who can inquire about additional data and make the connection. The pronunciation used by the caller is then stored in the database to provide for automatic transfers in the future. The voice recognition engine used by the system described should support continuous speaker-independent speech and be phoneme-based (and not discrete, speaker-dependent, pattern-based). In that case, even people with heavy accents in any supported language will successfully be able to use the voice recognition system. The system is capable of running multi-lingual recognition servers. For example, all Spanish-speaking doctors in a certain area can have a single number that people who prefer or can only speak in that language will use.
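The operator fallback, in which the caller's pronunciation is stored to allow automatic transfers in the future, might look like the following sketch. All names here are hypothetical: `PronunciationStore`, `route`, the `min_confidence` threshold and the `ask_operator` callback (which stands in for the human operator) are assumptions, and pronunciations are represented as plain strings rather than phoneme sequences.

```python
class PronunciationStore:
    """Hypothetical store of known pronunciations per subscriber account."""

    def __init__(self):
        self._by_account = {}

    def add(self, account, pronunciation):
        self._by_account.setdefault(account, set()).add(pronunciation.lower())

    def lookup(self, pronunciation):
        p = pronunciation.lower()
        return [a for a, known in self._by_account.items() if p in known]


def route(store, heard, confidence, ask_operator, min_confidence=0.8):
    """Transfer automatically when confident; otherwise use the operator.

    ask_operator is a callback that returns the account the operator chose.
    """
    accounts = store.lookup(heard)
    if len(accounts) == 1 and confidence >= min_confidence:
        return accounts[0], "automatic"
    account = ask_operator(heard)
    # Remember how this caller said the name for future calls.
    store.add(account, heard)
    return account, "operator"


store = PronunciationStore()
store.add("A7", "nguyen")
first = route(store, "win", 0.9, ask_operator=lambda heard: "A7")
second = route(store, "win", 0.9, ask_operator=lambda heard: "A7")
```

In this run the first call goes through the operator, and the second call with the same pronunciation is transferred automatically.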

The preferred "Follow Me" service of the present invention is based on the concept that, in some quantity, a voice recognition service subscription would be free to the subscriber. The "Follow Me" service allows a caller to contact a particular number and reach a subscriber in one of multiple locations, depending on where the subscriber is located (e.g., home, office, cellular phone) and where the subscriber has instructed the voice recognition service to locate that particular subscriber. Advertisers provide the entire funding for this free service. All callers will hear at least one advertisement before the caller is able to leave a voice-mail message or get transferred to their party. Further, the voice recognition service includes unified messaging and is offered to on-line providers that market the voice recognition system to their subscriber base.

The voice recognition service works as follows: a toll-free number is provided to an on-line provider for their subscribers. Callers to the subscribers' toll-free numbers are given the choice to leave a voice-mail message for the subscriber or to transfer the call in an attempt to locate the subscriber. If the caller leaves a voice-mail message, the message file is automatically sent via e-mail to the subscriber's mailbox at their on-line site. Subscribers can only access their mailbox or configure the "Follow Me" service by accessing their messaging web page at the on-line site. When picking up a voice-mail, the subscriber gets a text ad on the e-mail cover that includes hot-links to the advertiser's site. They also hear the ad as a preface to the message. Subscribers are unable to access this mailbox by telephone. If the caller chooses the "Follow Me" option and attempts to locate the subscriber, the call is transferred to the destination the subscriber has designated. When a subscriber is receiving a "Follow Me" call, he hears the ad only once. If the subscriber is not available, the caller is able to leave a voice-mail message. Each enrolled subscriber is allocated a predetermined number of free voice-mail messages of certain duration per day.

If a subscriber chooses to register for the "Follow Me" service, that subscriber will not pay for the predetermined number of minutes of a completed call but will be billed for any minutes that extend past that predetermined time. The extra minutes, beyond the allocated free time, will be charged to the subscriber's credit card number. Only subscribers who have enrolled and given their card number will have the "Follow Me" option. The level of expected subscriber participation for the on-line services will exceed the number of dedicated toll free numbers that could be allocated for this purpose. The solution is to allocate some number of thousand subscribers per toll free number so the caller would identify the subscriber by speaking their name. Voice recognition would be employed for this application; duplicate or similar sounding names would be provisioned on separate toll free numbers.
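The arithmetic behind the free allowance and the sharing of toll-free numbers can be made concrete with a short sketch. The figures used (a 10-minute allowance, a rate of 7 cents per minute, 250,000 subscribers and 3,000 subscribers per number) are purely illustrative assumptions and do not come from this disclosure.

```python
import math


def billable_minutes(call_minutes, free_minutes):
    """Minutes beyond the predetermined free allowance."""
    return max(0, call_minutes - free_minutes)


def charge_cents(call_minutes, free_minutes, rate_cents_per_minute):
    """Amount charged to the subscriber's card, in whole cents."""
    return billable_minutes(call_minutes, free_minutes) * rate_cents_per_minute


def toll_free_numbers_needed(subscribers, subscribers_per_number):
    """Numbers required when several thousand subscribers share each one."""
    return math.ceil(subscribers / subscribers_per_number)


# Illustrative figures only (assumptions).
extra = billable_minutes(12, 10)
cost = charge_cents(12, 10, 7)
numbers = toll_free_numbers_needed(250_000, 3_000)
```

With these assumed figures, a 12-minute call is billed for 2 minutes, and 250,000 subscribers fit on 84 shared toll-free numbers instead of requiring one number each.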

Certain on-line companies have developed and deployed their own on-line address books and calendars. An interface is provided for these on-line companies so that their subscribers could, for example, speak a name from their portal address book and have it dialed. Instant messaging is an important new feature that determines if someone on a given contact list is presently on-line and can send a short message to the on-line user.

The description of the preferred embodiment is set forth for illustrative purposes and is not intended to limit the present invention in any manner. Equivalent approaches are intended to be included within the scope of the present invention. While the present invention has been described with reference to the particular embodiments illustrated, those skilled in the art will recognize that many changes and variations may be made thereto without departing from the spirit and scope of the present invention. The embodiments and obvious variations thereof are contemplated as falling within the scope and spirit of the claimed invention, which is set forth in the following claims:

Claims

What we claim is:

1. A telephone network comprised of: redundant telephone call processing computers, subscriber database computers and voice recognition computers, said telephone call processing computers, said database computers and said voice recognition computers are connected with a telephone company to receive multiple simultaneous telephone calls placed to a common phone number.

2. The telephone network of Claim 1, wherein said telephone call processing computers are telephone call processing servers.

3. The telephone network of Claim 2, wherein said telephone call processing servers further comprises a line interface card.

4. The telephone network of Claim 3 wherein said telephone call processing servers further comprises a voice resource module.

5. The telephone network of Claim 3 wherein said telephone call processing servers further comprises a touch tone detection module.

6. The telephone network of Claim 3, wherein said telephone call processing servers further comprises a speech detection and echo cancellation module.

7. The telephone network of Claim 1, wherein said subscriber database computers are subscriber database servers.

8. The telephone network of Claim 1, wherein said telephone call processing computers are telephone call processing servers.

9. A telephone network comprised of: a plurality of listings, each of said listings are provided with a redundant telephone call processing server, a subscriber database server and a voice recognition server; and said telephone call processing server, said subscriber database server and said voice recognition server are each connected with a telephone company to receive multiple simultaneous telephone calls placed to a common phone number.

10. The telephone network of Claim 9, wherein said telephone call processing servers further comprises a line interface card.

11. The telephone network of Claim 9, wherein said telephone call processing servers further comprises a voice resource module.

12. The telephone network of Claim 9, wherein said telephone call processing servers further comprises a touch tone detection module.

13. The telephone network of Claim 9, wherein said telephone call processing servers further comprises a speech detection and echo cancellation module.

14. The telephone network of Claim 9, wherein said subscriber database further comprises at least one name grammar and a subscriber listing.

15. A method of receiving telephone calls through a speech-recognition-based phone numbering network system, said method comprising: a call processing server receiving a telephone call to said network system; detecting input from the caller stating the party attempting to be reached; forwarding said call to a speech recognition server and loading a binary representation of a subscriber list into said speech recognition server; said speech recognition server comparing said input to said subscriber list; transferring the results of said comparison to said caller; and connecting said caller to said party.