WO1999049639A1 - Method for establishing telephone calls - Google Patents

Publication number
WO1999049639A1
Authority
WO
WIPO (PCT)
Prior art keywords
recognition
call
speech recognition
telephone
tespar
Prior art date
Application number
PCT/GB1999/000910
Other languages
French (fr)
Inventor
Reginald Alfred King
Original Assignee
Domain Dynamics Limited
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Domain Dynamics Limited filed Critical Domain Dynamics Limited
Priority to AU30442/99A priority Critical patent/AU756212B2/en
Priority to EP99911930A priority patent/EP1070415A1/en
Priority to JP2000538487A priority patent/JP2002508629A/en
Publication of WO1999049639A1 publication Critical patent/WO1999049639A1/en

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M1/00 Substation equipment, e.g. for use by subscribers
    • H04M1/26 Devices for calling a subscriber
    • H04M1/27 Devices whereby a plurality of signals may be stored simultaneously
    • H04M1/271 Devices whereby a plurality of signals may be stored simultaneously controlled by voice recognition
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M1/00 Substation equipment, e.g. for use by subscribers
    • H04M1/60 Substation equipment, e.g. for use by subscribers including speech amplifiers
    • H04M1/6033 Substation equipment, e.g. for use by subscribers including speech amplifiers for providing handsfree use or a loudspeaker mode in telephone sets
    • H04M1/6041 Portable telephones adapted for handsfree use
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M1/00 Substation equipment, e.g. for use by subscribers
    • H04M1/72 Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
    • H04M1/724 User interfaces specially adapted for cordless or mobile telephones

Definitions

  • the present invention relates to establishing telephone calls using speech recognition.
  • Hands-free communication systems are known in which reliance is made upon speech recognition in order to establish a communications channel.
  • An important example of such systems relates to their application in mobile telephones and car phones when deployed with hands-free modules.
  • Known systems provide speech recognition capabilities that enable a call originating user to speak the name of a required remote telephone user or a desired destination into a microphone and to have the recognition system translate these acoustic voice inputs into telephone numbers for automatically dialling the required destination.
  • Known systems operate by activating a word recognition module whereafter the user speaks individual digits or the name of the person or the destination required.
  • the user is invited to confirm to the word recognition module that the destination selected by the device is correct and then indicates to the system that a call is to be made.
  • the call is then established so as to allow the user to communicate, by voice or by other means, with the called subscriber.
  • the driver may give full attention to the driving conditions (or other operational conditions when deployed in alternative environments) without having to divert their attention to look down at a fixed mobile telephone display and thereafter manually activate physical buttons mounted to the telephone.
  • This requirement has also been identified by the title "Safe Behind The Wheel Communication".
  • telephony apparatus including speech recognition means configured to identify a destination to be called in response to a vocalisation; calling means configured to call an assisting person if the speech recognition means fails to correctly identify said vocalised destination; and means for receiving telephone number data transmitted back in response to said call to said assisting person, wherein said calling means establishes a new call in response to said transmitted data.
  • the speech recognition means is configured to offer an alternative recognised destination before calling said assisting person.
  • the recognition means may be configured to offer a number of alternatives before calling an assisting person and in a preferred embodiment, incoming speech vocalisations are recognised by a process of Time Encoded Signal Processing And Recognition (TESPAR).
  • TESPAR Time Encoded Signal Processing And Recognition
  • the invention relates to mobile and/or hands-free communication systems and especially to systems relying upon voice control to enable interconnection or reconfiguration of elements which may include all elements of communication systems.
  • One important example of such systems is their utilisation in telephones and, in particular, in mobile phones when deployed with so-called "hands-free modules" in cars and other mobile vehicles.
  • the in-car telephone environment is used as an exemplar to illustrate the key features of the current disclosures but it can be appreciated that the invention has broader application in environments where telephone calls are established.
  • Voice activated so called hands-free car kits are now becoming available for the cellular phone market and are well known to those skilled in the art. These involve a voice (word) recognition capability to enable the user to speak the name of a required remote telephone user or a wanted destination into a microphone and to have the recognition hardware and software translate these acoustic voice inputs into telephone numbers for automatic dialling to the distant subscriber. By such means, the user whilst driving:
  • C. confirms to the word recognition module that the destination selected by the device is the correct one
  • D. indicates that the subscriber chosen is to be dialled
  • E. communicates by voice or other means with a distant subscriber.
  • a driver may give full attention to the road, the traffic and driving conditions, without having to divert attention to look down at a fixed mobile telephone display and manually activate a button or buttons on the mobile telephone mounted in the car, thus seriously diverting attention from the complexities of the driving task.
  • One description of the requirement for this capability is "safe behind the wheel communication".
  • a driver of a car fitted with a hands-free car telephone kit operates an activate voice switch A1 , which may be housed on the steering column or the steering wheel or, for example, as a foot operated switch similar to that used in some vehicles for dipping headlights.
  • a word recognition module A2 seeks confirmation that the car driver wishes to transmit.
  • the driver responds with a "yes" or a "no" which is recognised by the word recognition module A2, which then acts appropriately.
  • the recognition module prepares for the next acoustic input which will be, for example, the name or location of the person that the driver of the vehicle wishes to contact.
  • the word recognition module attempts to recognise the name or location spoken to it by the driver from the full portfolio of words previously stored in the module. It converts the word selected into an appropriate telephone number via a keypad code module A3.
  • the voice operated system may provide a synthetic speech output such as "do you wish to transmit to person X?", seeking confirmation from the driver that person X thus selected is the person that the driver now wishes to contact.
  • the driver will respond with a "yes" or "no" acoustic response. If "no" is the response, the system has made a mistake and the driver may be prompted to re-input the name, that is to say, to try again, or, alternatively, the system may automatically provide the second choice from the set of scores previously calculated in the word recognition module.
  • a range of similar alternative man-machine protocols are available to enable this interactive process.
  • these products provide substantially in-vehicle hands-free operation, enabling drivers to concentrate on driving while setting up and making a call, thus facilitating communication and significantly improving road safety.
  • the present invention is disclosed to enable voice operated mobile communications to cope safely and effectively in extreme and variable ambient noise environments and to guarantee flexible hands-free voice operated communications which fully map the capability of human beings.
  • the performance, that is, the recognition rate, of any word recognition algorithm is inversely proportional to the number of words that are presented at any one time to the algorithm for recognition.
  • Current conventional systems may offer from, say, eight to sixty-four names, often splitting these up into sub-groups to present the algorithm at any one time with a minimum number of alternative names, thereby improving performance.
  • the performance of all such systems degrades significantly and progressively as the level of acoustic background noise increases, both in terms of its magnitude and variability.
  • telephony apparatus configured to establish a telephone call using speech recognition, including means for receiving acoustic vocalisations wherein the name of a destination is repeated; and processing means are configured to analyse said repeated vocalisations so as to improve recognition properties.
  • the speech is recognised by a process of Time Encoded Signal Processing And Recognition and the processing means may be configured to offer a predetermined number of alternatives if an incorrect recognition of a vocalisation is made.
  • a plurality of TESPAR archetypes are stored for specific users.
  • the recognition equipment is mounted within a motor vehicle and telephone communications are made by mobile cellular networks.
  • according to a third aspect of the present invention, there is provided a method of establishing a telephone call using speech recognition wherein, after a speech recognition system has failed to correctly identify a destination to call, a call is made to an assisting person; telephone number data is transmitted back to said user; and a new telephone call is established in response to transmitted data.
  • Figure A shows a conventional mobile cellular telephone with hands-free voice recognition facilities
  • Figure 1 shows a hands-free mobile telephone system mounted within a motor vehicle
  • Figure 2 illustrates vehicles operating within a cellular telephone environment
  • Figure 4 summarises the operation of the environment identified in Figure 2;
  • Figure 5 details the telephone system identified in Figure 1.
  • Hands-free telephony systems may be employed in many situations where an operative cannot physically operate a telephone in the usual way or in doing so the operative may be distracted leading to a potentially dangerous situation.
  • the hands-free environment described herein relates to the deployment of a mobile cellular telephone within a motor vehicle. However, it should be appreciated that this particular application presents an
  • a car interior having mobile telephony equipment is shown in Figure 1.
  • the mobile telephone may be permanently mounted within the vehicle or, alternatively, it may include a portable mobile telephone interfaced to an in-car system, commonly referred to as a "car kit".
  • a mobile telephone 101 is supported within a cradle 102.
  • a cradle 102 is connected to an in-car unit 103 by means of an umbilical connection 104.
  • the in-car unit 103 receives power from the car's internal battery via a power connection 105.
  • an aerial connection 106 is connected to an external aerial for the transmission and reception of radio signals. Audio input signals are supplied to audio loudspeakers 107 via audio leads 108 and internal vocalisations are received from a microphone 109 via an audio input lead 110.
  • the in-car unit 103 also includes speech recognition facilities, thereby allowing a driver to establish a telephone call with minimal physical interaction. In preference to removing the telephone from its cradle or activating telephone buttons while it resides in its cradle, a driver is merely required to activate a voice recognition switch 111, thereby placing the in-car unit 103 into a condition which facilitates the establishment of a call by voice recognition procedures.
  • the system is provided with a talk-back system embodying the present invention such that should voice recognition procedures fail to establish a call, a call is automatically made to a secretary or appropriate populated bureau or service centre, whereafter the required telephone number details may be returned in machine-readable form to the in-car unit 103, thereby facilitating the establishment of a further call, again without any manual intervention on the part of the driver.
  • the vehicle identified in Figure 1 is also shown in Figure 2 at 201.
  • the vehicle operates within a cellular telephone environment and other vehicles
  • the GSM cellular network 204 is connected to a local public switched telephone network 205 via conventional interface channels 206, from which it is possible to establish conventional telephone connections to an office environment, shown generally at 207 and to a home environment, shown generally at 208. It is possible that these connections may be established using speech recognition systems, where the telephone numbers for "the office" and "home" are stored against appropriate encoded speech templates. Thus, a speech utterance is compared against a selection of templates and a best classification is made by the speech recognition equipment, which is then presented acoustically to the driver, thus enabling the driver to confirm or otherwise whether a telephone call is to be established.
  • the system is provided with a "talk-back" system such that if the speech recognition fails to identify the number required after a number of attempts, the system is automatically programmed to make a call to a service centre 209.
  • the system may be programmed to make a specified call, possibly to a secretary or other assistant.
  • the service centre provides talk-back facilities for a significant number of users on a subscription basis.
  • a call is made to the service centre and an audio call is established with a service centre operative.
  • the driver identifies the destination to which efforts are being made to establish a call and the operative at the service centre takes measures to identify the appropriate number. This may involve listing numbers from customer specific databases etc. Having determined the number required, the information is relayed back to the calling customer in machine-readable form, such that the customer is then in a position to
  • TESPAR Time Encoded Signal Processing And Recognition
  • the fundamental operating aspects of TESPAR are disclosed in United States patents 4,382,160, 5,091,949, 5,101,433 and 5,101,434, along with European patent publications 0 166 607 and 0 256 081.
  • TESPAR has the unique capability of coding time varying signals into common fixed sized matrices, thereby facilitating its application in voice recognition systems.
  • TESPAR archetypes for individual words when correlated against individual or multiple versions of the same word may
  • word recognition modules may incorporate the so-called nth choice option by means of a dialog between the human user and the machine, using synthetic speech prompts.
  • if the recognition vocabulary were to consist of the digits zero to nine and the driver spoke the word "five" to it, the recognition procedure may incorrectly recognise the word "nine" and prompt the user with the words "did you say nine?". The user would respond with the word "no", resulting in the machine being in a position to offer a second choice out of the list of comparisons previously made by it. If the second choice were to be the word "five", which is likely, the machine would then prompt with the phrase "did you say five?" and the driver would then respond with the word "yes". At this point, the system would act accordingly.
  • with this nth choice mechanism, provided that the yes and no recognition capability has a very high integrity, a system with this facility can guarantee that the driver can always select the correct word.
  • this procedure is enhanced by the fact that the second choice selected by the machine is more likely to be the correct word chosen than the third or higher and that the third choice is likely to be more probable than the fourth or higher etc. It has been discovered that this capability is an inherent property of TESPAR-based word recognition systems, in contrast with systems deploying spectral templates or hidden Markov models. Thus, this nth choice facility is one which may be deployed
  • nth choice activity may be a very powerful indication of difficulty in voice communication when operating in high or variable acoustic background noise. It has been discovered that this nth choice activity may be measured to provide powerful additional alternative communication options which may enable effective communication to take place irrespective of the acoustic environment experienced by the driver.
  • a typical portable phone architecture (based on the GSM system) is shown in Figure 3. This includes microphone 109, loudspeaker 107, voice and base-band coder 301, GSM processor 302, interface to the keypad 303, mobile phone display 304, a random access memory module 305 and a read only module 306. In addition, there is provided a radio module 307 that enables both the transmission and reception of radio signals.
  • SIM Subscriber Identity Module
  • a word recognition/keypad code module 311 is provided, embodying the previously described characteristics. This is inserted between a hands-free module 312 and a voice/base-band coder 301. Appropriate data is rerouted through the voice/base-band coder 301 to the GSM processor 302 for emulating activity of a keypad 303 and the display 304 options.
  • a Dual Tone Multi-Frequency (DTMF)/keypad code module 313 is placed in parallel with loudspeaker 107 thereby receiving an output from the voice/base-band coder 301.
  • an output from the DTMF/keypad code module 313 is connected to GSM processor 302.
  • DTMF Dual Tone Multi-Frequency
  • a situation may be assumed in which a driver wishes to communicate with one of N organisations or individuals stored in the word recognition/keypad module 311. If the word recognition system is a speaker independent one, pre-stored templates or TESPAR archetypes will be provided within the module. If the system is a speaker dependent word recognition process, the user will previously have provided and trained the system on a number of examples.
  • On operating the active voice switch 111, the system will produce a synthetic speech prompt asking "do you wish to make a call?".
  • the driver will respond with the word "yes" if he wishes to make a call or with the word "no" if the active voice switch has been inadvertently operated. If the driver's response is "yes" the recognition module 311 will respond with an acoustic prompt "please indicate a caller". The driver will then speak out one of the designation addresses from the list stored in his word recognition subdirectory such as "home", "John Smith" etc.
  • the word recognition module 311 will compare the driver's acoustic input with the archetypes or templates of each of the words in the personalised telephone sub-directory, select the highest scoring entry and respond with a synthetic speech prompt associated with the highest score.
  • a response may be produced along the lines "do you want John Smith". If the response is "yes" John Smith's appropriate keypad code, that is to say, the relevant telephone number, will be passed, via the voice/baseband coder 301 to the GSM processor 302 and the GSM processor will then output the correct telephone code number for John Smith over the signalling channel of the radio communication link.
  • nth choice should provide significant enhancements which, with TESPAR-based word recognition systems, should result in a one hundred percent success rate irrespective of acoustic conditions.
  • if the value of N is large, a large number of interactions or acoustic transactions may need to take place before the correct subscriber is eventually chosen. For example, pathologically, with a sixty-four word subdirectory, sixty-three system interactions may be required in order to achieve success. This is unsatisfactory, even when background noise is very intrusive.
  • VHN mode may be brought into operation, also referred to as "talk-back" where a DTMF code module 313 or other keypad code generator, is brought into play.
  • VHN facility may be introduced manually or automatically, the latter being via the nth choice procedure, in the following manner. If the correct word is not chosen first, nor is it chosen a second time, that is to say when n is greater than, for example, two, the word recognition keypad module 311 is activated to prompt with the phrase "do you want talk-back?" if the response
  • the DTMF keypad code module 313 "dials" via the GSM processor 302, one of the list of talk-back numbers, the details of which are known to the driver. These may be, for example, the driver's office, the driver's home or a bureau designated specifically to provide the talk-back service, as identified in Figure 2. Having dialled this number, the driver is then, by the means previously described, interconnected directly to his office or directly to his secretary or to his home or to a talk-back bureau with whom he has listed the N individuals that form part of his word recognition subdirectory or, if required, to other subscribers not on the list.
  • a typical interaction may be illustrated with reference to Figure 4, where the call from the vehicle is routed via a radio base station 401, through a telephone exchange 402 and to an appropriate telephone 403.
  • Upon receiving an incoming call, the handset 404 of the telephone is picked up, thereby placing the telephone 403 off-hook and, hence, in a receptive mode.
  • the driver's office or the talk-back bureau are then in direct voice communication and, irrespective of the background acoustic conditions, this should enable the secretary or bureau to ascertain which number the driver wishes to access.
  • the secretary or bureau operative confirms that the number for John Smith is required, the secretary will be aware that John Smith's telephone number is on, for example, quick dial three. With the handset off hook, the secretary will depress the quick dial three button, which will then generate a series of DTMF tones associated with John Smith's telephone number. These tones will then be automatically transmitted from the office phone to be received in the driver's vehicle telephone equipment via the loudspeaker channel, as shown in Figure 3. The tones will be heard by the driver and appropriately interpreted in parallel by the DTMF keypad code module 313. At the DTMF keypad code module 313, the numerical values are stored and translated into appropriate dial codes for the GSM processor 302 in preparation for subsequent transmission.
  • the facilities may be provided for any word recognition procedure to enhance its effectiveness, irrespective of its basic performance in acoustic background noise;
  • TESPAR-based word recognition procedures should be used in preference to existing known systems, since these are more likely to reduce the number of occasions when the talk-back facility is required, as the nth choice routines are particularly efficient when using TESPAR processes. It is also relevant that, in relation to this disclosure, TESPAR procedures may use words twice and similar protocols productively to overcome acoustic background noise.
  • the talk-back facility may provide the driver with any number very effectively.
  • the balance between word recognition capability, telephone sub-directory size, cost and complexity may be optimised for particular users and individual applications, to expand and improve all voice input facilities and capabilities, to provide one hundred percent system integrity in the most difficult environments and in a manner which leaves both sheep and goat drivers free to drive in a safe and effective fashion.
  • this capability is easily embodied and does not involve changes to standard mobile telephone equipment using the office or the bureau or the home to provide the facility disclosed to the driver on the highway.
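
The talk-back step described above, in which DTMF tones received over the audio channel are turned back into digits by the DTMF keypad code module 313, can be illustrated with a minimal sketch. The patent does not say how module 313 detects tones; the Goertzel recurrence used here is a common technique swapped in for illustration. The function names, the 8 kHz sample rate and the 50 ms tone length are assumptions, while the row and column frequencies are the standard DTMF values.

```python
import math

ROWS = (697, 770, 852, 941)        # Hz, standard DTMF low-frequency group
COLS = (1209, 1336, 1477, 1633)    # Hz, standard DTMF high-frequency group
KEYS = ("123A", "456B", "789C", "*0#D")


def goertzel_power(samples, freq, rate):
    """Estimate the signal power near `freq` with the Goertzel recurrence."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq / rate)
    s1 = s2 = 0.0
    for x in samples:
        s0 = x + coeff * s1 - s2
        s2, s1 = s1, s0
    return s1 * s1 + s2 * s2 - coeff * s1 * s2


def decode_tone(samples, rate=8000):
    """Pick the strongest row and column frequency and map them to a key."""
    row = max(range(4), key=lambda i: goertzel_power(samples, ROWS[i], rate))
    col = max(range(4), key=lambda i: goertzel_power(samples, COLS[i], rate))
    return KEYS[row][col]


def make_tone(key, rate=8000, duration=0.05):
    """Synthesise a clean tone for one key (demonstration input only)."""
    for r, line in enumerate(KEYS):
        if key in line:
            f_low, f_high = ROWS[r], COLS[line.index(key)]
            break
    else:
        raise ValueError(f"not a DTMF key: {key!r}")
    n = int(rate * duration)
    return [
        math.sin(2 * math.pi * f_low * t / rate)
        + math.sin(2 * math.pi * f_high * t / rate)
        for t in range(n)
    ]


# Decode a hypothetical quick-dial sequence, one tone burst per digit.
digits = "".join(decode_tone(make_tone(k)) for k in "0123")
```

A real receiver would also need to separate tone bursts from silence and from speech, which this sketch leaves out.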

Abstract

A voice recognition system (311) is provided to assist in the establishment of a telephone call. Preferably, TESPAR procedures are used such that if a destination is incorrectly recognised a first time, there is a high probability that the second best attempt will be successful. In addition, the application of the TESPAR process facilitates the ability to enhance recognition probability if the destination word is repeated. However, in situations where it is not possible to perform word recognition or where word recognition will take a significant period of time, probably due to the presence of background noise, a destination call is automatically made to an assisting person. Telephone number data is transmitted back to the user and a new telephone call is established in response to the transmitted data.

Description

METHOD FOR ESTABLISHING TELEPHONE CALLS
Field of the invention
The present invention relates to establishing telephone calls using speech recognition.
Background to the Invention
Hands-free communication systems are known in which reliance is made upon speech recognition in order to establish a communications channel. An important example of such systems relates to their application in mobile telephones and car phones when deployed with hands-free modules. Furthermore, in some jurisdictions, there are statutory provisions to the effect that telephones may only be used in vehicles by drivers when the vehicle is in motion if hands-free facilities are employed. Known systems provide speech recognition capabilities that enable a call originating user to speak the name of a required remote telephone user or a desired destination into a microphone and to have the recognition system translate these acoustic voice inputs into telephone numbers for automatically dialling the required destination. Known systems operate by activating a word recognition module whereafter the user speaks individual digits or the name of the person or the destination required. Thereafter, by means of synthetic speech prompts or by other indications, the user is invited to confirm to the word recognition module that the destination selected by the device is correct and then indicates to the system that a call is to be made. The call is then established so as to allow the user to communicate, by voice or by other means, with the called subscriber.
Thus, by this mechanism, the driver may give full attention to the driving conditions (or other operational conditions when deployed in alternative environments) without having to divert their attention to look down at a fixed mobile telephone display and thereafter manually activate physical buttons mounted to the telephone. This requirement has also been identified by the title "Safe Behind The Wheel Communication".
With safety concerns increasing and statutory requirements in place in some countries, the demand for hands-free facilities within motor vehicles (and other environments) is growing. However, a problem exists in that, in many normally occurring situations, the performance of speech recognition systems cannot always be guaranteed and a high failure rate may result.
Summary of the Invention
According to a first aspect of the present invention, there is provided telephony apparatus, including speech recognition means configured to identify a destination to be called in response to a vocalisation; calling means configured to call an assisting person if the speech recognition means fails to correctly identify said vocalised destination; and means for receiving telephone number data transmitted back in response to said call to said assisting person, wherein said calling means establishes a new call in response to said transmitted data. Preferably, the speech recognition means is configured to offer an alternative recognised destination before calling said assisting person. Furthermore, the recognition means may be configured to offer a number of alternatives before calling an assisting person and in a preferred embodiment, incoming speech vocalisations are recognised by a process of Time Encoded Signal Processing And Recognition (TESPAR).
Thus, the invention relates to mobile and/or hands-free communication systems and especially to systems relying upon voice control to enable interconnection or reconfiguration of elements which may include all elements of communication systems. One important example of such systems is their utilisation in telephones and, in particular, in mobile phones when deployed with so called "hands-free modules" in car and other mobile vehicles.
The in-car telephone environment is used as an exemplar to illustrate the key features of the current disclosures but it can be appreciated that the invention has broader application in environments where telephone calls are established.
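
The first aspect set out above (attempt recognition, fall back to a call to an assisting person, receive telephone number data, then establish a new call) can be sketched as follows. This is an illustration under stated assumptions rather than the claimed apparatus: `establish_call`, its callback parameters and the default of two attempts are all hypothetical names and values chosen for the example.

```python
from typing import Callable, Optional


def establish_call(
    recognise: Callable[[], Optional[str]],
    directory: dict[str, str],
    call_assistant: Callable[[], str],
    dial: Callable[[str], None],
    max_attempts: int = 2,  # assumed limit before falling back
) -> str:
    """Dial a recognised destination, or fall back to an assisting person.

    `recognise` returns a confirmed destination name, or None on failure.
    `call_assistant` stands in for the call to the assisting person and
    returns the telephone number data transmitted back.
    """
    for _ in range(max_attempts):
        name = recognise()
        if name is not None and name in directory:
            number = directory[name]
            dial(number)
            return number
    # Recognition failed: obtain the number from the assisting person,
    # then establish a new call with it.
    number = call_assistant()
    dial(number)
    return number


# Hypothetical usage: recognition always fails, so the assistant supplies
# the number and a new call is placed to it.
dialled: list[str] = []
number = establish_call(
    recognise=lambda: None,
    directory={"home": "100"},
    call_assistant=lambda: "200",
    dial=dialled.append,
)
```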
Voice activated so called hands-free car kits are now becoming available for the cellular phone market and are well known to those skilled in the art. These involve a voice (word) recognition capability to enable the user to speak the name of a required remote telephone user or a wanted destination into a microphone and to have the recognition hardware and software translate these acoustic voice inputs into telephone numbers for automatic dialling to the distant subscriber. By such means, the user whilst driving:
A. activates a word recognition module;
B. speaks the name of the person or the destination required;
C. confirms to the word recognition module that the destination selected by the device is the correct one;
D. indicates that the subscriber chosen is to be dialled; and
E. communicates by voice or other means with a distant subscriber.
By this means, a driver may give full attention to the road, the traffic and driving conditions, without having to divert attention to look down at a fixed mobile telephone display and manually activate a button or buttons on the mobile telephone mounted in the car, thus seriously diverting attention from the complexities of the driving task. One description of the requirement for this capability is "safe behind the wheel communication".
The demand for such a facility is growing rapidly and is generating serious concerns for the safety of such mobile phone users when driving vehicles in complex traffic scenarios. These concerns are persuading some state and government authorities to propose the introduction of legislative measures, completely denying the use of mobile communication equipment in vehicles.
Notwithstanding the threat of legislation, the demand for safe and effective mobile vehicular communications is expanding rapidly, world-wide.
The current concepts deployed by existing and proposed conventional voice operated systems may be described in outline with reference to Figure A, as follows. A driver of a car fitted with a hands-free car telephone kit operates an activate voice switch A1, which may be housed on the steering column or the steering wheel or, for example, as a foot operated switch similar to that used in some vehicles for dipping headlights. In response to the switch operation, and using a synthetic speech prompt, such as, for example, "do you wish to transmit?", a word recognition module A2 seeks confirmation that the car driver wishes to transmit. The driver responds with a "yes" or a "no" which is recognised by the word recognition module A2, which then acts appropriately. Thus, if the word "no" is recognised, module A2 remains dormant. If the word "yes" is recognised, the recognition module prepares for the next acoustic input which will be, for example, the name or location of the person that the driver of the vehicle wishes to contact.
The word recognition module attempts to recognise the name or location spoken to it by the driver from the full portfolio of words previously stored in the module. It converts the word selected into an appropriate telephone number via a keypad code module A3. At this stage, the voice operated system may provide a synthetic speech output such as "do you wish to transmit to person X?", seeking confirmation from the driver that person X thus selected is the person that the driver now wishes to contact. The driver will respond with a "yes" or "no" acoustic response. If "no" is the response, the system has made a mistake and the driver may be prompted to re-input the name, that is to say, to try again, or, alternatively, the system may automatically provide the second choice from the set of scores previously calculated in the word recognition module. A range of similar alternative man-machine protocols is available to enable this interactive process.
On the assumption that the word selected by the word recognition module is the correct one and the driver responds with a "yes" indicating correct recognition, the code selected and previously stored in the keypad code module A3 is passed to a telephone dial module A4, resulting in the appropriate number being dialled. The call is thus connected to the distant subscriber's phone and ringing tone is passed back to the driver via a loudspeaker in the car. When the call subscriber picks up the phone, the connection is complete and the call takes place. Variants of these options and protocols are well known and form the basis of a number of different commercial offerings. Thus, for example, DSE Communication Corporation provides "new voice list management features" enabling customers to create and modify name dialling phoning lists using voice commands and the DSP Group Inc of Santa Clara, California have developed hands-free car kits to provide alternatives to the facilities described above.
In general, these products provide substantially in-vehicle hands-free operation, enabling drivers to concentrate on driving while setting up and making a call, thus facilitating communication and significantly improving road safety.
The general capabilities outlined above are well known and it is also known that all such systems suffer from a number of serious limitations.
Such systems do not work very well in mobile vehicles, due to variable levels of ambient noise associated with cars and other vehicles. For static operation and for relatively quiet driving conditions, recognition rates and system effectiveness may be high. However, when driving at high speeds, possibly with a window open, the variable acoustic background effectively prohibits accurate word recognition and creates frustration and confusion for the driver. This leads to many drivers failing to establish a call by voice and then attempting to use some other form of hands-on override to the voice input option while continuing to drive the vehicle with one hand. Alternatively, the driver may be forced arbitrarily to slow down until the noise conditions are acceptable for voice operation, or to stop at a convenient lay-by. A range of complicated and costly signal processing procedures have been developed to reduce the effects of acoustic noise in such vehicles. These include echo cancellation, noise cancellation, noise suppression, the use of distributed microphones and speakers, and on-chip filtering etc. Research in this area continues. All such known approaches involve expensive and complicated algorithms and chip sets. However, no system has yet been devised which is able satisfactorily to cope with the many variabilities associated with direct voice input operation in a real-world vehicular environment.
Further, the set of acoustic templates produced, for example, by the driver during the training mode in benign conditions differs significantly from the data sets derived from voice waveforms produced by that same speaker in high acoustic background noise. This physiological factor is well known. Current voice or word recognition algorithms based around spectral templates, including hidden Markov models, are unable satisfactorily to cope with such circumstances and such noise or to offer well known reinforcement strategies, commonly employed in human communication, such as, for instance, the driver repeating the control words consecutively, for example "John Smith, John Smith", to emphasise to the recogniser through the noise that the person to be spoken to is John Smith. Indeed, such time varying versions of the inputs would cause conventional word recognition systems immediately to fail.
In addition, it is well known that the population of human speakers may be split into two groups, based on their basic ability to operate voice activated devices. In the literature, these groups are described as sheep and goats. Sheep are those individuals who can, without effort, produce consistent acoustic outputs and who perform consistently and well in most direct voice input task scenarios. Goats are those individuals who are unable so to perform, even in benign, stress-free, anechoic ambient conditions. Thus, a significant percentage of the population are unlikely to be able to use conventional hands-free vehicular telephone systems, even when effective anti-noise strategies become available.
These examples emphasise the fact that the development of an effective voice activated hands-free telephone capability for the vehicular mobile cellular phone market is becoming increasingly complex and costly.
As yet, current systems are unable to cope effectively with the many adverse variabilities associated with the real world driving environment and even if and when they do become effective, a sizeable portion of the general public may be unable to use them satisfactorily. Currently, and for many decades to come, it is unlikely that any word recognition algorithm, irrespective of cost, will be able to match, or even to approach, the performance of the human ear and the mind behind it, in recognising a limited vocabulary of spoken words in very high and variable acoustic noise backgrounds. In this arena, human minds have an exceptional and most powerful capability to integrate a wanted signal at the expense of the background noise. In military communications, for example, "words twice" is a routine example of such, wherein, in a very noisy environment, words are repeated twice, three times or more and deployed as part of an interactive dialog to enable the recipient to understand with certainty what the originator has said. The human mind is able to integrate and filter out the wanted information from the unwanted noise. No system currently available, irrespective of cost and complexity, is able to compete with the flexibility and effectiveness of the human - human interface.
The present invention is disclosed to enable voice operated mobile communications to cope safely and effectively in extreme and variable ambient noise environments and to guarantee flexible hands-free voice operated communications which fully map the capability of human beings. It is well known that the performance, that is, the recognition rate of any word recognition algorithm is inversely proportional to the number of words that are to be presented at any one time to the algorithm for recognition. Current conventional systems may offer from, say, eight to sixty-four names, often splitting these up into sub-groups to present the algorithm at any one time with a minimum number of alternative names, thereby improving performance. It is also well understood that the performance of all such systems degrades significantly and progressively as the level of acoustic background noise increases, both in terms of its magnitude and variability.
According to a second aspect of the present invention, there is provided telephony apparatus configured to establish a telephone call using speech recognition, including means for receiving acoustic vocalisations wherein the name of a destination is repeated; and processing means are configured to analyse said repeated vocalisations so as to improve recognition properties. In a preferred embodiment, the speech is recognised by a process of Time Encoded Signal Processing And Recognition (TESPAR) and the processing means may be configured to offer a predetermined number of alternatives if an incorrect recognition of a vocalisation is made.
Preferably, a plurality of TESPAR archetypes are stored for specific users.
In a preferred embodiment, the recognition equipment is mounted within a motor vehicle and telephone communications are made by mobile cellular networks. According to a third aspect of the present invention, there is provided a method of establishing a telephone call using speech recognition wherein, after a speech recognition system has failed to correctly identify a destination to call, a call is made to an assisting person; telephone number data is transmitted back to said user; and a new telephone call is established in response to transmitted data.
Brief Description of the Drawings
Figure A shows a conventional mobile cellular telephone with hands-free voice recognition facilities;
Figure 1 shows a hands-free mobile telephone system mounted within a motor vehicle;
Figure 2 illustrates vehicles operating within a cellular telephone environment;
Figure 4 summarises the operation of the environment identified in Figure 2; and
Figure 5 details the telephone system identified in Figure 1.
Detailed Description of The Preferred Embodiments
Hands-free telephony systems may be employed in many situations where an operative cannot physically operate a telephone in the usual way or in doing so the operative may be distracted leading to a potentially dangerous situation. The hands-free environment described herein relates to the deployment of a mobile cellular telephone within a motor vehicle. However, it should be appreciated that this particular application presents an example and should not be considered limiting.
A car interior having mobile telephony equipment is shown in Figure 1. The mobile telephone may be permanently mounted within the vehicle or, alternatively, it may include a portable mobile telephone interfaced to an in-car system, commonly referred to as a "car kit".
A mobile telephone 101 is supported within a cradle 102. The cradle 102 is connected to an in-car unit 103 by means of an umbilical connection 104. The in-car unit 103 receives power from the car's internal battery via a power connection 105. In addition, an aerial connection 106 is connected to an external aerial for the transmission and reception of radio signals. Audio input signals are supplied to audio loudspeakers 107 via audio leads 108 and internal vocalisations are received from a microphone 109 via an audio input lead 110.
In addition to facilitating the establishment of telephone calls in a conventional manner, the in-car unit 103 also includes speech recognition facilities, thereby allowing a driver to establish a telephone call with minimal physical interaction. Rather than removing the telephone from its cradle or activating telephone buttons while it resides in its cradle, a driver is merely required to activate a voice recognition switch 111, thereby placing the in-car unit 103 into a condition which facilitates the establishment of a call by voice recognition procedures. Furthermore, the system is provided with a talk-back system embodying the present invention such that should voice recognition procedures fail to establish a call, a call is automatically made to a secretary or appropriate populated bureau or service centre, whereafter the required telephone number details may be returned in machine-readable form to the in-car unit 103, thereby facilitating the establishment of a further call, again without any manual intervention on the part of the driver.
The vehicle identified in Figure 1 is also shown in Figure 2 at 201. The vehicle operates within a cellular telephone environment and other vehicles in this environment, such as vehicles 202 and 203, also communicate via the GSM facilities, illustrated generally as a cellular network 204. The GSM cellular network 204 is connected to a local public switched telephone network 205 via conventional interface channels 206, from which it is possible to establish conventional telephone connections to an office environment, shown generally at 207 and to a home environment, shown generally at 208. It is possible that these connections may be established using speech recognition systems, where the telephone numbers for "the office" and "home" are stored against appropriate encoded speech templates. Thus, a speech utterance is compared against a selection of templates and a best classification is made by the speech recognition equipment, which is then presented acoustically to the driver, thus enabling the driver to confirm or otherwise whether a telephone call is to be established.
In addition, in accordance with the present invention, the system is provided with a "talk-back" system such that if the speech recognition fails to identify the number required after a number of attempts, the system is automatically programmed to make a call to a service centre 209. Alternatively, the system may be programmed to make a specified call, possibly to a secretary or other assistant. However, in a preferred embodiment, the service centre provides talk-back facilities for a significant number of users on a subscription basis.
After failing to identify a required number using speech recognition, a call is made to the service centre and an audio call is established with a service centre operative. The driver identifies the destination to which efforts are being made to establish a call and the operative at the service centre takes measures to identify the appropriate number. This may involve listing numbers from customer specific databases etc. Having determined the number required, the information is relayed back to the calling customer in machine-readable form, such that the customer is then in a position to establish the call using the received data.
Thus, it can be seen that there is provided an environment in which it is possible to establish a telephone call using speech recognition. If the speech recognition system is successful (which it should be on the majority of occasions) a telephone call is established and a user is then in a position to communicate without any additional manual interaction. However, when the speech recognition fails to identify a required destination, the system automatically makes a call to an assisting person, in the form of an operative at service centre 209 or a personal assisting person, such as a personal secretary. Such a call, identified by the term "talk-back" is selectively used by a user and when so enabled, telephone number data is transmitted back to the user, thereby allowing the user to establish a new telephone call without additional manual intervention.
Thus, if the system is working properly, automatic speech recognition procedures within the car will allow the number to be selected and automatically dialled. However, should these procedures fail, the user is still not required to manually intervene, given that a back-up system, in the form of the talk-back procedures, will activate automatically thereby providing a human-involved reliable procedure for allowing the call to be established. In a preferred embodiment, the speech recognition procedures employ Time Encoded Signal Processing And Recognition (TESPAR) and, in particular, time encoded speech processing and recognition. The fundamental operating aspects of TESPAR are disclosed in United States patents 4,382,160, 5,091,949, 5,101,433 and 5,101,434, along with European patent publications 0 166 607 and 0 256 081.
TESPAR has the unique capability of coding time varying signals into common fixed sized matrices, thereby facilitating its application in voice recognition systems. TESPAR archetypes for individual words when correlated against individual or multiple versions of the same word may produce one hundred percent scores and this ability contrasts significantly against alternative systems.
It has been discovered that in high acoustic background noise, a TESPAR matrix produced by a speaker repeating a word a number of times is productive in averaging up the wanted features of the signal and averaging out much of the background noise. Thus, one important element of the human capability discussed previously may be reproduced by a substantially equivalent capability in the TESPAR word recognition domain.
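The averaging effect described above can be illustrated with a minimal sketch. It assumes each utterance has already been coded into a fixed-size matrix of counts; the coding step itself is not shown and is not the TESPAR coding defined in the cited patents. The function names, the distance measure and the toy numbers are all hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def average_matrices(matrices: List[Matrix]) -> Matrix:
    """Element-wise mean of equally sized matrices."""
    rows, cols = len(matrices[0]), len(matrices[0][0])
    return [
        [sum(m[r][c] for m in matrices) / len(matrices) for c in range(cols)]
        for r in range(rows)
    ]


def distance(a: Matrix, b: Matrix) -> float:
    """Sum of absolute element differences (smaller means more similar)."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


# Hypothetical stored archetype for one word.
archetype = [[4.0, 0.0], [0.0, 4.0]]

# Two repetitions of the same word, each disturbed by different toy "noise".
utterance_1 = [[6.0, 1.0], [0.0, 3.0]]
utterance_2 = [[2.0, 0.0], [1.0, 5.0]]

# Averaging the repetitions reinforces what they share and dilutes what differs.
averaged = average_matrices([utterance_1, utterance_2])

print(distance(utterance_1, archetype))  # single utterance
print(distance(averaged, archetype))     # averaged repetitions
```

In this toy case the averaged matrix lies closer to the archetype than either single utterance, which is the behaviour the paragraph attributes to repeated words in noise.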
Furthermore, word recognition modules may incorporate the so-called nth choice option by means of a dialog between the human user and the machine, using synthetic speech prompts. Thus, if the recognition vocabulary were to consist of the digits zero to nine and the driver spoke the word five to it, the recognition procedure may incorrectly recognise the word nine and prompt the user with the words "did you say nine?". The user would respond with the word "no", resulting in the machine being in a position to offer a second choice out of the list of comparisons previously made by it. If the second choice were to be the word five, which is likely, the machine would then prompt with the phrase "did you say five?" and the driver would then respond with the word "yes". At this point, the system would act accordingly. Thus, by this nth choice mechanism, provided that the yes and no recognition capability has a very high integrity, a system with this facility can guarantee that the driver can always select the correct word.
The effectiveness of this procedure is enhanced by the fact that the second choice selected by the machine is more likely to be the correct word chosen than the third or higher and that the third choice is likely to be more probable than the fourth or higher etc. It has been discovered that this capability is an inherent property of TESPAR-based word recognition systems, in contrast with systems deploying spectral templates or hidden Markov models. Thus, this nth choice facility is one which may be deployed very effectively to cater for errors that may occur in TESPAR-based systems but not so effectively in alternative systems where the error rate is likely to be high.
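The nth choice dialog described above can be sketched as a loop over candidates ranked by score. This is an illustration only: the score values, the `confirm` callback (standing in for the synthetic prompt and the recognised "yes" or "no") and the function name are assumptions, not part of the disclosure.

```python
from typing import Callable, Dict, List, Optional


def nth_choice(
    scores: Dict[str, float],
    confirm: Callable[[str], bool],
    max_offers: Optional[int] = None,
) -> Optional[str]:
    """Offer candidates in descending score order until one is confirmed."""
    ranked: List[str] = sorted(scores, key=lambda w: scores[w], reverse=True)
    if max_offers is not None:
        ranked = ranked[:max_offers]
    for word in ranked:
        # Stands in for a prompt such as "did you say nine?" and the
        # driver's spoken "yes" or "no".
        if confirm(word):
            return word
    return None  # no candidate was confirmed within the allowed offers


# Hypothetical scores: the driver said "five" but "nine" scored highest.
scores = {"nine": 0.91, "five": 0.88, "one": 0.40}
asked: List[str] = []


def driver_meant_five(word: str) -> bool:
    asked.append(word)
    return word == "five"


chosen = nth_choice(scores, driver_meant_five)
print(chosen, asked)
```

Because the candidates are offered in score order, the correct word is reached on the second prompt in this example; the optional `max_offers` limit models a predetermined number of alternatives.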
It has been discovered that a measure of nth choice activity may be a very powerful indication of difficulty in voice communication when operating in high or variable acoustic background noise. It has been discovered that this nth choice activity may be measured to provide powerful additional alternative communication options which may enable effective communication to take place irrespective of the acoustic environment experienced by the driver.
A typical portable phone architecture (based on the GSM system) is shown in Figure 3. This includes microphone 109, loudspeaker 107, voice and base-band coder 301, GSM processor 302, interface to the keypad 303, mobile phone display 304, a random access memory module 305 and a read only module 306. In addition, there is provided a radio module 307 that enables both the transmission and reception of radio signals. A Subscriber Identity Module (SIM) 308 is provided and, to facilitate the transmission of data, the system includes a data terminal adapter 309.
A word recognition/keypad code module 311 is provided, embodying the previously described characteristics. This is inserted between a hands-free module 312 and a voice/base-band coder 301. Appropriate data is rerouted through the voice/base-band coder 301 to the GSM processor 302 for emulating activity of a keypad 303 and the display 304 options. In addition, a Dual Tone Multi-Frequency (DTMF)/keypad code module 313 is placed in parallel with loudspeaker 107 thereby receiving an output from the voice/base-band coder 301. Similarly, an output from the DTMF/keypad code module 313 is connected to GSM processor 302. However, it should be stressed that a range of different and alternative interconnections may be used to achieve similar capabilities and that the arrangement shown in Figure 3 is merely a particular example of interconnections provided to illustrate the embodiment.
A situation may be assumed in which a driver wishes to communicate with one of N organisations or individuals stored in the word recognition/keypad module 311. If the word recognition system is a speaker independent one, pre-stored templates or TESPAR archetypes will be provided within the module. If the system is a speaker dependent word recognition process, the user will previously have provided and trained the system on a number of examples.
On operating the active voice switch 111, the system will produce a synthetic speech prompt asking "do you wish to make a call?". The driver will respond with the word "yes" if he wishes to make a call or with the word "no" if the active voice switch has been inadvertently operated. If the driver's response is "yes" the recognition module 311 will respond with an acoustic prompt "please indicate a caller". The driver will then speak out one of the destination addresses from the list stored in his word recognition subdirectory such as "home", "John Smith" etc.
The word recognition module 311 will compare the driver's acoustic input with the archetypes or templates of each of the words in the personalised telephone sub-directory, select the highest scoring entry and respond with a synthetic speech prompt associated with the highest score.
Thus, a response may be produced along the lines "do you want John Smith". If the response is "yes" John Smith's appropriate keypad code, that is to say, the relevant telephone number, will be passed, via the voice/baseband coder 301 to the GSM processor 302 and the GSM processor will then output the correct telephone code number for John Smith over the signalling channel of the radio communication link.
By these means, a normal telephone call will be set up, the appropriate phone at the destination address will ring and the driver will communicate with the destination address via the hands-free module equipment 312.
If the acoustic conditions are poor and the speech recognition system does not work effectively, such that the first word spoken by the driver to the system is not recognised as the correct word, the driver will indicate "no" and the recognition module will produce a second choice, a third choice and so on for a predetermined number of times until the correct word is identified. This should enable the driver to obtain the selected number. As previously indicated, this process is referred to as nth choice and should provide significant enhancements which, with TESPAR-based word recognition systems, should result in a one hundred percent success rate irrespective of acoustic conditions.
However, for any large-sized vocabulary, the nth choice procedure can prove tedious, since the larger the vocabulary the more likely an error is to occur and the larger the vocabulary the more frustration is likely to be engendered.
If the value of N is large, a large number of interactions or acoustic transactions may need to take place before the correct subscriber is eventually chosen. For example, pathologically, with a sixty-four word subdirectory, sixty-three system interactions may be required in order to achieve success. This is unsatisfactory, even when background noise is very intrusive.
In situations where very high noise levels are present, a very high noise (VHN) mode may be brought into operation, also referred to as "talk-back", where a DTMF code module 313 or other keypad code generator is brought into play. The VHN facility may be introduced manually or automatically, the latter being via the nth choice procedure, in the following manner. If the correct word is not chosen first, nor is it chosen a second time, that is to say when n is greater than, for example, two, the word recognition/keypad module 311 is activated to prompt with the phrase "do you want talk-back?". If the response is a vocalised "yes", the DTMF keypad code module 313 "dials" via the GSM processor 302, one of the list of talk-back numbers, the details of which are known to the driver. These may be, for example, the driver's office, the driver's home or a bureau designated specifically to provide the talk-back service, as identified in Figure 2. Having dialled this number, the driver is then, by the means previously described, interconnected directly to his office or directly to his secretary or to his home or to a talk-back bureau with whom he has listed the N individuals that form part of his word recognition subdirectory or, if required, to other subscribers not on the list. A typical interaction may be illustrated with reference to Figure 4, where the call from the vehicle is routed via a radio base station 401, through a telephone exchange 402 and to an appropriate telephone 403. Upon receiving an incoming call, the handset 404 of the telephone is picked-up thereby placing the telephone 403 off-hook and, hence, in a receptive mode. The driver's office or the talk-back bureau are then in direct voice communication and, irrespective of the background acoustic conditions, this should enable the secretary or bureau to ascertain which number the driver wishes to access.
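The automatic entry into talk-back described above can be sketched as a small decision routine. It is a sketch under stated assumptions: the function name, the returned action labels and the "idle" outcome when talk-back is declined are hypothetical, and the limit of two offers follows the "for example, two" given in the text.

```python
from typing import Callable, List, Optional, Tuple


def select_or_talk_back(
    ranked_names: List[str],
    confirm: Callable[[str], bool],
    max_attempts: int = 2,
) -> Tuple[str, Optional[str]]:
    """Offer up to max_attempts ranked names, then offer talk-back.

    Returns ("dial", name) when a name is confirmed, ("talk_back", None)
    when the driver accepts talk-back, and ("idle", None) otherwise.
    """
    for name in ranked_names[:max_attempts]:
        if confirm(f"do you want {name}?"):
            return ("dial", name)
    # Neither the first nor the second choice was correct, so the
    # very high noise (VHN) option is offered.
    if confirm("do you want talk-back?"):
        return ("talk_back", None)
    return ("idle", None)


def driver_saying_yes_to(accepted: set) -> Callable[[str], bool]:
    """Hypothetical stand-in for the recognised yes/no response."""
    return lambda prompt: prompt in accepted


# In heavy noise the recogniser ranks two wrong names ahead of the right one.
ranked = ["Jon Smythe", "Joan Smith", "John Smith"]
outcome = select_or_talk_back(
    ranked, driver_saying_yes_to({"do you want talk-back?"})
)
print(outcome)
```

On a "talk_back" outcome the in-car unit would dial one of the listed talk-back numbers; that dialling step is outside this sketch.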
Many modern telephones incorporate Dual Tone Multi-Frequency (DTMF) dialling code capability and, for example, the driver's secretary may have the driver's N codes previously stored by means of quick dial memory keys. Thus, when the driver requests the number for a particular contact using the talk-back provision, the information may be provided quickly by means of accessing the memory keys. Thus, this exchange of information will take place to the limits of human capability, using words twice, or whatever other form of interactive voice exchange is needed to confirm the requirements, irrespective of the acoustic background conditions.
Once the secretary or bureau operative confirms that the number for John Smith is required, the secretary will be aware that John Smith's telephone number is on, for example, quick dial three. With the handset off hook, the secretary will depress the quick dial three button, which will then generate a series of DTMF tones associated with John Smith's telephone number. These tones will then be automatically transmitted from the office phone to be received in the driver's vehicle telephone equipment via the loudspeaker channel, as shown in Figure 3. The tones will be heard by the driver and appropriately interpreted in parallel by the DTMF keypad code module 313. At the DTMF keypad code module 313, the numerical values are stored and translated into appropriate dial codes for the GSM processor 302 in preparation for subsequent transmission.
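The interpretation of received DTMF tones can be illustrated with a short sketch. The text does not say how module 313 detects tones; the Goertzel recurrence used here is a commonly used technique chosen for illustration, and the function names, the 8000 Hz sample rate, the 50 ms burst length and the example number are assumptions. The row and column frequencies are the standard DTMF assignments.

```python
import math
from typing import List

ROW_FREQS = [697, 770, 852, 941]        # Hz, standard DTMF low group
COL_FREQS = [1209, 1336, 1477, 1633]    # Hz, standard DTMF high group
KEYS = ["123A", "456B", "789C", "*0#D"]


def goertzel_power(samples: List[float], freq: float, sample_rate: int) -> float:
    """Signal power at a single frequency via the Goertzel recurrence."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq / sample_rate)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2


def decode_key(samples: List[float], sample_rate: int = 8000) -> str:
    """Pick the strongest row and column frequency and map them to a key."""
    row = max(range(4), key=lambda i: goertzel_power(samples, ROW_FREQS[i], sample_rate))
    col = max(range(4), key=lambda i: goertzel_power(samples, COL_FREQS[i], sample_rate))
    return KEYS[row][col]


def make_tone(key: str, sample_rate: int = 8000, duration: float = 0.05) -> List[float]:
    """Synthesise one clean DTMF burst (stands in for the received audio)."""
    for r, line in enumerate(KEYS):
        if key in line:
            f_row, f_col = ROW_FREQS[r], COL_FREQS[line.index(key)]
            break
    else:
        raise ValueError(f"not a DTMF key: {key!r}")
    n = int(sample_rate * duration)
    return [
        math.sin(2 * math.pi * f_row * t / sample_rate)
        + math.sin(2 * math.pi * f_col * t / sample_rate)
        for t in range(n)
    ]


def decode_number(bursts: List[List[float]]) -> str:
    """Decode a sequence of already separated tone bursts into digits."""
    return "".join(decode_key(b) for b in bursts)


# Hypothetical number sent by the quick dial key, one burst per digit.
received = [make_tone(k) for k in "5550123"]
number = decode_number(received)
print(number)
```

The sketch assumes the bursts have already been separated and are free of noise; a practical detector would also need to find burst boundaries and reject speech or noise that merely resembles a tone pair.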
By these means, a variety of different options are made available. In the current example, the existing link to the secretary will automatically be disabled and the telephone number decoded by the DTMF keypad module 313. It can be seen that by these means:
A. an optimum capability may be achieved, irrespective of acoustic background noise;
B. the facilities may be provided for any word recognition procedure to enhance its effectiveness, irrespective of its basic performance in acoustic background noise; and
C. the procedures disclosed are equally effective and enabling for the sheep and goat populations of vehicular mobile radio users.
Preferably, for the reasons indicated, TESPAR-based word recognition procedures should be used in preference to existing known systems, since these are more likely to reduce the number of occasions when the talk-back facility is required, as the nth choice routines are particularly efficient when using TESPAR processes. It is also relevant that, in relation to this disclosure, TESPAR procedures may use words twice and similar protocols productively to overcome acoustic background noise.
It will be apparent that by this means if the driver wishes to contact an unusual telephone number not included in the local telephone sub-directory, the talk-back facility may provide the driver with any number very effectively. Thus, the balance between word recognition capability, telephone sub-directory size, cost and complexity may be optimised for particular users and individual applications, to expand and improve all voice input facilities and capabilities, to provide one hundred percent system integrity in the most difficult environments and in a manner which leaves both sheep and goat drivers free to drive in a safe and effective fashion. By means of the current disclosure, this capability is easily embodied and does not involve changes to standard mobile telephone equipment using the office or the bureau or the home to provide the facility disclosed to the driver on the highway.
It will also be obvious to those skilled in the art that the features of this disclosure may be configured to be used in a variety of differing scenarios in differing application areas and with differing data coding strategies. For example, the disclosure may be applied for handicapped users, for long-term hospital patients, for the safety of engineers working manually in remote locations or hazardous situations, and for operators involved in complicated manual procedures, all or any of which may otherwise preclude the manual hands-on use of portable telephone equipment for communication, information transfer and remote control.

Claims

What We Claim is:
1. Telephony apparatus, including speech recognition means configured to identify a destination to be called in response to a vocalisation; calling means configured to call an assisting person if the speech recognition means fails to correctly identify said vocalised destination; and means for receiving telephone number data transmitted back in response to said call to said assisting person, wherein said calling means establishes a new call in response to said transmitted data.
2. Apparatus according to claim 1, wherein speech recognition means is configured to offer an alternative recognised destination before calling said assisting person.
3. Apparatus according to claim 2, wherein said recognition means is configured to offer a number of alternatives before calling an assisting person.
4. Apparatus according to claim 2 or claim 3, wherein said speech recognition means recognises incoming speech vocalisations by a process of Time Encoded Signal Processing And Recognition (TESPAR).
5. Apparatus according to claim 1, wherein said speech recognition means is configured to be responsive to words being spoken more than once so as to improve the recognition procedures.
6. Apparatus according to claim 5, wherein said speech recognition procedures are performed using Time Encoded Signal Processing And Recognition (TESPAR).
7. Apparatus according to claim 1, wherein said assisting person has means for generating an encoded representation of said telephone number using conventional signalling techniques.
8. Apparatus according to claim 7, wherein the equipment of said assisting person is configured to transmit audio tones.
9. Apparatus according to claim 1, wherein said calling means establishes a call by use of a cellular telephony network.
10. Apparatus according to claim 9, wherein said cellular telephone system is mounted within a motor vehicle.
11. Telephony apparatus configured to establish a telephone call using speech recognition, including means for receiving acoustic vocalisations wherein the name of a destination is repeated; and processing means are configured to analyse said repeated vocalisation so as to improve recognition properties.
12. Apparatus according to claim 11, wherein said speech is recognised by a process of Time Encoded Signal Processing And Recognition (TESPAR).
13. Apparatus according to claim 12, wherein said processing means is configured to offer a predetermined number of alternatives if an incorrect recognition of a vocalisation is made.
14. Apparatus according to claim 12, wherein a plurality of TESPAR archetypes are stored for specific users.
15. Apparatus according to claim 11, wherein said recognition equipment is mounted within a motor vehicle and telephony communications are made via mobile cellular networks.
16. A method of establishing a telephone call using speech recognition wherein, after a speech recognition system has failed to correctly identify a destination to call, a call is made to an assisting person; telephone number data is transmitted back to said user; and a new telephone call is established in response to said transmitted data.
17. A method according to claim 16, wherein said speech recognition system offers an alternative recognised destination before calling said assisting person.
18. A method according to claim 17, wherein said speech recognition system is configured to offer a number of alternatives before calling an assisting person.
19. A method according to claim 17 or claim 18, wherein said speech recognition system recognises incoming speech vocalisations by a process of Time Encoded Signal Processing And Recognition (TESPAR).
20. A method according to claim 16, wherein the speech recognition system is responsive to words being spoken more than once to improve recognition.
21. A method according to claim 20, wherein said speech recognition system is performed in accordance with Time Encoded Signal Processing And Recognition (TESPAR) procedures.
22. A method according to claim 16, wherein said assisting person has means for generating an encoded representation of said telephone number using conventional signalling techniques.
23. A method according to claim 22, wherein the equipment of said assisting person is configured to transmit audio tones.
24. A method of establishing a telephone call using speech recognition, wherein a speech recognition process is configured to receive acoustic vocalisations in which the name of a destination is repeated; and said repeated vocalisations are analysed so as to improve recognition properties.
25. A method according to claim 24, wherein the speech is recognised by a process of Time Encoded Signal Processing And Recognition (TESPAR).
26. A method according to claim 25, wherein a predetermined number of alternatives are offered if an incorrect recognition of a vocalisation is made.
27. A method according to claim 25, wherein a plurality of TESPAR archetypes are stored for specific users.
28. A method according to claim 24, wherein the recognition process is performed within a motor vehicle for application with respect to a mobile cellular network.
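
As an illustrative sketch only, and not part of the claims, the call-establishment flow recited in claims 16 to 18 and 24 might be modelled as below. Every name here (rank_candidates, establish_call, the confirm and ask_assistant callbacks, the score dictionaries) is hypothetical. Averaging match scores over repeated utterances is an assumed stand-in for the recogniser, which the claims do not specify at this level, and no TESPAR processing is shown.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Candidate = Tuple[str, float]  # (directory name, match score; higher is better)


def rank_candidates(utterance_scores: Sequence[Dict[str, float]]) -> List[Candidate]:
    """Combine scores from repeated utterances of the same destination name.

    Assumption: each utterance yields a dict of name -> score. A name missing
    from one utterance contributes 0 for that utterance.
    """
    totals: Dict[str, float] = {}
    for scores in utterance_scores:
        for name, score in scores.items():
            totals[name] = totals.get(name, 0.0) + score
    count = max(len(utterance_scores), 1)
    averaged = [(name, total / count) for name, total in totals.items()]
    # Best-scoring candidate first.
    return sorted(averaged, key=lambda c: c[1], reverse=True)


def establish_call(
    utterance_scores: Sequence[Dict[str, float]],
    directory: Dict[str, str],
    confirm: Callable[[str], bool],
    ask_assistant: Callable[[], str],
    max_alternatives: int = 3,
) -> Tuple[str, str]:
    """Return (source, number); source is 'recognised' or 'assistant'."""
    # Offer up to max_alternatives recognised destinations to the user.
    for name, _score in rank_candidates(utterance_scores)[:max_alternatives]:
        if name in directory and confirm(name):
            return ("recognised", directory[name])
    # Recognition failed: an assisting person supplies the number, which is
    # then used to establish a new call.
    return ("assistant", ask_assistant())


if __name__ == "__main__":
    scores = [{"Alice": 0.6, "Alan": 0.7}, {"Alice": 0.9, "Alan": 0.4}]
    book = {"Alice": "01234", "Alan": "05678"}
    print(establish_call(scores, book, lambda n: n == "Alan", lambda: "0999"))
```

In this sketch the second utterance changes the ranking: "Alan" scores highest on the first utterance alone, but "Alice" is offered first once both utterances are combined. If the user rejects every offered alternative, the number comes from the assisting person instead.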
PCT/GB1999/000910 1998-03-25 1999-03-25 Method for establishing telephone calls WO1999049639A1 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
AU30442/99A AU756212B2 (en) 1998-03-25 1999-03-25 Method for establishing telephone calls
EP99911930A EP1070415A1 (en) 1998-03-25 1999-03-25 Method for establishing telephone calls
JP2000538487A JP2002508629A (en) 1998-03-25 1999-03-25 How to make a phone call

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
GB9806401.7 1998-03-25
GBGB9806401.7A GB9806401D0 (en) 1998-03-25 1998-03-25 Improvements in voice operated mobile communications

Publications (1)

Publication Number Publication Date
WO1999049639A1 (en) 1999-09-30

Family

ID=10829243

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/GB1999/000910 WO1999049639A1 (en) 1998-03-25 1999-03-25 Method for establishing telephone calls

Country Status (5)

Country Link
EP (1) EP1070415A1 (en)
JP (1) JP2002508629A (en)
AU (1) AU756212B2 (en)
GB (2) GB9806401D0 (en)
WO (1) WO1999049639A1 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6370506B1 (en) * 1999-10-04 2002-04-09 Ericsson Inc. Communication devices, methods, and computer program products for transmitting information using voice activated signaling to perform in-call functions
KR20010094229A (en) * 2000-04-04 2001-10-31 이수성 Method and system for operating a phone by voice recognition technique
FR2829637A1 (en) * 2001-11-13 2003-03-14 Siemens Vdo Automotive Personal communications element/vehicle communications having base communications element connected and connecting communications interface bus directly with command unit displaying communications elements

Citations (4)

Publication number Priority date Publication date Assignee Title
US4870686A (en) * 1987-10-19 1989-09-26 Motorola, Inc. Method for entering digit sequences by voice command
EP0404502A2 (en) * 1989-06-19 1990-12-27 Nec Corporation Voice recognition dialing unit
US5033088A (en) * 1988-06-06 1991-07-16 Voice Processing Corp. Method and apparatus for effectively receiving voice input to a voice recognition system
US5091949A (en) * 1983-09-01 1992-02-25 King Reginald A Method and apparatus for the recognition of voice signal encoded as time encoded speech

Family Cites Families (2)

Publication number Priority date Publication date Assignee Title
JP2793213B2 (en) * 1988-12-29 1998-09-03 株式会社東芝 Speech recognition device and telephone using the same
FI97919C (en) * 1992-06-05 1997-03-10 Nokia Mobile Phones Ltd Speech recognition method and system for a voice-controlled telephone

Also Published As

Publication number Publication date
JP2002508629A (en) 2002-03-19
GB9806401D0 (en) 1998-05-20
GB2335826B (en) 2000-05-31
GB9906921D0 (en) 1999-05-19
AU756212B2 (en) 2003-01-09
EP1070415A1 (en) 2001-01-24
GB2335826A (en) 1999-09-29
AU3044299A (en) 1999-10-18

Similar Documents

Publication Publication Date Title
KR0129856B1 (en) Method for entering digit sequences by voice command
US8311584B2 (en) Hands-free system and method for retrieving and processing phonebook information from a wireless phone in a vehicle
US8917827B2 (en) Voice-operated interface for DTMF-controlled systems
USRE45066E1 (en) Method and apparatus for the provision of information signals based upon speech recognition
KR910006053B1 (en) Telephone system
US4959850A (en) Radio telephone apparatus
US4525793A (en) Voice-responsive mobile status unit
US20080194301A1 (en) Voice Activated Dialing for Wireless Headsets
KR960004692B1 (en) Method for terminating a telephone call by voice command
CN103124318B (en) Start the method for public conference calling
EP0739121A2 (en) Voice activated telephone
US6256611B1 (en) Controlling a telecommunication service and a terminal
US20080144805A1 (en) Method and device for answering an incoming call
US5915239A (en) Voice-controlled telecommunication terminal
FI111673B (en) Procedure for selecting a telephone number through voice commands and a telecommunications terminal equipment controllable by voice commands
US20040176138A1 (en) Mobile telephone having voice recording, playback and automatic voice dial pad
AU756212B2 (en) Method for establishing telephone calls
US7471776B2 (en) System and method for communication with an interactive voice response system
US7092515B2 (en) VC-to-DTMF interfacing system and method
US20080118056A1 (en) Telematics device with TDD ability
GB2113048A (en) Voice-responsive mobile status unit
EP0293264B1 (en) Telephone apparatus with a voice recognition system and operable in noisy environment
JP3384282B2 (en) Telephone equipment
KR970055729A (en) Method and apparatus for transmitting telephone number by voice recognition in mobile terminal
KR100788652B1 (en) Apparatus and method for dialing auto sound

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AE AL AM AT AU AZ BA BB BG BR BY CA CH CN CU CZ DE DK EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA UG US UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GH GM KE LS MW SD SL SZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
WWE Wipo information: entry into national phase

Ref document number: 1999911930

Country of ref document: EP

WWE Wipo information: entry into national phase

Ref document number: 30442/99

Country of ref document: AU

NENP Non-entry into the national phase

Ref country code: KR

WWE Wipo information: entry into national phase

Ref document number: 09646106

Country of ref document: US

WWP Wipo information: published in national office

Ref document number: 1999911930

Country of ref document: EP

REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

WWG Wipo information: grant in national office

Ref document number: 30442/99

Country of ref document: AU

WWW Wipo information: withdrawn in national office

Ref document number: 1999911930

Country of ref document: EP