GB2304957A - Voice-dialog system for automated output of information - Google Patents

Voice-dialog system for automated output of information Download PDF

Info

Publication number
GB2304957A
Authority
GB
United Kingdom
Prior art keywords
utterance
voice
utterances
user
identifier
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
GB9618308A
Other versions
GB2304957B (en)
GB9618308D0 (en)
Inventor
Georg Fries
Karlheinz Schuhmacher
Antje Wirth
Bernhard Kaspar
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Deutsche Telekom AG
Original Assignee
Deutsche Telekom AG
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Deutsche Telekom AG filed Critical Deutsche Telekom AG
Publication of GB9618308D0 publication Critical patent/GB9618308D0/en
Publication of GB2304957A publication Critical patent/GB2304957A/en
Application granted granted Critical
Publication of GB2304957B publication Critical patent/GB2304957B/en
Anticipated expiration legal-status Critical
Expired - Fee Related legal-status Critical Current

Links

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04MTELEPHONIC COMMUNICATION
    • H04M3/00Automatic or semi-automatic exchanges
    • H04M3/42Systems providing special services or facilities to subscribers
    • H04M3/487Arrangements for providing information services, e.g. recorded voice services or time announcements
    • H04M3/493Interactive information services, e.g. directory enquiries ; Arrangements therefor, e.g. interactive voice response [IVR] systems or voice portals
    • H04M3/4931Directory assistance systems
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/26Speech to text systems
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/08Speech classification or search
    • G10L15/18Speech classification or search using natural language modelling
    • G10L15/183Speech classification or search using natural language modelling using context dependencies, e.g. language models
    • G10L15/187Phonemic context, e.g. pronunciation rules, phonotactical constraints or phoneme n-grams
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04MTELEPHONIC COMMUNICATION
    • H04M2201/00Electronic components, circuits, software, systems or apparatus used in telephone systems
    • H04M2201/40Electronic components, circuits, software, systems or apparatus used in telephone systems using speech recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Machine Translation (AREA)

Abstract

A voice-dialog system outputs information, in particular a telephone number. An alphabet-identifier identifies an utterance which is spelt out by the user and selects utterances that can be spelt in a similar manner from a plurality of predetermined utterances; an utterance-identifier compares the utterance input by the user with the utterances selected by the alphabet-identifier and supplies at least one utterance for output to the user. A lexicon operates on-line and stores orthographic-phonetic information for the plurality of predetermined utterances which the alphabet-identifier, the utterance-identifier and a synthesizer can access in real time.

Description

VOICE-DIALOG SYSTEM FOR AUTOMATED OUTPUT OF INFORMATION The invention relates to a voice-dialog method for automated output of information, such as a telephone number of a user, and to a voice-dialog installation for carrying out the method, and to an apparatus for speaker-independent voice-identification, in particular for use in such an installation.
Voice-dialog systems for automated voice output of telephone numbers are known, in which the dialog between a caller, who requires certain information, and the system is conducted over the telephone. The voice-dialog systems currently in operation can, however, only identify a fixed, small to medium vocabulary of approximately 1000 words. Any texts, including the output of place names, surnames and the telephone number, are output by way of a voice synthesizer. It has, however, been shown that errors occur in the pronunciation of names, particularly if the names do not obey the usual pronunciation rules.
The underlying object of the invention is therefore to make available a voice-dialog method for automated output of information, and to provide a voice-dialog installation which is suitably developed for this purpose, which can process a very large identifiable vocabulary, that is, approximately 10,000 to 100,000 words, while still attaining an acceptable identification rate, and which also reduces or even totally avoids errors in the voice output of foreign-language terms.
According to a first aspect of the present invention, there is provided a voice-dialog method for automated output of information, having the following steps: (a) intermittently loading orthographic-phonetic information for a plurality of predetermined utterances from a lexicon which is capable of operating on-line, with the information being available in real time; (b) verbally requesting the user to input an utterance; (c) temporarily storing the utterance which has been input; (d) verbally requesting the user to spell the utterance which has been input; (e) in response to the spelt-out utterance, identifying and selecting a plurality of the predetermined, spelt-out reference utterances with the aid of the stored orthographic information on the basis of ascertaining similarity; (f) feeding the selected utterances and the temporarily stored utterance to an utterance identifier; (g) identifying and selecting at least one utterance from the selected utterances on the basis of a similarity-comparison; and (h) sequentially outputting the utterances found in step (g) and the associated information in synthesized voice form.
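The sequence of steps (a) to (h) can be sketched in outline as follows. This is an illustrative sketch only: the function names, the toy spelling-similarity measure and the stand-in "acoustic" check are assumptions made for demonstration, not part of the claimed method.

```python
# Illustrative sketch of method steps (a)-(h). The helpers and the toy
# similarity measures are assumptions; the method does not prescribe a
# concrete similarity algorithm.

def similar_spelling(word, spelt_letters, max_mismatch=1):
    # Step (e): toy orthographic similarity -- letter sequences of equal
    # length may differ in at most `max_mismatch` positions.
    if len(word) != len(spelt_letters):
        return False
    return sum(a != b for a, b in
               zip(word.upper(), spelt_letters.upper())) <= max_mismatch

def run_dialog(reference_names, direct_utterance, spelt_letters, sounds_like):
    spoken = direct_utterance                     # (b)+(c): input, stored temporarily
    candidates = [w for w in reference_names      # (d)+(e): spelt input, pre-selection
                  if similar_spelling(w, spelt_letters)]
    results = [w for w in candidates              # (f)+(g): similarity comparison
               if sounds_like(w, spoken)]
    return results                                # (h): utterances for synthesized output

reference_names = ["MEIER", "NEIER", "METER", "HUBER"]
# The lambda is a placeholder for the acoustic comparison of step (g).
hits = run_dialog(reference_names, "meier", "MEIER",
                  lambda word, spoken: word[0] in "MN")
```

The pre-selection in step (e) is what keeps step (g) tractable: only the small candidate list, never the whole vocabulary, is compared against the stored utterance.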
According to a second aspect of the present invention, there is provided a voice-dialog installation, comprising: a device for the input of an utterance by a user, at least one synthesizer for generating voice signals for the user, a voice-inputting device, an alphabet-identifier which can identify an utterance which is spelt out by the user and can select orthographically similar utterances from a plurality of predetermined spelt-out reference utterances, an utterance-identifier which compares the utterance input by the user with the utterances selected by the alphabet-identifier and on the basis of ascertaining similarity supplies at least one utterance for output to the user, and at least one lexicon which is capable of operating on-line and stores orthographic-phonetic information for the plurality of predetermined utterances which the alphabet-identifier, the utterance-identifier and the synthesizer can access in real time.
According to a third aspect of the present invention, there is provided an apparatus for speaker-independent voice-identification, having an alphabet-identifier which can identify an utterance spelt out by a user and can select several spelt-out reference utterances from a plurality of predetermined spelt-out reference utterances on the basis of ascertaining similarity, and having an utterance-identifier which, on the basis of ascertaining similarity, compares an utterance, which is input by the user and which corresponds to the spelt-out utterance, with the utterances which are pre-selected by the alphabet-identifier and supplies at least one output utterance as a result.
The invention is able to process a very large vocabulary at an acceptable identification rate, as an utterance input by a user undergoes combined voice identification. This utterance can be a surname, a first name, a street name, a place name or even words which are joined together. The combined voice identification comprises an alphabet-identifier, which can identify an utterance spelt out by the user and thereupon can select orthographically similar utterances from a plurality of predetermined reference utterances which have been spelt out. The term "orthographically similar utterance" is used in the following to express the fact that two or more sequences of pronounced letters forming words sound alike (e.g. "es e es es e el" and "ef e es es e el").
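This notion of orthographic similarity, in which spelt-out letter sequences sound alike, can be made concrete with a toy confusion model. The confusion sets below (S/F, M/N, and the "E-set" letters) are illustrative assumptions about which pronounced letter names are acoustically close; the description does not specify such sets.

```python
# Toy model of "orthographically similar": two spelt-out words are similar if
# each pair of pronounced letters is identical or acoustically confusable.
# The confusion sets are illustrative assumptions, not taken from the source.

CONFUSABLE = [{"S", "F"}, {"M", "N"}, {"B", "D", "E", "P", "T"}]

def letters_sound_alike(a, b):
    if a == b:
        return True
    return any(a in group and b in group for group in CONFUSABLE)

def orthographically_similar(word1, word2):
    """True if the two spelt-out letter sequences could be confused pairwise."""
    if len(word1) != len(word2):
        return False
    return all(letters_sound_alike(a, b) for a, b in zip(word1, word2))

# "es e es es e el" vs "ef e es es e el" corresponds to SESSEL vs FESSEL:
# only the first pronounced letter differs, and S/F sound alike.
```

An identifier built on such a model deliberately over-generates candidates; the subsequent utterance-identifier then resolves the ambiguity.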
As a second main component, the combined voice identification includes an utterance-identifier which compares the utterance input directly by the user with the reference utterances which correspond to the spelt-out reference utterances selected by the alphabet-identifier. On the basis of ascertaining similarity, the utterance-identifier supplies as an identification result at least one word for output to the user, which word corresponds to a reference utterance similar to the user's utterance. A lexicon capable of operating on-line is used to store orthographic-phonetic information for the plurality of predetermined utterances which the alphabet-identifier, the utterance-identifier and a synthesizer can access in real time.
Advantageously, a memory for temporary storage is provided, which memory temporarily stores the utterance directly input by the user before it is forwarded to the utterance-identifier. In addition, the installation contains a further memory in which the spelt-out reference utterances, which have been preselected by the alphabet-identifier, are loaded in the form of a list of candidates of orthographically similar names.
The utterance-identifier operates in keyword-spotting mode so that the user can, within certain limits, make additional utterances before and after the actual utterance, and the utterance-identifier is still able to extract the relevant utterance.
The orthographic-phonetic information stored in the lexicon pertains, in the first place, to the spelling of the predetermined utterances which the alphabet-identifier uses in order to identify an utterance which has been spelt out and to make therefrom a pre-selection of orthographically similar names for the utterance-identifier. In addition, phonetic transcriptions, for example for place names and surnames, are stored in the lexicon. Orthographic and phonetic transcriptions of proper names are transmitted from an electronic dictionary of pronunciation to the lexicon in an off-line process.
In this connection, only proper names which occur in the electronic telephone directory are transferred.
The electronic telephone directory is a data bank which is capable of operating in real time and which contains the addresses and telephone numbers required to output information to the user. In order to obtain a high level of quality even in the case of voice-output of names which do not obey the usual pronunciation rules, intonation-related information of the terms is also stored in addition to the phonetic information. These voice features reproduce the intonation of syllables and endings of foreign-language words as well.
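A possible shape for the data described above — orthography for the alphabet-identifier, a phonetic transcription for the utterance-identifier, intonation features for the synthesizer, and directory records that reference the lexicon — might look as follows. All field names, the transcription notation and the toy data are assumptions for illustration only.

```python
# Sketch of one possible layout for on-line lexicon entries and telephone
# directory records. Field names and notation are illustrative assumptions.
from dataclasses import dataclass, field

@dataclass
class LexiconEntry:
    orthography: str                 # spelling, used by the alphabet-identifier
    phonetic: str                    # transcription, used by the utterance-identifier
    intonation: str = ""             # syllable/ending stress hints for the synthesizer
    homonyms: list = field(default_factory=list)  # same-sounding spellings

@dataclass
class DirectoryRecord:
    surname_ref: str                 # reference into the on-line lexicon
    address: str
    phone_number: str                # placeholder value below, not real data

# Toy data modelled on the "Meier" example used later in the description.
lexicon = {
    "Meier": LexiconEntry("Meier", "m aI 6", "stress on first syllable",
                          ["Mayer", "Maier", "Meyer"]),
}
directory = [DirectoryRecord("Meier", "Darmstadt", "0000000")]
```

Keeping the directory records as thin references into the lexicon matches the off-line transfer described above: pronunciation data is copied once from the dictionary of pronunciation, while the directory stays a lean real-time data bank.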
In order to avoid a situation where the results of identification of the combined voice identification are affected at random on account of acoustic similarities between words and/or spoken letters, additional information for homonyms is stored in the lexicon.
This additional information allows one candidate obtained by voice identification to be supplemented by alternatives which can be pronounced in the same way and thus allows the identification rate of the installation to be increased.
Advantageously, the lexicon includes a store for general vocabulary, for names of towns and for the surnames which occur there.
The control of the voice-dialog installation is effected by means of a program-controlled microcomputer. The control software implemented therein ensures inter alia that the required orthographic and phonetic information from the lexicon is made available to the identifiers and the synthesizer in good time and that the installation requests a user in a voice-controlled manner to input the respective utterances. In addition, it monitors the time-outs occurring in the voice-identifiers, processes terminating and help commands and takes over the identification and control of errors.
Internal program loops run in the utterance-identifier and in the alphabet-identifier; these loops can reject an utterance input by the user or, at the end of a given time span, can ask the user to input his utterance anew.
The invention is explained in greater detail below with reference to an exemplary embodiment in conjunction with the enclosed drawings, in which: Figure 1 is a schematic block diagram of a voice-dialog installation having the combined voice-identification according to the invention and an on-line lexicon; Figure 2 is a flow chart showing the progress of an automated voice dialog for name identification and output of a pertinent call number effected by the voice-dialog installation according to Figure 1.
Figure 1 shows the basic structure of a voice-dialog installation which can effect lexicon-controlled identification of any utterances, for example of place names or surnames, by means of a combination of voice-identifiers, and can output information associated with the utterance (for example a call number) on the basis of an utterance which has been ascertained (identification result). In detail, a telephone set or apparatus 10 is represented in Figure 1, at which apparatus a caller can input the place name and the surname of a subscriber, whose telephone number he wishes to find out, or certain other utterances.
Arranged on the operational side of the voice-dialog installation there is at least one analog-to-digital converter 80 which converts the analog voice signals from the subscriber into digital signals. The output of the analog-to-digital converter can be connected to the respective input of a voice memory 20 and an alphabet-identifier or letter-identifier 30. The voice memory 20 is used for temporary storage, for later use, of the utterance directly input into the telephone apparatus 10 by the caller, that is, for example, the name "Meier". The alphabet-identifier 30 receives, by way of the analog-to-digital converter 80 as a function of the status of the voice-dialog run, a spelt-out version of the directly input utterance which was previously stored in the voice memory 20. A program-controlled microcomputer 120 ensures that the directly input utterance is loaded into the voice memory 20 and that the spelt-out utterance is fed to the alphabet-identifier 30. The output of the alphabet-identifier 30 is connected to a memory 40, stored in which there is a list of candidates of orthographically similar utterances which have been ascertained by the alphabet-identifier 30 during a pre-selection. An utterance-identifier 50 is provided with three inputs which are connected to respective outputs of the candidate memory 40, the voice memory 20 and an on-line lexicon 70. The utterance-identifier 50 operates in the so-called keyword-spotting mode which makes it possible for the actual utterance, for example "Meier", to be correctly extracted, even if additional utterances such as "er", "please" or the like precede or follow it. The output of the keyword-spotter 50 is connected to an identification-result memory 55 in which the resultant utterances, that is, similarly sounding names, are stored by the keyword-spotter 50.
The utterances which are stored in the identification-result memory 55 are fed to a synthesizer 60 which, on the basis of the corresponding information from the lexicon, in turn transmits the names in synthesized speech by way of a digital-to-analog converter 85 to the telephone apparatus 10 of the subscriber. The synthesizer 60 can also produce the verbal requests to be made of the caller in conjunction with a database - not shown - in which all of the texts to be announced by the installation are contained in an orthographic or phonetic form.
The on-line lexicon 70 mentioned above is distinguished above all by the fact that it can be used simultaneously and in real time by the alphabet-identifier 30 for letter-identification, by the keyword-spotter 50 and by the synthesizer 60. That is why all the information relating to the utterances to be identified and output by the installation is stored in this lexicon 70. This information is orthographic and pronunciation- or intonation-related information which is loaded from a dictionary of pronunciation 100 into the on-line lexicon 70 in an off-line process. In addition, information on homonyms is stored in the lexicon 70 in order to extend the identification result of the utterance-identifier with names which sound alike, or in order to supplement the spelt-out reference utterances of the alphabet-identifier with orthographically similar names, and thus to increase the probability of detecting the correct utterance. This also ensures an increased success rate during use and an improved total throughput through the installation, as utterances which are to be identified are more rarely rejected by the voice-identifiers 30, 50. The information on homonyms makes it possible for the utterance-identifier, for example for an utterance "Meier", to find all the spellings present in the electronic telephone directory, such as, for example, "Meier", "Mayer", "Maier" and "Meyer", and to include them in the list of identification results. On the other hand, it is thereby possible for the alphabet-identifier to map frequently occurring and possibly incorrect spelling variants, such as, for example, "MULLER" or "MUELLER", to the correct spelt-out reference utterance even if only the spelling with "Ü" appears in the telephone directory. The on-line lexicon 70 which has been described therefore assists both the voice-identification and the voice synthesis.
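The two uses of homonym information just described — extending a recognised candidate with same-sounding names, and normalising spelling variants of the spelt-out input — can be sketched as two small lookup tables. The tables below are toy data modelled on the Meier and Mueller examples in the text; they are assumptions for illustration, not the installation's actual data.

```python
# Sketch of the two uses of homonym information. Tables are toy data
# modelled on the Meier/Mueller examples; not part of the source.

HOMONYMS = {
    "Meier": ["Meier", "Mayer", "Maier", "Meyer"],  # sound-alike surnames
}
SPELLING_VARIANTS = {
    "MULLER": "MÜLLER",    # umlaut dropped entirely
    "MUELLER": "MÜLLER",   # umlaut transliterated as "UE"
}

def expand_result(name):
    """Extend one recognised candidate with all same-sounding spellings,
    so every variant in the telephone directory can be offered to the user."""
    return HOMONYMS.get(name, [name])

def normalise_spelling(spelt):
    """Map a (possibly incorrect) spelt-out variant to the reference form
    actually stored in the telephone directory."""
    return SPELLING_VARIANTS.get(spelt, spelt)
```

Both lookups trade a slightly longer candidate list for fewer outright rejections, which is exactly the throughput gain the description attributes to the homonym data.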
The mode of operation of the voice-dialog installation is explained in greater detail in the following with reference to a name-identification. It may be assumed that the voice-dialog installation already knows the name of the place in which the person, whose telephone number a caller would like to find out, lives. For this purpose, the installation first asks the user of the telephone apparatus 10 to input the place name (for example Darmstadt) directly, that is, in a form not spelt out. Advantageously, the microcomputer 120 controls the installation in such a way that the place name is only fed to the keyword-spotter 50 in order to identify the utterance. As already mentioned, the keyword-spotter is able to tolerate additional utterances, such as "er" or "please", and extract merely the town name as information. The voice-dialog installation can also be developed in such a way that pre-selection of orthographically similar place names is effected by the alphabet-identifier 30 for the keyword-spotter 50 when an incorrect identification result, or no identification result at all, has been supplied by the keyword-spotter 50. After the place name has been identified, the voice-dialog installation makes available from the on-line lexicon 70 all the surnames which are stored in the electronic telephone directory 90 for this place name. It may further be assumed that the on-line lexicon 70 contains the spelling of all the proper names which are required for the spelling identification in the alphabet-identifier 30, a respective sequence of phonetic symbols for all the proper names which are required for the voice-identification in the keyword-spotter, and also a respective sequence of phonetic symbols including the intonation information required for the voice synthesis.
In addition, references to the corresponding entries in the on-line lexicon are contained in the electronic telephone directory 90 which contains the surnames of the subscribers with corresponding telephone numbers and addresses.
The caller is now guided through a dialog, during the course of which he finds out the desired telephone number by virtue of specifying the place name and the name of the subscriber.
The following voice dialog between the caller using the telephone apparatus 10 and the voice-dialog installation is explained in the flow chart according to Figure 2.
The caller is first asked verbally by the installation by way of the synthesizer 60 to input directly the desired name, for example "Meier". This input is subsequently temporarily stored in the voice memory 20. Even additional utterances, such as "er" and "please", are also recorded thereby in the voice memory 20. Subsequently, the caller is requested verbally by way of the synthesizer 60 to spell out the name previously directly input. Thereupon, the subscriber inputs the letter sequence M, E, I, E, R.
In conjunction with the orthographic information which is stored in the on-line lexicon 70, the alphabet-identifier 30 ascertains similarity and makes a pre-selection from the list of available surnames stored in the on-line lexicon 70 under the place name. On account of identification uncertainties, the alphabet-identifier 30 ascertains a plurality of candidates, for example "Neier", "Meier", "Meter", "Mieter", "Neter", "Nieter", "Meiter", "Meider", etc. This list of candidates is stored in the memory 40. The program-controlled microcomputer 120 causes the keyword-spotter 50 to read out the user utterance "Meier" previously temporarily stored in the voice memory 20 and to load the pre-selected candidates which are in the memory 40.
On the basis of ascertaining similarity, the keyword-spotter 50 compares the spoken name "Meier", which is directly input, with the list of candidates by using the phonetic information stored in the on-line lexicon 70. The keyword-spotter 50 supplies, for example, the names "Neier" and "Meier" as an identification result and stores them in the result memory 55. The voice-dialog installation, on account of the phonetic and intonation-related information stored in the on-line lexicon 70, knows how to pronounce and intonate the identification results which have been found.
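A toy version of this second-stage comparison can be written down directly. Here `difflib` string similarity stands in for the phonetic similarity measure applied by the keyword-spotter, which the description leaves unspecified; the candidate list is the example pre-selection from above.

```python
# Toy second stage: re-score the alphabet-identifier's candidate list against
# the directly spoken utterance. difflib.SequenceMatcher is an illustrative
# stand-in for the phonetic comparison; threshold chosen for this toy data.
import difflib

def spot_keyword(spoken, candidates, threshold=0.8):
    """Return candidates whose similarity to the spoken name reaches the
    threshold, best first -- analogous to the keyword-spotter's result list."""
    scored = [(difflib.SequenceMatcher(None, spoken.lower(), c.lower()).ratio(), c)
              for c in candidates]
    return [name for score, name in sorted(scored, reverse=True)
            if score >= threshold]

candidates = ["Neier", "Meier", "Meter", "Mieter", "Neter",
              "Nieter", "Meiter", "Meider"]
results = spot_keyword("Meier", candidates)
```

Because the comparison runs only over the pre-selected candidates, the second stage stays cheap even when the full directory holds tens of thousands of surnames.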
Thereupon, the names which have been found, in the present case the names "Neier" and "Meier", are successively transmitted by way of the synthesizer 60 to the telephone apparatus 10 of the caller. The caller can thereupon select the correct name. With this surname and the identified place name, a data bank inquiry of the electronic telephone directory 90 is then commenced. The names and addresses which are found are read out in a user-controlled manner, that is, the user can influence when the voice-output of the names and addresses which have been found is terminated and how often a list is read out or for which name additional information is to be output. In problem cases, it is possible for the caller to be connected through to an operator. As soon as the user of the voice-dialog installation indicates that the data output by way of the voice synthesizer 60 (first name, surname, street, street number) corresponds to the data of the person whose telephone number he is seeking, the microcomputer 120 causes the installation to read out the corresponding telephone number from the telephone directory 90 and inform the caller thereof verbally.
Owing to the lexicon-controlled identification of any utterances as a result of the combination of the alphabet-identifier 30 and the keyword-spotter 50, it is possible to process, at an acceptable identification rate, a considerably larger vocabulary than conventional installations which use only one voice-identifier.
The reason for this can be seen in the fact that the alphabet-identifier 30 makes a pre-selection of the words which are to be identified and only this comparatively small selection of words which come into question is fed to the keyword-spotter 50 for actual identification.

Claims (19)

1. Voice-dialog method for automated output of information, having the following steps: (a) intermittently loading orthographic-phonetic information for a plurality of predetermined utterances from a lexicon which is capable of operating on-line, with the information being available in real time; (b) verbally requesting the user to input an utterance; (c) temporarily storing the utterance which has been input; (d) verbally requesting the user to spell the utterance which has been input; (e) in response to the spelt-out utterance, identifying and selecting a plurality of the predetermined, spelt-out reference utterances with the aid of the stored orthographic information on the basis of ascertaining similarity; (f) feeding the selected utterances and the temporarily stored utterance to an utterance identifier; (g) identifying and selecting at least one utterance from the selected utterances on the basis of a similarity-comparison; and (h) sequentially outputting the utterances found in step (g) and the associated information in synthesized voice form.
2. Voice-dialog method according to claim 1, characterised in that step (h) is repeated until the user terminates the synthesized voice output of the utterances.
3. Voice-dialog method according to claim 1 or 2, characterised in that steps (e) and (g) are terminated at the end of a predetermined time span and the user is requested to re-input his utterance if no utterance has been identified.
4. Voice-dialog method according to claim 2 or 3, characterised in that the user identifies one of the synthesized utterances as coinciding with his utterance, and in that, in response to this, an inquiry of an electronic telephone directory is commenced, which directory is capable of operating in real time and from which directory all of the data records meeting the criterion of the identified utterance are read out and made available to the user to choose from, and in that, on the basis of a name and an address read out from the directory, the user can identify the data record whose telephone number is to be output by the installation.
5. Voice-dialog method according to one of the claims 1 to 4, characterised in that orthographic-phonetic information for predetermined utterances is loaded at predetermined instants from a lexicon which is capable of operating on-line.
6. Voice-dialog installation for carrying out the method according to one of the claims 1 to 5, comprising: a device for the input of an utterance by a user, at least one synthesizer for generating voice signals for the user, a voice-inputting device, an alphabet-identifier which can identify an utterance which is spelt out by the user and can select orthographically similar utterances from a plurality of predetermined spelt-out reference utterances, an utterance-identifier which compares the utterance input by the user with the utterances selected by the alphabet-identifier and on the basis of ascertaining similarity supplies at least one utterance for output to the user, and at least one lexicon which is capable of operating on-line and stores orthographic-phonetic information for the plurality of predetermined utterances which the alphabet-identifier, the utterance-identifier and the synthesizer can access in real time.
7. Voice-dialog installation according to claim 6, comprising a memory for temporary storage which temporarily stores the utterance input by the user, and a memory which receives the utterances pre-selected by the alphabet-identifier.
8. Voice-dialog installation according to claim 6 or 7, characterised in that the utterance-identifier operates in keyword-spotting mode.
9. Voice-dialog installation according to one of the claims 6 to 8, characterised in that the data which is stored in the lexicon is orthographic, phonetic and intonation-related information for the predetermined utterances.
10. Voice-dialog installation according to claim 9, characterised in that additional information on homonyms is stored in the lexicon.
11. Voice-dialog installation according to one of the claims 6 to 10, characterised in that the utterance input by the user can be a place name, a surname or a plurality of words joined together.
12. Voice-dialog installation according to one of the claims 6 to 11, characterised in that the lexicon is capable of operating on-line and includes means for the storage of a general vocabulary, place names and surnames.
13. Voice-dialog installation according to one of the claims 6 to 12, characterised in that it is controlled by a program-controlled microcomputer.
14. Voice-dialog installation according to one of the claims 6 to 13, characterised in that the utterance-identifier and the alphabet-identifier are developed in such a way that they can reject an utterance input by the user and/or at the end of a given time span can ask the user to re-input his utterance.
15. Apparatus for speaker-independent voice-identification, in particular for use in a voice-dialog installation according to one of the claims 6 to 14, having an alphabet-identifier which can identify an utterance spelt out by a user and can select several spelt-out reference utterances from a plurality of predetermined spelt-out reference utterances on the basis of ascertaining similarity, and having an utterance-identifier which, on the basis of ascertaining similarity, compares an utterance, which is input by the user and which corresponds to the spelt-out utterance, with the utterances which are pre-selected by the alphabet-identifier and supplies at least one output utterance as a result.
16. Apparatus for voice-identification according to claim 15, wherein the utterance-identifier operates in the keyword-spotting mode.
17. Apparatus for voice-identification according to claim 15 or 16, comprising a lexicon which stores orthographic and phonetic information on the plurality of predetermined utterances which the alphabet-identifier and the utterance-identifier can access in real time in order to ascertain utterances which sound alike or are orthographically similar.
18. A voice-dialog method, substantially as herein described with reference to Figure 2 of the accompanying drawings.
19. A voice-dialog installation, substantially as herein described with reference to, or as shown in, Figure 1 of the accompanying drawings.
GB9618308A 1995-08-31 1996-09-02 Voice-dialog system for automated output of information Expired - Fee Related GB2304957B (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
DE1995132114 DE19532114C2 (en) 1995-08-31 1995-08-31 Speech dialog system for the automated output of information

Publications (3)

Publication Number Publication Date
GB9618308D0 GB9618308D0 (en) 1996-10-16
GB2304957A true GB2304957A (en) 1997-03-26
GB2304957B GB2304957B (en) 1999-09-29

Family

ID=7770897

Family Applications (1)

Application Number Title Priority Date Filing Date
GB9618308A Expired - Fee Related GB2304957B (en) 1995-08-31 1996-09-02 Voice-dialog system for automated output of information

Country Status (3)

Country Link
DE (1) DE19532114C2 (en)
FR (1) FR2738382B1 (en)
GB (1) GB2304957B (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE19907341A1 (en) * 1999-02-20 2000-08-31 Lutz H Karolus Processing data as query information involves comparing original and alternative data files with data in connected database, outputting coinciding data to local data processing machine
DE19907759C2 (en) * 1999-02-23 2002-05-23 Infineon Technologies Ag Method and device for spelling recognition
JP2001117828A (en) * 1999-10-14 2001-04-27 Fujitsu Ltd Electronic device and storage medium
EP1226576A2 (en) * 1999-11-04 2002-07-31 Telefonaktiebolaget Lm Ericsson System and method of increasing the recognition rate of speech-input instructions in remote communication terminals
DE10207895B4 (en) * 2002-02-23 2005-11-03 Harman Becker Automotive Systems Gmbh Method for speech recognition and speech recognition system
AT5730U3 (en) * 2002-05-24 2003-08-25 Roland Moesl METHOD FOR FOGGING WEBSITES
TWI298592B (en) * 2005-11-18 2008-07-01 Primax Electronics Ltd Menu-browsing method and auxiliary-operating system of handheld electronic device

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0311414A2 (en) * 1987-10-08 1989-04-12 Nec Corporation Voice controlled dialer having memories for full-digit dialing for any users and abbreviated dialing for authorized users

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE3732849A1 (en) * 1987-09-29 1989-04-20 Siemens Ag SYSTEM ARCHITECTURE FOR AN ACOUSTIC HUMAN / MACHINE DIALOG SYSTEM
US5131045A (en) * 1990-05-10 1992-07-14 Roth Richard G Audio-augmented data keying
US5293451A (en) * 1990-10-23 1994-03-08 International Business Machines Corporation Method and apparatus for generating models of spoken words based on a small number of utterances
DE69232407T2 (en) * 1991-11-18 2002-09-12 Toshiba Kawasaki Kk Speech dialogue system to facilitate computer-human interaction
FR2690777A1 (en) * 1992-04-30 1993-11-05 Lorraine Laminage Control of automaton by voice recognition - uses spelling of word or part of word by the operator to aid voice recognition and returns word recognised before acting
AU5803394A (en) * 1992-12-17 1994-07-04 Bell Atlantic Network Services, Inc. Mechanized directory assistance

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6721702B2 (en) 1999-06-10 2004-04-13 Infineon Technologies Ag Speech recognition method and device
GB2353887A (en) * 1999-09-04 2001-03-07 Ibm Speech recognition system
GB2353887B (en) * 1999-09-04 2003-09-24 Ibm Speech recognition system
US6629071B1 (en) 1999-09-04 2003-09-30 International Business Machines Corporation Speech recognition system
US6687673B2 (en) * 1999-09-04 2004-02-03 International Business Machines Corporation Speech recognition system
GB2362746A (en) * 2000-05-23 2001-11-28 Vocalis Ltd Data recognition and retrieval
US7167545B2 (en) 2000-12-06 2007-01-23 Varetis Solutions Gmbh Method and device for automatically issuing information using a search engine
EP1693829A1 (en) * 2005-02-21 2006-08-23 Harman Becker Automotive Systems GmbH Voice-controlled data system
US9153233B2 (en) 2005-02-21 2015-10-06 Harman Becker Automotive Systems Gmbh Voice-controlled selection of media files utilizing phonetic data

Also Published As

Publication number Publication date
FR2738382A1 (en) 1997-03-07
FR2738382B1 (en) 1999-01-29
DE19532114C2 (en) 2001-07-26
GB2304957B (en) 1999-09-29
DE19532114A1 (en) 1997-03-06
GB9618308D0 (en) 1996-10-16

Similar Documents

Publication Publication Date Title
EP1049072B1 (en) Graphical user interface and method for modifying pronunciations in text-to-speech and speech recognition systems
US7529678B2 (en) Using a spoken utterance for disambiguation of spelling inputs into a speech recognition system
US8285537B2 (en) Recognition of proper nouns using native-language pronunciation
US6999931B2 (en) Spoken dialog system using a best-fit language model and best-fit grammar
EP1975923B1 (en) Multilingual non-native speech recognition
US6996528B2 (en) Method for efficient, safe and reliable data entry by voice under adverse conditions
US6975986B2 (en) Voice spelling in an audio-only interface
KR19990008459A (en) Improved Reliability Word Recognition Method and Word Recognizer
US20050033575A1 (en) Operating method for an automated language recognizer intended for the speaker-independent language recognition of words in different languages and automated language recognizer
US5995931A (en) Method for modeling and recognizing speech including word liaisons
JPH075891A (en) Method and device for voice interaction
US9286887B2 (en) Concise dynamic grammars using N-best selection
WO1994016437A1 (en) Speech recognition system
GB2304957A (en) Voice-dialog system for automated output of information
US7406408B1 (en) Method of recognizing phones in speech of any language
EP0949606B1 (en) Method and system for speech recognition based on phonetic transcriptions
US6119085A (en) Reconciling recognition and text to speech vocabularies
EP1213706B1 (en) Method for online adaptation of pronunciation dictionaries
US7430503B1 (en) Method of combining corpora to achieve consistency in phonetic labeling
EP0786132B1 (en) A method and device for preparing and using diphones for multilingual text-to-speech generating
JPH0743599B2 (en) Computer system for voice recognition
WO2002086863A1 (en) Speech recognition
WO2000036591A1 (en) Speech operated automatic inquiry system
JPH0361954B2 (en)
JP2005534968A (en) Deciding to read kanji

Legal Events

Date Code Title Description
PCNP Patent ceased through non-payment of renewal fee

Effective date: 20100902