WO1989003083A1 - System architecture for an acoustic human/machine dialog system - Google Patents
System architecture for an acoustic human/machine dialog system
- Publication number
- WO1989003083A1 WO1989003083A1 PCT/DE1988/000596 DE8800596W WO8903083A1 WO 1989003083 A1 WO1989003083 A1 WO 1989003083A1 DE 8800596 W DE8800596 W DE 8800596W WO 8903083 A1 WO8903083 A1 WO 8903083A1
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- module
- word
- architecture according
- word sequence
- analysis unit
- Prior art date
Links
- 230000006978 adaptation Effects 0.000 claims abstract description 12
- 238000000034 method Methods 0.000 claims abstract description 8
- 238000012545 processing Methods 0.000 claims abstract description 5
- 230000015572 biosynthetic process Effects 0.000 claims description 2
- 238000004883 computer application Methods 0.000 claims description 2
- 238000007619 statistical method Methods 0.000 claims description 2
- 238000003786 synthesis reaction Methods 0.000 claims description 2
- 238000012549 training Methods 0.000 claims description 2
- 238000010586 diagram Methods 0.000 description 3
- 238000004891 communication Methods 0.000 description 2
- 238000011161 development Methods 0.000 description 1
- 230000018109 developmental process Effects 0.000 description 1
- 238000013507 mapping Methods 0.000 description 1
- 238000011160 research Methods 0.000 description 1
- 238000012546 transfer Methods 0.000 description 1
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/1822—Parsing for meaning understanding
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/16—Sound input; Sound output
Definitions
- the present invention relates to a system architecture for an acoustic human / machine dialog system with a voice input device for voice input into the dialog system, a configuration system and an adaptation system, the voice input device generating an input voice signal.
- in a dialog system of the type mentioned at the beginning, communication takes place via human language.
- the dialog system translates the wishes of a user, expressed in human language, into the language of the machine.
- the machine is usually an EDP system on which an application with highly formalized input/output procedures of the machine language is implemented (see FIG. 1).
- the voice input/output can take place via a voice terminal with additional aids (image output, light pen, etc.) or via a telephone.
- conceivable computer applications include automatic information and advisory services such as train and flight information, automatic transaction services such as booking or ordering from a catalog, and office management services.
- Fig. 2 shows the basic structure of a human/machine dialog system, which consists of the configuration system, the adaptation system and the dialog system.
- the core of the system is the dialog system, which conducts the dialog between a user and an EDP application.
- the configuration system is used to adapt the dialog system to the respective EDP application.
- the application-specific vocabulary needed for the dialog is entered here together with its conceptual relationships (syntactic/semantic/pragmatic relationships).
- the task of the adaptation system is to adapt the dialog system to the voice characteristics of the respective user. This increases the recognition performance of the dialog system, which leads to smoother dialog operation.
- the present invention has for its object to provide a system architecture of the type mentioned, with the help of which a working and efficient human/machine dialog system can be implemented that directs a user's instructions, commands, questions, etc. to an EDP system, processes answers or queries from the EDP system and, where appropriate, passes them on to the user in the form of synthetic speech and/or a screen display.
- the object on which the present invention is based is achieved by a system architecture of the type mentioned at the outset and according to the preamble of patent claim 1, which is characterized according to the invention by the features specified in the characterizing part of patent claim 1.
- FIG. 1 shows the basic structure of a block diagram of an overall system to be implemented, as has already been discussed in the technical field.
- FIG. 2 shows, as also already explained, a block diagram of the human / machine dialog system to be provided according to FIG. 1 in more detail.
- FIG. 3 shows a block diagram of the system architecture according to the invention of the human / machine dialog system shown in FIG. 2.
- the architecture of the dialog system 30, as shown in FIG. 3, consists of a recognition module with the units “signal analysis” 31, “word sequence generation” 32 and “syntactic-semantic-pragmatic content analysis” 33, a dialog control unit 34 with adaptation to the EDP application, and a response generation unit 35.
- the speech signal of a user coming from a microphone is interpreted in the recognition module and brought into a content-oriented representation.
- the speech signal is first analyzed with regard to language-specific features.
- in the word sequence generation unit 32, the features are mapped onto word sequences using a phonetic word lexicon 322.
- this mapping is not unambiguous because of the limited acoustic signal analysis; this is taken into account by tracking possible word sequences (word sequence hypotheses) in parallel.
- the number of word sequence hypotheses can become very large. This effort can be reduced using a language model 323, in which the word sequences possible for the EDP application are stored, so that only "valid" word sequence hypotheses need to be considered.
- the check for valid word sequence hypotheses can also be carried out during content analysis, with the meaningful word sequences being filtered out of the word sequence hypotheses on the basis of linguistic rules.
- statistical methods are additionally used: the probability with which the acoustic features are mapped onto each word sequence is calculated, and the sequence with the highest probability is passed on to the dialog control unit 34 as the interpreted utterance of the user.
- the dialog control unit 34 decides whether the content of the utterance makes "sense" for the application or whether a further dialog with the user has to be conducted.
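The hypothesis selection described above, pruning word sequences the language model rules out and passing on the most probable remaining sequence, can be sketched as follows. This is a minimal illustration, not the patented implementation: the bigram language model, the words and all probabilities are invented for the example.

```python
from math import log

# Hypothetical bigram language model: log-probabilities of word pairs
# that are "valid" for the EDP application (illustrative values only).
BIGRAM_LOGPROB = {
    ("book", "flight"): log(0.6),
    ("book", "train"): log(0.3),
    ("cancel", "flight"): log(0.5),
}

def score_hypothesis(words, acoustic_logprob, lm=BIGRAM_LOGPROB):
    """Combine acoustic evidence with the language model; return None
    for word sequences the language model rules out entirely."""
    total = acoustic_logprob
    for pair in zip(words, words[1:]):
        if pair not in lm:
            return None  # invalid hypothesis, pruned early
        total += lm[pair]
    return total

def best_hypothesis(hypotheses):
    """Pick the valid word sequence with the highest combined score,
    i.e. the utterance passed on to the dialog control unit."""
    scored = [(score_hypothesis(w, a), w) for w, a in hypotheses]
    valid = [(s, w) for s, w in scored if s is not None]
    return max(valid)[1] if valid else None
```

For instance, an acoustically stronger but invalid hypothesis such as "book brain" is pruned by the language model, so the valid "book flight" is handed to the dialog control unit instead.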
- the content-oriented utterance of the dialog system is converted into a machine language that is understandable for the EDP application.
- feedback from the EDP application is brought back into the content-oriented representation of the dialog system 30, and an answer is generated from it.
- the answer is output either acoustically through speech synthesis or pictorially through an image terminal.
- the architecture of the dialog system 30 allows simple configuration for various types of EDP applications by restructuring the databases “phonetic lexicon”, “language model”, “linguistic rules” and “word lexicon” and by redesigning the adaptation to the input/output procedure.
- adaptation to the speaker characteristics of the user is carried out via the phonetic dictionary 321, into which the speaker-specific data is entered through user training.
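The user training step can be pictured as a phonetic dictionary that keeps a running average of the speaker's feature vectors per phoneme. This is a sketch under assumed data structures; the class name and the per-phoneme vector representation are illustrative, not taken from the patent.

```python
# Minimal sketch of speaker adaptation: each training sample nudges the
# stored reference vector for a phoneme toward the current user's voice.
class PhoneticDictionary:
    def __init__(self):
        self.reference = {}  # phoneme -> reference feature vector
        self.counts = {}     # phoneme -> number of training samples seen

    def train(self, phoneme, features):
        """Incrementally average speaker-specific feature vectors."""
        if phoneme not in self.reference:
            self.reference[phoneme] = list(features)
            self.counts[phoneme] = 1
            return
        n = self.counts[phoneme] + 1
        # running mean: old + (new - old) / n, component by component
        self.reference[phoneme] = [
            old + (new - old) / n
            for old, new in zip(self.reference[phoneme], features)
        ]
        self.counts[phoneme] = n
```

After a few training utterances, the stored vectors reflect the individual speaker, which is what raises the recognition performance described above.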
- the architecture according to the invention is also suitable for real-time implementation. To provide the high computing power required, the various modules can be implemented as separate processing units, so that several modules can operate in parallel.
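The module-per-processing-unit idea can be illustrated with a thread-per-stage pipeline connected by queues, so that successive utterances flow through signal analysis, word sequence generation and content analysis concurrently. The stage functions below are placeholders; only the wiring pattern is the point, not the patent's actual hardware partitioning.

```python
import queue
import threading

def stage(fn, inbox, outbox):
    """Run one module as its own processing unit: consume items,
    apply the module's function, pass results downstream."""
    while True:
        item = inbox.get()
        if item is None:        # sentinel: shut the stage down
            outbox.put(None)
            return
        outbox.put(fn(item))

def run_pipeline(samples, stages):
    """Wire the stage functions together with queues and run them in parallel."""
    queues = [queue.Queue() for _ in range(len(stages) + 1)]
    threads = [
        threading.Thread(target=stage, args=(fn, q_in, q_out), daemon=True)
        for fn, q_in, q_out in zip(stages, queues, queues[1:])
    ]
    for t in threads:
        t.start()
    for s in samples:
        queues[0].put(s)
    queues[0].put(None)
    results = []
    while True:
        r = queues[-1].get()
        if r is None:
            break
        results.append(r)
    return results
```

While one stage works on utterance n, the previous stage can already process utterance n+1, which is the overlap that makes real-time operation feasible.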
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Theoretical Computer Science (AREA)
- Computational Linguistics (AREA)
- Artificial Intelligence (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- General Health & Medical Sciences (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Machine Translation (AREA)
Abstract
The system architecture described comprises a voice input device for voice input into the dialog system, a configuration system and an adaptation system. The architecture essentially comprises a signal analysis unit (31), which forms the input device of the dialog system (30) and into which the input voice signal is fed, and a word sequence generation unit (32) connected downstream of it (31). A phonetic dictionary module (321), a phonetic word lexicon module (322) and a language model module (323) are assigned to this word sequence generation unit (32). The architecture also comprises a content analysis unit (33), connected downstream of the unit (32), for syntactic, semantic and pragmatic content analysis; a module for syntactic, semantic and pragmatic rules (331) and a linguistic lexicon module (332) are assigned to this content analysis unit (33). Downstream of this unit (33), the architecture further has a dialog control unit (34), to which a module (341) for adaptation to an input/output procedure for computer applications is assigned, as well as a response generation unit (35), to which a phonetic linguistic lexicon module (351) is assigned, for producing a synthetic voice signal and a video signal.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
DEP3732849.2 | 1987-09-29 | ||
DE19873732849 DE3732849A1 (de) | 1987-09-29 | 1987-09-29 | System-architektur fuer ein akustisches mensch/maschine-dialogsystem |
Publications (1)
Publication Number | Publication Date |
---|---|
WO1989003083A1 true WO1989003083A1 (fr) | 1989-04-06 |
Family
ID=6337149
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/DE1988/000596 WO1989003083A1 (fr) | 1987-09-29 | 1988-09-27 | Architecture de systemes pour systeme acoustique conversationnel homme/machine |
Country Status (2)
Country | Link |
---|---|
DE (1) | DE3732849A1 (fr) |
WO (1) | WO1989003083A1 (fr) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
FR2696574A1 (fr) * | 1992-10-06 | 1994-04-08 | Sextant Avionique | Procédé et dispositif d'analyse d'un message fourni par des moyens d'interaction à un système de dialogue homme-machine. |
WO1997043707A1 (fr) * | 1996-05-13 | 1997-11-20 | Telia Ab | Ameliorations relatives a la conversion voix-voix |
US6834280B2 (en) | 2000-02-07 | 2004-12-21 | Josiah Lee Auspitz | Systems and methods for determining semiotic similarity between queries and database entries |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
DE4430164C2 (de) * | 1994-08-25 | 1998-04-23 | Uthe Friedrich Wilhelm | Verwendung eines interaktiven Informationssystems |
DE19532114C2 (de) * | 1995-08-31 | 2001-07-26 | Deutsche Telekom Ag | Sprachdialog-System zur automatisierten Ausgabe von Informationen |
DE19756641C2 (de) * | 1997-12-19 | 2001-02-22 | Sucker Mueller Hacoba Gmbh | Hilfsmittel beim Bestücken eines Spulengatters und Verfahren zum Bestücken eines Spulengatters |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0242743A1 (fr) * | 1986-04-25 | 1987-10-28 | Texas Instruments Incorporated | Système de reconnaissance de la parole |
-
1987
- 1987-09-29 DE DE19873732849 patent/DE3732849A1/de not_active Withdrawn
-
1988
- 1988-09-27 WO PCT/DE1988/000596 patent/WO1989003083A1/fr unknown
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0242743A1 (fr) * | 1986-04-25 | 1987-10-28 | Texas Instruments Incorporated | Système de reconnaissance de la parole |
Non-Patent Citations (3)
Title |
---|
Data Report, Vol. VIII, No. 3, September 1980, (München, DE), D. Klugman: "Speaking with the Computer", pages 4-9 * |
Proceedings: ICASSP 86, IEEE-IECEJ-ASJ International Conference on Acoustics, Speech, and Signal Processing, 7-11 April 1986, Tokyo, JP, Vol. 3, IEEE, (New York, US), M. Shigenaga et al.: "A speech recognition system of continuously spoken Japanese sentences and an application to a speech input device", pages 1577-1580 * |
Proceedings: ICASSP 87, 1987 International Conference on Acoustics, Speech, and Signal Processing, 6-9 April 1987, Dallas, Texas, Vol. 1, IEEE, (New York, US), P. Alinat et al.: "A continuous speech dialog system for the oral control of a sonar console", pages 368-371 * |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
FR2696574A1 (fr) * | 1992-10-06 | 1994-04-08 | Sextant Avionique | Procédé et dispositif d'analyse d'un message fourni par des moyens d'interaction à un système de dialogue homme-machine. |
EP0592280A1 (fr) * | 1992-10-06 | 1994-04-13 | Sextant Avionique | Procédé et dispositif d'analyse d'un message fourni par des moyens d'interaction à un système de dialoque homme-machine |
WO1997043707A1 (fr) * | 1996-05-13 | 1997-11-20 | Telia Ab | Ameliorations relatives a la conversion voix-voix |
US6834280B2 (en) | 2000-02-07 | 2004-12-21 | Josiah Lee Auspitz | Systems and methods for determining semiotic similarity between queries and database entries |
Also Published As
Publication number | Publication date |
---|---|
DE3732849A1 (de) | 1989-04-20 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
DE60201262T2 (de) | Hierarchische sprachmodelle | |
EP0802522B1 (fr) | Appareil et procédé pour déterminer une action, et utilisation de l'appareil et du procédé | |
DE69814589T2 (de) | Spracherkennung unter verwendung mehrerer spracherkenner | |
EP1184839B1 (fr) | Conversion graphème-phonème | |
DE69607601T2 (de) | System und verfahren zur spracherkennung mit automatischer erzeugung einer syntax | |
DE69834553T2 (de) | Erweiterbares spracherkennungssystem mit einer audio-rückkopplung | |
DE69622565T2 (de) | Verfahren und vorrichtung zur dynamischen anpassung eines spracherkennungssystems mit grossem wortschatz und zur verwendung von einschränkungen aus einer datenbank in einem spracherkennungssystem mit grossem wortschatz | |
DE69413052T2 (de) | Sprachsynthese | |
DE69712216T2 (de) | Verfahren und gerät zum übersetzen von einer sparche in eine andere | |
DE69828141T2 (de) | Verfahren und Vorrichtung zur Spracherkennung | |
EP0925578B1 (fr) | Systeme et procede de traitement de la parole | |
EP0702353B1 (fr) | Système et procédé de reproduction synthétique de parole en réponse aux signaux de parole fournis | |
DE60313706T2 (de) | Spracherkennungs- und -antwortsystem, Spracherkennungs- und -antwortprogramm und zugehöriges Aufzeichnungsmedium | |
DE19847419A1 (de) | Verfahren zur automatischen Erkennung einer buchstabierten sprachlichen Äußerung | |
EP1273003B1 (fr) | Procede et dispositif de determination de marquages prosodiques | |
DE102006006069A1 (de) | Verteiltes Sprachverarbeitungssystem und Verfahren zur Ausgabe eines Zwischensignals davon | |
WO2001018792A1 (fr) | Procede d'apprentissage des graphemes d'apres des regles de phonemes pour la synthese vocale | |
EP0987682B1 (fr) | Procédé d'adaptation des modèles de language pour la reconnaissance de la parole | |
DE19837102A1 (de) | Verfahren und Anordnung zum Durchführen einer Datenbankanfrage | |
EP1182646A2 (fr) | Méthode de classification des phonèmes | |
DE69326900T2 (de) | Spracherkennungssystem | |
DE3853702T2 (de) | Spracherkennung. | |
EP1187440A2 (fr) | Système de dialogue oral | |
DE19532114C2 (de) | Sprachdialog-System zur automatisierten Ausgabe von Informationen | |
EP1282897A1 (fr) | Procede pour produire une banque de donnees vocales pour un lexique cible pour l'apprentissage d'un systeme de reconnaissance vocale |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AK | Designated states |
Kind code of ref document: A1 Designated state(s): JP US |
|
AL | Designated countries for regional patents |
Kind code of ref document: A1 Designated state(s): AT BE CH DE FR GB IT LU NL SE |