WO1989003083A1 - System architecture for an acoustic human/machine dialog system - Google Patents

System architecture for an acoustic human/machine dialog system

Info

Publication number
WO1989003083A1
Authority
WO
WIPO (PCT)
Prior art keywords
module
word
architecture according
word sequence
analysis unit
Prior art date
Application number
PCT/DE1988/000596
Other languages
German (de)
English (en)
Inventor
Lothar Glasser
Harald Höge
Erwin Marschall
Gerhard Niedermair
Montserrat Meya-Llopart
Jorge Romano-Rodriguez
Robert J. Sommer
Otto Schmidbauer
Gregor Thurmair
Hendrich Bunt
Jan B. Van Hemert
Kees Van Deemter
Dieter Mergel
Hermann Ney
Andreas Noll
John H. M. De Vet
Original Assignee
Siemens Aktiengesellschaft
N.V. Philips' Gloeilampenfabrieken
Priority date
Filing date
Publication date
Application filed by Siemens Aktiengesellschaft and N.V. Philips' Gloeilampenfabrieken
Publication of WO1989003083A1 publication Critical patent/WO1989003083A1/fr

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/18 Speech classification or search using natural language modelling
    • G10L15/1822 Parsing for meaning understanding
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/16 Sound input; Sound output

Definitions

  • The present invention relates to a system architecture for an acoustic human/machine dialog system with a voice input device for voice input into the dialog system, a configuration system and an adaptation system, the voice input device generating an input voice signal.
  • In a dialog system of the type mentioned at the beginning, communication takes place via human language.
  • The dialog system translates the user's wishes, expressed in human language, into the language of the machine.
  • The machine is usually an EDP system on which an application with highly formalized input/output procedures of the machine language is implemented (see FIG. 1).
  • The voice input/output can take place via a voice terminal with additional aids (image output, light pen, etc.) or via a telephone.
  • Computer applications such as automatic information and advisory services (e.g., train and flight information), automatic transaction services such as booking or ordering from a catalog, or office management services are conceivable.
  • Fig. 2 shows the basic structure of a human/machine dialog system, which consists of three systems: the configuration system, the adaptation system and the dialog system.
  • The core is the dialog system, which conducts the dialog between a user and an EDP application.
  • The configuration system is used to adapt the dialog system to the respective EDP application.
  • The application-specific vocabulary needed for the dialog is entered here together with its conceptual relationships (syntactic/semantic/pragmatic relationships).
  • The task of the adaptation system is to adapt the dialog system to the voice characteristics of the respective user. This increases the recognition performance of the dialog system, which leads to smoother dialog operation.
  • The object of the present invention is to provide a system architecture of the type mentioned, with the help of which a working and efficient human/machine dialog system can be implemented that can direct instructions, commands, questions, etc. to an EDP system, process answers or queries from the EDP system and, in some cases, pass them on to the user in the form of synthetic speech and/or a screen display.
  • The object on which the present invention is based is achieved by a system architecture of the type mentioned at the outset and according to the preamble of patent claim 1, which is characterized according to the invention by the features specified in the characterizing part of patent claim 1.
  • FIG. 1 shows the basic structure of a block diagram of an overall system to be implemented, as has already been discussed in the technical field.
  • FIG. 2 shows, as also already explained, a block diagram of the human / machine dialog system to be provided according to FIG. 1 in more detail.
  • FIG. 3 shows a block diagram of the system architecture according to the invention of the human / machine dialog system shown in FIG. 2.
  • The architecture of the dialog system 30, as shown in FIG. 3, consists of a recognition module with the units “signal analysis” 31, “word sequence generation” 32 and “syntactic-semantic-pragmatic content analysis” 33, a dialog control unit 34 with adaptation to the EDP application, and a response generation unit 35.
  • The speech signal of a user, coming from a microphone, is interpreted in the recognition module and brought into a content-oriented representation.
  • The speech signal is first analyzed with regard to language-specific features.
  • In the word sequence generation unit, these features are mapped onto word sequences using a phonetic word lexicon 322.
  • Because of the limited acoustic signal analysis, this mapping is not unambiguous; this is taken into account by tracking possible word sequences (word sequence hypotheses) in parallel.
  • The number of word sequence hypotheses can become very large. This effort can be reduced using a language model 323, in which the word sequences possible for the EDP application are stored, so that only "valid" word sequence hypotheses need to be considered.
  • The check for valid word sequence hypotheses can also be carried out during content analysis, with the meaningful word sequences being filtered out of the word sequence hypotheses on the basis of linguistic rules.
  • Statistical methods are additionally used: the probability with which the acoustic features are mapped onto a word sequence is calculated, and the sequence with the highest probability is passed on to the dialog control unit 34 as the interpreted utterance of the user.
  • The dialog control unit 34 decides whether the content of the utterance makes "sense" for the application or whether a further dialog with the user has to be conducted.
  • The content-oriented utterance of the dialog system is converted into a machine language that the EDP application understands.
  • Feedback from the EDP application is brought back into the content-oriented representation of the dialog system 30, and an answer to it is generated.
  • The answer is output either acoustically through speech synthesis or pictorially on an image terminal.
  • The architecture of the dialog system 30 allows simple configuration for various types of EDP applications by restructuring the databases “phonetic lexicon”, “language model”, “linguistic rules” and “word lexicon” and by redesigning the adaptation to the I/O procedure.
  • The adaptation to the speaker characteristics of the user is carried out via the phonetic lexicon 321, into which the speaker-specific data are entered through user training.
  • The architecture according to the invention is also suitable for real-time implementation. Because of the high computing power required, the various modules can be implemented as separate processing units, so that several modules can operate in parallel.
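The hypothesis handling described above (word sequence hypotheses from the word sequence generation unit, pruned by the language model, with the most probable sequence handed to the dialog control unit) can be sketched as follows. This is a minimal illustration only: the candidate words, probabilities and permitted word pairs are invented, and a real system would score acoustic features rather than toy numbers.

```python
from itertools import product

# Word candidates per time slot, as the word sequence generation
# unit (32) might propose them, each with an invented probability
# for the acoustic mapping (toy data, not from the patent).
candidates = [
    [("show", 0.6), ("snow", 0.4)],
    [("flights", 0.7), ("lights", 0.3)],
    [("today", 0.9), ("to", 0.1)],
]

# Language model (323): word successions valid for the EDP
# application; anything else is discarded as an "invalid" hypothesis.
valid_bigrams = {
    ("show", "flights"), ("flights", "today"),
    ("snow", "lights"), ("lights", "to"),
}

def hypotheses(candidates, valid_bigrams):
    """Enumerate word sequence hypotheses, keeping only those whose
    consecutive word pairs the language model allows."""
    for combo in product(*candidates):
        words = [w for w, _ in combo]
        if all(pair in valid_bigrams for pair in zip(words, words[1:])):
            prob = 1.0
            for _, p in combo:
                prob *= p
            yield words, prob

def best_utterance(candidates, valid_bigrams):
    """Select the most probable hypothesis; this plays the role of the
    'interpreted utterance' passed on to the dialog control unit."""
    return max(hypotheses(candidates, valid_bigrams), key=lambda h: h[1])

words, prob = best_utterance(candidates, valid_bigrams)
print(words, prob)  # best hypothesis: show flights today (p = 0.378)
```

The argmax over hypothesis probabilities stands in for the statistical selection described above; a practical recognizer would combine acoustic and language-model scores incrementally, e.g. with a Viterbi-style search, instead of enumerating all combinations.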

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

The system architecture described comprises a voice input device for voice input into the dialog system, a configuration system and an adaptation system. The architecture essentially comprises a signal analysis unit (31), which forms the input device of the dialog system (30) and into which the input voice signal is fed, and a word sequence generation unit (32) connected downstream of this unit (31). A phonetic dictionary module (321), a phonetic word lexicon module (322) and a language model module (323) are assigned to the word sequence generation unit (32). The architecture also comprises a content analysis unit (33), connected downstream of the unit (32), for syntactic, semantic and pragmatic content analysis; a module for syntactic, semantic and pragmatic rules (331) and a linguistic lexicon module (332) are assigned to this content analysis unit (33). Downstream of the latter unit (33), the architecture further has a dialog control unit (34), to which a module (341) for adaptation to an input/output procedure for EDP applications is assigned, as well as a response generation unit (35), to which a linguistic-phonetic lexicon module (351) is assigned, in order to produce a synthetic voice signal and a video signal.
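The chain of units in the abstract (31 to 32 to 33 to 34 to 35) can be read as a simple downstream dataflow. The sketch below only mirrors that wiring; every function body is an invented placeholder for illustration, not the patent's method.

```python
# Minimal dataflow sketch of the unit chain from the abstract.
# All function bodies are hypothetical stand-ins (toy code).

def signal_analysis(speech_signal):          # unit 31
    """Extract language-specific features from the input signal."""
    return [s.lower() for s in speech_signal]  # placeholder "features"

def word_sequence_generation(features):      # unit 32, modules 321/322/323
    """Map features onto word sequence hypotheses."""
    return [features]                          # single toy hypothesis

def content_analysis(hypotheses):            # unit 33, modules 331/332
    """Filter hypotheses with syntactic/semantic/pragmatic rules."""
    return hypotheses[0]

def dialog_control(utterance):               # unit 34, module 341
    """Convert the content-oriented utterance for the EDP application."""
    return {"query": " ".join(utterance)}

def response_generation(feedback):           # unit 35, module 351
    """Render the application's answer as speech and/or image output."""
    return f"ANSWER: {feedback['query']}"

def dialog_system(speech_signal):
    """Chain the units in the downstream order given in the abstract."""
    return response_generation(
        dialog_control(
            content_analysis(
                word_sequence_generation(
                    signal_analysis(speech_signal)))))

print(dialog_system(["Show", "Flights"]))
```

Modeling each unit as a separate function matches the description's point that the modules can be implemented as separate processing units and run in parallel.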
PCT/DE1988/000596 1987-09-29 1988-09-27 System architecture for an acoustic human/machine dialog system WO1989003083A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
DEP3732849.2 1987-09-29
DE19873732849 DE3732849A1 (de) 1987-09-29 1987-09-29 System architecture for an acoustic human/machine dialog system

Publications (1)

Publication Number Publication Date
WO1989003083A1 (fr) 1989-04-06

Family

ID=6337149

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/DE1988/000596 WO1989003083A1 (fr) 1987-09-29 1988-09-27 System architecture for an acoustic human/machine dialog system

Country Status (2)

Country Link
DE (1) DE3732849A1 (fr)
WO (1) WO1989003083A1 (fr)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
FR2696574A1 (fr) * 1992-10-06 1994-04-08 Sextant Avionique Method and device for analyzing a message supplied by interaction means to a human/machine dialog system
EP0592280A1 (fr) * 1992-10-06 1994-04-13 Sextant Avionique Method and device for analyzing a message supplied by interaction means to a human/machine dialog system
WO1997043707A1 (fr) * 1996-05-13 1997-11-20 Telia Ab Improvements in, or relating to, speech-to-speech conversion
US6834280B2 (en) 2000-02-07 2004-12-21 Josiah Lee Auspitz Systems and methods for determining semiotic similarity between queries and database entries

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE4430164C2 (de) * 1994-08-25 1998-04-23 Uthe Friedrich Wilhelm Use of an interactive information system
DE19532114C2 (de) * 1995-08-31 2001-07-26 Deutsche Telekom Ag Speech dialog system for the automated output of information
DE19756641C2 (de) * 1997-12-19 2001-02-22 Sucker Mueller Hacoba Gmbh Aid for loading a bobbin creel and method for loading a bobbin creel

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0242743A1 (fr) * 1986-04-25 1987-10-28 Texas Instruments Incorporated Speech recognition system

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Data Report, Vol. VIII, No. 3, September 1980, (München, DE), D. Klugman: "Speaking with the Computer", pages 4-9 *
Proceedings: ICASSP 86, IEEE-IECEJ-ASJ International Conference on Acoustics, Speech, and Signal Processing, April 7-11, 1986, Tokyo, JP, Vol. 3, IEEE, (New York, US), M. Shigenaga et al.: "A speech recognition system of continuously spoken Japanese sentences and an application to a speech input device", pages 1577-1580 *
Proceedings: ICASSP 87, 1987 International Conference on Acoustics, Speech, and Signal Processing, April 6-9, 1987, Dallas, Texas, Vol. 1, IEEE, (New York, US), P. Alinat et al.: "A continuous speech dialog system for the oral control of a sonar console", pages 368-371 *

Also Published As

Publication number Publication date
DE3732849A1 (de) 1989-04-20

Similar Documents

Publication Publication Date Title
DE60201262T2 Hierarchical language models
EP0802522B1 Apparatus and method for determining an action, and use of the apparatus and method
DE69814589T2 Speech recognition using multiple speech recognizers
EP1184839B1 Grapheme-phoneme conversion
DE69607601T2 System and method for speech recognition with automatic generation of a syntax
DE69834553T2 Extensible speech recognition system with audio feedback
DE69622565T2 Method and apparatus for dynamically adapting a large-vocabulary speech recognition system and for using constraints from a database in a large-vocabulary speech recognition system
DE69413052T2 Speech synthesis
DE69712216T2 Method and device for translating one language into another
DE69828141T2 Method and device for speech recognition
EP0925578B1 Speech-processing system and method
EP0702353B1 System and method for synthetically reproducing speech in response to supplied speech signals
DE60313706T2 Speech recognition and response system, speech recognition and response program, and associated recording medium
DE19847419A1 Method for the automatic recognition of a spelled spoken utterance
EP1273003B1 Method and device for determining prosodic markers
DE102006006069A1 Distributed speech processing system and method for outputting an intermediate signal therefrom
WO2001018792A1 Method for training graphemes according to phoneme rules for speech synthesis
EP0987682B1 Method for adapting language models for speech recognition
DE19837102A1 Method and arrangement for carrying out a database query
EP1182646A2 Phoneme classification method
DE69326900T2 Speech recognition system
DE3853702T2 Speech recognition
EP1187440A2 Spoken dialog system
DE19532114C2 Speech dialog system for the automated output of information
EP1282897A1 Method for producing a speech database for a target vocabulary for training a speech recognition system

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): JP US

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): AT BE CH DE FR GB IT LU NL SE