WO2009106031A1 - Procédé pour faire fonctionner un système d'assistance électronique - Google Patents

Procédé pour faire fonctionner un système d'assistance électronique

Info

Publication number
WO2009106031A1
WO2009106031A1 · PCT/DE2009/000156 · DE2009000156W
Authority
WO
WIPO (PCT)
Prior art keywords
phoneme
data
context
database
processing stage
Prior art date
Application number
PCT/DE2009/000156
Other languages
German (de)
English (en)
Inventor
Mathias Mühlfelder
Original Assignee
Navigon Ag
Priority date
Filing date
Publication date
Application filed by Navigon Ag filed Critical Navigon Ag
Publication of WO2009106031A1 publication Critical patent/WO2009106031A1/fr

Classifications

    • G — PHYSICS
    • G10 — MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L — SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 — Speech recognition
    • G10L15/08 — Speech classification or search
    • G10L15/18 — Speech classification or search using natural language modelling
    • G10L15/1815 — Semantic context, e.g. disambiguation of the recognition hypotheses based on word meaning
    • G — PHYSICS
    • G10 — MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L — SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 — Speech recognition
    • G10L15/22 — Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223 — Execution procedure of a spoken command

Definitions

  • The invention relates to a method for operating an electronic assistance system with a voice recognition module according to the preamble of claim 1.
  • Keyboards are typically used, with the user entering the input data in alphanumeric form.
  • Assistance systems with speech recognition modules have also been in widespread use for some years.
  • In these systems the input interface is equipped with voice recognition: the user generally speaks his operator instructions, i.e. his input data, into a recording device, for example a microphone of the assistance system, where the instruction is recorded.
  • In the speech recognition module, the spoken instructions are then further processed and interpreted in order to recognize the content of the spoken operator instruction and to be able to process it further in electronic form.
  • Known speech recognition modules of electronic assistance systems are limited to comparing, in a first processing stage, the spoken operator instruction recorded with the recording device against phoneme data records stored in a database.
  • The phoneme data records may be, for example, acoustic files, in particular WAV files, or phoneme vectors.
  • The phoneme record that yielded the highest acoustic match in the first processing stage, together with the content stored in it, is always the one selected for further processing.
  • The method according to the invention is based on the basic idea that the speech recognition module is extended with a second processing stage.
  • In this second processing stage, the contents stored in at least some of the phoneme data records are compared with the context data stored in a context database.
  • The match score determined in the first processing stage, which by itself characterizes only the acoustic match between the spoken operator instruction and the phoneme record, may then be modified depending on the context comparison performed in the second processing stage. At least the phoneme record with the best modified match score is then passed to the other parts of the assistance system for further processing.
  • The processing of the phoneme data records in the second processing stage, i.e. the content context comparison, represents a considerable additional data-processing effort.
  • It is thus achieved that all phoneme data records that did not produce sufficiently good results in the acoustic comparison of the first processing stage are filtered out before the context comparison in the second processing stage.
  • The phoneme data record with the best modified match value is forwarded for further processing.
  • The content stored in the phoneme record can then be automatically selected for further processing and processed in downstream function modules.
  • This may be done, for example, by displaying to the user the contents of the phoneme records with the highest modified match values; the user then confirms one of the phoneme records by an appropriate selection.
  • The phoneme records with the highest modified match values are sorted in a list.
  • The phoneme records in the list can be sorted according to the size of their respective modified match value. In other words, the phoneme record with the best modified match value is placed at the first position of the list, and the remaining phoneme records follow in order of their modified match values.
  • The size of the list can be defined by a certain number of phoneme records to be included in it. For example, if the list contains five slots, it will include the five phoneme records with the five highest modified match values.
  • The way in which the context comparison is carried out in the second processing stage is fundamentally arbitrary.
  • Earlier input data that the user has confirmed for use can be stored in the context database. This is based on the consideration that input data confirmed by a user in the past will, with relatively high probability, be entered by the same user again.
  • The content of the phoneme data records is then compared against these entries in the second processing stage. For the phoneme records whose content matches the previous input data stored in the context database, the match score is increased, making the selection of those phoneme records more likely.
  • A context database may contain user-specific address data.
  • This may be, for example, the user's electronic address book.
  • All addresses stored in the user's electronic address book have a correspondingly high probability of being destination points for the navigation system.
  • A context database may contain the starting points or destination points that have already formed the basis of route planning in the past. Certain starting or destination points are approached by the user again and again and are therefore to be regarded as particularly probable hits in later route planning as well.
  • A context database may contain data describing the significance of cities.
  • This may, for example, be the population and/or the area of a city. The selection of a city with a large population or a large urban area is much more likely than, for example, the selection of a small village.
  • The electronic assistance system may also be designed as a media player, in particular as an MP3 player. Here too, the user often has to enter his input data with very little input comfort, so improving input comfort by means of voice inputs with a high hit probability is of great importance.
  • The context database may preferably include data on preferred pieces of music and/or data on the user-specific rating of pieces of music and/or data on the time at which music was stored. Pieces of music that the user prefers (stored, for example, in favorite lists), that have received a high user-specific rating, or that were stored on the media player only recently have a significantly higher hit probability than other pieces of music.
  • The method according to the invention can also be installed on ticket machines. Here too, a variety of input data must be entered by the user, who is moreover often completely untrained.
  • The contents of the phoneme records can then be compared with data on preferred destination stations, data on nearby destination stations, or data on the size or significance of destination stations.
  • FIG. 1 shows a sorted result list with the contents of several phoneme data records;
  • FIG. 2 shows the result list according to FIG. 1 after passing through the second processing stage.
  • FIG. 1 shows a list 01 in whose first column five place names are entered. These place names are the contents of phoneme data records that were recognized as possible hits in a first processing stage, by acoustic comparison in the voice recognition of a navigation system. According to the match score of the acoustic comparison, the place "Würzbach" was identified as the most probable hit and therefore given prioritization 1. The hit "Mühlburg", on the other hand, has the lowest acoustic match score and thus receives the worst prioritization, namely 5.
  • FIG. 2 shows the list 01a after the recognized locations have undergone a content context comparison in a second processing stage.
  • In this content context comparison it was found that the user has already very often used the place "Würzburg" as the destination of his route planning.
  • The hit "Würzburg" is therefore modified with a higher match value and now receives the highest prioritization, 1.
  • The other hits of the list are likewise subjected to the content context comparison and their respective match values are modified, so that after this modification "Mühlburg" receives prioritization 3 instead of 5, and "Würzbach" receives prioritization 4 instead of 1.
  • The locations of the list 01a are then passed on for further processing and can be displayed to the user of the navigation system, in the corresponding order, as possible destinations.
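The two-stage selection described in the figures can be sketched in a few lines. This is a minimal illustration, not the claimed implementation: the function name, the 0–1 score scale, the additive context boost, and the filler place names and scores are all assumptions; only the idea of modifying first-stage acoustic scores with a context database and re-sorting comes from the description above.

```python
# Sketch of the two-stage selection: stage 1 yields acoustic match scores,
# stage 2 raises the scores of contents found in a context database.

def rescore(acoustic_hits, context_destinations, boost=0.3, top_n=5):
    """acoustic_hits: dict mapping recognized content -> acoustic score in [0, 1].
    context_destinations: previously confirmed destinations (the context database).
    Returns up to top_n contents sorted by modified match value, best first."""
    modified = {}
    for content, score in acoustic_hits.items():
        if content in context_destinations:
            # Second processing stage: content matches the context database,
            # so the match value is increased (capped at 1.0 here).
            score = min(1.0, score + boost)
        modified[content] = score
    return sorted(modified, key=modified.get, reverse=True)[:top_n]

# Illustrative scores loosely modeled on FIG. 1/FIG. 2 (values invented):
hits = {"Würzbach": 0.92, "Würzburg": 0.88, "Wülzburg": 0.85,
        "Mühlberg": 0.80, "Mühlburg": 0.78}
print(rescore(hits, {"Würzburg"}))   # "Würzburg" moves to the first position
```

With an empty context database the purely acoustic order is preserved, which mirrors the prior-art behavior the description contrasts against.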

Landscapes

  • Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Navigation (AREA)

Abstract

The invention relates to a method for operating an electronic assistance system comprising an acoustic recording device with which spoken user instructions can be recorded and then processed by means of a voice recognition module. In a first processing stage of the voice recognition module, the spoken instructions are compared with phoneme data records stored in a database, the phoneme records being evaluated according to their respective degree of acoustic match and assigned a match value. In a second processing stage of the voice recognition module, the contents of at least some of the phoneme records are compared with the context data stored in at least one context database, the match value of the phoneme records determined in the first processing stage being modified according to the result of the content context comparison. At least the phoneme record with the best modified match value is forwarded for further processing.
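The abstract leaves open how the context database weights its entries. One conceivable combination of two context sources named in the description, previously confirmed destinations and city significance (population), is sketched below; the multiplicative weights, the threshold, and the population figures for the smaller places are invented for illustration.

```python
# Hypothetical context weighting combining two context sources from the
# description: earlier confirmed destinations and city significance.

POPULATION = {"Würzburg": 127_000, "Würzbach": 2_500, "Mühlburg": 1_200}

def context_weight(content, previous_destinations, population=POPULATION):
    """Return a multiplicative weight >= 1.0 for contents supported by context."""
    weight = 1.0
    if content in previous_destinations:
        weight *= 1.5   # destination confirmed earlier by the same user
    if population.get(content, 0) > 100_000:
        weight *= 1.2   # large cities are more probable than small villages
    return weight
```

A first-stage acoustic score would simply be multiplied by this weight before the hit list is re-sorted.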
PCT/DE2009/000156 2008-02-29 2009-02-06 Procédé pour faire fonctionner un système d'assistance électronique WO2009106031A1 (fr)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
DE102008012067.7 2008-02-29
DE102008012067 2008-02-29
DE102008021954.1 2008-05-02
DE102008021954A DE102008021954A1 (de) 2008-02-29 2008-05-02 Verfahren zum Betrieb eines elektronischen Assistenzsystems

Publications (1)

Publication Number Publication Date
WO2009106031A1 true WO2009106031A1 (fr) 2009-09-03

Family

ID=40911440

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/DE2009/000156 WO2009106031A1 (fr) 2008-02-29 2009-02-06 Procédé pour faire fonctionner un système d'assistance électronique

Country Status (2)

Country Link
DE (1) DE102008021954A1 (fr)
WO (1) WO2009106031A1 (fr)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE102011116460A1 (de) 2011-10-20 2013-04-25 Volkswagen Aktiengesellschaft Verfahren für eine Benutzerschnittstelle
DE102013007964B4 (de) 2013-05-10 2022-08-18 Audi Ag Kraftfahrzeug-Eingabevorrichtung mit Zeichenerkennung
DE102015226408A1 (de) * 2015-12-22 2017-06-22 Robert Bosch Gmbh Verfahren und Vorrichtung zum Durchführen einer Spracherkennung zum Steuern zumindest einer Funktion eines Fahrzeugs
DE102016221466B4 (de) 2016-11-02 2019-02-21 Audi Ag Verfahren zum Verarbeiten einer Benutzereingabe und Kraftfahrzeug mit einer Datenverarbeitungseinrichtung

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0905662A2 (fr) * 1997-09-24 1999-03-31 Philips Patentverwaltung GmbH Système d'introduction de données pour au moins les noms de localités et de rues
DE10218226A1 (de) * 2002-04-24 2003-11-06 Volkswagen Ag Verfahren und Einrichtung zur sprachgesteuerten Ansteuerung einer Multimediaeinrichtung, insbesondere in Kraftfahrzeugen
EP1435605A2 (fr) * 2002-12-31 2004-07-07 Samsung Electronics Co., Ltd. Procédé et dispositif de reconnaissance de la parole
EP1562357A1 (fr) * 2004-02-05 2005-08-10 Avaya Technology Corp. Procédé et appareil pour la mise en antémémoire de données pour améliorer la reconnaissance de noms dans de grands espaces de noms
US20070033043A1 (en) * 2005-07-08 2007-02-08 Toshiyuki Hyakumoto Speech recognition apparatus, navigation apparatus including a speech recognition apparatus, and speech recognition method

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE19933524A1 (de) * 1999-07-16 2001-01-18 Nokia Mobile Phones Ltd Verfahren zur Eingabe von Daten in ein System
DE10125825B4 (de) * 2001-05-26 2014-09-11 Robert Bosch Gmbh Verfahren zur Spracheingabe und Datenträger
DE10131157C1 (de) * 2001-06-29 2002-07-04 Project49 Ag Dynamisches Grammatikgewichtungsverfahren für Spracherkennungssysteme
DE10306022B3 (de) * 2003-02-13 2004-02-19 Siemens Ag Dreistufige Einzelworterkennung
DE102005018174A1 (de) * 2005-04-19 2006-11-02 Daimlerchrysler Ag Verfahren zur gezielten Ermittlung eines vollständigen Eingabedatensatzes in einem Sprachdialog 11
DE102007016887B3 (de) * 2007-04-10 2008-07-31 Siemens Ag Verfahren und Vorrichtung zum Betreiben eines Navigationssystems


Also Published As

Publication number Publication date
DE102008021954A1 (de) 2009-09-03

Similar Documents

Publication Publication Date Title
WO2009140781A1 (fr) Procédé pour classer et éliminer des parties indésirables d'une instruction lors d'une reconnaissance vocale
EP2815396B1 (fr) Méthode pour phonétiser und liste de données et interface utilisateur à commande vocale
WO2009106031A1 (fr) Procédé pour faire fonctionner un système d'assistance électronique
WO2001069591A1 (fr) Procede pour reconnaitre les enonces verbaux de locuteurs non natifs dans un systeme de traitement de la parole
EP1282897B1 (fr) Procede pour produire une banque de donnees vocales pour un lexique cible pour l'apprentissage d'un systeme de reconnaissance vocale
WO2004086360A1 (fr) Procede de reconnaissance vocale dependant du locuteur et systeme de reconnaissance vocale
EP1640969B1 (fr) Procédé de l'adaptation au locuteur pour un système de reconnaissance de la parole utilisant des modèls de markov cachés
DE60029456T2 (de) Verfahren zur Online-Anpassung von Aussprachewörterbüchern
DE102005030965A1 (de) Erweiterung des dynamischen Vokabulars eines Spracherkennungssystems um weitere Voiceenrollments
EP2006835B1 (fr) Procédé destiné à la création d'une liste d'hypothèses à l'aide du vocabulaire d'un système de reconnaissance vocale
EP3115886B1 (fr) Procede de fonctionnement d'un systeme de commande vocale et systeme de commande vocale
DE102013222520B4 (de) Verfahren für ein sprachsystem eines fahrzeugs
DE10042942C2 (de) Verfahren zur Sprachsynthese
WO1999005681A1 (fr) Procede pour la memorisation des parametres de recherche d'une sequence d'images et acces a une suite d'images dans cette sequence d'images
EP1224661B1 (fr) Procede et dispositif pour la verification d'un locuteur a l'aide d'un ordinateur
EP0834859B1 (fr) Procédé de détermination d'un modèle acoustique pour un mot
DE4111781A1 (de) Computersystem zur spracherkennung
DE102010026708A1 (de) Verfahren zum Betreiben eines Sprachdialogsystems und Sprachdialogsystem
DE112009003930T5 (de) Spracherkennungsvorrichtung
DE102016009196B4 (de) Verfahren zum Betreiben mehrerer Spracherkenner
DE102008062923A1 (de) Verfahren und Vorrichtung zur Erzeugung einer Trefferliste bei einer automatischen Spracherkennung
DE4240978A1 (de) Verfahren zur Verbesserung der Erkennungsqualität bei sprecherabhängiger Spracherkennung, insbesondere Sprecherverifikation
EP0945705A2 (fr) Système de reconnaissance
DE102016005731B4 (de) Verfahren zum Betreiben mehrerer Spracherkenner
EP2154483B1 (fr) Procédé d'entrée de cibles dans un système de navigation

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 09715534

Country of ref document: EP

Kind code of ref document: A1

122 Ep: pct application non-entry in european phase

Ref document number: 09715534

Country of ref document: EP

Kind code of ref document: A1