WO2009106031A1 - Procédé pour faire fonctionner un système d'assistance électronique - Google Patents
- Publication number
- WO2009106031A1 (PCT/DE2009/000156)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- phoneme
- data
- context
- database
- processing stage
- Prior art date
Links
- method (title, claims, abstract, description)
- evaluation (claims, description)
- planning (claims, description)
- prioritisation (description)
- modification (description)
- vectors (description)
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/1815—Semantic context, e.g. disambiguation of the recognition hypotheses based on word meaning
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L2015/223—Execution procedure of a spoken command
Definitions
- The invention relates to a method for operating an electronic assistance system with a speech recognition module according to the preamble of claim 1.
- Keyboards are typically used, on which the user enters the input data in alphanumeric form.
- Assistance systems with speech recognition modules have also been in widespread use for some years.
- In such systems the input interface is equipped with speech recognition. This means that the user generally speaks his user instructions, i.e. his input data, into a recording device, for example a microphone of the assistance system, where the spoken instruction is recorded.
- In the speech recognition module the spoken user instructions are then further processed and interpreted in order to recognize the content of the spoken instruction and to be able to process it further in electronic form.
- The known speech recognition modules of electronic assistance systems are limited to comparing, in a first processing stage, the spoken operator instruction recorded with the recording device against phoneme data records stored in a database.
- The phoneme data records may be, for example, acoustic files, in particular WAV files, or phoneme vectors.
- The phoneme record that yielded the highest acoustic match in the first processing stage is then always selected, together with the content stored therein, for further processing.
- The method according to the invention is based on the idea of extending the speech recognition module by a second processing stage.
- In this second processing stage, for at least a part of the phoneme data records, the contents stored therein are compared with the context data stored in a context database.
- The match value determined in the first processing stage, which characterizes only the acoustic match between the spoken operator instruction and the phoneme record, may then be modified depending on the context comparison performed in the second processing stage. At least the phoneme record with the best modified match value is then passed on to the other parts of the assistance system for further processing.
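A minimal sketch of this two-stage selection follows. The record structure, the bonus weight, and all names are illustrative assumptions, not taken from the patent:

```python
# Minimal sketch of the two-stage selection: a stage-1 acoustic match
# value is modified by a context bonus in stage 2, and the record with
# the best modified value is selected.

def select_best(records, context_db, bonus=0.2):
    """records: list of (content, acoustic_score) pairs; context_db: a
    set of contents known from the user's context. Returns the content
    of the record with the best modified match value."""
    best_content, best_score = None, float("-inf")
    for content, acoustic_score in records:
        # Stage 2: raise the value if the content matches the context.
        modified = acoustic_score + (bonus if content in context_db else 0.0)
        if modified > best_score:
            best_content, best_score = content, modified
    return best_content

# "Wuerzbach" wins acoustically, but "Wuerzburg" wins after the
# context bonus because it appears in the context database.
records = [("Wuerzbach", 0.90), ("Wuerzburg", 0.85), ("Muehlburg", 0.60)]
print(select_best(records, context_db={"Wuerzburg"}))  # Wuerzburg
```

With an empty context database the function degenerates to the purely acoustic selection of the first processing stage.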
- The processing of the phoneme data records in the second processing stage, i.e. the content-based context comparison, represents considerable additional data-processing effort.
- It is therefore advantageous that all phoneme data records which did not produce sufficiently good results in the acoustic comparison of the first processing stage are filtered out before the context comparison in the second processing stage.
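Such pre-filtering before the costly second stage could look like this; the threshold, the top-k cap, and all names are assumptions for illustration:

```python
# Sketch of the pre-filter: only phoneme records whose acoustic match
# value clears a threshold are kept, best first and capped at top_k,
# so the expensive context comparison sees only promising candidates.

def prefilter(records, threshold=0.5, top_k=5):
    good = [r for r in records if r[1] >= threshold]
    good.sort(key=lambda r: r[1], reverse=True)  # best acoustic match first
    return good[:top_k]

records = [("A", 0.9), ("B", 0.3), ("C", 0.7), ("D", 0.1), ("E", 0.6)]
print(prefilter(records, threshold=0.5, top_k=2))  # [('A', 0.9), ('C', 0.7)]
```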
- The phoneme data record with the best modified match value in each case is forwarded for further processing.
- The content stored in that phoneme record can then be automatically selected for further processing and processed in downstream function modules.
- Alternatively, the contents of the phoneme records with the relatively highest modified match values may be displayed to the user, who then confirms one of the phoneme records by an appropriate selection.
- Preferably, the phoneme records with the relatively highest modified match values are sorted into a list.
- The sorting of the phoneme records in the list can be done according to the size of their respective modified match value. In other words, the phoneme data record with the best modified match value is placed at the first position of the list, and the remaining records follow in descending order of their modified match values.
- The size of the list can be defined by a certain number of phoneme records to be included. For example, if the list has five positions, it will comprise the five phoneme records with the five highest modified match values.
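The fixed-size sorted result list can be sketched with the standard library; the list size of five follows the example above, and the record names and values are invented:

```python
import heapq

# Keep only the five phoneme records with the highest modified match
# values, sorted so the best record occupies the first list position.

def result_list(records, size=5):
    return heapq.nlargest(size, records, key=lambda r: r[1])

records = [("a", 1), ("b", 7), ("c", 3), ("d", 9), ("e", 5), ("f", 2)]
print([name for name, _ in result_list(records)])  # ['d', 'b', 'e', 'c', 'f']
```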
- The way in which the context comparison is carried out in the second processing stage is in principle arbitrary.
- Preferably, earlier input data of the user which have been confirmed by the user for use are stored in the context database. This is based on the consideration that input data once confirmed by a user will, with relatively high probability, be entered again by the same user.
- The content of the phoneme data records is compared with these entries in the second processing stage. For phoneme records whose content matches the previous input data stored in the context database, the match value is increased in order to make the selection of those phoneme records more likely.
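One way to realize such a context database of previously confirmed inputs is sketched below; the class name, the point-based match values, and the bonus are assumptions:

```python
# Sketch of a context database built from inputs the user confirmed
# earlier: confirmed contents are stored, and phoneme records whose
# content matches a stored entry get their match value increased.

class ConfirmedInputContext:
    def __init__(self, bonus=15):
        self.bonus = bonus           # points added for a context hit
        self.confirmed = set()

    def confirm(self, content):
        self.confirmed.add(content)  # user accepted this input earlier

    def modify(self, content, match_value):
        # Second processing stage: raise the value on a context match.
        if content in self.confirmed:
            return match_value + self.bonus
        return match_value

ctx = ConfirmedInputContext()
ctx.confirm("Berlin")
print(ctx.modify("Berlin", 70))   # 85
print(ctx.modify("Hamburg", 70))  # 70
```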
- Alternatively or additionally, a context database may contain user-specific address data.
- This may be, for example, the electronic address book of a user.
- All addresses stored in the user's electronic address book have a correspondingly high probability of being possible destinations for a navigation system.
- Furthermore, a context database may contain the starting points or destinations that have already served as the basis of route planning in the past. Certain starting points or destinations are approached by the user again and again and are therefore to be regarded as particularly probable hits in later route planning as well.
- Also conceivable is a context database containing data describing the significance of cities.
- This may be, for example, the population and/or the area of a city, since the selection of a city with a large population or a large urban area is much more likely than, for example, the selection of a small village.
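A population-based adjustment could, for instance, scale with the logarithm of the inhabitant count. The population figures, the place "Kleindorf", and the weighting are illustrative assumptions:

```python
import math

# Illustrative significance bonus: the match value is increased in
# proportion to the logarithm of a city's population, so a large city
# is favored over a small village with a similar-sounding name.

POPULATION = {"Berlin": 3_600_000, "Wuerzburg": 127_000, "Kleindorf": 800}

def significance_bonus(city, weight=0.01):
    # Unknown places get population 1, i.e. a bonus of zero.
    return weight * math.log10(POPULATION.get(city, 1))

print(significance_bonus("Berlin") > significance_bonus("Kleindorf"))  # True
```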
- Alternatively, the electronic assistance system may be designed as a media player, in particular as an MP3 player. Here too the user often has to enter his input data with very little input comfort, so that improving the input comfort by means of suitable voice input with a high hit probability is of great importance.
- In this case the context database may preferably include data on preferred pieces of music and/or user-specific ratings of pieces of music and/or data on the time at which pieces of music were stored. Pieces of music preferred by the user, for example those stored in favorites lists, pieces that have received a high user-specific rating, and pieces only recently stored on the media player have a significantly higher hit probability than other pieces of music.
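A media-player context of this kind could combine the three signals as follows; the field names, the weights, and the 30-day recency window are assumptions, not specified in the patent:

```python
from datetime import date

# Sketch of a media-player context bonus: favorites, user ratings and
# recently stored pieces of music each raise the match value.

def music_bonus(title, favorites, ratings, stored_on, today):
    bonus = 0
    if title in favorites:
        bonus += 2                   # piece is on a favorites list
    bonus += ratings.get(title, 0)   # user-specific rating (0..5)
    if title in stored_on and (today - stored_on[title]).days <= 30:
        bonus += 1                   # stored only recently
    return bonus

favorites = {"Song A"}
ratings = {"Song A": 5, "Song B": 2}
stored_on = {"Song B": date(2009, 2, 1)}
today = date(2009, 2, 6)
print(music_bonus("Song A", favorites, ratings, stored_on, today))  # 7
print(music_bonus("Song B", favorites, ratings, stored_on, today))  # 3
```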
- The method according to the invention can also be installed on ticket machines. Here again the user, who is often completely untrained, has to enter a variety of input data.
- The contents of the phoneme records can then be compared with data on preferred destination stations, data on nearby destination stations, or data on the size or significance of destination stations.
- Fig. 1 shows a sorted result list with the contents of several phoneme data records;
- FIG. 2 shows the result list according to FIG. 1 after passing through the second processing stage.
- Fig. 1 shows a list 01 in whose first column five place names are entered. These place names are the contents of phoneme data records which were recognized as possible hits by acoustic comparison in the first processing stage of a speech comparison on a navigation system. According to the match value of the acoustic comparison, the place "Würzbach" was identified as the most probable hit and therefore given prioritization 1. The hit "Mühlburg", on the other hand, has the lowest acoustic match value and thus receives the worst prioritization, namely 5.
- Fig. 2 shows the list 01a after the recognized locations have undergone the content-based context comparison in the second processing stage.
- In this content-based context comparison it was found that the user has already very often used the place "Würzburg" as the destination of his route planning.
- The hit "Würzburg" is therefore given a higher match value and now receives the highest prioritization, 1.
- The other hits of the list are likewise subjected to the content-based context comparison and their match values are modified accordingly, so that after this modification "Mühlburg" receives prioritization 3 instead of 5, and "Würzbach" receives prioritization 4 instead of 1.
- The locations according to list 01a are then passed on for further processing and can be displayed to the user of a navigation system in the corresponding order as possible destinations.
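The reprioritization of Figs. 1 and 2 can be reconstructed as follows. The numeric values and the two filler place names "Marburg" and "Homburg" are invented; only the resulting order mirrors the figures:

```python
# Acoustic match values from the first processing stage (Fig. 1 order:
# Wuerzbach first with prioritization 1, Muehlburg last with 5).
acoustic = {"Wuerzbach": 0.90, "Marburg": 0.85, "Wuerzburg": 0.80,
            "Homburg": 0.75, "Muehlburg": 0.60}

# Context bonuses from the second processing stage, e.g. for frequent
# past destinations (values chosen to reproduce the Fig. 2 ordering).
context_bonus = {"Wuerzburg": 0.50, "Marburg": 0.10, "Muehlburg": 0.32}

modified = {name: value + context_bonus.get(name, 0.0)
            for name, value in acoustic.items()}
ranking = sorted(modified, key=modified.get, reverse=True)
print(ranking)
# ['Wuerzburg', 'Marburg', 'Muehlburg', 'Wuerzbach', 'Homburg']
```

As in Fig. 2, "Würzburg" moves to prioritization 1, "Mühlburg" to 3, and "Würzbach" drops to 4.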
Landscapes
- Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Navigation (AREA)
Abstract
The invention relates to a method for operating an electronic assistance system comprising an acoustic recording device on which spoken user instructions can be recorded and then processed by means of a speech recognition module. In a first processing stage of the speech recognition module, the spoken instructions are compared with phoneme data records stored in a database, the phoneme data records being evaluated according to their respective degree of acoustic match and assigned a match value. In a second processing stage of the speech recognition module, the contents of at least some of the phoneme data records are compared with context data stored in at least one context database, the match value of the phoneme data records calculated in the first processing stage being modified depending on the result of the content-based context comparison. At least the phoneme data record with the best modified match value is forwarded for further processing.
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
DE102008012067.7 | 2008-02-29 | ||
DE102008012067 | 2008-02-29 | ||
DE102008021954.1 | 2008-05-02 | ||
DE102008021954A DE102008021954A1 (de) | 2008-02-29 | 2008-05-02 | Verfahren zum Betrieb eines elektronischen Assistenzsystems |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2009106031A1 true WO2009106031A1 (fr) | 2009-09-03 |
Family
ID=40911440
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/DE2009/000156 WO2009106031A1 (fr) | 2008-02-29 | 2009-02-06 | Procédé pour faire fonctionner un système d'assistance électronique |
Country Status (2)
Country | Link |
---|---|
DE (1) | DE102008021954A1 (fr) |
WO (1) | WO2009106031A1 (fr) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
DE102011116460A1 (de) | 2011-10-20 | 2013-04-25 | Volkswagen Aktiengesellschaft | Verfahren für eine Benutzerschnittstelle |
DE102013007964B4 (de) | 2013-05-10 | 2022-08-18 | Audi Ag | Kraftfahrzeug-Eingabevorrichtung mit Zeichenerkennung |
DE102015226408A1 (de) * | 2015-12-22 | 2017-06-22 | Robert Bosch Gmbh | Verfahren und Vorrichtung zum Durchführen einer Spracherkennung zum Steuern zumindest einer Funktion eines Fahrzeugs |
DE102016221466B4 (de) | 2016-11-02 | 2019-02-21 | Audi Ag | Verfahren zum Verarbeiten einer Benutzereingabe und Kraftfahrzeug mit einer Datenverarbeitungseinrichtung |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0905662A2 (fr) * | 1997-09-24 | 1999-03-31 | Philips Patentverwaltung GmbH | Système d'introduction de données pour au moins les noms de localités et de rues |
DE10218226A1 (de) * | 2002-04-24 | 2003-11-06 | Volkswagen Ag | Verfahren und Einrichtung zur sprachgesteuerten Ansteuerung einer Multimediaeinrichtung, insbesondere in Kraftfahrzeugen |
EP1435605A2 (fr) * | 2002-12-31 | 2004-07-07 | Samsung Electronics Co., Ltd. | Procédé et dispositif de reconnaissance de la parole |
EP1562357A1 (fr) * | 2004-02-05 | 2005-08-10 | Avaya Technology Corp. | Procédé et appareil pour la mise en antémémoire de données pour améliorer la reconnaissance de noms dans de grands espaces de noms |
US20070033043A1 (en) * | 2005-07-08 | 2007-02-08 | Toshiyuki Hyakumoto | Speech recognition apparatus, navigation apparatus including a speech recognition apparatus, and speech recognition method |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
DE19933524A1 (de) * | 1999-07-16 | 2001-01-18 | Nokia Mobile Phones Ltd | Verfahren zur Eingabe von Daten in ein System |
DE10125825B4 (de) * | 2001-05-26 | 2014-09-11 | Robert Bosch Gmbh | Verfahren zur Spracheingabe und Datenträger |
DE10131157C1 (de) * | 2001-06-29 | 2002-07-04 | Project49 Ag | Dynamisches Grammatikgewichtungsverfahren für Spracherkennungssysteme |
DE10306022B3 (de) * | 2003-02-13 | 2004-02-19 | Siemens Ag | Dreistufige Einzelworterkennung |
DE102005018174A1 (de) * | 2005-04-19 | 2006-11-02 | Daimlerchrysler Ag | Verfahren zur gezielten Ermittlung eines vollständigen Eingabedatensatzes in einem Sprachdialog 11 |
DE102007016887B3 (de) * | 2007-04-10 | 2008-07-31 | Siemens Ag | Verfahren und Vorrichtung zum Betreiben eines Navigationssystems |
- 2008-05-02: DE DE102008021954A patent/DE102008021954A1/de, not_active Withdrawn
- 2009-02-06: WO PCT/DE2009/000156 patent/WO2009106031A1/fr, active Application Filing
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0905662A2 (fr) * | 1997-09-24 | 1999-03-31 | Philips Patentverwaltung GmbH | Système d'introduction de données pour au moins les noms de localités et de rues |
DE10218226A1 (de) * | 2002-04-24 | 2003-11-06 | Volkswagen Ag | Verfahren und Einrichtung zur sprachgesteuerten Ansteuerung einer Multimediaeinrichtung, insbesondere in Kraftfahrzeugen |
EP1435605A2 (fr) * | 2002-12-31 | 2004-07-07 | Samsung Electronics Co., Ltd. | Procédé et dispositif de reconnaissance de la parole |
EP1562357A1 (fr) * | 2004-02-05 | 2005-08-10 | Avaya Technology Corp. | Procédé et appareil pour la mise en antémémoire de données pour améliorer la reconnaissance de noms dans de grands espaces de noms |
US20070033043A1 (en) * | 2005-07-08 | 2007-02-08 | Toshiyuki Hyakumoto | Speech recognition apparatus, navigation apparatus including a speech recognition apparatus, and speech recognition method |
Also Published As
Publication number | Publication date |
---|---|
DE102008021954A1 (de) | 2009-09-03 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2009140781A1 (fr) | Procédé pour classer et éliminer des parties indésirables d'une instruction lors d'une reconnaissance vocale | |
EP2815396B1 (fr) | Méthode pour phonétiser und liste de données et interface utilisateur à commande vocale | |
WO2009106031A1 (fr) | Procédé pour faire fonctionner un système d'assistance électronique | |
WO2001069591A1 (fr) | Procede pour reconnaitre les enonces verbaux de locuteurs non natifs dans un systeme de traitement de la parole | |
EP1282897B1 (fr) | Procede pour produire une banque de donnees vocales pour un lexique cible pour l'apprentissage d'un systeme de reconnaissance vocale | |
WO2004086360A1 (fr) | Procede de reconnaissance vocale dependant du locuteur et systeme de reconnaissance vocale | |
EP1640969B1 (fr) | Procédé de l'adaptation au locuteur pour un système de reconnaissance de la parole utilisant des modèls de markov cachés | |
DE60029456T2 (de) | Verfahren zur Online-Anpassung von Aussprachewörterbüchern | |
DE102005030965A1 (de) | Erweiterung des dynamischen Vokabulars eines Spracherkennungssystems um weitere Voiceenrollments | |
EP2006835B1 (fr) | Procédé destiné à la création d'une liste d'hypothèses à l'aide du vocabulaire d'un système de reconnaissance vocale | |
EP3115886B1 (fr) | Procede de fonctionnement d'un systeme de commande vocale et systeme de commande vocale | |
DE102013222520B4 (de) | Verfahren für ein sprachsystem eines fahrzeugs | |
DE10042942C2 (de) | Verfahren zur Sprachsynthese | |
WO1999005681A1 (fr) | Procede pour la memorisation des parametres de recherche d'une sequence d'images et acces a une suite d'images dans cette sequence d'images | |
EP1224661B1 (fr) | Procede et dispositif pour la verification d'un locuteur a l'aide d'un ordinateur | |
EP0834859B1 (fr) | Procédé de détermination d'un modèle acoustique pour un mot | |
DE4111781A1 (de) | Computersystem zur spracherkennung | |
DE102010026708A1 (de) | Verfahren zum Betreiben eines Sprachdialogsystems und Sprachdialogsystem | |
DE112009003930T5 (de) | Spracherkennungsvorrichtung | |
DE102016009196B4 (de) | Verfahren zum Betreiben mehrerer Spracherkenner | |
DE102008062923A1 (de) | Verfahren und Vorrichtung zur Erzeugung einer Trefferliste bei einer automatischen Spracherkennung | |
DE4240978A1 (de) | Verfahren zur Verbesserung der Erkennungsqualität bei sprecherabhängiger Spracherkennung, insbesondere Sprecherverifikation | |
EP0945705A2 (fr) | Système de reconnaissance | |
DE102016005731B4 (de) | Verfahren zum Betreiben mehrerer Spracherkenner | |
EP2154483B1 (fr) | Procédé d'entrée de cibles dans un système de navigation |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 09715534 Country of ref document: EP Kind code of ref document: A1 |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 09715534 Country of ref document: EP Kind code of ref document: A1 |