WO2009140780A1 - Method for conveying a confidence to a user of an automatic voice dialogue system - Google Patents

Info

Publication number
WO2009140780A1
Authority
WO
WIPO (PCT)
Prior art keywords
confidence
kn
k2
user
k1
Prior art date
Application number
PCT/CH2009/000158
Other languages
German (de)
French (fr)
Inventor
Georg Stemmer
Original Assignee
Svox Ag
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority to DE200810024974 (DE102008024974A1)
Priority to DE102008024974.2
Application filed by Svox Ag
Publication of WO2009140780A1

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/22: Procedures used during a speech recognition process, e.g. man-machine dialogue

Abstract

The document describes a method for conveying a confidence (K1, K2,...Kn) to a user (B) of an automatic voice dialogue system, with which confidence (K1, K2,...Kn) a data item (D1, D2,..., Dn) from a user statement (BA) has been recognized, wherein the confidence (K1, K2,...Kn) associated with a data item (D1, D2,..., Dn) is low if the probability of a recognition error is high, and the confidence (K1, K2,...Kn) is high if the probability of a recognition error is low. The method is distinguished, in line with the invention, in that the confidence (K1, K2,...Kn) is used to prosodically manipulate previously recorded or synthesized system prompts (SP).

Description

Method for conveying a confidence to a user of an automatic voice dialogue system

The invention relates to a method for conveying a confidence to a user of an automatic voice dialogue system according to the preamble of claim 1.

In automatic voice dialogue systems, user utterances recognized by a speech recognition component are rated with so-called confidences. The confidence approximates the probability that a speech recognition result is correct, so that, ideally, the confidence of a recognized word is very low if and only if the probability of a recognition error is very high.

Voice dialogue systems use these confidence values and adapt the dialogue accordingly, for example by asking the user for confirmation via so-called prompts or system prompts when a recognized word has too low a confidence. In a bank transfer, for instance, where the transfer amount was poorly understood and was therefore assigned a low confidence, the system can automatically ask whether the amount was understood correctly. Such a question may be, for example: "You want to transfer 1,000 euros, is that right?"
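The conventional confirmation behaviour described here can be sketched as follows. This is only a minimal illustration and not part of the patent text: the `DataItem` class, the `confirmation_prompt` function and the threshold value of 0.6 are assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DataItem:
    """A recognized part of the user utterance with its confidence (0.0 to 1.0)."""
    text: str
    confidence: float


# Hypothetical threshold; a real system would tune this value.
CONFIRM_THRESHOLD = 0.6


def confirmation_prompt(item: DataItem, threshold: float = CONFIRM_THRESHOLD) -> Optional[str]:
    """Return a confirmation request if the confidence is too low, else None."""
    if item.confidence < threshold:
        return f"You want to transfer {item.text}, is that right?"
    return None


low = confirmation_prompt(DataItem("1,000 euros", 0.4))
high = confirmation_prompt(DataItem("1,000 euros", 0.9))
```

With these assumed values, only the poorly understood amount triggers a confirmation request; the well understood one lets the dialogue continue without interruption.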

However, confirmation requests that are too frequent impede the flow of the dialogue, which is why confirmation requests for several data items are often combined with one another or with a further query, for example: "1000 euros to the account of Fa. Müller, would you like a paid booking confirmation?"

As mentioned above, voice portals use confidence values and adapt the dialogue accordingly. However, this adaptation takes place on a purely textual level, that is, by rewording the text of prompts and changing the course of the dialogue. The rewording of prompts is often subject to narrow limits, since the prompts, when no speech synthesis is to be used, must be recorded in advance with a professional speaker. Likewise, rewording and changes in the course of the dialogue can take into account only very coarse gradations of confidence, such as low, medium and high.

It may therefore be regarded as an object of the invention to develop a method which makes it possible to convey a confidence to a user of an automatic voice dialogue system.

The object is achieved by a method for conveying to a user of an automatic voice dialogue system the confidence with which a so-called data item, or part of a user utterance, has been recognized by automatic speech recognition during a voice dialogue and incorporated into a speech recognition result, wherein the confidence of a data item or recognized word is low when the probability of a recognition error for that data item is high, and the confidence is high when the probability of a recognition error for that data item is low. According to the invention, the confidence is used to prosodically manipulate pre-recorded or synthesized system prompts.

The invention makes it possible, in a simple manner, to allow finer gradations which implicitly make it clear to the user, without further complicating the dialogue, that a certain part of a user utterance (often referred to as a data item in speech recognition), for example a number, a word or a phrase such as "Fa. Müller", was recognized with particularly low confidence. An advantageous embodiment of the invention provides that the prosodic manipulation is performed such that a data item with a low confidence is played back with a short pause, preferably preceding it, and with special emphasis on the data item. The prosodic manipulation thus serves as acoustic underlining and can reflect even small variations in confidence in a very fine-grained way.

A particularly advantageous embodiment of the invention provides that the prosodic manipulation involves properties such as rhythm, energy and fundamental frequency.

Another advantageous embodiment of the invention provides that only confidences below a predetermined or predeterminable threshold value are conveyed to the user.
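The threshold embodiment can be pictured as a simple filter over the recognized data items. The following is a minimal sketch under stated assumptions: the function name `items_to_convey`, the representation of a data item as a `(text, confidence)` pair and the example values are all hypothetical and do not come from the patent.

```python
from typing import List, Tuple

# A data item is represented here as (text, confidence); this layout is an assumption.
Item = Tuple[str, float]


def items_to_convey(items: List[Item], threshold: float) -> List[Item]:
    """Keep only the data items whose confidence lies below the threshold.

    Only these items would later be prosodically marked in the system prompt.
    """
    return [(text, k) for text, k in items if k < threshold]


recognized = [("1000 euros", 0.92), ("Fa. Müller", 0.35)]
marked = items_to_convey(recognized, threshold=0.5)
```

Because the threshold is a parameter, it can be fixed in advance or set at run time, which corresponds to the "predetermined or predeterminable" wording.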

The invention is explained in greater detail below with reference to an exemplary embodiment illustrated in the single drawing, Fig. 1. It shows:

Fig. 1 a schematic representation of the sequence of a method according to the invention for conveying a confidence to a user of an automatic voice dialogue system.

In a method according to the invention, whose sequence is schematically illustrated in Fig. 1, for conveying to a user B of an automatic voice dialogue system the confidence K with which a so-called data item, or part of a user utterance BA, has been recognized by automatic speech recognition during a voice dialogue and incorporated into a speech recognition result SE, the user B makes a spoken user utterance BA. This user utterance BA is captured by a speech recognition component SK. The speech recognition component SK recognizes individual data items D1, D2, ..., Dn of the user utterance BA, which are incorporated into a speech recognition result SE. The data items D1, D2, ..., Dn each comprise parts, preferably individual words, of the user utterance BA. Each data item D1, D2, ..., Dn is rated by the speech recognition component SK with a confidence K1, K2, ..., Kn.

The confidence K1, K2, ..., Kn of a data item D1, D2, ..., Dn is low when the probability of a recognition error for the data item D1, D2, ..., Dn is high. The confidence K1, K2, ..., Kn of a data item D1, D2, ..., Dn is high when the probability of a recognition error for the data item D1, D2, ..., Dn is low.
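The relationship between data items, confidences and the speech recognition result can be illustrated with a small data structure. This is a sketch under stated assumptions: the `DataItem` class, the idealized relation "confidence = 1 - error probability" and the error probabilities used below are invented for illustration; the patent only states that the confidence approximates the probability of a correct result.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DataItem:
    """One data item D_i of the user utterance with its confidence K_i."""
    text: str
    confidence: float


def confidence_from_error_probability(p_error: float) -> float:
    """Idealized confidence: high when a recognition error is unlikely."""
    if not 0.0 <= p_error <= 1.0:
        raise ValueError("p_error must lie in [0, 1]")
    return 1.0 - p_error


# Speech recognition result SE as a list of data items (illustrative values only).
se: List[DataItem] = [
    DataItem("1000", confidence_from_error_probability(0.05)),
    DataItem("euros", confidence_from_error_probability(0.02)),
    DataItem("Fa. Müller", confidence_from_error_probability(0.60)),
]

# The data item the system is least sure about.
least_certain = min(se, key=lambda d: d.confidence)
```

In practice a recognizer would supply the confidences directly; the conversion function is included only to make the inverse relation between confidence and error probability explicit.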

If the speech recognition component SK assigns a low confidence K1, K2, ..., Kn to a data item D1, D2, ..., Dn, this low confidence K1, K2, ..., Kn is used, within a pre-recorded or synthesized system prompt SP directed at the user B (also referred to as a confirmation request), to prosodically manipulate the respective data item D1, D2, ..., Dn, which is preferably contained in the system prompt SP.

The invention thus makes it possible, in a simple manner, to allow finer gradations which implicitly make it clear to the user B, without further complicating the dialogue comprising the user utterance BA and the system prompt SP, that a data item D1, D2, ..., Dn was recognized with particularly low confidence K1, K2, ..., Kn.

The problem known from the prior art is thus solved according to the invention in that the confidence is used to prosodically manipulate the pre-recorded or synthesized system prompts. For example, within a system prompt "1000 euros to the account of - Fa. Müller, would you like a paid booking confirmation?", the data item "Fa. Müller" is played back with a short pause and emphasis. The prosodic manipulation thus serves as acoustic underlining and can reflect even small variations in confidence in a very fine-grained way. Algorithms for manipulating the prosody of speech have long been known in the field of speech synthesis.
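For a synthesized prompt, one conceivable way to request the pause and emphasis is SSML markup. The patent does not name any markup language, so the use of SSML here is an assumption, as are the function name `render_prompt_ssml`, the threshold of 0.5, the pause length of 300 ms and the confidence value 0.35. Pre-recorded prompts would instead require signal-level processing, which this sketch does not cover.

```python
from typing import List, Optional, Tuple
from xml.sax.saxutils import escape

# A prompt segment is (text, confidence); fixed prompt text carries no confidence.
Segment = Tuple[str, Optional[float]]


def render_prompt_ssml(segments: List[Segment],
                       threshold: float = 0.5,
                       pause_ms: int = 300) -> str:
    """Build an SSML string in which low-confidence data items are preceded
    by a short pause and spoken with strong emphasis."""
    parts = []
    for text, confidence in segments:
        safe = escape(text)  # protect XML special characters
        if confidence is not None and confidence < threshold:
            parts.append(
                f'<break time="{pause_ms}ms"/>'
                f'<emphasis level="strong">{safe}</emphasis>'
            )
        else:
            parts.append(safe)
    return "<speak>" + "".join(parts) + "</speak>"


prompt: List[Segment] = [
    ("1000 euros to the account of ", None),
    ("Fa. Müller", 0.35),  # illustrative confidence value
    (", would you like a paid booking confirmation?", None),
]
ssml = render_prompt_ssml(prompt)
```

The wording of the prompt is unchanged; only the markup around "Fa. Müller" differs, which mirrors the idea that the confidence is conveyed prosodically rather than textually.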

Thus, the invention uses the technical ability to manipulate prosodic properties such as rhythm, energy and fundamental frequency of played-back synthesized or pre-recorded prompts in order to express the confidence in certain data items in an automatic voice dialogue.
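To make the fine-grained character of this approach concrete, the confidence can be mapped continuously onto prosodic parameters. The following sketch assumes a simple linear mapping; the `ProsodySettings` class, the function `prosody_for_confidence` and the maximum values (400 ms pause, 6 dB gain, fundamental frequency raised by 20 %) are hypothetical choices, since the patent does not specify any particular mapping.

```python
from dataclasses import dataclass


@dataclass
class ProsodySettings:
    pause_ms: float   # rhythm: pause inserted before the data item
    gain_db: float    # energy: level increase applied to the data item
    f0_scale: float   # fundamental frequency: multiplicative pitch factor


def prosody_for_confidence(confidence: float,
                           max_pause_ms: float = 400.0,
                           max_gain_db: float = 6.0,
                           max_f0_scale: float = 1.2) -> ProsodySettings:
    """Linearly map a confidence in [0, 1] to prosodic emphasis.

    Confidence 1.0 leaves the prompt unchanged; confidence 0.0 applies
    the maximum pause, gain and pitch increase.
    """
    k = min(max(confidence, 0.0), 1.0)  # clamp to the valid range
    uncertainty = 1.0 - k
    return ProsodySettings(
        pause_ms=max_pause_ms * uncertainty,
        gain_db=max_gain_db * uncertainty,
        f0_scale=1.0 + (max_f0_scale - 1.0) * uncertainty,
    )


certain = prosody_for_confidence(1.0)
halfway = prosody_for_confidence(0.5)
unsure = prosody_for_confidence(0.0)
```

Unlike a three-level scheme of low, medium and high, every small change in confidence produces a correspondingly small change in pause, energy and pitch. Whether a linear mapping is perceptually appropriate is an open design question that the sketch does not address.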

Claims

1. A method for conveying a confidence (K1, K2, ..., Kn) to a user (B) of an automatic voice dialogue system, with which confidence (K1, K2, ..., Kn) a data item (D1, D2, ..., Dn) of a user utterance (BA) has been recognized, wherein the confidence (K1, K2, ..., Kn) of a data item (D1, D2, ..., Dn) is low when the probability of a recognition error is high, and the confidence (K1, K2, ..., Kn) is high when the probability of a recognition error is low, characterized in that the confidence (K1, K2, ..., Kn) is used to prosodically manipulate pre-recorded or synthesized system prompts (SP).
2. The method according to claim 1, characterized in that the prosodic manipulation is performed such that a data item (D1, D2, ..., Dn) with a low confidence (K1, K2, ..., Kn) is played back with a short pause and special emphasis.
3. The method according to claim 1 or 2, characterized in that the prosodic manipulation involves properties such as rhythm, energy and fundamental frequency.
4. The method according to claim 1, 2 or 3, characterized in that only confidences (K1, K2, ..., Kn) below a predetermined or predeterminable threshold value are conveyed to the user (B).
PCT/CH2009/000158 2008-05-23 2009-05-14 Method for conveying a confidence to a user of an automatic voice dialogue system WO2009140780A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
DE200810024974 DE102008024974A1 (en) 2008-05-23 2008-05-23 A method for switching a confidence level to a user of an automatic speech dialogue system
DE102008024974.2 2008-05-23

Publications (1)

Publication Number Publication Date
WO2009140780A1 (en) 2009-11-26

Family

ID=40834457

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CH2009/000158 WO2009140780A1 (en) 2008-05-23 2009-05-14 Method for conveying a confidence to a user of an automatic voice dialogue system

Country Status (2)

Country Link
DE (1) DE102008024974A1 (en)
WO (1) WO2009140780A1 (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6751591B1 (en) * 2001-01-22 2004-06-15 At&T Corp. Method and system for predicting understanding errors in a task classification system
US20050027523A1 (en) * 2003-07-31 2005-02-03 Prakairut Tarlton Spoken language system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
SARAH DAVIES AND MASSIMO POESIO: "THE PROVISION OF CORRECTIVE FEEDBACK IN A SPOKEN DIALOGUE CALL SYSTEM" 19981001, 1 October 1998 (1998-10-01), page P813, XP007000592 *

Also Published As

Publication number Publication date
DE102008024974A1 (en) 2009-12-03

Similar Documents

Publication Publication Date Title
Bolinger A theory of pitch accent in English
Zissman Comparison of four approaches to automatic language identification of telephone speech
Ernestus et al. The recognition of reduced word forms
Kirchhoff Robust speech recognition using articulatory information
Campbell et al. Phonetic speaker recognition with support vector machines
Lamel et al. BREF, a large vocabulary spoken corpus for French
Chafe Prosodic and functional units of
Singer et al. Acoustic, phonetic, and discriminative approaches to automatic language identification
Grosjean The recognition of words after their acoustic offset: Evidence and implications
US5855000A (en) Method and apparatus for correcting and repairing machine-transcribed input using independent or cross-modal secondary input
US6999931B2 (en) Spoken dialog system using a best-fit language model and best-fit grammar
Vroomen et al. Metrical segmentation and lexical inhibition in spoken word recognition.
US7869999B2 (en) Systems and methods for selecting from multiple phonectic transcriptions for text-to-speech synthesis
Bitouk et al. Class-level spectral features for emotion recognition
O’Shaughnessy Automatic speech recognition: History, methods and challenges
Grosjean et al. Prosodic structure and spoken word recognition
JP5327054B2 (en) Pronunciation variation rule extraction apparatus, pronunciation variation rule extraction method, and pronunciation variation rule extraction program
KR101688240B1 (en) System and method for automatic speech to text conversion
US20080189106A1 (en) Multi-Stage Speech Recognition System
US20100004931A1 (en) Apparatus and method for speech utterance verification
Li et al. Spoken language recognition: from fundamentals to practice
CN1206620C (en) Transcription and display input speech
Campbell et al. Language recognition with support vector machines
JP4709663B2 (en) User adaptive speech recognition method and a speech recognition device
Lahiri et al. Underspecified recognition

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 09749384

Country of ref document: EP

Kind code of ref document: A1

122 Ep: pct app. not ent. europ. phase

Ref document number: 09749384

Country of ref document: EP

Kind code of ref document: A1