WO2002103674A1 - On-line environmental and speaker model adaptation - Google Patents

On-line environmental and speaker model adaptation

Info

Publication number
WO2002103674A1
Authority
WO
WIPO (PCT)
Prior art keywords
processing system
models
model
utterances
adaptation
Prior art date
Application number
PCT/AU2002/000804
Other languages
French (fr)
Inventor
Habib Talhami
David Thambiratnum
Original Assignee
Kaz Group Limited
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Kaz Group Limited filed Critical Kaz Group Limited
Publication of WO2002103674A1 publication Critical patent/WO2002103674A1/en

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/065 Adaptation

Landscapes

  • Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Machine Translation (AREA)
  • Telephonic Communication Services (AREA)

Abstract

In a speech recognition system of the type adapted to process utterances from a caller or user by way of a recogniser, an utterance processing system and a dialogue processing system so as to produce responses to said utterances, an on-line environmental and speaker model adaptation arrangement wherein a plurality of models operating on a plurality of respective conversations each adapt differently to each conversation over a predetermined period of time subsequent to which the adapted models are tested and the best model according to a predetermined criterion is selected as the initial model to be applied commencing at a second predetermined period of time.

Description

ON-LINE ENVIRONMENTAL AND SPEAKER MODEL ADAPTATION
The present invention relates to an on-line environmental and speaker model adaptation arrangement particularly suited for use with an automated speech recognition system.
BACKGROUND
Automated speech recognition is a complex task in itself. Automated speech understanding sufficient to provide automated dialogue with a user adds a further layer of complexity.
In this specification the term "automated speech recognition system" will refer to automated or substantially automated systems which perform automated speech recognition and also attempt automated speech understanding, at least to predetermined levels sufficient to provide a capability for at least limited automated dialogue with a user. A generalized diagram of a commercial grade automated speech recognition system as can be used for example in call centres and the like is illustrated in Fig. 1.
With advances in digital computers and a significant lowering in cost per unit of computing capacity, there have been a number of attempts in the commercial marketplace to install such automated speech recognition systems implemented substantially by means of digital computers. However, to date, there remain problems in achieving 100% recognition and/or 100% understanding in real time.
A particular problem in hosted speech recognition solutions is that, where multiple calls are being handled simultaneously, it is not possible to adapt a single model set concurrently. One solution is to periodically retrain the models to allow them to better represent new data; however, this has not been overly successful.
It is an object of the present invention to address or ameliorate one or more of the abovementioned disadvantages.
BRIEF DESCRIPTION OF INVENTION
Accordingly, in one broad form of the invention there is provided in a speech recognition system of the type adapted to process utterances from a caller or user by way of a recogniser, an utterance processing system and a dialogue processing system so as to produce responses to said utterances, an on-line environmental and speaker model adaptation arrangement wherein a plurality of models operating on a plurality of respective conversations each adapt differently to each conversation over a predetermined period of time subsequent to which the adapted models are tested and the best model according to a predetermined criterion is selected as the initial model to be applied commencing at a second predetermined period of time. Preferably adaptation is performed by a Maximum Likelihood Linear Regression (MLLR) process.
BRIEF DESCRIPTION OF DRAWINGS
Embodiments of the present invention will now be described with reference to the accompanying drawings wherein:
Fig. 1 is a generalized block diagram of a prior art automated speech recognition system;
Fig. 2 is a generalized block diagram of an automated speech recognition system suited for use in conjunction with an embodiment of the present invention;
Fig. 3 is a more detailed block diagram of the utterance processing and dialogue processing portions of the system of Fig. 2;
Fig. 4 is a block diagram of the recogniser portion of the system of Fig. 2 incorporating an arrangement in accordance with a first preferred embodiment of the present invention.
DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS
With reference to Fig. 2 there is illustrated a generalized block diagram of an automated speech recognition system 10 adapted to receive human speech derived from user 11, and to process that speech with a view to recognizing and understanding the speech to a sufficient level of accuracy that a response 12 can be returned to user 11 by system 10. In the context of systems to which embodiments of the present invention are applicable the response 12 can take the form of an auditory communication, a written or visual communication or any other form of communication intelligible to user 11 or a combination thereof.
In all cases input from user 11 is in the form of a plurality of utterances 13 which are received by transducer 14 (for example a microphone) and converted into an electronic representation 15 of the utterances 13. In one exemplary form the electronic representation 15 comprises a digital representation of the utterances 13 in .WAV format. Each electronic representation 15 represents an entire utterance 13. The electronic representations 15 are processed through front end processor 16 to produce a stream of vectors 17, one vector for example for each 10ms segment of utterance 13. The vectors 17 are matched against knowledge base vectors 18 derived from knowledge base 19 by back end processor 20 so as to produce ranked results 1-N in the form of N best results 21. The results can comprise for example subwords, words or phrases but will depend on the application. N can vary from 1 to very high values, again depending on the application. An utterance processing system 26 receives the N best results 21 and begins the task of assembling the results into a meaning representation 25, for example based on the data contained in language knowledge database 31.
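The specification leaves front end 16 abstract beyond producing one vector per 10ms segment. Purely as an illustrative sketch, the following Python fragment frames a sampled utterance into 10ms segments and computes a log-magnitude-spectrum vector per frame; the feature type, window and sample rate are assumptions and not taken from the patent.

```python
import numpy as np

def front_end_vectors(samples: np.ndarray, sample_rate: int = 8000,
                      frame_ms: int = 10) -> np.ndarray:
    """Emit one feature vector per 10ms segment (the stream of vectors 17).

    The log-magnitude spectrum used here is a stand-in feature; the
    patent only requires that front end 16 produce a vector stream.
    """
    frame_len = sample_rate * frame_ms // 1000      # samples in one 10ms frame
    n_frames = len(samples) // frame_len
    frames = samples[:n_frames * frame_len].reshape(n_frames, frame_len)
    window = np.hamming(frame_len)                  # reduce spectral leakage
    spectra = np.abs(np.fft.rfft(frames * window, axis=1))
    return np.log(spectra + 1e-10)                  # one vector per frame
```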
The utterance processing system 26 orders the resulting tokens or words 23 contained in N-best results 21 into a meaning representation 25 of token or word candidates which are passed to the dialogue processing system 27 where sufficient understanding is attained so as to permit functional utilization of speech input 15 from user 11 for the task to be performed by the automated speech recognition system 10. In this case the functionality includes attaining of sufficient understanding to permit at least a limited dialogue to be entered into with user/caller 11 by means of response 12 in the form of prompts so as to elicit further speech input from the user 11. In the alternative or in addition, the functionality for example can include a sufficient understanding to permit interaction with extended databases for data identification.
Fig. 3 illustrates further detail of the system of Fig. 2 including listing of further functional components which make up the utterance processing system 26 and the dialogue processing system 27 and their interaction. Like components are numbered as for the arrangement of Fig. 2.
The utterance processing system 26 and the dialogue processing system 27 together form a natural language processing system. The utterance processing system 26 is event-driven and processes each of the utterances 13 of caller/user 11 individually. The dialogue processing system 27 puts any given utterance 13 of caller/user 11 into the context of the current conversation (usually a telephone conversation). Broadly, in a telephone answering context, it will try to resolve the query from the caller and decide on an appropriate answer to be provided by way of response 12.
The utterance processing system 26 takes as input the output of the acoustic or speech recogniser 30 and produces a meaning representation 25 for passing to dialogue processing system 27.
In a typical, but not limiting, form the meaning representation 25 can take the form of value pairs. For example, the utterance "I want to go from Melbourne to Sydney on Wednesday" may be presented to the dialogue processing system 27 in the form of three value pairs, comprising:
1. Start: Melbourne
2. Destination: Sydney
3. Date: Wednesday

where, in this instance, the components Melbourne, Sydney and Wednesday of the value pairs 24 comprise tokens or words 23. With particular reference to Fig. 3, the recogniser 30 provides as output N best results 21, usually in the form of tokens or words 23, to the utterance processing system 26, where the output is first disambiguated by language model 32. In one form the language model 32 is based on trigrams with cut-off. Analyser 33 specifies how words derived from language model 32 can be grouped together to form meaningful phrases which are used to interpret utterance 13. In one form the analyser is based on a series of simple finite state automata which produce robust parses of phrasal chunks - for example noun phrases for entities and concepts and WH-phrases for questions and dates. Analyser 33 is driven by grammars such as meta-grammar 34. The grammars themselves must be tailored for each application and can be thought of as data created for a specific customer. The resolver 35 then uses semantic information associated with the words of the phrases recognized as relevant by the analyser 33 to refine the meaning representation 25 into its final form for passing through the dialogue flow controller 36 within dialogue processing system 27.
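Purely as an illustration of this output format, the sketch below reduces the example utterance to value pairs; the regular expression stands in for the language model 32, analyser 33 and resolver 35 pipeline, and the slot names are assumptions.

```python
import re

def meaning_representation(utterance: str) -> dict:
    """Reduce an utterance to value pairs (illustrative stand-in for 32-35)."""
    match = re.search(r"from (\w+) to (\w+) on (\w+)", utterance, re.IGNORECASE)
    if not match:
        return {}
    return {"Start": match.group(1),
            "Destination": match.group(2),
            "Date": match.group(3)}

print(meaning_representation("I want to go from Melbourne to Sydney on Wednesday"))
# {'Start': 'Melbourne', 'Destination': 'Sydney', 'Date': 'Wednesday'}
```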
The dialogue processing system 27, in this instance with reference to Fig. 3, receives meaning representation 25 from resolver 35 and processes the dialogue according to the appropriate dialogue models. Again, dialogue models will be specific to different applications but some may be reusable. For example a protocol model may handle greetings, closures, interruptions, errors and the like across a number of different applications.
The dialogue flow controller 36 uses the dialogue history to keep track of the interactions. The logic engine 37, in this instance, creates SQL queries based on the meaning representation 25. Again it will be dependent on the specific application and its domain knowledge base.
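As a hedged illustration of that step, the sketch below maps value pairs onto a parameterised SQL query; the table and column names are invented for the example, and values are bound as parameters rather than interpolated into the SQL string.

```python
def build_query(pairs: dict) -> tuple:
    """Turn the meaning representation into SQL (cf. logic engine 37)."""
    # Hypothetical mapping from slot names to columns in a customer database.
    columns = {"Start": "origin", "Destination": "destination", "Date": "travel_day"}
    clauses = [f"{columns[slot]} = %s" for slot in pairs]
    sql = "SELECT service_id FROM services WHERE " + " AND ".join(clauses)
    return sql, tuple(pairs.values())

sql, params = build_query(
    {"Start": "Melbourne", "Destination": "Sydney", "Date": "Wednesday"})
# SELECT service_id FROM services WHERE origin = %s AND destination = %s
#   AND travel_day = %s
```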
The generator 38 produces responses 12 (for example, speech out). In the simplest form the generator 38 can utilize generic text-to-speech (TTS) systems to produce a voiced response.
Language knowledge database 31 comprises, in the instance of Fig. 3, a lexicon 39 operating in conjunction with database 40. The lexicon 39 and database 40 operating in conjunction with knowledge base mapping tools 41 and, as appropriate, language model 32 and grammars 34 constitutes a language knowledge database 31 or knowledge base which deals with domain specific data. The structure and grouping of data is modeled in the knowledge base 31.
Database 40 comprises raw data provided by a customer. In one instance this data may comprise names, addresses, places and dates, and is usually organised in a way that logically relates to the way it will be used. The database 40 may remain unchanged or it may be updated throughout the lifetime of an application. Functional implementation can be by way of database servers such as MySQL, Oracle or Postgres.
As will be observed particularly with reference to Fig. 3, interaction between a number of components in the system can be quite complex, with lexicon 39, in particular, being used by and interacting with multiple components of system 10.
With reference to Fig. 4 there is illustrated in schematic form the basic procedure adopted by an on-line environmental and speaker model adaptation arrangement 910 operable, in this instance, from back end 20 of recogniser 30.
Broadly, where system 10 is handling a plurality of telephone conversations 911, in this instance conversations 1-N, each conversation has a respective speech model 912, in this instance models 1-N, and over a first predetermined period of time 913 an adaptation process, in this instance an MLLR adaptation process, is applied to each of the models 912, thereby to improve recognition. Because the conversations 911 are running concurrently, each respective model 912 will adapt differently over first period 913. At the end of first period 913 all models 912 are tested by test and select means 914 to select the best performing model of the models 912 in accordance with the predetermined criteria. The best model 912A then becomes the initial model for the start of the next predetermined period 915.
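A minimal sketch of this arrangement follows. The model representation and the per-utterance adaptation step are placeholders supplied by the caller, since the patent specifies only that each conversation adapts its own copy of the model over the period.

```python
import copy
from typing import Callable, Dict, Iterable

def run_adaptation_period(initial_model: dict,
                          conversations: Dict[str, Iterable[object]],
                          adapt: Callable[[dict, object], dict]) -> Dict[str, dict]:
    """Adapt a separate copy of the model on each concurrent conversation.

    Each returned entry is one of the differently adapted models 912
    produced over the first predetermined period 913.
    """
    adapted = {}
    for conv_id, utterances in conversations.items():
        model = copy.deepcopy(initial_model)   # this conversation's model 912
        for utterance in utterances:
            model = adapt(model, utterance)    # e.g. an MLLR update per utterance
        adapted[conv_id] = model
    return adapted
```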
In this instance, procedure 910 is based on a technique called MLLR (Maximum Likelihood Linear Regression).
System 10 can perform speech recognition in a hosted environment. In a given day it handles numerous phone calls for different applications. During each call, speech models are adapted using the MLLR algorithm. However, because the system is handling several calls simultaneously, it ends up with several adapted model sets. Recombining these sets is difficult, yet it must be done.
The arrangement 910 recombines these in the following fashion:
At the end of each day (or any predetermined time 913) all adapted model sets 912 are collected, and each is tested with a standard test algorithm. This algorithm tests the performance of each adapted model set 912. The model set that gives the best performance is then used as the new initial model 912A for the next day.
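A hedged sketch of that test-and-select step (test and select means 914) is given below, assuming the "standard test algorithm" is a fixed test set scored by recognition accuracy; the criterion and the function signatures are assumptions.

```python
from typing import Callable, Dict, Sequence, Tuple

def select_initial_model(adapted_models: Dict[str, dict],
                         test_set: Sequence[Tuple[object, str]],
                         recognise: Callable[[dict, object], str]) -> dict:
    """Pick the best-performing adapted model set as initial model 912A."""
    def accuracy(model: dict) -> float:
        correct = sum(recognise(model, audio) == transcript
                      for audio, transcript in test_set)
        return correct / len(test_set)
    # The predetermined criterion here is plain recognition accuracy.
    best = max(adapted_models, key=lambda name: accuracy(adapted_models[name]))
    return adapted_models[best]                # initial model for period 915
```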
During this process, if any calls come in, recognition is performed as normal, but no adaptation takes place.
The main benefit in this solution is that the speech models will improve over time, giving better speech recognition performance. Other benefits are:
• Adaptation to a noisy environment
• Adaptation to individual speaker styles
In this instance, the adaptation is based on MLLR, which is a well-known adaptation process for HMMs. It is essentially a method of determining the distribution of speech features in new speech data, and then modifying the parameters of the old models to better fit that data.
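As a hedged illustration of the MLLR idea, the sketch below estimates a single global mean transform W = [b A] from new speech data and applies adapted means A*mu + b. It assumes identity covariances, under which the maximum-likelihood estimate reduces to weighted least squares; production MLLR implementations solve per regression class using the Gaussians' actual covariances.

```python
import numpy as np

def estimate_mllr_transform(frames: np.ndarray,      # (T, d) new speech features
                            means: np.ndarray,       # (M, d) old Gaussian means
                            posteriors: np.ndarray   # (T, M) frame-Gaussian weights
                            ) -> np.ndarray:
    """Estimate a global MLLR mean transform W (identity-covariance case)."""
    d = frames.shape[1]
    xi = np.hstack([np.ones((len(means), 1)), means])   # extended means (M, d+1)
    G = np.zeros((d + 1, d + 1))                        # sum of gamma * xi xi^T
    K = np.zeros((d, d + 1))                            # sum of gamma * o xi^T
    for t in range(len(frames)):
        for m in range(len(means)):
            gamma = posteriors[t, m]
            G += gamma * np.outer(xi[m], xi[m])
            K += gamma * np.outer(frames[t], xi[m])
    return K @ np.linalg.inv(G)                         # W = [b | A], shape (d, d+1)

def adapt_means(means: np.ndarray, W: np.ndarray) -> np.ndarray:
    """Modify the old model means to better fit the new data: A*mu + b."""
    xi = np.hstack([np.ones((len(means), 1)), means])
    return xi @ W.T
```

The above describes only some embodiments of the present invention and modifications, obvious to those skilled in the art, can be made thereto without departing from the scope and spirit of the present invention.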

Claims

1. In a speech recognition system of the type adapted to process utterances from a caller or user by way of a recogniser, an utterance processing system and a dialogue processing system so as to produce responses to said utterances, an on-line environmental and speaker model adaptation arrangement wherein a plurality of models operating on a plurality of respective conversations each adapt differently to each conversation over a predetermined period of time subsequent to which the adapted models are tested and the best model according to a predetermined criterion is selected as the initial model to be applied commencing at a second predetermined period of time.
2. The arrangement of Claim 1 wherein adaptation is performed by a Maximum Likelihood Linear Regression (MLLR) process.
PCT/AU2002/000804 2001-06-19 2002-06-19 On-line environmental and speaker model adaptation WO2002103674A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
AUPR5796A AUPR579601A0 (en) 2001-06-19 2001-06-19 On-line environmental and speaker model adaptation
AUPR5796 2001-06-19

Publications (1)

Publication Number Publication Date
WO2002103674A1 (en) 2002-12-27

Family

ID=3829767

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/AU2002/000804 WO2002103674A1 (en) 2001-06-19 2002-06-19 On-line environmental and speaker model adaptation

Country Status (2)

Country Link
AU (1) AUPR579601A0 (en)
WO (1) WO2002103674A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2000078022A1 (en) * 1999-06-11 2000-12-21 Telstra New Wave Pty Ltd A method of developing an interactive system
US6208964B1 (en) * 1998-08-31 2001-03-27 Nortel Networks Limited Method and apparatus for providing unsupervised adaptation of transcriptions
EP1089256A2 (en) * 1999-09-30 2001-04-04 Sony Corporation Speech recognition models adaptation from previous results feedback
WO2001075862A2 (en) * 2000-04-05 2001-10-11 Lernout & Hauspie Speech Products N.V. Discriminatively trained mixture models in continuous speech recognition

Also Published As

Publication number Publication date
AUPR579601A0 (en) 2001-07-12

Similar Documents

Publication Publication Date Title
EP1380153B1 (en) Voice response system
US8626509B2 (en) Determining one or more topics of a conversation using a domain specific model
US7606714B2 (en) Natural language classification within an automated response system
CA2441195C (en) Voice response system
EP1240642B1 (en) Learning of dialogue states and language model of spoken information system
US7158935B1 (en) Method and system for predicting problematic situations in a automated dialog
US8943394B2 (en) System and method for interacting with live agents in an automated call center
CN107818798A (en) Customer service quality evaluating method, device, equipment and storage medium
US20070121893A1 (en) Optimal call speed for call center agents
WO2007101088A1 (en) Menu hierarchy skipping dialog for directed dialog speech recognition
US11895272B2 (en) Systems and methods for prioritizing emergency calls
JP6605105B1 (en) Sentence symbol insertion apparatus and method
US20050096912A1 (en) System and method for interactive voice response enhanced out-calling
Karis et al. Automating services with speech recognition over the public switched telephone network: Human factors considerations
Suendermann Advances in commercial deployment of spoken dialog systems
CN114328867A (en) Intelligent interruption method and device in man-machine conversation
US7853451B1 (en) System and method of exploiting human-human data for spoken language understanding systems
WO2002103673A1 (en) Neural network post-processor
WO2002103674A1 (en) On-line environmental and speaker model adaptation
Tarasiev et al. Application of stemming methods to development a module of a post-processing of recognized speech in intelligent automated system for dialogue and decision-making in real time
CA2375589A1 (en) Method and apparatus for determining user satisfaction with automated speech recognition (asr) system and quality control of the asr system
López-Cózar et al. A new technique based on augmented language models to improve the performance of spoken dialogue systems.
AU4885602A (en) Concurrent recognition using bin-sequential lattices
Hagberg User response strategies to reprompting in a call routing service
WO2002103672A1 (en) Language assisted recognition module

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ OM PH PL PT RO RU SD SE SG SI SK SL TJ TM TN TR TT TZ UA UG US UZ VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

122 Ep: pct application non-entry in european phase
NENP Non-entry into the national phase

Ref country code: JP

WWW Wipo information: withdrawn in national office

Country of ref document: JP