WO2007118324A1 - Method and apparatus for building grammars by lexical semantic clustering in a speech recognition system

Info

Publication number
WO2007118324A1
Authority
WO
WIPO (PCT)
Prior art keywords
phrases
collected
semantic
semantic concepts
vector
Prior art date
Application number
PCT/CA2007/000634
Other languages
English (en)
Inventor
Kenneth Todd Reed
Original Assignee
Call Genie Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Call Genie Inc.
Priority to EP07719561A (EP2008268A4)
Priority to CA002643930A (CA2643930A1)
Publication of WO2007118324A1

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/06: Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/08: Speech classification or search
    • G10L15/18: Speech classification or search using natural language modelling
    • G10L15/1815: Semantic context, e.g. disambiguation of the recognition hypotheses based on word meaning
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/08: Speech classification or search
    • G10L15/18: Speech classification or search using natural language modelling
    • G10L15/183: Speech classification or search using natural language modelling using context dependencies, e.g. language models
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/06: Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/063: Training
    • G10L2015/0631: Creating reference templates; Clustering

Definitions

  • TITLE METHOD AND APPARATUS FOR BUILDING GRAMMARS
  • The present invention relates to speech recognition systems, and more particularly to a method and system for building grammars with lexical semantic clustering.
  • Automated speech applications allow a person to interact with a computer-implemented system using their voice and ears in much the same manner as interacting with another person.
  • Such systems utilize automated speech recognition technology, which interprets the spoken words from a person and translates them into a form that is semantically meaningful to a computer, for example, strings or other types of digital data or information.
  • the context free grammar represents the designer's best prediction of what a person will say in response to a particular question or prompt posed by the system.
  • a context free grammar can be provided which successfully predicts the spoken responses that will be made by all the system's users.
  • As the phraseology expands, for example, with an open-ended question, it becomes increasingly difficult to predict a priori all the responses and variations that will be provided by a user.
  • the speech recognizer attempts to emulate the human ability to understand language.
  • the speech recognizer has no ability to understand natural language as the human brain can.
  • the speech recognizer simply executes computer code that identifies phonemes in the digitized sound wave generated by a person's voice and then attempts to find a corresponding phrase in the provided grammar that has a similar sequence of phonemes. It is typically the responsibility of the speech application to associate a semantic meaning to the results of the speech recognizer. And in many cases, the associated semantics are manually determined.
  • the design of a context free grammar for a speech application typically involves two design considerations.
  • the first design consideration comprises predicting phraseology that encompasses all the possible responses that may be given by a user to the questions or prompts posed by the speech application.
  • the second design consideration comprises providing a semantic interpretation or mapping for each possible response, i.e. word or phrase, that may be provided by a user of the system.
  • the design of a system with open-ended questions presents particular challenges because the large number of responses makes it very difficult to program a priori all or even most of the phraseology for the possible responses. It also becomes very difficult to determine a priori the set of semantic interpretations for mapping the phraseology or phrases corresponding to the responses.
  • If semantic interpretations are manually associated with phrases, the sheer number of phrases makes this task time-consuming, error-prone, and costly.
  • the present invention provides a method and system for creating a grammar module suitable for a speech application.
  • the grammar module includes one or more semantic concepts.
  • the semantic concepts are generated by clustering semantically similar phrases into groups, wherein each of the clustered phrases represents the same or a similar semantic concept.
  • The present invention provides a method for creating a grammar module for use with a speech application, the method comprising the steps of: collecting phrases associated with one or more voice responses; transcribing the collected phrases into a machine-readable format; clustering selected ones of the collected phrases into one or more semantic concepts, wherein the selected collected phrases corresponding to each of the semantic concepts have a related meaning; and building a grammar module based on the collected phrases and the semantic concepts.
  • The present invention provides a system for building a grammar module for a speech application, the system comprising: means for collecting phrases associated with one or more voice responses; means for transcribing the collected phrases into a machine-readable format; means for clustering selected ones of the collected phrases into one or more corresponding semantic concepts; and means for creating a grammar module based on the collected phrases and the semantic concepts.
  • The present invention provides a method for generating a grammar module for a speech application, the method comprising the steps of: collecting one or more phrases associated with one or more voice responses; transcribing the collected phrases into a machine-readable format; clustering selected ones of the collected phrases into one or more semantic concepts, wherein the selected collected phrases in each of the semantic concepts have a similar meaning; interpreting at least some of the semantic concepts; and building a grammar module based on the collected phrases, the semantic concepts and the interpreted semantic concepts.
  • FIG. 1 shows in diagrammatic form a networked communication system incorporating a voice recognition mechanism according to an embodiment of the present invention.
  • FIG. 2 shows in flowchart form a method for building a grammar module according to an embodiment of the present invention.
  • FIG. 3 shows in flowchart form a method for building a grammar module according to another embodiment of the present invention.
  • Fig. 1 shows in diagrammatic form a voice based communication system 100 incorporating a speech recognition mechanism and techniques according to the present invention.
  • the voice based communication system 100 comprises a telecommunication network 110 and a voice application 120.
  • the telecommunication network 110 may comprise, for example, a public or a private telephone or voice network or a combination thereof.
  • the voice application 120 in the context of the following description comprises a voice node 130 and a speech application server 140.
  • the speech application server 140 runs or executes a speech application 142, e.g. a standalone computer program or software module or code component or function.
  • the voice node 130 includes a speech recognizer indicated generally by reference 132.
  • the speech recognizer 132 comprises a software module or engine which converts voice signals or speech samples into digital data or other forms of data which are recognized by the speech application server 140, and in the other direction, the speech recognizer 132 converts the digital data or voice information generated by the speech application 142 into vocalizations or other types of audible signals.
  • the speech recognizer 132 includes a grammar module according to an embodiment of the invention and indicated generally by reference 150.
  • In the context of a speech application, a large number or sample of spoken answers is typically collected empirically for each question that is or may be posed by the application.
  • The phrases are collected from a population that is representative of the users of the speech application.
  • The collection of phrases, typically tens of thousands in number, is termed the phraseology.
  • In a speech application, the phraseology is typically dominated by phrases that are in-context, i.e. phrases that comprise on-topic responses to the question posed by the application.
  • Most speech applications are designed to accommodate a statistically significant number of phrases that are out-of-context. Out-of-context phrases are not consistent with the question posed but, in the larger context of the speech application, may still have some relevance.
  • embodiments of the present invention provide a mechanism or process for building a grammar module for the speech application which can accommodate both in-context and out-of-context phrases and which includes lexical clustering according to an aspect of the invention.
  • Users employ telecommunication devices, for example, a fixed line telephone set 112 or wireless or cellular communication devices 114, to communicate with each other via the telecommunication network 110 by dialing the directory number (DN) associated with another user's telephone.
  • The voice node 130 is also assigned a directory number, and a user dials the directory number of the voice node 130 to initiate a call session with the speech application 142 running on the speech application server 140.
  • the speech application 142 may comprise, for example, a business listings directory accessed by voice commands.
  • the voice node 130 handles the call from the telephone set 112 or the communication device 114 of a user, and the speech recognizer 132 handles the conversion of voice signals, e.g.
  • the speech application server 140 controls or handles the call session.
  • the speech application 142 running on the server 140 will typically execute several dialog forms.
  • the speech application 142 prompts the user with one or more questions, waits for a response from the user, and then provides further prompts or processing, as dictated by the particular application.
  • the speech recognizer 132 converts the prompts generated by the speech application 142 into corresponding vocalizations or other types of voice or audible signals.
  • the speech recognizer 132 converts the responses provided by the user into corresponding digital data.
  • the grammar module 150 is utilized by the speech recognizer 132 and provides a mechanism for building a grammar base or module for use by the speech application 142.
  • the speech recognizer 132 and speech application 142 are implemented as software on the voice node 130 and the speech application server 140, respectively, and may comprise a standalone computer program, a component of software in a larger program, or a plurality of separate programs, or software, hardware, firmware or any combination thereof.
  • the particular details or programming specifics for implementing software, computer programs or computer code components or functions for performing the operations or functions associated with the embodiments of the present invention will be readily understood by those skilled in the art. While described in the context of a voice-based networked communication system, it will be appreciated that the present invention has wider applicability and is suitable for other types of voice-based or speech recognition applications.
  • Fig. 2 shows in flowchart form a method 200 according to one embodiment of the invention for creating or generating a grammar module, for example, the grammar module 150 (Fig. 1) for the speech application 142 running on the speech application server 140 (Fig. 1).
  • a user of the speech application 142 initiates a call from a telecommunications device, for example, a cellular phone 114, over the telecommunication (e.g. a public or private telephone) network 110.
  • the voice node 130 and the speech recognizer 132 handle the call from the user, and the speech application server 140 handles the call session.
  • the speech application 142 executes several dialog forms, which include prompting the user, i.e. calling party, with a question, and then listening for the caller's response.
  • the responses or replies received from the caller are handled by the speech recognizer 132, which utilizes the grammar module 150.
  • the process according to an embodiment of the invention provides for the creation of the grammar module 150 comprising semantic concepts and context free grammars for open-ended questions, i.e. questions that can have a large number of distinct responses. For example, in a speech accessible business directory, the question "what type of business are you looking for" can result in 10,000 or more distinct responses.
  • the first step indicated by block 210 involves the collection of a large number or sample of spoken responses.
  • the spoken responses are typically collected from a population that is statistically representative of the population that will be using the speech application 142 (Fig. 1).
  • the environment in which the phrases are collected will accurately simulate the anticipated environment of the speech application.
  • the words and sentence structure chosen by a person to respond to a question can depend on several environmental factors, including, but not limited to: the time of day; the communication medium; the person's location; and, perhaps most significantly, the knowledge that the person's conversational partner is an automated computer system.
  • the next step in the process 200 comprises transcribing the collected phrases to text or some other digitized form.
  • the collected and transcribed phrases are saved in a digital transcription file 222, which is stored as part of a database or in computer memory, for example, in the voice node 130 (Fig. 1) or the speech application server 140 (Fig. 1).
  • the next step indicated by block 230 comprises clustering the phrases from the transcription file 222.
  • A computer-implemented clustering process or algorithm is applied to the transcription phrases in the file 222 to cluster semantically similar phrases into groups called semantic concepts. For example, the phrases "my car needs gasoline" and "my auto requires petrol" belong to the same semantic concept, because they have the same semantic meaning.
  • the clustering algorithm or process provides lexical semantic clustering, and according to one embodiment, the clustering algorithm may be implemented as described by the following pseudo code:
  • The lexical semantic clustering algorithm begins by initializing the set of semantic concepts C to an empty set. Next, each phrase is compared to the semantic concepts in C. Because C is initially empty, the first phrase always begins a new semantic concept, which is added to C. Each subsequent phrase p is compared to each semantic concept to find the semantic concept whose phrases are most similar to p.
  • The function S computes the similarity between a phrase and a semantic concept, as described in more detail below.
  • If the phrase p is sufficiently similar to that semantic concept, the phrase p is added to the semantic concept; otherwise, the phrase p becomes the seed of a new semantic concept.
  • The algorithm terminates when all of the transcribed phrases have been analyzed, at which point C contains the set of semantic concepts.
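
The clustering loop described above can be sketched in Python. This is a minimal illustration, not the pseudo code of the original filing: the word-overlap similarity, the choice of S as the best match against any phrase already in a concept, the `threshold` value of 0.5, and all function names are assumptions made for the example.

```python
from typing import Callable, List

Phrase = str
Concept = List[Phrase]


def word_overlap(a: Phrase, b: Phrase) -> float:
    """Illustrative phrase similarity: shared words over all distinct words."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    if not wa and not wb:
        return 1.0
    return len(wa & wb) / len(wa | wb)


def concept_similarity(phrase: Phrase, concept: Concept) -> float:
    """Assumed form of S(p, c): best match against any phrase in the concept."""
    return max(word_overlap(phrase, member) for member in concept)


def cluster_phrases(
    phrases: List[Phrase],
    threshold: float = 0.5,  # assumed cut-off; the source gives no value
    S: Callable[[Phrase, Concept], float] = concept_similarity,
) -> List[Concept]:
    concepts: List[Concept] = []  # C starts as the empty set
    for p in phrases:
        best, best_score = None, 0.0
        for c in concepts:
            score = S(p, c)
            if best is None or score > best_score:
                best, best_score = c, score
        if best is not None and best_score >= threshold:
            best.append(p)  # p joins the most similar concept
        else:
            concepts.append([p])  # p seeds a new concept
    return concepts


if __name__ == "__main__":
    sample = ["coffee shop", "coffee shop downtown", "car repair", "auto car repair"]
    for concept in cluster_phrases(sample):
        print(concept)
```

Because each phrase is assigned as soon as it is read, the result of this single-pass scheme can depend on the order in which the phrases are processed.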
  • The set of semantic concepts C is stored in a digital semantic concepts file 232, e.g. a phrase clusters file.
  • Each semantic concept in C comprises a set of semantically equivalent phrases.
  • the meaning or relevance of the semantic concept is typically determined by the context of the application.
  • An aspect of the clustering operation in step 230 as described above involves quantitatively measuring the similarity between two phrases.
  • Known methods for measuring similarity typically incorporate some form of vectorization of the phrases.
  • the vocabulary size of the phraseology determines the dimension of the vector or vector space. For example, a phraseology comprising N distinct words results in an N dimensional space with each word being represented by a dimension.
  • a particular phrase is represented by a vector having non-zero components for each word in the phrase.
  • The phrase "coffee shop" is represented as (0, ..., 0, 1, 0, ..., 0, 1, 0, ..., 0), where the two 1's correspond to the words "coffee" and "shop", and the 0's correspond to the words that are in the phraseology but not in the phrase "coffee shop".
  • Each component has either the value 0 or 1, indicating the absence or the presence of a word in a phrase. It will be appreciated that this scheme has the disadvantage of treating all words with equal importance.
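
A short Python sketch of this binary vectorization follows. The helper names `build_vocabulary` and `binary_vector`, and the alphabetical ordering of dimensions, are assumptions for illustration only.

```python
def build_vocabulary(phrases):
    """Assign one vector dimension to each distinct word in the phraseology."""
    words = sorted({w for p in phrases for w in p.lower().split()})
    return {w: i for i, w in enumerate(words)}


def binary_vector(phrase, vocab):
    """Return a 0/1 vector marking which vocabulary words occur in the phrase."""
    vec = [0] * len(vocab)
    for w in phrase.lower().split():
        if w in vocab:
            vec[vocab[w]] = 1
    return vec


if __name__ == "__main__":
    phraseology = ["coffee shop", "pet shop", "car wash"]
    vocab = build_vocabulary(phraseology)  # car, coffee, pet, shop, wash
    print(binary_vector("coffee shop", vocab))
```

With N distinct words in the phraseology, every phrase maps to a vector of length N, which matches the N dimensional space described above.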
  • The concept of information content can be applied to the vectorization of each phrase, wherein the 0's remain, and for each word in a phrase, the corresponding vector component is assigned the information content of the word.
  • The information content for a word w is $-\log_2 P(w)$, where $P(w)$ is the probability of the word w occurring.
  • $P(w)$ is $f_w / N$, where $f_w$ is the number of phrases containing the word w and N is the number of phrases.
  • More complex probability models, for example, using n-grams and Bayes' Theorem, may be applied.
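
The information-content weighting can be sketched as follows, using the simple phrase-frequency estimate of P(w) given above. The function names and the sparse dictionary representation (absent words are implicitly 0) are assumptions for the example.

```python
import math
from collections import Counter


def information_content(phrases):
    """Return {word: -log2(f_w / N)} over a list of transcribed phrases."""
    n = len(phrases)
    # f_w counts phrases containing the word, so each phrase contributes once.
    phrase_counts = Counter(w for p in phrases for w in set(p.lower().split()))
    return {w: -math.log2(f / n) for w, f in phrase_counts.items()}


def weighted_vector(phrase, ic):
    """Sparse phrase vector whose non-zero components are information contents."""
    return {w: ic[w] for w in set(phrase.lower().split()) if w in ic}


if __name__ == "__main__":
    phraseology = ["coffee shop", "pet shop", "shoe shop", "car wash"]
    ic = information_content(phraseology)
    print(weighted_vector("coffee shop", ic))
```

In this sample the word "shop" appears in three of four phrases and so receives a much smaller weight than "coffee", which appears in only one.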
  • Phrases with few words in common can still be semantically similar.
  • For example, the phrases "my car needs gasoline" and "my auto requires petrol" are semantically similar, but because they have few words in common, similarity measurements such as Jaccard's or cosine fail to identify the similarity.
  • The clustering operation therefore provides for the injection of synonymous terms.
  • For example, the terms "auto" and "petrol" are inserted into the phrase vector as synonyms for the words "car" and "gasoline".
  • the injected synonyms will typically have the same vector weight as the original word or term.
  • hypernyms and/or hyponyms are inserted into the phrase vector.
  • the injected terms will have a scaled weight which is less than the original term, because the injected terms have related, but not equivalent, semantics.
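
A minimal sketch of synonym and hypernym injection on a sparse phrase vector is shown below. The `SYNONYMS` and `HYPERNYMS` tables are hypothetical stand-ins for whatever lexical resource is used, and the `scale` factor of 0.5 is an assumed value; the source only states that the scaled weight is less than the original.

```python
# Hypothetical lexical tables, for illustration only.
SYNONYMS = {"car": ["auto"], "gasoline": ["petrol"], "needs": ["requires"]}
HYPERNYMS = {"car": ["vehicle"], "gasoline": ["fuel"]}


def expand_vector(weights, synonyms=SYNONYMS, hypernyms=HYPERNYMS, scale=0.5):
    """Inject synonyms at full weight and hypernyms at a reduced weight."""
    expanded = dict(weights)
    for word, weight in weights.items():
        for syn in synonyms.get(word, []):
            # Synonyms carry the same weight as the original term.
            expanded[syn] = max(expanded.get(syn, 0.0), weight)
        for hyper in hypernyms.get(word, []):
            # Related but not equivalent terms get a scaled-down weight.
            expanded[hyper] = max(expanded.get(hyper, 0.0), weight * scale)
    return expanded


if __name__ == "__main__":
    print(expand_vector({"car": 2.0, "gasoline": 3.0}))
```

After expansion, the vectors for "my car needs gasoline" and "my auto requires petrol" share non-zero components, so an overlap-based similarity measure can register that they are related.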
  • the vectorization process can be improved further by applying a word sense tag or indicator for each word according to another embodiment.
  • For example, the word "glasses" can mean a container used for drinking, or eyewear.
  • the word sense tag indicates which meaning of a word is intended.
  • the word sense tag may be determined manually or algorithmically (e.g. through the execution of a computer program, function or code component). There may also be instances where a word sense tag cannot be determined, for example, where there is ambiguity in the entire phrase.
  • Each word, or most words, in the phrase is tagged with a word sense.
  • words with different senses are considered distinct, and if a word is determined to be ambiguous, then in the vector form, each word sense is represented by a non-zero component.
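
The word sense handling can be sketched as follows. The `SENSES` inventory, the `word#sense` labelling convention, and the function name are assumptions for illustration; how the sense is chosen (manually or algorithmically) is outside the scope of the sketch.

```python
# Hypothetical sense inventory; real systems would use a lexical database.
SENSES = {"glasses": ["glasses#drinkware", "glasses#eyewear"]}


def sense_components(phrase, chosen=None, senses=SENSES):
    """Build sparse vector components keyed by sense-tagged words.

    `chosen` maps a word to the sense tag selected for it. A word with
    several senses and no chosen tag is treated as ambiguous, and every
    one of its senses gets a non-zero component.
    """
    chosen = chosen or {}
    components = {}
    for word in phrase.lower().split():
        options = senses.get(word)
        if options is None:
            components[word] = 1.0  # single-sense word
        elif word in chosen:
            components[chosen[word]] = 1.0  # disambiguated word
        else:
            for sense in options:  # ambiguous: keep all senses
                components[sense] = 1.0
    return components


if __name__ == "__main__":
    print(sense_components("wine glasses", {"glasses": "glasses#drinkware"}))
    print(sense_components("glasses"))
```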
  • The clustering operation includes determining the similarity between the phrases and the semantic concepts by performing a similarity measurement, for example, a scalar similarity measure.
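
Two common scalar similarity measures, cosine and Jaccard, can be sketched on sparse phrase vectors as below. Representing vectors as `{word: weight}` dictionaries is an assumption made for the example.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two sparse vectors given as dicts."""
    dot = sum(weight * v.get(word, 0.0) for word, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def jaccard_similarity(u, v):
    """Shared non-zero components divided by all non-zero components."""
    keys_u = {k for k, w in u.items() if w != 0}
    keys_v = {k for k, w in v.items() if w != 0}
    union = keys_u | keys_v
    if not union:
        return 0.0
    return len(keys_u & keys_v) / len(union)


if __name__ == "__main__":
    a = {"coffee": 1.0, "shop": 1.0}
    b = {"coffee": 1.0, "house": 1.0}
    print(cosine_similarity(a, b), jaccard_similarity(a, b))
```

Cosine similarity takes the component weights into account, whereas Jaccard similarity only considers which components are non-zero.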
  • The clustering operation 230, through execution of the clustering algorithm, yields a set of semantic concepts, which are stored in the semantic concepts file indicated by reference 232.
  • In step 240, the process 200 uses the semantic concepts file 232 to build a grammar file or module 242 for the speech recognizer (i.e. the speech recognizer 132 in Fig. 1).
  • The grammar module 242 (i.e. indicated by reference 150 in Fig. 1) comprises a machine-readable format and is used by the speech recognizer 132 to recognize or decode words and phrases in the responses provided by the user (for example, as described above), and the decoded speech is then provided to the speech application 142 (Fig. 1) for further processing according to the application.
  • Fig. 3 shows in flowchart form a process 300 according to another embodiment for creating or generating a grammar module for a speech application, for example, as described above for Fig. 1.
  • the process 300 is similar to the process 200 of Fig. 2, and includes a collect phrases operation (step 310), a transcribe phrases operation (step 320), creation of a transcription file (reference 322), a cluster phrases operation (step 330) and creation of a semantics file (reference 332).
  • the process 300 performs or executes these operations in a manner similar to that described above for the process 200 of Fig. 2.
  • the process 300 includes a semantic interpretation operation in step 340.
  • the semantic interpretation step 340 operates to create a semantic interpretation for each semantic concept C, and the semantic interpretations are stored in a file denoted by reference 342.
  • the semantic interpretation operation in step 340 typically comprises a manual process, which is performed by a person skilled in the appropriate domain.
  • the build grammar operation in step 350 builds a machine-readable grammar file 352.
  • The grammar file 352 also includes the semantic interpretations, which are converted to a machine-readable format and embedded with the grammar elements. The implementation details associated with this operation will be within the understanding of one skilled in the art.
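
One way to picture the build grammar operation is the sketch below, which turns each semantic concept into a rule of alternative phrases with an embedded interpretation tag. The output format is purely illustrative, loosely resembling ABNF-style speech grammars, and is not taken from the source or validated against any recognizer; `build_grammar`, the `concept_i` rule names, and the `{tag}` syntax are all assumptions.

```python
def build_grammar(concepts, interpretations=None):
    """Render semantic concepts as text rules with embedded interpretations.

    `concepts` is a list of phrase lists; `interpretations` optionally maps
    a concept index to a manually assigned semantic interpretation.
    """
    interpretations = interpretations or {}
    lines = []
    rule_names = []
    for index, phrases in enumerate(concepts):
        name = f"concept_{index}"
        rule_names.append(name)
        # Fall back to the rule name when no interpretation was supplied.
        tag = interpretations.get(index, name)
        alternatives = " | ".join(phrases)
        lines.append(f"${name} = ({alternatives}) {{{tag}}};")
    root = " | ".join(f"${name}" for name in rule_names)
    lines.append(f"$root = {root};")
    return "\n".join(lines)


if __name__ == "__main__":
    concepts = [["coffee shop", "cafe"], ["car repair"]]
    print(build_grammar(concepts, {0: "COFFEE_SHOP"}))
```

Only one interpretation is needed per concept in this arrangement, regardless of how many phrases the concept contains.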
  • The processes and clustering algorithm according to the present invention allow semantically equivalent phrases to be grouped together, which in turn provides the capability to organize and identify the distinct semantic concepts present in the phraseology of interest or relevant to a particular speech application.
  • If the phraseology is sufficiently large and the semantic interpretations are determined using a manual process, the creation of semantic concepts can greatly reduce the manual effort, because a semantic interpretation need only be produced for each semantic concept, and not for every phrase.

Abstract

The present invention relates to a method and a system for building a grammar module for a speech application. The method comprises the step of clustering phrases having a semantic similarity. The grammar module comprises phrases in a machine-readable format and semantic concepts associated with the phrases. According to another aspect, the grammar module comprises embedded semantic interpretations that are associated with the semantic concepts.
PCT/CA2007/000634 2006-04-17 2007-04-17 Method and apparatus for building grammars by lexical semantic clustering in a speech recognition system WO2007118324A1 (fr)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP07719561A EP2008268A4 (fr) 2006-04-17 2007-04-17 Method and apparatus for building grammars by lexical semantic clustering in a speech recognition system
CA002643930A CA2643930A1 (fr) 2006-04-17 2007-04-17 Method and apparatus for building grammars by lexical semantic clustering in a speech recognition system

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US79235006P 2006-04-17 2006-04-17
US60/792,350 2006-04-17

Publications (1)

Publication Number Publication Date
WO2007118324A1 (fr) 2007-10-25

Family

ID=38609002

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CA2007/000634 WO2007118324A1 (fr) 2006-04-17 2007-04-17 Method and apparatus for building grammars by lexical semantic clustering in a speech recognition system

Country Status (3)

Country Link
EP (1) EP2008268A4 (fr)
CA (1) CA2643930A1 (fr)
WO (1) WO2007118324A1 (fr)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5794193A (en) * 1995-09-15 1998-08-11 Lucent Technologies Inc. Automated phrase generation
US5860063A (en) * 1997-07-11 1999-01-12 At&T Corp Automated meaningful phrase clustering
WO2000073936A1 (fr) * 1999-05-28 2000-12-07 Sehda, Inc. Phrase-based dialogue modeling with particular application to creating recognition grammars for voice-controlled user interfaces
US6173261B1 (en) * 1998-09-30 2001-01-09 At&T Corp Grammar fragment acquisition using syntactic and semantic clustering
US6317707B1 (en) * 1998-12-07 2001-11-13 At&T Corp. Automatic clustering of tokens from a corpus for grammar acquisition
US6415248B1 (en) * 1998-12-09 2002-07-02 At&T Corp. Method for building linguistic models from a corpus

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
ATE405918T1 (de) * 1999-12-20 2008-09-15 British Telecomm Learning of dialogue states and language models of the spoken information system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP2008268A4 *

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150347375A1 (en) * 2014-05-30 2015-12-03 Nuance Communications, Inc. Automated quality assurance checks for improving the construction of natural language understanding systems
US9690771B2 (en) * 2014-05-30 2017-06-27 Nuance Communications, Inc. Automated quality assurance checks for improving the construction of natural language understanding systems
US10339217B2 (en) 2014-05-30 2019-07-02 Nuance Communications, Inc. Automated quality assurance checks for improving the construction of natural language understanding systems
WO2017011343A1 (fr) * 2015-07-14 2017-01-19 Genesys Telecommunications Laboratories, Inc. Data driven speech enabled self-help systems and methods of operating thereof
AU2016291566B2 (en) * 2015-07-14 2019-11-21 Genesys Cloud Services Holdings II, LLC Data driven speech enabled self-help systems and methods of operating thereof
US10515150B2 (en) 2015-07-14 2019-12-24 Genesys Telecommunications Laboratories, Inc. Data driven speech enabled self-help systems and methods of operating thereof
US10382623B2 (en) 2015-10-21 2019-08-13 Genesys Telecommunications Laboratories, Inc. Data-driven dialogue enabled self-help systems
US10455088B2 (en) 2015-10-21 2019-10-22 Genesys Telecommunications Laboratories, Inc. Dialogue flow optimization and personalization
US11025775B2 (en) 2015-10-21 2021-06-01 Genesys Telecommunications Laboratories, Inc. Dialogue flow optimization and personalization

Also Published As

Publication number Publication date
EP2008268A4 (fr) 2010-12-22
CA2643930A1 (fr) 2007-10-25
EP2008268A1 (fr) 2008-12-31

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 07719561; Country of ref document: EP; Kind code of ref document: A1)
WWE WIPO information: entry into national phase (Ref document number: 2643930; Country of ref document: CA)
NENP Non-entry into the national phase (Ref country code: DE)
WWE WIPO information: entry into national phase (Ref document number: 2007719561; Country of ref document: EP)