EP2008268A1 - Method and apparatus for building grammars with lexical semantic clustering in a speech recognizer - Google Patents
Method and apparatus for building grammars with lexical semantic clustering in a speech recognizer
- Publication number
- EP2008268A1 (application number EP07719561A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- phrases
- collected
- semantic
- semantic concepts
- vector
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L 15/00—Speech recognition
- G10L 15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L 15/063—Training
- G10L 2015/0631—Creating reference templates; Clustering
- G10L 15/08—Speech classification or search
- G10L 15/18—Speech classification or search using natural language modelling
- G10L 15/1815—Semantic context, e.g. disambiguation of the recognition hypotheses based on word meaning
- G10L 15/183—Speech classification or search using natural language modelling using context dependencies, e.g. language models
Definitions
- TITLE: METHOD AND APPARATUS FOR BUILDING GRAMMARS WITH LEXICAL SEMANTIC CLUSTERING IN A SPEECH RECOGNIZER
- The present invention relates to speech recognition systems, and more particularly to a method and system for building grammars with lexical semantic clustering.
- Automated speech applications allow a person to interact with a computer-implemented system using their voice and ears in much the same manner as interacting with another person.
- Such systems utilize automated speech recognition technology, which interprets the spoken words from a person and translates them into a form that is semantically meaningful to a computer, for example strings or other types of digital data or information.
- The context free grammar represents the designer's best prediction of what a person will say in response to a particular question or prompt posed by the system.
- For a simple question, a context free grammar can be provided which successfully predicts the spoken responses that will be made by all the system's users.
- As the phraseology expands, for example with an open-ended question, it becomes increasingly difficult to predict a priori all the responses and variations that will be provided by a user.
- The speech recognizer attempts to emulate the human ability to understand language.
- However, the speech recognizer has no ability to understand natural language as the human brain can.
- The speech recognizer simply executes computer code that identifies phonemes in the digitized sound wave generated by a person's voice and then attempts to find a corresponding phrase in the provided grammar that has a similar sequence of phonemes. It is typically the responsibility of the speech application to associate a semantic meaning with the results of the speech recognizer, and in many cases the associated semantics are manually determined.
- The design of a context free grammar for a speech application typically involves two design considerations.
- The first design consideration comprises predicting phraseology that encompasses all the possible responses that may be given by a user to the questions or prompts posed by the speech application.
- The second design consideration comprises providing a semantic interpretation or mapping for each possible response, i.e. word or phrase, that may be provided by a user of the system.
- The design of a system with open-ended questions presents particular challenges, because the large number of responses makes it very difficult to program a priori all or even most of the phraseology for the possible responses. It also becomes very difficult to determine a priori the set of semantic interpretations for mapping the phraseology or phrases corresponding to the responses.
- When semantic interpretations are manually associated with phrases, the sheer number of phrases makes this task time consuming, error prone, and costly.
- The present invention provides a method and system for creating a grammar module suitable for a speech application.
- The grammar module includes one or more semantic concepts.
- The semantic concepts are generated by clustering semantically similar phrases into groups, wherein each of the clustered phrases represents the same or a similar semantic concept.
- In one aspect, the present invention provides a method for creating a grammar module for use with a speech application, the method comprising the steps of: collecting phrases associated with one or more voice responses; transcribing the collected phrases into a machine-readable format; clustering selected ones of the collected phrases into one or more semantic concepts, wherein the selected collected phrases corresponding to each of the semantic concepts have a related meaning; and building a grammar module based on the collected phrases and the semantic concepts.
- In another aspect, the present invention provides a system for building a grammar module for a speech application, the system comprising: means for collecting phrases associated with one or more voice responses; means for transcribing the collected phrases into a machine-readable format; means for clustering selected ones of the collected phrases into one or more corresponding semantic concepts; and means for creating a grammar module based on the collected phrases and the semantic concepts.
- In a further aspect, the present invention provides a method for generating a grammar module for a speech application, the method comprising the steps of: collecting one or more phrases associated with one or more voice responses; transcribing the collected phrases into a machine-readable format; clustering selected ones of the collected phrases into one or more semantic concepts, wherein the selected collected phrases in each of the semantic concepts have a similar meaning; interpreting at least some of the semantic concepts; and building a grammar module based on the collected phrases, the semantic concepts and the interpreted semantic concepts.
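The claimed steps (collect, transcribe, cluster, build) can be sketched as a toy pipeline. Every function body below is an illustrative stub under our own assumptions, not the patent's implementation; the actual clustering technique is described later in the document.

```python
# Toy pipeline for the claimed steps: collect -> transcribe ->
# cluster -> build grammar. All bodies are illustrative stubs
# (assumptions); the patent does not prescribe implementations.

def transcribe(responses):
    # Stand-in: assume the collected responses are already text.
    return list(responses)

def cluster_phrases(phrases):
    # Stand-in: group exact duplicates; the patent's actual
    # lexical semantic clustering is described further below.
    concepts = {}
    for p in phrases:
        concepts.setdefault(p, []).append(p)
    return list(concepts.values())

def build_grammar(phrases, concepts):
    # A "grammar module" here is just a mapping from a concept id
    # to its member phrases.
    return {f"concept_{i}": c for i, c in enumerate(concepts)}

collected = ["coffee shop", "coffee shop", "gas station"]
phrases = transcribe(collected)
grammar = build_grammar(phrases, cluster_phrases(phrases))
print(grammar)
```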
- FIG. 1 shows in diagrammatic form a networked communication system incorporating a voice recognition mechanism according to an embodiment of the present invention.
- FIG. 2 shows in flowchart form a method for building a grammar module according to an embodiment of the present invention.
- FIG. 3 shows in flowchart form a method for building a grammar module according to another embodiment of the present invention.
- Fig. 1 shows in diagrammatic form a voice based communication system 100 incorporating a speech recognition mechanism and techniques according to the present invention.
- The voice based communication system 100 comprises a telecommunication network 110 and a voice application 120.
- The telecommunication network 110 may comprise, for example, a public or a private telephone or voice network, or a combination thereof.
- The voice application 120, in the context of the following description, comprises a voice node 130 and a speech application server 140.
- The speech application server 140 runs or executes a speech application 142, e.g. a standalone computer program, software module, code component or function.
- The voice node 130 includes a speech recognizer, indicated generally by reference 132.
- The speech recognizer 132 comprises a software module or engine which converts voice signals or speech samples into digital data or other forms of data recognized by the speech application server 140; in the other direction, the speech recognizer 132 converts the digital data or voice information generated by the speech application 142 into vocalizations or other types of audible signals.
- The speech recognizer 132 includes a grammar module according to an embodiment of the invention, indicated generally by reference 150.
- In the context of a speech application, a large number or sample of spoken answers is typically collected empirically for each question that is or may be posed by the application.
- The phrases are collected from a population that is representative of the users of the speech application.
- The collection of phrases, typically tens of thousands in number, is termed the phraseology.
- In a speech application, the phraseology is typically dominated by phrases that are in-context, i.e. phrases that comprise on-topic responses to the question posed by the application.
- However, most speech applications are designed to accommodate a statistically significant number of phrases that are out-of-context. Out-of-context phrases are not consistent with the question posed but, in the larger context of the speech application, may still have some relevance.
- Embodiments of the present invention provide a mechanism or process for building a grammar module for the speech application which can accommodate both in-context and out-of-context phrases and which includes lexical clustering according to an aspect of the invention.
- Users employ telecommunication devices, for example a fixed line telephone set 112 or wireless or cellular communication devices 114, to communicate with each other via the telecommunication network 110 by dialing the directory number or DN associated with another user's telephone.
- The voice node 130 is also assigned a directory number, and a user dials the directory number of the voice node 130 to initiate a call session with the speech application 142 running on the speech application server 140.
- The speech application 142 may comprise, for example, a business listings directory accessed by voice commands.
- The voice node 130 handles the call from the telephone set 112 or the communication device 114 of a user, and the speech recognizer 132 handles the conversion of voice signals.
- The speech application server 140 controls or handles the call session.
- The speech application 142 running on the server 140 will typically execute several dialog forms.
- The speech application 142 prompts the user with one or more questions, waits for a response from the user, and then provides further prompts or processing, as dictated by the particular application.
- The speech recognizer 132 converts the prompts generated by the speech application 142 into corresponding vocalizations or other types of voice or audible signals.
- In the other direction, the speech recognizer 132 converts the responses provided by the user into corresponding digital data.
- The grammar module 150 is utilized by the speech recognizer 132 and provides a mechanism for building a grammar base or module for use by the speech application 142.
- The speech recognizer 132 and speech application 142 are implemented as software on the voice node 130 and the speech application server 140, respectively, and may comprise a standalone computer program, a component of software in a larger program, a plurality of separate programs, or software, hardware, firmware or any combination thereof.
- The particular details or programming specifics for implementing software, computer programs or computer code components or functions for performing the operations or functions associated with the embodiments of the present invention will be readily understood by those skilled in the art. While described in the context of a voice-based networked communication system, it will be appreciated that the present invention has wider applicability and is suitable for other types of voice-based or speech recognition applications.
- Fig. 2 shows in flowchart form a method 200 according to one embodiment of the invention for creating or generating a grammar module, for example the grammar module 150 (Fig. 1) for the speech application 142 running on the speech application server 140 (Fig. 1).
- A user of the speech application 142 initiates a call from a telecommunications device, for example a cellular phone 114, over the telecommunication (e.g. a public or private telephone) network 110.
- The voice node 130 and the speech recognizer 132 handle the call from the user, and the speech application server 140 handles the call session.
- The speech application 142 executes several dialog forms, which include prompting the user, i.e. the calling party, with a question, and then listening for the caller's response.
- The responses or replies received from the caller are handled by the speech recognizer 132, which utilizes the grammar module 150.
- The process according to an embodiment of the invention provides for the creation of the grammar module 150 comprising semantic concepts and context free grammars for open-ended questions, i.e. questions that can have a large number of distinct responses. For example, in a speech accessible business directory, the question "what type of business are you looking for" can result in 10,000 or more distinct responses.
- The first step, indicated by block 210, involves the collection of a large number or sample of spoken responses.
- The spoken responses are typically collected from a population that is statistically representative of the population that will be using the speech application 142 (Fig. 1).
- Ideally, the environment in which the phrases are collected accurately simulates the anticipated environment of the speech application.
- The words and sentence structure chosen by a person to respond to a question can depend on several environmental factors, including, but not limited to: the time of day; the communication medium; the person's location; and, perhaps most significantly, the knowledge that the person's conversational partner is an automated computer system.
- The next step in the process 200 comprises transcribing the collected phrases to text or some other digitized form.
- The collected and transcribed phrases are saved in a digital transcription file 222, which is stored as part of a database or in computer memory, for example in the voice node 130 (Fig. 1) or the speech application server 140 (Fig. 1).
- The next step, indicated by block 230, comprises clustering the phrases from the transcription file 222.
- A computer-implemented clustering process or algorithm is applied to the transcribed phrases in the file 222 to cluster semantically similar phrases into groups called semantic concepts. For example, the phrases "my car needs gasoline" and "my auto requires petrol" belong to the same semantic concept, because they have the same semantic meaning.
- The clustering algorithm or process provides lexical semantic clustering, and according to one embodiment, the clustering algorithm proceeds as follows:
- The lexical semantic clustering algorithm begins by initializing the set of semantic concepts C to an empty set. Next, each phrase is compared to the semantic concepts in C. Because C is initially empty, the first phrase always begins a new semantic concept, which is added to the semantic concepts set C. For each subsequent phrase p, the phrase p is compared to each semantic concept to find the semantic concept whose phrases are most similar to the phrase p.
- The function S computes the similarity between a phrase and a semantic concept, as described in more detail below.
- If the similarity is sufficiently high, the phrase p is added to that semantic concept; otherwise, the phrase p becomes the seed of a new semantic concept.
- The algorithm terminates when all of the transcribed phrases have been analyzed, at which point C contains the set of semantic concepts.
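The greedy loop just described can be sketched in Python. This is a hypothetical reconstruction, not the patent's own pseudocode: the similarity function S is stubbed with simple word overlap (Jaccard), and the threshold value is an assumption.

```python
# Greedy lexical semantic clustering, following the description
# above: each phrase joins the most similar existing concept or
# seeds a new one. The similarity function S and the threshold
# below are placeholder assumptions, not taken from the patent.

THRESHOLD = 0.3  # assumed cutoff for "sufficiently similar"

def similarity(phrase, concept):
    """Toy S: best word-overlap (Jaccard) between the phrase and
    the concept's member phrases."""
    a = set(phrase.split())
    return max(
        len(a & set(m.split())) / len(a | set(m.split()))
        for m in concept
    )

def cluster(phrases):
    concepts = []  # C is initialized to the empty set
    for p in phrases:
        if concepts:
            best = max(concepts, key=lambda c: similarity(p, c))
            if similarity(p, best) >= THRESHOLD:
                best.append(p)   # p joins its closest concept
                continue
        concepts.append([p])     # p seeds a new concept
    return concepts

print(cluster(["coffee shop", "coffee house", "gas station"]))
# [['coffee shop', 'coffee house'], ['gas station']]
```

In practice the word-overlap stub would be replaced by the weighted-vector similarity measure discussed later in the document.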
- The set of semantic concepts C is stored in a digital semantic concepts file 232, e.g. a phrase clusters file.
- Each semantic concept comprises a set of semantically equivalent phrases.
- The meaning or relevance of a semantic concept is typically determined by the context of the application.
- An aspect of the clustering operation in step 230 as described above involves quantitatively measuring the similarity between two phrases.
- Known methods for measuring similarity typically incorporate some form of vectorization of the phrases.
- The vocabulary size of the phraseology determines the dimension of the vector or vector space. For example, a phraseology comprising N distinct words results in an N-dimensional space, with each word being represented by a dimension.
- A particular phrase is represented by a vector having non-zero components for each word in the phrase.
- For example, the phrase "coffee shop" is represented as (0, ..., 0, 1, 0, ..., 0, 1, 0, ..., 0), where the two 1's correspond to the words "coffee" and "shop", and the 0's correspond to the words that are in the phraseology but not in the phrase "coffee shop".
- In this simple scheme, each component has either the value 0 or 1, indicating the absence or presence of a word in the phrase. It will be appreciated that this scheme has the disadvantage of treating all words with equal importance.
- To address this, the concept of information content can be applied to the vectorization of each phrase: the 0's remain, and for each word in a phrase, the corresponding vector component is assigned the information content of the word.
- The information content of a word w is -log2 P(w), where P(w) is the probability of the word w occurring.
- P(w) may be estimated as n_w / N, where n_w is the number of phrases containing the word w and N is the total number of phrases.
- More complex probability models, for example using n-grams and Bayes' Theorem, may also be applied.
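The information-content weighting just described can be made concrete with a small sketch. The four-phrase phraseology below is invented for illustration; only the formula -log2(n_w / N) comes from the text above.

```python
import math

# Information-content weighting as described above: each word's
# vector component is -log2 P(w), with P(w) = n_w / N estimated
# from the phraseology. The tiny sample phraseology is invented.

phraseology = ["coffee shop", "coffee house", "flower shop", "gas station"]
N = len(phraseology)

def info_content(word):
    n_w = sum(1 for p in phraseology if word in p.split())
    return -math.log2(n_w / N)

def vectorize(phrase):
    """Sparse phrase vector {word: weight}; absent words are implicit 0's."""
    return {w: info_content(w) for w in phrase.split()}

print(vectorize("coffee shop"))
# "coffee" occurs in 2 of 4 phrases, so its weight is -log2(0.5) = 1.0
```

Rarer words receive larger weights: a word occurring in only one of the four phrases gets weight -log2(0.25) = 2.0, twice that of a word occurring in half of them.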
- Even when two phrases have few words in common, the phrases can still be semantically similar.
- For example, the phrases "my car needs gasoline" and "my auto requires petrol" are semantically similar, but because these two exemplary phrases have few words in common, similarity measurements such as the Jaccard index or cosine similarity fail to identify the similarity.
- To address this, the clustering operation provides for the injection of synonymous terms.
- For example, the terms "auto" and "petrol" are inserted into the phrase vector as synonyms for the words "car" and "gasoline".
- The injected synonyms will typically have the same vector weight as the original word or term.
- In other embodiments, hypernyms and/or hyponyms are inserted into the phrase vector.
- In this case, the injected terms have a scaled weight which is less than that of the original term, because the injected terms have related, but not equivalent, semantics.
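The injection step above might be sketched as follows. The synonym and hypernym tables and the 0.5 scale factor are illustrative assumptions, not values from the patent; only the policy (same weight for synonyms, scaled-down weight for hypernyms) follows the text.

```python
# Injecting synonyms (same weight) and hypernyms (scaled weight)
# into a sparse {word: weight} phrase vector, per the description
# above. The lexical tables and the 0.5 scale factor are
# illustrative assumptions.

SYNONYMS = {"car": ["auto"], "gasoline": ["petrol"]}
HYPERNYMS = {"car": ["vehicle"]}
HYPERNYM_SCALE = 0.5  # related, but not equivalent, semantics

def expand(vector):
    expanded = dict(vector)
    for word, weight in vector.items():
        for syn in SYNONYMS.get(word, []):
            expanded.setdefault(syn, weight)                   # same weight
        for hyp in HYPERNYMS.get(word, []):
            expanded.setdefault(hyp, weight * HYPERNYM_SCALE)  # scaled down
    return expanded

expanded = expand({"my": 0.4, "car": 2.0, "needs": 1.5, "gasoline": 2.0})
print(expanded)
# adds auto: 2.0, petrol: 2.0, vehicle: 1.0
```

After expansion, "my car needs gasoline" and "my auto requires petrol" share the dimensions auto/car and petrol/gasoline, so a vector similarity measure can detect their relatedness.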
- According to another embodiment, the vectorization process can be improved further by applying a word sense tag or indicator to each word.
- For example, the word "glasses" can mean a container used for drinking, or eyewear.
- The word sense tag indicates which meaning of a word is intended.
- The word sense tag may be determined manually or algorithmically (e.g. through the execution of a computer program, function or code component). There may also be instances where a word sense tag cannot be determined, for example where there is ambiguity in the entire phrase.
- Ideally, each word, or most words, in the phrase is tagged with a word sense.
- Words with different senses are considered distinct, and if a word is determined to be ambiguous, then in the vector form each of its word senses is represented by a non-zero component.
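In vector form, the sense-tagging scheme above makes each (word, sense) pair its own dimension. The sketch below assumes an invented sense inventory and the "glasses" example from the text; the `word#sense` key format is our own convention.

```python
# Word-sense tagging in vector form, as described above: words
# with different senses become distinct dimensions, and an
# ambiguous word contributes one non-zero component per candidate
# sense. The sense inventory here is an invented illustration.

def vectorize_tagged(tagged_phrase):
    """tagged_phrase: list of (word, [candidate senses]) pairs."""
    vec = {}
    for word, senses in tagged_phrase:
        for sense in senses:
            vec[f"{word}#{sense}"] = 1.0  # one dimension per sense
    return vec

# "glasses" is left ambiguous between its two senses:
phrase = [("drinking", ["ingest"]), ("glasses", ["container", "eyewear"])]
print(vectorize_tagged(phrase))
```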
- The clustering operation includes determining the similarity between the phrases and the semantic concepts by performing a similarity measurement, for example a scalar similarity measure.
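One common scalar similarity measure of the kind mentioned is cosine similarity over the sparse weighted vectors. In the sketch below, comparing a phrase to a semantic concept via the concept's centroid (averaged member vectors) is our assumption; the document does not fix how member vectors are aggregated.

```python
import math

# Cosine similarity over sparse {word: weight} vectors, as one
# possible scalar similarity measure. Scoring a phrase against a
# concept's centroid is an assumption for illustration.

def cosine(u, v):
    dot = sum(w * v.get(word, 0.0) for word, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def centroid(vectors):
    """Average the member vectors of a semantic concept."""
    out = {}
    for vec in vectors:
        for word, w in vec.items():
            out[word] = out.get(word, 0.0) + w / len(vectors)
    return out

a = {"coffee": 1.0, "shop": 1.0}
b = {"coffee": 1.0, "house": 2.0}
print(round(cosine(a, b), 3))  # only "coffee" is shared
```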
- The clustering operation 230 and execution of the clustering algorithm yield a set of semantic concepts, which are stored in the semantic concepts file 232.
- In step 240, the process 200 uses the semantic concepts file 232 to build a grammar file or module 242 for the speech recognizer (i.e. the speech recognizer 132 in Fig. 1).
- The grammar module 242 (i.e. the module indicated by reference 150 in Fig. 1) comprises a machine-readable format and is used by the speech recognizer 132 to recognize or decode words and phrases in the responses provided by the user (for example, as described above); the decoded speech is then provided to the speech application 142 (Fig. 1) for further processing according to the application.
- Fig. 3 shows in flowchart form a process 300 according to another embodiment for creating or generating a grammar module for a speech application, for example as described above for Fig. 1.
- The process 300 is similar to the process 200 of Fig. 2, and includes a collect phrases operation (step 310), a transcribe phrases operation (step 320), creation of a transcription file (reference 322), a cluster phrases operation (step 330) and creation of a semantics file (reference 332).
- The process 300 performs or executes these operations in a manner similar to that described above for the process 200 of Fig. 2.
- In addition, the process 300 includes a semantic interpretation operation in step 340.
- The semantic interpretation step 340 operates to create a semantic interpretation for each semantic concept in C, and the semantic interpretations are stored in a file denoted by reference 342.
- The semantic interpretation operation in step 340 typically comprises a manual process, which is performed by a person skilled in the appropriate domain.
- The build grammar operation in step 350 builds a machine-readable grammar file 352.
- The grammar file 352 also includes the semantic interpretations, which are converted to a machine-readable format and embedded with the grammar elements. The implementation details associated with this operation will be within the understanding of one skilled in the art.
- The processes and clustering algorithm according to the present invention allow semantically equivalent phrases to be grouped together, which in turn provides the capability to organize and identify distinct semantic concepts present in the phraseology of interest or relevant to a particular speech application.
- When the phraseology is sufficiently large and the semantic interpretations are determined using a manual process, the creation of semantic concepts can greatly reduce the manual effort, because semantic interpretations need only be done for each semantic concept, and not for every phrase.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US79235006P | 2006-04-17 | 2006-04-17 | |
PCT/CA2007/000634 WO2007118324A1 (en) | 2006-04-17 | 2007-04-17 | Method and apparatus for building grammars with lexical semantic clustering in a speech recognizer |
Publications (2)
Publication Number | Publication Date |
---|---|
EP2008268A1 (de) | 2008-12-31 |
EP2008268A4 (de) | 2010-12-22 |
Family
ID=38609002
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP07719561A Withdrawn EP2008268A4 (de) | 2006-04-17 | 2007-04-17 | Verfahren und vorrichtung zum aufbau von grammatiken mit lexikalischer, semantischer clusterung in einer spracherkennungsvorrichtung |
Country Status (3)
Country | Link |
---|---|
EP (1) | EP2008268A4 (de) |
CA (1) | CA2643930A1 (de) |
WO (1) | WO2007118324A1 (de) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9690771B2 (en) | 2014-05-30 | 2017-06-27 | Nuance Communications, Inc. | Automated quality assurance checks for improving the construction of natural language understanding systems |
US10515150B2 (en) * | 2015-07-14 | 2019-12-24 | Genesys Telecommunications Laboratories, Inc. | Data driven speech enabled self-help systems and methods of operating thereof |
US10455088B2 (en) | 2015-10-21 | 2019-10-22 | Genesys Telecommunications Laboratories, Inc. | Dialogue flow optimization and personalization |
US10382623B2 (en) | 2015-10-21 | 2019-08-13 | Genesys Telecommunications Laboratories, Inc. | Data-driven dialogue enabled self-help systems |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2001046945A1 (en) * | 1999-12-20 | 2001-06-28 | British Telecommunications Public Limited Company | Learning of dialogue states and language model of spoken information system |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6173261B1 (en) * | 1998-09-30 | 2001-01-09 | At&T Corp | Grammar fragment acquisition using syntactic and semantic clustering |
US5794193A (en) * | 1995-09-15 | 1998-08-11 | Lucent Technologies Inc. | Automated phrase generation |
US5860063A (en) * | 1997-07-11 | 1999-01-12 | At&T Corp | Automated meaningful phrase clustering |
US6317707B1 (en) * | 1998-12-07 | 2001-11-13 | At&T Corp. | Automatic clustering of tokens from a corpus for grammar acquisition |
US6415248B1 (en) * | 1998-12-09 | 2002-07-02 | At&T Corp. | Method for building linguistic models from a corpus |
AU5451800A (en) * | 1999-05-28 | 2000-12-18 | Sehda, Inc. | Phrase-based dialogue modeling with particular application to creating recognition grammars for voice-controlled user interfaces |
2007
- 2007-04-17: EP application filed as EP07719561A (published as EP2008268A4; status: withdrawn)
- 2007-04-17: PCT application PCT/CA2007/000634 filed (WO2007118324A1; active application filing)
- 2007-04-17: CA application filed as CA002643930A (CA2643930A1; status: abandoned)
Non-Patent Citations (1)
Title |
---|
See also references of WO2007118324A1 * |
Also Published As
Publication number | Publication date |
---|---|
WO2007118324A1 (en) | 2007-10-25 |
EP2008268A4 (de) | 2010-12-22 |
CA2643930A1 (en) | 2007-10-25 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | PUAI | Public reference made under article 153(3) EPC to a published international application that has entered the European phase | Free format text: ORIGINAL CODE: 0009012 |
20081110 | 17P | Request for examination filed | |
 | AK | Designated contracting states | Kind code of ref document: A1. Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LI LT LU LV MC MT NL PL PT RO SE SI SK TR |
 | AX | Request for extension of the european patent | Extension state: AL BA HR MK RS |
20101118 | A4 | Supplementary search report drawn up and despatched | |
 | RIC1 | Information provided on ipc code assigned before grant | Ipc: G10L 15/06 20060101AFI20101112BHEP |
 | STAA | Information on the status of an ep patent application or granted ep patent | Status: the application is deemed to be withdrawn |
20101102 | 18D | Application deemed to be withdrawn | |