US20020032564A1 - Phrase-based dialogue modeling with particular application to creating a recognition grammar for a voice-controlled user interface - Google Patents

Info

Publication number
US20020032564A1
Authority
US
United States
Prior art keywords
phrases
phrase
expression
equivalent
system
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US09/840,005
Inventor
Farzad Ehsani
Eva Knodt
Demitrios Master
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
SEHDA Inc
Original Assignee
SEHDA Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority to US19840200P priority Critical
Priority to US58005900A priority
Application filed by SEHDA Inc filed Critical SEHDA Inc
Priority to US09/840,005 priority patent/US20020032564A1/en
Assigned to SEHDA, INC. reassignment SEHDA, INC. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: EHSANI, FARZAD, MASTER, DEMITRIOS L., KNODT, EVA M.
Publication of US20020032564A1 publication Critical patent/US20020032564A1/en
Application status is Abandoned legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20Handling natural language data
    • G06F17/28Processing or translating of natural language
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20Handling natural language data
    • G06F17/27Automatic analysis, e.g. parsing
    • G06F17/2765Recognition
    • G06F17/2775Phrasal analysis, e.g. finite state techniques, chunking
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20Handling natural language data
    • G06F17/27Automatic analysis, e.g. parsing
    • G06F17/2795Thesaurus; Synonyms
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/08Speech classification or search
    • G10L15/18Speech classification or search using natural language modelling
    • G10L15/183Speech classification or search using natural language modelling using context dependencies, e.g. language models
    • G10L15/19Grammatical context, e.g. disambiguation of the recognition hypotheses based on word sequence rules
    • G10L15/193Formal grammars, e.g. finite state automata, context free grammars or word networks
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/08Speech classification or search
    • G10L15/18Speech classification or search using natural language modelling
    • G10L15/183Speech classification or search using natural language modelling using context dependencies, e.g. language models
    • G10L15/19Grammatical context, e.g. disambiguation of the recognition hypotheses based on word sequence rules
    • G10L15/197Probabilistic grammars, e.g. word n-grams
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/22Procedures used during a speech recognition process, e.g. man-machine dialogue
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/08Speech classification or search
    • G10L15/18Speech classification or search using natural language modelling
    • G10L15/183Speech classification or search using natural language modelling using context dependencies, e.g. language models

Abstract

The invention enables creation of grammar networks that can regulate, control, and define the content and scope of human-machine interaction in natural language voice user interfaces (NLVUI). The invention enables phrase-based modeling of generic structures of verbal interaction to be used for the purpose of automating part of the design of such grammar networks. Most particularly, the invention enables such grammar networks to be used in providing a voice-controlled user interface to human readable text data that is also machine-readable (such as a Web page, a word processing document, a PDF document, or a spreadsheet).

Description

    BACKGROUND OF THE INVENTION
  • 1. Field of the Invention [0001]
  • This invention relates to the creation of grammar networks that regulate, control, and define the content and scope of human-machine interaction in natural language voice user interfaces (NLVUI). More particularly, the invention relates to phrase-based modeling of generic structures of verbal interaction and use of these models for the purpose of automating part of the design of such grammar networks. Most particularly, the invention relates to the use of such grammar networks in providing a voice-controlled user interface to human readable text data that is also machine readable (such as a Web page, a word processing document, a PDF document, or a spreadsheet). [0002]
  • 2. Related Art [0003]
  • Voice user interfaces enable control of devices via voice commands transmitted through a microphone or telephone handset and decoded by a speech recognizer. These interfaces supplement or replace conventional input modalities such as a keyboard or a telephone touch-tone pad, and are increasingly deployed in a wide range of situations, where keyboard input is either inconvenient or impossible, e.g., to control home appliances, automotive devices, or applications accessed via the telephone. In recent years, a number of routine over-the-phone transactions such as voice dialing and collect call handling, as well as some commercial call center self-service applications, have been successfully automated with speech recognition technology. Such systems allow users to remotely access, for example, a banking application or ticket reservation system, and to retrieve information or complete simple transactions by using voice commands. Increasingly, voice control is being deployed to access the Internet by phone for the purpose of retrieving information or completing Internet-based commercial transactions such as making an on-line purchase. [0004]
  • a. Limitations and unsolved problems in current technology [0005]
  • Current technology limits the design of voice-controlled user interfaces in terms of both complexity and portability. Systems must be designed for a clearly defined task domain, and users are expected to respond to system prompts with short, fixed voice commands. Systems typically work well as long as vocabularies remain relatively small (200-500 words), choices at any point in the interaction remain limited and users interact with the system in a constrained, disciplined manner. [0006]
  • There are two major technological barriers that need to be overcome in order to create systems that allow for more spontaneous user interaction: (1) systems must be able to handle more complex tasks, and (2) the speech interface must become more “natural” if systems are expected to perform sophisticated functions based on unrestrained, natural speech or language input. [0007]
  • A major bottleneck is the complexity of the recognition grammar that enables the system to recognize natural language voice commands, interpret their meaning correctly, and respond appropriately. As indicated above, this grammar must anticipate, and thus explicitly spell out, the entire virtual space of possible user requests and/or responses to any given system prompt. To keep choices limited, the underlying recognition grammars typically process requests in a strictly predetermined, menu-driven order. [0008]
  • Another problem is portability. Current systems must be task specific, that is, they must be designed for a particular domain. An automated banking application cannot process requests about the weather, and, conversely, a system designed to provide weather information cannot complete banking transactions. Because recognition grammars are designed by hand and model domain specific rather than generic machine-human interaction, they cannot be easily modified or ported to another domain. Reusability is limited to certain routines that may be used in more than one system. Such routines consist of subgrammars for yes-no questions or personal user data collection required in many commercial transactions (e.g., for collecting name, addresses, credit card information, etc.). Usually, designing a system in a new domain means starting entirely from scratch. [0009]
  • Even though the need for generic dialogue models is widely recognized and a number of systems claim to be portable, no effective and commercially feasible technology for modeling generic aspects of conversational dialogue currently exists. [0010]
  • b. Current system design and implementation [0011]
  • The generated dialogue flow and the recognition grammar can be dauntingly complex for longer interactions. The reason is that users always manage to come up with new and unexpected ways to make even the simplest request, and all potential input variants must be anticipated in the recognition grammar. Designing such recognition grammars, usually by trained linguists, is extremely labor-intensive and costly. It typically starts with a designer's guess of what users might say and requires hours of refinement as field data is collected from real users interacting with a system simulation or a prototype. [0012]
  • c. Stochastic versus rule-based approaches to natural language processing [0013]
  • Since its beginnings, speech technology has oscillated between rule-governed approaches based on human expert knowledge and those based on statistical analysis of vast amounts of data. In the realm of acoustic modeling for speech recognition, probabilistic approaches have far outperformed models based on expert knowledge. In natural language processing (NLP), on the other hand, the rule-governed, theory-driven approach continued to dominate the field throughout the 1970's and 1980's. [0014]
  • In recent years, the increasing availability of large electronic text corpora has led to a revival of quantitative, computational approaches to NLP in certain domains. [0015]
  • One such domain is large vocabulary dictation. Because dictation covers a much larger domain than interactive voice-command systems (typically a 30,000 to 50,000 word vocabulary) and does not require an interpretation of the input, these systems deploy a language model rather than a recognition grammar to constrain the recognition hypotheses generated by the signal analyzer. A language model is computationally derived from large text corpora in the target domain (e.g., news text). N-gram language models contain statistical information about recurrent word sequences (word pairs, combinations of 3, 4, or n words). They estimate the likelihood that a given word is followed by another word, thus reducing the level of uncertainty in automatic speech recognition. For example, the word sequence “A bear attacked him” will have a higher probability in Standard English usage than the sequence “A bare attacked him.” [0016]
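  • The way an n-gram model ranks competing recognition hypotheses can be illustrated with a minimal bigram sketch. The toy corpus, the function names, and the add-alpha smoothing are assumptions made for illustration only; a real dictation system would be trained on a far larger corpus and would use a more careful smoothing scheme.

```python
from collections import Counter

def train_bigrams(sentences):
    """Count unigrams and bigrams over whitespace-tokenized sentences."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.lower().split()  # <s> marks sentence start
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams

def bigram_prob(unigrams, bigrams, prev, word, alpha=1.0):
    """Add-alpha smoothed estimate of P(word | prev) (smoothing is an assumption)."""
    vocab_size = len(unigrams)
    return (bigrams[(prev, word)] + alpha) / (unigrams[prev] + alpha * vocab_size)

def sentence_prob(unigrams, bigrams, sentence):
    """Multiply the conditional probabilities of each word given its predecessor."""
    tokens = ["<s>"] + sentence.lower().split()
    prob = 1.0
    for prev, word in zip(tokens, tokens[1:]):
        prob *= bigram_prob(unigrams, bigrams, prev, word)
    return prob

# Hypothetical toy corpus standing in for a large text collection.
TOY_CORPUS = [
    "a bear attacked him",
    "the bear ran away",
    "a bear slept",
    "his bare feet were cold",
]
UNIGRAMS, BIGRAMS = train_bigrams(TOY_CORPUS)

likely = sentence_prob(UNIGRAMS, BIGRAMS, "a bear attacked him")
unlikely = sentence_prob(UNIGRAMS, BIGRAMS, "a bare attacked him")
print(likely > unlikely)
```

  • On this toy corpus, the hypothesis containing “bear” scores higher than the acoustically identical hypothesis containing “bare,” because “a bear” and “bear attacked” were observed while “a bare” and “bare attacked” were not.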
  • Another domain where probabilistic models are beginning to be used is automated part-of-speech analysis. Part-of-speech analysis is necessary in interactive systems that require interpretation, that is, a conceptual representation of a given natural language input. Traditional part-of-speech analysis draws on explicit syntactical rules to parse natural language input by determining the parts of an utterance and the syntactic relationships among these parts. For example, the syntactical rule S → NP VP states that a sentence S consists of a noun phrase NP and a verb phrase VP. [0017]
  • Rule-based parsing methods perform poorly when confronted with syntactically ambiguous input that allows for more than one possible syntactic representation. In such cases, linguistic preferences captured by probabilistic models have been found to resolve a significant portion of syntactic ambiguity. [0018]
  • Statistical methods have also been applied to modeling larger discourse units, such as fixed phrases and collocations (words that tend to occur next to each other, e.g. “eager to please”). Statistical phrase modeling involves techniques similar to the ones used in standard n-gram language modeling, namely, collecting frequency statistics about word sequences in large text corpora (n-grams). However, not every n-gram is a valid phrase: for example, the sequence “the court went into” is a valid 4-gram in language modeling, but only “the court went into recess” is a phrase. A number of different methods have been used to derive valid phrases from n-grams, including syntactical filtering, mutual information, and entropy. In some cases, statistical modeling of phrase sequences has been found to reduce lexical ambiguity. Others have used a phrase-based statistical modeling technique to generate knowledge bases that can help lexicographers to determine relevant linguistic usage. [0019]
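  • Of the filtering methods named above, mutual information is the simplest to sketch. The following example scores adjacent word pairs by pointwise mutual information (PMI) and discards pairs seen fewer than a threshold number of times. The corpus, the threshold, and the function name are illustrative assumptions; this is not presented as the specific procedure used by the invention.

```python
import math
from collections import Counter

def pmi_scores(sentences, min_count=2):
    """Score adjacent word pairs by pointwise mutual information (base 2)."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    total_unigrams = sum(unigrams.values())
    total_bigrams = sum(bigrams.values())
    scores = {}
    for (first, second), count in bigrams.items():
        if count < min_count:  # rare pairs give unreliable estimates
            continue
        p_pair = count / total_bigrams
        p_first = unigrams[first] / total_unigrams
        p_second = unigrams[second] / total_unigrams
        scores[(first, second)] = math.log2(p_pair / (p_first * p_second))
    return scores

# Hypothetical toy corpus; real phrase extraction needs far more data.
TOY_CORPUS = [
    "the court went into recess",
    "the court went into recess again",
    "the dog went into the house",
    "the cat saw the dog",
]
SCORES = pmi_scores(TOY_CORPUS)
for pair, score in sorted(SCORES.items(), key=lambda item: -item[1]):
    print(pair, round(score, 2))
```

  • Pairs whose words occur together more often than their individual frequencies predict (such as “into recess”) receive higher scores than pairs involving a very frequent word (such as “the court”), which is the kind of signal that can help separate candidate phrases from incidental n-grams.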
  • Experiments in training probabilistic models of higher-level discourse units on conversational corpora have also been shown to significantly reduce the perplexity of a large-vocabulary continuous speech recognition task in the domain of spontaneous conversational speech. Others have modeled dialogue flow by using a hand-tagged corpus in which each utterance is labeled as an IFT (illocutionary force type). Probabilistic techniques have also been used to build predictive models of dialogue structures such as dialogue act sequences. The bottleneck in all of these experiments is the need for hand-tagging both training and testing corpora. [0020]
  • Another recent application of a probabilistic, phrase-based approach to NLP has been in the field of foreign language pedagogy, where it has been proposed as a new method of teaching foreign languages. Michael Lewis, in his book Implementing The Lexical Approach (Hove, Great Britain, 1997), challenges the conventional view that learning a language involves two separate cognitive tasks: first, learning the vocabulary of the language, and second, mastering the grammatical rules for combining words into sentences. The lexical approach proposes instead that mastering a language involves knowing how to use and combine phrases in the right way (which may or may not be grammatical). Phrases, in Lewis' sense, are fixed multi-word chunks of language whose likelihood of co-occurring in natural text is more than random. Mastering a language is the ability to use these chunks in a manner that produces coherent discourse without necessarily being rule-based. [0021]
  • SUMMARY OF THE INVENTION
  • In one aspect, the present invention concerns modeling generic aspects of interactive discourse based on statistical modeling of phrases in large amounts of conversational text data. It involves automatically extracting valid phrases from a given text corpus, and clustering these phrases into syntactically and/or semantically meaningful equivalent classes. Various existing statistical and computational techniques are combined in a new way to accomplish this end. The result is a large thesaurus of fixed word combinations or phrases, grouped in equivalence classes that contain similar phrases. This thesaurus provides a data structure in which variations of saying the same thing and their associated probabilities can be looked up quickly. To the extent that this phrase thesaurus groups similar or semantically equivalent phrases into classes along with probabilities of their occurrence, it contains an implicit probabilistic model of generic structures found in interactive discourse, and thus can be used to model interactions across a large variety of different contexts, domains, and languages. [0022]
  • In another aspect of the invention, the phrase thesaurus mentioned above functions as a key element of a software application that can be used to generate recognition grammars for voice-interactive dialogue systems. The thesaurus provides the linguistic knowledge necessary to automatically expand anticipated user responses into alternative linguistic variants. [0023]
  • In another aspect of the invention, the phrase thesaurus is used as part of a software application that can be used to generate recognition grammars from the source code of a web page or pages, including “interactive” part(s) of web page(s) (i.e., part(s) of web page(s) that prompt the user to provide textual information in form fields) and/or “non-interactive” part(s) of web page(s) (i.e., part(s) of web page(s) other than interactive parts, such as parts of web page(s) that enable navigation within a web page and/or between web pages). For interactive part(s) of a web page, the software application takes the form-field keyword(s) provided in the page source, constructs an interactive dialogue flow based on the sequence of keyword(s), and automatically generates recognition grammars for the anticipated user responses. For informational (non-interactive) part(s) of a web page, the software application can use the phrase thesaurus to automatically generate recognition grammars for identified headings or topics within the web page. This aspect of the invention can be used generally to generate a recognition grammar from any set of human readable text data that is also machine readable. Though Web pages are an important example of such text data with which the invention can be used, the invention can also be used with other such types of text data, such as text data created using a word processing program, PDF documents and text data created using a spreadsheet. [0024]
  • The present invention has a number of significant advantages over existing techniques for designing voice recognition grammars. Most significantly, it automates the most laborious aspects of recognition grammar design, namely, the need to generate, either by anticipation or by empirical sampling, potential variants of responses to any given system prompt. Secondly, it eliminates the need for expensive user data collection and hand coding of recognition grammars. Thirdly, the invention allows developers without specialized linguistic knowledge to design much more complex networks than conventional design techniques can support. In sum, the invention enables a developer to create more complex and better performing systems in less time and with fewer resources. [0025]
  • In another aspect of the invention, a compiled subset of the thesaurus (containing only the phrases incorporated into any given recognition grammar) is incorporated into a natural language understanding (NLU) component that parses the recognizer output at run-time to derive a conceptual meaning representation. Because phrases consist of words in context, they are potentially less ambiguous than isolated words. Because a phrase-based parser can draw on the linguistic knowledge stored in a large probabilistic phrase thesaurus, it is able to parse utterances much faster and with higher accuracy than conventional rule-based parsers. [0026]
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • FIG. 1 illustrates a two-dimensional vector space for the phrases “can you show me . . .” and “can you hand me . . .” [0027]
  • FIG. 2 illustrates a matrix representation of a singular value decomposition algorithm. [0028]
  • FIG. 3 illustrates a simplified matrix representation of a singular value decomposition algorithm. [0029]
  • FIG. 4 is an example of a dialogue flow chart for a simple restaurant information request. [0030]
  • FIG. 5 shows a type of recognition grammar for user responses to the system prompt: “What kind of food would you like to eat?” [0031]
  • FIG. 6 illustrates the place of the present invention within an application that is controlled by a voice-interactive natural language user interface. [0032]
  • DETAILED DESCRIPTION OF THE INVENTION
  • I. Phrase-Based Dialogue Modeling [0033]
  • The present invention can enable a person with no special linguistic expertise to design a dialogue flow for an interactive voice application. It can be used to automatically generate a recognition grammar from information specified in a dialogue flow design. The key element in the present invention is a large, machine readable database containing phrases and other linguistic and statistical information about dialogue structures. This database provides the linguistic knowledge necessary to automatically expand a call-flow design into a recognition grammar. The following is a description of the components of the invention, how they are generated and how they work together within the overall system. [0034]
  • a. Phrase Thesaurus [0035]
  • The phrase thesaurus is a large database of fixed word combinations in which alternative ways of saying the same thing can be looked up. The phrases are arranged in the order of frequency of occurrence, and they are grouped in classes that contain similar or semantically equivalent phrases. The following is an example of a class containing interchangeable ways of confirming that a previous utterance by another speaker has been understood: [0036]
  • I understand [0037]
  • I hear you [0038]
  • [I] got [you ¦ your point ¦ it] [0039]
  • I see your point [0040]
  • I [hear ¦ see ¦ know ¦ understand] [what you're saying ¦ what you mean] [0041]
  • I follow you [0042]
  • [I'm ¦ I am] with you [there] [0043]
  • I [hear ¦ read] you loud and clear [0044]
  • Example based on Michael Lewis, Implementing The Lexical Approach: Putting Theory into Practice, Hove, Great Britain, 1997. [0045]
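  • The bracket notation used in the class above can be expanded mechanically into the individual surface variants it stands for. The sketch below assumes a particular reading of that notation, which the text does not spell out: a bracket with several options separated by “¦” means exactly one option is chosen, and a bracket with a single option (such as “[I]” or “[there]”) marks an optional element. The function name is hypothetical.

```python
import itertools
import re

# Matches either a bracketed group or a run of literal text outside brackets.
SEGMENT = re.compile(r"\[([^\]]*)\]|([^\[\]]+)")

def expand(template):
    """Return every surface variant encoded by a bracketed phrase template."""
    slots = []
    for group, literal in SEGMENT.findall(template):
        if literal.strip():
            slots.append([literal.strip()])  # fixed text: one choice only
        elif group:
            options = [option.strip() for option in group.split("¦")]
            if len(options) == 1:
                options.append("")  # assumption: single-option bracket is optional
            slots.append(options)
    variants = []
    for combination in itertools.product(*slots):
        variants.append(" ".join(part for part in combination if part))
    return variants

for variant in expand("[I] got [you ¦ your point ¦ it]"):
    print(variant)
```

  • Under these assumptions, the template “[I] got [you ¦ your point ¦ it]” yields six variants (“I got you,” “I got your point,” “I got it,” “got you,” “got your point,” “got it”), which shows how a compact class entry can stand for many alternative phrasings in a recognition grammar.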
  • The database comprises anywhere from 500,000 to 1 million phrase entries. The number of phrases may vary, depending on the size of the initial text corpus and the domain to be modeled. The minimum requirement is that the initial text corpus is large enough for statistical modeling. Generally, a larger, semantically richer corpus tends to yield a larger phrase database, which in turn is likely to provide a greater number of linguistic variants for each phrase. [0046]
  • In addition to the phrase entries, the database comprises a vocabulary of lexical items containing objects, locations, proper names, dates, times, etc. that are used to fill the slots in phrase templates such as “how do I get to . . .?” Some partial phrases may occur in several different groupings. For example, the sub-phrase “I know” in “I know what you mean” may also occur in another class containing alternate ways of challenging a speaker: [0047]
  • [I know ¦ I'm sure ¦ I believe] you're [wrong ¦ mistaken] [0048]
  • As a result, some phrase classes may be overlapping or contain cross-references between partial phrases. [0049]
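  • One way to picture the database described in this section is as a set of named phrase classes, a vocabulary of typed slot fillers, and phrase templates whose slots are filled from that vocabulary. The sketch below is only an illustration: the class names, the “<location>” slot syntax, and the sample entries are all hypothetical and are not taken from the invention's actual data format.

```python
import itertools
import re

# Hypothetical phrase classes; a sub-phrase such as "I know" may occur in several.
PHRASE_CLASSES = {
    "confirm_understanding": ["I know what you mean", "I hear you", "I follow you"],
    "challenge_speaker": ["I know you're wrong", "I'm sure you're mistaken"],
}

# Hypothetical vocabulary of lexical items, keyed by slot type.
VOCABULARY = {"location": ["the airport", "the train station"]}

SLOT = re.compile(r"<(\w+)>")  # assumed slot syntax, e.g. <location>

def classes_containing(sub_phrase):
    """Names of all classes in which the given sub-phrase occurs."""
    return sorted(
        name
        for name, phrases in PHRASE_CLASSES.items()
        if any(sub_phrase in phrase for phrase in phrases)
    )

def fill_slots(template, vocabulary):
    """Replace every typed slot in a template with each legitimate filler."""
    filler_lists = [vocabulary[slot_type] for slot_type in SLOT.findall(template)]
    results = []
    for fillers in itertools.product(*filler_lists):
        remaining = iter(fillers)
        results.append(SLOT.sub(lambda match: next(remaining), template))
    return results

print(classes_containing("I know"))
print(fill_slots("how do I get to <location>", VOCABULARY))
```

  • The lookup by sub-phrase shows why classes can overlap: “I know” appears both in the class for confirming understanding and in the class for challenging a speaker, so an index over sub-phrases naturally produces cross-references between classes.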
  • b. Building a phrase thesaurus [0050]
  • The phrase thesaurus is generated automatically by a series of computer programs that operate on large amounts of natural language text data. The programs are executed sequentially, each taking the output of the previous program as its input, and processing it further. Taken together, the programs take a large text corpus as their input, and output a phrase thesaurus of the type described in section a. above. Some of the steps involved in this process are based on standard algorithms that have been used in various aspects of computational linguistics to process large machine readable corpora. These algorithms are used and combined within the present invention in a new way to accomplish the goal of automatically deriving a phrase thesaurus. [0051]
  • c. Linguistic assumptions underlying the invention [0052]
  • The present invention makes the following linguistic assumptions: [0053]
  • 1. Language in general, and conversational speech in particular, consists of phrases rather than of isolated vocabulary items, the combination of which is governed by grammatical rules. [0054]
  • 2. A phrase is a fixed, multi-word chunk of language of an average length between 1 and 7 words that conveys a unique sense depending on just that particular combination of words. The words that make up a phrase may or may not occur next to each other (e.g., the phrase “to make sense” can be separated by “a whole lot of,” “not much,” etc.) [0055]
  • 3. The use of phrases is governed by conventions of usage and linguistic preferences that are not always explicable with reference to grammatical rules. The phrase “on the one hand” loses its unique phrasal sense if “hand” is replaced by “finger.” “On the one finger” is not a legitimate phrase in Standard English, even though it is perfectly grammatical. Being able to use just the right phrases signals native fluency in a speaker. [0056]
  • 4. There are at least four types of phrases: [0057]
  • (classification based on Lewis, 1997 and Smadja, 1994). The typology is not meant to be exhaustive or complete; other classifications may be possible. [0058]
  • (a) Polywords: generally 1-3 word fixed phrases conveying a unique idiomatic sense. Polywords allow for no variation or reversal of word order. [0059]
  • Example: “by the way,” “nevertheless,” “bread and butter,” “every now and then.” [0060]
  • (b) Collocations: words that occur next to each other in more than random frequencies and in ways that are not generalizable: Example: “perfectly acceptable,” “stock market slide,” “sales representative.” Variation in collocations is possible, but restricted by linguistic usage: “a tall building,” “a tall boy” (but not: “a high building,” “a high boy”); “to take a look at a problem” (not: “to gaze at a problem”); “anxiety attack” (not “fear attack”), but also an “asthma attack,” a “hay-fever attack.” [0061]
  • (c) Standardized, idiomatic expressions with limited variability, often used in formulaic greetings and social interaction routines: [0062]
  • Example: “How's it going?” “How are you doing?” “Thanks, I'm fine [great ¦ terrific].” “Talk to you later.” [0063]
  • (d) Non-contiguous phrases: functional frames containing one or more slots that can be filled by a limited number of words. The meaning of the phrase is determined by the filler word. The set of legitimate filler words tends to be determined by world knowledge rather than linguistic usage. [0064]
  • Example: “Can you pass me the . . . , please?” Here, the filler can be any small object that can be “passed on” by hand: “salt,” “pepper,” “bread,” “water,” but not “house,” “tree,” “sewing-machine,” etc. “I have a . . . in my shoe” can be filled by, e.g., “stone,” “pebble,” “something,” but not by “elephant.” [0065]
  • 5. Because they are fixed in the mental lexicon of the speakers of the language, some word combinations are more likely to be observed/chosen in actual discourse than other combinations. This is why usage patterns and their frequencies can be analyzed using statistical methods, and can be captured in probabilistic models that reveal these patterns. [0066]
  • 6. Phrases are relatively unambiguous in their meaning or intention. Ambiguity arises when an utterance can have more than one conceptual meaning. The source of ambiguity can be either lexical (a word can have two or more unrelated meanings, e.g., “suit” = 1. a piece of clothing, 2. a legal dispute) or syntactic (a sentence can have two or more different and equally plausible parses, e.g., “he killed the man with a knife,” where the modifier “with a knife” can refer either to the VP (the act of killing) or to the NP (the object of killing)). Because phrases use words in context, they reduce semantic ambiguity (wearing a suit vs. filing a suit) and some cases of syntactic ambiguity. [0067]
  • 7. Phrasal usage is not an exclusive property of spoken, conversational language. Rather, phrase usage pertains to all forms and genres of spoken and written discourse. However, each of these genres may use different types of phrases, and a computational analysis of linguistic preferences in terms of phrase frequencies and probabilities is likely to reveal different patterns of usage depending on the genre. [0068]
  • 8. Nor is phrasal usage an exclusive property of English. Most languages are governed by it, albeit in different ways. Generally speaking, phrases do not translate word for word into other languages. A literal translation, for example, of “get your act together” into German yields a meaningless construct “bring deine Tat zusammen.” However, many phrases have functional phrase equivalents in other languages, e.g., “getting one's act together” => “sich zusammenreißen.” [0069]
  • d. Goals of the invention [0070]
  • The following are goals of the present invention: [0071]
  • 1. To implement a phrase-based, corpus driven natural language processing technique that can reveal overarching discourse patterns without requiring laborious hand-tagging of training data in terms of syntactic, semantic, or pragmatic utterance features. As Lewis puts it: “Grammar tends to become lexis as the event becomes more probable” (p. 41). That is to say, syntactic, semantic, and pragmatic structures are embedded in the phrase and are modeled along with it, provided the analysis is based on a conversational speech corpus large enough for statistical modeling. [0072]
  • 2. To implement the process described under 1) above in such a way that the resulting linguistic knowledge can be stored in a machine readable database, and used (and reused repeatedly) in a computer system designed to generate recognition grammars for voice-controlled user interfaces. [0073]
  • 3. To implement the process described under 1) above in such a way that the resulting linguistic knowledge can be stored in a machine readable database, and used (and reused repeatedly) in a Natural Language Understanding component that functions within a speech recognition system to extract the meaning of user responses at runtime. [0074]
  • e. Data Resources [0075]
  • Statistical modeling of any kind requires a vast amount of data. To build a sizable phrase thesaurus of 500,000 to 1 million entries requires a large source corpus (on the order of 1 billion words). However, smaller and more specialized corpora may be used to model phrases in a particular domain. For a phrase thesaurus covering the domain of interactive discourse, a number of diverse resources may be used to compile a text corpus for language. Such resources include but are not limited to: [0076]
  • 1. Transcribed speech databases for task oriented interactive discourse (SWITCHBOARD, CallHome, and TRAINS, available from the Linguistic Data Consortium (LDC) at www.ldc.upenn.edu). [0077]
  • 2. User data collected from verbal interactions with existing dialogue systems or with simulations of such systems. [0078]
  • 3. Closed caption data from television programs containing large amounts of interactive dialogue, such as talk shows, dramas, movies, etc. Television transcripts tend to be highly accurate (95%-100% for off-line captioned programs) (Jensema, 1996). As a consequence, virtually unlimited amounts of data can be purchased from places that gather and disseminate this data. [0079]
  • Television transcripts are a good way of supplementing databases of task-oriented discourse (items 1 and 2 above). Even though most television shows are scripted, they nonetheless contain large amounts of common dialogic structures, good idiomatic English, etc. What is missing is mainly the fragmented, discontinuous nature of most conversational speech. However, this difference may well be an advantage in that models based on well-formed conversational speech might be used to identify and repair elliptical speech. [0080]
  • f. Data Preparation [0081]
  • To prepare the corpus for phrase modeling, it is subjected to a normalization procedure that marks sentence boundaries, identifies acronyms, and expands abbreviations, dates, times, and monetary amounts into full words. This normalization process is necessary because the phrase thesaurus is used to create grammars for recognition systems, and recognizers transcribe utterances as they are spoken, not as they are written. This means that monetary amounts, e.g., $2.50, must be spelled out in the recognition grammar as “two dollars and fifty cents” in order to be recognized correctly. The procedure also eliminates non-alphanumeric characters and other errors that are often found in television transcripts as a result of transmission errors in the caption delivery. [0082]
  • The normalization process is carried out by running a sequence of computer programs that act as filters. In the normalization process, raw text data is taken as input and a cleaned-up, expanded corpus that is segmented into sentence units is output. Sentence segmentation is especially important because the subsequent phrase modeling procedure takes the sentence as the basic unit. [0083]
  • The invention can make use of a version of a text normalization toolkit that has been made freely available to the speech research community (Copyright 1994, University of Pennsylvania, available through the Linguistic Data Consortium). [0084]
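  • The normalization filter described above can be illustrated with a minimal sketch. It is not the LDC toolkit; the function names (`number_to_words`, `expand_money`, `normalize`) are hypothetical, only dollar amounts below 1,000 are handled, and sentence segmentation and abbreviation expansion are left out.

```python
import re

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"]

def number_to_words(n: int) -> str:
    """Spell out an integer in the range 0..999 (assumed limit of this sketch)."""
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] if ones == 0 else TENS[tens] + " " + ONES[ones]
    hundreds, rest = divmod(n, 100)
    words = ONES[hundreds] + " hundred"
    return words if rest == 0 else words + " " + number_to_words(rest)

# "$" followed by 1-3 digits and an optional two-digit cents part.
MONEY = re.compile(r"\$(\d{1,3})(?!\d)(?:\.(\d{2}))?")

def expand_money(match: "re.Match") -> str:
    """Turn a matched amount such as $2.50 into spoken words."""
    dollars = int(match.group(1))
    cents = int(match.group(2)) if match.group(2) else 0
    parts = [number_to_words(dollars) + (" dollar" if dollars == 1 else " dollars")]
    if cents:
        parts.append(number_to_words(cents) + (" cent" if cents == 1 else " cents"))
    return " and ".join(parts)

def normalize(text: str) -> str:
    """Expand monetary amounts, drop other symbols, lowercase, squeeze spaces."""
    text = MONEY.sub(expand_money, text)
    text = re.sub(r"[^A-Za-z0-9' ]+", " ", text)
    return " ".join(text.lower().split())

print(normalize("It costs $2.50!"))
```

  • In a fuller pipeline each expansion (dates, times, abbreviations) would be a separate filter run in sequence, as the text describes.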
  • g. Compiling a seed dictionary of phrase candidates [0085]
  • The first step and the precondition for building a phrase thesaurus from a corpus is creating a seed dictionary of likely phrase candidates. Initially, existing on-line idiomatic dictionaries are searched for basic phrase candidates that are rigid and not subject to grammatical or lexical variation (section I.c.4.(a)-(c)). The words and phrases are compiled into a basic phrase list. Less rigid collocations and phrasal templates are subject to considerable lexical and grammatical variability, and therefore, empirical text data are needed that contain actual instances of their use. To compile an initial seed phrase dictionary, we derive collocations automatically from large corpora on the basis of simple frequency counts, and then subject the results to post-processing heuristics to eliminate invalid collocations. [0086]
  • Step 1: Deriving N-Grams [0087]
  • We begin by deriving n-gram statistics from a given corpus C1 using standard language modeling techniques. (For an overview of such techniques, see Frederick Jelinek, Statistical Methods for Speech Recognition, MIT Press, Cambridge, Mass., 1997.) The procedure generates information about how often word strings of n-word length occur in a given corpus. Input: A given Corpus C1 → Output: n-gram frequency counts. [0088]
  • We choose n-grams of varying lengths (approximately 1<=n<=7.) N-grams are sorted in the order of the frequency of their occurrence. [0089]
  • Step 2: Filtering: Deriving Valid Phrase Candidates From N-Grams [0090]
  • The list of n-grams is very large and contains many invalid and meaningless collocations, phrase fragments, and redundant word combinations that are subsumed by larger n-grams. [0091]
  • Take for example, the following sentence: “<s> e-mail is replacing to a large extent direct communication between people </s>.”[0092]
  • For 1 <= n <=7, n-gram frequency counts on this sentence, including sentence boundary markers, will return 70 unique n-grams (13 unigrams, 12 bigrams, 11 trigrams, 10 4-grams, 9 5-grams, 8 6-grams, and 7 7-grams). By contrast, the sentence contains only four potentially valid phrase candidates, two of which are partially overlapping: [0093]
  • (a) Phrase template: “replacing [. . .] communication”[0094]
  • (b) Multi-word: “to a large extent”[0095]
  • (c) Compound noun collocation: “direct communication”[0096]
  • (d) Mixed collocation: “communication between people”[0097]
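  • The n-gram count quoted for the example sentence can be checked with a short sketch. The function name `ngram_counts` is hypothetical; tokenization is plain whitespace splitting, with the boundary markers treated as ordinary tokens.

```python
from collections import Counter

def ngram_counts(tokens, max_n=7):
    """Count every contiguous n-gram of length 1..max_n in a token list."""
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts

sentence = ("<s> e-mail is replacing to a large extent "
            "direct communication between people </s>")
counts = ngram_counts(sentence.split())

# Number of distinct n-grams per length, e.g. 13 unigrams down to 7 seven-grams.
per_length = Counter(len(gram) for gram in counts)
print(len(counts), dict(per_length))
```

  • Over a real corpus the same counter would be accumulated across all sentences and then sorted by frequency.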
  • The next step consists of filtering n-grams to eliminate invalid or redundant collocations by implementing a series of computational measures to determine the strength of any given collocation. The problem of n-gram filtering can be approached in a number of different ways, and the following description is meant to be exemplifying rather than being exhaustive. Since the goal at this point is to compile a preliminary seed dictionary of phrases, any of the methods described below can be used, either by themselves or in combination, to identify initial phrase candidates. [0098]
  • A Frequency-Based Pre-Filtering Method [0099]
  • The simplest filtering method is frequency-based. Computed over a large corpus, n-grams with high frequency counts are more likely to contain strong collocations than n-grams that occur only once or twice. We eliminate n-grams below a specific frequency threshold. The threshold is lower for large word strings because recurring combinations of large n-grams are rarer, and more likely to contain significant phrase candidates than shorter strings. [0100]
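  • A minimal sketch of such a length-dependent frequency filter follows. The threshold values and the name `frequency_filter` are illustrative assumptions; the text does not specify actual thresholds.

```python
def frequency_filter(counts, thresholds, default=2):
    """Keep n-grams whose count reaches the threshold for their length.

    counts: dict mapping n-gram tuples to frequencies.
    thresholds: dict mapping n-gram length to minimum frequency
                (hypothetical values; lower for longer n-grams).
    """
    return {gram: c for gram, c in counts.items()
            if c >= thresholds.get(len(gram), default)}

toy_counts = {
    ("of", "the"): 40,
    ("is", "a"): 4,
    ("to", "a", "large", "extent"): 3,
}
kept = frequency_filter(toy_counts, thresholds={2: 10, 4: 2})
print(sorted(kept))
```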
  • Perplexity/Entropy [0101]
  • Perplexity is a measure for determining the average branching factor of a recognition network and it is most often used as a measure for evaluating language models. It indicates the probability, computed over an entire network, that any given element can be followed by any other. For example, in a digit recognition system composed of 0-9 digits and two pronunciations for 0 (“oh” and “zero”), the perplexity of the recognition grammar exactly equals the number of elements, 11, because there are no constraining factors that favor certain digit sequences over others. Because word sequences are subject to various kinds of constraints (imposed by syntax, morphology, idiomatic usage, etc.), perplexity has been found useful in natural language processing to measure the strength of certain collocations (see, for example, Shimohata, S, T. Sugio, J. Nagata, “Retrieving Collocations by Co-occurrence and Word Order Constraints,” Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics, 1997, pp. 476-481.) [0102]
  • We take each unique n-gram and its associated frequency f(n-gram) and look at the probability of each word $w_i$ that can follow the n-gram. We calculate this probability $p(w_i)$ by dividing the frequency in which a given word follows the n-gram by the frequency count for the n-gram itself: [0103]
  • $$p(w_i) = \frac{f(w_i)}{f(\text{n-gram})}$$ [0104]
  • If the n-gram is part of a larger, strong collocation, the choice of words adjacent to the phrase boundary will be very small, because of the internal constraint of the collocation. Conversely, the likelihood that a particular word will follow is very high. For example, the word following the trigram “to a large” will almost always be “extent,” which means, the perplexity is low, and the trigram is subsumed under the fixed collocation “to a large extent.” On the other hand, a large number of different words can precede or follow the phrase “to a large extent,” and the probability that any particular word will follow is very small (close to 0). [0105]
  • We use a standard entropy measure to calculate the internal collocational constraints of the n-gram at a given junction $w_i$ as: [0106]
  • $$H(\text{n-gram}) = -\sum_{i} p(w_i) \ln p(w_i)$$
  • The perplexity of the n-gram can then be defined as: [0107]
  • $$\mathrm{Perp}(\text{n-gram}) = e^{H(\text{n-gram})}$$
  • We eliminate n-grams with low surrounding perplexity as redundant (subsumed in larger collocations) and keep the ones with perplexity above a specified threshold t. [0108]
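  • The perplexity filter can be sketched as follows. The toy corpus and the name `following_perplexity` are assumptions for illustration; probabilities are normalized by the number of observed followers, which equals the n-gram count whenever every sentence ends in a boundary marker.

```python
import math
from collections import Counter

def following_perplexity(sentences, ngram):
    """Perplexity of the distribution of words that follow `ngram`."""
    n = len(ngram)
    followers = Counter()
    for tokens in sentences:
        for i in range(len(tokens) - n):
            if tuple(tokens[i:i + n]) == ngram:
                followers[tokens[i + n]] += 1
    total = sum(followers.values())
    if total == 0:
        raise ValueError("n-gram never occurs with a following word")
    # Entropy H = -sum p ln p, perplexity = e^H.
    entropy = -sum((c / total) * math.log(c / total) for c in followers.values())
    return math.exp(entropy)

corpus = [s.split() for s in (
    "<s> to a large extent it works </s>",
    "<s> to a large extent we agree </s>",
    "<s> we agree to a large extent </s>",
)]
low = following_perplexity(corpus, ("to", "a", "large"))             # always "extent"
high = following_perplexity(corpus, ("to", "a", "large", "extent"))  # three different followers
print(low, high)
```

  • In this toy corpus the trigram has the lowest possible perplexity and would be dropped as subsumed, while the four-word phrase would be kept if its perplexity exceeds the threshold t.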
  • Step 3: Deriving Non-Contiguous Phrases [0109]
  • The frequency and perplexity measures described above give us a good first cut at phrase candidates, generating mainly rigid word combinations such as compound nouns (“Grade Point Average” ), idiomatic expressions (“How's it going?” ) and polywords (“sooner or later” ). The next objective is to expand the initial seed phrase dictionary by deriving non-contiguous collocations (collocations that are less rigid and contain one or more filler words or phrases, e.g. “Give me . . . please” ). There are at least three types of non-contiguous phrases. Assuming that w is any word and p is any phrase, these types can be distinguished as follows:[0110]
  • Type 1: $p_1 \ldots p_2$ [0111]
  • Two phrases occurring next to each other with more than random frequency, separated by one or more words that are not themselves phrases. [0112]
  • Example: “refer to [the appendix ¦ the manual ¦ page 220 . . . ] for more information”[0113]
  • Type 2: $p_1 \ldots w_1$ [0114]
  • A phrase is followed or preceded by one or more filler words, which are followed or preceded by another word that, together with the initial phrase, forms a phrase template.
  • Example: “Could you hand me [the salt ¦ your ID . . . ] please?” [0115]
  • Type 3: $w_1 \ldots w_2$ [0116]
  • A word is followed by one or more filler words, which are followed by another word that together with the initial word forms a phrase template.
  • Example: “taking [initial ¦ the first ¦ important . . . ] steps”[0117]
  • To extract phrases of the types 1 and 2, we first create a list of contexts for each phrase. We take each of the phrase candidates obtained in the first processing phase and retrieve all sentences containing the phrase. We then look at surrounding words in order to identify possible regularities and co-occurrence patterns with words or phrases not captured in the initial n-gram modeling and filtering stage. This can be done using any of the following methods: frequency counts, normalized frequency methods, perplexity, or normalized perplexity. [0118]
  • In order to handle Type 3, we compile a list of the top n most frequent word bigrams separated by up to 5 words. As in the first extraction stage, not every collocation is significant. Again, there are several ways to eliminate invalid collocations that can be used by themselves or in various combinations. Again, this can be done using any of the following methods: frequency counts, normalized frequency methods, perplexity, or normalized perplexity. [0119]
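  • For Type 3, counting word pairs separated by one to five filler words might look like the sketch below. The name `gapped_pairs` and the toy sentences are illustrative assumptions; the resulting list would still need one of the filtering measures named above.

```python
from collections import Counter

def gapped_pairs(sentences, max_gap=5, top_n=10):
    """Return the top_n most frequent (w1, w2) pairs with 1..max_gap fillers between them."""
    counts = Counter()
    for tokens in sentences:
        for i, w1 in enumerate(tokens):
            # j - i - 1 is the number of filler words between the two words.
            for j in range(i + 2, min(i + max_gap + 2, len(tokens))):
                counts[(w1, tokens[j])] += 1
    return counts.most_common(top_n)

toy = [s.split() for s in (
    "taking initial steps",
    "taking the first steps",
    "taking important steps",
)]
top = gapped_pairs(toy, top_n=1)
print(top)
```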
  • Mutual Information [0120]
  • Mutual information is a standard information theoretical measure that computes the strength of a relationship between two points by comparing the joint probability of observing the two points together with the probability of observing them independently. In natural language processing, it has been used to establish the strength of an association between words, for example, for use in lexicography (see Kenneth W. Church & Patrick Hanks, “Word Association Norms, Mutual Information, and Lexicography,” Computational Linguistics, 16 (1), 1990: 22-29.) [0121]
  • Given two phrases, q1 and q2, with probabilities p(q1) and p(q2), the mutual information I(q1, q2) is defined as: [0122]
  • $$I(q_1, q_2) = \frac{p(q_1, q_2)}{p(q_1)\, p(q_2)}$$
  • Joint probability can serve as a measure to determine the strength of a collocation within a given window (in our case, a sentence), even if the collocation is interrupted, as in the case of non-contiguous phrases. If there is a genuine association between two words or word strings, their joint probability will be larger than the probability of observing them independently, so the mutual information I(w1,w2) must be greater than 1. [0123]
  • We take our corpus of non-contiguous phrase candidates and compute the mutual information for each phrase and the most frequent words or word sequences surrounding these phrases. We extract the phrase-word or phrase-phrase combinations with the highest joint probability. [0124]
  • However, the above formula may generate misleading results in case of very frequently used words such as “the,” “it,” or “very good.” In this case we will use a slightly modified mutual information defined as: [0125]
  • $$I_{new}(q_1, q_2) = \frac{p(q_1, q_2)}{p(q_1)}$$
  • where q2 is the frequent word or phrase. [0126]
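  • Both association measures can be sketched over a sentence window. In this sketch, probabilities are estimated as the fraction of sentences containing a phrase, which is an assumption about the estimation method; the names `contains` and `association` are hypothetical.

```python
def contains(tokens, phrase):
    """True if `phrase` (a token list) occurs contiguously in `tokens`."""
    n = len(phrase)
    return any(tokens[i:i + n] == phrase for i in range(len(tokens) - n + 1))

def association(sentences, q1, q2):
    """Return (I, I_new): p(q1,q2)/(p(q1)p(q2)) and p(q1,q2)/p(q1)."""
    total = len(sentences)
    has1 = [contains(s, q1) for s in sentences]
    has2 = [contains(s, q2) for s in sentences]
    p1 = sum(has1) / total
    p2 = sum(has2) / total
    p12 = sum(a and b for a, b in zip(has1, has2)) / total
    if p1 == 0 or p2 == 0:
        return 0.0, 0.0
    return p12 / (p1 * p2), p12 / p1

toy = [s.split() for s in (
    "could you hand me the salt please",
    "could you hand me your id please",
    "the weather is nice",
    "please sit down",
)]
mi, mi_new = association(toy, ["could", "you", "hand", "me"], ["please"])
print(mi, mi_new)
```

  • Here the ratio exceeds 1, indicating that the phrase and the word co-occur within a sentence more often than independence would predict.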
  • Probability Distribution [0127]
  • Yet another way to eliminate invalid phrase candidates is to look at the probability distribution of components within each non-contiguous phrase candidate. For each phrase candidate, we determine a main component and a sub-component (the longer or the more frequent phrases can usually be considered as the main component), and then look at the probability distribution of the sub-component with respect to other words or phrases that co-occur in the same context (i.e., sentence or clause). This algorithm can be formally described as: [0128]
  • $$M_{main,sub} = \frac{f(q_{main}, q_{sub}) - \mathrm{Exp}(q_{main})}{\mathrm{Dev}(q_{main})}$$
  • where $f(q_{main}, q_{sub})$ is the frequency of the co-occurrence of the main component with the sub-component and $\mathrm{Exp}(q_{main})$ and $\mathrm{Dev}(q_{main})$ are the Expected Value and the Standard Deviation of the frequency of occurrence of $q_{main}$ with all of the sub-components $q_{sub}$. [0129]
  • We can assume that if $M_{main,sub}$ is greater than a certain threshold, then the collocation is a valid phrase, otherwise it is not. [0130]
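  • The score above can be sketched as a standardized frequency. The co-occurrence counts, the threshold of 1.0, and the use of the population standard deviation are all assumptions made for illustration.

```python
import statistics

def m_scores(cooccurrence):
    """Standardize co-occurrence counts of one main component with its sub-components.

    cooccurrence: dict mapping each sub-component to f(q_main, q_sub).
    """
    freqs = list(cooccurrence.values())
    expected = statistics.mean(freqs)
    deviation = statistics.pstdev(freqs)  # population standard deviation (assumed)
    if deviation == 0:
        return {sub: 0.0 for sub in cooccurrence}
    return {sub: (f - expected) / deviation for sub, f in cooccurrence.items()}

# Hypothetical counts for the main component "could you hand me".
counts = {"please": 40, "now": 4, "the": 6, "again": 2}
scores = m_scores(counts)
valid = [sub for sub, m in scores.items() if m > 1.0]
print(scores, valid)
```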
  • Hand Checking [0131]
  • A final way of eliminating invalid phrases—especially cases determined as borderline by the other algorithms—is by having a trained linguist go through the resulting phrase dictionary and eliminate the unlikely phrases. This step, while optional, may improve the quality and accuracy of the resulting phrase list with respect to common linguistic usage. [0132]
  • Step 4: Phrase-Based Corpus Segmentation [0133]
  • As explained in the previous section, a number of measures can be (and have been) used to automatically derive an initial seed dictionary of phrase candidates from large corpora. Because all of these methods act more or less as filters, they can be used in various combinations to extract multi-word phrases and collocations. However, whatever method we use, the list of derived phrases still contains a large number of overlapping phrase candidates, because multiple parses of the same sentence remain a possibility. For example, for the sentence “E-mail is replacing direct communications between people,” the following alternative parses are conceivable: [0134]
  • Parse 1: <s>[E-mail] [is replacing] [direct communications] [between people] </s>[0135]
  • Parse 2: <s>[E-mail] [is replacing direct communications] [between people] </s>[0136]
  • Parse 3: <s>[E-mail] [is replacing] [direct] [communications between people.] </s>[0137]
  • The problem is similar to the one we encounter when segmenting text for building dictionaries in Chinese or Japanese. In these languages, the concept of a “word” is less well defined than it is in European languages. Each Chinese word is made up of anywhere between one and seven characters, and in Chinese writing, word boundaries are not separated by white spaces. The problem is augmented by the fact that complete Chinese dictionaries are extremely hard to find, especially when it comes to proper names. [0138]
  • The absence of word boundaries in Chinese or Japanese creates significant difficulties when building probabilistic language models for large vocabulary dictation systems. Word-based n-gram language modeling requires correct parsing of sentences to identify word boundaries and subsequently calculate n-gram probabilities. Parsing errors are a common problem in Chinese language processing. For example, we may encounter a character sequence ABCDE where A, AB, CDE, BCD, D, and E are all legitimate words in the dictionary. One can quickly note that there are two possible parses for this character sequence: [A] [BCD] [E] and [AB] [CDE]. Linguists have applied various lexical, statistical, and heuristic approaches, by themselves and in combination, to parse Chinese text. Most of these methods can be applied to phrase parsing in English. We describe one statistical, n-gram-based parsing algorithm that we found particularly efficient and useful. However, other methods can be used for phrase parsing as well. [0139]
  • The general idea is to implement an N-gram phrase-based language model (a language model that uses phrases rather than single words as the basis for n-gram modeling), in order to calculate the best parse of a sentence. Note that some words may act as phrases, as can be seen in Parse 3 (e.g., the word “direct” in the above example). Assume the log probability bigram statistics for the example above to be as follows: [0140]
  • [<s>], [E-mail]: −5.8 [0141]
  • [E-mail], [is replacing]: −2.4 [0142]
  • [E-mail], [is replacing direct communications]: −6.5 [0143]
  • [is replacing], [direct]: −4.7 [0144]
  • [is replacing], [direct communications]: −5.4 [0145]
  • [direct], [communications between people]: −4.2 [0146]
  • [direct communications], [between people]: −6.2 [0147]
  • [is replacing direct communications], [between people]: −8.9 [0148]
  • [between people], [</s>]: −4.8 [0149]
  • [communications between people], [</s>]: −5.9
  • Given these log probabilities, we can calculate the best phrase-based parse through a sentence by multiplying the probabilities (or summing the log probabilities) of each of the bigrams for each possible parse: [0150]
  • Parse 1: Total likelihood = −5.8 + −2.4 + −5.4 + −6.2 + −4.8 = −24.6 [0151]
  • Parse 2: Total likelihood = −5.8 + −6.5 + −8.9 + −4.8 = −26.0 [0152]
  • Parse 3: Total likelihood = −5.8 + −2.4 + −4.7 + −4.2 + −5.9 = −23.0 [0153]
  • We select the parse with the highest overall likelihood as the best parse (in this case, Parse 3, whose total of −23.0 is the least negative of the three). [0154]
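  • The scoring of alternative parses can be reproduced with a short sketch. The table below copies the example log probabilities; spelling is normalized to “E-mail” and “communications,” and the sentence-end marker is written `</s>`, which are assumptions about the intended notation.

```python
# Phrase-bigram log probabilities from the worked example.
LOGP = {
    ("<s>", "E-mail"): -5.8,
    ("E-mail", "is replacing"): -2.4,
    ("E-mail", "is replacing direct communications"): -6.5,
    ("is replacing", "direct"): -4.7,
    ("is replacing", "direct communications"): -5.4,
    ("direct", "communications between people"): -4.2,
    ("direct communications", "between people"): -6.2,
    ("is replacing direct communications", "between people"): -8.9,
    ("between people", "</s>"): -4.8,
    ("communications between people", "</s>"): -5.9,
}

PARSES = {
    "Parse 1": ["<s>", "E-mail", "is replacing", "direct communications",
                "between people", "</s>"],
    "Parse 2": ["<s>", "E-mail", "is replacing direct communications",
                "between people", "</s>"],
    "Parse 3": ["<s>", "E-mail", "is replacing", "direct",
                "communications between people", "</s>"],
}

def score(parse):
    """Sum the log probabilities of adjacent phrase pairs."""
    return sum(LOGP[(a, b)] for a, b in zip(parse, parse[1:]))

scores = {name: score(parse) for name, parse in PARSES.items()}
best = max(scores, key=scores.get)
for name, total in scores.items():
    print(name, round(total, 1))
print("best:", best)
```

  • Because log probabilities are negative, the highest likelihood corresponds to the total closest to zero.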
  • A First Pass At Phrase-Based N-Gram Parsing [0155]
  • In order to create a phrase-based parse of a given text corpus C, we need a phrase-based language model. Building such a language model, however, requires a pre-parsed text or a dictionary of phrases. In order to get around this problem, we use a bootstrapping technique that provides us with an initial parse of the corpus, which will then form the basis for building an initial language model that is subsequently refined by iterating the procedure. There are two ways to derive a preliminary parse through the corpus: [0156]
  • 1. We use a Greedy Algorithm that, whenever it encounters a parsing ambiguity (more than one parse is possible), selects the longest phrases (e.g., the parse that produces the longest phrase or the parse that produces the longest first phrase) from the seed dictionary. In the above example, Parse 2 would be selected as the optimal parse. [0157]
  • 2. We pick the parse that minimizes the number of phrases. Assuming that neither the phrase “is replacing direct communications” (because it is not a very common phrase) nor the word “direct” is in the seed dictionary, Parse 1 would be selected. [0158]
  • Applying either one or both of these algorithms will result in an initial phrase-based parse of our corpus. [0159]
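  • The greedy variant that prefers the longest first phrase can be sketched as a left-to-right longest-match segmenter. The seed dictionary contents and the name `greedy_segment` are assumptions; single words not in the dictionary fall back to one-word phrases.

```python
def greedy_segment(tokens, phrases, max_len=7):
    """Left-to-right longest-match segmentation against a seed phrase set."""
    segments = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate first; a single word is always accepted.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            candidate = " ".join(tokens[i:i + n])
            if n == 1 or candidate in phrases:
                segments.append(candidate)
                i += n
                break
    return segments

seed = {
    "is replacing",
    "is replacing direct communications",
    "direct communications",
    "between people",
    "communications between people",
}
tokens = "e-mail is replacing direct communications between people".split()
parse = greedy_segment(tokens, seed)
print(parse)
```

  • With this seed dictionary the segmenter reproduces Parse 2 of the example; removing “is replacing direct communications” from the set yields Parse 1.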
  • Optimizing the phrase-based n-gram parse [0160]
  • Once we have an initial parse through our corpus, we divide the corpus into two sub-corpora of equal size, C1 and C2, and use the seed dictionary of phrases (described in section I.b.-d.) to build an initial language model for one of the sub-corpora, C1. We then use this language model to generate an improved segmentation of the other sub-corpus C2. Resulting high-frequency bigrams and trigrams are phrase candidates that can be added to the dictionary for improved segmentation. [0161]
  • A significant advantage of using a language modeling technique to iteratively refine corpus segmentation is that this technique allows us to identify new phrases and collocations and thereby enlarge our initial phrase dictionary. A language model based corpus segmentation assigns probabilities not only to phrases contained in the dictionary, but to unseen phrases as well (phrases not included in the dictionary). Recurring unseen phrases encountered in the parses with the highest unigram probability score are likely to be significant fixed phrases rather than just random word sequences. By keeping track of unseen phrases and selecting recurring phrases with the highest unigram probabilities, we identify new collocations that can be added to the dictionary. [0162]
  • There are two ways of implementing this procedure. In the first case, we start with a unigram language model, and use this model to segment sub-corpus C2. The segmented sub-corpus C2 is subsequently used to build a new, improved unigram language model on the initial sub-corpus C1. We iterate the procedure until we see little change in the unigram probability scores. At this point we switch to a bigram language model (based on phrase pairs) and reiterate the language modeling process until we see very little change. Then we use a trigram model (based on sequences of three phrases) and reiterate the procedure again until we see little change in the segmentation statistics and few new, unseen phrases. At this point, our dictionary contains a large number of plausible phrase candidates and we have obtained a fairly good parse through each utterance. [0163]
  • In the second case, we implement the same iterative language modeling procedure, using bigram, trigram, or even n-gram models with larger units, in the very beginning of the process rather than increasing gradually from unigram to trigram models. One or the other implementation may prove more effective, depending on the type of source material and other variables. [0164]
  • h. Automatically deriving a phrase thesaurus from a seed dictionary of phrases [0165]
  • The core of the proposed technology is a phrase thesaurus, a lexicon of fixed phrases and collocations. The thesaurus differs from the seed dictionary of phrases in that it groups phrases that are close in content and in some sense interchangeable. The grouping is essential for the use of the phrase database in the context of the proposed invention, namely, to allow for the retrieval of alternative phrase variants that can be used to automatically create a grammar network. We use linear algebra techniques to determine the semantic distance between phrases contained in our phrase dictionary. Once we have a measure of closeness/distance between phrases, we can use this information and a standard clustering algorithm (e.g., Group Average Agglomerative Clustering) to derive sets of semantically similar phrases. [0166]
  • Step 1: Measuring Distance Between Phrases [0167]
  • In order to derive a measure for determining semantic distance between phrases, we draw on two basic linguistic assumptions: [0168]
  • 1. The meaning of a word is determined by its use. Mastering a language is the ability to use the right words in the right situation. [0169]
  • 2. The degree of similarity between two words can be inferred from the similarity of the contexts in which they appear. Two words are synonymous if they are completely interchangeable in all contexts. Two words are similar if they share a subset of their mutual contexts. [0170]
  • We take these assumptions to hold true not only for isolated words, but for phrases as well. To determine semantic proximity or distance between phrases, we look at the surrounding words and phrases that co-occur with any given phrase P across an entire machine readable corpus C, and measure the extent to which these contexts overlap. For example, we will find that the phrases “can you hand me . . . ” and “can you pass me . . . ” share a large subset of neighboring words: “salt,” “coffee,” “hammer,” “the paper,” “my glasses,” etc. Conversely, we find no overlap in the neighbors of the phrases “can you pass me . . . ” and “can you tell me . . .”[0171]
  • To represent and measure semantic and/or syntactic relationships between phrases, we model each phrase by its context, and then use similarities between contexts to measure the similarity between phrases. One can imagine that each phrase is modeled by a vector in a multi-dimensional space where each dimension is used for one context. The degree of overlap between vectors indicates the degree of similarity between phrases. A simple example illustrates how to represent contextual relationships between phrases and their associated neighbors in such a space. For the two phrases, P1: “can you hand me . . . ” and P2: “can you show me . . . ,” we create an entry in a two-dimensional matrix for each time they co-occur with one of two right neighbors, “the salt” and “your ID.” The example shows that the phrases P1 and P2 share some but not all of the same contexts. P1 occurs 136 times with “your ID” but never (0 times) with “the salt.” P2 co-occurs 348 times with “the salt” and 250 times with “your ID.” [0172]
  • We can capture this co-occurrence pattern geometrically in a two-dimensional space in which the phrases P1 and P2 represent the two dimensions, and the contexts “the salt” and “your ID” represent points in this space (see FIG. 1). The context the salt is located at point 0,348 in this space because it never occurs (0 times) with P1 and occurs 348 times with P2. [0173]
  • The degree of similarity between contexts can be determined by using some kind of association measure between the word vectors. Association coefficients are commonly used in the area of information retrieval, and include, among others, the following: Dice coefficient, Jaccard's coefficient, Overlap coefficient and Cosine coefficient (for an overview, see C. J. van Rijsbergen, Information Retrieval, 2nd ed., London, Butterworths, 1979). There is little difference between these measures in terms of efficiency, and several of these coefficients may be used to determine the difference between phrases. The most straightforward one is the Cosine coefficient, which defines the cosine of the angle $\Theta$ between the two word vectors A and B as follows: [0174]
  • $$\cos \Theta = \frac{A^{T} B}{|A| \cdot |B|}$$
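  • As a worked illustration of the Cosine coefficient, the sketch below compares the two example phrases using their co-occurrence counts with “the salt” and “your ID” as vector components. Treating each phrase as a vector over contexts is one reading of the example; the function name `cosine` is hypothetical.

```python
import math

def cosine(a, b):
    """Cosine coefficient between two equal-length count vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Components: (count with "the salt", count with "your ID").
P1 = [0, 136]    # "can you hand me ..."
P2 = [348, 250]  # "can you show me ..."

similarity = cosine(P1, P2)
print(round(similarity, 3))
```

  • A value of 1 would mean the two phrases have identically distributed contexts, and 0 would mean they share none.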
  • Step 2: Singular Value Decomposition [0175]
  • Using any of the formulas described in Step 1 will give us an initial distance measure between phrases. Assuming the phrase dictionary derived so far contains N phrases (with N being anywhere from 500,000 to 1,000,000), and assuming further that we parameterize each key-phrase with only the most frequent M phrases (with M being between 500,000 and 100,000 depending on a number of variables), then we still have two problems: [0176]
  • 1. The resulting MxN matrix may be too large (500,000×100,000) to compare vectors. [0177]
  • 2. Because of the sparseness of data, many context phrases or words will not appear in the context of their respective key phrases. For less frequent phrases or context phrases, the vector model might therefore yield misleading and inaccurate results. [0178]
  • In order to get around both of these problems we can use Singular Value Decomposition (SVD) to reduce the original matrix to a smaller and informationally richer matrix. We describe the original matrix as follows: each row is used for one key-phrase and each column is used for one of the M context-phrases. So $c_{ij}$ is the number of occurrences of the phrase $p_j$ in the context of phrase $p_i$. The standard SVD algorithm for a matrix A of size MxN allows us to express A as a product of an MxN column-orthogonal matrix U, a diagonal matrix S of size NxN whose elements are either positive or zero, and the transpose of another NxN row-orthonormal matrix V. This can be summarized as follows: [0179]
  • $$A = U \cdot S \cdot V^{T}$$
  • The shapes of these matrices can be visualized as a series of columns, as shown in FIG. 2. [0180]
  • The advantage of using SVD is that it allows us to break down the matrix into its individual components and to reduce the size of the matrix by as much as one order of magnitude by eliminating unwanted or meaningless components. If the matrix is singular, some of the diagonal elements $s_n$ of S will be zero and some are going to be very small. By eliminating these elements and reducing the matrix in size, we can make the matrix smaller and more manageable. Moreover, the reduced matrix $A_{new}$ contains only the most significant elements of the original matrix A. Assuming that $s_{n-1}$ was very small and $s_n$ was zero and we decide to eliminate these columns from the original matrix, the result would be an (M)x(N-2) matrix made from the first N-2 columns of U, S, and V, as shown in FIG. 3. [0181]
  • Note that Factor Analysis or any other kind of Principal Component Analysis with dimensionality reduction might work just as well in this case. [0182]
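  • The truncation described above can be written out as a short derivation. The column vectors $u_i$ and $v_i$ (the i-th columns of U and V) and the subscripted truncated matrices are notation introduced here for illustration; taking the rows of $U_{N-2} S_{N-2}$ as the reduced phrase vectors is one common choice, not something the text prescribes.

```latex
A = U S V^{T} = \sum_{i=1}^{N} s_i \, u_i v_i^{T},
\qquad
A_{new} = \sum_{i=1}^{N-2} s_i \, u_i v_i^{T}
        = U_{N-2} \, S_{N-2} \, V_{N-2}^{T},
\qquad
\text{with } U_{N-2} S_{N-2} \text{ of size } M \times (N-2).
```

  • Since $s_N = 0$ and $s_{N-1}$ is very small, the dropped terms contribute little, so $A_{new}$ stays close to A while each phrase is described by two fewer coordinates.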
  • Step 3: Phrase Clustering [0183]
  • The next step in creating a phrase thesaurus consists of clustering phrases into classes based on the degree of overlap between distance vectors. A number of standard clustering algorithms have been described in the literature. The most efficient ones include Single Link, Complete Link, Group Average, and Ward's algorithm. These algorithms are typically used to classify documents for information retrieval, and, depending on the particular data being modeled, one or the other has been shown to be more efficient. For a discussion of clustering algorithms, see, e.g., El Hamdouchi, A. and P. Willett, “Hierarchic Document Clustering using Ward's Method,” Proceedings of the Organization of the 1986 ACM Conference on Research and Development in Information Retrieval, 1988, pp. 149-156; El Hamdouchi, A. and P. Willett, “Comparison of Hierarchic Agglomerative Clustering Methods for Document Retrieval,” The Computer Journal 32.3, 1989, pp. 220-227; Cutting, Douglas R., David R. Karger, Jan O. Pedersen, John W. Tukey, “Scatter/Gather: A Cluster-Based Approach to Browsing Large Document Collections,” Proceedings of the 15th Annual International SIGIR '92, Denmark, pp. 318-329. [0184]
  • All of these clustering algorithms are “agglomerative” in that they iteratively group similar items, and “global” in that they consider all items in every step. [0185]
  • We can use one or the other of these algorithms to cluster similar phrases into equivalence classes by performing the following steps: [0186]
  • a) Calculate all inter-phrase similarity coefficients. [0187]
  • Assuming $q_x$ and $q_y$ are any two phrases, they can be represented by rows X and Y of $A_{new}$ from Step 2, so the similarity between any two phrases using the Cosine coefficient would be: [0188]
  • $$S_{cos}(q_x, q_y) = \frac{q_x^{T} \cdot q_y}{|q_x| \cdot |q_y|}$$
  • b) Assign each phrase to its own cluster [0189]
  • c) Form a new cluster by combining the most similar pair of current clusters (r, s) [0190]
  • d) Update the inter-phrase similarity coefficients for all distances using r & s. [0191]
  • e) Go to step (c) if the total number of clusters is greater than some specified number N. [0192]
  • Clustering algorithms differ in how they agglomerate clusters. Single Link joins clusters whose members share maximum similarity. In the case of Complete Link, clusters that are least similar are joined last, or rather an item is assigned to a cluster if it is more similar to the most dissimilar member of that cluster than to the most dissimilar member of any other cluster. Group Average clusters items according to their average similarity. Ward's method joins two clusters when this joining results in the least increase in the sum of distances from each item to the centroid of that cluster. [0193]
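  • Steps (a) through (e) can be sketched in a few lines. This version recomputes cluster similarities from the pairwise table on every pass instead of updating them incrementally, and it defines Group Average as the mean similarity over cross-cluster pairs; both are simplifying assumptions, as are the toy vectors and the name `agglomerate`. Ward's method is not covered.

```python
import math

def cosine(a, b):
    """Cosine coefficient between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def agglomerate(vectors, target, linkage="average"):
    """Merge the most similar clusters until only `target` clusters remain."""
    n = len(vectors)
    # (a) all inter-phrase similarity coefficients
    sim = {(i, j): cosine(vectors[i], vectors[j])
           for i in range(n) for j in range(i + 1, n)}
    # (b) each phrase starts in its own cluster
    clusters = [[i] for i in range(n)]

    def cluster_sim(c1, c2):
        sims = [sim[(min(a, b), max(a, b))] for a in c1 for b in c2]
        if linkage == "single":
            return max(sims)
        if linkage == "complete":
            return min(sims)
        return sum(sims) / len(sims)

    # (c)-(e) repeatedly merge the most similar pair (r, s)
    while len(clusters) > target:
        _, r, s = max((cluster_sim(clusters[r], clusters[s]), r, s)
                      for r in range(len(clusters))
                      for s in range(r + 1, len(clusters)))
        clusters[r] = clusters[r] + clusters[s]
        del clusters[s]
    return clusters

phrases = ["can you hand me", "can you pass me", "can you tell me", "could you tell me"]
# Hypothetical context counts over ("salt", "coffee", "the time", "the way").
vectors = [[5, 3, 0, 0], [4, 4, 0, 0], [0, 0, 6, 2], [0, 0, 5, 3]]
result = agglomerate(vectors, target=2)
print([[phrases[i] for i in cluster] for cluster in result])
```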
  • Clustering techniques tend to be resource intensive, and some initial seeding of clusters, based on rough guesses, may be necessary. The Buckshot algorithm (Cutting et al., 1992) can be used to accomplish this goal. Buckshot starts with a small random number of clusters and then uses the resulting cluster centers (and just these centers) to find the right clusters for the other items. One could imagine other similar algorithms that take some initial guesses at the cluster centers, and then use each cluster center (or even the top N items that can be considered closest to the center) to find the other buckets accordingly. [0194]
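  • A seeding procedure in the spirit of Buckshot might be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the published algorithm: the sample size of roughly sqrt(k·n) items, the centroid-based merging of the sample, and all function names are choices made for this sketch.

```python
import math
import random


def cosine(u, v):
    """Cosine coefficient between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def buckshot(vectors, k, seed=0):
    """Seed k cluster centers from a small random sample, then assign all items."""
    rng = random.Random(seed)
    n = len(vectors)
    sample_size = min(n, max(k, math.ceil(math.sqrt(k * n))))
    sample = rng.sample(range(n), sample_size)

    # Agglomerate only the sample, merging the two groups whose
    # centroids are most similar, until k groups remain.
    groups = [[i] for i in sample]
    while len(groups) > k:
        pairs = [(a, b) for a in range(len(groups))
                 for b in range(a + 1, len(groups))]
        a, b = max(pairs, key=lambda p: cosine(
            centroid([vectors[i] for i in groups[p[0]]]),
            centroid([vectors[i] for i in groups[p[1]]])))
        groups[a].extend(groups.pop(b))

    # Use the resulting centers (and just these centers) to place every item.
    centers = [centroid([vectors[i] for i in g]) for g in groups]
    clusters = [[] for _ in centers]
    for i, v in enumerate(vectors):
        best = max(range(len(centers)), key=lambda c: cosine(v, centers[c]))
        clusters[best].append(i)
    return clusters
```

  • Because only the sample is clustered agglomeratively, the expensive pairwise comparison is confined to a small subset, and the remaining items cost one comparison per center.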
  • We can use any one of these clustering algorithms or a combination of them depending on the computational resources required and other factors to derive both flat and hierarchical groupings of phrases. [0195]
  • Step 4: Hand tagging of classes [0196]
  • In a final step, a sub-set of the hand-checked phrase classes is tagged with abstract descriptors denoting abstract conceptual representations of the phrases contained in each class. Descriptors include speech act classifications for verb phrases (e.g., request [...], confirm [...], reject [...], clarify [...], etc.), object nouns (e.g., date, location, time, amount), and proper names (businesses, restaurants, cities, etc.). [0197]
  • The phrases in a phrase thesaurus produced in accordance with the invention can be arranged in a hierarchical manner. For example, phrases that can occur as part of other phrases can be represented once in the phrase thesaurus and each other phrase that can include such phrase can include a pointer to that phrase. This can be desirable to enable the phrase thesaurus to be represented more compactly, thus decreasing the data storage capacity required to store the data representing the phrase thesaurus. [0198]
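  • One possible way to realize such a pointer-based arrangement is sketched below in Python. The class name `PhraseThesaurus`, the use of integer identifiers as pointers, and the sample phrases are assumptions made for illustration only.

```python
class PhraseThesaurus:
    """Stores each phrase once; longer phrases point to shared sub-phrases."""

    def __init__(self):
        # phrase id -> list of parts; a part is literal text (str)
        # or the integer id of another stored phrase (the "pointer")
        self.entries = {}

    def add(self, phrase_id, parts):
        self.entries[phrase_id] = list(parts)

    def render(self, phrase_id):
        """Expand a phrase to plain text by following its pointers."""
        words = []
        for part in self.entries[phrase_id]:
            if isinstance(part, int):
                words.append(self.render(part))
            else:
                words.append(part)
        return " ".join(words)


thesaurus = PhraseThesaurus()
thesaurus.add(1, ["a good Japanese restaurant"])
thesaurus.add(2, ["where can I find", 1])
thesaurus.add(3, ["do you know", 1])
```

  • Here the text of phrase 1 is stored once although it occurs in both phrase 2 and phrase 3, which is the source of the storage saving described above.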
  • II. Use Of The Invention In Designing And Operating Voice-Interactive Speech Applications [0199]
  • Speech recognition technology is increasingly being used to facilitate communication between humans and machines in situations where the use of other input modalities (such as a keyboard) is either impossible or inconvenient. More specifically, such situations include remote access of databases and/or control of applications or devices using a telephone or other hand-held device and simple natural voice commands. Typically, callers dial into a voice telephony server and are led through a sequence of voice-driven interactions that lets them complete automated transactions such as getting information, accessing a database or making a purchase. Systems differ with regard to the complexity of the supported interaction and the manner in which the voice interface is integrated with the application it controls. In some cases, both the voice-interface and the application or back-end database are located on the same telephony server. In other cases, such as when telephone voice input is used to control Internet-based applications, the voice telephony server, which processes the telephone voice input, is linked with the application over the Internet. [0200]
  • In what follows, we describe three embodiments of the present invention and how they can be used to optimize both the design and the performance of speech applications: [0201]
  • 1. An application for designing recognition grammars for “generic” speech applications. By “generic” we mean that the grammars generated by means of this application can be used in a variety of different systems, such as computer desktop applications, remote voice control of household appliances, or telephone self-service applications. [0202]
  • 2. An application for designing recognition grammars for speech applications that allow Internet access by an audio input device (such as a telephone) and are therefore tightly integrated with the Internet. This application is similar to the immediately preceding embodiment, but, in addition, comprises features specifically designed for Internet-based audio input applications. [0203]
  • 3. A natural language component that functions as part of a speech application and extracts meaning from user responses at runtime. [0204]
  • In order to clarify the aspects of novelty inherent in these embodiments, the description of the embodiments is prefaced with a general overview of a standard speech application that illustrates how the grammar and the natural language understanding component function within the context of such applications. The present invention is particularly concerned with components 1(e) and 2 in the description provided below. [0205]
  • The operation of a voice-interactive application entails processing acoustic, syntactic, semantic, and pragmatic information derived from a user's voice input in such a way as to generate a desired response from the application. This process is controlled by the interaction of at least five separate but interrelated components (see FIG. 6): [0206]
  • 1. a speech recognition front-end consisting of: (a) an acoustic signal analyzer, (b) a decoder, (c) phone models, (d) a phonetic dictionary, and (e) a recognition grammar; [0207]
  • 2. a Natural Language Understanding (NLU) component; [0208]
  • 3. a Dialogue Finite State Machine; [0209]
  • 4. an application Interface; and [0210]
  • 5. a speech output back-end. [0211]
  • The components enumerated above work together in the following manner: [0212]
  • 1. When a speech signal is received through a microphone or telephone hand-set, its acoustic features are analyzed by the acoustic signal analyzer (a), and a set of the n most probable word hypotheses is computed based on the acoustic information contained in the signal and the phonetic transcriptions contained in the dictionary (d). The dictionary is a word list that maps each word of the vocabulary specified in the recognition grammar (e) to its phonetic transcription. The recognition grammar (e) defines legitimate user responses including their linguistic variants and thus tells the system what commands to expect at each point in a given interaction. Because the grammar specifies only legitimate word sequences, it narrows down the hypotheses generated by the acoustic signal analyzer to a limited number of possible commands that can be recognized by the system at any given point. The result of the front-end processing is a transcription of the speech input. [0213]
  • 2. The Natural Language Understanding component (component 2) extracts the meaning of the transcribed speech input and translates the utterances specified in the recognition grammar into a formalized set of instructions that can be processed by the application. In most simple systems, this is done via language interpretation tags that are inserted manually into the grammar in such a way as to reduce the linguistic variants specified in a given recognition grammar to a single command that can be executed by the system. For example, the input variants “I'd like to order <title>,” “Do you have <title>?,” and “I'm looking for <title>” are reduced to a single instruction such as <search TITLE>. [0214]
  • 3. The Dialogue Finite State Machine (component 3) can be implemented as a computer program that specifies the flow of the human-machine interaction. It contains instructions for prompting the caller for speech input and for generating the appropriate system response to each instruction that is passed to the program by the natural language understanding component. The Dialogue Finite State Machine for a voice interface to an online bookseller's web site, for example, might prompt the user to say his/her name, address, credit card number, and upon successful completion of these items ask the user to say the title of the book he/she is looking for. [0215]
  • 4. The Application Interface can be implemented as a set of scripts that are called by the Finite State Machine and interact with the application that is controlled by the voice interface. These scripts contain instructions to be executed by the application (e.g., to access a bookseller's database and retrieve a requested title, to produce a verbal system response, for example a request for clarification such as “Do you want Edgar Smith or Frank Smith?,” or a combination of both). If the voice interface is used to control a web-based application, the specified instruction, e.g., <search TITLE>, is sent over the Internet from a voice server (discussed further below) to the respective web site where it is processed like a regular on-line transaction. [0216]
  • 5. The speech-output back-end (component 5) takes the verbal response generated by the application interface and maps it to an acoustic speech signal, using either a speech synthesizer or prerecorded utterances from a database. [0217]
  • (For a comprehensive overview of state-of-the-art dialogue systems, their operation, and assessment, see Ronald Cole, A. J. Mariani, Hans Uszkoreit, Annie Zaenen, Victor Zue, “Survey of the State of the Art in Human Language Technology,” Center for Spoken Language Understanding, Oregon Graduate Institute, 1995, and EAGLES, Handbook of Standards and Resources for Spoken Dialogue Systems, De Gruyter, Berlin & New York, 1997.) [0218]
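  • As a rough illustration of how a Dialogue Finite State Machine (component 3) might sequence the bookseller interaction described above, consider the following Python sketch. The state names, prompt wording, and the convention that the natural language understanding component passes `None` for an utterance it could not interpret are all assumptions made for this example.

```python
STATES = ["name", "address", "card_number", "title"]

PROMPTS = {
    "name": "Please say your name.",
    "address": "Please say your address.",
    "card_number": "Please say your credit card number.",
    "title": "What is the title of the book you are looking for?",
}


class DialogueFSM:
    """Walks through the states in order, one user response per state."""

    def __init__(self):
        self.position = 0
        self.slots = {}

    def prompt(self):
        """Prompt for the current state, or None when the dialogue is finished."""
        if self.position < len(STATES):
            return PROMPTS[STATES[self.position]]
        return None

    def handle(self, value):
        """Consume one value from the NLU component and return the system response."""
        if self.position >= len(STATES):
            return None
        if value is None:
            # Not understood: stay in the same state and ask again.
            return "Sorry, I did not understand. " + self.prompt()
        self.slots[STATES[self.position]] = value
        self.position += 1
        next_prompt = self.prompt()
        if next_prompt is not None:
            return next_prompt
        # All items collected: emit the instruction for the application interface.
        return "<search %s>" % self.slots["title"]
```

  • The final return value stands in for the instruction that would be handed to the Application Interface (component 4); in a deployed system it would trigger a script rather than be returned as a string.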
  • A. A Computer System For Automatically Creating Recognition Grammars For Voice-Controlled User Interfaces [0219]
  • The phrase thesaurus described above can be implemented as part of a computer system that can be used to automatically generate complex recognition grammars for speech recognition systems. The recognition grammar can then be used with an interactive user interface that is responsive to spoken input (voice input). The recognition grammar enables interpretation of the spoken input to the user interface. The system combines call-flow design, network expansion, and grammar compilation into a single development tool. The thesaurus forms the key element of this system, but in order to function in the manner desired, it must be integrated and work together with a number of other system components. [0220]
  • The system consists of the following components: (a) a graphical user interface for designing and editing the call flow for a voice application, (b) a network expander that retrieves alternative variants for the user commands specified in the call-flow design from the database along with their probabilities, (c) a linguistic database, (d) an editor, and (e) a compiler that translates the grammar network into a format that can be used by commercial speech recognizers. [0221]
  • (a) Call Flow Design: The first step in designing a recognition network for a voice-controlled dialogue system consists of specifying the call flow in such a way as to anticipate the logic of the interaction. The system's graphical user interface allows the designer to specify user requests, system states, and the transitions between these states. FIG. 4 shows the initial part of a call flow for a simple restaurant information request. At this stage of the design process, the designer only needs to specify one sample utterance for each type of user request. For example, the utterance “Where can I find a good Japanese restaurant around here” fully specifies the request type “request restaurant information.” [0222]
  • (b) Network Expander: In a second step, the user responses in the call flow design are automatically expanded into recognition grammars. A grammar includes the set of user responses to system prompts that the system can recognize and process accordingly. FIG. 5 shows the type of network that needs to be generated to recognize the user response to the system's prompt “What kind of food do you like to eat?” For each user request, the grammar specifies the set of legitimate variants and supplies an abstract meaning representation (e.g., “request restaurant information”). Note that the system will not recognize speech input that is not explicitly specified in the grammar. If the recognition system allows for probabilistic grammars, the Network Expander can supply frequency and other probabilistic bigram and trigram statistics to build such a grammar. [0223]
  • Activation of the network expander will take the sample user responses specified in the call-flow design and automatically retrieve alternative linguistic variants from the database. For example, suppose we want to model a user request for help. For the phrase “I need help,” the network expander will return: “What do I do now?,” “Help!,” “Help me, please,” “I could need some help here!,” “Can you help me?,” “I'm lost, I don't know what to do,” “Oops, something's wrong!,” etc. [0224]
  • (c) Linguistic Database: The linguistic knowledge required for the automatic grammar expansion is stored in a large, machine-searchable database. The database contains the phrase thesaurus (along with probability scores associated with each phrase). In addition, it contains lists of common nouns for filling phrase templates, as well as locations, dates, proper names, etc. The database is customizable, that is, users can create their own application specific lists of objects, names, etc. [0225]
  • (d) Editor: The grammar designer provides editing functionality at all stages in the design process. Initial call flow designs can be saved, retrieved, and changed in both graphical and text mode. After the network has been expanded, the designer can go back to the initial call flow design and edit the phrase variants retrieved by the system. At this stage, most of the editing activity will consist of eliminating variants that don't fit the pragmatic context, and of completing phrase templates by accessing the supplemental databases provided by the system or by typing in the template fillers directly. The editor also permits review and modification of the meaning representations automatically supplied by the system. [0226]
  • (e) Compiler: After completing the editing, the user activates the system compiler, which executes a computer program that translates the grammar network design into a format that can be used by the recognizer. [0227]
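  • The interplay of the network expander (b), the linguistic database (c), and the compiler (e) can be illustrated with the following Python sketch. The probability values are invented for illustration, the dictionary `THESAURUS` is a stand-in for the linguistic database, and the rule syntax produced by `compile_rule` is a hypothetical format rather than that of any particular commercial recognizer.

```python
# Stand-in for the linguistic database: descriptor -> {variant: probability}.
# The probabilities are made up for this example.
THESAURUS = {
    "request help": {
        "I need help": 0.4,
        "Can you help me?": 0.3,
        "Help me, please": 0.2,
        "What do I do now?": 0.1,
    },
}


def expand(sample, thesaurus):
    """Return the descriptor and probability-ranked variants for a sample utterance."""
    for descriptor, variants in thesaurus.items():
        if sample in variants:
            ranked = sorted(variants.items(), key=lambda kv: kv[1], reverse=True)
            return descriptor, ranked
    # Unknown sample: keep it as its own single variant, without a descriptor.
    return None, [(sample, 1.0)]


def compile_rule(name, descriptor, variants):
    """Render the expanded variants as one rule in a hypothetical grammar format."""
    alternatives = " | ".join('"%s"' % text for text, _ in variants)
    return "%s = ( %s ) {%s};" % (name, alternatives, descriptor)


descriptor, variants = expand("I need help", THESAURUS)
rule = compile_rule("HELP", descriptor, variants)
```

  • The designer supplies only the single sample utterance; the variants and the meaning representation attached to the rule both come from the database, and the editing stage (d) would sit between `expand` and `compile_rule`.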
  • In conventional grammar design, grammar network expansion must be done by hand. The knowledge of anticipated user responses and their linguistic variants is supplied by language experts who anticipate a set of variants to generate a grammar, or they are collected by recording user interactions with system simulations or prototypes. In accordance with the invention, grammar network expansion can be automated using linguistic knowledge derived from previous modeling of linguistic behavior. [0228]
  • B. A Computer System For Creating Voice Interfaces For Internet-Based Self-Service Applications [0229]
  • Speech recognition technology can be used to enable access to the Internet by telephone or other audio input device to, for example, retrieve information or complete Web-based self-service transactions such as ordering tickets. A speech application located on a voice server (discussed further below) that is connected to the Internet allows callers to complete the same kinds of transactions they usually do via their Web-browser (e.g., register for a service, input credit card information, put together a shopping basket, or make a purchase). The difference from using a Web browser is that they use their voice rather than filling out interactive forms on the Web using a keyboard, mouse or other tactile input device. A voice server recognizes the voice input and sends data (e.g., a completed site registration or credit card transaction) to a Web site where it can be processed in the same way as a regular on-line transaction. [0230]
  • A “voice page” is a representation, e.g., a set of instructions and/or data (for convenience, sometimes referred to herein as “code”), of a conventional Web page that reproduces some or all of both the structure and content of the Web page, and enables interaction with the Web page using audio input—speech or tone(s) of predetermined pitch (e.g., DTMF). A voice page can include all of the five components of a speech application described above and shown in FIG. 6. Creating a voice page involves translating the graphical user interface (GUI) of the Web page, which is typically written in a markup language such as Hyper Text Markup Language (HTML) or Extensible Markup Language (XML), into code for recognizing and processing voice commands. A voice page can be implemented in Voice XML (VXML), an HTML-like language for scripting a dialog flow and telephony interactions. [0231]
  • A “voice server” stores and enables access to “voice pages.” For example, a voice server could be a software application running on a computer system which can be accessed by end-users via communication lines (e.g., telephone lines, T1 lines, ISDN lines, cable lines). In accordance with the invention, a voice server can be used as an interface for a web-based application. When used in such manner, the voice server can be adapted to transmit and receive data from the Internet. Alternatively, the voice server can be implemented together in the same apparatus as that used to implement the Web-based application. Further, when a voice server is used as an interface for a Web-based application, the voice page is closely integrated with the corresponding Web page as described above. If the voice page is implemented in VXML, an existing commercial voice server that works directly with VXML pages (e.g., Nuance Voice Web Server™) can be used. [0232]
  • Below, an aspect of the present invention is described in which phrase-based language processing is deployed to facilitate the translation of transaction-oriented Web pages into voice pages. Specifically, this aspect of the invention enables the generation of recognition grammars directly from information provided in the source code used to generate the corresponding Web page. This aspect of the invention is similar in functionality to the grammar design tool described above in that its key component is a phrase-based language-processing engine that supports automatic grammar expansion. In addition, however, the system comprises the following components: [0233]
  • 1. An off-the-shelf, or hand-built, HTML/XML parser (using conventional parsing technology) that extracts input field keywords from the Web page's source code; and [0234]
  • 2. A knowledge database that turns keywords into context (e.g., turning the “Title” field in an on-line bookseller's web site into the prompt “Please tell me the title of the book”). This knowledge database is used to create a Dialogue Finite State Machine. [0235]
  • Like the grammar design tool, the system provides a graphical interface for call-flow design and a large database of phrases for enabling the grammars to handle natural variations of user input, e.g., different ways of phrasing a request for information. [0236]
  • In one embodiment, the process of translating an interactive web page into a voice page comprises the following steps: [0237]
  • 1. A reference web page is loaded into memory and displayed on a computer screen. [0238]
  • 2. The page is parsed using the HTML/XML parser into HTML/XML tags (such as form tags). [0239]
  • 3. The tags can be grouped together into usable higher level modules (such as combining all of the form tags). [0240]
  • 4. An HTML parser can be implemented to search the source code of the web page for system prompts preceding form tags fields and itemized lists of option values for input into these fields. Additionally or alternatively (depending on the nature of the web page and/or the application for which the invention is used), the HTML parser can be implemented to search the source code of the web page for non-interactive parts of the web page (e.g., informational text and headings, navigation links that are internal or external to the web page). On the displayed web page, these items are marked in such a way that they can be clicked, copied, and pasted into the call-flow designer window. [0241]
  • 5. A call flow for the voice page can be designed by copying the items marked by the HTML parser into the call-flow designer window. These items include system prompts and lists of option values to be selected. [0242]
  • 6. System prompts are expanded into complete speech utterances as needed. This can be done manually by a designer or automatically by apparatus that accesses a linguistic knowledge base that contains expanded versions for system prompts commonly used in transaction-based web forms. For example, the “title” prompt in a bookseller's on-line purchasing form can be expanded into the question: “What is the title of the book you are looking for?” The linguistic knowledge base can also be used to expand informational text (headings) and navigation links. For example, a list of book titles preceded by a heading “Other books by the Author” can be expanded to “Would you like a list of other books the Author has written?” [0243]
  • 7. Lists of option values are expanded into phrases. Lists of option values can be automatically expanded into phrases using the same knowledge base described in step 6. For example, the option value for title can be translated into: “I'd like to buy <TITLE>,” with <TITLE> being any of the titles that the user is allowed to buy at that point. Informational text (headings) and navigation links can be manually translated into questions by a designer selecting an appropriate phrase from a set of possible phrases presented to the designer. For example, the heading “Other books by the Author” can be changed to “What other books have been written by the author?” [0244]
  • 8. The phrase created in step 7 is expanded into a recognition grammar containing possible linguistic variants for the phrase “I'd like to buy <TITLE>.” [0245]
  • 9. Call flow and grammars are edited using an editor of the type described above. [0246]
  • 10. The grammars are compiled into a format that can be used by the voice recognizer. [0247]
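  • Steps 2 through 6 of the foregoing process can be illustrated with a short Python sketch built on the standard-library `html.parser` module. The class `FormFieldParser`, the dictionary `PROMPT_KNOWLEDGE` (a stand-in for the linguistic knowledge base), the fallback prompt wording, and the sample form are assumptions made for this example; only `<input>` and `<select>` fields are handled.

```python
from html.parser import HTMLParser


class FormFieldParser(HTMLParser):
    """Collects form field names and the option values listed for each."""

    def __init__(self):
        super().__init__()
        self.fields = {}       # field name -> list of option values
        self._select = None    # name of the <select> currently open, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        name = attrs.get("name")
        if tag in ("input", "select") and name:
            self.fields[name] = []
            self._select = name if tag == "select" else None
        elif tag == "option" and self._select and attrs.get("value"):
            self.fields[self._select].append(attrs["value"])

    def handle_endtag(self, tag):
        if tag == "select":
            self._select = None


# Stand-in for the knowledge base of expanded system prompts.
PROMPT_KNOWLEDGE = {
    "title": "What is the title of the book you are looking for?",
}


def build_call_flow(html):
    """Turn the form fields of a page into a list of prompt/option entries."""
    parser = FormFieldParser()
    parser.feed(html)
    flow = []
    for name, options in parser.fields.items():
        prompt = PROMPT_KNOWLEDGE.get(name.lower(), "Please say the %s." % name)
        flow.append({"field": name, "prompt": prompt, "options": options})
    return flow


PAGE = (
    '<form><input type="text" name="title">'
    '<select name="format">'
    '<option value="hardcover">Hardcover</option>'
    '<option value="paperback">Paperback</option>'
    "</select></form>"
)
flow = build_call_flow(PAGE)
```

  • Each entry in `flow` pairs an expanded system prompt with the option values that would subsequently be expanded into phrases and recognition grammars in steps 7 and 8.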
  • Above, an embodiment of the invention is described in which a phrase thesaurus is used as part of a software application that can be used to generate recognition grammars from the source code of a web page or pages. The web page(s) can include “interactive” part(s) (i.e., part(s) that prompt the user to provide textual information in form fields) and/or “non-interactive” part(s) (i.e., part(s) other than interactive parts, such as part(s) that enable navigation). More generally, other embodiments of the invention similar to such an embodiment of the invention can be used to generate a recognition grammar from any set of human readable text data that is also machine readable, such as, for example, text data created using a word processing program, PDF documents, or text data created using a spreadsheet. Further, though the invention is often described above as implemented to generate a recognition grammar for text data representing a form, such need not be the case. The invention can also be used to generate recognition grammars based on text data that enables navigation through, or retrieval of information from, the set of text data. For example, the invention can be used to create a recognition grammar based on an index of this document. A voice recognition system could then be implemented to make use of such a recognition grammar so that, for example, when a user said “Skip to the claims,” the voice recognition system would understand and act on that statement by the user. [0248]
  • C. A Natural Language Understanding Component To Be Used In Speech Recognition Systems [0249]
  • In another aspect of the invention a compiled sub-set of the phrase thesaurus is incorporated into a speech recognition system to be accessed at run-time in order to parse the incoming speech signal and to derive an abstract conceptual representation of its meaning that is passed on to the application. The phrase subset used in the run-time natural language interpreter is identical to the one used in a particular grammar. (Recall that the grammar specifies the total set of user commands the system expects and is able to process. Commands not specified in the grammar are automatically assigned to a single variable that triggers a system request for clarification.) [0250]
  • This aspect of the invention particularly concerns the NLU component. In conventional spoken dialogue systems, recognition grammars are mapped onto a set of formalized instructions by using a crude technique called “word spotting.” Word spotting proceeds from a given set of instructions and then searches the user input for specific words that match these instructions. The instructions themselves are provided by hand, using abstract denominators to tag each grammar. Word spotting works by disregarding utterances or parts of utterances that are deemed irrelevant at a given state of the user-machine interaction. For example, all the responses specified in the request for a certain book title are reduced to the simple instruction “search <TITLE>,” where only the words that make up “<TITLE>” are explicitly recognized by the system. Word spotting works for very simple systems, but it is limited by the fact that it cannot recognize negations or more complex syntactic relationships. [0251]
  • In the present invention, recognition grammars are mapped to system instructions by way of an annotation scheme that extracts the abstract meaning from a number of alternative phrase variants. This is possible because the underlying thesaurus database classifies phrases according to semantic similarity and contains pre-tagged descriptors for each class. At run-time, user speech input is parsed automatically into phrase-based units, which are subsequently translated into system instructions. [0252]
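  • The run-time behavior just described might be approximated by the following Python sketch, in which user input is segmented by greedy longest-match against a compiled subset of the thesaurus. The phrase list, the descriptor names, the treatment of unmatched words as the argument of the preceding phrase, and the function name `interpret` are all assumptions made for illustration.

```python
# Compiled phrase subset: phrase -> pre-tagged class descriptor (hypothetical).
PHRASE_CLASSES = {
    "i'd like to order": "request",
    "i'm looking for": "request",
    "do you have": "request",
    "i don't want": "reject",
}


def interpret(utterance, phrase_classes):
    """Parse an utterance into (descriptor, argument) instructions."""
    words = utterance.lower().strip("?.! ").split()
    longest = max(len(phrase.split()) for phrase in phrase_classes)
    instructions = []
    i = 0
    while i < len(words):
        # Try the longest candidate phrase first, then shorter ones.
        for size in range(min(longest, len(words) - i), 0, -1):
            chunk = " ".join(words[i:i + size])
            if chunk in phrase_classes:
                instructions.append([phrase_classes[chunk], []])
                i += size
                break
        else:
            if not instructions:
                # Input not covered by the grammar: ask for clarification.
                return [("clarify", utterance)]
            # Unmatched word: attach it to the preceding phrase as its argument.
            instructions[-1][1].append(words[i])
            i += 1
    return [(descriptor, " ".join(args)) for descriptor, args in instructions]
```

  • Because whole phrases rather than isolated keywords are matched, “I don't want Moby Dick” and “I'd like to order Moby Dick” yield different instructions in this sketch, which is the kind of distinction that word spotting cannot make.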
  • Various embodiments of the invention have been described. The descriptions are intended to be illustrative, not limitative. Thus, it will be apparent to one skilled in the art that certain modifications may be made to the invention as described herein without departing from the scope of the claims set out below. [0253]

Claims (57)

We claim:
1. A method for creating a recognition grammar for use with an interactive user interface to human readable text data that is also machine readable, the interactive user interface being responsive to spoken input, the method comprising the steps of:
formulating an expression representing a part of the text data for each of one or more parts of the text data, wherein each formulated expression can be constructed as one or more combinations of one or more phrases in a phrase thesaurus; and
automatically using the phrase thesaurus to construct one or more equivalent expressions of one or more formulated expressions, wherein the recognition grammar comprises the collection of all of the expressions.
2. A method as in claim 1, wherein the step of formulating an expression representing a part of the text data further comprises the step of formulating an expression representing an interactive part of the text data.
3. A method as in claim 1, wherein the step of formulating an expression representing a part of the text data further comprises the step of formulating an expression representing a non-interactive part of the text data.
4. A method as in claim 1, wherein the text data represents one or more Web pages.
5. A method as in claim 4, wherein the step of formulating an expression further comprises the step of automatically parsing code representing the one or more Web pages to identify the one or more parts.
6. A method as in claim 5, wherein one or more of the parts of the one or more Web pages comprise a system prompt indicating a type of interaction with an interactive part of a web page and a plurality of option values each representing a possible input to the interactive user interface for that type of interaction, the step of parsing further comprising the step of identifying the system prompt and the plurality of option values.
7. A method as in claim 6, wherein the step of formulating an expression further comprises the steps of:
automatically identifying one or more phrases that correspond to a system prompt; and
automatically identifying, for each of a plurality of option values, one or more phrases that correspond to the option value.
8. A method as in claim 4, wherein the code representing the one or more Web pages is expressed in a markup language.
9. A method as in claim 8, wherein the code representing the one or more Web pages is expressed in HTML.
10. A method as in claim 8, wherein the code representing the one or more Web pages is expressed in XML.
11. A method as in claim 4, wherein the recognition grammar is expressed in VXML.
12. A method as in claim 1, wherein the step of automatically using the phrase thesaurus to construct one or more equivalent expressions further comprises the steps of:
selecting a combination of one or more phrases representing the formulated expression, wherein the phrases of the selected combination of one or more phrases are original phrases of the formulated expression;
identifying an equivalent phrase for each of one or more original phrases of the formulated expression; and
producing a new combination of one or more phrases representing the formulated expression, the new combination including at least one of the identified equivalent phrases, wherein the new combination represents the equivalent expression.
13. A method as in claim 12, wherein:
phrases in the phrase thesaurus have a probability of occurrence associated therewith;
one or more original phrases has a plurality of equivalent phrases; and
the step of identifying an equivalent phrase further comprises the step of selecting an equivalent phrase having the highest probability of occurrence.
14. A method as in claim 12, wherein equivalent phrases are grouped in classes and each class of equivalent phrases has associated therewith a descriptor denoting a conceptual representation of the phrases contained in that phrase class, the method further comprising the step of tagging each equivalent expression with the descriptor or descriptors associated with phrases of the equivalent expression.
15. A method as in claim 1, further comprising the step of translating the recognition grammar into a form that can be processed by a speech recognition system.
16. A method as in claim 1, further comprising the step of manually editing the recognition grammar.
17. A method as in claim 1, wherein expressions representing a plurality of parts of the text data are formulated and the phrase thesaurus is used to identify equivalent expressions for a plurality of formulated expressions.
18. A system for creating a recognition grammar for use with an interactive user interface to human readable text data that is also machine readable, the interactive user interface being responsive to spoken input, the system comprising:
means for formulating an expression representing a part of the text data for each of one or more parts of the text data, wherein each formulated expression can be constructed as one or more combinations of one or more phrases in a phrase thesaurus; and
means for automatically using the phrase thesaurus to construct one or more equivalent expressions of one or more formulated expressions, wherein the recognition grammar comprises the collection of all of the expressions.
19. A system as in claim 18, wherein the means for formulating an expression of each of one or more anticipated spoken inputs to the interface further comprises a graphical user interface device.
20. A system as in claim 18, wherein the means for formulating an expression representing a part of the text data further comprises means for formulating an expression representing an interactive part of the text data.
21. A system as in claim 18, wherein the means for formulating an expression representing a part of the text data further comprises means for formulating an expression representing a non-interactive part of the text data.
22. A system as in claim 18, wherein the text data represents one or more Web pages.
23. A system as in claim 22, wherein the means for formulating an expression further comprises means for automatically parsing code representing the one or more Web pages to identify the one or more parts.
24. A system as in claim 23, wherein one or more of the parts of the one or more Web pages comprise a system prompt indicating a type of interaction with an interactive part of a web page and a plurality of option values each representing a possible input to the interactive user interface for that type of interaction, and wherein the means for parsing further comprises means for identifying the system prompt and the plurality of option values.
25. A system as in claim 24, wherein the means for formulating an expression further comprises:
means for automatically identifying one or more phrases that correspond to a system prompt; and
means for automatically identifying, for each of a plurality of option values, one or more phrases that correspond to the option value.
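As an illustration of the parsing recited in claims 23 to 25, the sketch below walks HTML code and collects a system prompt and its option values. It is a minimal sketch under stated assumptions, not the applicants' implementation: it assumes the prompt is the text of a `<label>` element and that each option value is the text of an `<option>` element, and the names `PromptOptionParser` and `extract_prompt_and_options` are hypothetical.

```python
from html.parser import HTMLParser


class PromptOptionParser(HTMLParser):
    """Collects <label> text as prompts and <option> text as option values.

    Assumption for illustration: prompts live in <label>, options in <option>.
    """

    def __init__(self):
        super().__init__()
        self.prompts = []
        self.options = []
        self._current = None  # tag whose text is currently being collected
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in ("label", "option"):
            self._current = tag
            self._buffer = []

    def handle_data(self, data):
        if self._current is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == self._current:
            # Collapse internal whitespace before storing the text.
            text = " ".join("".join(self._buffer).split())
            if text:
                target = self.prompts if tag == "label" else self.options
                target.append(text)
            self._current = None


def extract_prompt_and_options(html):
    """Return (prompts, options) found in the given HTML string."""
    parser = PromptOptionParser()
    parser.feed(html)
    parser.close()
    return parser.prompts, parser.options


page = """
<form>
  <label for="size">Select a size</label>
  <select id="size">
    <option>Small</option>
    <option>Medium</option>
    <option>Large</option>
  </select>
</form>
"""
prompts, options = extract_prompt_and_options(page)
print(prompts)  # ['Select a size']
print(options)  # ['Small', 'Medium', 'Large']
```

Each extracted string would then be matched against phrases in the phrase thesaurus, which this sketch does not attempt.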
26. A system as in claim 22, wherein the means for automatically parsing further comprises means for automatically parsing code expressed in a markup language.
27. A system as in claim 26, wherein the means for automatically parsing further comprises means for automatically parsing code expressed in HTML.
28. A system as in claim 26, wherein the means for automatically parsing further comprises means for automatically parsing code expressed in XML.
29. A system as in claim 22, wherein the recognition grammar is expressed in VXML.
30. A system as in claim 18, further comprising data storage means for storing data representing the phrase thesaurus and the recognition grammar.
31. A system as in claim 30, wherein the data storage means further stores data representing lexical items that can be used to complete a phrase template.
32. A system as in claim 30, wherein the data storage means further stores data representing a probability of occurrence of phrases.
33. A system as in claim 32, wherein the phrases are stored in the data storage means in accordance with the corresponding probability of occurrence.
34. A system as in claim 32, wherein the means for automatically using the phrase thesaurus to construct equivalent phrases further comprises means for using the data representing a probability of occurrence of phrases to construct a probabilistic grammar.
35. A system as in claim 18, wherein the means for automatically using the phrase thesaurus to construct equivalent phrases further comprises:
means for selecting a combination of one or more phrases representing a formulated expression, wherein the phrases of the selected combination of one or more phrases are original phrases of the formulated expression;
means for identifying an equivalent phrase for each of one or more original phrases of the formulated expression; and
means for producing a new combination of one or more phrases representing the formulated expression, the new combination including at least one of the identified equivalent phrases, wherein the new combination represents the equivalent expression.
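The three means recited in claim 35 (selecting a combination of original phrases, identifying equivalents, and producing new combinations) can be sketched as follows. This is a hedged illustration only: the thesaurus is modeled as a list of equivalence classes, and `PHRASE_CLASSES`, `equivalents`, and `equivalent_expressions` are hypothetical names with made-up example phrases.

```python
from itertools import product

# Hypothetical phrase thesaurus: each inner list is one class of equivalent phrases.
PHRASE_CLASSES = [
    ["i would like", "i want", "can i have"],
    ["a ticket", "one ticket"],
]


def equivalents(phrase, classes):
    """Return every phrase in the same class as `phrase`, itself included."""
    for members in classes:
        if phrase in members:
            return list(members)
    return [phrase]  # no known equivalents: keep the phrase unchanged


def equivalent_expressions(original_phrases, classes):
    """Build combinations that differ from the original in at least one phrase."""
    alternatives = [equivalents(p, classes) for p in original_phrases]
    original = tuple(original_phrases)
    return [
        " ".join(combo)
        for combo in product(*alternatives)
        if combo != original
    ]


expression = ["i would like", "a ticket", "to boston"]
variants = equivalent_expressions(expression, PHRASE_CLASSES)
for variant in variants:
    print(variant)
```

With these example classes the original expression yields five equivalent expressions, since three alternatives for the first phrase and two for the second give six combinations, one of which is the original.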
36. A system as in claim 35, wherein:
phrases in the phrase thesaurus have a probability of occurrence associated therewith;
one or more original phrases has a plurality of equivalent phrases; and
the means for identifying an equivalent phrase further comprises means for selecting an equivalent phrase having the highest probability of occurrence.
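The selection rule in claim 36 amounts to taking the maximum over the candidate equivalents. The sketch below assumes probabilities of occurrence are stored in a plain dictionary; the values and the names `PHRASE_PROBABILITY` and `most_probable_equivalent` are invented for illustration.

```python
# Hypothetical probabilities of occurrence for the phrases of one equivalence class.
PHRASE_PROBABILITY = {
    "i would like": 0.42,
    "i want": 0.35,
    "can i have": 0.23,
}


def most_probable_equivalent(original, candidates, probability):
    """Return the candidate other than `original` with the highest probability.

    Returns None when the original phrase has no other equivalents.
    Candidates missing from `probability` are treated as probability 0.0.
    """
    others = [c for c in candidates if c != original]
    if not others:
        return None
    return max(others, key=lambda phrase: probability.get(phrase, 0.0))


candidates = list(PHRASE_PROBABILITY)
print(most_probable_equivalent("i want", candidates, PHRASE_PROBABILITY))
# i would like
```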
37. A system as in claim 35, wherein equivalent phrases are grouped in classes and each class of equivalent phrases has associated therewith a descriptor denoting a conceptual representation of the phrases contained in that phrase class, the system further comprising means for tagging each equivalent expression with the descriptor or descriptors associated with phrases of the equivalent expression.
38. A system as in claim 18, further comprising means for translating the recognition grammar into a form that can be processed by a speech recognition system.
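One way to picture the translation step of claim 38 is to serialize the collected expressions as a flat list of alternatives in a grammar format that a recognizer accepts. The sketch below emits JSGF-style text purely as an assumption for illustration; the claims name VXML, not JSGF, and `to_jsgf` is a hypothetical helper.

```python
def to_jsgf(grammar_name, rule_name, expressions):
    """Render expressions as one public rule whose alternatives are the expressions.

    Assumes each expression consists of plain word tokens that need no escaping.
    """
    alternatives = " | ".join(f"( {expression} )" for expression in expressions)
    return "\n".join(
        [
            "#JSGF V1.0;",
            f"grammar {grammar_name};",
            f"public <{rule_name}> = {alternatives};",
        ]
    )


grammar_text = to_jsgf(
    "tickets",
    "request",
    ["i want a ticket", "i would like a ticket"],
)
print(grammar_text)
```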
39. A system as in claim 18, further comprising means for manually editing the recognition grammar.
40. A system as in claim 18, wherein expressions representing a plurality of parts of the text data are formulated and the phrase thesaurus is used to identify equivalent expressions for a plurality of formulated expressions.
41. A computer readable storage medium encoded with one or more computer programs for creating a recognition grammar for use with an interactive user interface to human readable text data that is also machine readable, the interactive user interface being responsive to spoken input, the computer programs comprising:
instructions for formulating an expression representing a part of the text data for each of one or more parts of the text data, wherein each formulated expression can be constructed as one or more combinations of one or more phrases in a phrase thesaurus; and
instructions for automatically using the phrase thesaurus to construct one or more equivalent expressions of one or more formulated expressions, wherein the recognition grammar comprises the collection of all of the expressions.
42. A computer readable storage medium as in claim 41, wherein the instructions for formulating an expression representing a part of the text data further comprise instructions for formulating an expression representing an interactive part of the text data.
43. A computer readable storage medium as in claim 41, wherein the instructions for formulating an expression representing a part of the text data further comprise instructions for formulating an expression representing a non-interactive part of the text data.
44. A computer readable storage medium as in claim 41, wherein the text data represents one or more Web pages.
45. A computer readable storage medium as in claim 44, wherein the instructions for formulating an expression further comprise instructions for automatically parsing code representing the one or more Web pages to identify the one or more parts.
46. A computer readable storage medium as in claim 45, wherein one or more of the parts of the one or more Web pages comprise a system prompt indicating a type of interaction with an interactive part of a web page and a plurality of option values each representing a possible input to the interactive user interface for that type of interaction, the instructions for parsing further comprising instructions for identifying the system prompt and the plurality of option values.
47. A computer readable storage medium as in claim 46, wherein the instructions for formulating an expression further comprise:
instructions for automatically identifying one or more phrases that correspond to a system prompt; and
instructions for automatically identifying, for each of a plurality of option values, one or more phrases that correspond to the option value.
48. A computer readable storage medium as in claim 44, wherein the instructions for automatically parsing further comprise instructions for automatically parsing code expressed in a markup language.
49. A computer readable storage medium as in claim 48, wherein the instructions for automatically parsing further comprise instructions for automatically parsing code expressed in HTML.
50. A computer readable storage medium as in claim 48, wherein the instructions for automatically parsing further comprise instructions for automatically parsing code expressed in XML.
51. A computer readable storage medium as in claim 44, wherein the recognition grammar is expressed in VXML.
52. A computer readable storage medium as in claim 41, wherein the instructions for automatically using the phrase thesaurus to construct equivalent phrases further comprise:
instructions for selecting a combination of one or more phrases representing a formulated expression, wherein the phrases of the selected combination of one or more phrases are original phrases of the formulated expression;
instructions for identifying an equivalent phrase for each of one or more original phrases of the formulated expression; and
instructions for producing a new combination of one or more phrases representing the formulated expression, the new combination including at least one of the identified equivalent phrases, wherein the new combination represents the equivalent expression.
53. A computer readable storage medium as in claim 52, wherein:
phrases in the phrase thesaurus have a probability of occurrence associated therewith;
one or more original phrases has a plurality of equivalent phrases; and
the instructions for identifying an equivalent phrase further comprise instructions for selecting an equivalent phrase having the highest probability of occurrence.
54. A computer readable storage medium as in claim 52, wherein equivalent phrases are grouped in classes and each class of equivalent phrases has associated therewith a descriptor denoting a conceptual representation of the phrases contained in that phrase class, the one or more computer programs further comprising instructions for tagging each equivalent expression with the descriptor or descriptors associated with phrases of the equivalent expression.
55. A computer readable storage medium as in claim 41, further comprising instructions for translating the recognition grammar into a form that can be processed by a speech recognition system.
56. A computer readable storage medium as in claim 41, further comprising instructions for manually editing the recognition grammar.
57. A computer readable storage medium as in claim 41, wherein expressions representing a plurality of parts of the text data are formulated and the phrase thesaurus is used to identify equivalent expressions for a plurality of formulated expressions.
US09/840,005 2000-04-19 2001-04-19 Phrase-based dialogue modeling with particular application to creating a recognition grammar for a voice-controlled user interface Abandoned US20020032564A1 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
US19840200P 2000-04-19 2000-04-19
US58005900A 2000-05-27 2000-05-27
US09/840,005 US20020032564A1 (en) 2000-04-19 2001-04-19 Phrase-based dialogue modeling with particular application to creating a recognition grammar for a voice-controlled user interface

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US09/840,005 US20020032564A1 (en) 2000-04-19 2001-04-19 Phrase-based dialogue modeling with particular application to creating a recognition grammar for a voice-controlled user interface
US10/818,219 US8442812B2 (en) 1999-05-28 2004-04-05 Phrase-based dialogue modeling with particular application to creating a recognition grammar for a voice-controlled user interface
US13/736,903 US8630846B2 (en) 1999-05-28 2013-01-08 Phrase-based dialogue modeling with particular application to creating a recognition grammar
US14/155,235 US9342504B2 (en) 1999-05-28 2014-01-14 Phrase-based dialogue modeling with particular application to creating a recognition grammar

Related Parent Applications (2)

Application Number Title Priority Date Filing Date
US58005900A Continuation-In-Part 2000-05-27 2000-05-27
US10/096,194 Continuation-In-Part US8374871B2 (en) 1999-05-28 2002-03-11 Methods for creating a phrase thesaurus

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US10/818,219 Continuation US8442812B2 (en) 1999-05-28 2004-04-05 Phrase-based dialogue modeling with particular application to creating a recognition grammar for a voice-controlled user interface

Publications (1)

Publication Number Publication Date
US20020032564A1 (en) 2002-03-14

Family

ID=26893747

Family Applications (4)

Application Number Title Priority Date Filing Date
US09/840,005 Abandoned US20020032564A1 (en) 2000-04-19 2001-04-19 Phrase-based dialogue modeling with particular application to creating a recognition grammar for a voice-controlled user interface
US10/818,219 Active 2023-12-10 US8442812B2 (en) 1999-05-28 2004-04-05 Phrase-based dialogue modeling with particular application to creating a recognition grammar for a voice-controlled user interface
US13/736,903 Active US8630846B2 (en) 1999-05-28 2013-01-08 Phrase-based dialogue modeling with particular application to creating a recognition grammar
US14/155,235 Active 2020-12-08 US9342504B2 (en) 1999-05-28 2014-01-14 Phrase-based dialogue modeling with particular application to creating a recognition grammar

Country Status (1)

Country Link
US (4) US20020032564A1 (en)

Cited By (276)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020169817A1 (en) * 2001-05-11 2002-11-14 Koninklijke Philips Electronics N.V. Real-world representation system and language
US20030004722A1 (en) * 2001-06-28 2003-01-02 Butzberger John W. Method of dynamically altering grammars in a memory efficient speech recognition system
US20030009339A1 (en) * 2001-07-03 2003-01-09 Yuen Michael S. Method and apparatus for improving voice recognition performance in a voice application distribution system
US20030071833A1 (en) * 2001-06-07 2003-04-17 Dantzig Paul M. System and method for generating and presenting multi-modal applications from intent-based markup scripts
US20030167168A1 (en) * 2002-03-01 2003-09-04 International Business Machines Corporation Automatic generation of efficient grammar for heading selection
EP1349145A2 (en) * 2002-03-29 2003-10-01 Samsung Electronics Co., Ltd. System and method for providing information using spoken dialogue interface
US20030200094A1 (en) * 2002-04-23 2003-10-23 Gupta Narendra K. System and method of using existing knowledge to rapidly train automatic speech recognizers
US20040027379A1 (en) * 2002-08-08 2004-02-12 Hong Huey Anna Onon Integrated visual development system for creating computer-implemented dialog scripts
US20040030557A1 (en) * 2002-08-06 2004-02-12 Sri International Method and apparatus for providing an integrated speech recognition and natural language understanding for a dialog system
US20040054530A1 (en) * 2002-09-18 2004-03-18 International Business Machines Corporation Generating speech recognition grammars from a large corpus of data
US20040111405A1 (en) * 2002-10-18 2004-06-10 Hewlett-Packard Development Company, L.P. Communication system and method
US20040153323A1 (en) * 2000-12-01 2004-08-05 Charney Michael L Method and system for voice activating web pages
US20040193557A1 (en) * 2003-03-25 2004-09-30 Olsen Jesse Dale Systems and methods for reducing ambiguity of communications
US20040193426A1 (en) * 2002-10-31 2004-09-30 Maddux Scott Lynn Speech controlled access to content on a presentation medium
US20040225499A1 (en) * 2001-07-03 2004-11-11 Wang Sandy Chai-Jen Multi-platform capable inference engine and universal grammar language adapter for intelligent voice application execution
US20050086057A1 (en) * 2001-11-22 2005-04-21 Tetsuo Kosaka Speech recognition apparatus and its method and program
US20050114122A1 (en) * 2003-09-25 2005-05-26 Dictaphone Corporation System and method for customizing speech recognition input and output
US20050119892A1 (en) * 2003-12-02 2005-06-02 International Business Machines Corporation Method and arrangement for managing grammar options in a graphical callflow builder
US20050125270A1 (en) * 2003-12-08 2005-06-09 International Business Machines Corporation Efficient presentation of correction options in a speech interface based upon user selection probability
US20050132275A1 (en) * 2003-12-11 2005-06-16 International Business Machines Corporation Creating a presentation document
US20050132271A1 (en) * 2003-12-11 2005-06-16 International Business Machines Corporation Creating a session document from a presentation document
US20050132273A1 (en) * 2003-12-11 2005-06-16 International Business Machines Corporation Amending a session document during a presentation
US20050132274A1 (en) * 2003-12-11 2005-06-16 International Business Machine Corporation Creating a presentation document
US20050143975A1 (en) * 2003-06-06 2005-06-30 Charney Michael L. System and method for voice activating web pages
US20050165900A1 (en) * 2004-01-13 2005-07-28 International Business Machines Corporation Differential dynamic content delivery with a participant alterable session copy of a user profile
US20050209853A1 (en) * 2004-03-19 2005-09-22 International Business Machines Corporation Speech disambiguation for string processing in an interactive voice response system
EP1583076A1 (en) 2004-03-31 2005-10-05 AT&T Corp. System and method for automatic generation of dialogue run time systems
US20050234727A1 (en) * 2001-07-03 2005-10-20 Leo Chiu Method and apparatus for adapting a voice extensible markup language-enabled voice system for natural speech recognition and system response
US20050240603A1 (en) * 2004-04-26 2005-10-27 International Business Machines Corporation Dynamic media content for collaborators with client environment information in dynamic client contexts
US20050256715A1 (en) * 2002-10-08 2005-11-17 Yoshiyuki Okimoto Language model generation and accumulation device, speech recognition device, language model creation method, and speech recognition method
US20050261902A1 (en) * 2004-05-24 2005-11-24 Sbc Knowledge Ventures, L.P. Method for designing an automated speech recognition (ASR) interface for a customer call center
US20050283367A1 (en) * 2004-06-17 2005-12-22 International Business Machines Corporation Method and apparatus for voice-enabling an application
US20060004818A1 (en) * 2004-07-01 2006-01-05 Claudatos Christopher H Efficient information management
US20060004819A1 (en) * 2004-07-01 2006-01-05 Claudatos Christopher H Information management
US20060004868A1 (en) * 2004-07-01 2006-01-05 Claudatos Christopher H Policy-based information management
US20060004820A1 (en) * 2004-07-01 2006-01-05 Claudatos Christopher H Storage pools for information management
US20060004582A1 (en) * 2004-07-01 2006-01-05 Claudatos Christopher H Video surveillance
US20060010370A1 (en) * 2004-07-08 2006-01-12 International Business Machines Corporation Differential dynamic delivery of presentation previews
US20060010365A1 (en) * 2004-07-08 2006-01-12 International Business Machines Corporation Differential dynamic delivery of content according to user expressions of interest
US20060047518A1 (en) * 2004-08-31 2006-03-02 Claudatos Christopher H Interface for management of multiple auditory communications
US20060112063A1 (en) * 2004-11-05 2006-05-25 International Business Machines Corporation System, apparatus, and methods for creating alternate-mode applications
US20060112040A1 (en) * 2004-10-13 2006-05-25 Hewlett-Packard Development Company, L.P. Device, method, and program for document classification
US20060149553A1 (en) * 2005-01-05 2006-07-06 At&T Corp. System and method for using a library to interactively design natural language spoken dialog systems
US20060155526A1 (en) * 2005-01-10 2006-07-13 At&T Corp. Systems, Devices, & Methods for automating non-deterministic processes
US20060178869A1 (en) * 2005-02-10 2006-08-10 Microsoft Corporation Classification filter for processing data for creating a language model
US20060182239A1 (en) * 2005-02-16 2006-08-17 Yves Lechervy Process for synchronizing a speech service and a visual presentation
WO2006127504A2 (en) * 2005-05-20 2006-11-30 Sony Computer Entertainment Inc. Optimisation of a grammar for speech recognition
US20060277031A1 (en) * 2005-06-02 2006-12-07 Microsoft Corporation Authoring speech grammars
US7149695B1 (en) * 2000-10-13 2006-12-12 Apple Computer, Inc. Method and apparatus for speech recognition using semantic inference and word agglomeration
US20070043570A1 (en) * 2003-07-18 2007-02-22 Koninklijke Philips Electronics N.V. Method of controlling a dialoging process
US20070043758A1 (en) * 2005-08-19 2007-02-22 Bodin William K Synthesizing aggregate data of disparate data types into data of a uniform data type
US20070061401A1 (en) * 2005-09-14 2007-03-15 Bodin William K Email management and rendering
KR100718147B1 (en) * 2005-02-01 2007-05-14 삼성전자주식회사 Apparatus and method of generating grammar network for speech recognition and dialogue speech recognition apparatus and method employing the same
US20070129947A1 (en) * 2005-12-02 2007-06-07 International Business Machines Corporation Method and system for testing sections of large speech applications
US7231343B1 (en) * 2001-12-20 2007-06-12 Ianywhere Solutions, Inc. Synonyms mechanism for natural language systems
US20070136067A1 (en) * 2003-11-10 2007-06-14 Scholl Holger R Audio dialogue system and voice browsing method
US7243071B1 (en) * 2003-01-16 2007-07-10 Comverse, Inc. Speech-recognition grammar analysis
US20070168191A1 (en) * 2006-01-13 2007-07-19 Bodin William K Controlling audio operation for data management and data rendering
US20070185702A1 (en) * 2006-02-09 2007-08-09 John Harney Language independent parsing in natural language systems
US20070192674A1 (en) * 2006-02-13 2007-08-16 Bodin William K Publishing content through RSS feeds
US20070192684A1 (en) * 2006-02-13 2007-08-16 Bodin William K Consolidated content management
US20070192683A1 (en) * 2006-02-13 2007-08-16 Bodin William K Synthesizing the content of disparate data types
US20070214149A1 (en) * 2006-03-09 2007-09-13 International Business Machines Corporation Associating user selected content management directives with user selected ratings
US20070213857A1 (en) * 2006-03-09 2007-09-13 Bodin William K RSS content administration for rendering RSS content on a digital audio player
US20070214485A1 (en) * 2006-03-09 2007-09-13 Bodin William K Podcasting content associated with a user account
US20070250602A1 (en) * 2004-01-13 2007-10-25 Bodin William K Differential Dynamic Content Delivery With A Presenter-Alterable Session Copy Of A User Profile
US20070277233A1 (en) * 2006-05-24 2007-11-29 Bodin William K Token-based content subscription
US20070277088A1 (en) * 2006-05-24 2007-11-29 Bodin William K Enhancing an existing web page
US20070276866A1 (en) * 2006-05-24 2007-11-29 Bodin William K Providing disparate content as a playlist of media files
US20070282594A1 (en) * 2006-06-02 2007-12-06 Microsoft Corporation Machine translation in natural language application development
US20080010280A1 (en) * 2006-06-16 2008-01-10 International Business Machines Corporation Method and apparatus for building asset based natural language call routing application with limited resources
US20080033720A1 (en) * 2006-08-04 2008-02-07 Pankaj Kankar A method and system for speech classification
US20080040099A1 (en) * 2006-03-10 2008-02-14 Nec (China) Co., Ltd. Device and method for language model switching and adaption
US20080052082A1 (en) * 2006-08-23 2008-02-28 Asustek Computer Inc. Voice control method
US20080082635A1 (en) * 2006-09-29 2008-04-03 Bodin William K Asynchronous Communications Using Messages Recorded On Handheld Devices
US7373300B1 (en) * 2002-12-18 2008-05-13 At&T Corp. System and method of providing a spoken dialog interface to a website
US20080154594A1 (en) * 2006-12-26 2008-06-26 Nobuyasu Itoh Method for segmenting utterances by using partner's response
US20080161948A1 (en) * 2007-01-03 2008-07-03 Bodin William K Supplementing audio recorded in a media file
US20080177866A1 (en) * 2004-07-08 2008-07-24 International Business Machines Corporation Differential Dynamic Delivery Of Content To Users Not In Attendance At A Presentation
US20080177837A1 (en) * 2004-04-26 2008-07-24 International Business Machines Corporation Dynamic Media Content For Collaborators With Client Locations In Dynamic Client Contexts
US20080215329A1 (en) * 2002-03-27 2008-09-04 International Business Machines Corporation Methods and Apparatus for Generating Dialog State Conditioned Language Models
US20080215325A1 (en) * 2006-12-27 2008-09-04 Hiroshi Horii Technique for accurately detecting system failure
US20080275893A1 (en) * 2006-02-13 2008-11-06 International Business Machines Corporation Aggregating Content Of Disparate Data Types From Disparate Data Sources For Single Point Access
US20090018830A1 (en) * 2007-07-11 2009-01-15 Vandinburg Gmbh Speech control of computing devices
US20090063150A1 (en) * 2007-08-27 2009-03-05 International Business Machines Corporation Method for automatically identifying sentence boundaries in noisy conversational data
US20090070380A1 (en) * 2003-09-25 2009-03-12 Dictaphone Corporation Method, system, and apparatus for assembly, transport and display of clinical data
US20090089659A1 (en) * 2004-07-08 2009-04-02 International Business Machines Corporation Differential Dynamic Content Delivery To Alternate Display Device Locations
US20090240500A1 (en) * 2008-03-19 2009-09-24 Kabushiki Kaisha Toshiba Speech recognition apparatus and method
US7603433B1 (en) * 2003-04-15 2009-10-13 Sprint Spectrum, L.P. IMS-based interactive media system and method
KR100923180B1 (en) 2005-10-21 2009-10-22 뉘앙스 커뮤니케이션즈, 인코포레이티드 Creating a mixed-initiative grammar from directed dialog grammars
WO2010006087A1 (en) * 2008-07-08 2010-01-14 David Seaberg Process for providing and editing instructions, data, data structures, and algorithms in a computer system
US20100030553A1 (en) * 2007-01-04 2010-02-04 Thinking Solutions Pty Ltd Linguistic Analysis
US20100050150A1 (en) * 2002-06-14 2010-02-25 Apptera, Inc. Method and System for Developing Speech Applications
US7698435B1 (en) 2003-04-15 2010-04-13 Sprint Spectrum L.P. Distributed interactive media system and method
US7698136B1 (en) * 2003-01-28 2010-04-13 Voxify, Inc. Methods and apparatus for flexible speech recognition
US20100114571A1 (en) * 2007-03-19 2010-05-06 Kentaro Nagatomo Information retrieval system, information retrieval method, and information retrieval program
US20100138377A1 (en) * 2008-11-29 2010-06-03 Jeremy Wright Systems and Methods for Detecting and Coordinating Changes in Lexical Items
US20100179801A1 (en) * 2009-01-13 2010-07-15 Steve Huynh Determining Phrases Related to Other Phrases
US7774693B2 (en) 2004-01-13 2010-08-10 International Business Machines Corporation Differential dynamic content delivery with device controlling action
US7831432B2 (en) 2006-09-29 2010-11-09 International Business Machines Corporation Audio menus describing media contents of media players
US20100286979A1 (en) * 2007-08-01 2010-11-11 Ginger Software, Inc. Automatic context sensitive language correction and enhancement using an internet corpus
US20100312547A1 (en) * 2009-06-05 2010-12-09 Apple Inc. Contextual voice commands
US20110016141A1 (en) * 2008-04-15 2011-01-20 Microsoft Corporation Web Traffic Analysis Tool
US7890848B2 (en) 2004-01-13 2011-02-15 International Business Machines Corporation Differential dynamic content delivery with alternative content presentation
US7902447B1 (en) * 2006-10-03 2011-03-08 Sony Computer Entertainment Inc. Automatic composition of sound sequences using finite state automata
US20110064207A1 (en) * 2003-11-17 2011-03-17 Apptera, Inc. System for Advertisement Selection, Placement and Delivery
US20110099016A1 (en) * 2003-11-17 2011-04-28 Apptera, Inc. Multi-Tenant Self-Service VXML Portal
US8005025B2 (en) 2004-07-13 2011-08-23 International Business Machines Corporation Dynamic media content for collaborators with VOIP support for client communications
US8024196B1 (en) * 2005-09-19 2011-09-20 Sap Ag Techniques for creating and translating voice applications
US20110276944A1 (en) * 2010-05-07 2011-11-10 Ruth Bergman Natural language text instructions
US8065151B1 (en) 2002-12-18 2011-11-22 At&T Intellectual Property Ii, L.P. System and method of automatically building dialog services by exploiting the content and structure of websites
US20120084433A1 (en) * 2010-10-01 2012-04-05 Microsoft Corporation Web test generation
CN102411563A (en) * 2010-09-26 2012-04-11 阿里巴巴集团控股有限公司 Method, device and system for identifying target words
US8209185B2 (en) 2003-09-05 2012-06-26 Emc Corporation Interface for management of auditory communications
US20120166183A1 (en) * 2009-09-04 2012-06-28 David Suendermann System and method for the localization of statistical classifiers based on machine translation
US8214376B1 (en) * 2007-12-31 2012-07-03 Symantec Corporation Techniques for global single instance segment-based indexing for backup data
US8219402B2 (en) 2007-01-03 2012-07-10 International Business Machines Corporation Asynchronous receipt of information from a user
US20120253799A1 (en) * 2011-03-28 2012-10-04 At&T Intellectual Property I, L.P. System and method for rapid customization of speech recognition models
US8321427B2 (en) 2002-10-31 2012-11-27 Promptu Systems Corporation Method and apparatus for generation and augmentation of search terms from external and internal sources
US20120303359A1 (en) * 2009-12-11 2012-11-29 Nec Corporation Dictionary creation device, word gathering method and recording medium
US20130054238A1 (en) * 2011-08-29 2013-02-28 Microsoft Corporation Using Multiple Modality Input to Feedback Context for Natural Language Understanding
US20130246045A1 (en) * 2012-03-14 2013-09-19 Hewlett-Packard Development Company, L.P. Identification and Extraction of New Terms in Documents
US20130262114A1 (en) * 2012-04-03 2013-10-03 Microsoft Corporation Crowdsourced, Grounded Language for Intent Modeling in Conversational Interfaces
US8560318B2 (en) 2010-05-14 2013-10-15 Sony Computer Entertainment Inc. Methods and system for evaluating potential confusion within grammar structure for set of statements to be used in speech recognition during computing event
US8583418B2 (en) 2008-09-29 2013-11-12 Apple Inc. Systems and methods of detecting language and natural language strings for text to speech synthesis
US8600743B2 (en) 2010-01-06 2013-12-03 Apple Inc. Noise profile determination for voice-related feature
US8614431B2 (en) 2005-09-30 2013-12-24 Apple Inc. Automated response to and sensing of user activity in portable devices
US20130346078A1 (en) * 2012-06-26 2013-12-26 Google Inc. Mixed model speech recognition
US8620662B2 (en) 2007-11-20 2013-12-31 Apple Inc. Context-aware unit selection
US8645137B2 (en) 2000-03-16 2014-02-04 Apple Inc. Fast, language-independent method for user authentication by voice
US20140039876A1 (en) * 2012-07-31 2014-02-06 Craig P. Sayers Extracting related concepts from a content stream using temporal distribution
US8660849B2 (en) 2010-01-18 2014-02-25 Apple Inc. Prioritizing selection criteria by automated assistant
US8670985B2 (en) 2010-01-13 2014-03-11 Apple Inc. Devices and methods for identifying a prompt corresponding to a voice input in a sequence of prompts
US8677377B2 (en) 2005-09-08 2014-03-18 Apple Inc. Method and apparatus for building an intelligent automated assistant
US8676904B2 (en) 2008-10-02 2014-03-18 Apple Inc. Electronic devices with voice command and contextual data processing capabilities
US8682667B2 (en) 2010-02-25 2014-03-25 Apple Inc. User profiling for selecting user specific voice input processing information
US8682649B2 (en) 2009-11-12 2014-03-25 Apple Inc. Sentiment prediction from textual data
US8688446B2 (en) 2008-02-22 2014-04-01 Apple Inc. Providing text input using speech data and non-speech data
US8694324B2 (en) 2005-01-05 2014-04-08 At&T Intellectual Property Ii, L.P. System and method of providing an automated data-collection in spoken dialog systems
US8694319B2 (en) 2005-11-03 2014-04-08 International Business Machines Corporation Dynamic prosody adjustment for voice-rendering synthesized data
US8700396B1 (en) * 2012-09-11 2014-04-15 Google Inc. Generating speech data collection prompts
US8700404B1 (en) * 2005-08-27 2014-04-15 At&T Intellectual Property Ii, L.P. System and method for using semantic and syntactic graphs for utterance classification
US8706472B2 (en) 2011-08-11 2014-04-22 Apple Inc. Method for disambiguating multiple readings in language conversion
US8712776B2 (en) 2008-09-29 2014-04-29 Apple Inc. Systems and methods for selective text to speech synthesis
US8713021B2 (en) 2010-07-07 2014-04-29 Apple Inc. Unsupervised document clustering using latent semantic density analysis
US8718047B2 (en) 2001-10-22 2014-05-06 Apple Inc. Text to speech conversion of text messages from mobile communication devices
US8719006B2 (en) 2010-08-27 2014-05-06 Apple Inc. Combined statistical and rule-based part-of-speech tagging for text-to-speech synthesis
US8719014B2 (en) 2010-09-27 2014-05-06 Apple Inc. Electronic device with text error correction based on voice recognition data
US20140136189A1 (en) * 1999-05-28 2014-05-15 Fluential, Llc Phrase-Based Dialogue Modeling With Particular Application to Creating a Recognition Grammar
US20140156265A1 (en) * 2005-12-15 2014-06-05 Nuance Communications, Inc. Method and system for conveying an example in a natural language understanding application
US8751238B2 (en) 2009-03-09 2014-06-10 Apple Inc. Systems and methods for determining the language to use for speech generated by a text to speech engine
US8762156B2 (en) 2011-09-28 2014-06-24 Apple Inc. Speech recognition repair using contextual information
US8768702B2 (en) 2008-09-05 2014-07-01 Apple Inc. Multi-tiered voice feedback in an electronic device
US8775442B2 (en) 2012-05-15 2014-07-08 Apple Inc. Semantic search using a single-source semantic model
US8781836B2 (en) 2011-02-22 2014-07-15 Apple Inc. Hearing assistance system for providing consistent human speech
US20140214399A1 (en) * 2013-01-29 2014-07-31 Microsoft Corporation Translating natural language descriptions to programs in a domain-specific language for spreadsheets
US8799658B1 (en) 2010-03-02 2014-08-05 Amazon Technologies, Inc. Sharing media items with pass phrases
US8812294B2 (en) 2011-06-21 2014-08-19 Apple Inc. Translating phrases from one language into another using an order-based set of declarative rules
US8862252B2 (en) 2009-01-30 2014-10-14 Apple Inc. Audio user interface for displayless electronic device
US8868424B1 (en) * 2008-02-08 2014-10-21 West Corporation Interactive voice response data collection object framework, vertical benchmarking, and bootstrapping engine
US8898568B2 (en) 2008-09-09 2014-11-25 Apple Inc. Audio user interface
US8935167B2 (en) 2012-09-25 2015-01-13 Apple Inc. Exemplar-based latent perceptual modeling for automatic speech recognition
US8977584B2 (en) 2010-01-25 2015-03-10 Newvaluexchange Global Ai Llp Apparatuses, methods and systems for a digital conversation management platform
US8977255B2 (en) 2007-04-03 2015-03-10 Apple Inc. Method and system for operating a multi-function portable electronic device using voice-activation
US8990126B1 (en) * 2006-08-03 2015-03-24 At&T Intellectual Property Ii, L.P. Copying human interactions through learning and discovery
US8996376B2 (en) 2008-04-05 2015-03-31 Apple Inc. Intelligent text-to-speech conversion
US9015036B2 (en) 2010-02-01 2015-04-21 Ginger Software, Inc. Automatic context sensitive language correction using an internet corpus particularly for small keyboard devices
US9037967B1 (en) * 2014-02-18 2015-05-19 King Fahd University Of Petroleum And Minerals Arabic spell checking technique
US9053089B2 (en) 2007-10-02 2015-06-09 Apple Inc. Part-of-speech tagging using latent analogy
US9135339B2 (en) 2006-02-13 2015-09-15 International Business Machines Corporation Invoking an audio hyperlink
US9135544B2 (en) 2007-11-14 2015-09-15 Varcode Ltd. System and method for quality management utilizing barcode indicators
US9167087B2 (en) 2004-07-13 2015-10-20 International Business Machines Corporation Dynamic media content for collaborators including disparate location representations
US9213692B2 (en) * 2004-04-16 2015-12-15 At&T Intellectual Property Ii, L.P. System and method for the automatic validation of dialog run time systems
US9240197B2 (en) 2005-01-05 2016-01-19 At&T Intellectual Property Ii, L.P. Library of existing spoken dialog data for use in generating new natural language spoken dialog systems
US9262612B2 (en) 2011-03-21 2016-02-16 Apple Inc. Device access using voice authentication
US9268780B2 (en) 2004-07-01 2016-02-23 Emc Corporation Content-driven information lifecycle management
US20160055848A1 (en) * 2014-08-25 2016-02-25 Honeywell International Inc. Speech enabled management system
US9280610B2 (en) 2012-05-14 2016-03-08 Apple Inc. Crowd sourcing information to fulfill user requests
US9300784B2 (en) 2013-06-13 2016-03-29 Apple Inc. System and method for emergency calls initiated by voice command
US9298700B1 (en) * 2009-07-28 2016-03-29 Amazon Technologies, Inc. Determining similar phrases
US20160092160A1 (en) * 2014-09-26 2016-03-31 Intel Corporation User adaptive interfaces
US20160098389A1 (en) * 2014-10-06 2016-04-07 International Business Machines Corporation Natural Language Processing Utilizing Transaction Based Knowledge Representation
US9311043B2 (en) 2010-01-13 2016-04-12 Apple Inc. Adaptive audio feedback system and method
US9330720B2 (en) 2008-01-03 2016-05-03 Apple Inc. Methods and apparatus for altering audio output signals
US9338493B2 (en) 2014-06-30 2016-05-10 Apple Inc. Intelligent automated assistant for TV user interactions
US9368114B2 (en) 2013-03-14 2016-06-14 Apple Inc. Context-sensitive handling of interruptions
US9400952B2 (en) 2012-10-22 2016-07-26 Varcode Ltd. Tamper-proof quality management barcode indicators
US20160217127A1 (en) * 2015-01-27 2016-07-28 Verint Systems Ltd. Identification of significant phrases using multiple language models
US9432516B1 (en) 2009-03-03 2016-08-30 Alpine Audio Now, LLC System and method for communicating streaming audio to a telephone device
US9430463B2 (en) 2014-05-30 2016-08-30 Apple Inc. Exemplar-based natural language processing
US9431006B2 (en) 2009-07-02 2016-08-30 Apple Inc. Methods and apparatuses for automatic speech recognition
US9483461B2 (en) 2012-03-06 2016-11-01 Apple Inc. Handling speech synthesis of content for multiple languages
US9495129B2 (en) 2012-06-29 2016-11-15 Apple Inc. Device, method, and user interface for voice-activated navigation and browsing of a document
US9502031B2 (en) 2014-05-27 2016-11-22 Apple Inc. Method for supporting dynamic grammars in WFST-based ASR
US9535906B2 (en) 2008-07-31 2017-01-03 Apple Inc. Mobile device having human language translation capability with positional feedback
US9547647B2 (en) 2012-09-19 2017-01-17 Apple Inc. Voice-based media searching
US20170031896A1 (en) * 2015-07-28 2017-02-02 Xerox Corporation Robust reversible finite-state approach to contextual generation and semantic parsing
US9569770B1 (en) 2009-01-13 2017-02-14 Amazon Technologies, Inc. Generating constructed phrases
US9576574B2 (en) 2012-09-10 2017-02-21 Apple Inc. Context-sensitive handling of interruptions by intelligent digital assistant
US9582608B2 (en) 2013-06-07 2017-02-28 Apple Inc. Unified ranking with entropy-weighted information for phrase-based semantic auto-completion
US9588961B2 (en) 2014-10-06 2017-03-07 International Business Machines Corporation Natural language processing utilizing propagation of knowledge through logical parse tree structures
US9620104B2 (en) 2013-06-07 2017-04-11 Apple Inc. System and method for user-specified pronunciation of words for speech synthesis and recognition
US9620105B2 (en) 2014-05-15 2017-04-11 Apple Inc. Analyzing audio input for efficient speech and music recognition
US9633004B2 (en) 2014-05-30 2017-04-25 Apple Inc. Better resolution when referencing to concepts
US9633674B2 (en) 2013-06-07 2017-04-25 Apple Inc. System and method for detecting errors in interactions with a voice-based digital assistant
US9646277B2 (en) 2006-05-07 2017-05-09 Varcode Ltd. System and method for improved quality management in a product logistic chain
US9646609B2 (en) 2014-09-30 2017-05-09 Apple Inc. Caching apparatus for serving phonetic pronunciations
US9665564B2 (en) 2014-10-06 2017-05-30 International Business Machines Corporation Natural language processing utilizing logical tree structures
US9668121B2 (en) 2014-09-30 2017-05-30 Apple Inc. Social reminders
US9697822B1 (en) 2013-03-15 2017-07-04 Apple Inc. System and method for updating an adaptive speech recognition model
US9697820B2 (en) 2015-09-24 2017-07-04 Apple Inc. Unit-selection text-to-speech synthesis using concatenation-sensitive neural networks
US9711141B2 (en) 2014-12-09 2017-07-18 Apple Inc. Disambiguating heteronyms in speech synthesis
US9715875B2 (en) 2014-05-30 2017-07-25 Apple Inc. Reducing the need for manual start/end-pointing and trigger phrases
US9721566B2 (en) 2015-03-08 2017-08-01 Apple Inc. Competing devices responding to voice triggers
US9721563B2 (en) 2012-06-08 2017-08-01 Apple Inc. Name recognition system
US9734193B2 (en) 2014-05-30 2017-08-15 Apple Inc. Determining domain salience ranking from ambiguous words in natural speech
US9733821B2 (en) 2013-03-14 2017-08-15 Apple Inc. Voice control to diagnose inadvertent activation of accessibility features
US9760559B2 (en) 2014-05-30 2017-09-12 Apple Inc. Predictive text input
US9785630B2 (en) 2014-05-30 2017-10-10 Apple Inc. Text prediction using combined word N-gram and unigram language models
US9798393B2 (en) 2011-08-29 2017-10-24 Apple Inc. Text correction processing
US9818400B2 (en) 2014-09-11 2017-11-14 Apple Inc. Method and apparatus for discovering trending terms in speech requests
US9842101B2 (en) 2014-05-30 2017-12-12 Apple Inc. Predictive conversion of language input
US9842105B2 (en) 2015-04-16 2017-12-12 Apple Inc. Parsimonious continuous-space phrase representations for natural language processing
US9858925B2 (en) 2009-06-05 2018-01-02 Apple Inc. Using context information to facilitate processing of commands in a virtual assistant
US9865280B2 (en) 2015-03-06 2018-01-09 Apple Inc. Structured dictation using intelligent automated assistants
US9886953B2 (en) 2015-03-08 2018-02-06 Apple Inc. Virtual assistant activation
US9886432B2 (en) 2014-09-30 2018-02-06 Apple Inc. Parsimonious handling of word inflection via categorical stem + suffix N-gram language models
US9899019B2 (en) 2015-03-18 2018-02-20 Apple Inc. Systems and methods for structured stem and suffix language models
US20180061408A1 (en) * 2016-08-24 2018-03-01 Semantic Machines, Inc. Using paraphrase in accepting utterances in an automated assistant
US9922642B2 (en) 2013-03-15 2018-03-20 Apple Inc. Training an at least partial voice command system
US9934775B2 (en) 2016-05-26 2018-04-03 Apple Inc. Unit-selection text-to-speech synthesis based on predicted concatenation parameters
US9946706B2 (en) 2008-06-07 2018-04-17 Apple Inc. Automatic language identification for dynamic text processing
US9959870B2 (en) 2008-12-11 2018-05-01 Apple Inc. Speech recognition involving a mobile device
US9966065B2 (en) 2014-05-30 2018-05-08 Apple Inc. Multi-command single utterance input method
US9966068B2 (en) 2013-06-08 2018-05-08 Apple Inc. Interpreting and acting upon commands that involve sharing information with remote devices
US9972304B2 (en) 2016-06-03 2018-05-15 Apple Inc. Privacy preserving distributed evaluation framework for embedded personalized systems
US9977779B2 (en) 2013-03-14 2018-05-22 Apple Inc. Automatic supplementation of word correction dictionaries
US10002189B2 (en) 2007-12-20 2018-06-19 Apple Inc. Method and apparatus for searching using an active ontology
US10007712B1 (en) 2009-08-20 2018-06-26 Amazon Technologies, Inc. Enforcing user-specified rules
US10019994B2 (en) 2012-06-08 2018-07-10 Apple Inc. Systems and methods for recognizing textual identifiers within a plurality of words
US10043516B2 (en) 2016-09-23 2018-08-07 Apple Inc. Intelligent automated assistant
US10049668B2 (en) 2015-12-02 2018-08-14 Apple Inc. Applying neural network language models to weighted finite state transducers for automatic speech recognition
US10049663B2 (en) 2016-06-08 2018-08-14 Apple Inc. Intelligent automated assistant for media exploration
US10057736B2 (en) 2011-06-03 2018-08-21 Apple Inc. Active transport based notifications
US10067938B2 (en) 2016-06-10 2018-09-04 Apple Inc. Multilingual word prediction
US10073833B1 (en) * 2017-03-09 2018-09-11 International Business Machines Corporation Domain-specific method for distinguishing type-denoting domain terms from entity-denoting domain terms
US10074360B2 (en) 2014-09-30 2018-09-11 Apple Inc. Providing an indication of the suitability of speech recognition
US10078631B2 (en) 2014-05-30 2018-09-18 Apple Inc. Entropy-guided text prediction using combined word and character n-gram language models
US10078487B2 (en) 2013-03-15 2018-09-18 Apple Inc. Context-sensitive handling of interruptions
US10083688B2 (en) 2015-05-27 2018-09-25 Apple Inc. Device voice control for selecting a displayed affordance
US10089072B2 (en) 2016-06-11 2018-10-02 Apple Inc. Intelligent device arbitration and control
US10101822B2 (en) 2015-06-05 2018-10-16 Apple Inc. Language input correction
US10127911B2 (en) 2014-09-30 2018-11-13 Apple Inc. Speaker identification and unsupervised speaker adaptation techniques
US10127220B2 (en) 2015-06-04 2018-11-13 Apple Inc. Language identification from short strings
US10134385B2 (en) 2012-03-02 2018-11-20 Apple Inc. Systems and methods for name pronunciation
US10170123B2 (en) 2014-05-30 2019-01-01 Apple Inc. Intelligent assistant for home automation
US10176167B2 (en) 2013-06-09 2019-01-08 Apple Inc. System and method for inferring user intent from speech inputs
US10176451B2 (en) 2007-05-06 2019-01-08 Varcode Ltd. System and method for quality management utilizing barcode indicators
US10186254B2 (en) 2015-06-07 2019-01-22 Apple Inc. Context-based endpoint detection
US10185542B2 (en) 2013-06-09 2019-01-22 Apple Inc. Device, method, and graphical user interface for enabling conversation persistence across two or more instances of a digital assistant
US10192552B2 (en) 2016-06-10 2019-01-29 Apple Inc. Digital assistant providing whispered speech
US10199051B2 (en) 2013-02-07 2019-02-05 Apple Inc. Voice trigger for a digital assistant
US10223066B2 (en) 2015-12-23 2019-03-05 Apple Inc. Proactive assistance based on dialog communication between devices
US10241644B2 (en) 2011-06-03 2019-03-26 Apple Inc. Actionable reminder entries
US10241752B2 (en) 2011-09-30 2019-03-26 Apple Inc. Interface for a virtual digital assistant
US10249300B2 (en) 2016-06-06 2019-04-02 Apple Inc. Intelligent list reading
US10255566B2 (en) 2011-06-03 2019-04-09 Apple Inc. Generating and processing task items that represent tasks to perform
US10255346B2 (en) 2014-01-31 2019-04-09 Verint Systems Ltd. Tagging relations with N-best
US10255907B2 (en) 2015-06-07 2019-04-09 Apple Inc. Automatic accent detection using acoustic models
US10269345B2 (en) 2016-06-11 2019-04-23 Apple Inc. Intelligent task discovery
US10276170B2 (en) 2010-01-18 2019-04-30 Apple Inc. Intelligent automated assistant
US10289433B2 (en) 2014-05-30 2019-05-14 Apple Inc. Domain specific language for encoding assistant dialog
US10297253B2 (en) 2016-06-11 2019-05-21 Apple Inc. Application integration with a digital assistant
US10296160B2 (en) 2013-12-06 2019-05-21 Apple Inc. Method for extracting salient dialog usage from live data
US10296584B2 (en) 2010-01-29 2019-05-21 British Telecommunications Plc Semantic textual analysis
US10339452B2 (en) 2014-02-05 2019-07-02 Verint Systems Ltd. Automated ontology development

Families Citing this family (90)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060116865A1 (en) 1999-09-17 2006-06-01 Www.Uniscape.Com E-services translation utilizing machine translation and translation memory
US8392188B1 (en) 1999-11-05 2013-03-05 At&T Intellectual Property Ii, L.P. Method and system for building a phonotactic model for domain independent speech recognition
US7286984B1 (en) 1999-11-05 2007-10-23 At&T Corp. Method and system for automatically detecting morphemes in a task classification system using lattices
US20030191625A1 (en) * 1999-11-05 2003-10-09 Gorin Allen Louis Method and system for creating a named entity language model
US6904402B1 (en) * 1999-11-05 2005-06-07 Microsoft Corporation System and iterative method for lexicon, segmentation and language model joint optimization
GB0030330D0 (en) * 2000-12-13 2001-01-24 Hewlett Packard Co Idiom handling in voice service systems
US7904595B2 (en) 2001-01-18 2011-03-08 Sdl International America Incorporated Globalization management system and method therefor
US7269546B2 (en) * 2001-05-09 2007-09-11 International Business Machines Corporation System and method of finding documents related to other documents and of finding related words in response to a query to refine a search
US7117447B2 (en) * 2001-06-08 2006-10-03 Mci, Llc Graphical user interface (GUI) based call application system
US20060004732A1 (en) * 2002-02-26 2006-01-05 Odom Paul S Search engine methods and systems for generating relevant search results and advertisements
US7716207B2 (en) * 2002-02-26 2010-05-11 Odom Paul S Search engine methods and systems for displaying relevant topics
US7340466B2 (en) * 2002-02-26 2008-03-04 Kang Jo Mgmt. Limited Liability Company Topic identification and use thereof in information retrieval systems
US7398209B2 (en) 2002-06-03 2008-07-08 Voicebox Technologies, Inc. Systems and methods for responding to natural language speech utterance
US7693720B2 (en) 2002-07-15 2010-04-06 Voicebox Technologies, Inc. Mobile systems and methods for responding to natural language speech utterance
US7983896B2 (en) 2004-03-05 2011-07-19 SDL Language Technology In-context exact (ICE) matching
JP2005321730A (en) * 2004-05-11 2005-11-17 Fujitsu Ltd Dialog system, dialog system implementation method, and computer program
JP3984988B2 (en) * 2004-11-26 2007-10-03 Canon Inc. User interface design apparatus and control method thereof
US20060253272A1 (en) * 2005-05-06 2006-11-09 International Business Machines Corporation Voice prompts for use in speech-to-speech translation system
US7689411B2 (en) * 2005-07-01 2010-03-30 Xerox Corporation Concept matching
US7809551B2 (en) * 2005-07-01 2010-10-05 Xerox Corporation Concept matching system
US7640160B2 (en) 2005-08-05 2009-12-29 Voicebox Technologies, Inc. Systems and methods for responding to natural language speech utterance
US7620549B2 (en) 2005-08-10 2009-11-17 Voicebox Technologies, Inc. System and method of supporting adaptive misrecognition in conversational speech
US7949529B2 (en) 2005-08-29 2011-05-24 Voicebox Technologies, Inc. Mobile systems and methods of supporting natural language human-machine interactions
EP1934971A4 (en) 2005-08-31 2010-10-27 Voicebox Technologies Inc Dynamic speech sharpening
US7565372B2 (en) * 2005-09-13 2009-07-21 Microsoft Corporation Evaluating and generating summaries using normalized probabilities
JP3986531B2 (en) * 2005-09-21 2007-10-03 Oki Electric Industry Co., Ltd. Morphological analysis apparatus and a morphological analysis program
US10319252B2 (en) 2005-11-09 2019-06-11 Sdl Inc. Language capability assessment and training apparatus and techniques
US9165039B2 (en) * 2005-11-29 2015-10-20 Kang Jo Mgmt, Limited Liability Company Methods and systems for providing personalized contextual search results
US9037466B2 (en) * 2006-03-09 2015-05-19 Nuance Communications, Inc. Email administration for rendering email on a digital audio player
US20070239455A1 (en) * 2006-04-07 2007-10-11 Motorola, Inc. Method and system for managing pronunciation dictionaries in a speech application
JP4156639B2 (en) * 2006-08-14 2008-09-24 International Business Machines Corporation Apparatus for supporting a voice interface design, method, program
US8521510B2 (en) 2006-08-31 2013-08-27 At&T Intellectual Property Ii, L.P. Method and system for providing an automated web transcription service
US20080071533A1 (en) * 2006-09-14 2008-03-20 Intervoice Limited Partnership Automatic generation of statistical language models for interactive voice response applications
ZA200902091B (en) * 2006-09-21 2010-07-28 Activx Biosciences Inc Serine hydrolase inhibitors
US7881932B2 (en) * 2006-10-02 2011-02-01 Nuance Communications, Inc. VoiceXML language extension for natively supporting voice enrolled grammars
US8073681B2 (en) 2006-10-16 2011-12-06 Voicebox Technologies, Inc. System and method for a cooperative conversational voice user interface
US9396185B2 (en) * 2006-10-31 2016-07-19 Scenera Mobile Technologies, Llc Method and apparatus for providing a contextual description of an object
US8631005B2 (en) 2006-12-28 2014-01-14 Ebay Inc. Header-token driven automatic text segmentation
US7818176B2 (en) 2007-02-06 2010-10-19 Voicebox Technologies, Inc. System and method for selecting and presenting advertisements based on natural language processing of voice-based input
US20090037176A1 (en) * 2007-08-02 2009-02-05 Nexidia Inc. Control and configuration of a speech recognizer by wordspotting
US20090094018A1 (en) * 2007-10-08 2009-04-09 Nokia Corporation Flexible Phrasebook
US20090099847A1 (en) * 2007-10-10 2009-04-16 Microsoft Corporation Template constrained posterior probability
US8140335B2 (en) 2007-12-11 2012-03-20 Voicebox Technologies, Inc. System and method for providing a natural language voice user interface in an integrated voice navigation services environment
US8972247B2 (en) * 2007-12-26 2015-03-03 Marvell World Trade Ltd. Selection of speech encoding scheme in wireless communication terminals
US8949122B2 (en) * 2008-02-25 2015-02-03 Nuance Communications, Inc. Stored phrase reutilization when testing speech recognition
US8831950B2 (en) * 2008-04-07 2014-09-09 Nuance Communications, Inc. Automated voice enablement of a web page
US9047869B2 (en) * 2008-04-07 2015-06-02 Nuance Communications, Inc. Free form input field support for automated voice enablement of a web page
US9305548B2 (en) 2008-05-27 2016-04-05 Voicebox Technologies Corporation System and method for an integrated, multi-modal, multi-device natural language voice services environment
US8589161B2 (en) 2008-05-27 2013-11-19 Voicebox Technologies, Inc. System and method for an integrated, multi-modal, multi-device natural language voice services environment
US8990088B2 (en) * 2009-01-28 2015-03-24 Microsoft Corporation Tool and framework for creating consistent normalization maps and grammars
US8031201B2 (en) * 2009-02-13 2011-10-04 Cognitive Edge Pte Ltd Computer-aided methods and systems for pattern-based cognition from fragmented material
US8326637B2 (en) 2009-02-20 2012-12-04 Voicebox Technologies, Inc. System and method for processing multi-modal device interactions in a natural language voice services environment
JP2011033680A (en) * 2009-07-30 2011-02-17 Sony Corp Voice processing device and method, and program
US9171541B2 (en) 2009-11-10 2015-10-27 Voicebox Technologies Corporation System and method for hybrid processing in a natural language voice services environment
WO2011059997A1 (en) 2009-11-10 2011-05-19 Voicebox Technologies, Inc. System and method for providing a natural language content dedication service
US8626511B2 (en) * 2010-01-22 2014-01-07 Google Inc. Multi-dimensional disambiguation of voice commands
US8554542B2 (en) * 2010-05-05 2013-10-08 Xerox Corporation Textual entailment method for linking text of an abstract to text in the main body of a document
WO2012001271A1 (en) * 2010-06-29 2012-01-05 France Telecom Adaptation of the operation of an apparatus
US9547626B2 (en) 2011-01-29 2017-01-17 Sdl Plc Systems, methods, and media for managing ambient adaptability of web applications and web services
US10140320B2 (en) 2011-02-28 2018-11-27 Sdl Inc. Systems, methods, and media for generating analytical data
US20120290509A1 (en) * 2011-05-13 2012-11-15 Microsoft Corporation Training Statistical Dialog Managers in Spoken Dialog Systems With Web Data
US9984054B2 (en) 2011-08-24 2018-05-29 Sdl Inc. Web interface including the review and manipulation of a web document and utilizing permission based control
US9317605B1 (en) 2012-03-21 2016-04-19 Google Inc. Presenting forked auto-completions
US10261994B2 (en) 2012-05-25 2019-04-16 Sdl Inc. Method and system for automatic management of reputation of translators
US9588964B2 (en) * 2012-09-18 2017-03-07 Adobe Systems Incorporated Natural language vocabulary generation and usage
US9412366B2 (en) 2012-09-18 2016-08-09 Adobe Systems Incorporated Natural language image spatial and tonal localization
US9436382B2 (en) 2012-09-18 2016-09-06 Adobe Systems Incorporated Natural language image editing
US9916306B2 (en) 2012-10-19 2018-03-13 Sdl Inc. Statistical linguistic analysis of source content
CN103971686B (en) * 2013-01-30 2015-06-10 Tencent Technology (Shenzhen) Company Limited Method and system for automatically recognizing voice
US9299339B1 (en) * 2013-06-25 2016-03-29 Google Inc. Parsing rule augmentation based on query sequence and action co-occurrence
US9117452B1 (en) * 2013-06-25 2015-08-25 Google Inc. Exceptions to action invocation from parsing rules
US9123336B1 (en) * 2013-06-25 2015-09-01 Google Inc. Learning parsing rules and argument identification from crowdsourcing of proposed command inputs
US9646606B2 (en) 2013-07-03 2017-05-09 Google Inc. Speech recognition using domain knowledge
US9280314B2 (en) * 2013-10-17 2016-03-08 Panasonic Intellectual Property Corporation Of America Method for controlling cordless telephone device, handset of cordless telephone device, and cordless telephone device
US9558176B2 (en) 2013-12-06 2017-01-31 Microsoft Technology Licensing, Llc Discriminating between natural language and keyword language items
US9594737B2 (en) 2013-12-09 2017-03-14 Wolfram Alpha Llc Natural language-aided hypertext document authoring
US9600465B2 (en) 2014-01-10 2017-03-21 Qualcomm Incorporated Methods and apparatuses for quantifying the holistic value of an existing network of devices by measuring the complexity of a generated grammar
US9589563B2 (en) 2014-06-02 2017-03-07 Robert Bosch Gmbh Speech recognition of partial proper names by natural language processing
US9898459B2 (en) 2014-09-16 2018-02-20 Voicebox Technologies Corporation Integration of domain information into state transitions of a finite state transducer for natural language processing
US9626703B2 (en) 2014-09-16 2017-04-18 Voicebox Technologies Corporation Voice commerce
EP3207467A4 (en) 2014-10-15 2018-05-23 VoiceBox Technologies Corporation System and method for providing follow-up responses to prior natural language inputs of a user
US9898455B2 (en) * 2014-12-01 2018-02-20 Nuance Communications, Inc. Natural language understanding cache
US10037374B2 (en) 2015-01-30 2018-07-31 Qualcomm Incorporated Measuring semantic and syntactic similarity between grammars according to distance metrics for clustered data
US9508339B2 (en) * 2015-01-30 2016-11-29 Microsoft Technology Licensing, Llc Updating language understanding classifier models for a digital personal assistant based on crowd-sourcing
US9536072B2 (en) 2015-04-09 2017-01-03 Qualcomm Incorporated Machine-learning behavioral analysis to detect device theft and unauthorized device usage
US20160314183A1 (en) * 2015-04-21 2016-10-27 International Business Machines Corporation Custodian disambiguation and data matching
US20170018269A1 (en) * 2015-07-14 2017-01-19 Genesys Telecommunications Laboratories, Inc. Data driven speech enabled self-help systems and methods of operating thereof
WO2018023106A1 (en) 2016-07-29 2018-02-01 Erik SWART System and method of disambiguating natural language processing requests
CN106354714A (en) * 2016-08-29 2017-01-25 Guangdong University of Technology NLPIR Chinese character segmentation system based Chinese character segmentation tool
JP2018054790A (en) * 2016-09-28 2018-04-05 トヨタ自動車株式会社 Voice interaction system and voice interaction method

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5020021A (en) * 1985-01-14 1991-05-28 Hitachi, Ltd. System for automatic language translation using several dictionary storage areas and a noun table
US5131045A (en) * 1990-05-10 1992-07-14 Roth Richard G Audio-augmented data keying
US5963940A (en) * 1995-08-16 1999-10-05 Syracuse University Natural language information retrieval system and method
US5991712A (en) * 1996-12-05 1999-11-23 Sun Microsystems, Inc. Method, apparatus, and product for automatic generation of lexical features for speech recognition systems
US6163768A (en) * 1998-06-15 2000-12-19 Dragon Systems, Inc. Non-interactive enrollment in speech recognition
US6173261B1 (en) * 1998-09-30 2001-01-09 At&T Corp Grammar fragment acquisition using syntactic and semantic clustering
US6456972B1 (en) * 1998-09-30 2002-09-24 Scansoft, Inc. User interface for speech recognition system grammars
US20020169604A1 (en) * 2001-03-09 2002-11-14 Damiba Bertrand A. System, method and computer program product for genre-based grammars and acoustic models in a speech recognition framework
US20020169613A1 (en) * 2001-03-09 2002-11-14 Damiba Bertrand A. System, method and computer program product for reduced data collection in a speech recognition tuning process
US6556973B1 (en) * 2000-04-19 2003-04-29 Voxi Ab Conversion between data representation formats
US6636831B1 (en) * 1999-04-09 2003-10-21 Inroad, Inc. System and process for voice-controlled information retrieval

Family Cites Families (36)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5497319A (en) 1990-12-31 1996-03-05 Trans-Link International Corp. Machine translation and telecommunications system
US5369577A (en) * 1991-02-01 1994-11-29 Wang Laboratories, Inc. Text searching system
JPH05197389A (en) 1991-08-13 1993-08-06 Toshiba Corp Voice recognition device
US5278980A (en) 1991-08-16 1994-01-11 Xerox Corporation Iterative technique for phrase query formation and an information retrieval system employing same
DE69232407T2 (en) 1991-11-18 2002-09-12 Toshiba Kawasaki Kk Speech dialogue system to facilitate computer-human interaction
US5452397A (en) 1992-12-11 1995-09-19 Texas Instruments Incorporated Method and system for preventing entry of confusingly similar phases in a voice recognition system vocabulary list
US5615296A (en) 1993-11-12 1997-03-25 International Business Machines Corporation Continuous speech recognition and voice response system and method to enable conversational dialogues with microprocessors
US5694558A (en) 1994-04-22 1997-12-02 U S West Technologies, Inc. Method and system for interactive object-oriented dialogue management
US5675819A (en) 1994-06-16 1997-10-07 Xerox Corporation Document information retrieval using global word co-occurrence patterns
US5799268A (en) 1994-09-28 1998-08-25 Apple Computer, Inc. Method for extracting knowledge from online documentation and creating a glossary, index, help database or the like
US5640488A (en) 1995-05-05 1997-06-17 Panasonic Technologies, Inc. System and method for constructing clustered dictionary for speech and text recognition
US5883986A (en) 1995-06-02 1999-03-16 Xerox Corporation Method and system for automatic transcription correction
US5680511A (en) 1995-06-07 1997-10-21 Dragon Systems, Inc. Systems and methods for word recognition
US5828991A (en) 1995-06-30 1998-10-27 The Research Foundation Of The State University Of New York Sentence reconstruction using word ambiguity resolution
JP3627299B2 (en) 1995-07-19 2005-03-09 ソニー株式会社 Speech recognition method and apparatus
US6026388A (en) * 1995-08-16 2000-02-15 Textwise, Llc User interface and other enhancements for natural language information retrieval system and method
AU6849196A (en) 1995-08-16 1997-03-19 Syracuse University Multilingual document retrieval system and method using semantic vector matching
US5819260A (en) * 1996-01-22 1998-10-06 Lexis-Nexis Phrase recognition method and apparatus
US5926811A (en) * 1996-03-15 1999-07-20 Lexis-Nexis Statistical thesaurus, method of forming same, and use thereof in query expansion in automated text searching
US6098034A (en) * 1996-03-18 2000-08-01 Expert Ease Development, Ltd. Method for standardizing phrasing in a document
US5870706A (en) 1996-04-10 1999-02-09 Lucent Technologies, Inc. Method and apparatus for an improved language recognition system
US5819220A (en) 1996-09-30 1998-10-06 Hewlett-Packard Company Web triggered word set boosting for speech interfaces to the world wide web
US5797123A (en) 1996-10-01 1998-08-18 Lucent Technologies Inc. Method of key-phase detection and verification for flexible speech understanding
US6415250B1 (en) 1997-06-18 2002-07-02 Novell, Inc. System and method for identifying language using morphologically-based techniques
US5860063A (en) * 1997-07-11 1999-01-12 At&T Corp Automated meaningful phrase clustering
US6246989B1 (en) * 1997-07-24 2001-06-12 Intervoice Limited Partnership System and method for providing an adaptive dialog function choice model for various communication devices
US5960384A (en) 1997-09-03 1999-09-28 Brash; Douglas E. Method and device for parsing natural language sentences and other sequential symbolic expressions
US5995918A (en) 1997-09-17 1999-11-30 Unisys Corporation System and method for creating a language grammar using a spreadsheet or table interface
US6044337A (en) 1997-10-29 2000-03-28 At&T Corp Selection of superwords based on criteria relevant to both speech recognition and understanding
US6154722A (en) 1997-12-18 2000-11-28 Apple Computer, Inc. Method and apparatus for a speech recognition system language model that integrates a finite state grammar probability and an N-gram probability
US6499013B1 (en) 1998-09-09 2002-12-24 One Voice Technologies, Inc. Interactive user interface using speech recognition and natural language processing
US6587822B2 (en) * 1998-10-06 2003-07-01 Lucent Technologies Inc. Web-based platform for interactive voice response (IVR)
US7082397B2 (en) * 1998-12-01 2006-07-25 Nuance Communications, Inc. System for and method of creating and browsing a voice web
US6175830B1 (en) 1999-05-20 2001-01-16 Evresearch, Ltd. Information management, retrieval and display system and associated method
AU5451800A (en) * 1999-05-28 2000-12-18 Sehda, Inc. Phrase-based dialogue modeling with particular application to creating recognition grammars for voice-controlled user interfaces
US20020032564A1 (en) * 2000-04-19 2002-03-14 Farzad Ehsani Phrase-based dialogue modeling with particular application to creating a recognition grammar for a voice-controlled user interface

Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5020021A (en) * 1985-01-14 1991-05-28 Hitachi, Ltd. System for automatic language translation using several dictionary storage areas and a noun table
US5131045A (en) * 1990-05-10 1992-07-14 Roth Richard G Audio-augmented data keying
US5963940A (en) * 1995-08-16 1999-10-05 Syracuse University Natural language information retrieval system and method
US5991712A (en) * 1996-12-05 1999-11-23 Sun Microsystems, Inc. Method, apparatus, and product for automatic generation of lexical features for speech recognition systems
US6163768A (en) * 1998-06-15 2000-12-19 Dragon Systems, Inc. Non-interactive enrollment in speech recognition
US6173261B1 (en) * 1998-09-30 2001-01-09 At&T Corp Grammar fragment acquisition using syntactic and semantic clustering
US6456972B1 (en) * 1998-09-30 2002-09-24 Scansoft, Inc. User interface for speech recognition system grammars
US6636831B1 (en) * 1999-04-09 2003-10-21 Inroad, Inc. System and process for voice-controlled information retrieval
US6556973B1 (en) * 2000-04-19 2003-04-29 Voxi Ab Conversion between data representation formats
US20020169604A1 (en) * 2001-03-09 2002-11-14 Damiba Bertrand A. System, method and computer program product for genre-based grammars and acoustic models in a speech recognition framework
US20020169613A1 (en) * 2001-03-09 2002-11-14 Damiba Bertrand A. System, method and computer program product for reduced data collection in a speech recognition tuning process

Cited By (453)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140136189A1 (en) * 1999-05-28 2014-05-15 Fluential, Llc Phrase-Based Dialogue Modeling With Particular Application to Creating a Recognition Grammar
US9342504B2 (en) * 1999-05-28 2016-05-17 Nant Holdings Ip, Llc Phrase-based dialogue modeling with particular application to creating a recognition grammar
US9646614B2 (en) 2000-03-16 2017-05-09 Apple Inc. Fast, language-independent method for user authentication by voice
US8645137B2 (en) 2000-03-16 2014-02-04 Apple Inc. Fast, language-independent method for user authentication by voice
US7149695B1 (en) * 2000-10-13 2006-12-12 Apple Computer, Inc. Method and apparatus for speech recognition using semantic inference and word agglomeration
US20040153323A1 (en) * 2000-12-01 2004-08-05 Charney Michael L Method and system for voice activating web pages
US7640163B2 (en) * 2000-12-01 2009-12-29 The Trustees Of Columbia University In The City Of New York Method and system for voice activating web pages
US8176115B2 (en) * 2001-05-11 2012-05-08 Ambx Uk Limited Real-world representation system and language
US20020169817A1 (en) * 2001-05-11 2002-11-14 Koninklijke Philips Electronics N.V. Real-world representation system and language
US20030071833A1 (en) * 2001-06-07 2003-04-17 Dantzig Paul M. System and method for generating and presenting multi-modal applications from intent-based markup scripts
US7020841B2 (en) * 2001-06-07 2006-03-28 International Business Machines Corporation System and method for generating and presenting multi-modal applications from intent-based markup scripts
US20030004722A1 (en) * 2001-06-28 2003-01-02 Butzberger John W. Method of dynamically altering grammars in a memory efficient speech recognition system
US7324945B2 (en) * 2001-06-28 2008-01-29 Sri International Method of dynamically altering grammars in a memory efficient speech recognition system
US20100061534A1 (en) * 2001-07-03 2010-03-11 Apptera, Inc. Multi-Platform Capable Inference Engine and Universal Grammar Language Adapter for Intelligent Voice Application Execution
US20040225499A1 (en) * 2001-07-03 2004-11-11 Wang Sandy Chai-Jen Multi-platform capable inference engine and universal grammar language adapter for intelligent voice application execution
US20110106527A1 (en) * 2001-07-03 2011-05-05 Apptera, Inc. Method and Apparatus for Adapting a Voice Extensible Markup Language-enabled Voice System for Natural Speech Recognition and System Response
US7643998B2 (en) 2001-07-03 2010-01-05 Apptera, Inc. Method and apparatus for improving voice recognition performance in a voice application distribution system
US20030009339A1 (en) * 2001-07-03 2003-01-09 Yuen Michael S. Method and apparatus for improving voice recognition performance in a voice application distribution system
US7609829B2 (en) * 2001-07-03 2009-10-27 Apptera, Inc. Multi-platform capable inference engine and universal grammar language adapter for intelligent voice application execution
US20050234727A1 (en) * 2001-07-03 2005-10-20 Leo Chiu Method and apparatus for adapting a voice extensible markup language-enabled voice system for natural speech recognition and system response
US8718047B2 (en) 2001-10-22 2014-05-06 Apple Inc. Text to speech conversion of text messages from mobile communication devices
US20050086057A1 (en) * 2001-11-22 2005-04-21 Tetsuo Kosaka Speech recognition apparatus and its method and program
US20090144248A1 (en) * 2001-12-20 2009-06-04 Sybase 365, Inc. Context-Based Suggestions Mechanism and Adaptive Push Mechanism for Natural Language Systems
US8036877B2 (en) 2001-12-20 2011-10-11 Sybase, Inc. Context-based suggestions mechanism and adaptive push mechanism for natural language systems
US7231343B1 (en) * 2001-12-20 2007-06-12 Ianywhere Solutions, Inc. Synonyms mechanism for natural language systems
US20030167168A1 (en) * 2002-03-01 2003-09-04 International Business Machines Corporation Automatic generation of efficient grammar for heading selection
US7054813B2 (en) * 2002-03-01 2006-05-30 International Business Machines Corporation Automatic generation of efficient grammar for heading selection
US20080215329A1 (en) * 2002-03-27 2008-09-04 International Business Machines Corporation Methods and Apparatus for Generating Dialog State Conditioned Language Models
US7853449B2 (en) * 2002-03-27 2010-12-14 Nuance Communications, Inc. Methods and apparatus for generating dialog state conditioned language models
EP1349145A3 (en) * 2002-03-29 2005-03-09 Samsung Electronics Co., Ltd. System and method for providing information using spoken dialogue interface
US7225128B2 (en) 2002-03-29 2007-05-29 Samsung Electronics Co., Ltd. System and method for providing information using spoken dialogue interface
EP1349145A2 (en) * 2002-03-29 2003-10-01 Samsung Electronics Co., Ltd. System and method for providing information using spoken dialogue interface
US20030200094A1 (en) * 2002-04-23 2003-10-23 Gupta Narendra K. System and method of using existing knowledge to rapidly train automatic speech recognizers
US20100050150A1 (en) * 2002-06-14 2010-02-25 Apptera, Inc. Method and System for Developing Speech Applications
US7249019B2 (en) * 2002-08-06 2007-07-24 Sri International Method and apparatus for providing an integrated speech recognition and natural language understanding for a dialog system
US20040030557A1 (en) * 2002-08-06 2004-02-12 Sri International Method and apparatus for providing an integrated speech recognition and natural language understanding for a dialog system
US20040027379A1 (en) * 2002-08-08 2004-02-12 Hong Huey Anna Onon Integrated visual development system for creating computer-implemented dialog scripts
US20040054530A1 (en) * 2002-09-18 2004-03-18 International Business Machines Corporation Generating speech recognition grammars from a large corpus of data
US7567902B2 (en) * 2002-09-18 2009-07-28 Nuance Communications, Inc. Generating speech recognition grammars from a large corpus of data
US20050256715A1 (en) * 2002-10-08 2005-11-17 Yoshiyuki Okimoto Language model generation and accumulation device, speech recognition device, language model creation method, and speech recognition method
US20040111405A1 (en) * 2002-10-18 2004-06-10 Hewlett-Packard Development Company, L.P. Communication system and method
US7409207B2 (en) * 2002-10-18 2008-08-05 Hewlett-Packard Development Company, L.P. Communication system and method
US9626965B2 (en) 2002-10-31 2017-04-18 Promptu Systems Corporation Efficient empirical computation and utilization of acoustic confusability
US7519534B2 (en) * 2002-10-31 2009-04-14 Agiletv Corporation Speech controlled access to content on a presentation medium
US8959019B2 (en) 2002-10-31 2015-02-17 Promptu Systems Corporation Efficient empirical determination, computation, and use of acoustic confusability measures
US8321427B2 (en) 2002-10-31 2012-11-27 Promptu Systems Corporation Method and apparatus for generation and augmentation of search terms from external and internal sources
US10121469B2 (en) 2002-10-31 2018-11-06 Promptu Systems Corporation Efficient empirical determination, computation, and use of acoustic confusability measures
US8793127B2 (en) 2002-10-31 2014-07-29 Promptu Systems Corporation Method and apparatus for automatically determining speaker characteristics for speech-directed advertising or other enhancement of speech-controlled devices or services
US8862596B2 (en) 2002-10-31 2014-10-14 Promptu Systems Corporation Method and apparatus for generation and augmentation of search terms from external and internal sources
US9305549B2 (en) 2002-10-31 2016-04-05 Promptu Systems Corporation Method and apparatus for generation and augmentation of search terms from external and internal sources
US20040193426A1 (en) * 2002-10-31 2004-09-30 Maddux Scott Lynn Speech controlled access to content on a presentation medium
US8949132B2 (en) 2002-12-18 2015-02-03 At&T Intellectual Property Ii, L.P. System and method of providing a spoken dialog interface to a website
US7580842B1 (en) * 2002-12-18 2009-08-25 At&T Intellectual Property Ii, Lp. System and method of providing a spoken dialog interface to a website
US8090583B1 (en) 2002-12-18 2012-01-03 At&T Intellectual Property Ii, L.P. System and method of automatically generating building dialog services by exploiting the content and structure of websites
US8065151B1 (en) 2002-12-18 2011-11-22 At&T Intellectual Property Ii, L.P. System and method of automatically building dialog services by exploiting the content and structure of websites
US8060369B2 (en) 2002-12-18 2011-11-15 At&T Intellectual Property Ii, L.P. System and method of providing a spoken dialog interface to a website
US8249879B2 (en) 2002-12-18 2012-08-21 At&T Intellectual Property Ii, L.P. System and method of providing a spoken dialog interface to a website
US8688456B2 (en) 2002-12-18 2014-04-01 At&T Intellectual Property Ii, L.P. System and method of providing a spoken dialog interface to a website
US7373300B1 (en) * 2002-12-18 2008-05-13 At&T Corp. System and method of providing a spoken dialog interface to a website
US8442834B2 (en) 2002-12-18 2013-05-14 At&T Intellectual Property Ii, L.P. System and method of providing a spoken dialog interface to a website
US20090292529A1 (en) * 2002-12-18 2009-11-26 At&T Corp. System and method of providing a spoken dialog interface to a website
US7818174B1 (en) 2003-01-16 2010-10-19 Comverse, Inc. Speech-recognition grammar analysis
US7243071B1 (en) * 2003-01-16 2007-07-10 Comverse, Inc. Speech-recognition grammar analysis
US7698136B1 (en) * 2003-01-28 2010-04-13 Voxify, Inc. Methods and apparatus for flexible speech recognition
US20040193557A1 (en) * 2003-03-25 2004-09-30 Olsen Jesse Dale Systems and methods for reducing ambiguity of communications
US7698435B1 (en) 2003-04-15 2010-04-13 Sprint Spectrum L.P. Distributed interactive media system and method
US7603433B1 (en) * 2003-04-15 2009-10-13 Sprint Spectrum, L.P. IMS-based interactive media system and method
US20050143975A1 (en) * 2003-06-06 2005-06-30 Charney Michael L. System and method for voice activating web pages
US9202467B2 (en) 2003-06-06 2015-12-01 The Trustees Of Columbia University In The City Of New York System and method for voice activating web pages
US20070043570A1 (en) * 2003-07-18 2007-02-22 Koninklijke Philips Electronics N.V. Method of controlling a dialoging process
US8209185B2 (en) 2003-09-05 2012-06-26 Emc Corporation Interface for management of auditory communications
US20090070380A1 (en) * 2003-09-25 2009-03-12 Dictaphone Corporation Method, system, and apparatus for assembly, transport and display of clinical data
US7860717B2 (en) * 2003-09-25 2010-12-28 Dictaphone Corporation System and method for customizing speech recognition input and output
US20050114122A1 (en) * 2003-09-25 2005-05-26 Dictaphone Corporation System and method for customizing speech recognition input and output
US20070136067A1 (en) * 2003-11-10 2007-06-14 Scholl Holger R Audio dialogue system and voice browsing method
US20110099016A1 (en) * 2003-11-17 2011-04-28 Apptera, Inc. Multi-Tenant Self-Service VXML Portal
WO2005053200A2 (en) * 2003-11-17 2005-06-09 Apptera, Inc. Multi-platform capable inference engine and universal grammar language adapter for intelligent voice application execution
US8509403B2 (en) 2003-11-17 2013-08-13 Htc Corporation System for advertisement selection, placement and delivery
WO2005053200A3 (en) * 2003-11-17 2009-04-09 Apptera Inc Multi-platform capable inference engine and universal grammar language adapter for intelligent voice application execution
US20110064207A1 (en) * 2003-11-17 2011-03-17 Apptera, Inc. System for Advertisement Selection, Placement and Delivery
US20050119892A1 (en) * 2003-12-02 2005-06-02 International Business Machines Corporation Method and arrangement for managing grammar options in a graphical callflow builder
US8355918B2 (en) * 2003-12-02 2013-01-15 Nuance Communications, Inc. Method and arrangement for managing grammar options in a graphical callflow builder
US20120209613A1 (en) * 2003-12-02 2012-08-16 Nuance Communications, Inc. Method and arrangement for managing grammar options in a graphical callflow builder
US20050125270A1 (en) * 2003-12-08 2005-06-09 International Business Machines Corporation Efficient presentation of correction options in a speech interface based upon user selection probability
US7885816B2 (en) 2003-12-08 2011-02-08 International Business Machines Corporation Efficient presentation of correction options in a speech interface based upon user selection probability
US9378187B2 (en) * 2003-12-11 2016-06-28 International Business Machines Corporation Creating a presentation document
US20050132275A1 (en) * 2003-12-11 2005-06-16 International Business Machines Corporation Creating a presentation document
US20050132273A1 (en) * 2003-12-11 2005-06-16 International Business Machines Corporation Amending a session document during a presentation
US20050132274A1 (en) * 2003-12-11 2005-06-16 International Business Machines Corporation Creating a presentation document
US20050132271A1 (en) * 2003-12-11 2005-06-16 International Business Machines Corporation Creating a session document from a presentation document
US20090037820A1 (en) * 2004-01-13 2009-02-05 International Business Machines Corporation Differential Dynamic Content Delivery With A Presenter-Alterable Session Copy Of A User Profile
US8578263B2 (en) 2004-01-13 2013-11-05 International Business Machines Corporation Differential dynamic content delivery with a presenter-alterable session copy of a user profile
US20050165900A1 (en) * 2004-01-13 2005-07-28 International Business Machines Corporation Differential dynamic content delivery with a participant alterable session copy of a user profile
US7890848B2 (en) 2004-01-13 2011-02-15 International Business Machines Corporation Differential dynamic content delivery with alternative content presentation
US8499232B2 (en) 2004-01-13 2013-07-30 International Business Machines Corporation Differential dynamic content delivery with a participant alterable session copy of a user profile
US7774693B2 (en) 2004-01-13 2010-08-10 International Business Machines Corporation Differential dynamic content delivery with device controlling action
US8010885B2 (en) 2004-01-13 2011-08-30 International Business Machines Corporation Differential dynamic content delivery with a presenter-alterable session copy of a user profile
US20070250602A1 (en) * 2004-01-13 2007-10-25 Bodin William K Differential Dynamic Content Delivery With A Presenter-Alterable Session Copy Of A User Profile
US20050209853A1 (en) * 2004-03-19 2005-09-22 International Business Machines Corporation Speech disambiguation for string processing in an interactive voice response system
US20050228668A1 (en) * 2004-03-31 2005-10-13 Wilson James M System and method for automatic generation of dialog run time systems
EP1583076A1 (en) 2004-03-31 2005-10-05 AT&T Corp. System and method for automatic generation of dialogue run time systems
US9213692B2 (en) * 2004-04-16 2015-12-15 At&T Intellectual Property Ii, L.P. System and method for the automatic validation of dialog run time systems
US9584662B2 (en) * 2004-04-16 2017-02-28 At&T Intellectual Property Ii, L.P. System and method for the automatic validation of dialog run time systems
US20080177838A1 (en) * 2004-04-26 2008-07-24 International Business Machines Corporation Dynamic Media Content For Collaborators With Client Environment Information In Dynamic Client Contexts
US8161112B2 (en) 2004-04-26 2012-04-17 International Business Machines Corporation Dynamic media content for collaborators with client environment information in dynamic client contexts
US7827239B2 (en) 2004-04-26 2010-11-02 International Business Machines Corporation Dynamic media content for collaborators with client environment information in dynamic client contexts
US20080177837A1 (en) * 2004-04-26 2008-07-24 International Business Machines Corporation Dynamic Media Content For Collaborators With Client Locations In Dynamic Client Contexts
US8161131B2 (en) 2004-04-26 2012-04-17 International Business Machines Corporation Dynamic media content for collaborators with client locations in dynamic client contexts
US20050240603A1 (en) * 2004-04-26 2005-10-27 International Business Machines Corporation Dynamic media content for collaborators with client environment information in dynamic client contexts
US20050261902A1 (en) * 2004-05-24 2005-11-24 Sbc Knowledge Ventures, L.P. Method for designing an automated speech recognition (ASR) interface for a customer call center
US7460650B2 (en) 2004-05-24 2008-12-02 At&T Intellectual Property I, L.P. Method for designing an automated speech recognition (ASR) interface for a customer call center
US8761381B2 (en) 2004-05-24 2014-06-24 At&T Intellectual Property I, L.P. Method for designing an automated speech recognition (ASR) interface for a customer call center
US9197752B2 (en) 2004-05-24 2015-11-24 At&T Intellectual Property I, L.P. Method for designing an automated speech recognition (ASR) interface for a customer call center
US20050283367A1 (en) * 2004-06-17 2005-12-22 International Business Machines Corporation Method and apparatus for voice-enabling an application
US8768711B2 (en) * 2004-06-17 2014-07-01 Nuance Communications, Inc. Method and apparatus for voice-enabling an application
US8244542B2 (en) * 2004-07-01 2012-08-14 Emc Corporation Video surveillance
US8180743B2 (en) 2004-07-01 2012-05-15 Emc Corporation Information management
US8180742B2 (en) 2004-07-01 2012-05-15 Emc Corporation Policy-based information management
US20060004818A1 (en) * 2004-07-01 2006-01-05 Claudatos Christopher H Efficient information management
US20060004819A1 (en) * 2004-07-01 2006-01-05 Claudatos Christopher H Information management
US8229904B2 (en) 2004-07-01 2012-07-24 Emc Corporation Storage pools for information management
US20060004820A1 (en) * 2004-07-01 2006-01-05 Claudatos Christopher H Storage pools for information management
US20060004582A1 (en) * 2004-07-01 2006-01-05 Claudatos Christopher H Video surveillance
US9268780B2 (en) 2004-07-01 2016-02-23 Emc Corporation Content-driven information lifecycle management
US20060004868A1 (en) * 2004-07-01 2006-01-05 Claudatos Christopher H Policy-based information management
US20090089659A1 (en) * 2004-07-08 2009-04-02 International Business Machines Corporation Differential Dynamic Content Delivery To Alternate Display Device Locations
US8180832B2 (en) 2004-07-08 2012-05-15 International Business Machines Corporation Differential dynamic content delivery to alternate display device locations
US8185814B2 (en) 2004-07-08 2012-05-22 International Business Machines Corporation Differential dynamic delivery of content according to user expressions of interest
US20060010370A1 (en) * 2004-07-08 2006-01-12 International Business Machines Corporation Differential dynamic delivery of presentation previews
US20060010365A1 (en) * 2004-07-08 2006-01-12 International Business Machines Corporation Differential dynamic delivery of content according to user expressions of interest
US8214432B2 (en) 2004-07-08 2012-07-03 International Business Machines Corporation Differential dynamic content delivery to alternate display device locations
US20080177866A1 (en) * 2004-07-08 2008-07-24 International Business Machines Corporation Differential Dynamic Delivery Of Content To Users Not In Attendance At A Presentation
US9167087B2 (en) 2004-07-13 2015-10-20 International Business Machines Corporation Dynamic media content for collaborators including disparate location representations
US8005025B2 (en) 2004-07-13 2011-08-23 International Business Machines Corporation Dynamic media content for collaborators with VOIP support for client communications
US8626514B2 (en) 2004-08-31 2014-01-07 Emc Corporation Interface for management of multiple auditory communications
US20060047518A1 (en) * 2004-08-31 2006-03-02 Claudatos Christopher H Interface for management of multiple auditory communications
US20060112040A1 (en) * 2004-10-13 2006-05-25 Hewlett-Packard Development Company, L.P. Device, method, and program for document classification
US20060112063A1 (en) * 2004-11-05 2006-05-25 International Business Machines Corporation System, apparatus, and methods for creating alternate-mode applications
US7920681B2 (en) * 2004-11-05 2011-04-05 International Business Machines Corporation System, apparatus, and methods for creating alternate-mode applications
US20060149553A1 (en) * 2005-01-05 2006-07-06 At&T Corp. System and method for using a library to interactively design natural language spoken dialog systems
US10199039B2 (en) 2005-01-05 2019-02-05 Nuance Communications, Inc. Library of existing spoken dialog data for use in generating new natural language spoken dialog systems
US9240197B2 (en) 2005-01-05 2016-01-19 At&T Intellectual Property Ii, L.P. Library of existing spoken dialog data for use in generating new natural language spoken dialog systems
US8694324B2 (en) 2005-01-05 2014-04-08 At&T Intellectual Property Ii, L.P. System and method of providing an automated data-collection in spoken dialog systems
US8914294B2 (en) 2005-01-05 2014-12-16 At&T Intellectual Property Ii, L.P. System and method of providing an automated data-collection in spoken dialog systems
US20060155526A1 (en) * 2005-01-10 2006-07-13 At&T Corp. Systems, Devices, & Methods for automating non-deterministic processes
KR100718147B1 (en) * 2005-02-01 2007-05-14 삼성전자주식회사 Apparatus and method of generating grammar network for speech recognition and dialogue speech recognition apparatus and method employing the same
US20060178869A1 (en) * 2005-02-10 2006-08-10 Microsoft Corporation Classification filter for processing data for creating a language model
US8165870B2 (en) * 2005-02-10 2012-04-24 Microsoft Corporation Classification filter for processing data for creating a language model
US20060182239A1 (en) * 2005-02-16 2006-08-17 Yves Lechervy Process for synchronizing a speech service and a visual presentation
US20060277032A1 (en) * 2005-05-20 2006-12-07 Sony Computer Entertainment Inc. Structure for grammar and dictionary representation in voice recognition and method for simplifying link and node-generated grammars
WO2006127504A3 (en) * 2005-05-20 2007-06-28 Sony Computer Entertainment Inc Optimisation of a grammar for speech recognition
US7921011B2 (en) 2005-05-20 2011-04-05 Sony Computer Entertainment Inc. Structure for grammar and dictionary representation in voice recognition and method for simplifying link and node-generated grammars
WO2006127504A2 (en) * 2005-05-20 2006-11-30 Sony Computer Entertainment Inc. Optimisation of a grammar for speech recognition
US20060277031A1 (en) * 2005-06-02 2006-12-07 Microsoft Corporation Authoring speech grammars
US7617093B2 (en) * 2005-06-02 2009-11-10 Microsoft Corporation Authoring speech grammars
US20070043758A1 (en) * 2005-08-19 2007-02-22 Bodin William K Synthesizing aggregate data of disparate data types into data of a uniform data type
US8977636B2 (en) 2005-08-19 2015-03-10 International Business Machines Corporation Synthesizing aggregate data of disparate data types into data of a uniform data type
US9905223B2 (en) 2005-08-27 2018-02-27 Nuance Communications, Inc. System and method for using semantic and syntactic graphs for utterance classification
US9218810B2 (en) 2005-08-27 2015-12-22 At&T Intellectual Property Ii, L.P. System and method for using semantic and syntactic graphs for utterance classification
US8700404B1 (en) * 2005-08-27 2014-04-15 At&T Intellectual Property Ii, L.P. System and method for using semantic and syntactic graphs for utterance classification
US9501741B2 (en) 2005-09-08 2016-11-22 Apple Inc. Method and apparatus for building an intelligent automated assistant
US8677377B2 (en) 2005-09-08 2014-03-18 Apple Inc. Method and apparatus for building an intelligent automated assistant
US10318871B2 (en) 2005-09-08 2019-06-11 Apple Inc. Method and apparatus for building an intelligent automated assistant
US8266220B2 (en) 2005-09-14 2012-09-11 International Business Machines Corporation Email management and rendering
US20070061401A1 (en) * 2005-09-14 2007-03-15 Bodin William K Email management and rendering
US8024196B1 (en) * 2005-09-19 2011-09-20 Sap Ag Techniques for creating and translating voice applications
US9958987B2 (en) 2005-09-30 2018-05-01 Apple Inc. Automated response to and sensing of user activity in portable devices
US9389729B2 (en) 2005-09-30 2016-07-12 Apple Inc. Automated response to and sensing of user activity in portable devices
US8614431B2 (en) 2005-09-30 2013-12-24 Apple Inc. Automated response to and sensing of user activity in portable devices
US9619079B2 (en) 2005-09-30 2017-04-11 Apple Inc. Automated response to and sensing of user activity in portable devices
KR100923180B1 (en) 2005-10-21 2009-10-22 뉘앙스 커뮤니케이션즈, 인코포레이티드 Creating a mixed-initiative grammar from directed dialog grammars
US8694319B2 (en) 2005-11-03 2014-04-08 International Business Machines Corporation Dynamic prosody adjustment for voice-rendering synthesized data
US20070129947A1 (en) * 2005-12-02 2007-06-07 International Business Machines Corporation Method and system for testing sections of large speech applications
US8661411B2 (en) * 2005-12-02 2014-02-25 Nuance Communications, Inc. Method and system for testing sections of large speech applications
US9384190B2 (en) * 2005-12-15 2016-07-05 Nuance Communications, Inc. Method and system for conveying an example in a natural language understanding application
US10192543B2 (en) 2005-12-15 2019-01-29 Nuance Communications, Inc. Method and system for conveying an example in a natural language understanding application
US20140156265A1 (en) * 2005-12-15 2014-06-05 Nuance Communications, Inc. Method and system for conveying an example in a natural language understanding application
US8271107B2 (en) 2006-01-13 2012-09-18 International Business Machines Corporation Controlling audio operation for data management and data rendering
US20070168191A1 (en) * 2006-01-13 2007-07-19 Bodin William K Controlling audio operation for data management and data rendering
US8229733B2 (en) * 2006-02-09 2012-07-24 John Harney Method and apparatus for linguistic independent parsing in a natural language systems
US20070185702A1 (en) * 2006-02-09 2007-08-09 John Harney Language independent parsing in natural language systems
US20070192683A1 (en) * 2006-02-13 2007-08-16 Bodin William K Synthesizing the content of disparate data types
US20070192674A1 (en) * 2006-02-13 2007-08-16 Bodin William K Publishing content through RSS feeds
US20080275893A1 (en) * 2006-02-13 2008-11-06 International Business Machines Corporation Aggregating Content Of Disparate Data Types From Disparate Data Sources For Single Point Access
US7996754B2 (en) 2006-02-13 2011-08-09 International Business Machines Corporation Consolidated content management
US9135339B2 (en) 2006-02-13 2015-09-15 International Business Machines Corporation Invoking an audio hyperlink
US20070192684A1 (en) * 2006-02-13 2007-08-16 Bodin William K Consolidated content management
US7949681B2 (en) 2006-02-13 2011-05-24 International Business Machines Corporation Aggregating content of disparate data types from disparate data sources for single point access
US9361299B2 (en) 2006-03-09 2016-06-07 International Business Machines Corporation RSS content administration for rendering RSS content on a digital audio player
US9092542B2 (en) 2006-03-09 2015-07-28 International Business Machines Corporation Podcasting content associated with a user account
US20070214485A1 (en) * 2006-03-09 2007-09-13 Bodin William K Podcasting content associated with a user account
US20070213857A1 (en) * 2006-03-09 2007-09-13 Bodin William K RSS content administration for rendering RSS content on a digital audio player
US20070214149A1 (en) * 2006-03-09 2007-09-13 International Business Machines Corporation Associating user selected content management directives with user selected ratings
US8849895B2 (en) 2006-03-09 2014-09-30 International Business Machines Corporation Associating user selected content management directives with user selected ratings
US8078467B2 (en) * 2006-03-10 2011-12-13 Nec (China) Co., Ltd. Device and method for language model switching and adaptation
US20080040099A1 (en) * 2006-03-10 2008-02-14 Nec (China) Co., Ltd. Device and method for language model switching and adaption
US9646277B2 (en) 2006-05-07 2017-05-09 Varcode Ltd. System and method for improved quality management in a product logistic chain
US10037507B2 (en) 2006-05-07 2018-07-31 Varcode Ltd. System and method for improved quality management in a product logistic chain
US20070276866A1 (en) * 2006-05-24 2007-11-29 Bodin William K Providing disparate content as a playlist of media files
US20070277233A1 (en) * 2006-05-24 2007-11-29 Bodin William K Token-based content subscription
US20070277088A1 (en) * 2006-05-24 2007-11-29 Bodin William K Enhancing an existing web page
US8286229B2 (en) 2006-05-24 2012-10-09 International Business Machines Corporation Token-based content subscription
US7778980B2 (en) 2006-05-24 2010-08-17 International Business Machines Corporation Providing disparate content as a playlist of media files
US20070282594A1 (en) * 2006-06-02 2007-12-06 Microsoft Corporation Machine translation in natural language application development
US8370127B2 (en) * 2006-06-16 2013-02-05 Nuance Communications, Inc. Systems and methods for building asset based natural language call routing application with limited resources
US20080010280A1 (en) * 2006-06-16 2008-01-10 International Business Machines Corporation Method and apparatus for building asset based natural language call routing application with limited resources
US8990126B1 (en) * 2006-08-03 2015-03-24 At&T Intellectual Property Ii, L.P. Copying human interactions through learning and discovery
US20080033720A1 (en) * 2006-08-04 2008-02-07 Pankaj Kankar A method and system for speech classification
US20080052082A1 (en) * 2006-08-23 2008-02-28 Asustek Computer Inc. Voice control method
US9117447B2 (en) 2006-09-08 2015-08-25 Apple Inc. Using event alert text as input to an automated assistant
US8930191B2 (en) 2006-09-08 2015-01-06 Apple Inc. Paraphrasing of user requests and results by automated digital assistant
US8942986B2 (en) 2006-09-08 2015-01-27 Apple Inc. Determining user intent based on ontologies of domains
US7831432B2 (en) 2006-09-29 2010-11-09 International Business Machines Corporation Audio menus describing media contents of media players
US9196241B2 (en) 2006-09-29 2015-11-24 International Business Machines Corporation Asynchronous communications using messages recorded on handheld devices
US20080082635A1 (en) * 2006-09-29 2008-04-03 Bodin William K Asynchronous Communications Using Messages Recorded On Handheld Devices
US8450591B2 (en) 2006-10-03 2013-05-28 Sony Computer Entertainment Inc. Methods for generating new output sounds from input sounds
US7902447B1 (en) * 2006-10-03 2011-03-08 Sony Computer Entertainment Inc. Automatic composition of sound sequences using finite state automata
US20080154594A1 (en) * 2006-12-26 2008-06-26 Nobuyasu Itoh Method for segmenting utterances by using partner's response
US8793132B2 (en) * 2006-12-26 2014-07-29 Nuance Communications, Inc. Method for segmenting utterances by using partner's response
US20080215325A1 (en) * 2006-12-27 2008-09-04 Hiroshi Horii Technique for accurately detecting system failure
US9318100B2 (en) 2007-01-03 2016-04-19 International Business Machines Corporation Supplementing audio recorded in a media file
US8219402B2 (en) 2007-01-03 2012-07-10 International Business Machines Corporation Asynchronous receipt of information from a user
US20080161948A1 (en) * 2007-01-03 2008-07-03 Bodin William K Supplementing audio recorded in a media file
US20100030553A1 (en) * 2007-01-04 2010-02-04 Thinking Solutions Pty Ltd Linguistic Analysis
US8600736B2 (en) * 2007-01-04 2013-12-03 Thinking Solutions Pty Ltd Linguistic analysis
US8712779B2 (en) * 2007-03-19 2014-04-29 Nec Corporation Information retrieval system, information retrieval method, and information retrieval program
US20100114571A1 (en) * 2007-03-19 2010-05-06 Kentaro Nagatomo Information retrieval system, information retrieval method, and information retrieval program
US8977255B2 (en) 2007-04-03 2015-03-10 Apple Inc. Method and system for operating a multi-function portable electronic device using voice-activation
US10176451B2 (en) 2007-05-06 2019-01-08 Varcode Ltd. System and method for quality management utilizing barcode indicators
US20090018830A1 (en) * 2007-07-11 2009-01-15 Vandinburg Gmbh Speech control of computing devices
US20100286979A1 (en) * 2007-08-01 2010-11-11 Ginger Software, Inc. Automatic context sensitive language correction and enhancement using an internet corpus
US9026432B2 (en) 2007-08-01 2015-05-05 Ginger Software, Inc. Automatic context sensitive language generation, correction and enhancement using an internet corpus
US8914278B2 (en) * 2007-08-01 2014-12-16 Ginger Software, Inc. Automatic context sensitive language correction and enhancement using an internet corpus
US8364485B2 (en) * 2007-08-27 2013-01-29 International Business Machines Corporation Method for automatically identifying sentence boundaries in noisy conversational data
US20090063150A1 (en) * 2007-08-27 2009-03-05 International Business Machines Corporation Method for automatically identifying sentence boundaries in noisy conversational data
US9053089B2 (en) 2007-10-02 2015-06-09 Apple Inc. Part-of-speech tagging using latent analogy
US9836678B2 (en) 2007-11-14 2017-12-05 Varcode Ltd. System and method for quality management utilizing barcode indicators
US9135544B2 (en) 2007-11-14 2015-09-15 Varcode Ltd. System and method for quality management utilizing barcode indicators
US9558439B2 (en) 2007-11-14 2017-01-31 Varcode Ltd. System and method for quality management utilizing barcode indicators
US10262251B2 (en) 2007-11-14 2019-04-16 Varcode Ltd. System and method for quality management utilizing barcode indicators
US8620662B2 (en) 2007-11-20 2013-12-31 Apple Inc. Context-aware unit selection
US10002189B2 (en) 2007-12-20 2018-06-19 Apple Inc. Method and apparatus for searching using an active ontology
US8214376B1 (en) * 2007-12-31 2012-07-03 Symantec Corporation Techniques for global single instance segment-based indexing for backup data
US9330720B2 (en) 2008-01-03 2016-05-03 Apple Inc. Methods and apparatus for altering audio output signals
US8868424B1 (en) * 2008-02-08 2014-10-21 West Corporation Interactive voice response data collection object framework, vertical benchmarking, and bootstrapping engine
US9361886B2 (en) 2008-02-22 2016-06-07 Apple Inc. Providing text input using speech data and non-speech data
US8688446B2 (en) 2008-02-22 2014-04-01 Apple Inc. Providing text input using speech data and non-speech data
US20090240500A1 (en) * 2008-03-19 2009-09-24 Kabushiki Kaisha Toshiba Speech recognition apparatus and method
US9865248B2 (en) 2008-04-05 2018-01-09 Apple Inc. Intelligent text-to-speech conversion
US9626955B2 (en) 2008-04-05 2017-04-18 Apple Inc. Intelligent text-to-speech conversion
US8996376B2 (en) 2008-04-05 2015-03-31 Apple Inc. Intelligent text-to-speech conversion
US20110016141A1 (en) * 2008-04-15 2011-01-20 Microsoft Corporation Web Traffic Analysis Tool
US9946706B2 (en) 2008-06-07 2018-04-17 Apple Inc. Automatic language identification for dynamic text processing
US9996783B2 (en) 2008-06-10 2018-06-12 Varcode Ltd. System and method for quality management utilizing barcode indicators
US9626610B2 (en) 2008-06-10 2017-04-18 Varcode Ltd. System and method for quality management utilizing barcode indicators
US10303992B2 (en) 2008-06-10 2019-05-28 Varcode Ltd. System and method for quality management utilizing barcode indicators
US10049314B2 (en) 2008-06-10 2018-08-14 Varcode Ltd. Barcoded indicators for quality management
US9710743B2 (en) 2008-06-10 2017-07-18 Varcode Ltd. Barcoded indicators for quality management
US10089566B2 (en) 2008-06-10 2018-10-02 Varcode Ltd. Barcoded indicators for quality management
US9646237B2 (en) 2008-06-10 2017-05-09 Varcode Ltd. Barcoded indicators for quality management
US9384435B2 (en) 2008-06-10 2016-07-05 Varcode Ltd. Barcoded indicators for quality management
US9317794B2 (en) 2008-06-10 2016-04-19 Varcode Ltd. Barcoded indicators for quality management
WO2010006087A1 (en) * 2008-07-08 2010-01-14 David Seaberg Process for providing and editing instructions, data, data structures, and algorithms in a computer system
US10108612B2 (en) 2008-07-31 2018-10-23 Apple Inc. Mobile device having human language translation capability with positional feedback
US9535906B2 (en) 2008-07-31 2017-01-03 Apple Inc. Mobile device having human language translation capability with positional feedback
US8768702B2 (en) 2008-09-05 2014-07-01 Apple Inc. Multi-tiered voice feedback in an electronic device
US9691383B2 (en) 2008-09-05 2017-06-27 Apple Inc. Multi-tiered voice feedback in an electronic device
US8898568B2 (en) 2008-09-09 2014-11-25 Apple Inc. Audio user interface
US8712776B2 (en) 2008-09-29 2014-04-29 Apple Inc. Systems and methods for selective text to speech synthesis
US8583418B2 (en) 2008-09-29 2013-11-12 Apple Inc. Systems and methods of detecting language and natural language strings for text to speech synthesis
US9412392B2 (en) 2008-10-02 2016-08-09 Apple Inc. Electronic devices with voice command and contextual data processing capabilities
US8762469B2 (en) 2008-10-02 2014-06-24 Apple Inc. Electronic devices with voice command and contextual data processing capabilities
US8676904B2 (en) 2008-10-02 2014-03-18 Apple Inc. Electronic devices with voice command and contextual data processing capabilities
US8713119B2 (en) 2008-10-02 2014-04-29 Apple Inc. Electronic devices with voice command and contextual data processing capabilities
US9324007B2 (en) 2008-11-29 2016-04-26 At&T Intellectual Property I, L.P. Systems and methods for detecting and coordinating changes in lexical items
US8271422B2 (en) * 2008-11-29 2012-09-18 At&T Intellectual Property I, Lp Systems and methods for detecting and coordinating changes in lexical items
US20100138377A1 (en) * 2008-11-29 2010-06-03 Jeremy Wright Systems and Methods for Detecting and Coordinating Changes in Lexical Items
US9959870B2 (en) 2008-12-11 2018-05-01 Apple Inc. Speech recognition involving a mobile device
US9569770B1 (en) 2009-01-13 2017-02-14 Amazon Technologies, Inc. Generating constructed phrases
US20100179801A1 (en) * 2009-01-13 2010-07-15 Steve Huynh Determining Phrases Related to Other Phrases
US8768852B2 (en) 2009-01-13 2014-07-01 Amazon Technologies, Inc. Determining phrases related to other phrases
US8862252B2 (en) 2009-01-30 2014-10-14 Apple Inc. Audio user interface for displayless electronic device
US9432516B1 (en) 2009-03-03 2016-08-30 Alpine Audio Now, LLC System and method for communicating streaming audio to a telephone device
US8751238B2 (en) 2009-03-09 2014-06-10 Apple Inc. Systems and methods for determining the language to use for speech generated by a text to speech engine
US20100312547A1 (en) * 2009-06-05 2010-12-09 Apple Inc. Contextual voice commands
US9858925B2 (en) 2009-06-05 2018-01-02 Apple Inc. Using context information to facilitate processing of commands in a virtual assistant
US9431006B2 (en) 2009-07-02 2016-08-30 Apple Inc. Methods and apparatuses for automatic speech recognition
US10283110B2 (en) 2009-07-02 2019-05-07 Apple Inc. Methods and apparatuses for automatic speech recognition
US9298700B1 (en) * 2009-07-28 2016-03-29 Amazon Technologies, Inc. Determining similar phrases
US10007712B1 (en) 2009-08-20 2018-06-26 Amazon Technologies, Inc. Enforcing user-specified rules
US9558183B2 (en) * 2009-09-04 2017-01-31 Synchronoss Technologies, Inc. System and method for the localization of statistical classifiers based on machine translation
US20120166183A1 (en) * 2009-09-04 2012-06-28 David Suendermann System and method for the localization of statistical classifiers based on machine translation
US8682649B2 (en) 2009-11-12 2014-03-25 Apple Inc. Sentiment prediction from textual data
US20120303359A1 (en) * 2009-12-11 2012-11-29 Nec Corporation Dictionary creation device, word gathering method and recording medium
US8600743B2 (en) 2010-01-06 2013-12-03 Apple Inc. Noise profile determination for voice-related feature
US8670985B2 (en) 2010-01-13 2014-03-11 Apple Inc. Devices and methods for identifying a prompt corresponding to a voice input in a sequence of prompts
US9311043B2 (en) 2010-01-13 2016-04-12 Apple Inc. Adaptive audio feedback system and method
US8660849B2 (en) 2010-01-18 2014-02-25 Apple Inc. Prioritizing selection criteria by automated assistant
US9548050B2 (en) 2010-01-18 2017-01-17 Apple Inc. Intelligent automated assistant
US8670979B2 (en) 2010-01-18 2014-03-11 Apple Inc. Active input elicitation by intelligent automated assistant
US9318108B2 (en) 2010-01-18 2016-04-19 Apple Inc. Intelligent automated assistant
US8892446B2 (en) 2010-01-18 2014-11-18 Apple Inc. Service orchestration for intelligent automated assistant
US10276170B2 (en) 2010-01-18 2019-04-30 Apple Inc. Intelligent automated assistant
US8731942B2 (en) 2010-01-18 2014-05-20 Apple Inc. Maintaining context information between user interactions with a voice assistant
US8706503B2 (en) 2010-01-18 2014-04-22 Apple Inc. Intent deduction based on previous user interactions with voice assistant
US8903716B2 (en) 2010-01-18 2014-12-02 Apple Inc. Personalized vocabulary for digital assistant
US8799000B2 (en) 2010-01-18 2014-08-05 Apple Inc. Disambiguation based on active input elicitation by intelligent automated assistant
US9424862B2 (en) 2010-01-25 2016-08-23 Newvaluexchange Ltd Apparatuses, methods and systems for a digital conversation management platform
US9424861B2 (en) 2010-01-25 2016-08-23 Newvaluexchange Ltd Apparatuses, methods and systems for a digital conversation management platform
US9431028B2 (en) 2010-01-25 2016-08-30 Newvaluexchange Ltd Apparatuses, methods and systems for a digital conversation management platform
US8977584B2 (en) 2010-01-25 2015-03-10 Newvaluexchange Global Ai Llp Apparatuses, methods and systems for a digital conversation management platform
US10296584B2 (en) 2010-01-29 2019-05-21 British Telecommunications Plc Semantic textual analysis
US9015036B2 (en) 2010-02-01 2015-04-21 Ginger Software, Inc. Automatic context sensitive language correction using an internet corpus particularly for small keyboard devices
US8682667B2 (en) 2010-02-25 2014-03-25 Apple Inc. User profiling for selecting user specific voice input processing information
US9633660B2 (en) 2010-02-25 2017-04-25 Apple Inc. User profiling for voice input processing
US10049675B2 (en) 2010-02-25 2018-08-14 Apple Inc. User profiling for voice input processing
US9190062B2 (en) 2010-02-25 2015-11-17 Apple Inc. User profiling for voice input processing
US8799658B1 (en) 2010-03-02 2014-08-05 Amazon Technologies, Inc. Sharing media items with pass phrases
US9485286B1 (en) 2010-03-02 2016-11-01 Amazon Technologies, Inc. Sharing media items with pass phrases
US8756571B2 (en) * 2010-05-07 2014-06-17 Hewlett-Packard Development Company, L.P. Natural language text instructions
US20110276944A1 (en) * 2010-05-07 2011-11-10 Ruth Bergman Natural language text instructions
US8560318B2 (en) 2010-05-14 2013-10-15 Sony Computer Entertainment Inc. Methods and system for evaluating potential confusion within grammar structure for set of statements to be used in speech recognition during computing event
US8713021B2 (en) 2010-07-07 2014-04-29 Apple Inc. Unsupervised document clustering using latent semantic density analysis
US8719006B2 (en) 2010-08-27 2014-05-06 Apple Inc. Combined statistical and rule-based part-of-speech tagging for text-to-speech synthesis
CN102411563A (en) * 2010-09-26 2012-04-11 阿里巴巴集团控股有限公司 Method, device and system for identifying target words
US8719014B2 (en) 2010-09-27 2014-05-06 Apple Inc. Electronic device with text error correction based on voice recognition data
US9075783B2 (en) 2010-09-27 2015-07-07 Apple Inc. Electronic device with text error correction based on voice recognition data
US20120084433A1 (en) * 2010-10-01 2012-04-05 Microsoft Corporation Web test generation
US8549138B2 (en) * 2010-10-01 2013-10-01 Microsoft Corporation Web test generation
US8781836B2 (en) 2011-02-22 2014-07-15 Apple Inc. Hearing assistance system for providing consistent human speech
US10102359B2 (en) 2011-03-21 2018-10-16 Apple Inc. Device access using voice authentication
US9262612B2 (en) 2011-03-21 2016-02-16 Apple Inc. Device access using voice authentication
US9679561B2 (en) * 2011-03-28 2017-06-13 Nuance Communications, Inc. System and method for rapid customization of speech recognition models
US20120253799A1 (en) * 2011-03-28 2012-10-04 At&T Intellectual Property I, L.P. System and method for rapid customization of speech recognition models
US9978363B2 (en) 2011-03-28 2018-05-22 Nuance Communications, Inc. System and method for rapid customization of speech recognition models
US10057736B2 (en) 2011-06-03 2018-08-21 Apple Inc. Active transport based notifications
US10255566B2 (en) 2011-06-03 2019-04-09 Apple Inc. Generating and processing task items that represent tasks to perform
US10241644B2 (en) 2011-06-03 2019-03-26 Apple Inc. Actionable reminder entries
US8812294B2 (en) 2011-06-21 2014-08-19 Apple Inc. Translating phrases from one language into another using an order-based set of declarative rules
US8706472B2 (en) 2011-08-11 2014-04-22 Apple Inc. Method for disambiguating multiple readings in language conversion
US10332514B2 (en) 2011-08-29 2019-06-25 Microsoft Technology Licensing, Llc Using multiple modality input to feedback context for natural language understanding
US9576573B2 (en) * 2011-08-29 2017-02-21 Microsoft Technology Licensing, Llc Using multiple modality input to feedback context for natural language understanding
US20130054238A1 (en) * 2011-08-29 2013-02-28 Microsoft Corporation Using Multiple Modality Input to Feedback Context for Natural Language Understanding
US9798393B2 (en) 2011-08-29 2017-10-24 Apple Inc. Text correction processing
US8762156B2 (en) 2011-09-28 2014-06-24 Apple Inc. Speech recognition repair using contextual information
US10241752B2 (en) 2011-09-30 2019-03-26 Apple Inc. Interface for a virtual digital assistant
US10134385B2 (en) 2012-03-02 2018-11-20 Apple Inc. Systems and methods for name pronunciation
US9483461B2 (en) 2012-03-06 2016-11-01 Apple Inc. Handling speech synthesis of content for multiple languages
US20130246045A1 (en) * 2012-03-14 2013-09-19 Hewlett-Packard Development Company, L.P. Identification and Extraction of New Terms in Documents
US20130262114A1 (en) * 2012-04-03 2013-10-03 Microsoft Corporation Crowdsourced, Grounded Language for Intent Modeling in Conversational Interfaces
US9754585B2 (en) * 2012-04-03 2017-09-05 Microsoft Technology Licensing, Llc Crowdsourced, grounded language for intent modeling in conversational interfaces
US9280610B2 (en) 2012-05-14 2016-03-08 Apple Inc. Crowd sourcing information to fulfill user requests
US9953088B2 (en) 2012-05-14 2018-04-24 Apple Inc. Crowd sourcing information to fulfill user requests
US8775442B2 (en) 2012-05-15 2014-07-08 Apple Inc. Semantic search using a single-source semantic model
US10079014B2 (en) 2012-06-08 2018-09-18 Apple Inc. Name recognition system
US10019994B2 (en) 2012-06-08 2018-07-10 Apple Inc. Systems and methods for recognizing textual identifiers within a plurality of words
US9721563B2 (en) 2012-06-08 2017-08-01 Apple Inc. Name recognition system
US20130346078A1 (en) * 2012-06-26 2013-12-26 Google Inc. Mixed model speech recognition
US9495129B2 (en) 2012-06-29 2016-11-15 Apple Inc. Device, method, and user interface for voice-activated navigation and browsing of a document
US20140039876A1 (en) * 2012-07-31 2014-02-06 Craig P. Sayers Extracting related concepts from a content stream using temporal distribution
US9576574B2 (en) 2012-09-10 2017-02-21 Apple Inc. Context-sensitive handling of interruptions by intelligent digital assistant
US8700396B1 (en) * 2012-09-11 2014-04-15 Google Inc. Generating speech data collection prompts
US9547647B2 (en) 2012-09-19 2017-01-17 Apple Inc. Voice-based media searching
US9971774B2 (en) 2012-09-19 2018-05-15 Apple Inc. Voice-based media searching
US8935167B2 (en) 2012-09-25 2015-01-13 Apple Inc. Exemplar-based latent perceptual modeling for automatic speech recognition
US9400952B2 (en) 2012-10-22 2016-07-26 Varcode Ltd. Tamper-proof quality management barcode indicators
US10242302B2 (en) 2012-10-22 2019-03-26 Varcode Ltd. Tamper-proof quality management barcode indicators
US9633296B2 (en) 2012-10-22 2017-04-25 Varcode Ltd. Tamper-proof quality management barcode indicators
US9965712B2 (en) 2012-10-22 2018-05-08 Varcode Ltd. Tamper-proof quality management barcode indicators
US20140214399A1 (en) * 2013-01-29 2014-07-31 Microsoft Corporation Translating natural language descriptions to programs in a domain-specific language for spreadsheets
US9330090B2 (en) * 2013-01-29 2016-05-03 Microsoft Technology Licensing, Llc. Translating natural language descriptions to programs in a domain-specific language for spreadsheets
US10199051B2 (en) 2013-02-07 2019-02-05 Apple Inc. Voice trigger for a digital assistant
US9977779B2 (en) 2013-03-14 2018-05-22 Apple Inc. Automatic supplementation of word correction dictionaries
US9368114B2 (en) 2013-03-14 2016-06-14 Apple Inc. Context-sensitive handling of interruptions
US9733821B2 (en) 2013-03-14 2017-08-15 Apple Inc. Voice control to diagnose inadvertent activation of accessibility features
US9697822B1 (en) 2013-03-15 2017-07-04 Apple Inc. System and method for updating an adaptive speech recognition model
US10078487B2 (en) 2013-03-15 2018-09-18 Apple Inc. Context-sensitive handling of interruptions
US9922642B2 (en) 2013-03-15 2018-03-20 Apple Inc. Training an at least partial voice command system
US9633674B2 (en) 2013-06-07 2017-04-25 Apple Inc. System and method for detecting errors in interactions with a voice-based digital assistant
US9620104B2 (en) 2013-06-07 2017-04-11 Apple Inc. System and method for user-specified pronunciation of words for speech synthesis and recognition
US9966060B2 (en) 2013-06-07 2018-05-08 Apple Inc. System and method for user-specified pronunciation of words for speech synthesis and recognition
US9582608B2 (en) 2013-06-07 2017-02-28 Apple Inc. Unified ranking with entropy-weighted information for phrase-based semantic auto-completion
US9966068B2 (en) 2013-06-08 2018-05-08 Apple Inc. Interpreting and acting upon commands that involve sharing information with remote devices
US10176167B2 (en) 2013-06-09 2019-01-08 Apple Inc. System and method for inferring user intent from speech inputs
US10185542B2 (en) 2013-06-09 2019-01-22 Apple Inc. Device, method, and graphical user interface for enabling conversation persistence across two or more instances of a digital assistant
US9300784B2 (en) 2013-06-13 2016-03-29 Apple Inc. System and method for emergency calls initiated by voice command
US10296160B2 (en) 2013-12-06 2019-05-21 Apple Inc. Method for extracting salient dialog usage from live data
US10255346B2 (en) 2014-01-31 2019-04-09 Verint Systems Ltd. Tagging relations with N-best
US10339452B2 (en) 2014-02-05 2019-07-02 Verint Systems Ltd. Automated ontology development
US9037967B1 (en) * 2014-02-18 2015-05-19 King Fahd University Of Petroleum And Minerals Arabic spell checking technique
US9620105B2 (en) 2014-05-15 2017-04-11 Apple Inc. Analyzing audio input for efficient speech and music recognition
US9502031B2 (en) 2014-05-27 2016-11-22 Apple Inc. Method for supporting dynamic grammars in WFST-based ASR
US10170123B2 (en) 2014-05-30 2019-01-01 Apple Inc. Intelligent assistant for home automation
US9715875B2 (en) 2014-05-30 2017-07-25 Apple Inc. Reducing the need for manual start/end-pointing and trigger phrases
US9734193B2 (en) 2014-05-30 2017-08-15 Apple Inc. Determining domain salience ranking from ambiguous words in natural speech
US9430463B2 (en) 2014-05-30 2016-08-30 Apple Inc. Exemplar-based natural language processing
US9760559B2 (en) 2014-05-30 2017-09-12 Apple Inc. Predictive text input
US10289433B2 (en) 2014-05-30 2019-05-14 Apple Inc. Domain specific language for encoding assistant dialog
US9785630B2 (en) 2014-05-30 2017-10-10 Apple Inc. Text prediction using combined word N-gram and unigram language models
US9842101B2 (en) 2014-05-30 2017-12-12 Apple Inc. Predictive conversion of language input
US9633004B2 (en) 2014-05-30 2017-04-25 Apple Inc. Better resolution when referencing to concepts
US9966065B2 (en) 2014-05-30 2018-05-08 Apple Inc. Multi-command single utterance input method
US10083690B2 (en) 2014-05-30 2018-09-25 Apple Inc. Better resolution when referencing to concepts
US10169329B2 (en) 2014-05-30 2019-01-01 Apple Inc. Exemplar-based natural language processing
US10078631B2 (en) 2014-05-30 2018-09-18 Apple Inc. Entropy-guided text prediction using combined word and character n-gram language models
US9668024B2 (en) 2014-06-30 2017-05-30 Apple Inc. Intelligent automated assistant for TV user interactions
US9338493B2 (en) 2014-06-30 2016-05-10 Apple Inc. Intelligent automated assistant for TV user interactions
US20160055848A1 (en) * 2014-08-25 2016-02-25 Honeywell International Inc. Speech enabled management system
US9786276B2 (en) * 2014-08-25 2017-10-10 Honeywell International Inc. Speech enabled management system
US9818400B2 (en) 2014-09-11 2017-11-14 Apple Inc. Method and apparatus for discovering trending terms in speech requests
US20160092160A1 (en) * 2014-09-26 2016-03-31 Intel Corporation User adaptive interfaces
US10127911B2 (en) 2014-09-30 2018-11-13 Apple Inc. Speaker identification and unsupervised speaker adaptation techniques
US10074360B2 (en) 2014-09-30 2018-09-11 Apple Inc. Providing an indication of the suitability of speech recognition
US9668121B2 (en) 2014-09-30 2017-05-30 Apple Inc. Social reminders
US9646609B2 (en) 2014-09-30 2017-05-09 Apple Inc. Caching apparatus for serving phonetic pronunciations
US9986419B2 (en) 2014-09-30 2018-05-29 Apple Inc. Social reminders
US9886432B2 (en) 2014-09-30 2018-02-06 Apple Inc. Parsimonious handling of word inflection via categorical stem + suffix N-gram language models
US20160098389A1 (en) * 2014-10-06 2016-04-07 International Business Machines Corporation Natural Language Processing Utilizing Transaction Based Knowledge Representation
US9588961B2 (en) 2014-10-06 2017-03-07 International Business Machines Corporation Natural language processing utilizing propagation of knowledge through logical parse tree structures
US9904668B2 (en) * 2014-10-06 2018-02-27 International Business Machines Corporation Natural language processing utilizing transaction based knowledge representation
US9715488B2 (en) * 2014-10-06 2017-07-25 International Business Machines Corporation Natural language processing utilizing transaction based knowledge representation
US9665564B2 (en) 2014-10-06 2017-05-30 International Business Machines Corporation Natural language processing utilizing logical tree structures
US9711141B2 (en) 2014-12-09 2017-07-18 Apple Inc. Disambiguating heteronyms in speech synthesis
US20160217128A1 (en) * 2015-01-27 2016-07-28 Verint Systems Ltd. Ontology expansion using entity-association rules and abstract relations
US20160217127A1 (en) * 2015-01-27 2016-07-28 Verint Systems Ltd. Identification of significant phrases using multiple language models
US9865280B2 (en) 2015-03-06 2018-01-09 Apple Inc. Structured dictation using intelligent automated assistants
US10311871B2 (en) 2015-03-08 2019-06-04 Apple Inc. Competing devices responding to voice triggers
US9886953B2 (en) 2015-03-08 2018-02-06 Apple Inc. Virtual assistant activation
US9721566B2 (en) 2015-03-08 2017-08-01 Apple Inc. Competing devices responding to voice triggers
US9899019B2 (en) 2015-03-18 2018-02-20 Apple Inc. Systems and methods for structured stem and suffix language models
US9842105B2 (en) 2015-04-16 2017-12-12 Apple Inc. Parsimonious continuous-space phrase representations for natural language processing
US10083688B2 (en) 2015-05-27 2018-09-25 Apple Inc. Device voice control for selecting a displayed affordance
US10127220B2 (en) 2015-06-04 2018-11-13 Apple Inc. Language identification from short strings
US10101822B2 (en) 2015-06-05 2018-10-16 Apple Inc. Language input correction
US10186254B2 (en) 2015-06-07 2019-01-22 Apple Inc. Context-based endpoint detection
US10255907B2 (en) 2015-06-07 2019-04-09 Apple Inc. Automatic accent detection using acoustic models
US20170031896A1 (en) * 2015-07-28 2017-02-02 Xerox Corporation Robust reversible finite-state approach to contextual generation and semantic parsing
US9697820B2 (en) 2015-09-24 2017-07-04 Apple Inc. Unit-selection text-to-speech synthesis using concatenation-sensitive neural networks
US10049668B2 (en) 2015-12-02 2018-08-14 Apple Inc. Applying neural network language models to weighted finite state transducers for automatic speech recognition
US10223066B2 (en) 2015-12-23 2019-03-05 Apple Inc. Proactive assistance based on dialog communication between devices
US9934775B2 (en) 2016-05-26 2018-04-03 Apple Inc. Unit-selection text-to-speech synthesis based on predicted concatenation parameters
US9972304B2 (en) 2016-06-03 2018-05-15 Apple Inc. Privacy preserving distributed evaluation framework for embedded personalized systems
US10249300B2 (en) 2016-06-06 2019-04-02 Apple Inc. Intelligent list reading
US10049663B2 (en) 2016-06-08 2018-08-14 Apple, Inc. Intelligent automated assistant for media exploration
US10192552B2 (en) 2016-06-10 2019-01-29 Apple Inc. Digital assistant providing whispered speech
US10067938B2 (en) 2016-06-10 2018-09-04 Apple Inc. Multilingual word prediction
US10269345B2 (en) 2016-06-11 2019-04-23 Apple Inc. Intelligent task discovery
US10297253B2 (en) 2016-06-11 2019-05-21 Apple Inc. Application integration with a digital assistant
US10089072B2 (en) 2016-06-11 2018-10-02 Apple Inc. Intelligent device arbitration and control
US20180061408A1 (en) * 2016-08-24 2018-03-01 Semantic Machines, Inc. Using paraphrase in accepting utterances in an automated assistant
US10043516B2 (en) 2016-09-23 2018-08-07 Apple Inc. Intelligent automated assistant
US10073831B1 (en) * 2017-03-09 2018-09-11 International Business Machines Corporation Domain-specific method for distinguishing type-denoting domain terms from entity-denoting domain terms
US10073833B1 (en) * 2017-03-09 2018-09-11 International Business Machines Corporation Domain-specific method for distinguishing type-denoting domain terms from entity-denoting domain terms

Also Published As

Publication number Publication date
US8442812B2 (en) 2013-05-14
US20140136189A1 (en) 2014-05-15
US9342504B2 (en) 2016-05-17
US8630846B2 (en) 2014-01-14
US20130124195A1 (en) 2013-05-16
US20040199375A1 (en) 2004-10-07

Similar Documents

Publication Title
Jelinek Statistical methods for speech recognition
Filippova Multi-sentence compression: Finding shortest paths in word graphs
Nießen et al. Statistical machine translation with scarce resources using morpho-syntactic information
Itou et al. JNAS: Japanese speech corpus for large vocabulary continuous speech recognition research
Zechner Automatic summarization of open-domain multiparty dialogues in diverse genres
US7016830B2 (en) Use of a unified language model
Gorin et al. How may I help you?
US7310601B2 (en) Speech recognition apparatus and speech recognition method
US5970449A (en) Text normalization using a context-free grammar
Chen Building probabilistic models for natural language
US7725321B2 (en) Speech based query system using semantic decoding
US7203646B2 (en) Distributed internet based speech recognition system with natural language support
US9798720B2 (en) Hybrid machine translation
JP5162697B2 (en) Generation of unified task-dependent language model based on information retrieval technique
US7225125B2 (en) Speech recognition system trained with regional speech characteristics
EP0830668B1 (en) Systems and methods for word recognition
US5477451A (en) Method and system for natural language translation
JP3720068B2 (en) Posting the method and apparatus of the question
US7672841B2 (en) Method for processing speech data for a distributed recognition system
US7567902B2 (en) Generating speech recognition grammars from a large corpus of data
US20050055198A1 (en) Computer-aided reading system and method with cross-language reading wizard
JP4302326B2 (en) Automatic classification of text
US6615172B1 (en) Intelligent query engine for processing voice based queries
Tur et al. Spoken language understanding: Systems for extracting semantic information from speech
US20030036900A1 (en) Method and apparatus for improved grammar checking using a stochastic parser

Legal Events

Code Title Description

AS Assignment
Owner name: SEHDA, INC., CALIFORNIA
Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:EHSANI, FARZAD;KNODT, EVA M.;MASTER, DEMITRIOS L.;REEL/FRAME:012182/0092;SIGNING DATES FROM 20010828 TO 20010829

STCB Information on status: application discontinuation
Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION