System and method for a cooperative conversational voice user interface

Info

Publication number
CN101535983B
Authority
CN
Grant status
Grant
Patent type
Prior art keywords
user
conversational
based
interface
cooperative
Prior art date
Application number
CN 200780042315
Other languages
English (en)
Other versions
CN101535983A (zh)
Inventor
B·艾弗尔索德
C·威德尔
L·贝尔德文
M·特加尔弗
T·弗莱曼
Original Assignee
沃伊斯博克斯科技公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Grant date

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/18 Speech classification or search using natural language modelling
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/16 Sound input; Sound output
    • G06F3/167 Audio in a user interface, e.g. using voice commands for navigating, audio feedback
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification
    • G10L17/22 Interactive procedures; Man-machine interfaces
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/18 Speech classification or search using natural language modelling
    • G10L15/183 Speech classification or search using natural language modelling using context dependencies, e.g. language models
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/063 Training
    • G10L2015/0631 Creating reference templates; Clustering

Abstract

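A cooperative conversational voice user interface is provided. The cooperative conversational voice user interface may build upon short-term and long-term shared knowledge to generate one or more explicit and/or implicit hypotheses about the intent of a user utterance. The hypotheses may be ranked by varying degrees of certainty, and an adaptive response may be generated for the user. Responses may be worded according to the degrees of certainty and framed to establish an appropriate domain for a subsequent utterance. In one implementation, misrecognitions may be tolerated, and the course of the conversation may be corrected based on subsequent utterances and/or responses.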

Description

System and method for a cooperative conversational voice user interface

Technical Field

[0001] The invention relates to a cooperative conversational model for a human-to-machine voice user interface.

Background

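[0002] Advances in technology, particularly within the convergence space, have increased the demand for voice recognition software that can exploit technology in ways that are intuitive to humans. Communication between humans is often "cooperative," in that information and/or context is shared to advance mutual conversational goals, yet existing Human-to-Machine interfaces fail to provide the same level of intuitive interaction. For example, each human participant in a conversation can contribute to an exchange for the benefit of the exchange. This is done through shared assumptions and expectations regarding various aspects of the conversation, such as the topic, participant knowledge about the topic, expectations of the other participant's knowledge about the topic, appropriate word usage for the topic and/or participants, conversational development based on previous utterances, the participants' tone or inflection, the quality and quantity of contribution expected from each participant, and many other factors. Participating in conversations that continually build and draw upon shared information is a natural and intuitive way for humans to converse.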

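[0003] In contrast, complex Human-to-Machine interfaces do not allow users to exploit technology in an intuitive way, which inhibits mass adoption of various technologies. Incorporating a speech interface helps to alleviate this burden by making interaction easier and faster, but existing speech interfaces (when they actually work) still require significant learning on the part of the user. That is, existing speech interfaces are unable to bridge the gap between archaic Human-to-Machine interfaces and conversational speech in a way that would make interaction with systems feel normal. Users should be able to directly request what they want from a system in a normal, conversational fashion, without having to memorize exact words. Alternatively, when users are uncertain of particular needs, they should be able to engage the system in a productive dialogue to resolve their requests. Instead, existing speech interfaces force users to simplify their requests to match simple sets of instructions in simple languages so that the requests are communicated in ways that systems can understand. With existing speech interfaces, there is virtually no way for a dialogue between the user and the system to satisfy mutual goals.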

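[0004] Therefore, existing systems lack a conversational speech model that can provide users with the ability to interact with systems in ways that are inherently intuitive to human beings. Existing systems suffer from these and other problems.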

Summary of the Invention

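[0005] According to various embodiments and aspects of the invention, a cooperative conversational voice user interface may understand free-form human utterances, freeing users from being restricted to a fixed set of commands and/or requests. Rather, users can engage in cooperative conversations with a machine to complete a request or series of requests using a natural, intuitive, free-form manner of expression.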

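[0006] According to an aspect of the invention, an exemplary system architecture for implementing a cooperative conversational voice user interface is provided. The system may receive an input, which may include a human utterance received by an input device, where the utterance may include one or more requests. As used herein, an "utterance" may be words, syllables, phonemes, or any other audible sound made by a human being. As used herein, a "request" may be a command, directive, or other instruction for a device, computer, or other machine to retrieve information, perform a task, or take some other action. In one implementation, the input may be a multi-modal input, where at least part of the multi-modal input is an utterance. The utterance component of the input may be processed by an automatic speech recognizer to generate one or more preliminary interpretations of the utterance. The one or more preliminary interpretations may then be provided to a conversational speech engine for further processing, where the conversational speech engine may communicate with one or more databases to generate an adaptive conversational response, which may be returned to the user as an output. In one implementation, the output may be a multi-modal output. For example, the utterance may include a request to perform an action, and the output may include a conversational response reporting success or failure, as well as an execution of the action.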

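[0007] According to another aspect of the invention, an exemplary conversational speech engine may generate an adaptive conversational response to a request or series of requests. The conversational speech engine may include a free-form voice search module that may understand an utterance made using typical, day-to-day language (i.e., in free form), and may account for variations in how humans normally speak, the vocabulary they use, and the conditions in which they speak. To account for the intangible variables of human speech, the free-form search module may include models of casual human speech. For example, in one implementation, the free-form search module may understand specialized jargon and/or slang, tolerate variations in word order, and tolerate verbalized pauses or stuttered speech. For example, formalized English requests, where a verb precedes a noun, may be treated in an equivalent manner to requests where the noun precedes the verb. In another implementation, compound requests and/or compound tasks with multiple variables may be identified in a single utterance. By identifying all relevant information for completing one or more tasks from a single utterance, advantages may be provided over existing voice user interfaces, such as Command and Control systems that use verbal menus to restrict the information that a person can provide at a given point. In another implementation, inferring intended requests from incomplete or ambiguous requests may provide a conversational feel. By modeling what contextual signifiers, qualifiers, or other information may be required to perform a task in an identified context, an adaptive response may be generated, such as prompting a user for missing contextual signifiers, qualifiers, or other information. In one implementation, the response may ask for missing information in a way that most restricts possible interpretations, and the response may be framed to establish a domain for a subsequent user utterance. In another implementation, common alternatives for nouns and verbs may be recognized to reflect variations in usage patterns according to various criteria. Thus, variations in expression may be supported because word order is unimportant or unanticipated, and nouns and/or verbs may be represented in different ways, to give simplistic, yet representative examples. In another implementation, requests may be inferred from contradictory or otherwise imprecise information, such as when an utterance includes starts and stops, restarts, stutters, run-on sentences, or other imperfect speech. For example, a user may sometimes change their mind, and thus alter the request in mid-utterance, and the imperfect speech feature may nonetheless be able to infer a request based on models of human speech. For example, various models may indicate that a last criterion is most likely to be correct, or intonation, emphasis, stress, use of the word "not," or other models may indicate which criterion is most likely to be correct.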

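[0008] According to another aspect of the invention, the conversational speech engine may include a noise tolerance module that may discard words or noise that have no meaning in a given context to reduce a likelihood of confusion. Moreover, the noise tolerance module may filter out environmental and non-human noise to further reduce a likelihood of confusion. In one implementation, the noise tolerance module may cooperate with other modules and features to filter out words that do not fit into an identified context. For example, the noise tolerance module may filter out other human conversations and/or utterances within range of one or more microphones. For example, a single device may include multiple microphones, or multiple devices may each include one or more microphones, and the noise tolerance module may collate inputs and cooperatively filter out sound by comparing the speech signals from the various microphones. The noise tolerance module may also filter out non-human environmental noise within range of the microphones, out-of-vocabulary words caused by speaker ambiguity or malapropisms, or other noise that may be unrelated to a target request. Performance benchmarks for the noise tolerance module may be defined by noise models based on human criteria. For example, if a driver of a car traveling at 65 miles per hour with the windows cracked open is 92% likely to be understood by a passenger, then performance benchmarks for the noise tolerance module may call for similar performance under such conditions.

[0009] According to another aspect of the invention, the conversational speech engine may include a context determination process that determines one or more contexts for a request to establish meaning within a conversation. The one or more contexts may be determined by having one or more context domain agents compete to determine a most appropriate domain for a given utterance. Once a given domain "wins" the competition, the winning domain may be responsible for establishing or inferring further contexts and updating short-term and long-term shared knowledge. If there is a deadlock between context domain agents, an adaptive conversational response may prompt the user to disambiguate between the deadlocked agents. Moreover, the context determination process may infer intended operations and/or context based on previous utterances and/or requests, whereas existing systems consider each utterance independently, potentially making the same errors over and over again. For example, if a given interpretation turns out to be incorrect, the incorrect interpretation may be removed as a potential interpretation from one or more grammars of the automatic speech recognizer and/or from the possible interpretations determined by the conversational speech engine, thereby ensuring that the same mistake will not be repeated for an identical utterance.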

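[0010] The context determination process may provide advantages over existing voice user interfaces by continually updating one or more models of an existing context and establishing context as a by-product of a conversation, which cannot be established a priori. Rather, the context determination process may track conversation topics and attempt to fit a current utterance into recent contexts, including switching between contexts as tasks are completed, partially completed, requested, etc. The context determination process may identify one or more context domains for an utterance by defining a collection of related functions that may be useful for users in various context domains. Moreover, each context domain may have relevant vocabularies and thought collections to model word groupings which, when evaluated together, may disambiguate one context domain from another. Thus, eliminating out-of-context words and noise words when searching for relevant combinations may enhance the accuracy of inferences. This provides advantages over existing systems that attempt to assign meaning to every component of an utterance (i.e., including out-of-context words and noise words), which results in nearly infinite possible combinations and a greater likelihood of confusion. The context determination process may also be self-aware, assigning degrees of certainty to one or more generated hypotheses, where a hypothesis may be developed to account for variations in environmental conditions, speaker ambiguity, accents, or other factors. By identifying a context, capabilities within the context, vocabularies within the context, what tasks are done most often historically in the context, what task was just completed, etc., the context determination process may establish intent from rather meager phonetic clues. Moreover, just as in human-to-human conversation, users may switch contexts at any time without confusion, and various context domains may be rapidly selected, without menu-driven dead ends, when an utterance is unambiguous.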

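[0011] According to another aspect of the invention, an exemplary cooperative conversational model may build upon free-form voice search, noise tolerance, and context determination to implement a conversational Human-to-Machine interface that reflects human interaction and normal conversational behavior. That is, the cooperative conversational model enables humans and machines to participate in a conversation with an accepted purpose or direction, with each participant contributing to the conversation for the benefit of the conversation. By taking advantage of the presumptions about utterances that humans rely upon, both as speakers and as listeners, a Human-to-Machine interface may be analogous to everyday human-to-human conversation. In one implementation, the exemplary cooperative conversational model may take incoming data (shared knowledge) to inform a decision (intelligent hypothesis building), and may then refine the decision and generate a response (adaptive response building).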

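[0012] According to another aspect of the invention, shared knowledge may include both short-term and long-term knowledge. Short-term knowledge may accumulate during a single conversation, where input received during the single conversation may be retained. The shared knowledge may include cross-modality awareness, where in addition to accumulating input relating to user utterances, requests, locations, etc., the shared knowledge may accumulate a current user interface state relating to other modal inputs to further build shared knowledge models. The shared knowledge may be used to build one or more intelligent hypotheses using current and relevant information, to build long-term shared knowledge by identifying information with long-term significance, and to generate adaptive responses with relevant state and word usage information. Moreover, because cooperative conversations model human conversations, short-term session data may be expired after a psychologically appropriate amount of time, thereby humanizing system behavior and reducing a likelihood of contextual confusion based on stale data, while relevant information from an expired session context is still added to long-term knowledge models. Long-term shared knowledge may generally be user-centric, rather than session-based, where inputs may be accumulated over time to build user, environmental, cognitive, historical, or other long-term knowledge models. Long-term and short-term shared knowledge may be used simultaneously anytime a user engages in a cooperative conversation. Long-term shared knowledge may include explicit and/or implicit user preferences, a history of recent contexts, requests, tasks, etc., user-specific jargon related to vocabularies and/or capabilities of a context, most often used word choices, or other information. The long-term shared knowledge may be used to build one or more intelligent hypotheses using current and relevant information, generate adaptive responses with appropriate word choices when these are unavailable via short-term shared knowledge, refine long-term shared knowledge models, identify a frequency of specific tasks, identify tasks a user frequently has difficulty with, or provide other information and/or analysis to generate more accurate conversational responses. Shared knowledge may also be used to adapt a level of unprompted support (e.g., for novices versus experienced users, users who are frequently misrecognized, etc.). Thus, shared knowledge may enable a user and a voice user interface to share assumptions and expectations such as topic knowledge, conversation history, word usage, jargon, tone, or other assumptions and/or expectations that facilitate a cooperative conversation between human users and a system.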

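[0013] According to another aspect of the invention, a conversation type may be identified for any given utterance. By categorizing and developing conceptual models for various types of exchanges, user expectations and domain capabilities may be consistently aligned. One or more intelligent hypotheses may be generated as to a conversation type by considering conversational goals, participant roles, and/or an allocation of information among the participants. Based on the conversational goals, participant roles, and allocation of information, the intelligent hypotheses may consider various factors to classify a conversation (or utterance) into general types of conversations that can interact with one another to form many more variations and permutations of conversation types (e.g., a conversation type may change dynamically as information is reallocated from one participant to another, or as conversational goals change based on the reallocation of information).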

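[0014] According to another aspect of the invention, the intelligent hypotheses may include one or more hypotheses of a user's intent in an utterance. In addition, the intelligent hypotheses may use short-term and/or long-term shared knowledge to proactively build and evaluate interaction with a user as a conversation progresses or over time. The hypotheses may model human-to-human interaction by including a varying degree of certainty for each hypothesis. That is, just as humans rely on knowledge shared by participants to examine how much and what kind of information is available, the intelligent hypotheses may leverage the identified conversation type and shared knowledge to generate a degree of certainty for each hypothesis.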

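[0015] According to another aspect of the invention, syntactically, grammatically, and contextually sensitive "intelligent responses" may be generated from the intelligent hypotheses, which can be used to generate a conversational experience for a user while also guiding the user to reply in a manner favorable for recognition. The intelligent responses may create a conversational feel by adapting to a user's manner of speaking, framing responses appropriately, and having natural variation and/or personality (e.g., by varying tone, pace, timing, inflection, word use, jargon, and other variables in a verbal or audible response).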

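[0016] According to another aspect of the invention, the intelligent responses may adapt to a user's manner of speaking by using contextual signifiers and grammatical rules to generate one or more sentences that may cooperate with the user. By taking advantage of shared knowledge about how a user utters a request, the responses may be modeled using techniques similar to those used to recognize requests. The intelligent responses may rate possible responses statistically and/or randomize responses, which creates an opportunity to build an exchange with natural variation and a conversational feel. This provides advantages over existing voice user interfaces where input and output are incongruous, in that the input is "conversational" and the output is "computerese."

[0017] According to another aspect of the invention, the intelligent responses may frame responses to influence a user's reply utterance for easy recognition. For example, the responses may be modeled to elicit utterances from the user that are more likely to result in a completed request. Thus, the responses may conform to the cooperative nature of human dialog and a natural human tendency to "parrot" what was just heard as part of a next utterance. Moreover, knowledge of the current context may enhance responses to generate more meaningful conversational responses. Framing the responses may also deal with misrecognitions according to human models. For example, humans frequently remember a number of recent utterances, especially when one or more previous utterances were misrecognized or unrecognized. Another participant in the conversation may limit correction to the part of the utterance that was misrecognized or unrecognized, or clues may be provided over subsequent utterances and/or other interactions to indicate that the initial interpretation was incorrect. Thus, by storing and analyzing multiple utterances, utterances from earlier in a conversation may be corrected as the conversation progresses.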

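[0018] According to another aspect of the invention, the intelligent responses may include multi-modal or cross-modal responses to a user. In one implementation, responses may be aware of and control one or more devices and/or interfaces, and users may respond by using whichever input method, or combination of input methods, is most convenient.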

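[0019] According to another aspect of the invention, the intelligent responses may correct a course of a conversation without interrupting conversational flow. That is, even though the intelligent responses may be reasonably "sure," the intelligent responses may nonetheless sometimes be incorrect. While existing voice user interfaces tend to fail on ordinary conversational missteps, normal human interactions may expect missteps and deal with them appropriately. Thus, responses after a misrecognition may be modeled after clarifications, rather than errors, and words may be chosen in subsequent responses to move the conversation forward and establish an appropriate domain to be explored with the user.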

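[0020] Other objects and advantages of the invention will be apparent to those skilled in the art based on the following drawings and detailed description.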

Brief Description of the Drawings

[0021] FIG. 1 is an exemplary block diagram of a system architecture according to one aspect of the invention.

[0022] FIG. 2 is an exemplary block diagram of a conversational speech engine according to one aspect of the invention.

[0023] FIG. 3 is an exemplary block diagram of a cooperative conversational model according to one aspect of the invention.

Detailed Description

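[0024] Referring to FIG. 1, an exemplary system architecture for implementing a cooperative conversational voice user interface is illustrated according to one aspect of the invention. The system may receive an input 105 from a user, where in one implementation, input 105 may include a human utterance received by an input device (e.g., a microphone), and the utterance may include one or more requests. Input 105 may also be a multi-modal input, where at least part of the multi-modal input is an utterance. For example, the input device may include a combination of a microphone and a touch-screen device, and input 105 may include an utterance that includes a request relating to a portion of a display on the touch-screen device that the user is touching. For example, the touch-screen device may be a navigation device, and input 105 may include the utterance "Give me directions to here," where the user may be requesting directions to a desired destination on the display of the navigation device.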

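[0025] The utterance component of input 105 may be processed by an automatic speech recognizer 110 to generate one or more preliminary interpretations of the utterance. Automatic speech recognizer 110 may process the utterance using any suitable technique known in the art. For example, in one implementation, automatic speech recognizer 110 may interpret the utterance using techniques of phonetic dictation to recognize a phoneme stream, as described in co-pending U.S. patent application Ser. No. 11/513,269, entitled "Dynamic Speech Sharpening," which is hereby incorporated by reference in its entirety. The one or more preliminary interpretations generated by automatic speech recognizer 110 may then be provided to a conversational speech engine 115 for further processing. Conversational speech engine 115 may include a conversational language processor 120 and/or a voice search engine 125, described in greater detail with reference to FIG. 2 below. Conversational speech engine 115 may communicate with one or more databases 130 to generate an adaptive conversational response, which may be returned to the user as an output 140. In one implementation, output 140 may be a multi-modal output and/or an interaction with one or more applications 145 to complete the request. For example, output 140 may include a combination of an audible response and a display of a route on a navigation device. For example, the utterance may include a request to perform an action, and output 140 may include a conversational response reporting success or failure, as well as an execution of the action. In addition, in various implementations, automatic speech recognizer 110, conversational speech engine 115, and/or databases 130 may reside locally (e.g., on a user device), remotely (e.g., on a server), or a hybrid model of local and remote processing may be used (e.g., lightweight applications may be processed locally while computationally intensive applications may be processed remotely).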

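[0026] Referring to FIG. 2, an exemplary block diagram illustrates a conversational speech engine 215 according to one aspect of the invention. Conversational speech engine 215 may include a conversational language processor 220 that generates an adaptive conversational response to a request or series of requests using a free-form voice search module 245, a noise tolerance module 250, and/or a context determination process 255. According to one aspect of the invention, modules 245-255 may communicate with a voice search engine 225 that includes one or more context domain agents 230 and/or one or more vocabularies 235 to aid in interpreting utterances and generating responses, as described in "Enhancing the VUE™ (Voice-User-Experience) Through Conversational Speech" by Tom Freeman and Larry Baldwin, which is hereby incorporated by reference in its entirety. Conversational speech engine 215 may generate an adaptive conversational response to one or more requests, where the requests may depend on unspoken assumptions, incomplete information, context established by previous utterances, user profiles, historical profiles, environmental profiles, or other information. Moreover, conversational speech engine 215 may track which requests have been completed, which requests are being processed, and/or which requests cannot be processed due to incomplete or inaccurate information, and may generate the response accordingly.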

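[0027] According to one aspect of the invention, free-form voice search module 245 may understand an utterance made using typical, day-to-day language (i.e., in free form), and may account for variations in how humans normally speak, the vocabulary they use, and the conditions in which they speak. Because variables such as stress, distraction, and serendipity are always different and infinitely varied, free-form search module 245 may be designed with the understanding that no human will come to the same Human-to-Machine interface situation in the same way twice. Thus, free-form search module 245 may implement one or more features that model casual human speech. In various implementations, free-form search module 245 may include, among other things, a free-form utterance feature, a one-step access feature, an inferencing intended operations feature, an alternative expression feature, and/or an imperfect speech feature.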

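[0028] The free-form utterance feature may understand specialized jargon and/or slang, tolerate variations in word order (e.g., whether the subject of a request comes before or after the verb may be irrelevant), and tolerate verbalized pauses (e.g., "um," "ah," "eh," and other utterances without meaning). For example, the free-form utterance feature may treat formalized English verb-before-noun requests in an equivalent manner to free-form requests where a noun may precede the verb. For example, the user utterances "Change it to the Squizz" and "You know, um, that Squizz channel, ah, switch it there" may be treated equivalently (where Squizz is a channel on XM Satellite Radio). In either case, the free-form utterance feature is able to identify "Squizz" as the subject of the utterance and "Change it" or "switch it" as the verb or request of the utterance (e.g., by cooperating with context determination process 255, or other features, and identifying a relevant context domain agent 230 and/or vocabulary 235 to interpret the utterance).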
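The word-order and filler tolerance described above can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the `FILLERS`, `VERBS`, and `CHANNELS` tables and the `parse_free_form` function are hypothetical stand-ins for what the text attributes to context domain agents and vocabularies.

```python
import re

# Hypothetical word lists for illustration only; the source describes
# vocabularies held by context domain agents, not these tables.
FILLERS = {"um", "ah", "eh", "you", "know", "well"}
VERBS = {"change": "tune", "switch": "tune"}
CHANNELS = {"squizz"}


def parse_free_form(utterance: str) -> dict:
    """Extract an action and a subject without relying on word order."""
    # Lowercase, tokenize, and drop verbalized pauses.
    words = [w for w in re.findall(r"[a-z']+", utterance.lower())
             if w not in FILLERS]
    # Pick the first known verb and the first known subject, wherever they are.
    action = next((VERBS[w] for w in words if w in VERBS), None)
    subject = next((w for w in words if w in CHANNELS), None)
    return {"action": action, "subject": subject}
```

Under these assumptions, both example utterances from the paragraph above reduce to the same action and subject, whichever comes first in the sentence.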

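[0029] The one-step access feature may understand utterances that include compound requests with multiple variables. For example, a user utterance may be "What is the forecast for Boston this weekend?" The one-step access feature may identify "weather" as a context (e.g., by cooperating with context determination process 255, or other features, and identifying "forecast" as a synonym of "weather"), and search for a city equal to "Boston" and a time equal to "weekend." By identifying all relevant information for completing a task from a single utterance, the one-step access feature may overcome drawbacks of existing voice user interfaces, such as Command and Control systems that use verbal menus to restrict the information that a person can provide at a given point (e.g., a Command and Control system for a phone directory service may say: "State please," "City please," "What listing," etc.). Moreover, some utterances may include compound requests, and the one-step access feature may decompose the compound requests into sub-tasks. For example, a user utterance of "I need to be at a meeting tomorrow in San Francisco at 8:00 am" may be decomposed into a set of sub-tasks such as (1) checking availability and reserving a flight on the evening before the meeting, (2) checking availability and reserving a hotel, (3) checking availability and reserving a car, etc., where users may further designate preferences for various tasks (e.g., first check availability on an airline with which the user is a frequent flyer). Depending on the level of shared knowledge about a user's preferences and/or historical patterns, the one-step access feature may infer additional tasks from a request. For example, in the above example, the one-step access feature may also check a weather forecast, and if the weather is "nice" (as defined by the user preferences and/or as inferred from historical patterns), the one-step access feature may schedule a tee-time at a preferred golf course in San Francisco.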
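The single-utterance extraction in the weather example can be illustrated with a small sketch. The lookup tables and the `one_step_access` function below are assumptions made for illustration; the source only states that "forecast" is treated as a synonym of "weather" and that city and time are pulled from the same utterance.

```python
import re

# Hypothetical lookup tables; a real system would draw these from
# context domain agents and vocabularies.
CONTEXT_SYNONYMS = {"forecast": "weather", "weather": "weather"}
CITIES = {"boston", "seattle", "portland"}
TIMES = {"today", "tomorrow", "weekend"}


def one_step_access(utterance: str) -> dict:
    """Pull context, city, and time out of one utterance in a single pass."""
    words = re.findall(r"[a-z]+", utterance.lower())
    return {
        "context": next((CONTEXT_SYNONYMS[w] for w in words
                         if w in CONTEXT_SYNONYMS), None),
        "city": next((w for w in words if w in CITIES), None),
        "time": next((w for w in words if w in TIMES), None),
    }
```

Any slot left as `None` marks information the system would still need to ask for or infer, in contrast to a menu that collects one slot per prompt.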

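[0030] The inferencing intended operations feature may identify an intended request from incomplete or ambiguous requests. For example, when a user utters "Route <indecipherable> Chicago <indecipherable> here," where the user intended to say "Route calculation to Chicago from here," the inferencing intended operations feature may model what is required to calculate a route (an origination point and a destination point). Because the utterance includes the origination point and the destination point, a request to calculate a route from the user's present location to Chicago may be inferred. Similarly, when the inferencing intended operations feature does not have sufficient information to infer a complete request, an adaptive conversational response may be generated to prompt the user for the missing information. For example, when an utterance includes a request for a stock quotation but not a company name (e.g., "Get me the stock price for <indecipherable>"), the response may be "What company's stock quote do you want?" The user may then provide an utterance including the company name, and the request may be completed. In one implementation, the response may ask for missing information in a way that most restricts possible interpretations (e.g., in a request for a task that requires both a city and a state, the state may be asked for first because there are fewer states than cities). Moreover, the inferencing intended operations feature may model compound tasks and/or requests by maintaining context and identifying relevant and/or missing information at both a composite and a sub-task level.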
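One way to read "ask for missing information in a way that most restricts possible interpretations" is to prompt first for the missing slot with the fewest possible values. The sketch below assumes a hypothetical `REQUIRED_SLOTS` table whose value counts (other than the 50 states) are made-up placeholders, and a hypothetical `next_prompt` helper.

```python
# Hypothetical slot model: each task lists its required slots with a rough
# count of possible values. Counts other than 50 states are illustrative.
REQUIRED_SLOTS = {
    "directory_lookup": {"state": 50, "city": 20_000, "listing": 1_000_000},
}


def next_prompt(task: str, filled: dict):
    """Return a prompt for the most constraining missing slot, or None."""
    missing = {slot: size for slot, size in REQUIRED_SLOTS[task].items()
               if slot not in filled}
    if not missing:
        return None  # the request is complete
    # Fewer possible values means the answer narrows the search the most.
    slot = min(missing, key=missing.get)
    return f"Which {slot}?"
```

With nothing filled in, the sketch asks for the state first; once the state is known it moves on to the city, mirroring the city-and-state example in the paragraph above.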

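[0031] The alternative expression feature may recognize common alternatives for nouns and verbs to reflect variations in usage patterns according to various criteria. For example, users may vary expression based on age, socio-economic background, ethnicity, user whims, or other factors. Thus, the alternative expression feature may support variations in expression where word order is unimportant or unanticipated. Alternatives in expression based on various criteria or demographics may be loaded into context domain agents 230 and/or vocabularies 235, and the alternative expression feature may update context domain agents 230 and/or vocabularies 235 based on inferred or newly discovered variations. In one implementation, conversational speech engine 215 may include a subscription interface to update changes to context domain agents 230 and/or vocabularies 235 (e.g., a repository may aggregate various user utterances and deploy updates system-wide). In operation, the alternative expression feature may allow nouns and/or verbs to be represented in different ways, to give simplistic, yet representative examples. For example, a user interested in a weather forecast for Washington, D.C. may provide any of the following utterances, each of which is interpreted equivalently: "What's the weather like in DC," "Is it raining inside the Beltway," "Gimme the forecast for the capital," etc. Similarly, the utterances "Go to my home," "Go home," "Show route to home," and "I would like to know my way home" may all be interpreted equivalently, where a user profile may include the user's home address and a navigation route to the home address may be calculated.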

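[0032] The imperfect speech feature may be able to infer requests from contradictory or otherwise imprecise information, such as when an utterance includes starts and stops, restarts, stutters, run-on sentences, or other imperfect speech. For example, a user may sometimes change their mind, and thus alter the request in mid-utterance, and the imperfect speech feature may nonetheless be able to infer a request based on models of human speech. For example, for an utterance of "Well, I wanna... Mexi... no, steak restaurant please, I'm hungry," existing voice user interfaces make no assumptions regarding models of human speech and would be unable to infer whether the user wanted a Mexican or a steak restaurant. The imperfect speech feature overcomes these drawbacks by using various models of human understanding that may indicate that the last criterion is most likely to be correct, or intonation, emphasis, stress, use of the word "not," or other models that may indicate which criterion is most likely to be correct. Thus, in the above example, the imperfect speech feature may infer that the user wants a steak restaurant.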
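Of the models mentioned above, the "last criterion is most likely correct" heuristic is the simplest to sketch. The `CUISINES` table (including the truncated form "mexi") and the `infer_cuisine` function are hypothetical; intonation, stress, and negation models are deliberately left out of this illustration.

```python
import re

# Hypothetical cuisine vocabulary, including a truncated form a speaker
# might produce before restarting.
CUISINES = {"mexi": "mexican", "mexican": "mexican",
            "steak": "steak", "thai": "thai"}


def infer_cuisine(utterance: str):
    """Apply only the 'last criterion wins' model to a restarted utterance."""
    words = re.findall(r"[a-z]+", utterance.lower())
    candidates = [CUISINES[w] for w in words if w in CUISINES]
    # The most recently spoken criterion is assumed to reflect final intent.
    return candidates[-1] if candidates else None
```

Applied to the example utterance, the abandoned "Mexi..." is overridden by the later "steak".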

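[0033] According to one aspect of the invention, noise tolerance module 250 may be closely related to the imperfect speech feature, and may operate to discard words or noise that have no meaning in a given context so as not to create confusion. Moreover, noise tolerance module 250 may filter out environmental and non-human noise to further reduce a likelihood of confusion. In one implementation, noise tolerance module 250 may cooperate with other modules and features to filter out words that do not fit into a context. For example, one or more contexts may be identified, and words that have no meaning with respect to system capabilities, random human utterances without meaning, and other noise may be filtered out. Thus, noise tolerance module 250 may model real-world conditions to identify meaningful requests. For example, noise tolerance module 250 may filter out other human conversations and/or utterances within range of one or more microphones. For example, a single device may include multiple microphones, or multiple devices may each include one or more microphones, and the noise tolerance module may collate inputs and cooperatively filter out sound by comparing the speech signals from the various microphones. Noise tolerance module 250 may also filter out non-human environmental noise within range of the microphones, out-of-vocabulary words (which could be a result of speaker ambiguity or malapropisms), or other noise that may be unrelated to a target request. Noise models in noise tolerance module 250 may define performance benchmarks based on human criteria. For example, if a driver of a car traveling at 65 miles per hour with the windows cracked open is 92% likely to be understood by a passenger, then noise tolerance module 250 may have similar performance under those conditions.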

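[0034] According to one aspect of the invention, conversational speech engine 215 may include a context determination process 255 that determines one or more contexts for a request to establish meaning within a conversation. The one or more contexts may be determined by having one or more context domain agents compete to determine a most appropriate domain for a given utterance, as described in co-pending U.S. patent application Ser. No. 11/197,504, entitled "Systems and Methods for Responding to Natural Language Speech Utterance," and co-pending U.S. patent application Ser. No. 11/212,693, entitled "Mobile Systems and Methods of Supporting Natural Language Human-Machine Interactions," both of which are hereby incorporated by reference. Once a given context domain agent "wins" the competition, the winning agent may be responsible for establishing or inferring further contexts and updating short-term and long-term shared knowledge. If there is a deadlock between context domain agents, an adaptive conversational response may prompt the user to disambiguate between the deadlocked agents. For example, a user utterance of "What about traffic?" may have distinct meanings in various contexts. That is, "traffic" may have a first meaning when the user is querying a system's media player (i.e., "traffic" would be a rock band led by singer/songwriter Steve Winwood), a second meaning when the user is querying a search interface regarding Michael Douglas films (i.e., "traffic" would be a film directed by Steven Soderbergh), and a third meaning when the user is querying a navigation device for directions to an airport (i.e., "traffic" would relate to conditions on the roads along a route to the airport).

[0035] Moreover, context determination process 255 may infer intended operations and/or context based on previous utterances and/or requests, whereas existing systems consider each utterance independently, potentially making the same errors over and over again. For example, if a given interpretation turns out to be incorrect, the incorrect interpretation may be removed as a potential interpretation from one or more grammars of the automatic speech recognizer and/or from the possible subsequent interpretations determined by context determination process 255, thereby ensuring that the same mistake will not be repeated for an identical utterance.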
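The competition between context domain agents, including the deadlock case, might be sketched as below. The scoring rule (counting vocabulary hits, with a small bonus for the most recent context) is an assumption chosen for illustration; the `AGENTS` table and `determine_context` function are hypothetical, and the actual competition is described in the incorporated applications.

```python
# Hypothetical agents, each represented only by a small vocabulary.
AGENTS = {
    "music": {"traffic", "song", "album", "play"},
    "navigation": {"traffic", "route", "airport", "directions"},
}


def determine_context(utterance: str, recent_context=None):
    """Return (winning_agent, None), or (None, prompt) on a deadlock."""
    words = utterance.lower().replace("?", "").split()
    # Each agent "bids" with the number of words it recognizes.
    scores = {name: sum(w in vocab for w in words)
              for name, vocab in AGENTS.items()}
    if recent_context in scores:
        scores[recent_context] += 0.5  # assumed bias toward the ongoing topic
    best = max(scores.values())
    winners = sorted(name for name, s in scores.items() if s == best)
    if len(winners) > 1:
        # Deadlock: ask the user to disambiguate between the tied agents.
        return None, f"Did you mean {' or '.join(winners)}?"
    return winners[0], None
```

On its own, "What about traffic?" ties between the two agents and yields a disambiguation prompt; with a recent navigation context the same utterance resolves without asking.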

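[0036] Context determination process 255 may overcome drawbacks of existing systems by continually updating one or more models of an existing context, where establishing context may be a by-product of a conversation that cannot be established a priori. Context determination process 255 may establish a first context domain, change to a second context domain, and change back to the first context domain as tasks are completed, partially completed, requested, etc., and a context stack may track conversation topics and attempt to fit a current utterance into the most recent context, the next most recent topic, and so on, traversing the context stack until a most likely intent can be established. For example, a user may utter "What's the traffic report," and context determination process 255 may establish Traffic as a context and return an output including a traffic report that does not happen to mention traffic on Interstate-5. The user may then utter "What about I-5?" and context determination process 255 may know that the current context is Traffic, may search for a traffic report including information about Interstate-5, and may return as an output a traffic report indicating that Interstate-5 is crowded. The user may then utter "Is there a faster way?" and context determination process 255 may know that the current context is still Traffic, and may search for routes to a specified destination with light traffic that avoid Interstate-5. Moreover, context determination process 255 may build context based on user profiles, environmental profiles, historical profiles, or other information to further refine the context. For example, the profiles may indicate that Interstate-5 is a typical route taken Monday through Friday.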
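The context stack traversal described above (try the most recent context, then the next most recent, and so on) can be sketched as follows. The `ContextStack` class and the vocabulary-membership test used to decide whether an utterance "fits" a context are hypothetical simplifications.

```python
class ContextStack:
    """Tracks conversation topics, most recent last (illustrative only)."""

    def __init__(self):
        self._stack = []

    def push(self, context: str) -> None:
        # Re-pushing an existing context moves it back to the top.
        if context in self._stack:
            self._stack.remove(context)
        self._stack.append(context)

    def resolve(self, words, vocabularies):
        """Walk from the newest context down until the words fit one."""
        for context in reversed(self._stack):
            if any(w in vocabularies[context] for w in words):
                self.push(context)  # the matched context becomes current
                return context
        return None  # no tracked context explains the utterance
```

For instance, after a weather exchange followed by a traffic exchange, a follow-up containing "i-5" resolves to the traffic context first, while a later mention of "rain" falls through to the older weather context and makes it current again.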

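[0037] The profiles may be particularly meaningful when attempting to disambiguate between contexts where a word has different meanings in different contexts. For example, a user may utter "What's the weather in Seattle?" and context determination process 255 may establish Weather as a context, as well as establishing Seattle as an environmental context. The user may then utter "and Portland?" and context determination process 255 may return a weather report for Portland, Oregon, based on the Weather context and the environmental proximity between Portland, Oregon and Seattle, Washington. The user may then ask "What time does the game start?" and a search for sports events with teams from Seattle and/or Portland may occur, with results presented conversationally according to methods described in greater detail with reference to FIG. 3 below. Correspondingly, had the user originally uttered "What's the weather in Portsmouth, New Hampshire," then for the second utterance context determination process 255 may instead retrieve a weather report for Portland, Maine, based on an environmental proximity to New Hampshire. Moreover, when environmental profiles, contextual shared knowledge, and/or other short-term and/or long-term shared knowledge do not provide enough information to disambiguate between possibilities, responses may prompt the user for further information (e.g., "Did you mean Portland, Maine, or Portland, Oregon?").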

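[0038] Context determination process 255 may cooperate with context domain agents 230, where each context domain agent 230 may define a collection of related functions that may be useful for users. Moreover, each context domain agent 230 may include a relevant vocabulary 235 and thought collections that model word groupings which, when evaluated together, may disambiguate one context domain from another (e.g., a Music context domain agent 230 may include a vocabulary 235 for songs, artists, albums, etc., whereas a Stock context domain agent 230 may include a vocabulary 235 for company names, ticker symbols, financial metrics, etc.). Thus, accuracy in identifying meaning may be enhanced by eliminating out-of-context words and noise words when searching for relevant combinations. In contrast, existing systems attempt to assign meaning to every component of an utterance (e.g., including out-of-context words and noise words), which results in nearly infinite possible combinations and a greater likelihood of confusion. Moreover, context domain agents 230 may include metadata for each criterion to further assist in interpreting utterances, inferring intent, completing incomplete requests, etc. (e.g., a Space Needle vocabulary word may include metadata for Seattle, landmark, tourism, SkyCity restaurant, etc.). Given a disambiguated criterion, context determination process 255 may thus be able to automatically determine other information needed to complete a request, disregard the importance of word order, and perform other enhancements for conversational speech.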

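[0039] Context domain agents 230 may also be self-aware, assigning degrees of certainty to one or more generated hypotheses, where a hypothesis may be developed to account for variations in environmental conditions, speaker ambiguity, accents, or other factors. Conceptually, context domain agents 230 may be designed to model utterances as a hard-of-hearing person would at a noisy party. By identifying a context, capabilities within the context, vocabularies within the context, what tasks are done most often historically in the context, what task was just completed, etc., a context domain agent 230 may establish intent from rather meager phonetic clues. Moreover, the context stack may be one of a plurality of components for establishing context, and thus is not a constraint upon the user. All context domains may be accessible, allowing the user to switch contexts at any time without confusion. Thus, just as in human-to-human conversation, context domains may be rapidly selected, without menu-driven dead ends, when an utterance is unambiguous. For example, a user may utter, "Please call Rich Kennewick on his cell phone," and a system response of "Do you wish me to call Rich Kennewick on his cell?" may be generated. The user may decide at that point to call Rich Kennewick later, and instead listen to some music. Thus, the user may then utter, "No, play the Louis Armstrong version of Body and Soul from my iPod," and a system response of "Playing Body and Soul by Louis Armstrong" may be generated as Body and Soul is played through a media player. In this example, the later utterance has no contextual connection to the first utterance, yet because the request criteria in the utterance are unambiguous, contexts can be switched easily without relying on the context stack.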

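[0040] Referring to FIG. 3, an exemplary cooperative conversational model 300 is illustrated according to an aspect of the invention. Cooperative conversational model 300 may build upon free-form voice search 245, noise tolerance 250, and context determination 255 to implement a conversational Human-to-Machine interface that reflects how humans interact with each other and their normal behavior in conversation. Simply put, cooperative conversational model 300 enables humans and machines to participate in a conversation with an accepted purpose or direction, with each participant contributing to the conversation. That is, cooperative conversational model 300 incorporates technology and process flow that take advantage of the presumptions about utterances that humans rely upon, both as speakers and as listeners, thereby creating a Human-to-Machine interface that is analogous to everyday human-to-human conversation. In one implementation, a cooperative conversation may take incoming data (shared knowledge) 305 to inform a decision (intelligent hypothesis building) 310, and may then refine the decision and generate a response (adaptive response building) 315.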

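[0041] According to one aspect of the invention, shared knowledge 305 includes both short-term and long-term knowledge about incoming data. Short-term knowledge may accumulate during a single conversation, while long-term knowledge may accumulate over time to build user profiles, environmental profiles, historical profiles, cognitive profiles, etc.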

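[0042] Input received during a single conversation may be retained in a Session Input Accumulator. The Session Input Accumulator may include cross-modality awareness, where in addition to accumulating input relating to user utterances, requests, locations, etc., the Session Input Accumulator may accumulate a current user interface state relating to other modal inputs to further build shared knowledge models and more accurate adaptive responses (e.g., when a user utters a request relating to a portion of a touch-screen device, as described above). For example, the Session Input Accumulator may accumulate inputs including recognition text for each utterance, a recorded speech file for each utterance, a list-item selection history, a graphical user interface manipulation history, or other input data. Thus, the Session Input Accumulator may populate Intelligent Hypothesis Builder 310 with current and relevant information, build long-term shared knowledge by identifying information with long-term significance, provide Adaptive Response Builder 315 with relevant state and word usage information, retain recent contexts for use with Intelligent Hypothesis Builder 310, and/or retain utterances for reprocessing during multi-pass evaluations. Moreover, because cooperative conversations 300 model human conversations, short-term session data may be expired after a psychologically appropriate amount of time, thereby humanizing system behavior. For example, a human is unlikely to recall the context of a conversation from two years ago, but because the context would still be identifiable by a machine, session context is expired after a predetermined amount of time to reduce a likelihood of contextual confusion based on stale data. However, relevant information from an expired session context may nonetheless be added to user, historical, environmental, cognitive, or other long-term knowledge models.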
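The expiry behavior described above, where stale session data leaves short-term memory but still feeds the long-term models, can be sketched as follows. The class shape, the per-topic counter used as a stand-in for a long-term profile, and the injectable `clock` parameter are all assumptions made for illustration.

```python
import time


class SessionInputAccumulator:
    """Illustrative short-term store with expiry into a long-term summary."""

    def __init__(self, ttl_seconds: float, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock      # injectable so the expiry can be simulated
        self.short_term = []    # list of (timestamp, item) pairs
        self.long_term = {}     # topic -> number of expired mentions

    def add(self, topic: str, text: str) -> None:
        self.short_term.append((self.clock(), {"topic": topic, "text": text}))

    def expire(self) -> None:
        """Drop stale session items, keeping a long-term trace of each."""
        now = self.clock()
        fresh = []
        for stamp, item in self.short_term:
            if now - stamp > self.ttl:
                topic = item["topic"]
                self.long_term[topic] = self.long_term.get(topic, 0) + 1
            else:
                fresh.append((stamp, item))
        self.short_term = fresh
```

A predetermined `ttl_seconds` plays the role of the "psychologically appropriate amount of time"; the source does not specify a value.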

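[0043] Long-term shared knowledge may generally be user-centric, rather than session-based. That is, inputs may be accumulated over time to build user, environmental, cognitive, historical, or other long-term knowledge models. Long-term and short-term shared knowledge (collectively, shared knowledge 305) may be used simultaneously anytime a user engages in a cooperative conversation 300. Long-term shared knowledge may include explicit and/or implicit user preferences, a history of most recently used agents, contexts, requests, tasks, etc., user-specific jargon related to vocabularies and/or capabilities of an agent and/or context, most often used word choices, or other information. The long-term shared knowledge may be used to populate Intelligent Hypothesis Builder 310 with current and relevant information, provide Adaptive Response Builder 315 with appropriate word choices when they are unavailable via the Session Input Accumulator, refine long-term shared knowledge models, identify a frequency of specific tasks, identify tasks a user frequently has difficulty with, or provide other information and/or analysis to generate more accurate conversational responses.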

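[0044] As described above, shared knowledge 305 may be used to populate Intelligent Hypothesis Builder 310, such that a user and a voice user interface may share assumptions and expectations such as topic knowledge, conversation history, word usage, jargon, tone (e.g., formal, humorous, terse, etc.), or other assumptions and/or expectations that facilitate interaction at a Human-to-Machine interface.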

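[0045] According to an aspect of the invention, one component of a successful cooperative conversation may be identifying a type of conversation from an utterance. By categorizing and developing conceptual models for various types of exchanges, user expectations and domain capabilities may be consistently aligned. Intelligent Hypothesis Builder 310 may generate a hypothesis as to a conversation type by considering conversational goals, participant roles, and/or an allocation of information among the participants. Conversational goals may broadly include: (1) getting a discrete piece of information or performing a discrete task, (2) gathering related pieces of information to make a decision, and/or (3) disseminating or gathering large amounts of information to build expertise. Participant roles may broadly include: (1) a leader that controls a conversation, (2) a supporter that follows the leader and provides input as requested, and/or (3) a consumer that uses information. Information may be held by one or more of the participants at the outset of a conversation, where a participant may hold most (or all) of the information, little (or none) of the information, or the information may be allocated roughly equally amongst the participants. Based on the conversational goals, participant roles, and allocation of information, Intelligent Hypothesis Builder 310 may consider various factors to classify a conversation (or utterance) into general types of conversations that can interact with one another to form many more variations and permutations of conversation types (e.g., a conversation type may change dynamically as information is reallocated from one participant to another, or as conversational goals change based on the reallocation of information).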

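[0046] For example, in one implementation, a query conversation may include a conversational goal of getting a discrete piece of information or performing a particular task, where a leader of the query conversation may have a specific goal in mind and may lead the conversation toward achieving the goal. The other participant may hold the information and may support the leader by providing the information. In a didactic conversation, a leader of the conversation may control information desired by a supporter of the conversation. The supporter's role may be limited to regulating the overall progression of the conversation and interjecting queries for clarification. In an exploratory conversation, both participants share leader and supporter roles, and the conversation may have no specific goal, or the goal may be improvised as the conversation progresses. Based on this model, Intelligent Hypothesis Builder 310 may broadly categorize a conversation (or utterance) according to the following diagram: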

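[0047] Intelligent Hypothesis Builder 310 may use an identified conversation type to assist in generating a set of hypotheses as to a user's intent in an utterance. In addition, Intelligent Hypothesis Builder 310 may use short-term shared knowledge from the Session Input Accumulator to proactively build and evaluate interaction with a user as a conversation progresses, as well as long-term shared knowledge to proactively build and evaluate interaction with the user over time. Intelligent Hypothesis Builder 310 may thus adaptively arrive at a set of N-best hypotheses about user intent, and the N-best hypotheses may be provided to Adaptive Response Builder 315. In addition, Intelligent Hypothesis Builder 310 may model human-to-human interaction by calculating a degree of certainty for each of the hypotheses. That is, just as humans rely on knowledge shared by participants to examine how much and what kind of information is available, Intelligent Hypothesis Builder 310 may leverage the identified conversation type and short-term and long-term shared knowledge to generate a degree of certainty for each hypothesis.

[0048] According to another aspect of the invention, Intelligent Hypothesis Builder 310 may generate one or more explicit hypotheses of a user's intent when an utterance contains all information (including qualifiers) needed to complete a request or task. Each hypothesis may have a corresponding degree of certainty, which may be used to determine a level of unprompted support to provide in a response. For example, a response may include a confirmation to ensure the utterance was not misunderstood, or the response may adaptively prompt a user to provide missing information.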

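[0049] According to another aspect of the invention, Intelligent Hypothesis Builder 310 may use short-term knowledge to generate one or more implicit hypotheses of a user's intent when an utterance may be missing required qualifiers or other information needed to complete a request or task. Each hypothesis may have a corresponding degree of certainty. For instance, when a conversation begins, short-term knowledge stored in the Session Input Accumulator may be empty, and as the conversation progresses, the Session Input Accumulator may build a history of the conversation. Intelligent Hypothesis Builder 310 may use data in the Session Input Accumulator to supplement or infer additional information about a current utterance. For example, Intelligent Hypothesis Builder 310 may evaluate a degree of certainty based on a number of previous requests relevant to the current utterance. In another example, when the current utterance contains insufficient information to complete a request or task, data in the Session Input Accumulator may be used to infer the missing information so that a hypothesis can be generated. In still another example, Intelligent Hypothesis Builder 310 may identify syntax and/or grammar to be used by Adaptive Response Builder 315 to formulate personalized and conversational responses. In yet another example, when the current utterance contains a threshold amount of information needed to complete a request or task, data in the Session Input Accumulator may be relied upon to tune a degree of certainty.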

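[0050] According to another aspect of the invention, Intelligent Hypothesis Builder 310 may use long-term shared knowledge to generate one or more implicit hypotheses of a user's intent when an utterance is missing qualifiers or other information needed to complete a request or task. Each hypothesis may have a corresponding degree of certainty. Using long-term knowledge may be substantially similar to using short-term shared knowledge, except that the information may be unconstrained by a current session, and the input mechanism may include information from additional sources other than the current conversation. For example, Intelligent Hypothesis Builder 310 may use information from long-term shared knowledge at any time, even when a new conversation is initiated, whereas short-term shared knowledge may be limited to an existing conversation (no short-term shared knowledge is available when a new conversation is initiated). Long-term shared knowledge may come from several sources, including user preferences or a plug-in data source (e.g., a subscription interface to a remote database), expertise of a user (e.g., based on a frequency of errors, types of tasks requested, etc., the user may be identified as a novice, intermediate, experienced, or other type of user), agent-specific information and/or language that may also apply to other agents (e.g., by decoupling information from one agent to incorporate the information into other agents), frequently used topics passed in from the Session Input Accumulator, frequently used verbs, nouns, or other parts of speech, and/or other syntax information passed in from the Session Input Accumulator, or other sources of long-term shared knowledge may be used.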

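[0051] According to another aspect of the invention, knowledge-enabled utterances, as generated by Intelligent Hypothesis Builder 310, may include one or more explicit (supplied by a user) and one or more implicit (supplied by Intelligent Hypothesis Builder 310) contextual signifiers, qualifiers, criteria, and other information that can be used to identify and evaluate relevant tasks. At that point, Intelligent Hypothesis Builder 310 may provide an input to Adaptive Response Builder 315. The input received by Adaptive Response Builder 315 may include at least a ranked list of hypotheses, including explicit and/or implicit hypotheses, each of which may have a corresponding degree of certainty. A hypothesis may be assigned one of four degrees of certainty: (1) "sure," where contextual signifiers and qualifiers relate to one task, context and qualifiers relate to one task, and an ASR confidence level exceeds a predetermined threshold; (2) "pretty sure," where contextual signifiers and qualifiers relate to more than one task (the top-ranked task is selected), criteria relate to one request, and/or the ASR confidence level is below the predetermined threshold; (3) "not sure," where additional contextual signifiers or qualifiers are needed to indicate or rank a task; and (4) "no hypothesis," where little or no information can be deciphered. Each degree of certainty may further be classified as explicit or implicit, which may be used to adjust a response. The input received by Adaptive Response Builder 315 may also include a context, user syntax and/or grammar, and context domain agent specific information and/or preferences (e.g., a travel context domain agent may know that a user frequently requests information about France, which may be shared with a movie context domain agent so that responses may occasionally include French movies).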
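The four degrees of certainty can be sketched as a simple classification. The mapping below is a deliberately simplified reading of the criteria listed above: the inputs (`matching_tasks`, `asr_confidence`, `needs_more_info`), the default threshold of 0.8, and the `degree_of_certainty` function itself are assumptions for illustration, since the source only refers to "a predetermined threshold."

```python
from enum import Enum


class Certainty(Enum):
    SURE = "sure"
    PRETTY_SURE = "pretty sure"
    NOT_SURE = "not sure"
    NO_HYPOTHESIS = "no hypothesis"


def degree_of_certainty(matching_tasks: int, asr_confidence: float,
                        threshold: float = 0.8,
                        needs_more_info: bool = False) -> Certainty:
    """Simplified mapping of the four degrees; the threshold is assumed."""
    if matching_tasks == 0:
        return Certainty.NO_HYPOTHESIS   # little or nothing deciphered
    if needs_more_info:
        return Certainty.NOT_SURE        # more signifiers/qualifiers needed
    if matching_tasks == 1 and asr_confidence > threshold:
        return Certainty.SURE            # one task, confident recognition
    # Several candidate tasks, or recognition confidence below threshold.
    return Certainty.PRETTY_SURE
```

A response builder could then vary its wording by degree, for example confirming before acting on "pretty sure" and prompting for missing qualifiers on "not sure".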

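[0052] According to another aspect of the invention, Adaptive Response Builder 315 may build syntactically, grammatically, and contextually sensitive "intelligent responses" that can be used with one or more agents to generate a conversational experience for a user, while also guiding the user to reply in a manner favorable for recognition. In one implementation, the intelligent responses may include a verbal or audible reply played through an output device (e.g., a speaker), and/or an action performed by a device, computer, or machine (e.g., downloading a web page, showing a list, executing an application, etc.). In one implementation, an appropriate response may not require conversational adaptation, and default replies and/or randomly selected response sets for a given task may be used.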

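[0053] According to another aspect of the invention, Adaptive Response Builder 315 may draw on information maintained by Intelligent Hypothesis Builder 310 to generate responses that may be sensitive to context, task recognition of a current utterance, what a user already knows about a topic, what an application already knows about the topic, shared knowledge regarding user preferences and/or related topics, appropriate contextual word usage (e.g., jargon), words uttered by the user in recent utterances, conversational development and/or course correction, conversational tone, type of conversation, natural variation in the wording of responses, or other information. As a result, Adaptive Response Builder 315 may generate intelligent responses that create a conversational feel, adapt to information that accumulates over the duration of a conversation, maintain cross-modal awareness, and keep the conversation on course.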

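[0054] According to another aspect of the invention, Adaptive Response Builder 315 may create a conversational feel by adapting to a user's manner of speaking, framing responses appropriately, and having natural variation and/or personality (e.g., by varying tone, pace, timing, inflection, word use, jargon, and other variables in a verbal or audible response). Adapting to a user's manner of speaking may include using contextual signifiers and grammatical rules to generate one or more sentences for use as response sets that may cooperate with the user. By taking advantage of short-term (from the Session Input Accumulator) and long-term (from one or more profiles) shared knowledge about how a user utters a request, the responses may be modeled using the techniques used to recognize requests. Adaptive Response Builder 315 may rate possible responses statistically and/or randomize responses, which creates an opportunity to build an exchange with natural variation and a conversational feel. This may be a significant advantage over existing voice user interfaces where input and output are incongruous, in that the input is "conversational" and the output is "computerese." The following examples may demonstrate how a response may adapt to a user's input word choices and manner of speaking: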

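[0055] According to another aspect of the invention, Adaptive Response Builder 315 may frame responses to influence a user to reply with an utterance that may be easily recognized. For example, a user may utter, "Get me some news," and a voice user interface response may be "Which categories of news? Top news stories, international news, political news, or sports news?" The response is likely to elicit utterances from the user, such as "Top news stories" or "International news," that are more likely to result in a completed request. Thus, the responses may conform to the cooperative nature of human dialog and a natural human tendency to "parrot" what was just heard as part of a next utterance. Moreover, knowledge of the current context may enhance responses to generate more meaningful conversational responses, such as in the following exchange.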

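[0056] Framing the responses may also deal with misrecognitions according to human models. For example, humans frequently remember a number of recent utterances, especially when one or more previous utterances were misrecognized or unrecognized. Another participant in the conversation may limit correction to the part of the utterance that was misrecognized or unrecognized, or clues may be provided over subsequent utterances and/or other interactions to indicate that the initial interpretation was incorrect. Thus, by storing and analyzing multiple utterances, utterances from earlier in a conversation may be corrected as the conversation progresses.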

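[0057] According to another aspect of the invention, Adaptive Response Builder 315 may generate multi-modal, or cross-modal, responses to a user. In one implementation, responses may be aware of and control one or more devices and/or interfaces, and users may respond by using whichever input method, or combination of input methods, is most convenient. For example, in a multi-modal environment, a response asking the user to direct an utterance with a "Yes" or "No" may also display the alternatives visually.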

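[0058] According to another aspect of the invention, Adaptive Response Builder 315 may correct a course of a conversation without interrupting conversational flow. Adaptive Response Builder 315 may generate intelligent responses based on the ranked list of hypotheses and corresponding degrees of certainty, which may be used to correct a course of a conversation without interrupting conversational flow. That is, even though the intelligent responses may be reasonably "sure," the intelligent responses may nonetheless sometimes be incorrect. While existing voice user interfaces tend to fail on ordinary conversational missteps, normal human interactions may expect missteps and deal with them appropriately. Thus, responses after a misrecognition may be modeled after clarifications, rather than errors, and words may be chosen in subsequent responses to move the conversation forward and establish an appropriate domain to be explored with the user. For example, course correction may result in the following exchange: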

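[0059] The invention has been described in terms of specific exemplary implementations and embodiments. Those skilled in the art will recognize, however, that various modifications may be made without departing from the scope and spirit of the invention. Accordingly, the specification and drawings are to be regarded as exemplary only, and the scope of the invention is to be determined solely by the appended claims.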

Claims (28)

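1. A method for providing a cooperative conversational voice user interface, the method comprising: receiving an utterance at a voice input device during a current conversation with a user, wherein the utterance includes one or more words that have different meanings in different contexts; accumulating short-term shared knowledge about the current conversation, wherein the short-term shared knowledge includes knowledge about the utterance received during the current conversation; accumulating long-term shared knowledge about the user, wherein the long-term shared knowledge includes knowledge about one or more past conversations with the user; identifying, at a conversational speech engine, a context associated with the utterance based on the short-term shared knowledge and the long-term shared knowledge; establishing an intended meaning for the utterance within the identified context, wherein the conversational speech engine establishes the intended meaning within the identified context to disambiguate an intent that the user had in speaking the one or more words that have different meanings in different contexts; and generating a response to the utterance, wherein the conversational speech engine grammatically or syntactically adapts the response based on the intended meaning established within the identified context.
2. The method of claim 1, wherein accumulating the short-term shared knowledge about the current conversation includes populating a short-term context stack with information about the utterance received during the current conversation.
3. The method of claim 2, wherein accumulating the short-term shared knowledge about the current conversation further includes expiring the information about the utterance from the short-term context stack after a predetermined amount of time.
4. The method of claim 3, wherein accumulating the long-term shared knowledge about the user includes updating one or more long-term profiles associated with the user to include the information about the utterance received during the current conversation and relevant data associated with the information expired from the short-term context stack.
5. The method of claim 1, further comprising: identifying a conversational goal associated with the utterance, roles associated with the user and one or more other participants in the current conversation, and an allocation of information among the user and the one or more other participants in the current conversation, to determine the intended meaning for the utterance; and classifying one or more of the utterance or the current conversation into a conversation type based on one or more of the identified conversational goal, the identified roles, or the identified allocation of information, wherein the conversational speech engine further establishes the intended meaning for the utterance based on the conversation type.
6. The method of claim 5, wherein the intended meaning established for the utterance includes a hypothesis having a degree of certainty about the intent that the user had in speaking the one or more words in the utterance.
7. The method of claim 6, further comprising: generating a preliminary interpretation of the utterance at a speech recognition engine coupled to the voice input device and the conversational speech engine; and assigning, at the conversational speech engine, the degree of certainty to the hypothesis based on one or more of the conversation type, information associated with the identified context, or a confidence level associated with the preliminary interpretation generated at the speech recognition engine.
8. The method of claim 5, wherein the conversational speech engine further grammatically or syntactically adapts the response based on the conversation type.
9. The method of claim 1, wherein the conversational speech engine grammatically or syntactically adapts the response to influence a subsequent reply utterance that the conversational speech engine expects from the user during the current conversation.
10. The method of claim 1, further comprising: generating multiple possible interpretations of the utterance at a speech recognition engine coupled to the voice input device and the conversational speech engine, wherein an initial interpretation of the utterance includes the one of the multiple possible interpretations having a highest confidence level; and updating the short-term shared knowledge about the current conversation to remove the initial interpretation from the multiple possible interpretations in response to determining that the initial interpretation was incorrect, wherein the conversational speech engine determines the intended meaning for the utterance based on the one of the multiple possible interpretations having a next highest confidence level.
11. The method of claim 1, wherein the user speaks the utterance in a multi-modal input that further includes one or more non-speech inputs relating to the utterance.
12. The method of claim 1, wherein the conversational speech engine generates the response in a multi-modal output that includes one or more non-speech outputs relating to the utterance or to executing one or more tasks to process a request identified from the intended meaning for the utterance.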
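13. A method for providing a cooperative conversational voice user interface, the method comprising: receiving an utterance at a voice input device during a current conversation with a user; accumulating short-term shared knowledge about the current conversation, wherein the short-term shared knowledge includes knowledge about the utterance received during the current conversation; accumulating long-term shared knowledge about the user, wherein the long-term shared knowledge includes knowledge about one or more past conversations with the user; identifying, at a conversational speech engine, a context associated with the utterance based on the short-term shared knowledge and the long-term shared knowledge; inferring additional information about the utterance from the short-term shared knowledge and the long-term shared knowledge in response to determining that the utterance contains insufficient information to complete a request in the identified context; establishing an intended meaning for the utterance within the identified context, wherein the conversational speech engine establishes the intended meaning for the utterance based on the additional information inferred about the utterance; and generating a response to the utterance based on the intended meaning established for the utterance within the identified context.
14. The method of claim 13, wherein the intended meaning established for the utterance includes an implicit hypothesis having a corresponding degree of certainty about an intent that the user had in speaking the utterance.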
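15. A system for providing a cooperative conversational voice user interface, the system comprising: means for controlling a voice input device to receive an utterance during a current conversation with a user, wherein the utterance includes one or more words that have different meanings in different contexts; means for accumulating short-term shared knowledge about the current conversation, wherein the short-term shared knowledge includes knowledge about the utterance received during the current conversation; means for accumulating long-term shared knowledge about the user, wherein the long-term shared knowledge includes knowledge about one or more past conversations with the user; means for identifying a context associated with the utterance based on the short-term shared knowledge and the long-term shared knowledge; means for establishing an intended meaning for the utterance within the identified context to disambiguate an intent that the user had in speaking the one or more words that have different meanings in different contexts; and means for generating a grammatically or syntactically adapted response to the utterance based on the intended meaning established for the utterance within the identified context.
16. The system of claim 15, wherein the means for accumulating the short-term shared knowledge about the current conversation populates a short-term context stack with information about the utterance received during the current conversation.
17. The system of claim 16, wherein the means for accumulating the short-term shared knowledge about the current conversation expires the information about the utterance from the short-term context stack after a predetermined amount of time.
18. The system of claim 17, wherein the means for accumulating the long-term shared knowledge about the user updates one or more long-term profiles associated with the user to include the information about the utterance received during the current conversation and relevant data associated with the information expired from the short-term context stack.
19. The system of claim 15, further comprising: means for identifying a conversational goal associated with the utterance, roles associated with the user and one or more other participants in the current conversation, and an allocation of information among the user and the one or more other participants in the current conversation; and means for classifying one or more of the utterance or the current conversation into a conversation type based on one or more of the identified conversational goal, the identified roles, or the identified allocation of information, wherein the means for establishing the intended meaning for the utterance establishes the intended meaning for the utterance based on the conversation type.
20. The system of claim 19, wherein the intended meaning established for the utterance further includes a hypothesis having a degree of certainty about the intent that the user had in speaking the one or more words in the utterance.
21. The system of claim 20, further comprising: a speech recognition engine configured to generate a preliminary interpretation of the utterance; and means for assigning the degree of certainty to the hypothesis based on one or more of the conversation type, information associated with the identified context, or a confidence level associated with the preliminary interpretation generated at the speech recognition engine.
22. The system of claim 19, wherein the means for generating the grammatically or syntactically adapted response generates the grammatically or syntactically adapted response based on the conversation type.
23. The system of claim 15, wherein the means for generating the grammatically or syntactically adapted response generates the grammatically or syntactically adapted response to influence a subsequent reply utterance expected from the user during the current conversation.
24. The system of claim 15, further comprising a speech recognition engine configured to: generate multiple possible interpretations of the utterance, wherein an initial interpretation of the utterance includes the one of the multiple possible interpretations having a highest confidence level; and update the short-term shared knowledge about the current conversation to remove the initial interpretation from the multiple possible interpretations in response to determining that the initial interpretation was incorrect, wherein the means for identifying the context associated with the utterance identifies the context associated with the utterance based on the one of the multiple possible interpretations having a next highest confidence level, and the means for establishing the intended meaning for the utterance establishes the intended meaning for the utterance based on the one of the multiple possible interpretations having the next highest confidence level.
25. The system of claim 15, wherein the user speaks the utterance in a multi-modal input that further includes one or more non-speech inputs relating to the utterance.
26. The system of claim 15, wherein the grammatically or syntactically adapted response includes a multi-modal output that includes one or more non-speech outputs relating to the utterance or to executing one or more tasks to process a request identified from the intended meaning for the utterance.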
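27. A system for providing a cooperative conversational voice user interface, the system comprising: means for controlling a voice input device to receive an utterance during a current conversation with a user; means for accumulating short-term shared knowledge about the current conversation, wherein the short-term shared knowledge includes knowledge about the utterance received during the current conversation; means for accumulating long-term shared knowledge about the user, wherein the long-term shared knowledge includes knowledge about one or more past conversations with the user; means for identifying a context associated with the utterance based on the short-term shared knowledge and the long-term shared knowledge; means for inferring additional information about the utterance from the short-term shared knowledge and the long-term shared knowledge in response to determining that the utterance contains insufficient information to complete a request in the identified context; means for establishing an intended meaning for the utterance within the identified context based on the additional information inferred about the utterance; and means for generating a response to the utterance based on the intended meaning established for the utterance within the identified context.
28. The system of claim 27, wherein the intended meaning established for the utterance includes an implicit hypothesis having a corresponding degree of certainty about an intent that the user had in speaking the utterance.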
CN 200780042315 2006-10-16 2007-10-16 System and method for a cooperative conversational voice user interface CN101535983B (zh)

Priority Applications (3)

Application Number Priority Date Filing Date Title
US11580926 US8073681B2 (en) 2006-10-16 2006-10-16 System and method for a cooperative conversational voice user interface
US11/580,926 2006-10-16
PCT/US2007/081481 WO2008118195A3 (en) 2006-10-16 2007-10-16 System and method for a cooperative conversational voice user interface

Publications (2)

Publication Number Publication Date
CN101535983A true CN101535983A (zh) 2009-09-16
CN101535983B true CN101535983B (zh) 2012-08-22

Family

ID=39304061

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 200780042315 CN101535983B (zh) 2006-10-16 2007-10-16 System and method for a cooperative conversational voice user interface

Country Status (4)

Country Link
US (4) US8073681B2 (zh)
CN (1) CN101535983B (zh)
EP (1) EP2082335A4 (zh)
WO (1) WO2008118195A3 (zh)

Families Citing this family (177)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2001013255A3 (en) 1999-08-13 2001-11-15 Pixo Inc Displaying and traversing links in character array
US8645137B2 (en) 2000-03-16 2014-02-04 Apple Inc. Fast, language-independent method for user authentication by voice
DE60205649D1 (de) 2001-10-22 2005-09-22 Riccardo Vieri System um Textnachrichten in Stimmnachrichten zu konvertieren und über eine Internetverbindung zu einem Telefon zu schicken und verfahren zum Betrieb dieses System
US7398209B2 (en) 2002-06-03 2008-07-08 Voicebox Technologies, Inc. Systems and methods for responding to natural language speech utterance
US7693720B2 (en) 2002-07-15 2010-04-06 Voicebox Technologies, Inc. Mobile systems and methods for responding to natural language speech utterance
US7669134B1 (en) 2003-05-02 2010-02-23 Apple Inc. Method and apparatus for displaying information during an instant messaging session
US7633076B2 (en) 2005-09-30 2009-12-15 Apple Inc. Automated response to and sensing of user activity in portable devices
US20060271520A1 (en) * 2005-05-27 2006-11-30 Ragan Gene Z Content-based implicit search query
US7640160B2 (en) * 2005-08-05 2009-12-29 Voicebox Technologies, Inc. Systems and methods for responding to natural language speech utterance
US7620549B2 (en) 2005-08-10 2009-11-17 Voicebox Technologies, Inc. System and method of supporting adaptive misrecognition in conversational speech
US7949529B2 (en) 2005-08-29 2011-05-24 Voicebox Technologies, Inc. Mobile systems and methods of supporting natural language human-machine interactions
EP1934971A4 (en) * 2005-08-31 2010-10-27 Voicebox Technologies Inc Dynamic speech sharpening
US8677377B2 (en) 2005-09-08 2014-03-18 Apple Inc. Method and apparatus for building an intelligent automated assistant
CA2664152C (en) * 2006-09-21 2014-09-30 Activx Biosciences, Inc. Serine hydrolase inhibitors
US8073681B2 (en) 2006-10-16 2011-12-06 Voicebox Technologies, Inc. System and method for a cooperative conversational voice user interface
US20080129520A1 (en) * 2006-12-01 2008-06-05 Apple Computer, Inc. Electronic device with enhanced audio feedback
US7818176B2 (en) 2007-02-06 2010-10-19 Voicebox Technologies, Inc. System and method for selecting and presenting advertisements based on natural language processing of voice-based input
US7912828B2 (en) * 2007-02-23 2011-03-22 Apple Inc. Pattern searching methods and apparatuses
US8949130B2 (en) * 2007-03-07 2015-02-03 Vlingo Corporation Internal and external speech recognition use with a mobile communication facility
US8886540B2 (en) * 2007-03-07 2014-11-11 Vlingo Corporation Using speech recognition results based on an unstructured language model in a mobile communication facility application
US8949266B2 (en) * 2007-03-07 2015-02-03 Vlingo Corporation Multiple web-based content category searching in mobile search application
US8635243B2 (en) * 2007-03-07 2014-01-21 Research In Motion Limited Sending a communications header with voice recording to send metadata for use in speech recognition, formatting, and search mobile search application
US8838457B2 (en) * 2007-03-07 2014-09-16 Vlingo Corporation Using results of unstructured language model based speech recognition to control a system-level function of a mobile communications facility
US8996379B2 (en) 2007-03-07 2015-03-31 Vlingo Corporation Speech recognition text entry for software applications
US8886545B2 (en) 2007-03-07 2014-11-11 Vlingo Corporation Dealing with switch latency in speech recognition
US8977255B2 (en) 2007-04-03 2015-03-10 Apple Inc. Method and system for operating a multi-function portable electronic device using voice-activation
JP4715805B2 (ja) * 2007-05-10 2011-07-06 Toyota Motor Corporation In-vehicle information retrieval device
US8359234B2 (en) 2007-07-26 2013-01-22 Braintexter, Inc. System to generate and set up an advertising campaign based on the insertion of advertising messages within an exchange of messages, and method to operate said system
US9053089B2 (en) 2007-10-02 2015-06-09 Apple Inc. Part-of-speech tagging using latent analogy
US8165886B1 (en) 2007-10-04 2012-04-24 Great Northern Research LLC Speech interface system and method for control and interaction with applications on a computing system
US8364694B2 (en) 2007-10-26 2013-01-29 Apple Inc. Search assistant for digital media assets
US8620662B2 (en) 2007-11-20 2013-12-31 Apple Inc. Context-aware unit selection
US8140335B2 (en) 2007-12-11 2012-03-20 Voicebox Technologies, Inc. System and method for providing a natural language voice user interface in an integrated voice navigation services environment
US20090164441A1 (en) * 2007-12-20 2009-06-25 Adam Cheyer Method and apparatus for searching using an active ontology
US9330720B2 (en) 2008-01-03 2016-05-03 Apple Inc. Methods and apparatus for altering audio output signals
US8327272B2 (en) 2008-01-06 2012-12-04 Apple Inc. Portable multifunction device, method, and graphical user interface for viewing and managing electronic calendars
US20090182702A1 (en) 2008-01-15 2009-07-16 Miller Tanya M Active Lab
US8065143B2 (en) 2008-02-22 2011-11-22 Apple Inc. Providing text input using speech data and non-speech data
US8289283B2 (en) 2008-03-04 2012-10-16 Apple Inc. Language input interface on a device
US8996376B2 (en) 2008-04-05 2015-03-31 Apple Inc. Intelligent text-to-speech conversion
CA2665009A1 (en) * 2008-05-23 2009-11-23 Accenture Global Services Gmbh System for handling a plurality of streaming voice signals for determination of responsive action thereto
CA2665055A1 (en) 2008-05-23 2009-11-23 Accenture Global Services Gmbh Treatment processing of a plurality of streaming voice signals for determination of responsive action thereto
CA2665014A1 (en) * 2008-05-23 2009-11-23 Accenture Global Services Gmbh Recognition processing of a plurality of streaming voice signals for determination of responsive action thereto
US8589161B2 (en) * 2008-05-27 2013-11-19 Voicebox Technologies, Inc. System and method for an integrated, multi-modal, multi-device natural language voice services environment
US9305548B2 (en) 2008-05-27 2016-04-05 Voicebox Technologies Corporation System and method for an integrated, multi-modal, multi-device natural language voice services environment
US20100030549A1 (en) 2008-07-31 2010-02-04 Lee Michael M Mobile device having human language translation capability with positional feedback
US8768702B2 (en) 2008-09-05 2014-07-01 Apple Inc. Multi-tiered voice feedback in an electronic device
US8898568B2 (en) 2008-09-09 2014-11-25 Apple Inc. Audio user interface
US8355919B2 (en) 2008-09-29 2013-01-15 Apple Inc. Systems and methods for text normalization for text to speech synthesis
US8583418B2 (en) 2008-09-29 2013-11-12 Apple Inc. Systems and methods of detecting language and natural language strings for text to speech synthesis
US8352272B2 (en) 2008-09-29 2013-01-08 Apple Inc. Systems and methods for text to speech synthesis
US8352268B2 (en) 2008-09-29 2013-01-08 Apple Inc. Systems and methods for selective rate of speech and speech preferences for text to speech synthesis
US8396714B2 (en) 2008-09-29 2013-03-12 Apple Inc. Systems and methods for concatenation of words in text to speech synthesis
US8712776B2 (en) 2008-09-29 2014-04-29 Apple Inc. Systems and methods for selective text to speech synthesis
US8676904B2 (en) 2008-10-02 2014-03-18 Apple Inc. Electronic devices with voice command and contextual data processing capabilities
US20100153398A1 (en) * 2008-12-12 2010-06-17 Next It Corporation Leveraging concepts with information retrieval techniques and knowledge bases
US8862252B2 (en) 2009-01-30 2014-10-14 Apple Inc. Audio user interface for displayless electronic device
US8326637B2 (en) 2009-02-20 2012-12-04 Voicebox Technologies, Inc. System and method for processing multi-modal device interactions in a natural language voice services environment
US8380507B2 (en) 2009-03-09 2013-02-19 Apple Inc. Systems and methods for determining the language to use for speech generated by a text to speech engine
US20100312547A1 (en) * 2009-06-05 2010-12-09 Apple Inc. Contextual voice commands
GB2495222B (en) * 2011-09-30 2016-10-26 Apple Inc Using context information to facilitate processing of commands in a virtual assistant
US9858925B2 (en) 2009-06-05 2018-01-02 Apple Inc. Using context information to facilitate processing of commands in a virtual assistant
US9431006B2 (en) 2009-07-02 2016-08-30 Apple Inc. Methods and apparatuses for automatic speech recognition
US20110010179A1 (en) * 2009-07-13 2011-01-13 Naik Devang K Voice synthesis and processing
US8892439B2 (en) * 2009-07-15 2014-11-18 Microsoft Corporation Combination and federation of local and remote speech recognition
US20110066438A1 (en) * 2009-09-15 2011-03-17 Apple Inc. Contextual voiceover
US8943094B2 (en) 2009-09-22 2015-01-27 Next It Corporation Apparatus, system, and method for natural language processing
US9171541B2 (en) 2009-11-10 2015-10-27 Voicebox Technologies Corporation System and method for hybrid processing in a natural language voice services environment
WO2011059997A1 (en) 2009-11-10 2011-05-19 Voicebox Technologies, Inc. System and method for providing a natural language content dedication service
US8682649B2 (en) 2009-11-12 2014-03-25 Apple Inc. Sentiment prediction from textual data
US8346549B2 (en) * 2009-12-04 2013-01-01 At&T Intellectual Property I, L.P. System and method for supplemental speech recognition by identified idle resources
EP2339576A3 (en) 2009-12-23 2011-11-23 Google Inc. Multi-modal input on an electronic device
KR101649911B1 (ko) 2010-01-04 2016-08-22 Samsung Electronics Co., Ltd. Dialogue system using an extended domain and natural language recognition method thereof
US8600743B2 (en) 2010-01-06 2013-12-03 Apple Inc. Noise profile determination for voice-related feature
US20110167350A1 (en) * 2010-01-06 2011-07-07 Apple Inc. Assist Features For Content Display Device
US8381107B2 (en) 2010-01-13 2013-02-19 Apple Inc. Adaptive audio feedback system and method
US8311838B2 (en) 2010-01-13 2012-11-13 Apple Inc. Devices and methods for identifying a prompt corresponding to a voice input in a sequence of prompts
US9318108B2 (en) 2010-01-18 2016-04-19 Apple Inc. Intelligent automated assistant
US8682667B2 (en) 2010-02-25 2014-03-25 Apple Inc. User profiling for selecting user specific voice input processing information
US8639516B2 (en) 2010-06-04 2014-01-28 Apple Inc. User-specific noise suppression for voice quality improvements
US8713021B2 (en) 2010-07-07 2014-04-29 Apple Inc. Unsupervised document clustering using latent semantic density analysis
US9104670B2 (en) 2010-07-21 2015-08-11 Apple Inc. Customized search or acquisition of digital media assets
US8719006B2 (en) 2010-08-27 2014-05-06 Apple Inc. Combined statistical and rule-based part-of-speech tagging for text-to-speech synthesis
US9594845B2 (en) * 2010-09-24 2017-03-14 International Business Machines Corporation Automating web tasks based on web browsing histories and user actions
US8719014B2 (en) 2010-09-27 2014-05-06 Apple Inc. Electronic device with text error correction based on voice recognition data
GB201306716D0 (en) * 2010-10-15 2013-05-29 Intelligent Mechatronic Sys Implicit association and polymorphism driven human machine interaction
US8352245B1 (en) 2010-12-30 2013-01-08 Google Inc. Adjusting language models
US8296142B2 (en) * 2011-01-21 2012-10-23 Google Inc. Speech recognition using dock context
US8781836B2 (en) 2011-02-22 2014-07-15 Apple Inc. Hearing assistance system for providing consistent human speech
US9081760B2 (en) * 2011-03-08 2015-07-14 At&T Intellectual Property I, L.P. System and method for building diverse language models
US20130066634A1 (en) * 2011-03-16 2013-03-14 Qualcomm Incorporated Automated Conversation Assistance
US9262612B2 (en) 2011-03-21 2016-02-16 Apple Inc. Device access using voice authentication
US9679561B2 (en) * 2011-03-28 2017-06-13 Nuance Communications, Inc. System and method for rapid customization of speech recognition models
US9026446B2 (en) * 2011-06-10 2015-05-05 Morgan Fiumi System for generating captions for live video broadcasts
US8812294B2 (en) 2011-06-21 2014-08-19 Apple Inc. Translating phrases from one language into another using an order-based set of declarative rules
US8706472B2 (en) 2011-08-11 2014-04-22 Apple Inc. Method for disambiguating multiple readings in language conversion
US8994660B2 (en) 2011-08-29 2015-03-31 Apple Inc. Text correction processing
US8762156B2 (en) 2011-09-28 2014-06-24 Apple Inc. Speech recognition repair using contextual information
US9836177B2 (en) 2011-12-30 2017-12-05 Next IT Innovation Labs, LLC Providing variable responses in a virtual-assistant environment
US9483461B2 (en) 2012-03-06 2016-11-01 Apple Inc. Handling speech synthesis of content for multiple languages
US9280610B2 (en) 2012-05-14 2016-03-08 Apple Inc. Crowd sourcing information to fulfill user requests
US8775442B2 (en) 2012-05-15 2014-07-08 Apple Inc. Semantic search using a single-source semantic model
US20130317805A1 (en) * 2012-05-24 2013-11-28 Google Inc. Systems and methods for detecting real names in different languages
US9721563B2 (en) 2012-06-08 2017-08-01 Apple Inc. Name recognition system
US8983840B2 (en) * 2012-06-19 2015-03-17 International Business Machines Corporation Intent discovery in audio or text-based conversation
US9495129B2 (en) 2012-06-29 2016-11-15 Apple Inc. Device, method, and user interface for voice-activated navigation and browsing of a document
US9424233B2 (en) * 2012-07-20 2016-08-23 Veveo, Inc. Method of and system for inferring user intent in search input in a conversational interaction system
US9424840B1 (en) * 2012-08-31 2016-08-23 Amazon Technologies, Inc. Speech recognition platforms
US9536049B2 (en) 2012-09-07 2017-01-03 Next It Corporation Conversational virtual healthcare assistant
US9576574B2 (en) 2012-09-10 2017-02-21 Apple Inc. Context-sensitive handling of interruptions by intelligent digital assistant
US9547647B2 (en) 2012-09-19 2017-01-17 Apple Inc. Voice-based media searching
US8935167B2 (en) 2012-09-25 2015-01-13 Apple Inc. Exemplar-based latent perceptual modeling for automatic speech recognition
WO2014055181A1 (en) * 2012-10-01 2014-04-10 Nuance Communications, Inc. Systems and methods for providing a voice agent user interface
US8983849B2 (en) 2012-10-17 2015-03-17 Nuance Communications, Inc. Multiple device intelligent language model synchronization
US9564125B2 (en) 2012-11-13 2017-02-07 GM Global Technology Operations LLC Methods and systems for adapting a speech system based on user characteristics
RU2530268C2 (ru) * 2012-11-28 2014-10-10 Limited Liability Company "Speaktoit" Method for user training of an information dialogue system
KR20140089871A (ko) * 2013-01-07 2014-07-16 Samsung Electronics Co., Ltd. Interactive server, control method thereof, and interactive system
US9378741B2 (en) 2013-03-12 2016-06-28 Microsoft Technology Licensing, Llc Search results using intonation nuances
US9733821B2 (en) 2013-03-14 2017-08-15 Apple Inc. Voice control to diagnose inadvertent activation of accessibility features
WO2014160203A3 (en) 2013-03-14 2014-12-04 The Brigham And Women's Hospital, Inc. Bmp inhibitors and methods of use thereof
US9368114B2 (en) 2013-03-14 2016-06-14 Apple Inc. Context-sensitive handling of interruptions
US8775163B1 (en) * 2013-03-15 2014-07-08 Rallee Selectable silent mode for real-time audio communication system
WO2014144579A1 (en) 2013-03-15 2014-09-18 Apple Inc. System and method for updating an adaptive speech recognition model
US20140317502A1 (en) * 2013-04-18 2014-10-23 Next It Corporation Virtual assistant focused user interfaces
US9805718B2 (en) * 2013-04-19 2017-10-31 SRI International Clarifying natural language input using targeted questions
US9390708B1 (en) * 2013-05-28 2016-07-12 Amazon Technologies, Inc. Low latency and memory efficient keywork spotting
US9431008B2 (en) * 2013-05-29 2016-08-30 Nuance Communications, Inc. Multiple parallel dialogs in smart phone applications
WO2014197334A3 (en) 2013-06-07 2015-01-29 Apple Inc. System and method for user-specified pronunciation of words for speech synthesis and recognition
US9582608B2 (en) 2013-06-07 2017-02-28 Apple Inc. Unified ranking with entropy-weighted information for phrase-based semantic auto-completion
WO2014197336A1 (en) 2013-06-07 2014-12-11 Apple Inc. System and method for detecting errors in interactions with a voice-based digital assistant
KR101772152B1 (ko) * 2013-06-09 2017-08-28 Apple Inc. Device, method, and graphical user interface for enabling conversation persistence across two or more instances of a digital assistant
JP2016521948A (ja) 2013-06-13 2016-07-25 Apple Inc. System and method for emergency calls initiated by voice command
US20140379336A1 (en) * 2013-06-20 2014-12-25 Atul Bhatnagar Ear-based wearable networking device, system, and method
US20150004591A1 (en) * 2013-06-27 2015-01-01 DoSomething.Org Device, system, method, and computer-readable medium for providing an educational, text-based interactive game
US9733894B2 (en) * 2013-07-02 2017-08-15 24/7 Customer, Inc. Method and apparatus for facilitating voice user interface design
US20150039316A1 (en) * 2013-07-31 2015-02-05 GM Global Technology Operations LLC Systems and methods for managing dialog context in speech systems
US9189742B2 (en) 2013-11-20 2015-11-17 Justin London Adaptive virtual intelligent agent
US20150179168A1 (en) * 2013-12-20 2015-06-25 Microsoft Corporation Multi-user, Multi-domain Dialog System
US20150186155A1 (en) 2013-12-31 2015-07-02 Next It Corporation Virtual assistant acquisitions and training
US9514748B2 (en) 2014-01-15 2016-12-06 Microsoft Technology Licensing, Llc Digital personal assistant interaction with impersonations and rich multimedia in responses
US9842592B2 (en) 2014-02-12 2017-12-12 Google Inc. Language models using non-linguistic context
US9412365B2 (en) 2014-03-24 2016-08-09 Google Inc. Enhanced maximum entropy models
JP2017514793A (ja) * 2014-03-26 2017-06-08 The Brigham and Women’s Hospital, Inc. Compositions for BMP inhibition and methods of BMP inhibition
US20150278370A1 (en) * 2014-04-01 2015-10-01 Microsoft Corporation Task completion for natural language input
US20150317973A1 (en) * 2014-04-30 2015-11-05 GM Global Technology Operations LLC Systems and methods for coordinating speech recognition
US9633649B2 (en) 2014-05-02 2017-04-25 At&T Intellectual Property I, L.P. System and method for creating voice profiles for specific demographics
US9620105B2 (en) 2014-05-15 2017-04-11 Apple Inc. Analyzing audio input for efficient speech and music recognition
US20150340033A1 (en) * 2014-05-20 2015-11-26 Amazon Technologies, Inc. Context interpretation in natural language processing using previous dialog acts
US9502031B2 (en) 2014-05-27 2016-11-22 Apple Inc. Method for supporting dynamic grammars in WFST-based ASR
US9633004B2 (en) 2014-05-30 2017-04-25 Apple Inc. Better resolution when referencing to concepts
US9715875B2 (en) 2014-05-30 2017-07-25 Apple Inc. Reducing the need for manual start/end-pointing and trigger phrases
US9842101B2 (en) 2014-05-30 2017-12-12 Apple Inc. Predictive conversion of language input
US9760559B2 (en) 2014-05-30 2017-09-12 Apple Inc. Predictive text input
US9734193B2 (en) 2014-05-30 2017-08-15 Apple Inc. Determining domain salience ranking from ambiguous words in natural speech
US9785630B2 (en) 2014-05-30 2017-10-10 Apple Inc. Text prediction using combined word N-gram and unigram language models
US9430463B2 (en) 2014-05-30 2016-08-30 Apple Inc. Exemplar-based natural language processing
JP5716859B1 (ja) * 2014-06-06 2015-05-13 Fuji Xerox Co., Ltd. Respondent extraction system and respondent extraction program
US9390706B2 (en) * 2014-06-19 2016-07-12 Mattersight Corporation Personality-based intelligent personal assistant system and methods
US9338493B2 (en) 2014-06-30 2016-05-10 Apple Inc. Intelligent automated assistant for TV user interactions
US9530412B2 (en) 2014-08-29 2016-12-27 At&T Intellectual Property I, L.P. System and method for multi-agent architecture for interactive machines
US9607102B2 (en) * 2014-09-05 2017-03-28 Nuance Communications, Inc. Task switching in dialogue processing
US9818400B2 (en) 2014-09-11 2017-11-14 Apple Inc. Method and apparatus for discovering trending terms in speech requests
US9412379B2 (en) * 2014-09-16 2016-08-09 Toyota Motor Engineering & Manufacturing North America, Inc. Method for initiating a wireless communication link using voice recognition
US9626703B2 (en) 2014-09-16 2017-04-18 Voicebox Technologies Corporation Voice commerce
KR20160032564A (ko) * 2014-09-16 2016-03-24 Samsung Electronics Co., Ltd. Image display apparatus, method of driving the image display apparatus, and computer-readable recording medium
US9606986B2 (en) 2014-09-29 2017-03-28 Apple Inc. Integrated word N-gram and class M-gram language models
US9646609B2 (en) 2014-09-30 2017-05-09 Apple Inc. Caching apparatus for serving phonetic pronunciations
US9668121B2 (en) 2014-09-30 2017-05-30 Apple Inc. Social reminders
JP2017535823A (ja) * 2014-10-01 2017-11-30 Xbrain, Inc. Voice and connection platform
EP3207467A1 (en) 2014-10-15 2017-08-23 VoiceBox Technologies Corporation System and method for providing follow-up responses to prior natural language inputs of a user
US9711141B2 (en) 2014-12-09 2017-07-18 Apple Inc. Disambiguating heteronyms in speech synthesis
US9865280B2 (en) 2015-03-06 2018-01-09 Apple Inc. Structured dictation using intelligent automated assistants
US9721566B2 (en) 2015-03-08 2017-08-01 Apple Inc. Competing devices responding to voice triggers
US9842105B2 (en) 2015-04-16 2017-12-12 Apple Inc. Parsimonious continuous-space phrase representations for natural language processing
US20170032791A1 (en) * 2015-07-31 2017-02-02 Google Inc. Managing dialog data providers
CN105138250A (zh) * 2015-08-03 2015-12-09 iFlytek Co., Ltd. Human-computer interaction operation guidance method and system, human-computer interaction device, and server
US9697820B2 (en) 2015-09-24 2017-07-04 Apple Inc. Unit-selection text-to-speech synthesis using concatenation-sensitive neural networks

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5488652A (en) 1994-04-14 1996-01-30 Northern Telecom Limited Method and apparatus for training speech recognition algorithms for directory assistance applications
US5748841A (en) 1994-02-25 1998-05-05 Morin; Philippe Supervised contextual language acquisition system

Family Cites Families (535)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4430669A (en) 1981-05-29 1984-02-07 Payview Limited Transmitting and receiving apparatus for permitting the transmission and reception of multi-tier subscription programs
US5208748A (en) * 1985-11-18 1993-05-04 Action Technologies, Inc. Method and apparatus for structuring and managing human communications by explicitly defining the types of communications permitted between participants
US5027406A (en) 1988-12-06 1991-06-25 Dragon Systems, Inc. Method for interactive speech recognition and training
CA2011286A1 (en) * 1989-03-06 1990-09-06 Gregor I. Jonsson Natural language analysing apparatus and method
JPH03129469A (en) 1989-10-14 1991-06-03 Canon Inc Natural language processor
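JP3266246B2 (ja) * 1990-06-15 2002-03-18 International Business Machines Corporation Natural language analysis apparatus and method, and method for constructing a knowledge base for natural language analysis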
US5722084A (en) * 1990-09-28 1998-02-24 At&T Corp. Cellular/PCS handset NAM download capability using a wide-area paging system
US5155743A (en) 1990-11-27 1992-10-13 Nuance Designworks, Inc. Digital data converter
US5274560A (en) 1990-12-03 1993-12-28 Audio Navigation Systems, Inc. Sensor free vehicle navigation system utilizing a voice input/output interface for routing a driver from his source point to his destination point
DE69232407T2 (de) 1991-11-18 2002-09-12 Toshiba Kawasaki Kk Speech dialogue system for facilitating computer-human interaction
US20070061735A1 (en) 1995-06-06 2007-03-15 Hoffberg Steven M Ergonomic man-machine interface incorporating adaptive pattern recognition based control system
US5608635A (en) 1992-04-14 1997-03-04 Zexel Corporation Navigation system for a vehicle with route recalculation between multiple locations
US8482535B2 (en) 1999-11-08 2013-07-09 Apple Inc. Programmable tactile touch screen displays and man-machine interfaces for improved vehicle instrumentation and telematics
CA2102077C (en) * 1992-12-21 1997-09-16 Steven Lloyd Greenspan Call billing and measurement methods for redirected calls
WO1994020952A1 (en) * 1993-03-12 1994-09-15 Sri International Method and apparatus for voice-interactive language instruction
US5471318A (en) 1993-04-22 1995-11-28 At&T Corp. Multimedia communications network
US5377350A (en) 1993-04-30 1994-12-27 International Business Machines Corporation System for cooperative communication between local object managers to provide verification for the performance of remote calls by object messages
US5537436A (en) 1993-06-14 1996-07-16 At&T Corp. Simultaneous analog and digital communication applications
US5983161A (en) 1993-08-11 1999-11-09 Lemelson; Jerome H. GPS vehicle collision avoidance warning and control system and method
DE69423838D1 (de) 1993-09-23 2000-05-11 Xerox Corp Semantic co-occurrence filtering for speech recognition and signal transcription applications
US5475733A (en) 1993-11-04 1995-12-12 At&T Corp. Language accommodated message relaying for hearing impaired callers
US5615296A (en) * 1993-11-12 1997-03-25 International Business Machines Corporation Continuous speech recognition and voice response system and method to enable conversational dialogues with microprocessors
CA2118278C (en) * 1993-12-21 1999-09-07 J. David Garland Multimedia system
US6009382A (en) 1996-08-19 1999-12-28 International Business Machines Corporation Word storage table for natural language determination
US5533108A (en) 1994-03-18 1996-07-02 At&T Corp. Method and system for routing phone calls based on voice and data transport capability
US5752052A (en) * 1994-06-24 1998-05-12 Microsoft Corporation Method and system for bootstrapping statistical processing into a rule-based natural language parser
JP2674521B2 (ja) 1994-09-21 1997-11-12 NEC Corporation Mobile object guidance device
US5539744A (en) 1994-10-17 1996-07-23 At&T Corp. Hand-off management for cellular telephony
US5696965A (en) 1994-11-03 1997-12-09 Intel Corporation Electronic information appraisal agent
JP2855409B2 (ja) * 1994-11-17 1999-02-10 IBM Japan, Ltd. Natural language processing method and system
US6571279B1 (en) 1997-12-05 2003-05-27 Pinpoint Incorporated Location enhanced information delivery system
US5499289A (en) * 1994-12-06 1996-03-12 At&T Corp. Systems, methods and articles of manufacture for performing distributed telecommunications
US5748974A (en) 1994-12-13 1998-05-05 International Business Machines Corporation Multimodal natural language interface for cross-application tasks
US5774859A (en) 1995-01-03 1998-06-30 Scientific-Atlanta, Inc. Information system having a speech interface
US5794050A (en) 1995-01-04 1998-08-11 Intelligent Text Processing, Inc. Natural language understanding system
US5918222A (en) * 1995-03-17 1999-06-29 Kabushiki Kaisha Toshiba Information disclosing apparatus and multi-modal information input/output system
US6965864B1 (en) * 1995-04-10 2005-11-15 Texas Instruments Incorporated Voice activated hypermedia systems using grammatical metadata
DE69622565D1 (de) 1995-05-26 2002-08-29 Speechworks Int Inc Method and apparatus for dynamic adaptation of a large vocabulary speech recognition system and for use of constraints from a database in a large vocabulary speech recognition system
US5708422A (en) * 1995-05-31 1998-01-13 At&T Transaction authorization and alert system
JP3716870B2 (ja) 1995-05-31 2005-11-16 Sony Corporation Speech recognition apparatus and speech recognition method
US5721938A (en) * 1995-06-07 1998-02-24 Stuckey; Barbara K. Method and device for parsing and analyzing natural language sentences and text
US5617407A (en) * 1995-06-21 1997-04-01 Bareis; Monica M. Optical disk having speech recognition templates for information access
US5794196A (en) 1995-06-30 1998-08-11 Kurzweil Applied Intelligence, Inc. Speech recognition system distinguishing dictation from commands by arbitration between continuous speech and isolated word modules
US6292767B1 (en) 1995-07-18 2001-09-18 Nuance Communications Method and system for building and running natural language understanding systems
US6567778B1 (en) * 1995-12-21 2003-05-20 Nuance Communications Natural language speech recognition using slot semantic confidence scores related to their word recognition confidence scores
US6804638B2 (en) 1999-04-30 2004-10-12 Recent Memory Incorporated Device and method for selective recall and preservation of events prior to decision to record the events
US5963940A (en) 1995-08-16 1999-10-05 Syracuse University Natural language information retrieval system and method
US5675629A (en) 1995-09-08 1997-10-07 At&T Cordless cellular system base station
US5855000A (en) 1995-09-08 1998-12-29 Carnegie Mellon University Method and apparatus for correcting and repairing machine-transcribed input using independent or cross-modal secondary input
US5911120A (en) * 1995-09-08 1999-06-08 At&T Wireless Services Wireless communication system having mobile stations establish a communication link through the base station without using a landline or regional cellular network and without a call in progress
US6192110B1 (en) * 1995-09-15 2001-02-20 At&T Corp. Method and apparatus for generating sematically consistent inputs to a dialog manager
US5799276A (en) 1995-11-07 1998-08-25 Accent Incorporated Knowledge-based speech recognition system and methods having frame length computed based upon estimated pitch period of vocalic intervals
US5960447A (en) 1995-11-13 1999-09-28 Holt; Douglas Word tagging and editing system for speech recognition
DE69631955T2 (de) 1995-12-15 2005-01-05 Koninklijke Philips Electronics N.V. Method and circuit for adaptive noise suppression, and transceiver
US5832221A (en) 1995-12-29 1998-11-03 At&T Corp Universal message storage system
US5633922A (en) * 1995-12-29 1997-05-27 At&T Process and apparatus for restarting call routing in a telephone network
US5802510A (en) 1995-12-29 1998-09-01 At&T Corp Universal directory service
US5742763A (en) * 1995-12-29 1998-04-21 At&T Corp. Universal message delivery system for handles identifying network presences
US5987404A (en) 1996-01-29 1999-11-16 International Business Machines Corporation Statistical natural language understanding using hidden clumpings
US6314420B1 (en) 1996-04-04 2001-11-06 Lycos, Inc. Collaborative/adaptive search engine
US5848396A (en) 1996-04-26 1998-12-08 Freedom Of Information, Inc. Method and apparatus for determining behavioral profile of a computer user
US5878386A (en) * 1996-06-28 1999-03-02 Microsoft Corporation Natural language parser with dictionary-based part-of-speech probabilities
US5953393A (en) 1996-07-15 1999-09-14 At&T Corp. Personal telephone agent
US5867817A (en) * 1996-08-19 1999-02-02 Virtual Vision, Inc. Speech recognition manager
US6385646B1 (en) * 1996-08-23 2002-05-07 At&T Corp. Method and system for establishing voice communications in an internet environment
US5892900A (en) 1996-08-30 1999-04-06 Intertrust Technologies Corp. Systems and methods for secure transaction management and electronic rights protection
US6470315B1 (en) 1996-09-11 2002-10-22 Texas Instruments Incorporated Enrollment and modeling method and apparatus for robust speaker dependent speech models
US5878385A (en) 1996-09-16 1999-03-02 Ergo Linguistic Technologies Method and apparatus for universal parsing of language
US6085186A (en) 1996-09-20 2000-07-04 Netbot, Inc. Method and system using information written in a wrapper description language to execute query on a network
WO1998013771A1 (fr) 1996-09-26 1998-04-02 Mitsubishi Denki Kabushiki Kaisha Interactive processor
US5892813A (en) * 1996-09-30 1999-04-06 Matsushita Electric Industrial Co., Ltd. Multimodal voice dialing digital key telephone with dialog manager
US5995928A (en) 1996-10-02 1999-11-30 Speechworks International, Inc. Method and apparatus for continuous spelling speech recognition with early identification
US5902347A (en) 1996-11-19 1999-05-11 American Navigation Systems, Inc. Hand-held GPS-mapping device
US5839107A (en) 1996-11-29 1998-11-17 Northern Telecom Limited Method and apparatus for automatically generating a speech recognition vocabulary from a white pages listing
US6154526A (en) * 1996-12-04 2000-11-28 Intellivoice Communications, Inc. Data acquisition and error correcting speech recognition system
US5960399A (en) 1996-12-24 1999-09-28 Gte Internetworking Incorporated Client/server speech processor/recognizer
US6456974B1 (en) 1997-01-06 2002-09-24 Texas Instruments Incorporated System and method for adding speech recognition capabilities to java
US6122613A (en) 1997-01-30 2000-09-19 Dragon Systems, Inc. Speech recognition using multiple recognizers (selectively) applied to the same input sample
JPH10254486A (ja) 1997-03-13 1998-09-25 Canon Inc Speech recognition apparatus and method
GB2323693B (en) 1997-03-27 2001-09-26 Forum Technology Ltd Speech to text conversion
US6167377A (en) 1997-03-28 2000-12-26 Dragon Systems, Inc. Speech recognition language models
FR2761837B1 (fr) * 1997-04-08 1999-06-11 Sophie Sommelet Navigation aid device having a distributed architecture based on the Internet
US6014559A (en) * 1997-04-10 2000-01-11 At&T Wireless Services, Inc. Method and system for delivering a voice mail notification to a private base station using cellular phone network
US6078886A (en) 1997-04-14 2000-06-20 At&T Corporation System and method for providing remote automatic speech recognition services via a packet network
US6058187A (en) * 1997-04-17 2000-05-02 At&T Corp. Secure telecommunications data transmission
US5895464A (en) * 1997-04-30 1999-04-20 Eastman Kodak Company Computer program product and a method for using natural language for the description, search and retrieval of multi-media objects
CA2292959A1 (en) 1997-05-06 1998-11-12 Speechworks International, Inc. System and method for developing interactive speech applications
US5899991A (en) * 1997-05-12 1999-05-04 Teleran Technologies, L.P. Modeling technique for system access control and management
US6128369A (en) 1997-05-14 2000-10-03 A.T.&T. Corp. Employing customer premises equipment in communications network maintenance
US5960397A (en) 1997-05-27 1999-09-28 At&T Corp System and method of recognizing an acoustic environment to adapt a set of based recognition models to the current acoustic environment for subsequent speech recognition
US5995119A (en) 1997-06-06 1999-11-30 At&T Corp. Method for generating photo-realistic animated characters
FI972723A0 (fi) 1997-06-24 1997-06-24 Nokia Mobile Phones Ltd Mobile communication devices
US6199043B1 (en) * 1997-06-24 2001-03-06 International Business Machines Corporation Conversation management in speech recognition interfaces
US6101241A (en) 1997-07-16 2000-08-08 At&T Corp. Telephone-based speech recognition for data collection
US5926784A (en) 1997-07-17 1999-07-20 Microsoft Corporation Method and system for natural language parsing using podding
US5933822A (en) 1997-07-22 1999-08-03 Microsoft Corporation Apparatus and methods for an information retrieval system that employs natural language processing of search results to improve overall precision
US6275231B1 (en) 1997-08-01 2001-08-14 American Calcar Inc. Centralized control and management system for automobiles
US6044347A (en) * 1997-08-05 2000-03-28 Lucent Technologies Inc. Methods and apparatus object-oriented rule-based dialogue management
US6144667A (en) 1997-08-07 2000-11-07 At&T Corp. Network-based method and apparatus for initiating and completing a telephone call via the internet
US6192338B1 (en) * 1997-08-12 2001-02-20 At&T Corp. Natural language knowledge servers as network resources
US6360234B2 (en) * 1997-08-14 2002-03-19 Virage, Inc. Video cataloger system with synchronized encoders
US5895466A (en) * 1997-08-19 1999-04-20 At&T Corp Automated natural language understanding customer service system
US6081774A (en) * 1997-08-22 2000-06-27 Novell, Inc. Natural language information retrieval system and method
US6018708A (en) 1997-08-26 2000-01-25 Nortel Networks Corporation Method and apparatus for performing speech recognition utilizing a supplementary lexicon of frequently used orthographies
US6076059A (en) 1997-08-29 2000-06-13 Digital Equipment Corporation Method for aligning text with audio signals
US6049602A (en) * 1997-09-18 2000-04-11 At&T Corp Virtual call center
US6650747B1 (en) 1997-09-18 2003-11-18 At&T Corp. Control of merchant application by system monitor in virtual contact center
DE19742054A1 (de) 1997-09-24 1999-04-01 Philips Patentverwaltung Input system for at least place and/or street names
US5897613A (en) 1997-10-08 1999-04-27 Lucent Technologies Inc. Efficient transmission of voice silence intervals
US6134235A (en) 1997-10-08 2000-10-17 At&T Corp. Pots/packet bridge
US6272455B1 (en) 1997-10-22 2001-08-07 Lucent Technologies, Inc. Method and apparatus for understanding natural language
JPH11126090A (ja) 1997-10-23 1999-05-11 Pioneer Electron Corp Speech recognition method, speech recognition device, and recording medium storing a program for operating the speech recognition device
US6021384A (en) * 1997-10-29 2000-02-01 At&T Corp. Automatic generation of superwords
US6498797B1 (en) 1997-11-14 2002-12-24 At&T Corp. Method and apparatus for communication services on a network
US6188982B1 (en) 1997-12-01 2001-02-13 Industrial Technology Research Institute On-line background noise adaptation of parallel model combination HMM with discriminative learning using weighted HMM for noisy speech recognition
US6614773B1 (en) 1997-12-02 2003-09-02 At&T Corp. Packet transmissions over cellular radio
US5970412A (en) * 1997-12-02 1999-10-19 Maxemchuk; Nicholas Frank Overload control in a packet-switching cellular environment
US6219346B1 (en) * 1997-12-02 2001-04-17 At&T Corp. Packet switching architecture in cellular radio
US6195634B1 (en) 1997-12-24 2001-02-27 Nortel Networks Corporation Selection of decoys for non-vocabulary utterances rejection
US6301560B1 (en) 1998-01-05 2001-10-09 Microsoft Corporation Discrete speech recognition system with ballooning active grammar
US6420975B1 (en) 1999-08-25 2002-07-16 Donnelly Corporation Interior rearview mirror sound processing system
US6226612B1 (en) * 1998-01-30 2001-05-01 Motorola, Inc. Method of evaluating an utterance in a speech recognition system
US6385596B1 (en) * 1998-02-06 2002-05-07 Liquid Audio, Inc. Secure online music distribution system
US6160883A (en) * 1998-03-04 2000-12-12 At&T Corporation Telecommunications network system and method
US6292779B1 (en) 1998-03-09 2001-09-18 Lernout & Hauspie Speech Products N.V. System and method for modeless large vocabulary speech recognition
US6119087A (en) 1998-03-13 2000-09-12 Nuance Communications System architecture for and method of voice processing
US6233559B1 (en) * 1998-04-01 2001-05-15 Motorola, Inc. Speech control of multiple applications using applets
US6173279B1 (en) * 1998-04-09 2001-01-09 At&T Corp. Method of using a natural language interface to retrieve information from one or more data resources
US6144938A (en) 1998-05-01 2000-11-07 Sun Microsystems, Inc. Voice user interface with personality
US6574597B1 (en) 1998-05-08 2003-06-03 At&T Corp. Fully expanded context-dependent networks for speech recognition
US6236968B1 (en) 1998-05-14 2001-05-22 International Business Machines Corporation Sleep prevention dialog based car system
US20070094224A1 (en) * 1998-05-28 2007-04-26 Lawrence Au Method and system for determining contextual meaning for network search applications
US7072826B1 (en) 1998-06-04 2006-07-04 Matsushita Electric Industrial Co., Ltd. Language conversion rule preparing device, language conversion device and program recording medium
US6219643B1 (en) * 1998-06-26 2001-04-17 Nuance Communications, Inc. Method of analyzing dialogs in a natural language speech recognition system
US6553372B1 (en) * 1998-07-13 2003-04-22 Microsoft Corporation Natural language information retrieval system
US6175858B1 (en) * 1998-07-13 2001-01-16 At&T Corp. Intelligent network messaging agent and method
US6393428B1 (en) * 1998-07-13 2002-05-21 Microsoft Corporation Natural language information retrieval system
US6269336B1 (en) 1998-07-24 2001-07-31 Motorola, Inc. Voice browser for interactive services and methods thereof
US6539348B1 (en) * 1998-08-24 2003-03-25 Virtual Research Associates, Inc. Systems and methods for parsing a natural language sentence
US6208964B1 (en) 1998-08-31 2001-03-27 Nortel Networks Limited Method and apparatus for providing unsupervised adaptation of transcriptions
US6434524B1 (en) 1998-09-09 2002-08-13 One Voice Technologies, Inc. Object interactive user interface using speech recognition and natural language processing
US6499013B1 (en) * 1998-09-09 2002-12-24 One Voice Technologies, Inc. Interactive user interface using speech recognition and natural language processing
US6049607A (en) 1998-09-18 2000-04-11 Lamar Signal Processing Interference canceling method and apparatus
US6606598B1 (en) 1998-09-22 2003-08-12 Speechworks International, Inc. Statistical computing and reporting for interactive speech applications
US6405170B1 (en) 1998-09-22 2002-06-11 Speechworks International, Inc. Method and system of reviewing the behavior of an interactive speech recognition application
EP1163576A4 (en) 1998-10-02 2005-11-30 Ibm Conversational computing via conversational virtual machine
US7003463B1 (en) 1998-10-02 2006-02-21 International Business Machines Corporation System and method for providing network coordinated conversational services
CA2346145A1 (en) * 1998-10-05 2000-04-13 Lernout & Hauspie Speech Products N.V. Speech controlled computer user interface
WO2000022549A1 (en) * 1998-10-09 2000-04-20 Koninklijke Philips Electronics N.V. Automatic inquiry method and system
US6185535B1 (en) * 1998-10-16 2001-02-06 Telefonaktiebolaget Lm Ericsson (Publ) Voice control of a user interface to service applications
JP2002528948A (ja) 1998-10-21 2002-09-03 アメリカン カルカー インコーポレイティド Position camera and GPS data exchange device
US6453292B2 (en) 1998-10-28 2002-09-17 International Business Machines Corporation Command boundary identifier for conversational natural language
US6477200B1 (en) 1998-11-09 2002-11-05 Broadcom Corporation Multi-pair gigabit ethernet transceiver
US8121891B2 (en) 1998-11-12 2012-02-21 Accenture Global Services Gmbh Personalized product report
US6195651B1 (en) * 1998-11-19 2001-02-27 Andersen Consulting Properties Bv System, method and article of manufacture for a tuned user application experience
US6246981B1 (en) * 1998-11-25 2001-06-12 International Business Machines Corporation Natural language task-oriented dialog manager and method
US7881936B2 (en) 1998-12-04 2011-02-01 Tegic Communications, Inc. Multimodal disambiguation of speech recognition
US6430285B1 (en) 1998-12-15 2002-08-06 At&T Corp. Method and apparatus for an automated caller interaction system
US6233556B1 (en) * 1998-12-16 2001-05-15 Nuance Communications Voice processing and verification system
US6721001B1 (en) * 1998-12-16 2004-04-13 International Business Machines Corporation Digital camera with voice recognition annotation
US6754485B1 (en) 1998-12-23 2004-06-22 American Calcar Inc. Technique for effectively providing maintenance and information to vehicles
US6208972B1 (en) * 1998-12-23 2001-03-27 Richard Grant Method for integrating computer processes with an interface controlled by voice actuated grammars
US6570555B1 (en) 1998-12-30 2003-05-27 Fuji Xerox Co., Ltd. Method and apparatus for embodied conversational characters with multimodal input/output in an interface device
US6757718B1 (en) 1999-01-05 2004-06-29 Sri International Mobile navigation of network-based electronic information using spoken input
US6523061B1 (en) * 1999-01-05 2003-02-18 Sri International, Inc. System, method, and article of manufacture for agent-based navigation in a speech-based data navigation system
US7036128B1 (en) 1999-01-05 2006-04-25 Sri International Offices Using a community of distributed electronic agents to support a highly mobile, ambient computing environment
US6851115B1 (en) * 1999-01-05 2005-02-01 Sri International Software-based architecture for communication and cooperation among distributed electronic agents
US6742021B1 (en) * 1999-01-05 2004-05-25 Sri International, Inc. Navigating network-based electronic information using spoken input with multimodal error feedback
JP3822990B2 (ja) 1999-01-07 2006-09-20 株式会社日立製作所 Translation device, recording medium
US6429813B2 (en) 1999-01-14 2002-08-06 Navigation Technologies Corp. Method and system for providing end-user preferences with a navigation system
US6567797B1 (en) 1999-01-26 2003-05-20 Xerox Corporation System and method for providing recommendations based on multi-modal user clusters
GB2361339B (en) 1999-01-27 2003-08-06 Kent Ridge Digital Labs Method and apparatus for voice annotation and retrieval of multimedia data
US6556970B1 (en) * 1999-01-28 2003-04-29 Denso Corporation Apparatus for determining appropriate series of words carrying information to be recognized
US6278968B1 (en) * 1999-01-29 2001-08-21 Sony Corporation Method and apparatus for adaptive speech recognition hypothesis construction and selection in a spoken language translation system
US6430531B1 (en) * 1999-02-04 2002-08-06 Soliloquy, Inc. Bilateral speech system
US6643620B1 (en) 1999-03-15 2003-11-04 Matsushita Electric Industrial Co., Ltd. Voice activated controller for recording and retrieving audio/video programs
JP4176228B2 (ja) 1999-03-15 2008-11-05 株式会社東芝 Natural language dialogue device and natural language dialogue method
US6272461B1 (en) * 1999-03-22 2001-08-07 Siemens Information And Communication Networks, Inc. Method and apparatus for an enhanced presentation aid
US6631346B1 (en) 1999-04-07 2003-10-07 Matsushita Electric Industrial Co., Ltd. Method and apparatus for natural language parsing using multiple passes and tags
US6233561B1 (en) * 1999-04-12 2001-05-15 Matsushita Electric Industrial Co., Ltd. Method for goal-oriented speech translation in hand-held devices using meaning extraction and dialogue
US6408272B1 (en) * 1999-04-12 2002-06-18 General Magic, Inc. Distributed voice user interface
US6570964B1 (en) * 1999-04-16 2003-05-27 Nuance Communications Technique for recognizing telephone numbers and other spoken information embedded in voice messages stored in a voice messaging system
US6314402B1 (en) 1999-04-23 2001-11-06 Nuance Communications Method and apparatus for creating modifiable and combinable speech objects for acquiring information from a speaker in an interactive voice response system
US6434523B1 (en) 1999-04-23 2002-08-13 Nuance Communications Creating and editing grammars for speech recognition graphically
US6356869B1 (en) 1999-04-30 2002-03-12 Nortel Networks Limited Method and apparatus for discourse management
US6505155B1 (en) 1999-05-06 2003-01-07 International Business Machines Corporation Method and system for automatically adjusting prompt feedback based on predicted recognition accuracy
US6308151B1 (en) 1999-05-14 2001-10-23 International Business Machines Corp. Method and system using a speech recognition system to dictate a body of text in response to an available body of text
US6505230B1 (en) * 1999-05-14 2003-01-07 Pivia, Inc. Client-server independent intermediary mechanism
US6604075B1 (en) 1999-05-20 2003-08-05 Lucent Technologies Inc. Web-based voice dialog interface
GB9911971D0 (en) 1999-05-21 1999-07-21 Canon Kk A system, a server for a system and a machine for use in a system
US6584439B1 (en) 1999-05-21 2003-06-24 Winbond Electronics Corporation Method and apparatus for controlling voice controlled devices
US7787907B2 (en) 1999-05-26 2010-08-31 Johnson Controls Technology Company System and method for using speech recognition with a vehicle control system
US20020107694A1 (en) 1999-06-07 2002-08-08 Traptec Corporation Voice-recognition safety system for aircraft and method of using the same
US7072888B1 (en) * 1999-06-16 2006-07-04 Triogo, Inc. Process for improving search engine efficiency using feedback
US6389398B1 (en) * 1999-06-23 2002-05-14 Lucent Technologies Inc. System and method for storing and executing network queries used in interactive voice response systems
US6374214B1 (en) * 1999-06-24 2002-04-16 International Business Machines Corp. Method and apparatus for excluding text phrases during re-dictation in a speech recognition system
DE60026637T2 (de) 1999-06-30 2006-10-05 International Business Machines Corp. Method for expanding the vocabulary of a speech recognition system
US6374226B1 (en) * 1999-08-06 2002-04-16 Sun Microsystems, Inc. System and method for interfacing speech recognition grammars to individual components of a computer program
US6377913B1 (en) 1999-08-13 2002-04-23 International Business Machines Corporation Method and system for multi-client access to a dialog system
US7069220B2 (en) 1999-08-13 2006-06-27 International Business Machines Corporation Method for determining and maintaining dialog focus in a conversational speech system
US6278377B1 (en) 1999-08-25 2001-08-21 Donnelly Corporation Indicator for vehicle accessory
US6513006B2 (en) 1999-08-26 2003-01-28 Matsushita Electronic Industrial Co., Ltd. Automatic control of household activity using speech recognition and natural language
US6415257B1 (en) 1999-08-26 2002-07-02 Matsushita Electric Industrial Co., Ltd. System for identifying and adapting a TV-user profile by means of speech technology
US6901366B1 (en) 1999-08-26 2005-05-31 Matsushita Electric Industrial Co., Ltd. System and method for assessing TV-related information over the internet
EP1083545A3 (en) * 1999-09-09 2001-09-26 Xanavi Informatics Corporation Voice recognition of proper names in a navigation apparatus
US6658388B1 (en) 1999-09-10 2003-12-02 International Business Machines Corporation Personality generator for conversational systems
US7340040B1 (en) * 1999-09-13 2008-03-04 Microstrategy, Incorporated System and method for real-time, personalized, dynamic, interactive voice services for corporate-analysis related information
US6850603B1 (en) 1999-09-13 2005-02-01 Microstrategy, Incorporated System and method for the creation and automatic deployment of personalized dynamic and interactive voice services
US6631351B1 (en) * 1999-09-14 2003-10-07 Aidentity Matrix Smart toys
US6601026B2 (en) 1999-09-17 2003-07-29 Discern Communications, Inc. Information retrieval by natural language querying
US20020049535A1 (en) 1999-09-20 2002-04-25 Ralf Rigo Wireless interactive voice-actuated mobile telematics system
US6587858B1 (en) 1999-09-30 2003-07-01 Steven Paul Strazza Systems and methods for the control of dynamic data and request criteria in a data repository
US6937977B2 (en) 1999-10-05 2005-08-30 Fastmobile, Inc. Method and apparatus for processing an input speech signal during presentation of an output audio signal
US6442522B1 (en) 1999-10-12 2002-08-27 International Business Machines Corporation Bi-directional natural language system for interfacing with multiple back-end applications
US6721697B1 (en) * 1999-10-18 2004-04-13 Sony Corporation Method and system for reducing lexical ambiguity
JP5118280B2 (ja) 1999-10-19 2013-01-16 ソニー エレクトロニクス インク Natural language interface control system
US6581103B1 (en) * 1999-10-22 2003-06-17 Dedicated Radio, Llc Method for internet radio broadcasting including listener requests of audio and/or video files with input dedications
US6594367B1 (en) 1999-10-25 2003-07-15 Andrea Electronics Corporation Super directional beamforming design and implementation
US6882970B1 (en) * 1999-10-28 2005-04-19 Canon Kabushiki Kaisha Language recognition using sequence frequency
WO2001031500A1 (en) 1999-10-29 2001-05-03 British Telecommunications Public Limited Company Method and apparatus for processing queries
US6622119B1 (en) 1999-10-30 2003-09-16 International Business Machines Corporation Adaptive command predictor and method for a natural language dialog system
US6526139B1 (en) 1999-11-03 2003-02-25 Tellabs Operations, Inc. Consolidated noise injection in a voice processing system
US6681206B1 (en) 1999-11-05 2004-01-20 At&T Corporation Method for generating morphemes
US6615172B1 (en) 1999-11-12 2003-09-02 Phoenix Solutions, Inc. Intelligent query engine for processing voice based queries
US6633846B1 (en) 1999-11-12 2003-10-14 Phoenix Solutions, Inc. Distributed realtime speech recognition system
US7392185B2 (en) 1999-11-12 2008-06-24 Phoenix Solutions, Inc. Speech based learning/training system using semantic decoding
US9076448B2 (en) 1999-11-12 2015-07-07 Nuance Communications, Inc. Distributed real time speech recognition system
US6418210B1 (en) 1999-11-29 2002-07-09 At&T Corp Method and apparatus for providing information between a calling network and a called network
US6751612B1 (en) 1999-11-29 2004-06-15 Xerox Corporation User query generate search results that rank set of servers where ranking is based on comparing content on each server with user query, frequency at which content on each server is altered using web crawler in a search engine
US6219645B1 (en) * 1999-12-02 2001-04-17 Lucent Technologies, Inc. Enhanced automatic speech recognition using multiple directional microphones
GB9928420D0 (en) 1999-12-02 2000-01-26 Ibm Interactive voice response system
US6288319B1 (en) 1999-12-02 2001-09-11 Gary Catona Electronic greeting card with a custom audio mix
US6591239B1 (en) 1999-12-09 2003-07-08 Steris Inc. Voice controlled surgical suite
GB9929284D0 (en) * 1999-12-11 2000-02-02 Ibm Voice processing apparatus
US6732088B1 (en) * 1999-12-14 2004-05-04 Xerox Corporation Collaborative searching by query induction
US6598018B1 (en) 1999-12-15 2003-07-22 Matsushita Electric Industrial Co., Ltd. Method for natural dialog interface to car devices
US6976229B1 (en) 1999-12-16 2005-12-13 Ricoh Co., Ltd. Method and apparatus for storytelling with digital photographs
US6832230B1 (en) 1999-12-22 2004-12-14 Nokia Corporation Apparatus and associated method for downloading an application with a variable lifetime to a mobile terminal
US6920421B2 (en) 1999-12-28 2005-07-19 Sony Corporation Model adaptive apparatus for performing adaptation of a model used in pattern recognition considering recentness of a received pattern data
US6678680B1 (en) * 2000-01-06 2004-01-13 Mark Woo Music search engine
US6701294B1 (en) 2000-01-19 2004-03-02 Lucent Technologies, Inc. User interface for translating natural language inquiries into database queries and data presentations
US20010047261A1 (en) * 2000-01-24 2001-11-29 Peter Kassan Partially automated interactive dialog
US6829603B1 (en) * 2000-02-02 2004-12-07 International Business Machines Corp. System, method and program product for interactive natural dialog
US6640098B1 (en) * 2000-02-14 2003-10-28 Action Engine Corporation System for obtaining service-related information for local interactive wireless devices
US6560590B1 (en) 2000-02-14 2003-05-06 Kana Software, Inc. Method and apparatus for multiple tiered matching of natural language queries to positions in a text corpus
US6434529B1 (en) 2000-02-16 2002-08-13 Sun Microsystems, Inc. System and method for referencing object instances and invoking methods on those object instances from within a speech recognition grammar
US7117199B2 (en) 2000-02-22 2006-10-03 Metacarta, Inc. Spatially coding and displaying information
US7110951B1 (en) 2000-03-03 2006-09-19 Dorothy Lemelson, legal representative System and method for enhancing speech intelligibility for the hearing impaired
US6466654B1 (en) 2000-03-06 2002-10-15 Avaya Technology Corp. Personal virtual assistant with semantic tagging
US6510417B1 (en) 2000-03-21 2003-01-21 America Online, Inc. System and method for voice access to internet-based information
US7974875B1 (en) 2000-03-21 2011-07-05 Aol Inc. System and method for using voice over a telephone to access, process, and carry out transactions over the internet
WO2001073755A1 (en) * 2000-03-24 2001-10-04 Eliza Corporation Web-based speech recognition with scripting and semantic objects
US6868380B2 (en) 2000-03-24 2005-03-15 Eliza Corporation Speech recognition system and method for generating phonotic estimates
US6968333B2 (en) 2000-04-02 2005-11-22 Tangis Corporation Soliciting information based on a computer user's context
US6980092B2 (en) 2000-04-06 2005-12-27 Gentex Corporation Vehicle rearview mirror assembly incorporating a communication system
EP1273004A1 (en) 2000-04-06 2003-01-08 One Voice Technologies Inc. Natural language and dialogue generation processing
US7177798B2 (en) 2000-04-07 2007-02-13 Rensselaer Polytechnic Institute Natural language interface using constrained intermediate dictionary of results
US7734287B2 (en) 2000-04-10 2010-06-08 I/O Controls Corporation System for providing remote access to diagnostic information over a wide area network
US6578022B1 (en) 2000-04-18 2003-06-10 Icplanet Corporation Interactive intelligent searching with executable suggestions
US6556973B1 (en) * 2000-04-19 2003-04-29 Voxi Ab Conversion between data representation formats
US20020032564A1 (en) 2000-04-19 2002-03-14 Farzad Ehsani Phrase-based dialogue modeling with particular application to creating a recognition grammar for a voice-controlled user interface
US20020007267A1 (en) * 2000-04-21 2002-01-17 Leonid Batchilo Expanded search and display of SAO knowledge base information
US7502672B1 (en) 2000-04-24 2009-03-10 Usa Technologies, Inc. Wireless vehicle diagnostics with service and part determination capabilities
US6560576B1 (en) * 2000-04-25 2003-05-06 Nuance Communications Method and apparatus for providing active help to a user of a voice-enabled application
US20010054087A1 (en) 2000-04-26 2001-12-20 Michael Flom Portable internet services
WO2001084535A3 (en) 2000-05-02 2002-06-27 David Abrahams Error correction in speech recognition
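CN1252975C (zh) 2000-05-16 2006-04-19 约翰·塔歇罗 Method and system for providing geographically targeted information and advertising
CN100477704C (zh) 2000-05-26 2009-04-08 皇家菲利浦电子有限公司 Method and device for echo cancellation combined with adaptive beamforming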
US7082469B2 (en) 2000-06-09 2006-07-25 Gold Mustache Publishing, Inc. Method and system for electronic song dedication
WO2001097558A3 (en) 2000-06-13 2002-03-28 Gn Resound Corp Fixed polar-pattern-based adaptive directionality systems
CA2409920C (en) * 2000-06-22 2013-05-14 Microsoft Corporation Distributed computing services platform
US6636790B1 (en) 2000-07-25 2003-10-21 Reynolds And Reynolds Holdings, Inc. Wireless diagnostic system and method for monitoring vehicles
WO2002010900A9 (en) 2000-07-28 2003-03-06 Siemens Automotive Corp Lp User interface for telematics systems
US7092928B1 (en) 2000-07-31 2006-08-15 Quantum Leap Research, Inc. Intelligent portal engine
US7027975B1 (en) * 2000-08-08 2006-04-11 Object Services And Consulting, Inc. Guided natural language interface system and method
US7653748B2 (en) 2000-08-10 2010-01-26 Simplexity, Llc Systems, methods and computer program products for integrating advertising within web content
US7143039B1 (en) * 2000-08-11 2006-11-28 Tellme Networks, Inc. Providing menu and other services for an information processing system using a telephone or other audio interface
US6574624B1 (en) * 2000-08-18 2003-06-03 International Business Machines Corporation Automatic topic identification and switch for natural language search of textual document collections
WO2002017069A8 (en) 2000-08-21 2002-07-04 Yahoo! Method and system of interpreting and presenting web content using a voice browser
CN1226717C (zh) 2000-08-30 2005-11-09 国际商业机器公司 Automatic new word extraction method and system
US7062488B1 (en) 2000-08-30 2006-06-13 Richard Reisman Task/domain segmentation in applying feedback to command control
EP1184841A1 (de) 2000-08-31 2002-03-06 Siemens Aktiengesellschaft Voice-controlled arrangement and method for voice input and recognition
WO2002021334A1 (en) 2000-09-07 2002-03-14 Telefonaktiebolaget Lm Ericsson (Publ) Information supply system and control method thereof
US20040205671A1 (en) 2000-09-13 2004-10-14 Tatsuya Sukehiro Natural-language processing system
US6785651B1 (en) * 2000-09-14 2004-08-31 Microsoft Corporation Method and apparatus for performing plan-based dialog
CA2423200A1 (en) 2000-09-21 2002-03-28 American Calcar Inc. Technique for operating a vehicle effectively and safely
US6754647B1 (en) * 2000-09-26 2004-06-22 Verity, Inc. Method and apparatus for hierarchically decomposed bot scripts
US6362748B1 (en) 2000-09-27 2002-03-26 Lite Vision Corporation System for communicating among vehicles and a communication system control center
US6704576B1 (en) 2000-09-27 2004-03-09 At&T Corp. Method and system for communicating multimedia content in a unicast, multicast, simulcast or broadcast environment
WO2002027712A1 (en) 2000-09-29 2002-04-04 Professorq, Inc. Natural-language voice-activated personal assistant
US7451085B2 (en) * 2000-10-13 2008-11-11 At&T Intellectual Property Ii, L.P. System and method for providing a compensated speech recognition model for speech recognition
US6922670B2 (en) 2000-10-24 2005-07-26 Sanyo Electric Co., Ltd. User support apparatus and system using agents
US6795808B1 (en) 2000-10-30 2004-09-21 Koninklijke Philips Electronics N.V. User interface/entertainment device that simulates personal interaction and charges external database with relevant data
US6721706B1 (en) 2000-10-30 2004-04-13 Koninklijke Philips Electronics N.V. Environment-responsive user interface/entertainment device that simulates personal interaction
US6934756B2 (en) 2000-11-01 2005-08-23 International Business Machines Corporation Conversational networking via transport, coding and control conversational protocols
GB0027178D0 (en) 2000-11-07 2000-12-27 Canon Kk Speech processing system
US7158935B1 (en) 2000-11-15 2007-01-02 At&T Corp. Method and system for predicting problematic situations in a automated dialog
US6735592B1 (en) 2000-11-16 2004-05-11 Discern Communications System, method, and computer program product for a network-based content exchange system
US7013308B1 (en) 2000-11-28 2006-03-14 Semscript Ltd. Knowledge storage and retrieval system and method
US20020065568A1 (en) 2000-11-30 2002-05-30 Silfvast Robert Denton Plug-in modules for digital signal processor functionalities
US20020067839A1 (en) 2000-12-04 2002-06-06 Heinrich Timothy K. The wireless voice activated and recogintion car system
US6973429B2 (en) * 2000-12-04 2005-12-06 A9.Com, Inc. Grammar generation for voice-based searches
US7016847B1 (en) 2000-12-08 2006-03-21 Ben Franklin Patent Holdings L.L.C. Open architecture for a voice user interface
US6456711B1 (en) 2000-12-12 2002-09-24 At&T Corp. Method for placing a call intended for an enhanced network user on hold while the enhanced network user is unavailable to take the call using a distributed feature architecture
US20020082911A1 (en) 2000-12-22 2002-06-27 Dunn Charles L. Online revenue sharing
US6973427B2 (en) 2000-12-26 2005-12-06 Microsoft Corporation Method for adding phonetic descriptions to a speech recognition lexicon
US20020133347A1 (en) * 2000-12-29 2002-09-19 Eberhard Schoneburg Method and apparatus for natural language dialog interface
US20020087326A1 (en) 2000-12-29 2002-07-04 Lee Victor Wai Leung Computer-implemented web page summarization method and system
US20020087312A1 (en) * 2000-12-29 2002-07-04 Lee Victor Wai Leung Computer-implemented conversation buffering method and system
US7085723B2 (en) * 2001-01-12 2006-08-01 International Business Machines Corporation System and method for determining utterance context in a multi-context speech application
US6751591B1 (en) * 2001-01-22 2004-06-15 At&T Corp. Method and system for predicting understanding errors in a task classification system
US7069207B2 (en) 2001-01-26 2006-06-27 Microsoft Corporation Linguistically intelligent text compression
US7487110B2 (en) 2001-01-30 2009-02-03 International Business Machines Corporation Automotive information communication exchange system, method, and program product
US6964023B2 (en) 2001-02-05 2005-11-08 International Business Machines Corporation System and method for multi-modal focus detection, referential ambiguity resolution and mood classification using multi-modal input
US20020107873A1 (en) 2001-02-07 2002-08-08 Bandag Licensing Corporation System and method for data collection, reporting, and analysis of fleet vehicle information
US7206418B2 (en) 2001-02-12 2007-04-17 Fortemedia, Inc. Noise suppression for a wireless communication device
EP1231788A1 (en) 2001-02-12 2002-08-14 Philips Electronics N.V. Arrangement for distributing content, profiling center, receiving device and method
US6549629B2 (en) 2001-02-21 2003-04-15 Digisonix Llc DVE system with normalized selection
GB2372864B (en) 2001-02-28 2005-09-07 Vox Generation Ltd Spoken language interface
US6754627B2 (en) 2001-03-01 2004-06-22 International Business Machines Corporation Detecting speech recognition errors in an embedded speech recognition system
US7024364B2 (en) 2001-03-09 2006-04-04 Bevocal, Inc. System, method and computer program product for looking up business addresses and directions based on a voice dial-up session
US20020173961A1 (en) 2001-03-09 2002-11-21 Guerra Lisa M. System, method and computer program product for dynamic, robust and fault tolerant audio output in a speech recognition framework
US20020169597A1 (en) * 2001-03-12 2002-11-14 Fain Systems, Inc. Method and apparatus providing computer understanding and instructions from natural language
US20020133402A1 (en) 2001-03-13 2002-09-19 Scott Faber Apparatus and method for recruiting, communicating with, and paying participants of interactive advertising
US7729918B2 (en) * 2001-03-14 2010-06-01 At&T Intellectual Property Ii, Lp Trainable sentence planning system
WO2002073449A1 (en) 2001-03-14 2002-09-19 At & T Corp. Automated sentence planning in a task classification system
US7574362B2 (en) 2001-03-14 2009-08-11 At&T Intellectual Property Ii, L.P. Method for automated sentence planning in a task classification system
US6801897B2 (en) 2001-03-28 2004-10-05 International Business Machines Corporation Method of providing concise forms of natural commands
US8175886B2 (en) 2001-03-29 2012-05-08 Intellisist, Inc. Determination of signal-processing approach based on signal destination characteristics
US6487494B2 (en) 2001-03-29 2002-11-26 Wingcast, Llc System and method for reducing the amount of repetitive data sent by a server to a client for vehicle navigation
US7472075B2 (en) 2001-03-29 2008-12-30 Intellisist, Inc. System and method to associate broadcast radio content with a transaction via an internet server
KR20030007793A (ko) 2001-03-30 2003-01-23 소니 가부시끼 가이샤 Speech processing device
FR2822994B1 (fr) 2001-03-30 2004-05-21 Bouygues Telecom Sa Assistance for the driver of a motor vehicle
EP1451679A2 (en) 2001-03-30 2004-09-01 BRITISH TELECOMMUNICATIONS public limited company Multi-modal interface
US6996531B2 (en) * 2001-03-30 2006-02-07 Comverse Ltd. Automated database assistance using a telephone for a speech based or text based multimedia communication mode
US6885989B2 (en) 2001-04-02 2005-04-26 International Business Machines Corporation Method and system for collaborative speech recognition for small-area network
US6856990B2 (en) * 2001-04-09 2005-02-15 Intel Corporation Network dedication system
US7437295B2 (en) 2001-04-27 2008-10-14 Accenture Llp Natural language processing for a location-based services system
US7970648B2 (en) 2001-04-27 2011-06-28 Accenture Global Services Limited Advertising campaign and business listing management for a location-based services system
US6950821B2 (en) 2001-05-04 2005-09-27 Sun Microsystems, Inc. System and method for resolving distributed network search queries to information providers
US6804684B2 (en) 2001-05-07 2004-10-12 Eastman Kodak Company Method for associating semantic information with multiple images in an image database environment
US20020173333A1 (en) 2001-05-18 2002-11-21 Buchholz Dale R. Method and apparatus for processing barge-in requests
US6944594B2 (en) 2001-05-30 2005-09-13 Bellsouth Intellectual Property Corporation Multi-context conversational environment system and method
JP2003005897A (ja) 2001-06-20 2003-01-08 Alpine Electronics Inc Information input method and device
US6801604B2 (en) * 2001-06-25 2004-10-05 International Business Machines Corporation Universal IP-based and scalable architectures across conversational applications using web services for speech and audio processing resources
US20020198714A1 (en) 2001-06-26 2002-12-26 Guojun Zhou Statistical spoken dialog system
US20100029261A1 (en) 2001-06-27 2010-02-04 John Mikkelsen Virtual wireless data cable method, apparatus and system
US7606712B1 (en) * 2001-06-28 2009-10-20 At&T Intellectual Property Ii, L.P. Speech recognition interface for voice actuation of legacy systems
US20050234727A1 (en) 2001-07-03 2005-10-20 Leo Chiu Method and apparatus for adapting a voice extensible markup language-enabled voice system for natural speech recognition and system response
US6983307B2 (en) * 2001-07-11 2006-01-03 Kirusa, Inc. Synchronization among plural browsers
US7123727B2 (en) 2001-07-18 2006-10-17 Agere Systems Inc. Adaptive close-talking differential microphone array
US7966177B2 (en) * 2001-08-13 2011-06-21 Hans Geiger Method and device for recognising a phonetic sound sequence or character sequence
US7283951B2 (en) 2001-08-14 2007-10-16 Insightful Corporation Method and system for enhanced data searching
US6757544B2 (en) 2001-08-15 2004-06-29 Motorola, Inc. System and method for determining a location relevant to a communication device and/or its associated user
US6941264B2 (en) * 2001-08-16 2005-09-06 Sony Electronics Inc. Retraining and updating speech models for speech recognition
US7920682B2 (en) 2001-08-21 2011-04-05 Byrne William J Dynamic interactive voice interface
JP2003157259A (ja) * 2001-09-05 2003-05-30 Fuji Xerox Co Ltd Information retrieval system
US20030046071A1 (en) * 2001-09-06 2003-03-06 International Business Machines Corporation Voice recognition apparatus and method
US7305381B1 (en) 2001-09-14 2007-12-04 Ricoh Co., Ltd Asynchronous unconscious retrieval in a network of information appliances
US6959276B2 (en) 2001-09-27 2005-10-25 Microsoft Corporation Including the category of environmental noise when processing speech signals
US6721633B2 (en) 2001-09-28 2004-04-13 Robert Bosch Gmbh Method and device for interfacing a driver information system using a voice portal server
US7289606B2 (en) 2001-10-01 2007-10-30 Sandeep Sibal Mode-swapping in multi-modal telephonic applications
JP3997459B2 (ja) 2001-10-02 2007-10-24 Hitachi, Ltd. Voice input system, voice portal server, and voice input terminal
US7640006B2 (en) 2001-10-03 2009-12-29 Accenture Global Services Gmbh Directory assistance with multi-modal messaging
US7254384B2 (en) * 2001-10-03 2007-08-07 Accenture Global Services Gmbh Multi-modal messaging
US20030069734A1 (en) 2001-10-05 2003-04-10 Everhart Charles Allen Technique for active voice recognition grammar adaptation for dynamic multimedia application
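JP4065936B2 (ja) 2001-10-09 2008-03-26 National Institute of Information and Communications Technology Language analysis processing system using a machine learning method, and language ellipsis analysis processing system using a machine learning method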
US8249880B2 (en) 2002-02-14 2012-08-21 Intellisist, Inc. Real-time display of system instructions
US7406421B2 (en) 2001-10-26 2008-07-29 Intellisist Inc. Systems and methods for reviewing informational content in a vehicle
US6501834B1 (en) 2001-11-21 2002-12-31 At&T Corp. Message sender status monitor
US20030101054A1 (en) 2001-11-27 2003-05-29 Ncc, Llc Integrated system and method for electronic speech recognition and transcription
US7165028B2 (en) 2001-12-12 2007-01-16 Texas Instruments Incorporated Method of speech recognition resistant to convolutive distortion and additive distortion
GB0129788D0 (en) 2001-12-13 2002-01-30 Hewlett Packard Co Multi-modal picture
US7231343B1 (en) 2001-12-20 2007-06-12 Ianywhere Solutions, Inc. Synonyms mechanism for natural language systems
US20030120493A1 (en) 2001-12-21 2003-06-26 Gupta Sunil K. Method and system for updating and customizing recognition vocabulary
US7203644B2 (en) * 2001-12-31 2007-04-10 Intel Corporation Automating tuning of speech recognition systems
US7493259B2 (en) 2002-01-04 2009-02-17 Siebel Systems, Inc. Method for accessing data via voice
US6804330B1 (en) * 2002-01-04 2004-10-12 Siebel Systems, Inc. Method and system for accessing CRM data via voice
US7493559B1 (en) * 2002-01-09 2009-02-17 Ricoh Co., Ltd. System and method for direct multi-modal annotation of objects
US7117200B2 (en) 2002-01-11 2006-10-03 International Business Machines Corporation Synthesizing information-bearing content from multiple channels
US7111248B2 (en) 2002-01-15 2006-09-19 Openwave Systems Inc. Alphanumeric information input method
US7536297B2 (en) 2002-01-22 2009-05-19 International Business Machines Corporation System and method for hybrid text mining for finding abbreviations and their definitions
US7054817B2 (en) 2002-01-25 2006-05-30 Canon Europa N.V. User interface for speech model generation and testing
US20030144846A1 (en) 2002-01-31 2003-07-31 Denenberg Lawrence A. Method and system for modifying the behavior of an application based upon the application's grammar
US7130390B2 (en) 2002-02-01 2006-10-31 Microsoft Corporation Audio messaging system and method
US7177814B2 (en) 2002-02-07 2007-02-13 Sap Aktiengesellschaft Dynamic grammar for voice-enabled applications
US7058890B2 (en) 2002-02-13 2006-06-06 Siebel Systems, Inc. Method and system for enabling connectivity to a data system
US7587317B2 (en) 2002-02-15 2009-09-08 Microsoft Corporation Word training interface
JP3974419B2 (ja) 2002-02-18 2007-09-12 Hitachi, Ltd. Information acquisition method and information acquisition system using voice input
US20030167167A1 (en) * 2002-02-26 2003-09-04 Li Gong Intelligent personal assistants
EP1478982B1 (en) 2002-02-27 2014-11-05 Y Indeed Consulting L.L.C. System and method that facilitates customizing media
US6704396B2 (en) 2002-02-27 2004-03-09 Sbc Technology Resources, Inc. Multi-modal communications method
US7016849B2 (en) * 2002-03-25 2006-03-21 Sri International Method and apparatus for providing speech-driven routing between spoken language applications
US7072834B2 (en) 2002-04-05 2006-07-04 Intel Corporation Adapting to adverse acoustic environment in speech processing using playback training data
US7197460B1 (en) 2002-04-23 2007-03-27 At&T Corp. System for handling frequently asked questions in a natural language dialog service
US6877001B2 (en) 2002-04-25 2005-04-05 Mitsubishi Electric Research Laboratories, Inc. Method and system for retrieving documents with spoken queries
US7167568B2 (en) 2002-05-02 2007-01-23 Microsoft Corporation Microphone array signal enhancement
US20030212558A1 (en) 2002-05-07 2003-11-13 Matula Valentine C. Method and apparatus for distributed interactive voice processing
US20030212550A1 (en) 2002-05-10 2003-11-13 Ubale Anil W. Method, apparatus, and system for improving speech quality of voice-over-packets (VOP) systems
US20030212562A1 (en) 2002-05-13 2003-11-13 General Motors Corporation Manual barge-in for server-based in-vehicle voice recognition systems
JP2003329477A (ja) * 2002-05-15 2003-11-19 Pioneer Electronic Corp Navigation device and interactive information providing program
US7107210B2 (en) 2002-05-20 2006-09-12 Microsoft Corporation Method of noise reduction based on dynamic aspects of speech
US7127400B2 (en) 2002-05-22 2006-10-24 Bellsouth Intellectual Property Corporation Methods and systems for personal interactive voice response
US7546382B2 (en) 2002-05-28 2009-06-09 International Business Machines Corporation Methods and systems for authoring of mixed-initiative multi-modal interactions and related browsing mechanisms
US20040140989A1 (en) 2002-05-28 2004-07-22 John Papageorge Content subscription and delivery service
US7398209B2 (en) 2002-06-03 2008-07-08 Voicebox Technologies, Inc. Systems and methods for responding to natural language speech utterance
US7143037B1 (en) 2002-06-12 2006-11-28 Cisco Technology, Inc. Spelling words using an arbitrary phonetic alphabet
US20030233230A1 (en) * 2002-06-12 2003-12-18 Lucent Technologies Inc. System and method for representing and resolving ambiguity in spoken dialogue systems
US7502737B2 (en) 2002-06-24 2009-03-10 Intel Corporation Multi-pass recognition of spoken dialogue
US20050021470A1 (en) 2002-06-25 2005-01-27 Bose Corporation Intelligent music track selection
US7177815B2 (en) * 2002-07-05 2007-02-13 At&T Corp. System and method of context-sensitive help for multi-modal dialog systems
US20040010358A1 (en) 2002-07-12 2004-01-15 General Motors Corporation Vehicle personalization through web portal
US7693720B2 (en) 2002-07-15 2010-04-06 Voicebox Technologies, Inc. Mobile systems and methods for responding to natural language speech utterance
EP1391830A1 (fr) 2002-07-19 2004-02-25 Albert Inc. S.A. System for extracting information from a natural language text
EP1394692A1 (en) 2002-08-05 2004-03-03 Alcatel Method, terminal, browser application, and mark-up language for multimodal interaction between a user and a terminal
US7236923B1 (en) 2002-08-07 2007-06-26 Itt Manufacturing Enterprises, Inc. Acronym extraction system and method of identifying acronyms and extracting corresponding expansions from text
US6741931B1 (en) * 2002-09-05 2004-05-25 Daimlerchrysler Corporation Vehicle navigation system with off-board server
US7136875B2 (en) 2002-09-24 2006-11-14 Google, Inc. Serving advertisements based on content
US7184957B2 (en) 2002-09-25 2007-02-27 Toyota Infotechnology Center Co., Ltd. Multiple pass speech recognition method and system
US7328155B2 (en) 2002-09-25 2008-02-05 Toyota Infotechnology Center Co., Ltd. Method and system for speech recognition using grammar weighted based upon location information
US20030115062A1 (en) 2002-10-29 2003-06-19 Walker Marilyn A. Method for automated sentence planning
US8793127B2 (en) 2002-10-31 2014-07-29 Promptu Systems Corporation Method and apparatus for automatically determining speaker characteristics for speech-directed advertising or other enhancement of speech-controlled devices or services
EP1614102A4 (en) 2002-12-10 2006-12-20 Kirusa Inc Techniques for disambiguating speech input using multimodal interfaces
KR100580619B1 (ko) * 2002-12-11 2006-05-16 Samsung Electronics Co., Ltd. Method and apparatus for managing dialog between a user and an agent
US6834265B2 (en) 2002-12-13 2004-12-21 Motorola, Inc. Method and apparatus for selective speech recognition
US7890324B2 (en) 2002-12-19 2011-02-15 At&T Intellectual Property Ii, L.P. Context-sensitive interface widgets for multi-modal dialog systems
US20040158555A1 (en) 2003-02-11 2004-08-12 Terradigtal Systems Llc. Method for managing a collection of media objects
DE10306022B3 (de) 2003-02-13 2004-02-19 Siemens Ag Three-stage single-word recognition
GB2398913B (en) 2003-02-27 2005-08-17 Motorola Inc Noise estimation in speech recognition
JP4103639B2 (ja) 2003-03-14 2008-06-18 Seiko Epson Corporation Acoustic model creation method, acoustic model creation device, and speech recognition device
US7146319B2 (en) 2003-03-31 2006-12-05 Novauris Technologies Ltd. Phonetically based speech recognition system and method
US20050021826A1 (en) * 2003-04-21 2005-01-27 Sunil Kumar Gateway controller for a multimodal system that provides inter-communication among different data and voice servers through various mobile devices, and interface for that controller
US7200559B2 (en) * 2003-05-29 2007-04-03 Microsoft Corporation Semantic object synchronous understanding implemented with speech application language tags
US20050015256A1 (en) * 2003-05-29 2005-01-20 Kargman James B. Method and apparatus for ordering food items, and in particular, pizza
US7103553B2 (en) * 2003-06-04 2006-09-05 Matsushita Electric Industrial Co., Ltd. Assistive call center interface
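JP2005003926A (ja) 2003-06-11 2005-01-06 Sony Corp Information processing device and method, and program
JP2005010691A (ja) * 2003-06-20 2005-01-13 P To Pa:Kk Speech recognition device, speech recognition method, conversation control device, conversation control method, and programs therefor
KR100577387B1 (ko) * 2003-08-06 2006-05-10 Samsung Electronics Co., Ltd. Method and apparatus for handling speech recognition errors in a spoken dialog system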
US20050043940A1 (en) * 2003-08-20 2005-02-24 Marvin Elder Preparing a data source for a natural language query
US7428497B2 (en) 2003-10-06 2008-09-23 Utbk, Inc. Methods and apparatuses for pay-per-call advertising in mobile/wireless applications
US20070162296A1 (en) 2003-10-06 2007-07-12 Utbk, Inc. Methods and apparatuses for audio advertisements
US7454608B2 (en) 2003-10-31 2008-11-18 International Business Machines Corporation Resource configuration in multi-modal distributed computing systems
GB0325497D0 (en) 2003-10-31 2003-12-03 Vox Generation Ltd Automated speech application creation deployment and management
US20050102282A1 (en) * 2003-11-07 2005-05-12 Greg Linden Method for personalized search
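KR100651729B1 (ko) * 2003-11-14 2006-12-06 Electronics and Telecommunications Research Institute System and method for multi-modal context-aware applications in a home network environment
JP2005157494A (ja) * 2003-11-20 2005-06-16 Aruze Corp Conversation control device and conversation control method
JP4558308B2 (ja) * 2003-12-03 2010-10-06 Nuance Communications, Inc. Speech recognition system, data processing device, and data processing method and program therefor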
US20050137877A1 (en) 2003-12-17 2005-06-23 General Motors Corporation Method and system for enabling a device function of a vehicle
US7027586B2 (en) 2003-12-18 2006-04-11 Sbc Knowledge Ventures, L.P. Intelligently routing customer communications
US20050137850A1 (en) * 2003-12-23 2005-06-23 Intel Corporation Method for automation of programmable interfaces
US20050246174A1 (en) 2004-04-28 2005-11-03 Degolia Richard C Method and system for presenting dynamic commercial content to clients interacting with a voice extensible markup language system
US7386443B1 (en) 2004-01-09 2008-06-10 At&T Corp. System and method for mobile automatic speech recognition
WO2005076258A1 (ja) * 2004-02-03 2005-08-18 Matsushita Electric Industrial Co., Ltd. User-adaptive device and control method therefor
US7542903B2 (en) 2004-02-18 2009-06-02 Fuji Xerox Co., Ltd. Systems and methods for determining predictive models of discourse functions
US7430510B1 (en) * 2004-03-01 2008-09-30 At&T Corp. System and method of using modular spoken-dialog components
US7412393B1 (en) * 2004-03-01 2008-08-12 At&T Corp. Method for developing a dialog manager using modular spoken-dialog components
US7421393B1 (en) 2004-03-01 2008-09-02 At&T Corp. System for developing a dialog manager using modular spoken-dialog components
US20050216254A1 (en) 2004-03-24 2005-09-29 Gupta Anurag K System-resource-based multi-modal input fusion
US20050283752A1 (en) 2004-05-17 2005-12-22 Renate Fruchter DiVAS-a cross-media system for ubiquitous gesture-discourse-sketch knowledge capture and reuse
US20090018829A1 (en) * 2004-06-08 2009-01-15 Metaphor Solutions, Inc. Speech Recognition Dialog Management
US20060206310A1 (en) 2004-06-29 2006-09-14 Damaka, Inc. System and method for natural language processing in a peer-to-peer hybrid communications network
DE102004037858A1 (de) 2004-08-04 2006-03-16 Harman Becker Automotive Systems Gmbh Navigation system with voice-controlled indication of points of interest
US7480618B2 (en) 2004-09-02 2009-01-20 Microsoft Corporation Eliminating interference of noisy modality in a multimodal application
US7716056B2 (en) * 2004-09-27 2010-05-11 Robert Bosch Corporation Method and system for interactive conversational dialogue for cognitively overloaded device users
FR2875919A1 (fr) 2004-09-27 2006-03-31 France Telecom Dialoguing rational agent, intelligent dialogue system using it, method of controlling an intelligent dialogue, and program for implementing it
US20060074660A1 (en) 2004-09-29 2006-04-06 France Telecom Method and apparatus for enhancing speech recognition accuracy by using geographic data to filter a set of words
JP4478939B2 (ja) * 2004-09-30 2010-06-09 Advanced Telecommunications Research Institute International Speech processing device and computer program therefor
US7925506B2 (en) * 2004-10-05 2011-04-12 Inago Corporation Speech recognition accuracy via concept to keyword mapping
JP3962767B2 (ja) * 2004-10-08 2007-08-22 Matsushita Electric Industrial Co., Ltd. Dialogue support device
US20060106769A1 (en) * 2004-11-12 2006-05-18 Gibbs Kevin A Method and system for autocompletion for languages having ideographs and phonetic characters
US7376645B2 (en) 2004-11-29 2008-05-20 The Intellection Group, Inc. Multimodal natural language query system and architecture for processing voice and proximity-based queries
US7590635B2 (en) * 2004-12-15 2009-09-15 Applied Minds, Inc. Distributed data store with an orderstamp to ensure progress
US20070214182A1 (en) 2005-01-15 2007-09-13 Outland Research, Llc Establishment-based media and messaging service
US7437297B2 (en) 2005-01-27 2008-10-14 International Business Machines Corporation Systems and methods for predicting consequences of misinterpretation of user commands in automated systems
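KR100718147B1 (ko) 2005-02-01 2007-05-14 Samsung Electronics Co., Ltd. Apparatus and method for generating a grammar network for speech recognition, and conversational speech recognition apparatus and method using the same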
US7831433B1 (en) 2005-02-03 2010-11-09 Hrl Laboratories, Llc System and method for using context in navigation dialog
US7461059B2 (en) 2005-02-23 2008-12-02 Microsoft Corporation Dynamically updated search results based upon continuously-evolving search query that is based at least in part upon phrase suggestion, search engine uses previous result sets performing additional search tasks
US7283829B2 (en) 2005-03-25 2007-10-16 Cisco Technology, Inc. Management of call requests in multi-modal communication environments
US20060236343A1 (en) * 2005-04-14 2006-10-19 Sbc Knowledge Ventures, Lp System and method of locating and providing video content via an IPTV network
US7668716B2 (en) * 2005-05-05 2010-02-23 Dictaphone Corporation Incorporation of external knowledge in multimodal dialog systems
US7813485B2 (en) 2005-05-26 2010-10-12 International Business Machines Corporation System and method for seamlessly integrating an interactive visual menu with an voice menu provided in an interactive voice response system
US7917365B2 (en) 2005-06-16 2011-03-29 Nuance Communications, Inc. Synchronizing visual and speech events in a multimodal application
US7873523B2 (en) 2005-06-30 2011-01-18 Microsoft Corporation Computer implemented method of analyzing recognition results between a user and an interactive application utilizing inferred values instead of transcribed speech
WO2007008798A3 (en) 2005-07-07 2007-04-19 V-Enable Inc System and method for searching for network-based content in a multi-modal system using spoken keywords
US7424431B2 (en) 2005-07-11 2008-09-09 Stragent, Llc System, method and computer program product for adding voice activation and voice control to a media player
US7640160B2 (en) 2005-08-05 2009-12-29 Voicebox Technologies, Inc. Systems and methods for responding to natural language speech utterance
US7620549B2 (en) 2005-08-10 2009-11-17 Voicebox Technologies, Inc. System and method of supporting adaptive misrecognition in conversational speech
US20070043569A1 (en) 2005-08-19 2007-02-22 Intervoice Limited Partnership System and method for inheritance of advertised functionality in a user interactive system
US7949529B2 (en) 2005-08-29 2011-05-24 Voicebox Technologies, Inc. Mobile systems and methods of supporting natural language human-machine interactions
EP1934971A4 (en) 2005-08-31 2010-10-27 Voicebox Technologies Inc Dynamic speech sharpening
US7672852B2 (en) * 2005-09-29 2010-03-02 Microsoft Corporation Localization of prompts
US20070078708A1 (en) 2005-09-30 2007-04-05 Hua Yu Using speech recognition to determine advertisements relevant to audio content and/or audio content relevant to advertisements
US8626588B2 (en) 2005-09-30 2014-01-07 Google Inc. Advertising with audio content
JP4849662B2 (ja) * 2005-10-21 2012-01-11 P to PA, Inc. Conversation control device
US7747612B2 (en) * 2005-10-31 2010-06-29 Yahoo! Inc. Indication of exclusive items in a result set
US7477909B2 (en) 2005-10-31 2009-01-13 Nuance Communications, Inc. System and method for conducting a search using a wireless mobile device
US7587308B2 (en) * 2005-11-21 2009-09-08 Hewlett-Packard Development Company, L.P. Word recognition using ontologies
US20070135101A1 (en) 2005-12-08 2007-06-14 Comverse, Ltd. Enhanced visual IVR capabilities
US8325398B2 (en) 2005-12-22 2012-12-04 Canon Kabushiki Kaisha Image editing system, image management apparatus, and image editing program
US20070186165A1 (en) 2006-02-07 2007-08-09 Pudding Ltd. Method And Apparatus For Electronically Providing Advertisements
US20070203736A1 (en) 2006-02-28 2007-08-30 Commonwealth Intellectual Property Holdings, Inc. Interactive 411 Directory Assistance
US8204751B1 (en) * 2006-03-03 2012-06-19 At&T Intellectual Property Ii, L.P. Relevance recognition for a human machine dialog system contextual question answering based on a normalization of the length of the user input
JP5649303B2 (ja) 2006-03-30 2015-01-07 SRI International Method and apparatus for annotating media streams
US7966324B2 (en) * 2006-05-30 2011-06-21 Microsoft Corporation Personalizing a search results page based on search history
US7533089B2 (en) 2006-06-27 2009-05-12 International Business Machines Corporation Hybrid approach for query recommendation in conversation systems
JP5156013B2 (ja) * 2006-07-10 2013-03-06 Accenture Global Services GmbH Mobile personal services platform for providing feedback
US8145493B2 (en) 2006-09-11 2012-03-27 Nuance Communications, Inc. Establishing a preferred mode of interaction between a user and a multimodal application
US8086463B2 (en) 2006-09-12 2011-12-27 Nuance Communications, Inc. Dynamically generating a vocal help prompt in a multimodal application
WO2008032329A3 (en) 2006-09-13 2009-04-23 Alon Atsmon Providing content responsive to multimedia signals
US7788084B2 (en) * 2006-09-19 2010-08-31 Xerox Corporation Labeling of work of art titles in text for natural language processing
US8073681B2 (en) 2006-10-16 2011-12-06 Voicebox Technologies, Inc. System and method for a cooperative conversational voice user interface
WO2008057268A3 (en) 2006-10-26 2008-08-07 Mobile Content Networks Inc Techniques for determining relevant advertisements in response to queries
US7805740B2 (en) 2006-11-10 2010-09-28 Audiogate Technologies Ltd. System and method for providing advertisement based on speech recognition
JP4833043B2 (ja) 2006-11-30 2011-12-07 Yahoo Japan Corporation Affinity device using blogs or query clicks
US7640272B2 (en) 2006-12-07 2009-12-29 Microsoft Corporation Using automated content analysis for audio/video content consumption
US20080154604A1 (en) 2006-12-22 2008-06-26 Nokia Corporation System and method for providing context-based dynamic speech grammar generation for use in search applications
US7818176B2 (en) 2007-02-06 2010-10-19 Voicebox Technologies, Inc. System and method for selecting and presenting advertisements based on natural language processing of voice-based input
US8909532B2 (en) 2007-03-23 2014-12-09 Nuance Communications, Inc. Supporting multi-lingual user interaction with a multimodal application
US8060367B2 (en) 2007-06-26 2011-11-15 Targus Information Corporation Spatially indexed grammar and methods of use
US8219399B2 (en) 2007-07-11 2012-07-10 Garmin Switzerland Gmbh Automated speech recognition (ASR) tiling
DE102007044792B4 (de) 2007-09-19 2012-12-13 Siemens Ag Method, control device, and system for control or operation
US8140335B2 (en) 2007-12-11 2012-03-20 Voicebox Technologies, Inc. System and method for providing a natural language voice user interface in an integrated voice navigation services environment
US8077975B2 (en) 2008-02-26 2011-12-13 Microsoft Corporation Handwriting symbol recognition accuracy using speech input
US8255224B2 (en) 2008-03-07 2012-08-28 Google Inc. Voice recognition grammar selection based on context
US20090276700A1 (en) 2008-04-30 2009-11-05 Nokia Corporation Method, apparatus, and computer program product for determining user status indicators
US8589161B2 (en) 2008-05-27 2013-11-19 Voicebox Technologies, Inc. System and method for an integrated, multi-modal, multi-device natural language voice services environment
US8909810B2 (en) 2008-08-05 2014-12-09 Isabella Products, Inc. Systems and methods for multimedia content sharing
US8224652B2 (en) 2008-09-26 2012-07-17 Microsoft Corporation Speech and text driven HMM-based body animation synthesis
US8326637B2 (en) 2009-02-20 2012-12-04 Voicebox Technologies, Inc. System and method for processing multi-modal device interactions in a natural language voice services environment
US9171541B2 (en) 2009-11-10 2015-10-27 Voicebox Technologies Corporation System and method for hybrid processing in a natural language voice services environment
WO2011059997A1 (en) 2009-11-10 2011-05-19 Voicebox Technologies, Inc. System and method for providing a natural language content dedication service
US8676565B2 (en) * 2010-03-26 2014-03-18 Virtuoz Sa Semantic clustering and conversational agents
US8762156B2 (en) * 2011-09-28 2014-06-24 Apple Inc. Speech recognition repair using contextual information
US9443514B1 (en) * 2012-02-08 2016-09-13 Google Inc. Dynamic voice response control based on a weighted pace of spoken terms
US9070367B1 (en) * 2012-11-26 2015-06-30 Amazon Technologies, Inc. Local speech recognition of frequent utterances

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5748841A (en) 1994-02-25 1998-05-05 Morin; Philippe Supervised contextual language acquisition system
US5488652A (en) 1994-04-14 1996-01-30 Northern Telecom Limited Method and apparatus for training speech recognition algorithms for directory assistance applications

Also Published As

Publication number Publication date Type
CN101535983A (zh) 2009-09-16 application
EP2082335A4 (en) 2010-11-10 application
US20130339022A1 (en) 2013-12-19 application
US9015049B2 (en) 2015-04-21 grant
US8073681B2 (en) 2011-12-06 grant
WO2008118195A2 (en) 2008-10-02 application
US20080091406A1 (en) 2008-04-17 application
US20150228276A1 (en) 2015-08-13 application
EP2082335A2 (en) 2009-07-29 application
US20120022857A1 (en) 2012-01-26 application
WO2008118195A3 (en) 2008-12-04 application
US8515765B2 (en) 2013-08-20 grant

Similar Documents

Publication Publication Date Title
Lasecki et al. Real-time captioning by groups of non-experts
Hazen et al. Recognition confidence scoring and its use in speech understanding systems
Narayanan et al. Creating conversational interfaces for children
US8219407B1 (en) Method for processing the output of a speech recognizer
US8620659B2 (en) System and method of supporting adaptive misrecognition in conversational speech
US6985852B2 (en) Method and apparatus for dynamic grammars and focused semantic parsing
Delgado et al. Spoken, multilingual and multimodal dialogue systems: development and assessment
US7200559B2 (en) Semantic object synchronous understanding implemented with speech application language tags
US6567778B1 (en) Natural language speech recognition using slot semantic confidence scores related to their word recognition confidence scores
US8838457B2 (en) Using results of unstructured language model based speech recognition to control a system-level function of a mobile communications facility
US7689420B2 (en) Personalizing a context-free grammar using a dictation language model
EP2575128A2 (en) Using context information to facilitate processing of commands in a virtual assistant
US20090030687A1 (en) Adapting an unstructured language model speech recognition system based on usage
US7143037B1 (en) Spelling words using an arbitrary phonetic alphabet
US8886540B2 (en) Using speech recognition results based on an unstructured language model in a mobile communication facility application
US20040073431A1 (en) Application abstraction with dialog purpose
US20100106497A1 (en) Internal and external speech recognition use with a mobile communication facility
US20080133245A1 (en) Methods for speech-to-speech translation
US20090030688A1 (en) Tagging speech recognition results based on an unstructured language model for use in a mobile communication facility application
US20070094005A1 (en) Conversation control apparatus
US20040113908A1 (en) Web server controls for web enabled recognition and/or audible prompting
US20090030698A1 (en) Using speech recognition results based on an unstructured language model with a music system
US6999931B2 (en) Spoken dialog system using a best-fit language model and best-fit grammar
US20090030685A1 (en) Using speech recognition results based on an unstructured language model with a navigation system
US7809569B2 (en) Turn-taking confidence

Legal Events

Date Code Title Description
C06 Publication
C10 Entry into substantive examination
C14 Grant of patent or utility model
C41 Transfer of patent application or patent right or utility model