CN103035240B - Method and system for speech recognition repair using contextual information - Google Patents
Method and system for speech recognition repair using contextual information
- Publication number
- CN103035240B (application CN201210369739.0A)
- Authority
- CN
- China
- Prior art keywords
- interpreter
- group
- repairing
- application program
- algorithm
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/183—Speech classification or search using natural language modelling using context dependencies, e.g. language models
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/70—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F16/78—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/783—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
- G06F16/7834—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using audio features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/08—Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M1/00—Substation equipment, e.g. for use by subscribers
- H04M1/26—Devices for calling a subscriber
- H04M1/27—Devices whereby a plurality of signals may be stored simultaneously
- H04M1/271—Devices whereby a plurality of signals may be stored simultaneously controlled by voice recognition
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Multimedia (AREA)
- Human Computer Interaction (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Acoustics & Sound (AREA)
- Theoretical Computer Science (AREA)
- Artificial Intelligence (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Library & Information Science (AREA)
- Signal Processing (AREA)
- Machine Translation (AREA)
- Telephonic Communication Services (AREA)
- Document Processing Apparatus (AREA)
Abstract
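The present invention relates to methods and systems for speech recognition repair using contextual information. A speech control system is provided that can recognize a spoken command and associated words (such as "call mom at home") and can cause a selected application (such as a telephone dialer) to execute the command, so that a data processing system such as a smartphone performs an operation based on the command (such as looking up mom's home telephone number and dialing it to establish a telephone call). The speech control system can use a set of interpreters to repair recognized text from a speech recognition system, and results from the set can be merged into a final repaired transcription, which is provided to the selected application.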
Description
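Technical Field
The present invention relates to speech recognition systems and, in one embodiment, to speech recognition systems used to control a data processing system.
Background
A common problem with many speech recognition systems is accuracy. A user can speak to a speech recognizer and the system can respond with recognized text, but the recognized text can often contain many errors because the speech recognizer failed to properly recognize the human user's words.
Speech recognition can be used to invoke voice dialing on a telephone, for example when the user speaks the command "call mom" into the telephone. Using speech to control a data processing system can result in abnormal system behavior when a transcription error from the speech recognizer system decides that the user spoke "call Tom" rather than "call mom". Transcription errors can be caused by hardware shortcomings (such as the inability to capture a high-quality audio recording through a Bluetooth headset) or by user error (such as incorrect or incomplete pronunciation, or background noise). Some speech recognition systems use context to improve recognition; U.S. Patent 7,478,037 provides an example of a speech recognition system that uses context to assist the speech recognition process.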
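Summary of the Invention
Embodiments of the invention provide a speech control system that can recognize a spoken command and associated words (such as "call mom at home") and can cause a selected application (such as a telephone dialer) to execute the command, so that the system (which can be a smartphone) performs an operation based on the command and the associated words (for example, placing a telephone call to mom at home). In one embodiment, the speech control system can use a speech recognizer that includes a conventional acoustic model and a conventional language model to generate a text output from a digitized input obtained from a human user's speech. In one embodiment, the speech control system can be launched by a user-invocable speech assistant application, and this speech assistant application can interpret and repair the text output from the speech recognizer and provide the repaired text output to a selected application in a set of applications; the set of applications can include, for example, one or more applications such as a telephone application (to dial and establish a voice telephone call connection), a media player application (e.g., iTunes), an SMS (Short Message Service) "text message" application, a video conferencing (e.g., "FaceTime") or chat application, an application to find or locate individuals such as friends, and other applications.
In one embodiment, the speech assistant application interprets the text output using a set of interpreters, each of which is designed to interpret a specific type of text used by the set of applications. For example, a first name interpreter is designed to interpret first names (in a first name field) in an address book or contacts database, a last name interpreter is designed to interpret last names (in a last name field) in an address book or contacts database, a full name interpreter is designed to interpret full names in an address book or contacts database, and a business name interpreter is designed to interpret business names in an address book or contacts database. In one embodiment, these interpreters can be configured to use different algorithms or processes to interpret each word in the text output; for example, the full name interpreter can use a fuzzy matching algorithm (with an edit distance similarity measure) to compare a word in the text output with words in the address book or contacts database but, in one embodiment, does not use an n-gram algorithm for that comparison, while the first name interpreter uses an n-gram algorithm to compare a word in the text output with words in the address book or contacts database. Moreover, in one embodiment, these interpreters can use different search algorithms when searching the address book or contacts database for matches. In one embodiment, each interpreter can also use context when interpreting a word (for example, the context can include an indication that a media player is playing a song). In one embodiment, the context can include user input history (such as conversation history, e.g., previously recognized speech) or the state of applications in the set of applications. In one embodiment, each interpreter in the set can process each word in the text output to attempt to determine whether it can repair the word, and in one embodiment each interpreter decides on its own whether it can repair each word; the interpreter produces a score or confidence indicating whether it can repair the word.
In one embodiment, a controller of the set of interpreters can process the results of the set by ranking the resulting repaired interpretations (using each interpreter's score or confidence to perform the ranking) and then merging the ranked interpretations. In one embodiment, the merging seeks to avoid overlapping interpretations, so that only the output of one interpreter is used to repair a particular word.
In one embodiment, the speech assistant application can determine the command in the text output from the speech recognizer system based on the position of a word in the string of words, or by using a grammar parser, and the command, along with the repaired speech transcription, can be passed by the speech assistant application to a particular application in a set of applications so that the particular application executes the command using the repaired speech transcription. In this embodiment, the speech assistant application can select the particular application based on the command; for example, a "call" command in the recognized text causes the speech assistant application to pass the "call" command, along with the repaired speech transcription, through an API to a telephone dialer or telephone application, while a "stop" command in the recognized text causes the speech assistant application to pass the "stop" command through an API to a media player (such as iTunes) to stop playing the currently playing song. In this example, the context provided to the media player interpreter in the set of interpreters can include the state of the media (for example, the context includes the state that the Beatles song "Come Together" is currently playing when the speech recognizer system receives speech input containing the recognized word "stop"). In this example, the user does not need to select the particular desired application before speaking the command; rather, the user speaks while the speech assistant application is the frontmost application (and has the speech input focus), and the speech assistant application then automatically (without the user directly specifying the application) selects the appropriate application in the set based on the command and passes the command to the selected application through an API.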
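In one aspect, a machine-implemented method includes: receiving a speech input from a user of a data processing system; determining, in the data processing system, a context of the speech input; recognizing text in the speech input through a speech recognition system, the text recognition producing a text output; storing the text output as a parsed data structure having a plurality of tokens, each representing a word in the text output; processing each of the tokens with a set of interpreters, wherein each interpreter is designed to repair a specific type of error in the text output, searches one or more databases to identify matches between one or more items in the databases and each of the tokens, and determines from the identified matches and the context whether the interpreter can repair a token in the text output; merging selected results produced by the set of interpreters to produce a repaired speech transcription that represents a repaired version of the text output; and providing the repaired speech transcription to a selected application in a set of applications based on a command in the repaired speech transcription, wherein the selected application is configured to execute the command.
In some embodiments, the context includes a previous user input history, and the one or more databases include a contacts database that stores at least one of names, addresses, and telephone numbers.
In some embodiments, the context includes a conversation history, the one or more databases include a media database that stores at least one of songs, titles, and artists, and an interpreter in the set of interpreters uses a string of at least two words when evaluating a possible match.
In some embodiments, a first interpreter in the set of interpreters uses a first algorithm to determine whether to repair a word, and a second interpreter in the set uses a second algorithm to determine whether to repair a word, the first algorithm being different from the second algorithm.
In some embodiments, a third interpreter in the set of interpreters uses a third algorithm to search the one or more databases, and a fourth interpreter in the set uses a fourth algorithm to search the one or more databases, the third algorithm being different from the fourth algorithm.
In some embodiments, the interpreters in the set of interpreters do not attempt to repair the command.
In some embodiments, the merging merges only non-overlapping results from the set of interpreters; overlapping results from the set are ranked in a ranked set, and one result in the ranked set is selected and merged into the repaired speech transcription.
In some embodiments, the specific type of error that each interpreter is designed to repair is determined based on one or more fields in the one or more databases searched by that interpreter.
In some embodiments, the set of interpreters, when determining whether to repair one or more words in the text output, searches the one or more databases to compare words in the text output with one or more items in the one or more databases.
In some embodiments, a grammar parser determines the command from the text output.
In some embodiments, the set of applications includes at least two of: (a) a telephone dialer that uses the repaired speech transcription to dial a telephone number; (b) a media player for playing songs or other content; (c) a text messaging application; (d) an email application; (e) a calendar application; (f) a local search application; (g) a video conferencing application; or (h) a person or object locating application.
In some embodiments, the method includes any combination of the features set forth above.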
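In one aspect, a data processing system includes: a speech recognizer operable to recognize text in a speech input and produce a text output; a context determination module operable to determine a context of the speech input; a microphone coupled to the speech recognizer to provide the speech input to the speech recognizer; a storage device for storing the text output as a parsed data structure having a plurality of tokens, each representing a word in the text output; a set of interpreters coupled to the speech recognizer and the context determination module, wherein each interpreter is designed to repair a specific type of error in the text output, searches one or more databases to identify matches between one or more items in the databases and each of the tokens, and determines from the identified matches and the context whether the interpreter can repair a token in the text output; and a controller for merging selected results produced by the set of interpreters to produce a repaired speech transcription and for providing the repaired speech transcription to a selected application in a set of applications based on a command in the repaired speech transcription, wherein the repaired speech transcription represents a repaired version of the text output and the selected application is configured to execute the command.
In some embodiments, the context includes a previous user input history, and the one or more databases include a contacts database that stores at least one of names, addresses, and telephone numbers.
In some embodiments, the context includes a conversation history, the one or more databases include a media database that stores at least one of songs, titles, and artists, and an interpreter in the set of interpreters uses a string of at least two words when evaluating a possible match.
In some embodiments, a first interpreter in the set of interpreters uses a first algorithm to determine whether to repair a word, and a second interpreter in the set uses a second algorithm to determine whether to repair a word, the first algorithm being different from the second algorithm.
In some embodiments, a third interpreter in the set of interpreters uses a third algorithm to search the one or more databases, and a fourth interpreter in the set uses a fourth algorithm to search the one or more databases, the third algorithm being different from the fourth algorithm.
In some embodiments, the interpreters in the set of interpreters do not attempt to repair the command.
In some embodiments, the merging merges only non-overlapping results from the set of interpreters; overlapping results from the set are ranked in a ranked set, and one result in the ranked set is selected and merged into the repaired speech transcription.
In some embodiments, the specific type of error that each interpreter is designed to repair is determined based on one or more fields in the one or more databases searched by that interpreter.
In some embodiments, the system further includes a grammar parser for determining the command from the text output.
In some embodiments, the system includes any combination of the features set forth above.
The embodiments described herein can be implemented as machine-readable non-transitory storage media, as methods, or as data processing systems.
The above summary does not include an exhaustive list of all aspects of the invention. It is contemplated that the invention includes all systems and methods that can be practiced from all suitable combinations of the various aspects summarized above and those disclosed in the detailed description below.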
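Brief Description of the Drawings
The invention is illustrated by way of example, and not limitation, in the figures of the accompanying drawings, in which like reference numerals indicate similar elements.
FIG. 1 shows a flowchart illustrating a method according to one embodiment of the invention.
FIG. 2 shows an example of an architecture, which can include software modules and data structures, according to one embodiment of the invention.
FIG. 3 is a flowchart showing a method according to one embodiment of the invention.
FIG. 4 shows an example of an architecture according to one embodiment of the invention in which a controller module is used to rank and merge repaired results from a set of interpreters.
FIG. 5A shows a flowchart depicting a method according to one embodiment of the invention.
FIG. 5B shows a software architecture, including one or more APIs, that can be employed in one or more embodiments described herein.
FIG. 6 shows an architecture according to one embodiment in which a current context is determined and used in a speech recognition system.
FIG. 7 shows an example of a data structure that can be used in the repair process in one or more embodiments described herein.
FIG. 8 shows an example of a particular algorithm that can be used by one or more of the interpreters described herein when an interpreter determines whether to repair a particular word that has been recognized by the speech recognition system.
FIG. 9 shows an example of a data processing system according to one embodiment of the invention.
FIG. 10 is an example of a software stack that can be used in some embodiments of the invention.
FIG. 11 is a block diagram illustrating an exemplary API architecture that can be used in some embodiments of the invention.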
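Detailed Description
Various embodiments and aspects of the invention will be described with reference to details discussed below, and the accompanying drawings will illustrate the various embodiments. The following description and drawings are illustrative of the invention and are not to be construed as limiting it. Numerous specific details are described to provide a thorough understanding of various embodiments of the invention. However, in certain instances, well-known or conventional details are not described in order to provide a concise discussion of embodiments of the invention.
Reference in the specification to "one embodiment" or "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the invention. The appearances of the phrase "in one embodiment" in various places in the specification do not necessarily all refer to the same embodiment. The processes depicted in the figures that follow are performed by processing logic that comprises hardware (e.g., circuitry, dedicated logic, etc.), software, or a combination of both. Although the processes are described below in terms of some sequential operations, it should be appreciated that some of the operations described can be performed in a different order. Moreover, some operations can be performed in parallel rather than sequentially.
One embodiment of the invention provides a set of interpreters, each of which is designed or configured to repair a specific type of error in the recognized text provided by a speech recognizer system. The speech recognizer system can be a conventional software-based speech recognition system that includes an acoustic model and a language model, and the combination of these models produces a text output, which is then repaired by the set of interpreters. The interpreters can be configured to operate with particular databases, the content in those databases, and the particular applications that can use those databases. In one embodiment, the separation of the set of interpreters from the speech recognition system (so that the interpreters operate after the speech recognition system provides its output) allows greater flexibility in designing a speech control system. Any change to a particular application and/or its databases can be reflected in a change to the appropriate, corresponding interpreter without having to change the underlying speech recognition system. For example, a data processing system can use an existing conventional speech recognition system and then provide customized interpreters, tailored to the particular applications and to the particular databases containing the content that will appear in spoken commands for each application or set of applications on the data processing system. For example, a command such as "call John Smith on mobile" uses words that presumably should appear in the user's contacts or address book database. The first name "John" should appear in the database and the last name "Smith" should appear in the database; moreover, the database should include a field identifier indicating that one telephone number is John Smith's mobile number. The command "call" can be required to be at the beginning of the spoken command, or the data processing system can use a grammar parser to determine the position of the command within the spoken command. If the contacts database changes or the telephone application changes (for example, a command is added, deleted, or modified), the interpreter for that database and application can be changed without modifying the speech recognition system (for example, without modifying its language model). An interpreter can be changed by, for example, changing the fields (in the database) with which it interacts, changing the algorithm used to match words in the text output (from the speech recognition system) against fields in the database, or changing the search algorithm used to search the database.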
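FIG. 1 shows an example of a method according to one embodiment of the invention that can use a set of interpreters to repair text output provided by a speech recognizer system (such as a system that uses an acoustic model and a language model). The method can begin in operation 10, in which the speech control system is activated. For example, in one embodiment, the user can press a button, press and hold a button, select or launch a speech assistant application, or simply turn on a data processing system that invokes the speech assistant application as an always-running background daemon. After the speech control system has been activated (for example, the speech assistant application is frontmost and has the speech input focus), the speech control system receives speech input (12). In one embodiment, the user can speak a command such as "call John Smith on mobile", "tell John Smith that I am in traffic and will be late for the meeting", "play all songs by the Beatles", or "tell my son to pick up milk if he goes to Safeway after school". Then, in operation 14, a conventional speech recognition system or speech recognizer can recognize the words received in the spoken input, which has been digitized and processed by a conventional speech recognition system, which can employ both an acoustic model and a language model, to produce a text output that can be in Unicode, ASCII, or another character encoding. Conventional speech control or speech recognition systems use the resulting output at this point without further processing. In at least some embodiments of the invention, the output is processed further to determine whether to repair one or more words in the recognized text output provided by the speech recognizer system in operation 14. For example, in one embodiment of the invention, operation 16 is performed by processing the recognized text (which can be Unicode-encoded) to determine whether one or more words in it can be repaired. In one embodiment, the repair is performed by a set of interpreters, each designed or configured to repair a specific type of error, such as an error in a particular field of a database's data structure. For example, one interpreter can be configured and designed to repair errors in the first names of a contacts database, while another can be designed to repair errors in the business names of a contacts database. The way in which each interpreter is configured to repair a specific type of error in a specific field, by using different algorithms (including different processing algorithms or search algorithms), is described further below. As a result of the processing in operation 16, repaired text is provided in operation 18, and the repaired text can then be provided as an actual command to a particular application, which can be one application within a set of applications.
In one embodiment, a data processing system can include two applications in the set, such as a telephone dialer controlled by speech input and a media player (such as iTunes) controlled by speech input. In another embodiment, the set of applications can include those applications as well as a text messaging (SMS, Short Message Service) application, an email application, a calendar application, a notes application, a local search application, a video conferencing application, and a person or object locating application. A local search application is one in which the user instructs the data processing system to provide information about local businesses or local entities that are geographically near the user's current location. For example, a local search spoken command can be "find a Chinese restaurant", which can invoke a search through a web browser for local Chinese restaurants based on the user's current location. Alternatively, in the case of a local search application, the spoken command can be "call DNJ Auto Repair". If the contacts database in the user's system does not include an entry for DNJ Auto Repair, the system can respond by invoking a web search for a business known as DNJ Auto Repair in an area local to the user's current location (for example, a location determined by a GPS receiver).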
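FIG. 2 shows an example of an architecture of a data processing system that can include multiple software modules or hardware subsystems to implement each of the blocks shown in FIG. 2, which also include data structures, such as databases and outputs from the modules. In one embodiment, each of elements 201, 205, 207, 211, 215, and 219 can be implemented as a software module or software application that interacts through one or more APIs to perform the method shown in FIG. 3, the method shown in FIG. 5A, or a combination of the two. The architecture shown in FIG. 2 can also include a speech assistant application that provides the digitized speech input to the speech recognizer system 201; in one embodiment, the speech assistant application can include the set of interpreters shown as element 207 and the controller shown as element 215, and the speech assistant application can operate as the preprocessor shown as element 205. In addition, the speech assistant application can also include the context determination module shown as element 211.
Element 201 in FIG. 2 can include a conventional speech recognizer system that employs both an acoustic model and a language model to recognize words in digitized spoken commands or inputs from a human user. In one embodiment, a microphone collects the spoken sounds from the human user, and those sounds are digitized and provided to the speech recognizer system shown as element 201, which in turn produces a recognized text output, shown as element 203, in a character encoding format such as Unicode. This text output 203 is then provided to element 205, which can be a preprocessor that creates a repair data structure; in one embodiment this can be a parsed data structure that uses tokens, described further below in conjunction with FIG. 7, which provides an example of such a parsed data structure as used in the repair process of, for example, the method shown in FIG. 3 or the method shown in FIG. 5A. In one embodiment, tokens can be used in the data structure to represent each word in the text output 203, and the set of interpreters in element 207 can operate on those tokens or words to determine whether to repair each word in the text output 203. In one embodiment, an optional grammar parser can be included in element 207 to determine which word in a phrase is the command that can be used to select a particular application from the set of applications, as described further below in conjunction with FIG. 5A. FIG. 4 shows an example of a set of interpreters that can be used in element 207; these interpreters can use different algorithms to search their corresponding databases or to process words to determine whether there is a match between a word in the text output and a word in the corresponding database.
FIG. 8 shows an example of an algorithm that can be used by one or more interpreters in the set to determine whether there is a match between a word in the text output 203 and a word in one or more databases (such as the contacts database 415 shown in FIG. 4). These various algorithms are described further below in conjunction with FIGS. 4 and 8. Element 211 can be a context determination module, such as the context determination module 601 shown in FIG. 6. The output of this context determination module is provided to one or more interpreters in the set shown in element 207, so that those interpreters can use the context when determining whether a word in the text output 203 can be repaired by each interpreter.
Each of the interpreters can be configured or designed to interact with one or more databases (such as the databases in element 209). These databases can include a contacts or address book database, an email database, a text messaging database, a media database (such as an iTunes database or a database of songs or movies or a combination of songs and movies), and so on. Other databases, and corresponding interpreters to interact with them, can also be included in one embodiment of the invention. In typical operation, an interpreter designed to interact with a particular database (and not with other databases) will process each word other than the command word to determine whether, and how closely, the word matches an existing word in its corresponding database. For example, a first name interpreter can use an n-gram algorithm such as the one shown in FIG. 8 to search the contacts database for words that could be first names, and then determine whether a repair should be performed by using one or more algorithms designed to measure the degree of match between a word in the database and the word currently being processed by the interpreter. In one embodiment, each interpreter processes every word in the text output 203 other than the command word to determine whether the interpreter can repair the word. Moreover, each interpreter can provide a score or confidence indicating the degree of match, or whether the word should be repaired with an alternative word found in the database.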
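The set of interpreters shown in element 207 can, in one embodiment, provide a set of alternative interpretations 213, and in one embodiment these alternative interpretations are processed by the controller shown in element 215, which can rank and merge the interpreters' results to provide a merged interpretation 217, which can then be provided to the application that is the target of the command in the speech input.
In one embodiment, the controller in element 215 can be the controller module 411 shown in FIG. 4, which interacts with the set of interpreters shown in FIG. 4; those interpreters in turn interact with one or more databases by performing searches in them and processing the matches in those databases using one or more algorithms, as described further below. A speech assistant application, such as the speech assistant application 511 shown in FIG. 5B, can make API calls to the target application shown as element 219 and can provide the command and the repaired transcription (which in one embodiment is the merged interpretation 217) as parameters of those calls.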
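FIG. 3 shows a method for repairing words in a recognized text output (such as recognized text output 203). The method of FIG. 3 can be performed with the architecture shown in FIG. 2 and with the architecture shown in FIG. 4, and can use the one or more APIs shown in FIG. 5B. Moreover, the method of FIG. 3 can use a speech assistant application that can select an application from a set of applications based on a command detected in the recognized text output 203. The method of FIG. 3 can begin in operation 301, in which the speech recognition system generates an initial transcription. This can be the recognized text output 203 shown in FIG. 2. The speech recognition system used in operation 301 can be the speech recognizer system 201, which includes a conventional acoustic model and language model, used to recognize words in the digitized speech input. In operation 303, the data processing system performing the method of FIG. 3 can create a parsed data structure for the initial transcription. This parsed data structure can be used in the repair process of FIG. 3; an example is shown in FIG. 7, which is described further below. In operation 305, the system determines the command in the transcription and also determines the user and/or system context. The command can be determined by requiring the user to speak the command first, or by using a grammar parser that parses the text output (such as text output 203) to locate the command, and therefore the command word itself, within the text output.
In addition, in the embodiment shown in FIG. 3, operation 305 also includes determining the user and/or system context. The context information can include a list of which applications are launched and running and which are not, whether a media player is playing media (such as a song or movie), and can also include user state based on sensors (such as proximity sensors, orientation sensors, accelerometers, and other sensors). The context information can also include previous conversation history, which can include (for each application in the set of applications) previously recognized text, such as a request to play the Beatles album "Abbey Road". In one embodiment, the context can include the application domains mentioned in previous conversations, and can also include whether the current application state expects a confirmation from the user (such as yes, no, or cancel). The choice values (for a confirmation) can be specified by the system based on the current conversation context. For example, the user asks the system to send an email to a friend. After the message is composed, the system asks the user to confirm. At this point, the confirmation choice values are populated with "yes", "cancel", and "change it". In one embodiment, the context information can also include the user's current location, such as a current GPS location that can be used if the user requests a local search as described herein. The context information can also include a locale context and/or a language context; for example, the input language context can be used by the set of interpreters to assist in speech repair. In one embodiment, when the language context (which in one embodiment is determined from the user's preference settings) is English, an interpreter can repair "yet" in the text output (the initial transcription from the speech recognition system) to "yes".
In operation 307, in one embodiment, the system executes each interpreter in the set to determine whether the transcription (e.g., recognized text output 203) needs to be repaired and whether it can be repaired. In one embodiment, all interpreters in the set are executed in operation 307. In another embodiment, only the interpreters for currently executing applications are executed, to determine whether the transcription needs to be repaired for only those applications. In one embodiment, each interpreter decides on its own, based on its algorithm, whether it can repair one or more words in the recognized text output provided by the speech recognizer system (such as the speech recognizer system in element 201 of FIG. 2). This operation is shown as element 309. If no interpreter can perform a repair, or the interpreters decide that no repair is needed, then in operation 311 the initial transcription provided by the speech recognizer system (such as recognized text output 203) is used and provided to the selected application. If, on the other hand, one or more words have been determined to be repairable, a set of alternative interpretations 313 is provided, which includes the initial transcription (e.g., recognized text output 203) as well as the repaired interpretations. For example, if it is determined that the user has no "John" in the contacts database but does have a "Jon", the word "Jon" will be an alternative interpretation of the word "John". Each interpreter maintains a score or confidence indicating the degree of match of one or more alternative interpretations, and can provide it to, for example, a controller such as the controller module 411 shown in FIG. 4. The score or confidence can be used when ranking the various interpretations to select the best-matching one. The score or confidence can be determined on a per-word or per-phrase (e.g., two or three words) basis. Then, in operation 315, the controller module or another module can perform a merge operation which, in one embodiment, attempts to merge non-overlapping interpretations based on the confidence scores or match or ranking scores provided by each interpreter. The merged interpretation, which is the final repaired transcription, can then be provided to the selected application in operation 317. In one embodiment, the selected application is selected based on the command identified or determined in operation 305.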
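The flow of operations 307 to 317 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Alternative` record, the `Interpreter` callable type, and the toy `first_name_interpreter` are hypothetical names invented here, and picking the highest score per word position stands in for the ranking step that the text leaves open.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Alternative:
    """One proposed repair (hypothetical record, not from the patent)."""
    text: str     # proposed replacement word
    index: int    # position of the word being replaced
    score: float  # interpreter's confidence in the replacement


# Assumed interpreter shape: takes the transcribed words, returns proposals.
Interpreter = Callable[[List[str]], List[Alternative]]


def repair(initial: str, interpreters: List[Interpreter]) -> str:
    words = initial.split()
    # Operation 307: run every interpreter over the transcription.
    alternatives = [alt for interp in interpreters for alt in interp(words)]
    if not alternatives:
        # Operation 311: nothing to repair, keep the initial transcription.
        return initial
    # Operation 315: keep the best-scoring proposal for each word position.
    best: Dict[int, Alternative] = {}
    for alt in alternatives:
        if alt.index not in best or alt.score > best[alt.index].score:
            best[alt.index] = alt
    for index, alt in best.items():
        words[index] = alt.text
    # Operation 317: the merged text is the final repaired transcription.
    return " ".join(words)


def first_name_interpreter(words: List[str]) -> List[Alternative]:
    """Toy interpreter: the contacts hold "Jon" but no "John"."""
    fixes = {"John": "Jon"}
    return [Alternative(fixes[w], i, 0.9) for i, w in enumerate(words) if w in fixes]
```

With the toy interpreter, `repair("call John Smith on mobile", [first_name_interpreter])` yields the repaired transcription, while an input that no interpreter claims is passed through unchanged.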
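FIG. 4 shows an example of an architecture that uses a set of interpreters and a controller module to repair words in an initial transcription (such as recognized text output 203). In one embodiment, each interpreter is configured or designed to process words in certain fields of one or more databases through its appropriate use of algorithms. For example, in the embodiment shown in FIG. 4, interpreter 401 is configured to repair words in the first name field of the contacts database 415 using algorithm A, which in one embodiment can be an n-gram algorithm such as the one shown in FIG. 8. In one embodiment, an interpreter can employ several algorithms or just one. In addition to n-gram algorithms, the algorithms can include fuzzy matching algorithms, which can use an edit distance that measures the similarity between two texts, or phonetic matching algorithms, such as a Double Metaphone algorithm or a Soundex algorithm. In addition, prefix and suffix partial token algorithms can be used, as can other algorithms known in the art for determining the degree of match or similarity between two texts. In one embodiment, different interpreters use different algorithms, so that one interpreter can use algorithm A while another uses algorithm B rather than algorithm A. In one embodiment, the algorithms are customized to find matches in, and to search, the corresponding databases and, in particular, are customized for the specific fields each interpreter is designed to correct. Interpreter 403 can be a last name interpreter that uses algorithm A, and interpreter 405 can be a full name interpreter that uses algorithm B. In addition, the set of interpreters shown in FIG. 4 can include a business name interpreter 407, which uses algorithm C, different from algorithms A and B. Each of interpreters 401, 403, 405, and 407 can access the contacts database 415, but not database 414, to search for matches in each of its corresponding fields. In addition to using different algorithms for different fields, each interpreter can employ a different search algorithm when searching its corresponding database. The set of interpreters shown in FIG. 4 also includes a media player interpreter 409, which is designed to search one or more fields in the media database 414 (such as an iTunes database of songs and/or movies and other media).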
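As an illustration of the fuzzy matching mentioned above, the following Python sketch computes a Levenshtein edit distance and turns it into a similarity score. The text does not say which edit distance variant or normalization an interpreter would use, so the Levenshtein variant and the `similarity` normalization are assumptions made here for the example.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Case-insensitive score in [0, 1]; 1.0 means identical (assumed scale)."""
    a, b = a.lower(), b.lower()
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest
```

Under this scale "John" and "Jon" differ by one deletion, so they score 0.75, which an interpreter could compare against a threshold of its choosing.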
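Each interpreter in the architecture shown in FIG. 4 can provide one or more alternative interpretations (such as the set of alternative interpretations 213) to the controller module 411. For example, the first name interpreter 401 can provide two different alternative interpretations of what appears to be a first name in the spoken command, and each of the two will include a score or rank indicating the confidence or probability that the interpretation is correct. In one embodiment, the score or rank is based on the level of match or similarity. FIG. 8 shows an example of two interpretations with different scores.
FIG. 8 shows an example of how an n-gram algorithm can be used to provide scores for ranking matches. In this example, the text from the speech recognizer system (such as recognized text output 203) includes the word "cream" 801. This word is then compared with at least two different words, 803 and 805, found in the user's address book, as shown in FIG. 8. The algorithm provides a score by comparing character pairs against the text 801. As can be seen from FIG. 8, the name "Kream" is a closer match (it has a score of 3) than the other name found in the address book (name 805, which has a score of 0).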
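The character-pair comparison can be reproduced with a short Python sketch. It assumes the score is the number of distinct adjacent character pairs (bigrams) shared by the two words, compared case-insensitively; that reading matches the score of 3 quoted for "cream" and "Kream", but the exact counting rule in FIG. 8 is not spelled out in the text. The name "Zoltan" is a made-up stand-in for name 805, which the text does not give.

```python
from typing import Set


def char_pairs(word: str) -> Set[str]:
    """Distinct adjacent character pairs of a word, lowercased."""
    w = word.lower()
    return {w[i:i + 2] for i in range(len(w) - 1)}


def bigram_score(recognized: str, candidate: str) -> int:
    """Number of character pairs the two words have in common."""
    return len(char_pairs(recognized) & char_pairs(candidate))
```

"cream" yields the pairs cr, re, ea, am and "Kream" yields kr, re, ea, am, so three pairs are shared; a name with no pair in common scores 0.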
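Each interpreter in the set can use context information provided by a context determination module (such as the one shown in element 211 or the context determination module 601 in FIG. 6). The context can include a previous conversation history indicating whether commands were spoken for one or more applications in the set and, optionally, the commands and words themselves. The previous conversation history 603 provides this information, and can also include previous user inputs such as inputs on a touchscreen or keyboard. The context determination module can determine the context from the previous conversation history and also from the state of applications 605, which can include indicators of which applications are launched and executing, which are not launched and therefore not executing, whether media is playing, and so on. For example, the media player interpreter 409 can use a context indicator that media is playing to repair an initial transcription of the word "stock" to "stop": the user previously caused the media to start playing, and in that context, while the media is playing, the media player interpreter 409 interprets the word "stock" as "stop". The context determination module can determine a language or locale context, as described herein. The context determination module 601 can also include inputs from sensors (such as orientation sensors, proximity sensors, or light sensors) as part of the context determination process. In addition, the context determination module 601 can include a previous user input history. The context determination module 601 collects this various context information and provides it to the interpreters, which use the context to help decide whether a word in the spoken command input can be repaired.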
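A context-driven repair of a short keyword might look like the sketch below. The `Context` fields and the hard-coded word pairs are assumptions chosen to mirror the "stock"/"stop" and "yet"/"yes" examples in the text; a real interpreter would presumably derive candidates from a similarity measure rather than a fixed table.

```python
from dataclasses import dataclass


@dataclass
class Context:
    """Hypothetical slice of the context gathered by the determination module."""
    media_playing: bool = False
    awaiting_confirmation: bool = False


def repair_keyword(word: str, ctx: Context) -> str:
    """Return a more plausible keyword given the context, else the word itself."""
    w = word.lower()
    if ctx.media_playing and w == "stock":
        return "stop"  # media is playing, so "stop" is the likelier command
    if ctx.awaiting_confirmation and w == "yet":
        return "yes"   # a yes/no prompt is open, so "yes" is the likelier reply
    return word
```

The same word is left alone when the context does not support the repair, which is the point of passing the context to the interpreter.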
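A specific implementation according to an embodiment of the invention will now be provided in conjunction with FIG. 7. The data structure 701 represents the words in the recognized text output with tokens 703.
The speech recognition process takes a speech audio recording and transcribes it into one or more text interpretations. The initial transcription is shown as the text string 705. These transcribed texts are stored in a table-like data structure which, in one embodiment, is called a Recognition and is shown in FIG. 7.
The basic building block of a Recognition is the token. A token is an immutable string that represents an atomic unit of the transcription. If a transcription consists of a sequence of tokens 703, each token is encapsulated in a second-level data structure called a Phrase 707. A Phrase is a column-major data structure. An ordered list of Phrase objects forms a Recognition. The existence of the Phrase data structure allows for alternative transcriptions.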
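One way to picture the Recognition, Phrase, and token layers is the following Python sketch. The field names and the choice to hold a Phrase's alternatives as a tuple are assumptions made for illustration; FIG. 7 itself is not reproduced in this text.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Phrase:
    """One column of the table: a token plus any alternative transcriptions.

    alternatives[0] is taken to be the primary token (an assumption).
    """
    alternatives: Tuple[str, ...]


@dataclass
class Recognition:
    """An ordered list of Phrase objects."""
    phrases: List[Phrase]

    def text(self) -> str:
        # Join the primary token of every phrase to rebuild the transcription.
        return " ".join(p.alternatives[0] for p in self.phrases)


def from_transcription(transcription: str) -> Recognition:
    """Wrap each whitespace-separated token in its own Phrase."""
    return Recognition([Phrase((tok,)) for tok in transcription.split()])
```

Tokens are plain Python strings here, which are immutable, matching the description of a token as an immutable string.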
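For example, when the user says "Call John Smith on mobile", the speech assistant application can produce the Recognition shown in FIG. 7.
The speech repair process takes a Recognition object (shown in FIG. 7) as input and produces a modified Recognition object if any part of the original Recognition needs to be, and can be, repaired.
An internal data structure called a MetaRepair can be created to assist the speech repair process. This data structure can consist of the original Recognition object (shown in FIG. 7), Repair objects, and token positions relative to the original transcription.
Here is an example of a token position lookup table for the data structure shown in FIG. 7:
Pair<Start,End>: the start and end positions of a particular token string relative to the original recognized text
"Call John Smith on Mobile"
Token position list of the MetaRepair:
[0]: Pair<0,3>
[1]: Pair<5,8>
[2]: Pair<10,14>
[3]: Pair<16,17>
[4]: Pair<19,24>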
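The lookup table can be derived mechanically from the transcription. The Python sketch below assumes tokens are whitespace-separated and that both positions are inclusive character offsets, which is what the listed pairs imply; the function name `token_positions` is made up for this example.

```python
import re
from typing import List, Tuple


def token_positions(text: str) -> List[Tuple[int, int]]:
    """Inclusive (start, end) character offsets of each whitespace-separated token."""
    return [(m.start(), m.end() - 1) for m in re.finditer(r"\S+", text)]
```

Applied to "Call John Smith on Mobile" this returns (0, 3), (5, 8), (10, 14), (16, 17), (19, 24), the same pairs as the list above.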
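The Repair objects of the MetaRepair consist of a list of alternative interpretations produced by the speech repair procedure. The data structure used to represent an alternative interpretation is called a RepairInterpretation.
A RepairInterpretation consists of text that is a plausible replacement for a substring in the original Recognition, together with the start and end positions of that substring. For example, if "Jon" should be a replacement for "John", the RepairInterpretation for the data structure shown in FIG. 7 can be described as follows:
RepairInterpretation:
Text: "Jon"
Start: 5
End: 8
The MetaRepair object contains the information needed to perform interpretation merging. In one embodiment, the merge logic takes place after the original transcription has been passed through all the interpreters and one or more RepairInterpretations have been produced. The following pseudocode provides an example of a merge function that can be used to merge non-overlapping interpretations from the set of interpreters. "INPUT: original: Recognition" is shown in FIG. 7.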
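The pseudocode referred to above does not survive in this text. The Python sketch below is therefore not the patent's merge function; it is one plausible reading of "merge non-overlapping interpretations", assuming inclusive start and end offsets as in the RepairInterpretation example and adding a hypothetical `score` field so that, among overlapping candidates, the higher-scoring one wins.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RepairInterpretation:
    text: str           # plausible replacement text
    start: int          # inclusive start offset in the original text
    end: int            # inclusive end offset in the original text
    score: float = 0.0  # hypothetical confidence, not part of the source fields


def merge(original: str, repairs: List[RepairInterpretation]) -> str:
    chosen: List[RepairInterpretation] = []
    # Consider the most confident repairs first and skip any that overlap
    # a span already claimed by a better-scoring repair.
    for r in sorted(repairs, key=lambda r: r.score, reverse=True):
        if all(r.end < c.start or r.start > c.end for c in chosen):
            chosen.append(r)
    result = original
    # Apply right to left so earlier offsets remain valid after each splice.
    for r in sorted(chosen, key=lambda r: r.start, reverse=True):
        result = result[:r.start] + r.text + result[r.end + 1:]
    return result
```

For "Call John Smith on Mobile", a "Jon" repair over offsets 5 to 8 with a higher score than a competing "Joan" repair over the same span produces "Call Jon Smith on Mobile".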
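FIG. 5A illustrates one embodiment of the invention that can use a speech assistant application which allows the system to select the appropriate application, determined to be the target of a spoken command, based on the command detected in it. In one embodiment, the speech assistant application can use a grammar parser to detect the command in the spoken command and thereby select the appropriate application, or can require the user to speak the command as the first word of every spoken command, so that the system can determine which word, relative to the other words in the spoken input, is the command. In this embodiment, the system can select a particular application based on the spoken command and therefore does not require the user to select an application before speaking the command in order to make that application the frontmost application with the speech input focus. In other words, in this embodiment the speech assistant application can select one application from a set of applications based on the command being appropriate for that application. In operation 501, the system can receive a speech input that can be directed to one application in a set of applications. The applications can be executing or not executing. In one embodiment, the method can be configured so that only executing applications are in the set, while in another embodiment all applications, whether executing or not, can be in the set, so that every application can receive speech input. In operation 503, the speech assistant application can then determine the command in the speech input and select the appropriate application based on the determined command. For example, if the command is "call", then in one embodiment the appropriate application is the telephone dialer, used to establish a telephone call with the person specified in the speech input (for example, "call mom at home"). The command can be determined by using a grammar parser to locate it, where the command can be required to be a verb by instructing the user to use verbs, or the system can require the user to place the command at a fixed position in the sequence of spoken words. In one embodiment, if the command is the word "tell", the selected application is the text messaging (SMS) application; if the command is the word "play" or the word "stop", the selected application is the media player application; and so on.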
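The first-word variant of operation 503 amounts to a table lookup. In the Python sketch below the command words come from the text, while the application identifiers (`phone_dialer`, `text_messaging`, `media_player`) and the function name are placeholders invented for the example; the grammar parser alternative is not modeled.

```python
from typing import Optional, Tuple

# Command word -> target application (identifiers are illustrative only).
COMMAND_TO_APP = {
    "call": "phone_dialer",
    "tell": "text_messaging",
    "play": "media_player",
    "stop": "media_player",
}


def select_application(transcription: str) -> Optional[Tuple[str, str]]:
    """Treat the first word as the command and look up its application."""
    words = transcription.split()
    if not words:
        return None
    command = words[0].lower()
    app = COMMAND_TO_APP.get(command)
    return (command, app) if app else None
```

An unknown first word returns `None`, which a caller could map to the "I don't know what you meant" style of response mentioned later in the text.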
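Then, in operation 505, the interpreters are executed to repair one or more words in the recognized speech input. If the application is selected in operation 503 before the interpreters are executed, it is possible to execute only those interpreters designed to work with that particular application, rather than all interpreters available in the system for all applications capable of receiving spoken commands through speech input. Operation 505 is similar to operation 307 and can use the architecture shown in FIG. 4 in conjunction with the context determination module shown in FIG. 6. In one embodiment, the command determined or detected in operation 503 is not repaired. In this case, the interpreters treat the command in the text input as a stop word when searching the one or more databases with which they interact. Then, in operation 507, the speech assistant application can pass the command to the selected application determined in operation 503 and can pass the repaired transcription (produced by executing the interpreters and merging the alternative interpretations) to the selected application. In one embodiment, the speech assistant application can pass the command, along with the repaired transcription, through an API, such as one or more of the APIs shown in FIG. 5B.
The speech assistant application 511 shown in FIG. 5B can be the same as the speech assistant application that performs one or more of the methods of FIG. 5A. The speech assistant application 511 can determine the context by making a context call through API 514 to the operating system 516, which in turn returns context information such as that described above and/or shown in FIG. 6. The context information can also include a list of which applications are executing and which applications previously received user input or spoken commands. The speech assistant application 511 can also make calls to the speech recognizer system, which can be a software application executing on the system shown in FIG. 5B; that figure represents a software stack that includes the operating system 516, the speech assistant application 511, and one or more applications in a set of applications (such as applications 518 and 520). Applications 518 and 520 can receive commands passed from the speech assistant application through API 512.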
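Passing the command and the repaired transcription to a target application through an API can be sketched with a small interface. Everything named here (`VoiceCommandTarget`, `handle_command`, the two toy applications, and `dispatch`) is hypothetical; the text only says that the command and repaired transcription are supplied as parameters of API calls.

```python
from typing import Protocol


class VoiceCommandTarget(Protocol):
    """Assumed shape of the API that a voice-controlled application exposes."""

    def handle_command(self, command: str, repaired_transcription: str) -> str: ...


class PhoneDialer:
    def handle_command(self, command: str, repaired_transcription: str) -> str:
        return f"dialer received '{command}': {repaired_transcription}"


class MediaPlayer:
    def handle_command(self, command: str, repaired_transcription: str) -> str:
        return f"player received '{command}': {repaired_transcription}"


def dispatch(target: VoiceCommandTarget, command: str, repaired: str) -> str:
    """The assistant acts as the API-calling component; the app implements the API."""
    return target.handle_command(command, repaired)
```

Because the assistant depends only on the interface, a new application can be added to the set without changing the dispatch code.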
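The following are three usage examples of speech repair performed by the speech assistant application in one embodiment.
(1) "Snap-to-grid" voice dialing. The speech assistant application allows the user to call a contact in the address book database by voice. The user has a contact named "Marc Dickinson" in the address book, and no contact named "Mark" or "Dick". When the user says "Call Marc Dickinson", the speech recognition incorrectly transcribes the input as "Call Mark Dick son". Instead of telling the user that the assistant cannot complete the operation because it cannot find "Mark Dick son" in the database, speech repair can take advantage of the spelling of contact names and use a fuzzy matching algorithm to produce a more plausible alternative transcription, "Call Marc Dickinson".
(2) Disambiguating user intent. The speech assistant application allows the user to send SMS messages and make voice dialing requests. When the user says "Tell my wife to pick up milk and fruits if she goes to Safeway after work", the assistant automatically composes a text message to the user's wife. Because of recognition errors, the speech system can incorrectly transcribe the action word "tell" as "call" or "tall". Because the request "Call my wife to pick up milk and fruits if she goes to Safeway after work" or "tall my wife to pick up milk and fruits if she goes to Safeway after work" does not, in one embodiment, map to any actionable task in the assistant, the default response is usually "Sorry! I don't know what you meant". Speech repair can solve this problem by using context to disambiguate the speech intent. For example, knowing that the edit distance between "tell" and "tall" is short, and that a voice dialing command usually does not have a long run of tokens after the target person token, an interpreter can rewrite the original transcription as "Tell my wife to pick up milk and fruits if she plan to visit Safeway after work".
(3) Disambiguating command/system keywords. The speech system can incorrectly transcribe short keyword utterances. For example, the user says "Stop" and the initial transcription is "Stock"; the user says "Yes" and the initial transcription is "Yet". Speech repair can solve these problems by providing an alternative transcription, based on one or more contextual cues, when the original transcribed text is a less plausible interpretation. For example, when the assistant is prompting the user for a yes/no confirmation, the user is unlikely to say "Yet" as a follow-up response. Instead of returning "Yet" as the final transcription, speech repair can overwrite it with "Yes" as the more plausible speech input. Similar repair logic applies to the media player domain. If the user has just requested that a song be played and the immediately following speech transcription is "Stock", speech repair can overwrite it with "Stop" as the more plausible command transcription.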
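The "snap-to-grid" idea in example (1) can be tried out with Python's standard `difflib` module. This is a stand-in technique: `difflib.get_close_matches` ranks candidates by a sequence-similarity ratio, not by the edit distance named in the text, and the 0.6 cutoff, the function name, and the second contact are assumptions for the demonstration.

```python
import difflib
from typing import List, Optional


def snap_to_contact(spoken_name: str, contacts: List[str],
                    cutoff: float = 0.6) -> Optional[str]:
    """Return the contact whose spelling is closest to the transcribed name."""
    # Compare case-insensitively but return the contact as stored.
    lowered = {c.lower(): c for c in contacts}
    matches = difflib.get_close_matches(spoken_name.lower(), list(lowered),
                                        n=1, cutoff=cutoff)
    return lowered[matches[0]] if matches else None
```

With an address book holding "Marc Dickinson", the mis-transcribed "Mark Dick son" snaps to that contact, while an unrelated name falls below the cutoff and yields no repair.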
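FIG. 9 shows an example of a data processing system 900 that can be used with one embodiment of the invention. For example, and in one embodiment, the system 900 can be implemented as a portable data processing device such as a smartphone, a tablet (e.g., iPad) device, a laptop computer, or an entertainment system. The data processing system 900 shown in FIG. 9 includes a processing system 911, which can be one or more microprocessors or a system on a chip (integrated circuit), and the system also includes memory 901 for storing data and programs for execution by the processing system. The memory 901 can store, for example, the software components described in conjunction with FIG. 2, and can be any known form of machine-readable non-transitory storage medium, such as semiconductor memory (e.g., flash, DRAM, SRAM, etc.). The system 900 also includes an audio input/output subsystem 905, which can include a microphone and a speaker for, for example, playing back music or providing telephone functionality through the speaker and microphone. The microphone can receive the speech input described herein, and that input can be digitized and provided to the speech recognizer system as described herein.
A display controller and display device 909 can provide a visual user interface for the user; this interface can include a graphical user interface similar to that shown on a Macintosh computer when running OS X operating system software, or on an iPhone or iPad when running iOS software. The system 900 also includes one or more wireless transceivers 903 to communicate with another data processing system. A wireless transceiver can be a WLAN transceiver (e.g., WiFi), an infrared transceiver, a Bluetooth transceiver, and/or a wireless cellular telephony transceiver. It will be appreciated that additional components, not shown, can also be part of the system 900 in certain embodiments, and in certain embodiments fewer components than shown in FIG. 9 can also be used in a data processing system. The system 900 can further include one or more communications ports 917 to communicate with another data processing system. A communications port can be a USB port, a FireWire port, a Bluetooth interface, a docking port, etc.
The data processing system 900 also includes one or more input devices 913, which are provided to allow a user to provide input to the system. These input devices can be a keypad, a keyboard, a touch panel, or a multi-touch panel that is overlaid on and integrated with a display device such as display device 909. The data processing system 900 can also include an optional input/output device, which can be a connector for a dock. It will be appreciated that one or more buses, not shown, can be used to interconnect the various components, as is well known in the art. The data processing system shown in FIG. 9 can be a handheld computer or a personal digital assistant (PDA), or a cellular telephone with PDA-like functionality, or a handheld computer that includes a cellular telephone, or a media player (such as an iPod), or a game or entertainment device, or a device that combines aspects or functions of these devices (such as a media player combined with a PDA and a cellular telephone in one device), or an embedded device or other consumer electronic device. In other embodiments, the data processing system 900 can be a network computer or an embedded processing device within another device, or another type of data processing system with fewer, or perhaps more, components than those shown in FIG. 9.
The data processing system 900 can optionally include one or more hardware devices designed to digitize and store human speech received by the microphone in the audio I/O 905.
At least certain embodiments of the invention can be part of a digital media player (such as a portable music and/or video media player), which can include a media processing system to present the media and a storage device to store the media, and can further include a radio frequency (RF) transceiver (e.g., an RF transceiver for a cellular telephone) coupled with an antenna system and the media processing system. In certain embodiments, media stored on a remote storage device can be transmitted to the media player through the RF transceiver. The media can be, for example, one or more of music or other audio, still pictures, or motion pictures.
Examples of portable media players are described in published U.S. Patent No. 7,345,671 and U.S. Published Patent Application No. 2004/0224638, both of which are incorporated herein by reference.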
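One or more application programming interfaces (APIs) can be used in some embodiments. An API is an interface implemented by a program code component or hardware component (hereinafter the "API-implementing component") that allows a different program code component or hardware component (hereinafter the "API-calling component") to access and use one or more functions, methods, procedures, data structures, classes, and/or other services provided by the API-implementing component. An API can define one or more parameters that are passed between the API-calling component and the API-implementing component.
An API allows the developer of an API-calling component (who can be a third-party developer) to leverage specified features provided by an API-implementing component. There can be one API-calling component or more than one. An API can be a source code interface that a computer system or program library provides to support requests for services from an application. An operating system (OS) can have multiple APIs to allow applications running on the OS to call one or more of those APIs, and a service (such as a program library) can have multiple APIs to allow an application that uses the service to call one or more of those APIs. An API can be specified in terms of a programming language that can be interpreted or compiled when an application is built.
In some embodiments, the API-implementing component can provide more than one API, each providing a different view of, or access to different aspects of, the functionality implemented by the API-implementing component. For example, one API of an API-implementing component can provide a first set of functions and can be exposed to third-party developers, and another API of the API-implementing component can be hidden (not exposed) and provide a subset of the first set of functions as well as another set of functions, such as testing or debugging functions that are not in the first set. In other embodiments, the API-implementing component can itself call one or more other components via an underlying API and thus be both an API-calling component and an API-implementing component.
An API defines the language and parameters that API-calling components use when accessing and using specified features of the API-implementing component. For example, an API-calling component accesses the specified features of the API-implementing component through one or more API calls or invocations (embodied, for example, by function or method calls) exposed by the API, and passes data and control information using parameters via the API calls or invocations. The API-implementing component can return a value through the API in response to an API call from an API-calling component. While the API defines the syntax and result of an API call (e.g., how to invoke the call and what the call does), the API need not reveal how the API call accomplishes the function it specifies. Various API calls are transferred via the one or more application programming interfaces between the calling component (the API-calling component) and the API-implementing component. Transferring the API calls can include issuing, initiating, invoking, calling, receiving, returning, or responding to the function calls or messages; in other words, transferring can describe actions by either the API-calling component or the API-implementing component. The function calls or other invocations of the API can send or receive one or more parameters through a parameter list or other structure. A parameter can be a constant, a key, a data structure, an object, an object class, a variable, a data type, a pointer, an array, a list, a pointer to a function or method, or another way to reference data or another item to be passed via the API.
Furthermore, data types or classes can be provided by the API and implemented by the API-implementing component. Thus, the API-calling component can declare variables of, use pointers to, and use or instantiate constant values of such types or classes by using definitions provided in the API.
In general, an API can be used to access a service or data provided by the API-implementing component, or to initiate performance of an operation or computation provided by it. By way of example, the API-implementing component and the API-calling component can each be any one of an operating system, a library, a device driver, an API, an application, or another module (it should be understood that the API-implementing component and the API-calling component can be the same or different types of module from each other). API-implementing components can in some cases be embodied at least in part in firmware, microcode, or other hardware logic. In some embodiments, an API can allow a client program to use the services provided by a software development kit (SDK) library. In other embodiments, an application or other client program can use an API provided by an application framework. In these embodiments, the application or client program can incorporate calls to functions or methods provided by the SDK and by the API, or use data types or objects defined in the SDK and provided by the API. An application framework can, in these embodiments, provide a main event loop for a program that responds to various events defined by the framework. The API allows the application to specify the events and the responses to the events using the application framework. In some implementations, an API call can report to an application the capabilities or state of a hardware device, including those related to aspects such as input capabilities and state, output capabilities and state, processing capability, power state, storage capacity and state, and communications capability, and the API can be implemented in part by firmware, microcode, or other low-level logic that executes in part on the hardware component.
The API-calling component can be a local component (i.e., on the same data processing system as the API-implementing component) or a remote component (i.e., on a different data processing system from the API-implementing component) that communicates with the API-implementing component through the API over a network. It should be understood that an API-implementing component can also act as an API-calling component (i.e., it can make API calls to an API exposed by a different API-implementing component), and an API-calling component can also act as an API-implementing component by implementing an API that is exposed to a different API-calling component.
The API can allow multiple API-calling components written in different programming languages to communicate with the API-implementing component (thus the API can include features for translating calls and returns between the API-implementing component and the API-calling component); however, the API can be implemented in terms of a specific programming language. An API-calling component can, in one embodiment, call APIs from different providers, such as a set of APIs from an OS provider, another set of APIs from a plug-in provider, and another set of APIs from another provider (e.g., the provider of a software library) or the creator of that other set of APIs.
FIG. 11 is a block diagram illustrating an exemplary API architecture that can be used in some embodiments of the invention. As shown in FIG. 11, the API architecture 1100 includes the API-implementing component 1110 (e.g., an operating system, a library, a device driver, an API, an application, software, or another module) that implements the API 1120. The API 1120 specifies one or more functions, methods, classes, objects, protocols, data structures, formats, and/or other features of the API-implementing component that can be used by the API-calling component 1130. The API 1120 can specify at least one calling convention that specifies how a function in the API-implementing component receives parameters from the API-calling component and how the function returns a result to the API-calling component. The API-calling component 1130 (e.g., an operating system, a library, a device driver, an API, an application, software, or another module) makes API calls through the API 1120 to access and use the features of the API-implementing component 1110 that are specified by the API 1120. The API-implementing component 1110 can return a value through the API 1120 to the API-calling component 1130 in response to an API call.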
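It will be appreciated that the API-implementing component 1110 can include additional functions, methods, classes, data structures, and/or other features that are not specified through the API 1120 and are not available to the API-calling component 1130. It should be understood that the API-calling component 1130 can be on the same system as the API-implementing component 1110 or can be located remotely and access the API-implementing component 1110 using the API 1120 over a network. While FIG. 11 illustrates a single API-calling component 1130 interacting with the API 1120, it should be understood that other API-calling components, which can be written in different languages from (or the same language as) the API-calling component 1130, can also use the API 1120.
The API-implementing component 1110, the API 1120, and the API-calling component 1130 can be stored in a machine-readable non-transitory storage medium, which includes any mechanism for storing information in a tangible form readable by a machine (e.g., a computer or other data processing system). For example, a machine-readable medium includes magnetic disks, optical disks, random access memory, read-only memory, flash memory devices, etc., and can be a local storage medium or a storage medium on a remote device that is coupled to a client device through one or more networks.
In FIG. 10 ("Software Stack"), an exemplary embodiment, applications can make calls to Service 1 or Service 2 using several Service APIs and to the operating system (OS) using several OS APIs. Services 1 and 2 can make calls to the OS using several OS APIs.
Note that Service 2 has two APIs, one of which (Service 2 API 1) receives calls from and returns values to Application 1, and the other (Service 2 API 2) receives calls from and returns values to Application 2. Service 1 (which can be, for example, a software library) makes calls to and receives returned values from OS API 1, and Service 2 (which can be, for example, a software library) makes calls to and receives returned values from both OS API 1 and OS API 2. Application 2 makes calls to and receives returned values from OS API 2.
Reference in the specification to "one embodiment" or "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the invention. The appearances of the phrase "in one embodiment" in various places in the specification do not necessarily all refer to the same embodiment.
In the foregoing specification, the invention has been described with reference to specific exemplary embodiments thereof. It will be evident that various modifications can be made thereto without departing from the broader spirit and scope of the invention as set forth in the following claims. The specification and drawings are, accordingly, to be regarded in an illustrative sense rather than a restrictive sense.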
Claims (20)
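1. A machine-implemented method for speech recognition repair, comprising:
receiving a speech input from a user of a data processing system;
determining, in the data processing system, a context of the speech input;
recognizing text in the speech input through a speech recognition system, the text recognition producing a text output;
storing the text output as a parsed data structure having a plurality of tokens, each representing a word in the text output;
processing each of the tokens with a set of interpreters, wherein each interpreter is designed to repair a specific type of error in the text output, searches one or more databases to identify matches between one or more items in the databases and each of the tokens, and determines from the identified matches and the context whether the interpreter can repair a token in the text output;
merging selected results produced by the set of interpreters to produce a repaired speech transcription, the repaired speech transcription representing a repaired version of the text output; and
providing the repaired speech transcription to a selected application in a set of applications based on a command in the repaired speech transcription, wherein the selected application is configured to execute the command.
2. The method of claim 1, wherein the context includes a previous user input history, and wherein the one or more databases include a contacts database that stores at least one of names, addresses, and telephone numbers.
3. The method of any one of claims 1 to 2, wherein the context includes a conversation history, wherein the one or more databases include a media database that stores at least one of songs, titles, and artists, and wherein an interpreter in the set of interpreters uses a string of at least two words when evaluating a possible match.
4. The method of any one of claims 1 to 2, wherein a first interpreter in the set of interpreters uses a first algorithm to determine whether to repair a word, and a second interpreter in the set of interpreters uses a second algorithm to determine whether to repair a word, the first algorithm being different from the second algorithm.
5. The method of claim 4, wherein a third interpreter in the set of interpreters uses a third algorithm to search the one or more databases, and a fourth interpreter in the set of interpreters uses a fourth algorithm to search the one or more databases, the third algorithm being different from the fourth algorithm.
6. The method of any one of claims 1 to 2, wherein the interpreters in the set of interpreters do not attempt to repair the command.
7. The method of any one of claims 1 to 2, wherein the merging merges only non-overlapping results from the set of interpreters, overlapping results from the set of interpreters are ranked in a ranked set, and one result in the ranked set is selected and merged into the repaired speech transcription.
8. The method of any one of claims 1 to 2, wherein the specific type of error that each interpreter is designed to repair is determined based on one or more fields in the one or more databases searched by that interpreter.
9. The method of any one of claims 1 to 2, wherein the set of interpreters, when determining whether to repair one or more words in the text output, searches the one or more databases to compare words in the text output with one or more items in the one or more databases.
10. The method of any one of claims 1 to 2, wherein a grammar parser determines the command from the text output.
11. The method of any one of claims 1 to 2, wherein the set of applications includes at least two of: (a) a telephone dialer that uses the repaired speech transcription to dial a telephone number; (b) a media player for playing songs or other content; (c) a text messaging application; (d) an email application; (e) a calendar application; (f) a local search application; (g) a video conferencing application; or (h) a person or object locating application.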
12.一种数据处理系统,其包括:
语音识别器,其可操作以在语音输入中识别文本且产生文本输出;
上下文确定模块,其可操作以确定所述语音输入的上下文;
麦克风,其耦合到所述语音识别器以将所述语音输入提供到所述语音识别器;
存储装置,其用于将所述文本输出存储为具有多个标记的经剖析数据结构,所述多个标记各自表示所述文本输出中的字词;
一组解释器,其耦合到所述语音识别器和所述上下文确定模块,其中每一解释器经设计以修复所述文本输出中的特定类型的错误,搜索一个或一个以上数据库以识别所述数据库中的一个或一个以上项目与所述标记中的每一者之间的匹配,且根据所述所识别的匹配和所述上下文确定所述解释器是否能修复所述文本输出中的标记;以及
控制器,其用于合并由所述组解释器产生的选定结果以产生经修复的语音转录且用于基于所述经修复的语音转录中的命令来将所述经修复的语音转录提供到一组应用程序中的选定应用程序,其中所述经修复的语音转录表示所述文本输出的经修复版本,且所述选定应用程序经配置以执行所述命令。
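13. The system of claim 12, wherein the context includes a history of previous user inputs, and wherein the one or more databases include a contacts database that stores at least one of names, addresses, and phone numbers.
14. The system of any one of claims 12 to 13, wherein the context includes a conversation history, wherein the one or more databases include a media database that stores at least one of songs, titles, and artists, and wherein an interpreter in the set of interpreters uses a string of at least two words when evaluating possible matches.
15. The system of any one of claims 12 to 13, wherein a first interpreter in the set of interpreters uses a first algorithm to determine whether to repair a word, and a second interpreter in the set of interpreters uses a second algorithm to determine whether to repair a word, the first algorithm being different from the second algorithm.
16. The system of claim 15, wherein a third interpreter in the set of interpreters uses a third algorithm to search the one or more databases, and a fourth interpreter in the set of interpreters uses a fourth algorithm to search the one or more databases, the third algorithm being different from the fourth algorithm.
17. The system of any one of claims 12 to 13, wherein the interpreters in the set of interpreters do not attempt to repair the command.
18. The system of any one of claims 12 to 13, wherein the merging merges only non-overlapping results from the set of interpreters, overlapping results from the set of interpreters are arranged in a ranked set, and one result in the ranked set is selected and merged into the repaired speech transcription.
19. The system of any one of claims 12 to 13, wherein the specific type of error that each interpreter is designed to repair is determined based on one or more fields in the one or more databases searched by that interpreter.
20. The system of any one of claims 12 to 13, further comprising a grammar parser for determining the command from the text output.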
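The flow recited in the claims (tokens processed by a set of interpreters, non-overlapping results merged, overlapping results ranked, and the repaired transcription routed by its command) can be sketched roughly as follows. This is an illustration under stated assumptions, not the patent's implementation: the names `Repair`, `DatabaseInterpreter`, `merge`, and `dispatch`, the use of `difflib.SequenceMatcher` as the matching algorithm, the 0.75 threshold, the 0.1 context boost, and the rule that the first token is the command are all hypothetical choices made for the example.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass(frozen=True)
class Repair:
    start: int         # index of the first token covered
    end: int           # index one past the last token covered
    replacement: str   # database item proposed in place of those tokens
    score: float       # match confidence, used only for ranking


class DatabaseInterpreter:
    """Hypothetical interpreter that repairs tokens against one database field."""

    def __init__(self, entries, threshold=0.75, max_words=2):
        self.entries = list(entries)
        self.threshold = threshold   # assumed cut-off, not from the source
        self.max_words = max_words   # longest word string tried as one match

    def propose(self, tokens, context):
        repairs = []
        for n in range(1, self.max_words + 1):
            # Start at index 1: token 0 is treated as the command and left alone.
            for i in range(1, len(tokens) - n + 1):
                span = " ".join(tokens[i:i + n]).lower()
                for entry in self.entries:
                    if span == entry.lower():
                        continue  # already matches the database item
                    score = SequenceMatcher(None, span, entry.lower()).ratio()
                    if entry in context:
                        score += 0.1  # assumed boost for recently used items
                    if score >= self.threshold:
                        repairs.append(Repair(i, i + n, entry, score))
        return repairs


def merge(tokens, repairs):
    """Keep non-overlapping repairs; pick the top-scored one per overlapping group."""
    groups = []
    for r in sorted(repairs, key=lambda r: (r.start, r.end)):
        if groups and r.start < groups[-1]["end"]:
            groups[-1]["items"].append(r)
            groups[-1]["end"] = max(groups[-1]["end"], r.end)
        else:
            groups.append({"end": r.end, "items": [r]})
    out, pos = [], 0
    for group in groups:
        best = max(group["items"], key=lambda r: r.score)
        out.extend(tokens[pos:best.start])
        out.append(best.replacement)
        pos = best.end
    out.extend(tokens[pos:])
    return " ".join(out)


def dispatch(repaired, apps):
    """Route the repaired transcription to the application registered for its command."""
    words = repaired.split()
    if not words:
        return None
    handler = apps.get(words[0].lower())
    return handler(repaired) if handler else None


tokens = "call jon smyth mobile".split()
interpreters = [
    DatabaseInterpreter(["John Smith", "Joan Baez"]),  # stand-in contacts database
    DatabaseInterpreter(["Hey Jude"]),                 # stand-in media database
]
context = {"John Smith"}  # e.g. a contact named earlier in the conversation
repairs = [r for interp in interpreters for r in interp.propose(tokens, context)]
repaired = merge(tokens, repairs)
result = dispatch(repaired, {"call": lambda text: "dialing: " + text})
```

Grouping overlapping proposals before choosing one keeps two interpreters from rewriting the same words twice, while repairs to unrelated parts of the utterance are applied independently.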
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510922714.2A CN105336326A (zh) | 2011-09-28 | 2012-09-28 | Method and system for speech recognition repair using contextual information |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US13/247,912 | 2011-09-28 | ||
US13/247,912 US8762156B2 (en) | 2011-09-28 | 2011-09-28 | Speech recognition repair using contextual information |
Related Child Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510922714.2A Division CN105336326A (zh) | 2011-09-28 | 2012-09-28 | Method and system for speech recognition repair using contextual information |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103035240A CN103035240A (zh) | 2013-04-10 |
CN103035240B CN103035240B (zh) | 2015-11-25 |
Family
ID=47048983
Family Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510922714.2A Pending CN105336326A (zh) | 2011-09-28 | 2012-09-28 | Method and system for speech recognition repair using contextual information |
CN201210369739.0A Expired - Fee Related CN103035240B (zh) | 2011-09-28 | 2012-09-28 | Method and system for speech recognition repair using contextual information |
Family Applications Before (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510922714.2A Pending CN105336326A (zh) | 2011-09-28 | 2012-09-28 | Method and system for speech recognition repair using contextual information |
Country Status (6)
Country | Link |
---|---|
US (2) | US8762156B2 (zh) |
EP (1) | EP2587478A3 (zh) |
JP (2) | JP2013073240A (zh) |
KR (2) | KR101418163B1 (zh) |
CN (2) | CN105336326A (zh) |
AU (2) | AU2012227294B2 (zh) |
Families Citing this family (387)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8645137B2 (en) | 2000-03-16 | 2014-02-04 | Apple Inc. | Fast, language-independent method for user authentication by voice |
US6915262B2 (en) | 2000-11-30 | 2005-07-05 | Telesector Resources Group, Inc. | Methods and apparatus for performing speech recognition and using speech recognition results |
US7693720B2 (en) | 2002-07-15 | 2010-04-06 | Voicebox Technologies, Inc. | Mobile systems and methods for responding to natural language speech utterance |
US10032452B1 (en) * | 2016-12-30 | 2018-07-24 | Google Llc | Multimodal transmission of packetized data |
US7640160B2 (en) | 2005-08-05 | 2009-12-29 | Voicebox Technologies, Inc. | Systems and methods for responding to natural language speech utterance |
US8677377B2 (en) | 2005-09-08 | 2014-03-18 | Apple Inc. | Method and apparatus for building an intelligent automated assistant |
US9318108B2 (en) | 2010-01-18 | 2016-04-19 | Apple Inc. | Intelligent automated assistant |
US8073681B2 (en) * | 2006-10-16 | 2011-12-06 | Voicebox Technologies, Inc. | System and method for a cooperative conversational voice user interface |
US7818176B2 (en) | 2007-02-06 | 2010-10-19 | Voicebox Technologies, Inc. | System and method for selecting and presenting advertisements based on natural language processing of voice-based input |
US8977255B2 (en) | 2007-04-03 | 2015-03-10 | Apple Inc. | Method and system for operating a multi-function portable electronic device using voice-activation |
US8140335B2 (en) | 2007-12-11 | 2012-03-20 | Voicebox Technologies, Inc. | System and method for providing a natural language voice user interface in an integrated voice navigation services environment |
US10002189B2 (en) | 2007-12-20 | 2018-06-19 | Apple Inc. | Method and apparatus for searching using an active ontology |
US9330720B2 (en) | 2008-01-03 | 2016-05-03 | Apple Inc. | Methods and apparatus for altering audio output signals |
US8996376B2 (en) | 2008-04-05 | 2015-03-31 | Apple Inc. | Intelligent text-to-speech conversion |
US10496753B2 (en) * | 2010-01-18 | 2019-12-03 | Apple Inc. | Automatically adapting user interfaces for hands-free interaction |
US9305548B2 (en) | 2008-05-27 | 2016-04-05 | Voicebox Technologies Corporation | System and method for an integrated, multi-modal, multi-device natural language voice services environment |
US20100030549A1 (en) | 2008-07-31 | 2010-02-04 | Lee Michael M | Mobile device having human language translation capability with positional feedback |
US8463053B1 (en) | 2008-08-08 | 2013-06-11 | The Research Foundation Of State University Of New York | Enhanced max margin learning on multimodal data mining in a multimedia database |
US8676904B2 (en) | 2008-10-02 | 2014-03-18 | Apple Inc. | Electronic devices with voice command and contextual data processing capabilities |
US9390167B2 (en) | 2010-07-29 | 2016-07-12 | Soundhound, Inc. | System and methods for continuous audio matching |
US8326637B2 (en) | 2009-02-20 | 2012-12-04 | Voicebox Technologies, Inc. | System and method for processing multi-modal device interactions in a natural language voice services environment |
US10241644B2 (en) | 2011-06-03 | 2019-03-26 | Apple Inc. | Actionable reminder entries |
US10241752B2 (en) | 2011-09-30 | 2019-03-26 | Apple Inc. | Interface for a virtual digital assistant |
US10255566B2 (en) | 2011-06-03 | 2019-04-09 | Apple Inc. | Generating and processing task items that represent tasks to perform |
US9858925B2 (en) | 2009-06-05 | 2018-01-02 | Apple Inc. | Using context information to facilitate processing of commands in a virtual assistant |
US9431006B2 (en) | 2009-07-02 | 2016-08-30 | Apple Inc. | Methods and apparatuses for automatic speech recognition |
US9171541B2 (en) | 2009-11-10 | 2015-10-27 | Voicebox Technologies Corporation | System and method for hybrid processing in a natural language voice services environment |
US10553209B2 (en) | 2010-01-18 | 2020-02-04 | Apple Inc. | Systems and methods for hands-free notification summaries |
US10276170B2 (en) | 2010-01-18 | 2019-04-30 | Apple Inc. | Intelligent automated assistant |
US10705794B2 (en) | 2010-01-18 | 2020-07-07 | Apple Inc. | Automatically adapting user interfaces for hands-free interaction |
US10679605B2 (en) | 2010-01-18 | 2020-06-09 | Apple Inc. | Hands-free list-reading by intelligent automated assistant |
US8682667B2 (en) | 2010-02-25 | 2014-03-25 | Apple Inc. | User profiling for selecting user specific voice input processing information |
US9634855B2 (en) | 2010-05-13 | 2017-04-25 | Alexander Poltorak | Electronic personal interactive device that determines topics of interest using a conversational agent |
US9262612B2 (en) | 2011-03-21 | 2016-02-16 | Apple Inc. | Device access using voice authentication |
US9035163B1 (en) | 2011-05-10 | 2015-05-19 | SoundHound, Inc. | System and method for targeting content based on identified audio and multimedia |
US10057736B2 (en) | 2011-06-03 | 2018-08-21 | Apple Inc. | Active transport based notifications |
DE102011079034A1 (de) | 2011-07-12 | 2013-01-17 | Siemens Aktiengesellschaft | Control of a technical system |
JP2013025299A (ja) * | 2011-07-26 | 2013-02-04 | Toshiba Corp | Transcription support system and transcription support method |
US8994660B2 (en) | 2011-08-29 | 2015-03-31 | Apple Inc. | Text correction processing |
US9002322B2 (en) | 2011-09-29 | 2015-04-07 | Apple Inc. | Authentication with secondary approver |
US8769624B2 (en) | 2011-09-29 | 2014-07-01 | Apple Inc. | Access control utilizing indirect authentication |
US9620122B2 (en) * | 2011-12-08 | 2017-04-11 | Lenovo (Singapore) Pte. Ltd | Hybrid speech recognition |
US10134385B2 (en) | 2012-03-02 | 2018-11-20 | Apple Inc. | Systems and methods for name pronunciation |
US9483461B2 (en) | 2012-03-06 | 2016-11-01 | Apple Inc. | Handling speech synthesis of content for multiple languages |
US9361878B2 (en) * | 2012-03-30 | 2016-06-07 | Michael Boukadakis | Computer-readable medium, system and method of providing domain-specific information |
US10255914B2 (en) | 2012-03-30 | 2019-04-09 | Michael Boukadakis | Digital concierge and method |
US9190054B1 (en) * | 2012-03-31 | 2015-11-17 | Google Inc. | Natural language refinement of voice and text entry |
US9280610B2 (en) | 2012-05-14 | 2016-03-08 | Apple Inc. | Crowd sourcing information to fulfill user requests |
US10417037B2 (en) | 2012-05-15 | 2019-09-17 | Apple Inc. | Systems and methods for integrating third party services with a digital assistant |
US10152723B2 (en) | 2012-05-23 | 2018-12-11 | Google Llc | Methods and systems for identifying new computers and providing matching services |
US10776830B2 (en) | 2012-05-23 | 2020-09-15 | Google Llc | Methods and systems for identifying new computers and providing matching services |
KR20130135410A (ko) * | 2012-05-31 | 2013-12-11 | 삼성전자주식회사 | Method for providing a voice recognition function and electronic device thereof |
US9721563B2 (en) | 2012-06-08 | 2017-08-01 | Apple Inc. | Name recognition system |
JP5819261B2 (ja) * | 2012-06-19 | 2015-11-18 | 株式会社Nttドコモ | Function execution instruction system, function execution instruction method, and function execution instruction program |
US9495129B2 (en) | 2012-06-29 | 2016-11-15 | Apple Inc. | Device, method, and user interface for voice-activated navigation and browsing of a document |
US10957310B1 (en) | 2012-07-23 | 2021-03-23 | Soundhound, Inc. | Integrated programming framework for speech and text understanding with meaning parsing |
US9031848B2 (en) | 2012-08-16 | 2015-05-12 | Nuance Communications, Inc. | User interface for searching a bundled service content data source |
US8799959B2 (en) | 2012-08-16 | 2014-08-05 | Hoi L. Young | User interface for entertainment systems |
US9497515B2 (en) | 2012-08-16 | 2016-11-15 | Nuance Communications, Inc. | User interface for entertainment systems |
US9026448B2 (en) | 2012-08-16 | 2015-05-05 | Nuance Communications, Inc. | User interface for entertainment systems |
US9106957B2 (en) * | 2012-08-16 | 2015-08-11 | Nuance Communications, Inc. | Method and apparatus for searching data sources for entertainment systems |
US9547647B2 (en) | 2012-09-19 | 2017-01-17 | Apple Inc. | Voice-based media searching |
JP6068901B2 (ja) * | 2012-09-26 | 2017-01-25 | 京セラ株式会社 | Information terminal, voice operation program, and voice operation method |
US20140122084A1 (en) * | 2012-10-25 | 2014-05-01 | Nuance Communications, Inc. | Data Search Service |
US10026400B2 (en) * | 2013-06-27 | 2018-07-17 | Google Llc | Generating dialog recommendations for chat information systems based on user interaction and environmental data |
WO2014088588A1 (en) * | 2012-12-07 | 2014-06-12 | Empire Technology Development Llc | Personal assistant context building |
CN103065630B (zh) * | 2012-12-28 | 2015-01-07 | 科大讯飞股份有限公司 | Method and system for speech recognition of user-personalized information |
US10735552B2 (en) | 2013-01-31 | 2020-08-04 | Google Llc | Secondary transmissions of packetized data |
US10650066B2 (en) | 2013-01-31 | 2020-05-12 | Google Llc | Enhancing sitelinks with creative content |
DE112014000709B4 (de) | 2013-02-07 | 2021-12-30 | Apple Inc. | Method and apparatus for operating a voice trigger for a digital assistant |
US10652394B2 (en) | 2013-03-14 | 2020-05-12 | Apple Inc. | System and method for processing voicemail |
WO2014143776A2 (en) | 2013-03-15 | 2014-09-18 | Bodhi Technology Ventures Llc | Providing remote interactions with host device using a wireless device |
US10748529B1 (en) | 2013-03-15 | 2020-08-18 | Apple Inc. | Voice activated device for use with a voice-based digital assistant |
WO2014197334A2 (en) | 2013-06-07 | 2014-12-11 | Apple Inc. | System and method for user-specified pronunciation of words for speech synthesis and recognition |
WO2014197336A1 (en) | 2013-06-07 | 2014-12-11 | Apple Inc. | System and method for detecting errors in interactions with a voice-based digital assistant |
US9582608B2 (en) | 2013-06-07 | 2017-02-28 | Apple Inc. | Unified ranking with entropy-weighted information for phrase-based semantic auto-completion |
WO2014197335A1 (en) | 2013-06-08 | 2014-12-11 | Apple Inc. | Interpreting and acting upon commands that involve sharing information with remote devices |
US10176167B2 (en) | 2013-06-09 | 2019-01-08 | Apple Inc. | System and method for inferring user intent from speech inputs |
EP3937002A1 (en) | 2013-06-09 | 2022-01-12 | Apple Inc. | Device, method, and graphical user interface for enabling conversation persistence across two or more instances of a digital assistant |
US10083009B2 (en) | 2013-06-20 | 2018-09-25 | Viv Labs, Inc. | Dynamically evolving cognitive architecture system planning |
US9594542B2 (en) | 2013-06-20 | 2017-03-14 | Viv Labs, Inc. | Dynamically evolving cognitive architecture system based on training by third-party developers |
US9633317B2 (en) * | 2013-06-20 | 2017-04-25 | Viv Labs, Inc. | Dynamically evolving cognitive architecture system based on a natural language intent interpreter |
US10474961B2 (en) | 2013-06-20 | 2019-11-12 | Viv Labs, Inc. | Dynamically evolving cognitive architecture system based on prompting for additional user input |
CN103354089B (zh) * | 2013-06-25 | 2015-10-28 | 天津三星通信技术研究有限公司 | Voice communication management method and device |
DE112014003653B4 (de) | 2013-08-06 | 2024-04-18 | Apple Inc. | Automatically activating intelligent responses based on activities of remote devices |
US20150058006A1 (en) * | 2013-08-23 | 2015-02-26 | Xerox Corporation | Phonetic alignment for user-agent dialogue recognition |
JP6502249B2 (ja) | 2013-08-29 | 2019-04-17 | Panasonic Intellectual Property Corporation of America | Speech recognition method and speech recognition device |
CN105593783A (zh) * | 2013-09-26 | 2016-05-18 | 谷歌公司 | System and method for providing navigation data to a vehicle |
US9361084B1 (en) | 2013-11-14 | 2016-06-07 | Google Inc. | Methods and systems for installing and executing applications |
US9507849B2 (en) | 2013-11-28 | 2016-11-29 | Soundhound, Inc. | Method for combining a query and a communication command in a natural language computer system |
US10296160B2 (en) | 2013-12-06 | 2019-05-21 | Apple Inc. | Method for extracting salient dialog usage from live data |
US10811013B1 (en) * | 2013-12-20 | 2020-10-20 | Amazon Technologies, Inc. | Intent-specific automatic speech recognition result generation |
US11386886B2 (en) * | 2014-01-28 | 2022-07-12 | Lenovo (Singapore) Pte. Ltd. | Adjusting speech recognition using contextual information |
US9292488B2 (en) | 2014-02-01 | 2016-03-22 | Soundhound, Inc. | Method for embedding voice mail in a spoken utterance using a natural language processing computer system |
US11295730B1 (en) | 2014-02-27 | 2022-04-05 | Soundhound, Inc. | Using phonetic variants in a local context to improve natural language understanding |
CN103853463A (zh) * | 2014-02-27 | 2014-06-11 | 珠海多玩信息技术有限公司 | Voice control method and device |
US9959744B2 (en) | 2014-04-25 | 2018-05-01 | Motorola Solutions, Inc. | Method and system for providing alerts for radio communications |
KR102282487B1 (ko) * | 2014-05-08 | 2021-07-26 | 삼성전자주식회사 | Apparatus and method for executing an application |
US9564123B1 (en) | 2014-05-12 | 2017-02-07 | Soundhound, Inc. | Method and system for building an integrated user profile |
US20150350146A1 (en) | 2014-05-29 | 2015-12-03 | Apple Inc. | Coordination of message alert presentations across devices based on device modes |
US9785630B2 (en) | 2014-05-30 | 2017-10-10 | Apple Inc. | Text prediction using combined word N-gram and unigram language models |
EP3149554B1 (en) | 2014-05-30 | 2024-05-01 | Apple Inc. | Continuity |
US9715875B2 (en) | 2014-05-30 | 2017-07-25 | Apple Inc. | Reducing the need for manual start/end-pointing and trigger phrases |
US9430463B2 (en) | 2014-05-30 | 2016-08-30 | Apple Inc. | Exemplar-based natural language processing |
US10170123B2 (en) | 2014-05-30 | 2019-01-01 | Apple Inc. | Intelligent assistant for home automation |
US9760559B2 (en) | 2014-05-30 | 2017-09-12 | Apple Inc. | Predictive text input |
US9633004B2 (en) | 2014-05-30 | 2017-04-25 | Apple Inc. | Better resolution when referencing to concepts |
US9842101B2 (en) | 2014-05-30 | 2017-12-12 | Apple Inc. | Predictive conversion of language input |
US9967401B2 (en) | 2014-05-30 | 2018-05-08 | Apple Inc. | User interface for phone call routing among devices |
TWI566107B (zh) | 2014-05-30 | 2017-01-11 | 蘋果公司 | Method for processing multi-part voice commands, non-transitory computer-readable storage medium, and electronic device |
US10078631B2 (en) | 2014-05-30 | 2018-09-18 | Apple Inc. | Entropy-guided text prediction using combined word and character n-gram language models |
CN107113222B (zh) | 2014-06-06 | 2020-09-01 | 谷歌有限责任公司 | Environment-based proactive chat information system |
CN104966513B (zh) * | 2014-06-09 | 2018-09-04 | 腾讯科技(深圳)有限公司 | Language command processing method and device |
US9338493B2 (en) | 2014-06-30 | 2016-05-10 | Apple Inc. | Intelligent automated assistant for TV user interactions |
US10659851B2 (en) * | 2014-06-30 | 2020-05-19 | Apple Inc. | Real-time digital assistant knowledge updates |
CN105469789A (zh) * | 2014-08-15 | 2016-04-06 | 中兴通讯股份有限公司 | Voice information processing method and terminal |
US10339293B2 (en) | 2014-08-15 | 2019-07-02 | Apple Inc. | Authenticated device used to unlock another device |
US10446141B2 (en) | 2014-08-28 | 2019-10-15 | Apple Inc. | Automatic speech recognition based on user feedback |
KR20160027640A (ko) * | 2014-09-02 | 2016-03-10 | 삼성전자주식회사 | Electronic device and method for recognizing named entities in an electronic device |
US9953646B2 (en) | 2014-09-02 | 2018-04-24 | Belleau Technologies | Method and system for dynamic speech recognition and tracking of prewritten script |
US9818400B2 (en) | 2014-09-11 | 2017-11-14 | Apple Inc. | Method and apparatus for discovering trending terms in speech requests |
US10789041B2 (en) | 2014-09-12 | 2020-09-29 | Apple Inc. | Dynamic thresholds for always listening speech trigger |
EP3195145A4 (en) | 2014-09-16 | 2018-01-24 | VoiceBox Technologies Corporation | Voice commerce |
WO2016044321A1 (en) | 2014-09-16 | 2016-03-24 | Min Tang | Integration of domain information into state transitions of a finite state transducer for natural language processing |
US9886432B2 (en) | 2014-09-30 | 2018-02-06 | Apple Inc. | Parsimonious handling of word inflection via categorical stem + suffix N-gram language models |
EP3201770B1 (en) * | 2014-09-30 | 2020-06-03 | Nuance Communications, Inc. | Methods and apparatus for module arbitration |
US9646609B2 (en) | 2014-09-30 | 2017-05-09 | Apple Inc. | Caching apparatus for serving phonetic pronunciations |
US10074360B2 (en) | 2014-09-30 | 2018-09-11 | Apple Inc. | Providing an indication of the suitability of speech recognition |
US10127911B2 (en) | 2014-09-30 | 2018-11-13 | Apple Inc. | Speaker identification and unsupervised speaker adaptation techniques |
US9668121B2 (en) | 2014-09-30 | 2017-05-30 | Apple Inc. | Social reminders |
WO2016061309A1 (en) | 2014-10-15 | 2016-04-21 | Voicebox Technologies Corporation | System and method for providing follow-up responses to prior natural language inputs of a user |
US10235130B2 (en) | 2014-11-06 | 2019-03-19 | Microsoft Technology Licensing, Llc | Intent driven command processing |
US9646611B2 (en) | 2014-11-06 | 2017-05-09 | Microsoft Technology Licensing, Llc | Context-based actions |
US9922098B2 (en) | 2014-11-06 | 2018-03-20 | Microsoft Technology Licensing, Llc | Context-based search and relevancy generation |
US10614799B2 (en) | 2014-11-26 | 2020-04-07 | Voicebox Technologies Corporation | System and method of providing intent predictions for an utterance prior to a system detection of an end of the utterance |
US10431214B2 (en) | 2014-11-26 | 2019-10-01 | Voicebox Technologies Corporation | System and method of determining a domain and/or an action related to a natural language input |
US10552013B2 (en) | 2014-12-02 | 2020-02-04 | Apple Inc. | Data detection |
JP6348831B2 (ja) * | 2014-12-12 | 2018-06-27 | クラリオン株式会社 | Voice input assistance device, voice input assistance system, and voice input method |
US10147421B2 (en) | 2014-12-16 | 2018-12-04 | Microsoft Technology Licensing, Llc | Digital assistant voice input integration |
WO2016117854A1 (ko) * | 2015-01-22 | 2016-07-28 | 삼성전자 주식회사 | Text editing apparatus and text editing method based on a speech signal |
CN105869632A (zh) * | 2015-01-22 | 2016-08-17 | 北京三星通信技术研究有限公司 | Text revision method and device based on speech recognition |
US10152299B2 (en) | 2015-03-06 | 2018-12-11 | Apple Inc. | Reducing response latency of intelligent automated assistants |
US9865280B2 (en) | 2015-03-06 | 2018-01-09 | Apple Inc. | Structured dictation using intelligent automated assistants |
US9886953B2 (en) | 2015-03-08 | 2018-02-06 | Apple Inc. | Virtual assistant activation |
US9721566B2 (en) | 2015-03-08 | 2017-08-01 | Apple Inc. | Competing devices responding to voice triggers |
US10567477B2 (en) | 2015-03-08 | 2020-02-18 | Apple Inc. | Virtual assistant continuity |
US9899019B2 (en) | 2015-03-18 | 2018-02-20 | Apple Inc. | Systems and methods for structured stem and suffix language models |
US9460713B1 (en) | 2015-03-30 | 2016-10-04 | Google Inc. | Language model biasing modulation |
US9842105B2 (en) | 2015-04-16 | 2017-12-12 | Apple Inc. | Parsimonious continuous-space phrase representations for natural language processing |
US9472196B1 (en) | 2015-04-22 | 2016-10-18 | Google Inc. | Developer voice actions system |
US10460227B2 (en) | 2015-05-15 | 2019-10-29 | Apple Inc. | Virtual assistant in a communication session |
US10083688B2 (en) | 2015-05-27 | 2018-09-25 | Apple Inc. | Device voice control for selecting a displayed affordance |
US10200824B2 (en) | 2015-05-27 | 2019-02-05 | Apple Inc. | Systems and methods for proactively identifying and surfacing relevant content on a touch-sensitive device |
US10127220B2 (en) | 2015-06-04 | 2018-11-13 | Apple Inc. | Language identification from short strings |
US9578173B2 (en) | 2015-06-05 | 2017-02-21 | Apple Inc. | Virtual assistant aided communication with 3rd party service in a communication session |
US10101822B2 (en) | 2015-06-05 | 2018-10-16 | Apple Inc. | Language input correction |
US10255907B2 (en) | 2015-06-07 | 2019-04-09 | Apple Inc. | Automatic accent detection using acoustic models |
US10186254B2 (en) | 2015-06-07 | 2019-01-22 | Apple Inc. | Context-based endpoint detection |
US11025565B2 (en) | 2015-06-07 | 2021-06-01 | Apple Inc. | Personalized prediction of responses for instant messaging |
US20160378747A1 (en) | 2015-06-29 | 2016-12-29 | Apple Inc. | Virtual assistant for media playback |
US9576578B1 (en) * | 2015-08-12 | 2017-02-21 | Google Inc. | Contextual improvement of voice query recognition |
CN105183422B (zh) * | 2015-08-31 | 2018-06-05 | 百度在线网络技术(北京)有限公司 | Method and device for voice control of application programs |
US10747498B2 (en) | 2015-09-08 | 2020-08-18 | Apple Inc. | Zero latency digital assistant |
US10671428B2 (en) | 2015-09-08 | 2020-06-02 | Apple Inc. | Distributed personal assistant |
US10331312B2 (en) | 2015-09-08 | 2019-06-25 | Apple Inc. | Intelligent automated assistant in a media environment |
US10740384B2 (en) | 2015-09-08 | 2020-08-11 | Apple Inc. | Intelligent automated assistant for media search and playback |
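KR102420518B1 (ko) * | 2015-09-09 | 2022-07-13 | 삼성전자주식회사 | Natural language processing system, natural language processing apparatus, natural language processing method, and computer-readable recording medium |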
US9697820B2 (en) | 2015-09-24 | 2017-07-04 | Apple Inc. | Unit-selection text-to-speech synthesis using concatenation-sensitive neural networks |
US11010550B2 (en) | 2015-09-29 | 2021-05-18 | Apple Inc. | Unified language modeling framework for word prediction, auto-completion and auto-correction |
US10366158B2 (en) | 2015-09-29 | 2019-07-30 | Apple Inc. | Efficient word encoding for recurrent neural network language models |
US11587559B2 (en) | 2015-09-30 | 2023-02-21 | Apple Inc. | Intelligent device identification |
US10691473B2 (en) | 2015-11-06 | 2020-06-23 | Apple Inc. | Intelligent automated assistant in a messaging environment |
US10956666B2 (en) | 2015-11-09 | 2021-03-23 | Apple Inc. | Unconventional virtual assistant interactions |
CN105512182B (zh) * | 2015-11-25 | 2019-03-12 | 深圳Tcl数字技术有限公司 | Voice control method and smart television |
US10049668B2 (en) | 2015-12-02 | 2018-08-14 | Apple Inc. | Applying neural network language models to weighted finite state transducers for automatic speech recognition |
US20170177716A1 (en) * | 2015-12-22 | 2017-06-22 | Intel Corporation | Technologies for semantic interpretation of user input by a dialogue manager |
US10223066B2 (en) | 2015-12-23 | 2019-03-05 | Apple Inc. | Proactive assistance based on dialog communication between devices |
US10509626B2 (en) | 2016-02-22 | 2019-12-17 | Sonos, Inc | Handling of loss of pairing between networked devices |
US10095470B2 (en) | 2016-02-22 | 2018-10-09 | Sonos, Inc. | Audio response playback |
US9772817B2 (en) | 2016-02-22 | 2017-09-26 | Sonos, Inc. | Room-corrected voice detection |
US10264030B2 (en) | 2016-02-22 | 2019-04-16 | Sonos, Inc. | Networked microphone device control |
US9922648B2 (en) * | 2016-03-01 | 2018-03-20 | Google Llc | Developer voice actions system |
CN107193389A (zh) * | 2016-03-14 | 2017-09-22 | 中兴通讯股份有限公司 | Method and device for implementing input |
US10446143B2 (en) | 2016-03-14 | 2019-10-15 | Apple Inc. | Identification of voice inputs providing credentials |
US10319371B2 (en) * | 2016-05-04 | 2019-06-11 | GM Global Technology Operations LLC | Disambiguation of vehicle speech commands |
US10332516B2 (en) | 2016-05-10 | 2019-06-25 | Google Llc | Media transfer among media output devices |
KR102177786B1 (ko) | 2016-05-13 | 2020-11-12 | 구글 엘엘씨 | Media transfer among media output devices |
US20190066676A1 (en) * | 2016-05-16 | 2019-02-28 | Sony Corporation | Information processing apparatus |
DK179186B1 (en) | 2016-05-19 | 2018-01-15 | Apple Inc | REMOTE AUTHORIZATION TO CONTINUE WITH AN ACTION |
JP2017211430A (ja) | 2016-05-23 | 2017-11-30 | ソニー株式会社 | Information processing device and information processing method |
US9934775B2 (en) | 2016-05-26 | 2018-04-03 | Apple Inc. | Unit-selection text-to-speech synthesis based on predicted concatenation parameters |
US9972304B2 (en) | 2016-06-03 | 2018-05-15 | Apple Inc. | Privacy preserving distributed evaluation framework for embedded personalized systems |
US10249300B2 (en) | 2016-06-06 | 2019-04-02 | Apple Inc. | Intelligent list reading |
US11227589B2 (en) | 2016-06-06 | 2022-01-18 | Apple Inc. | Intelligent list reading |
US10049663B2 (en) | 2016-06-08 | 2018-08-14 | Apple, Inc. | Intelligent automated assistant for media exploration |
DK179588B1 (en) | 2016-06-09 | 2019-02-22 | Apple Inc. | INTELLIGENT AUTOMATED ASSISTANT IN A HOME ENVIRONMENT |
US9978390B2 (en) | 2016-06-09 | 2018-05-22 | Sonos, Inc. | Dynamic player selection for audio signal processing |
US10490187B2 (en) | 2016-06-10 | 2019-11-26 | Apple Inc. | Digital assistant providing automated status report |
US10586535B2 (en) | 2016-06-10 | 2020-03-10 | Apple Inc. | Intelligent digital assistant in a multi-tasking environment |
US10067938B2 (en) | 2016-06-10 | 2018-09-04 | Apple Inc. | Multilingual word prediction |
US10192552B2 (en) | 2016-06-10 | 2019-01-29 | Apple Inc. | Digital assistant providing whispered speech |
US10509862B2 (en) | 2016-06-10 | 2019-12-17 | Apple Inc. | Dynamic phrase expansion of language input |
DK201670540A1 (en) | 2016-06-11 | 2018-01-08 | Apple Inc | Application integration with a digital assistant |
DK179343B1 (en) | 2016-06-11 | 2018-05-14 | Apple Inc | Intelligent task discovery |
DK179415B1 (en) | 2016-06-11 | 2018-06-14 | Apple Inc | Intelligent device arbitration and control |
DK179049B1 (en) | 2016-06-11 | 2017-09-18 | Apple Inc | Data driven natural language event detection and classification |
DK201670622A1 (en) | 2016-06-12 | 2018-02-12 | Apple Inc | User interfaces for transactions |
US10410622B2 (en) * | 2016-07-13 | 2019-09-10 | Tata Consultancy Services Limited | Systems and methods for automatic repair of speech recognition engine output using a sliding window mechanism |
US10134399B2 (en) | 2016-07-15 | 2018-11-20 | Sonos, Inc. | Contextualization of voice inputs |
KR102691889B1 (ko) * | 2016-07-27 | 2024-08-06 | 삼성전자주식회사 | Electronic device and voice recognition method thereof |
US10331784B2 (en) | 2016-07-29 | 2019-06-25 | Voicebox Technologies Corporation | System and method of disambiguating natural language processing requests |
US20180039478A1 (en) * | 2016-08-02 | 2018-02-08 | Google Inc. | Voice interaction services |
US10115400B2 (en) | 2016-08-05 | 2018-10-30 | Sonos, Inc. | Multiple voice services |
JP6597527B2 (ja) * | 2016-09-06 | 2019-10-30 | トヨタ自動車株式会社 | Speech recognition device and speech recognition method |
US10474753B2 (en) | 2016-09-07 | 2019-11-12 | Apple Inc. | Language identification using recurrent neural networks |
US10043516B2 (en) | 2016-09-23 | 2018-08-07 | Apple Inc. | Intelligent automated assistant |
US10339925B1 (en) * | 2016-09-26 | 2019-07-02 | Amazon Technologies, Inc. | Generation of automated message responses |
US10217453B2 (en) | 2016-10-14 | 2019-02-26 | Soundhound, Inc. | Virtual assistant configured by selection of wake-up phrase |
US10181323B2 (en) | 2016-10-19 | 2019-01-15 | Sonos, Inc. | Arbitration-based voice recognition |
US9959864B1 (en) * | 2016-10-27 | 2018-05-01 | Google Llc | Location-based voice query recognition |
US11281993B2 (en) | 2016-12-05 | 2022-03-22 | Apple Inc. | Model and ensemble compression for metric learning |
US10593346B2 (en) | 2016-12-22 | 2020-03-17 | Apple Inc. | Rank-reduced token representation for automatic speech recognition |
US10276161B2 (en) * | 2016-12-27 | 2019-04-30 | Google Llc | Contextual hotwords |
US10708313B2 (en) | 2016-12-30 | 2020-07-07 | Google Llc | Multimodal transmission of packetized data |
US10593329B2 (en) * | 2016-12-30 | 2020-03-17 | Google Llc | Multimodal transmission of packetized data |
US11204787B2 (en) | 2017-01-09 | 2021-12-21 | Apple Inc. | Application integration with a digital assistant |
JP7107228B2 (ja) * | 2017-01-18 | 2022-07-27 | ソニーグループ株式会社 | Information processing device, information processing method, and program |
US11100384B2 (en) | 2017-02-14 | 2021-08-24 | Microsoft Technology Licensing, Llc | Intelligent device user interactions |
US10467509B2 (en) | 2017-02-14 | 2019-11-05 | Microsoft Technology Licensing, Llc | Computationally-efficient human-identifying smart assistant computer |
US11010601B2 (en) | 2017-02-14 | 2021-05-18 | Microsoft Technology Licensing, Llc | Intelligent assistant device communicating non-verbal cues |
US10560656B2 (en) * | 2017-03-19 | 2020-02-11 | Apple Inc. | Media message creation with automatic titling |
KR102375800B1 (ko) | 2017-04-28 | 2022-03-17 | 삼성전자주식회사 | Electronic device for providing a speech recognition service and method thereof |
US10992795B2 (en) | 2017-05-16 | 2021-04-27 | Apple Inc. | Methods and interfaces for home media control |
US11431836B2 (en) | 2017-05-02 | 2022-08-30 | Apple Inc. | Methods and interfaces for initiating media playback |
DK201770383A1 (en) | 2017-05-09 | 2018-12-14 | Apple Inc. | USER INTERFACE FOR CORRECTING RECOGNITION ERRORS |
US10417266B2 (en) | 2017-05-09 | 2019-09-17 | Apple Inc. | Context-aware ranking of intelligent response suggestions |
DK180048B1 (en) | 2017-05-11 | 2020-02-04 | Apple Inc. | MAINTAINING THE DATA PROTECTION OF PERSONAL INFORMATION |
DK201770439A1 (en) | 2017-05-11 | 2018-12-13 | Apple Inc. | Offline personal assistant |
US10726832B2 (en) | 2017-05-11 | 2020-07-28 | Apple Inc. | Maintaining privacy of personal information |
US10395654B2 (en) | 2017-05-11 | 2019-08-27 | Apple Inc. | Text normalization based on a data-driven learning network |
US11301477B2 (en) | 2017-05-12 | 2022-04-12 | Apple Inc. | Feedback analysis of a digital assistant |
DK201770428A1 (en) | 2017-05-12 | 2019-02-18 | Apple Inc. | LOW-LATENCY INTELLIGENT AUTOMATED ASSISTANT |
DK179496B1 (en) | 2017-05-12 | 2019-01-15 | Apple Inc. | USER-SPECIFIC Acoustic Models |
DK179745B1 (en) | 2017-05-12 | 2019-05-01 | Apple Inc. | SYNCHRONIZATION AND TASK DELEGATION OF A DIGITAL ASSISTANT |
US11436417B2 (en) * | 2017-05-15 | 2022-09-06 | Google Llc | Providing access to user-controlled resources by automated assistants |
DK201770411A1 (en) | 2017-05-15 | 2018-12-20 | Apple Inc. | MULTI-MODAL INTERFACES |
DK201770431A1 (en) | 2017-05-15 | 2018-12-20 | Apple Inc. | Optimizing dialogue policy decisions for digital assistants using implicit feedback |
DK201770432A1 (en) | 2017-05-15 | 2018-12-21 | Apple Inc. | Hierarchical belief states for digital assistants |
CN111343060B (zh) | 2017-05-16 | 2022-02-11 | 苹果公司 | Methods and interfaces for home media control |
US20180336892A1 (en) | 2017-05-16 | 2018-11-22 | Apple Inc. | Detecting a trigger of a digital assistant |
US10311144B2 (en) | 2017-05-16 | 2019-06-04 | Apple Inc. | Emoji word sense disambiguation |
US10403278B2 (en) | 2017-05-16 | 2019-09-03 | Apple Inc. | Methods and systems for phonetic matching in digital assistant services |
US20220279063A1 (en) | 2017-05-16 | 2022-09-01 | Apple Inc. | Methods and interfaces for home media control |
US20180336275A1 (en) | 2017-05-16 | 2018-11-22 | Apple Inc. | Intelligent automated assistant for media exploration |
DK179560B1 (en) | 2017-05-16 | 2019-02-18 | Apple Inc. | FAR-FIELD EXTENSION FOR DIGITAL ASSISTANT SERVICES |
US11056105B2 (en) | 2017-05-18 | 2021-07-06 | Aiqudo, Inc | Talk back from actions in applications |
US11043206B2 (en) | 2017-05-18 | 2021-06-22 | Aiqudo, Inc. | Systems and methods for crowdsourced actions and commands |
WO2018213788A1 (en) * | 2017-05-18 | 2018-11-22 | Aiqudo, Inc. | Systems and methods for crowdsourced actions and commands |
US11340925B2 (en) | 2017-05-18 | 2022-05-24 | Peloton Interactive Inc. | Action recipes for a crowdsourced digital assistant system |
US10657328B2 (en) | 2017-06-02 | 2020-05-19 | Apple Inc. | Multi-task recurrent neural network architecture for efficient morphology handling in neural language modeling |
US10607606B2 (en) | 2017-06-19 | 2020-03-31 | Lenovo (Singapore) Pte. Ltd. | Systems and methods for execution of digital assistant |
CN107393544B (zh) * | 2017-06-19 | 2019-03-05 | Vivo Mobile Communication Co., Ltd. | Voice signal repair method and mobile terminal |
US20190354557A1 (en) * | 2017-06-20 | 2019-11-21 | Tom Kornblit | System and Method For Providing Intelligent Customer Service |
KR102383430B1 (ko) * | 2017-06-21 | 2022-04-07 | Hyundai Motor Company | Apparatus for high-speed voice file processing, system including the same, and method thereof |
US10475449B2 (en) | 2017-08-07 | 2019-11-12 | Sonos, Inc. | Wake-word detection suppression |
JP6513749B2 (ja) | 2017-08-09 | 2019-05-15 | Lenovo (Singapore) Pte. Ltd. | Voice assist system, server device, voice assist method thereof, and program for execution by a computer |
US10048930B1 (en) | 2017-09-08 | 2018-08-14 | Sonos, Inc. | Dynamic computation of system response volume |
US10719507B2 (en) * | 2017-09-21 | 2020-07-21 | SayMosaic Inc. | System and method for natural language processing |
US10445429B2 (en) | 2017-09-21 | 2019-10-15 | Apple Inc. | Natural language understanding using vocabularies with compressed serialized tries |
US10452695B2 (en) * | 2017-09-22 | 2019-10-22 | Oracle International Corporation | Context-based virtual assistant implementation |
US10482868B2 (en) | 2017-09-28 | 2019-11-19 | Sonos, Inc. | Multi-channel acoustic echo cancellation |
US10051366B1 (en) | 2017-09-28 | 2018-08-14 | Sonos, Inc. | Three-dimensional beam forming with a microphone array |
US10755051B2 (en) | 2017-09-29 | 2020-08-25 | Apple Inc. | Rule-based natural language processing |
US10466962B2 (en) | 2017-09-29 | 2019-11-05 | Sonos, Inc. | Media playback system with voice assistance |
US10599645B2 (en) * | 2017-10-06 | 2020-03-24 | Soundhound, Inc. | Bidirectional probabilistic natural language rewriting and selection |
KR102445779B1 (ko) * | 2017-11-07 | 2022-09-21 | LG Uplus Corp. | Interactive service apparatus and control method of the interactive service apparatus |
US20190146491A1 (en) * | 2017-11-10 | 2019-05-16 | GM Global Technology Operations LLC | In-vehicle system to communicate with passengers |
US10636424B2 (en) | 2017-11-30 | 2020-04-28 | Apple Inc. | Multi-turn canned dialog |
US10922357B1 (en) | 2017-12-07 | 2021-02-16 | Amazon Technologies, Inc. | Automatically mapping natural language commands to service APIs |
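CN110021295B (zh) * | 2018-01-07 | 2023-12-08 | International Business Machines Corporation | Method and system for identifying erroneous transcriptions generated by a speech recognition system |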
US10733982B2 (en) | 2018-01-08 | 2020-08-04 | Apple Inc. | Multi-directional dialog |
US11410075B2 (en) | 2018-01-15 | 2022-08-09 | Microsoft Technology Licensing, Llc | Contextually-aware recommendations for assisting users with task completion |
US10733375B2 (en) | 2018-01-31 | 2020-08-04 | Apple Inc. | Knowledge-based framework for improving natural language understanding |
US10313514B1 (en) | 2018-02-21 | 2019-06-04 | Plantronics, Inc. | Device registry for mediating communication sessions |
WO2019168208A1 (ko) * | 2018-02-27 | 2019-09-06 | LG Electronics Inc. | Mobile terminal and control method thereof |
US10777217B2 (en) * | 2018-02-27 | 2020-09-15 | At&T Intellectual Property I, L.P. | Performance sensitive audio signal selection |
US10789959B2 (en) | 2018-03-02 | 2020-09-29 | Apple Inc. | Training speaker recognition models for digital assistants |
WO2019169591A1 (zh) * | 2018-03-07 | 2019-09-12 | Huawei Technologies Co., Ltd. | Voice interaction method and apparatus |
US10592604B2 (en) | 2018-03-12 | 2020-03-17 | Apple Inc. | Inverse text normalization for automatic speech recognition |
KR102617265B1 (ko) | 2018-03-13 | 2023-12-26 | Samsung Electronics Co., Ltd. | Apparatus for processing user voice input |
US10818288B2 (en) | 2018-03-26 | 2020-10-27 | Apple Inc. | Natural assistant interaction |
CN108520760B (zh) * | 2018-03-27 | 2020-07-24 | Vivo Mobile Communication Co., Ltd. | Voice signal processing method and terminal |
US10909331B2 (en) | 2018-03-30 | 2021-02-02 | Apple Inc. | Implicit identification of translation payload with neural machine translation |
CN108538291A (zh) * | 2018-04-11 | 2018-09-14 | Baidu Online Network Technology (Beijing) Co., Ltd. | Voice control method, terminal device, cloud server, and system |
US11256752B2 (en) * | 2018-05-02 | 2022-02-22 | Samsung Electronics Co., Ltd. | Contextual recommendation |
US11145294B2 (en) | 2018-05-07 | 2021-10-12 | Apple Inc. | Intelligent automated assistant for delivering content from user experiences |
US10928918B2 (en) | 2018-05-07 | 2021-02-23 | Apple Inc. | Raise to speak |
CN112868060B (zh) * | 2018-05-07 | 2024-07-12 | Google LLC | Multi-modal interaction between users, automated assistants, and other computing services |
US11175880B2 (en) | 2018-05-10 | 2021-11-16 | Sonos, Inc. | Systems and methods for voice-assisted media content selection |
US10984780B2 (en) | 2018-05-21 | 2021-04-20 | Apple Inc. | Global semantic word embeddings using bi-directional recurrent neural networks |
US10959029B2 (en) | 2018-05-25 | 2021-03-23 | Sonos, Inc. | Determining and adapting to changes in microphone performance of playback devices |
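CN108922537B (zh) * | 2018-05-28 | 2021-05-18 | Guangdong Oppo Mobile Telecommunications Corp., Ltd. | Audio recognition method, apparatus, terminal, earphone, and readable storage medium |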
US11386266B2 (en) | 2018-06-01 | 2022-07-12 | Apple Inc. | Text correction |
DK180639B1 (en) | 2018-06-01 | 2021-11-04 | Apple Inc | DISABILITY OF ATTENTION-ATTENTIVE VIRTUAL ASSISTANT |
DK201870355A1 (en) | 2018-06-01 | 2019-12-16 | Apple Inc. | VIRTUAL ASSISTANT OPERATION IN MULTI-DEVICE ENVIRONMENTS |
US10892996B2 (en) | 2018-06-01 | 2021-01-12 | Apple Inc. | Variable latency device coordination |
DK179822B1 (da) | 2018-06-01 | 2019-07-12 | Apple Inc. | Voice interaction at a primary device to access call functionality of a companion device |
US11076039B2 (en) | 2018-06-03 | 2021-07-27 | Apple Inc. | Accelerated task performance |
US10811009B2 (en) * | 2018-06-27 | 2020-10-20 | International Business Machines Corporation | Automatic skill routing in conversational computing frameworks |
CN108806688A (zh) * | 2018-07-16 | 2018-11-13 | Shenzhen TCL Digital Technology Co., Ltd. | Voice control method for a smart TV, smart TV, system, and storage medium |
EP3937030B1 (en) | 2018-08-07 | 2024-07-10 | Google LLC | Assembling and evaluating automated assistant responses for privacy concerns |
US11076035B2 (en) | 2018-08-28 | 2021-07-27 | Sonos, Inc. | Do not disturb feature for audio notifications |
US11024331B2 (en) | 2018-09-21 | 2021-06-01 | Sonos, Inc. | Voice detection optimization using sound metadata |
US10811015B2 (en) * | 2018-09-25 | 2020-10-20 | Sonos, Inc. | Voice detection optimization based on selected voice assistant service |
US11010561B2 (en) | 2018-09-27 | 2021-05-18 | Apple Inc. | Sentiment prediction from textual data |
US11170166B2 (en) | 2018-09-28 | 2021-11-09 | Apple Inc. | Neural typographical error modeling via generative adversarial networks |
US11100923B2 (en) | 2018-09-28 | 2021-08-24 | Sonos, Inc. | Systems and methods for selective wake word detection using neural network models |
US10839159B2 (en) | 2018-09-28 | 2020-11-17 | Apple Inc. | Named entity normalization in a spoken dialog system |
US11462215B2 (en) | 2018-09-28 | 2022-10-04 | Apple Inc. | Multi-modal inputs for voice commands |
US10692518B2 (en) | 2018-09-29 | 2020-06-23 | Sonos, Inc. | Linear filtering for noise-suppressed speech detection via multiple network microphone devices |
US10325597B1 (en) | 2018-10-08 | 2019-06-18 | Sorenson Ip Holdings, Llc | Transcription of communications |
US11899519B2 (en) | 2018-10-23 | 2024-02-13 | Sonos, Inc. | Multiple stage network microphone device with reduced power consumption and processing load |
US11475898B2 (en) | 2018-10-26 | 2022-10-18 | Apple Inc. | Low-latency multi-speaker speech recognition |
CN109068011A (zh) * | 2018-11-09 | 2018-12-21 | Changsha Longsheng Guangqi New Material Technology Co., Ltd. | Intelligent mobile terminal and control method thereof |
US10777186B1 (en) * | 2018-11-13 | 2020-09-15 | Amazon Technologies, Inc. | Streaming real-time automatic speech recognition service |
US10573312B1 (en) | 2018-12-04 | 2020-02-25 | Sorenson Ip Holdings, Llc | Transcription generation from multiple speech recognition systems |
US10388272B1 (en) | 2018-12-04 | 2019-08-20 | Sorenson Ip Holdings, Llc | Training speech recognition systems using word sequences |
US11017778B1 (en) | 2018-12-04 | 2021-05-25 | Sorenson Ip Holdings, Llc | Switching between speech recognition systems |
US11170761B2 (en) | 2018-12-04 | 2021-11-09 | Sorenson Ip Holdings, Llc | Training of speech recognition systems |
US11183183B2 (en) | 2018-12-07 | 2021-11-23 | Sonos, Inc. | Systems and methods of operating media playback systems having multiple voice assistant services |
US11132989B2 (en) | 2018-12-13 | 2021-09-28 | Sonos, Inc. | Networked microphone devices, systems, and methods of localized arbitration |
US10602268B1 (en) | 2018-12-20 | 2020-03-24 | Sonos, Inc. | Optimization of network microphone devices using noise classification |
CN109410923B (zh) * | 2018-12-26 | 2022-06-10 | China United Network Communications Group Co., Ltd. | Speech recognition method, apparatus, system, and storage medium |
US11638059B2 (en) | 2019-01-04 | 2023-04-25 | Apple Inc. | Content playback on multiple devices |
US11955120B1 (en) * | 2019-01-31 | 2024-04-09 | Alan AI, Inc. | Systems and methods for integrating voice controls into applications |
US11348573B2 (en) | 2019-03-18 | 2022-05-31 | Apple Inc. | Multimodality in digital assistant systems |
US11120794B2 (en) | 2019-05-03 | 2021-09-14 | Sonos, Inc. | Voice assistant persistence across multiple network microphone devices |
US11475884B2 (en) | 2019-05-06 | 2022-10-18 | Apple Inc. | Reducing digital assistant latency when a language is incorrectly determined |
DK201970509A1 (en) | 2019-05-06 | 2021-01-15 | Apple Inc | Spoken notifications |
US11423908B2 (en) | 2019-05-06 | 2022-08-23 | Apple Inc. | Interpreting spoken requests |
US11307752B2 (en) | 2019-05-06 | 2022-04-19 | Apple Inc. | User configurable task triggers |
US11140099B2 (en) | 2019-05-21 | 2021-10-05 | Apple Inc. | Providing message response suggestions |
US11289073B2 (en) | 2019-05-31 | 2022-03-29 | Apple Inc. | Device text to speech |
DK180129B1 (en) | 2019-05-31 | 2020-06-02 | Apple Inc. | USER ACTIVITY SHORTCUT SUGGESTIONS |
US11496600B2 (en) | 2019-05-31 | 2022-11-08 | Apple Inc. | Remote execution of machine-learned models |
DK201970510A1 (en) | 2019-05-31 | 2021-02-11 | Apple Inc | Voice identification in digital assistant systems |
US10996917B2 (en) | 2019-05-31 | 2021-05-04 | Apple Inc. | User interfaces for audio media control |
US11620103B2 (en) | 2019-05-31 | 2023-04-04 | Apple Inc. | User interfaces for audio media control |
US11360641B2 (en) | 2019-06-01 | 2022-06-14 | Apple Inc. | Increasing the relevance of new available information |
US11468890B2 (en) | 2019-06-01 | 2022-10-11 | Apple Inc. | Methods and user interfaces for voice-based control of electronic devices |
US11481094B2 (en) | 2019-06-01 | 2022-10-25 | Apple Inc. | User interfaces for location-related communications |
US11477609B2 (en) | 2019-06-01 | 2022-10-18 | Apple Inc. | User interfaces for location-related communications |
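CN112086096B (zh) * | 2019-06-14 | 2024-04-05 | Beijing Jingdong Shangke Information Technology Co., Ltd. | Data processing method, apparatus, system, and medium |
CN112242142B (zh) * | 2019-07-17 | 2024-01-30 | Beijing Sogou Technology Development Co., Ltd. | Speech recognition input method and related apparatus |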
US10871943B1 (en) | 2019-07-31 | 2020-12-22 | Sonos, Inc. | Noise classification for event detection |
US11488406B2 (en) | 2019-09-25 | 2022-11-01 | Apple Inc. | Text detection using global geometry estimators |
KR20210042520A (ko) * | 2019-10-10 | 2021-04-20 | Samsung Electronics Co., Ltd. | Electronic apparatus and control method thereof |
US11189286B2 (en) | 2019-10-22 | 2021-11-30 | Sonos, Inc. | VAS toggle based on device orientation |
US11200900B2 (en) | 2019-12-20 | 2021-12-14 | Sonos, Inc. | Offline voice control |
CN111143535B (zh) * | 2019-12-27 | 2021-08-10 | Beijing Baidu Netcom Science and Technology Co., Ltd. | Method and apparatus for generating a dialogue model |
US11562740B2 (en) | 2020-01-07 | 2023-01-24 | Sonos, Inc. | Voice verification for media playback |
US11556307B2 (en) | 2020-01-31 | 2023-01-17 | Sonos, Inc. | Local voice data processing |
US11308958B2 (en) | 2020-02-07 | 2022-04-19 | Sonos, Inc. | Localized wakeword verification |
EP4118538A4 (en) * | 2020-03-10 | 2024-03-20 | Meetkai, Inc. | PARALLEL HYPOTHETICAL CONCLUSION FOR OPERATION OF A MULTILINGUAL VIRTUAL ASSISTANT WITH MULTIPLE USES AND MULTIPLE DOMAIN |
US12045572B2 (en) | 2020-03-10 | 2024-07-23 | MeetKai, Inc. | System and method for handling out of scope or out of domain user inquiries |
KR20210130465A (ko) * | 2020-04-22 | 2021-11-01 | Hyundai Motor Company | Dialogue system and control method thereof |
US11038934B1 (en) | 2020-05-11 | 2021-06-15 | Apple Inc. | Digital assistant hardware abstraction |
US11810578B2 (en) | 2020-05-11 | 2023-11-07 | Apple Inc. | Device arbitration for digital assistant-based intercom systems |
US11061543B1 (en) | 2020-05-11 | 2021-07-13 | Apple Inc. | Providing relevant data items based on context |
US11755276B2 (en) | 2020-05-12 | 2023-09-12 | Apple Inc. | Reducing description length based on confidence |
US11308962B2 (en) | 2020-05-20 | 2022-04-19 | Sonos, Inc. | Input detection windowing |
US11482224B2 (en) | 2020-05-20 | 2022-10-25 | Sonos, Inc. | Command keywords with input detection windowing |
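CN111883105B (zh) * | 2020-07-15 | 2022-05-10 | AI Speech Co., Ltd. | Training method and system for a context information prediction model for video scenes |
CN111863009B (zh) * | 2020-07-15 | 2022-07-26 | AI Speech Co., Ltd. | Training method and system for a context information prediction model |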
US11490204B2 (en) | 2020-07-20 | 2022-11-01 | Apple Inc. | Multi-device audio adjustment coordination |
US11438683B2 (en) | 2020-07-21 | 2022-09-06 | Apple Inc. | User identification using headphones |
US11488604B2 (en) | 2020-08-19 | 2022-11-01 | Sorenson Ip Holdings, Llc | Transcription of audio |
US11829720B2 (en) | 2020-09-01 | 2023-11-28 | Apple Inc. | Analysis and validation of language models |
US11527237B1 (en) * | 2020-09-18 | 2022-12-13 | Amazon Technologies, Inc. | User-system dialog expansion |
US11392291B2 (en) | 2020-09-25 | 2022-07-19 | Apple Inc. | Methods and interfaces for media control with dynamic feedback |
US11984123B2 (en) | 2020-11-12 | 2024-05-14 | Sonos, Inc. | Network device interaction by range |
KR20220119219A (ko) * | 2021-02-19 | 2022-08-29 | Samsung Electronics Co., Ltd. | Electronic device and method for providing an on-device artificial intelligence service |
US11967306B2 (en) | 2021-04-14 | 2024-04-23 | Honeywell International Inc. | Contextual speech recognition methods and systems |
US11847378B2 (en) | 2021-06-06 | 2023-12-19 | Apple Inc. | User interfaces for audio routing |
US20230117535A1 (en) * | 2021-10-15 | 2023-04-20 | Samsung Electronics Co., Ltd. | Method and system for device feature analysis to improve user experience |
US20230245649A1 (en) * | 2022-02-03 | 2023-08-03 | Soundhound, Inc. | Token confidence scores for automatic speech recognition |
US12045561B2 (en) * | 2022-11-28 | 2024-07-23 | Theta Lake, Inc. | System and method for disambiguating data to improve analysis of electronic content |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5909666A (en) * | 1992-11-13 | 1999-06-01 | Dragon Systems, Inc. | Speech recognition system which creates acoustic models by concatenating acoustic models of individual words |
US6311157B1 (en) * | 1992-12-31 | 2001-10-30 | Apple Computer, Inc. | Assigning meanings to utterances in a speech recognition system |
CN1864204A (zh) * | 2002-09-06 | 2006-11-15 | Voice Signal Technologies, Inc. | Methods, systems, and programming for performing speech recognition |
US7315818B2 (en) * | 2000-05-02 | 2008-01-01 | Nuance Communications, Inc. | Error correction in speech recognition |
CN101183525A (zh) * | 2006-10-12 | 2008-05-21 | QNX Software Systems (Wavemakers), Inc. | Adaptive context for automatic speech recognition systems |
Family Cites Families (601)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US3828132A (en) | 1970-10-30 | 1974-08-06 | Bell Telephone Labor Inc | Speech synthesis by concatenation of formant encoded words |
US3704345A (en) | 1971-03-19 | 1972-11-28 | Bell Telephone Labor Inc | Conversion of printed text into synthetic speech |
US3979557A (en) | 1974-07-03 | 1976-09-07 | International Telephone And Telegraph Corporation | Speech processor system for pitch period extraction using prediction filters |
BG24190A1 (en) | 1976-09-08 | 1978-01-10 | Antonov | Method of synthesis of speech and device for effecting same |
JPS597120B2 (ja) | 1978-11-24 | 1984-02-16 | NEC Corporation | Speech analysis device |
US4310721A (en) | 1980-01-23 | 1982-01-12 | The United States Of America As Represented By The Secretary Of The Army | Half duplex integral vocoder modem system |
US4348553A (en) | 1980-07-02 | 1982-09-07 | International Business Machines Corporation | Parallel pattern verifier with dynamic time warping |
EP0411675B1 (en) | 1982-06-11 | 1995-09-20 | Mitsubishi Denki Kabushiki Kaisha | Interframe coding apparatus |
US4688195A (en) | 1983-01-28 | 1987-08-18 | Texas Instruments Incorporated | Natural-language interface generating system |
JPS603056A (ja) | 1983-06-21 | 1985-01-09 | Toshiba Corp | Information organizing device |
DE3335358A1 (de) | 1983-09-29 | 1985-04-11 | Siemens AG, 1000 Berlin und 8000 München | Method for determining speech spectra for automatic speech recognition and speech coding |
US5164900A (en) | 1983-11-14 | 1992-11-17 | Colman Bernath | Method and device for phonetically encoding Chinese textual data for data processing entry |
US4726065A (en) | 1984-01-26 | 1988-02-16 | Horst Froessl | Image manipulation by speech signals |
US4955047A (en) | 1984-03-26 | 1990-09-04 | Dytel Corporation | Automated attendant with direct inward system access |
US4811243A (en) | 1984-04-06 | 1989-03-07 | Racine Marsh V | Computer aided coordinate digitizing system |
US4692941A (en) | 1984-04-10 | 1987-09-08 | First Byte | Real-time text-to-speech conversion system |
US4783807A (en) | 1984-08-27 | 1988-11-08 | John Marley | System and method for sound recognition with feature selection synchronized to voice pitch |
US4718094A (en) | 1984-11-19 | 1988-01-05 | International Business Machines Corp. | Speech recognition system |
US5165007A (en) | 1985-02-01 | 1992-11-17 | International Business Machines Corporation | Feneme-based Markov models for words |
US4944013A (en) | 1985-04-03 | 1990-07-24 | British Telecommunications Public Limited Company | Multi-pulse speech coder |
US4833712A (en) | 1985-05-29 | 1989-05-23 | International Business Machines Corporation | Automatic generation of simple Markov model stunted baseforms for words in a vocabulary |
US4819271A (en) | 1985-05-29 | 1989-04-04 | International Business Machines Corporation | Constructing Markov model word baseforms from multiple utterances by concatenating model sequences for word segments |
EP0218859A3 (en) | 1985-10-11 | 1989-09-06 | International Business Machines Corporation | Signal processor communication interface |
US4776016A (en) | 1985-11-21 | 1988-10-04 | Position Orientation Systems, Inc. | Voice control system |
JPH0833744B2 (ja) | 1986-01-09 | 1996-03-29 | Toshiba Corporation | Speech synthesis device |
US4724542A (en) | 1986-01-22 | 1988-02-09 | International Business Machines Corporation | Automatic reference adaptation during dynamic signature verification |
US5759101A (en) | 1986-03-10 | 1998-06-02 | Response Reward Systems L.C. | Central and remote evaluation of responses of participatory broadcast audience with automatic crediting and couponing |
US5128752A (en) | 1986-03-10 | 1992-07-07 | Kohorn H Von | System and method for generating and redeeming tokens |
US5032989A (en) | 1986-03-19 | 1991-07-16 | Realpro, Ltd. | Real estate search and location system and method |
DE3779351D1 (zh) | 1986-03-28 | 1992-07-02 | American Telephone And Telegraph Co., New York, N.Y., Us | |
US4903305A (en) | 1986-05-12 | 1990-02-20 | Dragon Systems, Inc. | Method for representing word models for use in speech recognition |
ES2047494T3 (es) | 1986-10-03 | 1994-03-01 | British Telecomm | Language translation system. |
WO1988002975A1 (en) | 1986-10-16 | 1988-04-21 | Mitsubishi Denki Kabushiki Kaisha | Amplitude-adapted vector quantizer |
US4829576A (en) | 1986-10-21 | 1989-05-09 | Dragon Systems, Inc. | Voice recognition system |
US4852168A (en) | 1986-11-18 | 1989-07-25 | Sprague Richard P | Compression of stored waveforms for artificial speech |
US4727354A (en) | 1987-01-07 | 1988-02-23 | Unisys Corporation | System for selecting best fit vector code in vector quantization encoding |
US4827520A (en) | 1987-01-16 | 1989-05-02 | Prince Corporation | Voice actuated control system for use in a vehicle |
US4965763A (en) | 1987-03-03 | 1990-10-23 | International Business Machines Corporation | Computer method for automatic extraction of commonly specified information from business correspondence |
US5644727A (en) | 1987-04-15 | 1997-07-01 | Proprietary Financial Products, Inc. | System for the operation and management of one or more financial accounts through the use of a digital communication and computation system for exchange, investment and borrowing |
EP0293259A3 (en) | 1987-05-29 | 1990-03-07 | Kabushiki Kaisha Toshiba | Voice recognition system used in telephone apparatus |
DE3723078A1 (de) | 1987-07-11 | 1989-01-19 | Philips Patentverwaltung | Method for recognizing continuously spoken words |
US4974191A (en) | 1987-07-31 | 1990-11-27 | Syntellect Software Inc. | Adaptive natural language computer interface system |
CA1288516C (en) | 1987-07-31 | 1991-09-03 | Leendert M. Bijnagte | Apparatus and method for communicating textual and image information between a host computer and a remote display terminal |
US5022081A (en) | 1987-10-01 | 1991-06-04 | Sharp Kabushiki Kaisha | Information recognition system |
US4852173A (en) | 1987-10-29 | 1989-07-25 | International Business Machines Corporation | Design and construction of a binary-tree system for language modelling |
EP0314908B1 (en) | 1987-10-30 | 1992-12-02 | International Business Machines Corporation | Automatic determination of labels and markov word models in a speech recognition system |
US5072452A (en) | 1987-10-30 | 1991-12-10 | International Business Machines Corporation | Automatic determination of labels and Markov word models in a speech recognition system |
US4914586A (en) | 1987-11-06 | 1990-04-03 | Xerox Corporation | Garbage collector for hypermedia systems |
US4992972A (en) | 1987-11-18 | 1991-02-12 | International Business Machines Corporation | Flexible context searchable on-line information system with help files and modules for on-line computer system documentation |
US5220657A (en) | 1987-12-02 | 1993-06-15 | Xerox Corporation | Updating local copy of shared data in a collaborative system |
US4984177A (en) | 1988-02-05 | 1991-01-08 | Advanced Products And Technologies, Inc. | Voice language translator |
CA1333420C (en) | 1988-02-29 | 1994-12-06 | Tokumichi Murakami | Vector quantizer |
US4914590A (en) | 1988-05-18 | 1990-04-03 | Emhart Industries, Inc. | Natural language understanding system |
FR2636163B1 (fr) | 1988-09-02 | 1991-07-05 | Hamon Christian | Method and device for speech synthesis by overlap-add of waveforms |
US4839853A (en) | 1988-09-15 | 1989-06-13 | Bell Communications Research, Inc. | Computer information retrieval using latent semantic structure |
JPH0293597A (ja) | 1988-09-30 | 1990-04-04 | Nippon I B M Kk | Speech recognition device |
US4905163A (en) | 1988-10-03 | 1990-02-27 | Minnesota Mining & Manufacturing Company | Intelligent optical navigator dynamic information presentation and navigation system |
US5282265A (en) | 1988-10-04 | 1994-01-25 | Canon Kabushiki Kaisha | Knowledge information processing system |
DE3837590A1 (de) | 1988-11-05 | 1990-05-10 | Ant Nachrichtentech | Method for reducing the data rate of digital image data |
ATE102731T1 (de) | 1988-11-23 | 1994-03-15 | Digital Equipment Corp | Name pronunciation by a synthesizer. |
US5027406A (en) | 1988-12-06 | 1991-06-25 | Dragon Systems, Inc. | Method for interactive speech recognition and training |
US5127055A (en) | 1988-12-30 | 1992-06-30 | Kurzweil Applied Intelligence, Inc. | Speech recognition apparatus & method having dynamic reference pattern adaptation |
US5293448A (en) | 1989-10-02 | 1994-03-08 | Nippon Telegraph And Telephone Corporation | Speech analysis-synthesis method and apparatus therefor |
US5047614A (en) | 1989-01-23 | 1991-09-10 | Bianco James S | Method and apparatus for computer-aided shopping |
SE466029B (sv) | 1989-03-06 | 1991-12-02 | Ibm Svenska Ab | Device and method for analysis of natural language in a computer-based information processing system |
JPH0782544B2 (ja) | 1989-03-24 | 1995-09-06 | International Business Machines Corporation | DP matching method and apparatus using multiple templates |
US4977598A (en) | 1989-04-13 | 1990-12-11 | Texas Instruments Incorporated | Efficient pruning algorithm for hidden markov model speech recognition |
US5197005A (en) | 1989-05-01 | 1993-03-23 | Intelligent Business Systems | Database retrieval system having a natural language interface |
US5010574A (en) | 1989-06-13 | 1991-04-23 | At&T Bell Laboratories | Vector quantizer search arrangement |
JP2940005B2 (ja) | 1989-07-20 | 1999-08-25 | NEC Corporation | Speech coding device |
US5091945A (en) | 1989-09-28 | 1992-02-25 | At&T Bell Laboratories | Source dependent channel coding with error protection |
CA2027705C (en) | 1989-10-17 | 1994-02-15 | Masami Akamine | Speech coding system utilizing a recursive computation technique for improvement in processing speed |
US5020112A (en) | 1989-10-31 | 1991-05-28 | At&T Bell Laboratories | Image recognition method using two-dimensional stochastic grammars |
US5220639A (en) | 1989-12-01 | 1993-06-15 | National Science Council | Mandarin speech input method for Chinese computers and a mandarin speech recognition machine |
US5021971A (en) | 1989-12-07 | 1991-06-04 | Unisys Corporation | Reflective binary encoder for vector quantization |
US5179652A (en) | 1989-12-13 | 1993-01-12 | Anthony I. Rozmanith | Method and apparatus for storing, transmitting and retrieving graphical and tabular data |
CH681573A5 (en) | 1990-02-13 | 1993-04-15 | Astral | Automatic teller arrangement involving bank computers - is operated by user data card carrying personal data, account information and transaction records |
US5208862A (en) | 1990-02-22 | 1993-05-04 | Nec Corporation | Speech coder |
US5301109A (en) | 1990-06-11 | 1994-04-05 | Bell Communications Research, Inc. | Computerized cross-language document retrieval using latent semantic indexing |
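JP3266246B2 (ja) | 1990-06-15 | 2002-03-18 | International Business Machines Corporation | Natural language analysis apparatus and method, and method for constructing a knowledge base for natural language analysis |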
US5202952A (en) | 1990-06-22 | 1993-04-13 | Dragon Systems, Inc. | Large-vocabulary continuous speech prefiltering and processing system |
GB9017600D0 (en) | 1990-08-10 | 1990-09-26 | British Aerospace | An assembly and method for binary tree-searched vector quantisation data compression processing |
US5309359A (en) | 1990-08-16 | 1994-05-03 | Boris Katz | Method and apparatus for generating and utilizing annotations to facilitate computer text retrieval |
US5404295A (en) | 1990-08-16 | 1995-04-04 | Katz; Boris | Method and apparatus for utilizing annotations to facilitate computer retrieval of database material |
US5297170A (en) | 1990-08-21 | 1994-03-22 | Codex Corporation | Lattice and trellis-coded quantization |
US5400434A (en) | 1990-09-04 | 1995-03-21 | Matsushita Electric Industrial Co., Ltd. | Voice source for synthetic speech system |
US5216747A (en) | 1990-09-20 | 1993-06-01 | Digital Voice Systems, Inc. | Voiced/unvoiced estimation of an acoustic signal |
US5128672A (en) | 1990-10-30 | 1992-07-07 | Apple Computer, Inc. | Dynamic predictive keyboard |
US5325298A (en) | 1990-11-07 | 1994-06-28 | Hnc, Inc. | Methods for generating or revising context vectors for a plurality of word stems |
US5317507A (en) | 1990-11-07 | 1994-05-31 | Gallant Stephen I | Method for document retrieval and for word sense disambiguation using neural networks |
US5247579A (en) | 1990-12-05 | 1993-09-21 | Digital Voice Systems, Inc. | Methods for speech transmission |
US5345536A (en) | 1990-12-21 | 1994-09-06 | Matsushita Electric Industrial Co., Ltd. | Method of speech recognition |
US5127053A (en) | 1990-12-24 | 1992-06-30 | General Electric Company | Low-complexity method for improving the performance of autocorrelation-based pitch detectors |
US5133011A (en) | 1990-12-26 | 1992-07-21 | International Business Machines Corporation | Method and apparatus for linear vocal control of cursor position |
US5268990A (en) | 1991-01-31 | 1993-12-07 | Sri International | Method for recognizing speech using linguistically-motivated hidden Markov models |
GB9105367D0 (en) | 1991-03-13 | 1991-04-24 | Univ Strathclyde | Computerised information-retrieval database systems |
US5303406A (en) | 1991-04-29 | 1994-04-12 | Motorola, Inc. | Noise squelch circuit with adaptive noise shaping |
US5475587A (en) | 1991-06-28 | 1995-12-12 | Digital Equipment Corporation | Method and apparatus for efficient morphological text analysis using a high-level language for compact specification of inflectional paradigms |
US5293452A (en) | 1991-07-01 | 1994-03-08 | Texas Instruments Incorporated | Voice log-in using spoken name input |
US5687077A (en) | 1991-07-31 | 1997-11-11 | Universal Dynamics Limited | Method and apparatus for adaptive control |
US5199077A (en) | 1991-09-19 | 1993-03-30 | Xerox Corporation | Wordspotting for voice editing and indexing |
JP2662120B2 (ja) | 1991-10-01 | 1997-10-08 | International Business Machines Corporation | Speech recognition apparatus and processing unit for speech recognition |
US5222146A (en) | 1991-10-23 | 1993-06-22 | International Business Machines Corporation | Speech recognition apparatus having a speech coder outputting acoustic prototype ranks |
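KR940002854B1 (ko) | 1991-11-06 | 1994-04-04 | Korea Telecommunication Authority | Speech segment coding and pitch control method for a speech synthesis system, and voiced sound synthesis device thereof |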
US5386494A (en) | 1991-12-06 | 1995-01-31 | Apple Computer, Inc. | Method and apparatus for controlling a speech recognition function using a cursor control device |
US6081750A (en) | 1991-12-23 | 2000-06-27 | Hoffberg; Steven Mark | Ergonomic man-machine interface incorporating adaptive pattern recognition based control system |
US5903454A (en) | 1991-12-23 | 1999-05-11 | Hoffberg; Linda Irene | Human-factored interface incorporating adaptive pattern recognition based controller apparatus |
US5502790A (en) | 1991-12-24 | 1996-03-26 | Oki Electric Industry Co., Ltd. | Speech recognition method and system using triphones, diphones, and phonemes |
US5349645A (en) | 1991-12-31 | 1994-09-20 | Matsushita Electric Industrial Co., Ltd. | Word hypothesizer for continuous speech decoding using stressed-vowel centered bidirectional tree searches |
US5267345A (en) | 1992-02-10 | 1993-11-30 | International Business Machines Corporation | Speech recognition apparatus which predicts word classes from context and words from word classes |
DE69322894T2 (de) | 1992-03-02 | 1999-07-29 | At & T Corp., New York, N.Y. | Training method and apparatus for speech recognition |
US6055514A (en) | 1992-03-20 | 2000-04-25 | Wren; Stephen Corey | System for marketing foods and services utilizing computerized central and remote facilities |
US5317647A (en) | 1992-04-07 | 1994-05-31 | Apple Computer, Inc. | Constrained attribute grammars for syntactic pattern recognition |
US5412804A (en) | 1992-04-30 | 1995-05-02 | Oracle Corporation | Extending the semantics of the outer join operator for un-nesting queries to a data base |
US5293584A (en) | 1992-05-21 | 1994-03-08 | International Business Machines Corporation | Speech recognition system for natural language translation |
US5390281A (en) | 1992-05-27 | 1995-02-14 | Apple Computer, Inc. | Method and apparatus for deducing user intent and providing computer implemented services |
US5434777A (en) | 1992-05-27 | 1995-07-18 | Apple Computer, Inc. | Method and apparatus for processing natural language |
US5734789A (en) | 1992-06-01 | 1998-03-31 | Hughes Electronics | Voiced, unvoiced or noise modes in a CELP vocoder |
US5333275A (en) | 1992-06-23 | 1994-07-26 | Wheatley Barbara J | System and method for time aligning speech |
US5325297A (en) | 1992-06-25 | 1994-06-28 | System Of Multiple-Colored Images For Internationally Listed Estates, Inc. | Computer implemented method and system for storing and retrieving textual data and compressed image data |
US5999908A (en) | 1992-08-06 | 1999-12-07 | Abelow; Daniel H. | Customer-based product design module |
US5412806A (en) | 1992-08-20 | 1995-05-02 | Hewlett-Packard Company | Calibration of logical cost formulae for queries in a heterogeneous DBMS using synthetic database |
GB9220404D0 (en) | 1992-08-20 | 1992-11-11 | Nat Security Agency | Method of identifying,retrieving and sorting documents |
US5333236A (en) | 1992-09-10 | 1994-07-26 | International Business Machines Corporation | Speech recognizer having a speech coder for an acoustic match based on context-dependent speech-transition acoustic models |
US5384893A (en) | 1992-09-23 | 1995-01-24 | Emerson & Stern Associates, Inc. | Method and apparatus for speech synthesis based on prosodic analysis |
FR2696036B1 (fr) | 1992-09-24 | 1994-10-14 | France Telecom | Method for measuring similarity between sound samples and device for implementing this method. |
JPH0772840B2 (ja) | 1992-09-29 | 1995-08-02 | 日本アイ・ビー・エム株式会社 | Method for constructing speech models, speech recognition method, speech recognition apparatus, and method for training speech models |
US5758313A (en) | 1992-10-16 | 1998-05-26 | Mobile Information Systems, Inc. | Method and apparatus for tracking vehicle location |
US6092043A (en) * | 1992-11-13 | 2000-07-18 | Dragon Systems, Inc. | Apparatuses and method for training and operating speech recognition systems |
US5455888A (en) | 1992-12-04 | 1995-10-03 | Northern Telecom Limited | Speech bandwidth extension method and apparatus |
US5412756A (en) | 1992-12-22 | 1995-05-02 | Mitsubishi Denki Kabushiki Kaisha | Artificial intelligence software shell for plant operation simulation |
US5734791A (en) | 1992-12-31 | 1998-03-31 | Apple Computer, Inc. | Rapid tree-based method for vector quantization |
US5384892A (en) | 1992-12-31 | 1995-01-24 | Apple Computer, Inc. | Dynamic language model for speech recognition |
US5613036A (en) | 1992-12-31 | 1997-03-18 | Apple Computer, Inc. | Dynamic categories for a speech recognition system |
US5390279A (en) | 1992-12-31 | 1995-02-14 | Apple Computer, Inc. | Partitioning speech rules by context for speech recognition |
US6122616A (en) | 1993-01-21 | 2000-09-19 | Apple Computer, Inc. | Method and apparatus for diphone aliasing |
US5864844A (en) | 1993-02-18 | 1999-01-26 | Apple Computer, Inc. | System and method for enhancing a user interface with a computer based training tool |
CA2091658A1 (en) | 1993-03-15 | 1994-09-16 | Matthew Lennig | Method and apparatus for automation of directory assistance using speech recognition |
US6055531A (en) | 1993-03-24 | 2000-04-25 | Engate Incorporated | Down-line transcription system having context sensitive searching capability |
US5536902A (en) | 1993-04-14 | 1996-07-16 | Yamaha Corporation | Method of and apparatus for analyzing and synthesizing a sound by extracting and controlling a sound parameter |
US5444823A (en) | 1993-04-16 | 1995-08-22 | Compaq Computer Corporation | Intelligent search engine for associated on-line documentation having questionless case-based knowledge base |
US5574823A (en) | 1993-06-23 | 1996-11-12 | Her Majesty The Queen In Right Of Canada As Represented By The Minister Of Communications | Frequency selective harmonic coding |
US5515475A (en) | 1993-06-24 | 1996-05-07 | Northern Telecom Limited | Speech recognition method using a two-pass search |
JPH0756933A (ja) | 1993-06-24 | 1995-03-03 | Xerox Corp | Document retrieval method |
JP3685812B2 (ja) | 1993-06-29 | 2005-08-24 | ソニー株式会社 | Audio signal transmitting and receiving apparatus |
US5794207A (en) | 1996-09-04 | 1998-08-11 | Walker Asset Management Limited Partnership | Method and apparatus for a cryptographically assisted commercial network system designed to facilitate buyer-driven conditional purchase offers |
US5495604A (en) | 1993-08-25 | 1996-02-27 | Asymetrix Corporation | Method and apparatus for the modeling and query of database structures using natural language-like constructs |
US5619694A (en) | 1993-08-26 | 1997-04-08 | Nec Corporation | Case database storage/retrieval system |
US5940811A (en) | 1993-08-27 | 1999-08-17 | Affinity Technology Group, Inc. | Closed loop financial transaction method and apparatus |
US5377258A (en) | 1993-08-30 | 1994-12-27 | National Medical Research Council | Method and apparatus for an automated and interactive behavioral guidance system |
US5873056A (en) | 1993-10-12 | 1999-02-16 | The Syracuse University | Natural language processing system for semantic vector representation which accounts for lexical ambiguity |
US5578808A (en) | 1993-12-22 | 1996-11-26 | Datamark Services, Inc. | Data card that can be used for transactions involving separate card issuers |
CA2179523A1 (en) | 1993-12-23 | 1995-06-29 | David A. Boulton | Method and apparatus for implementing user feedback |
US5621859A (en) | 1994-01-19 | 1997-04-15 | Bbn Corporation | Single tree method for grammar directed, very large vocabulary speech recognizer |
US5584024A (en) | 1994-03-24 | 1996-12-10 | Software Ag | Interactive database query system and method for prohibiting the selection of semantically incorrect query parameters |
US5642519A (en) | 1994-04-29 | 1997-06-24 | Sun Microsystems, Inc. | Speech interpreter with a unified grammer compiler |
EP0684607B1 (en) | 1994-05-25 | 2001-03-14 | Victor Company Of Japan, Limited | Variable transfer rate data reproduction apparatus |
US5493677A (en) | 1994-06-08 | 1996-02-20 | Systems Research & Applications Corporation | Generation, archiving, and retrieval of digital images with evoked suggestion-set captions and natural language interface |
US5675819A (en) | 1994-06-16 | 1997-10-07 | Xerox Corporation | Document information retrieval using global word co-occurrence patterns |
JPH0869470A (ja) | 1994-06-21 | 1996-03-12 | Canon Inc | Natural language processing apparatus and method |
US5948040A (en) | 1994-06-24 | 1999-09-07 | Delorme Publishing Co. | Travel reservation information and planning system |
JP3586777B2 (ja) * | 1994-08-17 | 2004-11-10 | 富士通株式会社 | Voice input device |
US5682539A (en) | 1994-09-29 | 1997-10-28 | Conrad; Donovan | Anticipated meaning natural language interface |
US5715468A (en) | 1994-09-30 | 1998-02-03 | Budzinski; Robert Lucius | Memory system for storing and retrieving experience and knowledge with natural language |
GB2293667B (en) | 1994-09-30 | 1998-05-27 | Intermation Limited | Database management system |
US5845255A (en) | 1994-10-28 | 1998-12-01 | Advanced Health Med-E-Systems Corporation | Prescription management system |
US5577241A (en) | 1994-12-07 | 1996-11-19 | Excite, Inc. | Information retrieval system and method with implementation extensible query architecture |
US5748974A (en) | 1994-12-13 | 1998-05-05 | International Business Machines Corporation | Multimodal natural language interface for cross-application tasks |
US5794050A (en) | 1995-01-04 | 1998-08-11 | Intelligent Text Processing, Inc. | Natural language understanding system |
EP1526472A3 (en) | 1995-02-13 | 2006-07-26 | Intertrust Technologies Corp. | Systems and methods for secure transaction management and electronic rights protection |
US5701400A (en) | 1995-03-08 | 1997-12-23 | Amado; Carlos Armando | Method and apparatus for applying if-then-else rules to data sets in a relational data base and generating from the results of application of said rules a database of diagnostics linked to said data sets to aid executive analysis of financial data |
US5749081A (en) | 1995-04-06 | 1998-05-05 | Firefly Network, Inc. | System and method for recommending items to a user |
US5642464A (en) | 1995-05-03 | 1997-06-24 | Northern Telecom Limited | Methods and apparatus for noise conditioning in digital speech compression systems using linear predictive coding |
US5664055A (en) | 1995-06-07 | 1997-09-02 | Lucent Technologies Inc. | CS-ACELP speech compression system with adaptive pitch prediction filter gain based on a measure of periodicity |
US5710886A (en) | 1995-06-16 | 1998-01-20 | Sellectsoft, L.C. | Electric couponing method and apparatus |
JP3284832B2 (ja) | 1995-06-22 | 2002-05-20 | セイコーエプソン株式会社 | Speech recognition dialogue processing method and speech recognition dialogue apparatus |
US6038533A (en) | 1995-07-07 | 2000-03-14 | Lucent Technologies Inc. | System and method for selecting training text |
US6026388A (en) | 1995-08-16 | 2000-02-15 | Textwise, Llc | User interface and other enhancements for natural language information retrieval system and method |
JP3697748B2 (ja) | 1995-08-21 | 2005-09-21 | セイコーエプソン株式会社 | Terminal and speech recognition apparatus |
US5712957A (en) | 1995-09-08 | 1998-01-27 | Carnegie Mellon University | Locating and correcting erroneously recognized portions of utterances by rescoring based on two n-best lists |
US5790978A (en) | 1995-09-15 | 1998-08-04 | Lucent Technologies, Inc. | System and method for determining pitch contours |
US5737734A (en) | 1995-09-15 | 1998-04-07 | Infonautics Corporation | Query word relevance adjustment in a search of an information retrieval system |
US6173261B1 (en) | 1998-09-30 | 2001-01-09 | At&T Corp | Grammar fragment acquisition using syntactic and semantic clustering |
US5884323A (en) | 1995-10-13 | 1999-03-16 | 3Com Corporation | Extendible method and apparatus for synchronizing files on two different computer systems |
US5799276A (en) | 1995-11-07 | 1998-08-25 | Accent Incorporated | Knowledge-based speech recognition system and methods having frame length computed based upon estimated pitch period of vocalic intervals |
US5794237A (en) | 1995-11-13 | 1998-08-11 | International Business Machines Corporation | System and method for improving problem source identification in computer systems employing relevance feedback and statistical source ranking |
US6064959A (en) * | 1997-03-28 | 2000-05-16 | Dragon Systems, Inc. | Error correction in speech recognition |
US5706442A (en) | 1995-12-20 | 1998-01-06 | Block Financial Corporation | System for on-line financial services using distributed objects |
US6119101A (en) | 1996-01-17 | 2000-09-12 | Personal Agents, Inc. | Intelligent agents for electronic commerce |
US6125356A (en) | 1996-01-18 | 2000-09-26 | Rosefaire Development, Ltd. | Portable sales presentation system with selective scripted seller prompts |
US5987404A (en) | 1996-01-29 | 1999-11-16 | International Business Machines Corporation | Statistical natural language understanding using hidden clumpings |
US5729694A (en) | 1996-02-06 | 1998-03-17 | The Regents Of The University Of California | Speech coding, reconstruction and recognition using acoustics and electromagnetic waves |
US6076088A (en) | 1996-02-09 | 2000-06-13 | Paik; Woojin | Information extraction system and method using concept relation concept (CRC) triples |
US5835893A (en) | 1996-02-15 | 1998-11-10 | Atr Interpreting Telecommunications Research Labs | Class-based word clustering for speech recognition using a three-level balanced hierarchical similarity |
US5901287A (en) | 1996-04-01 | 1999-05-04 | The Sabre Group Inc. | Information aggregation and synthesization system |
US5867799A (en) | 1996-04-04 | 1999-02-02 | Lang; Andrew K. | Information system and method for filtering a massive flow of information entities to meet user information classification needs |
US5987140A (en) | 1996-04-26 | 1999-11-16 | Verifone, Inc. | System, method and article of manufacture for secure network electronic payment and credit collection |
US5963924A (en) | 1996-04-26 | 1999-10-05 | Verifone, Inc. | System, method and article of manufacture for the use of payment instrument holders and payment instruments in network electronic commerce |
US5913193A (en) | 1996-04-30 | 1999-06-15 | Microsoft Corporation | Method and system of runtime acoustic unit selection for speech synthesis |
US5857184A (en) | 1996-05-03 | 1999-01-05 | Walden Media, Inc. | Language and method for creating, organizing, and retrieving data from a database |
FR2748342B1 (fr) | 1996-05-06 | 1998-07-17 | France Telecom | Method and device for equalization filtering of a speech signal, using a statistical model of that signal |
US5828999A (en) | 1996-05-06 | 1998-10-27 | Apple Computer, Inc. | Method and system for deriving a large-span semantic language model for large-vocabulary recognition systems |
US5826261A (en) | 1996-05-10 | 1998-10-20 | Spencer; Graham | System and method for querying multiple, distributed databases by selective sharing of local relative significance information for terms related to the query |
US6366883B1 (en) | 1996-05-15 | 2002-04-02 | Atr Interpreting Telecommunications | Concatenation of speech segments by use of a speech synthesizer |
US5727950A (en) | 1996-05-22 | 1998-03-17 | Netsage Corporation | Agent based instruction system and method |
US5966533A (en) | 1996-06-11 | 1999-10-12 | Excite, Inc. | Method and system for dynamically synthesizing a computer program by differentially resolving atoms based on user context data |
US5915249A (en) | 1996-06-14 | 1999-06-22 | Excite, Inc. | System and method for accelerated query evaluation of very large full-text databases |
US5987132A (en) | 1996-06-17 | 1999-11-16 | Verifone, Inc. | System, method and article of manufacture for conditionally accepting a payment method utilizing an extensible, flexible architecture |
US5825881A (en) | 1996-06-28 | 1998-10-20 | Allsoft Distributing Inc. | Public network merchandising system |
US6070147A (en) | 1996-07-02 | 2000-05-30 | Tecmark Services, Inc. | Customer identification and marketing analysis systems |
EP0912954B8 (en) | 1996-07-22 | 2006-06-14 | Cyva Research Corporation | Personal information security and exchange tool |
US5862223A (en) | 1996-07-24 | 1999-01-19 | Walker Asset Management Limited Partnership | Method and apparatus for a cryptographically-assisted commercial network system designed to facilitate and support expert-based commerce |
EP0829811A1 (en) | 1996-09-11 | 1998-03-18 | Nippon Telegraph And Telephone Corporation | Method and system for information retrieval |
US6181935B1 (en) | 1996-09-27 | 2001-01-30 | Software.Com, Inc. | Mobility extended telephone application programming interface and method of use |
US5794182A (en) | 1996-09-30 | 1998-08-11 | Apple Computer, Inc. | Linear predictive speech encoding systems with efficient combination pitch coefficients computation |
US5721827A (en) | 1996-10-02 | 1998-02-24 | James Logan | System for electrically distributing personalized information |
US5913203A (en) | 1996-10-03 | 1999-06-15 | Jaesent Inc. | System and method for pseudo cash transactions |
US5930769A (en) | 1996-10-07 | 1999-07-27 | Rose; Andrea | System and method for fashion shopping |
US5836771A (en) | 1996-12-02 | 1998-11-17 | Ho; Chi Fai | Learning method and system based on questioning |
US6665639B2 (en) | 1996-12-06 | 2003-12-16 | Sensory, Inc. | Speech recognition in consumer electronic products |
US6078914A (en) | 1996-12-09 | 2000-06-20 | Open Text Corporation | Natural language meta-search system and method |
US5839106A (en) | 1996-12-17 | 1998-11-17 | Apple Computer, Inc. | Large-vocabulary speech recognition using an integrated syntactic and semantic statistical language model |
US5966126A (en) | 1996-12-23 | 1999-10-12 | Szabo; Andrew J. | Graphic user interface for database system |
US5932869A (en) | 1996-12-27 | 1999-08-03 | Graphic Technology, Inc. | Promotional system with magnetic stripe and visual thermo-reversible print surfaced medium |
JP3579204B2 (ja) | 1997-01-17 | 2004-10-20 | 富士通株式会社 | Document summarization apparatus and method |
US5941944A (en) | 1997-03-03 | 1999-08-24 | Microsoft Corporation | Method for providing a substitute for a requested inaccessible object by identifying substantially similar objects using weights corresponding to object features |
US6076051A (en) | 1997-03-07 | 2000-06-13 | Microsoft Corporation | Information retrieval utilizing semantic representation of text |
US5930801A (en) | 1997-03-07 | 1999-07-27 | Xerox Corporation | Shared-data environment in which each file has independent security properties |
US5822743A (en) | 1997-04-08 | 1998-10-13 | 1215627 Ontario Inc. | Knowledge-based information retrieval system |
US5970474A (en) | 1997-04-24 | 1999-10-19 | Sears, Roebuck And Co. | Registry information system for shoppers |
US5895464A (en) | 1997-04-30 | 1999-04-20 | Eastman Kodak Company | Computer program product and a method for using natural language for the description, search and retrieval of multi-media objects |
US6138098A (en) * | 1997-06-30 | 2000-10-24 | Lernout & Hauspie Speech Products N.V. | Command parsing and rewrite system |
US5860063A (en) | 1997-07-11 | 1999-01-12 | At&T Corp | Automated meaningful phrase clustering |
US5933822A (en) | 1997-07-22 | 1999-08-03 | Microsoft Corporation | Apparatus and methods for an information retrieval system that employs natural language processing of search results to improve overall precision |
US5974146A (en) | 1997-07-30 | 1999-10-26 | Huntington Bancshares Incorporated | Real time bank-centric universal payment system |
US5895466A (en) | 1997-08-19 | 1999-04-20 | At&T Corp | Automated natural language understanding customer service system |
US6081774A (en) | 1997-08-22 | 2000-06-27 | Novell, Inc. | Natural language information retrieval system and method |
US6404876B1 (en) | 1997-09-25 | 2002-06-11 | Gte Intelligent Network Services Incorporated | System and method for voice activated dialing and routing under open access network control |
US6023684A (en) | 1997-10-01 | 2000-02-08 | Security First Technologies, Inc. | Three tier financial transaction system with cache memory |
EP0911808B1 (en) | 1997-10-23 | 2002-05-08 | Sony International (Europe) GmbH | Speech interface in a home network environment |
US6108627A (en) | 1997-10-31 | 2000-08-22 | Nortel Networks Corporation | Automatic transcription tool |
US5943670A (en) | 1997-11-21 | 1999-08-24 | International Business Machines Corporation | System and method for categorizing objects in combined categories |
US5960422A (en) | 1997-11-26 | 1999-09-28 | International Business Machines Corporation | System and method for optimized source selection in an information retrieval system |
US6026375A (en) | 1997-12-05 | 2000-02-15 | Nortel Networks Corporation | Method and apparatus for processing orders from customers in a mobile environment |
US6064960A (en) | 1997-12-18 | 2000-05-16 | Apple Computer, Inc. | Method and apparatus for improved duration modeling of phonemes |
US6094649A (en) | 1997-12-22 | 2000-07-25 | Partnet, Inc. | Keyword searches of structured databases |
US6173287B1 (en) | 1998-03-11 | 2001-01-09 | Digital Equipment Corporation | Technique for ranking multimedia annotations of interest |
US6195641B1 (en) | 1998-03-27 | 2001-02-27 | International Business Machines Corp. | Network universal spoken language vocabulary |
US6026393A (en) | 1998-03-31 | 2000-02-15 | Casebank Technologies Inc. | Configuration knowledge as an aid to case retrieval |
US6233559B1 (en) | 1998-04-01 | 2001-05-15 | Motorola, Inc. | Speech control of multiple applications using applets |
US6173279B1 (en) | 1998-04-09 | 2001-01-09 | At&T Corp. | Method of using a natural language interface to retrieve information from one or more data resources |
US6088731A (en) | 1998-04-24 | 2000-07-11 | Associative Computing, Inc. | Intelligent assistant for use with a local computer and with the internet |
US6016471A (en) | 1998-04-29 | 2000-01-18 | Matsushita Electric Industrial Co., Ltd. | Method and apparatus using decision trees to generate and score multiple pronunciations for a spelled word |
US6029132A (en) | 1998-04-30 | 2000-02-22 | Matsushita Electric Industrial Co. | Method for letter-to-sound in text-to-speech synthesis |
US6285786B1 (en) | 1998-04-30 | 2001-09-04 | Motorola, Inc. | Text recognizer and method using non-cumulative character scoring in a forward search |
US6144938A (en) | 1998-05-01 | 2000-11-07 | Sun Microsystems, Inc. | Voice user interface with personality |
US6778970B2 (en) | 1998-05-28 | 2004-08-17 | Lawrence Au | Topological methods to organize semantic network data flows for conversational applications |
US20070094224A1 (en) | 1998-05-28 | 2007-04-26 | Lawrence Au | Method and system for determining contextual meaning for network search applications |
US7711672B2 (en) | 1998-05-28 | 2010-05-04 | Lawrence Au | Semantic network methods to disambiguate natural language meaning |
US6144958A (en) | 1998-07-15 | 2000-11-07 | Amazon.Com, Inc. | System and method for correcting spelling errors in search queries |
US6105865A (en) | 1998-07-17 | 2000-08-22 | Hardesty; Laurence Daniel | Financial transaction system with retirement saving benefit |
US6499013B1 (en) | 1998-09-09 | 2002-12-24 | One Voice Technologies, Inc. | Interactive user interface using speech recognition and natural language processing |
US6434524B1 (en) | 1998-09-09 | 2002-08-13 | One Voice Technologies, Inc. | Object interactive user interface using speech recognition and natural language processing |
US6266637B1 (en) | 1998-09-11 | 2001-07-24 | International Business Machines Corporation | Phrase splicing and variable substitution using a trainable speech synthesizer |
DE19841541B4 (de) | 1998-09-11 | 2007-12-06 | Püllen, Rainer | Subscriber unit for a multimedia service |
US6792082B1 (en) | 1998-09-11 | 2004-09-14 | Comverse Ltd. | Voice mail system with personal assistant provisioning |
US6317831B1 (en) | 1998-09-21 | 2001-11-13 | Openwave Systems Inc. | Method and apparatus for establishing a secure connection over a one-way data path |
IL140805A0 (en) | 1998-10-02 | 2002-02-10 | Ibm | Structure skeletons for efficient voice navigation through generic hierarchical objects |
US6275824B1 (en) | 1998-10-02 | 2001-08-14 | Ncr Corporation | System and method for managing data privacy in a database management system |
GB9821969D0 (en) | 1998-10-08 | 1998-12-02 | Canon Kk | Apparatus and method for processing natural language |
US6928614B1 (en) | 1998-10-13 | 2005-08-09 | Visteon Global Technologies, Inc. | Mobile office with speech recognition |
US6453292B2 (en) | 1998-10-28 | 2002-09-17 | International Business Machines Corporation | Command boundary identifier for conversational natural language |
US6208971B1 (en) | 1998-10-30 | 2001-03-27 | Apple Computer, Inc. | Method and apparatus for command recognition using data-driven semantic inference |
US6321092B1 (en) | 1998-11-03 | 2001-11-20 | Signal Soft Corporation | Multiple input data management for wireless location-based applications |
US6839669B1 (en) * | 1998-11-05 | 2005-01-04 | Scansoft, Inc. | Performing actions identified in recognized speech |
US6446076B1 (en) | 1998-11-12 | 2002-09-03 | Accenture Llp. | Voice interactive web-based agent system responsive to a user location for prioritizing and formatting information |
EP1138038B1 (en) | 1998-11-13 | 2005-06-22 | Lernout & Hauspie Speech Products N.V. | Speech synthesis using concatenation of speech waveforms |
US6606599B2 (en) | 1998-12-23 | 2003-08-12 | Interactive Speech Technologies, Llc | Method for integrating computing processes with an interface controlled by voice actuated grammars |
US6246981B1 (en) | 1998-11-25 | 2001-06-12 | International Business Machines Corporation | Natural language task-oriented dialog manager and method |
US7082397B2 (en) | 1998-12-01 | 2006-07-25 | Nuance Communications, Inc. | System for and method of creating and browsing a voice web |
US6260024B1 (en) | 1998-12-02 | 2001-07-10 | Gary Shkedy | Method and apparatus for facilitating buyer-driven purchase orders on a commercial network system |
US7881936B2 (en) | 1998-12-04 | 2011-02-01 | Tegic Communications, Inc. | Multimodal disambiguation of speech recognition |
US6317707B1 (en) | 1998-12-07 | 2001-11-13 | At&T Corp. | Automatic clustering of tokens from a corpus for grammar acquisition |
US6308149B1 (en) | 1998-12-16 | 2001-10-23 | Xerox Corporation | Grouping words with equivalent substrings by automatic clustering based on suffix relationships |
US6523172B1 (en) | 1998-12-17 | 2003-02-18 | Evolutionary Technologies International, Inc. | Parser translator system and method |
US6460029B1 (en) | 1998-12-23 | 2002-10-01 | Microsoft Corporation | System for improving search text |
US7036128B1 (en) | 1999-01-05 | 2006-04-25 | Sri International Offices | Using a community of distributed electronic agents to support a highly mobile, ambient computing environment |
US6513063B1 (en) | 1999-01-05 | 2003-01-28 | Sri International | Accessing network-based electronic information through scripted online interfaces using spoken input |
US6851115B1 (en) | 1999-01-05 | 2005-02-01 | Sri International | Software-based architecture for communication and cooperation among distributed electronic agents |
US6757718B1 (en) | 1999-01-05 | 2004-06-29 | Sri International | Mobile navigation of network-based electronic information using spoken input |
US6742021B1 (en) | 1999-01-05 | 2004-05-25 | Sri International, Inc. | Navigating network-based electronic information using spoken input with multimodal error feedback |
US6523061B1 (en) | 1999-01-05 | 2003-02-18 | Sri International, Inc. | System, method, and article of manufacture for agent-based navigation in a speech-based data navigation system |
US7152070B1 (en) | 1999-01-08 | 2006-12-19 | The Regents Of The University Of California | System and method for integrating and accessing multiple data sources within a data warehouse architecture |
US6505183B1 (en) | 1999-02-04 | 2003-01-07 | Authoria, Inc. | Human resource knowledge modeling and delivery system |
US6317718B1 (en) | 1999-02-26 | 2001-11-13 | Accenture Properties (2) B.V. | System, method and article of manufacture for location-based filtering for shopping agent in the physical world |
GB9904662D0 (en) | 1999-03-01 | 1999-04-21 | Canon Kk | Natural language search method and apparatus |
US6356905B1 (en) | 1999-03-05 | 2002-03-12 | Accenture Llp | System, method and article of manufacture for mobile communication utilizing an interface support framework |
US6928404B1 (en) | 1999-03-17 | 2005-08-09 | International Business Machines Corporation | System and methods for acoustic and language modeling for automatic speech recognition with large vocabularies |
US6584464B1 (en) | 1999-03-19 | 2003-06-24 | Ask Jeeves, Inc. | Grammar template query system |
EP1088299A2 (en) | 1999-03-26 | 2001-04-04 | Scansoft, Inc. | Client-server speech recognition |
US6356854B1 (en) | 1999-04-05 | 2002-03-12 | Delphi Technologies, Inc. | Holographic object position and type sensing system and method |
US6631346B1 (en) | 1999-04-07 | 2003-10-07 | Matsushita Electric Industrial Co., Ltd. | Method and apparatus for natural language parsing using multiple passes and tags |
WO2000060435A2 (en) | 1999-04-07 | 2000-10-12 | Rensselaer Polytechnic Institute | System and method for accessing personal information |
US6647260B2 (en) | 1999-04-09 | 2003-11-11 | Openwave Systems Inc. | Method and system facilitating web based provisioning of two-way mobile communications devices |
US6711620B1 (en) * | 1999-04-14 | 2004-03-23 | Matsushita Electric Industrial Co. | Event control device and digital broadcasting system |
US6924828B1 (en) | 1999-04-27 | 2005-08-02 | Surfnotes | Method and apparatus for improved information representation |
US6697780B1 (en) | 1999-04-30 | 2004-02-24 | At&T Corp. | Method and apparatus for rapid acoustic unit selection from a large speech corpus |
JP2003505778A (ja) | 1999-05-28 | 2003-02-12 | セーダ インコーポレイテッド | Phrase-based dialogue modeling with particular application to creating a recognition grammar for a voice-controlled user interface |
US20020032564A1 (en) | 2000-04-19 | 2002-03-14 | Farzad Ehsani | Phrase-based dialogue modeling with particular application to creating a recognition grammar for a voice-controlled user interface |
US6931384B1 (en) | 1999-06-04 | 2005-08-16 | Microsoft Corporation | System and method providing utility-based decision making about clarification dialog given communicative uncertainty |
US6598039B1 (en) | 1999-06-08 | 2003-07-22 | Albert-Inc. S.A. | Natural language interface for searching database |
US6615175B1 (en) | 1999-06-10 | 2003-09-02 | Robert F. Gazdzinski | “Smart” elevator system and method |
US8065155B1 (en) | 1999-06-10 | 2011-11-22 | Gazdzinski Robert F | Adaptive advertising apparatus and methods |
US7711565B1 (en) | 1999-06-10 | 2010-05-04 | Gazdzinski Robert F | “Smart” elevator system and method |
US7093693B1 (en) | 1999-06-10 | 2006-08-22 | Gazdzinski Robert F | Elevator access control system and method |
US6711585B1 (en) | 1999-06-15 | 2004-03-23 | Kanisa Inc. | System and method for implementing a knowledge management system |
JP3662780B2 (ja) * | 1999-07-16 | 2005-06-22 | 日本電気株式会社 | Dialogue system using natural language |
JP3361291B2 (ja) | 1999-07-23 | 2003-01-07 | コナミ株式会社 | Speech synthesis method, speech synthesis apparatus, and computer-readable medium recording a speech synthesis program |
US6421672B1 (en) | 1999-07-27 | 2002-07-16 | Verizon Services Corp. | Apparatus for and method of disambiguation of directory listing searches utilizing multiple selectable secondary search keys |
EP1079387A3 (en) | 1999-08-26 | 2003-07-09 | Matsushita Electric Industrial Co., Ltd. | Mechanism for storing information about recorded television broadcasts |
US6601234B1 (en) | 1999-08-31 | 2003-07-29 | Accenture Llp | Attribute dictionary in a business logic services environment |
US6912499B1 (en) | 1999-08-31 | 2005-06-28 | Nortel Networks Limited | Method and apparatus for training a multilingual speech model set |
US6697824B1 (en) | 1999-08-31 | 2004-02-24 | Accenture Llp | Relationship management in an E-commerce application framework |
US7127403B1 (en) | 1999-09-13 | 2006-10-24 | Microstrategy, Inc. | System and method for personalizing an interactive voice broadcast of a voice service based on particulars of a request |
US6601026B2 (en) | 1999-09-17 | 2003-07-29 | Discern Communications, Inc. | Information retrieval by natural language querying |
US6625583B1 (en) | 1999-10-06 | 2003-09-23 | Goldman, Sachs & Co. | Handheld trading system interface |
US6505175B1 (en) | 1999-10-06 | 2003-01-07 | Goldman, Sachs & Co. | Order centric tracking system |
US7020685B1 (en) | 1999-10-08 | 2006-03-28 | Openwave Systems Inc. | Method and apparatus for providing internet content to SMS-based wireless devices |
JP5118280B2 (ja) | 1999-10-19 | 2013-01-16 | ソニー エレクトロニクス インク | Natural language interface control system |
US6807574B1 (en) | 1999-10-22 | 2004-10-19 | Tellme Networks, Inc. | Method and apparatus for content personalization over a telephone interface |
JP2001125896A (ja) | 1999-10-26 | 2001-05-11 | Victor Co Of Japan Ltd | Natural language dialogue system |
US7310600B1 (en) | 1999-10-28 | 2007-12-18 | Canon Kabushiki Kaisha | Language recognition using a similarity measure |
US6665640B1 (en) | 1999-11-12 | 2003-12-16 | Phoenix Solutions, Inc. | Interactive speech based learning/training system formulating search queries based on natural language parsing of recognized user queries |
US7392185B2 (en) | 1999-11-12 | 2008-06-24 | Phoenix Solutions, Inc. | Speech based learning/training system using semantic decoding |
US6633846B1 (en) | 1999-11-12 | 2003-10-14 | Phoenix Solutions, Inc. | Distributed realtime speech recognition system |
US7725307B2 (en) | 1999-11-12 | 2010-05-25 | Phoenix Solutions, Inc. | Query engine for processing voice based queries including semantic decoding |
US6615172B1 (en) | 1999-11-12 | 2003-09-02 | Phoenix Solutions, Inc. | Intelligent query engine for processing voice based queries |
US9076448B2 (en) | 1999-11-12 | 2015-07-07 | Nuance Communications, Inc. | Distributed real time speech recognition system |
US7050977B1 (en) | 1999-11-12 | 2006-05-23 | Phoenix Solutions, Inc. | Speech-enabled server for internet website and method |
US6532446B1 (en) | 1999-11-24 | 2003-03-11 | Openwave Systems Inc. | Server based speech recognition user interface for wireless devices |
US6526382B1 (en) | 1999-12-07 | 2003-02-25 | Comverse, Inc. | Language-oriented user interfaces for voice activated services |
US6526395B1 (en) | 1999-12-31 | 2003-02-25 | Intel Corporation | Application of personality models and interaction with synthetic characters in a computing system |
US6556983B1 (en) | 2000-01-12 | 2003-04-29 | Microsoft Corporation | Methods and apparatus for finding semantic information, such as usage logs, similar to a query using a pattern lattice data space |
US6546388B1 (en) | 2000-01-14 | 2003-04-08 | International Business Machines Corporation | Metadata search results ranking system |
US6701294B1 (en) | 2000-01-19 | 2004-03-02 | Lucent Technologies, Inc. | User interface for translating natural language inquiries into database queries and data presentations |
US6829603B1 (en) | 2000-02-02 | 2004-12-07 | International Business Machines Corp. | System, method and program product for interactive natural dialog |
US6895558B1 (en) | 2000-02-11 | 2005-05-17 | Microsoft Corporation | Multi-access mode electronic personal assistant |
US6640098B1 (en) | 2000-02-14 | 2003-10-28 | Action Engine Corporation | System for obtaining service-related information for local interactive wireless devices |
US6847979B2 (en) | 2000-02-25 | 2005-01-25 | Synquiry Technologies, Ltd | Conceptual factoring and unification of graphs representing semantic models |
US6895380B2 (en) | 2000-03-02 | 2005-05-17 | Electro Standards Laboratories | Voice actuation with contextual learning for intelligent machine control |
US6449620B1 (en) | 2000-03-02 | 2002-09-10 | Nimble Technology, Inc. | Method and apparatus for generating information pages using semi-structured data stored in a structured manner |
EP1275042A2 (en) | 2000-03-06 | 2003-01-15 | Kanisa Inc. | A system and method for providing an intelligent multi-step dialog with a user |
US6466654B1 (en) | 2000-03-06 | 2002-10-15 | Avaya Technology Corp. | Personal virtual assistant with semantic tagging |
US6757362B1 (en) | 2000-03-06 | 2004-06-29 | Avaya Technology Corp. | Personal virtual assistant |
US6477488B1 (en) | 2000-03-10 | 2002-11-05 | Apple Computer, Inc. | Method for dynamic context scope selection in hybrid n-gram+LSA language modeling |
US6615220B1 (en) | 2000-03-14 | 2003-09-02 | Oracle International Corporation | Method and mechanism for data consolidation |
US6510417B1 (en) | 2000-03-21 | 2003-01-21 | America Online, Inc. | System and method for voice access to internet-based information |
GB2366009B (en) | 2000-03-22 | 2004-07-21 | Canon Kk | Natural language machine interface |
JP3728172B2 (ja) | 2000-03-31 | 2005-12-21 | キヤノン株式会社 | Speech synthesis method and apparatus |
US7177798B2 (en) | 2000-04-07 | 2007-02-13 | Rensselaer Polytechnic Institute | Natural language interface using constrained intermediate dictionary of results |
US6810379B1 (en) | 2000-04-24 | 2004-10-26 | Sensory, Inc. | Client/server architecture for text-to-speech synthesis |
US6684187B1 (en) | 2000-06-30 | 2004-01-27 | At&T Corp. | Method and system for preselection of suitable units for concatenative speech |
US6691111B2 (en) | 2000-06-30 | 2004-02-10 | Research In Motion Limited | System and method for implementing a natural language user interface |
US6505158B1 (en) | 2000-07-05 | 2003-01-07 | At&T Corp. | Synthesis-based pre-selection of suitable units for concatenative speech |
JP3949356B2 (ja) | 2000-07-12 | 2007-07-25 | 三菱電機株式会社 | Spoken dialogue system |
US7139709B2 (en) | 2000-07-20 | 2006-11-21 | Microsoft Corporation | Middleware layer between speech related applications and engines |
US20060143007A1 (en) | 2000-07-24 | 2006-06-29 | Koh V E | User interaction with voice information services |
JP2002041276A (ja) | 2000-07-24 | 2002-02-08 | Sony Corp | Interactive operation support system, interactive operation support method, and storage medium |
US7092928B1 (en) | 2000-07-31 | 2006-08-15 | Quantum Leap Research, Inc. | Intelligent portal engine |
US6778951B1 (en) | 2000-08-09 | 2004-08-17 | Concerto Software, Inc. | Information retrieval method with natural language interface |
US6766320B1 (en) | 2000-08-24 | 2004-07-20 | Microsoft Corporation | Search engine with natural language-based robust parsing for user query and relevance feedback learning |
DE10042944C2 (de) | 2000-08-31 | 2003-03-13 | Siemens Ag | Grapheme-phoneme conversion |
WO2002023523A2 (en) | 2000-09-15 | 2002-03-21 | Lernout & Hauspie Speech Products N.V. | Fast waveform synchronization for concatenation and time-scale modification of speech |
WO2002027712A1 (en) | 2000-09-29 | 2002-04-04 | Professorq, Inc. | Natural-language voice-activated personal assistant |
US6832194B1 (en) | 2000-10-26 | 2004-12-14 | Sensory, Incorporated | Audio recognition peripheral system |
US7027974B1 (en) | 2000-10-27 | 2006-04-11 | Science Applications International Corporation | Ontology-based parser for natural language processing |
US7006969B2 (en) | 2000-11-02 | 2006-02-28 | At&T Corp. | System and method of pattern recognition in very high-dimensional space |
EP1346344A1 (en) | 2000-12-18 | 2003-09-24 | Koninklijke Philips Electronics N.V. | Store speech, select vocabulary to recognize word |
US6937986B2 (en) | 2000-12-28 | 2005-08-30 | Comverse, Inc. | Automatic dynamic speech recognition vocabulary based on external sources of information |
MXPA02008345A (es) | 2000-12-29 | 2002-12-13 | Gen Electric | Method and system for identifying equipment that repeatedly malfunctions |
US7249018B2 (en) * | 2001-01-12 | 2007-07-24 | International Business Machines Corporation | System and method for relating syntax and semantics for a conversational speech application |
US7257537B2 (en) | 2001-01-12 | 2007-08-14 | International Business Machines Corporation | Method and apparatus for performing dialog management in a computer conversational interface |
US6964023B2 (en) | 2001-02-05 | 2005-11-08 | International Business Machines Corporation | System and method for multi-modal focus detection, referential ambiguity resolution and mood classification using multi-modal input |
US7290039B1 (en) | 2001-02-27 | 2007-10-30 | Microsoft Corporation | Intent based processing |
US20020123894A1 (en) | 2001-03-01 | 2002-09-05 | International Business Machines Corporation | Processing speech recognition errors in an embedded speech recognition system |
US6721728B2 (en) | 2001-03-02 | 2004-04-13 | The United States Of America As Represented By The Administrator Of The National Aeronautics And Space Administration | System, method and apparatus for discovering phrases in a database |
US7216073B2 (en) | 2001-03-13 | 2007-05-08 | Intelligate, Ltd. | Dynamic natural language understanding |
US6996531B2 (en) | 2001-03-30 | 2006-02-07 | Comverse Ltd. | Automated database assistance using a telephone for a speech based or text based multimedia communication mode |
US6654740B2 (en) | 2001-05-08 | 2003-11-25 | Sunflare Co., Ltd. | Probabilistic information retrieval based on differential latent semantic space |
US7085722B2 (en) | 2001-05-14 | 2006-08-01 | Sony Computer Entertainment America Inc. | System and method for menu-driven voice control of characters in a game environment |
US6944594B2 (en) | 2001-05-30 | 2005-09-13 | Bellsouth Intellectual Property Corporation | Multi-context conversational environment system and method |
US20020194003A1 (en) | 2001-06-05 | 2002-12-19 | Mozer Todd F. | Client-server security system and method |
US20020198714A1 (en) | 2001-06-26 | 2002-12-26 | Guojun Zhou | Statistical spoken dialog system |
US7139722B2 (en) | 2001-06-27 | 2006-11-21 | Bellsouth Intellectual Property Corporation | Location and time sensitive wireless calendaring |
US6671670B2 (en) * | 2001-06-27 | 2003-12-30 | Telelogue, Inc. | System and method for pre-processing information used by an automated attendant |
US6604059B2 (en) | 2001-07-10 | 2003-08-05 | Koninklijke Philips Electronics N.V. | Predictive calendar |
US7987151B2 (en) | 2001-08-10 | 2011-07-26 | General Dynamics Advanced Info Systems, Inc. | Apparatus and method for problem solving using intelligent agents |
US6813491B1 (en) | 2001-08-31 | 2004-11-02 | Openwave Systems Inc. | Method and apparatus for adapting settings of wireless communication devices in accordance with user proximity |
US7403938B2 (en) | 2001-09-24 | 2008-07-22 | Iac Search & Media, Inc. | Natural language query processing |
US6985865B1 (en) | 2001-09-26 | 2006-01-10 | Sprint Spectrum L.P. | Method and system for enhanced response to voice commands in a voice command platform |
US20050196732A1 (en) | 2001-09-26 | 2005-09-08 | Scientific Learning Corporation | Method and apparatus for automated training of language learning skills |
US6650735B2 (en) | 2001-09-27 | 2003-11-18 | Microsoft Corporation | Integrated voice access to a variety of personal information services |
US7324947B2 (en) | 2001-10-03 | 2008-01-29 | Promptu Systems Corporation | Global speech user interface |
US7167832B2 (en) | 2001-10-15 | 2007-01-23 | At&T Corp. | Method for dialog management |
US7345671B2 (en) | 2001-10-22 | 2008-03-18 | Apple Inc. | Method and apparatus for use of rotational user inputs |
GB2381409B (en) | 2001-10-27 | 2004-04-28 | Hewlett Packard Ltd | Asynchronous access to synchronous voice services |
NO316480B1 (no) | 2001-11-15 | 2004-01-26 | Forinnova As | Method and system for textual examination and discovery |
US20030101054A1 (en) | 2001-11-27 | 2003-05-29 | Ncc, Llc | Integrated system and method for electronic speech recognition and transcription |
TW541517B (en) | 2001-12-25 | 2003-07-11 | Univ Nat Cheng Kung | Speech recognition system |
US7197460B1 (en) | 2002-04-23 | 2007-03-27 | At&T Corp. | System for handling frequently asked questions in a natural language dialog service |
US6847966B1 (en) | 2002-04-24 | 2005-01-25 | Engenium Corporation | Method and system for optimally searching a document database using a representative semantic space |
US7546382B2 (en) | 2002-05-28 | 2009-06-09 | International Business Machines Corporation | Methods and systems for authoring of mixed-initiative multi-modal interactions and related browsing mechanisms |
US7398209B2 (en) | 2002-06-03 | 2008-07-08 | Voicebox Technologies, Inc. | Systems and methods for responding to natural language speech utterance |
US7233790B2 (en) | 2002-06-28 | 2007-06-19 | Openwave Systems, Inc. | Device capability based discovery, packaging and provisioning of content for wireless mobile devices |
US7299033B2 (en) | 2002-06-28 | 2007-11-20 | Openwave Systems Inc. | Domain-based management of distribution of digital content from multiple suppliers to multiple wireless services subscribers |
US7693720B2 (en) | 2002-07-15 | 2010-04-06 | Voicebox Technologies, Inc. | Mobile systems and methods for responding to natural language speech utterance |
US7467087B1 (en) | 2002-10-10 | 2008-12-16 | Gillick Laurence S | Training and using pronunciation guessers in speech recognition |
US7783486B2 (en) | 2002-11-22 | 2010-08-24 | Roy Jonathan Rosser | Response generator for mimicking human-computer natural language conversation |
WO2004053836A1 (en) | 2002-12-10 | 2004-06-24 | Kirusa, Inc. | Techniques for disambiguating speech input using multimodal interfaces |
US7386449B2 (en) | 2002-12-11 | 2008-06-10 | Voice Enabling Systems Technology Inc. | Knowledge-based flexible natural speech dialogue system |
US7956766B2 (en) | 2003-01-06 | 2011-06-07 | Panasonic Corporation | Apparatus operating system |
US7805299B2 (en) * | 2004-03-01 | 2010-09-28 | Coifman Robert E | Method and apparatus for improving the transcription accuracy of speech recognition software |
US7809565B2 (en) | 2003-03-01 | 2010-10-05 | Coifman Robert E | Method and apparatus for improving the transcription accuracy of speech recognition software |
US7426468B2 (en) * | 2003-03-01 | 2008-09-16 | Coifman Robert E | Method and apparatus for improving the transcription accuracy of speech recognition software |
US7529671B2 (en) | 2003-03-04 | 2009-05-05 | Microsoft Corporation | Block synchronous decoding |
US6980949B2 (en) | 2003-03-14 | 2005-12-27 | Sonum Technologies, Inc. | Natural language processor |
US7496498B2 (en) | 2003-03-24 | 2009-02-24 | Microsoft Corporation | Front-end architecture for a multi-lingual text-to-speech system |
US7627343B2 (en) | 2003-04-25 | 2009-12-01 | Apple Inc. | Media player system |
US7421393B1 (en) | 2004-03-01 | 2008-09-02 | At&T Corp. | System for developing a dialog manager using modular spoken-dialog components |
US7200559B2 (en) | 2003-05-29 | 2007-04-03 | Microsoft Corporation | Semantic object synchronous understanding implemented with speech application language tags |
US7720683B1 (en) | 2003-06-13 | 2010-05-18 | Sensory, Inc. | Method and apparatus of specifying and performing speech recognition operations |
US7475010B2 (en) | 2003-09-03 | 2009-01-06 | Lingospot, Inc. | Adaptive and scalable method for resolving natural language ambiguities |
US7418392B1 (en) | 2003-09-25 | 2008-08-26 | Sensory, Inc. | System and method for controlling the operation of a device by voice commands |
US7155706B2 (en) | 2003-10-24 | 2006-12-26 | Microsoft Corporation | Administrative tool environment |
US7412385B2 (en) | 2003-11-12 | 2008-08-12 | Microsoft Corporation | System for identifying paraphrases using machine translation |
US7584092B2 (en) | 2004-11-15 | 2009-09-01 | Microsoft Corporation | Unsupervised learning of paraphrase/translation alternations and selective application thereof |
US7447630B2 (en) | 2003-11-26 | 2008-11-04 | Microsoft Corporation | Method and apparatus for multi-sensory speech enhancement |
WO2005062293A1 (ja) | 2003-12-05 | 2005-07-07 | Kabushikikaisha Kenwood | Audio device control device, audio device control method, and program |
JP2005181386A (ja) | 2003-12-16 | 2005-07-07 | Mitsubishi Electric Corp | Spoken dialogue processing device, spoken dialogue processing method, and program |
ATE404967T1 (de) | 2003-12-16 | 2008-08-15 | Loquendo Spa | Text-to-speech system and method, and computer program therefor |
US7427024B1 (en) | 2003-12-17 | 2008-09-23 | Gazdzinski Mark J | Chattel management apparatus and methods |
US7552055B2 (en) | 2004-01-10 | 2009-06-23 | Microsoft Corporation | Dialog component re-use in recognition systems |
US7567896B2 (en) | 2004-01-16 | 2009-07-28 | Nuance Communications, Inc. | Corpus-based speech synthesis based on segment recombination |
US20050165607A1 (en) | 2004-01-22 | 2005-07-28 | At&T Corp. | System and method to disambiguate and clarify user intention in a spoken dialog system |
DE602004017955D1 (de) | 2004-01-29 | 2009-01-08 | Daimler Ag | Method and system for a spoken dialogue interface |
KR100462292B1 (ko) | 2004-02-26 | 2004-12-17 | 엔에이치엔(주) | Method and system for providing a search result list reflecting importance information |
US7505906B2 (en) * | 2004-02-26 | 2009-03-17 | At&T Intellectual Property, Ii | System and method for augmenting spoken language understanding by correcting common errors in linguistic performance |
US7693715B2 (en) | 2004-03-10 | 2010-04-06 | Microsoft Corporation | Generating large units of graphonemes with mutual information criterion for letter to sound conversion |
US7409337B1 (en) | 2004-03-30 | 2008-08-05 | Microsoft Corporation | Natural language processing interface |
US7496512B2 (en) | 2004-04-13 | 2009-02-24 | Microsoft Corporation | Refining of segmental boundaries in speech waveforms using contextual-dependent models |
US8095364B2 (en) | 2004-06-02 | 2012-01-10 | Tegic Communications, Inc. | Multimodal disambiguation of speech recognition |
US7720674B2 (en) | 2004-06-29 | 2010-05-18 | Sap Ag | Systems and methods for processing natural language queries |
US20060004570A1 (en) * | 2004-06-30 | 2006-01-05 | Microsoft Corporation | Transcribing speech data with dialog context and/or recognition alternative information |
TWI252049B (en) | 2004-07-23 | 2006-03-21 | Inventec Corp | Sound control system and method |
US7725318B2 (en) | 2004-07-30 | 2010-05-25 | Nice Systems Inc. | System and method for improving the accuracy of audio searching |
US7853574B2 (en) | 2004-08-26 | 2010-12-14 | International Business Machines Corporation | Method of generating a context-inferenced search query and of sorting a result of the query |
US7716056B2 (en) | 2004-09-27 | 2010-05-11 | Robert Bosch Corporation | Method and system for interactive conversational dialogue for cognitively overloaded device users |
US8107401B2 (en) | 2004-09-30 | 2012-01-31 | Avaya Inc. | Method and apparatus for providing a virtual assistant to a communication participant |
US7552046B2 (en) | 2004-11-15 | 2009-06-23 | Microsoft Corporation | Unsupervised learning of paraphrase/translation alternations and selective application thereof |
US7546235B2 (en) | 2004-11-15 | 2009-06-09 | Microsoft Corporation | Unsupervised learning of paraphrase/translation alternations and selective application thereof |
US7702500B2 (en) | 2004-11-24 | 2010-04-20 | Blaedow Karen R | Method and apparatus for determining the meaning of natural language |
CN1609859A (zh) | 2004-11-26 | 2005-04-27 | 孙斌 | Method for clustering search results |
US7376645B2 (en) | 2004-11-29 | 2008-05-20 | The Intellection Group, Inc. | Multimodal natural language query system and architecture for processing voice and proximity-based queries |
US8214214B2 (en) | 2004-12-03 | 2012-07-03 | Phoenix Solutions, Inc. | Emotion detection device and method for use in distributed systems |
US20060122834A1 (en) | 2004-12-03 | 2006-06-08 | Bennett Ian M | Emotion detection device & method for use in distributed systems |
US7636657B2 (en) | 2004-12-09 | 2009-12-22 | Microsoft Corporation | Method and apparatus for automatic grammar generation from data entries |
US7873654B2 (en) | 2005-01-24 | 2011-01-18 | The Intellection Group, Inc. | Multimodal natural language query system for processing and analyzing voice and proximity-based queries |
US7508373B2 (en) | 2005-01-28 | 2009-03-24 | Microsoft Corporation | Form factor and input method for language input |
GB0502259D0 (en) | 2005-02-03 | 2005-03-09 | British Telecomm | Document searching tool and method |
US7949533B2 (en) * | 2005-02-04 | 2011-05-24 | Vocollect, Inc. | Methods and systems for assessing and improving the performance of a speech recognition system |
EP1693829B1 (en) * | 2005-02-21 | 2018-12-05 | Harman Becker Automotive Systems GmbH | Voice-controlled data system |
US7676026B1 (en) | 2005-03-08 | 2010-03-09 | Baxtech Asia Pte Ltd | Desktop telephony system |
US7925525B2 (en) | 2005-03-25 | 2011-04-12 | Microsoft Corporation | Smart reminders |
WO2006129967A1 (en) | 2005-05-30 | 2006-12-07 | Daumsoft, Inc. | Conversation system and method using conversational agent |
US8041570B2 (en) | 2005-05-31 | 2011-10-18 | Robert Bosch Corporation | Dialogue management using scripts |
US8024195B2 (en) | 2005-06-27 | 2011-09-20 | Sensory, Inc. | Systems and methods of performing speech recognition using historical information |
US8396715B2 (en) * | 2005-06-28 | 2013-03-12 | Microsoft Corporation | Confidence threshold tuning |
US7826945B2 (en) | 2005-07-01 | 2010-11-02 | You Zhang | Automobile speech-recognition interface |
US8271549B2 (en) | 2005-08-05 | 2012-09-18 | Intel Corporation | System and method for automatically managing media content |
US7640160B2 (en) | 2005-08-05 | 2009-12-29 | Voicebox Technologies, Inc. | Systems and methods for responding to natural language speech utterance |
US7620549B2 (en) | 2005-08-10 | 2009-11-17 | Voicebox Technologies, Inc. | System and method of supporting adaptive misrecognition in conversational speech |
US7949529B2 (en) | 2005-08-29 | 2011-05-24 | Voicebox Technologies, Inc. | Mobile systems and methods of supporting natural language human-machine interactions |
WO2007027989A2 (en) | 2005-08-31 | 2007-03-08 | Voicebox Technologies, Inc. | Dynamic speech sharpening |
US8265939B2 (en) | 2005-08-31 | 2012-09-11 | Nuance Communications, Inc. | Hierarchical methods and apparatus for extracting user intent from spoken utterances |
US8677377B2 (en) | 2005-09-08 | 2014-03-18 | Apple Inc. | Method and apparatus for building an intelligent automated assistant |
JP4908094B2 (ja) | 2005-09-30 | 2012-04-04 | 株式会社リコー | Information processing system, information processing method, and information processing program |
US7930168B2 (en) | 2005-10-04 | 2011-04-19 | Robert Bosch Gmbh | Natural language processing of disfluent sentences |
US8620667B2 (en) | 2005-10-17 | 2013-12-31 | Microsoft Corporation | Flexible speech-activated command and control |
US7707032B2 (en) | 2005-10-20 | 2010-04-27 | National Cheng Kung University | Method and system for matching speech data |
US20070106674A1 (en) | 2005-11-10 | 2007-05-10 | Purusharth Agrawal | Field sales process facilitation systems and methods |
US20070185926A1 (en) | 2005-11-28 | 2007-08-09 | Anand Prahlad | Systems and methods for classifying and transferring information in a storage network |
KR100810500B1 (ko) | 2005-12-08 | 2008-03-07 | 한국전자통신연구원 | Method for increasing user convenience in an interactive voice interface system |
DE102005061365A1 (de) | 2005-12-21 | 2007-06-28 | Siemens Ag | Method for controlling at least a first and a second background application via a universal speech dialogue system |
US7996228B2 (en) | 2005-12-22 | 2011-08-09 | Microsoft Corporation | Voice initiated network operations |
US7599918B2 (en) | 2005-12-29 | 2009-10-06 | Microsoft Corporation | Dynamic search with implicit user intention mining |
JP2007183864A (ja) | 2006-01-10 | 2007-07-19 | Fujitsu Ltd | File search method and system therefor |
US20070174188A1 (en) | 2006-01-25 | 2007-07-26 | Fish Robert D | Electronic marketplace that facilitates transactions between consolidated buyers and/or sellers |
IL174107A0 (en) | 2006-02-01 | 2006-08-01 | Grois Dan | Method and system for advertising by means of a search engine over a data network |
KR100764174B1 (ko) | 2006-03-03 | 2007-10-08 | 삼성전자주식회사 | Voice dialogue service apparatus and method |
US7752152B2 (en) | 2006-03-17 | 2010-07-06 | Microsoft Corporation | Using predictive user models for language modeling on a personal device with user behavior models based on statistical modeling |
JP4734155B2 (ja) | 2006-03-24 | 2011-07-27 | 株式会社東芝 | Speech recognition apparatus, speech recognition method, and speech recognition program |
US7707027B2 (en) | 2006-04-13 | 2010-04-27 | Nuance Communications, Inc. | Identification and rejection of meaningless input during natural language classification |
US20070276651A1 (en) * | 2006-05-23 | 2007-11-29 | Motorola, Inc. | Grammar adaptation through cooperative client and server based speech recognition |
US8423347B2 (en) | 2006-06-06 | 2013-04-16 | Microsoft Corporation | Natural language personal information management |
US7523108B2 (en) | 2006-06-07 | 2009-04-21 | Platformation, Inc. | Methods and apparatus for searching with awareness of geography and languages |
US20100257160A1 (en) | 2006-06-07 | 2010-10-07 | Yu Cao | Methods & apparatus for searching with awareness of different types of information |
US7483894B2 (en) | 2006-06-07 | 2009-01-27 | Platformation Technologies, Inc | Methods and apparatus for entity search |
KR100776800B1 (ko) | 2006-06-16 | 2007-11-19 | 한국전자통신연구원 | Method and system for providing customized services using intelligent gadgets |
US7548895B2 (en) | 2006-06-30 | 2009-06-16 | Microsoft Corporation | Communication-prompted user assistance |
US9318108B2 (en) | 2010-01-18 | 2016-04-19 | Apple Inc. | Intelligent automated assistant |
US8073681B2 (en) | 2006-10-16 | 2011-12-06 | Voicebox Technologies, Inc. | System and method for a cooperative conversational voice user interface |
US8055502B2 (en) * | 2006-11-28 | 2011-11-08 | General Motors Llc | Voice dialing using a rejection reference |
US8600760B2 (en) * | 2006-11-28 | 2013-12-03 | General Motors Llc | Correcting substitution errors during automatic speech recognition by accepting a second best when first best is confusable |
US20080129520A1 (en) | 2006-12-01 | 2008-06-05 | Apple Computer, Inc. | Electronic device with enhanced audio feedback |
WO2008085742A2 (en) | 2007-01-07 | 2008-07-17 | Apple Inc. | Portable multifunction device, method and graphical user interface for interacting with user input elements in displayed content |
KR100883657B1 (ko) | 2007-01-26 | 2009-02-18 | 삼성전자주식회사 | Method and apparatus for music search based on speech recognition |
US7818176B2 (en) | 2007-02-06 | 2010-10-19 | Voicebox Technologies, Inc. | System and method for selecting and presenting advertisements based on natural language processing of voice-based input |
US7822608B2 (en) | 2007-02-27 | 2010-10-26 | Nuance Communications, Inc. | Disambiguating a speech recognition grammar in a multimodal application |
US20080221899A1 (en) | 2007-03-07 | 2008-09-11 | Cerra Joseph P | Mobile messaging environment speech processing facility |
US7801729B2 (en) | 2007-03-13 | 2010-09-21 | Sensory, Inc. | Using multiple attributes to create a voice search playlist |
US8219406B2 (en) | 2007-03-15 | 2012-07-10 | Microsoft Corporation | Speech-centric multimodal user interface design in mobile technology |
US7809610B2 (en) | 2007-04-09 | 2010-10-05 | Platformation, Inc. | Methods and apparatus for freshness and completeness of information |
US7983915B2 (en) | 2007-04-30 | 2011-07-19 | Sonic Foundry, Inc. | Audio content search engine |
US8055708B2 (en) | 2007-06-01 | 2011-11-08 | Microsoft Corporation | Multimedia spaces |
US8204238B2 (en) | 2007-06-08 | 2012-06-19 | Sensory, Inc | Systems and methods of sonic communication |
US8190627B2 (en) | 2007-06-28 | 2012-05-29 | Microsoft Corporation | Machine assisted query formulation |
US8019606B2 (en) | 2007-06-29 | 2011-09-13 | Microsoft Corporation | Identification and selection of a software application via speech |
JP5638948B2 (ja) * | 2007-08-01 | 2014-12-10 | ジンジャー ソフトウェア、インコーポレイティッド | Automatic correction and improvement of context-sensitive language using an Internet corpus |
JP2009036999A (ja) | 2007-08-01 | 2009-02-19 | Infocom Corp | Computer-based dialogue method, dialogue system, computer program, and computer-readable storage medium |
KR101359715B1 (ko) | 2007-08-24 | 2014-02-10 | 삼성전자주식회사 | Method and apparatus for providing a mobile voice web |
US8190359B2 (en) | 2007-08-31 | 2012-05-29 | Proxpro, Inc. | Situation-aware personal information management for a mobile device |
US20090058823A1 (en) | 2007-09-04 | 2009-03-05 | Apple Inc. | Virtual Keyboards in Multi-Language Environment |
US9734465B2 (en) | 2007-09-14 | 2017-08-15 | Ricoh Co., Ltd | Distributed workflow-enabled system |
KR100920267B1 (ko) | 2007-09-17 | 2009-10-05 | 한국전자통신연구원 | Voice dialogue analysis system and method thereof |
US8706476B2 (en) | 2007-09-18 | 2014-04-22 | Ariadne Genomics, Inc. | Natural language processing method by analyzing primitive sentences, logical clauses, clause types and verbal blocks |
KR100919225B1 (ko) * | 2007-09-19 | 2009-09-28 | 한국전자통신연구원 | Apparatus and method for post-processing dialogue errors using multi-level verification in a voice dialogue system |
US8165886B1 (en) | 2007-10-04 | 2012-04-24 | Great Northern Research LLC | Speech interface system and method for control and interaction with applications on a computing system |
US8036901B2 (en) | 2007-10-05 | 2011-10-11 | Sensory, Incorporated | Systems and methods of performing speech recognition using sensory inputs of human position |
US20090112677A1 (en) | 2007-10-24 | 2009-04-30 | Rhett Randolph L | Method for automatically developing suggested optimal work schedules from unsorted group and individual task lists |
US7840447B2 (en) | 2007-10-30 | 2010-11-23 | Leonard Kleinrock | Pricing and auctioning of bundled items among multiple sellers and buyers |
US7983997B2 (en) | 2007-11-02 | 2011-07-19 | Florida Institute For Human And Machine Cognition, Inc. | Interactive complex task teaching system that allows for natural language input, recognizes a user's intent, and automatically performs tasks in document object model (DOM) nodes |
US8112280B2 (en) | 2007-11-19 | 2012-02-07 | Sensory, Inc. | Systems and methods of performing speech recognition with barge-in for use in a bluetooth system |
US8140335B2 (en) | 2007-12-11 | 2012-03-20 | Voicebox Technologies, Inc. | System and method for providing a natural language voice user interface in an integrated voice navigation services environment |
US10002189B2 (en) | 2007-12-20 | 2018-06-19 | Apple Inc. | Method and apparatus for searching using an active ontology |
US8219407B1 (en) | 2007-12-27 | 2012-07-10 | Great Northern Research, LLC | Method for processing the output of a speech recognizer |
US8099289B2 (en) | 2008-02-13 | 2012-01-17 | Sensory, Inc. | Voice interface and search for electronic devices including bluetooth headsets and remote systems |
US8958848B2 (en) | 2008-04-08 | 2015-02-17 | Lg Electronics Inc. | Mobile terminal and menu control method thereof |
US8666824B2 (en) | 2008-04-23 | 2014-03-04 | Dell Products L.P. | Digital media content location and purchasing system |
US8285344B2 (en) | 2008-05-21 | 2012-10-09 | DP Technologies, Inc. | Method and apparatus for adjusting audio for a user environment |
US8589161B2 (en) | 2008-05-27 | 2013-11-19 | Voicebox Technologies, Inc. | System and method for an integrated, multi-modal, multi-device natural language voice services environment |
US8694355B2 (en) | 2008-05-30 | 2014-04-08 | Sri International | Method and apparatus for automated assistance with task management |
US8423288B2 (en) | 2009-11-30 | 2013-04-16 | Apple Inc. | Dynamic alerts for calendar events |
US8166019B1 (en) | 2008-07-21 | 2012-04-24 | Sprint Communications Company L.P. | Providing suggested actions in response to textual communications |
US9200913B2 (en) | 2008-10-07 | 2015-12-01 | Telecommunication Systems, Inc. | User interface for predictive traffic |
US8140328B2 (en) | 2008-12-01 | 2012-03-20 | At&T Intellectual Property I, L.P. | User intention based on N-best list of recognition hypotheses for utterances in a dialog |
US8326637B2 (en) | 2009-02-20 | 2012-12-04 | Voicebox Technologies, Inc. | System and method for processing multi-modal device interactions in a natural language voice services environment |
US8417526B2 (en) | 2009-03-13 | 2013-04-09 | Adacel, Inc. | Speech recognition learning system and method |
US9123341B2 (en) * | 2009-03-18 | 2015-09-01 | Robert Bosch Gmbh | System and method for multi-modal input synchronization and disambiguation |
US8805823B2 (en) | 2009-04-14 | 2014-08-12 | Sri International | Content processing systems and methods |
KR101581883B1 (ko) | 2009-04-30 | 2016-01-11 | 삼성전자주식회사 | Apparatus and method for voice detection using motion information |
CN102405463B (zh) | 2009-04-30 | 2015-07-29 | 三星电子株式会社 | Apparatus and method for inferring user intent using multimodal information |
US9858925B2 (en) | 2009-06-05 | 2018-01-02 | Apple Inc. | Using context information to facilitate processing of commands in a virtual assistant |
US10540976B2 (en) | 2009-06-05 | 2020-01-21 | Apple Inc. | Contextual voice commands |
US10255566B2 (en) | 2011-06-03 | 2019-04-09 | Apple Inc. | Generating and processing task items that represent tasks to perform |
KR101562792B1 (ko) | 2009-06-10 | 2015-10-23 | 삼성전자주식회사 | Apparatus and method for providing a goal prediction interface |
US8527278B2 (en) | 2009-06-29 | 2013-09-03 | Abraham Ben David | Intelligent home automation |
US20110047072A1 (en) | 2009-08-07 | 2011-02-24 | Visa U.S.A. Inc. | Systems and Methods for Propensity Analysis and Validation |
US8768313B2 (en) | 2009-08-17 | 2014-07-01 | Digimarc Corporation | Methods and systems for image or audio recognition processing |
JP2011045005A (ja) * | 2009-08-24 | 2011-03-03 | Fujitsu Toshiba Mobile Communications Ltd | Mobile phone |
WO2011028844A2 (en) | 2009-09-02 | 2011-03-10 | Sri International | Method and apparatus for tailoring the output of an intelligent automated assistant to a user |
US8321527B2 (en) | 2009-09-10 | 2012-11-27 | Tribal Brands | System and method for tracking user location and associated activity and responsively providing mobile device updates |
KR20110036385A (ko) | 2009-10-01 | 2011-04-07 | 삼성전자주식회사 | Apparatus and method for analyzing user intent |
US9197736B2 (en) | 2009-12-31 | 2015-11-24 | Digimarc Corporation | Intuitive computing methods and systems |
US20110099507A1 (en) | 2009-10-28 | 2011-04-28 | Google Inc. | Displaying a collection of interactive elements that trigger actions directed to an item |
US20120137367A1 (en) | 2009-11-06 | 2012-05-31 | Cataphora, Inc. | Continuous anomaly detection based on behavior modeling and heterogeneous information analysis |
WO2011059997A1 (en) | 2009-11-10 | 2011-05-19 | Voicebox Technologies, Inc. | System and method for providing a natural language content dedication service |
US9171541B2 (en) | 2009-11-10 | 2015-10-27 | Voicebox Technologies Corporation | System and method for hybrid processing in a natural language voice services environment |
US8712759B2 (en) | 2009-11-13 | 2014-04-29 | Clausal Computing Oy | Specializing disambiguation of a natural language expression |
KR101960835B1 (ko) | 2009-11-24 | 2019-03-21 | 삼성전자주식회사 | Schedule management system using a conversational robot and method thereof |
US8396888B2 (en) | 2009-12-04 | 2013-03-12 | Google Inc. | Location-based searching using a search area that corresponds to a geographical location of a computing device |
KR101622111B1 (ko) | 2009-12-11 | 2016-05-18 | 삼성전자 주식회사 | Dialogue system and dialogue method thereof |
US20110161309A1 (en) | 2009-12-29 | 2011-06-30 | Lx1 Technology Limited | Method Of Sorting The Result Set Of A Search Engine |
US8494852B2 (en) | 2010-01-05 | 2013-07-23 | Google Inc. | Word-level correction of speech input |
US8334842B2 (en) | 2010-01-15 | 2012-12-18 | Microsoft Corporation | Recognizing user intent in motion capture system |
US8626511B2 (en) | 2010-01-22 | 2014-01-07 | Google Inc. | Multi-dimensional disambiguation of voice commands |
WO2011093025A1 (ja) * | 2010-01-29 | 2011-08-04 | 日本電気株式会社 | Input support system, method, and program |
US20110218855A1 (en) | 2010-03-03 | 2011-09-08 | Platformation, Inc. | Offering Promotions Based on Query Analysis |
US8265928B2 (en) | 2010-04-14 | 2012-09-11 | Google Inc. | Geotagged environmental audio for enhanced speech recognition accuracy |
US20110279368A1 (en) | 2010-05-12 | 2011-11-17 | Microsoft Corporation | Inferring user intent to engage a motion capture system |
US8694313B2 (en) | 2010-05-19 | 2014-04-08 | Google Inc. | Disambiguation of contact information using historical data |
US8522283B2 (en) | 2010-05-20 | 2013-08-27 | Google Inc. | Television remote control data transfer |
US8468012B2 (en) | 2010-05-26 | 2013-06-18 | Google Inc. | Acoustic model adaptation using geographic information |
US20110306426A1 (en) | 2010-06-10 | 2011-12-15 | Microsoft Corporation | Activity Participation Based On User Intent |
US8234111B2 (en) | 2010-06-14 | 2012-07-31 | Google Inc. | Speech and noise models for speech recognition |
US8411874B2 (en) | 2010-06-30 | 2013-04-02 | Google Inc. | Removing noise from audio |
US8775156B2 (en) | 2010-08-05 | 2014-07-08 | Google Inc. | Translating languages in response to device motion |
US8359020B2 (en) | 2010-08-06 | 2013-01-22 | Google Inc. | Automatically monitoring for voice input based on context |
US8473289B2 (en) | 2010-08-06 | 2013-06-25 | Google Inc. | Disambiguating input based on context |
CN102074231A (zh) * | 2010-12-30 | 2011-05-25 | 万音达有限公司 | Speech recognition method and speech recognition system |
JP5670802B2 (ja) * | 2011-03-31 | 2015-02-18 | 水ing株式会社 | Solid fuel production method, apparatus therefor, and solid fuel |
WO2012148904A1 (en) | 2011-04-25 | 2012-11-01 | Veveo, Inc. | System and method for an intelligent personal timeline assistant |
- 2011
  - 2011-09-28: US application US13/247,912, patent US8762156B2 (en), not active (Expired - Fee Related)
- 2012
  - 2012-09-24: AU application AU2012227294A, patent AU2012227294B2 (en), not active (Ceased)
  - 2012-09-26: EP application EP20120186113, patent EP2587478A3 (en), not active (Ceased)
  - 2012-09-27: KR application KR1020120108099A, patent KR101418163B1 (ko), not active (IP Right Cessation)
  - 2012-09-27: JP application JP2012214570A, patent JP2013073240A (ja), active (Pending)
  - 2012-09-28: CN application CN201510922714.2A, patent CN105336326A (zh), active (Pending)
  - 2012-09-28: CN application CN201210369739.0A, patent CN103035240B (zh), not active (Expired - Fee Related)
- 2014
  - 2014-03-21: KR application KR1020140033255A, patent KR20140047633A (ko), not active (Application Discontinuation)
  - 2014-06-05: US application US14/297,473, patent US8812316B1 (en), not active (Expired - Fee Related)
  - 2014-08-26: JP application JP2014171991A, patent JP2015018265A (ja), active (Pending)
- 2015
  - 2015-08-07: AU application AU2015210460A, patent AU2015210460B2 (en), not active (Ceased)
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5909666A (en) * | 1992-11-13 | 1999-06-01 | Dragon Systems, Inc. | Speech recognition system which creates acoustic models by concatenating acoustic models of individual words |
US6311157B1 (en) * | 1992-12-31 | 2001-10-30 | Apple Computer, Inc. | Assigning meanings to utterances in a speech recognition system |
US7315818B2 (en) * | 2000-05-02 | 2008-01-01 | Nuance Communications, Inc. | Error correction in speech recognition |
CN1864204A (zh) * | 2002-09-06 | 2006-11-15 | 语音信号技术有限公司 | Methods, systems, and programs for performing speech recognition |
CN101183525A (zh) * | 2006-10-12 | 2008-05-21 | Qnx软件操作系统(威美科)有限公司 | Adaptive context for automatic speech recognition systems |
Also Published As
Publication number | Publication date |
---|---|
EP2587478A3 (en) | 2014-05-28 |
KR20130034630A (ko) | 2013-04-05 |
US8812316B1 (en) | 2014-08-19 |
EP2587478A2 (en) | 2013-05-01 |
JP2013073240A (ja) | 2013-04-22 |
KR20140047633A (ko) | 2014-04-22 |
AU2015210460A1 (en) | 2015-09-03 |
AU2012227294B2 (en) | 2015-05-07 |
CN105336326A (zh) | 2016-02-17 |
JP2015018265A (ja) | 2015-01-29 |
AU2015210460B2 (en) | 2017-04-13 |
CN103035240A (zh) | 2013-04-10 |
US8762156B2 (en) | 2014-06-24 |
US20130080177A1 (en) | 2013-03-28 |
KR101418163B1 (ko) | 2014-07-09 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
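CN103035240B (zh) | | Method and system for speech recognition repair using contextual information |
US11388291B2 (en) | | System and method for processing voicemail |
JP6588637B2 (ja) | | Learning personalized entity pronunciations |
CN101366075B (zh) | | Control center for a voice-controlled wireless communication device system |
JP5967569B2 (ja) | | Speech processing system |
CN103299361B (zh) | | Translating languages |
US7003457B2 (en) | | Method and system for text editing in hand-held electronic device |
US10599469B2 (en) | | Methods to present the context of virtual assistant conversation |
CN110110319A (zh) | | Word-level correction of speech input |
CN105989840A (zh) | | System and method for hybrid processing in a natural language voice services environment |
KR20140142280A (ko) | | Device for extracting information from a dialog |
CN102137085A (zh) | | Multi-dimensional disambiguation of voice commands |
KR101950387B1 (ko) | | Method, computer device, and computer-readable recording medium for building or updating a knowledge base model for an interactive AI agent system by labeling data in the training data that is identifiable but not learnable |
KR20240115216A (ko) | | Method and apparatus for processing a speech signal |
JP2015052745A (ja) | | Information processing device, control method, and program |
KR102092058B1 (ko) | | Method and apparatus for providing an interface |
JP7322782B2 (ja) | | Information providing program, information providing method, and information processing device |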
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 2015-11-25; Termination date: 2017-09-28 |