CN102782751B - Digital media voice tags in social networks - Google Patents
- Publication number
- CN102782751B (application CN201180012464.9A)
- Authority
- CN
- China
- Prior art keywords
- speech samples
- voice
- media object
- label
- linked
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/10—Speech classification or search using distance or distortion measures between unknown speech and reference templates
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/26—Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
- G10L2015/025—Phonemes, fenemes or fenones being the recognition units
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M2203/00—Aspects of automatic or semi-automatic exchanges
- H04M2203/30—Aspects of automatic or semi-automatic exchanges related to audio recordings in general
- H04M2203/303—Marking
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M3/00—Automatic or semi-automatic exchanges
- H04M3/42—Systems providing special services or facilities to subscribers
- H04M3/487—Arrangements for providing information services, e.g. recorded voice services or time announcements
- H04M3/493—Interactive information services, e.g. directory enquiries ; Arrangements therefor, e.g. interactive voice response [IVR] systems or voice portals
- H04M3/4936—Speech interaction details
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y10—TECHNICAL SUBJECTS COVERED BY FORMER USPC
- Y10S—TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y10S707/00—Data processing: database and file management or data structures
- Y10S707/912—Applications of a database
- Y10S707/913—Multimedia
- Y10S707/915—Image
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y10—TECHNICAL SUBJECTS COVERED BY FORMER USPC
- Y10S—TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y10S707/00—Data processing: database and file management or data structures
- Y10S707/912—Applications of a database
- Y10S707/913—Multimedia
- Y10S707/916—Audio
Abstract
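A voice tagging system includes a client computing device that includes a media object capture device and a voice capture device and runs a client application that associates media objects with voice samples. The system also includes: a communications network coupled to the client computing device; a voice tagging system coupled to the communications network that receives at least one association between a first media object and a first voice sample; and a database coupled to the voice tagging system, the database including one or more voice tags, each voice tag being coupled to one or more voice samples.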
Description
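Technical Field
The present invention relates to characterizing media and, more specifically, to characterizing digital media with voice tags.
Background
Digital libraries, photo-sharing sites, image search engines, online encyclopedias, and other computer systems all hold large numbers of images in file systems or databases. Users visiting these sites may have trouble finding the images they want because, unlike documents, images (and other digital media) do not contain indexable words or phrases.
One solution to the problem of finding a desired image is image recognition, but this approach is prohibitively expensive for user-generated content and is not highly accurate. Another known approach is to group images into named categories (such as folders) to facilitate access. This, however, requires manual effort, and the images must be known in advance.
There are many ways of organizing these images, including collections, sets, and hierarchies. One common way of organizing a collection is tagging. When a user sees an image, the user may type a word or phrase to "tag" (describe) the image. Multiple users can add one or more tags to the same image. When another user visits the site, that user can then navigate to the images labeled with a particular tag.
There are various ways in which tags can be used for image navigation. For example, a user may type a word or phrase that is an existing tag for a set of one or more images. Alternatively, the user may see the tags arranged in various ways (alphabetically, by popularity, and so on) and then select a tag that describes the image(s). Text tagging for social navigation is widely used and well understood.
There are also many ways of presenting digital media so that a user can scan and identify items (collages, grids, visualizations). The main drawback of these approaches is that they do not scale: the display becomes cluttered and the screen may run out of pixels, particularly on small screens such as those of mobile devices.
There are also many ways of "automatically" processing digital media to derive metadata that can then be used for searching. Metadata (location, time) may be captured when the image is acquired and later used to navigate to the visual digital media.
However, there are many situations in which creating or using text tags is impossible or inconvenient. Examples include when the user: is using a mobile phone (typing a word or phrase takes a long time or diverts attention from a visual task); is physically disabled (cannot type a word or phrase); is illiterate or semi-literate because of limited education (has only a limited ability to read or write); has vision problems (cannot see the word or phrase); or experiences a combination of these situations.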
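Summary
According to one embodiment of the present invention, a system is provided that includes a client computing device including a media object capture device and a voice capture device and running a client application that associates media objects with voice samples. The system of this embodiment also includes: a communications network coupled to the client computing device; a voice tagging system coupled to the communications network that receives at least one association between a first media object and a first voice sample; and a database coupled to the voice tagging system, the database including one or more voice tags, each voice tag being coupled to one or more voice samples.
According to another embodiment of the present invention, a method of tagging media objects is disclosed. The method of this embodiment includes: receiving at a server an association between a first voice sample and a first media object; comparing the first voice sample to one or more other voice samples; linking the first voice sample to a first voice tag; linking the first voice tag to the first media object; and storing the first voice sample, the first voice tag, the first media object, and any links between them in a database coupled to the server.
According to another embodiment of the present invention, a method of searching a digital database containing voice-tagged media objects is disclosed. The method includes: receiving a first audio search at a server; comparing the first audio search to digital representations of the voice tags stored in the digital database; and returning one or more media objects linked to voice tags that match the first audio search.
Additional features and advantages are realized through the techniques of the present invention. Other embodiments and aspects of the invention are described in detail herein and are considered a part of the claimed invention. For a better understanding of the invention with these advantages and features, refer to the description and to the drawings.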
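Brief Description of the Drawings
The subject matter regarded as the invention is particularly pointed out and distinctly claimed in the claims at the conclusion of the specification. The foregoing and other features and advantages of the invention are apparent from the following detailed description taken in conjunction with the accompanying drawings, in which:
FIG. 1 shows an example of a computing system on which embodiments of the present invention may be implemented;
FIG. 2 shows an example of a system according to one embodiment of the present invention;
FIG. 3 shows a block diagram of one embodiment of a database that may be utilized in the system shown in FIG. 2;
FIG. 4 is a more detailed depiction of the database shown in FIG. 3;
FIG. 5 is a flow chart showing a method by which a media object may be tagged according to the present invention;
FIG. 6 is a flow chart showing a method of forming a database according to one embodiment of the present invention; and
FIG. 7 is a flow chart showing a method of searching for and retrieving voice-tagged media objects according to one embodiment of the present invention.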
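Detailed Description
Embodiments of the present invention may address some or all of the problems described above, or other problems not mentioned. In some cases, the systems and methods of the present invention allow users to tag media objects with audio identifiers. These audio identifiers may be referred to herein as "voice samples". In addition, the present invention includes systems and methods for searching, based on a "voice query", for media objects linked to voice samples in a database. A voice query is a sequence of words in a human language, each word consisting of a sequence of phonemes. If the voice query sounds like one or more voice samples, the tags linked to those voice samples are used to retrieve media objects.
In one embodiment, a method is provided for users to tag digital media with audio recordings of their voice speaking a word or phrase, along with another method for users to search and browse digital media using these voice tags. It should be understood that the "user" is the person speaking the word or phrase, not necessarily the owner of the device to which the voice tag is provided.
Specifically, some embodiments provide systems and methods for tagging images and other digital media with spoken audio (e.g., words and phrases). The systems and methods disclosed herein may include the ability to recognize the sequence of phonemes in a voice sample as a tag. Subsequently, if the same or another user speaks a closely matching sequence of phonemes, the systems and methods disclosed herein can retrieve the digital media.
Methods are also provided for a user to listen to voice tags and select one of the tags to then retrieve the associated digital media. Tags may be arranged alphabetically, by popularity, hierarchically, or in other ways. In a hierarchy, more general tags may be presented before more specific ones, and tags may have synonyms, as determined by users' judgments of a tag's level of specificity or similarity. If a tag at a given level is selected, the more specific tags at the next level down may be presented, or a new synonym for the selected tag may be recorded. If no tag is selected at a given level, a tag may be recorded and added to the hierarchy at that level. When the user listens to voice tags, audio characteristics of the linked voice samples (e.g., loudness) may be used to indicate the popularity or other characteristics of a tag relative to the whole set of tags, and the identity of the speaker may be used to select tags, or particular voice samples of a tag, according to preferences. For example, a person may prefer to hear their own voice for a tag before hearing the voices of other users.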
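FIG. 1 shows an example of a computing system on which embodiments of the present invention may be implemented. In this embodiment, the system 100 has one or more central processing units (processors) 101a, 101b, 101c, etc. (collectively or generically referred to as processor(s) 101). In one embodiment, each processor 101 may include a reduced instruction set computer (RISC) microprocessor. Processors 101 are coupled to system memory 114 and various other components via a system bus 113. Read only memory (ROM) 102 is coupled to the system bus 113 and may include a basic input/output system (BIOS), which controls certain basic functions of system 100.
FIG. 1 further depicts an input/output (I/O) adapter 107 and a network adapter 106 coupled to the system bus 113. The I/O adapter 107 may be a small computer system interface (SCSI) adapter that communicates with a hard disk 103 and/or a tape storage drive 105 or any other similar component. The I/O adapter 107, hard disk 103, and tape storage drive 105 are collectively referred to herein as mass storage 104. The network adapter 106 interconnects the bus 113 with an outside network 116, enabling the data processing system 100 to communicate with other such systems. A screen (e.g., a display monitor) 115 is connected to the system bus 113 by a display adapter 112, which may include a graphics adapter to improve the performance of graphics-intensive applications and a video controller. In one embodiment, adapters 107, 106, and 112 may be connected to one or more I/O buses that are connected to the system bus 113 via an intermediate bus bridge (not shown). Suitable I/O buses for connecting peripheral devices such as hard disk controllers, network adapters, and graphics adapters typically include common protocols, such as the Peripheral Component Interconnect (PCI). Additional input/output devices are shown as connected to the system bus 113 via a user interface adapter 108 and the display adapter 112. A keyboard 109, mouse 110, and speaker 111 are all interconnected to the bus 113 via the user interface adapter 108, which may include, for example, a Super I/O chip integrating multiple device adapters into a single integrated circuit. Of course, other inputs, such as a digital camera or digital video camera (or other equipment that supplies one or more images in digital format) and a microphone, may be included as additional input devices.
Thus, as configured in FIG. 1, the system 100 includes processing means in the form of processors 101, storage means including system memory 114 and mass storage 104, input means such as the keyboard 109 and mouse 110, and output means including the speaker 111 and display 115. In one embodiment, a portion of system memory 114 and mass storage 104 collectively store an operating system (such as an operating system from IBM Corporation) to coordinate the functions of the various components shown in FIG. 1.
It will be appreciated that the system 100 can be any suitable computer or computing platform, and may include a terminal, wireless device, information appliance, device, workstation, minicomputer, mainframe computer, personal digital assistant (PDA), or other computing device. It shall be understood that the system 100 may include multiple computing devices linked together by a communications network. For example, a client-server relationship may exist between two systems, and processing may be split between the two.
Examples of operating systems that may be supported by the system 100 include Windows 95, Windows 98, Windows NT 4.0, Windows XP, Windows 2000, Windows CE, Windows Vista, Mac OS, Java, AIX, LINUX, and UNIX, or any other suitable operating system. The system 100 also includes a network interface 106 for communicating over a network 116. The network 116 can be a local-area network (LAN), a metro-area network (MAN), or a wide-area network (WAN), such as the Internet or World Wide Web.
Users of the system 100 can connect to the network through any suitable network interface 116 connection, such as standard telephone lines, digital subscriber line, LAN or WAN links (e.g., T1, T3), broadband connections (Frame Relay, ATM), and wireless connections (e.g., 802.11(a), 802.11(b), 802.11(g)).
As disclosed herein, the system 100 includes machine-readable instructions stored on machine-readable media (for example, the hard disk 104) for capture and interactive display of information shown on the screen 115 of a user. As discussed herein, these instructions are referred to as "software" 120. The software 120 may be produced using software development tools as are known in the art. The software 120 may include various tools and features for providing user interaction capabilities as are known in the art.
In some embodiments, the software 120 is provided as an overlay to another program. For example, the software 120 may be provided as an "add-in" to an application (or operating system). Note that the term "add-in" generally refers to supplemental program code as is known in the art. In such embodiments, the software 120 may replace structures or objects of the application or operating system with which it cooperates.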
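It shall be understood that, in one embodiment, the system of the present invention may be configured in a particular manner and include multiple computing devices. To that end, FIG. 2 shows an example of a system 200 according to one embodiment of the present invention. The system 200 may be utilized to implement the methods disclosed herein.
The system 200 includes one or more client computing devices 202. The client computing device 202 may be any type of computing device. In one embodiment, the client computing device 202 includes a microphone and a speaker. In one embodiment, and as shown in FIG. 2, the client computing device 202 may be a cellular or "smart" phone, a PDA, or another handheld communications (computing) device that includes a microphone 204 and a speaker 206. For completeness, other components of the client computing device 202 may include a digital camera 208, a display screen 210, and an input keypad 212. It shall be understood that some of the components of the client computing device 202 may be combined. For example, the display screen 210 may include input capabilities and therefore serve as a means for entering information as well as for displaying, for example, images. In one embodiment, the client computing device 202 may include the ability to run a client application, connect to a wireless data network, capture one or more images, display images, capture audio, and broadcast audio.
The client computing device 202 may be coupled to a communications network 214. In one embodiment, the communications network 214 may be a cellular network. For example, the communications network 214 could be a GSM, TDMA, 2G, 3G, or 4G wireless network. The communications network 214 could also be a wireless data network such as WIMAX or 802.11. Of course, the communications link 216 could be wireless or physical. In one embodiment, the communications network may be an intranet or the Internet.
The system 200 may also include a voice tagging system 218. The voice tagging system 218 is coupled to the communications network 214, so it can communicate with the client computing device 202 over the communications network 214. In one embodiment, the voice tagging system 218 may be implemented on a server. In some embodiments, the voice tagging system 218 may be configured to run a web application that handles requests for media objects and voice tags and performs voice tag matching. In one embodiment, the voice tagging system 218 may include a speech processing unit with phoneme-level speech models for a human language; given a voice sample, the speech processing unit returns the most closely matching sequence of phonemes. Of course, the speech processing unit could be in, or implemented on, a separate unit.
The system 200 may also include a database 220 coupled to the voice tagging system 218. The database 220 may store information utilized by the voice tagging system 218. In one embodiment, the voice tagging system 218 may include the database 220 within it.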
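FIG. 3a shows an example of information that may be stored in the database 220. In one embodiment, the database 220 may include a voice tag store 302, a digital media store 304, and a speaker registry 306. Of course, the database 220 need not be divided in this particular manner.
The digital media store 304 may include digital media objects. Digital media objects may include any type of media capable of visual reproduction, including, but not limited to, images, documents, animations, and videos. It shall be understood that, in one embodiment, all of the digital media available to the voice tagging system 218 (FIG. 2) may not be stored in a single location and may be spread over multiple databases 220.
The speaker registry 306 may include voice clips associated with a particular speaker. In one embodiment, some or all of the voice clips may be associated with a phonetic representation of the respective clip. This may not be required for voice tagging but may be used in speaker identity verification (SIV), discussed below.
A voice tag is an object that stores an association between one or more voice clips and one or more digital media objects, and it is stored in the voice tag store 302. In one embodiment, "tagging" refers to creating an association between a media object and a voice sample. By contrast, a voice tag in the voice tag store 302 includes links to at least one media object and one voice sample.
FIG. 3b shows a more detailed version of the speaker registry 306. The speaker registry uniquely identifies users of the voice tagging system. A speaker may be identified in different ways: by typing their name or a special code on a touch screen, by a voice clip to match ("say the word 'baggage'"), by a phone number from caller ID, or by any other way of producing a unique speaker identifier that can be linked to voice clips to identify the speaker who was talking when the clip was recorded.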
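FIG. 4 shows one example of a database 220 with links between the digital media store 304 and the speaker registry 306. In more detail, FIG. 4 shows an example of some of the possible connections between voice clips 402, 404, 406, and 408 and digital media objects 430 and 432. The first voice clip 402 represents a clip of a certain speaker speaking the word "wheat". The first voice clip 402 is linked to a speaker identifier 410 and a phonetic representation 412 of the first voice clip 402.
The phonetic representation 412 (as well as any other phonetic representation for other voice clips) may be formed in many different ways. In one embodiment, the audio clip may be broken into voice and non-voice segments, and the phonemes of the voice portions may then be recognized using known or later-developed techniques. As shown, by way of example, the first voice clip 402 may represent the phonemes depicted as the letters "wheet".
The first voice tag 426 may also be linked to a second voice clip 404 that is coupled to a second speaker identifier 414 and a phonetic representation 416. In this embodiment, the second voice clip 404 represents the phonemes depicted by the letters "weet". A phoneme matching algorithm may be implemented to infer that the first voice clip 402 and the second voice clip 404, although spoken by different people, are in fact the same word. Such matching may include, for example, classifying voice clips in the same way based on the start of the word and, therefore, on the beginning of the word's phoneme sequence. Thus, for example, the first N=3 phonemes in each voice clip are recognized and compared with the others. Of course, other classification techniques could be used, such as an "edit distance" representing the number of additions, deletions, and moves needed to make two sequences identical. Regardless, the first voice tag 426 is associated with the first digital media object 430.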
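The two classification techniques mentioned above can be sketched as follows. This is a minimal illustration, not the patent's algorithm: the function names are hypothetical, each letter of "wheet" and "weet" is treated as one phoneme purely for demonstration, and the edit distance shown is the standard Levenshtein distance (insertions, deletions, substitutions) rather than the additions, deletions, and moves the text describes.

```python
from typing import Sequence


def same_prefix(a: Sequence[str], b: Sequence[str], n: int = 3) -> bool:
    """Classify two phoneme sequences together when their first n phonemes agree."""
    return list(a[:n]) == list(b[:n])


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two phoneme sequences (an assumed stand-in)."""
    prev = list(range(len(b) + 1))
    for i, pa in enumerate(a, start=1):
        curr = [i]
        for j, pb in enumerate(b, start=1):
            cost = 0 if pa == pb else 1
            curr.append(min(prev[j] + 1,          # delete pa
                            curr[j - 1] + 1,      # insert pb
                            prev[j - 1] + cost))  # substitute, or keep if equal
        prev = curr
    return prev[-1]


# Letters stand in for phonemes here; a real system would use recognizer output.
wheet = list("wheet")
weet = list("weet")
print(same_prefix(wheet, weet))    # False
print(edit_distance(wheet, weet))  # 1
```

With these letter-level stand-ins the strict prefix test separates the two clips while the edit distance is only 1, which suggests why a tolerance-based measure can be a useful alternative to exact prefix comparison.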
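The second voice tag 428 is associated with both the first digital media object 430 and the second digital media object 432. This illustrates the principle that the present invention allows one voice tag to be linked to one or more digital media objects, including digital media objects of different types (such as an image and a video). Like the first voice tag 426, the second voice tag 428 may be linked to one or more voice clips. In this example, the second voice tag 428 is linked to a third voice clip 406 and a fourth voice clip 408. The third voice clip 406 is linked to a speaker identifier 418 and a phonetic representation 420. Similarly, the fourth voice clip 408 is linked to a speaker identifier 422 and a phonetic representation 424. Of course, in one embodiment, these speaker identifiers could be combined.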
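The links described for FIG. 4 could be modeled with a small in-memory structure like the one below. This is only a sketch: the class and field names are assumptions, the reference numerals are reused as identifiers for readability, and the phonemes given for clips 406 and 408 are placeholders because the text does not state them.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class VoiceClip:
    clip_id: int
    speaker_id: Optional[int]   # None would stand for an anonymous speaker
    phonemes: List[str]         # phonetic representation of the clip


@dataclass
class VoiceTag:
    tag_id: int
    clips: List[VoiceClip] = field(default_factory=list)
    media_ids: List[int] = field(default_factory=list)

    def is_valid(self) -> bool:
        # A stored voice tag links at least one media object and one voice sample.
        return bool(self.clips) and bool(self.media_ids)


def media_for_clip(tags: List[VoiceTag], clip_id: int) -> List[int]:
    """Return the media objects reachable from a clip through its voice tag(s)."""
    found: List[int] = []
    for tag in tags:
        if any(clip.clip_id == clip_id for clip in tag.clips):
            found.extend(m for m in tag.media_ids if m not in found)
    return found


tag_426 = VoiceTag(426, clips=[VoiceClip(402, 410, list("wheet")),
                               VoiceClip(404, 414, list("weet"))],
                   media_ids=[430])
# Placeholder phonemes: the source gives no representation for clips 406 and 408.
tag_428 = VoiceTag(428, clips=[VoiceClip(406, 418, ["p1", "p2"]),
                               VoiceClip(408, 422, ["p1", "p2"])],
                   media_ids=[430, 432])
tags = [tag_426, tag_428]
```

Looking up `media_for_clip(tags, 406)` walks from a clip to its tag and on to both linked media objects, which is the one-tag-to-many-objects relationship the paragraph describes.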
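Users may create associations between voice clips and media objects. These associations may be used to create voice tags and to create the links between voice tags, digital media objects, and voice clips shown in FIG. 4. These links may, for example, be created by the voice tagging system 218 (FIG. 2). The link between a speaker identifier and a voice clip may be created when the voice clip is recorded. The phonetic representation associated with each voice clip may also be created by the voice tagging system 218 and linked to the voice clip. As shown, speaker 1 (block 422) spoke both voice clips 406 and 408. When listening to tag 428, voice clip 406 may be preferred for any number of configurable reasons, including clarity, time of speech, volume, and so on.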
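Tagging an Image
There are several ways in which an image may be tagged according to the present invention. One method is disclosed with respect to FIG. 5. At block 502, a media object is acquired and presented to the user. The media object may be acquired in different ways. For example, the user may acquire it by taking a picture with a digital camera built into the user's cellular phone. In another embodiment, the media object may be downloaded from a database to the screen of the user's cellular phone. Of course, other methods of acquiring an image may be used without departing from the present invention. In one embodiment, the media object must be visible to the user in order for the image to be tagged. Of course, this is not required in every embodiment.
At block 504, a voice tagging application is enabled. The voice tagging application may be, for example, a client application capable of receiving voice samples and associating them with the image being viewed. In one embodiment, the voice tagging application is a client application on a cellular phone.
At block 506, a voice sample is received from the user. In one embodiment, the voice sample may be received while the image or other media object is presented to the user.
At block 507, the voice sample may be analyzed to determine the identity of the speaker. If no speaker can be identified, the voice tagging system can operate with an anonymous speaker. Various information may be used to determine the speaker's identity, including, but not limited to, caller ID (phone number), speaker identity verification (SIV), and typing a name on the phone keypad. One or more voice samples stored in the speaker registry may also be used to match the voice sample provided by the user and stored in the speaker registry. Optionally, if there is no match at block 507, a new speaker identifier may be created in the speaker registry. In that case, a dialog with the user may be needed to record a voice clip, name, phone number, or other identifying information.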
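The fallback behavior of block 507 can be illustrated with a short sketch. Everything here is hypothetical: the registry is modeled as a plain dictionary from a credential (caller ID or typed name) to a speaker identifier, voice-based verification (SIV) is deliberately left out, and the identifier format is invented for the example.

```python
import uuid
from typing import Dict, Optional

ANONYMOUS = "anonymous"


def resolve_speaker(registry: Dict[str, str],
                    caller_id: Optional[str] = None,
                    typed_name: Optional[str] = None,
                    create_if_missing: bool = False) -> str:
    """Return a speaker identifier, falling back to the anonymous speaker."""
    credentials = [c for c in (caller_id, typed_name) if c]
    for credential in credentials:
        if credential in registry:
            return registry[credential]
    if create_if_missing and credentials:
        # Optional path: register a new speaker under the supplied credentials.
        speaker_id = "speaker-" + uuid.uuid4().hex
        for credential in credentials:
            registry[credential] = speaker_id
        return speaker_id
    return ANONYMOUS


registry = {"+15550100": "speaker-1"}
known = resolve_speaker(registry, caller_id="+15550100")
unknown = resolve_speaker(registry, caller_id="+15550199")
```

Here `known` resolves to the registered speaker while `unknown` falls back to the anonymous identifier, so tagging can proceed either way.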
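At block 508, an association between the voice sample and the media object is created. The association may be between the voice sample and a downloaded media file, media already loaded on the device, or a media object created by the user. Regardless, the association may describe the location of the voice clip, the location of the media object, and the time at which the association was created.
At block 510, the association may be transmitted to the voice tagging system. Of course, if the voice sample or the media object was not previously stored in the database, it may be transmitted along with the association. For example, if the user downloaded an image from the database 220 (FIG. 2) and tagged it with a voice sample, only the voice sample and the association need to be transmitted. What data is transmitted in addition to the association may be system-specific and configurable, and depends on the particular situation.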
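A client could represent the association of block 508 and the conditional transmission of block 510 roughly as follows. The field names, the JSON wire format, and the example locations are all assumptions made for illustration; the text only says that the association may record the two locations and the creation time, and that data missing from the database travels with it.

```python
import base64
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Association:
    voice_clip_location: str
    media_object_location: str
    created_at: str


def build_payload(assoc: Association,
                  voice_sample: Optional[bytes] = None,
                  media_object: Optional[bytes] = None) -> str:
    """Serialize the association, attaching only data the server lacks."""
    payload = {"association": asdict(assoc)}
    if voice_sample is not None:
        payload["voice_sample"] = base64.b64encode(voice_sample).decode("ascii")
    if media_object is not None:
        payload["media_object"] = base64.b64encode(media_object).decode("ascii")
    return json.dumps(payload)


# The image came from the database, so only the new voice sample is attached.
assoc = Association(
    voice_clip_location="clips/new-clip.wav",      # hypothetical location
    media_object_location="db://media/430",        # hypothetical location
    created_at=datetime.now(timezone.utc).isoformat(),
)
message = build_payload(assoc, voice_sample=b"\x00\x01")
```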
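Creating a Database of Tagged Images
As discussed above, individual users may create associations between voice samples and media objects. These associations form the basis of the links shown in FIG. 4. FIG. 6 is a flow chart showing a method of forming a database according to one embodiment of the present invention.
At block 602, an association is received. The association links a voice sample to a media object. The association may come, for example, from recording a voice sample while an image is displayed. Alternatively, the association may come from a system that allows associations to be made without displaying the image. In one embodiment, one or both of the media object and the voice sample may be received along with the association, for example when one or both of them are not already present in the database. The association may be received, for example, by the voice tagging system 218 (FIG. 2).
At block 604, the voice sample is converted into a phonetic representation. The phonetic representation may be created using known techniques and is linked to the voice sample. In addition, if the speaker of the voice sample is known, the sample may be linked to its creator in the speaker registry. This linking may link every voice sample to at least one speaker identifier. The speaker identifier may identify a unique anonymous user, for example when a unique speaker cannot be identified, or when speaker identification is not used and all voice samples are therefore linked to the anonymous speaker identifier. Of course, multiple samples may be linked to a single identifier.
At block 606, the phonetic representations of the existing voice samples in the database are compared with the phonetic representation of the newly received voice sample. There are many ways to perform such matching. One example matches (and thereby classifies) words that sound alike based on the start of the word. Such matching may include extracting the first M phonemes recognized in each voice sample. In some situations, as few as M=3 phonemes may be used. For each voice tag, these phonemes are compared sequentially. A tag receives a score based on the level of match up to its Mth phoneme. A match on the (M-1)th phoneme may be weighted higher than a match on the Mth phoneme. In one embodiment, the level of match is based on the number of matching features of the phonemes (such as voiced and unvoiced consonants), and no match receives a score of -1. There are 5 features per phoneme, so with M=3 the best score is 15 and the worst is -3.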
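The scoring rule of block 606 can be made concrete with a small sketch. The representation is an assumption: each phoneme is modeled as a tuple of five binary features whose meaning is left unspecified, a phoneme pair scores the number of agreeing features or -1 when none agree, and a missing phoneme is treated as a non-match. The optional `weights` argument is a hypothetical way to weight earlier phonemes more heavily; with equal weights and M=3 the score ranges from -3 to 15 as stated above.

```python
from typing import Optional, Sequence, Tuple

Phoneme = Tuple[int, int, int, int, int]  # five hypothetical binary features


def phoneme_score(a: Phoneme, b: Phoneme) -> int:
    """Number of matching features, or -1 when no feature matches."""
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches if matches > 0 else -1


def prefix_score(sample: Sequence[Phoneme],
                 reference: Sequence[Phoneme],
                 m: int = 3,
                 weights: Optional[Sequence[float]] = None) -> float:
    """Compare the first m phonemes in order and sum the per-phoneme scores."""
    weights = list(weights) if weights is not None else [1.0] * m
    total = 0.0
    for k in range(m):
        if k < len(sample) and k < len(reference):
            total += weights[k] * phoneme_score(sample[k], reference[k])
        else:
            total += weights[k] * -1  # assumption: a missing phoneme is a non-match
    return total


word = ((1, 0, 1, 0, 1), (0, 0, 1, 1, 0), (1, 1, 1, 0, 0))
opposite = tuple(tuple(1 - f for f in p) for p in word)
print(prefix_score(word, word))      # 15.0
print(prefix_score(word, opposite))  # -3.0
```

Passing something like `weights=[3, 2, 1]` would rank a mismatch on the first phoneme as more damaging than one on the third, in line with the weighting the text allows.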
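At block 608, it is determined whether there is a match between the new voice sample and an existing voice sample. If multiple existing voice samples are retrieved from the database of existing voice samples and match, the user may select the best one. Where there is a match with a single voice sample, at block 610 the new voice sample is linked to the voice tag to which the existing voice sample is linked. For example, referring again to FIG. 4, the first voice clip 402 and the second voice clip 404 are both linked to voice tag 426. This may occur because the first voice clip 402 was previously linked to voice tag 426. When the second voice clip 404 is placed into the system, the second phonetic representation 416 matches the first phonetic representation 412. Accordingly, both are assigned to the same voice tag (voice tag 426).
Referring again to FIG. 6, as discussed above, each voice tag is linked to at least one media object and at least one voice sample. At block 612, it is determined whether the media object linked to the existing voice tag matches the media object associated with the new voice sample. If so, information about the tagging process may be recorded and the process may end. For example, the number of times the image has been tagged may be recorded in the database 220 (FIG. 2). Otherwise, at block 614, the voice tag is linked to the media object associated with the new voice sample. In this manner, a single voice tag may be associated with multiple media objects.
Where there is no match between the new voice sample and an existing voice sample (that is, this is a voice sample of a word not previously spoken), a new voice tag is created at block 616. Then, at block 618, the newly created voice tag is linked to the new voice sample. The newly created voice tag is then used in the processing that starts at block 612, already described. Thus, if this is an association with a matching media object, the new voice tag is linked to the media object with which the voice sample was previously associated. If this is a new, non-matching media object, the newly created tag is linked to that new media object. It is therefore possible to voice-tag a newly captured image with a newly recorded voice sample, in which case the voice sample does not match any existing tag.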
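The flow of blocks 606 through 618 can be summarized in a compact in-memory sketch. It is illustrative only: the `TagStore` class, its attribute names, and the media identifiers are hypothetical, matching is reduced to exact agreement on the first three phonemes instead of a feature-based score, and the user's choice among multiple matches is replaced by taking the first match.

```python
from typing import Dict, List, Optional, Sequence, Set, Tuple


class TagStore:
    """Toy stand-in for the voice tag store; not the patent's database schema."""

    def __init__(self) -> None:
        self.tag_samples: Dict[int, List[Tuple[str, ...]]] = {}
        self.tag_media: Dict[int, Set[str]] = {}
        self.tag_counts: Dict[Tuple[int, str], int] = {}
        self._next_tag = 1

    def _find_tag(self, phonemes: Tuple[str, ...], n: int = 3) -> Optional[int]:
        # Blocks 606/608: compare against existing samples (simplified matcher).
        for tag_id, samples in self.tag_samples.items():
            if any(s[:n] == phonemes[:n] for s in samples):
                return tag_id
        return None

    def add_association(self, phonemes: Sequence[str], media_id: str) -> int:
        key_phonemes = tuple(phonemes)
        tag_id = self._find_tag(key_phonemes)
        if tag_id is None:
            # Block 616: no match, so create a new voice tag.
            tag_id = self._next_tag
            self._next_tag += 1
            self.tag_samples[tag_id] = []
            self.tag_media[tag_id] = set()
        # Block 610 (existing tag) or block 618 (new tag): link sample to tag.
        self.tag_samples[tag_id].append(key_phonemes)
        if media_id not in self.tag_media[tag_id]:
            # Block 614: link the tag to a media object it did not cover yet.
            self.tag_media[tag_id].add(media_id)
        # Block 612 bookkeeping: count how often this object received this tag.
        count_key = (tag_id, media_id)
        self.tag_counts[count_key] = self.tag_counts.get(count_key, 0) + 1
        return tag_id


store = TagStore()
first = store.add_association(["w", "iy", "t"], "image-430")
second = store.add_association(["w", "iy", "t", "s"], "video-432")
third = store.add_association(["b", "aa", "r", "n"], "image-430")
```

In this run the first two samples share a tag that ends up linked to both media objects, while the third sample starts a new tag on the same image.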
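As discussed above, the speaker registry 306 may be used to uniquely identify users of the voice tagging system. Information about speakers may be gathered as described above.
Searching a Database of Tagged Images
The description above details how the database may be created and modified; the following describes how, in one embodiment, the database may be searched.
FIG. 7 is a flow chart showing a method of searching for and retrieving voice-tagged media objects. At block 702, a user of the voice tagging system enables the system on their client computing device. In one embodiment, the client computing device may be a cellular phone. In another embodiment, a touch screen device capable of taking pictures, recording and playing sound, and operating on a WiFi network may form the client computing device.
At block 704, a search using voice search terms is created. This may include the user speaking words into a microphone. The search is then submitted to the server at block 706.
At block 708, the server (e.g., the voice tagging system 218, FIG. 2) matches the voice search term(s) against the existing voice tags. This matching may include breaking the search term(s) into voice and non-voice segments. A phonetic representation may then be formed for each voice segment. These phonetic representations may be compared with the existing phonetic representations linked to voice tags, and a "match score" is created for each voice tag based on the match scores of the phonetic representations of the existing voice samples stored with that tag. The best match may be determined for each voice tag using the match scoring described above.
At block 710, results are returned to the searcher. Where multiple voice tags have sufficiently high scores, those tags are returned. Where no tag is found, this may be indicated to the searcher. Assuming there is a match, the associations may be presented to the user. The searcher is shown one or more matching media objects linked to the selected tag. Selecting a matching media object on a touch screen device may play the voice tag associated with each media object by playing the associated voice sample that has the best score.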
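Blocks 708 and 710 amount to scoring every voice tag by its best-matching sample and returning the tags that clear a threshold. The sketch below assumes a simplified scorer (+1 for each agreeing leading phoneme, -1 otherwise) in place of the feature-based score, and the threshold value, dictionary layout, and media identifiers are all hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def match_score(query: Sequence[str], sample: Sequence[str], m: int = 3) -> int:
    """Simplified stand-in scorer over the first m phonemes."""
    score = 0
    for k in range(m):
        same = k < len(query) and k < len(sample) and query[k] == sample[k]
        score += 1 if same else -1
    return score


def search(query: Sequence[str],
           tags: Dict[int, dict],
           threshold: int = 1) -> List[Tuple[int, int, List[str]]]:
    """Return (score, tag_id, media) for tags whose best sample clears the bar."""
    results = []
    for tag_id, tag in tags.items():
        # A tag's score is the best score among its stored voice samples.
        best = max((match_score(query, s) for s in tag["samples"]),
                   default=float("-inf"))
        if best >= threshold:
            results.append((best, tag_id, list(tag["media"])))
    results.sort(key=lambda r: (-r[0], r[1]))  # highest score first
    return results


tags = {
    426: {"samples": [["w", "iy", "t"], ["w", "ih", "t"]], "media": ["image-430"]},
    428: {"samples": [["b", "aa", "r", "n"]], "media": ["image-430", "video-432"]},
}
hits = search(["w", "iy", "t"], tags)
misses = search(["z", "z", "z"], tags)
```

An empty result such as `misses` corresponds to the case where no tag is found and the searcher is told so.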
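In an alternative embodiment, an image is captured and sent via MMS (Multimedia Messaging Service), and the system performs hierarchical classification of voice input. In this embodiment, the system may include a "voice gateway", which is itself a combination of components that connects the user's phone (via the public switched telephone network, or PSTN) to a computer system.
Referring again to FIG. 2, in this embodiment the voice tagging system 218 may be configured to operate an interactive voice response (IVR) system. The IVR system may process the user's keypad input and direct the voice gateway to play and/or record audio streams (also called audio clips or voice clips). The system may also include a wireless handheld phone capable of recording and displaying images and having a wireless data connection to the voice tagging system 218. As before, the images (or other digital media) may be stored and linked in the database 220. The system may also include one or more interfaces to external services (external to the IVR) used to notify other users of new bookmarks. Examples are the public-domain email network, the SMS (Short Message Service) and MMS (Multimedia Messaging Service) networks owned and operated by wireless carriers (service providers), and the public switched telephone network (PSTN).
In this embodiment, the user calls the IVR system from any mobile camera phone connected to the PSTN and goes through the following steps to classify photos hierarchically:
1. The user takes a photo with their camera phone.
2. The user sends the photo from their mobile phone (using email or MMS) to the IVR service.
3. The IVR service stores the photo in the database and adds it to a queue of untagged photos.
4. The user logs in to the IVR service. The caller ID of the user's phone, or an explicit login, is also used to identify the user.
5. The user uses the IVR menu to select a photo by listening to a text-to-speech (TTS) rendering of the metadata associated with each untagged photo.
6. In this embodiment, the upload time of each untagged photo in the queue is used.
7. The IVR then asks the user whether they want to tag the photo and, if so, builds an IVR menu tree from the previously recorded hierarchy of voice tags.
8. At each level N of the IVR menu tree, the user is prompted to: a) select an appropriate tag, b) create a new tag, or c) delete a tag.
9. If the user has selected an appropriate tag, the voice tags at level N+1 are retrieved.
10. If no more specific tags are available, the voice tag is stored with the photo.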
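The hierarchical part of this procedure (steps 8 through 10) can be sketched as a walk down a tag tree. The sketch makes several assumptions: tags are identified by text names rather than recorded audio, the caller's menu choices are supplied as a list, only the "select" and "create" options are modeled (deletion is omitted), and the names `TagNode` and `classify` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class TagNode:
    name: str
    children: Dict[str, "TagNode"] = field(default_factory=dict)


def classify(root: TagNode, choices: List[str]) -> List[str]:
    """Follow the caller's choices down the hierarchy and return the tag path."""
    path: List[str] = []
    node = root
    for choice in choices:
        if choice not in node.children:
            # Option (b): record a new tag at this level of the hierarchy.
            node.children[choice] = TagNode(choice)
        # Option (a): descend to the chosen tag; level N+1 is now on offer.
        node = node.children[choice]
        path.append(node.name)
        if not node.children:
            break  # Step 10: nothing more specific exists, so stop here.
    return path


root = TagNode("root", {"animals": TagNode("animals", {"dogs": TagNode("dogs")})})
existing_path = classify(root, ["animals", "dogs"])
created_path = classify(root, ["plants"])
```

After this run, `existing_path` follows two existing levels and `created_path` records a new top-level tag, which then becomes selectable for later callers.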
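The terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit the invention. As used herein, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the term "comprises", when used in this specification, specifies the presence of stated features, integers, steps, operations, elements, and/or components, but does not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
The corresponding structures, materials, acts, and equivalents of all means or step-plus-function elements in the claims are intended to include any structure, material, or act for performing the function in combination with other claimed elements as specifically claimed. The description of the present invention has been presented for purposes of illustration and description, but it is not intended to be exhaustive or limited to the invention in the form disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the invention. The embodiments were chosen and described in order to best explain the principles of the invention and its practical application, and to enable others of ordinary skill in the art to understand the invention for various embodiments with various modifications as are suited to the particular use contemplated.
The flow charts depicted herein are just one example. There may be many variations to these charts or to the steps (or operations) described herein without departing from the spirit of the invention. For instance, the steps may be performed in a different order, or steps may be added, deleted, or modified. All of these variations are considered a part of the claimed invention.
While the preferred embodiments of the invention have been described, it will be understood that those skilled in the art, both now and in the future, may make various improvements and enhancements that fall within the scope of the claims. These claims should be construed to maintain the proper protection for the invention first described.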
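Claims (14)
1. A system for tagging media objects, comprising:
a client computing device, the client computing device including a media object capture device and a voice capture device and running a client application that associates media objects with voice samples;
a communications network coupled to the client computing device;
a voice tagging system coupled to the communications network that receives at least one association between a first media object and a first voice sample; and
a database coupled to the voice tagging system, the database including one or more existing voice samples, wherein the one or more existing voice samples are used in a comparison, the comparison including sequentially comparing phonetic representations of the existing voice samples with a phonetic representation of the first voice sample, with matches on earlier phonemes weighted higher than matches on later phonemes.
2. The system of claim 1, wherein the at least one existing voice sample is linked to a speaker identifier.
3. The system of claim 1, wherein the client computing device is a cellular phone.
4. The system of claim 1, wherein multiple voice samples having similar phonetic representations are linked to one voice tag.
5. The system of claim 1, wherein the first media object is an image.
6. A method of tagging media objects, the method comprising:
receiving at a server an association between a first voice sample and a first media object;
comparing the first voice sample to one or more other voice samples, wherein the comparing includes sequentially comparing a phonetic representation of the first voice sample with phonetic representations of the one or more other voice samples, with matches on earlier phonemes weighted higher than matches on later phonemes;
linking the first voice sample to a first voice tag;
linking the first voice tag to the first media object; and
storing the first voice sample, the first voice tag, the first media object, and any links between them in a database coupled to the server.
7. The method of claim 6, wherein linking the first voice sample to the first voice tag includes linking the first voice sample to the first voice tag to which a user-selected existing voice sample is linked, the selected existing voice sample matching the first voice sample.
8. The method of claim 6, wherein the association is received from a cellular phone.
9. The method of claim 8, wherein the first media object is created at the cellular phone.
10. The method of claim 8, wherein the first media object is retrieved from the database and presented on the cellular phone.
11. The method of claim 6, wherein, where the phonetic representation of the first voice sample matches one of the one or more other voice samples, the first voice sample is linked to the first voice tag, the first voice tag having previously been linked to the one of the one or more other voice samples.
12. The method of claim 6, wherein, where the phonetic representation of the first voice sample does not match one of the one or more other voice samples, linking the first voice sample to the first voice tag further includes:
creating the first voice tag after determining that the phonetic representation of the first voice sample does not match one of the one or more other voice samples.
13. The method of claim 8, further comprising:
storing the first media object in the database.
14. The method of claim 8, further comprising:
linking the first voice tag to a second media object.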
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US12/718,041 US8903847B2 (en) | 2010-03-05 | 2010-03-05 | Digital media voice tags in social networks |
US12/718,041 | 2010-03-05 | ||
PCT/US2011/023557 WO2011109137A1 (en) | 2010-03-05 | 2011-02-03 | Digital media voice tags in social networks |
Publications (2)
Publication Number | Publication Date |
---|---|
CN102782751A CN102782751A (zh) | 2012-11-14 |
CN102782751B true CN102782751B (zh) | 2015-02-11 |
Family
ID=44532204
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201180012464.9A Expired - Fee Related CN102782751B (zh) | 2010-03-05 | 2011-02-03 | Digital media voice tags in social networks |
Country Status (6)
Country | Link |
---|---|
US (1) | US8903847B2 (zh) |
JP (1) | JP5671557B2 (zh) |
CN (1) | CN102782751B (zh) |
GB (1) | GB2491324B (zh) |
TW (1) | TW201209804A (zh) |
WO (1) | WO2011109137A1 (zh) |
Families Citing this family (183)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8645137B2 (en) | 2000-03-16 | 2014-02-04 | Apple Inc. | Fast, language-independent method for user authentication by voice |
US8677377B2 (en) | 2005-09-08 | 2014-03-18 | Apple Inc. | Method and apparatus for building an intelligent automated assistant |
US9318108B2 (en) | 2010-01-18 | 2016-04-19 | Apple Inc. | Intelligent automated assistant |
US8977255B2 (en) | 2007-04-03 | 2015-03-10 | Apple Inc. | Method and system for operating a multi-function portable electronic device using voice-activation |
US10002189B2 (en) | 2007-12-20 | 2018-06-19 | Apple Inc. | Method and apparatus for searching using an active ontology |
US9330720B2 (en) | 2008-01-03 | 2016-05-03 | Apple Inc. | Methods and apparatus for altering audio output signals |
US8996376B2 (en) | 2008-04-05 | 2015-03-31 | Apple Inc. | Intelligent text-to-speech conversion |
US10496753B2 (en) | 2010-01-18 | 2019-12-03 | Apple Inc. | Automatically adapting user interfaces for hands-free interaction |
US20100030549A1 (en) | 2008-07-31 | 2010-02-04 | Lee Michael M | Mobile device having human language translation capability with positional feedback |
US8676904B2 (en) | 2008-10-02 | 2014-03-18 | Apple Inc. | Electronic devices with voice command and contextual data processing capabilities |
US10241752B2 (en) | 2011-09-30 | 2019-03-26 | Apple Inc. | Interface for a virtual digital assistant |
US10241644B2 (en) | 2011-06-03 | 2019-03-26 | Apple Inc. | Actionable reminder entries |
US9858925B2 (en) | 2009-06-05 | 2018-01-02 | Apple Inc. | Using context information to facilitate processing of commands in a virtual assistant |
US10706373B2 (en) | 2011-06-03 | 2020-07-07 | Apple Inc. | Performing actions associated with task items that represent tasks to perform |
US9431006B2 (en) | 2009-07-02 | 2016-08-30 | Apple Inc. | Methods and apparatuses for automatic speech recognition |
US20110115931A1 (en) * | 2009-11-17 | 2011-05-19 | Kulinets Joseph M | Image management system and method of controlling an image capturing device using a mobile communication device |
US20110115930A1 (en) * | 2009-11-17 | 2011-05-19 | Kulinets Joseph M | Image management system and method of selecting at least one of a plurality of cameras |
US10679605B2 (en) | 2010-01-18 | 2020-06-09 | Apple Inc. | Hands-free list-reading by intelligent automated assistant |
US10705794B2 (en) | 2010-01-18 | 2020-07-07 | Apple Inc. | Automatically adapting user interfaces for hands-free interaction |
US10553209B2 (en) | 2010-01-18 | 2020-02-04 | Apple Inc. | Systems and methods for hands-free notification summaries |
US10276170B2 (en) | 2010-01-18 | 2019-04-30 | Apple Inc. | Intelligent automated assistant |
US8682667B2 (en) | 2010-02-25 | 2014-03-25 | Apple Inc. | User profiling for selecting user specific voice input processing information |
US8903847B2 (en) | 2010-03-05 | 2014-12-02 | International Business Machines Corporation | Digital media voice tags in social networks |
US20120244842A1 (en) | 2011-03-21 | 2012-09-27 | International Business Machines Corporation | Data Session Synchronization With Phone Numbers |
US20120246238A1 (en) | 2011-03-21 | 2012-09-27 | International Business Machines Corporation | Asynchronous messaging tags |
US9262612B2 (en) | 2011-03-21 | 2016-02-16 | Apple Inc. | Device access using voice authentication |
US8688090B2 (en) | 2011-03-21 | 2014-04-01 | International Business Machines Corporation | Data session preferences |
US10057736B2 (en) | 2011-06-03 | 2018-08-21 | Apple Inc. | Active transport based notifications |
US8994660B2 (en) | 2011-08-29 | 2015-03-31 | Apple Inc. | Text correction processing |
US9407892B2 (en) | 2011-09-12 | 2016-08-02 | Intel Corporation | Methods and apparatus for keyword-based, non-linear navigation of video streams and other content |
US10134385B2 (en) | 2012-03-02 | 2018-11-20 | Apple Inc. | Systems and methods for name pronunciation |
US20130289991A1 (en) * | 2012-04-30 | 2013-10-31 | International Business Machines Corporation | Application of Voice Tags in a Social Media Context |
US9280610B2 (en) | 2012-05-14 | 2016-03-08 | Apple Inc. | Crowd sourcing information to fulfill user requests |
US10417037B2 (en) | 2012-05-15 | 2019-09-17 | Apple Inc. | Systems and methods for integrating third party services with a digital assistant |
US9721563B2 (en) | 2012-06-08 | 2017-08-01 | Apple Inc. | Name recognition system |
US20130346068A1 (en) * | 2012-06-25 | 2013-12-26 | Apple Inc. | Voice-Based Image Tagging and Searching |
US9495129B2 (en) | 2012-06-29 | 2016-11-15 | Apple Inc. | Device, method, and user interface for voice-activated navigation and browsing of a document |
US9058806B2 (en) * | 2012-09-10 | 2015-06-16 | Cisco Technology, Inc. | Speaker segmentation and recognition based on list of speakers |
US9547647B2 (en) | 2012-09-19 | 2017-01-17 | Apple Inc. | Voice-based media searching |
TWI528186B (zh) * | 2012-11-09 | 2016-04-01 | 財團法人資訊工業策進會 | 經由音訊發布訊息的系統及方法 |
JP2016508007A (ja) | 2013-02-07 | 2016-03-10 | アップル インコーポレイテッド | デジタルアシスタントのためのボイストリガ |
US10652394B2 (en) | 2013-03-14 | 2020-05-12 | Apple Inc. | System and method for processing voicemail |
US10748529B1 (en) | 2013-03-15 | 2020-08-18 | Apple Inc. | Voice activated device for use with a voice-based digital assistant |
US9582608B2 (en) | 2013-06-07 | 2017-02-28 | Apple Inc. | Unified ranking with entropy-weighted information for phrase-based semantic auto-completion |
WO2014197334A2 (en) | 2013-06-07 | 2014-12-11 | Apple Inc. | System and method for user-specified pronunciation of words for speech synthesis and recognition |
WO2014197336A1 (en) | 2013-06-07 | 2014-12-11 | Apple Inc. | System and method for detecting errors in interactions with a voice-based digital assistant |
WO2014197335A1 (en) | 2013-06-08 | 2014-12-11 | Apple Inc. | Interpreting and acting upon commands that involve sharing information with remote devices |
CN110442699A (zh) | 2013-06-09 | 2019-11-12 | 苹果公司 | 操作数字助理的方法、计算机可读介质、电子设备和系统 |
US10176167B2 (en) | 2013-06-09 | 2019-01-08 | Apple Inc. | System and method for inferring user intent from speech inputs |
CN103399737B (zh) * | 2013-07-18 | 2016-10-12 | 百度在线网络技术(北京)有限公司 | 基于语音数据的多媒体处理方法及装置 |
CN104346388B (zh) * | 2013-07-31 | 2018-03-09 | 株式会社理光 | 云端服务器以及图像存储检索系统 |
US9167082B2 (en) | 2013-09-22 | 2015-10-20 | Steven Wayne Goldstein | Methods and systems for voice augmented caller ID / ring tone alias |
US10296160B2 (en) | 2013-12-06 | 2019-05-21 | Apple Inc. | Method for extracting salient dialog usage from live data |
US10078631B2 (en) | 2014-05-30 | 2018-09-18 | Apple Inc. | Entropy-guided text prediction using combined word and character n-gram language models |
US9785630B2 (en) | 2014-05-30 | 2017-10-10 | Apple Inc. | Text prediction using combined word N-gram and unigram language models |
US9760559B2 (en) | 2014-05-30 | 2017-09-12 | Apple Inc. | Predictive text input |
US9842101B2 (en) | 2014-05-30 | 2017-12-12 | Apple Inc. | Predictive conversion of language input |
US9633004B2 (en) | 2014-05-30 | 2017-04-25 | Apple Inc. | Better resolution when referencing to concepts |
US9715875B2 (en) | 2014-05-30 | 2017-07-25 | Apple Inc. | Reducing the need for manual start/end-pointing and trigger phrases |
US9966065B2 (en) | 2014-05-30 | 2018-05-08 | Apple Inc. | Multi-command single utterance input method |
US10170123B2 (en) | 2014-05-30 | 2019-01-01 | Apple Inc. | Intelligent assistant for home automation |
US9430463B2 (en) | 2014-05-30 | 2016-08-30 | Apple Inc. | Exemplar-based natural language processing |
US10659851B2 (en) | 2014-06-30 | 2020-05-19 | Apple Inc. | Real-time digital assistant knowledge updates |
US9338493B2 (en) | 2014-06-30 | 2016-05-10 | Apple Inc. | Intelligent automated assistant for TV user interactions |
US9818400B2 (en) | 2014-09-11 | 2017-11-14 | Apple Inc. | Method and apparatus for discovering trending terms in speech requests |
US10789041B2 (en) | 2014-09-12 | 2020-09-29 | Apple Inc. | Dynamic thresholds for always listening speech trigger |
CN104199956B (zh) * | 2014-09-16 | 2018-01-16 | 成都博智维讯信息技术有限公司 | 一种erp数据语音搜索方法 |
US9668121B2 (en) | 2014-09-30 | 2017-05-30 | Apple Inc. | Social reminders |
US9646609B2 (en) | 2014-09-30 | 2017-05-09 | Apple Inc. | Caching apparatus for serving phonetic pronunciations |
US9886432B2 (en) | 2014-09-30 | 2018-02-06 | Apple Inc. | Parsimonious handling of word inflection via categorical stem + suffix N-gram language models |
US10074360B2 (en) | 2014-09-30 | 2018-09-11 | Apple Inc. | Providing an indication of the suitability of speech recognition |
US10127911B2 (en) | 2014-09-30 | 2018-11-13 | Apple Inc. | Speaker identification and unsupervised speaker adaptation techniques |
KR102252072B1 (ko) * | 2014-10-14 | 2021-05-14 | 삼성전자주식회사 | 음성 태그를 이용한 이미지 관리 방법 및 그 장치 |
US10552013B2 (en) | 2014-12-02 | 2020-02-04 | Apple Inc. | Data detection |
US9865280B2 (en) | 2015-03-06 | 2018-01-09 | Apple Inc. | Structured dictation using intelligent automated assistants |
US10152299B2 (en) | 2015-03-06 | 2018-12-11 | Apple Inc. | Reducing response latency of intelligent automated assistants |
US10567477B2 (en) | 2015-03-08 | 2020-02-18 | Apple Inc. | Virtual assistant continuity |
US9886953B2 (en) | 2015-03-08 | 2018-02-06 | Apple Inc. | Virtual assistant activation |
US9721566B2 (en) | 2015-03-08 | 2017-08-01 | Apple Inc. | Competing devices responding to voice triggers |
US9899019B2 (en) | 2015-03-18 | 2018-02-20 | Apple Inc. | Systems and methods for structured stem and suffix language models |
US9842105B2 (en) | 2015-04-16 | 2017-12-12 | Apple Inc. | Parsimonious continuous-space phrase representations for natural language processing |
US10460227B2 (en) | 2015-05-15 | 2019-10-29 | Apple Inc. | Virtual assistant in a communication session |
US10200824B2 (en) | 2015-05-27 | 2019-02-05 | Apple Inc. | Systems and methods for proactively identifying and surfacing relevant content on a touch-sensitive device |
US10083688B2 (en) | 2015-05-27 | 2018-09-25 | Apple Inc. | Device voice control for selecting a displayed affordance |
US10127220B2 (en) | 2015-06-04 | 2018-11-13 | Apple Inc. | Language identification from short strings |
US10101822B2 (en) | 2015-06-05 | 2018-10-16 | Apple Inc. | Language input correction |
US9578173B2 (en) | 2015-06-05 | 2017-02-21 | Apple Inc. | Virtual assistant aided communication with 3rd party service in a communication session |
US10255907B2 (en) | 2015-06-07 | 2019-04-09 | Apple Inc. | Automatic accent detection using acoustic models |
US11025565B2 (en) | 2015-06-07 | 2021-06-01 | Apple Inc. | Personalized prediction of responses for instant messaging |
US10186254B2 (en) | 2015-06-07 | 2019-01-22 | Apple Inc. | Context-based endpoint detection |
US20160378747A1 (en) | 2015-06-29 | 2016-12-29 | Apple Inc. | Virtual assistant for media playback |
KR20170027551A (ko) * | 2015-09-02 | 2017-03-10 | 삼성전자주식회사 | 전자 장치 및 그의 제어 방법 |
US10671428B2 (en) | 2015-09-08 | 2020-06-02 | Apple Inc. | Distributed personal assistant |
US10740384B2 (en) | 2015-09-08 | 2020-08-11 | Apple Inc. | Intelligent automated assistant for media search and playback |
US10331312B2 (en) | 2015-09-08 | 2019-06-25 | Apple Inc. | Intelligent automated assistant in a media environment |
US10747498B2 (en) | 2015-09-08 | 2020-08-18 | Apple Inc. | Zero latency digital assistant |
US9697820B2 (en) | 2015-09-24 | 2017-07-04 | Apple Inc. | Unit-selection text-to-speech synthesis using concatenation-sensitive neural networks |
US10366158B2 (en) | 2015-09-29 | 2019-07-30 | Apple Inc. | Efficient word encoding for recurrent neural network language models |
US11010550B2 (en) | 2015-09-29 | 2021-05-18 | Apple Inc. | Unified language modeling framework for word prediction, auto-completion and auto-correction |
US11587559B2 (en) | 2015-09-30 | 2023-02-21 | Apple Inc. | Intelligent device identification |
US10186253B2 (en) | 2015-10-05 | 2019-01-22 | Olympus Corporation | Control device for recording system, and recording system |
US10691473B2 (en) | 2015-11-06 | 2020-06-23 | Apple Inc. | Intelligent automated assistant in a messaging environment |
US10956666B2 (en) | 2015-11-09 | 2021-03-23 | Apple Inc. | Unconventional virtual assistant interactions |
US10049668B2 (en) | 2015-12-02 | 2018-08-14 | Apple Inc. | Applying neural network language models to weighted finite state transducers for automatic speech recognition |
US10223066B2 (en) | 2015-12-23 | 2019-03-05 | Apple Inc. | Proactive assistance based on dialog communication between devices |
US10446143B2 (en) | 2016-03-14 | 2019-10-15 | Apple Inc. | Identification of voice inputs providing credentials |
US9934775B2 (en) | 2016-05-26 | 2018-04-03 | Apple Inc. | Unit-selection text-to-speech synthesis based on predicted concatenation parameters |
US9972304B2 (en) | 2016-06-03 | 2018-05-15 | Apple Inc. | Privacy preserving distributed evaluation framework for embedded personalized systems |
US10249300B2 (en) | 2016-06-06 | 2019-04-02 | Apple Inc. | Intelligent list reading |
US11227589B2 (en) | 2016-06-06 | 2022-01-18 | Apple Inc. | Intelligent list reading |
US10049663B2 (en) | 2016-06-08 | 2018-08-14 | Apple, Inc. | Intelligent automated assistant for media exploration |
DK179309B1 (en) | 2016-06-09 | 2018-04-23 | Apple Inc | Intelligent automated assistant in a home environment |
US10586535B2 (en) | 2016-06-10 | 2020-03-10 | Apple Inc. | Intelligent digital assistant in a multi-tasking environment |
US10490187B2 (en) | 2016-06-10 | 2019-11-26 | Apple Inc. | Digital assistant providing automated status report |
US10192552B2 (en) | 2016-06-10 | 2019-01-29 | Apple Inc. | Digital assistant providing whispered speech |
US10067938B2 (en) | 2016-06-10 | 2018-09-04 | Apple Inc. | Multilingual word prediction |
US10509862B2 (en) | 2016-06-10 | 2019-12-17 | Apple Inc. | Dynamic phrase expansion of language input |
DK201670540A1 (en) | 2016-06-11 | 2018-01-08 | Apple Inc | Application integration with a digital assistant |
DK179049B1 (en) | 2016-06-11 | 2017-09-18 | Apple Inc | Data driven natural language event detection and classification |
DK179415B1 (en) | 2016-06-11 | 2018-06-14 | Apple Inc | Intelligent device arbitration and control |
DK179343B1 (en) | 2016-06-11 | 2018-05-14 | Apple Inc | Intelligent task discovery |
JP2018025855A (ja) * | 2016-08-08 | 2018-02-15 | ソニーモバイルコミュニケーションズ株式会社 | Information processing server, information processing device, information processing system, information processing method, and program |
US10474753B2 (en) | 2016-09-07 | 2019-11-12 | Apple Inc. | Language identification using recurrent neural networks |
US10043516B2 (en) | 2016-09-23 | 2018-08-07 | Apple Inc. | Intelligent automated assistant |
US11281993B2 (en) | 2016-12-05 | 2022-03-22 | Apple Inc. | Model and ensemble compression for metric learning |
US10593346B2 (en) | 2016-12-22 | 2020-03-17 | Apple Inc. | Rank-reduced token representation for automatic speech recognition |
US11204787B2 (en) | 2017-01-09 | 2021-12-21 | Apple Inc. | Application integration with a digital assistant |
US10417266B2 (en) | 2017-05-09 | 2019-09-17 | Apple Inc. | Context-aware ranking of intelligent response suggestions |
DK201770383A1 (en) | 2017-05-09 | 2018-12-14 | Apple Inc. | USER INTERFACE FOR CORRECTING RECOGNITION ERRORS |
DK201770439A1 (en) | 2017-05-11 | 2018-12-13 | Apple Inc. | Offline personal assistant |
US10395654B2 (en) | 2017-05-11 | 2019-08-27 | Apple Inc. | Text normalization based on a data-driven learning network |
US10726832B2 (en) | 2017-05-11 | 2020-07-28 | Apple Inc. | Maintaining privacy of personal information |
DK179745B1 (en) | 2017-05-12 | 2019-05-01 | Apple Inc. | SYNCHRONIZATION AND TASK DELEGATION OF A DIGITAL ASSISTANT |
DK201770429A1 (en) | 2017-05-12 | 2018-12-14 | Apple Inc. | LOW-LATENCY INTELLIGENT AUTOMATED ASSISTANT |
DK179496B1 (en) | 2017-05-12 | 2019-01-15 | Apple Inc. | USER-SPECIFIC Acoustic Models |
US11301477B2 (en) | 2017-05-12 | 2022-04-12 | Apple Inc. | Feedback analysis of a digital assistant |
DK201770432A1 (en) | 2017-05-15 | 2018-12-21 | Apple Inc. | Hierarchical belief states for digital assistants |
DK201770431A1 (en) | 2017-05-15 | 2018-12-20 | Apple Inc. | Optimizing dialogue policy decisions for digital assistants using implicit feedback |
US20180336892A1 (en) | 2017-05-16 | 2018-11-22 | Apple Inc. | Detecting a trigger of a digital assistant |
US10303715B2 (en) | 2017-05-16 | 2019-05-28 | Apple Inc. | Intelligent automated assistant for media exploration |
US10311144B2 (en) | 2017-05-16 | 2019-06-04 | Apple Inc. | Emoji word sense disambiguation |
US10403278B2 (en) | 2017-05-16 | 2019-09-03 | Apple Inc. | Methods and systems for phonetic matching in digital assistant services |
DK179549B1 (en) | 2017-05-16 | 2019-02-12 | Apple Inc. | FAR-FIELD EXTENSION FOR DIGITAL ASSISTANT SERVICES |
US10657328B2 (en) | 2017-06-02 | 2020-05-19 | Apple Inc. | Multi-task recurrent neural network architecture for efficient morphology handling in neural language modeling |
US10445429B2 (en) | 2017-09-21 | 2019-10-15 | Apple Inc. | Natural language understanding using vocabularies with compressed serialized tries |
US10755051B2 (en) | 2017-09-29 | 2020-08-25 | Apple Inc. | Rule-based natural language processing |
US10636424B2 (en) | 2017-11-30 | 2020-04-28 | Apple Inc. | Multi-turn canned dialog |
US10733982B2 (en) | 2018-01-08 | 2020-08-04 | Apple Inc. | Multi-directional dialog |
US10733375B2 (en) | 2018-01-31 | 2020-08-04 | Apple Inc. | Knowledge-based framework for improving natural language understanding |
US10789959B2 (en) | 2018-03-02 | 2020-09-29 | Apple Inc. | Training speaker recognition models for digital assistants |
US10592604B2 (en) | 2018-03-12 | 2020-03-17 | Apple Inc. | Inverse text normalization for automatic speech recognition |
US10818288B2 (en) | 2018-03-26 | 2020-10-27 | Apple Inc. | Natural assistant interaction |
US10909331B2 (en) | 2018-03-30 | 2021-02-02 | Apple Inc. | Implicit identification of translation payload with neural machine translation |
US11145294B2 (en) | 2018-05-07 | 2021-10-12 | Apple Inc. | Intelligent automated assistant for delivering content from user experiences |
US10928918B2 (en) | 2018-05-07 | 2021-02-23 | Apple Inc. | Raise to speak |
US10984780B2 (en) | 2018-05-21 | 2021-04-20 | Apple Inc. | Global semantic word embeddings using bi-directional recurrent neural networks |
DK201870355A1 (en) | 2018-06-01 | 2019-12-16 | Apple Inc. | VIRTUAL ASSISTANT OPERATION IN MULTI-DEVICE ENVIRONMENTS |
US11386266B2 (en) | 2018-06-01 | 2022-07-12 | Apple Inc. | Text correction |
US10892996B2 (en) | 2018-06-01 | 2021-01-12 | Apple Inc. | Variable latency device coordination |
DK180639B1 (en) | 2018-06-01 | 2021-11-04 | Apple Inc | DISABILITY OF ATTENTION-ATTENTIVE VIRTUAL ASSISTANT |
DK179822B1 (da) | 2018-06-01 | 2019-07-12 | Apple Inc. | Voice interaction at a primary device to access call functionality of a companion device |
US10496705B1 (en) | 2018-06-03 | 2019-12-03 | Apple Inc. | Accelerated task performance |
US11010561B2 (en) | 2018-09-27 | 2021-05-18 | Apple Inc. | Sentiment prediction from textual data |
US11462215B2 (en) | 2018-09-28 | 2022-10-04 | Apple Inc. | Multi-modal inputs for voice commands |
US11170166B2 (en) | 2018-09-28 | 2021-11-09 | Apple Inc. | Neural typographical error modeling via generative adversarial networks |
US10839159B2 (en) | 2018-09-28 | 2020-11-17 | Apple Inc. | Named entity normalization in a spoken dialog system |
US11475898B2 (en) | 2018-10-26 | 2022-10-18 | Apple Inc. | Low-latency multi-speaker speech recognition |
US11638059B2 (en) | 2019-01-04 | 2023-04-25 | Apple Inc. | Content playback on multiple devices |
US11348573B2 (en) | 2019-03-18 | 2022-05-31 | Apple Inc. | Multimodality in digital assistant systems |
US11475884B2 (en) | 2019-05-06 | 2022-10-18 | Apple Inc. | Reducing digital assistant latency when a language is incorrectly determined |
US11307752B2 (en) | 2019-05-06 | 2022-04-19 | Apple Inc. | User configurable task triggers |
DK201970509A1 (en) | 2019-05-06 | 2021-01-15 | Apple Inc | Spoken notifications |
US11423908B2 (en) | 2019-05-06 | 2022-08-23 | Apple Inc. | Interpreting spoken requests |
US11140099B2 (en) | 2019-05-21 | 2021-10-05 | Apple Inc. | Providing message response suggestions |
US11289073B2 (en) | 2019-05-31 | 2022-03-29 | Apple Inc. | Device text to speech |
DK180129B1 (en) | 2019-05-31 | 2020-06-02 | Apple Inc. | USER ACTIVITY SHORTCUT SUGGESTIONS |
US11496600B2 (en) | 2019-05-31 | 2022-11-08 | Apple Inc. | Remote execution of machine-learned models |
DK201970511A1 (en) | 2019-05-31 | 2021-02-15 | Apple Inc | Voice identification in digital assistant systems |
US11360641B2 (en) | 2019-06-01 | 2022-06-14 | Apple Inc. | Increasing the relevance of new available information |
US11488406B2 (en) | 2019-09-25 | 2022-11-01 | Apple Inc. | Text detection using global geometry estimators |
TWI752437B (zh) * | 2020-03-13 | 2022-01-11 | 宇康生科股份有限公司 | Voice input operation method based on at least diphones, and computer program product |
US11183193B1 (en) | 2020-05-11 | 2021-11-23 | Apple Inc. | Digital assistant hardware abstraction |
US11755276B2 (en) | 2020-05-12 | 2023-09-12 | Apple Inc. | Reducing description length based on confidence |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1343337A (zh) * | 1999-03-05 | 2002-04-03 | 佳能株式会社 | Database annotation and retrieval |
Family Cites Families (141)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US1191425A (en) * | 1915-08-13 | 1916-07-18 | Henry Koch | Adjustable table. |
JPS58145998A (ja) * | 1982-02-25 | 1983-08-31 | ソニー株式会社 | Method for detecting speech transition points |
US5390278A (en) * | 1991-10-08 | 1995-02-14 | Bell Canada | Phoneme based speech recognition |
US5422816A (en) * | 1994-02-22 | 1995-06-06 | Trimble Navigation Limited | Portable personal navigation tracking system |
US6236365B1 (en) | 1996-09-09 | 2001-05-22 | Tracbeam, Llc | Location of a mobile station using a plurality of commercial wireless infrastructures |
US7225249B1 (en) * | 1997-09-26 | 2007-05-29 | Mci, Llc | Integrated systems for providing communications network management services and interactive generating invoice documents |
US7209949B2 (en) * | 1998-05-29 | 2007-04-24 | Research In Motion Limited | System and method for synchronizing information between a host system and a mobile data communication device |
US6718367B1 (en) * | 1999-06-01 | 2004-04-06 | General Interactive, Inc. | Filter for modeling system and method for handling and routing of text-based asynchronous communications |
US7177795B1 (en) * | 1999-11-10 | 2007-02-13 | International Business Machines Corporation | Methods and apparatus for semantic unit based automatic indexing and searching in data archive systems |
US6834270B1 (en) * | 2000-02-28 | 2004-12-21 | Carlo Pagani | Secured financial transaction system using single use codes |
US7634528B2 (en) | 2000-03-16 | 2009-12-15 | Microsoft Corporation | Harnessing information about the timing of a user's client-server interactions to enhance messaging and collaboration services |
US7650376B1 (en) | 2000-03-27 | 2010-01-19 | Blumenau Trevor I | Content distribution system for distributing content over a network, with particular applicability to distributing high-bandwidth content |
US6700538B1 (en) * | 2000-03-29 | 2004-03-02 | Time Domain Corporation | System and method for estimating separation distance between impulse radios using impulse signal amplitude |
EP1410231A4 (en) * | 2000-04-03 | 2005-02-23 | Juergen Stark | METHOD AND SYSTEM FOR ELECTRONIC MESSAGING WITH CONTENT CONTROL |
US8489669B2 (en) * | 2000-06-07 | 2013-07-16 | Apple Inc. | Mobile data processing system moving interest radius |
US20030120822A1 (en) * | 2001-04-19 | 2003-06-26 | Langrind Nicholas A. | Isolated control plane addressing |
AU2001296866A1 (en) * | 2000-09-05 | 2002-03-22 | Zaplet, Inc. | Methods and apparatus providing electronic messages that are linked and aggregated |
FI114000B (fi) * | 2000-11-08 | 2004-07-15 | Mikko Kalervo Vaeaenaenen | Electronic short messaging and advertising method and corresponding means |
WO2002041029A1 (en) | 2000-11-15 | 2002-05-23 | Racetrace Inc. | Tag tracking |
US20030009385A1 (en) * | 2000-12-26 | 2003-01-09 | Tucciarone Joel D. | Electronic messaging system and method thereof |
US7266085B2 (en) * | 2001-03-21 | 2007-09-04 | Stine John A | Access and routing protocol for ad hoc network using synchronous collision resolution and node state dissemination |
US7263597B2 (en) * | 2001-04-19 | 2007-08-28 | Ciena Corporation | Network device including dedicated resources control plane |
EP1410353B1 (en) * | 2001-06-14 | 2009-12-30 | RF Code, Inc. | Wireless identification method and tag |
US7110524B2 (en) * | 2001-08-07 | 2006-09-19 | Qwest Communications International, Inc. | Method and system for call queueing and customer application interaction |
US6975994B2 (en) * | 2001-09-12 | 2005-12-13 | Technology Innovations, Llc | Device for providing speech driven control of a media presentation |
US7003570B2 (en) * | 2001-10-05 | 2006-02-21 | Bea Systems, Inc. | System for integrating java servlets with asynchronous messages |
KR100451180B1 (ko) * | 2001-11-28 | 2004-10-02 | 엘지전자 주식회사 | Method for transmitting short messages using tags |
US7065544B2 (en) * | 2001-11-29 | 2006-06-20 | Hewlett-Packard Development Company, L.P. | System and method for detecting repetitions in a multimedia stream |
US20030115366A1 (en) * | 2001-12-18 | 2003-06-19 | Robinson Brian R. | Asynchronous message delivery system and method |
US6879257B2 (en) * | 2002-02-25 | 2005-04-12 | Omron Corporation | State surveillance system and method for an object and the adjacent space, and a surveillance system for freight containers |
US7512649B2 (en) * | 2002-03-22 | 2009-03-31 | Sun Microsystems, Inc. | Distributed identities |
JP4978810B2 (ja) | 2002-05-30 | 2012-07-18 | 独立行政法人産業技術総合研究所 | Terminal device, information distribution device, information distribution system, and program |
US7966374B2 (en) * | 2002-07-01 | 2011-06-21 | Prolifiq Software Inc. | Adaptive media messaging, such as for rich media messages incorporating digital content |
US7707317B2 (en) * | 2002-07-01 | 2010-04-27 | Prolifiq Software Inc. | Adaptive electronic messaging |
US20040024585A1 (en) * | 2002-07-03 | 2004-02-05 | Amit Srivastava | Linguistic segmentation of speech |
US20040024817A1 (en) * | 2002-07-18 | 2004-02-05 | Binyamin Pinkas | Selectively restricting access of automated agents to computer services |
US20040022264A1 (en) * | 2002-07-30 | 2004-02-05 | Mccue Andrew Charles | Method of determining context in a subjectless message |
GB2396520A (en) | 2002-11-23 | 2004-06-23 | Liquid Drop Ltd | System for issuing and authenticating mobile tokens |
EP1434409A1 (en) * | 2002-12-23 | 2004-06-30 | Koninklijke KPN N.V. | Setting user preferences via a mobile terminal |
AU2004246547B2 (en) * | 2003-06-09 | 2008-10-30 | Toku Pte Ltd | System and method for providing a service |
US20040260551A1 (en) * | 2003-06-19 | 2004-12-23 | International Business Machines Corporation | System and method for configuring voice readers using semantic analysis |
US7266754B2 (en) * | 2003-08-14 | 2007-09-04 | Cisco Technology, Inc. | Detecting network denial of service attacks |
US20050049924A1 (en) * | 2003-08-27 | 2005-03-03 | Debettencourt Jason | Techniques for use with application monitoring to obtain transaction data |
WO2005025155A1 (en) | 2003-09-05 | 2005-03-17 | Petr Hejl | Reply recognition in communications |
US20050102625A1 (en) * | 2003-11-07 | 2005-05-12 | Lee Yong C. | Audio tag retrieval system and method |
US20050114357A1 (en) * | 2003-11-20 | 2005-05-26 | Rathinavelu Chengalvarayan | Collaborative media indexing system and method |
GB2409365B (en) * | 2003-12-19 | 2009-07-08 | Nokia Corp | Image handling |
US8112103B2 (en) | 2004-01-16 | 2012-02-07 | Kuang-Chao Eric Yeh | Methods and systems for mobile device messaging |
US7756709B2 (en) * | 2004-02-02 | 2010-07-13 | Applied Voice & Speech Technologies, Inc. | Detection of voice inactivity within a sound stream |
US8457300B2 (en) | 2004-02-12 | 2013-06-04 | Avaya Inc. | Instant message contact management in a contact center |
US7725545B2 (en) * | 2004-02-20 | 2010-05-25 | Sybase 365, Inc. | Dual use counters for routing loops and spam detection |
US20050192808A1 (en) * | 2004-02-26 | 2005-09-01 | Sharp Laboratories Of America, Inc. | Use of speech recognition for identification and classification of images in a camera-equipped mobile handset |
US7539860B2 (en) * | 2004-03-18 | 2009-05-26 | American Express Travel Related Services Company, Inc. | Single use user IDS |
CN1973504A (zh) * | 2004-06-07 | 2007-05-30 | 99有限公司 | Method and apparatus for routing communications |
US7693945B1 (en) | 2004-06-30 | 2010-04-06 | Google Inc. | System for reclassification of electronic messages in a spam filtering system |
JP4018678B2 (ja) * | 2004-08-13 | 2007-12-05 | キヤノン株式会社 | Data management method and apparatus |
JP4587165B2 (ja) * | 2004-08-27 | 2010-11-24 | キヤノン株式会社 | Information processing apparatus and control method therefor |
WO2006058004A1 (en) * | 2004-11-23 | 2006-06-01 | Transera Communications | A method and system for monitoring and managing multi-sourced call centers |
US7218943B2 (en) * | 2004-12-13 | 2007-05-15 | Research In Motion Limited | Text messaging conversation user interface functionality |
US7512659B2 (en) * | 2004-12-16 | 2009-03-31 | International Business Machines Corporation | Enabling interactive electronic mail and real-time messaging |
US7574453B2 (en) | 2005-01-03 | 2009-08-11 | Orb Networks, Inc. | System and method for enabling search and retrieval operations to be performed for data items and records using data obtained from associated voice files |
EP2296386A1 (en) | 2005-05-20 | 2011-03-16 | Qualcomm Incorporated | Asynchronous media communications using priority tags |
US20060287867A1 (en) * | 2005-06-17 | 2006-12-21 | Cheng Yan M | Method and apparatus for generating a voice tag |
US7471775B2 (en) * | 2005-06-30 | 2008-12-30 | Motorola, Inc. | Method and apparatus for generating and updating a voice tag |
US7957520B2 (en) * | 2005-07-14 | 2011-06-07 | Cisco Technology, Inc. | System and method for responding to an emergency at a call center |
US20070033229A1 (en) * | 2005-08-03 | 2007-02-08 | Ethan Fassett | System and method for indexing structured and unstructured audio content |
US7886083B2 (en) | 2005-08-31 | 2011-02-08 | Microsoft Corporation | Offloaded neighbor cache entry synchronization |
US20070078986A1 (en) * | 2005-09-13 | 2007-04-05 | Cisco Technology, Inc. | Techniques for reducing session set-up for real-time communications over a network |
US7702821B2 (en) | 2005-09-15 | 2010-04-20 | Eye-Fi, Inc. | Content-aware digital media storage device and methods of using the same |
US7551935B2 (en) * | 2005-09-21 | 2009-06-23 | U Owe Me, Inc. | SMS+4D: short message service plus 4-dimensional context |
US8489132B2 (en) | 2005-09-21 | 2013-07-16 | Buckyball Mobile Inc. | Context-enriched microblog posting |
US9009265B2 (en) * | 2005-09-28 | 2015-04-14 | Photobucket Corporation | System and method for automatic transfer of data from one device to another |
CN1852354A (zh) | 2005-10-17 | 2006-10-25 | 华为技术有限公司 | Method and apparatus for collecting user behavior characteristics |
US8209620B2 (en) * | 2006-01-31 | 2012-06-26 | Accenture Global Services Limited | System for storage and navigation of application states and interactions |
US7945653B2 (en) * | 2006-10-11 | 2011-05-17 | Facebook, Inc. | Tagging digital media |
US20070171066A1 (en) * | 2005-12-20 | 2007-07-26 | Edward Merritt | Security-enabled digital media and authentication methods thereof |
KR100833500B1 (ko) | 2006-01-24 | 2008-05-29 | 한국전자통신연구원 | System and method for providing a voice EPG service using EPG XML with added voice tags in a DAB/DMB broadcasting system |
US20070174326A1 (en) * | 2006-01-24 | 2007-07-26 | Microsoft Corporation | Application of metadata to digital media |
ES2420559T3 (es) * | 2006-02-10 | 2013-08-23 | Spinvox Limited | A large-scale, user-independent and device-independent system for converting voice messages to text |
US8151323B2 (en) | 2006-04-12 | 2012-04-03 | Citrix Systems, Inc. | Systems and methods for providing levels of access and action control via an SSL VPN appliance |
WO2007140023A2 (en) * | 2006-06-01 | 2007-12-06 | Voxpixel, Inc. | Methods and systems for incorporating a voice-attached, tagged rich media package from a wireless camera-equipped handheld mobile device into a collaborative workflow |
US20070290787A1 (en) * | 2006-06-20 | 2007-12-20 | Trevor Fiatal | Systems and methods for group messaging |
US7729689B2 (en) * | 2006-07-13 | 2010-06-01 | International Business Machines Corporation | Mobile wireless device adaptation based on abstracted contectual situation of user using near-field communications and information collectors |
US7652813B2 (en) * | 2006-08-30 | 2010-01-26 | Silicon Quest Kabushiki-Kaisha | Mirror device |
US8239480B2 (en) * | 2006-08-31 | 2012-08-07 | Sony Ericsson Mobile Communications Ab | Methods of searching using captured portions of digital audio content and additional information separate therefrom and related systems and computer program products |
NZ549654A (en) | 2006-09-01 | 2007-05-31 | Run The Red Ltd | A method of online payment authorization, a method of correlating text messages and systems therefor |
US20080075433A1 (en) * | 2006-09-22 | 2008-03-27 | Sony Ericsson Mobile Communications Ab | Locating digital images in a portable electronic device |
US7917911B2 (en) * | 2006-12-01 | 2011-03-29 | Computer Associates Think, Inc. | Automated grouping of messages provided to an application using execution path similarity analysis |
US9282446B2 (en) | 2009-08-06 | 2016-03-08 | Golba Llc | Location-aware content and location-based advertising with a mobile device |
US8136090B2 (en) * | 2006-12-21 | 2012-03-13 | International Business Machines Corporation | System and methods for applying social computing paradigm to software installation and configuration |
US20080159266A1 (en) * | 2006-12-30 | 2008-07-03 | Arcsoft (Shanghai) Technology Company, Ltd | Determining Pairings of Telephone Numbers and IP Addresses from Caching and Peer-To-Peer Lookup |
US20090012841A1 (en) * | 2007-01-05 | 2009-01-08 | Yahoo! Inc. | Event communication platform for mobile device users |
US7788247B2 (en) * | 2007-01-12 | 2010-08-31 | Microsoft Corporation | Characteristic tagging |
US8060123B2 (en) * | 2007-03-19 | 2011-11-15 | Sony Corporation | System and method for using SMS and tagged message to send position and travel information to server and/or to peers |
US8761815B2 (en) * | 2007-03-21 | 2014-06-24 | Motorola Mobility Llc | Method, device and system for accessing mobile device user information |
US7577433B2 (en) * | 2007-06-18 | 2009-08-18 | Cvon Innovations Limited | Method and system for managing delivery of communications |
AU2008201643B1 (en) | 2007-07-24 | 2008-08-28 | Rambrandt Messaging Technologies, LP | Messaging service in a wireless communications network |
KR101459136B1 (ko) * | 2007-09-03 | 2014-11-10 | 엘지전자 주식회사 | Audio data player and method for generating a playlist thereof |
CA2702397A1 (en) * | 2007-09-12 | 2009-03-19 | Airkast, Inc. | Wireless device tagging system and method |
US8347231B2 (en) * | 2007-10-08 | 2013-01-01 | At&T Intellectual Property I, L.P. | Methods, systems, and computer program products for displaying tag words for selection by users engaged in social tagging of content |
GB2453810A (en) * | 2007-10-15 | 2009-04-22 | Cvon Innovations Ltd | System, Method and Computer Program for Modifying Communications by Insertion of a Targeted Media Content or Advertisement |
US8539097B2 (en) * | 2007-11-14 | 2013-09-17 | Oracle International Corporation | Intelligent message processing |
PL2061284T3 (pl) * | 2007-11-15 | 2014-07-31 | Deutsche Telekom Ag | Method and system for providing an unconditional short message (SMS) forwarding service |
US8472972B2 (en) * | 2007-11-21 | 2013-06-25 | International Business Machines Corporation | Device, system, and method of physical context based wireless communication |
US8307029B2 (en) * | 2007-12-10 | 2012-11-06 | Yahoo! Inc. | System and method for conditional delivery of messages |
US20090150786A1 (en) * | 2007-12-10 | 2009-06-11 | Brown Stephen J | Media content tagging on a social network |
US20090164287A1 (en) * | 2007-12-24 | 2009-06-25 | Kies Jonathan K | Method and apparatus for optimizing presentation of media content on a wireless device based on user behavior |
US20090191902A1 (en) * | 2008-01-25 | 2009-07-30 | John Osborne | Text Scripting |
US9111286B2 (en) * | 2008-02-01 | 2015-08-18 | Qualcomm, Incorporated | Multiple actions and icons for mobile advertising |
US8015005B2 (en) * | 2008-02-15 | 2011-09-06 | Motorola Mobility, Inc. | Method and apparatus for voice searching for stored content using uniterm discovery |
US7996432B2 (en) * | 2008-02-25 | 2011-08-09 | International Business Machines Corporation | Systems, methods and computer program products for the creation of annotations for media content to enable the selective management and playback of media content |
US20100030578A1 (en) | 2008-03-21 | 2010-02-04 | Siddique M A Sami | System and method for collaborative shopping, business and entertainment |
US20090265631A1 (en) | 2008-04-18 | 2009-10-22 | Yahoo! Inc. | System and method for a user interface to navigate a collection of tags labeling content |
US9906620B2 (en) | 2008-05-05 | 2018-02-27 | Radware, Ltd. | Extensible, asynchronous, centralized analysis and optimization of server responses to client requests |
US8948731B2 (en) * | 2008-07-18 | 2015-02-03 | Qualcomm Incorporated | Rating of message content for content control in wireless devices |
US9152722B2 (en) | 2008-07-22 | 2015-10-06 | Yahoo! Inc. | Augmenting online content with additional content relevant to user interest |
US8260846B2 (en) | 2008-07-25 | 2012-09-04 | Liveperson, Inc. | Method and system for providing targeted content to a surfer |
US8385971B2 (en) | 2008-08-19 | 2013-02-26 | Digimarc Corporation | Methods and systems for content processing |
US20100049599A1 (en) * | 2008-08-20 | 2010-02-25 | First Data Corporation | Filtering mobile marketing offers |
GB2461730B (en) | 2008-08-22 | 2010-11-10 | Peter Tanner | A communication device |
US8365267B2 (en) * | 2008-11-13 | 2013-01-29 | Yahoo! Inc. | Single use web based passwords for network login |
US8831203B2 (en) | 2008-12-23 | 2014-09-09 | Genesys Telecommunications Laboratories, Inc. | System and methods for tracking unresolved customer involvement with a service organization and automatically formulating a dynamic service solution |
KR20100079639A (ko) * | 2008-12-31 | 2010-07-08 | 삼성전자주식회사 | System and method for searching for sound sources using map information |
US9857501B2 (en) * | 2009-02-13 | 2018-01-02 | Centurylink Intellectual Property Llc | System and method for a wireless phone enabled with weather alerts |
US8638911B2 (en) * | 2009-07-24 | 2014-01-28 | Avaya Inc. | Classification of voice messages based on analysis of the content of the message and user-provisioned tagging rules |
US8539542B1 (en) | 2009-08-25 | 2013-09-17 | Whdc Llc | System and method for managing multiple live video broadcasts via a public data network on a single viewing channel |
US20110061068A1 (en) | 2009-09-10 | 2011-03-10 | Rashad Mohammad Ali | Tagging media with categories |
US8370358B2 (en) | 2009-09-18 | 2013-02-05 | Microsoft Corporation | Tagging content with metadata pre-filtered by context |
US9438741B2 (en) | 2009-09-30 | 2016-09-06 | Nuance Communications, Inc. | Spoken tags for telecom web platforms in a social network |
US9183580B2 (en) | 2010-11-04 | 2015-11-10 | Digimarc Corporation | Methods and systems for resource management on portable devices |
CA2684678A1 (en) | 2009-11-03 | 2011-05-03 | Research In Motion Limited | System and method for dynamic post-processing on a mobile device |
US20110141855A1 (en) | 2009-12-11 | 2011-06-16 | General Motors Llc | System and method for updating information in electronic calendars |
US8230054B2 (en) | 2009-12-23 | 2012-07-24 | Citrix Systems, Inc. | Systems and methods for managing dynamic proximity in multi-core GSLB appliance |
US8463887B2 (en) | 2009-12-23 | 2013-06-11 | Citrix Systems, Inc. | Systems and methods for server surge protection in a multi-core system |
US8903847B2 (en) | 2010-03-05 | 2014-12-02 | International Business Machines Corporation | Digital media voice tags in social networks |
US8583725B2 (en) | 2010-04-05 | 2013-11-12 | Microsoft Corporation | Social context for inter-media objects |
EP2567346B1 (en) | 2010-05-05 | 2017-06-14 | Digimarc Corporation | Hidden image signaling |
US20110276513A1 (en) | 2010-05-10 | 2011-11-10 | Avaya Inc. | Method of automatic customer satisfaction monitoring through social media |
US20120246238A1 (en) | 2011-03-21 | 2012-09-27 | International Business Machines Corporation | Asynchronous messaging tags |
WO2013075071A1 (en) | 2011-11-18 | 2013-05-23 | Ayman Hammad | Mobile wallet store and service injection platform apparatuses, methods and systems |
US9406222B2 (en) | 2012-10-18 | 2016-08-02 | Calamp Corp. | Systems and methods for location reporting of detected events in vehicle operation |
2010
- 2010-03-05: US application US12/718,041, patent US8903847B2 (not active, Expired - Fee Related)
2011
- 2011-02-03: GB application GB1217273.0A, patent GB2491324B (Active)
- 2011-02-03: WO application PCT/US2011/023557, publication WO2011109137A1 (Application Filing)
- 2011-02-03: CN application CN201180012464.9A, patent CN102782751B (not active, Expired - Fee Related)
- 2011-02-03: JP application JP2012556078A, patent JP5671557B2 (not active, Expired - Fee Related)
- 2011-03-01: TW application TW100106768A, publication TW201209804A (status unknown)
Also Published As
Publication number | Publication date |
---|---|
GB201217273D0 (en) | 2012-11-14 |
US8903847B2 (en) | 2014-12-02 |
TW201209804A (en) | 2012-03-01 |
WO2011109137A1 (en) | 2011-09-09 |
JP2013521567A (ja) | 2013-06-10 |
GB2491324B (en) | 2017-03-22 |
US20110219018A1 (en) | 2011-09-08 |
GB2491324A (en) | 2012-11-28 |
CN102782751A (zh) | 2012-11-14 |
JP5671557B2 (ja) | 2015-02-18 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN102782751B (zh) | Digital media voice tags in social networks | |
US11315546B2 (en) | Computerized system and method for formatted transcription of multimedia content | |
US9229974B1 (en) | Classifying queries | |
US8238528B2 (en) | Automatic analysis of voice mail content | |
CN108305626A (zh) | Voice control method and apparatus for an application | |
US10928996B2 (en) | Systems, devices and methods for electronic determination and communication of location information | |
US20090018832A1 (en) | Information communication terminal, information communication system, information communication method, information communication program, and recording medium recording thereof | |
US20090327272A1 (en) | Method and System for Searching Multiple Data Types | |
US20170249934A1 (en) | Electronic device and method for operating the same | |
US20120109759A1 (en) | Speech recognition system platform | |
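CN104714942A (zh) | Method and system for content availability for natural language processing tasks | |
KR20140060217A (ko) | System and method for posting messages by audio signal | |
CN110442803A (zh) | Data processing method, apparatus, and medium executed by a computing device, and computing device | |
KR101440887B1 (ko) | Method and apparatus for recognizing business cards using image and voice information | |
CN110740212B (zh) | Call answering method and apparatus based on intelligent voice technology, and electronic device | |
CN111555960A (zh) | Information generation method | |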
EP2680256A1 (en) | System and method to analyze voice communications | |
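CN107368602A (zh) | Photo storage method and photo storage apparatus for a smart device | |
CN111259181B (zh) | Method and device for displaying information and providing information | |
CN109739970B (zh) | Information processing method and apparatus, and electronic device | |
JP2017167433A (ja) | Summary generation device, summary generation method, and summary generation program | |
WO2019098036A1 (ja) | Information processing device, information processing terminal, and information processing method | |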
US20230153061A1 (en) | Hierarchical Context Specific Actions from Ambient Speech | |
JP2019008378A (ja) | 広告システム及び広告方法 | |
US20210109960A1 (en) | Electronic apparatus and controlling method thereof |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20150211; Termination date: 20210203 |