CN116348950A - AR (augmented reality) based selective sound inclusion from the surrounding while executing any voice command - Google Patents

AR (augmented reality) based selective sound inclusion from the surrounding while executing any voice command

Info

Publication number
CN116348950A
Authority
CN
China
Prior art keywords
sounds
voice command
augmented reality
selection
computer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202180071279.0A
Other languages
English (en)
Chinese (zh)
Inventor
C·德克罗普
T·阿格拉瓦尔
J·R·福克斯
S·K·拉克希特
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
International Business Machines Corp
Original Assignee
International Business Machines Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2020-11-24
Filing date
2021-11-10
Publication date
2023-06-27
Application filed by International Business Machines Corp
Publication of CN116348950A
Current legal status: Pending

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011 Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/16 Sound input; Sound output
    • G06F3/167 Audio in a user interface, e.g. using voice commands for navigating, audio feedback
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/28 Constructional details of speech recognition systems
    • G10L15/30 Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/06 Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids
    • G10L21/10 Transforming into visible information
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223 Execution procedure of a spoken command
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • General Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Quality & Reliability (AREA)
  • Signal Processing (AREA)
  • User Interface Of Digital Computer (AREA)
  • Selective Calling Equipment (AREA)
CN202180071279.0A 2020-11-24 2021-11-10 AR (augmented reality) based selective sound inclusion from the surrounding while executing any voice command Pending CN116348950A (zh)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US17/102,687 US11978444B2 (en) 2020-11-24 2020-11-24 AR (augmented reality) based selective sound inclusion from the surrounding while executing any voice command
US17/102,687 2020-11-24
PCT/CN2021/129740 WO2022111282A1 (en) 2020-11-24 2021-11-10 AR (augmented reality) based selective sound inclusion from the surrounding while executing any voice command

Publications (1)

Publication Number Publication Date
CN116348950A (zh) 2023-06-27

Family

ID=81657233

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202180071279.0A Pending CN116348950A (zh) 2020-11-24 2021-11-10 AR (augmented reality) based selective sound inclusion from the surrounding while executing any voice command

Country Status (6)

Country Link
US (1) US11978444B2
JP (1) JP7824008B2
CN (1) CN116348950A
DE (1) DE112021005482T5
GB (1) GB2616765B
WO (1) WO2022111282A1

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115079833B (zh) * 2022-08-24 2023-01-06 北京亮亮视野科技有限公司 Method and system for multi-layer interface and information visualization presentation based on somatosensory control

Citations (11)

Publication number Priority date Publication date Assignee Title
US8223088B1 (en) * 2011-06-09 2012-07-17 Google Inc. Multimode input field for a head-mounted display
WO2016050724A1 (fr) * 2014-09-29 2016-04-07 Christophe Guedon Method for assisting a hearing-impaired person in following a conversation
CN107209549A (zh) * 2014-12-11 2017-09-26 万德实验室公司 Virtual assistant system to enable actionable messaging
CN108702580A (zh) * 2016-02-19 2018-10-23 微软技术许可有限责任公司 Hearing assistance with automated speech transcription
CN109272982A (zh) * 2018-09-07 2019-01-25 昆明盛策同辉数字科技有限责任公司 Real-time TTS voice broadcast method, apparatus, storage medium, and device combined with augmented reality
KR101990284B1 (ko) * 2018-12-13 2019-06-18 주식회사 버넥트 Augmented reality system based on intelligent cognitive technology using voice recognition
CN110199240A (zh) * 2016-12-23 2019-09-03 瑞欧威尔股份有限公司 Context-based content navigation for wearable displays
CN110413106A (zh) * 2019-06-18 2019-11-05 中国人民解放军军事科学院国防科技创新研究院 Augmented reality input method and system based on voice and gestures
CN110476090A (zh) * 2017-01-27 2019-11-19 奇跃公司 Anti-reflective coatings for metasurfaces
US10499179B1 (en) * 2019-01-01 2019-12-03 Philip Scott Lyren Displaying emojis for binaural sound
DE102018208703A1 (de) * 2018-06-01 2019-12-05 Volkswagen Aktiengesellschaft Method for calculating an "augmented reality" overlay for displaying a navigation route on an AR display unit, device for carrying out the method, and motor vehicle and computer program

Family Cites Families (26)

Publication number Priority date Publication date Assignee Title
US6270040B1 (en) 2000-04-03 2001-08-07 Kam Industries Model train control system
ATE400871T1 (de) * 2004-01-29 2008-07-15 Harman Becker Automotive Sys Multimodal data input
US8788589B2 (en) 2007-10-12 2014-07-22 Watchitoo, Inc. System and method for coordinating simultaneous edits of shared digital data
US8769510B2 (en) 2010-04-08 2014-07-01 The Mathworks, Inc. Identification and translation of program code executable by a graphical processing unit (GPU)
US8296151B2 (en) * 2010-06-18 2012-10-23 Microsoft Corporation Compound gesture-speech commands
US8971854B2 (en) * 2012-06-19 2015-03-03 Honeywell International Inc. System and method of speaker recognition
US9966075B2 (en) * 2012-09-18 2018-05-08 Qualcomm Incorporated Leveraging head mounted displays to enable person-to-person interactions
US10824310B2 (en) * 2012-12-20 2020-11-03 Sri International Augmented reality virtual personal assistant for external representation
US9092600B2 (en) * 2012-11-05 2015-07-28 Microsoft Technology Licensing, Llc User authentication on augmented reality display device
US9747900B2 (en) 2013-05-24 2017-08-29 Google Technology Holdings LLC Method and apparatus for using image data to aid voice recognition
US9582246B2 (en) 2014-03-04 2017-02-28 Microsoft Technology Licensing, Llc Voice-command suggestions based on computer context
US9293141B2 (en) 2014-03-27 2016-03-22 Storz Endoskop Produktions Gmbh Multi-user voice control system for medical devices
US10152987B2 (en) * 2014-06-23 2018-12-11 Google Llc Remote invocation of mobile device actions
US10146355B2 (en) * 2015-03-26 2018-12-04 Lenovo (Singapore) Pte. Ltd. Human interface device input fusion
US10031967B2 (en) * 2016-02-29 2018-07-24 Rovi Guides, Inc. Systems and methods for using a trained model for determining whether a query comprising multiple segments relates to an individual query or several queries
JP6918471B2 (ja) 2016-11-24 2021-08-11 Panasonic Intellectual Property Corporation of America Control method for dialogue assistance system, dialogue assistance system, and program
US11107469B2 (en) * 2017-01-18 2021-08-31 Sony Corporation Information processing apparatus and information processing method
US20180261223A1 (en) 2017-03-13 2018-09-13 Amazon Technologies, Inc. Dialog management and item fulfillment using voice assistant system
US20200327890A1 (en) * 2017-11-28 2020-10-15 Sony Corporation Information processing device and information processing method
CN108363556A (zh) 2018-01-30 2018-08-03 百度在线网络技术(北京)有限公司 Method and system for interacting with an augmented reality environment based on voice
US10365885B1 (en) * 2018-02-21 2019-07-30 Sling Media Pvt. Ltd. Systems and methods for composition of audio content from multi-object audio
US10650829B2 (en) 2018-06-06 2020-05-12 International Business Machines Corporation Operating a voice response system in a multiuser environment
US11120791B2 (en) 2018-11-15 2021-09-14 International Business Machines Corporation Collaborative artificial intelligence (AI) voice response system control for authorizing a command associated with a calendar event
KR20200072026A (ko) * 2018-12-12 2020-06-22 현대자동차주식회사 Speech recognition processing apparatus and method
JP2020141235A (ja) 2019-02-27 2020-09-03 パナソニックIpマネジメント株式会社 Device control system, device control method, and program
US11170774B2 (en) * 2019-05-21 2021-11-09 Qualcomm Incorporated Virtual assistant device

Patent Citations (11)

Publication number Priority date Publication date Assignee Title
US8223088B1 (en) * 2011-06-09 2012-07-17 Google Inc. Multimode input field for a head-mounted display
WO2016050724A1 (fr) * 2014-09-29 2016-04-07 Christophe Guedon Method for assisting a hearing-impaired person in following a conversation
CN107209549A (zh) * 2014-12-11 2017-09-26 万德实验室公司 Virtual assistant system to enable actionable messaging
CN108702580A (zh) * 2016-02-19 2018-10-23 微软技术许可有限责任公司 Hearing assistance with automated speech transcription
CN110199240A (zh) * 2016-12-23 2019-09-03 瑞欧威尔股份有限公司 Context-based content navigation for wearable displays
CN110476090A (zh) * 2017-01-27 2019-11-19 奇跃公司 Anti-reflective coatings for metasurfaces
DE102018208703A1 (de) * 2018-06-01 2019-12-05 Volkswagen Aktiengesellschaft Method for calculating an "augmented reality" overlay for displaying a navigation route on an AR display unit, device for carrying out the method, and motor vehicle and computer program
CN109272982A (zh) * 2018-09-07 2019-01-25 昆明盛策同辉数字科技有限责任公司 Real-time TTS voice broadcast method, apparatus, storage medium, and device combined with augmented reality
KR101990284B1 (ko) * 2018-12-13 2019-06-18 주식회사 버넥트 Augmented reality system based on intelligent cognitive technology using voice recognition
US10499179B1 (en) * 2019-01-01 2019-12-03 Philip Scott Lyren Displaying emojis for binaural sound
CN110413106A (zh) * 2019-06-18 2019-11-05 中国人民解放军军事科学院国防科技创新研究院 Augmented reality input method and system based on voice and gestures

Non-Patent Citations (1)

Title
张永生 et al.: "Development and Implementation of an AR-Based Interactive 3D E-Book", 齐齐哈尔大学学报(自然科学版), vol. 32, no. 2, 31 March 2016 (2016-03-31), pages 60-63 *

Also Published As

Publication number Publication date
JP7824008B2 (ja) 2026-03-04
GB2616765A (en) 2023-09-20
WO2022111282A1 (en) 2022-06-02
JP2023551169A (ja) 2023-12-07
GB2616765B (en) 2025-03-05
DE112021005482T5 (de) 2023-09-14
US20220165260A1 (en) 2022-05-26
US11978444B2 (en) 2024-05-07

Similar Documents

Publication Publication Date Title
CN114365120B (zh) Method, system, apparatus, and medium for reduced-training intent recognition
EP4172843B1 (en) Using a single request for multi-person calling in assistant systems
US11699194B2 (en) User controlled task execution with task persistence for assistant systems
US11551665B2 (en) Dynamic contextual dialog session extension
JP2022551788A (ja) Generating proactive content for assistant systems
CN107895577A (zh) Task initiation using long-tail voice commands
US12340172B2 (en) Semantic parser including a coarse semantic parser and a fine semantic parser
US11403462B2 (en) Streamlining dialog processing using integrated shared resources
EP3792912B1 (en) Improved wake-word recognition in low-power devices
KR20250002657A (ko) Multimodal UI with semantic events
CN116348950A (zh) AR (augmented reality) based selective sound inclusion from the surrounding while executing any voice command

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination