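CN116348950A - AR (augmented reality) based selective sound inclusion from the surrounding while executing any voice command - Google Patents
AR (augmented reality) based selective sound inclusion from the surrounding while executing any voice command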
- Publication number
- CN116348950A (application CN202180071279.0A)
- Authority
- CN
- China
- Prior art keywords
- sounds
- voice command
- augmented reality
- selection
- computer
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/011—Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/16—Sound input; Sound output
- G06F3/167—Audio in a user interface, e.g. using voice commands for navigating, audio feedback
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/28—Constructional details of speech recognition systems
- G10L15/30—Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/06—Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids
- G10L21/10—Transforming into visible information
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L2015/223—Execution procedure of a spoken command
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Human Computer Interaction (AREA)
- General Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- General Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Data Mining & Analysis (AREA)
- Quality & Reliability (AREA)
- Signal Processing (AREA)
- User Interface Of Digital Computer (AREA)
- Selective Calling Equipment (AREA)
Applications Claiming Priority (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US17/102,687 US11978444B2 (en) | 2020-11-24 | 2020-11-24 | AR (augmented reality) based selective sound inclusion from the surrounding while executing any voice command |
| US17/102,687 | 2020-11-24 | ||
| PCT/CN2021/129740 WO2022111282A1 (en) | 2020-11-24 | 2021-11-10 | AR (augmented reality) based selective sound inclusion from the surrounding while executing any voice command |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN116348950A (zh) | 2023-06-27 |
Family
ID=81657233
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202180071279.0A (pending; published as CN116348950A (zh)) | 2020-11-24 | 2021-11-10 | AR (augmented reality) based selective sound inclusion from the surrounding while executing any voice command |
Country Status (6)
| Country | Link |
|---|---|
| US (1) | US11978444B2 |
| JP (1) | JP7824008B2 |
| CN (1) | CN116348950A |
| DE (1) | DE112021005482T5 |
| GB (1) | GB2616765B |
| WO (1) | WO2022111282A1 |
Families Citing this family (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN115079833B (zh) * | 2022-08-24 | 2023-01-06 | 北京亮亮视野科技有限公司 | Multi-layer interface and information visualization presentation method and system based on somatosensory control |
Citations (11)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US8223088B1 (en) * | 2011-06-09 | 2012-07-17 | Google Inc. | Multimode input field for a head-mounted display |
| WO2016050724A1 (fr) * | 2014-09-29 | 2016-04-07 | Christophe Guedon | Method for assisting a hearing-impaired person in following a conversation |
| CN107209549A (zh) * | 2014-12-11 | 2017-09-26 | 万德实验室公司 | Virtual assistant system to enable actionable messaging |
| CN108702580A (zh) * | 2016-02-19 | 2018-10-23 | 微软技术许可有限责任公司 | Hearing assistance with automated speech transcription |
| CN109272982A (zh) * | 2018-09-07 | 2019-01-25 | 昆明盛策同辉数字科技有限责任公司 | Real-time TTS voice broadcast method, apparatus, storage medium, and device combined with augmented reality |
| KR101990284B1 (ko) * | 2018-12-13 | 2019-06-18 | 주식회사 버넥트 | Augmented reality system based on intelligent cognitive technology using voice recognition |
| CN110199240A (zh) * | 2016-12-23 | 2019-09-03 | 瑞欧威尔股份有限公司 | Context-based content navigation for wearable displays |
| CN110413106A (zh) * | 2019-06-18 | 2019-11-05 | 中国人民解放军军事科学院国防科技创新研究院 | Augmented reality input method and system based on voice and gestures |
| CN110476090A (zh) * | 2017-01-27 | 2019-11-19 | 奇跃公司 | Antireflection coatings for metasurfaces |
| US10499179B1 (en) * | 2019-01-01 | 2019-12-03 | Philip Scott Lyren | Displaying emojis for binaural sound |
| DE102018208703A1 (de) * | 2018-06-01 | 2019-12-05 | Volkswagen Aktiengesellschaft | Method for calculating an "augmented reality" overlay for displaying a navigation route on an AR display unit, device for carrying out the method, motor vehicle, and computer program |
Family Cites Families (26)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US6270040B1 (en) | 2000-04-03 | 2001-08-07 | Kam Industries | Model train control system |
| ATE400871T1 (de) * | 2004-01-29 | 2008-07-15 | Harman Becker Automotive Sys | Multimodal data input |
| US8788589B2 (en) | 2007-10-12 | 2014-07-22 | Watchitoo, Inc. | System and method for coordinating simultaneous edits of shared digital data |
| US8769510B2 (en) | 2010-04-08 | 2014-07-01 | The Mathworks, Inc. | Identification and translation of program code executable by a graphical processing unit (GPU) |
| US8296151B2 (en) * | 2010-06-18 | 2012-10-23 | Microsoft Corporation | Compound gesture-speech commands |
| US8971854B2 (en) * | 2012-06-19 | 2015-03-03 | Honeywell International Inc. | System and method of speaker recognition |
| US9966075B2 (en) * | 2012-09-18 | 2018-05-08 | Qualcomm Incorporated | Leveraging head mounted displays to enable person-to-person interactions |
| US10824310B2 (en) * | 2012-12-20 | 2020-11-03 | Sri International | Augmented reality virtual personal assistant for external representation |
| US9092600B2 (en) * | 2012-11-05 | 2015-07-28 | Microsoft Technology Licensing, Llc | User authentication on augmented reality display device |
| US9747900B2 (en) | 2013-05-24 | 2017-08-29 | Google Technology Holdings LLC | Method and apparatus for using image data to aid voice recognition |
| US9582246B2 (en) | 2014-03-04 | 2017-02-28 | Microsoft Technology Licensing, Llc | Voice-command suggestions based on computer context |
| US9293141B2 (en) | 2014-03-27 | 2016-03-22 | Storz Endoskop Produktions Gmbh | Multi-user voice control system for medical devices |
| US10152987B2 (en) * | 2014-06-23 | 2018-12-11 | Google Llc | Remote invocation of mobile device actions |
| US10146355B2 (en) * | 2015-03-26 | 2018-12-04 | Lenovo (Singapore) Pte. Ltd. | Human interface device input fusion |
| US10031967B2 (en) * | 2016-02-29 | 2018-07-24 | Rovi Guides, Inc. | Systems and methods for using a trained model for determining whether a query comprising multiple segments relates to an individual query or several queries |
| JP6918471B2 (ja) | 2016-11-24 | 2021-08-11 | Panasonic Intellectual Property Corporation of America | Control method for a dialogue assistance system, dialogue assistance system, and program |
| US11107469B2 (en) * | 2017-01-18 | 2021-08-31 | Sony Corporation | Information processing apparatus and information processing method |
| US20180261223A1 (en) | 2017-03-13 | 2018-09-13 | Amazon Technologies, Inc. | Dialog management and item fulfillment using voice assistant system |
| US20200327890A1 (en) * | 2017-11-28 | 2020-10-15 | Sony Corporation | Information processing device and information processing method |
| CN108363556A (zh) | 2018-01-30 | 2018-08-03 | 百度在线网络技术(北京)有限公司 | Method and system for interacting with an augmented reality environment based on voice |
| US10365885B1 (en) * | 2018-02-21 | 2019-07-30 | Sling Media Pvt. Ltd. | Systems and methods for composition of audio content from multi-object audio |
| US10650829B2 (en) | 2018-06-06 | 2020-05-12 | International Business Machines Corporation | Operating a voice response system in a multiuser environment |
| US11120791B2 (en) | 2018-11-15 | 2021-09-14 | International Business Machines Corporation | Collaborative artificial intelligence (AI) voice response system control for authorizing a command associated with a calendar event |
| KR20200072026A (ko) * | 2018-12-12 | 2020-06-22 | 현대자동차주식회사 | Speech recognition processing apparatus and method |
| JP2020141235A (ja) | 2019-02-27 | 2020-09-03 | パナソニックIpマネジメント株式会社 | Device control system, device control method, and program |
| US11170774B2 (en) * | 2019-05-21 | 2021-11-09 | Qualcomm Incorporated | Virtual assistant device |
2020
- 2020-11-24: US application US17/102,687, patent US11978444B2 (en), status: Active

2021
- 2021-11-10: JP application JP2023530249A, patent JP7824008B2 (ja), status: Active
- 2021-11-10: GB application GB2309312.3A, patent GB2616765B (en), status: Active
- 2021-11-10: WO application PCT/CN2021/129740, publication WO2022111282A1 (en), status: Ceased
- 2021-11-10: DE application DE112021005482.1T, publication DE112021005482T5 (de), status: Pending
- 2021-11-10: CN application CN202180071279.0A, publication CN116348950A (zh), status: Pending
Non-Patent Citations (1)
| Title |
|---|
| 张永生 et al.: "Development and Implementation of an AR-Based Interactive 3D E-Book", Journal of Qiqihar University (Natural Science Edition), vol. 32, no. 2, 31 March 2016 (2016-03-31), pages 60-63 * |
Also Published As
| Publication number | Publication date |
|---|---|
| JP7824008B2 (ja) | 2026-03-04 |
| GB2616765A (en) | 2023-09-20 |
| WO2022111282A1 (en) | 2022-06-02 |
| JP2023551169A (ja) | 2023-12-07 |
| GB2616765B (en) | 2025-03-05 |
| DE112021005482T5 (de) | 2023-09-14 |
| US20220165260A1 (en) | 2022-05-26 |
| US11978444B2 (en) | 2024-05-07 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN114365120B (zh) | | Methods, systems, apparatus, and media for reduced training intent recognition |
| EP4172843B1 (en) | | Using a single request for multi-person calling in assistant systems |
| US11699194B2 (en) | | User controlled task execution with task persistence for assistant systems |
| US11551665B2 (en) | | Dynamic contextual dialog session extension |
| JP2022551788A (ja) | | Generating proactive content for assistant systems |
| CN107895577A (zh) | | Task initiation using long-tail voice commands |
| US12340172B2 (en) | | Semantic parser including a coarse semantic parser and a fine semantic parser |
| US11403462B2 (en) | | Streamlining dialog processing using integrated shared resources |
| EP3792912B1 (en) | | Improved wake-word recognition in low-power devices |
| KR20250002657A (ko) | | Multimodal UI with semantic events |
| CN116348950A (zh) | | AR (augmented reality) based selective sound inclusion from the surrounding while executing any voice command |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |