DE112021005482T5 - AR (augmented reality) based selective sound inclusion from the surroundings while executing a voice command - Google Patents
- Publication number
- DE112021005482T5
- Authority
- DE
- Germany
- Prior art keywords
- sounds
- voice command
- unit
- selecting
- visualization
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/011—Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/16—Sound input; Sound output
- G06F3/167—Audio in a user interface, e.g. using voice commands for navigating, audio feedback
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/28—Constructional details of speech recognition systems
- G10L15/30—Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/06—Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids
- G10L21/10—Transforming into visible information
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L2015/223—Execution procedure of a spoken command
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Human Computer Interaction (AREA)
- General Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- General Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Data Mining & Analysis (AREA)
- Quality & Reliability (AREA)
- Signal Processing (AREA)
- User Interface Of Digital Computer (AREA)
- Selective Calling Equipment (AREA)
Applications Claiming Priority (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US17/102,687 US11978444B2 (en) | 2020-11-24 | 2020-11-24 | AR (augmented reality) based selective sound inclusion from the surrounding while executing any voice command |
| PCT/CN2021/129740 WO2022111282A1 (en) | 2020-11-24 | 2021-11-10 | AR (augmented reality) based selective sound inclusion from the surrounding while executing any voice command |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| DE112021005482T5 (de) | 2023-09-14 |
Family
ID=81657233
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| DE112021005482.1T Pending DE112021005482T5 (de) | 2020-11-24 | 2021-11-10 | AR (augmented reality) based selective sound inclusion from the surroundings while executing a voice command |
Country Status (6)
| Country | Link |
|---|---|
| US (1) | US11978444B2 |
| JP (1) | JP7824008B2 |
| CN (1) | CN116348950A |
| DE (1) | DE112021005482T5 |
| GB (1) | GB2616765B |
| WO (1) | WO2022111282A1 |
Families Citing this family (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN115079833B (zh) * | 2022-08-24 | 2023-01-06 | 北京亮亮视野科技有限公司 | Method and system for multi-layer interface and information visualization presentation based on somatosensory control |
Family Cites Families (37)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US6270040B1 (en) | 2000-04-03 | 2001-08-07 | Kam Industries | Model train control system |
| ATE400871T1 (de) * | 2004-01-29 | 2008-07-15 | Harman Becker Automotive Sys | Multimodal data input |
| US8788589B2 (en) | 2007-10-12 | 2014-07-22 | Watchitoo, Inc. | System and method for coordinating simultaneous edits of shared digital data |
| US8769510B2 (en) | 2010-04-08 | 2014-07-01 | The Mathworks, Inc. | Identification and translation of program code executable by a graphical processing unit (GPU) |
| US8296151B2 (en) * | 2010-06-18 | 2012-10-23 | Microsoft Corporation | Compound gesture-speech commands |
| US8223088B1 (en) | 2011-06-09 | 2012-07-17 | Google Inc. | Multimode input field for a head-mounted display |
| US8971854B2 (en) * | 2012-06-19 | 2015-03-03 | Honeywell International Inc. | System and method of speaker recognition |
| US9966075B2 (en) * | 2012-09-18 | 2018-05-08 | Qualcomm Incorporated | Leveraging head mounted displays to enable person-to-person interactions |
| US10824310B2 (en) * | 2012-12-20 | 2020-11-03 | Sri International | Augmented reality virtual personal assistant for external representation |
| US9092600B2 (en) * | 2012-11-05 | 2015-07-28 | Microsoft Technology Licensing, Llc | User authentication on augmented reality display device |
| US9747900B2 (en) | 2013-05-24 | 2017-08-29 | Google Technology Holdings LLC | Method and apparatus for using image data to aid voice recognition |
| US9582246B2 (en) | 2014-03-04 | 2017-02-28 | Microsoft Technology Licensing, Llc | Voice-command suggestions based on computer context |
| US9293141B2 (en) | 2014-03-27 | 2016-03-22 | Storz Endoskop Produktions Gmbh | Multi-user voice control system for medical devices |
| US10152987B2 (en) * | 2014-06-23 | 2018-12-11 | Google Llc | Remote invocation of mobile device actions |
| FR3026543B1 (fr) | 2014-09-29 | 2017-12-22 | Christophe Guedon | Method for helping a hearing-impaired person follow a conversation |
| CN111427534B (zh) * | 2014-12-11 | 2023-07-25 | 微软技术许可有限责任公司 | Virtual assistant system enabling actionable messaging |
| US10146355B2 (en) * | 2015-03-26 | 2018-12-04 | Lenovo (Singapore) Pte. Ltd. | Human interface device input fusion |
| US20170243582A1 (en) * | 2016-02-19 | 2017-08-24 | Microsoft Technology Licensing, Llc | Hearing assistance with automated speech transcription |
| US10031967B2 (en) * | 2016-02-29 | 2018-07-24 | Rovi Guides, Inc. | Systems and methods for using a trained model for determining whether a query comprising multiple segments relates to an individual query or several queries |
| JP6918471B2 (ja) | 2016-11-24 | 2021-08-11 | パナソニック インテレクチュアル プロパティ コーポレーション オブ アメリカ (Panasonic Intellectual Property Corporation of America) | Control method for a dialogue assistance system, dialogue assistance system, and program |
| US11099716B2 (en) * | 2016-12-23 | 2021-08-24 | Realwear, Inc. | Context based content navigation for wearable display |
| US11107469B2 (en) * | 2017-01-18 | 2021-08-31 | Sony Corporation | Information processing apparatus and information processing method |
| WO2018140502A1 (en) * | 2017-01-27 | 2018-08-02 | Magic Leap, Inc. | Antireflection coatings for metasurfaces |
| US20180261223A1 (en) | 2017-03-13 | 2018-09-13 | Amazon Technologies, Inc. | Dialog management and item fulfillment using voice assistant system |
| US20200327890A1 (en) * | 2017-11-28 | 2020-10-15 | Sony Corporation | Information processing device and information processing method |
| CN108363556A (zh) | 2018-01-30 | 2018-08-03 | 百度在线网络技术(北京)有限公司 | Method and system for interacting with an augmented reality environment based on voice |
| US10365885B1 (en) * | 2018-02-21 | 2019-07-30 | Sling Media Pvt. Ltd. | Systems and methods for composition of audio content from multi-object audio |
| DE102018208703A1 (de) * | 2018-06-01 | 2019-12-05 | Volkswagen Aktiengesellschaft | Method for calculating an "augmented reality" overlay for displaying a navigation route on an AR display unit, device for carrying out the method, motor vehicle, and computer program |
| US10650829B2 (en) | 2018-06-06 | 2020-05-12 | International Business Machines Corporation | Operating a voice response system in a multiuser environment |
| CN109272982A (zh) * | 2018-09-07 | 2019-01-25 | 昆明盛策同辉数字科技有限责任公司 | Real-time TTS voice broadcast method combined with augmented reality, apparatus, storage medium, and device |
| US11120791B2 (en) | 2018-11-15 | 2021-09-14 | International Business Machines Corporation | Collaborative artificial intelligence (AI) voice response system control for authorizing a command associated with a calendar event |
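| KR20200072026A (ko) * | 2018-12-12 | 2020-06-22 | 현대자동차주식회사 | Apparatus and method for speech recognition processing |
| KR101990284B1 (ko) * | 2018-12-13 | 2019-06-18 | 주식회사 버넥트 | Augmented reality system based on intelligent cognitive technology using speech recognition |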
| US10499179B1 (en) * | 2019-01-01 | 2019-12-03 | Philip Scott Lyren | Displaying emojis for binaural sound |
| JP2020141235A (ja) | 2019-02-27 | 2020-09-03 | パナソニックIpマネジメント株式会社 | Device control system, device control method, and program |
| US11170774B2 (en) * | 2019-05-21 | 2021-11-09 | Qualcomm Incorporated | Virtual assistant device |
| CN110413106B (zh) * | 2019-06-18 | 2024-02-09 | 中国人民解放军军事科学院国防科技创新研究院 | Augmented reality input method and system based on voice and gestures |
2020
- 2020-11-24: US application US17/102,687, publication US11978444B2 (en), status: Active
2021
- 2021-11-10: JP application JP2023530249A, publication JP7824008B2 (ja), status: Active
- 2021-11-10: GB application GB2309312.3A, publication GB2616765B (en), status: Active
- 2021-11-10: WO application PCT/CN2021/129740, publication WO2022111282A1 (en), status: Ceased
- 2021-11-10: DE application DE112021005482.1T, publication DE112021005482T5 (de), status: Pending
- 2021-11-10: CN application CN202180071279.0A, publication CN116348950A (zh), status: Pending
Also Published As
| Publication number | Publication date |
|---|---|
| JP7824008B2 (ja) | 2026-03-04 |
| GB2616765A (en) | 2023-09-20 |
| WO2022111282A1 (en) | 2022-06-02 |
| JP2023551169A (ja) | 2023-12-07 |
| GB2616765B (en) | 2025-03-05 |
| CN116348950A (zh) | 2023-06-27 |
| US20220165260A1 (en) | 2022-05-26 |
| US11978444B2 (en) | 2024-05-07 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| DE102022102912A1 (de) | | Pipelines for efficient training and deployment of machine learning models |
| DE102021125855B4 (de) | | Self-learning voice control through artificial intelligence based on user behavior during an interaction |
| DE112020005323B4 (de) | | Elastic execution of machine learning workloads using application-based profiling |
| DE112021004261T5 (de) | | Dual-modality relation networks for audio-visual event localization |
| DE112020005253T5 (de) | | Anaphora resolution |
| DE112020000545T5 (de) | | Deep forest model development and training |
| DE112021004163T5 (de) | | Tailoring communication content |
| DE112018005227T5 (de) | | Feature extraction using multi-task learning |
| DE112018005167T5 (de) | | Updating training data |
| US20080162310A1 (en) | | System and Method for Creating and Implementing Community Defined Presentation Structures |
| DE112021004234T5 (de) | | Using meta-learning to optimize automatic selection of machine learning pipelines |
| DE102017207686A1 (de) | | Workforce strategy insights |
| DE102023108430A1 (de) | | Generating conversational responses using neural networks |
| DE112021005633T5 (de) | | Learning to match unpaired multimodal features for semi-supervised learning |
| DE112021002572T5 (de) | | Multi-objective optimization of applications |
| DE112022004517T5 (de) | | Optimizing lip synchronization in a video translated into natural language |
| DE102021124264A1 (de) | | Generating synthetic system faults |
| DE112015005269T5 (de) | | Augmenting an information request |
| DE112018001711T5 (de) | | Gaze-direction-based generator of class notes |
| DE102024136304A1 (de) | | Prompt suitability analysis for language-model-based AI systems and applications |
| DE112021001550T5 (de) | | Automatically creating enhancements to AV content |
| DE112020004925T5 (de) | | Updating and implementing a document from an audio proceeding |
| DE112020005296T5 (de) | | Searching conversation logs of a virtual dialog agent system for contrasting temporal patterns |
| DE112022001431T5 (de) | | Adaptive selection of data modalities for efficient video recognition |
| DE112022001483B4 (de) | | Operating command boundaries |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | R012 | Request for examination validly filed | |
| | R084 | Declaration of willingness to licence | |
| | R016 | Response to examination communication | |