GB2616765B - AR (augmented reality) based selective sound inclusion from the surrounding while executing any voice command

Info

Publication number
GB2616765B
Authority
GB
United Kingdom
Prior art keywords
executing
surrounding
augmented reality
voice command
based selective
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
GB2309312.3A
Other languages
English (en)
Other versions
GB2616765A (en)
Inventor
Decrop Clement
Agrawal Tushar
R Fox Jeremy
K Rakshit Sarbajit
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
International Business Machines Corp
Original Assignee
International Business Machines Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2020-11-24
Filing date
2021-11-10
Publication date
2025-03-05
Application filed by International Business Machines Corp
Publication of GB2616765A
Application granted
Publication of GB2616765B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011 Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/16 Sound input; Sound output
    • G06F3/167 Audio in a user interface, e.g. using voice commands for navigating, audio feedback
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/28 Constructional details of speech recognition systems
    • G10L15/30 Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/06 Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids
    • G10L21/10 Transforming into visible information
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223 Execution procedure of a spoken command
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • General Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Quality & Reliability (AREA)
  • Signal Processing (AREA)
  • User Interface Of Digital Computer (AREA)
  • Selective Calling Equipment (AREA)
GB2309312.3A 2020-11-24 2021-11-10 AR (augmented reality) based selective sound inclusion from the surrounding while executing any voice command Active GB2616765B (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US17/102,687 US11978444B2 (en) 2020-11-24 2020-11-24 AR (augmented reality) based selective sound inclusion from the surrounding while executing any voice command
PCT/CN2021/129740 WO2022111282A1 (en) 2021-11-10 AR (augmented reality) based selective sound inclusion from the surrounding while executing any voice command

Publications (2)

Publication Number Publication Date
GB2616765A (en) 2023-09-20
GB2616765B (en) 2025-03-05

Family

ID=81657233

Family Applications (1)

Application Number Status Publication Number Priority Date Filing Date Title
GB2309312.3A Active GB2616765B (en) 2020-11-24 2021-11-10 AR (augmented reality) based selective sound inclusion from the surrounding while executing any voice command

Country Status (6)

Country Link
US (1) US11978444B2
JP (1) JP7824008B2
CN (1) CN116348950A
DE (1) DE112021005482T5
GB (1) GB2616765B
WO (1) WO2022111282A1

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115079833B (zh) * 2022-08-24 2023-01-06 北京亮亮视野科技有限公司 Multi-layer interface and information visualization presentation method and system based on somatosensory control

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8223088B1 (en) * 2011-06-09 2012-07-17 Google Inc. Multimode input field for a head-mounted display
US20140081634A1 (en) * 2012-09-18 2014-03-20 Qualcomm Incorporated Leveraging head mounted displays to enable person-to-person interactions
US20150254058A1 (en) * 2014-03-04 2015-09-10 Microsoft Technology Licensing, Llc Voice control shortcuts
WO2016050724A1 (fr) * 2014-09-29 2016-04-07 Christophe Guedon Method for helping a hearing-impaired person follow a conversation
CN108363556A (zh) * 2018-01-30 2018-08-03 百度在线网络技术(北京)有限公司 Method and system for voice-based interaction with an augmented reality environment
US20190378516A1 (en) * 2018-06-06 2019-12-12 International Business Machines Corporation Operating a voice response system in a multiuser environment
WO2020175293A1 (ja) * 2019-02-27 2020-09-03 パナソニックIpマネジメント株式会社 Device control system, device control method, and program

Family Cites Families (30)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6270040B1 (en) 2000-04-03 2001-08-07 Kam Industries Model train control system
ATE400871T1 (de) * 2004-01-29 2008-07-15 Harman Becker Automotive Sys Multimodal data input
US8788589B2 (en) 2007-10-12 2014-07-22 Watchitoo, Inc. System and method for coordinating simultaneous edits of shared digital data
US8769510B2 (en) 2010-04-08 2014-07-01 The Mathworks, Inc. Identification and translation of program code executable by a graphical processing unit (GPU)
US8296151B2 (en) * 2010-06-18 2012-10-23 Microsoft Corporation Compound gesture-speech commands
US8971854B2 (en) * 2012-06-19 2015-03-03 Honeywell International Inc. System and method of speaker recognition
US10824310B2 (en) * 2012-12-20 2020-11-03 Sri International Augmented reality virtual personal assistant for external representation
US9092600B2 (en) * 2012-11-05 2015-07-28 Microsoft Technology Licensing, Llc User authentication on augmented reality display device
US9747900B2 (en) 2013-05-24 2017-08-29 Google Technology Holdings LLC Method and apparatus for using image data to aid voice recognition
US9293141B2 (en) 2014-03-27 2016-03-22 Storz Endoskop Produktions Gmbh Multi-user voice control system for medical devices
US10152987B2 (en) * 2014-06-23 2018-12-11 Google Llc Remote invocation of mobile device actions
CN111427534B (zh) * 2014-12-11 2023-07-25 微软技术许可有限责任公司 Virtual assistant system capable of actionable messaging
US10146355B2 (en) * 2015-03-26 2018-12-04 Lenovo (Singapore) Pte. Ltd. Human interface device input fusion
US20170243582A1 (en) * 2016-02-19 2017-08-24 Microsoft Technology Licensing, Llc Hearing assistance with automated speech transcription
US10031967B2 (en) * 2016-02-29 2018-07-24 Rovi Guides, Inc. Systems and methods for using a trained model for determining whether a query comprising multiple segments relates to an individual query or several queries
JP6918471B2 (ja) 2016-11-24 2021-08-11 Panasonic Intellectual Property Corporation of America Control method for a dialogue assistance system, dialogue assistance system, and program
US11099716B2 (en) * 2016-12-23 2021-08-24 Realwear, Inc. Context based content navigation for wearable display
US11107469B2 (en) * 2017-01-18 2021-08-31 Sony Corporation Information processing apparatus and information processing method
WO2018140502A1 (en) * 2017-01-27 2018-08-02 Magic Leap, Inc. Antireflection coatings for metasurfaces
US20180261223A1 (en) 2017-03-13 2018-09-13 Amazon Technologies, Inc. Dialog management and item fulfillment using voice assistant system
US20200327890A1 (en) * 2017-11-28 2020-10-15 Sony Corporation Information processing device and information processing method
US10365885B1 (en) * 2018-02-21 2019-07-30 Sling Media Pvt. Ltd. Systems and methods for composition of audio content from multi-object audio
DE102018208703A1 (de) * 2018-06-01 2019-12-05 Volkswagen Aktiengesellschaft Method for calculating an "augmented reality" overlay for displaying a navigation route on an AR display unit, device for carrying out the method, and motor vehicle and computer program
CN109272982A (zh) * 2018-09-07 2019-01-25 昆明盛策同辉数字科技有限责任公司 Real-time TTS voice broadcast method, apparatus, storage medium, and device combined with augmented reality
US11120791B2 (en) 2018-11-15 2021-09-14 International Business Machines Corporation Collaborative artificial intelligence (AI) voice response system control for authorizing a command associated with a calendar event
KR20200072026A (ko) * 2018-12-12 2020-06-22 현대자동차주식회사 Apparatus and method for speech recognition processing
KR101990284B1 (ko) * 2018-12-13 2019-06-18 주식회사 버넥트 Intelligent cognitive technology based augmented reality system using speech recognition
US10499179B1 (en) * 2019-01-01 2019-12-03 Philip Scott Lyren Displaying emojis for binaural sound
US11170774B2 (en) * 2019-05-21 2021-11-09 Qualcomm Incorporated Virtual assistant device
CN110413106B (zh) * 2019-06-18 2024-02-09 中国人民解放军军事科学院国防科技创新研究院 Augmented reality input method and system based on voice and gestures

Also Published As

Publication number Publication date
JP7824008B2 (ja) 2026-03-04
GB2616765A (en) 2023-09-20
WO2022111282A1 (en) 2022-06-02
JP2023551169A (ja) 2023-12-07
DE112021005482T5 (de) 2023-09-14
CN116348950A (zh) 2023-06-27
US20220165260A1 (en) 2022-05-26
US11978444B2 (en) 2024-05-07

Similar Documents

Publication Publication Date Title
EP4139626A4 (en) SOUND SUPPRESSOR
EP4115633A4 (en) IMMERSIVE AUDIO PLATFORM
EP4651516A3 (en) Audio response playback
GB2612624B (en) Methods and systems for synthesising speech from text
GB2591745B (en) Augmented reality system
GB2595860B (en) Augmented reality system
GB202102563D0 (en) Augmented reality enabled autonomous vehicle command center
EP3982356A4 (en) CARTRIDGE, STRING INSTRUMENT AND RECORDING CONTROL METHOD
GB202308093D0 (en) Audio emulation
EP4038537A4 (en) PLATFORM FOR GENERATING ENHANCED NATURAL LANGUAGE
EP4046392A4 (en) SOUND SYSTEM
GB2616765B (en) AR (augmented reality) based selective sound inclusion from the surrounding while executing any voice command
EP4203510A4 (en) FULL-BAND MEMS MICROPHONE WITH SOUND BEAMS AND SOUND TUNNELS
EP4081781A4 (en) CORE MODEL AUGMENTED REALITY
EP4323827A4 (en) AUGMENTED REALITY HEADGEAR
GB2607903B (en) Text-to-speech system
GB201916857D0 (en) Sound reproduction
CA226478S (en) Augmented reality headset
CA226479S (en) Augmented reality headset
CA219182S (en) Augmented reality headset
CA219183S (en) Augmented reality headset
CA219184S (en) Augmented reality headset
CA219185S (en) Augmented reality headset
GB2608186B (en) Augmented Reality System
EP4119480A4 (en) SOUND SYSTEM FOR ELEVATOR

Legal Events

Date Code Title Description
746 Register noted 'licences of right' (sect. 46/1977)

Effective date: 20250512