JP7824008B2 - AR (augmented reality) based selective sound inclusion from the surrounding while executing any voice command - Google Patents

AR (augmented reality) based selective sound inclusion from the surrounding while executing any voice command

Info

Publication number
JP7824008B2
Authority
JP
Japan
Prior art keywords
sounds
voice command
augmented reality
augmented
computer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
JP2023530249A
Other languages
English (en)
Japanese (ja)
Other versions
JP2023551169A (ja)
JP2023551169A5
Inventor
デクロップ、クレメント
アグラワル、トゥーシャー
アール フォックス、ジェレミー
ケイ ラクシット、サルバジット
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
International Business Machines Corp
Original Assignee
International Business Machines Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by International Business Machines Corp
Publication of JP2023551169A
Publication of JP2023551169A5
Application granted
Publication of JP7824008B2
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011 Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/16 Sound input; Sound output
    • G06F3/167 Audio in a user interface, e.g. using voice commands for navigating, audio feedback
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/28 Constructional details of speech recognition systems
    • G10L15/30 Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/06 Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids
    • G10L21/10 Transforming into visible information
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223 Execution procedure of a spoken command
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • General Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Quality & Reliability (AREA)
  • Signal Processing (AREA)
  • User Interface Of Digital Computer (AREA)
  • Selective Calling Equipment (AREA)
JP2023530249A 2020-11-24 2021-11-10 AR (augmented reality) based selective sound inclusion from the surrounding while executing any voice command Active JP7824008B2 (ja)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US17/102,687 US11978444B2 (en) 2020-11-24 2020-11-24 AR (augmented reality) based selective sound inclusion from the surrounding while executing any voice command
US17/102,687 2020-11-24
PCT/CN2021/129740 WO2022111282A1 (en) 2020-11-24 2021-11-10 AR (augmented reality) based selective sound inclusion from the surrounding while executing any voice command

Publications (3)

Publication Number Publication Date
JP2023551169A (ja) 2023-12-07
JP2023551169A5 2024-01-15
JP7824008B2 (ja) 2026-03-04

Family

ID=81657233

Family Applications (1)

Application Number Title Priority Date Filing Date
JP2023530249A Active JP7824008B2 (ja) 2020-11-24 2021-11-10 AR (augmented reality) based selective sound inclusion from the surrounding while executing any voice command

Country Status (6)

Country Link
US (1) US11978444B2
JP (1) JP7824008B2
CN (1) CN116348950A
DE (1) DE112021005482T5
GB (1) GB2616765B
WO (1) WO2022111282A1

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115079833B (zh) * 2022-08-24 2023-01-06 北京亮亮视野科技有限公司 Multi-layer interface and information visualization presentation method and system based on somatosensory control

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018135302A1 (ja) 2017-01-18 2018-07-26 ソニー株式会社 Information processing device, information processing method, and program
WO2019107145A1 (ja) 2017-11-28 2019-06-06 ソニー株式会社 Information processing device and information processing method

Family Cites Families (35)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6270040B1 (en) 2000-04-03 2001-08-07 Kam Industries Model train control system
ATE400871T1 (de) * 2004-01-29 2008-07-15 Harman Becker Automotive Sys Multimodal data input
US8788589B2 (en) 2007-10-12 2014-07-22 Watchitoo, Inc. System and method for coordinating simultaneous edits of shared digital data
US8769510B2 (en) 2010-04-08 2014-07-01 The Mathworks, Inc. Identification and translation of program code executable by a graphical processing unit (GPU)
US8296151B2 (en) * 2010-06-18 2012-10-23 Microsoft Corporation Compound gesture-speech commands
US8223088B1 (en) 2011-06-09 2012-07-17 Google Inc. Multimode input field for a head-mounted display
US8971854B2 (en) * 2012-06-19 2015-03-03 Honeywell International Inc. System and method of speaker recognition
US9966075B2 (en) * 2012-09-18 2018-05-08 Qualcomm Incorporated Leveraging head mounted displays to enable person-to-person interactions
US10824310B2 (en) * 2012-12-20 2020-11-03 Sri International Augmented reality virtual personal assistant for external representation
US9092600B2 (en) * 2012-11-05 2015-07-28 Microsoft Technology Licensing, Llc User authentication on augmented reality display device
US9747900B2 (en) 2013-05-24 2017-08-29 Google Technology Holdings LLC Method and apparatus for using image data to aid voice recognition
US9582246B2 (en) 2014-03-04 2017-02-28 Microsoft Technology Licensing, Llc Voice-command suggestions based on computer context
US9293141B2 (en) 2014-03-27 2016-03-22 Storz Endoskop Produktions Gmbh Multi-user voice control system for medical devices
US10152987B2 (en) * 2014-06-23 2018-12-11 Google Llc Remote invocation of mobile device actions
FR3026543B1 (fr) 2014-09-29 2017-12-22 Christophe Guedon Method for assisting a hearing-impaired person in following a conversation
CN111427534B (zh) * 2014-12-11 2023-07-25 微软技术许可有限责任公司 Virtual assistant system to enable actionable messaging
US10146355B2 (en) * 2015-03-26 2018-12-04 Lenovo (Singapore) Pte. Ltd. Human interface device input fusion
US20170243582A1 (en) * 2016-02-19 2017-08-24 Microsoft Technology Licensing, Llc Hearing assistance with automated speech transcription
US10031967B2 (en) * 2016-02-29 2018-07-24 Rovi Guides, Inc. Systems and methods for using a trained model for determining whether a query comprising multiple segments relates to an individual query or several queries
JP6918471B2 (ja) 2016-11-24 2021-08-11 Panasonic Intellectual Property Corporation of America Control method for dialogue assistance system, dialogue assistance system, and program
US11099716B2 (en) * 2016-12-23 2021-08-24 Realwear, Inc. Context based content navigation for wearable display
WO2018140502A1 (en) * 2017-01-27 2018-08-02 Magic Leap, Inc. Antireflection coatings for metasurfaces
US20180261223A1 (en) 2017-03-13 2018-09-13 Amazon Technologies, Inc. Dialog management and item fulfillment using voice assistant system
CN108363556A (zh) 2018-01-30 2018-08-03 百度在线网络技术(北京)有限公司 Method and system for interacting with an augmented reality environment based on voice
US10365885B1 (en) * 2018-02-21 2019-07-30 Sling Media Pvt. Ltd. Systems and methods for composition of audio content from multi-object audio
DE102018208703A1 (de) * 2018-06-01 2019-12-05 Volkswagen Aktiengesellschaft Method for calculating an "augmented reality" overlay for displaying a navigation route on an AR display unit, device for carrying out the method, and motor vehicle and computer program
US10650829B2 (en) 2018-06-06 2020-05-12 International Business Machines Corporation Operating a voice response system in a multiuser environment
CN109272982A (zh) * 2018-09-07 2019-01-25 昆明盛策同辉数字科技有限责任公司 Real-time TTS voice broadcast method combined with augmented reality, apparatus, storage medium, and device
US11120791B2 (en) 2018-11-15 2021-09-14 International Business Machines Corporation Collaborative artificial intelligence (AI) voice response system control for authorizing a command associated with a calendar event
KR20200072026A (ko) * 2018-12-12 2020-06-22 현대자동차주식회사 Speech recognition processing apparatus and method
KR101990284B1 (ko) * 2018-12-13 2019-06-18 주식회사 버넥트 Augmented reality system based on intelligent cognitive technology using speech recognition
US10499179B1 (en) * 2019-01-01 2019-12-03 Philip Scott Lyren Displaying emojis for binaural sound
JP2020141235A (ja) 2019-02-27 2020-09-03 パナソニックIpマネジメント株式会社 Device control system, device control method, and program
US11170774B2 (en) * 2019-05-21 2021-11-09 Qualcomm Incorporated Virtual assistant device
CN110413106B (zh) * 2019-06-18 2024-02-09 中国人民解放军军事科学院国防科技创新研究院 Augmented reality input method and system based on voice and gestures

Also Published As

Publication number Publication date
GB2616765A (en) 2023-09-20
WO2022111282A1 (en) 2022-06-02
JP2023551169A (ja) 2023-12-07
GB2616765B (en) 2025-03-05
DE112021005482T5 (de) 2023-09-14
CN116348950A (zh) 2023-06-27
US20220165260A1 (en) 2022-05-26
US11978444B2 (en) 2024-05-07

Similar Documents

Publication Publication Date Title
CN114365120B (zh) Method, system, apparatus, and medium for reduced-training intent recognition
US10692606B2 (en) Stress level reduction using haptic feedback
CN114365215B (zh) Dynamic contextual dialog session extension
US10930265B2 (en) Cognitive enhancement of communication with tactile stimulation
US11188809B2 (en) Optimizing personality traits of virtual agents
US10943070B2 (en) Interactively building a topic model employing semantic similarity in a spoken dialog system
US11394668B1 (en) System and method for executing operations in a performance engineering environment
KR20220091529A (ko) Anaphora processing
US11157533B2 (en) Designing conversational systems driven by a semantic network with a library of templated query operators
CN112580359B (zh) Computer-implemented method, training system, and computer program product
US11288293B2 (en) Methods and systems for ensuring quality of unstructured user input content
US20210142180A1 (en) Feedback discriminator
JP7710813B2 (ja) Artificial intelligence voice response system for users with speech impairments
US12307335B2 (en) User assistance through demonstration
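JP7824008B2 (ja) AR (augmented reality) based selective sound inclusion from the surrounding while executing any voice command
CN114365141A (zh) Training a semantic parser for a dialog system using generative adversarial networks
CN112489632A (zh) Implementing a correction model to reduce propagation of automatic speech recognition errors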
US11830490B2 (en) Multi-user voice assistant with disambiguation
US11481401B2 (en) Enhanced cognitive query construction
WO2019207421A1 (en) Navigation and cognitive dialog assistance
US11631488B2 (en) Dialogue generation via hashing functions
CA3106998A1 (en) System and method for executing operations in a performance engineering environment
Aku in Computer Science

Legal Events

Date Code Title Description
A521 Request for written amendment filed

Free format text: JAPANESE INTERMEDIATE CODE: A523

Effective date: 20231227

A621 Written request for application examination

Free format text: JAPANESE INTERMEDIATE CODE: A621

Effective date: 20240411

A977 Report on retrieval

Free format text: JAPANESE INTERMEDIATE CODE: A971007

Effective date: 20250220

A131 Notification of reasons for refusal

Free format text: JAPANESE INTERMEDIATE CODE: A131

Effective date: 20250306

A603 Late request for extension of time limit during examination

Free format text: JAPANESE INTERMEDIATE CODE: A603

Effective date: 20250610

RD12 Notification of acceptance of power of sub attorney

Free format text: JAPANESE INTERMEDIATE CODE: A7432

Effective date: 20250716

A521 Request for written amendment filed

Free format text: JAPANESE INTERMEDIATE CODE: A523

Effective date: 20250717

A521 Request for written amendment filed

Free format text: JAPANESE INTERMEDIATE CODE: A821

Effective date: 20250716

A131 Notification of reasons for refusal

Free format text: JAPANESE INTERMEDIATE CODE: A131

Effective date: 20251023

A521 Request for written amendment filed

Free format text: JAPANESE INTERMEDIATE CODE: A523

Effective date: 20260105

TRDD Decision of grant or rejection written
A01 Written decision to grant a patent or to grant a registration (utility model)

Free format text: JAPANESE INTERMEDIATE CODE: A01

Effective date: 20260204

RD14 Notification of resignation of power of sub attorney

Free format text: JAPANESE INTERMEDIATE CODE: A7434

Effective date: 20260206

A61 First payment of annual fees (during grant procedure)

Free format text: JAPANESE INTERMEDIATE CODE: A61

Effective date: 20260217

R150 Certificate of patent or registration of utility model

Ref document number: 7824008

Country of ref document: JP

Free format text: JAPANESE INTERMEDIATE CODE: R150