KR20220031610A - 멀티-모달 사용자 인터페이스 - Google Patents

멀티-모달 사용자 인터페이스 Download PDF

Info

Publication number
KR20220031610A
KR20220031610A KR1020227000411A KR20227000411A KR20220031610A KR 20220031610 A KR20220031610 A KR 20220031610A KR 1020227000411 A KR1020227000411 A KR 1020227000411A KR 20227000411 A KR20227000411 A KR 20227000411A KR 20220031610 A KR20220031610 A KR 20220031610A
Authority
KR
South Korea
Prior art keywords
input
user
data
mode
feedback message
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
KR1020227000411A
Other languages
English (en)
Korean (ko)
Inventor
라비 쵸우다리
래훈 김
선국 문
인이 궈
파테메 사키
에릭 비제르
Original Assignee
퀄컴 인코포레이티드
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 퀄컴 인코포레이티드 filed Critical 퀄컴 인코포레이티드
Publication of KR20220031610A publication Critical patent/KR20220031610A/ko
Pending legal-status Critical Current

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/017Gesture based interaction, e.g. based on a set of recognized hand gestures
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/03Arrangements for converting the position or the displacement of a member into a coded form
    • G06F3/033Pointing devices displaced or positioned by the user, e.g. mice, trackballs, pens or joysticks; Accessories therefor
    • G06F3/038Control and interface arrangements therefor, e.g. drivers or device-embedded control circuitry
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048Interaction techniques based on graphical user interfaces [GUI]
    • G06F3/0484Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048Interaction techniques based on graphical user interfaces [GUI]
    • G06F3/0487Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. tap gestures based on pressure sensed by a digitiser
    • G06F3/0488Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. tap gestures based on pressure sensed by a digitiser using a touch-screen or digitiser, e.g. input of commands through traced gestures
    • G06F3/04883Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. tap gestures based on pressure sensed by a digitiser using a touch-screen or digitiser, e.g. input of commands through traced gestures for inputting data by handwriting, e.g. gesture or text
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/16Sound input; Sound output
    • G06F3/167Audio in a user interface, e.g. using voice commands for navigating, audio feedback
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44Arrangements for executing specific programs
    • G06F9/451Execution arrangements for user interfaces
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/20Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise, of stress induced speech
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/22Procedures used during a speech recognition process, e.g. man-machine dialogue
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/26Speech to text systems
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2203/00Indexing scheme relating to G06F3/00 - G06F3/048
    • G06F2203/038Indexing scheme relating to G06F3/038
    • G06F2203/0381Multimodal input, i.e. interface arrangements enabling the user to issue commands by simultaneous use of input devices of different nature, e.g. voice plus gesture on digitizer
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2203/00Indexing scheme relating to G06F3/00 - G06F3/048
    • G06F2203/038Indexing scheme relating to G06F3/038
    • G06F2203/0382Plural input, i.e. interface arrangements in which a plurality of input device of the same type are in communication with a PC
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/22Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223Execution procedure of a spoken command
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/22Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/226Procedures used during a speech recognition process, e.g. man-machine dialogue using non-speech characteristics
    • G10L2015/228Procedures used during a speech recognition process, e.g. man-machine dialogue using non-speech characteristics of application context

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Acoustics & Sound (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • User Interface Of Digital Computer (AREA)
  • Input From Keyboards Or The Like (AREA)
KR1020227000411A 2019-07-12 2020-07-10 멀티-모달 사용자 인터페이스 Pending KR20220031610A (ko)

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
US201962873775P 2019-07-12 2019-07-12
US62/873,775 2019-07-12
US16/685,946 US11348581B2 (en) 2019-07-12 2019-11-15 Multi-modal user interface
US16/685,946 2019-11-15
PCT/US2020/041499 WO2021011331A1 (en) 2019-07-12 2020-07-10 Multi-modal user interface

Publications (1)

Publication Number Publication Date
KR20220031610A true KR20220031610A (ko) 2022-03-11

Family

ID=74101815

Family Applications (1)

Application Number Title Priority Date Filing Date
KR1020227000411A Pending KR20220031610A (ko) 2019-07-12 2020-07-10 멀티-모달 사용자 인터페이스

Country Status (9)

Country Link
US (1) US11348581B2 (https=)
EP (1) EP3997553A1 (https=)
JP (1) JP7522177B2 (https=)
KR (1) KR20220031610A (https=)
CN (1) CN114127665B (https=)
BR (1) BR112021026765A2 (https=)
PH (1) PH12021553219A1 (https=)
TW (1) TWI840587B (https=)
WO (1) WO2021011331A1 (https=)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2024029827A1 (ko) * 2022-08-01 2024-02-08 삼성전자 주식회사 제어 추천을 위한 전자 장치 및 컴퓨터 판독가능 저장 매체

Families Citing this family (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2021103191A (ja) * 2018-03-30 2021-07-15 ソニーグループ株式会社 情報処理装置および情報処理方法
US11615801B1 (en) * 2019-09-20 2023-03-28 Apple Inc. System and method of enhancing intelligibility of audio playback
US11521643B2 (en) * 2020-05-08 2022-12-06 Bose Corporation Wearable audio device with user own-voice recording
EP4187930A4 (en) * 2020-07-22 2024-04-03 Beijing Xiaomi Mobile Software Co., Ltd. INFORMATION TRANSMISSION METHOD AND APPARATUS, AND COMMUNICATION DEVICE
US11996095B2 (en) * 2020-08-12 2024-05-28 Kyndryl, Inc. Augmented reality enabled command management
US11878244B2 (en) * 2020-09-10 2024-01-23 Holland Bloorview Kids Rehabilitation Hospital Customizable user input recognition systems
US11830486B2 (en) * 2020-10-13 2023-11-28 Google Llc Detecting near matches to a hotword or phrase
US11461681B2 (en) * 2020-10-14 2022-10-04 Openstream Inc. System and method for multi-modality soft-agent for query population and information mining
US11809480B1 (en) * 2020-12-31 2023-11-07 Meta Platforms, Inc. Generating dynamic knowledge graph of media contents for assistant systems
US12321865B2 (en) * 2021-01-25 2025-06-03 Salesforce, Inc. Event prediction based on multimodal learning
US11651541B2 (en) * 2021-03-01 2023-05-16 Roblox Corporation Integrated input/output (I/O) for a three-dimensional (3D) environment
CN113282172A (zh) * 2021-05-18 2021-08-20 前海七剑科技(深圳)有限公司 一种手势识别的控制方法和装置
US11783073B2 (en) * 2021-06-21 2023-10-10 Microsoft Technology Licensing, Llc Configuration of default sensitivity labels for network file storage locations
CN116670624A (zh) * 2021-06-30 2023-08-29 华为技术有限公司 界面的控制方法、装置和系统
CN118251878A (zh) * 2021-09-08 2024-06-25 华为技术加拿大有限公司 使用多模态合成进行通信的方法和设备
US11966663B1 (en) * 2021-09-29 2024-04-23 Amazon Technologies, Inc. Speech processing and multi-modal widgets
US20230104856A1 (en) * 2021-10-05 2023-04-06 Rfmicron, Inc. Data logging device
US12333794B2 (en) * 2021-11-12 2025-06-17 Sony Group Corporation Emotion recognition in multimedia videos using multi-modal fusion-based deep neural network
US11971710B2 (en) * 2021-11-12 2024-04-30 Pani Energy Inc Digital model based plant operation and optimization
US20240036527A1 (en) * 2022-08-01 2024-02-01 Samsung Electronics Co., Ltd. Electronic device and computer readable storage medium for control recommendation
KR20240079507A (ko) * 2022-11-29 2024-06-05 한국전자통신연구원 크로스모달 정보를 이용한 언어모델 생성 방법 및 장치
EP4524685A1 (en) * 2023-09-12 2025-03-19 Rohde & Schwarz GmbH & Co. KG Measurement application device, and method
US20250178624A1 (en) * 2023-12-01 2025-06-05 Qualcomm Incorporated Speech-based vehicular control
US20260016309A1 (en) * 2024-07-11 2026-01-15 Apple Inc. Providing movement dynamics estimations

Family Cites Families (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8386255B2 (en) * 2009-03-17 2013-02-26 Avaya Inc. Providing descriptions of visually presented information to video teleconference participants who are not video-enabled
US9123341B2 (en) 2009-03-18 2015-09-01 Robert Bosch Gmbh System and method for multi-modal input synchronization and disambiguation
KR101092820B1 (ko) 2009-09-22 2011-12-12 현대자동차주식회사 립리딩과 음성 인식 통합 멀티모달 인터페이스 시스템
US8473289B2 (en) * 2010-08-06 2013-06-25 Google Inc. Disambiguating input based on context
US8898583B2 (en) * 2011-07-28 2014-11-25 Kikin Inc. Systems and methods for providing information regarding semantic entities included in a page of content
US20130085753A1 (en) * 2011-09-30 2013-04-04 Google Inc. Hybrid Client/Server Speech Recognition In A Mobile Device
US9152376B2 (en) * 2011-12-01 2015-10-06 At&T Intellectual Property I, L.P. System and method for continuous multimodal speech and gesture interaction
US9465833B2 (en) * 2012-07-31 2016-10-11 Veveo, Inc. Disambiguating user intent in conversational interaction system for large corpus information retrieval
CN103729386B (zh) * 2012-10-16 2017-08-04 阿里巴巴集团控股有限公司 信息查询系统与方法
WO2014070872A2 (en) 2012-10-30 2014-05-08 Robert Bosch Gmbh System and method for multimodal interaction with reduced distraction in operating vehicles
US9190058B2 (en) * 2013-01-25 2015-11-17 Microsoft Technology Licensing, Llc Using visual cues to disambiguate speech inputs
EP2995040B1 (en) 2013-05-08 2022-11-16 JPMorgan Chase Bank, N.A. Systems and methods for high fidelity multi-modal out-of-band biometric authentication
US10402060B2 (en) 2013-06-28 2019-09-03 Orange System and method for gesture disambiguation
US10741182B2 (en) * 2014-02-18 2020-08-11 Lenovo (Singapore) Pte. Ltd. Voice input correction using non-audio based input
US8825585B1 (en) 2014-03-11 2014-09-02 Fmr Llc Interpretation of natural communication
US20160034249A1 (en) * 2014-07-31 2016-02-04 Microsoft Technology Licensing Llc Speechless interaction with a speech recognition device
US10446141B2 (en) * 2014-08-28 2019-10-15 Apple Inc. Automatic speech recognition based on user feedback
CN105843605B (zh) * 2016-03-17 2019-03-08 中国银行股份有限公司 一种数据映射方法及装置
JP2018036902A (ja) * 2016-08-31 2018-03-08 島根県 機器操作システム、機器操作方法および機器操作プログラム
DK201770411A1 (en) * 2017-05-15 2018-12-20 Apple Inc. Multi-modal interfaces
US20180357040A1 (en) * 2017-06-09 2018-12-13 Mitsubishi Electric Automotive America, Inc. In-vehicle infotainment with multi-modal interface
US11430437B2 (en) * 2017-08-01 2022-08-30 Sony Corporation Information processor and information processing method

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2024029827A1 (ko) * 2022-08-01 2024-02-08 삼성전자 주식회사 제어 추천을 위한 전자 장치 및 컴퓨터 판독가능 저장 매체

Also Published As

Publication number Publication date
WO2021011331A1 (en) 2021-01-21
US20210012770A1 (en) 2021-01-14
US11348581B2 (en) 2022-05-31
JP7522177B2 (ja) 2024-07-24
EP3997553A1 (en) 2022-05-18
CN114127665B (zh) 2024-10-08
PH12021553219A1 (en) 2022-11-21
TWI840587B (zh) 2024-05-01
TW202109245A (zh) 2021-03-01
BR112021026765A2 (pt) 2022-02-15
JP2022539794A (ja) 2022-09-13
CN114127665A (zh) 2022-03-01

Similar Documents

Publication Publication Date Title
US11348581B2 (en) Multi-modal user interface
CN111868824B (zh) 用于情境感知控制的设备和方法
US11656837B2 (en) Electronic device for controlling sound and operation method therefor
CN111699528B (zh) 电子装置及执行电子装置的功能的方法
KR102901070B1 (ko) 로봇 및 그의 기동어 인식 방법
CN102483918B (zh) 声音识别装置
US10353495B2 (en) Personalized operation of a mobile device using sensor signatures
KR102740847B1 (ko) 사용자 입력 처리 방법 및 이를 지원하는 전자 장치
CN111163906B (zh) 能够移动的电子设备及其操作方法
JP6433903B2 (ja) 音声認識方法及び音声認識装置
CN111788043B (zh) 信息处理装置、信息处理方法和程序
US11895474B2 (en) Activity detection on devices with multi-modal sensing
CN104464737B (zh) 声音验证系统和声音验证方法
JP2015219440A (ja) 操作補助装置および操作補助方法
JP7435641B2 (ja) 制御装置、ロボット、制御方法およびプログラム
KR20180115880A (ko) 네트워크에 연결된 음향기기와의 멀티모달 인터렉션 방법 및 시스템
KR20230084154A (ko) 동적 분류기를 사용한 사용자 음성 활동 검출
JP7605034B2 (ja) 制御装置、制御方法および制御プログラム
KR102168812B1 (ko) 사운드를 제어하는 전자 장치 및 그 동작 방법
US12537004B2 (en) Voice recognition device having barge-in function and method thereof
KR102933433B1 (ko) 근접 음성 분류 방법, 장치 및 기록 매체
US20240098411A1 (en) Dynamic adjustment of an output of a speaker

Legal Events

Date Code Title Description
PA0105 International application

St.27 status event code: A-0-1-A10-A15-nap-PA0105

PG1501 Laying open of application

St.27 status event code: A-1-1-Q10-Q12-nap-PG1501

D18-X000 Deferred examination requested

St.27 status event code: A-1-2-D10-D18-exm-X000

D19-X000 Deferred examination accepted

St.27 status event code: A-1-2-D10-D19-exm-X000

PA0201 Request for examination

St.27 status event code: A-1-2-D10-D11-exm-PA0201

D20-X000 Deferred examination resumed

St.27 status event code: A-1-2-D10-D20-exm-X000

D21 Rejection of application intended

Free format text: ST27 STATUS EVENT CODE: A-1-2-D10-D21-EXM-PE0902 (AS PROVIDED BY THE NATIONAL OFFICE)

PE0902 Notice of grounds for rejection

St.27 status event code: A-1-2-D10-D21-exm-PE0902

P11 Amendment of application requested

Free format text: ST27 STATUS EVENT CODE: A-2-2-P10-P11-NAP-X000 (AS PROVIDED BY THE NATIONAL OFFICE)

P11-X000 Amendment of application requested

St.27 status event code: A-2-2-P10-P11-nap-X000