CN114127665B - Multi-modal user interface - Google Patents
Multi-modal user interface
- Publication number
- CN114127665B (application CN202080049275.8A)
- Authority
- CN
- China
- Prior art keywords
- input
- user
- data
- mode
- command
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/017—Gesture based interaction, e.g. based on a set of recognized hand gestures
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/03—Arrangements for converting the position or the displacement of a member into a coded form
- G06F3/033—Pointing devices displaced or positioned by the user, e.g. mice, trackballs, pens or joysticks; Accessories therefor
- G06F3/038—Control and interface arrangements therefor, e.g. drivers or device-embedded control circuitry
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/048—Interaction techniques based on graphical user interfaces [GUI]
- G06F3/0484—Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/048—Interaction techniques based on graphical user interfaces [GUI]
- G06F3/0487—Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. tap gestures based on pressure sensed by a digitiser
- G06F3/0488—Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. tap gestures based on pressure sensed by a digitiser using a touch-screen or digitiser, e.g. input of commands through traced gestures
- G06F3/04883—Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. tap gestures based on pressure sensed by a digitiser using a touch-screen or digitiser, e.g. input of commands through traced gestures for inputting data by handwriting, e.g. gesture or text
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/16—Sound input; Sound output
- G06F3/167—Audio in a user interface, e.g. using voice commands for navigating, audio feedback
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/451—Execution arrangements for user interfaces
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/20—Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise, of stress induced speech
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2203/00—Indexing scheme relating to G06F3/00 - G06F3/048
- G06F2203/038—Indexing scheme relating to G06F3/038
- G06F2203/0381—Multimodal input, i.e. interface arrangements enabling the user to issue commands by simultaneous use of input devices of different nature, e.g. voice plus gesture on digitizer
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2203/00—Indexing scheme relating to G06F3/00 - G06F3/048
- G06F2203/038—Indexing scheme relating to G06F3/038
- G06F2203/0382—Plural input, i.e. interface arrangements in which a plurality of input device of the same type are in communication with a PC
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L2015/223—Execution procedure of a spoken command
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L2015/226—Procedures used during a speech recognition process, e.g. man-machine dialogue using non-speech characteristics
- G10L2015/228—Procedures used during a speech recognition process, e.g. man-machine dialogue using non-speech characteristics of application context
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Health & Medical Sciences (AREA)
- Multimedia (AREA)
- Software Systems (AREA)
- Computational Linguistics (AREA)
- Acoustics & Sound (AREA)
- General Health & Medical Sciences (AREA)
- User Interface Of Digital Computer (AREA)
- Input From Keyboards Or The Like (AREA)
Applications Claiming Priority (5)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US201962873775P | 2019-07-12 | 2019-07-12 | |
| US62/873,775 | 2019-07-12 | | |
| US16/685,946 | 2019-11-15 | | |
| US16/685,946 US11348581B2 (en) | 2019-07-12 | 2019-11-15 | Multi-modal user interface |
| PCT/US2020/041499 WO2021011331A1 (en) | 2019-07-12 | 2020-07-10 | Multi-modal user interface |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN114127665A (zh) | 2022-03-01 |
| CN114127665B (zh) | 2024-10-08 |
Family
ID=74101815
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202080049275.8A (Active, CN114127665B, zh) | Multi-modal user interface | 2019-07-12 | 2020-07-10 |
Country Status (9)
| Country | Link |
|---|---|
| US (1) | US11348581B2 |
| EP (1) | EP3997553A1 |
| JP (1) | JP7522177B2 |
| KR (1) | KR20220031610A |
| CN (1) | CN114127665B |
| BR (1) | BR112021026765A2 |
| PH (1) | PH12021553219A1 |
| TW (1) | TWI840587B |
| WO (1) | WO2021011331A1 |
Families Citing this family (26)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP2021103191A (ja) * | 2018-03-30 | 2021-07-15 | ソニーグループ株式会社 | Information processing device and information processing method |
| US11615801B1 (en) * | 2019-09-20 | 2023-03-28 | Apple Inc. | System and method of enhancing intelligibility of audio playback |
| US11521643B2 (en) * | 2020-05-08 | 2022-12-06 | Bose Corporation | Wearable audio device with user own-voice recording |
| WO2022016406A1 (zh) * | 2020-07-22 | 2022-01-27 | 北京小米移动软件有限公司 | Information transmission method, apparatus, and communication device |
| US11996095B2 (en) | 2020-08-12 | 2024-05-28 | Kyndryl, Inc. | Augmented reality enabled command management |
| US11878244B2 (en) * | 2020-09-10 | 2024-01-23 | Holland Bloorview Kids Rehabilitation Hospital | Customizable user input recognition systems |
| US11830486B2 (en) * | 2020-10-13 | 2023-11-28 | Google Llc | Detecting near matches to a hotword or phrase |
| US11461681B2 (en) * | 2020-10-14 | 2022-10-04 | Openstream Inc. | System and method for multi-modality soft-agent for query population and information mining |
| US11809480B1 (en) * | 2020-12-31 | 2023-11-07 | Meta Platforms, Inc. | Generating dynamic knowledge graph of media contents for assistant systems |
| US12321865B2 (en) * | 2021-01-25 | 2025-06-03 | Salesforce, Inc. | Event prediction based on multimodal learning |
| US11651541B2 (en) * | 2021-03-01 | 2023-05-16 | Roblox Corporation | Integrated input/output (I/O) for a three-dimensional (3D) environment |
| CN113282172A (zh) * | 2021-05-18 | 2021-08-20 | 前海七剑科技(深圳)有限公司 | Control method and device for gesture recognition |
| US11783073B2 (en) * | 2021-06-21 | 2023-10-10 | Microsoft Technology Licensing, Llc | Configuration of default sensitivity labels for network file storage locations |
| WO2023272629A1 (zh) * | 2021-06-30 | 2023-01-05 | 华为技术有限公司 | Interface control method, device, and system |
| US12614095B2 (en) * | 2021-07-12 | 2026-04-28 | Cypress Semiconductor Corporation | System and method for activity classification |
| WO2023035073A1 (en) * | 2021-09-08 | 2023-03-16 | Huawei Technologies Canada Co., Ltd. | Methods and devices for communication with multimodal compositions |
| US11966663B1 (en) * | 2021-09-29 | 2024-04-23 | Amazon Technologies, Inc. | Speech processing and multi-modal widgets |
| US20230104856A1 (en) * | 2021-10-05 | 2023-04-06 | Rfmicron, Inc. | Data logging device |
| US11971710B2 (en) * | 2021-11-12 | 2024-04-30 | Pani Energy Inc | Digital model based plant operation and optimization |
| US12333794B2 (en) * | 2021-11-12 | 2025-06-17 | Sony Group Corporation | Emotion recognition in multimedia videos using multi-modal fusion-based deep neural network |
| WO2024029827A1 (ko) * | 2022-08-01 | 2024-02-08 | 삼성전자 주식회사 | Electronic device and computer readable storage medium for control recommendation |
| US20240036527A1 (en) * | 2022-08-01 | 2024-02-01 | Samsung Electronics Co., Ltd. | Electronic device and computer readable storage medium for control recommendation |
| KR20240079507A (ko) * | 2022-11-29 | 2024-06-05 | 한국전자통신연구원 | Method and apparatus for generating a language model using cross-modal information |
| EP4524685A1 (en) * | 2023-09-12 | 2025-03-19 | Rohde & Schwarz GmbH & Co. KG | Measurement application device, and method |
| US20250178624A1 (en) * | 2023-12-01 | 2025-06-05 | Qualcomm Incorporated | Speech-based vehicular control |
| US20260016309A1 (en) * | 2024-07-11 | 2026-01-15 | Apple Inc. | Providing movement dynamics estimations |
Family Cites Families (22)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US8386255B2 (en) * | 2009-03-17 | 2013-02-26 | Avaya Inc. | Providing descriptions of visually presented information to video teleconference participants who are not video-enabled |
| US9123341B2 (en) | 2009-03-18 | 2015-09-01 | Robert Bosch Gmbh | System and method for multi-modal input synchronization and disambiguation |
| KR101092820B1 (ko) | 2009-09-22 | 2011-12-12 | 현대자동차주식회사 | Multimodal interface system integrating lip reading and speech recognition |
| US8473289B2 (en) * | 2010-08-06 | 2013-06-25 | Google Inc. | Disambiguating input based on context |
| US20130031076A1 (en) * | 2011-07-28 | 2013-01-31 | Kikin, Inc. | Systems and methods for contextual searching of semantic entities |
| US20130085753A1 (en) * | 2011-09-30 | 2013-04-04 | Google Inc. | Hybrid Client/Server Speech Recognition In A Mobile Device |
| US9152376B2 (en) * | 2011-12-01 | 2015-10-06 | At&T Intellectual Property I, L.P. | System and method for continuous multimodal speech and gesture interaction |
| US9465833B2 (en) * | 2012-07-31 | 2016-10-11 | Veveo, Inc. | Disambiguating user intent in conversational interaction system for large corpus information retrieval |
| CN103729386B (zh) * | 2012-10-16 | 2017-08-04 | 阿里巴巴集团控股有限公司 | Information query system and method |
| WO2014070872A2 (en) | 2012-10-30 | 2014-05-08 | Robert Bosch Gmbh | System and method for multimodal interaction with reduced distraction in operating vehicles |
| US9190058B2 (en) * | 2013-01-25 | 2015-11-17 | Microsoft Technology Licensing, Llc | Using visual cues to disambiguate speech inputs |
| WO2014182787A2 (en) | 2013-05-08 | 2014-11-13 | Jpmorgan Chase Bank, N.A. | Systems and methods for high fidelity multi-modal out-of-band biometric authentication |
| US10402060B2 (en) | 2013-06-28 | 2019-09-03 | Orange | System and method for gesture disambiguation |
| US10741182B2 (en) * | 2014-02-18 | 2020-08-11 | Lenovo (Singapore) Pte. Ltd. | Voice input correction using non-audio based input |
| US8825585B1 (en) | 2014-03-11 | 2014-09-02 | Fmr Llc | Interpretation of natural communication |
| US20160034249A1 (en) * | 2014-07-31 | 2016-02-04 | Microsoft Technology Licensing Llc | Speechless interaction with a speech recognition device |
| US10446141B2 (en) * | 2014-08-28 | 2019-10-15 | Apple Inc. | Automatic speech recognition based on user feedback |
| CN105843605B (zh) * | 2016-03-17 | 2019-03-08 | 中国银行股份有限公司 | Data mapping method and device |
| JP2018036902A (ja) * | 2016-08-31 | 2018-03-08 | 島根県 | Device operation system, device operation method, and device operation program |
| DK201770411A1 (en) * | 2017-05-15 | 2018-12-20 | Apple Inc. | Multi-modal interfaces |
| US20180357040A1 (en) * | 2017-06-09 | 2018-12-13 | Mitsubishi Electric Automotive America, Inc. | In-vehicle infotainment with multi-modal interface |
| US11430437B2 (en) * | 2017-08-01 | 2022-08-30 | Sony Corporation | Information processor and information processing method |
2019
- 2019-11-15: US, application US16/685,946, publication US11348581B2 (en), Active
2020
- 2020-07-10: CN, application CN202080049275.8A, publication CN114127665B (zh), Active
- 2020-07-10: JP, application JP2022500128A, publication JP7522177B2 (ja), Active
- 2020-07-10: PH, application PH1/2021/553219A, publication PH12021553219A1 (en), status unknown
- 2020-07-10: KR, application KR1020227000411A, publication KR20220031610A (ko), Pending
- 2020-07-10: TW, application TW109123487A, publication TWI840587B (zh), Active
- 2020-07-10: WO, application PCT/US2020/041499, publication WO2021011331A1 (en), Not active (Ceased)
- 2020-07-10: EP, application EP20747296.0A, publication EP3997553A1 (en), Pending
- 2020-07-10: BR, application BR112021026765A, publication BR112021026765A2 (pt), status unknown
Also Published As
| Publication number | Publication date |
|---|---|
| PH12021553219A1 (en) | 2022-11-21 |
| BR112021026765A2 (pt) | 2022-02-15 |
| EP3997553A1 (en) | 2022-05-18 |
| WO2021011331A1 (en) | 2021-01-21 |
| JP7522177B2 (ja) | 2024-07-24 |
| JP2022539794A (ja) | 2022-09-13 |
| KR20220031610A (ko) | 2022-03-11 |
| CN114127665A (zh) | 2022-03-01 |
| TWI840587B (zh) | 2024-05-01 |
| US20210012770A1 (en) | 2021-01-14 |
| US11348581B2 (en) | 2022-05-31 |
| TW202109245A (zh) | 2021-03-01 |
Similar Documents
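| Publication | Publication Date | Title |
|---|---|---|
| CN114127665B (zh) | | Multi-modal user interface |
| CN111699528B (zh) | | Electronic device and method of performing functions of the electronic device |
| JP7288143B2 (ja) | | Customizable keyword spotting system with keyword adaptation |
| US11656837B2 (en) | | Electronic device for controlling sound and operation method therefor |
| CN111868824B (zh) | | Device and method for context-aware control |
| CN112020864B (zh) | | Intelligent beam steering in microphone arrays |
| CN110832578B (zh) | | Customizable wake-up voice commands |
| CN112513983B (zh) | | Wearable system speech processing |
| US10353495B2 (en) | | Personalized operation of a mobile device using sensor signatures |
| JP5916888B2 (ja) | | Direct grammar access |
| JP5075664B2 (ja) | | Spoken dialogue device and support method |
| KR102740847B1 (ko) | | Method for processing user input and electronic device supporting the same |
| CN105580071B (zh) | | Method and apparatus for training a voice recognition model database |
| TWI871343B (zh) | | Activating speech recognition |
| KR20190053001A (ko) | | Movable electronic device and operating method thereof |
| CN104464737B (zh) | | Voice verification system and voice verification method |
| CN111421557B (zh) | | Electronic device and control method thereof |
| JP2015219440A (ja) | | Operation assistance device and operation assistance method |
| JP7435641B2 (ja) | | Control device, robot, control method, and program |
| CN112639965A (zh) | | Speech recognition method and device in an environment including multiple devices |
| KR20230084154A (ko) | | User voice activity detection using a dynamic classifier |
| US20210158819A1 (en) | | Electronic device and control method thereof |
| KR102168812B1 (ko) | | Electronic device for controlling sound and operation method therefor |
| US12537004B2 (en) | | Voice recognition device having barge-in function and method thereof |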
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | GR01 | Patent grant | |