WO2023034497A3 - Gaze based dictation - Google Patents

Gaze based dictation

Publication number
WO2023034497A3
WO2023034497A3 PCT/US2022/042331 US2022042331W WO2023034497A3 WO 2023034497 A3 WO2023034497 A3 WO 2023034497A3 US 2022042331 W US2022042331 W US 2022042331W WO 2023034497 A3 WO2023034497 A3 WO 2023034497A3
Authority
WO
WIPO (PCT)
Prior art keywords
enter
gaze
dictation
utterance
user
Prior art date
Application number
PCT/US2022/042331
Other languages
French (fr)
Other versions
WO2023034497A2 (en)
Inventor
Timothy S. Paek
Karan M. Daryanani
Kenneth S. Friedman
Yue Gu
Susumu Harada
Viet Huy Le
Dmytro Rudchenko
Garrett L. Weinberg
Original Assignee
Apple Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2021-09-03
Filing date
2022-09-01
Publication date
2023-04-13
Application filed by Apple Inc.
Priority to CN202280059719.5A (CN117957511A)
Priority to EP22786586.2A (EP4377773A2)
Publication of WO2023034497A2
Publication of WO2023034497A3
Priority to US18/442,910 (US20240185856A1)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011 Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06F3/013 Eye tracking input arrangements
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/02 Input arrangements using manually operated switches, e.g. using keyboards or dials
    • G06F3/023 Arrangements for converting discrete items of information into a coded form, e.g. arrangements for interpreting keyboard generated codes as alphanumeric codes, operand codes or instruction codes
    • G06F3/0233 Character input methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/02 Input arrangements using manually operated switches, e.g. using keyboards or dials
    • G06F3/023 Arrangements for converting discrete items of information into a coded form, e.g. arrangements for interpreting keyboard generated codes as alphanumeric codes, operand codes or instruction codes
    • G06F3/0233 Character input methods
    • G06F3/0236 Character input methods using selection techniques to select from displayed items
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/02 Input arrangements using manually operated switches, e.g. using keyboards or dials
    • G06F3/023 Arrangements for converting discrete items of information into a coded form, e.g. arrangements for interpreting keyboard generated codes as alphanumeric codes, operand codes or instruction codes
    • G06F3/0233 Character input methods
    • G06F3/0237 Character input methods using prediction or retrieval techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/16 Sound input; Sound output
    • G06F3/167 Audio in a user interface, e.g. using voice commands for navigating, audio feedback
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2203/00 Indexing scheme relating to G06F3/00 - G06F3/048
    • G06F2203/038 Indexing scheme relating to G06F3/038
    • G06F2203/0381 Multimodal input, i.e. interface arrangements enabling the user to issue commands by simultaneous use of input devices of different nature, e.g. voice plus gesture on digitizer
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2203/00 Indexing scheme relating to G06F3/00 - G06F3/048
    • G06F2203/048 Indexing scheme relating to G06F3/048
    • G06F2203/04803 Split screen, i.e. subdividing the display area or the window area into separate subareas
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048 Interaction techniques based on graphical user interfaces [GUI]
    • G06F3/0487 Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. tap gestures based on pressure sensed by a digitiser
    • G06F3/0488 Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. tap gestures based on pressure sensed by a digitiser using a touch-screen or digitiser, e.g. input of commands through traced gestures
    • G06F3/04883 Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. tap gestures based on pressure sensed by a digitiser using a touch-screen or digitiser, e.g. input of commands through traced gestures for inputting data by handwriting, e.g. gesture or text
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048 Interaction techniques based on graphical user interfaces [GUI]
    • G06F3/0487 Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. tap gestures based on pressure sensed by a digitiser
    • G06F3/0488 Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. tap gestures based on pressure sensed by a digitiser using a touch-screen or digitiser, e.g. input of commands through traced gestures
    • G06F3/04886 Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. tap gestures based on pressure sensed by a digitiser using a touch-screen or digitiser, e.g. input of commands through traced gestures by partitioning the display area of the touch-screen or the surface of the digitising tablet into independently controllable areas, e.g. virtual keyboards or menus
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/226 Procedures used during a speech recognition process, e.g. man-machine dialogue using non-speech characteristics
    • G10L2015/227 Procedures used during a speech recognition process, e.g. man-machine dialogue using non-speech characteristics of the speaker; Human-factor methodology

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • User Interface Of Digital Computer (AREA)
  • Digital Computer Display Output (AREA)

Abstract

Systems and processes for operating an intelligent dictation system based on gaze are provided. An example method includes, at an electronic device having one or more processors and memory: detecting a gaze of a user; determining, based on the detected gaze of the user, whether to enter a dictation mode; and, in accordance with a determination to enter the dictation mode: receiving an utterance; determining, based on the detected gaze of the user and the utterance, whether to enter an editing mode; and, in accordance with a determination not to enter the editing mode, displaying a textual representation of the utterance on a screen of the electronic device.
PCT/US2022/042331 2021-09-03 2022-09-01 Gaze based dictation WO2023034497A2 (en)
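
The control flow in the abstract can be read as a small state machine. The Python sketch below is only an illustration of that flow, not the claimed implementation. The abstract does not say how the gaze or the utterance is evaluated, so the decision rules here are assumptions: dictation mode is entered when the gaze falls inside a text field rectangle, and editing mode is entered when the gaze is on the text field and the utterance begins with a command word. The names `GazeDictation`, `inside`, and `EDIT_WORDS` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Point = Tuple[float, float]               # gaze position in screen coordinates
Rect = Tuple[float, float, float, float]  # left, top, right, bottom

# Hypothetical command words; the abstract does not list any.
EDIT_WORDS = ("delete", "replace", "undo")


def inside(gaze: Point, region: Rect) -> bool:
    """Return True when the gaze point lies within the rectangle."""
    x, y = gaze
    left, top, right, bottom = region
    return left <= x <= right and top <= y <= bottom


@dataclass
class GazeDictation:
    text_field: Rect                      # assumed on-screen dictation target
    mode: str = "idle"                    # "idle", "dictation", or "editing"
    displayed: List[str] = field(default_factory=list)

    def on_gaze(self, gaze: Point) -> str:
        # Decide, from the detected gaze alone, whether to enter dictation mode.
        # Assumption: looking at the text field is the trigger.
        if self.mode == "idle" and inside(gaze, self.text_field):
            self.mode = "dictation"
        return self.mode

    def on_utterance(self, gaze: Point, utterance: str) -> str:
        # Utterances are only handled once dictation mode has been entered.
        if self.mode != "dictation":
            return self.mode
        # Decide, from gaze and utterance together, whether to enter editing mode.
        # Assumption: gaze on the text plus a leading command word means "edit".
        is_edit = inside(gaze, self.text_field) and utterance.lower().startswith(EDIT_WORDS)
        if is_edit:
            self.mode = "editing"
        else:
            # Not an edit: display the textual representation of the utterance.
            self.displayed.append(utterance)
        return self.mode
```

In this sketch the transcribed text is stood in for by the utterance string itself; a real system would run speech recognition first and would likely use richer gaze features such as dwell time, which the abstract leaves unspecified.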

Priority Applications (3)

Application Number Priority Date Filing Date Title
CN202280059719.5A CN117957511A (en) 2021-09-03 2022-09-01 Gaze-based dictation
EP22786586.2A EP4377773A2 (en) 2021-09-03 2022-09-01 Gaze based dictation
US18/442,910 US20240185856A1 (en) 2021-09-03 2024-02-15 Gaze based dictation

Applications Claiming Priority (6)

Application Number Priority Date Filing Date Title
US202163240696P 2021-09-03 2021-09-03
US63/240,696 2021-09-03
US202263335649P 2022-04-27 2022-04-27
US63/335,649 2022-04-27
US202217900666A 2022-08-31 2022-08-31
US17/900,666 2022-08-31

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US18/442,910 Continuation US20240185856A1 (en) 2021-09-03 2024-02-15 Gaze based dictation

Publications (2)

Publication Number Publication Date
WO2023034497A2 (en) 2023-03-09
WO2023034497A3 (en) 2023-04-13

Family

ID=83688647

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2022/042331 WO2023034497A2 (en) 2021-09-03 2022-09-01 Gaze based dictation

Country Status (2)

Country Link
EP (1) EP4377773A2 (en)
WO (1) WO2023034497A2 (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150293602A1 (en) * 2010-03-12 2015-10-15 Nuance Communications, Inc. Multimodal text input system, such as for use with touch screens on mobile phones
US20170206002A1 (en) * 2010-02-12 2017-07-20 Microsoft Technology Licensing, Llc User-centric soft keyboard predictive technologies

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1717678B1 (en) 1998-01-26 2017-11-22 Apple Inc. Method for integrating manual input
US7688306B2 (en) 2000-10-02 2010-03-30 Apple Inc. Methods and apparatuses for operating a portable device based on an accelerometer
US7218226B2 (en) 2004-03-01 2007-05-15 Apple Inc. Acceleration-based theft detection system for portable electronic devices
US6677932B1 (en) 2001-01-28 2004-01-13 Finger Works, Inc. System and method for recognizing touch typing under limited tactile feedback conditions
US6570557B1 (en) 2001-02-10 2003-05-27 Finger Works, Inc. Multi-touch system and method for emulating modifier keys via fingertip chords
US7657849B2 (en) 2005-12-23 2010-02-02 Apple Inc. Unlocking a device by performing gestures on an unlock image
US10903964B2 (en) 2017-03-24 2021-01-26 Apple Inc. Techniques to enable physical downlink control channel communications
JP6821099B2 (en) 2018-07-31 2021-01-27 三菱電機株式会社 Optical transmission equipment and optical transmission system
CN110932673A (en) 2018-09-19 2020-03-27 恩智浦美国有限公司 Chopper-stabilized amplifier containing shunt notch filter
CN111448591B (en) 2018-11-16 2021-05-18 北京嘀嘀无限科技发展有限公司 System and method for locating a vehicle in poor lighting conditions

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
PRODUCTS FOR PALS - ALS TECH: "Skyle for iPad Pro eye gaze control real world review", 13 August 2020 (2020-08-13), XP093006810, Retrieved from the Internet <URL:https://www.youtube.com/watch?v=_3TxZtDJpFo> [retrieved on 20221210] *
RICK CASTELLINI: "How to enable and use dictation with an iPhone or iPad", 7 September 2017 (2017-09-07), XP093006809, Retrieved from the Internet <URL:https://www.youtube.com/watch?v=8wl33yN6rTU> [retrieved on 20221210] *

Also Published As

Publication number Publication date
EP4377773A2 (en) 2024-06-05
WO2023034497A2 (en) 2023-03-09

Similar Documents

Publication Publication Date Title
US11302341B2 (en) Microphone array based pickup method and system
US9502026B2 (en) Initiating actions based on partial hotwords
EP1647972A3 (en) Intelligibility enhancement of audio signals containing speech
CN107886944B (en) Voice recognition method, device, equipment and storage medium
US20150074524A1 (en) Management of virtual assistant action items
US20160019886A1 (en) Method and apparatus for recognizing whisper
US20160055847A1 (en) System and method for speech validation
EP3432303A3 (en) Automatically monitoring for voice input based on context
CN105139858B (en) A kind of information processing method and electronic equipment
WO2005094397A3 (en) Tone event detector and method therefor
CN105139849A (en) Speech recognition method and apparatus
EP2728576A1 (en) Method and apparatus for voice recognition
CN107516526B (en) Sound source tracking and positioning method, device, equipment and computer readable storage medium
CN103871401A (en) Method for voice recognition and electronic equipment
CN107680613A (en) A kind of voice-operated device speech recognition capabilities method of testing and equipment
HK1104616A1 (en) Slide misload detection system
US11610578B2 (en) Automatic hotword threshold tuning
AU2003274432A1 (en) Method and system for speech recognition
CA3164079A1 (en) Smart-device-orientated feedback awaking method and smart device thereof
US20190180734A1 (en) Keyword confirmation method and apparatus
WO2023034497A3 (en) Gaze based dictation
US20180350360A1 (en) Provide non-obtrusive output
EP3851963A3 (en) Incident detection and management
CN106200950B (en) A kind of method and mobile terminal of adjustable font size
US9043204B2 (en) Thought recollection and speech assistance device

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 22786586

Country of ref document: EP

Kind code of ref document: A2

WWE WIPO information: entry into national phase

Ref document number: 202280059719.5

Country of ref document: CN

ENP Entry into the national phase

Ref document number: 2022786586

Country of ref document: EP

Effective date: 20240227

NENP Non-entry into the national phase

Ref country code: DE