CN117043720A - Method for interacting with objects in an environment

Publication number
CN117043720A
Authority
CN
China
Prior art keywords
user interface
user
input
interface element
predefined portion
Prior art date
Legal status: Pending
Application number
CN202280022799.7A
Other languages
Chinese (zh)
Inventor
C·D·麦肯齐
P·普拉·艾·柯尼萨
M·阿朗索鲁伊斯
S·O·勒梅
W·A·索伦蒂诺三世
邱诗善
J·拉瓦兹
B·H·博伊塞尔
K·E·鲍尔利
P·D·安东
E·克日沃卢奇科
N·吉特
Current Assignee
Apple Inc
Original Assignee
Apple Inc
Priority date
Filing date
Publication date
Application filed by Apple Inc
Priority to CN202311491331.5A (published as CN117406892A)
Priority claimed from PCT/US2022/013208 (published as WO2022159639A1)
Publication of CN117043720A

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 — Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/01 — Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F 3/048 — Interaction techniques based on graphical user interfaces [GUI]
    • G06F 3/0484 — Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range
    • G06F 3/04845 — Interaction techniques based on graphical user interfaces [GUI] for image manipulation, e.g. dragging, rotation, expansion or change of colour
    • G06F 3/0488 — Interaction techniques based on graphical user interfaces [GUI] using a touch-screen or digitiser, e.g. input of commands through traced gestures
    • G06F 9/451 — Execution arrangements for user interfaces

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

In some embodiments, the electronic device selectively performs operations in response to user inputs based on whether a ready state was detected prior to those inputs. In some embodiments, the electronic device processes the user input based on an attention area associated with the user. In some embodiments, the electronic device enhances interaction with user interface elements at different distances and/or angles relative to the user's gaze. In some embodiments, the electronic device enhances interaction with user interface elements for mixed direct and indirect interaction modes. In some embodiments, the electronic device manages input from both hands of the user and/or presents visual indications of user input. In some implementations, the electronic device uses visual indications of such interactions to enhance interactions with user interface elements in a three-dimensional environment. In some embodiments, the electronic device redirects a selection input from one user interface element to another user interface element.

Description

Method for interacting with objects in an environment
Cross Reference to Related Applications
This patent application claims the benefit of U.S. provisional application No. 63/139,566, filed January 20, 2021, and U.S. provisional application No. 63/261,559, filed September 23, 2021, the contents of both of which are incorporated herein by reference in their entirety for all purposes.
Technical Field
The present invention relates generally to computer systems having a display generating component that presents a graphical user interface and one or more input devices, including but not limited to electronic devices that present interactive user interface elements via the display generating component.
Background
In recent years, the development of computer systems for augmented reality has increased significantly. An example augmented reality environment includes at least some virtual elements that replace or augment the physical world. Input devices (such as cameras, controllers, joysticks, touch-sensitive surfaces, and touch screen displays) for computer systems and other electronic computing devices are used to interact with the virtual/augmented reality environment. Exemplary virtual elements include virtual objects (including digital images, videos, text, icons, control elements (such as buttons), and other graphics).
Methods and interfaces for interacting with environments (e.g., applications, augmented reality environments, mixed reality environments, and virtual reality environments) that include at least some virtual elements are cumbersome, inefficient, and limited. For example, systems that provide insufficient feedback for performing actions associated with virtual objects, systems that require a series of inputs to achieve a desired result in an augmented reality environment, and systems in which virtual objects are complex, cumbersome, and error-prone to manipulate create a significant cognitive burden on the user and detract from the experience of the virtual/augmented reality environment. In addition, these methods take longer than necessary, thereby wasting energy. This latter consideration is particularly important in battery-powered devices.
Disclosure of Invention
Accordingly, there is a need for a computer system with improved methods and interfaces to provide a user with a computer-generated experience, thereby making user interactions with the computer system more efficient and intuitive for the user. Such methods and interfaces optionally complement or replace conventional methods for providing a computer-generated reality experience to a user. Such methods and interfaces reduce the number, extent, and/or nature of inputs from a user by helping the user understand the association between the inputs provided and the responses of the device to those inputs, thereby creating a more efficient human-machine interface.
The disclosed system reduces or eliminates the above-described drawbacks and other problems associated with user interfaces for computer systems having a display generating component and one or more input devices. In some embodiments, the computer system is a desktop computer with an associated display. In some embodiments, the computer system is a portable device (e.g., a notebook, tablet, or handheld device). In some embodiments, the computer system is a personal electronic device (e.g., a wearable electronic device such as a watch or a head-mounted device). In some embodiments, the computer system has a touch pad. In some embodiments, the computer system has one or more cameras. In some implementations, the computer system has a touch-sensitive display (also referred to as a "touch screen" or "touch screen display"). In some embodiments, the computer system has one or more eye tracking components. In some embodiments, the computer system has one or more hand tracking components. In some embodiments, the computer system has, in addition to the display generating component, one or more output devices including one or more haptic output generators and one or more audio output devices. In some embodiments, the computer system has a Graphical User Interface (GUI), one or more processors, memory, and one or more modules, programs, or sets of instructions stored in the memory for performing a plurality of functions. In some embodiments, the user interacts with the GUI through stylus and/or finger contacts and gestures on the touch-sensitive surface, movements of the user's eyes and hands in space relative to the GUI or the user's body (as captured by cameras and other motion sensors), and voice input (as captured by one or more audio input devices). In some embodiments, the functions performed through the interactions optionally include image editing, drawing, presentation, word processing, spreadsheet making, game playing, phone calls, video conferencing, email sending and receiving, instant messaging, workout support, digital photography, digital video recording, web browsing, digital music playing, note taking, and/or digital video playing. Executable instructions for performing these functions are optionally included in a non-transitory computer-readable storage medium or other computer program product configured for execution by one or more processors.
There is a need for an electronic device with improved methods and interfaces for interacting with objects in a three-dimensional environment. Such methods and interfaces may supplement or replace conventional methods for interacting with objects in a three-dimensional environment. Such methods and interfaces reduce the amount, degree, and/or nature of input from a user and result in a more efficient human-machine interface.
In some embodiments, the electronic device performs or does not perform an operation in response to a user input depending on whether a ready state of the user was detected prior to the user input. In some embodiments, the electronic device processes the user input based on an attention area associated with the user. In some embodiments, the electronic device enhances interaction with user interface elements in the three-dimensional environment that are at different distances and/or angles relative to the user's gaze. In some embodiments, the electronic device enhances interaction with the user interface element for mixed direct and indirect interaction modes. In some embodiments, the electronic device manages input from both hands of the user. In some embodiments, the electronic device presents a visual indication of user input. In some implementations, the electronic device uses visual indications of such interactions to enhance interactions with user interface elements in a three-dimensional environment. In some embodiments, the electronic device redirects input from one user interface element to another user interface element according to movement included in the input.
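By way of illustration only, the following Swift sketch shows one way the ready-state gating described above could be modeled: a selection input (e.g., a pinch) performs an operation only if a ready state (e.g., a pre-pinch hand shape while the user's attention is on the element) was detected before that input began. The type and function names are hypothetical and are not taken from the embodiments.

```swift
import Foundation

// Hypothetical input model: a ready state (pre-pinch hand shape while the user's
// gaze is on an element) must be observed before a selection input begins in order
// for that input to trigger the element's operation.
enum HandPose { case prePinch, pinch, idle }

struct InputEvent {
    let pose: HandPose
    let gazeOnElement: Bool
    let timestamp: TimeInterval
}

final class ReadyStateGate {
    private var readyDetectedAt: TimeInterval?

    /// Record whether the user entered a ready state (pre-pinch while looking at the element).
    func observe(_ event: InputEvent) {
        if event.pose == .prePinch && event.gazeOnElement {
            readyDetectedAt = event.timestamp
        }
    }

    /// A pinch only performs the operation if a ready state was detected before it started.
    func shouldPerformOperation(for pinch: InputEvent) -> Bool {
        guard pinch.pose == .pinch, let ready = readyDetectedAt else { return false }
        return ready < pinch.timestamp
    }
}

// Usage: a pinch preceded by a ready state is acted on; one without is ignored.
let gate = ReadyStateGate()
gate.observe(InputEvent(pose: .prePinch, gazeOnElement: true, timestamp: 1.0))
print(gate.shouldPerformOperation(for: InputEvent(pose: .pinch, gazeOnElement: true, timestamp: 1.2))) // true
```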
It is noted that the various embodiments described above may be combined with any of the other embodiments described herein. The features and advantages described in this specification are not all-inclusive and, in particular, many additional features and advantages will be apparent to one of ordinary skill in the art in view of the drawings, specification, and claims. Furthermore, it should be noted that the language used in the specification has been principally selected for readability and instructional purposes, and may not have been selected to delineate or circumscribe the inventive subject matter.
Drawings
For a better understanding of the various described embodiments, reference should be made to the following detailed description taken in conjunction with the following drawings, in which like reference numerals designate corresponding parts throughout the several views.
FIG. 1 is a block diagram illustrating an operating environment of a computer system for providing a CGR experience, according to some embodiments.
FIG. 2 is a block diagram illustrating a controller of a computer system configured to manage and coordinate a user's CGR experience, according to some embodiments.
FIG. 3 is a block diagram illustrating a display generation component of a computer system configured to provide a visual component of a CGR experience to a user, according to some embodiments.
FIG. 4 is a block diagram illustrating a hand tracking unit of a computer system configured to capture gesture inputs of a user, according to some embodiments.
Fig. 5 is a block diagram illustrating an eye tracking unit of a computer system configured to capture gaze input of a user, according to some embodiments.
Fig. 6A is a flow diagram illustrating a glint-assisted gaze tracking pipeline, in accordance with some embodiments.
FIG. 6B illustrates an exemplary environment for an electronic device that provides a CGR experience, according to some embodiments.
Fig. 7A-7C illustrate an exemplary manner in which an electronic device may or may not perform an operation in response to a user input, according to whether a ready state of the user was detected prior to the user input, in accordance with some embodiments.
Fig. 8A-8K are flowcharts illustrating methods of performing or not performing an operation in response to a user input according to whether a ready state of the user was detected prior to the user input, according to some embodiments.
Fig. 9A-9C illustrate an exemplary manner in which an electronic device processes user input based on an attention area associated with a user, according to some embodiments.
Fig. 10A-10H are flowcharts illustrating methods of processing user input based on an attention area associated with a user, according to some embodiments.
Fig. 11A-11C illustrate examples of how an electronic device enhances interactions with user interface elements in a three-dimensional environment at different distances and/or angles relative to a user's gaze, in accordance with some embodiments.
Fig. 12A-12F are flowcharts illustrating methods of enhancing interaction with user interface elements in a three-dimensional environment at different distances and/or angles relative to a user's gaze, according to some embodiments.
Fig. 13A-13C illustrate examples of how an electronic device enhances interaction with user interface elements for mixed direct and indirect interaction modes, according to some embodiments.
Fig. 14A-14H are flowcharts illustrating methods of enhancing interactions with user interface elements for mixed direct and indirect interaction modes, according to some embodiments.
Fig. 15A-15E illustrate an exemplary manner in which an electronic device manages input from two hands of a user, according to some embodiments.
Fig. 16A-16I are flowcharts illustrating methods of managing input from two hands of a user, according to some embodiments.
Fig. 17A-17E illustrate various ways in which an electronic device presents a visual indication of user input according to some embodiments.
Fig. 18A-18O are flowcharts illustrating methods of presenting visual indications of user inputs according to some embodiments.
Fig. 19A-19D illustrate examples of how an electronic device uses visual indications of such interactions to enhance interactions with user interface elements in a three-dimensional environment, according to some embodiments.
Fig. 20A-20F are flowcharts illustrating methods of enhancing interactions with user interface elements in a three-dimensional environment using visual indications of such interactions, according to some embodiments.
Fig. 21A-21E illustrate examples of how an electronic device redirects input from one user interface element to another user interface element in response to detecting movement included in the input, according to some embodiments.
Fig. 22A-22K are flowcharts illustrating methods of redirecting input from one user interface element to another user interface element in response to detecting movement included in the input, according to some embodiments.
Detailed Description
According to some embodiments, the present disclosure relates to a user interface for providing a computer-generated reality (CGR) experience to a user.
The systems, methods, and GUIs described herein provide an improved way for an electronic device to interact with and manipulate objects in a three-dimensional environment. The three-dimensional environment optionally includes one or more virtual objects, one or more representations of real objects in the physical environment of the electronic device (e.g., displayed as photorealistic (e.g., "pass-through") representations of the real objects, or visible to a user through transparent portions of the display generating component), and/or representations of the user in the three-dimensional environment.
In some embodiments, the electronic device automatically updates the orientation of the virtual object in the three-dimensional environment based on the viewpoint of the user in the three-dimensional environment. In some implementations, the electronic device moves the virtual object according to the user input and displays the object at the updated location in response to termination of the user input. In some implementations, the electronic device automatically updates the orientation of the virtual object located at the updated location (e.g., and/or as the virtual object moves to the updated location) such that the virtual object is oriented toward (e.g., throughout and/or at the end of) the user's point of view in the three-dimensional environment. Automatically updating the orientation of the virtual object in the three-dimensional environment enables the user to more naturally and efficiently view and interact with the virtual object without requiring the user to manually adjust the orientation of the object.
In some embodiments, the electronic device automatically updates the orientation of the virtual object in the three-dimensional environment based on the viewpoints of the multiple users in the three-dimensional environment. In some implementations, the electronic device moves the virtual object according to the user input and displays the object at the updated location in response to termination of the user input. In some implementations, the electronic device automatically updates the orientation of the virtual object located at the updated location (e.g., and/or as the virtual object moves to the updated location) such that the virtual object is oriented toward (e.g., throughout and/or at the end of) the viewpoint of the plurality of users in the three-dimensional environment. Automatically updating the orientation of the virtual object in the three-dimensional environment enables the user to more naturally and efficiently view and interact with the virtual object without requiring the user to manually adjust the orientation of the object.
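As a rough illustration of the orientation behavior described in the two preceding paragraphs, the Swift sketch below computes a yaw angle that turns a virtual object at its updated location toward a single viewpoint, or toward the centroid of several viewpoints. The embodiments themselves may use a different policy (e.g., full 3D orientation or weighting by distance); all names and the centroid choice are assumptions.

```swift
import Foundation

struct Point3 { var x, y, z: Double }

/// Yaw (rotation about the vertical axis, in radians) that turns an object at
/// `objectPosition` to face `viewpoint`. Pitch and roll are ignored in this sketch.
func yawFacing(viewpoint: Point3, from objectPosition: Point3) -> Double {
    atan2(viewpoint.x - objectPosition.x, viewpoint.z - objectPosition.z)
}

/// With several users, one plausible policy is to face the centroid of their viewpoints.
func yawFacing(viewpoints: [Point3], from objectPosition: Point3) -> Double {
    guard !viewpoints.isEmpty else { return 0 }
    let n = Double(viewpoints.count)
    let centroid = Point3(x: viewpoints.reduce(0) { $0 + $1.x } / n,
                          y: viewpoints.reduce(0) { $0 + $1.y } / n,
                          z: viewpoints.reduce(0) { $0 + $1.z } / n)
    return yawFacing(viewpoint: centroid, from: objectPosition)
}

// Example: an object at the origin faces a viewpoint one meter ahead and one to the right.
print(yawFacing(viewpoint: Point3(x: 1, y: 0, z: 1), from: Point3(x: 0, y: 0, z: 0))) // ~0.785 rad
```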
In some embodiments, the electronic device modifies the appearance of a real object that is between the virtual object and the user's point of view in the three-dimensional environment. The electronic device optionally blurs, darkens, or otherwise modifies a portion of the real object between the user's point of view in the three-dimensional environment and the virtual object (e.g., displayed as a photorealistic (e.g., "pass-through") representation of the real object, or visible to the user through a transparent portion of the display generating component). In some implementations, the electronic device modifies a portion of the real object that is within a threshold distance (e.g., 5 cm, 10 cm, 30 cm, 50 cm, 100 cm, etc.) of the boundary of the virtual object, without modifying a portion of the real object that is more than the threshold distance from the boundary of the virtual object. Modifying the appearance of the real object allows the user to more naturally and efficiently view and interact with the virtual object. In addition, modifying the appearance of the real object reduces the cognitive burden on the user.
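A minimal sketch of the threshold-based modification just described, assuming a simple linear falloff (the actual curve, threshold value, and visual effect are not specified by the description); `appearanceModification` is a hypothetical helper that returns 0 for "unmodified" and 1 for "fully blurred/darkened".

```swift
import Foundation

/// Strength of the blur/dim effect applied to a point on a real object, given its
/// distance (in meters) to the nearest point on the virtual object's boundary.
/// Within `threshold` the effect ramps linearly from 1 at the boundary to 0 at the
/// threshold; beyond the threshold the real object is left unmodified.
func appearanceModification(distanceToBoundary d: Double, threshold: Double = 0.3) -> Double {
    guard d < threshold else { return 0 }        // farther than the threshold: unmodified
    return 1.0 - max(0.0, d) / threshold         // closer to the boundary: stronger effect
}

print(appearanceModification(distanceToBoundary: 0.0))   // 1.0 (at the boundary)
print(appearanceModification(distanceToBoundary: 0.15))  // 0.5
print(appearanceModification(distanceToBoundary: 0.5))   // 0.0 (unmodified)
```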
In some embodiments, the electronic device automatically selects a location of the user in a three-dimensional environment that includes one or more virtual objects and/or other users. In some embodiments, a user obtains access to a three-dimensional environment that already includes one or more other users and one or more virtual objects. In some embodiments, the electronic device automatically selects a location to be associated with the user (e.g., a location at which to place the user's point of view) based on the locations and orientations of the virtual object and other users in the three-dimensional environment. In some embodiments, the electronic device will select the location of the user to enable the user to view other users and virtual objects in the three-dimensional environment without blocking the view of the user and virtual objects by the other users. Automatically placing the user in the three-dimensional environment based on the position and orientation of the virtual object and other users in the three-dimensional environment enables the user to effectively view and interact with the virtual object and other users in the three-dimensional environment without requiring the user to manually select a position in the three-dimensional environment to be associated with.
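For illustration, one plausible way to realize the placement behavior described above is to score each candidate location by how many of its lines of sight to the virtual objects and other users would pass close to another user, and to pick the least obstructed candidate. This is a hedged sketch only; the candidate set, the clearance radius, and the scoring rule are assumptions.

```swift
import Foundation

struct Point3 { var x, y, z: Double }

func length(_ v: Point3) -> Double { (v.x * v.x + v.y * v.y + v.z * v.z).squareRoot() }

/// Distance from point `p` to the segment from `a` to `b` (used to test whether another
/// user sits between a candidate viewpoint and something the new user should be able to see).
func distanceToSegment(_ p: Point3, _ a: Point3, _ b: Point3) -> Double {
    let ab = Point3(x: b.x - a.x, y: b.y - a.y, z: b.z - a.z)
    let ap = Point3(x: p.x - a.x, y: p.y - a.y, z: p.z - a.z)
    let len2 = ab.x * ab.x + ab.y * ab.y + ab.z * ab.z
    let t = len2 == 0 ? 0 : max(0, min(1, (ap.x * ab.x + ap.y * ab.y + ap.z * ab.z) / len2))
    let offset = Point3(x: a.x + t * ab.x - p.x, y: a.y + t * ab.y - p.y, z: a.z + t * ab.z - p.z)
    return length(offset)
}

/// Choose the candidate whose views of the virtual objects and other users are least often
/// blocked by another user (a user within `clearance` of a sight line counts as blocking it).
func chooseSpawnLocation(candidates: [Point3], objects: [Point3],
                         otherUsers: [Point3], clearance: Double = 0.5) -> Point3? {
    func blockedViews(from c: Point3) -> Int {
        var count = 0
        for (i, target) in (objects + otherUsers).enumerated() {
            let targetIsUser = i >= objects.count
            for (j, user) in otherUsers.enumerated() where !(targetIsUser && j == i - objects.count) {
                if distanceToSegment(user, c, target) < clearance { count += 1; break }
            }
        }
        return count
    }
    return candidates.min { blockedViews(from: $0) < blockedViews(from: $1) }
}
```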
In some embodiments, the electronic device redirects input from one user interface element to another user interface element according to movement included in the input. In some implementations, an electronic device presents a plurality of interactive user interface elements and receives input via one or more input devices directed to a first user interface element of the plurality of user interface elements. In some implementations, after detecting a portion of the input (e.g., not detecting the entire input), the electronic device detects a moving portion of the input corresponding to a request to redirect the input to the second user interface element. In response, in some embodiments, the electronic device directs the input to the second user interface element. In some implementations, in response to movement that meets one or more criteria (e.g., based on speed, duration, distance, etc.), the electronic device cancels the input rather than redirecting the input. Enabling a user to redirect or cancel input after providing a portion of the input enables the user to efficiently interact with the electronic device with less input (e.g., undo unexpected actions and/or direct input to a different user interface element).
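The redirection-versus-cancellation decision described above might be sketched as follows. The speed and distance thresholds are purely illustrative, and `InputResolution` and `resolveMovement` are hypothetical names rather than part of the embodiments.

```swift
import Foundation

enum InputResolution: Equatable { case stay, redirect(toElement: Int), cancel }

/// After part of a selection input directed at one element has been detected, decide what a
/// subsequent movement component does: small drift keeps the input on the current element,
/// deliberate movement that lands on another element redirects it, and fast or large movement
/// away from every element cancels it. All thresholds here are illustrative assumptions.
func resolveMovement(distanceMoved: Double,            // meters of hand movement so far
                     speed: Double,                    // meters per second
                     nearestOtherElement: Int?,        // index of the closest other element, if any
                     distanceToNearestOther: Double) -> InputResolution {
    if speed > 2.0 || (nearestOtherElement == nil && distanceMoved > 0.25) {
        return .cancel                                 // movement meets the cancellation criteria
    }
    if let other = nearestOtherElement, distanceMoved > 0.05, distanceToNearestOther < 0.03 {
        return .redirect(toElement: other)             // movement lands on a different element
    }
    return .stay
}

print(resolveMovement(distanceMoved: 0.08, speed: 0.4,
                      nearestOtherElement: 2, distanceToNearestOther: 0.01)) // redirect(toElement: 2)
```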
Fig. 1-6 provide a description of an exemplary computer system for providing a CGR experience to a user (such as described below with respect to methods 800, 1000, 1200, 1400, 1600, 1800, 2000, and 2200). In some embodiments, as shown in fig. 1, a CGR experience is provided to a user via an operating environment 100 including a computer system 101. The computer system 101 includes a controller 110 (e.g., a processor of a portable electronic device or a remote server), a display generation component 120 (e.g., a Head Mounted Device (HMD), a display, a projector, a touch screen, etc.), one or more input devices 125 (e.g., an eye tracking device 130, a hand tracking device 140, other input devices 150), one or more output devices 155 (e.g., a speaker 160, a haptic output generator 170, and other output devices 180), one or more sensors 190 (e.g., an image sensor, a light sensor, a depth sensor, a haptic sensor, an orientation sensor, a proximity sensor, a temperature sensor, a position sensor, a motion sensor, a speed sensor, etc.), and optionally one or more peripheral devices 195 (e.g., a household appliance, a wearable device, etc.). In some implementations, one or more of the input device 125, the output device 155, the sensor 190, and the peripheral device 195 are integrated with the display generating component 120 (e.g., in a head-mounted device or a handheld device).
The processes described below enhance operability of the device and make the user-device interface more efficient through various techniques (e.g., by helping a user provide appropriate input and reducing user error in operating/interacting with the device), including by providing improved visual feedback to the user, reducing the number of inputs required to perform the operation, providing additional control options without cluttering the user interface with additional display controls, performing the operation when a set of conditions has been met without further user input and/or additional techniques. These techniques also reduce power usage and extend battery life of the device by enabling a user to use the device faster and more efficiently.
In describing the CGR experience, various terms are used to refer differently to several related but different environments that a user may sense and/or interact with (e.g., interact with inputs detected by computer system 101 generating the CGR experience that cause the computer system generating the CGR experience to generate audio, visual, and/or tactile feedback corresponding to various inputs provided to computer system 101). The following are a subset of these terms:
Physical environment: a physical environment refers to a physical world in which people can sense and/or interact without the assistance of an electronic system. Physical environments such as physical parks include physical objects such as physical trees, physical buildings, and physical people. People can directly sense and/or interact with a physical environment, such as by visual, tactile, auditory, gustatory, and olfactory.
Computer-generated reality: conversely, a computer-generated reality (CGR) environment refers to a fully or partially simulated environment in which people sense and/or interact via an electronic system. In the CGR, a subset of the physical movements of the person, or a representation thereof, is tracked and in response one or more characteristics of one or more virtual objects simulated in the CGR environment are adjusted in a manner consistent with at least one physical law. For example, the CGR system may detect human head rotation and, in response, adjust the graphical content and sound field presented to the human in a manner similar to the manner in which such views and sounds change in the physical environment. In some cases (e.g., for reachability reasons), the adjustment of the characteristics of the virtual object in the CGR environment may be made in response to a representation of physical motion (e.g., a voice command). A person may utilize any of his sensations to sense and/or interact with CGR objects, including visual, auditory, tactile, gustatory, and olfactory. For example, a person may sense and/or interact with audio objects that create a 3D or spatial audio environment that provides a perception of point audio sources in 3D space. As another example, an audio object may enable audio transparency that selectively introduces environmental sounds from a physical environment with or without computer generated audio. In some CGR environments, a person may sense and/or interact with only audio objects.
Examples of CGR include virtual reality and mixed reality.
Virtual reality: a Virtual Reality (VR) environment refers to a simulated environment designed to be based entirely on computer-generated sensory input for one or more senses. The VR environment includes a plurality of virtual objects that a person can sense and/or interact with. For example, computer-generated images of trees, buildings, and avatars representing people are examples of virtual objects. A person may sense and/or interact with virtual objects in the VR environment through a simulation of the presence of the person within the computer-generated environment and/or through a simulation of a subset of the physical movements of the person within the computer-generated environment.
Mixed reality: in contrast to VR environments designed to be based entirely on computer-generated sensory input, a Mixed Reality (MR) environment refers to a simulated environment designed to introduce sensory input from a physical environment or a representation thereof in addition to including computer-generated sensory input (e.g., virtual objects). On a virtual continuum, a mixed reality environment is any condition between, but not including, a full physical environment as one end and a virtual reality environment as the other end. In some MR environments, the computer-generated sensory input may be responsive to changes in sensory input from the physical environment. In addition, some electronic systems for rendering MR environments may track the position and/or orientation relative to the physical environment to enable virtual objects to interact with real objects (i.e., physical objects or representations thereof from the physical environment). For example, the system may cause movement such that the virtual tree appears to be stationary relative to the physical ground.
Examples of mixed reality include augmented reality and augmented virtuality.
Augmented reality: an Augmented Reality (AR) environment refers to a simulated environment in which one or more virtual objects are superimposed over a physical environment or a representation of a physical environment. For example, an electronic system for presenting an AR environment may have a transparent or translucent display through which a person may directly view the physical environment. The system may be configured to present the virtual object on a transparent or semi-transparent display such that a person perceives the virtual object superimposed over the physical environment with the system. Alternatively, the system may have an opaque display and one or more imaging sensors that capture images or videos of the physical environment, which are representations of the physical environment. The system combines the image or video with the virtual object and presents the composition on an opaque display. A person utilizes the system to indirectly view the physical environment via an image or video of the physical environment and perceive a virtual object superimposed over the physical environment. As used herein, video of a physical environment displayed on an opaque display is referred to as "pass-through video," meaning that the system captures images of the physical environment using one or more image sensors and uses those images when rendering an AR environment on the opaque display. Further alternatively, the system may have a projection system that projects the virtual object into the physical environment, for example as a hologram or on a physical surface, such that a person perceives the virtual object superimposed on top of the physical environment with the system. An augmented reality environment also refers to a simulated environment in which a representation of a physical environment is transformed by computer-generated sensory information. For example, in providing a passthrough video, the system may transform one or more sensor images to apply a selected viewing angle (e.g., a viewpoint) that is different from the viewing angle captured by the imaging sensor. As another example, the representation of the physical environment may be transformed by graphically modifying (e.g., magnifying) portions thereof such that the modified portions may be representative but not real versions of the original captured image. For another example, the representation of the physical environment may be transformed by graphically eliminating or blurring portions thereof.
Enhanced virtualization: enhanced virtual (AV) environment refers to a simulated environment in which a virtual environment or computer-generated environment incorporates one or more sensory inputs from a physical environment. The sensory input may be a representation of one or more characteristics of the physical environment. For example, an AV park may have virtual trees and virtual buildings, but the face of a person is realistically reproduced from an image taken of a physical person. As another example, the virtual object may take the shape or color of a physical object imaged by one or more imaging sensors. For another example, the virtual object may employ shadows that conform to the positioning of the sun in the physical environment.
Hardware: there are many different types of electronic systems that enable a person to sense and/or interact with various CGR environments. Examples include head-mounted systems, projection-based systems, head-up displays (HUDs), vehicle windshields integrated with display capabilities, windows integrated with display capabilities, displays formed as lenses designed for placement on a human eye (e.g., similar to contact lenses), headphones/earphones, speaker arrays, input systems (e.g., wearable or handheld controllers with or without haptic feedback), smart phones, tablet computers, and desktop/laptop computers. The head-mounted system may have one or more speakers and an integrated opaque display. Alternatively, the head-mounted system may be configured to accept an external opaque display (e.g., a smart phone). The head-mounted system may incorporate one or more imaging sensors for capturing images or video of the physical environment, and/or one or more microphones for capturing audio of the physical environment. The head-mounted system may have a transparent or translucent display instead of an opaque display. The transparent or translucent display may have a medium through which light representing an image is directed to the eyes of a person. The display may utilize digital light projection, OLED, LED, uLED, liquid crystal on silicon, laser scanning light sources, or any combination of these techniques. The medium may be an optical waveguide, a holographic medium, an optical combiner, an optical reflector, or any combination thereof. In one embodiment, the transparent or translucent display may be configured to selectively become opaque. Projection-based systems may employ retinal projection techniques that project a graphical image onto a person's retina. The projection system may also be configured to project the virtual object into the physical environment, e.g. as a hologram, or onto a physical surface. In some embodiments, controller 110 is configured to manage and coordinate the CGR experience of the user. In some embodiments, controller 110 includes suitable combinations of software, firmware, and/or hardware. The controller 110 is described in more detail below with reference to fig. 2. In some implementations, the controller 110 is a computing device that is in a local or remote location relative to the scene 105 (e.g., physical environment). For example, the controller 110 is a local server located within the scene 105. As another example, the controller 110 is a remote server (e.g., cloud server, central server, etc.) located outside of the scene 105. In some implementations, the controller 110 is communicatively coupled with the display generation component 120 (e.g., HMD, display, projector, touch-screen, etc.) via one or more wired or wireless communication channels 144 (e.g., bluetooth, IEEE 802.11x, IEEE 802.16x, IEEE 802.3x, etc.). In another example, the controller 110 is included within a housing (e.g., a physical enclosure) of the display generation component 120 (e.g., an HMD or portable electronic device including a display and one or more processors, etc.), one or more of the input devices 125, one or more of the output devices 155, one or more of the sensors 190, and/or one or more of the peripheral devices 195, or shares the same physical housing or support structure with one or more of the above.
In some embodiments, display generation component 120 is configured to provide a CGR experience (e.g., at least a visual component of the CGR experience) to a user. In some embodiments, display generation component 120 includes suitable combinations of software, firmware, and/or hardware. The display generating section 120 is described in more detail below with respect to fig. 3. In some embodiments, the functionality of the controller 110 is provided by and/or combined with the display generating component 120.
According to some embodiments, display generation component 120 provides a CGR experience to a user when the user is virtually and/or physically present within scene 105.
In some embodiments, the display generating component is worn on a portion of the user's body (e.g., on his/her head, on his/her hand, etc.). As such, display generation component 120 includes one or more CGR displays provided for displaying CGR content. For example, in various embodiments, the display generation component 120 encloses a field of view of a user. In some embodiments, display generation component 120 is a handheld device (such as a smart phone or tablet computer) configured to present CGR content, and the user holds the device with a display facing the user's field of view and a camera facing scene 105. In some embodiments, the handheld device is optionally placed within a housing that is worn on the head of the user. In some embodiments, the handheld device is optionally placed on a support (e.g., tripod) in front of the user. In some embodiments, display generation component 120 is a CGR room, housing, or room configured to present CGR content, wherein display generation component 120 is not worn or held by a user. Many of the user interfaces described with reference to one type of hardware for displaying CGR content (e.g., a handheld device or a device on a tripod) may be implemented on another type of hardware for displaying CGR content (e.g., an HMD or other wearable computing device). For example, a user interface showing interactions with CGR content triggered based on interactions occurring in a space in front of a handheld device or a tripod-mounted device may similarly be implemented with an HMD, where the interactions occur in the space in front of the HMD and responses to the CGR content are displayed via the HMD. Similarly, a user interface showing interactions with CGR content triggered based on movement of a handheld device or tripod-mounted device relative to a physical environment (e.g., the scene 105 or a portion of the user's body (e.g., the user's eyes, head, or hands)) may similarly be implemented with an HMD, where the movement is caused by movement of the HMD relative to the physical environment (e.g., the scene 105 or a portion of the user's body (e.g., the user's eyes, head, or hands)).
While relevant features of the operating environment 100 are shown in fig. 1, those of ordinary skill in the art will recognize from this disclosure that various other features are not shown for the sake of brevity and so as not to obscure more relevant aspects of the exemplary embodiments disclosed herein.
Fig. 2 is a block diagram of an example of a controller 110 according to some embodiments. While certain specific features are shown, those of ordinary skill in the art will appreciate from the disclosure that various other features are not shown for the sake of brevity and so as not to obscure more pertinent aspects of the embodiments disclosed herein. To this end, as a non-limiting example, in some embodiments, the controller 110 includes one or more processing units 202 (e.g., microprocessors, Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs), Graphics Processing Units (GPUs), Central Processing Units (CPUs), processing cores, etc.), one or more input/output (I/O) devices 206, one or more communication interfaces 208 (e.g., Universal Serial Bus (USB), IEEE 802.3x, IEEE 802.11x, IEEE 802.16x, Global System for Mobile communications (GSM), Code Division Multiple Access (CDMA), Time Division Multiple Access (TDMA), Global Positioning System (GPS), Infrared (IR), Bluetooth, ZigBee, and/or similar types of interfaces), one or more programming (e.g., I/O) interfaces 210, memory 220, and one or more communication buses 204 for interconnecting these components and various other components.
In some embodiments, one or more of the communication buses 204 include circuitry that interconnects and controls communications between system components. In some embodiments, the one or more I/O devices 206 include at least one of a keyboard, a mouse, a touchpad, a joystick, one or more microphones, one or more speakers, one or more image sensors, one or more displays, and the like.
Memory 220 includes high-speed random access memory, such as Dynamic Random Access Memory (DRAM), Static Random Access Memory (SRAM), Double Data Rate Random Access Memory (DDR RAM), or other random access solid state memory devices. In some embodiments, memory 220 includes non-volatile memory, such as one or more magnetic disk storage devices, optical disk storage devices, flash memory devices, or other non-volatile solid state storage devices. Memory 220 optionally includes one or more storage devices located remotely from the one or more processing units 202. Memory 220 includes a non-transitory computer-readable storage medium. In some embodiments, memory 220 or the non-transitory computer readable storage medium of memory 220 stores the following programs, modules, and data structures, or a subset thereof, including optional operating system 230 and CGR experience module 240.
Operating system 230 includes instructions for handling various basic system services and for performing hardware-related tasks. In some embodiments, CGR experience module 240 is configured to manage and coordinate single or multiple CGR experiences of one or more users (e.g., single CGR experiences of one or more users, or multiple CGR experiences of a respective group of one or more users). To this end, in various embodiments, the CGR experience module 240 includes a data acquisition unit 242, a tracking unit 244, a coordination unit 246, and a data transmission unit 248.
In some embodiments, the data acquisition unit 242 is configured to acquire data (e.g., presentation data, interaction data, sensor data, location data, etc.) from at least the display generation component 120 of fig. 1, and optionally from one or more of the input device 125, the output device 155, the sensor 190, and/or the peripheral device 195. For this purpose, in various embodiments, the data acquisition unit 242 includes instructions and/or logic for instructions as well as heuristics and metadata for heuristics.
In some embodiments, tracking unit 244 is configured to map scene 105 and track at least the location/position of display generation component 120 relative to scene 105 of fig. 1, and optionally the location of one or more of input device 125, output device 155, sensor 190, and/or peripheral device 195. For this purpose, in various embodiments, tracking unit 244 includes instructions and/or logic for instructions as well as heuristics and metadata for heuristics. In some embodiments, tracking unit 244 includes a hand tracking unit 243 and/or an eye tracking unit 245. In some embodiments, the hand tracking unit 243 is configured to track the location/position of one or more portions of the user's hand, and/or the motion of one or more portions of the user's hand relative to the scene 105 of fig. 1, relative to the display generating component 120, and/or relative to a coordinate system defined relative to the user's hand. The hand tracking unit 243 is described in more detail below with respect to fig. 4. In some implementations, the eye tracking unit 245 is configured to track the positioning or movement of the user gaze (or more generally, the user's eyes, face, or head) relative to the scene 105 (e.g., relative to the physical environment and/or relative to the user (e.g., the user's hand)) or relative to the CGR content displayed via the display generating component 120. Eye tracking unit 245 is described in more detail below with respect to fig. 5.
In some embodiments, coordination unit 246 is configured to manage and coordinate the CGR experience presented to the user by display generation component 120, and optionally by one or more of output device 155 and/or peripheral device 195. For this purpose, in various embodiments, coordination unit 246 includes instructions and/or logic for instructions as well as heuristics and metadata for heuristics.
In some embodiments, the data transmission unit 248 is configured to transmit data (e.g., presentation data, location data, etc.) to at least the display generation component 120, and optionally to one or more of the input device 125, the output device 155, the sensor 190, and/or the peripheral device 195. For this purpose, in various embodiments, the data transmission unit 248 includes instructions and/or logic for instructions as well as heuristics and metadata for heuristics.
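As a purely illustrative view of how the units described above (data acquisition unit 242, tracking unit 244, coordination unit 246, and data transmission unit 248) might be composed behind a single module such as CGR experience module 240, consider the following Swift sketch; the protocols and the dictionary-based "frame" payload are assumptions used only to show how the pieces relate, not the actual implementation.

```swift
import Foundation

// Hypothetical decomposition mirroring the units described above.
protocol DataAcquiring    { func acquireFrame() -> [String: Double] }
protocol Tracking         { mutating func update(with frame: [String: Double]) }
protocol Coordinating     { func coordinateExperience(using frame: [String: Double]) }
protocol DataTransmitting { func send(_ frame: [String: Double]) }

struct CGRExperienceModule {
    var dataAcquisition: DataAcquiring
    var tracking: Tracking
    var coordination: Coordinating
    var dataTransmission: DataTransmitting

    mutating func runOnce() {
        let frame = dataAcquisition.acquireFrame()      // presentation/interaction/sensor data
        tracking.update(with: frame)                    // hand/eye tracking state (cf. units 243/245)
        coordination.coordinateExperience(using: frame) // decide what the display component presents
        dataTransmission.send(frame)                    // forward data to the display generation component
    }
}
```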
While the data acquisition unit 242, tracking unit 244 (e.g., including the hand tracking unit 243 and eye tracking unit 245), coordination unit 246, and data transmission unit 248 are shown as residing on a single device (e.g., controller 110), it should be understood that in other embodiments, any combination of the data acquisition unit 242, tracking unit 244 (e.g., including the hand tracking unit 243 and eye tracking unit 245), coordination unit 246, and data transmission unit 248 may be located in separate computing devices.
Furthermore, FIG. 2 is a functional description of various features that may be present in a particular implementation, as opposed to a schematic of the embodiments described herein. As will be appreciated by one of ordinary skill in the art, the individually displayed items may be combined and some items may be separated. For example, some of the functional blocks shown separately in fig. 2 may be implemented in a single block, and the various functions of a single functional block may be implemented by one or more functional blocks in various embodiments. The actual number of modules and the division of particular functions, and how features are allocated among them, will vary depending upon the particular implementation, and in some embodiments, depend in part on the particular combination of hardware, software, and/or firmware selected for a particular implementation.
Fig. 3 is a block diagram of an example of display generation component 120 according to some embodiments. While certain specific features are shown, those of ordinary skill in the art will appreciate from the disclosure that various other features are not shown for the sake of brevity and so as not to obscure more pertinent aspects of the embodiments disclosed herein. To that end, as a non-limiting example, in some embodiments, HMD 120 includes one or more processing units 302 (e.g., microprocessors, ASICs, FPGAs, GPUs, CPUs, processing cores, etc.), one or more input/output (I/O) devices and sensors 306, one or more communication interfaces 308 (e.g., USB, FIREWIRE, THUNDERBOLT, IEEE 802.3x, IEEE 802.11x, IEEE 802.16x, GSM, CDMA, TDMA, GPS, IR, BLUETOOTH, ZIGBEE, and/or similar types of interfaces), one or more programming (e.g., I/O) interfaces 310, one or more CGR displays 312, one or more optional internally and/or externally facing image sensors 314, a memory 320, and one or more communication buses 304 for interconnecting these components and various other components.
In some embodiments, one or more communication buses 304 include circuitry for interconnecting and controlling communications between various system components. In some embodiments, the one or more I/O devices and sensors 306 include an Inertial Measurement Unit (IMU), an accelerometer, a gyroscope, a thermometer, one or more physiological sensors (e.g., blood pressure monitor, heart rate monitor, blood oxygen sensor, blood glucose sensor, etc.), one or more microphones, one or more speakers, a haptic engine, and/or one or more depth sensors (e.g., structured light, time of flight, etc.), and/or the like.
In some embodiments, one or more CGR displays 312 are configured to provide a CGR experience to a user. In some embodiments, one or more CGR displays 312 correspond to holographic, digital Light Processing (DLP), liquid Crystal Displays (LCD), liquid crystal on silicon (LCoS), organic light emitting field effect transistors (OLET), organic Light Emitting Diodes (OLED), surface conduction electron emitting displays (SED), field Emission Displays (FED), quantum dot light emitting diodes (QD-LED), microelectromechanical systems (MEMS), and/or similar display types. In some embodiments, one or more CGR displays 312 correspond to a diffractive, reflective, polarizing, holographic, or the like waveguide display. For example, HMD 120 includes a single CGR display. As another example, HMD 120 includes a CGR display for each eye of the user. In some implementations, one or more CGR displays 312 are capable of presenting MR and VR content. In some implementations, one or more CGR displays 312 are capable of presenting MR or VR content.
In some embodiments, the one or more image sensors 314 are configured to acquire image data corresponding to at least a portion of the user's face including the user's eyes (and may be referred to as an eye tracking camera). In some embodiments, the one or more image sensors 314 are configured to acquire image data corresponding to at least a portion of the user's hand and optionally the user's arm (and may be referred to as a hand tracking camera). In some implementations, the one or more image sensors 314 are configured to face forward in order to acquire image data corresponding to a scene that a user would see in the absence of the HMD 120 (and may be referred to as a scene camera). The one or more optional image sensors 314 may include one or more RGB cameras (e.g., with Complementary Metal Oxide Semiconductor (CMOS) image sensors or Charge Coupled Device (CCD) image sensors), one or more Infrared (IR) cameras, and/or one or more event-based cameras, etc.
Memory 320 includes high-speed random access memory such as DRAM, SRAM, DDR RAM or other random access solid state memory devices. In some embodiments, memory 320 includes non-volatile memory, such as one or more magnetic disk storage devices, optical disk storage devices, flash memory devices, or other non-volatile solid state storage devices. Memory 320 optionally includes one or more storage devices located remotely from the one or more processing units 302. Memory 320 includes a non-transitory computer-readable storage medium. In some embodiments, memory 320 or a non-transitory computer readable storage medium of memory 320 stores the following programs, modules, and data structures, or a subset thereof, including optional operating system 330 and CGR rendering module 340.
Operating system 330 includes processes for handling various basic system services and for performing hardware-related tasks. In some embodiments, CGR rendering module 340 is configured to render CGR content to a user via one or more CGR displays 312. For this purpose, in various embodiments, the CGR presentation module 340 includes a data acquisition unit 342, a CGR presentation unit 344, a CGR map generation unit 346, and a data transmission unit 348.
In some embodiments, the data acquisition unit 342 is configured to at least acquire data (e.g., presentation data, interaction data, sensor data, location data, etc.) from the controller 110 of fig. 1. For this purpose, in various embodiments, the data acquisition unit 342 includes instructions and/or logic for instructions and heuristics and metadata for heuristics.
In some implementations, CGR presentation unit 344 is configured to present CGR content via one or more CGR displays 312. For this purpose, in various embodiments, CGR rendering unit 344 includes instructions and/or logic for instructions and heuristics and metadata for heuristics.
In some embodiments, CGR map generation unit 346 is configured to generate a CGR map (e.g., a 3D map of a mixed reality scene or a physical environment in which computer-generated objects may be placed to generate a computer-generated reality map) based on media content data. For this purpose, in various embodiments, CGR map generation unit 346 includes instructions and/or logic for the instructions as well as heuristics and metadata for the heuristics.
In some embodiments, the data transmission unit 348 is configured to transmit data (e.g., presentation data, location data, etc.) to at least the controller 110, and optionally one or more of the input device 125, the output device 155, the sensor 190, and/or the peripheral device 195. For this purpose, in various embodiments, the data transmission unit 348 includes instructions and/or logic for instructions and heuristics and metadata for heuristics.
Although the data acquisition unit 342, the CGR presentation unit 344, the CGR map generation unit 346, and the data transmission unit 348 are shown as residing on a single device (e.g., the display generation component 120 of fig. 1), it should be understood that any combination of the data acquisition unit 342, the CGR presentation unit 344, the CGR map generation unit 346, and the data transmission unit 348 may be located in a separate computing device in other embodiments.
Furthermore, fig. 3 is used more as a functional description of various features that may be present in a particular embodiment, as opposed to a schematic of the embodiments described herein. As will be appreciated by one of ordinary skill in the art, the individually displayed items may be combined and some items may be separated. For example, some of the functional blocks shown separately in fig. 3 may be implemented in a single block, and the various functions of a single functional block may be implemented by one or more functional blocks in various embodiments. The actual number of modules and the division of particular functions, and how features are allocated among them, will vary depending upon the particular implementation, and in some embodiments, depend in part on the particular combination of hardware, software, and/or firmware selected for a particular implementation.
Fig. 4 is a schematic illustration of an exemplary embodiment of a hand tracking device 140. In some embodiments, the hand tracking device 140 (fig. 1) is controlled by the hand tracking unit 243 (fig. 2) to track the position/location of one or more portions of the user's hand, and/or movement of one or more portions of the user's hand relative to the scene 105 of fig. 1 (e.g., relative to a portion of the physical environment surrounding the user, relative to the display generating component 120, or relative to a portion of the user (e.g., the user's face, eyes, or head), and/or relative to a coordinate system defined relative to the user's hand). In some implementations, the hand tracking device 140 is part of the display generation component 120 (e.g., embedded in or attached to a head-mounted device). In some embodiments, the hand tracking device 140 is separate from the display generation component 120 (e.g., in a separate housing or attached to a separate physical support structure).
In some implementations, the hand tracking device 140 includes an image sensor 404 (e.g., one or more IR cameras, 3D cameras, depth cameras, and/or color cameras, etc.) that captures three-dimensional scene information including at least a human user's hand 406. The image sensor 404 captures the hand image with sufficient resolution to enable the fingers and their respective locations to be distinguished. The image sensor 404 typically captures images of other parts of the user's body, and possibly also all parts of the body, and may have a zoom capability or a dedicated sensor with increased magnification to capture images of the hand with a desired resolution. In some implementations, the image sensor 404 also captures 2D color video images of the hand 406 and other elements of the scene. In some implementations, the image sensor 404 is used in conjunction with other image sensors to capture the physical environment of the scene 105, or as an image sensor that captures the physical environment of the scene 105. In some embodiments, the image sensor 404, or a portion thereof, is positioned relative to the user or the user's environment in a manner that uses the field of view of the image sensor to define an interaction space in which hand movements captured by the image sensor are considered input to the controller 110.
In some embodiments, the image sensor 404 outputs a sequence of frames containing 3D mapping data (and, in addition, possibly color image data) to the controller 110, which extracts high-level information from the mapping data. This high-level information is typically provided via an Application Program Interface (API) to an application program running on the controller, which drives the display generating component 120 accordingly. For example, a user may interact with software running on the controller 110 by moving his hand 406 and changing his hand posture.
In some implementations, the image sensor 404 projects a speckle pattern onto a scene that includes the hand 406 and captures an image of the projected pattern. In some implementations, the controller 110 calculates 3D coordinates of points in the scene (including points on the surface of the user's hand) by triangulation based on lateral offsets of the blobs in the pattern. This approach is advantageous because it does not require the user to hold or wear any kind of beacon, sensor or other marker. The method gives the depth coordinates of points in the scene relative to a predetermined reference plane at a specific distance from the image sensor 404. In this disclosure, it is assumed that the image sensor 404 defines an orthogonal set of x-axis, y-axis, z-axis such that the depth coordinates of points in the scene correspond to the z-component measured by the image sensor. Alternatively, the hand tracking device 140 may use other 3D mapping methods, such as stereoscopic imaging or time-of-flight measurements, based on single or multiple cameras or other types of sensors.
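As a non-limiting illustration of the triangulation described above, the following Python sketch recovers depth from the lateral offset of a projected spot under a pinhole-camera assumption; the function names and the focal-length and baseline values are hypothetical and are not taken from this disclosure, which does not specify the geometry of image sensor 404.

```python
import numpy as np

def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Depth of a speckle feature from its lateral shift between the
    projected pattern and the captured image (structured-light
    triangulation: z = f * b / d)."""
    return focal_length_px * baseline_m / disparity_px

def backproject(u, v, z, fx, fy, cx, cy):
    """Recover the (x, y, z) scene point for pixel (u, v) at depth z,
    assuming a pinhole camera whose optical axis defines the z-axis."""
    return np.array([(u - cx) * z / fx, (v - cy) * z / fy, z])

# Example: a spot shifted 12 px, 800 px focal length, 7.5 cm baseline -> 5 m depth
z = depth_from_disparity(12.0, 800.0, 0.075)
point = backproject(400, 260, z, 800.0, 800.0, 320.0, 240.0)
```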
In some implementations, the hand tracking device 140 captures and processes a time series containing a depth map of the user's hand as the user moves his hand (e.g., the entire hand or one or more fingers). Software running on the image sensor 404 and/or a processor in the controller 110 processes the 3D mapping data to extract image block descriptors of the hand in these depth maps. The software may match these descriptors with image block descriptors stored in database 408 based on previous learning processes in order to estimate the pose of the hand in each frame. The pose typically includes the 3D position of the user's hand joints and finger tips.
The software may also analyze the trajectory of the hand and/or finger over multiple frames in the sequence to identify gestures. The pose estimation functions described herein may alternate with motion tracking functions such that image block-based pose estimation is performed only once every two (or more) frames, while tracking is used to find changes in the pose that occur over the remaining frames. Pose, motion, and gesture information are provided to an application running on the controller 110 via the APIs described above. The program may move and modify images presented on the display generation component 120, for example, in response to pose and/or gesture information, or perform other functions.
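The alternation between descriptor-based pose estimation and lighter-weight tracking described above can be sketched as follows. This is an illustrative outline only: the descriptor database, the two placeholder routines, and the refresh interval of two frames are assumptions standing in for whatever learned models and cadence a particular implementation uses.

```python
from typing import Dict, Iterable, Iterator, Optional, Tuple

Pose = Dict[str, Tuple[float, float, float]]  # e.g. {"index_tip": (x, y, z), ...}

def estimate_pose_from_descriptors(depth_map, database) -> Pose:
    # Placeholder: a real implementation matches patch descriptors of the hand
    # in `depth_map` against `database` (built by a prior learning process)
    # to recover joint and fingertip positions.
    return {"index_tip": (0.0, 0.0, 0.5), "thumb_tip": (0.0, 0.02, 0.5)}

def track_pose(depth_map, previous: Pose) -> Pose:
    # Placeholder: a real implementation updates `previous` from
    # frame-to-frame changes instead of re-running descriptor matching.
    return previous

def run_hand_pipeline(depth_frames: Iterable, database,
                      refresh_interval: int = 2) -> Iterator[Pose]:
    """Run descriptor-based pose estimation every `refresh_interval`
    frames and cheaper tracking on the frames in between."""
    pose: Optional[Pose] = None
    for i, depth_map in enumerate(depth_frames):
        if pose is None or i % refresh_interval == 0:
            pose = estimate_pose_from_descriptors(depth_map, database)
        else:
            pose = track_pose(depth_map, pose)
        yield pose
```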
In some embodiments, the software may be downloaded to the controller 110 in electronic form, over a network, for example, or may alternatively be provided on tangible non-transitory media, such as optical, magnetic, or electronic memory media. In some embodiments, database 408 is also stored in a memory associated with controller 110. Alternatively or in addition, some or all of the described functions of the computer may be implemented in dedicated hardware, such as a custom or semi-custom integrated circuit or a programmable Digital Signal Processor (DSP). Although controller 110 is shown in fig. 4, by way of example, as a separate unit from the image sensor 404, some or all of the processing functions of the controller may be performed by a suitable microprocessor and software, or by dedicated circuitry within the housing of the hand tracking device 140 or otherwise associated with the image sensor 404. In some embodiments, at least some of these processing functions may be performed by a suitable processor integrated with display generation component 120 (e.g., in a television receiver, handheld device, or head mounted device) or with any other suitable computerized device (such as a game console or media player). The sensing functionality of the image sensor 404 may likewise be integrated into a computer or other computerized device to be controlled by the sensor output.
Fig. 4 also includes a schematic diagram of a depth map 410 captured by the image sensor 404, according to some embodiments. As described above, the depth map comprises a matrix of pixels having corresponding depth values. Pixels 412 corresponding to the hand 406 have been segmented from the background and wrist in the map. The brightness of each pixel within the depth map 410 is inversely proportional to its depth value (i.e., the measured z-distance from the image sensor 404), where the gray shade becomes darker with increasing depth. The controller 110 processes these depth values to identify and segment components of the image (i.e., a set of adjacent pixels) that have human hand features. These features may include, for example, overall size, shape, and frame-to-frame motion from a sequence of depth maps.
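A minimal sketch of the depth-map handling described above is shown below, assuming the depth map is held as a NumPy array of metric depth values; the brightness mapping and the depth band used to isolate hand pixels are illustrative and much cruder than the shape- and motion-based segmentation performed by controller 110.

```python
import numpy as np

def depth_to_brightness(depth_map_m, max_depth_m=2.0):
    """Render a depth map so nearer points are brighter, mirroring the
    schematic depth map 410 in which gray shades darken with depth."""
    clipped = np.clip(depth_map_m, 0.0, max_depth_m)
    return (255.0 * (1.0 - clipped / max_depth_m)).astype(np.uint8)

def hand_mask(depth_map_m, near_m=0.2, far_m=0.8):
    """Crude segmentation keeping pixels in a plausible hand depth band;
    a real implementation also uses size, shape, and frame-to-frame
    motion to separate the hand from the background and wrist."""
    return (depth_map_m > near_m) & (depth_map_m < far_m)

frame = np.full((480, 640), 1.5)       # background at 1.5 m
frame[200:280, 300:360] = 0.45         # a hand-sized region at 0.45 m
hand_pixels = hand_mask(frame)         # True only inside that region
```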
Fig. 4 also schematically illustrates the hand skeleton 414 that the controller 110 ultimately extracts from the depth map 410 of the hand 406, according to some embodiments. In fig. 4, the skeleton 414 is superimposed over a hand background 416 that has been segmented from the original depth map. In some embodiments, key feature points of the hand and optionally on the wrist or arm connected to the hand (e.g., points corresponding to knuckles, finger tips, palm centers, ends of the hand connected to the wrist, etc.) are identified and located on the hand skeleton 414. In some embodiments, the controller 110 uses the positions and movements of these key feature points over multiple image frames to determine the gesture performed by the hand or the current state of the hand.
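As one hedged example of how key feature points on the hand skeleton could be turned into a hand state of the kind discussed with reference to figs. 7A-7C below, the following sketch classifies a pinch from the thumb-tip/index-tip separation; the joint names and distance thresholds are assumptions, not values taken from this disclosure.

```python
import math

def classify_pinch(skeleton, touch_threshold_m=0.01, near_threshold_m=0.03):
    """Classify a pinch-related hand state from skeleton key points.
    `skeleton` maps joint names (assumed here) to (x, y, z) positions."""
    gap = math.dist(skeleton["thumb_tip"], skeleton["index_tip"])
    if gap < touch_threshold_m:
        return "pinch"       # thumb touching another finger
    if gap < near_threshold_m:
        return "pre_pinch"   # thumb near, but not touching, another finger
    return "neutral"

state = classify_pinch({"thumb_tip": (0.00, 0.00, 0.50),
                        "index_tip": (0.00, 0.02, 0.50)})   # -> "pre_pinch"
```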
Fig. 5 illustrates an exemplary embodiment of the eye tracking device 130 (fig. 1). In some implementations, the eye tracking device 130 is controlled by the eye tracking unit 245 (fig. 2) to track the positioning and movement of the user gaze relative to the scene 105 or relative to the CGR content displayed via the display generating component 120. In some embodiments, the eye tracking device 130 is integrated with the display generation component 120. For example, in some embodiments, when display generating component 120 is a head-mounted device (such as a headset, helmet, goggles, or glasses) or a handheld device placed in a wearable frame, the head-mounted device includes both components that generate CGR content for viewing by a user and components for tracking the user's gaze with respect to the CGR content. In some embodiments, the eye tracking device 130 is separate from the display generation component 120. For example, when the display generating component is a handheld device or CGR room, the eye tracking device 130 is optionally a device separate from the handheld device or CGR room. In some embodiments, the eye tracking device 130 is a head mounted device or a portion of a head mounted device. In some embodiments, the head-mounted eye tracking device 130 is optionally used in combination with a display generating component that is also head-mounted or a display generating component that is not head-mounted. In some embodiments, the eye tracking device 130 is not a head mounted device and is optionally used in conjunction with a head mounted display generating component. In some embodiments, the eye tracking device 130 is not a head mounted device and optionally is part of a non-head mounted display generating component.
In some embodiments, the display generation component 120 uses a display mechanism (e.g., a left near-eye display panel and a right near-eye display panel) to display frames including left and right images in front of the user's eyes, thereby providing a 3D virtual view to the user. For example, the head mounted display generating component may include left and right optical lenses (referred to herein as eye lenses) located between the display and the user's eyes. In some embodiments, the display generation component may include or be coupled to one or more external cameras that capture video of the user's environment for display. In some embodiments, the head mounted display generating component may have a transparent or translucent display and the virtual object is displayed on the transparent or translucent display through which the user may directly view the physical environment. In some embodiments, the display generation component projects the virtual object into the physical environment. The virtual object may be projected, for example, on a physical surface or as a hologram, such that an individual uses the system to observe the virtual object superimposed over the physical environment. In this case, separate display panels and image frames for the left and right eyes may not be required.
As shown in fig. 5, in some embodiments, the gaze tracking device 130 includes at least one eye tracking camera (e.g., an Infrared (IR) or Near Infrared (NIR) camera) and an illumination source (e.g., an array or ring of IR or NIR light sources, such as LEDs) that emits light (e.g., IR or NIR light) toward the user's eyes. The eye-tracking camera may be directed toward the user's eye to receive IR or NIR light reflected directly from the eye by the light source, or alternatively may be directed toward "hot" mirrors located between the user's eye and the display panel that reflect IR or NIR light from the eye to the eye-tracking camera while allowing visible light to pass through. The gaze tracking device 130 optionally captures images of the user's eyes (e.g., as a video stream captured at 60-120 frames per second (fps)), analyzes the images to generate gaze tracking information, and communicates the gaze tracking information to the controller 110. In some embodiments, both eyes of the user are tracked separately by the respective eye tracking camera and illumination source. In some embodiments, only one eye of the user is tracked by the respective eye tracking camera and illumination source.
In some embodiments, the eye tracking device 130 is calibrated using a device-specific calibration process to determine parameters of the eye tracking device for the particular operating environment 100, such as 3D geometry and parameters of LEDs, cameras, hot mirrors (if present), eye lenses, and display screens. The device-specific calibration procedure may be performed at the factory or another facility prior to delivering the AR/VR equipment to the end user. The device-specific calibration process may be an automatic calibration process or a manual calibration process. According to some embodiments, the user-specific calibration process may include an estimation of eye parameters of a specific user, such as pupil position, foveal position, optical axis, visual axis, eye distance, etc. According to some embodiments, once the device-specific parameters and the user-specific parameters are determined for the eye-tracking device 130, the images captured by the eye-tracking camera may be processed using a glint-assisted method to determine the current visual axis and gaze point of the user relative to the display.
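The device-specific and user-specific calibration parameters enumerated above might be grouped as in the following sketch; the field names and default values are purely illustrative and do not reflect any particular device's calibration data.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass
class DeviceCalibration:
    """Device-specific parameters determined at the factory or another
    facility (illustrative fields only)."""
    camera_intrinsics: Tuple[float, float, float, float] = (800.0, 800.0, 320.0, 240.0)
    led_positions_m: Tuple[Tuple[float, float, float], ...] = ()
    hot_mirror_present: bool = True
    display_to_camera_offset_m: Tuple[float, float, float] = (0.0, 0.0, 0.0)

@dataclass
class UserCalibration:
    """User-specific parameters estimated during enrollment."""
    pupil_offset_mm: float = 0.0
    optical_to_visual_axis_deg: float = 5.0
    interpupillary_distance_m: float = 0.063
```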
As shown in fig. 5, the eye tracking device 130 (e.g., 130A or 130B) includes an eye lens 520 and a gaze tracking system including at least one eye tracking camera 540 (e.g., an Infrared (IR) or Near Infrared (NIR) camera) positioned on a side of the user's face on which eye tracking is performed, and an illumination source 530 (e.g., an IR or NIR light source such as an array or ring of NIR Light Emitting Diodes (LEDs)) that emits light (e.g., IR or NIR light) toward the user's eyes 592. The eye-tracking camera 540 may be directed toward a mirror 550 (which reflects IR or NIR light from the eye 592 while allowing visible light to pass) located between the user's eye 592 and the display 510 (e.g., left or right display panel of a head-mounted display, or display of a handheld device, projector, etc.) (e.g., as shown in the top portion of fig. 5), or alternatively may be directed toward the user's eye 592 to receive reflected IR or NIR light from the eye 592 (e.g., as shown in the bottom portion of fig. 5).
In some implementations, the controller 110 renders AR or VR frames 562 (e.g., left and right frames for left and right display panels) and provides the frames 562 to the display 510. The controller 110 uses the gaze tracking input 542 from the eye tracking camera 540 for various purposes, such as for processing the frames 562 for display. The controller 110 optionally estimates the gaze point of the user on the display 510 based on gaze tracking input 542 acquired from the eye tracking camera 540 using a glint-assisted method or other suitable method. The gaze point estimated from the gaze tracking input 542 is optionally used to determine the direction in which the user is currently looking.
Several possible use cases of the current gaze direction of the user are described below and are not intended to be limiting. As an exemplary use case, the controller 110 may render virtual content differently based on the determined direction of the user's gaze. For example, the controller 110 may generate virtual content in a foveal region determined according to a current gaze direction of the user at a higher resolution than in a peripheral region. As another example, the controller may position or move virtual content in the view based at least in part on the user's current gaze direction. As another example, the controller may display particular virtual content in the view based at least in part on the user's current gaze direction. As another exemplary use case in an AR application, controller 110 may direct an external camera used to capture the physical environment of the CGR experience to focus in the determined direction. The autofocus mechanism of the external camera may then focus on an object or surface in the environment that the user is currently looking at on display 510. As another example use case, the eye lens 520 may be a focusable lens, and the controller uses the gaze tracking information to adjust the focus of the eye lens 520 such that the virtual object the user is currently looking at has the appropriate vergence to match the convergence of the user's eyes 592. The controller 110 may utilize the gaze tracking information to direct the eye lens 520 to adjust the focus such that the approaching object the user is looking at appears at the correct distance.
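The foveated-rendering use case mentioned above can be illustrated with a short sketch that picks a resolution scale from the angle between a display region's direction and the current gaze direction; the cutoff angles and scale factors are assumptions chosen for illustration.

```python
import math

def angular_offset_deg(gaze_dir, region_dir):
    """Angle between the current gaze direction and the direction of a
    display region, both given as normalized 3D vectors."""
    dot = max(-1.0, min(1.0, sum(g * r for g, r in zip(gaze_dir, region_dir))))
    return math.degrees(math.acos(dot))

def resolution_scale(gaze_dir, region_dir, foveal_deg=10.0, mid_deg=25.0):
    """Full resolution in the foveal region, progressively lower
    resolution toward the periphery (cutoff angles are assumptions)."""
    angle = angular_offset_deg(gaze_dir, region_dir)
    if angle <= foveal_deg:
        return 1.0
    return 0.5 if angle <= mid_deg else 0.25

scale = resolution_scale((0.0, 0.0, 1.0), (0.26, 0.0, 0.97))  # -> 0.5 (about 14 degrees off-gaze)
```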
In some embodiments, the eye tracking device is part of a head mounted device that includes a display (e.g., display 510), two eye lenses (e.g., eye lens 520), an eye tracking camera (e.g., eye tracking camera 540), and a light source (e.g., light source 530 (e.g., IR or NIR LED)) mounted in a wearable housing. The light source emits light (e.g., IR or NIR light) toward the user's eye 592. In some embodiments, the light sources may be arranged in a ring or circle around each of the lenses, as shown in fig. 5. In some embodiments, for example, eight light sources 530 (e.g., LEDs) are arranged around each lens 520. However, more or fewer light sources 530 may be used, and other arrangements and locations of light sources 530 may be used.
In some implementations, the display 510 emits light in the visible range and does not emit light in the IR or NIR range, and thus does not introduce noise in the gaze tracking system. Note that the position and angle of the eye tracking camera 540 are given by way of example and are not intended to be limiting. In some implementations, a single eye tracking camera 540 is located on each side of the user's face. In some implementations, two or more NIR cameras 540 may be used on each side of the user's face. In some implementations, a camera 540 with a wider field of view (FOV) and a camera 540 with a narrower FOV may be used on each side of the user's face. In some implementations, a camera 540 operating at one wavelength (e.g., 850 nm) and a camera 540 operating at a different wavelength (e.g., 940 nm) may be used on each side of the user's face.
The embodiment of the gaze tracking system as shown in fig. 5 may be used, for example, in computer-generated reality, virtual reality, and/or mixed reality applications to provide a user with computer-generated reality, virtual reality, augmented reality, and/or augmented virtuality experiences.
Fig. 6A illustrates a glint-assisted gaze tracking pipeline in accordance with some embodiments. In some embodiments, the gaze tracking pipeline is implemented by a glint-assisted gaze tracking system (e.g., an eye tracking device 130 as shown in fig. 1 and 5). The glint-assisted gaze tracking system may maintain a tracking state. Initially, the tracking state is off or "no". When in the tracking state, the glint-assisted gaze tracking system uses previous information from a previous frame when analyzing the current frame to track pupil contours and glints in the current frame. When not in the tracking state, the glint-assisted gaze tracking system attempts to detect pupils and glints in the current frame and, if successful, initializes the tracking state to "yes" and continues with the next frame in the tracking state.
As shown in fig. 6A, the gaze tracking camera may capture left and right images of the left and right eyes of the user. The captured image is then input to the gaze tracking pipeline for processing beginning at 610. As indicated by the arrow returning to element 600, the gaze tracking system may continue to capture images of the user's eyes, for example, at a rate of 60 to 120 frames per second. In some embodiments, each set of captured images may be input to a pipeline for processing. However, in some embodiments or under some conditions, not all captured frames are pipelined.
At 610, for the currently captured image, if the tracking state is yes, the method proceeds to element 640. At 610, if the tracking state is no, the image is analyzed to detect a user's pupil and glints in the image, as indicated at 620. At 630, if the pupil and glints are successfully detected, the method proceeds to element 640. Otherwise, the method returns to element 610 to process the next image of the user's eye.
At 640, if proceeding from element 610, the current frame is analyzed to track pupils and glints based in part on previous information from the previous frame. At 640, if proceeding from element 630, the tracking state is initialized based on the pupils and glints detected in the current frame. The results of the processing at element 640 are checked to verify that the results of the tracking or detection can be trusted. For example, the results may be checked to determine whether the pupil and a sufficient number of glints for performing gaze estimation are successfully tracked or detected in the current frame. At 650, if the results cannot be trusted, the tracking state is set to no and the method returns to element 610 to process the next image of the user's eye. At 650, if the results are trusted, the method proceeds to element 670. At 670, the tracking state is set to yes (if not already yes), and pupil and glint information is passed to element 680 to estimate the gaze point of the user.
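The control flow of fig. 6A (elements 610-680) can be summarized in the following Python sketch. The detection, tracking, validation, and gaze-estimation routines are stubs; they stand in for the image-processing steps of an actual glint-assisted implementation and are not part of this disclosure.

```python
def detect_pupils_and_glints(frame):
    # Stub: detect the pupil and LED glints in a captured eye image.
    return {"pupil": (0, 0), "glints": [(1, 1), (2, 1), (1, 2), (2, 2)]}

def track_pupils_and_glints(frame, previous):
    # Stub: track the pupil and glints using results from the prior frame.
    return previous

def results_trusted(result):
    # Stub: e.g. require enough glints to estimate gaze reliably.
    return result is not None and len(result["glints"]) >= 4

def estimate_gaze_point(result):
    # Stub: map pupil/glint geometry to a gaze point on the display.
    return (0.5, 0.5)

def gaze_pipeline(frames):
    """Tracking-state control flow corresponding to fig. 6A."""
    tracking, previous = False, None
    for frame in frames:                                        # element 610
        if tracking:
            result = track_pupils_and_glints(frame, previous)   # element 640
        else:
            result = detect_pupils_and_glints(frame)            # element 620
            if result is None:                                  # element 630
                continue
        if not results_trusted(result):                         # element 650
            tracking, previous = False, None
            continue
        tracking, previous = True, result                       # element 670
        yield estimate_gaze_point(result)                       # element 680
```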
Fig. 6A is intended to serve as one example of an eye tracking technique that may be used in a particular implementation. As will be appreciated by one of ordinary skill in the art, other eye tracking techniques, currently existing or developed in the future, may be used in place of or in combination with the glint-assisted eye tracking techniques described herein in computer system 101 for providing a CGR experience to a user, according to various embodiments.
FIG. 6B illustrates an exemplary environment for electronic devices 101a and 101B providing a CGR experience, according to some embodiments. In fig. 6B, the real world environment 602 includes electronic devices 101a and 101B, users 608a and 608B, and real world objects (e.g., a table 604). As shown in fig. 6B, electronic devices 101a and 101B are optionally mounted on a tripod or otherwise secured in real world environment 602 such that one or more hands of users 608a and 608B are free (e.g., users 608a and 608B optionally do not hold devices 101a and 101B with one or more hands). As described above, devices 101a and 101b optionally have one or more sensor groups positioned on different sides of devices 101a and 101b, respectively. For example, devices 101a and 101b optionally include sensor groups 612-1a and 612-1b and sensor groups 612-2a and 612-2b located on "back" and "front" sides of devices 101a and 101b, respectively (e.g., they are capable of capturing information from respective sides of devices 101a and 101 b). As used herein, the front side of device 101a is the side facing users 608a and 608b, and the back side of devices 101a and 101b is the side facing away from users 608a and 608 b.
In some embodiments, the sensor sets 612-2a and 612-2b include eye tracking units (e.g., eye tracking unit 245 described above with reference to fig. 2) that include one or more sensors for tracking the user's eyes and/or gaze such that the eye tracking units are able to "look" at the users 608a and 608b and track the eyes of the users 608a and 608b in the manner previously described. In some embodiments, the eye tracking units of devices 101a and 101b are capable of capturing movements, orientations, and/or fixations of the eyes of users 608a and 608b and treating these movements, orientations, and/or fixations as inputs.
In some embodiments, the sensor groups 612-1a and 612-1B include hand tracking units (e.g., hand tracking unit 243 described above with reference to fig. 2) that are capable of tracking one or more hands of the users 608a and 608B that remain on the "back" sides of the devices 101a and 101B, as shown in fig. 6B. In some embodiments, a hand tracking unit is optionally included in the sensor sets 612-2a and 612-2b, such that the users 608a and 608b can additionally or alternatively hold one or more hands on the "front" side of the devices 101a and 101b as the devices 101a and 101b track the positioning of the one or more hands. As described above, the hand tracking units of devices 101a and 101b are capable of capturing movements, locations, and/or gestures of one or more hands of users 608a and 608b and treating those movements, locations, and/or gestures as inputs.
In some embodiments, the sensor sets 612-1a and 612-1b optionally include one or more sensors configured to capture images of the real world environment 602 including the table 604 (e.g., such as the image sensor 404 described above with reference to fig. 4). As described above, devices 101a and 101b are capable of capturing images of portions (e.g., some or all) of real-world environment 602 and presenting the captured portions of real-world environment 602 to a user via one or more display generating components of devices 101a and 101b (e.g., displays of devices 101a and 101b, optionally located on a user-facing side of devices 101a and 101b opposite a side of devices 101a and 101b facing the captured portions of real-world environment 602).
In some implementations, the captured portion of the real-world environment 602 is used to provide a CGR experience to the user, such as a mixed reality environment with one or more virtual objects superimposed over a representation of the real-world environment 602.
Thus, the description herein describes some embodiments of a three-dimensional environment (e.g., a CGR environment) that includes a representation of a real-world object and a representation of a virtual object. For example, the three-dimensional environment optionally includes a representation of a table present in the physical environment that is captured and displayed in the three-dimensional environment (e.g., actively via a camera and display of the electronic device or passively via a transparent or translucent display of the electronic device). As previously described, the three-dimensional environment is optionally a mixed reality system, wherein the three-dimensional environment is based on a physical environment captured by one or more sensors of the device and displayed via a display generating component. As a mixed reality system, the device is optionally capable of selectively displaying portions and/or objects of the physical environment such that the respective portions and/or objects of the physical environment appear as if they were present in the three-dimensional environment displayed by the electronic device. Similarly, the device is optionally capable of displaying the virtual object in the three-dimensional environment to appear as if the virtual object is present in the real world (e.g., physical environment) by placing the virtual object in the three-dimensional environment at a respective location having a corresponding location in the real world. For example, the device optionally displays a vase so that the vase appears as if the real vase were placed on top of a desk in a physical environment. In some embodiments, each location in the three-dimensional environment has a corresponding location in the physical environment. Thus, when the device is described as displaying a virtual object at a corresponding location relative to a physical object (e.g., such as a location at or near a user's hand or a location at or near a physical table), the device displays the virtual object at a particular location in the three-dimensional environment such that it appears as if the virtual object were at or near a physical object in the physical world (e.g., the virtual object is displayed in the three-dimensional environment at a location corresponding to the location in the physical environment where the virtual object would be displayed if the virtual object were a real object at the particular location).
In some implementations, real world objects present in a physical environment that are displayed in a three-dimensional environment may interact with virtual objects that are present only in the three-dimensional environment. For example, a three-dimensional environment may include a table and a vase placed on top of the table, where the table is a view (or representation) of a physical table in a physical environment, and the vase is a virtual object.
Similarly, a user is optionally able to interact with a virtual object in a three-dimensional environment using one or more hands as if the virtual object were a real object in a physical environment. For example, as described above, the one or more sensors of the device optionally capture one or more hands of the user and display a representation of the user's hands in the three-dimensional environment (e.g., in a manner similar to displaying real world objects in the three-dimensional environment described above), or, in some embodiments, the user's hands may be visible via the display generating component by virtue of the ability to see the physical environment through the user interface due to the transparency/translucency of the display generating component that is displaying the user interface, or due to projection of the user interface onto a transparent/translucent surface or projection of the user interface onto the user's eye or into the field of view of the user's eye. Thus, in some embodiments, the user's hands are displayed at respective locations in the three-dimensional environment and are treated as if they were objects in the three-dimensional environment that are capable of interacting with virtual objects in the three-dimensional environment as if they were real physical objects in the physical environment. In some embodiments, the user is able to move his or her hand such that the representation of the hand in the three-dimensional environment moves in conjunction with the movement of the user's hand.
In some of the embodiments described below, the device is optionally capable of determining a "valid" distance between a physical object in the physical world and a virtual object in the three-dimensional environment, e.g., for determining whether the physical object is interacting with the virtual object (e.g., whether a hand is touching, grabbing, holding, etc., the virtual object or is within a threshold distance from the virtual object). For example, the device determines a distance between the user's hand and the virtual object when determining whether the user is interacting with the virtual object and/or how the user is interacting with the virtual object. In some embodiments, the device determines the distance between the user's hand and the virtual object by determining the distance between the position of the hand in the three-dimensional environment and the position of the virtual object of interest in the three-dimensional environment. For example, when the one or more hands of the user are located at a particular location in the physical world, the device optionally captures the one or more hands and displays them at a particular corresponding location in the three-dimensional environment (e.g., a location where the hand would be displayed in the three-dimensional environment if the hand were a virtual hand rather than a physical hand). The positioning of the hand in the three-dimensional environment is optionally compared with the positioning of the virtual object of interest in the three-dimensional environment to determine the distance between the one or more hands of the user and the virtual object. In some implementations, the device optionally determines the distance between the physical object and the virtual object by comparing locations in the physical world (e.g., rather than comparing locations in a three-dimensional environment). For example, when determining the distance between one or more hands of the user and the virtual object, the device optionally determines the corresponding location of the virtual object in the physical world (e.g., the location in the physical world where the virtual object would be if the virtual object were a physical object instead of a virtual object), and then determines the distance between the corresponding physical location and the one or more hands of the user. In some implementations, the same technique is optionally used to determine the distance between any physical object and any virtual object. Thus, as described herein, when determining whether a physical object is in contact with a virtual object or whether the physical object is within a threshold distance of the virtual object, the device optionally performs any of the techniques described above to map the location of the physical object to a three-dimensional environment and/or map the location of the virtual object to the physical world.
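A hedged sketch of the "valid" distance determination described above is given below: the physical position of the hand is mapped into the three-dimensional environment (here with a placeholder fixed offset rather than a real tracked transform) and compared against the virtual object's position and a threshold.

```python
import math

def to_environment(physical_point_m, offset_m=(0.0, 0.0, 0.0)):
    """Map a physical-world position into the three-dimensional
    environment; a real system applies its tracked device pose, while a
    fixed offset is used here purely for illustration."""
    return tuple(p + o for p, o in zip(physical_point_m, offset_m))

def is_interacting(hand_physical_m, object_env_m, threshold_m=0.03):
    """Compare the hand's corresponding environment location with the
    virtual object's location against an interaction threshold."""
    distance = math.dist(to_environment(hand_physical_m), object_env_m)
    return distance <= threshold_m

touching = is_interacting((0.10, 0.00, 0.40), (0.11, 0.00, 0.41))  # True (about 1.4 cm apart)
```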
In some implementations, the same or similar techniques are used to determine where and at what the user's gaze is directed, and/or where and at what a physical stylus held by the user is pointing. For example, if the user's gaze is directed to a particular location in the physical environment, the device optionally determines a corresponding location in the three-dimensional environment, and if the virtual object is located at the corresponding virtual location, the device optionally determines that the user's gaze is directed to the virtual object. Similarly, the device is optionally able to determine the direction in the physical world in which the physical stylus is pointing based on its orientation. In some embodiments, based on the determination, the device determines a corresponding virtual location in the three-dimensional environment corresponding to the location in the physical world at which the stylus is pointing, and optionally determines that the stylus is pointing at the corresponding virtual location in the three-dimensional environment.
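Similarly, determining what a gaze or stylus ray is directed to could be sketched as a ray test against proxies of the virtual objects, as below; the spherical proxies and the normalized-direction assumption are illustrative simplifications.

```python
import math

def ray_hits_sphere(origin, direction, center, radius):
    """Test whether a gaze or stylus ray (unit-length `direction`)
    intersects a spherical proxy of a virtual object."""
    to_center = [c - o for c, o in zip(center, origin)]
    along = sum(t * d for t, d in zip(to_center, direction))   # closest approach along the ray
    if along < 0.0:
        return False                                           # object is behind the origin
    closest = [o + along * d for o, d in zip(origin, direction)]
    return math.dist(closest, center) <= radius

def target_of_ray(origin, direction, objects):
    """Return the name of the first virtual object whose proxy the ray hits."""
    for name, (center, radius) in objects.items():
        if ray_hits_sphere(origin, direction, center, radius):
            return name
    return None

hit = target_of_ray((0, 0, 0), (0, 0, 1), {"vase": ((0.02, 0.0, 1.0), 0.1)})  # -> "vase"
```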
Similarly, embodiments described herein may refer to a location of a user (e.g., a user of a device) in a three-dimensional environment and/or a location of a device in a three-dimensional environment. In some embodiments, a user of the device is holding, wearing, or otherwise located at or near the electronic device. Thus, in some embodiments, the location of the device serves as a proxy for the location of the user. In some embodiments, the location of the device and/or user in the physical environment corresponds to a corresponding location in the three-dimensional environment. In some implementations, the respective location is a location from which a "camera" or "view" of the three-dimensional environment extends. For example, the location of the device will be the location in the physical environment (and its corresponding location in the three-dimensional environment) from which the user would see those objects in the physical environment that are in the same position, orientation and/or size (e.g., in absolute sense and/or relative to each other) as when the objects were displayed by the display generating component of the device if the user were standing at that location facing the corresponding portion of the physical environment displayed by the display generating component. Similarly, if the virtual objects displayed in the three-dimensional environment are physical objects in the physical environment (e.g., physical objects placed in the physical environment at the same locations in the three-dimensional environment as those virtual objects, and physical objects in the physical environment having the same size and orientation as in the three-dimensional environment), then the location of the device and/or user is the location of those virtual objects in the physical environment that would be seen by the user to be at the same location, orientation, and/or size (e.g., in absolute terms and/or relative to each other and real world objects) as when the virtual objects were displayed by the display generating component of the device.
In this disclosure, various input methods are described with respect to interactions with a computer system. When one input device or input method is used to provide an example and another input device or input method is used to provide another example, it should be understood that each example may be compatible with and optionally utilize the input device or input method described with respect to the other example. Similarly, various output methods are described with respect to interactions with a computer system. When one output device or output method is used to provide an example and another output device or output method is used to provide another example, it should be understood that each example may be compatible with and optionally utilize the output device or output method described with respect to the other example. Similarly, the various methods are described with respect to interactions with a virtual environment or mixed reality environment through a computer system. When examples are provided using interactions with a virtual environment, and another example is provided using a mixed reality environment, it should be understood that each example may be compatible with and optionally utilize the methods described with respect to the other example. Thus, the present disclosure discloses embodiments that are combinations of features of multiple examples, without the need to list all features of the embodiments in detail in the description of each example embodiment.
Furthermore, in a method described herein in which one or more steps are dependent on one or more conditions having been met, it should be understood that the method may be repeated in multiple iterations such that, over the course of the iterations, all conditions upon which steps in the method depend have been met in different iterations of the method. For example, if a method requires performing a first step if a condition is met and performing a second step if the condition is not met, one of ordinary skill will appreciate that the stated steps are repeated until the condition has been both met and not met, in no particular order. Thus, a method described as having one or more steps depending on one or more conditions having been met may be rewritten as a method that repeats until each of the conditions described in the method has been met. This, however, is not required of a system or computer-readable medium claim in which the system or computer-readable medium contains instructions for performing the contingent operations based on the satisfaction of the corresponding condition or conditions and thus is capable of determining whether the contingency has or has not been met without explicitly repeating the steps of the method until all conditions upon which steps in the method depend have been met. It will also be appreciated by those of ordinary skill in the art that, similar to a method with contingent steps, a system or computer-readable storage medium may repeat the steps of the method as many times as needed to ensure that all of the contingent steps have been performed.
User interface and associated process
Attention is now directed to embodiments of a user interface ("UI") and associated processes that may be implemented on a computer system (such as a portable multifunction device or a head-mounted device) having a display generating component, one or more input devices, and (optionally) one or more cameras.
Fig. 7A-7C illustrate exemplary ways in which an electronic device 101a or 101b may or may not perform an operation in response to a user input depending on whether a ready state of the user was detected prior to the user input, in accordance with some embodiments.
Fig. 7A shows that the electronic devices 101a and 101b display a three-dimensional environment via the display generating components 120a and 120b. It should be appreciated that in some embodiments, the electronic devices 101a and/or 101b utilize one or more of the techniques described with reference to fig. 7A-7C in a two-dimensional environment or user interface without departing from the scope of the present disclosure. As described above with reference to fig. 1-6, the electronic devices 101a and 101b optionally include display generating components 120a and 120b (e.g., a touch screen) and a plurality of image sensors 314a and 314b. The image sensor optionally includes one or more of the following: a visible light camera; an infrared camera; a depth sensor; or any other sensor that the electronic device 101a and/or 101b can use to capture one or more images of the user or a portion of the user when the user interacts with the electronic device 101a and/or 101b. In some embodiments, display generating components 120a and 120b are touch screens capable of detecting gestures and movements of a user's hand. In some embodiments, the user interfaces described below may also be implemented on a head-mounted display that includes a display generating component that displays the user interface to the user, as well as sensors that detect the physical environment and/or movement of the user's hand (e.g., external sensors facing outward from the user) and/or sensors that detect the user's gaze (e.g., internal sensors facing inward toward the user).
Fig. 7A shows two electronic devices 101a and 101B displaying a three-dimensional environment that includes a representation 704 of a table (e.g., such as table 604 in fig. 6B) in the physical environment of the electronic devices 101a and 101B, selectable options 707, and scrollable user interface element 705. The electronic devices 101a and 101b render the three-dimensional environment from different viewpoints in the three-dimensional environment because the electronic devices are associated with different user viewpoints in the three-dimensional environment. In some embodiments, the representation 704 of the table is a photorealistic representation (e.g., digital passthrough) displayed by the display generating component 120a and/or 120 b. In some embodiments, the representation 704 of the table is a view (e.g., physical transmission) of the table through the transparent portion of the display generating component 120a and/or 120 b. In fig. 7A, the gaze 701a of the user of the first electronic device 101a is directed to the scrollable user interface element 705, and the scrollable user interface element 705 is within the attention area 703 of the user of the first electronic device 101 a. In some embodiments, the attention area 703 is similar to the attention area described in more detail below with reference to fig. 9A-10H.
In some implementations, the first electronic device 101a displays objects in the three-dimensional environment that are not in the attention area 703 (e.g., a representation of the table 704 and/or the options 707) in a blurred and/or dimmed appearance (e.g., a weakened appearance). In some embodiments, the second electronic device 101b obscures and/or fades (e.g., weakens) portions of the three-dimensional environment based on an attention area of the user of the second electronic device 101b, which is optionally different from the attention area of the user of the first electronic device 101 a. Thus, in some embodiments, the attention area and blurring of objects outside the attention area is not synchronized between the electronic devices 101a and 101 b. In contrast, in some embodiments, the attention areas associated with electronic devices 101a and 101b are independent of each other.
In fig. 7A, the hand 709 of the user of the first electronic device 101a is in an inactive hand state (e.g., hand state a). For example, hand 709 assumes a hand shape that does not correspond to a ready state or input, as described in more detail below. Because hand 709 is in an inactive hand state, first electronic device 101a displays scrollable user interface element 705 without indicating that the input will be pointing or is pointing to scrollable user interface element 705. Similarly, the electronic device 101b also displays the scrollable user interface element 705 without indicating that the input will be pointing or is pointing to the scrollable user interface element 705.
In some embodiments, the electronic device 101a displays an indication of the user's gaze 701a on the user interface element 705 when the user's hand 709 is in an inactive state. For example, the electronic device 101a optionally changes the color, size, and/or positioning of the scrollable user interface element 705 in a different manner than the electronic device 101a updates the scrollable user interface element 705 in response to detecting a ready state of a user, as will be described below. In some implementations, the electronic device 101a indicates that the user's gaze 701a is on the user interface element 705 by displaying a visual indication that is separate from updating the appearance of the scrollable user interface element 705. In some embodiments, the second electronic device 101b foregoes displaying an indication of the gaze of the user of the first electronic device 101 a. In some embodiments, the second electronic device 101b displays an indication to indicate a location of the gaze of the user of the second electronic device 101 b.
In fig. 7B, the first electronic device 101a detects a ready state of the user when the user's gaze 701B is directed to the scrollable user interface element 705. In some implementations, the ready state of the user is detected in response to detecting that the user's hand 709 is in a direct ready state hand state (e.g., hand state D). In some implementations, the ready state of the user is detected in response to detecting that the user's hand 711 is in an indirect ready state hand state (e.g., hand state B).
In some implementations, the user's hand 709 of the first electronic device 101a is in a direct ready state when the hand 709 is within a predetermined threshold distance (e.g., 0.5, 1, 2, 3, 4, 5, 10, 15, 20, 30, etc. centimeters) of the scrollable user interface element 705, the scrollable user interface element 705 is within the user's attention area 703, and/or the hand 709 is in a pointing hand shape (e.g., a hand shape in which one or more fingers curl toward the palm and one or more fingers protrude toward the scrollable user interface element 705). In some implementations, the scrollable user interface element 705 need not be in the attention area 703 in order to meet the ready state criteria for direct input. In some implementations, the user's gaze 701b does not have to be directed to the scrollable user interface element 705 in order to meet the ready state criteria for direct input.
In some implementations, the hand 711 is in an indirect ready state when the user's hand 711 is farther from the scrollable user interface element 705 than a predetermined threshold distance (e.g., 0.5, 1, 2, 3, 4, 5, 10, 15, 20, 30, etc. centimeters), the user's gaze 701b is directed toward the scrollable user interface element 705, and the hand 711 is in a pre-pinch hand shape (e.g., a hand shape in which the thumb is within a threshold distance (e.g., 0.1, 0.5, 1, 2, 3, etc. centimeters) of another finger on the hand without touching that finger). In some implementations, the ready state criteria for indirect input are met when the scrollable user interface element 705 is within the user's attention area 703 even though the gaze 701b is not directed to the user interface element 705. In some embodiments, the electronic device 101a disambiguates the location of the user's gaze 701b, as described below with reference to fig. 11A-12F.
In some implementations, the hand shape that meets the criteria for a direct ready state (e.g., with hand 709) is the same as the hand shape that meets the criteria for an indirect ready state (e.g., with hand 711). For example, both the pointing hand shape and the pre-pinch hand shape meet criteria for direct and indirect ready states. In some implementations, the hand shape that meets the criteria for a direct ready state (e.g., hand 709) is different from the hand shape that meets the criteria for an indirect ready state (e.g., hand 711). For example, the direct ready state needs to point to the hand shape, but the indirect ready state needs to pre-pinch the hand shape.
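Combining the example criteria above, a per-hand ready-state check might be structured as in the following sketch; the hand-shape labels, the single 15-centimeter threshold, and the treatment of gaze versus attention area are assumptions drawn from the examples rather than requirements.

```python
DIRECT_READY_THRESHOLD_M = 0.15   # one of the example thresholds above (15 cm)

def ready_state(hand_shape, hand_to_element_m, gaze_on_element, element_in_attention_area):
    """Return 'direct', 'indirect', or None for a single hand, following
    the example criteria above (a sketch; actual criteria may differ)."""
    if hand_shape == "point" and hand_to_element_m <= DIRECT_READY_THRESHOLD_M:
        return "direct"
    if (hand_shape == "pre_pinch"
            and hand_to_element_m > DIRECT_READY_THRESHOLD_M
            and (gaze_on_element or element_in_attention_area)):
        return "indirect"
    return None

# A pointing hand 5 cm from the scrollable element -> direct ready state
assert ready_state("point", 0.05, gaze_on_element=False, element_in_attention_area=True) == "direct"
```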
In some embodiments, the electronic device 101a (and/or 101b) is in communication with one or more input devices, such as a stylus or touchpad. In some implementations, the criteria for entering the ready state with one of these input devices are different from the criteria for entering the ready state without one of these input devices. For example, the ready state criteria for these input devices do not require the hand shapes described above for the direct and indirect ready states that apply when no stylus or touchpad is used. For example, the ready state criteria when the user is providing input to the device 101a and/or 101b using a stylus require the user to be holding the stylus, and the ready state criteria when the user is providing input to the device 101a and/or 101b using a touchpad require the user's hand to rest on the touchpad.
In some embodiments, each hand (e.g., left and right) of the user has an independent associated ready state (e.g., each hand must independently meet its ready state criteria before the device 101a and/or 101b will respond to the input provided by each respective hand). In some embodiments, the criteria for the ready state of each hand are different from each other (e.g., each hand requires a different hand shape, allowing for an indirect or direct ready state of only one or both hands). In some embodiments, the visual indication of the ready state of each hand is different. For example, if the color of scrollable user interface element 705 changes to indicate that devices 101a and/or 101b detected a ready state, the color of scrollable user interface element 705 may be a first color (e.g., blue) for the right-handed ready state and a second color (e.g., green) for the left-handed ready state.
In some implementations, in response to detecting the ready state of the user, the electronic device 101a becomes ready to detect input provided by the user (e.g., by the user's hand) and update the display of the scrollable user interface element 705 to indicate that further input is to be directed to the scrollable user interface element 705. For example, as shown in fig. 7B, the scrollable user interface element 705 is updated at the electronic device 101a by increasing the thickness of the line around the boundary of the scrollable user interface element 705. In some implementations, the electronic device 101a updates the appearance of the scrollable user interface element 705 in a different or additional manner, such as by changing the color of the background of the scrollable user interface element 705, displaying highlights around the scrollable user interface element 705, updating the size of the scrollable user interface element 705, updating the positioning of the scrollable user interface element 705 in a three-dimensional environment (e.g., displaying the scrollable user interface element 705 closer to the user's point of view in the three-dimensional environment), and so forth. In some implementations, the second electronic device 101b does not update the appearance of the scrollable user interface element 705 to indicate the ready state of the user of the first electronic device 101 a.
In some implementations, the manner in which the electronic device 101a updates the scrollable user interface element 705 in response to detecting the ready state is the same regardless of whether the ready state is a direct ready state (e.g., with the hand 709) or an indirect ready state (e.g., with the hand 711). In some implementations, the manner in which the electronic device 101a updates the scrollable user interface element 705 in response to detecting the ready state is different depending on whether the ready state is a direct ready state (e.g., with the hand 709) or an indirect ready state (e.g., with the hand 711). For example, if the electronic device 101a updates the color of the scrollable user interface element 705 in response to detecting the ready state, the electronic device 101a uses a first color (e.g., blue) in response to a direct ready state (e.g., with the hand 709) and uses a second color (e.g., green) in response to an indirect ready state (e.g., with the hand 711).
In some implementations, after detecting the ready state for the scrollable user interface element 705, the electronic device 101a updates the target of the ready state based on the indication of the user's focus. For example, the electronic device 101a directs an indirect ready state (e.g., with the hand 711) to the selectable option 707 (e.g., and removes the ready state from the scrollable user interface element 705) in response to detecting that the location of the gaze 701b moves from the scrollable user interface element 705 to the selectable option 707. As another example, the electronic device 101a directs a direct ready state (e.g., with the hand 709) to the selectable option 707 (e.g., and removes the ready state from the scrollable user interface element 705) in response to detecting that the hand 709 moves within a threshold distance (e.g., 0.5, 1, 2, 3, 4, 5, 10, 15, 30, etc. centimeters) of the scrollable user interface element 705 to within a threshold distance of the selectable option 707.
In fig. 7B, while the user's hand 715 is in an inactive state (e.g., hand state A), the device 101b detects that the user of the second electronic device 101b is directing his gaze 701c at the selectable option 707. Because the electronic device 101b does not detect a ready state of the user, the electronic device 101b forgoes updating the selectable option 707 to indicate the user's ready state. In some embodiments, as described above, the electronic device 101b indicates that the user's gaze 701c is directed to the selectable option 707 by updating the appearance of the selectable option 707 in a manner different from the manner in which the electronic device 101b updates a user interface element to indicate the ready state.
In some embodiments, the electronic devices 101a and 101b perform operations in response to an input only when a ready state was detected before the input was detected. Fig. 7C shows that a user of electronic devices 101a and 101b provides input to electronic devices 101a and 101b, respectively. In fig. 7B, the first electronic device 101a detects a ready state of the user, while the second electronic device 101b does not detect a ready state, as previously described. Thus, in fig. 7C, the first electronic device 101a performs an operation in response to detecting a user input, while the second electronic device 101b forgoes performing an operation in response to detecting a user input.
Specifically, in fig. 7C, the first electronic device 101a detects a scroll input directed to the scrollable user interface element 705. Fig. 7C illustrates a direct scroll input provided by hand 709 and/or an indirect scroll input provided by hand 711. Direct scroll input includes the hand 709 touching, or coming within a direct input threshold (e.g., 0.05, 0.1, 0.2, 0.3, 0.5, 1, etc. centimeters) of, the scrollable user interface element 705 while the hand 709 is in a pointing hand shape (e.g., hand state E), and movement of the hand 709 in a direction in which the scrollable user interface element 705 is scrollable (e.g., vertical motion or horizontal motion). Indirect scroll input includes detecting that the hand 711 is farther than a direct input ready state threshold (e.g., 0.5, 1, 2, 3, 4, 5, 10, 15, 30, etc. centimeters) and/or farther than a direct input threshold (e.g., 0.05, 0.1, 0.2, 0.3, 0.5, 1, etc. centimeters) from the scrollable user interface element 705 while the user's gaze 701b is detected on the scrollable user interface element 705, detecting that the hand 711 is in a pinch hand shape (e.g., a hand shape in which the thumb touches another finger on the hand 711, hand state C), and movement of the hand 711 in a direction in which the scrollable user interface element 705 is scrollable (e.g., vertical motion or horizontal motion).
In some implementations, the electronic device 101a requires that the scrollable user interface element 705 be within the user's attention area 703 in order to detect a scroll input. In some embodiments, the electronic device 101a does not require that the scrollable user interface element 705 be within the user's attention area 703 in order to detect a scroll input. In some implementations, the electronic device 101a requires that the user's gaze 701b be directed toward the scrollable user interface element 705 in order to detect a scroll input. In some implementations, the electronic device 101a does not require that the user's gaze 701b be directed toward the scrollable user interface element 705 in order to detect a scroll input. In some implementations, the electronic device 101a requires that the user's gaze 701b be directed to the scrollable user interface element 705 for indirect scroll input but not for direct scroll input.
In response to detecting the scroll input, the first electronic device 101a scrolls the content in the scrollable user interface element 705 in accordance with the movement of the hand 709 or hand 711, as shown in fig. 7C. In some implementations, the first electronic device 101a transmits an indication of the scrolling (e.g., via a server) to the second electronic device 101b, and in response, the second electronic device 101b scrolls the scrollable user interface element 705 in the same manner as the first electronic device 101a scrolls the scrollable user interface element 705. For example, because the scrollable user interface element 705 in the three-dimensional environment has now been scrolled, the electronic devices that display a viewpoint of the three-dimensional environment that includes the scrollable user interface element 705 (e.g., including electronic devices other than the one that detected the input for scrolling the scrollable user interface element 705) reflect the scrolled state of the user interface element. In some embodiments, if the ready state of the user shown in fig. 7B had not been detected before the input shown in fig. 7C was detected, the electronic devices 101a and 101b would forgo scrolling the scrollable user interface element 705 in response to the input shown in fig. 7C.
Thus, in some embodiments, the results of the user input are synchronized between the first electronic device 101a and the second electronic device 101 b. For example, if the second electronic device 101b were to detect a selection of the selectable option 707, both the first electronic device 101a and the second electronic device 101b would update the appearance (e.g., color, style, size, positioning, etc.) of the selectable option 707 upon detection of a selection input and perform an operation in accordance with the selection.
Thus, because the electronic device 101a detects the ready state of the user in fig. 7B before detecting the input in fig. 7C, the electronic device 101a scrolls the scrollable user interface element 705 in response to the input. In some embodiments, the electronic devices 101a and 101b forgo performing actions in response to input detected without first detecting a ready state.
For example, in fig. 7C, the user of the second electronic device 101b provides an indirect selection input directed to selectable option 707 using hand 715. In some implementations, detecting the selection input includes detecting that a pinch gesture (e.g., hand state C) is made by the user's hand 715 while the user's gaze 701C is directed to the selectable option 707. Because the second electronic device 101B did not detect a ready state (e.g., in fig. 7B) before detecting the input in fig. 7C, the second electronic device 101B forgoes selecting option 707 and forgoes performing an action in accordance with the selection of option 707. In some embodiments, although the second electronic device 101b detects the same input (e.g., an indirect input) as the first electronic device 101a in fig. 7C, the second electronic device 101b does not perform an operation in response to the input because the ready state is not detected before the input is detected. In some embodiments, if the second electronic device 101b has detected a direct input without first detecting a ready state, the second electronic device 101b will also forgo performing an action in response to the direct input because the ready state was not detected before the input was detected.
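The gating just described, in which an input produces an operation only on the device that observed a ready state beforehand, can be sketched as follows in Swift; the Device type and the idea that the ready state is consumed by the input are assumptions made for the example.

```swift
// Minimal sketch of ready-state gating: input only produces an operation when a ready
// state was observed first. Types and behavior details are hypothetical.
struct Device {
    let name: String
    var readyStateDetected = false

    mutating func observeReadyState() { readyStateDetected = true }

    mutating func handle(inputDescription: String, operation: () -> Void) {
        guard readyStateDetected else {
            print("\(name): ignoring \(inputDescription) (no prior ready state)")
            return
        }
        operation()
        readyStateDetected = false   // assumption: the ready state is consumed by the input
    }
}

var deviceA = Device(name: "101a")
var deviceB = Device(name: "101b")
deviceA.observeReadyState()                       // fig. 7B: only device 101a detects a ready state
deviceA.handle(inputDescription: "scroll") { print("101a scrolls element 705") }
deviceB.handle(inputDescription: "pinch select") { print("101b selects option 707") }   // not executed
```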
Fig. 8A-8K are flowcharts illustrating a method 800 of performing or not performing an operation in response to a user input according to whether a ready state of the user was detected before the user input, according to some embodiments. In some embodiments, the method 800 is performed at a computer system (e.g., computer system 101 in fig. 1, such as a tablet, smart phone, wearable computer, or head-mounted device) that includes a display generating component (e.g., display generating component 120 in fig. 1, 3, and 4) (e.g., heads-up display, touch screen, projector, etc.) and one or more cameras (e.g., a camera pointing downward toward the user's hand (e.g., color sensors, infrared sensors, and other depth-sensing cameras) or a camera pointing forward from the user's head). In some embodiments, method 800 is governed by instructions stored in a non-transitory computer-readable storage medium and executed by one or more processors of a computer system, such as the one or more processors 202 of computer system 101 (e.g., control unit 110 in fig. 1A). Some of the operations in method 800 are optionally combined and/or the order of some of the operations is optionally changed.
In some embodiments, the method 800 is performed at an electronic device 101a or 101b (e.g., a mobile device (e.g., a tablet, a smartphone, a media player, or a wearable device)) or a computer in communication with a display generation component and one or more input devices. Examples of input devices include a touch screen, a mouse (e.g., external), a touch pad (optionally integrated or external), a remote control device (e.g., external), another mobile device (e.g., separate from the electronic device), a handheld device (e.g., external), a controller (e.g., external), a camera, a depth sensor, an eye tracking device, and/or a motion sensor (e.g., a hand tracking device, a hand motion sensor), and the like. In some embodiments, the electronic device is coupled to a hand tracking device (e.g., one or more cameras, depth sensors, proximity sensors, and/or touch sensors (e.g., touch screen, touch pad)). In some embodiments, the hand tracking device is a wearable device, such as a smart glove. In some embodiments, the hand tracking device is a handheld input device, such as a remote control or a stylus.
In some embodiments, such as in fig. 7A, the electronic device 101a displays (802 a) a user interface including user interface elements (e.g., 705) via a display generation component. In some implementations, the user interface element is an interactive user interface element, and in response to detecting input directed to the user interface element, the electronic device performs an action associated with the user interface element. For example, the user interface element is a selectable option that, when selected, causes the electronic device to perform an action, such as displaying a corresponding user interface, changing a setting of the electronic device, or initiating playback of content. As another example, the user interface element is a container (e.g., window) in which the user interface/content is displayed, and in response to detecting a selection of the user interface element and a subsequent movement input, the electronic device updates the positioning of the user interface element in accordance with the movement input. In some embodiments, the user interface and/or user interface elements are displayed in (e.g., the user interface is a three-dimensional environment and/or is displayed within) a three-dimensional environment (e.g., a computer-generated reality (CGR) environment, such as a Virtual Reality (VR) environment, a Mixed Reality (MR) environment, or an Augmented Reality (AR) environment, etc.) that is generated by, displayed by, or otherwise made viewable by the device.
In some embodiments, such as in fig. 7C, upon display of a user interface element (e.g., 705), the electronic device 101a detects (802 b) input from a predefined portion (e.g., 709) (e.g., hand, arm, head, eye, etc.) of a user of the electronic device 101a via the one or more input devices. In some implementations, detecting the input includes detecting, via the hand tracking device, that the user is performing a predetermined gesture with his hand, optionally while his gaze is directed at the user interface element. In some embodiments, the predetermined gesture is a pinch gesture while looking at the user interface element, the pinch gesture comprising touching the thumb to another finger (e.g., index finger, middle finger, ring finger, little finger) on the same hand as the thumb. In some implementations, the input is a direct or indirect interaction with a user interface element, such as described with reference to methods 1000, 1200, 1400, 1600, 1800, and/or 2000.
In some embodiments, in response to detecting an input from a predefined portion of a user of the electronic device (802 c), in accordance with a determination that a pose (e.g., location, orientation, hand shape) of the predefined portion (e.g., 709) of the user meets one or more criteria prior to detecting the input, the electronic device performs (802 d) a corresponding operation in accordance with the input from the predefined portion (e.g., 709) of the user of the electronic device 101a, such as in fig. 7C. In some embodiments, the pose of the physical feature of the user is the orientation and/or shape of the user's hand. For example, the pose meets the one or more criteria if the electronic device detects a pre-pinch hand shape in which the user's hand is oriented with the palm facing away from the user's torso and the user's thumb is within a threshold distance (e.g., 0.5, 1, 2, etc. centimeters) of another finger (e.g., index finger, middle finger, ring finger, little finger) on the same hand. As another example, the one or more criteria are met when the hand is in a pointing hand shape in which one or more fingers are extended and one or more other fingers are curled toward the palm of the user. Input by the user's hand after the pose is detected is optionally identified as directed to the user interface element, and the device optionally performs the corresponding operation in accordance with this subsequent input by the hand. In some implementations, the respective operations include scrolling the user interface, selecting an option, activating a setting, or navigating to a new user interface. In some embodiments, the electronic device scrolls the user interface in response to detecting an input comprising a selection and subsequent movement of the portion of the user after the pose is detected. For example, the electronic device detects the user's gaze directed at the user interface while first detecting a pointing hand shape and subsequently detecting movement of the user's hand away from the user's torso and in a direction in which the user interface is scrollable, and the electronic device scrolls the user interface in response to this input sequence. As another example, in response to detecting the user's gaze directed at an option for activating a setting of the electronic device while detecting a pre-pinch hand shape followed by a pinch hand shape, the electronic device activates the setting on the electronic device.
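One possible pose test consistent with the pre-pinch and pointing examples above is sketched below in Swift; the joint positions, the palm-orientation flag, and the thresholds are simplified assumptions, not a specification of the actual criteria.

```swift
// Simplified ready-state pose check; all fields and thresholds are illustrative assumptions.
struct HandPose {
    let thumbTip: SIMD3<Double>
    let indexTip: SIMD3<Double>
    let palmFacingAwayFromTorso: Bool
    let extendedFingerCount: Int
}

func poseMeetsReadyCriteria(_ pose: HandPose) -> Bool {
    let prePinchThreshold = 0.02   // e.g., 2 cm between thumb and another finger
    let d = pose.thumbTip - pose.indexTip
    let thumbToIndex = (d.x * d.x + d.y * d.y + d.z * d.z).squareRoot()

    let isPrePinch = pose.palmFacingAwayFromTorso && thumbToIndex <= prePinchThreshold
    let isPointing = pose.extendedFingerCount == 1   // one finger extended, others curled
    return isPrePinch || isPointing
}
```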
In some embodiments, such as in fig. 7C, in response to detecting an input (802 c) from a predefined portion (e.g., 715) of a user of electronic device 101b, in accordance with a determination that the pose of the predefined portion (e.g., 715) of the user did not meet the one or more criteria prior to detecting the input, such as in fig. 7B, electronic device 101b forgoes (802 e) performing a corresponding operation in accordance with the input from the predefined portion (e.g., 715) of the user of electronic device 101b, such as in fig. 7C. In some embodiments, even if the pose meets the one or more criteria, the electronic device forgoes performing the respective operation in response to detecting that the user's gaze is not directed to the user interface element when the pose and the input are detected. In some embodiments, in accordance with a determination that the user's gaze is directed to the user interface element when the pose and the input are detected, the electronic device performs the respective operation in accordance with the input.
The above-described manner of performing or not performing the first operation depending on whether the pose of the predefined portion of the user meets one or more criteria before an input is detected provides an efficient way of reducing accidental user input, which simplifies interactions between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and extends battery life of the electronic device by enabling the user to use the electronic device more quickly and efficiently while reducing errors in use and by reducing the likelihood that the electronic device performs unintended operations that would then need to be reversed.
In some embodiments, such as in fig. 7A, when the pose of the predefined portion of the user (e.g., 709) does not meet the one or more criteria (e.g., before input from the predefined portion of the user is detected), the electronic device 101a displays (804 a) the user interface element (e.g., 705) with a visual characteristic (e.g., size, color, location, translucence) having a first value and displays a second user interface element (e.g., 707) included in the user interface with a visual characteristic (e.g., size, color, location, translucence) having a second value. In some embodiments, displaying the user interface element with the visual characteristic having the first value and displaying the second user interface element with the visual characteristic having the second value indicates that the input focus is not directed to either the user interface element or the second user interface element and/or that the electronic device will not direct input from the predefined portion of the user to either the user interface element or the second user interface element.
In some embodiments, such as in fig. 7B, when the pose of the predefined portion of the user (e.g., 709) meets the one or more criteria, the electronic device 101a updates (804B) the visual characteristics of the user interface element (e.g., 705) to which the input focus is directed, including (e.g., before detecting input from the predefined portion of the user): in accordance with determining that the input focus is directed to the user interface element (e.g., 705), the electronic device 101a updates (804 c) the user interface element (e.g., 705) to be displayed with a visual characteristic (e.g., size, color, translucence) having a third value (e.g., the third value is different from the first value while maintaining the second user interface element displayed with the visual characteristic having the second value). In some implementations, in accordance with a determination that the user's gaze is directed toward the user interface element, the input focus is directed toward the user interface element, optionally including a disambiguation technique in accordance with method 1200. In some implementations, in accordance with a determination that the predefined portion of the user is within a threshold distance (e.g., 0.5, 1, 2, 3, 4, 5, 10, 30, 50, etc. centimeters) of the user interface element (e.g., a threshold distance for direct input), the input focus is directed at the user interface element. For example, before the predefined portion of the user meets the one or more criteria, the electronic device displays the user interface element in a first color, and in response to detecting that the predefined portion of the user meets the one or more criteria and that the input focus is directed toward the user interface element, the electronic device displays the user interface element in a second color different from the first color to indicate that the input from the predefined portion of the user is to be directed toward the user interface element.
In some embodiments, when the pose of the predefined portion of the user (e.g., 705) meets the one or more criteria, such as in fig. 7B, the electronic device 101a updates (804B) the visual characteristics of the user interface element to which the input focus is directed (e.g., in the manner in which the electronic device 101a updates the user interface element 705 in fig. 7B), including (e.g., before detecting input from the predefined portion of the user): in accordance with determining that the input focus points to the second user interface element, the electronic device 101a updates (804 d) the second user interface element to be displayed with a visual characteristic having a fourth value (e.g., updates the appearance of the user interface element 707 in fig. 7B if the user interface element 707 has input focus instead of the user interface element 705 having input focus as in the case of fig. 7B) (e.g., the fourth value is different from the second value while maintaining the user interface element displayed with the visual characteristic having the first value). In some implementations, in accordance with determining that the user's gaze is directed toward the second user interface element, the input focus is directed toward the second user interface element, optionally including a disambiguation technique in accordance with method 1200. In some implementations, in accordance with a determination that the predefined portion of the user is within a threshold distance (e.g., 0.5, 1, 2, 3, 4, 5, 10, 50, etc. centimeters) of the second user interface element (e.g., a threshold distance for direct input), the input focus is directed toward the second user interface element. For example, before the predefined portion of the user meets the one or more criteria, the electronic device displays the second user interface element in a first color, and in response to detecting that the predefined portion of the user meets the one or more criteria and that the input focus is directed to the second user interface element, the electronic device displays the second user interface element in a second color different from the first color to indicate that the input is to be directed to the user interface element.
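The following Swift sketch illustrates the idea that, once the ready-state pose is detected, only the element with input focus changes its visual characteristic; the appearance model and the placeholder values are assumptions made for the example.

```swift
// Illustrative sketch: only the focused element takes on the "ready" appearance value.
struct ElementAppearance {
    var color: String
}

func updateAppearances(focusedID: String,
                       appearances: inout [String: ElementAppearance],
                       readyStatePoseDetected: Bool) {
    guard readyStatePoseDetected else { return }   // first/second values remain unchanged
    for id in appearances.keys {
        // The focused element moves to the third/fourth value described above; others stay default.
        appearances[id]?.color = (id == focusedID) ? "highlighted" : "default"
    }
}

var appearances = ["705": ElementAppearance(color: "default"),
                   "707": ElementAppearance(color: "default")]
updateAppearances(focusedID: "705", appearances: &appearances, readyStatePoseDetected: true)
```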
The above-described manner of updating the visual characteristics of the user interface element to which the input focus is directed in response to detecting that the predefined portion of the user meets the one or more criteria provides an efficient manner of indicating to the user the user interface element to which the input will be directed, which simplifies interactions between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and extends battery life of the electronic device by enabling the user to more quickly and efficiently use the electronic device while reducing errors in use.
In some embodiments, such as in fig. 7B, in accordance with a determination that a predefined portion (e.g., 709) of the user is within a threshold distance (e.g., 0.5, 1, 2, 3, 4, 5, 10, 50, etc. centimeters) of a location corresponding to the user interface element (e.g., 705) (e.g., and not within a threshold distance of a second user interface element), input focus is directed to the user interface element (e.g., 705) (806 a). In some implementations, the threshold distance is associated with a direct input, such as described with reference to methods 800, 1000, 1200, 1400, 1600, 1800, and/or 2000. For example, in response to detecting that a finger of a user's hand in a pointing hand shape is within a threshold distance of a user interface element, an input focus is directed at the user interface element.
In some implementations, in accordance with a determination that the predefined portion (e.g., 709) of the user is within a threshold distance (e.g., 0.5, 1, 2, 3, 4, 5, 10, 50, etc. centimeters) of the second user interface element (e.g., and not within the threshold distance of the user interface element), the input focus is directed to the second user interface element (e.g., 707 in fig. 7B) (806 b), such as if the hand 709 of the user were within the threshold distance of the user interface element 707 instead of the user interface element 705 in fig. 7B. In some implementations, the threshold distance is associated with a direct input, such as described with reference to methods 800, 1000, 1200, 1400, 1600, 1800, and/or 2000. For example, in response to detecting that a finger of a user's hand in a pointing hand shape is within the threshold distance of the second user interface element, the input focus is directed to the second user interface element.
The above-described manner of directing the input focus based on which user interface element the predefined portion of the user is within a threshold distance of provides an efficient manner of directing user input when the predefined portion of the user is used to provide input, which simplifies interactions between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and prolongs battery life of the electronic device by enabling the user to use the electronic device more quickly and efficiently while reducing errors in use.
In some implementations, such as in fig. 7B, in accordance with a determination that the user's gaze (e.g., 701B) is directed to the user interface element (e.g., 705) (e.g., and the predefined portion of the user is not within a threshold distance of the user interface element and/or any interactive user interface elements), the input focus is directed to the user interface element (e.g., 705) (808 a). In some implementations, determining that the user's gaze is directed to the user interface element includes one or more disambiguation techniques in accordance with method 1200. For example, the electronic device directs an input focus to the user interface element for indirect input in response to detecting a gaze of a user directed to the user interface element.
In some implementations, in accordance with a determination that the user's gaze is directed to a second user interface element (e.g., 707) (e.g., and the predefined portion of the user is not within a threshold distance of the second user interface element and/or any interactive user interface element), the input focus is directed to the second user interface element (e.g., 707) in fig. 7B (808B). For example, if the user's gaze is directed to user interface element 707 in fig. 7B instead of user interface element 705, the input focus would be directed to user interface element 707. In some implementations, determining that the user's gaze is directed at the second user interface element includes one or more disambiguation techniques in accordance with method 1200. For example, the electronic device directs an input focus to the second user interface element for indirect input in response to detecting a gaze of a user directed to the second user interface element.
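One way to resolve input focus that is consistent with the behavior described above (proximity for direct interaction, gaze otherwise) is sketched below in Swift; the function, types, and threshold are illustrative assumptions.

```swift
// Hypothetical input-focus resolution: hand proximity wins for direct input, gaze otherwise.
struct FocusableElement { let id: String; let position: SIMD3<Double> }

func resolveInputFocus(hand: SIMD3<Double>?,
                       gazeTargetID: String?,
                       elements: [FocusableElement],
                       directThreshold: Double = 0.05) -> String? {
    // Direct case: the hand is within the threshold distance of some element.
    if let hand = hand {
        let nearest = elements.min {
            lengthSquared($0.position - hand) < lengthSquared($1.position - hand)
        }
        if let nearest = nearest,
           lengthSquared(nearest.position - hand) <= directThreshold * directThreshold {
            return nearest.id
        }
    }
    // Indirect case: fall back to the element the user is looking at.
    return gazeTargetID
}

func lengthSquared(_ v: SIMD3<Double>) -> Double { v.x * v.x + v.y * v.y + v.z * v.z }
```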
The above-described manner of directing the input focus to the user interface element at which the user is looking provides an efficient way of directing user input without requiring additional input devices (e.g., input devices other than the eye tracking device and the hand tracking device), which simplifies interactions between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and prolongs battery life of the electronic device by enabling the user to use the electronic device more quickly and efficiently while reducing errors in use.
In some embodiments, such as in fig. 7B, updating the visual characteristics of the user interface element (e.g., 705) to which the input focus is directed includes (810 a): in accordance with a determination that the predefined portion (e.g., 709) of the user is less than a threshold distance (e.g., 1, 2, 3, 4, 5, 10, 15, 30, etc. centimeters) from a location corresponding to the user interface element (e.g., 705), a visual characteristic (810B) of the user interface element (e.g., 705) to which the input focus is directed is updated in accordance with a determination that the pose of the predefined portion (e.g., 709) of the user meets a first set of one or more criteria (e.g., as described in association with direct input, such as methods 800, 1000, 1200, 1400, 1600, 1800, and/or 2000), such as in fig. 7B (and optionally, in accordance with a determination that the pose of the predefined portion of the user does not meet the first set of one or more criteria without updating the visual characteristic of the user interface element to which the input focus is directed). For example, when the user's hand is within a direct input threshold distance of the user interface element, the first set of one or more criteria includes detecting a pointing hand shape (e.g., a shape in which a finger protrudes from an otherwise closed hand).
In some embodiments, such as in fig. 7B, updating the visual characteristics of the user interface element (e.g., 705) to which the input focus is directed includes (810 a): in accordance with a determination that the predefined portion (e.g., 711) of the user is greater than a threshold distance (e.g., 1, 2, 3, 4, 5, 10, 15, 30, etc. centimeters) from a location corresponding to the user interface element (e.g., 705), a visual characteristic (810 c) of the user interface element (e.g., 705) to which the input focus is directed is updated in accordance with a determination that the pose of the predefined portion (e.g., 711) of the user meets a second set of one or more criteria that is different from the first set of one or more criteria (e.g., in association with an indirect input such as described with reference to methods 800, 1000, 1200, 1400, 1600, 1800, and/or 2000), such as in fig. 7B (and optionally, in accordance with a determination that the pose of the predefined portion of the user does not meet the second set of one or more criteria, without updating the visual characteristic of the user interface element to which the input focus is directed). For example, when the user's hand is farther from the user interface element than the direct input threshold, the second set of one or more criteria includes detecting a pre-pinch hand shape instead of detecting a pointing hand shape. In some embodiments, the hand shape that meets the one or more first criteria is different from the hand shape that meets the one or more second criteria. In some embodiments, the one or more criteria are not met when the predefined portion of the user is greater than a threshold distance from a location corresponding to the user interface element and the pose of the predefined portion of the user meets the first set of one or more criteria and does not meet the second set of one or more criteria. In some embodiments, the one or more criteria are not met when the predefined portion of the user is less than a threshold distance from a location corresponding to the user interface element and the pose of the predefined portion of the user meets the second set of one or more criteria but does not meet the first set of one or more criteria.
The above-described manner of evaluating the predefined portion of the user using different criteria depending on whether the predefined portion of the user is within a threshold distance of a location corresponding to the user interface element provides an efficient and intuitive manner of interacting with the user interface element that is tailored to whether the input is direct or indirect, which simplifies interactions between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and prolongs battery life of the electronic device by enabling the user to use the electronic device more quickly and efficiently while reducing errors in use.
In some embodiments, such as in fig. 7B, the pose of the predefined portion of the user (e.g., 709) meeting the one or more criteria includes (812 a): in accordance with a determination that the predefined portion (e.g., 709) of the user is less than a threshold distance (e.g., 1, 2, 3, 4, 5, 10, 15, 30, etc. centimeters) from a location corresponding to the user interface element (e.g., 705), the pose of the predefined portion (e.g., 709) of the user meets a first set of one or more criteria (812 b) (e.g., associated with direct input, such as described with reference to methods 800, 1000, 1200, 1400, 1600, 1800, and/or 2000). For example, a first set of one or more criteria includes detecting a pointing hand shape (e.g., a shape in which a finger protrudes from an otherwise closed hand) when the user's hand is within a direct input threshold distance of the user interface element.
In some embodiments, such as in fig. 7B, the pose of the predefined portion (e.g., 711) of the user meeting the one or more criteria includes (812 a): in accordance with a determination that the predefined portion (e.g., 711) of the user is greater than a threshold distance (e.g., 1, 2, 3, 4, 5, 10, 15, 30, etc. centimeters) from a location corresponding to the user interface element (e.g., 705), the pose of the predefined portion (e.g., 711) of the user meets a second set of one or more criteria (812 c) different from the first set of one or more criteria (e.g., associated with an indirect input, such as described with reference to methods 800, 1000, 1200, 1400, 1600, 1800, and/or 2000). For example, the second set of one or more criteria includes detecting a pre-pinch hand shape when the user's hand is farther from the user interface element than the direct input threshold. In some embodiments, the hand shape that meets the one or more first criteria is different from the hand shape that meets the one or more second criteria. In some embodiments, the one or more criteria are not met when the predefined portion of the user is greater than a threshold distance from a location corresponding to the user interface element and the pose of the predefined portion of the user meets the first set of one or more criteria and does not meet the second set of one or more criteria. In some embodiments, the one or more criteria are not met when the predefined portion of the user is less than a threshold distance from a location corresponding to the user interface element and the pose of the predefined portion of the user meets the second set of one or more criteria but does not meet the first set of one or more criteria.
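The distance-dependent criteria described above might be expressed as in the following Swift sketch, where a pointing hand shape satisfies the direct-input set and a pre-pinch hand shape satisfies the indirect-input set; the names and threshold are assumptions for the example.

```swift
// Illustrative sketch: different criteria sets apply depending on hand distance to the element.
enum DetectedHandShape { case pointing, prePinch, other }

func poseMeetsCriteria(shape: DetectedHandShape,
                       distanceToElement: Double,
                       directThreshold: Double = 0.05) -> Bool {
    if distanceToElement < directThreshold {
        return shape == .pointing          // first set: direct-input ready state
    } else {
        return shape == .prePinch          // second set: indirect-input ready state
    }
}
```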
The above-described manner of evaluating the predefined portion of the user using different criteria depending on whether the predefined portion of the user is within a threshold distance of a location corresponding to the user interface element provides an efficient and intuitive manner of interacting with the user interface element that is tailored to whether the input is direct or indirect, which simplifies interactions between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and prolongs battery life of the electronic device by enabling the user to use the electronic device more quickly and efficiently while reducing errors in use.
In some embodiments, such as in fig. 7B, the pose of the predefined portion of the user meeting the one or more criteria includes (814 a): in accordance with a determination that a predefined portion of a user is holding an input device (e.g., a stylus, remote control, touch pad) of the one or more input devices (e.g., interacting with or touching the input device), the pose of the predefined portion of the user meets a first set of one or more criteria (814 b) (e.g., if the user's hand 709 in fig. 7B is holding the input device). In some embodiments, the predefined portion of the user is the user's hand. In some embodiments, the first set of one or more criteria is met when a user holds a stylus or controller in their hand within a predefined area of the three-dimensional environment and/or in a predefined orientation relative to user interface elements and/or relative to the torso of the user. In some implementations, the first set of one or more criteria is met when a user holds the remote control in a predefined orientation relative to the user interface element and/or relative to the torso of the user within a predefined area of the three-dimensional environment and/or when the user's thumb or fingers are resting on respective components (e.g., buttons, touch pad, etc.) of the remote control. In some implementations, the first set of one or more criteria is met when the user is holding or interacting with the touch pad and a predefined portion of the user is in contact with the touch-sensitive surface of the touch pad (e.g., resting on the touch pad without pressing it in a manner that would cause a selection).
In some embodiments, such as in fig. 7B, the pose of the predefined portion of the user (e.g., 709) meeting the one or more criteria includes (814 a): in accordance with a determination that the predefined portion (e.g., 709) of the user is not holding the input device, the pose of the predefined portion (e.g., 709) of the user meets a second set of one or more criteria (814 c) (e.g., different from the first set of one or more criteria). In some implementations, when a user of the electronic device is not holding, touching, or interacting with the input device, the second set of one or more criteria is met when the user's pose is a predefined pose (e.g., a pose that includes a pre-pinch or pointing hand shape), such as previously described, rather than a pose of holding a stylus or controller in the hand. In some embodiments, the pose of the predefined portion of the user does not meet the one or more criteria when the predefined portion of the user is holding the input device and the second set of one or more criteria is met and the first set of one or more criteria is not met. In some embodiments, when the predefined portion of the user is not holding the input device and meets the first set of one or more criteria and does not meet the second set of one or more criteria, the pose of the predefined portion of the user does not meet the one or more criteria.
The above-described manner of evaluating the predefined portion of the user according to different criteria depending on whether the user is holding the input device provides an efficient way of switching between accepting input using the input device and input not using the input device (e.g., input devices other than the eye tracking device and/or the hand tracking device), which simplifies interactions between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and extends battery life of the electronic device by enabling the user to use the electronic device more quickly and efficiently.
In some embodiments, such as in fig. 7B, the pose of the predefined portion of the user (e.g., 709) meeting the one or more criteria includes (816 a): in accordance with a determination that the predefined portion (e.g., 709) of the user is less than a threshold distance (e.g., 0.5, 1, 2, 3, 4, 5, 10, 15, 30, 50, etc., centimeters, corresponding to direct input) from a location corresponding to the user interface element (e.g., 705), the pose of the predefined portion (e.g., 709) of the user satisfies a first set of one or more criteria (816 b). For example, the first set of one or more criteria includes detecting a pointing hand shape and/or a pre-pinch hand shape when the user's hand is within a direct input threshold distance of the user interface element.
In some embodiments, such as in fig. 7B, the pose of the predefined portion (e.g., 711) of the user meeting the one or more criteria includes (816 a): in accordance with a determination that the predefined portion (e.g., 711) of the user is greater than a threshold distance (e.g., 0.5, 1, 2, 3, 4, 5, 10, 15, 30, 50, etc. centimeters, corresponding to indirect input) from a location corresponding to the user interface element (e.g., 705), the pose of the predefined portion (e.g., 711) of the user satisfies the first set of one or more criteria (816 c). For example, when the user's hand is farther from the user interface element than the direct input threshold, the criteria include detecting the same pre-pinch hand shape and/or pointing hand shape as is used to satisfy the criteria when the hand is within the threshold distance. In some embodiments, the hand shape that meets the one or more first criteria is the same regardless of whether the predefined portion of the user is greater than or less than the threshold distance from the location corresponding to the user interface element.
The above-described manner of evaluating the pose of the predefined portion of the user against the first set of one or more criteria regardless of the distance between the predefined portion of the user and the location corresponding to the user interface element provides an efficient and consistent manner of detecting user input provided with the predefined portion of the user, which simplifies interactions between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and extends battery life of the electronic device by enabling the user to more quickly and efficiently use the electronic device while reducing errors in use.
In some embodiments, such as in fig. 7C, in accordance with a determination that the predefined portion (e.g., 711) of the user is more than a threshold distance (e.g., 0.5, 1, 2, 3, 4, 5, 10, 15, 30, 50, etc. centimeters, corresponding to indirect input) from a location corresponding to the user interface element (e.g., 705) during the respective input (e.g., the input is an indirect input), the one or more criteria include a criterion (818 a) that is met when the user's attention is directed to the user interface element (e.g., 705) (e.g., and that is not met when the user's attention is not directed to the user interface element) (e.g., the user's gaze is within the threshold distance of the user interface element, the user interface element is within the user's attention area, etc., such as described with reference to method 1000). In some embodiments, the electronic device determines which user interface element an indirect input is directed to based on the user's attention, and thus it is not possible to provide an indirect input to a respective user interface element without directing the user's attention to the respective user interface element.
In some embodiments, such as in fig. 7C, in accordance with a determination that the predefined portion (e.g., 709) of the user is less than a threshold distance (e.g., 0.5, 1, 2, 3, 4, 5, 10, 15, 30, 50, etc. centimeters) from a location corresponding to the user interface element (e.g., 705) during the respective input (e.g., the input is a direct input), the one or more criteria do not include a requirement (818 b) that the user's attention be directed to the user interface element in order to satisfy the one or more criteria (e.g., it is possible to satisfy the one or more criteria without directing the user's attention to the user interface element). In some embodiments, the electronic device determines the target of a direct input based on the position of the predefined portion of the user relative to the user interface elements in the user interface and directs the input to the user interface element closest to the predefined portion of the user, regardless of whether the user's attention is directed to that user interface element.
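The attention requirement described above, applied to indirect input but not to direct input, can be sketched as follows in Swift; the InputContext type and threshold value are assumptions made for illustration.

```swift
// Illustrative sketch: indirect input requires attention on the element; direct input does not.
struct InputContext {
    let handDistanceToElement: Double
    let attentionOnElement: Bool
    let directThreshold: Double = 0.05
}

func inputCriteriaMet(_ ctx: InputContext) -> Bool {
    if ctx.handDistanceToElement > ctx.directThreshold {
        // Indirect: attention (e.g., gaze / attention area) must be on the element.
        return ctx.attentionOnElement
    }
    // Direct: no attention requirement; the nearest element receives the input.
    return true
}
```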
The above-described manner of requiring the user's attention in order to meet the one or more criteria when the predefined portion of the user is farther than the threshold distance from the user interface element, and not requiring the user's attention in order to meet the one or more criteria when the predefined portion of the user is less than the threshold distance from the user interface element, provides an efficient way of enabling the user to look at other portions of the user interface while providing direct input, thus saving user time when using the electronic device, and of reducing user errors when providing indirect input, which simplifies interactions between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and extends battery life of the electronic device by enabling the user to more quickly and efficiently use the electronic device.
In some implementations, in response to detecting that the user's gaze (e.g., 701 a) is directed toward a first region (e.g., 703) of the user interface, such as in fig. 7A, the electronic device 101a visually weakens (820 a) (e.g., blurs, fades, darkens, and/or reduces saturation of) a second region of the user interface relative to the first region of the user interface via the display generating component. In some embodiments, the electronic device modifies the display of the second region of the user interface and/or modifies the display of the first region of the user interface to visually weaken the second region of the user interface relative to the first region of the user interface.
In some implementations, such as in fig. 7B, in response to detecting that the user's gaze 701c is directed toward the second region (e.g., 702) of the user interface, the electronic device 101b visually weakens (820 b) (e.g., blurs, fades, darkens, and/or reduces saturation of) the first region of the user interface relative to the second region (e.g., 702) of the user interface via the display generating component. In some embodiments, the electronic device modifies the display of the first region of the user interface and/or modifies the display of the second region of the user interface to visually weaken the first region of the user interface relative to the second region of the user interface. In some embodiments, the first region and/or the second region of the user interface includes one or more virtual objects (e.g., application user interfaces, content items, representations of other users, files, control elements, etc.) and/or one or more physical objects (e.g., passthrough video including a photorealistic representation of a real object, or true passthrough in which a view of the real object is visible through a transparent portion of the display generating component) that are weakened when the region of the user interface is weakened.
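A minimal Swift sketch of this gaze-based weakening is shown below; the scalar "emphasis" value standing in for blur/dim/desaturation, and the region identifiers, are assumptions for the example.

```swift
// Illustrative sketch: the region the user is not looking at is visually weakened.
struct Region { let id: String; var emphasis: Double = 1.0 }

func applyGazeDeemphasis(gazedRegionID: String,
                         regions: inout [Region],
                         weakenedEmphasis: Double = 0.4) {
    for index in regions.indices {
        // The gazed region keeps full emphasis; other regions are blurred/dimmed/desaturated.
        regions[index].emphasis = (regions[index].id == gazedRegionID) ? 1.0 : weakenedEmphasis
    }
}

var regions = [Region(id: "first"), Region(id: "second")]
applyGazeDeemphasis(gazedRegionID: "first", regions: &regions)   // the second region is weakened
```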
The above-described manner of visually weakening the area other than the area to which the user's gaze is directed provides an efficient way of reducing visual clutter when the user views the corresponding area of the user interface, which simplifies interactions between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and extends battery life of the electronic device by enabling the user to use the electronic device more quickly and efficiently.
In some embodiments, such as in fig. 7A, the user interface is accessible (822 a) by the electronic device 101a and the second electronic device 101b (e.g., the electronic device and the second electronic device are in communication (e.g., via a wired or wireless network connection)). In some embodiments, the electronic device and the second electronic device are located remotely from each other. In some embodiments, the electronic device and the second electronic device are collocated (e.g., in the same room, building, etc.). In some embodiments, the electronic device and the second electronic device present a three-dimensional environment in a coexistence session in which representations of users of the two devices are associated with unique locations in the three-dimensional environment, and each electronic device displays the three-dimensional environment from the perspective of the representations of the respective users.
In some embodiments, such as in fig. 7B, the electronic device 101a forgoes (822 b) visually weakening (e.g., blurring, dimming, darkening, and/or reducing saturation of) the second region of the user interface relative to the first region of the user interface via the display generating component in accordance with an indication that the gaze 701c of the second user of the second electronic device 101b is directed toward the first region 702 of the user interface. In some embodiments, the second electronic device visually weakens the second region of the user interface in accordance with a determination that the gaze of the second user is directed to the first region of the user interface. In some embodiments, in accordance with a determination that a gaze of a user of the electronic device is directed toward a first region of the user interface, the second electronic device forgoes visually weakening a second region of the user interface relative to the first region of the user interface.
In some embodiments, such as in fig. 7B, the electronic device 101B forgoes (822 c) visually weakening (e.g., obscuring, darkening, and/or reducing saturation) the first region of the user interface relative to the second region of the user interface via the display generating component in accordance with an indication that the gaze of the second user of the second electronic device 101a is directed at the second region of the user interface (e.g., 703). In some implementations, the second electronic device visually weakens the first region of the user interface in accordance with determining that the gaze of the second user is directed at the second region of the user interface. In some embodiments, in accordance with a determination that the gaze of the user of the electronic device is directed toward the second region of the user interface, the second electronic device forgoes visually weakening the first region of the user interface relative to the second region of the user interface.
The above-described manner of visually weakening the area of the user interface based on the gaze of the user of the second electronic device provides an efficient way of enabling the user to look at different areas of the user interface at the same time, which simplifies the interaction between the user and the electronic device, enhances the operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and prolongs battery life of the electronic device by enabling the user to use the electronic device more quickly and efficiently.
In some embodiments, such as in fig. 7C, detecting input from a predefined portion (e.g., 709) of the user of the electronic device 101a includes detecting a pinch gesture (e.g., pinch and hold, pinch and drag, double pinch, toggle, release without velocity, throw with velocity) (824 a) performed by the predefined portion (e.g., 709) of the user via the hand tracking device. In some embodiments, detecting a pinch gesture includes detecting that the user moves their thumb toward and/or within a predefined distance of another finger (e.g., index finger, middle finger, ring finger, little finger) on the same hand as the thumb. In some embodiments, detecting a pose that meets the one or more criteria includes detecting that the user is in a ready state, such as a pre-pinch hand shape in which the thumb is within a threshold distance (e.g., 1, 2, 3, 4, 5, etc. centimeters) of the other finger.
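A simplified pinch test consistent with the thumb-to-finger description above is sketched below in Swift; the fingertip positions and the threshold value are assumptions for the example.

```swift
// Illustrative pinch check: thumb tip within a small distance of another fingertip.
func isPinching(thumbTip: SIMD3<Double>,
                otherFingerTip: SIMD3<Double>,
                pinchThreshold: Double = 0.01) -> Bool {
    let d = thumbTip - otherFingerTip
    let fingertipDistance = (d.x * d.x + d.y * d.y + d.z * d.z).squareRoot()
    return fingertipDistance <= pinchThreshold   // e.g., within ~1 cm counts as a pinch
}
```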
The above-described manner of detecting input including pinch gestures provides an efficient manner of accepting user input based on hand gestures without requiring the user to physically touch and/or manipulate the input device with their hands, which simplifies interactions between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and extends battery life of the electronic device by enabling the user to more quickly and efficiently use the electronic device.
In some implementations, such as in fig. 7C, detecting input from a predefined portion (e.g., 709) of a user of the electronic device 101a includes detecting a press (e.g., tap, press and hold, press and drag, flick) gesture performed by the predefined portion (e.g., 709) of the user (826 a) via the hand tracking device. In some implementations, detecting a press gesture includes detecting that a predefined portion of the user presses a location corresponding to a user interface element displayed in the user interface (e.g., such as described with reference to methods 1400, 1600, and/or 2000), such as a user interface element or virtual touch pad or other visual indication according to method 1800. In some embodiments, prior to detecting an input comprising a press gesture, the electronic device detects a pose of a predefined portion of the user that meets the one or more criteria, including detecting that the user is in a ready state, such as the user's hand being in a pointing-hand shape with one or more fingers extending and one or more fingers curling toward the palm. In some implementations, the press gesture includes moving a finger, hand, or arm of the user while the hand is in a pointed hand shape.
The above-described manner of detecting input including a press gesture provides an efficient manner of accepting user input based on hand gestures without requiring the user to physically touch and/or manipulate the input device with their hands, which simplifies interactions between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and extends battery life of the electronic device by enabling the user to more quickly and efficiently use the electronic device.
In some embodiments, such as in fig. 7C, detecting input from a predefined portion (e.g., 709) of a user of the electronic device 101a includes detecting lateral movement (828 a) of the predefined portion (e.g., 709) of the user relative to a location corresponding to the user interface element (e.g., 705) (e.g., such as described with reference to method 1800). In some implementations, the lateral movement includes movement having a component perpendicular to a straight-line path between the predefined portion of the user and the location corresponding to the user interface element. For example, if the user interface element is in front of the predefined portion of the user and the user moves the predefined portion of the user to the left, right, up or down, the movement is a lateral movement. For example, the input is one of a press and drag, pinch and drag, or a throw (with speed) input.
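The lateral component described above (movement perpendicular to the straight-line path between the hand and the element) can be computed as in the following Swift sketch; the function name and vector representation are assumptions for the example.

```swift
// Illustrative sketch: lateral movement is the part of the hand's motion perpendicular
// to the direction from the hand to the user interface element.
func lateralComponent(handMovement: SIMD3<Double>,
                      handPosition: SIMD3<Double>,
                      elementPosition: SIMD3<Double>) -> SIMD3<Double> {
    let toElement = elementPosition - handPosition
    let lengthSq = toElement.x * toElement.x + toElement.y * toElement.y + toElement.z * toElement.z
    guard lengthSq > 0 else { return handMovement }
    let dot = handMovement.x * toElement.x + handMovement.y * toElement.y + handMovement.z * toElement.z
    let parallel = toElement * (dot / lengthSq)     // projection onto the hand-to-element axis
    return handMovement - parallel                  // what remains is the lateral movement
}
```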
The above-described manner of detecting lateral movement of the predefined portion of the user relative to the user interface element provides an efficient manner of providing directional input to the electronic device with the predefined portion of the user, which simplifies interactions between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and extends battery life of the electronic device by enabling the user to use the electronic device more quickly and efficiently.
In some embodiments, such as in fig. 7A, before determining that the pose of the predefined portion (e.g., 709) of the user meets the one or more criteria prior to detection of the input (830 a), the electronic device 101a detects (830 b), via the eye tracking device, that the gaze (e.g., 701 a) of the user is directed to the user interface element (e.g., 705) (e.g., according to one or more disambiguation techniques of method 1200).
In some embodiments, before determining that the pose of the predefined portion (e.g., 709) of the user meets the one or more criteria (830 a) prior to detecting the input, such as in fig. 7A, in response to detecting that the gaze (e.g., 701 a) of the user is directed to the user interface element (e.g., 705), the electronic device 101a displays (830 c), via the display generating component, a first indication that the gaze (e.g., 701 a) of the user is directed to the user interface element (e.g., 705). In some embodiments, the first indication is a highlight overlaid on or displayed around the user interface element. In some implementations, the first indication is a color change or a position change of the user interface element (e.g., toward the user). In some embodiments, the first indication is a symbol or icon displayed overlaying or displayed adjacent to the user interface element.
The above-described manner of displaying the first indication that the user's gaze is directed to the user interface element provides an efficient way of conveying to the user that the input focus is based on the location the user is looking at, which simplifies interactions between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and extends battery life of the electronic device by enabling the user to use the electronic device more quickly and efficiently.
In some embodiments, such as in fig. 7B, prior to detecting input from the predefined portion (e.g., 709) of the user of the electronic device 101a, when the pose of the predefined portion (e.g., 709) of the user before detecting the input meets the one or more criteria (832 a) (e.g., and when the gaze of the user is directed to the user interface element) (e.g., according to one or more disambiguation techniques of the method 1200), the electronic device 101a displays (832B) via the display generating component a second indication that the pose of the predefined portion (e.g., 709) of the user before detecting the input meets the one or more criteria, such as in fig. 7B, wherein the first indication is different from the second indication. In some implementations, displaying the second indication includes modifying a visual characteristic (e.g., color, size, positioning, translucence) of the user interface element that the user is looking at. For example, the second indication is that the electronic device moves the user interface element toward the user in a three-dimensional environment. In some embodiments, the second indication is displayed overlaid on or near the user interface element that the user is looking at. In some implementations, the second indication is an icon or image displayed in the user interface at a location that is independent of the location at which the user gaze is directed.
The above-described manner of displaying an indication of the pose of the user that meets the one or more criteria that is different from an indication of the location of the user's gaze provides an efficient way of indicating to the user that the electronic device is ready to accept further input from a predefined portion of the user, which simplifies interactions between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and extends battery life of the electronic device by enabling the user to use the electronic device more quickly and efficiently.
In some implementations, such as in fig. 7C, upon display of the user interface element (e.g., 705), the electronic device 101a detects (834 a) a second input from a second predefined portion (e.g., 717) (e.g., a second hand) of the user of the electronic device 101a via the one or more input devices.
In some embodiments, in response to detecting a second input (834B) from a second predefined portion (e.g., 717) of a user of the electronic device, in accordance with a determination that a pose (e.g., location, orientation, hand shape) of the second predefined portion (e.g., 711) of the user before the second input was detected meets the one or more second criteria, such as in fig. 7B, the electronic device 101a performs (834 c) a second corresponding operation in accordance with the second input from the second predefined portion (e.g., 711) of the user of the electronic device 101 a. In some embodiments, the one or more second criteria differ from the one or more criteria in that the user performs the pose in a different predefined portion, but otherwise the one or more criteria are the same as the one or more second criteria. For example, the one or more criteria require the user's right hand to be in a ready state, such as a pre-pinch hand shape or a pointing hand shape, and the one or more second criteria require the user's left hand to be in a ready state, such as a pre-pinch hand shape or a pointing hand shape. In some embodiments, the one or more criteria are different from the one or more second criteria. For example, a first subset of the poses satisfies the one or more criteria for the right hand of the user, and a second, different subset of the poses satisfies the one or more criteria for the left hand of the user.
In some embodiments, such as in fig. 7C, in response to detecting a second input (834B) from a second predefined portion (e.g., 715) of a user of the electronic device 101B, in accordance with a determination that the pose of the second predefined portion (e.g., 721) of the user did not meet the one or more second criteria prior to detection of the second input, such as in fig. 7B, the electronic device relinquishes (834 d) performing a second corresponding operation in accordance with the second input from the second predefined portion (e.g., 715) of the user of the electronic device 101B, such as in fig. 7C. In some embodiments, the electronic device is capable of detecting input from a predefined portion of the user and/or a second predefined portion of the user, the predefined portion of the user and the second predefined portion of the user being independent of each other. In some embodiments, in order to perform an action based on input provided by the user's left hand, the user's left hand must have a pose that meets the one or more criteria before providing the input, and in order to perform an action based on input provided by the user's right hand, the user's right hand must have a pose that meets the second one or more criteria. In some embodiments, in response to detecting a pose of a predefined portion of a user that meets one or more criteria and subsequently input provided by a second predefined portion of the user if the second predefined portion of the user does not first meet the second one or more criteria, the electronic device relinquishes performing an action in accordance with the input of the second predefined portion of the user. In some embodiments, in response to detecting a pose of a second predefined portion of the user that meets a second one or more criteria and subsequently input provided by the predefined portion of the user if the predefined portion of the user does not first meet the one or more criteria, the electronic device relinquishes performing the action according to the input of the predefined portion of the user.
The above-described manner of accepting input from a second predefined portion of a user that is independent of the predefined portion of the user provides an efficient way of increasing the rate at which the user can provide input to the electronic device, which simplifies interactions between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and extends battery life of the electronic device by enabling the user to use the electronic device more quickly and efficiently.
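As an illustrative sketch of the per-hand gating described above (each hand's input is honored only if that same hand satisfied its own ready-state criteria beforehand), the following hypothetical Swift code assumes both hands share the same criteria, which the disclosure notes need not be the case:

```swift
// Hypothetical sketch: each hand's input is honored only if that same hand
// was in a qualifying ready-state pose before its input began.
enum Hand { case left, right }

struct HandPose {
    var isPrePinch: Bool
    var isPointing: Bool
}

// Ready-state criteria evaluated per hand; the criteria could differ between hands.
func meetsReadyCriteria(_ pose: HandPose, for hand: Hand) -> Bool {
    // Assumed here: both hands use the same criteria (pre-pinch or pointing hand shape).
    return pose.isPrePinch || pose.isPointing
}

struct HandInput {
    var hand: Hand
    var poseBeforeInput: HandPose
}

// Perform the operation only for the hand whose own prior pose qualified.
func shouldPerformOperation(for input: HandInput) -> Bool {
    meetsReadyCriteria(input.poseBeforeInput, for: input.hand)
}

let left = HandInput(hand: .left, poseBeforeInput: HandPose(isPrePinch: true, isPointing: false))
let right = HandInput(hand: .right, poseBeforeInput: HandPose(isPrePinch: false, isPointing: false))
print(shouldPerformOperation(for: left))  // true (the left hand was in a ready state)
print(shouldPerformOperation(for: right)) // false (the right hand was not)
```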
In some embodiments, such as in fig. 7A-7C, the user interface is accessible (836 a) by the electronic device 101a and the second electronic device 101b (e.g., the electronic device and the second electronic device are in communication (e.g., connected via a wired or wireless network)). In some embodiments, the electronic device and the second electronic device are located remotely from each other. In some embodiments, the electronic device and the second electronic device are collocated (e.g., in the same room, building, etc.). In some embodiments, the electronic device and the second electronic device present a three-dimensional environment in a coexistence session in which representations of users of the two devices are associated with unique locations in the three-dimensional environment, and each electronic device displays the three-dimensional environment from the perspective of the representations of the respective users.
In some embodiments, such as in fig. 7A, the electronic device 101a displays (836 b) the user interface element (e.g., 705) with a visual characteristic (e.g., size, color, translucence, location) having a first value before detecting that the pose of the predefined portion (e.g., 709) of the user meets the one or more criteria prior to detecting the input.
In some implementations, such as in fig. 7B, upon detecting that the pose of the predefined portion of the user (e.g., 709) meets the one or more criteria, the electronic device 101a displays (836 c) the user interface element (e.g., 705) with a visual characteristic (e.g., size, color, translucency, location) having a second value different from the first value. In some implementations, the electronic device updates the visual appearance of the user interface element in response to detecting that the pose of the predefined portion of the user meets the one or more criteria. In some implementations, the electronic device only updates the appearance of the user interface element to which the user's attention is directed (e.g., according to the user's gaze or according to the user's attention area of method 1000). In some implementations, the second electronic device maintains the display of the user interface element with the visual characteristic having the first value in response to the predefined portion of the user meeting the one or more criteria.
In some embodiments, when the pose of the predefined portion of the second user of the second electronic device 101B satisfies the one or more criteria while the user interface element is displayed with the visual characteristic having the first value (optionally, in response to the indication of the above), the electronic device 101a maintains (836 d) the display of the user interface element with the visual characteristic having the first value in a manner similar to the manner in which the electronic device 101B maintains the display of the user interface element (e.g., 705) when the portion (e.g., 709) of the user of the first electronic device 101a satisfies the one or more criteria in fig. 7B. In some embodiments, in response to detecting that the pose of the predefined portion of the user of the second electronic device meets the one or more criteria, the second electronic device updates the user interface element to be displayed with a visual characteristic having a second value in a manner similar to the manner in which both electronic devices 101a and 101b scroll the user interface element (e.g., 705) in response to the input detected by electronic device 101a in fig. 7C (e.g., via hand 709 or 711). In some embodiments, the second electronic device maintains the user interface element displayed with the visual characteristic having the first value in response to an indication that the pose of the user of the electronic device satisfies the one or more criteria while the user interface element is displayed with the visual characteristic having the first value. In some embodiments, in accordance with a determination that the pose of the user of the electronic device meets the one or more criteria and in accordance with an indication that the pose of the user of the second electronic device meets the one or more criteria, the electronic device displays the user interface element with a visual characteristic having a third value.
The above-described manner of not synchronizing the updating of visual properties of user interface elements across an electronic device provides an efficient way of indicating portions of a user interface with which a user interacts without confusion due to portions with which other users of the user interface are also indicated, which simplifies interactions between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and prolongs battery life of the electronic device by enabling the user to use the electronic device more quickly and efficiently.
In some embodiments, in response to detecting input from a predefined portion (e.g., 709 or 711) of a user of the electronic device, the electronic device 101a displays (836 a) the user interface element (e.g., 705) with a visual characteristic having a third value, such as in fig. 7C (e.g., the third value is different from the first value and the second value). In some embodiments, in response to the input, the electronic device and the second electronic device perform respective operations in accordance with the input.
In some embodiments, in response to an indication of input from a predefined portion of a second user of the second electronic device (e.g., after the second electronic device detects that the predefined portion of the user of the second electronic device meets the one or more criteria), the electronic device 101a displays (836 b) the user interface element with a visual characteristic having a third value, such as, for example, how the electronic device 101b would display the user interface element (e.g., 705) in the same manner as the electronic device 101a displayed the user interface element (e.g., 705) in response to the electronic device 101a detecting user input from a hand (e.g., 709 or 711) of the user of the electronic device 101 a. In some embodiments, in response to an input from the second electronic device, the electronic device and the second electronic device perform respective operations in accordance with the input. In some embodiments, the electronic device displays an indication that the user of the second electronic device has provided input directed to the user interface element, but does not present an indication of the hover state of the user interface element.
The above-described manner of updating user interface elements in response to input, regardless of the device that detected the input, provides an efficient way of indicating the current interaction state of the user interface elements displayed by the two devices, which simplifies interactions between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient (e.g., by clearly indicating what portions of the user interface other users are interacting with), which also reduces power usage and prolongs battery life of the electronic device by enabling the user to more quickly and efficiently use the electronic device, and avoids errors that would subsequently need correction caused by changes in the interaction state of the user interface elements.
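The multi-device behavior described above (ready-state/hover feedback stays local to the device whose user produced it, while committed inputs update the shared element on every device) might be sketched as follows; the DeviceView type and its members are hypothetical and not part of the disclosure:

```swift
// Hypothetical sketch: hover/ready-state feedback is local; committed inputs are shared.
enum SharedElementState { case idle, activated }
enum LocalElementState { case normal, hover }

struct DeviceView {
    var shared: SharedElementState = .idle   // synchronized across devices
    var local: LocalElementState = .normal   // never synchronized

    mutating func localUserPoseMetCriteria() { local = .hover }       // only this device shows it
    mutating func inputCommitted()           { shared = .activated }  // every device shows it
}

var deviceA = DeviceView()
var deviceB = DeviceView()

deviceA.localUserPoseMetCriteria()   // device A shows the hover state; device B is unchanged
deviceA.inputCommitted()             // device A's user commits an input...
deviceB.inputCommitted()             // ...and the shared state is applied on device B as well

print(deviceA.local, deviceA.shared) // hover activated
print(deviceB.local, deviceB.shared) // normal activated
```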
Fig. 9A-9C illustrate an exemplary manner in which the electronic device 101a processes user input based on an attention area associated with a user, according to some embodiments.
Fig. 9A shows that the electronic device 101a displays a three-dimensional environment via the display generating section 120. It should be appreciated that in some embodiments, the electronic device 101a utilizes one or more of the techniques described with reference to fig. 9A-9C in a two-dimensional environment or user interface without departing from the scope of the present disclosure. As described above with reference to fig. 1-6, the electronic device optionally includes a display generation component 120a (e.g., a touch screen) and a plurality of image sensors 314a. The image sensor optionally includes one or more of the following: a visible light camera; an infrared camera; a depth sensor; or any other sensor that the electronic device 101a can use to capture one or more images of the user or a portion of the user when the user interacts with the electronic device 101 a. In some embodiments, the display generating component 120a is a touch screen capable of detecting gestures and movements of the user's hand. In some embodiments, the user interfaces described below may also be implemented on a head-mounted display that includes a display generating component that displays the user interface to the user, as well as sensors that detect the physical environment and/or movement of the user's hand (e.g., external sensors facing outward from the user) and/or sensors that detect the user's gaze (e.g., internal sensors facing inward toward the user).
Fig. 9A shows the electronic device 101a presenting a first selectable option 903, a second selectable option 905, and a representation 904 of a table in the physical environment of the electronic device 101a (e.g., such as the table 604 in fig. 6B) via the display generation component 120 a. In some implementations, the representation 904 of the table is a photorealistic image (e.g., a passthrough video or digital passthrough) of the table generated by the display generating component 120 a. In some embodiments, the representation 904 of the table is a view (e.g., a real or actual passthrough) of the table through the transparent portion of the display generating component 120 a. In some embodiments, the electronic device 101a displays the three-dimensional environment from a point of view associated with a user of the electronic device in the three-dimensional environment.
In some embodiments, the electronic device 101a defines the user's attention area 907 as a conical volume based on the user's gaze 901a in a three-dimensional environment. For example, the attention area 907 is optionally a cone centered on a line defined by the user's line of sight 901a (e.g., a line passing through the location of the user's gaze in the three-dimensional environment and the viewpoint associated with the electronic device 101 a), the cone comprising a volume of the three-dimensional environment within a predetermined angle (e.g., 1, 2, 3, 5, 10, 15, etc.) from the line defined by the user's gaze 901 a. Thus, in some embodiments, the two-dimensional area of the attention area 907 increases according to the distance from the viewpoint associated with the electronic device 101 a. In some embodiments, the electronic device 101a determines the user interface element to which the input is directed and/or whether to respond to the input based on the user's attention area.
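A minimal geometric sketch of such a cone-shaped attention area, assuming the threshold angle is expressed in degrees and measured between the gaze ray and the ray from the viewpoint to the point being tested (the Vector3 type and function names are illustrative only):

```swift
import Foundation

// Hypothetical sketch of the cone-shaped attention area described above: a point is
// inside the area if the angle between the gaze ray and the ray from the viewpoint
// to that point is within a threshold (assumed to be in degrees).
struct Vector3 {
    var x, y, z: Double
    static func - (a: Vector3, b: Vector3) -> Vector3 { Vector3(x: a.x - b.x, y: a.y - b.y, z: a.z - b.z) }
    func dot(_ o: Vector3) -> Double { x * o.x + y * o.y + z * o.z }
    var length: Double { (x * x + y * y + z * z).squareRoot() }
}

func isInAttentionArea(point: Vector3,
                       viewpoint: Vector3,
                       gazeTarget: Vector3,
                       thresholdDegrees: Double = 10) -> Bool {
    let gazeDir = gazeTarget - viewpoint
    let toPoint = point - viewpoint
    guard gazeDir.length > 0, toPoint.length > 0 else { return true }
    let cosAngle = gazeDir.dot(toPoint) / (gazeDir.length * toPoint.length)
    let angle = acos(max(-1, min(1, cosAngle))) * 180 / .pi
    return angle <= thresholdDegrees
}

// The cross-section of the cone grows with distance from the viewpoint,
// so more of the environment falls inside the area farther from the user.
let viewpoint = Vector3(x: 0, y: 0, z: 0)
let gaze = Vector3(x: 0, y: 0, z: -1)
print(isInAttentionArea(point: Vector3(x: 0.1, y: 0, z: -1), viewpoint: viewpoint, gazeTarget: gaze)) // true
print(isInAttentionArea(point: Vector3(x: 1, y: 0, z: -1), viewpoint: viewpoint, gazeTarget: gaze))   // false (about 45 degrees)
```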
As shown in fig. 9A, the first selectable option 903 is within the user's attention area 907 and the second selectable option 905 is outside the user's attention area. As shown in fig. 9A, selectable option 903 may be in attention area 907 even if user's gaze 901a is not directed to selectable option 903. In some embodiments, the selectable option 903 may be in the attention area 907 when the user's gaze is directed to the selectable option 903. Fig. 9A also shows the user's hand 909 in a direct input ready state (e.g., hand state D). In some embodiments, the direct input ready state is the same or similar to the direct input ready state described above with reference to fig. 7A-8K. Further, in some implementations, the direct inputs described herein share one or more characteristics of the direct inputs described with reference to methods 800, 1200, 1400, 1600, 1800, and/or 2000. For example, the user's hand 909 is pointing to the hand shape and within a direct ready state threshold distance (e.g., 0.5, 1, 2, 3, 5, 10, 15, 30, etc. centimeters) of the first selectable option 903. Fig. 9A also shows the user's hand 911 in a direct input ready state. In some embodiments, hand 911 is an alternative to hand 909. In some embodiments, the electronic device 101a is capable of detecting both hands of the user simultaneously (e.g., according to one or more steps of the method 1600). For example, the user's hand 911 is pointing to the hand shape and within the ready state threshold distance of the second selectable option 905.
In some embodiments, the electronic device 101a requires that the user interface element be within the attention area 907 in order to accept input. For example, because the first selectable option 903 is within the user's attention area 907, the electronic device 101a updates the first selectable option 903 to indicate that further input (e.g., from the hand 909) will point to the first selectable option 903. As another example, because the second selectable option 905 is outside of the user's attention area 907, the electronic device 101a foregoes updating the second selectable option 905 to indicate that further input (e.g., from the hand 911) will point to the second selectable option 905. It should be appreciated that although the user's gaze 901a is not directed to the first selectable option 903, the electronic device 101a is still configured to direct input to the first selectable option 903 because the first selectable option 903 is within an attention area 907, which is optionally wider than the user's gaze.
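One possible way to express this gating, reusing a containment test like the one sketched above, is shown below; the names are hypothetical and the example deliberately ignores which specific hand provides the ready state:

```swift
// Hypothetical sketch: a ready-state hand only "arms" a nearby element if that element
// currently falls inside the attention area (containment test as sketched earlier).
struct SelectableOption {
    let name: String
    var indicatesPendingInput = false
}

func armElements(options: inout [SelectableOption],
                 handIsInReadyStateNear: (SelectableOption) -> Bool,
                 isInAttentionArea: (SelectableOption) -> Bool) {
    for i in options.indices {
        // Gaze need not be on the element itself; membership in the (wider) attention area suffices.
        options[i].indicatesPendingInput =
            handIsInReadyStateNear(options[i]) && isInAttentionArea(options[i])
    }
}

var options = [SelectableOption(name: "first"), SelectableOption(name: "second")]
armElements(options: &options,
            handIsInReadyStateNear: { _ in true },               // a hand is in a ready state near both
            isInAttentionArea: { $0.name == "first" })           // but only "first" is inside the area
print(options.map { "\($0.name): \($0.indicatesPendingInput)" }) // ["first: true", "second: false"]
```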
In fig. 9B, the electronic device 101a detects that the user's hand 909 makes a direct selection of the first selectable option 903. In some implementations, the direct selection includes moving the hand 909 to a position touching the first selectable option 903 or within a direct selection threshold (e.g., 0.1, 0.2, 0.3, 0.5, 1, 2, etc. centimeters) of the first selectable option while the hand is in the pointing hand shape. As shown in fig. 9B, when an input is detected, the first selectable option 903 is no longer in the user's attention area 907. In some implementations, the attention area 907 moves because the user's gaze 901b moves. In some embodiments, after the electronic device 101a detects the ready state of the hand 909 illustrated in fig. 9A, the attention area 907 moves to the position illustrated in fig. 9B. In some embodiments, the input shown in fig. 9B is detected before the attention area 907 moves to the position shown in fig. 9B. In some embodiments, the input shown in fig. 9B is detected after the attention area 907 moves to the position shown in fig. 9B. Although the first selectable option 903 is no longer in the user's attention area 907, in some embodiments the electronic device 101a updates the color of the first selectable option 903 in response to the input because the first selectable option 903 was in the attention area 907 during the ready state, as shown in fig. 9A. In some embodiments, in addition to updating the appearance of the first selectable option 903, the electronic device 101a performs an action based on the selection of the first selectable option 903. For example, the electronic device 101a performs operations such as activating/deactivating a setting associated with the option 903, initiating playback of content associated with the option 903, displaying a user interface associated with the option 903, or a different operation associated with the option 903.
In some implementations, the selection input is detected only in response to detecting that the user's hand 909 moves from one side of the first selectable option 903 visible in fig. 9B to a position touching the first selectable option 903 or within a direct selection threshold of the first selectable option. For example, if the user instead reached around the first selectable option 903 to touch the first selectable option 903 from the back of the first selectable option 903 that is not visible in fig. 9B, the electronic device 101a will optionally forgo updating the appearance of the first selectable option 903 and/or forgo performing an action in accordance with the selection.
In some embodiments, in addition to continuing to accept press inputs (e.g., select inputs) that begin when the first selectable option 903 is in the attention area 907 and continue when the first selectable option 903 is not in the attention area 907, the electronic device 101a accepts other types of inputs that begin when the user interface element to which the input is directed is in the attention area, even though the user interface element is no longer in the attention area when the inputs continue. For example, the electronic device 101a can continue the drag input, where the electronic device 101a updates the location of the user interface element in response to the user input even if the drag input continues after the user interface element is outside of the attention area (e.g., and is initiated when the user interface element is within the attention area). As another example, even if the scroll input continues after the user interface element is outside of the attention area 907 (e.g., and is initiated while the user interface element is inside of the attention area), the electronic device 101a can continue the scroll input in response to the user input. As shown in fig. 9A, in some embodiments, if the user interface element is in the attention area when the ready state is detected, the input is accepted even if the user interface element to which the input is directed is outside the attention area for a portion of the input.
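A hypothetical sketch of this behavior: the target is captured when the input is initiated (which requires the element to be inside the attention area), and later portions of the same input are delivered to that captured target without re-checking the area:

```swift
// Hypothetical sketch: the target of an input is "captured" when the input begins
// (while the element is inside the attention area); later portions of the same input
// keep going to that target even if the attention area has since moved away.
struct OngoingInput {
    let targetID: String   // captured at initiation
}

var ongoing: OngoingInput? = nil

func beginInput(toward elementID: String, elementIsInAttentionArea: Bool) -> Bool {
    guard elementIsInAttentionArea else { return false }   // initiation requires the area
    ongoing = OngoingInput(targetID: elementID)
    return true
}

func continueInput(elementIsInAttentionArea: Bool) -> String? {
    // Continuation (e.g., the rest of a press, drag, or scroll) ignores the area check.
    return ongoing?.targetID
}

_ = beginInput(toward: "option903", elementIsInAttentionArea: true)
print(continueInput(elementIsInAttentionArea: false) ?? "none")   // "option903"
```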
Further, in some implementations, after detecting movement of the user's gaze, the attention area 907 remains at its respective location in the three-dimensional environment for a threshold time (e.g., 0.5, 1, 2, 3, 5, etc. seconds). For example, when the user's gaze 901a and the attention area 907 are in the position shown in fig. 9A, the electronic device 101a detects that the user's gaze 901b moves to the position shown in fig. 9B. In this example, the attention area 907 remains in the position shown in fig. 9A for a threshold time before the attention area 907 is moved to the position in fig. 9B in response to the user's gaze 901b moving to the position shown in fig. 9B. Thus, in some embodiments, inputs to user interface elements within the original attention area (e.g., attention area 907 in fig. 9A) initiated after the user's gaze movement are optionally responded to by the electronic device 101a, so long as those inputs are initiated within a threshold time (e.g., 0.5, 1, 2, 3, 5, etc. seconds) of the user's gaze moving to the location in fig. 9B. In some embodiments, the electronic device 101a is not responsive to such inputs initiated after the threshold time has elapsed since the user's gaze moved to the location in fig. 9B.
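This lag could be sketched as a small tracker that only adopts the new gaze location after a grace window has elapsed; the one-second window below is an assumed value chosen from the example range given above, and the string "location" stands in for the area's geometry:

```swift
import Foundation

// Hypothetical sketch: after the gaze moves, the attention area keeps its previous
// location for a short grace window; inputs initiated within that window are still
// evaluated against the old area location.
final class AttentionAreaTracker {
    private(set) var currentCenter: String      // stand-in for the area's location
    private var pendingCenter: String?
    private var gazeMoveTime: Date?
    let graceWindow: TimeInterval               // assumed; the text gives 0.5 to 5 seconds as examples

    init(center: String, graceWindow: TimeInterval = 1.0) {
        self.currentCenter = center
        self.graceWindow = graceWindow
    }

    func gazeMoved(to newCenter: String, at time: Date = Date()) {
        pendingCenter = newCenter
        gazeMoveTime = time
    }

    // The area adopts the new gaze location only after the grace window elapses.
    func center(at time: Date = Date()) -> String {
        if let pending = pendingCenter, let moved = gazeMoveTime,
           time.timeIntervalSince(moved) >= graceWindow {
            currentCenter = pending
            pendingCenter = nil
        }
        return currentCenter
    }
}

let tracker = AttentionAreaTracker(center: "around option 903")
tracker.gazeMoved(to: "away from option 903")
print(tracker.center())                                 // still "around option 903"
print(tracker.center(at: Date().addingTimeInterval(2))) // "away from option 903"
```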
In some implementations, the electronic device 101a cancels the user input if the user moves his hand away from the user interface element to which the input is directed or no further input is provided within a threshold time (e.g., 1, 2, 3, 5, 10, etc. seconds) after the ready state is detected. For example, if the user were to move his hand 909 to the position shown in fig. 9C after the electronic device 101a detected a ready state as shown in fig. 9A, the electronic device 101a would restore the appearance of the first selectable option 903 to no longer indicate that input is directed to the first selectable option 903, and would no longer accept direct input from the hand 909 directed to option 903 (e.g., unless and until a ready state is again detected).
As shown in fig. 9C, the first selectable option 903 is still within the user's attention area 907. The user's hand 909 optionally assumes a hand shape (e.g., the pointing hand shape, hand state D) corresponding to the direct ready state. Because the user's hand 909 has moved a threshold distance (e.g., 1, 2, 3, 5, 10, 15, 20, 30, 50, etc. centimeters) away from the first selectable option 903 and/or moved to a threshold distance (e.g., 1, 2, 3, 5, 10, 15, 20, 30, 50, etc. centimeters) from the first selectable option 903, the electronic device 101a is no longer configured to direct input from the hand 909 to the first selectable option 903. In some implementations, even if the user were to maintain the positioning of the hand 909 shown in fig. 9A, if no input was detected within a threshold period of time (e.g., 1, 2, 3, 5, 10, etc. seconds) in which the hand was positioned and had a shape as in fig. 9A, the electronic device 101a would stop directing further input from the hand to the first user interface element 903. Similarly, in some embodiments, if the user were to begin providing additional input (e.g., in addition to meeting the ready state criteria; e.g., beginning to provide a press input to element 903, but not yet reaching the press distance threshold required to complete the press/select input) and then move the hand a threshold distance away from the first selectable option 903 and/or move the hand to a threshold distance from the first selectable option 903, the electronic device 101a would cancel the input. It should be appreciated that if input is initiated while the first selectable option 903 is in the user's attention area 907 as described above with reference to fig. 9B, the electronic device 101a optionally does not cancel the input in response to detecting that the user's gaze 901b or the user's attention area 907 is moving away from the first selectable option 903.
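A simplified sketch of these cancellation rules (distance-based and timeout-based), with assumed threshold values taken from the example ranges above:

```swift
import Foundation

// Hypothetical sketch: an armed (ready-state) element is disarmed if the hand moves
// beyond a threshold distance from it, or if no further input arrives within a timeout.
struct ArmedTarget {
    let elementID: String
    let armedAt: Date
    let timeout: TimeInterval      // assumed; the text gives 1 to 10 seconds as examples
    let cancelDistance: Double     // meters; assumed; the text gives 1 to 50 centimeters as examples
}

func stillArmed(_ target: ArmedTarget, handDistanceMeters: Double, now: Date) -> Bool {
    if handDistanceMeters > target.cancelDistance { return false }              // hand moved away
    if now.timeIntervalSince(target.armedAt) > target.timeout { return false }  // timed out
    return true
}

let target = ArmedTarget(elementID: "option903", armedAt: Date(), timeout: 3, cancelDistance: 0.15)
print(stillArmed(target, handDistanceMeters: 0.05, now: Date()))                        // true
print(stillArmed(target, handDistanceMeters: 0.40, now: Date()))                        // false (moved away)
print(stillArmed(target, handDistanceMeters: 0.05, now: Date().addingTimeInterval(10))) // false (timed out)
```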
Although fig. 9A-9C illustrate examples of determining whether to accept direct input directed to a user interface element based on the user's attention area 907, it should be appreciated that the electronic device 101a can similarly determine whether to accept indirect input directed to a user interface element based on the user's attention area 907. For example, the various results shown and described with reference to fig. 9A-9C will also optionally be applicable to indirect input (e.g., as described with reference to methods 800, 1200, 1400, 1800, etc.). In some embodiments, the attention area is not required for accepting direct inputs, but is required for accepting indirect inputs.
Fig. 10A-10H are flowcharts illustrating a method 1000 of processing user input based on an attention area associated with a user, according to some embodiments. In some embodiments, the method 1000 is performed at a computer system (e.g., computer system 101 in fig. 1, such as a tablet, smart phone, wearable computer, or head-mounted device) that includes a display generating component (e.g., display generating component 120 in fig. 1, 3, and 4) (e.g., heads-up display, touch screen, projector, etc.) and one or more cameras (e.g., cameras pointing downward toward the user's hand (e.g., color sensors, infrared sensors, and other depth sensing cameras) or cameras pointing forward from the user's head). In some embodiments, method 1000 is managed by instructions stored in a non-transitory computer readable storage medium and executed by one or more processors of a computer system, such as one or more processors 202 of computer system 101 (e.g., control unit 110 in fig. 1A). Some operations in method 1000 are optionally combined and/or the order of some operations is optionally changed.
In some embodiments, the method 1000 is performed at an electronic device 101a (e.g., a mobile device (e.g., a tablet, a smartphone, a media player, or a wearable device)) or a computer in communication with a display generating component and one or more input devices. Examples of input devices include a touch screen, a mouse (e.g., external), a touch pad (optionally integrated or external), a remote control device (e.g., external), another mobile device (e.g., separate from the electronic device), a handheld device (e.g., external), a controller (e.g., external), a camera, a depth sensor, an eye tracking device, and/or a motion sensor (e.g., a hand tracking device, a hand motion sensor), etc. In some embodiments, the electronic device is in communication with a hand tracking device (e.g., one or more cameras, depth sensors, proximity sensors, touch sensors (e.g., touch screen, touch pad)). In some embodiments, the hand tracking device is a wearable device, such as a smart glove. In some embodiments, the hand tracking device is a handheld input device, such as a remote control or a stylus.
In some embodiments, such as in fig. 9A, the electronic device 101a displays (1002 a) the first user interface element (e.g., 903, 905) via the display generation component 120 a. In some implementations, the first user interface element is an interactive user interface element, and in response to detecting input directed to the first user interface element, the electronic device performs an action associated with the first user interface element. For example, the first user interface element is a selectable option that, when selected, causes the electronic device to perform an action, such as displaying a corresponding user interface, changing a setting of the electronic device, or initiating playback of content. As another example, the first user interface element is a container (e.g., a window) in which the user interface/content is displayed, and in response to detecting a selection of the first user interface element and a subsequent movement input, the electronic device updates the location of the first user interface element in accordance with the movement input. In some embodiments, the user interface and/or user interface elements are displayed in (e.g., the user interface is a three-dimensional environment and/or is displayed within) a three-dimensional environment (e.g., a computer-generated reality (CGR) environment, such as a Virtual Reality (VR) environment, a Mixed Reality (MR) environment, or an Augmented Reality (AR) environment, etc.) that is generated by, displayed by, or otherwise made viewable by the device.
In some implementations, such as in fig. 9B, upon display of the first user interface element (e.g., 909), the electronic device 101a detects (1002B) a first input to the first user interface element (e.g., 909) via the one or more input devices. In some implementations, detecting the first user input includes detecting, via the hand tracking device, a pinch gesture by the user performing a predetermined gesture (e.g., wherein the user touches the thumb to another finger (e.g., index finger, middle finger, ring finger, little finger) on the same hand as the thumb). In some embodiments, detecting the input includes detecting that the user performs a pointing gesture in which one or more fingers are extended and one or more fingers are curled toward the user's palm and moves their hands away from the user's torso by a predetermined distance (e.g., 2, 5, 10, etc. centimeters) in a pressing or pushing motion. In some implementations, the pointing gesture and the pushing motion are detected when the user's hand is within a threshold distance (e.g., 1, 2, 3, 5, 10, etc. centimeters) of the first user interface element in the three-dimensional environment. In some embodiments, the three-dimensional environment includes a representation of the virtual object and the user. In some embodiments, the three-dimensional environment includes a representation of the user's hand (which may be a photorealistic representation of the hand), a passthrough video of the user's hand, or a view of the user's hand through a transparent portion of the display generating component. In some implementations, the input is a direct or indirect interaction with a user interface element, such as described with reference to methods 800, 1200, 1400, 1600, 1800, and/or 2000.
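For illustration, the pinch and pointing hand shapes mentioned above might be classified from a handful of tracked-hand measurements as follows; the TrackedHand fields and thresholds are assumptions, not values from the disclosure:

```swift
// Hypothetical sketch of classifying the tracked hand shapes mentioned above from a
// few joint positions: a pinch when thumb and index fingertips nearly touch, a point
// when the index finger is extended and the remaining fingers are curled.
struct TrackedHand {
    var thumbToIndexTipDistance: Double   // meters
    var indexExtended: Bool
    var otherFingersCurled: Bool
}

enum HandShape { case pinch, point, other }

func classify(_ hand: TrackedHand, pinchThreshold: Double = 0.01) -> HandShape {
    if hand.thumbToIndexTipDistance < pinchThreshold { return .pinch }
    if hand.indexExtended && hand.otherFingersCurled { return .point }
    return .other
}

print(classify(TrackedHand(thumbToIndexTipDistance: 0.005, indexExtended: false, otherFingersCurled: true))) // pinch
print(classify(TrackedHand(thumbToIndexTipDistance: 0.08, indexExtended: true, otherFingersCurled: true)))   // point
```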
In some implementations, in response to detecting a first input (1002 c) directed to a first user interface element (e.g., 903), in accordance with a determination that the first user interface element (e.g., 903) is within an attention area (e.g., 907) associated with a user of the electronic device 101a, such as in fig. 9A (e.g., when the first input is detected), the electronic device 101a performs (1002 d) a first operation corresponding to the first user interface element (e.g., 903). In some implementations, the attention area includes an area of the three-dimensional environment that is within a predetermined threshold distance (e.g., 5, 10, 30, 50, 100, etc. centimeters) and/or threshold angle (e.g., 5, 10, 15, 20, 30, 45, etc. degrees) of the location in the three-dimensional environment at which the user gaze is directed. In some implementations, the attention area includes an area of the three-dimensional environment between a location in the three-dimensional environment at which the user's gaze is directed and one or more physical features of the user (e.g., the user's hand, arm, shoulder, torso, etc.). In some embodiments, the attention area is a three-dimensional region of a three-dimensional environment. For example, the attention area is conical, the tip of the cone corresponding to the user's eyes/viewpoint, the base of the cone corresponding to the area of the three-dimensional environment at which the user's gaze is directed. In some implementations, the first user interface element is within an attention area associated with the user when the user gaze is directed at the first user interface element and/or when the first user interface element falls within a conical volume of the attention area. In some embodiments, the first operation is one of the following operations: make selections, activate settings of the electronic device, initiate a process of moving a virtual object within the three-dimensional environment, display a new user interface that is not currently displayed, play content items, save files, initiate communication (e.g., phone call, email, message) with another user, and/or scroll through the user interface. In some implementations, the first input is detected by detecting a pose and/or movement of a predefined portion of the user. For example, the electronic device detects a position in which a user moves his finger to within a threshold distance (e.g., 0.1, 0.3, 0.5, 1, 3, 5, etc. centimeters) of a first user interface element in a three-dimensional environment, with the user's hand/finger in a pose corresponding to the index finger of the hand pointing outward and the other fingers curling toward the hand.
In some implementations, such as in fig. 9A, in response to detecting a first input (1002 c) directed to a first user interface element (e.g., 905), in accordance with a determination that the first user interface element (e.g., 905) is not within an attention area associated with the user (e.g., when the first input is detected), the electronic device 101a foregoes (1002 e) performing the first operation. In some implementations, the first user interface element is not within the attention area associated with the user if the user gaze is directed to a user interface element other than the first user interface element and/or if the first user interface element does not fall within the conical volume of the attention area.
The above-described manner of performing or not performing the first operation depending on whether the first user interface element is within an attention area associated with the user provides an efficient way of reducing accidental user input, which simplifies interactions between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and extends battery life of the electronic device by enabling the user to use the electronic device more quickly and efficiently while reducing errors in use.
In some implementations, the first input to the first user interface element (e.g., 903) is an indirect input (1004 a) to the first user interface element (e.g., 903 in fig. 9C). In some implementations, the indirect input is input provided by a predefined portion of the user (e.g., the user's hand, finger, arm, etc.) when the predefined portion is greater than a threshold distance (e.g., 0.2, 1, 2, 3, 5, 10, 30, 50, etc. centimeters) from the first user interface element. In some embodiments, the indirect input is similar to the indirect input discussed with reference to methods 800, 1200, 1400, 1600, 1800, and/or 2000.
In some implementations, such as in fig. 9B, while the first user interface element (e.g., 905) is displayed, the electronic device 101a detects (1004B) a second input via the one or more input devices, where the second input corresponds to a direct input directed to the respective user interface element (e.g., 903). In some embodiments, the direct input is similar to the direct input discussed with reference to methods 800, 1200, 1400, 1600, 1800, and/or 2000. In some implementations, the direct input is provided by a predefined portion of the user (e.g., a hand, a finger, an arm) when the predefined portion is less than a threshold distance (e.g., 0.2, 1, 2, 3, 5, 10, 30, 50, etc. centimeters) from the first user interface element. In some embodiments, detecting the direct input includes detecting that the user is performing a predefined gesture with their hand (e.g., a press gesture in which the user moves the extended finger to the position of the corresponding user interface element while the other fingers curl toward the palm of the hand) after detecting the ready state of the hand (e.g., a pointing hand shape in which one or more fingers are extended and one or more fingers curl toward the palm). In some embodiments, the ready state is detected according to one or more steps of method 800.
In some embodiments, such as in fig. 9B, in response to detecting the second input, the electronic device 101a performs (1004 c) an operation associated with the respective user interface element (e.g., 903) regardless of whether the respective user interface element is within an attention area (e.g., 907) associated with the user (e.g., because the input is a direct input). In some implementations, if an indirect input is detected while the user's gaze is directed to the first user interface element, the electronic device performs an operation associated with the first user interface element only in response to the indirect input. In some implementations, the electronic device performs an operation associated with a user interface element in an attention area of the user in response to the direct input, regardless of whether the user's gaze is directed at the user interface element when the direct input is detected.
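One reading of this paragraph, together with the note above that some embodiments require the attention area only for indirect inputs, is a dispatch of the following form; it is a sketch of that particular embodiment, not of every embodiment described here:

```swift
// Hypothetical sketch: indirect inputs are honored only when the user's gaze/attention
// is on the element, while direct inputs (hand within a small distance of the element)
// are honored regardless of the attention area.
enum InputKind { case direct, indirect }

func shouldPerformOperation(kind: InputKind,
                            elementIsInAttentionArea: Bool,
                            gazeIsOnElement: Bool) -> Bool {
    switch kind {
    case .direct:
        return true     // proximity alone determines the target in this embodiment
    case .indirect:
        return elementIsInAttentionArea || gazeIsOnElement
    }
}

print(shouldPerformOperation(kind: .direct, elementIsInAttentionArea: false, gazeIsOnElement: false))   // true
print(shouldPerformOperation(kind: .indirect, elementIsInAttentionArea: false, gazeIsOnElement: false)) // false
```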
The above-described manner of forgoing performing the second operation in response to detecting an indirect input when the user's gaze is not directed at the first user interface element provides a manner of reducing or preventing performing operations that are not desired by the user, which simplifies interactions between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and extends battery life of the electronic device by enabling the user to use the electronic device more quickly and efficiently while reducing errors in use.
In some implementations, such as in fig. 9B, the attention area (e.g., 907) associated with the user is based on the direction (and/or location) of the gaze (e.g., 901B) of the user of the electronic device (1006 a). In some embodiments, the attention area is defined as a conical volume (e.g., extending outward from a point at the user's point of view into the three-dimensional environment) that includes the point in the three-dimensional environment that the user is looking at and the location between the point in the three-dimensional environment that the user is looking at and the user within a predetermined threshold angle (e.g., 5, 10, 15, 20, 30, 45, etc.) of the user's gaze. In some embodiments, the attention area is based on the orientation of the user's head in addition to or instead of based on the user's gaze. For example, the attention area is defined to include a conical volume including a position in the three-dimensional environment within a predetermined threshold angle (e.g., 5, 10, 15, 20, 30, 45, etc.) of a line perpendicular to the user's face. As another example, the attention area is a cone centered on an average of a line extending from the user's gaze and a line perpendicular to the user's face, or a union of a cone centered on the user's gaze and a cone centered on a line perpendicular to the user's face.
The above-described manner of directing the attention area based on the orientation of the user's gaze provides an efficient way of directing the direction of the user's input based on the gaze without additional input, which simplifies interaction between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and extends battery life of the electronic device by enabling the user to use the electronic device more quickly and efficiently while reducing errors in use.
In some implementations, when the first user interface element (e.g., 903) is within an attention area (e.g., 907) associated with the user, such as in fig. 9A, the electronic device 101a detects (1008 a) that one or more criteria for moving the attention area (e.g., 903) to a location where the first user interface element (e.g., 903) is not within the attention area are met. In some implementations, the attention area is based on the user's gaze and the one or more criteria are met when the user's gaze moves to a new location such that the first user interface element is no longer in the attention area. For example, the attention area includes an area of the user interface that is within 10 degrees of a line of user gaze, and the user gaze moves to a position such that the first user interface element is greater than 10 degrees from the line of user gaze.
In some embodiments, such as in fig. 9B, after detecting that the one or more criteria are met (1008B), the electronic device 101a detects (1008 c) a second input directed to the first user interface element (e.g., 903). In some implementations, the second input is a direct input, wherein the user's hand is within a threshold distance (e.g., 0.2, 1, 2, 3, 5, 10, 30, 50, etc. centimeters) of the first user interface element.
In some embodiments, after detecting satisfaction of the one or more criteria (1008B), such as in fig. 9B, in response to detecting a second input (1008 d) directed to the first user interface element (e.g., 903), in accordance with a determination that the second input was detected within a respective time threshold (e.g., 0.01, 0.02, 0.05, 0.1, 0.2, 0.3, 0.5, 1, etc. seconds) for the one or more criteria to be satisfied, the electronic device 101a performs (1008 e) a second operation corresponding to the first user interface element (e.g., 903). In some embodiments, the user's attention area does not move until a time threshold (e.g., 0.01, 0.02, 0.05, 0.1, 0.2, 0.3, 0.5, 1, etc. seconds) has elapsed since the one or more criteria were met.
In some embodiments, after detecting satisfaction of the one or more criteria (1008B), such as in fig. 9B, in response to detecting a second input (1008 d) directed to the first user interface element (e.g., 903), in accordance with a determination that the second input was detected after a respective time threshold (e.g., 0.01, 0.02, 0.05, 0.1, 0.2, 0.3, 0.5, 1, etc. seconds) at which the one or more criteria were satisfied, the electronic device 101a forgoes performing (1008 f) the second operation. In some implementations, the electronic device updates the location of the attention area associated with the user (e.g., based on the new gaze location of the user) once a time threshold (e.g., 0.01, 0.02, 0.05, 0.1, 0.2, 0.3, 0.5, 1, etc. seconds) has elapsed since the one or more criteria for moving the attention area were met. In some implementations, the electronic device gradually moves the attention area within the time threshold and initiates movement with or without a time delay after detecting the user gaze movement. In some implementations, the electronic device foregoes performing the second operation in response to an input detected when the first user interface element is not in the user's attention area.
The above-described manner of performing the second operation in response to the second input, in response to the second input being received within the time threshold for moving the attention area being met, provides an efficient manner of accepting the user input without requiring the user to maintain his gaze for the duration of the input and avoiding accidental input by preventing activation of the user interface element once the predetermined time threshold has passed after the attention area has moved, which simplifies interactions between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and prolongs battery life of the electronic device by enabling the user to more quickly and efficiently use the electronic device while reducing errors in use.
In some embodiments, such as in fig. 9A-9B, the first input includes a first portion followed by a second portion (1010 a). In some implementations, detecting the first portion of the input includes detecting a ready state of a predefined portion of the user, as described with reference to method 800. In some implementations, in response to a first portion of the input, the electronic device moves an input focus to a respective user interface element. For example, the electronic device updates the appearance of the respective user interface element to indicate that the input focus points to the respective user interface element. In some implementations, the second portion of the input is a selection input. For example, a first portion of the input includes detecting that the user's hand is within a first threshold distance (e.g., 3, 5, 10, 15, etc. centimeters) of the corresponding user interface element while making a predefined hand shape (e.g., a pointing hand shape in which one or more fingers are extended and one or more fingers are curled toward the palm), and a second portion of the input includes detecting that the user's hand is within a second lower threshold distance (e.g., touching, 0.1, 0.3, 0.5, 1, 2, etc. centimeters) of the corresponding user interface element while maintaining the pointing hand shape.
In some implementations, such as in fig. 9A, upon detecting a first input (1010 b), the electronic device 101a detects (1010 c) a first portion of the first input when the first user interface element (e.g., 903) is within the attention area (e.g., 907).
In some implementations, such as in fig. 9A, upon detecting a first input (1010 b), in response to detecting a first portion of the first input, the electronic device 101a performs (1010 d) a first portion of a first operation corresponding to a first user interface element (e.g., 903). In some implementations, the first portion of the first operation includes identifying the first user interface element as having an input focus of the electronic device and/or updating an appearance of the first user interface element to indicate that the input focus is directed toward the first user interface element. For example, in response to detecting that the user has pre-pinched the hand shape within a threshold distance (e.g., 1, 2, 3, 5, 10, etc. centimeters) of the first user interface element, the electronic device changes the color of the first user interface element to indicate that the input focus is directed toward the first user interface element (e.g., similar to a cursor "hovering" over the user interface element). In some implementations, the first portion of the input includes a selection of scrollable content in the user interface and a first portion of movement of a predefined portion of the user. In some implementations, the electronic device scrolls the scrollable content a first amount in response to a first portion of movement of a predefined portion of the user.
In some implementations, such as in fig. 9B, upon detecting the first input (1010B), the electronic device 101a detects (1010 e) a second portion of the first input when the first user interface element (e.g., 903) is outside of the attention area. In some implementations, after detecting the first portion of the first input and before detecting the second portion of the second input, the electronic device detects that the attention area no longer includes the first user interface element. For example, the electronic device detects that the user's gaze is directed toward a portion of the user interface such that the first user interface element is outside a distance threshold or an angle threshold of the user's attention area. For example, when the attention area does not include the first user interface element, the electronic device detects that the user makes a pinch hand shape within a threshold distance (e.g., 1, 2, 3, 5, 10, etc. centimeters) of the first user interface element. In some embodiments, the second portion of the first input includes a continuation of the movement of the predefined portion of the user. In some implementations, the electronic device continues to scroll the scrollable content in response to a continuation of the movement of the predefined portion of the user. In some embodiments, the second portion of the first input is detected after a threshold time has elapsed since the first portion of the input was detected (e.g., a threshold time after the ready state of the input was detected for which the input must be detected to cause an action as described above).
In some implementations, such as in fig. 9B, upon detecting the first input (1010B), the electronic device 101a performs (1010 f) a second portion of the first operation corresponding to the first user interface element (e.g., 903) in response to detecting a second portion of the first input. In some implementations, the second portion of the first operation is an operation performed in response to detecting a selection of the first user interface element. For example, if the first user interface element is an option to initiate playback of the content item, the electronic device initiates playback of the content item in response to detecting the second portion of the first operation. In some implementations, the electronic device performs the operation in response to detecting the second portion of the first input after a threshold time has elapsed since the first portion of the input was detected (e.g., a threshold time after the ready state of the input was detected that the input must be detected to cause the action as described above).
The above-described manner of performing the second portion of the first operation corresponding to the first user interface element in response to detecting the second portion of the input when the first user interface element is outside of the attention area provides an efficient manner of performing the operation in response to the input starting when the first user interface element is in the attention area even if the attention area moves away from the first user interface element before the input is completed, which simplifies interactions between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power use and extends battery life of the electronic device by enabling the user to more quickly and efficiently use the electronic device while reducing errors in use.
In some embodiments, such as in fig. 9A-9B, the first input corresponds to a press input, a first portion of the first input corresponds to initiation of the press input, and a second portion of the first input corresponds to continuation of the press input (1012 a). In some embodiments, detecting a press input includes detecting that a user has made a predetermined shape in his hand (e.g., a pointing shape in which one or more fingers are extended and one or more fingers are curled toward the palm of the hand). In some implementations, detecting initiation of the press input includes detecting that a user has made a predetermined shape with his hand when the hand or a portion of the hand (e.g., the tip of one of the extended fingers) is within a first threshold distance (e.g., 3, 5, 10, 15, 30, etc. centimeters) of the first user interface element. In some implementations, detecting continuation of the press input includes detecting that the user has made a predetermined shape with his hand while the hand or a portion of the hand (e.g., the tip of one of the extended fingers) is within a second threshold distance (e.g., 0.1, 0.5, 1, 2, etc. centimeters) of the first user interface element. In some implementations, the electronic device performs a second operation corresponding to the first user interface element in response to detecting initiation of the press input while the first user interface element is within the attention area and subsequent continuation of the press input (either while the first user interface element is within the attention area or not at that time). In some implementations, in response to a first portion of the press input, the electronic device pushes the user interface element away from the user less than an entire amount required to cause an action according to the press input. In some implementations, in response to the second portion of the press input, the electronic device continues to push the user interface element to the full amount required to cause the action, and in response, performs the action in accordance with the press input.
The above-described manner of performing the second operation in response to detecting the initiation of the press input while the first user interface element is in the attention area and subsequent continuation of the press input provides an efficient way of detecting user input with the hand tracking device (and optionally the eye tracking device) without the need for an additional input device, which simplifies interaction between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and prolongs battery life of the electronic device by enabling the user to use the electronic device more quickly and efficiently while reducing errors in use.
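A hypothetical sketch of the two-portion press input described above, with an outer initiation threshold and a much smaller commit threshold (the specific distances are assumed values within the example ranges):

```swift
// Hypothetical sketch: the press is initiated when a pointing hand comes within an outer
// threshold of the element, and continues/commits when it reaches a smaller inner threshold.
struct PressThresholds {
    var initiation: Double = 0.10   // assumed 10 cm; the text gives 3 to 30 centimeters as examples
    var commit: Double = 0.01       // assumed 1 cm; the text gives 0.1 to 2 centimeters as examples
}

enum PressPhase { case none, initiated, committed }

func pressPhase(handIsPointing: Bool, distanceToElement: Double,
                thresholds: PressThresholds = PressThresholds()) -> PressPhase {
    guard handIsPointing else { return .none }
    if distanceToElement <= thresholds.commit { return .committed }
    if distanceToElement <= thresholds.initiation { return .initiated }
    return .none
}

// The operation is performed when initiation happened while the element was in the
// attention area and the commit phase follows, whether or not the area still covers it.
print(pressPhase(handIsPointing: true, distanceToElement: 0.08))   // initiated
print(pressPhase(handIsPointing: true, distanceToElement: 0.005))  // committed
```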
In some implementations, the first input corresponds to a drag input, the first portion of the first input corresponds to initiation of the drag input, and the second portion of the first input corresponds to continuation of the drag input (1014 a). For example, if the user were to move hand 909 upon selection of user interface element 903 in fig. 9B, the input would be a drag input. In some implementations, the drag input includes a selection of a user interface element, a movement input, and an end of the drag input (e.g., release of the selection input, similar to canceling a click on a mouse or lifting a finger off a touch sensor panel (e.g., a touchpad, a touch screen)). In some implementations, the initiation of the drag input includes selection of a user interface element to which the drag input is to be directed. For example, the electronic device selects the user interface element in response to detecting that the user makes a pinch hand shape when the hand is within a threshold distance (e.g., 1, 2, 5, 10, 15, 30, etc. centimeters) of the user interface element. In some implementations, continuation of the drag input includes movement input while maintaining the selection. For example, the electronic device detects that the user maintains the pinch hand shape while moving the hand, and moves the user interface element according to the movement of the hand. In some implementations, the continuation of the drag input includes an end of the drag input. For example, the electronic device detects that the user stops making pinch hand shapes, such as by moving the thumb away from the finger. In some implementations, the electronic device performs operations (e.g., moving the first user interface element, scrolling the first user interface element, etc.) in response to the drag input, in response to detecting selection of the first user interface element while the first user interface element is in the attention area, and in response to detecting an end of the movement input and/or drag input while the first user interface element is in the attention area or not. In some implementations, the first portion of the input includes a selection of a user interface element and a portion of the movement of a predefined portion of the user. In some embodiments, in response to the first portion of the input, the electronic device moves the user interface element by a first amount in accordance with an amount of movement of a predefined portion of the electronic device in the first portion of the input. In some embodiments, the second portion of the input includes continued movement of the predefined portion of the user. In some implementations, in response to the second portion of the input, the electronic device continues to move the user interface element by an amount in accordance with movement of a predefined portion of the user in the second portion of the user input.
The above-described manner of performing an operation in response to detecting initiation of a drag input while the first user interface element is in the attention area and detecting continuation of the drag input while the first user interface element is not in the attention area provides an efficient manner of performing an operation in response to a drag input initiated while the first user interface element is in the attention area even if the attention area moves away from the first user interface element before the drag input is completed, which simplifies interactions between a user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power use and prolongs battery life of the electronic device by enabling the user to more quickly and efficiently use the electronic device while reducing errors in use.
In some embodiments, such as in fig. 9A-9B, the first input corresponds to a selection input, a first portion of the first input corresponds to initiation of the selection input, and a second portion of the first input corresponds to continuation of the selection input (1016 a). In some implementations, the selection input includes detecting that an input focus is directed to the first user interface element, detecting initiation of a request to select the first user interface element, and detecting an end of the request to select the first user interface element. In some implementations, in response to detecting that the user's hand in a ready state is pointing at the first user interface element according to method 800, the electronic device points the input focus at the first user interface element. In some implementations, the request to direct the input focus to the first user interface element is similar to a cursor hover. For example, the electronic device detects that the user has made a pointing hand shape when the hand is within a threshold distance (e.g., 1, 2, 3, 5, 10, 15, 30, etc. centimeters) of the first user interface element. In some implementations, the initiation of the request to select the first user interface element includes detecting a selection input similar to a click of a mouse or a downward touch on a sensor panel. For example, the electronic device detects that the user maintains the pointing hand shape while the hand is within a second threshold distance (e.g., 0.1, 0.2, 0.3, 0.5, 1, etc. centimeters) of the first user interface element. In some implementations, the end of the request to select a user interface element is similar to releasing a click of a mouse or lifting off a touch sensor panel. For example, the electronic device detects that the user has moved his or her hand away from the first user interface element by at least the second threshold distance (e.g., 0.1, 0.2, 0.3, 0.5, 1, etc. centimeters). In some implementations, the electronic device performs the selection operation in response to detecting that the input focus is directed to the first user interface element while the first user interface element is in the attention area, and detecting the initiation and the end of the request to select the first user interface element regardless of whether the first user interface element is in the attention area at that time.
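As a hedged illustration of the selection-input phases described above (hover, press, release), the sketch below uses assumed threshold values and hypothetical names (SelectionRecognizer, hoverDistance, pressDistance); it is not an implementation from this disclosure.

```swift
// Hypothetical sketch of a hover/press/release selection recognizer.
enum SelectionPhase { case none, hover, pressed }

struct SelectionRecognizer {
    var phase: SelectionPhase = .none
    let hoverDistance: Double = 0.05    // e.g., 5 cm: input focus, like a cursor hover
    let pressDistance: Double = 0.005   // e.g., 0.5 cm: analogous to mouse-button down

    // Returns true when a full selection (press then release) has been detected.
    mutating func update(isPointingHandShape: Bool,
                         handToElementDistance: Double,
                         elementInAttentionAreaAtHover: Bool) -> Bool {
        guard isPointingHandShape else { phase = .none; return false }
        switch phase {
        case .none:
            // Hover (input focus) requires the element to be in the attention area.
            if handToElementDistance <= hoverDistance && elementInAttentionAreaAtHover {
                phase = .hover
            }
        case .hover:
            if handToElementDistance <= pressDistance { phase = .pressed }
        case .pressed:
            // Release: hand retreats past the press threshold -> selection completes,
            // regardless of where the attention area is now.
            if handToElementDistance > pressDistance { phase = .none; return true }
        }
        return false
    }
}
```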
The above-described manner of performing an operation in response to detecting initiation of a selection input while a first user interface element is in an attention area, regardless of whether continuation of the selection input is detected while the first user interface element is in the attention area, provides an efficient manner of performing an operation in response to a selection input that begins while the first user interface element is in the attention area even if the attention area moves away from the first user interface element before the selection input is completed, which simplifies interactions between a user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and prolongs battery life of the electronic device by enabling the user to more quickly and efficiently use the electronic device while reducing errors in use.
In some embodiments, such as in fig. 9A, detecting the first portion of the first input includes detecting that the predefined portion of the user (e.g., 909) has a respective pose (e.g., a hand shape including a pointing hand shape in which one or more fingers are extended and one or more fingers are curled toward the palm, such as the ready state described with reference to method 800) and is within a respective distance (e.g., 1, 2, 3, 5, 10, 15, 30, etc. centimeters) of a location corresponding to the first user interface element (e.g., 903) without detecting movement of the predefined portion of the user (e.g., 909), and detecting the second portion of the first input includes detecting movement of the predefined portion of the user (e.g., 909), such as in fig. 9B (1018 a). In some implementations, detecting that the predefined portion of the user has a respective pose and is within a respective distance of the first user interface element includes detecting a ready state according to one or more steps of method 800. In some implementations, movement of the predefined portion of the user includes movement from a respective pose to a second pose associated with selection of the user interface element and/or movement from a respective distance to a second distance associated with selection of the user interface element. For example, making the pointing hand shape within a respective distance of the first user interface element is a first portion of the first input, and maintaining the pointing hand shape while moving the hand to a second distance (e.g., within 0.1, 0.2, 0.3, 0.5, 1, etc. centimeters) from the first user interface element is a second portion of the first input. As another example, making a pre-pinch hand shape in which a thumb of the hand is within a threshold distance (e.g., 0.1, 0.2, 0.3, 0.5, 1, 2, 3, etc. centimeters) of another finger on the hand is the first portion of the first input, and movement of the hand from the pre-pinch shape to a pinch shape in which the thumb is touching the other finger is detected as the second portion of the first input. In some implementations, the electronic device detects further movement of the hand after the second portion of the input, such as movement of the hand corresponding to a request to drag or scroll the first user interface element. In some implementations, the electronic device performs the operation in response to detecting that the predefined portion of the user is within a respective distance of the first user interface element when the first user interface element is in an attention area associated with the user and then detecting movement of the predefined portion of the user when the first user interface element is in the attention area or not at that time.
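The following sketch illustrates, under assumed threshold values, how the poses mentioned above (pointing, pre-pinch, pinch) might be classified from simple hand measurements, and how the pre-pinch-to-pinch transition could mark the second portion of the input; all names and numbers are hypothetical.

```swift
// Hypothetical pose classification from a thumb-to-index distance.
enum HandPose { case pinch, prePinch, pointing, other }

func classify(thumbToIndexDistance: Double,
              indexExtended: Bool,
              otherFingersCurled: Bool) -> HandPose {
    if thumbToIndexDistance <= 0.005 { return .pinch }          // thumb touching finger
    if thumbToIndexDistance <= 0.03  { return .prePinch }       // e.g., within ~3 cm
    if indexExtended && otherFingersCurled { return .pointing } // ready-state pointing shape
    return .other
}

// First portion of the input: pre-pinch pose near the element;
// second portion: the transition from that pose to a pinch.
func isSecondPortion(previous: HandPose, current: HandPose) -> Bool {
    previous == .prePinch && current == .pinch
}
```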
The above-described manner of performing an operation in response to detecting a respective pose of a predefined portion of a user within a respective distance of a first user interface element while the first user interface element is in an attention area and subsequently detecting movement of the predefined portion of the user while the first user interface element is in the attention area or not at that time provides an efficient manner of performing an operation in response to an input that begins while the first user interface element is in the attention area even though the attention area moves away from the first user interface element before the input is completed, which simplifies interactions between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and prolongs battery life of the electronic device by enabling the user to more quickly and efficiently use the electronic device while reducing errors in use.
In some embodiments, such as in fig. 9B, the first input is provided by a predefined portion (e.g., 909) of the user (e.g., a finger, hand, arm, or head of the user), and detecting the first input includes detecting that the predefined portion (e.g., 909) of the user is within a distance threshold (e.g., 1, 2, 3, 5, 10, 15, 30, etc. centimeters) of a location corresponding to the first user interface element (e.g., 903) (1020 a).
In some implementations, such as in fig. 9C, upon detecting a first input directed to a first user interface element (e.g., 903) and prior to performing a first operation, the electronic device 101a detects (1020 b) via the one or more input devices that a predefined portion (e.g., 909) of the user moves to a distance greater than a distance threshold from a location corresponding to the first user interface element (e.g., 903).
In some implementations, such as in fig. 9C, in response to detecting that the predefined portion (e.g., 909) moves to a distance greater than a distance threshold from a location corresponding to the first user interface element (e.g., 903), the electronic device 101a foregoes (1020c) performing the first operation corresponding to the first user interface element (e.g., 903). In some embodiments, after detecting that the user begins providing input directed to the first user interface element, the user is thus able to cancel performance of the first operation corresponding to that input by moving the predefined portion of the user beyond the threshold distance from the location corresponding to the user interface element before the input is completed. In some embodiments, the electronic device foregoes performing the first operation in response to the user moving the predefined portion of the user away from the location corresponding to the first user interface element by at least the distance threshold, even when, while the predefined portion of the user was within the distance threshold of the location corresponding to the first user interface element, the user had performed one or more portions of the first input without performing all of the first input. For example, the selection input includes detecting that the user has made a pre-pinch hand shape (e.g., a hand shape with the thumb within a threshold distance (e.g., 0.1, 0.2, 0.5, 1, 2, 3, etc. centimeters) of a finger), then a pinch hand shape (e.g., the thumb touching the finger), then an end of the pinch hand shape (e.g., the thumb no longer touching the finger, such as the thumb being at least 0.1, 0.2, 0.5, 1, 2, 3, etc. centimeters from the finger). In this example, even if the hand is within the threshold distance when the pre-pinch hand shape and/or pinch hand shape is detected, the electronic device foregoes performing the first operation if the end of the pinch gesture is detected when the hand is at a distance greater than the threshold distance (e.g., 1, 2, 3, 5, 10, 15, 30, etc. centimeters) from the location corresponding to the first user interface element.
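A minimal sketch of the cancellation rule described above follows: an input that began near the element is abandoned if the gesture completes while the hand is farther than the engagement threshold. The names (PendingInput, engagementThreshold) and values are assumptions for illustration.

```swift
// Hypothetical sketch: cancel a started input if the hand finishes too far away.
struct PendingInput {
    let engagementThreshold: Double = 0.15   // e.g., 15 centimeters (assumed value)
    var began = false

    mutating func handleFrame(handToElementDistance: Double,
                              gestureCompleted: Bool,
                              perform: () -> Void) {
        if !began, handToElementDistance <= engagementThreshold {
            began = true                      // first portion detected near the element
        }
        guard began, gestureCompleted else { return }
        if handToElementDistance <= engagementThreshold {
            perform()                         // gesture ended close enough: perform the operation
        }                                     // otherwise: forgo the operation (input cancelled)
        began = false
    }
}
```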
The above-described manner of forgoing performing the first operation in response to detecting movement of the predefined portion of the user to a distance greater than the threshold distance provides an efficient manner of canceling the first operation after the portion of the first input has been provided, which simplifies interaction between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and extends battery life of the electronic device by enabling the user to more quickly and efficiently use the electronic device while reducing errors in use.
In some embodiments, such as in fig. 9A, the first input is provided by a predefined portion (e.g., 909) of the user (e.g., a finger, hand, arm, or head of the user), and detecting the first input includes detecting that the predefined portion (e.g., 909) of the user is in a respective spatial relationship (1022 a) with respect to a location corresponding to the first user interface element (e.g., detecting that the predefined portion of the user is within a predetermined threshold distance (e.g., 1, 2, 3, 5, 10, 15, 20, 30, etc. centimeters) of the first user interface element, has a predetermined orientation or pose with respect to the user interface element, etc.). In some embodiments, according to one or more steps of method 800, the respective spatial relationship with respect to the location corresponding to the first user interface element is that the portion of the user is in a ready state.
In some implementations, when a predefined portion (e.g., 909) of a user is in a respective spatial relationship with respect to a location corresponding to a first user interface element (e.g., 903) during a first input and prior to performing a first operation, such as in fig. 9A, the electronic device 101a detects (1022 b), via the one or more input devices, that the predefined portion (e.g., 909) of the user does not interact with the first user interface element (e.g., 903) (e.g., does not provide additional input directed to the first user interface element) within a respective time threshold (e.g., 1, 2, 3, 5, etc. seconds) of obtaining the respective spatial relationship with respect to the location corresponding to the first user interface element (e.g., 903). In some embodiments, the electronic device detects the ready state of the predefined portion of the user according to one or more steps of method 800 without detecting further input within the time threshold. For example, the electronic device detects that the user's hand is in a pre-pinch hand shape (e.g., the thumb is within a threshold distance (e.g., 0.1, 0.2, 0.3, 0.5, 1, 2, 3, etc. centimeters) of another finger on the same hand) when the hand is within a predetermined threshold distance (e.g., 1, 2, 3, 5, 10, 15, 20, 30, etc. centimeters) of the first user interface element, and does not detect the pinch hand shape (e.g., the thumb and finger touching) within the predetermined period of time.
In some implementations, in response to detecting that the predefined portion (e.g., 909) of the user does not interact with the first user interface element (e.g., 903) within the respective time threshold of obtaining the respective spatial relationship relative to the location corresponding to the first user interface element (e.g., 903), the electronic device 101a foregoes (1022c) performing the first operation corresponding to the first user interface element (e.g., 903), such as in fig. 9C. In some implementations, in response to detecting that the predefined portion of the user interacted with the first user interface element after the respective time threshold has elapsed, the electronic device forgoes performing the first operation corresponding to the first user interface element. For example, in response to detecting that the user's hand is in a pre-pinch hand shape (e.g., the thumb is within a threshold distance (e.g., 0.1, 0.2, 0.3, 0.5, 1, 2, 3, etc. centimeters) of another finger on the same hand) when the hand is within a predetermined threshold distance (e.g., 1, 2, 3, 5, 10, 15, 20, 30, etc. centimeters) of the first user interface element, and detecting that a predetermined time threshold elapses before the pinch hand shape (e.g., the thumb and finger touching) is detected, the electronic device foregoes performing the first operation even if the pinch hand shape is detected after the predetermined threshold time has elapsed. In some implementations, in response to detecting that the predefined portion of the user is in the respective spatial relationship with respect to the location corresponding to the user interface element, the electronic device updates an appearance of the user interface element (e.g., updates a color, a size, translucency, positioning, etc. of the user interface element). In some implementations, if the respective time threshold elapses without further input from the predefined portion of the user being detected, the electronic device reverts the updated appearance of the user interface element.
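The sketch below illustrates the time-out behavior described above: a ready state that is not followed by an interaction within a threshold period is discarded, so a later pinch no longer triggers the operation. The type names and the 3-second value are assumptions.

```swift
import Foundation

// Hypothetical sketch: a ready state expires if no interaction arrives in time.
struct ReadyStateTimeout {
    let timeout: TimeInterval = 3.0          // e.g., 3 seconds (assumed value)
    private var readySince: Date?

    mutating func enteredReadyState(at time: Date) { readySince = time }

    mutating func interactionDetected(at time: Date, perform: () -> Void) {
        defer { readySince = nil }
        guard let start = readySince, time.timeIntervalSince(start) <= timeout else {
            return                           // forgo the operation: ready state expired
        }
        perform()                            // interaction arrived within the threshold
    }
}
```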
The above-described manner of relinquishing the first operation in response to detecting that the time threshold has elapsed without the predefined portion of the user interacting with the first user interface element provides an efficient manner of canceling the request to perform the first operation, which simplifies interactions between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and extends battery life of the electronic device by enabling the user to more quickly and efficiently use the electronic device while reducing errors in use.
In some implementations, a first portion of the first input is detected when the user's gaze is directed to the first user interface element (e.g., such as if gaze 901a in fig. 9A is directed to user interface element 903), and a second portion of the first input (1024 a) that follows the first portion of the first input is detected when the user's gaze (e.g., 901B) is not directed to the first user interface element (e.g., 903), such as in fig. 9B. In some implementations, the electronic device performs an action associated with the first user interface element in response to detecting a first portion of the first input when the user's gaze is directed to the first user interface element, and subsequently detecting a second portion of the first input when the user's gaze is not directed to the first user interface element. In some implementations, the electronic device performs an action associated with the first user interface element in response to detecting a first portion of the first input while the first user interface element is in the attention area and subsequently detecting a second portion of the first input while the first user interface element is not in the attention area.
The above-described manner of performing an operation in response to detecting a first portion of the first input when the user's gaze is directed at the first user interface element, and subsequently detecting a second portion of the first input when the user's gaze is not directed at the first user interface element, provides an efficient manner of allowing the user to look away from the first user interface element without canceling the first input, which simplifies interactions between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and extends battery life of the electronic device by enabling the user to use the electronic device more quickly and efficiently while reducing errors in use.
In some implementations, such as in fig. 9B, the first input is provided by a predefined portion (e.g., 909) of the user (e.g., a finger, hand, arm, etc.) moving from within a predefined range of angles relative to the first user interface element (e.g., 903) to a location corresponding to the first user interface element (e.g., 903) (e.g., the first user interface object is a three-dimensional virtual object accessible from multiple angles). For example, the first user interface object is a virtual video player that includes a face of the presented content, and the first input is provided by moving a user's hand to the first user interface object, touching the face of the presented content of the first user interface object before touching any other face of the first user interface object.
In some implementations, the electronic device 101a detects (1026 b), via the one or more input devices, a second input directed to the first user interface element (e.g., 903), wherein the second input includes a predefined portion (e.g., 909) of the user moving from outside a predefined angular range relative to the first user interface element (e.g., 903) to a position corresponding to the first user interface element (e.g., 903), such as if the hand (e.g., 909) in fig. 9B were to approach the user interface element (e.g., 903) from a side of the user interface element (e.g., 903) opposite the side of the user interface element (e.g., 903) visible in fig. 9B. For example, the electronic device detects that the user's hand touches a face of the virtual video player other than the face on which the content is presented (e.g., touches a "back" face of the virtual video player).
In some implementations, in response to detecting the second input, the electronic device 101a foregoes (1026c) interacting with the first user interface element (e.g., 903) in accordance with the second input. For example, if the hand (e.g., 909) in fig. 9B were to approach the user interface element (e.g., 903) from a side of the user interface element (e.g., 903) opposite the side of the user interface element (e.g., 903) visible in fig. 9B, the electronic device 101a would forgo performing the selection of the user interface element (e.g., 903) shown in fig. 9B. In some implementations, the electronic device will interact with the first user interface element if the predefined portion of the user has moved from within the predefined angular range to a location corresponding to the first user interface element. For example, in response to detecting that the user's hand touches the content-presenting face of the virtual video player by moving the hand through a face of the virtual video player other than the content-presenting face, the electronic device foregoes performing an action corresponding to the area of the content-presenting face that the user touches.
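For illustration, the sketch below shows one way the angular gating described above could be checked: a direct touch is accepted only if the hand approaches the interactive face from within a predefined cone around that face's normal. The vector type, the 45-degree default, and the function names are assumptions.

```swift
import Foundation

// Hypothetical sketch of an approach-angle check for direct touches.
struct Vector3 { var x, y, z: Double }

func angleBetween(_ a: Vector3, _ b: Vector3) -> Double {
    let dot = a.x * b.x + a.y * b.y + a.z * b.z
    let la = (a.x * a.x + a.y * a.y + a.z * a.z).squareRoot()
    let lb = (b.x * b.x + b.y * b.y + b.z * b.z).squareRoot()
    return acos(max(-1, min(1, dot / (la * lb))))   // radians
}

func shouldAcceptDirectTouch(approachDirection: Vector3,
                             faceNormal: Vector3,
                             maxAngleDegrees: Double = 45) -> Bool {
    // Accept only if the hand moves toward the content-presenting face,
    // i.e., against the face normal and within the allowed cone.
    let reversed = Vector3(x: -approachDirection.x,
                           y: -approachDirection.y,
                           z: -approachDirection.z)
    return angleBetween(reversed, faceNormal) <= maxAngleDegrees * .pi / 180
}
```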
The above-described manner of forgoing interaction with the first user interface element in response to input provided outside of the predefined angular range provides an efficient manner of preventing accidental input due to a user inadvertently touching the first user interface element from an angle outside of the predefined angular range, which simplifies interaction between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and prolongs battery life of the electronic device by enabling the user to more quickly and efficiently use the electronic device while reducing errors in use.
In some implementations, such as in fig. 9A, a first operation is performed (1028 a) in response to detecting the first input without detecting that the user's gaze (e.g., 901 a) is directed to a first user interface element (e.g., 903). In some embodiments, the attention area includes an area of the three-dimensional environment to which the user's gaze is directed plus an additional area of the three-dimensional environment that is within a predefined distance or angle of the user's gaze. In some implementations, the electronic device performs an action in response to input directed to the first user interface element while the first user interface element is within the attention area (wider than the user's gaze), even though the user's gaze is not directed to the first user interface element and even if the user's gaze is never directed to the first user interface element during the input. In some embodiments, an indirect input requires the user's gaze to be directed to the user interface element to which the input is directed, whereas a direct input does not require the user's gaze to be directed to the user interface element to which the input is directed.
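A small, hedged sketch of the attention-area test described above: direct input can be accepted when the element lies within some angular radius of the gaze, even if the gaze is not on the element itself. The 20-degree radius and the function names are assumptions.

```swift
// Hypothetical sketch: attention area modeled as a cone around the gaze direction.
func isInAttentionArea(angleFromGazeDegrees: Double,
                       attentionRadiusDegrees: Double = 20) -> Bool {
    angleFromGazeDegrees <= attentionRadiusDegrees
}

// Direct input does not require the gaze to be on the element itself,
// only that the element is inside the (wider) attention area.
func shouldAcceptDirectInput(elementAngleFromGazeDegrees: Double,
                             gazeIsOnElement: Bool) -> Bool {
    gazeIsOnElement || isInAttentionArea(angleFromGazeDegrees: elementAngleFromGazeDegrees)
}
```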
The above-described manner of performing an operation in response to an input directed to a first user interface element when the user's gaze is not directed to the first user interface element provides an efficient manner of allowing the user to look at an area of the user interface other than the first user interface element while providing the input directed to the first user interface element, which simplifies interactions between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and extends battery life of the electronic device by enabling the user to use the electronic device more quickly and efficiently while reducing errors in use.
Fig. 11A-11C illustrate examples of how an electronic device enhances interactions with user interface elements in a three-dimensional environment at different distances and/or angles relative to a user's gaze, in accordance with some embodiments.
Fig. 11A shows the electronic device 101 displaying a three-dimensional environment 1101 on a user interface via the display generation component 120. It should be appreciated that in some embodiments, the electronic device 101 utilizes one or more of the techniques described with reference to fig. 11A-11C in a two-dimensional environment or user interface without departing from the scope of the present disclosure. As described above with reference to fig. 1-6, the electronic device 101 optionally includes a display generation component 120 (e.g., a touch screen) and a plurality of image sensors 314. The image sensor optionally includes one or more of the following: a visible light camera; an infrared camera; a depth sensor; or any other sensor that the electronic device 101 can use to capture one or more images of a user or a portion of a user when the user interacts with the electronic device 101. In some embodiments, the display generating component 120 is a touch screen capable of detecting gestures and movements of a user's hand. In some embodiments, the user interfaces shown below may also be implemented on a head-mounted display that includes a display generating component that displays the user interface to the user, as well as sensors that detect the physical environment and/or movement of the user's hands (e.g., external sensors facing outward from the user) and/or sensors that detect the user's gaze (e.g., internal sensors facing inward toward the user).
As shown in fig. 11A, the three-dimensional environment 1101 includes: two user interface objects 1103a and 1103b located within an area of the three-dimensional environment 1101 that is a first distance from a viewpoint of the three-dimensional environment 1101 associated with a user of the electronic device 101; two user interface objects 1105a and 1105b located within an area of the three-dimensional environment 1101 that is a second distance (greater than the first distance) from a point of view of the three-dimensional environment 1101 associated with a user of the electronic device 101; two user interface objects 1107a and 1107b located within an area of the three-dimensional environment 1101 that is a third distance (greater than the second distance) from a point of view of the three-dimensional environment 1101 associated with a user of the electronic device 101; and a user interface object 1109. In some embodiments, the three-dimensional environment includes a representation 604 of a table in the physical environment of the electronic device 101 (e.g., such as described with reference to fig. 6B). In some implementations, the representation 604 of the table is a photorealistic video image (e.g., video or digital passthrough) of the table displayed by the display generation component 120. In some embodiments, the representation 604 of the table is a view (e.g., a real or physical perspective) of the table through the transparent portion of the display generating component 120.
Fig. 11A-11C illustrate simultaneous or alternative inputs provided by a user's hand based on the simultaneous or alternative location of the user's gaze in a three-dimensional environment. In particular, in some embodiments, the electronic device 101 directs indirect input from a hand of a user of the electronic device 101 (e.g., as described with reference to method 800) to a different user interface object according to a distance of the user interface object from a point of view of a three-dimensional environment associated with the user. For example, in some embodiments, when the indirect input from the user's hand is directed to a user interface object relatively close to the viewpoint of the user in the three-dimensional environment 1101, the electronic device 101 optionally directs the detected indirect input to the user interface object to which the user's gaze is directed, because at a relatively close distance, the device 101 is optionally able to relatively accurately determine which of the two (or more) user interface objects to which the user's gaze is directed, which is optionally used to determine the user interface object to which the indirect input should be directed.
In fig. 11A, user interface objects 1103a and 1103b are relatively close to the viewpoint of the user in the three-dimensional environment 1101 (e.g., less than a first threshold distance, such as 1, 2, 5, 10, 20, 50 feet, from the viewpoint of the user in the three-dimensional environment) (e.g., objects 1103a and 1103b are located within an area of the three-dimensional environment 1101 that is relatively close to the viewpoint of the user). Thus, since the user's gaze 1111a is directed to the user interface object 1103a when the indirect input provided by the hand 1113a is detected, the indirect input provided by the hand 1113a detected by the device 101 is directed to the user interface object 1103a (e.g., instead of the user interface object 1103 b), as indicated by the check marks in the figure. In contrast, in fig. 11B, when the indirect input provided to the hand 1113a is detected, the user's gaze 1111d is directed to the user interface object 1103B. Thus, the device 101 directs this indirect input from the hand 1113a to the user interface object 1103b (e.g., instead of the user interface object 1103 a), as indicated by the check marks in the figure.
In some embodiments, when one or more user interface objects are relatively far from the viewpoint of a user in the three-dimensional environment 1101, the device 101 optionally prevents indirect input from pointing to such one or more user interface objects and/or visually weakens such one or more user interface objects, because at relatively long distances the device 101 is optionally unable to determine relatively accurately whether the user's gaze is directed to the one or more user interface objects. For example, in fig. 11A, user interface objects 1107a and 1107b are relatively far from the viewpoint of the user in the three-dimensional environment 1101 (e.g., greater than a second threshold distance, itself greater than the first threshold distance, such as 10, 20, 30, 50, 100, 200 feet, from the viewpoint of the user in the three-dimensional environment) (e.g., objects 1107a and 1107b are located within an area of the three-dimensional environment 1101 relatively far from the viewpoint of the user). Thus, indirect input provided by hand 1113c that is detected by device 101 when the user's gaze 1111c is directed (e.g., on the surface) to user interface object 1107b (or 1107a) is ignored by device 101 and is not directed to user interface object 1107b (or 1107a), as reflected by the absence of a check mark in the figure. In some embodiments, the device 101 additionally or alternatively visually weakens (e.g., grays out) the user interface objects 1107a and 1107b to indicate that the user interface objects 1107a and 1107b are not available for indirect interaction.
In some implementations, when one or more user interface objects are angled greater than a threshold angle from the gaze of the user of the electronic device 101, the device 101 optionally prevents indirect input from pointing to such one or more user interface objects and/or visually weakens such one or more user interface objects, for example, to prevent accidental interaction with such one or more user interface objects off-angle. For example, in fig. 11A, user interface object 1109 is optionally angled from user gaze 1111a, 1111b, and/or 1111c by more than a threshold angle (e.g., 10, 20, 30, 45, 90, 120, etc. degrees). Thus, the device 101 optionally visually weakens (e.g., grays out) the user interface object 1109 to indicate that the user interface object 1109 is not available for indirect interaction.
However, in some embodiments, when the indirect input from the user's hand is directed to a user interface object moderately far from the viewpoint of the user in the three-dimensional environment 1101, the electronic device 101 optionally directs the detected indirect input to a user interface object based on criteria other than the user's gaze, because at moderate distances the device 101 is optionally able to determine relatively accurately which set of two or more user interface objects the user's gaze is directed to, but is optionally unable to determine relatively accurately which user interface object within that set of two or more user interface objects the gaze is directed to. In some implementations, if the user's gaze is directed toward a moderately distant user interface object that is not located with other user interface objects (e.g., is greater than a threshold distance, such as 1, 2, 5, 10, 20 feet, from any other interactable user interface object), the device 101 optionally directs indirect input to that user interface object without performing the various disambiguation techniques described herein and with reference to method 1200. Furthermore, in some embodiments, the electronic device 101 performs the various disambiguation techniques described herein and with reference to the method 1200 for user interface objects located within an area (e.g., volume and/or surface or plane) in the three-dimensional environment defined by the user's gaze (e.g., the user's gaze defines the center of the volume and/or surface or plane) and not for user interface objects not located within the area (e.g., regardless of their distance from the user's point of view). In some implementations, the dimensions of the region vary based on the distance of the region and/or the user interface objects it contains from the viewpoint of the user in the three-dimensional environment (e.g., within a moderately distant region of the three-dimensional environment). For example, in some embodiments, the size of the region decreases as the region is farther from the viewpoint (and increases as the region is closer to the viewpoint), and in some embodiments, the size of the region increases as the region is farther from the viewpoint (and decreases as the region is closer to the viewpoint).
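The sketch below illustrates the distance-based routing described above: near regions are targeted by gaze, moderately distant regions fall back to other disambiguation criteria, and far regions are blocked. The band boundaries and names are assumptions, not values from this disclosure.

```swift
// Hypothetical sketch of routing an indirect input based on how far the
// targeted region is from the user's viewpoint.
enum TargetingStrategy {
    case byGaze                    // near: gaze is reliable enough to pick the object
    case byDisambiguationCriteria  // moderate: region is known, the object within it is not
    case blocked                   // far: ignore and/or visually deemphasize
}

func strategy(forRegionDistance distance: Double,
              nearThreshold: Double = 3.0,      // assumed boundary, in meters
              farThreshold: Double = 15.0) -> TargetingStrategy {
    switch distance {
    case ..<nearThreshold:
        return .byGaze
    case nearThreshold..<farThreshold:
        return .byDisambiguationCriteria
    default:
        return .blocked
    }
}
```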
In fig. 11A, user interface objects 1105a and 1105b are moderately distant from the viewpoint of the user in the three-dimensional environment 1101 (e.g., greater than the first threshold distance and less than the second threshold distance from the viewpoint of the user in the three-dimensional environment) (e.g., objects 1105a and 1105b are located within an area of the three-dimensional environment 1101 that is moderately distant from the viewpoint of the user). In fig. 11A, gaze 1111b is directed to user interface object 1105a (e.g., as detected by device 101) when device 101 detects the indirect input from hand 1113b. Because the user interface objects 1105a and 1105b are moderately far from the viewpoint of the user, the device 101 determines which of the user interface objects 1105a and 1105b will receive the input based on characteristics other than the user's gaze 1111b. For example, in fig. 11A, because user interface object 1105b is closer to the viewpoint of the user in three-dimensional environment 1101, device 101 directs the input from hand 1113b to user interface object 1105b, as indicated by the check marks in the figure (e.g., instead of pointing it to user interface object 1105a, to which the user's gaze 1111b is directed). In fig. 11B, when the input from hand 1113b is detected, the user's gaze 1111e is directed to user interface object 1105b (rather than user interface object 1105a, as in fig. 11A), and device 101 still directs the indirect input from hand 1113b to user interface object 1105b, as indicated by the check marks in the figure, optionally not because the user's gaze 1111e is directed to user interface object 1105b, but because user interface object 1105b is closer to the user's viewpoint in the three-dimensional environment than user interface object 1105a.
In some embodiments, criteria in addition to or instead of distance are used to determine to which user interface object the indirect input is directed (e.g., when those user interface objects are moderately far from the user's point of view). For example, in some embodiments, the device 101 directs indirect input to one of the user interface objects based on which of the user interface objects is an application user interface object or a system user interface object. For example, in some embodiments, the device 101 favors system user interface objects and directs the indirect input from hand 1113b in fig. 11C to user interface object 1105c, as indicated by the check mark, because that user interface object is a system user interface object and user interface object 1105d (to which the user's gaze 1111f is directed) is an application user interface object. In some embodiments, the device 101 favors application user interface objects and would direct the indirect input from hand 1113b in fig. 11C to user interface object 1105d, because that user interface object is an application user interface object and user interface object 1105c is a system user interface object (e.g., rather than because the user's gaze 1111f is directed to user interface object 1105d). Additionally or alternatively, in some embodiments, software, applications, and/or the operating system associated with the user interface objects define a selection priority for the user interface objects, such that if the selection priority gives one user interface object a higher priority than another user interface object, the device 101 directs the input to the one user interface object (e.g., user interface object 1105c), and if the selection priority gives the other user interface object a higher priority than the one user interface object, the device 101 directs the input to the other user interface object (e.g., user interface object 1105d).
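For illustration only, the following sketch shows one possible tie-breaking policy among candidates in a moderately distant region, combining the criteria discussed above (explicit selection priority, system versus application object, and distance to the viewpoint); the ordering of the checks and all names are assumptions.

```swift
// Hypothetical sketch of choosing between two candidate targets.
struct Candidate {
    let name: String
    let distanceToViewpoint: Double
    let isSystemObject: Bool
    let selectionPriority: Int        // higher wins; defined by app/OS software
}

func resolveTarget(_ a: Candidate, _ b: Candidate) -> Candidate {
    if a.selectionPriority != b.selectionPriority {
        return a.selectionPriority > b.selectionPriority ? a : b
    }
    if a.isSystemObject != b.isSystemObject {
        return a.isSystemObject ? a : b      // this sketch favors system objects
    }
    return a.distanceToViewpoint <= b.distanceToViewpoint ? a : b
}

// Example: with equal priorities and types, the closer object receives the input.
let winner = resolveTarget(
    Candidate(name: "1105a", distanceToViewpoint: 6.0, isSystemObject: false, selectionPriority: 0),
    Candidate(name: "1105b", distanceToViewpoint: 5.5, isSystemObject: false, selectionPriority: 0))
// winner.name == "1105b"
```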
Fig. 12A-12F are flowcharts illustrating a method 1200 of enhancing interaction with user interface elements in a three-dimensional environment at different distances and/or angles relative to a user's gaze, according to some embodiments. In some embodiments, the method 1200 is performed at a computer system (e.g., computer system 101 in fig. 1, such as a tablet, smart phone, wearable computer, or head-mounted device) that includes a display generating component (e.g., display generating component 120 in fig. 1, 3, and 4) (e.g., heads-up display, touch screen, projector, etc.) and one or more cameras (e.g., cameras pointing downward toward the user's hand (e.g., color sensors, infrared sensors, and other depth sensing cameras) or cameras pointing forward from the user's head). In some embodiments, the method 1200 is managed by instructions stored in a non-transitory computer readable storage medium and executed by one or more processors of a computer system, such as the one or more processors 202 of the computer system 101 (e.g., the control unit 110 in fig. 1A). Some operations in method 1200 are optionally combined and/or the order of some operations is optionally changed.
In some embodiments, method 1200 is performed by an electronic device in communication with a display generation component and one or more input devices (including an eye tracking device). For example, a mobile device (e.g., a tablet, smart phone, media player, or wearable device) or a computer. In some embodiments, the display generating component is a display integrated with the electronic device (optionally a touch screen display), an external display such as a monitor, projector, television, or hardware component (optionally integrated or external) for projecting a user interface or making the user interface visible to one or more users, or the like. In some embodiments, the one or more input devices include a device capable of receiving user input (e.g., capturing user input, detecting user input, etc.) and transmitting information associated with the user input to the electronic device. Examples of input devices include a touch screen, a mouse (e.g., external), a touch pad (optionally integrated or external), a remote control device (e.g., external), another mobile device (e.g., separate from an electronic device), a handheld device (e.g., external), a controller (e.g., external), a camera, a depth sensor, an eye tracking device, and/or a motion sensor (e.g., a hand tracking device, a hand motion sensor), and so forth. In some embodiments, the hand tracking device is a wearable device, such as a smart glove. In some embodiments, the hand tracking device is a handheld input device, such as a remote control or a stylus.
In some embodiments, the electronic device displays (1202 a) via a display generation component a user interface comprising a first region comprising a first user interface object and a second user interface object, such as objects 1105a and 1105b in fig. 11A. In some implementations, the first user interface object and/or the second user interface object are interactive user interface objects, and in response to detecting input directed to a given object, the electronic device performs an action associated with the user interface object. For example, the user interface object is a selectable option that, when selected, causes the electronic device to perform an action, such as displaying a corresponding user interface, changing a setting of the electronic device, or initiating playback of content. As another example, the user interface object is a container (e.g., window) in which the user interface/content is displayed, and in response to detecting a selection of the user interface object and a subsequent movement input, the electronic device updates the positioning of the user interface object in accordance with the movement input. In some implementations, the first user interface object and the second user interface object are displayed in a three-dimensional environment (e.g., a computer-generated reality (CGR) environment, such as a Virtual Reality (VR) environment, a Mixed Reality (MR) environment, or an Augmented Reality (AR) environment, etc.) generated by, displayed by, or otherwise made viewable by the device (e.g., the user interface is a three-dimensional environment and/or is displayed within the three-dimensional environment). In some embodiments, the first region, and thus the first user interface object and the second user interface object, are remote from a location corresponding to (e.g., remote from, such as greater than a threshold distance of 2, 5, 10, 15, 20 feet from) a location of the user/electronic device in the three-dimensional environment and/or a viewpoint of the user in the three-dimensional environment.
In some embodiments, when the user interface is displayed and when the user's gaze is detected via the eye-tracking device as being directed to a first region of the user interface, such as gaze 1111b in fig. 11A (e.g., the user's gaze intersects the first region, the first user interface object, and/or the second user interface object; the user's gaze is within a threshold distance (such as 1, 2, 5, 10 feet) of intersecting the first region, the first user interface object, and/or the second user interface object; and/or the device can only determine that the user's gaze is directed toward the first region of the user interface), the electronic device detects (1202 b), via the one or more input devices, a respective input provided by a predefined portion of the user, such as the input from hand 1113b in fig. 11A (e.g., a gesture performed by a finger of the user's hand (such as an index finger) pointing toward and/or moving toward the first region, optionally wherein the movement is greater than a threshold movement (e.g., 0.5, 1, 3, 5, 10 cm) and/or the speed is greater than a threshold speed (e.g., 0.5, 1, 3, 5, 10 cm/s), or by a thumb of the hand pinching together with another finger of the hand). In some embodiments, during the respective input, the location of the predefined portion of the user is remote from the location corresponding to the first region of the user interface (e.g., the predefined portion of the user remains greater than a threshold distance of 2, 5, 10, 15, 20 feet from the first region, the first user interface object, and/or the second user interface object throughout the respective input).
In some embodiments, in response to detecting the respective input (1202 c), in accordance with a determination that one or more first criteria are met (e.g., the first user interface object is closer to a viewpoint of a user in the three-dimensional environment than the second user interface object, the first user interface object is a system user interface object (e.g., a user interface object of an operating system of the electronic device, rather than a user interface object of an application on the electronic device) and the second user interface object is an application user interface object (e.g., a user interface object of an application on the electronic device, rather than a user interface object of an operating system of the electronic device), etc.; in some embodiments, the one or more first criteria are not met based on the user's gaze (e.g., whether the one or more first criteria are met is independent of where in the first region of the user interface the user's gaze is directed)), the electronic device performs (1202 d) an operation with respect to the first user interface object based on the respective input, such as with respect to user interface object 1105b in fig. 11A (e.g., and without performing an operation with respect to the second user interface object based on the respective input). For example, selecting the first user interface object for further interaction (e.g., not selecting the second user interface object for further interaction), transitioning the first user interface object to a selected state such that further input will interact with the first user interface object (e.g., not transitioning the second user interface object to the selected state), selecting the first user interface object as a button (e.g., not selecting the second user interface object as a button), and so forth.
In some embodiments, in accordance with a determination that one or more second criteria different from the first criteria are met (e.g., the second user interface object is closer to a viewpoint of a user in the three-dimensional environment than the first user interface object, the second user interface object is a system user interface object (e.g., a user interface object of an operating system of the electronic device, rather than a user interface object of an application on the electronic device) and the first user interface object is an application user interface object (e.g., a user interface object of an application on the electronic device, rather than a user interface object of an operating system of the electronic device), etc.; in some embodiments, the one or more second criteria are not met based on a gaze of the user (e.g., whether the one or more second criteria are met is independent of where in the first region of the user interface the gaze of the user is directed)), the electronic device performs (1202 e) operations with respect to the second user interface object based on the respective input, such as with respect to user interface object 1105c in fig. 11C (e.g., and without performing operations with respect to the first user interface object based on the respective input). For example, selecting the second user interface object for further interaction (e.g., not selecting the first user interface object for further interaction), transitioning the second user interface object to a selected state such that further input will interact with the second user interface object (e.g., not transitioning the first user interface object to the selected state), selecting the second user interface object as a button (e.g., not selecting the first user interface object as a button), and so forth. The above-described manner of disambiguating which user interface object a particular input is directed to provides an efficient manner of facilitating interaction with a user interface object when there may be uncertainty as to which user interface object a given input is directed to, without requiring further user input to designate a given user interface object as the target of a given input, which simplifies interaction between a user and an electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient (e.g., by not requiring additional user input for further designation), which also reduces power usage and extends battery life of the electronic device by enabling a user to use the electronic device more quickly and efficiently.
In some embodiments, the user interface includes a three-dimensional environment (1204 a), such as environment 1101 (e.g., the first region is a respective volume and/or surface located at some x, y, z coordinate in the three-dimensional environment, at a respective distance from a viewpoint of the three-dimensional environment associated with the electronic device). In some embodiments, in accordance with a determination that the respective distance is a first distance (e.g., 1 foot, 2 feet, 5 feet, 10 feet, 50 feet), the first region has a first size in the three-dimensional environment (1204 c), and in accordance with a determination that the respective distance is a second distance (e.g., 10 feet, 20 feet, 50 feet, 100 feet, 500 feet) different from the first distance, the first region has a second size in the three-dimensional environment (1204 d) different from the first size. For example, the size of an area in which the electronic device initiates operation relative to the first user interface object and the second user interface object within the area based on the one or more first criteria or second criteria (e.g., and not based on a gaze of the user being directed to the first user interface object or the second user interface object) varies based on a distance of the area from a point of view associated with the electronic device. In some embodiments, the size of the region decreases as the region of interest moves away from the viewpoint, and in some embodiments, the size of the region increases as the region of interest moves away from the viewpoint. For example, in fig. 11A, if objects 1105a and 1105b were farther from the user's point of view than shown in fig. 11A, the area that includes objects 1105a and 1105b and in which the criteria-based disambiguation described herein is performed would be different (e.g., larger), and if objects 1105a and 1105b were closer to the user's point of view than shown in fig. 11A, the area that includes objects 1105a and 1105b and in which the criteria-based disambiguation described herein is performed would be different (e.g., smaller). The above-described manner of operating with respect to regions of different sizes, depending on the distance of the region from a viewpoint associated with the electronic device, provides an efficient way of ensuring that the operation of the device with respect to the input with potential uncertainty accurately corresponds to the input with potential uncertainty without requiring further user input to manually change the size of the region of interest, which simplifies interactions between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and extends battery life of the electronic device and reduces erroneous operation of the device by enabling the user to use the electronic device more quickly and efficiently.
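As a hedged illustration of scaling the disambiguation region with distance, the sketch below grows the region linearly with distance; as noted above, an embodiment could equally shrink it, and all constants and names here are assumptions.

```swift
// Hypothetical sketch of varying the disambiguation-region size with its
// distance from the viewpoint. This version grows the radius linearly.
func disambiguationRegionRadius(forDistance distance: Double,
                                baseRadius: Double = 0.5,      // meters, at the viewpoint
                                growthPerMeter: Double = 0.1) -> Double {
    baseRadius + growthPerMeter * distance
}

// e.g., a region 10 meters away gets a 1.5 m radius; one 2 meters away gets 0.7 m.
```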
In some embodiments, the size of the first region in the three-dimensional environment increases (1206 a) with increasing respective distances, such as described with reference to fig. 11A-11C. For example, when the region of interest is farther from a viewpoint associated with the electronic device, wherein the electronic device initiates an increase in size of the region relative to operation of the first user interface object and the second user interface object within the region based on the one or more first criteria or second criteria (e.g., and not based on whether the user's gaze is directed to the first user interface object or the second user interface object), which optionally corresponds to an uncertainty of determining where the user's gaze is directed when the potentially relevant user interface object is farther from the viewpoint associated with the electronic device (e.g., the further the two user interface objects are from the viewpoint, the more difficult it may be to determine whether the user's gaze is directed to the first user interface object or the second user interface object of the two user interface objects, and thus the electronic device optionally operates relative to the two user interface objects based on the one or more first criteria or second criteria). The above-described manner of operating relative to an area that increases in size as the area is farther from a point of view associated with the electronic device provides an efficient manner of avoiding false responses of the device to gaze-based inputs directed to objects as those objects are farther from the point of view associated with the electronic device, which simplifies interactions between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and prolongs battery life of the electronic device and reduces false operation of the device by enabling the user to more quickly and efficiently use the electronic device.
In some embodiments, the one or more first criteria are met when the first object is closer to the viewpoint of the user in the three-dimensional environment than the second object (such as user interface object 1105b in fig. 11A), and the one or more second criteria are met when the second object is closer to the viewpoint of the user in the three-dimensional environment than the first object (such as if user interface object 1105a were closer to the viewpoint of the user than user interface object 1105b in fig. 11A). For example, in accordance with a determination that the first user interface object is closer to a point of view associated with the electronic device in the three-dimensional environment than the second user interface object, the one or more first criteria are satisfied and the one or more second criteria are not satisfied, and in accordance with a determination that the second user interface object is closer to the point of view in the three-dimensional environment than the first user interface object, the one or more second criteria are satisfied and the one or more first criteria are not satisfied. Thus, in some embodiments, whichever of the user interface objects in the first region is closest to the point of view is the user interface object to which the device directs the input (e.g., independent of whether the user's gaze is directed to another user interface object in the first region). The above-described manner of directing input to user interface objects based on their distance from a point of view associated with an electronic device provides an efficient way of selecting user interface objects for input that simplifies interactions between a user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and extends battery life of the electronic device and reduces erroneous operation of the device by enabling the user to more quickly and efficiently use the electronic device.
In some embodiments, the one or more first criteria or the one or more second criteria are met based on a type of the first user interface object (e.g., a user interface object of an operating system of the electronic device, or a user interface object of an application other than the operating system of the electronic device) and a type of the second user interface object (e.g., a user interface object of an operating system of the electronic device, or a user interface object of an application other than the operating system of the electronic device) (1210 a). For example, in accordance with a determination that the first user interface object is a system user interface object and the second user interface object is not a system user interface object (e.g., is an application user interface object), the one or more first criteria are met and the one or more second criteria are not met, and in accordance with a determination that the second user interface object is a system user interface object and the first user interface object is not a system user interface object (e.g., is an application user interface object), the one or more second criteria are met and the one or more first criteria are not met. Thus, in some embodiments, any one of the user interface objects in the first region that is a system user interface object is the user interface object to which the device points the input (e.g., independent of whether the user's gaze is directed to another user interface object in the first region). For example, in fig. 11A, if user interface object 1105b is a system user interface object and user interface object 1105a is an application user interface object, device 101 may direct the input of fig. 11A to object 1105b instead of object 1105a (e.g., even though object 1105b is farther from the user's point of view than object 1105 a). The above-described manner of directing input to user interface objects based on their types provides an efficient and predictable manner of selecting user interface objects for input that simplifies interactions between a user and an electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and extends battery life of the electronic device and reduces erroneous operation of the device by enabling a user to more quickly and efficiently use the electronic device.
In some embodiments, the one or more first criteria or the one or more second criteria are met based on respective priorities defined by the electronic device (e.g., by software of the electronic device, such as an application or an operating system of the electronic device) for the first user interface object and the second user interface object (1212 a). For example, in some embodiments, an application and/or operating system associated with the first user interface object and the second user interface object defines a selection priority for the first user interface object and the second user interface object such that if the selection priority gives the first user interface object a higher priority than the second user interface object, the device directs input to the first user interface object (e.g., independent of whether the user's gaze is directed to another user interface object in the first area) and if the selection priority gives the second user interface object a higher priority than the first user interface object, the device directs input to the second user interface object (e.g., independent of whether the user's gaze is directed to another user interface object in the first area). For example, in fig. 11A, if user interface object 1105b is assigned a higher selection priority (e.g., by software of device 101) and user interface object 1105a is assigned a lower selection priority, device 101 may direct the input of fig. 11A to object 1105b instead of object 1105a (e.g., even though object 1105b is farther from the user's point of view than object 1105 a). In some implementations, the relative selection priorities of the first user interface object and the second user interface object change over time based on what the respective user interface object is currently displaying (e.g., the user interface object currently displaying video/play content has a higher selection priority than the same user interface object that is displaying paused video content or other content other than video/play content). The above-described manner of directing input to user interface objects based on operating system and/or application priorities provides a flexible way of selecting user interface objects for input that simplifies interactions between a user and an electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and extends battery life of the electronic device by enabling a user to use the electronic device more quickly and efficiently.
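To make the preceding three paragraphs concrete, the following is a minimal sketch, not taken from the patent, of how a device might pick which of several overlapping objects in a gazed-at region receives an input; the type `UIObject`, the function `chooseTarget`, and the ordering of the tie-breakers (explicit priority, then system objects, then proximity to the viewpoint) are all assumptions made for illustration.

```swift
// Illustrative sketch (assumed names and tie-breaker order), not the patented algorithm itself.
struct UIObject {
    let id: String
    let isSystemObject: Bool        // operating-system object vs. application object
    let selectionPriority: Int      // priority assigned by the application/OS; higher wins
    let distanceToViewpoint: Double // distance from the user's viewpoint, in meters
}

// Pick a single target from the objects in the gazed-at region, without
// consulting which specific object in the region the gaze lands on.
func chooseTarget(in region: [UIObject]) -> UIObject? {
    return region.min { a, b in
        if a.selectionPriority != b.selectionPriority {
            return a.selectionPriority > b.selectionPriority   // higher priority first
        }
        if a.isSystemObject != b.isSystemObject {
            return a.isSystemObject                            // system objects first
        }
        return a.distanceToViewpoint < b.distanceToViewpoint   // then closest to the viewpoint
    }
}
```

A real implementation could also make the priority time-varying, for example raising it while an object is playing video, as described above.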
In some implementations, in response to detecting the respective input (1214 a), in accordance with a determination that one or more third criteria are met, including criteria that are met when the first region is greater than a threshold distance (e.g., 5, 10, 15, 20, 30, 40, 50, 100, 150 feet) from a point of view associated with the electronic device in the three-dimensional environment, the electronic device forgoes performing (1214 b) the operation with respect to the first user interface object and forgoes performing the operation with respect to the second user interface object, such as described with reference to user interface objects 1107a and 1107b in fig. 11A. For example, the electronic device is optionally not enabled to interact with user interface objects within an area greater than the threshold distance from a viewpoint associated with the electronic device. In some embodiments, the one or more first criteria and the one or more second criteria both include criteria that are met when the first region is less than a threshold distance from a point of view associated with the electronic device. In some embodiments, when the first region is greater than a threshold distance from a viewpoint associated with the electronic device, the device determines that the user's gaze is directed to the first region (e.g., rather than a different region) in the user interface with relatively low certainty, and thus the electronic device does not allow gaze-based interaction with objects within the first region, to avoid false interactions with such objects. The above-described manner of disabling interaction with objects in a remote area avoids false gaze-based interactions with such objects, which simplifies interactions between a user and an electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and extends battery life of the electronic device by enabling the user to use the electronic device faster and more efficiently while avoiding errors in use.
In some embodiments, in accordance with a determination that the first region is greater than a threshold distance from a viewpoint associated with the electronic device in the three-dimensional environment, the electronic device visually weakens (1216 a) (e.g., blurs, fades, displays with less color (e.g., more grayscale), stops displaying, etc.) the first user interface object and the second user interface object relative to regions outside the first region that are less than the threshold distance from the viewpoint associated with the electronic device, such as described with reference to user interface objects 1107a and 1107b in fig. 11A (e.g., displays regions and/or objects outside the first region that are less than the threshold distance from the viewpoint associated with the electronic device with less or no blurring, less or no fading, more color or full color, etc.). In some implementations, in accordance with a determination that the first region is less than a threshold distance from a viewpoint associated with the electronic device in the three-dimensional environment, the electronic device forgoes (1216 b) visually weakening the first user interface object and the second user interface object relative to regions outside of the first region of the user interface, such as for user interface objects 1103a, 1103b and 1105a, 1105b in fig. 11A. For example, in some embodiments, the electronic device visually weakens the first region and/or objects within the first region when the first region is greater than a threshold distance from a point of view associated with the electronic device. The above-described manner of visually weakening areas of a user interface that are not interactable due to their distance from a point of view provides a quick and efficient way of conveying which areas are not interactable for that reason, which simplifies interactions between a user and an electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and prolongs battery life of the electronic device by enabling a user to more quickly and efficiently use the electronic device while avoiding providing unnecessary input for interacting with non-interactive areas of the user interface.
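The distance-based gating and visual weakening described above might be sketched as follows; the structure `Region`, its field names, and the specific dimming value are illustrative assumptions, and the threshold is just one of the example values listed in the text.

```swift
// Illustrative sketch (assumed names/values): regions farther than the threshold from the
// viewpoint stop receiving gaze-based input and are visually weakened relative to nearer regions.
struct Region {
    var distanceToViewpoint: Double // meters
    var isInteractive: Bool         // whether gaze-based input may be directed here
    var dimming: Double             // 0 = fully emphasized, 1 = fully faded/blurred
}

let interactionDistanceThreshold: Double = 10.0 // placeholder; the text lists several example values

func updateRegionForDistance(_ region: inout Region) {
    if region.distanceToViewpoint > interactionDistanceThreshold {
        region.isInteractive = false // forgo performing operations on objects in this region
        region.dimming = 0.6         // blur/fade/desaturate relative to closer regions
    } else {
        region.isInteractive = true
        region.dimming = 0.0         // forgo the visual weakening
    }
}
```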
In some embodiments, upon displaying the user interface, the electronic device detects (1218 a), via the one or more input devices, a second corresponding input provided by a predefined portion of the user (e.g., a gesture performed by a finger of the user's hand, such as an index finger, pointing to and/or moving toward the first area, optionally wherein the movement is greater than a threshold movement (e.g., 0.5, 1, 3, 5, 10 cm) and/or at a speed greater than a threshold speed (e.g., 0.5, 1, 3, 5, 10 cm/s), or by a thumb of the hand pinching together with another finger of the hand). In some embodiments, in response to detecting the second corresponding input (1220 b), in accordance with a determination that one or more third criteria are met, including criteria that are met when the first region is angled from the user's gaze in the three-dimensional environment by more than a threshold angle, such as described with reference to user interface object 1109 in fig. 11A (e.g., the user's gaze defines a reference axis, and the first region is separated from that reference axis by an angle greater than 10, 20, 30, 45, 90, 120, etc. degrees; in some embodiments, the user's gaze is not directed to the first region when the second corresponding input is detected), the electronic device forgoes performing (1220 c) the corresponding operation with respect to the first user interface object and forgoes performing the corresponding operation with respect to the second user interface object, such as described with reference to user interface object 1109 in fig. 11A. For example, the electronic device is optionally not enabled to interact with user interface objects that are greater than the threshold angle from the user's gaze. In some embodiments, the device instead directs the second corresponding input to a user interface object outside of the first region and performs a corresponding operation with respect to that user interface object based on the second corresponding input. The above-described manner of disabling interaction with objects that deviate from the user's gaze by a sufficient angle avoids false gaze-based interactions with such objects, which simplifies interactions between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and extends battery life of the electronic device by enabling the user to use the electronic device faster and more efficiently while avoiding errors in use.
In some implementations, in accordance with a determination that the first region is angled greater than a threshold angle from a viewpoint associated with the electronic device in the three-dimensional environment, the electronic device visually weakens (1222 a) (e.g., blurs, fades, displays with less color (e.g., more grayscale), stops displaying, etc.) the first user interface object and the second user interface object relative to an area outside the first region of the user interface, such as described with reference to user interface object 1109 in fig. 11A (e.g., displays areas and/or objects outside the first region that are angled less than the threshold angle from the user's gaze with less or no blurring, less or no fading, more color or full color, etc.). In some embodiments, if the direction of the user's gaze changes, the first user interface object and/or the second user interface object will be weakened (e.g., deemphasized) more with respect to that area of the user interface if the user's gaze moves away from the first and/or second user interface object by a greater angle, and the first user interface object and/or the second user interface object will be weakened (e.g., deemphasized) less with respect to that area of the user interface if the user's gaze moves away from the first user interface object and/or the second user interface object by a lesser angle. In some implementations, in accordance with a determination that the first region is angled less than the threshold angle from a viewpoint associated with the electronic device in the three-dimensional environment, the electronic device forgoes (1222 b) visually weakening the first user interface object and the second user interface object relative to regions outside of the first region of the user interface, such as relative to user interface objects 1103a, 1103b and 1105a, 1105b in fig. 11A. For example, in some embodiments, the electronic device visually weakens the first region and/or objects within the first region when the first region is at an angle greater than the threshold angle to the user's gaze. The above-described manner of visually weakening areas of a user interface that are not interactable due to their angle relative to the user's gaze provides a quick and efficient way of conveying which areas are not interactable for that reason, which simplifies interactions between a user and an electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and prolongs battery life of the electronic device by enabling a user to more quickly and efficiently use the electronic device while avoiding providing unnecessary input for interaction with non-interactive areas of the user interface.
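In the same spirit, the angle-based variant above can be sketched as follows; the vector type, the 45-degree threshold, and the linear ramp for the weakening amount are assumptions made for illustration only.

```swift
import Foundation

// Minimal 3-D vector helpers so the sketch is self-contained.
struct Vec3 {
    var x, y, z: Double
    static func - (a: Vec3, b: Vec3) -> Vec3 { Vec3(x: a.x - b.x, y: a.y - b.y, z: a.z - b.z) }
    func dot(_ o: Vec3) -> Double { x * o.x + y * o.y + z * o.z }
    var length: Double { dot(self).squareRoot() }
    var normalized: Vec3 { Vec3(x: x / length, y: y / length, z: z / length) }
}

// Angle, in degrees, between the gaze ray and the direction from the viewpoint to a region.
func angleFromGaze(viewpoint: Vec3, gazeDirection: Vec3, regionCenter: Vec3) -> Double {
    let toRegion = (regionCenter - viewpoint).normalized
    let cosine = max(-1.0, min(1.0, gazeDirection.normalized.dot(toRegion)))
    return acos(cosine) * 180.0 / .pi
}

// Amount of visual weakening for a region, growing as the gaze moves farther away from it.
func weakeningAmount(forAngle angle: Double, thresholdDegrees: Double = 45) -> Double {
    guard angle > thresholdDegrees else { return 0 } // within the threshold: no weakening
    return min(1, (angle - thresholdDegrees) / thresholdDegrees)
}
```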
In some embodiments, the one or more first criteria and the one or more second criteria include respective criteria (1224 a) that are met when the first region is greater than a threshold distance (e.g., 3, 5, 10, 20, 30, 50 feet) from a viewpoint associated with the electronic device in the three-dimensional environment and are not met when the first region is less than the threshold distance from the viewpoint associated with the electronic device in the three-dimensional environment (e.g., if the first region is greater than the threshold distance from the viewpoint associated with the electronic device, the electronic device directs the respective input to the first user interface object or the second user interface object according to the one or more first criteria or the one or more second criteria). For example, in fig. 11A, the objects 1105a, 1105b are optionally farther from the point of view of the user than the threshold distance. In some embodiments, in response to detecting the respective input and in accordance with a determination that the first region is less than the threshold distance from the viewpoint associated with the electronic device in the three-dimensional environment (1224 b), in accordance with a determination that the user's gaze is directed toward the first user interface object (e.g., and independent of whether the one or more first criteria or the one or more second criteria other than the respective criteria are met), the electronic device performs (1224 c) operations with respect to the first user interface object based on the respective input, such as described with reference to user interface objects 1103a, 1103b in fig. 11A and 11B. In some embodiments, in accordance with a determination that the user's gaze is directed to the second user interface object (e.g., and independent of whether the one or more first criteria or the one or more second criteria other than the respective criteria are met), the electronic device performs (1224 d) an operation with respect to the second user interface object based on the respective input, such as described with reference to user interface objects 1103a, 1103b in fig. 11A and 11B. For example, when the first region is within the threshold distance of the viewpoint associated with the electronic device, the device directs the respective input to the first user interface object or the second user interface object based on the gaze of the user, rather than directing the respective input to the first user interface object or the second user interface object based on the one or more first criteria or the one or more second criteria, respectively. The above-described manner of performing gaze-based direction of input to the first region when the first region is within a threshold distance of the user's point of view provides a quick and efficient manner of allowing a user to indicate which user interface object the input should be directed to when the user interface object is at a distance at which the device is able to determine gaze location/direction with relatively high certainty, which simplifies interactions between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and prolongs battery life of the electronic device by enabling the user to use the electronic device more quickly and efficiently.
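Putting the pieces together, the close-versus-far routing rule in the preceding paragraph might look like the sketch below, which reuses the hypothetical `UIObject` and `chooseTarget` from the earlier sketch; the names and the threshold value are again assumptions.

```swift
// Illustrative sketch: gaze decides the target when the region is close enough for gaze to be
// reliable; otherwise the criteria-based choice (priority/type/proximity) is used instead.
enum InputTarget {
    case object(UIObject)
    case none
}

func routeInput(region: [UIObject],
                regionDistanceToViewpoint: Double,
                gazedObject: UIObject?,
                gazeReliableWithin threshold: Double = 3.0) -> InputTarget {
    if regionDistanceToViewpoint < threshold, let gazed = gazedObject {
        return .object(gazed)                      // close range: trust the gaze target
    }
    if let fallback = chooseTarget(in: region) {   // far range: fall back to the criteria
        return .object(fallback)
    }
    return .none
}
```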
Fig. 13A-13C illustrate examples of how an electronic device enhances interaction with user interface elements for mixed direct and indirect interaction modes, according to some embodiments.
Fig. 13A shows the electronic device 101 displaying a three-dimensional environment 1301 in a user interface via the display generation component 120. It should be appreciated that in some embodiments, the electronic device 101 utilizes one or more of the techniques described with reference to fig. 13A-13C in a two-dimensional environment or user interface without departing from the scope of the present disclosure. As described above with reference to fig. 1-6, the electronic device 101 optionally includes a display generation component 120 (e.g., a touch screen) and a plurality of image sensors 314. The image sensors optionally include one or more of the following: a visible light camera; an infrared camera; a depth sensor; or any other sensor that the electronic device 101 can use to capture one or more images of a user or a portion of a user while the user interacts with the electronic device 101. In some embodiments, the display generation component 120 is a touch screen capable of detecting gestures and movements of a user's hand. In some embodiments, the user interfaces shown below may also be implemented on a head-mounted display that includes a display generation component that displays the user interface to the user, as well as sensors that detect the physical environment and/or movement of the user's hands (e.g., external sensors facing outward from the user) and/or sensors that detect the user's gaze (e.g., internal sensors facing inward toward the user).
As shown in fig. 13A, three-dimensional environment 1301 includes three user interface objects 1303A, 1303b, and 1303c that are interactable (e.g., via user input provided by a hand 1313A of a user of device 101). For example, device 101 optionally directs such input to user interface objects 1303a, 1303b, and/or 1303c based on various characteristics of direct or indirect input provided by hand 1313a (e.g., as described with reference to methods 800, 1000, 1200, 1400, 1600, 1800, and/or 2000). In fig. 13A, three-dimensional environment 1301 also includes a representation 604 of a table in the physical environment of electronic device 101 (e.g., such as described with reference to fig. 6B). In some implementations, the representation 604 of the table is a photorealistic video image (e.g., video or digital passthrough) of the table displayed by the display generation component 120. In some embodiments, the representation 604 of the table is a view (e.g., a real or physical perspective) of the table through the transparent portion of the display generating component 120.
In some implementations, as discussed with reference to method 800, for example, when device 101 detects a user's hand in an indirect ready state at an indirect interaction distance from one or more user interface objects, device 101 assigns an indirect hover state to the user interface object based on the user's gaze (e.g., displays the user interface object to which the user's gaze is directed in an indirect hover state appearance) to indicate which user interface object will receive indirect input from the user's hand if such input is provided by the user's hand. Similarly, in some embodiments, as discussed, for example, with reference to method 800, when device 101 detects a user's hand in a direct ready state at a direct interaction distance from a user interface object, the device assigns a direct hover state to the user interface object to indicate that the user interface object will receive direct input from the user's hand if such input is provided by the user's hand.
In some embodiments, the device 101 detects that an input provided by a user's hand transitions from an indirect input to a direct input and/or vice versa. Fig. 13A-13C illustrate exemplary responses of the device 101 to such transitions. For example, in fig. 13A, device 101 detects that hand 1313a is farther than a threshold distance, such as 3 inches, 6 inches, 1 foot, 2 feet, 5 feet, or 10 feet, from all user interface objects 1303a, 1303b, and 1303c (e.g., the hand is at an indirect interaction distance, and hand 1313a is not within the threshold distance of any user interface object in three-dimensional environment 1301 that can be interacted with by hand 1313a). The hand 1313a is optionally in an indirect ready state hand shape (e.g., as described with reference to method 800). In fig. 13A, a gaze 1311a of a user of the electronic device 101 is directed towards a user interface object 1303a. Thus, device 101 displays user interface object 1303a in an indirect hover state appearance (e.g., indicated by the shading of user interface object 1303a), and device 101 does not display user interface objects 1303b and 1303c in an indirect hover state appearance (e.g., displays those user interface objects in a non-hover state, such as indicated by user interface objects 1303b and 1303c not being shaded). If the hand 1313a were to move within a threshold distance of the user interface object 1303a, and optionally if the hand 1313a were in a direct ready state hand shape (e.g., as described with reference to method 800), the device 101 would optionally maintain the user interface object 1303a in a hover state (e.g., display the user interface object 1303a in a direct hover state appearance). In some embodiments, if in fig. 13A hand 1313a were not in the indirect ready state hand shape, device 101 would optionally not display user interface object 1303a in the indirect hover state appearance (e.g., and if device 101 does not detect that at least one hand of the user is in the indirect ready state hand shape, none of user interface objects 1303a, 1303b, and 1303c would optionally be displayed in the indirect hover state). In some embodiments, the indirect hover state appearance is different depending on which hand the indirect hover state corresponds to. For example, in fig. 13A, hand 1313a is optionally the right hand of the user of electronic device 101 and produces the indirect hover state appearance for user interface object 1303a shown and described with reference to fig. 13A. However, if hand 1313a were instead the left hand of the user, device 101 would optionally display user interface object 1303a in a different indirect hover state appearance (e.g., different color, different shading, different size, etc.). Displaying user interface objects in different indirect hover state appearances optionally indicates to the user from which hand the device 101 will direct input to those user interface objects.
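As a rough illustration of the indirect-hover behavior just described (not code from the patent), the hover target could be resolved from the hand's distance, its ready-state pose, and the gaze target, with the resulting hover value also recording which hand produced it; every type, name, and threshold below is an assumption.

```swift
// Illustrative sketch (assumed types/values) of assigning the indirect hover state.
enum Handedness { case left, right }

enum Hover {
    case none
    case indirect(Handedness) // hover produced by a hand that is beyond direct range
    case direct(Handedness)   // hover produced by a hand within direct range
}

struct HandState {
    let handedness: Handedness
    let isInIndirectReadyPose: Bool    // e.g., thumb and index finger near each other
    let isInDirectReadyPose: Bool      // e.g., index finger extended toward the content
    let distanceToNearestObject: Double
}

let directInteractionThreshold: Double = 0.3 // meters; placeholder for the example values above

// While the hand stays beyond direct range of every object and holds the indirect ready pose,
// the gazed-at object (if any) gets an indirect hover state tagged with the hand that caused it.
func indirectHoverTarget(hand: HandState, gazedObjectID: String?) -> (objectID: String, hover: Hover)? {
    guard hand.distanceToNearestObject > directInteractionThreshold,
          hand.isInIndirectReadyPose,
          let gazed = gazedObjectID else { return nil }
    return (gazed, .indirect(hand.handedness))
}
```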
In fig. 13B, device 101 detects that the user's gaze 1311b has moved away from user interface object 1303a and is now directed to user interface object 1303b. In fig. 13B, hand 1313a optionally remains in the indirect ready state hand shape and optionally remains farther than the threshold distance from all user interface objects 1303a, 1303b, and 1303c (e.g., hand 1313a is not within the threshold distance of any user interface object in three-dimensional environment 1301 that can be interacted with by hand 1313a). In response, device 101 has moved the indirect hover state from user interface object 1303a to user interface object 1303b, displaying user interface object 1303b in an indirect hover state appearance and no longer displaying user interface objects 1303a and 1303c in an indirect hover state appearance (e.g., displaying user interface objects 1303a and 1303c in a non-hover state).
In fig. 13C, device 101 detects that hand 1313A has moved (e.g., from its positioning in fig. 13A and/or 13B) to within a threshold distance (e.g., at its direct interaction distance) of user interface object 1303C. The device 101 optionally also detects that the hand 1313a is in a direct ready state hand shape (e.g., as described with reference to method 800). Thus, regardless of whether the user's gaze is directed to user interface object 1303a (e.g., gaze 1311 a) or user interface object 1303b (e.g., gaze 1311 b), device 101 moves the direct hover state to user interface object 1303c (e.g., moves the hover state away from user interface objects 1303a and/or 1303 b), and user interface object 1303c is being displayed in a direct hover state appearance (e.g., indicated by the shading of user interface object 1303 c), and user interface objects 1303a and 1303b are not being displayed in a (e.g., direct or indirect) hover state appearance (e.g., in a non-hover state). In some implementations, when the hand 1313a is within a threshold distance of the user interface object 1303c (e.g., and optionally in a direct ready state hand shape), a change in the user's gaze (e.g., pointing to a different user interface object) does not move the direct hover state away from the user interface object 1303c. In some embodiments, to receive a hover state for user interface object 1303C in response to the hand movement and/or shape of fig. 13C, device 101 requires user interface object 1303C to be within the user's attention area (e.g., as described with reference to method 1000). For example, if device 101 detects the location and/or shape of hand 1313a of fig. 13C, but detects that the user's attention area does not include user interface object 1303C, device 101 will optionally not move the hover state to user interface object 1303C, and will instead maintain the hover state with the user interface object previously having the hover state. If the device 101 subsequently detects that the user's attention area has moved to include the user interface object 1303c, the device 101 will optionally move the hover state to the user interface object 1303c as long as the hand 1313a is within a threshold distance of the user interface object 1303c and optionally in the direct ready state hand shape. If device 101 subsequently detects that the user's attention area again moves to not include user interface object 1303c, device 101 will optionally maintain a hover state with user interface object 1303c as long as hand 1313a still interacts with user interface object 1303c (e.g., is within a threshold distance of user interface object 1303c and/or is in a direct ready state hand shape and/or interacts directly with user interface object 1303c, etc.). If hand 1313a no longer interacts with user interface object 1303c, device 101 will optionally move the hover state to the user interface object based on the gaze of the user of the electronic device.
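The direct-hover takeover and the attention-area condition described above might be approximated as in the sketch below, which reuses the hypothetical `HandState` from the previous sketch; treating the attention area as a set of object identifiers is purely an illustrative simplification.

```swift
// Illustrative sketch: an object within direct range of a hand in the direct ready pose takes the
// hover state regardless of gaze, but only if it lies within the user's attention area; otherwise
// the hover state stays where it was.
func directHoverTarget(hand: HandState,
                       objectWithinDirectRange: String?,   // ID of the object within the threshold, if any
                       attentionArea: Set<String>,         // IDs of objects inside the attention area
                       currentHoverID: String?) -> String? {
    guard let candidate = objectWithinDirectRange, hand.isInDirectReadyPose else {
        return currentHoverID                              // no direct candidate: keep the prior hover
    }
    if attentionArea.contains(candidate) {
        return candidate                                   // hover moves to the direct candidate
    }
    return currentHoverID                                  // candidate outside the attention area: hover stays
}
```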
In some embodiments, the direct hover state appearance is different depending on which hand the direct hover state corresponds to. For example, in fig. 13C, hand 1313a is optionally the right hand of the user of electronic device 101 and produces the direct hover state appearance for user interface object 1303c shown and described with reference to fig. 13C. However, if hand 1313a were instead the left hand of the user, device 101 would optionally display user interface object 1303c in a different direct hover state appearance (e.g., different color, different shading, different size, etc.). Displaying user interface objects in different direct hover state appearances optionally indicates to the user from which hand the device 101 will direct input to those user interface objects.
In some embodiments, the appearance of the direct hover state (e.g., shown on user interface object 1303C in fig. 13C) is different from the appearance of the indirect hover state (e.g., shown on user interface objects 1303A and 1303B in fig. 13A and 13B, respectively). Thus, in some embodiments, a given user interface object is displayed by the device 101 in different manners (e.g., different colors, different shadows, different sizes, etc.), depending on whether the user interface object has a direct hover state or an indirect hover state.
If in fig. 13C, device 101 has detected that hand 1313a has moved within a threshold distance (e.g., within a direct interaction distance) of two interactable user interface objects (e.g., 1303b and 1303C), and optionally if hand 1313a is in a direct ready state shape, device 101 will optionally move the hover state to a user interface object that is closer to hand 1313a, e.g., to user interface object 1303b if hand 1313a is closer to user interface object 1303b, and to user interface object 1303C if hand 1313a is closer to user interface object 1303C.
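The tie-break in this paragraph, where the hand is within direct range of more than one object, reduces to a nearest-object comparison; the sketch below reuses the hypothetical `Vec3` type from the earlier angle sketch.

```swift
// Illustrative sketch: when the hand is within the direct-interaction threshold of several
// interactable objects, the hover state goes to whichever object is closest to the hand.
func nearestDirectCandidate(handPosition: Vec3,
                            candidates: [(id: String, position: Vec3)]) -> String? {
    return candidates.min { a, b in
        (a.position - handPosition).length < (b.position - handPosition).length
    }?.id
}
```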
Fig. 14A-14H are flowcharts illustrating a method 1400 of enhancing interactions with user interface elements for mixed direct and indirect interaction modes, according to some embodiments. In some embodiments, the method 1400 is performed at a computer system (e.g., computer system 101 in fig. 1, such as a tablet, smart phone, wearable computer, or head-mounted device) that includes a display generating component (e.g., display generating component 120 in fig. 1, 3, and 4) (e.g., heads-up display, touch screen, projector, etc.) and one or more cameras (e.g., cameras pointing downward toward the user's hand (e.g., color sensors, infrared sensors, and other depth sensing cameras) or cameras pointing forward from the user's head). In some embodiments, the method 1400 is managed by instructions stored in a non-transitory computer readable storage medium and executed by one or more processors of a computer system, such as one or more processors 202 of computer system 101 (e.g., control unit 110 in fig. 1A). Some operations in method 1400 are optionally combined, and/or the order of some operations is optionally changed.
In some embodiments, the method 1400 is performed at an electronic device in communication with a display generation component and one or more input devices (including an eye tracking device). For example, a mobile device (e.g., a tablet, smart phone, media player, or wearable device) or a computer. In some embodiments, the display generating component is a display integrated with the electronic device (optionally a touch screen display), an external display such as a monitor, projector, television, or hardware component (optionally integrated or external) for projecting a user interface or making the user interface visible to one or more users, or the like. In some embodiments, the one or more input devices include a device capable of receiving user input (e.g., capturing user input, detecting user input, etc.) and transmitting information associated with the user input to the electronic device. Examples of input devices include a touch screen, a mouse (e.g., external), a touch pad (optionally integrated or external), a remote control device (e.g., external), another mobile device (e.g., separate from an electronic device), a handheld device (e.g., external), a controller (e.g., external), a camera, a depth sensor, an eye tracking device, and/or a motion sensor (e.g., a hand tracking device, a hand motion sensor), and so forth. In some embodiments, the hand tracking device is a wearable device, such as a smart glove. In some embodiments, the hand tracking device is a handheld input device, such as a remote control or a stylus.
In some embodiments, the electronic device displays (1402 a) a user interface via a display generation component, wherein the user interface includes a plurality of user interface objects of respective types, such as user interface objects 1303A, 1303b, 1303c in fig. 13A (e.g., user interface objects selectable via one or more hand gestures (such as tap or pinch gestures)), including a first user interface object in a first state (e.g., a non-hover state such as an idle or non-selected state) and a second user interface object in a first state (e.g., a non-hover state such as an idle or non-selected state). In some implementations, the first user interface object and/or the second user interface object are interactive user interface objects, and in response to detecting input directed to a given object, the electronic device performs an action associated with the user interface object. For example, the user interface object is a selectable option that, when selected, causes the electronic device to perform an action, such as displaying a corresponding user interface, changing a setting of the electronic device, or initiating playback of content. As another example, the user interface object is a container (e.g., window) in which the user interface/content is displayed, and in response to detecting a selection of the user interface object and a subsequent movement input, the electronic device updates the positioning of the user interface object in accordance with the movement input. In some implementations, the first user interface object and the second user interface object are displayed in a three-dimensional environment (e.g., a computer-generated reality (CGR) environment, such as a Virtual Reality (VR) environment, a Mixed Reality (MR) environment, or an Augmented Reality (AR) environment, etc.) generated by, displayed by, or otherwise made viewable by the device (e.g., the user interface is a three-dimensional environment and/or is displayed within the three-dimensional environment).
In some embodiments, when a user's gaze of the electronic device is directed to a first user interface object, such as gaze 1311a in fig. 13A (e.g., the user's gaze intersects the first user interface object, or the user's gaze is within a threshold distance (such as 1, 2, 5, 10 feet) of intersecting the first user interface object), in accordance with a determination that one or more criteria are met, including criteria met when a first predefined portion of the user of the electronic device is farther from a location corresponding to any of the plurality of user interface objects in the user interface than the threshold distance, such as the location of hand 1313a in fig. 13A (e.g., the location of a user's hand or finger (such as an index finger) is not within 3 inches, 6 inches, 1 foot, 2 feet, 5 feet, 10 feet of a location corresponding to any of the plurality of user interface objects in the user interface, such that input provided by the first predefined portion of the user to the user interface objects will be made in an indirect interaction manner, such as described with reference to methods 800, 1000, 1200, 1600, 1800, and 2000), the electronic device displays (1402 b), via the display generation component, the first user interface object in a second state (e.g., a hover state) that is different from the first state, and the second user interface object in the first state (e.g., a non-hover state), such as user interface objects 1303a and 1303b in fig. 13A. For example, when a user interface object is in the second state (e.g., the hover state), further input from the predefined portion of the user (e.g., movement of the index finger of the hand toward the user interface object) is optionally recognized by the device as input directed to that user interface object (e.g., selection of the user interface object in the hover state) when the predefined portion of the user is farther than the threshold distance from the location corresponding to the object. Examples of such inputs are described with reference to methods 800, 1000, 1200, 1600, 1800, and 2000. In some embodiments, such further input from a predefined portion of the user is optionally identified as not being directed to user interface objects that are in the non-hover state. In some implementations, displaying the first user interface object in the second state includes updating an appearance of the first user interface object to change a color of the first user interface object, highlight the first user interface object, lift/move the first user interface object toward a viewpoint of the user, etc., to indicate that the first user interface object is in a hover state (e.g., ready for further interaction), and displaying the second user interface object in the first state includes displaying the second user interface object without changing a color of the second user interface object, without highlighting the second user interface object, and without lifting/moving the second user interface object toward the viewpoint of the user, etc. In some embodiments, the one or more criteria include a criterion that is met when a predefined portion of the user is in a particular pose, such as described with reference to method 800. In some embodiments, if the user's gaze had been directed to the second user interface object (instead of the first user interface object) when the one or more criteria are met, the second user interface object would have been displayed in the second state and the first user interface object would have been displayed in the first state.
In some implementations, while the user's gaze is directed to the first user interface object (1402 c) (e.g., the user's gaze remains directed to the first user interface object during/after the movement of the predefined portion of the user described below), the electronic device detects (1402 d), via the one or more input devices, movement of the first predefined portion of the user (e.g., movement of the user's hand and/or fingers away from a first location to a second location) while the first user interface object is displayed in the second state (e.g., the hover state). In some embodiments, in response to detecting the movement of the first predefined portion of the user (1402 e), in accordance with a determination that the first predefined portion of the user moves within a threshold distance of a location corresponding to the second user interface object, such as hand 1313a in fig. 13C (e.g., before detecting the movement of the first predefined portion of the user, the first predefined portion of the user is not within a threshold distance of a location corresponding to any of the plurality of user interface objects in the user interface, but after detecting the movement of the first predefined portion of the user, the first predefined portion of the user is within the threshold distance of the location corresponding to the second user interface object), the electronic device displays, via the display generation component, the second user interface object in the second state (e.g., the hover state). For example, even if the user's gaze continues to point at the first user interface object, but not at the second user interface object, the hover state is moved from the first user interface object to the second user interface object because the user's hand and/or finger is within the threshold distance of the location corresponding to the second user interface object. In some implementations, when the first predefined portion of the user is within the threshold distance of the location corresponding to the second user interface object, the pose of the first predefined portion of the user needs to be a particular pose (such as described with reference to method 800) to move the hover state to the second user interface object. The input provided by the first predefined portion of the user to the second user interface object will optionally be made in a direct interaction manner, such as described with reference to methods 800, 1000, 1200, 1600, 1800, and 2000, when the first predefined portion of the user is within the threshold distance of the location corresponding to the second user interface object. The above-described manner of moving the second state to the second user interface object provides an efficient manner of facilitating interaction with the user interface object with which interaction is most likely, based on one or more of hand and gaze positioning, without requiring further user input specifying the given user interface object as a target for further interaction, which simplifies interaction between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and extends battery life of the electronic device by enabling the user to use the electronic device more quickly and efficiently.
In some implementations, in response to detecting movement of the first predefined portion of the user (1404 a), in accordance with a determination that the first predefined portion of the user is moving within a threshold distance of locations corresponding to the second user interface object (e.g., before detecting movement of the first predefined portion of the user, the first predefined portion of the user is not within a threshold distance of locations corresponding to any of the plurality of user interface objects in the user interface, but after detecting movement of the first predefined portion of the user, the first predefined portion of the user is within a threshold distance of locations corresponding to the second user interface object, the first predefined portion of the user is optionally not within a threshold distance of locations corresponding to any other of the plurality of user interface objects in the user interface), the electronic device displays (1404 b) the first user interface object in a first state via the display generating component, such as displaying the user interface objects 1303a and/or 1303b in a non-hovering state (e.g., a non-hovering state such as an idle or non-selected state) in fig. 13C. For example, because the first predefined portion of the user has now moved within the threshold distance of the second user interface object, the first predefined portion of the user is now determined by the electronic device to interact with the second user interface object and is no longer available to interact with the first user interface object. Thus, the electronic device optionally displays the first user interface object in the first state (e.g., rather than maintaining the first user interface object displayed in the second state). The above-described manner of displaying the first user interface object in the first state provides an efficient way of indicating that the first predefined portion of the user is no longer determined to interact with the first user interface object, which simplifies interactions between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and extends battery life of the electronic device by enabling the user to use the electronic device more quickly and efficiently (e.g., by avoiding erroneous inputs provided by the first predefined portion of the user to an incorrect user interface object).
In some implementations, in response to detecting movement of the first predefined portion of the user (1406 a), in accordance with a determination that the first predefined portion of the user is moving within a threshold distance of a location corresponding to the first user interface object (e.g., before detecting movement of the first predefined portion of the user, the first predefined portion of the user is not within a threshold distance of a location corresponding to any of the plurality of user interface objects in the user interface, but after detecting movement of the first predefined portion of the user, the first predefined portion of the user is within the threshold distance of the location corresponding to the first user interface object), the electronic device maintains display of the first user interface object in the second state (e.g., the hover state). For example, in fig. 13A, if the hand 1313a moves within a threshold distance of the object 1303a, the device 101 will maintain displaying the object 1303a in the second state. For example, the electronic device maintains displaying the first user interface object in the second state because the electronic device had displayed the first user interface object in the second state before the first predefined portion of the user moved within the threshold distance of the location corresponding to the first user interface object, and because the device determines that the first predefined portion of the user is still interacting with the first user interface object after the first predefined portion of the user moves within the threshold distance of the location corresponding to the first user interface object. In some embodiments, the user's gaze continues to be directed to the first user interface object, and in some embodiments, the user's gaze is no longer directed to the first user interface object. The input provided by the first predefined portion of the user to the first user interface object will optionally be made in a direct interaction manner, such as described with reference to methods 800, 1000, 1200, 1600, 1800, and 2000, when the first predefined portion of the user is within a threshold distance of a location corresponding to the first user interface object. The above-described manner of maintaining the first user interface object displayed in the second state provides an efficient way of indicating that the first predefined portion of the user is still determined to interact with the first user interface object, which simplifies interactions between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and extends battery life of the electronic device by enabling the user to more quickly and efficiently use the electronic device (e.g., by avoiding erroneous inputs provided by the first predefined portion of the user to an incorrect user interface object).
In some embodiments, in response to detecting movement of the first predefined portion of the user (1408 a), in accordance with a determination that the first predefined portion of the user is moving within a threshold distance of a location corresponding to a third user interface object of the plurality of user interface objects (e.g., the third user interface object is different from the first user interface object and the second user interface object), the electronic device displays, via the display generation component, the third user interface object in the second state (e.g., the hover state). For example, even if the user's gaze continues to point to the first user interface object and not to the third user interface object, the hover state will be moved from the first user interface object to the third user interface object because the user's hand and/or finger is within a threshold distance of the location corresponding to the third user interface object (e.g., in fig. 13C, if the hand 1313a had moved within a threshold distance of object 1303b instead of within a threshold distance of object 1303c, the device 101 would display object 1303b, rather than object 1303c, in the second state). In some implementations, when the first predefined portion of the user is within a threshold distance of a location corresponding to the third user interface object, the pose of the first predefined portion of the user needs to be a particular pose (such as described with reference to method 800) to move the hover state to the third user interface object. The input provided by the first predefined portion of the user to the third user interface object will optionally be made in a direct interaction manner, such as described with reference to methods 800, 1000, 1200, 1600, 1800, and 2000, when the first predefined portion of the user is within a threshold distance of a location corresponding to the third user interface object. The above-described manner of moving the second state to the user interface object when the first predefined portion of the user is within a threshold distance of the location corresponding to the user interface object provides an efficient manner of indicating that the first predefined portion of the user is still determined to interact with the user interface object, which simplifies interactions between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and extends battery life of the electronic device by enabling the user to more quickly and efficiently use the electronic device (e.g., by avoiding erroneous inputs provided by the first predefined portion of the user to an incorrect user interface object).
In some embodiments, in response to detecting movement of the first predefined portion of the user (1410 a), in accordance with a determination that the first predefined portion of the user is moving (1410 b) within a threshold distance of a location corresponding to the first user interface object and a location corresponding to the second user interface object (e.g., the first predefined portion of the user is now within a threshold distance of a location corresponding to two or more of the plurality of user interface objects, such as if, in fig. 13C, the hand 1313a had moved within a threshold distance of objects 1303b and 1303c), in accordance with a determination that the first predefined portion is closer to the location corresponding to the first user interface object than to the location corresponding to the second user interface object (e.g., closer to object 1303b than to object 1303c), the electronic device displays (1410 c) the first user interface object (e.g., 1303b) in a second state (e.g., a hover state) via the display generating component (e.g., and displays the second user interface object in the first state). In some implementations, in accordance with a determination that the first predefined portion is closer to the location corresponding to the second user interface object than to the location corresponding to the first user interface object (e.g., closer to object 1303c than to object 1303b), the electronic device displays (1410 d) the second user interface object (e.g., 1303c) in a second state (e.g., a hover state) via the display generating component (e.g., and displays the first user interface object in the first state). For example, the electronic device optionally moves the second state to the user interface object whose corresponding location is closest to the first predefined portion of the user when the first predefined portion of the user is within a threshold distance of locations corresponding to multiple user interface objects of the plurality of user interface objects. In some embodiments, the electronic device moves the second state as described above regardless of whether the user's gaze is directed to the first user interface object or the second (or other) user interface object, because the first predefined portion of the user is within a threshold distance of a location corresponding to at least one user interface object of the plurality of user interface objects. The above-described manner of moving the second state to the user interface object closest to the first predefined portion of the user when the first predefined portion of the user is within a threshold distance of locations corresponding to the plurality of user interface objects provides an efficient manner of selecting the user interface object for interaction (e.g., without utilizing additional user input) and indicating the user interface object to the user, which simplifies interaction between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and extends battery life of the electronic device by enabling the user to more quickly and efficiently use the electronic device (e.g., by avoiding erroneous inputs provided by the first predefined portion of the user to incorrect user interface objects).
In some embodiments, the one or more criteria include a criterion (1412 a) that is met when the first predefined portion of the user is in a predetermined pose, such as described with reference to hand 1313A in fig. 13A. For example, the hand is in a shape corresponding to the beginning of a gesture in which the thumb and index finger of the hand are brought together, or in a shape corresponding to the beginning of a gesture in which the index finger of the hand moves forward in space in a flick gesture (e.g., as if the index finger were flicking an imaginary surface at 0.5, 1, 2, 3cm in front of the index finger). The predetermined pose of the first predefined portion of the user is optionally as described with reference to method 800. The above-described requirement that the first predefined portion of the user is in a particular pose before the user interface object will have the second state (e.g., and is ready to accept input from the first predefined portion of the user) provides an efficient way of preventing accidental input/interaction of the first predefined portion of the user with the user interface element, which simplifies interactions between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and prolongs battery life of the electronic device by enabling the user to use the electronic device more quickly and efficiently.
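As a loose illustration of the "predetermined pose" criterion (the patent does not specify exact geometry; the joint names and tolerances here are invented for the sketch, and `Vec3` is the hypothetical vector type from the earlier sketch):

```swift
// Illustrative sketch (assumed joints and tolerances) of classifying the ready-state hand poses.
struct HandJoints {
    var thumbTip: Vec3
    var indexTip: Vec3
    var indexExtended: Bool   // assumed to come from an upstream hand-tracking classifier
}

// Indirect ready pose: thumb and index finger close together, as if about to pinch.
func isIndirectReadyPose(_ joints: HandJoints) -> Bool {
    let gap = (joints.thumbTip - joints.indexTip).length
    return gap > 0.005 && gap < 0.03 // meters; placeholder tolerances
}

// Direct ready pose: index finger extended, as if about to tap a nearby element.
func isDirectReadyPose(_ joints: HandJoints) -> Bool {
    return joints.indexExtended
}
```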
In some implementations, in response to detecting movement of the first predefined portion of the user (1414 a), in accordance with a determination that the first predefined portion of the user is moving within a threshold distance of a location corresponding to the first user interface object (e.g., before detecting movement of the first predefined portion of the user, the first predefined portion of the user is not within a threshold distance of a location corresponding to any of the plurality of user interface objects in the user interface, but after detecting movement of the first predefined portion of the user, the first predefined portion of the user is within the threshold distance of the location corresponding to the first user interface object), the electronic device maintains display of the first user interface object in the second state (e.g., the hover state). For example, if the hand 1313a has moved within a threshold distance of the object 1303a after the state shown in fig. 13A, the device 101 will optionally maintain displaying the object 1303a in the second state. In some implementations, the first user interface object in the second state (e.g., hover state) has a first visual appearance (1414 c) when the first predefined portion of the user is farther than the threshold distance from the location corresponding to the first user interface object, and the first user interface object in the second state (e.g., hover state) has a second visual appearance (1414 d), different from the first visual appearance, when the first predefined portion of the user is within the threshold distance of the location corresponding to the first user interface object, such as described with reference to user interface object 1303c in fig. 13C. For example, the visual appearance of the hover state for direct interaction with the first predefined portion of the user (e.g., as described with reference to methods 800, 1000, 1200, 1600, 1800, and 2000), which applies when the first predefined portion of the user is within the threshold distance of the location corresponding to the first user interface object, is optionally different from the visual appearance of the hover state for indirect interaction with the first predefined portion of the user (e.g., as described with reference to methods 800, 1000, 1200, 1600, 1800, and 2000). In some embodiments, the different visual appearance is one or more of the following: different amounts of spacing of the first user interface object from the back panel on which the first user interface object is displayed (e.g., displayed with no spacing or with less spacing when not in a hover state), different colors and/or highlights of the first user interface object when in a hover state (e.g., displayed with no color and/or highlights when not in a hover state), and so forth. The above-described manner of displaying the second state differently for direct and indirect interactions provides an efficient indication of the manner in which the device is responding to and/or handling interactions, which simplifies interactions between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and extends battery life of the electronic device by enabling the user to more quickly and efficiently use the electronic device (e.g., by avoiding erroneous inputs that are incompatible with currently active interactions with user interface objects).
In some embodiments, when the user's gaze is directed toward the first user interface object (e.g., the user's gaze intersects the first user interface object, or the user's gaze is within a threshold distance (such as 1, 2, 5, 10 feet) of intersecting the first user interface object), in accordance with a determination that one or more second criteria are met, including criteria that are met when a second predefined portion of the user, different from the first predefined portion, is farther from a location corresponding to any of the plurality of user interface objects in the user interface than the threshold distance (e.g., the location of a user's hand or finger (such as an index finger) is not within 3 inches, 6 inches, 1 foot, 2 feet, 5 feet, 10 feet of the location corresponding to any of the plurality of user interface objects in the user interface, such that input provided to the user interface object by the second predefined portion of the user is to be conducted in an indirect interaction manner, such as described with reference to methods 800, 1000, 1200, 1600, 1800, and 2000; in some embodiments, if the user's gaze had been directed to the second user interface object (instead of the first user interface object) when the one or more second criteria are met, the second user interface object would have been displayed in the second state and the first user interface object would have been displayed in the first state), the electronic device displays (1416 a) the first user interface object in the second state via the display generating component, such as displaying user interface objects 1303a and/or 1303b in fig. 13A and 13B in the hover state (e.g., displaying the first user interface object in the hover state based on the second predefined portion of the user). In some embodiments, in accordance with a determination that the one or more criteria are met, the first user interface object in the second state (e.g., hover state) has a first visual appearance (1416 b), and in accordance with a determination that the one or more second criteria are met, the first user interface object in the second state (e.g., hover state) has a second visual appearance (1416 c) that is different from the first visual appearance. For example, the hover state of the user interface object optionally has a different visual appearance (e.g., color, shading, highlighting, spacing from the back plate, etc.), depending on whether the hover state is based on the first predefined portion of the user interacting with the user interface object or the second predefined portion of the user interacting with the user interface object. In some embodiments, the direct-interaction hover state based on the first predefined portion of the user has a different visual appearance than the direct-interaction hover state based on the second predefined portion of the user, and the indirect-interaction hover state based on the first predefined portion of the user has a different visual appearance than the indirect-interaction hover state based on the second predefined portion of the user. In some embodiments, two predefined portions of the user interact simultaneously with two different user interface objects having the different hover state appearances described above. In some embodiments, the two predefined portions of the user interact with different user interface objects, or with the same user interface object, having the different hover state appearances described above, at different times (e.g., sequentially).
The above-described manner of displaying the second state differently for different predefined portions of the user provides an efficient way of indicating which predefined portion of the user is responded to by the device for a given user interface object, which simplifies interaction between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and prolongs battery life of the electronic device by enabling the user to use the electronic device more quickly and efficiently (e.g., by avoiding false inputs by the user's unpaired predefined portion).
In some implementations, displaying the second user interface object (1418 a) in the second state (e.g., hover state) occurs while the user's gaze remains directed to the first user interface object (such as gaze 1311a or 1311b in fig. 13C). For example, even if the user's gaze remains directed to the first user interface object, the electronic device displays the second user interface object in the second state and/or displays the first user interface object in the first state when the first predefined portion of the user moves to within a threshold distance of a location corresponding to the second user interface object. In some implementations, the user's gaze is directed toward the second user interface object. The above-described manner of selecting the user interface object that receives the second state for direct interaction, independently of gaze, provides an efficient way of selecting a user interface object for direct interaction without requiring additional gaze input, which simplifies interaction between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and extends battery life of the electronic device by enabling the user to use the electronic device more quickly and efficiently.
In some embodiments, displaying the second user interface object in the second state (e.g., the hover state) is further based on determining that the second user interface object is within an attention area associated with a user of the electronic device (1420 a), such as object 1303C in fig. 13C being within an attention area associated with a user of the electronic device (e.g., if the second user interface object is not within the attention area associated with the user, the second user interface object will not be displayed in the second state (e.g., will continue to be displayed in the first state)). In some embodiments, the first user interface object will continue to be displayed in the second state, and in some embodiments, the first user interface object will be displayed in the first state. For example, the attention area is optionally an area and/or volume of the user interface and/or three-dimensional environment specified based on the user's gaze direction/location, and is a factor in deciding whether the user interface object is interactable by the user under various conditions, such as described with reference to method 1000. The above-described manner of moving the second state only when the second user interface object is within the user's attention area provides an efficient way of preventing unintentional interactions with user interface objects of which the user may not be aware, which simplifies interactions between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and prolongs battery life of the electronic device by enabling the user to use the electronic device more quickly and efficiently.
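One possible (assumed) way to model the attention area described above is as a cone around the gaze ray; the 20-degree half-angle and the type names below are illustrative assumptions, not values from the disclosure.

```swift
import Foundation
import simd

// Hypothetical gaze ray used to define the attention area.
struct GazeRay {
    var origin: SIMD3<Float>
    var direction: SIMD3<Float>   // assumed to be normalized
}

// An object is eligible for the hover (second) state only if it falls inside a
// cone of the given half-angle around the gaze ray.
func isWithinAttentionArea(objectPosition: SIMD3<Float>,
                           gaze: GazeRay,
                           halfAngleDegrees: Double = 20) -> Bool {
    let toObject = objectPosition - gaze.origin
    let dist = simd_length(toObject)
    guard dist > 0 else { return true }   // object at the eye: trivially inside
    let cosToObject = simd_dot(toObject / dist, gaze.direction)
    let cosLimit = Float(cos(halfAngleDegrees * .pi / 180))
    return cosToObject >= cosLimit
}
```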
In some embodiments, the one or more criteria include a criterion that is met when at least one predefined portion of the user is in a predetermined pose, the at least one predefined portion including a first predefined portion (1422 a) of the user, such as described with reference to hand 1313A in fig. 13A (e.g., a ready state pose, such as those described with reference to method 800). For example, displaying the user interface object in the second state based on the gaze optionally requires that at least one predefined portion of the user be in a predetermined pose (e.g., to be able to interact with the user interface object displayed in the second state) before displaying the user interface object to which the gaze is directed in the second state. The above-described manner in which the predefined portion of the user is in a particular pose before the user interface object is displayed in the second state provides an efficient manner of preventing unintended interactions with the user interface object when the user only provides gaze input without utilizing corresponding input of the predefined portion of the user, which simplifies interactions between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and prolongs battery life of the electronic device by enabling the user to use the electronic device more quickly and efficiently.
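A minimal sketch of the pose gate described above, assuming the ready pose is a pre-pinch check based on thumb-to-index distance; the distance range and type names are hypothetical.

```swift
// Hypothetical per-hand pose data.
struct HandPose {
    var thumbTipToIndexTipDistance: Float   // meters
    var isTracked: Bool
}

// Pre-pinch: thumb and index are close together but not yet touching.
func isInReadyPose(_ hand: HandPose,
                   prePinchRange: ClosedRange<Float> = 0.005...0.03) -> Bool {
    hand.isTracked && prePinchRange.contains(hand.thumbTipToIndexTipDistance)
}

// Gaze alone does not produce the hover state; at least one hand must be in the
// predetermined (ready) pose at the same time.
func shouldShowHoverForGaze(gazeOnObject: Bool, hands: [HandPose]) -> Bool {
    gazeOnObject && hands.contains(where: isInReadyPose)
}
```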
In some embodiments, while the first user interface object is displayed in the second state (e.g., the hover state), the electronic device detects (1424 a), via one or more input devices, a first movement of an attention area associated with the user (e.g., no movement of the first predefined portion of the user is detected). In some embodiments, in response to detecting the first movement of the attention area associated with the user (1424 b), in accordance with a determination that the attention area includes a third user interface object of a respective type (e.g., in some embodiments, the first user interface object is no longer within the attention area associated with the user; in some embodiments, the user's gaze is directed to the third user interface object; in some embodiments, the user's gaze is not directed to the third user interface object), and the first predefined portion of the user is within a threshold distance of a location corresponding to the third user interface object, the electronic device displays (1424 c), via the display generating component, the third user interface object in the second state (e.g., a hover state) (e.g., and displays the first user interface object in the first state). Thus, in some implementations, even if the first predefined portion of the user does not move, if gaze movement of the user causes the attention area to move to a new location that includes a user interface object whose corresponding location is within the threshold distance of the first predefined portion of the user, the electronic device will move the second state from the first user interface object to the third user interface object. For example, in fig. 13C, if the attention area does not initially include object 1303C, but later includes the object, then when the attention area moves to include object 1303C, device 101 will optionally display object 1303C in the second state, such as shown in fig. 13C. In some embodiments, the second state moves to the third user interface object only if the first user interface object had the second state while the first predefined portion of the user was farther than the threshold distance from the location corresponding to the first user interface object, and the second state does not move to the third user interface object when and/or because the first user interface object has the second state while the first predefined portion of the user is within the threshold distance of the location corresponding to the first user interface object (and continues to be within the threshold distance).
The above-described manner of moving the second state based on changes in the attention area provides an efficient way of ensuring that user interface objects having the second state (and thus those user interface objects with which the user is interacting or potentially interacting) are user interface objects that the user is noticing rather than user interface objects that the user is not noticing, which simplifies interactions between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and extends battery life of the electronic device by enabling the user to use the electronic device more quickly and efficiently (e.g., by avoiding false inputs directed to user interface objects that are no longer within the user's attention).
In some implementations, after detecting the first movement of the attention area and while displaying the third user interface object in the second state (e.g., the hover state) (e.g., because the first predefined portion of the user is within a threshold distance of a location corresponding to the third user interface object), the electronic device detects (1426 a), via the one or more input devices, a second movement of the attention area, wherein the third user interface object is no longer within the attention area as a result of the second movement of the attention area (e.g., the user's gaze moves away from an area that includes the third user interface object such that the attention area has moved to no longer include the third user interface object). In some embodiments, in response to detecting the second movement of the attention area (1426 b), in accordance with a determination that the first predefined portion of the user is within a threshold distance of the third user interface object (e.g., in some embodiments, it is also determined that the first predefined portion of the user is directly or indirectly interacting with, or remains directly or indirectly interacting with, the third user interface object as described with reference to methods 800, 1000, 1200, 1600, 1800, and 2000, and/or that the first predefined portion of the user is in a predetermined pose as described with reference to method 800), the electronic device maintains (1426 c) display of the third user interface object in the second state (e.g., the hover state). For example, in fig. 13C, if after the attention area moves to include the object 1303C and the device 101 displays the object 1303C in the second state, the device 101 detects that the attention area moves again to exclude the object 1303C, the device 101 will optionally maintain displaying the object 1303C in the second state. For example, if the first predefined portion of the user remains within a threshold distance of a location corresponding to the user interface object, the second state optionally does not move away from the user interface object as a result of the attention area moving away from the user interface object. In some implementations, if the first predefined portion of the user is farther than the threshold distance from the location corresponding to the third user interface object, the second state will move away from the third user interface object (e.g., and the third user interface object will be displayed in the first state). The above-described manner of maintaining the second state of the user interface object when the first predefined portion of the user is within the threshold distance of the user interface object provides the user with an efficient manner of continuing to interact with the user interface object while viewing and/or interacting with other portions of the user interface, which simplifies interactions between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and extends battery life of the electronic device by enabling the user to more quickly and efficiently use the electronic device.
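The retargeting and maintenance rules in the preceding paragraphs could be summarized in code along the following lines; the object model, identifiers, and 15 cm threshold are assumptions for illustration.

```swift
// Hypothetical per-object state used to decide where the hover state goes after
// the attention area moves.
struct UIObjectState {
    let id: Int
    var isInAttentionArea: Bool
    var distanceToHand: Float    // meters
}

let directThreshold: Float = 0.15   // assumed direct-interaction threshold

func hoverTarget(afterAttentionMove objects: [UIObjectState],
                 currentHover: Int?) -> Int? {
    // 1. Prefer an object inside the new attention area that the hand is near.
    if let candidate = objects.first(where: {
        $0.isInAttentionArea && $0.distanceToHand <= directThreshold
    }) {
        return candidate.id
    }
    // 2. Otherwise keep the current hover target if the hand is still near it,
    //    even though it may have left the attention area.
    if let current = currentHover,
       let obj = objects.first(where: { $0.id == current }),
       obj.distanceToHand <= directThreshold {
        return current
    }
    // 3. Otherwise no proximity-based hover target remains.
    return nil
}
```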
In some implementations, in response to detecting the second movement of the attention area and in accordance with a determination that the first predefined portion of the user has not interacted with the third user interface object (1428 a) (e.g., when or after the attention area has moved, the first predefined portion of the user has stopped interacting directly or indirectly with the third user interface object, such as described with reference to methods 800, 1000, 1200, 1600, 1800, and 2000), the electronic device displays (1428 b) the first user interface object in a second state (e.g., a hover state) similar to that shown and described with reference to fig. 13A in accordance with a determination that the first user interface object is within the attention area, the one or more criteria are met, and the user's gaze is directed to the first user interface object. In some implementations, in accordance with a determination that the second user interface object is within the attention area, the one or more criteria are met, and the user's gaze is directed to the second user interface object, the electronic device displays (1428 c) the second user interface object in a second state (e.g., a hover state). For example, the electronic device optionally no longer maintains the third user interface object in the second state if the first predefined portion of the user no longer interacts with the third user interface object when or after the attention area moves away from the third user interface object. In some implementations, the electronic device moves the second state between user interface objects of the plurality of user interface objects based on the gaze of the user. The above-described manner of moving the second state with the first predefined portion of the user no longer interacting with the third user interface object provides the user with an efficient way of being able to interact/interact with other user interface objects and does not lock interactions with the third user interface object when the first predefined portion of the user has stopped interacting with the third user interface object, which simplifies interactions between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and extends battery life of the electronic device by enabling the user to use the electronic device more quickly and efficiently.
In some embodiments, upon meeting the one or more criteria (1430 a), before detecting movement of the first predefined portion of the user and while displaying the first user interface object in the second state (e.g., hover state), the electronic device detects (1430B) movement of the user's gaze to the second user interface object, such as gaze 1311B in fig. 13B, via the eye tracking device (e.g., the user's gaze intersects the second user interface object without intersecting the first user interface object, or the user's gaze is within a threshold distance, such as 1, 2, 5, 10 feet, of intersecting the second user interface object without intersecting the first user interface object). In some implementations, in response to detecting movement of the user's gaze to the second user interface object, the electronic device displays (1430 c) the second user interface object in a second state (e.g., a hover state) via the display generating component, such as shown in fig. 13B with user interface object 1303B (e.g., and displaying the first user interface object in the first state). Thus, in some embodiments, when the first predefined portion of the user is farther than the threshold distance from the location of any of the plurality of user interface objects, the electronic device moves the second state from the user interface object to the user interface object based on the user's gaze. The above-described manner of moving the second state based on the user gaze provides an efficient way of enabling the user to specify user interface objects for further interaction, which simplifies interactions between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and extends battery life of the electronic device by enabling the user to use the electronic device more quickly and efficiently.
In some embodiments, after detecting movement of the first predefined portion of the user and while displaying the second user interface object in the second state (e.g., hover state) in accordance with a determination that the first predefined portion of the user is within a threshold distance of a location corresponding to the second user interface object, the electronic device detects (1432 a) movement of the user's gaze to the first user interface object (e.g., and not pointing to the second user interface object), such as gaze 1311a or 1311b in fig. 13C, via the eye tracking device. In some implementations, in response to detecting movement of the user's gaze to the first user interface object, the electronic device maintains display of the second user interface object (1432 b) in a second state (e.g., a hover state), such as shown with user interface object 1303C in fig. 13C (and maintains display of the first user interface object in the first state). Thus, in some implementations, the electronic device does not move the second state based on the user gaze when the second state is within a threshold distance of a location corresponding to the relevant user interface object based on the first predefined portion of the user. In some implementations, if the first predefined portion of the user is not within a threshold distance of a location corresponding to the relevant user interface object, the electronic device will have moved the second state to the first user interface object, optionally in accordance with gaze directed to the first user interface object. The above-described manner of maintaining the second state of the user interface object when the first predefined portion of the user is within the threshold distance of the user interface object provides the user with an efficient manner of continuing to interact with the user interface object while viewing and/or interacting with other portions of the user interface, which simplifies interactions between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and extends battery life of the electronic device by enabling the user to more quickly and efficiently use the electronic device.
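A compact sketch of the gaze rule above, under the assumption that hand proximity to the currently hovered object "pins" the hover state; names and the threshold value are hypothetical.

```swift
// Gaze moves the hover state between objects only while the hand is farther than
// the direct threshold from the currently hovered object; once the hand is near
// an object, gaze movement alone does not take the hover state away from it.
func nextHoverTarget(gazedObject: Int?,
                     currentHover: Int?,
                     handDistanceToCurrentHover: Float?,
                     directThreshold: Float = 0.15) -> Int? {
    if let current = currentHover,
       let distance = handDistanceToCurrentHover,
       distance <= directThreshold {
        return current            // proximity pins the hover state in place
    }
    return gazedObject            // otherwise hover follows gaze
}
```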
Fig. 15A-15E illustrate an exemplary manner in which the electronic device 101a manages input from both hands of a user, according to some embodiments.
Fig. 15A shows that the electronic device 101a displays a three-dimensional environment via the display generating section 120. It should be appreciated that in some embodiments, the electronic device 101a utilizes one or more of the techniques described with reference to fig. 15A-15E in a two-dimensional environment or user interface without departing from the scope of the present disclosure. As described above with reference to fig. 1-6, the electronic device optionally includes a display generation component 120a (e.g., a touch screen) and a plurality of image sensors 314a. The image sensor optionally includes one or more of the following: a visible light camera; an infrared camera; a depth sensor; or any other sensor that the electronic device 101a can use to capture one or more images of the user or a portion of the user when the user interacts with the electronic device 101 a. In some embodiments, the display generating component 120a is a touch screen capable of detecting gestures and movements of the user's hand. In some embodiments, the user interfaces described below may also be implemented on a head-mounted display that includes a display generating component that displays the user interface to the user, as well as sensors that detect the physical environment and/or movement of the user's hand (e.g., external sensors facing outward from the user) and/or sensors that detect the user's gaze (e.g., internal sensors facing inward toward the user).
Fig. 15A shows that the electronic device 101a displays a three-dimensional environment. The three-dimensional environment includes a representation 1504 of a table in the physical environment of the electronic device 101a (e.g., such as table 604 in fig. 6B), a first selectable option 1503, a second selectable option 1505, and a third selectable option 1507. In some implementations, the representation 1504 of the table is a photorealistic image (e.g., video or digital passthrough) of the table displayed by the display generation component 120 a. In some implementations, the representation 1504 of the table is a view (e.g., a real or physical perspective) of the table through the transparent portion of the display generating component 120 a. In some implementations, in response to detecting a selection of a respective one of selectable options 1503, 1505, and 1507, electronic device 101a performs an action associated with the respective selected option. For example, the electronic device 101a activates a setting, initiates playback of a content item, navigates to a user interface, initiates communication with another electronic device, or performs another operation associated with a respective selected option.
In fig. 15A, the user provides input with his hand 1509 pointing to the first selectable option 1503. The electronic device 101a detects the input in response to detecting that the user's gaze 1501a points to the first selectable option 1503 and that the user's hand 1509 is in a state corresponding to a hand providing indirect input. For example, the electronic apparatus 101a detects that the hand 1509 takes a hand shape corresponding to an indirect input, such as a pinch hand shape in which the thumb of the hand 1509 is in contact with another finger of the hand 1509. In response to the user input, the electronic device 101a updates the display of the first selectable option 1503, which is why the first selectable option 1503 is different in color from the other selectable options 1505 and 1507 in fig. 15A. In some embodiments, the electronic device 101a does not perform actions associated with the selection input unless and until the end of the selection input is detected, such as detecting that the hand 1509 stops pinching the hand shape.
In fig. 15B, the user maintains the user input with a hand 1509. For example, the user continues to make a pinch hand shape with the hand 1509. As shown in fig. 15B, the user's gaze 1501B points to the second selectable option 1505 rather than continuing to point to the first selectable option 1503. Even if the user's gaze 1501b is no longer pointing at the first selectable option 1503, the electronic device 101a optionally continues to detect input from the hand 1509, and in response to detecting the end of the input (e.g., the user no longer performs pinching of the hand shape with the hand 1509), the action associated with the selectable option 1503 will optionally be performed in accordance with the input.
As shown in fig. 15B, although the user's gaze 1501b points to the second selectable option 1505, the electronic device 101a foregoes updating the appearance of the second selectable option 1505 and foregoes directing input (e.g., from the hand 1509) to the second selectable option 1505. In some embodiments, the electronic device 101a foregoes directing input to the second selectable option 1505 because it does not detect that a hand of the user (e.g., hand 1509 or the user's other hand) is in a hand state that meets the ready state criteria. For example, in fig. 15B, no hand meets the ready state criteria because hand 1509 has interacted indirectly with (e.g., provided indirect input to) first user interface element 1503 and is therefore unavailable for input to selectable option 1505, and the other hand of the user is not visible to electronic device 101a (e.g., not detected by the various sensors of device 101 a). The ready state criteria are described in more detail above with reference to fig. 7A-8K.
In fig. 15C, the electronic device 101a detects that the user's hand 1511 meets the ready state criteria when the user's gaze 1501b points to the second selectable option 1505 and the hand 1509 continues to interact indirectly with the option 1503. For example, the hand 1511 is in a hand shape corresponding to an indirect ready state (e.g., hand state B), such as a pre-pinch hand shape in which a thumb of the hand 1511 is within a threshold distance (e.g., 0.1, 0.5, 1, 2, 3, 5, 10, etc. centimeters) of another finger of the hand 1511 without touching the finger. Because the user's gaze 1501b points to the second selectable option 1505 when the hand 1511 meets the ready state criteria, the electronic device 101a updates the second selectable option 1505 to indicate that further input provided by the hand 1511 will point to the second selectable option 1505. In some embodiments, while continuing to detect input from the pointing option 1503 of the hand 1509, the electronic device 101a detects the ready state of the hand 1511 and prepares to direct indirect input of the hand 1511 to the option 1505.
In some embodiments, the electronic device 101a detects that the hand 1511 is in an indirect ready state (e.g., hand state B) when the user's gaze 1501a points to option 1503 as shown in fig. 15A, and then detects the user's gaze 1501b on option 1505, as shown in fig. 15C. In this case, in some embodiments, the electronic device 101a does not update the appearance of option 1505, and does not become ready to accept indirect input from the hand 1511 directed to option 1505, until the user's gaze 1501b points to option 1505 while the hand 1511 is in the indirect ready state (e.g., hand state B). In some embodiments, the electronic device 101a detects that the user's gaze 1501b points to option 1505 before detecting that the hand 1511 is in an indirect ready state (e.g., hand state B) as shown in fig. 15B, and then detects that the hand 1511 is in the indirect ready state as shown in fig. 15C. In this case, in some embodiments, the electronic device does not update the appearance of option 1505, and does not become ready to accept indirect input from hand 1511 directed at option 1505, until it detects that hand 1511 is in the ready state while gaze 1501b points at option 1505.
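The order-independence described in the last two paragraphs might be expressed as a simple predicate over the current gaze target and hand pose, as in the hypothetical sketch below (the option identifier 1505 is reused only as an illustrative value).

```swift
// The indirect ready state is directed at an option only once gaze is on that
// option AND the hand is in the indirect ready pose at the same moment,
// regardless of which condition began first.
struct IndirectReadyInput {
    var gazedOptionID: Int?
    var handInPrePinchPose: Bool
    var handAlreadyEngagedElsewhere: Bool
}

func indirectReadyTarget(_ input: IndirectReadyInput) -> Int? {
    guard input.handInPrePinchPose, !input.handAlreadyEngagedElsewhere else { return nil }
    return input.gazedOptionID
}

// Pre-pinch detected first, gaze arrives later: the target appears only at the
// moment both conditions hold.
let before = indirectReadyTarget(IndirectReadyInput(gazedOptionID: nil,
                                                    handInPrePinchPose: true,
                                                    handAlreadyEngagedElsewhere: false))
let after = indirectReadyTarget(IndirectReadyInput(gazedOptionID: 1505,
                                                   handInPrePinchPose: true,
                                                   handAlreadyEngagedElsewhere: false))
print(before as Any, after as Any)   // nil, Optional(1505)
```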
In some embodiments, if the user's gaze 1501b moves to the third selectable option 1507, the electronic device 101a will cause the second selectable option 1505 to revert to the appearance shown in fig. 15B, and will update the third selectable option 1507 to indicate that further input provided by the hand 1511 (e.g., and not provided by the hand 1509, because the hand 1509 has interacted with and/or provided input to the selectable option 1503) will point to the third selectable option 1507. Similarly, in some embodiments, if the hand 1509 is not interacting with the first selectable option 1503 and instead is in a hand shape that meets the criteria of an indirect ready state (e.g., a pre-pinch hand shape), the electronic device 101a directs the ready state of the hand 1509 to the selectable option 1503, 1505, or 1507 that the user is looking at (e.g., regardless of the state of the hand 1511). In some embodiments, if only one hand meets the indirect ready state criteria (e.g., is in a pre-pinch hand shape) and the other hand does not interact with a user interface element and does not meet the ready state criteria, the electronic device 101a directs the ready state of the hand in the ready state to the selectable option 1503, 1505, or 1507 that the user is looking at.
In some embodiments, as described above with reference to fig. 7A-8K, in addition to detecting an indirect ready state, the electronic device 101a also detects a direct ready state in which one of the user's hands is within a threshold distance (e.g., 1, 2, 3, 5, 10, 15, 30, etc. centimeters) of the user interface element in a hand shape corresponding to direct manipulation (such as a pointing hand shape with one or more fingers extending and one or more fingers curling toward the palm). In some embodiments, the electronic device 101a is capable of tracking a direct ready state associated with each of the user's hands. For example, if the hand 1511 is within a threshold distance of the first selectable option 1503 when in the pointing hand shape and the hand 1509 is within a threshold distance of the second selectable option 1505 when in the pointing hand shape, the electronic device 101a directs the direct ready state of the hand 1511, and any subsequent direct inputs of the hand 1511, to the first selectable option 1503, and directs the direct ready state of the hand 1509, and any subsequent direct inputs of the hand 1509, to the second selectable option 1505. In some embodiments, the direct ready state points to the user interface element that the hand is within the threshold distance of, and moves in accordance with the movement of the hand. For example, if the hand 1509 moves from within the threshold distance of the second selectable option 1505 to within the threshold distance of the third selectable option 1507, the electronic device 101a moves the direct ready state from the second selectable option 1505 to the third selectable option 1507 and directs further direct input of the hand 1509 to the third selectable option 1507.
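A sketch of the per-hand direct targeting described above, assuming each pointing hand is assigned to the nearest option within a threshold distance; the positions, threshold, and helper names are illustrative.

```swift
import simd

// Hypothetical interactive option with a position in the environment.
struct Option {
    let id: Int
    let position: SIMD3<Float>
}

// Each hand in a pointing pose is assigned a direct ready state for whichever
// option it is currently within the threshold distance of; the assignment moves
// with the hand.
func directReadyTarget(handPosition: SIMD3<Float>,
                       isPointingPose: Bool,
                       options: [Option],
                       threshold: Float = 0.15) -> Int? {
    guard isPointingPose else { return nil }
    return options
        .map { (id: $0.id, distance: simd_distance(handPosition, $0.position)) }
        .filter { $0.distance <= threshold }
        .min(by: { $0.distance < $1.distance })?
        .id
}
```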
In some embodiments, the electronic device 101a is capable of detecting a direct ready state (or direct input) from one hand and an indirect ready state from the other hand pointing to a user interface element that the user is looking at when the other hand meets the indirect ready state criteria. For example, if the hand 1511 is in a direct ready state or provides a direct input to the third selectable option 1503 and the hand 1509 is in a hand shape (e.g., a pre-pinch hand shape) that meets the criteria of an indirect ready state, the electronic device 101a directs the indirect ready state of the hand 1509, and any subsequent indirect input of the hand 1509 detected while the user's gaze continues to be directed to the same user interface element, to the user interface element at which the user is looking. Likewise, for example, if the hand 1509 is in a direct ready state or provides a direct input to the third selectable option 1503 and the hand 1511 is in a hand shape (e.g., a pre-pinch hand shape) that meets the criteria of an indirect ready state, the electronic device 101a directs the indirect ready state of the hand 1511, and any subsequent indirect input of the hand 1511 detected while the user's gaze continues to be directed to the same user interface element, to the user interface element at which the user is looking.
In some embodiments, the electronic device 101a stops pointing the indirect ready state at the user interface element that the user is looking at in response to detecting the direct input. For example, in fig. 15C, if the hand 1511 were to initiate a direct interaction with the third selectable option 1507 (e.g., after having been in an indirect interaction state with selectable option 1505), the electronic device 101a would stop displaying the second selectable option 1505 with an appearance indicating that the indirect ready state of the hand 1511 is pointing to the second selectable option 1505, and would update the third selectable option 1507 according to the provided direct input. For example, if the hand 1511 is within a threshold distance (e.g., 1, 2, 3, 5, 10, 15, 30, etc. centimeters) of the direct ready state of the third selectable option 1507, the electronic device 101a will update the third selectable option 1507 to indicate that further direct input of the hand 1511 will be directed to the third selectable option 1507. As another example, if the hand 1511 is within a direct input threshold distance (e.g., 0.05, 0.1, 0.3, 0.5, 1, 2, etc. centimeters) of the third selectable option 1507 and directly interacts with the third selectable option 1507 (e.g., provides direct input to the third selectable option), the electronic device 101a will update the appearance of the third selectable option 1507 to indicate that direct input is being provided to the third selectable option 1507.
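The precedence described above, where a hand's direct interaction supersedes its indirect ready state, could be modeled with a per-hand engagement value along these (assumed) lines.

```swift
// Hypothetical per-hand engagement model: direct targeting replaces any existing
// indirect ready state for that hand.
enum HandEngagement {
    case none
    case indirectReady(targetID: Int)
    case directReady(targetID: Int)
    case directInput(targetID: Int)
}

func resolveEngagement(current: HandEngagement,
                       directCandidate: Int?,
                       isWithinDirectInputDistance: Bool) -> HandEngagement {
    guard let direct = directCandidate else { return current }
    // Direct interaction takes priority over any existing indirect ready state.
    return isWithinDirectInputDistance ? .directInput(targetID: direct)
                                       : .directReady(targetID: direct)
}
```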
In some embodiments, if the hand 1511 no longer meets the ready state criteria, the electronic device 101a will cease directing the ready state to the user interface element that the user is looking at. For example, if the hand 1511 is neither interacting with one of the selectable options 1503, 1505, and 1507, nor in a hand shape that meets the criteria for an indirect ready state, the electronic device 101a stops directing the ready state associated with the hand 1511 toward the selectable option 1503, 1505, or 1507 that the user is looking at, but will continue to maintain indirect interaction of the hand 1509 with the option 1503. For example, if the hand 1511 is no longer visible to the electronic device 101a, such as in fig. 15B, the electronic device 101a will revert the second selectable option 1505 to the appearance shown in fig. 15B. As another example, if the hand 1511 interacts indirectly with one of the user interface elements while the hand 1509 interacts with the first selectable option 1503, the electronic device 101a will not direct a ready state to another user interface element based on the user's gaze, as will be described below with reference to fig. 15D.
For example, in fig. 15D, the electronic device 101a detects an indirect input (e.g., provided by the hand 1509) directed to the first selectable option 1503 and an indirect input (e.g., provided by the hand 1513) directed to the second selectable option 1505. As shown in fig. 15D, the electronic device 101a updates the appearance of the second selectable option 1505 from the appearance of the second selectable option 1505 in fig. 15C to indicate that the hand 1513 is providing indirect input to the second selectable option 1505. In some implementations, upon detecting that the user's hand 1513 is in a hand shape that corresponds to an indirect input (e.g., a pinch hand shape), the electronic device 101a directs the input to the second selectable option 1505 in response to detecting that the user's gaze 1501b is directed to the second selectable option 1505. In some embodiments, when the input is complete, the electronic device 101a performs an action in accordance with the input directed to the second selectable option 1505. For example, the indirect selection input is completed after detecting that the hand 1513 stops making the pinch hand shape.
In some implementations, when both hands 1513 and 1509 interact with a user interface element (e.g., interact with the second selectable option 1505 and the first selectable option 1503, respectively), the electronic device 101a does not direct the ready state to another user interface element according to the user's gaze (e.g., because the device 101a does not detect any hands available to interact with the selectable option 1507). For example, in fig. 15D, the user points his gaze 1501c at the third selectable option 1507 when the hands 1509 and 1513 are indirectly interacting with other selectable options, and the electronic device 101a foregoes updating the third selectable option 1507 to indicate that further input will be directed to the third selectable option 1507.
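A hypothetical sketch of the availability rule illustrated by figs. 15C-15D: the gaze target receives a ready-state indication only if some tracked hand is free; with both hands already engaged, gaze at a third option produces no new indication. The types and identifiers are illustrative.

```swift
// Hypothetical per-hand tracking state.
struct TrackedHand {
    var isTracked: Bool
    var meetsReadyPose: Bool
    var engagedElementID: Int?   // element this hand is already interacting with
}

// The gazed element is highlighted for further input only if a hand is available:
// tracked, in the ready pose, and not already engaged with another element.
func elementToHighlightForGaze(gazedElementID: Int?, hands: [TrackedHand]) -> Int? {
    let freeHandExists = hands.contains {
        $0.isTracked && $0.meetsReadyPose && $0.engagedElementID == nil
    }
    return freeHandExists ? gazedElementID : nil
}

// With hands engaged with 1503 and 1505, gaze at 1507 yields no highlight:
let hands = [TrackedHand(isTracked: true, meetsReadyPose: false, engagedElementID: 1503),
             TrackedHand(isTracked: true, meetsReadyPose: false, engagedElementID: 1505)]
print(elementToHighlightForGaze(gazedElementID: 1507, hands: hands) as Any)   // nil
```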
Fig. 16A-16I are flowcharts illustrating a method 1600 of managing input from two hands of a user, according to some embodiments. In some embodiments, the method 1600 is performed at a computer system (e.g., computer system 101 in fig. 1, such as a tablet, smart phone, wearable computer, or head-mounted device) that includes a display generating component (e.g., display generating component 120 in fig. 1, 3, and 4) (e.g., heads-up display, touch screen, projector, etc.) and one or more cameras (e.g., cameras pointing downward toward the user's hand (e.g., color sensors, infrared sensors, and other depth sensing cameras) or cameras pointing forward from the user's head). In some embodiments, method 1600 is managed by instructions stored in a non-transitory computer readable storage medium and executed by one or more processors of a computer system, such as one or more processors 202 of computer system 101 (e.g., control unit 110 in fig. 1A). Some operations in method 1600 are optionally combined, and/or the order of some operations is optionally changed.
In some embodiments, method 1600 is performed at an electronic device in communication with a display generating component and one or more input devices including an eye tracking device (e.g., a mobile device (e.g., a tablet, smart phone, media player, or wearable device) or computer). In some embodiments, the display generating component is a display integrated with the electronic device (optionally a touch screen display), an external display such as a monitor, projector, television, or hardware component (optionally integrated or external) for projecting a user interface or making the user interface visible to one or more users, or the like. In some embodiments, the one or more input devices include a device capable of receiving user input (e.g., capturing user input, detecting user input, etc.) and transmitting information associated with the user input to the electronic device. Examples of input devices include a touch screen, a mouse (e.g., external), a touch pad (optionally integrated or external), a remote control device (e.g., external), another mobile device (e.g., separate from the electronic device), a handheld device (e.g., external), a controller (e.g., external), a camera, a depth sensor, an eye tracking device, and/or a motion sensor (e.g., a hand tracking device, a hand motion sensor), and so forth. In some implementations, the electronic device communicates with a hand tracking device (e.g., one or more cameras, depth sensors, proximity sensors, touch sensors (e.g., touch screen, touch pad)). In some embodiments, the hand tracking device is a wearable device, such as a smart glove. In some embodiments, the hand tracking device is a handheld input device, such as a remote control or a stylus.
In some embodiments, when a gaze (e.g., 1501 a) of a user of the electronic device 101a is directed toward a first user interface element (e.g., 1503) displayed via the display generating component, such as in fig. 15A (and when a first predefined portion of the user (e.g., a first hand, finger, or arm of the user, such as the right hand of the user) interacts with the first user interface element (e.g., such as described with reference to methods 800, 1000, 1200, 1400, 1800, and/or 2000)), the electronic device 101a detects (1602 a), via the eye tracking device, movement of the gaze (e.g., 1501 b) of the user away from the first user interface element (e.g., 1503) to a second user interface element (e.g., 1505) displayed via the display generating component. In some implementations, in accordance with a determination that a pose (e.g., location, orientation, hand shape) of a predefined portion of the user meets one or more criteria, the predefined portion of the user indirectly interacts with the first user interface element. For example, in response to detecting that the user's hand is oriented with the palm away from the user's torso, is positioned at least a threshold distance (e.g., 3, 5, 10, 20, 30, etc. centimeters) from the first user interface element, and is making a predetermined hand shape or is in a predetermined pose, the user's hand indirectly interacts with the first user interface element. In some embodiments, the predetermined hand shape is a pre-pinch hand shape in which the thumb of the hand is within a threshold distance (e.g., 0.5, 1, 2, etc. centimeters) of another finger (e.g., index finger, middle finger, ring finger, little finger) of the same hand without touching the finger. As another example, the predetermined hand shape is a pointing hand shape in which one or more fingers of the hand are extended and one or more other fingers of the hand are curled toward the palm. In some implementations, detecting a pointing hand shape includes detecting that a user is pointing to a second user interface element. In some implementations, a pointing hand shape is detected regardless of where the user is pointing (e.g., the input is directed based on the user's gaze rather than based on the direction the user is pointing). In some implementations, the first user interface element and the second user interface element are interactive user interface elements, and in response to detecting input directed to the first user interface element or the second user interface element, the electronic device performs an action associated with the first user interface element or the second user interface element, respectively. For example, the first user interface element is a selectable option that, when selected, causes the electronic device to perform an action, such as displaying a corresponding user interface, changing a setting of the electronic device, or initiating playback of content. As another example, the second user interface element is a container (e.g., a window) in which the user interface is displayed, and in response to detecting a selection of the second user interface element and a subsequent movement input, the electronic device updates the location of the second user interface element in accordance with the movement input. In some implementations, the first user interface element and the second user interface element are the same type of user interface element (e.g., selectable options, content items, windows, etc.).
In some embodiments, the first user interface element and the second user interface element are different types of user interface elements. In some implementations, in response to detecting indirect interaction of a predetermined portion of a user with a first user interface element while the user's gaze is directed at the first user interface element, the electronic device updates an appearance (e.g., color, size, positioning) of the user interface element to indicate that additional input (e.g., selection input) is to be directed at the first user interface element, such as described with reference to methods 800, 1200, 1800, and/or 2000. In some embodiments, the first user interface element and the second user interface element are displayed in a three-dimensional environment (e.g., a computer-generated reality (CGR) environment, such as a Virtual Reality (VR) environment, a Mixed Reality (MR) environment, or an Augmented Reality (AR) environment, etc.) generated by, displayed by, or otherwise made viewable by the device (e.g., a user interface including these elements is a three-dimensional environment and/or is displayed within a three-dimensional environment).
In some embodiments, such as in fig. 15C, in response to detecting that the user's gaze (e.g., 1501 b) is moving away from the first user interface element (e.g., 1503) to a second user interface element (e.g., 1505) displayed via the display generating component (1602 b), the electronic device 101a changes (1602C) the visual appearance (e.g., color, size, positioning) of the second user interface element (e.g., 1505) in accordance with determining that a second predefined portion (e.g., 1511) of the user (e.g., a second finger, hand, or arm of the user, such as the left hand of the user) is available to interact with the second user interface element (e.g., 1505) (e.g., such as described with reference to method 800). In some embodiments, the first predefined portion of the user is a first hand of the user and the second predefined portion of the user is a second hand of the user. In some implementations, the electronic device changes a visual appearance of the first user interface element in response to detecting that the first predefined portion of the user indirectly interacted with the first user interface element when the gaze of the user was directed at the first user interface element. In some implementations, the second predefined portion of the user may be used to interact with the second user interface element in response to detecting a pose of the second predefined portion that meets one or more criteria when the second predefined portion has not interacted with another (e.g., third) user interface element. In some embodiments, the pose and position of the first predefined portion of the user are the same before and after movement of the user's gaze from the first user interface element to the second user interface element is detected. In some implementations, the first predefined portion of the user remains interactive with the first user interface element (e.g., input provided by the first predefined portion of the user remains interactive with the first user interface element) while and after changing the visual appearance of the second user interface element. In some implementations, in response to detecting that the user's gaze moves from the first user interface element to the second user interface element, the first predefined portion of the user no longer interacts with the first user interface element (e.g., input provided by the first predefined portion of the user does not interact with the first user interface element). For example, the electronic device relinquishes performing the operation in response to the input provided by the first predefined portion of the user or performs the operation with the second user interface element in response to the input provided by the first predefined portion of the user when the first predefined portion of the user no longer interacts with the first user interface element. In some implementations, in response to detecting a gaze of the user on the second user interface element and the second predefined portion of the user being available for interaction with the second user interface element, the second predefined portion of the user becomes interactive with the second user interface element. In some embodiments, the input provided by the second predefined portion of the user causes interaction with the second user interface element while the second predefined portion of the user interacts with the second user interface element.
In some embodiments, such as in fig. 15B, in response to detecting that the user's gaze (e.g., 1501B) moves away from the first user interface element (e.g., 1503) to a second user interface element (e.g., 1505) displayed via the display generating component (1602B), in accordance with a determination that a second predefined portion of the user is not available to interact with the second user interface element (e.g., 1501B) (e.g., such as described with reference to method 800), the electronic device 101a foregoes (1602 d) changing the visual appearance of the second user interface element (e.g., 1501B). In some implementations, the electronic device maintains the display of the second user interface element without changing the visual appearance of the second user interface element. In some implementations, if the electronic device cannot detect the second predefined portion of the user, the second predefined portion of the user is unavailable for interaction with the second user interface element if the pose of the second predefined portion of the user fails to meet one or more criteria, or if the second predefined portion of the user has interacted with another (e.g., third) user interface element. In some embodiments, the pose and position of the first predefined portion of the user are the same before and after movement of the user's gaze from the first user interface element to the second user interface element is detected. In some implementations, the first predefined portion of the user remains interactive with the first user interface element (e.g., input provided by the first predefined portion of the user remains interactive with the first user interface element) upon and after detecting movement of the user's gaze from the first user interface element to the second user interface element. In some implementations, in response to detecting that the user's gaze moves from the first user interface element to the second user interface element, the first predefined portion of the user no longer interacts with the first user interface element (e.g., input provided by the first predefined portion of the user does not interact with the first user interface element). For example, the electronic device relinquishes performing the operation in response to the input provided by the first predefined portion of the user or performs the operation with the second user interface element in response to the input provided by the first predefined portion of the user when the first predefined portion of the user no longer interacts with the first user interface element. In some implementations, in response to detecting a gaze of the user on the second user interface element and the second predefined portion of the user being unavailable for interaction with the second user interface element, the second predefined portion of the user does not become interactive with the second user interface element. In some implementations, the input provided by the second predefined portion of the user does not cause interaction with the second user interface element when the second predefined portion of the user is not interacting with the second user interface element. 
In some embodiments, in response to detecting input provided by the second predefined portion of the user while the second predefined portion of the user is not interacting with the second user interface element, the electronic device relinquishes performing the operation in response to the input if the second predefined portion of the user is not interacting with any user interface element presented by the electronic device. In some implementations, if the second predefined portion of the user does not interact with the second user interface element because of interaction with the third user interface element, the electronic device performs an action in accordance with the input utilizing the third user interface element in response to detecting the input provided by the second predefined portion of the user.
The above-described manner of changing the visual appearance of the second user interface element in response to detecting that the user's gaze moves from the first user interface element to the second user interface element when the second predefined portion of the user is available for interaction provides an efficient way of using the plurality of portions of the user to interact with the plurality of user interface elements, which simplifies the interaction between the user and the electronic device, enhances the operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and prolongs battery life of the electronic device by enabling the user to use the electronic device more quickly and efficiently while reducing errors in use.
In some embodiments, where one or more criteria are met, including criteria met when a first predefined portion of the user (e.g., 1509) and a second predefined portion of the user (e.g., 1511) are not interacting with any user interface element (1604 a) (e.g., the electronic device is not currently detecting direct or indirect input provided by the first or second predefined portion of the user), the electronic device 101a displays (1604 b) the first user interface element (e.g., 1505) with visual characteristics that indicate that interaction (e.g., direct or indirect interaction) with the first user interface element (e.g., 1505) is possible, in accordance with a determination that the user's gaze (e.g., 1501 b) is directed to the first user interface element (e.g., 1505), wherein the second user interface element (e.g., 1507) is not displayed with the visual characteristics, such as in fig. 15C. In some implementations, displaying the first user interface element with a visual characteristic that indicates that interaction with the first user interface element is possible includes: the size, color, location, or other visual characteristic of the first user interface element is updated as compared to an appearance of the first user interface element prior to detecting that the user's gaze is directed at the first user interface element when the one or more criteria are met. In some embodiments, in response to detecting that the one or more criteria are met and that the user's gaze is directed to the first user interface element, the electronic device maintains to display the second user interface element with a visual characteristic that was displayed by the second user interface element prior to detecting that the user's gaze is directed to the first user interface element when the one or more criteria are met. In some embodiments, in response to detecting that the user's gaze moves from the first user interface element to the second user interface element when the one or more criteria are met, the electronic device displays the second user interface element with a visual characteristic indicating that interaction with the second user interface element is possible, and does not display the first user interface element with the visual characteristic. In some embodiments, the one or more criteria further include a criterion that is met when the electronic device detects that the first or second predefined portion of the user is in a ready state according to one or more steps of method 800.
In some embodiments, where one or more criteria are met, including criteria met when a first predefined portion of the user (e.g., 1509) and a second predefined portion of the user (e.g., 1511) are not interacting with any user interface element (1604 a), the electronic device 101a displays (1604C) the second user interface element (e.g., 1505) with a visual characteristic that indicates that interaction with the second user interface element (e.g., direct or indirect interaction) is possible, in accordance with a determination that the user's gaze (e.g., 1501 b) is directed to the second user interface element (e.g., 1505), wherein the first user interface element (e.g., 1507) is not displayed with the visual characteristic, such as shown in fig. 15C. In some implementations, displaying the second user interface element with a visual characteristic that indicates that interaction with the second user interface element is possible includes: the size, color, location, or other visual characteristic of the second user interface element is updated as compared to an appearance of the second user interface element before the user's gaze is detected to be directed at the second user interface element when the one or more criteria are met. In some implementations, in response to detecting that the one or more criteria are met and that the user's gaze is directed to the second user interface element, the electronic device maintains to display the first user interface element with a visual characteristic of the first user interface element display prior to detecting that the user's gaze is directed to the second user interface element when the one or more criteria are met. In some implementations, in response to detecting that the user's gaze moves from the second user interface element to the first user interface element when the one or more criteria are met, the electronic device displays the first user interface element with a visual characteristic indicating that interaction with the first user interface element is possible and does not display the second user interface element with the visual characteristic.
In some embodiments, such as in fig. 15C, when one or more criteria are met, the electronic device 101a detects (1604 d) input (e.g., direct or indirect input) to a first predefined portion (e.g., 1509) or a second predefined portion (e.g., 1511) from the user via one or more input devices. In some embodiments, the electronic device detects that the same predefined portion of the user is in a ready state according to method 800 before detecting input from the first or second predefined portion of the user. For example, when the user's right hand is farther than a threshold distance (e.g., 1, 3, 5, 10, 15, 30, etc. centimeters) from the user interface element, the electronic device detects that the user has made a pre-pinch hand shape with his right hand, and then when the right hand is farther than the threshold distance from the user interface element, the electronic device detects that the user has made a pinch hand shape with his right hand. As another example, the electronic device detects that the user makes a pointing hand shape with his left hand when the user's left hand is within a first threshold distance (e.g., 1, 3, 5, 10, 15, 30, etc. centimeters) of the respective user interface element, and then detects that the user moves his left hand within a second threshold distance (e.g., 0.1, 0.2, 0.3, 0.5, 1, 2, 3, etc. centimeters) of the respective user interface element while maintaining the pointing hand shape.
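The two example sequences above (pre-pinch then pinch for indirect input, and a pointing hand approaching the element for direct input) might be classified roughly as follows; the thresholds and field names are assumptions, not values from the disclosure.

```swift
// Hypothetical classification of a hand sample into the two input styles
// described above.
enum DetectedInput {
    case indirectSelection
    case directSelection
    case none
}

struct HandSample {
    var isPinching: Bool
    var wasInPrePinchReadyPose: Bool
    var isPointingPose: Bool
    var distanceToElement: Float   // meters
}

func classifyInput(_ hand: HandSample,
                   readyThreshold: Float = 0.15,
                   directInputThreshold: Float = 0.005) -> DetectedInput {
    // Indirect: a pinch following the pre-pinch ready pose while far from the element.
    if hand.wasInPrePinchReadyPose, hand.isPinching, hand.distanceToElement > readyThreshold {
        return .indirectSelection
    }
    // Direct: a pointing hand that has come within the small direct-input distance.
    if hand.isPointingPose, hand.distanceToElement <= directInputThreshold {
        return .directSelection
    }
    return .none
}
```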
In some embodiments, such as in fig. 15A, in response to detecting the input (1604 e), in accordance with a determination that the user's gaze (e.g., 1501 a) is directed to the first user interface element (e.g., 1503) when the input is received, the electronic device 101a performs (1604 f) an operation corresponding to the first user interface element (e.g., selects the first user interface element, navigates to a user interface associated with the first user interface element, initiates playback of a content item, activates or deactivates a setting, initiates or terminates communication with another electronic device, scrolls content of the first user interface element, etc.).
In some embodiments, such as in fig. 15A, in response to detecting the input (1604 e), in accordance with a determination that the user's gaze (e.g., 1501 a) is directed to the second user interface element (e.g., 1503) when the input is received, the electronic device 101a performs (1604 g) an operation corresponding to the second user interface element (e.g., selects the second user interface element, navigates to a user interface associated with the second user interface element, initiates playback of a content item, activates or deactivates a setting, initiates or terminates communication with another electronic device, scrolls content of the second user interface element, etc.). In some embodiments, the electronic device directs the input to the user interface element that the user is looking at when the input is received.
The above-described updating of visual characteristics of and directing input to user interface elements that the user is looking at provides an efficient way of allowing the user to interact with the user interface using either hand, which simplifies interactions between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and prolongs battery life of the electronic device by enabling the user to use the electronic device more quickly and efficiently while reducing errors in use.
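By way of illustration only, the gaze-based routing described above can be sketched as follows. This is a minimal sketch in Swift; the type and function names (InteractionState, routeInput, and so on) are assumptions introduced for this example rather than part of the disclosed implementation.

```swift
// Illustrative types standing in for the device's tracking state; the names
// are assumptions for this sketch and are not taken from any real API.
struct UIElement: Hashable {
    let id: String
}

enum Hand: Hashable { case left, right }

struct InteractionState {
    // The element, if any, each predefined portion (hand) is currently providing input to.
    var engagedElement: [Hand: UIElement] = [:]
    // The element the user's gaze is currently directed to, if any.
    var gazeTarget: UIElement? = nil

    // The criteria above: met when neither hand is interacting with any user interface element.
    var neitherHandEngaged: Bool {
        engagedElement[.left] == nil && engagedElement[.right] == nil
    }

    // The element to show with the "interaction is possible" visual characteristic:
    // the gazed element, and only while the criteria are met.
    var highlightedElement: UIElement? {
        neitherHandEngaged ? gazeTarget : nil
    }

    // When input arrives from either hand while the criteria are met, direct it
    // to the element the user is looking at when the input is received.
    mutating func routeInput(from hand: Hand) -> UIElement? {
        guard neitherHandEngaged, let target = gazeTarget else { return nil }
        engagedElement[hand] = target
        return target
    }
}

// Example: with the gaze on "element1505", input from either hand goes to it.
var state = InteractionState(gazeTarget: UIElement(id: "element1505"))
print(state.routeInput(from: .right)?.id ?? "no target")   // "element1505"
```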
In some implementations, such as in fig. 15C, the one or more criteria include a criterion (e.g., 1606 a) that is met when at least one of a first predefined portion (e.g., 1511) or a second predefined portion (e.g., 1509) of the user is available to interact (e.g., directly or indirectly interact) with the user interface element. In some embodiments, the criterion is met when the first and/or second predefined portions of the user are in a ready state according to the method 800. In some implementations, the one or more criteria are met regardless of whether one or both of the first and second predefined portions of the user are available for interaction. In some embodiments, the first predefined portion and the second predefined portion of the user are hands of the user.
The above-described manner of indicating that one of the user interface elements is available for interaction in response to one or more criteria including criteria met when the first or second predefined portion of the user is available for interaction with the user interface element provides an efficient manner of indicating which user interface element the input will be directed to when the predefined portion of the user is available for providing input, which simplifies interaction between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and extends battery life of the electronic device by enabling the user to use the electronic device more quickly and efficiently while reducing errors in use.
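A minimal sketch of the availability criterion, again with assumed names (HandAvailability, shouldIndicateGazedElement): the gazed element is emphasized only when at least one of the two hands is in a ready state.

```swift
// Illustrative enum; the states and the helper name are assumptions for this sketch.
enum HandAvailability {
    case notDetected    // outside the field of view of the input devices
    case idle           // detected, but not in a ready-state pose
    case ready          // in a ready-state pose (e.g., pre-pinch or pointing)
    case engaged        // already providing input to some other element
}

// The criterion above: the gazed element is shown as available for interaction
// when at least one of the two predefined portions (hands) is available, and
// the appearance change is forgone when neither is.
func shouldIndicateGazedElement(left: HandAvailability, right: HandAvailability) -> Bool {
    left == .ready || right == .ready
}
```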
In some embodiments, such as in fig. 15B, in response to detecting that the user's gaze (e.g., 1501 b) moves away from the first user interface element (e.g., 1503) to a second user interface element (e.g., 1505) displayed via the display generating component (1608 a), in accordance with a determination that the first predefined portion (e.g., 1509) and the second predefined portion (e.g., 1511 in fig. 15C) of the user are not available for interaction (e.g., direct or indirect interaction) with the user interface element, the electronic device 101a foregoes (1608 b) changing the visual appearance of the second user interface element (e.g., 1505), such as in fig. 15B. In some implementations, a predefined portion of the user is unavailable for interaction when an input device (e.g., a hand tracking device, one or more cameras, etc.) in communication with the electronic device does not detect the predefined portion of the user, when the predefined portion of the user is interacting with another user interface element (e.g., providing input directed to the other user interface element), or when the predefined portion of the user is not in a ready state according to method 800. For example, if the right hand of the user is currently providing selection input to the respective user interface element and the left hand of the user is not detected by an input device in communication with the electronic device, the electronic device foregoes updating the visual appearance of the second user interface element in response to detecting that the user's gaze moves from the first user interface element to the second user interface element.
The above-described manner of forgoing updating the visual appearance of the second user interface element when none of the predefined portions of the user is available for interaction provides an efficient manner of indicating that the input will be directed to the second user interface element only if the predefined portions of the user are available for providing input, which simplifies interaction between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and prolongs battery life of the electronic device by enabling the user to use the electronic device more quickly and efficiently while reducing errors in use.
In some implementations, when a second predefined portion (e.g., 1511) of the user is available for interaction (e.g., direct or indirect interaction) with the second user interface element, such as in fig. 15C, and after changing the visual appearance of the second user interface element (e.g., 1505) to the changed appearance of the second user interface element (e.g., 1505), the electronic device 101a detects (1610 a) via the eye-tracking device that the second predefined portion (e.g., 1511) of the user is no longer available for interaction (e.g., direct or indirect interaction) with the second user interface element (e.g., 1505), such as in fig. 15B (e.g., when the user's gaze remains on the second user interface element). In some implementations, the second predefined portion of the user is no longer available for interaction because the input device in communication with the electronic device no longer detects the second predefined portion of the user (e.g., the second predefined portion of the user is outside of the "field of view" of the one or more input devices detecting the second predefined portion of the user), the second predefined portion of the user becomes interactive with a different user interface element, or the second predefined portion of the user ceases to be in a ready state according to the method 800. For example, in response to detecting a transition of a user's hand from making a hand shape associated with a ready state to making a hand shape not associated with a ready state, the electronic device determines that the user's hand is not available for interaction with the second user interface element.
In some embodiments, such as in fig. 15B, in response to detecting that the second predefined portion of the user (e.g., 1511 in fig. 15C) is no longer available to interact (e.g., directly or indirectly interact) with the second user interface element (e.g., 1505), the electronic device 101a ceases (1610 b) to display the changed appearance of the second user interface element (e.g., does not display the second user interface element with the changed appearance, and/or displays the second user interface element with the appearance that the second user interface element had prior to being displayed with the changed appearance). In some implementations, when the second predefined portion of the user is no longer available to interact with the second user interface element, the electronic device displays the second user interface element with the same visual appearance with which the second user interface element was displayed prior to detecting that the user's gaze is directed at the second user interface element.
The above-described manner of reversing the change to the visual appearance of the second user interface element in response to detecting that the second predefined portion of the user is no longer available for interaction with the second user interface element provides an efficient manner of indicating that the electronic device will not perform an action with respect to the second user interface element in response to input provided by the second predefined portion of the user, which simplifies interactions between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and extends battery life of the electronic device by enabling the user to more quickly and efficiently use the electronic device while reducing errors in use.
In some embodiments, after determining that the second predefined portion of the user (e.g., 1511 in fig. 15C) is unavailable for interaction (e.g., direct or indirect interaction) with the second user interface element (e.g., 1505) and when the user's gaze (e.g., 1501B) points to the second user interface element (e.g., 1505), such as in fig. 15B (e.g., having an appearance such as an idle state appearance indicating that there is no predefined portion of the user available for interaction with the second user interface element), the electronic device 101a detects (1612 a) via one or more input devices that the second predefined portion of the user (e.g., 1511) is now available for interaction (e.g., direct or indirect interaction) with the second user interface element (e.g., 1505), such as in fig. 15C. In some implementations, detecting that the second predefined portion of the user is available for interaction includes detecting that the second predefined portion of the user is in a ready state according to method 800. For example, the electronic device detects that the user's hand is in a pointing hand shape within a predefined distance (e.g., 1, 2, 3, 5, 10, 15, 30, etc. centimeters) of the second user interface element.
In some implementations, such as in fig. 15C, in response to detecting that a second predefined portion (e.g., 1511) of the user is now available for interaction (e.g., direct or indirect interaction) with a second user interface element (e.g., 1505) (e.g., upon detecting that the user's gaze is directed at the second user interface element), the electronic device 101a changes (1612 b) the visual appearance (e.g., size, color, positioning, text or line pattern, etc.) of the second user interface element (e.g., 1505). In some embodiments, in response to detecting that a second predefined portion of the user is now available to interact with a different user interface element when the user looks at the different user interface element, the electronic device updates the visual appearance of the different user interface element and maintains the visual appearance of the second user interface element.
The above-described manner of changing the visual appearance of the second user interface element in response to detecting that the second predefined portion of the user is ready to interact with the second user interface element provides an efficient manner of indicating to the user that input provided by the second predefined portion of the user will cause an action directed to the second user interface element, which simplifies interaction between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and extends battery life of the electronic device by enabling the user to more quickly and efficiently use the electronic device while reducing errors in use.
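The two transitions described above (the highlight being removed when the hand stops being available, and reapplied when it becomes available again while the gaze stays on the element) can be sketched as a small state holder; the class and parameter names are assumptions for this illustration.

```swift
// Illustrative controller; the names are assumptions for this sketch.
final class GazeHighlightController {
    private(set) var highlightedElementID: String?

    // Called whenever gaze or hand availability changes.
    func update(gazedElementID: String?, handAvailableForElement: Bool) {
        if handAvailableForElement, let id = gazedElementID {
            // The hand is (or became) available while the gaze is on the element:
            // apply or re-apply the changed visual appearance.
            highlightedElementID = id
        } else {
            // The hand is no longer available (lost tracking, left the ready pose,
            // or engaged elsewhere): revert to the appearance shown before the
            // gaze was detected on the element.
            highlightedElementID = nil
        }
    }
}

// Example: the highlight appears, disappears when the hand stops being available,
// and reappears when the hand becomes available again while the gaze stays put.
let controller = GazeHighlightController()
controller.update(gazedElementID: "element1505", handAvailableForElement: true)
controller.update(gazedElementID: "element1505", handAvailableForElement: false)
controller.update(gazedElementID: "element1505", handAvailableForElement: true)
print(controller.highlightedElementID ?? "none")   // "element1505"
```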
In some embodiments, such as in fig. 15D, in response to detecting that the user's gaze (e.g., 1501 c) moves away from the first user interface element (e.g., 1503) to a second user interface element (e.g., 1507) displayed via the display generation component (1614 a), the electronic device 101a foregoes (1614 b) changing the visual appearance of the second user interface element (e.g., 1507) in accordance with a determination that the first predefined portion (e.g., 1509) and the second predefined portion (e.g., 1511) of the user have interacted with (e.g., provided direct or indirect input directed to) respective user interface elements other than the second user interface element (e.g., 1507). In some embodiments, the first and/or second predefined portion of the user has interacted with the respective user interface element if the predefined portion of the user is providing (e.g., directly or indirectly) an input (e.g., a selection input or a selection portion of another input, such as a drag or scroll input) directed to the respective user interface element, or if the predefined portion of the user is in a direct ready state directed to the respective user interface element according to method 800. For example, the right hand of the user takes on a pinch hand shape corresponding to initiation of a selection input directed to the first respective user interface element, and the left hand of the user takes on a pointing hand shape within a distance threshold (e.g., 1, 3, 5, 10, 15, 30, etc. centimeters) of the second respective user interface element, the pointing hand shape corresponding to the left hand being in a direct ready state directed to the second respective user interface element. In some embodiments, in response to detecting the user's gaze on a respective user interface element other than the second user interface element when the first and second predefined portions of the user have interacted with other user interface elements, the electronic device forgoes changing the visual appearance of that respective user interface element.
The above-described manner of forgoing changing the visual appearance of the second user interface element in response to the user's gaze being directed at the second user interface element when the first and second predefined portions of the user have interacted with respective user interface elements provides an efficient manner of indicating to the user that the input provided by the first and second predefined portions of the user will not be directed at the second user interface element, which simplifies interaction between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and extends battery life of the electronic device by enabling the user to more quickly and efficiently use the electronic device while reducing errors in use.
In some embodiments, such as in fig. 15D, determining that the second predefined portion (e.g., 1511) of the user is unavailable for interaction with the second user interface element (e.g., 1507) is based on determining that the second predefined portion (e.g., 1511) of the user interacts with a third user interface element (e.g., 1505) different from the second user interface element (e.g., 1507) (e.g., provides direct or indirect input to the third user interface element) (1616 a). In some implementations, the second predefined portion of the user interacts with the third user interface element when the second predefined portion of the user is providing (e.g., directly or indirectly) input directed to the third user interface element or when the second predefined portion of the user is in a direct ready state associated with the third user interface element according to the method 800. For example, if the user's hand is in a pinch hand shape or pre-pinch hand shape that directly or indirectly provides selection input to the third user interface element, the user's hand interacts with the third user interface element and is not available for interaction with the second user interface element. As another example, if the user's hand is in a pointing hand shape within a readiness state threshold (e.g., 1, 2, 3, 5, 10, 15, 30, etc. centimeters) or a selection threshold (e.g., 0.1, 0.2, 0.3, 0.5, 1, 2, 3, etc. centimeters) of the third user interface element, the user's hand interacts with the third user interface element and is not available for interaction with the second user interface element.
The above-described manner of determining that the second predefined portion of the user is unavailable to interact with the second user interface element based on determining that the second predefined portion of the user interacts with the third user interface element provides an efficient manner of maintaining interaction with the third user interface element even though the user looks at the second user interface element, which simplifies interactions between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and extends battery life of the electronic device by enabling the user to use the electronic device more quickly and efficiently while reducing errors in use.
In some implementations, such as in fig. 15D, determining that the second predefined portion (e.g., 1511) of the user is not available for interaction (e.g., direct or indirect interaction) with the second user interface element (e.g., 1507) is based on determining that the second predefined portion (e.g., 1511) of the user is not in a predetermined pose (e.g., position, orientation, hand shape) required for interaction with the second user interface element (e.g., 1507) (1618 a). In some implementations, the predetermined pose is a pose associated with a ready state in method 800. In some implementations, the predefined portion of the user is the user's hand, and the predetermined pose is the hand being in a pointing gesture within a threshold distance (e.g., 1, 2, 3, 5, 10, 15, 30, etc. centimeters) of the respective user interface element, with the palm facing the respective user interface element. In some embodiments, the predefined portion of the user is the user's hand, and the predetermined pose is the hand being in a pre-pinch hand shape with the palm facing the user interface, wherein the thumb and another finger are within a threshold distance (e.g., 0.1, 0.2, 0.3, 0.5, 1, 2, 3, etc. centimeters) of each other without touching. In some implementations, if the pose of the second predefined portion does not match one or more predetermined poses required for interaction with the second user interface element, the electronic device forgoes changing the visual appearance of the second user interface element in response to detecting that the user's gaze is on the second user interface element.
The above-described determination that a predefined portion of a user is unavailable for interaction when the pose of the predefined portion is not a predetermined pose provides an efficient way to allow the user to make the predetermined pose to initiate input and to forgo making the pose when no input is needed, which simplifies interactions between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and prolongs battery life of the electronic device by enabling the user to use the electronic device more quickly and efficiently while reducing errors in use.
In some embodiments, determining that the second predefined portion of the user (e.g., 1511 in fig. 15C) is unavailable to interact (e.g., directly or indirectly interact) with the second user interface element (e.g., 1505) is based on determining that the second predefined portion of the user (e.g., 1511) is not detected (1620 a) by one or more input devices (e.g., one or more cameras, range sensors, hand tracking devices, etc.) in communication with the electronic device, such as in fig. 15B. In some embodiments, the one or more input devices are capable of detecting a second predefined portion of the user when the second predefined portion of the user has a location within a predetermined area (e.g., a "field of view") relative to the one or more input devices, and are incapable of detecting a second predefined portion of the user when the second predefined portion of the user has a location outside the predetermined area relative to the one or more input devices. For example, a hand tracking device that includes a camera, range sensor, or other image sensor has a field of view that includes an area captured by the camera, range sensor, or other image sensor. In this example, the user's hand is unavailable to interact with the second user interface element when the user's hand is not in the field of view of the hand tracking device because the electronic device is unable to detect input from the user's hand when the user's hand is outside of the field of view of the hand tracking device.
The above-described manner of determining that the predefined portion of the user is unavailable to interact with the second user interface element based on the predefined portion of the user not being detected by the one or more input devices in communication with the electronic device provides an efficient manner of changing the visual characteristics of the second user interface element in response to gaze only when the electronic device is able to detect input provided by the predefined portion of the user, which simplifies interactions between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and prolongs battery life of the electronic device by enabling the user to more quickly and efficiently use the electronic device while reducing errors in use.
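The three unavailability conditions discussed above (not detected by the input devices, not in a required pose, or already interacting with a different element) can be collected into one check; the pose names and fields below are assumptions for this sketch.

```swift
// Illustrative types; the pose names and fields are assumptions for this sketch.
struct TrackedHand {
    enum Pose { case pointing, prePinch, pinch, other }

    var isDetected: Bool            // within the field of view of the input devices
    var pose: Pose
    var engagedElementID: String?   // element it is already interacting with, if any
}

// A hand is unavailable for interaction with `elementID` when it is not detected
// by the input devices, not in a pose required for interaction, or already
// interacting with a different element.
func isAvailable(_ hand: TrackedHand, for elementID: String) -> Bool {
    guard hand.isDetected else { return false }
    guard hand.pose == .pointing || hand.pose == .prePinch else { return false }
    if let engaged = hand.engagedElementID, engaged != elementID { return false }
    return true
}

// Example: a detected, pre-pinch hand that is not engaged elsewhere is available.
let hand = TrackedHand(isDetected: true, pose: .prePinch, engagedElementID: nil)
print(isAvailable(hand, for: "element1505"))   // true
```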
In some embodiments, such as in fig. 15E, upon displaying a first user interface element (e.g., 1505) and a second user interface element (e.g., 1507) via a display generation component (1622 a), in accordance with a determination that a first predefined portion (e.g., 1511) of the user is within a threshold distance (e.g., 0.5, 1, 2, 3, 5, 10, 15, 30, 50, etc. centimeters, corresponding to direct interaction with the user interface element, such as described with reference to methods 800, 1000, 1200, 1400, 1800, and/or 2000) of a location corresponding to the first user interface element (e.g., 1505) and a second predefined portion (e.g., 1509) of the user is within the threshold distance of a location corresponding to the second user interface element (e.g., 1507) (1622 b), the electronic device 101a displays the first user interface element (e.g., 1505) with a visual characteristic (e.g., color, location, size, line, or text pattern) indicating that the first user interface element is available for direct interaction with the first predefined portion (e.g., 1511) of the user. In some implementations, in response to receiving input provided by the first predefined portion of the user to the first user interface element, the electronic device performs a corresponding action associated with the first user interface element. In some implementations, if the first predefined portion of the user does not have a pose corresponding to the predefined pose (e.g., such as described with reference to methods 800, 1000, 1200, 1400, 1800, and/or 2000), the electronic device forgoes displaying the first user interface element with the visual characteristic indicating that the first user interface element is available for direct interaction with the first predefined portion of the user. In some implementations, the first and second predefined portions of the user have poses corresponding to predetermined poses, such as described with reference to methods 800, 1000, 1200, 1400, 1800, and/or 2000.
In some embodiments, such as in fig. 15E, upon displaying the first user interface element (e.g., 1505) and the second user interface element (e.g., 1507) via the display generation component (1622 a), in accordance with a determination that the first predefined portion (e.g., 1511) of the user is within the threshold distance (e.g., 0.5, 1, 2, 3, 5, 10, 15, 30, 50, etc. centimeters, corresponding to direct interaction with the user interface element, such as described with reference to methods 800, 1000, 1200, 1400, 1800, and/or 2000) of the location corresponding to the first user interface element (e.g., 1505) and the second predefined portion (e.g., 1509) of the user is within the threshold distance of the location corresponding to the second user interface element (e.g., 1507) (1622 b), the electronic device 101a displays (1622 d) the second user interface element (e.g., 1507) with a visual characteristic indicating that the second user interface element is available for direct interaction with the second predefined portion (e.g., 1509) of the user. In some implementations, responsive to receiving input provided by the second predefined portion of the user to the second user interface element, the electronic device performs a corresponding action associated with the second user interface element. In some implementations, if the second predefined portion of the user does not have a pose corresponding to the predefined pose (e.g., such as described with reference to methods 800, 1000, 1200, 1400, 1800, and/or 2000), the electronic device forgoes displaying the second user interface element with the visual characteristic indicating that the second user interface element is available for direct interaction with the second predefined portion of the user.
The above-described manner of displaying the first user interface element with visual characteristics indicating that the first user interface element is available for direct interaction and displaying the second user interface element with visual characteristics indicating that the second user interface element is available for interaction provides an efficient way of enabling a user to direct input to the first user interface element and the second user interface element simultaneously with the first predefined portion and the second predefined portion of the user, respectively, which simplifies interactions between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and prolongs battery life of the electronic device by enabling the user to more quickly and efficiently use the electronic device while reducing errors in use.
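A sketch of the simultaneous direct-targeting case, assuming hypothetical geometry types and using 15 centimeters (one of the example values above) as the direct-interaction threshold: each hand is paired with the element whose location it is within the threshold of, so both elements can be emphasized at once.

```swift
import simd

// Illustrative geometry; the element positions, hand names, and the 15 cm
// threshold are assumptions for this sketch.
struct Element { let id: String; let position: SIMD3<Float> }

let directThreshold: Float = 0.15   // metres

// Each hand is assigned the element whose location it is within the direct-
// interaction threshold of, so two elements can be shown as available for
// direct interaction with the two hands at the same time.
func directTargets(handPositions: [String: SIMD3<Float>],
                   elements: [Element]) -> [String: String] {
    var targets: [String: String] = [:]   // hand name -> element id
    for (hand, position) in handPositions {
        let nearest = elements.min { lhs, rhs in
            simd_distance(lhs.position, position) < simd_distance(rhs.position, position)
        }
        if let nearest = nearest, simd_distance(nearest.position, position) <= directThreshold {
            targets[hand] = nearest.id
        }
    }
    return targets
}

// Example: each hand is within the threshold of a different element.
let elements = [Element(id: "element1505", position: [0.0, 0.0, -0.5]),
                Element(id: "element1507", position: [0.3, 0.0, -0.5])]
let hands: [String: SIMD3<Float>] = ["left": [0.02, 0.0, -0.45], "right": [0.29, 0.0, -0.48]]
print(directTargets(handPositions: hands, elements: elements))
```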
In some embodiments, such as in fig. 15E, upon displaying the first user interface element (e.g., 1505) and the second user interface element (e.g., 1507) via the display generation component (1624 a), in accordance with a determination that the first predefined portion (e.g., 1515) of the user is within a threshold distance (e.g., 0.5, 1, 2, 3, 5, 10, 15, 30, 50, etc. centimeters, corresponding to direct interaction with the user interface element, such as described with reference to methods 800, 1000, 1200, 1400, 1800, and/or 2000) of the location corresponding to the first user interface element (e.g., 1505) and the second predefined portion (e.g., 1509) of the user is farther than the threshold distance from the location corresponding to the second user interface element (e.g., 1507) but is available for interaction (e.g., indirect interaction) with the second user interface element (e.g., 1507) (1624 b), the electronic device 101a displays the first user interface element (e.g., 1505) with a visual characteristic indicating that the first predefined portion (e.g., 1515) of the user is available for direct interaction with the first user interface element (e.g., 1505), such as in fig. 15E. In some implementations, the pose of the first predefined portion of the user corresponds to a predefined pose associated with a ready state according to method 800. In some implementations, in accordance with a determination that the location of the first predefined portion changes from within a threshold distance of the location corresponding to the first user interface element to within a threshold distance of the location corresponding to the third user interface element, the electronic device ceases to display the first user interface element with the visual characteristic and displays the third user interface element with the visual characteristic. In some implementations, the second predefined portion of the user is in a predetermined pose associated with the ready state described with reference to method 800. In some embodiments, the second predefined portion of the user is a distance from the second user interface element that corresponds to indirect interaction with the second user interface element, such as described with reference to methods 800, 1000, 1200, 1400, 1800, and/or 2000. In some implementations, the first predefined portion of the user has a pose corresponding to a predetermined pose, such as described with reference to methods 800, 1000, 1200, 1400, 1800, and/or 2000.
In some embodiments, such as in fig. 15E, upon displaying the first user interface element (e.g., 1505) and the second user interface element (e.g., 1507) via the display generation component, in accordance with a determination that the first predefined portion (e.g., 1511) of the user is within the threshold distance (e.g., 0.5, 1, 2, 3, 5, 10, 15, 30, 50, etc. centimeters) of the location corresponding to the first user interface element (e.g., 1505) and the second predefined portion (e.g., 1509) of the user is farther than the threshold distance from the location corresponding to the second user interface element (e.g., 1507) but is available for interaction (e.g., indirect interaction) with the second user interface element (e.g., 1507) (1624 b), in accordance with a determination that the gaze (e.g., 1501 a) of the user is directed to the second user interface element (e.g., 1507), the electronic device displays the second user interface element (e.g., 1507) with a visual characteristic indicating that the second predefined portion (e.g., 1509) of the user is available for indirect interaction with the second user interface element, such as in fig. 15E. In some embodiments, if the user's gaze moves from pointing at the second user interface element to pointing at the third user interface element, the electronic device ceases to display the second user interface element with the visual characteristic and displays the third user interface element with the visual characteristic.
In some implementations, upon displaying the first user interface element and the second user interface element via the display generation component (1624 a), in accordance with determining that the first predefined portion of the user is within a threshold distance (e.g., 0.5, 1, 2, 3, 5, 10, 15, 30, 50, etc. centimeters, corresponding to directly interacting with the user interface element, such as described with reference to methods 800, 1000, 1200, 1400, 1800, and/or 2000) of the location corresponding to the first user interface element and the second predefined portion of the user is farther than the threshold distance of the location corresponding to the second user interface element but available to interact with (e.g., indirectly interact with) the second user interface element (1624 b), the electronic device 101a does not display (1624 e) the second user interface element with visual characteristics indicating that the second predefined portion of the user is available to indirectly interact with the second user interface element in accordance with determining that the gaze of the user is not directed towards the second user interface element. For example, in fig. 15E, if the user's gaze 1501a does not point to the user interface element 1507, the user interface element 1507 will not be displayed with visual characteristics (e.g., shadows in fig. 15E) indicating that the hand 1509 is available for indirect interaction with the user interface element 1507. In some embodiments, the electronic device requires that the user's gaze be directed to the second user interface element so that the second user interface element is available for indirect interaction. In some implementations, when a first predefined portion of the user interacts directly with the first user interface element and a second predefined portion of the user is available for indirect interaction with another user interface element, the electronic device indicates that the first user interface element is available for direct interaction with the first predefined portion of the user and indicates that the user interface element at which the user's gaze is directed is available for indirect interaction with the second predefined portion of the user. In some embodiments, the indication of direct interaction is different from the indication of indirect interaction according to one or more steps of method 1400.
The above-described manner of displaying the first user interface element with visual characteristics that indicate that the first user interface element is available for direct interaction and displaying the second user interface element with visual characteristics that indicate that the second user interface element is available for indirect interaction provides an efficient manner of enabling a user to direct input to the first user interface element and the second user interface element simultaneously with the first predefined portion and the second predefined portion of the user, respectively, which simplifies interactions between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and extends battery life of the electronic device by enabling the user to more quickly and efficiently use the electronic device while reducing errors in use.
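The mixed case above (one hand within the direct threshold, the other hand farther away and therefore limited to indirect interaction gated by gaze) can be sketched as follows; the structure and parameter names are assumptions for this illustration.

```swift
// Illustrative helper; the parameter names are assumptions for this sketch.
struct MixedHighlights {
    var directElementID: String?     // shown as available for direct interaction
    var indirectElementID: String?   // shown as available for indirect interaction
}

// The near hand's element is indicated by proximity alone; the far hand's
// indirect target is indicated only while the far hand is available and the
// gaze is on that element, and never for the element already claimed directly.
func highlights(directCandidateID: String?,
                gazedElementID: String?,
                farHandAvailable: Bool) -> MixedHighlights {
    var result = MixedHighlights()
    result.directElementID = directCandidateID
    if farHandAvailable, let gazed = gazedElementID, gazed != directCandidateID {
        result.indirectElementID = gazed
    }
    return result
}

// Example: one hand near element 1505, far hand available, gaze on element 1507.
let h = highlights(directCandidateID: "element1505",
                   gazedElementID: "element1507",
                   farHandAvailable: true)
print(h.directElementID ?? "-", h.indirectElementID ?? "-")   // element1505 element1507
```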
In some embodiments, such as in fig. 15E, upon displaying a first user interface element (e.g., 1507) and a second user interface element (e.g., 1505) via the display generation component (1626 a), in accordance with a determination that a second predefined portion (e.g., 1511) of the user is within a threshold distance (e.g., 0.5, 1, 2, 3, 5, 10, 15, 30, 50, etc. centimeters, corresponding to direct interaction with the user interface element, such as described with reference to methods 800, 1000, 1200, 1400, 1800, and/or 2000) of a location corresponding to the second user interface element (e.g., 1505) and the first predefined portion (e.g., 1509) of the user is farther than the threshold distance from the location corresponding to the first user interface element (e.g., 1507) but is available for interaction (e.g., indirect interaction) with the first user interface element (e.g., 1507) (1626 b), the electronic device 101a displays the second user interface element (e.g., 1505) with a visual characteristic indicating that the second user interface element is available for direct interaction with the second predefined portion (e.g., 1511) of the user, such as in fig. 15E. In some implementations, the pose of the second predefined portion of the user corresponds to a predefined pose associated with a ready state according to method 800. In some implementations, in accordance with a determination that the location of the second predefined portion of the user changes from being within a threshold distance of the location corresponding to the second user interface element to being within a threshold distance of the location corresponding to the third user interface element, the electronic device ceases to display the second user interface element with the visual characteristic and displays the third user interface element with the visual characteristic. In some implementations, the first predefined portion of the user is in a predetermined pose associated with the ready state described with reference to method 800. In some implementations, the first predefined portion of the user is a distance from the first user interface element that corresponds to indirect interaction with the first user interface element, such as described with reference to methods 800, 1000, 1200, 1400, 1800, and/or 2000. In some implementations, the second predefined portion of the user has a pose corresponding to a predetermined pose, such as described with reference to methods 800, 1000, 1200, 1400, 1800, and/or 2000.
In some embodiments, such as in fig. 15E, upon displaying the first user interface element and the second user interface element via the display generation component (1626 a), in accordance with a determination that the second predefined portion (e.g., 1511) of the user is within the threshold distance (e.g., 0.5, 1, 2, 3, 5, 10, 15, 30, 50, etc. centimeters, corresponding to direct interaction with the user interface element, such as described with reference to methods 800, 1000, 1200, 1400, 1800, and/or 2000) of the location corresponding to the second user interface element (e.g., 1505) and the first predefined portion (e.g., 1509) of the user is farther than the threshold distance from the location corresponding to the first user interface element (e.g., 1507) but is available for interaction (e.g., indirect interaction) with the first user interface element (e.g., 1507) (1626 b), in accordance with a determination that the gaze of the user is directed to the first user interface element (e.g., 1507), the electronic device 101a displays the first user interface element (e.g., 1507) with a visual characteristic indicating that the first predefined portion (e.g., 1509) of the user is available for indirect interaction with the first user interface element (e.g., 1507). In some implementations, if the user's gaze moves from pointing at the first user interface element to pointing at the third user interface element, the electronic device ceases to display the first user interface element with the visual characteristic and displays the third user interface element with the visual characteristic.
In some embodiments, such as in fig. 15E, upon displaying the first user interface element (e.g., 1503) and the second user interface element (e.g., 1505) via the display generating component (1626 a), in accordance with a determination that the second predefined portion (e.g., 1511) of the user is within the threshold distance (e.g., 0.5, 1, 2, 3, 5, 10, 15, 30, 50, etc. centimeters, corresponding to direct interaction with the user interface element, such as described with reference to methods 800, 1000, 1200, 1400, 1800, and/or 2000) of the location corresponding to the second user interface element (e.g., 1505) and the first predefined portion (e.g., 1509) of the user is farther than the threshold distance from the location corresponding to the first user interface element (e.g., 1503) but is available for interaction (e.g., indirect interaction) with the first user interface element (e.g., 1503) (1626 b), in accordance with a determination that the gaze (e.g., 1501 a) of the user is not directed to the first user interface element (e.g., 1503), the electronic device 101a does not display the first user interface element (e.g., 1503) with a visual characteristic indicating that the first predefined portion (e.g., 1509) of the user is available for indirect interaction with the first user interface element, such as in fig. 15E. In some implementations, the electronic device requires that the user's gaze be directed toward the first user interface element so that the first user interface element is available for indirect interaction. In some implementations, when the second predefined portion of the user interacts directly with the second user interface element and the first predefined portion of the user is available for indirect interaction with another user interface element, the electronic device indicates that the second user interface element is available for direct interaction with the second predefined portion of the user and that the user interface element at which the gaze of the user is directed is available for indirect interaction with the first predefined portion of the user. In some embodiments, the indication of direct interaction is different from the indication of indirect interaction according to one or more steps of method 1400. In some implementations, in response to detecting that the user's gaze is directed at the third user interface element when the first predefined portion of the user is available for indirect interaction, the electronic device displays the third user interface element with a visual characteristic indicating that the first predefined portion of the user is available for indirect interaction with the third user interface element. In some implementations, in response to detecting that the user's gaze is directed to the second user interface object when the first predefined portion of the user is available for indirect interaction, the electronic device forgoes updating the visual characteristics of the second user interface element because the second predefined portion of the user is directly interacting with the second user interface element.
The above-described manner of displaying the first user interface element with visual characteristics that indicate that the first user interface element is available for indirect interaction and displaying the second user interface element with visual characteristics that indicate that the second user interface element is available for direct interaction provides an efficient way of enabling a user to direct input to the first user interface element and the second user interface element simultaneously with the first predefined portion and the second predefined portion of the user, respectively, which simplifies interactions between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and extends battery life of the electronic device by enabling the user to more quickly and efficiently use the electronic device while reducing errors in use.
In some embodiments, upon detecting that the user's gaze (e.g., 1501 b) moves away from the first user interface element (e.g., 1503) to the second user interface element (e.g., 1505), such as in fig. 15C, and while displaying the second user interface element (e.g., 1505) with the changed visual appearance (e.g., while the second predefined portion of the user is farther from the second user interface element than a distance threshold (e.g., 0.5, 1, 2, 3, 4, 5, 10, 15, 20, 30, 50, etc. centimeters) associated with direct input and is available for indirect interaction with the second user interface element), the electronic device 101a detects (1628 a) that the second predefined portion (e.g., 1511) of the user is directly interacting with the first user interface element (e.g., 1505), such as in fig. 15E. In some implementations, the second predefined portion of the user is within a threshold distance (e.g., 0.5, 1, 3, 5, 10, 15, 30, 50, etc. centimeters) of the first user interface element while in a predefined pose for direct interaction with the first user interface element, such as described with reference to methods 800, 1000, 1200, 1400, 1800, and/or 2000. In some implementations, the direct interaction is in accordance with a ready state of method 800 or an input (e.g., a selection input, a drag input, a scroll input, etc.) for performing an action.
In some embodiments, such as in fig. 15E, in response to detecting that the second predefined portion (e.g., 1511) of the user is directly interacting with the first user interface element (e.g., 1505), the electronic device 101a foregoes (1628 b) displaying the second user interface element (e.g., 1503) with a changed visual appearance. In some implementations, the first predefined portion of the user is unavailable for interaction with the second user interface element. In some implementations, the electronic device changes a visual appearance of the first user interface element to indicate that the first user interface element is directly interacting with the second predefined portion of the user. In some embodiments, the electronic device discards displaying the second user interface element with the changed visual appearance in response to detecting that the second predefined portion of the user is directly interacting with the first user interface element, even though the first predefined portion of the user is available for indirect interaction with the second user interface element and/or the gaze of the user is directed to the second user interface element. In some implementations, upon indicating that the second predefined portion of the user is available for indirect interaction with the second user interface element, the electronic device detects that the second predefined portion of the user is directly interacting with another user interface element and ceases to display an indication that the second predefined portion of the user is available for indirect interaction with the second user interface element.
The above-described ceasing to display the second user interface element with the changed appearance in response to detecting that the second predefined portion of the user is directly interacting with the first user interface element provides an efficient way of avoiding accidental input directed to the second user interface element, which simplifies interaction between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and prolongs battery life of the electronic device by enabling the user to use the electronic device more quickly and efficiently while reducing errors in use.
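A compact sketch of the rule above, with assumed names: the changed (indirect-target) appearance survives only while the hand that made it available remains free, and is dropped as soon as that hand begins a direct interaction elsewhere.

```swift
// Illustrative helper; the names are assumptions for this sketch.
struct HandStatus {
    var directTargetID: String?       // element the hand is directly interacting with, if any
    var availableForIndirect: Bool    // e.g., in a ready state and far from all elements
}

// An element keeps its changed (indirect-target) appearance only while the hand
// that made it available is still free; once that hand begins direct interaction
// with another element, the changed appearance is dropped.
func indirectHighlight(gazedElementID: String?, hand: HandStatus) -> String? {
    guard hand.directTargetID == nil, hand.availableForIndirect else { return nil }
    return gazedElementID
}

// Example: the highlight disappears when the hand starts a direct interaction.
print(indirectHighlight(gazedElementID: "element1503",
                        hand: HandStatus(directTargetID: nil, availableForIndirect: true)) ?? "none")
print(indirectHighlight(gazedElementID: "element1503",
                        hand: HandStatus(directTargetID: "element1505", availableForIndirect: false)) ?? "none")
```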
Figs. 17A-17E illustrate various ways in which the electronic device 101a presents visual indications of user input according to some embodiments.
Fig. 17A shows the electronic device 101a displaying a three-dimensional environment via the display generation component 120. It should be appreciated that, in some embodiments, the electronic device 101a utilizes one or more of the techniques described with reference to fig. 17A-17E in a two-dimensional environment or user interface without departing from the scope of the present disclosure. As described above with reference to fig. 1-6, the electronic device optionally includes a display generation component 120a (e.g., a touch screen) and a plurality of image sensors 314a. The image sensors optionally include one or more of the following: a visible light camera; an infrared camera; a depth sensor; or any other sensor that the electronic device 101a can use to capture one or more images of the user or a portion of the user while the user interacts with the electronic device 101a. In some embodiments, the display generation component 120a is a touch screen capable of detecting gestures and movements of the user's hand. In some embodiments, the user interfaces described below may also be implemented on a head-mounted display that includes a display generation component that displays the user interface to the user, as well as sensors that detect the physical environment and/or movement of the user's hands (e.g., external sensors facing outward from the user) and/or sensors that detect the user's gaze (e.g., internal sensors facing inward toward the user).
In fig. 17A, the electronic device 101a displays a three-dimensional environment that includes a representation 1704 of a table (e.g., such as table 604 in fig. 6B) in the physical environment of the electronic device 101a, a scrollable user interface element 1703, and selectable options 1705. In some embodiments, the representation 1704 of the table is a photorealistic video image (e.g., video or digital passthrough) of the table displayed by the display generation component 120 a. In some embodiments, the representation 1704 of the table is a view (e.g., a real or physical perspective) of the table through the transparent portion of the display generating component 120 a. As shown in fig. 17A, selectable option 1705 is displayed within and in front of back panel 1706. In some implementations, the back panel 1706 is a user interface that includes content corresponding to the selectable option 1705.
As will be described in greater detail herein, in some embodiments, the electronic device 101a is capable of detecting input based on the hand and/or gaze of the user of the device 101 a. In fig. 17A, the user's hand 1713 is in an inactive state (e.g., hand shape) that does not correspond to a ready state or input. In some embodiments, the ready state is the same or similar to the ready state described above with reference to fig. 7A-8K. In some embodiments, the user's hand 1713 is visible in the three-dimensional environment displayed by the device 101 a. In some embodiments, the electronic device 101a utilizes the display generation component 120a to display a photorealistic representation (e.g., video passthrough) of the user's finger and/or hand 1713. In some embodiments, the user's finger and/or hand 1713 is visible (e.g., truly transparent) through the transparent portion of the display generating component 120 a.
As shown in FIG. 17A, the scrollable user interface element 1703 and selectable option 1705 are displayed with simulated shadows. In some embodiments, the shadows are presented in a manner similar to one or more of the manners described below with reference to fig. 19A-20F. In some implementations, the shadow of the scrollable user interface element 1703 is displayed in response to detecting that the user's gaze 1701a is directed to the scrollable user interface element 1703 and the shadow of the selectable option 1705 is displayed in response to detecting that the user's gaze 1701b is directed to the selectable option 1705. It should be appreciated that in some embodiments, gaze 1701a and 1701b are shown as alternatives and are not meant to be detected simultaneously. In some implementations, the electronic device 101a additionally or alternatively updates the color of the scrollable user interface element 1703 in response to detecting a user's gaze 1701a on the scrollable user interface element 1703 and updates the color of the selectable option 1705 in response to detecting that the user's gaze 1701b is directed to the selectable option 1705.
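A sketch of the gaze-driven emphasis described above, with assumed appearance fields: only the element the gaze is directed to is given the simulated shadow and the updated color.

```swift
// Illustrative types; the appearance fields are assumptions for this sketch.
struct ElementAppearance {
    var showsSimulatedShadow: Bool
    var usesEmphasizedColor: Bool
}

// Only the element the gaze is directed to gets the simulated shadow and the
// updated color; every other element keeps its unemphasized appearance.
func appearances(for elementIDs: [String], gazedElementID: String?) -> [String: ElementAppearance] {
    var result: [String: ElementAppearance] = [:]
    for id in elementIDs {
        let emphasized = (id == gazedElementID)
        result[id] = ElementAppearance(showsSimulatedShadow: emphasized,
                                       usesEmphasizedColor: emphasized)
    }
    return result
}

// Example: gaze 1701a directed to the scrollable element 1703.
let looks = appearances(for: ["element1703", "element1705"], gazedElementID: "element1703")
print(looks["element1703"]!.showsSimulatedShadow)   // true
print(looks["element1705"]!.showsSimulatedShadow)   // false
```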
In some embodiments, the electronic device 101a displays a visual indication near the user's hand in response to detecting that the user is beginning to provide input with their hand. FIG. 17B illustrates an exemplary visual indication of user input displayed near a user's hand. It should be appreciated that the hands 1713, 1714, 1715, and 1716 in fig. 17B are shown as alternatives and in some embodiments are not necessarily all detected at the same time.
In some implementations, in response to detecting that the user begins providing input with their hand (e.g., hand 1713 or 1714) while the user's gaze 1701a is detected to be directed to the scrollable user interface element 1703, the electronic device 101a displays a virtual touchpad (e.g., 1709a or 1709b) near the user's hand. In some embodiments, detecting that the user has begun to provide input with their hand includes detecting that the hand meets the indirect ready state criteria described above with reference to fig. 7A-8K. In some embodiments, detecting that the user is beginning to provide input with their hand includes detecting that the user is performing a movement with their hand that meets one or more criteria, such as detecting that the user is beginning to "tap" with an extended finger (e.g., the finger moving a threshold distance, such as 0.1, 0.2, 0.3, 0.5, 1, 2, etc. centimeters) while one or more other fingers of the hand are curled toward the palm.
For example, in response to detecting that the hand 1713 is beginning to provide input while the user's gaze 1701a is pointing at the scrollable user interface element 1703, the electronic device 101a displays a virtual touch pad 1709a near the hand 1713 and the virtual touch pad 1709a is displayed away from the scrollable user interface element 1703. The electronic device 101a optionally also displays a virtual shadow 1710a of the user's hand 1713 and a virtual shadow of the virtual touchpad on the virtual touchpad 1709a. In some embodiments, the virtual shadow is displayed in a manner similar to one or more of the virtual shadows described in fig. 19A-20F. In some implementations, the size and/or placement of the shadows indicates to the user how far the user must continue to move their finger to interact with the virtual touchpad 1709a, and thus initiate input directed to the scrollable user interface element 1703, such as by indicating the distance between the hand 1713 and the virtual touchpad 1709a. In some implementations, the electronic device 101a updates the color of the virtual touchpad 1709a as the user moves the finger of their hand 1713 closer to the virtual touchpad 1709a. In some implementations, if the user moves their hand 1713 away from the virtual touchpad 1709a threshold distance (e.g., 1, 2, 3, 5, 10, 15, 20, 30, etc. centimeters) or stops making a hand shape corresponding to the initiating input, the electronic device 101a stops displaying the virtual touchpad 1709a.
Similarly, in response to detecting that the hand 1714 is beginning to provide input while the user's gaze 1701a is directed toward the scrollable user interface element 1703, the electronic device 101a displays a virtual touch pad 1709b near the hand 1714 and the virtual touch pad 1709b is displayed away from the scrollable user interface element 1703. The electronic device 101a optionally also displays a virtual shadow 1710b of the user's hand 1714 and a virtual shadow of the virtual touchpad on the virtual touchpad 1709b. In some embodiments, the virtual shadow is displayed in a manner similar to one or more of the virtual shadows described in fig. 19A-20F. In some implementations, the size and/or placement of the shadows indicates to the user how far the user must continue to move their finger to interact with the virtual touchpad 1709a, and thus initiate input directed to the scrollable user interface element 1703, such as by indicating the distance between the hand 1714 and the virtual touchpad 1709b. In some implementations, the electronic device 101a updates the color of the virtual touchpad 1709b as the user moves the finger of their hand 1714 closer to the virtual touchpad 1709b. In some implementations, if the user moves their hand 1714 away from the virtual touchpad 1709b a threshold distance (e.g., 1, 2, 3, 5, 10, 15, 20, 30, etc. centimeters) or stops making a hand shape corresponding to the initiating input, the electronic device 101a stops displaying the virtual touchpad 1709b.
Thus, in some embodiments, the electronic device 101a displays a virtual touchpad at a location near the location of the user's hand. In some implementations, the user can provide input pointing to the scrollable user interface element 1703 using a virtual touch pad 1709a or 1709 b. For example, in response to a user moving a finger of the hand 1713 or 1714 to touch the virtual touchpad 1709a or 1709b and then moving the finger away from the virtual touchpad (e.g., virtual tap), the electronic device 101a makes a selection among the scrollable user interface elements 1703. As another example, in response to detecting that the user moves a finger of the hand 1713 or 1714 to touch the virtual touchpad 1709a or 1709b, the finger is moved along the virtual touchpad and then moved away from the virtual touchpad, the electronic device 101a scrolls the scrollable user interface element 1703 as described below with reference to fig. 17C-17D.
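The virtual touchpad behavior described above can be sketched as follows; the 3 centimeter spawn offset and the 30 centimeter dismissal distance are assumptions chosen from within the example ranges given in this description, and the function names are illustrative.

```swift
import simd

// Illustrative geometry; the spawn offset and dismissal threshold are assumptions.
struct VirtualTrackpad {
    var center: SIMD3<Float>
    var normal: SIMD3<Float>   // faces back toward the fingertip
}

// Spawn the pad a few centimetres beyond the extended fingertip so the finger
// can travel forward to "touch" it, away from the scrollable element itself.
func spawnTrackpad(fingertip: SIMD3<Float>, pointingDirection: SIMD3<Float>) -> VirtualTrackpad {
    let direction = simd_normalize(pointingDirection)
    return VirtualTrackpad(center: fingertip + direction * 0.03, normal: -direction)
}

// Scale the hand's simulated shadow with the remaining distance to the pad, so
// the shadow tells the user how much farther to move to start the input.
func shadowScale(fingertip: SIMD3<Float>, pad: VirtualTrackpad) -> Float {
    let distance = abs(simd_dot(fingertip - pad.center, pad.normal))
    return min(max(distance / 0.03, 0), 1)   // 1 at spawn distance, 0 at contact
}

// Dismiss the pad if the hand retreats beyond a threshold distance from it.
func shouldDismiss(fingertip: SIMD3<Float>, pad: VirtualTrackpad) -> Bool {
    simd_distance(fingertip, pad.center) > 0.30
}
```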
In some implementations, the electronic device 101a displays a visual indication of user input provided by the user's hand in response to detecting that the user begins providing input directed to the selectable option 1705 (e.g., based on determining that the user's gaze 1701b is directed to the option 1705 when the user begins providing input). In some embodiments, detecting that the user has begun to provide input with their hand includes detecting that the hand meets the indirect ready state criteria described above with reference to fig. 7A-8K. In some embodiments, detecting that the user is beginning to provide input with their hand includes detecting that the user is performing a movement with their hand that meets one or more criteria, such as detecting that the user is beginning to "tap" with an extended finger (e.g., the finger moving a threshold distance, such as 0.1, 0.2, 0.3, 0.5, 1, 2, etc. centimeters) while one or more other fingers of the hand are curled toward the palm.
For example, in response to detecting that the hand 1715 is beginning to provide input when the user's gaze 1701b is directed toward the selectable option 1705, the electronic device 101a displays a visual indication 1711a near the hand 1715 and the visual indication 1711a is displayed remote from the selectable option 1705. The electronic device 101a also optionally displays a virtual shadow 1710c of the user's hand 1715 on the visual indication 1711 a. In some embodiments, the virtual shadow is displayed in a manner similar to one or more of the virtual shadows described in fig. 19A-20F. In some implementations, the size and/or placement of the shadows indicates to the user how far the user must continue to move their finger (e.g., to the location of visual indication 1711 a) to initiate input directed to selectable user interface element 1705, such as by indicating the distance between hand 1715 and visual indication 1711 a.
Similarly and in some embodiments, as an alternative to detecting the hand 1715, the electronic device 101a displays a visual indication 1711b near the hand 1716 and the visual indication 1711b is displayed remote from the selectable option 1705 in response to detecting that the hand 1716 begins providing input while the user's gaze 1701b is directed toward the selectable option 1705. The electronic device 101a also optionally displays a virtual shadow 1710d of the user's hand 1716 on the visual indication 1711b. In some embodiments, the virtual shadow is displayed in a manner similar to one or more of the virtual shadows described in fig. 19A-20F. In some implementations, the size and/or placement of the shadows indicates to the user how far the user must continue to move their finger (e.g., to the location of visual indication 1711 b) to initiate input directed to selectable user interface element 1705, such as by indicating the distance between hand 1716 and visual indication 1711b. Thus, in some embodiments, the electronic device 101a displays the visual indication 1711a or 1711b at a location near the user's hand 1715 or 1716 that begins providing input in a three-dimensional environment.
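One way to read the indication-and-shadow behavior described in the last two paragraphs is that both are driven by the remaining fingertip travel needed to commit the input. The sketch below encodes that reading; the 2 cm activation distance and all names are assumptions:

```swift
// Illustrative placement of the visual indication and the hand's virtual
// shadow along a single depth axis from the fingertip toward the target:
// the indication sits at the depth the fingertip still has to cover, and
// the shadow shrinks (and disappears on contact) as that gap closes.
struct IndicationLayout {
    var depthOffsetFromFingertip: Double
    var showsHandShadow: Bool
    var shadowScale: Double
}

func layoutIndication(remainingTravelToActivate remaining: Double,
                      activationDistance: Double = 0.02) -> IndicationLayout {
    let clamped = max(0.0, remaining)
    return IndicationLayout(depthOffsetFromFingertip: clamped,
                            showsHandShadow: clamped > 0,       // hidden once touching
                            shadowScale: clamped / activationDistance)
}
```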
It should be appreciated that in some embodiments, the type of visual assistance presented by the electronic device is different from the examples shown herein. For example, the electronic device 101a can display a visual indication similar to the visual indication 1711a or 1711b when the user interacts with the scrollable user interface element 1703. In this example, the electronic device 101a displays visual indications similar to the indications 1711a and 1711b in response to detecting movement of a user's hand (e.g., the hand 1713) initiating a tap when the user's gaze 1701a is directed to the scrollable user interface element 1703, and continues to display the visual indication as the user moves the finger of the hand 1713 to provide a scrolling input, updating the positioning of the visual indication to follow the movement of the finger. As another example, the electronic device 101a can display virtual trackpads similar to the virtual trackpads 1709a and 1709b when the user interacts with the selectable option 1705. In this example, the electronic device 101a displays virtual trackpads similar to the virtual trackpads 1709a and 1709b in response to detecting movement of a user's hand (e.g., hand 1713) initiating a tap when the user's gaze 1701b points at selectable option 1705.
In fig. 17C, the electronic device 101a detects input provided by the hand 1713 directed to the scrollable user interface element 1703 and input provided by the hand 1715 directed to the selectable option 1705. It should be appreciated that the inputs provided by the hands 1713 and 1715 and the gazes 1701a and 1701b are shown as alternatives, and in some embodiments, are detected simultaneously. Detecting input directed to the scrollable user interface element 1703 optionally includes detecting that a finger of the hand 1713 touches the virtual touchpad 1709, and then the finger and/or hand moves in the direction in which the scrollable user interface element 1703 scrolls (e.g., a vertical movement for vertical scrolling). Detecting input directed to the selectable option 1705 optionally includes detecting that a finger of the hand 1715 moves to touch the visual indication 1711. In some implementations, detecting an input directed to the option 1705 requires detecting the user's gaze 1701b directed to the option 1705. In some implementations, the electronic device 101a detects input directed to the selectable option 1705 without detecting the user's gaze 1701b directed to the selectable option 1705.
In some implementations, in response to detecting an input directed to the scrollable user interface element 1703, the electronic device 101a updates the display of the scrollable user interface element 1703 and the virtual touchpad 1709. In some implementations, when receiving input directed to the scrollable user interface element 1703, the electronic device 101a moves the scrollable user interface element 1703 away from a point of view associated with a user in the three-dimensional environment (e.g., in accordance with movement of the hand 1713 past and/or through the initial depth position of the virtual touchpad 1709). In some implementations, the electronic device 101a updates the color of the scrollable user interface element 1703 as the hand 1713 moves closer to the virtual touchpad 1709. As shown in fig. 17C, upon receiving this input, the scrollable user interface element 1703 is pushed back from the position shown in fig. 17B and the shadow of the scrollable user interface element 1703 ceases to be displayed. Similarly, upon receiving this input, the virtual touchpad 1709 is pushed back and is no longer displayed with the virtual shadow shown in fig. 17B. In some implementations, the distance that the scrollable user interface element 1703 moves back corresponds to the amount of movement of the finger of the hand 1713 when providing input directed to the scrollable user interface element 1703. Further, as shown in fig. 17C, according to one or more steps of the method 2000, the electronic device 101a stops displaying the virtual shadow of the hand 1713 on the virtual touchpad 1709. In some implementations, while the hand 1713 is in contact with the virtual touchpad 1709, the electronic device 101a detects lateral movement of the hand 1713 and/or finger in a direction in which the scrollable user interface element 1703 is scrollable, and scrolls the content of the scrollable user interface element 1703 according to the lateral movement of the hand 1713.
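A simplified sketch of the scroll interaction just described, in which the element's push-back tracks the finger's travel past the touchpad plane and content scrolling follows the lateral movement while in contact; the names are illustrative assumptions:

```swift
// Illustrative per-frame update for the indirect scroll: while the finger is
// in contact with (or pushed through) the virtual touchpad, the scrollable
// element is pushed back in proportion to the finger's depth past the pad
// plane, shadows are suppressed, and the content scrolls with the lateral
// component of the movement.
struct ScrollFrameUpdate {
    var elementDepthOffset: Double   // how far the element is pushed back
    var contentScrollDelta: Double   // applied to the scrollable content
    var showsShadows: Bool
}

func scrollUpdate(fingerDepthPastPad depth: Double,     // meters past the pad plane
                  lateralDelta: Double,                 // along the scrollable axis
                  inContact: Bool) -> ScrollFrameUpdate {
    guard inContact else {
        return ScrollFrameUpdate(elementDepthOffset: 0, contentScrollDelta: 0, showsShadows: true)
    }
    return ScrollFrameUpdate(elementDepthOffset: max(0.0, depth), // push-back tracks finger travel
                             contentScrollDelta: lateralDelta,    // scroll follows lateral motion
                             showsShadows: false)                 // shadows cease during the press
}
```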
In some implementations, in response to detecting an input directed to the selectable option 1705, the electronic device 101a updates the display of the selectable option 1705 and the visual indication 1711 of the input. In some implementations, when an input is received that is directed to the selectable option 1705, the electronic device 101a moves the selectable option 1705 away from a viewpoint associated with a user in the three-dimensional environment and toward the back panel 1706 and updates the color of the selectable option 1705 (e.g., in accordance with movement of the hand 1715 past and/or through the initial depth position of the visual indication 1711). As shown in fig. 17C, upon receiving this input, the selectable option 1705 is pushed back from the position shown in fig. 17B, and the shadow of the selectable option 1705 ceases to be displayed. In some implementations, the distance the selectable option 1705 moves back corresponds to the amount of movement of the finger of the hand 1715 when providing input directed to the selectable option 1705. Similarly, optionally in accordance with one or more steps of method 2000, the electronic device 101a stops displaying a virtual shadow of the hand 1715 on the visual indication 1711 (e.g., because the finger of the hand 1715 is now in contact with the visual indication 1711). In some implementations, after the finger of the hand 1715 touches the visual indication 1711, the user removes the finger from the visual indication 1711 to provide a tap input directed to the selectable option 1705.
In some implementations, the electronic device 101a presents an audio indication that the input has been received in response to detecting an input by the hand 1713 and the gaze 1701a directed to the scrollable user interface element 1703 or in response to detecting an input by the hand 1715 and the gaze 1701b directed to the selectable option 1705. In some embodiments, in response to detecting that the criteria for providing the input are met when the user's gaze is not directed toward the interactive user interface element, the electronic device 101a still presents an audio indication of the input and displays the virtual touchpad 1709 or the visual indication 1711 near the user's hand, even though touching the virtual touchpad 1709 or the visual indication 1711 and/or interacting with the virtual touchpad or the visual indication does not cause the input to be directed toward the interactive user interface element. In some implementations, in response to direct input directed to the scrollable user interface element 1703 or the selectable option 1705, the electronic device 101a updates the display of the scrollable user interface element 1703 or the selectable option 1705, respectively, in a manner similar to that described herein, and optionally also presents the same audio feedback. In some implementations, the direct input is input provided by the user's hand (e.g., similar to one or more direct inputs associated with methods 800, 1000, and/or 1600) when the user's hand is within a threshold distance (e.g., 0.05, 0.1, 0.2, 0.3, 0.5, 1, etc. centimeters) of the scrollable user interface element 1703 or the selectable option 1705.
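The direct/indirect distinction mentioned above reduces to a distance test in the simplest reading; the sketch below uses a 1 cm cutoff drawn from the example ranges, which is an assumption rather than a specified value:

```swift
// Illustrative routing decision: input counts as "direct" when the hand is
// within a small threshold of the element itself; otherwise it is treated
// as indirect (touchpad/indication based). Either route produces the same
// element feedback and the same audio indication that input was received.
enum InputRoute { case direct, indirect }

func routeInput(handToElementDistance d: Double,
                directThreshold: Double = 0.01) -> InputRoute {   // ~1 cm example
    return d <= directThreshold ? .direct : .indirect
}
```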
Fig. 17D shows the electronic device 101a detecting the end of the input provided to the scrollable user interface element 1703 and selectable option 1705. It should be appreciated that in some embodiments, the hands 1713 and 1715 and gaze 1701a and 1701b are alternatives to each other and are not necessarily all detected at the same time (e.g., the electronic device detects the hands 1713 and gaze 1701a at a first time and detects the hands 1715 and gaze 1701b at a second time). In some implementations, the electronic device 101a detects the end of the input directed to the scrollable user interface element 1703 when the user's hand 1713 moves away from the virtual touch pad 1709 a threshold distance (e.g., 0.05, 0.1, 0.2, 0.3, 0.5, 1, etc. centimeters). In some implementations, the electronic device 101a detects the end of the input directed to the selectable option 1705 when the user's hand 1715 moves away from the visual indication 1711 of the input a threshold distance (e.g., 0.05, 0.1, 0.2, 0.3, 0.5, 1, etc. cm).
In some implementations, in response to detecting the end of the input directed to the scrollable user interface element 1703 and the selectable option 1705, the electronic device 101a restores the appearance of the scrollable user interface element 1703 and the selectable option 1705 to the appearance of those elements prior to the detection of the input. For example, the scrollable user interface element 1703 moves toward a point of view associated with a user in the three-dimensional environment to the position at which it was displayed prior to detection of the input, and the electronic device 101a resumes display of the virtual shadow of the scrollable user interface element 1703. As another example, the selectable option 1705 moves toward a viewpoint associated with a user in the three-dimensional environment to the position at which it was displayed prior to detection of the input, and the electronic device 101a resumes display of the virtual shadow of the selectable option 1705.
Further, in some embodiments, the electronic device 101a restores the appearance of the virtual touchpad 1709 or the visual indication 1711 of the input in response to detecting the end of the user input. In some implementations, the virtual touchpad 1709 moves toward a point of view associated with the user in the three-dimensional environment to the position at which it was displayed prior to detecting input directed to the scrollable user interface element 1703, and the device 101a resumes display of the virtual shadow 1710e of the user's hand 1713 on the touchpad and the virtual shadow of the virtual touchpad 1709. In some implementations, after detecting an input directed to the scrollable user interface element 1703, the electronic device 101a stops the display of the virtual touchpad 1709. In some implementations, the electronic device 101a continues to display the virtual touchpad 1709 after input directed to the scrollable user interface element 1703 is provided until the electronic device 101a detects that the user's hand 1713 moves away from the virtual touchpad 1709 a threshold distance (e.g., 1, 2, 3, 5, 10, 15, etc. centimeters) or at a threshold speed. Similarly, in some embodiments, the visual indication 1711 of the input moves toward a point of view associated with the user in the three-dimensional environment to the position at which it was displayed prior to detecting the input directed to the selectable option 1705, and the device 101a resumes display of the virtual shadow 1710f of the user's hand 1715 on the visual indication 1711. In some embodiments, after detecting an input directed to the selectable option 1705, the electronic device 101a ceases display of the visual indication 1711 of the input. In some embodiments, before ceasing to display the visual indication 1711, the electronic device 101a displays an animation of the indication 1711 expanding and darkening before it ceases to be displayed. In some implementations, the electronic device 101a resumes display of the visual indication 1711a in response to detecting that the user begins providing subsequent input to the selectable option 1705 (e.g., moving a finger at the beginning of a tap gesture).
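A small sketch of the post-input persistence rule described above, keeping the touchpad displayed until the hand has withdrawn far or fast enough; the 5 cm and 0.5 m/s defaults are assumptions, since the paragraph gives only example distances and does not name a specific speed:

```swift
// Illustrative decision for whether to keep the virtual touchpad on screen
// after an input completes: continue displaying it for follow-on input
// unless the hand has clearly withdrawn by distance or by speed.
func keepTouchpadAfterInput(handRetreatDistance: Double,       // meters
                            handRetreatSpeed: Double,          // meters per second
                            distanceThreshold: Double = 0.05,  // example value
                            speedThreshold: Double = 0.5) -> Bool {
    return handRetreatDistance < distanceThreshold && handRetreatSpeed < speedThreshold
}
```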
In some embodiments, the electronic device 101a accepts input from both hands of the user in a coordinated manner (e.g., simultaneously). For example, in fig. 17E, the electronic device 101a displays a virtual keyboard 1717 to which input may be provided based on the user's gaze and the movements of the user's hands 1721 and 1723 and/or input from those hands. For example, in response to detecting a tap gesture of the user's hands 1721 and 1723 when the user's gaze 1701c or 1701d is detected to be directed to various portions of the virtual keyboard 1717, the electronic device 101a provides text input in accordance with the focused keys of the virtual keyboard 1717. For example, in response to detecting a tap motion of the hand 1721 while the user's gaze 1701c is directed to the "a" key, the electronic device 101a enters the "a" character into the text input field, and in response to detecting a tap motion of the hand 1723 while the user's gaze 1701d is directed to the "H" key, the electronic device 101a enters the "H" character. When the user provides input with the hands 1721 and 1723, the electronic device 101a displays indications 1719a and 1719b of the input provided by the hands 1721 and 1723. In some embodiments, the indications 1719a and/or 1719b for each of the hands 1721 and 1723 are displayed in a similar manner and/or have one or more characteristics of the indications described with reference to fig. 17A-17D. The visual indications 1719a and 1719b optionally include virtual shadows 1710f and 1710g of the user's hands 1721 and 1723. In some implementations, the shadows 1710f and 1710g indicate the distance between the user's hands 1721 and 1723 and the visual indications 1719a and 1719b, respectively, and cease to be displayed when the fingers of the hands 1721 and 1723 touch the indications 1719a and 1719b, respectively. In some embodiments, after each tap input, the electronic device 101a stops displaying the visual indication 1719a or 1719b corresponding to the hand 1721 or 1723 that provided the tap. In some embodiments, the electronic device 101a displays the indication 1719a and/or 1719b in response to detecting the start of a subsequent tap input of the corresponding hand 1721 or 1723.
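The two-handed keyboard behavior can be summarized as: each completed tap commits the key the gaze was on when that hand's tap began, in the order the taps complete. The following sketch is illustrative only and is not the described device's implementation:

```swift
// Illustrative model of gaze-targeted, two-handed typing: each hand's tap
// records the gazed key at the moment the tap began, and the characters are
// committed in completion order, so the hands can interleave.
struct HandTap {
    var hand: String
    var gazedKeyAtTapStart: Character
}

func typedText(from taps: [HandTap]) -> String {
    return String(taps.map { $0.gazedKeyAtTapStart })
}

// Example mirroring the figure: one hand taps while gaze is on "a", then the
// other hand taps while gaze is on "H".
let entered = typedText(from: [HandTap(hand: "left", gazedKeyAtTapStart: "a"),
                               HandTap(hand: "right", gazedKeyAtTapStart: "H")])
// entered == "aH"
```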
Fig. 18A-18O are flowcharts illustrating a method 1800 of presenting a visual indication of user input, according to some embodiments. In some embodiments, the method 1800 is performed at a computer system (e.g., computer system 101 in fig. 1, such as a tablet, smart phone, wearable computer, or head-mounted device) that includes a display generating component (e.g., display generating component 120 in fig. 1, 3, and 4) (e.g., heads-up display, touch screen, projector, etc.) and one or more cameras (e.g., cameras pointing downward toward the user's hand (e.g., color sensors, infrared sensors, and other depth sensing cameras) or cameras pointing forward from the user's head). In some embodiments, method 1800 is managed by instructions stored in a non-transitory computer readable storage medium and executed by one or more processors of a computer system, such as one or more processors 202 of computer system 101 (e.g., control unit 110 in fig. 1A). Some operations in method 1800 are optionally combined and/or the order of some operations is optionally changed.
In some embodiments, the method 1800 is performed at an electronic device (e.g., a mobile device (e.g., a tablet, a smart phone, a media player, or a wearable device)) or a computer in communication with a display generating component and one or more input devices. Examples of input devices include a touch screen, a mouse (e.g., external), a touch pad (optionally integrated or external), a remote control device (e.g., external), another mobile device (e.g., separate from the electronic device), a handheld device (e.g., external), a controller (e.g., external), a camera, a depth sensor, an eye tracking device, and/or a motion sensor (e.g., a hand tracking device, a hand motion sensor), etc. In some embodiments, the electronic device is in communication with a hand tracking device (e.g., one or more cameras, depth sensors, proximity sensors, touch sensors (e.g., touch screen, touch pad)). In some embodiments, the hand tracking device is a wearable device, such as a smart glove. In some embodiments, the hand tracking device is a handheld input device, such as a remote control or a stylus.
In some embodiments, such as in fig. 17A, the electronic device 101a displays 1802a the user interface object (e.g., 1705) in a three-dimensional environment via a display generation component. In some implementations, the user interface object is an interactive user interface object, and in response to detecting input directed to the user interface object, the electronic device performs an action associated with the user interface object. For example, the user interface object is a selectable option that, when selected, causes the electronic device to perform an action, such as displaying a corresponding user interface, changing a setting of the electronic device, or initiating playback of content. As another example, the user interface object is a container (e.g., window) in which the user interface/content is displayed, and in response to detecting a selection of the user interface object and a subsequent movement input, the electronic device updates the positioning of the user interface object in accordance with the movement input. In some embodiments, the user interface element is displayed in a three-dimensional environment (e.g., a computer-generated reality (CGR) environment, such as a Virtual Reality (VR) environment, a Mixed Reality (MR) environment, or an Augmented Reality (AR) environment, etc.) that is generated by, displayed by, or otherwise made viewable by the device (e.g., a user interface including the user interface object is a three-dimensional environment and/or is displayed within a three-dimensional environment).
In some embodiments, such as in fig. 17B, upon display of the user interface object (e.g., 1705), the electronic device 101a detects (1802b), via the one or more input devices (e.g., a hand tracking device, a head tracking device, an eye tracking device, etc.), a respective input of a predefined portion (e.g., 1715) (e.g., finger, hand, arm, head, etc.) of the user, wherein during the respective input, the predefined portion (e.g., 1715) of the user is located away from a location corresponding to the user interface object (e.g., 1705) (e.g., at least a threshold distance (e.g., 1, 5, 10, 20, 30, 50, 100, etc. centimeters) away from the location corresponding to the user interface object). In some implementations, the electronic device displays the user interface object in a three-dimensional environment that includes a virtual object (e.g., a user interface object, a representation of an application, a content item) and a representation of the portion of the user. In some embodiments, the user is associated with a location in the three-dimensional environment that corresponds to a location of the electronic device in the three-dimensional environment. In some embodiments, the representation of the portion of the user is a photorealistic representation of the portion of the user displayed by the display generating component or a view of the portion of the user visible through the transparent portion of the display generating component. In some embodiments, the respective input of the predefined portion of the user is an indirect input, such as described with reference to methods 800, 1000, 1200, 1600, and/or 2000.
In some embodiments, such as in fig. 17B, upon detecting the respective input (1802 c), in accordance with a determination that a first portion of movement of a predefined portion (e.g., 1715) of the user meets one or more criteria and the predefined portion (e.g., 1715) of the user is in a first position (e.g., in a three-dimensional environment), the electronic device 101a displays (1802 d) a visual indication (e.g., 1711 a) via a display generating component at a first position in the three-dimensional environment corresponding to the first position of the predefined portion (e.g., 1715) of the user. In some embodiments, the one or more criteria are met when the first portion of movement has a predetermined direction, magnitude, or speed. In some embodiments, the one or more criteria are met based on the pose of the predetermined portion of the user at the time the first portion of movement is detected and/or immediately before (e.g., immediately before). For example, if the palm of the user's hand is facing away from the user's torso while the hand is in a predetermined hand shape (e.g., a pointing hand shape in which one or more fingers are extended and one or more fingers are curled toward the palm) while the user moves one or more fingers of the hand away from the user's torso a predetermined threshold distance (e.g., 0.1, 0.2, 0.3, 0.5, 1, 2, 3, etc. centimeters), then the movement of the user's hand meets the one or more criteria. For example, the electronic device detects that the user begins performing a flick action by moving one or more fingers and/or a hand from which one or more fingers extend. In some embodiments, in response to detecting movement of a user's finger that meets the one or more criteria, the electronic device displays a visual indication in proximity to the finger, the hand, or a different predetermined portion of the hand. For example, in response to detecting that the user begins to tap his index finger when his palm is facing away from the user's torso, the electronic device displays a visual indication near the tip of the index finger. In some implementations, the visual indication is positioned at a distance away from the tip of the index finger that matches or corresponds to a distance that the user must further move the finger to cause selection of the user interface element to which the input is directed (e.g., the user interface element to which the user's gaze is directed). In some embodiments, the visual indication is not displayed upon detection of the first portion of movement (e.g., the visual indication is displayed in response to completion of the first portion of movement meeting the one or more criteria). In some embodiments, the one or more criteria include a criterion that is met when the portion of the user moves a predetermined distance (e.g., 0.1, 0.2, 0.5, 1, 2, 3, etc. centimeters) away from the torso of the user and/or toward the user interface object, and in response to detecting a first portion of the movement of the portion of the user toward the torso of the user and/or away from the user interface object that meets the one or more criteria, the electronic device ceases to display the visual indication. In some embodiments, the one or more criteria include a criterion that is met when a predetermined portion of the user is within a predetermined location, such as within a region of interest within a threshold distance (e.g., 2, 3, 5, 10, 15, 30, etc. 
centimeters) of the user's gaze, such as described with reference to method 1000. In some embodiments, the one or more criteria are met regardless of the positioning of the portion of the user relative to the region of interest.
In some embodiments, such as in fig. 17B, upon detecting the respective input (1802 c), in accordance with a determination that a first portion of movement of a predefined portion of the user (e.g., 1716) meets the one or more criteria and the predefined portion of the user (e.g., 1716) is in a second location, the electronic device 101a displays (1802 e) a visual indication (e.g., 1711B) via a display generating component at a second location in the three-dimensional environment that corresponds to the second location of the predefined portion of the user (e.g., 1716), wherein the second location is different from the first location. In some embodiments, the location in the three-dimensional environment at which the visual indication is displayed depends on the location of the predefined portion of the user. In some embodiments, the electronic device displays a visual indication having a predefined spatial relationship with respect to a predefined portion of the user. In some embodiments, in response to detecting the first portion of movement of the predefined portion of the user while the predefined portion of the user is in the first position, the electronic device displays a visual indication of the predefined spatial relationship with respect to the predefined portion of the user at a first location in the three-dimensional environment, and in response to detecting the first portion of movement of the predefined portion of the user while the predefined portion of the user is in the second position, the electronic device displays a visual indication of the predefined spatial relationship with respect to the predefined portion of the user at a third location in the three-dimensional environment.
The above-described manner of displaying a visual indication corresponding to a predetermined portion of a user indicating that input is detected and that a predefined portion of the user interacts with a user interface object provides an efficient manner of indicating that input from the predefined portion of the user will cause interaction with the user interface object, which simplifies interaction between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient (e.g., by reducing unintentional input from the user), which also reduces power usage and extends battery life of the electronic device by enabling the user to more quickly and efficiently use the electronic device while reducing errors in use.
In some embodiments, such as in fig. 17C, upon detection of the respective input (1804 a), in accordance with a determination that a first portion of movement of a predefined portion of the user (e.g., 1715) meets the one or more criteria and in accordance with a determination that the one or more second criteria are met, including criteria met when the first portion of movement of the predefined portion of the user (e.g., 1715) is followed by a second portion of movement of the predefined portion of the user (e.g., 1715) (e.g., and the second portion of movement of the predefined portion of the user meets one or more criteria, such as distance, speed, duration, or other threshold, or the second portion of movement matches the predetermined portion of movement and the user's gaze is directed to the user interface object), the electronic device 101a performs (1804 b) a selection operation with respect to the user interface object (e.g., 1705) in accordance with the respective input. In some embodiments, performing the selection operation includes: selecting a user interface object, activating or deactivating a setting associated with the user interface object, initiating, stopping or modifying playback of a content item associated with the user interface object, initiating display of a user interface associated with the user interface object, and/or initiating communication with another electronic device. In some implementations, the one or more criteria include a criterion that is met when the second portion of movement has a distance that meets a distance threshold (e.g., a distance between a predefined portion of the user and a visual indication in a three-dimensional environment). In some implementations, in response to detecting that the distance of the moved second portion exceeds the distance threshold, the electronic device moves the visual indication (e.g., to display the visual indication at a location corresponding to the predefined portion of the user) according to the distance exceeding the threshold (e.g., rearward). For example, the visual indication is initially 2 centimeters from the tip of the user's finger and, in response to detecting that the user has moved his finger 3 centimeters toward the user interface object, the electronic device moves the visual indication 1 centimeter toward the user interface object and selects the user interface object in accordance with movement of the finger over or through the visual indication, the selection occurring once the user's fingertip has moved 2 centimeters. In some embodiments, the one or more criteria include criteria that are met in accordance with a determination that the user's gaze is directed at the user interface object and/or the user interface object is in the user's attention area (described with reference to method 1000).
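The numeric example in the preceding paragraph (a 2 cm gap in front of the fingertip, a 3 cm finger movement, selection once 2 cm is covered, and 1 cm of extra push) can be written as a small sketch; all names are assumptions:

```swift
// Illustrative selection rule for the indirect press: the indication starts
// a fixed gap in front of the fingertip, selection fires once the fingertip
// has covered that gap, and any extra travel pushes the indication (and the
// targeted object) further back.
struct IndirectPressState {
    var selected: Bool
    var indicationPushback: Double   // how far the indication moved toward the object
}

func pressState(fingertipTravel travel: Double,
                initialGap: Double = 0.02) -> IndirectPressState {   // 2 cm example
    return IndirectPressState(selected: travel >= initialGap,
                              indicationPushback: max(0.0, travel - initialGap))
}

// With the 2 cm gap, a 3 cm finger movement selects the object and pushes
// the indication back by roughly 1 cm.
let pressExample = pressState(fingertipTravel: 0.03)
// pressExample.selected == true; pressExample.indicationPushback is ~0.01
```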
In some embodiments, upon detecting the respective input (1804 a), in accordance with a determination that the first portion of the movement of the predefined portion of the user (e.g., 1715 in fig. 17C) does not meet the one or more criteria and in accordance with a determination that the one or more second criteria are met, the electronic device 101a foregoes (1804C) performing the selection operation with respect to the user interface object (e.g., 1705 in fig. 17C). In some embodiments, even if the one or more second criteria are met, including criteria met by detecting movement corresponding to the second portion of movement, the electronic device foregoes performing the selection operation if the first portion of movement does not meet the one or more criteria. For example, the electronic device performs the selection operation in response to detecting the second portion of the movement while the visual indication is displayed. In this example, the electronic device foregoes performing the selection operation in response to detecting the second portion of the movement when the electronic device does not display the visual indication.
The above-described manner of performing a selection operation in response to the one or more second criteria being met after the first portion of movement is detected and while the visual indication is displayed provides an efficient manner of accepting user input based on movement of the user's predefined portion and rejecting unintended input when movement of the user's predefined portion meets the second one or more criteria without first detecting the first portion of movement, which simplifies interactions between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and extends battery life of the electronic device by enabling the user to more quickly and efficiently use the electronic device while reducing errors in use.
In some embodiments, such as in fig. 17C, upon detection of a respective input, the electronic device 101a displays (1806 a) via a display generation component a representation of a predefined portion (e.g., 1715) of the user that moves according to movement of the predefined portion (e.g., 1715) of the user. In some implementations, the representation of the predefined portion of the user is a photorealistic representation (e.g., a passthrough video) of the portion of the user displayed in the three-dimensional environment at a location corresponding to the location of the predefined portion of the user in the physical environment of the electronic device. In some embodiments, the pose of the representation of the predefined portion of the user matches the pose of the predefined portion of the user. For example, in response to detecting that a user makes a pointing hand shape at a first location in a physical environment, the electronic device displays a representation of a hand making the pointing hand shape at a corresponding first location in a three-dimensional environment. In some embodiments, the representation of the portion of the user is a view of the portion of the user through the transparent portion of the display generating component.
The above-described manner of displaying a representation of a predefined portion of a user that moves according to movement of the predefined portion of the user provides an efficient way of presenting feedback to the user when the user moves the predefined portion of the user to provide input to the electronic device, which simplifies interactions between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and prolongs battery life of the electronic device by enabling the user to use the electronic device more quickly and efficiently while reducing errors in use.
In some embodiments, such as in fig. 17C, a predefined portion of the user (e.g., 1715) is visible in the three-dimensional environment via the display generating component (1808 a). In some embodiments, the display generating component includes a transparent portion through which the predefined portion of the user is visible (e.g., a true passthrough). In some implementations, the electronic device presents a photorealistic representation (e.g., virtual passthrough video) of the predefined portion of the user via the display generation component.
The above-described manner of making predefined portions of the user visible to the user via the display generating component provides efficient visual feedback of user input to the user, which simplifies interaction between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and extends battery life of the electronic device by enabling the user to use the electronic device more quickly and efficiently while reducing errors in use.
In some embodiments, such as in fig. 17C, upon detection of a respective input and in accordance with determining that a first portion (e.g., 1715) of a movement of a predefined portion of a user meets the one or more criteria, the electronic device 101a modifies (1810 a) a display of a user interface object (e.g., 1705) in accordance with the respective input. In some implementations, modifying the display of the user interface object includes updating one or more of a color, a size, or a positioning of the user interface object in the three-dimensional environment.
The above-described manner of modifying the display of the user interface object in response to the first portion of movement provides an efficient way of indicating that further input will be directed to the user interface object, which simplifies interaction between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and extends battery life of the electronic device by enabling the user to use the electronic device more quickly and efficiently while reducing errors in use.
In some embodiments, such as in fig. 17C, modifying the display of the user interface object (e.g., 1705) includes (1812a): in accordance with a determination that, after a first portion of the movement of the user's predefined portion (e.g., 1715) meets the one or more criteria, the user's predefined portion (e.g., 1715) moves toward a location corresponding to the user interface object (e.g., 1705), and in accordance with the movement of the user's predefined portion (e.g., 1715) toward the location corresponding to the user interface object (e.g., 1705), moving the user interface object (e.g., 1705) backwards in a three-dimensional environment (e.g., away from the user, in a direction of movement of the user's predefined portion) (1812b). In some embodiments, the electronic device moves the user interface object backward an amount proportional to the amount of movement of the predefined portion of the user after the first portion of the movement meets the one or more criteria. For example, in response to detecting that the predefined portion of the user moves a first amount, the electronic device moves the user interface object back a second amount. As another example, in response to detecting that the predefined portion of the user moves a third amount greater than the first amount, the electronic device moves the user interface object back a fourth amount greater than the second amount. In some implementations, the electronic device moves the user interface object backward when movement of the predefined portion of the user subsequent to the first portion of movement is detected after the predefined portion of the user has moved sufficiently to cause selection of the user interface object.
The above-described manner of moving the user interface object backwards according to the movement of the predefined portion of the user after the first portion of movement provides an efficient way of indicating to the user which user interface element the input is directed to, which simplifies the interaction between the user and the electronic device, enhances the operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and prolongs battery life of the electronic device by enabling the user to use the electronic device more quickly and efficiently while reducing errors in use.
In some embodiments, such as in fig. 17C, the user interface object (e.g., 1705) is displayed in a respective user interface (e.g., 1706) via a display generation component (1814 a) (e.g., in a window or other container, overlaid on a back panel, in a user interface of a respective application, etc.).
In some embodiments, such as in fig. 17C, in accordance with a determination that the respective input is a scroll input, the electronic device 101a moves the respective user interface and user interface object (e.g., 1703) back in accordance with movement of the predefined portion of the user (e.g., 1713) toward the location corresponding to the user interface object (e.g., 1703) (1814 b) (e.g., the user interface element does not move away from the user relative to the respective user interface element, but instead moves the user interface element with the respective user interface element).
In some embodiments, such as in fig. 17C, in accordance with a determination that the respective input is an input other than a scroll input (e.g., a selection input, an input that moves a user interface element within a three-dimensional environment), the electronic device moves the user interface object (e.g., 1705) relative to the respective user interface (e.g., 1706) (e.g., backwards) without moving the respective user interface (e.g., 1706) (1814C). In some implementations, the user interface objects move independently of the respective user interfaces. In some implementations, the respective user interfaces do not move. In some embodiments, in response to the scroll input, the electronic device moves the user interface object rearward along with the container of user interface objects, and in response to an input other than the scroll input, the electronic device moves the user interface object rearward without moving the container of user interface objects rearward.
The above-described manner of selectively moving the respective user interface object back in accordance with the type of input of the respective input provides an efficient way of indicating to the user which user interface element the input is directed to, which simplifies interactions between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and extends battery life of the electronic device by enabling the user to use the electronic device more quickly and efficiently while reducing errors in use.
In some embodiments, such as in fig. 17C, upon detecting the respective input (1816 a), after detecting movement of the predefined portion of the user (e.g., 1715) toward the user interface object (e.g., 1705) and after moving the user interface object back in the three-dimensional environment, the electronic device 101a detects (1816 b) movement of the predefined portion of the user (e.g., 1715) away from the location corresponding to the user interface object (e.g., toward the torso of the user). In some implementations, after performing the selection operation in response to detecting movement of a predefined portion of the user that meets one or more respective criteria, movement of the predefined portion of the user away from a location corresponding to the user interface object is detected. In some embodiments, after discarding performing the selection operation in response to detecting movement of the predefined portion of the user that does not meet the one or more respective criteria, movement of the predefined portion of the user away from the location corresponding to the user interface object is detected.
In some implementations, such as in fig. 17D, upon detecting the respective input (1816 a), in response to detecting movement of the predefined portion of the user (e.g., 1715) away from the location corresponding to the user interface object (e.g., 1705), the electronic device 101a moves (1816 c) the user interface object forward (e.g., 1705) (e.g., toward the user) in the three-dimensional environment according to movement of the predefined portion of the user (e.g., 1715) away from the location corresponding to the user interface object (e.g., 1705). In some implementations, in response to the predefined portion of the user moving away from the user interface object a distance less than a predetermined threshold, the electronic device moves the corresponding user interface element forward an amount proportional to the distance of movement of the predefined portion of the user when detecting movement of the predefined portion of the user. In some implementations, once the distance traveled by the predefined portion of the user reaches a predetermined threshold, the electronic device displays the user interface element at a distance from the user that the user displayed the user interface element prior to detection of the respective input. In some implementations, in response to detecting that the predefined portion of the user moves greater than a threshold distance away from the user interface object, the electronic device stops moving the user interface object forward and maintains displaying the user interface element at a distance from the user that the user interface object was displayed prior to detecting the respective input.
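The release behavior described above is essentially a clamped, proportional return toward the element's original depth; the sketch below expresses that reading, with illustrative names:

```swift
// Illustrative forward restoration on release: as the hand retreats, the
// object's push-back shrinks proportionally, but the object never moves
// forward past the depth at which it was displayed before the input began.
func restoredDepthOffset(currentPushback: Double,   // how far the object sits back, meters
                         handRetreat: Double,       // how far the hand has moved away, meters
                         retreatThreshold: Double) -> Double {
    let fraction = min(1.0, max(0.0, handRetreat / retreatThreshold))
    return currentPushback * (1.0 - fraction)       // 0 once the retreat reaches the threshold
}
```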
The above-described manner of moving the user interface object forward in response to movement of a predefined portion of the user away from the user interface object provides an efficient way of providing feedback to the user that movement away from the user interface element is detected, which simplifies interactions between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and extends battery life of the electronic device by enabling the user to use the electronic device more quickly and efficiently while reducing errors in use.
In some embodiments, such as in fig. 17B, a visual indication (e.g., 1711 a) at a first location in the three-dimensional environment corresponding to a first location of a predefined portion (e.g., 1715) of the user is displayed in proximity to a representation (1818 a) of the predefined portion (e.g., 1715) of the user visible in the three-dimensional environment at a first respective location in the three-dimensional environment. In some implementations, the representation of the predefined portion of the user is a photorealistic representation (e.g., virtual passthrough) of the predefined portion of the user displayed by the display generating component. In some implementations, the representation of the predefined portion of the user is a predefined portion of the user that is visible through a transparent portion of the display generating component (e.g., a true passthrough). In some embodiments, the predefined portion of the user is the user's hand and the visual indication is displayed near the tip of the user's finger.
In some embodiments, such as in fig. 17B, the visual indication (e.g., 1711b) at the second location in the three-dimensional environment corresponding to the second location of the predefined portion (e.g., 1716) of the user is displayed in proximity to the representation (1818b) of the predefined portion (e.g., 1716) of the user visible in the three-dimensional environment at the second respective location in the three-dimensional environment. In some implementations, as the user moves the predefined portion of the user, the electronic device updates the location of the visual indication to continue to be displayed in proximity to the predefined portion of the user. In some embodiments, after detecting movement that meets the one or more criteria and before detecting movement of the portion of the user toward the torso and/or away from the user interface object, the electronic device continues to display the visual indication (e.g., at and/or near the tip of the finger performing the first portion of movement) and updates the location of the visual indication based on additional movement of the portion of the user. For example, in response to detecting movement of a user's finger that meets the one or more criteria, the movement including movement of the finger away from the user's torso and/or toward the user interface object, the electronic device displays the visual indication and continues to display the visual indication at a location of a portion of the hand (e.g., around the finger, such as an extended finger) if the user's hand moves laterally or vertically without moving toward the user's torso. In some embodiments, in accordance with a determination that the first portion of the movement does not meet the one or more criteria, the electronic device foregoes displaying the visual indication.
The above-described manner of displaying visual indications in the vicinity of the predefined portion of the user provides an efficient way of indicating that movement of the predefined portion of the user causes input to be detected at the electronic device, which simplifies interaction between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and extends battery life of the electronic device by enabling the user to use the electronic device more quickly and efficiently while reducing errors in use.
In some embodiments, such as in fig. 7C, upon display of the user interface object, the electronic device 101a detects (1820a), via the one or more input devices, a second corresponding input that includes movement of a predefined portion (e.g., 709) of the user, wherein during the second corresponding input, the position of the predefined portion (e.g., 709) of the user is within a location corresponding to the user interface object (e.g., the predefined portion of the user is within a threshold distance (e.g., 0.5, 1, 2, 3, 5, 10, 15, etc. centimeters) of the user interface object such that the predefined portion of the user interacts directly with the user interface object, such as described with reference to methods 800, 1000, 1200, 1400, 1600, and/or 2000).
In some embodiments, such as in fig. 7C, upon detection of the second corresponding input (1820 b), the electronic device modifies (1820C) the display (e.g., color, size, positioning, etc.) of the user interface object (e.g., 705) in accordance with the second corresponding input without displaying a visual indication via the display generating component at a location corresponding to the predefined portion (e.g., 709) of the user. For example, the electronic device updates the color of the user interface object in response to detecting a predefined pose of the predefined portion of the user when the predefined portion of the user is within a threshold distance (e.g., 0.5, 1, 2, 3, 5, 10, 15, etc. centimeters) of the user interface object. In some embodiments, the electronic device detects movement of the predefined portion of the user toward the user interface object and, in response to the movement of the predefined portion of the user and once the predefined portion of the user has been in contact with the user interface object, moves the user interface object according to the movement of the predefined portion of the user (e.g., moves the user interface object in a direction, speed, and/or distance corresponding to the direction, speed, and/or distance of movement of the predefined portion of the user).
The above-described manner of modifying the display of the user interface object in accordance with the second corresponding input provides an efficient way of indicating to the user which user interface element the second input is directed to, which simplifies interaction between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and prolongs battery life of the electronic device by enabling the user to use the electronic device more quickly and efficiently while reducing errors in use.
In some embodiments, such as in fig. 17C, the electronic device (e.g., 101a) performs (1821a) the respective operation in response to the respective input.
In some embodiments, upon display of the user interface object (e.g., 1703, 1705 in fig. 17C), the electronic device (e.g., 101a) detects (1821b), via the one or more input devices (e.g., 314a), a third corresponding input comprising movement of a predefined portion of the user (e.g., 1713, 1715 in fig. 17C) that includes the same type of movement as movement of the predefined portion of the user in the corresponding input (e.g., the third corresponding input is a repetition or substantial repetition of the corresponding input), wherein during the third corresponding input, the position of the predefined portion of the user is at a location corresponding to the user interface object. For example, when the input in fig. 17C is provided, the hands 1713 and/or 1715 are located at the position of option 1705.
In some embodiments, such as in fig. 17C, in response to detecting the third corresponding input, the electronic device (e.g., 101) performs (1821c) the corresponding operation (e.g., without displaying a visual indication via the display generating component at a location corresponding to the predefined portion of the user). In some implementations, the electronic device performs the same operation in response to input directed to the respective user interface element regardless of the type of input provided (e.g., direct input, indirect input, air gesture input, etc.).
Performing the same operation in response to input directed to the respective user interface element provides consistent and convenient user interaction with the electronic device regardless of the type of input received, thereby enabling a user to quickly and efficiently use the electronic device.
In some embodiments, such as in fig. 17B, before detecting the respective input (1822a), in accordance with a determination that the user's gaze (e.g., 1701b) is directed to the user interface object (e.g., 1705), the electronic device displays (1822b) the user interface object (e.g., 1705) with a respective visual characteristic (e.g., size, location, color) having a first value. In some embodiments, the electronic device displays the user interface object in a first color when the user's gaze is directed at the user interface object.
In some implementations, before detecting a respective input (such as the input in fig. 17B) (1822 a), in accordance with a determination that the user's gaze is not directed to the user interface object (e.g., 1705), the electronic device displays (1822 c) the user interface object (e.g., 1705) with a respective visual characteristic having a second value different from the first value. In some embodiments, the electronic device displays the user interface object in the second color when the user's gaze is not directed at the user interface object.
The above-described manner of updating the respective visual characteristics of the user interface object in dependence on whether the user's gaze is directed at the user interface object provides an efficient way of indicating to the user which user interface element the input is to be directed to, which simplifies interactions between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and prolongs battery life of the electronic device by enabling the user to use the electronic device more quickly and efficiently while reducing errors in use.
In some embodiments, such as in fig. 17C, upon detection of a respective input (1824a), after a first portion of movement of the predefined portion of the user (e.g., 1715) meets the one or more criteria (1824b), in accordance with a determination that a second portion of movement of the predefined portion of the user (e.g., 1715) meeting the one or more second criteria and a subsequent third portion of movement of the predefined portion of the user (e.g., 1715) meeting the one or more third criteria are detected, wherein the one or more second criteria include a criterion met when the second portion of movement of the predefined portion of the user (e.g., 1715) includes movement (sufficient for selection) toward a location corresponding to the user interface object that is greater than a movement threshold, and the one or more third criteria include a criterion met when the third portion of movement is movement away from the location corresponding to the user interface object (e.g., 1705) that is detected within a time threshold (e.g., 0.1, 0.2, 0.3, 0.5, 1, 2, etc. seconds), the electronic device 101a performs (1824c) a tap operation relative to the user interface object (e.g., 1705). In some embodiments, the first portion of the movement of the predefined portion of the user is a first amount of movement of the predefined portion of the user toward the user interface object, the second portion of the movement of the predefined portion of the user is a second amount of further movement of the predefined portion of the user toward the user interface object (e.g., sufficient to indirectly select the user interface object), and the third portion of the movement of the predefined portion of the user is movement of the predefined portion of the user away from the user interface element. In some implementations, the tap operation corresponds to a selection of a user interface element (e.g., similar to tapping a user interface element displayed on a touch screen).
The above-described manner of performing a tap operation in response to detecting the first, second, and third portions of movement provides an efficient manner of receiving a tap input when the predefined portion of the user is in a position away from the user interface object, which simplifies interaction between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and prolongs battery life of the electronic device by enabling the user to use the electronic device more quickly and efficiently while reducing errors in use.
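For illustration only, the tap determination described above can be read as a small classifier over the second and third portions of movement: movement toward the object greater than a movement threshold, followed by movement away from the object detected within a time threshold. The Swift sketch below is a non-limiting example; the `MovementPortion` type and the specific threshold values are assumptions chosen for the example.

```swift
// Hypothetical summary of one portion of movement of the predefined portion of the user.
struct MovementPortion {
    var towardObject: Float   // metres moved toward (+) or away from (-) the location corresponding to the object
    var duration: Double      // seconds this portion of movement took
}

enum RecognizedInput { case tap, none }

/// Classifies the second and third portions of movement (after a first portion has already
/// satisfied the one or more criteria) as a tap directed to the user interface object.
func classify(second: MovementPortion, third: MovementPortion,
              movementThreshold: Float = 0.01,   // assumed 1 cm of movement toward the object
              timeThreshold: Double = 0.3) -> RecognizedInput {
    let secondOK = second.towardObject > movementThreshold                    // enough movement for selection
    let thirdOK = third.towardObject < 0 && third.duration <= timeThreshold   // retraction within the time threshold
    return (secondOK && thirdOK) ? .tap : .none
}

// Example: 1.5 cm toward the object, then retraction within 0.2 s, resolves to a tap.
print(classify(second: MovementPortion(towardObject: 0.015, duration: 0.25),
               third: MovementPortion(towardObject: -0.02, duration: 0.2)))   // tap
```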
In some embodiments, such as in fig. 17C, upon detection of the respective input (1826 a), after a first portion of movement of the predefined portion of the user (e.g., 1713) satisfies the one or more criteria (1826 b), in accordance with a determination that a second portion of movement of the predefined portion of the user (e.g., 1713) that satisfies the one or more second criteria and a subsequent third portion of movement of the predefined portion of the user (e.g., 1713) that satisfies one or more third criteria are detected, the electronic device performs a scrolling operation (e.g., 1706) relative to the user interface object (e.g., 1703) in accordance with the direction of the third portion of movement, wherein the one or more second criteria include a criterion that is satisfied when the second portion of movement of the predefined portion of the user (e.g., 1713) includes movement (e.g., sufficient for selection) toward a location corresponding to the user interface object (e.g., 1703) that is greater than a movement threshold, and the one or more third criteria include a criterion that is satisfied when the third portion of movement is lateral movement relative to the location corresponding to the user interface object (e.g., 1703) (e.g., movement in a direction orthogonal to a direction between the predefined portion of the user (e.g., 1713) and the location corresponding to the user interface object in the three-dimensional environment). In some implementations, the scrolling operation includes scrolling content (e.g., text content, images, etc.) of the user interface object according to movement of a predefined portion of the user. In some implementations, the content of the user interface object scrolls in a direction, speed, and/or amount corresponding to the direction, speed, and/or amount of movement of the predefined portion of the user in the third portion of movement. For example, if the lateral movement is a horizontal movement, the electronic device scrolls the content horizontally. As another example, if the lateral movement is a vertical movement, the electronic device scrolls the content vertically.
The above-described manner of performing a scrolling operation in response to detecting the first and second portions of movement and a subsequent third portion of movement that includes lateral movement of the predefined portion of the user provides an efficient manner of manipulating the user interface element when the predefined portion of the user is positioned away from the user interface element, which simplifies interaction between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and prolongs battery life of the electronic device by enabling the user to use the electronic device more quickly and efficiently while reducing errors in use.
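For illustration only, the scrolling behavior described above can be sketched by splitting the third portion of movement into its component toward the object and its lateral (orthogonal) component, and mapping the lateral component to a scroll delta whose direction and amount follow the hand. The vector type and helper functions below are assumptions for the example, not part of any embodiment or framework.

```swift
// Minimal 3D vector helpers (avoiding any platform-specific frameworks).
struct Vec3 { var x: Float; var y: Float; var z: Float }

func dot(_ a: Vec3, _ b: Vec3) -> Float { a.x*b.x + a.y*b.y + a.z*b.z }
func scaled(_ v: Vec3, _ s: Float) -> Vec3 { Vec3(x: v.x*s, y: v.y*s, z: v.z*s) }
func minus(_ a: Vec3, _ b: Vec3) -> Vec3 { Vec3(x: a.x-b.x, y: a.y-b.y, z: a.z-b.z) }

/// Splits a hand movement into the component toward the object and the lateral (orthogonal)
/// component, and converts the lateral component into a 2D scroll delta.
/// `directionToObject` is assumed to be a unit vector pointing from the hand to the object.
func scrollDelta(handMovement: Vec3, directionToObject: Vec3) -> (horizontal: Float, vertical: Float) {
    // Remove the component of the movement that points at the object; what remains is lateral.
    let towardAmount = dot(handMovement, directionToObject)
    let lateral = minus(handMovement, scaled(directionToObject, towardAmount))
    // Horizontal hand movement scrolls horizontally, vertical hand movement scrolls vertically.
    return (horizontal: lateral.x, vertical: lateral.y)
}

// Example: a mostly upward lateral movement produces a mostly vertical scroll.
let delta = scrollDelta(handMovement: Vec3(x: 0.01, y: 0.08, z: 0.0),
                        directionToObject: Vec3(x: 0, y: 0, z: -1))
print(delta) // (horizontal: 0.01, vertical: 0.08)
```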
In some implementations, such as in fig. 17C, upon detecting the respective input (1828 a), after a first portion of movement of a predefined portion of the user (e.g., 1715) meets the one or more criteria, the electronic device detects (1828 b) via the one or more input devices a second portion of movement of the predefined portion of the user (e.g., 1715) away from a location corresponding to the user interface object (e.g., 1705) (e.g., the user moves his finger toward the torso of the user and away from a location corresponding to the location of the user interface object in the three-dimensional environment).
In some embodiments, upon detecting a corresponding input (1828 a), such as the input in fig. 17C, in response to detecting the second portion of the movement, the electronic device updates (1828 c) the appearance of the visual indication (e.g., 1711) in accordance with the second portion of the movement. In some embodiments, updating the appearance of the visual indication includes changing the translucency, size, color, or position of the visual indication. In some embodiments, the electronic device stops displaying the visual indication after updating the appearance of the visual indication. For example, in response to detecting the second portion of the movement of the predefined portion of the user, the electronic device expands the visual indication and fades the color and/or display of the visual indication, and then stops displaying the visual indication.
The above-described manner of updating the appearance of the visual indication based on the second portion of the movement provides an efficient way of confirming to the user that the first portion of the movement meets the one or more criteria when the second portion of the movement is detected, which simplifies interaction between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and prolongs battery life of the electronic device by enabling the user to use the electronic device more quickly and efficiently while reducing errors in use.
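For illustration only, the update to the visual indication on retraction described above (expanding and fading the indication before it ceases to be displayed) can be sketched as a function of how far the predefined portion of the user has moved away. The `IndicationAppearance` type and the `fadeDistance` value below are assumptions made for the example.

```swift
// Hypothetical appearance of the visual indication displayed at a location corresponding to
// the predefined portion of the user.
struct IndicationAppearance {
    var scale: Float     // 1.0 = original size
    var opacity: Float   // 1.0 = fully visible
}

/// Expands and fades the indication as the second portion of movement carries the hand away
/// from the location corresponding to the user interface object; returns nil once the
/// indication should no longer be displayed. `fadeDistance` (5 cm) is an assumed value.
func updatedIndication(retraction: Float, fadeDistance: Float = 0.05) -> IndicationAppearance? {
    let progress = max(0, min(1, retraction / fadeDistance))
    guard progress < 1 else { return nil }                  // cease displaying the indication
    return IndicationAppearance(scale: 1 + progress,        // expands as the hand retracts
                                opacity: 1 - progress)      // fades out at the same time
}

// Example: halfway through the retraction the indication is larger and half-faded;
// past the fade distance it is removed entirely.
print(updatedIndication(retraction: 0.025) as Any)  // scale 1.5, opacity 0.5
print(updatedIndication(retraction: 0.08) as Any)   // nil
```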
In some implementations, updating the appearance of a visual indication, such as the visual indication in fig. 17C (e.g., 1711), includes ceasing to display the visual indication (1830 a). In some embodiments, such as in fig. 17A, after ceasing to display the visual indication, the electronic device 101a detects (1830 b), via the one or more input devices, a second corresponding input comprising a second movement of the predefined portion of the user (e.g., 1713), wherein during the second corresponding input, the location of the predefined portion of the user (e.g., 1713) in the physical environment of the electronic device is farther than a threshold distance (e.g., 3, 5, 10, 15, 30, etc. centimeters) from the location corresponding to the location of the user interface object (e.g., 1705) in the three-dimensional environment. In some implementations, the threshold distance is a threshold distance for direct input (e.g., if the distance is less than the threshold, the electronic device optionally detects direct input).
In some implementations, such as in fig. 17B, upon detecting the second corresponding input (1830 c), in accordance with a determination that the first portion of the second movement meets the one or more criteria, the electronic device 101a displays (1830 d), via the display generating component, a second visual indication (e.g., 1711 a) in the three-dimensional environment at a location corresponding to the predefined portion of the user (e.g., 1715) during the second corresponding input. In some embodiments, the electronic device displays a visual indication at a location in the three-dimensional environment corresponding to the predefined portion of the user when (e.g., each time) the electronic device detects a respective first portion of movement that meets the one or more criteria.
The above-described manner of displaying the second visual indication in response to detecting the first portion of the second movement meeting the one or more criteria after updating the appearance of the first visual indication and ceasing to display the first visual indication provides an efficient manner of providing visual feedback to the user each time the electronic device detects a portion of movement that meets the one or more criteria, which simplifies interaction between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and extends battery life of the electronic device by enabling the user to more quickly and efficiently use the electronic device while reducing errors in use.
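For illustration only, the threshold distance that separates direct from indirect interaction, and the display of a visual indication whenever a qualifying first portion of movement is detected during an indirect input, can be sketched as follows. The 15 cm value is just one of the example thresholds listed above, and the function names are assumptions.

```swift
enum InteractionMode { case direct, indirect }

/// Chooses direct vs. indirect interaction based on how far the predefined portion of the
/// user is from the location corresponding to the user interface object (metres).
func interactionMode(handToObjectDistance: Float, threshold: Float = 0.15) -> InteractionMode {
    return handToObjectDistance < threshold ? .direct : .indirect
}

/// During an indirect input, a visual indication is shown at a location corresponding to the
/// predefined portion of the user each time the first portion of movement meets the criteria.
func shouldShowIndication(mode: InteractionMode, firstPortionMeetsCriteria: Bool) -> Bool {
    return mode == .indirect && firstPortionMeetsCriteria
}

// Example: a hand 40 cm away interacts indirectly, so a qualifying first portion of
// movement causes the indication to be displayed near the hand.
let mode = interactionMode(handToObjectDistance: 0.40)
print(mode, shouldShowIndication(mode: mode, firstPortionMeetsCriteria: true)) // indirect true
```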
In some implementations, such as in fig. 17C, the respective input corresponds to a scroll input (1832 a) directed to the user interface object (e.g., after detecting a first portion of movement that meets the one or more criteria, the electronic device detects further movement of a predefined portion of the user in a direction corresponding to a direction in which the user interface is scrollable). For example, in response to detecting an upward movement of a predefined portion of the user after detecting the first portion of movement, the electronic device vertically scrolls the user interface element.
In some implementations, such as in fig. 17C, the electronic device 101a scrolls (1832 b) the user interface object (e.g., 1703) according to the respective input while maintaining the display of the visual indication (e.g., 1709). In some implementations, the visual indication is a virtual touchpad and the electronic device scrolls the user interface object according to movement of the predefined portion of the user while the predefined portion of the user is in a physical position corresponding to a position of the virtual touchpad in the three-dimensional environment. In some embodiments, in response to lateral movement of the predefined portion of the user controlling the scrolling direction, the electronic device updates the location of the visual indication to continue to be displayed in proximity to the predefined portion of the user. In some embodiments, the electronic device maintains the positioning of the visual indication in the three-dimensional environment in response to lateral movement of a predefined portion of the user controlling the scrolling direction.
The above-described manner of maintaining a display of a visual indication upon detection of a scroll input provides an efficient way of providing feedback to a user of where a predefined portion of a user is positioned in order to provide a scroll input, which simplifies interactions between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and extends battery life of the electronic device by enabling the user to use the electronic device more quickly and efficiently while reducing errors in use.
In some embodiments, upon detection of a respective input (1834 a), such as the input shown in fig. 17C, after a first portion of movement of a predefined portion of the user (e.g., 1715) meets the one or more criteria, the electronic device detects (1834 b), via the one or more input devices, a second portion of movement of the predefined portion of the user (e.g., 1715) meeting one or more second criteria, including a criterion met when the second portion of movement corresponds to a distance between a location corresponding to the visual indication (e.g., 1711) and the predefined portion of the user (e.g., 1715). In some embodiments, the criterion is met when the second portion of the movement includes movement that is at least the distance between the predefined portion of the user and the location corresponding to the visual indication. For example, if the visual indication is displayed at a location corresponding to one centimeter from the predefined portion of the user, the criterion is met when the second portion of the movement includes movement of at least one centimeter toward the location corresponding to the visual indication.
In some embodiments, upon detecting a respective input (1834 a), such as one of the inputs in fig. 17C, the electronic device 101a generates (1834C) audio (and/or haptic) feedback indicative of meeting the one or more second criteria in response to detecting a second portion of the movement of the predefined portion (e.g., 1715) of the user. In some implementations, in response to detecting a second portion of movement of a predefined portion of the user that meets the one or more second criteria, the electronic device performs an action in accordance with a selection of a user interface object (e.g., the user interface object to which the input is directed).
The above-described manner of generating feedback indicating that the second portion of the movement meets the one or more second criteria provides an efficient way of confirming to the user that an input was detected, which simplifies interaction between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and extends battery life of the electronic device by enabling the user to use the electronic device more quickly and efficiently while reducing errors in use.
In some embodiments, such as in fig. 17B, upon displaying the user interface object (e.g., 1703), the electronic device 101a detects (1836 a) that one or more second criteria are met, including a criterion met when the predefined portion (e.g., 1713) of the user has a respective pose (e.g., position, orientation, shape (e.g., hand shape)) while the predefined portion (e.g., 1713) of the user is away from a location corresponding to the user interface object (e.g., 1703).
In some implementations, such as in fig. 17B, in response to detecting that the one or more second criteria are met, the electronic device 101a displays (1836 b), via the display generating component, a virtual surface (e.g., 1709 a) (e.g., a visual indication that appears as a touch pad) in a vicinity of a location corresponding to the predefined portion (e.g., 1713) of the user (e.g., within a threshold distance (e.g., 1, 3, 5, 10, etc. centimeters)) and remote from the user interface object (e.g., 1703). In some implementations, the visual indication is optionally square or rectangular in shape with square corners or rounded corners so as to look like a touch pad. In some implementations, the electronic device performs an action with respect to the remote user interface object in accordance with the input in response to detecting the predefined portion of the user at a location corresponding to the location of the virtual surface. For example, if the user taps a location corresponding to the virtual surface, the electronic device detects a selection input directed to the remote user interface object. As another example, if the user moves his hand laterally along the virtual surface, the electronic device detects a scroll input directed to the remote user interface object.
The above-described display of the virtual surface in response to the second criteria provides an efficient way of presenting visual guidance to the user to direct where predefined portions of the user are located to provide input to the electronic device, which simplifies interactions between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and extends battery life of the electronic device by enabling the user to use the electronic device more quickly and efficiently while reducing errors in use.
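For illustration only, the pose check and the placement of the touchpad-like virtual surface near the hand described above can be sketched as follows. The `HandPose` type, the distance values, and the assumption that the direction toward the object is supplied as a unit vector are all choices made for the example.

```swift
struct Point3 { var x: Float; var y: Float; var z: Float }

// Hypothetical hand pose summary derived from the one or more input devices.
struct HandPose {
    var position: Point3
    var isPointing: Bool          // e.g., index finger extended
    var palmFacingAway: Bool      // palm facing away from the user's torso
}

/// The one or more second criteria: a respective pose while the hand is away from the
/// location corresponding to the user interface object (distance values are assumptions).
func secondCriteriaMet(hand: HandPose, distanceToObject: Float, farThreshold: Float = 0.15) -> Bool {
    return hand.isPointing && hand.palmFacingAway && distanceToObject > farThreshold
}

/// Places the virtual surface a small offset in front of the hand, along the (unit) direction
/// from the hand toward the user interface object.
func virtualSurfaceLocation(hand: HandPose, towardObject: Point3, offset: Float = 0.03) -> Point3 {
    return Point3(x: hand.position.x + towardObject.x * offset,
                  y: hand.position.y + towardObject.y * offset,
                  z: hand.position.z + towardObject.z * offset)
}

// Example: a pointing hand 40 cm from the object gets a virtual surface 3 cm in front of it.
let hand = HandPose(position: Point3(x: 0, y: 1.0, z: -0.3), isPointing: true, palmFacingAway: true)
if secondCriteriaMet(hand: hand, distanceToObject: 0.4) {
    print(virtualSurfaceLocation(hand: hand, towardObject: Point3(x: 0, y: 0, z: -1)))
}
```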
In some implementations, upon displaying a virtual surface, such as the virtual surface (e.g., 1709) in fig. 17C, the electronic device 101a detects (1838 a), via the one or more input devices, a respective movement of the predefined portion (e.g., 1713) of the user toward a location corresponding to the virtual surface (e.g., 1709). In some implementations, in response to detecting the respective movement, the electronic device changes (1838 b) a visual appearance of the virtual surface, such as the virtual surface (e.g., 1709) in fig. 17C, according to the respective movement. In some embodiments, changing the visual appearance of the virtual surface includes changing a color of the virtual surface. In some embodiments, changing the visual appearance of the virtual surface includes displaying a simulated shadow of the user's hand on the virtual surface according to method 2000. In some embodiments, the color change of the virtual surface increases as the predefined portion of the user moves closer to the virtual surface and reverses as the predefined portion of the user moves farther from the virtual surface.
The above-described manner of changing the visual appearance of the virtual surface in response to movement of the predefined portion of the user toward the location corresponding to the virtual surface provides an efficient manner of indicating to the user that the virtual surface corresponds to user input provided by the predefined portion of the user, which simplifies interactions between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and extends battery life of the electronic device by enabling the user to use the electronic device more quickly and efficiently while reducing errors in use.
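For illustration only, the proximity-based change to the virtual surface's appearance described above can be sketched as an interpolation that increases as the hand approaches the surface and reverses as it retreats. The 3 cm start distance below is an assumption for the example.

```swift
/// Interpolates the virtual surface's highlight amount as the predefined portion of the user
/// approaches it: 0 when at or beyond `startDistance`, 1 when touching the surface.
func surfaceHighlight(handToSurfaceDistance: Float, startDistance: Float = 0.03) -> Float {
    let clamped = max(0, min(startDistance, handToSurfaceDistance))
    return 1 - (clamped / startDistance)   // increases as the hand gets closer, reverses as it retreats
}

// Example: the color change deepens as the hand closes in on the virtual surface.
print(surfaceHighlight(handToSurfaceDistance: 0.03))   // 0.0 (no change yet)
print(surfaceHighlight(handToSurfaceDistance: 0.015))  // 0.5
print(surfaceHighlight(handToSurfaceDistance: 0.0))    // 1.0 (fully changed)
```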
In some embodiments, such as in fig. 17C, upon display of the virtual surface (e.g., 1709), the electronic device 101a detects (1840 a), via the one or more input devices, a respective movement of the predefined portion (e.g., 1713) of the user toward a location corresponding to the virtual surface (e.g., 1709). In some implementations, such as in fig. 17C, in response to detecting the respective movement, the electronic device 101a changes (1840 b) the visual appearance of the user interface object (e.g., 1703) according to the respective movement. In some embodiments, the movement of the predefined portion of the user toward the location corresponding to the virtual surface includes moving the predefined portion of the user a distance that is at least the distance between the predefined portion of the user and the location corresponding to the virtual surface. In some embodiments, the electronic device initiates selection of the user interface object in response to movement of the predefined portion of the user. In some implementations, updating the visual appearance of the user interface object includes changing a color of the user interface object. In some embodiments, the color of the user interface object gradually changes as the predefined portion of the user moves closer to the virtual surface and gradually recovers as the predefined portion of the user moves away from the virtual surface. In some implementations, the rate or extent of the change in visual appearance is based on the speed of movement, the distance of movement, or the distance of the predefined portion of the user from the virtual touchpad. In some embodiments, changing the visual appearance of the user interface object includes moving the user interface object away from a predefined portion of the user in a three-dimensional environment.
The above-described manner of updating the visual appearance of the user interface object in response to detecting movement of the predefined portion of the user towards the location corresponding to the virtual surface provides an efficient manner of indicating to the user that input provided via the virtual surface will be directed to the user interface object, which simplifies interactions between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and prolongs battery life of the electronic device by enabling the user to use the electronic device more quickly and efficiently while reducing errors in use.
In some embodiments, such as in fig. 17B, displaying the virtual surface (e.g., 1709 a) near the location corresponding to the predefined portion (e.g., 1713) of the user includes displaying the virtual surface (e.g., 1709 a) at a respective distance from the location corresponding to the predefined portion (e.g., 1713) of the user that corresponds to an amount of movement (1842 a) of the predefined portion (e.g., 1713) of the user toward the location corresponding to the virtual surface (e.g., 1709 a) required to perform the operation with respect to the user interface object (e.g., 1703). For example, if one centimeter of movement is required to perform an operation with respect to a user interface object, the electronic device displays the virtual surface at a location one centimeter from a location corresponding to a predefined portion of the user. As another example, if two centimeters of movement are required to perform an operation with respect to a user interface object, the electronic device displays the virtual surface at a location two centimeters from a location corresponding to a predefined portion of the user.
The above-described manner of displaying the virtual surface at a location for indicating an amount of movement of a predefined portion of the user required to perform an operation with respect to the user interface object provides an efficient way of indicating to the user how to interact with the user interface with the predefined portion of the user, which simplifies interactions between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and prolongs battery life of the electronic device by enabling the user to use the electronic device more quickly and efficiently while reducing errors in use.
In some embodiments, such as in fig. 17B, when displaying the virtual surface (e.g., 1709 a), the electronic device 101a displays (1844 a) a visual indication (e.g., 1710 a) of a distance between a predefined portion (e.g., 1713) of the user and a location corresponding to the virtual surface (e.g., 1709 a) on the virtual surface (e.g., 1709 a). In some embodiments, the visual indication is a simulated shadow of the predefined portion of the user on the virtual surface, such as in method 2000. In some implementations, the electronic device performs an operation with respect to the user interface object in response to detecting movement of the predefined portion of the user to a location corresponding to the virtual surface.
The above-described manner of displaying a visual indication of a distance between a predefined portion of a user and a location corresponding to a virtual surface provides an efficient way of indicating to the user a distance between the predefined portion of the user and the location corresponding to the virtual surface, which simplifies interactions between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient (e.g., by showing to the user how much movement of the predefined portion of the user is required to perform an operation with respect to the user interface object), which also reduces power usage and prolongs battery life of the electronic device by enabling the user to use the electronic device more quickly and efficiently while reducing errors in use.
In some implementations, upon displaying a virtual surface, such as the virtual surface (e.g., 1709 a) in fig. 17B, the electronic device 101a detects (1846 a), via the one or more input devices, movement of the predefined portion (e.g., 1713) of the user to a respective location (e.g., in any direction) that is greater than a threshold distance (e.g., 3, 5, 10, 15, etc. centimeters) from a location corresponding to the virtual surface (e.g., 1709 a).
In some implementations, in response to detecting movement of the predefined portion (e.g., 1713) of the user to the corresponding location, the electronic device stops (1846B) displaying a virtual surface in the three-dimensional environment, such as the virtual surface (e.g., 1709 a) in fig. 17B. In some implementations, the electronic device also stops displaying the virtual surface in accordance with determining that the pose of the predefined portion of the user does not meet one or more criteria. For example, the electronic device displays the virtual surface when the user's hand is in a pointing hand shape and/or positioned with the palm facing away from the user's torso (or toward a position corresponding to the virtual surface), and in response to detecting that the pose of the user's hand no longer meets the criteria, the electronic device ceases to display the virtual surface.
The above-described manner of ceasing to display the virtual surface in response to detecting that the predefined portion of the user moves a threshold distance away from the location corresponding to the virtual surface provides an efficient manner of reducing visual confusion caused by displaying the virtual surface when the user is unlikely to interact with the virtual surface (because the predefined portion of the user is greater than the threshold distance from the location corresponding to the virtual surface), which simplifies interactions between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and extends battery life of the electronic device by enabling the user to use the electronic device more quickly and efficiently while reducing errors in use.
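For illustration only, the dismissal behavior described above (hiding the virtual surface when the hand moves beyond a threshold distance from it, or when the pose no longer qualifies) can be sketched as a single predicate. The 10 cm value is one of the example thresholds listed above; the `HandState` type is an assumption.

```swift
struct HandState {
    var distanceFromSurface: Float   // metres from the location corresponding to the virtual surface
    var poseMeetsCriteria: Bool      // e.g., pointing hand shape with the palm facing the surface
}

/// The virtual surface stops being displayed when the hand moves farther than a threshold
/// from the location corresponding to the surface, or when its pose no longer qualifies.
func shouldKeepVirtualSurface(_ hand: HandState, dismissThreshold: Float = 0.10) -> Bool {
    return hand.poseMeetsCriteria && hand.distanceFromSurface <= dismissThreshold
}

// Example: dropping the pointing pose, or pulling the hand 15 cm away, dismisses the surface.
print(shouldKeepVirtualSurface(HandState(distanceFromSurface: 0.05, poseMeetsCriteria: true)))  // true
print(shouldKeepVirtualSurface(HandState(distanceFromSurface: 0.15, poseMeetsCriteria: true)))  // false
```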
In some embodiments, such as in fig. 17B, displaying the virtual surface in proximity to the predefined portion of the user (e.g., 1713) includes (1848 a) in accordance with a determination that the predefined portion of the user (e.g., 1713) is in a first respective location when the one or more second criteria are met (e.g., pose (e.g., hand shape, location, orientation) of the predefined portion of the user meets the one or more criteria, gaze of the user is directed toward the user interface object), displaying the virtual surface (e.g., 1709 a) (1848B) in a third location in the three-dimensional environment that corresponds to the first respective location of the predefined portion of the user (e.g., 1713) (e.g., displaying the virtual surface at the predefined location relative to the predefined portion of the user). For example, the electronic device displays the virtual surface at a threshold distance (e.g., 1, 2, 3, 5, 10, etc. centimeters) from a location corresponding to the predefined portion of the user.
In some implementations, such as in fig. 17B, displaying the virtual surface (e.g., 1709B) proximate to the predefined portion (e.g., 1714) of the user includes (1848 a) in accordance with a determination that the predefined portion (e.g., 1714) of the user is at a second corresponding location different from the first corresponding location when the one or more second criteria are met, displaying the virtual surface (e.g., 1709B) in the three-dimensional environment at a fourth location different from the third location corresponding to the second corresponding location of the predefined portion (e.g., 1714) (1848 c). In some embodiments, the location at which the virtual surface is displayed depends on the location of the predefined portion of the user when the one or more second criteria are met, such that the virtual surface is displayed in a predefined location relative to the predefined portion of the user regardless of the location of the predefined portion of the user when the one or more second criteria are met.
The above-described manner of displaying virtual surfaces at different locations according to the location of the user's predefined portion provides an efficient way of easily displaying virtual surfaces at locations where the user interacts with virtual surfaces using the user's predefined portion, which simplifies interactions between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and prolongs battery life of the electronic device by enabling the user to more quickly and efficiently use the electronic device while reducing errors in use.
In some embodiments, such as in fig. 17E, upon displaying a visual indication corresponding to a predefined portion (e.g., 1721) of the user (1850 a), the electronic device 101a detects (1850 b), via the one or more input devices, a second corresponding input comprising movement of a second predefined portion (e.g., 1723) of the user (e.g., a second hand of the user), wherein during the second corresponding input, a location of the second predefined portion (e.g., 1723) of the user is distant from a location corresponding to the user interface object (e.g., 1717) (e.g., at least a threshold distance (e.g., 3, 5, 10, 15, 30, etc. centimeters) from the location corresponding to the user interface object).
In some embodiments, such as in fig. 17E, upon displaying the visual indication corresponding to the predefined portion (e.g., 1721) of the user (1850 a), upon detecting the second corresponding input (1850 c), in accordance with a determination that a first portion of the movement of the second predefined portion (e.g., 1723) of the user meets the one or more criteria, the electronic device concurrently displays, via the display generation component (1850 d): a visual indication (e.g., 1719 a) (1850 e) corresponding to the predefined portion (e.g., 1721) of the user (e.g., displayed near the predefined portion of the user); and a visual indication (e.g., 1719 b) (1850 f) at a location corresponding to the second predefined portion (e.g., 1723) of the user in the three-dimensional environment (e.g., displayed near the second predefined portion of the user). In some embodiments, in response to detecting movement of the second predefined portion of the user without detecting movement of the first predefined portion of the user, the electronic device updates the location of the visual indication at the location corresponding to the second predefined portion of the user without updating the location of the visual indication corresponding to the predefined portion of the user. In some embodiments, in response to detecting movement of the predefined portion of the user without detecting movement of the second predefined portion of the user, the electronic device updates the location of the visual indication corresponding to the predefined portion of the user without updating the location of the visual indication at the location corresponding to the second predefined portion of the user.
The above-described manner of displaying the visual indication at a location corresponding to the second predefined portion of the user provides an efficient way of independently displaying the visual indication for both predefined portions of the user, which simplifies interaction between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and extends battery life of the electronic device by enabling the user to use the electronic device more quickly and efficiently while reducing errors in use.
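For illustration only, the per-hand visual indications described above can be sketched as a small tracker that stores one indication per predefined portion of the user, so that moving one hand updates only that hand's indication. The types below are assumptions for the example.

```swift
// Hypothetical identifiers and positions for the two predefined portions (hands) of the user.
enum Hand: Hashable { case left, right }
struct Position { var x: Float; var y: Float; var z: Float }

/// Tracks one visual indication per hand; each indication follows only its own hand.
struct IndicationTracker {
    private var indications: [Hand: Position] = [:]

    mutating func firstPortionMet(for hand: Hand, at position: Position) {
        indications[hand] = position            // display an indication near this hand
    }
    mutating func handMoved(_ hand: Hand, to position: Position) {
        if indications[hand] != nil {
            indications[hand] = position        // only this hand's indication moves
        }
    }
    func indication(for hand: Hand) -> Position? { indications[hand] }
}

// Example: both hands qualify, then only the right hand moves; the left indication stays put.
var tracker = IndicationTracker()
tracker.firstPortionMet(for: .left, at: Position(x: -0.2, y: 1.0, z: -0.4))
tracker.firstPortionMet(for: .right, at: Position(x: 0.2, y: 1.0, z: -0.4))
tracker.handMoved(.right, to: Position(x: 0.25, y: 1.05, z: -0.4))
print(tracker.indication(for: .left) as Any, tracker.indication(for: .right) as Any)
```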
In some implementations, upon detecting a respective input (such as the input in fig. 17B) (e.g., and in accordance with a determination that a first portion of the movement of the predefined portion of the user meets the one or more criteria), the electronic device 101a displays (1852 a) a respective visual indication (e.g., a shadow of the user's hand in accordance with method 2000; a cursor; a shadow of the cursor in accordance with method 2000, etc.) on the user interface object (e.g., 1703, 1705) indicating that the predefined portion of the user (e.g., 1713, 1714, 1715, 1716) needs to be moved a respective distance toward a location corresponding to the user interface object (e.g., 1703, 1705) in order to interact with the user interface object (e.g., 1703, 1705). In some implementations, the size and/or positioning of the respective visual indication (e.g., a shadow of a user's hand or a shadow of a cursor) is updated as the additional distance that the predefined portion of the user must move to interact with the user interface object changes. For example, once the user moves the predefined portion of the user by the amount required to interact with the user interface object, the electronic device stops displaying the respective visual indication.
The above-described manner of presenting respective visual indications of amounts of movement of a predefined portion of a user that are required to interact with a user interface object provides an efficient manner of providing feedback to the user when the user provides input with the predefined portion of the user, which simplifies interactions between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and prolongs battery life of the electronic device by enabling the user to more quickly and efficiently use the electronic device while reducing errors in use.
In some implementations, upon displaying a user interface object, such as the user interface object (e.g., 1703, 1705) in fig. 17A, the electronic device 101a detects (1854 a) that the user's gaze (e.g., 1701a, 1701 b) is directed to the user interface object (e.g., 1703, 1705). In some implementations, in response to detecting that the user's gaze (e.g., 1701a, 1701 b) is directed to a user interface object, such as the user interface object (e.g., 1703, 1705) in fig. 17A (e.g., optionally based on one or more disambiguation techniques according to the method 1200), the electronic device 101a displays (1854 b) the user interface object (e.g., 1703, 1705) with a respective visual characteristic (e.g., size, color, location) having a first value. In some implementations, in accordance with a determination that the user's gaze is not directed to the user interface object (e.g., optionally based on one or more disambiguation techniques in accordance with method 1200), the electronic device displays the user interface object with a respective visual characteristic having a second value different from the first value. In some embodiments, in response to detecting a user's gaze on a user interface object, the electronic device directs input provided by a predetermined portion of the user to the user interface object, such as described with reference to indirect interactions with the user interface object in methods 800, 1000, 1200, 1400, 1600, and/or 2000. In some implementations, in response to detecting that the user's gaze is directed toward the second user interface object, the electronic device displays the second user interface object with the respective visual characteristic having the first value.
The above-described manner of updating the values of the respective visual characteristics of the user interface object in accordance with the user's gaze provides an efficient way of indicating to the user that the system is able to direct the pointing direction of the input based on the user's gaze, which simplifies the interaction between the user and the electronic device, enhances the operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and prolongs battery life of the electronic device by enabling the user to use the electronic device more quickly and efficiently while reducing errors in use.
In some embodiments, such as in fig. 17A, the three-dimensional environment includes a representation (e.g., 1704) of a corresponding object in the physical environment of the electronic device (1856 a). In some embodiments, the representation is a photorealistic representation (e.g., a passthrough video) of the corresponding object displayed by the display generation component. In some embodiments, the representation is a view of the respective object through the transparent portion of the display generating component.
In some implementations, the electronic device 101a detects (1856 b) that one or more second criteria are satisfied, including criteria that are satisfied when the user's gaze is directed to a representation (e.g., 1704) of the respective object, and criteria that are satisfied when the predefined portion (e.g., 1713) of the user is in the respective pose (e.g., positioning, orientation, pose, hand shape). For example, the electronic device 101a displays a representation of the speaker in a manner similar to the electronic device 101a displaying the representation 1704 of the table in fig. 17B, and detects that the hand (e.g., 1713, 1714, 1715, or 1716 in fig. 17B) is in a corresponding pose when the user's gaze is directed to the representation of the speaker. For example, the respective pose includes the user's hand being within a predefined region of the three-dimensional environment when the user's hand is in a respective shape (e.g., pointing or pinching or pre-pinching the hand shape), with the palm of the hand facing away from the user and/or toward the respective object. In some embodiments, the one or more second criteria further include a criterion that is met when the respective object is interactive. In some embodiments, the one or more second criteria further comprise a criterion that is met when the object is a virtual object. In some embodiments, the one or more second criteria further include a criterion that is met when the object is a real object in a physical environment of the electronic device.
In some embodiments, in response to detecting that the one or more second criteria are met, the electronic device displays (1856 c) one or more selectable options in proximity to the representation (e.g., 1704) of the respective object via the display generation component, wherein the one or more selectable options are selectable to perform respective operations associated with the respective object (e.g., to control operations of the respective object). For example, in response to detecting that a hand (e.g., 1713, 1714, 1715, or 1716 in fig. 17B) is in a respective pose when a user's gaze is directed to the electronic device 101a in a representation of a speaker that is displayed by the electronic device 101a in a manner similar to the representation 1704 of the table in fig. 17B, the electronic device displays one or more selectable options that are selectable to perform respective operations associated with the speaker (e.g., play, pause, fast forward, reverse, or change a playback volume of content played on the speaker). For example, the respective object is a speaker or speaker system, and the options include an option to play or pause playback on the speaker or speaker system, an option to skip forward or backward in the content or content list. In this example, the electronic device communicates with the respective object (e.g., via a wired or wireless network connection) and is capable of transmitting an indication to the respective object to cause it to perform operations in accordance with user interaction with the one or more selectable options.
The above-described manner of presenting selectable options that may be selected to perform respective operations associated with respective objects in response to detecting that a user's gaze is on the respective objects provides an efficient manner of interacting with the respective objects using the electronic device, which simplifies interactions between the user and the electronic device and enhances operability of the electronic device and makes the user-device interface more efficient, which additionally reduces power usage and extends battery life of the electronic device by enabling the user to more quickly and efficiently use the electronic device while reducing errors in use.
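For illustration only, the behavior described above (surfacing selectable options near the representation of a physical object when the gaze is on the object and the hand is in the respective pose) can be sketched as follows. The speaker controls listed below echo the example above; the types and function names are assumptions.

```swift
// Hypothetical description of a represented physical object and the selectable options
// associated with it.
struct PhysicalObject {
    var name: String
    var isControllable: Bool        // e.g., a speaker the device can communicate with
    var controls: [String]          // selectable options associated with the object
}

/// Returns the selectable options to display near the object's representation when the
/// gaze is on the object and the hand is in the respective pose.
func optionsToShow(for object: PhysicalObject, gazeOnObject: Bool, handInPose: Bool) -> [String] {
    guard gazeOnObject, handInPose, object.isControllable else { return [] }
    return object.controls
}

// Example: looking at a connected speaker while the hand is in the respective pose
// surfaces playback controls next to its representation.
let speaker = PhysicalObject(name: "speaker", isControllable: true,
                             controls: ["play/pause", "skip forward", "skip backward", "volume"])
print(optionsToShow(for: speaker, gazeOnObject: true, handInPose: true))
// ["play/pause", "skip forward", "skip backward", "volume"]
```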
In some embodiments, after a first portion of the movement of the predefined portion of the user (e.g., 1713) meets the one or more criteria, and while a visual indication (e.g., 1709 a) corresponding to the predefined portion of the user is displayed, the electronic device detects (1858 a) via the one or more input devices a second portion of the movement of the predefined portion of the user (e.g., 1713) that meets one or more second criteria (e.g., criteria of speed of movement, distance, duration, etc.), such as in fig. 17B.
In some embodiments, such as in fig. 17B, in response to detecting the second portion (1858B) of the movement of the predefined portion (e.g., 1713) of the user, in accordance with a determination that the gaze (e.g., 1701 a) of the user is directed to the user interface object (e.g., 1703) and the user interface object is interactive (1858 c) (e.g., the electronic device performs an operation in accordance with the user interface object in response to user input directed to the user interface object), the electronic device 101a displays (1858 d), via the display generating component, a visual indication (e.g., 1709 a) indicating that the second portion of the movement of the predefined portion (e.g., 1713) of the user meets the one or more second criteria. In some embodiments, the visual indication indicating that the second portion of the movement meets the second criterion is displayed at or near a location of the visual indication at a location corresponding to the predefined portion of the user. In some implementations, the visual indication that the second portion of the movement of the predefined portion of the user meets the one or more second criteria is an updated version (e.g., a different size, color, translucence, etc.) of the visual indication at a location corresponding to the predefined portion of the user. For example, the electronic device expands the visual indication in response to detecting movement of a predefined portion of the user that causes selection of the user interface object.
In some embodiments, such as in fig. 17C, in response to detecting the second portion (1858 b) of the movement of the predefined portion (e.g., 1713) of the user, in accordance with a determination that the gaze (e.g., 1701 a) of the user is directed to the user interface object (e.g., 1703) and the user interface object (e.g., 1703) is interactive (1858C) (e.g., the electronic device performs an operation in accordance with the user interface object in response to user input directed to the user interface object), the electronic device 101a performs (1858 e) an operation (e.g., selects the user interface object, scrolls the user interface object, moves the user interface object, navigates to a user interface associated with the user interface object, initiates playback of content associated with the user interface object, or performs another operation in accordance with the user interface object) corresponding to the respective input.
In some embodiments, in response to detecting the second portion (1858 b) of the movement of the predefined portion (e.g., 1713) of the user, in accordance with a determination that the gaze of the user is not directed to the interactive user interface object (e.g., 1703) (1858 f), the electronic device displays (1858 g) via the display generation component a visual indication (e.g., 1709) that indicates that the second portion of the movement of the predefined portion of the user meets the one or more second criteria without performing an operation in accordance with the respective input. For example, in response to detecting a second portion of the movement performed by the hands 1713, 1714, 1715, and/or 1716 when the user's gaze 1701a or 1701B is not directed to either of the user interface elements 1703 or 1705 in fig. 17B, the electronic device displays a virtual surface 1709a or 1709B or an indication 1710c or 1710d, respectively, in accordance with the movement of the hands 1713, 1714, 1715, and/or 1716. In some embodiments, the visual indication indicating that the second portion of the movement meets the second criterion is displayed at or near a location of the visual indication at a location corresponding to the predefined portion of the user. In some implementations, the visual indication that the second portion of the movement of the predefined portion of the user meets the one or more second criteria is an updated version (e.g., a different size, color, translucence, etc.) of the visual indication at a location corresponding to the predefined portion of the user. In some embodiments, the electronic device presents the same indication as an indication that the second portion of the movement of the predefined portion of the user meets the one or more second criteria regardless of whether the user's gaze is directed to the interactive user interface object. For example, if the user interface object is interactive, the electronic device expands the visual indication in response to detecting movement of a predefined portion of the user that will cause selection of the user interface object.
The above-described manner of presenting an indication whether or not a user's gaze is directed at an interactive user interface element provides an efficient way of indicating to a user that an input provided with a predefined portion of the user is detected, which simplifies interactions between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and extends battery life of the electronic device by enabling the user to use the electronic device more quickly and efficiently while reducing errors in use.
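For illustration only, the branch described above can be sketched as follows: once the second portion of movement meets the second criteria, the visual indication is updated either way, and an operation is additionally performed only when the gaze is directed to an interactive user interface object. The enum and function below are assumptions for the example.

```swift
enum Response {
    case indicationAndOperation    // gaze on an interactive object: show the indication and perform the input
    case indicationOnly            // otherwise: show the indication but perform no operation
}

/// Once the second portion of movement meets the one or more second criteria, the feedback is
/// the same either way; whether an operation is also performed depends on the gaze target.
func respond(secondPortionMeetsCriteria: Bool,
             gazeOnObject: Bool,
             objectIsInteractive: Bool) -> Response? {
    guard secondPortionMeetsCriteria else { return nil }
    return (gazeOnObject && objectIsInteractive) ? .indicationAndOperation : .indicationOnly
}

// Example: the same hand movement selects the object only when the gaze is on an
// interactive object; otherwise only the visual indication is updated.
print(respond(secondPortionMeetsCriteria: true, gazeOnObject: true,  objectIsInteractive: true) as Any)
print(respond(secondPortionMeetsCriteria: true, gazeOnObject: false, objectIsInteractive: true) as Any)
```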
Figs. 19A-19D illustrate examples of how an electronic device uses visual indications of such interactions to enhance interactions with user interface elements in a three-dimensional environment, according to some embodiments.
Fig. 19A shows the electronic device 101 displaying a three-dimensional environment 1901 in a user interface via the display generation component 120. It should be appreciated that in some embodiments, the electronic device 101 utilizes one or more of the techniques described with reference to fig. 19A-19D in a two-dimensional environment or user interface without departing from the scope of the present disclosure. As described above with reference to fig. 1-6, the electronic device 101 optionally includes a display generation component 120 (e.g., a touch screen) and a plurality of image sensors 314. The image sensors optionally include one or more of the following: a visible light camera; an infrared camera; a depth sensor; or any other sensor that the electronic device 101 can use to capture one or more images of a user or a portion of a user when the user interacts with the electronic device 101. In some embodiments, the display generation component 120 is a touch screen capable of detecting gestures and movements of a user's hand. In some embodiments, the user interfaces shown below may also be implemented on a head-mounted display that includes a display generation component that displays the user interface to the user, as well as sensors that detect the physical environment and/or movement of the user's hands (e.g., external sensors facing outward from the user) and/or sensors that detect the user's gaze (e.g., internal sensors facing inward toward the user).
As shown in fig. 19A, three-dimensional environment 1901 includes three user interface objects 1903a, 1903b, and 1903c that are interactable (e.g., via user input provided by the hands 1913a, 1913b, and/or 1913c of a user of device 101). Hands 1913a, 1913b, and/or 1913c are optionally the hands of a user detected simultaneously by device 101 or alternatively detected by device 101 such that the responses of device 101 to the inputs described herein from these hands optionally occur simultaneously or alternatively and/or sequentially. The device 101 optionally directs such input to the user interface objects 1903a, 1903b, and/or 1903c based on various characteristics of direct or indirect input provided by the hands 1913a, 1913b, and/or 1913c (e.g., as described with reference to methods 800, 1000, 1200, 1400, 1600, 1800, and/or 2000). In fig. 19A, three-dimensional environment 1901 also includes representation 604 of a table in the physical environment of electronic device 101 (e.g., such as described with reference to fig. 6B). In some implementations, the representation 604 of the table is a photorealistic video image (e.g., video or digital passthrough) of the table displayed by the display generation component 120. In some embodiments, the representation 604 of the table is a view (e.g., a real or physical perspective) of the table through the transparent portion of the display generating component 120.
In fig. 19A-19D, hands 1913a and 1913b interact indirectly with user interface object 1903a (e.g., as described with reference to methods 800, 1000, 1200, 1400, 1600, 1800, and/or 2000), and hand 1913c interacts directly with user interface object 1903b (e.g., as described with reference to methods 800, 1000, 1200, 1400, 1600, 1800, and/or 2000). In some implementations, the user interface object 1903b is a user interface object that itself responds to input. In some implementations, the user interface object 1903b is a virtual touchpad-type user interface object, input directed to which causes the device 101 to direct corresponding input to a user interface object 1903c that is remote from the user interface object 1903b (e.g., as described with reference to method 1800).
In some embodiments, in response to detecting that the user's hand is in an indirect ready state hand shape and at an indirect interaction distance from the user interface object, the device 101 displays a cursor that is away from the user's hand at a predetermined distance from the user interface object to which the user's gaze is directed. For example, in fig. 19A, device 101 detects that hand 1913a is in an indirect ready state hand shape (e.g., as described with reference to method 800) at an indirect interaction distance from user interface object 1903a (e.g., as described with reference to method 800), and optionally detects that the user's gaze is directed toward user interface object 1903a. In response, the device 101 displays a cursor 1940a at a predetermined distance (e.g., 0.1, 0.5, 1, 2, 5, 10cm in front of the user interface object) from the user interface object 1903a and away from the hand 1913a and/or a finger (e.g., index finger) on the hand 1913 a. The position of the cursor 1940a is optionally controlled by the position of the hand 1913a such that if the hand 1913a and/or a finger (e.g., index finger) on the hand 1913a moves laterally, the device 101 moves the cursor 1940a laterally, and if the hand 1913a and/or finger (e.g., index finger) moves toward or away from the user interface object 1903a, the device 101 moves the cursor 1940a toward or away from the user interface object 1903a. The cursor 1940a is optionally a visual indication corresponding to the location of the hand 1913a and/or a corresponding finger on the hand 1913 a. When the device 101 detects that the hand 1913a and/or the corresponding finger on the hand 1913a is sufficiently moved toward the user interface object 1903a such that the cursor 1940a touches down on the user interface object 1903a according to such movement, the hand 1913a optionally interacts with the user interface object 1903a (e.g., selects, scrolls through the user interface object, etc.).
As shown in fig. 19A, device 101 also displays a simulated shadow 1942a on user interface object 1903a that corresponds to cursor 1940a and has a shape that is based on the shape of cursor 1940a as if it were projected by cursor 1940a on user interface object 1903 a. The size, shape, color, and/or position of the simulated shadow 1942a is optionally updated as the cursor 1940a moves relative to the user interface object 1903a (corresponding to movement of the hand 1913 a). The simulated shadow 1942a thus provides a visual indication of the amount by which the hand 1913a is required to interact (e.g., select, scroll, etc.) with the user interface object 1903a by the hand 1913a toward the user interface object 1903a, which optionally occurs when the cursor 1940a touches down on the user interface object 1903 a. The simulated shadow 1942a additionally or alternatively provides a visual indication of the type of interaction (e.g., indirect) between the hand 1913a and the user interface object 1903a, as the size, color, and/or shape of the simulated shadow 1942a is optionally based on the size and/or shape of the cursor 1940a, which is optionally displayed by the device 101 for indirect interaction rather than direct interaction, which will be described later.
In some implementations, the user interface object 1903a is a user interface object that can be interacted with via both hands (e.g., hands 1913a and 1913 b) simultaneously. For example, user interface object 1903a is optionally a virtual keyboard whose keys can be selected via hand 1913a and/or hand 1913 b. The hand 1913b optionally interacts indirectly with the user interface object 1903a (e.g., similar to that described with respect to the hand 1913 a). Thus, the device 101 displays a cursor 1940b corresponding to the hand 1913b and a simulated shadow 1942b corresponding to the cursor 1940 b. The cursor 1940b and simulated shadow 1942b optionally have one or more of the characteristics of the cursor 1940a and simulated shadow 1942a, similarly applicable in the context of the hand 1913 b. In embodiments where device 101 detects simultaneous interaction of hands 1913a and 1913b with user interface object 1903a indirectly, device 101 optionally simultaneously displays cursors 1940a and 1940b (controlled by hands 1913a and 1913b, respectively) and simulated shadows 1942a and 1942b (corresponding to cursors 1940a and 1940b, respectively). In fig. 19A, cursor 1940a is optionally farther from user interface object 1903a than cursor 1940 b; thus, the device 101 displays the cursor 1940a as being larger than the cursor 1940b, and accordingly displays the simulated shadow 1942a as being larger than the simulated shadow 1942b and laterally offset from the cursor 1940a more than the simulated shadow 1942b is relative to the cursor 1940 b. In some embodiments, the dimensions of cursors 1940a and 1940b in three-dimensional environment 1901 are the same. After the cursors 1940a and 1940b, respectively, are displayed by the device 101, the cursor 1940a is optionally farther from the user interface object 1903a than the cursor 1940b because the hand 1913a (corresponding to the cursor 1940 a) is optionally moved toward the user interface object 1903a by a smaller amount than the hand 1913b (corresponding to the cursor 1940 b) is moved toward the user interface object 1903 a.
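For illustration only, the relationship described above between the cursor's remaining distance to the user interface object and the size and offset of its simulated shadow can be sketched as follows. The specific scale factors are assumptions chosen for the example, not values from the embodiments.

```swift
/// Visual parameters for an indirect-interaction cursor and its simulated shadow, derived
/// from how far the cursor still is from the user interface object (metres).
struct CursorVisuals {
    var cursorScale: Float      // cursor appears larger when farther from the object
    var shadowScale: Float      // shadow shrinks as the cursor approaches the object
    var shadowOffset: Float     // lateral offset of the shadow from the cursor
    var showShadow: Bool        // hidden once the cursor touches down on the object
}

func cursorVisuals(distanceToObject: Float) -> CursorVisuals {
    let d = max(0, distanceToObject)
    return CursorVisuals(cursorScale: 1 + d * 5,
                         shadowScale: 1 + d * 5,
                         shadowOffset: d * 0.5,
                         showShadow: d > 0)     // touch-down removes the shadow
}

// Example: a cursor farther from the object (like 1940a) is drawn larger, with a larger and
// more offset shadow, than a cursor closer to the object (like 1940b).
print(cursorVisuals(distanceToObject: 0.04))  // farther: larger cursor and shadow
print(cursorVisuals(distanceToObject: 0.01))  // closer: smaller cursor and shadow
print(cursorVisuals(distanceToObject: 0.0))   // touched down: shadow no longer shown
```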
In fig. 19B, device 101 has detected movement of hands 1913a and 1913b (and/or corresponding fingers on hands 1913a and 1913b) toward user interface object 1903a. The hand 1913a optionally moves toward the user interface object 1903a an amount less than the amount required for the hand 1913a to indirectly interact with the user interface object 1903a (e.g., less than the amount required for the cursor 1940a to touch down on the user interface object 1903a). In response to movement of the hand 1913a, the device 101 optionally moves the cursor toward the user interface object 1903a in the three-dimensional environment 1901, thereby displaying the cursor 1940a in a smaller size than before, displaying the shadow 1942a in a smaller size than before, reducing the lateral offset between the shadow 1942a and the cursor 1940a, and/or displaying the shadow 1942a with a visual characteristic having a different value (e.g., darker) than before. Thus, the device 101 has updated the display of the shadow 1942a to reflect the interaction of the hand 1913a with the user interface object 1903a such that the shadow 1942a continues to indicate the remaining movement of the hand 1913a toward the user interface object 1903a that is required for the hand 1913a to interact with the user interface object 1903a (e.g., via one or more of the interactions previously described, such as selecting the user interface object, etc.).
In fig. 19B, hand 1913B optionally moves toward user interface object 1903a an amount equal to or greater than the amount required for hand 1913B to interact with user interface object 1903a (e.g., an amount equal to or greater than the amount required for cursor 1940B to reach down to user interface object 1903 a). In response to movement of the hand 1913b, the device 101 optionally moves the cursor toward the user interface object 1903a in the three-dimensional environment 1901 and displays the cursor 1940b as being touching the user interface object 1903a downward, thereby displaying the cursor 1940b in a smaller size than before and/or ceasing to display the shadow 1942b. In response to movement of the hand 1913B and/or downward touching of the user interface object 1903a by the cursor 1940B, the device 101 optionally detects and directs corresponding input (e.g., selection input, scroll input, tap input, press-hold-lift-off input, etc., as described with reference to methods 800, 1000, 1200, 1400, 1600, 1800, and/or 2000) from the hand 1913B to the user interface object 1903a, as indicated by the check marks alongside the cursor 1940B in fig. 19B.
In fig. 19C, the device 101 detects that the hand 1913a is moving laterally relative to the position of the hand 1913a in fig. 19B (e.g., while the hand 1913b remains in a position/state in which the cursor 1940b remains touched down on the user interface object 1903a). In response, the device 101 moves the cursor 1940a and the shadow 1942a laterally relative to the user interface object 1903a, as shown in fig. 19C. In some embodiments, if the movement of hand 1913a does not include movement toward or away from user interface object 1903a, but only includes movement that is lateral with respect to user interface object 1903a, the display of cursor 1940a and shadow 1942a (except for lateral position) remains unchanged from fig. 19B through fig. 19C. In some embodiments, if the movement of hand 1913a does not include movement toward or away from user interface object 1903a, but only includes movement that is lateral with respect to user interface object 1903a, device 101 maintains the display of cursor 1940a (except for lateral position), but changes the display of shadow 1942a based on the content or other characteristics of user interface object 1903a at the new position of shadow 1942a.
In fig. 19D, the device 101 detects that the hand 1913a is moving toward the user interface object 1903a by an amount equal to or greater than that required for the hand 1913a to interact with the user interface object 1903a (e.g., an amount equal to or greater than that required for the cursor 1940a to touch down on the user interface object 1903a). In some implementations, movement of the hand 1913a is detected while the hand 1913b remains in a position/state in which the cursor 1940b remains touched down on the user interface object 1903a. In response to movement of the hand 1913a, the device 101 optionally moves the cursor 1940a toward the user interface object 1903a in the three-dimensional environment 1901 and displays the cursor 1940a as touching down on the user interface object 1903a, thereby displaying the cursor 1940a in a smaller size than before and/or ceasing to display the shadow 1942a. In response to movement of the hand 1913a and/or the touching down of the cursor 1940a on the user interface object 1903a, the device 101 optionally identifies a corresponding input (e.g., a selection input, a scroll input, a tap input, a press-hold-lift-off input, etc., as described with reference to methods 800, 1000, 1200, 1400, 1600, 1800, and/or 2000) from the hand 1913a and directs it to the user interface object 1903a, as indicated by the check mark alongside the cursor 1940a in fig. 19D. In some embodiments, device 101 detects inputs from hands 1913a and 1913b that are directed to user interface object 1903a simultaneously (as indicated by the concurrent check marks alongside cursors 1940a and 1940b in fig. 19D) or sequentially.
In some implementations, in response to lateral movement of the hand 1913a and/or 1913b while the cursor 1940a and/or 1940b is touched down on the user interface object 1903a, the device 101 directs a movement-based input (e.g., a scroll input) to the user interface object 1903a while laterally moving the cursor 1940a and/or 1940b, which remains touched down on the user interface object 1903a, in accordance with the lateral movement of the hand 1913a and/or 1913b (e.g., without redisplaying shadows 1942a and/or 1942b). In some implementations, in response to movement of the hands 1913a and/or 1913b away from the user interface object 1903a while the cursor 1940a and/or 1940b is touched down on the user interface object 1903a, the device 101 identifies an end of a corresponding input (e.g., simultaneously or sequentially identifying the end of one or more of a tap input, a long press input, a scroll input, etc.) directed to the user interface object 1903a and/or moves the cursor 1940a and/or 1940b away from the user interface object 1903a in accordance with the movement of the hands 1913a and/or 1913b. As the device 101 moves the cursor 1940a and/or 1940b away from the user interface object 1903a in accordance with the movement of the hand 1913a and/or 1913b, the device 101 optionally redisplays the shadow 1942a and/or 1942b with one or more of the characteristics previously described.
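The post-touch-down handling described in this paragraph amounts to a small dispatch over the kind of hand movement detected while a cursor remains touched down. The following sketch illustrates that dispatch under assumed event and action names; it is not the implementation of the referenced methods.

```swift
// Illustrative sketch: behavior while a cursor remains touched down on an object.
enum HandMovement {
    case lateral(dx: Double, dy: Double)  // movement parallel to the object
    case awayFromObject(distance: Double) // movement back toward the user
}

enum DeviceAction {
    case directScrollInput(dx: Double, dy: Double) // cursor stays down, shadow stays hidden
    case endInputAndRedisplayShadow                // input ends, shadow returns
}

func handleWhileTouchedDown(_ movement: HandMovement) -> DeviceAction {
    switch movement {
    case .lateral(let dx, let dy):
        return .directScrollInput(dx: dx, dy: dy)
    case .awayFromObject:
        return .endInputAndRedisplayShadow
    }
}

print(handleWhileTouchedDown(.lateral(dx: 0.02, dy: 0.0)))
print(handleWhileTouchedDown(.awayFromObject(distance: 0.05)))
```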
Returning to fig. 19A, in some embodiments, device 101 detects direct interactions between the hands of the user of device 101 and user interface objects simultaneously and/or alternately with indirect interactions. For example, in fig. 19A, device 101 detects that hand 1913c is interacting directly with user interface object 1903b. The hand 1913c is optionally within a direct interaction distance of the user interface object 1903b (e.g., as described with reference to method 800) and/or in a direct ready state hand shape (e.g., as described with reference to method 800). In some embodiments, when device 101 detects that a hand is interacting directly with a user interface object, device 101 displays a simulated shadow corresponding to the hand on the user interface object. In some embodiments, if a hand is within the field of view of the viewpoint of the three-dimensional environment displayed by device 101, device 101 displays a representation of the hand in the three-dimensional environment. It should be appreciated that, in some embodiments, if a hand indirectly interacting with a user interface object is within the field of view of the viewpoint of the three-dimensional environment displayed by device 101, device 101 similarly displays a representation of that hand in the three-dimensional environment.
For example, in fig. 19A, the device 101 displays a simulated shadow 1944 corresponding to the hand 1913 c. The simulated shadow 1944 optionally has a shape and/or size based on the shape and/or size of the hand 1913c and/or a finger (e.g., index finger) on the hand 1913c as if it were projected by the hand 1913c and/or finger on the user interface object 1903 b. The size, shape, color, and/or position of the simulated shadow 1944 is optionally updated as the hand 1913c moves relative to the user interface object 1903 b. The simulated shadow 1944 thus provides a visual indication of the amount by which the hand 1913c and/or a finger (e.g., index finger) on the hand 1913c is moved toward the user interface object 1903b as needed for the hand 1913c to interact with (e.g., select, scroll, etc.) the user interface object 1903b, which optionally occurs when the hand 1913c and/or the finger on the hand 1913c touches down on the user interface object 1903b (e.g., as described with reference to methods 800, 1000, 1200, 1400, 1600, 1800, and/or 2000). The simulated shadow 1944 additionally or alternatively provides a visual indication of the type of interaction (e.g., directly) between the hand 1913c and the user interface object 1903b because the size, color, and/or shape of the simulated shadow 1944 is optionally based on the size and/or shape of the hand 1913c (e.g., rather than on the size and/or shape of a cursor that is optionally not displayed for direct interaction with the user interface object). In some embodiments, the representation of the hand 1913c displayed by the device 101 is a photorealistic video image (e.g., video or digital passthrough) of the hand 1913c displayed by the display generating component 120 in the three-dimensional environment 1901 at a location corresponding to the location of the hand 1913c in the physical environment of the device 101 (e.g., the display location of the representation is updated as the hand 1913c moves). Thus, in some embodiments, simulated shadow 1944 is as if it were a cast shadow of a representation of hand 1913c displayed by device 101. In some embodiments, the representation of the hand 1913c displayed by the device 101 is a view of the hand 1913c through the transparent portion of the display generating component 120 (e.g., real or physical passthrough), so the position of the representation of the hand 1913c in the three-dimensional environment 1901 changes as the hand 1913c moves. Thus, in some embodiments, the simulated shadow 1944 is as if it were a shadow cast by the hand 1913c itself.
In fig. 19B, device 101 has detected movement of hand 1913c and/or a finger on hand 1913c toward user interface object 1903b. The hand 1913c optionally moves toward the user interface object 1903b by an amount less than that required for the hand 1913c to interact directly with the user interface object 1903b. In response to movement of hand 1913c, in fig. 19B, device 101 displays shadow 1944 at a smaller size than before, reduces the lateral offset between shadow 1944 and hand 1913c, and/or displays shadow 1944 with a visual characteristic having a different value (e.g., darker) than before. Thus, the device 101 has updated the display of the shadow 1944 to reflect the interaction of the hand 1913c with the user interface object 1903b such that the shadow 1944 continues to indicate one or more characteristics of the interaction between the hand 1913c and the user interface object 1903b (e.g., such as those previously described, including the remaining movement toward the user interface object that is required for the user's hand to interact with the user interface object (e.g., select the user interface object, etc.)).
In fig. 19C, the device 101 detects that the hand 1913c is moving laterally with respect to the position of the hand 1913c in fig. 19B. In response, the device 101 moves the shadow 1944 laterally with respect to the user interface object 1903b, as shown in fig. 19C. In some embodiments, if the movement of hand 1913c does not include movement toward or away from user interface object 1903b, but only includes movement that is lateral with respect to user interface object 1903b, the display of shadow 1944 (except for lateral position) remains unchanged from fig. 19B through fig. 19C. In some implementations, the device 101 changes the display of the shadow 1944 based on the content or other characteristics of the user interface object 1903b at the new location of the shadow 1944.
In fig. 19D, the device 101 detects that the hand 1913c is moving toward the user interface object 1903b by an amount equal to or greater than the amount required for the hand 1913c to interact with the user interface object 1903b (e.g., the amount required for the hand 1913c or a finger on the hand 1913c to touch down on the user interface object 1903b). In response to movement of the hand 1913c, the device 101 optionally stops or adjusts the display of shadow 1944. In response to movement of hand 1913c and the touching down of hand 1913c on user interface object 1903b, device 101 optionally recognizes a corresponding input (e.g., a selection input, a scroll input, a tap input, a press-hold-lift-off input, etc., as described with reference to methods 800, 1000, 1200, 1400, 1600, 1800, and/or 2000) from hand 1913c to user interface object 1903b, as indicated by the check mark in user interface object 1903b in fig. 19D. If the user interface object 1903b is a virtual touchpad-type user interface object (e.g., as described with reference to method 1800), the device 101 optionally directs input corresponding to the interaction of the hand 1913c with the user interface object 1903b to the remote user interface object 1903c, as indicated by the check mark in the user interface object 1903c in fig. 19D.
In some implementations, in response to lateral movement of hand 1913c while hand 1913c and/or a finger on hand 1913c remains touching down on user interface object 1903b, device 101 directs movement-based input (e.g., scroll input) to user interface object 1903b and/or 1903c (e.g., does not redisplay or adjust shadow 1944) according to lateral movement of hand 1913 c. In some implementations, in response to movement of the hand 1913c and/or a finger on the hand 1913c away from the user interface object 1903b, the device 101 identifies an end of a corresponding input (e.g., tap input, long press input, scroll input, etc.) directed to the user interface object 1903b and/or 1903c and redisplays or adjusts the shadow 1944 accordingly with one or more of the previously described characteristics.
Fig. 20A-20F are flowcharts illustrating methods of enhancing interactions with user interface elements in a three-dimensional environment using visual indications of such interactions, according to some embodiments. In some embodiments, the method 2000 is performed at a computer system (e.g., computer system 101 in fig. 1, such as a tablet, smart phone, wearable computer, or head-mounted device) that includes a display generating component (e.g., display generating component 120 in fig. 1, 3, and 4) (e.g., heads-up display, touch screen, projector, etc.) and one or more cameras (e.g., cameras pointing downward toward the user's hand (e.g., color sensors, infrared sensors, and other depth sensing cameras) or cameras pointing forward from the user's head). In some embodiments, the method 2000 is managed by instructions stored in a non-transitory computer readable storage medium and executed by one or more processors of a computer system, such as one or more processors 202 of computer system 101 (e.g., control unit 110 in fig. 1A). Some operations in method 2000 are optionally combined and/or the order of some operations is optionally changed.
In some embodiments, the method 2000 is performed at an electronic device (e.g., 101 a) in communication with a display generation component and one or more input devices. For example, a mobile device (e.g., a tablet, smart phone, media player, or wearable device) or a computer. In some embodiments, the display generating component is a display integrated with the electronic device (optionally a touch screen display), an external display such as a monitor, projector, television, or hardware component (optionally integrated or external) for projecting a user interface or making the user interface visible to one or more users, or the like. In some embodiments, the one or more input devices include a device capable of receiving user input (e.g., capturing user input, detecting user input, etc.) and transmitting information associated with the user input to the electronic device. Examples of input devices include a touch screen, a mouse (e.g., external), a touch pad (optionally integrated or external), a remote control device (e.g., external), another mobile device (e.g., separate from an electronic device), a handheld device (e.g., external), a controller (e.g., external), a camera, a depth sensor, an eye tracking device, and/or a motion sensor (e.g., a hand tracking device, a hand motion sensor), and so forth. In some embodiments, the hand tracking device is a wearable device, such as a smart glove. In some embodiments, the hand tracking device is a handheld input device, such as a remote control or a stylus.
In some embodiments, the electronic device displays (2002 a) user interface objects, such as user interface objects 1903a and/or 1903b in fig. 19A-19D, via a display generation component. In some implementations, the user interface object is an interactive user interface object, and in response to detecting input directed to a given object, the electronic device performs an action associated with the user interface object. For example, the user interface object is a selectable option that, when selected, causes the electronic device to perform an action, such as displaying a corresponding user interface, changing a setting of the electronic device, or initiating playback of content. As another example, the user interface object is a container (e.g., window) in which the user interface/content is displayed, and in response to detecting a selection of the user interface object and a subsequent movement input, the electronic device updates the positioning of the user interface object in accordance with the movement input. In some embodiments, the user interface element is displayed in a three-dimensional environment (e.g., a computer-generated reality (CGR) environment, such as a Virtual Reality (VR) environment, a Mixed Reality (MR) environment, or an Augmented Reality (AR) environment, etc.) that is generated by, displayed by, or otherwise made viewable by the device (e.g., a user interface including the user interface object is a three-dimensional environment and/or is displayed within a three-dimensional environment).
In some embodiments, while displaying the user interface object, the electronic device detects (2002b), via the one or more input devices, an input directed to the user interface object by a first predefined portion of a user of the electronic device (such as hands 1913a, 1913b, and 1913c in fig. 19A-19D) (e.g., direct or indirect interaction of a hand, finger, etc. of the user of the electronic device with the user interface object, such as described with reference to methods 800, 1000, 1200, 1400, 1600, and/or 1800).
In some embodiments, upon detecting an input directed to a user interface object, the electronic device displays (2002 c) via a display generation component a simulated shadow displayed on the user interface object, such as shadows 1942a, 1942b, and/or shadows 1944, wherein the simulated shadow has an appearance based on a positioning of an element indicative of an interaction with the user interface object relative to the user interface object (e.g., a simulated shadow that appears to be cast by a cursor remote from and/or corresponding to a first predefined portion of the user (e.g., a visual indication such as described with reference to method 1800) or a simulated shadow that appears to be cast by a representation of the first predefined portion of the user (e.g., a virtual representation of a hand/finger, and/or an actual hand/finger via a physical or digital transmission display), etc.), the appearance optionally being based on a simulated light source and/or a shape of the element (e.g., a shape of the cursor or a portion of the user). For example, if the first predefined portion of the user is interacting directly with the user interface object, the electronic device generates simulated shadows that appear to be cast by the first predefined portion of the user on the user interface object (e.g., and does not generate shadows that appear to be cast by the cursor/visual indication on the user interface object), which optionally indicates that the interaction with the user interface object is direct interaction (e.g., rather than indirect interaction). In some implementations, such simulated shadows indicate a spacing between the first predefined portion of the user and the user interface object (e.g., indicate a distance of movement toward the user interface object required for the first predefined portion of the user to interact with the user interface object). As will be described in more detail below, in some embodiments, the electronic device generates different types of simulated shadows for indirect interactions with the user interface object, which indicate that the interactions are indirect (e.g., not direct). The above-described manner of generating and displaying shadows indicative of interactions with user interface objects provides an efficient manner of indicating the presence and/or type of interactions occurring with user interface objects, which simplifies interactions between a user and an electronic device, enhances operability of the electronic device, and makes a user-device interface more efficient (e.g., by reducing errors in interactions with user interface objects), which also reduces power usage and extends battery life of the electronic device by enabling a user to more quickly and efficiently use the electronic device.
In some embodiments, the element includes a cursor displayed at a location corresponding to a location away from the first predefined portion of the user and controlled by movement of the first predefined portion of the user (2004 a), such as cursor 1940a and/or cursor 1940b. For example, in some embodiments, when a first predefined portion of the user (e.g., the user's hand) is in a particular pose and a distance corresponding to indirect interaction with the user interface object from a location corresponding to the user interface object, such as described with reference to method 800, the electronic device displays a cursor in proximity to the user interface object, the position/movement of the cursor being controlled by the first predefined portion of the user (e.g., the user's hand and/or the position/movement of a finger on the user's hand). In some implementations, the electronic device decreases a spacing between the cursor and the user interface object in response to movement of the first predefined portion of the user toward a location corresponding to the user interface object, and when movement of the first predefined portion of the user is sufficient for selection of movement of the user interface object, the electronic device removes the spacing between the cursor and the user interface object (e.g., causes the cursor to touch the user interface object). In some embodiments, the simulated shadow is a simulated shadow of the cursor on the user interface object, and the simulated shadow is updated/changed as the positioning of the cursor changes on the user interface object, and/or the distance of the cursor from the user interface object changes based on the movement/positioning of the first predefined portion of the user. The above-described manner of displaying a cursor and simulated shadows of the cursor indicative of interactions with user interface objects provides an efficient manner of indicating the type and/or amount of input required for a first predefined portion of a user to interact with user interface objects, which simplifies interactions between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient (e.g., by reducing errors in interactions with user interface objects), which also reduces power usage and extends battery life of the electronic device by enabling the user to more quickly and efficiently use the electronic device.
In some embodiments, upon displaying the user interface object and the second user interface object, and prior to detecting input directed to the user interface object by the first predefined portion of the user (2006 a), in accordance with a determination that one or more first criteria are met, including criteria met when the user's gaze is directed to the user interface object (e.g., criteria corresponding to indirect interaction with the user interface object, including one or more criteria based on distance of the first predefined portion of the user from the user interface object, pose of the first predefined portion of the user, etc., such as described with reference to method 800), the electronic device displays (2006 b) a cursor via the display generating component at a predetermined distance from the user interface object, such as described with reference to cursors 1940a and 1940b in fig. 19A (e.g., optionally displays the cursor without being associated with the user interface object before meeting the one or more first criteria). In some embodiments, when the one or more first criteria are met, the cursor is initially displayed a predetermined amount (e.g., 0.1, 0.5, 1, 5, 10 cm) apart from the user interface object. After displaying the cursor, movement of a first predefined portion of the user (e.g., toward the user interface object) corresponding to the initial spacing of the cursor from the user interface object is optionally required to interact with/select the user interface object through the cursor.
In some embodiments, in accordance with a determination that one or more second criteria are met, including criteria that are met when the user's gaze is directed to a second user interface object (e.g., criteria corresponding to indirect interaction with the second user interface object, including one or more criteria based on distance of a first predefined portion of the user from the second user interface object, pose of the first predefined portion of the user, etc., such as described with reference to method 800), the electronic device displays (2006 b) a cursor via the display generating component at a predetermined distance from the second user interface object, such as if the cursor display criteria described herein have been met with respect to object 1903c in fig. 19A (e.g., in addition to or instead of object 1903 a), which would optionally cause device 101 to display a cursor similar to cursor 1940a and/or 1940b for interaction with object 1903 c. For example, a cursor is optionally displayed unassociated with the second user interface object before the one or more second criteria are met. In some embodiments, when the one or more second criteria are met, the cursor is initially displayed a predetermined amount (e.g., 0.1, 0.5, 1, 5, 10 cm) apart from the second user interface object. After displaying the cursor, movement of the first predefined portion of the user corresponding to the initial spacing of the cursor from the second user interface object (e.g., toward the second user interface object) is optionally required to interact with/select the second user interface object through the cursor. Thus, in some embodiments, the electronic device displays a cursor for interacting with a respective user interface object based on the user's gaze. The above-described manner of displaying a cursor for interacting with a respective user interface object based on gaze provides an efficient way of preparing to interact with the user interface object, which simplifies interactions between a user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient (e.g., by being ready to accept interactions with the user interface object when the user is looking at the user interface object), which also reduces power usage and prolongs battery life of the electronic device by enabling the user to use the electronic device more quickly and efficiently.
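A minimal sketch of the gaze-gated cursor placement described above follows: when the indirect-interaction criteria are met for the object the gaze is directed to, a cursor is created a predetermined distance from that object. The predetermined distance, the criteria check, and the type names are assumptions used only for illustration.

```swift
// Illustrative sketch: display a cursor a predetermined distance from whichever
// user interface object the gaze-based (indirect) criteria are met for.
struct UIObject { let id: String }

struct IndirectCursor {
    let targetID: String
    let distanceFromTarget: Double // meters
}

func cursorForGaze(target: UIObject?,
                   indirectCriteriaMet: Bool,
                   predeterminedDistance: Double = 0.01) -> IndirectCursor? {
    guard let target = target, indirectCriteriaMet else { return nil }
    return IndirectCursor(targetID: target.id, distanceFromTarget: predeterminedDistance)
}

// Gaze directed to object 1903a vs. no qualifying target.
print(cursorForGaze(target: UIObject(id: "1903a"), indirectCriteriaMet: true) as Any)
print(cursorForGaze(target: nil, indirectCriteriaMet: false) as Any)
```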
In some embodiments, the simulated shadow comprises a simulated shadow (2008 a) of a virtual representation of the first predefined portion of the user, such as described with reference to simulated shadow 1944 corresponding to hand 1913 c. For example, the electronic device optionally captures images/information, etc., about one or more hands of a user in a physical environment of the electronic device with one or more sensors, and displays representations of the hands via display generation components at respective corresponding locations in a three-dimensional environment (e.g., containing the user interface objects) displayed by the electronic device for the hands. In some embodiments, the electronic device displays simulated shadows of those representations of the user's hand or portions of the user's hand in a three-dimensional environment displayed by the electronic device (e.g., as shadows displayed on the user interface object) to indicate one or more characteristics of interactions between the user's hand and the user interface object, as described herein (optionally without displaying shadows of other portions of the user or without displaying shadows of other portions of the user's hand). In some implementations, the simulated shadow corresponding to the user's hand is a simulated shadow on the user interface object during direct interaction between the user's hand and the user interface object (e.g., as described with reference to method 800). In some implementations, the simulated shadow provides a visual indication of one or more of a distance between the first predefined portion of the user and the user interface object (e.g., for selecting the user interface object), a location on the user interface object that the first predefined portion of the user will/is interacting with, etc. The above-described manner of displaying simulated shadows corresponding to representations of a first predefined portion of a user provides an efficient way of indicating characteristics of direct interaction with user interface objects, which simplifies interactions between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient (e.g., by avoiding errors in interacting with user interface objects), which also reduces power usage and extends battery life of the electronic device by enabling the user to more quickly and efficiently use the electronic device.
In some embodiments, the simulated shadow includes a simulated shadow (2010 a) of a physical first predefined portion of the user, such as described with reference to simulated shadow 1944 corresponding to hand 1913 c. For example, the electronic device optionally (e.g., via a transparent or translucent display generating component) has a view of one or more hands of a user in a physical environment of the electronic device transmitted therethrough, and displays a three-dimensional environment (e.g., including user interface objects) via the display generating component, which results in the view of the one or more hands being visible in the three-dimensional environment displayed by the electronic device. In some embodiments, the electronic device displays simulated shadows of those hands of the user or portions of the user's hands (e.g., as shadows displayed on the user interface object) in a three-dimensional environment displayed by the electronic device to indicate one or more characteristics of interactions between the user's hands and the user interface object, as described herein (optionally without displaying shadows of other portions of the user or without displaying shadows of other portions of the user's hands). In some implementations, the simulated shadow corresponding to the user's hand is a simulated shadow on the user interface object during direct interaction between the user's hand and the user interface object (e.g., as described with reference to method 800). In some implementations, the simulated shadow provides a visual indication of one or more of a distance between the first predefined portion of the user and the user interface object (e.g., for selecting the user interface object), a location on the user interface object that the first predefined portion of the user will/is interacting with, etc. The above-described manner of displaying simulated shadows corresponding to the view of the first predefined portion of the user provides an efficient way of indicating characteristics of direct interaction with the user interface object, which simplifies interactions between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient (e.g., by avoiding errors in interacting with the user interface object), which also reduces power usage and extends battery life of the electronic device by enabling the user to more quickly and efficiently use the electronic device.
In some embodiments, upon detecting an input directed to the user interface object and upon displaying a simulated shadow displayed on the user interface object (2012 a) (e.g., upon displaying a shadow of a cursor on the user interface object or upon displaying a shadow of a first predefined portion of a user on the user interface object), the electronic device detects (2012B) via the one or more input devices a progression to the input directed to the user interface object by the first predefined portion of the user (e.g., the first predefined portion of the user moves toward the user interface object), such as described with reference to hand 1913a in fig. 19B. In some implementations, in response to detecting the progress of the input directed to the user interface object, the electronic device changes (2012 c) the visual appearance (e.g., size, darkness, translucence, etc.) of the simulated shadow displayed on the user interface object according to the progress of the input directed to the user interface object by the first predefined portion of the user (e.g., based on distance moved, based on speed moved, based on direction moved), such as described with reference to shadow 1942a in fig. 19B. For example, in some embodiments, the visual appearance of the simulated shadow optionally changes as the first predefined portion of the user moves relative to the user interface object. For example, the electronic device optionally changes the visual appearance of the simulated shadow in a first manner when the first predefined portion of the user moves toward the user interface object (e.g., toward/interacts with the selected user interface object), and the electronic device optionally changes the visual appearance of the simulated shadow in a second manner different from (e.g., opposite) the first manner when the first predefined portion of the user moves away from (e.g., away from/interacts with) the user interface object. The above-described manner of changing the visual appearance of the simulated shadow based on the progress of the input directed to the user interface object provides an efficient way of indicating progress toward or back-up from selection of the user interface object, which simplifies interaction between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient (e.g., by avoiding errors in interacting with the user interface object), which also reduces power usage and prolongs battery life of the electronic device by enabling the user to use the electronic device more quickly and efficiently.
In some embodiments, changing the visual appearance of the simulated shadow includes changing the brightness (2014 a) used to display the simulated shadow, such as described with reference to shadow 1942a and/or shadow 1944. For example, in some embodiments, the electronic device optionally displays simulated shadows (e.g., of a hand and/or a cursor) with greater darkness as the first predefined portion of the user (e.g., and thus, when applicable, the cursor) moves toward the user interface object (e.g., toward the selection of/interaction with the user interface object), and optionally displays simulated shadows (e.g., of a hand and/or a cursor) with less darkness as the first predefined portion of the user (e.g., and thus, the cursor) moves away from the user interface object (e.g., away from the selection of/interaction with the user interface object). The above-described manner of changing darkness of simulated shadows based on progress of input directed to a user interface object provides an efficient manner of indicating progress toward or back-up from selection of a user interface object, which simplifies interactions between a user and an electronic device, enhances operability of the electronic device, and makes a user-device interface more efficient (e.g., by avoiding errors in interacting with the user interface object), which also reduces power usage and prolongs battery life of the electronic device by enabling a user to more quickly and efficiently use the electronic device.
In some embodiments, changing the visual appearance of the simulated shadow includes changing a level of blurriness (and/or degree of dispersion) for displaying the simulated shadow (2016 a), such as described with reference to shadow 1942a and/or shadow 1944. For example, in some embodiments, the electronic device optionally displays simulated shadows (of the hands and/or cursors) with a lower degree of blur and/or dispersion as the first predefined portion of the user (e.g., and thus, when applicable, the cursor) moves toward the user interface object (e.g., toward the selection of/interaction with the user interface object), and optionally displays simulated shadows (of the hands and/or cursors) with a greater degree of blur and/or dispersion as the first predefined portion of the user (e.g., and thus, when applicable, the cursor) moves away from the user interface object (e.g., away from the selection of/interaction with the user interface object). The above-described manner of changing the degree of blurring of the simulated shadow based on the progress of the input directed to the user interface object provides an efficient way of indicating progress toward or back-up from selection of the user interface object, which simplifies interaction between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient (e.g., by avoiding errors in interaction with the user interface object), which also reduces power usage and prolongs battery life of the electronic device by enabling the user to use the electronic device more quickly and efficiently.
In some embodiments, changing the visual appearance of the simulated shadow includes changing the size of the simulated shadow (2018 a), such as described with reference to shadow 1942a and/or shadow 1944. For example, in some embodiments, the electronic device optionally displays simulated shadows (e.g., of a hand and/or a cursor) in a smaller size as the first predefined portion of the user (e.g., and thus, when applicable, the cursor) moves toward the user interface object (e.g., toward the selection of/interaction with the user interface object), and optionally displays simulated shadows (e.g., of a hand and/or a cursor) in a larger size as the first predefined portion of the user (e.g., and thus, when applicable, the cursor) moves away from the user interface object (e.g., away from the selection of/interaction with the user interface object). The above-described manner of changing the size of the simulated shadow based on the progress of the input directed to the user interface object provides an efficient way of indicating progress toward or back-up from selection of the user interface object, which simplifies interaction between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient (e.g., by avoiding errors in interacting with the user interface object), which also reduces power usage and prolongs battery life of the electronic device by enabling the user to more quickly and efficiently use the electronic device.
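The three appearance changes described above (darkness, blur, and size varying with the progress of the input) could be combined as in the sketch below. The specific curves and constants are assumptions; only the directions of change follow the description.

```swift
// Illustrative sketch: simulated-shadow appearance as a function of input progress,
// where progress is 0.0 when the input begins and 1.0 at touch-down/selection.
struct ShadowAppearance {
    var darkness: Double   // higher = darker shadow
    var blurRadius: Double // higher = more diffuse shadow
    var scale: Double      // relative size of the shadow
}

func shadowAppearance(progress: Double) -> ShadowAppearance {
    let p = max(0.0, min(1.0, progress))
    return ShadowAppearance(darkness: 0.3 + 0.6 * p,    // darker as the input progresses
                            blurRadius: 12.0 * (1 - p), // sharper as the input progresses
                            scale: 1.5 - 0.5 * p)       // smaller as the input progresses
}

print(shadowAppearance(progress: 0.0)) // light, diffuse, large
print(shadowAppearance(progress: 0.9)) // dark, sharp, small
```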
In some embodiments, upon detecting an input directed to the user interface object and upon displaying a simulated shadow displayed on the user interface object (2020 a) (e.g., upon displaying a shadow of a cursor on the user interface object or upon displaying a shadow of a first predefined portion of a user on the user interface object), the electronic device detects (2020 b) via the one or more input devices a first portion of the input corresponding to a laterally moving element relative to the user interface object (e.g., detects a lateral movement of the first predefined portion of the user relative to a position corresponding to the user interface object), such as described with reference to hand 1913a in fig. 19C or hand 1913C in fig. 19C. In some implementations, in response to detecting the first portion of the input, the electronic device displays (2020C) a simulated shadow at a first location on the user interface object in a first visual appearance (e.g., a first one or more of size, shape, color, darkness, blurriness, degree of dispersion, etc.), such as described with reference to hand 1913a in fig. 19C or hand 1913C in fig. 19C. In some implementations, the electronic device detects (2020 d) via the one or more input devices a second portion of the input corresponding to the lateral movement element relative to the user interface object (e.g., detects another lateral movement of the first predefined portion of the user relative to the location corresponding to the user interface object). In some implementations, in response to detecting the second portion of the input, the electronic device displays (2020 e) a simulated shadow on the user interface object at a second location different from the first location, such as described with reference to hand 1913a in fig. 19C or hand 1913C in fig. 19C, in a second visual appearance (e.g., a different one or more of size, shape, color, darkness, blurriness, degree of dispersion, etc.) that is different from the first visual appearance. In some implementations, the electronic device changes the visual appearance of the simulated shadow as the simulated shadow moves laterally over the user interface object (e.g., lateral movement corresponding to the first predefined portion of the user). In some implementations, the difference in visual appearance is based on one or more of: a difference in content of the user interface object over which the simulated shadow is displayed, a difference in distance between the first predefined portion of the user and the user interface object (different locations of the simulated shadow on the user interface object), and so on. The above-described manner of changing the visual appearance of the simulated shadow based on the shadow and/or lateral movement of the first predefined portion of the user provides an efficient manner of indicating one or more characteristics associated with different locations of the user interface object that are interactive with the user interface object, which simplifies interaction between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient (e.g., by avoiding errors in interacting with the different locations on the user interface object), which also reduces power usage and extends battery life of the electronic device by enabling the user to use the electronic device more quickly and efficiently.
In some implementations, the user interface object is a virtual surface (e.g., a virtual touchpad), and the input detected at a location near the virtual surface provides input (2022 a) to a second user interface object remote from the virtual surface, such as described with respect to user interface objects 1903b and 1903 c. For example, in some embodiments, when a first predefined portion of a user (e.g., a user's hand) is in a particular pose and at a distance corresponding to indirect interaction with a particular user interface object, such as described with reference to method 800, the electronic device displays a virtual touchpad proximate to (e.g., a predetermined distance, such as 0.1, 0.5, 1, 5, 10cm, from) the first predefined portion of the user and displays simulated shadows corresponding to the first predefined portion of the user on the virtual touchpad. In some implementations, in response to movement of the first predefined portion of the user toward the virtual touchpad, the electronic device updates the simulated shadow based on a relative positioning and/or distance of the first predefined portion of the user from the virtual touchpad. In some implementations, when movement of the first predefined portion of the user is sufficient for selecting movement of the virtual touchpad with the first predefined portion of the user, the electronic device provides input (e.g., selection input, tap input, scroll input, etc.) to a particular remote user interface object based on interactions between the first predefined portion of the user and the virtual touchpad. The virtual surface has one or more characteristics of visual indications displayed at respective locations in the three-dimensional environment corresponding to respective locations of the predefined portions of the user, as described with reference to method 1800. The above-described manner of displaying virtual touchpads and simulated shadows on virtual touchpads provides an efficient way of indicating one or more characteristics of interactions with the virtual touchpads (and thus with remote user interface objects), which simplifies interactions between users and electronic devices, enhances operability of the electronic devices, and makes user-device interfaces more efficient (e.g., by avoiding errors in interacting with remote user interface objects via the virtual touchpads), which also reduces power usage and prolongs battery life of the electronic devices by enabling users to more quickly and efficiently use the electronic devices.
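The virtual-touchpad indirection described in this paragraph is sketched below: hovering over the virtual surface only affects the simulated shadow, while touching down on it produces an input event delivered to a remote user interface object. The identifiers and the simple routing rule are illustrative assumptions.

```swift
// Illustrative sketch: routing contact with a virtual touchpad to a remote object.
struct TouchpadContact {
    let x: Double, y: Double // contact location on the virtual surface (normalized)
    let isTouchingDown: Bool // false while the hand merely hovers over the surface
}

struct RemoteInputEvent {
    let targetObjectID: String
    let x: Double, y: Double
}

func routeTouchpadContact(_ contact: TouchpadContact,
                          remoteTargetID: String) -> RemoteInputEvent? {
    // Hovering only updates the simulated shadow; touch-down produces an input event.
    guard contact.isTouchingDown else { return nil }
    return RemoteInputEvent(targetObjectID: remoteTargetID, x: contact.x, y: contact.y)
}

let contact = TouchpadContact(x: 0.4, y: 0.6, isTouchingDown: true)
print(routeTouchpadContact(contact, remoteTargetID: "1903c") as Any)
```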
In some embodiments, the first predefined portion of the user interacts directly with the user interface object (e.g., as described with reference to method 1400) and simulated shadows are displayed on the user interface object (2024 a), such as described with reference to user interface object 1903b in fig. 19A-19D. For example, if the first predefined portion of the user is interacting directly with the user interface object, the electronic device generates simulated shadows that appear to be cast by the first predefined portion of the user on the user interface object (e.g., and does not generate shadows that appear to be cast by the cursor/visual indication on the user interface object), which optionally indicates that the interaction with the user interface object is direct interaction (e.g., rather than indirect interaction). In some embodiments, such simulated shadows indicate a spacing between the first predefined portion of the user and a location corresponding to the user interface object (e.g., indicate a distance of movement toward the user interface object required for the first predefined portion of the user to interact with the user interface object). The above-described manner of displaying simulated shadows on a user interface object when a first predefined portion of a user interacts directly with the user interface object provides an efficient manner of indicating one or more characteristics of interactions with the user interface object, which simplifies interactions between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient (e.g., by avoiding errors in interacting with the user interface object), which also reduces power usage and extends battery life of the electronic device by enabling the user to more quickly and efficiently use the electronic device.
In some embodiments, in accordance with a determination that a first predefined portion of a user is within a threshold distance (e.g., 1, 2, 5, 10, 20, 50, 100, 500 cm) of a location corresponding to a user interface object, the simulated shadow corresponds to the first predefined portion of the user (2026 a), such as shadow 1944 (e.g., if the first predefined portion of the user interacts directly with the user interface object, such as described with reference to methods 800, 1000, 1200, 1400, 1600, and/or 1800), the electronic device displays the simulated shadow on the user interface object, wherein the simulated shadow corresponds to the first predefined portion of the user (e.g., has a shape based on the first predefined portion of the user). In some implementations, in accordance with a determination that the first predefined portion of the user is farther than a threshold distance (e.g., 1, 2, 5, 10, 20, 50, 100, 500 cm) from a location corresponding to the user interface object, the simulated shadow corresponds to a cursor (2026 b) controlled by the first predefined portion of the user, such as shadows 1942a and/or 1942b. For example, if a first predefined portion of the user is indirectly interacting with a user interface object, such as described with reference to methods 800, 1000, 1200, 1400, 1600, and/or 1800, the electronic device displays a cursor and a simulated shadow on the user interface object, wherein the simulated shadow corresponds to the cursor (e.g., has a cursor-based shape). Exemplary details of the cursor and/or shadows corresponding to the cursor have been previously described herein. The above-described manner of selectively displaying the cursor and its corresponding shadow provides an efficient manner of facilitating proper interaction (e.g., direct or indirect) with the user interface object, which simplifies interaction between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient (e.g., by avoiding errors in interacting with the user interface object), which also reduces power usage and extends battery life of the electronic device by enabling the user to more quickly and efficiently use the electronic device.
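The distance-based choice of shadow source described above reduces to a comparison against a threshold, as in the following sketch; the threshold value and type names are assumptions.

```swift
// Illustrative sketch: choose what casts the simulated shadow based on how far
// the predefined portion of the user (e.g., a hand) is from the object.
enum SimulatedShadowSource {
    case hand   // direct interaction: shadow shaped like the hand/finger
    case cursor // indirect interaction: shadow shaped like the displayed cursor
}

func shadowSource(handDistanceToObject: Double,
                  directInteractionThreshold: Double = 0.10) -> SimulatedShadowSource {
    handDistanceToObject <= directInteractionThreshold ? .hand : .cursor
}

print(shadowSource(handDistanceToObject: 0.05)) // hand (direct interaction)
print(shadowSource(handDistanceToObject: 0.60)) // cursor (indirect interaction)
```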
In some embodiments, upon detecting an input directed to a user interface object by a first predefined portion of the user, the electronic device detects (2028 a) a second input directed to the user interface object by a second predefined portion of the user, such as detecting that hands 1913a and 1913b are interacting with user interface object 1903a (e.g., both hands of the user satisfy indirect interaction criteria with the same user interface object, such as described with reference to method 800). In some implementations, upon detecting both the input directed to the user interface object and the second input, the electronic device simultaneously displays (2028 b) on the user interface object a simulated shadow (2028 c) relative to the user interface object indicating interaction of the first predefined portion of the user with the user interface object, and a second simulated shadow (2028 d) relative to the user interface object indicating interaction of the second predefined portion of the user with the user interface object, such as shadows 1942a and 1942b. For example, the electronic device displays a simulated shadow corresponding to a first predefined portion of the user on the keyboard (e.g., a shadow of a cursor if the first predefined portion of the user interacts indirectly with the keyboard, or a shadow of the first predefined portion of the user if the first predefined portion of the user interacts directly with the keyboard) and displays a simulated shadow corresponding to a second predefined portion of the user on the keyboard (e.g., a shadow of a cursor if the second predefined portion of the user interacts indirectly with the keyboard, or a shadow of the second predefined portion of the user if the second predefined portion of the user interacts directly with the keyboard). In some implementations, the simulated shadow corresponding to the first predefined portion of the user has one or more characteristics (e.g., as described herein) indicative of interaction of the first predefined portion of the user with the user interface object, and the simulated shadow corresponding to the second predefined portion of the user has one or more characteristics (e.g., as described herein) indicative of interaction of the second predefined portion of the user with the user interface object. The above-described manner of displaying simulated shadows of a plurality of predefined portions of a user provides an efficient way of independently indicating characteristics of interactions between the plurality of predefined portions of the user and a user interface object, which simplifies interactions between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient (e.g., by avoiding errors in interacting with the user interface object), which also reduces power usage and prolongs battery life of the electronic device by enabling the user to more quickly and efficiently use the electronic device.
In some embodiments, the simulated shadow indicates how much movement (2030 a) is required for the first predefined portion of the user to interact with the user interface object, such as described with reference to shadows 1942a, 1942b, and/or 1944. For example, the visual appearance of the simulated shadow is based on the distance the first predefined portion of the user must move toward the user interface object to interact with the user interface object. Thus, the visual appearance of the simulated shadow optionally indicates how far the first predefined portion of the user has to move in order to interact with and/or select the user interface object. For example, if the simulated shadow is relatively large and/or diffuse, the simulated shadow optionally indicates that the first predefined portion of the user must move a relatively large distance towards the user interface object in order to interact with and/or select the user interface object, and if the simulated shadow is relatively small and/or the boundary is clearly defined, the simulated shadow optionally indicates that the first predefined portion of the user must move a relatively small distance towards the user interface object in order to interact with and/or select the user interface object. The above-described manner of simulating shadows to indicate how far the first predefined portion of the user has to move in order to interact with the user interface object provides an efficient way to facilitate accurate interactions between the first predefined portion of the user and the user interface object, which simplifies interactions between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient (e.g., by avoiding errors in interacting with the user interface object), which also reduces power usage and prolongs battery life of the electronic device by enabling the user to use the electronic device more quickly and efficiently.
Fig. 21A-21E illustrate examples of how an electronic device redirects input from one user interface element to another user interface element in response to detecting movement included in the input, according to some embodiments.
Fig. 21A illustrates the electronic device 101A displaying a three-dimensional environment and/or user interface via the display generation component 120. It should be appreciated that in some embodiments, the electronic device 101A utilizes one or more of the techniques described with reference to fig. 21A-21E in a two-dimensional environment without departing from the scope of the present disclosure. As described above with reference to fig. 1-6, the electronic device 101a optionally includes a display generation component 120a (e.g., a touch screen) and a plurality of image sensors 314a. The image sensor optionally includes one or more of the following: a visible light camera; an infrared camera; a depth sensor; or any other sensor that the electronic device 101a can use to capture one or more images of the user or a portion of the user when the user interacts with the electronic device 101 a. In some embodiments, the display generating component 120a is a touch screen capable of detecting gestures and movements of the user's hand. In some embodiments, the user interfaces shown and described may also be implemented on a head-mounted display that includes a display generating component that displays the user interface to the user, as well as sensors that detect movement of the physical environment and/or the user's hand (e.g., external sensors facing outward from the user) and/or sensors that detect gaze of the user (e.g., internal sensors facing inward toward the user).
Fig. 21A illustrates an example of the electronic device 101a displaying a first selectable option 2104 and a second selectable option 2106 within a container 2102, and a slider user interface element 2108 within a container 2109, in a three-dimensional environment. In some embodiments, containers 2102 and 2109 are windows, backplanes, backgrounds, discs, or other types of container user interface elements. In some embodiments, the contents of container 2102 and the contents of container 2109 are associated with the same application (e.g., or with the operating system of electronic device 101a). In some implementations, the contents of container 2102 and the contents of container 2109 are associated with different applications, or the contents of one of containers 2102 or 2109 are associated with an operating system. In some implementations, in response to detecting a selection of one of selectable options 2104 or 2106, electronic device 101a performs an action associated with the selected selectable option. In some embodiments, the slider 2108 includes an indication 2112 of the current value of the slider 2108. For example, slider 2108 indicates a parameter, magnitude, value, etc. of a setting of electronic device 101a or an application. In some embodiments, in response to an input to change the current value of the slider (e.g., by manipulating the indication 2112 within the slider 2108), the electronic device 101a updates the setting associated with the slider 2108 accordingly.
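The slider behavior described above, in which manipulating indication 2112 within slider 2108 updates the setting the slider represents, can be sketched as follows; the value range and names are assumptions.

```swift
// Illustrative sketch: a slider whose current-value indication drives a setting.
struct SliderElement {
    let range: ClosedRange<Double>
    var currentValue: Double

    // Move the indication to a fraction (0...1) of the slider's length.
    mutating func setIndication(toFraction fraction: Double) {
        let f = max(0.0, min(1.0, fraction))
        currentValue = range.lowerBound + f * (range.upperBound - range.lowerBound)
    }
}

var setting = SliderElement(range: 0...100, currentValue: 30)
setting.setIndication(toFraction: 0.75) // e.g., in response to manipulating indication 2112
print(setting.currentValue)             // 75.0
```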
As shown in fig. 21A, in some embodiments, the electronic device 101A detects a gaze 2101a of a user directed at the container 2102. In some implementations, in response to detecting that the user's gaze 2101a is directed to the container 2102, the electronic device 101a updates the positioning of the container 2102 to display the container 2102 in the three-dimensional environment at a position closer to the user's viewpoint than the positioning of the container 2102 prior to detecting the gaze 2101a directed to the container 2102. For example, before detecting that the user's gaze 2101a is directed toward container 2102, electronic device 101a displays containers 2102 and 2109 at the same distance from the user's point of view in the three-dimensional environment. In this example, in response to detecting that the user's gaze 2101A is directed toward container 2102 as shown in fig. 21A, electronic device 101A displays container 2102 closer to the user's viewpoint than container 2109. For example, the electronic device 101a displays the container 2102 at a larger size and/or displays the container with virtual shadows and/or with stereoscopic depth information corresponding to a position closer to the user's viewpoint.
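As a rough illustration of the gaze-dependent depth behavior just described (the container moving toward the viewpoint while the user's gaze is within it), consider the following sketch. It is not from the patent; the type, field names, and the 5-centimeter lift are assumptions.

    // Illustrative only: lift a container toward the viewpoint while gazed at.
    struct ContainerDepth {
        var distanceFromViewpoint: Double   // metres from the user's viewpoint
    }

    func updateDepth(_ container: inout ContainerDepth,
                     gazeIsInside: Bool,
                     restingDistance: Double,
                     liftTowardUser: Double = 0.05) {
        // Move the container slightly closer while it has the user's gaze,
        // and return it to its resting depth otherwise.
        container.distanceFromViewpoint = gazeIsInside
            ? restingDistance - liftTowardUser
            : restingDistance
    }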
Fig. 21A shows an example in which the electronic device 101A detects a selection input directed to the selectable option 2104 and the slider 2108. Although fig. 21A shows multiple selection inputs, it should be understood that in some embodiments, the selection inputs shown in fig. 21A are detected at different times rather than simultaneously.
In some implementations, the electronic device 101a detects selection of one of the user interface elements (such as one of the selectable options 2104 or 2106 or the indicator 2112 of the slider 2108) by detecting an indirect selection input, a direct selection input, an air gesture selection input, or an input device selection input. In some implementations, detecting selection of the user interface element includes detecting that a user's hand is performing a corresponding gesture. In some embodiments, according to one or more steps of methods 800, 1000, 1200, and/or 1600, detecting an indirect selection input includes detecting, via input device 314a, that the user's gaze is directed toward a respective user interface element while detecting that the user's hand is making a selection gesture, such as a pinch hand gesture in which the user touches his thumb to another finger of the same hand; in some embodiments, while the gesture is maintained, the selectable option moves toward the container in which the selectable option is displayed, and selection occurs when the selectable option reaches the container. In some implementations, according to one or more steps of methods 800, 1400, and/or 1600, detecting a direct selection input includes detecting, via input device 314a, that the user's hand makes a selection gesture, such as a pinch gesture within a predefined threshold distance (e.g., 1, 2, 3, 5, 10, 15, or 30 centimeters) of the location of the respective user interface element, or a press gesture in which the user's hand "presses" into the location of the respective user interface element while in a pointing hand shape. In some implementations, according to one or more steps of methods 1800 and/or 2000, detecting an air gesture input includes detecting that the user's gaze is directed toward a respective user interface element while detecting a press gesture at the location of an air gesture user interface element displayed in the three-dimensional environment via display generation component 120a. In some implementations, detecting an input device selection includes detecting manipulation of a mechanical input device (e.g., stylus, mouse, keyboard, touch pad, etc.) in a predefined manner corresponding to selection of a user interface element while a cursor controlled by the input device is associated with the location of the respective user interface element and/or while the gaze of the user is directed at the respective user interface element.
For example, in fig. 21B, the electronic device 101a detects a portion of a direct selection input directed to option 2104 provided by the hand 2103a. In some embodiments, the hand 2103a is in a hand shape (e.g., a "hand state D") included in the direct selection gesture, such as a pointing hand shape in which one or more fingers are extended and one or more fingers curl toward the palm. In some embodiments, the portion of the direct selection input does not include completion of the press gesture (e.g., the hand moving a threshold distance in the direction from option 2104 to container 2102, such as a distance corresponding to the visual spacing between option 2104 and container 2102). In some embodiments, hand 2103a is within a direct selection threshold distance of selectable option 2104.
In some embodiments, the electronic device 101a detects a portion of a direct input directed to the indicator 2112 of the slider 2108 provided by the hand 2103d. In some embodiments, the hand 2103d is in a hand shape (e.g., a "hand state D") included in the direct selection gesture, such as a pointing hand shape in which one or more fingers are extended and one or more fingers curl toward the palm. In some embodiments, the portion of the input does not include the end of the input, such as the user ceasing to make the pointing hand shape. In some embodiments, the hand 2103d is within a direct selection threshold distance of the indicator 2112 of the slider 2108.
In some embodiments, the electronic device 101a detects a portion of an indirect selection input directed to the selectable option 2104 provided by the hand 2103b while the gaze 2101a is directed at the option 2104. In some embodiments, the hand 2103b is in a hand shape (e.g., "hand state B") included in the indirect selection gesture, such as a pinch hand shape in which the thumb is touching another finger of the hand 2103b. In some embodiments, the portion of the indirect selection input does not include completion of the pinch gesture (e.g., movement of the thumb away from the finger). In some embodiments, hand 2103b is farther from selectable option 2104 than the direct selection threshold distance when the portion of the indirect selection input is provided.
In some embodiments, the electronic device 101a detects a portion of an indirect input directed to the indicator 2112 of the slider 2108 provided by the hand 2103b while the gaze 2101b is directed at the slider 2108. In some embodiments, the hand 2103b is in a hand shape (e.g., "hand state B") included in the indirect selection gesture, such as a pinch hand shape in which the thumb is touching another finger of the hand 2103b. In some embodiments, the portion of the indirect input does not include completion of a pinch gesture (e.g., movement of the thumb away from the finger). In some embodiments, the hand 2103b is farther from the indicator 2112 of the slider 2108 than the direct selection threshold distance when the portion of the indirect input is provided.
In some implementations, the electronic device 101a detects a portion of an air gesture selection input directed to the selectable option 2104 provided by the hand 2103c while the gaze 2101a is directed at the option 2104. In some implementations, the hand 2103c is in a hand shape (e.g., a "hand state B") included in the air gesture selection gesture, such as a pointing hand shape within a threshold distance (e.g., 0.1, 0.3, 0.5, 1, 2, or 3 centimeters) of the air gesture element 2114 displayed by the device 101. In some implementations, the portion of the air gesture selection input does not include completion of the selection input (e.g., while hand 2103c is within the threshold distance (e.g., 0.1, 0.3, 0.5, 1, 2, or 3 centimeters) of air gesture element 2114, movement of hand 2103c away from the user's point of view by an amount corresponding to the visual separation between selectable option 2104 and container 2102, such that the movement corresponds to pushing option 2104 to the position of container 2102). In some embodiments, hand 2103c is farther from selectable option 2104 than the direct selection threshold distance when providing the portion of the air gesture selection input.
In some implementations, the electronic device 101a detects a portion of an air gesture input directed to the slider 2108 provided by the hand 2103c while the gaze 2101b is directed at the slider 2108. In some implementations, the hand 2103c is in a hand shape (e.g., a "hand state B") included in the air gesture selection gesture, such as a pointing hand shape within a threshold distance (e.g., 0.1, 0.3, 0.5, 1, 2, or 3 centimeters) of the air gesture element 2114. In some implementations, the portion of the air gesture input does not include completion of the air gesture input (e.g., movement of hand 2103c away from air gesture element 2114, or hand 2103c ceasing to make the air gesture hand shape). In some embodiments, hand 2103c is farther from slider 2108 than the direct selection threshold distance when providing the portion of the air gesture input.
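The three kinds of partial input illustrated in fig. 21B differ mainly in the hand's distance from the target, the hand shape, and whether gaze on the target is required. The sketch below is a simplified, hypothetical classifier for an in-progress input; the thresholds, hand shapes, and type names are assumptions and not part of the patent.

    enum HandShape { case pinch, point, relaxed }
    enum SelectionModality { case direct, indirect, airGesture, none }

    struct PartialInput {
        var handShape: HandShape
        var handDistanceToTarget: Double            // metres from hand to the element
        var handDistanceToAirGestureElement: Double // metres from hand to air gesture element
        var gazeIsOnTarget: Bool
    }

    func classify(_ input: PartialInput,
                  directThreshold: Double = 0.15,     // illustrative ~15 cm
                  airGestureThreshold: Double = 0.03) -> SelectionModality {
        if input.handShape == .point && input.handDistanceToTarget <= directThreshold {
            return .direct          // pressing the element itself
        }
        if input.handShape == .point
            && input.handDistanceToAirGestureElement <= airGestureThreshold
            && input.gazeIsOnTarget {
            return .airGesture      // pressing the air gesture element while looking at the target
        }
        if input.handShape == .pinch && input.gazeIsOnTarget {
            return .indirect        // pinching while looking at the target
        }
        return .none
    }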
In some embodiments, in response to detecting the portion of the selection input (e.g., one of the selection inputs) directed to option 2104, electronic device 101a provides visual feedback to the user that the selection input is directed to option 2104. For example, as shown in fig. 21B, the electronic device 101a updates the color of option 2104 and increases the visual spacing of option 2104 from the container 2102 in response to detecting the portion of the selection input directed to option 2104. In some embodiments, if the user's gaze 2101a is not directed to a user interface element included in the container 2102, the electronic device 101a displays the container 2102 at a position that is visually spaced from the position shown in fig. 21B at which the electronic device 101a would otherwise display the container 2102. In some embodiments, because the selection input does not point to option 2106, electronic device 101a maintains display of option 2106 in the same color as that used to display option 2106 in fig. 21A prior to detecting the portion of the input directed to option 2104. Further, in some embodiments, the electronic device 101a displays option 2106 without visual separation from container 2102 because the beginning of the selection input is not directed to option 2106.
In some embodiments, the beginning of the selection input directed to option 2104 corresponds to option 2104 moving toward container 2102 without yet reaching the container. For example, the initiation of the direct input provided by hand 2103a includes movement of hand 2103a downward or in a direction from option 2104 toward container 2102 while the hand is in a pointing hand shape. As another example, the onset of the air gesture input provided by hand 2103c and gaze 2101a includes movement of hand 2103c downward or in a direction from option 2104 toward container 2102 while the hand is in a pointing hand shape and while hand 2103c is within a threshold distance (e.g., 0.1, 0.3, 0.5, 1, 2, or 3 cm) of air gesture element 2114. As another example, the initiation of the indirect selection input provided by hand 2103b and gaze 2101a includes detecting that hand 2103b maintains the pinch hand shape for a time less than a predetermined time threshold (e.g., 0.1, 0.2, 0.3, 0.5, 1, 2, 3, or 5 seconds) that corresponds to an amount of movement of option 2104 toward container 2102 sufficient for option 2104 to reach container 2102. In some embodiments, selection of option 2104 occurs when the selection input corresponds to an amount of movement of option 2104 toward container 2102 sufficient for option 2104 to reach container 2102. In fig. 21B, the input corresponds to option 2104 being moved toward container 2102 by an amount less than the amount of visual separation between option 2104 and container 2102.
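The relationship described here, in which selection occurs once the input has pushed the option through its full visual separation from the container, can be summarized as a progress fraction. The sketch below uses hypothetical names and is only one way to express it.

    // Selection progress as the fraction of the option-to-container separation
    // that the input has pushed through; selection fires at 1.0.
    struct PressState {
        var pushedDistance: Double     // how far the input has moved the option
        var visualSeparation: Double   // gap between the option and its container
    }

    func selectionProgress(_ press: PressState) -> Double {
        guard press.visualSeparation > 0 else { return 1 }
        return min(press.pushedDistance / press.visualSeparation, 1)
    }

    func shouldSelect(_ press: PressState) -> Bool {
        // The option has been pushed all the way back to the container.
        selectionProgress(press) >= 1
    }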
In some embodiments, in response to detecting the portion of the input (e.g., one of the inputs) directed to the slider 2108, the electronic device 101a provides visual feedback to the user that the input is directed to the slider 2108. For example, the electronic device 101a displays the slider 2108 with a visual spacing from the container 2109. Further, in response to detecting the user's gaze 2101b directed to an element within container 2109, electronic device 101a updates the positioning of container 2109 to display container 2109 closer to the viewpoint of the user than the position at which container 2109 was displayed in fig. 21A prior to detecting the start of the input directed to slider 2108. In some embodiments, the portion of the input directed to the slider 2108 shown in fig. 21B corresponds to selecting the indicator 2112 of the slider 2108 for adjustment, but does not yet include a portion of the input for adjusting the indicator 2112 (and thus the value controlled by the slider 2108).
Fig. 21C illustrates an example of the electronic device 101a redirecting the selection input and/or adjusting the indicator 2112 of the slider 2108 in response to detecting movement included in the input. For example, after providing the portion of the selection input described above with reference to fig. 21B, in response to detecting that the movement of the user's hand is less than a threshold amount (e.g., a speed amount, a distance amount, a time amount) corresponding to the distance from option 2104 to a boundary of container 2102, electronic device 101a redirects the selection input from option 2104 to option 2106, as will be described in more detail below. In some embodiments, in response to detecting movement of the user's hand while providing input directed to the slider 2108, the electronic device 101a updates the indicator 2112 of the slider 2108 according to the detected movement, as will be described in more detail below.
In some implementations, after detecting the portion of the selection input directed to option 2104 described above with reference to fig. 21B (e.g., via hand 2103a, or hand 2103b and gaze 2101c, or hand 2103c and gaze 2101c), electronic device 101a detects movement of the hand (e.g., 2103a, 2103b, or 2103c) in a direction from option 2104 toward option 2106. In some implementations, the amount of movement (e.g., amount of speed, amount of distance, amount of duration) corresponds to an amount less than the distance between option 2104 and the boundary of container 2102. In some embodiments, the electronic device 101a maps the size of the container 2102 to a predetermined amount of movement (e.g., of the hand 2103a, 2103b, or 2103c providing the input) corresponding to the distance from option 2104 to the boundary of container 2102. In some embodiments, after detecting the portion of the selection input directed to option 2104 described above with reference to fig. 21B, the electronic device 101a detects the user's gaze 2101c directed at option 2106. In some embodiments, for the direct input provided by hand 2103a, in response to detecting movement of hand 2103a, electronic device 101a redirects the selection input from option 2104 to option 2106. In some embodiments, for the indirect input provided by hand 2103b, in response to detecting that the user's gaze 2101c is directed at option 2106 and/or detecting movement of hand 2103b, electronic device 101a redirects the selection input from option 2104 to option 2106. In some embodiments, for the air gesture input provided by hand 2103c, in response to detecting that the user's gaze 2101c is directed at option 2106 and/or detecting movement of hand 2103c, electronic device 101a redirects the selection input from option 2104 to option 2106.
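One way to read the redirection behavior above is as a decision based on how much lateral hand movement has occurred relative to a movement budget that the device maps to the distance from the option to the container boundary, together with gaze for indirect and air gesture input. The following sketch is an assumption-laden illustration, not the patent's algorithm.

    enum RetargetOutcome { case stayOnCurrent, retargetToSibling, leftContainer }

    func evaluateLateralMovement(handMovement: Double,    // metres moved sideways so far
                                 movementBudget: Double,  // maps to option-to-boundary distance
                                 isDirectInput: Bool,
                                 gazeOnSibling: Bool) -> RetargetOutcome {
        if handMovement > movementBudget {
            return .leftContainer                 // treated as moving outside the region
        }
        if isDirectInput {
            // Direct input retargets based on the hand movement alone.
            return handMovement > 0 ? .retargetToSibling : .stayOnCurrent
        }
        // Indirect and air gesture input also consider where the user is looking.
        return (gazeOnSibling || handMovement > 0) ? .retargetToSibling : .stayOnCurrent
    }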
Fig. 21C illustrates an example of redirecting a selection input between different elements within a respective container 2102 of the user interface. In some embodiments, the electronic device 101a redirects a selection input from an element in one container to an element in another container in response to detecting that the user's gaze is directed to the other container. For example, if option 2106 were in a different container than option 2104, the selection input would be redirected from option 2104 to option 2106 in response to the above-described movement of the user's hand and the user's gaze being directed to the container of option 2106 (e.g., gaze directed to option 2106).
In some embodiments, if, upon detecting the portion of the selection input, the electronic device 101a detects that the user's gaze is directed outside of the container 2102, it is still possible to redirect the selection input to one of the options 2104 or 2106 within the container 2102. For example, in response to detecting that the user's gaze 2101c is directed at option 2106 (after having been directed away from container 2102), electronic device 101a redirects an indirect selection input or an air gesture input from option 2104 to option 2106, as shown in fig. 21C. As another example, in response to detecting movement of the hand 2103a as described above while detecting a direct selection input, the electronic device 101a redirects the input from option 2104 to option 2106, regardless of where the user is looking.
In some embodiments, in response to redirecting the selection input from option 2104 to option 2106, electronic device 101a updates option 2104 to indicate that the selection input no longer points to option 2104 and updates option 2106 to indicate that the selection input points to option 2106. In some embodiments, updating option 2104 includes displaying option 2104 in a color that does not correspond to selection (e.g., the same color as was used to display option 2104 in fig. 21A prior to detecting the beginning of the selection input) and/or displaying option 2104 without visual separation from container 2102. In some embodiments, updating option 2106 includes displaying option 2106 in a color that indicates the selection input is directed to option 2106 (e.g., different from the color used to display option 2106 in fig. 21B while the input was directed to option 2104) and/or displaying option 2106 with a visual spacing from container 2102.
In some implementations, the amount of visual separation between option 2106 and container 2102 corresponds to an amount of further input required to cause selection of option 2106 (such as additional motion of hand 2103a to provide direct selection, additional motion of hand 2103c to provide air gesture selection, or continuation of pinch gesture with hand 2103b to provide indirect selection). In some embodiments, when the selection input is redirected from option 2104 to option 2106, the progress of the portion of the selection input provided by hands 2103a, 2103b and/or 2103c to option 2104 before the selection input is redirected away from option 2104 is applicable to the selection of option 2106, as described in more detail below with reference to method 2200.
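A compact way to picture carrying partial progress across a redirect is shown below. Whether progress is preserved or reset is described in this document as varying by embodiment, so the sketch exposes it as a flag; all names are hypothetical and not taken from the patent.

    struct SelectionTarget {
        var identifier: Int
        var progress: Double    // 0...1 fraction of the selection already completed
    }

    func redirectSelection(from old: inout SelectionTarget,
                           to new: inout SelectionTarget,
                           preservingProgress: Bool) {
        // Carry the partial press/pinch progress over, or start the new target from zero.
        new.progress = preservingProgress ? old.progress : 0
        // The old target no longer shows selection feedback.
        old.progress = 0
    }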
In some embodiments, the electronic device 101a redirects the selection input from option 2104 to option 2106 without detecting another initiation of a selection input directed to option 2106. For example, the selection input is redirected even though the electronic device 101a does not detect the start of a new selection gesture with one of the hands 2103a, 2103b, or 2103c directed specifically to option 2106.
In some embodiments, in response to detecting movement of hand 2103d, 2103b or 2103c while detecting an input directed to slider 2108, electronic device 101a does not redirect the input. In some embodiments, the electronic device 101a updates the positioning of the indicator 2112 of the slider 2108 according to movement of the hand (e.g., speed, distance, duration of movement) providing input directed to the slider 2108, as illustrated in fig. 21C.
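The contrast drawn here, where movement adjusts a slider but retargets (rather than adjusts) a selectable option, could be expressed as a simple dispatch on the element type. The sketch below is illustrative; the enum, the normalisation of slider values to 0...1, and the scaling factor are assumptions.

    enum InteractiveElement {
        case selectableOption(id: Int)
        case slider(value: Double)      // current value normalised to 0...1
    }

    func applyLateralMovement(_ element: InteractiveElement,
                              movement: Double,               // metres of hand movement
                              metresPerFullRange: Double = 0.2) -> InteractiveElement {
        switch element {
        case .slider(let value):
            // Sliders consume the movement: update the current value, never redirect.
            let updated = min(max(value + movement / metresPerFullRange, 0), 1)
            return .slider(value: updated)
        case .selectableOption:
            // Selectable options leave the movement to the retargeting logic
            // sketched earlier; the element itself is not adjusted.
            return element
        }
    }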
Fig. 21D illustrates an example of the electronic device 101a canceling selection of option 2106 in response to further movement of the hand 2103a, 2103b, or 2103c providing the selection input directed to option 2106 and/or the user's gaze 2101e being directed away from the container 2102 and/or option 2106. For example, the electronic device 101a cancels the direct selection input provided by the hand 2103a in response to detecting that the upward or lateral movement of the hand 2103a corresponds to an amount greater than the distance between option 2106 and the boundary of container 2102. As another example, electronic device 101a cancels the air gesture input provided by hand 2103c in response to detecting that hand 2103c moves upward or sideways by an amount corresponding to a distance greater than the distance between option 2106 and the boundary of container 2102 and/or in response to detecting that the user's gaze 2101e is directed outside of container 2102 or that the user's gaze 2101d is directed away from option 2106 but within container 2102. In some implementations, the electronic device 101a does not cancel the direct selection input or the air gesture selection input in response to downward movement of the hand 2103a or 2103c, respectively, because the downward movement may correspond to an intent to select option 2106 rather than an intent to cancel the selection. As another example, the electronic device 101a cancels the indirect selection input in response to detecting that the hand 2103b moves upward, downward, or laterally by an amount greater than the amount corresponding to the distance between option 2106 and the boundary of container 2102 and/or in response to detecting that the user's gaze 2101e is directed outside of container 2102 or that the user's gaze 2101d is directed away from option 2106 but within container 2102. As described above, in some embodiments, the amount of movement required to cancel the input is mapped to a corresponding amount of movement of hand 2103a, regardless of the sizes of option 2106 and container 2102.
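The cancellation rules in this figure depend on the direction of the movement, on whether the modality uses gaze, and on whether the movement exceeds the distance to the container boundary. The sketch below restates those rules under assumed names and simplifications; it is not the patent's logic verbatim.

    enum MoveDirection { case up, down, left, right }

    func shouldCancelSelection(direction: MoveDirection,
                               movementAmount: Double,
                               distanceToContainerBoundary: Double,
                               isIndirectInput: Bool,       // indirect also cancels on downward movement
                               cancelsOnGazeExit: Bool,     // true for indirect and air gesture input
                               gazeLeftContainer: Bool) -> Bool {
        if cancelsOnGazeExit && gazeLeftContainer {
            return true
        }
        let movedPastBoundary = movementAmount > distanceToContainerBoundary
        switch direction {
        case .up, .left, .right:
            return movedPastBoundary
        case .down:
            // Downward motion looks like a press for direct and air gesture input,
            // so only indirect input cancels on it.
            return isIndirectInput && movedPastBoundary
        }
    }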
In some embodiments, in response to canceling the selection input directed to option 2106, the electronic device 101a updates the display of option 2106 to indicate that the electronic device 101a is no longer receiving a selection input directed to option 2106. For example, the electronic device 101a displays option 2106 in a color that does not correspond to a selection input (e.g., the same color as the color of option 2106 in fig. 21A before the selection input was detected) and/or displays option 2106 without visual spacing from the container 2102. In some embodiments, if the user's gaze 2101d is still directed toward the container 2102, the electronic device 101a displays the container 2102 at a location proximate to the user's point of view, as shown in fig. 21D. In some implementations, if the user's gaze 2101e is directed away from the container 2102, the electronic device 101a displays the container 2102 at a location farther from the user's point of view (e.g., displays the container without a virtual shadow, at a smaller size, and/or with stereoscopic depth information corresponding to a location farther from the user's point of view).
In some embodiments, in response to the same amount and/or direction of movement of the hand 2103d, 2103b, or 2103c described above being detected as part of the input directed to the slider 2108, the electronic device 101a continues to adjust the positioning of the indicator 2112 of the slider 2108 without canceling the input directed to the slider 2108. In some embodiments, the electronic device 101a updates the positioning of the indicator 2112 of the slider 2108 according to the direction and amount of the movement (e.g., speed, distance, duration of movement, etc.).
In some embodiments, if instead of detecting a user request to cancel the selection input as shown in fig. 21D, the electronic device 101a detects a continuation of the selection input directed to option 2106, the electronic device 101a selects option 2106. For example, fig. 21E shows the electronic device 101a detecting continuation of the selection input shown in fig. 21C. In some embodiments, the electronic device 101a detects continuation of the direct selection input, including detecting further movement of the hand 2103a in a direction from option 2106 toward container 2102 by an amount corresponding to at least the amount of visual spacing between option 2106 and container 2102, such that option 2106 reaches container 2102. In some implementations, the electronic device 101a detects continuation of the air gesture selection input, including further movement of the hand 2103c in a direction from option 2106 toward container 2102 by an amount corresponding to at least the amount of visual spacing between option 2106 and container 2102 while the user's gaze 2101c is directed at option 2106 and while the hand 2103c is within a threshold distance (e.g., 0.1, 0.2, 0.3, 0.5, 1, 2, or 3 cm) of the air gesture element 2114, such that option 2106 reaches container 2102. In some embodiments, the electronic device 101a detects continuation of the indirect input, including the hand 2103b remaining in the pinch hand shape for a time corresponding to option 2106 reaching container 2102 while the user's gaze 2101c is directed at option 2106. Thus, in some embodiments, the electronic device 101a selects option 2106 in response to the continuation of the selection input after redirecting the selection input from option 2104 to option 2106, without detecting an additional initiation of the selection input.
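Continuation after a redirect completes the selection from the progress already accumulated, without a fresh initiation of the gesture. A minimal sketch of that accumulation, with hypothetical names:

    struct PendingSelection {
        var accumulatedPush: Double   // movement already applied toward the container
        var requiredPush: Double      // visual separation that must be closed for selection
    }

    // Returns true when the continued input completes the selection.
    func continueSelection(_ pending: inout PendingSelection,
                           additionalPush: Double) -> Bool {
        pending.accumulatedPush += additionalPush
        return pending.accumulatedPush >= pending.requiredPush
    }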
Fig. 22A-22K are flowcharts illustrating a method 2200 of redirecting input from one user interface element to another user interface element in response to detecting movement included in the input, according to some embodiments. In some embodiments, the method 2200 is performed at a computer system (e.g., computer system 101 in fig. 1, such as a tablet, smart phone, wearable computer, or head-mounted device) that includes a display generating component (e.g., display generating component 120 in fig. 1, 3, and 4) (e.g., heads-up display, touch screen, projector, etc.) and one or more cameras (e.g., cameras pointing downward toward the user's hand (e.g., color sensors, infrared sensors, and other depth sensing cameras) or cameras pointing forward from the user's head). In some embodiments, method 2200 is managed by instructions stored in a non-transitory computer-readable storage medium and executed by one or more processors of a computer system, such as one or more processors 202 of computer system 101 (e.g., control unit 110 in fig. 1A). Some of the operations in method 2200 are optionally combined and/or the order of some of the operations are optionally changed.
In some embodiments, the method 2200 is performed at an electronic device (e.g., 101 a) in communication with a display generation component (e.g., 120 a) and one or more input devices (e.g., 314 a) (e.g., a mobile device (e.g., a tablet, smart phone, media player, or wearable device) or computer). In some embodiments, the display generating component is a display integrated with the electronic device (optionally a touch screen display), an external display such as a monitor, projector, television, or hardware component for projecting a user interface or making the user interface visible to one or more users (optionally integrated or external), or the like. In some embodiments, the one or more input devices include a device capable of receiving user input (e.g., capturing user input, detecting user input, etc.) and transmitting information associated with the user input to the electronic device. Examples of input devices include a touch screen, a mouse (e.g., external), a touch pad (optionally integrated or external), a remote control device (e.g., external), another mobile device (e.g., separate from the electronic device), a handheld device (e.g., external), a controller (e.g., external), a camera, a depth sensor, an eye tracking device, and/or a motion sensor (e.g., a hand tracking device, a hand motion sensor), and so forth. In some implementations, the electronic device communicates with a hand tracking device (e.g., one or more cameras, depth sensors, proximity sensors, touch sensors (e.g., touch screen, touch pad)). In some embodiments, the hand tracking device is a wearable device, such as a smart glove. In some embodiments, the hand tracking device is a handheld input device, such as a remote control or a stylus.
In some embodiments, such as in fig. 21A, an electronic device (e.g., 101A) displays (2202 a) a user interface via a display generation component (e.g., 120 a) that includes a respective region (e.g., 2102) including a first user interface element (e.g., 2104) and a second user interface element (e.g., 2106). In some implementations, the respective region is a user interface element, such as a container, a back panel, or a (e.g., application) window. In some embodiments, the first user interface element and the second user interface element are selectable user interface elements that, when selected, cause the electronic device to perform an action associated with the selected user interface element. For example, selection of the first user interface element and/or the second user interface element causes the electronic device to launch an application, open a file, initiate and/or stop playback of content with the electronic device, navigate to a corresponding user interface, change a setting of the electronic device, initiate communication with the second electronic device, or perform another action in response to the selection.
In some implementations, such as in fig. 21B, upon display of the user interface, the electronic device (e.g., 101a) detects (2202b) a first input directed to a first user interface element (e.g., 2104) in the respective region (e.g., 2102) via the one or more input devices (e.g., 314a). In some implementations, the first input is one or more inputs that are a subset of a sequence of inputs for causing selection of the first user interface element (e.g., not a complete sequence of inputs for causing selection of the first user interface element). For example, detecting indirect input corresponding to an input selecting the first user interface element includes detecting, via an eye tracking device in communication with the electronic device, that a gaze of the user is directed toward the first user interface element while detecting, via a hand tracking device, that the user performs a pinch gesture in which a thumb of the user touches a finger of the same hand as the thumb and then the thumb and finger are moved away from each other (e.g., such as described with reference to methods 800, 1200, 1400, and/or 1800), in response to which the electronic device selects the first user interface element. In this example, detecting the first input (e.g., as an indirect input) corresponds to detecting that the user's gaze is directed toward the first user interface element while detecting that the thumb touches a finger on the thumb's hand (e.g., without detecting that the thumb and finger move away from each other). As another example, detecting direct input corresponding to an input selecting the first user interface element includes detecting that the user "presses" the first user interface element with his hand and/or an extended finger a predetermined distance (e.g., 0.5, 1, 2, 3, 4, 5, or 10 centimeters) while the hand is in a pointing hand shape (e.g., a hand shape in which one or more fingers are extended and one or more fingers are curled toward the palm), such as described with reference to methods 800, 1200, 1400, and/or 1800. In this example, detecting the first input (e.g., as a direct input) corresponds to detecting that the user "presses" the first user interface element a distance less than the predetermined distance while the user's hand is in the pointing hand shape (e.g., not detecting that the "press" input continues to a point where the first user interface element has been pressed the predetermined distance and is thus selected). In some embodiments, if, after the above-described pinch, the device detects movement of the pinching hand toward the first user interface element sufficient to push the first user interface element back by the above-described predetermined distance (e.g., movement corresponding to "pushing" the first user interface element), the first user interface element may alternatively be selected using indirect input. In such embodiments, the first input is optionally movement of the hand while maintaining the pinch hand shape, but with insufficient movement.
In some embodiments, such as in fig. 21B, in response to detecting the first input directed to the first user interface element (e.g., 2104), the electronic device (e.g., 101 a) modifies (2202 c) an appearance of the first user interface element (e.g., 2104) to indicate that further input directed to the first user interface element (e.g., 2104) will cause selection of the first user interface element (e.g., 2104). In some implementations, modifying the appearance of the first user interface element includes displaying the first user interface element in a different color, pattern, text pattern, translucency, and/or line pattern than that used to display the first user interface element prior to detecting the first input. In some embodiments, it is possible to modify different visual characteristics of the first user interface element. In some embodiments, the user interface and/or user interface elements are displayed in (e.g., the user interface is a three-dimensional environment and/or is displayed within) a three-dimensional environment (e.g., a computer-generated reality (CGR) environment, such as a Virtual Reality (VR) environment, a Mixed Reality (MR) environment, or an Augmented Reality (AR) environment, etc.) that is generated by, displayed by, or otherwise made viewable by the device. In some implementations, modifying the appearance of the first user interface element includes updating a positioning of the first user interface element in the user interface, such as moving the first user interface element away from a point of view of a user in the three-dimensional environment (e.g., from a vantage point within the three-dimensional environment from which the three-dimensional environment is presented via a display generation component in communication with the electronic device) and/or reducing a spacing between the first user interface element and a back plate toward which the first user interface element moves when "pushed".
In some implementations, such as in fig. 21B, while the first user interface element (e.g., 2104) is displayed (and not yet selected) in a modified appearance, the electronic device (e.g., 101 a) detects (2202 d) the second input (e.g., via the hand 2103a, 2103B, or 2103C and/or gaze 2102C in fig. 21C) via the one or more input devices (e.g., 314). In some implementations, the second input includes movement of a predefined portion of the user (e.g., the user's hand) away from the first user interface element in a predetermined direction (e.g., left, right, up, away from the first user interface element toward the user's torso). For example, if the first input is a user looking at the first user interface element while touching his thumb to a finger on the thumb's hand (e.g., without moving the thumb away from the finger) (e.g., the first input is an indirect input), the second input is a movement of the user's hand (e.g., left, right, up, away from the first user interface element toward the user's torso) while the thumb continues to touch the finger. As another example, if the first input is a user "pressing" the first user interface element with his hand and/or extended finger while the hand is in a pointing hand shape (e.g., the first input is a direct input), the second input is a movement of the hand while maintaining a pointing hand shape or while in a different hand shape (e.g., left, right, up, away from the first user interface element toward the torso of the user).
In some embodiments, such as in fig. 21C, in response to detecting the second input, in accordance with a determination that the second input includes a movement (2202 e) corresponding to movement away from the first user interface element (e.g., 2104), in accordance with a determination that the movement corresponds to movement within a respective region (e.g., 2102) of the user interface, the electronic device (e.g., 101 a) relinquishes (2202 f) selection of the first user interface element (e.g., 2104) and modifies an appearance of the second user interface element (e.g., 2106) to indicate that further input directed to the second user interface element (e.g., 2106) will cause selection of the second user interface element (e.g., 2106). In some implementations, the electronic device modifies the appearance of the first user interface element to no longer indicate that further input directed to the first user interface element will cause selection of the first user interface element. For example, the electronic device restores the appearance of the first user interface element (e.g., one or more characteristics thereof) to the appearance of the first user interface element prior to the detection of the first input. In some implementations, if the first input is an indirect input, the movement corresponds to movement within a respective region of the user interface if the distance, speed, duration, etc., meets one or more criteria (e.g., is less than a predetermined threshold). In some implementations, if the first input is a direct input, the movement corresponds to movement within a respective region of the user interface if the user's hand remains within the respective region of the user interface during the movement (e.g., or within a region of the three-dimensional environment between a boundary of the respective region of the user interface and a viewpoint of the user in the three-dimensional environment). In some implementations, modifying the appearance of the second user interface element includes displaying the second user interface element in a different color, pattern, text pattern, translucency, and/or line pattern than that used to display the second user interface element prior to detection of the second input. In some embodiments, it is possible to modify a different visual characteristic of the second user interface element. In some embodiments, modifying the appearance of the second user interface element includes updating a positioning of the second user interface element in the user interface, such as moving the second user interface element away from a viewpoint of a user in the three-dimensional environment. In some implementations, in response to detecting a third input (e.g., a continuation of an input sequence corresponding to an input selecting a user interface element, such as a remainder of a movement required to previously select a first user interface element) subsequent to the second input, the electronic device selects the second user interface element and performs an action associated with the second user interface element. In some implementations, the electronic device updates an appearance of the second user interface element without detecting initiation of the second selection input after detecting the second input to indicate that further input directed to the second user interface element will cause selection of the second user interface element. 
For example, if the first input is an indirect input, the electronic device updates the appearance of the second user interface element without detecting initiation of another pinch gesture (e.g., the user continues to touch his thumb to another finger instead of moving the thumb away and pinching again). As another example, if the first input is a direct input, the electronic device updates the appearance of the second user interface element without detecting that the user moves his hand away from the first user interface element and the second user interface element (e.g., toward the user's point of view) and presses his hand again toward the second user interface element. In some embodiments, when the electronic device updates the appearance of the second user interface element, progress toward selecting the first user interface element is translated into progress toward selecting the second user interface element. For example, if the first input is an indirect input, and if the electronic device selects the respective user interface element when a pinch hand shape in which the thumb and finger are touching is maintained for a predetermined time threshold (e.g., 0.1, 0.5, 1, 2, 3, or 5 seconds), then when the electronic device updates the appearance of the second user interface element, the electronic device does not restart counting the time for which the pinch hand shape has been maintained. As another example, if the first input is a direct input, and if the electronic device selects a respective user interface element when the respective user interface element is "pushed" a threshold distance (e.g., 0.5, 1, 2, 3, 5, or 10 centimeters), then movement of the user's hand in the direction from the user's point of view toward the second user interface element during the first input and the second input is counted toward meeting the threshold distance. In some embodiments, the electronic device resets the criteria for selecting the second user interface element after updating the appearance of the second user interface element. For example, if the first input is an indirect input, the electronic device does not select the second user interface element unless and until the pinch hand shape has been maintained for the full threshold time measured from the time the electronic device updates the appearance of the second user interface element. As another example, if the first input is a direct input, the electronic device does not select the second user interface element unless and until, after the electronic device updates the appearance of the second user interface element, the electronic device detects that the user "pushes" the second user interface element the threshold distance.
In some embodiments, such as in fig. 21D, in response to detecting the second input, in accordance with a determination that the second input includes movement (2202e) corresponding to movement away from the first user interface element (e.g., 2104), in accordance with a determination that the movement corresponds to movement in the first direction outside of a respective region (e.g., 2102) of the user interface, the electronic device (e.g., 101a) relinquishes (2202g) selection of the first user interface element (e.g., 2104) without modifying an appearance of the second user interface element (e.g., 2106). In some implementations, the electronic device modifies the appearance of the first user interface element to no longer indicate that further input directed to the first user interface element will cause selection of the first user interface element. For example, the electronic device restores the appearance of the first user interface element (e.g., one or more characteristics thereof) to the appearance of the first user interface element prior to the detection of the first input. In some implementations, if the first input is an indirect input, the movement corresponds to movement outside of a respective region of the user interface if the distance, speed, duration, etc. meets one or more criteria (e.g., is greater than a predetermined threshold). In some implementations, if the first input is a direct input, the movement corresponds to movement outside of the respective region of the user interface if the user's hand moves outside of the respective region of the user interface during the movement (e.g., or outside of a region of the three-dimensional environment between a boundary of the respective region of the user interface and a viewpoint of the user in the three-dimensional environment).
The above-described manner of relinquishing selection of the first user interface element in response to detecting the second input provides an efficient manner of reducing accidental user input while allowing modification of the target element of the input, which simplifies interaction between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which also reduces power usage and extends battery life of the electronic device by enabling the user to use the electronic device more quickly and efficiently while reducing errors in use and by reducing the likelihood that the electronic device will perform unexpected and then reverse operations.
In some embodiments, such as in fig. 21D, in response to detecting the second input, and in accordance with a determination that the movement corresponds to movement (2204a) in a second direction (e.g., other than the first direction, such as downward) outside of a respective area (e.g., 2102) of the user interface, in accordance with a determination that the first input includes input provided by a predefined portion (e.g., 2103b) of the user (e.g., one or more fingers, hands, arms, head) while the predefined portion of the user is farther than a threshold distance (e.g., 5, 10, 15, 20, 30, or 50 centimeters) from a location corresponding to the first user interface element (e.g., 2104) and farther than the threshold distance from a virtual touchpad or input indication according to method 1800 (e.g., the input is an indirect input), or when the electronic device does not display the virtual touchpad or input indication according to method 1800, the electronic device forgoes (2204b) selection of the first user interface element (e.g., 2104). In some embodiments, the movement in the second direction is a movement of the predefined portion of the user. In some embodiments, in response to detecting a downward movement of the predefined portion of the user, the electronic device relinquishes the selection of the first user interface element if the second input is an indirect input. In some embodiments, the electronic device also forgoes selection of the second user interface element and the modification of the appearance of the second user interface element. In some implementations, the electronic device modifies an appearance of the first user interface element to not indicate that further input will cause selection of the first user interface element. In some implementations, the electronic device maintains an appearance of the first user interface element to indicate that further input will cause selection of the first user interface element. In some implementations, in accordance with a determination that the first input includes an input provided by the predefined portion of the user while the predefined portion (e.g., the hand) of the user is farther than the threshold distance from a location corresponding to the first user interface element and the predefined portion of the user is within the threshold distance of the virtual touchpad or input indication according to method 1800, the electronic device selects the first user interface element in accordance with the second input. In some implementations, in accordance with a determination that the first input includes input provided by the predefined portion of the user while the predefined portion (e.g., the hand) of the user is farther than the threshold distance from a location corresponding to the first user interface element and the predefined portion of the user is within the threshold distance of the virtual touchpad or input indication according to method 1800, the electronic device relinquishes selection of the first user interface element.
In some embodiments, such as in fig. 21E, in response to detecting the second input, and in accordance with a determination that the movement corresponds to movement (2204a) in the second direction (e.g., different from the first direction, such as downward) outside of a respective area (e.g., 2102) of the user interface, in accordance with a determination that the first input includes an input (e.g., the input is a direct input) provided by a predefined portion (e.g., 2103a) of the user while the predefined portion (e.g., 2103a) of the user is closer than the threshold distance to a location corresponding to the first user interface element (e.g., 2106), the electronic device (e.g., 101a) selects (2204c) the first user interface element (e.g., 2106) in accordance with the second input. In some embodiments, the electronic device does not select the first user interface element unless and until the second input meets one or more criteria. For example, the one or more criteria include a criterion that is met when the predefined portion of the user "pushes" the first user interface element a predefined distance (e.g., 0.5, 1, 2, 3, 4, 5, or 10 centimeters) away from the user's point of view (and/or toward the back plate of the first user interface element).
The above-described manner of selecting the first user interface element in response to movement in the second direction, discarding the selection of the first user interface element if the input is detected when the predefined portion of the user is further than the first user interface element by a threshold distance, and selecting the first user interface element in accordance with the second input if the first input is detected when the predefined portion of the user is closer than the threshold distance from the location corresponding to the first user interface element, provides an intuitive manner of canceling or not canceling the user input in accordance with the direction of movement and the distance between the predefined portion of the user and the first user interface element when the input is received, which simplifies interaction between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which provides the user with additional control options without cluttering the user interface with additional display controls.
In some implementations, such as in fig. 21D, the first input includes input provided by a predefined portion (e.g., one or more fingers, hands, arms, head) of the user (e.g., 2103a, 2103b), and selection of the first user interface element (e.g., 2104, 2106) is forgone in accordance with a determination that the movement of the second input corresponds to movement in the first direction (e.g., up, left, or right) outside of a respective area (e.g., 2102) of the user interface, regardless of whether the predefined portion (e.g., 2103a, 2103b) of the user is farther than a threshold distance (e.g., 0.5, 1, 2, 3, 5, 10, or 50 centimeters) from a location corresponding to the first user interface element (e.g., 2104, 2106) during the first input (e.g., for indirect input or when interacting with a virtual touchpad or input indication according to method 1800) or closer than the threshold distance (e.g., for direct input) (2206). In some embodiments, whether the first input is a direct input or an indirect input, detecting movement of the second input upward, leftward, or rightward causes the electronic device to forgo selection of the first user interface element. In some embodiments, in response to detecting downward movement of the second input, the electronic device forgoes the selection of the first user interface element if the first input is an indirect input, but does not forgo the selection of the first user interface element if the first input is a direct input.
The foregoing manner of dropping selection of the first user interface element in response to movement in the first direction, regardless of whether the predefined portion of the user is within a threshold distance of the first user interface element during the first input, provides an efficient and consistent manner of canceling selection of the first user interface element, which simplifies interactions between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which provides the user with additional control options without cluttering the user interface with additional display controls.
In some embodiments, such as in fig. 21D, upon display of the user interface, the electronic device (e.g., 101a) detects (2208a), via the one or more input devices, a third input directed to a third user interface element (e.g., 2108) in the respective region, wherein the third user interface element (e.g., 2108) is a slider element, and the third input includes a moving portion for controlling the slider element (e.g., 2108). In some embodiments, the slider element includes multiple indications of values for a respective characteristic controlled by the slider and an indication of a current value of the slider element that the user is able to move by providing an input (such as the third input) directed to the slider element. For example, the slider element controls a value for a characteristic (such as a setting of the electronic device, such as playback volume, brightness, or a time threshold for entering a sleep mode if no input is received). In some implementations, the third input includes a selection of the slider element (e.g., an indication of the current value of the slider element) that causes the electronic device to update the indication of the current value of the slider element according to the moving portion of the third input. In some embodiments, the third input is a direct input that includes detecting movement of the user's hand while the hand is within a predetermined threshold distance (e.g., 0.5, 1, 2, 3, 4, 5, 10, 15, or 30 centimeters) of the slider element and while the hand is in a pinch hand shape (e.g., a hand shape in which the thumb is touching another finger of the hand). In some embodiments, the third input is an indirect input comprising detecting a pinch gesture made by the user's hand while the user's gaze is directed toward the slider element, and subsequent movement of the hand while the hand is in the pinch hand shape. In some implementations, the third input includes interaction with a virtual touchpad or input indication according to method 1800 while the user's gaze is directed to the slider element.
In some embodiments, such as in fig. 21D, in response to detecting the third input directed to the third user interface element (e.g., 2108), the electronic device (e.g., 101 a) modifies (2208 b) an appearance of the third user interface element (e.g., 2108) to indicate that further input directed to the third user interface element (e.g., 2108) will cause further control of the third user interface element (e.g., 2108) and updates the third user interface element (e.g., 2108) according to a moving portion of the third input. In some implementations, modifying the appearance of the third user interface element includes modifying a size, color, or shape of the slider element (e.g., an indication of a current value of the slider element) and/or updating a location of the slider element (e.g., an indication of a current value of the slider element) to move the slider element (e.g., an indication of a current value of the slider element) closer to a viewpoint of a user in the three-dimensional environment. In some implementations, updating the third user interface element according to the moving portion of the third input includes updating an indication of a current value of the slider element according to a magnitude and/or direction of the moving portion of the third input. For example, in response to an upward, downward, rightward, or leftward movement, the electronic device moves an indication of the current value of the slider element upward, downward, rightward, or leftward, respectively. As another example, in response to movement having a relatively high speed, a relatively long duration, and/or a relatively large distance, the electronic device moves an indication of the current value of the slider element a relatively large amount, and in response to movement having a relatively low speed, a relatively short duration, and/or a relatively small distance, the electronic device moves an indication of the current value of the slider element a relatively small amount. In some embodiments, movement of the slider is limited to one axis of movement (e.g., left to right, top to bottom), and the electronic device updates the current value of the slider only in response to movement along the axis of the adjustable slider. For example, in response to a rightward movement of the pointing slider that is adjustable from left to right, the electronic device adjusts the current value of the slider to the right, but in response to an upward movement of the pointing slider, the electronic device discards updating the current value of the slider (or updates the current value of the slider based only on the leftward or rightward component of the movement).
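The axis constraint described at the end of this paragraph, where only the component of movement along the slider's adjustable axis changes the current value, might look like the following sketch. The normalised value range and the metres-per-full-range scaling are assumptions, not part of the patent.

    struct Slider1D {
        var value: Double        // current value normalised to 0...1
        let horizontal: Bool     // true if the slider adjusts left-to-right
    }

    func update(_ slider: inout Slider1D,
                movementX: Double,                 // rightward hand movement, metres
                movementY: Double,                 // upward hand movement, metres
                metresPerFullRange: Double = 0.2) {
        // Keep only the component of the movement along the slider's axis;
        // the perpendicular component is ignored.
        let alongAxis = slider.horizontal ? movementX : movementY
        slider.value = min(max(slider.value + alongAxis / metresPerFullRange, 0), 1)
    }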
In some implementations, such as in fig. 21D, while the third user interface element (e.g., 2108) is displayed with the modified appearance and while the third user interface element (e.g., 2108) is updated according to the moving portion of the third input (e.g., and before detecting a termination of the third input, or a corresponding input, such as release of the pinch hand shape, that terminates the updating of the third user interface element), the electronic device (e.g., 101 a) detects (2208 c), via the one or more input devices, a fourth input. In some implementations, the fourth input includes a moving portion.
In some implementations, such as in fig. 21D, in response to detecting the fourth input, in accordance with a determination that the fourth input includes movement (2208 d) corresponding to movement away from the third user interface element (e.g., in the first direction, in the second direction, in any direction), the electronic device (e.g., 101 a) maintains (2208 e) the modified appearance of the third user interface element (e.g., 2108) to indicate that further input directed to the third user interface element (e.g., 2108) will cause further control of the third user interface element.
In some implementations, if the third input is an indirect input or an input associated with a virtual touchpad or input indication according to method 1800, whether the movement corresponds to movement outside of the respective region of the user interface is based on the speed, duration, and/or distance of the movement. In some embodiments, if the third input is a direct input, the movement corresponds to movement outside of the respective region of the user interface if the movement includes moving the user's hand outside of the respective region of the user interface (e.g., or outside of a three-dimensional volume extruded from the respective region of the user interface toward the viewpoint of the user in the three-dimensional environment).
In some embodiments, such as in fig. 21D, in response to detecting the fourth input, in accordance with a determination that the fourth input includes movement (2208 d) corresponding to movement away from the third user interface element (e.g., 2108) (e.g., in the first direction, in the second direction, in any direction), the electronic device (e.g., 101 a) updates (2208 f) the third user interface element (e.g., 2108) in accordance with the movement of the fourth input, regardless of whether the movement of the fourth input corresponds to movement outside of a respective region (e.g., 2109) of the user interface. In some implementations, the electronic device updates the slider element (e.g., the indication of the current value of the slider element) according to the movement of the predefined portion unless and until termination of the third input is detected. For example, termination of the third input includes detecting that the user moves his thumb away from his finger to release the pinch hand shape and/or moves away from the virtual touchpad or input indication according to method 1800. In some implementations, the electronic device does not stop directing the input to the slider element in response to a portion of the movement of the input corresponding to movement outside of the respective region of the user interface.
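Purely as an illustrative sketch (the session type, its properties, and the normalized value range are assumptions rather than the disclosed implementation), the behavior of continuing to control the slider until the pinch is released might be modeled as follows:

```swift
// Hedged sketch: once the slider is engaged, leaving the region does not cancel.
struct SliderValueModel { var value: Double }   // normalized 0...1

struct SliderDragSession {
    var slider = SliderValueModel(value: 0.5)
    var isEngaged = true   // set when the third input engages the slider

    mutating func handle(pinchHeld: Bool, valueDelta: Double, handInsideRegion: Bool) {
        guard isEngaged else { return }
        guard pinchHeld else {
            // Termination: the thumb moves away from the finger, releasing the pinch.
            isEngaged = false
            return
        }
        // While the pinch is held, movement keeps driving the slider even if the
        // hand has left the respective region of the user interface.
        _ = handInsideRegion   // deliberately not used as a cancellation condition
        slider.value = min(1, max(0, slider.value + valueDelta))
    }
}

var session = SliderDragSession()
session.handle(pinchHeld: true, valueDelta: 0.1, handInsideRegion: false)  // still applied
print(session.slider.value)   // 0.6
session.handle(pinchHeld: false, valueDelta: 0.2, handInsideRegion: true)  // pinch released
print(session.isEngaged)      // false
```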
The above-described manner of updating the slider element in response to movement corresponding to movement away from the third user interface element outside the respective region of the user interface provides an efficient way of refining the value of the slider element with multiple movement inputs, which simplifies interactions between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient by performing additional operations when a set of conditions has been met without requiring further user input.
In some implementations, such as in fig. 21C, the moving portion of the third input includes an input (2210 a) having a respective magnitude provided by a predefined portion (e.g., one or more fingers, hands, arms, head) of the user (e.g., 2103d, 2103b, 2103C).
In some implementations, such as in fig. 21C, updating the third user interface element (e.g., 2108) according to the moving portion of the third input includes (2210 b), in accordance with a determination that the predefined portion of the user (e.g., 2103d, 2103b, 2103 c) moves at a first speed during the moving portion of the third input, moving the third user interface element (e.g., 2108) by a first amount that is determined based on the first speed of the predefined portion of the user (e.g., 2103d, 2103b, 2103 c) and the respective magnitude of the moving portion of the third input.

In some implementations, such as in fig. 21D, updating the third user interface element (e.g., 2108) according to the moving portion of the third input includes (2210 b), in accordance with a determination that the predefined portion of the user (e.g., 2103d, 2103b, 2103 c) moves at a second speed, greater than the first speed, during the moving portion of the third input, moving the third user interface element (e.g., 2108) by a second amount, greater than the first amount, that is determined based on the second speed of the predefined portion of the user (e.g., 2103b, 2103c, 2103 d) and the respective magnitude of the moving portion of the third input, wherein, for the respective magnitude of the moving portion of the third input, the second amount of movement of the third user interface element (e.g., 2108) is greater than the first amount of movement of the third user interface element (e.g., 2108). In some implementations, in response to detecting a relatively high-speed movement of the predefined portion of the user, the electronic device moves the indication of the current value of the slider element by a relatively large amount for a given distance of movement of the predefined portion of the user. In some implementations, in response to detecting a relatively low-speed movement of the predefined portion of the user, the electronic device moves the indication of the current value of the slider element by a relatively small amount for a given distance of movement of the predefined portion of the user. In some implementations, if the speed of the movement changes over time as the movement is detected, the electronic device similarly changes the magnitude of movement of the indication of the current value of the slider element as the movement input is received.
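The speed-dependent mapping described above can be illustrated with a hedged sketch; the gain curve and its constants below are arbitrary assumptions chosen only to show that the same hand distance yields a larger slider displacement at higher speed:

```swift
// Illustrative speed-to-gain mapping; the constants are arbitrary assumptions.
func sliderDisplacement(forHandDistance distance: Double,
                        atHandSpeed speed: Double) -> Double {
    let minGain = 0.5                 // slow movement: precise, small adjustments
    let maxGain = 3.0                 // fast movement: coarse, large adjustments
    let gain = min(maxGain, max(minGain, speed / 100.0))   // speed in points per second
    return distance * gain
}

// The same 40-point hand movement produces different slider displacements:
print(sliderDisplacement(forHandDistance: 40, atHandSpeed: 30))    // 20.0 (slow hand)
print(sliderDisplacement(forHandDistance: 40, atHandSpeed: 400))   // 120.0 (fast hand)
```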
The above-described manner of updating the slider element by an amount corresponding to the speed of movement of the predefined portion of the user provides an efficient way of quickly updating the slider element by a relatively large amount and accurately updating the slider element by a relatively small amount, which simplifies interaction between the user and the electronic device by providing additional functionality to the user without cluttering the user interface with additional controls, enhances operability of the electronic device, and makes the user-device interface more efficient.
In some implementations, such as in fig. 21D, movement of the second input is provided by a corresponding movement of a predefined portion (e.g., one or more fingers, hands, arms, head) of the user (e.g., 2103a, 2103b, 2103 c) (2212 a).
In some implementations, such as in fig. 21D, in accordance with a determination that the respective region of the user interface (e.g., 2102) has a first size, in accordance with a determination that the respective movement of the predefined portion of the user has a first magnitude, the movement of the second input corresponds to movement outside of the respective region of the user interface (e.g., 2102) (2212 b). In some implementations, the magnitude of the movement of the second input depends on the speed, distance, and duration of the moving portion of the second input. For example, a relatively high speed, a relatively large distance, and/or a relatively long duration correspond to a relatively high magnitude of movement of the moving portion of the second input, while a relatively low speed, a relatively small distance, and/or a relatively short duration correspond to a relatively low magnitude of movement of the moving portion of the second input.
In some implementations, in accordance with a determination that the respective region of the user interface has a second size that is different from the first size (e.g., if the container 2102 in fig. 21D has a size that is different from the size shown in fig. 21D), in accordance with a determination that the respective movement of the predefined portion (e.g., 2103a, 2103b, 2103 c) of the user has a first magnitude, the movement of the second input corresponds to movement (2212 c) outside of the respective region (e.g., 2102) of the user interface. In some implementations, the magnitude of the moving portion of the second input corresponds to or does not correspond to movement outside of the respective region of the user interface, regardless of the size of the respective region of the user interface. In some implementations, the electronic device maps movement of respective magnitudes of the predefined portions of the user to movement corresponding to movement outside of the respective areas of the user interface, regardless of the size of the respective areas of the user interface.
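A minimal sketch of this size-independent test follows; the magnitude formula and the threshold value are assumptions made for illustration, not values taken from the disclosure. The point is only that the threshold does not depend on the dimensions of the respective region:

```swift
// Illustrative only; the magnitude formula and threshold are assumptions.
struct IndirectMovement {
    var distance: Double    // points
    var peakSpeed: Double   // points per second
    var duration: Double    // seconds
}

// Higher speed, longer duration, and larger distance all increase the magnitude.
func movementMagnitude(of movement: IndirectMovement) -> Double {
    movement.distance + 0.1 * movement.peakSpeed + 20 * movement.duration
}

// The threshold is a fixed value: it is not derived from the size of the
// respective region, so the same movement corresponds (or does not correspond)
// to movement outside the region regardless of how large the container is.
func correspondsToMovementOutsideRegion(_ movement: IndirectMovement) -> Bool {
    movementMagnitude(of: movement) > 120.0
}
```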
The above-described manner in which the magnitude of the moving portion of the second input corresponds or does not correspond to movement outside the respective region, regardless of the size of the respective region, provides a consistent manner of canceling or not canceling input directed to elements in the respective region of the user interface, which simplifies interaction between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which provides the user with additional control options without cluttering the user interface with additional display controls.
In some embodiments, such as in fig. 21B, detecting the first input includes detecting (e.g., via an eye tracking device of the one or more input devices in communication with the electronic device) that a gaze (e.g., 2101 a) of a user of the electronic device (e.g., 101 a) is directed to the first user interface element (e.g., 2104) (2214 a). In some implementations, if the first input is an indirect input or an input involving a virtual touchpad or input indicator according to the method 1800, the first input includes a gaze of a user of the electronic device directed to the first user interface element. In some implementations, if the first input is a direct input, the first input does not include the user's gaze pointing toward the first user interface element when the first input is detected (e.g., but according to method 1000, the first user interface element is in the attention area).
In some implementations, such as in fig. 21C, detecting the second input includes detecting movement corresponding to movement away from the first user interface element (e.g., 2104) while the user's gaze (e.g., 2101 c) is no longer directed toward the first user interface element (e.g., 2104) (2214 b). In some implementations, the second input is detected while the user's gaze is directed to the first user interface element. In some implementations, the second input is detected while the user's gaze is directed to the second user interface element. In some implementations, the second input is detected while the user's gaze is directed to the respective region of the user interface (e.g., other than the first user interface element). In some implementations, the second input is detected while the user's gaze is directed to a location in the user interface other than the respective region of the user interface.
In some implementations, such as in fig. 21C, the following operations are performed while the user's gaze (e.g., 2101 c) is not directed to the first user interface element (e.g., 2104): forgoing the selection of the first user interface element (e.g., 2104) and modifying the appearance of the second user interface element (e.g., 2106) to indicate that further input directed to the second user interface element (e.g., 2106) will cause selection of the second user interface element (2214 c). In some implementations, the electronic device forgoes the selection of the first user interface element and modifies the appearance of the second user interface element while the user's gaze is directed to the first user interface element. In some embodiments, the electronic device forgoes the selection of the first user interface element and modifies the appearance of the second user interface element while the user's gaze is directed to the second user interface element. In some implementations, the electronic device forgoes the selection of the first user interface element and modifies the appearance of the second user interface element while the user's gaze is directed to the respective region of the user interface (e.g., other than the first user interface element). In some embodiments, the electronic device forgoes the selection of the first user interface element and modifies the appearance of the second user interface element while the user's gaze is directed to a location in the user interface other than the respective region of the user interface. In some implementations, in accordance with a determination that the user's gaze is not directed toward the first user interface element when the first input is initially detected, the electronic device forgoes updating the first user interface element to indicate that further input will cause selection of the first user interface element (e.g., the first user interface element previously had input focus but loses input focus when the user's gaze moves away from the first user interface element). In some implementations, even if the user's gaze is not directed to the first user interface element when the second input is received, the electronic device directs further input to the second user interface element in response to movement of the second input within the respective region of the user interface.
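As a hedged sketch (the type and case names are assumptions), the retargeting behavior can be summarized as follows: the pending selection follows the movement of the input, and the user's gaze leaving the first user interface element does not by itself cancel the in-progress input:

```swift
// Minimal sketch of retargeting an in-progress selection; names are assumed.
enum FocusTarget { case firstElement, secondElement }

struct RetargetableSelection {
    private(set) var focus: FocusTarget = .firstElement

    // Movement of the in-progress input away from the first element retargets
    // the pending selection to the second element; the user's gaze is not
    // required to remain on the first element for this to happen.
    mutating func handleMovementAwayFromFirst(gazeOnFirstElement: Bool) {
        _ = gazeOnFirstElement   // deliberately ignored for retargeting
        focus = .secondElement   // selection of the first element is forgone
    }
}

var state = RetargetableSelection()
state.handleMovementAwayFromFirst(gazeOnFirstElement: false)
print(state.focus)   // secondElement
```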
The above-described manner of forgoing selection of the first user interface element and modifying the appearance of the second user interface element when the user's gaze is away from the first user interface element provides an efficient manner of redirecting the first input when gaze is moved away from the first user interface element (e.g., when looking at a different user interface element to which the corresponding input is to be directed), which simplifies interaction between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which provides the user with additional control options without cluttering the user interface with additional display controls.
In some embodiments, such as in fig. 21B, detecting the first input includes detecting (e.g., via an eye tracking device of the one or more input devices in communication with the electronic device) that a gaze (e.g., 2101 a) of a user of the electronic device (e.g., 101 a) is directed to a respective region (e.g., 2102) of the user interface (2216 a). In some implementations, if the first input is an indirect input or an input involving a virtual touchpad or input indicator according to the method 1800, the first input includes the gaze of the user of the electronic device being directed to the respective region of the user interface. In some implementations, if the first input is a direct input, the first input does not include the user's gaze being directed at the respective region of the user interface when the first input is detected (e.g., but according to method 1000, the respective region of the user interface is in the attention area).
In some implementations, such as in fig. 21B, while displaying the first user interface element (e.g., 2104) with the modified appearance and before detecting the second input, the electronic device (e.g., 101 a) detects (2216 b), via the one or more input devices, that the user's gaze (e.g., 2101 b) is directed to a second region (e.g., 2109) of the user interface (e.g., to a third user interface element in the second region) that is different from the respective region (e.g., 2102). In some implementations, the second region of the user interface includes one or more third user interface elements. In some implementations, the second region of the user interface is a container, a back panel, or a (e.g., application) window.
In some embodiments, such as in fig. 21B, in response to detecting that the user's gaze (e.g., 2101B) is directed to the second region (e.g., 2109) (e.g., a third user interface element in the second region), in accordance with a determination that the second region (e.g., 2109) includes a third (e.g., selectable, interactive, etc.) user interface element (e.g., 2108), the electronic device (e.g., 101 a) modifies (2216 c) the appearance of the third user interface element (e.g., 2108) to indicate that further input directed to the third user interface element (e.g., 2108) will cause interaction with the third user interface element (e.g., directing input focus to the second region and/or the third user interface element). In some implementations, modifying the appearance of the third user interface element includes displaying the third user interface element in a different color, pattern, text pattern, translucency, and/or line pattern than that used to display the third user interface element prior to detecting that the user's gaze is directed to the second region. In some embodiments, it is possible to modify different visual characteristics of the third user interface element. In some implementations, modifying the appearance of the third user interface element includes updating a positioning of the third user interface element in the user interface, such as moving the third user interface element toward or away from a viewpoint of a user in the three-dimensional environment. In some implementations, the electronic device also updates the appearance of the first user interface element to no longer indicate that further input will cause selection of the first user interface element and relinquishes selection of the first user interface element. In some embodiments, if the second area does not include any selectable and/or interactive user interface elements, the electronic device maintains an updated appearance of the first user interface element to indicate that further input will cause selection of the first user interface element.
The above-described manner of modifying the appearance of the third user interface element to indicate that further input will cause selection of the third user interface element in response to detecting that the user's gaze is directed to the second region provides an efficient manner of redirecting selection input from one element to another element even when the elements are in different regions of the user interface, which simplifies interaction between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which provides the user with additional control options without cluttering the user interface with additional display controls.
In some embodiments, such as in fig. 21B, the first input includes movement of a predefined portion (e.g., one or more fingers, hands, arms, eyes, head) of a user (e.g., 2103a, 2103b, 2103 c) of the electronic device (e.g., 101 a) through space in an environment of the electronic device (e.g., 101 a) without the predefined portion of the user (e.g., 2103a, 2103b, 2103 c) contacting a physical input device (e.g., touch pad, touch screen, etc.) (2218). In some implementations, the electronic device detects the first input using one or more of: an eye tracking device that tracks the user's gaze without physical contact with the user; a hand tracking device that tracks the user's hand without making physical contact with the user; and/or a head tracking device that tracks the user's head without physically contacting the user. In some implementations, the input devices for detecting the first input include one or more cameras, range sensors, and the like. In some embodiments, an input device is incorporated into a device housing that is in contact with the user of the electronic device when receiving the first user input, but the orientation of the housing relative to the portion of the user in contact with the housing does not affect detection of the first input. For example, eye tracking devices, hand tracking devices, and/or head tracking devices are incorporated into head-mounted electronic devices.
The above-described manner of detecting the first input without the predefined portion of the user contacting the physical input device provides an efficient manner of detecting the input without the user having to manipulate the physical input device, which simplifies interactions between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which provides the user with additional control options without cluttering the user interface with additional display controls.
In some embodiments, such as in fig. 21B, the first input includes a pinch gesture (2220) performed by a hand (e.g., 2103a, 2103 b) of a user of the electronic device (e.g., 101 a). In some embodiments, the electronic device detects the pinch gesture using a hand tracking device in communication with the electronic device. In some embodiments, detecting the pinch gesture includes detecting that the user touches his thumb to another finger on the same hand as the thumb. In some embodiments, detecting the pinch gesture of the first input further includes detecting that the user moves the thumb away from the finger. In some embodiments, the first input does not include detecting that the user moves the thumb away from the finger (e.g., the pinch hand shape is maintained at the end of the first input).
The above-described manner in which the first input includes pinch gestures performed by the user's hand provides an efficient way of detecting input without the user having to manipulate a physical input device, which simplifies interactions between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which provides the user with additional control options without cluttering the user interface with additional display controls.
In some embodiments, such as in fig. 21B, the first input includes movement of a finger of a hand of a user (e.g., 2103a, 2103B, 2103 c) of the electronic device (e.g., 101 a) through space in an environment of the electronic device (e.g., 101 a) (2222). In some embodiments, the electronic device detects a finger of a user's hand via a hand tracking device in communication with the electronic device. In some embodiments, the first input includes detecting movement of a finger through space in the environment of the electronic device while the hand is in a pointing hand shape in which the finger protrudes away from the torso and/or palm of the hand of the user and one or more other fingers curl toward the palm of the hand of the user. In some embodiments, the movement of the finger is in a direction from the user's point of view toward the first user interface element. In some embodiments, the movement of the finger is a movement caused by movement of a user's hand that includes the finger. In some embodiments, the movement of the finger is independent of movement from the rest of the hand. For example, the movement of the finger is movement centered on the phalangeal joint of the hand. In some embodiments, the palm of the user's hand is substantially stationary while the fingers are moving.
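For illustration only, the pinch and pointing hand shapes described above could be approximated from tracked joint data roughly as follows; the joint model, extension metric, and thresholds are all assumptions rather than the disclosed hand-tracking pipeline:

```swift
// Hedged sketch of hand-shape classification from assumed joint data.
struct Point3D {
    var x, y, z: Double
    func distance(to p: Point3D) -> Double {
        let dx = x - p.x, dy = y - p.y, dz = z - p.z
        return (dx * dx + dy * dy + dz * dz).squareRoot()
    }
}

struct TrackedHand {
    var thumbTip: Point3D
    var indexTip: Point3D
    var indexExtension: Double          // 0 = fully curled, 1 = fully extended
    var otherFingersExtension: Double   // averaged over the remaining fingers
}

// Pinch hand shape: the thumb touches (or nearly touches) another finger.
func isPinching(_ hand: TrackedHand, touchThreshold: Double = 0.01) -> Bool {
    hand.thumbTip.distance(to: hand.indexTip) < touchThreshold   // meters, assumed
}

// Pointing hand shape: the index finger extends away from the palm while the
// other fingers curl toward the palm.
func isPointing(_ hand: TrackedHand) -> Bool {
    hand.indexExtension > 0.8 && hand.otherFingersExtension < 0.3
}
```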
The above-described manner in which the first input includes movement of a finger of a user's hand provides an efficient way of detecting input without the user having to manipulate a physical input device, which simplifies interaction between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which provides the user with additional control options without cluttering the user interface with additional display controls.
In some implementations, such as in fig. 21C, in response to detecting the second input, in accordance with a determination that the second input includes movement corresponding to movement away from the first user interface element (e.g., 2104), in accordance with a determination that the movement corresponds to movement within a respective region (e.g., 2102) of the user interface, the electronic device (e.g., 101 a) modifies (2224) an appearance of the first user interface element (e.g., 2104) to indicate that further input is no longer to be directed to the first user interface element (e.g., 2104) (e.g., the first user interface element no longer has input focus). In some implementations, the electronic device modifies the appearance of the first user interface element to indicate that further input will no longer be directed to the first user interface element because further input will be directed to the second user interface element. In some implementations, the electronic device modifies one or more characteristics of the appearance of the first user interface element to be the same as the one or more characteristics of the appearance of the first user interface element prior to detection of the first input. For example, before detecting the first input, the electronic device displays the first user interface element as having the first color and/or separated from a corresponding region of the user interface by a corresponding distance (e.g., 1, 2, 3, 5, 10, 15, 20, or 30 centimeters). In this example, upon detecting the first input, the electronic device displays the first user interface element as having the second color separated from the respective region of the user interface by a distance less than the respective distance. In this example, in response to detecting the second input, the electronic device displays the first user interface element as having the first color separated from a respective region of the user interface by a respective distance. In some implementations, in response to the second input, the electronic device displays the first user interface element as having the first color without being separated from a corresponding region of the user interface.
The above-described manner of modifying the appearance of the first user interface element to indicate that further input will no longer be directed to the first user interface element provides an efficient way of indicating to the user which user interface element has the input focus of the electronic device, which simplifies interactions between the user and the electronic device, enhances operability of the electronic device, makes the user-device interface more efficient, and provides enhanced visual feedback to the user.
In some embodiments, such as in fig. 21C-21D, in accordance with a determination that the second input is provided by a predefined portion (e.g., one or more fingers, hands, arms, head) of the user (e.g., 2103b, 2103 c) while the predefined portion of the user is farther than a threshold distance (e.g., 1, 2, 3, 5, 10, 15, 30, or 50 centimeters) from a location corresponding to the respective region (e.g., 2102) of the user interface (e.g., the second input is an indirect input and/or an input related to a virtual touchpad or input indication in accordance with the method 1800) (2226 a), the movement of the second input corresponds to movement within the respective region (e.g., 2102) of the user interface, such as in fig. 21C, when the second input does not satisfy one or more first criteria, and the movement of the second input corresponds to movement outside of the respective region (e.g., 2102) of the user interface (2226 b), such as in fig. 21D, when the second input satisfies the one or more first criteria. In some embodiments, the one or more first criteria are based on a speed, duration, and/or distance of the movement of the second input. In some implementations, the electronic device translates the movement into a corresponding movement magnitude based on the speed, duration, and/or distance of the movement of the second input. For example, relatively high movement speeds, relatively long durations, and/or relatively large distances correspond to relatively large movement magnitudes, while relatively low movement speeds, relatively short durations, and/or relatively small distances correspond to relatively small movement magnitudes. In some implementations, the electronic device compares the movement magnitude to a predetermined threshold distance (e.g., a predetermined distance independent of the size of the respective region of the user interface, or a distance equal to a dimension (e.g., width, height) of the respective region of the user interface). In some embodiments, the one or more first criteria are met when the movement magnitude exceeds the predetermined threshold distance.
In some embodiments, such as in fig. 21C-21D, in accordance with a determination that the second input is provided by the predefined portion (e.g., one or more fingers, hands, arms, head) of the user (e.g., 2103 a) while the predefined portion of the user (e.g., 2103 a) is closer than the threshold distance (e.g., 1, 2, 3, 5, 10, 15, 30, or 50 centimeters) to the location corresponding to the respective region (e.g., 2102) of the user interface (e.g., the second input is a direct input provided by the predefined portion of the user) (2226 a), the movement of the second input corresponds to movement within the respective region of the user interface, such as in fig. 21C, when the second input does not satisfy one or more second criteria different from the first criteria, and the movement of the second input corresponds to movement outside of the respective region of the user interface, such as in fig. 21D, when the second input satisfies the one or more second criteria. In some embodiments, the one or more second criteria are met when the predefined portion of the user moves from a location within the respective region of the user interface (e.g., within a three-dimensional volume extruded from it) to a location outside of the respective region of the user interface (e.g., outside of the three-dimensional volume extruded from it). In some embodiments, the one or more second criteria are based on a distance of the movement of the second input, and not on a speed or duration of the movement of the second input. In some embodiments, if the second input is an indirect input, the electronic device determines whether the movement of the second input corresponds to movement within the respective region based on the speed, duration, and/or distance of the movement, and if the second input is a direct input, the electronic device determines whether the movement of the second input corresponds to movement within the respective region based on the position of the predefined portion of the user in the three-dimensional environment during the second input.
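A hedged sketch of the two tests follows; the geometry type, the axis-aligned volume, and the threshold parameter are assumptions used only to contrast the position-based check for direct input with the magnitude-based check for indirect input:

```swift
// Sketch of the two cancellation tests; types, names, and thresholds are assumed.
struct HandPosition { var x, y, z: Double }

struct RegionVolume {
    // Axis-aligned bounds of the volume extruded from the respective region.
    var minX, maxX, minY, maxY, minZ, maxZ: Double
    func contains(_ p: HandPosition) -> Bool {
        (minX...maxX).contains(p.x) &&
        (minY...maxY).contains(p.y) &&
        (minZ...maxZ).contains(p.z)
    }
}

enum InputMode { case direct, indirect }

func movementCorrespondsToOutsideRegion(mode: InputMode,
                                        handPosition: HandPosition,
                                        regionVolume: RegionVolume,
                                        movementMagnitude: Double,
                                        magnitudeThreshold: Double) -> Bool {
    switch mode {
    case .direct:
        // Direct input (hand closer than the threshold distance): decided by
        // where the hand is relative to the extruded region volume, not by the
        // speed or duration of the movement.
        return !regionVolume.contains(handPosition)
    case .indirect:
        // Indirect input (hand farther than the threshold distance): decided by
        // a magnitude derived from speed, duration, and/or distance of movement.
        return movementMagnitude > magnitudeThreshold
    }
}
```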
The above-described manner of applying different criteria to determine whether movement of the second input corresponds to movement outside the respective region of the user interface based on a distance between a predefined portion of the user and a location corresponding to the respective region of the user interface provides an intuitive way of canceling or not canceling input to the first user interface element for various input types, which simplifies interaction between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which provides the user with additional control options without cluttering the user interface with additional display controls.
In some implementations, such as in fig. 21B, modifying the appearance of the first user interface element (e.g., 2104) to indicate that further input directed to the first user interface element (e.g., 2104) will cause selection of the first user interface element (e.g., 2104) includes moving the first user interface element away from a viewpoint of a user in the three-dimensional environment (2228 a). In some embodiments, the electronic device displays the first user interface element as not separated from the respective region of the user interface unless and until a detection is made that the user's gaze is directed to the respective region of the user interface and/or the respective hand shape of the user's hand (e.g., a pre-pinch hand shape in which the user's thumb is within a threshold distance (e.g., 0.1, 0.2, 0.3, 0.5, 1, 2, 3, 4, or 5 centimeters) of the other finger of the user's hand, or a directed hand shape in which one or more fingers are extended and one or more fingers are curled toward the palm of the hand). In some implementations, in response to detecting that the user's gaze is directed to a respective region of the user interface and/or a respective hand shape of the user's hand, the electronic device displays the first user interface element (e.g., and the second user interface element) as separate from the respective region of the user interface by one or more of: the first user interface element (e.g., and the second user interface element) is moved toward the viewpoint of the user and/or the corresponding region of the user interface is moved away from the user. In some implementations, in response to detecting a selection input (e.g., a first input) directed to a first user interface element, the electronic device moves the first user interface element away from a point of view of a user (e.g., and toward a respective region of the user interface).
In some embodiments, such as in fig. 21C, modifying the appearance of the second user interface element (e.g., 2106) to indicate that further input directed to the second user interface element (e.g., 2106) will cause selection of the second user interface element (e.g., 2106) includes moving the second user interface element (e.g., 2106) away from the viewpoint of the user in the three-dimensional environment (2228 b). In some embodiments, the electronic device displays the second user interface element as not separated from the respective region of the user interface unless and until a detection is made that the user's gaze is directed to the respective region of the user interface and/or the respective hand shape of the user's hand (e.g., a pre-pinch hand shape in which the user's thumb is within a threshold distance (e.g., 0.1, 0.2, 0.3, 0.5, 1, 2, 3, 4, or 5 centimeters) of the other finger of the user's hand, or a directed hand shape in which one or more fingers are extended and one or more fingers are curled toward the palm of the hand). In some implementations, in response to detecting that the user's gaze is directed to a respective region of the user interface and/or a respective hand shape of the user's hand, the electronic device displays the second user interface element (e.g., and the first user interface element) as separate from the respective region of the user interface by one or more of: the second user interface element (e.g., and the first user interface element) is moved toward the viewpoint of the user and/or the corresponding region of the user interface is moved away from the user. In some implementations, in response to detecting a selection input (e.g., a second input) directed to the second user interface element, the electronic device moves the second user interface element away from the user's point of view (e.g., and toward a respective region of the user interface).
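As an illustrative sketch only (the separation value and the linear mapping are assumptions), this visual feedback can be thought of as a function from selection progress to the element's separation from its backing region, with the element moving away from the viewpoint as progress increases:

```swift
// Hedged sketch: separation from the backing region shrinks with progress.
struct ElementPlacement {
    var restingSeparation: Double   // e.g., meters in front of the region when idle

    // progress is 0 (no selection input yet) through 1 (selection threshold reached)
    func separation(forSelectionProgress progress: Double) -> Double {
        let clamped = min(1, max(0, progress))
        return restingSeparation * (1 - clamped)
    }
}

let button = ElementPlacement(restingSeparation: 0.02)
print(button.separation(forSelectionProgress: 0.0))   // 0.02, fully separated
print(button.separation(forSelectionProgress: 0.5))   // 0.01, halfway pushed back
print(button.separation(forSelectionProgress: 1.0))   // 0.0, reaches the region
```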
The above-described manner of moving the first user interface element or the second user interface element away from the user's point of view to indicate that further input directed to the first user interface element or the second user interface element will cause selection of the first user interface element or the second user interface element provides an efficient manner of indicating progress toward selection of the first user interface element or the second user interface element, which simplifies interaction between the user and the electronic device, enhances operability of the electronic device, makes the user-device interface more efficient, and provides enhanced visual feedback to the user.
In some implementations, while displaying the second user interface element (e.g., 2106) with the modified appearance to indicate that further input directed to the second user interface element (e.g., 2106) will cause selection of the second user interface element (e.g., 2106) (e.g., in response to the second input), such as in fig. 21C, the electronic device (e.g., 101 a) detects (2230 a), via the one or more input devices, a third input, such as in fig. 21E. In some embodiments, the third input is a selection input, such as a direct selection input, an indirect selection input, or an input related to interaction with a virtual touchpad or input indication according to method 1800.
In some embodiments, such as in fig. 21E, in response to detecting the third input, in accordance with a determination that the third input corresponds to a further (e.g., selection) input directed to the second user interface element (e.g., 2106), the electronic device (e.g., 101 a) selects (2230 b) the second user interface element (e.g., 2106) in accordance with the third input. In some embodiments, the third input is a continuation of the first input. For example, if the first input is a portion of a direct selection input that includes detecting that the user's hand is "pushing" the first option toward the respective region of the user interface, the third input is further movement of the user's hand toward the respective region of the user interface that is directed toward the second user interface element (e.g., to "push" the second user interface element toward the respective region of the user interface). As another example, if the first input is a portion of an indirect selection input that includes detecting a pinch gesture made by the user's hand and maintaining the pinch hand shape, the third input is a continuation of maintaining the pinch hand shape. As another example, if the first input is a portion of an indirect selection input that includes detecting movement of the user's hand toward the first user interface element while in a pinch hand shape, the third input is a continuation of the movement (e.g., toward the second user interface element) while the hand maintains the pinch hand shape. In some embodiments, selecting the second user interface element includes performing an action associated with the second user interface element, such as launching an application, opening a file, initiating and/or stopping playback of content with the electronic device, navigating to a corresponding user interface, changing a setting of the electronic device, or initiating communication with a second electronic device.
The above-described manner of selecting the second user interface element in response to the third input detected after the second input provides an efficient manner of selecting the second user interface element after moving the input focus from the first user interface element to the second user interface element, which simplifies interaction between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which provides the user with additional control options without cluttering the user interface with additional display controls.
In some implementations, such as in fig. 21A, selection of the first user interface element (e.g., 2104) requires an input (2232 a) associated with a first magnitude (e.g., a magnitude of time, distance, intensity, etc.) before the first input is detected. In some implementations, selection of the first user interface element in response to a direct selection input requires detecting movement of the user's finger and/or hand (e.g., while the user's hand is in a pointing hand shape) by a predetermined distance magnitude (e.g., 0.5, 1, 2, 3, 4, 5, or 10 centimeters), such as the distance between the first user interface element and the respective region of the user interface. In some implementations, selection of the first user interface element in response to an indirect selection input requires detecting that the user maintains the pinch hand shape for a predetermined amount of time (e.g., 0.1, 0.2, 0.3, 0.5, 1, 2, 3, 4, 5, or 10 seconds) after performing the pinch gesture. In some implementations, selection of the first user interface element in response to an indirect selection input requires detecting that the user moves his hand toward the first user interface element by a predetermined distance (e.g., 0.5, 1, 2, 3, 5, or 10 centimeters) while in the pinch hand shape.
In some embodiments, such as in fig. 21B, the first input includes an input of a second magnitude less than the first magnitude (2232 b). In some embodiments, if the first input is a direct input, the movement of the hand is less than the predetermined distance magnitude. In some embodiments, if the first input is an indirect input, the hand maintains the pinch hand shape for less than the predetermined amount of time. In some embodiments, if the first input is an indirect input, the hand moves toward the first user interface element while in the pinch hand shape by a distance less than the predetermined distance magnitude.
In some implementations, such as in fig. 21A, selection of the second user interface element (e.g., 2106) requires an input (2232 c) associated with a third magnitude (e.g., a magnitude of time, distance, intensity, etc.) before the second input is detected. In some implementations, the third magnitude is a magnitude of movement required to select the second user interface element with a respective selection input. In some embodiments, the third magnitude is the same as the first magnitude. In some implementations, the first magnitude is different from the third magnitude.
In some implementations, such as in fig. 21C, in response to detecting the second input, selection of the second user interface element (e.g., 2106) requires a further input (2232 d) associated with the third magnitude reduced by the second magnitude of the first input. For example, if selection of the second user interface element by indirect input requires maintaining the pinch hand shape for 1 second and the first input includes maintaining the pinch hand shape for 0.3 seconds, the electronic device selects the second user interface element in response to detecting that the pinch hand shape is maintained for an additional 0.7 seconds (e.g., after detecting the first input and/or the second input). In some implementations, the second input is associated with a respective magnitude, and selection of the second user interface element requires a further input associated with the third magnitude reduced by the sum of the second magnitude of the first input and the respective magnitude of the second input. For example, if selection of the second user interface element by direct input requires the user's hand to move 2 centimeters away from the user's point of view (e.g., toward the second user interface element), the first input includes moving 0.5 centimeters away from the user's point of view (e.g., toward the first user interface element), and the second input includes moving 0.3 centimeters away from the user's point of view (e.g., toward the second user interface element), then the further input requires moving 1.2 centimeters away from the user's point of view (e.g., toward the second user interface element).
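The carry-over of already-provided input magnitude can be sketched as simple bookkeeping; the numbers below mirror the 2-centimeter example above, while the type and property names are assumptions:

```swift
// Hedged sketch of carrying accumulated input magnitude across retargeting.
struct SelectionProgress {
    var requiredMagnitude: Double        // e.g., centimeters of hand movement
    var accumulatedMagnitude: Double = 0

    mutating func accumulate(_ magnitude: Double) {
        accumulatedMagnitude += magnitude
    }

    // Magnitude still needed from further input to select the focused element.
    var remainingMagnitude: Double {
        max(0, requiredMagnitude - accumulatedMagnitude)
    }
}

var progress = SelectionProgress(requiredMagnitude: 2.0)
progress.accumulate(0.5)   // first input, toward the first element
progress.accumulate(0.3)   // second input, movement that retargets to the second element
print(progress.remainingMagnitude)   // 1.2
```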
The above-described manner of requiring further input associated with a magnitude that is less than the third magnitude provides an efficient way of quickly selecting the second user interface element after the second input is detected, which provides the user with additional control options without cluttering the user interface with additional display controls.
In some embodiments, such as in fig. 21B, the first input includes a selection initiation portion followed by a second portion, and the appearance of the first user interface element (e.g., 2104) is modified to indicate that further input directed to the first user interface element (e.g., 2104) will cause selection of the first user interface element (e.g., 2104) based on the first input including the selection initiation portion (2234 a). In some embodiments, if the first input is an indirect selection input, detecting the initiation portion of the first input includes detecting a pinch gesture performed by a user's hand, and detecting the second portion of the first input includes detecting that the user maintains a pinch hand shape and/or moving the hand while maintaining the pinch hand shape. In some implementations, if the first input is a direct selection input, detecting the initiation portion of the first input includes detecting that the user has moved his hand from a position between the first user interface element and the user's point of view (e.g., while making a pointing hand shape) to a position corresponding to the first user interface element in the three-dimensional environment (e.g., while making a pointing hand shape). In some implementations, if the first input is an input related to a virtual touchpad or input indication according to method 1800, detecting the initiating portion includes detecting that the user moved a finger to a position of the virtual touchpad and/or input indication, and detecting the second portion includes detecting that the user continued to move his finger through the virtual touchpad or input indication (e.g., toward the first user interface element and/or away from the user's point of view).
In some embodiments, such as in fig. 21C, the appearance of the second user interface element (e.g., 2106) is modified to indicate that further input directed to the second user interface element (e.g., 2106) will cause selection of the second user interface element (e.g., 2106) without the electronic device (e.g., 101 a) detecting another selection initiation portion after the selection initiation portion included in the first input (2234 b). In some implementations, the appearance of the second user interface element is modified in response to detecting the second input (e.g., after detecting the first input, including the initiation portion of the first input, without detecting a subsequent initiation portion of a selection input).
In some implementations, such as in fig. 21B, when the second user interface element (e.g., 2106) is not displayed with the modified appearance (e.g., before the first and second inputs are detected, or after the second user interface element ceases to be displayed with the modified appearance after the first and second inputs are detected), the electronic device (e.g., 101 a) detects (2234 c), via the one or more input devices, a third input directed to the second user interface element (e.g., 2106).
In some embodiments, such as in fig. 21C, in response to detecting the third input (2234 d), in accordance with a determination that the third input includes a selection initiation portion (e.g., the third input is a selection input), the electronic device (e.g., 101 a) modifies (2234 e) an appearance of the second user interface element (e.g., 2106) to indicate that further input directed to the second user interface element (e.g., 2106) will cause selection of the second user interface element. In some implementations, the electronic device modifies an appearance of the second user interface element in response to detecting the initiation of the selection input to indicate that further input will cause selection of the second user interface element.
In some embodiments, such as in fig. 21A, in response to detecting the third input (2234 d), in accordance with a determination that the third input does not include a selection initiation portion (e.g., the third input is not a selection input, or includes a second portion of a selection input but does not include an initiation portion of a selection input), the electronic device (e.g., 101 a) forgoes (2234 f) modifying the appearance of the second user interface element (e.g., 2106). In some embodiments, unless the electronic device detects an initiation portion of the selection input (e.g., before receiving a second portion of the selection input, or before receiving movement of the input (e.g., of the first input) within the respective region of the user interface), the electronic device does not modify the appearance of the second user interface element to indicate that further input will cause selection of the second user interface element.
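As a final illustrative sketch (names and structure are assumptions), the behavior of requiring the initiation portion only once resembles a small state machine in which movement can retarget an armed selection but cannot arm one by itself:

```swift
// Hedged sketch: an initiation portion arms the selection exactly once.
enum Target { case first, second }

struct PendingSelection {
    private(set) var armed = false
    private(set) var target: Target? = nil

    // e.g., a pinch gesture (indirect) or the hand reaching the element (direct)
    mutating func receiveInitiation(toward t: Target) {
        armed = true
        target = t
    }

    // A second portion (movement) retargets the pending selection without
    // requiring another initiation portion.
    mutating func receiveMovement(toward t: Target) {
        guard armed else { return }   // no initiation yet: appearance is not modified
        target = t
    }
}

var selection = PendingSelection()
selection.receiveMovement(toward: .second)    // ignored: no initiation portion yet
selection.receiveInitiation(toward: .first)   // first element shows "will select"
selection.receiveMovement(toward: .second)    // retargets to the second element
print(selection.target == .second)            // true
```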
The above-described manner of modifying the appearance of the second user interface element in the event that no additional initiation portion is detected after the initiation portion of the first input is detected to indicate that further input will cause selection of the second user interface element provides an efficient manner of redirecting the selection input (e.g., redirecting the second user interface element from the first user interface element) without starting the selection input from the beginning, which simplifies interaction between the user and the electronic device, enhances operability of the electronic device, and makes the user-device interface more efficient, which provides the user with additional control options without cluttering the user interface with additional display controls.
In some embodiments, aspects/operations of methods 800, 1000, 1200, 1400, 1600, 1800, 2000, and/or 2200 may be interchanged, substituted, and/or added between the methods. For example, the three-dimensional environment of methods 800, 1000, 1200, 1400, 1600, 1800, 2000, and/or 2200, the direct input in methods 800, 1000, 1400, 1600, 2000, and/or 2200, the indirect input in methods 800, 1000, 1200, 1400, 1600, 2000, and/or 2200, and/or the air gesture input in methods 1800, 2000, and/or 2200 are optionally interchanged, replaced, and/or added between these methods. For the sake of brevity, these details are not repeated here.
The foregoing description, for purposes of explanation, has been described with reference to specific embodiments. However, the illustrative discussions above are not intended to be exhaustive or to limit the invention to the precise forms disclosed. Many modifications and variations are possible in light of the above teaching. The embodiments were chosen and described in order to best explain the principles of the invention and its practical application to thereby enable others skilled in the art to best utilize the invention and various described embodiments with various modifications as are suited to the particular use contemplated.

Claims (200)

1. A method, comprising:
at an electronic device in communication with a display generation component and one or more input devices:
displaying, via the display generating component, a user interface comprising user interface elements;
while displaying the user interface element, detecting input from a predefined portion of a user of the electronic device via the one or more input devices; and
in response to detecting the input from the predefined portion of the user of the electronic device:
in accordance with a determination that a pose of the predefined portion of the user satisfies one or more criteria prior to detecting the input, performing a respective operation in accordance with the input from the predefined portion of the user of the electronic device; and
In accordance with a determination that the pose of the predefined portion of the user did not meet the one or more criteria prior to detecting the input, the respective operation is aborted from being performed in accordance with the input from the predefined portion of the user of the electronic device.
2. The method of claim 1, further comprising:
displaying the user interface element with a visual characteristic having a first value and displaying a second user interface element included in the user interface with the visual characteristic having a second value when the pose of the predefined portion of the user does not meet the one or more criteria; and
updating the visual characteristics of the user interface element to which the input focus is directed when the pose of the predefined portion of the user meets the one or more criteria, comprising:
in accordance with a determination that an input focus is directed to the user interface element, updating the user interface element to be displayed with the visual characteristic having a third value; and
in accordance with a determination that the input focus is directed to the second user interface element, the second user interface element is updated to be displayed with the visual characteristic having a fourth value.
3. The method according to claim 2, wherein:
in accordance with a determination that the predefined portion of the user is within a threshold distance of a location corresponding to the user interface element, the input focus is directed toward the user interface element, and
in accordance with a determination that the predefined portion of the user is within the threshold distance of the second user interface element, the input focus is directed toward the second user interface element.
4. A method according to any one of claims 2 to 3, wherein:
in accordance with a determination that the user's gaze is directed toward the user interface element, the input focus is directed toward the user interface element, and
in accordance with a determination that the gaze of the user is directed toward the second user interface element, the input focus is directed toward the second user interface element.
5. A method according to any one of claims 2 to 3, wherein updating the visual characteristics of the user interface element to which the input focus is directed comprises:
in accordance with a determination that the predefined portion of the user is less than a threshold distance from a location corresponding to the user interface element, updating the visual characteristics of the user interface element to which the input focus is directed in accordance with a determination that the pose of the predefined portion of the user meets a first set of one or more criteria; and
In accordance with a determination that the predefined portion of the user is greater than the threshold distance from the location corresponding to the user interface element, the visual characteristics of the user interface element to which the input focus is directed are updated in accordance with a determination that the pose of the predefined portion of the user meets a second set of one or more criteria different from the first set of one or more criteria.
6. The method of any of claims 1-5, wherein the pose of the predefined portion of the user meeting the one or more criteria comprises:
in accordance with a determination that the predefined portion of the user is less than a threshold distance from a location corresponding to the user interface element, the pose of the predefined portion of the user meets a first set of one or more criteria; and
in accordance with a determination that the predefined portion of the user is greater than the threshold distance from the location corresponding to the user interface element, the pose of the predefined portion of the user satisfies a second set of one or more criteria different from the first set of one or more criteria.
7. The method of any of claims 1-6, wherein the pose of the predefined portion of the user meeting the one or more criteria comprises:
In accordance with a determination that the predefined portion of the user is holding an input device of the one or more input devices, the pose of the predefined portion of the user meets a first set of one or more criteria, and
In accordance with a determination that the predefined portion of the user is not holding the input device, the pose of the predefined portion of the user meets a second set of one or more criteria.
8. The method of any of claims 1-7, wherein the pose of the predefined portion of the user meeting the one or more criteria comprises:
in accordance with a determination that the predefined portion of the user is less than a threshold distance from a location corresponding to the user interface element, the pose of the predefined portion of the user meets a first set of one or more criteria; and
in accordance with a determination that the predefined portion of the user is greater than the threshold distance from the location corresponding to the user interface element, the pose of the predefined portion of the user satisfies the first set of one or more criteria.
9. The method of any one of claims 1 to 8, wherein:
In accordance with a determination that the predefined portion of the user is greater than a threshold distance from a location corresponding to the user interface element during the respective input, the one or more criteria include a criterion that is met when the user's attention is directed to the user interface element, and
in accordance with a determination that the predefined portion of the user is less than the threshold distance from the location corresponding to the user interface element during the respective input, the one or more criteria do not include a requirement to direct the attention of the user to the user interface element in order to satisfy the one or more criteria.
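One possible reading of claims 7-9 is sketched below: a far ("indirect") interaction additionally requires the user's attention on the element, while a held input device is treated here as satisfying the pose requirement. The ReadyCheck type, its field names, and the 0.15 m threshold are hypothetical, not taken from the claims.

// Illustrative sketch of claims 7-9; one possible reading, with assumed names and threshold.
struct ReadyCheck {
    var handToElementDistance: Double
    var attentionOnElement: Bool
    var isHoldingInputDevice: Bool   // claim 7: e.g., a hand-held stylus or controller
    var barePoseOK: Bool             // ready pose for an empty hand
}

func criteriaMet(_ check: ReadyCheck, threshold: Double = 0.15) -> Bool {
    // Claim 7 (assumed reading): a held device is treated as satisfying the pose requirement.
    let poseSatisfied = check.barePoseOK || check.isHoldingInputDevice
    if check.handToElementDistance > threshold {
        // Claim 9, first branch: far input additionally requires the user's attention.
        return poseSatisfied && check.attentionOnElement
    }
    // Claim 9, second branch: near input has no attention requirement.
    return poseSatisfied
}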
10. The method of any one of claims 1 to 9, the method further comprising:
in response to detecting that the user's gaze is directed toward a first region of the user interface, visually weakening a second region of the user interface relative to the first region of the user interface via the display generating component; and
in response to detecting that the gaze of the user is directed toward the second region of the user interface, the first region of the user interface is visually weakened relative to the second region of the user interface via the display generating component.
11. The method of claim 10, wherein the user interface is accessible by the electronic device and a second electronic device, the method further comprising:
forgoing visually weakening the second region of the user interface relative to the first region of the user interface via the display generating component in accordance with an indication that a gaze of a second user of the second electronic device is directed at the first region of the user interface; and
forgoing visually weakening the first region of the user interface relative to the second region of the user interface via the display generating component in accordance with an indication that the gaze of the second user of the second electronic device is directed to the second region of the user interface.
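Claims 10-11 describe gaze-driven de-emphasis of the unattended region, with an exception for a region that a second user in a shared session is looking at. A minimal sketch, assuming a two-region layout and hypothetical names:

// Illustrative sketch of claims 10-11; not part of the claims.
enum Region { case first, second }

struct SharedGazeState {
    var localGaze: Region
    var remoteGaze: Region?   // gaze of a user of a second electronic device, if known
}

// Returns the region that should be visually weakened, if any.
func regionToWeaken(_ state: SharedGazeState) -> Region? {
    let other: Region = (state.localGaze == .first) ? .second : .first
    // Claim 11: forgo weakening a region the second user is looking at.
    if state.remoteGaze == other { return nil }
    return other
}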
12. The method of any of claims 1-11, wherein detecting the input from the predefined portion of the user of the electronic device comprises detecting, via a hand tracking device, a pinch gesture performed by the predefined portion of the user.
13. The method of any of claims 1-12, wherein detecting the input from the predefined portion of the user of the electronic device comprises detecting, via a hand tracking device, a press gesture performed by the predefined portion of the user.
14. The method of any of claims 1-13, wherein detecting the input from the predefined portion of the user of the electronic device comprises detecting a lateral movement of the predefined portion of the user relative to a location corresponding to the user interface element.
15. The method of any one of claims 1 to 14, further comprising:
prior to determining that the pose of the predefined portion of the user meets the one or more criteria prior to detecting the input:
detecting, via an eye tracking device, that a gaze of the user is directed to the user interface element; and
in response to detecting that the gaze of the user is directed to the user interface element, displaying, via the display generating component, a first indication that the gaze of the user is directed to the user interface element.
16. The method of claim 15, further comprising:
before detecting the input from the predefined portion of the user of the electronic device, when the pose of the predefined portion of the user meets the one or more criteria before detecting the input:
displaying, via the display generating component, a second indication that the pose of the predefined portion of the user meets the one or more criteria prior to detecting the input, wherein the first indication is different from the second indication.
17. The method of any one of claims 1 to 16, further comprising:
while displaying the user interface element, detecting a second input from a second predefined portion of the user of the electronic device via the one or more input devices; and
in response to detecting the second input from the second predefined portion of the user of the electronic device:
in accordance with a determination that a pose of the second predefined portion of the user before the second input is detected meets one or more second criteria, performing a second corresponding operation in accordance with the second input from the second predefined portion of the user of the electronic device; and
in accordance with a determination that the pose of the second predefined portion of the user did not meet the one or more second criteria prior to detection of the second input, forgoing performing the second corresponding operation in accordance with the second input from the second predefined portion of the user of the electronic device.
18. The method of any of claims 1-17, wherein the user interface is accessible by the electronic device and a second electronic device, the method further comprising:
displaying the user interface element with a visual characteristic having a first value before detecting that the pose of the predefined portion of the user meets the one or more criteria prior to detecting the input;
displaying the user interface element with the visual characteristic having a second value different from the first value when the pose of the predefined portion of the user meets the one or more criteria prior to detecting the input; and
the user interface element is maintained displayed with the visual characteristic having the first value when a pose of a predefined portion of a second user of the second electronic device satisfies the one or more criteria while the user interface element is displayed with the visual characteristic having the first value.
19. The method of claim 18, further comprising:
in response to detecting the input from the predefined portion of the user of the electronic device, displaying the user interface element with the visual characteristic having a third value; and
the user interface element is displayed with the visual characteristic having the third value in response to an indication of an input from the predefined portion of the second user of the second electronic device.
20. An electronic device, comprising:
one or more processors;
a memory; and
one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs comprising instructions for:
displaying, via a display generating component, a user interface comprising user interface elements;
while displaying the user interface element, detecting input from a predefined portion of a user of the electronic device via one or more input devices; and
in response to detecting the input from the predefined portion of the user of the electronic device:
in accordance with a determination that a pose of the predefined portion of the user satisfies one or more criteria prior to detecting the input, performing a respective operation in accordance with the input from the predefined portion of the user of the electronic device; and
in accordance with a determination that the pose of the predefined portion of the user did not meet the one or more criteria prior to detecting the input, forgoing performing the respective operation in accordance with the input from the predefined portion of the user of the electronic device.
21. A non-transitory computer readable storage medium storing one or more programs, the one or more programs comprising instructions, which when executed by one or more processors of an electronic device, cause the electronic device to perform a method comprising:
displaying, via a display generating component, a user interface comprising user interface elements;
while displaying the user interface element, detecting input from a predefined portion of a user of the electronic device via one or more input devices; and
in response to detecting the input from the predefined portion of the user of the electronic device:
in accordance with a determination that a pose of the predefined portion of the user satisfies one or more criteria prior to detecting the input, performing a respective operation in accordance with the input from the predefined portion of the user of the electronic device; and
in accordance with a determination that the pose of the predefined portion of the user did not meet the one or more criteria prior to detecting the input, forgoing performing the respective operation in accordance with the input from the predefined portion of the user of the electronic device.
22. An electronic device, comprising:
one or more processors;
a memory;
means for: displaying, via a display generating component, a user interface comprising user interface elements;
means for: while displaying the user interface element, detecting input from a predefined portion of a user of the electronic device via one or more input devices; and
means for, in response to detecting the input from the predefined portion of the user of the electronic device:
in accordance with a determination that a pose of the predefined portion of the user satisfies one or more criteria prior to detecting the input, performing a respective operation in accordance with the input from the predefined portion of the user of the electronic device; and
in accordance with a determination that the pose of the predefined portion of the user did not meet the one or more criteria prior to detecting the input, forgoing performing the respective operation in accordance with the input from the predefined portion of the user of the electronic device.
23. An information processing apparatus for use in an electronic device, the information processing apparatus comprising:
Means for: displaying, via a display generating component, a user interface comprising user interface elements;
means for: while displaying the user interface element, detecting input from a predefined portion of a user of the electronic device via one or more input devices; and
means for, in response to detecting the input from the predefined portion of the user of the electronic device:
in accordance with a determination that a pose of the predefined portion of the user satisfies one or more criteria prior to detecting the input, performing a respective operation in accordance with the input from the predefined portion of the user of the electronic device; and
in accordance with a determination that the pose of the predefined portion of the user did not meet the one or more criteria prior to detecting the input, forgoing performing the respective operation in accordance with the input from the predefined portion of the user of the electronic device.
24. An electronic device, comprising:
one or more processors;
a memory; and
one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs comprising instructions for performing any of the methods of claims 1-19.
25. A non-transitory computer readable storage medium storing one or more programs, the one or more programs comprising instructions, which when executed by one or more processors of an electronic device, cause the electronic device to perform any of the methods of claims 1-19.
26. An electronic device, comprising:
one or more processors;
a memory; and
apparatus for performing any one of the methods of claims 1 to 19.
27. An information processing apparatus for use in an electronic device, the information processing apparatus comprising:
apparatus for performing any one of the methods of claims 1 to 19.
28. A method, comprising:
at an electronic device in communication with a display generation component and one or more input devices:
displaying a first user interface element via the display generating component;
while displaying the first user interface element, detecting, via the one or more input devices, a first input directed to the first user interface element; and
in response to detecting the first input directed to the first user interface element:
In accordance with a determination that the first user interface element is within an attention area associated with a user of the electronic device, performing a first operation corresponding to the first user interface element; and
in accordance with a determination that the first user interface element is not within the attention area associated with the user, forgoing performing the first operation.
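Claim 28 gates an indirect input on an attention area associated with the user. A minimal sketch follows, modeling the attention area as a gaze-aligned cone; the cone model and the 30-degree half-angle are assumptions for illustration, not claim limitations.

// Illustrative sketch of claim 28; the cone model and half-angle are assumed.
import Foundation

struct Vec3 { var x, y, z: Double }
func dot(_ a: Vec3, _ b: Vec3) -> Double { a.x * b.x + a.y * b.y + a.z * b.z }
func length(_ v: Vec3) -> Double { dot(v, v).squareRoot() }

struct AttentionArea {
    var origin: Vec3          // e.g., the user's viewpoint
    var gazeDirection: Vec3   // claim 30: the area is based on the gaze direction
    var halfAngleDegrees: Double = 30

    func contains(_ point: Vec3) -> Bool {
        let toPoint = Vec3(x: point.x - origin.x, y: point.y - origin.y, z: point.z - origin.z)
        let cosAngle = dot(toPoint, gazeDirection) / (length(toPoint) * length(gazeDirection))
        return cosAngle >= cos(halfAngleDegrees * .pi / 180)
    }
}

// Perform the first operation only when the element lies inside the attention area.
func handleIndirectInput(elementPosition: Vec3, attention: AttentionArea, perform: () -> Void) {
    if attention.contains(elementPosition) { perform() }   // otherwise forgo performing it
}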
29. The method of claim 28, wherein the first input directed to the first user interface element is an indirect input directed to the first user interface element, the method further comprising:
while displaying the first user interface element, detecting a second input via the one or more input devices, wherein the second input corresponds to a direct input directed to the respective user interface element; and
in response to detecting the second input, performing an operation associated with the respective user interface element regardless of whether the respective user interface element is within the attention area associated with the user.
30. The method of any of claims 28-29, wherein the attention area associated with the user is based on a direction of gaze of the user of the electronic device.
31. The method of any of claims 28 to 30, further comprising:
detecting, while the first user interface element is within the attention area associated with the user, that one or more criteria for moving the attention area to a location where the first user interface element is not within the attention area are met; and
after detecting that the one or more criteria are met:
detecting a second input directed to the first user interface element; and
in response to detecting the second input directed to the first user interface element:
in accordance with a determination that the second input is detected within respective time thresholds at which the one or more criteria are met, performing a second operation corresponding to the first user interface element; and
in accordance with a determination that the second input is detected after the respective time threshold for which the one or more criteria are met, forgoing performing the second operation.
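Claim 31 allows an input to land shortly after the attention area has moved off the element. A minimal sketch of that grace-period bookkeeping, with an assumed 1-second threshold and hypothetical names:

// Illustrative sketch of claim 31; the grace interval is an assumption.
struct AttentionTracking {
    var elementInsideAttentionArea: Bool
    var leftAttentionAt: Double?   // timestamp, in seconds, when the element left the area

    mutating func attentionAreaMoved(elementNowInside: Bool, at time: Double) {
        if elementInsideAttentionArea && !elementNowInside { leftAttentionAt = time }
        if elementNowInside { leftAttentionAt = nil }
        elementInsideAttentionArea = elementNowInside
    }

    // An input still operates on the element if it arrives within the respective
    // time threshold of the attention area having moved away.
    func shouldPerformOperation(inputTime: Double, graceInterval: Double = 1.0) -> Bool {
        if elementInsideAttentionArea { return true }
        guard let left = leftAttentionAt else { return false }
        return inputTime - left <= graceInterval
    }
}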
32. The method of any of claims 28-31, wherein the first input comprises a first portion followed by a second portion, the method further comprising:
upon detecting the first input:
Detecting the first portion of the first input while the first user interface element is within the attention area;
in response to detecting the first portion of the first input, performing a first portion of the first operation corresponding to the first user interface element;
detecting the second portion of the first input while the first user interface element is outside the attention area; and
in response to detecting the second portion of the first input, a second portion of the first operation corresponding to the first user interface element is performed.
33. The method of claim 32, wherein the first input corresponds to a press input, the first portion of the first input corresponds to initiation of the press input, and the second portion of the first input corresponds to continuation of the press input.
34. The method of claim 32, wherein the first input corresponds to a drag input, the first portion of the first input corresponds to initiation of the drag input, and the second portion of the first input corresponds to continuation of the drag input.
35. The method of claim 32, wherein the first input corresponds to a selection input, the first portion of the first input corresponds to initiation of the selection input, and the second portion of the first input corresponds to continuation of the selection input.
36. The method of any of claims 32-35, wherein detecting the first portion of the first input comprises detecting that a predefined portion of the user has a respective pose and is within a respective distance of a location corresponding to the first user interface element without detecting movement of the predefined portion of the user, and detecting the second portion of the first input comprises detecting the movement of the predefined portion of the user.
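Claims 32-36 let an input whose first portion (a ready pose within range of the element, before any movement) began while the element was inside the attention area keep driving that element through its second portion (the subsequent movement), even if the attention area moves away. A hypothetical sketch:

// Illustrative sketch of claims 32-36; names are hypothetical.
struct TwoPartInput {
    var engagedElementID: String?   // set when the first portion of the input is detected

    // First portion: ready pose within range, no movement yet (claim 36),
    // accepted only while the element is inside the attention area.
    mutating func beginFirstPortion(elementID: String, elementInAttentionArea: Bool) {
        if elementInAttentionArea { engagedElementID = elementID }
    }

    // Second portion: the movement (continuation of a press, drag, or selection)
    // keeps driving the already-engaged element even if the attention area has moved.
    func continueSecondPortion(movement: Double, apply: (String, Double) -> Void) {
        if let id = engagedElementID { apply(id, movement) }
    }
}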
37. The method of any of claims 28-36, wherein the first input is provided by a predefined portion of the user, and detecting the first input comprises detecting that the predefined portion of the user is within a distance threshold of a location corresponding to the first user interface element, the method further comprising:
upon detecting the first input directed to the first user interface element and prior to performing the first operation, detecting, via the one or more input devices, that the predefined portion of the user is moved to a distance greater than the distance threshold from the location corresponding to the first user interface element; and
in response to detecting that the predefined portion of the user moves to the distance greater than the distance threshold from the location corresponding to the first user interface element, forgoing performing the first operation corresponding to the first user interface element.
38. The method of any of claims 28-37, wherein the first input is provided by a predefined portion of the user, and detecting the first input comprises detecting that the predefined portion of the user is in a respective spatial relationship with respect to a location corresponding to the first user interface element, the method further comprising:
detecting, via the one or more input devices, that the predefined portion of the user is not interacting with the first user interface element within a respective time threshold for obtaining the respective spatial relationship with respect to the location corresponding to the first user interface element while the predefined portion of the user is in the respective spatial relationship with respect to the location corresponding to the first user interface element during the first input and prior to performing the first operation; and
responsive to detecting that the predefined portion of the user has not interacted with the first user interface element within the respective time threshold of obtaining the respective spatial relationship with respect to the location corresponding to the first user interface element, refraining from performing the first operation corresponding to the first user interface element.
39. The method of any of claims 28-38, wherein a first portion of the first input is detected when the user's gaze is directed toward the first user interface element, and a second portion of the first input subsequent to the first portion of the first input is detected when the user's gaze is not directed toward the first user interface element.
40. The method of any of claims 28-39, wherein the first input is provided by a predefined portion of the user moving from within a predefined range of angles relative to the first user interface element to a position corresponding to the first user interface element, the method further comprising:
detecting, via the one or more input devices, a second input directed to the first user interface element, wherein the second input includes movement of the predefined portion of the user from outside the predefined angular range relative to the first user interface element to the location corresponding to the first user interface element; and
responsive to detecting the second input, forgoing interaction with the first user interface element in accordance with the second input.
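Claims 37-40 describe three ways an in-progress engagement can be forgone: the hand retreats past a distance threshold, no interaction occurs within a time threshold of reaching the element, or the hand approached from outside a permitted angular range. A sketch with assumed threshold values and hypothetical names:

// Illustrative sketch of claims 37-40; every threshold below is an assumption.
struct EngagementRules {
    var distanceThreshold: Double = 0.15          // meters (claim 37)
    var interactionTimeout: Double = 2.0          // seconds (claim 38)
    var maxApproachAngleDegrees: Double = 60      // permitted approach range (claim 40)

    func shouldForgoOperation(currentHandDistance: Double,
                              secondsSinceReachingElement: Double,
                              hasInteracted: Bool,
                              approachAngleDegrees: Double) -> Bool {
        if currentHandDistance > distanceThreshold { return true }        // hand retreated
        if !hasInteracted && secondsSinceReachingElement > interactionTimeout { return true }
        if approachAngleDegrees > maxApproachAngleDegrees { return true } // bad approach angle
        return false
    }
}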
41. The method of any of claims 28-40, wherein the first operation is performed in response to detecting the first input without detecting that the user's gaze is directed toward the first user interface element.
42. An electronic device, comprising:
one or more processors;
a memory; and
one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs comprising instructions for:
displaying the first user interface element via a display generating component;
while displaying the first user interface element, detecting a first input directed to the first user interface element via one or more input devices; and
in response to detecting the first input directed to the first user interface element:
in accordance with a determination that the first user interface element is within an attention area associated with a user of the electronic device, performing a first operation corresponding to the first user interface element; and
in accordance with a determination that the first user interface element is not within the attention area associated with the user, forgoing performing the first operation.
43. A non-transitory computer readable storage medium storing one or more programs, the one or more programs comprising instructions, which when executed by one or more processors of an electronic device, cause the electronic device to perform a method comprising:
displaying the first user interface element via a display generating component;
while displaying the first user interface element, detecting a first input directed to the first user interface element via one or more input devices; and
in response to detecting the first input directed to the first user interface element:
in accordance with a determination that the first user interface element is within an attention area associated with a user of the electronic device, performing a first operation corresponding to the first user interface element; and
in accordance with a determination that the first user interface element is not within the attention area associated with the user, forgoing performing the first operation.
44. An electronic device, comprising:
one or more processors;
a memory;
means for: displaying the first user interface element via a display generating component;
means for: while displaying the first user interface element, detecting a first input directed to the first user interface element via one or more input devices; and
Means for, in response to detecting the first input directed to the first user interface element:
in accordance with a determination that the first user interface element is within an attention area associated with a user of the electronic device, performing a first operation corresponding to the first user interface element; and
in accordance with a determination that the first user interface element is not within the attention area associated with the user, forgoing performing the first operation.
45. An information processing apparatus for use in an electronic device, the information processing apparatus comprising:
means for: displaying the first user interface element via a display generating component;
means for: while displaying the first user interface element, detecting a first input directed to the first user interface element via one or more input devices; and
means for, in response to detecting the first input directed to the first user interface element:
in accordance with a determination that the first user interface element is within an attention area associated with a user of the electronic device, performing a first operation corresponding to the first user interface element; and
in accordance with a determination that the first user interface element is not within the attention area associated with the user, forgoing performing the first operation.
46. An electronic device, comprising:
one or more processors;
a memory; and
one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs comprising instructions for performing any of the methods of claims 28-41.
47. A non-transitory computer readable storage medium storing one or more programs, the one or more programs comprising instructions, which when executed by one or more processors of an electronic device, cause the electronic device to perform any of the methods of claims 28-41.
48. An electronic device, comprising:
one or more processors;
a memory; and
apparatus for performing any one of the methods of claims 28 to 41.
49. An information processing apparatus for use in an electronic device, the information processing apparatus comprising:
apparatus for performing any one of the methods of claims 28 to 41.
50. A method, comprising:
at an electronic device in communication with a display generation component and one or more input devices including an eye tracking device:
displaying, via the display generating component, a user interface comprising a first area, the first area comprising a first user interface object and a second user interface object;
detecting, via the one or more input devices, respective inputs provided by predefined portions of the user while the user interface is displayed and while detecting, via the eye-tracking device, that the user's gaze is directed toward the first region of the user interface, wherein during the respective inputs, the predefined portions of the user are located away from a location corresponding to the first region of the user interface; and
in response to detecting the respective input:
in accordance with a determination that one or more first criteria are met, performing an operation with respect to the first user interface object based on the respective inputs; and
in accordance with a determination that one or more second criteria different from the first criteria are met, an operation is performed with respect to the second user interface object based on the respective input.
51. The method of claim 50, wherein:
The user interface includes a three-dimensional environment, and
the first region is a respective distance from a viewpoint associated with the electronic device in the three-dimensional environment, wherein:
in accordance with a determination that the respective distance is a first distance, the first region has a first size in the three-dimensional environment, and
in accordance with a determination that the respective distance is a second distance different from the first distance, the first region has a second size in the three-dimensional environment different from the first size.
52. The method of claim 51, wherein the first region increases in size in the three-dimensional environment as the respective distance increases.
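Claims 51-52 tie the region's size to its distance from the viewpoint, which is consistent with a region of roughly constant angular extent. A short worked sketch; the 20-degree angular width is an assumed value, not one recited in the claims.

// Illustrative sketch of claims 51-52; the angular width is an assumption.
import Foundation

// A region that subtends a roughly constant visual angle is larger, in environment
// coordinates, the farther it is from the viewpoint.
func regionWidth(atDistance distance: Double, angularWidthDegrees: Double = 20) -> Double {
    2 * distance * tan(angularWidthDegrees * .pi / 360)
}

let nearWidth = regionWidth(atDistance: 1.0)   // about 0.35 m
let farWidth  = regionWidth(atDistance: 3.0)   // about 1.06 m, so the region grows with distance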
53. The method of any one of claims 50 to 52, wherein:
the one or more first criteria are met when the first object is closer to the viewpoint of the user in the three-dimensional environment than the second object, and the one or more second criteria are met when the second object is closer to the viewpoint of the user in the three-dimensional environment than the first object.
54. The method of any one of claims 50 to 53, wherein:
The one or more first criteria or the one or more second criteria are met based on the type of the first user interface object and the type of the second user interface object.
55. The method of any one of claims 50 to 54, wherein:
the one or more first criteria are met or the one or more second criteria are met based on respective priorities defined by the electronic device for the first user interface object and the second user interface object.
56. The method of any one of claims 50 to 55, further comprising:
in response to detecting the respective input:
in accordance with a determination that one or more third criteria are met, the one or more third criteria including a criterion that is met when the first region is greater than a threshold distance from a viewpoint associated with the electronic device in a three-dimensional environment, refraining from performing the operation with respect to the first user interface object and refraining from performing the operation with respect to the second user interface object.
57. The method of claim 56, further comprising:
in accordance with a determination that the first region is greater than the threshold distance from the viewpoint associated with the electronic device in the three-dimensional environment, visually weakening the first user interface object and the second user interface object relative to regions outside of the first region of the user interface, and
in accordance with a determination that the first region is less than the threshold distance from the viewpoint associated with the electronic device in the three-dimensional environment, forgoing visually weakening the first user interface object and the second user interface object relative to the region outside of the first region of the user interface.
58. The method of any one of claims 50 to 57, further comprising:
while displaying the user interface, detecting, via the one or more input devices, a second corresponding input provided by the predefined portion of the user; and
in response to detecting the second corresponding input:
in accordance with a determination that one or more third criteria are met, the one or more third criteria including a criterion that is met when the first region is at an angle greater than a threshold angle from the gaze of the user in a three-dimensional environment, refraining from performing a respective operation with respect to the first user interface object and refraining from performing a respective operation with respect to the second user interface object.
59. The method of claim 58, further comprising:
in accordance with a determination that the first region is angled from the viewpoint associated with the electronic device in the three-dimensional environment by more than the threshold angle, visually weakening the first user interface object and the second user interface object relative to regions outside of the first region of the user interface, and
in accordance with a determination that the first region is at an angle less than the threshold angle from the viewpoint associated with the electronic device in the three-dimensional environment, forgoing visually weakening the first user interface object and the second user interface object relative to the region outside of the first region of the user interface.
60. The method of any of claims 50-59, wherein the one or more first criteria and the one or more second criteria include respective criteria that are met when the first region is greater than a threshold distance from a viewpoint associated with the electronic device in a three-dimensional environment and that are not met when the first region is less than the threshold distance from the viewpoint associated with the electronic device in the three-dimensional environment, the method further comprising:
in response to detecting the respective input and in accordance with a determination that the first region is less than the threshold distance from the viewpoint associated with the electronic device in the three-dimensional environment:
in accordance with a determination that the gaze of the user is directed to the first user interface object, performing the operation with respect to the first user interface object based on the respective input; and
In accordance with a determination that the gaze of the user is directed toward the second user interface object, the operation is performed with respect to the second user interface object based on the respective input.
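Claims 53-60 list several signals the device can use to pick between the two objects in the region. The ordering below is only one illustrative policy combining those signals (gaze when the region is near the viewpoint, then proximity to the viewpoint, object type, and device-defined priority); all names, priorities, and the 1-meter threshold are assumptions.

// Illustrative sketch of claims 53-60; one possible policy, with assumed names and values.
struct CandidateObject {
    var id: String
    var distanceToViewpoint: Double
    var typePriority: Int     // claim 54: e.g., interactive controls outrank static content
    var devicePriority: Int   // claim 55: priority defined by the electronic device
}

func chooseRecipient(first: CandidateObject, second: CandidateObject,
                     regionDistanceFromViewpoint: Double,
                     gazeTargetID: String?,
                     nearThreshold: Double = 1.0) -> CandidateObject {
    // Claim 60: when the region is near the viewpoint, gaze disambiguates directly.
    if regionDistanceFromViewpoint < nearThreshold, let gazed = gazeTargetID {
        if gazed == first.id { return first }
        if gazed == second.id { return second }
    }
    // Claim 53: otherwise prefer the object closer to the viewpoint,
    if first.distanceToViewpoint != second.distanceToViewpoint {
        return first.distanceToViewpoint < second.distanceToViewpoint ? first : second
    }
    // claims 54-55: then fall back to object type and device-defined priority.
    if first.typePriority != second.typePriority {
        return first.typePriority > second.typePriority ? first : second
    }
    return first.devicePriority >= second.devicePriority ? first : second
}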
61. An electronic device, comprising:
one or more processors;
a memory; and
one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs comprising instructions for:
displaying, via a display generating component, a user interface comprising a first area, the first area comprising a first user interface object and a second user interface object;
detecting, via the one or more input devices, respective inputs provided by predefined portions of the user while the user interface is displayed and while detecting, via an eye tracking device, that the user's gaze is directed toward the first region of the user interface, wherein during the respective inputs, locations of the predefined portions of the user are away from locations corresponding to the first region of the user interface; and
in response to detecting the respective input:
In accordance with a determination that one or more first criteria are met, performing an operation with respect to the first user interface object based on the respective inputs; and
in accordance with a determination that one or more second criteria different from the first criteria are met, an operation is performed with respect to the second user interface object based on the respective input.
62. A non-transitory computer readable storage medium storing one or more programs, the one or more programs comprising instructions, which when executed by one or more processors of an electronic device, cause the electronic device to perform a method comprising:
displaying, via a display generating component, a user interface comprising a first area, the first area comprising a first user interface object and a second user interface object;
detecting, via the one or more input devices, respective inputs provided by predefined portions of the user while the user interface is displayed and while detecting, via an eye tracking device, that the user's gaze is directed toward the first region of the user interface, wherein during the respective inputs, locations of the predefined portions of the user are away from locations corresponding to the first region of the user interface; and
In response to detecting the respective input:
in accordance with a determination that one or more first criteria are met, performing an operation with respect to the first user interface object based on the respective inputs; and
in accordance with a determination that one or more second criteria different from the first criteria are met, an operation is performed with respect to the second user interface object based on the respective input.
63. An electronic device, comprising:
one or more processors;
a memory;
means for: displaying, via a display generating component, a user interface comprising a first area, the first area comprising a first user interface object and a second user interface object;
means for: detecting, via the one or more input devices, respective inputs provided by predefined portions of the user while the user interface is displayed and while detecting, via an eye tracking device, that the user's gaze is directed toward the first region of the user interface, wherein during the respective inputs, locations of the predefined portions of the user are away from locations corresponding to the first region of the user interface; and
means for, in response to detecting the respective input:
In accordance with a determination that one or more first criteria are met, performing an operation with respect to the first user interface object based on the respective inputs; and
in accordance with a determination that one or more second criteria different from the first criteria are met, an operation is performed with respect to the second user interface object based on the respective input.
64. An information processing apparatus for use in an electronic device, the information processing apparatus comprising:
means for: displaying, via a display generating component, a user interface comprising a first area, the first area comprising a first user interface object and a second user interface object;
means for: detecting, via the one or more input devices, respective inputs provided by predefined portions of the user while the user interface is displayed and while detecting, via an eye tracking device, that the user's gaze is directed toward the first region of the user interface, wherein during the respective inputs, locations of the predefined portions of the user are away from locations corresponding to the first region of the user interface; and
means for, in response to detecting the respective input:
In accordance with a determination that one or more first criteria are met, performing an operation with respect to the first user interface object based on the respective inputs; and
in accordance with a determination that one or more second criteria different from the first criteria are met, an operation is performed with respect to the second user interface object based on the respective input.
65. An electronic device, comprising:
one or more processors;
a memory; and
one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs comprising instructions for performing any of the methods of claims 50-60.
66. A non-transitory computer readable storage medium storing one or more programs, the one or more programs comprising instructions, which when executed by one or more processors of an electronic device, cause the electronic device to perform any of the methods of claims 50-60.
67. An electronic device, comprising:
one or more processors;
a memory; and
apparatus for performing any one of the methods of claims 50 to 60.
68. An information processing apparatus for use in an electronic device, the information processing apparatus comprising:
apparatus for performing any one of the methods of claims 50 to 60.
69. A method, comprising:
at an electronic device in communication with a display generation component and one or more input devices including an eye tracking device:
displaying, via the display generating component, a user interface, wherein the user interface comprises a plurality of user interface objects of respective types, including a first user interface object in a first state and a second user interface object in the first state;
in accordance with a determination that one or more criteria are met when a gaze of a user of the electronic device is directed to the first user interface object, the one or more criteria including a criterion met when a first predefined portion of the user of the electronic device is farther than a threshold distance from a location corresponding to any of the plurality of user interface objects in the user interface, displaying the first user interface object in a second state via the display generating component while maintaining display of the second user interface object in the first state, wherein the second state is different from the first state; and
When the gaze of the user is directed to the first user interface object:
detecting movement of the first predefined portion of the user via the one or more input devices while the first user interface object is displayed in the second state; and
in response to detecting the movement of the first predefined portion of the user:
in accordance with a determination that the first predefined portion of the user moves within the threshold distance of a location corresponding to the second user interface object, the second user interface object is displayed in the second state via the display generation component.
70. The method of claim 69, further comprising:
in response to detecting the movement of the first predefined portion of the user:
in accordance with the determination that the first predefined portion of the user moves within the threshold distance of the location corresponding to the second user interface object, the first user interface object is displayed in the first state via the display generating component.
71. The method of any one of claims 69-70, further comprising:
in response to detecting the movement of the first predefined portion of the user:
In accordance with a determination that the first predefined portion of the user moves within the threshold distance of a location corresponding to the first user interface object, the first user interface object is maintained displayed in the second state.
72. The method of any one of claims 69 to 71, further comprising:
in response to detecting the movement of the first predefined portion of the user:
in accordance with a determination that the first predefined portion of the user moves within the threshold distance of a location corresponding to a third user interface object of the plurality of user interface objects, the third user interface object is displayed in the second state via the display generation component.
73. The method of any one of claims 69-72, further comprising:
in response to detecting the movement of the first predefined portion of the user:
in accordance with a determination that the first predefined portion of the user moves within the threshold distance of a location corresponding to the first user interface object and the location corresponding to the second user interface object:
in accordance with a determination that the first predefined portion is closer to the location corresponding to the first user interface object than to the location corresponding to the second user interface object, displaying the first user interface object in the second state via the display generating component; and
In accordance with a determination that the first predefined portion is closer to the location corresponding to the second user interface object than to the location corresponding to the first user interface object, the second user interface object is displayed in the second state via the display generation component.
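Claims 69-73 move the hover ("second") state from the gazed-at object to whichever object the hand comes within a threshold distance of, with the closest object winning when several are in range. A hypothetical sketch; the names and the 0.15 m threshold are assumed.

// Illustrative sketch of claims 69-73; not part of the claims.
struct HoverCandidate { var id: String; var handDistance: Double }

// Returns the object that should be shown in the hover ("second") state.
func hoveredObjectID(gazeTargetID: String,
                     candidates: [HoverCandidate],
                     threshold: Double = 0.15) -> String {
    // Claims 72-73: any object within the threshold of the hand competes; the closest wins.
    let inRange = candidates.filter { $0.handDistance < threshold }
    if let closest = inRange.min(by: { $0.handDistance < $1.handDistance }) {
        return closest.id
    }
    // Claim 69: with no object in hand range, the gazed-at object keeps the hover state.
    return gazeTargetID
}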
74. The method of any one of claims 69-73, wherein the one or more criteria include a criterion that is met when the first predefined portion of the user is in a predetermined pose.
75. The method of any one of claims 69 to 74, further comprising:
in response to detecting the movement of the first predefined portion of the user:
in accordance with a determination that the first predefined portion of the user moves within the threshold distance of a location corresponding to the first user interface object, the first user interface object is maintained displayed in the second state,
wherein:
the first user interface object in the second state has a first visual appearance when the first predefined portion of the user is greater than the threshold distance of the location corresponding to the first user interface object, and
The first user interface object in the second state has a second visual appearance that is different from the first visual appearance when the first predefined portion of the user is within the threshold distance of the location corresponding to the first user interface object.
76. The method of any one of claims 69 to 75, further comprising:
in accordance with a determination that one or more second criteria are met when the gaze of the user is directed to the first user interface object, the one or more second criteria including a criterion met when a second predefined portion of the user, different from the first predefined portion, is farther from the location corresponding to any of the plurality of user interface objects in the user interface than the threshold distance, displaying the first user interface object in the second state via the display generating component,
wherein:
in accordance with a determination that the one or more criteria are met, the first user interface object in the second state has a first visual appearance, and
in accordance with a determination that the one or more second criteria are met, the first user interface object in the second state has a second visual appearance that is different from the first visual appearance.
77. The method of any of claims 69-76, wherein displaying the second user interface object in the second state occurs while the gaze of the user remains directed to the first user interface object.
78. The method of any of claims 69-77, wherein displaying the second user interface object in the second state is further in accordance with a determination that the second user interface object is within an attention area associated with the user of the electronic device.
79. The method of any of claims 69-78, wherein the one or more criteria include a criterion that is met when at least one predefined portion of the user is in a predetermined pose, the at least one predefined portion including the first predefined portion of the user.
80. The method of any one of claims 69-79, further comprising:
while displaying the first user interface object in the second state, detecting a first movement of an attention area associated with the user via the one or more input devices; and
in response to detecting the first movement of the attention area associated with the user:
In accordance with a determination that the attention area includes a third user interface object of the respective type, and the first predefined portion of the user is within the threshold distance of a location corresponding to the third user interface object, the third user interface object is displayed in the second state via the display generating component.
81. The method of claim 80, further comprising:
detecting a second movement of the attention area via the one or more input devices after detecting the first movement of the attention area and while displaying the third user interface object in the second state, wherein the third user interface object is no longer within the attention area as a result of the second movement of the attention area; and
in response to detecting the second movement of the attention area:
in accordance with a determination that the first predefined portion of the user is within the threshold distance of the third user interface object, the third user interface object is maintained to be displayed in the second state.
82. The method of claim 81, further comprising:
in response to detecting the second movement of the attention area and in accordance with a determination that the first predefined portion of the user is not interacting with the third user interface object:
In accordance with a determination that the first user interface object is within the attention area, the one or more criteria are met, and the gaze of the user is directed to the first user interface object, the first user interface object is displayed in the second state; and
in accordance with a determination that the second user interface object is within the attention area, the one or more criteria are met, and the gaze of the user is directed to the second user interface object, the second user interface object is displayed in the second state.
83. The method of any one of claims 69-82, further comprising:
upon satisfaction of the one or more criteria:
detecting, via the eye tracking device, movement of the gaze of the user to the second user interface object before the movement of the first predefined portion of the user is detected and while the first user interface object is displayed in the second state; and
in response to detecting the movement of the gaze of the user to the second user interface object, the second user interface object is displayed in the second state via the display generating component.
84. The method of claim 83, further comprising:
detecting, via the eye tracking device, movement of the gaze of the user to the first user interface object after the movement of the first predefined portion of the user is detected and while the second user interface object is displayed in the second state in accordance with the determination that the first predefined portion of the user is within the threshold distance of the location corresponding to the second user interface object; and
in response to detecting the movement of the gaze of the user to the first user interface object, the second user interface object is maintained displayed in the second state.
85. An electronic device, comprising:
one or more processors;
a memory; and
one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs comprising instructions for:
displaying, via a display generating component, a user interface, wherein the user interface comprises a plurality of user interface objects of respective types, including a first user interface object in a first state and a second user interface object in the first state;
In accordance with a determination that one or more criteria are met when a gaze of a user of the electronic device is directed to the first user interface object, the one or more criteria including a criterion met when a first predefined portion of the user of the electronic device is farther than a threshold distance from a location corresponding to any of the plurality of user interface objects in the user interface, displaying the first user interface object in a second state via the display generating component while maintaining display of the second user interface object in the first state, wherein the second state is different from the first state; and
when the gaze of the user is directed to the first user interface object:
detecting movement of the first predefined portion of the user via the one or more input devices while the first user interface object is displayed in the second state; and
in response to detecting the movement of the first predefined portion of the user:
in accordance with a determination that the first predefined portion of the user moves within the threshold distance of a location corresponding to the second user interface object, the second user interface object is displayed in the second state via the display generation component.
86. A non-transitory computer readable storage medium storing one or more programs, the one or more programs comprising instructions, which when executed by one or more processors of an electronic device, cause the electronic device to perform a method comprising:
displaying, via a display generating component, a user interface, wherein the user interface comprises a plurality of user interface objects of respective types, including a first user interface object in a first state and a second user interface object in the first state;
in accordance with a determination that one or more criteria are met when a gaze of a user of the electronic device is directed to the first user interface object, the one or more criteria including a criterion met when a first predefined portion of the user of the electronic device is farther than a threshold distance from a location corresponding to any of the plurality of user interface objects in the user interface, displaying the first user interface object in a second state via the display generating component while maintaining display of the second user interface object in the first state, wherein the second state is different from the first state; and
When the gaze of the user is directed to the first user interface object:
detecting movement of the first predefined portion of the user via the one or more input devices while the first user interface object is displayed in the second state; and
in response to detecting the movement of the first predefined portion of the user:
in accordance with a determination that the first predefined portion of the user moves within the threshold distance of a location corresponding to the second user interface object, the second user interface object is displayed in the second state via the display generation component.
87. An electronic device, comprising:
one or more processors;
a memory;
means for: displaying, via a display generating component, a user interface, wherein the user interface comprises a plurality of user interface objects of respective types, including a first user interface object in a first state and a second user interface object in the first state;
means for: in accordance with a determination that one or more criteria are met when a gaze of a user of the electronic device is directed to the first user interface object, the one or more criteria including a criterion met when a first predefined portion of the user of the electronic device is farther than a threshold distance from a location corresponding to any of the plurality of user interface objects in the user interface, displaying the first user interface object in a second state via the display generating component while maintaining display of the second user interface object in the first state, wherein the second state is different from the first state; and
Means for: when the gaze of the user is directed to the first user interface object:
detecting movement of the first predefined portion of the user via the one or more input devices while the first user interface object is displayed in the second state; and
in response to detecting the movement of the first predefined portion of the user:
in accordance with a determination that the first predefined portion of the user moves within the threshold distance of a location corresponding to the second user interface object, the second user interface object is displayed in the second state via the display generation component.
88. An information processing apparatus for use in an electronic device, the information processing apparatus comprising:
means for: displaying, via a display generating component, a user interface, wherein the user interface comprises a plurality of user interface objects of respective types, including a first user interface object in a first state and a second user interface object in the first state;
means for: in accordance with a determination that one or more criteria are met when a gaze of a user of the electronic device is directed to the first user interface object, the one or more criteria including a criterion met when a first predefined portion of the user of the electronic device is farther than a threshold distance from a location corresponding to any of the plurality of user interface objects in the user interface, displaying the first user interface object in a second state via the display generating component while maintaining display of the second user interface object in the first state, wherein the second state is different from the first state; and
Means for: when the gaze of the user is directed to the first user interface object:
detecting movement of the first predefined portion of the user via the one or more input devices while the first user interface object is displayed in the second state; and
in response to detecting the movement of the first predefined portion of the user:
in accordance with a determination that the first predefined portion of the user moves within the threshold distance of a location corresponding to the second user interface object, the second user interface object is displayed in the second state via the display generation component.
89. An electronic device, comprising:
one or more processors;
a memory; and
one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs comprising instructions for performing any of the methods of claims 69-84.
90. A non-transitory computer readable storage medium storing one or more programs, the one or more programs comprising instructions, which when executed by one or more processors of an electronic device, cause the electronic device to perform any of the methods of claims 69-84.
91. An electronic device, comprising:
one or more processors;
a memory; and
apparatus for performing any one of the methods of claims 69 to 84.
92. An information processing apparatus for use in an electronic device, the information processing apparatus comprising:
apparatus for performing any one of the methods of claims 69 to 84.
93. A method, comprising:
at an electronic device in communication with a display generation component and one or more input devices including an eye tracking device:
detecting, via the eye-tracking device, movement of a gaze of a user of the electronic device away from a first user interface element displayed via the display generating component to a second user interface element displayed via the display generating component while the gaze is directed at the first user interface element; and
in response to detecting the movement of the gaze of the user away from the first user interface element to the second user interface element displayed via the display generating component:
in accordance with a determination that a second predefined portion of the user that is different from the first predefined portion is available for interaction with the second user interface element, changing a visual appearance of the second user interface element; and
in accordance with a determination that the second predefined portion of the user is not available for interaction with the second user interface element, forgoing changing the visual appearance of the second user interface element.
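As a hypothetical illustration of the availability gate in claim 93, the sketch below changes an element's appearance on gaze arrival only when a hand other than the one already in use is tracked, idle, and in a ready pose; otherwise the change is forgone. The names (HighlightableElement, FreeHand, isInReadyPose) are illustrative assumptions.

    // Hypothetical sketch of claim 93's availability check.
    struct HighlightableElement { var isHighlighted = false }

    struct FreeHand {
        var isTracked: Bool
        var isEngagedElsewhere: Bool     // already interacting with some other element
        var isInReadyPose: Bool          // e.g. raised, fingers in a pre-pinch shape
        var isAvailable: Bool { isTracked && isInReadyPose && !isEngagedElsewhere }
    }

    func gazeArrived(at element: inout HighlightableElement, candidateHand: FreeHand?) {
        if let hand = candidateHand, hand.isAvailable {
            element.isHighlighted = true     // change the visual appearance
        }                                    // else: forgo changing it
    }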
94. The method of claim 93, further comprising:
performing the following when one or more criteria are met, the one or more criteria including a criterion that is met when the first predefined portion of the user and the second predefined portion of the user do not interact with any user interface elements:
in accordance with a determination that the gaze of the user is directed to the first user interface element, displaying the first user interface element with a visual characteristic that indicates that interaction with the first user interface element is possible, wherein the second user interface element is not displayed with the visual characteristic; and
in accordance with a determination that the gaze of the user is directed to the second user interface element, displaying the second user interface element with a visual characteristic that indicates that interaction with the second user interface element is possible, wherein the first user interface element is not displayed with the visual characteristic;
detecting input from the first predefined portion or the second predefined portion of the user via the one or more input devices when the one or more criteria are met; and
In response to detecting the input:
performing an operation corresponding to the first user interface element in accordance with the determination that the gaze of the user is directed to the first user interface element when the input is received; and
in accordance with the determination that the gaze of the user is directed to the second user interface element when the input is received, performing an operation corresponding to the second user interface element.
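A minimal, assumed sketch of the gaze-directed dispatch in claim 94: while neither hand is engaged with any element, hover feedback tracks the gazed-at element, and a hand input that arrives at that moment is routed to that element. The routing function and its parameters are hypothetical.

    // Hypothetical sketch of claim 94's gaze-directed routing.
    struct ActivatableElement {
        let name: String
        var showsHoverFeedback = false
        let perform: () -> Void          // operation corresponding to this element
    }

    func routeInput(elements: inout [ActivatableElement],
                    gazeIndex: Int?,
                    handsIdle: Bool,
                    inputReceived: Bool) {
        guard handsIdle else { return }                  // criterion: hands not mid-interaction
        for i in elements.indices { elements[i].showsHoverFeedback = (i == gazeIndex) }
        if inputReceived, let i = gazeIndex, elements.indices.contains(i) {
            elements[i].perform()                        // operation goes to the gazed-at element
        }
    }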
95. The method of claim 94, wherein the one or more criteria include a criterion that is met when at least one of the first predefined portion or the second predefined portion of the user is available for interaction with a user interface element.
96. The method of any one of claims 93 to 95, further comprising:
in response to detecting the movement of the gaze of the user away from the first user interface element to the second user interface element displayed via the display generating component:
in accordance with a determination that the first predefined portion and the second predefined portion of the user are not available for interaction with a user interface element, forgoing changing the visual appearance of the second user interface element.
97. The method of any one of claims 93 to 96, further comprising:
detecting, via the eye tracking device, that the second predefined portion of the user is no longer available for interaction with the second user interface element while the second predefined portion of the user is available for interaction with the second user interface element and after changing the visual appearance of the second user interface element to a changed appearance of the second user interface element; and
in response to detecting that the second predefined portion of the user is no longer available for interaction with the second user interface element, ceasing to display the changed appearance of the second user interface element.
98. The method of any one of claims 93 to 97, further comprising:
detecting, via the one or more input devices, that the second predefined portion of the user is now available for interaction with the second user interface element after the determining that the second predefined portion of the user is not available for interaction with the second user interface element and while the gaze of the user is directed to the second user interface element; and
in response to detecting that the second predefined portion of the user is now available for interaction with the second user interface element, changing the visual appearance of the second user interface element.
99. The method of any one of claims 93 to 98, further comprising:
in response to detecting the movement of the gaze of the user away from the first user interface element to the second user interface element displayed via the display generating component:
in accordance with a determination that the first and second predefined portions of the user have interacted with a corresponding user interface element other than the second user interface element, forgoing changing the visual appearance of the second user interface element.
100. The method of any of claims 93-99, wherein the determining that the second predefined portion of the user is not available for interaction with the second user interface element is based on determining that the second predefined portion of the user interacts with a third user interface element that is different from the second user interface element.
101. The method of any of claims 93-100, wherein the determining that the second predefined portion of the user is not available for interaction with the second user interface element is based on determining that the second predefined portion of the user is not in a predetermined pose required for interaction with the second user interface element.
102. The method of any of claims 93-101, wherein the determining that the second predefined portion of the user is not available for interaction with the second user interface element is based on determining that the second predefined portion of the user is not detected by the one or more input devices in communication with the electronic device.
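Claims 100-102 enumerate three independent reasons a hand may be unavailable for interaction; purely as an illustration, they can be combined into one predicate as sketched below. The TrackedHand type and its fields are assumptions.

    // Hypothetical predicate combining the unavailability conditions of claims 100-102.
    struct TrackedHand {
        var isDetected: Bool             // claim 102: visible to the input devices at all
        var isInRequiredPose: Bool       // claim 101: in the predetermined pose
        var engagedElementID: Int?       // claim 100: already interacting with another element
    }

    func isAvailable(_ hand: TrackedHand, for elementID: Int) -> Bool {
        guard hand.isDetected else { return false }
        guard hand.isInRequiredPose else { return false }
        if let busyWith = hand.engagedElementID, busyWith != elementID { return false }
        return true
    }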
103. The method of any one of claims 93 to 102, further comprising:
while displaying the first user interface element and the second user interface element via the display generating component:
in accordance with a determination that the first predefined portion of the user is within a threshold distance of a location corresponding to the first user interface element and the second predefined portion of the user is within the threshold distance of a location corresponding to the second user interface element:
displaying the first user interface element with visual characteristics indicating that the first predefined portion of the user is available for direct interaction with the first user interface element; and
the second user interface element is displayed with the visual characteristic indicating that the second user interface element is available for direct interaction with the second predefined portion of the user.
104. The method of any one of claims 93 to 103, further comprising:
while displaying the first user interface element and the second user interface element via the display generating component:
in accordance with a determination that the first predefined portion of the user is within a threshold distance of a location corresponding to the first user interface element and the second predefined portion of the user is farther than the threshold distance of a location corresponding to the second user interface element but available for interaction with the second user interface element:
displaying the first user interface element with visual characteristics indicating that the first predefined portion of the user is available for direct interaction with the first user interface element;
in accordance with a determination that the gaze of the user is directed toward the second user interface element, displaying the second user interface element with visual characteristics that indicate that the second predefined portion of the user is available for indirect interaction with the second user interface element; and
in accordance with a determination that the gaze of the user is not directed toward the second user interface element, the second user interface element is not displayed with visual characteristics that indicate that the second predefined portion of the user is available for indirect interaction with the second user interface element.
105. The method of any one of claims 93 to 104, further comprising:
while displaying the first user interface element and the second user interface element via the display generating component:
in accordance with a determination that the second predefined portion of the user is within a threshold distance of a location corresponding to the second user interface element, and the first predefined portion of the user is farther than the threshold distance of a location corresponding to the first user interface element but available for interaction with the first user interface element:
displaying the second user interface element with visual characteristics indicating that the second user interface element is available for direct interaction with the second predefined portion of the user;
in accordance with a determination that the gaze of the user is directed toward the first user interface element, displaying the first user interface element with visual characteristics that indicate that the first predefined portion of the user is available for indirect interaction with the first user interface element; and
in accordance with a determination that the gaze of the user is not directed toward the first user interface element, the first user interface element is not displayed with the visual characteristic indicating that the first predefined portion of the user is available for indirect interaction with the first user interface element.
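Claims 103-105 distinguish direct interaction (a hand within the threshold distance of an element) from indirect interaction (a hand that is farther away but available, gated by gaze). A hypothetical decision helper capturing that split; the names and parameters are assumptions.

    // Hypothetical direct/indirect hover decision per claims 103-105.
    enum HoverStyle { case none, direct, indirect }

    func hoverStyle(handDistance: Float,
                    threshold: Float,
                    handAvailable: Bool,
                    gazeOnElement: Bool) -> HoverStyle {
        if handDistance <= threshold { return .direct }          // proximity wins regardless of gaze
        if handAvailable && gazeOnElement { return .indirect }   // far hand needs gaze on the element
        return .none
    }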
106. The method of any one of claims 93 to 105, further comprising:
upon detecting the movement of the gaze of the user away from the first user interface element toward the second user interface element and while displaying the second user interface element in the changed visual appearance, detecting, via the one or more input devices, that the second predefined portion of the user is directly interacting with the first user interface element; and
in response to detecting that the second predefined portion of the user is directly interacting with the first user interface element, forgoing displaying the second user interface element with the changed visual appearance.
107. An electronic device, comprising:
one or more processors;
a memory; and
one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs comprising instructions for:
detecting, via an eye tracking device, movement of a gaze of a user of the electronic device away from a first user interface element displayed via a display generating component to a second user interface element displayed via the display generating component while the gaze is directed at the first user interface element; and
In response to detecting the movement of the gaze of the user away from the first user interface element to the second user interface element displayed via the display generating component:
in accordance with a determination that a second predefined portion of the user that is different from the first predefined portion is available for interaction with the second user interface element, changing a visual appearance of the second user interface element; and
in accordance with a determination that the second predefined portion of the user is not available for interaction with the second user interface element, forgoing changing the visual appearance of the second user interface element.
108. A non-transitory computer readable storage medium storing one or more programs, the one or more programs comprising instructions, which when executed by one or more processors of an electronic device, cause the electronic device to perform a method comprising:
detecting, via the eye-tracking device, movement of a gaze of a user of the electronic device away from a first user interface element displayed via the display generating component to a second user interface element displayed via the display generating component while the gaze is directed at the first user interface element; and
In response to detecting the movement of the gaze of the user away from the first user interface element to the second user interface element displayed via the display generating component:
in accordance with a determination that a second predefined portion of the user that is different from the first predefined portion is available for interaction with the second user interface element, changing a visual appearance of the second user interface element; and
in accordance with a determination that the second predefined portion of the user is not available for interaction with the second user interface element, forgoing changing the visual appearance of the second user interface element.
109. An electronic device, comprising:
one or more processors;
a memory;
means for: detecting, via an eye tracking device, movement of a gaze of a user of the electronic device away from a first user interface element displayed via a display generating component to a second user interface element displayed via the display generating component while the gaze is directed at the first user interface element; and
means for, in response to detecting the movement of the gaze of the user away from the first user interface element to the second user interface element displayed via the display generating component:
In accordance with a determination that a second predefined portion of the user that is different from the first predefined portion is available for interaction with the second user interface element, changing a visual appearance of the second user interface element; and
in accordance with a determination that the second predefined portion of the user is not available for interaction with the second user interface element, forgoing changing the visual appearance of the second user interface element.
110. An information processing apparatus for use in an electronic device, the information processing apparatus comprising:
means for: detecting, via an eye tracking device, movement of a gaze of a user of the electronic device away from a first user interface element displayed via a display generating component to a second user interface element displayed via the display generating component while the gaze is directed at the first user interface element; and
means for, in response to detecting the movement of the gaze of the user away from the first user interface element to the second user interface element displayed via the display generating component:
in accordance with a determination that a second predefined portion of the user that is different from the first predefined portion is available for interaction with the second user interface element, changing a visual appearance of the second user interface element; and
in accordance with a determination that the second predefined portion of the user is not available for interaction with the second user interface element, forgoing changing the visual appearance of the second user interface element.
111. An electronic device, comprising:
one or more processors;
a memory; and
one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs comprising instructions for performing any of the methods of claims 93-106.
112. A non-transitory computer readable storage medium storing one or more programs, the one or more programs comprising instructions, which when executed by one or more processors of an electronic device, cause the electronic device to perform any of the methods of claims 93-106.
113. An electronic device, comprising:
one or more processors;
a memory; and
means for performing any one of the methods of claims 93 to 106.
114. An information processing apparatus for use in an electronic device, the information processing apparatus comprising:
Means for performing any one of the methods of claims 93 to 106.
115. A method, comprising:
at an electronic device in communication with a display generation component and one or more input devices:
displaying a user interface object in a three-dimensional environment via the display generating component;
detecting, via the one or more input devices, a respective input comprising movement of a predefined portion of a user of the electronic device while the user interface object is displayed, wherein during the respective input, a location of the predefined portion of the user is away from a location corresponding to the user interface object; and
upon detection of the respective input:
in accordance with a determination that a first portion of the movement of the predefined portion of the user meets one or more criteria and the predefined portion of the user is in a first location, displaying, via the display generating component, a visual indication at a first location in the three-dimensional environment corresponding to the first location of the predefined portion of the user; and
in accordance with a determination that the first portion of the movement of the predefined portion of the user meets the one or more criteria and the predefined portion of the user is in a second location, a visual indication is displayed via the display generating component at a second location in the three-dimensional environment corresponding to the second location of the predefined portion of the user, wherein the second location is different from the first location.
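As an illustrative reading of claim 115, the visual indication appears at whatever point in the three-dimensional environment corresponds to the hand's current location once the first portion of its movement satisfies the criteria; it is anchored to the hand, not to the far-away object. A minimal sketch under that assumption; the type and function names are hypothetical.

    // Hypothetical sketch of claim 115's hand-anchored indication.
    struct VisualIndication { var position: SIMD3<Float> }

    func indicationForIndirectInput(firstPortionMetCriteria: Bool,
                                    handPosition: SIMD3<Float>) -> VisualIndication? {
        guard firstPortionMetCriteria else { return nil }
        // The indication is placed at (or just in front of) the hand, wherever it happens to be.
        return VisualIndication(position: handPosition)
    }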
116. The method of claim 115, further comprising:
upon detection of the respective input:
in accordance with the determination that the first portion of the movement of the predefined portion of the user meets the one or more criteria and in accordance with the determination that one or more second criteria are met, the one or more second criteria including a criterion met when the first portion of the movement of the predefined portion of the user is followed by a second portion of the movement of the predefined portion of the user, performing a selection operation with respect to the user interface object in accordance with the respective input; and
in accordance with the determination that the first portion of the movement of the predefined portion of the user does not meet the one or more criteria and in accordance with the determination that the one or more second criteria are met, forgoing performing the selection operation with respect to the user interface object.
117. The method of any one of claims 115-116, further comprising:
upon detection of the respective input, a representation of the predefined portion of the user that moves according to the movement of the predefined portion of the user is displayed via the display generating component.
118. The method of any of claims 115-117, wherein the predefined portion of the user is visible in the three-dimensional environment via the display generation component.
119. The method of any one of claims 115 to 118, further comprising:
upon detection of the respective input and in accordance with the determination that the first portion of the movement of the predefined portion of the user meets the one or more criteria, modifying a display of the user interface object in accordance with the respective input.
120. The method of claim 119, wherein modifying the display of the user interface object comprises:
in accordance with a determination that the predefined portion of the user moves toward a location corresponding to the user interface object after the first portion of the movement of the predefined portion of the user meets the one or more criteria, the user interface object is moved back in the three-dimensional environment in accordance with the movement of the predefined portion of the user toward the location corresponding to the user interface object.
121. The method according to claim 120, wherein:
displaying the user interface objects in respective user interfaces via the display generating means,
in accordance with a determination that the respective input is a scroll input, the electronic device moves the respective user interface and the user interface object back in accordance with the movement of the predefined portion of the user toward the location corresponding to the user interface object, and
In accordance with a determination that the respective input is an input other than a scroll input, the electronic device moves the user interface object relative to the respective user interface without moving the respective user interface.
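A hypothetical sketch of the depth behavior in claims 120-121: forward hand travel pushes the target object back, and a scroll-type input additionally pushes the containing user interface back with it, while other inputs move only the object relative to that user interface. The types and field names are assumed.

    // Hypothetical depth update per claims 120-121 (larger depth = farther from the user).
    enum InputKind { case scroll, other }

    struct Depths { var object: Float; var containerUI: Float }

    func pushBack(_ depths: inout Depths, handAdvance: Float, kind: InputKind) {
        guard handAdvance > 0 else { return }            // only forward motion pushes back
        depths.object += handAdvance                     // the object always recedes with the hand
        if kind == .scroll { depths.containerUI += handAdvance }   // scrolls carry the user interface too
    }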
122. The method of any one of claims 120 to 121, further comprising:
upon detection of the respective input:
detecting movement of the predefined portion of the user away from the location corresponding to the user interface object after detecting the movement of the predefined portion of the user toward the user interface object and after moving the user interface object back in the three-dimensional environment; and
in response to detecting the movement of the predefined portion of the user away from the location corresponding to the user interface object, the user interface object is moved forward in the three-dimensional environment in accordance with the movement of the predefined portion of the user away from the location corresponding to the user interface object.
123. The method of any one of claims 115-122, wherein:
the visual indication at the first location in the three-dimensional environment corresponding to the first location of the predefined portion of the user is displayed in proximity to a representation of the predefined portion of the user visible in the three-dimensional environment at a first respective location in the three-dimensional environment, and
the visual indication at the second location in the three-dimensional environment corresponding to the second location of the predefined portion of the user is displayed in proximity to the representation of the predefined portion of the user visible in the three-dimensional environment at a second corresponding location in the three-dimensional environment.
124. The method of any one of claims 119-123, further comprising:
detecting, via the one or more input devices, a second corresponding input comprising movement of the predefined portion of the user while the user interface object is displayed, wherein during the second corresponding input the location of the predefined portion of the user is at the location corresponding to the user interface object; and
Upon detection of the second corresponding input:
modifying a display of the user interface object according to the second corresponding input without displaying the visual indication at the location corresponding to the predefined portion of the user via the display generating component.
125. The method of any of claims 119-124, wherein the electronic device performs a respective operation in response to the respective input, and the method further comprises:
while displaying the user interface object, detecting, via the one or more input devices, a third corresponding input comprising movement of the predefined portion of the user, the movement comprising the same type of movement as the movement of the predefined portion of the user in the corresponding input, wherein during the third corresponding input the location of the predefined portion of the user is at the location corresponding to the user interface object; and
in response to detecting the third corresponding input, the corresponding operation is performed.
126. The method of any one of claims 115 to 125, further comprising:
before detecting the respective input:
In accordance with a determination that the user's gaze is directed at the user interface object, displaying the user interface object with a respective visual characteristic having a first value; and
in accordance with a determination that the gaze of the user is not directed to the user interface object, the user interface object is displayed with the respective visual characteristic having a second value different from the first value.
127. The method of any one of claims 115 to 126, further comprising:
upon detection of the respective input:
after the first portion of the movement of the predefined portion of the user meets the one or more criteria:
in accordance with a determination that a second portion of the movement of the predefined portion of the user that satisfies one or more second criteria and a subsequent third portion of the movement of the predefined portion of the user that satisfies one or more third criteria are detected, wherein the one or more second criteria include criteria that are satisfied when the second portion of the movement of the predefined portion of the user includes movement toward the location corresponding to the user interface object that is greater than a movement threshold, and the one or more third criteria include criteria that are satisfied when the third portion of the movement is away from the location corresponding to the user interface object and is detected within a time threshold of the second portion of the movement, a tap operation is performed with respect to the user interface object.
128. The method of any one of claims 115 to 127, further comprising:
upon detection of the respective input:
after the first portion of the movement of the predefined portion of the user meets the one or more criteria:
in accordance with a determination that a second portion of the movement of the predefined portion of the user that satisfies one or more second criteria and a subsequent third portion of the movement of the predefined portion of the user that satisfies one or more third criteria are detected, wherein the one or more second criteria include a criterion that is satisfied when the second portion of the movement of the predefined portion of the user includes movement toward the location corresponding to the user interface object that is greater than a movement threshold, and the one or more third criteria include a criterion that is satisfied when the third portion of the movement is a lateral movement relative to the location corresponding to the user interface object, performing a scroll operation with respect to the user interface object in accordance with the third portion of the movement.
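Claims 127 and 128 separate a tap (a push past a movement threshold followed by retraction within a time threshold) from a scroll (a push past the threshold followed by lateral movement). The classifier below is an illustrative sketch only; the threshold parameters and the IndirectGesture type are assumptions.

    // Hypothetical tap/scroll classifier for the indirect gestures of claims 127-128.
    enum IndirectGesture { case none, tap, scroll(lateralDelta: Float) }

    func classify(pushDistance: Float,          // movement toward the object
                  movementThreshold: Float,
                  retractDistance: Float,       // subsequent movement away from the object
                  lateralDelta: Float,          // subsequent movement across the object
                  timeSincePush: Double,
                  timeThreshold: Double) -> IndirectGesture {
        guard pushDistance > movementThreshold else { return .none }
        if retractDistance > 0, timeSincePush <= timeThreshold { return .tap }
        if lateralDelta != 0 { return .scroll(lateralDelta: lateralDelta) }
        return .none
    }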
129. The method of any one of claims 115 to 128, further comprising:
Upon detection of the respective input:
detecting, via the one or more input devices, a second portion of the movement of the predefined portion of the user away from the location corresponding to the user interface object after the first portion of the movement of the predefined portion of the user meets the one or more criteria; and
in response to detecting the second portion of the movement, an appearance of the visual indication is updated in accordance with the second portion of the movement.
130. The method of claim 129, wherein updating the appearance of the visual indication comprises ceasing to display the visual indication, the method further comprising:
after ceasing to display the visual indication, detecting, via the one or more input devices, a second corresponding input comprising a second movement of the predefined portion of the user, wherein during the second corresponding input, the location of the predefined portion of the user is distant from the location corresponding to the user interface object; and
upon detection of the second corresponding input:
in accordance with a determination that a first portion of the second movement meets the one or more criteria, during the second corresponding input, a second visual indication is displayed via the display generating component at a location in the three-dimensional environment corresponding to the predefined portion of the user.
131. The method of any of claims 115-130, wherein the respective input corresponds to a scroll input directed to the user interface object, the method further comprising:
scrolling the user interface object according to the respective input while maintaining the visual indication displayed.
132. The method of any one of claims 115 to 131, further comprising:
upon detection of the respective input:
detecting, via the one or more input devices, a second portion of the movement of the predefined portion of the user that meets one or more second criteria after the first portion of the movement of the predefined portion of the user meets the one or more criteria, the one or more second criteria including a criterion that is met when the second portion of the movement corresponds to a distance between a location corresponding to the visual indication and the predefined portion of the user; and
in response to detecting the second portion of the movement of the predefined portion of the user, generating audio feedback indicating that the one or more second criteria are met.
133. The method of any one of claims 115 to 132, further comprising:
Detecting, while displaying the user interface object, that one or more second criteria are met, the one or more second criteria including a criterion that is met when the predefined portion of the user has a respective pose when the location of the predefined portion of the user is away from the location corresponding to the user interface object; and
in response to detecting that the one or more second criteria are met, a virtual surface is displayed via the display generating component proximate to and remote from the user interface object at a location corresponding to the predefined portion of the user.
134. The method of claim 133, further comprising:
while displaying the virtual surface, detecting, via the one or more input devices, a respective movement of the predefined portion of the user toward a location corresponding to the virtual surface; and
in response to detecting the respective movement, changing a visual appearance of the virtual surface according to the respective movement.
135. The method of any of claims 133-134, further comprising:
while displaying the virtual surface, detecting, via the one or more input devices, a respective movement of the predefined portion of the user toward a location corresponding to the virtual surface; and
In response to detecting the respective movement, changing a visual appearance of the user interface object in accordance with the respective movement.
136. The method of any of claims 133-135, wherein displaying the virtual surface near a location corresponding to the predefined portion of the user includes displaying the virtual surface at a respective distance from the location corresponding to the predefined portion of the user, the respective distance corresponding to an amount of movement of the predefined portion of the user toward a location corresponding to the virtual surface required to perform an operation with respect to the user interface object.
137. The method of any of claims 133-136, further comprising:
when the virtual surface is displayed, a visual indication of a distance between the predefined portion of the user and a location corresponding to the virtual surface is displayed on the virtual surface.
138. The method of any of claims 133-137, further comprising:
while displaying the virtual surface, detecting, via the one or more input devices, movement of the predefined portion of the user to a respective location greater than a threshold distance from a location corresponding to the virtual surface; and
In response to detecting the movement of the predefined portion of the user to the respective location, ceasing to display the virtual surface in the three-dimensional environment.
139. The method of any of claims 133-138, wherein displaying the virtual surface proximate the predefined portion of the user includes:
in accordance with a determination that the predefined portion of the user is at a first respective location when the one or more second criteria are met, displaying the virtual surface in the three-dimensional environment at a third location corresponding to the first respective location of the predefined portion of the user; and
in accordance with a determination that the predefined portion of the user is at a second respective location different from the first respective location when the one or more second criteria are met, the virtual surface is displayed at a fourth location in the three-dimensional environment that is different from the third location corresponding to the second respective location of the predefined portion of the user.
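Claims 133-139 describe a virtual surface spawned near the far-away hand, offset by the travel needed to actuate the remote object, and removed once the hand strays beyond a threshold distance from it. A hypothetical update function under those assumptions; all names, the forward-axis parameter, and the distances are illustrative.

    // Hypothetical virtual-surface lifecycle per claims 133-139.
    struct VirtualSurface { var position: SIMD3<Float> }

    func updateSurface(current: VirtualSurface?,
                       handPosition: SIMD3<Float>,
                       handInReadyPose: Bool,
                       forwardAxis: SIMD3<Float>,     // unit vector pointing away from the hand
                       requiredTravel: Float,         // movement needed to perform the operation
                       dismissDistance: Float) -> VirtualSurface? {
        if let surface = current {
            let d = surface.position - handPosition
            let dist = (d.x * d.x + d.y * d.y + d.z * d.z).squareRoot()
            return dist > dismissDistance ? nil : surface    // claim 138: hand too far, remove it
        }
        guard handInReadyPose else { return nil }            // claim 133: respective pose required
        // Claims 136 and 139: spawn the surface in front of wherever the hand currently is.
        return VirtualSurface(position: handPosition + forwardAxis * requiredTravel)
    }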
140. The method of any one of claims 115 to 139, further comprising:
upon displaying the visual indication corresponding to the predefined portion of the user:
Detecting, via the one or more input devices, a second corresponding input comprising movement of a second predefined portion of the user, wherein during the second corresponding input, a location of the second predefined portion of the user is away from the location corresponding to the user interface object; and
upon detection of the second corresponding input:
in accordance with a determination that a first portion of the movement of the second predefined portion of the user meets the one or more criteria, concurrently displaying via the display generating component:
the visual indication corresponding to the predefined portion of the user; and
a visual indication at a location corresponding to the second predefined portion of the user in the three-dimensional environment.
141. The method of any one of claims 115 to 140, further comprising:
upon detecting the respective input, a respective visual indication is displayed on the user interface object, the respective visual indication representing a respective distance the predefined portion of the user needs to be moved toward the location corresponding to the user interface object to interact with the user interface object.
142. The method of any one of claims 115-141, further comprising:
Detecting that the user's gaze is directed to the user interface object while the user interface object is displayed; and
in response to detecting that the gaze of the user is directed to the user interface object, the user interface object is displayed with a respective visual characteristic having a first value.
143. The method of any of claims 115-142, wherein the three-dimensional environment includes a representation of a respective object in a physical environment of the electronic device, the method further comprising:
detecting that one or more second criteria are met, the one or more second criteria including a criterion that is met when the user's gaze is directed to the representation of the respective object, and a criterion that is met when the predefined portion of the user is in a respective pose; and
in response to detecting that the one or more second criteria are met, one or more selectable options are displayed in proximity to the representation of the respective object via the display generation component, wherein the one or more selectable options are selectable to perform a respective operation associated with the respective object.
144. The method of any one of claims 115 to 143, further comprising:
Detecting, via the one or more input devices, a second portion of the movement of the predefined portion of the user that meets one or more second criteria after the first portion of the movement of the predefined portion of the user meets the one or more criteria and while the visual indication corresponding to the predefined portion of the user is displayed; and
in response to detecting the second portion of the movement of the predefined portion of the user:
in accordance with a determination that the user's gaze is directed toward the user interface object and the user interface object is interactive:
displaying, via the display generating component, a visual indication that the second portion of the movement of the predefined portion of the user meets the one or more second criteria; and
executing an operation corresponding to the user interface object according to the corresponding input; and
in accordance with a determination that the gaze of the user is not directed to an interactive user interface object:
displaying, via the display generating component, the visual indication without performing an operation according to the respective input, the visual indication representing that the second portion of the movement of the predefined portion of the user meets the one or more second criteria.
145. An electronic device, comprising:
one or more processors;
a memory; and
one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs comprising instructions for:
displaying a user interface object in a three-dimensional environment via the display generating component;
detecting, via the one or more input devices, a respective input comprising movement of a predefined portion of a user of the electronic device while the user interface object is displayed, wherein during the respective input, a location of the predefined portion of the user is away from a location corresponding to the user interface object; and
upon detection of the respective input:
in accordance with a determination that a first portion of the movement of the predefined portion of the user meets one or more criteria and the predefined portion of the user is in a first location, displaying, via the display generating component, a visual indication at a first location in the three-dimensional environment corresponding to the first location of the predefined portion of the user; and
In accordance with a determination that the first portion of the movement of the predefined portion of the user meets the one or more criteria and the predefined portion of the user is in a second location, a visual indication is displayed via the display generating component at a second location in the three-dimensional environment corresponding to the second location of the predefined portion of the user, wherein the second location is different from the first location.
146. A non-transitory computer readable storage medium storing one or more programs, the one or more programs comprising instructions, which when executed by one or more processors of an electronic device, cause the electronic device to perform a method comprising:
displaying a user interface object in a three-dimensional environment via the display generating component;
detecting, via the one or more input devices, a respective input comprising movement of a predefined portion of a user of the electronic device while the user interface object is displayed, wherein during the respective input, a location of the predefined portion of the user is away from a location corresponding to the user interface object; and
upon detection of the respective input:
In accordance with a determination that a first portion of the movement of the predefined portion of the user meets one or more criteria and the predefined portion of the user is in a first location, displaying, via the display generating component, a visual indication at a first location in the three-dimensional environment corresponding to the first location of the predefined portion of the user; and
in accordance with a determination that the first portion of the movement of the predefined portion of the user meets the one or more criteria and the predefined portion of the user is in a second location, a visual indication is displayed via the display generating component at a second location in the three-dimensional environment corresponding to the second location of the predefined portion of the user, wherein the second location is different from the first location.
147. An electronic device, comprising:
one or more processors;
a memory;
means for: displaying a user interface object in a three-dimensional environment via the display generating component;
means for: detecting, via the one or more input devices, a respective input comprising movement of a predefined portion of a user of the electronic device while the user interface object is displayed, wherein during the respective input, a location of the predefined portion of the user is away from a location corresponding to the user interface object; and
Means for: upon detection of the respective input:
in accordance with a determination that a first portion of the movement of the predefined portion of the user meets one or more criteria and the predefined portion of the user is in a first location, displaying, via the display generating component, a visual indication at a first location in the three-dimensional environment corresponding to the first location of the predefined portion of the user; and
in accordance with a determination that the first portion of the movement of the predefined portion of the user meets the one or more criteria and the predefined portion of the user is in a second location, a visual indication is displayed via the display generating component at a second location in the three-dimensional environment corresponding to the second location of the predefined portion of the user, wherein the second location is different from the first location.
148. An information processing apparatus for use in an electronic device, the information processing apparatus comprising:
means for: displaying a user interface object in a three-dimensional environment via the display generating component;
means for: upon display of the user interface object, detecting, via the one or more input devices, a respective input comprising movement of a predefined portion of a user of the electronic device, wherein during the respective input, a location of the predefined portion of the user is distant from a location corresponding to the user interface object; and
Means for: upon detection of the respective input:
in accordance with a determination that a first portion of the movement of the predefined portion of the user meets one or more criteria and the predefined portion of the user is in a first location, displaying, via the display generating component, a visual indication at a first location in the three-dimensional environment corresponding to the first location of the predefined portion of the user; and
in accordance with a determination that the first portion of the movement of the predefined portion of the user meets the one or more criteria and the predefined portion of the user is in a second location, a visual indication is displayed via the display generating component at a second location in the three-dimensional environment corresponding to the second location of the predefined portion of the user, wherein the second location is different from the first location.
149. An electronic device, comprising:
one or more processors;
a memory; and
one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs comprising instructions for performing any of the methods of claims 115-144.
150. A non-transitory computer readable storage medium storing one or more programs, the one or more programs comprising instructions, which when executed by one or more processors of an electronic device, cause the electronic device to perform any of the methods of claims 115-144.
151. An electronic device, comprising:
one or more processors;
a memory; and
means for performing any one of the methods of claims 115-144.
152. An information processing apparatus for use in an electronic device, the information processing apparatus comprising:
means for performing any one of the methods of claims 115-144.
153. A method, comprising:
at an electronic device in communication with a display generation component and one or more input devices:
displaying a user interface object via the display generating means;
while displaying the user interface object, detecting, via the one or more input devices, an input directed to the user interface object by a first predefined portion of a user of the electronic device; and
upon detecting the input directed to the user interface object, displaying, via the display generating component, a simulated shadow displayed on the user interface object, wherein the simulated shadow has an appearance based on a positioning of an element indicative of an interaction with the user interface object relative to the user interface object.
154. The method of claim 153, wherein the element comprises a cursor displayed at a location corresponding to a location remote from the first predefined portion of the user and controlled by movement of the first predefined portion of the user.
155. The method of claim 154, further comprising:
upon displaying the user interface object and a second user interface object, and prior to detecting the input directed to the user interface object by the first predefined portion of the user:
in accordance with a determination that one or more first criteria are met, the one or more first criteria including a criterion that is met when the user's gaze is directed to the user interface object, displaying, via the display generating component, the cursor at a predetermined distance from the user interface object;
in accordance with a determination that one or more second criteria are met, the one or more second criteria including a criterion that is met when the gaze of the user is directed to the second user interface object, the cursor is displayed via the display generating component at the predetermined distance from the second user interface object.
156. The method of claim 153, wherein the simulated shadow comprises a simulated shadow of a virtual representation of the first predefined portion of the user.
157. The method of claim 153, wherein the simulated shadow comprises a simulated shadow of the physical first predefined portion of the user.
158. The method of any of claims 153-157, further comprising:
upon detecting the input directed to the user interface object and upon displaying the simulated shadow displayed on the user interface object:
detecting, via the one or more input devices, progress of the input directed to the user interface object by the first predefined portion of the user; and
in response to detecting the progress of the input directed to the user interface object, changing a visual appearance of the simulated shadow displayed on the user interface object in accordance with the progress of the input directed to the user interface object by the first predefined portion of the user.
159. The method of claim 158, wherein changing the visual appearance of the simulated shadow comprises changing a brightness used to display the simulated shadow.
160. The method of any of claims 158-159, wherein changing the visual appearance of the simulated shadow includes changing a level of blurriness for displaying the simulated shadow.
161. The method of any of claims 158-160, wherein changing the visual appearance of the simulated shadow includes changing a size of the simulated shadow.
162. The method of any one of claims 153-161, further comprising:
upon detecting the input directed to the user interface object and upon displaying the simulated shadow displayed on the user interface object:
detecting, via the one or more input devices, a first portion of the input corresponding to moving the element laterally relative to the user interface object;
in response to detecting the first portion of the input, displaying the simulated shadow at a first location on the user interface object with a first visual appearance;
detecting, via the one or more input devices, a second portion of the input corresponding to moving the element laterally relative to the user interface object; and
in response to detecting the second portion of the input, the simulated shadow is displayed at a second location on the user interface object that is different from the first location with a second visual appearance that is different from the first visual appearance.
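Claims 158-162 describe redrawing the simulated shadow as the input progresses: it darkens, sharpens and shrinks as the fingertip or cursor closes on the object, and it follows lateral movement across the object. A sketch of one assumed mapping from input progress to shadow appearance; the numeric constants are illustrative only.

    // Hypothetical progress-to-shadow mapping per claims 158-162.
    struct ShadowAppearance {
        var center: SIMD2<Float>   // point on the object beneath the interacting element
        var size: Float
        var blur: Float
        var brightness: Float      // lower = darker shadow
    }

    // progress runs from 0 (input just begun) to 1 (element reaches the object).
    func shadowAppearance(center: SIMD2<Float>, progress: Float) -> ShadowAppearance {
        let p = min(max(progress, 0), 1)
        return ShadowAppearance(center: center,
                                size: 0.06 - 0.04 * p,       // shrinks as the press completes
                                blur: 8.0 - 6.0 * p,         // gets sharper
                                brightness: 0.9 - 0.5 * p)   // gets darker
    }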
163. The method of any of claims 153-162, wherein the user interface object is a virtual surface and the input detected at a location near the virtual surface provides input to a second user interface object that is remote from the virtual surface.
164. The method of any of claims 153-163, wherein the first predefined portion of the user interacts directly with the user interface object and the simulated shadow is displayed on the user interface object.
165. The method of any of claims 153-164, wherein:
in accordance with a determination that the first predefined portion of the user is within a threshold distance of a location corresponding to the user interface object, the simulated shadow corresponds to the first predefined portion of the user, and
in accordance with a determination that the first predefined portion of the user is farther than the threshold distance from the location corresponding to the user interface object, the simulated shadow corresponds to a cursor controlled by the first predefined portion of the user.
166. The method of any one of claims 153-165, further comprising:
Upon detecting the input directed to the user interface object by the first predefined portion of the user, detecting a second input directed to the user interface object by a second predefined portion of the user; and
upon simultaneous detection of the input and the second input directed to the user interface object, simultaneously displaying on the user interface object:
the simulated shadow relative to the user interface object indicative of interaction of the first predefined portion of the user with the user interface object; and
a second simulated shadow relative to the user interface object indicating interaction of the second predefined portion of the user with the user interface object.
167. The method of any of claims 153-166, wherein the simulated shadow indicates how much movement is required for the first predefined portion of the user to interact with the user interface object.
168. An electronic device, comprising:
one or more processors;
a memory; and
one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs comprising instructions for:
Displaying a user interface object via the display generating means;
while displaying the user interface object, detecting, via the one or more input devices, an input directed to the user interface object by a first predefined portion of a user of the electronic device; and
upon detecting the input directed to the user interface object, displaying, via the display generating component, a simulated shadow displayed on the user interface object, wherein the simulated shadow has an appearance based on a positioning of an element indicative of an interaction with the user interface object relative to the user interface object.
169. A non-transitory computer readable storage medium storing one or more programs, the one or more programs comprising instructions, which when executed by one or more processors of an electronic device, cause the electronic device to perform a method comprising:
displaying a user interface object via the display generating means;
while displaying the user interface object, detecting, via the one or more input devices, an input directed to the user interface object by a first predefined portion of a user of the electronic device; and
Upon detecting the input directed to the user interface object, displaying, via the display generating component, a simulated shadow displayed on the user interface object, wherein the simulated shadow has an appearance based on a positioning of an element indicative of an interaction with the user interface object relative to the user interface object.
170. An electronic device, comprising:
one or more processors;
a memory;
means for: displaying a user interface object via the display generating means;
while displaying the user interface object, detecting, via the one or more input devices, an input directed to the user interface object by a first predefined portion of a user of the electronic device; and
upon detecting the input directed to the user interface object, displaying, via the display generating component, a simulated shadow displayed on the user interface object, wherein the simulated shadow has an appearance based on a positioning of an element indicative of an interaction with the user interface object relative to the user interface object.
171. An information processing apparatus for use in an electronic device, the information processing apparatus comprising:
Means for: displaying a user interface object via the display generating means;
while displaying the user interface object, detecting, via the one or more input devices, an input directed to the user interface object by a first predefined portion of a user of the electronic device; and
upon detecting the input directed to the user interface object, displaying, via the display generating component, a simulated shadow displayed on the user interface object, wherein the simulated shadow has an appearance based on a positioning of an element indicative of an interaction with the user interface object relative to the user interface object.
172. An electronic device, comprising:
one or more processors;
a memory; and
one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs comprising instructions for performing any of the methods of claims 153-167.
173. A non-transitory computer readable storage medium storing one or more programs, the one or more programs comprising instructions, which when executed by one or more processors of an electronic device, cause the electronic device to perform any of the methods of claims 153-167.
174. An electronic device, comprising:
one or more processors;
a memory; and
apparatus for performing any one of the methods of claims 153-167.
175. An information processing apparatus for use in an electronic device, the information processing apparatus comprising:
apparatus for performing any one of the methods of claims 153-167.
176. A method, comprising:
at an electronic device in communication with a display generation component and one or more input devices:
displaying, via the display generating component, a user interface comprising respective areas, the respective areas comprising a first user interface element and a second user interface element;
while displaying the user interface, detecting, via the one or more input devices, a first input directed to the first user interface element in the respective region;
in response to detecting the first input directed to the first user interface element, modifying an appearance of the first user interface element to indicate that further input directed to the first user interface element will cause selection of the first user interface element;
while displaying the first user interface element in the modified appearance, detecting a second input via the one or more input devices; and
In response to detecting the second input:
in accordance with a determination that the second input includes a movement corresponding to a movement away from the first user interface element:
in accordance with a determination that the movement corresponds to movement within the respective region of the user interface, discarding selection of the first user interface element and modifying an appearance of the second user interface element to indicate that further input directed to the second user interface element will cause selection of the second user interface element; and
in accordance with a determination that the movement corresponds to movement in a first direction outside of the respective region of the user interface, selection of the first user interface element is dismissed without modifying the appearance of the second user interface element.
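A minimal illustration of the outcome table implied by claim 176: movement that stays inside the containing region slides the pending selection onto the neighboring element, while movement out of the region in the first direction abandons it. The names are hypothetical.

    // Hypothetical press-resolution outcomes per claim 176.
    enum PressOutcome { case selectFirst, retargetToSecond, cancel }

    func resolvePress(movedAwayFromFirst: Bool, stayedInsideRegion: Bool) -> PressOutcome {
        guard movedAwayFromFirst else { return .selectFirst }
        return stayedInsideRegion ? .retargetToSecond : .cancel
    }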
177. The method of claim 176, further comprising:
in response to detecting the second input, and in accordance with a determination that the movement corresponds to movement in a second direction outside the respective region of the user interface:
in accordance with a determination that the first input includes input provided by a predefined portion of a user when the predefined portion is farther than a threshold distance from a location corresponding to the first user interface element, forgoing selection of the first user interface element; and
in accordance with a determination that the first input includes input provided by the predefined portion of the user when the predefined portion of the user is closer than the threshold distance to the location corresponding to the first user interface element, selecting the first user interface element in accordance with the second input.
178. The method of any of claims 176-177, wherein the first input includes input provided by a predefined portion of a user, and wherein, in response to detecting the second input and in accordance with the determination that the movement corresponds to movement in the first direction outside of the respective region of the user interface, selection of the first user interface element is forgone regardless of whether the predefined portion of the user was farther than a threshold distance from, or closer than the threshold distance to, a location corresponding to the first user interface element during the first input.
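For illustration only: claims 177-178 distinguish direct input (the user's hand within a threshold distance of the element) from indirect input (the hand farther away). A hedged Swift sketch of that split follows; the direction names, the threshold value, and the assumption that the "second direction" is the one in which a direct press can still be committed are all inventions of this sketch.

enum ExitDirection { case first, second }   // the two exit directions named in the claims
enum Resolution { case forgoSelection, commitSelection }

/// Assumed threshold separating direct input (hand near the element) from
/// indirect input (hand far from the element, e.g. gaze plus pinch).
let directInputThreshold = 0.15   // meters; illustrative value only

/// Resolves a second input whose movement left the respective region.
/// Claim 178: exiting in the first direction forgoes selection regardless of distance.
/// Claim 177: exiting in the second direction forgoes selection only for indirect
/// input; for direct input the selection is still made in accordance with the input.
func resolveExit(direction: ExitDirection, handDistanceDuringFirstInput: Double) -> Resolution {
    switch direction {
    case .first:
        return .forgoSelection
    case .second:
        return handDistanceDuringFirstInput > directInputThreshold ? .forgoSelection : .commitSelection
    }
}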
179. The method of any one of claims 176-178, further comprising:
while displaying the user interface, detecting, via the one or more input devices, a third input directed to a third user interface element in the respective region, wherein the third user interface element is a slider element and the third input includes a moving portion for controlling the slider element;
in response to detecting the third input directed to the third user interface element, modifying an appearance of the third user interface element to indicate that further input directed to the third user interface element will cause further control of the third user interface element, and updating the third user interface element in accordance with the moving portion of the third input;
detecting a fourth input while displaying the third user interface element in a modified appearance and while updating the third user interface element in accordance with the moving portion of the third input; and
in response to detecting the fourth input:
in accordance with a determination that the fourth input includes a movement corresponding to a movement away from the third user interface element:
maintaining a modified appearance of the third user interface element to indicate that further input directed to the third user interface element will cause further control of the third user interface element; and
updating the third user interface element in accordance with the movement of the fourth input, irrespective of whether the movement of the fourth input corresponds to movement outside the respective region of the user interface.
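As a non-normative illustration: claim 179 treats an engaged slider differently from a discrete control — once the movement portion has begun, later movement keeps driving the slider even outside the respective region. A minimal Swift sketch, with an invented Slider type and a simple normalization of horizontal movement:

/// Minimal slider model; `value` is normalized to the range 0...1.
struct Slider { var value: Double }

/// Once a drag on the slider has begun, movement keeps updating the slider
/// whether or not it stays inside the slider's region (hence no containment
/// test here), unlike the button-style elements of claim 176.
func updateEngagedSlider(_ slider: inout Slider, movementDeltaX: Double, regionWidth: Double) {
    let change = movementDeltaX / max(regionWidth, .ulpOfOne)
    slider.value = min(1.0, max(0.0, slider.value + change))
}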
180. The method of claim 179, wherein the moving portion of the third input includes inputs provided by predefined portions of a user having respective magnitudes, and updating the third user interface element in accordance with the moving portion of the third input includes:
in accordance with a determination that the predefined portion of the user is moving at a first speed during the moving portion of the third input, updating the third user interface element by a first amount determined based on the first speed of the predefined portion of the user and the respective magnitude of the moving portion of the third input; and
in accordance with a determination that the predefined portion of the user is moving at a second speed greater than the first speed during the moving portion of the third input, updating the third user interface element by a second amount determined based on the second speed of the predefined portion of the user and the respective magnitude of the moving portion of the third input, wherein, for the respective magnitude of the moving portion of the third input, the second amount of movement of the third user interface element is greater than the first amount of movement of the third user interface element.
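For illustration only: claim 180 requires only that, for a movement of a given magnitude, a faster hand movement produce a larger update of the slider. The gain curve in the Swift sketch below is an assumption chosen to satisfy that property, not anything recited in the claim.

/// Speed-dependent gain: the same physical displacement moves the control
/// farther when the hand moves faster. Any gain that increases with speed
/// would satisfy the claim; this linear, clamped curve is illustrative.
func sliderDelta(handDisplacement: Double, handSpeed: Double) -> Double {
    let gain = 1.0 + min(handSpeed, 2.0)
    return handDisplacement * gain
}

let slowUpdate = sliderDelta(handDisplacement: 0.02, handSpeed: 0.2)   // 0.024
let fastUpdate = sliderDelta(handDisplacement: 0.02, handSpeed: 1.0)   // 0.040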
181. The method of any one of claims 176-180, wherein:
the movement of the second input is provided by a corresponding movement of a predefined portion of the user,
in accordance with a determination that the respective region of the user interface has a first size, in accordance with a determination that the respective movement of the predefined portion of the user has a first magnitude, the movement of the second input corresponds to movement outside of the respective region of the user interface, and
in accordance with a determination that the respective region of the user interface has a second size different from the first size, in accordance with a determination that the respective movement of the predefined portion of the user has the first magnitude, the movement of the second input corresponds to movement outside of the respective region of the user interface.
182. The method of any one of claims 176-181, wherein:
detecting the first input includes detecting that a gaze of a user of the electronic device is directed toward the first user interface element,
detecting the second input includes detecting the movement corresponding to the movement away from the first user interface element and detecting that the gaze of the user is no longer directed toward the first user interface element, and
the forgoing of the selection of the first user interface element and the modifying of the appearance of the second user interface element to indicate that further input directed to the second user interface element will cause selection of the second user interface element are performed while the gaze of the user is not directed toward the first user interface element.
183. The method of any of claims 176-182, wherein detecting the first input includes detecting that a gaze of a user of the electronic device is directed toward the respective region of the user interface, the method further comprising:
while displaying the first user interface element in the modified appearance and before detecting the second input, detecting, via the one or more input devices, that the gaze of the user is directed to a second region of the user interface that is different from the respective region; and
in response to detecting that the gaze of the user is directed to the second region:
in accordance with a determination that the second region includes a third user interface element, modifying an appearance of the third user interface element to indicate that further input directed to the third user interface element will cause interaction with the third user interface element.
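As an illustration only: claims 182-183 gate the hover behavior on gaze — the first element is released only once gaze leaves it, and looking at a different region highlights an interactive element there. A small Swift sketch with string identifiers standing in for elements and regions:

/// Hypothetical gaze-driven hover state; identifiers are plain strings here.
struct GazeHoverState {
    var hoveredElement: String?
    var gazedRegion: String

    /// Claim 182: the hover state leaves the first element (and, within the
    /// same region, moves to the second) only once gaze is off the first element.
    mutating func handleMovementAwayFromFirst(gazeStillOnFirst: Bool, movementStaysInRegion: Bool) {
        guard !gazeStillOnFirst else { return }
        hoveredElement = movementStaysInRegion ? "second" : nil
    }

    /// Claim 183: gaze shifting to a different region that contains an interactive
    /// element gives that element the "further input will interact" appearance instead.
    mutating func handleGazeRegionChange(to region: String, elementInRegion: String?) {
        gazedRegion = region
        if let element = elementInRegion { hoveredElement = element }
    }
}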
184. The method of any of claims 176-183, wherein the first input includes movement of a predefined portion of a user of the electronic device in space in an environment of the electronic device without the predefined portion of the user contacting a physical input device.
185. The method of any of claims 176-184, wherein the first input includes a pinch gesture performed by a hand of a user of the electronic device.
186. The method of any of claims 176-185, wherein the first input includes movement of a finger of a hand of a user of the electronic device through space in an environment of the electronic device.
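For illustration only: claims 184-186 name three kinds of hand input — free-space movement without a physical input device, a pinch gesture, and movement of an individual finger. The Swift sketch below merely labels those kinds; the pose fields and thresholds are assumptions, and real detection (cameras, a hand-tracking model) is outside its scope.

/// The kinds of hand input named in claims 184-186.
enum HandInputKind {
    case airMovement   // claim 184: hand moving through space, no physical input device
    case pinch         // claim 185: thumb and index finger brought together
    case fingerPoint   // claim 186: movement of an individual extended finger
}

/// Hypothetical hand-pose summary; field names are invented for this sketch.
struct HandPose {
    var thumbIndexDistance: Double   // meters between thumb tip and index tip
    var indexExtended: Bool
    var touchingPhysicalDevice: Bool
}

/// Illustrative classification only; the 1 cm pinch threshold is an assumption.
func classify(_ pose: HandPose) -> HandInputKind? {
    guard !pose.touchingPhysicalDevice else { return nil }
    if pose.thumbIndexDistance < 0.01 { return .pinch }
    if pose.indexExtended { return .fingerPoint }
    return .airMovement
}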
187. The method of any one of claims 176-186, further comprising:
in response to detecting the second input:
in accordance with the determination that the second input includes movement corresponding to movement away from the first user interface element:
in accordance with the determination that the movement corresponds to movement within the respective region of the user interface, modifying the appearance of the first user interface element to indicate that further input will no longer be directed to the first user interface element.
188. The method of any one of claims 176-187, wherein:
in accordance with a determination that the second input is provided by a predefined portion of a user of the electronic device when the predefined portion is farther than a threshold distance from a location corresponding to the respective region:
the movement of the second input corresponds to movement within the respective region of the user interface when the second input meets one or more first criteria, and the movement of the second input corresponds to movement outside the respective region of the user interface when the second input does not meet the one or more first criteria, and
in accordance with a determination that the second input is provided by the predefined portion of the user of the electronic device when the predefined portion is closer than the threshold distance to the location corresponding to the respective region:
the movement of the second input corresponds to movement within the respective region of the user interface when the second input meets one or more second criteria different from the first criteria, and the movement of the second input corresponds to movement outside the respective region of the user interface when the second input does not meet the one or more second criteria.
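As a hedged illustration: claim 188 requires only that the "within the region" test use different criteria for indirect input (hand farther than a threshold distance from the region) than for direct input (hand closer than that threshold). In the Swift sketch below the difference is modeled as two different slack margins, which is purely an assumption.

/// Claim 188: whether movement stays "within" the respective region is judged
/// by different criteria for indirect input than for direct input. The claims
/// only require that the two sets of criteria differ; the margins are invented.
func movementStaysWithinRegion(distanceBeyondRegionEdge: Double,
                               handDistanceFromRegion: Double,
                               directInputThreshold: Double = 0.15) -> Bool {
    let isDirect = handDistanceFromRegion < directInputThreshold
    let allowedOvershoot = isDirect ? 0.01 : 0.05   // assumed margins, in meters
    return distanceBeyondRegionEdge <= allowedOvershoot
}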
189. The method of any one of claims 176-188, wherein:
modifying the appearance of the first user interface element to indicate that further input directed to the first user interface element will cause selection of the first user interface element includes moving the first user interface element away from the viewpoint of the user in the three-dimensional environment, and
modifying the appearance of the second user interface element to indicate that further input directed to the second user interface element will cause selection of the second user interface element includes moving the second user interface element away from the viewpoint of the user in the three-dimensional environment.
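For illustration only: the "moving the element away from the viewpoint" of claim 189 amounts to adding a small depth offset while the element carries the "further input will select" appearance. The Swift sketch below uses a plain z-offset and an assumed 1 cm push-back in place of whatever transform a rendering layer would actually apply.

/// Depth bookkeeping for one selectable element.
struct ElementPlacement {
    var baseDepth: Double              // distance from the viewpoint when idle
    var hoverPushback: Double = 0.01   // assumed 1 cm push away from the viewpoint

    /// Highlighted elements sit slightly farther from the viewpoint (claim 189).
    func depth(isHighlighted: Bool) -> Double {
        isHighlighted ? baseDepth + hoverPushback : baseDepth
    }
}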
190. The method of any one of claims 176-189, further comprising:
detecting, via the one or more input devices, a third input while displaying the second user interface element in the modified appearance to indicate that further input directed to the second user interface element will cause selection of the second user interface element; and
in response to detecting the third input:
in accordance with a determination that the third input corresponds to a further input directed to the second user interface element, selecting the second user interface element in accordance with the third input.
191. The method of any one of claims 176-190, wherein:
prior to detecting the first input, selection of the first user interface element requires an input associated with a first magnitude,
the first input comprises an input of a second magnitude less than the first magnitude,
selection of the second user interface element requires an input associated with a third magnitude before the second input is detected, and
in response to detecting the second input, selection of the second user interface element requires further input associated with the third magnitude that is less than the second magnitude of the first input.
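A speculative illustration: claim 191 ties selection to input magnitudes, and appears to let progress accumulated toward selecting the first element reduce the further input needed for the second element once the hover state transfers. The exact relationship among the recited magnitudes is not fully clear from the translation, so the Swift sketch below is only bookkeeping under that reading.

/// Speculative model: selection fires when accumulated input magnitude reaches
/// an element's required magnitude, and progress carries over when the hover
/// state transfers to another element in the same region.
struct SelectionProgress {
    var accumulated = 0.0

    mutating func add(_ magnitude: Double) { accumulated += magnitude }

    /// Further input still needed to select an element requiring `required`.
    func remaining(toSelect required: Double) -> Double {
        max(0.0, required - accumulated)
    }
}

var progress = SelectionProgress()
progress.add(0.4)                                      // partial press on the first element
let stillNeeded = progress.remaining(toSelect: 1.0)    // 0.6 remains after the transfer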
192. The method of any one of claims 176-191, wherein:
the first input comprises a selection initiation portion followed by a second portion, and the appearance of the first user interface element is modified to indicate that further input directed to the first user interface element will cause selection of the first user interface element in accordance with the first input including the selection initiation portion,
in the event that the electronic device does not detect another selection initiation portion after the selection initiation portion included in the first input, the appearance of the second user interface element is modified to indicate that further input directed to the second user interface element will cause selection of the second user interface element, and the method further comprises:
detecting, via the one or more input devices, a third input directed to the second user interface element while the second user interface element is not displayed in the modified appearance; and
in response to detecting the third input:
in accordance with a determination that the third input includes the selection initiation portion, modifying the appearance of the second user interface element to indicate that further input directed to the second user interface element will cause selection of the second user interface element; and
in accordance with a determination that the third input does not include the selection initiation portion, forgoing modifying the appearance of the second user interface element.
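For illustration only: in claim 192 the "will cause selection" appearance is granted on a selection initiation portion (for example the start of a pinch or press); a hover state transferred during the same input needs no new initiation, whereas a fresh input aimed at the second element does. A minimal Swift sketch with invented names:

enum InputPortion { case selectionInitiation, movement, release }

/// Whether the second element should take on the "further input will cause
/// selection" appearance, per claim 192.
func shouldHighlightSecondElement(portions: [InputPortion], transferredFromFirst: Bool) -> Bool {
    transferredFromFirst || portions.contains(.selectionInitiation)
}

let transferred = shouldHighlightSecondElement(portions: [.movement], transferredFromFirst: true)    // true
let freshNoInit = shouldHighlightSecondElement(portions: [.movement], transferredFromFirst: false)   // false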
193. An electronic device, comprising:
one or more processors;
a memory; and
one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs comprising instructions for:
displaying, via a display generating component, a user interface comprising respective areas, the respective areas comprising a first user interface element and a second user interface element;
while displaying the user interface, detecting a first input directed to the first user interface element in the respective region via one or more input devices;
in response to detecting the first input directed to the first user interface element, modifying an appearance of the first user interface element to indicate that further input directed to the first user interface element will cause selection of the first user interface element;
while displaying the first user interface element in the modified appearance, detecting a second input via the one or more input devices; and
in response to detecting the second input:
in accordance with a determination that the second input includes a movement corresponding to a movement away from the first user interface element:
in accordance with a determination that the movement corresponds to movement within the respective region of the user interface, forgoing selection of the first user interface element and modifying an appearance of the second user interface element to indicate that further input directed to the second user interface element will cause selection of the second user interface element; and
in accordance with a determination that the movement corresponds to movement in a first direction outside of the respective region of the user interface, forgoing selection of the first user interface element without modifying the appearance of the second user interface element.
194. A non-transitory computer readable storage medium storing one or more programs, the one or more programs comprising instructions, which when executed by one or more processors of an electronic device, cause the electronic device to perform a method comprising:
displaying, via a display generating component, a user interface comprising respective areas, the respective areas comprising a first user interface element and a second user interface element;
while displaying the user interface, detecting a first input directed to the first user interface element in the respective region via one or more input devices;
in response to detecting the first input directed to the first user interface element, modifying an appearance of the first user interface element to indicate that further input directed to the first user interface element will cause selection of the first user interface element;
while displaying the first user interface element in the modified appearance, detecting a second input via the one or more input devices; and
in response to detecting the second input:
in accordance with a determination that the second input includes a movement corresponding to a movement away from the first user interface element:
in accordance with a determination that the movement corresponds to movement within the respective region of the user interface, forgoing selection of the first user interface element and modifying an appearance of the second user interface element to indicate that further input directed to the second user interface element will cause selection of the second user interface element; and
in accordance with a determination that the movement corresponds to movement in a first direction outside of the respective region of the user interface, forgoing selection of the first user interface element without modifying the appearance of the second user interface element.
195. An electronic device, comprising:
one or more processors;
a memory;
means for: displaying, via a display generating component, a user interface comprising respective areas, the respective areas comprising a first user interface element and a second user interface element;
means for: while displaying the user interface, detecting a first input directed to the first user interface element in the respective region via one or more input devices;
means for, in response to detecting the first input directed to the first user interface element: modifying an appearance of the first user interface element to indicate that further input directed to the first user interface element will cause selection of the first user interface element;
means for: while displaying the first user interface element in the modified appearance, detecting a second input via the one or more input devices; and
means for, in response to detecting the second input:
in accordance with a determination that the second input includes a movement corresponding to a movement away from the first user interface element:
in accordance with a determination that the movement corresponds to movement within the respective region of the user interface, forgoing selection of the first user interface element and modifying an appearance of the second user interface element to indicate that further input directed to the second user interface element will cause selection of the second user interface element; and
in accordance with a determination that the movement corresponds to movement in a first direction outside of the respective region of the user interface, forgoing selection of the first user interface element without modifying the appearance of the second user interface element.
196. An information processing apparatus for use in an electronic device, the information processing apparatus comprising:
means for: displaying, via a display generating component, a user interface comprising respective areas, the respective areas comprising a first user interface element and a second user interface element;
means for: while displaying the user interface, detecting a first input directed to the first user interface element in the respective region via one or more input devices;
means for, in response to detecting the first input directed to the first user interface element: modifying an appearance of the first user interface element to indicate that further input directed to the first user interface element will cause selection of the first user interface element;
means for: while displaying the first user interface element in the modified appearance, detecting a second input via the one or more input devices; and
means for, in response to detecting the second input:
in accordance with a determination that the second input includes a movement corresponding to a movement away from the first user interface element:
in accordance with a determination that the movement corresponds to movement within the respective region of the user interface, forgoing selection of the first user interface element and modifying an appearance of the second user interface element to indicate that further input directed to the second user interface element will cause selection of the second user interface element; and
in accordance with a determination that the movement corresponds to movement in a first direction outside of the respective region of the user interface, forgoing selection of the first user interface element without modifying the appearance of the second user interface element.
197. An electronic device, comprising:
one or more processors;
a memory; and
one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs comprising instructions for performing any of the methods of claims 176-192.
198. A non-transitory computer readable storage medium storing one or more programs, the one or more programs comprising instructions, which when executed by one or more processors of an electronic device, cause the electronic device to perform any of the methods of claims 176-192.
199. An electronic device, comprising:
one or more processors;
a memory; and
means for performing any of the methods of claims 176-192.
200. An information processing apparatus for use in an electronic device, the information processing apparatus comprising:
means for performing any of the methods of claims 176-192.
CN202280022799.7A 2021-01-20 2022-01-20 Method for interacting with objects in an environment Pending CN117043720A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311491331.5A CN117406892A (en) 2021-01-20 2022-01-20 Method for interacting with objects in an environment

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US63/139,566 2021-01-20
US202163261559P 2021-09-23 2021-09-23
US63/261,559 2021-09-23
PCT/US2022/013208 WO2022159639A1 (en) 2021-01-20 2022-01-20 Methods for interacting with objects in an environment

Related Child Applications (1)

Application Number Title Priority Date Filing Date
CN202311491331.5A Division CN117406892A (en) 2021-01-20 2022-01-20 Method for interacting with objects in an environment

Publications (1)

Publication Number Publication Date
CN117043720A (en) 2023-11-10

Family ID=88635913

Family Applications (2)

Application Number Title Priority Date Filing Date
CN202311491331.5A Pending CN117406892A (en) 2021-01-20 2022-01-20 Method for interacting with objects in an environment
CN202280022799.7A Pending CN117043720A (en) 2021-01-20 2022-01-20 Method for interacting with objects in an environment

Family Applications Before (1)

Application Number Title Priority Date Filing Date
CN202311491331.5A Pending CN117406892A (en) 2021-01-20 2022-01-20 Method for interacting with objects in an environment

Country Status (1)

Country Link
CN (2) CN117406892A (en)

Also Published As

Publication number Publication date
CN117406892A (en) 2024-01-16

Similar Documents

Publication Publication Date Title
US11768579B2 (en) Devices, methods, and graphical user interfaces for interacting with three-dimensional environments
US20220121344A1 (en) Methods for interacting with virtual controls and/or an affordance for moving virtual objects in virtual environments
US20220229524A1 (en) Methods for interacting with objects in an environment
CN114830066A (en) Device, method and graphical user interface for displaying applications in a three-dimensional environment
CN116438505A (en) Method for manipulating objects in an environment
CN117032519A (en) Apparatus, method and graphical user interface for interacting with a three-dimensional environment
US20220083197A1 (en) Devices, Methods, and Graphical User Interfaces for Providing Computer-Generated Experiences
US11567625B2 (en) Devices, methods, and graphical user interfaces for interacting with three-dimensional environments
US11720171B2 (en) Methods for navigating user interfaces
US11934569B2 (en) Devices, methods, and graphical user interfaces for interacting with three-dimensional environments
US20230093979A1 (en) Devices, methods, and graphical user interfaces for content applications
US20230336865A1 (en) Device, methods, and graphical user interfaces for capturing and displaying media
US20240028177A1 (en) Devices, methods, and graphical user interfaces for interacting with media and three-dimensional environments
US20230334808A1 (en) Methods for displaying, selecting and moving objects and containers in an environment
CN117043720A (en) Method for interacting with objects in an environment
US20230152935A1 (en) Devices, methods, and graphical user interfaces for presenting virtual objects in virtual environments
US20240103614A1 (en) Devices, methods, for interacting with graphical user interfaces
WO2024064388A1 (en) Devices, methods, for interacting with graphical user interfaces
KR20240047458A (en) Devices, methods and graphical user interfaces for interaction with media and three-dimensional environments
CN117957581A (en) Apparatus, method and graphical user interface for interacting with a three-dimensional environment
CN117999534A (en) Apparatus, method and graphical user interface for interacting with a three-dimensional environment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination