WO2024137458A1 - Artificial intelligence expression engine - Google Patents

Artificial intelligence expression engine

Info

Publication number
WO2024137458A1
Authority
WO
WIPO (PCT)
Prior art keywords
artificial reality
instance
world
asset
reality world
Prior art date
Application number
PCT/US2023/084534
Other languages
French (fr)
Inventor
Vincent Charles Cheung
Grant Gardner
Meng Wang
Eric Liu GAN
Michelle Jia-Ying CHEUNG
Maria Alejandra RUIZ GUTIERREZ
Jiemin ZHANG
Original Assignee
Meta Platforms Technologies, LLC
Priority date
Filing date
Publication date
Application filed by Meta Platforms Technologies, LLC
Publication of WO2024137458A1

Classifications

    • A63F 13/65 — Video games: generating or modifying game content before or while executing the game program (e.g., authoring tools or game-integrated level editors), automatically by game devices or servers from real-world data, e.g. measurement in live racing competition
    • A63F 13/35 — Video games: details of game servers
    • A63F 13/352 — Video games: details of game servers involving special game server arrangements, e.g. regional servers connected to a national server or a plurality of servers managing partitions of the game world
    • G06F 3/011 — Input arrangements for interaction between user and computer: arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06F 3/013 — Eye tracking input arrangements

Definitions

  • The present disclosure is directed to an artificial intelligence (AI) expression engine for creating dynamic, continuously editable artificial reality (XR) worlds.
  • Augmented reality (AR) applications can provide interactive 3D experiences that combine images of the real world with virtual objects, while virtual reality (VR) applications can provide an entirely self-contained 3D computer environment.
  • an AR application can be used to superimpose virtual objects over a video feed of a real scene that is observed by a camera.
  • a real-world user in the scene can then make gestures captured by the camera that can provide interactivity between the real-world user and the virtual objects.
  • Mixed reality (MR) systems can allow light to enter a user's eye that is partially generated by a computing system and partially includes light reflected off objects in the real-world.
  • AR, MR, and VR (together XR) experiences can be observed by a user through a head-mounted display (HMD), such as glasses or a headset.
  • XR worlds expand users’ experiences beyond their real world, allow them to learn and play in new ways, and help them connect with other people.
  • An XR world becomes familiar when its users customize it with objects that interact among themselves and with the users. While creating some objects in an XR world can be simple, as objects get more complex, the skills needed to create them increase until only experts can create multi-faceted objects such as a house. Creating an entire XR world can take a team of experts weeks or months. As XR worlds become more photorealistic, and as the objects within them provide richer interactive experiences, the effort needed to create them grows further, until some creation efforts are beyond the abilities, or the resources, of many users, even experts.
  • a method for creating a dynamic, continuously editable artificial reality world comprising: creating an instance, of multiple instances, of the artificial reality world, wherein the instance is accessed by one or more artificial reality devices that render an original template of the artificial reality world, and wherein, at creation, the multiple instances of the artificial reality world share the original template; receiving one or more commands to A) add, B) edit, C) remove, or D) any combination thereof, one or more assets in the instance of the artificial reality world; performing, in accordance with the one or more commands, one or more asset modifications to one or more assets of the instance of the artificial reality world; detecting occurrence of a triggering event with respect to at least one of the one or more asset modifications to at least one asset of the instance of the artificial reality world; in response to detecting occurrence of the triggering event, reversing the at least one asset modification to the at least one asset of the instance of the artificial reality world; detecting that the one or more artificial reality devices are no longer accessing the artificial reality world; and in response to detecting that the one or more artificial reality devices are no longer accessing the artificial reality world, reverting the instance of the artificial reality world to the original template.
  • the instance of the artificial reality world is created based on a request of a creator artificial reality device of the one or more artificial reality devices, and wherein the creator artificial reality device specifies the triggering event.
  • the triggering event includes one or more of: expiration of a specified period of time after performing the at least one asset modification to the at least one asset of the instance of the artificial reality world, detection of the creator artificial reality device no longer accessing the instance of the artificial reality world, or a combination thereof.
  • an artificial reality device of the one or more artificial reality devices, transmits a command of the one or more commands, wherein an asset modification, of the one or more asset modifications, is performed to the instance of the artificial reality world, corresponding to an asset of the one or more assets, based on the command, and wherein the artificial reality device, of the one or more artificial reality devices, specifies the triggering event with respect to the asset of the one or more assets.
  • the triggering event includes one or more of a specified period of time after performing the modification of the one or more modifications to the instance of the artificial reality world, detection of the artificial reality device no longer accessing the instance of the artificial reality world, or a combination thereof.
  • the instance of the artificial reality world is created based on a request of a creator artificial reality device of the one or more artificial reality devices, and wherein the creator artificial reality device specifies one or more rules for performing the one or more asset modifications to the one or more assets of the instance of the artificial reality world.
  • the one or more commands includes a command to edit an asset of the one or more assets in the instance of the artificial reality world, and wherein an asset modification, of the one or more asset modifications, corresponding to the asset, of the one or more assets, is performed by making the asset dynamic in the instance of the artificial reality world.
  • the one or more assets include a skybox, a virtual object, a sound, a virtual space, or any combination thereof.
  • the one or more commands are generated by at least one of voice, gaze, a gesture, or any combination thereof.
  • the triggering event is a cessation of speaking about the at least one asset, of the one or more assets, by at least one user associated with respective artificial reality devices of the one or more artificial reality devices.
  • the at least one asset modified by the at least one asset modification comprises a virtual object
  • the at least one asset modification comprises a modification to a display of the virtual object in the instance of the artificial reality world
  • the reversing the at least one asset modification comprises reversing the modification to the display of the virtual object in the instance of the artificial reality world.
  • the at least one asset modified by the at least one asset modification is part of the original template of the artificial reality world.
  • a computer-readable storage medium storing instructions that, when executed by a computing system, cause the computing system to perform a process for creating a dynamic, continuously editable artificial reality world, the process comprising: creating an instance of the artificial reality world, wherein the instance is accessed by one or more artificial reality devices that render an original template of the artificial reality world; receiving one or more commands to A) add, B) edit, C) remove, or D) any combination thereof, one or more assets in the instance of the artificial reality world; performing, in accordance with the one or more commands, one or more asset modifications to one or more assets of the instance of the artificial reality world; detecting occurrence of a triggering event with respect to at least one of the one or more asset modifications to at least one asset of the instance of the artificial reality world; and in response to detecting occurrence of the triggering event, reversing the at least one asset modification to the at least one asset of the instance of the artificial reality world.
  • the computer-readable storage medium further comprises: detecting that the one or more artificial reality devices are no longer accessing the artificial reality world; and in response to detecting that the one or more artificial reality devices are no longer accessing the artificial reality world, reverting the instance of the artificial reality world to the original template.
  • the instance is of multiple instances of the artificial reality world, and wherein, at creation, the multiple instances of the artificial reality world share the original template.
  • the at least one asset modified by the at least one asset modification comprises a virtual object
  • the at least one asset modification comprises a modification to a display of the virtual object in the instance of the artificial reality world
  • the reversing the at least one asset modification comprises reversing the modification to the display of the virtual object in the instance of the artificial reality world.
  • a computing system for creating a dynamic, continuously editable artificial reality world comprising: one or more processors; and one or more memories storing instructions that, when executed by the one or more processors, cause the computing system to perform a process comprising: creating an instance of the artificial reality world, wherein the instance is accessed by one or more artificial reality devices that render an original template of the artificial reality world; receiving one or more commands to A) add, B) edit, C) remove, or D) any combination thereof, one or more assets in the instance of the artificial reality world; performing, in accordance with the one or more commands, one or more asset modifications to one or more assets of the instance of the artificial reality world; detecting occurrence of a triggering event with respect to at least one of the one or more asset modifications to at least one asset of the instance of the artificial reality world; and in response to detecting occurrence of the triggering event, reversing the at least one asset modification to the at least one asset of the instance of the artificial reality world.
  • the computing system further comprises: detecting that the one or more artificial reality devices are no longer accessing the artificial reality world; and in response to detecting that the one or more artificial reality devices are no longer accessing the artificial reality world, reverting the instance of the artificial reality world to the original template.
  • the instance is of multiple instances of the artificial reality world, and wherein, at creation, the multiple instances of the artificial reality world share the original template.
  • the at least one asset modified by the at least one asset modification is part of the original template of the artificial reality world.
  • Figure 1 is a block diagram illustrating an overview of devices on which some implementations of the present technology can operate;
  • Figure 2A is a wire diagram illustrating a virtual reality headset which can be used in some implementations of the present technology
  • Figure 2B is a wire diagram illustrating a mixed reality headset which can be used in some implementations of the present technology
  • Figure 2C is a wire diagram illustrating controllers which, in some implementations, a user can hold in one or both hands to interact with an artificial reality environment;
  • Figure 3 is a block diagram illustrating an overview of an environment in which some implementations of the present technology can operate;
  • Figure 4 is a block diagram illustrating components which, in some implementations, can be used in a system employing the disclosed technology
  • Figure 5 is a flow diagram illustrating a process used in some implementations of the present technology for creating a dynamic, continuously editable artificial reality (XR) world
  • Figure 6A is a flow diagram illustrating a process used in some implementations of the present technology for processing input from a user of an artificial reality (XR) device, by an artificial intelligence (AI) expression engine, to understand what a user wants to accomplish in an XR world;
  • Figure 6B is a flow diagram illustrating a process used in some implementations of the present technology for providing asset options for an artificial reality (XR) world based on input from a user received via an XR device, by an artificial intelligence (AI) expression engine;
  • Figure 6C is a flow diagram illustrating a process used in some implementations of the present technology for providing a response by supplementing an instance of an artificial reality (XR) world with assets, by an artificial intelligence (AI) expression engine;
  • Figure 7A is a conceptual diagram illustrating an example view on an artificial reality (XR) device of an instance of an XR world formed based on an original template;
  • Figure 7B is a conceptual diagram illustrating an example view on an artificial reality (XR) device of a modification to an instance of an XR world by a user; and
  • Figure 7C is a conceptual diagram illustrating an example view on an artificial reality (XR) device of a reversal of a modification to an instance of an XR world based on a user leaving the XR world.
  • aspects of the present disclosure relate to creating a dynamic, continuously editable artificial reality (XR) world, instead of static XR worlds, that users can visit via XR devices.
  • instances of an XR world can be created based on an original template for that world, and then users can make edits to the world through an artificial intelligence (AI) interpreter, such as an AI expression engine.
  • these changes are not saved to the original template, and rules can be applied to determine how long the changes persist, which users can perform the changes, which users can see the changes, which assets (or asset types) of the XR world instance can be changed, etc.
  • the rules can be established by the user creating the instance and/or by other users modifying the instance of the XR world.
  • the rules can specify that flowers persist in the world until the end of spring, that only changes made by a user’s friends persist for longer than 2 minutes, that changes made by a user are deleted when that user leaves the world, etc.
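  • As a non-limiting illustration only (not part of the original disclosure), such persistence rules might be represented declaratively as data records that the system evaluates against each change; the class and field names below are hypothetical.

```python
# Hypothetical sketch of declarative persistence rules for an XR world instance.
# All names (PersistenceRule, editor_group, etc.) are illustrative, not from the disclosure.
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass
class PersistenceRule:
    asset_type: Optional[str] = None          # e.g., "flower"; None means the rule applies to any asset
    editor_group: Optional[str] = None        # e.g., "friends"; None means any user
    max_lifetime_s: Optional[float] = None    # reverse the change after this many seconds
    expires_at: Optional[datetime] = None     # reverse the change at an absolute time (e.g., end of spring)
    revert_when_editor_leaves: bool = False   # reverse the change when the editing user exits the instance

# Example rules mirroring the text above:
rules = [
    PersistenceRule(asset_type="flower", expires_at=datetime(2024, 6, 20)),  # flowers persist until the end of spring
    PersistenceRule(editor_group="non_friends", max_lifetime_s=120),         # non-friend edits last 2 minutes
    PersistenceRule(revert_when_editor_leaves=True),                         # a user's edits vanish when that user leaves
]
```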
  • Some implementations can focus on the ephemeral nature of virtual objects and other assets within an instance of an XR world, instead of their persistence within the XR world.
  • A group of friends can hang out in an instance of a virtual coffee shop, as they often do in the physical world.
  • A running joke in the group can resurface from a trip on which someone took a picture in a phone booth but ended up getting stuck inside. The friends can materialize a virtual phone booth in the virtual coffee shop via their XR devices to recreate that experience, to everyone’s laughter. Later on, one of the friends can remove the virtual phone booth and instead summon a virtual pool table for a quick match.
  • When the friends leave the instance of the virtual coffee shop, the virtual coffee shop can return to its original state, without the virtual pool table.
  • Some changes to the instance of the XR world can change elements of the original template, such as removing a virtual pool table that is part of the original template and replacing it with a virtual foosball table. Triggering events can control when the change is reversed, for example when the virtual foosball table is reverted back to the virtual pool table from the original template.
  • a first user can add an asset to an instance of an XR world, such as a virtual snowman
  • a second user can edit the asset added by the first user, such as to modify the virtual clothing worn by the virtual snowman. Since these two changes are performed by different users, their persistence may be different.
  • the virtual clothing may revert to the original virtual clothing for the snowman after a duration of time (e.g., 10 minutes), when it is detected that the second user is no longer present in the instance of the XR world, etc.
  • the virtual snowman addition to the instance of the XR world may revert (e.g., the virtual snowman can be deleted) after a different duration of time (e.g., 1 hour), when it is detected that the first user is no longer present in the instance of the XR world, etc.
  • Embodiments of the disclosed technology may include or be implemented in conjunction with an artificial reality system.
  • Artificial reality or extra reality (XR) is a form of reality that has been adjusted in some manner before presentation to a user, which may include, e.g., virtual reality (VR), augmented reality (AR), mixed reality (MR), hybrid reality, or some combination and/or derivatives thereof.
  • Artificial reality content may include completely generated content or generated content combined with captured content (e.g., real-world photographs).
  • the artificial reality content may include video, audio, haptic feedback, or some combination thereof, any of which may be presented in a single channel or in multiple channels (such as stereo video that produces a three-dimensional effect to the viewer).
  • artificial reality may be associated with applications, products, accessories, services, or some combination thereof, that are, e.g., used to create content in an artificial reality and/or used in (e.g., perform activities in) an artificial reality.
  • the artificial reality system that provides the artificial reality content may be implemented on various platforms, including a head-mounted display (HMD) connected to a host computer system, a standalone HMD, a mobile device or computing system, a "cave" environment or other projection system, or any other hardware platform capable of providing artificial reality content to one or more viewers.
  • “Augmented reality” or “AR” refers to systems where a user views images of the real world after they have passed through a computing system.
  • a tablet with a camera on the back can capture images of the real world and then display the images on the screen on the opposite side of the tablet from the camera.
  • the tablet can process and adjust or "augment” the images as they pass through the system, such as by adding virtual objects.
  • “Mixed reality” or “MR” refers to systems where light entering a user's eye is partially generated by a computing system and partially includes light reflected off objects in the real world.
  • an MR headset could be shaped as a pair of glasses with a pass-through display, which allows light from the real world to pass through a waveguide that simultaneously emits light from a projector in the MR headset, allowing the MR headset to present virtual objects intermixed with the real objects the user can see.
  • "Artificial reality,” “extra reality,” or “XR,” as used herein, refers to any of VR, AR, MR, or any combination or hybrid thereof.
  • the implementations described herein provide specific technological improvements in the area of XR experiences in that they allow templates of XR worlds to be modified dynamically by users in real-time as they interact with the XR worlds.
  • Some implementations can receive input via any of a number of different methods to build, modify, move, and remove assets, making each instance of an XR world unique with an infinite number of possibilities.
  • Some implementations thus enable an ephemeral XR experience in which assets that are relevant to users within that experience at that particular time are displayed.
  • the ephemeral XR world alterations can substitute for permanent alterations to the template of the XR world, for example when such alterations would not be useful for other users and/or at different times.
  • the ephemeral XR world alterations achieve a lightweight approach for customizing an XR world to specific users at specific times. For instance, the ephemeral nature automatically removes the world alterations in circumstances where they are less relevant, thus conserving system resources for more relevant world content. Thus, by reducing the number of unnecessary and/or irrelevant assets displayed on an XR device rendering the XR world, battery power can be conserved and processing speed can be improved. Implementations are further necessarily rooted in computer technology, as they are specific to artificial reality and XR experiences displayed on an XR device.
  • FIG. 1 is a block diagram illustrating an overview of devices on which some implementations of the disclosed technology can operate.
  • the devices can comprise hardware components of a computing system 100 that can create a dynamic, continuously editable artificial reality (XR) world.
  • computing system 100 can include a single computing device 103 or multiple computing devices (e.g., computing device 101, computing device 102, and computing device 103) that communicate over wired or wireless channels to distribute processing and share input data.
  • computing system 100 can include a stand-alone headset capable of providing a computer created or augmented experience for a user without the need for external processing or sensors.
  • computing system 100 can include multiple computing devices such as a headset and a core processing component (such as a console, mobile device, or server system) where some processing operations are performed on the headset and others are offloaded to the core processing component.
  • Example headsets are described below in relation to Figures 2A and 2B.
  • position and environment data can be gathered only by sensors incorporated in the headset device, while in other implementations one or more of the non-headset computing devices can include sensor components that can track environment or position data.
  • Computing system 100 can include one or more processor(s) 110 (e.g., central processing units (CPUs), graphical processing units (GPUs), holographic processing units (HPUs), etc.)
  • processors 110 can be a single processing unit or multiple processing units in a device or distributed across multiple devices (e.g., distributed across two or more of computing devices 101-103).
  • Computing system 100 can include one or more input devices 120 that provide input to the processors 110, notifying them of actions. The actions can be mediated by a hardware controller that interprets the signals received from the input device and communicates the information to the processors 110 using a communication protocol.
  • Each input device 120 can include, for example, a mouse, a keyboard, a touchscreen, a touchpad, a wearable input device (e.g., a haptics glove, a bracelet, a ring, an earring, a necklace, a watch, etc.), a camera (or other light-based input device, e.g., an infrared sensor), a microphone, or other user input devices.
  • Processors 110 can be coupled to other hardware devices, for example, with the use of an internal or external bus, such as a PCI bus, SCSI bus, or wireless connection.
  • the processors 110 can communicate with a hardware controller for devices, such as for a display 130.
  • Display 130 can be used to display text and graphics.
  • display 130 includes the input device as part of the display, such as when the input device is a touchscreen or is equipped with an eye direction monitoring system.
  • the display is separate from the input device. Examples of display devices are: an LCD display screen, an LED display screen, a projected, holographic, or augmented reality display (such as a heads-up display device or a head-mounted device), and so on.
  • Other I/O devices 140 can also be coupled to the processor, such as a network chip or card, video chip or card, audio chip or card, USB, firewire or other external device, camera, printer, speakers, CD-ROM drive, DVD drive, disk drive, etc.
  • input from the I/O devices 140 can be used by the computing system 100 to identify and map the physical environment of the user while tracking the user's location within that environment.
  • This simultaneous localization and mapping (SLAM) system can generate maps (e.g., topologies, grids, etc.) for an area (which may be a room, building, outdoor space, etc.) and/or obtain maps previously generated by computing system 100 or another computing system that had mapped the area.
  • the SLAM system can track the user within the area based on factors such as GPS data, matching identified objects and structures to mapped objects and structures, monitoring acceleration and other position changes, etc.
  • Computing system 100 can include a communication device capable of communicating wirelessly or wire-based with other local computing devices or a network node.
  • the communication device can communicate with another device or a server through a network using, for example, TCP/IP protocols.
  • Computing system 100 can utilize the communication device to distribute operations across multiple network devices.
  • the processors 110 can have access to a memory 150, which can be contained on one of the computing devices of computing system 100 or can be distributed across the multiple computing devices of computing system 100 or other external devices.
  • a memory includes one or more hardware devices for volatile or non-volatile storage, and can include both read-only and writable memory.
  • a memory can include one or more of random access memory (RAM), various caches, CPU registers, read-only memory (ROM), and writable non-volatile memory, such as flash memory, hard drives, floppy disks, CDs, DVDs, magnetic storage devices, tape drives, and so forth.
  • a memory is not a propagating signal divorced from underlying hardware; a memory is thus non-transitory.
  • Memory 150 can include program memory 160 that stores programs and software, such as an operating system 162, artificial intelligence (AI) expression engine 164, and other application programs 166.
  • Memory 150 can also include data memory 170 that can include, e.g., artificial reality (XR) world data, original template data, instance data, rendering data, command data, asset data, asset modification data, triggering event data, configuration data, settings, user options or preferences, etc., which can be provided to the program memory 160 or any element of the computing system 100.
  • Some implementations can be operational with numerous other computing system environments or configurations.
  • Examples of computing systems, environments, and/or configurations that may be suitable for use with the technology include, but are not limited to, XR headsets, personal computers, server computers, handheld or laptop devices, cellular telephones, wearable electronics, gaming consoles, tablet devices, multiprocessor systems, microprocessor-based systems, set-top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, or the like.
  • FIG. 2A is a wire diagram of a virtual reality head-mounted display (HMD) 200, in accordance with some embodiments.
  • the HMD 200 includes a front rigid body 205 and a band 210.
  • the front rigid body 205 includes one or more electronic display elements of an electronic display 245, an inertial motion unit (IMU) 215, one or more position sensors 220, locators 225, and one or more compute units 230.
  • the position sensors 220, the IMU 215, and compute units 230 may be internal to the HMD 200 and may not be visible to the user.
  • the IMU 215, position sensors 220, and locators 225 can track movement and location of the HMD 200 in the real world and in an artificial reality environment in three degrees of freedom (3DoF) or six degrees of freedom (6DoF).
  • the locators 225 can emit infrared light beams which create light points on real objects around the HMD 200.
  • the IMU 215 can include e.g., one or more accelerometers, gyroscopes, magnetometers, other non-camera-based position, force, or orientation sensors, or combinations thereof.
  • One or more cameras (not shown) integrated with the HMD 200 can detect the light points.
  • Compute units 230 in the HMD 200 can use the detected light points to extrapolate position and movement of the HMD 200 as well as to identify the shape and position of the real objects surrounding the HMD 200.
  • the electronic display 245 can be integrated with the front rigid body 205 and can provide image light to a user as dictated by the compute units 230.
  • the electronic display 245 can be a single electronic display or multiple electronic displays (e.g., a display for each user eye).
  • Examples of the electronic display 245 include: a liquid crystal display (LCD), an organic light-emitting diode (OLED) display, an active-matrix organic light-emitting diode display (AMOLED), a display including one or more quantum dot light-emitting diode (QOLED) sub-pixels, a projector unit (e.g., microLED, LASER, etc.), some other display, or some combination thereof.
  • the HMD 200 can be coupled to a core processing component such as a personal computer (PC) (not shown) and/or one or more external sensors (not shown).
  • the external sensors can monitor the HMD 200 (e.g., via light emitted from the HMD 200) which the PC can use, in combination with output from the IMU 215 and position sensors 220, to determine the location and movement of the HMD 200.
  • Figure 2B is a wire diagram of a mixed reality HMD system 250 which includes a mixed reality HMD 252 and a core processing component 254.
  • the mixed reality HMD 252 and the core processing component 254 can communicate via a wireless connection (e.g., a 60 GHz link) as indicated by link 256.
  • the mixed reality system 250 can include a headset only, without an external compute device, or can include other wired or wireless connections between the mixed reality HMD 252 and the core processing component 254.
  • the mixed reality HMD 252 includes a pass-through display 258 and a frame 260.
  • the frame 260 can house various electronic components (not shown) such as light projectors (e.g., LASERS, LEDs, etc.), cameras, eye-tracking sensors, MEMS components, networking components, etc.
  • the projectors can be coupled to the pass-through display 258, e.g., via optical elements, to display media to a user.
  • the optical elements can include one or more waveguide assemblies, reflectors, lenses, mirrors, collimators, gratings, etc., for directing light from the projectors to a user's eye.
  • Image data can be transmitted from the core processing component 254 via link 256 to HMD 252. Controllers in the HMD 252 can convert the image data into light pulses from the projectors, which can be transmitted via the optical elements as output light to the user's eye.
  • the output light can mix with light that passes through the display 258, allowing the output light to present virtual objects that appear as if they exist in the real world.
  • the HMD system 250 can also include motion and position tracking units, cameras, light sources, etc., which allow the HMD system 250 to, e.g., track itself in 3DoF or 6DoF, track portions of the user (e.g., hands, feet, head, or other body parts), map virtual objects to appear as stationary as the HMD 252 moves, and have virtual objects react to gestures and other real-world objects.
  • Figure 2C illustrates controllers 270 (including controllers 276A and 276B), which, in some implementations, a user can hold in one or both hands to interact with an artificial reality environment presented by the HMD 200 and/or HMD 250.
  • the controllers 270 can be in communication with the HMDs, either directly or via an external device (e.g., core processing component 254).
  • the controllers can have their own IMU units, position sensors, and/or can emit further light points.
  • the HMD 200 or 250, external sensors, or sensors in the controllers can track these controller light points to determine the controller positions and/or orientations (e.g., to track the controllers in 3DoF or 6DoF).
  • the compute units 230 in the HMD 200 or the core processing component 254 can use this tracking, in combination with IMU and position output, to monitor hand positions and motions of the user.
  • the controllers can also include various buttons (e.g., buttons 272A-F) and/or joysticks (e.g., joysticks 274A-B), which a user can actuate to provide input and interact with objects.
  • the HMD 200 or 250 can also include additional subsystems, such as an eye tracking unit, an audio system, various network components, etc., to monitor indications of user interactions and intentions.
  • one or more cameras included in the HMD 200 or 250, or external cameras, can monitor the positions and poses of the user's hands to determine gestures and other hand and body motions.
  • one or more light sources can illuminate either or both of the user's eyes, and the HMD 200 or 250 can use eye-facing cameras to capture a reflection of this light to determine eye position (e.g., based on a set of reflections around the user's cornea), modeling the user's eye and determining a gaze direction.
  • FIG. 3 is a block diagram illustrating an overview of an environment 300 in which some implementations of the disclosed technology can operate.
  • Environment 300 can include one or more client computing devices 305A-D, examples of which can include computing system 100.
  • some of the client computing devices (e.g., client computing device 305B) can be the HMD 200 or the HMD system 250.
  • Client computing devices 305 can operate in a networked environment using logical connections through network 330 to one or more remote computers, such as a server computing device.
  • server 310 can be an edge server which receives client requests and coordinates fulfillment of those requests through other servers, such as servers 320A-C.
  • Server computing devices 310 and 320 can comprise computing systems, such as computing system 100. Though each server computing device 310 and 320 is displayed logically as a single server, server computing devices can each be a distributed computing environment encompassing multiple computing devices located at the same or at geographically disparate physical locations.
  • Client computing devices 305 and server computing devices 310 and 320 can each act as a server or client to other server/client device(s).
  • Server 310 can connect to a database 315.
  • Servers 320A-C can each connect to a corresponding database 325A-C.
  • each server 310 or 320 can correspond to a group of servers, and each of these servers can share a database or can have their own database.
  • databases 315 and 325 are displayed logically as single units, databases 315 and 325 can each be a distributed computing environment encompassing multiple computing devices, can be located within their corresponding server, or can be located at the same or at geographically disparate physical locations.
  • Network 330 can be a local area network (LAN), a wide area network (WAN), a mesh network, a hybrid network, or other wired or wireless networks.
  • Network 330 may be the Internet or some other public or private network.
  • Client computing devices 305 can be connected to network 330 through a network interface, such as by wired or wireless communication. While the connections between server 310 and servers 320 are shown as separate connections, these connections can be any kind of local, wide area, wired, or wireless network, including network 330 or a separate public or private network.
  • FIG. 4 is a block diagram illustrating components 400 which, in some implementations, can be used in a system employing the disclosed technology.
  • Components 400 can be included in one device of computing system 100 or can be distributed across multiple of the devices of computing system 100.
  • the components 400 include hardware 410, mediator 420, and specialized components 430.
  • a system implementing the disclosed technology can use various hardware including processing units 412, working memory 414, input and output devices 416 (e.g., cameras, displays, IMU units, network connections, etc.), and storage memory 418.
  • storage memory 418 can be one or more of: local devices, interfaces to remote storage devices, or combinations thereof.
  • storage memory 418 can be one or more hard drives or flash drives accessible through a system bus or can be a cloud storage provider (such as in storage 315 or 325) or other network storage accessible via one or more communications networks.
  • components 400 can be implemented in a client computing device such as client computing devices 305 or on a server computing device, such as server computing device 310 or 320.
  • Mediator 420 can include components which mediate resources between hardware 410 and specialized components 430.
  • mediator 420 can include an operating system, services, drivers, a basic input output system (BIOS), controller circuits, or other hardware or software systems.
  • Specialized components 430 can include software or hardware configured to perform operations for creating a dynamic, continuously editable artificial reality (XR) world.
  • Specialized components 430 can include instance creation module 434, command receipt module 436, asset modification module 438, triggering event detection module 440, asset modification reversal module 442, access detection module 444, instance reversion module 446, and components and APIs which can be used for providing user interfaces, transferring data, and controlling the specialized components, such as interfaces 432.
  • components 400 can be in a computing system that is distributed across multiple computing devices or can be an interface to a server-based application executing one or more of specialized components 430.
  • specialized components 430 may be logical or other nonphysical differentiations of functions and/or may be submodules or code-blocks of one or more applications. In some implementations, one or more of specialized components 430 can be omitted, such as, for example, access detection module 444 and instance reversion module 446.
  • Instance creation module 434 can create an instance of an artificial reality (XR) world.
  • an “instance” of an XR world can be a version of the XR world common only to the XR devices accessing that instance.
  • an instance can be limited to a particular number of users (e.g., 35 users, 50 users, or any other suitable threshold), a particular group of users (e.g., a group of friends that want to play together, a group of users that are 18+ years old, etc.), a limited duration of time (e.g., 10 minutes, a season, etc.), and/or the like.
  • instance creation module 434 can create a new instance of the XR world for users requesting to create or access the XR world.
  • instance creation module 434 can create multiple instances of a single XR world.
  • upon request for creation of an instance of the XR world by an XR device, instance creation module 434 can create the instance of the XR world based on an original template of the XR world.
  • the multiple instances can share the original template upon creation.
  • the original template can include, for example, an initial set of assets.
  • an “asset” can be any visual and/or audible content within an XR world, such as a virtual object, audio, a virtual space, a skybox, etc.
  • a “skybox” can be a background that shows the general location of the user’s world. Additional details on skyboxes, and on generating a skybox from an image, are provided in U.S. Provisional Patent Application No. 63/309,767, with Attorney Docket No. 3589-0120PV01.
  • the skybox can be the distant background, and it cannot be touched by the user, but it may have changing weather, seasons, night and day, and the like.
  • a skybox can be a mountainous background including a distant mountain and the sky.
  • an original template can include a defined skybox and a virtual space with a number of virtual objects in a preconfigured orientation, and any instance created from this original template can be a virtual world comprising the skybox and a defined virtual space with the virtual objects organized in the preconfigured orientation. Further details regarding creating an instance of an XR world are described herein with respect to block 502 of Figure 5.
  • Command receipt module 436 can receive one or more commands with respect to one or more assets in the instance of the XR world created by instance creation module 434.
  • command receipt module 436 can receive the one or more commands directly or indirectly from an XR device capturing the command from a user. In some implementations, command receipt module 436 can receive the one or more commands over any suitable network, such as network 330 of Figure 3. In some implementations, the one or more commands can include a command to add one or more assets to the instance of the XR world, such as a virtual object, an audio effect (e.g., music, a sound effect, noise, etc.), a virtual space (e.g., a virtual room or other indoor or outdoor virtual area), a skybox, etc. In some implementations, the one or more commands can include a command to edit one or more assets in the instance of the XR world.
  • an audio effect e.g., music, a sound effect, noise, etc.
  • a virtual space e.g., a virtual room or other indoor or outdoor virtual area
  • skybox e.g., a command to edit one or more assets in the instance of the XR world.
  • the command can be to resize, reshape, flip, warp, rescale, change the color, change the tone, etc., of an existing asset (either from the original template or previously added by a user) in the instance of the XR world.
  • the command can be to remove one or more assets in the instance of the XR world.
  • the command can be to delete (i.e., remove from view on one or more XR devices accessing the instance of the XR world) one or more assets in the instance of the XR world.
  • the command can be to remove an asset existing in the original template of the instance of the XR world.
  • the command can be to remove an asset added by the user making the command.
  • the command can be to remove an asset added by a user other than the user making the command.
  • command receipt module 436 can receive any combination of commands to add, edit, and/or remove one or more assets in the instance of the XR world. Further details regarding receiving one or more commands to A) add, B) edit, C) remove, or D) any combination thereof, one or more assets in the instance of the XR world, are described herein with respect to block 504 of Figure 5.
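  • As a non-limiting sketch (not part of the original disclosure), the add/edit/remove commands received by command receipt module 436 might be carried in a simple structure such as the following; the class, enum, and field names are hypothetical.

```python
# Hypothetical representation of the add/edit/remove asset commands described above.
# Class and field names are illustrative assumptions, not taken from the disclosure.
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import Any, Dict, Optional

class CommandType(Enum):
    ADD = auto()
    EDIT = auto()
    REMOVE = auto()

@dataclass
class AssetCommand:
    command_type: CommandType
    asset_id: Optional[str] = None            # existing asset targeted by EDIT/REMOVE
    asset_description: Optional[str] = None   # e.g., "tree", used by ADD to query an asset library
    issuer_device_id: str = ""                # XR device that captured the command (voice, gaze, or gesture)
    parameters: Dict[str, Any] = field(default_factory=dict)  # e.g., {"color": "pink", "scale": 2.0}

# Example: a voice command "build a tree over there" might arrive as:
cmd = AssetCommand(CommandType.ADD, asset_description="tree",
                   issuer_device_id="hmd-42", parameters={"position": (3.0, 0.0, -1.5)})
```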
  • Asset modification module 438 can perform, in accordance with the one or more commands received by command receipt module 436, one or more asset modifications to one or more assets of the instance of the XR world using artificial intelligence (AI) techniques. For example, if the command received by command receipt module 436 is to add an asset to the instance of the XR world, asset modification module 438 can build the asset according to the command and/or retrieve the asset from an asset library, such as asset library 622 of Figure 6B described further herein. If the command received by command receipt module 436 is to edit an asset in the instance of the XR world, asset modification module 438 can change the asset in accordance with the command.
  • asset modification module 438 can further provide rendering data for the added and/or edited asset which can be used to display the asset on XR devices accessing the instance of the XR world. If the command received by command receipt module 436 is to remove an asset from the instance of the XR world, asset modification module 438 can remove rendering data associated with rendering the asset in the instance of the XR world from the rendering data provided to the XR devices accessing the instance.
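  • Continuing the hypothetical sketch above (and reusing its AssetCommand and CommandType definitions), an asset modification step might dispatch on the command type roughly as follows; the in-memory asset store and library lookup are illustrative stand-ins, not the disclosed AI techniques.

```python
# Hypothetical dispatch for an asset modification step, reusing the AssetCommand sketch above.
import uuid

def apply_command(instance_assets: dict, asset_library: dict, cmd: "AssetCommand") -> str:
    """Apply an add/edit/remove command to an instance's asset dictionary.

    Returns the id of the affected asset so that a reversal record can be kept.
    """
    if cmd.command_type is CommandType.ADD:
        asset_id = str(uuid.uuid4())
        template = asset_library.get(cmd.asset_description, {"kind": cmd.asset_description})
        instance_assets[asset_id] = {**template, **cmd.parameters}  # build or retrieve, then place
        return asset_id
    if cmd.command_type is CommandType.EDIT:
        instance_assets[cmd.asset_id].update(cmd.parameters)        # resize, recolor, warp, etc.
        return cmd.asset_id
    if cmd.command_type is CommandType.REMOVE:
        instance_assets.pop(cmd.asset_id, None)                     # stop rendering the asset
        return cmd.asset_id
    raise ValueError(f"unknown command type: {cmd.command_type}")
```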
  • Triggering event detection module 440 can detect occurrence of a triggering event with respect to at least one of the one or more asset modifications to at least one asset of the instance of the XR world. In some implementations, triggering event detection module 440 can detect that one or more rules specified for the asset modification is no longer being met or that one or more rules specified for reversal of the asset modification is being met.
  • triggering event detection module 440 can detect that a particular XR device is no longer accessing the instance of the XR world (e.g., has exited the instance of the XR world, has accessed a different instance of the XR world, has accessed a different XR world, has been deactivated and/or removed by a user, etc.).
  • the particular XR device can be, for example, an XR device associated with a user that requested creation of the instance of the XR world via instance creation module 434.
  • the particular XR device can be an XR device associated with a user who provided the command (received by command receipt module 436) that caused the asset modification performed by asset modification module 438.
  • the particular XR device can be any other XR device previously accessing the instance of the XR world, as specified by a rule.
  • triggering event detection module 440 can detect expiration of a time period specified by a rule (e.g., 10 minutes, until the end of the day, until the end of December, etc.). In still another example, triggering event detection module 440 can detect that users associated with XR devices accessing the instance of the XR world are no longer speaking about the asset in conversation (either audibly or via text). In some implementations, upon detection of the occurrence of the triggering event, triggering event detection module 440 can transmit a flag or other indicator that the triggering event has occurred with respect to an asset modification to asset modification reversal module 442. Further details regarding detecting occurrence of a triggering event with respect to at least one of the one or more asset modifications are described herein with respect to block 508 of Figure 5.
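  • By way of a hypothetical, simplified example (not the disclosed implementation), triggering event detection module 440 might periodically evaluate a record of each asset modification against conditions such as an elapsed lifetime or the departure of the editing XR device; all names below are illustrative.

```python
# Hypothetical check for the triggering events described above. The ModificationRecord
# structure and its field names are assumptions for illustration only.
import time
from dataclasses import dataclass
from typing import Optional, Set

@dataclass
class ModificationRecord:
    asset_id: str
    applied_at: float                         # time.time() when the modification was performed
    editor_device_id: str                     # XR device whose command caused the modification
    max_lifetime_s: Optional[float] = None    # e.g., 600 for a "10 minutes" rule
    revert_when_editor_leaves: bool = False   # reverse when the editing device exits the instance

def trigger_occurred(record: ModificationRecord, present_devices: Set[str],
                     now: Optional[float] = None) -> bool:
    """Return True if this asset modification should now be reversed."""
    now = time.time() if now is None else now
    if record.max_lifetime_s is not None and now - record.applied_at >= record.max_lifetime_s:
        return True                           # the specified period of time has expired
    if record.revert_when_editor_leaves and record.editor_device_id not in present_devices:
        return True                           # the editing XR device is no longer accessing the instance
    return False
```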
  • Asset modification reversal module 442 can, in response to triggering event detection module 440 detecting occurrence of the triggering event, reverse the at least one asset modification to the at least one asset of the instance of the XR world. For example, if the asset modification is the addition of an asset, asset modification reversal module 442 can remove the asset in response to the occurrence of the triggering event. In another example, if the asset modification is editing of the asset, asset modification reversal module 442 can undo the changes to the asset. In still another example, if the asset modification is removal of the asset, asset modification reversal module 442 can add the asset back to the instance of the XR world.
  • asset modification reversal module 442 can further provide rendering data to one or more of the XR devices accessing the instance of the XR world to reverse the at least one asset modification as it is displayed on the XR devices. Further details regarding reversing at least one asset modification to at least one asset of the instance of the XR world are described herein with respect to block 510 of Figure 5.
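  • A minimal illustrative sketch of such a reversal, assuming the state of each asset before the modification was recorded along with the type of command that caused it (names are hypothetical):

```python
# Hypothetical reversal of a single asset modification, assuming the pre-modification
# state of the asset was recorded alongside the command type that produced the change.
from typing import Optional

def reverse_modification(instance_assets: dict, asset_id: str,
                         command_type_name: str, prior_state: Optional[dict]) -> None:
    if command_type_name == "ADD":
        instance_assets.pop(asset_id, None)            # an added asset is simply removed
    elif command_type_name == "EDIT":
        instance_assets[asset_id] = dict(prior_state)  # an edited asset is restored to its prior state
    elif command_type_name == "REMOVE":
        instance_assets[asset_id] = dict(prior_state)  # a removed asset is added back
```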
  • specialized components 430 can further include access detection module 444 and/or instance reversion module 446.
  • Access detection module 444 can detect that the one or more XR devices that were accessing the instance of the XR world are no longer accessing the instance of the XR world. In other words, access detection module 444 can determine that there are no users left within the instance of the XR world.
  • access detection module 444 can determine that the one or more XR devices are no longer accessing the instance of the XR world by determining that the one or more XR devices have exited the instance of the XR world, have accessed a different instance of the XR world, have accessed a different XR world, have been deactivated and/or removed by a user, and/or the like.
  • Instance reversion module 446 can, in response to access detection module 444 detecting that the one or more XR devices are no longer accessing the instance of the XR world, revert the instance of the XR world to the original template. In other words, in some implementations, once there are no users left within the instance of the XR world, all additions, changes, and/or removals made with respect to the original template of the XR world can be reversed to return the instance of XR world to its original template. Thus, in some implementations, any asset modifications made by asset modification module 438 can be undone.
  • the instance of the XR world provided to the XR device is the world as defined by its original template that, in some implementations, is common to all instances of the XR world upon creation.
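  • As an illustrative sketch only (names are hypothetical), access detection module 444 and instance reversion module 446 might together behave as follows: once no XR device remains in the instance, its assets are reset to a copy of the shared original template.

```python
# Hypothetical reversion of an instance to its original template once it is empty.
import copy

def maybe_revert_instance(instance_assets: dict, original_template_assets: dict,
                          present_devices: set) -> bool:
    """If no XR device is still accessing the instance, restore the original template."""
    if present_devices:
        return False                                                 # someone is still in the instance
    instance_assets.clear()
    instance_assets.update(copy.deepcopy(original_template_assets))  # undo every add/edit/remove
    return True
```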
  • Figure 5 is a flow diagram illustrating a process 500 used in some implementations for creating a dynamic, continuously editable artificial reality (XR) world.
  • process 500 can be performed as a response to a user request to create an instance of an XR world.
  • process 500 can be performed by a server located remotely from an XR device used to render the XR world, such as a platform computing system or a developer computing system.
  • process 500 can create an instance of the XR world.
  • Process 500 can create the instance of the XR world, for example, based on a user request received from an XR device to create or access the XR world.
  • the XR device can generate the user request based on, for example, selection of a virtual button (e.g., via a gesture) corresponding to the XR world displayed on the XR device; selection of a physical button on or in operable communication with the XR device (e.g., through one or more controllers, such as controller 276A and/or controller 276B of Figure 2C); an audible announcement to create or access the XR world captured by the XR device or another XR device within an XR system (e.g., through a microphone integral with or in operable communication with the XR device); by accessing a portal to the XR world from a virtual menu, from an application, and/or from another XR world; and/or the like.
  • process 500 can create the instance of the XR world based on an original template, i.e., a starting template from which one or more users can request to add, modify, and/or delete assets, as described further herein.
  • the instance can be of multiple instances of the XR world.
  • the multiple instances of the XR world can share the original template.
  • all instances of an XR world can appear similar when they are created (e.g., have the same assets, characteristics, virtual objects, etc.), albeit with different avatars representing different users within different instances of the XR world.
  • one or more XR devices can access the instance of the XR world and can render an original template of the XR world.
  • the XR device requesting creation of the instance of the XR world (the “creator XR device”) and/or other XR devices associated with other users can render the XR world on their respective XR devices.
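  • A non-limiting sketch of block 502 (all names are hypothetical): each new instance can start as a deep copy of the shared original template, so every instance looks identical at creation while later edits remain local to a single instance.

```python
# Hypothetical creation of an instance from a shared original template (block 502).
import copy
import uuid

ORIGINAL_TEMPLATE = {
    "skybox": "mountain_sunset",
    "assets": {"table-1": {"kind": "pool_table", "position": (0.0, 0.0, 2.0)}},
}

def create_instance(template: dict, creator_device_id: str) -> dict:
    """Create a new, independently editable instance of the XR world."""
    return {
        "instance_id": str(uuid.uuid4()),
        "creator_device_id": creator_device_id,
        "present_devices": {creator_device_id},
        # Deep copy so edits made in this instance never touch the shared template.
        "assets": copy.deepcopy(template["assets"]),
        "skybox": template["skybox"],
    }

instance_a = create_instance(ORIGINAL_TEMPLATE, "hmd-creator-1")
instance_b = create_instance(ORIGINAL_TEMPLATE, "hmd-creator-2")  # both start identical
```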
  • process 500 can receive one or more commands to A) add, B) edit, C) remove, or D) any combination thereof, one or more assets in the instance of the XR world.
  • process 500 can receive the one or more commands from at least one of the one or more XR devices accessing the XR world.
  • process 500 can receive at least one of the one or more commands from the creator XR device.
  • process 500 can receive the one or more commands directly or indirectly from one or more XR devices over any suitable network, such as network 330 of Figure 3.
  • the one or more assets can include a virtual object.
  • a “virtual object” can be any visual object rendered in the XR world, such as a virtual tree, a virtual dog, a virtual billboard, a virtual doorway, etc.
  • the one or more assets can include a virtual space.
  • a “virtual space” can be any rendered indoor or outdoor space in the XR world that users can visit, such as a virtual room, a virtual stadium, a virtual campground, etc.
  • the one or more assets can include a sound, such as a sound effect, music, etc.
  • a sound can be associated with one or more virtual objects and/or virtual spaces within the XR world, such as dance music within a virtual club, a “moo” sound associated with a virtual cow, etc.
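One way to represent assets and the add/edit/remove commands that act on them is sketched below. The asset kinds mirror the virtual objects, virtual spaces, and sounds described above, while the concrete type and field names are assumptions introduced only for illustration:

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple

class AssetKind(Enum):
    VIRTUAL_OBJECT = "virtual_object"   # e.g., a virtual tree, dog, or billboard
    VIRTUAL_SPACE = "virtual_space"     # e.g., a virtual room, stadium, or campground
    SOUND = "sound"                     # e.g., music or a sound effect
    SKYBOX = "skybox"

@dataclass
class Asset:
    asset_id: str
    kind: AssetKind
    position: Tuple[float, float, float] = (0.0, 0.0, 0.0)
    attached_sound: Optional[str] = None   # a sound can be tied to an object or space

@dataclass
class Command:
    action: str          # "add", "edit", or "remove"
    asset: Asset
    issuing_device: str  # identifier of the XR device that generated the command

cow = Asset("cow_1", AssetKind.VIRTUAL_OBJECT, position=(4.0, 0.0, 2.0), attached_sound="moo.ogg")
command = Command(action="add", asset=cow, issuing_device="hmd_42")
print(command)
```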
  • An XR device can generate a command using any suitable method.
  • the XR device can generate a command based on a selection of a physical button on the XR device or one or more other XR devices in an XR system in operable communication with the XR device, such as one or more controllers.
  • the one or more controllers can include, for example, controller 276A or controller 276B of Figure 2C.
  • a user of the XR device can point and click on a virtual selectable element in the XR world using a physical button in order to add, edit, and/or remove an asset in the XR world.
  • the user can point and click on a virtual button associated with adding, editing, and/or removing an asset, and/or can select an asset in the XR world and make changes to it via a virtual menu.
  • the user can further point, select, hold, and move an asset in the XR world using the physical button on the controller, as well as perform other functions via the same or different physical buttons on the controller, and/or via different motions with respect to the physical buttons (e.g., a tap versus a double tap).
  • a user of the XR device can audibly speak the command (e.g., “build a tree over there,” “I wish that table wasn’t there,” “pink flowers are my favorite this time of year,” etc.).
  • the audible command can be received by the XR device via a microphone integral with or in operable communication with the XR device.
  • the XR device can interpret the audible command using natural language processing techniques, and transmit text representative of the command to process 500, which can receive the command.
  • the XR device can transmit an audio signal representative of the audio command to process 500, and process 500 can perform natural language processing techniques to interpret the audible command.
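A minimal sketch of the server-side path is shown below, where the XR device forwards transcribed text and process 500 maps it to a structured command. The keyword patterns are a toy stand-in for real natural language processing, and every function name is illustrative rather than part of the disclosed system:

```python
import re
from typing import Optional

def interpret_utterance(text: str) -> Optional[dict]:
    """Toy stand-in for natural language processing: map transcribed speech to a command."""
    text = text.lower()
    build = re.search(r"\b(build|put|place|add)\b (?:a |an )?([\w ]+?)(?: over)? there", text)
    if build:
        return {"action": "add", "asset_name": build.group(2).strip(), "needs_location": True}
    remove = re.search(r"i wish (?:that |the )?([\w ]+?) wasn'?t there", text)
    if remove:
        return {"action": "remove", "asset_name": remove.group(1).strip()}
    return None  # the utterance was not directed at modifying the world

print(interpret_utterance("Build a tree over there"))
print(interpret_utterance("I wish that table wasn't there"))
```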
  • the XR device can generate a command based on a gaze of the user of the XR device.
  • the XR device can include eye tracking circuitry, e.g., one or more cameras integral with the XR device facing the eyes of the user.
  • the one or more cameras can be used in conjunction with one or more light-emitting diodes (LEDs) illuminating one or more eyes of the user.
  • the eye tracking circuitry of the XR device can ascertain where in the XR world the user is looking, and generate a command based on the direction and/or location of the gaze.
  • the XR device can perform eye tracking and/or gaze estimation using any other suitable technique.
  • the user can look at a virtual button or other selectable element associated with adding, removing, and/or editing an asset in the XR world.
  • the user’s gaze can be used in conjunction with one or more other commands to provide further context to the command. For example, a user can speak, “build a basketball court over there,” and the gaze of the user can indicate the location of where to build the basketball court.
  • the XR device can capture the gaze of the user, translate the gaze into a direction and/or location within the XR world, and transmit an indication of the direction and/or location to process 500.
  • the XR device can capture the gaze of the user, can transmit an indication of the gaze to process 500, and process 500 can translate the gaze into a direction and/or location within the XR world.
  • the XR device can generate a command based on a gesture of the user of the XR device.
  • the XR device can include and/or be in operable communication with one or more cameras capturing the movements of the user in the real-world while the user is in the XR world on the XR device.
  • the gesture can be, for example, pointing, clicking, and/or otherwise indicating a portion of the XR world by one or more of the user’s hands.
  • the user can gesture toward a virtual button or other selectable element associated with adding, removing, and/or editing an asset in the XR world.
  • the user’s gesture can be used in conjunction with one or more other commands to provide further context to the command.
  • a user can speak, “play music in that room,” and a gesture by the user can indicate the location of where to play the music.
  • the XR device can capture the gesture of the user, translate the gesture into a direction and/or location within the XR world, and transmit an indication of the direction and/or location to process 500.
  • the XR device can capture the gesture of the user, can transmit an indication of the gesture to process 500, and process 500 can translate the gesture into a direction and/or location within the XR world.
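One plausible way (an assumption, not necessarily the disclosed implementation) to translate a gaze or pointing gesture into a world location is to cast a ray from the user and intersect it with the ground plane, then attach the result to the spoken command:

```python
from typing import Optional, Tuple

Vec3 = Tuple[float, float, float]

def ray_floor_intersection(origin: Vec3, direction: Vec3) -> Optional[Vec3]:
    """Intersect a gaze or pointing ray with the y = 0 ground plane of the XR world."""
    ox, oy, oz = origin
    dx, dy, dz = direction
    if dy >= 0:          # ray points away from the floor; no placement target
        return None
    t = -oy / dy
    return (ox + t * dx, 0.0, oz + t * dz)

# "Build a basketball court over there": the spoken command supplies the intent,
# while the gaze or gesture ray supplies the location.
target = ray_floor_intersection(origin=(0.0, 1.6, 0.0), direction=(0.3, -0.5, 0.8))
command = {"action": "add", "asset_name": "basketball court", "location": target}
print(command)
```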
  • process 500 can perform, in accordance with the one or more commands, one or more asset modifications to one or more assets of the instance of the XR world.
  • the one or more asset modifications can modify the one or more assets based on the add, edit, and/or remove command.
  • the at least one asset modified by the at least one asset modification can be part of the original template of the XR world.
  • the original template of the XR world can include virtual pine trees.
  • a user can make a command to change the virtual pine trees to virtual palm trees (or to remove the virtual pine trees), thereby changing an asset existing in the original template of the XR world.
  • the at least one asset modified by the at least one asset modification can be an asset that was not part of the original template of the XR world.
  • a user can make a command to change the color of a virtual dog previously added to the instance of the XR world by that user or another user within the instance of the XR world.
  • the instance of the XR world can be created based on a request of a creator XR device of the one or more XR devices accessing the instance of the XR world.
  • the creator XR device can specify one or more rules for performing the one or more asset modifications to the one or more assets of the instance of the XR world.
  • the rules can specify, for example, what type of assets can be added, removed, and/or modified in the instance of the XR world (e.g., only virtual decorations); how long the asset modifications can persist in the instance of the XR world (e.g., for 2 minutes, until the end of April, until the XR device making the command leaves the instance of the XR world, until the creator XR device stops accessing the instance of the XR world, etc.); which of the XR devices can make asset modifications (e.g., only XR devices associated with friends of the creator XR device, only XR devices associated with users having particular demographics, only XR devices having a particular membership, etc.); and/or the like.
  • a creator XR device can specify a rule that virtual objects added by a friend persist forever, while virtual objects added by others only last for 10 minutes.
  • an asset can belong to an asset type, such as: chairs, couches, and tables belonging to the asset type furniture; virtual objects capable of motion and/or interaction belonging to the asset type non-person entity; trees, plants, flowers, and bushes belonging to the asset type landscaping; etc.
  • One or more rules can specify a persistence for an asset modification based on the asset type. For example, asset modifications that relate to a first asset type may persist for a first duration of time (e.g., 15 minutes) and asset modifications that relate to a second asset type may persist for a second duration of time (e.g., 1 hour).
  • a modification to the landscaping of an instance of an XR world, such as the trees present, can persist for 1 hour while a modification that adds a virtual object capable of motion, such as an animal that moves according to an artificial intelligence algorithm or movement model, can persist for 15 minutes.
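Such type-based persistence rules could be represented as a simple lookup, as in the sketch below; the asset types and durations are examples only:

```python
from datetime import timedelta

# Hypothetical persistence rules keyed by asset type (durations are examples only).
PERSISTENCE_RULES = {
    "landscaping": timedelta(hours=1),           # e.g., changing the trees
    "non_person_entity": timedelta(minutes=15),  # e.g., an AI-driven animal
    "furniture": timedelta(minutes=30),
}
DEFAULT_PERSISTENCE = timedelta(minutes=10)

def persistence_for(asset_type: str) -> timedelta:
    """Return how long a modification to an asset of this type should persist."""
    return PERSISTENCE_RULES.get(asset_type, DEFAULT_PERSISTENCE)

print(persistence_for("landscaping"))        # 1:00:00
print(persistence_for("non_person_entity"))  # 0:15:00
```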
  • an instance of an XR world can include one or more rules that define a predefined virtual space, such as a particular room, fenced area, or any other suitable field or volume of space, from which users can perform modifications.
  • users present in the predefined virtual space can issue commands that modify the instance of the XR world, while users that are not present in the predefined virtual space may be prohibited from issuing such commands.
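A rule of this kind reduces to a containment test. The following sketch (with invented room coordinates) checks whether a user's position falls inside the predefined editing space before allowing a modification command:

```python
from typing import Tuple

Vec3 = Tuple[float, float, float]

def inside_edit_zone(user_position: Vec3, zone_min: Vec3, zone_max: Vec3) -> bool:
    """Axis-aligned bounding-box test for the predefined virtual space rule."""
    return all(lo <= p <= hi for p, lo, hi in zip(user_position, zone_min, zone_max))

def may_modify(user_position: Vec3) -> bool:
    # Only users standing inside the editing room (coordinates assumed for illustration)
    # may issue commands that modify the instance; others are prohibited.
    return inside_edit_zone(user_position, zone_min=(0.0, 0.0, 0.0), zone_max=(5.0, 3.0, 5.0))

print(may_modify((2.5, 1.6, 2.5)))   # True: inside the room
print(may_modify((9.0, 1.6, 2.5)))   # False: outside the room
```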
  • the one or more commands can include a command to edit an asset of the one or more assets in the instance of the XR world.
  • an asset modification corresponding to the asset can be performed by making the asset dynamic in the instance of the XR world.
  • a visual or audible effect can be associated with the virtual object, such as an animation or sound effect.
  • an XR device can make a command to make a virtual portrait of a queen give a “thumbs up” sign every time an XR device comes within a threshold distance of the virtual portrait.
  • an XR device can make a command to have a virtual cat purr every time a user of an XR device virtually pets it.
  • an XR device can make a command to make a monkey dance for 2 minutes.
  • process 500 can detect occurrence of a triggering event with respect to at least one of the one or more asset modifications to at least one asset of the instance of the XR world.
  • the triggering event can include, for example, expiration of a specified period of time after performing the at least one asset modification (e.g., 10 minutes), expiration of another period of time specified by the rules (e.g., persistence until the end of the day), detection of the creator XR device no longer accessing the instance of the XR world, detection that another XR device that made the asset modification is no longer accessing the instance of the XR world, detection that the asset is no longer being used and/or viewed on any of the XR devices in the instance of the XR world, cessation of speaking about the at least one asset by at least one user associated with respective XR devices of the one or more XR devices (such as in conversation), etc.
  • process 500 can, in response to detecting occurrence of the triggering event, reverse the at least one asset modification to the at least one asset of the instance of the XR world.
  • process 500 previously performed the asset modification in response to a command, and reversing the asset modification undoes the alteration to the instance of the XR world caused by the command.
  • If the command is to add an asset, process 500 can remove the asset in response to the triggering event.
  • If the command is to remove an asset, process 500 can add the asset back in response to the triggering event.
  • If the command is to edit an asset, process 500 can revert the asset to its state prior to modification in response to the triggering event.
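Reversal can be sketched as recording, for each applied command, the inverse operation to run when the triggering event fires. The function below is a hypothetical illustration of that bookkeeping, not the disclosed implementation:

```python
import copy

def apply_command(instance_assets: dict, command: dict) -> dict:
    """Hypothetical sketch: apply an add/edit/remove command and return the inverse that reverses it."""
    action, asset = command["action"], command["asset"]
    if action == "add":
        instance_assets[asset["id"]] = asset
        return {"action": "remove", "asset": asset}          # removing undoes an add
    if action == "remove":
        removed = instance_assets.pop(asset["id"])
        return {"action": "add", "asset": removed}           # re-adding undoes a remove
    if action == "edit":
        previous = copy.deepcopy(instance_assets[asset["id"]])
        instance_assets[asset["id"]] = asset                 # replace with the edited version
        return {"action": "edit", "asset": previous}         # editing back to the prior state undoes it
    raise ValueError(f"unknown action {action!r}")

assets = {}
inverse = apply_command(assets, {"action": "add", "asset": {"id": "ball", "color": "white"}})
# When the triggering event fires (a timeout, the creator leaving, the topic of
# conversation changing, etc.), the stored inverse is applied:
apply_command(assets, inverse)
assert assets == {}
```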
  • In some implementations, the at least one asset modified by the at least one asset modification comprises a virtual object, the at least one asset modification can comprise a modification to a display of the virtual object in the instance of the XR world, and reversing the at least one asset modification can comprise reversing the modification to the display of the virtual object in the instance of the XR world.
  • process 500 can reverse the asset modification to the at least one asset based on the cessation of speaking about the at least one asset.
  • one or more users associated with XR devices accessing the instance of the XR world can speak about and/or have a conversation about baseball. While the one or more users are talking about baseball (interpreted by process 500 to be a build command), process 500 can add a virtual batting cage to the instance of the XR world. When the one or more users stop talking about baseball (e.g., change the subject), process 500 can remove the virtual batting cage from the instance of the XR world (i.e., silence regarding baseball can be interpreted by process 500 as a command to remove the virtual batting cage from the instance of the XR world).
  • process 500 can further detect that the one or more XR devices are no longer accessing the XR world. For example, process 500 can detect that the users associated with the one or more XR devices have exited the instance of the XR world, traveled to a different instance of the XR world, traveled to a different XR world, exited an application associated with the XR world, deactivated or removed the XR device, etc. In some implementations, process 500 can, in response to detecting that the one or more XR devices are no longer accessing the XR world, revert the instance of the XR world to the original template.
  • process 500 can reverse all additions, deletions, and/or modifications made by XR devices accessing the instance of the XR world, such that the instance of the XR world appears in its original form as it was when the instance was created.
  • process 500 can revert the instance of the XR world to the original template based on detection that the creator XR device is no longer accessing the instance of the XR world, that one or more other XR devices are no longer accessing the instance of the XR world, and/or based on detection that no XR devices are accessing the instance of the XR world.
  • the commands received by process 500 can be generated by an artificial intelligence (AI) expression engine.
  • For example, input from XR devices and their associated users can be provided to the artificial intelligence expression engine, which can translate the input into one or more commands for modifying the instance of the XR world.
  • portions of process 500 can incorporate functionality from the artificial intelligence expression engine.
  • Figure 6A is a flow diagram illustrating a process 600A used in some implementations of the present technology for processing input 602 from a user of an XR device, by an artificial intelligence (AI) expression engine, to understand what a user wants to accomplish in an XR world.
  • the XR world can be dynamic, having an infinite number of possibilities, with assets that can constantly change.
  • the AI expression engine can enable users, via respective XR devices, to modify the content of the XR world to enable such possibilities, instead of having a predefined set of scripted, fixed interactions for users and assets within the XR world.
  • Input 602 can include speech audio 604, which can be the raw audio captured by respective XR devices of each user in an instance of the XR world. Speech audio 604 can be fed into automatic speech recognition 610, which can transcribe speech audio 604 into text for each user in the XR world generating speech audio 604. In some implementations, however, it is contemplated that speech audio 604 may not be generated by some users, such as when users have their microphone muted or turned off on their respective XR devices.
  • Input 602 can further include people state 606, which can be where users are in the XR world (e.g., position), users’ gazes, users’ gestures (e.g., pointing directions), users’ identities, time and/or season for users, preferences, histories thereof, and/or the like.
  • Input 602 can further include world state 608, which can be the assets that already exist in the XR world and their histories, including name, type, positions, environment, audio, theme and style of the XR world, etc.
  • People state 606, world state 608, and text from automatic speech recognition 610 can be fed into continuous input processing 612.
  • Continuous input processing 612 can identify when the AI expression engine is being spoken to with text from automatic speech recognition 610 (e.g., by applying continuous natural language processing techniques) and context provided by people state 606 and world state 608, and produce a structured result.
  • continuous input processing 612 can identify when the AI expression engine is being spoken to without “wake” words (e.g., “hey, Expression Engine, ...”).
  • the structured result can be provided to multimodal input 614, which can take the structured result and combine it with context provided by people state 606 and world state 608.
  • Multimodal input 614 can enable the use of gaze and gestures as input 602, and resolve named assets in the XR world.
  • Multimodal input 614 can be fed into decision engine 616 of the AI expression engine.
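The data flow of Figure 6A can be sketched as a simple pipeline in which each stage is stubbed out. The function bodies below are placeholders standing in for automatic speech recognition 610, continuous input processing 612, and multimodal input 614; the heuristics and example values are invented for illustration:

```python
def automatic_speech_recognition(speech_audio: bytes) -> str:
    # Stand-in for ASR 610; a real system would run a speech-to-text model here.
    return "put a soccer ball over there"

def continuous_input_processing(text: str, people_state: dict, world_state: dict) -> dict:
    # Stand-in for 612: decide, without a wake word, whether the engine is being
    # addressed, and produce a structured result.
    addressed = any(verb in text for verb in ("put", "build", "add", "remove"))
    return {"addressed": addressed, "text": text}

def multimodal_input(structured: dict, people_state: dict, world_state: dict) -> dict:
    # Stand-in for 614: fold in gaze/gesture context and resolve named assets.
    structured["target_location"] = people_state.get("pointing_at")
    return structured

people_state = {"position": (0.0, 0.0, 0.0), "pointing_at": (1.2, 0.0, 3.4)}
world_state = {"assets": {"pool_table": {"type": "furniture"}}, "theme": "clubhouse"}

structured = continuous_input_processing(
    automatic_speech_recognition(b"<raw audio>"), people_state, world_state)
if structured["addressed"]:
    decision_input = multimodal_input(structured, people_state, world_state)
    print(decision_input)  # handed to decision engine 616
```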
  • Figure 6B is a flow diagram illustrating a process 600B used in some implementations of the present technology for providing asset options 618 based on input 602 from a user received via an XR device, by an AI expression engine.
  • asset options 618 can include, for example, skyboxes, virtual objects, sounds, virtual spaces, etc.
  • asset options 618 can include any asset that users can imagine, and all aspects of the XR world can be configurable, instead of being limited by a finite asset library 622.
  • Assets 624 can, given a set of parameters from decision engine 616, return one or more assets as an array.
  • assets 624 cannot create or modify assets in the instance of the XR world, but rather can provide the instructions for the assets 624.
  • asset library 622 can be a user-generated repository of assets (e.g., skyboxes, virtual objects, sounds, virtual spaces, etc.).
  • Digitize 620 can create an asset based on a real-world, physical object. In some implementations, digitize 620 can create the asset asynchronously, and store the asset in asset library 622.
  • Family of apps 626 can pull content from applications outside of the XR world, such as two-dimensional photos and videos from social media applications, etc.
  • Generate from text 628 can generate assets using user-provided text and other parameters (e.g., style, size, color, culture, geographic locations, etc.).
  • Generate from asset 630 can generate assets derived from existing assets by modifying, tweaking, and/or remixing the existing assets. For example, a user can make parameter changes such as size, color, positions, rotation, etc., which may or may not be uniformly applied across the entire asset (e.g., change the color of the leaves on a virtual tree).
  • a user can make targeted changes to just a portion or one aspect of the asset or a subset of the asset (e.g., add a new limb to a tree).
  • a user can make changes to a sound or music.
  • generate from asset 630 can turn a two-dimensional (2D) photograph into a 2D virtual object, a three-dimensional (3D) virtual object, a 3D photograph, a virtual space, and/or a sound.
  • generate from asset 630 can generate a skybox from a 360-degree photograph and/or video.
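The asset sources of Figure 6B can be pictured as a dispatcher that first consults the asset library and falls back to generation, returning candidate assets as an array without itself modifying the world. The sketch below is illustrative only; the library contents and generation calls are stubs:

```python
from typing import List

LIBRARY = {"pool table", "couch"}   # assumed contents of user-generated asset library 622

def from_library(query: str) -> List[dict]:
    # Asset library 622: return ready-made assets when a match exists.
    return [{"source": "library", "name": query}] if query in LIBRARY else []

def generate_from_text(prompt: str, style: str = "default") -> List[dict]:
    # Generate from text 628: a generative model would run here.
    return [{"source": "generated_from_text", "name": prompt, "style": style}]

def generate_from_asset(base: dict, changes: dict) -> List[dict]:
    # Generate from asset 630: remix or tweak an existing asset (e.g., recolor its leaves).
    return [{**base, **changes, "source": "remixed"}]

def assets(parameters: dict) -> List[dict]:
    """Stand-in for assets 624: given parameters from decision engine 616, return
    candidate assets as an array without itself creating or modifying world content."""
    name = parameters["name"]
    candidates = from_library(name)
    if not candidates:
        candidates = generate_from_text(name, parameters.get("style", "default"))
    return candidates

print(assets({"name": "soccer ball"}))   # falls back to text-based generation
print(assets({"name": "pool table"}))    # found in the asset library
```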
  • Figure 6C is a flow diagram illustrating a process 600C used in some implementations of the present technology for providing a response 632 by supplementing an instance of an XR world with assets, by an AI expression engine.
  • Decision engine 616 can take structured input from multimodal input 614 and, in some implementations, assets 624, and can decide one or more actions to be performed as response 632.
  • decision engine 616 can provide world placement 634, which can be where to put an asset in the XR world. World placement 634 can use an appropriate size for the asset and place it in the desired location in the XR world, including the x-, y-, and z-axes.
  • decision engine 616 can further provide trainable actions 636, which can be behaviors that an asset can perform in the XR world. By providing a dynamic, continuously editable XR world, trainable actions 636 can be infinite, instead of a predefined, fixed set of actions.
  • decision engine 616 can provide animations and pathfinding 638, which can be movement of an asset, e.g., as locomotion, and/or as the asset navigates the XR world.
  • animations and pathfinding 638 can provide AI-based physics motions, such as by enabling a human character to have legs and to walk around the XR world, even on uneven surfaces, and to navigate virtual terrain.
  • decision engine 616 can provide natural language generation 640, which can be structured, deterministic text responses in the XR world.
  • decision engine 616 can provide conversational model 642, which can generate open-ended text responses, such as through chat bots and/or translation. Natural language generation 640 and conversational model 642 can be fed into text-to-speech 644, which can convert normal language text from natural language generation 640 and conversational model 642 into audio 650 as a response 632.
  • Responses 632 can include entity mutations 646, embodied movement 648, and/or audio 650.
  • Entity mutations 646 can create, modify, and/or delete assets in the XR world.
  • entity mutations 646 can provide for an instantaneously morphable XR world.
  • Embodied movement 648 can bring life to the XR world by enabling assets to move and/or perform one or more tasks in the XR world.
  • With trainable actions 636 and animations and pathfinding 638, embodied movement 648 can realize an infinite number of various motions and actions.
  • Audio 650 can be played in the XR world, and can include music, sound effects, speech, and/or synthesized speech. In some implementations, audio 650 can only be played in a particular location in the XR world.
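Putting the pieces of Figure 6C together, a hedged sketch of decision engine 616 might map structured input and candidate assets to a response of entity mutations 646, embodied movement 648, and audio 650. All placement values, clip names, and function signatures below are invented for illustration:

```python
from typing import List, Tuple

def world_placement(asset: dict, location: Tuple[float, float, float]) -> dict:
    # World placement 634: choose an appropriate size and set the x, y, z position.
    return dict(asset, position=location, scale=1.0)

def decide(decision_input: dict, candidate_assets: List[dict]) -> dict:
    """Stand-in for decision engine 616: map structured input and candidate assets
    to a response of entity mutations, embodied movement, and audio."""
    placed = [world_placement(a, decision_input["target_location"]) for a in candidate_assets]
    return {
        "entity_mutations": [{"op": "create", "asset": a} for a in placed],   # 646
        "embodied_movement": [],                                              # 648: none for a static object
        "audio": [{"clip": "bounce.ogg", "position": decision_input["target_location"]}],  # 650
    }

response = decide(
    {"text": "put a soccer ball over there", "target_location": (1.2, 0.0, 3.4)},
    [{"source": "generated_from_text", "name": "soccer ball"}],
)
print(response["entity_mutations"])
```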
  • Figure 7A is a conceptual diagram illustrating an example view 700A on an XR device of an instance 702 of an XR world formed based on an original template.
  • the original template can be the version of the XR world displayed in view 700A upon creation of instance 702 of the XR world.
  • the original template (which, in some implementations, can be common across all instances of the XR world) can be a virtual clubhouse having virtual pool table 710 and virtual couch 712.
  • the locations for virtual pool table 710 and virtual couch 712 within the virtual clubhouse can be part of the original template in some implementations.
  • Other object parameters such as colors, sizes, orientation, and the like, can also be stored as part of the original template.
  • Avatars 704-708 associated with users on respective XR devices can be visiting instance 702 of the XR world and can make one or more changes to the instance of the XR world that can be viewed on one, some, or all of the XR devices associated with avatars 704-708, based on one or more rules.
  • Figure 7B is a conceptual diagram illustrating an example view 700B on an XR device of a modification to an instance 702 of an XR world by a user.
  • the user associated with avatar 704 can provide commands 716-718 to place a virtual object; in this case, command 716 is made by voice and command 718 is made by gesture.
  • Command 716 can be, “put a soccer ball over there!”
  • Command 718 can indicate where the user associated with avatar 704 wishes to place the virtual object, e.g., by gesturing in front of virtual pool table 710.
  • virtual soccer ball 720 can be placed in instance 702 of the XR world in the desired location.
  • one or more of the XR devices associated with avatars 704-708 can specify one or more rules with respect to virtual soccer ball 720, such as where virtual soccer ball 720 can be placed in instance 702, how it can be placed in instance 702, how long it can be placed there, who can place it there, who can see it, etc.
  • one or more of the XR devices associated with avatars 704-708 can further specify one or more triggering events that can remove virtual soccer ball 720 from instance 702 of the virtual world.
  • Figure 7C is a conceptual diagram illustrating an example view 700C on an XR device of a reversal of a modification to an instance 702 of an XR world based on a user leaving the instance 702 of the XR world.
  • one or more of the XR devices associated with avatars 704-708 (e.g., a creator XR device associated with the user requesting creation of instance 702 of the XR world, the XR device associated with avatar 704 requesting the change to instance 702 of the XR world, etc.) and/or a platform or developer computing system can specify one or more triggering events with respect to virtual soccer ball 720.
  • the one or more triggering events can include, for example, expiration of a specified period of time, a particular user leaving the instance of the XR world, a creator of the instance of the XR world leaving the XR world, a condition not being fulfilled (e.g., an asset must remain in conversation to stay in instance 702 of the XR world), etc.
  • the triggering event can be avatar 704 (associated with the user adding virtual soccer ball 720) leaving instance 702 of the XR world.
  • virtual soccer ball 720, as well as avatar 704, is removed from view 700C of instance 702 of the XR world.
  • the users associated with avatars 704-708 can issue one or more commands to modify elements of the original template of instance 702 of the XR world, such as pool table 710 and/or virtual couch 712.
  • For example, a user associated with avatar 704 (e.g., a creator user) can make such a modification, and the creator user (or any other suitable user) can also define one or more triggering events to reverse the modification to the element of the original template. Upon detection of this defined triggering event, the modification to the element of the original template can be reversed, returning the element of the original template to its original state.
  • being above a threshold means that a value for an item under comparison is above a specified other value, that an item under comparison is among a certain specified number of items with the largest value, or that an item under comparison has a value within a specified top percentage value.
  • being below a threshold means that a value for an item under comparison is below a specified other value, that an item under comparison is among a certain specified number of items with the smallest value, or that an item under comparison has a value within a specified bottom percentage value.
  • being within a threshold means that a value for an item under comparison is between two specified other values, that an item under comparison is among a middle-specified number of items, or that an item under comparison has a value within a middle-specified percentage range.
  • Relative terms such as high or unimportant, when not otherwise defined, can be understood as assigning a value and determining how that value compares to an established threshold. For example, the phrase "selecting a fast connection" can be understood to mean selecting a connection that has a value assigned corresponding to its connection speed that is above a threshold.
  • the word “or” refers to any possible permutation of a set of items.
  • the phrase “A, B, or C” refers to at least one of A, B, C, or any combination thereof, such as any of: A; B; C; A and B; A and C; B and C; A, B, and C; or multiple of any item such as A and A; B, B, and C; A, A, B, C, and C; etc.


Abstract

Aspects of the present disclosure are related to creating a dynamic, continuously editable artificial reality (XR) world. Instances of an XR world can be created based on an original template for that world, then users can make edits to the world, such as through an artificial intelligence (AI) interpreter, such as an AI expression engine. In some implementations, these changes are not saved to the original template, and rules can be applied to determine how long the changes persist, who can see the changes, etc. The rules can be established by the user creating the instance and/or by other users modifying the instance of the XR world. Implementations can focus on the ephemeral nature of assets (e.g., virtual objects) within an instance of an XR world, instead of their persistence within the XR world.

Description

ARTIFICIAL INTELLIGENCE EXPRESSION ENGINE
TECHNICAL FIELD
[0001] The present disclosure is directed to an artificial intelligence (AI) expression engine for creating dynamic, continuously editable artificial reality (XR) worlds.
BACKGROUND
[0002] Artificial reality (XR) devices are becoming more prevalent. As they become more popular, the applications implemented on such devices are becoming more sophisticated. Augmented reality (AR) applications can provide interactive 3D experiences that combine images of the real-world with virtual objects, while virtual reality (VR) applications can provide an entirely self-contained 3D computer environment. For example, an AR application can be used to superimpose virtual objects over a video feed of a real scene that is observed by a camera. A real-world user in the scene can then make gestures captured by the camera that can provide interactivity between the real-world user and the virtual objects. Mixed reality (MR) systems can allow light to enter a user's eye that is partially generated by a computing system and partially includes light reflected off objects in the real-world. AR, MR, and VR (together XR) experiences can be observed by a user through a head-mounted display (HMD), such as glasses or a headset.
[0003] Many people are turning to the promise of artificial reality: XR worlds expand users’ experiences beyond their real world, allow them to learn and play in new ways, and help them connect with other people. An XR world becomes familiar when its users customize it with objects that interact among themselves and with the users. While creating some objects in an XR world can be simple, as objects get more complex, the skills needed for creating them increase until only experts can create multi-faceted objects such as a house. To create an entire XR world can take weeks or months for a team of experts. As XR worlds become more photorealistic, and as the objects within them provide richer interactive experiences, the effort to successfully create them increases even more until some creation is beyond the scope, or the resources, of many, even experts.
[0004] In contrast, the success of an XR platform is often dependent on the number of people who can create their own customized spaces within the XR worlds and populate them with objects of their own creation. User engagement with the XR worlds decreases when the user is prevented from making a world to their liking or from populating it with rich, realistic objects.
[0005] According to an aspect of the present invention there is provided a method for creating a dynamic, continuously editable artificial reality world, the method comprising: creating an instance, of multiple instances, of the artificial reality world, wherein the instance is accessed by one or more artificial reality devices that render an original template of the artificial reality world, and wherein, at creation, the multiple instances of the artificial reality world share the original template; receiving one or more commands to A) add, B) edit, C) remove, or D) any combination thereof, one or more assets in the instance of the artificial reality world; performing, in accordance with the one or more commands, one or more asset modifications to one or more assets of the instance of the artificial reality world; detecting occurrence of a triggering event with respect to at least one of the one or more asset modifications to at least one asset of the instance of the artificial reality world; in response to detecting occurrence of the triggering event, reversing the at least one asset modification to the at least one asset of the instance of the artificial reality world; detecting that the one or more artificial reality devices are no longer accessing the instance of the artificial reality world; and in response to detecting that the one or more artificial reality devices are no longer accessing the instance of the artificial reality world, reverting the instance of the artificial reality world to the original template.
[0006] Optionally, the instance of the artificial reality world is created based on a request of a creator artificial reality device of the one or more artificial reality devices, and wherein the creator artificial reality device specifies the triggering event.
[0007] Optionally, the triggering event includes one or more of: expiration of a specified period of time after performing the at least one asset modification to the at least one asset of the instance of the artificial reality world, detection of the creator artificial reality device no longer accessing the instance of the artificial reality world, or a combination thereof.
[0008] Optionally, an artificial reality device, of the one or more artificial reality devices, transmits a command of the one or more commands, wherein an asset modification, of the one or more asset modifications, is performed to the instance of the artificial reality world, corresponding to an asset of the one or more assets, based on the command, and wherein the artificial reality device, of the one or more artificial reality devices, specifies the triggering event with respect to the asset of the one or more assets.
[0009] Optionally, the triggering event includes one or more of a specified period of time after performing the modification of the one or more modifications to the instance of the artificial reality world, detection of the artificial reality device no longer accessing the instance of the artificial reality world, or a combination thereof.
[0010] Optionally, the instance of the artificial reality world is created based on a request of a creator artificial reality device of the one or more artificial reality devices, and wherein the creator artificial reality device specifies one or more rules for performing the one or more asset modifications to the one or more assets of the instance of the artificial reality world.
[0011] Optionally, the one or more commands includes a command to edit an asset of the one or more assets in the instance of the artificial reality world, and wherein an asset modification, of the one or more asset modifications, corresponding to the asset, of the one or more assets, is performed by making the asset dynamic in the instance of the artificial reality world.
[0012] Optionally, the one or more assets include a skybox, a virtual object, a sound, a virtual space, or any combination thereof.
[0013] Optionally, the one or more commands are generated by at least one of voice, gaze, a gesture, or any combination thereof.
[0014] Optionally, the triggering event is a cessation of speaking about the at least one asset, of the one or more assets, by at least one user associated with respective artificial reality devices of the one or more artificial reality devices.
[0015] Optionally, the at least one asset modified by the at least one asset modification comprises a virtual object, the at least one asset modification comprises a modification to a display of the virtual object in the instance of the artificial reality world, and the reversing the at least one asset modification comprises reversing the modification to the display of the virtual object in the instance of the artificial reality world.
[0016] Optionally, the at least one asset modified by the at least one asset modification is part of the original template of the artificial reality world.
[0017] According to a further aspect of the present invention, there is provided a computer-readable storage medium storing instructions that, when executed by a computing system, cause the computing system to perform a process for creating a dynamic, continuously editable artificial reality world, the process comprising: creating an instance of the artificial reality world, wherein the instance is accessed by one or more artificial reality devices that render an original template of the artificial reality world; receiving one or more commands to A) add, B) edit, C) remove, or D) any combination thereof, one or more assets in the instance of the artificial reality world; performing, in accordance with the one or more commands, one or more asset modifications to one or more assets of the instance of the artificial reality world; detecting occurrence of a triggering event with respect to at least one of the one or more asset modifications to at least one asset of the instance of the artificial reality world; and in response to detecting occurrence of the triggering event, reversing the at least one asset modification to the at least one asset of the instance of the artificial reality world.
[0018] Optionally, the computer-readable storage medium further comprises: detecting that the one or more artificial reality devices are no longer accessing the artificial reality world; and in response to detecting that the one or more artificial reality devices are no longer accessing the artificial reality world, reverting the instance of the artificial reality world to the original template.
[0019] Optionally, the instance is of multiple instances of the artificial reality world, and wherein, at creation, the multiple instances of the artificial reality world share the original template.
[0020] Optionally, the at least one asset modified by the at least one asset modification comprises a virtual object, the at least one asset modification comprises a modification to a display of the virtual object in the instance of the artificial reality world, and the reversing the at least one asset modification comprises reversing the modification to the display of the virtual object in the instance of the artificial reality world.
[0021] According to a further aspect of the present invention there is provided a computing system for creating a dynamic, continuously editable artificial reality world, the computing system comprising: one or more processors; and one or more memories storing instructions that, when executed by the one or more processors, cause the computing system to perform a process comprising: creating an instance of the artificial reality world, wherein the instance is accessed by one or more artificial reality devices that render an original template of the artificial reality world; receiving one or more commands to A) add, B) edit, C) remove, or D) any combination thereof, one or more assets in the instance of the artificial reality world; performing, in accordance with the one or more commands, one or more asset modifications to one or more assets of the instance of the artificial reality world; detecting occurrence of a triggering event with respect to at least one of the one or more asset modifications to at least one asset of the instance of the artificial reality world; and in response to detecting occurrence of the triggering event, reversing the at least one asset modification to the at least one asset of the instance of the artificial reality world.
[0022] Optionally, the computing system further comprises: detecting that the one or more artificial reality devices are no longer accessing the artificial reality world; and in response to detecting that the one or more artificial reality devices are no longer accessing the artificial reality world, reverting the instance of the artificial reality world to the original template.
[0023] Optionally, the instance is of multiple instances of the artificial reality world, and wherein, at creation, the multiple instances of the artificial reality world share the original template.
[0024] Optionally, the at least one asset modified by the at least one asset modification is part of the original template of the artificial reality world.
[0025] Certain embodiments of the present invention will be described by way of example and with reference to the accompanying drawings, of which:-
BRIEF DESCRIPTION OF THE DRAWINGS
[0026] Figure 1 is a block diagram illustrating an overview of devices on which some implementations of the present technology can operate;
[0027] Figure 2A is a wire diagram illustrating a virtual reality headset which can be used in some implementations of the present technology;
[0028] Figure 2B is a wire diagram illustrating a mixed reality headset which can be used in some implementations of the present technology;
[0029] Figure 2C is a wire diagram illustrating controllers which, in some implementations, a user can hold in one or both hands to interact with an artificial reality environment;
[0030] Figure 3 is a block diagram illustrating an overview of an environment in which some implementations of the present technology can operate;
[0031] Figure 4 is a block diagram illustrating components which, in some implementations, can be used in a system employing the disclosed technology;
[0032] Figure 5 is a flow diagram illustrating a process used in some implementations of the present technology for creating a dynamic, continuously editable artificial reality (XR) world;
[0033] Figure 6A is a flow diagram illustrating a process used in some implementations of the present technology for processing input from a user of an artificial reality (XR) device, by an artificial intelligence (AI) expression engine, to understand what a user wants to accomplish in an XR world;
[0034] Figure 6B is a flow diagram illustrating a process used in some implementations of the present technology for providing asset options for an artificial reality (XR) world based on input from a user received via an XR device, by an artificial intelligence (AI) expression engine;
[0035] Figure 6C is a flow diagram illustrating a process used in some implementations of the present technology for providing a response by supplementing an instance of an artificial reality (XR) world with assets, by an artificial intelligence (AI) expression engine;
[0036] Figure 7A is a conceptual diagram illustrating an example view on an artificial reality (XR) device of an instance of an XR world formed based on an original template;
[0037] Figure 7B is a conceptual diagram illustrating an example view on an artificial reality (XR) device of a modification to an instance of an XR world by a user; and
[0038] Figure 7C is a conceptual diagram illustrating an example view on an artificial reality (XR) device of a reversal of a modification to an instance of an XR world based on a user leaving the XR world.
[0039] The techniques introduced here may be better understood by referring to the following Detailed Description in conjunction with the accompanying drawings, in which like reference numerals indicate identical or functionally similar elements.
DETAILED DESCRIPTION
[0040] Aspects of the present disclosure relate to creating a dynamic, continuously editable artificial reality (XR) world, instead of static XR worlds, that users can visit via XR devices. In some implementations, instances of an XR world can be created based on an original template for that world, then users can make edits to the world, such as through an artificial intelligence (Al) interpreter, such as an Al expression engine. In some implementations, these changes are not saved to the original template, and rules can be applied to determine how long the changes persist, which users can perform the changes, which users can see the changes, which assets (or asset types) of the XR world instance can be changed, etc. The rules can be established by the user creating the instance and/or by other users modifying the instance of the XR world. For example, the rules can specify that flowers persist in the world until the end of spring, that only changes made by a user’s friends persist for longer than 2 minutes, that changes made by a user are deleted when that user leaves the world, etc. Some implementations can focus on the ephemeral nature of virtual objects and other assets within an instance of an XR world, instead of their persistence within the XR world.
[0041] For example, a group of friends can hang out in an instance of a virtual coffee shop like they often do in the physical world. A running joke in the group can be resurfaced from a trip that they went on where someone took a picture in a phone booth, but ended up getting stuck inside. They can materialize a virtual phone booth in the virtual coffee shop via their XR devices to recreate that experience to everyone’s laughter. Later on, one of the friends can remove the virtual phone booth, and instead summon a virtual pool table for a quick match. When the friends leave the instance of the virtual coffee shop, the virtual coffee shop can return to its original state, without the virtual pool table.
[0042] Some changes to the instance of the XR world can change elements of the original template, such as removing a virtual pool table that is part of the original template and replacing it with a virtual foosball table. Triggering events can control when the change is reversed, for example when the virtual foosball table is reverted back to the virtual pool table from the original template. In some implementations, a first user can add an asset to an instance of an XR world, such as a virtual snowman, and a second user can edit the asset added by the first user, such as to modify the virtual clothing worn by the virtual snowman. Since these two changes are performed by different users, their persistence may be different. For example, the virtual clothing may revert to the original virtual clothing for the snowman after a duration of time (e.g., 10 minutes), when it is detected that the second user is no longer present in the instance of the XR world, etc. The virtual snowman addition to the instance of the XR world may revert (e.g., the virtual snowman can be deleted) after a different duration of time (e.g., 1 hour), when it is detected that the first user is no longer present in the instance of the XR world, etc.
[0043] Embodiments of the disclosed technology may include or be implemented in conjunction with an artificial reality system. Artificial reality or extra reality (XR) is a form of reality that has been adjusted in some manner before presentation to a user, which may include, e.g., virtual reality (VR), augmented reality (AR), mixed reality (MR), hybrid reality, or some combination and/or derivatives thereof. Artificial reality content may include completely generated content or generated content combined with captured content (e.g., real-world photographs). The artificial reality content may include video, audio, haptic feedback, or some combination thereof, any of which may be presented in a single channel or in multiple channels (such as stereo video that produces a three-dimensional effect to the viewer). Additionally, in some embodiments, artificial reality may be associated with applications, products, accessories, services, or some combination thereof, that are, e.g., used to create content in an artificial reality and/or used in (e.g., perform activities in) an artificial reality. The artificial reality system that provides the artificial reality content may be implemented on various platforms, including a head-mounted display (HMD) connected to a host computer system, a standalone HMD, a mobile device or computing system, a "cave" environment or other projection system, or any other hardware platform capable of providing artificial reality content to one or more viewers. [0044] "Virtual reality" or"VR," as used herein, refers to an immersive experience where a user's visual input is controlled by a computing system. "Augmented reality" or "AR" refers to systems where a user views images of the real world after they have passed through a computing system. For example, a tablet with a camera on the back can capture images of the real world and then display the images on the screen on the opposite side of the tablet from the camera. The tablet can process and adjust or "augment" the images as they pass through the system, such as by adding virtual objects. "Mixed reality" or "MR" refers to systems where light entering a user's eye is partially generated by a computing system and partially composes light reflected off objects in the real world. For example, a MR headset could be shaped as a pair of glasses with a pass-through display, which allows light from the real world to pass through a waveguide that simultaneously emits light from a projector in the MR headset, allowing the MR headset to present virtual objects intermixed with the real objects the user can see. "Artificial reality," "extra reality," or "XR," as used herein, refers to any of VR, AR, MR, or any combination or hybrid thereof.
[0045] The implementations described herein provide specific technological improvements in the area of XR experiences in that they allow templates of XR worlds to be modified dynamically by users in real-time as they interact with the XR worlds. Some implementations can receive input via any of a number of different methods to build, modify, move, and remove assets, making each instance of an XR world unique with an infinite number of possibilities. Some implementations thus enable an ephemeral XR experience in which assets that are relevant to users within that experience at that particular time are displayed. The ephemeral XR world alterations can substitute for permanent alterations to the template of the XR world, for example when such alterations would not be useful for other users and/or at different times.
[0046] The ephemeral XR world alterations achieve a lightweight approach for customizing an XR world to specific users at specific timings. For instance, the ephemeral nature automatically removes the world alterations in circumstances where they are less relevant, thus conserving system resources for more relevant world content. Thus, by reducing the number of unnecessary and/or irrelevant assets displayed on an XR device rendering the XR world, battery power can be conserved, and processing speed can be improved. Implementations are further necessarily rooted in computer technology, as they are specific to artificial reality and XR experiences displayed on an XR device.
[0047] Several implementations are discussed below in more detail in reference to the figures. Figure 1 is a block diagram illustrating an overview of devices on which some implementations of the disclosed technology can operate. The devices can comprise hardware components of a computing system 100 that can create a dynamic, continuously editable artificial reality (XR) world. In various implementations, computing system 100 can include a single computing device 103 or multiple computing devices (e.g., computing device 101 , computing device 102, and computing device 103) that communicate over wired or wireless channels to distribute processing and share input data. In some implementations, computing system 100 can include a stand-alone headset capable of providing a computer created or augmented experience for a user without the need for external processing or sensors. In other implementations, computing system 100 can include multiple computing devices such as a headset and a core processing component (such as a console, mobile device, or server system) where some processing operations are performed on the headset and others are offloaded to the core processing component. Example headsets are described below in relation to Figures 2A and 2B. In some implementations, position and environment data can be gathered only by sensors incorporated in the headset device, while in other implementations one or more of the non-headset computing devices can include sensor components that can track environment or position data. [0048] Computing system 100 can include one or more processor(s) 110 (e.g., central processing units (CPUs), graphical processing units (GPUs), holographic processing units (HPUs), etc.) Processors 110 can be a single processing unit or multiple processing units in a device or distributed across multiple devices (e.g., distributed across two or more of computing devices 101-103).
[0049] Computing system 100 can include one or more input devices 120 that provide input to the processors 110, notifying them of actions. The actions can be mediated by a hardware controller that interprets the signals received from the input device and communicates the information to the processors 110 using a communication protocol. Each input device 120 can include, for example, a mouse, a keyboard, a touchscreen, a touchpad, a wearable input device (e.g., a haptics glove, a bracelet, a ring, an earring, a necklace, a watch, etc.), a camera (or other light-based input device, e.g., an infrared sensor), a microphone, or other user input devices.
[0050] Processors 110 can be coupled to other hardware devices, for example, with the use of an internal or external bus, such as a PCI bus, SCSI bus, or wireless connection. The processors 110 can communicate with a hardware controller for devices, such as for a display 130. Display 130 can be used to display text and graphics. In some implementations, display 130 includes the input device as part of the display, such as when the input device is a touchscreen or is equipped with an eye direction monitoring system. In some implementations, the display is separate from the input device. Examples of display devices are: an LCD display screen, an LED display screen, a projected, holographic, or augmented reality display (such as a heads-up display device or a head-mounted device), and so on. Other I/O devices 140 can also be coupled to the processor, such as a network chip or card, video chip or card, audio chip or card, USB, firewire or other external device, camera, printer, speakers, CD-ROM drive, DVD drive, disk drive, etc.
[0051] In some implementations, input from the I/O devices 140, such as cameras, depth sensors, IMU sensor, GPS units, LiDAR or other time-of-flight sensors, etc. can be used by the computing system 100 to identify and map the physical environment of the user while tracking the user's location within that environment. This simultaneous localization and mapping (SLAM) system can generate maps (e.g., topologies, grids, etc.) for an area (which may be a room, building, outdoor space, etc.) and/or obtain maps previously generated by computing system 100 or another computing system that had mapped the area. The SLAM system can track the user within the area based on factors such as GPS data, matching identified objects and structures to mapped objects and structures, monitoring acceleration and other position changes, etc.
[0052] Computing system 100 can include a communication device capable of communicating wirelessly or wire-based with other local computing devices or a network node. The communication device can communicate with another device or a server through a network using, for example, TCP/IP protocols. Computing system 100 can utilize the communication device to distribute operations across multiple network devices.
[0053] The processors 110 can have access to a memory 150, which can be contained on one of the computing devices of computing system 100 or can be distributed across of the multiple computing devices of computing system 100 or other external devices. A memory includes one or more hardware devices for volatile or non-volatile storage, and can include both read-only and writable memory. For example, a memory can include one or more of random access memory (RAM), various caches, CPU registers, read-only memory (ROM), and writable non-volatile memory, such as flash memory, hard drives, floppy disks, CDs, DVDs, magnetic storage devices, tape drives, and so forth. A memory is not a propagating signal divorced from underlying hardware; a memory is thus non-transitory. Memory 150 can include program memory 160 that stores programs and software, such as an operating system 162, artificial intelligence (Al) expression engine 164, and other application programs 166. Memory 150 can also include data memory 170 that can include, e.g., artificial reality (XR) world data, original template data, instance data, rendering data, command data, asset data, asset modification data, triggering event data, configuration data, settings, user options or preferences, etc., which can be provided to the program memory 160 or any element of the computing system 100.
[0054] Some implementations can be operational with numerous other computing system environments or configurations. Examples of computing systems, environments, and/or configurations that may be suitable for use with the technology include, but are not limited to, XR headsets, personal computers, server computers, handheld or laptop devices, cellular telephones, wearable electronics, gaming consoles, tablet devices, multiprocessor systems, microprocessor-based systems, set-top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, or the like.
[0055] Figure 2A is a wire diagram of a virtual reality head-mounted display (HMD) 200, in accordance with some embodiments. The HMD 200 includes a front rigid body 205 and a band 210. The front rigid body 205 includes one or more electronic display elements of an electronic display 245, an inertial motion unit (IMU) 215, one or more position sensors 220, locators 225, and one or more compute units 230. The position sensors 220, the IMU 215, and compute units 230 may be internal to the HMD 200 and may not be visible to the user. In various implementations, the IMU 215, position sensors 220, and locators 225 can track movement and location of the HMD 200 in the real world and in an artificial reality environment in three degrees of freedom (3DoF) or six degrees of freedom (6DoF). For example, the locators 225 can emit infrared light beams which create light points on real objects around the HMD 200. As another example, the IMU 215 can include e.g., one or more accelerometers, gyroscopes, magnetometers, other non-camera-based position, force, or orientation sensors, or combinations thereof. One or more cameras (not shown) integrated with the HMD 200 can detect the light points. Compute units 230 in the HMD 200 can use the detected light points to extrapolate position and movement of the HMD 200 as well as to identify the shape and position of the real objects surrounding the HMD 200.
[0056] The electronic display 245 can be integrated with the front rigid body 205 and can provide image light to a user as dictated by the compute units 230. In various embodiments, the electronic display 245 can be a single electronic display or multiple electronic displays (e.g., a display for each user eye). Examples of the electronic display 245 include: a liquid crystal display (LCD), an organic light-emitting diode (OLED) display, an active-matrix organic light-emitting diode display (AMOLED), a display including one or more quantum dot light-emitting diode (QOLED) sub-pixels, a projector unit (e.g., microLED, LASER, etc.), some other display, or some combination thereof.
[0057] In some implementations, the HMD 200 can be coupled to a core processing component such as a personal computer (PC) (not shown) and/or one or more external sensors (not shown). The external sensors can monitor the HMD 200 (e.g., via light emitted from the HMD 200) which the PC can use, in combination with output from the IMU 215 and position sensors 220, to determine the location and movement of the HMD 200.
[0058] Figure 2B is a wire diagram of a mixed reality HMD system 250 which includes a mixed reality HMD 252 and a core processing component 254. The mixed reality HMD 252 and the core processing component 254 can communicate via a wireless connection (e.g., a 60 GHz link) as indicated by link 256. In other implementations, the mixed reality system 250 includes a headset only, without an external compute device or includes other wired or wireless connections between the mixed reality HMD 252 and the core processing component 254. The mixed reality HMD 252 includes a pass-through display 258 and a frame 260. The frame 260 can house various electronic components (not shown) such as light projectors (e.g., LASERS, LEDs, etc.), cameras, eye-tracking sensors, MEMS components, networking components, etc.
[0059] The projectors can be coupled to the pass-through display 258, e.g., via optical elements, to display media to a user. The optical elements can include one or more waveguide assemblies, reflectors, lenses, mirrors, collimators, gratings, etc., for directing light from the projectors to a user's eye. Image data can be transmitted from the core processing component 254 via link 256 to HMD 252. Controllers in the HMD 252 can convert the image data into light pulses from the projectors, which can be transmitted via the optical elements as output light to the user's eye. The output light can mix with light that passes through the display 258, allowing the output light to present virtual objects that appear as if they exist in the real world.
[0060] Similarly to the HMD 200, the HMD system 250 can also include motion and position tracking units, cameras, light sources, etc., which allow the HMD system 250 to, e.g., track itself in 3DoF or 6DoF, track portions of the user (e.g., hands, feet, head, or other body parts), map virtual objects to appear as stationary as the HMD 252 moves, and have virtual objects react to gestures and other real-world objects.

[0061] Figure 2C illustrates controllers 270 (including controllers 276A and 276B), which, in some implementations, a user can hold in one or both hands to interact with an artificial reality environment presented by the HMD 200 and/or HMD 250. The controllers 270 can be in communication with the HMDs, either directly or via an external device (e.g., core processing component 254). The controllers can have their own IMU units, position sensors, and/or can emit further light points. The HMD 200 or 250, external sensors, or sensors in the controllers can track these controller light points to determine the controller positions and/or orientations (e.g., to track the controllers in 3DoF or 6DoF). The compute units 230 in the HMD 200 or the core processing component 254 can use this tracking, in combination with IMU and position output, to monitor hand positions and motions of the user. The controllers can also include various buttons (e.g., buttons 272A-F) and/or joysticks (e.g., joysticks 274A-B), which a user can actuate to provide input and interact with objects.
[0062] In various implementations, the HMD 200 or 250 can also include additional subsystems, such as an eye tracking unit, an audio system, various network components, etc., to monitor indications of user interactions and intentions. For example, in some implementations, instead of or in addition to controllers, one or more cameras included in the HMD 200 or 250, or from external cameras, can monitor the positions and poses of the user's hands to determine gestures and other hand and body motions. As another example, one or more light sources can illuminate either or both of the user's eyes and the HMD 200 or 250 can use eye-facing cameras to capture a reflection of this light to determine eye position (e.g., based on a set of reflections around the user's cornea), modeling the user's eye and determining a gaze direction.
[0063] Figure 3 is a block diagram illustrating an overview of an environment 300 in which some implementations of the disclosed technology can operate. Environment 300 can include one or more client computing devices 305A-D, examples of which can include computing system 100. In some implementations, some of the client computing devices (e.g., client computing device 305B) can be the HMD 200 or the HMD system 250. Client computing devices 305 can operate in a networked environment using logical connections through network 330 to one or more remote computers, such as a server computing device.
[0064] In some implementations, server 310 can be an edge server which receives client requests and coordinates fulfillment of those requests through other servers, such as servers 320A-C. Server computing devices 310 and 320 can comprise computing systems, such as computing system 100. Though each server computing device 310 and 320 is displayed logically as a single server, server computing devices can each be a distributed computing environment encompassing multiple computing devices located at the same or at geographically disparate physical locations.
[0065] Client computing devices 305 and server computing devices 310 and 320 can each act as a server or client to other server/client device(s). Server 310 can connect to a database 315. Servers 320A-C can each connect to a corresponding database 325A-C. As discussed above, each server 310 or 320 can correspond to a group of servers, and each of these servers can share a database or can have their own database. Though databases 315 and 325 are displayed logically as single units, databases 315 and 325 can each be a distributed computing environment encompassing multiple computing devices, can be located within their corresponding server, or can be located at the same or at geographically disparate physical locations.

[0066] Network 330 can be a local area network (LAN), a wide area network (WAN), a mesh network, a hybrid network, or other wired or wireless networks. Network 330 may be the Internet or some other public or private network. Client computing devices 305 can be connected to network 330 through a network interface, such as by wired or wireless communication. While the connections between server 310 and servers 320 are shown as separate connections, these connections can be any kind of local, wide area, wired, or wireless network, including network 330 or a separate public or private network.
[0067] Figure 4 is a block diagram illustrating components 400 which, in some implementations, can be used in a system employing the disclosed technology. Components 400 can be included in one device of computing system 100 or can be distributed across multiple of the devices of computing system 100. The components 400 include hardware 410, mediator 420, and specialized components 430. As discussed above, a system implementing the disclosed technology can use various hardware including processing units 412, working memory 414, input and output devices 416 (e.g., cameras, displays, IMU units, network connections, etc.), and storage memory 418. In various implementations, storage memory 418 can be one or more of: local devices, interfaces to remote storage devices, or combinations thereof. For example, storage memory 418 can be one or more hard drives or flash drives accessible through a system bus or can be a cloud storage provider (such as in storage 315 or 325) or other network storage accessible via one or more communications networks. In various implementations, components 400 can be implemented in a client computing device such as client computing devices 305 or on a server computing device, such as server computing device 310 or 320.
[0068] Mediator 420 can include components which mediate resources between hardware 410 and specialized components 430. For example, mediator 420 can include an operating system, services, drivers, a basic input output system (BIOS), controller circuits, or other hardware or software systems.
[0069] Specialized components 430 can include software or hardware configured to perform operations for creating a dynamic, continuously editable artificial reality (XR) world. Specialized components 430 can include instance creation module 434, command receipt module 436, asset modification module 438, triggering event detection module 440, asset modification reversal module 442, access detection module 444, instance reversion module 446, and components and APIs which can be used for providing user interfaces, transferring data, and controlling the specialized components, such as interfaces 432. In some implementations, components 400 can be in a computing system that is distributed across multiple computing devices or can be an interface to a server-based application executing one or more of specialized components 430. Although depicted as separate components, specialized components 430 may be logical or other nonphysical differentiations of functions and/or may be submodules or code-blocks of one or more applications. In some implementations, one or more of specialized components 430 can be omitted, such as, for example, access detection module 444 and instance reversion module 446.
[0070] Instance creation module 434 can create an instance of an artificial reality (XR) world. As used herein, an “instance” of an XR world can be a version of the XR world common only to the XR devices accessing that instance. In some implementations, an instance can be limited to a particular number of users (e.g., 35 users, 50 users, or any other suitable threshold), a particular group of users (e.g., a group of friends that want to play together, a group of users that are 18+ years old, etc.), a limited duration of time (e.g., 10 minutes, a season, etc.), and/or the like.
[0071] In some implementations, once the instance reaches a certain number of users, a new user can request a new instance, or in response to any other suitable trigger for instance creation, instance creation module 434 can create a new instance of the XR world for users requesting to create or access the XR world. Thus, in some implementations, instance creation module 434 can create multiple instances of a single XR world. In some implementations, upon request of creation of an instance of the XR world by an XR device, instance creation module 434 can create the instance of the XR world based on an original template of the XR world. In implementations in which instance creation module 434 creates multiple instances of the XR world, the multiple instances can share the original template upon creation. The original template can include, for example, an initial set of assets.
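By way of a non-limiting illustration, the following Python sketch shows one possible way a component such as instance creation module 434 could mint instances that share a common original template and cap the number of users per instance. The class, field, and method names (Instance, InstanceManager, join) and the user cap are assumptions made for illustration, not part of the disclosed implementation.

```python
import copy
import uuid
from dataclasses import dataclass, field

@dataclass
class Instance:
    """One instance of an XR world, seeded from a shared original template."""
    instance_id: str
    assets: dict = field(default_factory=dict)   # asset_id -> asset description
    devices: set = field(default_factory=set)    # XR devices currently accessing

class InstanceManager:
    """Creates and hands out instances of an XR world built from one original template."""

    def __init__(self, original_template, max_users=35):
        self.original_template = original_template  # shared starting asset set
        self.max_users = max_users                  # illustrative per-instance cap
        self.instances = {}

    def create_instance(self):
        # Every instance starts as a deep copy of the original template,
        # so all instances look the same at the moment of creation.
        inst = Instance(instance_id=str(uuid.uuid4()),
                        assets=copy.deepcopy(self.original_template))
        self.instances[inst.instance_id] = inst
        return inst

    def join(self, device_id):
        # Reuse an instance that still has room; otherwise create a new one.
        for inst in self.instances.values():
            if len(inst.devices) < self.max_users:
                inst.devices.add(device_id)
                return inst
        inst = self.create_instance()
        inst.devices.add(device_id)
        return inst
```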
[0072] As used herein, an “asset” can be any visual and/or audible content within an XR world, such as a virtual object, audio, a virtual space, a skybox, etc. As used herein, a “skybox” can be a background that shows the general location of the user’s world. Additional details on skyboxes, and on generating a skybox from an image, are provided in U.S. Provisional Patent Application No. 63/309,767, with Attorney Docket No. 3589-0120PV01. The skybox can be the distant background, and it cannot be touched by the user, but it may have changing weather, seasons, night and day, and the like. In one example, a skybox can be a mountainous background including a distant mountain and the sky. In some implementations, an original template can include a defined skybox and a virtual space with a number of virtual objects in a preconfigured orientation, and any instance created from this original template can be a virtual world comprising the skybox and a defined virtual space with the virtual objects organized in the preconfigured orientation. Further details regarding creating an instance of an XR world are described herein with respect to block 502 of Figure 5.

[0073] Command receipt module 436 can receive one or more commands with respect to one or more assets in the instance of the XR world created by instance creation module 434. In some implementations, command receipt module 436 can receive the one or more commands directly or indirectly from an XR device capturing the command from a user. In some implementations, command receipt module 436 can receive the one or more commands over any suitable network, such as network 330 of Figure 3. In some implementations, the one or more commands can include a command to add one or more assets to the instance of the XR world, such as a virtual object, an audio effect (e.g., music, a sound effect, noise, etc.), a virtual space (e.g., a virtual room or other indoor or outdoor virtual area), a skybox, etc. In some implementations, the one or more commands can include a command to edit one or more assets in the instance of the XR world. For example, the command can be to resize, reshape, flip, warp, rescale, change the color, change the tone, etc., of an existing asset (either from the original template or previously added by a user) in the instance of the XR world.
[0074] In some implementations, the command can be to remove one or more assets in the instance of the XR world. For example, the command can be to delete (i.e., remove from view on one or more XR devices accessing the instance of the XR world) one or more assets in the instance of the XR world. In some implementations, the command can be to remove an asset existing in the original template of the instance of the XR world. In some implementations, the command can be to remove an asset added by the user making the command. In some implementations, the command can be to remove an asset added by a user other than the user making the command. In some implementations, command receipt module 436 can receive any combination of commands to add, edit, and/or remove one or more assets in the instance of the XR world. Further details regarding receiving one or more commands to A) add, B) edit, C) remove, or D) any combination thereof, one or more assets in the instance of the XR world, are described herein with respect to block 504 of Figure 5.
[0075] Asset modification module 438 can perform, in accordance with the one or more commands received by command receipt module 436, one or more asset modifications to one or more assets of the instance of the XR world using artificial intelligence (Al) techniques. For example, if the command received by command receipt module 436 is to add an asset to the instance of the XR world, asset modification module 438 can build the asset according to the command and/or retrieve the asset from an asset library, such as asset library 622 of Figure 6B described further herein. If the command received by command receipt module 436 is to edit an asset in the instance of the XR world, asset modification module 438 can change the asset in accordance with the command. In some implementations, asset modification module 438 can further provide rendering data for the added and/or edited asset which can be used to display the asset on XR devices accessing the instance of the XR world. If the command received by command receipt module 436 is to remove an asset from the instance of the XR world, asset modification module 438 can remove rendering data associated with rendering the asset in the instance of the XR world from the rendering data provided to the XR devices accessing the instance.
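As a non-limiting sketch of the add/edit/remove handling described above, the following Python function applies a command to an instance's asset table while recording enough state to reverse the modification later. The dictionary-based command and record shapes are assumptions for illustration; asset construction and library retrieval are outside the scope of the sketch.

```python
import copy

def apply_command(instance, command):
    """Apply an add / edit / remove command to an instance's asset table and
    return a record describing how to reverse the modification later."""
    kind = command["kind"]
    asset_id = command["asset_id"]
    if kind == "add":
        instance.assets[asset_id] = command["asset"]
        return {"kind": "add", "asset_id": asset_id}
    if kind == "edit":
        before = copy.deepcopy(instance.assets[asset_id])  # snapshot for undo
        instance.assets[asset_id].update(command["changes"])
        return {"kind": "edit", "asset_id": asset_id, "before": before}
    if kind == "remove":
        removed = instance.assets.pop(asset_id)            # keep asset for restore
        return {"kind": "remove", "asset_id": asset_id, "removed": removed}
    raise ValueError("unknown command kind: %s" % kind)
```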
[0076] Further details regarding building, editing, and/or removing an asset according to a command using Al techniques are described in U.S. Provisional Patent Application No. 63/309,760, with Attorney Docket No. 3589-0119PV01, and U.S. Provisional Patent Application No. 63/382,180, with Attorney Docket No. 3589-0183DP01. Further details regarding performing one or more asset modifications in accordance with one or more commands are described herein with respect to block 506 of Figure 5.
[0077] Triggering event detection module 440 can detect occurrence of a triggering event with respect to at least one of the one or more asset modifications to at least one asset of the instance of the XR world. In some implementations, triggering event detection module 440 can detect that one or more rules specified for the asset modification are no longer being met or that one or more rules specified for reversal of the asset modification are being met. For example, triggering event detection module 440 can detect that a particular XR device is no longer accessing the instance of the XR world (e.g., has exited the instance of the XR world, has accessed a different instance of the XR world, has accessed a different XR world, has been deactivated and/or removed by a user, etc.). In some implementations, the particular XR device can be, for example, an XR device associated with a user that requested creation of the instance of the XR world via instance creation module 434. In some implementations, the particular XR device can be an XR device associated with a user who provided the command (received by command receipt module 436) that caused the asset modification performed by asset modification module 438. In some implementations, the particular XR device can be any other XR device previously accessing the instance of the XR world, as specified by a rule.
[0078] In another example, triggering event detection module 440 can detect expiration of a time period specified by a rule (e.g., 10 minutes, until the end of the day, until the end of December, etc.). In still another example, triggering event detection module 440 can detect that users associated with XR devices accessing the instance of the XR world are no longer speaking about the asset in conversation (either audibly or via text). In some implementations, upon detection of the occurrence of the triggering event, triggering event detection module 440 can transmit a flag or other indicator that the triggering event has occurred with respect to an asset modification to asset modification reversal module 442. Further details regarding detecting occurrence of a triggering event with respect to at least one of the one or more asset modifications are described herein with respect to block 508 of Figure 5.
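The following Python sketch illustrates, under assumed rule shapes, how the three example triggering events above (time expiration, a particular device leaving, and conversational silence about the asset) could be evaluated. The rule dictionary keys and the active_topics input are illustrative assumptions, not disclosed interfaces.

```python
import time

def trigger_fired(rule, instance, now=None, active_topics=frozenset()):
    """Return True once a reversal rule's triggering event has occurred.
    The rule shapes (expiry, device_left, topic_silent) are illustrative only."""
    now = time.time() if now is None else now
    if rule["type"] == "expiry":          # e.g., persist for 10 minutes
        return now >= rule["expires_at"]
    if rule["type"] == "device_left":     # e.g., the commanding XR device exited
        return rule["device_id"] not in instance.devices
    if rule["type"] == "topic_silent":    # e.g., users stopped talking about it
        return rule["topic"] not in active_topics
    return False
```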
[0079] Asset modification reversal module 442 can, in response to triggering event detection module 440 detecting occurrence of the triggering event, reverse the at least one asset modification to the at least one asset of the instance of the XR world. For example, if the asset modification is the addition of an asset, asset modification reversal module 442 can remove the asset in response to the occurrence of the triggering event. In another example, if the asset modification is editing of the asset, asset modification reversal module 442 can undo the changes to the asset. In still another example, if the asset modification is removal of the asset, asset modification reversal module 442 can add the asset back to the instance of the XR world. In some implementations, asset modification reversal module can further provide rendering data to one or more of the XR devices accessing the instance of the XR world to reverse the at least one asset modification as it is displayed on the XR devices. Further details regarding reversing at least one asset modification to at least one asset of the instance of the XR world are described herein with respect to block 510 of Figure 5.
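A minimal Python sketch of the reversal behavior described above follows; it assumes the undo records produced by the command-application sketch earlier in this section, and it mirrors the three cases: an added asset is removed, an edit is rolled back, and a removed asset is restored.

```python
def reverse_modification(instance, record):
    """Undo one previously applied asset modification: added assets are removed,
    edits are rolled back, and removed assets are restored."""
    if record["kind"] == "add":
        instance.assets.pop(record["asset_id"], None)
    elif record["kind"] == "edit":
        instance.assets[record["asset_id"]] = record["before"]
    elif record["kind"] == "remove":
        instance.assets[record["asset_id"]] = record["removed"]
```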
[0080] In some implementations, specialized components 430 can further include access detection module 444 and/or instance reversion module 446. Access detection module 444 can detect that the one or more XR devices that were accessing the instance of the XR world are no longer accessing the instance of the XR world. In other words, access detection module 444 can determine that there are no users left within the instance of the XR world. In some implementations, access detection module 444 can determine that the one or more XR devices are no longer accessing the instance of the XR world by determining that the one or more XR devices have exited the instance of the XR world, have accessed a different instance of the XR world, have accessed a different XR world, have been deactivated and/or removed by a user, and/or the like.
[0081] Instance reversion module 446 can, in response to access detection module 444 detecting that the one or more XR devices are no longer accessing the instance of the XR world, revert the instance of the XR world to the original template. In other words, in some implementations, once there are no users left within the instance of the XR world, all additions, changes, and/or removals made with respect to the original template of the XR world can be reversed to return the instance of XR world to its original template. Thus, in some implementations, any asset modifications made by asset modification module 438 can be undone. When an XR device later accesses the instance of the XR world, the instance of the XR world provided to the XR device is the world as defined by its original template that, in some implementations, is common to all instances of the XR world upon creation.
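The reversion step described above can be sketched in Python as follows, assuming the Instance shape used in the earlier sketches: once no XR devices remain, every modification is discarded and the asset set defined by the original template is restored.

```python
import copy

def maybe_revert(instance, original_template):
    """When no XR devices remain in the instance, discard all modifications and
    restore the asset set defined by the original template."""
    if not instance.devices:
        instance.assets = copy.deepcopy(original_template)
        return True
    return False
```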
[0082] Those skilled in the art will appreciate that the components illustrated in Figures 1-4 described above, and in each of the flow diagrams discussed below, may be altered in a variety of ways. For example, the order of the logic may be rearranged, substeps may be performed in parallel, illustrated logic may be omitted, other logic may be included, etc. In some implementations, one or more of the components described above can execute one or more of the processes described below.
[0083] Figure 5 is a flow diagram illustrating a process 500 used in some implementations for creating a dynamic, continuously editable artificial reality (XR) world. In some implementations, process 500 can be performed as a response to a user request to create an instance of an XR world. In some implementations, process 500 can be performed by a server located remotely from an XR device used to render the XR world, such as a platform computing system or a developer computing system.

[0084] At block 502, process 500 can create an instance of the XR world. Process 500 can create the instance of the XR world, for example, based on a user request received from an XR device to create or access the XR world. The XR device can generate the user request based on, for example, selection of a virtual button (e.g., via a gesture) corresponding to the XR world displayed on the XR device; selection of a physical button on or in operable communication with the XR device (e.g., through one or more controllers, such as controller 276A and/or controller 276B of Figure 2C); an audible announcement to create or access the XR world captured by the XR device or another XR device within an XR system (e.g., through a microphone integral with or in operable communication with the XR device); by accessing a portal to the XR world from a virtual menu, from an application, and/or from another XR world; and/or the like.
[0085] In some implementations, process 500 can create the instance of the XR world based on an original template, i.e., a starting template from which one or more users can request to add, modify, and/or delete assets, as described further herein. In some implementations, the instance can be of multiple instances of the XR world. In some implementations, upon their respective creation, the multiple instances of the XR world can share the original template. In other words, in some implementations, all instances of an XR world can appear similar when they are created (e.g., have the same assets, characteristics, virtual objects, etc.), albeit with different avatars representing different users within different instances of the XR world.
[0086] Upon creation, one or more XR devices can access the instance of the XR world and can render an original template of the XR world. For example, upon creation, the XR device requesting creation of the instance of the XR world (the “creator XR device”) and/or other XR devices associated with other users can render the XR world on their respective XR devices.
[0087] At block 504, process 500 can receive one or more commands to A) add, B) edit, C) remove, or D) any combination thereof, one or more assets in the instance of the XR world. In some implementations, process 500 can receive the one or more commands from at least one of the one or more XR devices accessing the XR world. In some implementations, process 500 can receive at least one of the one or more commands from the creator XR device. In some implementations, process 500 can receive the one or more commands directly or indirectly from one or more XR devices over any suitable network, such as network 330 of Figure 3.
[0088] In some implementations, the one or more assets can include a virtual object. As used herein, a “virtual object” can be any visual object rendered in the XR world, such as a virtual tree, a virtual dog, a virtual billboard, a virtual doorway, etc. In some implementations, the one or more assets can include a virtual space. As used herein, a “virtual space” can be any rendered indoor or outdoor space in the XR world in which users can visit, such as a virtual room, a virtual stadium, a virtual campground, etc. In some implementations, the one or more assets can include a sound, such as a sound effect, music, etc. In some implementations, a sound can be associated with one or more virtual objects and/or virtual spaces within the XR world, such as dance music within a virtual club, a “moo” sound associated with a virtual cow, etc.
[0089] An XR device can generate a command using any suitable method. In some implementations, the XR device can generate a command based on a selection of a physical button on the XR device or one or more other XR devices in an XR system in operable communication with the XR device, such as one or more controllers. The one or more controllers can include, for example, controller 276A or controller 276B of Figure 2C. Using one or more of the controllers, a user of the XR device can point and click on a virtual selectable element in the XR world using a physical button in order to add, edit, and/or remove an asset in the XR world. For example, the user can point and click on a virtual button associated with adding, editing, and/or removing an asset, and/or can select an asset in the XR world and make changes to it via a virtual menu. In some implementations, the user can further point, select, hold, and move an asset in the XR world using the physical button on the controller, as well as perform other functions via the same or different physical buttons on the controller, and/or via different motions with respect to the physical buttons (e.g., a tap versus a double tap).

[0090] Alternatively or additionally, a user of the XR device can audibly speak the command (e.g., “build a tree over there,” “I wish that table wasn’t there,” “pink flowers are my favorite this time of year,” etc.). The audible command can be received by the XR device via a microphone integral with or in operable communication with the XR device. In some implementations, the XR device can interpret the audible command using natural language processing techniques and transmit text representative of the command to process 500, which can receive the command. In some implementations, the XR device can transmit an audio signal representative of the audio command to process 500, and process 500 can perform natural language processing techniques to interpret the audible command.
[0091] Alternatively or additionally, the XR device can generate a command based on a gaze of the user of the XR device. For example, the XR device can include eye tracking circuitry, e.g., one or more cameras integral with the XR device facing the eyes of the user. In some implementations, the one or more cameras can be used in conjunction with one or more light-emitting diodes (LEDs) illuminating one or more eyes of the user. The eye tracking circuitry of the XR device can ascertain where in the XR world the user is looking, and generate a command based on the direction and/or location of the gaze. The XR device can perform eye tracking and/or gaze estimation using any other suitable technique. For example, the user can look at a virtual button or other selectable element associated with adding, removing, and/or editing an asset in the XR world. In some implementations, the user’s gaze can be used in conjunction with one or more other commands to provide further context to the command. For example, a user can speak, “build a basketball court over there,” and the gaze of the user can indicate the location of where to build the basketball court. In some implementations, the XR device can capture the gaze of the user, translate the gaze into a direction and/or location within the XR world, and transmit an indication of the direction and/or location to process 500. In some implementations, the XR device can capture the gaze of the user, can transmit an indication of the gaze to process 500, and process 500 can translate the gaze into a direction and/or location within the XR world.
[0092] Alternatively or additionally, the XR device can generate a command based on a gesture of the user of the XR device. For example, the XR device can include and/or be in operable communication with one or more cameras capturing the movements of the user in the real-world while the user is in the XR world on the XR device. The gesture can be, for example, pointing, clicking, and/or otherwise indicating a portion of the XR world by one of more of the user’s hands. For example, the user can gesture toward a virtual button or other selectable element associated with adding, removing, and/or editing an asset in the XR world. In some implementations, the user’s gesture can be used in conjunction with one or more other commands to provide further context to the command. For example, a user can speak, “play music in that room,” and a gesture by the user can indicate the location of where to play the music. In some implementations, the XR device can capture the gesture of the user, translate the gesture into a direction and/or location within the XR world, and transmit an indication of the direction and/or location to process 500. In some implementations, the XR device can capture the gesture of the user, can transmit an indication of the gesture to process 500, and process 500 can translate the gesture into a direction and/or location within the XR world.
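The translation of a gaze or pointing gesture into a direction and/or location within the XR world, as described in the two paragraphs above, can be sketched in Python as a simple ray intersection. This is a minimal illustration that assumes a flat ground plane and a y-up coordinate system; the function name and signature are not part of the disclosed implementation.

```python
def resolve_target(ray_origin, ray_direction, ground_height=0.0):
    """Turn a gaze or pointing ray into a world-space location by intersecting
    it with a flat ground plane, so a command like "build it over there" can be
    given a concrete position. Coordinates are (x, y, z) with y pointing up."""
    ox, oy, oz = ray_origin
    dx, dy, dz = ray_direction
    if dy == 0:                      # ray is parallel to the ground: no hit
        return None
    t = (ground_height - oy) / dy
    if t <= 0:                       # intersection is behind the origin
        return None
    return (ox + t * dx, ground_height, oz + t * dz)
```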
[0093] At block 506, process 500 can perform, in accordance with the one or more commands, one or more asset modifications to one or more assets of the instance of the XR world. The one or more asset modifications can modify the one or more assets based on the add, edit, and/or remove command. In some implementations, the at least one asset modified by the at least one asset modification can be part of the original template of the XR world. For example, the original template of the XR world can include virtual pine trees. A user can make a command to change the virtual pine trees to virtual palm trees (or to remove the virtual pine trees), thereby changing an asset existing in the original template of the XR world. In some implementations, the at least one asset modified by the at least one asset modification can be an asset that was not part of the original template of the XR world. For example, a user can make a command to change the color of a virtual dog previously added to the instance of the XR world by that user or another user within the instance of the XR world.
[0094] In some implementations, the instance of the XR world can be created based on a request of a creator XR device of the one or more XR devices accessing the instance of the XR world. In some implementations, the creator XR device can specify one or more rules for performing the one or more asset modifications to the one or more assets of the instance of the XR world. The rules can specify, for example, what type of assets can be added, removed, and/or modified in the instance of the XR world (e.g., only virtual decorations); how long the asset modifications can persist in the instance of the XR world (e.g., for 2 minutes, until the end of April, until the XR device making the command leaves the instance of the XR world, until the creator XR device stops accessing the instance of the XR world, etc.); which of the XR devices can make asset modifications (e.g., only XR devices associated with friends of the creator XR device, only XR devices associated with users having particular demographics, only XR devices having a particular membership, etc.); and/or the like. For example, a creator XR device can specify a rule that virtual objects added by a friend persist forever, while virtual objects added by others only last for 10 minutes.
[0095] In some implementations, an asset can belong to an asset type, such as: chairs, couches, and tables belonging to the asset type furniture; virtual objects capable of motion and/or interaction belonging to the asset type non-person entity; trees, plants, flowers, and bushes belonging to the asset type landscaping; etc. One or more rules can specify a persistence for an asset modification based on the asset type. For example, asset modifications that relate to a first asset type may persist for a first duration of time (e.g., 15 minutes) and asset modifications that relate to a second asset type may persist for a second duration of time (e.g., 1 hour). In this example, a modification to the landscaping of an instance of an XR world, such as the trees present, can persist for 1 hour while a modification that adds a virtual object capable of motion, such as an animal that moves according to an artificial intelligence algorithm or movement model, can persist for 15 minutes.
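A minimal Python sketch of per-asset-type persistence, using the illustrative durations from the example above, follows; the table contents, type names, and default fallback are assumptions rather than disclosed values.

```python
ASSET_TYPE_PERSISTENCE = {          # illustrative durations, in seconds
    "landscaping": 60 * 60,         # e.g., changed trees persist for 1 hour
    "non_person_entity": 15 * 60,   # e.g., a moving animal persists for 15 minutes
}

def expires_at(asset_type, modified_at, default_seconds=10 * 60):
    """Compute when a modification should be reversed based on its asset type;
    unknown types fall back to a default persistence."""
    return modified_at + ASSET_TYPE_PERSISTENCE.get(asset_type, default_seconds)
```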
[0096] In some implementations, an instance of an XR world can include one or more rules that define a predefined virtual space, such as a particular room, fenced area, or any other suitable field or volume of space, from which users can perform modifications. For example, users present in the predefined virtual space can issue commands that modify the instance of the XR world, while users that are not present in the predefined virtual space may be prohibited from issuing such commands.
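One simple way such a rule could be evaluated is sketched below in Python, assuming the predefined virtual space is represented as an axis-aligned box; other space representations (rooms, fenced regions, arbitrary volumes) would change only the containment test.

```python
def command_allowed(user_position, edit_zone):
    """Accept modification commands only from users standing inside the
    predefined virtual space; edit_zone is ((min_x, min_y, min_z), (max_x, max_y, max_z))."""
    (min_x, min_y, min_z), (max_x, max_y, max_z) = edit_zone
    x, y, z = user_position
    return (min_x <= x <= max_x and
            min_y <= y <= max_y and
            min_z <= z <= max_z)
```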
[0097] In some implementations, the one or more commands can include a command to edit an asset of the one or more assets in the instance of the XR world. In some implementations, an asset modification corresponding to the asset can be performed by making the asset dynamic in the instance of the XR world. For example, for a virtual object, a visual or audible effect can be associated with the virtual object, such as an animation or sound effect. For example, an XR device can make a command to make a virtual portrait of a queen give a “thumbs up” sign every time an XR device comes within a threshold distance of the virtual portrait. In another example, an XR device can make a command to have a virtual cat purr every time a user of an XR device virtually pets it. In still another example, an XR device can make a command to make a monkey dance for 2 minutes.
[0098] At block 508, process 500 can detect occurrence of a triggering event with respect to at least one of the one or more asset modifications to at least one asset of the instance of the XR world. The triggering event can include, for example, expiration of a specified period of time after performing the at least one asset modification (e.g., 10 minutes), expiration of another period of time specified by the rules (e.g., persistence until the end of the day), detection of the creator XR device no longer accessing the instance of the XR world, detection that another XR device that made the asset modification is no longer accessing the instance of the XR world, detection that the asset is no longer being used and/or viewed on any of the XR devices in the instance of the XR world, cessation of speaking about the at least one asset by at least one user associated with respective XR devices of the one or more XR devices (such as in conversation), etc.
[0099] At block 510, process 500 can, in response to detecting occurrence of the triggering event, reverse the at least one asset modification to the at least one asset of the instance of the XR world. In some implementations, process 500 previously performed the asset modification in response to a command, and the reversed asset modification reverses the alteration to the instance of the virtual world caused by the command. For example, if the command is to add an asset, process 500 can remove the asset in response to the triggering event. In another example, if the command is to remove an asset, process 500 can add the asset back in response to the triggering event. In still another example, if the command is to modify an asset, process 500 can revert the asset to its state prior to modification in response to the triggering event. In some implementations, when the at least one asset modified by the at least one asset modification comprises a virtual object, the at least one asset modification can comprise a modification to a display of the virtual object in the instance of the XR world, and reversing the at least one asset modification can comprise reversing the modification to the display of the virtual object in the instance of the XR world.
[00100] In some implementations in which the triggering event is a cessation of speaking about the at least one asset by at least one user associated with respective XR devices of the one or more XR devices, process 500 can reverse the asset modification to the at least one asset based on the cessation. For example, one or more users associated with XR devices accessing the instance of the XR world can speak about and/or have a conversation about baseball. While the one or more users are talking about baseball (interpreted by process 500 to be a build command), process 500 can add a virtual batting cage to the instance of the XR world. When the one or more users stop talking about baseball (e.g., change the subject), process 500 can remove the virtual batting cage from the instance of the XR world (i.e., silence regarding baseball can be interpreted by process 500 as a command to remove the virtual batting cage from the instance of the XR world).
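A rough Python sketch of how conversational activity about a topic could be tracked for this kind of trigger is shown below; the keyword-window approach, the transcript record shape, and the time window are illustrative assumptions, not the disclosed natural language processing.

```python
def active_topics(transcripts, topic_keywords, window_seconds=60.0, now=0.0):
    """Treat a topic as 'active' if any of its keywords was spoken within the
    recent window; when a topic drops out of this set, the asset added for it
    (e.g., the virtual batting cage) becomes eligible for removal."""
    recent_text = " ".join(t["text"].lower() for t in transcripts
                           if now - t["timestamp"] <= window_seconds)
    return {topic for topic, words in topic_keywords.items()
            if any(word in recent_text for word in words)}
```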
[00101] In some implementations, process 500 can further detect that the one or more XR devices are no longer accessing the XR world. For example, process 500 can detect that the users associated with the one or more XR devices have exited the instance of the XR world, traveled to a different instance of the XR world, traveled to a different XR world, exited an application associated with the XR world, deactivated or removed the XR device, etc. In some implementations, process 500 can, in response to detecting that the one or more XR devices are no longer accessing the XR world, revert the instance of the XR world to the original template. For example, process 500 can reverse all additions, deletions, and/or modifications made by XR devices accessing the instance of the XR world, such that the instance of the XR world appears in its original form as it was when the instance was created. In some implementations, process 500 can revert the instance of the XR world to the original template based on detection that the creator XR device is no longer accessing the instance of the XR world, that one or more other XR devices are no longer accessing the instance of the XR world, and/or based on detection that no XR devices are accessing the instance of the XR world.
[00102] In some implementations, the commands received by process 500 can be generated by an artificial intelligence (Al) expression engine. For example, input from XR devices and their associated users can be input to an artificial intelligence expression engine, which can translate the input into one or more commands for modifying the instance of the XR world. In another example, portions of process 500 can incorporate functionality from the artificial intelligence expression engine.
[00103] Figure 6A is a flow diagram illustrating a process 600A used in some implementations of the present technology for processing input 602 from a user of an XR device, by an artificial intelligence (Al) expression engine, to understand what a user wants to accomplish in an XR world. The XR world can be dynamic, having an infinite number of possibilities, with assets that can constantly change. The Al expression engine can enable users, via respective XR devices, to modify the content of the XR world to enable such possibilities, instead of having a predefined set of scripted, fixed interactions for users and assets within the XR world.
[00104] Input 602 can include speech audio 604, which can be the raw audio captured by respective XR devices of each user in an instance of the XR world. Speech audio 604 can be fed into automatic speech recognition 610, which can transcribe speech audio 604 into text for each user in the XR world generating speech audio 604. In some implementations, however, it is contemplated that speech audio 604 may not be generated by some users, such as when users have their microphone muted or turned off on their respective XR devices. Input 602 can further include people state 606, which can be where users are in the XR world (e.g., position), users’ gazes, users’ gestures (e.g., pointing directions), users’ identities, time and/or season for users, preferences, histories thereof, and/or the like. Input 602 can further include world state 608, which can be the assets that already exist in the XR world and their histories, including name, type, positions, environment, audio, theme and style of the XR world, etc.
[00105] People state 606, world state 608, and text from automatic speech recognition 610 can be fed into continuous input processing 612. Continuous input processing 612 can identify when the Al expression engine is being spoken to with text from automatic speech recognition 610 (e.g., by applying continuous natural language processing techniques) and context provided by people state 606 and world state 608, and produce a structured result. In some implementations, continuous input processing 612 can identify when the Al expression engine is being spoken to without “wake” words (e.g., “hey, Expression Engine, ...”). The structured result can be provided to multimodal input 614, which can take the structured result and combine it with context provided by people state 606 and world state 608. Multimodal input 614 can enable the use of gaze and gestures as input 602, and resolve named assets in the XR world. Multimodal input 614 can be fed into decision engine 616 of the Al expression engine.
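The data flow of Figure 6A described above can be sketched in Python as a simple pipeline; the three stage callables (asr, input_processor, decision_engine) and the multimodal dictionary fields are assumed interfaces used only to show how the stages compose, not disclosed components.

```python
def expression_pipeline(speech_audio, people_state, world_state,
                        asr, input_processor, decision_engine):
    """Thread the inputs of Figure 6A through the stages described above:
    speech -> text, continuous input processing -> structured result,
    structured result + context -> multimodal input -> decision engine."""
    text = asr(speech_audio) if speech_audio is not None else ""
    structured = input_processor(text, people_state, world_state)
    if structured is None:           # the engine was not being addressed
        return None
    multimodal = {                   # combine the structured result with context
        "intent": structured,
        "gaze": people_state.get("gaze"),
        "gesture": people_state.get("gesture"),
        "known_assets": world_state.get("assets", {}),
    }
    return decision_engine(multimodal)
```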
[00106] Figure 6B is a flow diagram illustrating a process 600B used in some implementations of the present technology for providing asset options 618 based on input 602 from a user received via an XR device, by an Al expression engine. As described further herein, asset options 618 can include, for example, skyboxes, virtual objects, sounds, virtual spaces, etc. In some implementations, asset options 618 can include any asset that users can imagine, and all aspects of the XR world can be configurable, instead of being limited by a finite asset library 622.
[00107] Assets 624 can, given a set of parameters from decision engine 616, return one or more assets as an array. In some implementations, assets 624 cannot create or modify assets in the instance of the XR world, but rather can provide the instructions for the assets 624. In some implementations, asset library 622 can be a user-generated repository of assets (e.g., skyboxes, virtual objects, sounds, virtual spaces, etc.). Digitize 620 can create an asset based on a real-world, physical object. In some implementations, digitize 620 can create the asset asynchronously, and store the asset in asset library 622. Family of apps 626 can pull content from applications outside of the XR world, such as two-dimensional photos and videos from social media applications, etc.
[00108] Generate from text 628 can generate assets using user-provided text and other parameters (e.g., style, size, color, culture, geographic locations, etc.). Generate from asset 630 can generate assets derived from existing assets by modifying, tweaking, and/or remixing the existing assets. For example, a user can make parameter changes such as size, color, positions, rotation, etc., which may or may not be uniformly applied across the entire asset (e.g., change the color of the leaves on a virtual tree). In another example, a user can make targeted changes to just a portion or one aspect of the asset or a subset of the asset (e.g., add a new limb to a tree). In still another example, a user can make changes to a sound or music. In some implementations, generate from asset 630 can turn a two-dimensional (2D) photograph into a 2D virtual object, a three-dimensional (3D) virtual object, a 3D photograph, a virtual space, and/or a sound. In some implementations, generate from asset 630 can generate a skybox from a 360-degree photograph and/or video.
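As a rough illustration of how asset options could be assembled from the sources described above, the following Python function returns candidates as an array, preferring library matches and falling back to generative sources; the parameter shape and the two generator callables are hypothetical stand-ins, not disclosed interfaces.

```python
def fetch_assets(params, asset_library, generate_from_text, generate_from_asset):
    """Return candidate assets as an array: first matches from the user-generated
    asset library, then generative fallbacks (text-to-asset, asset remixing)."""
    results = [a for a in asset_library if a.get("type") == params.get("type")]
    if not results and "prompt" in params:
        results.append(generate_from_text(params["prompt"], params))
    if "base_asset" in params:
        results.append(generate_from_asset(params["base_asset"], params))
    return results
```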
[00109] Figure 6C is a flow diagram illustrating a process 600C used in some implementations of the present technology for providing a response 632 by supplementing an instance of an XR world with assets, by an Al expression engine. Decision engine 616 can take structured input from multimodal input 614 and, in some implementations, assets 624, and can decide one or more actions to be performed as response 632.
[00110] In some implementations, decision engine 616 can provide world placement 634, which can be where to put an asset in the XR world. World placement 634 can use an appropriate size for the asset and place it in the desired location in the XR world, including the x-, y-, and z-axes. In some implementations, decision engine 616 can further provide trainable actions 636, which can be behaviors that an asset can perform in the XR world. By providing a dynamic, continuously editable XR world, trainable actions 636 can be infinite, instead of a predefined, fixed set of actions.

[00111] In some implementations, decision engine 616 can provide animations and pathfinding 638, which can be movement of an asset, e.g., as locomotion, and/or as the asset navigates the XR world. For example, animations and pathfinding 638 can provide Al-based physics motions, such as by enabling a human character to have legs and to walk around the XR world, even on uneven surfaces, and to navigate virtual terrain. In some implementations, decision engine 616 can provide natural language generation 640, which can be structured, deterministic text responses in the XR world. In some implementations, decision engine 616 can provide conversational model 642, which can generate open-ended text responses, such as through chat bots and/or translation. Natural language generation 640 and conversational model 642 can be fed into text-to-speech 644, which can convert normal language text from natural language generation 640 and conversational model 642 into audio 650 as a response 632.
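A minimal Python sketch in the spirit of the world placement behavior described above follows: it clamps a requested drop point to the world bounds and records a position and scale on the asset. The data shapes and default scale are illustrative assumptions only.

```python
def place_asset(asset, requested_position, world_bounds, default_scale=1.0):
    """Clamp a requested drop point to the world bounds and record a placement
    (x, y, z plus scale) on the asset."""
    lower, upper = world_bounds      # ((min_x, min_y, min_z), (max_x, max_y, max_z))
    position = tuple(min(max(coord, lo), hi)
                     for coord, lo, hi in zip(requested_position, lower, upper))
    asset["placement"] = {"position": position,
                          "scale": asset.get("scale", default_scale)}
    return asset
```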
[00112] Responses 632 can include entity mutations 646, embodied movement 648, and/or audio 650. Entity mutations 646 can create, modify, and/or delete assets in the XR world. When combined with multimodal input 614 and assets 624, entity mutations 646 can provide for an instantaneously morphable XR world. Embodied movement 648 can bring life to the XR world by enabling assets to move and/or perform one or more tasks in the XR world. When combined with trainable actions 636 and animations and pathfinding 638, embodied movement 648 can realize an infinite number of various motions and actions. Audio 650 can be played in the XR world, and can include music, sound effects, speech, and/or synthesized speech. In some implementations, audio 650 can only be played in a particular location in the XR world.
[00113] Figure 7A is a conceptual diagram illustrating an example view 700A on an XR device of an instance 702 of an XR world formed based on an original template. The original template can be the version of the XR world displayed in view 700A upon creation of instance 702 of the XR world. The original template (which, in some implementations, can be common across all instances of the XR world) can be a virtual clubhouse having virtual pool table 710 and virtual couch 712. The locations for virtual pool table 710 and virtual couch 712 within the virtual clubhouse can be part of the original template in some implementations. Other object parameters, such as colors, sizes, orientation, and the like, can also be stored as part of the original template. Avatars 704-708 associated with users on respective XR devices can be visiting instance 702 of the XR world and can make one or more changes to the instance of the XR world that can be viewed on one, some, or all of the XR devices associated with avatars 704-708, based on one or more rules.
[00114] Figure 7B is a conceptual diagram illustrating an example view 700B on an XR device of a modification to an instance 702 of an XR world by a user. In instance 702, the user associated with avatar 704 can provide commands 716-718 to place a virtual object; in this case, command 716 is made by voice and command 718 is made by gesture. Command 716 can be, “put a soccer ball over there!” Command 718 can indicate where the user associated with avatar 704 wishes to place the virtual object, e.g., by gesturing in front of virtual pool table 710. Thus, virtual soccer ball 720 can be placed in instance 702 of the XR world in the desired location. In some implementations, one or more of the XR devices associated with avatars 704-708 (e.g., a creator XR device associated with the user requesting creation of instance 702 of the XR world, the XR device associated with avatar 704 requesting the change to instance 702 of the XR world, etc.) can specify one or more rules with respect to virtual soccer ball 720, such as where virtual soccer ball 720 can be placed in instance 702, how it can be placed in instance 702, how long it can be placed there, who can place it there, who can see it, etc. In some implementations, one or more of the XR devices associated with avatars 704-708 can further specify one or more triggering events that can remove virtual soccer ball 720 from instance 702 of the virtual world.
[00115] Figure 7C is a conceptual diagram illustrating an example view 700C on an XR device of a reversal of a modification to an instance 702 of an XR world based on a user leaving the instance 702 of the XR world. In some implementations, one or more of the XR devices associated with avatars 704-708 (e.g., a creator XR device associated with the user requesting creation of instance 702 of the XR world, the XR device associated with avatar 704 requesting the change to instance 702 of the XR world, etc.), and/or a platform or developer computing system, can specify one or more triggering events with respect to virtual soccer ball 720. The one or more triggering events can include, for example, expiration of a specified period of time, a particular user leaving the instance of the XR world, a creator of the instance of the XR world leaving the XR world, a condition not being fulfilled (e.g., an asset must remain in conversation to stay in instance 702 of the XR world), etc. In the example of Figure 7C, the triggering event can be avatar 704 (associated with the user adding virtual soccer ball 720) leaving instance 702 of the XR world. Thus, virtual soccer ball 720 (as well as avatar 704) are removed from view 700C of instance 702 of the XR world.

[00116] In some implementations, the users associated with avatars 704-708 can issue one or more commands to modify elements of the original template of instance 702 of the XR world, such as pool table 710 and/or virtual couch 712. For example, a user associated with avatar 704 (e.g., creator user) can issue one or more commands (e.g., via the creator user’s XR device) to: change the size, orientation, or placement of virtual pool table 710; replace virtual couch 712 with a different virtual object; delete virtual pool table 710 and/or virtual couch 712; or implement any other suitable modification. In some implementations, the creator user (or any other suitable user) can also define one or more triggering events to reverse the modification to the element of the original template. Upon detection of this defined triggering event, the modification to the element of the original template can be reversed, returning the element of the original template to its original state.
[00117] Reference in this specification to "implementations" (e.g., "some implementations," "various implementations," “one implementation,” “an implementation,” etc.) means that a particular feature, structure, or characteristic described in connection with the implementation is included in at least one implementation of the disclosure. The appearances of these phrases in various places in the specification are not necessarily all referring to the same implementation, nor are separate or alternative implementations mutually exclusive of other implementations. Moreover, various features are described which may be exhibited by some implementations and not by others. Similarly, various requirements are described which may be requirements for some implementations but not for other implementations.
[00118] As used herein, being above a threshold means that a value for an item under comparison is above a specified other value, that an item under comparison is among a certain specified number of items with the largest value, or that an item under comparison has a value within a specified top percentage value. As used herein, being below a threshold means that a value for an item under comparison is below a specified other value, that an item under comparison is among a certain specified number of items with the smallest value, or that an item under comparison has a value within a specified bottom percentage value. As used herein, being within a threshold means that a value for an item under comparison is between two specified other values, that an item under comparison is among a middle-specified number of items, or that an item under comparison has a value within a middle-specified percentage range. Relative terms, such as high or unimportant, when not otherwise defined, can be understood as assigning a value and determining how that value compares to an established threshold. For example, the phrase "selecting a fast connection" can be understood to mean selecting a connection that has a value assigned corresponding to its connection speed that is above a threshold.
[00119] As used herein, the word "or" refers to any possible permutation of a set of items. For example, the phrase "A, B, or C" refers to at least one of A, B, C, or any combination thereof, such as any of: A; B; C; A and B; A and C; B and C; A, B, and C; or multiple of any item such as A and A; B, B, and C; A, A, B, C, and C; etc.
[00120] Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Specific embodiments and implementations have been described herein for purposes of illustration, but various modifications can be made without deviating from the scope of the embodiments and implementations. The specific features and acts described above are disclosed as example forms of implementing the claims that follow. Accordingly, the embodiments and implementations are not limited except as by the appended claims.
[00121] Any patents, patent applications, and other references noted above are incorporated herein by reference. Aspects can be modified, if necessary, to employ the systems, functions, and concepts of the various references described above to provide yet further implementations. If statements or subject matter in a document incorporated by reference conflicts with statements or subject matter of this application, then this application shall control.

Claims

CLAIMS
I/We claim:
1. A method for creating a dynamic, continuously editable artificial reality world, the method comprising:
creating an instance, of multiple instances, of the artificial reality world, wherein the instance is accessed by one or more artificial reality devices that render an original template of the artificial reality world, and wherein, at creation, the multiple instances of the artificial reality world share the original template;
receiving one or more commands to A) add, B) edit, C) remove, or D) any combination thereof, one or more assets in the instance of the artificial reality world;
performing, in accordance with the one or more commands, one or more asset modifications to one or more assets of the instance of the artificial reality world;
detecting occurrence of a triggering event with respect to at least one of the one or more asset modifications to at least one asset of the instance of the artificial reality world;
in response to detecting occurrence of the triggering event, reversing the at least one asset modification to the at least one asset of the instance of the artificial reality world;
detecting that the one or more artificial reality devices are no longer accessing the instance of the artificial reality world; and
in response to detecting that the one or more artificial reality devices are no longer accessing the instance of the artificial reality world, reverting the instance of the artificial reality world to the original template.
2. The method of claim 1, wherein the instance of the artificial reality world is created based on a request of a creator artificial reality device of the one or more artificial reality devices, and wherein the creator artificial reality device specifies the triggering event.
3. The method of claim 2, wherein the triggering event includes one or more of: expiration of a specified period of time after performing the at least one asset modification to the at least one asset of the instance of the artificial reality world, detection of the creator artificial reality device no longer accessing the instance of the artificial reality world, or a combination thereof.
4. The method of any preceding claim, wherein an artificial reality device, of the one or more artificial reality devices, transmits a command of the one or more commands, wherein an asset modification, of the one or more asset modifications, is performed to the instance of the artificial reality world, corresponding to an asset of the one or more assets, based on the command, and wherein the artificial reality device, of the one or more artificial reality devices, specifies the triggering event with respect to the asset of the one or more assets; in which case optionally wherein the triggering event includes one or more of a specified period of time after performing the modification of the one or more modifications to the instance of the artificial reality world, detection of the artificial reality device no longer accessing the instance of the artificial reality world, or a combination thereof.
5. The method of any preceding claim, wherein the instance of the artificial reality world is created based on a request of a creator artificial reality device of the one or more artificial reality devices, and wherein the creator artificial reality device specifies one or more rules for performing the one or more asset modifications to the one or more assets of the instance of the artificial reality world.
6. The method of any preceding claim, wherein the one or more commands includes a command to edit an asset of the one or more assets in the instance of the artificial reality world, and wherein an asset modification, of the one or more asset modifications, corresponding to the asset, of the one or more assets, is performed by making the asset dynamic in the instance of the artificial reality world.
7. The method of any preceding claim, wherein the one or more assets include a skybox, a virtual object, a sound, a virtual space, or any combination thereof.
8. The method of any preceding claim, wherein the one or more commands are generated by at least one of voice, gaze, a gesture, or any combination thereof.
9. The method of any preceding claim, wherein the triggering event is a cessation of speaking about the at least one asset, of the one or more assets, by at least one user associated with respective artificial reality devices of the one or more artificial reality devices.
10. The method of any preceding claim, wherein the at least one asset modified by the at least one asset modification comprises a virtual object, the at least one asset modification comprises a modification to a display of the virtual object in the instance of the artificial reality world, and the reversing the at least one asset modification comprises reversing the modification to the display of the virtual object in the instance of the artificial reality world.
11. The method of any preceding claim, wherein the at least one asset modified by the at least one asset modification is part of the original template of the artificial reality world.
12. A computer-readable storage medium storing instructions that, when executed by a computing system, cause the computing system to perform a process for creating a dynamic, continuously editable artificial reality world, the process comprising:
creating an instance of the artificial reality world, wherein the instance is accessed by one or more artificial reality devices that render an original template of the artificial reality world;
receiving one or more commands to A) add, B) edit, C) remove, or D) any combination thereof, one or more assets in the instance of the artificial reality world;
performing, in accordance with the one or more commands, one or more asset modifications to one or more assets of the instance of the artificial reality world;
detecting occurrence of a triggering event with respect to at least one of the one or more asset modifications to at least one asset of the instance of the artificial reality world; and
in response to detecting occurrence of the triggering event, reversing the at least one asset modification to the at least one asset of the instance of the artificial reality world.
13. The computer-readable storage medium of claim 12, and any one or more of:
a) further comprising: detecting that the one or more artificial reality devices are no longer accessing the artificial reality world; and in response to detecting that the one or more artificial reality devices are no longer accessing the artificial reality world, reverting the instance of the artificial reality world to the original template; or
b) wherein the instance is of multiple instances of the artificial reality world, and wherein, at creation, the multiple instances of the artificial reality world share the original template; or
c) wherein the at least one asset modified by the at least one asset modification comprises a virtual object, the at least one asset modification comprises a modification to a display of the virtual object in the instance of the artificial reality world, and the reversing the at least one asset modification comprises reversing the modification to the display of the virtual object in the instance of the artificial reality world.
14. A computing system for creating a dynamic, continuously editable artificial reality world, the computing system comprising:
one or more processors; and
one or more memories storing instructions that, when executed by the one or more processors, cause the computing system to perform a process comprising:
creating an instance of the artificial reality world, wherein the instance is accessed by one or more artificial reality devices that render an original template of the artificial reality world;
receiving one or more commands to A) add, B) edit, C) remove, or D) any combination thereof, one or more assets in the instance of the artificial reality world;
performing, in accordance with the one or more commands, one or more asset modifications to one or more assets of the instance of the artificial reality world;
detecting occurrence of a triggering event with respect to at least one of the one or more asset modifications to at least one asset of the instance of the artificial reality world; and
in response to detecting occurrence of the triggering event, reversing the at least one asset modification to the at least one asset of the instance of the artificial reality world.
15. The computing system of claim 14, and any one or more of:
a) further comprising: detecting that the one or more artificial reality devices are no longer accessing the artificial reality world; and in response to detecting that the one or more artificial reality devices are no longer accessing the artificial reality world, reverting the instance of the artificial reality world to the original template; or
b) wherein the instance is of multiple instances of the artificial reality world, and wherein, at creation, the multiple instances of the artificial reality world share the original template; or
c) wherein the at least one asset modified by the at least one asset modification is part of the original template of the artificial reality world.
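The overall method recited in independent claims 1, 12, and 14 (creating an instance from a shared original template, applying command-driven asset modifications to that instance only, and reverting the instance to the original template once no artificial reality device is accessing it) can be illustrated by the brief Python sketch below. All names (WorldServer, create_instance, handle_command, device_left) are hypothetical and chosen for readability; this is a non-authoritative sketch, not the claimed implementation.

import copy
from typing import Dict, Set

class WorldServer:
    """Hypothetical manager for multiple instances that share one original template."""

    def __init__(self, original_template: Dict[str, dict]):
        self._original_template = original_template
        self._instances: Dict[str, Dict[str, dict]] = {}
        self._connected: Dict[str, Set[str]] = {}

    def create_instance(self, instance_id: str, creator_device: str) -> None:
        # Each new instance starts as a deep copy of the shared original template,
        # so edits made in one instance never affect the others.
        self._instances[instance_id] = copy.deepcopy(self._original_template)
        self._connected[instance_id] = {creator_device}

    def join(self, instance_id: str, device_id: str) -> None:
        self._connected[instance_id].add(device_id)

    def handle_command(self, instance_id: str, command: dict) -> None:
        # Commands add, edit, or remove assets in a single instance.
        assets = self._instances[instance_id]
        if command["op"] == "add":
            assets[command["asset_id"]] = command["asset"]
        elif command["op"] == "edit":
            assets[command["asset_id"]].update(command["changes"])
        elif command["op"] == "remove":
            assets.pop(command["asset_id"], None)

    def device_left(self, instance_id: str, device_id: str) -> None:
        self._connected[instance_id].discard(device_id)
        if not self._connected[instance_id]:
            # No device is accessing the instance any longer:
            # revert the instance to the original template.
            self._instances[instance_id] = copy.deepcopy(self._original_template)

# Usage: an instance gains a soccer ball, then reverts once every device has left.
server = WorldServer({"pool_table_710": {"kind": "virtual_object"}})
server.create_instance("instance_702", "creator_device")
server.handle_command("instance_702", {"op": "add",
                                       "asset_id": "soccer_ball_720",
                                       "asset": {"kind": "virtual_object"}})
server.device_left("instance_702", "creator_device")
assert server._instances["instance_702"].keys() == {"pool_table_710"}

Deep-copying the template on creation and again on reversion keeps every instance isolated, so edits made in one instance never leak into the shared template or into other instances.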
PCT/US2023/084534 2022-12-21 2023-12-18 Artificial intelligence expression engine WO2024137458A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US202218069668A 2022-12-21 2022-12-21
US18/069,668 2022-12-21

Publications (1)

Publication Number Publication Date
WO2024137458A1 true WO2024137458A1 (en) 2024-06-27

Family

ID=89768381

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2023/084534 WO2024137458A1 (en) 2022-12-21 2023-12-18 Artificial intelligence expression engine

Country Status (1)

Country Link
WO (1) WO2024137458A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090276288A1 (en) * 2002-09-09 2009-11-05 Michal Hlavac Artificial intelligence platform
US20160256783A1 (en) * 2012-03-06 2016-09-08 Roblox Corporation Personalized Server-Based System for Building Virtual Environments
US20200364941A1 (en) * 2019-05-15 2020-11-19 Microsoft Technology Licensing, Llc Content indicators in a 3d environment authoring application

Similar Documents

Publication Publication Date Title
JP7246352B2 (en) Systems and methods for augmented and virtual reality
KR102387314B1 (en) System and method for augmented and virtual reality
US20230419618A1 (en) Virtual Personal Interface for Control and Travel Between Virtual Worlds
CN117590935A (en) Viewing angle sharing in an artificial reality environment between a two-dimensional interface and an artificial reality interface
WO2024137458A1 (en) Artificial intelligence expression engine
US20230260208A1 (en) Artificial Intelligence-Assisted Virtual Object Builder
US11755180B1 (en) Browser enabled switching between virtual worlds in artificial reality
US20240212265A1 (en) Generative VR World Creation from Natural Language
US20240070957A1 (en) VR Venue Separate Spaces
US20230260239A1 (en) Turning a Two-Dimensional Image into a Skybox
TW202341082A (en) Artificial intelligence-assisted virtual object builder
US20240192973A1 (en) Artificial Reality Simulation for Augment Manipulation Controls on a Two-Dimensional Interface
WO2023249918A1 (en) Virtual personal interface for control and travel between virtual worlds
WO2024137232A1 (en) Artificial reality scene composer
WO2024138031A1 (en) Generative vr world creation from natural language
WO2023249914A1 (en) Browser enabled switching between virtual worlds in artificial reality
WO2024145065A1 (en) Personalized three-dimensional (3d) metaverse map