US20210248826A1 - Surface distinction for mobile rendered augmented reality - Google Patents
Surface distinction for mobile rendered augmented reality
- Publication number
- US20210248826A1 (application US 17/170,431)
- Authority
- US
- United States
- Prior art keywords
- coordinate
- cluster
- feature points
- mesh
- mobile client
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/011—Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
-
- A—HUMAN NECESSITIES
- A63—SPORTS; GAMES; AMUSEMENTS
- A63F—CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
- A63F13/00—Video games, i.e. games using an electronically generated display having two or more dimensions
- A63F13/50—Controlling the output signals based on the game progress
- A63F13/53—Controlling the output signals based on the game progress involving additional visual information provided to the game scene, e.g. by overlay to simulate a head-up display [HUD] or displaying a laser sight in a shooting game
- A63F13/537—Controlling the output signals based on the game progress involving additional visual information provided to the game scene, e.g. by overlay to simulate a head-up display [HUD] or displaying a laser sight in a shooting game using indicators, e.g. showing the condition of a game character on screen
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T15/00—3D [Three Dimensional] image rendering
- G06T15/10—Geometric effects
- G06T15/20—Perspective computation
- G06T15/205—Image-based rendering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T19/00—Manipulating 3D models or images for computer graphics
- G06T19/006—Mixed reality
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/50—Depth or shape recovery
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/70—Determining position or orientation of objects or cameras
- G06T7/73—Determining position or orientation of objects or cameras using feature-based methods
- G06T7/75—Determining position or orientation of objects or cameras using feature-based methods involving models
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/74—Image or video pattern matching; Proximity measures in feature spaces
- G06V10/761—Proximity, similarity or dissimilarity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/20—Scenes; Scene-specific elements in augmented reality scenes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/60—Type of objects
- G06V20/64—Three-dimensional objects
- G06V20/647—Three-dimensional objects by matching two-dimensional images to three-dimensional objects
-
- A—HUMAN NECESSITIES
- A63—SPORTS; GAMES; AMUSEMENTS
- A63F—CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
- A63F13/00—Video games, i.e. games using an electronically generated display having two or more dimensions
- A63F13/20—Input arrangements for video game devices
- A63F13/21—Input arrangements for video game devices characterised by their sensors, purposes or types
- A63F13/213—Input arrangements for video game devices characterised by their sensors, purposes or types comprising photodetecting means, e.g. cameras, photodiodes or infrared cells
-
- A—HUMAN NECESSITIES
- A63—SPORTS; GAMES; AMUSEMENTS
- A63F—CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
- A63F13/00—Video games, i.e. games using an electronically generated display having two or more dimensions
- A63F13/25—Output arrangements for video game devices
-
- A—HUMAN NECESSITIES
- A63—SPORTS; GAMES; AMUSEMENTS
- A63F—CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
- A63F13/00—Video games, i.e. games using an electronically generated display having two or more dimensions
- A63F13/60—Generating or modifying game content before or while executing the game program, e.g. authoring tools specially adapted for game development or game-integrated level editor
- A63F13/65—Generating or modifying game content before or while executing the game program, e.g. authoring tools specially adapted for game development or game-integrated level editor automatically by game devices or servers from real world data, e.g. measurement in live racing competition
- A63F13/655—Generating or modifying game content before or while executing the game program, e.g. authoring tools specially adapted for game development or game-integrated level editor automatically by game devices or servers from real world data, e.g. measurement in live racing competition by importing photos, e.g. of the player
-
- A—HUMAN NECESSITIES
- A63—SPORTS; GAMES; AMUSEMENTS
- A63F—CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
- A63F2300/00—Features of games using an electronically generated display having two or more dimensions, e.g. on a television screen, showing representations related to the game
- A63F2300/80—Features of games using an electronically generated display having two or more dimensions, e.g. on a television screen, showing representations related to the game specially adapted for executing a specific type of game
- A63F2300/8082—Virtual reality
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/017—Gesture based interaction, e.g. based on a set of recognized hand gestures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2200/00—Indexing scheme for image data processing or generation, in general
- G06T2200/24—Indexing scheme for image data processing or generation, in general involving graphical user interfaces [GUIs]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2210/00—Indexing scheme for image generation or computer graphics
- G06T2210/21—Collision detection, intersection
Definitions
- the disclosure generally relates to the field of mobile rendered augmented reality and more specifically to surface distinction in mobile rendered augmented reality environments.
- Conventional augmented reality (AR) systems do not enable a user to control an AR object to interact between two surfaces in mobile computer environments.
- An AR object was limited to a fixed position, or the range of interactions between the AR object and the environment was confined to a single surface.
- conventional AR systems for mobile clients are unable to identify surfaces within an environment and determine spatial relationships between the identified surfaces (e.g., the height or distance between points on respective surfaces).
- conventional solutions often require immense computing resources, power consumption, and memory capacity that are lacking in mobile computer environments. Accordingly, there is a need for a practical, mobile surface distinction solution that allows AR users to have an immersive experience where AR objects can naturally interact with identified surfaces.
- FIG. 1 illustrates an augmented reality (AR) game system environment, in accordance with at least one embodiment.
- FIG. 2 is a block diagram of the surface distinction application of FIG. 1 , in accordance with at least one embodiment.
- FIG. 3 is a flowchart illustrating a process for providing an AR object for display based on surface distinction, in accordance with at least one embodiment.
- FIGS. 4A and 4B are flowcharts illustrating a process for providing an AR object for display based on the process of FIG. 3 , in accordance with at least one embodiment.
- FIGS. 5A, 5B, 5C, 5D, and 5E illustrate a process for controlling an AR object based on user interactions, at a mobile client, with planar surfaces detected by an AR system, in accordance with at least one embodiment.
- FIG. 6 illustrates a block diagram including components of a machine able to read instructions from a machine-readable medium and execute them in a processor (or controller), in accordance with at least one embodiment.
- surfaces within a user's environment are identified to enable an augmented reality (AR) object to be provided for display on a mobile client based on the identified surfaces.
- Conventional AR systems for mobile clients are unable to identify surfaces within an environment and determine spatial relationships between the identified surfaces (e.g., the height or distance between points on respective surfaces).
- the systems and methods described herein may selectively limit the data stored to conserve memory resources.
- the surfaces may be identified using generated meshes rather than individual feature points, which optimizes processing resources (e.g., by analyzing one mesh as opposed to multiple feature points that make up the mesh).
- Disclosed is a configuration that performs these functions, which may be referred to herein as "surface distinction," and enables AR objects to interact with the identified surfaces in a mobile rendered AR system while optimizing for the memory, power, and processing constraints (e.g., of the mobile client).
- a camera coupled with the mobile client captures a camera view of an environment.
- the environment may correspond to the physical world, which includes surfaces within a field of view of the camera (i.e., the “camera view”).
- a processor (e.g., of the mobile device) identifies feature points within the camera view of the environment.
- the processor generates a three-dimensional (3D) virtual coordinate space using the feature points.
- the processor identifies two or more clusters from the feature points, each cluster corresponding to a different surface.
- the processor generates meshes from the clusters, where each mesh is defined by coordinates in the 3D virtual coordinate space (e.g., the shape of the mesh may be outlined by the coordinates). Using these coordinates, the processor may determine a height difference between two surfaces (e.g., a difference between two Z-coordinates of respective meshes).
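- As a minimal illustration of this height computation (using assumed coordinate values, not values from the disclosure), each mesh can be represented as a set of (X, Y, Z) coordinates in the 3D virtual coordinate space, and the surface height difference taken as a difference of Z-coordinates:

```python
# Assumed example meshes: three vertices each, in the 3D virtual coordinate space.
floor_mesh = [(0.0, 0.0, 0.00), (1.0, 0.0, 0.00), (0.0, 1.0, 0.00)]
table_mesh = [(0.4, 0.4, 0.72), (0.9, 0.4, 0.72), (0.4, 0.9, 0.72)]

# Height difference between the two surfaces: difference of representative Z-coordinates.
floor_z = max(z for _, _, z in floor_mesh)
table_z = max(z for _, _, z in table_mesh)
print(table_z - floor_z)  # 0.72 -> the table surface sits 0.72 units above the floor
```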
- the processor receives a user interaction between the two surfaces. For example, the user makes a swipe on the display starting from one surface and ending at the other surface.
- the processor provides an AR object for display at the mobile client, where the AR object is configured to interact with the two surfaces based on the user interaction. For example, the processor provides an AR avatar for display that runs from one surface to the other surface.
- Surface distinction allows the user to control AR objects within an AR application as though the objects are interacting with reality around the user, presenting an immersive AR experience for the user.
- the methods described herein allow surface distinction on a mobile client executed AR application without consuming too much processing and/or battery power.
- FIG. 1 illustrates an AR system environment, in accordance with at least one embodiment.
- the AR system environment may be an AR game system environment that enables a user to play AR games on a mobile client 100 , and in some embodiments, presents immersive gaming experiences to the user via surface distinction.
- the system environment includes a mobile client 100 , an AR system 110 , an AR engine 120 , a surface distinction application 130 , a database 140 , and a network 150 .
- the AR system 110 in some example embodiments, may include the mobile client 100 , the AR engine 120 , the surface distinction application 130 , and the database 140 .
- the AR system 110 may include the AR engine 120 , the surface distinction application 130 , and the database 140 , but not the mobile client 100 , such that the AR system 110 communicatively couples (e.g., wireless communication) to the mobile client 100 from a remote server.
- the mobile client 100 is a mobile device that is or incorporates a computer.
- the mobile client may be, for example, a relatively small computing device in which network processing (e.g., processor and/or controller) and power resources (e.g., battery) may be limited and have a formfactor size such as a smartphone, tablet, wearable device (e.g., smartwatch) and/or a portable internet enabled device.
- the limitations of such devices stem from scientific principles that must be adhered to in designing such products for portability and use away from constant power sources.
- the mobile client 100 has general and/or special purpose processors, memory, storage, networking components (either wired or wireless).
- the mobile client 100 can communicate over one or more communication connections (e.g., a wired connection such as ethernet or a wireless communication via cellular signal (e.g., LTE, 5G), WiFi, satellite) and includes a global positioning system (GPS) used to determine a location of the mobile client 100 .
- the mobile client 100 also includes a screen 103 (e.g., a display) and a display driver to provide for display interfaces on the display associated with the mobile client 100 .
- the mobile client 100 executes an operating system, such as GOOGLE ANDROID OS and/or APPLE iOS, and includes a display and/or a user interface that the user can interact with.
- the mobile client 100 also includes one or more cameras (e.g., the camera 102 ) that can capture forward and rear facing images and/or videos.
- the one or more cameras 102 may be configured to capture depths of objects within an image.
- the one or more cameras 102 may be a dual camera, Light Detection and Ranging (LiDAR) camera, ultrasonic imaging camera, or any suitable camera capable of determining distance between an object and the camera.
- the mobile client 100 couples to the AR system 110 , which enables it to execute an AR application (e.g., the AR client 101 ).
- the AR engine 120 interacts with the mobile client 100 to execute the AR client 101 (e.g., an AR game).
- the AR engine 120 may be a game engine such as UNITY and/or UNREAL ENGINE.
- the AR engine 120 displays, and the user interacts with, the AR game via the mobile client 100 .
- While the AR application refers to an AR gaming application in many instances described herein, the AR application may be a retail application integrating AR for modeling purchasable products, an educational application integrating AR for demonstrating concepts within a learning curriculum, or any suitable interactive application in which AR may be used to augment the interactions.
- the AR engine 120 is integrated into and/or hosted on the mobile client 100 . In other embodiments, the AR engine 120 is hosted external to the mobile client 100 and communicatively couples to the mobile client 100 over the network 150 .
- the AR system 110 may comprise program code that executes functions as described herein.
- the AR system 110 includes the surface distinction application 130 .
- the surface distinction application enables surface distinction in an AR game such that AR objects (e.g., virtual objects rendered by the AR engine 120 ) may appear to interact with various surfaces in an environment of the user.
- the user may capture an image and/or video of an environment, which may include one or more objects (e.g., a table, a book, etc.) captured within a camera view of the camera 102 of the mobile client 100 .
- While the surface distinction application 130 may identify surfaces depicted within both images and videos, many instances described herein will refer to surface distinction in images captured by a mobile client.
- the AR engine 120 renders an AR object, where the rendering may be based on the identified surfaces within the environment (e.g., the surface of a table).
- the surface distinction application 130 identifies and distinguishes surfaces (e.g., floors and walls) in an image of the environment. For example, the surface distinction application 130 may distinguish a surface of a floor from a surface of a table.
- the surface distinction application 130 provides an AR object for display as interacting with one or more surfaces.
- the AR object is an AR avatar (e.g., an AR representation of a human resembling the user) and the avatar is displayed sitting on the floor, climbing from a table to a book resting on the table, etc.
- FIGS. 5A-5E illustrate one example of the surface distinction application 130 identifying surfaces within a living room and enabling a user to control an AR avatar that is configured to travel between identified surfaces (e.g., walking in a visually natural way between surfaces).
- the AR system 110 includes applications instead of and/or in addition to the surface distinction application 130 .
- the surface distinction application 130 may be hosted on and/or executed by the mobile client 100 .
- the surface distinction application 130 is communicatively coupled to the mobile client 100 .
- the database 140 stores images or videos that may be used by the surface distinction application 130 to identify surfaces within an image or video.
- the mobile client 100 may transmit images or videos collected by the camera 102 during the execution of the AR client 101 to the database 140 .
- the data stored within the database 140 may be collected from a single client (e.g., the mobile client 100 ) or multiple clients (e.g., other mobile clients that are communicatively coupled to the AR system 110 through the network 150 ).
- the surface distinction application 130 may use images and/or videos of environments stored in the database 140 to train a model to classify objects within the environments. In turn, the classified objects may be used to determine a particular surface depicted in the image. The classification of objects within an image is further described in the description of FIG. 2 .
- the network 150 transmits data between the mobile client 100 and the AR system 110 .
- the network 150 may be a local area and/or wide area network that uses wired and/or wireless communication systems, such as the internet.
- the network 150 includes encryption capabilities to ensure the security of data, such as secure sockets layer (SSL), transport layer security (TLS), virtual private networks (VPNs), internet protocol security (IPsec), etc.
- FIG. 2 is a block diagram of the surface distinction application 130 of FIG. 1 , in accordance with at least one embodiment.
- the surface distinction application 130 includes a network interface 210 , a cluster module 220 , a mesh generation module 230 , and a rendering module 240 .
- the surface distinction application 130 includes modules other than those shown in FIG. 2 .
- the surface distinction application 130 may include a machine learning model trained to classify objects within images received from the mobile client 100 .
- the modules may be embodied as program code (e.g., software comprised of instructions stored on a non-transitory computer readable storage medium and executable by at least one processor such as the processor 602 in FIG. 6 ) and/or hardware (e.g., application specific integrated circuit (ASIC) chips or field programmable gate arrays (FPGA) with firmware).
- the modules correspond to at least having the functionality described when executed/operated.
- the network interface 210 may be a communication interface for the surface distinction application 130 to communicate with various components in the AR system 110 , such as the mobile client 100 , the AR engine 120 , and the database 140 .
- the mobile client 100 may transmit requests with data payloads to the surface distinction application 130 via the network interface 210 .
- the requests may be to identify surfaces within an image, modify the display of AR objects (e.g., in response to user interactions with the mobile client 100 with a request to control the state of an AR object), or any suitable action to provide an interactive AR experience using the AR client 101 .
- the requests may have data payloads such as an image, a video, or information about a user interaction (e.g., coordinates on the screen 103 that the user has interacted with in a request to control an AR object).
- the “network interface” may be referred to as an “application interface” in embodiments where the surface distinction application is hosted and executed on the mobile client 100 .
- the application interface may be a communication interface for the surface distinction application 130 to communicate with various components in the mobile client 100 such as the camera 102 and the screen 103 .
- Requests received by the network interface 210 from the mobile client 100 may be automatically generated during execution of the AR client 101 (e.g., during gameplay). Alternatively or additionally, the requests may be generated responsive to a user interaction with the mobile client 100 . For example, the user may tap the screen 103 at a location corresponding to a book, and the AR client 101 may send a request to the surface distinction application 130 to identify a surface, based on the user interaction, within an image depicting the book. The requests from the mobile client 100 may identify the mobile client 100 and/or the AR client 101 . Other components within the AR system 110 may communicate with the surface distinction application 130 via the network interface 210 . For example, the AR engine 120 may provide feature points of an image to the surface distinction application 130 via the network interface 210 .
- the network interface 210 may take various forms.
- the network interface 210 takes the form of an application programming interface (API) such as REST (representational state transfer), SOAP (Simple Object Access Protocol), RPC (remote procedural call), or another suitable type.
- the network interface 210 may receive an image captured within the camera view of the camera 102 of the mobile client 100 .
- the network interface 210 may transmit the image to the AR engine 120 , which determines feature points in the image of the environment and provides the feature points to the cluster module 220 (e.g., via the network interface 210 ).
- the mobile client 100 may provide the image to the AR engine 120 , which subsequently provides feature points to the cluster module 220 .
- the feature points may provide information about the content of an image for subsequent image processing, where the information indicates features of structures within the image such as surfaces, corners, points, objects, etc.
- the use of feature points by the surface distinction application 130 is further described with respect to the cluster module 220 and mesh generation module 230 .
- the cluster module 220 groups the feature points received from the AR engine 120 into clusters.
- the clusters may be indicative of distinct features in the image such as objects and surfaces (e.g., chairs, tables, walls, floors, etc.).
- An object depicted in an image from a camera view of the environment may be referred to herein as a “real-world object” as such objects have physical form and existence.
- the cluster module 220 identifies clusters using Euclidean clustering, where feature points within a threshold distance of one another form a cluster.
- the cluster module 220 may identify clusters using alternative methods, in other embodiments.
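- A minimal sketch of such threshold-based (Euclidean) clustering is shown below; the distance threshold, the breadth-first grouping strategy, and the function name are illustrative assumptions rather than details taken from the disclosure:

```python
import numpy as np

def euclidean_clusters(points, threshold=0.05):
    """Group 3D feature points into clusters in which each point lies within
    `threshold` (coordinate-space units) of another point in the same cluster.
    Simple breadth-first grouping; a real system might use a k-d tree for speed."""
    unvisited = set(range(len(points)))
    clusters = []
    while unvisited:
        seed = unvisited.pop()
        cluster, frontier = [seed], [seed]
        while frontier:
            i = frontier.pop()
            near = [j for j in unvisited
                    if np.linalg.norm(points[i] - points[j]) < threshold]
            for j in near:
                unvisited.remove(j)
            cluster.extend(near)
            frontier.extend(near)
        clusters.append(points[cluster])
    return clusters

# Two well-separated groups of feature points form two clusters (two surfaces).
pts = np.array([[0.0, 0.0, 0.0], [0.02, 0.0, 0.0], [0.0, 0.03, 0.0],
                [1.0, 1.0, 0.4], [1.02, 1.0, 0.4]])
print([len(c) for c in euclidean_clusters(pts)])  # e.g., [3, 2]
```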
- the cluster module 220 may generate a three-dimensional (3D) virtual coordinate space using the received feature points. Each feature point may be characterized by Cartesian coordinates, which may indicate a depth and a height of the feature point within the 3D virtual coordinate space.
- the surface distinction application 130 may store a first set of feature points of a first image corresponding to what is depicted in a camera view of a mobile device and subsequently discard the first set of feature points so that memory resources may be saved for a second set of feature points (e.g., when the camera view changes to depict different objects).
- a first image may depict a table and the cluster module 220 may receive a first set of feature points identified for the table.
- a subsequently received second image may depict a chair and the cluster module 220 may receive a second set of feature points identified for the chair.
- the cluster module 220 may determine, using the received feature points, that the camera view of the mobile device is capturing different objects or a different view of the environment. For example, the cluster module 220 may compare the values of the first set and second set of feature points to determine that they correspond to different objects. The cluster module 220 may determine to discard the feature points corresponding to objects that are not captured within the most recent camera view. For example, the cluster module 220 may discard the first set of feature points (e.g., responsive to determining that the first set and second set of feature points are different or exceed a threshold measure of difference). In this way, the surface distinction application 130 may conserve memory resources.
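- One way to realize this memory-saving behavior is sketched below; the 5 cm matching radius, the change threshold, and the function name are assumptions for illustration only:

```python
import numpy as np

def update_stored_points(stored, incoming, change_threshold=0.5):
    """Keep only the feature points for the current camera view. If the incoming
    set differs from the stored set by more than `change_threshold` (fraction of
    incoming points with no nearby stored point), discard the stored set so its
    memory can be reclaimed."""
    if stored is None or len(stored) == 0:
        return incoming
    # Fraction of incoming points with no stored point within 0.05 units (assumed 5 cm).
    dists = np.linalg.norm(incoming[:, None, :] - stored[None, :, :], axis=2)
    unmatched = np.mean(dists.min(axis=1) > 0.05)
    # A large mismatch suggests the camera view changed; replace the stored set.
    return incoming if unmatched > change_threshold else stored
```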
- a cluster indicates physical surfaces within the environment.
- the surfaces may be of a real-world object (e.g., the top of a table or seat of a chair) or standalone surfaces such as floors.
- Cluster attributes, such as a cluster's size and location within the captured image relative to other clusters, may correspond to similar attributes of the surface relative to other surfaces in the environment.
- the cluster module 220 may identify a cluster in an image of a room with a box. The identified cluster may correspond to the surfaces of the box, indicate a location of the box within the image of the room, and scale according to a size of the box in the image.
- Feature point density may be used to determine the presence of an object.
- feature point density is proportional to image texture, where an object with varying colors or surface features (e.g., protrusions, corners, ridges, etc.) may be associated with a higher density of clusters than a flat surface whose image texture does not vary as much.
- the cluster module 220 identifies a plurality of clusters from the feature points received from the AR engine 120 , e.g., each cluster corresponding to a real-world object in the camera view of the environment. In response, the cluster module 220 determines a size of each of the clusters. Based on the determined sizes, the cluster module 220 selects one of the clusters. For example, the cluster module 220 may select the largest cluster. In another example embodiment, the cluster module 220 receives user input provided by the mobile client 100 via the network interface 210 .
- the user input may be a tap or swipe on the screen 103 , a click or drag with a computer cursor, or any other suitable user input via the mobile client 100 indicating a selection of a real-world object displayed on the screen 103 .
- the cluster module 220 subsequently selects a cluster corresponding to the selected real-world object.
- the cluster module 220 may distinguish clusters of feature points associated with real-world objects from clusters of feature points associated with flat, two-dimensional (2D) surfaces (e.g., walls, floors) in the camera view.
- the cluster module 220 applies a Random Sample Consensus (RANSAC) algorithm to the image to identify these 2D surfaces.
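- A basic RANSAC plane fit of the kind referenced here might look like the following sketch; the iteration count, inlier tolerance, and function name are illustrative assumptions rather than values from the disclosure:

```python
import numpy as np

def ransac_plane(points, iterations=200, inlier_tol=0.01, seed=0):
    """Fit a dominant plane n.x + d = 0 to 3D points with a basic RANSAC loop.
    Returns (normal, d, inlier_mask)."""
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(points), dtype=bool)
    best_plane = (np.array([0.0, 0.0, 1.0]), 0.0)
    for _ in range(iterations):
        sample = points[rng.choice(len(points), 3, replace=False)]
        normal = np.cross(sample[1] - sample[0], sample[2] - sample[0])
        norm = np.linalg.norm(normal)
        if norm < 1e-9:           # skip degenerate (collinear) samples
            continue
        normal = normal / norm
        d = -normal.dot(sample[0])
        inliers = np.abs(points @ normal + d) < inlier_tol
        if inliers.sum() > best_inliers.sum():
            best_inliers, best_plane = inliers, (normal, d)
    return best_plane[0], best_plane[1], best_inliers

# Mostly-flat floor points (z ~ 0) plus some off-plane clutter; the fit recovers a
# normal close to (0, 0, 1) and flags the floor points as inliers.
rng = np.random.default_rng(1)
floor = np.column_stack([rng.uniform(0, 2, 50), rng.uniform(0, 2, 50), rng.normal(0, 0.002, 50)])
clutter = rng.uniform(0, 2, (10, 3))
normal, d, mask = ransac_plane(np.vstack([floor, clutter]))
print(np.round(np.abs(normal), 2), mask[:50].mean())  # ~[0. 0. 1.] and ~1.0
```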
- the mesh generation module 230 generates a 3D mesh for a real-world object or surface from its corresponding cluster.
- the mesh generation module 230 may use triangulation (e.g., Delaunay triangulation) to generate a 3D mesh from the cluster of feature points.
- the mesh generation module 230 builds a stencil shader and accordingly generates a 3D mesh that corresponds to bounds of the identified cluster. Therefore, the 3D mesh approximates the shape of the real-world object.
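- A sketch of generating such a mesh via Delaunay triangulation is shown below; it assumes each cluster is near-planar and projects the points onto the cluster's best-fit plane before triangulating (SciPy and the function name are used purely for illustration):

```python
import numpy as np
from scipy.spatial import Delaunay

def mesh_from_cluster(cluster_points):
    """Triangulate a roughly planar cluster of 3D feature points into a mesh.
    Projects the points onto the cluster's principal plane, runs 2D Delaunay
    triangulation, and returns (vertices, triangle_index_array)."""
    centered = cluster_points - cluster_points.mean(axis=0)
    # Principal axes of the cluster; the first two span the surface plane.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    uv = centered @ vt[:2].T                 # 2D coordinates in that plane
    tri = Delaunay(uv)
    return cluster_points, tri.simplices     # mesh vertices and triangle faces

# Five points on a tabletop produce a small triangle fan covering the surface.
table = np.array([[0, 0, .4], [1, 0, .4], [1, 1, .4], [0, 1, .4], [.5, .5, .4]], dtype=float)
verts, faces = mesh_from_cluster(table)
print(faces.shape)   # (n_triangles, 3)
```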
- the mesh generation module 230 may iterate through each cluster identified by the cluster module 220 to generate 3D meshes.
- the mesh generation module 230 may generate a 3D mesh of a user-specified cluster responsive to a user selection of a point or area within the image (e.g., using the mobile client 100 ) corresponding to the cluster.
- the mesh generation module 230 may detect 2D surfaces depicted within an image.
- the mesh generation module 230 may determine features of the mesh (e.g., depth, convexity, etc.) and use the determined features to determine whether the mesh corresponds to a 3D image.
- the mesh generation module 230 allows surface distinction to be performed using a mesh rather than individual feature points.
- the surface distinction application 130 is processing, for example, one 3D mesh as opposed to multiple feature points (e.g., millions of feature points). This way, the surface distinction application 130 may optimize processing resources.
- a cluster determined by the cluster module 220 corresponds to a box.
- the mesh generation module 230 may generate a 3D mesh that is shaped similarly to the box.
- the 3D mesh of the box would correspond to a height, length, and width of the box.
- the mesh generation module 230 may be configured to refine the density of the mesh. For example, a lower density mesh may provide the primary points for the 3D mesh of the object in terms of dimensions. The more refined, or higher density, mesh may allow for generating a 3D mesh that may also generate a corresponding texture of the object (e.g., ridges, bumps, etc.).
- the mesh generation module 230 provides the 3D mesh to the rendering module 240 using, for example, a .NET bridge.
- the mesh generation module 230 generates the 3D mesh for a cluster after classifying a real-world object within the received image.
- the surface distinction application 130 may include a machine learning model that classifies real-world objects in images of the environment.
- the mesh generation module 230 may apply an image to the machine learning model, which outputs a type for each of the real-world objects in the image.
- the mesh generation module 230 may generate the 3D mesh for the selected real-world object based on the type output by the machine learning model.
- the mesh generation module 230 may determine coordinates of the 3D mesh within the 3D virtual coordinate space generated by the cluster module 220 . These determined coordinates may be boundary coordinates outlining, for example, edges, surfaces, cavities, etc. of the 3D mesh as located within the virtual coordinate space representing the environment in the camera view.
- the surface distinction application 130 includes a machine learning model training engine that accesses the database 140 for images of real-world objects to train the machine learning model.
- the machine learning model training engine may generate training sets using the accessed images and labels identifying the objects depicted within the images. The training engine may then use the training sets to train the machine learning model.
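- A minimal sketch of such a training step is given below; it assumes the stored images have already been reduced to fixed-length feature vectors, and the classifier choice and function name are illustrative stand-ins rather than details from the disclosure:

```python
from sklearn.ensemble import RandomForestClassifier

def train_object_classifier(feature_vectors, labels):
    """Train a classifier mapping per-image feature vectors to object types
    (e.g., "table", "book"). Illustrative stand-in for the machine learning
    model described above."""
    model = RandomForestClassifier(n_estimators=100, random_state=0)
    model.fit(feature_vectors, labels)
    return model

# model.predict(new_vectors) would then return a type for each real-world object,
# which the mesh generation module could consult when building a 3D mesh.
```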
- the rendering module 240 provides for display, on the mobile client 100 , an AR object that is generated, or “rendered,” by the AR engine 120 .
- the rendering module 240 may determine a location on a virtual coordinate space to position the rendered AR object and transmit the location and the rendered AR object to the mobile client 100 for display.
- the rendering module 240 may determine various states to display an AR object depending on an identified surface on which the AR object is rendered and/or a location within the identified surface. For example, the rendering module 240 may render an AR avatar that appears to look down when the AR avatar is displayed at the edge of a surface. The rendering module 240 may determine that the edge of the surface has a coordinate within the virtual coordinate space with a non-zero Z-coordinate or value greater than a threshold Z-coordinate and display the AR avatar in a state where it appears to look down. The rendering module 240 may also provide an AR object for display that indicates the presence of other identified surfaces to which it may travel. For example, the AR avatar may be rendered to point to a table's surface to which it may climb.
- the rendering module 240 provides for display the AR object as interacting with the surface (e.g., standing on the surface, traveling on the surface, sitting on the surface, etc.).
- the rendering module 240 provides the 3D meshes of the surfaces for display on the mobile client 100 .
- the mesh generation module 230 may provide the generated 3D meshes for display on the mobile client's display 103 .
- the mesh generation module 230 provides an indication of identified surfaces for display by overlaying 3D meshes of the surfaces over the image from the camera view of the mobile client 100 .
- the rendering module 240 may receive, from the mobile client 100 , data indicative of user interactions indicating one or more locations at which an AR object is requested to be provided at the screen 103 .
- a user may tap or swipe the screen 103 (e.g., a touchscreen display) and the AR client 101 may provide one or more points (e.g., coordinates corresponding to areas of the screen 103 ) to the surface distinction application 130 .
- the rendering module 240 determines one or more corresponding coordinates on the 3D virtual coordinate space to provide the AR object for display.
- the rendering module 240 may determine a set of coordinates in the 3D virtual coordinate space that correspond to a path between two identified surfaces: a floor and a table.
- the rendering module 240 may use the Z-coordinates of respective identified surfaces (e.g., coordinates corresponding to the center of each surface) within the 3D virtual coordinate space to determine height differences between two surfaces.
- the rendering module 240 may determine a depth difference using X and Y coordinates of the respective identified surfaces.
- the rendering module 240 may determine a shortest path between the centers of each surface.
- the rendering module 240 may provide a path (e.g., stepping stones) along this shortest path for display at the mobile client 100 .
- the rendering module 240 requests the AR engine 120 render AR stepping stones and subsequently transmits the rendered stepping stones to the mobile client 100 for display.
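- The placement of such a path can be sketched as follows; the linear spacing, the stone count, and the use of surface centers are assumptions for illustration:

```python
import numpy as np

def stepping_stone_path(center_a, center_b, n_stones=5):
    """Place AR stepping stones along the straight line between two surface
    centers in the 3D virtual coordinate space. Returns the stone coordinates
    and the height difference between the surfaces (difference of Z values)."""
    center_a, center_b = np.asarray(center_a, float), np.asarray(center_b, float)
    height_diff = center_b[2] - center_a[2]
    t = np.linspace(0.0, 1.0, n_stones + 2)[1:-1]      # exclude the two endpoints
    stones = center_a + t[:, None] * (center_b - center_a)
    return stones, height_diff

# Floor center at z = 0, table-top center at z = 0.7 in the virtual coordinate space.
stones, dz = stepping_stone_path([0.0, 0.0, 0.0], [1.2, 0.8, 0.7])
print(dz)            # 0.7 -> the table is 0.7 units above the floor
print(stones.shape)  # (5, 3) -> five stone positions along the path
```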
- the state in which the rendering module 240 displays an AR object may include the size or ratio of the AR object, which may vary depending on the location within the 3D virtual coordinate space at which the AR object is placed.
- the rendering module 240 may display the AR avatar, as it travels along the AR stepping stones, at an increasingly larger size ratio if the avatar is traveling toward the user or at a decreasingly smaller size ratio if the avatar is traveling away.
- the rendering module 240 promotes a more visually natural display of the AR objects as they interact with identified surfaces.
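- A simple way to realize this size adjustment is sketched below; the inverse-distance scaling law and the reference distance are assumptions, not details from the disclosure:

```python
import math

def size_ratio(object_position, camera_position, reference_distance=1.0):
    """Scale an AR object inversely with its distance from the camera so it
    appears smaller as it travels away from the user (illustrative 1/d law)."""
    d = math.dist(object_position, camera_position)
    return reference_distance / max(d, 1e-6)

# The avatar is drawn at half size when it is twice the reference distance away.
print(size_ratio((0.0, 0.0, 2.0), (0.0, 0.0, 0.0)))  # 0.5
```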
- the rendering module 240 may render instructions for display to direct a user to perform a particular user interaction.
- the rendered instructions may include text, pictures, animations, AR objects, etc.
- the rendering module 240 may provide an arrow for display pointing from one surface to another, indicating that the user should make a swiping motion on the screen 103 following the arrow to cause a set of AR steps to be generated, thus creating a path between the two surfaces for an AR avatar to travel between.
- the rendering module 240 may determine the instructions using the coordinates of the 3D meshes corresponding to the surfaces. For example, the rendering module 240 may use the Cartesian coordinates of two points within respective surface meshes to determine a line in the 3D virtual coordinate space between the two meshes. The rendering module 240 may then use the line to determine a location to provide the instructional arrow for display.
- FIG. 3 is a flowchart illustrating an example process 300 for providing an AR object for display based on surface distinction, in accordance with at least one embodiment.
- the process 300 may be performed by the surface distinction application 130 .
- the surface distinction application 130 may perform operations of the process 300 in parallel or in different orders, or may perform different, additional, or fewer steps.
- the surface distinction application 130 may receive, from the mobile client, a request based on a user interaction to identify a particular surface (e.g., the user taps on a book displayed in an image shown on the screen 103 ), in which case an identified 306 cluster corresponds to the user's selected surface.
- the surface distinction application may also be applied to videos.
- the process 300 may be performed by a surface distinction application hosted on a mobile client or hosted on a remote server that is communicatively coupled to the mobile client.
- planar surfaces are 2D surfaces with minimal or no texture (e.g., cavities, ridges, protrusions, etc.).
- the terms "planar surface" and "surface" may be used interchangeably herein unless the context in which either term is used indicates otherwise.
- Planar surfaces may include relatively flat surfaces such as floors, walls, surfaces of flat objects (e.g., a top of a box) or semi-flat objects (e.g., a page of an open book), or any suitable 2D surface.
- the planar surface may have a slope.
- the planar surface may be the exterior of a triangular roof.
- the mobile client 100 transmits an image from a camera view (e.g., field of view from the camera 102 ) of a park with grass, a bench, and a tree. Planar surfaces depicted in this image include the grass and the bench (e.g., the seat and the back of the bench).
- the surface distinction application 130 may provide the received image to the AR engine 120 to identify feature points within the image.
- An interface of the surface distinction application such as the network interface 210 or the application interface (e.g., for surface distinction applications hosted on the mobile client 100 ) may receive 302 the image.
- the surface distinction application 130 receives 304 feature points associated with the planar surfaces.
- the AR engine 120 may identify feature points within the received image to provide to the surface distinction application 130 .
- the received feature points may be characterized by coordinates to indicate the position of distinct, 3D features within the 2D image relative to other objects within the image. For example, a coordinate corresponding to a point on a bench within an image of a park may be represented by Cartesian coordinates.
- the coordinate may indicate the depth of the point on the bench.
- the depth may be represented as a distance away from a reference point (e.g., an origin in the 3D virtual coordinate space), which may be a point on an object within the image or the camera 102 .
- An interface of the surface distinction application such as the network interface 210 or the application interface (e.g., for surface distinction applications hosted on the mobile client 100 ) may receive 304 the feature points.
- the surface distinction application 130 identifies 306 clusters based on the feature points, each cluster corresponding to a respective planar surface of the planar surfaces.
- the cluster module 220 may apply a RANSAC algorithm to the image to identify clusters of feature points that correspond to the respective planar surfaces. For example, the cluster module 220 identifies a first cluster of feature points identified in the image of the park that correspond to where in the image that grass is depicted and a second cluster of feature points that correspond to the bench seat.
- the surface distinction application 130 determines 308 , based on the identified clusters, locations of the planar surfaces in a 3D virtual coordinate space.
- the cluster module 220 may generate a 3D virtual coordinate space using the received feature points.
- the mesh generation module 230 may determine the coordinates of the identified clusters, where those coordinates indicate the locations of the planar surfaces.
- the mesh generation module 230 may generate meshes using the feature points in the identified clusters and determine additional coordinates outlining the shape of the surface.
- the mesh generation module 230 determines the coordinates of the feature points in a cluster representing a bench in an image of a park and generates, using those feature points, a 3D mesh (e.g., using Delaunay triangulation) where the 3D mesh is characterized by more coordinates than the feature points in the cluster.
- Each coordinate composing the 3D mesh may be a location of a planar surface of the bench.
- the surface distinction application 130 provides 310 for display at the mobile client an AR object, where the AR object is provided for display at a location of the determined locations.
- the rendering module 240 may provide 310 an AR object (e.g., an AR ball) for display at the mobile client 100 , where the AR ball is displayed as resting on the surface of a real-world bench captured within the camera view of the mobile client 100 .
- the mesh generation module 230 may have determined at least one location of a surface of the bench (e.g., using coordinates of the 3D mesh of the bench within the 3D virtual coordinate space).
- the AR ball may be rendered by the AR engine 120 and positioned, by the rendering module 240 , within the 3D virtual coordinate space at a location that is equivalent to a coordinate of the 3D mesh of the bench such that the AR ball appears to rest atop a surface of the bench.
- FIGS. 4A and 4B are flowcharts illustrating an example process 400 for providing an AR object for display based on the process 300 of FIG. 3 , in accordance with at least one embodiment.
- the process 400 may be performed by the surface distinction application 130 .
- the surface distinction application 130 may perform operations of the process 400 in parallel or in different orders, or may perform different, additional, or fewer steps.
- the surface distinction application may also be applied to videos.
- the surface distinction application 130 generates 402 a 3D virtual coordinate space following receiving 304 feature points associated with planar surfaces (e.g., from the AR engine 120 ).
- the cluster module 220 may use the received feature points to generate the coordinate space.
- the feature points may be characterized by a coordinate system when received such that the feature points indicate at least a height and a depth for features in an image.
- the cluster module 220 may use the coordinate system provided, convert the feature points to another coordinate system, or perform coordinate transformation to rotate axes.
- the surface distinction application 130 identifies 404 and 406 a first and second cluster, respectively, from the feature points.
- the cluster module 220 may perform identification 404 and 406 as part of identifying 306 clusters. Each identified cluster may correspond to a respective planar surface. Continuing the example described with reference to the process 300 , the cluster module 220 may identify 404 a first cluster from the feature points corresponding to grass in an image of a park and identify 406 a second cluster from the feature points corresponding to a bench in the image.
- the surface distinction application 130 generates 408 and 410 a first and a second mesh, respectively.
- the mesh generation module 230 may generate 408 a first 3D mesh from the first cluster and generate 410 a second 3D mesh from the second cluster.
- the first mesh represents the grass identified by the first cluster of feature points and the second mesh represents the bench identified by the second cluster of feature points.
- the first mesh may be associated with a first Z-coordinate in the 3D virtual coordinate space.
- the first mesh may include a feature point with a Z-coordinate of 0 to indicate a height of the grass in the image relative to other objects (e.g., a mesh with a negative Z-coordinate has a feature that is below the point at the grass with a Z-coordinate of 0).
- the second mesh may be associated with a second Z-coordinate in the 3D virtual coordinate space.
- the second mesh may include a feature point with a non-zero Z-coordinate to indicate that a point on the bench has a surface that is higher than the grass at the Z-coordinate of 0.
- the 3D meshes may also be similarly characterized by X and Y coordinates, which may be used to indicate depths of surfaces relative to other surfaces in the image.
- the surface distinction application 130 determines 412 a height difference between a first planar surface and a second planar surface based on the first and second Z-coordinates.
- the mesh generation module 230 may determine 412 a difference between two Z-coordinates to determine the height difference between two surfaces at the respective points. For example, the mesh generation module 230 determines a height difference between grass and a bench depicted in an image of a park using Z-coordinates of feature points of the respective 3D meshes of the grass and bench.
- the surface distinction application 130 provides 414 for display at the mobile client indications of the respective locations of the first and second planar surfaces.
- the indications of the locations of surfaces identified by the surface distinction application 130 may be the 3D meshes generated by the mesh generation module 230 .
- the rendering module 240 may provide 414 3D meshes of identified planar surfaces for display at the mobile client 100 (e.g., using the screen 103 ), where the 3D meshes are displayed overlaying the corresponding surfaces.
- the rendering module 240 may provide 414 a 3D mesh of the grass overlaying the grass in the image of the park and provide 414 a 3D mesh of the bench overlaying the surface of the bench.
- the user of the mobile client 100 may see, on the screen 103 , the image and the 3D meshes overlaying the image.
- the surface distinction application 130 receives 416 a user interaction between the first and second planar surfaces.
- the network interface 210 receives 416 a user interaction of a swipe of the user's finger across the screen 103 between a starting coordinate at the 3D mesh corresponding to the grass and an ending coordinate at the 3D mesh corresponding to the bench.
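- Mapping the endpoints of such an interaction to the identified surface meshes can be sketched as below; the nearest-vertex rule is a simplification (an actual system would more likely ray-cast from the touch point through the camera), and the names and distance values are illustrative assumptions:

```python
import numpy as np

def mesh_under_point(meshes, point_xy, max_xy_dist=0.5):
    """Map an interaction coordinate (projected into the X/Y plane of the 3D
    virtual coordinate space) to the nearest surface mesh by vertex distance."""
    best, best_dist = None, float("inf")
    for name, vertices in meshes.items():
        d = np.min(np.linalg.norm(vertices[:, :2] - np.asarray(point_xy), axis=1))
        if d < best_dist:
            best, best_dist = name, d
    return best if best_dist <= max_xy_dist else None

meshes = {
    "grass": np.array([[0.0, 0.0, 0.0], [2.0, 0.0, 0.0], [0.0, 2.0, 0.0]]),
    "bench": np.array([[1.0, 1.0, 0.5], [1.4, 1.0, 0.5], [1.0, 1.3, 0.5]]),
}
print(mesh_under_point(meshes, (0.1, 0.1)))    # grass (start of the swipe)
print(mesh_under_point(meshes, (1.05, 1.05)))  # bench (end of the swipe)
```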
- the surface distinction application provides 418 for display at the mobile client an AR object.
- the AR object may be configured to interact with the first and second planar surfaces based on the user interaction.
- the rendering module 240 provides 418 for display an AR avatar at the mobile client 100 that is configured to travel between the grass and the bench at the locations where the user's finger swiped across the image.
- the rendering module 240 may provide 418 the AR object for display as a part of providing 310 for display the AR object at a determined location of a planar surface in the 3D virtual coordinate space.
- FIGS. 5A, 5B, 5C, 5D, and 5E illustrate an example process for controlling an AR object based on user interactions, at a mobile client, with planar surfaces detected by an AR system, in accordance with at least one embodiment.
- Various environments, both real and virtual, are depicted in FIGS. 5A-5E .
- an “environment” may refer to a real-world environment and a “virtual environment” may refer to an environment that has been captured by a computer (e.g., via imaging) for processing and may not necessarily be presented to the user of the mobile client 100 .
- Virtual environments are presented herein to promote clarity with describing the process for identifying surfaces.
- FIG. 5A shows an environment 500 a of a living room where example objects and respective first, second, and third surfaces 510 , 511 , and 512 exist.
- the first surface 510 is a floor of the living room
- the second surface 511 is a surface of a table on the surface 510
- the third surface 512 is a surface of a book on the surface 511 .
- a mobile client 100 may capture an image or video of the environment 500 a while using the AR client 101 .
- a captured image may be transmitted to the surface distinction application 130 for processing, as shown in FIGS. 5B and 5C .
- the network interface 210 may receive this image.
- FIG. 5B shows a virtual environment 500 b where feature points 520 have been identified within an image of the environment 500 a that includes depictions of surfaces 510 , 511 , and 512 .
- the feature points 520 depicted are examples of feature points and may vary in size and/or density, and there may be more or fewer points than what is depicted in FIG. 5B .
- the feature points 520 may indicate corners and edges of surfaces depicted within the image, which may correspond to boundary points of surfaces.
- the surface 511 corresponding to the top of a table is represented by feature points.
- the cluster module 220 may apply a RANSAC algorithm to identify a 2D surface, from the feature points, that corresponds to the top of the table.
- the cluster module 220 may generate a 3D virtual coordinate space using the feature points 520 .
- FIG. 5C shows a virtual environment 500 c where first, second and third 3D meshes 530 , 531 , and 532 have been identified within the environment 500 b for respective first, second and third surfaces 510 , 511 , and 512 .
- the mesh generation module 230 may use clusters of feature points 520 shown in environment 500 b to generate the 3D meshes for the surfaces depicted within the image of environment 500 a .
- the rendering module 240 may provide the generated meshes for display at the mobile client 100 to indicate to a user the surfaces identified by the surface distinction application 130 .
- FIG. 5D shows an environment 500 d where a mobile client 100 is executing the AR client 101 while capturing an image of the living room of environment 500 a .
- the mobile client 100 displays depictions of surfaces 510 , 511 , and 512 , an AR avatar 540 , and one or more AR stepping stones 541 .
- the rendering module 240 may provide the AR avatar 540 and the AR stepping stones 541 for display on the mobile client 100 .
- the user of the mobile client 100 may swipe his finger across the display, where the swipe may be characterized by a starting coordinate on the 3D virtual coordinate space corresponding to the surface 510 of the living room floor and an ending coordinate corresponding to the surface 511 of the top of the table.
- This user interaction may be received by the network interface 210 as a request for the rendering module 240 to render the AR stepping stones 541 along the path of the swipe.
- the rendering module 240 may determine a path within the 3D coordinate space that maps to a line between the starting and ending coordinates and generate the AR stepping stones 541 along the determined line.
- FIG. 5E shows an environment 500 e where the mobile client 100 displays the AR avatar 540 at a location of an identified surface 511 , the top of the table.
- the user may have performed an additional user interaction with the display of the mobile client 100 to request that the AR avatar 540 move up the AR stepping stones 541 .
- the network interface 210 may receive the user interaction and the rendering module 240 may provide the AR avatar 540 in a state depicted as traveling from the floor to the table.
- the user may continue to perform user interactions such as tapping the display of the mobile client 100 where the surface 512 of the book is depicted to request that the AR avatar 540 interact with the surface 512 (e.g., sit on the book).
- the rendering module 240 may decrease the size of the AR objects depending on their location within the 3D virtual coordinate space to promote a natural appearance when displayed on the mobile client 100 . For example, as the AR avatar 540 moves from the floor to the table, the rendering module 240 may decrease the size ratio of the AR avatar 540 to indicate that the AR object is moving away from the user and towards the table.
- FIG. 6 is a block diagram illustrating components of an example machine able to read instructions from a machine-readable medium and execute them in a processor (or controller).
- FIG. 6 shows a diagrammatic representation of a machine in the example form of a computer system 600 within which program code (e.g., software) for causing the machine to perform any one or more of the methodologies discussed herein may be executed.
- the program code may correspond to functional configuration of the modules and/or processes described with FIGS. 1-5E .
- the program code may be comprised of instructions 624 executable by one or more processors 602 .
- the machine operates as a standalone device or may be connected (e.g., networked) to other machines. In a networked deployment, the machine may operate in the capacity of a server machine or a client machine in a server-client network environment, or as a peer machine in a peer-to-peer (or distributed) network environment.
- the machine may be a portable computing device or machine (e.g., smartphone, tablet, wearable device (e.g., smartwatch)) capable of executing instructions 624 (sequential or otherwise) that specify actions to be taken by that machine.
- The term “machine” shall also be taken to include any collection of machines that individually or jointly execute instructions 624 to perform any one or more of the methodologies discussed herein.
- The example computer system 600 includes at least one processor 602 (e.g., a central processing unit (CPU), a graphics processing unit (GPU), a digital signal processor (DSP), one or more application specific integrated circuits (ASICs), one or more radio-frequency integrated circuits (RFICs), or any combination of these), a main memory 604, and a static memory 606, which are configured to communicate with each other via a bus 608.
- The computer system 600 may further include a visual display interface 610.
- The visual interface may include a software driver that enables displaying user interfaces on a screen (or display).
- The visual interface may display user interfaces directly (e.g., on the screen) or indirectly on a surface, window, or the like (e.g., via a visual projection unit). For ease of discussion, the visual interface may be described as a screen.
- The visual interface 610 may include or may interface with a touch-enabled screen.
- The computer system 600 may also include an alphanumeric input device 612 (e.g., a keyboard or touch screen keyboard), a cursor control device 614 (e.g., a mouse, a trackball, a joystick, a motion sensor, or other pointing instrument), a storage unit 616, a signal generation device 618 (e.g., a speaker), and a network interface device 620, which also are configured to communicate via the bus 608.
- The storage unit 616 includes a machine-readable medium 622 on which are stored the instructions 624 (e.g., software) embodying any one or more of the methodologies or functions described herein.
- The instructions 624 (e.g., software) may also reside, completely or at least partially, within the main memory 604 or within the processor 602 (e.g., within a processor's cache memory) during execution thereof by the computer system 600, the main memory 604 and the processor 602 also constituting machine-readable media.
- The instructions 624 (e.g., software) may be transmitted or received over a network 626 via the network interface device 620.
- While the machine-readable medium 622 is shown in an example embodiment to be a single medium, the term “machine-readable medium” should be taken to include a single medium or multiple media (e.g., a centralized or distributed database, or associated caches and servers) able to store instructions (e.g., instructions 624).
- The term “machine-readable medium” shall also be taken to include any medium that is capable of storing instructions (e.g., instructions 624) for execution by the machine and that cause the machine to perform any one or more of the methodologies disclosed herein.
- The term “machine-readable medium” includes, but is not limited to, data repositories in the form of solid-state memories, optical media, and magnetic media.
- In an AR application (e.g., an AR game), a user may want to have an AR object interact with surfaces in the real-world environment.
- Conventional implementations for mobile clients do not allow for surface distinction. Since conventional systems are unable to spatially differentiate surfaces, they limit AR objects to interacting with a single surface and may cause AR objects to be displayed disproportionately to a particular surface without knowledge of the depth or height of one surface relative to another.
- The methods described herein detect surfaces within an image and determine spatial relationships between the surfaces through clustering and mesh generation. Furthermore, the methods described herein may selectively store feature point data corresponding to the most recent camera view of the phone. For example, a user may direct their mobile phone's camera at a table, but then turn around to capture a chair.
- In this case, the feature points corresponding to the table may be discarded.
- The methods described herein promote efficient memory use and, as a consequence, promote efficient power and processing use as well by not expending those resources on the discarded data.
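A minimal Kotlin sketch of this selective retention policy is shown below. It assumes a hypothetical FeaturePoint type and a simple overlap test between the previously retained points and the points reported for the latest camera view; a real system would match points within a tolerance rather than by exact equality, and the threshold value is illustrative.

```kotlin
// Hypothetical feature point in the 3D virtual coordinate space.
data class FeaturePoint(val x: Float, val y: Float, val z: Float)

/**
 * Retains feature points only for the most recent camera view. When the
 * latest view shares too few points with the retained set (e.g., the user
 * turned from the table to a chair), the stale points are discarded to
 * conserve memory.
 */
class FeaturePointStore(private val overlapThreshold: Double = 0.2) {
    private var retained: Set<FeaturePoint> = emptySet()

    fun update(latestView: Set<FeaturePoint>): Set<FeaturePoint> {
        val overlap = if (retained.isEmpty()) 0.0
            else retained.intersect(latestView).size.toDouble() / retained.size
        retained = if (overlap < overlapThreshold) latestView else retained + latestView
        return retained
    }
}
```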
- The methods herein include a clustering algorithm that scans identified feature points, groups the ones closest together, and generates a mesh. By using a generated mesh instead of individual feature points to detect surfaces, processing resources of a device are optimized (e.g., processing one mesh vs. millions of feature points). Accordingly, the methods described herein enable surface distinction on mobile client rendered AR systems without consuming excessive amounts of processing power, thus presenting an immersive gaming experience to the user.
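The grouping step can be pictured with the following Kotlin sketch of single-linkage Euclidean clustering: a feature point joins a cluster when it lies within a threshold distance of any point already in that cluster, and each resulting cluster then serves as the input for mesh generation. The Point3 type, the threshold value, and the O(n^2) flood-fill strategy are illustrative assumptions; the disclosure does not prescribe this exact algorithm.

```kotlin
import kotlin.math.sqrt

data class Point3(val x: Float, val y: Float, val z: Float)

private fun dist(a: Point3, b: Point3): Float {
    val dx = a.x - b.x; val dy = a.y - b.y; val dz = a.z - b.z
    return sqrt(dx * dx + dy * dy + dz * dz)
}

/**
 * Groups feature points into clusters using a breadth-first flood fill:
 * a point joins a cluster when it is within `threshold` of any point
 * already in that cluster. Each cluster is a candidate surface.
 */
fun euclideanClusters(points: List<Point3>, threshold: Float): List<List<Point3>> {
    val visited = BooleanArray(points.size)
    val clusters = mutableListOf<List<Point3>>()
    for (seed in points.indices) {
        if (visited[seed]) continue
        val queue = ArrayDeque(listOf(seed))
        val cluster = mutableListOf<Point3>()
        visited[seed] = true
        while (queue.isNotEmpty()) {
            val i = queue.removeFirst()
            cluster += points[i]
            for (j in points.indices) {
                if (!visited[j] && dist(points[i], points[j]) <= threshold) {
                    visited[j] = true
                    queue.add(j)
                }
            }
        }
        clusters += cluster
    }
    return clusters
}

fun main() {
    val points = listOf(
        Point3(0f, 0f, 0f), Point3(0.1f, 0f, 0f),      // floor-like cluster
        Point3(1f, 1f, 0.7f), Point3(1.05f, 1f, 0.7f)  // table-top-like cluster
    )
    println(euclideanClusters(points, threshold = 0.3f).size) // expect 2
}
```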
- Modules may constitute either software modules (e.g., code embodied on a machine-readable medium or in a transmission signal) or hardware modules.
- A hardware module is a tangible unit capable of performing certain operations and may be configured or arranged in a certain manner.
- In example embodiments, one or more computer systems (e.g., a standalone, client, or server computer system) or one or more hardware modules of a computer system (e.g., a processor or a group of processors) may be configured by software (e.g., an application or application portion) as a hardware module that operates to perform certain operations as described herein.
- A hardware module may be implemented mechanically or electronically.
- A hardware module may comprise dedicated circuitry or logic that is permanently configured (e.g., as a special-purpose processor, such as a field programmable gate array (FPGA) or an application-specific integrated circuit (ASIC)) to perform certain operations.
- A hardware module may also comprise programmable logic or circuitry (e.g., as encompassed within a general-purpose processor or other programmable processor) that is temporarily configured by software to perform certain operations. It will be appreciated that the decision to implement a hardware module mechanically, in dedicated and permanently configured circuitry, or in temporarily configured circuitry (e.g., configured by software) may be driven by cost and time considerations.
- Accordingly, the term “hardware module” should be understood to encompass a tangible entity, be that an entity that is physically constructed, permanently configured (e.g., hardwired), or temporarily configured (e.g., programmed) to operate in a certain manner or to perform certain operations described herein.
- As used herein, “hardware-implemented module” refers to a hardware module. Considering embodiments in which hardware modules are temporarily configured (e.g., programmed), each of the hardware modules need not be configured or instantiated at any one instance in time. For example, where the hardware modules comprise a general-purpose processor configured using software, the general-purpose processor may be configured as respective different hardware modules at different times. Software may accordingly configure a processor, for example, to constitute a particular hardware module at one instance of time and to constitute a different hardware module at a different instance of time.
- Hardware modules can provide information to, and receive information from, other hardware modules. Accordingly, the described hardware modules may be regarded as being communicatively coupled. Where multiple of such hardware modules exist contemporaneously, communications may be achieved through signal transmission (e.g., over appropriate circuits and buses) that connect the hardware modules. In embodiments in which multiple hardware modules are configured or instantiated at different times, communications between such hardware modules may be achieved, for example, through the storage and retrieval of information in memory structures to which the multiple hardware modules have access. For example, one hardware module may perform an operation and store the output of that operation in a memory device to which it is communicatively coupled. A further hardware module may then, at a later time, access the memory device to retrieve and process the stored output. Hardware modules may also initiate communications with input or output devices, and can operate on a resource (e.g., a collection of information).
- Processors may be temporarily configured (e.g., by software) or permanently configured to perform the relevant operations. Whether temporarily or permanently configured, such processors may constitute processor-implemented modules that operate to perform one or more operations or functions.
- The modules referred to herein may, in some example embodiments, comprise processor-implemented modules.
- The methods described herein may be at least partially processor-implemented. For example, at least some of the operations of a method may be performed by one or more processors or processor-implemented hardware modules. The performance of certain of the operations may be distributed among the one or more processors, not only residing within a single machine, but deployed across a number of machines. In some example embodiments, the processor or processors may be located in a single location (e.g., within a home environment, an office environment, or as a server farm), while in other embodiments the processors may be distributed across a number of locations.
- The one or more processors may also operate to support performance of the relevant operations in a “cloud computing” environment or as a “software as a service” (SaaS). For example, at least some of the operations may be performed by a group of computers (as examples of machines including processors), these operations being accessible via a network (e.g., the Internet) and via one or more appropriate interfaces (e.g., application program interfaces (APIs)).
- The performance of certain of the operations may be distributed among the one or more processors, not only residing within a single machine, but deployed across a number of machines.
- The one or more processors or processor-implemented modules may be located in a single geographic location (e.g., within a home environment, an office environment, or a server farm). In other example embodiments, the one or more processors or processor-implemented modules may be distributed across a number of geographic locations.
- Any reference to “one embodiment” or “an embodiment” means that a particular element, feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment.
- The appearances of the phrase “in one embodiment” in various places in the specification are not necessarily all referring to the same embodiment.
- Some embodiments may be described using the terms “coupled” and “connected” along with their derivatives. It should be understood that these terms are not intended as synonyms for each other. For example, some embodiments may be described using the term “connected” to indicate that two or more elements are in direct physical or electrical contact with each other. In another example, some embodiments may be described using the term “coupled” to indicate that two or more elements are in direct physical or electrical contact. The term “coupled,” however, may also mean that two or more elements are not in direct contact with each other, but yet still co-operate or interact with each other. The embodiments are not limited in this context.
- The terms “comprises,” “comprising,” “includes,” “including,” “has,” “having,” or any other variation thereof are intended to cover a non-exclusive inclusion.
- A process, method, article, or apparatus that comprises a list of elements is not necessarily limited to only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.
- As used herein, “or” refers to an inclusive or and not to an exclusive or. For example, a condition A or B is satisfied by any one of the following: A is true (or present) and B is false (or not present), A is false (or not present) and B is true (or present), and both A and B are true (or present).
Abstract
Description
- This application claims the benefit of U.S. Provisional Application No. 62/971,766, filed Feb. 7, 2020, which is incorporated by reference in its entirety.
- The disclosure generally relates to the field of mobile rendered augmented reality and more specifically to surface distinction in mobile rendered augmented reality environments.
- Conventional augmented reality (AR) systems do not enable a user to control an AR object to interact between two surfaces in mobile computer environments. An AR object limited to a fixed position or the range of interactions between the AR object and the environment was confined to a single surface. In particular, conventional AR systems for mobile clients are unable to identify surfaces within an environment and determine spatial relationships between the identified surfaces (e.g., the height or distance between points on respective surfaces). Moreover, conventional solutions often require immense computing resources, power consumption, and memory capacity that are lacking in mobile computer environments. Accordingly, there is a need for a practical, mobile surface distinction solution that allows AR users to have an immersive experience where AR objects can naturally interact with identified surfaces.
- The disclosed embodiments have other advantages and features which will be more readily apparent from the detailed description, the appended claims, and the accompanying figures (or drawings). A brief introduction of the figures is below.
-
FIG. 1 illustrates an augmented reality (AR) game system environment, in accordance with at least one embodiment. -
FIG. 2 is a block diagram of the surface distinction application ofFIG. 1 , in accordance with at least one embodiment. -
FIG. 3 is a flowchart illustrating a process for providing an AR object for display based on surface distinction, in accordance with at least one embodiment. -
FIGS. 4A and 4B are flowcharts illustrating a process for providing an AR object for display based on the process ofFIG. 3 , in accordance with at least one embodiment. -
FIGS. 5A, 5B, 5C, 5D, and 5E illustrate a process for controlling an AR object based on user interactions, at a mobile client, with planar surfaces detected by an AR system, in accordance with at least one embodiment. -
FIG. 6 illustrates a block diagram including components of a machine able to read instructions from a machine-readable medium and execute them in a processor (or controller), in accordance with at least one embodiment. - The Figures (FIGS.) and the following description relate to preferred embodiments by way of illustration only. It should be noted that from the following discussion, alternative embodiments of the structures and methods disclosed herein will be readily recognized as viable alternatives that may be employed without departing from the principles of what is claimed.
- Reference will now be made in detail to several embodiments, examples of which are illustrated in the accompanying figures. It is noted that wherever practicable similar or like reference numbers may be used in the figures and may indicate similar or like functionality. The figures depict embodiments of the disclosed system (or method) for purposes of illustration only. One skilled in the art will readily recognize from the following description that alternative embodiments of the structures and methods illustrated herein may be employed without departing from the principles described herein.
- In one example embodiment of a disclosed system, method and computer readable storage medium, surfaces within a user's environment are identified to enable an augmented reality (AR) object to be provided for display on a mobile client based on the identified surfaces. Conventional AR systems for mobile clients are unable to identify surfaces within an environment and determine spatial relationships between the identified surfaces (e.g., the height or distance between points on respective surfaces). The systems and methods described herein may selectively limit the data stored to conserve memory resources. Furthermore, the surfaces may be identified using generated meshes rather than individual feature points, which optimizes processing resources (e.g., by analyzing one mesh as opposed to multiple feature points that make up the mesh). Accordingly, described is a configuration that performs these functions, which may be referred to herein as “surface distinction,” and enables AR objects to interact with the identified surfaces in a mobile rendered AR system while optimizing for the memory, power, and processing constraints (e.g., of the mobile client).
- In one example configuration, a camera coupled with the mobile client (e.g., integrated with the mobile client or wirelessly or wired connection with the mobile client) captures a camera view of an environment. The environment may correspond to the physical world, which includes surfaces within a field of view of the camera (i.e., the “camera view”). A processor (e.g., of the mobile device) processes program code that causes the processor to execute specified functions as are further described herein. Accordingly, the processor receives the image and several feature points, identified from the image, associated with the surfaces in the environment. The processor generates a three-dimensional (3D) virtual coordinate space using the feature points. The processor identifies two or more clusters from the feature points, each cluster corresponding to a different surface. The processor generates meshes from the clusters, where each mesh is defined by coordinates in the 3D virtual coordinate space (e.g., the shape of the mesh may be outlined by the coordinates). Using these coordinates, the processor may determine a height difference between two surfaces (e.g., a difference between two Z-coordinates of respective meshes). The processor receives a user interaction between the two surfaces. For example, the user makes a swipe on the display starting from one surface and ending at the other surface. The processor provides an AR object for display at the mobile client, where the AR object is configured to interact with the two surfaces based on the user interaction. For example, the processor provides an AR avatar for display that runs from one surface to the other surface.
- Surface distinction allows the user to control AR objects within an AR application as though the objects are interacting with reality around the user, presenting an immersive AR experience for the user. In particular, the methods described herein allow surface distinction on a mobile client executed AR application without consuming too much processing and/or battery power.
-
FIG. 1 illustrates an AR system environment, in accordance with at least one embodiment. The AR system environment may be an AR game system environment that enables a user to play AR games on amobile client 100, and in some embodiments, presents immersive gaming experiences to the user via surface distinction. The system environment includes amobile client 100, anAR system 110, anAR engine 120, asurface distinction application 130, adatabase 140, and anetwork 150. TheAR system 110, in some example embodiments, may include themobile client 100, theAR engine 120, thesurface distinction application 130, and thedatabase 140. In other example embodiments, theAR system 110 may include theAR engine 120, thesurface distinction application 130, and thedatabase 140, but not themobile client 100, such that theAR system 110 communicatively couples (e.g., wireless communication) to themobile client 100 from a remote server. - The
mobile client 100 is a mobile device that is or incorporates a computer. The mobile client may be, for example, a relatively small computing device in which network processing (e.g., processor and/or controller) and power resources (e.g., battery) may be limited and have a formfactor size such as a smartphone, tablet, wearable device (e.g., smartwatch) and/or a portable internet enabled device. The limitations of such device extend from scientific principles that must be adhered to in designing such products for portability and use away from constant power draw sources. - The
mobile client 100 has general and/or special purpose processors, memory, storage, networking components (either wired or wireless). Themobile client 100 can communicate over one or more communication connections (e.g., a wired connection such as ethernet or a wireless communication via cellular signal (e.g., LTE, 5G), WiFi, satellite) and includes a global positioning system (GPS) used to determine a location of themobile client 100. Themobile client 100 also includes a screen 103 (e.g., a display) and a display driver to provide for display interfaces on the display associated with themobile client 100. Themobile client 100 executes an operating system, such as GOOGLE ANDROID OS and/or APPLE iOS, and includes a display and/or a user interface that the user can interact with. - The
mobile client 100 also includes one or more cameras (e.g., the camera 102) that can capture forward and rear facing images and/or videos. The one ormore cameras 102 may be configured to capture depths of objects within an image. For example, the one ormore cameras 102 may be a dual camera, Light Detection and Ranging (LiDAR) camera, ultrasonic imaging camera, or any suitable camera capable of determining distance between an object and the camera. - In some embodiments, the
mobile client 100 couples to theAR system 110, which enables it to execute an AR application (e.g., the AR client 101). TheAR engine 120 interacts with themobile client 100 to execute the AR client 101 (e.g., an AR game). For example, theAR engine 120 may be a game engine such as UNITY and/or UNREAL ENGINE. TheAR engine 120 displays, and the user interacts with, the AR game via themobile client 100. Although the AR application refers to an AR gaming application in many instances described herein, the AR application may be a retail application integrating AR for modeling purchasable products, an educational application integrating AR for demonstrating concepts within a learning curriculum, or any suitable interactive application in which AR may be used to augment the interactions. In some embodiments, theAR engine 120 is integrated into and/or hosted on themobile client 100. In other embodiments, theAR engine 120 is hosted external to themobile client 100 and communicatively couples to themobile client 100 over thenetwork 150. TheAR system 110 may comprise program code that executes functions as described herein. - In some example embodiments, the
AR system 110 includes thesurface distinction application 130. The surface distinction application enables surface distinction an AR game such that AR objects (e.g., virtual objects rendered by the AR engine 120) may appear to interact with various surfaces in an environment of the user. The user may capture an image and/or video of an environment, which may include one or more objects (e.g., a table, a book, etc.) captured within a camera view of thecamera 102 or themobile client 100. While thesurface distinction application 130 may identify surfaces depicted within both images and videos, many instances described herein will refer to surface distinction in images captured by a mobile client. TheAR engine 120 renders an AR object, where the rendering may be based on the identified surfaces within the environment (e.g., the surface of a table). - During game play, the
surface distinction application 130 identifies and distinguishes surfaces (e.g., floors and walls) in an image of the environment. For example, thesurface distinction application 130 may distinguish a surface of a floor from a surface of a table. In some embodiments, thesurface distinction application 130 provides an AR object for display as interacting with one or more surfaces. For example, the AR object is an AR avatar (e.g., a AR representation of a human resembling the user) and the avatar is displayed sitting on the floor, climbing from a table to a book resting on the table, etc.FIGS. 5A-5E , described further herein, illustrate one example of thesurface distinction application 130 identifying surfaces within a living room and enabling a user to control an AR avatar that is configured to travel between identified surfaces (e.g., walking in a visually natural way between surfaces). In some embodiments, theAR system 110 includes applications instead of and/or in addition to thesurface distinction application 130. In some embodiments, thesurface distinction application 130 may be hosted on and/or executed by themobile client 100. In other embodiments, thesurface distinction application 130 is communicatively coupled to themobile client 100. - The
database 140 stores images or videos that may be used by thesurface distinction application 130 to identify surfaces within an image or video. Themobile client 100 may transmit images or videos collected by thecamera 102 during the execution of theAR client 101 to thedatabase 140. The data stored within thedatabase 140 may be collected from a single client (e.g., the mobile client 100) or multiple clients (e.g., other mobile clients that are communicatively coupled to theAR system 110 through the network 150). Thesurface distinction application 130 may use images and/or videos of environments stored in thedatabase 140 to train a model to classify objects within the environments. In turn, the classified objects may be used to determine a particular surface depicted in the image. The classification of objects within an image is further described in the description ofFIG. 2 . - The
network 150 transmits data between themobile client 100 and theAR system 110. Thenetwork 150 may be a local area and/or wide area network that uses wired and/or wireless communication systems, such as the internet. In some embodiments, thenetwork 150 includes encryption capabilities to ensure the security of data, such as secure sockets layer (SSL), transport layer security (TLS), virtual private networks (VPNs), internet protocol security (IPsec), etc. -
FIG. 2 is a block diagram of thesurface distinction application 130 ofFIG. 1 , in accordance with at least one embodiment. Thesurface distinction application 130 includes anetwork interface 210, acluster module 220, amesh generation module 230, and arendering module 240. In some embodiments, thesurface distinction application 130 includes modules other than those shown inFIG. 2 . For example, thesurface distinction application 130 may include a machine learning model trained to classify objects within images received from themobile client 100. The modules may be embodied as program code (e.g., software comprised of instructions stored on non-transitory computer readable storage medium and executable by at least one processor such as theprocessor 602 inFIG. 6 ) and/or hardware (e.g., application specific integrated circuit (ASIC) chips or field programmable gate arrays (FPGA) with firmware. The modules correspond to at least having the functionality described when executed/operated. - The
network interface 210 may be a communication interface for thesurface distinction application 130 to communicate with various components in theAR system 110, such as themobile client 100, theAR engine 120, and thedatabase 140. Themobile client 100 may transmit requests with data payloads to thesurface distinction application 130 via thenetwork interface 210. The requests may be to identify surfaces within an image, modify the display of AR objects (e.g., in response to user interactions with themobile client 100 with a request to control the state of an AR object), or any suitable action to provide an interactive AR experience using theAR client 101. The requests may have data payloads such as an image, a video, or information about a user interaction (e.g., coordinates on thescreen 103 that the user has interacted with in a request to control an AR object). Although not depicted inFIG. 2 , the “network interface” may be referred to as an “application interface” in embodiments where the surface distinction application is hosted and executed on themobile client 100. The application interface may be a communication interface for thesurface distinction application 130 to communication with various components in themobile client 100 such as thecamera 102 and thescreen 103. - Requests received by the
network interface 210 from themobile client 100 may be automatically generated during execution of the AR client 101 (e.g., during gameplay). Alternatively or additionally, the requests may be generated responsive to a user interaction with themobile client 110. For example, the user may tap thescreen 103 at a location corresponding to a book, and theAR client 101 may send a request to thesurface distinction application 130 to identify a surface, based on the user interaction, within an image depicting the table. The requests from themobile client 100 may identify themobile client 100 and/or theAR client 101. Other components within theAR system 110 may communicate with thesurface distinction application 130 via thenetwork interface 210. For example, theAR Engine 120 may provide feature points of an image to thesurface distinction application 130 via thenetwork interface 210. Thenetwork interface 210 may take various forms. For example, in some embodiments, thenetwork interface 210 takes the form of an application programming interface (API) such as REST (representational state transfer), SOAP (Simple Object Access Protocol), RPC (remote procedural call), or another suitable type. - The
network interface 210 may receive an image captured within the camera view of thecamera 102 of themobile client 100. Thenetwork interface 210 may transmit the image to theAR engine 120, which determines feature points in the image of the environment and provides the feature points to the cluster module 220 (e.g., via the network interface 210). Alternatively, themobile client 100 may provide the image to theAR engine 120, which subsequently provides feature points to thecluster module 220. The feature points may provide information about the content of an image for subsequent image processing, where the information indicates features of structures within the image such as surfaces, corners, points, objects, etc. The use of feature points by thesurface distinction application 130 is further described with respect to thecluster module 220 andmesh generation module 230. - The
cluster module 220 groups the feature points received from theAR engine 120 into clusters. The clusters may be indicative of distinct features in the image such as objects and surfaces (e.g., chairs, tables, walls, floors, etc.). An object depicted in an image from a camera view of the environment may be referred to herein as a “real-world object” as such objects have physical form and existence. In some embodiments, thecluster module 220 identifies clusters using Euclidean clustering, where feature points within a threshold distance of one another form a cluster. Thecluster module 220 may identify clusters using alternative methods, in other embodiments. Thecluster module 220 may generate a three-dimensional (3D) virtual coordinate space using the received feature points. Each feature point may be characterized by Cartesian coordinates, which may indicate a depth and a height of the feature point within the 3D virtual coordinate space. - In some embodiments, the surface distinction application 130 (e.g., via the cluster module 220) may store a first set of feature points of an first image corresponding to what is depicted in a camera view of a mobile device and subsequently discard the first set of feature points so that memory resources may be saved for a second set of feature points (e.g., when the camera view changes to depict different objects). For example, a first image may depict a table and the
cluster module 220 may receive a first set of feature points identified for the table. A subsequently received second image, may depict a chair and thecluster module 220 may receive a second set of feature points identified for the chair. Thecluster module 220 may determine, using the received feature points, that camera view of the mobile device is capturing different objects or a different view of the environment. For example, thecluster module 220 may compare the values of the first set and second set of feature points to determine that they correspond to different objects. Thecluster module 220 may determine to discard the feature points corresponding to objects that are not captured within the most recent camera view. For example, thecluster module 220 may discard the first set of feature points (e.g., responsive to determining that the first set and second set of feature points are different or exceed a threshold measure of difference). In this way, thesurface distinction application 130 may conserve memory resources. - A cluster indicates physical surfaces within the environment. The surfaces may be of a real-world object (e.g., the top of a table or seat of a chair) or standalone surfaces such as floors. Cluster attributes such as its size and location within the captured image relative to other clusters may correspond to similar attributes of the surface relative to other surfaces in the environment. For example, the
cluster module 220 may identify a cluster in an image of a room with a box. The identified cluster may correspond to the surfaces of the box, indicate a location of the box within the image of the room, and scale according to a size of the box in the image. For example, in response to an image where the box is closer, and therefore appears larger, to themobile client 100, the cluster corresponding to the box would also become larger (e.g., more feature points in the cluster). Feature point density may be used to determine the presence of an object. In some embodiments, feature point density is proportional to image texture, where an object with varying colors or surface features (e.g., protrusions, corners, ridges, etc.) may be associated with a higher density of clusters than a flat surface whose image texture does not vary as much. - In some embodiments, the
cluster module 220 identifies a plurality of clusters from the feature points received from theAR engine 120, e.g., each cluster corresponding to a real-world object in the camera view of the environment. In response, thecluster module 220 determines a size of each of the clusters. Based on the determined sizes, thecluster module 220 selects one of the clusters. For example, thecluster module 220 may select the largest cluster. In another example embodiment, thecluster module 220 receives user input provided by themobile client 100 via thenetwork interface 210. The user input may be a tap or swipe on thescreen 103, a click or drag with a computer cursor, or any other suitable user input via themobile client 100 indicating a selection of a real-world object displayed on thescreen 103. Thecluster module 220 subsequently selects a cluster corresponding to the selected real-world object. - The
cluster module 220 may distinguish clusters of feature points associated with real-world objects from clusters of feature points associated with flat, two-dimensional surfaces (2D) (e.g., walls, floors) in the camera view. In particular, thecluster module 220 applies a Random Sample Consensus (RANSAC) algorithm to the image to identify these 2D surfaces. In some embodiments, a custom matrix system application programming interface (API) enables thecluster module 220 to use the RANSAC methodology on themobile client 100. - The
mesh generation module 230 generates a 3D mesh for a real-world object or surface from its corresponding cluster. Themesh generation module 230 may use triangulation (e.g., Delaunay triangulation) to generate a 3D mesh from the cluster of feature points. In some embodiments, themesh generation module 230 builds a stencil shader and accordingly generates a 3D mesh that corresponds to bounds of the identified cluster. Therefore, the 3D mesh approximates the shape of the real-world object. Themesh generation module 230 may iterate through each cluster identified by thecluster module 220 to generate 3D meshes. In some embodiments, themesh generation module 230 may generate a 3D mesh of a user-specified cluster responsive to a user selection of a point or area within the image (e.g., using the mobile client 100) corresponding to the cluster. - In some embodiments, the
mesh generation module 230 may detect 2D surfaces depicted within an image. Themesh generation module 230 may determine features of the mesh (e.g., depth, convexity, etc.) and use the determined features to determine whether the mesh corresponds to a 3D image. Themesh generation module 230 allows surface distinction to be performed using a mesh rather than individual feature points. By using the generated mesh instead of individual feature points to detect surfaces, thesurface distinction application 130 is processing, for example, one 3D mesh as opposed to multiple feature points (e.g., millions of feature points). This way, thesurface distinction application 130 may optimize processing resources. - In one example, a cluster determined by the
cluster module 220 corresponds to a box. Themesh generation module 230 may generate a 3D mesh that is shaped similarly to the box. The 3D mesh of the box would correspond to a height, length, and width of the box. In addition, themesh generation module 230 may be configured to refine the density of the mesh. For example, a lower density mesh may provide the primary points for the 3D mesh of the object in terms of dimensions. The more refined, or higher density, mesh may allow for generating a 3D mesh that may also generate a corresponding texture of the object (e.g., ridges, bumps, etc.). Themesh generation module 230 provides the 3D mesh to therendering module 240 using, for example, a .NET bridge. - In some embodiments, the
mesh generation module 230 generates the 3D mesh for a cluster after classifying a real-world object within the received image. For example, thesurface distinction application 130 may include a machine learning model that classifies real-world objects in images of the environment. Themesh generation module 230 may apply an image to the machine learning model, which outputs a type for each of the real-world objects in the image. Themesh generation module 230 may generate the 3D mesh for the selected real-world object based on the type output by the machine learning model. Once themesh generation module 230 generates the 3D mesh, the module may determine coordinates of the 3D mesh within the 3D virtual coordinate space generated by thecluster module 220. These determined coordinates may be boundary coordinates outlining, for example, edges, surfaces, cavities, etc. of the 3D mesh as located within the virtual coordinate space representing the environment in the camera view. - In some embodiments, the
surface distinction application 130 includes a machine learning model training engine that accesses thedatabase 140 for images of real-world objects to train the machine learning model. The machine learning model training engine may generate training sets using the accessed images and labels identifying the objects depicted within the images. The training engine may then use the training sets to train the machine learning model. - The
rendering module 240 provides for display, on themobile client 100, an AR object that is generated, or “rendered,” by theAR engine 120. To provide the AR object for display at themobile client 100, therendering module 240 may determine a location on a virtual coordinate space to position the rendered AR object and transmit the location and the rendered AR object to themobile client 100 for display. - The
rendering module 240 may determine various states to display an AR object depending on an identified surface on which the AR object is rendered and/or a location within the identified surface. For example, therendering module 240 may render an AR avatar that appears to look down when the AR avatar is displayed at the edge of a surface. Therendering module 240 may determine that the edge of the surface has a coordinate within the virtual coordinate space with a non-zero Z-coordinate or value greater than a threshold Z-coordinate and display the AR avatar in a state where it appears to look down. Therendering module 240 may also provide an AR object for display that indicates the presence of other identified surfaces to which it may travel. For example, the AR avatar may be rendered to point to a table's surface to which it may climb. When the AR object is at a location on the virtual coordinate space that overlaps with the generated 3D mesh of a surface, therendering module 240 provides for display the AR object as interacting with the surface (e.g., standing on the surface, traveling on the surface, sitting on the surface, etc.). - In some embodiments, the
rendering module 240 provides the 3D meshes of the surfaces for display on themobile client 100. Themesh generation module 230 may provide the generated 3D meshes for display on the mobile client'sdisplay 103. In some embodiments, themesh generation module 230 provides an indication of identified surfaces for display by overlaying 3D meshes of the surfaces over the image from the camera view of themobile client 100. - The
rendering module 240 may receive, from themobile client 100, data indicative of user interactions indicating one or more locations at which an AR object is requested to be provided at thescreen 103. For example, a user may tap or swipe the screen 103 (e.g., a touchscreen display) and theAR client 101 may provide one or more points (e.g., coordinates corresponding to areas of the screen 103) to thesurface distinction application 210. Therendering module 240 then determines one or more corresponding coordinates on the 3D virtual coordinate space to provide the AR object for display. - The
rendering module 240 may determine a set of coordinates in the 3D virtual coordinate space that correspond to a path between two identified surfaces: a floor and a table. Therendering module 240 may use the Z-coordinates of respective identified surfaces (e.g., coordinates corresponding to the center each surface) within the 3D virtual coordinate space to determine height differences between two surfaces. In addition to determining a height difference, therendering module 240 may determine a depth difference using X and Y coordinates of the respective identified surfaces. Using the determined height and depth differences, therendering module 240 may determine a shortest path between the centers of each surface. Therendering module 240 may provide a path (e.g., stepping stones) along this shortest path for display at themobile client 100. For example, therendering module 240 requests theAR engine 120 render AR stepping stones and subsequently transmits the rendered stepping stones to themobile client 100 for display. - The state that the
rendering module 240 determines to display an AR object may include the size or ratio of the AR object, which may vary depending on the location within the 3D virtual coordinate space that the AR object is placed. Continuing the previous example, therendering module 240 may display the AR avatar, as it travels along the AR stepping stones, in an increasingly larger size ratios if the avatar is traveling towards the user or a decreasingly smaller size ratios if the avatar is traveling away. Thus, therendering module 240 promotes a more visually natural display of the AR objects as they interact with identified surfaces. - The
rendering module 240 may render instructions for display to direct a user to perform a particular user interaction. The rendered instructions may include text, pictures, animations, AR objects, etc. For example, therendering module 240 may provide an arrow for display pointing from one surface to another, indicating that the user make a swiping motion on thescreen 103 to follow the arrow and cause a set of AR steps to be generated and thus, creating a path between the two surfaces for an AR avatar to travel between. Therendering module 240 may determine the instructions using the coordinates of the 3D meshes corresponding to the surfaces. For example, therendering module 240 may use the Cartesian coordinates of two points within respective surface meshes to determine a line in the 3D virtual coordinate space between the two meshes. Therendering module 240 may then use the line to determine a location to provide the instructional arrow for display. -
FIG. 3 is a flowchart illustrating anexample process 300 for providing an AR object for display based on surface distinction, in accordance with at least one embodiment. Theprocess 300 may be performed by thesurface distinction application 130. Thesurface distinction application 130 may perform operations of theprocess 300 in parallel or in different orders, or may perform different, additional, or fewer steps. For example, prior to identifying 306 the clusters, thesurface distinction application 130 may receive, from the mobile client, an request based on a user interaction identify a particular surface (e.g., the user taps on a book displayed in an image shown on the screen 103) and an identified 306 cluster corresponds to the user's selected surface. Furthermore, while an image is referenced in theprocess 300, the surface distinction application may also be applied to videos. Theprocess 300 may be performed by a surface distinction application hosted on a mobile client or hosted on a remote server that is communicatively coupled to the mobile client. - The
surface distinction application 130 receives 302 an image from a camera of a mobile client, where the image depicts planar surfaces. As refer to herein, “planar surfaces” are 2D surfaces with minimal or no texture (e.g., cavities, ridges, protrusions, etc.). The terms “planar surface” and “surface” may be used interchangeably herein unless the context in which either term is used indicates otherwise. Planar surfaces may include relatively flat surfaces such as floors, walls, surfaces of flat objects (e.g., a top of a box) or semi-flat objects (e.g., a page of an open book), or any suitable 2D surface. The planar surface may have a slope. For example, the planar surface may be the exterior of a triangular roof. In one example, themobile client 100 transmits an image from a camera view (e.g., field of view from the camera 102) of a park with grass, a bench, and a tree. Planar surfaces depicted in this image include the grass and the bench (e.g., the seat and the back of the bench). Thesurface distinction application 130 may provide the received image to theAR engine 120 to identify feature points within the image. An interface of the surface distinction application such as thenetwork interface 210 or the application interface (e.g., for surface distinction applications hosted on the mobile client 100) may receive 302 the image. - The
surface distinction application 130 receives 304 feature points associated with the planar surfaces. TheAR engine 120 may identify feature points within the received image to provide to thesurface distinction application 130. The received feature points may be characterized by coordinates to indicate the position of distinct, 3D features within the 2D image relative to other objects within the image. For example, a coordinate corresponding to a point on a bench within an image of a park may be represented by Cartesian coordinates. The coordinate may indicate the depth of the point on the bench. The depth may be represented as a distance away from a reference point (e.g., an origin in the 3D virtual coordinate space), which may be a point on an object within the image or thecamera 102. An interface of the surface distinction application such as thenetwork interface 210 or the application interface (e.g., for surface distinction applications hosted on the mobile client 100) may receive 304 the feature points. - The
surface distinction application 130 identifies 306 clusters based on the feature points, each cluster corresponding to a respective planar surface of the planar surfaces. Thecluster module 220 may apply a RANSAC algorithm to the image to identify clusters of feature points that correspond to the respective planar surfaces. For example, thecluster module 220 identifies a first cluster of feature points identified in the image of the park that correspond to where in the image that grass is depicted and a second cluster of feature points that correspond to the bench seat. - The
surface distinction application 130 determines 308, based on the identified clusters, locations of the planar surfaces in a 3D virtual coordinate space. Thecluster module 220 may generate a 3D virtual coordinate space using the received feature points. Using the generated 3D virtual coordinate space,mesh generation module 230 may determine the coordinates of the identified clusters, where those coordinates indicate the locations of the planar surfaces. Themesh generation module 230 may generate meshes using the feature points in the identified clusters and determine additional coordinates outlining the shape of the surface. For example, themesh generation module 230 determines the coordinates of the feature points in a cluster representing a bench in an image of a park and generates, using those feature points, a 3D mesh (e.g., using Delaunay triangulation) where the 3D mesh is characterized by more coordinates than the feature points in the cluster. Each coordinate composing the 3D mesh may be a location of a planar surface of the bench. - The
surface distinction application 130 provides 310 for display at the mobile client an AR object, where the AR object is provided for display at a location of the determined locations. Therendering module 240 may provide 310 an AR object (e.g., an AR ball) for display at themobile client 100, where the AR ball is displayed as resting on the surface of a real-world bench captured within the camera view of themobile client 100. In this example, themesh generation module 230 may have determined at least one location of a surface of the bench (e.g., using coordinates of the 3D mesh of the bench within the 3D virtual coordinate space). The AR ball may be rendered by theAR engine 120 and positioned, by therendering module 240, within the 3D virtual coordinate space at a location that is equivalent to a coordinate of the 3D mesh of the bench such that the AR ball appears to rest atop a surface of the bench. -
FIGS. 4A and 4B are flowcharts illustrating anexample process 400 for providing an AR object for display based on theprocess 300 ofFIG. 3 , in accordance with at least one embodiment. Theprocess 400 may be performed by thesurface distinction application 130. Thesurface distinction application 130 may perform operations of theprocess 400 in parallel or in different orders, or may perform different, additional, or fewer steps. Furthermore, while an image is referenced in theprocess 400, the surface distinction application may also be applied to videos. - The
surface distinction application 130 generates 402 a 3D virtual coordinate space following receiving 304 feature points associated with planar surfaces (e.g., from the AR engine 120). Thecluster module 220 may use the received feature points to generate the coordinate space. The feature points may be characterized by a coordinate system when received such that the feature points indicate at least a height and a depth for features in an image. Thecluster module 220 may use the coordinate system provided, convert the feature points to another coordinate system, or perform coordinate transformation to rotate axes. - The
surface distinction application 130 identifies 404 and 406 a first and second cluster, respectively, from the feature points. Thecluster module 220 may performidentification process 300, thecluster module 220 may identify 404 a first cluster from the feature points corresponding to grass in an image of a park and identify 406 a second cluster from the feature points corresponding to a bench in the image. - By way of example, the
surface distinction application 130 generates 408 and 410 a first and a second mesh, respectively. Themesh generation module 230 may generate 408 a first 3D mesh from the first cluster and generate 410 a second 3D mesh from the second cluster. For example, the first mesh represents the grass identified by the first cluster of feature points and the second mesh represents the bench identified by the second cluster of feature points. The first mesh may be associated with a first Z-coordinate in the 3D virtual coordinate space. The first mesh may include a feature point with a Z-coordinate of 0 to indicate a height of the grass in the image relative to other objects (e.g., a mesh with a negative Z-coordinate has a feature that is below the point at the grass with a Z-coordinate of 0). The second mesh may be associated with a second Z-coordinate in the 3D virtual coordinate space. The second mesh may include a feature point with a non-zero Z-coordinate to indicate that a point on the bench has a surface that is higher than the grass at the Z-coordinate of 0. The 3D meshes may also be similarly characterized by X and Y coordinates, which may be used to indicate depths of surfaces relative to other surfaces in the image. - The
surface distinction application 130 determines 412 a height difference between a first planar surface and a second planar surface based on the first and second Z-coordinates. Themesh generation module 230 may determine 412 a difference between two Z-coordinates to determine the height difference between two surfaces at the respective points. For example, themesh generation module 230 determines a height difference between grass and a bench depicted in an image of a park using Z-coordinates of feature points of the respective 3D meshes of the grass and bench. - The
surface distinction application 130 provides 414 for display at the mobile client indication of respective locations of the first and second planar surfaces. The indications of the locations of surfaces identified by thesurface distinction application 130 may be the 3D meshes generated by themesh generation module 230. The rendermodule 240 may provide 414 3D meshes of identified planar surfaces for display at the mobile client 100 (e.g., using the screen 103), where the 3D meshes are displayed overlaying the corresponding surfaces. For example, the rendermodule 240 may provide 414 a 3D mesh of the grass overlaying the grass in the image of the park and provide 414 a 3D mesh of the bench overlaying the surface of the bench. The user of themobile client 100 may see, on thescreen 103, the image and the 3D meshes overlaying the image. - The
surface distinction application 130 receives 416 a user interaction between the first and second planar surfaces. In one example, thenetwork interface 210 receives 416 a user interaction of a swipe of the user's finger across thescreen 103 between a starting coordinate at the 3D mesh corresponding to the grass and an ending coordinate at the 3D mesh corresponding to the bench. - The surface distinction application provides 418 for display at the mobile client an AR object. The AR object may be configured to interact with the first and second planar surfaces based on the user interaction. For example, the
- The surface distinction application provides 418 for display at the mobile client an AR object. The AR object may be configured to interact with the first and second planar surfaces based on the user interaction. For example, the rendering module 240 provides 418 for display an AR avatar at the mobile client 100 that is configured to travel between the grass and the bench at the locations where the user's finger swiped across the image. The rendering module 240 may provide 418 the AR object for display as a part of providing 310 for display the AR object at a determined location of a planar surface in the 3D virtual coordinate space. - Example AR Application with Surface Distinction
-
FIGS. 5A, 5B, 5C, 5D, and 5E illustrate an example process for controlling an AR object based on user interactions, at a mobile client, with planar surfaces detected by an AR system, in accordance with at least one embodiment. Various environments, both real and virtual, are depicted in FIGS. 5A-5E. As referred to herein, an “environment” may refer to a real-world environment and a “virtual environment” may refer to an environment that has been captured by a computer (e.g., via imaging) for processing and may not necessarily be presented to the user of the mobile client 100. Virtual environments are presented herein to promote clarity in describing the process for identifying surfaces. -
FIG. 5A shows an environment 500 a of a living room where example objects and respective first, second, and third surfaces 510, 511, and 512 are depicted. The first surface 510 is the floor of the living room, the second surface 511 is a surface of a table on the surface 510, and the third surface 512 is a surface of a book on the surface 511. Although not shown in FIG. 5A, a mobile client 100 may capture an image or video of the environment 500 a while using the AR client 101. A captured image may be transmitted to the surface distinction application 130 for processing, as shown in FIGS. 5B and 5C. The network interface 210 may receive this image. -
FIG. 5B shows a virtual environment 500 b where feature points 520 have been identified within an image of the environment 500 a that includes depictions of the surfaces 510, 511, and 512 (for clarity, not every feature point 520 is labeled in FIG. 5B). The feature points 520 may indicate corners and edges of surfaces depicted within the image, which may correspond to boundary points of surfaces. For example, the surface 511 corresponding to the top of a table is represented by feature points. The cluster module 220 may apply a RANSAC algorithm to identify a 2D surface, from the feature points, that corresponds to the top of the table. The cluster module 220 may generate a 3D virtual coordinate space using the feature points 520.
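- By way of illustration, a bare-bones RANSAC-style plane fit over feature points might look like the following; the iteration count, inlier tolerance, and plane representation are assumptions for this sketch rather than the specific algorithm applied by the cluster module 220.

    import random

    def fit_plane(p1, p2, p3):
        """Plane through three points, returned as (normal, d) such that
        normal . x + d = 0; the normal is left unnormalized here."""
        ux, uy, uz = (p2[i] - p1[i] for i in range(3))
        vx, vy, vz = (p3[i] - p1[i] for i in range(3))
        normal = (uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx)  # cross product
        d = -sum(normal[i] * p1[i] for i in range(3))
        return normal, d

    def ransac_plane(points, iterations=200, tolerance=0.02):
        """Toy RANSAC: repeatedly fit a plane to three randomly sampled points
        and keep the plane supported by the most inliers within tolerance."""
        best_inliers = []
        for _ in range(iterations):
            n, d = fit_plane(*random.sample(points, 3))
            norm = sum(c * c for c in n) ** 0.5
            if norm == 0:  # degenerate (collinear) sample; skip it
                continue
            inliers = [p for p in points
                       if abs(sum(n[i] * p[i] for i in range(3)) + d) / norm <= tolerance]
            if len(inliers) > len(best_inliers):
                best_inliers = inliers
        return best_inliers  # feature points lying on the dominant plane, e.g., the table top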
- FIG. 5C shows a virtual environment 500 c where first, second and third 3D meshes 530, 531, and 532 have been identified within the environment 500 b for the respective first, second, and third surfaces 510, 511, and 512. The mesh generation module 230 may use clusters of feature points 520 shown in environment 500 b to generate the 3D meshes for the surfaces depicted within the image of environment 500 a. The rendering module 240 may provide the generated meshes for display at the mobile client 100 to indicate to a user the surfaces identified by the surface distinction application 130. -
FIG. 5D shows an environment 500 d where a mobile client 100 is executing the AR client 101 while capturing an image of the living room of environment 500 a. The mobile client 100 displays depictions of the surfaces 510, 511, and 512, an AR avatar 540, and one or more AR stepping stones 541. The rendering module 240 may provide the AR avatar 540 and the AR stepping stones 541 for display on the mobile client 100. The user of the mobile client 100 may swipe his finger across the display, where the swipe may be characterized by a starting coordinate on the 3D virtual coordinate space corresponding to the surface 510 of the living room floor and an ending coordinate corresponding to the surface 511 of the top of the table. This user interaction may be received by the network interface 210 as a request for the rendering module 240 to render the AR stepping stones 541 along the path of the swipe. The rendering module 240 may determine a path within the 3D coordinate space that maps to a line between the starting and ending coordinates and generate the AR stepping stones 541 along the determined line.
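- A short sketch, under the assumption that the starting and ending coordinates of the swipe are already expressed in the 3D virtual coordinate space, of placing evenly spaced stepping-stone positions along the line between them (the example coordinates are hypothetical):

    def stepping_stone_positions(start, end, count=5):
        """Evenly spaced positions along the straight line from start to end
        in the 3D virtual coordinate space (the endpoints are excluded)."""
        return [tuple(s + (e - s) * (i / (count + 1)) for s, e in zip(start, end))
                for i in range(1, count + 1)]

    floor_point = (0.0, 0.0, 0.0)   # starting coordinate on the floor (surface 510)
    table_point = (1.0, 0.6, 0.75)  # ending coordinate on the table top (surface 511)
    for position in stepping_stone_positions(floor_point, table_point):
        print(position)  # e.g., render one AR stepping stone 541 at each position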
- FIG. 5E shows an environment 500 e where the mobile client 100 displays the AR avatar 540 at a location of an identified surface 511, the top of the table. Although not shown, the user may have performed an additional user interaction with the display of the mobile client 100 to request that the AR avatar 540 move up the AR stepping stones 541. The network interface 210 may receive the user interaction and the rendering module 240 may provide the AR avatar 540 in a state depicted as traveling from the floor to the table. The user may continue to perform user interactions such as tapping the display of the mobile client 100 where the surface 512 of the book is depicted to request that the AR avatar 540 interact with the surface 512 (e.g., sit on the book). The rendering module 240 may decrease the size of the AR objects depending on their location within the 3D virtual coordinate space to promote a natural appearance when displayed on the mobile client 100. For example, as the AR avatar 540 moves from the floor to the table, the rendering module 240 may decrease the size ratio of the AR avatar 540 to indicate that the AR object is moving away from the user and towards the table.
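- As an illustrative sketch of the size adjustment described above, the following assumes a simple inverse-distance scaling model; the model, the camera position, and the example coordinates are assumptions for this example and are not part of the described method.

    def scaled_size(base_size, camera_position, object_position, reference_distance=1.0):
        """Shrink an AR object's displayed size in proportion to its distance
        from the camera, relative to a reference distance (simple 1/d model)."""
        distance = sum((c - o) ** 2 for c, o in zip(camera_position, object_position)) ** 0.5
        return base_size * reference_distance / max(distance, 1e-6)

    camera = (0.0, 0.0, 1.6)            # assumed camera height above the floor
    avatar_on_floor = (0.5, 0.5, 0.0)
    avatar_on_table = (2.0, 1.5, 0.75)
    print(scaled_size(1.0, camera, avatar_on_floor))  # larger: the avatar is close to the user
    print(scaled_size(1.0, camera, avatar_on_table))  # smaller: the avatar has moved away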
- FIG. 6 is a block diagram illustrating components of an example machine able to read instructions from a machine-readable medium and execute them in a processor (or controller). Specifically, FIG. 6 shows a diagrammatic representation of a machine in the example form of a computer system 600 within which program code (e.g., software) for causing the machine to perform any one or more of the methodologies discussed herein may be executed. The program code may correspond to the functional configuration of the modules and/or processes described with FIGS. 1-5E. The program code may be comprised of instructions 624 executable by one or more processors 602. In alternative embodiments, the machine operates as a standalone device or may be connected (e.g., networked) to other machines. In a networked deployment, the machine may operate in the capacity of a server machine or a client machine in a server-client network environment, or as a peer machine in a peer-to-peer (or distributed) network environment. - The machine may be a portable computing device or machine (e.g., smartphone, tablet, wearable device (e.g., smartwatch)) capable of executing instructions 624 (sequential or otherwise) that specify actions to be taken by that machine. Further, while only a single machine is illustrated, the term “machine” shall also be taken to include any collection of machines that individually or jointly execute
instructions 624 to perform any one or more of the methodologies discussed herein. - The
example computer system 600 includes at least one processor 602 (e.g., a central processing unit (CPU), a graphics processing unit (GPU), a digital signal processor (DSP), one or more application specific integrated circuits (ASICs), one or more radio-frequency integrated circuits (RFICs), or any combination of these), a main memory 604, and a static memory 606, which are configured to communicate with each other via a bus 608. The computer system 600 may further include visual display interface 610. The visual interface may include a software driver that enables displaying user interfaces on a screen (or display). The visual interface may display user interfaces directly (e.g., on the screen) or indirectly on a surface, window, or the like (e.g., via a visual projection unit). For ease of discussion the visual interface may be described as a screen. The visual interface 610 may include or may interface with a touch enabled screen. The computer system 600 may also include alphanumeric input device 612 (e.g., a keyboard or touch screen keyboard), a cursor control device 614 (e.g., a mouse, a trackball, a joystick, a motion sensor, or other pointing instrument), a storage unit 616, a signal generation device 618 (e.g., a speaker), and a network interface device 620, which also are configured to communicate via the bus 608. - The
storage unit 616 includes a machine-readable medium 622 on which is stored instructions 624 (e.g., software) embodying any one or more of the methodologies or functions described herein. The instructions 624 (e.g., software) may also reside, completely or at least partially, within the main memory 604 or within the processor 602 (e.g., within a processor's cache memory) during execution thereof by the computer system 600, the main memory 604 and the processor 602 also constituting machine-readable media. The instructions 624 (e.g., software) may be transmitted or received over a network 626 via the network interface device 620. - While machine-
readable medium 622 is shown in an example embodiment to be a single medium, the term “machine-readable medium” should be taken to include a single medium or multiple media (e.g., a centralized or distributed database, or associated caches and servers) able to store instructions (e.g., instructions 624). The term “machine-readable medium” shall also be taken to include any medium that is capable of storing instructions (e.g., instructions 624) for execution by the machine and that cause the machine to perform any one or more of the methodologies disclosed herein. The term “machine-readable medium” includes, but is not limited to, data repositories in the form of solid-state memories, optical media, and magnetic media. - While using an AR application (e.g., an AR game), a user may want to have an AR object interact with surfaces in the real-world environment. Conventional implementations for mobile clients do not allow for surface distinction. Since conventional systems are unable to spatially differentiate surfaces, they limit AR objects to interacting with a single surface and may cause AR objects to be displayed disproportionately to a particular surface without knowledge of the depth or height of one surface relative to another. The methods described herein detect surfaces within an image and determine spatial relationships between the surfaces through clustering and mesh generation. Furthermore, the methods described herein may selectively store feature point data corresponding to the most recent camera view of the phone. For example, a user may direct their mobile phone's camera at a table, but then turn around to capture a chair. The feature points corresponding to the table may be discarded. Thus, by storing information from a recent camera view, the methods described herein promote efficient memory use and, as a consequence, promote efficient power and processing use as well by not expending those resources on the discarded data. Additionally, the methods herein include a clustering algorithm that scans identified feature points, groups the ones closest together, and generates a mesh. By using a generated mesh instead of individual feature points to detect surfaces, processing resources of a device are optimized (e.g., processing one mesh vs. millions of feature points). Accordingly, the methods described herein enable surface distinction on mobile client rendered AR systems without consuming excessive amounts of processing power, thus presenting an immersive gaming experience to the user.
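- A simplified sketch of discarding stored feature points that no longer appear in the most recent camera view; matching points by coordinate equality within a tolerance is an assumption made for this example (a real system might instead test stored points against the current camera frustum):

    def retain_recent_view(stored_points, current_view_points, tolerance=1e-6):
        """Keep only stored feature points that also appear in the most recent
        camera view, discarding the rest to bound memory use."""
        def near(a, b):
            return all(abs(x - y) <= tolerance for x, y in zip(a, b))
        return [p for p in stored_points
                if any(near(p, q) for q in current_view_points)]

    table_points = [(0.0, 0.0, 0.7), (0.1, 0.0, 0.7)]
    chair_points = [(2.0, 1.0, 0.4), (2.1, 1.0, 0.4)]
    stored = table_points + chair_points
    # After the user turns from the table toward the chair, only the chair's
    # feature points remain in view, so the table's points are discarded.
    stored = retain_recent_view(stored, chair_points)
    print(stored)  # [(2.0, 1.0, 0.4), (2.1, 1.0, 0.4)]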
- Throughout this specification, plural instances may implement components, operations, or structures described as a single instance. Although individual operations of one or more methods are illustrated and described as separate operations, one or more of the individual operations may be performed concurrently, and nothing requires that the operations be performed in the order illustrated. Structures and functionality presented as separate components in example configurations may be implemented as a combined structure or component. Similarly, structures and functionality presented as a single component may be implemented as separate components. These and other variations, modifications, additions, and improvements fall within the scope of the subject matter herein.
- Certain embodiments are described herein as including logic or a number of components, modules, or mechanisms. Modules may constitute either software modules (e.g., code embodied on a machine-readable medium or in a transmission signal) or hardware modules. A hardware module is a tangible unit capable of performing certain operations and may be configured or arranged in a certain manner. In example embodiments, one or more computer systems (e.g., a standalone, client or server computer system) or one or more hardware modules of a computer system (e.g., a processor or a group of processors) may be configured by software (e.g., an application or application portion) as a hardware module that operates to perform certain operations as described herein.
- In various embodiments, a hardware module may be implemented mechanically or electronically. For example, a hardware module may comprise dedicated circuitry or logic that is permanently configured (e.g., as a special-purpose processor, such as a field programmable gate array (FPGA) or an application-specific integrated circuit (ASIC)) to perform certain operations. A hardware module may also comprise programmable logic or circuitry (e.g., as encompassed within a general-purpose processor or other programmable processor) that is temporarily configured by software to perform certain operations. It will be appreciated that the decision to implement a hardware module mechanically, in dedicated and permanently configured circuitry, or in temporarily configured circuitry (e.g., configured by software) may be driven by cost and time considerations.
- Accordingly, the term “hardware module” should be understood to encompass a tangible entity, be that an entity that is physically constructed, permanently configured (e.g., hardwired), or temporarily configured (e.g., programmed) to operate in a certain manner or to perform certain operations described herein. As used herein, “hardware-implemented module” refers to a hardware module. Considering embodiments in which hardware modules are temporarily configured (e.g., programmed), each of the hardware modules need not be configured or instantiated at any one instance in time. For example, where the hardware modules comprise a general-purpose processor configured using software, the general-purpose processor may be configured as respective different hardware modules at different times. Software may accordingly configure a processor, for example, to constitute a particular hardware module at one instance of time and to constitute a different hardware module at a different instance of time.
- Hardware modules can provide information to, and receive information from, other hardware modules. Accordingly, the described hardware modules may be regarded as being communicatively coupled. Where multiple of such hardware modules exist contemporaneously, communications may be achieved through signal transmission (e.g., over appropriate circuits and buses) that connect the hardware modules. In embodiments in which multiple hardware modules are configured or instantiated at different times, communications between such hardware modules may be achieved, for example, through the storage and retrieval of information in memory structures to which the multiple hardware modules have access. For example, one hardware module may perform an operation and store the output of that operation in a memory device to which it is communicatively coupled. A further hardware module may then, at a later time, access the memory device to retrieve and process the stored output. Hardware modules may also initiate communications with input or output devices, and can operate on a resource (e.g., a collection of information).
- The various operations of example methods described herein may be performed, at least partially, by one or more processors that are temporarily configured (e.g., by software) or permanently configured to perform the relevant operations. Whether temporarily or permanently configured, such processors may constitute processor-implemented modules that operate to perform one or more operations or functions. The modules referred to herein may, in some example embodiments, comprise processor-implemented modules.
- Similarly, the methods described herein may be at least partially processor-implemented. For example, at least some of the operations of a method may be performed by one or more processors or processor-implemented hardware modules. The performance of certain of the operations may be distributed among the one or more processors, not only residing within a single machine, but deployed across a number of machines. In some example embodiments, the processor or processors may be located in a single location (e.g., within a home environment, an office environment or as a server farm), while in other embodiments the processors may be distributed across a number of locations.
- The one or more processors may also operate to support performance of the relevant operations in a “cloud computing” environment or as a “software as a service” (SaaS). For example, at least some of the operations may be performed by a group of computers (as examples of machines including processors), these operations being accessible via a network (e.g., the Internet) and via one or more appropriate interfaces (e.g., application program interfaces (APIs)).
- The performance of certain of the operations may be distributed among the one or more processors, not only residing within a single machine, but deployed across a number of machines. In some example embodiments, the one or more processors or processor-implemented modules may be located in a single geographic location (e.g., within a home environment, an office environment, or a server farm). In other example embodiments, the one or more processors or processor-implemented modules may be distributed across a number of geographic locations.
- Some portions of this specification are presented in terms of algorithms or symbolic representations of operations on data stored as bits or binary digital signals within a machine memory (e.g., a computer memory). These algorithms or symbolic representations are examples of techniques used by those of ordinary skill in the data processing arts to convey the substance of their work to others skilled in the art. As used herein, an “algorithm” is a self-consistent sequence of operations or similar processing leading to a desired result. In this context, algorithms and operations involve physical manipulation of physical quantities. Typically, but not necessarily, such quantities may take the form of electrical, magnetic, or optical signals capable of being stored, accessed, transferred, combined, compared, or otherwise manipulated by a machine. It is convenient at times, principally for reasons of common usage, to refer to such signals using words such as “data,” “content,” “bits,” “values,” “elements,” “symbols,” “characters,” “terms,” “numbers,” “numerals,” or the like. These words, however, are merely convenient labels and are to be associated with appropriate physical quantities.
- Unless specifically stated otherwise, discussions herein using words such as “processing,” “computing,” “calculating,” “determining,” “presenting,” “displaying,” or the like may refer to actions or processes of a machine (e.g., a computer) that manipulates or transforms data represented as physical (e.g., electronic, magnetic, or optical) quantities within one or more memories (e.g., volatile memory, non-volatile memory, or a combination thereof), registers, or other machine components that receive, store, transmit, or display information.
- As used herein any reference to “one embodiment” or “an embodiment” means that a particular element, feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment. The appearances of the phrase “in one embodiment” in various places in the specification are not necessarily all referring to the same embodiment.
- Some embodiments may be described using the expression “coupled” and “connected” along with their derivatives. It should be understood that these terms are not intended as synonyms for each other. For example, some embodiments may be described using the term “connected” to indicate that two or more elements are in direct physical or electrical contact with each other. In another example, some embodiments may be described using the term “coupled” to indicate that two or more elements are in direct physical or electrical contact. The term “coupled,” however, may also mean that two or more elements are not in direct contact with each other, but yet still co-operate or interact with each other. The embodiments are not limited in this context.
- As used herein, the terms “comprises,” “comprising,” “includes,” “including,” “has,” “having” or any other variation thereof, are intended to cover a non-exclusive inclusion. For example, a process, method, article, or apparatus that comprises a list of elements is not necessarily limited to only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Further, unless expressly stated to the contrary, “or” refers to an inclusive or and not to an exclusive or. For example, a condition A or B is satisfied by any one of the following: A is true (or present) and B is false (or not present), A is false (or not present) and B is true (or present), and both A and B are true (or present).
- In addition, use of the “a” or “an” is employed to describe elements and components of the embodiments herein. This is done merely for convenience and to give a general sense of the invention. This description should be read to include one or at least one and the singular also includes the plural unless it is obvious that it is meant otherwise.
- Upon reading this disclosure, those of skill in the art will appreciate still additional alternative structural and functional designs for a system and a process for surface distinction in an augmented reality system executed on a mobile client through the disclosed principles herein. Thus, while particular embodiments and applications have been illustrated and described, it is to be understood that the disclosed embodiments are not limited to the precise construction and components disclosed herein. Various modifications, changes and variations, which will be apparent to those skilled in the art, may be made in the arrangement, operation and details of the method and apparatus disclosed herein without departing from the spirit and scope defined in the appended claims.
Claims (21)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US17/170,431 US20210248826A1 (en) | 2020-02-07 | 2021-02-08 | Surface distinction for mobile rendered augmented reality |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US202062971766P | 2020-02-07 | 2020-02-07 | |
US17/170,431 US20210248826A1 (en) | 2020-02-07 | 2021-02-08 | Surface distinction for mobile rendered augmented reality |
Publications (1)
Publication Number | Publication Date |
---|---|
US20210248826A1 true US20210248826A1 (en) | 2021-08-12 |
Family
ID=77177475
Family Applications (3)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US17/170,431 Abandoned US20210248826A1 (en) | 2020-02-07 | 2021-02-08 | Surface distinction for mobile rendered augmented reality |
US17/170,629 Active US11393176B2 (en) | 2020-02-07 | 2021-02-08 | Video tools for mobile rendered augmented reality game |
US17/170,255 Abandoned US20210247846A1 (en) | 2020-02-07 | 2021-02-08 | Gesture tracking for mobile rendered augmented reality |
Family Applications After (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US17/170,629 Active US11393176B2 (en) | 2020-02-07 | 2021-02-08 | Video tools for mobile rendered augmented reality game |
US17/170,255 Abandoned US20210247846A1 (en) | 2020-02-07 | 2021-02-08 | Gesture tracking for mobile rendered augmented reality |
Country Status (1)
Country | Link |
---|---|
US (3) | US20210248826A1 (en) |
Families Citing this family (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPWO2020129115A1 (en) * | 2018-12-17 | 2021-11-04 | 株式会社ソニー・インタラクティブエンタテインメント | Information processing system, information processing method and computer program |
US11475639B2 (en) * | 2020-01-03 | 2022-10-18 | Meta Platforms Technologies, Llc | Self presence in artificial reality |
US20210365896A1 (en) * | 2020-05-21 | 2021-11-25 | HUDDL Inc. | Machine learning (ml) model for participants |
US11861315B2 (en) * | 2021-04-21 | 2024-01-02 | Meta Platforms, Inc. | Continuous learning for natural-language understanding models for assistant systems |
US11295503B1 (en) | 2021-06-28 | 2022-04-05 | Facebook Technologies, Llc | Interactive avatars in artificial reality |
US11789544B2 (en) * | 2021-08-19 | 2023-10-17 | Meta Platforms Technologies, Llc | Systems and methods for communicating recognition-model uncertainty to users |
US20230230152A1 (en) * | 2022-01-14 | 2023-07-20 | Shopify Inc. | Systems and methods for generating customized augmented reality video |
US11899846B2 (en) * | 2022-01-28 | 2024-02-13 | Hewlett-Packard Development Company, L.P. | Customizable gesture commands |
US11886767B2 (en) | 2022-06-17 | 2024-01-30 | T-Mobile Usa, Inc. | Enable interaction between a user and an agent of a 5G wireless telecommunication network using augmented reality glasses |
US20240087251A1 (en) * | 2022-09-09 | 2024-03-14 | Shopify Inc. | Methods for calibrating augmented reality scenes |
KR20240065988A (en) * | 2022-11-07 | 2024-05-14 | 삼성전자주식회사 | An augmented reality device for detecting an object by using an artificial intelligence model included in an external device |
Family Cites Families (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11861906B2 (en) * | 2014-02-28 | 2024-01-02 | Genius Sports Ss, Llc | Data processing systems and methods for enhanced augmentation of interactive video content |
US10521671B2 (en) * | 2014-02-28 | 2019-12-31 | Second Spectrum, Inc. | Methods and systems of spatiotemporal pattern recognition for video content development |
EP3551304A1 (en) * | 2016-12-09 | 2019-10-16 | Unity IPR APS | Creating, broadcasting, and viewing 3d content |
US11145125B1 (en) * | 2017-09-13 | 2021-10-12 | Lucasfilm Entertainment Company Ltd. | Communication protocol for streaming mixed-reality environments between multiple devices |
US11276216B2 (en) * | 2019-03-27 | 2022-03-15 | Electronic Arts Inc. | Virtual animal character generation from image or video data |
US10983662B2 (en) * | 2019-04-01 | 2021-04-20 | Wormhole Labs, Inc. | Distally shared, augmented reality space |
US11887253B2 (en) * | 2019-07-24 | 2024-01-30 | Electronic Arts Inc. | Terrain generation and population system |
US20210275908A1 (en) * | 2020-03-05 | 2021-09-09 | Advanced Micro Devices, Inc. | Adapting encoder resource allocation based on scene engagement information |
US11494875B2 (en) * | 2020-03-25 | 2022-11-08 | Nintendo Co., Ltd. | Systems and methods for machine learned image conversion |
Patent Citations (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20200111255A1 (en) * | 2018-10-05 | 2020-04-09 | Magic Leap, Inc. | Rendering location specific virtual content in any location |
US20200211286A1 (en) * | 2019-01-02 | 2020-07-02 | The Boeing Company | Augmented Reality System Using Enhanced Models |
US20210358150A1 (en) * | 2019-03-27 | 2021-11-18 | Guangdong Oppo Mobile Telecommunications Corp., Ltd. | Three-dimensional localization method, system and computer-readable storage medium |
US20220036648A1 (en) * | 2019-04-12 | 2022-02-03 | Guangdong Oppo Mobile Telecommunications Corp., Ltd. | Method and terminal device for determining occluded area of virtual object |
US20200342670A1 (en) * | 2019-04-26 | 2020-10-29 | Google Llc | System and method for creating persistent mappings in augmented reality |
US20200364937A1 (en) * | 2019-05-16 | 2020-11-19 | Subvrsive, Inc. | System-adaptive augmented reality |
US20210134064A1 (en) * | 2019-10-31 | 2021-05-06 | Magic Leap, Inc. | Cross reality system with quality information about persistent coordinate frames |
US20210142580A1 (en) * | 2019-11-12 | 2021-05-13 | Magic Leap, Inc. | Cross reality system with localization service and shared location-based content |
US20210256768A1 (en) * | 2020-02-13 | 2021-08-19 | Magic Leap, Inc. | Cross reality system with prioritization of geolocation information for localization |
US20210256766A1 (en) * | 2020-02-13 | 2021-08-19 | Magic Leap, Inc. | Cross reality system for large scale environments |
US20210256755A1 (en) * | 2020-02-13 | 2021-08-19 | Magic Leap, Inc. | Cross reality system with map processing using multi-resolution frame descriptors |
US20210256767A1 (en) * | 2020-02-13 | 2021-08-19 | Magic Leap, Inc. | Cross reality system with accurate shared maps |
US20210343087A1 (en) * | 2020-04-29 | 2021-11-04 | Magic Leap, Inc. | Cross reality system for large scale environments |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11617949B1 (en) | 2021-09-28 | 2023-04-04 | Sony Group Corporation | Methods for predefining virtual staircases connecting platforms in extended reality (XR) environments |
WO2023052885A1 (en) * | 2021-09-28 | 2023-04-06 | Sony Group Corporation | Method to regulate jumps and falls by playable characters in xr spaces |
WO2023052886A1 (en) * | 2021-09-28 | 2023-04-06 | Sony Group Corporation | Methods for predefining virtual staircases connecting platforms in extended reality (xr) environments |
WO2023052887A1 (en) * | 2021-09-28 | 2023-04-06 | Sony Group Corporation | Method to improve user understanding of xr spaces based in part on mesh analysis of physical surfaces |
WO2023052868A1 (en) * | 2021-09-28 | 2023-04-06 | Sony Group Corporation | Method for quasi-random placement of virtual items in an extended reality (xr) space |
US11759711B2 (en) | 2021-09-28 | 2023-09-19 | Sony Group Corporation | Method for quasi-random placement of virtual items in an extended reality (XR) space |
US11944905B2 (en) | 2021-09-28 | 2024-04-02 | Sony Group Corporation | Method to regulate jumps and falls by playable characters in XR spaces |
GB2612767A (en) * | 2021-11-03 | 2023-05-17 | Sony Interactive Entertainment Inc | Virtual reality interactions |
Also Published As
Publication number | Publication date |
---|---|
US11393176B2 (en) | 2022-07-19 |
US20210245043A1 (en) | 2021-08-12 |
US20210247846A1 (en) | 2021-08-12 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20210248826A1 (en) | Surface distinction for mobile rendered augmented reality | |
CN108875633B (en) | Expression detection and expression driving method, device and system and storage medium | |
US9947139B2 (en) | Method and apparatus for providing hybrid reality environment | |
US20200219323A1 (en) | Interactive mixed reality platform utilizing geotagged social media | |
JP6423435B2 (en) | Method and apparatus for representing a physical scene | |
US20180276882A1 (en) | Systems and methods for augmented reality art creation | |
KR101636027B1 (en) | Methods and systems for capturing and moving 3d models and true-scale metadata of real world objects | |
CN112243583B (en) | Multi-endpoint mixed reality conference | |
CN112138386A (en) | Volume rendering method and device, storage medium and computer equipment | |
US11763479B2 (en) | Automatic measurements based on object classification | |
US11757997B2 (en) | Systems and methods for facilitating shared extended reality experiences | |
WO2022088819A1 (en) | Video processing method, video processing apparatus and storage medium | |
Du et al. | Video fields: fusing multiple surveillance videos into a dynamic virtual environment | |
CN116097316A (en) | Object recognition neural network for modeless central prediction | |
US11451721B2 (en) | Interactive augmented reality (AR) based video creation from existing video | |
WO2023231793A1 (en) | Method for virtualizing physical scene, and electronic device, computer-readable storage medium and computer program product | |
CN112206519A (en) | Method, device, storage medium and computer equipment for realizing game scene environment change | |
US11949527B2 (en) | Shared augmented reality experience in video chat | |
US12002146B2 (en) | 3D modeling based on neural light field | |
US11471773B2 (en) | Occlusion in mobile client rendered augmented reality environments | |
US10755459B2 (en) | Object painting through use of perspectives or transfers in a digital medium environment | |
GB2547529A (en) | 3D digital content interaction and control | |
Du | Fusing multimedia data into dynamic virtual environments | |
US12002165B1 (en) | Light probe placement for displaying objects in 3D environments on electronic devices | |
US20230298266A1 (en) | Hierarchical scene model |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AS | Assignment |
Owner name: KRIKEY, INC., CALIFORNIA Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:SHRIRAM, KETAKI LALITHA UTHRA;SHRIRAM, JHANVI SAMYUKTA LAKSHMI;FONSECA, LUIS PEDRO OLIVEIRA DA COSTA;SIGNING DATES FROM 20210204 TO 20210205;REEL/FRAME:055221/0497 |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: APPLICATION DISPATCHED FROM PREEXAM, NOT YET DOCKETED |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: NON FINAL ACTION MAILED |
|
STCB | Information on status: application discontinuation |
Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |