EP3959691A1 - Managing content in augmented reality - Google Patents
Managing content in augmented reality
- Publication number
- EP3959691A1 EP19832503.7A
- Authority
- EP
- European Patent Office
- Prior art keywords
- content
- environment
- image data
- scene
- physical object
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T19/00—Manipulating 3D models or images for computer graphics
- G06T19/006—Mixed reality
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2413—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
- G06F18/24133—Distances to prototypes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/246—Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/20—Scenes; Scene-specific elements in augmented reality scenes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/70—Labelling scene content, e.g. deriving syntactic or semantic representations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
Definitions
- This description generally relates to managing content in augmented reality.
- AR augmented reality
- the coordinates of the AR content are stored.
- the AR content is often referential to the object that is physically located nearby. If the physical object moves or is removed from the AR scene, the AR content that has been positioned may become inaccurate since the AR content references an object that is no longer in the original location.
- an AR application may allow users to attach an AR label to objects that require repair in their workplace such as adding an AR label to a machine that needs repair. If the machine is removed or moved to another location, the AR label that is still rendered in the AR environment may not be relevant.
- a method for managing augmented reality (AR) content in an AR environment includes obtaining image data associated with a scene of an AR environment, where the AR environment includes AR content positioned in a coordinate space of the AR environment.
- the method includes detecting a physical object from the image data, associating the physical object with the AR content, and storing coordinates of the AR content and information indicating that the physical object is associated with the AR content in an AR scene storage for future AR localization.
- a corresponding AR system and a non-transitory computer-readable medium storing corresponding instructions may be provided.
- the method includes one or more of the following features (or any combination thereof).
- the detecting of the physical object may be based on one or more machine learning (ML) models.
- the one or more ML models are configured to determine a classification of the physical object, wherein the classification is associated with the AR content.
- the image data is first image data
- the method further includes obtaining second image data of at least a portion of the scene to localize the AR content, detecting that the physical object is not present in the second image data, and causing the AR content to not be rendered in the AR environment.
- the method includes obtaining second image data of at least a portion of the scene to localize the AR content, detecting that the physical object has moved to a new location in the scene of the AR environment based on the second image data and the AR scene storage, and moving the AR content in the AR environment to a location that corresponds to the new location of the physical object in the second image data.
- the method includes obtaining second image data of at least a portion of the scene to localize the AR content, detecting that the physical object is not present or has moved to a new location in the scene of the AR environment based on the second image data and the AR scene storage, and transmitting, via an application programming interface (API), a notification to a developer associated with the AR content of the AR environment.
- the AR content is located proximate to the physical object.
- the associating includes analyzing one or more terms associated with the AR content, and determining that at least one of the terms is associated with a classification of the physical object.
- the AR collaborative service or the client AR application configured to obtain image data associated with a scene of the AR environment, where the AR environment includes AR content positioned in a coordinate space of the AR environment and the AR content is associated with a physical object in the AR environment, detect that the physical object is not present in the image data or is moved to a new position in the scene, and initiate an action to manage the AR content associated with the physical object.
- the AR system may include any of the above/below features (or any combination thereof).
- the AR collaborative service or the client AR application is configured to detect the physical object using one or more machine learning (ML) models.
- the AR collaborative service or the client AR application is configured to detect a type of the physical object using the one or more ML models, wherein the type is associated with the AR content.
- the client AR application is configured to cause the AR content not to be rendered from the AR environment.
- the AR collaborative service or the client AR application is configured to move the AR content in the AR environment to a location that corresponds to the new position of the physical object in the image data.
- the AR collaborative service or the client AR application is configured to transmit, via an application programming interface (API), a notification to a developer of the AR content of the AR environment.
- the AR collaborative service or the client AR application is configured to analyze one or more terms associated with the AR content, and determine that at least one of the terms is associated with a classification of the physical object.
- a non-transitory computer-readable medium storing executable instructions that when executed by at least one processor are configured to manage augmented reality (AR) content in an AR environment, where the executable instructions include instructions that cause the at least one processor to obtain first image data associated with a scene of an AR environment, where the AR environment includes AR content positioned in a coordinate space of the AR environment, detect a type of a physical object located proximate to the AR content from the first image data, store coordinates of the AR content with a link to the type of the physical object in an AR scene storage, obtain second image data associated with the scene of the AR environment to localize the AR environment, detect that the type of the physical object is not present in the second image data or is moved to a new position in the scene based on the second image data and the AR scene storage, and initiate an action to manage the AR content associated with the physical object.
- a corresponding AR system and a corresponding method may be provided.
- the non-transitory computer-readable medium may include any of the above/below features (or any combination thereof).
- the instructions to initiate the action may include instructions to not render the AR content in the AR environment.
- the instructions to initiate the action may include instructions to move the AR content in the AR environment to a location that corresponds to the new position of the physical object in the second image data.
- the instructions to initiate the action may include instructions to transmit, via an application programming interface (API), a notification to a developer of the AR content.
- the type of the physical object is detected using one or more machine learning (ML) models.
- FIG. 1A depicts an AR system having a semantic content manager according to an aspect.
- FIG. 1B depicts an example of the semantic content manager according to an aspect.
- FIG. 2 depicts an AR scene with a detected object referenced by AR content according to an aspect.
- FIG. 3A depicts a conventional AR scene showing AR content for an object that moved outside of the scene.
- FIG. 3B depicts an AR scene from an AR system having the semantic content manager according to an aspect.
- FIG. 4A depicts a conventional AR scene showing AR content for an object that moved to a different location in the scene.
- FIG. 4B depicts an AR scene from an AR system having the semantic content manager according to another aspect.
- FIG. 5 illustrates a flow chart depicting example operations for managing AR content when storing an AR scene for future localization according to an aspect.
- FIG. 6 illustrates a flow chart depicting example operations for managing AR content when localizing an AR scene according to an aspect.
- FIG. 7 illustrates a flow chart depicting example operations for managing AR content when storing and localizing an AR scene according to an aspect.
- FIG. 8 illustrates example computing devices of the AR system according to an aspect.
- the embodiments provide a semantic content manager configured to detect and classify, using one or more machine learning (ML) models, a physical object referenced by or located proximate to AR content from image data captured by a device’s camera, and store the classification along with coordinates of the AR content in the AR persistent space.
- the semantic content manager may generate a link between the AR content and the nearby physical object such that the AR content is attached to the object that relates to the AR content. For example, if the AR content is positioned proximate to a chair in the real world, the semantic content manager may analyze the camera’s input, detect the classification of the object as “chair”, and initiate storage of the classification along with the AR content’s coordinates in the AR persistent space.
- the semantic content manager may detect if the object that is associated with the AR content is present in the current AR scene or moved to a different location in the current AR scene. In some examples, if the object has moved to a different location in the current AR scene, the semantic content manager may automatically move the AR content to a location that corresponds to the new location of the object. In some examples, if the object has been removed, the semantic content manager may cause the AR content not to be rendered in the AR scene.
- the semantic content manager may transmit a notification to a developer of the AR content informing them of the absence or movement such that the developer can decide to not show the AR content, move the location of the AR content, or leave the AR content in the original location.
- an AR scene may appear to be localized more accurately when the AR content that is associated with a certain physical object continues to be spatially associated with that physical object even if the physical object moves in the physical space.
- the technical benefits may include the removal of inaccurate placement of AR content linked to physical objects that have been removed from the scene and should not be rendered by an application.
- the semantic content manager may provide the ability to refine localization of specific AR content to compensate for potential drift or errors in large scale localization. For example, when the localization result is offset, the AR content would be positioned in a location different from the desired location. If that AR content is known to be associated with a specific object, the techniques discussed herein may shift the AR content’s location to more accurately align with the physical object in the space even if that requires rendering at a different coordinate position in the localization coordinate space than originally stored.
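The realignment described above can be sketched as a simple offset calculation. This is a minimal illustration, not the patent's implementation: the `Vec3` type and the `realign_content` function are hypothetical names, and the sketch assumes that the stored positions of both the AR content and its linked object are available and that the object has been re-detected at a new position.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vec3:
    """A point in the AR coordinate space (hypothetical helper type)."""
    x: float
    y: float
    z: float

    def __add__(self, other: "Vec3") -> "Vec3":
        return Vec3(self.x + other.x, self.y + other.y, self.z + other.z)

    def __sub__(self, other: "Vec3") -> "Vec3":
        return Vec3(self.x - other.x, self.y - other.y, self.z - other.z)


def realign_content(stored_content: Vec3, stored_object: Vec3,
                    detected_object: Vec3) -> Vec3:
    """Keep the content at the same offset from the object as when it was stored."""
    offset = stored_content - stored_object
    return detected_object + offset


# The label was stored 1 unit above the chair; the chair is now detected 3 units away.
new_position = realign_content(Vec3(1.0, 2.0, 0.0), Vec3(1.0, 1.0, 0.0), Vec3(4.0, 1.0, 0.0))
print(new_position)
```

Preserving the stored offset, rather than snapping the content onto the object, keeps a label such as an arrow pointing at the object from overlapping it.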
- FIG. 1A illustrates an AR system 100 for managing AR content 130 positioned within a scene 125 of an AR environment 101 according to an aspect.
- the AR system 100 includes an AR collaborative service 104, executable by one or more AR servers 102, configured to create a multi-user or collaborative AR experience that users can share.
- the AR collaborative service 104 communicates, over a network 150, with a plurality of computing devices including a computing device 106 and a computing device 108, where a user of the computing device 106 and a user of the computing device 108 may share the same AR environment 101.
- the AR environment 101 may involve a physical space which is within the view of a user and a virtual space within which the AR content 130 is positioned.
- the AR content 130 is a text description (“My Chair”) along with an arrow that points to an object 121 (e.g., a chair), where the object 121 is a physical object in the physical space.
- the AR content 130 may include virtual content that is added by one or more users.
- This displaying of the AR content 130 is therefore according to a mapping between the virtual space and the physical space.
- Overlaying of the AR content 130 may be implemented, for example, by superimposing the AR content 130 into an optical field of view of a user of the physical space, by reproducing a view of the user of the physical space on one or more display screens, and/or in other ways, for example by using head up displays, mobile device display screens and so forth.
- the computing device 106 is configured to execute a client AR application 110.
- the client AR application 110 is a software development kit (SDK) that operates in conjunction with one or more AR applications 109.
- the AR applications 109 may be any type of AR applications (e.g., gaming, entertainment, medicine, education, etc.) executable on the computing device 106.
- the client AR application 110, in combination with one or more sensors on the computing device 106, is configured to detect and track its position relative to the physical space, detect the size and location of different types of surfaces (e.g., horizontal, vertical, angled), and estimate the environment’s current lighting conditions.
- the client AR application 110 is configured to communicate with the AR collaborative service 104 via one or more application programming interfaces (APIs). Although two computing devices are illustrated in FIG. 1A, the AR collaborative service 104 may communicate and share the AR environment 101 with any number of computing devices.
- the computing device 106 may be, for example, a computing device such as a controller, or a mobile device (e.g., a smartphone, a tablet, a joystick, or other portable controller(s)).
- the computing device 106 includes a wearable device (e.g., a head mounted device) that is paired with, or communicates with a mobile device for interaction in the AR environment 101.
- the AR environment 101 is a representation of an environment that may be generated by the computing device 106 (and/or other virtual and/or augmented reality hardware and software). In this example, the user is viewing the AR environment 101 with the computing device 106. Since the details and use of the computing device 108 may be the same with respect to the computing device 106, the details of the computing device 108 are omitted for the sake of brevity.
- the AR system 100 includes a semantic content manager 112.
- the client AR application 110 is configured to execute the semantic content manager 112 (e.g., included within the client AR application 110 executable by the computing device 106).
- the AR collaborative service 104 is configured to execute the semantic content manager 112 (e.g., included within the AR collaborative service 104 executable by the AR server 102). In some examples, one or more operations of the semantic content manager 112 are executed by the client AR application 110 and one or more operations of the semantic content manager 112 are executed by the AR collaborative service 104.
- the semantic content manager 112 is configured to detect and classify an object 121 referenced by or located proximate to AR content 130, and store the classification along with coordinates of the AR content 130 in AR scene storage 111.
- the AR scene storage 111 includes a coordinate space in which visual information (e.g., detected by the computing device 106) from the physical space and the AR content 130 are positioned.
- the positions of the visual information and the AR content 130 are updated in the AR scene storage 111 from image frame to image frame.
- the AR scene storage 111 includes a three-dimensional (3D) map of the AR environment 101.
- the AR scene storage 111 includes a sparse point map of the AR environment.
- the information in the AR scene storage 111 is used to share the AR environment 101 with one or more users that join the AR environment 101 and to calculate where each user’s computing device is located in relation to the physical space of the AR environment 101 such that multiple users can view and interact with the AR environment 101.
- the semantic content manager 112 may detect if the object 121 that is associated with the AR content 130 is present in the scene 125 or moved to a different location in the scene 125. In some examples, if the object 121 has moved to a different location in the scene 125, the semantic content manager 112 may automatically move the AR content 130 to a location that corresponds to the new location of the object 121. In some examples, if the object 121 has been removed from the physical space, the semantic content manager 112 may cause the AR content 130 not to be rendered on a display screen associated with the computing device 106.
- the semantic content manager 112 may transmit a notification to a developer of the AR content 130 informing them of the absence or movement such that the developer can decide to not show the AR content 130, move the location of the AR content 130, or leave the AR content 130 in the original location.
- FIG. 1B illustrates a schematic diagram of the semantic content manager 112 according to an aspect.
- the semantic content manager 112 includes an anchor module 114 configured to obtain image data 113a associated with the scene 125 of the AR environment 101.
- the anchor module 114 is configured to execute when saving the scene 125 to the AR scene storage 111.
- a user may use a camera on the computing device 106 to capture the scene 125 from the physical space of the AR environment 101.
- the image data 113a includes image data of one or more frames captured by the computing device 106.
- the AR environment 101 includes AR content 130 positioned in a coordinate space of the AR environment 101. For example, a user may add the AR content 130 to the AR environment 101, and the location of the AR content 130 may be indicated by its position (e.g., x, y, z locations) in the coordinate space of the AR environment 101.
- the anchor module 114 may detect, using one or more machine learning (ML) models 115, an object 121 referenced by or located proximate to the AR content 130 from the image data 113a in the scene 125 of the AR environment 101.
- an object 121 positioned relatively close to the AR content 130 may indicate that the AR content 130 references that object 121.
- the AR content 130 is a text description (“My Chair”) along with an arrow that points to an object 121 (e.g., a chair), where the object 121 is a physical object in the physical space.
- the anchor module 114 may detect the object 121 in the scene 125 by analyzing the image data 113a captured by the computing device 106 according to one or more object recognition techniques.
- the anchor module 114 uses one or more ML models 115 to determine a type 123 of the object 121.
- the type 123 may include a classification (e.g., a semantic label) of the object 121.
- the anchor module 114 may determine the type 123 of the object 121 as a chair (or a chair classification or label).
- the ML models 115 include one or more trained classifiers configured to detect the type 123 of an object 121 in the scene 125 based on the image data 113a.
- the one or more trained classifiers may detect an object 121 in the scene 125, and assign a semantic label to the object 121.
- the semantic labels may include different characterizations of objects such as chairs, laptops, desks, etc.
- the ML classifiers may detect multiple objects within the camera image, with associated positional information (e.g., the recognition of “chair” is associated with a particular portion of the camera frame, not with the frame as a whole).
- Depth estimate information from other ML and AR models can be combined with the labeled portion of a camera frame to estimate the 3D position of a recognized object.
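One way to combine a labeled frame region with a depth estimate is to back-project the center of the region through a pinhole camera model. The sketch below assumes this model; the function name, the bounding-box format `(left, top, right, bottom)` in pixels, and the intrinsics `fx`, `fy`, `cx`, `cy` are illustrative assumptions that the source does not specify.

```python
from typing import Tuple


def estimate_object_position(box: Tuple[float, float, float, float],
                             depth_m: float,
                             fx: float, fy: float,
                             cx: float, cy: float) -> Tuple[float, float, float]:
    """Back-project the center of a labeled box to a 3D point in camera space.

    box is (left, top, right, bottom) in pixels; depth_m is the estimated depth
    at the box center in meters; fx, fy, cx, cy are pinhole camera intrinsics.
    """
    u = (box[0] + box[2]) / 2.0  # horizontal pixel coordinate of the box center
    v = (box[1] + box[3]) / 2.0  # vertical pixel coordinate of the box center
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return (x, y, depth_m)


# A "chair" box centered 100 px right of the principal point, 2 m from the camera.
position = estimate_object_position((400.0, 200.0, 440.0, 280.0), 2.0,
                                    fx=500.0, fy=500.0, cx=320.0, cy=240.0)
print(position)
```

The result is expressed in the camera's coordinate frame; a further transform by the device pose would be needed to place it in the AR coordinate space.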
- the ML model(s) may allow a particularly precise detection and classification of the physical object.
- the ML models 115 include a neural network.
- the neural network may be an interconnected group of nodes, each node representing an artificial neuron.
- the nodes are connected to each other in layers, with the output of one layer becoming the input of a next layer.
- Neural networks receive an input at the input layer, transform it through a series of hidden layers, and produce an output via the output layer.
- Each layer is made up of a subset of the set of nodes.
- the nodes in hidden layers are fully connected to all nodes in the previous layer and provide their output to all nodes in the next layer.
- the nodes in a single layer function independently of each other (i.e., do not share connections). Nodes in the output provide the transformed input to the requesting process.
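The layered structure described above can be illustrated with a small forward pass through fully connected layers. This is a generic sketch rather than the ML models 115 themselves: the weights are made-up values, and the ReLU activation on hidden layers is an assumption, since the source does not name an activation function.

```python
from typing import List, Sequence, Tuple

Layer = Tuple[List[List[float]], List[float]]  # (weights, biases) for one layer


def dense(inputs: Sequence[float], weights: List[List[float]],
          biases: List[float]) -> List[float]:
    """Each node sums its weighted inputs from every node of the previous layer."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def relu(values: Sequence[float]) -> List[float]:
    """Assumed hidden-layer activation: negative values are clamped to zero."""
    return [max(0.0, v) for v in values]


def forward(inputs: Sequence[float], layers: List[Layer]) -> List[float]:
    """Feed the output of each layer into the next and return the final output."""
    activation = list(inputs)
    for index, (weights, biases) in enumerate(layers):
        activation = dense(activation, weights, biases)
        if index < len(layers) - 1:  # hidden layers only
            activation = relu(activation)
    return activation


layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),  # hidden layer with two nodes
    ([[1.0, 1.0]], [0.1]),                    # output layer with one node
]
print(forward([2.0, 3.0], layers))
```

Nodes within one layer never read each other's outputs: each row of `weights` depends only on the previous layer's activation.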
- the semantic content manager 112 uses a convolutional neural network in the object classification algorithm, which is a neural network that is not fully connected. Convolutional neural networks therefore have less complexity than fully connected neural networks. Convolutional neural networks can also use pooling or max-pooling to reduce the dimensionality (and hence complexity) of the data that flows through the network, which can reduce the level of computation required. This can make computation of the output in a convolutional neural network faster than in a fully connected neural network.
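To illustrate how max-pooling reduces dimensionality, the following sketch applies a 2x2 max-pool with stride 2 to a small feature map. The window size, the stride, and the function name are illustrative choices; the source does not state which pooling configuration is used.

```python
from typing import List


def max_pool_2x2(feature_map: List[List[float]]) -> List[List[float]]:
    """Keep the largest value in each non-overlapping 2x2 window."""
    rows = len(feature_map) // 2 * 2      # ignore a trailing odd row
    cols = len(feature_map[0]) // 2 * 2   # ignore a trailing odd column
    return [
        [max(feature_map[r][c], feature_map[r][c + 1],
             feature_map[r + 1][c], feature_map[r + 1][c + 1])
         for c in range(0, cols, 2)]
        for r in range(0, rows, 2)
    ]


feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 0, 5, 6],
    [1, 2, 7, 8],
]
print(max_pool_2x2(feature_map))  # → [[4, 2], [2, 8]]
```

The 4x4 input becomes a 2x2 output, so any later layer processes a quarter as many values.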
- the anchor module 114 may determine that the detected object 121 and its type 123 are associated with the AR content 130. In some examples, if the detected object 121 is within a certain distance of the AR content 130, the anchor module 114 may determine that the detected object 121 is associated with the AR content 130. In some examples, the anchor module 114 is configured to determine a level of relevancy of the object 121 (and/or the type 123 of the object 121) to the AR content 130, and if the level of relevancy is above a threshold amount, the anchor module 114 is configured to determine that the detected object 121 is associated with the AR content 130. In some examples, the level of relevancy is determined based on the distance of the location of the object 121 to the location of the AR content 130 in the coordinate space (e.g., a shorter distance may indicate a higher relevancy).
- the level of relevancy is based on (e.g., further based on) a semantic comparison of the type 123 of the object 121 and the AR content 130.
- the AR content 130 may be associated with one or more terms.
- the AR content 130 includes the description “My chair.”
- the anchor module 114 may detect the term “chair” from the AR content 130 and determine that the term “chair” is associated with the type 123 (e.g., chair classification). Then, the anchor module 114 may determine that one or more terms associated with the AR content 130 are semantically the same as or similar to the type 123 of the object 121.
- the AR content 130 includes a virtual object
- the anchor module 114 may detect the type of virtual object (e.g., using the techniques described above). If the type of virtual object and the type 123 of the object 121 are determined as semantically similar, the anchor module 114 may increase the level of relevancy and/or determine that the AR content 130 is associated with the object 121.
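The relevancy test described above, which combines distance with a semantic comparison, could be sketched as follows. The scoring formula, the equal weights, the `max_distance` cutoff, and the threshold value are all assumptions made for illustration; the source only states that a shorter distance and a semantic match may raise the level of relevancy, which is then compared with a threshold. An exact term match stands in for the semantic comparison here.

```python
import math
import re
from typing import Sequence


def relevancy(content_position: Sequence[float], content_text: str,
              object_position: Sequence[float], object_type: str,
              max_distance: float = 2.0) -> float:
    """Score in [0, 1]: half from proximity, half from a term match (assumed weights)."""
    distance = math.dist(content_position, object_position)
    proximity = max(0.0, 1.0 - distance / max_distance)
    terms = set(re.findall(r"[a-z]+", content_text.lower()))
    semantic = 1.0 if object_type.lower() in terms else 0.0
    return 0.5 * proximity + 0.5 * semantic


def is_associated(score: float, threshold: float = 0.6) -> bool:
    """The object is treated as associated when the score reaches the threshold."""
    return score >= threshold


label_position = (0.0, 1.0, 0.0)
chair_score = relevancy(label_position, "My Chair", (0.0, 0.0, 0.0), "chair")
laptop_score = relevancy(label_position, "My Chair", (0.0, 0.0, 0.0), "laptop")
print(chair_score, is_associated(chair_score))    # → 0.75 True
print(laptop_score, is_associated(laptop_score))  # → 0.25 False
```

A production system would likely replace the exact term match with a similarity measure that also recognizes related words, such as “seat” for a chair classification.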
- the anchor module 114 may generate a link 127 between the object 121 and the AR content 130 and store the coordinates of the AR content 130 with the link to the object 121 in the AR environment 101. For example, instead of only storing the coordinates of the AR content 130 at the AR scene storage 111, the AR content 130 is also stored with the link 127 to the object 121, which indicates that the AR content 130 is associated with a certain type of physical object (e.g., the AR content 130 references a chair in the physical environment).
- the link 127 may be computer-generated data that indicates that the object 121 is linked to the AR content 130. In some examples, the link 127 indicates the type 123 of the object 121 that is associated with the AR content 130.
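A stored record that carries both the coordinates and the link might look like the sketch below. The field names, the dictionary used as a stand-in for the AR scene storage, and the `save_content` helper are hypothetical; the source does not define a storage schema.

```python
import json
from dataclasses import asdict, dataclass
from typing import Dict, Optional, Tuple

Position = Tuple[float, float, float]


@dataclass
class ObjectLink:
    """Hypothetical shape of the link: the object's type and where it was seen."""
    object_type: str
    object_position: Position


@dataclass
class StoredContent:
    """AR content coordinates plus an optional link to a physical object."""
    content_id: str
    coordinates: Position
    link: Optional[ObjectLink] = None


def save_content(storage: Dict[str, dict], record: StoredContent) -> None:
    """Store the record as plain data keyed by content id (in-memory stand-in)."""
    storage[record.content_id] = asdict(record)


scene_storage: Dict[str, dict] = {}
save_content(scene_storage, StoredContent(
    content_id="label-1",
    coordinates=(0.0, 1.0, 0.0),
    link=ObjectLink(object_type="chair", object_position=(0.0, 0.0, 0.0)),
))
print(json.dumps(scene_storage, indent=2))
```

Content without a detected object keeps `link` as `None`, so it can still be stored and localized by coordinates alone.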
- the semantic content manager 112 may send or provide information about the object 121, the type 123, and the link 127 as well as the coordinates of the AR content 130 (which indicate the position of the AR content 130 in the coordinate space onto which the image data 113a is mapped) to the AR collaborative service 104.
- the client AR application 110 is configured to detect a set of visual feature points from the image data 113a and track the movement of the set of visual feature points over time.
- the set of visual feature points are a plurality of points (e.g., interesting points) that represent the user’s environment, and the set of visual feature points may be updated over time.
- the set of visual feature points may be referred to as an anchor or a set of persistent visual features that represent physical objects in the physical world, and the set of visual feature points are stored in the AR scene storage 111 to be used to localize the AR environment 101 in a subsequent session or for another user.
- the visual feature points in the AR scene storage 111 may be used to compare and match against other visual feature points (e.g., detected from image data 113b) in order to determine whether the physical space is the same as the physical space of the stored visual feature points and to calculate the location of the computing device within the AR environment 101 in relation to the stored visual feature points in the AR scene storage 111.
- the AR localizer 116 may obtain image data 113b that captures at least a portion of the scene 125 of the AR environment 101.
- a user may use a camera on the computing device 106 or the computing device 108 to capture at least a portion of the scene 125 from the physical space of the AR environment 101.
- the AR localizer 116 may detect, using the one or more ML models 115, whether the object 121 is present or has moved to a different location in the scene 125 using the current image data (e.g., the image data 113b). For example, when localizing the scene 125, the AR localizer 116 may determine that the AR content 130 to be rendered is associated with the type 123 of the object 121 based on the link 127 that is stored in the AR scene storage 111. Using the object recognition techniques described above, the AR localizer 116 is configured to detect whether an object 121 in the image data 113b having the same type 123 is present or located at the same location as the object 121 in the image data 113a.
- the AR scene storage 111 includes information that maps the image data 113a (and the AR content 130) to the coordinate space of the AR environment 101.
- the AR scene storage 111 includes information that maps the chair onto the coordinate space.
- the AR localizer 116 is configured to detect whether the current scene 125 includes the type 123 of the object 121 that was stored in the AR scene storage 111. If the image data 113b includes a chair, the AR localizer 116 is configured to determine whether the chair that was stored in the AR scene storage 111 is located at the same position as the chair detected in the image data 113b. If the chair is located at a different position, the AR localizer 116 is configured to determine the new location of the chair from the image data 113b.
- an action module 118 of the semantic content manager 112 is configured to execute one or more actions. If the object 121 is detected as not present in the image data 113b, the action module 118 may cause the AR application 119 to not render the AR content 130 from the AR scene storage 111. If the object 121 is present in the image data 113b but has moved to a new location, the action module 118 is configured to move the AR content 130 in the AR environment 101 to be proximate to the new location of the object 121. In some examples, if the object 121 is not present in the image data 113b or has moved to a different location, the action module 118 may transmit, via an API 120, a notification to a developer associated with the AR content 130.
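The three outcomes described for the action module can be sketched as a small decision function. This is an assumption-laden illustration: `Action`, `Decision`, `choose_action`, and the `tolerance` value are hypothetical names and parameters, and the notification flag reflects only the "in some examples" behaviour described above.

```python
import math
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple

Vec3 = Tuple[float, float, float]

class Action(Enum):
    RENDER_AS_STORED = "render_as_stored"
    HIDE_CONTENT = "hide_content"
    MOVE_CONTENT = "move_content"

@dataclass
class Decision:
    action: Action
    new_object_position: Optional[Vec3] = None
    notify_developer: bool = False

def choose_action(stored_pos: Vec3, detected_pos: Optional[Vec3],
                  tolerance: float = 0.05) -> Decision:
    """Pick an action from the stored and currently detected object positions.
    detected_pos is None when the object was not found in the new image data."""
    if detected_pos is None:
        # Object absent: do not render the content; optionally notify.
        return Decision(Action.HIDE_CONTENT, notify_developer=True)
    if math.dist(stored_pos, detected_pos) > tolerance:
        # Object moved: content should follow it; optionally notify.
        return Decision(Action.MOVE_CONTENT, detected_pos, notify_developer=True)
    # Object where it was stored: render the content unchanged.
    return Decision(Action.RENDER_AS_STORED)

absent = choose_action((1.0, 0.0, 0.0), None)
moved = choose_action((1.0, 0.0, 0.0), (2.0, 0.0, 0.0))
unchanged = choose_action((1.0, 0.0, 0.0), (1.0, 0.0, 0.0))
```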
- FIG. 3A illustrates the results of an AR system without the semantic content manager 112 when an object is no longer in the scene 125.
- an AR system (without the semantic content manager 112) may still render the AR content 130.
- the semantic content manager 112 is configured to detect that the image data 113b does not include the object 121 having the determined type 123.
- the semantic content manager 112 is configured to cause the AR content 130 to not be rendered in the scene 125 of the AR environment 101.
- FIG. 4A illustrates the results of an AR system without the semantic content manager 112 when the physical object has moved to a different location in the scene 125.
- a conventional AR system may still render the AR content 130 in the position of the saved scene since the AR content 130 is attached to only the coordinates of the AR space, which may cause the AR content 130 to be inaccurate.
- the semantic content manager 112 is configured to detect that the object 121 is within a new location from the image data 113b, and automatically move the AR content 130 to a position that corresponds to the new location of the object 121.
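One way to compute a position that "corresponds to the new location" is to keep the content at the same offset from the object that it had when the scene was saved. The disclosure does not state how the new position is derived, so the offset-preserving rule and the `reposition_content` helper below are assumptions for illustration.

```python
def reposition_content(content_pos, old_object_pos, new_object_pos):
    """Move content so its offset from the object is unchanged (assumed rule)."""
    # Offset of the content relative to the object in the saved scene.
    offset = tuple(c - o for c, o in zip(content_pos, old_object_pos))
    # Apply the same offset at the object's newly detected position.
    return tuple(n + d for n, d in zip(new_object_pos, offset))

# Content saved 1.0 unit above a chair; the chair has since moved.
new_content_pos = reposition_content(
    content_pos=(1.0, 1.5, 0.0),
    old_object_pos=(1.0, 0.5, 0.0),
    new_object_pos=(3.0, 0.5, 2.0),
)
```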
- FIG. 5 illustrates a flow chart 500 depicting example operations of the AR system 100 according to an aspect.
- the example operations of the flow chart 500 relate to the detection of an object associated with AR content when storing an AR scene for future localization.
- Operation 502 includes obtaining image data 113a associated with a scene 125 of an AR environment 101, where the AR environment 101 includes AR content 130 positioned in a coordinate space of the AR environment 101.
- Operation 504 includes detecting an object 121 from the image data 113a.
- Operation 506 includes associating the object 121 with the AR content 130.
- Operation 508 includes storing coordinates of the AR content 130 and information indicating that the object 121 is associated with the AR content 130 in AR scene storage 111 for future AR localization.
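Operation 508 can be pictured as writing a small record that pairs the content coordinates with the associated object. The field names, the JSON encoding, and the `SceneRecord`, `save_record`, and `load_record` names below are hypothetical; the disclosure does not define a storage format.

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class SceneRecord:
    content_id: str        # identifier of the AR content (assumed field)
    content_coords: list   # [x, y, z] in the AR coordinate space
    object_type: str       # type of the associated physical object
    object_coords: list    # [x, y, z] of the object when the scene was saved

def save_record(record: SceneRecord) -> str:
    """Serialize a record for storage; JSON is an illustrative choice."""
    return json.dumps(asdict(record), sort_keys=True)

def load_record(payload: str) -> SceneRecord:
    """Restore a record during a later localization."""
    return SceneRecord(**json.loads(payload))

record = SceneRecord("label-1", [1.0, 1.5, 0.0], "chair", [1.0, 0.5, 0.0])
restored = load_record(save_record(record))
```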
- FIG. 6 illustrates a flow chart 600 depicting example operations of the AR system 100 according to an aspect.
- the example operations of the flow chart 600 relate to the detection of an object associated with AR content when localizing to a current session.
- Operation 602 includes obtaining image data 113b associated with a scene 125 of the AR environment 101, where the AR environment 101 includes AR content 130 positioned in a coordinate space of the AR environment 101, and the AR content 130 is associated with an object 121 in the AR environment 101.
- Operation 604 includes detecting that the object 121 is not present in the image data 113b or is moved to a new position in the scene 125.
- Operation 606 includes initiating an action to manage the AR content 130 associated with the object 121. In some examples, the action includes removing the AR content 130 for display within the scene 125, re-positioning the AR content 130 to a location corresponding to the new location of the object 121, and/or sending a notification to a developer of the AR content 130.
- FIG. 7 illustrates a flow chart 700 depicting example operations of the AR system 100 according to an aspect.
- the example operations of the flow chart 700 relate to the detection of an object associated with AR content when storing from a first session and localizing to a second session.
- Operation 702 includes obtaining first image data 113a associated with a scene
- Operation 704 includes detecting a type 123 of an object 121 located proximate to the AR content 130 from the first image data 113a.
- Operation 706 includes storing coordinates of the AR content 130 with a link 127 to the type 123 of the object 121 in AR scene storage 111.
- Operation 708 includes obtaining second image data 113b associated with the scene 125 of the AR environment 101 to localize the AR environment 101.
- Operation 710 includes detecting that the type 123 of the object 121 is not present in the second image data 113b or is moved to a new position in the scene 125 based on the second image data 113b and the AR scene storage 111.
- Operation 712 includes initiating an action to manage the AR content 130 associated with the object 121.
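The two-session flow of operations 702 to 712 can be sketched end to end with a stand-in for the object detector. Everything here is illustrative: detections are assumed to arrive as `(type, position)` pairs, "proximate" is interpreted as nearest by Euclidean distance, and `store_scene`, `localize`, and the `tolerance` value are hypothetical.

```python
import math

def store_scene(detections, content_pos):
    """First session: link the content to the type of the nearest detected object."""
    obj_type, obj_pos = min(detections, key=lambda d: math.dist(d[1], content_pos))
    return {"content_pos": content_pos, "object_type": obj_type, "object_pos": obj_pos}

def localize(record, detections, tolerance=0.05):
    """Second session: look for an object of the linked type and pick an action."""
    same_type = [pos for t, pos in detections if t == record["object_type"]]
    if not same_type:
        return ("hide", None)  # linked type absent from the new image data
    # If several objects share the type, use the one closest to the stored position.
    new_pos = min(same_type, key=lambda p: math.dist(p, record["object_pos"]))
    if math.dist(new_pos, record["object_pos"]) > tolerance:
        return ("move", new_pos)  # object found at a new position
    return ("keep", record["object_pos"])

first_session = [("chair", (1.0, 0.0, 0.0)), ("table", (4.0, 0.0, 0.0))]
record = store_scene(first_session, content_pos=(1.0, 1.0, 0.0))

chair_gone = localize(record, [("table", (4.0, 0.0, 0.0))])
chair_moved = localize(record, [("chair", (2.0, 0.0, 0.0))])
chair_same = localize(record, [("chair", (1.0, 0.0, 0.0))])
```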
- FIG. 8 shows an example of an example computer device 800 and an example mobile computer device 850, which may be used with the techniques described here.
- Computing device 800 includes a processor 802, memory 804, a storage device 806, a high speed interface 808 connecting to memory 804 and high-speed expansion ports 810, and a low speed interface 812 connecting to low speed bus 814 and storage device 806.
- Each of the components 802, 804, 806, 808, 810, and 812 are interconnected using various busses, and may be mounted on a common motherboard or in other manners as appropriate.
- the processor 802 can process instructions for execution within the computing device 800, including instructions stored in the memory 804 or on the storage device 806 to display graphical information for a GUI on an external input/output device, such as display 816 coupled to high speed interface 808.
- multiple processors and/or multiple buses may be used, as appropriate, along with multiple memories and types of memory.
- multiple computing devices 800 may be connected, with each device providing portions of the necessary operations (e.g., as a server bank, a group of blade servers, or a multi-processor system).
- the memory 804 stores information within the computing device 800.
- the memory 804 is a volatile memory unit or units.
- the memory 804 is a non-volatile memory unit or units.
- the memory 804 may also be another form of computer-readable medium, such as a magnetic or optical disk.
- the storage device 806 is capable of providing mass storage for the computing device 800.
- the storage device 806 may be or contain a computer-readable medium, such as a floppy disk device, a hard disk device, an optical disk device, a tape device, a flash memory or other similar solid state memory device, or an array of devices, including devices in a storage area network or other configurations.
- a computer program product can be tangibly embodied in an information carrier.
- the computer program product may also contain instructions that, when executed, perform one or more methods, such as those described above.
- the information carrier is a computer- or machine-readable medium, such as the memory 804, the storage device 806, or memory on processor 802.
- the high speed controller 808 manages bandwidth-intensive operations for the computing device 800, while the low speed controller 812 manages lower bandwidth-intensive operations. Such allocation of functions is exemplary only.
- the high-speed controller 808 is coupled to memory 804, display 816 (e.g., through a graphics processor or accelerator), and to high-speed expansion ports 810, which may accept various expansion cards (not shown).
- low-speed controller 812 is coupled to storage device 806 and low-speed expansion port 814.
- the low-speed expansion port, which may include various communication ports (e.g., USB, Bluetooth, Ethernet, wireless Ethernet), may be coupled to one or more input/output devices, such as a keyboard, a pointing device, a scanner, or a networking device such as a switch or router, e.g., through a network adapter.
- the computing device 800 may be implemented in a number of different forms, as shown in the figure. For example, it may be implemented as a standard server 820, or multiple times in a group of such servers. It may also be implemented as part of a rack server system 824. In addition, it may be implemented in a personal computer such as a laptop computer 822. Alternatively, components from computing device 800 may be combined with other components in a mobile device (not shown), such as device 850. Each of such devices may contain one or more of computing device 800, 850, and an entire system may be made up of multiple computing devices 800, 850 communicating with each other.
- Computing device 850 includes a processor 852, memory 864, an input/output device such as a display 854, a communication interface 866, and a transceiver 868, among other components.
- the device 850 may also be provided with a storage device, such as a microdrive or other device, to provide additional storage.
- Each of the components 850, 852, 864, 854, 866, and 868 are interconnected using various buses, and several of the components may be mounted on a common motherboard or in other manners as appropriate.
- the processor 852 can execute instructions within the computing device 850, including instructions stored in the memory 864.
- the processor may be implemented as a chipset of chips that include separate and multiple analog and digital processors.
- the processor may provide, for example, for coordination of the other components of the device 850, such as control of user interfaces, applications run by device 850, and wireless communication by device 850.
- Processor 852 may communicate with a user through control interface 858 and display interface 856 coupled to a display 854.
- the display 854 may be, for example, a TFT LCD (Thin-Film-Transistor Liquid Crystal Display) or an OLED (Organic Light Emitting Diode) display, or other appropriate display technology.
- the display interface 856 may comprise appropriate circuitry for driving the display 854 to present graphical and other information to a user.
- the control interface 858 may receive commands from a user and convert them for submission to the processor 852.
- an external interface 862 may be in communication with processor 852, so as to enable near area communication of device 850 with other devices.
- External interface 862 may provide, for example, for wired communication in some implementations, or for wireless communication in other implementations, and multiple interfaces may also be used.
- the memory 864 stores information within the computing device 850.
- the memory 864 can be implemented as one or more of a computer-readable medium or media, a volatile memory unit or units, or a non-volatile memory unit or units.
- Expansion memory 874 may also be provided and connected to device 850 through expansion interface 872, which may include, for example, a SIMM (Single In Line Memory Module) card interface.
- expansion memory 874 may provide extra storage space for device 850, or may also store applications or other information for device 850.
- expansion memory 874 may include instructions to carry out or supplement the processes described above, and may include secure information also.
- expansion memory 874 may be provided as a security module for device 850, and may be programmed with instructions that permit secure use of device 850.
- secure applications may be provided via the SIMM cards, along with additional information, such as placing identifying information on the SIMM card in a non-hackable manner.
- the memory may include, for example, flash memory and/or NVRAM memory, as discussed below.
- a computer program product is tangibly embodied in an information carrier.
- the computer program product contains instructions that, when executed, perform one or more methods, such as those described above.
- the information carrier is a computer- or machine-readable medium, such as the memory 864, expansion memory 874, or memory on processor 852, that may be received, for example, over transceiver 868 or external interface 862.
- Device 850 may communicate wirelessly through communication interface
- Communication interface 866 may provide for communications under various modes or protocols, such as GSM voice calls, SMS, EMS, or MMS messaging, CDMA, TDMA, PDC, WCDMA, CDMA2000, or GPRS, among others. Such communication may occur, for example, through radio-frequency transceiver 868. In addition, short-range communication may occur, such as using a Bluetooth, Wi-Fi, or other such transceiver (not shown). In addition, GPS (Global Positioning System) receiver module 870 may provide additional navigation- and location-related wireless data to device 850, which may be used as appropriate by applications running on device 850.
- Device 850 may also communicate audibly using audio codec 860, which may receive spoken information from a user and convert it to usable digital information. Audio codec 860 may likewise generate audible sound for a user, such as through a speaker, e.g., in a handset of device 850. Such sound may include sound from voice telephone calls, may include recorded sound (e.g., voice messages, music files, etc.) and may also include sound generated by applications operating on device 850.
- the computing device 850 may be implemented in a number of different forms, as shown in the figure. For example, it may be implemented as a cellular telephone 880. It may also be implemented as part of a smart phone 882, personal digital assistant, or other similar mobile device.
- Various implementations of the systems and techniques described here can be realized in digital electronic circuitry, integrated circuitry, specially designed ASICs (application specific integrated circuits), computer hardware, firmware, software, and/or combinations thereof.
- These various implementations can include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, coupled to receive data and instructions from, and to transmit data and instructions to, a storage system, at least one input device, and at least one output device.
- the term “module” may include software and/or hardware.
- the systems and techniques described here can be implemented on a computer having a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to the user and a keyboard and a pointing device (e.g., a mouse or a trackball) by which the user can provide input to the computer.
- Other kinds of devices can be used to provide for interaction with a user as well; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user can be received in any form, including acoustic, speech, or tactile input.
- the systems and techniques described here can be implemented in a computing system that includes a back end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front end component (e.g., a client computer having a graphical user interface or a Web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back end, middleware, or front end components.
- the components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include a local area network (“LAN”), a wide area network (“WAN”), and the Internet.
- the computing system can include clients and servers.
- a client and server are generally remote from each other and typically interact through a communication network.
- the relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
- the computing devices depicted in FIG. 8 can include sensors that interface with a virtual reality (VR) headset 890.
- one or more sensors included on a computing device 850 or other computing device depicted in FIG. 8 can provide input to VR headset 890 or in general, provide input to a VR space.
- the sensors can include, but are not limited to, a touchscreen, accelerometers, gyroscopes, pressure sensors, biometric sensors, temperature sensors, humidity sensors, and ambient light sensors.
- the computing device 850 can use the sensors to determine an absolute position and/or a detected rotation of the computing device in the VR space that can then be used as input to the VR space.
- the computing device 850 may be incorporated into the VR space as a virtual object, such as a controller, a laser pointer, a keyboard, a weapon, etc.
- Positioning of the computing device/virtual object by the user when incorporated into the VR space can allow the user to position the computing device to view the virtual object in certain manners in the VR space.
- if the virtual object represents a laser pointer, the user can manipulate the computing device as if it were an actual laser pointer.
- the user can move the computing device left and right, up and down, in a circle, etc., and use the device in a similar fashion to using a laser pointer.
- one or more input devices included on, or connected to, the computing device 850 can be used as input to the VR space.
- the input devices can include, but are not limited to, a touchscreen, a keyboard, one or more buttons, a trackpad, a touchpad, a pointing device, a mouse, a trackball, a joystick, a camera, a microphone, earphones or buds with input functionality, a gaming controller, or other connectable input device.
- a user interacting with an input device included on the computing device 850 when the computing device is incorporated into the VR space can cause a particular action to occur in the VR space.
- a touchscreen of the computing device 850 can be rendered as a touchpad in VR space.
- a user can interact with the touchscreen of the computing device 850.
- the interactions are rendered, in VR headset 890 for example, as movements on the rendered touchpad in the VR space.
- the rendered movements can control objects in the VR space.
- one or more output devices included on the computing device 850 can provide output and/or feedback to a user of the VR headset 890 in the VR space.
- the output and feedback can be visual, tactile, or audio.
- the output and/or feedback can include, but is not limited to, vibrations, turning on and off or blinking and/or flashing of one or more lights or strobes, sounding an alarm, playing a chime, playing a song, and playing of an audio file.
- the output devices can include, but are not limited to, vibration motors, vibration coils, piezoelectric devices, electrostatic devices, light emitting diodes (LEDs), strobes, and speakers.
- the computing device 850 may appear as another object in a computer-generated, 3D environment. Interactions by the user with the computing device 850 (e.g., rotating, shaking, touching a touchscreen, swiping a finger across a touch screen) can be interpreted as interactions with the object in the VR space.
- the computing device 850 appears as a virtual laser pointer in the computer-generated, 3D environment.
- as the user manipulates the computing device 850, the user in the VR space sees movement of the laser pointer.
- the user receives feedback from interactions with the computing device 850 in the VR space on the computing device 850 or on the VR headset 890.
- one or more input devices in addition to the computing device can be rendered in a computer-generated, 3D environment.
- the rendered input devices e.g., the rendered mouse, the rendered keyboard
- Computing device 800 is intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers.
- Computing device 850 is intended to represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smart phones, and other similar computing devices.
- the components shown here, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the inventions described and/or claimed in this document.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Evolutionary Computation (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Software Systems (AREA)
- Artificial Intelligence (AREA)
- General Engineering & Computer Science (AREA)
- Medical Informatics (AREA)
- Data Mining & Analysis (AREA)
- Computer Hardware Design (AREA)
- Computer Graphics (AREA)
- Computing Systems (AREA)
- Databases & Information Systems (AREA)
- Health & Medical Sciences (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Evolutionary Biology (AREA)
- Processing Or Creating Images (AREA)
- User Interface Of Digital Computer (AREA)
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US16/395,832 US11055919B2 (en) | 2019-04-26 | 2019-04-26 | Managing content in augmented reality |
US16/396,145 US11151792B2 (en) | 2019-04-26 | 2019-04-26 | System and method for creating persistent mappings in augmented reality |
PCT/US2019/065239 WO2020219110A1 (en) | 2019-04-26 | 2019-12-09 | Managing content in augmented reality |
Publications (1)
Publication Number | Publication Date |
---|---|
EP3959691A1 true EP3959691A1 (en) | 2022-03-02 |
Family
ID=69138001
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP19832503.7A Pending EP3959691A1 (en) | 2019-04-26 | 2019-12-09 | Managing content in augmented reality |
EP19832782.7A Pending EP3959692A1 (en) | 2019-04-26 | 2019-12-09 | System and method for creating persistent mappings in augmented reality |
Family Applications After (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP19832782.7A Pending EP3959692A1 (en) | 2019-04-26 | 2019-12-09 | System and method for creating persistent mappings in augmented reality |
Country Status (3)
Country | Link |
---|---|
EP (2) | EP3959691A1 (en) |
CN (2) | CN113614794B (en) |
WO (2) | WO2020219110A1 (en) |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20120306850A1 (en) * | 2011-06-02 | 2012-12-06 | Microsoft Corporation | Distributed asynchronous localization and mapping for augmented reality |
US9996974B2 (en) * | 2013-08-30 | 2018-06-12 | Qualcomm Incorporated | Method and apparatus for representing a physical scene |
US9791917B2 (en) * | 2015-03-24 | 2017-10-17 | Intel Corporation | Augmentation modification based on user interaction with augmented reality scene |
EP3698233A1 (en) * | 2017-10-20 | 2020-08-26 | Google LLC | Content display property management |
CN108269307B (en) * | 2018-01-15 | 2023-04-07 | 歌尔科技有限公司 | Augmented reality interaction method and equipment |
US10504282B2 (en) * | 2018-03-21 | 2019-12-10 | Zoox, Inc. | Generating maps without shadows using geometry |
2019
- 2019-12-09 WO PCT/US2019/065239 patent/WO2020219110A1/en unknown
- 2019-12-09 CN CN201980094560.9A patent/CN113614794B/en active Active
- 2019-12-09 WO PCT/US2019/065235 patent/WO2020219109A1/en unknown
- 2019-12-09 CN CN201980094558.1A patent/CN113614793B/en active Active
- 2019-12-09 EP EP19832503.7A patent/EP3959691A1/en active Pending
- 2019-12-09 EP EP19832782.7A patent/EP3959692A1/en active Pending
Also Published As
Publication number | Publication date |
---|---|
EP3959692A1 (en) | 2022-03-02 |
WO2020219109A1 (en) | 2020-10-29 |
CN113614794A (en) | 2021-11-05 |
WO2020219110A1 (en) | 2020-10-29 |
CN113614793A (en) | 2021-11-05 |
CN113614794B (en) | 2024-06-04 |
CN113614793B (en) | 2024-08-23 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11151792B2 (en) | System and method for creating persistent mappings in augmented reality | |
US11055919B2 (en) | Managing content in augmented reality | |
US20200342681A1 (en) | Interaction system for augmented reality objects | |
US11609675B2 (en) | Placement of objects in an augmented reality environment | |
US10345925B2 (en) | Methods and systems for determining positional data for three-dimensional interactions inside virtual reality environments | |
US20170329503A1 (en) | Editing animations using a virtual reality controller | |
US20200273250A1 (en) | Positional recognition for augmented reality environment | |
JP7008730B2 (en) | Shadow generation for image content inserted into an image | |
CN108697935B (en) | Avatars in virtual environments | |
CN111771180B (en) | Mixed placement of objects in augmented reality environments | |
CN111373349B (en) | Method, apparatus and storage medium for navigating in augmented reality environment | |
US10354446B2 (en) | Methods and apparatus to navigate within virtual-reality environments | |
US11057549B2 (en) | Techniques for presenting video stream next to camera | |
US20180158243A1 (en) | Collaborative manipulation of objects in virtual reality | |
US20220335661A1 (en) | System and method for playback of augmented reality content triggered by image recognition | |
US20230113461A1 (en) | Generating and rendering motion graphics effects based on recognized content in camera view finder | |
US10649616B2 (en) | Volumetric multi-selection interface for selecting multiple objects in 3D space | |
CN105190469A (en) | Causing specific location of an object provided to a device | |
US11354011B2 (en) | Snapping range for augmented reality | |
CN113614794B (en) | Managing content in augmented reality | |
US11805176B1 (en) | Toolbox and context for user interactions |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: UNKNOWN |
| STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: THE INTERNATIONAL PUBLICATION HAS BEEN MADE |
| PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase | Free format text: ORIGINAL CODE: 0009012 |
| STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: REQUEST FOR EXAMINATION WAS MADE |
| 17P | Request for examination filed | Effective date: 20210827 |
| AK | Designated contracting states | Kind code of ref document: A1 Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR |
| DAV | Request for validation of the european patent (deleted) | |
| DAX | Request for extension of the european patent (deleted) | |
| PUAG | Search results despatched under rule 164(2) epc together with communication from examining division | Free format text: ORIGINAL CODE: 0009017 |
| STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: EXAMINATION IS IN PROGRESS |
| 17Q | First examination report despatched | Effective date: 20240715 |
| B565 | Issuance of search results under rule 164(2) epc | Effective date: 20240715 |
| RIC1 | Information provided on ipc code assigned before grant | Ipc: G06T 19/00 20110101AFI20240710BHEP |