WO2018175335A1 - Method and system for discovering and positioning content into augmented reality space - Google Patents

Method and system for discovering and positioning content into augmented reality space

Info

Publication number
WO2018175335A1
Authority
WO
WIPO (PCT)
Prior art keywords
site
hmd
user
data
video feed
Prior art date
Application number
PCT/US2018/023166
Other languages
English (en)
Inventor
Seppo T. VALLI
Pekka K. SILTANEN
Original Assignee
Pcms Holdings, Inc.
Priority date
Filing date
Publication date
Application filed by Pcms Holdings, Inc. filed Critical Pcms Holdings, Inc.
Publication of WO2018175335A1

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011 Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06F3/013 Eye tracking input arrangements
    • G06F3/048 Interaction techniques based on graphical user interfaces [GUI]
    • G06F3/0481 Interaction techniques based on graphical user interfaces [GUI] based on specific properties of the displayed interaction object or a metaphor-based environment, e.g. interaction with desktop elements like windows or icons, or assisted by a cursor's changing behaviour or appearance
    • G06F3/04815 Interaction with a metaphor-based environment or interaction object displayed as three-dimensional, e.g. changing the user viewpoint with respect to the environment or object
    • G06F3/0484 Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range
    • G06F3/04842 Selection of displayed objects or displayed text elements
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T19/00 Manipulating 3D models or images for computer graphics
    • G06T19/006 Mixed reality

Definitions

  • the present disclosure relates to augmented reality (AR) technologies and merging real and virtual elements to produce new visualizations, typically a video, in which physical and digital objects co-exist and interact in real time.
  • AR augmented reality
  • AR objects may basically be any digital information for which spatiality (3D position and orientation in space) gives added value, for example pictures, videos, graphics, text, and audio.
  • Augmented reality visualizations present augmented virtual elements as a part of the physical view.
  • Augmented reality terminals equipped with a camera and a display, e.g. augmented-reality glasses (either video-see-through or optical-see-through, monocular or stereoscopic), capture video from a user's environment and show physical elements together with virtual elements on a display.
  • AR visualizations may be created in such a way that they can be seen correctly from different viewpoints. For example, when the user changes his/her viewpoint, virtual elements stay or act as if they were part of the physical scene.
  • Tracking technologies may be employed for deriving 3D properties of the environment for AR content production, and when viewing the content, for tracking the viewer's (camera) position with respect to the environment.
  • the viewer's (camera) position can be tracked e.g. by tracking known objects in the viewer's video stream and/or using one or more depth cameras.
  • Inertial measurement sensors may also be used to assist with tracking.
  • One embodiment is directed to a method including receiving, at a first augmented reality (AR) head mounted display (HMD) at a first site, a video feed of at least part of a second site that is remote from the first site.
  • the method includes rendering, with the first AR HMD, the video feed.
  • the method includes determining, with the first AR HMD, gaze data of a first AR HMD user with respect to the rendered video feed.
  • the method also includes obtaining, at the first AR HMD, selected AR object data based on first AR HMD user input.
  • the method also includes outputting the gaze data of the first AR HMD user and the selected AR object data from the first AR HMD and displaying, with the first AR HMD, a received video feed of at least part of the second site that includes an AR object based on the output selected AR object data, the AR object being positioned based on the output gaze data.
  • the selected AR object data includes at least one of an AR object selected by the first AR HMD user, orientation data determined from first AR HMD user input data, and scale data determined from first AR HMD user input data.
  • the rendering, with the first AR HMD, the video feed is according to a shared coordinate system between the first site and the second site.
  • outputting the gaze data of the first AR HMD user and the selected AR object data from the first AR HMD comprises transmitting the gaze data of the first AR HMD user and the selected AR object data to a second AR HMD at or near to the second site.
  • the displaying, with the first AR HMD, the received video feed of at least part of the second site that includes the AR object positioned based on the output gaze data includes determining a position of the AR object based on three-dimensional ray-casting.
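  • As an illustration of the data exchanged in the embodiment above, the sketch below shows one possible representation of the gaze data and selected AR object data output by the first AR HMD; the field names and types are illustrative assumptions, since the disclosure does not prescribe any particular wire format.

```python
# Minimal sketch of the data output by the augmenting-site HMD.
# All field names and types are illustrative assumptions, not part of the disclosure.
from dataclasses import dataclass
from typing import List


@dataclass
class GazeData:
    origin: List[float]       # gaze ray origin in the shared coordinate system (x, y, z)
    direction: List[float]    # unit gaze direction vector (x, y, z)
    pixel: List[int]          # pixel the user is looking at in the rendered remote video (u, v)


@dataclass
class SelectedARObjectData:
    object_id: str            # identifier or URI of the AR object chosen by the user
    orientation: List[float]  # orientation as a quaternion (x, y, z, w), from user input
    scale: float              # uniform scale factor, from user input


@dataclass
class AugmentationRequest:
    gaze: GazeData
    ar_object: SelectedARObjectData
    # Serialized and transmitted to the augmented (remote) site, which positions
    # the object by intersecting the gaze ray with its reconstructed 3D model.
```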
  • Another embodiment is directed to a method including constructing a three-dimensional model of a real-world environment corresponding to a first participant of the teleconference; transmitting a video feed of at least part of the real-world environment to at least a second participant of the teleconference; receiving gaze data from the second participant with respect to the transmitted video; receiving selected AR object data from the second participant; determining a position for augmenting an AR object based on the gaze data and the three-dimensional model of the real-world environment; rendering the AR object based on the determined position; and transmitting an augmented video feed including the AR object to the second participant.
  • One embodiment of the method includes determining a shared coordinate system between the first and second participants to the teleconference.
  • the shared coordinate system is based on an adjacent arrangement, an overlapping arrangement, or a combination of adjacent and overlapping arrangements between a first site corresponding to the first participant and at least a second site corresponding to the second participant.
  • the method includes determining an intersection point between at least some of the gaze data and a surface of the three-dimensional model.
  • Another embodiment is directed to a system including a processor disposed in an augmented reality (AR) display located at a first site, and a transceiver coupled to the processor to receive a video feed of at least part of a second site that is remote from the first site.
  • AR augmented reality
  • the system includes a non-transitory computer storage medium storing instructions operative, when executed on the processor, to perform functions including: rendering the video feed; determining gaze data of a user of the augmented reality display with respect to the rendered video feed; obtaining selected AR object data based on input by a user; outputting via the transceiver the gaze data of the user and the selected AR object data from the AR display; and displaying via the AR display, a received video feed of at least part of the second site that includes an AR object based on the output selected AR object data, the AR object being positioned based on the output gaze data.
  • FIG. 1A depicts an example local user's view of a video feed from a remote site according to an embodiment.
  • FIG. 1B depicts a first user's view and a second user's view of a video feed from a remote site, according to an embodiment.
  • FIG. 2A depicts an example system including a shared geometry, a first site system, a second site system, and an n site system according to an embodiment.
  • FIG. 2B depicts a system environment in accordance with an embodiment.
  • FIG. 3 depicts an example method in accordance with an embodiment.
  • FIG. 4 depicts an example video feed being transmitted from a remote site to a local site in accordance with an embodiment.
  • FIG. 5 depicts an example rendering of remote sites positioned in a triangular adjacent three-site geometry in accordance with an embodiment.
  • FIG. 6 depicts an overview of overlapping and adjacent geometries of sites in accordance with an embodiment.
  • FIG. 7 depicts an example user, the user's gaze direction and field of view, and pixels of a viewed image in accordance with an embodiment.
  • FIG. 8 depicts a ray being shot from a selected pixel through a camera and to a point of intersection of a closest object blocking the path of the ray in accordance with an embodiment.
  • FIG. 9 depicts a local user viewing an example rendering of a remote site video feed augmented with an object in accordance with an embodiment.
  • FIG. 10 depicts an example communication and/or processing sequence in accordance with an embodiment.
  • FIG. 11 depicts another example communication and/or processing sequence in accordance with an embodiment.
  • FIG. 12 is a schematic block diagram illustrating the components of an exemplary wireless transmit/receive unit in accordance with at least one embodiment.
  • FIG. 13 is a schematic block diagram illustrating the components of an example network entity in accordance with an embodiment.
  • When AR objects are augmented, their positions are typically defined with respect to the physical environment, and the content is typically produced by an expert user, e.g. a programmer.
  • Most methods developed for positioning a virtual element are quite difficult to use and are mostly used in a local space, in which the user and the augmented object are in the same physical location.
  • Methods and systems disclosed herein enable a user to augment virtual objects to a remote participant's environment during a video conference by simply looking at a desired position in a video feed of the remote site.
  • printed graphical markers are used in the environment and are detected from a video as a reference for both augmenting virtual information in a correct orientation and scale and for tracking the viewer's (camera) position.
  • the pose of a virtual object may be defined with respect to the markers, whose pose is known in advance by the system.
  • a presentation device may recognize the marker and position the AR object relative to the marker.
  • the user who is placing the marker is physically present at the site where the content is augmented.
  • Markerless AR avoids the distracting markers of marker-based placement by relying on detecting distinctive features of the environment and using them for augmenting virtual information and tracking a user's position.
  • In planar image based placement, a planar element of a scene may be used as a marker.
  • Feature-based techniques such as planar image based placement require more advance preparation than marker-based methods. Possible reasons for this are that feature-based techniques may involve more complex data capture, more complex processing, and more complex tools for AR content production. In addition, they typically do not provide as explicit a scale reference for the augmentations as markers do.
  • an application offers a user interface for selecting a known feature set (e.g. a poster on the wall or a logo of a machine) from the local environment.
  • the set of features that is used for tracking is, in practice, a planar image that can be used similarly as markers to define 3D location and 3D orientation.
  • Touch or mouse interaction based placement may use buttons that are mapped to different degrees of freedom, e.g. X, Y, Z directions in the world model.
  • Touch or mouse interaction based placement techniques may be used in combination with marker or planar image based placement when adjusting the AR object's place defined by the fiducial marker.
  • Gesture based placement may utilize a user's hand gesture, such as for example, the user's mid-air finger movement, in front of the user terminal's camera.
  • a user application may track the finger movements and may position the AR object by interpreting the gestures.
  • Terminal movement based placement uses the movability and small size of a handheld AR terminal. Terminal movement placement techniques show an AR object on the terminal screen and then the object's position is fixed relative to the terminal's position. The AR object can then be moved by moving the AR terminal. Often, terminal movement based placement is combined with touch screen interaction, e.g. for rotating the object.
  • 3D reconstruction can be used to support setting pose of the augmented virtual objects.
  • shapes of objects in an environment are captured, resulting in a set of 3D points of the shapes.
  • Virtual objects may be positioned with respect to the generated 3D information using e.g. mouse or touch interaction. Interaction may be enabled e.g. by showing the 3D information from different perspectives for positioning the virtual object from different directions.
  • the AR terminal casts an imaginary 3D ray from the selected position in the video stream shown on the terminal.
  • the system uses a predefined 3D model of the environment (e.g. produced by 3D reconstruction), which is used to calculate an intersection of the ray and the 3D model. The object is then positioned into the intersection.
  • an application provides a user with visual hints of constraints found in a real environment, such as edges and planar surfaces, and allows the user to attach AR objects to the constraints.
  • the technique is based on a predefined 3D model of the environment (e.g. produced by 3D reconstruction).
  • For simple annotations (e.g. text and drawings), the technique is likewise based on having a predefined 3D model of the environment (e.g. produced by 3D reconstruction).
  • Disclosures herein describe, in at least one embodiment, environments in which there is no VR scene shared between sites, and, in some cases, only the information of the user's gaze direction is shared between the sites.
  • sharing only this limited information saves network bandwidth and processing, which is advantageous because 3D reconstructed models can be very large in size.
  • 3D capture technologies allow 3D models of objects and people to be captured and transmitted to remote locations in real time.
  • the 3D model may be rendered to a remote user's view using e.g. augmented reality glasses so that the transmitted objects seem to appear in the remote user's space. Rendering to a remote user's view allows people to be represented by realistic virtual avatars that replicate people's movements in real time.
  • in a telepresence system, 3D models of several geographically distant sites are often merged together in order to give the user an impression that the people at different sites interact in the same virtual environment.
  • Two techniques for merging the 3D models are: overlapping remote spaces and adjacent remote spaces. Telepresence systems may also use combinations of the two methods.
  • Overlapping remote spaces may be created by bringing the 3D geometries of the remote sites, as well as avatars of the users, into one common virtual reality environment. In the common virtual reality environment, there are no boundaries between the site models.
  • In many existing telepresence solutions, a window paradigm is used.
  • the remote users are seen through a window-like display. Behaving like a natural window, the display allows users to experience e.g. motion parallax and stereoscopic 3D perception.
  • the geometries of the remote spaces may not be overlaid so there may not be conformance problems.
  • Standardization can include constraining the number and position of collaborating partners, which may present onerous restrictions in some circumstances.
  • the solutions using adjacent remote spaces may be implemented using traditional telepresence techniques, such as for example, sending video streams between sites to communicate.
  • either the physical geometries of the sites are fixed, or the users' positions are tracked and the positions of the cameras capturing the users are selected according to those positions.
  • the 3D models of the remote sites may be positioned adjacently so that the models do not overlap. Then, the user avatars may be positioned into the respective 3D models and the others may see the remote users in a merged VR space or in synthesized videos captured from virtual camera positions, without the models or avatars colliding.
  • One drawback to using augmented reality lies in the difficulty of creating the AR system and the content shown to the user.
  • Creating an augmented reality application is a complex programming task that is often carried out by programming professionals.
  • producing the content shown in AR is a task that is simplified if proper tools are provided.
  • AR content may be linked to a real-world 3D space, and providing tools that allow easy placement of AR content into desired positions in the real-world environment may be advantageous.
  • Embodiments described above for placing an AR object with respect to the real world include manipulating 3D positions using a 2D user interface, a task that is not familiar to most users. Also, most of the above disclosed systems do not address the problem of positioning content to the environment of a remote user during a teleconference.
  • Embodiments herein describe systems and methods that solve a problem of AR content creation in a case in which a non-expert user wants to place AR content (e.g. a 3D model of a new piece of furniture) to a remote environment (e.g. a remote user's living room) just by looking at the position where the content should be augmented.
  • Embodiments herein describe systems and methods that provide the ability to select a position for AR content (e.g. a 3D model) and augment the content to a remote environment (e.g. a remote user's living room).
  • One embodiment of a content positioning process may include starting a video conference. Users in several geographically distant sites may start a video conference in which the users are wearing AR head mounted displays (HMDs) or the like to see video feeds of remote sites.
  • FIG. 1A depicts a local user's view (from Site 1) of a video feed from a remote site (Site 2).
  • Block 102 of FIG. 1A depicts a local user's view of a video feed from Site 2 rendered with the local user's AR goggles, or head mounted device (HMD), and including an augmentation-position selection indicator. The user could be searching for a position for AR content.
  • the right half of FIG. 1A, block 104 depicts the local user's view at Site 1 of the video feed from Site 2 with AR content positioned at (or based on) the selected spot.
  • FIG. 1B, block 106 depicts a local user searching for a position for AR content while wearing an HMD.
  • Block 108 illustrates a user at site 1 viewing the positioned content.
  • Block 110 illustrates a user at Site 3 viewing the positioned content from a different viewpoint.
  • the content positioning process includes defining positions of each remote site in a common geometry. For example, a system may define positions of each remote site in a common geometry to determine how the remote users' video feeds are rendered to the local user's view.
  • One embodiment of a process may include tracking a local user's orientation and/or gaze direction, for example with the local user's system (e.g., an AR HMD).
  • Another embodiment of a process may include sending, receiving, and/or processing user input relating to a selected virtual object.
  • a local user may select a virtual object to be augmented.
  • the local user may rotate the object to a correct orientation, select a scale, look at a desired position in the video feed from the remote site (the feed is rendered to the view provided by the AR HMD) to augment the object, and/or indicate to the system to augment the object at that position.
  • One embodiment of a process may include calculating a direction at which the user is looking with respect to the rendered view of the remote site.
  • the process calculates a direction where a user is looking with respect to the rendered view of the remote site and sends this direction information together with scale information, orientation information and the virtual object to be augmented to the remote site.
  • Another embodiment of a process may include calculating an intersection of the gaze direction and the 3D model of the remote site.
  • the process can include calculating an intersection of the gaze direction and the 3D model of the remote site.
  • the process can further include indicating where the user is looking by rendering e.g. a cursor and/or a line of sight into the users' views.
  • Another embodiment of a process may include adding a virtual object to the intersection coordinates and/or rendering the view with the new augmented object to one or more users.
  • the process can include adding a virtual object to the intersection coordinates and rendering the view with the new augmented object to all the users.
  • the process can include providing an individual video feed to each of the users, from individual camera (virtual or real) positions.
  • the site of the user wanting to add one or more virtual objects to one or more other sites' environments may be referred to as the "augmenting site," and the site of the real-world environment to which the virtual object is added may be referred to as the "augmented site."
  • the shared geometry 202 may be a controlling system that positions the conference participants in a common coordinate system, optionally, to allow each site's viewing system to position the remote video feeds so that the participants have a common understanding of the meeting setup.
  • site 1 200 may be referred to as an augmenting site
  • site 2 201 may be referred to as an augmented site.
  • the augmenting site includes an AR viewing system such as AR viewing for User 1 210, an AR content selection system such as 238, and an AR content position system 230.
  • the AR viewing system may be configured to allow a user to view AR objects and videos from both local and other sites augmented to the user's view, positioned according to the shared geometry 202.
  • Shared geometry 202 enables relative positions 204, 206 and 208 and the like to be shared among user 1 210, user 2 212, and user 3 214.
  • user 2 at site 2 201 may have a virtual camera or several virtual reality cameras 216, 222 that provide video feeds 218, 220 and 224 to user 2, user 1 210, and user 3 214, respectively.
  • After receiving video feed 220 at site 1 200, user 1 210 transmits a gaze direction 226 and determines AR content selection 238 to enable AR content positioning 230.
  • AR content positioning 230 provides the gaze direction 228 and virtual object positioning 232 to site 2 201. More particularly, gaze direction 228 and virtual object 232 can be provided to a 3D model management module 234 at site 2 201.
  • the AR content selection system 238 may be configured to allow a user to select virtual objects to be augmented.
  • the AR content positioning 230 may be configured to receive eye gaze direction 226 with respect to local geometry and use the shared geometry 202 to calculate gaze direction with respect to remote geometry. The direction information and the selected virtual content can then be sent to the remote site, such as site n 203.
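  • The sketch below illustrates one way such a gaze-direction mapping could be computed, assuming each site maintains a 4x4 transform from its local coordinates into the shared geometry; the function name and matrix conventions are assumptions rather than anything specified by the disclosure.

```python
# Sketch: mapping a gaze ray from the local site's coordinates into a remote
# site's coordinates via the shared geometry. The 4x4 "site -> shared" transforms
# are assumed to be maintained by the shared-geometry management system.
import numpy as np


def gaze_to_remote(origin_local, direction_local, T_local_to_shared, T_remote_to_shared):
    """Return (origin, direction) of the gaze ray expressed in remote-site coordinates."""
    T_shared_to_remote = np.linalg.inv(T_remote_to_shared)
    M = T_shared_to_remote @ T_local_to_shared

    o = np.append(np.asarray(origin_local, dtype=float), 1.0)     # point: homogeneous w = 1
    d = np.append(np.asarray(direction_local, dtype=float), 0.0)  # direction: w = 0

    origin_remote = (M @ o)[:3]
    direction_remote = (M @ d)[:3]
    direction_remote /= np.linalg.norm(direction_remote)
    return origin_remote, direction_remote
```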
  • the augmented site can include a 3D reconstruction module 236, a 3D model management module 234 and one or more virtual cameras 216.
  • 3D reconstruction module 236 can create a 3D model of the site using a set of sensors, e.g. depth sensors with RGB cameras, and update the 3D model in real time.
  • 3D model management module 234 can add AR objects to the reconstructed geometry.
  • a virtual camera 216 can create a synthesized video feed from the 3D model management module 234 and transmit the feed 224 to a local site, such as user 3 AR viewing system 214.
  • some sites may be neither augmenting sites nor augmented sites.
  • Such sites may have only a viewing system, for example, if the users at the site do not augment objects to other sites.
  • If a site has only a viewing system, no objects can be augmented.
  • the AR viewing system in accordance with some embodiments shows video feeds, such as video feed 224, from the remote sites augmented to a local user's view, such as user 3 214.
  • the feeds from remote sites may be augmented around the user so that when the user turns around, the remote site he/she is facing changes.
  • the system positions the feeds relative to the user. Since each participant's viewing system defines the remote feed positions locally, some embodiments include a common geometry management system that defines a common geometry system so that each of the users has the same understanding of spatiality and a common scale. Specifically, shared geometry 202 shares relative positions 204, 206 and 208 among the users to enable the positioning of the respective users within the video feeds 218, 220 and 224.
  • the AR viewing system 240 may include one or more of the following components: a user application 242, a presentation system 244, a world model system 246, a tracking system 248, and/or a context system 250.
  • a user application 242, presentation system 244, world model system 246, tracking system 248 and context system 250 can be implemented via a processor disposed in an augmented reality (AR) display.
  • AR augmented reality
  • a client-server type relationship can exist between a display and a server.
  • a display can function as a client device with user interface functionality, but rely on a connected server to perform other functionalities.
  • a user application may run on a user terminal such as a mobile device and/or may implement e.g. user interface functionalities and/or control client-side logic of the AR system 240.
  • the presentation system 244 may control all outputs to the user, such as video streams, and/or may render 2D and/or 3D AR objects into the videos, audio and/or tactile outputs.
  • the world model system 246 may store and provide access to digital representations of the world, e.g. points of interest (PoI) and AR objects linked to the PoIs.
  • the tracking system 248 may capture changes in the user's location and orientation so that the AR objects are rendered in such a way that the user experiences them to be part of the environment.
  • the tracking system 248, for example, may contain a module for tracking the user's gaze direction.
  • the tracking system 248 may be integrated with a head mounted display (HMD) that can track the user's eye movement, for example, from close distances.
  • the context system 250 may store and provide access to information about the user and real time status information, e.g. where the user is using the system.
  • a user has a user application running in an AR terminal, such as a head mounted display (HMD), which can be a mobile device such as a cell phone or the like attached to a head mounting apparatus to create an HMD.
  • the presentation system 244 brings AR objects, e.g. video feeds from remote sites, into the user's view, augmenting them into the live video feed of the local environment captured by the user's AR terminal.
  • the viewing system 240 may also have an audio system 252 producing spatial audio so that the sound attached to the augmented object, e.g. sound of the person speaking in the remote site video feed, seems to be coming from the right direction.
  • the presentation system 244 positions the AR objects into correct positions in the video stream by using 3D coordinates of the local environment and the AR object, provided by the world model system 246.
  • the tracking system 248 tracks the user's position with respect to the world model 246, allowing the presentation system 244 to keep the AR objects' positions unchanged in the user's view with respect to the environment even when the user's position changes.
  • AR content selection 238 can be used to discover and control AR content that is shown by the AR viewing system.
  • the selection system 238 may be a mobile application that a user is already familiar with for consuming content in the user's own mobile device, for example an application for browsing the user's images.
  • the disclosed system may also implement service discovery routines so that an AR content control system can discover AR viewing system interfaces and connect to the AR viewing system over a wireless network.
  • a user indicates the 3D position where the AR viewing system renders the content.
  • the AR scene creation defines for each AR object a position, an orientation, and a scale (collectively the pose).
  • the position is defined so that a user looks to the desired position in the remote site video feed and the AR content positioning system may compute eye direction with respect to the common coordinate system maintained by, for example, a common geometry system or shared geometry 202.
  • the 3D reconstruction system 236 captures shape and appearance of real objects in the site where the virtual objects are augmented.
  • the system may be based on obtaining depth data from several sensors such as sensors associated with virtual camera 216 capturing the site from different angles. Based on the depth data, the system may calculate surface vertices and normals and interpolate the result to determine a mathematical representation of the surfaces corresponding to real-world objects. The resulting surfaces may be combined with normal RGB video camera images to create a natural-looking virtual representation of the site's environment.
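  • As a rough illustration of one possible input step for such a reconstruction, the sketch below back-projects a single depth image into camera-space surface points and estimates per-pixel normals; the pinhole intrinsics and the finite-difference normal estimate are assumptions, and a real system would fuse data from several sensors.

```python
# Sketch: back-projecting a depth map into 3D points and rough surface normals.
# Intrinsics (fx, fy, cx, cy) and the depth format (metres) are assumptions.
import numpy as np


def depth_to_points(depth, fx, fy, cx, cy):
    """Back-project an HxW depth map into an HxWx3 grid of camera-space points."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel coordinate grids, shape (h, w)
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=-1)


def estimate_normals(points):
    """Per-pixel normals from finite differences over the organized point grid."""
    du = np.gradient(points, axis=1)  # change along image columns
    dv = np.gradient(points, axis=0)  # change along image rows
    n = np.cross(du, dv)
    return n / (np.linalg.norm(n, axis=-1, keepdims=True) + 1e-9)
```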
  • the 3D model management module 234 stores the 3D model reconstruction results and communicates with the common geometry management 202 to select correct views of the 3D model so that the virtual camera views 216 can be provided to the users from different angles. 3D model management module 234 may also calculate the intersection point of the gaze direction and the 3D model and add augmented virtual objects to the intersection.
  • a virtual camera 216 provides views to the virtual world from different angles.
  • virtual cameras 216 in combination with AR viewing system 240 create a high-quality synthesized video feed from 3D reconstruction results online.
  • An advantage of at least some embodiments of systems and methods herein is that virtual content in a 3D environment may be positioned without manipulating 3D objects via a 2D user interface.
  • Manipulating 3D objects via a 2D interface is a task that is not natural to most people.
  • an embodiment is directed to a method for a viewing system for video conferencing.
  • a system and method enables video conferencing between numerous sites, where each site may have one or more users.
  • the number of sites may be limited by the method used for defining the common geometry that defines how the sites are shown to the user on the other sites.
  • the user may preferably use HMD-type AR goggles.
  • Some remote site display methods may allow using large displays where the remote sites are rendered.
  • Block 302 provides for starting a video conference between one or more sites.
  • Block 310 provides for generating a reconstruction of at least part of the sites constantly during the video conference.
  • Each site may have a 3D reconstruction set-up that captures the shape and appearance of real objects in the site.
  • the 3D reconstruction system may e.g. be based on obtaining depth data from several sensors capturing the site from different angles.
  • the result of the 3D reconstruction may be a surface or faceted 3D model of the site environment.
  • Block 312 may provide for combining the site geometries into one, shared geometry.
  • the telepresence system may create a shared geometry combining all the site geometries.
  • the geometries may be combined so that they are, for example, adjacent or overlapping.
  • Geometries of two sites can be combined trivially (either adjacent or overlapping), but when the number of sites increases, there are several options for creating the common adjacent geometry.
  • Overlapping geometries mean that virtual models from different sites may contain overlapping objects, making it hard to determine which object the user is looking at when trying to select a position for augmenting an object. As a result, erroneous positions may be selected.
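  • The sketch below shows one simple, assumed policy for building an adjacent shared geometry: sites are placed evenly on a circle (the three-site case corresponds to the triangular layout of FIG. 5) so that the site models do not overlap. The radius and the forward-axis convention are assumptions, not part of the disclosure.

```python
# Sketch: adjacent arrangement of N sites on a circle, each facing the centre.
# Assumes each site's local forward axis is -z and its local origin is the site centre.
import numpy as np


def adjacent_site_transforms(num_sites, radius=3.0):
    """Return a list of 4x4 site->shared transforms, one per site."""
    transforms = []
    for i in range(num_sites):
        a = 2.0 * np.pi * i / num_sites
        c, s = np.cos(a), np.sin(a)
        T = np.array([
            [  c, 0.0,   s, radius * s],   # rotation about the vertical (y) axis ...
            [0.0, 1.0, 0.0, 0.0],
            [ -s, 0.0,   c, radius * c],   # ... plus translation onto the circle
            [0.0, 0.0, 0.0, 1.0],
        ])
        transforms.append(T)
    return transforms
```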
  • Block 320 provides for placing virtual cameras to each site and sending a synthesized video feed of the 3D reconstructed model to the other sites.
  • Each site has a capture set-up that includes one or more cameras that capture the site environment in real time and transmit the captured videos to the other sites.
  • the captured video feeds are generated by virtual cameras that create synthesized video of the 3D reconstructed model from selected viewpoints.
  • the virtual cameras may be positioned based on the selected shared geometry method. Since the users at the other sites see a 2D projection of the 3D environment, there is no need to transmit 3D information between the sites.
  • FIG. 4 depicts an example of a video feed being transmitted from a remote site (site 2) 404 to a local site (site 1) 402. As depicted in FIG. 4, the remote video feed of at least part of the remote site captured by the camera(s) 406 is transmitted from the remote site to the HMD 408 of the local site.
  • Block 322 provides for rendering videos received from sites to the AR HMD or goggles worn by users at each site, based on the shared geometry. Video feeds from the remote sites are rendered to the local user's view into respective positions that may be defined by the common geometry.
  • FIG. 5 depicts an example rendering of remote sites positioned in a triangular adjacent three-site geometry. As depicted in FIG. 5, the two remote sites 502 and 504 are rendered to the local user's view 506.
  • FIG. 6 depicts an overview of overlapping and adjacent geometries of sites.
  • Organization 602 of FIG. 6 depicts an example rendering of an adjacent geometry for two sites.
  • Organization 604 of FIG. 6 depicts an example triangular adjacent three-site geometry.
  • Block 330 provides for tracking a user's gaze direction and head position with respect to the rendered video and calculating the intersection of the gaze direction and the rendered video image.
  • the user's AR HMD or goggles may contain a gaze tracking module that is used to determine a direction vector with respect to a local geometry.
  • the gaze tracking module may be implemented by two 3D cameras (integrated into an HMD or AR goggles) that are able to view and/or determine the user's pupil position and gaze direction in three dimensions.
  • Since the gaze direction can be calculated with respect to the remote video the user is looking at, it can be used to calculate the pixel in the remote video that the user is looking at. Specifically, referring to FIG. 7, a user 702 is shown wearing an HMD with a given field of view 706A in a gaze direction 704A, representing a scene field of view 706B from gaze direction 704B.
  • the pixel coordinates 708 are also illustrated to demonstrate a selected pixel position.
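  • One way the gaze-to-pixel computation of FIG. 7 could be carried out is sketched below, assuming a pinhole model for the virtual display plane and a gaze direction already expressed in the display's camera frame (+z toward the video); the field-of-view parameters and sign conventions are assumptions.

```python
# Sketch: map a tracked gaze direction to a pixel in the rendered remote video.
# Assumes a pinhole model; sign conventions depend on the display coordinate frame.
import math


def gaze_to_pixel(direction, width, height, hfov_deg, vfov_deg):
    """Return (u, v) pixel coordinates hit by the gaze ray, or None if off-screen."""
    dx, dy, dz = direction
    if dz <= 0:
        return None  # the user is not looking toward the video plane
    fx = (width / 2.0) / math.tan(math.radians(hfov_deg) / 2.0)
    fy = (height / 2.0) / math.tan(math.radians(vfov_deg) / 2.0)
    u = width / 2.0 + fx * (dx / dz)
    v = height / 2.0 + fy * (dy / dz)
    if 0 <= u < width and 0 <= v < height:
        return int(u), int(v)
    return None
```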
  • Block 332 provides for getting user input for selecting a position for augmentation.
  • a user looks at a spot for selection in the remote video and indicates to the system when the spot is selected.
  • the method of indication may be via a voice command and/or a hand gesture in front of the AR goggles' or HMD's camera.
  • the system may show a cursor augmented to the spot the user is looking at.
  • the cursor can be shown only in the local user's video, or a 3D cursor (or a 3D line showing the gaze direction) may be augmented to each user's video enabling all the users to see the augmentation positioning process.
  • Block 340 provides for transferring image intersection coordinates to the site that sent the video and using, for example, ray casting to get the intersection point with the 3D reconstructed model. Since the intersection of the gaze direction and the video shown in the user's display can be calculated, the point at which the gaze intersects the 3D reconstructed model can also be calculated. To calculate such a point, for example, a well-known ray casting method may be used.
  • One known method of ray casting is described in Roth, Scott D. (February 1982), "Ray Casting for Modeling Solids", Computer Graphics and Image Processing, 18(2): 109-144, doi:10.1016/0146-664X(82)90169-1, which is incorporated by reference.
  • In Roth's method, a ray is shot from the selected pixel in the generated image (through a virtual camera), and the point of intersection of the closest object blocking the path of that ray is selected as the intersection point.
  • a visual example of ray casting is shown in FIG. 8. The ray casting calculation may be done at the remote site, so there is no need to transfer any information except the pixel coordinates between the sites.
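  • A minimal sketch of this ray-casting step follows; it uses the standard Moller-Trumbore ray/triangle test rather than any algorithm specific to this disclosure, and the triangle-list mesh representation is an assumption.

```python
# Sketch: cast a ray from the selected pixel (through the virtual camera) against
# the triangles of the reconstructed 3D model and keep the closest hit.
import numpy as np


def ray_triangle(origin, direction, v0, v1, v2, eps=1e-9):
    """Return the distance t along the ray to the triangle, or None if no hit."""
    e1, e2 = v1 - v0, v2 - v0
    p = np.cross(direction, e2)
    det = np.dot(e1, p)
    if abs(det) < eps:
        return None                      # ray is parallel to the triangle plane
    inv_det = 1.0 / det
    t_vec = origin - v0
    u = np.dot(t_vec, p) * inv_det
    if u < 0.0 or u > 1.0:
        return None
    q = np.cross(t_vec, e1)
    v = np.dot(direction, q) * inv_det
    if v < 0.0 or u + v > 1.0:
        return None
    t = np.dot(e2, q) * inv_det
    return t if t > eps else None


def cast_ray(origin, direction, triangles):
    """Return the closest intersection point with the mesh, or None."""
    best_t = None
    for v0, v1, v2 in triangles:         # triangles: iterable of 3-vertex numpy arrays
        t = ray_triangle(origin, direction, v0, v1, v2)
        if t is not None and (best_t is None or t < best_t):
            best_t = t
    return origin + best_t * direction if best_t is not None else None
```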
  • Block 342 provides for getting user input for uploading augmented content and transferring the content to the site where the content will be augmented.
  • the user may upload a virtual object to be augmented.
  • the user may also rotate and scale the object to desirable orientation and size.
  • a user interface for this may be implemented on a mobile device, or by using gestures to select one object from a group of objects rendered to the user's view.
  • the uploaded object and the orientation information, such as transformation matrices, may then be transmitted to the remote site.
  • Block 350 provides for augmenting the content into each of the synthesized videos to the position of intersection, and sending to each of the sites.
  • the object to be augmented is added to the reconstructed 3D model of the remote site, and the synthesized video feeds from selected virtual camera positions may be sent to each site.
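  • The sketch below illustrates how the received orientation and scale might be combined with the computed intersection point into a single pose for the augmented object; the quaternion format and the translate-rotate-scale composition order are assumed conventions.

```python
# Sketch: compose the augmented object's 4x4 pose in the remote model from the
# gaze/model intersection point, an orientation quaternion and a uniform scale.
import numpy as np


def quat_to_rot(q):
    """3x3 rotation matrix from a unit quaternion (x, y, z, w)."""
    x, y, z, w = q
    return np.array([
        [1 - 2 * (y * y + z * z), 2 * (x * y - z * w),     2 * (x * z + y * w)],
        [2 * (x * y + z * w),     1 - 2 * (x * x + z * z), 2 * (y * z - x * w)],
        [2 * (x * z - y * w),     2 * (y * z + x * w),     1 - 2 * (x * x + y * y)],
    ])


def model_matrix(intersection_point, orientation_quat, scale):
    """4x4 transform placing the AR object at the intersection with the given pose."""
    M = np.eye(4)
    M[:3, :3] = quat_to_rot(orientation_quat) * scale
    M[:3, 3] = intersection_point
    return M
```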
  • Block 352 provides for rendering the video feeds to the AR goggles or HMD worn by users at each site.
  • the video feeds containing augmented objects are rendered to each user's view.
  • a user 902 wearing an HMD views a scene 906 with virtual object 904.
  • Referring to FIG. 10, the system and method includes creating a 3D reconstructed model of a site environment 1008, positioning virtual reality cameras at individual positions for each site 1010, and creating synthesized video feeds of the reconstructed 3D model from each virtual camera position 1012.
  • video feeds are provided to site 1 1002 and site N 1006.
  • Site N 1006 receives the video feed 1016 and renders the video feed to AR goggles/HMD or other display.
  • Site 1 1002 receives video feed 1014 with a viewpoint determined for site 1 and the video feed is rendered to AR goggles/HMD or other display 1020.
  • the user's gaze direction is tracked 1024 and the site 1 user provides input for position selection 1026.
  • the gaze direction information 1028 is provided to Site 2 1004, which enables Site 2 to calculate an intersection of the gaze and a 3D reconstructed model 1030.
  • The user at Site 1 1002 provides input for selecting a virtual object 1032 and provides the data about the virtual object 1034 to Site 2 1004.
  • the virtual object is positioned within the reconstructed 3D model at the intersection point 1036.
  • synthesized video feeds of the reconstructed 3D model are created from each virtual reality camera position 1038.
  • the synthesized video feeds are provided to Site 1 1040 and to Site N 1042.
  • the synthesized video feed to Site 1 is a video feed with a viewpoint from the user at Site 1 and the synthesized video feed to Site N is a video feed with a viewpoint from the user at Site N.
  • the synthesized video feed is then rendered at Site 1 via AR goggles/HMD/display 1044, rendered at Site 2 1046, and rendered at Site N 1048.
  • In FIG. 11, another sequence illustrates an embodiment from an alternate perspective. More particularly, FIG. 11 illustrates interactions between Site 1 1102, Site 2 1104 and Site 3 1106.
  • Video feed from site 2 1104 is provided to site 1 1102 at step 1108.
  • cameras can perform a local 3D scan at step 1110 and build a local 3D model 1116.
  • gaze tracking is performed at step 1112 and a gaze vector is computed at step 1114.
  • at Site 1 1102, an input from a user, for example, can select an AR object.
  • the system can determine scale and orientation at step 1118.
  • a position and gaze vector is provided to Site 2.
  • an AR location is computed at step 1122.
  • Site 1 1102 can also provide AR object, scale and orientation data at step 1124.
  • Site 2 1104 renders a local augmented reality scene for the user at Site 2 from his/her viewpoint.
  • Transmitted videos can also be augmented at step 1130.
  • different video feeds can be provided to both Site 1 and Site 3. Specifically, Site 1 receives an augmented video as a view from user 1, 1132 and Site 3 receives an augmented video as a view from user 3, 1140.
  • a user at site 1 augments a virtual object to an environment of site 2, and the users at site 1, site 2 and site n (site 3) see the augmented objects rendered to their respective AR goggles/HMDs or 3D displays.
  • the system may use real video cameras.
  • the cameras may be positioned as described in International Application No. PCT/US16/46848, filed Aug. 12, 2016, entitled "System and Method for Augmented Reality Multi-View Telepresence," and PCT Patent Application No. PCT/US17/38820, filed June 22, 2017, entitled "System and Method for Spatial Interaction Using Automatically Positioned Cameras," which are incorporated herein by reference.
  • Each video camera may capture an individual video feed that is transmitted to one of the remote users.
  • Embodiments described herein for calculating an intersection between a gaze direction and a 3D reconstructed remote model can be used when using real video cameras.
  • a priori knowledge of the optical properties, position and capture direction of the camera can enable placing a virtual camera with the same or similar properties and pose with respect to a reconstructed model, and enable calculating the intersection similarly to using virtual cameras only.
  • the object can be augmented to outgoing video streams, using a 3D reconstructed model as a positioning reference.
  • the 3D reconstructed model is not shared with other sites, only the captured video streams. Limiting the sharing of the 3D reconstructed model is preferable due to the possibly extreme bandwidth requirements of sharing a real-time generated 3D model. However, in some use cases it may be preferable to share an entire 3D reconstructed geometry.
  • a shared geometry enables creating a single combined geometry of all the sites' 3D reconstructions, and the position selection can be implemented as described in U.S. Patent Publication No. 2016/0026242, which is incorporated herein by reference. Sharing the 3D reconstruction allows each site to implement augmentation individually, without notifying any other parties of identified objects each user has selected to be rendered into his/her own view.
  • Some embodiments described above allow positioning augmented objects in the intersections of gaze direction and the virtual model, meaning the augmented objects are touching the 3D reconstructed virtual model.
  • some exemplary embodiments use user interaction to move the object along a "gaze ray", which enables moving the object away from the intersection point, but maintaining the object along the gaze ray.
  • this action is collaborative. For example, one user selects the gaze direction, and another user, who sees the remote environment from a different angle, positions the object in the correct position along the ray.
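  • A minimal sketch of this "move along the gaze ray" interaction follows; the parameter t is assumed to be adjusted by user input (e.g. by the collaborating user at the other site).

```python
# Sketch: slide the augmented object along the fixed gaze ray by varying t.
import numpy as np


def point_on_gaze_ray(origin, direction, t):
    """Position at distance t from the ray origin along the (unit) gaze direction."""
    return np.asarray(origin, dtype=float) + t * np.asarray(direction, dtype=float)

# Example: start at the gaze/model intersection (t = t_hit) and let a collaborating
# user move the object nearer (smaller t) or farther (larger t) along the same ray.
```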
  • the orientation of the virtual object augmented to remote video may be selected after it has been augmented into the remote video.
  • the system may offer a user interface for rotating the object (e.g. by recognizing user's gestures in front of AR goggles/HMD).
  • the gestures can be interpreted by a local system and transmitted to a remote site that updates the virtual object orientation accordingly. Any of the users viewing the remote augmented video may change the orientation.
  • Pekka, Seppo and Sanni are having an enhanced video conference using the system described above. They have all set up the system in their apartments. Seppo has a 3D model of a piece of furniture that he thinks would look good in Pekka's environment. Seppo selects a position where he wants to add the furniture model by looking at a position on a wall in the video view coming from Pekka's apartment. Seppo selects a furniture model using his mobile phone user interface (UI) and informs the system to augment the object using a voice command.
  • Systems and methods described herein may be implemented in modules that carry out (i.e., perform, execute, and the like) various functions described herein.
  • each described module includes hardware (e.g., one or more processors, microprocessors, microcontrollers, microchips, application-specific integrated circuits (ASICs), field programmable gate arrays (FPGAs), memory devices, and/or one or more of any other type or types of devices and/or components deemed suitable by those of skill in the relevant art in a given context and/or for a given implementation).
  • Each described module shown and described can also include instructions executable for carrying out the one or more functions described as being carried out by the particular module, and those instructions could take the form of or at least include hardware (i.e., hardwired) instructions, firmware instructions, software instructions, and/or the like, stored in any non-transitory computer-readable medium deemed suitable by those of skill in the relevant art.
  • Exemplary embodiments disclosed herein are implemented using one or more wired and/or wireless network nodes, such as a wireless transmit/receive unit (WTRU) or other network entity.
  • FIG. 12 is a system diagram of an exemplary WTRU 1202, which may be employed as a mobile device, a remote device, a camera, a monitoring-and-communication system, and/or a transmitter, in embodiments described herein.
  • the WTRU 1202 may include a processor 1218, a communication interface 1219 including a transceiver 1220, a transmit/receive element 1222, a speaker/microphone 1224, a keypad 1226, a display/touchpad 1228, a non-removable memory 1230, a removable memory 1232, a power source 1234, a global positioning system (GPS) chipset 1236, and sensors 1238.
  • the processor 1218 may be a general purpose processor, a special purpose processor, a conventional processor, a digital signal processor (DSP), a plurality of microprocessors, one or more microprocessors in association with a DSP core, a controller, a microcontroller, Application Specific Integrated Circuits (ASICs), Field Programmable Gate Array (FPGAs) circuits, any other type of integrated circuit (IC), a state machine, and the like.
  • the processor 1218 may perform signal coding, data processing, power control, input/output processing, and/or any other functionality that enables the WTRU 1202 to operate in a wireless environment.
  • the processor 1218 may be coupled to the transceiver 1220, which may be coupled to the transmit/receive element 1222. While FIG. 12 depicts the processor 1218 and the transceiver 1220 as separate components, it will be appreciated that the processor 1218 and the transceiver 1220 may be integrated together in an electronic package or chip.
  • the transmit/receive element 1222 may be configured to transmit signals to, or receive signals from, a base station over the air interface 1216.
  • the transmit/receive element 1222 may be an antenna configured to transmit and/or receive RF signals.
  • the transmit/receive element 1222 may be an emitter/detector configured to transmit and/or receive IR, UV, or visible light signals, as examples.
  • the transmit/receive element 1222 may be configured to transmit and receive both RF and light signals. It will be appreciated that the transmit/receive element 1222 may be configured to transmit and/or receive any combination of wireless signals.
  • the WTRU 1202 may include any number of transmit/receive elements 1222. More specifically, the WTRU 1202 may employ MIMO technology. Thus, in one embodiment, the WTRU 1202 may include two or more transmit/receive elements 1222 (e.g., multiple antennas) for transmitting and receiving wireless signals over the air interface 1216.
  • the transceiver 1220 may be configured to modulate the signals that are to be transmitted by the transmit/receive element 1222 and to demodulate the signals that are received by the transmit/receive element 1222.
  • the WTRU 1202 may have multi-mode capabilities.
  • the transceiver 1220 may include multiple transceivers for enabling the WTRU 1202 to communicate via multiple RATs, such as UTRA and IEEE 802.11, as examples.
  • the processor 1218 of the WTRU 1202 may be coupled to, and may receive user input data from, the speaker/microphone 1224, the keypad 1226, and/or the display/touchpad 1228 (e.g., a liquid crystal display (LCD) display unit or organic light-emitting diode (OLED) display unit).
  • the processor 1218 may also output user data to the speaker/microphone 1224, the keypad 1226, and/or the display/touchpad 1228.
  • the processor 1218 may access information from, and store data in, any type of suitable memory, such as the non-removable memory 1230 and/or the removable memory 1232.
  • the non-removable memory 1230 may include random-access memory (RAM), read-only memory (ROM), a hard disk, or any other type of memory storage device.
  • the removable memory 1232 may include a subscriber identity module (SIM) card, a memory stick, a secure digital (SD) memory card, and the like.
  • the processor 1218 may access information from, and store data in, memory that is not physically located on the WTRU 1202, such as on a server or a home computer (not shown).
  • the processor 1218 may receive power from the power source 1234, and may be configured to distribute and/or control the power to the other components in the WTRU 1202.
  • the power source 1234 may be any suitable device for powering the WTRU 1202.
  • the power source 1234 may include one or more dry cell batteries (e.g., nickel-cadmium (NiCd), nickel-zinc (NiZn), nickel metal hydride (NiMH), lithium-ion (Li-ion), and the like), solar cells, fuel cells, and the like.
  • the processor 1218 may also be coupled to the GPS chipset 1236, which may be configured to provide location information (e.g., longitude and latitude) regarding the current location of the WTRU 1202.
  • the WTRU 1202 may receive location information over the air interface 1216 from a base station and/or determine its location based on the timing of the signals being received from two or more nearby base stations. It will be appreciated that the WTRU 1202 may acquire location information by way of any suitable location-determination method while remaining consistent with an embodiment.
  • the processor 1218 may further be coupled to other peripherals 1238, which may include one or more software and/or hardware modules that provide additional features, functionality and/or wired or wireless connectivity.
  • the peripherals 1238 may include sensors such as an accelerometer, an e-compass, a satellite transceiver, a digital camera (for photographs or video), a universal serial bus (USB) port, a vibration device, a television transceiver, a hands free headset, a Bluetooth® module, a frequency modulated (FM) radio unit, a digital music player, a media player, a video game player module, an Internet browser, and the like.
  • FIG. 13 depicts an exemplary network entity 1390 that may be used in embodiments of the present disclosure, for example as part of a monitoring-and-communication system, as described herein.
  • network entity 1390 includes a communication interface 1392, a processor 1394, and non-transitory data storage 1396, all of which are communicatively linked by a bus, network, or other communication path 1398.
  • Communication interface 1392 may include one or more wired communication interfaces and/or one or more wireless-communication interfaces. With respect to wired communication, communication interface 1392 may include one or more interfaces such as Ethernet interfaces, as an example. With respect to wireless communication, communication interface 1392 may include components such as one or more antennae, one or more transceivers/chipsets designed and configured for one or more types of wireless (e.g., LTE) communication, and/or any other components deemed suitable by those of skill in the relevant art. And further with respect to wireless communication, communication interface 1392 may be equipped at a scale and with a configuration appropriate for acting on the network side, as opposed to the client side, of wireless communications (e.g., LTE communications, Wi-Fi communications, and the like). Thus, communication interface 1392 may include the appropriate equipment and circuitry (perhaps including multiple transceivers) for serving multiple mobile stations, UEs, or other access terminals in a coverage area.
  • Processor 1394 may include one or more processors of any type deemed suitable by those of skill in the relevant art, some examples including a general-purpose microprocessor and a dedicated DSP.
  • Data storage 1396 may take the form of any non-transitory computer-readable medium or combination of such media; some examples include flash memory, read-only memory (ROM), and random-access memory (RAM), as any one or more types of non-transitory data storage deemed suitable by those of skill in the relevant art could be used. As depicted in FIG. 13, data storage 1396 contains program instructions 1397 executable by processor 1394 for carrying out various combinations of the various network-entity functions described herein.
  • Although features and elements are described above in particular combinations, one of ordinary skill in the art will appreciate that each feature or element can be used alone or in any combination with the other features and elements.
  • Examples of computer-readable storage media include a read-only memory (ROM), a random-access memory (RAM), a register, cache memory, semiconductor memory devices, magnetic media such as internal hard disks and removable disks, magneto-optical media, and optical media such as CD-ROM disks and digital versatile disks (DVDs).
  • a processor in association with software may be used to implement a radio frequency transceiver for use in a WTRU, UE, terminal, base station, RNC, or any host computer.
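
The location-determination bullet above notes that the WTRU may estimate its position from the timing of signals received from two or more nearby base stations, but the application does not prescribe a particular algorithm. Purely as an illustration, the following Python sketch performs a generic linear least-squares trilateration from range estimates (which timing measurements could yield) to three base stations at hypothetical coordinates; the timing-to-range conversion and measurement noise handling are omitted, and all values are assumed for the example.

```python
# Illustration only: estimating a 2D position from range estimates to three
# base stations at known coordinates. This is a generic least-squares
# trilateration sketch, not a method defined by this application.
import numpy as np

def trilaterate(stations, ranges):
    """Estimate (x, y) from station coordinates and range estimates."""
    (x1, y1), r1 = stations[0], ranges[0]
    A, b = [], []
    for (xi, yi), ri in zip(stations[1:], ranges[1:]):
        # Subtracting the first circle equation from the others linearizes
        # (x - xi)^2 + (y - yi)^2 = ri^2 into A @ [x, y] = b.
        A.append([2.0 * (xi - x1), 2.0 * (yi - y1)])
        b.append(r1**2 - ri**2 + xi**2 - x1**2 + yi**2 - y1**2)
    solution, *_ = np.linalg.lstsq(np.array(A), np.array(b), rcond=None)
    return tuple(solution)

# Hypothetical base-station layout and measured ranges to a device at (30, 40).
stations = [(0.0, 0.0), (100.0, 0.0), (0.0, 100.0)]
ranges = [50.0, 80.6226, 67.0820]
print(trilaterate(stations, ranges))  # approximately (30.0, 40.0)
```
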

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Computer Graphics (AREA)
  • Computer Hardware Design (AREA)
  • Software Systems (AREA)
  • Processing Or Creating Images (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

A system and method are disclosed that include: receiving, at a first augmented reality (AR) head-mounted display (HMD) at a first site, a video feed of a second site that is remote from the first site; rendering the video feed using the first AR HMD; determining gaze data using the first AR HMD; obtaining, at the first AR HMD, selected AR object data based on user input at the first AR HMD; outputting, from the first AR HMD, the gaze data of the first AR HMD user and the selected AR object data; and displaying, using the first AR HMD, a video feed received from the second site that includes an AR object based on the output selected AR object data, the AR object being positioned based on the output gaze data. In one embodiment, a server interacts with the AR HMD in a client-server relationship.
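
To make the sequence summarized in the abstract easier to follow, here is a minimal Python sketch of one possible reading of that client-server exchange. It is illustrative only: every class, method, and field name below (ARHMDClient, RemoteSiteServer, GazeData, ARObjectData, composite, and so on) is hypothetical and not defined by the application, and a real implementation would involve actual video transport, gaze tracking, and rendering.

```python
# A minimal, non-normative sketch of the client-server flow described in the
# abstract. All names are hypothetical and chosen for illustration only.
from dataclasses import dataclass

@dataclass
class GazeData:
    yaw: float      # horizontal gaze direction, degrees
    pitch: float    # vertical gaze direction, degrees

@dataclass
class ARObjectData:
    object_id: str  # identifier of the AR object selected by the user

class RemoteSiteServer:
    """Stands in for the second (remote) site in a client-server relationship."""
    def get_video_feed(self) -> bytes:
        return b"<frame of the remote site>"  # placeholder frame

    def composite(self, gaze: GazeData, obj: ARObjectData) -> bytes:
        # Returns a video feed of the remote site that includes the selected
        # AR object, positioned according to the transmitted gaze data.
        return b"<frame with AR object at gaze position>"

class ARHMDClient:
    """First-site AR HMD that renders feeds and reports gaze and selection."""
    def __init__(self, server: RemoteSiteServer):
        self.server = server

    def render(self, frame: bytes) -> None:
        print(f"rendering {len(frame)} bytes on the AR HMD display")

    def run_once(self, selected_object: ARObjectData, gaze: GazeData) -> None:
        self.render(self.server.get_video_feed())  # receive and render remote feed
        # Output the user's gaze data and selected AR object data, then display
        # the returned feed in which the AR object is positioned per the gaze data.
        self.render(self.server.composite(gaze, selected_object))

ARHMDClient(RemoteSiteServer()).run_once(ARObjectData("chair-01"), GazeData(12.5, -3.0))
```
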
PCT/US2018/023166 2017-03-24 2018-03-19 Procédé et système permettant de découvrir et de positionner un contenu dans un espace de réalité augmentée WO2018175335A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201762476318P 2017-03-24 2017-03-24
US62/476,318 2017-03-24

Publications (1)

Publication Number Publication Date
WO2018175335A1 true WO2018175335A1 (fr) 2018-09-27

Family

ID=61899393

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2018/023166 WO2018175335A1 (fr) 2017-03-24 2018-03-19 Procédé et système permettant de découvrir et de positionner un contenu dans un espace de réalité augmentée

Country Status (1)

Country Link
WO (1) WO2018175335A1 (fr)

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2563731A (en) * 2017-05-09 2018-12-26 A9 Com Inc Markerless image analysis for augmented reality
US11030474B1 (en) 2019-05-28 2021-06-08 Apple Inc. Planar region boundaries based on intersection
WO2021163224A1 (fr) 2020-02-10 2021-08-19 Magic Leap, Inc. Colocalisation dynamique de contenu virtuel
WO2021242451A1 (fr) * 2020-05-29 2021-12-02 Microsoft Technology Licensing, Llc Émojis basés sur des gestes de la main
US11201953B2 (en) 2018-07-24 2021-12-14 Magic Leap, Inc. Application sharing
CN114894253A (zh) * 2022-05-18 2022-08-12 威海众合机电科技有限公司 一种应急视觉智能增强方法、系统及设备
US11475644B2 (en) 2020-02-14 2022-10-18 Magic Leap, Inc. Session manager
US11494528B2 (en) 2020-02-14 2022-11-08 Magic Leap, Inc. Tool bridge
US11763559B2 (en) 2020-02-14 2023-09-19 Magic Leap, Inc. 3D object annotation
EP4195159A4 (fr) * 2020-08-06 2024-01-17 Maxell Ltd Procédé et système de partage de réalité virtuelle

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2012135553A1 (fr) * 2011-03-29 2012-10-04 Qualcomm Incorporated Obstruction sélective par les mains au-dessus de projections virtuelles sur des surfaces physiques par un suivi du squelette
US20130335405A1 (en) * 2012-06-18 2013-12-19 Michael J. Scavezze Virtual object generation within a virtual environment
US20150215351A1 (en) * 2014-01-24 2015-07-30 Avaya Inc. Control of enhanced communication between remote participants using augmented and virtual reality
US20160026242A1 (en) 2014-07-25 2016-01-28 Aaron Burns Gaze-based object placement within a virtual reality environment
US20160027218A1 (en) * 2014-07-25 2016-01-28 Tom Salter Multi-user gaze projection using head mounted display devices

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ROTH, SCOTT D.: "Ray Casting for Modeling Solids", COMPUTER GRAPHICS AND IMAGE PROCESSING, vol. 18, no. 2, February 1982 (1982-02-01), pages 109 - 144, XP001376683

Cited By (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10339714B2 (en) 2017-05-09 2019-07-02 A9.Com, Inc. Markerless image analysis for augmented reality
US10733801B2 2017-05-09 2020-08-04 A9.Com, Inc. Markerless image analysis for augmented reality
GB2563731B (en) * 2017-05-09 2021-04-14 A9 Com Inc Markerless image analysis for augmented reality
GB2563731A (en) * 2017-05-09 2018-12-26 A9 Com Inc Markerless image analysis for augmented reality
US11201953B2 (en) 2018-07-24 2021-12-14 Magic Leap, Inc. Application sharing
US11936733B2 (en) 2018-07-24 2024-03-19 Magic Leap, Inc. Application sharing
US11030474B1 (en) 2019-05-28 2021-06-08 Apple Inc. Planar region boundaries based on intersection
US11335070B2 (en) 2020-02-10 2022-05-17 Magic Leap, Inc. Dynamic colocation of virtual content
US20220245905A1 (en) * 2020-02-10 2022-08-04 Magic Leap, Inc. Dynamic colocation of virtual content
WO2021163224A1 (fr) 2020-02-10 2021-08-19 Magic Leap, Inc. Colocalisation dynamique de contenu virtuel
US11475644B2 (en) 2020-02-14 2022-10-18 Magic Leap, Inc. Session manager
US11494528B2 (en) 2020-02-14 2022-11-08 Magic Leap, Inc. Tool bridge
US11763559B2 (en) 2020-02-14 2023-09-19 Magic Leap, Inc. 3D object annotation
US11797720B2 (en) 2020-02-14 2023-10-24 Magic Leap, Inc. Tool bridge
US11861803B2 (en) 2020-02-14 2024-01-02 Magic Leap, Inc. Session manager
WO2021242451A1 (fr) * 2020-05-29 2021-12-02 Microsoft Technology Licensing, Llc Émojis basés sur des gestes de la main
US11340707B2 (en) 2020-05-29 2022-05-24 Microsoft Technology Licensing, Llc Hand gesture-based emojis
EP4195159A4 (fr) * 2020-08-06 2024-01-17 Maxell Ltd Procédé et système de partage de réalité virtuelle
CN114894253A (zh) * 2022-05-18 2022-08-12 威海众合机电科技有限公司 一种应急视觉智能增强方法、系统及设备

Similar Documents

Publication Publication Date Title
WO2018175335A1 (fr) Procédé et système permettant de découvrir et de positionner un contenu dans un espace de réalité augmentée
US10535181B2 (en) Virtual viewpoint for a participant in an online communication
US11363240B2 (en) System and method for augmented reality multi-view telepresence
US11095856B2 (en) System and method for 3D telepresence
US9332222B2 (en) Controlled three-dimensional communication endpoint
JP2023126303A (ja) 画像ディスプレイデバイスの位置特定マップを決定および/または評価するための方法および装置
WO2018005235A1 (fr) Système et procédé d'interaction spatiale utilisant des caméras positionnées automatiquement
JP6932796B2 (ja) レイヤ化拡張型エンターテインメント体験
WO2018039071A1 (fr) Procédé et système de présentation de sites de réunion à distance à partir de points de vue dépendants d'un utilisateur
CN108693970A (zh) 用于调适可穿戴装置的视频图像的方法和设备
JP2020068513A (ja) 画像処理装置および画像処理方法
CN104866261A (zh) 一种信息处理方法和装置
US20190139313A1 (en) Device and method for sharing an immersion in a virtual environment
US10482671B2 (en) System and method of providing a virtual environment
CN116670722A (zh) 增强现实协作系统
JP2023092729A (ja) 通信装置、通信システム、表示方法、及びプログラム
CN116055709A (zh) 同步多ar显示装置沙盘信息显示系统及方法

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18715881

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18715881

Country of ref document: EP

Kind code of ref document: A1