WO2013155217A1 - Realistic occlusion for a head mounted augmented reality display - Google Patents

Realistic occlusion for a head mounted augmented reality display

Info

Publication number
WO2013155217A1
Authority
WO
WIPO (PCT)
Prior art keywords
occlusion
data
virtual
real
virtual object
Application number
PCT/US2013/036023
Other languages
English (en)
French (fr)
Inventor
Kevin A. Geisner
Brian J. MOUNT
Stephen G. Latta
Daniel J. MCCULLOCH
Kyungsuk David LEE
Ben J. SUGDEN
Jeffrey N. Margolis
Kathryn Stone Perez
Sheridan Martin SMALL
Mark J. Finocchio
Robert L. CROCCO
Original Assignee
Geisner Kevin A
Mount Brian J
Latta Stephen G
Mcculloch Daniel J
Lee Kyungsuk David
Sugden Ben J
Margolis Jeffrey N
Kathryn Stone Perez
Small Sheridan Martin
Finocchio Mark J
Crocco Robert L
Priority claimed from US13/443,368 (US9122053B2)
Application filed by Geisner Kevin A, Mount Brian J, Latta Stephen G, Mcculloch Daniel J, Lee Kyungsuk David, Sugden Ben J, Margolis Jeffrey N, Kathryn Stone Perez, Small Sheridan Martin, Finocchio Mark J, Crocco Robert L
Publication of WO2013155217A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R1/00 Details of transducers, loudspeakers or microphones
    • H04R1/10 Earpieces; Attachments therefor; Earphones; Monophonic headphones
    • H04R1/1041 Mechanical or electronic switches, or control elements
    • G PHYSICS
    • G02 OPTICS
    • G02B OPTICAL ELEMENTS, SYSTEMS OR APPARATUS
    • G02B27/00 Optical systems or apparatus not provided for by any of the groups G02B1/00 - G02B26/00, G02B30/00
    • G02B27/01 Head-up displays
    • G02B27/017 Head mounted
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T19/00 Manipulating 3D models or images for computer graphics
    • G06T19/006 Mixed reality
    • G PHYSICS
    • G02 OPTICS
    • G02B OPTICAL ELEMENTS, SYSTEMS OR APPARATUS
    • G02B27/00 Optical systems or apparatus not provided for by any of the groups G02B1/00 - G02B26/00, G02B30/00
    • G02B27/01 Head-up displays
    • G02B27/0101 Head-up displays characterised by optical features
    • G02B2027/0118 Head-up displays characterised by optical features comprising devices for improving the contrast of the display / brilliance control visibility
    • G PHYSICS
    • G02 OPTICS
    • G02B OPTICAL ELEMENTS, SYSTEMS OR APPARATUS
    • G02B27/00 Optical systems or apparatus not provided for by any of the groups G02B1/00 - G02B26/00, G02B30/00
    • G02B27/01 Head-up displays
    • G02B27/0101 Head-up displays characterised by optical features
    • G02B2027/0138 Head-up displays characterised by optical features comprising image capture systems, e.g. camera
    • G PHYSICS
    • G02 OPTICS
    • G02B OPTICAL ELEMENTS, SYSTEMS OR APPARATUS
    • G02B27/00 Optical systems or apparatus not provided for by any of the groups G02B1/00 - G02B26/00, G02B30/00
    • G02B27/01 Head-up displays
    • G02B27/017 Head mounted
    • G02B2027/0178 Eyeglass type

Definitions

  • Augmented reality, also referred to as mixed reality, is a technology that allows virtual imagery to be mixed with a user's view of the real world.
  • In addition to making the physical properties (e.g. shape, color, size, texture) of virtual objects realistic in a display, it is desired that their position and movement with respect to real objects display realistically. For example, it is desired that a virtual object be blocked from view and block another object, real or virtual, from view like a real object would in a user field of view provided by the head mounted display device.
  • a spatial occlusion relationship identifies at least one or more portions of an object being blocked from view in the user field of view, either partially or wholly.
  • Either of the real object or the virtual object can be either an occluding object or an occluded object.
  • the occluding object blocks the occluded object from view, at least partially.
  • a partial occlusion interface is the intersection at which a boundary for an occluding portion of an occluding object is adjacent to an unoccluded, e.g. unblocked, portion of the occluded object in the spatial relationship.
  • dashed lines 708, 710 and 712 in Figure 7B are each an example of a partial occlusion interface between the virtual dolphin 7022 and the real tree 7162.
  • a model in accordance with a level of detail may also be generated for a conforming occlusion interface in which at least a portion of boundary data of a virtual object conforms to at least a portion of boundary data for a real object.
  • a conforming occlusion can be a whole occlusion or a partial occlusion.
  • a level of detail for a model for example a geometrical model, representing an occlusion interface is determined based on one or more criteria such as a depth distance from the display device, display size, and closeness to a point of gaze.
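As a rough illustration of how such criteria could be combined, the following Python sketch picks a level of detail from the depth distance, the display size and the closeness to the point of gaze. The level names, the thresholds, the `InterfaceInfo` fields and the function name `choose_level_of_detail` are all hypothetical; the text names the criteria but does not give values or a scoring rule.

```python
from dataclasses import dataclass

# Hypothetical level names, ordered from coarsest to finest model.
LEVELS = ("bounding_shape", "coarse_fit", "fine_fit")


@dataclass
class InterfaceInfo:
    depth_m: float          # depth distance of the occlusion interface from the display
    display_size_px: float  # approximate size of the interface on the display
    gaze_angle_deg: float   # angular distance of the interface from the point of gaze


def choose_level_of_detail(info: InterfaceInfo) -> str:
    """Return a level name. All thresholds below are illustrative assumptions."""
    score = 0
    if info.depth_m < 3.0:            # near the display device
        score += 1
    if info.display_size_px > 100.0:  # large on the display
        score += 1
    if info.gaze_angle_deg < 10.0:    # close to where the user is looking
        score += 1
    # A score of 0 or 1 maps to the coarsest level, 2 to the middle, 3 to the finest.
    return LEVELS[max(0, score - 1)]
```

An interface that is near, large and close to the point of gaze receives the finest model in this sketch, while a distant, small, peripheral one falls back to a bounding shape.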
  • Some embodiments also comprise realistic, three dimensional audio occlusion of an occluded object, real or virtual, based on physical properties of the occluding object.
  • the technology provides one or more embodiments of a method for displaying realistic occlusion between a real object and a virtual object by a head mounted, augmented reality display device system.
  • a spatial occlusion relationship between an occluding object and an occluded object including the real object and the virtual object is determined to exist based on overlapping three dimensional (3D) space positions of the objects in a 3D mapping of at least a user field of view of the display device system.
  • object boundary data of an occluding portion of the occluding object in the partial occlusion is retrieved.
  • a level of detail is determined for a model representing a partial occlusion interface based on level of detail criteria.
  • the display device system alone or with the aid of other computers, generates a model of the partial occlusion interface based on the retrieved object boundary data in accordance with the determined level of detail.
  • a modified version of boundary data of the virtual object adjacent an unoccluded portion of the real object is generated based on the model and the generated adjacent boundary data has a shape based on the model.
  • the display device displays an unoccluded portion of the virtual object in accordance with the modified version of its boundary data.
  • a see-through, augmented reality display device system comprises a see-through, augmented reality display having a user field of view and being supported by a support structure of the see-through, augmented reality display device. At least one camera for capturing image data and depth data for real objects in the user field of view is also supported by the support structure.
  • One or more software controlled processors are communicatively coupled to the at least one camera for receiving image and depth data including the user field of view. The one or more software controlled processors determine a spatial occlusion relationship between an occluding object and an occluded object based on the image and depth data.
  • the occluding object and the occluded object include a virtual object and a real object.
  • the one or more software controlled processors are communicatively coupled to the see-through, augmented reality display and the one or more processors cause the see-through display to represent the spatial occlusion relationship in the display by modifying display of the virtual object.
  • the one or more processors cause the see-through display to represent the spatial occlusion relationship in the display by determining a level of detail for generating a model of an occlusion interface between the real object and the virtual object based on level of detail criteria.
  • a modified version of object boundary data may be generated for the virtual object based on the generated model, and the see-through, augmented reality display can display the virtual object based on the modified version of its object boundary data.
  • the technology provides one or more embodiments of one or more processor readable storage devices having instructions encoded thereon which instructions cause one or more processors to execute a method for providing realistic audiovisual occlusion between a real object and a virtual object by a head mounted, augmented reality display device system.
  • the method comprises determining a spatial occlusion relationship between a virtual object and a real object in an environment of a head mounted, augmented reality display device based on a three dimensional mapping of the environment. Whether an audio occlusion relationship exists between the virtual object and the real object is determined, and if so, audio data for the occluded object is modified based on one or more physical properties associated with an occluding object.
  • One or more earphones of the display device are caused to output the modified audio data.
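The audio modification step can be sketched as follows, under strong simplifying assumptions. The `ABSORPTION` table, its material names and values, and the function `occlude_audio` are hypothetical; the text only says that audio data for the occluded object is modified based on one or more physical properties of the occluding object. A single broadband gain is used here in place of a real sound occlusion model.

```python
from typing import List

# Hypothetical absorption coefficients per material type of the occluding object
# (0.0 = lets all sound through, 1.0 = blocks all sound).
ABSORPTION = {"glass": 0.3, "wood": 0.5, "brick": 0.8}


def occlude_audio(samples: List[float], material: str) -> List[float]:
    """Attenuate the occluded object's audio samples by the occluder's absorption.

    Unknown materials leave the audio unchanged (an assumption of this sketch).
    """
    gain = 1.0 - ABSORPTION.get(material, 0.0)
    return [s * gain for s in samples]
```

A more faithful model would likely be frequency dependent, for example damping high frequencies more than low ones; the single gain only shows where a physical property of the occluding object enters the computation.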
  • Figure 1A is a block diagram depicting example components of one embodiment of a see-through, augmented reality display device system.
  • Figure 1B is a block diagram depicting example components of another embodiment of a see-through, augmented reality display device system.
  • Figure 2A is a side view of an eyeglass temple of a frame in an embodiment of the see-through, augmented reality display device embodied as eyeglasses providing support for hardware and software components.
  • Figure 2B is a top view of an embodiment of a display optical system of a see-through, near-eye, augmented reality display device.
  • Figure 2C is a block diagram of one embodiment of a computing system that can be used to implement a network accessible computing system.
  • Figure 3A is a block diagram of a system from a software perspective for providing realistic occlusion between a real object and a virtual object by a head mounted, augmented reality display device system.
  • Figure 3B illustrates an example of a reference object data set.
  • Figure 3C illustrates some examples of data fields in an object physical properties data set.
  • Figure 4A illustrates an example of a spatial occlusion resulting in an audio occlusion of a virtual object by a real object.
  • Figure 4B illustrates an example of a spatial occlusion resulting in an audio occlusion of a real object by a virtual object.
  • Figure 5A is a flowchart of an embodiment of a method for displaying realistic partial occlusion between a real object and a virtual object by a head mounted augmented reality display device system.
  • Figure 5B is a flowchart of an implementation example for determining a spatial occlusion relationship between a virtual object and a real object in a user field of view of a head mounted, augmented reality display device based on 3D space positions for the objects.
  • Figure 5C is a flowchart of an embodiment of a method for displaying a realistic conforming occlusion interface between a real object occluded by a conforming virtual object by a head mounted, augmented reality display device system.
  • Figure 6A is a flowchart of an implementation example for determining a level of detail for representing an occlusion interface based on level of detail criteria including a depth position of the occlusion interface.
  • Figure 6B is a flowchart of an implementation example for determining a level of detail for representing an occlusion interface based on level of detail criteria including a display size of the occlusion interface.
  • Figure 6C is a flowchart of an implementation example for determining a level of detail for representing an occlusion interface based on level of detail criteria and a gaze priority value.
  • Figure 6D is a flowchart of an implementation example for determining a level of detail using speed of the interface as a basis.
  • Figure 7A illustrates an example of a level of detail using at least part of a boundary of a pre-defined bounding geometrical shape.
  • Figure 7B illustrates an example of a level of detail using geometry fitting with a first accuracy criteria.
  • Figure 7C illustrates an example of a level of detail using geometry fitting with a second accuracy criteria indicating a higher level of modeled detail.
  • Figure 7D illustrates an example of a level of detail using a bounding volume as boundary data for at least a real object.
  • Figure 8A illustrates an example of partial occlusion interfaces modeled as triangle legs for the virtual object in Figure 7A.
  • Figure 8B illustrates an example of partial occlusion interfaces modeled by geometry fitting with the first accuracy criteria for the virtual object in Figure 7B.
  • Figure 8C is a reference image of the unmodified virtual object, a dolphin, in Figures 7A, 7B, 7C and 8A and 8B.
  • Figure 9A illustrates an example of a real person to which a conforming virtual object is registered.
  • Figure 9B illustrates examples of conforming occlusion interfaces modeled at a first level of detail with a first accuracy criteria for a virtual object.
  • Figure 9C illustrates an example of a conforming occlusion interface modeled with a second level of detail with a second accuracy criteria for a virtual object.
  • Figure 10 illustrates examples of displaying shadow effects between occluding real and virtual objects.
  • Figure 11 is a flowchart describing an embodiment of a process for displaying one or more virtual objects in a user field of view of a head mounted, augmented reality display device.
  • Figure 12 is a flowchart describing an embodiment of a process for accounting for shadows.
  • Figure 13A is a flowchart of an embodiment of a method for providing realistic audiovisual occlusion between a real object and a virtual object by a head mounted, augmented reality display device system.
  • Figure 13B is a flowchart of an implementation process example for determining whether an audio occlusion relationship between the virtual object and the real object exists based on one or more sound occlusion models associated with one or more physical properties of an occluding object.
  • Various embodiments are described for providing realistic occlusion between a real object and a virtual object by a see-through, augmented reality display device system.
  • One or more cameras capture image data of a field of view of the display of a display device system, hereafter referred to as a user field of view as it approximates a user's field of view when looking through the display device.
  • a spatial occlusion relationship between a real object and a virtual object in the user field of view is identified based on the captured image data.
  • a 3D model including 3D object space positions of at least the user field of view may be mapped based on stereopsis processing of the image data or based on depth data from one or more depth sensors and the image data.
  • a 3D space is a volume of space occupied by the object.
  • the 3D space can match the 3D shape of the object or be a less precise volume like a bounding shape around an object.
  • Some examples of a bounding shape are a bounding box, a bounding sphere, a bounding cylinder, a bounding ellipse or a complex polygon which is typically a little larger than the object.
  • the bounding volume may have a shape of a predefined geometry as in the examples. In other examples, the bounding volume shape is not a predetermined shape. For example, the volume may follow the detected edges of the object.
  • the bounding volume may be used as an occlusion volume in some embodiments as discussed further below.
  • a 3D space position represents position coordinates for the boundary of the volume or 3D space. In other words, the 3D space position identifies how much space an object occupies and where in the user field of view that occupied space is.
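A minimal sketch of how 3D space positions given as bounding boxes could be tested for a spatial occlusion relationship follows. It assumes axis-aligned boxes in view coordinates (x right, y up, z depth away from the display) and treats overlap in x and y as overlap in the user field of view, which ignores perspective. The `Box` type and the functions `overlaps_in_view` and `occluder` are hypothetical names, not taken from the text.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Box:
    """Axis-aligned bounding box used as a 3D space position (sketch assumption)."""
    name: str
    min_xyz: Tuple[float, float, float]
    max_xyz: Tuple[float, float, float]


def overlaps_in_view(a: Box, b: Box) -> bool:
    """True if the boxes overlap in x and y, i.e. along the viewing direction."""
    return all(
        a.min_xyz[i] < b.max_xyz[i] and b.min_xyz[i] < a.max_xyz[i]
        for i in (0, 1)
    )


def occluder(a: Box, b: Box) -> Optional[Box]:
    """Return the nearer box if the two overlap in view, else None."""
    if not overlaps_in_view(a, b):
        return None
    # The box whose front face is closer to the display is the occluding object.
    return a if a.min_xyz[2] <= b.min_xyz[2] else b
```

With a tree box at depth 2 to 2.5 and a dolphin box at depth 4 to 5 that overlap in x and y, `occluder` returns the tree, so the tree is the occluding object and the dolphin the occluded one.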
  • one object blocks, either partially or wholly, another object in the field of view.
  • a real pine tree partially occludes a virtual dolphin.
  • not rendering the virtual object can represent its occlusion on the display.
  • a real object can be wholly or partially blocked by a virtual object according to an executing application.
  • the virtual object can be displayed for the display elements, e.g. pixels of the display, for the whole or parts of the virtual object which are in front of the whole or parts of the real object.
  • the virtual object can be sized to completely cover the real object.
  • the virtual object is to be displayed to conform its shape over at least a portion of a real object.
  • There is a conforming occlusion interface as the shape of the occluding virtual object is dependent upon the shape of the portion of the real object which it has occluded, meaning blocked from view.
  • a conforming occlusion interface is also modeled as discussed below to form a basis for generation of a modified version of the virtual object boundary data upon which display of the virtual object is based.
  • In a partial occlusion, there is a partial occlusion interface which is the intersection at which an object boundary of an occluding portion of an occluding object meets or is adjacent to an unoccluded portion of an occluded object.
  • In either a partial occlusion or a whole occlusion between a real object and a virtual object, either type of object can be either the occluding object or the occluded object.
  • image data of the unoccluded portion of the virtual object is modified to represent the occlusion as the real object is actually seen through the display.
  • Displayed image data may be moving image data like video as well as still image data.
  • image data of the real world and the virtual images are displayed to a user so the user is not actually looking at the real world.
  • the same embodiments for methods and processes discussed below may also be applied for a video-see display, if desired.
  • Z buffering of image data of real objects and virtual image data based on a Z depth test can be performed. In the case of a video-see display, image data of an occluded portion of an object, be it real or virtual, is not displayed, and the image data of the occluding object, be it real or virtual, is displayed.
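The Z depth test described above for a video-see display can be sketched per pixel as follows. This is a minimal illustration, not the described system's renderer: images are flat lists of pixels, `None` marks pixels with no virtual content, and the function name `composite` is hypothetical.

```python
from typing import List, Optional, Tuple

Pixel = Tuple[int, int, int]


def composite(
    real_rgb: List[Pixel],
    real_depth: List[float],
    virt_rgb: List[Optional[Pixel]],
    virt_depth: List[float],
) -> List[Pixel]:
    """Per-pixel Z depth test: show whichever of real or virtual is nearer.

    A virtual pixel of None means no virtual content there, so the real pixel is shown.
    """
    out: List[Pixel] = []
    for r_rgb, r_z, v_rgb, v_z in zip(real_rgb, real_depth, virt_rgb, virt_depth):
        if v_rgb is not None and v_z < r_z:
            out.append(v_rgb)   # virtual object occludes the real object here
        else:
            out.append(r_rgb)   # real object occludes, or nothing virtual to draw
    return out
```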
  • Figure 1A is a block diagram depicting example components of an embodiment of a see-through, augmented or mixed reality display device system.
  • System 8 includes a see-through display device as a near-eye, head mounted display device 2 in communication with a processing unit 4 via a wire 6 in this example or wirelessly in other examples.
  • head mounted display device 2 is in the shape of eyeglasses in a frame 115, with a display optical system 14 for each eye in which image data is projected into a user's eye to generate a display of the image data while a user also sees through the display optical systems 14 for an actual direct view of the real world.
  • Each display optical system 14 is also referred to as a see-through display, and the two display optical systems 14 together may also be referred to as a see-through display.
  • Frame 115 provides a support structure for holding elements of the system in place as well as a conduit for electrical connections. In this embodiment, frame 115 provides a convenient eyeglass frame as support for the elements of the system discussed further below. Some other examples of a near-eye support structure are a visor frame or a goggles support.
  • the frame 115 includes a nose bridge portion 104 with a microphone 110 for recording sounds and transmitting audio data to control circuitry 136.
  • a temple or side arm 102 of the frame rests on each of a user's ears, and in this example the temple 102 is illustrated as including control circuitry 136 for the display device 2.
  • an image generation unit 120 is included on each temple 102 in this embodiment as well. Also, not shown in this view, but illustrated in Figures 2A and 2B are outward facing cameras 113 for recording digital images and videos and transmitting the visual recordings to the control circuitry 136 which may in turn send the captured image data to the processing unit 4 which may also send the data to one or more computer systems 12 over a network 50.
  • the processing unit 4 may take various embodiments.
  • processing unit 4 is a separate unit which may be worn on the user's body, e.g. a wrist, or be a separate device like a mobile device (e.g. smartphone).
  • the processing unit 4 may communicate wired or wirelessly (e.g., WiFi, Bluetooth, infrared, RFID transmission, wireless Universal Serial Bus (WUSB), cellular, 3G, 4G or other wireless communication means) over a communication network 50 to one or more computing systems 12 whether located nearby or at a remote location.
  • the functionality of the processing unit 4 may be integrated in software and hardware components of the display device 2 as in Figure 1B. An example of hardware components of the processing unit 4 is shown in Figure 2C.
  • One or more remote, network accessible computer system(s) 12 may be leveraged for processing power and remote data access.
  • An example of hardware components of a computing system 12 is shown in Figure 2C.
  • An application may be executing on computing system 12 which interacts with or performs processing for an application executing on one or more processors in the see-through, augmented reality display system 8.
  • a 3D mapping application may be executing on the one or more computer systems 12 and the user's display device system 8.
  • the application instances may perform in a master and client role in which a client copy is executing on the display device system 8 and performs 3D mapping of its user field of view, receives updates of the 3D mapping from the computer system(s) 12 in a view independent format, receives updates of objects in its view from the master 3D mapping application and sends image data, and depth and object identification data, if available, back to the master copy.
  • the 3D mapping application executing on different display device systems 8 in the same environment share data updates in real time, for example object identifications and occlusion data like an occlusion volume for a real object either in a peer-to-peer configuration between devices or to a 3D mapping application executing in one or more network accessible computing systems.
  • the shared data in some examples may be referenced with respect to a common coordinate system for the environment.
  • one head mounted display (HMD) device may receive data from another HMD device including image data or data derived from image data, position data for the sending HMD, e.g. GPS or IR data giving a relative position, and orientation data.
  • An example of data shared between the HMDs is depth map data including image data and depth data captured by its front facing cameras 113 and occlusion volumes for real objects in the depth map.
  • the real objects may still be unidentified or have been recognized by software executing on the HMD device or a supporting computer system, e.g. 12 or another display device system 8.
  • the second HMD can map the position of the objects in the received depth map for its user perspective based on the position and orientation data of the sending HMD. Any common objects identified in both the depth map data of a field of view of the recipient HMD device and the depth map data of a field of view of the sending HMD device may also be used for mapping.
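One way to picture this mapping is as a change of coordinates from the sending HMD's frame to the recipient's frame through a common coordinate system for the environment. The sketch below assumes each HMD's orientation is available as a 3x3 rotation matrix from device coordinates to that common system and its position as a translation. The function names and the yaw-only helper are hypothetical, and real orientation data would first have to be converted into this form.

```python
import math
from typing import List, Sequence

Vec = Sequence[float]
Mat = Sequence[Sequence[float]]


def mat_vec(m: Mat, v: Vec) -> List[float]:
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]


def transpose(m: Mat) -> List[List[float]]:
    """Transpose a 3x3 matrix; for a rotation matrix this is its inverse."""
    return [[m[j][i] for j in range(3)] for i in range(3)]


def sender_to_recipient(p: Vec, r_s: Mat, t_s: Vec, r_r: Mat, t_r: Vec) -> List[float]:
    """Map point p from the sending HMD's frame into the recipient HMD's frame.

    r_s, t_s: sender orientation and position in the common coordinate system.
    r_r, t_r: recipient orientation and position in the common coordinate system.
    """
    common = [a + b for a, b in zip(mat_vec(r_s, p), t_s)]  # sender frame -> common
    return mat_vec(transpose(r_r), [c - t for c, t in zip(common, t_r)])  # -> recipient


def yaw(deg: float) -> List[List[float]]:
    """Rotation about the vertical (y) axis, used here only to build example inputs."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]
```

For example, with both devices unrotated, a point 2 m in front of a sender standing 1 m to the right of the recipient maps to 1 m right and 2 m ahead in the recipient's frame.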
  • An example of an environment is a 360 degree visible portion of a real location in which the user is situated. A user may only be looking at a subset of his environment which is his field of view. For example, a room is an environment. A person may be in a house and be in the kitchen looking at the top shelf of the refrigerator.
  • the top shelf of the refrigerator is within his field of view, the kitchen is his environment, but his upstairs bedroom is not part of his current environment as walls and a ceiling block his view of the upstairs bedroom.
  • Some other examples of an environment may be a ball field, a street location, a section of a store, a customer section of a coffee shop and the like.
  • a location can include multiple environments, for example, the house may be a location.
  • the user and his friends may be wearing their display device systems for playing a game which takes place throughout the house. As each player moves about the house, his environment changes.
  • a perimeter around several blocks may be a location and different intersections provide different environments to view as different cross streets come into view.
  • Capture devices 20 may be, for example, cameras that visually monitor one or more users and the surrounding space such that gestures and/or movements performed by the one or more users, as well as the structure of the surrounding space including surfaces and objects, may be captured, analyzed, and tracked. Such information may be used, for example, to update display positions of virtual objects, to display location based information to a user, and to identify gestures that indicate one or more controls or actions for an executing application (e.g. game application).
  • Capture devices 20 may be depth cameras. According to an example embodiment, each capture device 20 may be configured with RGB and IR components to capture video with depth information including a depth image that may include depth values via any suitable technique including, for example, time-of-flight, structured light, stereo image, or the like. According to one embodiment, the capture device 20 may organize the depth information into "Z layers," or layers that may be perpendicular to a Z axis extending from the depth camera along its line of sight. The depth image may include a two-dimensional (2-D) pixel area of the captured field of view where each pixel in the 2-D pixel area may represent a length in, for example, centimeters, millimeters, or the like of an object in the captured field of view from the camera.
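The idea of organizing a depth image into "Z layers" can be sketched by binning each pixel's depth value into slabs of fixed thickness along the Z axis. The layer thickness, the use of millimeters, the convention that a value of 0 means no reading, and the function name `to_z_layers` are assumptions of this sketch.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def to_z_layers(
    depth_image: Sequence[Sequence[int]],
    layer_thickness_mm: int = 500,
) -> Dict[int, List[Tuple[int, int]]]:
    """Group pixel coordinates (x, y) by Z layer index.

    depth_image[y][x] holds a depth in millimeters; values <= 0 are treated
    as missing readings and skipped (an assumption of this sketch).
    """
    layers: Dict[int, List[Tuple[int, int]]] = defaultdict(list)
    for y, row in enumerate(depth_image):
        for x, depth_mm in enumerate(row):
            if depth_mm <= 0:
                continue
            layers[depth_mm // layer_thickness_mm].append((x, y))
    return dict(layers)
```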
  • Figure 1B is a block diagram depicting example components of another embodiment of a see-through, augmented or mixed reality display device system 8 which may communicate over a communication network 50 with other devices.
  • the control circuitry 136 of the display device 2 incorporates the functionality which a processing unit provides in Figure 1A and communicates wirelessly via a wireless transceiver (see 137 in Figure 2A) over a communication network 50 to one or more computer systems 12.
  • Figure 2A is a side view of an eyeglass temple 102 of the frame 115 in an embodiment of the see-through, augmented reality display device 2 embodied as eyeglasses providing support for hardware and software components.
  • At the front of frame 115 is physical environment facing video camera 113 that can capture video and still images, typically in color, of the real world to map real objects in the field of view of the see-through display, and hence, in the field of view of the user.
  • the cameras 113 may also be depth sensitive cameras which transmit and detect infrared light from which depth data may be determined.
  • a separate depth sensor (not shown) on the front of the frame 115 may also provide depth data to objects and other surfaces in the field of view.
  • the depth data and image data form a depth map of the captured field of view of the cameras 113 which are calibrated to include the user field of view.
  • a three dimensional (3D) mapping of the user field of view can be generated based on the depth map.
  • Some examples of depth sensing technologies that may be included on the head mounted display device 2, without limitation, are SONAR, LIDAR, Structured Light, and/or Time of Flight.
  • stereopsis is used for determining depth information instead of or in addition to a depth sensor.
  • the outward facing cameras 113 provide overlapping image data from which depth information for objects in the image data may be determined based on stereopsis.
  • Parallax and contrasting features such as color contrast may be used to resolve a relative position of one real object from another in the captured image data, for example for objects beyond a depth resolution of a depth sensor.
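For two rectified, parallel cameras, the standard pinhole relation between depth and disparity is Z = f · B / d, where f is the focal length in pixels, B the baseline between the cameras and d the disparity in pixels. The sketch below applies that relation; it is a textbook formula rather than a method stated in the text, and the function name and example values are hypothetical.

```python
def depth_from_disparity(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    """Depth in meters of a point seen by two rectified, parallel cameras.

    focal_px: focal length in pixels; baseline_m: distance between the cameras;
    disparity_px: horizontal shift of the point between the two images.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return focal_px * baseline_m / disparity_px
```

Because depth is inversely proportional to disparity, distant objects produce small disparities and coarse depth estimates, which is consistent with the text's use of parallax and contrast cues for objects beyond a depth sensor's resolution.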
  • the cameras 113 are also referred to as outward facing cameras meaning facing outward from the user's head.
  • the illustrated camera 113 is a front facing camera which is calibrated with respect to a reference point of its respective display optical system 14.
  • a reference point is an optical axis (see 142 in Figure 2B) of its respective display optical system 14.
  • the calibration allows the field of view of the display optical systems 14, also referred to as the user field of view as mentioned above, to be determined from the data captured by the cameras 113.
  • Control circuits 136 provide various electronics that support the other components of head mounted display device 2.
  • the right temple 102r includes control circuitry 136 for the display device 2 which includes a processing unit 210, a memory 244 accessible to the processing unit 210 for storing processor readable instructions and data, a wireless interface 137 communicatively coupled to the processing unit 210, and a power supply 239 providing power for the components of the control circuitry 136 and the other components of the display 2 like the cameras 113, the microphone 110 and the sensor units discussed below.
  • the processing unit 210 may comprise one or more processors including a central processing unit (CPU) and a graphics processing unit (GPU), particularly in embodiments without a separate processing unit 4.
  • Inside, or mounted to temple 102, are an ear phone 130 of a set of ear phones 130, inertial sensors 132, one or more location or proximity sensors 144, some examples of which are a GPS transceiver, an infrared (IR) transceiver, or a radio frequency transceiver for processing RFID data.
  • inertial sensors 132 include a three axis magnetometer, three axis gyro and three axis accelerometer. The inertial sensors are for sensing position, orientation, and sudden accelerations of head mounted display device 2. From these movements, head position, and thus orientation of the display device, may also be determined.
  • each of the devices processing an analog signal in its operation includes control circuitry which interfaces digitally with the digital processing unit 210 and memory 244 and which produces or converts analog signals, or both produces and converts analog signals, for its respective device.
  • Some examples of devices which process analog signals are the sensor devices 144, 132, and ear phones 130 as well as the microphone 110, cameras 113 and an IR illuminator 134A, and an IR detector or camera 134B discussed below.
  • an image source or image generation unit 120 which produces visible light representing images.
  • the image generation unit 120 can display a virtual object to appear at a designated depth location in a field of view to provide a realistic, in-focus three dimensional display of a virtual object which interacts with one or more real objects.
  • Some examples of embodiments of image generation units 120 which can display virtual objects at various depths are described in the following applications which are hereby incorporated by reference: "Automatic Variable Virtual Focus for Augmented Reality Displays," having U.S. patent application no. 12/941,825 and inventors Avi Bar-Zeev and John Lewis, and which was filed November 8, 2010 and "Automatic Focus Improvement for Augmented Reality Displays," having U.S. patent application no.
  • a focal length for an image generated by the microdisplay is changed by adjusting a displacement between an image source such as a microdisplay and at least one optical element like a lens or by adjusting the optical power of an optical element which receives the light representing the image.
  • the change in focal length results in a change in a region of the field of view of the display device in which the image of the virtual object appears to be displayed.
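  • The relationship between displacement, optical power and the apparent image location can be illustrated with the thin-lens equation, assuming an idealized single thin lens. The symbols are introduced here for illustration only: s_1 is the distance from the microdisplay to the lens, s_2 is the image distance, and f is the focal length of the lens.

```latex
\[
\frac{1}{s_1} + \frac{1}{s_2} = \frac{1}{f}
\qquad\Longrightarrow\qquad
s_2 = \frac{f\, s_1}{s_1 - f}
\]
```

  • Under this approximation, changing the displacement s_1 at fixed f, or changing the optical power 1/f at fixed s_1, moves the image distance s_2. When s_1 < f, s_2 is negative, which corresponds to a virtual image on the same side of the lens as the microdisplay.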
  • multiple images, each including a virtual object may be displayed to the user at a rate rapid enough so human temporal image fusion makes the images appear to be present at once to human eyes.
  • a composite image of the in-focus portions of the virtual images generated at the different focal regions is displayed.
  • the image generation unit 120 includes a microdisplay for projecting images of one or more virtual objects and coupling optics like a lens system for directing images from the microdisplay to a reflecting surface or element 124.
  • the microdisplay may be implemented in various technologies including transmissive projection technology, micro organic light emitting diode (OLED) technology, or a reflective technology like digital light processing (DLP), liquid crystal on silicon (LCOS) and Mirasol® display technology from Qualcomm, Inc.
  • the reflecting surface 124 directs the light from the image generation unit 120 into a lightguide optical element 112, which directs the light representing the image into the user's eye.
  • Figure 2B is a top view of an embodiment of one side of a see-through, near-eye, augmented reality display device including a display optical system 14.
  • a portion of the frame 115 of the near-eye display device 2 will surround a display optical system 14 for providing support and making electrical connections.
  • a portion of the frame 115 surrounding the display optical system is not depicted.
  • the display optical system 14 is an integrated eye tracking and display system.
  • the system embodiment includes an opacity filter 114 for enhancing contrast of virtual imagery, which is behind and aligned with optional see-through lens 116 in this example; lightguide optical element 112 for projecting image data from the image generation unit 120, which is behind and aligned with opacity filter 114; and optional see-through lens 118, which is behind and aligned with lightguide optical element 112.
  • Light guide optical element 112 transmits light from image generation unit 120 to the eye 140 of the user wearing head mounted display device 2.
  • Light guide optical element 112 also allows light from in front of the head mounted display device 2 to be transmitted through light guide optical element 112 to eye 140, as depicted by arrow 142 representing an optical axis of the display optical system 14r, thereby allowing the user to have an actual direct view of the space in front of head mounted display device 2 in addition to receiving a virtual image from image generation unit 120.
  • the walls of light guide optical element 112 are see-through.
  • Light guide optical element 112 is a planar waveguide in this embodiment and includes a first reflecting surface 124 (e.g., a mirror or other surface) which reflects incident light from image generation unit 120 such that light is trapped inside a waveguide.
  • a representative reflecting element 126 represents the one or more optical elements like mirrors, gratings, and other optical elements which direct visible light representing an image from the planar waveguide towards the user eye 140.
  • Infrared illumination and reflections also traverse the planar waveguide 112 for an eye tracking system 134 for tracking the position of the user's eyes which may be used for applications such as gaze detection, blink command detection and gathering biometric information indicating a personal state of being for the user.
  • the eye tracking system 134 comprises an eye tracking IR illumination source 134A (an infrared light emitting diode (LED) or a laser (e.g. VCSEL)) and an eye tracking IR sensor 134B (e.g. IR camera, arrangement of IR photodetectors, or an IR position sensitive detector (PSD) for tracking glint positions).
  • representative reflecting element 126 also implements bidirectional infrared (IR) filtering which directs IR illumination towards the eye 140, preferably centered about the optical axis 142 and receives IR reflections from the user eye 140.
  • reflecting element 126 may include a hot mirror or gratings for implementing the bidirectional IR filtering.
  • a wavelength selective filter 123 passes through visible spectrum light from the reflecting surface 124 and directs the infrared wavelength illumination from the eye tracking illumination source 134A into the planar waveguide 112.
  • Wavelength selective filter 125 passes the visible light and the infrared illumination in an optical path direction heading towards the nose bridge 104.
  • Wavelength selective filter 125 directs infrared radiation from the waveguide including infrared reflections of the user eye 140, preferably including reflections captured about the optical axis 142, out of the waveguide 112 to the IR sensor 134B.
  • the eye tracking unit optics are not integrated with the display optics.
  • Opacity filter 114 which is aligned with light guide optical element 112, selectively blocks natural light from passing through light guide optical element 112 for enhancing contrast of virtual imagery.
  • When the system renders a scene for the augmented reality display, it takes note of which real-world objects are in front of which virtual objects and vice versa. If a virtual object is in front of a real-world object, then the opacity is turned on for the coverage area of the virtual object. If the virtual object is (virtually) behind a real-world object, then the opacity is turned off, as well as any color for that display area, so the user will only see the real-world object for that corresponding area of real light.
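  • The opacity decision described above can be sketched per pixel, assuming depth buffers are available for both the rendered virtual imagery and the real scene. The function name `opacity_mask` and the use of `None` for pixels without virtual content are illustrative assumptions, not part of the disclosure.

```python
from typing import List, Optional


def opacity_mask(virtual_depth: List[List[Optional[float]]],
                 real_depth: List[List[float]]) -> List[List[bool]]:
    """Return True where the opacity filter would block real light.

    Assumptions (hypothetical): both buffers have the same shape, smaller
    depth values are nearer to the display, and None marks a pixel that
    has no virtual content.
    """
    mask = []
    for v_row, r_row in zip(virtual_depth, real_depth):
        # Opacity is on only where a virtual object is in front of the
        # real-world surface seen through that pixel.
        mask.append([v is not None and v < r for v, r in zip(v_row, r_row)])
    return mask
```

  • Pixels where the mask is False receive neither opacity nor virtual color, so only the real-world object is seen there.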
  • the opacity filter assists the image of a virtual object to appear more realistic and represent a full range of colors and intensities.
  • electrical control circuitry for the opacity filter receives instructions from the control circuitry 136 via electrical connections routed through the frame. More details of an opacity filter are provided in U.S. Patent Application No. 12/887,426, "Opacity Filter For See-Through Mounted Display," filed on September 21, 2010, incorporated herein by reference in its entirety.
  • Figures 2A and 2B only show half of the head mounted display device 2.
  • a full head mounted display device would include another set of optional see-through lenses 116 and 118, another opacity filter 114, another light guide optical element 112, another image generation unit 120, physical environment facing camera 113 (also referred to as outward facing or front facing camera 113), eye tracking assembly 134, and earphone 130. Additional details of a head mounted display device system are illustrated in United States Patent Application Serial No. 12/905952 entitled "Fusing Virtual Content Into Real Content", filed October 15, 2010, fully incorporated herein by reference.
  • FIG. 2C is a block diagram of one embodiment of a computing system that can be used to implement one or more network accessible computing systems 12 or a processing unit 4 which may host at least some of the software components of computing environment 54 or other elements depicted in Figure 3A.
  • an exemplary system includes a computing device, such as computing device 200.
  • In its most basic configuration, computing device 200 typically includes one or more processing units 202 including one or more central processing units (CPU) and one or more graphics processing units (GPU).
  • Computing device 200 also includes memory 204.
  • memory 204 may include volatile memory 205 (such as RAM), non-volatile memory 207 (such as ROM, flash memory, etc.) or some combination of the two.
  • device 200 may also have additional features/functionality.
  • device 200 may also include additional storage (removable and/or non-removable) including, but not limited to, magnetic or optical disks or tape.
  • additional storage is illustrated in Figure 2C by removable storage 208 and non-removable storage 210.
  • Device 200 may also contain communications connection(s) 212 such as one or more network interfaces and transceivers that allow the device to communicate with other devices.
  • Device 200 may also have input device(s) 214 such as keyboard, mouse, pen, voice input device, touch input device, etc.
  • Output device(s) 216 such as a display, speakers, printer, etc. may also be included. All these devices are well known in the art and need not be discussed at length here.
  • Figure 3A is a block diagram of a system from a software perspective for providing realistic occlusion between a real object and a virtual object by a head mounted, augmented reality display device system.
  • Figure 3A illustrates a computing environment embodiment 54 from a software perspective which may be implemented by a head mounted display device system like system 8, one or more remote computing systems 12 in communication with one or more display device systems or a combination of these. Additionally, display device systems can communicate with other display device systems for sharing data and processing resources. Network connectivity allows leveraging of available computing resources.
  • the software components of a computing environment 54 comprise an image and audio processing engine 191 in communication with an operating system 190.
  • Image and audio processing engine 191 processes image data (e.g.
  • Image and audio processing engine 191 includes object recognition engine 192, gesture recognition engine 193, virtual data engine 195, eye tracking software 196 if eye tracking is in use, an occlusion engine 302, a 3D positional audio engine 304 with a sound recognition engine 194, and a scene mapping engine 306 all in communication with each other.
  • the computing environment 54 also stores data in image and audio data buffer(s) 199.
  • the buffers provide memory for receiving image data captured from the outward facing cameras 113, image data captured by other capture devices if available, image data from an eye tracking camera of an eye tracking assembly 134 if used, buffers for holding image data of virtual objects to be displayed by the image generation units 120, and buffers for both input and output audio data like sounds captured from the user via microphone 110 and sound effects for an application from the 3D audio engine 304 to be output to the user via earphones 130.
  • a 3D mapping of the user field of view of the see-through display can be determined by the scene mapping engine 306 based on captured image data and depth data for the user field of view.
  • a depth map can represent the captured image data and depth data.
  • a view dependent coordinate system may be used for the mapping of the user field of view as whether an object occludes another object depends on the user's point of view.
  • An example of a view dependent coordinate system is an x, y, z coordinate system in which the z-axis or depth axis extends orthogonally or as a normal from the front of the see-through display.
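  • As a minimal sketch of such a view dependent coordinate system, the following converts a point from a fixed environment frame into display coordinates whose z value is depth along the display normal. It assumes, for simplicity, that the display is rotated only about the vertical axis; the function name `world_to_view` and its parameters are hypothetical.

```python
import math
from typing import Tuple

Vec3 = Tuple[float, float, float]


def world_to_view(point: Vec3, display_pos: Vec3, yaw_rad: float) -> Vec3:
    """Express an environment-frame point in view dependent coordinates.

    Assumptions (hypothetical): the display is rotated only about the
    vertical (y) axis by yaw_rad, and at yaw 0 it faces along world +z.
    The returned z component is the depth along the display normal.
    """
    dx = point[0] - display_pos[0]
    dy = point[1] - display_pos[1]
    dz = point[2] - display_pos[2]
    c, s = math.cos(yaw_rad), math.sin(yaw_rad)
    # Inverse of a rotation about the y axis by yaw_rad.
    x_view = dx * c - dz * s
    z_view = dx * s + dz * c
    return (x_view, dy, z_view)
```

  • A full implementation would use the complete head orientation (for example a rotation matrix or quaternion) rather than yaw alone.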
  • the image and depth data for the depth map representing the user field of view is received from the cameras 113 on the front of the display device 2.
  • the object recognition engine 192 can detect boundaries of a real object in the depth map and may assign a bounding volume as a 3D space surrounding a real object.
  • the bounding volume is identified to the 3D scene mapping engine 306 and the occlusion engine 302.
  • the object recognition engine 192 may identify the bounding volume in a message to the operating system 190 which broadcasts the message to other engines like the scene mapping and occlusion engine and applications which register for such data.
  • the bounding volume may be used as an occlusion volume for occlusion processing even before object recognition is performed.
  • a fast moving object may be causing occlusions which are processed based on the occlusion volumes and the depth map data even though the object passes out of view before it is recognized.
  • the boundary of the occlusion volume may be used, at least partially, as a basis for generating an occlusion interface.
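  • One simple way to picture a bounding volume used as an occlusion volume is an axis-aligned box around the depth map points attributed to an object. The sketch below assumes that representation; the names `bounding_volume` and `volumes_overlap` are illustrative, and the disclosure does not limit bounding volumes to boxes.

```python
from typing import Iterable, Tuple

Vec3 = Tuple[float, float, float]
Box = Tuple[Vec3, Vec3]  # (minimum corner, maximum corner)


def bounding_volume(points: Iterable[Vec3]) -> Box:
    """Axis-aligned box enclosing the 3D points detected for an object."""
    xs, ys, zs = zip(*points)
    return (min(xs), min(ys), min(zs)), (max(xs), max(ys), max(zs))


def volumes_overlap(a: Box, b: Box) -> bool:
    """True if two boxes intersect on all three axes."""
    (a_min, a_max), (b_min, b_max) = a, b
    return all(a_min[i] <= b_max[i] and b_min[i] <= a_max[i]
               for i in range(3))
```

  • Because the box needs only the depth map points, it can be computed and tracked before the object is recognized.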
  • the scene mapping engine 306 may assign 3D space positions for one or more real objects detected in the user field of view based on the depth map. As objects are identified by the object recognition engine 192 as discussed below, the 3D spaces or volumes for the objects in the mapping may be refined to better match the actual shape of a real object. From the virtual data engine 195 or an executing application, a 3D space position of a virtual object is determined within the 3D mapping of the user field of view.
  • the occlusion engine 302 may assign an occlusion volume to a virtual object as well based on level of detail criteria.
  • Mapping what is around the user in the user's environment can be aided with sensor data.
  • Data from the orientation sensor 132, e.g. the three axis accelerometer 132C and the three axis magnetometer 132A, determines position changes of the user's head, and correlation of those head position changes with changes in the image and depth data from the front facing cameras 113 can identify positions of objects relative to one another.
  • depth map data of another HMD device, currently or previously in the environment, along with position and head orientation data for this other HMD device can also be used to map what is in the user environment. Shared real objects in their depth maps can be used for image alignment and other techniques for image mapping. With the position and orientation data, which objects are coming into view can also be predicted, so occlusion and other processing can start even before the objects are in view.
  • the scene mapping engine 306 can also use a view independent coordinate system for 3D mapping.
  • the map can be stored in the view independent coordinate system in a storage location (e.g. 324) accessible by other display device systems 8, other computer systems 12 or both, be retrieved from memory and be updated over time as one or more users enter or re-enter the environment.
  • image and object registration into a common coordinate system may be performed using an extrinsic calibration process. The registration and/or alignment of images (or objects within the images) onto a common coordinate system allows the scene mapping engine to be able to compare and integrate real-world objects, landmarks, or other features extracted from the different images into a unified 3-D map associated with the real-world environment.
  • the scene mapping engine 306 may first search for a pre-generated 3D map identifying 3D space positions and identification data of objects stored locally or accessible from another display device system 8 or a network accessible computer system 12.
  • the map may include stationary objects.
  • the map may also include objects moving in real time and current light and shadow conditions if the map is presently being updated by another system.
  • the pre-generated map may include identification data for objects which tend to enter the environment at predicted times in order to speed recognition processing.
  • a pre-generated map may also store occlusion data as discussed below.
  • a pre-generated map may be stored in a network accessible database like image and map database(s) 324.
  • the environment may be identified by location data.
  • Location data may be used as an index to search in location indexed image and pre-generated 3D map databases 324 or in Internet accessible images 326 for a map or image related data which may be used to generate a map.
  • GPS data from a GPS transceiver 144 of the location and proximity sensors on the display device 2 may identify the location of the user.
  • an IP address of a WiFi hotspot or cellular station to which the display device system 8 has a connection can identify a location.
  • Cameras at known positions within a location may identify the user and other people through facial recognition.
  • maps and map updates, or at least object identification data may be exchanged between display device systems 8 in a location via infra-red, Bluetooth or WUSB as the range of the signal allows.
  • An example of image related data which may be used to generate a map is metadata associated with any matched image data from which objects and their positions within a coordinate system for the location can be identified. For example, a relative position of one or more objects in image data from the outward facing cameras of the user's display device system 8 can be determined with respect to one or more GPS tracked objects in the location from which other relative positions of real and virtual objects can be identified.
  • image data for mapping an environment can come from cameras other than those 113 on the user's display device 2.
  • Image and depth data from multiple perspectives can be received in real time from other 3D image capture devices 20 under control of one or more network accessible computer systems 12 or from at least one other display device system 8 in the environment.
  • Depth images from multiple perspectives may be combined based on a view independent coordinate system for describing an environment (e.g. an x, y, z representation of a room, a store space, or a geofenced area) for creating the volumetric or 3D mapping.
  • When the scene mapping engine 306 receives depth images from multiple cameras, the engine 306 correlates the images to have a common coordinate system by lining up the images and uses depth data to create the volumetric description of the environment.
  • a 3D mapping may be modeled as a 3D mesh of an environment.
  • a mesh may comprise a detailed geometric representation of various features and surfaces within a particular environment or region of an environment.
  • a 3D point cloud representing the surfaces of objects including things like walls and floors in a space can be generated based on captured image data and depth data of the user environment.
  • a 3D mesh of the surfaces in the environment can then be generated from the point cloud. More information regarding the generation of 3-D maps can be found in U.S. Patent Application 13/017,690, "Three-Dimensional Environment Reconstruction," incorporated herein by reference in its entirety.
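  • The first step, generating a point cloud from depth data, can be sketched with a pinhole camera back-projection. The pinhole model and the intrinsic parameters `fx`, `fy`, `cx` and `cy` are assumptions made for illustration; the disclosure does not specify a camera model.

```python
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


def depth_to_points(depth: List[List[float]],
                    fx: float, fy: float,
                    cx: float, cy: float) -> List[Vec3]:
    """Back-project a depth image into 3D points in the camera frame.

    Assumptions (hypothetical): depth[v][u] is the distance along the
    camera z axis, and a value of 0 or less means no measurement.
    """
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:
                continue  # skip pixels without a valid depth sample
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points
```

  • Point clouds from several cameras would then be transformed into the common coordinate system before surface meshing.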
  • scene mapping may be a collaborative effort with other display device systems 8, or other network accessible image capture devices (e.g. 20) in a location providing image data and depth data, or a combination of these, and one or more network accessible computer system(s) 12 to help with the computations and share map updates.
  • a scene mapping engine 306 on a network accessible computer system 12 receives image data of multiple user fields of view from multiple see-through augmented reality display device systems 8 in an environment and correlates their image data based on capture times for the data in order to track changes of objects and lighting and shadow in the environment in real time. 3D map updates can then be sent to the multiple display device systems 8 in the environment.
  • the 3D mapping data can be saved in accordance with pre-generation criteria for future faster retrieval. Some examples of such pre-generation criteria include stationary objects, time of day and ambient conditions which affect light and shadow.
  • a display device system 8 can broadcast its image data or 3D map updates to other display device systems 8 in an environment and likewise receive such updates from another device system.
  • Each local scene mapping engine 306 then updates its 3D mapping in accordance with the broadcasts.
  • a scene mapping engine 306, particularly one executing on a display device system 8 can map a user field of view based on image data and depth data captured by cameras 113 on the device.
  • the user field of view 3D mapping may also be determined remotely or using a combination of remote and local processing.
  • the object recognition engine 192 distinguishes real objects from each other by marking object boundaries and comparing the object boundaries with structural data.
  • One example of marking object boundaries is detecting edges within detected or derived depth data and image data, connecting the edges and comparing with stored structure data in order to find matches within probability criteria.
  • a polygon mesh may also be used to represent the object's boundary as mentioned above.
  • One or more databases of structure data 200 accessible over one or more communication networks 50 may include structural information about objects.
  • Structure data 200 may include structural information regarding one or more inanimate objects in order to help recognize the one or more inanimate objects, some examples of which are furniture, sporting equipment, automobiles and the like.
  • the structure data 200 may store structural information as image data or use image data as references for pattern and facial recognition.
  • the object recognition engine 192 may also perform facial and pattern recognition on image data of the objects based on stored image data from other sources as well, like user profile data 197 of the user, other users' profile data 322 accessible by a hub, location indexed images and 3D maps 324 and Internet accessible images 326.
  • Motion capture data from image and depth data may also identify motion characteristics of an object.
  • the object recognition engine 192 may also check detected properties of an object against reference properties of an object like its size, shape and motion characteristics.
  • An example of such a set of reference properties for an object is a reference object data set as stored in reference object data set(s) 318.
  • Figure 3B illustrates an example of a reference object data set 318N with some examples of data fields.
  • the reference data sets 318 available to the object recognition engine 192 may have been predetermined manually offline by an application developer or by pattern recognition software and stored. Additionally, if a user inventorizes an object by viewing it with the display device system 8 and inputting data for the data fields, a reference object data set is generated. Also, reference object data sets can be created and stored for sharing with other users as indicated in share permissions.
  • the data fields include a type of object 341 which may be a data record which also includes sub-fields. For the type of object 341, the other data fields provide data records identifying the types of physical properties available for the type of object.
  • the other data records identify physical interaction characteristics 342, size ranges 343, shape selections available 344, typical types of material 345, colors available 347, patterns available 348, surface(s) available 351, typical surface texture(s) 346, and a geometric orientation 350 of each available surface 351.
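  • Checking detected properties against a reference object data set can be sketched as a simple record comparison. The class and field names below loosely mirror a few of the data fields listed above but are hypothetical, and a real recognizer would likely score matches against probability criteria rather than return a yes or no answer.

```python
from dataclasses import dataclass
from typing import Set, Tuple


@dataclass
class ReferenceObjectDataSet:
    type_of_object: str
    size_range_m: Tuple[float, float]  # (min, max) of the largest dimension
    shape_selections: Set[str]
    typical_materials: Set[str]


@dataclass
class DetectedProperties:
    size_m: float
    shape: str
    material: str


def matches(ref: ReferenceObjectDataSet, det: DetectedProperties) -> bool:
    """True if every detected property falls within the reference set."""
    low, high = ref.size_range_m
    return (low <= det.size_m <= high
            and det.shape in ref.shape_selections
            and det.material in ref.typical_materials)
```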
  • Figure 3C illustrates some examples of data fields in an object physical properties data set 320N stored for either a specific real object or a specific virtual object which includes data values detected or otherwise determined based on captured data of the real object, or data pre-defined or generated by an application for the specific virtual object.
  • the example data fields include a type of object 381, physical interaction characteristics 382 which are determined based on other physical properties like size 383, in three dimensions in this example, shape 384, also 3D in this example, structure 399 (e.g. skeletal or for an inanimate object), also in three dimensions in this example, boundary data 400 and type of material 385.
  • Position data 394 in 3D for the object may also be stored.
  • the position data 394 includes motion data 395 tracking a direction of movement through positions in a location.
  • Surface 388N represents an exemplary data set for each surface identified. The data set includes one or more surface textures 390, a geometric orientation 393 of the surfaceN, a surface shape 389 (e.g.
  • the surrounding free space (3D) 392 may be determined from position data 391 of the surfaceN relative to one or more surfaces of one or more other objects, real or virtual, in the real environment. These other objects would typically be nearest neighbor objects. Furthermore, the position of surfaces of the same object with respect to each other may be a basis for determining thickness and 3D shape overall.
  • the surrounding free space and position data may be used to determine when an audio occlusion exists.
  • the real object physical properties data set 335 may be stored in one or more network accessible data stores 320.
  • Upon detection of one or more objects by the object recognition engine 192, other engines of the image and audio processing engine 191 like the scene mapping engine 306 and the occlusion engine 302 receive an identification of each object detected and a corresponding position and/or orientation. This object data is also reported to the operating system 190 which passes the object data along to other executing applications like other upper level applications 166.
  • a perspective of a user wearing a display device, referred to hereafter as a user perspective, and the user field of view from the user perspective may be approximated by a view dependent coordinate system having orthogonal X, Y and Z axes, in which the Z-axis represents a depth position from the front of the display device system 8 or from one or more points determined in relation to the front of the display device system, like an approximate location for the user's foveae.
  • the depth map coordinate system of the depth cameras 113 may be used to approximate a view dependent coordinate system of the user field of view in some examples.
  • the occlusion engine 302 identifies occlusions between objects, and in particular between real and virtual objects, based on volumetric position data for recognized objects within the view dependent coordinate system of a 3D mapping of the user field of view as updated by the object recognition engine 192 and the scene mapping engine 306.
  • a 3D space position of an object is volumetric position data in that it represents the volume occupied by the object and the position of that object volume in a coordinate system.
  • the occlusion engine 302 compares the 3D space positions of objects in the user field of view from the user perspective for each upcoming display update.
  • the occlusion engine 302 can process objects currently in the field of view and those predicted to enter the field of view as notified by the scene mapping engine 306.
  • An occlusion may be identified by an overlap portion in the coordinates of 3D space positions. For example, the virtual object and the real object cover the same area in X and Y view dependent coordinates but have different depths, so that one object is in front of the other.
  • the 3D object boundary data represented in the 3D space positions is projected as a mask of object boundary data onto a 2D viewing plane in an image buffer 199 for determining overlapping boundaries. Depth data associated with the boundary data is then used to identify which boundary belongs to the occluding object and which boundary data belongs to the occluded object.
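  • The overlap and depth test can be sketched with each object's projected boundary simplified to a rectangle on the 2D viewing plane plus a single depth value. This simplification, and the names `ProjectedObject` and `find_occlusion`, are assumptions for illustration; actual boundary masks would follow the object outline.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Rect = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


@dataclass
class ProjectedObject:
    name: str
    rect: Rect    # boundary projected onto the 2D viewing plane
    depth: float  # smaller values are nearer to the display (assumption)


def find_occlusion(a: ProjectedObject,
                   b: ProjectedObject) -> Optional[Tuple[str, str, Rect]]:
    """Return (occluding name, occluded name, overlap rectangle) or None."""
    ax0, ay0, ax1, ay1 = a.rect
    bx0, by0, bx1, by1 = b.rect
    x0, y0 = max(ax0, bx0), max(ay0, by0)
    x1, y1 = min(ax1, bx1), min(ay1, by1)
    if x0 >= x1 or y0 >= y1:
        return None  # projections do not overlap, so no occlusion
    # The object with the smaller depth is in front and is the occluder.
    occluder, occluded = (a, b) if a.depth <= b.depth else (b, a)
    return occluder.name, occluded.name, (x0, y0, x1, y1)
```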
  • the occlusion engine can notify the virtual data engine 195 (see below) to not display the virtual object.
  • the virtual object or its parts can be sized to completely cover the real object and its parts.
  • the display is to be updated to show part of the virtual object in relation to the real object.
  • the display is to be updated to show part of the virtual object while part of the real object is still seen through the display device 2.
  • the occlusion engine 302 identifies and stores in an occlusion data set object boundary data of an occluding portion, also referred to as a blocking or an overlapping portion, of the occluding object as a basis upon which a partial occlusion interface is to be generated.
  • the processing may be performed separately for each partial occlusion interface.
  • a virtual object can conform its shape over at least a portion of a real object.
  • the portions of the object boundary data of both the real and virtual object in the conforming portions can also be stored in an occlusion data set for use in representing or modeling the conforming interface.
  • a modified version of the virtual object is generated to represent the occlusion.
  • a modified version of the boundary data of the virtual object is generated by the occlusion engine 302.
  • the virtual data engine 195 displays the unoccluded portion of the virtual object in accordance with its modified boundary data.
  • the virtual object's boundary data e.g.
  • a video-see device may employ embodiments of the same methods and processes.
  • the occlusion engine 302 determines a level of detail for a model generated of the partial occlusion interface in order to display an unoccluded portion of the virtual object adjacent the partial occlusion interface. The more the model of the interface matches the detail of the boundary data of the overlapping portion, the more realistic the interface will look on the display. A level of detail can also be determined by the engine 302 for a conforming occlusion interface. The level of detail defines parameters and which techniques may be used, affecting the resulting geometry of the model of either type of interface.
  • Rule sets 311 for the different occlusion levels of detail control which geometrical modeling techniques may be used and accuracy criteria like how much of the object boundary data determined based on detection of the object or stored in a detailed version of the object is to be incorporated in the model unmodified and a smoothing tolerance. For example, for a same set of boundary data as a sequence of edges, one level of detail may result in a generated model of the sequence of edges as a curve which incorporates more of the unmodified object boundary data than another level of detail which results in a model of the same sequence of edges as a line.
  • Another example of a level of detail is using bounding or occlusion volumes for object boundary data and tracking the occlusion volumes using depth map data for quicker occlusion processing rather than waiting for object recognition.
  • Level of detail criteria are factors which affect how much detail a user would perceive either due to approximations of human perceptual limitations or display resolution.
  • Examples of level of detail criteria as may be represented in data stored in memory as occlusion level of detail criteria 310 include depth position, display size, speed of the interface in the user field of view and distance from a point of gaze and these criteria and determinations based on them are discussed in more detail with respect to Figures 6A to 6D.
  • Occlusion data sets 308 generated by the occlusion engine 302 or received from another system (8 or 12) are also stored in memory.
  • occlusion data is associated with a virtual object and a real object and includes one or more models generated at one or more levels of detail for at least one occlusion interface between the virtual and real objects.
  • the unmodified boundary data at issue for the occlusion interface may also be stored in the occlusion data set.
  • Occlusion level of detail criteria 310 and occlusion level of detail rules 311 are also stored for use by the occlusion engine in determining how to model a partial occlusion interface or a conforming occlusion interface.
  • Occlusion data may be shared like object identification data and position data with a pre-generated map or as data useful for generating a 3D map.
  • Occlusion data may be first generated for one mobile display device system. As subsequent display devices encounter the same occlusion, they can download generated occlusion interfaces for different levels of detail instead of regenerating them. For example, for a level of detail based on being within a depth distance range and a user perspective angular range to an object, the previously generated model of a partial occlusion interface may be re-used. Such saved occlusion data may be particularly useful for stationary real objects in an environment like a building. However, for moving real objects having a predictable speed range and path through a location, e.g. a bus on a schedule in a street scene, saved occlusion data may also save time. Whether an object is stationary or movable or a mobility rating may be determined based on the type of object 381 for the object.
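  • By way of illustration only, the re-use of a previously generated model may be sketched as a lookup keyed on a depth distance range and a user perspective angular range. The record layout, field names and function name below are hypothetical and are not taken from the described system; the sketch assumes each saved model records the ranges for which it was generated.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class SavedInterfaceModel:
    """One stored occlusion interface model and the viewing range it suits."""
    level_of_detail: int
    min_depth_m: float
    max_depth_m: float
    min_angle_deg: float
    max_angle_deg: float
    model_id: str  # placeholder standing in for the stored geometry


def find_reusable_model(
    saved: Sequence[SavedInterfaceModel], depth_m: float, angle_deg: float
) -> Optional[SavedInterfaceModel]:
    """Return the first saved model whose depth and angular ranges contain the viewer."""
    for entry in saved:
        in_depth = entry.min_depth_m <= depth_m <= entry.max_depth_m
        in_angle = entry.min_angle_deg <= angle_deg <= entry.max_angle_deg
        if in_depth and in_angle:
            return entry
    return None  # nothing suitable is saved, so the interface would be regenerated
```

  • A device that finds no suitable entry would fall back to generating the model itself and may then save it for later devices.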
  • Besides detecting spatial occlusions in a user field of view, the occlusion engine 302 can identify other occlusions in the user's environment or location, but not in his field of view, based on 3D space positions of the objects with respect to the user.
  • An occlusion engine 302 executing in the display device system 8 or the hub 12 can identify the occlusions. Although not seen, such an occlusion with respect to the user may cause audio data associated with the occluded object to be modified based on the physical properties of the occluding object.
  • the 3D audio engine 304 is a positional 3D audio engine which receives input audio data and outputs audio data for the earphones 130.
  • the received input audio data may be for a virtual object or be that generated by a real object. Audio data for virtual objects generated by an application can be output to the earphones to sound as if coming from the direction of the virtual object projected into the user field of view.
  • An example of a positional 3D audio engine which may be used with an augmented reality system is disclosed in U.S. patent application no. 12/903,610 entitled "System and Method for High-Precision 3-Dimensional Audio for Augmented Reality," to Flaks et al., and filed October 13, 2010, which is hereby incorporated by reference.
  • the output audio data may come from sound library 312.
  • Sound recognition software 194 of the 3D audio engine identifies audio data from the real world received via microphone 110 for application control via voice commands and environment and object recognition. Besides recognizing content of the audio data like a voice command or a piece of music, the 3D audio engine 304 attempts to recognize which object made the audio data. Based on a sound library 312, the engine 304 can identify a sound with a physical object, e.g. a horn sound associated with a certain make or model of car. Additionally, voice data files stored in user profile data 197 or user profiles 322 may also identify a speaker with whom a person object mapped in the environment may be associated.
  • display device systems 8 and 3D image capture devices 20 in a location upload their captured audio data to a hub computing system 12.
  • this may be the user's voice but can also include sounds made in the environment of the user.
  • Based on the sound quality and objects near a user, as well as an identification of a type of object based on the sound library used by the sound recognition software component, a determination of which object in an environment or location generated a sound can be made.
  • pre-generated 3D maps of a location can provide an audio index of sounds of objects fixed in the location or which enter and leave the location on a regular basis, e.g. train and bus sounds.
  • Sharing of data about objects, both real and virtual, including sounds they make, between multiple display device systems 8 and the hubs 12 facilitates identifying the object which made the sound. So sound object candidates identified based on matches in the sound library 312 or voice data files can be compared with objects identified in an environment and even a location for a match.
  • the 3D audio engine 304 can access a sound occlusion model 316 for the audially occluded object which provides rules for modifying the audio data as output to the earphones 130 to represent the occlusion.
  • the method figures below provide some examples of how to determine whether a spatial occlusion has caused an audial occlusion. For example, one criterion is whether a sound making part of an occluded object is blocked in the partial occlusion.
  • Figures 4A and 4B provide examples of audio occlusion due to spatial occlusion.
  • Figure 4A illustrates an example of a spatial occlusion resulting in an audio occlusion of a virtual object by a real object.
  • Figure 4A also illustrates occlusion of a sound making area.
  • the user's hand 404, as seen in the user field of view as indicated by gaze lines 401l and 401r, is identified as being positioned over monster 402 in the field of view and having virtually the same depth distance as monster 402, so audio for monster 402 is muffled in accordance with the sound dampening characteristics of a human hand.
  • a distance between the occluding object and the occluded object may indicate that no significant audio occlusion exists for a sound effect like muffling, or may be a factor in weightings for things like volume, tone and pitch associated with audio data.
  • Monster 403 is partially occluded in this field of view by the user's arm 405, but monster 403 is several feet back from the arm depth and monster 402 as well.
  • the sound absorbing characteristics of a single human body part have a very small range, so no audial occlusion effect exists for an occluded object like monster 403 which is several feet away.
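  • As a non-limiting sketch of how the separation between the occluding and occluded objects might act as a weighting factor, the function below returns a weight between 0 and 1 for a sound effect such as muffling. The linear falloff, the parameter names and the notion of a single effect range per occluding object are assumptions made for illustration; the description does not specify a particular weighting function.

```python
def audio_occlusion_weight(separation_m: float, effect_range_m: float) -> float:
    """Return a 0..1 weight for an occlusion sound effect such as muffling.

    separation_m: depth distance between occluding and occluded objects.
    effect_range_m: hypothetical distance beyond which the occluder has no
        significant audio effect (small for something like a hand or arm).
    """
    if separation_m < 0 or effect_range_m <= 0:
        raise ValueError("separation must be >= 0 and effect range must be > 0")
    if separation_m >= effect_range_m:
        return 0.0  # occluded object is too far behind the occluder
    # Assumed linear falloff: full effect at contact, none at the range limit.
    return 1.0 - separation_m / effect_range_m
```

  • Under these assumptions, a hand at virtually the same depth as monster 402 yields a weight near 1, while an arm several feet in front of monster 403 yields 0.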
  • Figure 4B illustrates an example of a spatial occlusion resulting in an audio occlusion of a real object by a virtual object.
  • a virtual brick wall 410 appears to users Bob 406 and George 408 in their respective head mounted display devices 2 while executing a quest type of game which they are both playing, and which an action of George triggered to appear.
  • neither George 408 nor Bob 406 should be able to hear each other due to the sound absorption characteristic of a thick brick wall (e.g. 18 inches) 410 between them if it were real.
  • audio data generated by George is blocked, e.g. his cries for help, or removed from audio received via Bob's microphone and sent to Bob's earphones.
  • George's 3D audio engine modifies audio data received at George's earphones to remove audio data generated by Bob.
  • To hear the audio of the virtual objects generated by executing applications and sent to the 3D audio engine 304, the user typically uses the earphones for better hearing.
  • the sounds of the real objects received at the microphone can be buffered before being output to the user's earphones so the user experiences the audio occlusion effects applied to the real object audio when the user is using the earphones.
  • the physical properties including type of material of an object are used to determine its one or more effects on audio data.
  • a sound occlusion model 316 may include rules for representing the one or more effects which the 3D audio engine 304 can implement.
  • one type of material may be primarily a sound absorber in which the amplitude of the sound wave is dampened and the sound energy is turned to heat energy. Absorbers are good for sound proofing.
  • a sound occlusion model may for example indicate a damping coefficient to the amplitude of the audio data to represent an absorption effect.
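  • One possible reading of a damping coefficient applied to the amplitude of audio data is sketched below. The convention that a coefficient of 0 leaves the audio unchanged and a coefficient of 1 silences it, as well as the function name, are assumptions for illustration rather than part of the sound occlusion model 316 as described.

```python
from typing import List, Sequence


def apply_absorption(samples: Sequence[float], damping_coefficient: float) -> List[float]:
    """Scale audio sample amplitudes to represent an absorbing occluder.

    damping_coefficient is assumed to lie in [0, 1]: 0 means no absorption,
    1 means the occluding material absorbs the sound completely.
    """
    if not 0.0 <= damping_coefficient <= 1.0:
        raise ValueError("damping coefficient must be between 0 and 1")
    gain = 1.0 - damping_coefficient
    return [sample * gain for sample in samples]
```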
  • Another type of material may act to reflect sound waves such that the angle of incidence is the angle of reflection for a pre-defined percentage of waves hitting the material. Echo and Doppler effects may be output by the 3D audio engine as a result.
  • a third type of material acts as a sound diffuser reflecting incident sound waves in all directions.
  • a sound occlusion model associated with the object having this type of material has rules for generating reflection signals of audio data in random directions off the size and shape of the occluding object which the 3D audio engine implements.
  • 3D audio engines, such as may be used in interactive gaming with all artificial display environments, have techniques for modifying sound waves to create echoes and Doppler effects as well as absorption, reflection and diffusion effects.
  • the outward facing cameras 113 in conjunction with the object recognition engine 192 and gesture recognition engine 193 implement a natural user interface (NUI) in embodiments of the display device system 8.
  • Blink commands or gaze duration data identified by the eye tracking software 196 are also examples of physical action user input. Voice commands may also supplement other recognized physical actions such as gestures and eye gaze.
  • the gesture recognition engine 193 can identify actions performed by a user indicating a control or command to an executing application. The action may be performed by a body part of a user, e.g. a hand or finger, but an eye blink sequence can also be a gesture.
  • the gesture recognition engine 193 includes a collection of gesture filters, each comprising information concerning a gesture that may be performed by at least a part of a skeletal model.
  • the gesture recognition engine 193 compares a skeletal model and movements associated with it derived from the captured image data to the gesture filters in a gesture library to identify when a user (as represented by the skeletal model) has performed one or more gestures. In some examples, matching of image data to image models of a user's hand or finger during gesture training sessions may be used rather than skeletal tracking for recognizing gestures.
  • Virtual data engine 195 processes virtual objects and registers the 3D space position and orientation of virtual objects in relation to one or more coordinate systems, for example in user field of view dependent coordinates or in the view independent 3D map coordinates.
  • the virtual data engine 195 determines the position of image data of a virtual object or imagery (e.g. shadow) in display coordinates for each display optical system 14. Additionally, the virtual data engine 195 performs translation, rotation, and scaling operations for display of the virtual object at the correct size and perspective.
  • a virtual object position may be dependent upon a position of a corresponding object which may be real or virtual.
  • the virtual data engine 195 can update the scene mapping engine about the 3D space positions of the virtual objects processed.
  • Device data 198 may include a unique identifier for the computer system 8, a network address, e.g. an IP address, model number, configuration parameters such as devices installed, identification of the operating system, and what applications are available in the display device system 8 and are executing in the display system 8 etc. Particularly for the see-through, augmented reality display device system 8, the device data may also include data from sensors or determined from the sensors like the orientation sensors 132, the temperature sensor 138, the microphone 110, and the one or more location and proximity transceivers 144.
  • the method embodiments below are described in the context of the system embodiments described above. However, the method embodiments are not limited to operating in the system embodiments described above and may be implemented in other system embodiments. Furthermore, the method embodiments are continuously performed, and there may be multiple occlusions between real and virtual objects being processed for a current user field of view. For example, as the user wearing the head mounted, augmented reality display device system moves at least her head, and real and virtual objects move as well, the user's field of view continues to change as do observable occlusions.
  • a display typically has a display or frame rate which updates faster than the human eye can perceive, for example 30 frames a second.
  • Figures 5A through 5C illustrate some embodiments which may be used to cause a see-through display or other head mounted display to represent the spatial occlusion relationship in the display by modifying display of the virtual object.
  • FIG. 5A is a flowchart of an embodiment of a method for displaying a realistic partial occlusion between a real object and a virtual object by a head mounted, augmented reality display device system.
  • the occlusion engine in step 502 identifies a partial spatial occlusion between a real object and a virtual object based on their 3D space positions from a user perspective, and in step 506 retrieves object boundary data of an occluding portion of an occluding object in the partial occlusion.
  • the occlusion engine 302 determines a level of detail for a model, e.g.
  • the occlusion engine 302 generates a modified version of the boundary data of the virtual object based on the model to include boundary data adjacent an unoccluded portion of the real object which has a shape based on the model of the partial occlusion interface.
  • the shape of the adjacent boundary data is like that of the model.
  • the virtual data engine 195 in step 514 causes the image generation unit to display an unoccluded portion of the virtual object in accordance with the modified version of its boundary data.
  • a video-see HMD device may modify the embodiment of Figure 5A in that steps 512 and 514 may be performed with respect to the occluding object, be it real or virtual, as a video-see display is not see through but displays image data of the real world which can be manipulated as well as image data of virtual objects.
  • a see-through display may take a hybrid approach and may modify at least a portion of a boundary of a real object and display its image data in accordance with the modified boundary portion.
  • FIG. 5B is a flowchart of an implementation example for determining a spatial occlusion relationship between a virtual object and a real object in a user field of view of a head mounted, augmented reality display device based on 3D space position data for the objects.
  • the occlusion engine 302 identifies an overlap of a 3D space position of a real object with a 3D space position of a virtual object in a 3D mapping of a user field of view from a user perspective.
  • the occlusion engine 302 identifies, in step 524, which object is the occluded object and which object is the occluding object for the overlap based on depth data for the respective portions of the virtual object and the real object in the overlap.
  • the occlusion engine 302 determines whether the occlusion is whole or partial based on position coordinates of the 3D space positions of the real and virtual objects in terms of the non-depth axes of the 3D mapping.
  • the occlusion engine 302 can notify the virtual data engine 195 to not display a virtual object which is wholly occluded by a real object.
  • the occlusion engine 302 does not modify the boundary of the virtual object for this occlusion.
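  • The determinations of Figure 5B may be sketched as follows under simplifying assumptions: each object is reduced to an axis-aligned extent on the two non-depth axes plus a single depth value, whereas the description contemplates depth data for the respective overlapping portions. The `Extent` type, its field names and the tie-breaking rule for equal depths are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Extent:
    """Axis-aligned extent of an object in the 3D mapping (hypothetical layout)."""
    x_min: float
    x_max: float
    y_min: float
    y_max: float
    depth: float  # distance from the user along the depth axis


def classify_occlusion(real: Extent, virtual: Extent) -> Optional[Tuple[str, str]]:
    """Return (occluding object, 'whole' or 'partial'), or None if no overlap."""
    # Identify an overlap on the non-depth axes.
    overlap_x = min(real.x_max, virtual.x_max) - max(real.x_min, virtual.x_min)
    overlap_y = min(real.y_max, virtual.y_max) - max(real.y_min, virtual.y_min)
    if overlap_x <= 0 or overlap_y <= 0:
        return None
    # The object nearer the user occludes; ties are assumed to favor the real object.
    if real.depth <= virtual.depth:
        name, occluding, occluded = "real", real, virtual
    else:
        name, occluding, occluded = "virtual", virtual, real
    # Whole occlusion: the occluder covers the occluded extent on both axes.
    covers = (
        occluding.x_min <= occluded.x_min
        and occluding.x_max >= occluded.x_max
        and occluding.y_min <= occluded.y_min
        and occluding.y_max >= occluded.y_max
    )
    return (name, "whole" if covers else "partial")
```

  • In this sketch, a result of ("real", "whole") corresponds to the case in which the virtual object is simply not displayed, and its boundary is left unmodified.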
  • a virtual object occludes at least a portion of a real object and conforms its shape to the shape of the real object.
  • a user may have designated a setting in his user profile 322 for his avatar to be displayed conforming to him to other display device systems 8 when a scene mapping engine 306 or higher level application 166 identifies that he is in a field of view of these other display device systems 8.
  • the other viewers see the avatar instead of him from their respective perspectives, and the avatar mimics his movements.
  • FIG. 5C is a flowchart of an embodiment of a method for displaying a realistic conforming occlusion interface between a real object occluded by a conforming virtual object by a head mounted, augmented reality display device system.
  • the occlusion engine 302 retrieves object boundary data for at least the portions of the occluding virtual object and the occluded real object.
  • a level of detail for an occlusion version of boundary data for the virtual object is determined based on level of detail criteria and the retrieved object boundary data for the real and virtual objects.
  • In step 536, the occlusion engine 302 generates the occlusion interface model of at least a portion of the boundary data for the virtual object based on the determined level of detail, and in step 537, generates a modified version of the boundary data of the virtual object based on the occlusion interface model.
  • In step 538, the virtual data engine 195 displays the virtual object in accordance with the modified version of its boundary data.
  • Figures 6A, 6B, 6C and 6D describe examples of process steps for selecting a level of detail at which to display an occlusion interface based on different types of level of detail criteria including depth, display size, speed of the interface in the user field of view and positional relationship to a point of gaze.
  • Figure 6A is a flowchart of an implementation example for determining a level of detail for representing either a partial occlusion interface or a conforming occlusion interface based on level of detail criteria including a depth position of the occlusion interface.
  • the occlusion engine 302 in step 542 tracks a depth position of the occlusion interface in the user field of view, and in step 544, selects a level of detail based on the depth position in the field of view. Tracking a depth position includes monitoring changes in a depth position of each object or portions of each object in the occlusion in order to tell where the interface is and predict where it will be at a future reference time. In the case where depth cameras are available, the scene mapping engine updates positional values based on readings from depth sensors or depth cameras. Additionally, in lieu of or as a supplement to depth data, the scene mapping engine can identify a depth change based on parallax determined based upon the positions of image elements, e.g. pixels, in image data of the same object captured separately from the front facing cameras 113.
  • Parallax shows an apparent difference in position of an object when viewed from at least two different lines of sight to the object, and is measured by an angle between the two lines. Closer objects have a higher parallax than distant objects. For example, when a user is driving along a road with a tree, as his car nears the tree, parallax detected by the user's eyes for the tree increases. However, no parallax is detected for the moon in the sky because it is so far away even though the user is moving with respect to it. An increase or decrease in parallax can indicate a depth position change of the object. Additionally, a change in parallax can indicate a change in viewing angle.
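  • The relationship between parallax and depth can be illustrated with a short sketch. It assumes two viewpoints separated by a known baseline and an object lying on the perpendicular bisector of that baseline, so that the angle between the two lines of sight is 2·atan(baseline / (2·depth)). The function names are hypothetical, and real image data would require calibrated cameras and pixel correspondence, neither of which is shown.

```python
import math


def parallax_angle_rad(baseline_m: float, depth_m: float) -> float:
    """Angle between two lines of sight to a point centered in front of the baseline."""
    if baseline_m <= 0 or depth_m <= 0:
        raise ValueError("baseline and depth must be positive")
    return 2.0 * math.atan(baseline_m / (2.0 * depth_m))


def depth_from_parallax_m(baseline_m: float, angle_rad: float) -> float:
    """Invert parallax_angle_rad to recover depth from a measured angle."""
    if baseline_m <= 0 or not 0.0 < angle_rad < math.pi:
        raise ValueError("baseline must be positive and angle must be in (0, pi)")
    return baseline_m / (2.0 * math.tan(angle_rad / 2.0))
```

  • Consistent with the tree and moon example, the computed angle grows as depth shrinks and approaches zero for very distant objects.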
  • a level of detail may be incremental as in continuous level of detail or there may be a respective distance range associated with each discrete level of detail in a set.
  • An intersection distance between two discrete levels of detail may be identified as areas for the virtual data engine to apply level of detail transition techniques to avoid a "popping" effect as the modeling of the occlusion interface becomes more detailed.
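  • A set of discrete levels of detail with associated distance ranges, together with transition areas around the intersection distances, might be sketched as below. The boundary distances, the width of the transition band and the convention that level 0 is the most detailed are all assumptions chosen for illustration.

```python
import bisect

# Hypothetical intersection distances (meters) between successive levels of detail.
# Level 0 (most detailed) applies nearer than 2 m, level 3 (least) beyond 50 m.
LOD_BOUNDARIES_M = [2.0, 10.0, 50.0]


def select_lod_for_depth(depth_m: float) -> int:
    """Return the discrete level of detail whose distance range contains depth_m."""
    if depth_m < 0:
        raise ValueError("depth must be non-negative")
    return bisect.bisect_right(LOD_BOUNDARIES_M, depth_m)


def in_transition_zone(depth_m: float, band_m: float = 0.5) -> bool:
    """True when depth_m is close enough to a boundary to warrant a transition technique."""
    return any(abs(depth_m - boundary) <= band_m for boundary in LOD_BOUNDARIES_M)
```

  • An interface found inside a transition zone could, for example, be blended between the two neighboring models rather than switched abruptly.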
  • the level of detail selected identifies how accurately the occlusion interface is to be modeled to look as natural or realistic as if the virtual object in the spatial occlusion relationship were a real object.
  • a level of detail may include a level of detail for a geometrical model of the occlusion interface.
  • One example of a level of detail which may be selected for a geometrical model is a set of rules for using at least part of a boundary of a pre-defined bounding geometrical shape like a circle, square, rectangle or triangle as a model or representation of the occlusion interface.
  • geometry fitting such as line or curve fitting may be used to fit object boundary data points in the data set representing the occlusion interface, and examples of accuracy criteria include smoothing criteria and percentage of object boundary data stored for the occlusion to be included in the resulting curve, line or other fitted geometrical shape or geometry produced by the fitting.
  • the boundary data for at least a real object in the occlusion is a bounding volume or occlusion volume.
  • An application may be displaying virtual objects, and they are moving quickly or the user wearing the HMD is moving quickly, so occlusions are occurring rapidly. Less detailed bounding shapes facilitate quicker processing by taking advantage of human perceptual limitations in noticing the details of fast moving objects.
  • a tree's boundary data may be represented as a cylinder, and an ellipse may surround a person in the field of view.
  • the conforming occlusion interface is modeled as at least a portion of the bounding volume for the real object.
  • the use of bounding volumes as the boundary data will simplify interfaces.
  • the object boundary data of an occluding portion retrieved is a portion of a cylinder.
  • the cylinder boundary data is retrieved for the tree rather than a more detailed and realistic version of boundary data.
  • a virtual object can also be represented by a bounding volume which can further simplify interfaces.
  • occlusions may be processed based on depth map data such as may be captured from front facing cameras 113 as the bounding volumes may be assigned prior to refining boundaries and real object identification.
  • Another example of a display aspect which rules for a level of detail may govern is a respective gap tolerance between the real object and the virtual object meeting at an occlusion interface. The less fitted the geometrical representation is to the object boundary data, the more likely one or more gaps may result. For example, when a user's real fingers occlude sections of a virtual ball, the portions of the virtual ball between the fingers may be rendered to stop a short distance from the object boundary data representing the user's fingers resulting in a small gap.
  • a level of detail may be included in the set in which the virtual object is allowed to be rendered without correcting for occlusion. Criteria which may allow this include that the display size of the partial occlusion interface is below the display element, e.g. picture element or pixel, resolution of the display. Another factor that also affects the level of detail is the number of edges or data points determined from the raw image data. In other embodiments, a very detailed level of detail may indicate to use the detected edges as the model for the partial occlusion interface to represent the interface resulting in a very detailed display.
  • Realism of the displayed occlusion is balanced against efficiency in updating the display to represent motion of the virtual objects and updating the 3D mapping of the user environment.
  • Other level of detail criteria may include an efficiency factor representing a time in which the display of the occlusion is to be completed. Compliance with this factor may be determined based on status messages of available processing time for the various processing units, including graphics processing units, between the collaborating processors of the display device system 8 and one or more network accessible computer systems 12, and other display device systems 8 which make their excess processing power available. If processing resources are not available, a lower, less realistic level of detail than the depth position may warrant may be selected.
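  • The efficiency factor may be illustrated by a sketch which falls back to a less detailed level when the estimated processing time of the preferred level exceeds the time reported as available. The cost table, the time units and the convention that level 0 is the most detailed are hypothetical; the description does not specify how processing cost is estimated.

```python
from typing import Sequence


def select_affordable_lod(
    preferred_lod: int, cost_ms_by_lod: Sequence[float], available_ms: float
) -> int:
    """Return the most detailed level, no more detailed than preferred, that fits the budget.

    cost_ms_by_lod[i] is an assumed processing-time estimate for level i,
    where level 0 is the most detailed and higher indices are coarser.
    """
    if not 0 <= preferred_lod < len(cost_ms_by_lod):
        raise ValueError("preferred level of detail is out of range")
    for lod in range(preferred_lod, len(cost_ms_by_lod)):
        if cost_ms_by_lod[lod] <= available_ms:
            return lod
    return len(cost_ms_by_lod) - 1  # nothing fits, so use the coarsest level
```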
  • a hub computer system 12 or another display device system 8 may have already generated and stored a model representing a partial occlusion interface or a conforming occlusion interface and also the image data for rendering the occlusion interface for the same real object and virtual object at a level of detail.
  • an occlusion data set may store the model generated of a partial occlusion interface or a conforming occlusion interface at a certain level of detail, and the hub computer system 12 can retrieve the stored model and send it over a network to the display device system 8 having the same occlusion in its field of view at a depth position suitable for the level of detail.
  • the display device system 8 can translate, rotate and scale the occlusion data for its perspective.
  • the hub computing system 12 may also retrieve image data for the occlusion interface from another display device system and perform scaling, rotating or translation of the image data for the perspective of the display device system 8 as needed and send the modified image data to the display device system 8 which is in a format ready for processing by the image generation unit 120.
  • the sharing of occlusion and image data can also make a more detailed level comply with the processing efficiency criteria.
  • Lighting and shadows affect the visibility of details. For example, at a certain depth position, more details of a real object may be visible in bright daylight than at night or in a shadow cast by another real or virtual object. Rendering an occlusion interface for a virtual object with the real object at a level of detail for bright daylight may be computationally inefficient on a cloudy, rainy day.
  • the occlusion engine 302 optionally determines a lighting value for the 3D space position of the occlusion interface based on values assigned for lighting level, degree of shadow and reflectivity by the scene mapping software, and in step 548, optionally, modifies the selected level of detail based on the lighting value in view of the depth position.
  • Figure 6B is a flowchart of an implementation example for determining a level of detail for representing an occlusion interface based on level of detail criteria including a display size of the occlusion interface.
  • the occlusion engine 302 tracks a depth position of the occlusion interface, and in step 554 identifies physical properties, including object size and shape, for the portions of the virtual and real objects at the occlusion interface, for example based on their respective associated object physical properties data sets 320NS.
  • a display size of the portion of the virtual object at the occlusion interface can be determined at step 556 by the virtual data engine 195 in response to a request by the occlusion engine 302 by calculating a display size based on the depth position, the identified physical properties of the portions of the objects including object size and shape, and a coordinate transformation to identify how many display elements (e.g. pixels or sub-pixels on the display) would represent the image of the occlusion interface on the display. For example, if the display size is significantly below the pixel resolution of the display, a level of detail which indicates no occlusion processing may be selected, as the occlusion will not be visible, or will be too faint to justify the computational cost. In step 558, the occlusion engine 302 selects a level of detail corresponding to the determined display size.
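  • A minimal sketch of the display size calculation follows, assuming a simple pinhole projection in which an extent of a given physical size at a given depth spans size × focal length / depth display elements. The focal length expressed in pixels, the threshold values and the level names are hypothetical stand-ins for the coordinate transformation and rule sets of an actual display.

```python
from typing import Sequence, Tuple


def display_size_px(object_size_m: float, depth_m: float, focal_length_px: float) -> float:
    """Approximate how many pixels an extent of object_size_m spans at depth_m."""
    if depth_m <= 0 or focal_length_px <= 0 or object_size_m < 0:
        raise ValueError("depth and focal length must be positive, size non-negative")
    return object_size_m * focal_length_px / depth_m


# Hypothetical upper bounds (pixels) for each level of detail, coarsest first.
DEFAULT_THRESHOLDS: Tuple[Tuple[float, str], ...] = (
    (1.0, "no_occlusion_processing"),
    (20.0, "bounding_shape"),
    (100.0, "coarse_fit"),
)


def select_lod_for_display_size(
    size_px: float, thresholds: Sequence[Tuple[float, str]] = DEFAULT_THRESHOLDS
) -> str:
    """Pick the level of detail whose pixel range contains size_px."""
    for upper_bound_px, level in thresholds:
        if size_px < upper_bound_px:
            return level
    return "detected_edges"  # large on the display, so use the most detailed model
```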
  • Figure 6C is a flowchart of an implementation example for determining a level of detail for representing an occlusion interface based on level of detail criteria including a gaze priority value.
  • the eye tracking software 196 in step 562, identifies a point of gaze in a user field of view.
  • a point of gaze may be determined by detecting pupil positions of the user's eyes, extending lines of sight from each of the user's approximated retina locations based on eyeball models and identifying the intersection point of the lines of sight in the 3D mapped user field of view.
  • the intersection point is a point of gaze which may be an object in the field of view.
  • the point of gaze in a coordinate system may be stored in a memory location accessible by the other software for their processing.
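  • Because two measured lines of sight rarely intersect exactly in three dimensions, one common approximation, assumed here and not stated in the description, takes the midpoint of the shortest segment connecting the two lines as the point of gaze. The origins and directions passed in are hypothetical inputs standing for the per-eye lines of sight derived from the eyeball models.

```python
from typing import Optional, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def _dot(u: Sequence[float], v: Sequence[float]) -> float:
    return sum(a * b for a, b in zip(u, v))


def point_of_gaze(
    origin_l: Vec3, dir_l: Vec3, origin_r: Vec3, dir_r: Vec3
) -> Optional[Vec3]:
    """Midpoint of the closest approach of two lines of sight, or None if parallel."""
    w = tuple(a - b for a, b in zip(origin_l, origin_r))
    a = _dot(dir_l, dir_l)
    b = _dot(dir_l, dir_r)
    c = _dot(dir_r, dir_r)
    d = _dot(dir_l, w)
    e = _dot(dir_r, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        return None  # parallel lines of sight never converge on a point
    s = (b * e - c * d) / denom  # parameter along the left line
    t = (a * e - b * d) / denom  # parameter along the right line
    p_l = tuple(o + s * k for o, k in zip(origin_l, dir_l))
    p_r = tuple(o + t * k for o, k in zip(origin_r, dir_r))
    return tuple((x + y) / 2.0 for x, y in zip(p_l, p_r))
```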
  • the occlusion engine 302 in step 564, assigns a priority value for each occlusion interface based on its respective position from the point of gaze, and in step 566, selects a level of detail for generating a model of the partial occlusion interface or the conforming occlusion interface based on level of detail criteria and the priority value.
  • the priority value may be based on a distance criterion from the point of gaze.
  • an occlusion interface positioned within Panum's fusional area, the area of single vision for human binocular vision, may receive a higher priority value than those positioned outside Panum's fusional area.
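  • A distance-based priority value might be sketched as below. The single radius around the point of gaze is a deliberately crude, hypothetical stand-in for a region such as Panum's fusional area and is not a physiological model; the two priority values are likewise arbitrary.

```python
import math
from typing import Sequence


def gaze_priority(
    interface_position: Sequence[float],
    point_of_gaze: Sequence[float],
    high_priority_radius_m: float,
) -> int:
    """Return 2 for an interface near the point of gaze, otherwise 1."""
    if high_priority_radius_m <= 0:
        raise ValueError("radius must be positive")
    distance = math.dist(interface_position, point_of_gaze)
    return 2 if distance <= high_priority_radius_m else 1
```

  • The resulting value could then be combined with other level of detail criteria, for example by permitting the most detailed models only for interfaces with the higher priority.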
  • Figure 6D is a flowchart of an implementation example for determining a level of detail using speed of the interface as a basis.
  • the occlusion engine 302 in step 572 determines a speed of an occlusion interface based on the speed of the objects for the occlusion.
  • the occlusion may be a predicted or future occlusion based on their speeds.
  • the occlusion engine 302 uses speed as a basis for selecting a level of detail.
  • speed may be one of a plurality of factors considered in determining a level of detail for processing an occlusion. The higher the speed, the less detailed the occlusion processing, and no occlusion processing may be selected as a level if the objects are moving fast enough.
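  • The use of speed as a basis may be sketched as follows. Taking the speed of the interface to be the magnitude of the relative velocity of the two objects is an assumption, as are the threshold values and level names; the description states only that higher speed leads to less detailed processing.

```python
import math
from typing import Sequence


def interface_speed(real_velocity: Sequence[float], virtual_velocity: Sequence[float]) -> float:
    """Assumed interface speed: magnitude of the objects' relative velocity (m/s)."""
    return math.dist(real_velocity, virtual_velocity)


def select_lod_for_speed(speed_m_s: float) -> str:
    """Map speed to a level of detail using hypothetical thresholds."""
    if speed_m_s < 0:
        raise ValueError("speed must be non-negative")
    if speed_m_s >= 10.0:
        return "no_occlusion_processing"
    if speed_m_s >= 3.0:
        return "bounding_shape"
    if speed_m_s >= 0.5:
        return "coarse_fit"
    return "detailed_fit"
```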
  • Figures 7A and 8A illustrate an example of using at least part of a boundary of a pre-defined geometrical shape by using sides of a triangle as models for the partial occlusion interfaces 704 and 706.
  • Figures 7B and 8B illustrate examples of line fitting as a form of geometry fitting with a first accuracy criteria.
  • Figure 7C illustrates line fitting with a second accuracy criteria which has more accuracy.
  • Figure 8C is a reference image of the unmodified virtual object, a dolphin, in Figures 7A, 7B, 7C, 8A and 8B.
  • Figure 7A illustrates an example of a level of detail using at least part of a boundary of a pre-defined bounding geometrical shape.
  • Figure 8A illustrates an example of partial occlusion interfaces modeled as triangle legs for the virtual object in Figure 7A.
  • the pine tree 7161 in Figure 7A is not a triangle but has a border including triangle-like features.
  • a central portion including a fin is occluded by the pine tree.
  • there are two partial occlusion interfaces which are modeled as sides of a triangle due to the depth of the virtual object and the pine tree. Due to the distance to the trees, a larger gap tolerance is permitted for this level of detail in this example between the end of the real branches and the beginning of the virtual dolphin sides.
  • Figure 7B illustrates an example of a level of detail using geometry fitting with a first accuracy criteria.
  • a line fitting algorithm may be used with a smoothing criteria.
  • a smoothing criteria may indicate either a maximum on how far the fitted geometry can be from the originally detected boundary data, e.g. points and edges, or a complexity level of a polygon, e.g. triangle vs. tetrahedron, that can be used to represent a polygon mesh version of the object portion retrieved from a storage location after recognizing the object.
  • the third, fourth, and fifth lower tiers of branches are too far from a fitted line for their shape to be left out, so their shape is represented in the geometry of partial occlusion interfaces 708, 710 and 712.
  • Figure 8B illustrates the resulting partial occlusion interfaces 708, 710 and 712 which include indentations for space between the tiers of branches.
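  • the effect of a maximum-deviation smoothing criteria can be sketched with Ramer-Douglas-Peucker polyline simplification, which is substituted here as a well-known stand-in and is not named in the description. The function name `fit_boundary`, the 2D point tuples and the `max_deviation` parameter are hypothetical. A large tolerance collapses a jagged boundary to a single line, as in the triangle sides of Figure 7A, while a smaller tolerance keeps indentations such as the gaps between tiers of branches.

```python
import math


def _point_segment_distance(p, a, b):
    """Distance from point p to the segment from a to b (all 2D tuples)."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        return math.hypot(p[0] - a[0], p[1] - a[1])
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))  # clamp the projection onto the segment
    return math.hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy))


def fit_boundary(points, max_deviation):
    """Simplify detected boundary points so that no dropped point lies
    farther than max_deviation from the fitted line segments."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    worst_index, worst_distance = 0, 0.0
    for i in range(1, len(points) - 1):
        d = _point_segment_distance(points[i], first, last)
        if d > worst_distance:
            worst_index, worst_distance = i, d
    if worst_distance <= max_deviation:
        return [first, last]  # one fitted line is accurate enough
    # Otherwise keep the worst point and fit each half separately.
    left = fit_boundary(points[:worst_index + 1], max_deviation)
    right = fit_boundary(points[worst_index:], max_deviation)
    return left[:-1] + right
```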
  • Figure 7C illustrates an example of a level of detail using geometry fitting with a second accuracy criteria indicating a higher level of modeled detail.
  • a geometry fitting algorithm like curve or line fitting may be used to model the more detailed detected boundary data of the tree, which now includes branches with pine needles that can be seen through, causing much more detail to be represented in the partial occlusion interfaces.
  • the user field of view is identified indicating the user is gazing through the branches in a section of the tree 7163 at the dolphin fin.
  • the geometry fitting algorithm likely has more boundary data to work with from captured image and depth data, and accuracy criteria indicating lower tolerance for deviating from the boundary data.
  • Partial occlusion interfaces 724N are representative of interfaces between the trunk of the tree and the dolphin between branches.
  • Interface 721N is representative of the occlusion interface of branch sections between pine needles.
  • Interface 720N is representative of the occlusion interface of pine needle sections on the branches which are in front of the dolphin in the user perspective.
  • Figure 7D illustrates an example of a level of detail using a bounding volume as boundary data for at least a real object.
  • a person, Bob 406, is being seen through a see-through display device 2, such as may be worn by George 408 in this example.
  • George is gazing at virtual monster 732 as indicated by gaze lines 731l and 731r in this display frame.
  • the monsters 732 and 733 are jumping around the room quickly, so the occlusion engine 302 is using a bounding volume of a predefined shape, an ellipse in this example, for tracking Bob based on depth map data as the monsters keep jumping around through different display frames.
  • Bob 406 is considered a real object though he may not be identified as a person yet by the object recognition engine 192.
  • the occlusion engine 302 uses the ellipse in modeling the occlusion interface with the monsters.
  • the monster 732 is cut off for display at the ellipse boundary rather than at Bob's right arm.
  • Monster 733 similarly has a portion cut off or not displayed which is occluded by the ellipse. Because the monsters jumping around the room make the occlusions change quickly, a less detailed occlusion interface may be presented in accordance with level of detail criteria.
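  • as a rough sketch of cutting off a virtual object at a bounding ellipse, the test below works in 2D display coordinates and assumes the real object is in front of the virtual object at every pixel the ellipse covers. The function names and the `(cx, cy, rx, ry)` ellipse tuple are hypothetical.

```python
def inside_ellipse(x, y, cx, cy, rx, ry):
    """True when display coordinate (x, y) falls within the bounding ellipse
    centered at (cx, cy) with radii rx and ry."""
    return ((x - cx) / rx) ** 2 + ((y - cy) / ry) ** 2 <= 1.0


def clip_to_bounding_ellipse(virtual_pixels, ellipse):
    """Keep only the virtual object's pixels that lie outside the ellipse,
    so the object is cut off at the ellipse boundary rather than at the
    real object's actual outline."""
    return [(x, y) for (x, y) in virtual_pixels
            if not inside_ellipse(x, y, *ellipse)]
```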
  • Figure 9A illustrates an example of a real person to which a conforming virtual object is registered.
  • a person in the user field of view, hereafter Sam, is wearing a T-shirt 804.
  • Sam is at an event where one can be seen wearing a virtual sweater indicating the college he attended.
  • Sam's virtual sweater 902 conforms to Sam's body as clothing generally does.
  • Figure 9B illustrates examples of conforming occlusion interfaces modeled at a first level of detail with a first accuracy criteria for a virtual object.
  • Figures 9B and 9C indicate conforming and partial occlusion interfaces of the occluding virtual sweater 902 and the real object portions of Sam like his T-shirt 804, arms, shoulders and pants.
  • Occlusion interface 910 is a conforming interface as the position of the volume or 3D space which the virtual shoulders of the sweater occupy is based on conforming to the shape and size of Sam's real shoulders.
  • the lapels 9081 and 9082 of the sweater have partial occlusion interfaces with Sam's T-shirt 804 and pants.
  • Sections of the lapels 9061 and 9062 take their shape based on Sam's mid-section shape which includes the bulges 8061 and 8062. Thus, the mid lapel sections 9061 and 9062 do not lie flat but follow the contours of the bulges.
  • Figure 9C illustrates an example of a conforming occlusion interface modeled with a second level of detail with a second accuracy criteria for a virtual object. In this current field of view, the wearer sees Sam again centered in her field of view but at least twice the distance away. Based on the distance, the boundary data of the virtual lapel sections 9061 and 9062 are not displayed and are replaced by smooth less detailed curves for smooth lying lapels 9081 and 9082 on Sam's sweater 902.
  • Figure 10 illustrates examples of displaying shadow effects between occluding real and virtual objects.
  • a shadow of a virtual object can be displayed on a real object, and a virtual object can be displayed with the shadow of a real object appearing upon it.
  • an area of shadow can be identified in display coordinates and an opacity filter 114 in front of the display optical systems 14 can adjust incoming light for those display coordinates to appear darker in some examples to give a shadow effect.
  • Image data of a shadow can also be displayed to appear on a virtual or real object using conventional real-time shadow generation techniques.
  • a position of a shadow of a real object can be determined by conventional shadow detection techniques used in image processing.
  • the scene mapping engine 306 can determine a position of a shadow cast by a virtual object and whether the virtual object is to be displayed in shadow.
  • spherical balls 932 and 940 are real objects and box 936 is a virtual object.
  • the scene mapping engine 306 detects the shadow 934 for ball 932 and the shadow 942 for ball 940 from image and depth data captured of the user field of view by the front facing cameras 113 or other cameras in the environment.
  • the scene mapping engine 306 updates the 3D mapping of the user field of view identifying the shadows, and other applications like the occlusion engine 302 and the virtual data engine 195 receive notice of the real shadow positions when they retrieve their next map updates.
  • a 3D space position for virtual box 936 in the user field of view is determined, and the occlusion engine 302 determines the virtual box 936 is partially occluded by ball 932 and slightly occludes ball 940.
  • the occlusion engine 302 determines whether there is shadow occlusion as well, meaning an occluding object is casting a shadow on the occluded object based on the shadow positions of the 3D mapping.
  • the occlusion engine 302 determines if shadows are generated by the occlusion and if a shadow is to be applied to an object in an occlusion relationship. Besides a partial occlusion interface 933, the engine 302 determines that a shadow of occluding real ball 932 extends to a surface of virtual occluded box 936. The occlusion engine can identify one or more shadow occlusion boundaries 935 for the virtual box 936 which indicate a portion of the virtual box to be in shadow. Shadow can have a transparency level which can be seen through. As mentioned above, a partial occlusion interface identified as being in shadow may receive a less detailed level of detail for its modeling due to the shadow effects.
  • the occlusion engine 302 also identifies a partial occlusion interface 937 for the virtual box 936 occluding the real ball 940, and a shadow occlusion boundary 939 on the real ball 940.
  • the virtual data engine 195 is notified of the modified boundary data due to the partial occlusion interfaces and the shadow occlusion boundaries for updating the display accordingly. Boundaries like polygon meshes and edges are not displayed typically. They are a basis used by the virtual data engine 195 for identifying shape and size information for image data.
  • Figure 11 is a flowchart describing an embodiment of a process for displaying one or more virtual objects in a user field of view of a see-through, augmented reality display device, for example one like that in Figures 1A through 2B. Steps are described which may be performed by or for an opacity filter. The methods of Figures 11 and 12 may be performed in a display device system without an opacity filter 114 by not performing those steps related to the opacity filter.
  • the virtual data engine 195 accesses the 3D mapping of the user field of view from the user perspective. For a virtual image, such as may include a virtual object, the system has a target 3D space position of where to insert the virtual image.
  • in step 954, the system renders the previously created three dimensional model of the environment from the point of view of the user, the user perspective, of see-through, augmented reality display device 2 in a z-buffer, without rendering any color information into the corresponding color buffer. This effectively leaves the rendered image of the environment to be all black, but does store the z (depth) data for the objects in the environment.
  • in step 956, virtual content (e.g., virtual images corresponding to virtual objects) is rendered into the same z-buffer.
  • Steps 954 and 956 result in a depth value being stored for each pixel (or for a subset of pixels).
  • the virtual data engine 195 determines color information for virtual content to be displayed into the corresponding color buffer. This determination may be performed in a number of ways. In some embodiments, a Z or depth test is performed for each pixel. The color information for a virtual object is selected if it is of a portion of the virtual object closer to the display device than any other object, real or virtual. In other words, the pixel corresponds to an unoccluded portion of a virtual object. In the case of a video-see display, the color information may be for a real unoccluded object as well as for a virtual unoccluded object. Back to the case of a see-through display, no color information is selected for the pixel if it corresponds to an occluded portion of a virtual object.
  • the modified boundary data of a virtual object may be used as a basis for selecting which color information for virtual content is written to which pixels.
  • the virtual content buffered for display are versions of the virtual content which already include any modifications to image data based on modified boundary data due to the occlusion processing of occlusion interfaces with respect to level of detail, so the color information can simply be written to the color buffer for such virtual content. Any of these approaches effectively allows the virtual images to be drawn on the microdisplay 120 while taking into account real world objects or other virtual objects occluding all or part of a virtual object. In other words, any of these approaches can cause the see-through display to represent the spatial occlusion relationship in the display by modifying display of the virtual object.
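  • the per-pixel depth test described above can be sketched as follows for a see-through display. The list-of-lists buffers, the use of `None` for "no color written" and the function name are assumptions for illustration, and `virtual_depth` is assumed to already hold the nearest virtual fragment at each pixel.

```python
import math


def select_virtual_colors(real_depth, virtual_depth, virtual_color):
    """Write virtual color only where the virtual object is unoccluded.

    real_depth[r][c]    depth of the nearest real surface (math.inf if none),
                        as stored by rendering the environment model depth-only
    virtual_depth[r][c] depth of the virtual content, or None if absent
    virtual_color[r][c] color of the virtual content at that pixel
    """
    height, width = len(real_depth), len(real_depth[0])
    # None means nothing is drawn, so the real world shows through there.
    color_buffer = [[None] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            depth = virtual_depth[r][c]
            if depth is None:
                continue  # no virtual content at this pixel
            if depth < real_depth[r][c]:
                # Virtual portion is closer to the display than any real object.
                color_buffer[r][c] = virtual_color[r][c]
    return color_buffer
```

  • a graphics pipeline would normally perform the same comparison in hardware by rendering the environment with color writes disabled and then drawing virtual content with depth testing enabled.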
  • the system identifies the pixels of microdisplay 120 that display virtual images.
  • alpha values are determined for the pixels of microdisplay 120.
  • the alpha value is used to identify how opaque an image is, on a pixel-by-pixel basis.
  • the alpha value can be binary (e.g., on or off).
  • the alpha value can be a number with a range.
  • each pixel identified in step 960 will have a first alpha value and all other pixels will have a second alpha value.
  • the pixels of the opacity filter are determined based on the alpha values.
  • the opacity filter has the same resolution as microdisplay 120 and, therefore, the opacity filter can be controlled using the alpha values.
  • the opacity filter has a different resolution than microdisplay 120 and, therefore, the data used to darken or not darken the opacity filter will be derived from the alpha value by using any of various mathematical algorithms for converting between resolutions. Other means for deriving the control data for the opacity filter based on the alpha values (or other data) can also be used.
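  • one way the alpha values could drive an opacity filter of a different resolution is block averaging, sketched below. The description leaves the conversion algorithm open, so the averaging scheme, the function names and the default alpha values are assumptions.

```python
def alpha_values(color_buffer, virtual_alpha=1.0, other_alpha=0.0):
    """Give pixels that display virtual images a first alpha value and
    all other pixels a second alpha value."""
    return [[virtual_alpha if px is not None else other_alpha for px in row]
            for row in color_buffer]


def resample_alpha(alpha, filter_rows, filter_cols):
    """Derive opacity filter control data from per-pixel alpha values by
    averaging the block of display pixels that each filter cell covers."""
    src_rows, src_cols = len(alpha), len(alpha[0])
    out = []
    for fr in range(filter_rows):
        r0 = fr * src_rows // filter_rows
        r1 = max(r0 + 1, (fr + 1) * src_rows // filter_rows)
        row = []
        for fc in range(filter_cols):
            c0 = fc * src_cols // filter_cols
            c1 = max(c0 + 1, (fc + 1) * src_cols // filter_cols)
            block = [alpha[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out
```

  • when the filter and the microdisplay share a resolution, each block is a single pixel and the alpha values pass through unchanged.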
  • in step 966, the images in the z-buffer and color buffer, as well as the alpha values and the control data for the opacity filter if used, are adjusted to account for light sources (virtual or real) and shadows (virtual or real). More details of step 966 are provided below with respect to Figure 12.
  • in step 968, the composite image based on the z-buffer and color buffer is sent to microdisplay 120. That is, the virtual image is sent to microdisplay 120 to be displayed at the appropriate pixels, accounting for perspective and occlusions.
  • the control data for the opacity filter is transmitted from one or more processors of the control circuitry 136 or the processing unit to control opacity filter 114. Note that the process of Figure 11 can be performed many times per second (e.g., the refresh rate).
  • Figure 12 is a flowchart describing an embodiment of a process for accounting for light sources and shadows, which is an example implementation of step 966 of Figure 11.
  • the scene mapping engine 306 identifies one or more light sources that need to be accounted for. For example, a real light source may need to be accounted for when drawing a virtual image. If the system is adding a virtual light source to the user's view, then the effect of that virtual light source can be accounted for in the head mounted display device 2.
  • in step 972, the portions of the 3D mapping of the user field of view (including virtual images) that are illuminated by the light source are identified.
  • in step 974, an image depicting the illumination is added to the color buffer described above.
  • in step 976, the scene mapping engine 306 and the occlusion engine 302, for shadows resulting from occlusions, identify one or more areas of shadow that need to be added by the virtual data engine 195, optionally with the aid of the opacity filter. For example, if a virtual image is added to an area in a shadow, then the shadow needs to be accounted for when drawing the virtual image by adjusting the color buffer in step 978.
  • the occlusion engine 302 indicates a real object, a shadow occlusion interface on the real object and a transparency degree for the shadow. If the real object is subject to a virtual shadow, in step 980 the virtual data engine 195 uses this data to generate and render the shadow as virtual content which is registered to the real object.
  • the pixels of opacity filter 114 that correspond to the location of the virtual shadow are darkened.
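  • the two shadow adjustments can be sketched as follows: darkening the color of virtual content that lies in a real shadow, and darkening opacity filter cells where a virtual shadow falls on a real object. The linear darkening, the 0 to 1 `shadow_opacity` scale and the function names are assumptions for illustration.

```python
def shade_color(rgb, shadow_opacity):
    """Darken a virtual pixel's color when the pixel lies in shadow.

    shadow_opacity is 0.0 for no shadow and 1.0 for a fully dark shadow.
    """
    keep = 1.0 - shadow_opacity
    return tuple(round(channel * keep) for channel in rgb)


def darken_filter(filter_alpha, shadow_cells, shadow_opacity):
    """Darken opacity filter cells where a virtual shadow lands on a real
    object, without lightening cells that are already more opaque."""
    out = [row[:] for row in filter_alpha]  # leave the input untouched
    for r, c in shadow_cells:
        out[r][c] = max(out[r][c], shadow_opacity)
    return out
```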
  • the different steps for displaying the partial occlusion interface may be performed solely by the see-through, augmented reality display device system 8 or collaboratively with one or more hub computing systems 12 alone or in combination with other display device systems 8.
  • Figure 13A is a flowchart of an embodiment of a method for providing realistic audiovisual occlusion between a real object and a virtual object by a head mounted, augmented reality display device system.
  • the occlusion engine 302 determines a spatial occlusion relationship between a virtual object and a real object in an environment of a head mounted, augmented reality display device based on three dimensional data representing object volume or space positions.
  • the occlusion engine 302 determines whether the spatial occlusion relationship meets a field of view criteria for the display device.
  • Some examples of a field of view criteria are whether the occlusion is in the field of view and a projected time at which the occlusion is expected to come into the field of view based on motion tracking data for the objects. If the occlusion meets the field of view criteria, a determination is made in step 1006 as to whether the spatial occlusion is a partial occlusion. Responsive to the occlusion being a partial occlusion, in step 1008, processing for displaying a realistic partial occlusion is performed. Otherwise in step 1010, processing is performed for displaying a realistic whole occlusion of one object by another.
  • step 1012 determines whether an audio occlusion relationship exists between the virtual object and the real object. If an audio occlusion relationship does not exist, the audio data is output in step 1016. If the audio occlusion relationship exists, then in step 1014 audio data for an occluded object in the relationship is modified based on one or more physical properties associated with an occluding object in the relationship, and the modified audio data is output in step 1018.
  • Figure 13B is a flowchart of an implementation process example for determining whether an audio occlusion relationship between the virtual object and the real object exists based on one or more sound occlusion models associated with one or more physical properties of an occluding object.
  • the 3D audio engine 304 identifies at least one sound occlusion model associated with one or more physical properties of an occluding object and which model(s) represents at least one effect on sound and at least one distance range for the at least one effect.
  • the 3D audio engine retrieves a depth distance between the objects in the spatial occlusion relationship and determines in step 1026 whether the occluded object is within the at least one distance range. If not, the audio data is output unmodified as in step 1016.
  • the 3D audio engine 304 determines, in step 1028, whether a sound making portion of the occluded object associated with the audio data is occluded. Based on the object type of the object and the sound recognized as being made by the occluded object, the portion of the object which made the sound can be identified. From the 3D space position data of the occluded and occluding objects, whether the sound making portion is blocked or not can be determined. For example, if a partially occluded object is a person but the person's face is not blocked at all, there is no audio occlusion of voice data from the person.
  • the audio data is modified by the 3D audio engine 304 in accordance with the at least one effect on sound represented by the identified sound occlusion model and the 3D audio engine 304 performs step 1018 of outputting the modified audio data.
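  • the decision flow of Figure 13B might be sketched as below. The `SoundOcclusionModel` fields, the use of a single gain factor as the effect on sound and the example material are hypothetical; a real sound occlusion model could apply filtering or other effects.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SoundOcclusionModel:
    # Hypothetical model tied to a physical property of the occluding object.
    material: str
    gain: float          # effect on sound: multiplier between 0.0 and 1.0
    min_distance: float  # start of the distance range for the effect
    max_distance: float  # end of the distance range for the effect


def apply_audio_occlusion(samples: List[float],
                          model: SoundOcclusionModel,
                          depth_distance: float,
                          sound_portion_blocked: bool) -> List[float]:
    """Return audio data for the occluded object, modified only when an
    audio occlusion relationship exists."""
    in_range = model.min_distance <= depth_distance <= model.max_distance
    if not in_range:
        return list(samples)  # outside the distance range: output unmodified
    if not sound_portion_blocked:
        return list(samples)  # e.g. a speaker's face is not blocked
    return [s * model.gain for s in samples]  # apply the modeled effect
```

  • for example, a model for a wooden door with `gain=0.5` would halve the amplitude of a virtual voice coming from behind the door, but only when the depth distance falls within the model's range.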
  • the example computer systems illustrated in the figures include examples of processor readable storage devices.
  • Computer readable storage devices are also processor readable storage devices. Such devices may include volatile and nonvolatile, removable and non-removable memory devices implemented in any method or technology for storage of information such as processor readable instructions, data structures, program modules or other data.
  • some examples of processor or computer readable storage devices are RAM, ROM, EEPROM, cache, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical disk storage, memory sticks or cards, magnetic cassettes, magnetic tape, a media drive, a hard disk, magnetic disk storage or other magnetic storage devices, or any other device which can be used to store the desired information and which can be accessed by a computer.
PCT/US2013/036023 2012-04-10 2013-04-10 Realistic occlusion for a head mounted augmented reality display WO2013155217A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US13/443,368 2012-04-10
US13/443,368 US9122053B2 (en) 2010-10-15 2012-04-10 Realistic occlusion for a head mounted augmented reality display

Publications (1)

Publication Number Publication Date
WO2013155217A1 (en) 2013-10-17

Family

ID=49328136

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2013/036023 WO2013155217A1 (en) 2012-04-10 2013-04-10 Realistic occlusion for a head mounted augmented reality display

Country Status (2)

Country Link
CN (1) CN103472909B (zh)
WO (1) WO2013155217A1 (zh)

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4789745B2 (ja) * 2006-08-11 2011-10-12 キヤノン株式会社 Image processing apparatus and method
US20110075257A1 (en) * 2009-09-14 2011-03-31 The Arizona Board Of Regents On Behalf Of The University Of Arizona 3-Dimensional electro-optical see-through displays
US20120075167A1 (en) * 2010-09-29 2012-03-29 Eastman Kodak Company Head-mounted display with wireless controller

Cited By (68)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10095032B2 (en) 2013-06-11 2018-10-09 Sony Interactive Entertainment Europe Limited Handling different input signals
US10679648B2 (en) 2014-04-17 2020-06-09 Microsoft Technology Licensing, Llc Conversation, presence and context detection for hologram suppression
US9922667B2 (en) 2014-04-17 2018-03-20 Microsoft Technology Licensing, Llc Conversation, presence and context detection for hologram suppression
US10529359B2 (en) 2014-04-17 2020-01-07 Microsoft Technology Licensing, Llc Conversation detection
WO2016014234A1 (en) * 2014-07-22 2016-01-28 Sony Computer Entertainment Inc. Virtual reality headset with see-through mode
US10371944B2 (en) 2014-07-22 2019-08-06 Sony Interactive Entertainment Inc. Virtual reality headset with see-through mode
US9858720B2 (en) 2014-07-25 2018-01-02 Microsoft Technology Licensing, Llc Three-dimensional mixed-reality viewport
US10649212B2 (en) 2014-07-25 2020-05-12 Microsoft Technology Licensing Llc Ground plane adjustment in a virtual reality environment
US9766460B2 (en) 2014-07-25 2017-09-19 Microsoft Technology Licensing, Llc Ground plane adjustment in a virtual reality environment
US9865089B2 (en) 2014-07-25 2018-01-09 Microsoft Technology Licensing, Llc Virtual reality environment with real world objects
US9645397B2 (en) 2014-07-25 2017-05-09 Microsoft Technology Licensing, Llc Use of surface reconstruction data to identify real world floor
US9904055B2 (en) 2014-07-25 2018-02-27 Microsoft Technology Licensing, Llc Smart placement of virtual objects to stay in the field of view of a head mounted display
US10451875B2 (en) 2014-07-25 2019-10-22 Microsoft Technology Licensing, Llc Smart transparency for virtual objects
WO2016014873A1 (en) * 2014-07-25 2016-01-28 Microsoft Technology Licensing, Llc Virtual reality environment with real world objects
US10416760B2 (en) 2014-07-25 2019-09-17 Microsoft Technology Licensing, Llc Gaze-based object placement within a virtual reality environment
US20160026242A1 (en) 2014-07-25 2016-01-28 Aaron Burns Gaze-based object placement within a virtual reality environment
US10311638B2 (en) 2014-07-25 2019-06-04 Microsoft Technology Licensing, Llc Anti-trip when immersed in a virtual reality environment
US10096168B2 (en) 2014-07-25 2018-10-09 Microsoft Technology Licensing, Llc Three-dimensional mixed-reality viewport
US10725533B2 (en) 2014-09-26 2020-07-28 Intel Corporation Systems, apparatuses, and methods for gesture recognition and interaction
WO2016048633A1 (en) * 2014-09-26 2016-03-31 Intel Corporation Systems, apparatuses, and methods for gesture recognition and interaction
US9911232B2 (en) 2015-02-27 2018-03-06 Microsoft Technology Licensing, Llc Molding and anchoring physically constrained virtual environments to real-world environments
US9836117B2 (en) 2015-05-28 2017-12-05 Microsoft Technology Licensing, Llc Autonomous drones for tactile feedback in immersive virtual reality
US9898864B2 (en) 2015-05-28 2018-02-20 Microsoft Technology Licensing, Llc Shared tactile interaction and user safety in shared space multi-person immersive virtual reality
US10891804B2 (en) 2016-04-19 2021-01-12 Adobe Inc. Image compensation for an occluding direct-view augmented reality system
US11514657B2 (en) 2016-04-19 2022-11-29 Adobe Inc. Replica graphic causing reduced visibility of an image artifact in a direct-view of a real-world scene
GB2549563B (en) * 2016-04-19 2018-09-26 Adobe Systems Inc Image compensation for an occluding direct-view augmented reality system
GB2549563A (en) * 2016-04-19 2017-10-25 Adobe Systems Inc Image compensation for an occluding direct-view augmented reality system
US10134198B2 (en) 2016-04-19 2018-11-20 Adobe Systems Incorporated Image compensation for an occluding direct-view augmented reality system
US11215896B2 (en) 2016-10-11 2022-01-04 Sony Corporation Display apparatus
WO2018071225A1 (en) * 2016-10-14 2018-04-19 Microsoft Technology Licensing, Llc Modifying hand occlusion of holograms based on contextual information
US11768376B1 (en) 2016-11-21 2023-09-26 Apple Inc. Head-mounted display system with display and adjustable optical components
DE102016225262A1 (de) * 2016-12-16 2018-06-21 Bayerische Motoren Werke Aktiengesellschaft Method and device for operating a display system with data glasses
US11096004B2 (en) * 2017-01-23 2021-08-17 Nokia Technologies Oy Spatial audio rendering point extension
US11044570B2 (en) 2017-03-20 2021-06-22 Nokia Technologies Oy Overlapping audio-object interactions
US11074036B2 (en) 2017-05-05 2021-07-27 Nokia Technologies Oy Metadata-free audio-object interactions
US11442693B2 (en) 2017-05-05 2022-09-13 Nokia Technologies Oy Metadata-free audio-object interactions
US11604624B2 (en) 2017-05-05 2023-03-14 Nokia Technologies Oy Metadata-free audio-object interactions
US10403045B2 (en) 2017-08-11 2019-09-03 Adobe Inc. Photorealistic augmented reality system
CN109427099A (zh) * 2017-08-29 2019-03-05 深圳市掌网科技股份有限公司 Surface-based augmented information display method and system
US11395087B2 (en) 2017-09-29 2022-07-19 Nokia Technologies Oy Level-based audio-object interactions
EP3496417A3 (en) * 2017-12-06 2019-08-07 Oticon A/s Hearing system adapted for navigation and method therefor
US10820121B2 (en) 2017-12-06 2020-10-27 Oticon A/S Hearing device or system adapted for navigation
CN109922417B (zh) * 2017-12-06 2022-06-14 奥迪康有限公司 Hearing device or system adapted for navigation
CN109922417A (zh) * 2017-12-06 2019-06-21 奥迪康有限公司 Hearing device or system adapted for navigation
US20190174237A1 (en) * 2017-12-06 2019-06-06 Oticon A/S Hearing device or system adapted for navigation
US10516929B2 (en) 2018-03-06 2019-12-24 Bose Corporation Audio device
WO2019173451A1 (en) * 2018-03-06 2019-09-12 Bose Corporation Audio device with magnetic field sensor
CN111903142A (zh) * 2018-03-06 2020-11-06 伯斯有限公司 Audio device with magnetic field sensor
CN111903142B (zh) * 2018-03-06 2022-06-07 伯斯有限公司 Audio device with magnetic field sensor
US10932027B2 (en) 2019-03-03 2021-02-23 Bose Corporation Wearable audio device with docking or parking magnet having different magnetic flux on opposing sides of the magnet
US11067644B2 (en) 2019-03-14 2021-07-20 Bose Corporation Wearable audio device with nulling magnet
US11076214B2 (en) 2019-03-21 2021-07-27 Bose Corporation Wearable audio device
US11061081B2 (en) 2019-03-21 2021-07-13 Bose Corporation Wearable audio device
WO2020193703A1 (en) * 2019-03-28 2020-10-01 Interdigital Ce Patent Holdings Techniques for detection of real-time occlusion
EP3716217A1 (en) * 2019-03-28 2020-09-30 InterDigital CE Patent Holdings Techniques for detection of real-time occlusion
US10872463B2 (en) 2019-04-01 2020-12-22 Microsoft Technology Licensing, Llc Depth-compressed representation for 3D virtual scene
WO2020205121A1 (en) * 2019-04-01 2020-10-08 Microsoft Technology Licensing, Llc Depth-compressed representation for 3d virtual scene
CN113711166B (zh) * 2019-04-19 2024-04-09 元平台技术有限公司 Semantic-augmented artificial-reality experience
US11217011B2 (en) 2019-04-19 2022-01-04 Facebook Technologies, Llc. Providing semantic-augmented artificial-reality experience
CN113711166A (zh) * 2019-04-19 2021-11-26 脸谱科技有限责任公司 Semantic-augmented artificial-reality experience
WO2020214505A1 (en) * 2019-04-19 2020-10-22 Facebook Technologies, Llc Semantic-augmented artificial-reality experience
US11272282B2 (en) 2019-05-30 2022-03-08 Bose Corporation Wearable audio device
WO2021021346A1 (en) * 2019-08-01 2021-02-04 Krikey, Inc. Occlusion in mobile client rendered augmented reality environments
US11471773B2 (en) 2019-08-01 2022-10-18 Krikey, Inc. Occlusion in mobile client rendered augmented reality environments
WO2022090536A1 (en) * 2020-11-02 2022-05-05 Inter Ikea Systems B.V. Method and device for communicating a soundscape in an environment
WO2023083754A1 (en) * 2021-11-09 2023-05-19 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Apparatus, method or computer program for synthesizing a spatially extended sound source using variance or covariance data
WO2023083753A1 (en) * 2021-11-09 2023-05-19 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Apparatus, method or computer program for synthesizing a spatially extended sound source using modification data on a potentially modifying object
WO2023083752A1 (en) * 2021-11-09 2023-05-19 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Apparatus, method and computer program for synthesizing a spatially extended sound source using elementary spatial sectors

Also Published As

Publication number Publication date
CN103472909B (zh) 2017-04-12
CN103472909A (zh) 2013-12-25

Similar Documents

Publication Publication Date Title
US9122053B2 (en) Realistic occlusion for a head mounted augmented reality display
WO2013155217A1 (en) Realistic occlusion for a head mounted augmented reality display
US11533489B2 (en) Reprojecting holographic video to enhance streaming bandwidth/quality
JP7445642B2 (ja) Cross reality system
US9268406B2 (en) Virtual spectator experience with a personal audio/visual apparatus
US9286711B2 (en) Representing a location at a previous time period using an augmented reality display
US9230368B2 (en) Hologram anchoring and dynamic positioning
US20200098191A1 (en) Systems and methods for augmented reality
EP3011411B1 (en) Hybrid world/body locked hud on an hmd
US9041622B2 (en) Controlling a virtual object with a real controller device
US9292085B2 (en) Configuring an interaction zone within an augmented reality environment
US9183676B2 (en) Displaying a collision between real and virtual objects
US10955665B2 (en) Concurrent optimal viewing of virtual objects
US9639989B2 (en) Video processing device, video processing method, and video processing system
US20140146394A1 (en) Peripheral display for a near-eye display device
US20130326364A1 (en) Position relative hologram interactions
US20140368537A1 (en) Shared and private holographic objects

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application. Ref document number: 13775424; Country of ref document: EP; Kind code of ref document: A1
NENP Non-entry into the national phase. Ref country code: DE
122 EP: PCT application non-entry in European phase. Ref document number: 13775424; Country of ref document: EP; Kind code of ref document: A1