WO2017027338A1 - Apparatus and method for supporting interactive augmented reality functionalities - Google Patents

Apparatus and method for supporting interactive augmented reality functionalities

Info

Publication number
WO2017027338A1
Authority
WO
WIPO (PCT)
Prior art keywords
marker
camera
augmented reality
markers
image
Prior art date
Application number
PCT/US2016/045654
Other languages
French (fr)
Inventor
Seppo T. VALLI
Original Assignee
Pcms Holdings, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Pcms Holdings, Inc. filed Critical Pcms Holdings, Inc.
Publication of WO2017027338A1 publication Critical patent/WO2017027338A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/70Determining position or orientation of objects or cameras
    • G06T7/73Determining position or orientation of objects or cameras using feature-based methods
    • G06T7/74Determining position or orientation of objects or cameras using feature-based methods involving reference images or patches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30204Marker
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30244Camera pose

Definitions

  • AR Augmented Reality
  • virtual 3D models or animations are embedded into video views from the real environment.
  • the position of the augmented 3D information is defined in relation to a set of known features in the video views.
  • these distinctive features are brought to the scene in the form of graphical markers, which are relatively easy to detect and track from the video.
  • graphical tags, fiducials or markers have been commonly used.
  • Graphical markers have certain advantages over the use of natural features. For example, graphical markers help to make the offline process for mixed reality content production and use more independent of the actual target environment. This allows content to be positioned more reliably in the target environment based on the position of graphical markers, whereas changes in the environment (e.g., changes in lighting or in the position of miscellaneous objects) can otherwise make it more difficult for an augmented reality system to consistently identify position and orientation information based only on the environment.
  • Passive graphical markers are typically printed on paper or other stable substrate and cannot change their appearance after being printed. Being passive, they naturally also lack the ability to support additional functionalities based on, for example, image processing or electrical connections.
  • An example of a dynamic graphical marker is described in, for example, U.S. Patent Application Publication No. 2013/0109961 Al, entitled “Apparatus and method for providing dynamic fiducial markers for devices.”
  • a camera marker operates as an electronic marker device that combines a wide angle, electronic pan-tilt-zoom camera with a display.
  • the device's display is used in some embodiments for declaring and advertising the availability of Augmented Reality (AR) information, as well as for showing markers to augment such information.
  • a camera marker is operated to display an image of at least a first augmented reality marker.
  • the camera marker is further operated to capture an image of at least a second augmented reality marker.
  • the second augmented reality marker may be displayed by a second camera marker. Based at least in part on the image of the second augmented reality marker, the position of the second augmented reality marker with respect to the camera marker may be determined. This position may then be provided to a position server.
  • the device's camera is used for automatic calibration of a multi-marker setup, enabling accurate detection and tracking of the user and supporting new interaction and presence related effects and services.
  • user acceptance for one or more surrounding devices is gained by embedding the device in the form of a familiar consumer object, such as a photo frame.
  • FIG. 1 is a schematic plan view of an embodiment in which camera markers are employed in an augmented reality system in a room.
  • FIG. 2A provides a perspective view of a camera marker device with a marker shown on a display of the device.
  • FIG. 2B provides a perspective view of a camera marker together with augmented content as seen by AR glasses or a smart phone.
  • FIG. 2C further illustrates exemplary effects of transforming a fiducial marker displayed on a screen of the camera marker device or viewed by AR glasses.
  • FIG. 3 illustrates architecture of an exemplary augmented reality system employing one or more camera markers.
  • FIG. 4 is a functional block diagram of components of a camera marker device.
  • FIG. 5 is a schematic illustration of an embodiment in which a camera marker operates as a virtual mirror.
  • FIG. 6 illustrates an exemplary wireless transmit/receive unit (WTRU) that may be employed as camera marker or common position server in some embodiments.
  • FIG. 7 illustrates an exemplary network entity that may be employed as a camera marker or common position server in some embodiments.
  • FIG. 8 illustrates a collection of exemplary augmented reality (AR) markers of the type that can be displayed by exemplary camera markers.
  • FIG. 9 illustrates an exemplary method of positioning AR content using camera marker devices.
  • FIG. 10 is a block diagram illustrating an exemplary functional architecture of components of a camera marker in accordance with an exemplary embodiment of the present disclosure.
  • a camera marker is an electronic marker device that combines a wide angle, electronic pan-tilt-zoom camera with a display.
  • the device's display can be used for declaring and advertising the availability of Augmented Reality (AR) information, as well as for showing markers to augment such information.
  • the device's camera can be used for automatic calibration of a multi-marker setup.
  • the calibrated marker setup can be used for accurately detecting and tracking the user to support new interaction and presence-related effects and services.
  • the camera marker is provided in the form of a familiar consumer object, such as a digital photo frame, to encourage user acceptance of the device and to allow the device to blend with existing room decoration.
  • FIG. 1 illustrates a user 105 with AR glasses (with field of view shown by lines 107) in a room 100 with a camera marker 115 on each wall.
  • Camera markers 115 are placed in such a way that at least one other camera marker is seen on each camera view (lines 117 depict exemplary fields of view of each camera marker). This is to ensure that a self-calibration process can be made to derive a common coordinate system for the multi-marker setup.
  • a common coordinate system can then be used to derive an accurate estimate for the user's position in the room, and for capturing his/her visual parameters such as gestures and appearance.
  • There are few restrictions on the placement of camera markers in the environment.
  • The number and location of camera markers can be selected based on, for example, the size of the space to be tracked and the required tracking accuracy. The camera marker devices 115 can also be positioned horizontally, e.g., on tables. In many AR visualizations, it is beneficial to define the floor level. However, to preserve the safety of the user and of the device, it may not be advisable to place a camera marker on the floor.
  • camera markers can be used in conjunction with one or more traditional printed fiducial markers. For example, a printed fiducial marker may be placed on a surface and may be used to define the surface. For example, a printed fiducial marker may be placed on the floor and used to define the floor level.
  • a printed fiducial marker may be used to define other surfaces, such as desktops, tabletops, and walls.
  • a printed marker may be attached to the viewing device, e.g., on the backside of a tablet, in order to facilitate determination of the pose (the position and orientation) of the viewing device by the system.
  • FIG. 1 illustrates a user 105 in a room 100 with multiple camera markers 115.
  • Wide angle cameras are used for calibration and user tracking purposes.
  • the wide-angle camera has an angle of view of at least 60° (see field of view lines 117).
  • the wide-angle camera has an angle of view of at least 90°.
  • the wide-angle camera has an angle of view of at least 120°.
  • a virtual character 110 is augmented into AR glasses, which carry a camera for marker detection and tracking (see field of view lines 107).
  • a camera marker device is accessible and controllable both over an internet protocol (IP) network and locally, e.g., by using a smart phone as a remote control device.
  • Local control is particularly useful for user interaction, such as for changing, scaling and turning the augmented or displayed content.
  • FIG. 2A illustrates an exemplary camera marker device 200.
  • the device's appearance does not necessarily differ much from a smartphone or a tablet.
  • the appearance of an ordinary photo frame may be beneficial.
  • the size of the display depends on the targeted content (markers and/or eye-catchers) and the desired freedom for moving the marker on the display.
  • the marker 205 shown on the display can be replaced or modified over the network. The augmented content changes accordingly.
  • the camera 210 is used for calibration and user tracking purposes. In the example illustrated in FIG. 2A, the wide-angle camera and the display of the camera marker are both on the same face of the smart marker, with both facing the same direction.
  • FIG. 2B illustrates an embodiment of a face model 220 augmented in relation to the marker 205, as seen by AR glasses or a smart phone.
  • the viewpoint to the marker and the marker's orientation are kept unchanged.
  • a camera marker combines a wide-angle camera for tracking the space with a display for showing markers for Augmented Reality (AR) information.
  • the camera marker can be used for new user interaction and presence related services.
  • Self-calibration of the camera marker simplifies the setup process of a system that includes multiple camera markers. Self-calibration is useful for ensuring the system's accuracy and stability both in AR visualization and user tracking related functionalities. Self-calibration may be based on detecting the pose of other markers in each camera marker's own camera view.
  • camera marker devices are in communication with a common position server that combines all captured information, e.g., mapping marker positions into common coordinates, and that derives a common estimate for the position of the user.
  • one of the camera marker devices operates as the server.
  • a separate server device is provided (either locally or, for example as a cloud service or other networked service) in order to reduce computational capabilities required from each individual camera marker.
  • FIG. 3 illustrates an exemplary system architecture.
  • the system may comprise a user 301 with AR glasses.
  • Camera markers (315, 316) may be linked with local computers (320a, 320b, 320n), which may be integrated with the camera marker (e.g., unit 318).
  • the camera markers 315, 316 and their local computers 320 may communicate with a server and/or central computer 325, which combines all devices.
  • the user's AR glasses may communicate with the backend 325 by a wireless connection 330.
  • an AR augmentation 303 may be presented to the user's AR view, in relation to the camera markers.
  • a network connection is used to provide content to the electronic marker display, either a marker for AR, a picture (or video) for an eye-catcher, animation, advertisement or other chosen content.
  • the connection is wireless (e.g., WLAN), although a cord may be used to power the device.
  • a network connection (which may be the same network connection) can be used for another purpose to connect the camera marker to the user's viewing device.
  • the viewing device may be, for example, a camera phone, tablet, virtual glasses with camera, and the like. Through this connection, the system infrastructure tracks, at each time instant, the orientation of the marker (marker display) with respect to the user. This information can be used, e.g., for tracking the user by electronically zooming in on the marker's camera view and analyzing its content.
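Such electronic zooming can be implemented purely in software by cropping and rescaling the wide-angle frame around a tracked target; a minimal sketch (function and parameter names are illustrative, not taken from the application):

```python
import numpy as np

def electronic_ptz(frame: np.ndarray, center_xy, zoom: float) -> np.ndarray:
    """Emulate pan-tilt-zoom by cropping a wide-angle frame.

    frame     -- H x W x 3 image captured by the wide-angle camera
    center_xy -- (x, y) pixel around which to "point" the virtual camera
    zoom      -- magnification factor (>= 1.0)
    """
    h, w = frame.shape[:2]
    crop_w, crop_h = int(w / zoom), int(h / zoom)
    x, y = center_xy
    # Clamp the crop window so it stays inside the captured frame.
    x0 = int(np.clip(x - crop_w / 2, 0, w - crop_w))
    y0 = int(np.clip(y - crop_h / 2, 0, h - crop_h))
    return frame[y0:y0 + crop_h, x0:x0 + crop_w]
```

In practice the cropped view would also be rectified for wide-angle lens distortion before marker detection or user analysis.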
  • a further communication connection is used for local remote control of the camera marker by a smart phone or a TV-like remote controller.
  • Feasible connection technologies include WLAN, Bluetooth, and Infrared link, among others.
  • FIG. 4 is a functional block diagram of components in one example of a camera marker device.
  • the camera marker includes a processor 402; a camera 404, which may be a wide-angle camera; and a display 406, such as an LCD, which may be used to display an AR marker.
  • An optional keypad 408 may be provided.
  • the keypad 408 may be implemented as a feature of the display 406, where the display 406 is a touch-sensitive display.
  • the exemplary camera marker device further includes non-volatile memory 410 and volatile memory 412, a transmitter 414, and a receiver 416. Other network input/output connections may also be provided.
  • a camera marker is provided with audio capture and playback features. Audio may be used to increase the attractiveness and effectiveness of the videos used for announcing/advertising the available AR content. Audio may also be used as a component of the augmented AR content. A microphone can be used to capture user responses or commands.
  • a paper marker on the floor could specify the floor level without the risk of an electronic device being stepped on.
  • Paper markers may also be used as a way to balance the trade-off between calibration accuracy and system cost.
  • natural print-out pictures can be used as part of a hybrid marker setup. Even natural planar or 3D feature sets can be detected by multiple camera markers and used for augmenting 3D objects.
  • At least some local processing is performed in each marker device in order to reduce the amount of information to be transmitted to the common server.
  • Marker detection is one of such local operations.
  • camera marker setup is relatively stable, and tracking in camera markers is not needed to such an extent as in the user's viewing device (AR glasses or tablet), which is moving along with the user.
  • Another example is the control of the wide-angle camera in order to capture, for example, cropped views of other markers (for marker detection and identification), or the user's visual parameters.
  • a third example for local processing is to use camera view for deriving the actual lighting conditions in the environment in order to adapt the respective properties for the virtual content for improved photorealism.
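As a rough sketch of such lighting adaptation, the mean color of the camera view can serve as an ambient-light estimate that modulates the virtual content's base color (the names and the simple Lambertian-style model are illustrative assumptions, not taken from the application):

```python
import numpy as np

def estimate_ambient_light(frame: np.ndarray) -> np.ndarray:
    """Estimate an RGB ambient-light term from a camera frame (values in 0..1)."""
    return frame.reshape(-1, 3).mean(axis=0)

def relight_virtual_color(albedo_rgb: np.ndarray, frame: np.ndarray) -> np.ndarray:
    """Modulate a virtual object's base color by the estimated scene lighting."""
    ambient = estimate_ambient_light(frame)
    return np.clip(albedo_rgb * ambient, 0.0, 1.0)
```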
  • camera markers can be equipped with 3D cameras, such as RGB-D or ToF sensors, for capturing depth information.
  • the use of camera markers may encourage the acceptance of 3D cameras as a ubiquitous part of users' environment.
  • the 3D captured scene can be used to implement accurate user-perspective AR rendering (cf. the illustration of device-perspective and user-perspective magic lenses in Baricevic et al. 2012).
  • a more traditional way of capturing 3D information is to use two (e.g., stereo) or more cameras.
  • multiple markers can be used in AR to give more and better 3D data of the environment.
  • multiple markers are calibrated with respect to each other and the scene.
  • calibration is performed by capturing the multi-marker scene by a moving external camera and making geometrical calculations from its views.
  • An example of a multi-camera calibration method is given in [5].
  • Providing the markers with wide-angle cameras enables self-calibration in a multiple camera-marker system.
  • the views of the marker cameras themselves can be used for the mutual calibration of all devices, and the calibration can be updated when necessary, e.g., to adapt into any possible changes in the setup.
  • a user's position can also be derived by the cameras around the user. This outside-in type of tracking can be accomplished by camera markers, and brings some potential benefits.
  • marker cameras can be used to capture the user's visual appearance, gestures, or motion.
  • More reliable user tracking can be effected using multiple connected and calibrated marker cameras around the user, as described above. This makes user capture easier and more accurate (e.g., tracking in large spaces, handling of occlusions in the scene, etc.) than single-camera tracking, whether based on one wearable camera or one (typically fixed) external camera.
  • the users can be provided with individualized viewpoints of the same content. It is also possible to serve the different users with different content, especially in embodiments in which the service is permitted to track users' identities.
  • the user identities may be, for example, anonymously numbered indices. Such anonymous indices are sufficiently detailed to provide some level of service enhancement. However, having information regarding real identities enables the system to provide more personalized services.
  • Camera markers are an access point for information and interaction for the user, and they are used in some embodiments for analyzing users' responses to the content, monitoring user activities in the space, and collecting related contextual information.
  • User behavior data can be used in many ways to better understand and serve the user.
  • the observations can also be used actively to provide the user with interactive content, as described in the following.
  • the device's display or augmentation may be used to reflect the environment to the user from almost any desired direction as seen from the device (provided that the camera captures a wide enough panorama of the environment).
  • the device acts as a virtual mirror, mimicking a physical mirror view even for a moving user (resulting in mirror-like motion parallax seen by the user).
  • the virtual mirror can be set at a fixed angle, reflecting to/from any chosen direction, or even dynamically turn to any desired direction while the user is moving (e.g., depending on his/her trajectory). This enables new types of effects and services in many spaces.
  • the virtual mirror concept is illustrated in FIG. 5.
  • a user 501 with AR gear may view (view 505) a "virtual mirror" 510 (e.g., a camera marker) on a wall 508.
  • the virtual mirror 510 may give a reflection 515, which may shift direction relative to the user's motions.
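One way to realize such a virtual mirror is to reflect the tracked user direction about the plane of the marker display and render the wide-angle view along the reflected direction; a geometric sketch (positions and the display-plane normal are assumed to come from the calibrated marker setup):

```python
import numpy as np

def mirror_view_direction(user_pos, marker_pos, marker_normal):
    """Direction, from the marker, toward the scene content that a physical
    mirror mounted at marker_pos with normal marker_normal would show to a
    user at user_pos (all quantities in the shared coordinate system)."""
    n = np.asarray(marker_normal, dtype=float)
    n = n / np.linalg.norm(n)
    d = np.asarray(user_pos, dtype=float) - np.asarray(marker_pos, dtype=float)
    d = d / np.linalg.norm(d)
    # Reflect the user-to-marker viewing ray about the display plane.
    return 2.0 * np.dot(d, n) * n - d
```

The camera marker (or the AR viewer) would then crop the wide-angle capture along this direction and flip it horizontally to emulate the reflected image.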
  • FIG. 2C illustrates effects of scaling and turning the marker 205 over a network connection (either by the user or the service provider).
  • Local user control enables, for example, the user to place an advertised 3D product (e.g., a couch) at the appropriate scale into a preferred position in his/her room.
  • the provider or broadcaster of the advertisement does not need to know about the local circumstances, as the user is the one to make the composition for AR visualisation.
  • the size of the camera marker's display naturally limits the freedom for (locally) controlling the point at which an object is augmented. This is especially true for spatial translations.
  • an augmented object can be moved longer distances by allowing the user to change the perspective (angle) of a marker image as well as the distance between the marker and the augmented object.
  • Each camera marker used in an application has a location and orientation (pose) with respect to a coordinate system used to provide augmented content.
  • two or more camera markers cooperate to determine their respective locations and orientations.
  • a set of six values is used to specify the location and orientation of a camera marker with respect to the coordinate system, such as three spatial coordinates (x, y, z) defining the location and three Euler angles (θ, φ, ψ) defining the orientation.
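For illustration, the six values can be packed into a single 4×4 homogeneous transform; the sketch below assumes a Z-Y-X Euler convention, an arbitrary choice since the application does not fix one:

```python
import numpy as np

def pose_to_matrix(x, y, z, theta, phi, psi):
    """Build a 4x4 homogeneous transform from position (x, y, z) and
    Euler angles (theta, phi, psi) applied in Z-Y-X order (radians)."""
    cz, sz = np.cos(psi), np.sin(psi)
    cy, sy = np.cos(phi), np.sin(phi)
    cx, sx = np.cos(theta), np.sin(theta)
    Rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
    Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    Rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
    T = np.eye(4)
    T[:3, :3] = Rz @ Ry @ Rx  # combined rotation
    T[:3, 3] = [x, y, z]      # translation
    return T
```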
  • each of the camera markers displays an AR marker or other fiducial marker, which may be a unique (or at least locally unique) marker.
  • Each of the camera markers obtains an image from its camera and processes the image to recognize the two displayed AR markers and to determine the coordinates of those markers within the image. Having processed the image to recognize the AR markers, the camera marker further determines the angle between those markers.
  • the determination of the position and orientation of the AR markers within the image may be performed using known AR marker detection techniques using, for example, statistical based, gradient based, pixel connectivity-edge linking based and Hough transform based methods.
  • the image taken by each camera is further processed to determine the distance of the other camera markers.
  • the distance of the other camera markers is determined based at least in part on the apparent size and perspective of the camera markers within the image. This may be done by, for example, comparing the apparent size and shape of the camera markers in the image with a known actual size and shape of the camera markers. This may also be done by comparing the apparent size and shape of the AR markers displayed on the camera markers with a known actual size and shape of those AR markers.
  • each camera marker may convey information identifying its own actual physical dimensions, allowing camera markers to estimate the distance of the AR marker based on a comparison between the actual and apparent physical dimensions of that AR marker.
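Under a pinhole-camera assumption, this size-based distance estimate reduces to a single ratio; the sketch below is illustrative, with the marker's physical width conveyed by the observed camera marker and the focal length coming from calibration of the capturing camera:

```python
def distance_from_apparent_size(real_width_m: float,
                                apparent_width_px: float,
                                focal_length_px: float) -> float:
    """Pinhole-model range estimate: an object of known width real_width_m
    that spans apparent_width_px pixels lies at roughly this distance (m)."""
    return focal_length_px * real_width_m / apparent_width_px
```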
  • the distance of the other camera markers may be determined based at least in part on the corresponding depth measurement.
  • depth can be measured using other techniques.
  • the camera markers may exchange audio signals, with the travel time of the audio signals being indicative of the distance between markers.
  • each camera marker is equipped with one or more accelerometers, and the camera marker is operative to process readings from the accelerometer or accelerometers to determine one or more angles of that camera marker with respect to the vertical (e.g., "pitch," "roll," and "yaw" angles).
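As a sketch of the accelerometer-based part of this estimate, a static gravity reading yields the device's pitch and roll (the angles recoverable from gravity alone); function and variable names are illustrative:

```python
import math

def tilt_from_accelerometer(ax: float, ay: float, az: float):
    """Pitch and roll (radians) of the camera marker relative to vertical,
    computed from a static accelerometer reading (ax, ay, az)."""
    pitch = math.atan2(-ax, math.sqrt(ay * ay + az * az))
    roll = math.atan2(ay, az)
    return pitch, roll
```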
  • the angle and distance measurements performed as described above provide sufficient information to determine the location and orientation of all of the camera markers. These calculations can be performed using well-known principles of trigonometry.
  • the locations and orientations of visual features other than camera markers may be used in the setup process.
  • one or more printed markers may be employed, such as a printed marker placed on the floor.
  • An exemplary method is illustrated in FIG. 9.
  • a camera marker operates to obtain a wide-angle image of a location in which it is situated (905), with the wide-angle image including one or more other camera markers in the field of view of the image.
  • the camera marker operates to detect other camera markers (or printed markers) in the image (910).
  • the camera marker then operates based on the shape and size of the camera markers in the image to determine marker coordinates in the local coordinate system (915).
  • the camera marker determines from the camera image the coordinates in the local coordinate system corresponding to the six degrees of freedom of the marker, e.g., x, y, z for marker position and θ, φ, ψ (or, equivalently, pitch, roll, and yaw) for marker orientation, the coordinates representing the pose of the marker. This determination may be made based on inputs such as a known actual size of the visible camera marker (or of the AR marker displayed by that camera marker), information concerning distortion introduced by wide-angle imaging optics, pixel resolution, focal length, and the like.
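One common way to obtain such a six-degree-of-freedom pose from a detected marker is a perspective-n-point solution; the sketch below uses OpenCV's solvePnP and assumes the four marker corners have already been detected in pixel coordinates and that the capturing camera's intrinsics are known (the marker side length is an illustrative value):

```python
import numpy as np
import cv2

def marker_pose_from_corners(corners_px: np.ndarray,
                             camera_matrix: np.ndarray,
                             dist_coeffs: np.ndarray,
                             marker_side_m: float = 0.15):
    """Estimate the pose of a square marker of known side length.

    corners_px -- 4x2 array of detected corner pixels, ordered top-left,
                  top-right, bottom-right, bottom-left
    Returns (rvec, tvec): marker orientation (Rodrigues vector) and position
    in the capturing camera's coordinate system.
    """
    s = marker_side_m / 2.0
    object_points = np.array([[-s,  s, 0],
                              [ s,  s, 0],
                              [ s, -s, 0],
                              [-s, -s, 0]], dtype=np.float32)
    ok, rvec, tvec = cv2.solvePnP(object_points,
                                  corners_px.astype(np.float32),
                                  camera_matrix, dist_coeffs)
    return rvec, tvec
```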
  • the camera marker provides the coordinates of the other camera markers (and possibly printed fiducial markers) in the local coordinate system to a common position server (920), along with identifiers of the markers (which may be identifiers encoded in the markers themselves using, for example, a QR code or similar technology).
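The report sent to the common position server could, for example, be a simple structured message; the field names below are illustrative assumptions, as the application does not specify a wire format:

```python
import json

# Example report from camera marker "CM-2": marker "M1" was observed at the
# given position (meters) and orientation (radians) in CM-2's local frame.
report = {
    "observer_id": "CM-2",
    "observations": [
        {"marker_id": "M1",
         "position": {"x": 1.82, "y": 0.35, "z": 2.10},
         "orientation": {"theta": 0.02, "phi": -1.55, "psi": 0.00}}
    ],
}
payload = json.dumps(report)  # sent to the position server, e.g., over WLAN
```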
  • a common position server is implemented in the camera marker itself, as in the embodiment of FIG. 9. In other embodiments, the common position server is implemented in a different camera marker or in a separate network node.
  • the common position server receives information regarding marker positions expressed in one or more different coordinate systems.
  • the common position server may receive from one camera marker an indication that a marker labeled M1 is at coordinates (x, y, z, θ, φ, ψ) in one local coordinate system and that the same marker M1 is at coordinates (x', y', z', θ', φ', ψ') in a different local coordinate system measured with respect to a different camera marker. Similar sets of coordinates may also be received representing the positions of different markers as measured in different coordinate systems.
  • the position server defines a common coordinate system and transforms the coordinates of the markers into the common coordinate system (925).
  • Various techniques may be used to transform the coordinates of the markers into the common coordinate system.
  • one of the local coordinate systems is defined as the common coordinate system.
  • a transformation is then found between the common coordinate system and the local coordinate systems.
  • the transformation may be found by testing a plurality of different transformations that result in the alignment of the locations of different camera markers.
  • a best alignment may be selected as, for example, an alignment that minimizes the sum of least squares of distances between representations of the different camera markers in different coordinate systems.
  • the exemplary system operates to determine a transformation that transforms the coordinate system of M2 to the local coordinate system of M1.
  • This transformation may take the form, for example, of a vector offset combined with a rotation.
  • an arbitrary location in the coordinate system of M2 may be expressed by a vector X2; the same location is expressed in the coordinate system of M1 as X1 = T1,2 + R1,2·X2, where T1,2 is a translation vector and R1,2 is a rotation matrix.
  • the transformation parameters may be determined with the use of a search through a search space to minimize the sum of square distances between different representations of the same camera marker. For example, suppose additional markers MA and MB have positions represented, respectively, by vectors A1 and B1 in the coordinate system of M1 as measured by M1. Those same markers have positions represented, respectively, by vectors A2 and B2 in the coordinate system of M2 as measured by M2. The positions as measured by M2 can thus be expressed in the coordinate system of M1 as A2' = T1,2 + R1,2·A2 and B2' = T1,2 + R1,2·B2.
  • S is given by the equation S = |A1 − A2'|² + |B1 − B2'|².
  • the transformation parameters may include one or more scaling factors.
  • a is a scalar.
  • a is a vector or tensor value.
  • R is not a unitary rotation matrix but rather an arbitrary matrix, the components of which are adjusted using, e.g., a search technique to minimize the sum S.
  • the position XM3 of a further marker M3, as measured in the coordinate system of M2, can then be mapped into the common coordinate system as XM3' = T1,2 + R1,2·XM3.
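A closed-form alternative to the search described above is the SVD-based Kabsch/Procrustes solution, which minimizes the same sum of squared distances S over corresponding marker positions; the sketch below estimates R1,2 and T1,2 and then maps a further point as in the expression above (this particular algorithm is illustrative, not prescribed by the application):

```python
import numpy as np

def estimate_rigid_transform(points_m2: np.ndarray, points_m1: np.ndarray):
    """Least-squares rigid transform (R, T) such that R @ p2 + T ~= p1 for
    corresponding marker positions p2 (in M2's frame) and p1 (in M1's frame).

    points_m2, points_m1 -- N x 3 arrays of corresponding positions;
    N >= 3 non-collinear points are needed for a unique rotation.
    """
    c2 = points_m2.mean(axis=0)
    c1 = points_m1.mean(axis=0)
    H = (points_m2 - c2).T @ (points_m1 - c1)   # cross-covariance
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:                    # guard against a reflection
        Vt[-1, :] *= -1
        R = Vt.T @ U.T
    T = c1 - R @ c2
    return R, T

# A further marker M3 measured in M2's frame maps to the common frame as:
#   x_m3_common = R @ x_m3_in_m2 + T
```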
  • a least squares technique may be employed to determine a best fit position in the common coordinate system.
  • the least squares technique may be weighted to accommodate the reliability of different measurements. For example, a position measurement from a nearby camera marker may be weighted more heavily than a position measurement from a more distant camera marker.
  • a reliability measure may also be associated with different coordinate transforms, with transforms based on a greater number of marker positions being considered relatively more reliable, and transforms that result in a very low sum S being considered relatively more reliable. In such an embodiment, a position measurement that results from more reliable transforms is itself considered more reliable and thus is weighted more heavily in determining a best fit position.
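Such a weighted combination can be as simple as a reliability-weighted average of the per-observer position estimates once they are expressed in the common coordinate system; a sketch with an illustrative inverse-distance weighting:

```python
import numpy as np

def fuse_position_estimates(estimates: np.ndarray,
                            observer_distances: np.ndarray) -> np.ndarray:
    """Combine several estimates (N x 3, common coordinates) of the same
    point, weighting nearby observers more heavily than distant ones."""
    weights = 1.0 / np.maximum(observer_distances, 1e-6)
    weights = weights / weights.sum()
    return (weights[:, None] * estimates).sum(axis=0)
```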
  • the AR rendering system includes an AR device (such as a headset, tablet, or other device) with a camera.
  • the AR device takes an image of the environment (935) and operates to locate AR markers (such as camera markers) in the image (940).
  • the AR system determines the position and orientation of the AR device within the common coordinate system (945).
  • the AR system further operates to render AR content based on the determined position and orientation of the AR device (950).
  • one or more printed markers are displayed on the AR device to facilitate tracking of the AR device by camera markers.
  • the functions of the described camera marker are performed using a general purpose consumer tablet computer.
  • a tablet computer readily provides versions of the needed components, such as a display, a camera (though typically not with wide-angle optics), and wired and wireless network connections.
  • a camera marker is implemented using dedicated software running on the tablet device.
  • the camera marker is implemented using a special-purpose version of a tablet computer.
  • the special-purpose version of the tablet computer may, for example, have reduced memory, lower screen resolution (possibly greyscale only), wide-angle optics, and may be pre-loaded with appropriate software to enable camera marker functionality.
  • inessential functionality such as GPS, magnetometer, and audio functions may be omitted from the special-purpose tablet computer.
  • Exemplary embodiments disclosed herein are implemented using one or more wired and/or wireless network nodes, such as a wireless transmit/receive unit (WTRU) or other network entity.
  • FIG. 6 is a system diagram of an exemplary WTRU 102, which may be employed as a user device in embodiments described herein.
  • the WTRU 102 may include a processor 118, a communication interface 119 including a transceiver 120, a transmit/receive element 122, a speaker/microphone 124, a keypad 126, a display/touchpad 128, a non-removable memory 130, a removable memory 132, a power source 134, a global positioning system (GPS) chipset 136, and sensors 138.
  • the processor 118 may be a general purpose processor, a special purpose processor, a conventional processor, a digital signal processor (DSP), a plurality of microprocessors, one or more microprocessors in association with a DSP core, a controller, a microcontroller, Application Specific Integrated Circuits (ASICs), Field Programmable Gate Array (FPGAs) circuits, any other type of integrated circuit (IC), a state machine, and the like.
  • the processor 118 may perform signal coding, data processing, power control, input/output processing, and/or any other functionality that enables the WTRU 102 to operate in a wireless environment.
  • the processor 118 may be coupled to the transceiver 120, which may be coupled to the transmit/receive element 122. While FIG. 6 depicts the processor 118 and the transceiver 120 as separate components, it will be appreciated that the processor 118 and the transceiver 120 may be integrated together in an electronic package or chip.
  • the transmit/receive element 122 may be configured to transmit signals to, or receive signals from, a base station over the air interface 115/116/117.
  • the transmit/receive element 122 may be an antenna configured to transmit and/or receive RF signals.
  • the transmit/receive element 122 may be an emitter/detector configured to transmit and/or receive IR, UV, or visible light signals, as examples.
  • the transmit/receive element 122 may be configured to transmit and receive both RF and light signals. It will be appreciated that the transmit/receive element 122 may be configured to transmit and/or receive any combination of wireless signals.
  • the WTRU 102 may include any number of transmit/receive elements 122. More specifically, the WTRU 102 may employ MIMO technology. Thus, in one embodiment, the WTRU 102 may include two or more transmit/receive elements 122 (e.g., multiple antennas) for transmitting and receiving wireless signals over the air interface 115/116/117.
  • the transceiver 120 may be configured to modulate the signals that are to be transmitted by the transmit/receive element 122 and to demodulate the signals that are received by the transmit/receive element 122.
  • the WTRU 102 may have multi-mode capabilities.
  • the transceiver 120 may include multiple transceivers for enabling the WTRU 102 to communicate via multiple RATs, such as UTRA and IEEE 802.11, as examples.
  • the processor 118 of the WTRU 102 may be coupled to, and may receive user input data from, the speaker/microphone 124, the keypad 126, and/or the display/touchpad 128 (e.g., a liquid crystal display (LCD) display unit or organic light-emitting diode (OLED) display unit).
  • the processor 118 may also output user data to the speaker/microphone 124, the keypad 126, and/or the display/touchpad 128.
  • the processor 118 may access information from, and store data in, any type of suitable memory, such as the non-removable memory 130 and/or the removable memory 132.
  • the non-removable memory 130 may include random-access memory (RAM), read-only memory (ROM), a hard disk, or any other type of memory storage device.
  • the removable memory 132 may include a subscriber identity module (SIM) card, a memory stick, a secure digital (SD) memory card, and the like.
  • the processor 118 may access information from, and store data in, memory that is not physically located on the WTRU 102, such as on a server or a home computer (not shown).
  • the processor 118 may receive power from the power source 134, and may be configured to distribute and/or control the power to the other components in the WTRU 102.
  • the power source 134 may be any suitable device for powering the WTRU 102.
  • the power source 134 may include one or more dry cell batteries (e.g., nickel-cadmium (NiCd), nickel-zinc (NiZn), nickel metal hydride (NiMH), lithium-ion (Li-ion), and the like), solar cells, fuel cells, and the like.
  • the processor 118 may also be coupled to the GPS chipset 136, which may be configured to provide location information (e.g., longitude and latitude) regarding the current location of the WTRU 102.
  • the WTRU 102 may receive location information over the air interface 115/116/117 from a base station and/or determine its location based on the timing of the signals being received from two or more nearby base stations. It will be appreciated that the WTRU 102 may acquire location information by way of any suitable location-determination method while remaining consistent with an embodiment.
  • the processor 118 may further be coupled to other peripherals 138, which may include one or more software and/or hardware modules that provide additional features, functionality and/or wired or wireless connectivity.
  • the peripherals 138 may include sensors such as an accelerometer, an e-compass, a satellite transceiver, a digital camera (for photographs or video), a universal serial bus (USB) port, a vibration device, a television transceiver, a hands free headset, a Bluetooth® module, a frequency modulated (FM) radio unit, a digital music player, a media player, a video game player module, an Internet browser, and the like.
  • FIG. 7 depicts an exemplary network entity 190 that may be used in embodiments of the present disclosure, for example as a common server used for the setup of one or more camera markers.
  • network entity 190 includes a communication interface 192, a processor 194, and non-transitory data storage 196, all of which are communicatively linked by a bus, network, or other communication path 198.
  • Communication interface 192 may include one or more wired communication interfaces and/or one or more wireless-communication interfaces. With respect to wired communication, communication interface 192 may include one or more interfaces such as Ethernet interfaces, as an example. With respect to wireless communication, communication interface 192 may include components such as one or more antennae, one or more transceivers/chipsets designed and configured for one or more types of wireless (e.g., LTE) communication, and/or any other components deemed suitable by those of skill in the relevant art. And further with respect to wireless communication, communication interface 192 may be equipped at a scale and with a configuration appropriate for acting on the network side (as opposed to the client side) of wireless communications (e.g., LTE communications, Wi-Fi communications, and the like). Thus, communication interface 192 may include the appropriate equipment and circuitry (perhaps including multiple transceivers) for serving multiple mobile stations, UEs, or other access terminals in a coverage area.
  • Processor 194 may include one or more processors of any type deemed suitable by those of skill in the relevant art, some examples including a general-purpose microprocessor and a dedicated DSP.
  • Data storage 196 may take the form of any non-transitory computer-readable medium or combination of such media, some examples including flash memory, read-only memory (ROM), and random-access memory (RAM) to name but a few, as any one or more types of non-transitory data storage deemed suitable by those of skill in the relevant art could be used. As depicted in FIG. 7, data storage 196 contains program instructions 197 executable by processor 194 for carrying out various combinations of the various network-entity functions described herein.
  • FIG. 8 illustrates, without limitation, examples of patterns that can be displayed on the display of a camera marker for use as an AR marker.
  • FIG. 10 illustrates a functional architecture of a camera marker 1001 in accordance with an embodiment.
  • the camera marker 1001 may operate various modules.
  • a camera module 1005 may operate within the camera marker 1001.
  • a marker display module 1010 may operate within the camera marker 1001, to display the AR marker.
  • a coordinate conversion module 1015 may operate within the camera marker 1001, to determine the coordinates, relative to the camera marker, of other markers detected by image capture.
  • a position server module 1020 may operate within the camera marker 1001.
  • the position server module 1020 may include a shared coordinate conversion module 1022, which may convert the local coordinates of detected markers into a shared coordinate system.
  • the camera module 1005, marker display module 1010, coordinate conversion module 1015, position server module 1020, and marker transform/scale module 1040, as well as other modules, may communicate with a memory 1030.
  • the memory may include rules for marker transform/scale 1032, captured images 1034, local coordinates 1036, other camera locations 1038, and/or the like.
  • the camera marker may have communications incoming from or outgoing to an AR unit 1003.
  • there is a method comprising: providing a plurality of camera markers; and operating the plurality of camera markers to perform self-calibration.
  • the self-calibration includes determination of a shared coordinate system.
  • the method further comprises rendering augmented content to a user using the shared coordinate system.
  • the self-calibration includes determination of a location of the camera marker in the shared coordinate system.
  • the self-calibration includes determination of an orientation of the camera marker in the shared coordinate system.
  • a method comprising: operating a camera marker to display an image of at least a first augmented reality marker; operating the camera marker to capture an image of at least a second augmented reality marker; based on the image, determining a pose of the second augmented reality marker with respect to the camera marker; and providing the pose to a position server.
  • the method further comprises operating a second camera marker to capture the image of the first augmented reality marker.
  • the method further comprises operating the second camera marker to display the second augmented reality marker.
  • the second augmented reality marker is a second camera marker.
  • the method further comprises detecting an image of a user by the camera marker and determining a pose of the user based on the image of the user.
  • the position server is implemented in the camera marker. In some embodiments, the position server is implemented in a separate camera marker. In some embodiments, the position server operates to define a shared coordinate system. In some embodiments, the method further comprises rendering augmented reality content using the shared coordinate system. In some embodiments, the rendering of the augmented reality content includes providing sound from a speaker of the camera marker. In some embodiments, the method further comprises controlling the camera marker to modify the first augmented reality marker, the modification being selected from the group consisting of changing, scaling and turning the augmented reality marker.
  • the method further comprises changing the rendering of augmented content in response to modification of the augmented reality marker.
  • the controlling is provided by remote control.
  • the remote control is provided over an internet protocol (IP) network.
  • the remote control is provided using a protocol selected from the group consisting of WLAN, Bluetooth, and an Infrared link.
  • the method further comprises: determining a pose of an augmented reality viewing device using at least the first augmented reality marker; and rendering augmented reality content on the augmented reality viewing device using the determined pose.
  • the method further comprises determining a pose of an augmented reality viewing device using at least the first augmented reality marker.
  • the viewing device is selected from the group consisting of a camera phone, a tablet computer, and a virtual reality headset.
  • the method further comprises determining a position of an augmented reality viewing device using at least the first augmented reality marker and the second augmented reality marker.
  • the second augmented reality marker is displayed on a second camera marker.
  • the position server operates to define a shared coordinate system and to determine a position of the camera marker in the shared coordinate system.
  • the augmented reality marker is a printed fiducial marker used to identify a surface level.
  • the augmented reality marker is a printed fiducial marker.
  • the camera marker displays information advertising the available augmented content.
  • capturing an image includes capturing a depth image.
  • a method of providing a virtual mirror comprising: obtaining an image from a camera of a camera marker; processing the image to emulate a reflected image; and rendering the processed image on an augmented reality display at a position determined at least in part by an augmented reality marker displayed by the camera marker.
  • the processed image is rendered substantially at the position of the augmented reality marker.
  • a camera marker comprising: a wide-angle camera on a front face of the camera marker; and a display on the front face of the camera marker.
  • the camera marker further comprises logic in communication with the wide-angle camera to determine a relative location of at least one other camera marker.
  • the camera marker is operative to display an augmented reality (AR) marker on the display.
  • the wide-angle camera is an electronic pan-tilt-zoom camera.
  • the wide-angle camera is a depth camera.
  • the camera marker is implemented in a digital photo frame.
  • a camera marker system comprising: a first camera marker including a first display and a first front-facing camera; a second camera marker including a second display and a second front-facing camera; wherein the first camera marker is positioned in a field of view of the second front-facing camera, and wherein the second camera marker is positioned in a field of view of the first front-facing camera.
  • the camera marker system further comprises a common position server.
  • the camera marker system further comprises an augmented reality system.
  • a method of defining a coordinate system comprising: operating a first camera marker to determine a pose of a second camera marker in a first local coordinate system; operating the second camera marker to determine a pose of the first camera marker in a second local coordinate system; and determining a transformation between the first local coordinate system and the second local coordinate system.
  • the method further comprises defining a global coordinate system.
  • the method further comprises determining a transformation between the first local coordinate system and the global coordinate system.
  • the method further comprises determining a transformation between the second local coordinate system and the global coordinate system.
  • the method further comprises determining a pose of the first camera marker in the global coordinate system.
  • the method further comprises determining a pose of the second camera marker in the global coordinate system. In one embodiment, the method further comprises rendering augmented reality content using the global coordinate system. In one embodiment, the method further comprises rendering the augmented reality content using an augmented reality viewer. In one embodiment, the augmented reality viewer is a head-mounted display. In one embodiment, the augmented reality viewer is a tablet computer. In one embodiment, the augmented reality viewer is a smartphone. In one embodiment, the augmented reality viewer is a wearable device.
  • modules include hardware (e.g., one or more processors, one or more microprocessors, one or more microcontrollers, one or more microchips, one or more application-specific integrated circuits (ASICs), one or more field programmable gate arrays (FPGAs), one or more memory devices) deemed suitable by those of skill in the relevant art for a given implementation.
  • Each described module may also include instructions executable for carrying out the one or more functions described as being carried out by the respective module, and it is noted that those instructions could take the form of or include hardware (i.e., hardwired) instructions, firmware instructions, software instructions, and/or the like, and may be stored in any suitable non-transitory computer-readable medium or media, such as those commonly referred to as RAM, ROM, etc.
  • Examples of computer-readable storage media include a read-only memory (ROM), a random-access memory (RAM), a register, cache memory, semiconductor memory devices, magnetic media such as internal hard disks and removable disks, magneto-optical media, and optical media such as CD-ROM disks and digital versatile disks (DVDs).
  • a processor in association with software may be used to implement a radio frequency transceiver for use in a WTRU, UE, terminal, base station, RNC, or any host computer.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

A camera marker system is disclosed. In an exemplary embodiment, a camera marker includes a display operative to display an augmented reality marker and a wide-angle camera on the same side of the camera marker as the display. A plurality of camera markers perform a self-calibration routine in which each camera marker determines relative locations of other camera markers in its field of view, and the camera markers cooperate to define a shared coordinate system. The location and orientation of an augmented reality viewer, such as a head-mounted display, can then be determined within the shared coordinate system using camera markers in a field of view of the augmented reality viewer.

Description

APPARATUS AND METHOD FOR SUPPORTING
INTERACTIVE AUGMENTED REALITY FUNCTIONALITIES
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This application claims priority to and the benefit under 35 U.S.C. §119(e) of U.S. Provisional Patent Application Serial No. 62/202,431, filed August 7, 2015 and entitled "APPARATUS AND METHOD FOR SUPPORTING INTERACTIVE AUGMENTED REALITY FUNCTIONALITIES," the full contents of which are hereby incorporated herein by reference.
BACKGROUND
[0002] In Mixed or Augmented Reality (AR) systems, virtual 3D models or animations are embedded into video views from the real environment. The position of the augmented 3D information is defined in relation to a set of known features in the video views. Traditionally these distinctive features are brought to the scene in the form of graphical markers, which are relatively easy to detect and track from the video.
[0003] Recently, using natural features for augmented reality tracking has also become feasible and popular. It is less intrusive than using graphical markers but also more challenging, as the AR content production process becomes more dependent on the target environment, e.g., whether or not the environment contains distinctive features to use, or if the lighting conditions stay stable enough between the offline content production and the real-time use.
[0004] In Augmented Reality, graphical tags, fiducials or markers have been commonly used. Graphical markers have certain advantages over the use of natural features. For example, graphical markers help to make the offline process for mixed reality content production and use more independent of the actual target environment. This allows content to be positioned more reliably in the target environment based on the position of graphical markers, whereas changes in the environment (e.g., changes in lighting or in the position of miscellaneous objects) can otherwise make it more difficult for an augmented reality system to consistently identify position and orientation information based only on the environment.
[0005] Passive graphical markers are typically printed on paper or other stable substrate and cannot change their appearance after being printed. Being passive, they naturally also lack the ability to support additional functionalities based on, for example, image processing or electrical connections. An example of a dynamic graphical marker is described in, for example, U.S. Patent Application Publication No. 2013/0109961 Al, entitled "Apparatus and method for providing dynamic fiducial markers for devices."
SUMMARY
[0006] The present disclosure addresses the benefits and opportunities offered by an electronic, connected, and camera equipped AR marker. In particular, the present disclosure describes a camera marker. In some embodiments, a camera marker operates as an electronic marker device that combines a wide angle, electronic pan-tilt-zoom camera with a display. The device's display is used in some embodiments for declaring and advertising the availability of Augmented Reality (AR) information, as well as for showing markers to augment such information.
[0007] In an exemplary embodiment, a camera marker is operated to display an image of at least a first augmented reality marker. The camera marker is further operated to capture an image of at least a second augmented reality marker. The second augmented reality marker may be displayed by a second camera marker. Based at least in part on the image of the second augmented reality marker, the position of the second augmented reality marker with respect to the camera marker may be determined. This position may then be provided to a position server.
[0008] In an exemplary embodiment, the device's camera is used for automatic calibration of a multi-marker setup, enabling accurate detection and tracking of the user and supporting new interaction and presence related effects and services.
[0009] In some embodiments, user acceptance for one or more surrounding devices is gained by embedding the device in the form of a familiar consumer object, such as a photo frame.
BRIEF DESCRIPTION OF THE DRAWINGS
[0010] FIG. 1 is a schematic plan view of an embodiment in which camera markers are employed in an augmented reality system in a room.
[0011] FIG. 2A provides a perspective view of a camera marker device with a marker shown on a display of the device.
[0012] FIG. 2B provides a perspective view of a camera marker together with augmented content as seen by AR glasses or a smart phone.
[0013] FIG. 2C further illustrates exemplary effects of transforming a fiducial marker displayed on a screen of the camera marker device or viewed by AR glasses.
[0014] FIG. 3 illustrates architecture of an exemplary augmented reality system employing one or more camera markers.
[0015] FIG. 4 is a functional block diagram of components of a camera marker device.
[0016] FIG. 5 is a schematic illustration of an embodiment in which a camera marker operates as a virtual mirror.
[0017] FIG. 6 illustrates an exemplary wireless transmit/receive unit (WTRU) that may be employed as camera marker or common position server in some embodiments.
[0018] FIG. 7 illustrates an exemplary network entity that may be employed as a camera marker or common position server in some embodiments.
[0019] FIG. 8 illustrates a collection of exemplary augmented reality (AR) markers of the type that can be displayed by exemplary camera markers.
[0020] FIG. 9 illustrates an exemplary method of positioning AR content using camera marker devices.
[0021] FIG. 10 is a block diagram illustrating an exemplary functional architecture of components of a camera marker in accordance with an exemplary embodiment of the present disclosure.
DETAILED DESCRIPTION
[0022] This present disclosure describes a camera marker. In exemplary embodiments, a camera marker is an electronic marker device that combines a wide angle, electronic pan-tilt-zoom camera with a display. The device's display can be used for declaring and advertising the availability of Augmented Reality (AR) information, as well as for showing markers to augment such information.
[0023] The device's camera can be used for automatic calibration of a multi-marker setup. The calibrated marker setup can be used for accurately detecting and tracking the user to support new interaction and presence-related effects and services. In some embodiments, the camera marker is provided in the form of a familiar consumer object, such as a digital photo frame, to encourage user acceptance of the device and to allow the device to blend with existing room decoration.
[0024] FIG. 1 illustrates a user 105 with AR glasses (with field of view shown by lines 107) in a room 100 with a camera marker 115 on each wall. Camera markers 115 are placed in such a way that at least one other camera marker is seen in each camera view (lines 117 depict exemplary fields of view of each camera marker). This ensures that a self-calibration process can derive a common coordinate system for the multi-marker setup. The common coordinate system can then be used to derive an accurate estimate of the user's position in the room, and to capture his/her visual parameters such as gestures, appearance, etc.
[0025] There are few restrictions on the placement of camera markers in the environment. The number and location of camera markers can be selected based on, for example, the size of the space to be tracked and the requirements for tracking accuracy. Note that the camera marker devices 115 can also be positioned horizontally, e.g., on tables. In many AR visualizations, it is beneficial to define the floor level. However, to preserve the safety of the user and of the device, it may not be advisable to place a camera marker on the floor. In some embodiments, camera markers can be used in conjunction with one or more traditional printed fiducial markers. For example, a printed fiducial marker may be placed on a surface and used to define that surface. For example, a printed fiducial marker may be placed on the floor and used to define the floor level. Similarly, a printed fiducial marker may be used to define other surfaces, such as desktops, tabletops, and walls. In some embodiments, a printed marker may be attached to the viewing device, e.g., on the backside of a tablet, in order to facilitate determination of the pose (the position and orientation) of the viewing device by the system.
[0026] FIG. 1 illustrates a user 105 in a room 100 with multiple camera markers 115. Wide angle cameras are used for calibration and user tracking purposes. In some embodiments, the wide-angle camera has an angle of view of at least 60° (see field of view lines 117). In other embodiments, the wide-angle camera has an angle of view of at least 90°. In further embodiments, the wide-angle camera has an angle of view of at least 120°. A virtual character 110 is augmented into AR glasses, which carry a camera for marker detection and tracking (see field of view lines 107).
[0027] In exemplary embodiments, a camera marker device is accessible and controllable both over an internet protocol (IP) network and locally, e.g., by using a smart phone as a remote control device. Local control is particularly useful for user interaction, such as for changing, scaling and turning the augmented or displayed content.
[0028] FIG. 2A illustrates an exemplary camera marker device 200. As illustrated, the device's appearance does not necessarily differ much from a smartphone or a tablet. For user acceptance, the appearance of an ordinary photo frame may be beneficial. The size of the display depends on the targeted content (markers and/or eye-catchers) and the desired freedom for moving the marker on the display. The marker 205 shown on the display can be replaced or modified over the network, and the augmented content changes accordingly. The camera 210 is used for calibration and user tracking purposes. In the example illustrated in FIG. 2A, the wide-angle camera and the display of the camera marker are both on the same face of the device, with both facing the same direction.
[0029] FIG. 2B illustrates an embodiment of a face model 220 augmented in relation to the marker 205, as seen by AR glasses or a smart phone. In FIGs. 2B-2C, the viewpoint to the marker and the marker orientation are kept unchanged.
[0030] In an exemplary embodiment, a camera marker combines a wide-angle camera for tracking the space with a display for showing markers for Augmented Reality (AR) information. In addition to benefits in AR, the camera marker can be used for new user interaction and presence related services.
[0031] Self-calibration of the camera marker simplifies the setup process of a system that includes multiple camera markers. Self-calibration is useful for ensuring the system's accuracy and stability both in AR visualization and user tracking related functionalities. Self-calibration may be based on detecting the pose of other markers in each camera marker's own camera view.
[0032] To support self-calibration and user-tracking properties, camera marker devices are in communication with a common position server that combines all captured information, e.g., mapping marker positions into common coordinates, and that derives a common estimate for the position of the user. In some embodiments, one of the camera marker devices operates as the server. In other embodiments, a separate server device is provided (either locally or, for example as a cloud service or other networked service) in order to reduce computational capabilities required from each individual camera marker.
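By way of a non-limiting sketch, the report that a camera marker might send to the common position server could resemble the following Python structure; the field names and the JSON transport are illustrative assumptions rather than part of the disclosure.

```python
# Hypothetical report a camera marker might send to the common position server;
# field names are illustrative assumptions, not part of the disclosure.
import json
from dataclasses import dataclass, asdict
from typing import Tuple

@dataclass
class MarkerObservation:
    observer_id: str                                  # reporting camera marker
    observed_id: str                                  # marker seen in the camera view
    position_xyz: Tuple[float, float, float]          # in the observer's local coordinates
    orientation_euler: Tuple[float, float, float]     # e.g., Euler angles, radians
    timestamp_s: float                                 # capture time in seconds

obs = MarkerObservation("M2", "M1", (1.2, 0.0, 3.4), (0.0, 0.05, 1.57), 1691400000.0)
payload = json.dumps(asdict(obs))   # e.g., posted to the position server over WLAN/IP
```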
[0033] FIG. 3 illustrates an exemplary system architecture. In one embodiment, the system may comprise a user 301 with AR glasses. Camera markers (315, 316) may be linked with local computers (320a, 320b, 320n), which may be integrated with the camera marker (e.g., unit 318). The camera markers 315, 316 and their local computers 320 may communicate with a server and/or central computer 325, which combines information from all devices. The user's AR glasses may communicate with the backend 325 over a wireless connection 330. In some embodiments, an AR augmentation 303 may be presented in the user's AR view, in relation to the camera markers.
[0034] A network connection is used to provide content to the electronic marker display, whether a marker for AR, a picture (or video) for an eye-catcher, an animation, an advertisement, or other chosen content. Preferably the connection is wireless (e.g., WLAN), although a cord may be used to power the device.

[0035] A network connection (which may be the same network connection) can be used for another purpose: to connect the camera marker to the user's viewing device. The viewing device may be, for example, a camera phone, a tablet, virtual glasses with a camera, and the like. Using this connection, the system infrastructure tracks, at each time instant, the orientation of the marker (marker display) with respect to the user. This information can be used, e.g., for tracking the user by electronically zooming in on the marker camera view and analyzing its content.
[0036] A further communication connection is used for local remote control of the camera marker by a smart phone or a TV-like remote controller. Feasible connection technologies include WLAN, Bluetooth, and Infrared link, among others.
[0037] FIG. 4 is a functional block diagram of components in one example of a camera marker device. As illustrated in FIG. 4, the camera marker includes a processor 402, a camera 404, which may be a wide-angle camera, and a display 406, such as an LCD, which may be used to display an AR marker. An optional keypad 408 may be provided. The keypad 408 may be implemented as a feature of the display 406, where the display 406 is a touch-sensitive display. The exemplary camera marker device further includes non-volatile memory 410 and volatile memory 412, a transmitter 414, and a receiver 416. Other network input/output connections may also be provided.
[0038] In some embodiments, a camera marker is provided with audio capture and playback features. Audio may be used to increase the attractiveness and effectiveness of the videos used for announcing/advertising the available AR content. Audio may also be used as a component of the augmented AR content. A microphone can be used to capture user responses or commands.
[0039] When building up a multi-marker setup, various combinations of electronic and paper markers are feasible. In such a setup, for example, a paper marker on the floor could specify the floor level without the risk of an electronic device being stepped on. Paper markers may also be used as a way to balance the trade-off between calibration accuracy and system cost. In addition to graphical markers, natural print-out pictures can also be used as part of a hybrid marker setup. Even natural planar or 3D feature sets can be detected by multiple camera markers and used for augmenting 3D objects.
[0040] In some embodiments, at least some local processing is performed in each marker device in order to reduce the amount of information to be transmitted to the common server. Marker detection is one such local operation. Note that the camera marker setup is relatively stable, and tracking in camera markers is not needed to the same extent as in the user's viewing device (AR glasses or tablet), which moves along with the user. Another example is the control of the wide-angle camera in order to capture, for example, cropped views of other markers (for marker detection and identification), or the user's visual parameters. A third example of local processing is to use the camera view to derive the actual lighting conditions in the environment in order to adapt the respective properties of the virtual content for improved photorealism.
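As a minimal sketch of the lighting-estimation example above, the ambient brightness could be approximated from the marker's own camera frame as follows; the use of OpenCV, the synthetic frame, and the renderer call at the end are assumptions for illustration.

```python
# Estimate ambient light level from the marker's camera view so the brightness of
# augmented content can be adapted; assumes an 8-bit BGR frame.
import cv2
import numpy as np

def estimate_ambient_brightness(frame_bgr: np.ndarray) -> float:
    """Mean luminance of the camera view, scaled to 0.0 (dark) .. 1.0 (bright)."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    return float(gray.mean()) / 255.0

frame = np.full((480, 640, 3), 180, dtype=np.uint8)   # stand-in for a captured frame
brightness = estimate_ambient_brightness(frame)
# e.g., renderer.set_virtual_light_intensity(brightness)  # hypothetical renderer call
```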
[0041] Instead of utilizing only visual cameras, camera markers can be equipped with 3D cameras, such as RGB-D or ToF sensors, for capturing depth information. As the success of devices such as the Kinect camera has shown, depth sensing can increase the versatility and performance of related functionalities and services. The use of camera markers may encourage the acceptance of 3D cameras as a ubiquitous part of users' environments.
[0042] As a reference, a system for real-time 3D reconstruction of room-sized spaces has been described in [2]. The system uses Kinect Fusion modified to operate with a set of fixed sensors, an approach that might also be used in a system of 3D camera markers.
[0043] Together with information on the user's real viewpoint (obtained, e.g., by analyzing the captured 3D scene, or obtained from virtual glasses), the 3D captured scene can be used to implement accurate user-perspective AR rendering (cf. the illustration of device-perspective and user-perspective magic lenses in Baricevic et al. [1]). A more traditional way of capturing 3D information is to use two (e.g., stereo) or more cameras.
[0044] As described above, multiple markers can be used in AR to give more and better 3D data of the environment. To provide this benefit, the multiple markers are calibrated with respect to each other and the scene. Typically, calibration is performed by capturing the multi-marker scene with a moving external camera and making geometrical calculations from its views. An example of a multi-camera calibration method is given in [5].
[0045] Providing the markers with wide-angle cameras enables self-calibration in a multiple camera-marker system. The views of the marker cameras themselves can be used for the mutual calibration of all devices, and the calibration can be updated when necessary, e.g., to adapt to any changes in the setup.
[0046] In [3], a feasible process for auto-calibration is described, which can also be applied to a multiple camera marker setup. The calibration is a real-time process and does not need a separate calibration phase. The user may lay markers randomly on suitable places and start tracking immediately. The accuracy of the system improves on the run as the transformation matrices are updated dynamically. Calibration can also be done as a separate stage, and the results can be saved and used later with another application. The described algorithms can be applied to various types of markers.

[0047] Viewing of AR information is typically performed by a device using a camera (usually in a mobile phone or wearable glasses) to detect the orientation and scale of a marker. The location of the user with respect to the marker can then be sent from the viewing device over a network connection. This is an example of inside-out tracking based on the camera's own motion.
[0048] In some embodiments, a user's position can also be derived by the cameras around the user. This outside-in type of tracking can be accomplished by camera markers and brings some potential benefits. In addition to capturing the user's position, marker cameras can be used to capture the user's visual appearance, gestures, or motion.
[0049] More reliable user tracking can be effected using multiple connected and calibrated marker cameras around the user, as described above. This makes user capture easier and more accurate (e.g., tracking in large spaces, handling of occlusions in the scene, etc.) than single-camera tracking, whether based on one wearable camera or one (typically fixed) external camera.
[0050] In the case of multiple users, after the position of each of the users is determined, the users can be provided with individualized viewpoints of the same content. It is also possible to serve the different users with different content, especially in embodiments in which the service is permitted to track users' identities. The user identities may be, for example, anonymously numbered indices. Such anonymous indices are sufficiently detailed to provide some level of service enhancement. However, having information regarding real identities enables the system to provide more personalized services.
[0051] Camera markers are an access point for information and interaction for the user, and they are used in some embodiments for analyzing users' responses to the content, monitoring user activities in the space, and collecting related contextual information. The use of camera markers allows for human-computer interface (HCI) studies to be performed in Augmented Reality. This can be compared to known ways of user observation by an external camera or eye-tracker.
[0052] User behavior data can be used in many ways to better understand and serve the user. The observations can also be used actively to provide the user with interactive content, as described in the following.
[0053] By tracking the user, interactive effects can be produced. These effects can be shown both when the camera marker is used for showing AR information (seen, e.g., by AR glasses) and when the effects are shown directly on a large enough camera marker display.

[0054] As the orientation of the camera marker device is determined with respect to the user, the device's display or augmentation may be used to reflect the environment to the user from almost any desired direction seen from the device (provided that the camera captures a wide enough panorama of the environment). In one such embodiment, the device acts as a virtual mirror, mimicking a physical mirror view even for a moving user (resulting in mirror-like motion parallax seen by the user). The virtual mirror can be at a fixed angle, reflecting to/from any chosen direction, or even dynamically turn to any desired direction while the user is moving (e.g., depending on his/her trace). This enables new types of effects and services in many spaces. The virtual mirror concept is illustrated in FIG. 5. A user 501 with AR gear may view (view 505) a "virtual mirror" 510 (e.g., a camera marker) on a wall 508. The virtual mirror 510 may give a reflection 515, which may shift direction relative to the user's motions.
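The geometry behind such a virtual mirror can be sketched as a reflection of the tracked user position across the plane of the camera marker's display; the vectors below are illustrative values, assuming the plane position and normal of the marker are known from calibration.

```python
# Reflect the user's tracked position across the camera marker's display plane to
# obtain the virtual viewpoint from which the captured panorama is re-rendered.
import numpy as np

def reflect_across_plane(point: np.ndarray, plane_point: np.ndarray,
                         plane_normal: np.ndarray) -> np.ndarray:
    n = plane_normal / np.linalg.norm(plane_normal)
    d = np.dot(point - plane_point, n)   # signed distance to the mirror plane
    return point - 2.0 * d * n           # mirror image of the point

user_pos = np.array([1.0, 1.6, 2.5])        # user position in room coordinates (example)
mirror_pos = np.array([0.0, 1.5, 0.0])      # a point on the camera marker's display
mirror_normal = np.array([0.0, 0.0, 1.0])   # display assumed to face along +z
virtual_eye = reflect_across_plane(user_pos, mirror_pos, mirror_normal)
```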
[0055] FIG. 2C illustrates effects of scaling and turning the marker 205 over a network connection (either by the user or the service provider). Local user control enables the user, for example, to place an advertised 3D product (e.g., a couch) at the appropriate scale into a preferred position in his/her room. The provider or broadcaster of the advertisement does not need to know about the local circumstances, as the user is the one who makes the composition for AR visualization.
[0056] The size of the camera marker's display naturally limits the freedom for (locally) controlling the point at which an object is augmented. This is especially true for spatial translations. However, an augmented object can be moved longer distances by allowing the user to change the perspective (angle) of a marker image as well as the distance between the marker and the augmented object.
[0057] Each camera marker used in an application has a location and orientation (pose) with respect to a coordinate system used to provide augmented content. In an exemplary setup process, two or more camera markers cooperate to determine their respective locations and orientations. In the exemplary setup process, a set of six values is used to specify the location and orientation of a camera marker with respect to the coordinate system, such as the three spatial coordinates (x, y, z) defining the location and three Euler angles (φ, θ, ψ) defining the orientation. It should be understood that alternative parameterizations of the location and orientation of the camera markers may also be employed.
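As one illustration of this parameterization, the three Euler angles can be converted to a rotation matrix as follows; the Z-Y-X convention and the sample values are assumptions, since the disclosure leaves the exact convention open.

```python
# Convert an (x, y, z, phi, theta, psi) pose into a 3x3 rotation matrix, assuming a
# Z-Y-X Euler convention purely for illustration.
import numpy as np

def euler_zyx_to_matrix(phi: float, theta: float, psi: float) -> np.ndarray:
    cz, sz = np.cos(phi), np.sin(phi)
    cy, sy = np.cos(theta), np.sin(theta)
    cx, sx = np.cos(psi), np.sin(psi)
    Rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
    Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    Rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
    return Rz @ Ry @ Rx

pose = {"xyz": (2.0, 1.2, 0.0), "euler": (0.1, 0.0, 1.5)}   # one possible encoding
R = euler_zyx_to_matrix(*pose["euler"])                      # 3x3 orientation matrix
```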
[0058] Consider, for the sake of illustration, an embodiment in which three camera markers are in use, and in which each of the camera markers is equipped with a wide-angle camera with a field of view that encompasses the other two camera markers. In the exemplary setup process, each of the camera markers displays an AR marker or other fiducial marker, which may be a unique (or at least locally unique) marker. Each of the camera markers obtains an image from its camera and processes the image to recognize the two displayed AR markers and to determine the coordinates of those markers within the image. Having processed the image to recognize the AR markers, the camera marker further determines the angle between those markers. The determination of the position and orientation of the AR markers within the image may be performed using known AR marker detection techniques, for example, statistical-based, gradient-based, pixel connectivity/edge-linking-based, and Hough-transform-based methods.
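A minimal detection sketch is shown below, using the ArUco module of opencv-contrib as a stand-in for the unspecified detection method; the disclosure does not mandate ArUco, the synthetic frame is a placeholder, and the exact API differs between OpenCV versions.

```python
# Detect fiducial markers in a camera marker's wide-angle view (requires
# opencv-contrib-python; classic pre-4.7 ArUco API shown).
import cv2
import numpy as np

gray = np.full((720, 1280), 255, dtype=np.uint8)      # stand-in for a captured frame
dictionary = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_4X4_50)
corners, ids, _rejected = cv2.aruco.detectMarkers(gray, dictionary)
# 'corners' holds the pixel coordinates of each detected marker's four corners;
# 'ids' identifies which marker was seen, e.g., which neighboring camera marker.
```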
[0059] In some embodiments, the image taken by each camera is further processed to determine the distance of the other camera markers. In some embodiments, the distance of the other camera markers is determined based at least in part on the apparent size and perspective of the camera markers within the image. This may be done by, for example, comparing the apparent size and shape of the camera markers in the image with a known actual size and shape of the camera markers. This may also be done by comparing the apparent size and shape of AR markers displayed on the camera markers with a known actual size and shape of those AR markers. The actual size and shape of camera markers and of AR markers displayed thereon may be made available to each camera marker (and/or common position server) in advance, e.g., during manufacture or configuration, or that information may be shared (e.g., over a local network) during the setup process. The distance estimation process takes into consideration the optics and resolution of the camera marker. In some embodiments, each AR marker (whether printed or displayed on a camera marker) may convey information identifying its own actual physical dimensions, allowing camera markers to estimate the distance of the AR marker based on a comparison between the actual and apparent physical dimensions of that AR marker.
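Under a simple pinhole-camera assumption, the apparent-size comparison described above reduces to the following estimate; the marker size, pixel width, and focal length used here are illustrative values.

```python
# Pinhole-model distance estimate from apparent size:
# distance ≈ focal_length_px * actual_size / apparent_size_px.
def estimate_distance(actual_size_m: float, apparent_size_px: float,
                      focal_length_px: float) -> float:
    return focal_length_px * actual_size_m / apparent_size_px

# A 0.15 m wide displayed AR marker appearing 120 px wide through a camera with an
# 800 px focal length would be roughly 1 m away:
d = estimate_distance(0.15, 120.0, 800.0)   # ≈ 1.0 m
```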
[0060] In embodiments in which the camera marker is provided with a stereo camera, depth camera or other depth-sensing technology, the distance of the other camera markers may be determined based at least in part on the corresponding depth measurement. In some embodiments, depth can be measured using other techniques. For example, the camera markers may exchange audio signals, with the travel time of the audio signals being indicative of the distance between markers.
[0061] In some embodiments, the image taken by each camera is further processed to determine the apparent angle of orientation of the other markers. In some embodiments, each camera marker is equipped with one or more accelerometers, and the camera marker is operative to process readings from the accelerometer or accelerometers to determine one or more angles of that camera marker with respect to the vertical (e.g., "pitch," "roll," and "yaw" angles).

[0062] In the exemplary setup process, the angle and distance measurements performed as described above provide sufficient information to determine the location and orientation of all of the camera markers. These calculations can be performed using well-known principles of trigonometry.
[0063] In some embodiments, the locations and orientations of visual features other than camera markers may be used in the setup process. For example, as mentioned above, one or more printed markers may be employed, such as a printed marker placed on the floor.
[0064] An exemplary method is illustrated in FIG. 9. A camera marker operates to obtain a wide-angle image of a location in which it is situated (905), with the wide-angle image including one or more other camera markers in the field of view of the image. The camera marker operates to detect other camera markers (or printed markers) in the image (910). The camera marker then operates based on the shape and size of the camera markers in the image to determine marker coordinates in the local coordinate system (915). In an exemplary embodiment, the camera marker determines from the camera image the coordinates in the local coordinate system corresponding to the six degrees of freedom of the marker, e.g., x, y, z for marker position, and φ, θ, ψ (or α, β, γ) for marker orientation, the coordinates representing the pose of the marker. This determination may be made based on inputs such as a known actual size of the visible camera marker (or of the AR marker displayed by that camera marker), information concerning distortion introduced by wide-angle imaging optics, pixel resolution, focal length, and the like.
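One way to recover these six degrees of freedom from a single image, assuming the marker's physical size and the camera intrinsics are known, is a perspective-n-point solution such as OpenCV's solvePnP; the corner coordinates, marker size, and intrinsics below are illustrative assumptions.

```python
# Recover a detected marker's 6-DoF pose in the observing camera's frame from one
# image, given the marker's known side length and the camera intrinsics.
import cv2
import numpy as np

side = 0.15                                   # assumed marker side length in meters
object_pts = np.array([[-side/2,  side/2, 0], [ side/2,  side/2, 0],
                       [ side/2, -side/2, 0], [-side/2, -side/2, 0]], dtype=np.float32)
image_pts = np.array([[410, 220], [530, 225], [525, 345], [405, 340]], dtype=np.float32)
K = np.array([[800, 0, 640], [0, 800, 360], [0, 0, 1]], dtype=np.float32)  # intrinsics
dist = np.zeros(5)                            # assume lens distortion already corrected

ok, rvec, tvec = cv2.solvePnP(object_pts, image_pts, K, dist)
R, _ = cv2.Rodrigues(rvec)   # 3x3 rotation of the marker in the camera's frame
# tvec gives the marker position (x, y, z) in the observing camera marker's frame.
```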
[0065] In the illustrated example, the camera marker provides the coordinates of the other camera markers (and possibly printed fiducial markers) in the local coordinate system to a common position server (920), along with identifiers of the markers (which may be identifiers encoded in the markers themselves using, for example, a QR code or similar technology). In some embodiments, the common position server is implemented in the camera marker itself, as in the embodiment of FIG. 9. In other embodiments, the common position server is implemented in a different camera marker or in a separate network node.
[0066] The common position server receives information regarding marker positions expressed in one or more different coordinate systems. For example, the common position server may receive from one camera marker an indication that a marker labeled M1 is at coordinates (x, y, z, φ, θ, ψ) in one local coordinate system and that the same marker M1 is at coordinates (x', y', z', φ', θ', ψ') in a different local coordinate system measured with respect to a different camera marker. Similar sets of coordinates may also be received representing the positions of different markers as measured in different coordinate systems. The position server defines a common coordinate system and transforms the coordinates of the markers into the common coordinate system (925).
[0067] Various techniques may be used to transform the coordinates of the markers into the common coordinate system. As one example, one of the local coordinate systems is defined as the common coordinate system. A transformation is then found between the common coordinate system and the local coordinate systems. The transformation may be found by testing a plurality of different transformations that result in the alignment of the locations of different camera markers. A best alignment may be selected as, for example, an alignment that minimizes the sum of least squares of distances between representations of the different camera markers in different coordinate systems.
[0068] For example, consider a system with three camera markers M1, M2, and M3. Suppose that the local coordinate system of camera marker M1 is selected as the common coordinate system, and suppose that camera markers M1 and M2 are in one another's fields of view, while camera marker M3 is only in the field of view of M2. The system of camera markers cooperates to determine the coordinates of camera marker M3 in the common coordinate system.
[0069] The exemplary system operates to determine a transformation that transforms the coordinate system of M2 to the local coordinate system of M1. This transformation may take the form, for example, of a vector offset combined with a rotation. For example, an arbitrary location in the coordinate system of M2 may be expressed by a vector X2. The same location in the coordinate system of M1 may be expressed by the vector X1, where X1 = T1,2 + R1,2 X2, with T1,2 being a three-dimensional vector and R1,2 being a three-dimensional rotation matrix.
[0070] The transformation parameters (vector T1,2 and matrix R1,2) may be determined with the use of a search through a search space to minimize the sum of square distances between different representations of the same camera marker. For example, suppose additional markers MA and MB have positions represented, respectively, by vectors A1 and B1 in the coordinate system of M1 as measured by M1. Those same markers have positions represented, respectively, by vectors A2 and B2 in the coordinate system of M2 as measured by M2. The positions as measured by M2 can thus be expressed in the coordinate system of M1 as follows:

A1' = T1,2 + R1,2 A2

B1' = T1,2 + R1,2 B2

Ideally, the transformation parameters T1,2 and R1,2 are selected such that A1' = A1 and such that B1' = B1. However, due to inaccuracies and other variations in position determination, it may be desirable to select the transformation parameters T1,2 and R1,2 so as to minimize the sum S, where S is given by the equation:

S = |A1' - A1|² + |B1' - B1|²

It should be apparent that the above equation can readily be generalized to include components that represent additional camera marker positions.
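Rather than an explicit search, transformation parameters that minimize S can also be obtained in closed form with the standard SVD-based (Kabsch) solution; the sketch below, with illustrative marker positions, is one such least-squares estimate and is not the only approach contemplated above.

```python
# Least-squares estimate of T1,2 and R1,2 from corresponding marker positions
# measured in both coordinate systems (Kabsch algorithm).
import numpy as np

def fit_rigid_transform(points_m2: np.ndarray, points_m1: np.ndarray):
    """Find R, T such that points_m1 ≈ T + R @ points_m2 (rows are 3D points)."""
    c2, c1 = points_m2.mean(axis=0), points_m1.mean(axis=0)
    H = (points_m2 - c2).T @ (points_m1 - c1)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:        # guard against a reflection solution
        Vt[-1, :] *= -1
        R = Vt.T @ U.T
    T = c1 - R @ c2
    return R, T

# Marker positions (e.g., MA, MB, and one more) as measured by M1 and by M2:
P1 = np.array([[0.0, 0.0, 2.0], [1.0, 0.5, 2.5], [2.0, 0.0, 1.0]])
P2 = np.array([[1.2, 0.1, 1.8], [2.3, 0.6, 2.1], [3.0, 0.2, 0.4]])
R12, T12 = fit_rigid_transform(P2, P1)
S = np.sum(np.linalg.norm((P2 @ R12.T + T12) - P1, axis=1) ** 2)   # residual sum
```

The same R12 and T12 can then map a position measured only by M2, such as the position of M3 discussed below, into the common coordinate system.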
[0071] In some embodiments, the transformation parameters may include one or more scaling factors. Such a transformation may be expressed as X1 = T1,2 + R1,2 αX2, for example, where α is a scaling factor. In some embodiments, α is a scalar. In other embodiments, α is a vector or tensor value. In some embodiments, R1,2 is not a unitary rotation matrix but rather an arbitrary matrix, the components of which are adjusted using, e.g., a search technique to minimize the sum S.
[0072] Once a desirable set of transformation parameters has been determined, it is possible to determine the location of the marker M3 in the common coordinate system defined by M1, even though M3 is not in the field of view of M1. Suppose that the position of M3 is represented by a vector XM3 in the local coordinate system of M2; then the coordinates of M3 in the common coordinate system can be determined to be XM3', where

XM3' = T1,2 + R1,2 XM3.
[0073] In embodiments in which the position of a single camera marker is measured by more than one other camera marker, that single camera marker may have more than one associated set of location coordinates in the common coordinate system. In such a situation, a least squares technique may be employed to determine a best fit position in the common coordinate system. The least squares technique may be weighted to accommodate the reliability of different measurements. For example, a position measurement from a nearby camera marker may be weighted more heavily than a position measurement from a more distant camera marker. A reliability measure may also be associated with different coordinate transforms, with transforms based on a greater number of marker positions being considered relatively more reliable, and transforms that result in a very low sum S being considered relatively more reliable. In such an embodiment, a position measurement that results from more reliable transforms is itself considered more reliable and thus is weighted more heavily in determining a best fit position.
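A sketch of such a weighted best-fit combination is given below, using inverse-square distance as one possible reliability weight among those discussed above; the numeric values are illustrative.

```python
# Weighted fusion of several estimates of the same marker's position in the common
# coordinate system, with nearer observers weighted more heavily.
import numpy as np

estimates = np.array([[2.01, 0.98, 3.02],      # estimate from a nearby camera marker
                      [2.10, 1.05, 2.95]])     # estimate from a more distant camera marker
observer_distances = np.array([1.5, 4.0])      # meters, illustrative values

weights = 1.0 / observer_distances ** 2
best_fit = (weights[:, None] * estimates).sum(axis=0) / weights.sum()
```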
[0074] For the sake of clarity, the example given above makes reference only to transformation of position vectors. It should be understood that information on the orientation of different camera markers (e.g., Euler angles) can likewise be transformed among different coordinate systems in order to represent the orientations of the camera markers with respect to a common coordinate system.
[0075] With reference to FIG. 9, after coordinates of the camera markers (and possibly other markers, such as printed markers) have been determined in a common coordinate system, those coordinates are communicated to an augmented reality (AR) rendering system (930). In some embodiments, the AR rendering system includes an AR device (such as a headset, tablet, or other device) with a camera. The AR device takes an image of the environment (935) and operates to locate AR markers (such as camera markers) in the image (940). Based on the location of the AR markers within the image, and based further on the coordinates of those markers within the common coordinate system, the AR system determines the position and orientation of the AR device within the common coordinate system (945). The AR system further operates to render AR content based on the determined position and orientation of the AR device (950). It should be understood that other components of the AR device, such as accelerometers and gyroscopic sensors, can be used to assist with the tracking of the AR device. In some embodiments, one or more printed markers are displayed on the AR device to facilitate tracking of the AR device by camera markers.
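As an illustrative sketch of step 945, the viewing device's pose in the common coordinate system can be composed from the marker's pose in that system (supplied by the position server) and the marker's pose relative to the device camera (from marker detection); the 4x4 homogeneous transforms and the values below are assumptions made for clarity.

```python
# Compose the AR device's pose in the common (world) frame from the marker's world
# pose and the marker's pose in the device camera frame.
import numpy as np

def make_T(R: np.ndarray, t: np.ndarray) -> np.ndarray:
    T = np.eye(4)
    T[:3, :3], T[:3, 3] = R, t
    return T

T_world_marker = make_T(np.eye(3), np.array([2.0, 1.5, 0.0]))    # from the position server
T_camera_marker = make_T(np.eye(3), np.array([0.0, 0.0, 1.2]))   # from marker detection

# Camera (AR device) pose in the common coordinate system:
T_world_camera = T_world_marker @ np.linalg.inv(T_camera_marker)
```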
[0076] In some embodiments, the functions of the described camera marker are performed using a general-purpose consumer tablet computer. A tablet computer readily provides versions of the needed components, such as a display, a camera (though typically not with wide-angle optics), and wired and wireless network connections. In some embodiments, a camera marker is implemented using dedicated software running on the tablet device. In some embodiments, the camera marker is implemented using a special-purpose version of a tablet computer. The special-purpose version of the tablet computer may, for example, have reduced memory, lower screen resolution (possibly greyscale only), and wide-angle optics, and may be pre-loaded with appropriate software to enable camera marker functionality. In some embodiments, inessential functionality such as GPS, magnetometer, and audio functions may be omitted from the special-purpose tablet computer.
[0077] Exemplary embodiments disclosed herein are implemented using one or more wired and/or wireless network nodes, such as a wireless transmit/receive unit (WTRU) or other network entity.
[0078] FIG. 6 is a system diagram of an exemplary WTRU 102, which may be employed as a user device in embodiments described herein. As shown in FIG. 6, the WTRU 102 may include a processor 118, a communication interface 119 including a transceiver 120, a transmit/receive element 122, a speaker/microphone 124, a keypad 126, a display/touchpad 128, a non-removable memory 130, a removable memory 132, a power source 134, a global positioning system (GPS) chipset 136, and sensors 138. It will be appreciated that the WTRU 102 may include any subcombination of the foregoing elements while remaining consistent with an embodiment.
[0079] The processor 118 may be a general purpose processor, a special purpose processor, a conventional processor, a digital signal processor (DSP), a plurality of microprocessors, one or more microprocessors in association with a DSP core, a controller, a microcontroller, Application Specific Integrated Circuits (ASICs), Field Programmable Gate Array (FPGAs) circuits, any other type of integrated circuit (IC), a state machine, and the like. The processor 118 may perform signal coding, data processing, power control, input/output processing, and/or any other functionality that enables the WTRU 102 to operate in a wireless environment. The processor 118 may be coupled to the transceiver 120, which may be coupled to the transmit/receive element 122. While FIG. 6 depicts the processor 118 and the transceiver 120 as separate components, it will be appreciated that the processor 118 and the transceiver 120 may be integrated together in an electronic package or chip.
[0080] The transmit/receive element 122 may be configured to transmit signals to, or receive signals from, a base station over the air interface 115/116/117. For example, in one embodiment, the transmit/receive element 122 may be an antenna configured to transmit and/or receive RF signals. In another embodiment, the transmit/receive element 122 may be an emitter/detector configured to transmit and/or receive IR, UV, or visible light signals, as examples. In yet another embodiment, the transmit/receive element 122 may be configured to transmit and receive both RF and light signals. It will be appreciated that the transmit/receive element 122 may be configured to transmit and/or receive any combination of wireless signals.
[0081] In addition, although the transmit/receive element 122 is depicted in FIG. 6 as a single element, the WTRU 102 may include any number of transmit/receive elements 122. More specifically, the WTRU 102 may employ MIMO technology. Thus, in one embodiment, the WTRU 102 may include two or more transmit/receive elements 122 (e.g., multiple antennas) for transmitting and receiving wireless signals over the air interface 115/116/117.
[0082] The transceiver 120 may be configured to modulate the signals that are to be transmitted by the transmit/receive element 122 and to demodulate the signals that are received by the transmit/receive element 122. As noted above, the WTRU 102 may have multi-mode capabilities. Thus, the transceiver 120 may include multiple transceivers for enabling the WTRU 102 to communicate via multiple RATs, such as UTRA and IEEE 802.11, as examples.
[0083] The processor 118 of the WTRU 102 may be coupled to, and may receive user input data from, the speaker/microphone 124, the keypad 126, and/or the display/touchpad 128 (e.g., a liquid crystal display (LCD) display unit or organic light-emitting diode (OLED) display unit). The processor 118 may also output user data to the speaker/microphone 124, the keypad 126, and/or the display/touchpad 128. In addition, the processor 118 may access information from, and store data in, any type of suitable memory, such as the non-removable memory 130 and/or the removable memory 132. The non-removable memory 130 may include random-access memory (RAM), read-only memory (ROM), a hard disk, or any other type of memory storage device. The removable memory 132 may include a subscriber identity module (SIM) card, a memory stick, a secure digital (SD) memory card, and the like. In other embodiments, the processor 118 may access information from, and store data in, memory that is not physically located on the WTRU 102, such as on a server or a home computer (not shown).
[0084] The processor 118 may receive power from the power source 134, and may be configured to distribute and/or control the power to the other components in the WTRU 102. The power source 134 may be any suitable device for powering the WTRU 102. As examples, the power source 134 may include one or more dry cell batteries (e.g., nickel-cadmium (NiCd), nickel-zinc (NiZn), nickel metal hydride (NiMH), lithium-ion (Li-ion), and the like), solar cells, fuel cells, and the like.
[0085] The processor 118 may also be coupled to the GPS chipset 136, which may be configured to provide location information (e.g., longitude and latitude) regarding the current location of the WTRU 102. In addition to, or in lieu of, the information from the GPS chipset 136, the WTRU 102 may receive location information over the air interface 115/116/117 from a base station and/or determine its location based on the timing of the signals being received from two or more nearby base stations. It will be appreciated that the WTRU 102 may acquire location information by way of any suitable location-determination method while remaining consistent with an embodiment.
[0086] The processor 118 may further be coupled to other peripherals 138, which may include one or more software and/or hardware modules that provide additional features, functionality and/or wired or wireless connectivity. For example, the peripherals 138 may include sensors such as an accelerometer, an e-compass, a satellite transceiver, a digital camera (for photographs or video), a universal serial bus (USB) port, a vibration device, a television transceiver, a hands free headset, a Bluetooth® module, a frequency modulated (FM) radio unit, a digital music player, a media player, a video game player module, an Internet browser, and the like.
[0087] FIG. 7 depicts an exemplary network entity 190 that may be used in embodiments of the present disclosure, for example as a common server used for the setup of one or more camera markers. As depicted in FIG. 7, network entity 190 includes a communication interface 192, a processor 194, and non-transitory data storage 196, all of which are communicatively linked by a bus, network, or other communication path 198.
[0088] Communication interface 192 may include one or more wired communication interfaces and/or one or more wireless-communication interfaces. With respect to wired communication, communication interface 192 may include one or more interfaces such as Ethernet interfaces, as an example. With respect to wireless communication, communication interface 192 may include components such as one or more antennae, one or more transceivers/chipsets designed and configured for one or more types of wireless (e.g., LTE) communication, and/or any other components deemed suitable by those of skill in the relevant art. And further with respect to wireless communication, communication interface 192 may be equipped at a scale and with a configuration appropriate for acting on the network side (as opposed to the client side) of wireless communications (e.g., LTE communications, Wi-Fi communications, and the like). Thus, communication interface 192 may include the appropriate equipment and circuitry (perhaps including multiple transceivers) for serving multiple mobile stations, UEs, or other access terminals in a coverage area.
[0089] Processor 194 may include one or more processors of any type deemed suitable by those of skill in the relevant art, some examples including a general-purpose microprocessor and a dedicated DSP.
[0090] Data storage 196 may take the form of any non-transitory computer-readable medium or combination of such media, some examples including flash memory, read-only memory (ROM), and random-access memory (RAM) to name but a few, as any one or more types of non-transitory data storage deemed suitable by those of skill in the relevant art could be used. As depicted in FIG. 7, data storage 196 contains program instructions 197 executable by processor 194 for carrying out various combinations of the various network-entity functions described herein.
[0091] FIG. 8 illustrates examples of patterns that can be displayed on the display of a camera marker for use as an AR marker, without limitation.
[0092] FIG. 10 illustrates a functional architecture of a camera marker 1001 in accordance with an embodiment. The camera marker 1001 may operate various modules. A camera module 1005 may operate within the camera marker 1001. A marker display module 1010 may operate within the camera marker 1001, to display the AR marker. A coordinate conversion module 1015 may operate within the camera marker 1001, to determine the coordinates, relative to the camera marker, of other markers detected by image capture. In some embodiments, a position server module 1020 may operate within the camera marker 1001. The position server module 1020 may include a shared coordinate conversion module 1022, which may convert the local coordinates of detected markers into a shared coordinate system. There may also be a marker transform/scale module 1040, which may scale and/or transform a displayed marker in relation to an AR unit.
[0093] The camera module 1005, marker display module 1010, coordinate conversion module 1015, position server module 1020, and marker transform/scale module 1040, as well as other modules, may communicate with a memory 1030. The memory may include rules for marker transform/scale 1032, captured images 1034, local coordinates 1036, other camera locations 1038, and/or the like.
[0094] At various times, the camera marker may have communications incoming from or outgoing to an AR unit 1003.
[0095] In one embodiment, there is a method comprising: providing a plurality of camera markers; and operating the plurality of camera markers to perform self-calibration. In some embodiments, the self-calibration includes determination of a shared coordinate system. In some embodiments, the method further comprises rendering augmented content to a user using the shared coordinate system. In some embodiments, the self-calibration includes determination of a location of the camera marker in the shared coordinate system. In some embodiments, the self-calibration includes determination of an orientation of the camera marker in the shared coordinate system.
[0096] In one embodiment, there is a method comprising: operating a camera marker to display an image of at least a first augmented reality marker; operating the camera marker to capture an image of at least a second augmented reality marker; based on the image, determining a pose of the second augmented reality marker with respect to the camera marker; and providing the pose to a position server. In some embodiments, the method further comprises operating a second camera marker to capture the image of the first augmented reality marker. In some embodiments, the method further comprises operating the second camera marker to display the second augmented reality marker. In some embodiments, the second augmented reality marker is a second camera marker. In some embodiments, the method further comprises detecting an image of a user by the camera marker and determining a pose of the user based on the image of the user. In some embodiments, the position server is implemented in the camera marker. In some embodiments, the position server is implemented in a separate camera marker. In some embodiments, the position server operates to define a shared coordinate system. In some embodiments, the method further comprises rendering augmented reality content using the shared coordinate system. In some embodiments, the rendering of the augmented reality content includes providing sound from a speaker of the camera marker. In some embodiments, the method further comprises controlling the camera marker to modify the first augmented reality marker, the modification being selected from the group consisting of changing, scaling and turning the augmented reality marker. In some embodiments, the method further comprises changing the rendering of augmented content in response to modification of the augmented reality marker. In some embodiments, the controlling is provided by remote control. In some embodiments, the remote control is provided over an internet protocol (IP) network. In some embodiments, the remote control is provided using a protocol selected from the group consisting of WLAN, Bluetooth, and an Infrared link. In some embodiments, the method further comprises: determining a pose of an augmented reality viewing device using at least the first augmented reality marker; and rendering augmented reality content on the augmented reality viewing device using the determined pose. In some embodiments, the method further comprises determining a pose of an augmented reality viewing device using at least the first augmented reality marker. In some embodiments, the viewing device is selected from the group consisting of a camera phone, a tablet computer, and a virtual reality headset. In some embodiments, the method further comprises determining a position of an augmented reality viewing device using at least the first augmented reality marker and the second augmented reality marker. In some embodiments, the second augmented reality marker is displayed on a second camera marker. In some embodiments, the position server operates to define a shared coordinate system and to determine a position of the camera marker in the shared coordinate system. In some embodiments, the augmented reality marker is a printed fiducial marker used to identify a surface level. In some embodiments, the augmented reality marker is a printed fiducial marker. In some embodiments, the camera marker displays information advertising the available augmented content. In some embodiments, capturing an image includes capturing a depth image.
[0097] In one embodiment, there is a method of providing a virtual mirror, the method comprising: obtaining an image from a camera of a camera marker; processing the image to emulate a reflected image; and rendering the processed image on an augmented reality display at a position determined at least in part by an augmented reality marker displayed by the camera marker. In some embodiments, the processed image is rendered substantially at the position of the augmented reality marker.
[0098] In one embodiment, there is a camera marker comprising: a wide-angle camera on a front face of the camera marker; and a display on the front face of the camera marker. In some embodiments, the camera marker further comprises logic in communication with the wide-angle camera to determine a relative location of at least one other camera marker. In some embodiments, the camera marker is operative to display an augmented reality (AR) marker on the display. In some embodiments, the wide-angle camera is an electronic pan-tilt-zoom camera. In some embodiments, the wide-angle camera is a depth camera. In some embodiments, the camera marker is implemented in a digital photo frame.
[0099] In one embodiment, there is a camera marker system comprising: a first camera marker including a first display and a first front-facing camera; a second camera marker including a second display and a second front-facing camera; wherein the first camera marker is positioned in a field of view of the second front-facing camera, and wherein the second camera marker is positioned in a field of view of the first front-facing camera. In one embodiment, the camera marker system further comprises a common position server. In one embodiment, the camera marker system further comprises an augmented reality system.
[0100] In one embodiment, there is a method of defining a coordinate system comprising: operating a first camera marker to determine a pose of a second camera marker in a first local coordinate system; operating the second camera marker to determine a pose of the first camera marker in a second local coordinate system; and determining a transformation between the first local coordinate system and the second local coordinate system. In one embodiment, the method further comprises defining a global coordinate system. In one embodiment, the method further comprises determining a transformation between the first local coordinate system and the global coordinate system. In one embodiment, the method further comprises determining a transformation between the second local coordinate system and the global coordinate system. In one embodiment, the method further comprises determining a pose of the first camera marker in the global coordinate system. In one embodiment, the method further comprises determining a pose of the second camera marker in the global coordinate system. In one embodiment, the method further comprises rendering augmented reality content using the global coordinate system. In one embodiment, the method further comprises rendering the augmented reality content using an augmented reality viewer. In one embodiment, the augmented reality viewer is a head-mounted display. In one embodiment, the augmented reality viewer is a tablet computer. In one embodiment, the augmented reality viewer is a smartphone. In one embodiment, the augmented reality viewer is a wearable device.
[0101] Note that various hardware elements of one or more of the described embodiments are referred to as "modules" that carry out (i.e., perform, execute, and the like) various functions that are described herein in connection with the respective modules. As used herein, a module includes hardware (e.g., one or more processors, one or more microprocessors, one or more microcontrollers, one or more microchips, one or more application-specific integrated circuits
(ASICs), one or more field programmable gate arrays (FPGAs), one or more memory devices) deemed suitable by those of skill in the relevant art for a given implementation. Each described module may also include instructions executable for carrying out the one or more functions described as being carried out by the respective module, and it is noted that those instructions could take the form of or include hardware (i.e., hardwired) instructions, firmware instructions, software instructions, and/or the like, and may be stored in any suitable non-transitory computer- readable medium or media, such as commonly referred to as RAM, ROM, etc.
[0102] Although features and elements are described above in particular combinations, one of ordinary skill in the art will appreciate that each feature or element can be used alone or in any combination with the other features and elements. In addition, the methods described herein may be implemented in a computer program, software, or firmware incorporated in a computer-readable medium for execution by a computer or processor. Examples of computer-readable storage media include, but are not limited to, a read-only memory (ROM), a random access memory (RAM), a register, cache memory, semiconductor memory devices, magnetic media such as internal hard disks and removable disks, magneto-optical media, and optical media such as CD-ROM disks, and digital versatile disks (DVDs). A processor in association with software may be used to implement a radio frequency transceiver for use in a WTRU, UE, terminal, base station, RNC, or any host computer.
REFERENCES
[1] Domagoj Baricevic, Cha Lee, Matthew Turk, Tobias Hollerer, Doug A. Bowman (2012), "A Hand-Held AR Magic Lens with User-Perspective Rendering", ISMAR 2012, 10 p.
[2] Andrew Maimone and Henry Fuchs (2012), "Real-Time Volumetric 3D Capture of Room-Sized Scenes for Telepresence", 3DTV Conference: The True Vision - Capture, Transmission and Display of 3D Video, 3DTV-CON 2012, Zurich, Switzerland, October 15-17, 2012.
[3] Siltanen S., Hakkarainen M., Honkamaa P. (2007), "Automatic marker field calibration", in Proc. Virtual Reality International Conference (VRIC), Laval, France, April 2007, pp. 261-267.
[4] US2013/0109961 Al (2013), Raja Bose, Jonathan Lester, Jorg Brakensiek, "APPARATUS AND METHOD FOR PROVIDING DYNAMIC FIDUCIAL MARKERS FOR DEVICES", US Patent Application Publication, Nokia Corporation, May 2, 2013.
[5] Marcel Bruckner and Joachim Denzler (2010), "Active Self-calibration of Multi-camera Systems", M. Goesele et al. (Eds.): DAGM 2010, LNCS 6376, Springer-Verlag Berlin Heidelberg 2010, pp. 31-40.

Claims

1. A method comprising:
operating a camera marker to display an image of at least a first augmented reality marker; operating the camera marker to capture an image of at least a second augmented reality marker;
based on the image, determining a pose of the second augmented reality marker with respect to the camera marker; and
providing the pose to a position server.
2. The method of claim 1, further comprising operating a second camera marker to capture the image of the first augmented reality marker.
3. The method of claim 2, further comprising operating the second camera marker to display the second augmented reality marker.
4. The method of any of claims 2-3, wherein the second augmented reality marker is displayed on a second camera marker.
5. The method of any of claims 1-4, further comprising detecting an image of a user by the camera marker and determining a position of the user based on the image of the user.
6. The method of any of claims 1-5, wherein the position server operates to define a shared coordinate system.
7. The method of claim 6, further comprising rendering augmented reality content using the shared coordinate system.
8. The method of any of claims 1-5, wherein the position server operates to define a shared coordinate system and to determine a pose of the camera marker in the shared coordinate system.
9. The method of any of claims 1-8, further comprising controlling the camera marker to modify the first augmented reality marker, the modification being selected from the group consisting of changing, scaling and turning the augmented reality marker.
10. The method of claim 9, further comprising changing the rendering of augmented content in response to modification of the augmented reality marker.
11. The method of any of claims 1-10, further comprising:
determining a pose of an augmented reality viewing device using at least the first augmented reality marker; and
rendering augmented reality content on the augmented reality viewing device using the determined pose.
12. The method of any of claims 1-11, further comprising determining a pose of an augmented reality viewing device using at least the first augmented reality marker and the second augmented reality marker.
13. The method of claim 12, wherein the second augmented reality marker is displayed on a second camera marker.
14. A camera marker system comprising:
a first camera marker including a first display and a first front-facing camera;
a second camera marker including a second display and a second front-facing camera; wherein the first camera marker is positioned in a field of view of the second front-facing camera, and wherein the second camera marker is positioned in a field of view of the first front- facing camera;
a common position server; and
a processor and a non-transitory storage medium storing instructions operative, when executed on the processor, to perform functions including:
operating a camera marker to display an image of at least a first augmented reality marker;
operating the camera marker to capture an image of at least a second augmented reality marker; based on the image, determining a pose of the second augmented reality marker with respect to the camera marker; and
providing the pose to a position server.
15. The camera marker system of claim 14, further comprising an augmented reality system.

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201562202431P 2015-08-07 2015-08-07
US62/202,431 2015-08-07

Publications (1)

Publication Number Publication Date
WO2017027338A1 true WO2017027338A1 (en) 2017-02-16

Family

ID=56799552

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2016/045654 WO2017027338A1 (en) 2015-08-07 2016-08-04 Apparatus and method for supporting interactive augmented reality functionalities

Country Status (1)

Country Link
WO (1) WO2017027338A1 (en)

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8749396B2 (en) * 2011-08-25 2014-06-10 Sartorius Stedim Biotech GmbH Assembling method, monitoring method, communication method, augmented reality system and computer program product

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
LEDERMANN F ET AL: "Dynamically shared optical tracking", AUGMENTED REALITY TOOLKIT, THE FIRST IEEE INTERNATIONAL WORKSHOP, SEP. 29, 2002, PISCATAWAY, NJ, USA, IEEE, 1 January 2002 (2002-01-01), pages 76 - 83, XP010620356, ISBN: 978-0-7803-7680-9 *

Cited By (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108694730A (en) * 2017-04-06 2018-10-23 赫克斯冈技术中心 Near field manipulation of AR devices using image tracking
CN108694730B (en) * 2017-04-06 2022-06-24 赫克斯冈技术中心 Near field manipulation of AR devices using image tracking
US10403046B2 (en) * 2017-10-20 2019-09-03 Raytheon Company Field of view (FOV) and key code limited augmented reality to enforce data capture and transmission compliance
WO2019205850A1 (en) * 2018-04-27 2019-10-31 腾讯科技(深圳)有限公司 Pose determination method and device, intelligent apparatus, and storage medium
US11158083B2 (en) 2018-04-27 2021-10-26 Tencent Technology (Shenzhen) Company Limited Position and attitude determining method and apparatus, smart device, and storage medium
CN109087399A (en) * 2018-07-17 2018-12-25 上海游七网络科技有限公司 Method for rapidly synchronizing AR space coordinate system through positioning map
CN109087399B (en) * 2018-07-17 2024-03-01 上海游七网络科技有限公司 Method for rapidly synchronizing AR space coordinate system through positioning map
GB2584122A (en) * 2019-05-22 2020-11-25 Sony Interactive Entertainment Inc Data processing
GB2584122B (en) * 2019-05-22 2024-01-10 Sony Interactive Entertainment Inc Data processing
CN110365995A (en) * 2019-07-22 2019-10-22 视云融聚(广州)科技有限公司 Streaming media service method and system for merging augmented reality tags into video
WO2021110051A1 (en) 2019-12-05 2021-06-10 Guangdong Oppo Mobile Telecommunications Corp., Ltd. Method and system for associating device coordinate systems in a multi‐person ar system
EP4058874A4 (en) * 2019-12-05 2023-05-03 Guangdong Oppo Mobile Telecommunications Corp., Ltd. Method and system for associating device coordinate systems in a multi-person ar system
CN114730212A (en) * 2019-12-05 2022-07-08 Oppo广东移动通信有限公司 Method and system for associating device coordinate systems in a multi-person AR system
US11663736B2 (en) 2019-12-27 2023-05-30 Snap Inc. Marker-based shared augmented reality session creation
WO2021133942A1 (en) * 2019-12-27 2021-07-01 Snap Inc. Marker-based shared augmented reality session creation
WO2021175920A1 (en) * 2020-03-06 2021-09-10 Telefonaktiebolaget Lm Ericsson (Publ) Methods providing video conferencing with adjusted/modified video and related video conferencing nodes
CN111857341A (en) * 2020-06-10 2020-10-30 浙江商汤科技开发有限公司 Display control method and device
US11696011B2 (en) 2021-10-21 2023-07-04 Raytheon Company Predictive field-of-view (FOV) and cueing to enforce data capture and transmission compliance in real and near real time video
US11792499B2 (en) 2021-10-21 2023-10-17 Raytheon Company Time-delay to enforce data capture and transmission compliance in real and near real time video
US11700448B1 (en) 2022-04-29 2023-07-11 Raytheon Company Computer/human generation, validation and use of a ground truth map to enforce data capture and transmission compliance in real and near real time video of a local scene
CN115100276A (en) * 2022-05-10 2022-09-23 北京字跳网络技术有限公司 Method and device for processing picture image of virtual reality equipment and electronic equipment
CN115100276B (en) * 2022-05-10 2024-01-19 北京字跳网络技术有限公司 Method and device for processing picture image of virtual reality equipment and electronic equipment

Similar Documents

Publication Publication Date Title
WO2017027338A1 (en) Apparatus and method for supporting interactive augmented reality functionalities
US11488364B2 (en) Apparatus and method for supporting interactive augmented reality functionalities
US11798190B2 (en) Position and pose determining method, apparatus, smart device, and storage medium
US11195049B2 (en) Electronic device localization based on imagery
CN106920279B (en) Three-dimensional map construction method and device
US9699375B2 (en) Method and apparatus for determining camera location information and/or camera pose information according to a global coordinate system
US9558559B2 (en) Method and apparatus for determining camera location information and/or camera pose information according to a global coordinate system
JP5260705B2 (en) 3D augmented reality provider
CN110249291A (en) System and method for the augmented reality content delivery in pre-capture environment
KR102398478B1 (en) Feature data management for environment mapping on electronic devices
WO2017177019A1 (en) System and method for supporting synchronous and asynchronous augmented reality functionalities
TW201920985A (en) Apparatus and method for generating a representation of a scene
CN104169965A (en) Systems, methods, and computer program products for runtime adjustment of image warping parameters in a multi-camera system
WO2012041208A1 (en) Device and method for information processing
CN113936085B (en) Three-dimensional reconstruction method and device
KR102197615B1 (en) Method of providing augmented reality service and server for the providing augmented reality service
WO2018061172A1 (en) Imaging angle adjustment system, imaging angle adjustment method and program
JP2016194783A (en) Image management system, communication terminal, communication system, image management method, and program
CN111385481A (en) Image processing method and device, electronic device and storage medium
KR20180111224A (en) Terminal and method for controlling the same
KR20180041430A (en) Mobile terminal and operating method thereof
WO2021259287A1 (en) Depth map generation method, and device and storage medium
CN115830280A (en) Data processing method and device, electronic equipment and storage medium
JP7225016B2 (en) AR Spatial Image Projection System, AR Spatial Image Projection Method, and User Terminal
CN110443841B (en) Method, device and system for measuring ground depth

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16756846

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 16756846

Country of ref document: EP

Kind code of ref document: A1