US20130321575A1 - High definition bubbles for rendering free viewpoint video - Google Patents

High definition bubbles for rendering free viewpoint video

Info

Publication number
US20130321575A1
Authority
US
United States
Prior art keywords
fvv
high definition
regions
sub
computer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US13/598,747
Inventor
Adam Kirk
Neil Fishman
Don Gillett
Patrick Sweeney
Kanchan Mitra
David Eraker
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Microsoft Technology Licensing LLC
Original Assignee
Microsoft Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority to US201261653983P
Application filed by Microsoft Corp filed Critical Microsoft Corp
Priority to US13/598,747
Assigned to MICROSOFT CORPORATION. Assignment of assignors interest (see document for details). Assignors: ERAKER, DAVID; FISHMAN, NEIL; GILLET, DON; KIRK, ADAM; SWEENEY, PATRICK; MITRA, KANCHAN
Publication of US20130321575A1
Assigned to MICROSOFT TECHNOLOGY LICENSING, LLC. Assignment of assignors interest (see document for details). Assignors: MICROSOFT CORPORATION
Application status is Abandoned


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING; COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 15/00: 3D [Three Dimensional] image rendering
    • G06T 15/04: Texture mapping
    • G06T 15/08: Volume rendering
    • G06T 15/10: Geometric effects
    • G06T 15/20: Perspective computation
    • G06T 15/205: Image-based rendering
    • G06T 17/00: Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T 2210/00: Indexing scheme for image generation or computer graphics
    • G06T 2210/56: Particle system, point based geometry or rendering
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 13/00: Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N 13/10: Processing, recording or transmission of stereoscopic or multi-view image signals
    • H04N 13/106: Processing image signals
    • H04N 13/111: Transformation of image signals corresponding to virtual viewpoints, e.g. spatial image interpolation
    • H04N 13/117: Transformation of image signals corresponding to virtual viewpoints, the virtual viewpoint locations being selected by the viewers or determined by viewer tracking
    • H04N 13/194: Transmission of image signals
    • H04N 13/20: Image signal generators
    • H04N 13/204: Image signal generators using stereoscopic image cameras
    • H04N 13/239: Image signal generators using stereoscopic image cameras using two 2D image sensors having a relative position equal to or related to the interocular distance
    • H04N 13/243: Image signal generators using stereoscopic image cameras using three or more 2D image sensors
    • H04N 13/246: Calibration of cameras
    • H04N 13/257: Colour aspects
    • H04N 7/00: Television systems
    • H04N 7/14: Systems for two-way working
    • H04N 7/141: Systems for two-way working between two video terminals, e.g. videophone
    • H04N 7/142: Constructional details of the terminal equipment, e.g. arrangements of the camera and the display
    • H04N 7/15: Conference systems
    • H04N 7/157: Conference systems defining a virtual conference space and using avatars or agents
    • H04R: LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R 2227/00: Details of public address [PA] systems covered by H04R27/00 but not provided for in any of its subgroups
    • H04R 2227/005: Audio distribution systems for home, i.e. multi-room use
    • H04S: STEREOPHONIC SYSTEMS
    • H04S 2400/00: Details of stereophonic systems covered by H04S but not provided for in its groups
    • H04S 2400/15: Aspects of sound capture and related signal processing for recording or reproduction

Abstract

A “Dynamic High Definition Bubble Framework” allows local clients to display and navigate FVV of complex multi-resolution and multi-viewpoint scenes while reducing computational overhead and bandwidth for rendering and/or transmitting the FVV. Generally, the FVV is presented to the user as a broad view of the overall area from some distance away. Then, as the user zooms in or changes viewpoints, one or more areas of the overall scene are provided in higher definition or fidelity. Therefore, rather than capturing and providing high definition everywhere (at high computational and bandwidth costs), the Dynamic High Definition Bubble Framework captures one or more “bubbles” or volumetric regions in higher definition in locations where it is believed that the user will be most interested. This information is then provided to individual clients, allowing them to navigate and zoom different regions of the FVV during playback without losing fidelity or resolution in the zoomed areas.

Description

    CROSS-REFERENCE TO RELATED APPLICATIONS
  • This application claims the benefit under Title 35, U.S. Code, Section 119(e), of a previously filed U.S. Provisional Patent Application, Ser. No. 61/653,983 filed on May 31, 2012, by Simonnet, et al., and entitled “INTERACTIVE SPATIAL VIDEO,” the subject matter of which is incorporated herein by reference.
  • BACKGROUND
  • In general, in free-viewpoint video (FVV), multiple video streams are used to re-render a time-varying scene from arbitrary viewpoints. The creation and playback of a FVV is typically accomplished using a substantial amount of data. In particular, in FVV, scenes are generally simultaneously recorded from many different perspectives using sensors such as RGB cameras. This recorded data is then generally processed to extract 3D geometric information in the form of geometric proxies or models using various 3D reconstruction (3DR) algorithms. The original RGB data and geometric proxies are then recombined during rendering, using various image based rendering (IBR) algorithms, to generate multiple synthetic viewpoints.
  • Unfortunately, when a complex FVV such as a football game is recorded or otherwise captured, rendering the entire volume of the overall capture area to generate the FVV generally uses a very large dataset and a correspondingly large computational overhead for rendering the various viewpoints of the FVV for viewing on local clients.
  • SUMMARY
  • This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used as an aid in determining the scope of the claimed subject matter. Further, while certain disadvantages of prior technologies may be noted or discussed herein, the claimed subject matter is not intended to be limited to implementations that may solve or address any or all of the disadvantages of those prior technologies.
  • In general, a “Dynamic High Definition Bubble Framework” as described herein provides various techniques that allow local clients to display free viewpoint video (FVV) of complex 3D scenes while reducing computational overhead and bandwidth for rendering and/or transmitting the FVV. These techniques allow the client to perform spatial navigation through the FVV, while changing viewpoints and/or zooming into one or more higher definition regions or areas (specifically defined and referred to herein as “high definition bubbles”) within the overall area or scene of the FVV.
  • More specifically, the Dynamic High Definition Bubble Framework enables local rendering of FVV by providing a lower fidelity geometric proxy of an overall scene or viewing area in combination with one or more higher fidelity geometric proxies of the scene corresponding to regions of interest (e.g., areas of action in the scene that the user may wish to view in expanded detail and from one or more different viewpoints). This allows the user to view the entire volume of the scene as FVV, with interesting features or regions of the scene being provided in higher detail and optionally from a plurality of user-selectable viewpoints, while reducing the amount of data that is transmitted to the client for local rendering of the FVV. Note that the high definition bubbles may have differing resolution or fidelity levels as well as differing numbers of viewpoints. Further, some of these viewpoints may be available at different resolutions or fidelity levels even within the same high definition bubble.
  • The Dynamic High Definition Bubble Framework enables these capabilities by providing multiple areas or sub-regions of higher definition video capture within the overall viewing area or scene. One implementation of this concept is to use multiple cameras (e.g., a camera array or the like) surrounding the scene to capture the scene or event holistically, in whatever resolution is desired. Concurrently, a set of cameras (e.g., a camera array or the like) that zoom in on particular regions of interest within the overall scene are used to create higher definition geometric proxies that enable a higher quality viewing experience of “bubbles” associated with the zoomed regions of the scene.
  • For example, various embodiments of the Dynamic High Definition Bubble Framework are enabled by using captured image or video data to create a 3D representation (or other visual representation of the “real” world) of the overall space of a scene. One or more sub-regions (i.e., high definition bubbles) of the larger space of the overall scene are then transferred to the client as high definition geometric proxies while the remaining areas of the overall scene are transferred to the client using lower resolution geometric proxies. Advantageously, the sub-regions represented by the high definition bubbles can be in fixed or predefined positions (e.g., the end zone of a football field) or can move within the larger area of the overall scene (e.g., camera arrays following a ball or a particular player in a soccer game). These high definition bubbles are enabled by using any desired combination of fixed and moving camera arrays to capture high-resolution image data within one or more regions of interest relative to the area of the overall scene.
  • Captured image data is then used to generate geometric proxies or 3D models of the scene for local rendering of the FVV from any available viewpoint and at any desired resolution corresponding to the selected viewpoint. Note also that the FVV can be pre-rendered and sent to the client as a viewable and navigable FVV.
  • In particular, when used to stream 3D geometric proxies or models and corresponding RGB data to the client to locally render the FVV, the techniques enabled by the Dynamic High Definition Bubble Framework serve to reduce the amount of data used to render a specific viewpoint and resolution selected by the user when viewing or navigating the FVV. This approach is also applicable to server side rendering performance, when a video frame is generated on the server and transmitted to the client. In the server side example, using lower fidelity representations of areas that are far away from a region of interest (i.e., the desired viewpoint) in combination with using higher fidelity representations of the regions of interest reduces the time and computational overhead needed for generating video frames prior to transmission to the client.
  • In other words, in various embodiments, the Dynamic High Definition Bubble Framework creates a navigable FVV that presents a general or remote view (e.g., relatively far back from the action) of an overall volumetric space and then chooses an optimal dataset to use to render various portions of the FVV at the desired resolutions/fidelity. This allows the Dynamic High Definition Bubble Framework to seamlessly support varying resolutions for different regions while optimally choosing the appropriate dataset to process for the desired output. Advantageously, rendering regions within the high definition bubbles using higher resolutions allows the user to zoom into those regions without creating pixelization artifacts or other zoom-based viewing problems. In other words, even though the user is zooming into particular areas or regions, the FVV displayed to the user does not lose fidelity or resolution in those zoomed areas.
  • In view of the above summary, it is clear that the Dynamic High Definition Bubble Framework described herein provides various techniques that allow local clients to display and navigate FVV of complex multi-resolution and multi-viewpoint scenes while reducing computational overhead and bandwidth for rendering and/or transmitting the FVV. In addition to the just described benefits, other advantages of the Dynamic High Definition Bubble Framework will become apparent from the detailed description that follows hereinafter when taken in conjunction with the accompanying drawing figures.
  • DESCRIPTION OF THE DRAWINGS
  • The specific features, aspects, and advantages of the claimed subject matter will become better understood with regard to the following description, appended claims, and accompanying drawings where:
  • FIG. 1 provides an exemplary architectural flow diagram that illustrates program modules for using a “Dynamic High Definition Bubble Framework” for creating and navigating free viewpoint videos (FVV) of complex multi-resolution and multi-viewpoint scenes while reducing computational overhead and bandwidth for rendering and/or transmitting the FVV to clients, as described herein.
  • FIG. 2 provides an illustration of high definition bubbles within an overall viewing area or scene, as described herein.
  • FIG. 3 provides an illustration of the use of separate camera arrays to capture a high definition bubble and an overall viewing area, as described herein.
  • FIG. 4 provides a general system flow diagram that illustrates exemplary methods for implementing various embodiments of the Dynamic High Definition Bubble Framework for creating and navigating FVV's having high definition bubbles, as described herein.
  • FIG. 5 is a general system diagram depicting a simplified general-purpose computing device having simplified computing and I/O capabilities for use in implementing various embodiments of the Dynamic High Definition Bubble Framework, as described herein.
  • DETAILED DESCRIPTION OF THE EMBODIMENTS
  • In the following description of the embodiments of the claimed subject matter, reference is made to the accompanying drawings, which form a part hereof, and in which is shown by way of illustration specific embodiments in which the claimed subject matter may be practiced. It should be understood that other embodiments may be utilized and structural changes may be made without departing from the scope of the presently claimed subject matter.
  • 1.0 Introduction:
  • Note that some or all of the concepts described herein are intended to be understood in view of the overall context of the discussion of “Interactive Spatial Video” provided in U.S. Provisional Patent Application, Ser. No. 61/653,983 filed on May 31, 2012, by Simonnet, et al., and entitled “INTERACTIVE SPATIAL VIDEO,” the subject matter of which is incorporated herein by reference.
  • Note that various examples discussed in the following paragraphs refer to football games and football stadiums for purposes of explanation. However, it should be understood that the techniques described herein are not limited to any particular location, any particular activities, any particular size of volumetric space, or any particular number of scenes or objects.
  • In general, when a complex free-viewpoint video (FVV) of 3D scenes is recorded, one or more overall capture areas typically surround the “action”, which is confined to one or more smaller volumetric areas or sub-regions within the overall capture area. For example, in a football game, the size of the field is relatively large, but at any given time, the interesting action is generally centered on the ball and one or more players or athletes around the ball. While it is technically feasible to capture and render the entire capture volume at full fidelity, this would typically result in the generation of very large datasets to be sent from the server to the client for local rendering.
  • Advantageously, a “Dynamic High Definition Bubble Framework,” as described herein, provides various techniques that specifically address such concerns by providing the client with one or more lower fidelity geometric proxies of an overall viewing area or volumetric space. Concurrently, the Dynamic High Definition Bubble Framework provides one or more sub-regions of the overall viewing area as higher fidelity representations. Local clients then use this information to view and navigate through the overall FVV while providing the user with the capability to zoom into areas of higher fidelity. In other words, the Dynamic High Definition Bubble Framework provides various techniques that allow local clients to display and navigate FVV of complex multi-resolution and multi-viewpoint scenes while reducing computational overhead and bandwidth for rendering and/or transmitting the FVV. Advantageously, rendering regions within the high definition bubbles using higher resolutions allows the user to zoom into those regions without creating pixelization artifacts or other zoom-based viewing problems. In other words, even though the user is zooming into particular areas or regions, the FVV displayed to the user does not lose fidelity or resolution in those zoomed areas.
  • More specifically, the Dynamic High Definition Bubble Framework enables local rendering of image frames of the FVV by providing a lower fidelity geometric proxy of an overall scene in combination with one or more higher fidelity geometric proxies of the scene corresponding to regions of interest (e.g., areas of action in the scene that the user may wish to view in expanded detail). This allows the user to view the entire volume of the scene as FVV, with interesting features or regions of the scene being provided in higher detail in the event that the user zooms into such regions, while reducing the amount of data that is transmitted to the client for local rendering of the FVV.
  • One implementation of this concept is to use multiple cameras (e.g., camera arrays or the like) surrounding the scene to capture the scene or event holistically, in whatever resolution is desired. Concurrently, a set of cameras that zoom in on particular regions of interest within the overall scene (such as the “action” in a football game where a player is carrying the ball) are used to capture data for creating higher definition geometric proxies that enable a higher quality viewing experience of “bubbles” associated with the zoomed regions of the scene. These bubbles are specifically defined and referred to herein as “high definition bubbles.” Further, depending upon the available camera data, multiple viewpoints of potentially varying resolution or fidelity may be available within each bubble.
  • For any given scenario (e.g., sporting events, movie scenes, concerts, etc.), the Dynamic High Definition Bubble Framework typically presents a broad view of the overall viewing area or volumetric space from some distance away. Then, as the user zooms in or changes viewpoints, one or more areas of the overall scene or viewing area are provided in higher definition or fidelity. Therefore, rather than providing high definition everywhere (at high computational and bandwidth costs), the Dynamic High Definition Bubble Framework captures one or more bubbles in higher definition in locations or regions where it is believed that the user will be most interested. In other words, an author of the FVV will use the Dynamic High Definition Bubble Framework to capture bubbles in places where it is believed that users may want more detail, or where the author wants users to be able to explore the FVV in greater detail.
  • Bubbles can be presented to the user in various ways. For example, in displaying the FVV to the user, the user is provided with the capability to zoom and/or change viewpoints (e.g., pans, tilts, rotations, etc.). In the case that the user zooms into a region corresponding to a high definition bubble, the user will be presented with higher resolution image frames during the zoom. As such, there is no need to demarcate explicit regions of the FVV that contain high definition bubbles.
  • In other words, the user is presented with the entire scene and as they scroll through it, more data is available in areas (i.e., bubbles) where there is higher detail. For example, by zooming into a high definition bubble around a football, the user will see that there is more detail available to them, while if they zoom into the grass near the edge of a field where there is less action, the user will see less detail (assuming that there is no corresponding high definition bubble there). Therefore, by placing bubbles in areas where the user is expected to look for higher detail (such as a tight view in and around the ball when it is fumbled), more detail is available to the user where it is most wanted, while it is unlikely the user will zoom into an area off to one side of the field distant from the play. Consequently, when the user does zoom into the area around the ball, it creates the illusion that the user can zoom in anywhere.
  • In alternate embodiments of the Dynamic High Definition Bubble Framework, the FVV is presented with thumbnails or highlighting within or near the overall scene to alert the user as to locations, regions or bubbles (and optionally available viewpoints) of higher definition. For example, the Dynamic High Definition Bubble Framework can provide a FVV of a boxing match where the overall ring is in low definition, but the two fighters are within a high definition bubble. In this case, the FVV may include indications of either or both the existence of the high definition bubble around the fighters and various available viewpoints within that bubble such as a view of the opponent from either boxer's perspective.
  • Advantageously, the Dynamic High Definition Bubble Framework allows different users to have completely different viewing experiences. For example, in the case of a football game, one user can be zoomed into a bubble around the ball, while another user is zoomed into a bubble around cheerleaders on the edge of the football field, while yet another user is zoomed out to see the overall action on the entire field. Further, the same user can watch the FVV multiple times using any of a number of available zooms into one or more high definition bubbles and from any of a number of available viewpoints relative to any of those high definition bubbles.
  • 1.1 System Overview:
  • As noted above, the “Dynamic High Definition Bubble Framework,” provides various techniques that allow local clients to display and navigate FVV of complex multi-resolution and multi-viewpoint scenes while reducing computational overhead and bandwidth for rendering and/or transmitting the FVV. The processes summarized above are illustrated by the general system diagram of FIG. 1. In particular, the system diagram of FIG. 1 illustrates the interrelationships between program modules for implementing various embodiments of the Dynamic High Definition Bubble Framework, as described herein. Furthermore, while the system diagram of FIG. 1 illustrates a high-level view of various embodiments of the Dynamic High Definition Bubble Framework, FIG. 1 is not intended to provide an exhaustive or complete illustration of every possible embodiment of the Dynamic High Definition Bubble Framework as described throughout this document.
  • In addition, it should be noted that any boxes and interconnections between boxes that may be represented by broken or dashed lines in FIG. 1 represent alternate embodiments of the Dynamic High Definition Bubble Framework described herein, and that any or all of these alternate embodiments, as described below, may be used in combination with other alternate embodiments that are described throughout this document.
  • In general, as illustrated by FIG. 1, the processes enabled by the Dynamic High Definition Bubble Framework begin operation by using a data capture module 100 that uses multiple cameras or arrays to capture and generate 3D scene data 120 (e.g., geometric proxies, 3D models, RGB or other color space data, textures, etc.) for an overall viewing area and one or more viewpoints for one or more high definition bubbles within the overall viewing area.
  • In various embodiments, a user input module 110 is used for various purposes, including, but not limited to, defining and configuring one or more cameras and/or camera arrays for capturing an overall viewing area and one or more high definition bubbles. The user input module 110 is also used in various embodiments to define or specify one or more high definition bubbles, one or more viewpoints or view frustums, resolution or level of detail for one or more of the bubbles and one or more of the viewpoints, etc.
  • Typically, local clients will render video frames of the FVV from 3D scene data 120. However, in various embodiments, a pre-rendering module 130 uses the 3D scene data 120 to pre-render one or more FVV's that are then provided to one or more clients for viewing and navigation. In either case, a data transmission module 140 transmits either the pre-rendered FVV or 3D scene data 120 to one or more clients. The Dynamic High Definition Bubble Framework conserves bandwidth when transmitting to the client by only sending sufficient 3D scene data 120 for the level of detail desired to render image frames corresponding to an initial virtual navigation viewpoint or viewing frustum or one selected by the client. Following receipt of the 3D scene data 120, local clients use a local rendering module 150 to render one or more FVV's 160 or image frames of the FVV.
  • Finally, a FVV playback module 170 provides user-navigable interactive playback of the FVV in response to user navigation and zoom commands. In general, the FVV playback module 170 allows the user to pan, zoom, or otherwise navigate through the FVV. Further, user pan, tilt, rotation and zoom information is provided back to the local rendering module 150 or to the data transmission module for use in retrieving the 3D scene data 120 needed to render subsequent image frames of the FVV corresponding to user interaction and navigation through the FVV.
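  • For illustration only, the following Python sketch (not part of the original disclosure; the Proxy, Bubble, and SceneData structures, the thresholds, and the function names are all hypothetical) outlines one way the module flow described above might be organized, with the data transmission step sending high definition bubble data only when the client's current viewpoint and zoom level call for it.

        from dataclasses import dataclass, field

        @dataclass
        class Proxy:
            """Geometric proxy plus RGB/texture data at a given fidelity level."""
            fidelity: int              # e.g., 0 = coarse overall scene, higher = finer
            payload_bytes: bytes = b""

        @dataclass
        class Bubble:
            """A high definition sub-region of the overall viewing area."""
            bubble_id: str
            center: tuple              # (x, y, z) in scene coordinates
            radius: float
            proxies: list = field(default_factory=list)   # one Proxy per fidelity/viewpoint

        @dataclass
        class SceneData:
            """Sketch of the 3D scene data: coarse overall proxy plus bubbles."""
            overall_proxy: Proxy
            bubbles: list

        def _distance(a, b) -> float:
            return sum((ai - bi) ** 2 for ai, bi in zip(a, b)) ** 0.5

        def transmit_for_view(scene: SceneData, viewpoint, zoom: float) -> list:
            """Transmission sketch: send only the data needed for the client's
            current virtual viewpoint and zoom level."""
            selected = [scene.overall_proxy]           # always send the coarse scene
            for bubble in scene.bubbles:
                if zoom > 1.5 and _distance(viewpoint, bubble.center) < bubble.radius * 4:
                    selected.extend(bubble.proxies)    # add high definition data on demand
            return selected

        # Usage: the playback module feeds navigation back into transmission/rendering.
        scene = SceneData(Proxy(fidelity=0),
                          [Bubble("end_zone", (0.0, 0.0, 0.0), 10.0, [Proxy(2)])])
        frame_data = transmit_for_view(scene, viewpoint=(5.0, 0.0, 2.0), zoom=2.0)
        print(len(frame_data), "proxies sent for this frame")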
  • 2.0 Operational Details:
  • The above-described program modules are employed for implementing various embodiments of the Dynamic High Definition Bubble Framework. As summarized above, the Dynamic High Definition Bubble Framework provides various techniques that allow local clients to display FVV of complex scenes while reducing computational overhead and bandwidth for rendering and/or transmitting the FVV.
  • The following sections provide a detailed discussion of the operation of various embodiments of the Dynamic High Definition Bubble Framework, and of exemplary methods for implementing the program modules described in Section 1 with respect to FIG. 1. In particular, the following sections provide examples and operational details of various embodiments of the Dynamic High Definition Bubble Framework, including: an operational overview of the Dynamic High Definition Bubble Framework; exemplary FVV scenarios enabled by the Dynamic High Definition Bubble Framework; and data capture scenarios and FVV generation.
  • 2.1 Operational Overview:
  • As noted above, the Dynamic High Definition Bubble Framework-based processes described herein provide various techniques that allow local clients to display and navigate FVV of complex multi-resolution and multi-viewpoint scenes while reducing computational overhead and bandwidth for rendering and/or transmitting the FVV.
  • FIG. 2 illustrates various high definition bubbles within an overall viewing area 200, scene, or volumetric space. The Dynamic High Definition Bubble Framework generally uses various cameras or camera arrays to capture the overall viewing area 200 at some desired resolution level. One or more high definition bubbles within the overall viewing area 200 are then captured using various cameras or camera arrays at higher resolution or fidelity levels. As illustrated by FIG. 2, these high definition bubbles (e.g., 210, 220, 230, 240, 250 and 260) can have arbitrary shapes, sizes and volumes. Further, high definition bubbles (e.g., 210, 220, 230) can be in fixed positions to capture particular regions of the overall scene that may be of interest (e.g., end zones in a football game). The high definition bubbles (e.g., 240, 250 and 260) may also represent dynamic regions that move to follow action along arbitrary paths (e.g., 240) or along fixed paths (e.g., 250 to 260). Note also that moving high definition bubbles may sometimes extend outside the overall viewing area 200 (e.g., 260), though this may result in FVV image frames in which only the content of that high definition bubble is visible. One or more high definition bubbles may also overlap (e.g., 230).
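  • As a hypothetical sketch only (not part of the original disclosure; the spherical shape and all names are assumptions made for the example), the following Python fragment shows one way fixed and moving high definition bubbles such as those of FIG. 2 might be represented, including a simple containment test and overlap test.

        import numpy as np

        class HDBubble:
            """Sketch of a volumetric high definition region; a sphere is used
            here for simplicity, although arbitrary shapes and volumes are possible."""
            def __init__(self, center, radius, path=None):
                self.center0 = np.asarray(center, dtype=float)
                self.radius = float(radius)
                self.path = path        # optional callable t -> displacement (moving bubble)

            def center(self, t=0.0):
                if self.path is None:   # fixed bubble (e.g., an end zone)
                    return self.center0
                return self.center0 + np.asarray(self.path(t))   # bubble following the action

            def contains(self, point, t=0.0):
                return np.linalg.norm(np.asarray(point, dtype=float) - self.center(t)) <= self.radius

            def overlaps(self, other, t=0.0):
                d = np.linalg.norm(self.center(t) - other.center(t))
                return d <= self.radius + other.radius

        # Example: a fixed end-zone bubble and a bubble that tracks the ball along a path.
        end_zone = HDBubble(center=(0, 0, 0), radius=10.0)
        ball = HDBubble(center=(20, 0, 0), radius=3.0, path=lambda t: (-2.0 * t, 0.0, 0.0))
        print(end_zone.contains((1, 2, 0)), end_zone.overlaps(ball, t=5.0))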
  • FIG. 3 illustrates the use of separate camera arrays to capture a high definition bubble 330 using a camera array (e.g., cameras 335, 340, 345 and 350) within an overall viewing area 300 that is in turn captured by a set of cameras (e.g., 305, 310, and 315) at a lower fidelity level than that of the high definition bubble.
  • Various embodiments of the Dynamic High Definition Bubble Framework are enabled by using captured image or video data to create a 3D representation (or other visual representation of the “real” world) of the overall space of a scene. One or more sub-regions (i.e., high definition bubbles) of the larger space of the overall scene are then transferred to the client as high definition geometric proxies or 3D models while the remaining areas of the overall scene are transferred to the client using lower definition geometric proxies or 3D models. Advantageously, as noted above, the sub-regions represented by the high definition bubbles can be in fixed or predefined positions (e.g., the end zone of a football field) or can move within the larger area of the overall scene (e.g., following a ball or a particular player in a soccer game). These high definition bubbles are enabled by using any desired combination of fixed and moving camera arrays to capture high-resolution image data within one or more regions of interest relative to the area or volume of the overall scene.
  • Consequently, when used to stream both 3D geometric and RGB data from the server to the client, the FVV processing techniques enabled by the Dynamic High Definition Bubble Framework serve to reduce the amount of data used to render a specific viewpoint selected by the user when viewing a FVV. This approach is also applicable to server side rendering performance, when a video frame is generated on the server and transmitted to the client. In the server side example, using lower fidelity representations of areas that are far away from a region of interest (i.e., the desired viewpoint) in combination with using higher fidelity representations of the regions of interest reduces the time and computational overhead needed for generating video frames prior to transmission to the client.
  • 2.2 Exemplary FVV Scenarios:
  • The Dynamic High Definition Bubble Framework enables a wide variety of viewing scenarios for clients or users. As noted above, since the user is provided with the opportunity to navigate and zoom the FVV during playback, the viewing experience can be substantially different for individual viewers of the same FVV.
  • For example, considering a football game in a typical stadium, the Dynamic High Definition Bubble Framework uses a number of cameras or camera arrays to capture sufficient views to create an overall 3D view of the stadium at low to medium definition or fidelity (i.e., any desired fidelity level). In addition, the Dynamic High Definition Bubble Framework will also capture one or more specific locations or “bubbles” at a higher definition or fidelity and with a plurality of available viewpoints. Note that these bubbles are captured using fixed or movable cameras or camera arrays. For example, again considering the football game, the Dynamic High Definition Bubble Framework may have fixed cameras or camera arrays around the end zone to capture high definition images in these regions at all times. Further, one or more sets of moving cameras or camera arrays can follow the ball or particular players around the field to capture images of the ball or players from multiple viewpoints.
  • Generally, in the case of a football field, it would be difficult to capture every part of the entire field and all of the action in high definition without using very large amounts of data. Consequently, the Dynamic High Definition Bubble Framework captures and provides an overall view of the field by using some number of cameras capturing the overall field. Then, the Dynamic High Definition Bubble Framework uses one or more sets of cameras that capture the regions around the ball, specific players, etc., so that the overall low definition general background of the football field can be augmented by user navigable high definition views of what is going on in 3D in the “bubbles.” In other words, in various embodiments, the Dynamic High Definition Bubble Framework generally presents a general or remote view (e.g., relatively far back from the action) of an overall volumetric space and then layers or combines navigable high definition bubbles with the overall volumetric space based on a determination of the proper geometric registration or alignment of those high definition bubbles within the overall volumetric space.
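  • To make the idea of geometric registration concrete, the following Python fragment is a minimal sketch (not part of the original disclosure; the rigid-transform formulation and the names are assumptions) of how a high definition bubble's geometric proxy might be placed into the coordinate frame of the overall volumetric space once a rotation and translation have been determined.

        import numpy as np

        def register_bubble_proxy(vertices, R, t):
            """Sketch: align a bubble proxy's vertices with the overall scene
            using a rigid transform (rotation R, translation t) obtained from
            geometric registration/alignment against the environment model."""
            vertices = np.asarray(vertices, dtype=float)       # (N, 3) proxy vertices
            return vertices @ np.asarray(R, dtype=float).T + np.asarray(t, dtype=float)

        # Example with an identity rotation and a translation onto the field.
        proxy_vertices = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
        aligned = register_bubble_proxy(proxy_vertices, R=np.eye(3), t=[50.0, 20.0, 0.0])
        print(aligned)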
  • In the case of a movie or the like, the Dynamic High Definition Bubble Framework enables the creation of movies where the user is provided with the capability to move around within a particular scene (i.e., change viewpoints) and to view particular parts of the scene, which are within bubbles, in higher definition while the movie is playing.
  • 2.3 Exemplary Data Capture Scenarios and FVV Generation:
  • The following paragraphs describe various examples of scenarios involving the physical placement and geometric configuration of various cameras and camera arrays within a football stadium to capture multiple high definition bubbles and virtual viewpoints for navigation of FVV's of a football game with associated close-ups and zooms corresponding to the high definition bubbles and virtual viewpoints. It should be understood that the following examples are provided only for purposes of explanation and are not intended to limit the scope or use of the Dynamic High Definition Bubble Framework to the examples presented, to the particular camera array configurations or geometries discussed, or to the positioning or use of particular high definition bubbles or virtual viewpoints.
  • In general, understanding where cameras or camera arrays will be deployed and the geometry associated with those cameras determines how the resulting 3D scene data will be processed in an interactive Spatial Video (SV) and subsequently rendered to create the FVV for the user or client. In the case of a typical professional football game, it is assumed that all cameras and related technology for capturing images, following action scenes or the ball, cutting to particular locations or persons, etc., exists inside or above the stadium. In some cases, the cameras will record elements before the game. In other cases, the cameras will be used in the live broadcast of the game. In this example, there are several primary configurations, including, but not necessarily limited to the following:
      • Asset Arrays—Camera arrays referred to as “asset arrays” are used to capture 3D image data of players, cheerleaders, coaches, referees, and any other items or people which may appear on the field before the game. Following processing of the raw image data, the output of these asset arrays is both an image intensive photorealistic rendering and a high fidelity geometric proxy similar to a CGI asset for any imaged items or people. This information can then be used in subsequent rendering of the FVV.
      • Environment Model—Mobile SLR cameras, mobile video cameras, laser range scanners, etc., are used to build an image-based geometric proxy for the stadium environment before the game from 3D image data captured by one or more camera arrays. This 3D image data is then generally used to generate a geometric proxy or 3D model of the overall environment. Further, this geometric proxy or 3D model can be edited or modified to suit particular purposes (e.g., modified to allow dynamic placement of advertising messages along a stadium wall or other location during playback of the resulting FVV).
      • Fixed Arrays—Fixed camera arrays are used to capture 3D image data of various game elements or features for insertion into the FVV. These elements include, but are not limited to announcers, ‘talking heads’, player interviews, intra-game fixed physical locations around the field, etc.
      • Moving Arrays—Mobile camera arrays are used to capture 3D image data of intra-game action on the field. Note that these are the same types of mobile cameras that are currently used to record action in professional football games, though additional numbers of cameras may be used to capture 3D image data of the intra-game action. Note that image or video data captured by fans viewing the game from inside the stadium using cell phones or other cameras can also be used by the Dynamic High Definition Bubble Framework to record intra-game action on the field.
  • 2.3.1 Asset Arrays:
  • In general, “asset arrays” are dense, fixed camera arrays optimized for creating a static (or moving) geometric proxy of an asset. Assets include any object or person who will be on the field such as players, cheerleaders, referees, footballs, or other equipment. The camera geometry of the asset arrays is optimized for the creation of high fidelity geometric proxies, which requires a ‘full 360’ arrangement of sensors so that all aspects of the asset can be recorded and modeled; additional sensors may be placed above or below the assets. Note that in some cases, ‘full 360’ coverage may not be possible (e.g., views partially obstructed along some range of viewing directions), and that in such cases, user selection of viewpoints in the resulting FVV will be limited to whatever viewpoints can be rendered from the captured data. In addition to RGB (or other color space) cameras in the asset array, other sensor combinations such as active IR based stereo (also used in Kinect® or time of flight type applications) can be used to assist in 3D reconstruction. Additional techniques such as the use of green screen backgrounds can further assist in segmentation of the assets for use in creating high fidelity geometric proxies of those assets.
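  • Purely as an illustration (not part of the original disclosure; the circular layout and names are assumptions), the following Python fragment computes evenly spaced camera positions and viewing directions for a simple ‘full 360’ ring around an asset.

        import math

        def ring_camera_poses(num_cameras, radius, target=(0.0, 0.0, 1.0)):
            """Sketch: place cameras evenly on a circle around an asset so all
            sides of the asset are covered; each pose is a position plus a unit
            viewing direction toward the asset center."""
            poses = []
            for i in range(num_cameras):
                angle = 2.0 * math.pi * i / num_cameras
                position = (radius * math.cos(angle), radius * math.sin(angle), target[2])
                direction = tuple((t - p) / radius for p, t in zip(position, target))
                poses.append((position, direction))
            return poses

        for position, direction in ring_camera_poses(num_cameras=8, radius=3.0):
            print(position, direction)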
  • Asset arrays are generally utilized prior to the game and focus on static representations of the assets. Once recorded, these assets can be used as SV content for creating FVV's in two different ways, depending on the degree of geometry employed in their representation using image-based rendering (IBR).
  • Firstly, a low-geometry IBR method, including, but not limited to, view interpolation, can be used to place the asset (players or cheerleaders) online using technology including, but not limited to, browser-based 2D or 3D rendering engines. This also allows users to view single assets with a web browser or the like to navigate around a coordinate system that allows them to zoom in to the players (or other assets) from any angle, thus providing the user or viewer with high levels of photorealism with respect to those assets. Again, rendering regions within the high definition bubbles using higher resolutions allows the user to zoom into those regions without losing fidelity or resolution in the zoomed areas, or otherwise creating pixelization artifacts or other zoom-based viewing problems. In other implementations, video can be used to highlight different player/cheerleader promotional activities such as a throw, catch, block, cheer, etc. Note that various examples of view interpolation and view morphing for such purposes are discussed in the aforementioned U.S. Provisional Patent Application, the subject matter of which is incorporated herein by reference.
  • Secondly, a high fidelity geometry proxy of the players (or other persons such as cheerleaders, referees, coaches, announcers, etc.) is created and combined with view dependent texture mapping (VDTM) for use in close up FVV scenarios. To use these geometric proxies in FVV, a kinematic model for a human is used as a baseline for possible motions and further articulated based on RGB data from live-action video camera arrays. Multi-angle video data is then used to realistically articulate the geometric proxies for all players or a subset of players on the field. Advantageously, 6 degrees of freedom (6-DOF) movement of the user's viewpoint during playback of FVV is possible due to the explicit use of 3D geometry in representing the assets. Again, various techniques for rendering and viewing the 3D content of the FVV are discussed in the aforementioned U.S. Provisional Patent Application, the subject matter of which is incorporated herein by reference.
  • 2.3.2 Environment Model:
  • A model of the environment is useful to the FVV of the football game in a number of different ways, such as providing a calibration framework for live-action moving cameras, creating interstitial effects when transitioning between known real camera feeds, determining the accurate placement (i.e., registration or alignment) of various geometric proxies (generated from the high definition bubbles) for FVV, improving segmentation results with background data, accurately representing the background of the scene using image-based-rendering methods in different FVV use cases, etc.
  • As is well known to those skilled in the art, a number of conventional techniques exist for modeling the environment using RGB (or other color space) photos with a sparse geometric representation of the scene. For example, in the case of Photosynth®, sparse geometry means that only enough geometry is extracted to enable the alignment of multiple photographs into a cohesive montage. However, in any scenario, such as the football game scenario, the Dynamic High Definition Bubble Framework provides richer 3D rendering by using much more geometry. More specifically, geometric proxies corresponding to each high definition bubble are registered or aligned to the geometry of the environment model. Once properly positioned, the various geometric proxies are then used to render the frames of the FVV.
  • Traditional environment models are often created using a variety of sensors such as moving video cameras, fixed cameras for high resolution static images, and laser based range scanning devices. RGB data from video cameras and fixed camera data can be processed using conventional 3D reconstruction methods to identify features and their location; point clouds of the stadium can be created from these features. Additional geometry, also in the form of point clouds, can be extracted using range scanning devices for additional accuracy. Finally, the point cloud data can be merged together, meshed, and textured into a cohesive geometric model. This geometry can also be used as an infrastructure to organize RGB data for use in other IBR approaches for backgrounds useful for FVV functionality.
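  • The following Python fragment is a minimal sketch only (not part of the original disclosure; the voxel-grid reduction step and all names are assumptions) of how point clouds from different sensors might be merged and thinned before being meshed and textured into a cohesive geometric model.

        import numpy as np

        def merge_and_downsample(point_clouds, voxel_size=0.05):
            """Sketch: merge point clouds from video frames, fixed cameras, and
            range scanners into one cloud, then keep one representative point
            per voxel; meshing and texturing are not shown."""
            merged = np.vstack([np.asarray(pc, dtype=float) for pc in point_clouds])
            keys = np.floor(merged / voxel_size).astype(np.int64)
            _, first_index = np.unique(keys, axis=0, return_index=True)
            return merged[np.sort(first_index)]

        clouds = [np.random.rand(1000, 3), np.random.rand(500, 3) + 0.5]
        print(merge_and_downsample(clouds).shape)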
  • Similar to the use of asset arrays, an environment model is created and processed before being used in any live-action footage provided by the FVV. Various methods associated with FVV live action, as discussed below, are made possible by the creation of an environment model including interstitials, moving camera calibration, and geometry-articulation.
  • In the simplest use of background models, interstitial movements between real camera positions are enabled, allowing users to more clearly understand where various camera feeds are located. In any SV scenario involving FVV, real camera feeds will have the highest degree of photorealism and will be widely utilized. When a viewer elects to change real camera views—instead of immediately switching to the next video feed—a smooth and sweeping camera movement is optionally enabled by rendering a virtual transition from the viewpoint of one camera view to the other to provide additional spatial information about the location of the cameras relative to the scene.
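  • One plausible way to produce such a sweeping interstitial move, sketched here for illustration only (not part of the original disclosure; the quaternion representation and names are assumptions), is to interpolate position linearly and orientation spherically between the two real camera poses.

        import numpy as np

        def slerp(q0, q1, s):
            """Spherical linear interpolation between two unit quaternions."""
            q0, q1 = np.asarray(q0, float), np.asarray(q1, float)
            dot = np.dot(q0, q1)
            if dot < 0.0:                    # take the short way around
                q1, dot = -q1, -dot
            if dot > 0.9995:                 # nearly identical: fall back to lerp
                q = q0 + s * (q1 - q0)
                return q / np.linalg.norm(q)
            theta = np.arccos(dot)
            return (np.sin((1 - s) * theta) * q0 + np.sin(s * theta) * q1) / np.sin(theta)

        def interstitial_path(pos0, quat0, pos1, quat1, steps=30):
            """Sketch: a smooth virtual camera move between two real camera
            poses (position + orientation quaternion) used when switching feeds."""
            for i in range(steps + 1):
                s = i / steps
                position = (1 - s) * np.asarray(pos0, float) + s * np.asarray(pos1, float)
                yield position, slerp(quat0, quat1, s)

        for position, orientation in interstitial_path([0, 0, 10], [1, 0, 0, 0],
                                                       [20, 5, 10], [0.7071, 0, 0.7071, 0],
                                                       steps=5):
            print(position, orientation)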
  • Additional FVV scenarios make advantageous use of the environment model by using both fixed and moving camera arrays to enable FVV functionality. In the case of moving cameras, these are used to provide close-ups of action on the field (i.e., by registering or positioning geometric proxies generated from the high definition bubbles with the environment model). To use moving cameras for FVV, individual video frames are continuously calibrated based on their orientation and optical focus, as discussed in the aforementioned U.S. Provisional Patent Application, the subject matter of which is incorporated herein by reference.
  • In general, the Dynamic High Definition Bubble Framework uses structure from motion (SFM) based approaches, as discussed in the aforementioned U.S. Provisional Patent Application, the subject matter of which is incorporated herein by reference, to calibrate the moving cameras based on high resolution static RGB images captured during the environment modeling stage. Finally, for close up FVV functionality, the Dynamic High Definition Bubble Framework relies upon the aforementioned articulation of the high-fidelity geometric proxies for the assets (players) using data from both fixed and moving camera arrays. These proxies are then positioned (i.e., registered or aligned) in the correct location on the field by determining where these assets are located relative to the environment model, as discussed in the aforementioned U.S. Provisional Patent Application, the subject matter of which is incorporated herein by reference.
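  • As a rough illustration only (not part of the original disclosure), the following Python fragment uses OpenCV's standard solvePnP routine as a stand-in for the per-frame calibration described above: given 2D features in a moving-camera frame matched to 3D points from the environment model, it recovers that frame's pose. The function name and the zero-distortion assumption are hypothetical.

        import numpy as np
        import cv2  # OpenCV, used here only to illustrate a standard PnP solve

        def calibrate_moving_frame(points_3d, points_2d, camera_matrix):
            """Sketch: recover a moving-camera frame's rotation and translation
            from 2D-3D correspondences against the environment model; at least
            four correspondences are needed."""
            dist_coeffs = np.zeros(5)                 # assume negligible lens distortion
            ok, rvec, tvec = cv2.solvePnP(
                np.asarray(points_3d, dtype=np.float32),
                np.asarray(points_2d, dtype=np.float32),
                camera_matrix, dist_coeffs)
            if not ok:
                raise RuntimeError("pose could not be recovered for this frame")
            R, _ = cv2.Rodrigues(rvec)                # rotation vector -> 3x3 matrix
            return R, tvec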
  • 2.3.3 Fixed Arrays:
  • Fixed camera arrays are used in various scenarios associated with the football game, including intra-game focused footage as well as collateral footage. The defining characteristic of the fixed arrays is that the cameras do not move relative to the scene.
  • For example, consider the use of FVV functionality for non-game collateral footage—this could include interviews with players or announcers. Further, consider an announcers' stage having a medium density array of fixed RGB video cameras arranged in a 180-degree camera geometry pointing towards the stage for capturing 3D scene data of persons and assets on the stage. In this case, the views being considered generally include close-up views of humans, focused on the face, with limited need for full 6-DOF spatial navigation. Consequently, an IBR approach such as view interpolation, view morphing, or view warping would use a less explicit geometric proxy for the scene, which would therefore emphasize photorealism at the expense of viewpoint navigation.
  • One use of this FVV functionality is that viewers (or producers) can enable real-time smooth pans between the different announcers as they comment and react. Another application of these ideas is to change views between the announcers and a top down map of the play presented next to the announcers. Another example scenario includes zooming in on a specific cheerleader doing a cheer, assuming that the fixed array is positioned on the field in an appropriate location for such views. In these scenarios, FVV navigation would be primarily limited to synthetic viewpoints between real camera positions or the axis of the camera geometry. However, by using the available 3D scene data for rendering the image frames, the results would be almost indistinguishable from real camera viewpoints.
  • The intra-game functionality discussed below highlights various benefits and advantages to the user when using the FVV technology described herein. For example, consider two classes of fixed arrays, one sparse array positioned with whole or partial views of the field from high vantage points within the stadium and another where denser fixed cameras are positioned around the actual field such as in the end zone to capture a high definition bubble of the end zone.
  • In the case of high vantage point sparse arrays, this video data can be used to enable both far and medium FVV viewpoint control both during the game and during playback. This is considered a sparse array because the relative volume of the stadium is rather large and the distance between sensors is high. In this case, image-based rendering methods such as billboards and articulated billboards may be used to provide two-dimensional representations of the players on the field. These billboards are created using segmentation approaches, which are enabled partially by the environment model. These billboards maintain the photorealistic look of the players, but they do not include the explicit geometry of the players (such as when the players are represented as high fidelity geometric proxies). However, it should be understood that, in general, navigation in the FVV is independent of the representation used.
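  • To make the billboard idea concrete, the following Python fragment is a minimal sketch (not part of the original disclosure; the dimensions and names are assumptions) of a textured quad centered on a player's field position and rotated so it always faces the virtual camera; the segmented player image would then be mapped onto this quad.

        import numpy as np

        def billboard_quad(player_pos, camera_pos, width=1.0, height=2.0, up=(0.0, 0.0, 1.0)):
            """Sketch: return the four corners of a camera-facing quad anchored
            at the player's feet (bottom-left, bottom-right, top-right, top-left)."""
            player_pos = np.asarray(player_pos, float)
            to_camera = np.asarray(camera_pos, float) - player_pos
            to_camera /= np.linalg.norm(to_camera)
            right = np.cross(np.asarray(up, float), to_camera)
            right /= np.linalg.norm(right)
            up_vec = np.cross(to_camera, right)
            half_w = 0.5 * width
            return np.array([player_pos - half_w * right,
                             player_pos + half_w * right,
                             player_pos + half_w * right + height * up_vec,
                             player_pos - half_w * right + height * up_vec])

        print(billboard_quad(player_pos=(10, 5, 0), camera_pos=(0, -40, 20)))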
  • Next, denser fixed arrays on the field such as around the end zone for capturing high definition bubbles allow for highly photorealistic viewpoints during both live action and replay. Similar to the announcers' stage discussed above, viewpoint navigation would be largely constrained by the camera axis using image-based-rendering methods similar to those described for the announcers' stage. For the most part, these types of viewpoints are specifically enabled when camera density is at an appropriate level and therefore are not generally enabled for all locations within the stadium. In other words, dense camera arrays are used for capturing sub-regions of the overall stadium as high definition bubbles for inclusion in the FVV. In general, these methods are unsuitable for medium and sparse configurations of sensors.
  • 2.3.4 Moving Arrays:
  • Typical intra-game football coverage comes from moving cameras for both live action coverage and for replays. The preceding discussion regarding camera arrays generally focused on creating high fidelity geometric proxies of players and assets, how an environment model of the stadium can be leveraged to enhance the FVV, and the use of intra-game fixed camera arrays in both sparse and dense configurations. The Dynamic High Definition Bubble Framework ties these elements together with sparse moving camera arrays to enable additional FVV functionality: medium shots that use billboards, and close-up shots that leverage full 6-DOF spatial navigation with high fidelity geometric proxies of players or other assets or persons, all captured using conventional game cameras and camera operators. In other words, moving camera arrays are used to capture high definition bubbles used in generating FVV's.
  • Moving cameras in the array are continuously calibrated using SFM approaches leveraging the environment model. The optical zoom functionality of these moving cameras is also used to capture image data within high definition bubbles, including the use of prior frames to help further refine or identify a zoomed-in camera geometry. Once the individual frames of the moving cameras have been registered to the geometry of the environment model (i.e., correctly positioned within the stadium), additional image-based-rendering methods are enabled for different FVV functionality based on the contributing camera geometries, including RGB articulated geometric proxies with maximal spatial navigation, and billboard methods that emphasize photorealism with less spatial navigation.
  • For example, to enable close up replays with full 6-DOF viewpoint control during playback, the Dynamic High Definition Bubble Framework uses image data from the asset arrays, fixed arrays, and moving arrays. First, the relative position of the players is tracked on the field using one or more fixed arrays. In this way, the approximate location of any player on the field is known. This allows the Dynamic High Definition Bubble Framework to determine which players are in a zoomed in moving camera field of view. Next, based on the identification of the players in the zoomed in fields of view, the Dynamic High Definition Bubble Framework selects the appropriate high-fidelity geometric proxies for each player that were created earlier using the asset arrays.
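  • The following Python fragment is a minimal sketch only (not part of the original disclosure; the angular field-of-view test and all names are assumptions) of how tracked player positions might be tested against a zoomed-in moving camera's field of view so that the matching high fidelity proxies can be selected.

        import numpy as np

        def players_in_view(player_positions, camera_pos, camera_dir, fov_degrees):
            """Sketch: return the tracked players whose positions fall inside a
            camera's (conical) field of view."""
            camera_pos = np.asarray(camera_pos, float)
            camera_dir = np.asarray(camera_dir, float)
            camera_dir /= np.linalg.norm(camera_dir)
            half_fov = np.radians(fov_degrees) / 2.0
            visible = []
            for player_id, position in player_positions.items():
                to_player = np.asarray(position, float) - camera_pos
                distance = np.linalg.norm(to_player)
                if distance == 0.0:
                    continue
                angle = np.arccos(np.clip(np.dot(to_player / distance, camera_dir), -1.0, 1.0))
                if angle <= half_fov:
                    visible.append(player_id)
            return visible

        tracked = {"QB_12": (30, 10, 0), "WR_88": (60, -5, 0), "K_3": (-40, 0, 0)}
        print(players_in_view(tracked, camera_pos=(0, -50, 15),
                              camera_dir=(0.5, 1.0, -0.2), fov_degrees=20))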
  • Finally, using a kinematic model for known human motion as well as conventional object recognition techniques applied to RGB video (from both fixed and moving cameras), the Dynamic High Definition Bubble Framework determines the spatial orientation of specific players on the field and articulates their geometric proxies as realistically as possible. Note that this also helps in filling in occluded areas (using various hole-filling techniques) when there were insufficient numbers or placements of cameras to capture a view. When the geometric proxies are mapped to their correct location on the field in both space and time, the Dynamic High Definition Bubble Framework then derives a full 6-DOF FVV replay experience for the user. In this way, users or clients can literally view a play from any potential position including close-up shots as well as intra-field camera positions. Advantageously, the net effect here is to enable interactive replays similar to what is possible with various Xbox® football games such as the “Madden NFL” series of electronic games by Electronic Arts Inc, although with real data.
  • In addition, multiple moving cameras focused on the same physical location on the field can enable medium and close-up views that use IBR methods with less explicit geometry, such as billboard methodologies. These cameras can be combined with data from both the environment model and the fixed arrays to create additional FVV viewpoints within the stadium.
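A billboard in this context is a viewpoint-facing quad placed at a tracked player position and textured with that player's silhouette cut from a nearby real camera. The sketch below computes such a quad; the dimensions, the upright-billboard constraint, and the field coordinate convention (y up) are assumptions made for illustration.

```python
import numpy as np


def player_billboard(player_pos, viewpoint_pos, height=2.0, width=1.0):
    """Return the four corners (field coordinates, y up) of an upright quad
    centered on a player and rotated to face the current virtual viewpoint."""
    p = np.asarray(player_pos, dtype=float)
    up = np.array([0.0, 1.0, 0.0])
    view_dir = np.asarray(viewpoint_pos, dtype=float) - p
    view_dir[1] = 0.0                                    # keep the billboard vertical
    view_dir /= np.linalg.norm(view_dir)
    right = np.cross(up, view_dir)                       # horizontal axis of the quad
    right /= np.linalg.norm(right)
    half_w = width / 2.0
    return np.array([p - right * half_w,                 # bottom-left
                     p + right * half_w,                 # bottom-right
                     p + right * half_w + up * height,   # top-right
                     p - right * half_w + up * height])  # top-left
```

Texturing the quad from the real camera whose viewpoint is angularly closest to the virtual viewpoint is one common billboard heuristic, trading spatial navigation range for photorealism as described above.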
  • 3.0 Operational Summary:
  • The processes described above with respect to FIG. 1 through FIG. 3, and further described in Sections 1 and 2, are illustrated by the general operational flow diagram of FIG. 4. In particular, FIG. 4 provides an exemplary operational flow diagram that summarizes the operation of some of the various embodiments of the Dynamic High Definition Bubble Framework. Note that FIG. 4 is not intended to be an exhaustive representation of all of the various embodiments of the Dynamic High Definition Bubble Framework described herein, and that the embodiments represented in FIG. 4 are provided only for purposes of explanation.
  • Further, it should be noted that any boxes and interconnections between boxes that are represented by broken or dashed lines in FIG. 4 represent optional or alternate embodiments of the Dynamic High Definition Bubble Framework described herein, and that any or all of these optional or alternate embodiments, as described below, may be used in combination with other alternate embodiments that are described throughout this document.
  • In general, as illustrated by FIG. 4, the Dynamic High Definition Bubble Framework begins operation by capturing (410) 3D image data for an overall viewing area and for one or more high definition bubbles within the overall viewing area. The Dynamic High Definition Bubble Framework then uses the captured data to generate (420) one or more 3D geometric proxies or models for use in generating a Free Viewpoint Video (FVV). For each FVV, a view frustum for an initial or user-selected virtual navigation viewpoint is then selected (430). The Dynamic High Definition Bubble Framework then selects (440) an appropriate level of detail for regions in the view frustum based on distance from the viewpoint. Further, as discussed herein, the Dynamic High Definition Bubble Framework uses higher fidelity geometric proxies for regions corresponding to high definition bubbles and lower fidelity geometric proxies for other regions of the overall viewing area.
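Step (440) can be illustrated with a simple per-region level-of-detail choice that combines distance from the viewpoint with high definition bubble membership. This is a sketch only: the tier names, distance thresholds, and region dictionary layout are assumptions, not terms used in the patent.

```python
import math


def select_proxy_detail(regions, viewpoint, near=10.0, far=60.0):
    """Choose a proxy level of detail for each region already inside the view
    frustum: high definition bubble regions get higher-fidelity proxies, and
    every region is stepped down as its distance from the viewpoint grows."""
    selections = {}
    for region in regions:                     # e.g. {"id", "center", "in_hd_bubble"}
        dx, dy, dz = (region["center"][i] - viewpoint[i] for i in range(3))
        distance = math.sqrt(dx * dx + dy * dy + dz * dz)
        if region["in_hd_bubble"] and distance < near:
            lod = "full_geometric_proxy"       # articulated, high-resolution texture
        elif region["in_hd_bubble"]:
            lod = "reduced_geometric_proxy"
        elif distance < far:
            lod = "billboard"
        else:
            lod = "environment_model_only"     # coarse geometry of the viewing area
        selections[region["id"]] = lod
    return selections
```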
  • The Dynamic High Definition Bubble Framework then provides (450) one or more clients with 3D geometric proxies corresponding to the view frustum, with those geometric proxies having a level of detail sufficient to render the scene (or other objects or people within the current viewpoint) from a viewing frustum corresponding to a user-selected virtual navigation viewpoint. Given this data, the FVV is rendered and presented to the user for viewing, with the user then navigating (460) the FVV by selecting zoom levels and virtual navigation viewpoints (e.g., pans, tilts, rotations, etc.), which are in turn used to select the view frustum for generating subsequent frames of the FVV.
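Steps (450) and (460) together form a cull-and-transmit loop: for each navigation update, the server culls the scene against the client's view frustum and sends only the proxy data needed to render that viewpoint locally. The sketch below approximates the frustum with a view cone for brevity and reuses select_proxy_detail from the previous sketch; all data structures and thresholds are assumed for illustration.

```python
import math


def cull_to_frustum(regions, viewpoint, look_dir, fov_deg=60.0, far=120.0):
    """Keep only regions whose centers fall inside a conical approximation of the
    client's view frustum (an angle test plus a far-distance test)."""
    half_angle = math.radians(fov_deg) / 2.0
    norm = math.sqrt(sum(c * c for c in look_dir))
    look = [c / norm for c in look_dir]
    visible = []
    for region in regions:
        v = [region["center"][i] - viewpoint[i] for i in range(3)]
        dist = math.sqrt(sum(c * c for c in v))
        if dist == 0.0 or dist > far:
            continue
        cos_angle = sum(v[i] * look[i] for i in range(3)) / dist
        if cos_angle >= math.cos(half_angle):
            visible.append(region)
    return visible


def frame_payload(regions, viewpoint, look_dir):
    """Data a server might send for one FVV frame: the frustum-culled regions
    paired with the level of detail chosen for each (see select_proxy_detail)."""
    visible = cull_to_frustum(regions, viewpoint, look_dir)
    return {r["id"]: select_proxy_detail([r], viewpoint)[r["id"]] for r in visible}
```

Each new zoom level or virtual viewpoint chosen by the user simply changes the viewpoint and look direction passed to this loop for the next frame.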
  • 4.0 Exemplary Operating Environments:
  • The Dynamic High Definition Bubble Framework described herein is operational within numerous types of general purpose or special purpose computing system environments or configurations. FIG. 5 illustrates a simplified example of a general-purpose computer system on which various embodiments and elements of the Dynamic High Definition Bubble Framework, as described herein, may be implemented. It should be noted that any boxes that are represented by broken or dashed lines in FIG. 5 represent alternate embodiments of the simplified computing device, and that any or all of these alternate embodiments, as described below, may be used in combination with other alternate embodiments that are described throughout this document.
  • For example, FIG. 5 shows a general system diagram showing a simplified computing device such as computer 500. Such computing devices can typically be found in devices having at least some minimum computational capability, including, but not limited to, personal computers, server computers, hand-held computing devices, laptop or mobile computers, communications devices such as cell phones and PDAs, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, audio or video media players, etc.
  • To allow a device to implement the Dynamic High Definition Bubble Framework, the device should have sufficient computational capability and system memory to enable basic computational operations. In particular, as illustrated by FIG. 5, the computational capability is generally illustrated by one or more processing unit(s) 510, and may also include one or more GPUs 515, either or both in communication with system memory 520. Note that the processing unit(s) 510 of the general computing device may be specialized microprocessors, such as a DSP, a VLIW, or other micro-controller, or can be conventional CPUs having one or more processing cores, including specialized GPU-based cores in a multi-core CPU.
  • In addition, the simplified computing device of FIG. 5 may also include other components, such as, for example, a communications interface 530. The simplified computing device of FIG. 5 may also include one or more conventional computer input devices 540 (e.g., pointing devices, keyboards, audio input devices, video input devices, haptic input devices, devices for receiving wired or wireless data transmissions, etc.). The simplified computing device of FIG. 5 may also include other optional components, such as, for example, one or more conventional computer output devices 550 (e.g., display device(s) 555, audio output devices, video output devices, devices for transmitting wired or wireless data transmissions, etc.). Note that typical communications interfaces 530, input devices 540, output devices 550, and storage devices 560 for general-purpose computers are well known to those skilled in the art, and will not be described in detail herein.
  • The simplified computing device of FIG. 5 may also include a variety of computer readable media. Computer readable media can be any available media that can be accessed by computer 500 via storage devices 560 and includes both volatile and nonvolatile media that is either removable 570 and/or non-removable 580, for storage of information such as computer-readable or computer-executable instructions, data structures, program modules, or other data. By way of example, and not limitation, computer readable media may comprise computer storage media and communication media. Computer storage media includes, but is not limited to, computer or machine readable media or storage devices such as DVD's, CD's, floppy disks, tape drives, hard drives, optical drives, solid state memory devices, RAM, ROM, EEPROM, flash memory or other memory technology, magnetic cassettes, magnetic tapes, magnetic disk storage, or other magnetic storage devices, or any other device which can be used to store the desired information and which can be accessed by one or more computing devices.
  • Storage of information such as computer-readable or computer-executable instructions, data structures, program modules, etc., can also be accomplished by using any of a variety of the aforementioned communication media to encode one or more modulated data signals or carrier waves, or other transport mechanisms or communications protocols, and includes any wired or wireless information delivery mechanism. Note that the terms “modulated data signal” or “carrier wave” generally refer to a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. For example, communication media includes wired media such as a wired network or direct-wired connection carrying one or more modulated data signals, and wireless media such as acoustic, RF, infrared, laser, and other wireless media for transmitting and/or receiving one or more modulated data signals or carrier waves. Combinations of any of the above should also be included within the scope of communication media.
  • Further, software, programs, and/or computer program products embodying some or all of the various embodiments of the Dynamic High Definition Bubble Framework described herein, or portions thereof, may be stored, received, transmitted, or read from any desired combination of computer or machine readable media or storage devices and communication media in the form of computer executable instructions or other data structures.
  • Finally, the Dynamic High Definition Bubble Framework described herein may be further described in the general context of computer-executable instructions, such as program modules, being executed by a computing device. Generally, program modules include routines, programs, objects, components, data structures, etc., that perform particular tasks or implement particular abstract data types. The embodiments described herein may also be practiced in distributed computing environments where tasks are performed by one or more remote processing devices, or within a cloud of one or more devices, that are linked through one or more communications networks. In a distributed computing environment, program modules may be located in both local and remote computer storage media including media storage devices. Still further, the aforementioned instructions may be implemented, in part or in whole, as hardware logic circuits, which may or may not include a processor.
  • The foregoing description of the Dynamic High Definition Bubble Framework has been presented for the purposes of illustration and description. It is not intended to be exhaustive or to limit the claimed subject matter to the precise form disclosed. Many modifications and variations are possible in light of the above teaching. Further, it should be noted that any or all of the aforementioned alternate embodiments may be used in any combination desired to form additional hybrid embodiments of the Dynamic High Definition Bubble Framework. It is intended that the scope of the invention be limited not by this detailed description, but rather by the claims appended hereto.

Claims (20)

What is claimed is:
1. A computer-implemented process for generating navigable free viewpoint video (FVV), comprising using a computer to perform process actions for:
generating a geometric proxy from 3D image data of an overall volumetric space;
generating one or more geometric proxies for each of one or more sub-regions of the overall volumetric space;
registering one or more of the geometric proxies of the sub-regions with the geometric proxy of the overall volumetric space; and
rendering a multi-resolution user-navigable FVV from the registered geometric proxies and the geometric proxy of the overall volumetric space, wherein portions of the FVV corresponding to the sub-regions are rendered with a higher resolution than other regions of the FVV.
2. The computer-implemented process of claim 1 wherein each sub-region is captured at a resolution greater than a resolution used to capture the overall volumetric space.
3. The computer-implemented process of claim 1 wherein one or more of the sub-regions are captured using one or more moving camera arrays.
4. The computer-implemented process of claim 1 wherein one or more of the sub-regions are captured using one or more fixed camera arrays.
5. The computer-implemented process of claim 1 wherein rendering the multi-resolution user-navigable FVV further comprises process actions for:
determining a current view frustum corresponding to a current client viewpoint for viewing the FVV; and
transmitting appropriate geometric proxies within the current view frustum to the client for local rendering of video frames of the FVV.
6. The computer-implemented process of claim 1 wherein one or more of the sub-regions move relative to the overall volumetric space during capture of the 3D image data for those sub-regions.
7. The computer-implemented process of claim 1 wherein one or more of the sub-regions overlap within the overall volumetric space.
8. A method for generating a navigable 3D representation of a volumetric space, comprising:
capturing 3D image data of an overall volumetric space and using this 3D image data to construct an environment model comprising a geometric proxy of the overall volumetric space;
capturing 3D image data for one or more sub-regions of the overall volumetric space and generating one or more geometric proxies of each sub-region;
registering one or more of the geometric proxies of each sub-region to the environment model;
determining a view frustum relative to the environment model; and
rendering frames of a multi-resolution user-navigable FVV from portions of the registered geometric proxies and environment model corresponding to the view frustum, wherein portions of the FVV corresponding to the sub-regions are rendered with a higher resolution than other regions of the FVV.
9. The method of claim 8 wherein the view frustum is determined from a current viewpoint of a client viewing the FVV, and wherein the rendering is performed by the client from portions of the registered geometric proxies and environment model corresponding to the view frustum transmitted to the client.
10. The method of claim 8 wherein zooming into portions of the FVV rendered with a higher resolution provides greater detail than when zooming into other regions of the FVV.
11. The method of claim 8 wherein each sub-region is captured at a resolution greater than a resolution used to capture the overall volumetric space.
12. The method of claim 8 wherein the sub-regions are captured using any combination of one or more moving camera arrays and one or more fixed camera arrays.
13. The method of claim 8 wherein one or more of the sub-regions move relative to the overall volumetric space during capture of the 3D image data for those sub-regions.
14. A computer-readable medium having computer executable instructions stored therein for generating a user navigable free viewpoint video (FVV), said instructions causing a computing device to execute a method comprising:
capturing 3D image data for an overall viewing area;
capturing 3D image data for one or more high definition bubbles within the overall viewing area;
generating a geometric proxy from the 3D image data of the overall viewing area;
generating one or more geometric proxies from the 3D image data of one or more of the high definition bubbles;
aligning one or more of the geometric proxies of the high definition bubbles with the geometric proxy of the overall viewing area; and
transmitting portions of any of the aligned geometric proxies corresponding to a current client viewpoint to a client for local client-based rendering of a multi-resolution user-navigable FVV, wherein portions of the FVV corresponding to the high definition bubbles are rendered with a higher resolution than other regions of the FVV.
15. The computer-readable medium of claim 14 wherein each high definition bubble is captured at a resolution greater than a resolution used to capture the overall viewing area.
16. The computer-readable medium of claim 14 wherein one or more of the high definition bubbles are captured using one or more moving camera arrays.
17. The computer-readable medium of claim 14 wherein one or more of the high definition bubbles are captured using one or more fixed camera arrays.
18. The computer-readable medium of claim 14 wherein rendering the multi-resolution user-navigable FVV further comprises:
determining a current view frustum corresponding to a current client viewpoint for viewing the FVV; and
using portions of the aligned geometric proxies within the current view frustum for local rendering of video frames of the FVV.
19. The computer-readable medium of claim 14 wherein one or more of the high definition bubbles move relative to the overall viewing area during capture of the 3D image data for those high definition bubbles.
20. The computer-readable medium of claim 14 wherein one or more of the high definition bubbles overlap within the overall viewing area.
US13/598,747 2012-05-31 2012-08-30 High definition bubbles for rendering free viewpoint video Abandoned US20130321575A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US201261653983P true 2012-05-31 2012-05-31
US13/598,747 US20130321575A1 (en) 2012-05-31 2012-08-30 High definition bubbles for rendering free viewpoint video

Publications (1)

Publication Number Publication Date
US20130321575A1 true US20130321575A1 (en) 2013-12-05

Family

ID=49669652

Family Applications (10)

Application Number Title Priority Date Filing Date
US13/566,877 Active 2034-02-16 US9846960B2 (en) 2012-05-31 2012-08-03 Automated camera array calibration
US13/588,917 Abandoned US20130321586A1 (en) 2012-05-31 2012-08-17 Cloud based free viewpoint video streaming
US13/598,536 Abandoned US20130321593A1 (en) 2012-05-31 2012-08-29 View frustum culling for free viewpoint video (fvv)
US13/598,747 Abandoned US20130321575A1 (en) 2012-05-31 2012-08-30 High definition bubbles for rendering free viewpoint video
US13/599,678 Abandoned US20130321566A1 (en) 2012-05-31 2012-08-30 Audio source positioning using a camera
US13/599,170 Abandoned US20130321396A1 (en) 2012-05-31 2012-08-30 Multi-input free viewpoint video processing pipeline
US13/599,263 Active 2033-02-25 US8917270B2 (en) 2012-05-31 2012-08-30 Video generation using three-dimensional hulls
US13/599,436 Active 2034-05-03 US9251623B2 (en) 2012-05-31 2012-08-30 Glancing angle exclusion
US13/614,852 Active 2033-10-29 US9256980B2 (en) 2012-05-31 2012-09-13 Interpolating oriented disks in 3D space for constructing high fidelity geometric proxies from point clouds
US13/790,158 Abandoned US20130321413A1 (en) 2012-05-31 2013-03-08 Video generation using convict hulls

Country Status (1)

Country Link
US (10) US9846960B2 (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6327381B1 (en) * 1994-12-29 2001-12-04 Worldscape, Llc Image transformation and synthesis methods
US20060267977A1 (en) * 2005-05-19 2006-11-30 Helmut Barfuss Method for expanding the display of a volume image of an object region
US20080095465A1 (en) * 2006-10-18 2008-04-24 General Electric Company Image registration system and method
US20090016641A1 (en) * 2007-06-19 2009-01-15 Gianluca Paladini Method and apparatus for efficient client-server visualization of multi-dimensional data
US20090128568A1 (en) * 2007-11-16 2009-05-21 Sportvision, Inc. Virtual viewpoint animation
US20110142321A1 (en) * 2008-08-29 2011-06-16 Koninklijke Philips Electronics N.V. Dynamic transfer of three-dimensional image data

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9191643B2 (en) 2013-04-15 2015-11-17 Microsoft Technology Licensing, Llc Mixing infrared and color component data point clouds
US20170013283A1 (en) * 2015-07-10 2017-01-12 Futurewei Technologies, Inc. Multi-view video streaming with fast and smooth view switch
US9848212B2 (en) * 2015-07-10 2017-12-19 Futurewei Technologies, Inc. Multi-view video streaming with fast and smooth view switch
EP3388119A3 (en) * 2017-04-14 2018-11-28 Fujitsu Limited Method, apparatus, and non-transitory computer-readable storage medium for view point selection assistance in free viewpoint video generation

Also Published As

Publication number Publication date
US20130321413A1 (en) 2013-12-05
US20130321590A1 (en) 2013-12-05
US20130321418A1 (en) 2013-12-05
US9256980B2 (en) 2016-02-09
US20130321589A1 (en) 2013-12-05
US20130321396A1 (en) 2013-12-05
US9846960B2 (en) 2017-12-19
US20130321586A1 (en) 2013-12-05
US8917270B2 (en) 2014-12-23
US20130321410A1 (en) 2013-12-05
US9251623B2 (en) 2016-02-02
US20130321566A1 (en) 2013-12-05
US20130321593A1 (en) 2013-12-05

Similar Documents

Publication Publication Date Title
Matsuyama et al. Real-time 3D shape reconstruction, dynamic 3D mesh deformation, and high fidelity visualization for 3D video
Zhang et al. A survey on image-based rendering—representation, sampling and compression
CN1717064B (en) Interactive viewpoint video system and process
US7823058B2 (en) Methods and apparatus for interactive point-of-view authoring of digital video content
Carranza et al. Free-viewpoint video of human actors
Chan et al. Image-based rendering and synthesis
Rander et al. Virtualized reality: Constructing time-varying virtual worlds from real world events
Smolic et al. Interactive 3-D video representation and coding technologies
CA2640834C (en) Method and system for producing a video synopsis
US8645832B2 (en) Methods and apparatus for interactive map-based analysis of digital video content
Uyttendaele et al. Image-based interactive exploration of real-world environments
Wagner et al. Real-time panoramic mapping and tracking on mobile phones
Shum et al. Survey of image-based representations and compression techniques
EP2481023B1 (en) 2d to 3d video conversion
US20130073981A1 (en) Methods and apparatus for interactive network sharing of digital video content
Kopf et al. Street slide: browsing street level imagery
US20100053164A1 (en) Spatially correlated rendering of three-dimensional content on display components having arbitrary positions
Kubota et al. Multiview imaging and 3DTV
US20050283730A1 (en) System and process for viewing and navigating through an interactive video tour
CA2587644C (en) Method for inter-scene transitions
US5850352A (en) Immersive video, including video hypermosaicing to generate from multiple video views of a scene a three-dimensional video mosaic from which diverse virtual video scene images are synthesized, including panoramic, scene interactive and stereoscopic images
Kanade et al. Virtualized reality: Concepts and early results
US8345961B2 (en) Image stitching method and apparatus
EP1854282B1 (en) Method and system for spatio-temporal video warping
Prince et al. 3d live: Real time captured content for mixed reality

Legal Events

Date Code Title Description
AS Assignment

Owner name: MICROSOFT CORPORATION, WASHINGTON

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:KIRK, ADAM;FISHMAN, NEIL;GILLET, DON;AND OTHERS;SIGNING DATES FROM 20120827 TO 20120829;REEL/FRAME:028880/0101

AS Assignment

Owner name: MICROSOFT TECHNOLOGY LICENSING, LLC, WASHINGTON

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:MICROSOFT CORPORATION;REEL/FRAME:034544/0541

Effective date: 20141014