US20130321575A1 - High definition bubbles for rendering free viewpoint video - Google Patents

High definition bubbles for rendering free viewpoint video

Info

Publication number
US20130321575A1
Authority
US
United States
Prior art keywords
fvv
high definition
regions
sub
bubbles
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US13/598,747
Inventor
Adam Kirk
Neil Fishman
Don Gillett
Patrick Sweeney
Kanchan Mitra
David Eraker
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Microsoft Technology Licensing LLC
Original Assignee
Microsoft Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Microsoft Corp filed Critical Microsoft Corp
Priority to US13/598,747
Assigned to MICROSOFT CORPORATION reassignment MICROSOFT CORPORATION ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: ERAKER, DAVID, FISHMAN, NEIL, GILLETT, DON, KIRK, ADAM, SWEENEY, PATRICK, MITRA, KANCHAN
Publication of US20130321575A1
Assigned to MICROSOFT TECHNOLOGY LICENSING, LLC reassignment MICROSOFT TECHNOLOGY LICENSING, LLC ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: MICROSOFT CORPORATION
Status: Abandoned (current)

Classifications

    • G PHYSICS
      • G06 COMPUTING; CALCULATING OR COUNTING
        • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
          • G06T 15/00 3D [Three Dimensional] image rendering
            • G06T 15/04 Texture mapping
            • G06T 15/08 Volume rendering
            • G06T 15/10 Geometric effects
              • G06T 15/20 Perspective computation
                • G06T 15/205 Image-based rendering
          • G06T 17/00 Three dimensional [3D] modelling, e.g. data description of 3D objects
          • G06T 2210/00 Indexing scheme for image generation or computer graphics
            • G06T 2210/56 Particle system, point based geometry or rendering
    • H ELECTRICITY
      • H04 ELECTRIC COMMUNICATION TECHNIQUE
        • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
          • H04N 13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
            • H04N 13/10 Processing, recording or transmission of stereoscopic or multi-view image signals
              • H04N 13/106 Processing image signals
                • H04N 13/111 Transformation of image signals corresponding to virtual viewpoints, e.g. spatial image interpolation
                  • H04N 13/117 Transformation of image signals corresponding to virtual viewpoints, the virtual viewpoint locations being selected by the viewers or determined by viewer tracking
              • H04N 13/194 Transmission of image signals
            • H04N 13/20 Image signal generators
              • H04N 13/204 Image signal generators using stereoscopic image cameras
                • H04N 13/239 Image signal generators using stereoscopic image cameras using two 2D image sensors having a relative position equal to or related to the interocular distance
                • H04N 13/243 Image signal generators using stereoscopic image cameras using three or more 2D image sensors
                • H04N 13/246 Calibration of cameras
              • H04N 13/257 Colour aspects
          • H04N 7/00 Television systems
            • H04N 7/14 Systems for two-way working
              • H04N 7/141 Systems for two-way working between two video terminals, e.g. videophone
                • H04N 7/142 Constructional details of the terminal equipment, e.g. arrangements of the camera and the display
              • H04N 7/15 Conference systems
                • H04N 7/157 Conference systems defining a virtual conference space and using avatars or agents
        • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
          • H04R 2227/00 Details of public address [PA] systems covered by H04R 27/00 but not provided for in any of its subgroups
            • H04R 2227/005 Audio distribution systems for home, i.e. multi-room use
        • H04S STEREOPHONIC SYSTEMS
          • H04S 2400/00 Details of stereophonic systems covered by H04S but not provided for in its groups
            • H04S 2400/15 Aspects of sound capture and related signal processing for recording or reproduction

Definitions

  • FVV free-viewpoint video
  • multiple video streams are used to re-render a time-varying scene from arbitrary viewpoints.
  • the creation and playback of a FVV is typically accomplished using a substantial amount of data.
  • scenes are generally simultaneously recorded from many different perspectives using sensors such as RGB cameras.
  • This recorded data is then generally processed to extract 3D geometric information in the form of geometric proxies or models using various 3D reconstruction (3DR) algorithms.
  • 3DR 3D reconstruction
  • the original RGB data and geometric proxies are then recombined during rendering, using various image based rendering (IBR) algorithms, to generate multiple synthetic viewpoints.
  • IBR image based rendering
  • a “Dynamic High Definition Bubble Framework” as described herein provides various techniques that allow local clients to display free viewpoint video (FVV) of complex 3D scenes while reducing computational overhead and bandwidth for rendering and/or transmitting the FVV. These techniques allow the client to perform spatial navigation through the FVV, while changing viewpoints and/or zooming into one or more higher definition regions or areas (specifically defined and referred to herein as “high definition bubbles”) within the overall area or scene of the FVV.
  • FVV free viewpoint video
  • the Dynamic High Definition Bubble Framework enables local rendering of FVV by providing a lower fidelity geometric proxy of an overall scene or viewing area in combination with one or more higher fidelity geometric proxies of the scene corresponding to regions of interest (e.g., areas of action in the scene that the user may wish to view in expanded detail and from one or more different viewpoints).
  • Note that the high definition bubbles may have differing resolution or fidelity levels, as well as differing numbers of viewpoints.
  • the Dynamic High Definition Bubble Framework enables these capabilities by providing multiple areas or sub-regions of higher definition video capture within the overall viewing area or scene.
  • One implementation of this concept is to use multiple cameras (e.g., a camera array or the like) surrounding the scene to capture the scene or event holistically, in whatever resolution is desired.
  • Concurrently, a set of cameras (e.g., a camera array or the like) that zoom in on particular regions of interest within the overall scene are used to create higher definition geometric proxies that enable a higher quality viewing experience of “bubbles” associated with the zoomed regions of the scene.
  • various embodiments of the Dynamic High Definition Bubble Framework are enabled by using captured image or video data to create a 3D representation (or other visual representation of the “real” world) of the overall space of a scene.
  • One or more sub-regions (i.e., high definition bubbles) of the larger space of the overall scene are then transferred to the client as high definition geometric proxies while the remaining areas of the overall scene are transferred to the client using lower resolution geometric proxies.
  • the sub-regions represented by the high definition bubbles can be in fixed or predefined positions (e.g., the end zone of football field) or can move within the larger area of the overall scene (e.g., camera arrays following a ball or a particular player in a soccer game).
  • These high definition bubbles are enabled by using any desired combination of fixed and moving camera arrays to capture high-resolution image data within one or more regions of interest relative to the area of the overall scene.
  • Captured image data is then used to generate geometric proxies or 3D models of the scene for local rendering of the FVV from any available viewpoint and at any desired resolution corresponding to the selected viewpoint.
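  • As a rough illustration of the split described above, the following sketch (a minimal example, not part of the patent) decides which fidelity of geometric proxy to transfer for each sub-region based on whether it falls inside a high definition bubble; the Bubble and Region structures and the fidelity labels are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Bubble:
    """A high definition bubble: a spherical sub-region of the overall scene."""
    center: Tuple[float, float, float]
    radius: float

@dataclass
class Region:
    """A sub-region of the overall scene for which geometric proxies exist."""
    name: str
    center: Tuple[float, float, float]

def inside_any_bubble(region: Region, bubbles: List[Bubble]) -> bool:
    """True if the region's center falls inside any high definition bubble."""
    for b in bubbles:
        dist2 = sum((r - c) ** 2 for r, c in zip(region.center, b.center))
        if dist2 <= b.radius ** 2:
            return True
    return False

def select_proxies(regions: List[Region], bubbles: List[Bubble]) -> dict:
    """Choose which fidelity of geometric proxy to transfer for each region."""
    return {r.name: ("high_definition_proxy" if inside_any_bubble(r, bubbles)
                     else "low_definition_proxy")
            for r in regions}

if __name__ == "__main__":
    bubbles = [Bubble(center=(50.0, 25.0, 0.0), radius=10.0)]   # e.g., around the ball
    regions = [Region("around_ball", (52.0, 27.0, 0.0)),
               Region("far_sideline", (5.0, 60.0, 0.0))]
    print(select_proxies(regions, bubbles))
    # {'around_ball': 'high_definition_proxy', 'far_sideline': 'low_definition_proxy'}
```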
  • the FVV can be pre-rendered and sent to the client as a viewable and navigable FVV.
  • the techniques enabled by the Dynamic High Definition Bubble Framework serve to reduce the amount of data used to render a specific viewpoint and resolution selected by the user when viewing or navigating the FVV.
  • This approach is also applicable to server side rendering performance, when a video frame is generated on the server and transmitted to the client.
  • using lower fidelity representations of areas that are far away from a region of interest (i.e., the desired viewpoint) in combination with using higher fidelity representations of the regions of interest reduces the time and computational overhead needed for generating video frames prior to transmission to the client.
  • the Dynamic High Definition Bubble Framework creates a navigable FVV that presents a general or remote view (e.g., relatively far back from the action) of an overall volumetric space and then chooses an optimal dataset to use to render various portions of the FVV at the desired resolutions/fidelity.
  • This allows the Dynamic High Definition Bubble Framework to seamlessly support varying resolutions for different regions while optimally choosing the appropriate dataset to process for the desired output.
  • rendering regions within the high definition bubbles using higher resolutions allows the user to zoom into those regions without creating pixelization artifacts or other zoom-based viewing problems. In other words, even though the user is zooming into particular areas or regions, the FVV displayed to the user does not lose fidelity or resolution in those zoomed areas.
  • The Dynamic High Definition Bubble Framework described herein provides various techniques that allow local clients to display and navigate FVV of complex multi-resolution and multi-viewpoint scenes while reducing computational overhead and bandwidth for rendering and/or transmitting the FVV.
  • other advantages of the Dynamic High Definition Bubble Framework will become apparent from the detailed description that follows hereinafter when taken in conjunction with the accompanying drawing figures.
  • FIG. 1 provides an exemplary architectural flow diagram that illustrates program modules for using a “Dynamic High Definition Bubble Framework” for creating and navigating free viewpoint videos (FVV) of complex multi-resolution and multi-viewpoint scenes while reducing computational overhead and bandwidth for rendering and/or transmitting the FVV to clients, as described herein.
  • FIG. 2 provides an illustration of high definition bubbles within an overall viewing area or scene, as described herein.
  • FIG. 3 provides an illustration of the use of separate camera arrays to capture a high definition bubble and an overall viewing area, as described herein.
  • FIG. 4 provides a general system flow diagram that illustrates exemplary methods for implementing various embodiments of the Dynamic High Definition Bubble Framework for creating and navigating FVV's having high definition bubbles, as described herein.
  • FIG. 5 is a general system diagram depicting a simplified general-purpose computing device having simplified computing and I/O capabilities for use in implementing various embodiments of the Dynamic High Definition Bubble Framework, as described herein.
  • one or more overall capture areas typically surround the “action”, which is confined to one or more smaller volumetric areas or sub-regions within the overall capture area.
  • the action is generally centered on the ball and one or more players or athletes around the ball. While it is technically feasible to capture and render the entire capture volume at full fidelity, this would typically result in the generation of very large datasets to be sent from the server to the client for local rendering.
  • a “Dynamic High Definition Bubble Framework,” as described herein, provides various techniques that specifically address such concerns by providing the client with one or more lower fidelity geometric proxies of an overall viewing area or volumetric space.
  • the Dynamic High Definition Bubble Framework provides one or more sub-regions of the overall viewing area as higher fidelity representations. Local clients then use this information to view and navigate through the overall FVV while providing the user with the capability to zoom into areas of higher fidelity.
  • the Dynamic High Definition Bubble Framework provides various techniques that allow local clients to display and navigate FVV of complex multi-resolution and multi-viewpoint scenes while reducing computational overhead and bandwidth for rendering and/or transmitting the FVV.
  • rendering regions within the high definition bubbles using higher resolutions allows the user to zoom into those regions without creating pixelization artifacts or other zoom-based viewing problems.
  • the FVV displayed to the user does not lose fidelity or resolution in those zoomed areas.
  • the Dynamic High Definition Bubble Framework enables local rendering of image frames of the FVV by providing a lower fidelity geometric proxy of an overall scene in combination with one or more higher fidelity geometric proxies of the scene corresponding to regions of interest (e.g., areas of action in the scene that the user may wish to view in expanded detail). This allows the user to view the entire volume of the scene as FVV, with interesting features or regions of the scene being provided in higher detail in the event that the user zooms into such regions, while reducing the amount of data that is transmitted to the client for local rendering of the FVV.
  • One implementation of this concept is to use multiple cameras (e.g., camera arrays or the like) surrounding the scene to capture the scene or event holistically, in whatever resolution is desired.
  • a set of cameras that zoom in on particular regions of interest within the overall scene (such as the “action” in a football game where a player is carrying the ball) are used to capture data for creating higher definition geometric proxies that enable a higher quality viewing experience of “bubbles” associated with the zoomed regions of the scene.
  • These bubbles are specifically defined and referred to herein as “high definition bubbles.” Further, depending upon the available camera data, multiple viewpoints of potentially varying resolution or fidelity may be available within each bubble.
  • The Dynamic High Definition Bubble Framework typically presents a broad view of the overall viewing area or volumetric space from some distance away. Then, as the user zooms in or changes viewpoints, one or more areas of the overall scene or viewing area are provided in higher definition or fidelity. Therefore, rather than providing high definition everywhere (at high computational and bandwidth costs), the Dynamic High Definition Bubble Framework captures one or more bubbles in higher definition in locations or regions where it is believed that the user will be most interested. In other words, an author of the FVV will use the Dynamic High Definition Bubble Framework to capture bubbles in places where it is believed that users may want more detail, or where the author wants users to be able to explore the FVV in greater detail.
  • Bubbles can be presented to the user in various ways.
  • In displaying the FVV to the user, the user is provided with the capability to zoom and/or change viewpoints (e.g., pans, tilts, rotations, etc.).
  • Then, if the user zooms into a region where a high definition bubble exists, the user will be presented with higher resolution image frames during the zoom. As such, there is no need to demarcate explicit regions of the FVV that contain high definition bubbles.
  • the user is presented with the entire scene and as they scroll through it, more data is available in areas (i.e., bubbles) where there is higher detail.
  • For example, if the user zooms into the area around the ball in a football game, the user will see that there is more detail available to them, while if they zoom into the grass near the edge of the field where there is less action, the user will see less detail (assuming that there is no corresponding high definition bubble there). Therefore, by placing bubbles in areas where the user is expected to look for higher detail (such as a tight view in and around the ball when it is fumbled), the detail available to the user is higher where it matters, while it is unlikely that the user will zoom into areas off to one side of the field distant from the play. Consequently, when the user does zoom into the area around the ball, it creates the illusion that the user can zoom in anywhere.
  • the FVV is presented with thumbnails or highlighting within or near the overall scene to alert the user as to locations, regions or bubbles (and optionally available viewpoints) of higher definition.
  • the Dynamic High Definition Bubble Framework can provide a FVV of a boxing match where the overall ring is in low definition, but the two fighters are within a high definition bubble.
  • The FVV may include indications of either or both the existence of the high definition bubble around the fighters and various available viewpoints within that bubble, such as a view of the opponent from either boxer's perspective.
  • the Dynamic High Definition Bubble Framework allows different users to have completely different viewing experiences. For example, in the case of a football game, one user can be zoomed into a bubble around the ball, while another user is zoomed into a bubble around cheerleaders on the edge of the football field, while yet another user is zoomed out to see the overall action on the entire field. Further, the same user can watch the FVV multiple times using any of a number of available zooms into one or more high definition bubbles and from any of a number of available viewpoints relative to any of those high definition bubbles.
  • the “Dynamic High Definition Bubble Framework,” provides various techniques that allow local clients to display and navigate FVV of complex multi-resolution and multi-viewpoint scenes while reducing computational overhead and bandwidth for rendering and/or transmitting the FVV.
  • the processes summarized above are illustrated by the general system diagram of FIG. 1 .
  • the system diagram of FIG. 1 illustrates the interrelationships between program modules for implementing various embodiments of the Dynamic High Definition Bubble Framework, as described herein.
  • the system diagram of FIG. 1 illustrates a high-level view of various embodiments of the Dynamic High Definition Bubble Framework
  • FIG. 1 is not intended to provide an exhaustive or complete illustration of every possible embodiment of the Dynamic High Definition Bubble Framework as described throughout this document.
  • any boxes and interconnections between boxes that may be represented by broken or dashed lines in FIG. 1 represent alternate embodiments of the Dynamic High Definition Bubble Framework described herein, and that any or all of these alternate embodiments, as described below, may be used in combination with other alternate embodiments that are described throughout this document.
  • the processes enabled by the Dynamic High Definition Bubble Framework begin operation by using a data capture module 100 that uses multiple cameras or arrays to capture and generate 3D scene data 120 (e.g., geometric proxies, 3D models, RGB or other color space data, textures, etc.) for an overall viewing area and one or more viewpoints for one or more high definition bubbles within the overall viewing area.
  • a user input module 110 is used for various purposes, including, but not limited to, defining and configuring one or more cameras and/or camera arrays for capturing an overall viewing area and one or more high definition bubbles.
  • the user input module 110 is also used in various embodiments to define or specify one or more high definition bubbles, one or more viewpoints or view frustums, resolution or level of detail for one or more of the bubbles and one or more of the viewpoints, etc.
  • a pre-rendering module 130 uses the 3D scene data 120 to pre-render one or more FVV's that are then provided to one or more clients for viewing and navigation.
  • a data transmission module 140 transmits either the pre-rendered FVV or 3D scene data 120 to one or more clients.
  • the Dynamic High Definition Bubble Framework conserves bandwidth when transmitting to the client by only sending sufficient 3D scene data 120 for the level of detail desired to render image frames corresponding to an initial virtual navigation viewpoint or viewing frustum or one selected by the client.
  • local clients use a local rendering module 150 to render one or more FVV's 160 or image frames of the FVV.
  • a FVV playback module 170 provides user-navigable interactive playback of the FVV in response to user navigation and zoom commands.
  • the FVV playback module 170 allows the user to pan, zoom, or otherwise navigate through the FVV. Further, user pan, tilt, rotation and zoom information is provided back to the local rendering module 150 or to the data transmission module for use in retrieving the 3D scene data 120 needed to render subsequent image frames of the FVV corresponding to user interaction and navigation through the FVV.
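  • The flow between these modules can be summarized with a minimal sketch; the function names below stand in for modules 100-170 and are illustrative assumptions, not an API defined by the patent.

```python
def capture_scene_data():
    """Data capture module 100: cameras/arrays produce 3D scene data 120 (stubbed)."""
    return {"overall": "low_fidelity_proxies",
            "bubbles": {"ball": "high_fidelity_proxies"}}

def transmit(scene_data, frustum):
    """Data transmission module 140: send only the data needed for the requested
    view frustum, at the level of detail that frustum requires."""
    needed = {"overall": scene_data["overall"]}
    bubble = frustum.get("zoomed_bubble")
    if bubble in scene_data["bubbles"]:
        needed[bubble] = scene_data["bubbles"][bubble]
    return needed

def render_frame(client_data, frustum):
    """Local rendering module 150: render one FVV image frame (stubbed)."""
    return f"frame rendered from {sorted(client_data)} at zoom {frustum['zoom']}"

def playback_loop(scene_data, navigation_events):
    """FVV playback module 170: user pan/tilt/rotate/zoom choices feed back into
    data transmission and rendering for subsequent frames."""
    frames = []
    for frustum in navigation_events:
        client_data = transmit(scene_data, frustum)
        frames.append(render_frame(client_data, frustum))
    return frames

if __name__ == "__main__":
    scene = capture_scene_data()
    events = [{"zoom": 1.0, "zoomed_bubble": None},      # wide view of the overall area
              {"zoom": 4.0, "zoomed_bubble": "ball"}]    # user zooms into a bubble
    for frame in playback_loop(scene, events):
        print(frame)
```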
  • the Dynamic High Definition Bubble Framework provides various techniques that allow local clients to display FVV of complex scenes while reducing computational overhead and bandwidth for rendering and/or transmitting the FVV.
  • the following sections provide a detailed discussion of the operation of various embodiments of the Dynamic High Definition Bubble Framework, and of exemplary methods for implementing the program modules described in Section 1 with respect to FIG. 1 .
  • The following sections provide examples and operational details of various embodiments of the Dynamic High Definition Bubble Framework, including: an operational overview of the Dynamic High Definition Bubble Framework; exemplary FVV scenarios enabled by the Dynamic High Definition Bubble Framework; and data capture scenarios and FVV generation.
  • Dynamic High Definition Bubble Framework-based processes described herein provide various techniques that allow local clients to display and navigate FVV of complex multi-resolution and multi-viewpoint scenes while reducing computational overhead and bandwidth for rendering and/or transmitting the FVV.
  • FIG. 2 illustrates various high definition bubbles within an overall viewing area 200 , scene, or volumetric space.
  • the Dynamic High Definition Bubble Framework generally uses various cameras or camera arrays to capture the overall viewing area 200 at some desired resolution level.
  • One or more high definition bubbles within the overall viewing area 200 are then captured using various cameras or camera arrays at higher resolution or fidelity levels.
  • Note that these high definition bubbles (e.g., 210, 220, 230, 240, 250 and 260) may differ in size and fidelity. For example, high definition bubbles (e.g., 210, 220, 230) can be in fixed positions to capture particular regions of the overall scene that may be of interest (e.g., end zones in a football game).
  • the high definition bubbles may also represent dynamic regions that move to follow action along arbitrary paths (e.g., 240 ) or along fixed paths (e.g., 250 to 260 ). Note also that moving high definition bubbles may sometimes extend outside the overall viewing area 200 (e.g., 260 ), though this may result in FVV image frames in which only the content of that high definition bubble is visible.
  • One or more high definition bubbles may also overlap (e.g., 230 ).
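  • The fixed and moving bubbles of FIG. 2 could be modeled, for example, as follows; the time-parameterized center and the overlap test below are assumptions made purely for illustration.

```python
import math
from dataclasses import dataclass
from typing import Callable, Tuple

Point = Tuple[float, float]

@dataclass
class Bubble:
    """A high definition bubble whose center may move over time."""
    radius: float
    center_at: Callable[[float], Point]   # time in seconds -> center position

def fixed_bubble(center: Point, radius: float) -> Bubble:
    """A bubble in a fixed position, e.g., around an end zone (cf. 210, 220, 230)."""
    return Bubble(radius, lambda t: center)

def path_bubble(start: Point, end: Point, duration: float, radius: float) -> Bubble:
    """A bubble that moves along a fixed path, e.g., from position 250 to 260."""
    def center(t: float) -> Point:
        s = max(0.0, min(1.0, t / duration))
        return (start[0] + s * (end[0] - start[0]),
                start[1] + s * (end[1] - start[1]))
    return Bubble(radius, center)

def overlaps(a: Bubble, b: Bubble, t: float) -> bool:
    """True if two bubbles overlap at time t (cf. the overlapping bubbles 230)."""
    (ax, ay), (bx, by) = a.center_at(t), b.center_at(t)
    return math.hypot(ax - bx, ay - by) <= a.radius + b.radius

if __name__ == "__main__":
    end_zone = fixed_bubble((10.0, 25.0), radius=8.0)
    follow_play = path_bubble((20.0, 25.0), (80.0, 25.0), duration=10.0, radius=6.0)
    print([overlaps(end_zone, follow_play, t) for t in (0.0, 5.0, 10.0)])  # [True, False, False]
```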
  • FIG. 3 illustrates the use of separate camera arrays to capture a high definition bubble 330 using a camera array (e.g., cameras 335 , 340 , 345 and 350 ) within an overall viewing area 300 that is in turn captured by a set of cameras (e.g., 305 , 310 , and 315 ) at a lower fidelity level than that of the high definition bubble.
  • As noted above, various embodiments of the Dynamic High Definition Bubble Framework are enabled by using captured image or video data to create a 3D representation (or other visual representation of the “real” world) of the overall space of a scene.
  • One or more sub-regions (i.e., high definition bubbles) of the larger space of the overall scene are then transferred to the client as high definition geometric proxies or 3D models while the remaining areas of the overall scene are transferred to the client using lower definition geometric proxies or 3D models.
  • the sub-regions represented by the high definition bubbles can be in fixed or predefined positions (e.g., the end zone of football field) or can move within the larger area of the overall scene (e.g., following a ball or a particular player in a soccer game).
  • These high definition bubbles are enabled by using any desired combination of fixed and moving camera arrays to capture high-resolution image data within one or more regions of interest relative to the area or volume of the overall scene.
  • The FVV processing techniques enabled by the Dynamic High Definition Bubble Framework serve to reduce the amount of data used to render a specific viewpoint selected by the user when viewing a FVV.
  • This approach is also applicable to server side rendering performance, when a video frame is generated on the server and transmitted to the client.
  • using lower fidelity representations of areas that are far away from a region of interest (i.e., the desired viewpoint) in combination with using higher fidelity representations of the regions of interest reduces the time and computational overhead needed for generating video frames prior to transmission to the client.
  • the Dynamic High Definition Bubble Framework enables a wide variety of viewing scenarios for clients or users. As noted above, since the user is provided with the opportunity to navigate and zoom the FVV during playback, the viewing experience can be substantially different for individual viewers of the same FVV.
  • the Dynamic High Definition Bubble Framework uses a number of cameras or camera arrays to capture sufficient views to create an overall 3D view of the stadium at low to medium definition or fidelity (i.e., any desired fidelity level).
  • the Dynamic High Definition Bubble Framework will also capture one or more specific locations or “bubbles” at a higher definition or fidelity and with a plurality of available viewpoints. Note that these bubbles are captured using fixed or movable cameras or camera arrays.
  • the Dynamic High Definition Bubble Framework may have fixed cameras or camera arrays around the end zone to capture high definition images in these regions at all times. Further, one or more sets of moving cameras or camera arrays can follow the ball or particular players around the field to capture images of the ball or players from multiple viewpoints.
  • the Dynamic High Definition Bubble Framework captures and provides an overall view of the field by using some number of cameras capturing the overall field.
  • the Dynamic High Definition Bubble Framework uses one or more sets of cameras that capture the regions around the ball, specific players, etc., so that the overall low definition general background of the football field can be augmented by user navigable high definition views of what is going on in 3D in the “bubbles.”
  • the Dynamic High Definition Bubble Framework generally presents a general or remote view (e.g., relatively far back from the action) of an overall volumetric space and then layers or combines navigable high definition bubbles with the overall volumetric space based on a determination of the proper geometric registration or alignment of those high definition bubbles within the overall volumetric space.
  • the Dynamic High Definition Bubble Framework enables the creation of movies where the user is provided with the capability to move around within a particular scene (i.e., change viewpoints) and to view particular parts of the scene, which are within bubbles, in higher definition while the movie is playing.
  • asset arrays are dense, fixed camera arrays optimized for creating a static (or moving) geometric proxy of an asset.
  • Assets include any object or person who will be on the field such as players, cheerleaders, referees, footballs, or other equipment.
  • The camera geometry of the asset arrays is optimized for the creation of high fidelity geometric proxies, which requires a ‘full 360’ arrangement of sensors so that all aspects of the asset can be recorded and modeled; additional sensors may be placed above or below the assets.
  • Note that ‘full 360’ coverage may not always be possible (e.g., views partially obstructed along some range of viewing directions); in such cases, user selection of viewpoints in the resulting FVV will be limited to whatever viewpoints can be rendered from the captured data.
  • In addition to RGB (or other color space) cameras, other sensor combinations such as active IR based stereo (also used in Kinect® or time of flight type applications) can be used to assist in 3D reconstruction. Additional techniques such as the use of green screen backgrounds can further assist in segmentation of the assets for use in creating high fidelity geometric proxies of those assets.
  • Asset arrays are generally utilized prior to the game and focus on static representations of the assets. Once recorded, these assets can be used as SV content for creating FVV's in two different ways, depending on the degree of geometry employed in their representation using image-based rendering (IBR).
  • IBR image-based rendering
  • a low-geometry IBR method including, but not limited to, view interpolation can be used to place the asset (players or cheerleaders) online using technology including, but not limited to, browser-based 2D or 3D rendering engines.
  • This also allows users to view single assets with a web browser or the like to navigate around a coordinate system that allows them to zoom in to the players (or other assets) from any angle, thus providing the user or viewer with high levels of photorealism with respect to those assets.
  • rendering regions within the high definition bubbles using higher resolutions allows the user to zoom into those regions without losing fidelity or resolution in the zoomed areas, or otherwise creating pixelization artifacts or other zoom-based viewing problems.
  • Video can be used to highlight different player/cheerleader promotional activities such as a throw, catch, block, cheer, etc.
  • view interpolation and view morphing for such purposes are discussed in the aforementioned U.S. Provisional Patent Application, the subject matter of which is incorporated herein by reference.
  • a high fidelity geometry proxy of the players (or other persons such as cheerleaders, referees, coaches, announcers, etc.) is created and combined with view dependent texture mapping (VDTM) for use in close up FVV scenarios.
  • VDTM view dependent texture mapping
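  • For context, view dependent texture mapping blends the textures captured by the physical cameras according to how closely each camera's viewing direction agrees with the virtual viewing direction. The weighting below is one common, simplified formulation assumed here for illustration; it is not necessarily the weighting used by the Framework.

```python
import numpy as np

def vdtm_weights(view_dir, camera_dirs, power=8.0):
    """Blend weights for view dependent texture mapping.

    view_dir: unit vector from the surface point toward the virtual viewpoint.
    camera_dirs: (N, 3) unit vectors from the surface point toward each capture camera.
    Cameras aligned with the virtual view direction receive larger weights.
    """
    view_dir = np.asarray(view_dir, dtype=float)
    camera_dirs = np.asarray(camera_dirs, dtype=float)
    cosines = np.clip(camera_dirs @ view_dir, 0.0, 1.0)   # ignore cameras facing away
    weights = cosines ** power                             # sharpen toward the best cameras
    total = weights.sum()
    if total == 0.0:
        return np.full(len(camera_dirs), 1.0 / len(camera_dirs))
    return weights / total

def blend_texels(view_dir, camera_dirs, texels):
    """Blend per-camera texel colors (N, 3) into one view-dependent color."""
    w = vdtm_weights(view_dir, camera_dirs)
    return (np.asarray(texels, dtype=float) * w[:, None]).sum(axis=0)

if __name__ == "__main__":
    cams = [[0.0, 0.0, 1.0], [0.7071, 0.0, 0.7071], [1.0, 0.0, 0.0]]  # capture directions
    colors = [[255, 0, 0], [0, 255, 0], [0, 0, 255]]                  # color seen by each camera
    print(blend_texels([0.0, 0.0, 1.0], cams, colors))                # dominated by camera 0
```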
  • a kinematic model for a human is used as a baseline for possible motions and further articulated based on RGB data from live-action video camera arrays.
  • Multi-angle video data is then used to realistically articulate the geometric proxies for all players or a subset of players on the field.
  • 6 degrees of freedom (6-DOF) movement of the user's viewpoint during playback of FVV is possible due to the explicit use of 3D geometry in representing the assets.
  • A model of the environment is useful to the FVV of the football game in a number of different ways, such as providing a calibration framework for live-action moving cameras, creating interstitial effects when transitioning between known real camera feeds, determining the accurate placement (i.e., registration or alignment) of various geometric proxies (generated from the high definition bubbles) for FVV, improving segmentation results with background data, accurately representing the background of the scene using image-based-rendering methods in different FVV use cases, etc.
  • the Dynamic High Definition Bubble Framework provides richer 3D rendering by using much more geometry. More specifically, geometric proxies corresponding to each high definition bubble are registered or aligned to the geometry of the environment model. Once properly positioned, the various geometric proxies are then used to render the frames of the FVV.
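  • One standard way to register a bubble's geometric proxy against the environment model, assumed here purely for illustration, is a least-squares rigid alignment (the Kabsch algorithm) over a set of corresponding 3D points.

```python
import numpy as np

def rigid_registration(proxy_pts, env_pts):
    """Estimate rotation R and translation t that align proxy points to matching
    environment-model points in the least-squares sense (Kabsch algorithm).

    proxy_pts, env_pts: (N, 3) arrays of corresponding 3D points.
    Returns (R, t) such that env ~= proxy @ R.T + t.
    """
    P = np.asarray(proxy_pts, dtype=float)
    Q = np.asarray(env_pts, dtype=float)
    Pc, Qc = P - P.mean(axis=0), Q - Q.mean(axis=0)
    U, _, Vt = np.linalg.svd(Pc.T @ Qc)
    d = np.sign(np.linalg.det(Vt.T @ U.T))                # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = Q.mean(axis=0) - R @ P.mean(axis=0)
    return R, t

if __name__ == "__main__":
    # Hypothetical landmarks seen both in the bubble proxy and in the environment model.
    proxy = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]], dtype=float)
    angle = np.radians(30.0)
    R_true = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                       [np.sin(angle),  np.cos(angle), 0.0],
                       [0.0, 0.0, 1.0]])
    env = proxy @ R_true.T + np.array([5.0, 2.0, 0.0])
    R, t = rigid_registration(proxy, env)
    print(np.allclose(proxy @ R.T + t, env))               # True: proxy is registered
```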
  • RGB data from video cameras and fixed camera data can be processed using conventional 3D reconstruction methods to identify features and their location; point clouds of the stadium can be created from these features. Additional geometry, also in the form of point clouds, can be extracted using range scanning devices for additional accuracy. Finally, the point cloud data can be merged together, meshed, and textured into a cohesive geometric model. This geometry can also be used as an infrastructure to organize RGB data for use in other IBR approaches for backgrounds useful for FVV functionality.
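  • A highly simplified sketch of merging point clouds from different sources (e.g., features triangulated from RGB cameras and points from a range scanning device) into a single cloud is shown below; meshing, texturing, and any particular reconstruction library are outside the scope of this example.

```python
import numpy as np

def merge_point_clouds(clouds, voxel_size=0.5):
    """Merge several point clouds (each an (N, 3) array) into one cloud,
    keeping one averaged point per occupied voxel to remove duplicates."""
    pts = np.vstack([np.asarray(c, dtype=float) for c in clouds])
    keys = np.floor(pts / voxel_size).astype(np.int64)      # voxel index of each point
    _, inverse = np.unique(keys, axis=0, return_inverse=True)
    inverse = inverse.reshape(-1)
    merged = np.zeros((inverse.max() + 1, 3))
    counts = np.zeros(inverse.max() + 1)
    np.add.at(merged, inverse, pts)                          # sum points per voxel
    np.add.at(counts, inverse, 1.0)
    return merged / counts[:, None]                          # average per voxel

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sfm_points = rng.random((1000, 3)) * 10.0   # e.g., features from 3D reconstruction
    range_scan = rng.random((500, 3)) * 10.0    # e.g., points from a range scanner
    model = merge_point_clouds([sfm_points, range_scan], voxel_size=1.0)
    print(model.shape)                          # far fewer, evenly distributed points
```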
  • an environment model is created and processed before being used in any live-action footage provided by the FVV.
  • Various methods associated with FVV live action are made possible by the creation of an environment model including interstitials, moving camera calibration, and geometry-articulation.
  • Additional FVV scenarios make advantageous use of the environment model by using both fixed and moving camera arrays to enable FVV functionality.
  • With regard to moving cameras, these are used to provide close-ups of action on the field (i.e., by registering or positioning geometric proxies generated from the high definition bubbles with the environment model).
  • individual video frames are continuously calibrated based on their orientation and optical focus, as discussed in the aforementioned U.S. Provisional Patent Application, the subject matter of which is incorporated herein by reference.
  • the Dynamic High Definition Bubble Framework uses structure from motion (SFM) based approaches, as discussed in the aforementioned U.S. Provisional Patent Application, the subject matter of which is incorporated herein by reference, to calibrate the moving cameras or cameras based on static high resolution static RGB images captured during the environment modeling stage.
  • SFM structure from motion
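  • As an illustrative stand-in for per-frame calibration of a moving camera against a known environment model, the sketch below solves a perspective-n-point problem with OpenCV: 3D environment-model features matched to their 2D image locations yield the camera pose for the current frame. The intrinsics, feature points, and pose used here are hypothetical test data, and OpenCV is simply one convenient tool, not one named by the patent.

```python
import numpy as np
import cv2

def calibrate_frame(model_points, image_points, K):
    """Estimate a moving camera's pose for one video frame from 3D environment-model
    points (N, 3) and their matched 2D pixel locations (N, 2)."""
    ok, rvec, tvec = cv2.solvePnP(model_points.astype(np.float64),
                                  image_points.astype(np.float64),
                                  K, None, flags=cv2.SOLVEPNP_ITERATIVE)
    if not ok:
        raise RuntimeError("pose estimation failed for this frame")
    R, _ = cv2.Rodrigues(rvec)       # rotation matrix of the camera for this frame
    return R, tvec

if __name__ == "__main__":
    K = np.array([[1000.0, 0.0, 640.0],     # hypothetical intrinsics of the game camera
                  [0.0, 1000.0, 360.0],
                  [0.0, 0.0, 1.0]])
    # Hypothetical stadium features (e.g., field markings) from the environment model.
    model = np.array([[0, 0, 0], [10, 0, 0], [10, 5, 1],
                      [0, 5, 2], [5, 2, 3], [2, 4, 4]], dtype=np.float64)
    # Synthesize their pixel locations in the current frame for a known camera pose.
    rvec_true = np.array([0.1, -0.2, 0.05])
    tvec_true = np.array([-3.0, 1.0, 20.0])
    pixels, _ = cv2.projectPoints(model, rvec_true, tvec_true, K, None)
    R, t = calibrate_frame(model, pixels.reshape(-1, 2), K)
    print(np.round(t.ravel(), 3))            # close to the true translation (-3, 1, 20)
```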
  • the Dynamic High Definition Bubble Framework relies upon the aforementioned articulation of the high-fidelity geometric proxies for the assets (players) using data from both fixed and moving camera arrays. These proxies are then positioned (i.e., registered or aligned) in the correct location on the field by determining where these assets are located relative to the environment model, as discussed in the aforementioned U.S. Provisional Patent Application, the subject matter of which is incorporated herein by reference.
  • Fixed camera arrays are used in various scenarios associated with the football game, including intra-game focused footage as well as collateral footage.
  • The defining characteristic of the fixed arrays is that the cameras do not move relative to the scene.
  • An example of FVV functionality here is that viewers (or producers) can enable real-time smooth pans between the different announcers as they comment and react. Another application of these ideas is to change views between the announcers and a top down map of the play presented next to the announcers.
  • Another example scenario includes zooming in on a specific cheerleader doing a cheer, assuming that the fixed array is positioned on the field in an appropriate location for such views.
  • FVV navigation would be primarily limited to synthetic viewpoints between real camera positions or the axis of the camera geometry. However, by using the available 3D scene data for rendering the image frames, the results would be almost indistinguishable from real camera viewpoints.
  • this video data can be used to enable both far and medium FVV viewpoint control both during the game and during playback.
  • This is considered a sparse array because the relative volume of the stadium is rather large and the distance between sensors is high.
  • Image-based rendering methods such as billboards and articulated billboards may be used to provide two-dimensional representations of the players on the field. These billboards are created using segmentation approaches, which are enabled partially by the environment model. These billboards maintain the photorealistic look of the players, but they do not include the explicit geometry of the players (such as when the players are represented as high fidelity geometric proxies). However, it should be understood that in general, navigation in the FVV is independent of the representation used.
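  • The billboard idea can be sketched as a camera-facing quad placed at a player's tracked position, onto which the segmented player image is texture-mapped; the upright-axis convention and dimensions below are assumptions for illustration only.

```python
import numpy as np

def billboard_corners(player_pos, camera_pos, width=1.0, height=2.0):
    """Corners of an upright, camera-facing quad ("billboard") at a player's position.

    The quad rotates about the vertical (world +z) axis so that it always faces the
    virtual camera; a segmented player image would be texture-mapped onto it.
    """
    player_pos = np.asarray(player_pos, dtype=float)
    to_cam = np.asarray(camera_pos, dtype=float) - player_pos
    to_cam[2] = 0.0                                   # keep the billboard vertical
    to_cam /= np.linalg.norm(to_cam)
    up = np.array([0.0, 0.0, 1.0])
    right = np.cross(up, to_cam)                      # horizontal axis of the quad
    half_w = 0.5 * width
    return np.array([player_pos - half_w * right,                # bottom-left
                     player_pos + half_w * right,                # bottom-right
                     player_pos + half_w * right + height * up,  # top-right
                     player_pos - half_w * right + height * up]) # top-left

if __name__ == "__main__":
    corners = billboard_corners(player_pos=[30.0, 20.0, 0.0], camera_pos=[0.0, 0.0, 5.0])
    print(np.round(corners, 2))
```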
  • viewpoint navigation would be largely constrained by the camera axis using similar image-based-rendering methods described for the announcer's stage.
  • these types of viewpoints are specifically enabled when camera density is at an appropriate level and therefore are not generally enabled for all locations within the stadium.
  • dense camera arrays are used for capturing sub-regions of the overall stadium as high definition bubbles for inclusion in the FVV. In general, these methods are unsuitable for medium and sparse configurations of sensors.
  • Typical intra-game football coverage comes from moving cameras for both live action coverage and for replays.
  • the preceding discussion regarding camera arrays generally focused on creating high fidelity geometric proxies of players and assets, how an environment model of the stadium can be leveraged to enhance the FVV, and the use of intra-game fixed camera arrays in both sparse and dense configurations.
  • the Dynamic High Definition Bubble Framework ties these elements together with sparse moving camera arrays to enable additional FVV functionality for medium shots using billboards and close-up shots that leverage full 6-DOF spatial navigation using high fidelity geometric proxies of players or other assets or persons using conventional game cameras and camera operators.
  • moving camera arrays are used to capture high definition bubbles used in generating FVV's.
  • Moving cameras in the array are continuously calibrated using SFM approaches leveraging the environment model.
  • The optical zoom functionality of these moving cameras is also used to capture image data within high definition bubbles, including the use of prior frames to help further refine or identify a zoomed-in camera geometry.
  • Additional image-based rendering methods are enabled for different FVV scenarios based on the contributing camera geometries, including articulated geometric proxies with RGB textures that allow maximal spatial navigation, and billboard methods that emphasize photorealism with less spatial navigation.
  • the Dynamic High Definition Bubble Framework uses image data from the asset arrays, fixed arrays, and moving arrays. First, the relative position of the players is tracked on the field using one or more fixed arrays. In this way, the approximate location of any player on the field is known. This allows the Dynamic High Definition Bubble Framework to determine which players are in a zoomed in moving camera field of view. Next, based on the identification of the players in the zoomed in fields of view, the Dynamic High Definition Bubble Framework selects the appropriate high-fidelity geometric proxies for each player that were created earlier using the asset arrays.
  • the Dynamic High Definition Bubble Framework determines the spatial orientation of specific players on the field and articulates their geometric proxies as realistically as possible. Note that this also helps in filling in occluded areas (using various hole-filling techniques) when there were insufficient numbers or placements of cameras to capture a view.
  • the Dynamic High Definition Bubble Framework then derives a full 6-DOF FVV replay experience for the user. In this way, users or clients can literally view a play from any potential position including close-up shots as well as intra-field camera positions.
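  • The step of deciding which tracked players fall within a zoomed-in moving camera's field of view can be approximated, for illustration, with a simple top-down angular test; the camera parameters and tracked positions below are hypothetical.

```python
import math

def players_in_view(camera_pos, camera_yaw_deg, fov_deg, tracked_players):
    """Names of tracked players inside the horizontal field of view of a zoomed-in
    moving camera (simplified 2D, top-down check).

    camera_pos: (x, y) camera position on the field plane.
    camera_yaw_deg: direction the camera is pointing, in degrees.
    fov_deg: horizontal field of view of the (zoomed) lens, in degrees.
    tracked_players: dict of name -> (x, y) position from the fixed tracking arrays.
    """
    visible = []
    for name, (px, py) in tracked_players.items():
        bearing = math.degrees(math.atan2(py - camera_pos[1], px - camera_pos[0]))
        offset = (bearing - camera_yaw_deg + 180.0) % 360.0 - 180.0  # signed angle difference
        if abs(offset) <= fov_deg / 2.0:
            visible.append(name)
    return visible

if __name__ == "__main__":
    players = {"QB": (50.0, 26.0), "WR": (70.0, 40.0), "K": (5.0, 5.0)}
    # A sideline camera zoomed tightly (narrow field of view) on midfield.
    print(players_in_view(camera_pos=(50.0, -10.0), camera_yaw_deg=90.0,
                          fov_deg=20.0, tracked_players=players))    # ['QB']
```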
  • the net effect here is to enable interactive replays similar to what is possible with various Xbox® football games such as the “Madden NFL” series of electronic games by Electronic Arts Inc, although with real data.
  • multiple moving cameras focused on the same physical location of the field can also enable medium and close up views that use IBR methods with less explicit geometry such as billboard methodologies.
  • These cameras can be combined with data from both the environment model as well as the fixed arrays to create additional FVV viewpoints within the stadium.
  • FIG. 4 provides an exemplary operational flow diagram that summarizes the operation of some of the various embodiments of the Dynamic High Definition Bubble Framework. Note that FIG. 4 is not intended to be an exhaustive representation of all of the various embodiments of the Dynamic High Definition Bubble Framework described herein, and that the embodiments represented in FIG. 4 are provided only for purposes of explanation.
  • any boxes and interconnections between boxes that are represented by broken or dashed lines in FIG. 4 represent optional or alternate embodiments of the Dynamic High Definition Bubble Framework described herein, and that any or all of these optional or alternate embodiments, as described below, may be used in combination with other alternate embodiments that are described throughout this document.
  • the Dynamic High Definition Bubble Framework begins operation by capturing ( 410 ) 3D image data for an overall viewing area and one or more high definition bubbles within the overall viewing area.
  • the Dynamic High Definition Bubble Framework uses the captured data to generate ( 420 ) one or more 3D geometric proxies or models for use in generating a Free Viewpoint Video (FVV).
  • FVV Free Viewpoint Video
  • a view frustum for an initial or user selected virtual navigation viewpoint is then selected ( 430 ).
  • the Dynamic High Definition Bubble Framework selects ( 440 ) an appropriate level of detail for regions in the view frustum based on distance from viewpoint.
  • the Dynamic High Definition Bubble Framework uses higher fidelity geometric proxies for regions corresponding to high definition bubbles and lower fidelity geometric proxies for other regions of overall viewing area.
  • the Dynamic High Definition Bubble Framework then provides ( 450 ) one or more clients with 3D geometric proxies corresponding to the view frustum, with those geometric proxies having a level of detail sufficient to render the scene (or other objects or people within the current viewpoint) from a viewing frustum corresponding to a user selected virtual navigation viewpoint.
  • the FVV is rendered or generated and presented to the user for viewing, with the user then navigating ( 460 ) the FVV by selecting zoom levels and virtual navigation viewpoints (e.g., pans, tilts, rotations, etc.), which are in turn used to select the view frustum for generating subsequent frames of the FVV.
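  • A minimal sketch of the level-of-detail selection of step 440, assuming a simple distance-and-zoom heuristic with hypothetical thresholds, might look like the following.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class SceneRegion:
    name: str
    center: Tuple[float, float, float]
    in_high_definition_bubble: bool

def level_of_detail(region: SceneRegion, viewpoint: Tuple[float, float, float],
                    zoom: float) -> str:
    """Pick a proxy fidelity for one region in the view frustum (cf. step 440):
    higher fidelity for nearby regions inside high definition bubbles, lower
    fidelity for distant regions or regions outside any bubble."""
    distance = math.dist(region.center, viewpoint)
    apparent_distance = distance / max(zoom, 1e-6)     # zooming in brings regions "closer"
    if region.in_high_definition_bubble and apparent_distance < 20.0:
        return "high_fidelity_proxy"
    if apparent_distance < 60.0:
        return "medium_fidelity_proxy"
    return "low_fidelity_proxy"

def proxies_for_frustum(regions: List[SceneRegion], viewpoint, zoom) -> dict:
    """Level of detail for every region transmitted/rendered for this frustum."""
    return {r.name: level_of_detail(r, viewpoint, zoom) for r in regions}

if __name__ == "__main__":
    regions = [SceneRegion("ball_bubble", (50.0, 25.0, 0.0), True),
               SceneRegion("far_stands", (50.0, 120.0, 10.0), False)]
    print(proxies_for_frustum(regions, viewpoint=(50.0, -30.0, 10.0), zoom=4.0))
```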
  • FIG. 5 illustrates a simplified example of a general-purpose computer system on which various embodiments and elements of the Dynamic High Definition Bubble Framework, as described herein, may be implemented. It should be noted that any boxes that are represented by broken or dashed lines in FIG. 5 represent alternate embodiments of the simplified computing device, and that any or all of these alternate embodiments, as described below, may be used in combination with other alternate embodiments that are described throughout this document.
  • FIG. 5 shows a general system diagram of a simplified computing device such as computer 500.
  • Such computing devices can typically be found in devices having at least some minimum computational capability, including, but not limited to, personal computers, server computers, hand-held computing devices, laptop or mobile computers, communications devices such as cell phones and PDA's, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, audio or video media players, etc.
  • the device should have a sufficient computational capability and system memory to enable basic computational operations.
  • the computational capability is generally illustrated by one or more processing unit(s) 510 , and may also include one or more GPUs 515 , either or both in communication with system memory 520 .
  • The processing unit(s) 510 of the general computing device may be specialized microprocessors, such as a DSP, a VLIW, or other micro-controller, or can be conventional CPUs having one or more processing cores, including specialized GPU-based cores in a multi-core CPU.
  • the simplified computing device of FIG. 5 may also include other components, such as, for example, a communications interface 530 .
  • the simplified computing device of FIG. 5 may also include one or more conventional computer input devices 540 (e.g., pointing devices, keyboards, audio input devices, video input devices, haptic input devices, devices for receiving wired or wireless data transmissions, etc.).
  • the simplified computing device of FIG. 5 may also include other optional components, such as, for example, one or more conventional computer output devices 550 (e.g., display device(s) 555 , audio output devices, video output devices, devices for transmitting wired or wireless data transmissions, etc.).
  • typical communications interfaces 530 , input devices 540 , output devices 550 , and storage devices 560 for general-purpose computers are well known to those skilled in the art, and will not be described in detail herein.
  • the simplified computing device of FIG. 5 may also include a variety of computer readable media.
  • Computer readable media can be any available media that can be accessed by computer 500 via storage devices 560 and includes both volatile and nonvolatile media that is either removable 570 and/or non-removable 580 , for storage of information such as computer-readable or computer-executable instructions, data structures, program modules, or other data.
  • Computer readable media may comprise computer storage media and communication media.
  • Computer storage media includes, but is not limited to, computer or machine readable media or storage devices such as DVD's, CD's, floppy disks, tape drives, hard drives, optical drives, solid state memory devices, RAM, ROM, EEPROM, flash memory or other memory technology, magnetic cassettes, magnetic tapes, magnetic disk storage, or other magnetic storage devices, or any other device which can be used to store the desired information and which can be accessed by one or more computing devices.
  • The terms “modulated data signal” or “carrier wave” generally refer to a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal.
  • Communication media includes wired media such as a wired network or direct-wired connection carrying one or more modulated data signals, and wireless media such as acoustic, RF, infrared, laser, and other wireless media for transmitting and/or receiving one or more modulated data signals or carrier waves. Combinations of any of the above should also be included within the scope of communication media.
  • The Dynamic High Definition Bubble Framework software, programs, and/or computer program products embodying some or all of the various embodiments described herein, or portions thereof, may be stored, received, transmitted, or read from any desired combination of computer or machine readable media or storage devices and communication media in the form of computer executable instructions or other data structures.
  • Dynamic High Definition Bubble Framework described herein may be further described in the general context of computer-executable instructions, such as program modules, being executed by a computing device.
  • program modules include routines, programs, objects, components, data structures, etc., that perform particular tasks or implement particular abstract data types.
  • the embodiments described herein may also be practiced in distributed computing environments where tasks are performed by one or more remote processing devices, or within a cloud of one or more devices, that are linked through one or more communications networks.
  • program modules may be located in both local and remote computer storage media including media storage devices.
  • the aforementioned instructions may be implemented, in part or in whole, as hardware logic circuits, which may or may not include a processor.

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Multimedia (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Graphics (AREA)
  • General Physics & Mathematics (AREA)
  • Geometry (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Processing Or Creating Images (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
  • Testing, Inspecting, Measuring Of Stereoscopic Televisions And Televisions (AREA)
  • Image Generation (AREA)
  • Image Processing (AREA)
  • Telephonic Communication Services (AREA)
  • Length Measuring Devices By Optical Means (AREA)
  • Studio Devices (AREA)

Abstract

A “Dynamic High Definition Bubble Framework” allows local clients to display and navigate FVV of complex multi-resolution and multi-viewpoint scenes while reducing computational overhead and bandwidth for rendering and/or transmitting the FVV. Generally, the FVV is presented to the user as a broad area from some distance away. Then, as the user zooms in or changes viewpoints, one or more areas of the overall area are provided in higher definition or fidelity. Therefore, rather than capturing and providing high definition everywhere (at high computational and bandwidth costs), the Dynamic High Definition Bubble Framework captures one or more “bubbles” or volumetric regions in higher definition in locations where it is believed that the user will be most interested. This information is then provided to the client to allow individual clients to navigate and zoom different regions of the FVV during playback without losing fidelity or resolution in the zoomed areas.

Description

    CROSS-REFERENCE TO RELATED APPLICATIONS
  • This application claims the benefit under Title 35, U.S. Code, Section 119(e), of a previously filed U.S. Provisional Patent Application, Ser. No. 61/653,983 filed on May 31, 2012, by Simonnet, et al., and entitled “INTERACTIVE SPATIAL VIDEO,” the subject matter of which is incorporated herein by reference.
  • BACKGROUND
  • In general, in free-viewpoint video (FVV), multiple video streams are used to re-render a time-varying scene from arbitrary viewpoints. The creation and playback of a FVV is typically accomplished using a substantial amount of data. In particular, in FVV, scenes are generally simultaneously recorded from many different perspectives using sensors such as RGB cameras. This recorded data is then generally processed to extract 3D geometric information in the form of geometric proxies or models using various 3D reconstruction (3DR) algorithms. The original RGB data and geometric proxies are then recombined during rendering, using various image based rendering (IBR) algorithms, to generate multiple synthetic viewpoints.
  • Unfortunately, when a complex FVV such as a football game is recorded or otherwise captured, rendering the entire volume of the overall capture area to generate the FVV generally uses a very large dataset and a correspondingly large computational overhead for rendering the various viewpoints of the FVV for viewing on local clients.
  • SUMMARY
  • This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used as an aid in determining the scope of the claimed subject matter. Further, while certain disadvantages of prior technologies may be noted or discussed herein, the claimed subject matter is not intended to be limited to implementations that may solve or address any or all of the disadvantages of those prior technologies.
  • In general, a “Dynamic High Definition Bubble Framework” as described herein provides various techniques that allow local clients to display free viewpoint video (FVV) of complex 3D scenes while reducing computational overhead and bandwidth for rendering and/or transmitting the FVV. These techniques allow the client to perform spatial navigation through the FVV, while changing viewpoints and/or zooming into one or more higher definition regions or areas (specifically defined and referred to herein as “high definition bubbles”) within the overall area or scene of the FVV.
  • More specifically, the Dynamic High Definition Bubble Framework enables local rendering of FVV by providing a lower fidelity geometric proxy of an overall scene or viewing area in combination with one or more higher fidelity geometric proxies of the scene corresponding to regions of interest (e.g., areas of action in the scene that the user may wish to view in expanded detail and from one or more different viewpoints). This allows the user to view the entire volume of the scene as FVV, with interesting features or regions of the scene being provided in higher detail and optionally from a plurality of user-selectable viewpoints, while reducing the amount of data that is transmitted to the client for local rendering of the FVV. Note that the high definition bubbles may have differing resolution or fidelity levels as well as differing numbers of viewpoints. Further, some of these viewpoints may be available at different resolutions or fidelity levels even within the same high definition bubble.
  • The Dynamic High Definition Bubble Framework enables these capabilities by providing multiple areas or sub-regions of higher definition video capture within the overall viewing area or scene. One implementation of this concept is to use multiple cameras (e.g., a camera array or the like) surrounding the scene to capture the scene or event holistically, in whatever resolution is desired. Concurrently, a set of cameras (e.g., a camera array or the like) that zoom in on particular regions of interest within the overall scene are used to create higher definition geometric proxies that enable a higher quality viewing experience of “bubbles” associated with the zoomed regions of the scene.
  • For example, various embodiments of the Dynamic High Definition Bubble Framework are enabled by using captured image or video data to create a 3D representation (or other visual representation of the “real” world) of the overall space of a scene. One or more sub-regions (i.e., high definition bubbles) of the larger space of the overall scene are then transferred to the client as high definition geometric proxies while the remaining areas of the overall scene are transferred to the client using lower resolution geometric proxies. Advantageously, the sub-regions represented by the high definition bubbles can be in fixed or predefined positions (e.g., the end zone of a football field) or can move within the larger area of the overall scene (e.g., camera arrays following a ball or a particular player in a soccer game). These high definition bubbles are enabled by using any desired combination of fixed and moving camera arrays to capture high-resolution image data within one or more regions of interest relative to the area of the overall scene.
  • Captured image data is then used to generate geometric proxies or 3D models of the scene for local rendering of the FVV from any available viewpoint and at any desired resolution corresponding to the selected viewpoint. Note also that the FVV can be pre-rendered and sent to the client as a viewable and navigable FVV.
  • In particular, when used to stream 3D geometric proxies or models and corresponding RGB data to the client for local rendering of the FVV, the techniques enabled by the Dynamic High Definition Bubble Framework serve to reduce the amount of data used to render a specific viewpoint and resolution selected by the user when viewing or navigating the FVV. This approach is also applicable to server side rendering performance, when a video frame is generated on the server and transmitted to the client. In the server side example, using lower fidelity representations of areas that are far away from a region of interest (i.e., the desired viewpoint) in combination with using higher fidelity representations of the regions of interest reduces the time and computational overhead needed for generating video frames prior to transmission to the client.
  • In other words, in various embodiments, the Dynamic High Definition Bubble Framework creates a navigable FVV that presents a general or remote view (e.g., relatively far back from the action) of an overall volumetric space and then chooses an optimal dataset to use to render various portions of the FVV at the desired resolutions/fidelity. This allows the Dynamic High Definition Bubble Framework to seamlessly support varying resolutions for different regions while optimally choosing the appropriate dataset to process for the desired output. Advantageously, rendering regions within the high definition bubbles using higher resolutions allows the user to zoom into those regions without creating pixelization artifacts or other zoom-based viewing problems. In other words, even though the user is zooming into particular areas or regions, the FVV displayed to the user does not lose fidelity or resolution in those zoomed areas.
  • In view of the above summary, it is clear that the Dynamic High Definition Bubble Framework described herein provides various techniques that allow local clients to display and navigate FVV of complex multi-resolution and multi-viewpoint scenes while reducing computational overhead and bandwidth for rendering and/or transmitting the FVV. In addition to the just described benefits, other advantages of the Dynamic High Definition Bubble Framework will become apparent from the detailed description that follows hereinafter when taken in conjunction with the accompanying drawing figures.
  • DESCRIPTION OF THE DRAWINGS
  • The specific features, aspects, and advantages of the claimed subject matter will become better understood with regard to the following description, appended claims, and accompanying drawings where:
  • FIG. 1 provides an exemplary architectural flow diagram that illustrates program modules for using a “Dynamic High Definition Bubble Framework” for creating and navigating free viewpoint videos (FVV) of complex multi-resolution and multi-viewpoint scenes while reducing computational overhead and bandwidth for rendering and/or transmitting the FVV to clients, as described herein.
  • FIG. 2 provides an illustration of high definition bubbles within an overall viewing area or scene, as described herein.
  • FIG. 3 provides an illustration of the use of separate camera arrays to capture a high definition bubble and an overall viewing area, as described herein.
  • FIG. 4 illustrates a general system flow diagram that illustrates exemplary methods for implementing various embodiments of the Dynamic High Definition Bubble Framework for creating and navigating FVV's having high definition bubbles, as described herein.
  • FIG. 5 is a general system diagram depicting a simplified general-purpose computing device having simplified computing and I/O capabilities for use in implementing various embodiments of the Dynamic High Definition Bubble Framework, as described herein.
  • DETAILED DESCRIPTION OF THE EMBODIMENTS
  • In the following description of the embodiments of the claimed subject matter, reference is made to the accompanying drawings, which form a part hereof, and in which is shown by way of illustration specific embodiments in which the claimed subject matter may be practiced. It should be understood that other embodiments may be utilized and structural changes may be made without departing from the scope of the presently claimed subject matter.
  • 1.0 Introduction:
  • Note that some or all of the concepts described herein are intended to be understood in view of the overall context of the discussion of “Interactive Spatial Video” provided in U.S. Provisional Patent Application, Ser. No. 61/653,983 filed on May 31, 2012, by Simonnet, et al., and entitled “INTERACTIVE SPATIAL VIDEO,” the subject matter of which is incorporated herein by reference.
  • Note that various examples discussed in the following paragraphs refer to football games and football stadiums for purposes of explanation. However, it should be understood that the techniques described herein are not limited to any particular location, any particular activities, any particular size of volumetric space, or any particular number of scenes or objects.
  • In general, when a complex free-viewpoint video (FVV) of 3D scenes is recorded, one or more overall capture areas typically surround the “action”, which is confined to one or more smaller volumetric areas or sub-regions within the overall capture area. For example, in a football game, the size of the field is relatively large, but at any given time, the interesting action is generally centered on the ball and one or more players or athletes around the ball. While it is technically feasible to capture and render the entire capture volume at full fidelity, this would typically result in the generation of very large datasets to be sent from the server to the client for local rendering.
  • Advantageously, a “Dynamic High Definition Bubble Framework,” as described herein, provides various techniques that specifically address such concerns by providing the client with one or more lower fidelity geometric proxies of an overall viewing area or volumetric space. Concurrently, the Dynamic High Definition Bubble Framework provides one or more sub-regions of the overall viewing area as higher fidelity representations. Local clients then use this information to view and navigate through the overall FVV while providing the user with the capability to zoom into areas of higher fidelity. In other words, the Dynamic High Definition Bubble Framework provides various techniques that allow local clients to display and navigate FVV of complex multi-resolution and multi-viewpoint scenes while reducing computational overhead and bandwidth for rendering and/or transmitting the FVV. Advantageously, rendering regions within the high definition bubbles using higher resolutions allows the user to zoom into those regions without creating pixelization artifacts or other zoom-based viewing problems. In other words, even though the user is zooming into particular areas or regions, the FVV displayed to the user does not lose fidelity or resolution in those zoomed areas.
  • More specifically, the Dynamic High Definition Bubble Framework enables local rendering of image frames of the FVV by providing a lower fidelity geometric proxy of an overall scene in combination with one or more higher fidelity geometric proxies of the scene corresponding to regions of interest (e.g., areas of action in the scene that the user may wish to view in expanded detail). This allows the user to view the entire volume of the scene as FVV, with interesting features or regions of the scene being provided in higher detail in the event that the user zooms into such regions, while reducing the amount of data that is transmitted to the client for local rendering of the FVV.
  • One implementation of this concept is to use multiple cameras (e.g., camera arrays or the like) surrounding the scene to capture the scene or event holistically, in whatever resolution is desired. Concurrently, a set of cameras that zoom in on particular regions of interest within the overall scene (such as the “action” in a football game where a player is carrying the ball) are used to capture data for creating higher definition geometric proxies that enable a higher quality viewing experience of “bubbles” associated with the zoomed regions of the scene. These bubbles are specifically defined and referred to herein as “high definition bubbles.” Further, depending upon the available camera data, multiple viewpoints of potentially varying resolution or fidelity may be available within each bubble.
  • For any given scenario (e.g., sporting events, movie scenes, concerts, etc.), the Dynamic High Definition Bubble Framework typically presents a broad view of the overall viewing area or volumetric space from some distance away. Then, as the user zooms in or changes viewpoints, one or more areas of the overall scene or viewing area are provided in higher definition or fidelity. Therefore, rather than providing high definition everywhere (at high computational and bandwidth costs), the Dynamic High Definition Bubble Framework captures one or more bubbles in higher definition in locations or regions where it is believed that the user will be most interested. In other words, an author of the FVV will use the Dynamic High Definition Bubble Framework to capture bubbles in places where it is believed that users may want more detail, or where the author wants users to be able to explore the FVV in greater detail.
  • Bubbles can be presented to the user in various ways. For example, in displaying the FVV to the user, the user is provided with the capability to zoom and/or change viewpoints (e.g., pans, tilts, rotations, etc.). In the case that the user zooms into a region corresponding to a high definition bubble, the user will be presented with higher resolution image frames during the zoom. As such, there is no need to demarcate explicit regions of the FVV that contain high definition bubbles.
  • In other words, the user is presented with the entire scene and, as they scroll through it, more data is available in areas (i.e., bubbles) where there is higher detail. For example, by zooming into a high definition bubble around a football, the user will see that there is more detail available to them, while if they zoom into the grass near the edge of a field where there is less action, the user will see less detail (assuming that there is no corresponding high definition bubble there). Therefore, by placing bubbles in areas where the user is expected to look for higher detail (such as a tight view in and around the ball when it is fumbled), more detail is available where it is most likely to be wanted, while areas off to one side of the field, distant from the play, are unlikely to be zoomed into at all. Consequently, when the user does zoom into the area around the ball, the experience creates the illusion that the user can zoom in anywhere.
  • In alternate embodiments of the Dynamic High Definition Bubble Framework, the FVV is presented with thumbnails or highlighting within or near the overall scene to alert the user as to locations, regions or bubbles (and optionally available viewpoints) of higher definition. For example, the Dynamic High Definition Bubble Framework can provide a FVV of a boxing match where the overall ring is in low definition, but the two fighters are within a high definition bubble. In this case, the FVV may include indications of either or both the existence of the high definition bubble around the fighters and various available viewpoints within that bubble such as a view of the opponent from either boxer's perspective.
  • Advantageously, the Dynamic High Definition Bubble Framework allows different users to have completely different viewing experiences. For example, in the case of a football game, one user can be zoomed into a bubble around the ball, while another user is zoomed into a bubble around cheerleaders on the edge of the football field, while yet another user is zoomed out to see the overall action on the entire field. Further, the same user can watch the FVV multiple times using any of a number of available zooms into one or more high definition bubbles and from any of a number of available viewpoints relative to any of those high definition bubbles.
  • 1.1 System Overview:
  • As noted above, the “Dynamic High Definition Bubble Framework” provides various techniques that allow local clients to display and navigate FVV of complex multi-resolution and multi-viewpoint scenes while reducing computational overhead and bandwidth for rendering and/or transmitting the FVV. The processes summarized above are illustrated by the general system diagram of FIG. 1. In particular, the system diagram of FIG. 1 illustrates the interrelationships between program modules for implementing various embodiments of the Dynamic High Definition Bubble Framework, as described herein. Furthermore, while the system diagram of FIG. 1 illustrates a high-level view of various embodiments of the Dynamic High Definition Bubble Framework, FIG. 1 is not intended to provide an exhaustive or complete illustration of every possible embodiment of the Dynamic High Definition Bubble Framework as described throughout this document.
  • In addition, it should be noted that any boxes and interconnections between boxes that may be represented by broken or dashed lines in FIG. 1 represent alternate embodiments of the Dynamic High Definition Bubble Framework described herein, and that any or all of these alternate embodiments, as described below, may be used in combination with other alternate embodiments that are described throughout this document.
  • In general, as illustrated by FIG. 1, the processes enabled by the Dynamic High Definition Bubble Framework begin operation by using a data capture module 100 that uses multiple cameras or arrays to capture and generate 3D scene data 120 (e.g., geometric proxies, 3D models, RGB or other color space data, textures, etc.) for an overall viewing area and one or more viewpoints for one or more high definition bubbles within the overall viewing area.
  • In various embodiments, a user input module 110 is used for various purposes, including, but not limited to, defining and configuring one or more cameras and/or camera arrays for capturing an overall viewing area and one or more high definition bubbles. The user input module 110 is also used in various embodiments to define or specify one or more high definition bubbles, one or more viewpoints or view frustums, resolution or level of detail for one or more of the bubbles and one or more of the viewpoints, etc.
  • Typically, local clients will render video frames of the FVV from 3D scene data 120. However, in various embodiments, a pre-rendering module 130 uses the 3D scene data 120 to pre-render one or more FVV's that are then provided to one or more clients for viewing and navigation. In either case, a data transmission module 140 transmits either the pre-rendered FVV or 3D scene data 120 to one or more clients. The Dynamic High Definition Bubble Framework conserves bandwidth when transmitting to the client by sending only enough 3D scene data 120, at the desired level of detail, to render image frames for the current viewing frustum, whether that frustum corresponds to an initial virtual navigation viewpoint or to one selected by the client. Following receipt of the 3D scene data 120, local clients use a local rendering module 150 to render one or more FVV's 160 or image frames of the FVV.
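  • To illustrate the kind of selection the data transmission module 140 performs, the following Python sketch shows one possible way to cull and choose proxies for a client's current view frustum. It is a minimal sketch only, assuming bounding-sphere regions and a simple inside-frustum test; the class and function names (Proxy, sphere_in_frustum, select_proxies) are hypothetical and are not part of the Dynamic High Definition Bubble Framework itself.

    # Minimal sketch (not the patented implementation): choosing which geometric
    # proxies to transmit for a client's current view frustum. All names here
    # are illustrative assumptions.
    from dataclasses import dataclass

    @dataclass
    class Proxy:
        region_id: str
        center: tuple        # (x, y, z) center of the region the proxy covers
        radius: float        # bounding-sphere radius of the region
        fidelity: str        # "high" for bubble proxies, "low" for the overall scene

    def sphere_in_frustum(center, radius, frustum_planes):
        """True if the bounding sphere intersects the view frustum.
        frustum_planes is a list of (a, b, c, d) plane equations pointing inward."""
        x, y, z = center
        for a, b, c, d in frustum_planes:
            if a * x + b * y + c * z + d < -radius:
                return False
        return True

    def select_proxies(proxies, frustum_planes, is_zoomed_into):
        """Send low-fidelity proxies for everything visible; add the high-fidelity
        bubble proxies only when the client is zoomed into the bubble region."""
        selected = []
        for p in proxies:
            if not sphere_in_frustum(p.center, p.radius, frustum_planes):
                continue
            if p.fidelity == "low" or is_zoomed_into(p):
                selected.append(p)
        return selected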
  • Finally, a FVV playback module 170 provides user-navigable interactive playback of the FVV in response to user navigation and zoom commands. In general, the FVV playback module 170 allows the user to pan, zoom, or otherwise navigate through the FVV. Further, user pan, tilt, rotation and zoom information is provided back to the local rendering module 150 or to the data transmission module 140 for use in retrieving the 3D scene data 120 needed to render subsequent image frames of the FVV corresponding to user interaction and navigation through the FVV.
  • 2.0 Operational Details:
  • The above-described program modules are employed for implementing various embodiments of the Dynamic High Definition Bubble Framework. As summarized above, the Dynamic High Definition Bubble Framework provides various techniques that allow local clients to display FVV of complex scenes while reducing computational overhead and bandwidth for rendering and/or transmitting the FVV.
  • The following sections provide a detailed discussion of the operation of various embodiments of the Dynamic High Definition Bubble Framework, and of exemplary methods for implementing the program modules described in Section 1 with respect to FIG. 1. In particular, the following sections provide examples and operational details of various embodiments of the Dynamic High Definition Bubble Framework, including: an operational overview of the Dynamic High Definition Bubble Framework; exemplary FVV scenarios enabled by the Dynamic High Definition Bubble Framework; and data capture scenarios and FVV generation.
  • 2.1 Operational Overview:
  • As noted above, the Dynamic High Definition Bubble Framework-based processes described herein provide various techniques that allow local clients to display and navigate FVV of complex multi-resolution and multi-viewpoint scenes while reducing computational overhead and bandwidth for rendering and/or transmitting the FVV.
  • FIG. 2 illustrates various high definition bubbles within an overall viewing area 200, scene, or volumetric space. The Dynamic High Definition Bubble Framework generally uses various cameras or camera arrays to capture the overall viewing area 200 at some desired resolution level. One or more high definition bubbles within the overall viewing area 200 are then captured using various cameras or camera arrays at higher resolution or fidelity levels. As illustrated by FIG. 2, these high definition bubbles (e.g., 210, 220, 230, 240, 250 and 260) can have arbitrary shapes, sizes and volumes. Further, high definition bubbles (e.g., 210, 220, 230) can be in fixed positions to capture particular regions of the overall scene that may be of interest (e.g., end zones in a football game). The high definition bubbles (e.g., 240, 250 and 260) may also represent dynamic regions that move to follow action along arbitrary paths (e.g., 240) or along fixed paths (e.g., 250 to 260). Note also that moving high definition bubbles may sometimes extend outside the overall viewing area 200 (e.g., 260), though this may result in FVV image frames in which only the content of that high definition bubble is visible. One or more high definition bubbles may also overlap (e.g., 230).
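  • As a purely illustrative aside, the fixed and moving bubbles of FIG. 2 could be represented in software as time-varying bounding volumes. The following Python sketch assumes spherical bubbles whose centers are functions of time; the Bubble class, its methods, and the example paths are hypothetical and serve only to make the fixed-versus-moving distinction and the possibility of overlap concrete.

    # Illustrative sketch only: representing fixed and moving high definition
    # bubbles as time-varying bounding spheres. All names are assumptions.
    import math

    class Bubble:
        def __init__(self, bubble_id, radius, path):
            """path maps a timestamp to the bubble's center; a fixed bubble
            simply returns the same center for every time."""
            self.bubble_id = bubble_id
            self.radius = radius
            self.path = path

        def center(self, t):
            return self.path(t)

        def contains(self, point, t):
            return math.dist(self.center(t), point) <= self.radius

        def overlaps(self, other, t):
            return math.dist(self.center(t), other.center(t)) <= self.radius + other.radius

    # A fixed bubble around an end zone and a moving bubble following the ball.
    end_zone = Bubble("end_zone", radius=15.0, path=lambda t: (5.0, 0.0, 25.0))
    ball = Bubble("ball", radius=5.0, path=lambda t: (5.0 + 0.8 * t, 0.0, 25.0))
    print(end_zone.overlaps(ball, t=2.0))   # True: the moving bubble overlaps the fixed one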
  • FIG. 3 illustrates the use of separate camera arrays to capture a high definition bubble 330 using a camera array (e.g., cameras 335, 340, 345 and 350) within an overall viewing area 300 that is in turn captured by a set of cameras (e.g., 305, 310, and 315) at a lower fidelity level than that of the high definition bubble.
  • Various embodiments of the Dynamic High Definition Bubble Framework are enabled by using captured image or video data to create a 3D representation (or other visual representation of the “real” world) of the overall space of a scene. One or more sub-regions (i.e., high definition bubbles) of the larger space of the overall scene are then transferred to the client as high definition geometric proxies or 3D models while the remaining areas of the overall scene are transferred to the client using lower definition geometric proxies or 3D models. Advantageously, as noted above, the sub-regions represented by the high definition bubbles can be in fixed or predefined positions (e.g., the end zone of a football field) or can move within the larger area of the overall scene (e.g., following a ball or a particular player in a soccer game). These high definition bubbles are enabled by using any desired combination of fixed and moving camera arrays to capture high-resolution image data within one or more regions of interest relative to the area or volume of the overall scene.
  • Consequently, when used to stream both 3D geometric and RGB data from the server to the client, the FVV processing techniques enabled by the Dynamic High Definition Bubble Framework serve to reduce the amount of data used to render a specific viewpoint selected by the user when viewing a FVV. This approach is also applicable to server side rendering performance, when a video frame is generated on the server and transmitted to the client. In the server side example, using lower fidelity representations of areas that are far away from a region of interest (i.e., the desired viewpoint) in combination with using higher fidelity representations of the regions of interest reduces the time and computational overhead needed for generating video frames prior to transmission to the client.
  • 2.2 Exemplary FVV Scenarios:
  • The Dynamic High Definition Bubble Framework enables a wide variety of viewing scenarios for clients or users. As noted above, since the user is provided with the opportunity to navigate and zoom the FVV during playback, the viewing experience can be substantially different for individual viewers of the same FVV.
  • For example, considering a football game in a typical stadium, the Dynamic High Definition Bubble Framework uses a number of cameras or camera arrays to capture sufficient views to create an overall 3D view of the stadium at low to medium definition or fidelity (i.e., any desired fidelity level). In addition, the Dynamic High Definition Bubble Framework will also capture one or more specific locations or “bubbles” at a higher definition or fidelity and with a plurality of available viewpoints. Note that these bubbles are captured using fixed or movable cameras or camera arrays. For example, again considering the football game, the Dynamic High Definition Bubble Framework may have fixed cameras or camera arrays around the end zone to capture high definition images in these regions at all times. Further, one or more sets of moving cameras or camera arrays can follow the ball or particular players around the field to capture images of the ball or players from multiple viewpoints.
  • Generally, in the case of a football field, it would be difficult to capture every part of the entire field and all of the action in high definition without using very large amounts of data. Consequently, the Dynamic High Definition Bubble Framework captures and provides an overall view of the field by using some number of cameras capturing the overall field. Then, the Dynamic High Definition Bubble Framework uses one or more sets of cameras that capture the regions around the ball, specific players, etc., so that the overall low definition general background of the football field can be augmented by user navigable high definition views of what is going on in 3D in the “bubbles.” In other words, in various embodiments, the Dynamic High Definition Bubble Framework generally presents a general or remote view (e.g., relatively far back from the action) of an overall volumetric space and then layers or combines navigable high definition bubbles with the overall volumetric space based on a determination of the proper geometric registration or alignment of those high definition bubbles within the overall volumetric space.
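  • The geometric registration or alignment mentioned above can be thought of as a rigid transform that places each bubble's proxy into the coordinate frame of the overall volumetric space. The short Python sketch below shows only that placement step, assuming the rotation and translation are already known from camera calibration; the function name and the example values are hypothetical.

    # Sketch of geometric registration: placing a bubble's high-fidelity proxy
    # (an N x 3 array of vertices in its own coordinate frame) into the overall
    # environment model's frame using a known rigid transform.
    import numpy as np

    def register_proxy(vertices_local, rotation, translation):
        """vertices_local: (N, 3) array; rotation: (3, 3); translation: (3,)."""
        return vertices_local @ rotation.T + translation

    # Hypothetical example: a small proxy rotated 90 degrees about the vertical
    # axis and moved to midfield.
    proxy = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 2.0, 0.0]])
    R = np.array([[0.0, 0.0, 1.0], [0.0, 1.0, 0.0], [-1.0, 0.0, 0.0]])
    t = np.array([50.0, 0.0, 26.5])
    print(register_proxy(proxy, R, t))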
  • In the case of a movie or the like, the Dynamic High Definition Bubble Framework enables the creation of movies where the user is provided with the capability to move around within a particular scene (i.e., change viewpoints) and to view particular parts of the scene, which are within bubbles, in higher definition while the movie is playing.
  • 2.3 Exemplary Data Capture Scenarios and FVV Generation:
  • The following paragraphs describe various examples of scenarios involving the physical placement and geometric configuration of various cameras and camera arrays within a football stadium to capture multiple high definition bubbles and virtual viewpoints for navigation of FVV's of a football game with associated close-ups and zooms corresponding to the high definition bubbles and virtual viewpoints. It should be understood that the following examples are provided only for purposes of explanation and are not intended to limit the scope or use of the Dynamic High Definition Bubble Framework to the examples presented, to the particular camera array configurations or geometries discussed, or to the positioning or use of particular high definition bubbles or virtual viewpoints.
  • In general, understanding where cameras or camera arrays will be deployed and the geometry associated with those cameras determines how the resulting 3D scene data will be processed in an interactive Spatial Video (SV) and subsequently rendered to create the FVV for the user or client. In the case of a typical professional football game, it is assumed that all cameras and related technology for capturing images, following action scenes or the ball, cutting to particular locations or persons, etc., exist inside or above the stadium. In some cases, the cameras will record elements before the game. In other cases, the cameras will be used in the live broadcast of the game. In this example, there are several primary configurations, including, but not necessarily limited to, the following:
      • Asset Arrays—Camera arrays referred to as “asset arrays” are used to capture 3D image data of players, cheerleaders, coaches, referees, and any other items or people which may appear on the field before the game. Following processing of the raw image data, the output of these asset arrays is both an image intensive photorealistic rendering and a high fidelity geometric proxy similar to a CGI asset for any imaged items or people. This information can then be used in subsequent rendering of the FVV.
      • Environment Model—Mobile SLR cameras, mobile video cameras, laser range scanners, etc., are used to build an image-based geometric proxy for the stadium environment before the game from 3D image data captured by one or more camera arrays. This 3D image data is then generally used to generate a geometric proxy or 3D model of the overall environment. Further, this geometric proxy or 3D model can be edited or modified to suit particular purposes (e.g., modified to allow dynamic placement of advertising messages along a stadium wall or other location during playback of the resulting FVV).
      • Fixed Arrays—Fixed camera arrays are used to capture 3D image data of various game elements or features for insertion into the FVV. These elements include, but are not limited to announcers, ‘talking heads’, player interviews, intra-game fixed physical locations around the field, etc.
      • Moving Arrays—Mobile camera arrays are used to capture 3D image data of intra-game action on the field. Note that these are the same types of mobile cameras that are currently used to record action in professional football games, though additional numbers of cameras may be used to capture 3D image data of the intra-game action. Note that image or video data captured by fans viewing the game from inside the stadium using cell phones or other cameras can also be used by the Dynamic High Definition Bubble Framework to record intra-game action on the field.
  • 2.3.1 Asset Arrays:
  • In general, “asset arrays” are dense, fixed camera arrays optimized for creating a static (or moving) geometric proxy of an asset. Assets include any object or person that will be on the field, such as players, cheerleaders, referees, footballs, or other equipment. The camera geometry of the asset arrays is optimized for the creation of high fidelity geometric proxies, which requires a ‘full 360’ arrangement of sensors so that all aspects of the asset can be recorded and modeled; additional sensors may be placed above or below the assets. Note that in some cases, ‘full 360’ coverage may not be possible (e.g., views partially obstructed along some range of viewing directions), and that in such cases, user selection of viewpoints in the resulting FVV will be limited to whatever viewpoints can be rendered from the captured data. In addition to RGB (or other color space) cameras in the asset array, other sensor combinations such as active IR based stereo (also used in Kinect® or time of flight type applications) can be used to assist in 3D reconstruction. Additional techniques such as the use of green screen backgrounds can further assist in segmentation of the assets for use in creating high fidelity geometric proxies of those assets.
  • Asset arrays are generally utilized prior to the game and focus on static representations of the assets. Once recorded, these assets can be used as SV content for creating FVV's in two different ways, depending on the degree of geometry employed in their representation using image-based rendering (IBR).
  • Firstly, a low-geometry IBR method, including, but not limited to, view interpolation can be used to place the asset (players or cheerleaders) online using technology including, but not limited to, browser-based 2D or 3D rendering engines. This also allows users to view single assets with a web browser or the like to navigate around a coordinate system that allows them to zoom in to the players (or other assets) from any angle, thus providing the user or viewer with high levels of photorealism with respect to those assets. Again, rendering regions within the high definition bubbles using higher resolutions allows the user to zoom into those regions without losing fidelity or resolution in the zoomed areas, or otherwise creating pixelization artifacts or other zoom-based viewing problems. In other implementations, video can be used to highlight different player/cheerleader promotional activities such as a throw, catch, block, cheer, etc. Note that various examples of view interpolation and view morphing for such purposes are discussed in the aforementioned U.S. Provisional Patent Application, the subject matter of which is incorporated herein by reference.
  • Secondly, a high fidelity geometry proxy of the players (or other persons such as cheerleaders, referees, coaches, announcers, etc.) is created and combined with view dependent texture mapping (VDTM) for use in close up FVV scenarios. To use these geometric proxies in FVV, a kinematic model for a human is used as a baseline for possible motions and further articulated based on RGB data from live-action video camera arrays. Multi-angle video data is then used to realistically articulate the geometric proxies for all players or a subset of players on the field. Advantageously, 6 degrees of freedom (6-DOF) movement of the user's viewpoint during playback of FVV is possible due to the explicit use of 3D geometry in representing the assets. Again, various techniques for rendering and viewing the 3D content of the FVV are discussed in the aforementioned U.S. Provisional Patent Application, the subject matter of which is incorporated herein by reference.
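  • To make the view dependent texture mapping step a little more concrete, the following Python sketch shows one common weighting heuristic: real cameras whose viewing direction toward a surface point is closest to the virtual viewpoint's direction contribute most to that point's texture. This is a simplified illustration of the general VDTM idea, not the specific method used by the Dynamic High Definition Bubble Framework, and the function name is an assumption.

    # Simplified illustration of the weighting step in view-dependent texture
    # mapping (VDTM): cameras whose viewing direction best matches the virtual
    # viewpoint's direction toward a surface point get the largest blend weight.
    import numpy as np

    def vdtm_weights(point, virtual_cam, real_cams):
        """point: (3,) surface point; virtual_cam and real_cams: camera positions.
        Returns normalized blending weights, one per real camera."""
        point = np.asarray(point, dtype=float)
        v_dir = point - np.asarray(virtual_cam, dtype=float)
        v_dir /= np.linalg.norm(v_dir)
        weights = []
        for cam in real_cams:
            c_dir = point - np.asarray(cam, dtype=float)
            c_dir /= np.linalg.norm(c_dir)
            # Cosine of the angle between viewing directions, clamped to >= 0.
            weights.append(max(float(np.dot(v_dir, c_dir)), 0.0))
        w = np.array(weights)
        return w / w.sum() if w.sum() > 0 else np.full(len(real_cams), 1.0 / len(real_cams))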
  • 2.3.2 Environment Model:
  • A model of the environment is useful to the FVV of the football game in a number of different ways, such as providing a calibration framework for live-action moving cameras, creating interstitial effects when transitioning between known real camera feeds, determining the accurate placement (i.e., registration or alignment) of various geometric proxies (generated from the high definition bubbles) for FVV, improving segmentation results with background data, accurately representing the background of the scene using image-based-rendering methods in different FVV use cases, etc.
  • As is well known to those skilled in the art, a number of conventional techniques exist for modeling the environment using RGB (or other color space) photos and a sparse geometric representation of the scene. For example, in the case of Photosynth®, sparse geometry means that only enough geometry is extracted to enable the alignment of multiple photographs into a cohesive montage. However, in a scenario such as the football game, the Dynamic High Definition Bubble Framework provides richer 3D rendering by using much more geometry. More specifically, geometric proxies corresponding to each high definition bubble are registered or aligned to the geometry of the environment model. Once properly positioned, the various geometric proxies are then used to render the frames of the FVV.
  • Traditional environment models are often created using a variety of sensors such as moving video cameras, fixed cameras for high resolution static images, and laser based range scanning devices. RGB data from video cameras and fixed camera data can be processed using conventional 3D reconstruction methods to identify features and their location; point clouds of the stadium can be created from these features. Additional geometry, also in the form of point clouds, can be extracted using range scanning devices for additional accuracy. Finally, the point cloud data can be merged together, meshed, and textured into a cohesive geometric model. This geometry can also be used as an infrastructure to organize RGB data for use in other IBR approaches for backgrounds useful for FVV functionality.
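  • As a rough illustration of how point clouds from different sensors might be merged before meshing and texturing, the following Python sketch transforms each cloud into a common stadium coordinate frame and removes near-duplicate points with a voxel-grid filter. This is a generic sketch of the merging step only; the function name, transforms, and voxel size are assumptions, and the subsequent meshing would typically rely on a standard surface reconstruction method.

    # Minimal sketch of merging point clouds from multiple sensors into a single
    # cloud in the stadium's coordinate frame, with a simple voxel-grid
    # downsampling step to remove near-duplicate points.
    import numpy as np

    def merge_point_clouds(clouds, transforms, voxel_size=0.05):
        """clouds: list of (N_i, 3) arrays; transforms: list of (4, 4) matrices
        mapping each cloud into the common stadium frame."""
        merged = []
        for pts, T in zip(clouds, transforms):
            homog = np.hstack([pts, np.ones((len(pts), 1))])
            merged.append((homog @ T.T)[:, :3])
        merged = np.vstack(merged)
        # Keep one representative point per occupied voxel.
        keys = np.floor(merged / voxel_size).astype(np.int64)
        _, idx = np.unique(keys, axis=0, return_index=True)
        return merged[np.sort(idx)]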
  • Similar to the use of asset arrays, an environment model is created and processed before being used in any live-action footage provided by the FVV. Various methods associated with FVV live action, as discussed below, are made possible by the creation of an environment model, including interstitials, moving camera calibration, and geometry articulation.
  • In the simplest use of background models, interstitial movements between real camera positions are enabled, allowing users to more clearly understand where various camera feeds are located. In any SV scenario involving FVV, real camera feeds will have the highest degree of photorealism and will be widely utilized. When a viewer elects to change real camera views—instead of immediately switching to the next video feed—a smooth and sweeping camera movement is optionally enabled by rendering a virtual transition from the viewpoint of one camera view to the other to provide additional spatial information about the location of the cameras relative to the scene.
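  • One simple way to realize such a sweeping virtual transition is to interpolate the camera pose between the two real feeds: position linearly and orientation by quaternion slerp. The Python sketch below, which leans on SciPy's rotation utilities, is only an illustration of that idea; the function name and the example poses are hypothetical.

    # Sketch of an interstitial camera sweep between two real camera feeds:
    # positions are interpolated linearly, orientations with quaternion slerp.
    import numpy as np
    from scipy.spatial.transform import Rotation, Slerp

    def interstitial_poses(pos_a, pos_b, rot_a, rot_b, n_frames):
        """pos_*: (3,) camera positions; rot_*: scipy Rotation objects.
        Returns a list of (position, Rotation) pairs for the transition frames."""
        times = np.linspace(0.0, 1.0, n_frames)
        key_rots = Rotation.from_quat(np.vstack([rot_a.as_quat(), rot_b.as_quat()]))
        rots = Slerp([0.0, 1.0], key_rots)(times)
        positions = (1.0 - times)[:, None] * np.asarray(pos_a) + times[:, None] * np.asarray(pos_b)
        return [(positions[i], rots[i]) for i in range(n_frames)]

    # Hypothetical poses for two stadium cameras.
    pos_a, rot_a = np.array([0.0, 10.0, -40.0]), Rotation.from_euler("y", 0, degrees=True)
    pos_b, rot_b = np.array([30.0, 12.0, -35.0]), Rotation.from_euler("y", 25, degrees=True)
    sweep = interstitial_poses(pos_a, pos_b, rot_a, rot_b, n_frames=30)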
  • Additional FVV scenarios make advantageous use of the environment model by using both fixed and moving camera arrays to enable FVV functionality. In the case of moving cameras, these are used to provide close-ups of action on the field (i.e., by registering or positioning geometric proxies generated from the high definition bubbles with the environment model). To use moving cameras for FVV, individual video frames are continuously calibrated based on their orientation and optical focus, as discussed in the aforementioned U.S. Provisional Patent Application, the subject matter of which is incorporated herein by reference.
  • In general, the Dynamic High Definition Bubble Framework uses structure from motion (SFM) based approaches, as discussed in the aforementioned U.S. Provisional Patent Application, the subject matter of which is incorporated herein by reference, to calibrate the moving cameras or camera arrays based on high resolution static RGB images captured during the environment modeling stage. Finally, for close up FVV functionality, the Dynamic High Definition Bubble Framework relies upon the aforementioned articulation of the high-fidelity geometric proxies for the assets (players) using data from both fixed and moving camera arrays. These proxies are then positioned (i.e., registered or aligned) in the correct location on the field by determining where these assets are located relative to the environment model, as discussed in the aforementioned U.S. Provisional Patent Application, the subject matter of which is incorporated herein by reference.
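  • The per-frame calibration of a moving camera against the environment model can be sketched as a perspective-n-point (PnP) problem: given image features that have been matched to known 3D points in the stadium model (the matching itself is omitted here), solve for the camera's rotation and translation. The following Python/OpenCV sketch assumes such matches are already available; it illustrates the general calibration idea rather than the specific SFM-based procedure referenced above.

    # Sketch of per-frame calibration of a moving camera against the environment
    # model using a standard PnP solver (feature matching omitted).
    import numpy as np
    import cv2

    def calibrate_frame(points_3d, points_2d, camera_matrix, dist_coeffs=None):
        """points_3d: (N, 3) stadium model points; points_2d: (N, 2) image points."""
        if dist_coeffs is None:
            dist_coeffs = np.zeros(5)
        ok, rvec, tvec = cv2.solvePnP(
            points_3d.astype(np.float64),
            points_2d.astype(np.float64),
            camera_matrix.astype(np.float64),
            dist_coeffs,
            flags=cv2.SOLVEPNP_ITERATIVE,
        )
        if not ok:
            raise RuntimeError("PnP failed for this frame")
        rotation, _ = cv2.Rodrigues(rvec)   # 3x3 rotation of the camera pose
        return rotation, tvec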
  • 2.3.3 Fixed Arrays:
  • Fixed camera arrays are used in various scenarios associated with the football game, including intra-game focused footage as well as collateral footage. The defining characteristic of the fixed arrays is that the cameras do not move relative to the scene.
  • For example, consider the use of FVV functionality for non-game collateral footage—this could include interviews with players or announcers. Further, consider an announcer's stage having a medium density array of fixed RGB video cameras arranged in a 180-degree camera geometry pointing towards the stage for capturing 3D scene data of persons and assets on the stage. In this case, the views being considered generally include close-up views of humans, focused on the face, with limited need for full 6-DOF spatial navigation. Here, an IBR approach such as view interpolation, view morphing, or view warping would use a less explicit geometric proxy for the scene, which would therefore emphasize photorealism at the expense of viewpoint navigation.
  • One use of this FVV functionality is that viewers (or producers) can enable real-time smooth pans between the different announcers as they comment and react. Another application of these ideas is to change views between the announcers and a top down map of the play presented next to the announcers. Another example scenario includes zooming in on a specific cheerleader doing a cheer, assuming that the fixed array is positioned on the field in an appropriate location for such views. In these scenarios, FVV navigation would be primarily limited to synthetic viewpoints between real camera positions or the axis of the camera geometry. However, by using the available 3D scene data for rendering the image frames, the results would be almost indistinguishable from real camera viewpoints.
  • The intra-game functionality discussed below highlights various benefits and advantages to the user when using the FVV technology described herein. For example, consider two classes of fixed arrays: one sparse array positioned with whole or partial views of the field from high vantage points within the stadium, and another where denser fixed cameras are positioned around the actual field, such as in the end zone, to capture a high definition bubble of the end zone.
  • In the case of high vantage point sparse arrays, this video data can be used to enable both far and medium FVV viewpoint control both during the game and during playback. This is considered a sparse array because the relative volume of the stadium is rather large and the distance between sensors is high. In this case, image-based rendering methods such as billboards and articulated billboards may be used to provide two-dimensional representations of the players on the field. These billboards are created using segmentation approaches, which are enabled partially by the environment model. These billboards maintain the photorealistic look of the players, but they do not include the explicit geometry of the players (such as when the players are represented as high fidelity geometric proxies). However, it should be understood that, in general, navigation in the FVV is independent of the representation used.
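  • A billboard of this kind can be sketched as a segmented cut-out: the player is separated from a static background image of the empty stadium (which the environment model makes available) and the resulting silhouette is texture-mapped onto a 2D quad at the player's tracked position. The Python/OpenCV sketch below shows only the segmentation and alpha-mask step; the threshold value and function name are illustrative assumptions.

    # Sketch of producing a billboard cut-out by differencing a camera frame
    # against a static background image of the empty stadium, keeping the
    # player silhouette as an alpha mask.
    import numpy as np
    import cv2

    def billboard_cutout(frame_bgr, background_bgr, threshold=30):
        diff = cv2.absdiff(frame_bgr, background_bgr)
        gray = cv2.cvtColor(diff, cv2.COLOR_BGR2GRAY)
        _, mask = cv2.threshold(gray, threshold, 255, cv2.THRESH_BINARY)
        mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((5, 5), np.uint8))
        # Four-channel image: player pixels keep their color, everything else is
        # transparent, ready to texture-map onto a two-dimensional billboard quad.
        cutout = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2BGRA)
        cutout[:, :, 3] = mask
        return cutout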
  • Next, denser fixed arrays on the field such as around the end zone for capturing high definition bubbles allow for highly photorealistic viewpoints during both live action and replay. Similar to the announcer's stage discussed above, viewpoint navigation would be largely constrained by the camera axis using similar image-based-rendering methods described for the announcer's stage. For the most part, these types of viewpoints are specifically enabled when camera density is at an appropriate level and therefore are not generally enabled for all locations within the stadium. In other words, dense camera arrays are used for capturing sub-regions of the overall stadium as high definition bubbles for inclusion in the FVV. In general, these methods are unsuitable for medium and sparse configurations of sensors.
  • 2.3.4 Moving Arrays:
  • Typical intra-game football coverage comes from moving cameras for both live action coverage and for replays. The preceding discussion regarding camera arrays generally focused on creating high fidelity geometric proxies of players and assets, how an environment model of the stadium can be leveraged to enhance the FVV, and the use of intra-game fixed camera arrays in both sparse and dense configurations. The Dynamic High Definition Bubble Framework ties these elements together with sparse moving camera arrays to enable additional FVV functionality: medium shots using billboards, and close-up shots that leverage full 6-DOF spatial navigation with high fidelity geometric proxies of players or other assets or persons, all captured using conventional game cameras and camera operators. In other words, moving camera arrays are used to capture high definition bubbles used in generating FVV's.
  • Moving cameras in the array are continuously calibrated using SFM approaches leveraging the environment model. The optical zoom functionality of these moving cameras is also used to capture image data within high definition bubbles, using methods that include using prior frames to help further refine or identify a zoomed-in camera geometry. Once the individual frames of the moving cameras have been registered to the geometry of the environment model (i.e., correctly positioned within the stadium), additional image-based-rendering methods are enabled for different FVV scenarios based on the contributing camera geometries, including RGB articulated geometric proxies with maximal spatial navigation and billboard methods that emphasize photorealism with less spatial navigation.
  • For example, to enable close up replays with full 6-DOF viewpoint control during playback, the Dynamic High Definition Bubble Framework uses image data from the asset arrays, fixed arrays, and moving arrays. First, the relative position of the players is tracked on the field using one or more fixed arrays. In this way, the approximate location of any player on the field is known. This allows the Dynamic High Definition Bubble Framework to determine which players are in a zoomed in moving camera field of view. Next, based on the identification of the players in the zoomed in fields of view, the Dynamic High Definition Bubble Framework selects the appropriate high-fidelity geometric proxies for each player that were created earlier using the asset arrays.
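  • The determination of which players fall inside a zoomed-in moving camera's field of view can be sketched as a simple angular test against the camera's viewing cone, using the tracked field positions mentioned above. The following Python sketch is an illustration only; the player names, positions, and camera values are made up for the example.

    # Sketch of deciding which tracked players lie inside a zoomed moving
    # camera's field of view, so that the matching high-fidelity asset proxies
    # can be selected for articulation.
    import numpy as np

    def players_in_view(player_positions, cam_pos, cam_dir, half_fov_deg):
        """player_positions: dict of name -> (x, y, z) field position;
        cam_dir: unit viewing direction; half_fov_deg: half the zoomed field of view."""
        cos_limit = np.cos(np.radians(half_fov_deg))
        visible = []
        for name, pos in player_positions.items():
            to_player = np.asarray(pos, dtype=float) - np.asarray(cam_pos, dtype=float)
            to_player /= np.linalg.norm(to_player)
            if np.dot(to_player, cam_dir) >= cos_limit:
                visible.append(name)
        return visible

    players = {"QB": (20.0, 0.0, 26.5), "WR": (45.0, 0.0, 10.0)}
    cam_dir = np.array([1.0, -0.05, 0.0])
    cam_dir /= np.linalg.norm(cam_dir)
    print(players_in_view(players, cam_pos=(0.0, 8.0, 26.5), cam_dir=cam_dir, half_fov_deg=20.0))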
  • Finally, using a kinematic model for known human motion as well as conventional object recognition techniques applied to RGB video (from both fixed and moving cameras), the Dynamic High Definition Bubble Framework determines the spatial orientation of specific players on the field and articulates their geometric proxies as realistically as possible. Note that this also helps in filling in occluded areas (using various hole-filling techniques) when there are insufficient numbers or placements of cameras to capture a view. When the geometric proxies are mapped to their correct location on the field in both space and time, the Dynamic High Definition Bubble Framework then derives a full 6-DOF FVV replay experience for the user. In this way, users or clients can literally view a play from any potential position including close-up shots as well as intra-field camera positions. Advantageously, the net effect here is to enable interactive replays similar to what is possible with various Xbox® football games such as the “Madden NFL” series of electronic games by Electronic Arts Inc., although with real data.
  • Finally, multiple moving cameras focused on the same physical location of the field can also enable medium and close up views that use IBR methods with less explicit geometry such as billboard methodologies. These cameras can be combined with data from both the environment model as well as the fixed arrays to create additional FVV viewpoints within the stadium.
  • 3.0 Operational Summary:
  • The processes described above with respect to FIG. 1 through FIG. 3 and in further view of the detailed description provided above in Sections 1 and 2 are illustrated by the general operational flow diagram of FIG. 4. In particular, FIG. 4 provides an exemplary operational flow diagram that summarizes the operation of some of the various embodiments of the Dynamic High Definition Bubble Framework. Note that FIG. 4 is not intended to be an exhaustive representation of all of the various embodiments of the Dynamic High Definition Bubble Framework described herein, and that the embodiments represented in FIG. 4 are provided only for purposes of explanation.
  • Further, it should be noted that any boxes and interconnections between boxes that are represented by broken or dashed lines in FIG. 4 represent optional or alternate embodiments of the Dynamic High Definition Bubble Framework described herein, and that any or all of these optional or alternate embodiments, as described below, may be used in combination with other alternate embodiments that are described throughout this document.
  • In general, as illustrated by FIG. 4, the Dynamic High Definition Bubble Framework begins operation by capturing (410) 3D image data for an overall viewing area and one or more high definition bubbles within the overall viewing area. The Dynamic High Definition Bubble Framework then uses the captured data to generate (420) one or more 3D geometric proxies or models for use in generating a Free Viewpoint Video (FVV). For each FVV, a view frustum for an initial or user selected virtual navigation viewpoint is then selected (430). The Dynamic High Definition Bubble Framework then selects (440) an appropriate level of detail for regions in the view frustum based on distance from the viewpoint. Further, as discussed herein, the Dynamic High Definition Bubble Framework uses higher fidelity geometric proxies for regions corresponding to high definition bubbles and lower fidelity geometric proxies for other regions of the overall viewing area.
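  • A minimal sketch of the level-of-detail choice in step 440 follows, assuming each region records whether it lies inside a high definition bubble and carries handles to both a high- and a low-fidelity proxy. The distance threshold, the region dictionaries, and the function name are all hypothetical; they serve only to illustrate how distance from the viewpoint and bubble membership might drive the selection.

    # Sketch of the level-of-detail choice: regions inside a high definition
    # bubble get their high-fidelity proxy when the viewpoint is close enough;
    # everything else falls back to the low-fidelity proxy of the overall scene.
    import math

    def choose_level_of_detail(regions, viewpoint, near_threshold=30.0):
        chosen = {}
        for region in regions:
            distance = math.dist(region["center"], viewpoint)
            if region["in_bubble"] and distance < near_threshold:
                chosen[region["id"]] = region["high_fidelity_proxy"]
            else:
                chosen[region["id"]] = region["low_fidelity_proxy"]
        return chosen

    regions = [
        {"id": "ball", "center": (20.0, 0.0, 26.0), "in_bubble": True,
         "high_fidelity_proxy": "ball_hi", "low_fidelity_proxy": "ball_lo"},
        {"id": "far_sideline", "center": (110.0, 0.0, 0.0), "in_bubble": False,
         "high_fidelity_proxy": None, "low_fidelity_proxy": "sideline_lo"},
    ]
    print(choose_level_of_detail(regions, viewpoint=(15.0, 2.0, 25.0)))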
  • The Dynamic High Definition Bubble Framework then provides (450) one or more clients with 3D geometric proxies corresponding to the view frustum, with those geometric proxies having a level of detail sufficient to render the scene (or other objects or people within the current viewpoint) from a viewing frustum corresponding to a user selected virtual navigation viewpoint. Given this data, the FVV is rendered or generated and presented to the user for viewing, with the user then navigating (460) the FVV by selecting zoom levels and virtual navigation viewpoints (e.g., pans, tilts, rotations, etc.), which are in turn used to select the view frustum for generating subsequent frames of the FVV.
  • 4.0 Exemplary Operating Environments:
  • The Dynamic High Definition Bubble Framework described herein is operational within numerous types of general purpose or special purpose computing system environments or configurations. FIG. 5 illustrates a simplified example of a general-purpose computer system on which various embodiments and elements of the Dynamic High Definition Bubble Framework, as described herein, may be implemented. It should be noted that any boxes that are represented by broken or dashed lines in FIG. 5 represent alternate embodiments of the simplified computing device, and that any or all of these alternate embodiments, as described below, may be used in combination with other alternate embodiments that are described throughout this document.
  • For example, FIG. 5 shows a general system diagram showing a simplified computing device such as computer 500. Such computing devices can typically be found in devices having at least some minimum computational capability, including, but not limited to, personal computers, server computers, hand-held computing devices, laptop or mobile computers, communications devices such as cell phones and PDA's, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, audio or video media players, etc.
  • To allow a device to implement the Dynamic High Definition Bubble Framework, the device should have a sufficient computational capability and system memory to enable basic computational operations. In particular, as illustrated by FIG. 5, the computational capability is generally illustrated by one or more processing unit(s) 510, and may also include one or more GPUs 515, either or both in communication with system memory 520. Note that the processing unit(s) 510 of the general computing device may be specialized microprocessors, such as a DSP, a VLIW, or other micro-controller, or can be conventional CPUs having one or more processing cores, including specialized GPU-based cores in a multi-core CPU.
  • In addition, the simplified computing device of FIG. 5 may also include other components, such as, for example, a communications interface 530. The simplified computing device of FIG. 5 may also include one or more conventional computer input devices 540 (e.g., pointing devices, keyboards, audio input devices, video input devices, haptic input devices, devices for receiving wired or wireless data transmissions, etc.). The simplified computing device of FIG. 5 may also include other optional components, such as, for example, one or more conventional computer output devices 550 (e.g., display device(s) 555, audio output devices, video output devices, devices for transmitting wired or wireless data transmissions, etc.). Note that typical communications interfaces 530, input devices 540, output devices 550, and storage devices 560 for general-purpose computers are well known to those skilled in the art, and will not be described in detail herein.
  • The simplified computing device of FIG. 5 may also include a variety of computer readable media. Computer readable media can be any available media that can be accessed by computer 500 via storage devices 560 and includes both volatile and nonvolatile media that is either removable 570 and/or non-removable 580, for storage of information such as computer-readable or computer-executable instructions, data structures, program modules, or other data. By way of example, and not limitation, computer readable media may comprise computer storage media and communication media. Computer storage media includes, but is not limited to, computer or machine readable media or storage devices such as DVD's, CD's, floppy disks, tape drives, hard drives, optical drives, solid state memory devices, RAM, ROM, EEPROM, flash memory or other memory technology, magnetic cassettes, magnetic tapes, magnetic disk storage, or other magnetic storage devices, or any other device which can be used to store the desired information and which can be accessed by one or more computing devices.
  • Storage of information such as computer-readable or computer-executable instructions, data structures, program modules, etc., can also be accomplished by using any of a variety of the aforementioned communication media to encode one or more modulated data signals or carrier waves, or other transport mechanisms or communications protocols, and includes any wired or wireless information delivery mechanism. Note that the terms “modulated data signal” or “carrier wave” generally refer to a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. For example, communication media includes wired media such as a wired network or direct-wired connection carrying one or more modulated data signals, and wireless media such as acoustic, RF, infrared, laser, and other wireless media for transmitting and/or receiving one or more modulated data signals or carrier waves. Combinations of any of the above should also be included within the scope of communication media.
  • Further, software, programs, and/or computer program products embodying some or all of the various embodiments of the Dynamic High Definition Bubble Framework described herein, or portions thereof, may be stored, received, transmitted, or read from any desired combination of computer or machine readable media or storage devices and communication media in the form of computer executable instructions or other data structures.
  • Finally, the Dynamic High Definition Bubble Framework described herein may be further described in the general context of computer-executable instructions, such as program modules, being executed by a computing device. Generally, program modules include routines, programs, objects, components, data structures, etc., that perform particular tasks or implement particular abstract data types. The embodiments described herein may also be practiced in distributed computing environments where tasks are performed by one or more remote processing devices, or within a cloud of one or more devices, that are linked through one or more communications networks. In a distributed computing environment, program modules may be located in both local and remote computer storage media including media storage devices. Still further, the aforementioned instructions may be implemented, in part or in whole, as hardware logic circuits, which may or may not include a processor.
  • The foregoing description of the Dynamic High Definition Bubble Framework has been presented for the purposes of illustration and description. It is not intended to be exhaustive or to limit the claimed subject matter to the precise form disclosed. Many modifications and variations are possible in light of the above teaching. Further, it should be noted that any or all of the aforementioned alternate embodiments may be used in any combination desired to form additional hybrid embodiments of the Dynamic High Definition Bubble Framework. It is intended that the scope of the invention be limited not by this detailed description, but rather by the claims appended hereto.

Claims (20)

What is claimed is:
1. A computer-implemented process for generating navigable free viewpoint video (FVV), comprising using a computer to perform process actions for:
generating a geometric proxy from 3D image data of an overall volumetric space;
generating one or more geometric proxies for each of one or more sub-regions of the overall volumetric space;
registering one or more of the geometric proxies of the sub-regions with the geometric proxy of the overall volumetric space; and
rendering a multi-resolution user-navigable FVV from the registered geometric proxies and the geometric proxy of the overall volumetric space, wherein portions of the FVV corresponding to the sub-regions are rendered with a higher resolution than other regions of the FVV.
2. The computer-implemented process of claim 1 wherein each sub-region is captured at a resolution greater than a resolution used to capture the overall volumetric space.
3. The computer-implemented process of claim 1 wherein one or more of the sub-regions are captured using one or more moving camera arrays.
4. The computer-implemented process of claim 1 wherein one or more of the sub-regions are captured using one or more fixed camera arrays.
5. The computer-implemented process of claim 1 wherein rendering the multi-resolution user-navigable FVV further comprises process actions for:
determining a current view frustum corresponding to a current client viewpoint for viewing the FVV; and
transmitting appropriate geometric proxies within the current view frustum to the client for local rendering of video frames of the FVV.
6. The computer-implemented process of claim 1 wherein one or more of the sub-regions move relative to the overall volumetric space during capture of the 3D image data for those sub-regions.
7. The computer-implemented process of claim 1 wherein one or more of the sub-regions overlap within the overall volumetric space.
8. A method for generating a navigable 3D representation of a volumetric space, comprising:
capturing 3D image data of an overall volumetric space and using this 3D image data to construct an environment model comprising a geometric proxy of the overall volumetric space;
capturing 3D image data for one or more sub-regions of the overall volumetric space and generating one or more geometric proxies of each sub-region;
registering one or more of the geometric proxies of each sub-region to the environment model;
determining a view frustum relative to the environment model; and
rendering frames of a multi-resolution user-navigable FVV from portions of the registered geometric proxies and environment model corresponding to the view frustum, wherein portions of the FVV corresponding to the sub-regions are rendered with a higher resolution than other regions of the FVV.
9. The method of claim 8 wherein the view frustum is determined from a current viewpoint of a client viewing the FVV, and wherein the rendering is performed by the client from portions of the registered geometric proxies and environment model corresponding to the view frustum transmitted to the client.
10. The method of claim 8 wherein zooming into portions of the FVV rendered with a higher resolution provides greater detail than when zooming into other regions of the FVV.
11. The method of claim 8 wherein each sub-region is captured at a resolution greater than a resolution used to capture the overall volumetric space.
12. The method of claim 8 wherein the sub-regions are captured using any combination of one or more moving camera arrays and one or more fixed camera arrays.
13. The method of claim 8 wherein one or more of the sub-regions move relative to the overall volumetric space during capture of the 3D image data for those sub-regions.
14. A computer-readable medium having computer executable instructions stored therein for generating a user navigable free viewpoint video (FVV), said instructions causing a computing device to execute a method comprising:
capturing 3D image data for an overall viewing area;
capturing 3D image data for one or more high definition bubbles within the overall viewing area;
generating a geometric proxy from the 3D image data of the overall viewing area;
generating one or more geometric proxies from the 3D image data of one or more of the high definition bubbles;
aligning one or more of the geometric proxies of the high definition bubbles with the geometric proxy of the overall viewing area; and
transmitting portions of any of the aligned geometric proxies corresponding to a current client viewpoint to a client for local client-based rendering of a multi-resolution user-navigable FVV, wherein portions of the FVV corresponding to the high definition bubbles are rendered with a higher resolution than other regions of the FVV.
15. The computer-readable medium of claim 14 wherein each high definition bubble is captured at a resolution greater than a resolution used to capture the overall viewing area.
16. The computer-readable medium of claim 14 wherein one or more of the high definition bubbles are captured using one or more moving camera arrays.
17. The computer-readable medium of claim 14 wherein one or more of the high definition bubbles are captured using one or more fixed camera arrays.
18. The computer-readable medium of claim 14 wherein rendering the multi-resolution user-navigable FVV further comprises:
determining a current view frustum corresponding to a current client viewpoint for viewing the FVV; and
using portions of the aligned geometric proxies within the current view frustum for local rendering of video frames of the FVV.
19. The computer-readable medium of claim 14 wherein one or more of the high definition bubbles move relative to the overall viewing area during capture of the 3D image data for those high definition bubbles.
20. The computer-readable medium of claim 14 wherein one or more of the high definition bubbles overlap within the overall viewing area.
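The claims above describe the pipeline in prose: build a geometric proxy of the overall space, build higher-resolution proxies for each high definition bubble, register or align them into a common frame, and then transmit only the proxies falling inside the client's current view frustum for local, multi-resolution rendering (claims 1, 5, 8, and 14). The following Python sketch is purely illustrative and is not part of the patent disclosure; all names (GeometricProxy, register_bubble, in_frustum, select_proxies_for_view) and the simple any-vertex-inside frustum test are assumptions chosen for brevity.

```python
from dataclasses import dataclass, field
from typing import List

import numpy as np


@dataclass
class GeometricProxy:
    """Geometry reconstructed from 3D image data (hypothetical container)."""
    vertices: np.ndarray                  # (N, 3) points in the proxy's local frame
    resolution: float                     # capture resolution; bubbles > environment
    to_world: np.ndarray = field(default_factory=lambda: np.eye(4))  # 4x4 rigid transform


def register_bubble(bubble: GeometricProxy, rigid_transform: np.ndarray) -> GeometricProxy:
    # Align a bubble proxy to the environment model's coordinate frame
    # (the "registering"/"aligning" actions of claims 1, 8, and 14).
    bubble.to_world = rigid_transform @ bubble.to_world
    return bubble


def in_frustum(proxy: GeometricProxy, frustum_planes: List[np.ndarray]) -> bool:
    # Crude visibility test: keep the proxy if any vertex lies inside every
    # frustum half-space (plane = [a, b, c, d], inside when ax + by + cz + d >= 0).
    pts = np.c_[proxy.vertices, np.ones(len(proxy.vertices))] @ proxy.to_world.T
    inside = np.ones(len(pts), dtype=bool)
    for plane in frustum_planes:
        inside &= (pts[:, :3] @ plane[:3] + plane[3]) >= 0.0
    return bool(inside.any())


def select_proxies_for_view(environment: GeometricProxy,
                            bubbles: List[GeometricProxy],
                            frustum_planes: List[np.ndarray]) -> List[GeometricProxy]:
    # Server-side step in the spirit of claims 5 and 17: send the environment model
    # plus only those high definition bubble proxies that fall within the current
    # view frustum, so the client can render the multi-resolution FVV locally.
    return [environment] + [b for b in bubbles if in_frustum(b, frustum_planes)]
```

A client receiving the selected proxies would then render video frames locally, drawing bubble geometry at its native (higher) capture resolution, so that zooming into a bubble reveals more detail than zooming into the surrounding environment model (claim 10).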
US13/598,747 2012-05-31 2012-08-30 High definition bubbles for rendering free viewpoint video Abandoned US20130321575A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US13/598,747 US20130321575A1 (en) 2012-05-31 2012-08-30 High definition bubbles for rendering free viewpoint video

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201261653983P 2012-05-31 2012-05-31
US13/598,747 US20130321575A1 (en) 2012-05-31 2012-08-30 High definition bubbles for rendering free viewpoint video

Publications (1)

Publication Number Publication Date
US20130321575A1 true US20130321575A1 (en) 2013-12-05

Family

ID=49669652

Family Applications (10)

Application Number Title Priority Date Filing Date
US13/566,877 Active 2034-02-16 US9846960B2 (en) 2012-05-31 2012-08-03 Automated camera array calibration
US13/588,917 Abandoned US20130321586A1 (en) 2012-05-31 2012-08-17 Cloud based free viewpoint video streaming
US13/598,536 Abandoned US20130321593A1 (en) 2012-05-31 2012-08-29 View frustum culling for free viewpoint video (fvv)
US13/598,747 Abandoned US20130321575A1 (en) 2012-05-31 2012-08-30 High definition bubbles for rendering free viewpoint video
US13/599,170 Abandoned US20130321396A1 (en) 2012-05-31 2012-08-30 Multi-input free viewpoint video processing pipeline
US13/599,678 Abandoned US20130321566A1 (en) 2012-05-31 2012-08-30 Audio source positioning using a camera
US13/599,436 Active 2034-05-03 US9251623B2 (en) 2012-05-31 2012-08-30 Glancing angle exclusion
US13/599,263 Active 2033-02-25 US8917270B2 (en) 2012-05-31 2012-08-30 Video generation using three-dimensional hulls
US13/614,852 Active 2033-10-29 US9256980B2 (en) 2012-05-31 2012-09-13 Interpolating oriented disks in 3D space for constructing high fidelity geometric proxies from point clouds
US13/790,158 Abandoned US20130321413A1 (en) 2012-05-31 2013-03-08 Video generation using convict hulls

Country Status (1)

Country Link
US (10) US9846960B2 (en)

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9191643B2 (en) 2013-04-15 2015-11-17 Microsoft Technology Licensing, Llc Mixing infrared and color component data point clouds
US20170013283A1 (en) * 2015-07-10 2017-01-12 Futurewei Technologies, Inc. Multi-view video streaming with fast and smooth view switch
US20180246631A1 (en) * 2017-02-28 2018-08-30 Canon Kabushiki Kaisha Image processing apparatus, image processing method, and storage medium
CN108605090A (en) * 2016-02-12 2018-09-28 三星电子株式会社 Method for supporting the VR contents in communication system to show
EP3388119A3 (en) * 2017-04-14 2018-11-28 Fujitsu Limited Method, apparatus, and non-transitory computer-readable storage medium for view point selection assistance in free viewpoint video generation
US20180356942A1 (en) * 2017-06-12 2018-12-13 Samsung Eletrônica da Amazônia Ltda. METHOD FOR DISPLAYING 360º MEDIA ON BUBBLES INTERFACE
US10510111B2 (en) 2013-10-25 2019-12-17 Appliance Computing III, Inc. Image-based rendering of real spaces
US20200005527A1 (en) * 2016-12-19 2020-01-02 Interdigital Ce Patent Holdings Method and apparatus for constructing lighting environment representations of 3d scenes
US10818077B2 (en) 2018-12-14 2020-10-27 Canon Kabushiki Kaisha Method, system and apparatus for controlling a virtual camera
US10951879B2 (en) 2017-12-04 2021-03-16 Canon Kabushiki Kaisha Method, system and apparatus for capture of image data for free viewpoint video
US11037365B2 (en) 2019-03-07 2021-06-15 Alibaba Group Holding Limited Method, apparatus, medium, terminal, and device for processing multi-angle free-perspective data
US11508125B1 (en) * 2014-05-28 2022-11-22 Lucasfilm Entertainment Company Ltd. Navigating a virtual environment of a media content item
US11776205B2 (en) * 2020-06-09 2023-10-03 Ptc Inc. Determination of interactions with predefined volumes of space based on automated analysis of volumetric video
WO2024006997A1 (en) * 2022-07-01 2024-01-04 Google Llc Three-dimensional video highlight from a camera source
EP3425592B1 (en) * 2017-07-06 2024-07-24 Canon Kabushiki Kaisha Image processing apparatus, image processing method, and program, for generating a virtual viewpoint image

Families Citing this family (237)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1934945A4 (en) * 2005-10-11 2016-01-20 Apple Inc Method and system for object reconstruction
US11792538B2 (en) 2008-05-20 2023-10-17 Adeia Imaging Llc Capturing and processing of images including occlusions focused on an image sensor by a lens stack array
US8866920B2 (en) 2008-05-20 2014-10-21 Pelican Imaging Corporation Capturing and processing of images using monolithic camera array with heterogeneous imagers
US9892546B2 (en) * 2010-06-30 2018-02-13 Primal Space Systems, Inc. Pursuit path camera model method and system
US20150373153A1 (en) 2010-06-30 2015-12-24 Primal Space Systems, Inc. System and method to reduce bandwidth requirement for visibility event packet streaming using a predicted maximal view frustum and predicted maximal viewpoint extent, each computed at runtime
US8878950B2 (en) 2010-12-14 2014-11-04 Pelican Imaging Corporation Systems and methods for synthesizing high resolution images using super-resolution processes
US8542933B2 (en) 2011-09-28 2013-09-24 Pelican Imaging Corporation Systems and methods for decoding light field image files
US9001960B2 (en) * 2012-01-04 2015-04-07 General Electric Company Method and apparatus for reducing noise-related imaging artifacts
US9300841B2 (en) * 2012-06-25 2016-03-29 Yoldas Askan Method of generating a smooth image from point cloud data
US8619082B1 (en) 2012-08-21 2013-12-31 Pelican Imaging Corporation Systems and methods for parallax detection and correction in images captured using array cameras that contain occlusions using subsets of images to perform depth estimation
US10079968B2 (en) 2012-12-01 2018-09-18 Qualcomm Incorporated Camera having additional functionality based on connectivity with a host device
US9519968B2 (en) * 2012-12-13 2016-12-13 Hewlett-Packard Development Company, L.P. Calibrating visual sensors using homography operators
US9224227B2 (en) * 2012-12-21 2015-12-29 Nvidia Corporation Tile shader for screen space, a method of rendering and a graphics processing unit employing the tile shader
US8866912B2 (en) 2013-03-10 2014-10-21 Pelican Imaging Corporation System and methods for calibration of an array camera using a single captured image
US9144905B1 (en) * 2013-03-13 2015-09-29 Hrl Laboratories, Llc Device and method to identify functional parts of tools for robotic manipulation
US9578259B2 (en) 2013-03-14 2017-02-21 Fotonation Cayman Limited Systems and methods for reducing motion blur in images or video in ultra low light with array cameras
US9445003B1 (en) * 2013-03-15 2016-09-13 Pelican Imaging Corporation Systems and methods for synthesizing high resolution images using image deconvolution based on motion and depth information
CN105339865B (en) * 2013-04-04 2018-05-22 索尼公司 Display control unit, display control method and computer-readable medium
US10262462B2 (en) 2014-04-18 2019-04-16 Magic Leap, Inc. Systems and methods for augmented and virtual reality
US9208609B2 (en) * 2013-07-01 2015-12-08 Mitsubishi Electric Research Laboratories, Inc. Method for fitting primitive shapes to 3D point clouds using distance fields
CN105308953A (en) * 2013-07-19 2016-02-03 谷歌技术控股有限责任公司 Asymmetric sensor array for capturing images
US10140751B2 (en) * 2013-08-08 2018-11-27 Imagination Technologies Limited Normal offset smoothing
CN104424655A (en) * 2013-09-10 2015-03-18 鸿富锦精密工业(深圳)有限公司 System and method for reconstructing point cloud curved surface
JP6476658B2 (en) * 2013-09-11 2019-03-06 ソニー株式会社 Image processing apparatus and method
US9286718B2 (en) * 2013-09-27 2016-03-15 Ortery Technologies, Inc. Method using 3D geometry data for virtual reality image presentation and control in 3D space
US10591969B2 (en) 2013-10-25 2020-03-17 Google Technology Holdings LLC Sensor-based near-field communication authentication
US9888333B2 (en) * 2013-11-11 2018-02-06 Google Technology Holdings LLC Three-dimensional audio rendering techniques
WO2015074078A1 (en) 2013-11-18 2015-05-21 Pelican Imaging Corporation Estimating depth from projected texture using camera arrays
US9456134B2 (en) 2013-11-26 2016-09-27 Pelican Imaging Corporation Array camera configurations incorporating constituent array cameras and constituent cameras
EP2881918B1 (en) * 2013-12-06 2018-02-07 My Virtual Reality Software AS Method for visualizing three-dimensional data
US9233469B2 (en) * 2014-02-13 2016-01-12 GM Global Technology Operations LLC Robotic system with 3D box location functionality
US9530226B2 (en) * 2014-02-18 2016-12-27 Par Technology Corporation Systems and methods for optimizing N dimensional volume data for transmission
US10241616B2 (en) 2014-02-28 2019-03-26 Hewlett-Packard Development Company, L.P. Calibration of sensors and projector
US9396586B2 (en) 2014-03-14 2016-07-19 Matterport, Inc. Processing and/or transmitting 3D data
CN104089628B (en) * 2014-06-30 2017-02-08 中国科学院光电研究院 Self-adaption geometric calibration method of light field camera
US11051000B2 (en) 2014-07-14 2021-06-29 Mitsubishi Electric Research Laboratories, Inc. Method for calibrating cameras with non-overlapping views
US10169909B2 (en) * 2014-08-07 2019-01-01 Pixar Generating a volumetric projection for an object
US10257494B2 (en) * 2014-09-22 2019-04-09 Samsung Electronics Co., Ltd. Reconstruction of three-dimensional video
US11205305B2 (en) 2014-09-22 2021-12-21 Samsung Electronics Company, Ltd. Presentation of three-dimensional video
WO2016054089A1 (en) 2014-09-29 2016-04-07 Pelican Imaging Corporation Systems and methods for dynamic calibration of array cameras
US9600892B2 (en) * 2014-11-06 2017-03-21 Symbol Technologies, Llc Non-parametric method of and system for estimating dimensions of objects of arbitrary shape
EP3221851A1 (en) * 2014-11-20 2017-09-27 Cappasity Inc. Systems and methods for 3d capture of objects using multiple range cameras and multiple rgb cameras
US9396554B2 (en) 2014-12-05 2016-07-19 Symbol Technologies, Llc Apparatus for and method of estimating dimensions of an object associated with a code in automatic response to reading the code
DE102014118989A1 (en) * 2014-12-18 2016-06-23 Connaught Electronics Ltd. Method for calibrating a camera system, camera system and motor vehicle
US11019330B2 (en) * 2015-01-19 2021-05-25 Aquifi, Inc. Multiple camera system with auto recalibration
US9661312B2 (en) * 2015-01-22 2017-05-23 Microsoft Technology Licensing, Llc Synthesizing second eye viewport using interleaving
US9686520B2 (en) 2015-01-22 2017-06-20 Microsoft Technology Licensing, Llc Reconstructing viewport upon user viewpoint misprediction
WO2016126816A2 (en) * 2015-02-03 2016-08-11 Dolby Laboratories Licensing Corporation Post-conference playback system having higher perceived quality than originally heard in the conference
EP3266199B1 (en) 2015-03-01 2019-09-18 NEXTVR Inc. Methods and apparatus for supporting content generation, transmission and/or playback
EP3070942B1 (en) * 2015-03-17 2023-11-22 InterDigital CE Patent Holdings Method and apparatus for displaying light field video data
US10878278B1 (en) * 2015-05-16 2020-12-29 Sturfee, Inc. Geo-localization based on remotely sensed visual features
JP6975642B2 (en) * 2015-06-11 2021-12-01 コンティ テミック マイクロエレクトロニック ゲゼルシャフト ミット ベシュレンクテル ハフツングConti Temic microelectronic GmbH How to create a virtual image of the vehicle's perimeter
US9460513B1 (en) 2015-06-17 2016-10-04 Mitsubishi Electric Research Laboratories, Inc. Method for reconstructing a 3D scene as a 3D model using images acquired by 3D sensors and omnidirectional cameras
US10554713B2 (en) 2015-06-19 2020-02-04 Microsoft Technology Licensing, Llc Low latency application streaming using temporal frame transformation
KR101835434B1 (en) * 2015-07-08 2018-03-09 고려대학교 산학협력단 Method and Apparatus for generating a protection image, Method for mapping between image pixel and depth value
EP3335418A1 (en) 2015-08-14 2018-06-20 PCMS Holdings, Inc. System and method for augmented reality multi-view telepresence
GB2543776B (en) * 2015-10-27 2019-02-06 Imagination Tech Ltd Systems and methods for processing images of objects
US11562502B2 (en) * 2015-11-09 2023-01-24 Cognex Corporation System and method for calibrating a plurality of 3D sensors with respect to a motion conveyance
US10757394B1 (en) * 2015-11-09 2020-08-25 Cognex Corporation System and method for calibrating a plurality of 3D sensors with respect to a motion conveyance
US10812778B1 (en) 2015-11-09 2020-10-20 Cognex Corporation System and method for calibrating one or more 3D sensors mounted on a moving manipulator
US20180374239A1 (en) * 2015-11-09 2018-12-27 Cognex Corporation System and method for field calibration of a vision system imaging two opposite sides of a calibration object
WO2017100487A1 (en) * 2015-12-11 2017-06-15 Jingyi Yu Method and system for image-based image rendering using a multi-camera and depth camera array
US10352689B2 (en) 2016-01-28 2019-07-16 Symbol Technologies, Llc Methods and systems for high precision locationing with depth values
US10145955B2 (en) 2016-02-04 2018-12-04 Symbol Technologies, Llc Methods and systems for processing point-cloud data with a line scanner
CN107097698B (en) * 2016-02-22 2021-10-01 福特环球技术公司 Inflatable airbag system for a vehicle seat, seat assembly and method for adjusting the same
US11567201B2 (en) 2016-03-11 2023-01-31 Kaarta, Inc. Laser scanner with real-time, online ego-motion estimation
WO2017155970A1 (en) 2016-03-11 2017-09-14 Kaarta, Inc. Laser scanner with real-time, online ego-motion estimation
US11573325B2 (en) 2016-03-11 2023-02-07 Kaarta, Inc. Systems and methods for improvements in scanning and mapping
US10989542B2 (en) 2016-03-11 2021-04-27 Kaarta, Inc. Aligning measured signal data with slam localization data and uses thereof
US10721451B2 (en) 2016-03-23 2020-07-21 Symbol Technologies, Llc Arrangement for, and method of, loading freight into a shipping container
CA2961921C (en) 2016-03-29 2020-05-12 Institut National D'optique Camera calibration method using a calibration target
WO2017172528A1 (en) 2016-04-01 2017-10-05 Pcms Holdings, Inc. Apparatus and method for supporting interactive augmented reality functionalities
US9805240B1 (en) 2016-04-18 2017-10-31 Symbol Technologies, Llc Barcode scanning and dimensioning
CN107341768B (en) * 2016-04-29 2022-03-11 微软技术许可有限责任公司 Grid noise reduction
WO2017197114A1 (en) 2016-05-11 2017-11-16 Affera, Inc. Anatomical model generation
EP3455756A2 (en) 2016-05-12 2019-03-20 Affera, Inc. Anatomical model controlling
EP3264759A1 (en) 2016-06-30 2018-01-03 Thomson Licensing An apparatus and a method for generating data representative of a pixel beam
US10192345B2 (en) * 2016-07-19 2019-01-29 Qualcomm Incorporated Systems and methods for improved surface normal estimation
US11082471B2 (en) * 2016-07-27 2021-08-03 R-Stor Inc. Method and apparatus for bonding communication technologies
US10574909B2 (en) 2016-08-08 2020-02-25 Microsoft Technology Licensing, Llc Hybrid imaging sensor for structured light object capture
US10776661B2 (en) 2016-08-19 2020-09-15 Symbol Technologies, Llc Methods, systems and apparatus for segmenting and dimensioning objects
US9980078B2 (en) 2016-10-14 2018-05-22 Nokia Technologies Oy Audio object modification in free-viewpoint rendering
US10229533B2 (en) * 2016-11-03 2019-03-12 Mitsubishi Electric Research Laboratories, Inc. Methods and systems for fast resampling method and apparatus for point cloud data
US11042161B2 (en) 2016-11-16 2021-06-22 Symbol Technologies, Llc Navigation control method and apparatus in a mobile automation system
US10451405B2 (en) 2016-11-22 2019-10-22 Symbol Technologies, Llc Dimensioning system for, and method of, dimensioning freight in motion along an unconstrained path in a venue
WO2018100928A1 (en) 2016-11-30 2018-06-07 キヤノン株式会社 Image processing device and method
JP6948171B2 (en) * 2016-11-30 2021-10-13 キヤノン株式会社 Image processing equipment and image processing methods, programs
US10354411B2 (en) 2016-12-20 2019-07-16 Symbol Technologies, Llc Methods, systems and apparatus for segmenting objects
WO2018123801A1 (en) * 2016-12-28 2018-07-05 パナソニック インテレクチュアル プロパティ コーポレーション オブ アメリカ Three-dimensional model distribution method, three-dimensional model receiving method, three-dimensional model distribution device, and three-dimensional model receiving device
US11096004B2 (en) * 2017-01-23 2021-08-17 Nokia Technologies Oy Spatial audio rendering point extension
US11665308B2 (en) 2017-01-31 2023-05-30 Tetavi, Ltd. System and method for rendering free viewpoint video for sport applications
WO2018147329A1 (en) * 2017-02-10 2018-08-16 パナソニック インテレクチュアル プロパティ コーポレーション オブ アメリカ Free-viewpoint image generation method and free-viewpoint image generation system
US10531219B2 (en) 2017-03-20 2020-01-07 Nokia Technologies Oy Smooth rendering of overlapping audio-object interactions
WO2018172614A1 (en) 2017-03-22 2018-09-27 Nokia Technologies Oy A method and an apparatus and a computer program product for adaptive streaming
US10726574B2 (en) * 2017-04-11 2020-07-28 Dolby Laboratories Licensing Corporation Passive multi-wearable-devices tracking
US10939038B2 (en) * 2017-04-24 2021-03-02 Intel Corporation Object pre-encoding for 360-degree view for optimal quality and latency
US10726273B2 (en) 2017-05-01 2020-07-28 Symbol Technologies, Llc Method and apparatus for shelf feature and object placement detection from shelf images
US10663590B2 (en) 2017-05-01 2020-05-26 Symbol Technologies, Llc Device and method for merging lidar data
US11093896B2 (en) 2017-05-01 2021-08-17 Symbol Technologies, Llc Product status detection system
AU2018261257B2 (en) 2017-05-01 2020-10-08 Symbol Technologies, Llc Method and apparatus for object status detection
US11449059B2 (en) 2017-05-01 2022-09-20 Symbol Technologies, Llc Obstacle detection for a mobile automation apparatus
US11367092B2 (en) 2017-05-01 2022-06-21 Symbol Technologies, Llc Method and apparatus for extracting and processing price text from an image set
US10949798B2 (en) 2017-05-01 2021-03-16 Symbol Technologies, Llc Multimodal localization and mapping for a mobile automation apparatus
US10591918B2 (en) 2017-05-01 2020-03-17 Symbol Technologies, Llc Fixed segmented lattice planning for a mobile automation apparatus
WO2018201423A1 (en) 2017-05-05 2018-11-08 Symbol Technologies, Llc Method and apparatus for detecting and interpreting price label text
US11074036B2 (en) 2017-05-05 2021-07-27 Nokia Technologies Oy Metadata-free audio-object interactions
CN108881784B (en) * 2017-05-12 2020-07-03 腾讯科技(深圳)有限公司 Virtual scene implementation method and device, terminal and server
US10165386B2 (en) 2017-05-16 2018-12-25 Nokia Technologies Oy VR audio superzoom
US10154176B1 (en) * 2017-05-30 2018-12-11 Intel Corporation Calibrating depth cameras using natural objects with expected shapes
CN110476186B (en) * 2017-06-07 2020-12-29 谷歌有限责任公司 High speed high fidelity face tracking
US10841537B2 (en) 2017-06-09 2020-11-17 Pcms Holdings, Inc. Spatially faithful telepresence supporting varying geometries and moving users
JP7205471B2 (en) 2017-06-29 2023-01-17 ソニーグループ株式会社 Image processing device and image processing method
US11049218B2 (en) 2017-08-11 2021-06-29 Samsung Electronics Company, Ltd. Seamless image stitching
WO2019034808A1 (en) 2017-08-15 2019-02-21 Nokia Technologies Oy Encoding and decoding of volumetric video
US11405643B2 (en) 2017-08-15 2022-08-02 Nokia Technologies Oy Sequential encoding and decoding of volumetric video
US11290758B2 (en) * 2017-08-30 2022-03-29 Samsung Electronics Co., Ltd. Method and apparatus of point-cloud streaming
JP6409107B1 (en) * 2017-09-06 2018-10-17 キヤノン株式会社 Information processing apparatus, information processing method, and program
US10572763B2 (en) 2017-09-07 2020-02-25 Symbol Technologies, Llc Method and apparatus for support surface edge detection
US10521914B2 (en) 2017-09-07 2019-12-31 Symbol Technologies, Llc Multi-sensor object recognition system and method
US10861196B2 (en) * 2017-09-14 2020-12-08 Apple Inc. Point cloud compression
US11818401B2 (en) 2017-09-14 2023-11-14 Apple Inc. Point cloud geometry compression using octrees and binary arithmetic encoding with adaptive look-up tables
US10897269B2 (en) 2017-09-14 2021-01-19 Apple Inc. Hierarchical point cloud compression
US11113845B2 (en) 2017-09-18 2021-09-07 Apple Inc. Point cloud compression using non-cubic projections and masks
US10909725B2 (en) 2017-09-18 2021-02-02 Apple Inc. Point cloud compression
JP6433559B1 (en) * 2017-09-19 2018-12-05 キヤノン株式会社 Providing device, providing method, and program
CN107610182B (en) * 2017-09-22 2018-09-11 哈尔滨工业大学 A kind of scaling method at light-field camera microlens array center
JP6425780B1 (en) 2017-09-22 2018-11-21 キヤノン株式会社 Image processing system, image processing apparatus, image processing method and program
US11395087B2 (en) 2017-09-29 2022-07-19 Nokia Technologies Oy Level-based audio-object interactions
EP3467777A1 (en) * 2017-10-06 2019-04-10 Thomson Licensing A method and apparatus for encoding/decoding the colors of a point cloud representing a 3d object
WO2019099605A1 (en) 2017-11-17 2019-05-23 Kaarta, Inc. Methods and systems for geo-referencing mapping systems
US10607373B2 (en) 2017-11-22 2020-03-31 Apple Inc. Point cloud compression with closed-loop color conversion
JP6934957B2 (en) * 2017-12-19 2021-09-15 株式会社ソニー・インタラクティブエンタテインメント Image generator, reference image data generator, image generation method, and reference image data generation method
KR102334070B1 (en) 2018-01-18 2021-12-03 삼성전자주식회사 Electric apparatus and method for control thereof
US11158124B2 (en) 2018-01-30 2021-10-26 Gaia3D, Inc. Method of providing 3D GIS web service
US10417806B2 (en) * 2018-02-15 2019-09-17 JJK Holdings, LLC Dynamic local temporal-consistent textured mesh compression
JP2019144958A (en) * 2018-02-22 2019-08-29 キヤノン株式会社 Image processing device, image processing method, and program
WO2019165194A1 (en) * 2018-02-23 2019-08-29 Kaarta, Inc. Methods and systems for processing and colorizing point clouds and meshes
US10542368B2 (en) 2018-03-27 2020-01-21 Nokia Technologies Oy Audio content modification for playback audio
WO2019195270A1 (en) 2018-04-03 2019-10-10 Kaarta, Inc. Methods and systems for real or near real-time point cloud map data confidence evaluation
WO2019193696A1 (en) * 2018-04-04 2019-10-10 株式会社ソニー・インタラクティブエンタテインメント Reference image generation device, display image generation device, reference image generation method, and display image generation method
US10832436B2 (en) 2018-04-05 2020-11-10 Symbol Technologies, Llc Method, system and apparatus for recovering label positions
US10740911B2 (en) 2018-04-05 2020-08-11 Symbol Technologies, Llc Method, system and apparatus for correcting translucency artifacts in data representing a support structure
US10809078B2 (en) 2018-04-05 2020-10-20 Symbol Technologies, Llc Method, system and apparatus for dynamic path generation
US11327504B2 (en) 2018-04-05 2022-05-10 Symbol Technologies, Llc Method, system and apparatus for mobile automation apparatus localization
US10823572B2 (en) 2018-04-05 2020-11-03 Symbol Technologies, Llc Method, system and apparatus for generating navigational data
US10939129B2 (en) 2018-04-10 2021-03-02 Apple Inc. Point cloud compression
US11010928B2 (en) 2018-04-10 2021-05-18 Apple Inc. Adaptive distance based point cloud compression
US10909727B2 (en) 2018-04-10 2021-02-02 Apple Inc. Hierarchical point cloud compression with smoothing
US10909726B2 (en) 2018-04-10 2021-02-02 Apple Inc. Point cloud compression
US11017566B1 (en) 2018-07-02 2021-05-25 Apple Inc. Point cloud compression with adaptive filtering
WO2020009826A1 (en) 2018-07-05 2020-01-09 Kaarta, Inc. Methods and systems for auto-leveling of point clouds and 3d models
US11202098B2 (en) 2018-07-05 2021-12-14 Apple Inc. Point cloud compression with multi-resolution video encoding
US11012713B2 (en) 2018-07-12 2021-05-18 Apple Inc. Bit stream structure for compressed point cloud data
US11367224B2 (en) 2018-10-02 2022-06-21 Apple Inc. Occupancy map block-to-patch information compression
US11506483B2 (en) 2018-10-05 2022-11-22 Zebra Technologies Corporation Method, system and apparatus for support structure depth determination
US11010920B2 (en) 2018-10-05 2021-05-18 Zebra Technologies Corporation Method, system and apparatus for object detection in point clouds
US11430155B2 (en) 2018-10-05 2022-08-30 Apple Inc. Quantized depths for projection point cloud compression
US10972835B2 (en) * 2018-11-01 2021-04-06 Sennheiser Electronic Gmbh & Co. Kg Conference system with a microphone array system and a method of speech acquisition in a conference system
US11090811B2 (en) 2018-11-13 2021-08-17 Zebra Technologies Corporation Method and apparatus for labeling of support structures
US11003188B2 (en) 2018-11-13 2021-05-11 Zebra Technologies Corporation Method, system and apparatus for obstacle handling in navigational path generation
CN109661816A (en) * 2018-11-21 2019-04-19 京东方科技集团股份有限公司 The method and display device of panoramic picture are generated and shown based on rendering engine
US11079240B2 (en) 2018-12-07 2021-08-03 Zebra Technologies Corporation Method, system and apparatus for adaptive particle filter localization
US11416000B2 (en) 2018-12-07 2022-08-16 Zebra Technologies Corporation Method and apparatus for navigational ray tracing
CN109618122A (en) * 2018-12-07 2019-04-12 合肥万户网络技术有限公司 A kind of virtual office conference system
US11100303B2 (en) 2018-12-10 2021-08-24 Zebra Technologies Corporation Method, system and apparatus for auxiliary label detection and association
US11015938B2 (en) 2018-12-12 2021-05-25 Zebra Technologies Corporation Method, system and apparatus for navigational assistance
US11423572B2 (en) 2018-12-12 2022-08-23 Analog Devices, Inc. Built-in calibration of time-of-flight depth imaging systems
WO2020122675A1 (en) * 2018-12-13 2020-06-18 삼성전자주식회사 Method, device, and computer-readable recording medium for compressing 3d mesh content
US10731970B2 (en) 2018-12-13 2020-08-04 Zebra Technologies Corporation Method, system and apparatus for support structure detection
CA3028708A1 (en) 2018-12-28 2020-06-28 Zih Corp. Method, system and apparatus for dynamic loop closure in mapping trajectories
JP7211835B2 (en) * 2019-02-04 2023-01-24 i-PRO株式会社 IMAGING SYSTEM AND SYNCHRONIZATION CONTROL METHOD
WO2020164044A1 (en) * 2019-02-14 2020-08-20 北京大学深圳研究生院 Free-viewpoint image synthesis method, device, and apparatus
JP6647433B1 (en) * 2019-02-19 2020-02-14 株式会社メディア工房 Point cloud data communication system, point cloud data transmission device, and point cloud data transmission method
US10797090B2 (en) 2019-02-27 2020-10-06 Semiconductor Components Industries, Llc Image sensor with near-infrared and visible light phase detection pixels
US11057564B2 (en) 2019-03-28 2021-07-06 Apple Inc. Multiple layer flexure for supporting a moving image sensor
JP7479793B2 (en) * 2019-04-11 2024-05-09 キヤノン株式会社 Image processing device, system for generating virtual viewpoint video, and method and program for controlling the image processing device
US11402846B2 (en) 2019-06-03 2022-08-02 Zebra Technologies Corporation Method, system and apparatus for mitigating data capture light leakage
US11200677B2 (en) 2019-06-03 2021-12-14 Zebra Technologies Corporation Method, system and apparatus for shelf edge detection
US11341663B2 (en) 2019-06-03 2022-05-24 Zebra Technologies Corporation Method, system and apparatus for detecting support structure obstructions
US11080566B2 (en) 2019-06-03 2021-08-03 Zebra Technologies Corporation Method, system and apparatus for gap detection in support structures with peg regions
US11960286B2 (en) 2019-06-03 2024-04-16 Zebra Technologies Corporation Method, system and apparatus for dynamic task sequencing
US11151743B2 (en) 2019-06-03 2021-10-19 Zebra Technologies Corporation Method, system and apparatus for end of aisle detection
US11662739B2 (en) 2019-06-03 2023-05-30 Zebra Technologies Corporation Method, system and apparatus for adaptive ceiling-based localization
US11711544B2 (en) 2019-07-02 2023-07-25 Apple Inc. Point cloud compression with supplemental information messages
CN110624220B (en) * 2019-09-04 2021-05-04 福建师范大学 Method for obtaining optimal standing long jump technical template
MX2022003020A (en) 2019-09-17 2022-06-14 Boston Polarimetrics Inc Systems and methods for surface modeling using polarization cues.
US11562507B2 (en) 2019-09-27 2023-01-24 Apple Inc. Point cloud compression using video encoding with time consistent patches
US11627314B2 (en) 2019-09-27 2023-04-11 Apple Inc. Video-based point cloud compression with non-normative smoothing
EP4036863A4 (en) 2019-09-30 2023-02-08 Guangdong Oppo Mobile Telecommunications Corp., Ltd. Human body model reconstruction method and reconstruction system, and storage medium
US11538196B2 (en) 2019-10-02 2022-12-27 Apple Inc. Predictive coding for point cloud compression
US11895307B2 (en) 2019-10-04 2024-02-06 Apple Inc. Block-based predictive coding for point cloud compression
JP7330376B2 (en) 2019-10-07 2023-08-21 ボストン ポーラリメトリックス,インコーポレイティド Method for augmenting sensor and imaging systems with polarized light
US11315326B2 (en) * 2019-10-15 2022-04-26 At&T Intellectual Property I, L.P. Extended reality anchor caching based on viewport prediction
US12058510B2 (en) * 2019-10-18 2024-08-06 Sphere Entertainment Group, Llc Mapping audio to visual images on a display device having a curved screen
US11202162B2 (en) 2019-10-18 2021-12-14 Msg Entertainment Group, Llc Synthesizing audio of a venue
CN110769241B (en) * 2019-11-05 2022-02-01 广州虎牙科技有限公司 Video frame processing method and device, user side and storage medium
WO2021108002A1 (en) 2019-11-30 2021-06-03 Boston Polarimetrics, Inc. Systems and methods for transparent object segmentation using polarization cues
US11507103B2 (en) 2019-12-04 2022-11-22 Zebra Technologies Corporation Method, system and apparatus for localization-based historical obstacle handling
US11734873B2 (en) 2019-12-13 2023-08-22 Sony Group Corporation Real-time volumetric visualization of 2-D images
US11107238B2 (en) 2019-12-13 2021-08-31 Zebra Technologies Corporation Method, system and apparatus for detecting item facings
US11798196B2 (en) 2020-01-08 2023-10-24 Apple Inc. Video-based point cloud compression with predicted patches
US11625866B2 (en) 2020-01-09 2023-04-11 Apple Inc. Geometry encoding using octrees and predictive trees
KR20220132620A (en) 2020-01-29 2022-09-30 인트린식 이노베이션 엘엘씨 Systems and methods for characterizing object pose detection and measurement systems
CN115428028A (en) 2020-01-30 2022-12-02 因思创新有限责任公司 System and method for synthesizing data for training statistical models in different imaging modalities including polarized images
US11240465B2 (en) 2020-02-21 2022-02-01 Alibaba Group Holding Limited System and method to use decoder information in video super resolution
US11430179B2 (en) * 2020-02-24 2022-08-30 Microsoft Technology Licensing, Llc Depth buffer dilation for remote rendering
US11822333B2 (en) 2020-03-30 2023-11-21 Zebra Technologies Corporation Method, system and apparatus for data capture illumination control
US11700353B2 (en) * 2020-04-06 2023-07-11 Eingot Llc Integration of remote audio into a performance venue
US11953700B2 (en) 2020-05-27 2024-04-09 Intrinsic Innovation Llc Multi-aperture polarization optical systems using beam splitters
US11615557B2 (en) 2020-06-24 2023-03-28 Apple Inc. Point cloud compression using octrees with slicing
US11620768B2 (en) 2020-06-24 2023-04-04 Apple Inc. Point cloud geometry compression using octrees with multiple scan orders
US11450024B2 (en) 2020-07-17 2022-09-20 Zebra Technologies Corporation Mixed depth object detection
US11875452B2 (en) * 2020-08-18 2024-01-16 Qualcomm Incorporated Billboard layers in object-space rendering
US11748918B1 (en) * 2020-09-25 2023-09-05 Apple Inc. Synthesized camera arrays for rendering novel viewpoints
EP4007992A1 (en) * 2020-10-08 2022-06-08 Google LLC Few-shot synthesis of talking heads
US11593915B2 (en) 2020-10-21 2023-02-28 Zebra Technologies Corporation Parallax-tolerant panoramic image generation
US11392891B2 (en) 2020-11-03 2022-07-19 Zebra Technologies Corporation Item placement detection and optimization in material handling systems
US11847832B2 (en) 2020-11-11 2023-12-19 Zebra Technologies Corporation Object classification for autonomous navigation systems
US11527014B2 (en) * 2020-11-24 2022-12-13 Verizon Patent And Licensing Inc. Methods and systems for calibrating surface data capture devices
US11874415B2 (en) * 2020-12-22 2024-01-16 International Business Machines Corporation Earthquake detection and response via distributed visual input
US11703457B2 (en) * 2020-12-29 2023-07-18 Industrial Technology Research Institute Structure diagnosis system and structure diagnosis method
US12020455B2 (en) 2021-03-10 2024-06-25 Intrinsic Innovation Llc Systems and methods for high dynamic range image reconstruction
US12069227B2 (en) 2021-03-10 2024-08-20 Intrinsic Innovation Llc Multi-modal and multi-spectral stereo camera arrays
US11651538B2 (en) * 2021-03-17 2023-05-16 International Business Machines Corporation Generating 3D videos from 2D models
US11948338B1 (en) 2021-03-29 2024-04-02 Apple Inc. 3D volumetric content encoding using 2D videos and simplified 3D meshes
US11290658B1 (en) 2021-04-15 2022-03-29 Boston Polarimetrics, Inc. Systems and methods for camera exposure control
US11954886B2 (en) 2021-04-15 2024-04-09 Intrinsic Innovation Llc Systems and methods for six-degree of freedom pose estimation of deformable objects
US12067746B2 (en) 2021-05-07 2024-08-20 Intrinsic Innovation Llc Systems and methods for using computer vision to pick up small objects
US11954882B2 (en) 2021-06-17 2024-04-09 Zebra Technologies Corporation Feature-based georegistration for mobile computing devices
US11689813B2 (en) 2021-07-01 2023-06-27 Intrinsic Innovation Llc Systems and methods for high dynamic range imaging using crossed polarizers
CN113761238B (en) * 2021-08-27 2022-08-23 广州文远知行科技有限公司 Point cloud storage method, device, equipment and storage medium
US11823319B2 (en) 2021-09-02 2023-11-21 Nvidia Corporation Techniques for rendering signed distance functions
CN113905221B (en) * 2021-09-30 2024-01-16 福州大学 Stereoscopic panoramic video asymmetric transport stream self-adaption method and system
CN114355287B (en) * 2022-01-04 2023-08-15 湖南大学 Ultra-short baseline underwater sound distance measurement method and system
WO2023159180A1 (en) * 2022-02-17 2023-08-24 Nutech Ventures Single-pass 3d reconstruction of internal surface of pipelines using depth camera array
CN116800947A (en) * 2022-03-16 2023-09-22 安霸国际有限合伙企业 Rapid RGB-IR calibration verification for mass production process
WO2024144805A1 (en) * 2022-12-29 2024-07-04 Innopeak Technology, Inc. Methods and systems for image processing with eye gaze redirection

Family Cites Families (103)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5602903A (en) 1994-09-28 1997-02-11 Us West Technologies, Inc. Positioning system and method
US5850352A (en) 1995-03-31 1998-12-15 The Regents Of The University Of California Immersive video, including video hypermosaicing to generate from multiple video views of a scene a three-dimensional video mosaic from which diverse virtual video scene images are synthesized, including panoramic, scene interactive and stereoscopic images
JP3461980B2 (en) 1995-08-25 2003-10-27 株式会社東芝 High-speed drawing method and apparatus
US6163337A (en) 1996-04-05 2000-12-19 Matsushita Electric Industrial Co., Ltd. Multi-view point image transmission method and multi-view point image display method
US5926400A (en) 1996-11-21 1999-07-20 Intel Corporation Apparatus and method for determining the intensity of a sound in a virtual world
US6064771A (en) 1997-06-23 2000-05-16 Real-Time Geometry Corp. System and method for asynchronous, adaptive moving picture compression, and decompression
US6072496A (en) 1998-06-08 2000-06-06 Microsoft Corporation Method and system for capturing and representing 3D geometry, color and shading of facial expressions and other animated objects
US6226003B1 (en) 1998-08-11 2001-05-01 Silicon Graphics, Inc. Method for rendering silhouette and true edges of 3-D line drawings with occlusion
US6556199B1 (en) 1999-08-11 2003-04-29 Advanced Research And Technology Institute Method and apparatus for fast voxelization of volumetric models
US6509902B1 (en) 2000-02-28 2003-01-21 Mitsubishi Electric Research Laboratories, Inc. Texture filtering for surface elements
US7522186B2 (en) 2000-03-07 2009-04-21 L-3 Communications Corporation Method and apparatus for providing immersive surveillance
US6968299B1 (en) 2000-04-14 2005-11-22 International Business Machines Corporation Method and apparatus for reconstructing a surface using a ball-pivoting algorithm
US6750873B1 (en) 2000-06-27 2004-06-15 International Business Machines Corporation High quality texture reconstruction from multiple scans
US7538764B2 (en) 2001-01-05 2009-05-26 Interuniversitair Micro-Elektronica Centrum (Imec) System and method to obtain surface structures of multi-dimensional objects, and to represent those surface structures for animation, transmission and display
US6919906B2 (en) 2001-05-08 2005-07-19 Microsoft Corporation Discontinuity edge overdraw
GB2378337B (en) 2001-06-11 2005-04-13 Canon Kk 3D Computer modelling apparatus
US7909696B2 (en) 2001-08-09 2011-03-22 Igt Game interaction in 3-D gaming environments
US6990681B2 (en) 2001-08-09 2006-01-24 Sony Corporation Enhancing broadcast of an event with synthetic scene using a depth map
US6781591B2 (en) 2001-08-15 2004-08-24 Mitsubishi Electric Research Laboratories, Inc. Blending multiple images using local and global information
US7023432B2 (en) 2001-09-24 2006-04-04 Geomagic, Inc. Methods, apparatus and computer program products that reconstruct surfaces from data point sets
US7096428B2 (en) 2001-09-28 2006-08-22 Fuji Xerox Co., Ltd. Systems and methods for providing a spatially indexed panoramic video
EP1473678A4 (en) 2002-02-06 2008-02-13 Digital Process Ltd Three-dimensional shape displaying program, three-dimensional shape displaying method, and three-dimensional shape displaying device
US20040217956A1 (en) 2002-02-28 2004-11-04 Paul Besl Method and system for processing, compressing, streaming, and interactive rendering of 3D color image data
US7515173B2 (en) 2002-05-23 2009-04-07 Microsoft Corporation Head pose tracking system
US7030875B2 (en) 2002-09-04 2006-04-18 Honda Motor Company Ltd. Environmental reasoning using geometric data structure
US7106358B2 (en) 2002-12-30 2006-09-12 Motorola, Inc. Method, system and apparatus for telepresence communications
US20050017969A1 (en) 2003-05-27 2005-01-27 Pradeep Sen Computer graphics rendering using boundary information
US7480401B2 (en) 2003-06-23 2009-01-20 Siemens Medical Solutions Usa, Inc. Method for local surface smoothing with application to chest wall nodule segmentation in lung CT data
US7321669B2 (en) * 2003-07-10 2008-01-22 Sarnoff Corporation Method and apparatus for refining target position and size estimates using image and depth data
GB2405776B (en) 2003-09-05 2008-04-02 Canon Europa Nv 3d computer surface model generation
US7184052B2 (en) 2004-06-18 2007-02-27 Microsoft Corporation Real-time texture rendering using generalized displacement maps
US7292257B2 (en) 2004-06-28 2007-11-06 Microsoft Corporation Interactive viewpoint video system and process
US7671893B2 (en) 2004-07-27 2010-03-02 Microsoft Corp. System and method for interactive multi-view video
US20060023782A1 (en) 2004-07-27 2006-02-02 Microsoft Corporation System and method for off-line multi-view video compression
US7561620B2 (en) 2004-08-03 2009-07-14 Microsoft Corporation System and process for compressing and decompressing multiple, layered, video streams employing spatial and temporal encoding
US7142209B2 (en) 2004-08-03 2006-11-28 Microsoft Corporation Real-time rendering system and process for interactive viewpoint video that was generated using overlapping images of a scene captured from viewpoints forming a grid
US7221366B2 (en) 2004-08-03 2007-05-22 Microsoft Corporation Real-time rendering system and process for interactive viewpoint video
US8477173B2 (en) 2004-10-15 2013-07-02 Lifesize Communications, Inc. High definition videoconferencing system
WO2006062199A1 (en) 2004-12-10 2006-06-15 Kyoto University 3-dimensional image data compression device, method, program, and recording medium
WO2006084385A1 (en) 2005-02-11 2006-08-17 Macdonald Dettwiler & Associates Inc. 3d imaging system
US8228994B2 (en) 2005-05-20 2012-07-24 Microsoft Corporation Multi-view video coding based on temporal and view decomposition
US20070070177A1 (en) 2005-07-01 2007-03-29 Christensen Dennis G Visual and aural perspective management for enhanced interactive video telepresence
JP4595733B2 (en) 2005-08-02 2010-12-08 カシオ計算機株式会社 Image processing device
US7551232B2 (en) 2005-11-14 2009-06-23 Lsi Corporation Noise adaptive 3D composite noise reduction
US7623127B2 (en) 2005-11-29 2009-11-24 Siemens Medical Solutions Usa, Inc. Method and apparatus for discrete mesh filleting and rounding through ball pivoting
US7577491B2 (en) 2005-11-30 2009-08-18 General Electric Company System and method for extracting parameters of a cutting tool
KR100810268B1 (en) 2006-04-06 2008-03-06 삼성전자주식회사 Embodiment Method For Color-weakness in Mobile Display Apparatus
US7778491B2 (en) 2006-04-10 2010-08-17 Microsoft Corporation Oblique image stitching
US7679639B2 (en) 2006-04-20 2010-03-16 Cisco Technology, Inc. System and method for enhancing eye gaze in a telepresence system
EP1862969A1 (en) 2006-06-02 2007-12-05 Eidgenössische Technische Hochschule Zürich Method and system for generating a representation of a dynamically changing 3D scene
US20080043024A1 (en) 2006-06-26 2008-02-21 Siemens Corporate Research, Inc. Method for reconstructing an object subject to a cone beam using a graphic processor unit (gpu)
USD610105S1 (en) 2006-07-10 2010-02-16 Cisco Technology, Inc. Telepresence system
US8213711B2 (en) 2007-04-03 2012-07-03 Her Majesty The Queen In Right Of Canada As Represented By The Minister Of Industry, Through The Communications Research Centre Canada Method and graphical user interface for modifying depth maps
GB0708676D0 (en) 2007-05-04 2007-06-13 Imec Inter Uni Micro Electr A Method for real-time/on-line performing of multi view multimedia applications
US8253770B2 (en) 2007-05-31 2012-08-28 Eastman Kodak Company Residential video communication system
JP4947593B2 (en) 2007-07-31 2012-06-06 Kddi株式会社 Apparatus and program for generating free viewpoint image by local region segmentation
US8223192B2 (en) 2007-10-31 2012-07-17 Technion Research And Development Foundation Ltd. Free viewpoint video
US8160345B2 (en) 2008-04-30 2012-04-17 Otismed Corporation System and method for image segmentation in generating computer models of a joint to undergo arthroplasty
JP5684577B2 (en) * 2008-02-27 2015-03-11 ソニー コンピュータ エンタテインメント アメリカ リミテッド ライアビリテイ カンパニー How to capture scene depth data and apply computer actions
TWI357582B (en) 2008-04-18 2012-02-01 Univ Nat Taiwan Image tracking system and method thereof
US8442355B2 (en) 2008-05-23 2013-05-14 Samsung Electronics Co., Ltd. System and method for generating a multi-dimensional image
US7840638B2 (en) 2008-06-27 2010-11-23 Microsoft Corporation Participant positioning in multimedia conferencing
US8106924B2 (en) 2008-07-31 2012-01-31 Stmicroelectronics S.R.L. Method and system for video rendering, computer program product therefor
WO2010035492A1 (en) 2008-09-29 2010-04-01 パナソニック株式会社 3d image processing device and method for reducing noise in 3d image processing device
CN102239506B (en) 2008-10-02 2014-07-09 弗兰霍菲尔运输应用研究公司 Intermediate view synthesis and multi-view data signal extraction
US8200041B2 (en) 2008-12-18 2012-06-12 Intel Corporation Hardware accelerated silhouette detection
US8436852B2 (en) 2009-02-09 2013-05-07 Microsoft Corporation Image editing consistent with scene geometry
US8477175B2 (en) 2009-03-09 2013-07-02 Cisco Technology, Inc. System and method for providing three dimensional imaging in a network environment
JP5222205B2 (en) 2009-04-03 2013-06-26 Kddi株式会社 Image processing apparatus, method, and program
US20100259595A1 (en) 2009-04-10 2010-10-14 Nokia Corporation Methods and Apparatuses for Efficient Streaming of Free View Point Video
US8719309B2 (en) 2009-04-14 2014-05-06 Apple Inc. Method and apparatus for media data transmission
US8665259B2 (en) 2009-04-16 2014-03-04 Autodesk, Inc. Multiscale three-dimensional navigation
US8755569B2 (en) 2009-05-29 2014-06-17 University Of Central Florida Research Foundation, Inc. Methods for recognizing pose and action of articulated objects with collection of planes in motion
US8629866B2 (en) 2009-06-18 2014-01-14 International Business Machines Corporation Computer method and apparatus providing interactive control and remote identity through in-world proxy
KR101070591B1 (en) * 2009-06-25 2011-10-06 (주)실리콘화일 distance measuring apparatus having dual stereo camera
US9648346B2 (en) 2009-06-25 2017-05-09 Microsoft Technology Licensing, Llc Multi-view video compression and streaming based on viewpoints of remote viewer
US8194149B2 (en) 2009-06-30 2012-06-05 Cisco Technology, Inc. Infrared-aided depth estimation
US8633940B2 (en) 2009-08-04 2014-01-21 Broadcom Corporation Method and system for texture compression in a system having an AVC decoder and a 3D engine
US8908958B2 (en) 2009-09-03 2014-12-09 Ron Kimmel Devices and methods of generating three dimensional (3D) colored models
US8284237B2 (en) 2009-09-09 2012-10-09 Nokia Corporation Rendering multiview content in a 3D video system
US8441482B2 (en) 2009-09-21 2013-05-14 Caustic Graphics, Inc. Systems and methods for self-intersection avoidance in ray tracing
US20110084983A1 (en) 2009-09-29 2011-04-14 Wavelength & Resonance LLC Systems and Methods for Interaction With a Virtual Environment
US9154730B2 (en) 2009-10-16 2015-10-06 Hewlett-Packard Development Company, L.P. System and method for determining the active talkers in a video conference
US8537200B2 (en) 2009-10-23 2013-09-17 Qualcomm Incorporated Depth map generation techniques for conversion of 2D video data to 3D video data
US20110122225A1 (en) 2009-11-23 2011-05-26 General Instrument Corporation Depth Coding as an Additional Channel to Video Sequence
US8487977B2 (en) 2010-01-26 2013-07-16 Polycom, Inc. Method and apparatus to virtualize people with 3D effect into a remote room on a telepresence call for true in person experience
US20110211749A1 (en) 2010-02-28 2011-09-01 Kar Han Tan System And Method For Processing Video Using Depth Sensor Information
US8898567B2 (en) 2010-04-09 2014-11-25 Nokia Corporation Method and apparatus for generating a virtual interactive workspace
EP2383696A1 (en) 2010-04-30 2011-11-02 LiberoVision AG Method for estimating a pose of an articulated object model
US20110304619A1 (en) 2010-06-10 2011-12-15 Autodesk, Inc. Primitive quadric surface extraction from unorganized point cloud data
US8411126B2 (en) 2010-06-24 2013-04-02 Hewlett-Packard Development Company, L.P. Methods and systems for close proximity spatial audio rendering
KR20120011653A (en) * 2010-07-29 2012-02-08 Samsung Electronics Co., Ltd. Image processing apparatus and method
US8659597B2 (en) 2010-09-27 2014-02-25 Intel Corporation Multi-view ray tracing using edge detection and shader reuse
US8787459B2 (en) 2010-11-09 2014-07-22 Sony Computer Entertainment Inc. Video coding methods and apparatus
US9123115B2 (en) * 2010-11-23 2015-09-01 Qualcomm Incorporated Depth estimation based on global motion and optical flow
JP5858380B2 (en) * 2010-12-03 2016-02-10 National University Corporation Nagoya University Virtual viewpoint image composition method and virtual viewpoint image composition system
US8693713B2 (en) 2010-12-17 2014-04-08 Microsoft Corporation Virtual audio environment for multidimensional conferencing
US8156239B1 (en) 2011-03-09 2012-04-10 Metropcs Wireless, Inc. Adaptive multimedia renderer
EP2707834B1 (en) 2011-05-13 2020-06-24 Vizrt Ag Silhouette-based pose estimation
US8867886B2 (en) 2011-08-08 2014-10-21 Roy Feinson Surround video playback
WO2013049388A1 (en) 2011-09-29 2013-04-04 Dolby Laboratories Licensing Corporation Representation and coding of multi-view images using tapestry encoding
US9830743B2 (en) 2012-04-03 2017-11-28 Autodesk, Inc. Volume-preserving smoothing brush
US9058706B2 (en) 2012-04-30 2015-06-16 Convoy Technologies Llc Motor vehicle camera and monitoring system

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6327381B1 (en) * 1994-12-29 2001-12-04 Worldscape, Llc Image transformation and synthesis methods
US20060267977A1 (en) * 2005-05-19 2006-11-30 Helmut Barfuss Method for expanding the display of a volume image of an object region
US20080095465A1 (en) * 2006-10-18 2008-04-24 General Electric Company Image registration system and method
US20090016641A1 (en) * 2007-06-19 2009-01-15 Gianluca Paladini Method and apparatus for efficient client-server visualization of multi-dimensional data
US20090128568A1 (en) * 2007-11-16 2009-05-21 Sportvision, Inc. Virtual viewpoint animation
US20110142321A1 (en) * 2008-08-29 2011-06-16 Koninklijke Philips Electronics N.V. Dynamic transfer of three-dimensional image data

Cited By (28)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9191643B2 (en) 2013-04-15 2015-11-17 Microsoft Technology Licensing, Llc Mixing infrared and color component data point clouds
US10592973B1 (en) 2013-10-25 2020-03-17 Appliance Computing III, Inc. Image-based rendering of real spaces
US11783409B1 (en) 2013-10-25 2023-10-10 Appliance Computing III, Inc. Image-based rendering of real spaces
US11449926B1 (en) 2013-10-25 2022-09-20 Appliance Computing III, Inc. Image-based rendering of real spaces
US10510111B2 (en) 2013-10-25 2019-12-17 Appliance Computing III, Inc. Image-based rendering of real spaces
US11062384B1 (en) 2013-10-25 2021-07-13 Appliance Computing III, Inc. Image-based rendering of real spaces
US11610256B1 (en) 2013-10-25 2023-03-21 Appliance Computing III, Inc. User interface for image-based rendering of virtual tours
US11948186B1 (en) 2013-10-25 2024-04-02 Appliance Computing III, Inc. User interface for image-based rendering of virtual tours
US11508125B1 (en) * 2014-05-28 2022-11-22 Lucasfilm Entertainment Company Ltd. Navigating a virtual environment of a media content item
US20170013283A1 (en) * 2015-07-10 2017-01-12 Futurewei Technologies, Inc. Multi-view video streaming with fast and smooth view switch
US9848212B2 (en) * 2015-07-10 2017-12-19 Futurewei Technologies, Inc. Multi-view video streaming with fast and smooth view switch
CN108605090A (en) * 2016-02-12 2018-09-28 Samsung Electronics Co., Ltd. Method for supporting display of VR content in a communication system
US20200005527A1 (en) * 2016-12-19 2020-01-02 Interdigital Ce Patent Holdings Method and apparatus for constructing lighting environment representations of 3d scenes
US10705678B2 (en) * 2017-02-28 2020-07-07 Canon Kabushiki Kaisha Image processing apparatus, image processing method, and storage medium for generating a virtual viewpoint image
US20180246631A1 (en) * 2017-02-28 2018-08-30 Canon Kabushiki Kaisha Image processing apparatus, image processing method, and storage medium
US10681337B2 (en) * 2017-04-14 2020-06-09 Fujitsu Limited Method, apparatus, and non-transitory computer-readable storage medium for view point selection assistance in free viewpoint video generation
EP3388119A3 (en) * 2017-04-14 2018-11-28 Fujitsu Limited Method, apparatus, and non-transitory computer-readable storage medium for view point selection assistance in free viewpoint video generation
US20180356942A1 (en) * 2017-06-12 2018-12-13 Samsung Eletrônica da Amazônia Ltda. Method for displaying 360° media on bubbles interface
EP3425592B1 (en) * 2017-07-06 2024-07-24 Canon Kabushiki Kaisha Image processing apparatus, image processing method, and program, for generating a virtual viewpoint image
US10951879B2 (en) 2017-12-04 2021-03-16 Canon Kabushiki Kaisha Method, system and apparatus for capture of image data for free viewpoint video
US10818077B2 (en) 2018-12-14 2020-10-27 Canon Kabushiki Kaisha Method, system and apparatus for controlling a virtual camera
US11037365B2 (en) 2019-03-07 2021-06-15 Alibaba Group Holding Limited Method, apparatus, medium, terminal, and device for processing multi-angle free-perspective data
US11521347B2 (en) 2019-03-07 2022-12-06 Alibaba Group Holding Limited Method, apparatus, medium, and device for generating multi-angle free-perspective image data
US11341715B2 (en) 2019-03-07 2022-05-24 Alibaba Group Holding Limited Video reconstruction method, system, device, and computer readable storage medium
US11257283B2 (en) 2019-03-07 2022-02-22 Alibaba Group Holding Limited Image reconstruction method, system, device and computer-readable storage medium
US11055901B2 (en) 2019-03-07 2021-07-06 Alibaba Group Holding Limited Method, apparatus, medium, and server for generating multi-angle free-perspective video data
US11776205B2 (en) * 2020-06-09 2023-10-03 Ptc Inc. Determination of interactions with predefined volumes of space based on automated analysis of volumetric video
WO2024006997A1 (en) * 2022-07-01 2024-01-04 Google Llc Three-dimensional video highlight from a camera source

Also Published As

Publication number Publication date
US9846960B2 (en) 2017-12-19
US9251623B2 (en) 2016-02-02
US20130321593A1 (en) 2013-12-05
US20130321418A1 (en) 2013-12-05
US20130321566A1 (en) 2013-12-05
US20130321586A1 (en) 2013-12-05
US8917270B2 (en) 2014-12-23
US20130321410A1 (en) 2013-12-05
US20130321413A1 (en) 2013-12-05
US20130321590A1 (en) 2013-12-05
US20130321589A1 (en) 2013-12-05
US20130321396A1 (en) 2013-12-05
US9256980B2 (en) 2016-02-09

Similar Documents

Publication Publication Date Title
US20130321575A1 (en) High definition bubbles for rendering free viewpoint video
US10582191B1 (en) Dynamic angle viewing system
US12079942B2 (en) Augmented and virtual reality
US11217006B2 (en) Methods and systems for performing 3D simulation based on a 2D video image
US10880522B2 (en) Hybrid media viewing application including a region of interest within a wide field of view
AU2021203688B2 (en) Volumetric depth video recording and playback
US11748870B2 (en) Video quality measurement for virtual cameras in volumetric immersive media
US9237330B2 (en) Forming a stereoscopic video
JP4783588B2 (en) Interactive viewpoint video system and process
US20080246759A1 (en) Automatic Scene Modeling for the 3D Camera and 3D Video
Wang et al. DistanciAR: Authoring site-specific augmented reality experiences for remote environments
US20200388068A1 (en) System and apparatus for user controlled virtual camera for volumetric video
US20140181630A1 (en) Method and apparatus for adding annotations to an image
KR20070086037A (en) Method for inter-scene transitions
JP6845490B2 (en) Texture rendering based on multi-layer UV maps for free-moving FVV applications
KR102499904B1 (en) Methods and systems for creating a virtualized projection of a customized view of a real world scene for inclusion within virtual reality media content
CN112740261A (en) Panoramic light field capture, processing and display
WO2014094874A1 (en) Method and apparatus for adding annotations to a plenoptic light field
CN113393566A (en) Depth-based 3D reconstruction using a priori depth scenes
Langlotz et al. AR record&replay: situated compositing of video content in mobile augmented reality
Tompkin et al. Video collections in panoramic contexts
Inamoto et al. Free viewpoint video synthesis and presentation of sporting events for mixed reality entertainment
Jiang View transformation and novel view synthesis based on deep learning
Huang et al. Animated panorama from a panning video sequence
Lipski Virtual video camera: a system for free viewpoint video of arbitrary dynamic scenes

Legal Events

Date Code Title Description
AS Assignment

Owner name: MICROSOFT CORPORATION, WASHINGTON

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:KIRK, ADAM;FISHMAN, NEIL;GILLET, DON;AND OTHERS;SIGNING DATES FROM 20120827 TO 20120829;REEL/FRAME:028880/0101

AS Assignment

Owner name: MICROSOFT TECHNOLOGY LICENSING, LLC, WASHINGTON

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:MICROSOFT CORPORATION;REEL/FRAME:034544/0541

Effective date: 20141014

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION