US9330500B2 - Inserting objects into content - Google Patents
Inserting objects into content
- Publication number
- US9330500B2 (application US13/314,723)
- Authority
- US
- United States
- Prior art keywords
- scene
- light
- image
- single image
- model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T19/00—Manipulating 3D models or images for computer graphics
- G06T19/006—Mixed reality
Definitions
- an image of a scene is obtained.
- a 3-dimensional (3D) representation of the scene and a light model for the scene are generated based on the image.
- One or more objects are inserted into the 3D representation of the scene, and the 3D representation of the scene is rendered, based on the light model, to generate a modified image of the scene including the one or more objects.
- an image of a scene is obtained. Locations of one or more interior lighting sources in the image and locations of one or more shafts of light in the image are identified based on the image. An amount of shadowing of each of multiple pixels in the one or more shafts of light is identified, and a direction of each of the one or more shafts of light is determined. The locations of one or more interior lighting sources in the image, the locations of one or more shafts of light in the image, the amount of shadowing of each of multiple pixels in the one or more shafts of light, and the direction of each of the one or more shafts of light are maintained as a light model for the scene.
- FIG. 1 illustrates an example system implementing the inserting objects into content in accordance with one or more embodiments.
- FIG. 2 is a flowchart illustrating an example process for inserting objects into content in accordance with one or more embodiments.
- FIG. 3 illustrates an example image of a scene in accordance with one or more embodiments.
- FIG. 4 illustrates an example image of a scene with scene boundaries identified in accordance with one or more embodiments.
- FIG. 5 illustrates an example of user input identifying extruding geometry in accordance with one or more embodiments.
- FIG. 6 illustrates an example of user input identifying an occluding surface in accordance with one or more embodiments.
- FIG. 7 illustrates an example of user input identifying interior lighting in accordance with one or more embodiments.
- FIG. 8 illustrates an example of refined light source locations in accordance with one or more embodiments.
- FIG. 9 illustrates an example image including exterior lighting in accordance with one or more embodiments.
- FIG. 10 illustrates an example of user input identifying shafts of light and sources of the shafts of light in accordance with one or more embodiments.
- FIG. 11 is a flowchart illustrating an example process for generating a light model in accordance with one or more embodiments.
- FIG. 12 is a block diagram illustrating an example computing device in which the inserting objects into content can be implemented in accordance with one or more embodiments.
- Inserting objects into content is discussed herein.
- An image, such as a photograph, into which one or more objects are to be inserted is obtained.
- a 3-dimensional (3D) representation of the scene in the image is generated and a light model of the scene in the image is generated.
- the one or more objects are added to the 3D representation of the scene.
- the 3D representation of the scene is then rendered, based on the light model, to generate a modified image (which is the obtained image modified to include the one or more objects).
- FIG. 1 illustrates an example system 100 implementing the inserting objects into content in accordance with one or more embodiments.
- System 100 includes a content management module 102 , a user input module 104 , a display module 106 , and an insertion system 108 .
- Insertion system 108 includes a 3-dimensional (3D) representation generation module 112 , a light source identification module 114 , an object insertion module 116 , and a rendering module 118 .
- system 100 is implemented by a single device. Any of a variety of different types of devices can be used to implement system 100, such as a desktop or laptop computer, a server computer, a cellular or other wireless phone, a digital camera, and so forth. Alternatively, system 100 can be implemented by multiple devices, with different devices including different modules. For example, one or more modules of system 100 can be implemented at least in part by one device (e.g., a desktop computer), while one or more other modules of system 100 are implemented at least in part by another device (e.g., a server computer accessed over a communication network).
- the multiple devices can communicate with one another over various wired and/or wireless communication networks (e.g., the Internet, a local area network (LAN), a cellular or other wireless phone network, etc.) or other communication media (e.g., a universal serial bus (USB) connection, a wireless USB connection, and so forth).
- Content management module 102 manages content, including obtaining content and/or providing content to other devices or systems.
- the content can be in various forms, such as a single image, a set of multiple images (e.g., video), and so forth. An image is oftentimes a photograph, although images can take other forms such as drawings, paintings, and so forth.
- Content management module 102 can obtain content in various manners, such as from an image capture device of system 100 , from another system or device, from a storage device (e.g., magnetic disk, optical disc, Flash memory, etc.) of system 100 , and so forth.
- Content management module 102 can also provide content to other devices or systems in various manners, such as emailing content, saving content in a particular location of a storage device or to a particular service, and so forth.
- User input module 104 receives inputs from a user of system 100 , and provides an indication of those user inputs to various modules of system 100 .
- User inputs can be provided by the user in various manners, such as by touching portions of a touchscreen or touchpad with a finger or stylus, manipulating a mouse or other cursor control device, providing audible inputs that are received by a microphone of system 100 , moving hands or other body parts that are detected by an image capture device of system 100 , and so forth.
- Display module 106 displays a user interface (UI) for system 100 , including displaying images or other content.
- Display module 106 can display the UI on a screen of system 100 , or alternatively provide signals causing the UI to be displayed on a screen of another system or device.
- Insertion system 108 facilitates inserting objects into content.
- content management module 102 obtains content and makes the content available to insertion system 108 .
- 3D representation generation module 112 generates, based on an image of the content, a 3D representation of the scene in the image. If the content includes multiple images, then the 3D representation is a 3D representation of the scene in one of the multiple images. This 3D representation of the scene includes an estimation of materials included in the image.
- Light source identification module 114 estimates, based on the image, the location of one or more light sources in the 3D representation of the scene generated by module 112 and generates a light model for the scene.
- Object insertion module 116 inserts objects into the 3D representation of the scene.
- Rendering module 118 renders, based on the estimated materials included in the image and the light model for the scene, the 3D representation of the scene to generate a modified 2-dimensional (2D) image.
- the modified image is the image that was obtained and modified by insertion of the one or more objects.
- the inserted objects look like they belong in the image, appearing as if the objects were actually part of the scene depicted in the image.
- Insertion system 108 allows objects to be inserted into an image based on a single image. Insertion system 108 need not have multiple images of the same scene in order to allow objects to be inserted into the image. It should also be noted that insertion system 108 allows objects to be inserted into an image based on the image and without additional data or information (e.g., data regarding lighting) being collected from the physical scene depicted in the image. Some user inputs identifying characteristics of the scene may be received as discussed in more detail below, but no additional information need be collected from the scene itself (from the physical scene depicted in the image).
- FIG. 2 is a flowchart illustrating an example process 200 for inserting objects into content in accordance with one or more embodiments.
- Process 200 can be implemented in software, firmware, hardware, or combinations thereof.
- Process 200 is carried out by, for example, an insertion system 108 of FIG. 1 .
- Process 200 is shown as a set of acts and is not limited to the order shown for performing the operations of the various acts.
- Process 200 is an example process for inserting objects into content; additional discussions of inserting objects into content are included herein with reference to different figures.
- content depicting a scene is obtained (act 202 ).
- the content can be a single image, or multiple images (e.g., video), and the content can be obtained in various manners as discussed above.
- a 3D representation of the scene is generated based on an image of the content (act 204 ). If the content is a single image then the 3D representation of the scene is generated based on that single image, and if the content is multiple images then the 3D representation of the scene is generated based on one or more of the multiple images.
- the 3D representation of the scene in the image can be generated in a variety of different manners as discussed below.
- a light model for the scene is generated based on the image (act 206 ).
- This light model can include interior lighting and/or exterior lighting as discussed in more detail below.
- One or more objects are inserted into the 3D representation of the scene (act 208 ). These one or more objects can take various forms, such as synthetic objects, portions of other pictures, and so forth as discussed in more detail below.
- the 3D representation of the scene is rendered to generate a modified image that includes the inserted object (act 210 ).
- the modified image is rendered based on the 3D representation of the scene and the identified light sources, as discussed in more detail below.
- 3D representation generation module 112 generates, based on an image, a 3D representation of the scene in the image.
- This image (e.g., obtained by content management module 102 as discussed above) is also referred to as the original image.
- This 3D representation is a model or representation of the scene that is depicted in the image. It should be noted that the 3D representation can be generated by module 112 based on the image, and optionally user inputs, and need not be based on any additional information regarding the physical scene (e.g., actual measurements of the physical scene).
- 3D representation generation module 112 can generate the 3D representation of the scene in the image in a variety of different manners.
- the 3D representation of the scene in the image is generated by automatically identifying scene boundaries within the scene.
- the scene boundaries can be automatically identified in different manners, such as using the techniques discussed in Hedau, V., Hoiem, D., and Forsyth, D., “Recovering the Spatial Layout of Cluttered Rooms”, International Conference on Computer Vision (2009).
- These scene boundaries refer to boundaries present in the physical scene depicted in the image, such as floors, walls, ceilings, and so forth.
- the scene boundaries can be a coarse geometric representation of the scene, with scene boundaries being approximately identified—exact scene boundaries need not be identified and all scene boundaries need not be identified. Rather, the techniques discussed herein generate sufficient geometry to model lighting effects, and need not fully describe all aspects of the scene.
- the geometry of a scene refers to parts (e.g., walls, floors, ceilings, buildings, furniture, etc.) in the scene.
- the geometry of the 3D representation of a scene refers to the 3D representation of those objects.
- FIG. 3 illustrates an example image 300 of a scene (e.g., of a kitchen).
- the scene boundaries in the scene of image 300 are boundaries between walls, floor, and ceiling.
- FIG. 4 illustrates an example image 400 of the same scene as image 300 , but with scene boundaries identified using dashed lines.
- 3D representation generation module 112 also automatically generates an estimate of parameters of a camera or other imaging device that captured or would have captured an image (e.g., taken a picture).
- This estimate can be an estimate of camera parameters for an actual camera, such as for a camera that actually took (or could have taken) a picture of a scene.
- This estimate can also be an estimate of a virtual or assumed camera for images that are not photographs. For example, if an image is a drawing or painting, then the estimate can be an estimate of camera parameters for a camera that would have captured the image if the image were a photograph.
- the estimate of the camera parameters includes, for example, an estimate of camera intrinsics or internal camera parameters such as the focal length and optical center of the camera.
- the camera parameters are thus also referred to as the camera perspective.
- the camera parameters can be automatically identified in different manners, such as using the techniques discussed in Hedau, V., Hoiem, D., and Forsyth, D., “Recovering the Spatial Layout of Cluttered Rooms”, International Conference on Computer Vision (2009).
- vanishing points in the image can be estimated. Vanishing points refer to the intersection of 3D parallel lines in 2D. Given multiple vanishing points, the camera parameters can be readily identified.
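- As background (not part of the patent text), the following sketch illustrates how vanishing points constrain the camera intrinsics; the function name and the assumptions of square pixels, two vanishing points of orthogonal scene directions, and a known principal point are all illustrative:

```python
import numpy as np

def focal_from_orthogonal_vanishing_points(v1, v2, principal_point):
    """Estimate the focal length of a pinhole camera from two vanishing
    points of orthogonal 3D directions, assuming square pixels and a known
    principal point (e.g., the image center).

    For K = [[f, 0, cx], [0, f, cy], [0, 0, 1]], the back-projected rays
    K^-1 * v1 and K^-1 * v2 are orthogonal, which reduces to
    (v1 - p) . (v2 - p) + f^2 = 0 in pixel coordinates.
    """
    p = np.asarray(principal_point, dtype=float)
    d = np.dot(np.asarray(v1, dtype=float) - p, np.asarray(v2, dtype=float) - p)
    if d >= 0:
        raise ValueError("vanishing points are inconsistent with orthogonal directions")
    return float(np.sqrt(-d))

# Example: two vanishing points in a 640x480 image, principal point assumed
# to be the image center.
print(focal_from_orthogonal_vanishing_points((1200.0, 240.0), (-500.0, 250.0), (320.0, 240.0)))
```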
- 3D representation generation module 112 allows a user of system 100 to modify scene boundaries.
- the user may believe that the automatically identified scene boundaries do not accurately identify the actual scene boundaries.
- the user can provide various inputs to correct these inaccuracies, such as by moving vertices, moving lines, and so forth.
- a user can change the scene boundaries identified by the dashed lines by moving (e.g., dragging and dropping, and resizing as appropriate) one or more vertices of one or more of the dashed lines, by moving one or more of the dashed lines, and so forth.
- 3D representation generation module 112 allows a user of system 100 to modify vanishing points.
- the user may believe that the estimated vanishing points do not accurately identify the actual vanishing points.
- the user can provide various inputs to correct these inaccuracies, such as by moving vertices, moving lines, and so forth.
- lines used to estimate a vanishing point can be displayed, and a user can change the estimated vanishing point by moving (e.g., dragging and dropping) one or more vertices of a line used to estimate a vanishing point, by moving one or more of the lines used to estimate a vanishing point, and so forth.
- 3D representation generation module 112 allows a user to identify additional geometry in the scene that may be relevant to inserting objects.
- This additional geometry typically includes extruded geometry and occluding surfaces.
- Extruding geometry refers to geometry defined by a closed 2D curve that is extruded along some 3D vector.
- extruding geometry can include tables, chairs, desks, countertops, and so forth.
- the extruding geometry identified by a user is typically geometry on which a user desires to have an inserted object placed. For example, if the user desires to have an object inserted on top of a table in the scene, then the user identifies the table as extruding geometry in the scene.
- the extruding geometry can be converted to a 3D model and added to the representation of the 3D scene in a variety of conventional manners based on the bounding geometry and vanishing points identified as discussed above.
- FIG. 5 illustrates an example of user input identifying extruding geometry.
- an image 500 has drawn on a tabletop an outline 502 of the tabletop, which is illustrated with cross-hatching.
- the user can simply draw a line around the surface of the tabletop to identify the tabletop extruding geometry.
- occluding surfaces refer to surfaces that will occlude an inserted object if the object is inserted behind the occluding surface. Occluding surfaces can be included as part of any of a variety of different geometries, including furniture, books, boxes, buildings, and so forth. Various techniques can be used to create occlusion boundaries for objects. In one or more embodiments, occlusion boundaries for objects are created using an interactive spectral matting segmentation approach as discussed in Levin, A., Rav-Acha, A., and Lischinski, D., "Spectral Matting", IEEE Pattern Analysis and Machine Intelligence (October, 2008).
- User inputs can identify occluding surfaces in a variety of different manners, such as by a user providing inputs to scribble on or color in the interior and/or exterior of an object including an occluding surface.
- the depth of the object can be determined by assuming that the lowermost point on the boundary of the object is the contact point of the object with the floor. However, the depth of the object can alternatively be determined in different manners (e.g., based on whether the object is on the floor, ceiling, wall, etc.).
- a segmentation matte for the object is determined, which operates as a cardboard cutout in the scene—if an inserted object intersects the segmentation matte in the image space and is also further from the camera than the segmentation matte, then the object is occluded by the cutout.
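- To illustrate the occlusion rule above, here is a minimal sketch (not from the patent; the array names are illustrative and the soft matte is hard-thresholded for simplicity) that hides inserted-object pixels lying behind the cutout:

```python
import numpy as np

def composite_with_cutout(obj_rgb, obj_depth, obj_mask,
                          cutout_alpha, cutout_depth, background_rgb):
    """Per-pixel "cardboard cutout" occlusion test: an inserted-object pixel
    is hidden wherever it overlaps the segmentation matte in image space and
    lies farther from the camera than the matte.

    obj_rgb, background_rgb : H x W x 3 arrays
    obj_depth, cutout_depth : H x W distances from the camera
    obj_mask                : H x W boolean, True where the object is drawn
    cutout_alpha            : H x W matte values in [0, 1]
    """
    occluded = (cutout_alpha > 0.5) & (obj_depth > cutout_depth)
    visible = obj_mask & ~occluded
    out = background_rgb.copy()
    out[visible] = obj_rgb[visible]
    return out
```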
- FIG. 6 illustrates an example of user input identifying an occluding surface.
- a line 602 has been drawn around the exterior of the occluding surface (an ottoman) and a line 604 drawn around the interior of the occluding surface. Given these two lines 602 , 604 , the occluding surface can be readily determined.
- 3D representation generation module 112 is discussed as using both automatic and manual (based on user input) techniques for generating the 3D representation of the scene in the image. It should be noted, however, that 3D representation generation module 112 can alternatively use automatic techniques and not use manual techniques in generating the 3D representation of the scene in the image. Similarly, 3D representation generation module 112 can alternatively use manual techniques and not use automatic techniques in generating the 3D representation of the scene in the image.
- 3D representation generation module 112 generates a 3D representation of the scene in the image, as well as an estimation of the camera perspective for the scene in the image.
- Light source identification module 114 estimates, based on the image, the location of one or more light sources in the 3D representation of the scene generated by module 112. Based on these estimations, light source identification module 114 generates a lighting representation or light model of the scene in the image. Light source identification module 114 identifies both interior lighting and exterior lighting. Interior lighting refers to light from sources present in the scene in the image (e.g., lamps, light fixtures, etc.). Exterior lighting refers to light from sources that are external to (not included in) the scene in the image (e.g., sunlight shining through windows).
- user input is received indicating light sources in the scene in the image.
- This user input can take various forms, such as drawing a polygon around (outlining) the light source, scribbling over the light source, dragging and dropping (and resizing as appropriate) a polygon around the light source, and so forth.
- a polygon is projected onto the 3D representation of the scene generated by module 112 to define an area light source. If the user input is other than a polygon (e.g., scribbling over a light source) then a polygon is generated based on the user input (e.g., a rectangle including the scribbling is generated).
- shapes other than polygons can be projected onto the 3D representation of the scene, such as circles.
- Light source identification module 114 then automatically refines the locations of the light sources projected onto the 3D representation of the scene. Due to this refinement, the user input identifying a light source need not be exact. Rather, the user input can approximately identify the light source, and rely on light source identification module 114 to refine and correct the location. Light source identification module 114 can use various different objective or optimization functions, or other techniques, to identify a location of a light source given user input that approximately identifies a location of a light source.
- light source identification module 114 refines the locations of one or more light sources projected onto the 3D representation of the scene by choosing light parameters to minimize the squared pixel-wise differences between a rendered image (an image rendered using the current lighting parameter vector and 3D representation of the scene) and a target image (the original image). For example, light source identification module 114 seeks to minimize the objective:
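- The objective itself is not reproduced in this extraction; a plausible reconstruction, consistent with the symbol definitions that follow, is:

$$\min_{L} \sum_{i \in \text{pixels}} \alpha_i \left(R_i(L) - R^{*}_i\right)^2 \; + \sum_{j \in \text{params}} w_j \left(L_j - L^{0}_j\right)^2$$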
- R(L) refers to the rendered image parameterized by the current lighting parameter vector L
- R* refers to the target image
- L 0 refers to the initial lighting parameters (as identified from the user input indicating a light source)
- w refers to a weight vector that constrains lighting parameters near their initial values
- α refers to a per-pixel weighting that places less emphasis on pixels near the ground
- the “pixels” refer to pixels in the rendered and target images
- the “params” refer to parameters of the lighting parameter vectors L.
- α is set to 1 for pixels above the spatial midpoint of the scene (height-wise), and decreases quadratically from 1 to 0 (being set to 0 for pixels at the floor of the scene)
- Each lighting parameter vector L includes six scalars for each light source. These six scalars include a 3D position (e.g., a scalar for position along an x axis, a scalar for position along a y axis, and a scalar for position along a z axis) and pixel intensity (e.g., using an RGB color model, a scalar for intensity of the color red, a scalar for the intensity of the color green, and a scalar for the intensity of the color blue).
- Each light parameter is normalized to the range [0,1], and the weight vector w is set to 10 for spatial (3D position) parameters and is set to 1 for intensity (pixel intensity) values.
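- For concreteness, a minimal sketch of this refinement step is shown below; it is not the patent's implementation, and it assumes a black-box render(L) callable that renders the 3D representation under a flattened lighting parameter vector and returns an image the same shape as the target:

```python
import numpy as np
from scipy.optimize import minimize

def refine_lights(render, target, L0, alpha, w):
    """Refine lighting parameters by minimizing the per-pixel weighted squared
    difference between the rendered image and the target (original) image,
    plus a penalty keeping the parameters near their initial values L0.

    render : callable mapping a parameter vector L -> H x W x 3 image
    target : the target (original) image, H x W x 3
    L0     : initial parameters, six scalars per light (3D position + RGB
             intensity), each normalized to [0, 1]
    alpha  : H x W per-pixel weights (1 above the scene's spatial midpoint,
             falling toward 0 at the floor)
    w      : per-parameter weights (e.g., 10 for position, 1 for intensity)
    """
    def objective(L):
        diff = render(L) - target
        data_term = np.sum(alpha * np.sum(diff ** 2, axis=-1))
        prior_term = np.sum(w * (L - L0) ** 2)
        return data_term + prior_term

    # The renderer is treated as a black box, so use a gradient-free search
    # constrained to the normalized [0, 1] parameter range.
    result = minimize(objective, L0, method="Powell",
                      bounds=[(0.0, 1.0)] * len(L0))
    return result.x
```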
- materials for the geometry are estimated.
- the materials are estimated using an image decomposition algorithm to estimate surface reflectance (albedo), and the albedo is then projected onto the scene geometry as a diffuse texture map, as discussed in more detail below.
- lighting parameters could identify directional properties of the light source, distribution of light directions (e.g., the angle of a spotlight cone), and so forth.
- Light source identification module 114 uses an intrinsic decomposition technique to estimate the albedo and direct light from the original image (the image into which objects are to be inserted).
- the intrinsic decomposition technique used by the light source identification module 114 is as follows. First, module 114 determines indirect irradiance by gathering radiance values at each 3D patch of geometry in the 3D representation onto which a pixel in the initial image projects. The gathered radiance values are obtained by sampling observed pixel values from the original image, which are projected onto geometry along the camera perspective. This indirect irradiance image is referred to as Γ, and is equivalent to the integral in the radiosity equation.
- Light source identification module 114 decomposes an image B into albedo ρ and direct light D by solving the objective function:
- \(\underset{\rho,\,D}{\operatorname{argmin}} \sum_{i} \gamma_1 m_i \lvert \nabla \rho_i \rvert + \gamma_2 (1 - m_i) \lVert \nabla \rho_i \rVert^2 + \gamma_3 \lVert \nabla D_i \rVert^2 + (D_i - D^0_i)^2\), subject to \(B = D + \rho\Gamma,\; 0 \le \rho \le 1,\; 0 \le D\)  (3)
- γ1, γ2, and γ3 are weights
- m is a scalar mask taking large values where B has small gradients (and otherwise taking small values)
- D 0 is an initial direct lighting estimate.
- the scalar mask m is defined as a sigmoid applied to the gradient magnitude of B as follows:
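- The sigmoid itself is not reproduced in this extraction; one plausible form, consistent with m taking large values where the gradient magnitude of B is small (and with the values s=10.0 and c=0.15 given later), is:

$$m_i = \frac{1}{1 + e^{\,s\,(\lVert \nabla B_i \rVert - c)}}$$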
- the first two terms coerce ρ to be piecewise constant.
- the first term enforces an L1 sparsity penalty on edges in ρ
- the second term smoothes albedo where B's gradients are small.
- the last two terms smooth D while keeping D near the initial estimate D 0 .
- the value ρ is initialized using the color variant of Retinex, for example as discussed in Grosse, R., Johnson, M. K., Adelson, E. H., and Freeman, W. T., "Ground Truth Dataset and Baseline Evaluations for Intrinsic Image Algorithms", International Conference on Computer Vision (2009).
- the target image is set as the estimate of the direct term D and the 3D representation is rendered using just the direct lighting (as estimated by D).
- FIG. 7 illustrates an example of user input identifying interior lighting.
- boxes 702 and 704 have been drawn by a user to indicate the light sources.
- FIG. 8 illustrates an example of the refined light source locations.
- boxes 802 and 804 identify the interior light sources, having been refined from boxes 702 and 704 drawn by the user.
- light source identification module 114 also identifies exterior lighting. Exterior lighting or light shafts refer to light from sources that are not included in the scene, such as sunlight shining through windows or other openings, other light sources not included in the image, and so forth. Generally, exterior lighting is identified by identifying a 2D polygonal projection of a shaft of light and a direction of the shaft of light, as discussed in more detail below.
- a user input identifying the shafts of light visible in the scene in the image is received.
- This user input can identify the shafts of light visible in the scene in various manners, such as by the user drawing a bounding box or other polygon encompassing the shafts of light, by the user dragging and dropping (and resizing as appropriate) a polygon onto the shafts of light, by the user scribbling over the shafts of light, and so forth.
- a user input identifying sources of the shafts of light if visible in the scene in the image, is also received. This user input can identify the sources of the shafts of light in various manners, analogous to identifying the shafts of light visible in the scene.
- FIG. 9 illustrates an example image 900 including exterior lighting. Shafts of light are visible on the floor in the scene of image 900 , with the source of the shafts being openings in the ceiling in the scene of image 900 .
- FIG. 10 illustrates an example of user input identifying shafts of light and sources of the shafts of light.
- In image 1000, a box 1002 has been drawn encompassing the shafts of light on the floor.
- a box 1004 has been drawn encompassing the sources of the shafts of light in the ceiling.
- a shadow detection algorithm is used to determine a scalar mask that estimates the confidence that a pixel is not illuminated by a shaft.
- the confidence that a pixel is not illuminated by a shaft is estimated for each pixel in the box encompassing the shafts of light (and optionally additional pixels in the image).
- Various shadow detection algorithms can be used, such as the shadow detection algorithm discussed in Guo, R., Dai, Q., and Hoiem, D., “Single-image Shadow Detection and Removal Using Paired Regions”, IEEE Computer Vision and Pattern Recognition (2011), which models region based appearance features along with pairwise relations between regions that have similar surface material and illumination.
- a graph cut inference is then performed to identify the regions that have the same material and different illumination conditions, resulting in the confidence mask.
- the detected shadow mask is then used to recover a soft shadow matte.
- the geometry in the 3D representation of the image is then used to recover the shaft direction (the direction of the shafts of light).
- This shaft direction can be, for example, the direction defined by locations (e.g., midpoints) of the bounding boxes (the boxes encompassing the shafts of light and the sources of the shafts of light).
- the shafts of light or the sources of the shafts of light may not be visible in the image.
- a user input of an estimate of the shaft direction is received. For example, the user can draw an arrow on the image of the shaft direction, drag and drop (and change direction of as appropriate) an arrow on the image in the shaft direction, and so forth.
- shafts of light are represented as masked spotlights (e.g., from an infinitely far spotlight) in the shaft direction.
- the detected shadow mask is projected on the floor (or other surface the shaft of light illuminates) along the shaft direction to obtain the mapping on the wall (or ceiling, floor, etc.), and a modified shadow matte result generated by averaging a shadow matte recovered for the wall and a shadow matte recovered for the floor.
- the detected shadow mask is projected on the wall (or other surface that is the source of the shaft of light) opposite the shaft direction to obtain the mapping on the floor (or other surface the shaft of light illuminates), and a modified shadow matte result generated by averaging a shadow matte recovered for the wall and a shadow matte recovered for the floor.
- Light source identification module 114 also estimates the materials for the geometry in the 3D representation of the scene.
- This geometry includes the automatically identified geometry as well as user-specified geometry (e.g., extruded geometry and occluding surfaces as discussed above).
- Module 114 assigns a material to the geometry in the 3D representation based on the albedo estimated during the image decomposition discussed above.
- the estimated albedo is projected along the camera's view vector (the perspective of the camera) onto the estimated geometry, and the objects are rendered with a diffuse texture corresponding to the projected albedo. This projection can also apply to out-of-view geometry, such as a wall behind the camera or other hidden geometry.
- Light source identification module 114 is discussed as using both automatic and manual (based on user input) techniques for estimating light sources in the 3D representation of the scene. It should be noted, however, that light source identification module 114 can alternatively use automatic techniques and not use manual techniques in estimating light sources in the 3D representation of the scene. For example, rather than receiving user inputs identifying light sources, light fixtures can be automatically identified based on the brightness of pixels in the image (e.g., the brightness of pixels in the ceiling to identify ceiling light fixtures). Similarly, light source identification module 114 can alternatively use manual techniques and not use automatic techniques in estimating light sources in the 3D representation of the scene (e.g., not automatically refine the locations of the light sources projected onto the 3D representation of the scene).
- FIG. 11 is a flowchart illustrating an example process 1100 for generating a light model in accordance with one or more embodiments.
- Process 1100 can be implemented in software, firmware, hardware, or combinations thereof.
- Process 1100 is carried out by, for example, light source identification module 114 of FIG. 1 .
- Process 1100 is shown as a set of acts and is not limited to the order shown for performing the operations of the various acts.
- Process 1100 is an example process for generating a light model; additional discussions of generating a light model are included herein with reference to different figures.
- In process 1100, user input identifying one or more interior lighting sources in an image is received (act 1102).
- This user input can take various forms as discussed above.
- Locations of the one or more interior lighting sources are automatically refined (act 1104). These locations can be refined in different manners as discussed above.
- a soft shadow matte for the image is generated (act 1108 ).
- the soft shadow matte indicates an amount of shadowing of each pixel in the image as discussed above, and various matting methods can be used to generate the soft shadow matte as discussed above.
- a direction of the shaft of light is also determined (act 1110 ). This direction can be determined in different manners, as discussed above.
- a light model identifying the interior lighting and/or exterior lighting of the scene is maintained (act 1112 ).
- This light model can include indications of light sources (whether interior lighting sources or sources of shafts of light), directions of shafts of lights, and the soft shadow matte.
- FIG. 11 discusses acts performed for both interior lighting and exterior lighting. It should be noted that if an image includes no interior lighting, then acts 1102 and 1104 need not be performed. Similarly, if an image includes no exterior lighting, then acts 1106 , 1108 , and 1110 need not be performed.
- object insertion module 116 inserts one or more objects into the 3D representation of the scene.
- These one or more objects can be identified by a user of system 100 in various manners, such as dragging and dropping objects from an object collection, selecting objects from a menu, selecting objects (e.g., cutouts) from another image, and so forth.
- These one or more objects can alternatively be identified in other manners (e.g., by another module of system 100 , by another device or system, etc.).
- the location of each object within the scene is identified by the user (e.g., the location where the object is dropped in the scene) or by the other module, device, system, etc. that identifies the object.
- each object is a synthetic object, which refers to a 3D textured mesh.
- Such synthetic objects can be any type of material.
- objects can be other forms rather than a 3D textured mesh, such as a portion of an image (e.g., an object cutout or copied from another image).
- Rendering module 118 renders the 3D representation of the scene to generate a modified 2D image, which is the original image that was obtained and modified by insertion of the one or more objects.
- the rendering is based on various inputs including the 3D representation of the scene, as well as the light model indicating the interior and/or exterior lighting sources for the image, the estimated materials included in the image, and the soft shadow matte for the image.
- the 3D representation of the scene can be rendered, based on these inputs, in a variety of different conventional manners.
- the 3D representation of the scene is rendered using the LuxRender renderer (available at the web site “luxrender.net”).
- the rendered image is then composited back into the original image.
- the rendered image can be composited back into the original image in various manners, such as using the additive differential rendering method discussed in Debevec, P., “Rendering synthetic objects into real scenes: bridging traditional and image-based graphics with global illumination and high dynamic range photography”, Proceedings of the 25th annual conference on Computer graphics and interactive techniques, SIGGRAPH (1998).
- This additive differential rendering method generates a final composite image as follows: I final =M⊙I obj +(1−M)⊙(I b +I obj −I noobj) (5), where I final refers to the final composite image, I obj refers to a rendered image including inserted objects, I noobj refers to a rendered image without inserted objects, I b refers to the original image, M refers to an object mask (a scalar image that is 0 everywhere where no object is present, and (0, 1] otherwise), and ⊙ is the Hadamard product.
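- A direct transcription of equation (5) into NumPy (array names are illustrative) looks like this:

```python
import numpy as np

def additive_differential_composite(I_b, I_obj, I_noobj, M):
    """Composite per equation (5): I_final = M ⊙ I_obj + (1 - M) ⊙ (I_b + I_obj - I_noobj).

    I_b     : original image, H x W x 3
    I_obj   : rendering of the scene with the inserted objects
    I_noobj : rendering of the scene without the inserted objects
    M       : object mask in [0, 1], H x W or H x W x 3
    """
    if M.ndim == 2:
        M = M[..., None]  # broadcast the mask over the color channels
    return M * I_obj + (1.0 - M) * (I_b + I_obj - I_noobj)
```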
- an object can be animated.
- An animated object refers to an object that changes or moves over time. For example, an object may bounce or float around within a scene, an object may melt over time within a scene, and so forth.
- Such animated objects can be readily inserted into images using the techniques discussed above in various manners. For example, when the animated object changes (in appearance, location, etc.), the previous version of the object that was inserted into the 3D representation of the scene is removed from the 3D representation of the scene (e.g., by object insertion module 116 ), the new version of the object is inserted into the 3D representation of the scene, and a new modified image is generated by rendering the 3D representation of the scene with the inserted new version of the object.
- the previously generated 3D representation of the scene and light model can be re-used, and need not be regenerated each time the animated object changes.
- the content into which an object can be inserted can be an image, or alternatively a video in which a camera is panning around a scene or is otherwise moving.
- An object can be inserted into a video by selecting one frame of the video that will include the inserted object.
- a 3D representation of the scene, the light model indicating the interior and/or exterior lighting sources for the image, the estimated materials included in the image, and the soft shadow matte for the video are generated treating the selected frame as an image as discussed above.
- one or more additional frames of the video can also be selected, and a 3D representation of the scene, the light model indicating the interior and/or exterior lighting sources for the image, the estimated materials included in the image, and the soft shadow matte for the video are generated treating the selected frame as an image as discussed above.
- These one or more additional frames depict at least part of a scene not depicted in the other selected frames. Which frames are selected can be identified in different manners, although the frames are selected so as to be able to generate a 3D representation of the entire scene (or at least a threshold amount of the scene).
- content can be a video generated by panning a camera across a room.
- the portion of the room depicted in one frame can be (and oftentimes is due to the panning) different from the portion of the room depicted in other frames.
- a 3D representation of the entire room can be generated (as can the light model indicating the interior and/or exterior lighting sources for the selected frames, the estimated materials included in the selected frames, and the soft shadow matte for the selected frames, generated by treating the selected frames as images) rather than a 3D representation of only the portion of the room depicted in a single selected image.
- Camera matching or “match moving” techniques are used to determine the internal camera parameters (e.g., focal length and optical center) as well as camera extrinsics or external camera parameters such as the relative motion of the camera throughout the video (e.g., the camera position, rotation, and so forth for each frame in the video sequence).
- Various different conventional camera matching techniques or systems can be used to determine the internal and/or external camera parameters, such as the Voodoo Camera Tracker (available from digilab at the web site “digilab.uni-hannover.de”) or the boujou match moving software (available from Vicon of Oxford, UK).
- the 3D representation of the scene, the light model indicating the interior and/or exterior lighting sources for each selected frame, the estimated materials included in each selected frame, and the soft shadow matte generated based on each selected frame can be used to render the scene from each synthetic camera viewpoint as determined by the camera matching technique or system.
- the camera parameters are estimated based on vanishing points.
- the camera matching technique or system estimates a new set of camera parameters that may be different from the estimates based on the vanishing points.
- the 3D representation and light model are warped using a 4 ⁇ 4 projective (linear) transformation (a 3D homography). This homography can be obtained in various manners, such as by minimizing the squared distance of the re-projection of the vertices of the 3D representation onto the image plane under the determined homography.
- the homography is obtained as follows.
- a 3 ⁇ 4 projection matrix is generated that encodes various camera parameters.
- the encoded parameters can be, for example, camera position, rotation, focal length, and optical center, although other camera parameters can alternatively be used.
- This projection matrix is a transformation that maps 3D homogeneous coordinates in object space to 2D homogeneous coordinates in image space. In other words, the projection matrix can be used to determine where 3D geometry in the scene will project onto the 2D image.
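- In the standard pinhole formulation (general background, not specific to this patent), the mapping can be written up to scale as:

$$\lambda \begin{pmatrix} u \\ v \\ 1 \end{pmatrix} = P \begin{pmatrix} X \\ Y \\ Z \\ 1 \end{pmatrix}$$

where (X, Y, Z) is a point in object space and (u, v) is the corresponding pixel location in image space.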
- a projection matrix referred to as P is estimated by the camera matching technique or system discussed above, and a projection matrix P 0 is estimated based on the image (e.g., based on vanishing points) as discussed above.
- Various different objective or optimization functions, or other techniques, can be used to generate the 3D projective transformation H.
- an optimization procedure is used to find the 3D projective transformation H that reduces (e.g., minimizes) the difference in image-space reprojection error of the 3D representation of the scene using the new projection matrix P by applying the projective transformation H.
- An example of such an optimization procedure is:
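- The procedure itself is not reproduced in this extraction; a form consistent with the surrounding description and the vertex notation ν defined later would be:

$$\underset{H}{\operatorname{argmin}} \sum_{\nu \in \text{Vertices}} \bigl\lVert P_0\,\nu - P\,H\,\nu \bigr\rVert^{2}$$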
- By applying the transformation H to the geometric estimates (e.g., bounding geometry, light locations in the light model, etc.), the projection of these geometric estimates using projection matrix P will match (e.g., be the same as or approximately the same as) the projection of these geometric estimates using projection matrix P 0.
- each 3D vertex value is replaced by the product of H and the original vertex value (ν → Hν).
- each frame of the video can be rendered as an image as discussed above, although the camera parameters included in projection matrix P are used rather than the camera parameters included in projection matrix P 0 estimated based on the image (e.g., based on vanishing points).
- FIG. 12 is a block diagram illustrating an example computing device 1200 in which the inserting objects into content can be implemented in accordance with one or more embodiments.
- Computing device 1200 can be used to implement the various techniques and processes discussed herein.
- Computing device 1200 can be any of a wide variety of computing devices, such as a desktop computer, a server computer, a handheld computer, a laptop or netbook computer, a tablet or notepad computer, a personal digital assistant (PDA), an internet appliance, a game console, a set-top box, a cellular or other wireless phone, a digital camera, audio and/or video players, audio and/or video recorders, and so forth.
- Computing device 1200 includes one or more processor(s) 1202 , computer readable media such as system memory 1204 and mass storage device(s) 1206 , input/output (I/O) device(s) 1208 , and bus 1210 .
- Processor(s) 1202, at least part of system memory 1204, one or more mass storage devices 1206, one or more of I/O devices 1208, and/or bus 1210 can optionally be implemented as a single component or chip (e.g., a system on a chip).
- Processor(s) 1202 include one or more processors or controllers that execute instructions stored on computer readable media.
- the computer readable media can be, for example, system memory 1204 , mass storage device(s) 1206 , and/or other storage devices.
- Processor(s) 1202 may also include computer readable media, such as cache memory.
- the computer readable media refers to media for storage of information in contrast to mere signal transmission, carrier waves, or signals per se. However, it should be noted that instructions can also be communicated via various signal bearing media rather than computer readable media.
- System memory 1204 includes various computer readable media, including volatile memory (such as random access memory (RAM)) and/or nonvolatile memory (such as read only memory (ROM)).
- System memory 1204 may include rewritable ROM, such as Flash memory.
- Mass storage device(s) 1206 include various computer readable media, such as magnetic disks, optical discs, solid state memory (e.g., Flash memory), and so forth. Various drives may also be included in mass storage device(s) 1206 to enable reading from and/or writing to the various computer readable media. Mass storage device(s) 1206 include removable media and/or nonremovable media.
- I/O device(s) 1208 include various devices that allow data and/or other information to be input to and/or output from computing device 1200 .
- Examples of I/O device(s) 1208 include cursor control devices, keypads, microphones, monitors or other displays, speakers, printers, network interface cards, modems, lenses, CCDs or other image capture devices, and so forth.
- Bus 1210 allows processor(s) 1202, system memory 1204, mass storage device(s) 1206, and I/O device(s) 1208 to communicate with one another.
- Bus 1210 can be one or more of multiple types of buses, such as a system bus, PCI bus, IEEE 1394 bus, USB bus, and so forth.
- any of the functions or techniques described herein can be implemented using software, firmware, hardware (e.g., fixed logic circuitry), manual processing, or a combination of these implementations.
- the terms “module” and “component” as used herein generally represent software, firmware, hardware, or combinations thereof.
- the module or component represents program code that performs specified tasks when executed on a processor (e.g., CPU or CPUs).
- the program code can be stored in one or more computer readable media, further description of which may be found with reference to FIG. 12 .
- the module or component represents a functional block or other hardware that performs specified tasks.
- the module or component can be an application-specific integrated circuit (ASIC), field-programmable gate array (FPGA), complex programmable logic device (CPLD), and so forth.
Description
where R(L) refers to the rendered image parameterized by the current lighting parameter vector L, R* refers to the target image, L0 refers to the initial lighting parameters (as identified from the user input indicating a light source), w refers to a weight vector that constrains lighting parameters near their initial values, α refers to a per-pixel weighting that places less emphasis on pixels near the ground, the “pixels” refer to pixels in the rendered and target images, and the “params” refer to parameters of the lighting parameter vectors L. In one or more embodiments, α is set to 1 for pixels above the spatial midpoint of the scene (height-wise), and decreases quadratically from 1 to 0 (being set to 0 for pixels at the floor of the scene).
B = ρS, B = D + I, I = ρΓ, B = D + ρΓ (2)
where γ1, γ2, and γ3 are weights, m is a scalar mask taking large values where B has small gradients (and otherwise taking small values), and D0 is an initial direct lighting estimate. The scalar mask m is defined as a sigmoid applied to the gradient magnitude of B as follows:
where, for example, s=10.0 and c=0.15 (although other values for s and c can alternatively be used).
I final =M⊙I obj+(1−M)⊙(I b +I obj −I noobj) (5)
where Ifinal refers to the final composite image, Iobj refers to a rendered image including inserted objects, Inoobj refers to a rendered image without inserted objects, Ib refers to the original image, M refers to an object mask (a scalar image that is 0 everywhere where no object is present, and (0, 1] otherwise), and ⊙ is the Hadamard product.
where ν = (x, y, z, 1)^T is the 3D homogeneous coordinate of a given vertex of geometry in the 3D representation of the scene, and the "Vertices" refer to vertices in the geometry in the 3D representation of the scene.
Claims (30)
I final =M⊙I obj+(1−M)⊙(I b +I obj −I noobj).
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US13/314,723 US9330500B2 (en) | 2011-12-08 | 2011-12-08 | Inserting objects into content |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US13/314,723 US9330500B2 (en) | 2011-12-08 | 2011-12-08 | Inserting objects into content |
Publications (2)
Publication Number | Publication Date |
---|---|
US20130147798A1 (en) | 2013-06-13 |
US9330500B2 (en) | 2016-05-03 |
Family
ID=48571551
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US13/314,723 Active US9330500B2 (en) | 2011-12-08 | 2011-12-08 | Inserting objects into content |
Country Status (1)
Country | Link |
---|---|
US (1) | US9330500B2 (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20160247536A1 (en) * | 2013-02-20 | 2016-08-25 | Intel Corporation | Techniques for adding interactive features to videos |
US10089759B2 (en) * | 2016-02-01 | 2018-10-02 | Adobe Systems Incorporated | Homography-assisted perspective drawing |
US20180374271A1 (en) * | 2015-12-21 | 2018-12-27 | Thomson Licensing | Ket lights direction detection |
US10614583B2 (en) | 2016-10-31 | 2020-04-07 | Samsung Electronics Co., Ltd. | Image processing apparatus and method |
US10685498B2 (en) * | 2014-05-13 | 2020-06-16 | Nant Holdings Ip, Llc | Augmented reality content rendering via albedo models, systems and methods |
US20210342057A1 (en) * | 2015-06-01 | 2021-11-04 | Lg Electronics Inc. | Mobile terminal |
Families Citing this family (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
HU0900478D0 (en) * | 2009-07-31 | 2009-09-28 | Holografika Hologrameloeallito | Method and apparatus for displaying 3d images |
SG11201400429RA (en) * | 2011-09-08 | 2014-04-28 | Paofit Holdings Pte Ltd | System and method for visualizing synthetic objects within real-world video clip |
WO2014094880A1 (en) | 2012-12-21 | 2014-06-26 | Metaio Gmbh | Method for representing virtual information in a real environment |
US20140225922A1 (en) * | 2013-02-11 | 2014-08-14 | Rocco A. Sbardella | System and method for an augmented reality software application |
US9299188B2 (en) * | 2013-08-08 | 2016-03-29 | Adobe Systems Incorporated | Automatic geometry and lighting inference for realistic image editing |
US9996974B2 (en) * | 2013-08-30 | 2018-06-12 | Qualcomm Incorporated | Method and apparatus for representing a physical scene |
US9852238B2 (en) | 2014-04-24 | 2017-12-26 | The Board Of Trustees Of The University Of Illinois | 4D vizualization of building design and construction modeling with photographs |
US9704298B2 (en) | 2015-06-23 | 2017-07-11 | Paofit Holdings Pte Ltd. | Systems and methods for generating 360 degree mixed reality environments |
CN113516752A (en) * | 2015-09-30 | 2021-10-19 | 苹果公司 | Method, apparatus and program for displaying 3D representation of object based on orientation information of display |
US10706615B2 (en) * | 2015-12-08 | 2020-07-07 | Matterport, Inc. | Determining and/or generating data for an architectural opening area associated with a captured three-dimensional model |
US10134198B2 (en) | 2016-04-19 | 2018-11-20 | Adobe Systems Incorporated | Image compensation for an occluding direct-view augmented reality system |
JP7200678B2 (en) * | 2017-02-20 | 2023-01-10 | ソニーグループ株式会社 | Image processing device and image processing method |
US10403045B2 (en) * | 2017-08-11 | 2019-09-03 | Adobe Inc. | Photorealistic augmented reality system |
US11288412B2 (en) | 2018-04-18 | 2022-03-29 | The Board Of Trustees Of The University Of Illinois | Computation of point clouds and joint display of point clouds and building information models with project schedules for monitoring construction progress, productivity, and risk for delays |
WO2021133375A1 (en) * | 2019-12-23 | 2021-07-01 | Google Llc | Systems and methods for manipulation of shadows on portrait image frames |
US11682180B1 (en) * | 2021-12-09 | 2023-06-20 | Qualcomm Incorporated | Anchoring virtual content to physical surfaces |
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7193633B1 (en) * | 2000-04-27 | 2007-03-20 | Adobe Systems Incorporated | Method and apparatus for image assisted modeling of three-dimensional scenes |
US20070098290A1 (en) * | 2005-10-28 | 2007-05-03 | Aepx Animation, Inc. | Automatic compositing of 3D objects in a still frame or series of frames |
US20090027391A1 (en) * | 2007-07-23 | 2009-01-29 | Disney Enterprises, Inc. | Directable lighting method and apparatus |
US20120183204A1 (en) * | 2011-01-18 | 2012-07-19 | NedSense Loft B.V. | 3d modeling and rendering from 2d images |
Non-Patent Citations (49)
Title |
---|
Alnasser, Mais et al., "Image-Based Rendering of Synthetic Diffuse Objects in Natural Scenes", The 18th International Conference on Pattern Recognition, (2006), 4 pages. |
Barrow, Harry G., et al., "Recovering Intrinsic Scene Characteristics from Images", In Comp. Vision Sys, 1978, (Apr. 1978), 23 pages. |
Blake, Andrew "Boundary Conditions for Lightness Computation in Mondrian World", Computer Vision, Graphics and Image Processing, (May 22, 1985), pp. 314-327. |
Boivin, Samuel et al., "Image-Based Rendering of Diffuse, Specular and Glossy Surfaces from a Single Image", In Proc. ACM SIGGRAPH, (2001), pp. 107-110. |
Carroll, Robert et al., "Illumination Decomposition for Material Recoloring with Consistent Interreflections", ACM Trans. Graph. 30, (Aug. 2011), 9 pages. |
Cossairt, Oliver et al., "Light Field Transfer: Global Illumination Between Real and Synthetic Objects", ACM Trans. Graph. 27, (Aug. 2008), 2 pages. |
Criminisi, A. "Single View Metrology", Int. J. Comput. Vision 40, (Nov. 2000), 8 pages. |
Debevec, Paul "Rendering Synthetic Objects into Real Scenes: Bridging Traditional and Image-Based Graphics with Global Illumination and High Dynamic Range Photography", University of California at Berkeley; In SIGGRAPH 98, (Jul. 1998), 10 pages. |
Farenzena, M. et al., "Recovering Intrinsic Images Using an Illumination Invariant Image", In ICIP, (2007), 4 pages. |
Fournier, Alain et al., "Common Illumination between Real and Computer Generated Scenes", Technical Report; Department of Computer Science, University of British Columbia, Canada, (1992), 8 pages. |
Funt, Brian V., et al., "Recovering Shading from Color Images", ECCV '92, Second European Conference on Computer Vision, (May 1992), 14 pages. |
Furukawa, Yasutaka et al., "Accurate, Dense, and Robust Multi-View Stereopsis", IEEE PAMI 32, (Aug. 2010), 8 pages. |
Gibson, Simon, and Alan Murta. Interactive rendering with real-world illumination. Springer Vienna, 2000. * |
Greger, Gene et al., "The Irradiance Volume", IEEE Computer Graphics and Applications 18, (1998), 4 pages. |
Grosse, Roger et al., "Ground truth dataset and baseline evaluations for intrinsic image algorithms", In ICCV 2009, (2009), 8 pages. |
Guo, Ruiqi et al., "Single-Image Shadow Detection and Removal using Paired Regions", In CVPR, (2011), 8 pages. |
Hedau, Varsha et al., "Recovering the Spatial Layout of Cluttered Rooms", In ICCV 2009, (2009), 8 pages. |
Hoiem, et al., "Automatic Photo Pop-up", Proceedings of ACM SIGGRAPH, vol. 24 Issue 3, Jul. 2005, pp. 577-584. |
Horn, Berthold et al., "Determining Lightness from an Image", Computer Vision, Graphics and Image Processing 3, (1974), pp. 277-299. |
Horry, Youichi et al., "Tour into the Picture: Using a Spidery Mesh Interface to Make Animation from a Single Image", Proceedings of the 24th annual conference on Computer graphics and interactive techniques. ACM Press/Addison-Wesley Publishing Co., (1997), 9 pages. |
Kang, Hyung Woo, et al. "Tour into the picture using a vanishing line and its extension to panoramic images." Computer Graphics Forum. vol. 20. No. 3. Blackwell Publishers Ltd, 2001. * |
Kee, Eric et al., "Exposing Digital Forgeries from 3-D Lighting Environments", In WIFS, (Dec. 2010), 2 pages. |
Khan, Erum A., et al., "Image-Based Material Editing", ACM Trans. Graph. 25, (Jul. 2006), 10 pages. |
Lalonde, Jean-Francois et al., "Using Color Compatibility for Assessing Image Realism", In ICCV, (2007), 9 pages. |
Lalonde, Jean-Francois et al., "Webcam Clip Art: Appearance and Illuminant Transfer from Time-lapse Sequences", ACM Trans. Graph. 28, (2009), 10 pages. |
Lalonde, Jean-François, et al. "Photo clip art." ACM Transactions on Graphics (TOG). vol. 26. No. 3. ACM, 2007. * |
Land, Edwin H., et al., "Lightness and Retinex Theory", Journal of the Optical Society of America, vol. 61, No. 1, (Jan. 1971), 11 pages. |
Lee, David C., et al., "Estimating Spatial Layout of Rooms using Volumetric Reasoning about Objects and Surfaces", Advances in Neural Information Processing Systems (NIPS), (Dec. 1, 2010), 10 pages. |
Lee, David C., et al., "Geometric Reasoning for Single Image Structure Recovery", In CVPR, (2009), 8 pages. |
Levin, et al., "Spectral Matting", IEEE Conference on Computer Vision and Pattern Recognition, (Oct. 2008), 14 pages. |
Liebowitz, David et al., "Creating Architectural Models from Images", In Eurographics, vol. 18, (1999), 13 pages. |
Lopez-Moreno, Jorge et al., "Compositing Images Through Light Source Detection", Computers & Graphics 24, (2010), 10 pages. |
Loscos, Céline, George Drettakis, and Luc Robert. "Interactive virtual relighting of real scenes." Visualization and Computer Graphics, IEEE Transactions on 6.4 (2000): 289-305. * |
Merrell, Paul et al., "Interactive Furniture Layout Using Interior Design Guidelines", ACM Trans. Graph. 30, (Aug. 2011), 9 pages. |
Mury, Alexander A., et al., "Representing the light field in finite three-dimensional spaces from sparse discrete samples", Applied Optics 48, (Jan. 20, 2009), pp. 450-457. |
Oh, Byong M., "Image-Based Modeling and Photo Editing", 28th annual conference on Computer graphics and interactive techniques, SIGGRAPH Proceedings, (2001), 10 pages. |
Rother, Carsten "A new Approach to Vanishing Point Detection in Architectural Environments", IVC 20, 9-10, (Aug. 2002), 10 pages. |
Sato, Imari et al., "Illumination from Shadows", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. Y, No. Y (2003), 11 pages. |
Saxena, Ashutosh et al., "Make3D: Depth Perception from a Single Still Image", Proceedings of the 23rd National Conference on Artificial Intelligence, vol. 3, AAAI Press, (2008), 6 pages. |
Sinha, et al., "Interactive 3D Architectural Modeling from Unordered Photo Collections", ACM Transactions on Graphics, vol. 27, No. 5, Article 159, (Dec. 2008), pp. 159:1-159:10. |
Tappen, Marshall F., et al., "Estimating Intrinsic Component Images using Non-Linear Regression", In CVPR, vol. 2, (2006), 8 pages. |
Tappen, Marshall F., et al., "Recovering Intrinsic Images from a Single Image", IEEE Trans. PAMI 27, 9, (Sep. 2005), 8 pages. |
Wang, Yang et al., "Estimation of multiple directional light sources for synthesis of augmented reality images", Graphical Models 65, (Feb. 25, 2003), pp. 186-205. |
Weiss, Yair "Deriving Intrinsic Images from Image Sequences", In ICCV, II, (2001), 8 pages. |
Wu, Tai-Pang, et al. "Natural shadow matting." ACM Transactions on Graphics (TOG) 26.2 (2007): 8. * |
Yeung, Sai-Kit et al., "Matting and Compositing of Transparent and Refractive Objects", ACM Trans. Graph. 30, (2011), 14 pages. |
Yu, Lap-Fai et al., "Make it Home: Automatic Optimization of Furniture Arrangement", ACM Trans. Graph. 30, (Aug. 2011), 11 pages. |
Yu, Yizhou, et al. "Inverse global illumination: Recovering reflectance models of real scenes from photographs." Proceedings of the 26th annual conference on Computer graphics and interactive techniques. ACM Press/Addison-Wesley Publishing Co., 1999. * |
Zhang, Li et al., "Single View Modeling of Free-Form Scenes", Journal of Visualization and Computer Animation, vol. 13, No. 4, (2002), 9 pages. |
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20160247536A1 (en) * | 2013-02-20 | 2016-08-25 | Intel Corporation | Techniques for adding interactive features to videos |
US9922681B2 (en) * | 2013-02-20 | 2018-03-20 | Intel Corporation | Techniques for adding interactive features to videos |
US10685498B2 (en) * | 2014-05-13 | 2020-06-16 | Nant Holdings Ip, Llc | Augmented reality content rendering via albedo models, systems and methods |
US20210342057A1 (en) * | 2015-06-01 | 2021-11-04 | Lg Electronics Inc. | Mobile terminal |
US11934625B2 (en) * | 2015-06-01 | 2024-03-19 | Lg Electronics Inc. | Mobile terminal |
US20180374271A1 (en) * | 2015-12-21 | 2018-12-27 | Thomson Licensing | Key lights direction detection |
US10657724B2 (en) * | 2015-12-21 | 2020-05-19 | Thomson Licensing | Key lights direction detection |
US10089759B2 (en) * | 2016-02-01 | 2018-10-02 | Adobe Systems Incorporated | Homography-assisted perspective drawing |
US10614583B2 (en) | 2016-10-31 | 2020-04-07 | Samsung Electronics Co., Ltd. | Image processing apparatus and method |
US11386566B2 (en) | 2016-10-31 | 2022-07-12 | Samsung Electronics Co., Ltd. | Image processing apparatus and method |
Also Published As
Publication number | Publication date |
---|---|
US20130147798A1 (en) | 2013-06-13 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US9330500B2 (en) | Inserting objects into content | |
US10325399B2 (en) | Optimal texture memory allocation | |
Du et al. | DepthLab: Real-time 3D interaction with depth maps for mobile augmented reality | |
US9613454B2 (en) | Automatic geometry and lighting inference for realistic image editing | |
Siltanen | Diminished reality for augmented reality interior design | |
US11210838B2 (en) | Fusing, texturing, and rendering views of dynamic three-dimensional models | |
Karsch et al. | Rendering synthetic objects into legacy photographs | |
CN115699114B (en) | Method and apparatus for image augmentation for analysis | |
Zhou et al. | Plane-based content preserving warps for video stabilization | |
Zollmann et al. | Image-based ghostings for single layer occlusions in augmented reality | |
JP2019525515A (en) | Multiview scene segmentation and propagation | |
US20240062345A1 (en) | Method, apparatus, and computer-readable medium for foreground object deletion and inpainting | |
CN110866966A (en) | Rendering virtual objects with realistic surface properties matching the environment | |
WO2023066121A1 (en) | Rendering of three-dimensional model | |
US9471967B2 (en) | Relighting fragments for insertion into content | |
Du et al. | Video fields: fusing multiple surveillance videos into a dynamic virtual environment | |
Liu et al. | Static scene illumination estimation from videos with applications | |
Unger et al. | Spatially varying image based lighting using HDR-video | |
Liao et al. | Illumination animating and editing in a single picture using scene structure estimation | |
WO2023024395A1 (en) | Method and apparatus for model optimization, electronic device, storage medium, computer program, and computer program product | |
Li et al. | Guided selfies using models of portrait aesthetics | |
Liu et al. | Fog effect for photography using stereo vision | |
US11893207B2 (en) | Generating a semantic construction of a physical setting | |
Kim et al. | Planar Abstraction and Inverse Rendering of 3D Indoor Environments | |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AS | Assignment |
Owner name: NATIONAL SCIENCE FOUNDATION, VIRGINIA Free format text: CONFIRMATORY LICENSE;ASSIGNOR:UNIVERSITY OF ILLINOIS URBANA-CHAMPAIGN;REEL/FRAME:027439/0410 Effective date: 20111209 |
|
AS | Assignment |
Owner name: THE BOARD OF TRUSTEES OF THE UNIVERSITY OF ILLINOIS Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:KARSCH, KEVIN;HEDAU, VARSHA CHANDRASHEKHAR;FORSYTH, DAVID A;AND OTHERS;SIGNING DATES FROM 20120109 TO 20120124;REEL/FRAME:027650/0899 |
|
STCF | Information on status: patent grant |
Free format text: PATENTED CASE |
|
AS | Assignment |
Owner name: NAVY, SECRETARY OF THE UNITED STATES OF AMERICA, VIRGINIA Free format text: CONFIRMATORY LICENSE;ASSIGNOR:ILLINOIS, UNIVERSITY OF;REEL/FRAME:044311/0618 Effective date: 20111209 |
|
MAFP | Maintenance fee payment |
Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1551); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY Year of fee payment: 4 |
|
MAFP | Maintenance fee payment |
Free format text: PAYMENT OF MAINTENANCE FEE, 8TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1552); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY Year of fee payment: 8 |