EP3072289A1 - A light field processing method - Google Patents

A light field processing method

Info

Publication number
EP3072289A1
Authority
EP
European Patent Office
Prior art keywords
light field
camera
plane
ray
representation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP13798624.6A
Other languages
German (de)
English (en)
French (fr)
Inventor
Laurent RIME
Bernard Maxvell ARULRAJ
Keishi NISHIDA
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Vidinoti SA
Original Assignee
Vidinoti SA
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Vidinoti SA filed Critical Vidinoti SA
Publication of EP3072289A1

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60Control of cameras or camera modules
    • H04N23/67Focus control based on electronic image sensor signals
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60Control of cameras or camera modules
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00Details of television systems
    • H04N5/222Studio circuitry; Studio devices; Studio equipment
    • H04N5/2224Studio circuitry; Studio devices; Studio equipment related to virtual studio applications
    • H04N5/2226Determination of depth image, e.g. for foreground/background separation
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00Details of television systems
    • H04N5/222Studio circuitry; Studio devices; Studio equipment
    • H04N5/262Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects ; Cameras specially adapted for the electronic generation of special effects
    • H04N5/2628Alteration of picture size, shape, position or orientation, e.g. zooming, rotation, rolling, perspective, translation
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00Details of television systems
    • H04N5/222Studio circuitry; Studio devices; Studio equipment
    • H04N5/262Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects ; Cameras specially adapted for the electronic generation of special effects
    • H04N5/272Means for inserting a foreground image in a background image, i.e. inlay, outlay

Definitions

  • The present invention concerns a light field processing method.

Description of related art
  • Light fields are often captured with plenoptic cameras.
  • Examples of plenoptic cameras include the Lytro camera.
  • Each plenoptic camera generates data representing the captured light field in a camera dependent format.
  • The Lytro camera represents the light field by a series of matrices; each matrix includes a plurality of cells indicating the intensity of light reaching the micro-lenses from various directions. The number of cells corresponds to the number of micro-lenses.
  • A light field processing method for processing data corresponding to a light field, comprising:
  • converting said data into a camera independent representation;
  • processing said converted data so as to generate processed data representing a different light field.
  • Fig. 1A to 1E schematically represent different parametrization methods for light fields.
  • Fig. 2 illustrates two ray values coming from the same physical point (on the illustration on the left side) and recomputed using the two-planes representation (on the illustration on the right side).
  • The U-V plane represents the device main lens plane.
  • The Rx-Ry plane represents the observed real world.
  • Fig. 3 illustrates two ray values coming from two different physical points B, C lying respectively before and after the focal plane 14 (on the illustration on the left side) and recomputed using the two-planes representation (on the illustration on the right side).
  • Fig. 4 illustrates a first example of plenoptic camera 1 design.
  • Fig. 5 and 6 illustrate a second example of plenoptic camera 1 design.
  • Fig. 7 illustrates a third example of plenoptic camera 1 design.
  • Fig. 8 illustrates a process for determining the parameters of an unknown plenoptic camera device from the plenoptic representation of a known reference image, here a checkerboard.
  • Fig. 9 illustrates a two-plane representation of a light field from a scene with objects at different distances.
  • Fig. 10 illustrates a first method for determining the depth of each point of a scene, using triangulation between a plurality of rays from the same point.
  • Fig. 11 illustrates light rays emitted by a single physical point A intersecting the two planes Rx-Ry and U-V.
  • Fig. 12 illustrates an epipolar line appearing in a U-Rx plot.
  • Fig. 13 shows an example of a Gaussian filter for the U-V plane, which diffuses light rays passing through a single point (Rx,Ry) and hitting the U-V plane.
  • Fig. 14 illustrates light ray blurring by a Gaussian filter for the U-V plane.
  • Fig. 15 illustrates a process of object resizing.
  • Fig. 16 briefly shows the schematic for a perpendicular plane translation.

Detailed Description of possible embodiments of the Invention

Definitions
  • Object Focal Plane: plane in a scene which is parallel to the camera main lens and on which the plenoptic camera is focused.
  • Image Focal Plane: plane within a camera which is parallel to the camera main lens and onto which physical points lying on the Object Focal Plane are projected in focus.
  • Focal Plane: when neither "Object" nor "Image" is specified, either the Object Focal Plane or the Image Focal Plane.
  • A plenoptic function is a function which describes a light field with multiple parameters as its arguments.
  • a typical plenoptic function represents the radiance of light emitted from a given position (x,y,z) in 3D space, and observed at a given position (u,v) on a 2D plane, at a given time and wavelength.
  • A plenoptic function P, which represents the intensity of a light ray, takes the following form:
  • P = P(x, y, z, u, v, t, λ), where t and λ are the observation time and the wavelength respectively.
  • P = P(x, y, z, θ, φ, t, λ).
  • Not all 7 parameters are mandatory. For example, if all the light rays in a scene are stationary (i.e. t is constant, such as in a still plenoptic picture) and with a single wavelength λ, the 7D plenoptic function mentioned above can be reduced to a 5D function. Moreover, assuming that the rays travel through transparent air without being blocked by any object, the radiance of a light ray remains constant along its linear path. As a consequence, a light ray can be fully parameterized by four parameters. For instance, a light ray can be represented with the positions of two intersecting points on two pre-defined surfaces.
  • the parameterisation method for characterizing each light ray of a light field with four parameters preferably takes the plenoptic camera design into account in order to represent the captured light field meaningfully and be processed easily.
  • representing light field with parallel planes might be straightforward for common plenoptic camera comprising a main lens, a micro-lens array and a sensor plane arranged parallel to each other.
  • A parametrization method independent of a particular camera design is selected for representing each light ray of a light field. This way, a common parametrization method can be used for representing a light field captured with different types or designs of cameras.
  • Fig. 1A illustrates a parametrisation method of a light field with two planes.
  • A ray ri is characterized by the positions where it crosses two planes U-V and Rx-Ry, which are parallel to each other.
  • the position on a plane is based on the Cartesian coordinate system for example, or on a polar coordinate system.
  • (Ui, Vi) is the position where ray ri crosses the first plane U-V and (Rxi, Ryi) is the position where this ray ri crosses the second plane Rx-Ry.
  • The radiance P is determined uniquely from the four parameters Ui, Vi, Rxi, Ryi. Taking into account the z axis, the corresponding ray (x, y, z) is obtained as (x, y, z) = (Ui, Vi, 0) + k · (Rxi − Ui, Ryi − Vi, 1), where k is a parameter that can take any real positive value.
  • This method is well-suited for plenoptic cameras having an array of micro-lenses and a sensor plane parallel to each other.
  • One drawback of this representation is that it can't represent light rays which travel parallel to the planes U-V, Rx-Ry.
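As a minimal sketch of the two-plane parametrization (assuming, for illustration, that the U-V plane sits at z = 0 and the Rx-Ry plane at z = 1, consistent with the unit plane separation used elsewhere in the text), four numbers map to a 3D line:

```python
import numpy as np

def point_on_ray(u, v, rx, ry, k):
    """Return the 3D point at parameter k on the ray crossing the
    U-V plane (z = 0) at (u, v) and the Rx-Ry plane (z = 1) at (rx, ry).
    The plane positions are an assumption for illustration."""
    origin = np.array([u, v, 0.0])
    direction = np.array([rx - u, ry - v, 1.0])
    return origin + k * direction
```

Because the z-component of the direction is fixed to 1, a ray travelling parallel to the two planes has no finite (u, v, rx, ry) parameters, which is the drawback noted above.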
  • Fig. 1B illustrates a parametrisation method of a light field with two spheres s1, s2 which circumscribe each other.
  • The two spheres s1, s2 are tangent to each other.
  • A ray ri is parameterized by the outgoing intersecting point (θ1, φ1) with a first sphere s1 and the outgoing intersecting point (θ2, φ2) with the second sphere s2, circumscribed with the first sphere at the first intersecting point (θ1, φ1).
  • (θ1, φ1) is the spherical coordinate with respect to the first sphere and (θ2, φ2) is the spherical coordinate with respect to the second sphere.
  • the ray r is obtained as the line passing through the two points:
  • This representation is useful in the case of a plenoptic image captured by an array of cameras arranged on a sphere. This type of camera is typically used for capturing street views. Another advantage of this representation is that all the light rays which intersect the spheres can be described with this representation. However, rays which do not intersect this sphere can't be represented.
  • Fig. 1C illustrates a parametrisation method of a light field with one single sphere s. It uses the two intersecting points (θ1, φ1), (θ2, φ2) of each ray with the sphere s. Assuming that the radius of the sphere s is large enough for the light field, all the rays can be characterized by four angular parameters (θ1, φ1), (θ2, φ2). A ray is obtained as
  • Fig. 1 D illustrates a parametrisation method of a light field with one sphere s and one plane P.
  • A ray ri is represented with the intersecting point (x, y) with the plane P and the angle (θ, φ) of the ray with respect to the sphere coordinate.
  • The plane P is chosen perpendicular to the ray ri and passes through the center of the sphere, such that its normal can be represented by a position on a directional sphere.
  • This sphere-plane representation can represent light rays from any position towards any direction, whether or not they cross the sphere, in contrast to the representations mentioned above. However, the conversion from the sphere-plane representation to Cartesian coordinates is more complex than for the previous representations.
  • A ray r is represented with the following four parameters: r, θ, φ, ψ.
  • r is the distance between the origin of the coordinate system and the closest point A on the ray.
  • (θ, φ) is the coordinate of the closest point A in spherical coordinates.
  • ψ is the angle of the ray within the plane p in which the ray lies, where the plane is perpendicular to the vector from the origin to the closest point A.
  • Data conversion from one representation to another one can be used to facilitate data processing. For instance, it might be hard to apply a depth
  • processing a light field might comprise a step of converting a light field representation from a first camera independent
  • plenoptic data is a set of light rays represented as lines
  • the conversion from one representation to another one is equivalent to the conversion of the parameters of lines in a coordinate system to the corresponding parameters in another coordinate system.
  • ⁇ . ⁇ 1> ⁇ 2 ⁇ ⁇ 2 ) P(R xl Ry, U, V)
  • the U-V plane could correspond to the plenoptic camera device 1 main lens 10 plane (i.e. the micro-cameras main lens plane in the case of a Pelican Imaging camera).
  • the Rx-Ry plane is parallel to the U-V plane; it is a normalized version of the object focal plane(s) 14 of the plenoptic cameras at the instant of the capture.
  • Fig. 2 illustrates two rays ri, rj coming from the same physical point A (on the illustration on the left side) and recomputed using the two-planes representation (on the illustration on the right side).
  • the U-V plane represents the normalized camera device main lens 10 plane.
  • the Rx-Ry plane represents the normalized scene (real world) 14.
  • Fig. 2 illustrates two rays ri, rj coming from a physical point A.
  • The registered light field data contain the intensity and the direction of all light rays. That stored light field data has, however, the inconvenience of being device dependent.
  • The physical point A on the focal plane is seen by the plenoptic camera device 1 through the two rays ri, rj, whose intensities might differ in the case where that physical point reflects different rays depending on the angle of view (principle of a non-lambertian surface).
  • Both rays ri, rj come from the focal plane 14 and each of them has a specific intensity and direction. The fact that they come from the same physical point A is not known anymore. Some algorithms will be described later to match rays with physical points, and hence to derive depth information.
  • Rx (resp. Ry) is the coordinate on the Rx-Ry Plane in the x (resp. y) direction where one ray intersects the plane.
  • U and V correspond to the intersection of one ray with the U-V Plane.
  • Fig. 3 illustrates two ray values coming from two different physical points B, C lying respectively before and after the focal plane 14 (on the illustration on the left side) and recomputed using the two-planes representation (on the illustration on the right side).
  • the capture device 1 does not know where the physical point lies.
  • A point might for example be before, on, or after the focal plane, and still generate the same light ray on the camera.
  • This plenoptic camera device 1 comprises a main lens 10 which focuses light rays ri, rj on an array of micro-lenses 12 right in front of the camera sensor plane 13.
  • Reference 14 is the object focal plane and the main lens plane is designated with U-V.
  • The Rx-Ry plane represents the scene at a distance 1 from the camera main lens plane U-V. Since the main lens 10 focuses on the micro-lens array 12, rays ri, rj intersecting on the micro-lens array 12 also intersect on the focal plane 14 of the camera.
  • Each micro-lens forms on the sensor 13 a micro-image that does not overlap with the neighbouring micro-image.
  • The focal lengths of all micro-lenses are the same.
  • The micro-lenses 12 are significantly small compared to the main lens (for example about 300 times smaller) and placed at a distance such that the main lens 10 is at the optical infinity of the micro-lenses. This design gives the interesting property that directions of light rays reaching a same micro-lens correspond to different view angles of a physical point belonging to a focused object in the scene.
  • each physical point of a focused object sees all its light rays captured by a single micro-lens and therefore stored on the sensor 13 in a single micro-image, each pixel of the micro-image corresponding to a different ray direction of that physical point.
  • Each micro-image on the sensor 13 plane corresponds to one micro-lens and has coordinates X and Y. Each pixel within a micro-image has coordinates P and Q. Each micro-image is indexed relatively to the optical axis. Pixels in a given micro-image are indexed relatively to the micro-lens optical axis.
  • Nx (resp. Ny) corresponds to the number of micro-images in the x (resp. y) direction.
  • Np (resp. Nq) corresponds to the number of pixels within a micro-image in the x (resp. y) direction.
  • the ray r hits a micro-lens 120 identified with its (X;Y) coordinates.
  • The selected pixel 130 within a micro-image where the ray ri hits is described using its (Pi,Qi) coordinates.
  • The area within the main lens 10 where the ray passes through is identified with its (U;V) coordinates.
  • The intersection of the Rx-Ry plane with the ray ri hitting the main lens 10 of the device at (U;V) with a specific direction is described using (Rx;Ry) coordinates.
  • Coordinates on the Rx-Ry plane (Rx;Ry) and coordinates on the main lens (U;V) have to be determined using known device parameters, which are the micro-lens coordinates (X;Y) where the ray is
  • A transformation for transforming the captured ray expressed using device dependent parameters to a device independent plane-plane representation can be formalized as follows:
  • offsetx = -sign(X) × 1/2 × (1 - mod(Nx, 2))
  • U = (-P + offsetp) × mainlens sizex / Np
  • V = (-Q + offsetq) × mainlens sizey / Nq
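A sketch of this kind of device-to-two-plane transformation. The half-cell offset formula follows the text; the scaling of U, V, Rx, Ry is a hypothetical normalization for the Fig. 4 design (one micro-image per focused scene point, one pixel per view direction), not the patent's exact formula:

```python
import math

def offset(i, n):
    """Half-cell offset -sign(i) * 1/2 * (1 - n mod 2): zero when the grid
    has an odd number of cells (a central cell sits on the axis),
    +/- 1/2 when the cell count is even."""
    if i == 0:
        return 0.0
    return -math.copysign(0.5, i) * (1 - n % 2)

def device_to_two_plane(X, Y, P, Q, Nx, Ny, Np, Nq, lens_size=1.0):
    """Hypothetical mapping from micro-image index (X, Y) and pixel (P, Q)
    to a (U, V, Rx, Ry) two-plane sample. All scalings are assumptions."""
    U = (-P + offset(P, Np)) * lens_size / Np   # pixel -> position on main lens
    V = (-Q + offset(Q, Nq)) * lens_size / Nq
    Rx = (X + offset(X, Nx)) / Nx               # micro-lens -> scene position
    Ry = (Y + offset(Y, Ny)) / Ny
    return U, V, Rx, Ry
```

The offset term recenters even-sized grids so that indices are relative to the optical axis, as the text requires.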
  • The plenoptic capture device 1 of Fig. 5 comprises an array of micro-cameras 16 whose lenses are aligned on a same plane U-V and preferably equidistant to each other.
  • These micro-cameras 16 are thin and therefore could be integrated within mobile devices such as portable computers, palmtops, smartphones or similar devices.
  • Different camera types with different focal lengths f1, f2 could be used, such that this plenoptic camera captures more angular information.
  • Each micro-camera captures a subview of a scene from a slightly different position and focal length. The light field is therefore created by combining the images of the different micro-cameras.
  • the reference 19 designates a synthetic optical axis from where all positions are computed in the formulas.
  • Each micro-camera 16 captures a subview of the scene. By aligning the micro-camera plane 160 with the U-V plane, each micro-camera captures the rays hitting a specific U-V coordinate. This corresponds to considering only the rays hitting a specific U-V coordinate but coming from all possible Rx-Ry coordinates, i.e., looking at the scene from a specific position on the U-V plane.
  • Since every micro-camera 16 has a different focal length f1, f2, ..., the focal planes 14 need to be normalized individually in order to form the Rx-Ry plane.
  • Each micro-camera 16 can be defined by its coordinates X and Y, each pixel within a micro-camera is described using P and Q. Furthermore,
  • Nx (resp. Ny) corresponds to the number of micro-cameras in the x (resp. y) direction and Np (resp. Nq) corresponds to the number of pixels within a micro-camera in the x (resp. y) direction.
  • Each micro-camera 16 is indexed relatively to the synthetic optical axis 19. Pixels position for each micro-camera is also converted relative to that synthetic optical axis. Computed Rx Ry positions on Rx-Ry plane and U V positions on U-V plane are also relative to that axis.
  • Each captured ray ri can be represented on both planes: in the U-V plane with a pair of coordinates (U;V), and in the Rx-Ry plane using (Rx;Ry) coordinates.
  • The ray first hits the micro-camera U-V plane, described using (U;V) coordinates. Then, that ray hits the sensor 13 at specific coordinates (P;Q) describing the position of the ray within the selected micro-image.
  • Fig. 7 illustrates an example of plenoptic camera 1 design that could correspond to the plenoptic camera proposed by Raytrix (registered trademark).
  • This camera 1 comprises a main lens 10 focusing the light rays ri, rj on the image focal plane 15 within the camera.
  • An array of micro-lenses 12 is focused on the image focal plane 15 and located behind it.
  • the micro-lenses 12 then converge the rays on the camera sensor 13.
  • Each micro-lens looks at the scene of the image focal plane 15 with a different view angle.
  • A point A in focus on the object focal plane 14 is therefore imaged on the image focal plane 15, which is observed from different view positions by the micro-lenses 12.
  • Several, for example three, different focal lengths are used for the micro-lenses. Therefore they focus on three different image focal planes 15, which results in increased captured angular information.
  • Each micro-image on the sensor plane 13 might be identified by its coordinates X and Y; each pixel within a micro-image has coordinates P and Q. Furthermore,
  • Nx (resp. Ny) corresponds to the number of micro-images in the x (resp. y) direction and Np (resp. Nq) corresponds to the number of pixels within a micro-image in the x (resp. y) direction.
  • Each micro-image is indexed relatively to a main lens optical axis and pixels in a given micro-lens are indexed relatively to the micro-lens optical axis.
  • That ray is first captured using device parameters.
  • The ray first hits the main lens plane 10, considered as the U-V plane, described using (U;V) coordinates.
  • That ray then hits a specific micro-lens 12 described using (X;Y). Then, it hits the sensor 13 at specific coordinates (P;Q) describing the position of the ray within the selected micro-image.
  • object coord = image focal distance × (dist object focal plane / dist mainlens-sensor) × (coord selected microlens − coord selected pixel on sensor)
  • coord y,selected pixel on sensor = coord y,selected microlens + (sensor size y / Ny) × (Y + offsety)
  • offsetq = -sign(Q) × 1/2 × (1 - mod(Nq, 2))
  • offsety = -sign(Y) × 1/2 × (1 - mod(Ny, 2))
  • A conversion function can still be acquired by measuring the characteristics of the camera system. For example, one can measure how a scene is captured and stored in a plenoptic camera by using a reference scene whose parameters are perfectly known.
  • A plenoptic image 21 of a checkerboard 20 captured with the unknown plenoptic camera 1 could be used to determine the parameters of the camera 1. For instance, if one knows that the design of the camera 1 model is identical to the design of a known camera but only its focal distance is unknown, one can infer the focal distance by moving the reference image along the optical axis and finding the position where all the rays from the same physical point compose one single micro-image.
  • a 2D image with a specific width (W) and height (H) is composed of W*H number of pixels.
  • a 1 D image can also be converted to a light field.
  • The process is the same as above except that we only consider one dimension (e.g. Ix) instead of two (e.g. Ix-Iy).
  • each frame can also be converted into a light field following the exact same principle as described above.
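Such a conversion can be sketched as follows, assuming each pixel behaves as a lambertian emitter: every pixel of the W×H image becomes a point on the Rx-Ry plane emitting identical rays towards every sampled (U, V) position. The sampling grid and function names here are illustrative, not the patent's:

```python
import numpy as np

def image_to_light_field(img, n_u=4, n_v=4):
    """Convert an HxW grayscale image into rays (u, v, rx, ry, intensity).
    Each pixel is treated as a lambertian point on the Rx-Ry plane, so the
    same intensity is replicated for all (u, v) samples on the U-V plane."""
    h, w = img.shape
    ry, rx = np.meshgrid(np.linspace(0, 1, h), np.linspace(0, 1, w),
                         indexing="ij")
    rays = []
    for u in np.linspace(-0.5, 0.5, n_u):
        for v in np.linspace(-0.5, 0.5, n_v):
            rays.append(np.stack([np.full((h, w), u), np.full((h, w), v),
                                  rx, ry, img], axis=-1).reshape(-1, 5))
    return np.concatenate(rays)  # shape (H*W*n_u*n_v, 5)
```

The lambertian assumption is exactly the "simple case" the text mentions for 3D models: intensity is independent of the viewing direction.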
  • a 3D model can also be converted into a light field.
  • One first approach is to integrate the 3D model as 3D points in the independent representation.
  • The ray intensities in the different directions are defined by the 3D model parameters. In a simple case, all the ray intensities would be the same in every direction (lambertian model), but that might not always be the case.
  • one converts a 3D model into a camera independent representation only when willing to make use of it.
  • Less memory space is required, but some latency might be introduced due to the on-demand conversion of the 3D model for some post-processing.
  • This camera-centered representation can be processed to be centered on any object of the scene. Indeed, light rays come from scene objects such as light sources or other objects reflecting direct or indirect light. In order to process the captured and transformed light fields, it is useful to have such a scene-centered representation so as to be able to modify the light rays when an object is added, removed or modified.
  • For example, in augmented reality processing, one often needs to augment the captured light field with some virtual/artificial visual information. In the present description, augmented reality also comprises situations where one actually removes an object from a scene, which is sometimes called diminished reality.
  • Throughout the next section we use the example of a representation of the light field with the above described two-plane parametrization, but the same method can be equally applied to other representations such as, for instance, sphere-plane.
  • Fig. 9 illustrates a two-plane representation of a light field from a scene 20.
  • Four rays r1 to r4 are illustrated.
  • The first two, r1 and r2, correspond to an in-focus point A in the focal plane Rx-Ry of the camera 1, whereas the last two, r3 and r4, correspond to a point B not in focus, at a different distance.
  • If this 3D point does not exist in the scene-centric representation, create a new point in the scene-centric representation and attach to it a new ray emitted from this point, having the same direction as the current ray in the camera independent representation.
  • the other properties of the ray, e.g. intensity
  • the output of this transform is a point cloud having rays attached to each point.
  • Each ray has a specific color intensity describing the color of the physical object, lit with the current light, seen from the viewpoint centered on the ray. This fully describes the scene geometry as well as its visual appearance.
  • The main principle behind the first method illustrated in Fig. 10 is to identify which rays r1, r2, ..., rn come from the same physical point A. As soon as two rays (here r1 and r2) have been identified as corresponding to the same physical point A, the depth can be inferred using triangulation. This gives a depth estimate relative to the representation parameters. If absolute depth is needed, one also needs the parameters linking the light field representation to the physical world.
  • In Fig. 10, one physical point A emits two rays, r1 and r2.
  • a third ray r 3 not emitted by this point is also illustrated.
  • A method for assessing whether two rays correspond to (i.e. are emitted by) the same physical object or point could be based on the assumption that the physical object surfaces are totally lambertian, i.e., the light reflected by the object is the same in all directions for a given point on the object.
  • This requires a similarity measure which defines how well two rays represent the same object. This actually assesses the visual appearance, or more precisely the intensity, of each ray.
  • One possible measure is the absolute difference, defined as d(r1, r2) = |P(r1) − P(r2)|.
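A sketch of the triangulation step, under the illustrative two-plane convention used earlier (U-V plane at z = 0, Rx-Ry plane at z = 1; these positions are an assumption): once two rays have been matched by the similarity measure, the point where they (nearly) intersect gives the depth estimate.

```python
import numpy as np

def ray(u, v, rx, ry):
    """Origin on the U-V plane (z = 0), direction towards (rx, ry, 1)."""
    p = np.array([u, v, 0.0])
    d = np.array([rx - u, ry - v, 1.0])
    return p, d

def triangulate(r1, r2):
    """Midpoint of the shortest segment between two matched rays;
    its z coordinate is the depth estimate relative to the U-V plane."""
    (p1, d1), (p2, d2) = r1, r2
    # solve p1 + t1*d1 ~= p2 + t2*d2 in the least-squares sense
    A = np.stack([d1, -d2], axis=1)                      # 3x2 system
    t, *_ = np.linalg.lstsq(A, p2 - p1, rcond=None)
    q1, q2 = p1 + t[0] * d1, p2 + t[1] * d2
    return (q1 + q2) / 2.0

# two rays emitted by the same physical point at (0.5, 0, 2)
A_est = triangulate(ray(0.0, 0.0, 0.25, 0.0), ray(0.4, 0.0, 0.45, 0.0))
```

Using the midpoint of the shortest segment (rather than an exact intersection) keeps the estimate well defined when noisy rays are skew.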
  • The second method is based on the epipolar image representation of the light field, and more precisely on epipolar lines. We present below the method
  • a light field might be represented in Rx-Ry-U-V representation.
  • a physical point A is placed at distance d from the Rx-Ry plane with the offset h x in the direction parallel to the Rx axis.
  • the distance between the Rx plane and the U plane is Δ, which equals 1 in our two-plane representation
  • the object A emits rays ri.
  • A ray intersects both the Rx-Ry and the U-V plane, but the positions of intersection are slightly different depending on the angle of the ray.
  • The Radon transform R of a function f(x, y) is defined as: R(ρ, θ) = ∬ f(x, y) δ(x cos θ + y sin θ − ρ) dx dy
  • Algorithm 1 The Hough (Radon) Transform Algorithm
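A minimal numeric sketch of the epipolar-line idea: with the geometry of Fig. 11 (plane separation Δ, point at depth d and offset hx), the samples in the U-Rx plot lie on a line u = rx · (1 + Δ/d) − hx · Δ/d, so estimating the line angle (here with a tiny Hough/Radon-style search over θ) recovers the depth. The slope-to-depth relation is derived under the assumed plane positions and may differ from the patent's exact convention:

```python
import numpy as np

def depth_from_epipolar(rx, u, delta=1.0, n_theta=4000):
    """Find the line angle theta minimising the spread of
    rho = rx*cos(theta) + u*sin(theta): for a single epipolar line this
    is where the Hough/Radon accumulator peaks. Then invert the slope
    relation du/drx = 1 + delta/d to get the depth d."""
    thetas = np.linspace(0.01, np.pi - 0.01, n_theta)
    spreads = [np.std(rx * np.cos(t) + u * np.sin(t)) for t in thetas]
    theta = thetas[int(np.argmin(spreads))]
    slope = -np.cos(theta) / np.sin(theta)
    return delta / (slope - 1.0)

# synthetic epipolar samples for a point at depth d = 2, offset hx = 0.3
rx = np.linspace(-1.0, 1.0, 50)
u = rx * 1.5 - 0.3 * 0.5        # slope 1 + 1/2, intercept -hx/d
d_est = depth_from_epipolar(rx, u)
```

A full Hough transform would also bin ρ to handle several lines at once; minimising the spread of ρ is the degenerate single-line case.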
  • This scene-centric representation makes it possible to perform modifications and/or improvements to the captured light field in a way that is compliant with the physics of the scene.
  • The first is about improving the image quality of a specific scene element. It is often desirable to select an element of an image, such as an object or a well delimited part of the scene, and to make a specific part of an image brighter, darker, with more contrast or with different exposure settings. In order to apply the wanted picture correction in a
  • the second example is about doing augmented reality in the light field space.
  • Augmented reality is about changing the scene with additional/other information. Therefore there is a direct "physical " link between the change in the scene and the content to be added.
  • the example use case given here is to replace an object, such as an existing building, with another object, such as a newer building to be built up.
  • the newer object takes the form of a computer generated virtual 3D model with textures and surface information such as reflectiveness and so on. The goal is to have this 3D model perfectly put in the light field capture so that it replaces the current object.
  • The user selects the main frontage of an object, such as a building, in the captured light field scene. This creates an anchor point in the point cloud representation of the scene.
  • the system places the virtual 3D model in the scene so that the main
  • The system can infer which rays represent the object in the light field capture, and therefore which rays have to be replaced by the ones representing the virtual 3D model.
  • The system merges the virtual 3D model rays with those of the scene to create a near-real representation of the scene with the new object artificially put into it.
  • the above procedure can be used to add / change objects in a light field.
  • In an augmented reality scenario there are usually two steps involved: 1) based on a single light field capture, a user A modifies the scene by either removing elements or by adding new elements directly linked with the physics of the scene, and 2) a user B takes another light field capture (or a video of it, or a continuous real-time capture) so that the scene is automatically changed (or annotated per se) based on the previous user's input.
  • the first step can be done using the above 4-steps procedure.
  • The second one involves being able to register the light field captured by user A with the one captured by user B. After the registration, the mapping between scene A rays and scene B rays is exactly known.
  • Step 4 of the above procedure can therefore be automatically applied to create an augmented version of user B's scene based on user A's modifications of the scene. This is called light field based augmented reality.
  • Removing an object from a scene requires a certain knowledge of the scene. For example, removing a cube from a scene requires the method to know how the background looks behind the cube (with respect to the camera plane position).
  • each light ray emitting point as a 6D vector, 3 dimensions
  • The algorithm could also quantize them to have one intensity per light ray emitting direction, out of a set of predefined directions. Setting the number of directions to N would create an N+3 dimensional vector representing the emitting point.
  • the result of the clustering is made of clusters, each one representing a
  • The last step of the algorithm is to assign, to each ray of the scene, an object identifier corresponding to the cluster to which the ray belongs.
  • The object to be removed has to be chosen. This can be done for example by presenting to the user an image of the scene and allowing him to click on a region. By back-projecting the click onto the scene, we can know to which object the click has been applied and therefore identify the rays belonging to that object.
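The clustering and ray-labelling steps can be sketched as follows. A tiny k-means on the 6D emitting-point vectors stands in for whatever clustering the method actually uses, and the 6D layout (3 position coordinates + 3 colour intensities) is an assumption; all names are illustrative:

```python
import numpy as np

def kmeans(points, k=2, iters=20, seed=0):
    """Minimal k-means over the emitting-point vectors."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iters):
        dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = dists.argmin(axis=1)
        # keep a center unchanged if its cluster went empty
        centers = np.stack([
            points[labels == j].mean(axis=0) if np.any(labels == j)
            else centers[j] for j in range(k)])
    return labels

def select_object(points, labels, click_xyz):
    """Back-project a user 'click': the object identifier is the cluster
    label of the emitting point nearest to the clicked 3D position."""
    nearest = ((points[:, :3] - click_xyz) ** 2).sum(axis=1).argmin()
    return labels[nearest]
```

Every ray inherits the label of its emitting point, so selecting a cluster selects all the rays of that object.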
  • a light field scene captured by a plenoptic camera has a specific angular and spatial resolution. Both resolutions are mainly due to the camera intrinsic parameters.
  • a similar scene taken with two different sets of plenoptic camera parameters might have different angular and spatial resolutions. Even assuming the same spatial resolution, the perspective of a scene captured with different camera parameters, such as a short focal length lens versus an average focal length lens, might differ.
  • the photographer wants the foreground to have a similar perspective as in the first image (hence the physical capture position has to be adjusted) but the background with another perspective. For that, he needs to physically move around to be able to capture such a visual effect, which is of importance in photography.
  • the last step of the diminished reality processing is to transform the rays identified in the previous step to make them appear as if they were emitted by the background of the object to remove.
  • the main idea of inpainting a light field is to recover the missing information of the light field.
  • the missing information is the area represented by the rays of the selected object. Those rays can be removed and the light field reconstructed by assuming that those object rays are missing. This problem is stated more formally as follows, assuming that the missing region D corresponds to the rays of the object, which are beforehand removed from the captured light field named F.
  • Cropping of a plenoptic image corresponds to selecting a range for each of the four parameters Rx, Ry, U, V.
  • plenoptic cropping allows cropping images on the U-V plane as well as the Rx-Ry plane. By setting a range for each parameter Rx, Ry, U, V, one can select a subset of the light rays from the entire set of light rays.
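The range-based selection described above can be sketched as a boolean mask over a list of rays; the (M, 5) row layout (Rx, Ry, U, V, intensity) and the function name are assumptions for illustration, not part of the patent:

```python
import numpy as np

def plenoptic_crop(rays, rx_range, ry_range, u_range, v_range):
    """Select the subset of rays whose (Rx, Ry, U, V) fall in the given ranges.

    `rays` is an (M, 5) array with columns Rx, Ry, U, V, intensity.
    Each range is an inclusive (lo, hi) pair.
    """
    rx, ry, u, v = rays[:, 0], rays[:, 1], rays[:, 2], rays[:, 3]
    keep = ((rx_range[0] <= rx) & (rx <= rx_range[1]) &
            (ry_range[0] <= ry) & (ry <= ry_range[1]) &
            (u_range[0] <= u) & (u <= u_range[1]) &
            (v_range[0] <= v) & (v <= v_range[1]))
    return rays[keep]

rays = np.array([[0.1, 0.2, 0.0, 0.0, 1.0],
                 [0.9, 0.2, 0.0, 0.0, 0.5],   # Rx outside the crop window
                 [0.3, 0.4, 0.8, 0.0, 0.7]])  # U outside the crop window
cropped = plenoptic_crop(rays, (0.0, 0.5), (0.0, 0.5), (-0.5, 0.5), (-0.5, 0.5))
```

Only the first ray survives: the other two fall outside the Rx and U ranges respectively.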
  • a possible implementation of cropping is angle-based cropping, which restricts the viewing angle of an object.
  • angle-based cropping takes as input the 3D position (x, y, z) of the attached object and two angles (θ, φ) restricting the viewing area, and outputs the corresponding ranges of Rx-Ry and U-V.
  • the range of the Rx-Ry plane is determined as:
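The original formula is not reproduced in this text. Purely as a hedged sketch, assuming a pinhole-style geometry with the Rx-Ry plane at depth 0 and the U-V plane at depth d (both parallel), one plausible derivation of these ranges is:

```python
import math

def angle_crop_ranges(x, y, z, theta, phi, d=1.0):
    """Rx-Ry and U-V ranges for rays leaving point (x, y, z) within half-angles
    theta (horizontal) and phi (vertical) of the plane normal.

    Assumed geometry: Rx-Ry plane at depth 0, U-V plane at depth d, planes
    parallel; a ray at angle a crosses Rx = x - z*tan(a) and U = x - (z - d)*tan(a).
    """
    tx, ty = math.tan(theta), math.tan(phi)
    rx_range = (x - z * tx, x + z * tx)
    ry_range = (y - z * ty, y + z * ty)
    u_range = (x - (z - d) * tx, x + (z - d) * tx)
    v_range = (y - (z - d) * ty, y + (z - d) * ty)
    return rx_range, ry_range, u_range, v_range

rx, ry, u, v = angle_crop_ranges(0.0, 0.0, 2.0, math.radians(10), math.radians(10))
```

The footprint on the Rx-Ry plane grows with the object depth z and the half-angle θ, which matches the intent of restricting the viewing area around the attached object.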
  • Ray intensity of a plenoptic image can be modified globally and locally.
  • Global ray intensity modification allows the user to adjust the brightness, color balance, contrast, saturation, etc. of a plenoptic image; the modification is applied to all the rays uniformly. More advanced processing, such as automatic image enhancement by analyzing and optimizing color histograms, can also be performed on the plenoptic image.
  • Local ray intensity modification allows the user to select a region of interest of a plenoptic image in terms of both scene (i.e. the Rx-Ry plane) and viewing point (i.e. the U-V plane) and then apply one of the modifications listed above within the selected region.
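A minimal sketch of global versus local intensity modification on a 4D array L[Rx, Ry, U, V]; the array shape, the 1.2 brightness gain, and the selected slices are arbitrary choices for illustration:

```python
import numpy as np

# L[Rx, Ry, U, V] holds one intensity per ray in the two-plane representation.
L = np.random.default_rng(0).uniform(0.2, 0.8, size=(8, 8, 4, 4))

# Global modification: a brightness gain applied uniformly to every ray.
L_bright = np.clip(L * 1.2, 0.0, 1.0)

# Local modification: the same gain restricted to a region selected on both
# the Rx-Ry plane (scene region) and the U-V plane (viewing region).
L_local = L.copy()
rx_sel, ry_sel = slice(2, 6), slice(2, 6)
u_sel, v_sel = slice(0, 2), slice(0, 2)
L_local[rx_sel, ry_sel, u_sel, v_sel] = np.clip(
    L_local[rx_sel, ry_sel, u_sel, v_sel] * 1.2, 0.0, 1.0)
```

Rays outside the selected (Rx, Ry, U, V) region are left untouched by the local variant.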
  • a low-pass filter such as a Gaussian blurring filter is interpreted as a diffusion of light rays in the light field.
  • Filtering 2D images is represented as the convolution of an image with a 2D filter element; likewise, filtering plenoptic data is represented as the convolution of a plenoptic image with a 4D filter element.
  • H denotes the filter element.
  • Fig. 13 shows an example of a Gaussian filter for the U-V plane, which diffuses light rays passing through a single point (Rx, Ry) and hitting the U-V plane.
  • the object A filtered with filter F looks blurred as depicted by A' in Fig. 14.
  • objects near the Rx-Ry plane become less blurred and those far from the plane become more blurred.
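The U-V Gaussian diffusion of Fig. 13 can be sketched as a separable convolution applied only along the U and V axes of the 4D array; the kernel radius and σ are illustrative assumptions:

```python
import numpy as np

def gaussian_kernel(radius, sigma):
    x = np.arange(-radius, radius + 1, dtype=float)
    k = np.exp(-x**2 / (2 * sigma**2))
    return k / k.sum()                       # normalised 1D Gaussian

def filter_uv(L, sigma=1.0, radius=2):
    """Convolve each (Rx, Ry) fibre of the 4D light field with a separable
    Gaussian on the U-V axes: the rays leaving one scene point are diffused
    over neighbouring viewing positions."""
    k = gaussian_kernel(radius, sigma)
    out = np.apply_along_axis(lambda a: np.convolve(a, k, mode='same'), 2, L)
    out = np.apply_along_axis(lambda a: np.convolve(a, k, mode='same'), 3, out)
    return out

L = np.zeros((2, 2, 9, 9))
L[0, 0, 4, 4] = 1.0                # a single ray bundle from one scene point
Lf = filter_uv(L, sigma=1.0)
```

The unit impulse on the U-V plane is spread over its neighbours while its total energy is preserved, which is the "diffusion of light rays" reading of a low-pass filter given above.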
  • the resizing process R for resizing object A illustrated in Fig. 15 transforms a value on an axis into the product of that value and a scaling factor.
  • Fig. 17 shows the schematic of resizing the Rx-Ry and U-V planes to a half size.
  • Light rays of a captured scene are parametrized by Rx, Ry, U, V in the two-plane representation, where the U-V plane represents the viewing position.
  • the captured light field might be taken with a specific object focal plane. Since rays coming from different directions are captured from the same physical points, the rays can be rearranged so as to recreate refocusing.
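One common way to realize this rearrangement is the standard shift-and-sum scheme (a well-known light field technique, sketched here as an illustration rather than as the method claimed in this application):

```python
import numpy as np

def refocus(L, shift):
    """Shift-and-sum refocusing: rearrange the rays of a 4D light field
    L[Rx, Ry, U, V] so that a new object plane comes into focus, then average
    over the viewing positions. `shift` is the per-view pixel displacement
    (0 keeps the original focal plane)."""
    nrx, nry, nu, nv = L.shape
    out = np.zeros((nrx, nry))
    for u in range(nu):
        for v in range(nv):
            du = int(round(shift * (u - (nu - 1) / 2)))
            dv = int(round(shift * (v - (nv - 1) / 2)))
            out += np.roll(np.roll(L[:, :, u, v], du, axis=0), dv, axis=1)
    return out / (nu * nv)

L = np.random.default_rng(1).random((16, 16, 3, 3))
sharp = refocus(L, shift=0)      # average of all views, original focal plane
refocused = refocus(L, shift=1)  # synthetic focus on a nearer/farther plane
```

With shift = 0 the result is simply the mean over all viewing positions; non-zero shifts align the views on a different depth plane before averaging.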
  • a light field is composed of a finite number of parameters.
  • a ray is described by 4 parameters for its intersections with the U-V and Rx-Ry planes, plus its ray intensity.
  • the coordinate values of the 4 intersection parameters can correspond to different representations, as for instance the two-plane representation.
  • Two plenoptic data sets with different representations can be merged or fused by converting the second data set to its equivalent in the first representation and fusing the two data sets in the first representation.
  • the sampling of the converted data might not match that of the data set it is merged with. In this case, quantization may need to be applied to the plenoptic data to cope with the different samplings.
  • each parameter is assigned to a small bin whose size corresponds to the distance between two sampling points on a coordinate axis. For instance, if the number of samples on the Rx axis is 640, then the area to be merged on the Rx axis is divided into 640 bins and the Rx value of each ray which hits the area is quantized into one of the bins. It might happen that two different light rays are quantized to the same bin, in which case the rays falling into that bin are merged into a single quantized ray.
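The binning step can be sketched as follows; merging by averaging the intensities of rays sharing a bin is an illustrative choice, since the text does not specify how such rays are combined:

```python
import numpy as np

def quantize_and_merge(rays, rx_min, rx_max, n_bins=640):
    """Quantize the Rx value of each ray into one of `n_bins` equal bins over
    the merge area [rx_min, rx_max); rays that fall into the same bin are
    merged by averaging their intensities (an assumed merge rule)."""
    rx, intensity = rays[:, 0], rays[:, 1]
    width = (rx_max - rx_min) / n_bins
    idx = np.clip(((rx - rx_min) // width).astype(int), 0, n_bins - 1)
    merged = np.full(n_bins, np.nan)          # NaN marks empty bins
    for b in np.unique(idx):
        merged[b] = intensity[idx == b].mean()
    return idx, merged

rays = np.array([[0.1003, 0.4],
                 [0.1004, 0.6],   # quantized into the same bin as the ray above
                 [0.9003, 0.9]])
idx, merged = quantize_and_merge(rays, 0.0, 1.0, n_bins=640)
```

The first two rays land in one bin and collapse to a single merged intensity, illustrating the loss of distinction between rays quantized to the same bin.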
  • Since the intensity value of a light ray in the two-plane representation is parameterized using 4 parameters (e.g. Rx, Ry, U, V), we can store the entire captured light field by storing all the intensity values and their corresponding parameters.
  • the 4 parameters can take any real values.
  • An intensity value can be defined for each color red, green and blue, or any other value in other color representations.
  • a light field can be stored in a matrix-like format, where each row corresponds to a light ray and each column corresponds to a parameter or the intensity value respectively. Assuming that a light ray has one intensity value, the size of the matrix equals 5 (i.e. 4 parameters + 1 intensity value) times the number of light rays to be stored.
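A minimal sketch of this matrix-like storage (the values are arbitrary; the three-channel extension is an illustration of the per-color intensity variant mentioned above):

```python
import numpy as np

# Each row stores one ray: Rx, Ry, U, V, intensity  ->  an (n_rays, 5) matrix.
rays = np.array([
    [0.10, 0.20, -0.5, 0.3, 0.81],
    [0.12, 0.21, -0.4, 0.3, 0.79],
    [0.50, 0.90,  0.1, 0.0, 0.33],
])

n_rays, n_cols = rays.shape
assert n_cols == 4 + 1          # 4 parameters + 1 intensity value

# With one intensity per colour channel the row grows to 4 + 3 = 7 columns;
# here the single intensity is simply replicated into R, G, B for illustration.
rgb_rays = np.hstack([rays[:, :4], np.tile(rays[:, 4:5], (1, 3))])
```

The matrix size is exactly 5 times the number of stored rays in the single-intensity case, as the bullet above states.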
  • a traditional image format can be used.
  • the image is composed of pixels, each pixel representing a ray hitting the U-V plane.
  • the 2D cartesian coordinate system of the image is directly mapped to the U-V plane coordinate system, making the relation between the U-V plane and this image storage completely straightforward.
  • the number of pixels of the image corresponds directly to the sampling rate of the U-V plane, which is equal to the number of rays hitting that plane.
  • a format to efficiently store a light field can be constructed for another type of representation, such as the spherical representation, by exploiting the characteristics of that representation.
  • Visualization of stored light rays is a necessary step to enable humans to perceive the scene represented by the rays.
  • although visualization can be performed in various ways, for instance holographic visualization, we consider in this section, without loss of generality, ordinary 2D visualization (i.e. rendering), which visualizes a scene as one or more 2D images.
  • the stored light field in our example of the two-plane representation can be visualized by projecting the 4D light rays which hit a certain viewing position onto a 2D surface.
  • a 2D image is generated by searching for the intersecting point of each ray with the surface and assigning its intensity value into the corresponding pixel of the 2D image.
  • the simplest example is the rendering of light field stored in the Rx-Ry, U-V representation.
  • the Rx-Ry plane corresponds to the surface where light rays are projected (i.e. the image plane), and a point on the U-V plane corresponds to the viewing position.
  • the viewing position can be placed at an arbitrary position.
  • the change of the viewing position is called perspective shift.
  • perspective shift is conducted by changing the viewing point (U,V) to another viewing point (U',V') on the U-V plane. Rendering a light field with perspective shift produces the visual effect of the camera translating to a new position.
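With a discretized light field, rendering from a viewing position and a perspective shift reduce to slicing the 4D array; this sketch assumes integer (U, V) sample indices rather than arbitrary positions:

```python
import numpy as np

def render(L, u, v):
    """Render a 2D image from the light field L[Rx, Ry, U, V] by keeping only
    the rays that pass through viewing position (u, v) on the U-V plane and
    projecting them onto the Rx-Ry image plane."""
    return L[:, :, u, v]

L = np.random.default_rng(2).random((32, 32, 5, 5))
view_a = render(L, 2, 2)      # central viewing position
view_b = render(L, 4, 2)      # perspective shift: (U, V) -> (U', V)
```

Rendering from non-sampled viewing positions would additionally require interpolating between neighbouring (U, V) samples, which this sketch deliberately omits.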
  • light field rendering can be used for more advanced use cases.
  • a 3D view of a scene is simulated by generating two images from two viewing positions separated by the interpupillary distance and displaying one to the right eye and the other to the left eye respectively.
  • the generated stereoscopic images can be displayed using techniques such as shutter 3D systems and autostereoscopy.
  • a plenoptic viewer can be used to present the data to the user as a rendered 2D image.
  • the change of focal point can be done by directly clicking on a point in the presented 2D image to adjust the focus on that point.
  • the user could use a scroll wheel to change the focal point in the scene. This is visually similar to scanning the image, where the points in focus are sharp and the rest are blurry. Let us note that this ability to refocus the image at a focal distance makes it possible to see behind objects which are hidden by occlusions. From a user perspective, this is similar to scanning a scene along the different focal planes regardless of whether occlusions are present or not. This is a powerful property: using such a plenoptic viewer, one can see through bushes, or through a dense volume of particles.
  • a plenoptic viewer can also render different views by changing the view point in the 3D space.
  • the view point change can for instance be triggered by a click-and-drag action of the user's mouse on the scene. This way, the plenoptic viewer changes the view position according to the new position of the mouse until the mouse button is released.
  • keyboard keys could be used as triggers to change the view position depending on the pressed keys. A simple example would be to use the keyboard arrows for this action.
  • the user may drag and drop an annotation on the 2D viewer.
  • the annotation can be taken from a list of possible annotations, or uploaded to the viewer or created on-the-fly.
  • the selected annotation appears integrated in the plenoptic viewer once the system has merged its rays properly with the scene rays.
  • the user can then apply some transforms to it directly in the plenoptic scene environment.
  • the transforms can be 3D translation, rotation or scaling. Those transforms are for instance triggered by buttons or keyboard keys.
  • the merging of the rays between the annotation and the scene is done directly at the annotation level as the viewer directly works with the rays information.
  • the recorded plenoptic data can also be visualised in a 3D viewer.
  • the recorded scene can be shifted and manipulated in three dimensions, permitting intuitive navigation of the scene space by the user. Since only a part of the real world has been captured, the reconstructed 3D scene might contain gaps where scene data is missing.
  • All captured rays may be used to compute that 3D map.
  • Each generated coloured pixel will be positioned into that 3D space.
  • A plenoptic image has the key feature of being refocusable after capture: the user can select which areas are in focus. The stored data can therefore also be seen as a stack of images, each with a different sharpest area. Using each image's focal distance, their relative positions can be known. All images from the stack are used to compute the 3D map: for each image, only the pixels from the sharpest areas are considered and, since the image's focal distance is known, those pixels can be repositioned into the 3D map. The computed map is composed of pixels from multiple images, positioned on several planes, giving an impression of depth. More advanced reconstruction techniques can further refine this map.
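A hedged sketch of this depth-from-focal-stack idea, using per-pixel gradient energy as a simple sharpness measure (the text hints at more advanced methods; this shows only the basic principle, on synthetic data):

```python
import numpy as np

def depth_from_focal_stack(stack, focal_distances):
    """Depth from a focal stack: each pixel gets the focal distance of the
    slice in which it appears sharpest (largest gradient energy)."""
    sharpness = []
    for img in stack:
        gy, gx = np.gradient(img)            # simple per-pixel contrast measure
        sharpness.append(gx**2 + gy**2)
    best = np.argmax(np.stack(sharpness), axis=0)
    return np.asarray(focal_distances)[best] # per-pixel depth map

# Two synthetic slices of a sinusoidal texture: in `near` the right half is
# attenuated (out of focus), in `far` the left half is.
x = np.linspace(0, 8 * np.pi, 64)
texture = np.tile(np.sin(x), (64, 1))
near = texture.copy(); near[:, 32:] *= 0.1
far = texture.copy();  far[:, :32] *= 0.1
depth = depth_from_focal_stack([near, far], [0.5, 2.0])
```

Each pixel is assigned the focal distance of the slice where it is sharpest, yielding a coarse per-pixel map on several depth planes, as described above.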
  • Microscopy is probably the field where using plenoptic technology is currently the most appropriate.
  • Standard optical systems fail to present efficient solutions due to optical limitations (reduced depth of field, too long light exposure for cells or neurons). For instance, as the analysis of cells is a fastidious process, being able to annotate, for instance by labelling cells, is of strong interest.
  • plenoptic imaging increases depth of field (by a factor of 6).
  • In case of occlusions, plenoptic imaging can resolve the information at different layers, where other depth devices cannot.
  • plenoptic can resolve the information at different layers for a better 3D trajectory analysis.
  • plenoptic can resolve the information at different layers for a better 3D analysis.
  • the various means, logical blocks, and modules may include various hardware and/or software component(s) and/or module(s), including, but not limited to, a circuit, an application specific integrated circuit (ASIC), a general purpose processor, a digital signal processor (DSP), a field programmable gate array (FPGA) or other programmable logic device (PLD), discrete gate or transistor logic, discrete hardware components, or any combination thereof designed to perform the functions described herein.
  • determining encompasses a wide variety of actions. For example, "determining" may include calculating, computing, processing, deriving, investigating, looking up (e.g., looking up in a table, a database or another data structure), ascertaining and the like.
  • determining may include receiving (e.g., receiving information), accessing (e.g., accessing data in a memory) and the like. Also, “determining” may include resolving, selecting, choosing, establishing and the like.
  • a software module may reside in any form of storage medium that is known in the art. Some examples of storage media that may be used include random access memory (RAM), read only memory (ROM), flash memory, EPROM memory, EEPROM memory, registers, a hard disk, a removable disk, a CD-ROM and so forth.
  • a software module may comprise a single instruction, or many instructions, and may be distributed over several different code segments, among different programs, and across multiple storage media.
  • a software module may consist of an executable program, a portion or routine or library used in a complete program, a plurality of interconnected programs, an "app" executed by a smartphone, tablet or computer, a widget, a Flash application, a portion of HTML code, etc.
  • a storage medium may be coupled to a processor such that the processor can read information from, and write information to, the storage medium. In the alternative, the storage medium may be integral to the processor.
  • a database may be implemented as any structured collection of data, including a SQL database, a set of XML documents, a semantic database, a set of information available over an IP network, or any other suitable structure.
  • certain aspects may comprise a computer program product for performing the operations presented herein.
  • a computer program product may comprise a computer readable medium having instructions stored (and/or encoded) thereon, the instructions being executable by one or more processors to perform the operations described herein.
  • the computer program product may include packaging material.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Studio Devices (AREA)
  • Length Measuring Devices By Optical Means (AREA)
EP13798624.6A 2013-11-22 2013-11-22 A light field processing method Withdrawn EP3072289A1 (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/EP2013/074520 WO2015074718A1 (en) 2013-11-22 2013-11-22 A light field processing method

Publications (1)

Publication Number Publication Date
EP3072289A1 true EP3072289A1 (en) 2016-09-28

Family

ID=49681009

Family Applications (1)

Application Number Title Priority Date Filing Date
EP13798624.6A Withdrawn EP3072289A1 (en) 2013-11-22 2013-11-22 A light field processing method

Country Status (5)

Country Link
EP (1) EP3072289A1 (ja)
JP (1) JP2016537901A (ja)
KR (1) KR20160106045A (ja)
CN (1) CN106165387A (ja)
WO (1) WO2015074718A1 (ja)

Families Citing this family (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108353188B (zh) * 2015-09-17 2022-09-06 InterDigital VC Holdings, Inc. Method for encoding light field content
JP6878415B2 (ja) * 2015-09-17 2021-05-26 InterDigital VC Holdings, Inc. Light field data representation
EP3144885A1 (en) 2015-09-17 2017-03-22 Thomson Licensing Light field data representation
EP3144888A1 (en) * 2015-09-17 2017-03-22 Thomson Licensing An apparatus and a method for generating data representing a pixel beam
EP3188123A1 (en) * 2015-12-30 2017-07-05 Thomson Licensing A method and an apparatus for generating data representative of a pixel beam
WO2017135896A1 (en) * 2016-02-02 2017-08-10 Agency For Science, Technology And Research An imaging system and method for estimating three-dimensional shape and/ or behaviour of a subject
EP3220351A1 (en) * 2016-03-14 2017-09-20 Thomson Licensing Method and device for processing lightfield data
KR20230166155A (ko) 2016-07-15 2023-12-06 라이트 필드 랩 인코포레이티드 라이트 필드 및 홀로그램 도파관 어레이에서의 에너지의 선택적 전파
EP3273686A1 (en) 2016-07-21 2018-01-24 Thomson Licensing A method for generating layered depth data of a scene
CN107645643A (zh) * 2017-10-10 2018-01-30 Chengdu Xuezhile Technology Co., Ltd. Image recording system suitable for a variety of teaching environments
US11650354B2 (en) 2018-01-14 2023-05-16 Light Field Lab, Inc. Systems and methods for rendering data from a 3D environment
CN111292245A (zh) * 2018-12-07 2020-06-16 Beijing ByteDance Network Technology Co., Ltd. Image processing method and apparatus
CN111382753B (zh) * 2018-12-27 2023-05-12 Yaoke Intelligent Technology (Shanghai) Co., Ltd. Light field semantic segmentation method, system, electronic terminal and storage medium
CN111612806B (zh) * 2020-01-10 2023-07-28 Jiangxi University of Science and Technology Building facade window extraction method and apparatus
CN117015966A (zh) * 2021-03-15 2023-11-07 Huawei Technologies Co., Ltd. Light field prediction model generation method and related apparatus

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100309226A1 (en) * 2007-05-08 2010-12-09 Eidgenossische Technische Hochschule Zurich Method and system for image-based information retrieval

Also Published As

Publication number Publication date
KR20160106045A (ko) 2016-09-09
CN106165387A (zh) 2016-11-23
JP2016537901A (ja) 2016-12-01
WO2015074718A1 (en) 2015-05-28

Similar Documents

Publication Publication Date Title
US20150146032A1 (en) Light field processing method
WO2015074718A1 (en) A light field processing method
Liu et al. 3D imaging, analysis and applications
Schops et al. A multi-view stereo benchmark with high-resolution images and multi-camera videos
Sabater et al. Dataset and pipeline for multi-view light-field video
CN106228507B (zh) A light-field-based depth image processing method
Fuhrmann et al. Mve-a multi-view reconstruction environment.
US20130335535A1 (en) Digital 3d camera using periodic illumination
WO2015180659A1 (zh) Image processing method and image processing apparatus
WO2013139814A2 (fr) Model and method for producing photo-realistic 3D models
Santos et al. 3D plant modeling: localization, mapping and segmentation for plant phenotyping using a single hand-held camera
Ley et al. Syb3r: A realistic synthetic benchmark for 3d reconstruction from images
US9325886B2 (en) Specular and diffuse image generator using polarized light field camera and control method thereof
CN110276831B (zh) 三维模型的建构方法和装置、设备、计算机可读存储介质
Ziegler et al. Acquisition system for dense lightfield of large scenes
Wei et al. Object-based illumination estimation with rendering-aware neural networks
Griffiths et al. OutCast: Outdoor Single‐image Relighting with Cast Shadows
Kang et al. Facial depth and normal estimation using single dual-pixel camera
JP2022518402A (ja) 三次元再構成の方法及び装置
Ferreira et al. Fast and accurate micro lenses depth maps for multi-focus light field cameras
CN113361360B (zh) 基于深度学习的多人跟踪方法及系统
Goldlücke et al. Plenoptic Cameras.
Drofova et al. Use of scanning devices for object 3D reconstruction by photogrammetry and visualization in virtual reality
Miles et al. Computational photography
Tao Unified Multi-Cue Depth Estimation from Light-Field Images: Correspondence, Defocus, Shading, and Specularity

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 20160517

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR

AX Request for extension of the european patent

Extension state: BA ME

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION HAS BEEN WITHDRAWN

18W Application withdrawn

Effective date: 20161214