WO2021252892A1 - Systems and methods for producing a light field from a depth map - Google Patents
Systems and methods for producing a light field from a depth map
- Publication number
- WO2021252892A1 (PCT/US2021/037005)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- pixels
- source image
- depth data
- pixel
- light field
- Prior art date
Links
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/50—Depth or shape recovery
- G06T7/55—Depth or shape recovery from multiple images
- G06T7/557—Depth or shape recovery from multiple images from light fields, e.g. from plenoptic cameras
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/50—Depth or shape recovery
- G06T7/55—Depth or shape recovery from multiple images
- G06T7/593—Depth or shape recovery from multiple images from stereo images
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N13/00—Stereoscopic video systems; Multi-view video systems; Details thereof
- H04N13/20—Image signal generators
- H04N13/204—Image signal generators using stereoscopic image cameras
- H04N13/207—Image signal generators using stereoscopic image cameras using a single 2D image sensor
- H04N13/232—Image signal generators using stereoscopic image cameras using a single 2D image sensor using fly-eye lenses, e.g. arrangements of circular lenses
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N13/00—Stereoscopic video systems; Multi-view video systems; Details thereof
- H04N13/20—Image signal generators
- H04N13/271—Image signal generators wherein the generated image signals comprise depth maps or disparity maps
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2200/00—Indexing scheme for image data processing or generation, in general
- G06T2200/08—Indexing scheme for image data processing or generation, in general involving all processing steps from image acquisition to 3D model generation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10024—Color image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10028—Range image; Depth image; 3D point clouds
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10052—Images from lightfield camera
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N13/00—Stereoscopic video systems; Multi-view video systems; Details thereof
- H04N13/10—Processing, recording or transmission of stereoscopic or multi-view image signals
- H04N13/106—Processing image signals
- H04N13/111—Transformation of image signals corresponding to virtual viewpoints, e.g. spatial image interpolation
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N13/00—Stereoscopic video systems; Multi-view video systems; Details thereof
- H04N2013/0074—Stereoscopic image analysis
- H04N2013/0081—Depth or disparity estimation from stereoscopic image signals
Definitions
- the present disclosure relates generally to imaging systems, and more specifically to systems and methods for producing a light field from a depth map.
- Traditional electronic displays encode data as a two-dimensional (2D) matrix of pixels (x, y). This pixel data can be stored in various formats such as RGB, RGBA, BGR, and HSV.
- In contrast to the flat, 2D data representation of traditional electronic displays, light field displays provide a four-dimensional (4D) display of data. 4D light fields are generated using 4D plenoptic vectors that are composed of position (x, y) and angle (theta, phi), with a color value (e.g., RGB) for each permutation.
- 4D light field data must either be pre-rendered explicitly as 4D light field data (in a non-real-time fashion), or considerable GPU horsepower (e.g., arrays of high-end graphics cards) must be committed to calculate the four-dimensional data.
- 4D light field generation is so computationally expensive that it generally cannot be done in real time without extremely powerful hardware (e.g., numerous GPUs running in parallel).
- a system includes an electronic display, a computer processor, one or more memory units, and a module stored in the one or more memory units.
- the module is configured to access a source image stored in the one or more memory units and determine depth data for each pixel of a plurality of pixels of the source image.
- the module is further configured to map, using the plurality of pixels and the determined depth data for each of the plurality of pixels, the source image to a four-dimensional light field.
- the module is further configured to send instructions to the electronic display to display the mapped four-dimensional light field.
- the disclosed embodiments provide several practical applications and technical advantages, which include at least: 1) circumventing the need for multiple render passes to produce a 4D light field by instead programmatically computing the necessary data for the 4D light field from a single two-dimensional (2D) or two-and-a-half-dimensional (2.5D) source image; and 2) the real-time generation of a 4D light field from a 2D or 2.5D source, even on low-end computing hardware (e.g., a smartphone CPU / GPU).
- Certain embodiments may include none, some, or all of the above technical advantages and practical applications.
- One or more other technical advantages and practical applications may be readily apparent to one skilled in the art from the figures, descriptions, and claims included herein.
- FIGURE 1 is a schematic diagram of an example system for producing a light field from a depth map, according to certain embodiments.
- FIGURE 2 is a diagrammatic view of a two-dimensional image to a four-dimensional light field mapping, according to certain embodiments.
- FIGURE 3 is a diagrammatic view of a two and a half-dimensional image to a four-dimensional light field mapping, according to certain embodiments.
- FIGURE 4 is a flowchart of a method for producing a light field from a depth map, according to certain embodiments.
- FIGURE 5 is a flowchart of a method for producing a light field from a depth map using reverse-lookup, according to certain embodiments.
- FIGURE 6 is a diagrammatic view of a reverse lookup mapping for a hardware implementation, according to certain embodiments.
- FIGURE 7 is a flowchart of a method for producing a light field from a depth map using forward-lookup, according to certain embodiments.
- FIGURE 8 is a flowchart of a sub-method of FIGURE 7, according to certain embodiments.
- FIGURE 9 is a diagrammatic view of a forward lookup mapping for a software implementation, according to certain embodiments.
- Embodiments of the present disclosure and its advantages are best understood by referring to FIGURES 1 through 9 of the drawings, like numerals being used for like and corresponding parts of the various drawings.
- embodiments disclosed herein provide systems and methods which render a 4D light field using 2D or 2.5D source image data.
- computing the 4D light field generally requires many render passes (i.e., one render pass for each plenoptic cell). This means, for example, a relatively low-spatial-resolution light field of ten plenoptic cells takes 10x more graphics processing power than the amount of processing power to render the same content on a traditional 2D screen.
- Most typical light field generation systems are predicated on brute-force computing the 4D plenoptic function (RGB for each x, y, theta, and phi). This is equivalent to rendering a scene from many different camera positions for each frame.
- Embodiments of the disclosure circumvent the need for multiple render passes and instead programmatically compute the necessary data for a 4D light field from a single 2D or 2.5D source image.
- embodiments of the disclosure may provide a 10x to 100x decrease in the amount of time to produce a 4D light field over the approach of traditional systems.
- each plenoptic cell's location represents a spatial position and each pixel location within that cell represents a ray direction.
- embodiments of the disclosure transform data from a 2D or 2.5D source image into a 4D light field via one of two approaches.
- in a first approach, some embodiments perform simple replication of source imagery in an identical manner across each plenoptic cell.
- 2D data is projected into infinity with no additional depth data.
- some embodiments utilize the depth buffer (intrinsically present for computer-generated 3D imagery) to programmatically compute ray direction (theta, phi) for each pixel in the 2D matrix.
- the second approach is identical to the first approach except each pixel within a plenoptic cell is shifted in x/y space as a function of its associated depth value.
- Embodiments described herein allow for the generation of a 4D light field from a 2D or 2.5D source image in real time, even on low-end computing hardware (e.g., smartphone CPU / GPU).
- the embodiments described herein circumvent the need for multiple render passes, instead programmatically computing the necessary data from a single 2D or 2.5D source image.
- the disclosed embodiments are 10x to 100x faster than the traditional approach (depending on the spatial resolution of the target light field display).
- the disclosed embodiments may be a key enabler for extended reality (XR) visors, XR wall portals, XR construction helmets, XR pilot helmets, XR far eye displays, and the like.
- XR includes Virtual Reality (VR), Augmented Reality (AR), Mixed Reality (MR), and any combination thereof.
- FIGURE 1 illustrates an example system 100 for producing a light field from a depth map.
- system 100 includes a processor 110, memory 120, and an electronic display 130.
- One or more source images 150 and a depth map to light field module 140 may be stored in memory 120.
- Electronic display 130 includes multiple plenoptic cells 132 (e.g., 132A, 132B, etc.), and each plenoptic cell 132 includes multiple display pixels 134 (e.g., 134A-134P).
- electronic display 130 of FIGURE 1 includes nine plenoptic cells 132, and each plenoptic cell 132 includes sixteen display pixels 134.
- electronic display 130 may have any number of plenoptic cells 132 in any physical arrangement, and each plenoptic cell 132 may have any number of display pixels 134.
- Processor 110 is any electronic circuitry, including, but not limited to microprocessors, application specific integrated circuits (ASIC), application specific instruction set processor (ASIP), and/or state machines, that communicatively couples to memory 120 and controls the operation of system 100.
- Processor 110 may be 8-bit, 16-bit, 32-bit, 64-bit or of any other suitable architecture.
- Processor 110 may include an arithmetic logic unit (ALU) for performing arithmetic and logic operations, processor registers that supply operands to the ALU and store the results of ALU operations, and a control unit that fetches instructions from memory and executes them by directing the coordinated operations of the ALU, registers and other components.
- Processor 110 may include other hardware that operates software to control and process information.
- Processor 110 executes software stored on memory to perform any of the functions described herein.
- Processor 110 controls the operation and administration of depth map to light field module 140.
- Processor 110 may be a programmable logic device, a microcontroller, a microprocessor, any suitable processing device, or any suitable combination of the preceding.
- Processor 110 is not limited to a single processing device and may encompass multiple processing devices.
- Memory 120 may store, either permanently or temporarily, source images 150, operational software such as depth map to light field module 140, or other information for processor 110.
- Memory 120 may include any one or a combination of volatile or non-volatile local or remote devices suitable for storing information.
- memory 120 may include random access memory (RAM), read only memory (ROM), magnetic storage devices, optical storage devices, or any other suitable information storage device or a combination of these devices.
- Depth map to light field module 140 represents any suitable set of instructions, logic, or code embodied in a computer-readable storage medium.
- depth map to light field module 140 may be embodied in memory 120, a disk, a CD, or a flash drive.
- depth map to light field module 140 may include an application executable by processor 110 to perform one or more of the functions described herein.
- Source image 150 is any image or electronic data file associated with an image. In some embodiments, source image 150 is captured by a camera. In some embodiments, source image 150 is a 2D image that contains color data but does not contain depth data. In some embodiments, source image 150 is a 2.5D image that contains color data plus depth data (e.g., RGB-D). In some embodiments, source image 150 is a stereographic image from a stereo pair. Source image 150 may contain any appropriate pixel data (e.g., RGB-D, RGBA, BGR, HSV, and the like).
- system 100 provides (via, e.g., depth map to light field module 140) a 4D light field for display on electronic display 130 from a 2D or 2.5D source image 150.
- depth map to light field module 140 accesses a source image 150 stored in memory 120 and determines depth data for each pixel 152 of a plurality of pixels 152 of the source image 150. If the source image 150 is a 2D image that does not contain depth data, depth map to light field module 140 may assume a constant depth for the image (e.g., infinity) and then map each pixel 152 of the source image 150 to a corresponding pixel 134 of each electronic display 130.
- FIGURE 2 is a diagrammatic view of a 2D source image 150 to a 4D light field mapping.
- pixel 152A is mapped to display pixel 134A of each plenoptic cell 132 (i.e., all nine plenoptic cells 132), pixel 152B is mapped to display pixel 134B of each plenoptic cell 132, and so forth.
- the pixel data (i.e., color data plus constant depth) is sent from depth map to light field module 140 to electronic display 130 as light field pixel data 142 in order to produce the corresponding 4D light field.
- FIGURE 3 is a diagrammatic view of a 2.5D source image 150 to a 4D light field mapping.
- pixel 152A is mapped to display pixel 134A of each plenoptic cell 132 (i.e., all nine plenoptic cells 132), pixel 152B is mapped to display pixel 134B of each plenoptic cell 132, and so forth.
- the pixel data (e.g., RGB-D) is sent from depth map to light field module 140 to electronic display 130 as light field pixel data 142 in order to produce the corresponding 4D light field.
- as the pixel data (e.g., RGB+XYD) arrives at the parser, the RGB-D is copied to the corresponding (x,y) for each cell within the target buffer, but is shifted as a function of the depth data. This is illustrated in FIGURE 3 by the shifting of the 4 inner pixels of source image 150 (labeled “A” and shaded grey) in certain plenoptic cells 132.
- FIGURE 4 illustrates a method 400 for producing a light field from a depth map.
- method 400 may be utilized by depth map to light field module 140 to generate light field pixel data 142 from source image 150 and send light field pixel data 142 to electronic display 130 in order to display a 4D light field corresponding to the source image 150.
- method 400 may access a source image (e.g., source image 150) stored in one or more memory units (e.g., memory 120).
- Method 400 may then determine depth data for each pixel of a plurality of pixels of the source image (e.g., steps 410-450) and then map, using the plurality of pixels and the determined depth data for each of the plurality of pixels, the source image to a four-dimensional light field (e.g., steps 460-480). Each of these steps is described in more detail below.
- method 400 determines whether the source image contains depth data (e.g. from a render buffer). In some embodiments, the source image is source image 150. If the source image includes depth data, method 400 proceeds to step 420. If the source image does not include depth data, method 400 proceeds to step 440.
- method 400 determines if the depth data of the source image is RGB-D data. If the depth data of the source image is RGB-D data, method 400 proceeds to step 460. If the depth data of the source image is not RGB-D data, method 400 proceeds to step 430. At step 430, method 400 converts the depth data to RGB-D and then proceeds to step 460.
- method 400 determines if the source image is stereoscopic. If method 400 determines that the source image is stereoscopic, method 400 proceeds to step 445. If method 400 determines that the source image is not stereoscopic, method 400 proceeds to step 450.
- method 400 computes the depth data from the stereoscopic source image.
- the depth data is computed via parallax differences.
- method 400 proceeds to step 460.
- method 400 assigns a constant depth and a fixed location to the source image that is devoid of depth data. For example, method 400 may assign a constant depth of infinity. After step 450, method 400 proceeds to step 460.
- method 400 determines if the runtime environment supports reverse-lookup (e.g., a hardware implementation for providing a light field). If method 400 determines that the runtime environment supports reverse-lookup in step 460, method 400 proceeds to step 470. If method 400 determines that the runtime environment does not support reverse-lookup in step 460, method 400 proceeds to step 480.
- at step 470, method 400 maps and sends light field pixel data (e.g., light field pixel data 142) using reverse-lookup (e.g., hardware implementation).
- step 470 may include copying RGB+D to corresponding X & Y for each plenoptic cell within the target buffer as pixel data (RGB + XYD) arrives at the parser, shifting as a function of depth (D).
- Ax * Ay total reads and Ax * Ay * Sx * Sy total writes may be performed. More details about certain embodiments of step 470 are described in more detail below with respect to FIGURE 5.
- method 400 may end.
- at step 480, method 400 maps and sends light field pixel data (e.g., light field pixel data 142) using forward-lookup (e.g., software such as a graphics pipeline shader).
- step 480 may include performing a ray-march for each pixel in the output buffer along its corresponding theta / phi direction to determine target pixel data from the reference RGB-D map. In some embodiments, this may require Ax * Ay * Sx * Sy (parallelizable) ray-marches (where Ax & Ay are angular resolution width & height (e.g., pixels per cell) and Sx & Sy are spatial resolution width & height (e.g., cells per display)). More details about certain embodiments of step 480 are described in more detail below with respect to FIGURE 7. After step 480, method 400 may end.
- FIGURE 5 is a flowchart of a method 500 for producing a light field from a depth map using reverse-lookup (e.g., hardware implementation), according to certain embodiments.
- FIGURE 6 is a diagrammatic view of a reverse-lookup mapping for a hardware implementation such as method 500.
- method 500 uses the color and depth (e.g., RGB-D) of the pixel in question to determine in which location in the output image to write that data.
- in step 510, method 500 determines if the incoming pixel 152 of source image 150 has been written to all plenoptic cells 132. If so, method 500 ends. Otherwise, method 500 proceeds to step 520.
- in step 520, method 500 uses the pixel's location in the display matrix (its cell) to determine the real world offset from the center of the light field display to its encompassing plenoptic cell's corners.
- step 520 includes computing the pixel offset (dX/dY) as a ratio-of-slopes-function of the plenoptic cell position (cX/cY) and pixel depth (D).
- after step 520, method 500 proceeds to step 530.
- in step 530, method 500 determines if the pixel has already been written at the location (X + dX, Y + dY) of this cell for this frame. If so, method 500 proceeds to step 540. If not, method 500 proceeds to step 550.
- method 500 determines if the depth of the already-written pixel is smaller (i.e., closer to the camera) than the depth of the not-yet-written pixel. If the depth of the already-written pixel is smaller than the depth of the not-yet-written pixel, method 500 does not write any pixel data for the incoming pixel and proceeds back to step 510. If the depth of the already-written pixel is not smaller than the depth of the not-yet-written pixel, method 500 proceeds to step 550.
- in step 550, method 500 writes the pixel data (RGB+D) to location (X+dX, Y+dY) of the plenoptic cell. Because each pixel in a plenoptic cell represents a different ray direction (the angles theta and phi), method 500 may use the difference between the pixel’s location and the plenoptic cell’s center, multiplied by the pitch of a cell, to calculate one side of a right triangle in step 550. Method 500 may use the given depth to calculate the other side of that same triangle. The ratio of those two sides generates an angle related to the pixel offset that should be applied when writing the output buffer data for the display.
- This relationship may be defined by taking that angle, converting it to degrees, multiplying by the pixels-per-degree of the display and the sign of the difference between the pixel’s location and its cell center. This offset is then added to the original pixel x/y and its cell’s center to create a new world position. The pixel’s color value and depth are written to this position in the output buffer if it is closer than what was there before to create a part of the light field.
- Method 500 may be implemented in low level code, firmware, or transistor-logic and run on hardware close to the light field display 130.
- identical synthetic content may be fed to each one and it may use its known position to calculate its portion of the light field.
- FIGURE 7 is a flowchart of a method 700 for producing a light field from a depth map using forward-lookup (e.g., software such as a graphics pipeline shader), according to certain embodiments.
- FIGURE 9 is a diagrammatic view of a forward-lookup mapping for a software implementation such as method 700.
- Method 700 uses the pixel in question’s buffer location and the depth (if available) and cell location to determine which color should be placed in that same spot in the output buffer.
- method 700 maps the location and direction in space of that particular pixel within the plenoptic cell into the virtual camera space so that the appropriate color to put in that cell can be looked up.
- the operation of method 700 is opposite from that of method 500 (i.e., method 700 reads through the display pixels 134 of plenoptic cells 132 while method 500 reads through the source pixel data 152 of source image 150 as illustrated in FIGURE 6).
- the first step of method 700 is to calculate the camera ray for the given pixel location. This is the ray direction that light is traveling along as it would intersect this pixel’s cell location. This ray uses the world position of the pixel’s cell as its origin and the angles theta and phi as its direction.
- the ray’s origin is converted from pixel space to world space using known properties of the collection system such as near plane and field of view.
- method 700 converts this ray into one in the depth (background) space. This transformation converts the 3D world space vector into a 2D depth space vector and uses inverse depth for comparisons.
- the derivatives for both the camera ray and the depth vector are calculated with respect to the spatial dimensions (x,y). These derivatives are then used in a loop to traverse through depth space to determine the closest object in the rendered buffer that the ray would hit.
- in the case of having a separate depth buffer, the depth space vector would also be traversed through its buffer. At the end, the closest object between the depth space and color space would determine the actual depth of the pixel's ray. In these steps it may be preferable to use the tangent of the inverse depth during the recursion for stability reasons. With the depth of the object that was hit, the origin of the ray and its direction, trigonometric relations (using known properties of the collector such as field of view) can be used to convert from world back to screen space to determine which source color image pixel's color data should be used for the output image at this pixel location. The specific steps of some embodiments of method 700 are described in more detail below.
- Method 700 may begin in step 710 where method 700 determines if every pixel in the render target has been checked. If so, method 700 may end. Otherwise, method 700 proceeds to step 715.
- method 700 retrieves the next pixel X,Y in the render target. After step 715, method 700 proceeds to step 720 where method 700 determines the ray direction of pixel X,Y in the scene space (d1). After step 720, method 700 proceeds to step 725 where method 700 determines the ray direction of pixel X,Y in the background space (d2). After step 725, method 700 proceeds to step 730 where method 700 calculates the derivative of d1 and d2 with respect to X/Y. After step 730, method 700 proceeds to step 735 where method 700 calls the function of FIGURE 8 using d1 and its derivatives to approximate the ray intersection point i1 in the scene space.
- at step 740, method 700 calls the function of FIGURE 8 using d2 and its derivatives to approximate the ray intersection point i2 in the background space.
- at step 745, method 700 converts i1 and i2 into more accurate inverse depths id1 and id2.
- at step 750, method 700 calculates an ending inverse depth (id_e) from id2.
- at step 755, method 700 determines if id1 is less than id2. If id1 is less than id2, method 700 proceeds to step 760 where method 700 writes 0 (blocked) and then proceeds back to step 710. Otherwise, method 700 proceeds to step 765.
- at step 765, method 700 writes the color sample from the scene or multisample (e.g., derivative) at i1. After step 765, method 700 proceeds back to step 710.
- method 700 may be implemented as a post process shader in a game engine or render engine. This shader would take as input data from cameras placed in the synthetic scene and optionally depth buffers from those virtual cameras. Method 700 would then output a single image that contained the cells that make up the light field in an array of images.
- FIGURE 8 is a flowchart of a sub-method 800 of FIGURE 7, according to certain embodiments.
- Method 800 is a parallax occlusion ray to inverse depth method that functions by raymarching with uniform steps in the pixel space.
- method 800 first sets up a conversion between 3D space and the 2D space of the input render. This allows method 800 to perform its iteration in a pixel-perfect manner while still doing the computation in 3D space. Not only does this provide better quality results, but it also provides drastic performance gains when the light field display is small with respect to the scene it is displaying.
- Method 800 may begin in step 810 where method 800 calculates the starting tangent for the inverse depth for the direction passed in from method 700 (d1 or d2). After step 810, method 800 proceeds to step 820 where method 800 calculates the ending tangent for the inverse depth for the direction passed in from method 700 (d1 or d2). In general, in steps 810 and 820, method 800 obtains the tangents (original render space) corresponding to the start and end inverse depths along the input ray. Method 800 may in addition compute the constant step (corresponding to half of the diagonal length of one pixel) to be used in the raymarching.
- method 800 begins a loop that takes a parameter 't' from 0 to 1.
- Because the inverse depth along the input ray is linearly related to 't', method 800 uses this fact to compute the depth of the input ray at each step in the raymarch (step 830, where method 800 calculates the parametric inverse depth id_t using ‘t’), and method 800 uses the linear relation between t and the input pixel space to sample the input depth map, which is called the scene depth (step 835).
- On the first iteration in which the ray depth exceeds the scene depth (step 840), method 800 concludes that the ray has 'hit' something and takes an early exit to return an appropriately interpolated value (step 850). Otherwise, in the case that this never occurs before the end of the loop, method 800 returns an indication that we hit nothing (step 855).
- Method 800 thus is a parallax occlusion algorithm that results in an inverse depth, allowing it to return an intersection point (as an inverse depth along the ray) as well as returning a 'nothing hit' signal (as the input ending inverse depth). An illustrative sketch of this loop appears after this list.
- The disclosed embodiments mostly use tangent space, as opposed to either pixel space or angle space, and mostly use inverse depth as opposed to depth. Both of these choices of space have both conceptual and computational advantages.
- tangent space may be the best choice because it is the most consistent, computationally cheapest option and is readily compatible with the other spaces (pixel space is tangent space with a scalar multiplier).
- inverse depth space offers many advantages. As noted above, in a standard camera projection, pixel position is inversely related to scene depth (coordinate parallel to the view direction, not distance to the camera) and therefore linearly related to inverse depth. This allows the disclosed embodiments to execute large sections of computation (such as pixel-perfect raymarching) without ever directly computing depth and without ever using a division. Another advantage of using inverse depth is a clean representation of infinity. An inverse depth of 0 may be used to refer to the 'back' of the scene as it is effectively a depth of infinity. This proves to be much more useful than its counterpart, a depth of zero, which the inverse depth space renders unwieldy.
- the term “each” refers to each member of a set or each member of a subset of a set.
- the term “or” is not necessarily exclusive and, unless expressly indicated otherwise, can be inclusive in certain embodiments and can be understood to mean “and/or.”
- the term “and” is not necessarily inclusive and, unless expressly indicated otherwise, can be inclusive in certain embodiments and can be understood to mean “and/or.” All references to “a/an/the element, apparatus, component, means, step, etc.” are to be interpreted openly as referring to at least one instance of the element, apparatus, component, means, step, etc., unless explicitly stated otherwise.
- references to an apparatus or system or a component of an apparatus or system being adapted to, arranged to, capable of, configured to, enabled to, operable to, or operative to perform a particular function encompasses that apparatus, system, component, whether or not it or that particular function is activated, turned on, or unlocked, as long as that apparatus, system, or component is so adapted, arranged, capable, configured, enabled, operable, or operative.
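The parallax-occlusion loop of method 800, referenced in the entries above, can be sketched as follows. This is an illustrative sketch only, not the patented implementation: the function name raymarch_inverse_depth, the assumption that the ray's inverse depth varies linearly with the loop parameter 't', and the caller-supplied sampling function are all introduced here for clarity.

```python
# Hedged sketch of a parallax-occlusion raymarch that returns an inverse depth
# (method 800, steps 830-855).  Assumptions: the ray's inverse depth is linear
# in t, and sample_scene_inv(t) is a caller-supplied function returning the
# scene's inverse depth under the ray at parameter t.
def raymarch_inverse_depth(ray_inv_start, ray_inv_end, sample_scene_inv, steps=64):
    prev_f = prev_ray = None
    for i in range(steps + 1):
        t = i / steps
        ray_inv = ray_inv_start + t * (ray_inv_end - ray_inv_start)
        f = ray_inv - sample_scene_inv(t)   # positive while the ray is in front of the scene
        if f <= 0.0:                        # ray depth now exceeds scene depth: a hit (step 840)
            if prev_f is None:
                return ray_inv              # hit on the very first sample
            frac = prev_f / (prev_f - f)    # interpolate the crossing (step 850)
            return prev_ray + frac * (ray_inv - prev_ray)
        prev_f, prev_ray = f, ray_inv
    return ray_inv_end                      # nothing hit: report the ending inverse depth (step 855)
```

Note that, for readability, the interpolation at the exit uses one division; the inverse-depth formulation described above allows an implementation to avoid divisions during the march itself.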
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Image Generation (AREA)
Abstract
A system includes an electronic display, a computer processor, one or more memory units, and a module stored in the one or more memory units. The module is configured to access a source image stored in the one or more memory units and determine depth data for each pixel of a plurality of pixels of the source image. The module is further configured to map, using the plurality of pixels and the determined depth data for each of the plurality of pixels, the source image to a four-dimensional light field. The module is further configured to send instructions to the electronic display to display the mapped four-dimensional light field.
Description
SYSTEMS AND METHODS FOR PRODUCING A LIGHT FIELD FROM A DEPTH MAP
PRIORITY
This application claims the benefit under 35 U.S.C. § 119(e) of U.S. Provisional Patent Application No. 63/038,639, filed 12 June 2020, which is incorporated herein by reference in its entirety.
TECHNICAL FIELD
The present disclosure relates generally to imaging systems, and more specifically to systems and methods for producing a light field from a depth map.
BACKGROUND
Traditional electronic displays encode data as a two-dimensional (2D) matrix of pixels (x, y). This pixel data can be stored in various formats such as RGB, RGBA, BGR, and HSV. In contrast to the flat, 2D data representation of traditional electronic displays, light field displays provide a four-dimensional (4D) display of data. 4D light fields are generated using 4D plenoptic vectors that are composed of position (x, y) and angle (theta, phi), with a color value (e.g., RGB) for each permutation.
SUMMARY
Traditional electronic displays encode data as a two-dimensional (2D) matrix of pixels (x, y). This pixel data can be stored in various formats such as RGB, RGBA, BGR, and HSV. In contrast to the flat, 2D data representation of traditional electronic displays, light field displays provide a four-dimensional (4D) display of data. 4D light fields are generated using 4D plenoptic vectors that are composed of position (x, y) and angle (theta, phi), with a color value (e.g., RGB) for each permutation.
Traditional media content (e.g., images and video) typically does not contain the necessary angular information (theta, phi) for 4D light field displays. Computing the missing angular information is complex and computationally intensive. In practice, this means that 4D light field data must either be pre-rendered explicitly as 4D light field data (in a non-real-time fashion), or considerable GPU horsepower (e.g., arrays of high-end graphics cards) must be committed to calculate the four-dimensional data. 4D light field generation is so computationally expensive that it generally cannot be done in real time without extremely powerful hardware (e.g., numerous GPUs running in parallel).
To address these and other problems with providing 4D light fields, embodiments of the disclosure provide novel systems and methods for producing a light field from a depth map. In some embodiments, a system includes an electronic display, a computer processor, one or more memory units, and a module stored in the one or more memory units. The module is configured to access a source image stored in the one or more memory units and determine depth data for each pixel of a plurality of pixels of the source image. The module is further configured to map, using the plurality of pixels and the determined depth data for each of the plurality of pixels, the source image to a four-dimensional light field. The module is further configured to send instructions to the electronic display to display the mapped four-dimensional light field.
The disclosed embodiments provide several practical applications and technical advantages, which include at least: 1) circumventing the need for multiple render passes to produce a 4D light field by instead programmatically computing the necessary data for the 4D light field from a single two-dimensional (2D) or two-and-a-half-dimensional (2.5D) source image; and 2) the real-time generation of a 4D light field from a 2D or 2.5D source, even on low-end computing hardware (e.g., a smartphone CPU / GPU).
Certain embodiments may include none, some, or all of the above technical advantages and practical applications. One or more other technical advantages and practical applications may be readily apparent to one skilled in the art from the figures, descriptions, and claims included herein.
BRIEF DESCRIPTION OF THE DRAWINGS
For a more complete understanding of this disclosure, reference is now made to the following brief description, taken in connection with the accompanying drawings and detailed description, wherein like reference numerals represent like parts.
FIGURE 1 is a schematic diagram of an example system for producing a light field from a depth map, according to certain embodiments.
FIGURE 2 is a diagrammatic view of a two-dimensional image to a four-dimensional light field mapping, according to certain embodiments.
FIGURE 3 is a diagrammatic view of a two and a half-dimensional image to a four-dimensional light field mapping, according to certain embodiments.
FIGURE 4 is a flowchart of a method for producing a light field from a depth map, according to certain embodiments.
FIGURE 5 is a flowchart of a method for producing a light field from a depth map using reverse-lookup, according to certain embodiments.
FIGURE 6 is a diagrammatic view of a reverse lookup mapping for a hardware implementation, according to certain embodiments.
FIGURE 7 is a flowchart of a method for producing a light field from a depth map using forward-lookup, according to certain embodiments.
FIGURE 8 is a flowchart of a sub-method of FIGURE 7, according to certain embodiments.
FIGURE 9 is a diagrammatic view of a forward lookup mapping for a software implementation, according to certain embodiments.
DETAILED DESCRIPTION
Embodiments of the present disclosure and its advantages are best understood by referring to FIGURES 1 through 9 of the drawings, like numerals being used for like and corresponding parts of the various drawings. Traditional electronic displays encode data as a two-dimensional (2D) matrix of pixels (x, y). This pixel data can be stored in various formats such as RGB, RGBA, BGR, and HSV. In contrast to the flat, 2D data representation of traditional electronic displays, light field displays provide a four-dimensional (4D) display of data. 4D light fields are generated using 4D plenoptic vectors that are composed of position (x, y) and angle (theta, phi), with a color value (e.g., RGB) for each permutation.
Traditional media content (e.g., images and video) typically does not contain the necessary angular information (theta, phi) for 4D light field displays. Computing the missing angular information is complex and computationally intensive. In practice, this means that 4D light field data must either be pre-rendered explicitly as 4D light field data (in a non-real-time fashion), or considerable GPU horsepower (e.g., arrays of high-end graphics cards) must be utilized to calculate the 4D data. 4D light field generation is so computationally expensive that it generally cannot be done in real time without extremely powerful hardware (e.g., numerous GPUs running in parallel).
To address these and other difficulties and problems with providing 4D light fields, embodiments of the disclosure provide novel systems and methods for producing a light field from a depth map. In some embodiments, a system includes an electronic display, a computer processor, one or more memory units, and a module stored in the one or more memory units. The module is configured to access a source image stored in the one or more memory units and determine depth data for each pixel of a plurality of pixels of the source image. The module is further configured to map, using the plurality of pixels and the determined depth data for each of the plurality of pixels, the source image to a four-dimensional light field. The module is further configured to send instructions to the electronic display to display the mapped four-dimensional light field. In general, embodiments disclosed herein provide systems and methods which render a 4D light field using 2D or 2.5D source image data. In typical systems, computing the 4D light field generally requires many render passes (i.e., one render
pass for each plenoptic cell). This means, for example, a relatively low-spatial-resolution light field of ten plenoptic cells takes 10x more graphics processing power than the amount of processing power to render the same content on a traditional 2D screen. Most typical light field generation systems are predicated on brute-force computing the 4D plenoptic function (RGB for each x, y, theta, and phi). This is equivalent to rendering a scene from many different camera positions for each frame. This is computationally expensive and generally impractical (or impossible) for real time content. Embodiments of the disclosure, however, circumvent the need for multiple render passes and instead programmatically compute the necessary data for a 4D light field from a single 2D or 2.5D source image. As will be appreciated, embodiments of the disclosure may provide a 10x to 100x decrease in the amount of time to produce a 4D light field over the approach of traditional systems.
For a near-eye light field system composed of plenoptic cells, each plenoptic cell's location represents a spatial position and each pixel location within that cell represents a ray direction. As such, embodiments of the disclosure transform data from a 2D or 2.5D source image into a 4D light field via one of two approaches. In a first approach, some embodiments perform simple replication of source imagery in an identical manner across each plenoptic cell. In these embodiments, 2D data is projected into infinity with no additional depth data. In a second approach, some embodiments utilize the depth buffer (intrinsically present for computer-generated 3D imagery) to programmatically compute ray direction (theta, phi) for each pixel in the 2D matrix. In essence, the second approach is identical to the first approach except each pixel within a plenoptic cell is shifted in x/y space as a function of its associated depth value.
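A minimal sketch of these two approaches follows, assuming a single RGB-D source image and a light field buffer indexed by plenoptic cell and by pixel within a cell. The function name rgbd_to_light_field, the disparity model (a shift proportional to the cell's offset from the display center divided by depth), and the shift_gain constant are illustrative assumptions introduced here, not parameters taken from the disclosure; with a constant depth of infinity the shift vanishes and the code reduces to the first (pure replication) approach.

```python
# Hedged sketch: fill a 4D light field buffer (cells_y, cells_x, H, W, 3) from a
# single RGB-D image by replicating the image into every plenoptic cell and
# shifting each pixel in x/y as a function of its depth and the cell's offset.
import numpy as np

def rgbd_to_light_field(rgb, depth, cells_x, cells_y, shift_gain=4.0):
    h, w, _ = rgb.shape
    field = np.zeros((cells_y, cells_x, h, w, 3), dtype=rgb.dtype)
    # Scatter far pixels first so nearer pixels overwrite them (simple occlusion).
    order = np.argsort(-depth, axis=None)
    ys, xs = np.unravel_index(order, depth.shape)
    d = depth[ys, xs]
    for cy in range(cells_y):
        for cx in range(cells_x):
            ox, oy = cx - (cells_x - 1) / 2, cy - (cells_y - 1) / 2
            # Depth-dependent shift; infinite depth -> zero shift (pure replication).
            dx = np.round(shift_gain * ox / np.maximum(d, 1e-6)).astype(int)
            dy = np.round(shift_gain * oy / np.maximum(d, 1e-6)).astype(int)
            tx, ty = xs + dx, ys + dy
            keep = (0 <= tx) & (tx < w) & (0 <= ty) & (ty < h)
            field[cy, cx, ty[keep], tx[keep]] = rgb[ys[keep], xs[keep]]
    return field
```

In this sketch the per-cell work is independent and therefore trivially parallelizable, which is consistent with the real-time, low-end-hardware goal described above.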
The two approaches described above allow for rapid computation of the 4D light field data (x, y, theta, phi) from a 2D or 2.5D source image. Embodiments described herein allow for the generation of a 4D light field from a 2D or 2.5D source image in real time, even on low-end computing hardware (e.g., smartphone CPU / GPU). The embodiments described herein circumvent the need for multiple render passes, instead programmatically computing the necessary data from a single 2D or 2.5D source image. Thus, the disclosed embodiments are 10x to 100x faster than the traditional approach (depending on the spatial resolution of the target light field display). The disclosed embodiments may be a key enabler for extended reality (XR) visors, XR wall portals,
XR construction helmets, XR pilot helmets, XR far eye displays, and the like. As used herein, XR includes Virtual Reality (VR), Augmented Reality (AR), Mixed Reality (MR), and any combination thereof.
FIGURE 1 illustrates an example system 100 for producing a light field from a depth map. As seen in FIGURE 1, system 100 includes a processor 110, memory 120, and an electronic display 130. One or more source images 150 and a depth map to light field module 140 may be stored in memory 120. Electronic display 130 includes multiple plenoptic cells 132 (e.g., 132A, 132B, etc.), and each plenoptic cell 132 includes multiple display pixels 134 (e.g., 134A-134P). For illustrative purposes only, electronic display 130 of FIGURE 1 includes nine plenoptic cells 132, and each plenoptic cell 132 includes sixteen display pixels 134. This provides a 4D resultant light field that has a 4x4 angular resolution and a 3x3 spatial resolution. However, electronic display 130 may have any number of plenoptic cells 132 in any physical arrangement, and each plenoptic cell 132 may have any number of display pixels 134.
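For concreteness, the relationship between raw panel pixels and (plenoptic cell, ray direction) indices for a display like the one in FIGURE 1 reduces to simple integer arithmetic. This is a hedged sketch that assumes the cells tile the panel in row-major order; the function names are hypothetical and a real panel may arrange its cells differently.

```python
# Hedged sketch: index arithmetic for a display of Sx x Sy plenoptic cells,
# each holding Ax x Ay display pixels (FIGURE 1's example: Sx = Sy = 3,
# Ax = Ay = 4, i.e. 3x3 spatial resolution and 4x4 angular resolution).
def display_to_plenoptic(px, py, ax=4, ay=4):
    """Raw panel pixel (px, py) -> (cell_x, cell_y, u, v); (u, v) selects a ray direction."""
    return px // ax, py // ay, px % ax, py % ay

def plenoptic_to_display(cell_x, cell_y, u, v, ax=4, ay=4):
    """Inverse mapping: (cell, ray index) -> raw panel pixel."""
    return cell_x * ax + u, cell_y * ay + v

# Round trip on the 12x12 panel formed by the 3x3 grid of 4x4 cells.
assert display_to_plenoptic(*plenoptic_to_display(2, 1, 3, 0)) == (2, 1, 3, 0)
```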
Processor 110 is any electronic circuitry, including, but not limited to microprocessors, application specific integrated circuits (ASIC), application specific instruction set processor (ASIP), and/or state machines, that communicatively couples to memory 120 and controls the operation of system 100. Processor 110 may be 8-bit, 16-bit, 32-bit, 64-bit or of any other suitable architecture. Processor 110 may include an arithmetic logic unit (ALU) for performing arithmetic and logic operations, processor registers that supply operands to the ALU and store the results of ALU operations, and a control unit that fetches instructions from memory and executes them by directing the coordinated operations of the ALU, registers and other components. Processor 110 may include other hardware that operates software to control and process information. Processor 110 executes software stored on memory to perform any of the functions described herein. Processor 110 controls the operation and administration of depth map to light field module 140. Processor 110 may be a programmable logic device, a microcontroller, a microprocessor, any suitable processing device, or any suitable combination of the preceding. Processor 110 is not limited to a single processing device and may encompass multiple processing devices.
Memory 120 may store, either permanently or temporarily, source images 150, operational software such as depth map to light field module 140, or other information for processor 110. Memory 120 may include any one or a combination of volatile or non-volatile local or remote devices suitable for storing information. For example, memory 120 may include random access memory (RAM), read only memory (ROM), magnetic storage devices, optical storage devices, or any other suitable information storage device or a combination of these devices. Depth map to light field module 140 represents any suitable set of instructions, logic, or code embodied in a computer-readable storage medium. For example, depth map to light field module 140 may be embodied in memory 120, a disk, a CD, or a flash drive. In particular embodiments, depth map to light field module 140 may include an application executable by processor 110 to perform one or more of the functions described herein.
Source image 150 is any image or electronic data file associated with an image. In some embodiments, source image 150 is captured by a camera. In some embodiments, source image 150 is a 2D image that contains color data but does not contain depth data. In some embodiments, source image 150 is a 2.5D image that contains color data plus depth data (e.g., RGB-D). In some embodiments, source image 150 is a stereographic image from a stereo pair. Source image 150 may contain any appropriate pixel data (e.g., RGB-D, RGBA, BGR, HSV, and the like).
In operation, system 100 provides (via, e.g., depth map to light field module 140) a 4D light field for display on electronic display 130 from a 2D or 2.5D source image 150. To do so, depth map to light field module 140 accesses a source image 150 stored in memory 120 and determines depth data for each pixel 152 of a plurality of pixels 152 of the source image 150. If the source image 150 is a 2D image that does not contain depth data, depth map to light field module 140 may assume a constant depth for the image (e.g., infinity) and then map each pixel 152 of the source image 150 to a corresponding pixel 134 of each electronic display 130. For example, FIGURE 2 is a diagrammatic view of a 2D source image 150 to a 4D light field mapping. As illustrated in FIGURE 2, pixel 152A is mapped to display pixel 134A of each plenoptic cell 132 (i.e., all nine plenoptic cells 132), pixel 152B is mapped to display pixel 134B of each plenoptic cell 132, and so forth. The pixel data (i.e., color data plus constant depth) is sent from depth map to light field module 140 to electronic display 130 as
light field pixel data 142 in order to produce the corresponding 4D light field. For example, as the pixel data (e.g., RGB+XYD) arrives at the parser, the RGB-D is copied to the corresponding (x,y) for each cell within the target buffer. On the other hand, if the source image 150 is a 2.5D image that does contain depth data, depth map to light field module 140 uses the depth data when mapping each pixel 152 of the source image 150 to a corresponding pixel 134 of each electronic display 130. For example, FIGURE 3 is a diagrammatic view of a 2.5D source image 150 to a 4D light field mapping. As illustrated in FIGURE 3, pixel 152A is mapped to display pixel 134A of each plenoptic cell 132 (i.e., all nine plenoptic cells 132), pixel 152B is mapped to display pixel 134B of each plenoptic cell 132, and so forth. The pixel data (e.g., RGB-D) is sent from depth map to light field module 140 to electronic display 130 as light field pixel data 142 in order to produce the corresponding 4D light field. For example, as the pixel data (e.g., RGB+XYD) arrives at the parser, the RGB-D is copied to the corresponding (x,y) for each cell within the target buffer, but is shifted as a function of the depth data. This is illustrated in FIGURE 3 by the shifting of the 4 inner pixels of source image 150 (labeled “A” and shaded grey) in certain plenoptic cells 132.
FIGURE 4 illustrates a method 400 for producing a light field from a depth map. In general, method 400 may be utilized by depth map to light field module 140 to generate light field pixel data 142 from source image 150 and send light field pixel data 142 to electronic display 130 in order to display a 4D light field corresponding to the source image 150. To do so, method 400 may access a source image (e.g., source image 150) stored in one or more memory units (e.g., memory 120). Method 400 may then determine depth data for each pixel of a plurality of pixels of the source image (e.g., steps 410-450) and then map, using the plurality of pixels and the determined depth data for each of the plurality of pixels, the source image to a four-dimensional light field (e.g., steps 460-480). Each of these steps is described in more detail below.
At step 410, method 400 determines whether the source image contains depth data (e.g. from a render buffer). In some embodiments, the source image is source image 150. If the source image includes depth data, method 400 proceeds to step 420. If the source image does not include depth data, method 400 proceeds to step 440.
At step 420, method 400 determines if the depth data of the source image is RGB-D data. If the depth data of the source image is RGB-D data, method 400
proceeds to step 460. If the depth data of the source image is not RGB-D data, method 400 proceeds to step 430. At step 430, method 400 converts the depth data to RGB-D and then proceeds to step 460.
At step 440, method 400 determines if the source image is stereoscopic. If method 400 determines that the source image is stereoscopic, method 400 proceeds to step 445. If method 400 determines that the source image is not stereoscopic, method 400 proceeds to step 450.
At step 445, method 400 computes the depth data from the stereoscopic source image. In some embodiments, the depth data is computed via parallax differences. After step 445, method 400 proceeds to step 460.
At step 450, method 400 assigns a constant depth and a fixed location to the source image that is devoid of depth data. For example, method 400 may assign a constant depth of infinity. After step 450, method 400 proceeds to step 460.
At step 460, method 400 determines if the runtime environment supports reverse-lookup (e.g., a hardware implementation for providing a light field). If method 400 determines that the runtime environment supports reverse-lookup in step 460, method 400 proceeds to step 470. If method 400 determines that the runtime environment does not support reverse-lookup in step 460, method 400 proceeds to step 480.
At step 470, method 400 maps and sends light field pixel data (e.g., light field pixel data 142) using reverse lookup (e.g., hardware implementation). In general, step 470 may include copying RGB+D to corresponding X & Y for each plenoptic cell within the target buffer as pixel data (RGB + XYD) arrives at the parser, shifting as a function of depth (D). In some embodiments, Ax * Ay total reads and Ax * Ay * Sx * Sy total writes may be performed. More details about certain embodiments of step 470 are described in more detail below with respect to FIGURE 5. After step 470, method 400 may end.
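As a hedged back-of-the-envelope check of those counts, using the FIGURE 1 example display (4x4 angular resolution, 3x3 spatial resolution) and assuming the source image resolution matches the per-cell angular resolution:

```python
# Read/write counts for the reverse-lookup path on the FIGURE 1 example display.
Ax, Ay = 4, 4   # angular resolution: pixels per plenoptic cell
Sx, Sy = 3, 3   # spatial resolution: cells per display
reads = Ax * Ay                 # each incoming source pixel is read once      -> 16
writes = Ax * Ay * Sx * Sy      # and written once into every plenoptic cell   -> 144
print(reads, writes)            # 16 144
```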
At step 480, method 400 maps and sends light field pixel data (e.g., light field pixel data 142) using forward-lookup (e.g., software such as a graphics pipeline shader). In general, step 480 may include performing a ray-march for each pixel in output buffer along its corresponding theta / phi direction to determine target pixel data from the reference RGB-D map. In some embodiments, this may require Ax * Ay * Sx * Sy
(parallelizable) ray-marches (where Ax & Ay are angular resolution width & height (e.g., pixels per cell) and Sx & Sy are spatial resolution width & height (e.g., cells per display)). More details about certain embodiments of step 480 are described in more detail below with respect to FIGURE 7. After step 480, method 400 may end.
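The decision flow of method 400 can be summarized in a short dispatcher. This is a hedged sketch: the dict-based source representation, the helper functions, and the callables passed in for the two mapping paths are hypothetical stand-ins for steps 410-480, not identifiers from the disclosure, and the depth-from-parallax step is only stubbed.

```python
# Hedged sketch of the FIGURE 4 dispatch logic (method 400).
import numpy as np

def prepare_rgbd(source):
    """Steps 410-450: ensure the source carries per-pixel depth (RGB-D)."""
    if "depth" in source:                     # step 410: depth data present
        return source                         # steps 420/430: assume it is already RGB-D here
    if "stereo" in source:                    # step 440: stereoscopic pair
        # Step 445 would compute depth via parallax differences
        # (e.g. a block-matching disparity estimator); not sketched here.
        raise NotImplementedError("depth-from-parallax not sketched")
    # Step 450: 2D image with no depth data -> constant depth of infinity.
    return {"rgb": source["rgb"],
            "depth": np.full(source["rgb"].shape[:2], np.inf)}

def produce_light_field(source, supports_reverse_lookup, map_reverse, map_forward):
    rgbd = prepare_rgbd(source)
    # Step 460: choose the hardware (reverse-lookup) or shader (forward-lookup) path.
    return map_reverse(rgbd) if supports_reverse_lookup else map_forward(rgbd)   # steps 470 / 480
```

Used together with the earlier rgbd_to_light_field sketch, map_reverse could simply be `lambda s: rgbd_to_light_field(s['rgb'], s['depth'], 3, 3)`.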
FIGURE 5 is a flowchart of a method 500 for producing a light field from a depth map using reverse-lookup (e.g., hardware implementation), according to certain embodiments. FIGURE 6 is a diagrammatic view of a reverse-lookup mapping for a hardware implementation such as method 500. In general, if parallelizable hardware is available, method 500 uses the color and depth (e.g., RGB-D) of the pixel in question to determine in which location in the output image to write that data. In step 510, method 500 determines if the incoming pixel 152 of source image 150 has been written to all plenoptic cells 132. If so, method 500 ends. Otherwise, method 500 proceeds to step 520. In step 520, method 500 uses the pixel's location in the display matrix (its cell) to determine the real world offset from the center of the light field display to its encompassing plenoptic cell’s corners. In some embodiments, step 520 includes computing the pixel offset (dX/dY) as a ratio-of-slopes-function of the plenoptic cell position (cX/cY) and pixel depth (D). After step 520, method 500 proceeds to step 530.
In step 530, method 500 determines if the pixel has already been written at the location (X + dX, Y + dY) of this cell for this frame. If so, method 500 proceeds to step 540. If not, method 500 proceeds to step 550.
At step 540, method 500 determines if the depth of the already-written pixel is smaller (i.e., closer to the camera) than the depth of the not-yet-written pixel. If the depth of the already-written pixel is smaller than the depth of the not-yet-written pixel, method 500 does not write any pixel data for the incoming pixel and proceeds back to step 510. If the depth of the already-written pixel is not smaller than the depth of the not-yet-written pixel, method 500 proceeds to step 550.
In step 550, method 500 writes the pixel data (RGB+D) to location (X+dX, Y+dY) of the plenoptic cell. Because each pixel in a plenoptic cell represents a different ray direction (the angles theta and phi), method 500 may use the difference between the pixel's location and the plenoptic cell's center, multiplied by the pitch of a cell, to calculate one side of a right triangle in step 550. Method 500 may use the given depth to calculate the other side of that same triangle. The ratio of those two sides yields an angle related to the pixel offset that should be applied when writing the output buffer data for the display. This relationship may be defined by taking that angle, converting it to degrees, and multiplying by the pixels-per-degree of the display and the sign of the difference between the pixel's location and its cell center. This offset is then added to the original pixel x/y and its cell's center to create a new world position. The pixel's color value and depth are written to this position in the output buffer, if it is closer than what was there before, to create a part of the light field.
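The right-triangle relationship described above may be sketched as follows. The parameter names, units, and the arctangent form are illustrative assumptions rather than the literal implementation of method 500.

```python
import math

def pixel_offset(pixel_pos, cell_centre, cell_pitch, depth, pixels_per_degree):
    """Hedged sketch of the right-triangle relation in step 550.

    One leg of the triangle is the lateral distance from the pixel to its
    cell centre (scaled by the cell pitch); the other leg is the sample's
    depth.  The angle formed by the ratio of those legs, converted to degrees
    and scaled by the display's pixels-per-degree, gives the magnitude of the
    shift; the sign of the lateral difference gives its direction.
    """
    lateral = (pixel_pos - cell_centre) * cell_pitch           # first leg
    angle_deg = math.degrees(math.atan2(abs(lateral), depth))  # ratio of the legs
    sign = 1 if lateral >= 0 else -1
    return sign * int(round(angle_deg * pixels_per_degree))
```

The returned offset would then be added to the original pixel x/y and its cell center before the depth-tested write of step 550.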
Method 500 may be implemented in low-level code, firmware, or transistor logic and run on hardware close to the light field display 130. In this application, identical synthetic content may be fed to each such hardware unit, and each unit may use its known position to calculate its portion of the light field.
FIGURE 7 is a flowchart of a method 700 for producing a light field from a depth map using forward-lookup (e.g., software such as a graphics pipeline shader), according to certain embodiments. FIGURE 9 is a diagrammatic view of a forward-lookup mapping for a software implementation such as method 700. In general, in the case where there is no hardware implementation, it is possible to map the 4D light field from source image 150 using a loop. Method 700 uses the buffer location of the pixel in question, the depth (if available), and the cell location to determine which color should be placed in that same spot in the output buffer. At a high level, method 700 projects the location and direction in space of that particular pixel within the plenoptic cell into the virtual camera space so that the appropriate color to put in that pixel can be looked up. As illustrated in FIGURE 9, the operation of method 700 is opposite from that of method 500 (i.e., method 700 reads through the display pixels 134 of plenoptic cells 132, while method 500 reads through the source pixel data 152 of source image 150, as illustrated in FIGURE 6).
The first step of method 700 is to calculate the camera ray for the given pixel location. This is the direction along which light travels as it intersects this pixel's cell location. This ray uses the world position of the pixel's cell as its origin and the angles theta and phi as its direction. These angles can be calculated from the pixel's offset from the center of its cell. The details of this calculation depend on whether foveation or another mapping needs to take place and should not be assumed to be linear. The ray's origin is converted from pixel space to world space using known properties of the collection system, such as the near plane and field of view. Next, method 700 converts this ray into one in the depth (background) space. This transformation converts the 3D world-space vector into a 2D depth-space vector and uses inverse depth for comparisons. Next, the derivatives of both the camera ray and the depth vector are calculated with respect to the spatial dimensions (x, y). These derivatives are then used in a loop to traverse through depth space to determine the closest object in the rendered buffer that the ray would hit. In the case of a separate depth buffer, the depth-space vector would also be traversed through its buffer. At the end, the closest object between the depth space and the color space determines the actual depth of the pixel's ray. In these steps, it may be preferable to use the tangent of the inverse depth during the recursion for stability reasons. With the depth of the object that was hit, the origin of the ray, and its direction, trigonometric relations (using known properties of the collector such as field of view) can be used to convert from world space back to screen space to determine which source color image pixel's color data should be used for the output image at this pixel location. The specific steps of some embodiments of method 700 are described in more detail below.
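The first step described above, constructing the camera ray for a given display pixel, may be sketched as follows. The linear angle mapping and the physical-layout parameters are assumptions for illustration only, since, as noted above, the mapping need not be linear (e.g., under foveation).

```python
import numpy as np

def camera_ray(cell_xy, pixel_xy, pixels_per_cell, degrees_per_pixel,
               display_extent, cells_per_display):
    """Build the ray for one display pixel (illustrative assumptions only).

    Origin: the world-space position of the pixel's plenoptic cell, here a
    simple linear mapping of the cell index onto the display's physical
    extent.  Direction: theta/phi taken linearly from the pixel's offset
    from its cell centre; the linear form is an assumption of this sketch.
    """
    cx, cy = cell_xy
    px, py = pixel_xy
    centre = (pixels_per_cell - 1) / 2.0
    theta = np.radians((px - centre) * degrees_per_pixel)   # horizontal angle
    phi = np.radians((py - centre) * degrees_per_pixel)     # vertical angle
    origin = np.array([
        (cx / max(cells_per_display[0] - 1, 1) - 0.5) * display_extent[0],
        (cy / max(cells_per_display[1] - 1, 1) - 0.5) * display_extent[1],
        0.0])
    direction = np.array([np.tan(theta), np.tan(phi), 1.0])  # tangent-space form
    return origin, direction / np.linalg.norm(direction)
```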
Method 700 may begin in step 710 where method 700 determines if every pixel in the render target has been checked. If so, method 700 may end. Otherwise, method 700 proceeds to step 715.
At step 715, method 700 retrieves the next pixel X,Y in the render target. After step 715, method 700 proceeds to step 720, where method 700 determines the ray direction of pixel X,Y in the scene space (d1). After step 720, method 700 proceeds to step 725, where method 700 determines the ray direction of pixel X,Y in the background space (d2). After step 725, method 700 proceeds to step 730, where method 700 calculates the derivatives of d1 and d2 with respect to X/Y. After step 730, method 700 proceeds to step 735, where method 700 calls the function of FIGURE 8 using d1 and its derivatives to approximate the ray intersection point i1 in the scene space. After step 735, method 700 proceeds to step 740, where method 700 calls the function of FIGURE 8 using d2 and its derivatives to approximate the ray intersection point i2 in the background space. After step 740, method 700 proceeds to step 745, where method 700 converts i1 and i2 into more accurate inverse depths id1 and id2. After step 745, method 700 proceeds to step 750, where method 700 calculates an ending inverse depth (ide) from id2. After step 750, method 700 proceeds to step 755.
At step 755, method 700 determines if id1 is less than id2. If id1 is less than id2, method 700 proceeds to step 760, where method 700 writes 0 (blocked) and then proceeds back to step 710. Otherwise, if id1 is not less than id2, method 700 proceeds to step 765, where method 700 writes the color sample from the scene, or a multisample (e.g., derivative), at i1. After step 765, method 700 proceeds back to step 710.
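A compact sketch of the per-pixel loop of steps 710 through 765 is shown below. Every callable passed in is a hypothetical stand-in for the corresponding flowchart operation, and step 750 (the ending inverse depth) is omitted for brevity.

```python
import numpy as np

def map_render_target(width, height, scene, background, scene_ray,
                      background_ray, intersect, to_inverse_depth,
                      sample_scene):
    """Per-pixel driver mirroring steps 710 through 765 of FIGURE 7.

    scene_ray and background_ray are hypothetical helpers returning the ray
    direction of pixel (X, Y) as numpy vectors (d1, d2); intersect stands in
    for the FIGURE 8 function; to_inverse_depth and sample_scene stand in
    for steps 745 and 765.
    """
    out = np.zeros((height, width, 3))
    for y in range(height):
        for x in range(width):
            d1 = scene_ray(x, y)                                    # step 720
            d2 = background_ray(x, y)                               # step 725
            d1_dxy = (scene_ray(x + 1, y) - d1,
                      scene_ray(x, y + 1) - d1)                     # step 730
            d2_dxy = (background_ray(x + 1, y) - d2,
                      background_ray(x, y + 1) - d2)
            i1 = intersect(scene, d1, d1_dxy)                       # step 735
            i2 = intersect(background, d2, d2_dxy)                  # step 740
            id1, id2 = to_inverse_depth(i1), to_inverse_depth(i2)   # step 745
            if id1 < id2:
                out[y, x] = 0.0                                     # step 760: blocked
            else:
                out[y, x] = sample_scene(scene, i1)                 # step 765
    return out
```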
In the forward-lookup application (i.e., FIGURE 7), method 700 may be implemented as a post-process shader in a game engine or render engine. This shader would take as input the data from cameras placed in the synthetic scene and, optionally, depth buffers from those virtual cameras. Method 700 would then output a single image containing, as an array of images, the cells that make up the light field.
FIGURE 8 is a flowchart of a sub-method 800 of FIGURE 7, according to certain embodiments. Method 800 is a parallax-occlusion ray-to-inverse-depth method that functions by raymarching with uniform steps in the pixel space. In general, method 800 first sets up a conversion between 3D space and the 2D space of the input render. This allows method 800 to perform its iteration in a pixel-perfect manner while still doing the computation in 3D space. Not only does this provide better-quality results, but it also provides drastic performance gains when the light field display is small with respect to the scene it is displaying.
Method 800 may begin in step 810, where method 800 calculates the starting tangent for the inverse depth for the direction passed in from method 700 (d1 or d2). After step 810, method 800 proceeds to step 820, where method 800 calculates the ending tangent for the inverse depth for the direction passed in from method 700 (d1 or d2). In general, in steps 810 and 820, method 800 obtains the tangents (in the original render space) corresponding to the start and end inverse depths along the input ray. Method 800 may in addition compute the constant step (corresponding to half of the diagonal length of one pixel) to be used in the raymarching.
In steps 820 and 825, method 800 begins a loop that takes a parameter 't' from 0 to 1. Method 800 considers t=0 to be the start of the input ray and t=1 to be its endpoint, and method 800 increments 't' by the previously described precomputed constant at each iteration of the loop (step 845). Because 't' is linearly related to the original render's pixel space, it has an inverse relation to the depth of the input ray (i.e., t*depth is constant). Method 800 uses this fact to compute the depth of the input ray at each step in the raymarch (step 830, where method 800 calculates the parametric inverse depth idt using 't'), and method 800 uses the linear relation between t and the input pixel space to sample the input depth map, which is called the scene depth (step 835). On the first iteration in which the ray depth exceeds the scene depth (step 840), method 800 concludes that the ray has 'hit' something and takes an early exit to return an appropriately interpolated value (step 850). Otherwise, in the case that this never occurs before the end of the loop, method 800 returns an indication that nothing was hit (step 855). Method 800 thus is a parallax occlusion algorithm that results in an inverse depth, allowing it to return an intersection point (as an inverse depth along the ray) as well as a 'nothing hit' signal (as the input ending inverse depth).
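A sketch of the FIGURE 8 raymarch follows. It assumes the input ray has already been projected onto the depth map as start and end pixel positions with matching inverse depths, and the half-pixel step is an approximation of the half-pixel-diagonal step mentioned above; all names are illustrative.

```python
import numpy as np

def parallax_occlusion_inverse_depth(depth_map, start_px, end_px,
                                     start_inv_depth, end_inv_depth):
    """Hedged sketch of the FIGURE 8 sub-method.

    't' advances in uniform pixel-space steps of roughly half a pixel, and
    because t is linear in pixel space the ray's inverse depth is linear in
    t, so it can be interpolated at each step without a division.
    """
    start_px = np.asarray(start_px, dtype=float)
    end_px = np.asarray(end_px, dtype=float)
    h, w = depth_map.shape
    step = 0.5 / max(np.linalg.norm(end_px - start_px), 1.0)
    t = 0.0
    while t <= 1.0:
        ray_inv_depth = (1 - t) * start_inv_depth + t * end_inv_depth
        x, y = np.clip((start_px + t * (end_px - start_px)).round().astype(int),
                       0, [w - 1, h - 1])
        scene_inv_depth = 1.0 / max(float(depth_map[y, x]), 1e-9)
        if ray_inv_depth < scene_inv_depth:   # ray depth now exceeds scene depth: hit
            return ray_inv_depth              # early exit with interpolated inverse depth
        t += step
    return end_inv_depth                      # 'nothing hit' signal: ending inverse depth
```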
The methods described herein mostly use tangent space as opposed to either pixel space or angle space, and mostly use inverse depth as opposed to depth. Both of these choices of space have conceptual and computational advantages. For directions in 3D space and for locations on a 2D pinhole-style image, the space of choice is tangent space. This is defined with respect to the plane perpendicular to the view direction of the camera/light field so that for tangent coordinate {w_x, w_y}, the direction it points corresponds to the vector <x = w_x, y = w_y, z = 1>. In plenoptic cellular light field contexts, tangent space may be the best choice because it is the most consistent, computationally cheapest option and is readily compatible with the other spaces (pixel space is tangent space with a scalar multiplier).
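A minimal illustration of the tangent-space convention described above, with the scaling parameter names assumed for illustration only:

```python
import numpy as np

def pixel_to_tangent(px, py, centre_px, pixels_per_tangent_unit):
    """Pixel space is tangent space up to a scalar multiplier."""
    return ((px - centre_px) / pixels_per_tangent_unit,
            (py - centre_px) / pixels_per_tangent_unit)

def tangent_to_direction(w_x, w_y):
    """Tangent coordinate {w_x, w_y} corresponds to the direction <w_x, w_y, 1>."""
    v = np.array([w_x, w_y, 1.0])
    return v / np.linalg.norm(v)
```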
The use of inverse depth space offers many advantages. As noted above, in a standard camera projection, pixel position is inversely related to scene depth (the coordinate parallel to the view direction, not the distance to the camera) and therefore linearly related to inverse depth. This allows the disclosed embodiments to execute large sections of computation (such as pixel-perfect raymarching) without ever directly computing depth and without ever using a division. Another advantage of using inverse depth is a clean representation of infinity. An inverse depth of 0 may be used to refer to the 'back' of the scene, as it is effectively a depth of infinity. This proves to be much more useful than its counterpart, a depth of zero, which inverse depth space renders unwieldy.
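A short illustration of these two properties, the division-free interpolation of inverse depth and the clean representation of infinity, under the same illustrative assumptions as the sketches above:

```python
def inverse_depth(depth):
    """An inverse depth of 0 cleanly represents the 'back' of the scene."""
    return 0.0 if depth == float("inf") else 1.0 / depth

def lerp_inverse_depth(id_start, id_end, t):
    """Because pixel position is linear in inverse depth, a raymarch can
    interpolate inverse depth directly, with no per-step division."""
    return (1.0 - t) * id_start + t * id_end
```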
The scope of this disclosure is not limited to the example embodiments described or illustrated herein. The scope of this disclosure encompasses all changes,
substitutions, variations, alterations, and modifications to the example embodiments described or illustrated herein that a person having ordinary skill in the art would comprehend.
Modifications, additions, or omissions may be made to the systems and apparatuses described herein without departing from the scope of the disclosure. The components of the systems and apparatuses may be integrated or separated. Moreover, the operations of the systems and apparatuses may be performed by more, fewer, or other components. Additionally, operations of the systems and apparatuses may be performed using any suitable logic comprising software, hardware, and/or other logic.
Modifications, additions, or omissions may be made to the methods described herein without departing from the scope of the disclosure. The methods may include more, fewer, or other steps. Additionally, steps may be performed in any suitable order. That is, the steps of any method disclosed herein do not have to be performed in the exact order disclosed, unless explicitly stated.
As used in this document, “each” refers to each member of a set or each member of a subset of a set. Furthermore, as used in the document “or” is not necessarily exclusive and, unless expressly indicated otherwise, can be inclusive in certain embodiments and can be understood to mean “and/or.” Similarly, as used in this document “and” is not necessarily inclusive and, unless expressly indicated otherwise, can be inclusive in certain embodiments and can be understood to mean “and/or.” All references to "a/an/the element, apparatus, component, means, step, etc." are to be interpreted openly as referring to at least one instance of the element, apparatus, component, means, step, etc., unless explicitly stated otherwise.
Furthermore, reference to an apparatus or system or a component of an apparatus or system being adapted to, arranged to, capable of, configured to, enabled to, operable to, or operative to perform a particular function encompasses that apparatus, system, component, whether or not it or that particular function is activated, turned on, or unlocked, as long as that apparatus, system, or component is so adapted, arranged, capable, configured, enabled, operable, or operative.
In addition, techniques, systems, subsystems, and methods described and illustrated in the various embodiments as discrete or separate may be combined or integrated with other systems, modules, techniques, or methods without departing from
the scope of this disclosure. Other items shown or discussed as coupled or directly coupled or communicating with each other may be indirectly coupled or communicating through some interface, device, or intermediate component whether electrically, mechanically, or otherwise. Other examples of changes, substitutions, and alterations are ascertainable by one skilled in the art and could be made without departing from the spirit and scope disclosed herein.
To aid the Patent Office, and any readers of any patent issued on this application in interpreting the claims appended hereto, applicants note that they do not intend any of the appended claims to invoke 35 U.S.C. § 112(f) as it exists on the date of filing hereof unless the words “means for” or “step for” are explicitly used in the particular claim.
Claims
1. A system comprising: an electronic display; a computer processor; one or more memory units; and a module stored in the one or more memory units, the module configured, when executed by the processor, to: access a source image stored in the one or more memory units; determine depth data for each pixel of a plurality of pixels of the source image; map, using the plurality of pixels and the determined depth data for each of the plurality of pixels, the source image to a four-dimensional light field; and send instructions to the electronic display to display the mapped four-dimensional light field.
2. The system of Claim 1, wherein: the source image comprises the depth data; and determining the depth data for each pixel of the plurality of pixels of the source image comprises accessing the depth data of the source image from a render buffer.
3. The system of Claim 1, wherein: the source image is a stereoscopic image that is devoid of the depth data; and determining the depth data for each pixel of the plurality of pixels of the source image comprises computing the depth data via parallax differences in the stereoscopic image.
4. The system of Claim 1, wherein: the source image is a non-stereoscopic image that is devoid of the depth data; and determining the depth data for each pixel of the plurality of pixels of the source image comprises assigning a constant depth value and a fixed location to the depth data.
5. The system of Claim 1, wherein: the module is further configured, when executed by the processor, to determine whether a runtime environment for the system supports reverse-lookup; mapping, using the plurality of pixels and the determined depth data for each of the plurality of pixels, the source image to the four-dimensional light field comprises using reverse-lookup when it is determined that the runtime environment for the system supports reverse-lookup; and mapping, using the plurality of pixels and the determined depth data for each of the plurality of pixels, the source image to the four-dimensional light field comprises using forward-lookup when it is determined that the runtime environment for the system does not support reverse-lookup.
6. The system of Claim 1, wherein: the electronic display comprises a plurality of plenoptic cells, each plenoptic cell comprising a plurality of display pixels; and mapping, using the plurality of pixels and the determined depth data for each of the plurality of pixels, the source image to the four-dimensional light field comprises copying pixel data for each of the plurality of pixels of the source image to a corresponding display pixel of each of the plenoptic cells.
7. The system of Claim 6, wherein the corresponding display pixel is determined as a function of the depth data of the corresponding pixel of the source image.
8. The system of Claim 1, wherein: mapping, using the plurality of pixels and the determined depth data for each of the plurality of pixels, the source image to the four-dimensional light field comprises using reverse-lookup.
9. The system of Claim 1, wherein: mapping, using the plurality of pixels and the determined depth data for each of the plurality of pixels, the source image to the four-dimensional light field comprises using forward-lookup.
10. A method by a computing device, the method comprising: accessing a source image stored in one or more memory units; determining depth data for each pixel of a plurality of pixels of the source image; mapping, using the plurality of pixels and the determined depth data for each of the plurality of pixels, the source image to a four-dimensional light field; and sending instructions to an electronic display to display the mapped four-dimensional light field.
11. The method of Claim 10, wherein: the source image comprises the depth data; and determining the depth data for each pixel of the plurality of pixels of the source image comprises accessing the depth data of the source image from a render buffer.
12. The method of Claim 10, wherein: the source image is a stereoscopic image that is devoid of the depth data; and determining the depth data for each pixel of the plurality of pixels of the source image comprises computing the depth data via parallax differences in the stereoscopic image.
13. The method of Claim 10, wherein: the source image is a non-stereoscopic image that is devoid of the depth data; and determining the depth data for each pixel of the plurality of pixels of the source image comprises assigning a constant depth value and a fixed location to the depth data.
14. The method of Claim 10, further comprising determining whether a runtime environment for the computing device supports reverse-lookup, wherein: mapping, using the plurality of pixels and the determined depth data for each of the plurality of pixels, the source image to the four-dimensional light field comprises using reverse-lookup when it is determined that the runtime environment for the system supports reverse-lookup; and mapping, using the plurality of pixels and the determined depth data for each of the plurality of pixels, the source image to the four-dimensional light field comprises using forward-lookup when it is determined that the runtime environment for the system does not support reverse-lookup.
15. The method of Claim 10, wherein: the electronic display comprises a plurality of plenoptic cells, each plenoptic cell comprising a plurality of display pixels; and mapping, using the plurality of pixels and the determined depth data for each of the plurality of pixels, the source image to the four-dimensional light field comprises copying pixel data for each of the plurality of pixels of the source image to a corresponding display pixel of each of the plenoptic cells.
16. One or more computer-readable non-transitory storage media embodying software that is operable when executed to: access a source image stored in the one or more computer-readable non-transitory storage media; determine depth data for each pixel of a plurality of pixels of the source image; map, using the plurality of pixels and the determined depth data for each of the plurality of pixels, the source image to a four-dimensional light field; and send instructions to an electronic display to display the mapped four-dimensional light field.
17. The media of Claim 16, wherein: the source image comprises the depth data; and determining the depth data for each pixel of the plurality of pixels of the source image comprises accessing the depth data of the source image from a render buffer.
18. The media of Claim 16, wherein: the source image is a stereoscopic image that is devoid of the depth data; and determining the depth data for each pixel of the plurality of pixels of the source image comprises computing the depth data via parallax differences in the stereoscopic image.
19. The media of Claim 16, wherein: the source image is a non-stereoscopic image that is devoid of the depth data; and determining the depth data for each pixel of the plurality of pixels of the source image comprises assigning a constant depth value and a fixed location to the depth data.
20. The media of Claim 16, wherein: the software is further operable when executed to determine whether a runtime environment supports reverse-lookup; mapping, using the plurality of pixels and the determined depth data for each of the plurality of pixels, the source image to the four-dimensional light field comprises using reverse-lookup when it is determined that the runtime environment for the system supports reverse-lookup; and mapping, using the plurality of pixels and the determined depth data for each of the plurality of pixels, the source image to the four-dimensional light field comprises using forward-lookup when it is determined that the runtime environment for the system does not support reverse-lookup.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP21737891.8A EP4165586A1 (en) | 2020-06-12 | 2021-06-11 | Systems and methods for producing a light field from a depth map |
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US202063038639P | 2020-06-12 | 2020-06-12 | |
US63/038,639 | 2020-06-12 | ||
US17/345,436 | 2021-06-11 | ||
US17/345,436 US20210390722A1 (en) | 2020-06-12 | 2021-06-11 | Systems and Methods for Producing a Light Field from a Depth Map |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2021252892A1 true WO2021252892A1 (en) | 2021-12-16 |
Family
ID=78825780
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US2021/037005 WO2021252892A1 (en) | 2020-06-12 | 2021-06-11 | Systems and methods for producing a light field from a depth map |
Country Status (3)
Country | Link |
---|---|
US (1) | US20210390722A1 (en) |
EP (1) | EP4165586A1 (en) |
WO (1) | WO2021252892A1 (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20180342075A1 (en) * | 2017-05-25 | 2018-11-29 | Lytro, Inc. | Multi-view back-projection to a light-field |
US20200137376A1 (en) * | 2017-10-31 | 2020-04-30 | Wuhan China Star Optoelectronics Technology Co., Ltd | Method for generating a light-field 3d display unit image and a generating device |
Family Cites Families (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR100505334B1 (en) * | 2003-03-28 | 2005-08-04 | (주)플렛디스 | Real-time stereoscopic image conversion apparatus using motion parallaxr |
US8559705B2 (en) * | 2006-12-01 | 2013-10-15 | Lytro, Inc. | Interactive refocusing of electronic images |
JP5242667B2 (en) * | 2010-12-22 | 2013-07-24 | 株式会社東芝 | Map conversion method, map conversion apparatus, and map conversion program |
US9390505B2 (en) * | 2013-12-12 | 2016-07-12 | Qualcomm Incorporated | Method and apparatus for generating plenoptic depth maps |
US9305375B2 (en) * | 2014-03-25 | 2016-04-05 | Lytro, Inc. | High-quality post-rendering depth blur |
JP6620394B2 (en) * | 2014-06-17 | 2019-12-18 | ソニー株式会社 | Control device, control method and program |
EP3228977A4 (en) * | 2014-12-01 | 2018-07-04 | Sony Corporation | Image-processing device and image-processing method |
US10735711B2 (en) * | 2017-05-05 | 2020-08-04 | Motorola Mobility Llc | Creating a three-dimensional image via a wide-angle camera sensor |
2021
- 2021-06-11 EP EP21737891.8A patent/EP4165586A1/en active Pending
- 2021-06-11 WO PCT/US2021/037005 patent/WO2021252892A1/en unknown
- 2021-06-11 US US17/345,436 patent/US20210390722A1/en active Pending
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20180342075A1 (en) * | 2017-05-25 | 2018-11-29 | Lytro, Inc. | Multi-view back-projection to a light-field |
US20200137376A1 (en) * | 2017-10-31 | 2020-04-30 | Wuhan China Star Optoelectronics Technology Co., Ltd | Method for generating a light-field 3d display unit image and a generating device |
Non-Patent Citations (1)
Title |
---|
WANG QIAOSONG ET AL: "Stereo vision-based depth of field rendering on a mobile de", JOURNAL OF ELECTRONIC IMAGING, S P I E - INTERNATIONAL SOCIETY FOR OPTICAL ENGINEERING, US, vol. 23, no. 2, 1 March 2014 (2014-03-01), pages 23009, XP060047684, ISSN: 1017-9909, [retrieved on 20140319], DOI: 10.1117/1.JEI.23.2.023009 * |
Also Published As
Publication number | Publication date |
---|---|
US20210390722A1 (en) | 2021-12-16 |
EP4165586A1 (en) | 2023-04-19 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20210174471A1 (en) | Image Stitching Method, Electronic Apparatus, and Storage Medium | |
US11363249B2 (en) | Layered scene decomposition CODEC with transparency | |
US6680735B1 (en) | Method for correcting gradients of irregular spaced graphic data | |
US7973791B2 (en) | Apparatus and method for generating CG image for 3-D display | |
Heidrich et al. | View-independent environment maps | |
RU2754721C2 (en) | Device and method for generating an image of the intensity of light radiation | |
CN111698463A (en) | View synthesis using neural networks | |
JP2000348202A (en) | Shifting warp renderling method for volume data set having voxel and rendering method for volume data set | |
Xiong et al. | Registration, calibration and blending in creating high quality panoramas | |
TW202036481A (en) | Distance field color palette | |
GB2475944A (en) | Correction of estimated axes of elliptical filter region | |
Bonatto et al. | Real-time depth video-based rendering for 6-DoF HMD navigation and light field displays | |
JP2006244426A (en) | Texture processing device, picture drawing processing device, and texture processing method | |
CN114782607A (en) | Graphics texture mapping | |
CN113643414A (en) | Three-dimensional image generation method and device, electronic equipment and storage medium | |
US20210390722A1 (en) | Systems and Methods for Producing a Light Field from a Depth Map | |
JP3629243B2 (en) | Image processing apparatus and method for rendering shading process using distance component in modeling | |
EP4064193A1 (en) | Real-time omnidirectional stereo matching using multi-view fisheye lenses | |
KR20220133766A (en) | Real-time omnidirectional stereo matching method using multi-view fisheye lenses and system therefore | |
KR20210117988A (en) | Methods and apparatus for decoupled shading texture rendering | |
Kolhatkar et al. | Real-time virtual viewpoint generation on the GPU for scene navigation | |
CN114503150A (en) | Method and apparatus for multi-lens distortion correction | |
JP2002092597A (en) | Method and device for processing image | |
Walia et al. | A computationally efficient framework for 3D warping technique | |
EP1209620A2 (en) | Method for correcting gradients of irregularly spaced graphic data |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 21737891; Country of ref document: EP; Kind code of ref document: A1 |
ENP | Entry into the national phase | Ref document number: 2021737891; Country of ref document: EP; Effective date: 20230112 |
NENP | Non-entry into the national phase | Ref country code: DE |