WO2021252892A1 - Systems and methods for producing a light field from a depth map - Google Patents

Systems and methods for producing a light field from a depth map

Info

Publication number
WO2021252892A1
WO2021252892A1 (PCT/US2021/037005)
Authority
WO
WIPO (PCT)
Prior art keywords
pixels
source image
depth data
pixel
light field
Prior art date
Application number
PCT/US2021/037005
Other languages
French (fr)
Inventor
Kyle Martin RINGGENBERG
Mark Andrew LAMKIN
Jordan David LAMKIN
Bryan Eugene WALTER
Jared Scott KNUTZON
Original Assignee
Fyr, Inc.
Priority date
Filing date
Publication date
Application filed by Fyr, Inc. filed Critical Fyr, Inc.
Priority to EP21737891.8A priority Critical patent/EP4165586A1/en
Publication of WO2021252892A1 publication Critical patent/WO2021252892A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/50Depth or shape recovery
    • G06T7/55Depth or shape recovery from multiple images
    • G06T7/557Depth or shape recovery from multiple images from light fields, e.g. from plenoptic cameras
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/50Depth or shape recovery
    • G06T7/55Depth or shape recovery from multiple images
    • G06T7/593Depth or shape recovery from multiple images from stereo images
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/20Image signal generators
    • H04N13/204Image signal generators using stereoscopic image cameras
    • H04N13/207Image signal generators using stereoscopic image cameras using a single 2D image sensor
    • H04N13/232Image signal generators using stereoscopic image cameras using a single 2D image sensor using fly-eye lenses, e.g. arrangements of circular lenses
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/20Image signal generators
    • H04N13/271Image signal generators wherein the generated image signals comprise depth maps or disparity maps
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2200/00Indexing scheme for image data processing or generation, in general
    • G06T2200/08Indexing scheme for image data processing or generation, in general involving all processing steps from image acquisition to 3D model generation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10024Color image
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10028Range image; Depth image; 3D point clouds
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10052Images from lightfield camera
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/10Processing, recording or transmission of stereoscopic or multi-view image signals
    • H04N13/106Processing image signals
    • H04N13/111Transformation of image signals corresponding to virtual viewpoints, e.g. spatial image interpolation
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N2013/0074Stereoscopic image analysis
    • H04N2013/0081Depth or disparity estimation from stereoscopic image signals

Definitions

  • the present disclosure relates generally to imaging systems, and more specifically to systems and methods for producing a light field from a depth map.
  • Traditional electronic displays encode data as a two-dimensional (2D) matrix of pixels (x, y). This pixel data can be stored in various formats such as RGB, RGBA, BGR, and HSV.
  • In contrast to the flat, 2D data representation of traditional electronic displays, light field displays provide a four-dimensional (4D) display of data. 4D light fields are generated using 4D plenoptic vectors that are composed of position (x, y) and angle (theta, phi), with a color value (e.g., RGB) for each permutation.
  • 4D light field data must either be pre-rendered explicitly as 4D light field data (in a non-real-time fashion), or considerable GPU horsepower (e.g., arrays of high-end graphics cards) must be committed to calculate the four-dimensional data.
  • 4D light field generation is so computationally expensive that it generally cannot be done in real-time without extremely powerful hardware (e.g., numerous GPUs running in parallel).
  • a system includes an electronic display, a computer processor, one or more memory units, and a module stored in the one or more memory units.
  • the module is configured to access a source image stored in the one or more memory units and determine depth data for each pixel of a plurality of pixels of the source image.
  • the module is further configured to map, using the plurality of pixels and the determined depth data for each of the plurality of pixels, the source image to a four-dimensional light field.
  • the module is further configured to send instructions to the electronic display to display the mapped four-dimensional light field.
  • the disclosed embodiments provide several practical applications and technical advantages, which include at least: 1) circumventing the need for multiple render passes to produce a 4D light field by instead programmatically computing the necessary data for the 4D light field from a single two-dimensional (2D) or two-and-a-half-dimensional (2.5D) source image; and 2) the real-time generation of a 4D light field from a 2D or 2.5D source, even on low-end computing hardware (e.g., a smartphone CPU / GPU).
  • Certain embodiments may include none, some, or all of the above technical advantages and practical applications.
  • One or more other technical advantages and practical applications may be readily apparent to one skilled in the art from the figures, descriptions, and claims included herein.
  • FIGURE 1 is a schematic diagram of an example system for producing a light field from a depth map, according to certain embodiments.
  • FIGURE 2 is a diagrammatic view of a two-dimensional image to a four-dimensional light field mapping, according to certain embodiments.
  • FIGURE 3 is a diagrammatic view of a two-and-a-half-dimensional image to a four-dimensional light field mapping, according to certain embodiments.
  • FIGURE 4 is a flowchart of a method for producing a light field from a depth map, according to certain embodiments.
  • FIGURE 5 is a flowchart of a method for producing a light field from a depth map using reverse-lookup, according to certain embodiments.
  • FIGURE 6 is a diagrammatic view of a reverse lookup mapping for a hardware implementation, according to certain embodiments.
  • FIGURE 7 is a flowchart of a method for producing a light field from a depth map using forward-lookup, according to certain embodiments.
  • FIGURE 8 is a flowchart of a sub-method of FIGURE 7, according to certain embodiments.
  • FIGURE 9 is a diagrammatic view of a forward lookup mapping for a software implementation, according to certain embodiments.
  • Embodiments of the present disclosure and its advantages are best understood by referring to FIGURES 1 through 9 of the drawings, like numerals being used for like and corresponding parts of the various drawings.
  • embodiments disclosed herein provide systems and methods which render a 4D light field using 2D or 2.5D source image data.
  • computing the 4D light field generally requires many render passes (i.e., one render pass for each plenoptic cell). This means, for example, a relatively low-spatial-resolution light field of ten plenoptic cells takes 10x more graphics processing power than the amount of processing power to render the same content on a traditional 2D screen.
  • Most typical light field generation systems are predicated on brute-force computing the 4D plenoptic function (RGB for each x, y, theta, and phi). This is equivalent to rendering a scene from many different camera positions for each frame.
  • Embodiments of the disclosure circumvent the need for multiple render passes and instead programmatically compute the necessary data for a 4D light field from a single 2D or 2.5D source image.
  • embodiments of the disclosure may provide a 10x to 100x decrease in the amount of time to produce a 4D light field over the approach of traditional systems.
  • each plenoptic cell's location represents a spatial position and each pixel location within that cell represents a ray direction.
  • embodiments of the disclosure transform data from a 2D or 2.5D source image into a 4D light field via one of two approaches.
  • In a first approach, some embodiments perform simple replication of source imagery in an identical manner across each plenoptic cell.
  • In these embodiments, 2D data is projected into infinity with no additional depth data.
  • In a second approach, some embodiments utilize the depth buffer (intrinsically present for computer-generated 3D imagery) to programmatically compute ray direction (theta, phi) for each pixel in the 2D matrix.
  • In essence, the second approach is identical to the first approach except each pixel within a plenoptic cell is shifted in x/y space as a function of its associated depth value.
  • Embodiments described herein allow for the generation of a 4D light field from a 2D or 2.5D source image in real time, even on low-end computing hardware (e.g., smartphone CPU / GPU).
  • the embodiments described herein circumvent the need for multiple render passes, instead programmatically computing the necessary data from a single 2D or 2.5D source image.
  • the disclosed embodiments are 10x to 100x faster than the traditional approach (depending on the spatial resolution of the target light field display).
  • the disclosed embodiments may be a key enabler for extended reality (XR) visors, XR wall portals, XR construction helmets, XR pilot helmets, XR far eye displays, and the like.
  • XR includes Virtual Reality (VR), Augmented Reality (AR), Mixed Reality (MR), and any combination thereof.
  • FIGURE 1 illustrates an example system 100 for producing a light field from a depth map.
  • system 100 includes a processor 110, memory 120, and an electronic display 130.
  • One or more source images 150 and a depth map to light field module 140 may be stored in memory 120.
  • Electronic display 130 includes multiple plenoptic cells 132 (e.g., 132A, 132B, etc.), and each plenoptic cell 132 includes multiple display pixels 134 (e.g., 134A-134P).
  • electronic display 130 of FIGURE 1 includes nine plenoptic cells 132, and each plenoptic cell 132 includes sixteen display pixels 134.
  • electronic display 130 may have any number of plenoptic cells 132 in any physical arrangement, and each plenoptic cell 132 may have any number of display pixels 134.
  • Processor 110 is any electronic circuitry, including, but not limited to, microprocessors, application specific integrated circuits (ASICs), application specific instruction set processors (ASIPs), and/or state machines, that communicatively couples to memory 120 and controls the operation of system 100.
  • Processor 110 may be 8-bit, 16-bit, 32-bit, 64-bit or of any other suitable architecture.
  • Processor 110 may include an arithmetic logic unit (ALU) for performing arithmetic and logic operations, processor registers that supply operands to the ALU and store the results of ALU operations, and a control unit that fetches instructions from memory and executes them by directing the coordinated operations of the ALU, registers and other components.
  • Processor 110 may include other hardware that operates software to control and process information.
  • Processor 110 executes software stored on memory to perform any of the functions described herein.
  • Processor 110 controls the operation and administration of depth map to light field module 140.
  • Processor 110 may be a programmable logic device, a microcontroller, a microprocessor, any suitable processing device, or any suitable combination of the preceding.
  • Processor 110 is not limited to a single processing device and may encompass multiple processing devices.
  • Memory 120 may store, either permanently or temporarily, source images 150, operational software such as depth map to light field module 140, or other information for processor 110.
  • Memory 120 may include any one or a combination of volatile or non-volatile local or remote devices suitable for storing information.
  • memory 120 may include random access memory (RAM), read only memory (ROM), magnetic storage devices, optical storage devices, or any other suitable information storage device or a combination of these devices.
  • Depth map to light field module 140 represents any suitable set of instructions, logic, or code embodied in a computer-readable storage medium.
  • depth map to light field module 140 may be embodied in memory 120, a disk, a CD, or a flash drive.
  • depth map to light field module 140 may include an application executable by processor 110 to perform one or more of the functions described herein.
  • Source image 150 is any image or electronic data file associated with an image. In some embodiments, source image 150 is captured by a camera. In some embodiments, source image 150 is a 2D image that contains color data but does not contain depth data. In some embodiments, source image 150 is a 2.5D image that contains color data plus depth data (e.g., RGB-D). In some embodiments, source image 150 is a stereographic image from a stereo pair. Source image 150 may contain any appropriate pixel data (e.g., RGB-D, RGBA, BGR, HSV, and the like).
  • system 100 provides (via, e.g., depth map to light field module 140) a 4D light field for display on electronic display 130 from a 2D or 2.5D source image 150.
  • depth map to light field module 140 accesses a source image 150 stored in memory 120 and determines depth data for each pixel 152 of a plurality of pixels 152 of the source image 150. If the source image 150 is a 2D image that does not contain depth data, depth map to light field module 140 may assume a constant depth for the image (e.g., infinity) and then map each pixel 152 of the source image 150 to a corresponding pixel 134 of each plenoptic cell 132 of electronic display 130.
  • FIGURE 2 is a diagrammatic view of a 2D source image 150 to a 4D light field mapping.
  • pixel 152A is mapped to display pixel 134A of each plenoptic cell 132 (i.e., all nine plenoptic cells 132), pixel 152B is mapped to display pixel 134B of each plenoptic cell 132, and so forth.
  • the pixel data (i.e., color data plus constant depth) is sent from depth map to light field module 140 to electronic display 130 as light field pixel data 142 in order to produce the corresponding 4D light field.
  • FIGURE 3 is a diagrammatic view of a 2.5D source image 150 to a 4D light field mapping.
  • pixel 152A is mapped to display pixel 134A of each plenoptic cell 132 (i.e., all nine plenoptic cells 132), pixel 152B is mapped to display pixel 134B of each plenoptic cell 132, and so forth.
  • the pixel data is sent from depth map to light field module 140 to electronic display 130 as light field pixel data 142 in order to produce the corresponding 4D light field.
  • as the pixel data (e.g., RGB+XYD) arrives at the parser, the RGB-D is copied to the corresponding (x, y) for each cell within the target buffer, but is shifted as a function of the depth data. This is illustrated in FIGURE 3 by the shifting of the 4 inner pixels of source image 150 (labeled “A” and shaded grey) in certain plenoptic cells 132.
  • FIGURE 4 illustrates a method 400 for producing a light field from a depth map.
  • method 400 may be utilized by depth map to light field module 140 to generate light field pixel data 142 from source image 150 and send light field pixel data 142 to electronic display 130 in order to display a 4D light field corresponding to the source image 150.
  • method 400 may access a source image (e.g., source image 150) stored in one or more memory units (e.g., memory 120).
  • Method 400 may then determine depth data for each pixel of a plurality of pixels of the source image (e.g., steps 410-450) and then map, using the plurality of pixels and the determined depth data for each of the plurality of pixels, the source image to a four-dimensional light field (e.g., steps 460-480). Each of these steps is described in more detail below.
  • method 400 determines whether the source image contains depth data (e.g. from a render buffer). In some embodiments, the source image is source image 150. If the source image includes depth data, method 400 proceeds to step 420. If the source image does not include depth data, method 400 proceeds to step 440.
  • method 400 determines if the depth data of the source image is RGB-D data. If the depth data of the source image is RGB-D data, method 400 proceeds to step 460. If the depth data of the source image is not RGB-D data, method 400 proceeds to step 430. At step 430, method 400 converts the depth data to RGB-D and then proceeds to step 460.
  • method 400 determines if the source image is stereoscopic. If method 400 determines that the source image is stereoscopic, method 400 proceeds to step 445. If method 400 determines that the source image is not stereoscopic, method 400 proceeds to step 450.
  • method 400 computes the depth data from the stereoscopic source image.
  • the depth data is computed via parallax differences.
  • method 400 proceeds to step 460.
  • method 400 assigns a constant depth and a fixed location to the source image that is devoid of depth data. For example, method 400 may assign a constant depth of infinity. After step 450, method 400 proceeds to step 460.
  • method 400 determines if the runtime environment supports reverse-lookup (e.g., a hardware implementation for providing a light field). If method 400 determines that the runtime environment supports reverse-lookup in step 460, method 400 proceeds to step 470. If method 400 determines that the runtime environment does not support reverse-lookup in step 460, method 400 proceeds to step 480.
  • At step 470, method 400 maps and sends light field pixel data (e.g., light field pixel data 142) using reverse lookup (e.g., hardware implementation).
  • step 470 may include copying RGB+D to corresponding X & Y for each plenoptic cell within the target buffer as pixel data (RGB + XYD) arrives at the parser, shifting as a function of depth (D).
  • Ax * Ay total reads and Ax * Ay * Sx * Sy total writes may be performed. More details about certain embodiments of step 470 are described in more detail below with respect to FIGURE 5.
  • method 400 may end.
  • At step 480, method 400 maps and sends light field pixel data (e.g., light field pixel data 142) using forward-lookup (e.g., software such as a graphics pipeline shader).
  • step 480 may include performing a ray-march for each pixel in the output buffer along its corresponding theta / phi direction to determine target pixel data from the reference RGB-D map. In some embodiments, this may require Ax * Ay * Sx * Sy (parallelizable) ray-marches (where Ax & Ay are angular resolution width & height (e.g., pixels per cell) and Sx & Sy are spatial resolution width & height (e.g., cells per display)). More details about certain embodiments of step 480 are described in more detail below with respect to FIGURE 7. After step 480, method 400 may end.
  • FIGURE 5 is a flowchart of a method 500 for producing a light field from a depth map using reverse-lookup (e.g., hardware implementation), according to certain embodiments.
  • FIGURE 6 is a diagrammatic view of a reverse lookup mapping for a hardware implementation such as method 500.
  • method 500 uses the color and depth (e.g., RGB-D) of the pixel in question to determine in which location in the output image to write that data.
  • In step 510, method 500 determines if the incoming pixel 152 of source image 150 has been written to all plenoptic cells 132. If so, method 500 ends. Otherwise, method 500 proceeds to step 520.
  • In step 520, method 500 uses the pixel's location in the display matrix (its cell) to determine the real world offset from the center of the light field display to its encompassing plenoptic cell's corners.
  • In some embodiments, step 520 includes computing the pixel offset (dX/dY) as a ratio-of-slopes function of the plenoptic cell position (cX/cY) and pixel depth (D).
  • After step 520, method 500 proceeds to step 530.
  • In step 530, method 500 determines if the pixel has already been written at the location (X + dX, Y + dY) of this cell for this frame. If so, method 500 proceeds to step 540. If not, method 500 proceeds to step 550.
  • At step 540, method 500 determines if the depth of the already-written pixel is smaller (i.e., closer to the camera) than the depth of the not-yet-written pixel. If the depth of the already-written pixel is smaller than the depth of the not-yet-written pixel, method 500 does not write any pixel data for the incoming pixel and proceeds back to step 510. If the depth of the already-written pixel is not smaller than the depth of the not-yet-written pixel, method 500 proceeds to step 550.
  • In step 550, method 500 writes the pixel data (RGB+D) to location (X+dX, Y+dY) of the plenoptic cell. Because each pixel in a plenoptic cell represents a different ray direction (the angles theta and phi), method 500 may use the difference between the pixel's location and the plenoptic cell's center, multiplied by the pitch of a cell, to calculate one side of a right triangle in step 550. Method 500 may use the given depth to calculate the other side of that same triangle. The ratio of those two sides generates an angle related to the pixel offset that should be applied when writing the output buffer data for the display.
  • This relationship may be defined by taking that angle, converting it to degrees, multiplying by the pixels-per-degree of the display and the sign of the difference between the pixel’s location and its cell center. This offset is then added to the original pixel x/y and its cell’s center to create a new world position. The pixel’s color value and depth are written to this position in the output buffer if it is closer than what was there before to create a part of the light field.
  • Method 500 may be implemented in low level code, firmware, or transistor-logic and run on hardware close to the light field display 130.
  • In this application, identical synthetic content may be fed to each one, and each may use its known position to calculate its portion of the light field.
  • FIGURE 7 is a flowchart of a method 700 for producing a light field from a depth map using forward-lookup (e.g., software such as a graphics pipeline shader), according to certain embodiments.
  • FIGURE 9 is a diagrammatic view of a forward lookup mapping for a software implementation such as method 700.
  • Method 700 uses the pixel in question’s buffer location and the depth (if available) and cell location to determine which color should be placed in that same spot in the output buffer.
  • At a high level, method 700 transforms the location and direction in space of that particular pixel within the plenoptic cell into the virtual camera space so that the appropriate color to put in that cell can be looked up.
  • the operation of method 700 is opposite from that of method 500 (i.e., method 700 reads through the display pixels 134 of plenoptic cells 132 while method 500 reads through the source pixel data 152 of source image 150 as illustrated in FIGURE 6).
  • the first step of method 700 is to calculate the camera ray for the given pixel location. This is the ray direction that light is traveling along as it would intersect this pixel’s cell location. This ray uses the world position of the pixel’s cell as its origin and the angles theta and phi as its direction.
  • the ray’s origin is converted from pixel space to world space using known properties of the collection system such as near plane and field of view.
  • method 700 converts this ray into one in the depth (background) space. This transformation converts the 3D world space vector into a 2D depth space vector and uses inverse depth for comparisons.
  • the derivatives for both the camera ray and the depth vector are calculated with respect to the spatial dimensions (x,y). These derivatives are then used in a loop to traverse through depth space to determine the closest object in the rendered buffer that the ray would hit.
  • In the case of having a separate depth buffer, the depth space vector would also be traversed through its buffer. At the end, the closest object between the depth space and color space would determine the actual depth of the pixel's ray. In these steps it may be preferable to use the tangent of the inverse depth during the recursion for stability reasons. With the depth of the object that was hit, the origin of the ray and its direction, trigonometric relations (using known properties of the collector such as field of view) can be used to convert from world back to screen space to determine which source color image pixel's color data should be used for the output image at this pixel location. The specific steps of some embodiments of method 700 are described in more detail below.
  • Method 700 may begin in step 710 where method 700 determines if every pixel in the render target has been checked. If so, method 700 may end. Otherwise, method 700 proceeds to step 715.
  • method 700 retrieves the next pixel X,Y in the render target. After step 715, method 700 proceeds to step 720 where method 700 determines the ray direction of pixel X,Y in the scene space (d1). After step 720, method 700 proceeds to step 725 where method 700 determines the ray direction of pixel X,Y in the background space (d2). After step 725, method 700 proceeds to step 730 where method 700 calculates the derivatives of d1 and d2 with respect to X/Y. After step 730, method 700 proceeds to step 735 where method 700 calls the function of FIGURE 8 using d1 and its derivatives to approximate the ray intersection point i1 in the scene space.
  • In step 740, method 700 calls the function of FIGURE 8 using d2 and its derivatives to approximate the ray intersection point i2 in the background space.
  • In step 745, method 700 converts i1 and i2 into more accurate inverse depths id1 and id2.
  • In step 750, method 700 calculates an ending inverse depth (id_e) from id2.
  • In step 755, method 700 determines if id1 is less than id2. If id1 is less than id2, method 700 proceeds to step 760 where method 700 writes 0 (blocked) and then proceeds back to step 710.
  • Otherwise, in step 765, method 700 writes the color sample from the scene or multisample (e.g., derivative) at i1. After step 765, method 700 proceeds back to step 710.
  • method 700 may be implemented as a post process shader in a game engine or render engine. This shader would take as input data from cameras placed in the synthetic scene and optionally depth buffers from those virtual cameras. Method 700 would then output a single image that contained the cells that make up the light field in an array of images.
  • FIGURE 8 is a flowchart of a sub-method 800 of FIGURE 7, according to certain embodiments.
  • Method 800 is a parallax occlusion ray to inverse depth method that functions by raymarching with uniform steps in the pixel space.
  • method 800 first sets up a conversion between 3D space and the 2D space of the rendered input. This allows method 800 to perform its iteration in a pixel-perfect manner while still doing the computation in 3D space. Not only does this provide better quality results, but it also provides drastic performance gains when the light field display is small with respect to the scene it is displaying.
  • Method 800 may begin in step 810 where method 800 calculates the starting tangent for the inverse depth for the direction passed in from method 700 (d1 or d2). After step 810, method 800 proceeds to step 820 where method 800 calculates the ending tangent for the inverse depth for the direction passed in from method 700 (d1 or d2). In general, in steps 810 and 820, method 800 obtains the tangents (original render space) corresponding to the start and end inverse depths along the input ray. Method 800 may in addition compute the constant step (corresponding to half of the diagonal length of one pixel) to be used in the raymarching.
  • method 800 begins a loop that takes a parameter 't' from 0 to 1.
  • Method 800 uses this fact to compute the depth of the input ray at each step in the raymarch (step 830, where method 800 calculates the parametric inverse depth id_t using 't'), and method 800 uses the linear relation between t and the input pixel space to sample the input depth map, which is called the scene depth (step 835).
  • On the first iteration in which the ray depth exceeds the scene depth (step 840), method 800 concludes that the ray has 'hit' something and takes an early exit to return an appropriately interpolated value (step 850). Otherwise, in the case that this never occurs before the end of the loop, method 800 returns an indication that nothing was hit (step 855).
  • Method 800 thus is a parallax occlusion algorithm that results in an inverse depth, allowing it to return an intersection point (as an inverse depth along the ray) as well as a 'nothing hit' signal (as the input ending inverse depth). A minimal code sketch of this inverse-depth raymarch appears at the end of this section.
  • The disclosed embodiments mostly use tangent space, as opposed to either pixel space or angle space, and mostly use inverse depth, as opposed to depth. Both of these choices of space have conceptual and computational advantages.
  • tangent space may be the best choice because it is the most consistent, computationally cheapest option and is readily compatible with the other spaces (pixel space is tangent space with a scalar multiplier).
  • inverse depth space offers many advantages. As noted above, in a standard camera projection, pixel position is inversely related to scene depth (coordinate parallel to the view direction, not distance to the camera) and therefore linearly related to inverse depth. This allows the disclosed embodiments to execute large sections of computation (such as pixel-perfect raymarching) without ever directly computing depth and without ever using a division. Another advantage of using inverse depth is a clean representation of infinity. An inverse depth of 0 may be used to refer to the 'back' of the scene as it is effectively a depth of infinity. This proves to be much more useful than the counterpart depth of zero that the inverse depth space renders unwieldy.
  • As used herein, "each" refers to each member of a set or each member of a subset of a set.
  • As used herein, "or" is not necessarily exclusive and, unless expressly indicated otherwise, can be inclusive in certain embodiments and can be understood to mean "and/or."
  • As used herein, "and" is not necessarily inclusive and, unless expressly indicated otherwise, can be inclusive in certain embodiments and can be understood to mean "and/or." All references to "a/an/the element, apparatus, component, means, step, etc." are to be interpreted openly as referring to at least one instance of the element, apparatus, component, means, step, etc., unless explicitly stated otherwise.
  • references to an apparatus or system or a component of an apparatus or system being adapted to, arranged to, capable of, configured to, enabled to, operable to, or operative to perform a particular function encompasses that apparatus, system, component, whether or not it or that particular function is activated, turned on, or unlocked, as long as that apparatus, system, or component is so adapted, arranged, capable, configured, enabled, operable, or operative.
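As referenced above, the following is a minimal sketch of the method 800 style of raymarch: uniform steps in a parameter t from 0 to 1, with the ray's inverse depth interpolated linearly (exploiting the linear relation between pixel position and inverse depth noted above) and compared against the scene's inverse depth sampled from the input depth map. The sampler, the start/end inverse depths, and the step count are assumptions standing in for the display- and camera-specific setup of steps 810-820; this is illustrative only, not the patent's implementation.

```python
def raymarch_inverse_depth(id_start, id_end, px_start, px_end,
                           sample_scene_inverse_depth, steps=64):
    """Sketch of method 800: uniform raymarch comparing ray inverse depth
    against scene inverse depth.

    id_start, id_end:  ray inverse depth at t = 0 and t = 1 (derived from the
                       start/end tangents of steps 810-820; assumed inputs here).
    px_start, px_end:  pixel-space (x, y) positions of the ray at t = 0 and t = 1.
    sample_scene_inverse_depth(x, y): hypothetical sampler returning the input
                       depth map as inverse depth (0.0 meaning 'back of the scene').
    Returns an interpolated inverse depth at the first hit, or id_end as the
    'nothing hit' signal (steps 840-855).
    """
    prev_t, prev_delta = 0.0, None
    for i in range(steps + 1):
        t = i / steps
        id_ray = (1.0 - t) * id_start + t * id_end              # step 830
        x = (1.0 - t) * px_start[0] + t * px_end[0]             # linear in pixel space
        y = (1.0 - t) * px_start[1] + t * px_end[1]
        id_scene = sample_scene_inverse_depth(x, y)             # step 835
        delta = id_scene - id_ray   # > 0 once the ray depth exceeds the scene depth
        if delta > 0.0:                                         # step 840: first 'hit'
            if prev_delta is None:
                return id_ray       # hit on the very first sample
            # Interpolate between the last miss and this hit (step 850).
            frac = -prev_delta / (delta - prev_delta)
            t_hit = prev_t + frac * (t - prev_t)
            return (1.0 - t_hit) * id_start + t_hit * id_end
        prev_t, prev_delta = t, delta
    return id_end                                               # step 855: nothing hit
```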

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Generation (AREA)

Abstract

A system includes an electronic display, a computer processor, one or more memory units, and a module stored in the one or more memory units. The module is configured to access a source image stored in the one or more memory units and determine depth data for each pixel of a plurality of pixels of the source image. The module is further configured to map, using the plurality of pixels and the determined depth data for each of the plurality of pixels, the source image to a four-dimensional light field. The module is further configured to send instructions to the electronic display to display the mapped four-dimensional light field.

Description

SYSTEMS AND METHODS FOR PRODUCING A LIGHT FIELD FROM A
DEPTH MAP
PRIORITY
This application claims the benefit under 35 U.S.C. § 119(e) of U.S. Provisional Patent Application No. 63/038,639, filed 12 June 2020, which is incorporated herein by reference in its entirety.
TECHNICAL FIELD
The present disclosure relates generally to imaging systems, and more specifically to systems and methods for producing a light field from a depth map.
BACKGROUND
Traditional electronic displays encode data as a two-dimensional (2D) matrix of pixels (x, y). This pixel data can be stored in various formats such as RGB, RGBA, BGR, and HSV. In contrast to the flat, 2D data representation of traditional electronic displays, light field displays provide a four-dimensional (4D) display of data. 4D light fields are generated using 4D plenoptic vectors that are composed of position (x, y) and angle (theta, phi), with a color value (e.g., RGB) for each permutation.
SUMMARY
Traditional electronic displays encode data as a two-dimensional (2D) matrix of pixels (x, y). This pixel data can be stored in various formats such as RGB, RGBA, BGR, and HSV. In contrast to the flat, 2D data representation of traditional electronic displays, light field displays provide a four-dimensional (4D) display of data. 4D light fields are generated using 4D plenoptic vectors that are composed of position (x, y) and angle (theta, phi), with a color value (e.g., RGB) for each permutation.
Traditional media content (e.g., images and video) typically does not contain the necessary angular information (theta, phi) for 4D light field displays. Computing the missing angular information is complex and computationally intensive. In practice, this means that 4D light field data must either be pre-rendered explicitly as 4D light field data (in a non-real-time fashion), or considerable GPU horsepower (e.g., arrays of high-end graphics cards) must be committed to calculate the four-dimensional data. 4D light field generation is so computationally expensive that it generally cannot be done in real-time without extremely powerful hardware (e.g., numerous GPUs running in parallel).
To address these and other problems with providing 4D light fields, embodiments of the disclosure provide novel systems and methods for producing a light field from a depth map. In some embodiments, a system includes an electronic display, a computer processor, one or more memory units, and a module stored in the one or more memory units. The module is configured to access a source image stored in the one or more memory units and determine depth data for each pixel of a plurality of pixels of the source image. The module is further configured to map, using the plurality of pixels and the determined depth data for each of the plurality of pixels, the source image to a four-dimensional light field. The module is further configured to send instructions to the electronic display to display the mapped four-dimensional light field.
The disclosed embodiments provide several practical applications and technical advantages, which include at least: 1) circumventing the need for multiple render passes to produce a 4D light field by instead programmatically computing the necessary data for the 4D light field from a single two-dimensional (2D) or two-and-a-half-dimensional (2.5D) source image; and 2) the real-time generation of a 4D light field from a 2D or 2.5D source, even on low-end computing hardware (e.g., a smartphone CPU / GPU). Certain embodiments may include none, some, or all of the above technical advantages and practical applications. One or more other technical advantages and practical applications may be readily apparent to one skilled in the art from the figures, descriptions, and claims included herein.
BRIEF DESCRIPTION OF THE DRAWINGS
For a more complete understanding of this disclosure, reference is now made to the following brief description, taken in connection with the accompanying drawings and detailed description, wherein like reference numerals represent like parts.
FIGURE 1 is a schematic diagram of an example system for producing a light field from a depth map, according to certain embodiments.
FIGURE 2 is a diagrammatic view of a two-dimensional image to a four-dimensional light field mapping, according to certain embodiments.
FIGURE 3 is a diagrammatic view of a two-and-a-half-dimensional image to a four-dimensional light field mapping, according to certain embodiments.
FIGURE 4 is a flowchart of a method for producing a light field from a depth map, according to certain embodiments.
FIGURE 5 is a flowchart of a method for producing a light field from a depth map using reverse-lookup, according to certain embodiments.
FIGURE 6 is a diagrammatic view of a reverse lookup mapping for a hardware implementation, according to certain embodiments.
FIGURE 7 is a flowchart of a method for producing a light field from a depth map using forward-lookup, according to certain embodiments.
FIGURE 8 is a flowchart of a sub-method of FIGURE 7, according to certain embodiments.
FIGURE 9 is a diagrammatic view of a forward lookup mapping for a software implementation, according to certain embodiments.
DETAILED DESCRIPTION
Embodiments of the present disclosure and its advantages are best understood by referring to FIGURES 1 through 9 of the drawings, like numerals being used for like and corresponding parts of the various drawings.
Traditional electronic displays encode data as a two-dimensional (2D) matrix of pixels (x, y). This pixel data can be stored in various formats such as RGB, RGBA, BGR, and HSV. In contrast to the flat, 2D data representation of traditional electronic displays, light field displays provide a four-dimensional (4D) display of data. 4D light fields are generated using 4D plenoptic vectors that are composed of position (x, y) and angle (theta, phi), with a color value (e.g., RGB) for each permutation.
Traditional media content (e.g., images and video) typically does not contain the necessary angular information (theta, phi) for 4D light field displays. Computing the missing angular information is complex and computationally intensive. In practice, this means that 4D light field data must either be pre-rendered explicitly as 4D light field data (in a non-real-time fashion), or considerable GPU horsepower (e.g., arrays of high-end graphics cards) must be utilized to calculate the 4D data. 4D light field generation is so computationally expensive that it generally cannot be done in real-time without extremely powerful hardware (e.g., numerous GPUs running in parallel).
To address these and other difficulties and problems with providing 4D light fields, embodiments of the disclosure provide novel systems and methods for producing a light field from a depth map. In some embodiments, a system includes an electronic display, a computer processor, one or more memory units, and a module stored in the one or more memory units. The module is configured to access a source image stored in the one or more memory units and determine depth data for each pixel of a plurality of pixels of the source image. The module is further configured to map, using the plurality of pixels and the determined depth data for each of the plurality of pixels, the source image to a four-dimensional light field. The module is further configured to send instructions to the electronic display to display the mapped four-dimensional light field.
In general, embodiments disclosed herein provide systems and methods which render a 4D light field using 2D or 2.5D source image data. In typical systems, computing the 4D light field generally requires many render passes (i.e., one render pass for each plenoptic cell). This means, for example, a relatively low-spatial-resolution light field of ten plenoptic cells takes 10x more graphics processing power than the amount of processing power required to render the same content on a traditional 2D screen. Most typical light field generation systems are predicated on brute-force computing the 4D plenoptic function (RGB for each x, y, theta, and phi). This is equivalent to rendering a scene from many different camera positions for each frame. This is computationally expensive and generally impractical (or impossible) for real-time content. Embodiments of the disclosure, however, circumvent the need for multiple render passes and instead programmatically compute the necessary data for a 4D light field from a single 2D or 2.5D source image. As will be appreciated, embodiments of the disclosure may provide a 10x to 100x decrease in the amount of time to produce a 4D light field over the approach of traditional systems.
For a near-eye light field system composed of plenoptic cells, each plenoptic cell's location represents a spatial position and each pixel location within that cell represents a ray direction. As such, embodiments of the disclosure transform data from a 2D or 2.5D source image into a 4D light field via one of two approaches. In a first approach, some embodiments perform simple replication of source imagery in an identical manner across each plenoptic cell. In these embodiments, 2D data is projected into infinity with no additional depth data. In a second approach, some embodiments utilize the depth buffer (intrinsically present for computer-generated 3D imagery) to programmatically compute ray direction (theta, phi) for each pixel in the 2D matrix. In essence, the second approach is identical to the first approach except each pixel within a plenoptic cell is shifted in x/y space as a function of its associated depth value.
The two approaches described above allow for rapid computation of the 4D light field data (x, y, theta, phi) from a 2D or 2.5D source image. Embodiments described herein allow for the generation of a 4D light field from a 2D or 2.5D source image in real time, even on low-end computing hardware (e.g., smartphone CPU / GPU). The embodiments described herein circumvent the need for multiple render passes, instead programmatically computing the necessary data from a single 2D or 2.5D source image. Thus, the disclosed embodiments are 10x to 100x faster than the traditional approach (depending on the spatial resolution of the target light field display). The disclosed embodiments may be a key enabler for extended reality (XR) visors, XR wall portals, XR construction helmets, XR pilot helmets, XR far eye displays, and the like. As used herein, XR includes Virtual Reality (VR), Augmented Reality (AR), Mixed Reality (MR), and any combination thereof.
FIGURE 1 illustrates an example system 100 for producing a light field from a depth map. As seen in FIGURE 1, system 100 includes a processor 110, memory 120, and an electronic display 130. One or more source images 150 and a depth map to light field module 140 may be stored in memory 120. Electronic display 130 includes multiple plenoptic cells 132 (e.g., 132A, 132B, etc.), and each plenoptic cell 132 includes multiple display pixels 134 (e.g., 134A-134P). For illustrative purposes only, electronic display 130 of FIGURE 1 includes nine plenoptic cells 132, and each plenoptic cell 132 includes sixteen display pixels 134. This provides a 4D resultant light field that has a 4x4 angular resolution and a 3x3 spatial resolution. However, electronic display 130 may have any number of plenoptic cells 132 in any physical arrangement, and each plenoptic cell 132 may have any number of display pixels 134.
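To make the geometry concrete, the 4D light field implied by FIGURE 1 can be held in a buffer indexed by plenoptic cell (spatial position) and by pixel within the cell (ray direction). The sketch below is only an illustration of that indexing; the shapes follow the Sx/Sy (spatial) and Ax/Ay (angular) notation used later, and the interleaved panel layout is an assumption, not a storage format mandated by the disclosure.

```python
import numpy as np

# Display geometry from FIGURE 1 (assumed for illustration):
# 3x3 plenoptic cells (spatial resolution Sx, Sy), each with
# 4x4 display pixels (angular resolution Ax, Ay), RGB per pixel.
Sx, Sy = 3, 3
Ax, Ay = 4, 4

# 4D light field buffer: one RGB value per (cell, pixel-in-cell) pair.
light_field = np.zeros((Sy, Sx, Ay, Ax, 3), dtype=np.uint8)

def flatten_to_panel(lf: np.ndarray) -> np.ndarray:
    """Interleave cells into a single (Sy*Ay) x (Sx*Ax) panel image,
    with each plenoptic cell occupying a contiguous Ax x Ay tile."""
    sy, sx, ay, ax, c = lf.shape
    return lf.transpose(0, 2, 1, 3, 4).reshape(sy * ay, sx * ax, c)

panel = flatten_to_panel(light_field)  # 12 x 12 x 3 for this example
```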
Processor 110 is any electronic circuitry, including, but not limited to, microprocessors, application specific integrated circuits (ASICs), application specific instruction set processors (ASIPs), and/or state machines, that communicatively couples to memory 120 and controls the operation of system 100. Processor 110 may be 8-bit, 16-bit, 32-bit, 64-bit or of any other suitable architecture. Processor 110 may include an arithmetic logic unit (ALU) for performing arithmetic and logic operations, processor registers that supply operands to the ALU and store the results of ALU operations, and a control unit that fetches instructions from memory and executes them by directing the coordinated operations of the ALU, registers and other components. Processor 110 may include other hardware that operates software to control and process information. Processor 110 executes software stored on memory to perform any of the functions described herein. Processor 110 controls the operation and administration of depth map to light field module 140. Processor 110 may be a programmable logic device, a microcontroller, a microprocessor, any suitable processing device, or any suitable combination of the preceding. Processor 110 is not limited to a single processing device and may encompass multiple processing devices. Memory 120 may store, either permanently or temporarily, source images 150, operational software such as depth map to light field module 140, or other information for processor 110. Memory 120 may include any one or a combination of volatile or non-volatile local or remote devices suitable for storing information. For example, memory 120 may include random access memory (RAM), read only memory (ROM), magnetic storage devices, optical storage devices, or any other suitable information storage device or a combination of these devices. Depth map to light field module 140 represents any suitable set of instructions, logic, or code embodied in a computer-readable storage medium. For example, depth map to light field module 140 may be embodied in memory 120, a disk, a CD, or a flash drive. In particular embodiments, depth map to light field module 140 may include an application executable by processor 110 to perform one or more of the functions described herein.
Source image 150 is any image or electronic data file associated with an image. In some embodiments, source image 150 is captured by a camera. In some embodiments, source image 150 is a 2D image that contains color data but does not contain depth data. In some embodiments, source image 150 is a 2.5D image that contains color data plus depth data (e.g., RGB-D). In some embodiments, source image 150 is a stereographic image from a stereo pair. Source image 150 may contain any appropriate pixel data (e.g., RGB-D, RGBA, BGR, HSV, and the like).
In operation, system 100 provides (via, e.g., depth map to light field module 140) a 4D light field for display on electronic display 130 from a 2D or 2.5D source image 150. To do so, depth map to light field module 140 accesses a source image 150 stored in memory 120 and determines depth data for each pixel 152 of a plurality of pixels 152 of the source image 150. If the source image 150 is a 2D image that does not contain depth data, depth map to light field module 140 may assume a constant depth for the image (e.g., infinity) and then map each pixel 152 of the source image 150 to a corresponding pixel 134 of each plenoptic cell 132 of electronic display 130. For example, FIGURE 2 is a diagrammatic view of a 2D source image 150 to a 4D light field mapping. As illustrated in FIGURE 2, pixel 152A is mapped to display pixel 134A of each plenoptic cell 132 (i.e., all nine plenoptic cells 132), pixel 152B is mapped to display pixel 134B of each plenoptic cell 132, and so forth. The pixel data (i.e., color data plus constant depth) is sent from depth map to light field module 140 to electronic display 130 as light field pixel data 142 in order to produce the corresponding 4D light field. For example, as the pixel data (e.g., RGB+XYD) arrives at the parser, the RGB-D is copied to the corresponding (x, y) for each cell within the target buffer. On the other hand, if the source image 150 is a 2.5D image that does contain depth data, depth map to light field module 140 uses the depth data when mapping each pixel 152 of the source image 150 to a corresponding pixel 134 of each plenoptic cell 132 of electronic display 130. For example, FIGURE 3 is a diagrammatic view of a 2.5D source image 150 to a 4D light field mapping. As illustrated in FIGURE 3, pixel 152A is mapped to display pixel 134A of each plenoptic cell 132 (i.e., all nine plenoptic cells 132), pixel 152B is mapped to display pixel 134B of each plenoptic cell 132, and so forth. The pixel data (e.g., RGB-D) is sent from depth map to light field module 140 to electronic display 130 as light field pixel data 142 in order to produce the corresponding 4D light field. For example, as the pixel data (e.g., RGB+XYD) arrives at the parser, the RGB-D is copied to the corresponding (x, y) for each cell within the target buffer, but is shifted as a function of the depth data. This is illustrated in FIGURE 3 by the shifting of the 4 inner pixels of source image 150 (labeled “A” and shaded grey) in certain plenoptic cells 132.
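A minimal sketch of the FIGURE 2 and FIGURE 3 mappings is shown below. The replication step follows the text directly; the depth-dependent shift is only a stand-in (a simple disparity proportional to the cell's offset from the display center and to inverse depth), since the disclosure's actual offset computation (ratio of slopes, pixels-per-degree) is detailed later with method 500, and occlusion handling is likewise omitted here.

```python
import numpy as np

def map_source_to_light_field(rgb, depth=None, Sx=3, Sy=3, shift_gain=1.0):
    """Map an Ax x Ay source image to a (Sy, Sx, Ay, Ax, 3) light field.

    rgb:   (Ay, Ax, 3) color data for one plenoptic cell's worth of pixels.
    depth: optional (Ay, Ax) depth map; None means "project to infinity"
           (pure replication, FIGURE 2). shift_gain is a placeholder for
           the display-dependent calibration described with method 500.
    """
    Ay, Ax, _ = rgb.shape
    lf = np.zeros((Sy, Sx, Ay, Ax, 3), dtype=rgb.dtype)
    for cy in range(Sy):
        for cx in range(Sx):
            if depth is None:
                # FIGURE 2: identical replication into every plenoptic cell.
                lf[cy, cx] = rgb
                continue
            # FIGURE 3: replicate, but shift each pixel as a function of depth.
            # The cell's offset from the display center drives the shift direction.
            off_x = cx - (Sx - 1) / 2.0
            off_y = cy - (Sy - 1) / 2.0
            for y in range(Ay):
                for x in range(Ax):
                    d = max(depth[y, x], 1e-6)
                    dx = int(round(shift_gain * off_x / d))  # placeholder disparity
                    dy = int(round(shift_gain * off_y / d))
                    tx, ty = x + dx, y + dy
                    if 0 <= tx < Ax and 0 <= ty < Ay:
                        lf[cy, cx, ty, tx] = rgb[y, x]
    return lf
```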
FIGURE 4 illustrates a method 400 for producing a light field from a depth map. In general, method 400 may be utilized by depth map to light field module 140 to generate light field pixel data 142 from source image 150 and send light field pixel data 142 to electronic display 130 in order to display a 4D light field corresponding to the source image 150. To do so, method 400 may access a source image (e.g., source image 150) stored in one or more memory units (e.g., memory 120). Method 400 may then determine depth data for each pixel of a plurality of pixels of the source image (e.g., steps 410-450) and then map, using the plurality of pixels and the determined depth data for each of the plurality of pixels, the source image to a four-dimensional light field (e.g., steps 460-480). Each of these steps is described in more detail below.
At step 410, method 400 determines whether the source image contains depth data (e.g. from a render buffer). In some embodiments, the source image is source image 150. If the source image includes depth data, method 400 proceeds to step 420. If the source image does not include depth data, method 400 proceeds to step 440.
At step 420, method 400 determines if the depth data of the source image is RGB-D data. If the depth data of the source image is RGB-D data, method 400 proceeds to step 460. If the depth data of the source image is not RGB-D data, method 400 proceeds to step 430. At step 430, method 400 converts the depth data to RGB-D and then proceeds to step 460.
At step 440, method 400 determines if the source image is stereoscopic. If method 400 determines that the source image is stereoscopic, method 400 proceeds to step 445. If method 400 determines that the source image is not stereoscopic, method 400 proceeds to step 450.
At step 445, method 400 computes the depth data from the stereoscopic source image. In some embodiments, the depth data is computed via parallax differences. After step 445, method 400 proceeds to step 460.
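The disclosure only states that depth may be computed "via parallax differences"; the snippet below is one conventional way to do that, assuming a rectified stereo pair, a precomputed per-pixel disparity map, and known focal length and baseline (all assumptions, not details from the disclosure).

```python
import numpy as np

def depth_from_disparity(disparity_px, focal_length_px, baseline_m, min_disp=1e-3):
    """Standard pinhole-stereo relation: depth = f * B / disparity.

    disparity_px: (H, W) disparity map in pixels from a rectified stereo pair.
    Returns depth in the same units as baseline_m; zero or negative disparity
    is clamped so it reads as 'effectively at infinity'."""
    disp = np.maximum(disparity_px, min_disp)
    return focal_length_px * baseline_m / disp
```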
At step 450, method 400 assigns a constant depth and a fixed location to the source image that is devoid of depth data. For example, method 400 may assign a constant depth of infinity. After step 450, method 400 proceeds to step 460.
At step 460, method 400 determines if the runtime environment supports reverse-lookup (e.g., a hardware implementation for providing a light field). If method 400 determines that the runtime environment supports reverse-lookup in step 460, method 400 proceeds to step 470. If method 400 determines that the runtime environment does not support reverse-lookup in step 460, method 400 proceeds to step 480.
At step 470, method 400 maps and sends light field pixel data (e.g., light field pixel data 142) using reverse lookup (e.g., hardware implementation). In general, step 470 may include copying RGB+D to corresponding X & Y for each plenoptic cell within the target buffer as pixel data (RGB + XYD) arrives at the parser, shifting as a function of depth (D). In some embodiments, Ax * Ay total reads and Ax * Ay * Sx * Sy total writes may be performed. More details about certain embodiments of step 470 are described in more detail below with respect to FIGURE 5. After step 470, method 400 may end.
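As a quick check of those counts against the example display of FIGURE 1 (4x4 angular, 3x3 spatial), and assuming the source image resolution matches the per-cell resolution as in FIGURES 2 and 3:

```python
Ax, Ay = 4, 4   # angular resolution: pixels per plenoptic cell
Sx, Sy = 3, 3   # spatial resolution: plenoptic cells per display

reads = Ax * Ay               # 16 source-pixel reads
writes = Ax * Ay * Sx * Sy    # 144 target-buffer writes (one per cell per pixel)
print(reads, writes)          # -> 16 144
```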
At step 480, method 400 maps and sends light field pixel data (e.g., light field pixel data 142) using forward-lookup (e.g., software such as a graphics pipeline shader). In general, step 480 may include performing a ray-march for each pixel in the output buffer along its corresponding theta / phi direction to determine target pixel data from the reference RGB-D map. In some embodiments, this may require Ax * Ay * Sx * Sy (parallelizable) ray-marches (where Ax & Ay are angular resolution width & height (e.g., pixels per cell) and Sx & Sy are spatial resolution width & height (e.g., cells per display)). More details about certain embodiments of step 480 are described in more detail below with respect to FIGURE 7. After step 480, method 400 may end.
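Taken together, steps 410-480 amount to a small dispatch over the source format and the runtime capabilities. The sketch below mirrors the FIGURE 4 flowchart; every helper name is a hypothetical placeholder for an operation described in the text, not an API from the disclosure.

```python
def produce_light_field(source):
    """Dispatch sketch of method 400 (steps 410-480). All helpers are
    hypothetical placeholders for the operations described in the text."""
    # Steps 410-450: determine per-pixel depth data.
    if source.has_depth:                                               # step 410
        rgbd = source if source.is_rgbd else convert_to_rgbd(source)   # steps 420/430
    elif source.is_stereoscopic:                                       # step 440
        rgbd = depth_from_parallax(source)                             # step 445
    else:
        rgbd = assign_constant_depth(source, depth=float("inf"))       # step 450

    # Steps 460-480: map to the 4D light field and send it to the display.
    if runtime_supports_reverse_lookup():                              # step 460
        pixels = map_reverse_lookup(rgbd)      # step 470, see method 500 / FIGURE 5
    else:
        pixels = map_forward_lookup(rgbd)      # step 480, see method 700 / FIGURE 7
    send_to_display(pixels)
```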
FIGURE 5 is a flowchart of a method 500 for producing a light field from a depth map using reverse-lookup (e.g., a hardware implementation), according to certain embodiments. FIGURE 6 is a diagrammatic view of a reverse-lookup mapping for a hardware implementation such as method 500. In general, if parallelizable hardware is available, method 500 uses the color and depth (e.g., RGB-D) of the pixel in question to determine in which location in the output image to write that data. In step 510, method 500 determines if the incoming pixel 152 of source image 150 has been written to all plenoptic cells 132. If so, method 500 ends. Otherwise, method 500 proceeds to step 520. In step 520, method 500 uses the pixel's location in the display matrix (its cell) to determine the real-world offset from the center of the light field display to its encompassing plenoptic cell's corners. In some embodiments, step 520 includes computing the pixel offset (dX/dY) as a ratio-of-slopes function of the plenoptic cell position (cX/cY) and pixel depth (D). After step 520, method 500 proceeds to step 530.
In step 530, method 500 determines if the pixel has already been written at the location (X + dX, Y + dY) of this cell for this frame. If so, method 500 proceeds to step 540. If not, method 500 proceeds to step 550.
At step 540, method 500 determines if the depth of the already-written pixel is smaller (i.e., closer to the camera) than the depth of the not-yet-written pixel. If the depth of the already-written pixel is smaller than the depth of the not-yet-written pixel, method 500 does not write any pixel data for the incoming pixel and proceeds back to step 510. If the depth of the already-written pixel is not smaller than the depth of the not-yet-written pixel, method 500 proceeds to step 550.
In step 550, method 500 writes the pixel data (RGB+D) to location (X+dX, Y+dY) of the plenoptic cell. Because each pixel in a plenoptic cell represents a different ray direction (the angles theta and phi), method 500 may use the difference between the pixel's location and the plenoptic cell's center, multiplied by the pitch of a cell, to calculate one side of a right triangle in step 550. Method 500 may use the given depth to calculate the other side of that same triangle. The ratio of those two sides generates an angle related to the pixel offset that should be applied when writing the output buffer data for the display. This relationship may be defined by taking that angle, converting it to degrees, and multiplying by the pixels-per-degree of the display and the sign of the difference between the pixel's location and its cell center. This offset is then added to the original pixel x/y and its cell's center to create a new world position. The pixel's color value and depth are written to this position in the output buffer if it is closer than what was there before, creating a part of the light field.
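A minimal sketch of the per-pixel write described in steps 520-550 is given below, assuming square plenoptic cells, a uniform pixels-per-degree, and simple nested-list output buffers; the exact coordinate bookkeeping would depend on the display geometry, and the function and parameter names are illustrative only:

```python
import math

def write_pixel_to_cell(out_rgb, out_depth, cell_center_xy, pixel_xy, rgb, depth,
                        cell_pitch_m, pixels_per_degree):
    """Illustrative write of one source pixel into one plenoptic cell
    (steps 520-550 of method 500). out_rgb/out_depth are output buffers
    indexed [y][x]; cell_center_xy and pixel_xy are output-buffer coordinates."""
    dxy = []
    for pixel_c, center_c in zip(pixel_xy, cell_center_xy):
        # One side of the right triangle: lateral offset from the cell center,
        # scaled by the cell pitch; the other side is the pixel's depth.
        lateral = (pixel_c - center_c) * cell_pitch_m
        angle_deg = math.degrees(math.atan2(abs(lateral), depth))
        sign = 1 if pixel_c >= center_c else -1
        dxy.append(int(round(sign * angle_deg * pixels_per_degree)))

    x = pixel_xy[0] + dxy[0]
    y = pixel_xy[1] + dxy[1]

    # Keep the nearest (smallest-depth) sample at each output location.
    if 0 <= y < len(out_depth) and 0 <= x < len(out_depth[0]) and depth < out_depth[y][x]:
        out_rgb[y][x] = rgb
        out_depth[y][x] = depth
```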
Method 500 may be implemented in low-level code, firmware, or transistor logic and run on hardware close to the light field display 130. In this application, identical synthetic content may be fed to each such hardware unit, and each may use its known position to calculate its portion of the light field.
FIGURE 7 is a flowchart of a method 700 for producing a light field from a depth map using forward-lookup (e.g., software such as a graphics pipeline shader), according to certain embodiments. FIGURE 9 is a diagrammatic view of a forward-lookup mapping for a software implementation such as method 700. In general, in the case where there is not a hardware implementation, it is possible to map the 4D light field from source image 150 using a loop. Method 700 uses the buffer location of the pixel in question, the depth (if available), and the cell location to determine which color should be placed in that same spot in the output buffer. At a high level, method 700 projects the location and direction in space of that particular pixel within the plenoptic cell into the virtual camera space so that the appropriate color to put at that location can be looked up. As illustrated in FIGURE 9, the operation of method 700 is opposite from that of method 500 (i.e., method 700 reads through the display pixels 134 of plenoptic cells 132 while method 500 reads through the source pixel data 152 of source image 150 as illustrated in FIGURE 6). The first step of method 700 is to calculate the camera ray for the given pixel location. This is the ray direction that light travels along as it would intersect this pixel's cell location. This ray uses the world position of the pixel's cell as its origin and the angles theta and phi as its direction. These angles can be calculated from the pixel's offset from the center of its cell. The details of this calculation depend on whether foveation or other mapping needs to take place and should not be assumed to be linear. The ray's origin is converted from pixel space to world space using known properties of the collection system such as near plane and field of view. Next, method 700 converts this ray into one in the depth (background) space. This transformation converts the 3D world space vector into a 2D depth space vector and uses inverse depth for comparisons. Next, the derivatives for both the camera ray and the depth vector are calculated with respect to the spatial dimensions (x, y). These derivatives are then used in a loop to traverse through depth space to determine the closest object in the rendered buffer that the ray would hit. In the case of a separate depth buffer, the depth space vector would also be traversed through its buffer. At the end, the closest object between the depth space and color space determines the actual depth of the pixel's ray. In these steps it may be preferable to use the tangent of the inverse depth during the recursion for stability reasons. With the depth of the object that was hit, the origin of the ray, and its direction, trigonometric relations (using known properties of the collector such as field of view) can be used to convert from world space back to screen space to determine which source color image pixel's color data should be used for the output image at this pixel location. The specific steps of some embodiments of method 700 are described in more detail below.
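As one illustrative example, and only under the simplifying assumption of a linear (non-foveated) mapping from pixel offset to view angle, the camera-ray setup described above might look like the following; fov_deg and pixels_per_cell are illustrative parameters, not values from this disclosure:

```python
import math

def camera_ray_for_pixel(cell_origin_world, pixel_offset_xy, pixels_per_cell, fov_deg):
    """Build the camera ray for one display pixel (forward lookup, method 700).

    cell_origin_world: world-space position of the pixel's plenoptic cell.
    pixel_offset_xy:   the pixel's offset from the center of its cell, in pixels.
    Returns (origin, direction), with the direction expressed in tangent space,
    i.e. a vector <w_x, w_y, 1> prior to any normalization.
    """
    # Linear mapping from pixel offset to view angle across the cell's field of view.
    half_fov = math.radians(fov_deg) / 2.0
    theta = (pixel_offset_xy[0] / (pixels_per_cell / 2.0)) * half_fov
    phi = (pixel_offset_xy[1] / (pixels_per_cell / 2.0)) * half_fov

    # Tangent-space direction: x and y are tangents of the view angles, z = 1.
    direction = (math.tan(theta), math.tan(phi), 1.0)
    return cell_origin_world, direction
```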
Method 700 may begin in step 710 where method 700 determines if every pixel in the render target has been checked. If so, method 700 may end. Otherwise, method 700 proceeds to step 715.
At step 715, method 700 retrieves the next pixel X,Y in the render target. After step 715, method 700 proceeds to step 720 where method 700 determines the ray direction of pixel X,Y in the scene space (d1). After step 720, method 700 proceeds to step 725 where method 700 determines the ray direction of pixel X,Y in the background space (d2). After step 725, method 700 proceeds to step 730 where method 700 calculates the derivatives of d1 and d2 with respect to X/Y. After step 730, method 700 proceeds to step 735 where method 700 calls the function of FIGURE 8 using d1 and its derivatives to approximate the ray intersection point i1 in the scene space. After step 735, method 700 proceeds to step 740 where method 700 calls the function of FIGURE 8 using d2 and its derivatives to approximate the ray intersection point i2 in the background space. After step 740, method 700 proceeds to step 745 where method 700 converts i1 and i2 into more accurate inverse depths id1 and id2. After step 745, method 700 proceeds to step 750 where method 700 calculates an ending inverse depth (ide) from id2. After step 750, method 700 proceeds to step 755. At step 755, method 700 determines if id1 is less than id2. If id1 is less than id2, method 700 proceeds to step 760 where method 700 writes 0 (blocked) and then proceeds back to step 710. Otherwise, if id1 is not less than id2, method 700 proceeds to step 765 where method 700 writes the color sample from the scene or multisample (e.g., derivative) at i1. After step 765, method 700 proceeds back to step 710.
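The per-pixel loop of steps 710-765 may be summarized in the following sketch; render_target, scene, and background are placeholder objects, and helpers such as ray_direction_scene, derivative_xy, refine_inverse_depth, and march_to_inverse_depth (standing in for the function of FIGURE 8) represent the operations described above rather than any specific implementation:

```python
def forward_lookup_pass(render_target, scene, background):
    """Illustrative per-pixel loop of method 700 (FIGURE 7)."""
    for x, y in render_target.pixels():                          # steps 710/715
        d1 = ray_direction_scene(x, y, scene)                    # step 720
        d2 = ray_direction_background(x, y, background)          # step 725
        dd1, dd2 = derivative_xy(d1), derivative_xy(d2)          # step 730

        i1 = march_to_inverse_depth(d1, dd1, scene.depth)        # step 735 (FIGURE 8)
        i2 = march_to_inverse_depth(d2, dd2, background.depth)   # step 740 (FIGURE 8)

        id1, id2 = refine_inverse_depth(i1), refine_inverse_depth(i2)  # step 745
        ide = ending_inverse_depth(id2)                          # step 750

        if id1 < id2:                                            # step 755
            render_target.write(x, y, 0)                         # step 760: blocked
        else:
            render_target.write(x, y, scene.sample(i1))          # step 765
    return render_target
```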
In the forward-lookup application (i.e., FIGURE 7), method 700 may be implemented as a post-process shader in a game engine or render engine. This shader would take as input data from cameras placed in the synthetic scene and, optionally, depth buffers from those virtual cameras. Method 700 would then output a single image containing, as an array of sub-images, the cells that make up the light field.
FIGURE 8 is a flowchart of a sub-method 800 of FIGURE 7, according to certain embodiments. Method 800 is a parallax occlusion ray-to-inverse-depth method that functions by raymarching with uniform steps in the pixel space. In general, method 800 first sets up a conversion between 3D space and the 2D space of the input render. This allows method 800 to perform its iteration in a pixel-perfect manner while still doing the computation in 3D space. Not only does this provide better quality results, but it also provides drastic performance gains when the light field display is small with respect to the scene it is displaying.
Method 800 may begin in step 810 where method 800 calculates the starting tangent for the inverse depth for the direction passed in from method 700 (d1 or d2). After step 810, method 800 proceeds to step 820 where method 800 calculates the ending tangent for the inverse depth for the direction passed in from method 700 (d1 or d2). In general, in steps 810 and 820, method 800 obtains the tangents (in the original render space) corresponding to the start and end inverse depths along the input ray. Method 800 may in addition compute the constant step (corresponding to half of the diagonal length of one pixel) to be used in the raymarching.
In steps 820 and 825, method 800 begins a loop that takes a parameter 't' from 0 to 1. Method 800 considers t=0 to be the start of the input ray and t=1 to be the endpoint, and method 800 increments 't' by the previously described precomputed constant at each iteration of the loop (step 845). Because 't' is linearly related to the original render's pixel space, it has an inverse relation to the depth of the input ray (i.e., t*depth is constant). Method 800 uses this fact to compute the depth of the input ray at each step in the raymarch (step 830, where method 800 calculates the parametric inverse depth idt using 't'), and method 800 uses the linear relation between 't' and the input pixel space to sample the input depth map, which is called the scene depth (step 835). On the first iteration in which the ray depth exceeds the scene depth (step 840), method 800 concludes that the ray has 'hit' something and takes an early exit to return an appropriately interpolated value (step 850). Otherwise, in the case that this never occurs before the end of the loop, method 800 returns an indication that nothing was hit (step 855). Method 800 thus is a parallax occlusion algorithm that results in an inverse depth, allowing it to return an intersection point (as an inverse depth along the ray) as well as a 'nothing hit' signal (as the input ending inverse depth).
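A minimal sketch of method 800 under the assumptions above is shown below; sample_scene_inv_depth is a hypothetical callback that samples the input depth map (as an inverse depth) at parameter t, and the interpolation used at the early exit is illustrative rather than prescribed by this disclosure:

```python
def march_to_inverse_depth(sample_scene_inv_depth, id_start, id_end, step):
    """Illustrative parallax-occlusion raymarch of method 800 (FIGURE 8),
    stepping uniformly through the input render's pixel space.

    id_start, id_end: inverse depths of the ray at t=0 and t=1 (steps 810/820).
    step: precomputed constant increment of t, e.g. half a pixel diagonal
    expressed parametrically (step 845). Returns the inverse depth of the
    first hit, or id_end if nothing is hit (inverse depth 0 meaning infinity).
    """
    prev_t = 0.0
    t = step
    while t <= 1.0:                                    # loop of steps 820/825
        # t is linear in pixel space, so the ray's inverse depth is linear in t
        # (equivalently, t * depth is constant along the ray) -- step 830.
        ray_id = id_start + (id_end - id_start) * t
        scene_id = sample_scene_inv_depth(t)           # step 835

        # Ray depth exceeds scene depth exactly when the ray's inverse depth
        # drops below the scene's inverse depth: the ray has hit -- step 840.
        if ray_id < scene_id:
            # Early exit with an interpolated inverse depth between the last
            # two samples -- step 850 (exact interpolation is illustrative).
            prev_ray_id = id_start + (id_end - id_start) * prev_t
            return 0.5 * (prev_ray_id + ray_id)

        prev_t = t
        t += step                                      # step 845

    return id_end                                      # step 855: nothing hit
```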
The methods described herein mostly use tangent space as opposed to either pixel space or angle space, and mostly use inverse depth as opposed to depth. Both of these choices of space have conceptual and computational advantages. For directions in 3D space and for locations on a 2D pinhole-style image, the space of choice is tangent space. This is defined with respect to the plane perpendicular to the view direction of the camera/light field so that for a tangent coordinate {w_x, w_y}, the direction it points corresponds to the vector <x = w_x, y = w_y, z = 1>. In plenoptic cellular light field contexts, tangent space may be the best choice because it is the most consistent, computationally cheapest option and is readily compatible with the other spaces (pixel space is tangent space with a scalar multiplier).
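By way of illustration, the tangent-space convention above, and the scalar relationship between pixel space and tangent space for a pinhole-style image, can be expressed as follows; the field-of-view parameter is illustrative:

```python
import math

def tangent_to_direction(w_x, w_y):
    """Tangent coordinate {w_x, w_y} -> the (unnormalized) 3D direction it
    points along: <x = w_x, y = w_y, z = 1>."""
    return (w_x, w_y, 1.0)

def pixel_to_tangent(px, py, width, height, horizontal_fov_deg):
    """Pixel space is tangent space with a scalar multiplier: map a pixel
    coordinate to tangent space for a pinhole-style image."""
    scale = math.tan(math.radians(horizontal_fov_deg) / 2.0) / (width / 2.0)
    return ((px - width / 2.0) * scale, (py - height / 2.0) * scale)
```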
The use of inverse depth space offers many advantages. As noted above, in a standard camera projection, pixel position is inversely related to scene depth (the coordinate parallel to the view direction, not the distance to the camera) and therefore linearly related to inverse depth. This allows the disclosed embodiments to execute large sections of computation (such as pixel-perfect raymarching) without ever directly computing depth and without ever using a division. Another advantage of using inverse depth is a clean representation of infinity. An inverse depth of 0 may be used to refer to the 'back' of the scene, as it is effectively a depth of infinity. This proves much more useful than its counterpart, a depth of zero, which inverse depth space renders unwieldy.
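As a small illustrative example of working directly in inverse depth (no division in the inner computation, and 0 standing in for infinity):

```python
def interpolate_inverse_depth(inv_d0, inv_d1, t):
    """Linearly interpolate in inverse-depth space.

    Because pixel position is linear in inverse depth under a standard camera
    projection, interpolating inverse depths directly requires no division and
    no special case for distant geometry. An inverse depth of 0.0 cleanly
    stands in for a depth of infinity (the 'back' of the scene).
    """
    return inv_d0 + (inv_d1 - inv_d0) * t

# e.g., halfway (in screen space) between an object at depth 2.0 (inverse depth
# 0.5) and the background at infinity (inverse depth 0.0):
assert interpolate_inverse_depth(0.5, 0.0, 0.5) == 0.25  # i.e., depth 4.0
```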
The scope of this disclosure is not limited to the example embodiments described or illustrated herein. The scope of this disclosure encompasses all changes, substitutions, variations, alterations, and modifications to the example embodiments described or illustrated herein that a person having ordinary skill in the art would comprehend.
Modifications, additions, or omissions may be made to the systems and apparatuses described herein without departing from the scope of the disclosure. The components of the systems and apparatuses may be integrated or separated. Moreover, the operations of the systems and apparatuses may be performed by more, fewer, or other components. Additionally, operations of the systems and apparatuses may be performed using any suitable logic comprising software, hardware, and/or other logic.
Modifications, additions, or omissions may be made to the methods described herein without departing from the scope of the disclosure. The methods may include more, fewer, or other steps. Additionally, steps may be performed in any suitable order. That is, the steps of any method disclosed herein do not have to be performed in the exact order disclosed, unless explicitly stated.
As used in this document, “each” refers to each member of a set or each member of a subset of a set. Furthermore, as used in the document “or” is not necessarily exclusive and, unless expressly indicated otherwise, can be inclusive in certain embodiments and can be understood to mean “and/or.” Similarly, as used in this document “and” is not necessarily inclusive and, unless expressly indicated otherwise, can be exclusive in certain embodiments and can be understood to mean “and/or.” All references to "a/an/the element, apparatus, component, means, step, etc." are to be interpreted openly as referring to at least one instance of the element, apparatus, component, means, step, etc., unless explicitly stated otherwise.
Furthermore, reference to an apparatus or system or a component of an apparatus or system being adapted to, arranged to, capable of, configured to, enabled to, operable to, or operative to perform a particular function encompasses that apparatus, system, component, whether or not it or that particular function is activated, turned on, or unlocked, as long as that apparatus, system, or component is so adapted, arranged, capable, configured, enabled, operable, or operative.
In addition, techniques, systems, subsystems, and methods described and illustrated in the various embodiments as discrete or separate may be combined or integrated with other systems, modules, techniques, or methods without departing from the scope of this disclosure. Other items shown or discussed as coupled or directly coupled or communicating with each other may be indirectly coupled or communicating through some interface, device, or intermediate component whether electrically, mechanically, or otherwise. Other examples of changes, substitutions, and alterations are ascertainable by one skilled in the art and could be made without departing from the spirit and scope disclosed herein.
To aid the Patent Office, and any readers of any patent issued on this application in interpreting the claims appended hereto, applicants note that they do not intend any of the appended claims to invoke 35 U.S.C. § 112(f) as it exists on the date of filing hereof unless the words “means for” or “step for” are explicitly used in the particular claim.

Claims

WHAT IS CLAIMED IS:
1. A system comprising: an electronic display; a computer processor; one or more memory units; and a module stored in the one or more memory units, the module configured, when executed by the processor, to: access a source image stored in the one or more memory units; determine depth data for each pixel of a plurality of pixels of the source image; map, using the plurality of pixels and the determined depth data for each of the plurality of pixels, the source image to a four-dimensional light field; and send instructions to the electronic display to display the mapped four-dimensional light field.
2. The system of Claim 1, wherein: the source image comprises the depth data; and determining the depth data for each pixel of the plurality of pixels of the source image comprises accessing the depth data of the source image from a render buffer.
3. The system of Claim 1, wherein: the source image is a stereoscopic image that is devoid of the depth data; and determining the depth data for each pixel of the plurality of pixels of the source image comprises computing the depth data via parallax differences in the stereoscopic image.
4. The system of Claim 1, wherein: the source image is a non-stereoscopic image that is devoid of the depth data; and determining the depth data for each pixel of the plurality of pixels of the source image comprises assigning a constant depth value and a fixed location to the depth data.
5. The system of Claim 1, wherein: the module is further configured, when executed by the processor, to determine whether a runtime environment for the system supports reverse-lookup; mapping, using the plurality of pixels and the determined depth data for each of the plurality of pixels, the source image to the four-dimensional light field comprises using reverse-lookup when it is determined that the runtime environment for the system supports reverse-lookup; and mapping, using the plurality of pixels and the determined depth data for each of the plurality of pixels, the source image to the four-dimensional light field comprises using forward-lookup when it is determined that the runtime environment for the system does not support reverse-lookup.
6. The system of Claim 1, wherein: the electronic display comprises a plurality of plenoptic cells, each plenoptic cell comprising a plurality of display pixels; and mapping, using the plurality of pixels and the determined depth data for each of the plurality of pixels, the source image to the four-dimensional light field comprises copying pixel data for each of the plurality of pixels of the source image to a corresponding display pixel of each of the plenoptic cells.
7. The system of Claim 6, wherein the corresponding display pixel is determined as a function of the depth data of the corresponding pixel of the source image.
8. The system of Claim 1, wherein: mapping, using the plurality of pixels and the determined depth data for each of the plurality of pixels, the source image to the four-dimensional light field comprises using reverse-lookup.
9. The system of Claim 1, wherein: mapping, using the plurality of pixels and the determined depth data for each of the plurality of pixels, the source image to the four-dimensional light field comprises using forward-lookup.
10. A method by a computing device, the method comprising: accessing a source image stored in one or more memory units; determining depth data for each pixel of a plurality of pixels of the source image; mapping, using the plurality of pixels and the determined depth data for each of the plurality of pixels, the source image to a four-dimensional light field; and sending instructions to an electronic display to display the mapped four-dimensional light field.
11. The method of Claim 10, wherein: the source image comprises the depth data; and determining the depth data for each pixel of the plurality of pixels of the source image comprises accessing the depth data of the source image from a render buffer.
12. The method of Claim 10, wherein: the source image is a stereoscopic image that is devoid of the depth data; and determining the depth data for each pixel of the plurality of pixels of the source image comprises computing the depth data via parallax differences in the stereoscopic image.
13. The method of Claim 10, wherein: the source image is a non-stereoscopic image that is devoid of the depth data; and determining the depth data for each pixel of the plurality of pixels of the source image comprises assigning a constant depth value and a fixed location to the depth data.
14. The method of Claim 10, further comprising determining whether a runtime environment for the computing device supports reverse-lookup, wherein: mapping, using the plurality of pixels and the determined depth data for each of the plurality of pixels, the source image to the four-dimensional light field comprises using reverse-lookup when it is determined that the runtime environment for the system supports reverse-lookup; and mapping, using the plurality of pixels and the determined depth data for each of the plurality of pixels, the source image to the four-dimensional light field comprises using forward-lookup when it is determined that the runtime environment for the system does not support reverse-lookup.
15. The method of Claim 10, wherein: the electronic display comprises a plurality of plenoptic cells, each plenoptic cell comprising a plurality of display pixels; and mapping, using the plurality of pixels and the determined depth data for each of the plurality of pixels, the source image to the four-dimensional light field comprises copying pixel data for each of the plurality of pixels of the source image to a corresponding display pixel of each of the plenoptic cells.
16. One or more computer-readable non-transitory storage media embodying software that is operable when executed to: access a source image stored in the one or more computer-readable non-transitory storage media; determine depth data for each pixel of a plurality of pixels of the source image; map, using the plurality of pixels and the determined depth data for each of the plurality of pixels, the source image to a four-dimensional light field; and send instructions to an electronic display to display the mapped four-dimensional light field.
17. The media of Claim 16, wherein: the source image comprises the depth data; and determining the depth data for each pixel of the plurality of pixels of the source image comprises accessing the depth data of the source image from a render buffer.
18. The media of Claim 16, wherein: the source image is a stereoscopic image that is devoid of the depth data; and determining the depth data for each pixel of the plurality of pixels of the source image comprises computing the depth data via parallax differences in the stereoscopic image.
19. The media of Claim 16, wherein: the source image is a non-stereoscopic image that is devoid of the depth data; and determining the depth data for each pixel of the plurality of pixels of the source image comprises assigning a constant depth value and a fixed location to the depth data.
20. The media of Claim 16, wherein: the software is further operable when executed to determine whether a runtime environment supports reverse-lookup; mapping, using the plurality of pixels and the determined depth data for each of the plurality of pixels, the source image to the four-dimensional light field comprises using reverse-lookup when it is determined that the runtime environment for the system supports reverse-lookup; and mapping, using the plurality of pixels and the determined depth data for each of the plurality of pixels, the source image to the four-dimensional light field comprises using forward-lookup when it is determined that the runtime environment for the system does not support reverse-lookup.
PCT/US2021/037005 2020-06-12 2021-06-11 Systems and methods for producing a light field from a depth map WO2021252892A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
EP21737891.8A EP4165586A1 (en) 2020-06-12 2021-06-11 Systems and methods for producing a light field from a depth map

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US202063038639P 2020-06-12 2020-06-12
US63/038,639 2020-06-12
US17/345,436 2021-06-11
US17/345,436 US20210390722A1 (en) 2020-06-12 2021-06-11 Systems and Methods for Producing a Light Field from a Depth Map

Publications (1)

Publication Number Publication Date
WO2021252892A1 true WO2021252892A1 (en) 2021-12-16

Family

ID=78825780

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2021/037005 WO2021252892A1 (en) 2020-06-12 2021-06-11 Systems and methods for producing a light field from a depth map

Country Status (3)

Country Link
US (1) US20210390722A1 (en)
EP (1) EP4165586A1 (en)
WO (1) WO2021252892A1 (en)

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR100505334B1 (en) * 2003-03-28 2005-08-04 (주)플렛디스 Real-time stereoscopic image conversion apparatus using motion parallax
US8559705B2 (en) * 2006-12-01 2013-10-15 Lytro, Inc. Interactive refocusing of electronic images
JP5242667B2 (en) * 2010-12-22 2013-07-24 株式会社東芝 Map conversion method, map conversion apparatus, and map conversion program
US9390505B2 (en) * 2013-12-12 2016-07-12 Qualcomm Incorporated Method and apparatus for generating plenoptic depth maps
US9305375B2 (en) * 2014-03-25 2016-04-05 Lytro, Inc. High-quality post-rendering depth blur
JP6620394B2 (en) * 2014-06-17 2019-12-18 ソニー株式会社 Control device, control method and program
EP3228977A4 (en) * 2014-12-01 2018-07-04 Sony Corporation Image-processing device and image-processing method
US10735711B2 (en) * 2017-05-05 2020-08-04 Motorola Mobility Llc Creating a three-dimensional image via a wide-angle camera sensor

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180342075A1 (en) * 2017-05-25 2018-11-29 Lytro, Inc. Multi-view back-projection to a light-field
US20200137376A1 (en) * 2017-10-31 2020-04-30 Wuhan China Star Optoelectronics Technology Co., Ltd Method for generating a light-field 3d display unit image and a generating device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
WANG QIAOSONG ET AL: "Stereo vision-based depth of field rendering on a mobile device", JOURNAL OF ELECTRONIC IMAGING, S P I E - INTERNATIONAL SOCIETY FOR OPTICAL ENGINEERING, US, vol. 23, no. 2, 1 March 2014 (2014-03-01), pages 23009, XP060047684, ISSN: 1017-9909, [retrieved on 20140319], DOI: 10.1117/1.JEI.23.2.023009 *

Also Published As

Publication number Publication date
US20210390722A1 (en) 2021-12-16
EP4165586A1 (en) 2023-04-19

Similar Documents

Publication Publication Date Title
US20210174471A1 (en) Image Stitching Method, Electronic Apparatus, and Storage Medium
US11363249B2 (en) Layered scene decomposition CODEC with transparency
US6680735B1 (en) Method for correcting gradients of irregular spaced graphic data
US7973791B2 (en) Apparatus and method for generating CG image for 3-D display
Heidrich et al. View-independent environment maps
RU2754721C2 (en) Device and method for generating an image of the intensity of light radiation
CN111698463A (en) View synthesis using neural networks
JP2000348202A (en) Shifting warp renderling method for volume data set having voxel and rendering method for volume data set
Xiong et al. Registration, calibration and blending in creating high quality panoramas
TW202036481A (en) Distance field color palette
GB2475944A (en) Correction of estimated axes of elliptical filter region
Bonatto et al. Real-time depth video-based rendering for 6-DoF HMD navigation and light field displays
JP2006244426A (en) Texture processing device, picture drawing processing device, and texture processing method
CN114782607A (en) Graphics texture mapping
CN113643414A (en) Three-dimensional image generation method and device, electronic equipment and storage medium
US20210390722A1 (en) Systems and Methods for Producing a Light Field from a Depth Map
JP3629243B2 (en) Image processing apparatus and method for rendering shading process using distance component in modeling
EP4064193A1 (en) Real-time omnidirectional stereo matching using multi-view fisheye lenses
KR20220133766A (en) Real-time omnidirectional stereo matching method using multi-view fisheye lenses and system therefore
KR20210117988A (en) Methods and apparatus for decoupled shading texture rendering
Kolhatkar et al. Real-time virtual viewpoint generation on the GPU for scene navigation
CN114503150A (en) Method and apparatus for multi-lens distortion correction
JP2002092597A (en) Method and device for processing image
Walia et al. A computationally efficient framework for 3D warping technique
EP1209620A2 (en) Method for correcting gradients of irregularly spaced graphic data

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21737891

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2021737891

Country of ref document: EP

Effective date: 20230112

NENP Non-entry into the national phase

Ref country code: DE