WO2011094164A1 - Image enhancement system using area information - Google Patents

Image enhancement system using area information

Info

Publication number
WO2011094164A1
WO2011094164A1 (PCT/US2011/022281)
Authority
WO
WIPO (PCT)
Prior art keywords
frames
input
frame
area
area map
Application number
PCT/US2011/022281
Other languages
French (fr)
Inventor
Gianluca Filippini
Reinhard Steffens
James F. Dougherty
Thomas J. Zato
Original Assignee
Dolby Laboratories Licensing Corporation
Application filed by Dolby Laboratories Licensing Corporation filed Critical Dolby Laboratories Licensing Corporation
Publication of WO2011094164A1 publication Critical patent/WO2011094164A1/en

Classifications

    • G PHYSICS
      • G06 COMPUTING; CALCULATING OR COUNTING
        • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
          • G06T 9/00 Image coding
            • G06T 9/001 Model-based coding, e.g. wire frame
          • G06T 7/00 Image analysis
            • G06T 7/10 Segmentation; Edge detection
              • G06T 7/11 Region-based segmentation
    • H ELECTRICITY
      • H04 ELECTRIC COMMUNICATION TECHNIQUE
        • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
          • H04N 13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
            • H04N 13/10 Processing, recording or transmission of stereoscopic or multi-view image signals
              • H04N 13/106 Processing image signals
                • H04N 13/172 Processing image signals comprising non-image signal components, e.g. headers or format information
                  • H04N 13/178 Metadata, e.g. disparity information
          • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
            • H04N 19/50 using predictive coding
              • H04N 19/597 specially adapted for multi-view video sequence encoding
            • H04N 19/60 using transform coding
              • H04N 19/61 in combination with predictive coding
            • H04N 19/85 using pre-processing or post-processing specially adapted for video compression
              • H04N 19/86 involving reduction of coding artifacts, e.g. of blockiness

Definitions

  • the present invention relates generally to imaging systems, and in particular, to imaging systems that process 2-dimensional (2D), 3-dimensional (3D), and multi-view images.
  • these different-eye images may still convey a flawed view of a 3D reality to a user, as the geometry of the camera system may not match that of an image- viewing environment. Due to the difference in geometries, an object as perceived in the viewing environment may be too large or too small relative to that in the view of reality. For example, one spatial dimension of a solid object may be scaled quite differently from a different spatial dimension of the same object. A solid ball in reality may possibly be perceived as a much-elongated ellipse in the viewing environment. Further, for images that portray fast actions and movements, human brains are typically ineffective in compensating or
  • a high-quality geometrically correct camera system may be used to produce high quality 2D, 3D, or multi-view original images.
  • these original images cannot be used for public distribution as is, and are typically required to be edited, down-sampled and compressed to produce a much more compact version of the images.
  • a release version stored on a Blu-ray disc may use only 8-bit color values to approximate the original 12-bit color values in the original images.
  • human eyes may be quite sensitive in certain regions of the color space, and can tell different colors even if those colors have very similar color values. As a result, colors such as dark blue in certain parts of an original image may be shifted to human-detectable different colors, such as blue with a purple hue, in the release version.
  • FIG. 1A, FIG. 1B, FIG. 1C, and FIG. 1D illustrate example systems and components that support image processing, according to possible embodiments of the present invention
  • FIG. 2A and FIG. 2B illustrate example geometry of an example camera system, according to possible embodiments of the present invention
  • FIG. 3A, FIG. 3B, FIG. 3C, FIG. 3D, and FIG. 3E illustrate example operations and representations relating to area maps, according to possible embodiments of the present invention
  • FIG. 4A and FIG. 4B illustrate an example process flow, according to a possible embodiment of the present invention
  • FIG. 5 illustrates an example hardware platform on which a computer or a computing device as described herein may be implemented, according to a possible embodiment of the present invention.
  • a computational measure may be determined for one or more input frames.
  • the one or more input frames may be 2D, 3D, or multi-view image frames comprising one or more views of a reality as perceived by one or more camera elements in a camera system.
  • geometry information for the one or more camera elements in the camera system may also be determined.
  • a value of the computational measure may be calculated for each picture unit in an input frame in the one or more input frames.
  • the input frame for example, may be a frame in a pair of stereoscopic frames that comprise a left-eye frame and a right-eye frame.
  • each picture unit may be a pixel.
  • each picture unit may be a block of pixels.
  • a picture unit may comprise any number of pixels and any arbitrary shape.
  • the computational measure may relate to disparity between two input frames.
  • the computational measure may relate to noise levels to be injected into the input frame to produce the encoded version of the input frame.
  • the computational measure may relate to color information in the input frame.
  • An area map comprising one or more areas may be generated. Each of the one or more areas comprises picture units with a uniform value for the measure.
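
As a rough illustration of the area-map generation described above (not the patent's algorithm), the Python sketch below groups contiguous pixels that share the same value of a computational measure into numbered areas using a simple 4-connected flood fill; the array name `measure`, the helper `label_areas`, and the choice of pixels as picture units are assumptions made for the example.

```python
from collections import deque
import numpy as np

def label_areas(measure):
    """Group contiguous picture units (here: pixels) that share the same
    measure value into numbered areas (a simple 4-connected flood fill).

    `measure` is a 2-D integer array of per-pixel values for some
    computational measure (disparity bucket, color bin, noise class, ...).
    Returns an area map of the same shape, with area IDs starting at 1.
    """
    h, w = measure.shape
    area_map = np.zeros((h, w), dtype=np.int32)
    next_id = 0
    for y in range(h):
        for x in range(w):
            if area_map[y, x]:
                continue
            next_id += 1
            value = measure[y, x]
            queue = deque([(y, x)])
            area_map[y, x] = next_id
            while queue:
                cy, cx = queue.popleft()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and not area_map[ny, nx]
                            and measure[ny, nx] == value):
                        area_map[ny, nx] = next_id
                        queue.append((ny, nx))
    return area_map

# Toy example: two uniform regions yield two areas.
measure = np.array([[0, 0, 1],
                    [0, 1, 1]])
print(label_areas(measure))   # [[1 1 2], [1 2 2]]
```
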
  • the area map and an encoded version of the input frame may be outputted to a recipient device.
  • the encoded version of the input frame provided to the recipient device may be of a sampling rate lower than that of the input frame.
  • the area map may be represented based on a plurality of vertices and a plurality of links between the vertices.
  • the area node map may comprise at least an area in which corresponding picture units in the two input frames are of zero disparity.
  • additional computational measures may be used to generate additional area maps.
  • an encoded version of an input frame and an area map relating to one or more input frames including the input image may be decoded.
  • the area map may comprise one or more areas each having picture units that are of a uniform value for a computational measure.
  • the area map may be applied to the input frame to generate a new version of the input frame.
  • the new version of the input frame may be sent to a rendering device.
  • geometry information for one or more camera elements that originate one or more input frames may be determined.
  • the geometry information with the one or more input frames in a data stream may be encoded.
  • the data stream may be outputted to a recipient device.
  • one or more input frames and geometry information of one or more camera elements that originate the one or more input frames may be received.
  • the one or more input frames may be modified to generate a new version of the one or more input frames based on the geometry information of the one or more camera elements.
  • the new version of the one or more input frames may be outputted to a rendering device.
  • mechanisms as described herein form a part of an image processing system, including but not limited to a server, studio system, art director system, image editor, animation system, movie studio system, broadcast system, media recording device, media playing device, television, laptop computer, netbook computer, cellular radiotelephone, electronic book reader, point of sale terminal, desktop computer, computer workstation, computer kiosk, and various other kinds of terminals and display units.
  • Images may be described herein with reference to one or more example media, including still images, video frames, slide shows, etc.
  • the selection of an example medium in this description may be made for simplicity and conciseness and, unless expressly stated to the contrary, should not be construed as limiting an embodiment to a particular medium, as embodiments of the present invention are well suited to function with any media content.
  • FIG. 1A shows an example of an image processing system (100) in accordance with one or more possible embodiments.
  • the image processing system (100) generally represents a single device or multiple devices that are configured to encode images and metadata into a coded bitstream.
  • the image processing system (100) may comprise a camera system unit (102), a video preprocessing unit (104), a sampling format unit (106), an area mapping unit (158) comprising a disparity computation unit (108) and an area node mapping unit (110), a video encode unit (112), and a muxer (114).
  • the camera system unit (102) corresponds to any device configured to acquire images in terms of field raw source frames.
  • the camera system unit (102) may include a single view camera, a stereo camera pair, or a multi-view camera, or any other suitable systems or subsystems of image acquisition devices.
  • field raw source frames may refer to a version of images captured from a reality in image planes (or film planes) of image acquisition devices present in the reality; the field raw source frames (or simply source frames) may be, but are not limited to, a high-quality version of original images that portray the reality.
  • source frames may generally refer to a version of initial frames that are to be edited, down-sampled, and/or compressed, along with possible metadata, into a coded bitstream that may be distributed to image receiving systems; thus, the field raw source frames may include artificially created or synthesized image frames.
  • the source frames may be captured from a camera system with a high sampling rate that is typically used by a professional, an art studio, a broadcast company, a high-end media production entity, etc.
  • the camera system unit (102) may comprise one, two, or more image acquisition devices each of which may be a single camera element configured to capture a specific view of a reality.
  • Examples of image acquisition devices may include, but are not limited to, a left camera element and a right camera element as illustrated, where the camera system unit (102) is a stereo camera system.
  • Examples of source frames may include, but are not limited to, left source frames (116) and right source frames (118).
  • source frames may be computer generated.
  • Source frames may also be obtained from existing image sources such as old movies and documentaries.
  • disparity and/or depth information and/or parallactic information may be computer-generated to convert 2D images to 3D source frames or to multi-view source frames.
  • the camera system unit (102) may be configured to acquire and communicate geometry information related to optical configurations of the image acquisition devices to other parts of the image processing system (100).
  • the camera system unit (102) is configured to provide geometry information (120) as described herein as an input to the video preprocessing unit (104).
  • geometry information (120) may include, but are not limited to, information related to positions and/or offsets of principal points and parallaxes in optical configurations of the image acquisition devices as functions of time.
  • preprocessing may be optional. Some or all of the functions performed by the video preprocessing unit (104) as described herein may be bypassed.
  • the video preprocessing unit (104) generally represents any hardware and/or software configured to produce (1) pixel processed frames, and/or (2) statistical analysis data, related to the contents of the source frames.
  • the "pixel processed frames” refer to a version of images that are generated based on source frames (e.g., 116 and 118) and geometry information (e.g., 120) received as an input; typically, the pixel processed frames represent a high-quality version of images that portray the reality, enhanced by the geometry information and the statistical analysis data.
  • the "statistical analysis data” refers to data generated based on statistical analysis on the contents of the source frames (e.g., 116 and 118).
  • the statistical analysis data may be provided to, and used by, an image receiving system to make better predictions or faster decoding relating to image data encoded in a coded bitstream than otherwise.
  • the video preprocessing unit (104) may receive the source frames (116 and 118) and geometry information (120) as inputs from the camera system unit (102).
  • the video preprocessing unit (104) may only receive source frames (116 and 118) from the camera system unit (102).
  • the geometry information (120) may bypass the video preprocessing unit (104) and be provided directly to one or more other parts or units, for example, the disparity computation unit (108), in the image processing system (100).
  • the video preprocessing unit (104) may comprise video filters that are configured to perform sharpening, de-noising, motion correlation and/or
  • the video preprocessing unit (104) may be configured to perform frame rate conversion (FRC) and/or de-interlacing.
  • the video preprocessing unit (104) may be configured to perform motion analysis, segmentation, object scan, and pixel activities on source frames, to semantically describe the content of source frames using one or more non-proprietary or proprietary measures, and to improve the final quality of the coded bitstream. Some or all of the information relating to these measures may be provided as input to the video encode unit (112) to drive specific parts of the video encode unit (112), such as a quantization block, a mode decision block, and a rate control module.
  • the video preprocessing unit (104) may be configured to provide any, some, or all of, the pixel processed frames, the geometry information, and the statistical analysis data to other parts or units in the image processing system (100).
  • the pixel processed frames additionally carry the geometry information and the statistical analysis data, and the video preprocessing unit (104) may be configured to provide the pixel processed frames to other parts or units in the image processing system (100).
  • where the video preprocessing unit (104) receives the source frames as stereo left source frames and stereo right source frames, the video preprocessing unit (104) provides left pixel processed frames (122) and right pixel processed frames (124) as inputs to the sampling format unit (106).
  • these left pixel processed frames (122) and right pixel processed frames (124) may, but are not required to, include the geometry information (120) and the statistical analysis data.
  • the sampling format unit (106) generally represents any hardware and/or software configured to receive pixel processed frames (e.g., 122 and 124) and generate a pre-encoding version of frames (128) by sub-sampling the pixel processed frames (e.g., 122 and 124) using one or more sub-sampling methods. These sub-sampling methods may or may not be specific to processing 2D, 3D, or multi-view image frames, and may, but are not required to, relate to side-by-side, top/bottom, quincunx, etc. As compared with the source frames, the pre-encoding version of frames (128) may comprise frames as a relatively low quality version of images.
  • a sampling rate for the source frames may be 4:4:4 (e.g., 10 bits for each color value), while a sampling rate for the pre-encoding version of images may be 4:2:2 (e.g., 8 bits for each color value).
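
As a hedged illustration of the down-sampling performed by the sampling format unit, the sketch below requantizes 10-bit color planes to 8 bits and horizontally averages chroma to go from 4:4:4 to 4:2:2, matching the example rates above; the function names and the simple rounding/averaging are assumptions, not the patent's sub-sampling method (which may instead be side-by-side, top/bottom, quincunx, etc.).

```python
import numpy as np

def reduce_bit_depth(plane, in_bits=10, out_bits=8):
    """Requantize one color plane, e.g. 10-bit source values to an 8-bit
    pre-encoding version (simple rounding; requires in_bits > out_bits)."""
    shift = in_bits - out_bits
    rounded = (plane.astype(np.int32) + (1 << (shift - 1))) >> shift
    return np.clip(rounded, 0, (1 << out_bits) - 1).astype(np.uint8)

def subsample_422(cb, cr):
    """Horizontally average chroma pairs, i.e. 4:4:4 -> 4:2:2 (even width)."""
    cb = cb.reshape(cb.shape[0], -1, 2).mean(axis=2)
    cr = cr.reshape(cr.shape[0], -1, 2).mean(axis=2)
    return cb, cr

# Toy 2x4 10-bit chroma planes.
cb = np.full((2, 4), 512, dtype=np.uint16)
cr = np.full((2, 4), 700, dtype=np.uint16)
cb422, cr422 = subsample_422(cb, cr)
print(cb422.shape)                                 # (2, 2)
print(reduce_bit_depth(np.array([[1023, 512]])))   # [[255 128]]
```
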
  • the pre-encoding version of frames (128) may or may not be further compressed.
  • either a lossless compression method or a lossy compression method may be used.
  • Each frame in the pre-encoding version of frames (128) may comprise image data derived from one, two, or more source frames.
  • a side-by-side frame in the pre-encoding version of frames (128) may comprise image data derived from both a left source frame and a right source frame.
  • the pre-encoding version of frames (128) produced by the sampling format unit (106) may be encoded and stored in one or more much smaller files or containers and transmitted at a much lower transmission rate than that required by the source images.
  • the sampling format unit (106) may provide the pre-encoding version of frames (128) as an input to the video encode unit (112).
  • where the video preprocessing unit (104) receives stereo left source frames and stereo right source frames, the video preprocessing unit (104) provides left pixel processed frames (122), right pixel processed frames (124), and geometry information (126) related to the optical configuration of the camera system (102) as inputs to one or more area mapping units in the image processing system (100).
  • an example of such an area mapping unit may be, but is not limited to, a disparity computation unit (e.g., 108).
  • Other examples of area mapping units may include a noise level computation unit, a color gradient computation unit, a motion analysis unit, etc.
  • an example implementation of area mapping unit (158) may comprise a disparity computation unit (e.g., 108) and an area node mapping unit (e.g., 110).
  • the area mapping unit (158), or the disparity computation unit (108) therein may receive input frames, which may be the pixel processed frames previously discussed, from the video preprocessing unit (104). Additionally and/or optionally, the area mapping unit, or the disparity computation unit (108) therein, may receive the geometry information relating to the optical configuration of the camera systems when original source frames corresponding to the input frames were taken.
  • the area mapping unit (158) produces an area map (136) that may be provided as an input to other parts of the image processing system (100) such as the muxer (114). Additionally and/or optionally, the area mapping unit (158) may produce additional information other than the area map (136) and provide the additional information to other parts of the image processing system (100). For example, the area mapping unit (158) may produce pixel displacement information (134), and provide that information as an input to the muxer (114) along with the area map (136).
  • any, some, or all, of the area map (136) and/or the pixel displacement information (134) may be provided by the area mapping unit (158) to other units in the image processing system (100) such as the video encode unit (112).
  • the video encode unit (112) generally represents any hardware and/or software configured to receive image data and other data related to the image data and to encode the received data into a file or a data stream.
  • the file or data stream may comprise a container structure with multiple channels.
  • One or more image data channels in the container structure may carry a compressed and/or quantized version of the image data.
  • the image data may be an encoded version of a pre-encoded version of images (e.g., 128).
  • One or more metadata channels may carry description and statistical information relating to the image data.
  • the video encode unit (112) may be configured to receive the pre-encoded version of frames (128) from the sampling format unit (106), and to encode the pre-encoded version of frames (128) in a data stream (140) as an input to the muxer (114). Additionally and/or optionally, the video encode unit (112) may be configured to receive pixel displacement information (134) from the area mapping unit (158), and to encode the pixel displacement information (134) in one or more metadata channels in the data stream (140) as an input to the muxer (114).
  • the muxer (114) generally represents any hardware and/or software configured to receive data streams of (1) audio data (e.g., 138) from one or more audio processing units (not shown) in the image processing system, (2) image data as encoded by a video encode unit, and (3) content-based information as generated by one or more area mapping units or as provided by a video encode unit, to multiplex the received data streams in a coded bitstream, and to store the coded bit stream in a tangible storage medium or to transmit the coded bitstream to an image receiving system.
  • the coded bitstream may be logically structured as a container with one or more audio data channels for storing the audio data, one or more image data channels for storing the image data, and one or more metadata channels for storing metadata including description, statistical information, and area maps as described herein.
  • the metadata may include, but are not limited to, the pixel displacement information (134) and the area map (136), which may be provided by an area mapping unit (e.g., 158).
  • an area mapping unit (e.g., 158) generally represents any hardware and/or software configured to receive one, two, or more input frames and/or geometry information of the camera elements that captured the original frames corresponding to the input frames, and to produce an area map representing regions of uniform values for a computational measure.
  • an "area map" refers to a map specifying one or more arbitrarily shaped regions/areas of the view plane represented by one or more frames.
  • an area map as described herein is a content-based area map, and is not formed by a content-neutral dissection of the view plane with regular or irregular shapes such as squares, circles, triangles, etc.
  • edges that form boundaries for the arbitrarily shaped regions/areas of a content-based area map are pixel-accurate; that is, the edges may be specified at a resolution of single pixels.
  • edges that form boundaries for the arbitrarily shaped regions/areas of a content-based area map are block-accurate; that is, the edges may be specified at a resolution of single blocks or sub-blocks, wherein a block or sub-block may comprise two or more pixels.
  • a "computational measure" refers to a measure of interest that is capable of exhibiting a region- or area-based behavior.
  • values for a computational measure may be computed algebraically based on pixel values in one, two, or more input frames.
  • values for a computational measure may be determined logically based on pixel values in one, two, or more input frames.
  • a function, which may or may not involve a range-based mapping and which may or may not be analytical, may be used to compute/determine values for the computational measure.
  • a computational measure may be related to one or more attributes or properties of 2D images, 3D images, or multi-view images. Examples of computational measures may include, but are not limited to, disparity, noise level, color gradient, displacements, motion vectors, spatial errors, etc.
  • a "region/area" in an area map refers to a group of contiguous pixels whose values computed for a specific measure are uniform.
  • a group of contiguous pixels in one, two, or more frames may represent a dark blue object and may have color values within a narrow range that represent dark blue.
  • color values of the contiguous pixels may be mapped to a same (e.g., uniform) value, e.g., dark blue.
  • this group of contiguous pixels (e.g., the dark blue object) may form a region/area in an area map generated by an area mapping unit comprising a color gradient computation unit.
  • a group of contiguous pixels in one, two, or more frames may represent a moving train and may have motion vectors that are of a same vector value as generated by simple panning.
  • values of motion vectors for the contiguous pixels may all be mapped to a same (e.g., uniform) value, for example, corresponding to the motion vector generated from the simple panning.
  • this group of contiguous pixels may form a region/area in an area map generated by an area mapping unit comprising a motion analysis unit.
  • one, two, or more frames in a movie may contextually represent a scene happening in the distant past and may need to assume a classic look like that of an old movie, which may be different from other scenes in the movie.
  • Pixel values of the one or more frames may be analyzed to determine a set of discrete regions/areas.
  • one region/area of the frames may be determined to have one uniform value of the specific measure and may be injected with one noise level, while another region/area of the frames may have a different uniform value of the specific measure and may be injected with a different noise level.
  • An area mapping unit that comprises a noise level computation unit may generate an area map that identifies regions/areas to be injected with various noise levels (e.g., an image receiving system may be configured to map different values of the specific measure to different noise levels).
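
The following sketch illustrates how an image receiving system might apply such a noise level area map; the mapping from area-map values to Gaussian noise strengths (`NOISE_SIGMA`) and the function `inject_area_noise` are hypothetical, since the patent leaves the mapping of measure values to noise levels to the receiving system.

```python
import numpy as np

# Hypothetical mapping from area-map values to noise standard deviations.
NOISE_SIGMA = {0: 0.0, 1: 2.0, 2: 6.0}

def inject_area_noise(frame, noise_area_map, rng=None):
    """Add zero-mean Gaussian noise to each region of `frame`, with a
    per-region strength looked up from `noise_area_map` (same shape)."""
    rng = rng or np.random.default_rng(0)
    out = frame.astype(np.float32)
    for value, sigma in NOISE_SIGMA.items():
        if sigma == 0.0:
            continue
        mask = noise_area_map == value
        out[mask] += rng.normal(0.0, sigma, size=int(mask.sum()))
    return np.clip(out, 0, 255).astype(np.uint8)

frame = np.full((4, 4), 128, dtype=np.uint8)
area_map = np.array([[0, 0, 1, 1]] * 4)   # left half clean, right half grainy
print(inject_area_noise(frame, area_map))
```
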
  • FIG. 1B illustrates an example area mapping unit (158) that comprises a disparity computation unit (108) and an area node mapping unit (110).
  • the disparity computation unit (108) generally represents any hardware and/or software configured to receive two or more input frames representing different perspectives/views of a reality and geometry information of the camera system (e.g., 102) that takes original source frames which give rise to the two or more input frames, to produce per-pixel displacements between any pair of two different input frames, and to produce a disparity area map representing the per-pixel displacements between the pair of two input frames.
  • disparity information produced by a disparity computation unit (108) as described herein may include both per-pixel displacements and the disparity area map constructed based on the per-pixel displacements.
  • the input frames are, or are derived from, source frames that are taken at the same time by a camera system (e.g., 102).
  • An input frame may carry a frame number indicating when the input frame was taken relative to other input frames.
  • two or more input frames may carry a same frame number indicating that corresponding source frames are created from the reality at substantially the same time.
  • two or more input frames may carry timestamps that root in a same clock, in two or more clocks that are synchronized, or in two or more clocks whose timing offsets are known; based on the timestamps, corresponding source frames for these input frames may be determined as taken at the same time.
  • Other ways of identifying two or more frames that are taken by the camera system (102) at a substantially same time may also be used.
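
A minimal sketch of the frame-pairing logic described above, assuming each frame carries a frame number and a timestamp already corrected for known clock offsets; the `Frame` structure, the tolerance value, and the fallback order (frame number first, then timestamps) are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Tolerance for "substantially the same time"; the patent only says it is
# small (e.g. under 1/60 s), so this value is an assumption.
TOLERANCE_S = 1.0 / 120.0

@dataclass
class Frame:
    view: str          # "left" or "right"
    frame_no: int      # counter assigned by the camera element
    timestamp: float   # seconds, already corrected for known clock offsets

def pair_stereo_frames(lefts: List[Frame], rights: List[Frame]
                       ) -> List[Tuple[Frame, Frame]]:
    """Pair left/right frames taken at substantially the same time,
    preferring equal frame numbers and falling back to timestamps."""
    pairs = []
    for lf in lefts:
        match: Optional[Frame] = next(
            (rf for rf in rights if rf.frame_no == lf.frame_no), None)
        if match is None:
            match = min(rights,
                        key=lambda rf: abs(rf.timestamp - lf.timestamp),
                        default=None)
            if match and abs(match.timestamp - lf.timestamp) > TOLERANCE_S:
                match = None
        if match:
            pairs.append((lf, match))
    return pairs

lefts = [Frame("left", 1, 0.000), Frame("left", 2, 0.033)]
rights = [Frame("right", 1, 0.001), Frame("right", 2, 0.034)]
print(len(pair_stereo_frames(lefts, rights)))   # 2
```
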
  • the one or more input frames received by the disparity computation unit (108) are stereo frames.
  • the disparity computation unit (108) may receive the left pixel processed frames (122) and right pixel processed frames (124) from the video preprocessing unit (104).
  • "disparity" as described herein may refer to per-pixel displacements (which may be zero or non-zero values) between a left pixel processed frame (or simply "left frame”) and a right pixel processed frame (or simply "right frame”).
  • the per-pixel displacements may be caused by differences in geometrical positions of a left camera element and a right camera element that took the original source frames, which give rise to the left and right frames.
  • the disparity computation unit (108) comprises an edge detection unit (150) and a geometry compensated correlation unit (152).
  • the edge detection unit (150) generally represents any hardware and/or software configured to process pixel values from a frame and to determine edges related to positions of objects in the frame.
  • the edge detection unit (150) receives the left frame (122), determines edges related to positions of objects in the left frame (122), and describes the edges in a left edge map.
  • the edge detection unit (150) receives the right frame (124), determines edges related to positions of objects in the right frame (124), and describes the edges in a right edge map.
  • pixel values (e.g., luma/chroma values) in the left and right frames may be used together to determine the edges.
  • pixel values in the left and right frames, as well as input frames that are of various perspectives, and/or input frames that precede or follow the input frame may be used to determine the edges.
  • the edge detection unit (150) may be configured to provide edge maps (154) such as the left and right edge maps as an input to the geometry compensated correlation unit (152).
  • the geometry compensated correlation unit (152) generally represents any hardware and/or software configured to receive input frames, geometry information, and edge maps, and to process the received data into disparity information that may comprise per-pixel displacements and a disparity area map.
  • the geometry compensated correlation unit (152) receives a left frame (e.g., one of the left pixel processed frames (122)), a right frame (e.g., one of the right pixel processed frames (124)), and geometry information (126) from the video preprocessing unit (104) and one or more edge maps (154) for the left and right frames from the edge detection unit (150).
  • the geometry compensated correlation unit (152) may be configured to determine and/or select a view plane and to use the geometry information to project both left and right frames, as well as the edge maps (154), onto the same view plane.
  • the view plane as described herein may or may not correspond to a view plane in which a left frame and a right frame are shown to a viewer in an image viewing environment.
  • the view plane as described herein may be physical and may correspond to a view plane on which one of the left and right frames may be viewed.
  • An imaginary line segment from the center of the view plane to the camera element of one of the left and right camera elements may be perpendicular to the view plane.
  • this view plane may be virtual, and may correspond to a point of view differing from either of the camera elements that took corresponding source frames.
  • An imaginary line segment from the center of the view plane to the camera element of either of the left and right camera elements may be tilted (e.g., not perpendicular) relative to the view plane.
  • the geometry compensated correlation unit (152) may be configured to use geometry information related to a viewing environment in setting up the view plane and in projecting the left and right frames to the view plane.
  • the geometry information related to the viewing environment may be stored locally in, or provided to, the image processing system (100), or may be accessible from a user or another device.
  • pixels in one of the left frame and the right frame may be chosen as reference pixels for computing displacements.
  • a pixel may refer to a display unit in a display panel and may correspond to a position in an image frame.
  • a pair of corresponding pixels in two frames may refer to a pair of pixels, respectively in the two frames, that describe or represent a same physical point in a reality.
  • a pixel value may refer to a unit data value that is to be loaded in the pixel of the display unit when the image frame is rendered.
  • three-dimensional (e.g., x-y-z in a Cartesian coordinate system) positions represented by pixel positions of either the left frame or the right frame projected at the view plane may be determined based on the geometric information of the camera elements and of the view plane and may be chosen as initial reference positions in computing displacements of the corresponding pixels in the other frame.
  • Pixel values of the left and right frames, the edge maps after projection to the view plane, the geometry information of the camera elements, and the geometry information of the view plane may be further correlated by the geometry compensated correlation unit (152) to obtain a three-dimensional (e.g., x-y-z) displacement per pixel between pixels in the projected left frame and corresponding pixels in the projected right frame. All per-pixel displacements for a pair of a projected left frame and a corresponding projected right frame may be collectively denoted as <d>.
  • a pixel in one frame and a corresponding pixel in the other frame are determined as coplanar if the positions of the two corresponding pixels are the same.
  • Coplanar pixels may be marked as "zero disparity" in <d> and on a disparity area map generated by the disparity computation unit (108).
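
The geometry-compensated correlation itself is described only at a high level, so the sketch below substitutes a plain sum-of-absolute-differences block matcher: given left and right frames that are assumed to be already projected onto a common view plane, it estimates a per-pixel horizontal displacement <d> and flags coplanar (zero-disparity) pixels. The window size, search range, and cost function are assumptions, not the patent's correlation method.

```python
import numpy as np

def block_match_disparity(left, right, max_disp=16, win=3):
    """Per-pixel horizontal displacement between projected left/right
    frames via sum-of-absolute-differences block matching.
    Returns (disp, zero_mask) where zero_mask flags coplanar pixels."""
    h, w = left.shape
    pad = win // 2
    lp = np.pad(left.astype(np.float32), pad, mode="edge")
    rp = np.pad(right.astype(np.float32), pad, mode="edge")
    disp = np.zeros((h, w), dtype=np.int32)
    for y in range(h):
        for x in range(w):
            patch = lp[y:y + win, x:x + win]
            best, best_d = np.inf, 0
            for d in range(0, min(max_disp, x) + 1):
                cand = rp[y:y + win, x - d:x - d + win]
                cost = np.abs(patch - cand).sum()
                if cost < best:
                    best, best_d = cost, d
            disp[y, x] = best_d
    return disp, disp == 0

left = np.tile(np.arange(8, dtype=np.float32), (8, 1))
right = np.roll(left, -2, axis=1)          # content shifted by 2 pixels
disp, zero = block_match_disparity(left, right, max_disp=4)
print(disp[4, 4], zero[0, 0])              # 2 True
```
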
  • the disparity computation unit (108) may be configured to determine a computational measure related to displacement calculation and to compute values for the measure based on the displacements <d>.
  • the measure may be a per-pixel measure and a value of the measure may be separately computed for each pair of corresponding pixels in the two frames.
  • the measure may be a group-of-pixels measure and a value of the measure may be collectively computed for a group of pairs of corresponding pixels in the two frames.
  • the measure may be represented by a function that maps a vector value such as a displacement to a normalized value such as a number.
  • a group of contiguous pixels in one frame and their corresponding pixels in the other frame may have displacements with similar directions and similar magnitudes such that normalized values computed for these pixels in the measure may be the same value (e.g., uniform value for the measure).
  • the values determined for the chosen measure related to displacements may be used by the disparity computation unit (108) to segment a portion of the view plane as represented by the left and right frames into the previously mentioned disparity area map.
  • the disparity area map refers to an area map in which contiguous pixels with uniform values of a computational measure related to displacements (as discussed above) form one or more non-overlapped areas.
  • coplanar pixels with zero displacements may be additionally and/or optionally marked in the disparity area map.
  • a computational measure related to displacements may be coarse-grained, and displacements whose values are less than two unit displacements apart may be mapped into a same area of contiguous pixels on the disparity area map; among pixels included in the area, coplanar pixels may be additionally and/or optionally marked in the disparity area map.
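
A tiny sketch of such a coarse-grained measure, assuming displacement magnitudes are binned into buckets two unit displacements wide while keeping zero disparity as a distinguished value; the binning rule and function name are illustrative.

```python
import numpy as np

def coarse_disparity_measure(displacements, bucket=2):
    """Map per-pixel displacement magnitudes to coarse buckets so that
    displacements less than `bucket` units apart can fall into the same
    area; zero-disparity (coplanar) pixels keep the distinguished value 0."""
    mag = np.linalg.norm(displacements, axis=-1)      # |<d>| per pixel
    buckets = (mag // bucket).astype(np.int32) + 1
    buckets[mag == 0] = 0                             # mark coplanar pixels
    return buckets

d = np.zeros((2, 3, 3))            # (H, W, xyz) displacement field
d[0, 1] = (1.5, 0.0, 0.0)          # small shift -> bucket 1
d[1, 2] = (4.0, 0.0, 0.0)          # larger shift -> bucket 3
print(coarse_disparity_measure(d))
```
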
  • the zero-disparity information may be used by an image processing system (e.g., 100) to mark certain portion(s) of an image display as 2D (e.g., pixels with zero disparity).
  • the zero disparity information may then be used by an image receiving system to correctly process 2D information (e.g., textual statistics for a baseball player) in a 3D image (which, for example, portrays the baseball action in 3D).
  • a block of pixels as encoded in the coded bitstream may comprise a diagonal of pixels from a left frame and an opposite diagonal of pixels from a right frame.
  • the disparity computation unit (108) may be configured to provide the displacements <d> and the disparity area map to other units or parts in the image processing system (100).
  • the disparity computation unit (108) may be configured to provide the disparity information as described herein (e.g., 130) as an input to the area node mapping unit (110).
  • an area node mapping unit (110) as described herein generally represents any hardware and/or software configured to implement one or more techniques of formatting an area map in a data representation that is appropriate for coding into a data stream that can be readily decoded by an image receiving system.
  • the data representation adopted by the area node mapping unit (110) may be an area node map that employs vertices (a.k.a. nodes) and links that interconnect the nodes for representing segmented areas on the area map.
  • the area node mapping unit (110) may use nodes and links to create a disparity area node map for the disparity area map received from the disparity computation unit (108) and provide the disparity area node map (136) as an input to the muxer (114).
  • the area node mapping unit (110) may be configured to provide the displacements <d> to the video encode unit (112) and/or the muxer (114).
  • FIG. 1C illustrates an example area mapping unit (168) for color gradient information.
  • the color gradient area mapping unit (168) may comprise an edge detection unit (which may, but is not limited to, be the same edge detection unit (150) as illustrated in FIG. 1B), a color gradient computation unit (162), and an area node mapping unit (which may, but is not limited to, be a separate area node mapping unit (110) in FIG. 1A and FIG. 1B).
  • the color gradient area mapping unit (168) generally represents any hardware and/or software configured to receive one, two, or more input frames, for example, from a video preprocessing unit (e.g., 104), and to produce a color gradient area map representing color gradient information in one or more of the input frames.
  • color gradient information produced by a color gradient area mapping unit (168) as described herein may include a color gradient area map.
  • the edge detection unit (150) generally represents any hardware and/or software configured to process pixels from a frame and to determine edges related to positions of objects in the frame.
  • the edge detection unit (150) receives an input frame which may be any of the left pixel processed frames (122) or the right pixel processed frames (124), determines edges related to positions of objects in the input frame, and describes the edges in an edge map.
  • pixel values (e.g., hue/luma/chroma values) in the input frame may be used to determine the edges.
  • the edge detection unit (150) may be configured to provide an edge map (164) (which may or may not be the same as one of the left and right edge maps (154) in FIG. 1B) as an input to the color gradient computation unit (162).
  • the color gradient computation unit (162) generally represents any hardware and/or software configured to receive input frames and edge maps, and to process the received data into color gradient information that may comprise a color gradient area map.
  • the color gradient computation unit (162) receives one, two, or more input frames from the video preprocessing unit (104) and edge maps (164) from the edge detection unit (150).
  • the color gradient computation unit (162) may be configured to determine a computational measure related to color values and to compute values for the computational measure based on the pixel values in the input frame.
  • the measure may be a per-pixel measure and a value of the measure may be separately computed for each pixel in the input frame.
  • the measure may be a group-of-pixels measure and a value of the measure may be collectively computed for a group of pixels in the input frame.
  • the measure may be represented by a function that maps a pixel value to a normalized value such as a number.
  • a group of contiguous pixels in the input frame may have similar values in any, some, or all of hue, chroma, and luma properties such that a normalized value for these pixels as determined by the measure may be the same value (e.g., uniform value for the measure).
  • an area mapping unit as described herein may produce one, two, or more area maps for one or more same input frames.
  • the color gradient area mapping unit (168) may produce one or more area maps: for example, one for hue values, another for chroma values, and yet another for luma values.
  • an area map as produced by the color gradient area mapping unit (168) may be based on a combination of any, some, or all of hue, chroma, and luma values.
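
A possible shape of such a combined color measure is sketched below: per-pixel hue, chroma, and luma values (assumed normalized to [0, 1)) are quantized and packed into a single bin index, so that contiguous pixels with similar color end up with a uniform measure value. The bin counts and the packing are assumptions for illustration.

```python
import numpy as np

def color_gradient_measure(hue, chroma, luma, bins=(12, 4, 4)):
    """Map per-pixel hue/chroma/luma (each assumed normalized to [0, 1))
    to a single coarse value, so contiguous pixels of similar color get a
    uniform measure value and can be segmented into one area."""
    hq = np.minimum((hue * bins[0]).astype(np.int32), bins[0] - 1)
    cq = np.minimum((chroma * bins[1]).astype(np.int32), bins[1] - 1)
    lq = np.minimum((luma * bins[2]).astype(np.int32), bins[2] - 1)
    return (hq * bins[1] + cq) * bins[2] + lq   # one combined bin index

# Two dark-blue-ish pixels and one reddish pixel.
hue    = np.array([0.660, 0.665, 0.02])
chroma = np.array([0.80, 0.82, 0.50])
luma   = np.array([0.20, 0.21, 0.60])
print(color_gradient_measure(hue, chroma, luma))   # first two values match
```
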
  • the values determined for the chosen measure related to color values may be used by the color gradient computation unit (162) to segment a picture as represented by the input frame into a color gradient area map as previously mentioned.
  • the color gradient area map refers to an area map in which contiguous pixels with uniform values of a computational measure related to colors (as discussed above) form one or more non-overlapped areas.
  • the edges as determined by the edge detection unit (150) may be additionally and/or optionally marked in the color gradient area map.
  • a computational measure related to colors may or may not distinguish two objects with like color values.
  • the edge information as produced by the edge detection unit (150) may be incorporated into the color gradient area map to distinguish these objects.
  • the color gradient computation unit (162) may be configured to provide the color gradient area map to other units or parts in the image processing system (100).
  • the color gradient computation unit (162) may be configured to provide the color gradient area map as described herein (e.g., 166) as an input to the area node mapping unit (110).
  • the area node mapping unit (110) may use nodes and links as described herein to create a color gradient area node map for the color gradient area map received from the color gradient computation unit (162) and provide the color gradient area node map (160) as an input to the muxer (114).
  • one or more area mapping units in the image processing system (100) may comprise area mapping units for other computational measures such as those relating to noise level computation, motion analysis computation, spatial error computation, etc.
  • a noise level area mapping unit may be configured to receive one, two, or more input frames, for example, from a video preprocessing unit (e.g., 104), and to produce a noise level area map representing noise level information in one or more of the input frames.
  • noise level information produced by a noise level area mapping unit as described herein may include a noise level area map.
  • the noise level computation unit may be configured to determine a computational measure related to noise level calculation and to compute values for the measure based on the pixel values in the input frames.
  • the measure may be represented by a function that maps a pixel value to a normalized value such as a number.
  • the values determined for the chosen measure related to noise levels may be used to segment one or more pictures as represented by the input frames into one or more noise level area maps.
  • a noise level area map refers to an area map in which contiguous pixels with uniform values of a computational measure related to noise levels (as discussed above) form one or more non-overlapped areas.
  • edges as determined by an edge detection unit (e.g., 150) may be additionally and/or optionally marked in the noise level area map.
  • noise level area maps as described herein may be provided to other units or parts in the image processing system (100), such as an input to an area node mapping unit (e.g., 110).
  • the area node mapping unit (110) may use nodes and links as described herein to create a noise level area node map for the noise level area map and provide the noise level area node map (160) as an input to the muxer (114).
  • a motion analysis area mapping unit may be configured to receive a group of input frames, for example, from a video preprocessing unit (e.g., 104), and to produce a motion analysis area map representing motion analysis information in the group of the input frames.
  • motion analysis information produced by a motion analysis area mapping unit as described herein may include a motion analysis area map.
  • the motion analysis computation unit may be configured to determine a computational measure related to motion analysis calculation and to compute values for the measure based on the pixel values in the group of input frames.
  • the measure may be represented by a function that maps a pixel value to a normalized value such as a number.
  • the values determined for the chosen measure related to motion analysis may be used to segment a group of pictures as represented by the group of input frames into one or more motion analysis area maps.
  • a motion analysis area map refers to an area map in which contiguous pixels with uniform values of a computational measure related to motion analysis (as discussed above) form one or more non-overlapped areas.
  • edges as determined by an edge detection unit (e.g., 150) may be additionally and/or optionally marked in the motion analysis area maps.
  • motion analysis area maps as described herein may be provided to other units or parts in the image processing system (100), such as an input to an area node mapping unit (e.g., 110).
  • the area node mapping unit (110) may use nodes and links as described herein to create a motion analysis area node map for a motion analysis area map described herein and provide the motion analysis area node map (160) as an input to the muxer (114).
  • these and other area maps may be provided to other units or parts in the image processing system (100), such as an input to an area node mapping unit (e.g., 110).
  • a frame that is encoded into the image data channels may comprise an approximate shape of an object, while a spatial error area map may be created to indicate how the approximate shape of the object conveyed in the image data may be improved if an image receiving system so chooses.
  • the area node mapping unit (110) may use nodes and links as described herein to create a spatial error area node map for a spatial error area map described herein and provide the spatial error area node map (160) as an input to the muxer (114).
  • FIG. 2A illustrates an example horizontal view of an example geometric configuration related to an example camera system (e.g., 102).
  • geometric configuration of the camera system (102) may be determined and/or specified based on a reference platform.
  • the reference platform may be physical, such as a platform on which camera elements in the camera system (102) may be mounted.
  • the reference platform may be virtual, such as an imaginary platform formed by initial calibrated positions of camera elements in the camera system (102).
  • the example camera system (102) comprises two camera elements (206-1 and 206-2), one for the left-eye view and the other for the right-eye view. Positions of the camera elements, when the camera elements take a pair of left and right source frames at substantially the same time, may be determined by the camera system (102) as part of the previously mentioned geometry information.
  • the set of positions of the camera elements for a plurality of pairs of left and right frames where each pair is captured at a substantially same time may be represented as one or more position functions of time.
  • source frames are taken at substantially the same time provided that the time difference for capturing the source frames is within a small tolerance value.
  • this small tolerance value may be set differently, and may be less than one twentieth of a second, one sixtieth of a second, or one hundred twentieth of a second, or a value more or less than the foregoing example values.
  • the small tolerance value may be set as less than the smallest time interval for which two consecutive pairs of source frames may be captured.
  • the camera system (102) may comprise four reference lines in the horizontal view. These four reference lines may comprise two pairs of parallel lines along two directions.
  • first two parallel lines (202-1 and 202-2) in the pair of parallel lines may generally point to a frontal direction (e.g., towards the left direction of FIG. 2A) of the camera system (102) that may be determined in the reference platform.
  • the distance between the first two parallel lines (202-1 and 202-2) may be measured by the camera system (102) as a distance Dl, which may be interpreted as a stereobase of the left-eye camera element and the right-eye camera element and may be analogous to an inter-pupil distance of left and right eyes of a viewer.
  • the first parallel lines (202-1 and 202-2) may intersect two principal points of a same type in the two camera elements, respectively.
  • the first parallel lines (202-1 and 202-2) may intersect entrance pupils in the two camera elements, respectively.
  • the camera elements in the camera system (102) may perform movements that may or may not be correlated.
  • Each of the camera elements may change its position in the 3D space as well as in the horizontal view of FIG. 2A in a translational motion, a rotational move, or a combination of the two.
  • second two parallel lines (204-1 and 204-2) in the pair of parallel lines may generally point to a direction perpendicular to the frontal direction of the camera system (102) that may be determined in the reference platform.
  • the distance between the second two parallel lines (204-1 and 204-2) may be measured as a distance D2.
  • the second parallel lines (204-1 and 204-2) may pass the two principal points of the same type in the two camera elements, respectively.
  • the second parallel lines (204-1 and 204-2) may pass the entrance pupils in the two camera elements, respectively.
  • view angles (α1 and α2) of the camera elements may also be determined as part of the geometry information.
  • a view angle may indicate how wide an angle may be captured by a camera element and may be related to the optical configuration such as aperture, shutter, focal lengths, an image plane size, etc., of the camera element.
  • the view angle of a source frame may be further reduced in an editing process, which, for example, may be implemented in, or supported by, the video preprocessing unit (104).
  • horizontal parallactic angles (β1 and β2) of the camera elements at the time of capturing a pair of source frames may also be determined as part of the geometry information for the pair of source frames.
  • each horizontal parallactic angle may be formed between a line associated with a center view of a camera element and each of the first parallel lines (202-1 and 202-2) in the horizontal view.
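
To make the horizontal-view quantities concrete, the sketch below derives D1, D2, and the horizontal parallactic angles from two camera-element descriptions; the `CameraElement` fields (an entrance-pupil position plus an optical-axis direction) and the example numbers are assumptions, not values from the patent.

```python
import math
from dataclasses import dataclass

@dataclass
class CameraElement:
    x: float        # position across the frontal direction (metres)
    z: float        # position along the frontal direction (metres)
    axis_x: float   # optical-axis direction components (need not be unit length)
    axis_z: float

def horizontal_geometry(left: CameraElement, right: CameraElement):
    """Recover FIG. 2A horizontal-view quantities: the stereobase D1
    (spacing of the frontal lines 202-1/202-2), the frontal offset D2
    (spacing of the perpendicular lines 204-1/204-2), and the horizontal
    parallactic angle of each optical axis against the frontal direction."""
    d1 = abs(left.x - right.x)
    d2 = abs(left.z - right.z)
    beta1 = math.degrees(math.atan2(left.axis_x, left.axis_z))
    beta2 = math.degrees(math.atan2(right.axis_x, right.axis_z))
    return d1, d2, beta1, beta2

# Stereo pair 65 mm apart, slightly toed-in, right element 2 mm forward.
left  = CameraElement(x=-0.0325, z=0.000, axis_x=+0.026, axis_z=1.0)
right = CameraElement(x=+0.0325, z=0.002, axis_x=-0.026, axis_z=1.0)
d1, d2, b1, b2 = horizontal_geometry(left, right)
print(f"D1={d1*1000:.1f} mm, D2={d2*1000:.1f} mm, "
      f"parallactic angles={b1:.2f} deg, {b2:.2f} deg")
```
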
  • FIG. 2B illustrates an example vertical view of an example geometric configuration related to an example camera system (e.g., 102).
  • the camera system (102) may comprise two parallel horizontal planes (as represented by two lines 208-1 and 208-2 in the vertical view), in which the first two parallel lines (202-1 and 202-2) may respectively lie.
  • the distance between these two parallel horizontal planes (208-1 and 208-2) may be measured as a distance D3.
  • the two parallel horizontal planes (208-1 and 208-2) may pass the two principal points of the same type in the two camera elements as previously mentioned, respectively.
  • the two parallel horizontal lines (208-1 and 208-2) may pass the entrance pupils in the two camera elements, respectively.
  • each of the camera elements may change its position in the vertical view of FIG. 2B in a translational motion, a rotational move, or a combination of the two.
  • each vertical parallactic angle may be formed between a line associated with a center view of a camera element and each of the two parallel horizontal planes (208-1 and 208-2).
  • FIG. 3A illustrates example operations for creating an area node map, according to embodiments of the present invention.
  • input frames 302 are processed by area segmentation (304), which, for example, may be an operation implemented by the video preprocessing unit (104) or by an area mapping unit in the image processing system (100).
  • An area map as described herein (306) may be produced following the area segmentation (304).
  • This area map (306) may be created for one, two, or more input frames individually or as a group.
  • a color gradient area map may possibly be created for a single input frame, while a motion analysis area map may be created on a reference frame for a group of two or more frames, with one frame in the group of the frames selected as the reference frame.
  • the area map (306) may comprise six areas Al through A6.
  • the area map (306) may be further processed into an area node map (314) by example operations, for example, node enumeration (308), node syntax verification (310), node specification (312), etc. Any, some, or all, of these operations may be implemented by an area node mapping unit (110 of FIG. 1A, FIG. 1B, and FIG. 1C) or, alternatively, by one or more other units in the image processing system (100 of FIG. 1A).
  • the node enumeration (308) may implement an enumeration algorithm that assigns a different unique number to each and every one of the areas in the area map (306).
  • the enumeration of the areas in the area map (306) may start from a specific point or a specific area, which specific area may or may not be required to include the specific point.
  • enumeration of the areas may follow a "raster scan order" from a specific point or a specific area.
  • enumeration of the areas may follow a first direction first and a second direction second. For example, the enumeration may follow the horizontal direction first and the vertical direction second.
  • the enumeration may also proceed along a radial direction starting from a specific point or a specific area of the area map (306), either clockwise or counterclockwise.
  • the specific point to start may be chosen as the upper-left corner, the upper-right corner of the area map (306), etc.
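
A minimal sketch of raster-scan enumeration, assuming the segmented area map is available as a 2-D array of (not yet numbered) region labels; the function name and the array representation are assumptions.

```python
import numpy as np

def enumerate_areas_raster(area_map):
    """Assign consecutive numbers to the areas of `area_map` in the order
    their first pixel is met by a raster scan (left-to-right, then
    top-to-bottom) starting from the upper-left corner."""
    numbering = {}
    out = np.zeros_like(area_map, dtype=np.int32)
    for y in range(area_map.shape[0]):
        for x in range(area_map.shape[1]):
            key = area_map[y, x]
            if key not in numbering:
                numbering[key] = len(numbering) + 1
            out[y, x] = numbering[key]
    return out

# Areas first encountered in the order B, A, C get numbers 1, 2, 3.
raw = np.array([["B", "B", "A"],
                ["C", "C", "A"]])
print(enumerate_areas_raster(raw))   # [[1 1 2], [3 3 2]]
```
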
  • the node enumeration (308) may implement an algorithm to represent an area in the area map (306) as a polygon in which the vertices are nodes and the line segments of the circumference of the polygon are links.
  • this algorithm assigns consecutive numbers to all the nodes starting from a default starting point that is invariant in all area maps and in the frames the area maps represent. For example, for frames of different sizes or shapes such as those supported in H.264/SVC encoding, the default starting point may be set as the upper-left corner.
  • the node syntax verification (310) may implement a path-walking algorithm to traverse the links and nodes along a particular path.
  • the links and nodes may be considered as forming a directed graph.
  • the particular path may be any path for the directed graph based on graph theory. In an embodiment, this particular path may be an optimal path that may be represented by a minimum amount of data.
  • virtual nodes may be inserted. In an embodiment, only a minimum number of such virtual nodes may be inserted to create the area node map (314).
  • the node specification (312) may produce a list of ordered nodes along with properties of the ordered nodes.
  • this list may comprise a list of ordered nodes represented as combinations of node numbers, positions (x, y) of the nodes, and link numbers for the links that interconnect the nodes in the numeric order of the node numbers.
  • the list may be used to describe the net that forms the area node map (314).
  • FIG. 3B illustrates an example node link table (316) comprising a list of ordered nodes representing an example area node map (e.g., 314), according to embodiments of the present invention.
  • the list comprises rows along an ordered walking path of the nodes.
  • the ordered walking path follows a numeric (e.g., ascending) order of node numbers assigned to the nodes.
  • each row in the node link table (316) may represent a node and comprise a node number (Node #), an X coordinate (X), and a Y coordinate (Y) for the node, and a list of linked nodes from the node in the "Link #s" column of the node link table (316).
  • a link between two nodes may be represented by a single directed edge.
  • the list of linked nodes for a node in a row of the node link table (316) may comprise zero or more originating nodes for which the node is the end point, which originating nodes may be denoted, for example, within parentheses.
  • the list of linked nodes for the node may also comprise zero or more terminating nodes for which the node specified in the "Node#" column is the start point (e.g., the originating node).
  • the terminating nodes for the node specified in the "Node#" column may be denoted, for example, without parentheses.
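One way to hold such a node link table (316) in memory is sketched below; the field and column names mirror FIG. 3B, but the data structure itself and the example coordinates in the comment are assumptions, since the embodiments do not prescribe a concrete representation.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class NodeRow:
    """One row of a node link table (316)."""
    node: int                                             # "Node #" column
    x: int                                                # X coordinate of the node
    y: int                                                # Y coordinate of the node
    links_out: List[int] = field(default_factory=list)    # terminating nodes (listed without parentheses)
    links_in: List[int] = field(default_factory=list)     # originating nodes (listed within parentheses)

def format_row(row: NodeRow) -> str:
    """Render a row roughly in the style of FIG. 3B, e.g. '2 | 40 | 0 | 4 3 10 (1)'."""
    links = [str(n) for n in row.links_out] + [f"({n})" for n in row.links_in]
    return f"{row.node} | {row.x} | {row.y} | {' '.join(links)}"
```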
  • the node link table (316) may be encoded in a data stream to other units or parts in the image processing system (100) or an image data storage device or an image receiving system. In another embodiment, the node link table (316) may be transformed to an alternative representation that may be more efficiently stored or transmitted than the node link table (316).
  • FIG. 3C illustrates an example alternative representation, as mentioned above, of an area node map (e.g., 314) according to embodiments of the present invention.
  • a row in a Δlink table (318) represents a link between an originating node specified in the "Node #" column and a terminating node specified by the "Δlink #" column.
  • the "Alink #" column for a row comprises one and only one terminating node. For example, since node "1" is an originating node that links to node "2", the first row in the Alink table (318) stores information between node "1" and node "2", as illustrated.
  • a row of the Δlink table (318) stores values in two "ΔX" and "ΔY" columns that are the differences between the x-y coordinates of the originating node in the (present) row and the x-y coordinates of the originating node in the preceding row if the two originating nodes in the present row and the preceding row are the same. Otherwise, if the originating node in the present row is different from that in the preceding row, or if the present row is the first row, then the two "ΔX" and "ΔY" columns for the present row store the x-y coordinates of the originating node in the present row.
  • a same originating node may be repeated one time or multiple times if the originating node is the originating node for one link or multiple links, respectively.
  • the Δlink table (318) comprises three rows with node "2" as the originating node. Values in the "Δlink #" column for these rows in the Δlink table (318) are 2, 1, and 8, which are the differences between the number for the originating node (e.g., 2) and the numbers for the terminating nodes (e.g., 4, 3, and 10), respectively.
  • each node needs to have at least one row in the Δlink table (318). Additionally and/or optionally, virtual nodes may be used to complement those nodes that are not originating nodes in the chosen ordered walking path, and thus might not otherwise appear in the Δlink table (318) without the virtual nodes being inserted in the Δlink table (318). In an embodiment, each virtual node may correspond to only one row in the Δlink table (318).
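The following sketch shows one possible construction of a Δlink table from an ordered node link table, under the rules described above (ΔX/ΔY become zeros while the originating node stays the same, and Δlink is the difference between the terminating and originating node numbers). The tuple layout, function name, and example coordinates are assumptions.

```python
def build_delta_link_table(rows):
    """rows: list of (node, x, y, links_out) tuples ordered along the chosen
    walking path, where links_out lists the terminating node numbers for that
    originating node. Returns (ΔX, ΔY, Δlink) triples, one per directed link."""
    table = []
    prev_node = None
    for node, x, y, links_out in rows:
        for dst in links_out:
            if node == prev_node:
                dx, dy = 0, 0       # same originating node as the preceding row
            else:
                dx, dy = x, y       # first row for this originating node (or first row overall)
            table.append((dx, dy, dst - node))   # Δlink = terminating minus originating node number
            prev_node = node
    return table

# Using the example above: node 2 at hypothetical coordinates (40, 0) linking
# to nodes 4, 3, and 10 yields Δlink values 2, 1, and 8.
print(build_delta_link_table([(2, 40, 0, [4, 3, 10])]))
# -> [(40, 0, 2), (0, 0, 1), (0, 0, 8)]
```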
  • An example of a different ordered walking path is illustrated in FIG. 3D.
  • Such a different ordered walking path is expected to produce different metadata compression from that for the ordered walking path illustrated in FIG. 3A, FIG. 3B, and FIG. 3C.
  • the three columns of information represented by "ΔX", "ΔY", and "Δlink #" may produce sparse vectors with many zero values in "ΔX" and "ΔY", and small amplitudes in "Δlink #".
  • These sparse vectors may be efficiently encoded with entropy coding or any other encoding techniques that use run-lengths and binarization.
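As a toy illustration of why these sparse vectors compress well, the run-length pass below collapses runs of zeros into (symbol, run length) pairs; a real implementation would follow this with binarization and an entropy coder such as the CAVLC or CABAC coders mentioned below, rather than this simplified scheme.

```python
def run_length_encode_zeros(values):
    """Collapse each run of zeros into a single (0, run_length) pair; non-zero
    values pass through as (value, 1)."""
    encoded, i = [], 0
    while i < len(values):
        if values[i] == 0:
            run = 1
            while i + run < len(values) and values[i + run] == 0:
                run += 1
            encoded.append((0, run))
            i += run
        else:
            encoded.append((values[i], 1))
            i += 1
    return encoded

print(run_length_encode_zeros([40, 0, 0, 0, 0, 12, 0, 0]))
# -> [(40, 1), (0, 4), (12, 1), (0, 2)]
```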
  • entropy coders (320) such as CAVLC or CABAC that may already be implemented in a video coding unit (e.g., 112) in the image processing system (100) may be used to encode these sparse vectors as part of messages of a metadata transport type.
  • the messages may be transported in one or more channels different from media data channels that transport image data for video frames.
  • examples include standard-based messages (e.g., SEI for H.264), extra channels for MP2TS, side-band frequencies for radio broadcasting, etc.
  • FIG. ID illustrates an example image receiving system (180) according to embodiments of the present invention.
  • a coded bitstream (142) generated by an example image processing system (e.g., 100) may be delivered to the image receiving system (180) directly or through an intermediate system (172).
  • the intermediate system (172) may be a networked system that operatively connects with both the image processing system (100) and the image receiving system (180).
  • the intermediate system (172) may comprise a tangible storage that may be used to first store the coded bitstream (142) from the image processing system (100) and then transmit the coded bitstream to the image receiving system (180).
  • the coded bitstream (142) may be stored as a file on a Blu-ray disc or another suitable medium in the intermediate system (172) or in the image processing system (100).
  • the coded bitstream (142) may be recovered from a tangible storage medium and provided to the image receiving system (180) as requested, as scheduled, or in any suitable manner.
  • the coded bitstream (142) has a container structure that comprises image data channels, metadata channel, audio channels, etc.
  • the image data channels may comprise image data having two video layers: a full-resolution residual layer and a prediction layer.
  • the full-resolution residual layer may store reference frames while the prediction layer may be a low-resolution layer that stores prediction information.
  • the metadata in one or more separate metadata channels in the coded bitstream may comprise data specifying one or more area maps as described herein.
  • the coded bitstream 142 may be de-multiplexed by a demuxer (182) in the image receiving system (180) into an image data stream (190), a metadata stream (192), and an audio data stream (194) that may have been carried by different logical channels as described above, respectively, in the coded bitstream (142).
  • the image data stream (190) may be decoded by a video decode unit (184) to recover video layers (174).
  • the video layers (174) may be a down-sampled version of pixel processed frames and may represent a version of frames at a lower sampling rate than that of the pixel processed frames.
  • the video layers (174) may be provided to a video post processing unit (188) in the image receiving system (180).
  • the image receiving system (180), or a unit therein may apply one or more image enhancing techniques to improve the quality of the final image output (e.g., 198).
  • the video post processing unit (188) may implement one or more image enhancing algorithms for sharpening and for deinterlacing.
  • the image enhancing algorithms may only use information that is already stored in the video layers (174). In this embodiment, no extra information, other than the video layers (174), needs to be provided for these image enhancing algorithms to work; and hence under this approach the image enhancing algorithms do not require an additional bit rate to carry, per unit time, extra information outside the video layers (174).
  • any, some, or all, of the image enhancing algorithms may alternatively and/or optionally use both the video layers (174) and the metadata contained in the metadata stream (192) to improve the quality of the final image output (198).
  • Such an image enhancing algorithm that uses both input frames and metadata may be a deblocking filter used in the H.264 video coding standard.
  • Such an image enhancing algorithm may also use both the video layers (174) and area maps that were constructed by an image processing system (e.g., 100).
  • the metadata stream (192) may embed data specifying one or more area maps that were constructed by the image processing system (100). Both the embedded data specifying the one or more area maps and the video layers (174) may be provided as a combined input (196) to an area map reconstruction unit (186) in the image receiving system (180).
  • the data embedded in and extracted from the metadata stream may comprise one or more Δlink tables (e.g., 318 of FIG. 3C) for the one or more area maps and measured information related to one or more computational measures.
  • the area map reconstruction unit (186) in the image receiving system (180) may perform a number of reconstruction and decoding operations, which may be symmetrical to, or resemble an inverse of, the area map construction and encoding operations implemented in the image processing system (100). From these reconstruction and decoding operations, node link tables (e.g., 316 of FIG. 3B) may be reconstructed from the one or more Δlink tables (318) encoded in the metadata. Based on the node link tables (316), one or more area node maps may be reconstructed. One or more area maps may be reconstructed based on the one or more area node maps decoded from the metadata (192).
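A rough sketch of the inverse step is given below: recovering originating-node coordinates and their terminating nodes from the (ΔX, ΔY, Δlink) triples of a Δlink table. The assumption that originating nodes appear in ascending numeric order is illustrative only; an actual decoder would follow whatever enumeration and walking-path convention the encoder used.

```python
def reconstruct_node_links(delta_table):
    """delta_table: list of (ΔX, ΔY, Δlink) triples. Returns a mapping from an
    originating-node number to its coordinates and terminating nodes."""
    nodes = {}
    current = 0
    for dx, dy, dlink in delta_table:
        if (dx, dy) != (0, 0) or not nodes:
            current += 1                          # a new originating node begins here
            nodes[current] = {"pos": (dx, dy), "links": []}
        nodes[current]["links"].append(current + dlink)   # terminating = originating + Δlink
    return nodes

# Feeding in the triples from the encoder sketch, [(40, 0, 2), (0, 0, 1), (0, 0, 8)],
# recovers one originating node at (40, 0) with three outgoing links; the absolute
# node numbers depend on the enumeration convention shared with the encoder.
```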
  • the measured information related to the one or more computational measures may be contained in and extracted from the coded bitstream and thereafter incorporated into the one or more area maps.
  • information in the video layers (174) may be additionally and/or optionally extracted and used to reconstruct the one or more area maps.
  • the area map reconstruction unit (186) provides the one or more reconstructed area maps (176) with the measured information as an input to the video post processing unit (188).
  • the video post processing unit (188) may implement one or more algorithms that use both the video layers (174) and the one or more area maps (176) to generate the final image output (198) for rendering.
  • the final image output (198) comprises frames of an enhanced version relative to the frames that can be recovered solely from the video layers (174) with or without image enhancing techniques. This enhanced version of frames in the final image output (198) may provide high-quality image perception.
  • when a video encode unit (e.g., 112 of FIG. 1A) encodes pixel processed frames (e.g., 128 of FIG. 1A) into a data stream (e.g., 140 of FIG. 1A), some information in the pixel processed frames may be lost.
  • a sampling rate associated with the pixel processed frames or source frames may be, but is not limited to, 4:4:2 using a 10-bit color value representation
  • a sampling rate associated with the encoded image data (such as video layers) in the data stream may be, but is not limited to, 4:2:2 using an 8-bit color value representation.
  • the lower sampling rate may be dictated by a mass media player.
  • a Blu-ray disc or a similar video delivery mechanism may not support the higher sampling rate of the source frames, but may support the lower sampling rate of the encoded image data implemented by an image processing system (100).
  • an area map may be received and used by the image receiving system (180), e.g., a high-end 3D display system that supports HD, 2K, 4K, or even 8K encoded pictures, to predict information lost in the down-sampling operation of the encoding process.
  • the area map may carry measured information (e.g., a value of the computational measure for each area) relating to the information lost in down-sampling.
  • the area map may require a much smaller data volume to store and/or to transmit than a video layer comprising all per-pixel values of the lost information would require to support restoring a down-sampled version to the original pixel processed frames or source frames.
  • the video layer comprising per-pixel values may contain highly redundant information, resulting in very high data volume that is difficult to store and to transmit.
  • the area map, even though transmitted in a significantly small data volume, may be effectively used by a video post processing unit (188) to generate a high quality version of frames in the final image output (198).
  • This high quality version of frames may not be the same as the source frames or the pixel processed frames in the image processing system (100), but may support excellent image perception for a viewer.
  • area-based samples (or data points) of measured information may be added on top of the area map by the area map reconstruction unit in FIG. ID.
  • the information on this area map (176) may be interpolated using an interpolator of a same type used in encoding by the image processing system (100) to obtain an up-sampled prediction layer.
  • the up-sampled prediction layer may be added to a residual layer from the coded bitstream (142) and (post) processed by the video post processing unit (188) based on the same or different image enhancing algorithms that have already been implemented therein to generate output frames.
  • the final image output (198) may be frames of an up-sampled version that are generated based on the video layers in the received coded stream (142) and the measured information in the area map.
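The following sketch approximates that post-processing path: sparse area-based samples of measured information are interpolated to full resolution and added to the residual layer. Nearest-sample interpolation merely stands in for the (unspecified) interpolator that matches the one used in encoding; the function name and data layout are assumptions.

```python
def upsample_and_add(area_samples, residual, width, height):
    """area_samples: list of ((x, y), value) data points taken from the area map;
    residual: height x width list of lists from the residual layer.
    Returns a full-resolution frame = interpolated prediction + residual."""
    def nearest(x, y):
        # nearest-sample interpolation; a deployed system would reuse the same
        # interpolator type as the encoder
        return min(area_samples,
                   key=lambda s: (s[0][0] - x) ** 2 + (s[0][1] - y) ** 2)[1]

    prediction = [[nearest(x, y) for x in range(width)] for y in range(height)]
    return [[prediction[y][x] + residual[y][x] for x in range(width)]
            for y in range(height)]
```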
  • when a video encode unit (e.g., 112 of FIG. 1A) compresses frames (e.g., 128 of FIG. 1A) into a data stream (e.g., 140 of FIG. 1A), some information in the frames may be lost due to data compression. For example, blockiness artifacts may be seen if frames are made of highly compressed image data.
  • an area map may be used to outline and predict accurately the edges of objects or areas in a picture as represented by frames.
  • For example, the edges of an area may be a smooth, faithful representation of the outline of a uniform-looking portion of the sky.
  • Samples of measured information in the area map may be interpolated with some, or all video layers or channels to create frames of an up-sampled version with accurate surface structures of objects in the frames.
  • These surface structures may be accentuated by pixel-accurate contours captured in the area map, forming an improved surface of luma or chroma that does not have blockiness artifacts as created by a typical block-based compression and/or representation of images.
  • Other image enhancing algorithms may be used to further improve the image quality of areas around the pixel-accurate contours.
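A hypothetical area-guided smoothing filter along these lines is sketched below: each pixel is averaged only with neighbours that belong to the same area, so the filter suppresses blockiness inside areas without crossing the pixel-accurate contours. This is an illustrative stand-in, not the H.264 deblocking filter or any specific algorithm from the embodiments above.

```python
def smooth_within_areas(frame, area_map):
    """frame: height x width list of luma (or chroma) values; area_map: matching
    list of area labels. Each pixel is replaced by the mean of itself and its
    4-neighbours that share its area label, so smoothing never crosses the
    contours carried by the area map."""
    h, w = len(frame), len(frame[0])
    out = [row[:] for row in frame]
    for y in range(h):
        for x in range(w):
            same_area = [frame[ny][nx]
                         for ny, nx in ((y, x), (y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                         if 0 <= ny < h and 0 <= nx < w and area_map[ny][nx] == area_map[y][x]]
            out[y][x] = sum(same_area) / len(same_area)   # (y, x) itself is always included
    return out
```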
  • an image receiving system configured for a higher sampling rate (e.g., 4:4:2) instead of the low sampling rate (e.g., 4:2:2) of received image data may up-sample the image data incorrectly and thus waste the capability of supporting the higher sampling rate by the image receiving system (e.g., 180) or by an accompanying image rendering system.
  • the image data may comprise compressed color values (e.g., 8-bit values) in certain regions of color space.
  • an area map may carry color shift information in a small data volume.
  • color values in certain areas (in the color space) that are susceptible to incorrect color shifting may be correctly shifted by prediction based on measured information in the color shift area map that was produced in the video encoding process in the image processing system (100).
  • source frames may be encoded with a wide input color gamut, while the image data may be encoded with a smaller color gamut by a video encode unit (e.g., 112 of FIG. 1).
  • an area map may carry color space conversion information such that when frames are up-sampled by an image receiving system (e.g., 180), color values in certain areas that were mapped to the smaller color gamut in video encoding may be converted back to a wider color gamut supported by the image receiving system (180) using measured information on the area map that recorded the original color space conversion.
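For illustration, the sketch below applies a per-area color correction recorded in such an area map; the additive (dr, dg, db) correction model and all names are assumptions, since the embodiments only state that measured color information per area guides the prediction.

```python
def apply_area_color_shifts(frame, area_map, shifts):
    """frame: height x width list of (r, g, b) tuples; area_map: matching list of
    area labels; shifts: mapping from an area label to a (dr, dg, db) correction
    recorded in the color shift area map."""
    out = []
    for frame_row, label_row in zip(frame, area_map):
        out_row = []
        for (r, g, b), label in zip(frame_row, label_row):
            dr, dg, db = shifts.get(label, (0, 0, 0))   # unlisted areas are left unchanged
            out_row.append((r + dr, g + dg, b + db))
        out.append(out_row)
    return out
```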
  • image enhancement techniques as described herein may be used to improve image perceptions in other aspects.
  • area maps that comprise disparity information may be used to generate output frames (e.g., frames with stereo encoding in the ASBS mode) that provide an accurate representation in a viewing environment, even if the viewing environment has a different geometry than a camera system originating the video data.
  • area maps that comprise noise level information may be used to inject noise levels, or correct spatial errors of the frames in a residual layer, etc.
  • FIG. 4A illustrates an example process flow according to a possible embodiment of the present invention.
  • one or more computing devices or components in an image processing system (100) may perform this process flow.
  • the image processing system (100) determines a computational measure for one or more input frames.
  • the image processing system (100) calculates a value of the computational measure for each picture unit in an input frame in the one or more input frames.
  • the image processing system (100) generates an area map comprising one or more areas.
  • each of the one or more areas comprises picture units with a uniform value for the measure.
  • the image processing system (100) outputs the area map and an encoded version of the input frame to a recipient device.
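The overall shape of this process flow can be summarized as below, with each concrete step injected as a callable; all parameter names are hypothetical, and the earlier segmentation sketch could serve as segment_fn.

```python
def encode_with_area_map(input_frame, measure_fn, segment_fn, encode_fn, send_fn):
    """measure_fn computes the value of the computational measure for one picture
    unit; segment_fn groups units of uniform value into an area map; encode_fn
    produces the encoded version of the input frame; send_fn outputs the area map
    and the encoded frame to the recipient device."""
    values = [[measure_fn(unit) for unit in row] for row in input_frame]
    area_map = segment_fn(values)
    send_fn(area_map, encode_fn(input_frame))
    return area_map
```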
  • FIG. 4B illustrates another example process flow according to a possible embodiment of the present invention.
  • one or more computing devices or components in an image receiving system (180) may perform this process flow.
  • the image receiving system (180) decodes an encoded version of an input frame and an area map relating to one or more input frames including the input image.
  • the area map comprises one or more areas each having picture units that are of a uniform value for a computational measure.
  • the image receiving system (180) applies the area map to the input frame to generate a new version of the input frame.
  • the image receiving system (180) outputs the new version of the input frame to a rendering device.
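A mirror sketch of this receiving-side flow is given below, again with hypothetical callables standing in for the decoding, area map application, and rendering steps.

```python
def decode_and_enhance(coded_frame, coded_area_map,
                       decode_frame_fn, decode_area_map_fn,
                       apply_area_map_fn, render_fn):
    """decode_frame_fn and decode_area_map_fn recover the input frame and the
    area map from their encoded versions; apply_area_map_fn uses the area map to
    generate a new version of the frame; render_fn outputs it to the rendering
    device."""
    frame = decode_frame_fn(coded_frame)
    area_map = decode_area_map_fn(coded_area_map)
    new_frame = apply_area_map_fn(frame, area_map)
    render_fn(new_frame)
    return new_frame
```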
  • the special-purpose computing devices may be hard-wired to perform the techniques, or may include digital electronic devices such as one or more application-specific integrated circuits (ASICs) or field programmable gate arrays (FPGAs) that are persistently programmed to perform the techniques, or may include one or more general purpose hardware processors programmed to perform the techniques pursuant to program instructions in firmware, memory, other storage, or a combination.
  • Such special-purpose computing devices may also combine custom hard-wired logic, ASICs, or FPGAs with custom programming to accomplish the techniques.
  • the special-purpose computing devices may be desktop computer systems, portable computer systems, handheld devices, networking devices or any other device that incorporates hard-wired and/or program logic to implement the techniques.
  • FIG. 5 is a block diagram that illustrates a computer system 500 upon which an embodiment of the invention may be implemented.
  • Computer system 500 includes a bus 502 or other communication mechanism for communicating information, and a hardware processor 504 coupled with bus 502 for processing information.
  • Hardware processor 504 may be, for example, a general purpose microprocessor.
  • Computer system 500 also includes a main memory 506, such as a random access memory (RAM) or other dynamic storage device, coupled to bus 502 for storing information and instructions to be executed by processor 504.
  • Main memory 506 also may be used for storing temporary variables or other intermediate information during execution of instructions to be executed by processor 504.
  • Such instructions when stored in storage media accessible to processor 504, render computer system 500 into a special-purpose machine that is customized to perform the operations specified in the instructions.
  • Computer system 500 further includes a read only memory (ROM) 508 or other static storage device coupled to bus 502 for storing static information and instructions for processor 504.
  • a storage device 510 such as a magnetic disk or optical disk, is provided and coupled to bus 502 for storing information and instructions.
  • Computer system 500 may be coupled via bus 502 to a display 512, such as a cathode ray tube (CRT), for displaying information to a computer user.
  • An input device 514 is coupled to bus 502 for communicating information and command selections to processor 504.
  • Another type of user input device is cursor control 516, such as a mouse, a trackball, or cursor direction keys for communicating direction information and command selections to processor 504 and for controlling cursor movement on display 512.
  • This input device typically has two degrees of freedom in two axes, a first axis (e.g., x) and a second axis (e.g., y), that allows the device to specify positions in a plane.
  • Computer system 500 may implement the techniques described herein using customized hard-wired logic, one or more ASICs or FPGAs, firmware and/or program logic which in combination with the computer system causes or programs computer system 500 to be a special-purpose machine. According to one embodiment, the techniques herein are performed by computer system 500 in response to processor 504 executing one or more sequences of one or more instructions contained in main memory 506. Such instructions may be read into main memory 506 from another storage medium, such as storage device 510. Execution of the sequences of instructions contained in main memory 506 causes processor 504 to perform the process steps described herein. In alternative embodiments, hard- wired circuitry may be used in place of or in combination with software instructions.
  • Non-volatile media includes, for example, optical or magnetic disks, such as storage device 510.
  • Volatile media includes dynamic memory, such as main memory 506.
  • Common forms of storage media include, for example, a floppy disk, a flexible disk, hard disk, solid state drive, magnetic tape, or any other magnetic data storage medium, a CD-ROM, any other optical data storage medium, any physical medium with patterns of holes, a RAM, a PROM, an EPROM, a FLASH-EPROM, NVRAM, or any other memory chip or cartridge.
  • Storage media is distinct from but may be used in conjunction with transmission media.
  • Transmission media participates in transferring information between storage media.
  • transmission media includes coaxial cables, copper wire and fiber optics, including the wires that comprise bus 502.
  • transmission media can also take the form of acoustic or light waves, such as those generated during radio-wave and infra-red data communications.
  • Various forms of media may be involved in carrying one or more sequences of one or more instructions to processor 504 for execution.
  • the instructions may initially be carried on a magnetic disk or solid state drive of a remote computer.
  • the remote computer can load the instructions into its dynamic memory and send the instructions over a telephone line using a modem.
  • a modem local to computer system 500 can receive the data on the telephone line and use an infra-red transmitter to convert the data to an infra-red signal.
  • An infra-red detector can receive the data carried in the infra-red signal and appropriate circuitry can place the data on bus 502.
  • Bus 502 carries the data to main memory 506, from which processor 504 retrieves and executes the instructions.
  • the instructions received by main memory 506 may optionally be stored on storage device 510 either before or after execution by processor 504.
  • Computer system 500 also includes a communication interface 518 coupled to bus 502.
  • Communication interface 518 provides a two-way data communication coupling to a network link 520 that is connected to a local network 522.
  • communication interface 518 may be an integrated services digital network (ISDN) card, cable modem, satellite modem, or a modem to provide a data communication connection to a corresponding type of telephone line.
  • communication interface 518 may be a local area network (LAN) card to provide a data communication connection to a compatible LAN.
  • Wireless links may also be implemented.
  • communication interface 518 sends and receives electrical, electromagnetic or optical signals that carry digital data streams representing various types of information.
  • Network link 520 typically provides data communication through one or more networks to other data devices.
  • network link 520 may provide a connection through local network 522 to a host computer 524 or to data equipment operated by an Internet Service Provider (ISP) 526.
  • ISP 526 in turn provides data communication services through the world wide packet data communication network now commonly referred to as the "Internet" 528.
  • Internet 528 uses electrical, electromagnetic or optical signals that carry digital data streams.
  • the signals through the various networks and the signals on network link 520 and through communication interface 518, which carry the digital data to and from computer system 500, are example forms of transmission media.
  • Computer system 500 can send messages and receive data, including program code, through the network(s), network link 520 and communication interface 518.
  • a server 530 might transmit a requested code for an application program through Internet 528, ISP 526, local network 522 and communication interface 518.
  • the received code may be executed by processor 504 as it is received, and/or stored in storage device 510, or other non- volatile storage for later execution.
  • source frames taken from a reality are used to illustrate some aspects of the present invention. It should be noted that other types of source frames may also be used in embodiments of the present invention. For example, source frames may be composite frames from two or more different image sources.
  • a part, or a whole, of a source frame may be sourced from a 2D image, while another part on the same source frame may be sourced from a 3D or multi-view image.
  • Techniques as described herein may be provided for these other types of source frames in embodiments of the present invention.
  • the invention may suitably comprise, consist of, or consist essentially of, any element (the various parts or features of the invention) and their equivalents as described herein, whether currently existing and/or as subsequently developed. Further, the present invention illustratively disclosed herein may be practiced in the absence of any element, whether or not specifically disclosed herein. Obviously, numerous modifications and variations of the present invention are possible in light of the above teachings. It is therefore to be understood that within the scope of the appended claims, the invention may be practiced otherwise than as specifically described herein.
  • the invention may be embodied in any of the forms described herein, including, but not limited to, the following Enumerated Example Embodiments (EEEs), which describe structure, features, and functionality of some portions of the present invention.
  • EEE3 The method of EEE 1, wherein the one or more input frames comprise one or more views of a scene as captured by one or more camera elements in a camera system.
  • EEE4 The method of EEE 3, further comprising determining geometry information for the one or more camera elements in the camera system.
  • EEE5 The method of EEE 1, wherein the input frame is a frame in a pair of stereoscopic frames that comprise a left-eye frame and a right-eye frame.
  • EEE6 The method of EEE 1, wherein the computational measure relates to disparity between two input frames.
  • EEE7 The method of EEE 1, wherein the computational measure relates to noise levels to be injected into the input frame to produce the encoded version of the input frame.
  • EEE8 The method of EEE 1, wherein the computational measure relates to color information in the input frame.
  • EEE9 The method of EEE 1, wherein each picture unit is a pixel.
  • EEE10 The method of EEE 1, wherein each picture unit is a block of pixels.
  • EEE11 The method of EEE 1, further comprising representing the area map with a representation based on a plurality of vertices and a plurality of links between the vertices.
  • EEE12 The method of EEE 1, wherein the computational measure is related to disparity between two input frames and wherein the area map comprises at least an area in which corresponding picture units in the two input frames are of zero disparity.
  • EEE13 The method of EEE 1, further comprising:
  • EEE17 An apparatus comprising a processor and configured to perform any one of the methods recited in EEEs 1-16.
  • EEE18 A computer readable storage medium, comprising software instructions, which when executed by one or more processors cause performance of any one of the methods recited in EEEs 1-16.
  • EEE19 A method comprising:
  • EEE20 The method of EEE 19, further comprising retrieving the encoded version of the input frame from a coded bitstream.
  • EEE21 The method of EEE 19, further comprising retrieving the encoded version of the input frame from a physical storage medium.
  • EEE22 The method of EEE 19, further comprising determining geometry information for one or more camera elements in a camera system that captures a source frame from which the input frame was derived.
  • EEE23 The method of EEE 19, wherein the input frame is a frame in a pair of stereoscopic frames that comprise a left-eye frame and a right-eye frame.
  • EEE24 The method of EEE 19, wherein the computational measure relates to disparity between the input frame and one or more other input frames.
  • EEE25 The method of EEE 19, wherein the computational measure relates to noise levels to be injected into an output frame for rendering.
  • EEE26 The method of EEE 19, wherein the computational measure relates to color information in a source frame from which the input frame was derived.
  • EEE27 The method of EEE 19, wherein each picture unit is a pixel.
  • EEE28 The method of EEE 19, wherein each picture unit is a block of pixels.
  • EEE29 The method of EEE 19, wherein the area map is represented based on a plurality of vertices and a plurality of links between the vertices.
  • EEE30 The method of EEE 19, wherein the input frame is one of two stereoscopic input frames, and wherein the area map comprises at least an area in which corresponding picture units in the two stereoscopic input frames are of zero disparity.
  • EEE31 The method of EEE 19, further comprising:
  • EEE32 An apparatus comprising a processor and configured to perform any one of the methods recited in EEEs 19-29.
  • EEE33 A computer readable storage medium, comprising software instructions, which when executed by one or more processors cause performance of any one of the methods recited in EEEs 19-29.

Abstract

Techniques for image processing systems are provided. A computational measure may be determined for one or more input frames. A value of the computational measure may be calculated for each picture unit in an input frame in the one or more input frames. An area map comprising one or more areas may be generated. Each of the one or more areas may comprise picture units with a uniform value for the measure. The area map and an encoded version of the input frame may be outputted to a recipient device.

Description

IMAGE ENHANCEMENT SYSTEM USING AREA INFORMATION
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This application claims priority to United States Provisional Patent Application No. 61/299,793 filed 29 January 2010, which is hereby incorporated by reference in its entirety.
TECHNOLOGY
[0002] The present invention relates generally to imaging systems, and in particular, to imaging systems that process 2-dimensional (2D), 3-dimensional (3D), and multi-view images.
BACKGROUND
[0003] In general, human eyes perceive 3D images based on the slight difference of the right eye view and the left eye view. The illusion of depth can be created by providing an image as taken by a left camera in a stereo camera system to the left eye and a slightly different image as taken by a right camera in the stereo camera system to the right eye.
[0004] However, these different-eye images may still convey a flawed view of a 3D reality to a user, as the geometry of the camera system may not match that of an image- viewing environment. Due to the difference in geometries, an object as perceived in the viewing environment may be too large or too small relative to that in the view of reality. For example, one spatial dimension of a solid object may be scaled quite differently from a different spatial dimension of the same object. A solid ball in reality may possibly be perceived as a much-elongated ellipse in the viewing environment. Further, for images that portray fast actions and movements, human brains are typically ineffective in compensating or
autocorrecting distortions in perception.
[0005] Problems related to image perception are not limited to stereo and/or multi-view images. A high-quality geometrically correct camera system may be used to produce high quality 2D, 3D, or multi-view original images. However, these original images cannot be used for public distribution as is, and are typically required to be edited, down-sampled and compressed to produce a much more compact version of the images. For example, a release version stored on a Blu-ray disc may use only 8-bit color values to approximate the original 12-bit color values in the original images. On the other hand, human eyes may be quite sensitive in certain regions of the color space, and can tell different colors even if those colors have very similar color values. As a result, colors such as dark blue in certain parts of an original image may be shifted to human-detectable different colors, such as blue with a purple hue, in the release version.
[0006] The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section. Similarly, issues identified with respect to one or more approaches should not assume to have been recognized in any prior art on the basis of this section, unless otherwise indicated.
BRIEF DESCRIPTION OF DRAWINGS
[0007] The present invention is illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings and in which like reference numerals refer to similar elements and in which:
[0008] FIG. 1A, FIG. IB, FIG. 1C, and FIG. ID illustrate example systems and components that support image processing, according to possible embodiments of the present invention;
[0009] FIG. 2A and FIG. 2B illustrate example geometry of an example camera system, according to possible embodiments of the present invention;
[00010] FIG. 3A, FIG. 3B, FIG. 3C, FIG. 3D, and FIG. 3E illustrate example operations and representations relating to area maps, according to possible embodiments of the present invention;
[0010] FIG. 4A and FIG. 4B illustrate an example process flow, according to a possible embodiment of the present invention; and
[0011] FIG. 5 illustrates an example hardware platform on which a computer or a computing device as described herein may be implemented, according to a possible embodiment of the present invention.
DESCRIPTION OF EXAMPLE POSSIBLE EMBODIMENTS
[0012] Example possible embodiments, which relate to image processing systems, are described herein. In the following description, for the purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the present invention. It will be apparent, however, that the present invention may be practiced without these specific details. In other instances, well-known structures and devices are not described in exhaustive detail, in order to avoid unnecessarily including, obscuring, or obfuscating the present invention.
[0013] Example embodiments are described herein according to the following outline (outline section headings are for reference purposes only and shall not in any way control the scope of the present invention):
1. GENERAL OVERVIEW
2. IMAGE PROCESSING SYSTEM
3. AREA MAPPING UNITS
4. DISPARITY COMPUTATION
5. COLOR GRADIENT COMPUTATION
6. OTHER MEASURE-BASED COMPUTATIONS
7. GEOMETRIC INFORMATION OF CAMERA SYSTEM
8. OPERATIONS FOR CREATING AN AREA NODE MAP
9. POST PROCESSING OPERATIONS
10. UP-SAMPLING
11. DEBLOCKING
12. IMAGE ENHANCEMENTS
13. PROCESS FLOW
14. IMPLEMENTATION MECHANISMS - HARDWARE OVERVIEW
15. EQUIVALENTS, EXTENSIONS, ALTERNATIVES AND MISCELLANEOUS
1. GENERAL OVERVIEW
[0014] This overview presents a basic description of some aspects of a possible embodiment of the present invention. It should be noted that this overview is not an extensive or exhaustive summary of aspects of the possible embodiment. Moreover, it should be noted that this overview is not intended to be understood as identifying any particularly significant aspects or elements of the possible embodiment, nor as delineating any scope of the possible embodiment in particular, nor the invention in general. This overview merely presents some concepts that relate to the example possible embodiment in a condensed and simplified format, and should be understood as merely a conceptual prelude to a more detailed description of example possible embodiments that follows below.
[0015] In an embodiment, a computational measure may be determined for one or more input frames. The one or more input frames may be 2D, 3D, or multi-view image frames comprising one or more views of a reality as perceived by one or more camera elements in a camera system.
[0016] In an embodiment, geometry information for the one or more camera elements in the camera system may also be determined.
[0017] A value of the computational measure may be calculated for each picture unit in an input frame in the one or more input frames. The input frame, for example, may be a frame in a pair of stereoscopic frames that comprise a left-eye frame and a right-eye frame. In an example, each picture unit may be a pixel. In another example, each picture unit may be a block of pixels. In an embodiment, a picture unit may comprise any number of pixels and any arbitrary shape.
[0018] In an embodiment, the computational measure may relate to disparity between two input frames. In another embodiment, the computational measure may relate to noise levels to be injected into the input frame to produce the encoded version of the input frame. In a further embodiment, the computational measure may relate to color information in the input frame.
[0019] An area map comprising one or more areas may be generated. Each of the one or more areas comprising picture units with a uniform value for the measure. The area map and an encoded version of the input frame may be outputted to a recipient device.
[0020] In an embodiment, the version of the encoded version of the input frame provided to the recipient device may be of a sampling rate lower than that of the input frame.
[0021] In an embodiment, the area map may be represented based on a plurality of vertices and a plurality of links between the vertices.
[0022] In embodiments in which the computational measure is related to disparity between two input frames, the area node map may comprise at least an area in which corresponding picture units in the two input frames are of zero disparity.
[0023] In an embodiment, additional computational measures may be used to generate additional area maps.
[0024] In an embodiment, an encoded version of an input frame and an area map relating to one or more input frames including the input image may be decoded. Here, the area map may comprise one or more areas each having picture units that are of a uniform value for a computational measure. The area map may be applied to the input frame to generate a new version of the input frame. The new version of the input frame may be sent to a rendering device.
[0025] In an embodiment, geometry information for one or more camera elements that originate one or more input frames may be determined. The geometry information with the one or more input frames in a data stream may be encoded. The data stream may be outputted to a recipient device.
[0026] In an embodiment, one or more input frames and geometry information of one or more camera elements that originate the one or more input frames may be received. The one or more input frames may be modified to generate a new version of the one or more input frames based on the geometry information of the one or more camera elements. The new version of the one or more input frames may be outputted to a rendering device.
[0027] In some embodiments, mechanisms as described herein form a part of an image processing system, including but not limited to a server, studio system, art director system, image editor, animation system, movie studio system, broadcast system, media recording device, media playing device, television, laptop computer, netbook computer, cellular radiotelephone, electronic book reader, point of sale terminal, desktop computer, computer workstation, computer kiosk, and various other kinds of terminals and display units.
[0028] Various modifications to the preferred embodiments and the generic principles and features described herein will be readily apparent to those skilled in the art. Thus, the disclosure is not intended to be limited to the embodiments shown, but is to be accorded the widest scope consistent with the principles and features described herein.
2. IMAGE PROCESSING SYSTEM
[0029] Images may be described herein with reference to one or more example media, including still images, video frames, slide shows, etc. The selection of an example medium in this description may be made for simplicity and concise unity and, unless expressly stated to the contrary, should not be construed as limiting an embodiment to a particular medium as embodiments of the present invention are well suited to function with any media content.
[0030] FIG. 1A shows an example of an image processing system (100) in accordance with one or more possible embodiments. In an embodiment, the image processing system (100) generally represents a single device or multiple devices that are configured to encode images and metadata into a coded bitstream. In an embodiment, as illustrated in FIG. 1A, the image processing system (100) may comprise a camera system unit (102), a video preprocessing unit (104), a sampling format unit (106), an area mapping unit (158) comprising a disparity computation unit (108) and an area node mapping unit (110), a video encode unit (112), and a muxer (114).
[0031] In an embodiment, the camera system unit (102) corresponds to any device configured to acquire images in terms of field raw source frames. The camera system unit (102) may include a single view camera, a stereo camera pair, or a multi-view camera, or any other suitable systems or subsystems of image acquisition devices. As used herein, "field raw source frames" may refer to a version of images captured from a reality in image planes (or film planes) of image acquisition devices present in the reality; the field raw source frames (or simply source frames) may be, but not limited to, a high-quality version of original images that portray the reality. However, in various embodiments, source frames may generally refer to a version of initial frames that are to be edited, down-sampled, and/or compressed, along with possible metadata, into a coded bitstream that may be distributed to image receiving systems; thus, the field raw source frames may include artificially created or synthesized image frames. In the present example, the source frames may be captured from a camera system with a high sampling rate that is typically used by a professional, an art studio, a broadcast company, a high-end media production entity, etc. In various embodiments, the camera system unit (102) may comprise one, two, or more image acquisition devices each of which may be a single camera element configured to capture a specific view of a reality. Example of image acquisition devices may include, but are not limited to, a left camera element and a right camera element as illustrated, where the camera system unit (102) is a stereo camera system. Examples of source frames may include, but not limited to, left source frames (116) and right source frames (118). In some embodiments, instead of, or in addition to, image acquisition devices, other types of devices may be used to obtain source frames. For example, source frames may be computer generated. Source frames may also be obtained from existing image sources such as old movies and documentaries. In some embodiments, disparity and/or depth information and/or parallactic information may be computer-generated to convert 2D images to 3D source frames or to multi-view source frames.
[0032] In an embodiment, the camera system unit (102) may be configured to acquire and communicate geometry information related to optical configurations of the image acquisition devices to other parts of the image processing system (100). In an embodiment, the camera system unit (102) is configured to provide geometry information (120) as described herein as an input to the video preprocessing unit (104). Examples of geometry information (120) may include, but are not limited to, information related to positions and/or offsets of principal points and parallaxes in optical configurations of the image acquisition devices as functions of time.
[0033] In some embodiments, preprocessing may be optional. Some or all of the functions performed by the video preprocessing unit (104) as described herein may be bypassed. In an embodiment, the video preprocessing unit (104) generally represents any hardware and/or software configured to produce (1) pixel processed frames, and/or (2) statistical analysis data, related to the contents of the source frames. As used herein, the "pixel processed frames" refer to a version of images that are generated based on source frames (e.g., 116 and 118) and geometry information (e.g., 120) received as an input; typically, the pixel processed frames represent a high-quality version of images that portray the reality, enhanced by the geometry information and the statistical analysis data. Here, the "statistical analysis data" refers to data generated based on statistical analysis on the contents of the source frames (e.g., 116 and 118). The statistical analysis data may be provided to, and used by, an image receiving system to make better predictions or faster decoding relating to image data encoded in a coded bitstream than otherwise.
[0034] In an example, the video preprocessing unit (104) may receive the source frames (116 and 118) and geometry information (120) as inputs from the camera system unit (102). In another example, the video preprocessing unit (104) may only receive source frames (116 and 118) from the camera system unit (102). For example, the geometry information (120) may bypass the video preprocessing unit (104) and be provided directly to one or more other parts or units, for example, the disparity computation unit (108), in the image processing system (100).
[0035] In an embodiment, the video preprocessing unit (104) may comprise video filters that are configured to perform sharpening, de-noising, motion correlation and/or compensation, other entropy reducing operations on the source frames (116 and 118), and/or one or more other operations that facilitate video encoding in the image processing system (100). Additionally and/or optionally, the video preprocessing unit (104) may be configured to perform frame rate conversion (FRC) and/or de-interlacing.
[0036] In an embodiment, the video preprocessing unit (104) may be configured to perform motion analysis, segmentation, object scan, and pixel activities on source frames, to semantically describe the content of source frames using one or more non-proprietary or proprietary measures, and to improve the final quality of the coded bitstream. Some or all of the information relating to these measures may be provided as input to the video encode unit (112) to drive specific parts of the video encode unit (112), such as a quantization block, a mode decision block, and a rate control module.
[0037] In an embodiment, the video preprocessing unit (104) may be configured to provide any, some, or all of, the pixel processed frames, the geometry information, and the statistical analysis data to other parts or units in the image processing system (100). In an embodiment, the pixel processed frames additionally carry the geometry information and the statistical analysis data, and the video preprocessing unit (104) may be configured to provide the pixel processed frames to other parts or units in the image processing system (100).
[0038] In an embodiment where the video preprocessing unit (104) receives the source frames as stereo left source frames and stereo right source frames, the video preprocessing unit (104) provides left pixel processed frames (122) and right pixel processed frames (124) as inputs to the sampling format unit (106). In an embodiment, these left pixel processed frames (122) and right pixel processed frames (124) may, but are not required to, include the geometry information (120) and the statistical analysis data.
[0039] In an embodiment, the sampling format unit (106) generally represents any hardware and/or software configured to receive pixel processed frames (e.g., 122 and 124) and generate a pre-encoding version of frames (128) by sub-sampling the pixel processed frames (e.g., 122 and 124) using one or more sub-sampling methods. These sub-sampling methods may or may not be specific to processing 2D, 3D, or multi-view image frames, and may, but are not required to, relate to side-by-side, top/bottom, quincunx, etc. As compared with the source frames, the pre-encoding version of frames (128) may comprise frames as a relatively low quality version of images. For example, a sampling rate for the source frames may be 4:4:2 (e.g., 10 bits for each color value), while a sampling rate for the pre-encoding version of images may be 4:2:2 (e.g., 8 bits for each color value). In some embodiments, the pre-encoding version of frames (128) may or may not be further compressed. In some embodiments in which the pre-encoding version of images are compressed video frames, either a lossless compression method or a lossy compression method may be used. Each frame in the pre-encoding version of frames (128) may comprise image data derived from one, two, or more source frames. For example, a side-by-side frame in the pre-encoding version of frames (128) may comprise image data derived from both a left source frame and a right source frame.
[0040] The pre-encoding version of frames (128) produced by the sampling format unit (106) may be encoded and stored in one or more much smaller files or containers and transmitted at a much lower transmission rate than that required by the source images. In an embodiment, the sampling format unit (106) may provide the pre-encoding version of frames (128) as an input to the video encode unit (112).
[0041] In an embodiment where the video preprocessing unit (104) receives stereo left source frames and stereo right source frames, the video preprocessing unit (104) provides left pixel processed frames (122), right pixel processed frames (124), and geometry information (126) related to optical configuration of the camera system (102) as inputs to one or more area mapping units in the image processing system (100). An example of area mapping unit may be, but not limited to, a disparity computation unit (e.g., 108). Other examples of area mapping units may include a noise level computation unit, a color gradient computation unit, a motion analysis unit, etc.
[0042] As illustrated in FIG. 1A, an example implementation of area mapping unit (158) may comprise a disparity computation unit (e.g., 108) and an area node mapping unit (e.g., 110). In an embodiment, the area mapping unit (158), or the disparity computation unit (108) therein, may receive input frames, which may be the pixel processed frames previously discussed, from the video preprocessing unit (104). Additionally and/or optionally, the area mapping unit, or the disparity computation unit (108) therein, may receive the geometry information relating to the optical configuration of the camera systems when original source frames corresponding to the input frames were taken. In an embodiment, the area mapping unit (158) produces an area map (136) that may be provided as an input to other parts of the image processing system (100) such as the muxer (114). Additionally and/or optionally, the area mapping unit (158) may produce additional information other than the area map (136) and provide the additional information to other parts of the image processing system (100). For example, the area mapping unit (158) may produce pixel displacement information (134), and provide that information as an input to the muxer (114) along with the area map (136).
Additionally and/or optionally, any, some, or all, of the area map (136) and/or the pixel displacement information (134) may be provided by the area mapping unit (158) to other units in the image processing system (100) such as the video encode unit (112).
[0043] In an embodiment, the video encode unit (112) generally represents any hardware and/or software configured to receive image data and other data related to the image data and to encode the received data into a file or a data stream. The file or data stream may comprise a container structure with multiple channels. One or more image data channels in the container structure may carry a compressed and/or quantized version of the image data. The image data may be an encoded version of a pre-encoded version of images (e.g., 128). One or more metadata channels may carry description and statistical information relating to the image data. For example, the video encode unit (112) may be configured to receive the pre-encoded version of frames (128) from the sampling format unit (106), and to encode the pre-encoded version of frames (128) in a data stream (140) as an input to the muxer (114). Additionally and/or optionally, the video encode unit (112) may be configured to receive pixel displacement information (134) from the area mapping unit (158), and to encode the pixel displacement information (134) in one or more metadata channels in the data stream (140) as an input to the muxer (114).
[0044] In an embodiment, the muxer (114) generally represents any hardware and/or software configured to receive data streams of (1) audio data (e.g., 138) from one or more audio processing units (not shown) in the image processing system, (2) image data as encoded by a video encode unit, and (3) content-based information as generated by one or more area mapping units or as provided by a video encode unit, to multiplex the received data streams in a coded bitstream, and to store the coded bitstream in a tangible storage medium or to transmit the coded bitstream to an image receiving system. The coded bitstream may be logically structured as a container with one or more audio data channels for storing the audio data, one or more image data channels for storing the image data, and one or more metadata channels for storing metadata including description, statistical information, and area maps as described herein. In an example, the metadata may include, but is not limited to, the pixel displacement information (134) and the area map (136), which may be provided by an area mapping unit (e.g., 158).
3. AREA MAPPING UNITS
[0045] In an embodiment, an area mapping unit (e.g., 158) generally represents any hardware and/or software configured to receive one, two, or more input frames and/or geometry information of the camera elements that captured the original frames corresponding to the input frames, and to produce an area map representing regions of uniform values for a computational measure. As used herein, an "area map" refers to a map specifying
non-overlapped arbitrarily shaped regions/areas in a view plane relating to the input frames; the arbitrarily shaped regions/areas refer to regions/areas whose shapes in the view plane are not preset but rather are determined by values computed for the computational measure based on the specific content in the input frames. As such, an area map as described herein is a content-based area map, and is not formed by a content-neutral dissection of the view plane with regular or irregular shapes such as squares, circles, triangles, etc. In an example, edges that form boundaries for the arbitrarily shaped regions/areas of a content-based area map are pixel-accurate; that is, the edges may be specified at a resolution of single pixels. In another example, edges that form boundaries for the arbitrarily shaped regions/areas of a content-based area map are block-accurate; that is, the edges may be specified at a resolution of single blocks or sub-blocks, wherein a block or sub-block may comprise two or more pixels.
[0046] As used herein, the term "computational measure" refers to a measure of interest that is capable of presenting a region- or area-based behavior. In an example, values for a computational measure may be computed algebraically based on pixel values in one, two, or more input frames. Alternatively and/or optionally, values for a computational measure may be determined logically based on pixel values in one, two, or more input frames. A function, which may or may not involve a range-based mapping and which may or may not be analytical, may be used to compute/determine values for the computational measure. A computational measure may be related to one or more attributes or properties of 2D images, 3D images, or multi-view images. Examples of computational measures may include, but are not limited to, disparity, noise level, color gradient, displacements, motion vectors, spatial errors, etc.
[0047] As used herein, a "region/area" in an area map refers to a group of contiguous pixels whose values computed for a specific measure are uniform. In an example, a group of contiguous pixels in one, two, or more frames may represent a dark blue object and may have color values within a narrow range that represent dark blue. Under a specific measure related to colors, color values of the contiguous pixels may be mapped to a same (e.g., uniform) value, e.g., dark blue. Accordingly, this group of contiguous pixels (e.g., the dark blue object) may form a region/area in an area map generated by an area mapping unit comprising a color gradient computation unit.
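A minimal Python sketch of how such regions/areas might be formed, assuming the computational measure has already been computed per pixel and quantized values stand in for "uniform" values (the quantization step and 4-connectivity are assumptions of this sketch, not requirements of the described system), is given below.

    from collections import deque

    # Sketch: group contiguous pixels whose quantized measure values are equal
    # (i.e., uniform) into labeled regions/areas by a simple flood fill.
    def area_map_from_measure(measure, quantize=lambda v: int(v)):
        h, w = len(measure), len(measure[0])
        labels = [[None] * w for _ in range(h)]
        next_label = 0
        for y in range(h):
            for x in range(w):
                if labels[y][x] is not None:
                    continue
                value = quantize(measure[y][x])
                labels[y][x] = next_label
                queue = deque([(y, x)])
                while queue:
                    cy, cx = queue.popleft()
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                        if (0 <= ny < h and 0 <= nx < w and labels[ny][nx] is None
                                and quantize(measure[ny][nx]) == value):
                            labels[ny][nx] = next_label
                            queue.append((ny, nx))
                next_label += 1
        return labels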
[0048] In another example, a group of contiguous pixels in one, two, or more frames may represent a moving train and may have motion vectors that are of a same vector value as generated by simple panning. Under a specific measure related to motion analysis, values of motion vectors for the contiguous pixels may all be mapped to a same (e.g., uniform) value, for example, corresponding to the motion vector generated from the simple panning. Accordingly, this group of contiguous pixels (e.g., the train) may form a region/area in an area map generated by an area mapping unit comprising a motion analysis unit.
[0049] In another example, one, two, or more frames in a movie may contextually represent a scene happening in the distant past and may need to assume a classic look like that of an old movie, which may be different from other scenes in the movie. Pixel values of the one or more frames may be analyzed to determine a set of discrete regions/areas. Under a specific measure related to how to inject classic-look noise levels, one region/area of the frames may be determined to have one uniform value of the specific measure and may be injected with one noise level, while another region/area of the frames may be determined to have a different uniform value of the specific measure and may be injected with a different noise level. An area mapping unit that comprises a noise level computation unit may generate an area map that identifies regions/areas to be injected with various noise levels (e.g., an image receiving system may be configured to map different values of the specific measure to different noise level values).
4. DISPARITY COMPUTATION
[0050] FIG. 1B illustrates an example area mapping unit (158) that comprises a disparity computation unit (108) and an area node mapping unit (110). In an embodiment, the disparity computation unit (108) generally represents any hardware and/or software configured to receive two or more input frames representing different perspectives/views of a reality and geometry information of the camera system (e.g., 102) that takes original source frames which give rise to the two or more input frames, to produce per-pixel displacements between any pair of two different input frames, and to produce a disparity area map representing the per-pixel displacements between the pair of two input frames. In an embodiment, disparity information produced by a disparity computation unit (108) as described herein may include both per-pixel displacements and the disparity area map constructed based on the per-pixel displacements.
[0051] In an embodiment, the input frames are, or are derived from, source frames that are taken at the same time by a camera system (e.g., 102). An input frame may carry a frame number indicating when the input frame was taken relative to other input frames. In some embodiments, two or more input frames may carry a same frame number indicating that corresponding source frames are created from the reality at substantially the same time.
Alternatively and/or optionally, two or more input frames may carry timestamps that root in a same clock, in two or more clocks that are synchronized, or in two or more clocks whose timing offsets are known; and based on the timestamps, corresponding source frames for these input frames may be determined as taken at the same time. Other ways of identifying two or more frames that are taken by the camera system (102) at a substantially same time may also be used.
[0052] In an embodiment, the one or more input frames received by the disparity computation unit (108) are stereo frames. For example, the disparity computation unit (108) may receive the left pixel processed frames (122) and right pixel processed frames (124) from the video preprocessing unit (104). Thus, "disparity" as described herein may refer to per-pixel displacements (which may be zero or non-zero values) between a left pixel processed frame (or simply "left frame") and a right pixel processed frame (or simply "right frame"). The per-pixel displacements may be caused by differences in geometrical positions of a left camera element and a right camera element that took the original source frames, which give rise to the left and right frames.
[0053] In an embodiment, the disparity computation unit (108) comprises an edge detection unit (150) and a geometry compensated correlation unit (152). In an embodiment, the edge detection unit (150) generally represents any hardware and/or software configured to process pixel values from a frame and to determine edges related to positions of objects in the frame. In an example, the edge detection unit (150) receives the left frame (122), determines edges related to positions of objects in the left frame (122), and describes the edges in a left edge map. In another example, the edge detection unit (150) receives the right frame (124), determines edges related to positions of objects in the right frame (124), and describes the edges in a right edge map. In an embodiment, pixel values (e.g., luma/chroma values) in the left and right frames may be used together to determine the edges. In an embodiment, pixel values in the left and right frames, as well as input frames that are of various perspectives, and/or input frames that precede or follow the input frame, may be used to determine the edges. In an embodiment, the edge detection unit (150) may be configured to provide edge maps (154) such as the left and right edge maps as an input to the geometry compensated correlation unit (152).
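A minimal sketch of one possible edge detection step, assuming a Sobel gradient on per-pixel luma values with an arbitrary threshold (both assumptions of this sketch rather than requirements of the edge detection unit (150)), is shown below.

    # Sketch: Sobel gradient magnitude on luma values, thresholded into a binary edge map.
    def detect_edges(luma, threshold=64):
        h, w = len(luma), len(luma[0])
        edges = [[0] * w for _ in range(h)]
        for y in range(1, h - 1):
            for x in range(1, w - 1):
                gx = (luma[y - 1][x + 1] + 2 * luma[y][x + 1] + luma[y + 1][x + 1]
                      - luma[y - 1][x - 1] - 2 * luma[y][x - 1] - luma[y + 1][x - 1])
                gy = (luma[y + 1][x - 1] + 2 * luma[y + 1][x] + luma[y + 1][x + 1]
                      - luma[y - 1][x - 1] - 2 * luma[y - 1][x] - luma[y - 1][x + 1])
                if (gx * gx + gy * gy) ** 0.5 > threshold:
                    edges[y][x] = 1
        return edges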
[0054] In an embodiment, the geometry compensated correlation unit (152) generally represents any hardware and/or software configured to receive input frames, geometry information, and edge maps, and to process the received data into disparity information that may comprise per-pixel displacements and a disparity area map. In an example, the geometry compensated correlation unit (152) receives a left frame (e.g., one of the left pixel processed frames (122)), a right frame (e.g., one of the right pixel processed frames (124)), and geometry information (126) from the video preprocessing unit (104) and one or more edge maps (154) for the left and right frames from the edge detection unit (150).
[0055] In an embodiment, the geometry compensated correlation unit (152) may be configured to determine and/or select a view plane and to use the geometry information to project both left and right frames, as well as the edge maps (154), onto the same view plane. The view plane as described herein may or may not correspond to a view plane in which a left frame and a right frame are shown to a viewer in an image viewing environment. In an embodiment, the view plane as described herein may be physical and may correspond to a view plane on which one of the left and right frames may be viewed. An imaginary line segment from the center of the view plane to one of the left and right camera elements may be perpendicular to the view plane. In another embodiment, this view plane may be virtual, and may correspond to a point of view differing from either of the camera elements that took corresponding source frames. An imaginary line segment from the center of the view plane to either of the left and right camera elements may be tilted (e.g., not perpendicular) relative to the view plane. Additionally and/or optionally, the geometry compensated correlation unit (152) may be configured to use geometry information related to a viewing environment in setting up the view plane and in projecting the left and right frames to the view plane. The geometry information related to the viewing environment may be stored locally in, or provided to, the image processing system (100), or may be accessible from a user or another device.
[0056] Without loss of generality, pixels in one of the left frame and the right frame may be chosen as reference pixels for computing displacements. As used herein, a pixel may refer to a display unit in a display panel and may correspond to a position in an image frame. A pair of corresponding pixels in two frames may refer to a pair of pixels, respectively in the two frames, that describe or represent a same physical point in a reality. A pixel value may refer to a unit data value that is to be loaded in the pixel of the display unit when the image frame is rendered.
[0057] For example, three-dimensional (e.g., x-y-z in a Cartesian coordinate system) positions represented by pixel positions of either the left frame or the right frame projected at the view plane may be determined based on the geometric information of the camera elements and of the view plane and may be chosen as initial reference positions in computing displacements of the corresponding pixels in the other frame. Pixel values of the left and right frames, the edge maps after projection to the view plane, the geometry information of the camera elements, and the geometry information of the view plane may be further correlated by the geometry compensated correlation unit (152) to obtain a three-dimensional (e.g., x-y-z) displacement per pixel between pixels in the projected left frame and corresponding pixels in the projected right frame. All per-pixel displacements for a pair of a projected left frame and a corresponding projected right frame may be collectively denoted as <d>.
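A simplified sketch of how per-pixel displacements could be obtained, assuming both frames have already been projected onto the common view plane and restricting the search to 2-D displacements in that plane (a simplification of the x-y-z displacement described above; the block size and search range are arbitrary assumptions of the sketch), is given below.

    # Sketch: sum-of-absolute-differences block matching around each reference
    # pixel of the projected left frame to find its displacement in the
    # projected right frame; zero displacement corresponds to coplanar pixels.
    def per_pixel_displacements(left, right, block=3, search=4):
        h, w = len(left), len(left[0])
        r = block // 2

        def sad(ly0, lx0, ry0, rx0):
            total = 0
            for dy in range(-r, r + 1):
                for dx in range(-r, r + 1):
                    ly, lx, ry, rx = ly0 + dy, lx0 + dx, ry0 + dy, rx0 + dx
                    if 0 <= ly < h and 0 <= lx < w and 0 <= ry < h and 0 <= rx < w:
                        total += abs(left[ly][lx] - right[ry][rx])
            return total

        disp = [[(0, 0)] * w for _ in range(h)]
        for y in range(h):
            for x in range(w):
                best_cost, best = None, (0, 0)
                for dy in range(-search, search + 1):
                    for dx in range(-search, search + 1):
                        if not (0 <= y + dy < h and 0 <= x + dx < w):
                            continue
                        cost = sad(y, x, y + dy, x + dx)
                        if best_cost is None or cost < best_cost:
                            best_cost, best = cost, (dx, dy)
                disp[y][x] = best
        return disp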
[0058] In an embodiment, a pixel in one frame and a corresponding pixel in the other frame are determined as coplanar if the positions of the two corresponding pixels are the same. Coplanar pixels may be marked as "zero disparity" in <d> and on a disparity area map generated by the disparity computation unit (108).
[0059] In an embodiment, the disparity computation unit (108) may be configured to determine a computational measure related to displacement calculation and to compute values for the measure based on the displacements <d>. In an example, the measure may be a per-pixel measure and a value of the measure may be separately computed for each pair of corresponding pixels in the two frames. In another example, the measure may be a group-of-pixels measure and a value of the measure may be collectively computed for a group of pairs of corresponding pixels in the two frames. In an embodiment, the measure may be represented by a function that maps a vector value such as a displacement to a normalized value such as a number. For example, a group of contiguous pixels in one frame and their corresponding pixels in the other frame may have displacements with similar directions and similar magnitudes such that normalized values computed for these pixels in the measure may be the same value (e.g., uniform value for the measure).
[0060] In an embodiment, the values determined for the chosen measure related to displacements may be used by the disparity computation unit (108) to segment a portion of the view plane as represented by the left and right frames into the previously mentioned disparity area map. Here, the disparity area map refers to an area map in which contiguous pixels with uniform values of a computational measure related to displacements (as discussed above) form one or more non-overlapped areas. In an embodiment, coplanar pixels with zero displacements may be additionally and/or optionally marked in the disparity area map. For example, even if a computational measure related to displacements is coarse-grained, such that displacements less than two unit displacements apart are mapped into a same area of contiguous pixels on the disparity area map, coplanar pixels among the pixels included in that area may additionally and/or optionally be marked in the disparity area map.
[0061] In some embodiments, the zero-disparity information may be used by an image processing system (e.g., 100) to mark certain portion(s) of an image display as 2D (e.g., pixels with zero disparity). The zero disparity information may then be used by an image receiving system to correctly process 2D information (e.g., textual statistics for a baseball player) in a 3D image (which, for example, portrays the baseball action in 3D). In an embodiment in which quincunx codecs are used, a block of pixels as encoded in the coded bitstream may comprise a diagonal of pixels from a left frame and an opposite diagonal of pixels from a right frame. If this block of mixed pixels falls in an area marked as zero disparity, the block of mixed pixels may correctly represent pixels in both the left and right frames. Thus, additional processing may be avoided by an image receiving system that receives the zero disparity information, as the image receiving system does not need to redundantly generate two separate blocks for the left-eye view and for the right-eye view, which would be the same. Thus, the image receiving system may process the image data more efficiently than otherwise.
[0062] In an embodiment, the disparity computation unit (108) may be configured to provide the displacements <d> and the disparity area map to other units or parts in the image processing system (100). For example, the disparity computation unit (108) may be configured to provide the disparity information as described herein (e.g., 130) as an input to the area node mapping unit (110).
[0063] In an embodiment, an area node mapping unit (110) as described herein generally represents any hardware and/or software configured to implement one or more techniques of formatting an area map in a data representation that is appropriate for coding into a data stream that can be readily decoded by an image receiving system. In an embodiment, the data representation adopted by the area node mapping unit (110) may be an area node map that employs vertices (a.k.a. nodes) and links that interconnect the nodes for representing segmented areas on the area map. For example, the area node mapping unit (110) may use nodes and links to create a disparity area node map for the disparity area map received from the disparity computation unit (108) and provide the disparity area node map (136) as an input to the muxer (114).
[0064] In embodiments illustrated in FIG. 1B, the area node mapping unit (110) may be configured to provide the displacements <d> to the video encode unit (112) and/or the muxer (114).
5. COLOR GRADIENT COMPUTATION
[0065] FIG. 1C illustrates an example area mapping unit (168) for color gradient information. The color gradient area mapping unit (168) may comprise an edge detection unit (which may be, but is not limited to, the same edge detection unit (150) as illustrated in FIG. 1B), a color gradient computation unit (162), and an area node mapping unit (which may be, but is not limited to, a separate instance of the area node mapping unit (110) of FIG. 1A and FIG. 1B). In an embodiment, the color gradient area mapping unit (168) generally represents any hardware and/or software configured to receive one, two, or more input frames, for example, from a video preprocessing unit (e.g., 104), and to produce a color gradient area map representing color gradient information in one or more of the input frames. In an embodiment, color gradient information produced by a color gradient area mapping unit (168) as described herein may include a color gradient area map.
[0066] In an embodiment, the edge detection unit (150) generally represents any hardware and/or software configured to process pixels from a frame and to determine edges related to positions of objects in the frame. In an example, the edge detection unit (150) receives an input frame which may be any of the left pixel processed frames (122) or the right pixel processed frames (124), determines edges related to positions of objects in the input frame, and describes the edges in an edge map. In an embodiment, pixel values (e.g., hue/luma/chroma values) in the input frame may be used by edge detection algorithms to determine the edges. In an embodiment, pixel values in the input frame, as well as other pixel values in other correlated input frames that are of various perspectives, and/or in other input frames that precede or follow the input frame, may be used to determine the edges. In an embodiment, the edge detection unit (150) may be configured to provide an edge map (164) (which may or may not be the same as one of the left and right edge maps (154) in FIG. 1B) as an input to the color gradient computation unit (162).
[0067] In an embodiment, the color gradient computation unit (162) generally represents any hardware and/or software configured to receive input frames and edge maps, and to process the received data into color gradient information that may comprise a color gradient area map. In an example, the color gradient computation unit (162) receives one, two, or more input frames from the video preprocessing unit (104) and edge maps (164) from the edge detection unit (150).
[0068] In an embodiment, the color gradient computation unit (162) may be configured to determine a computational measure related to color values and to compute values for the computational measure based on the pixel values in the input frame. In an example, the measure may be a per-pixel measure and a value of the measure may be separately computed for each pixel in the input frame. In another example, the measure may be a group-of-pixels measure and a value of the measure may be collectively computed for a group of pixels in the input frame. In an embodiment, the measure may be represented by a function that maps a pixel value to a normalized value such as a number. For example, a group of contiguous pixels in the input frame may have similar values in any, some, or all of hue, chroma, and luma properties such that a normalized value for these pixels as determined by the measure may be the same value (e.g., uniform value for the measure).
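A minimal sketch of such a function, assuming hue/chroma/luma values are quantized into coarse buckets so that similar colors map to the same normalized number (the bucket sizes are assumptions of the sketch), might look as follows.

    # Sketch: map a pixel's (hue, chroma, luma) triple to a single normalized
    # integer; contiguous pixels with similar color properties receive the same
    # uniform value for the computational measure.
    def color_measure(hue, chroma, luma, hue_step=15, chroma_step=16, luma_step=16):
        return (int(hue // hue_step) * 10000
                + int(chroma // chroma_step) * 100
                + int(luma // luma_step))

    # Example: two nearby sky pixels with slightly different values map to the
    # same measure value and therefore fall into the same area.
    assert color_measure(200, 80, 180) == color_measure(205, 85, 190)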
[0069] It should be noted that an area mapping unit as described herein may produce one, two, or more area maps for one or more same input frames. For example, the color gradient area mapping unit (168) may produce one or more area maps, one for hue values, another for chroma values, and yet another for luma values. Similarly, an area map as produced by the color gradient area mapping unit (168) may be based on a combination of any, some, or all of hue, chroma, and luma values.
[0070] In an embodiment, the values determined for the chosen measure related to color values may be used by the color gradient computation unit (162) to segment a picture as represented by the input frame into a color gradient area map as previously mentioned. Here, the color gradient area map refers to an area map in which contiguous pixels with uniform values of a computational measure related to colors (as discussed above) form one or more non-overlapped areas. In an embodiment, the edges as determined by the edge detection unit (150) may be additionally and/or optionally marked in the color gradient area map. For example, a computational measure related to colors may or may not distinguish two objects with like color values. The edge information as produced by the edge detection unit (150) may be incorporated into the color gradient area map to distinguish these objects. In some embodiments where frames are stereo images or multi-view images, zero-disparity pixels may or may not be marked on a color gradient area map.
[0071] In an embodiment, the color gradient computation unit (162) may be configured to provide the color gradient area map to other units or parts in the image processing system (100). For example, the color gradient computation unit (162) may be configured to provide the color gradient area map as described herein (e.g., 166) as an input to the area node mapping unit (110).
[0072] In an embodiment, the area node mapping unit (110) may use nodes and links as described herein to create a color gradient area node map for the color gradient area map received from the color gradient computation unit (162) and provide the color gradient area node map (160) as an input to the muxer (114).
6. OTHER MEASURE-BASED COMPUTATIONS
[0073] Embodiments of the present invention are not limited to any specific computational measure. In some embodiments, one or more area mapping units in the image processing system (100) may comprise area mapping units for other computational measures such as those relating to noise level computation, motion analysis computation, spatial error computation, etc.
[0074] In an embodiment, a noise level area mapping unit may be configured to receive one, two, or more input frames, for example, from a video preprocessing unit (e.g., 104), and to produce a noise level area map representing noise level information in one or more of the input frames. In an embodiment, noise level information produced by a noise level area mapping unit as described herein may include a noise level area map. For example, the noise level computation unit may be configured to determine a computational measure related to noise level calculation and to compute values for the measure based on the pixel values in the input frames. In an embodiment, the measure may be represented by a function that maps a pixel value to a normalized value such as a number. In an embodiment, the values determined for the chosen measure related to noise levels may be used to segment one or more pictures as represented by the input frames into one or more noise level area maps. Here, a noise level area map refers to an area map in which contiguous pixels with uniform values of a computational measure related to noise levels (as discussed above) form one or more non-overlapped areas. In an embodiment, edges as determined by an edge detection unit (e.g., 150) may be additionally and/or optionally marked in the noise level area map. In an embodiment, noise level area maps as described herein may be provided to other units or parts in the image processing system (100), such as an input to an area node mapping unit (e.g., 110). In an embodiment, the area node mapping unit (110) may use nodes and links as described herein to create a noise level area node map for the noise level area map and provide the noise level area node map (160) as an input to the muxer (114).
[0075] Similarly, in an embodiment, a motion analysis area mapping unit may be configured to receive a group of input frames, for example, from a video preprocessing unit (e.g., 104), and to produce a motion analysis area map representing motion analysis information in the group of the input frames. In an embodiment, motion analysis information produced by a motion analysis area mapping unit as described herein may include a motion analysis area map. For example, the motion analysis computation unit may be configured to determine a computational measure related to motion analysis calculation and to compute values for the measure based on the pixel values in the group of input frames. In an embodiment, the measure may be represented by a function that maps a pixel value to a normalized value such as a number. In an embodiment, the values determined for the chosen measure related to motion analysis may be used to segment a group of pictures as represented by the group of input frames into one or more motion analysis area maps. Here, a motion analysis area map refers to an area map in which contiguous pixels with uniform values of a computational measure related to motion analysis (as discussed above) form one or more non-overlapped areas. In an embodiment, edges as determined by an edge detection unit (e.g., 150) may be additionally and/or optionally marked in the motion analysis area maps. In an embodiment, motion analysis area maps as described herein may be provided to other units or parts in the image processing system (100), such as an input to an area node mapping unit (e.g., 110). In an embodiment, the area node mapping unit (110) may use nodes and links as described herein to create a motion analysis area node map for a motion analysis area map described herein and provide the motion analysis area node map (160) as an input to the muxer (114).
[0076] In embodiments of the present invention, these and other area maps may be provided to other units or parts in the image processing system (100), such as an input to an area node mapping unit (e.g., 110). For example, a frame that is encoded into the image data channels may comprise an approximate shape of an object, while a spatial error area map may be created to indicate how the approximate shape of the object conveyed in the image data may be improved if an image receiving system so chooses. In an embodiment, the area node mapping unit (110) may use nodes and links as described herein to create a spatial error area node map for a spatial error area map described herein and provide the spatial error area node map (160) as an input to the muxer (114).
7. GEOMETRIC INFORMATION OF CAMERA SYSTEM
[0077] FIG. 2A illustrates an example horizontal view of example geometric configuration related to an example camera system (e.g., 102). In an embodiment, geometric configuration of the camera system (102) may be determined and/or specified based on a reference platform. In an embodiment, the reference platform may be physical, such as a platform on which camera elements in the camera system (102) may be mounted. In another embodiment, the reference platform may be virtual, such as an imaginary platform formed by initial calibrated positions of camera elements in the camera system (102).
[0078] In an embodiment, the example camera system (102) comprises two camera elements (206-1 and 206-2), one for the left-eye view and the other for the right-eye view. Positions of the camera elements, when the camera elements take a pair of left and right source frames at substantially the same time, may be determined by the camera system (102) as part of the previously mentioned geometry information. The set of positions of the camera elements for a plurality of pairs of left and right frames where each pair is captured at a substantially same time may be represented as one or more position functions of time. As used herein, source frames are taken at substantially the same time provided that the time difference for capturing the source frames is within a small tolerance value. As used herein, in different embodiments, this small tolerance value may be set differently, and may be less than one twentieth of a second, one sixtieth of a second, or one hundred twentieth of a second, or a value more or less than the foregoing example values. For example, for a camera system that is capable of capturing a picture at a certain rate, the small tolerance value may be set as less than the smallest time interval for which two consecutive pairs of source frames may be captured.
[0079] In an embodiment, the camera system (102) may comprise four reference lines in the horizontal view. These four reference lines may comprise a pair of two parallel lines along two directions. In an embodiment, first two parallel lines (202-1 and 202-2) in the pair of parallel lines may generally point to a frontal direction (e.g., towards the left direction of FIG. 2A) of the camera system (102) that may be determined in the reference platform. The distance between the first two parallel lines (202-1 and 202-2) may be measured by the camera system (102) as a distance D1, which may be interpreted as a stereobase of the left-eye camera element and the right-eye camera element and may be analogous to an inter-pupil distance of left and right eyes of a viewer. In an embodiment, the first parallel lines (202-1 and 202-2) may intersect two principal points of a same type in the two camera elements, respectively. For example, the first parallel lines (202-1 and 202-2) may intersect entrance pupils in the two camera elements, respectively.
[0080] In an embodiment, while taking different pairs of source frames, the camera elements in the camera system (102) may perform movements that may or may not be correlated. Each of the camera elements may change its position in the 3D space as well as in the horizontal view of FIG. 2A in a translational motion, a rotational move, or a combination of the two.
[0081] In an embodiment, second two parallel lines (204-1 and 204-2) in the pair of parallel lines may generally point to a direction perpendicular to the frontal direction of the camera system (102) that may be determined in the reference platform. The distance between the second two parallel lines (204-1 and 204-2) may be measured as a distance D2. In an embodiment, the second parallel lines (204-1 and 204-2) may pass the two principal points of the same type in the two camera elements, respectively. For example, the second parallel lines (204-1 and 204-2) may pass the entrance pupils in the two camera elements, respectively.
[0082] In an embodiment, view angles (α1 and α2) of the camera elements may also be determined as part of the geometry information. In an embodiment, a view angle may indicate how wide an angle may be captured by a camera element and may be related to the optical configuration such as aperture, shutter, focal lengths, an image plane size, etc., of the camera element. The view angle of a source frame may be further reduced in an editing process, which, for example, may be implemented in, or supported by, the video preprocessing unit (104).
[0083] In an embodiment, horizontal parallactic angles (θ1 and θ2) of the camera elements at the time of capturing a pair of source frames may also be determined as part of the geometry information for the pair of source frames. In an embodiment, as illustrated in FIG. 2A, each horizontal parallactic angle may be formed between a line associated with a center view of a camera element and each of the second parallel lines (204-1 and 204-2) in the horizontal view.
[0084] FIG. 2B illustrates an example vertical view of an example geometric configuration related to an example camera system (e.g., 102). In an embodiment, the camera system (102) may comprise two parallel horizontal planes (as represented by two lines 208-1 and 208-2 in the vertical view), in which the first two parallel lines (202-1 and 202-2) may respectively lie. The distance between these two parallel horizontal planes (208-1 and 208-2) may be measured as a distance D3. In an embodiment, the two parallel horizontal planes (208-1 and 208-2) may pass the two principal points of the same type in the two camera elements as previously mentioned, respectively. For example, the two parallel horizontal lines (208-1 and 208-2) may pass the entrance pupils in the two camera elements, respectively.
[0085] In an embodiment, as the camera elements may perform movements that may or may not be correlated, each of the camera elements may change its position in the vertical view of FIG. 2B in a translational motion, a rotational move, or a combination of the two.
[0086] In an embodiment, vertical parallactic angles (γ1 and γ2) of the camera elements may be determined as part of the geometry information. In an embodiment, each vertical parallactic angle may be formed between a line associated with a center view of a camera element and each of the two parallel horizontal planes (208-1 and 208-2).
8. OPERATIONS FOR CREATING AN AREA NODE MAP
[0087] FIG. 3A illustrates example operations for creating an area node map, according to embodiments of the present invention. In an embodiment, input frames 302 are processed by area segmentation (304), which, for example, may be an operation implemented by the video preprocessing unit (104) or by an area mapping unit in the image processing system (100). An area map as described herein (306) may be produced following the area segmentation (304). This area map (306) may be created for one, two, or more input frames individually or as a group. For example, a color gradient area map may possibly be created for a single input frame, while a motion analysis area map may be created on a reference frame for a group of two or more frames with one frame in the group selected as the reference frame. In the illustrated example of FIG. 3A, the area map (306) may comprise six areas A1 through A6.
[0088] The area map (306) may be further processed into an area node map (314) by example operations, for example, node enumeration (308), node syntax verification (310), node specification (312), etc. Any, some, or all, of these operations may be implemented by an area node mapping unit (110 of FIG. 1A, FIG. 1B, and FIG. 1C) or, alternatively, by one or more other units in the image processing system (100 of FIG. 1A).
[0089] In an embodiment, the node enumeration (308) may implement an enumeration algorithm that assigns a different unique number to each and every one of the areas in the area map (306). In an embodiment, the enumeration of the areas in the area map (306) may start from a specific point or a specific area, which specific area may or may not be required to include the specific point. In an example, enumeration of the areas may follow a "raster scan order" from a specific point or a specific area. In an example, enumeration of the areas may follow a first direction first and a second direction second. For example, the enumeration may follow the horizontal direction first and the vertical direction second. The enumeration may also be performed along a radial direction starting from a specific point or a specific area of the area map (306), either clockwise or counterclockwise. The specific point to start may be chosen as the upper-left corner, the upper-right corner of the area map (306), etc.
[0090] In an embodiment, the node enumeration (308) may implement an algorithm to represent an area in the area map (306) as a polygon in which the vertices are nodes and the line segments of the circumference of the polygon are links. In a possible embodiment, this algorithm assigns consecutive numbers to all the nodes starting from a default starting point that is invariant in all area maps and in frames the area maps represent. For example, for frames of different sizes or shapes such as those supported in H.264/SVC encoding, the default starting point may be set as the upper-left corner.
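A minimal sketch of such an enumeration, assuming each area is already available as a polygon (a list of (x, y) vertices) and the default starting point is the upper-left corner as noted above, is shown below; consecutive numbering in raster-scan order is one of several orders the unit could use.

    # Sketch: assign consecutive node numbers to the unique polygon vertices of
    # all areas, in raster-scan order (top-to-bottom, then left-to-right).
    def enumerate_nodes(area_polygons):
        vertices = {v for polygon in area_polygons for v in polygon}
        ordered = sorted(vertices, key=lambda v: (v[1], v[0]))   # sort by y, then x
        return {vertex: number for number, vertex in enumerate(ordered, start=1)}

    # Example with two areas sharing an edge (coordinates are hypothetical):
    numbering = enumerate_nodes([[(0, 0), (40, 0), (40, 25), (0, 25)],
                                 [(40, 0), (80, 0), (80, 25), (40, 25)]])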
[0091] In an embodiment, the node syntax verification (310) may implement a path-walking algorithm to traverse the links and nodes along a particular path. In an embodiment, the links and nodes may be considered as forming a directed graph. The particular path may be any path for the directed graph based on the graph theory. In an embodiment, this particular path may be an optimal path that may be represented by a minimum amount of data. To simplify data transmission for the area node map (314) and to ensure the correctness in node representation syntaxes, virtual nodes may be inserted. In an embodiment, only a minimum number of such virtual nodes may be inserted to create the area node map (314).
[0092] In an embodiment, the node specification (312) may produce a list of ordered nodes along with properties of the ordered nodes. For example, this list may comprise a list of ordered nodes represented as combinations of node numbers, positions (x, y) of the nodes, and link numbers for the links that interconnect the nodes in the numeric order of the node numbers. The list may be used to describe the net that forms the area node map (314).
[0093] FIG. 3B illustrates an example node link table (316) comprising a list of ordered nodes representing an example area node map (e.g., 314), according to embodiments of the present invention. The list comprises rows along an ordered walking path of the nodes. In an embodiment, the ordered walking path follows a numeric (e.g., ascending) order of node numbers assigned to the nodes. In an embodiment, each row in the node link table (316) may represent a node and comprise a node number (Node #), an X coordinate (X), and a Y coordinate (Y) for the node, and a list of linked nodes from the node in the "Link #s" column of the node link table (316). In an example representation, a link between two nodes may be represented by a single directed edge. The list of linked nodes for a node in a row of the node link table (316) may comprise zero or more originating nodes for which the node is the end point, which originating nodes may be denoted, for example, within parentheses. The list of linked nodes for the node may also comprise zero or more terminating nodes for which the node specified in the "Node#" column is the start point (e.g., the originating node). The terminating nodes for the node specified in the "Node#" column may be denoted, for example, without parentheses. In an embodiment, the node link table (316) may be encoded in a data stream to other units or parts in the image processing system (100) or an image data storage device or an image receiving system. In another embodiment, the node link table (316) may be transformed to an alternative representation that may be more efficiently stored or transmitted than the node link table (316).
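A minimal data-structure sketch of such a node link table is shown below; only the terminating links are kept in this sketch, the coordinates are hypothetical, and the links from node "2" to nodes "4", "3", and "10" follow the example discussed in paragraph [0096].

    # Sketch: one row per node along the ordered walking path, holding the node
    # number, its (x, y) position, and the terminating nodes it links to.
    node_link_table = [
        # (node #, x, y, [terminating node numbers])
        (1, 5, 5, [2]),
        (2, 40, 5, [4, 3, 10]),
        (3, 40, 25, [5]),
    ]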
[0094] FIG. 3C illustrates an example alternative representation, as mentioned above, of an area node map (e.g., 314) according to embodiments of the present invention. In an embodiment, a row in a Δlink table (318) represents a link between an originating node specified in the "Node #" column and a terminating node specified in the "Δlink #" column. In an embodiment, the "Δlink #" column for a row comprises one and only one terminating node. For example, since node "1" is an originating node that links to node "2", the first row in the Δlink table (318) stores information between node "1" and node "2", as illustrated.
[0095] A row of the Δlink table (318) stores values in the two "ΔX" and "ΔY" columns that are the differences between the x-y coordinates of the originating node in the (present) row and the x-y coordinates of the originating node in the preceding row if the two originating nodes in the present row and the preceding row are the same. Otherwise, if the originating node in the present row is different from that in the preceding row, or if the present row is the first row, then the "ΔX" and "ΔY" columns for the present row store the x-y coordinates of the originating node in the present row. In the Δlink table (318), a same originating node may be repeated one time or multiple times if the originating node is the originating node for one link or multiple links, respectively.
[0096] Thus, since node "2" is the originating node for links to nodes "4", "3", and "10", the Δlink table (318) comprises three rows with node "2" as the originating node. Values in the Δlink # column for these rows in the Δlink table (318) are 2, 1, and 8, which are differences between the number for the originating node (e.g., 2) and the numbers for the terminating nodes (e.g., 4, 3, and 10), respectively.
[0097] In an embodiment, each node needs to have at least one row in the Δlink table (318). Additionally and/or optionally, virtual nodes may be used to complement those nodes that are not originating nodes in the chosen ordered walking path, and thus might not otherwise appear in the Δlink table (318) without the virtual nodes being inserted in the Δlink table (318). In an embodiment, each virtual node may correspond to only one row in the Δlink table (318).
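A hedged sketch of how the Δlink table (318) might be derived from the node link table sketched after paragraph [0093] is shown below; per paragraphs [0095] and [0096], ΔX/ΔY are zero when the originating node repeats from the preceding row (its absolute coordinates otherwise), and Δlink # is the difference between the terminating and originating node numbers.

    # Sketch: derive Δlink rows (ΔX, ΔY, Δlink #) from an ordered node link table.
    def build_delta_link_table(node_link_table):
        rows, prev_node = [], None
        for node_number, x, y, terminating in node_link_table:
            for term in terminating:
                if node_number == prev_node:
                    dx, dy = 0, 0          # same originating node as the preceding row
                else:
                    dx, dy = x, y          # first row for this originating node
                rows.append((dx, dy, term - node_number))
                prev_node = node_number
        return rows

    # With node "2" at a hypothetical (40, 5) linking to nodes 4, 3, and 10, the
    # corresponding rows carry Δlink # values of 2, 1, and 8, with zero ΔX/ΔY
    # after the first of the three rows, matching the example of FIG. 3C.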
[0098] It should be noted that in other embodiments of the present invention, different ordered walking paths other than that illustrated in FIG. 3A, FIG. 3B, and FIG. 3C may be used. An example of a different ordered walking path is illustrated in FIG. 3D. Such a different ordered walking path is expected to produce different metadata compression from that for the ordered walking path illustrated in FIG. 3A, FIG. 3B, and FIG. 3C.
[0099] In an embodiment, three columns of information represented by "ΔX", "ΔY", and "Δlink #" may produce sparse vectors with many zero values in "ΔX" and "ΔY", and small amplitudes in "Δlink #". These sparse vectors may be efficiently encoded with entropy coding or any other encoding techniques that use run-lengths and binarization. In an embodiment, as illustrated in FIG. 3E, entropy coders (320) such as CAVLC or CABAC that may already be implemented in a video coding unit (e.g., 112) in the image processing system (100) may be used to encode these sparse vectors as part of messages of a metadata transport type. In some embodiments, the messages may be transported in one or more channels different from media data channels that transport image data for video frames. In a particular method for off-channel signaling, standard-based messages, proprietary messages (e.g., SEI for H.264), extra channels for MP2TS, side-band frequency for radio broadcasting, etc., may be used to transport the sparse vectors representing an area map.
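A minimal sketch of a run-length stage that could precede such entropy coding is given below; it is not the CAVLC/CABAC coder itself, only an illustration of why sparse ΔX/ΔY/Δlink vectors compress well.

    # Sketch: encode a sparse sequence as (run of zeros, nonzero value) pairs.
    def run_length_encode(values):
        pairs, zeros = [], 0
        for v in values:
            if v == 0:
                zeros += 1
            else:
                pairs.append((zeros, v))
                zeros = 0
        if zeros:
            pairs.append((zeros, 0))       # trailing run of zeros
        return pairs

    # Example: run_length_encode([40, 0, 0, 0, 25, 0, 0]) -> [(0, 40), (3, 25), (2, 0)]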
9. POST PROCESSING OPERATIONS
[0100] FIG. 1D illustrates an example image receiving system (180) according to embodiments of the present invention. In an embodiment, a coded bitstream (142) generated by an example image processing system (e.g., 100) is received through an intermediate media transmission or media storage system (172). In an embodiment, the intermediate system (172) may be a networked system that operatively connects with both the image processing system (100) and the image receiving system (180). Alternatively and/or optionally, the intermediate system (172) may comprise a tangible storage that may be used to first store the coded bitstream (142) from the image processing system (100) and then transmit the coded bitstream to the image receiving system (180). For example, the coded bitstream (142) may be stored as a file on a Blu-ray disc or another suitable medium in the intermediate system (172) or in the image processing system (100). The coded bitstream (142) may be recovered from a tangible storage medium and provided to the image receiving system (180) as requested, as scheduled, or in any suitable manner.
[0101] In an embodiment, the coded bitstream (142) has a container structure that comprises image data channels, metadata channel, audio channels, etc. The image data channels may comprise image data having two video layers: a full-resolution residual layer and a prediction layer. For example, the full-resolution residual layer may store reference frames while the prediction layer may be a low-resolution layer that stores prediction information. The metadata in one or more separate metadata channels in the coded bitstream may comprise data specifying one or more area maps as described herein.
[0102] The coded bitstream 142 may be de-multiplexed by a demuxer (182) in the image receiving system (180) into an image data stream (190), a metadata stream (192), and an audio data stream (194) that may have been carried by different logical channels as described above, respectively, in the coded bitstream (142). The image data stream (190) may be decoded by a video decode unit (184) to recover video layers (174). The video layers (174) may be a down-sampled version of pixel processed frames and may represent a version of frames at a lower sampling rate than that of the pixel processed frames. The video layers (174) may be provided to a video post processing unit (188) in the image receiving system (180).
[0103] In an embodiment, the image receiving system (180), or a unit therein (e.g., the video post processing unit (188) or the video decode unit (184)), may apply one or more image enhancing techniques to improve the quality of the final image output (e.g., 198). For example, the video post processing unit (188) may implement one or more image enhancing algorithms for sharpening and for deinterlacing.
[0104] In an embodiment, the image enhancing algorithms may only use information that is already stored in the video layers (174). In this embodiment, no extra information, other than the video layers (174), needs to be provided for these image enhancing algorithms to work; and hence under this approach the image enhancing algorithms do not require an additional bit rate to carry, per unit time, extra information outside the video layers (174).
[0105] In another embodiment, any, some, or all, of the image enhancing algorithms may alternatively and/or optionally use both the video layers (174) and the metadata contained in the metadata stream (192) to improve the quality of the final image output (198). An example of an image enhancing algorithm that uses both input frames and metadata is the deblocking filter used in the H.264 video coding standard. Such an image enhancing algorithm may also use both the video layers (174) and area maps that were constructed by an image processing system (e.g., 100).
[0106] In an embodiment, as illustrated in FIG. 1D, the metadata stream (192) may embed data specifying one or more area maps that were constructed by the image processing system (100). Both the embedded data specifying the one or more area maps and the video layers (174) may be provided as a combined input (196) to an area map reconstruction unit (186) in the image receiving system (180). In an embodiment, the data embedded in and extracted from the metadata stream may comprise one or more Δlink tables (e.g., 318 of FIG. 3C) for the one or more area maps and measured information related to one or more computational measures.
[0107] In an embodiment, the area map reconstruction unit (186) in the image receiving system (180) may perform a number of reconstruction and decoding operations, which may be symmetrical to, or resemble an inverse of, the area map construction and encoding operations implemented in the image processing system (100). From these reconstruction and decoding operations, node link tables (e.g., 316 of FIG. 3B) may be reconstructed from the one or more Δlink tables (318) encoded in the metadata. Based on the node link tables (316), one or more area node maps may be reconstructed. One or more area maps may be reconstructed based on the one or more area node maps decoded from the metadata (192). The measured information related to the one or more computational measures may be contained in and extracted from the coded bitstream and thereafter incorporated into the one or more area maps. In an embodiment, information in the video layers (174) may be additionally and/or optionally extracted and used to reconstruct the one or more area maps. In an embodiment, the area map reconstruction unit (186) provides the one or more reconstructed area maps (176) with the measured information as an input to the video post processing unit (188).
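A hedged sketch of the corresponding inverse step, recovering a node link table from Δlink rows produced by the encoder-side sketch given after paragraph [0097], is shown below; it assumes, for simplicity, that node numbers were assigned consecutively along the ordered walking path and that no originating node has absolute coordinates (0, 0).

    # Sketch: rebuild (node #, x, y, [links]) rows from (ΔX, ΔY, Δlink #) rows.
    def rebuild_node_link_table(delta_rows):
        table = []
        for dx, dy, dlink in delta_rows:
            if dx != 0 or dy != 0 or not table:    # a new originating node starts here
                table.append((len(table) + 1, dx, dy, []))
            node_number, _, _, links = table[-1]
            links.append(node_number + dlink)
        return table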
[0108] In an embodiment, the video post processing unit (188) may implement one or more algorithms that use both the video layers (174) and the one or more area maps (176) to generate the final image output (198) for rendering. In an embodiment, the final image output (198) comprises frames of an enhanced version relative to the frames that can be recovered solely from the video layers (174) with or without image enhancing techniques. This enhanced version of frames in the final image output (198) may provide high-quality image perception.
UP-SAMPLING
[0109] In some embodiments, when a video encode unit (e.g., 112 of FIG. 1A) encodes pixel processed frames (e.g., 128 of FIG. 1A) into a data stream (e.g., 140 of FIG. 1A), some information in the pixel processed frames may be lost. For example, a sampling rate associated with the pixel processed frames or source frames may be, but is not limited to, 4:4:2 using a 10-bit color value representation, while a sampling rate associated with the encoded image data (such as the video layers) in the data stream may be, but is not limited to, 4:2:2 using an 8-bit color value representation. For example, the lower sampling rate may be dictated by a mass media player. A Blu-ray disc or a similar video delivery mechanism may not support the higher sampling rate of the source frames, but may support the lower sampling rate of the encoded image data implemented by an image processing system (100). In some embodiments, an image receiving system (e.g., a high-end 3D display system that supports HD, 2K, 4K, or even 8K encoded pictures) may support a higher sampling rate than the sampling rate associated with the encoded image data from the image processing system (100).
[0110] In some embodiments, an area map may be received and used by the image receiving system (180) to predict information lost in the down-sampling operation of the encoding process. The area map may carry measured information (a value of the computational measure) per area formed by a group of contiguous pixels, rather than an actual original value per pixel. Thus, the area map may require a much smaller data volume to store and/or to transmit than a video layer comprising all per-pixel values of the lost information for supporting a restoration of a down-sampled version to the original pixel processed frames or source frames.
[0111] As image information such as color values of objects in the input frames is typically highly correlated, the video layer comprising per-pixel values may contain highly redundant information, resulting in a very high data volume that is difficult to store and to transmit. On the other hand, due to the same highly correlated nature of image information of objects in the input frames, the area map, even though transmitted in a significantly smaller data volume, may be effectively used by a video post processing unit (188) to generate a high quality version of frames in the final image output (198). This high quality version of frames may not be the same as the source frames or the pixel processed frames in the image processing system (100), but may provide excellent perceptual quality to a viewer.
[0112] In an embodiment, area-based samples (or data points) of measured information (e.g., luma/chroma) may be added on top of the area map by the area map reconstruction unit in FIG. 1D. The information on this area map (176) may be interpolated using an interpolator of a same type used in encoding by the image processing system (100) to obtain an up-sampled prediction layer. The up-sampled prediction layer may be added to a residual layer from the coded bitstream (142) and (post) processed by the video post processing unit (188) based on the same or different image enhancing algorithms that have already been implemented therein to generate output frames. In an embodiment, the final image output (198) may be frames of an up-sampled version that are generated based on the video layers in the received coded stream (142) and the measured information in the area map.
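A simplified sketch of this prediction step is given below; it assumes one measured sample per area (e.g., a chroma value), omits the in-area interpolation mentioned above, and clips to an 8-bit range, all of which are assumptions of the sketch rather than properties of the described system.

    # Sketch: spread per-area measured samples onto a full-resolution prediction
    # layer using the reconstructed area map, then add the residual layer.
    def upsample_with_area_map(area_labels, area_samples, residual):
        h, w = len(area_labels), len(area_labels[0])
        output = [[0] * w for _ in range(h)]
        for y in range(h):
            for x in range(w):
                prediction = area_samples[area_labels[y][x]]   # measured value for this area
                output[y][x] = max(0, min(255, prediction + residual[y][x]))
        return output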
DEBLOCKING
[0113] In some embodiments, when a video encode unit (e.g., 112 of FIG. 1A) encodes frames (e.g., 128 of FIG. 1A) into a data stream (e.g., 140 of FIG. 1A), some information in the frames may be lost due to data compression. For example, blockiness artifacts may be seen if frames are made of highly compressed image data.
[0114] In some embodiments, even for large areas with small amplitude gradients in luma and chroma in a picture, visible artifacts such as banding and contouring may nevertheless be seen in the frames even if the frames have been delivered with a relatively high bit rate.
[0115] In some embodiments, "breaking the block based net" techniques as described herein may be used to reduce the above-mentioned visual artifacts. In an embodiment, an area map may be used to outline and predict accurately the edges of objects or areas in a picture as represented by frames. For example, the edges in the area may be a smooth faithful representation of an outline of a uniform looking portion of the sky. Samples of measured information in the area map may be interpolated with some or all video layers or channels to create frames of an up-sampled version with accurate surface structures of objects in the frames. These surface structures may be accentuated by pixel-accurate contours captured in the area map, forming an improved surface of luma or chroma that does not have blockiness artifacts as created by a typical block-based compression and/or representation of images. Other image enhancing algorithms may be used to further improve the image quality of areas around the pixel-accurate contours.
IMAGE ENHANCEMENTS
[0116] In some embodiments, an image receiving system (e.g., 180) configured for a higher sampling rate (e.g., 4:4:2) instead of the low sampling rate (e.g., 4:2:2) of received image data may up-sample the image data incorrectly and thus waste the capability of the image receiving system (e.g., 180), or of an accompanying image rendering system, to support the higher sampling rate. For example, the image data may comprise compressed color values (e.g., 8-bit values) in certain regions of color space. When the color values in the image data are up-sampled to a higher sampling rate that the image receiving system (180) supports, these up-sampled values may be different from those in the pixel processed frames or source frames, and hence may produce artifacts such as color banding. In an embodiment, an area map may carry color shift information in a small data volume. When frames are up-sampled by the image receiving system (180), color values in certain areas (in the color space) that are susceptible to incorrect color shifting may be correctly shifted by prediction based on measured information in the color shift area map that was produced in the video encoding process in the image processing system (100).
[0117] In some embodiments, source frames may be encoded with a wide input color gamut, while the image data may be encoded with a smaller color gamut by a video encode unit (e.g., 112 of FIG. 1A). In an embodiment, an area map may carry color space conversion information such that when frames are up-sampled by an image receiving system (e.g., 180), color values in certain areas that were mapped to the smaller color gamut in video encoding may be converted back to a wider color gamut supported by the image receiving system (180) using measured information on the area map that recorded the original color space conversion.
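A minimal sketch of applying such per-area correction during up-sampling is shown below; the 8-bit to 10-bit expansion and the additive per-area shift are illustrative assumptions, not the recorded color space conversion itself.

    # Sketch: expand decoded 8-bit values to 10 bits and apply one correction
    # value per area taken from the color shift / color gamut area map.
    def apply_color_shift(decoded, area_labels, area_shift):
        h, w = len(decoded), len(decoded[0])
        out = [[0] * w for _ in range(h)]
        for y in range(h):
            for x in range(w):
                expanded = decoded[y][x] << 2                  # naive 8-bit -> 10-bit expansion
                shift = area_shift.get(area_labels[y][x], 0)   # per-area correction
                out[y][x] = max(0, min(1023, expanded + shift))
        return out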
[0118] In some embodiments, image enhancement techniques as described herein may be used to improve perceived image quality in other respects. For example, area maps that comprise disparity information may be used to generate output frames (e.g., frames with stereo encoding in the ASBS mode) that provide an accurate representation in a viewing environment, even if the viewing environment has a different geometry than the camera system that originated the video data. Similarly, area maps that comprise noise level information may be used to inject noise, to correct spatial errors of the frames in a residual layer, and so on.
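For the noise-level use case, a hedged example is sketched below; it assumes the noise area map carries one standard deviation per area and injects Gaussian grain accordingly. The per-area representation and the Gaussian model are illustrative assumptions.

```python
# Sketch: inject noise into each area at the level recorded in a noise area map.
import numpy as np

def inject_area_noise(frame, area_map, area_noise_levels, seed=0):
    """area_noise_levels: dict mapping area label -> noise standard deviation."""
    rng = np.random.default_rng(seed)
    noisy = frame.astype(np.float64).copy()
    for label, sigma in area_noise_levels.items():
        mask = area_map == label
        # Gaussian grain with the per-area level carried by the area map.
        noisy[mask] += rng.normal(0.0, sigma, size=int(mask.sum()))
    return np.clip(noisy, 0, 255)
```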
PROCESS FLOW
[0119] FIG. 4A illustrates an example process flow according to a possible embodiment of the present invention. In some possible embodiments, one or more computing devices or components in an image processing system (100) may perform this process flow. In block 410, the image processing system (100) determines a computational measure for one or more input frames. In block 420, the image processing system (100) calculates a value of the computational measure for each picture unit in an input frame in the one or more input frames. In block 430, the image processing system (100) generates an area map comprising one or more areas. Here, each of the one or more areas comprises picture units with a uniform value for the measure. In block 440, the image processing system (100) outputs the area map and an encoded version of the input frame to a recipient device.
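A minimal sketch of the FIG. 4A flow is given below under simplifying assumptions: the picture unit is a single pixel, the measure is quantized so that picture units of one area share a uniform value, and connected regions of uniform value become areas. The measure, the quantization step, and all helper names are illustrative, not the patent's implementation.

```python
# Sketch of blocks 410-440: compute a measure per picture unit, group picture
# units with a uniform (quantized) value into connected areas, and emit the map.
import numpy as np
from scipy.ndimage import label as connected_components

def build_area_map(input_frame, measure_fn, quant_step=8):
    # Blocks 410/420: compute the measure for each picture unit of the frame.
    measure = measure_fn(input_frame)
    # Quantize so that picture units within an area share a uniform value.
    uniform = (measure // quant_step) * quant_step
    # Block 430: one area per connected region of uniform measure value.
    area_map = np.zeros(uniform.shape, dtype=np.int32)
    next_label = 1
    for value in np.unique(uniform):
        labels, count = connected_components(uniform == value)
        area_map[labels > 0] = labels[labels > 0] + next_label - 1
        next_label += count
    return area_map, uniform

# Example: use luma itself as the measure for a random test frame.
frame = np.random.randint(0, 256, size=(64, 64))
area_map, uniform_values = build_area_map(frame, measure_fn=lambda f: f)
# Block 440: area_map (with an encoded version of the frame) is sent downstream.
```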
[0120] FIG. 4B illustrates another example process flow according to a possible embodiment of the present invention. In some possible embodiments, one or more computing devices or components in an image receiving system (180) may perform this process flow. In block 460, the image receiving system (180) decodes an encoded version of an input frame and an area map relating to one or more input frames including the input frame. Here, the area map comprises one or more areas, each having picture units that are of a uniform value for a computational measure. In block 470, the image receiving system (180) applies the area map to the input frame to generate a new version of the input frame. In block 480, the image receiving system (180) outputs the new version of the input frame to a rendering device.
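The receiving-side flow of FIG. 4B can be sketched in the same hedged style, assuming the decoded area map carries one uniform correction value per area (e.g., a luma offset). The helper names are hypothetical and not part of the patent.

```python
# Sketch of blocks 460-480: apply the decoded area map to the decoded frame.
import numpy as np

def apply_area_map(decoded_frame, area_map, uniform_values):
    """Block 470: adjust each picture unit by the uniform value of its area."""
    new_frame = decoded_frame.astype(np.float64).copy()
    for label, value in uniform_values.items():
        new_frame[area_map == label] += value
    return np.clip(new_frame, 0, 255)

# Block 460 would decode `decoded_frame`, `area_map`, and `uniform_values`
# from the received stream; block 480 outputs the result to a rendering device.
```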
IMPLEMENTATION MECHANISMS - HARDWARE OVERVIEW
[0121] According to one embodiment, the techniques described herein are implemented by one or more special-purpose computing devices. The special-purpose computing devices may be hard-wired to perform the techniques, or may include digital electronic devices such as one or more application-specific integrated circuits (ASICs) or field programmable gate arrays (FPGAs) that are persistently programmed to perform the techniques, or may include one or more general purpose hardware processors programmed to perform the techniques pursuant to program instructions in firmware, memory, other storage, or a combination. Such special-purpose computing devices may also combine custom hard-wired logic, ASICs, or FPGAs with custom programming to accomplish the techniques. The special-purpose computing devices may be desktop computer systems, portable computer systems, handheld devices, networking devices or any other device that incorporates hard-wired and/or program logic to implement the techniques.
[0122] For example, FIG. 5 is a block diagram that illustrates a computer system 500 upon which an embodiment of the invention may be implemented. Computer system 500 includes a bus 502 or other communication mechanism for communicating information, and a hardware processor 504 coupled with bus 502 for processing information. Hardware processor 504 may be, for example, a general purpose microprocessor. [0123] Computer system 500 also includes a main memory 506, such as a random access memory (RAM) or other dynamic storage device, coupled to bus 502 for storing information and instructions to be executed by processor 504. Main memory 506 also may be used for storing temporary variables or other intermediate information during execution of instructions to be executed by processor 504. Such instructions, when stored in storage media accessible to processor 504, render computer system 500 into a special-purpose machine that is customized to perform the operations specified in the instructions.
[0124] Computer system 500 further includes a read only memory (ROM) 508 or other static storage device coupled to bus 502 for storing static information and instructions for processor 504. A storage device 510, such as a magnetic disk or optical disk, is provided and coupled to bus 502 for storing information and instructions.
[0125] Computer system 500 may be coupled via bus 502 to a display 512, such as a cathode ray tube (CRT), for displaying information to a computer user. An input device 514, including alphanumeric and other keys, is coupled to bus 502 for communicating information and command selections to processor 504. Another type of user input device is cursor control 516, such as a mouse, a trackball, or cursor direction keys for
communicating direction information and command selections to processor 504 and for controlling cursor movement on display 512. This input device typically has two degrees of freedom in two axes, a first axis (e.g., x) and a second axis (e.g., y), that allows the device to specify positions in a plane.
[0126] Computer system 500 may implement the techniques described herein using customized hard-wired logic, one or more ASICs or FPGAs, firmware and/or program logic which in combination with the computer system causes or programs computer system 500 to be a special-purpose machine. According to one embodiment, the techniques herein are performed by computer system 500 in response to processor 504 executing one or more sequences of one or more instructions contained in main memory 506. Such instructions may be read into main memory 506 from another storage medium, such as storage device 510. Execution of the sequences of instructions contained in main memory 506 causes processor 504 to perform the process steps described herein. In alternative embodiments, hard-wired circuitry may be used in place of or in combination with software instructions.
[0127] The term "storage media" as used herein refers to any media that store data and/or instructions that cause a machine to operate in a specific fashion. Such storage media may comprise non-volatile media and/or volatile media. Non-volatile media includes, for example, optical or magnetic disks, such as storage device 510. Volatile media includes dynamic memory, such as main memory 506. Common forms of storage media include, for example, a floppy disk, a flexible disk, a hard disk, a solid state drive, magnetic tape, or any other magnetic data storage medium, a CD-ROM, any other optical data storage medium, any physical medium with patterns of holes, a RAM, a PROM, an EPROM, a FLASH-EPROM, NVRAM, or any other memory chip or cartridge.
[0128] Storage media is distinct from but may be used in conjunction with transmission media. Transmission media participates in transferring information between storage media. For example, transmission media includes coaxial cables, copper wire and fiber optics, including the wires that comprise bus 502. Transmission media can also take the form of acoustic or light waves, such as those generated during radio-wave and infra-red data communications.
[0129] Various forms of media may be involved in carrying one or more sequences of one or more instructions to processor 504 for execution. For example, the instructions may initially be carried on a magnetic disk or solid state drive of a remote computer. The remote computer can load the instructions into its dynamic memory and send the instructions over a telephone line using a modem. A modem local to computer system 500 can receive the data on the telephone line and use an infra-red transmitter to convert the data to an infra-red signal. An infra-red detector can receive the data carried in the infra-red signal and appropriate circuitry can place the data on bus 502. Bus 502 carries the data to main memory 506, from which processor 504 retrieves and executes the instructions. The instructions received by main memory 506 may optionally be stored on storage device 510 either before or after execution by processor 504.
[0130] Computer system 500 also includes a communication interface 518 coupled to bus 502. Communication interface 518 provides a two-way data communication coupling to a network link 520 that is connected to a local network 522. For example,
communication interface 518 may be an integrated services digital network (ISDN) card, cable modem, satellite modem, or a modem to provide a data communication connection to a corresponding type of telephone line. As another example, communication interface 518 may be a local area network (LAN) card to provide a data communication connection to a compatible LAN. Wireless links may also be implemented. In any such implementation, communication interface 518 sends and receives electrical, electromagnetic or optical signals that carry digital data streams representing various types of information.
[0131] Network link 520 typically provides data communication through one or more networks to other data devices. For example, network link 520 may provide a connection through local network 522 to a host computer 524 or to data equipment operated by an Internet Service Provider (ISP) 526. ISP 526 in turn provides data communication services through the world wide packet data communication network now commonly referred to as the "Internet" 528. Local network 522 and Internet 528 both use electrical, electromagnetic or optical signals that carry digital data streams. The signals through the various networks and the signals on network link 520 and through communication interface 518, which carry the digital data to and from computer system 500, are example forms of transmission media.
[0132] Computer system 500 can send messages and receive data, including program code, through the network(s), network link 520 and communication interface 518. In the Internet example, a server 530 might transmit a requested code for an application program through Internet 528, ISP 526, local network 522 and communication interface 518. The received code may be executed by processor 504 as it is received, and/or stored in storage device 510, or other non-volatile storage for later execution.
EQUIVALENTS, EXTENSIONS, ALTERNATIVES AND MISCELLANEOUS
[0133] To illustrate a clear example, source frames captured from reality are used to illustrate some aspects of the present invention. It should be noted that other types of source frames may also be used in embodiments of the present invention. For example, source frames may be composite frames from two or more different image sources.
Furthermore, a part, or a whole, of a source frame may be sourced from a 2D image, while another part of the same source frame may be sourced from a 3D or multi-view image. Techniques as described herein may be provided for these other types of source frames in embodiments of the present invention.
[0134] In the foregoing specification, possible embodiments of the invention have been described with reference to numerous specific details that may vary from implementation to implementation. Thus, the sole and exclusive indicator of what is the invention, and is intended by the applicants to be the invention, is the set of claims that issue from this application, in the specific form in which such claims issue, including any subsequent correction. Any definitions expressly set forth herein for terms contained in such claims shall govern the meaning of such terms as used in the claims. Hence, no limitation, element, property, feature, advantage or attribute that is not expressly recited in a claim should limit the scope of such claim in any way. The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense.
[0135] Accordingly, the invention may suitably comprise, consist of, or consist essentially of, any element (the various parts or features of the invention and their equivalents as described herein, currently existing, and/or as subsequently developed). Further, the present invention illustratively disclosed herein may be practiced in the absence of any element, whether or not specifically disclosed herein. Obviously, numerous modifications and variations of the present invention are possible in light of the above teachings. It is therefore to be understood that within the scope of the appended claims, the invention may be practiced otherwise than as specifically described herein.
[0136] Accordingly, the invention may be embodied in any of the forms described herein, including, but not limited to, the following Enumerated Example Embodiments (EEEs), which describe the structure, features, and functionality of some portions of the present invention.
EEE1. A method comprising:
determining a computational measure for one or more input frames;
calculating a value of the computational measure for a picture unit in an input frame in the one or more input frames;
generating an area map comprising one or more areas, each of the one or more areas comprising picture units with a uniform value for the measure; and outputting the area map and an encoded version of the input frame to a recipient device; wherein the method is performed by one or more devices comprising a processor. EEE2. The method of EEE 1, wherein the encoded version of the input frame is of a sampling rate lower than that of the input frame.
EEE3. The method of EEE 1, wherein the one or more input frames comprise one or more views of a scene as captured by one or more camera elements in a camera system. EEE4. The method of EEE 3, further comprising determining geometry information for the one or more camera elements in the camera system.
EEE5. The method of EEE 1, wherein the input frame is a frame in a pair of stereoscopic frames that comprise a left-eye frame and a right-eye frame.
EEE6. The method of EEE 1, wherein the computational measure relates to disparity between two input frames.
EEE7. The method of EEE 1, wherein the computational measure relates to noise levels to be injected into the input frame to produce the encoded version of the input frame.
EEE8. The method of EEE 1, wherein the computational measure relates to color information in the input frame.
EEE9. The method of EEE 1, wherein each picture unit is a pixel.
EEE10. The method of EEE 1, wherein each picture unit is a block of pixels.
EEE11. The method of EEE 1, further comprising representing the area map with a representation based on a plurality of vertices and a plurality of links between the vertices.
EEE12. The method of EEE 1, wherein the computational measure is related to disparity between two input frames and wherein the area map comprises at least an area in which corresponding picture units in the two input frames are of zero disparity.
EEE13. The method of EEE 1, further comprising:
determining a second computational measure for the one or more input frames;
calculating a second value of the computational measure for each picture unit in the input frame in the one or more input frames;
generating a second area map comprising one or more second areas, each of the one or more second areas comprising second picture units with a second uniform value for the second measure; and
outputting the second area map to the recipient device.
EEE14. A method comprising:
decoding an encoded version of an input frame and an area map relating to one or more input frames including the input frame, the area map comprising one or more areas each having picture units that are of a uniform value for a computational measure; applying the area map to the input frame to generate a new version of the input frame; and
outputting the new version of the input frame to a rendering device;
wherein the method is performed by one or more devices comprising a processor. EEE15. A method comprising:
determining geometry information for one or more camera elements that originate one or more input frames;
encoding the geometry information with the one or more input frames in a data stream; and
outputting the data stream to a recipient device;
wherein the method is performed by one or more devices comprising a processor. EEE16. A method comprising:
receiving one or more input frames and geometry information of one or more camera elements that originate the one or more input frames;
modifying the one or more input frames to generate a new version of the one or more input frames based on the geometry information of the one or more camera elements; and
outputting the new version of the one or more input frames to a rendering device; wherein the method is performed by one or more devices comprising a processor. EEE17. An apparatus comprising a processor and configured to perform any one of the methods recited in EEEs 1-16.
EEE18. A computer readable storage medium, comprising software instructions, which when executed by one or more processors cause performance of any one of the methods recited in EEEs 1-16.
EEE19. A method comprising:
receiving an area map comprising one or more areas, each of the one or more areas comprising one or more picture units, in an input frame, with a uniform value for a computational measure;
receiving an encoded version of the input frame; and
generating an output frame by adjusting a picture unit of the input frame based upon the uniform value of a corresponding picture unit of the area map, wherein the method is performed by one or more devices comprising a processor. EEE20. The method of EEE 19, further comprising retrieving the encoded version of the input frame from a coded bitstream.
EEE21. The method of EEE 19, further comprising retrieving the encoded version of the input frame from a physical storage medium.
EEE22. The method of EEE 19, further comprising determining geometry information for one or more camera elements in a camera system that captures a source frame from which the input frame was derived.
EEE23. The method of EEE 19, wherein the input frame is a frame in a pair of
stereoscopic frames that comprise a left-eye frame and a right-eye frame.
EEE24. The method of EEE 19, wherein the computational measure relates to disparity between the input frame and one or more other input frames.
EEE25. The method of EEE 19, wherein the computational measure relates to noise levels to be injected into an output frame for rendering.
EEE26. The method of EEE 19, wherein the computational measure relates to color information in a source frame from which the input frame was derived.
EEE27. The method of EEE 19, wherein each picture unit is a pixel.
EEE28. The method of EEE 19, wherein each picture unit is a block of pixels.
EEE29. The method of EEE 19, wherein the area map is represented based on a plurality of vertices and a plurality of links between the vertices.
EEE30. The method of EEE 19, wherein the input frame is one of two stereoscopic input frames, and wherein the area map comprises at least an area in which corresponding picture units in the two stereoscopic input frames are of zero disparity.
EEE31. The method of EEE 19, further comprising:
receiving a second area map comprising one or more second areas, each of the one or more second areas comprising one or more second picture units, in the input frame, with a second uniform value for a second computational measure; and adjusting, in the output frame, a picture unit of the input frame based upon the second uniform value of a corresponding second picture unit of the second area map. EEE32. An apparatus comprising a processor and configured to perform any one of the methods recited in EEEs 19-29.
EEE33. A computer readable storage medium, comprising software instructions, which when executed by one or more processors cause performance of any one of the methods recited in EEEs 19-29.

Claims

1. A method comprising:
determining a computational measure for one or more input frames;
calculating a value of the computational measure for a picture unit in an input frame in the one or more input frames;
generating an area map comprising one or more areas, each of the one or more areas comprising picture units with a uniform value for the measure; and outputting the area map and an encoded version of the input frame to a recipient device; wherein the method is performed by one or more devices comprising a processor.
2. The method of Claim 1, wherein the encoded version of the input frame is of a sampling rate lower than that of the input frame.
3. The method of Claim 1, wherein the one or more input frames comprise one or more views of a scene as captured by one or more camera elements in a camera system.
4. The method of Claim 3, further comprising determining geometry information for the one or more camera elements in the camera system.
5. The method of Claim 1, wherein the input frame is a frame in a pair of stereoscopic frames that comprise a left-eye frame and a right-eye frame.
6. The method of Claim 1, wherein the computational measure relates to disparity
between two input frames.
7. The method of Claim 1, wherein the computational measure relates to noise levels to be injected into the input frame to produce the encoded version of the input frame.
8. The method of Claim 1, wherein the computational measure relates to color
information in the input frame.
9. The method of Claim 1, wherein each picture unit is a pixel.
10. The method of Claim 1, wherein each picture unit is a block of pixels.
11. The method of Claim 1, further comprising representing the area map with a
representation based on a plurality of vertices and a plurality of links between the vertices.
12. The method of Claim 1, wherein the computational measure is related to disparity between two input frames and wherein the area map comprises at least an area in which corresponding picture units in the two input frames are of zero disparity.
13. The method of Claim 1, further comprising:
determining a second computational measure for the one or more input frames;
calculating a second value of the computational measure for each picture unit in the input frame in the one or more input frames;
generating a second area map comprising one or more second areas, each of the one or more second areas comprising second picture units with a second uniform value for the second measure; and
outputting the second area map to the recipient device.
14. A method comprising:
decoding an encoded version of an input frame and an area map relating to one or more input frames including the input frame, the area map comprising one or more areas each having picture units that are of a uniform value for a computational measure;
applying the area map to the input frame to generate a new version of the input frame; and
outputting the new version of the input frame to a rendering device;
wherein the method is performed by one or more devices comprising a processor.
15. A method comprising:
determining geometry information for one or more camera elements that originate one or more input frames;
encoding the geometry information with the one or more input frames in a data stream; and
outputting the data stream to a recipient device;
wherein the method is performed by one or more devices comprising a processor.
16. A method comprising:
receiving one or more input frames and geometry information of one or more camera elements that originate the one or more input frames;
modifying the one or more input frames to generate a new version of the one or more input frames based on the geometry information of the one or more camera elements; and
outputting the new version of the one or more input frames to a rendering device; wherein the method is performed by one or more devices comprising a processor.
17. An apparatus comprising a processor and configured to perform any one of the methods recited in Claims 1-16.
18. A computer readable storage medium, comprising software instructions, which when executed by one or more processors cause performance of any one of the methods recited in Claims 1-16.
19. A method comprising:
receiving an area map comprising one or more areas, each of the one or more areas comprising one or more picture units, in an input frame, with a uniform value for a computational measure;
receiving an encoded version of the input frame; and
generating an output frame by adjusting a picture unit of the input frame based upon the uniform value of a corresponding picture unit of the area map, wherein the method is performed by one or more devices comprising a processor.
20. The method of Claim 19, further comprising retrieving the encoded version of the input frame from a coded bitstream.

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US29979310P 2010-01-29 2010-01-29
US61/299,793 2010-01-29

Publications (1)

Publication Number Publication Date
WO2011094164A1 true WO2011094164A1 (en) 2011-08-04

Family

ID=43660528

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2011/022281 WO2011094164A1 (en) 2010-01-29 2011-01-24 Image enhancement system using area information

Country Status (1)

Country Link
WO (1) WO2011094164A1 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10354394B2 (en) 2016-09-16 2019-07-16 Dolby Laboratories Licensing Corporation Dynamic adjustment of frame rate conversion settings
US10977809B2 (en) 2017-12-11 2021-04-13 Dolby Laboratories Licensing Corporation Detecting motion dragging artifacts for dynamic adjustment of frame rate conversion settings
WO2021177758A1 (en) * 2020-03-04 2021-09-10 Samsung Electronics Co., Ltd. Methods and systems for denoising media using contextual information of the media

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2007037726A1 (en) * 2005-09-28 2007-04-05 Telefonaktiebolaget Lm Ericsson (Publ) Media content management
EP1965389A2 (en) * 2007-02-28 2008-09-03 Kabushiki Kaisha Toshiba Information encoding method, information playback method, and information storage medium using two versions of film grain reproduction information

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
EKMEKCIOGLU E ET AL: "Bit-Rate Adaptive Downsampling for the Coding of Multi-View Video with Depth Information", 3DTV CONFERENCE: THE TRUE VISION - CAPTURE, TRANSMISSION AND DISPLAY OF 3D VIDEO, 2008, IEEE, PISCATAWAY, NJ, USA, 28 May 2008 (2008-05-28), pages 137 - 140, XP031275230, ISBN: 978-1-4244-1760-5 *
FAN H ET AL: "Disparity map coding based on adaptive triangular surface modelling", SIGNAL PROCESSING. IMAGE COMMUNICATION, ELSEVIER SCIENCE PUBLISHERS, AMSTERDAM, NL, vol. 14, no. 1-2, 6 November 1998 (1998-11-06), pages 119 - 130, XP027357223, ISSN: 0923-5965, [retrieved on 19981106] *

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 11703527

Country of ref document: EP

Kind code of ref document: A1

DPE1 Request for preliminary examination filed after expiration of 19th month from priority date (pct application filed from 20040101)
NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 11703527

Country of ref document: EP

Kind code of ref document: A1