US20120050483A1 - Method and system for utilizing an image sensor pipeline (isp) for 3d imaging processing utilizing z-depth information - Google Patents
Method and system for utilizing an image sensor pipeline (isp) for 3d imaging processing utilizing z-depth information
- Publication number
- US20120050483A1 (application US 13/174,364)
- Authority
- US
- United States
- Prior art keywords
- captured
- image
- processing
- video
- information
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N13/00—Stereoscopic video systems; Multi-view video systems; Details thereof
- H04N13/20—Image signal generators
- H04N13/204—Image signal generators using stereoscopic image cameras
- H04N13/25—Image signal generators using stereoscopic image cameras using two or more image sensors with different characteristics other than in their location or field of view, e.g. having different resolutions or colour pickup characteristics; using image signals from one sensor to control the characteristics of another sensor
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N13/00—Stereoscopic video systems; Multi-view video systems; Details thereof
- H04N13/10—Processing, recording or transmission of stereoscopic or multi-view image signals
- H04N13/106—Processing image signals
- H04N13/122—Improving the 3D impression of stereoscopic images by modifying image signal contents, e.g. by filtering or adding monoscopic depth cues
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N13/00—Stereoscopic video systems; Multi-view video systems; Details thereof
- H04N13/20—Image signal generators
- H04N13/261—Image signal generators with monoscopic-to-stereoscopic image conversion
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N13/00—Stereoscopic video systems; Multi-view video systems; Details thereof
- H04N13/20—Image signal generators
- H04N13/271—Image signal generators wherein the generated image signals comprise depth maps or disparity maps
Definitions
- Certain embodiments of the invention relate to video processing. More specifically, certain embodiments of the invention relate to a method and system for utilizing an image sensor pipeline (ISP) for 3D imaging processing utilizing Z-depth information.
- ISP image sensor pipeline
- Demand for video systems that support three-dimensional (3D) video has increased rapidly in recent years. Both literally and physically, 3D video provides a whole new way to watch video, in the home and in theaters. However, 3D video systems are still in their infancy in many ways, and there is much room for improvement in terms of both cost and performance.
- a system and/or method is provided for utilizing an image sensor pipeline (ISP) for 3D imaging processing utilizing Z-depth information, substantially as shown in and/or described in connection with at least one of the figures, as set forth more completely in the claims.
- FIG. 1 is a diagram that illustrates an exemplary monoscopic, or single-view, camera embodying aspects of the present invention, compared with a conventional stereoscopic camera.
- FIG. 2A is a diagram illustrating an exemplary monoscopic camera, which may be utilized in accordance with an embodiment of the invention.
- FIG. 2B is a block diagram illustrating an exemplary image sensor pipeline (ISP), which may be utilized in accordance with an embodiment of the invention.
- FIG. 3 is a diagram that illustrates exemplary processing of depth information and 2D image information to generate a 3D image, which may be utilized in accordance with an embodiment of the invention.
- FIG. 4A is a diagram that illustrates exemplary detection and/or tracking of objects via a monoscopic camera based on Z-depth information, which may be utilized in accordance with an embodiment of the invention.
- FIG. 4B is a diagram that illustrates exemplary selective processing of objects via a monoscopic camera subsequent to detection and/or tracking based on Z-depth information, which may be utilized in accordance with an embodiment of the invention.
- FIG. 5 is a flow chart that illustrates exemplary steps for utilizing an image sensor pipeline (ISP) for 3D imaging processing utilizing Z-depth information, in accordance with an embodiment of the invention.
- a monoscopic video camera may be utilized to detect and/or track objects at varying depths, and may adaptively process video information associated with each of these objects, based on determined corresponding depths for these objects.
- the monoscopic video camera may capture, via at least one image sensor, two-dimensional video, and may capture, via at least one depth sensor, corresponding depth information for the captured two-dimensional video.
- the monoscopic video camera may then detect and/or track objects in the captured two-dimensional video, based on the captured corresponding depth information for example.
- processing of image related information corresponding to the objects may be configured based on the detecting and/or tracking of the objects.
- video processing in the monoscopic video camera may be configured to provide adaptive and/or dynamic setting and/or modification of video information, such as color and/or brightness based on determined types of objects and/or based on determination of relative depth of each of the objects with respect to the monoscopic video camera.
- Detection of objects may comprise determining the type and/or characteristics of each of the objects.
- identification of the type and/or characteristics of the objects may be performed based on one or more object recognition algorithms programmed into the monoscopic video camera.
- Configuration of processing of object image related information may be performed based on preset criteria and/or parameters associated with identified types and/or characteristics of the objects.
- the monoscopic video camera may also be operable to perform scene detection based on the two-dimensional video and/or the corresponding depth information, and object detection and/or tracking may be performed and/or adjusted based on scene detection.
- the scene detection may comprise determining various characteristics associated with scenes in images captured by the monoscopic video camera. Exemplary scene characteristics may comprise type of setting in the scenes, such as rural vs. urban; type of objects present and/or anticipated in the scene, such as trees and/or buildings; and/or chronological information relating to the scene, such as season and/or time of day.
- the monoscopic video camera may be operable to synchronize the captured corresponding depth information to the captured two-dimensional video, to enable generating 3D perception for at least some images captured via the monoscopic video camera. Accordingly, the monoscopic video camera may compose three-dimensional video from captured two-dimensional video based on corresponding captured depth information. The monoscopic video camera may then render the composed three-dimensional video autonomously, using an integrated display in the monoscopic video camera, or via another display device, to which the 3D video may be communicated directly from the monoscopic video camera or indirectly via intermediate storage devices.
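- The composition described above can be illustrated with a minimal depth-image-based rendering sketch. The function name, the linear depth-to-disparity model, and the absence of hole filling below are illustrative assumptions for this example, not details taken from the patent:

```python
import numpy as np

def synthesize_stereo_pair(image, depth, max_disparity=8):
    """Sketch: shift each pixel horizontally by a disparity inversely
    related to its depth to synthesize left/right views from one 2D
    image plus a depth map. The disparity model is an assumption."""
    h, w = depth.shape
    # Nearer pixels (smaller depth) receive a larger disparity.
    norm = depth.astype(np.float64) / max(depth.max(), 1)
    disparity = np.round((1.0 - norm) * max_disparity).astype(int)
    left = np.zeros_like(image)
    right = np.zeros_like(image)
    for y in range(h):
        for x in range(w):
            d = disparity[y, x]
            lx, rx = min(w - 1, x + d // 2), max(0, x - d // 2)
            left[y, lx] = image[y, x]
            right[y, rx] = image[y, x]
    return left, right
```

A production renderer would additionally fill disocclusion holes and filter the depth map before warping.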
- FIG. 1 is a diagram that compares a monoscopic camera embodying aspects of the present invention with a conventional stereoscopic camera. Referring to FIG. 1 , there is shown a stereoscopic camera 100 and a monoscopic camera 102 .
- the stereoscopic camera 100 may comprise suitable logic, circuitry, interfaces, and/or code that may enable capturing and/or generating stereoscopic video and/or images.
- the stereoscopic camera 100 may comprise two lenses 101 a and 101 b.
- Each of the lenses 101 a and 101 b may capture images from a different viewpoint and images captured via the two lenses 101 a and 101 b may be combined to generate a 3D image.
- electromagnetic (EM) waves in the visible spectrum may be focused on a first one or more image sensors by the lens 101 a (and associated optics), and EM waves in the visible spectrum may be focused on a second one or more image sensors by the lens 101 b (and associated optics).
- EM electromagnetic
- the monoscopic camera 102 may comprise suitable logic, circuitry, interfaces, and/or code that may enable capturing and/or generating monoscopic video and/or images.
- the monoscopic camera 102 may capture images via a single viewpoint, corresponding to the lens 101 c for example.
- EM waves in the visible spectrum may be focused on one or more image sensors by the lens 101 c.
- the image sensor(s) may capture brightness and/or color information.
- the captured brightness and/or color information may be represented in any suitable color space such as YCrCb color space or RGB color space.
- the monoscopic camera 102 may be operable to generate 3D video and/or images based on captured 2D video and/or images based on, for example, depth information.
- the monoscopic camera 102 may also capture depth information via the lens 101 c (and associated optics).
- the monoscopic camera 102 may comprise an infrared emitter, an infrared sensor, and associated circuitry operable to determine the distance to objects based on reflected infrared waves. Additional details of the monoscopic camera 102 are described below.
- the monoscopic camera may comprise a processor 124 , a memory 126 , and a sensory subsystem 128 .
- the processor 124 may comprise suitable logic, circuitry, interfaces, and/or code that may be operable to manage operation of various components of the camera and perform various computing and processing tasks.
- a single processor 124 is utilized only for illustration but the invention is not so limited.
- various portions of the camera 102 depicted in FIG. 2A below may correspond to the processor 124 depicted in FIG. 1 .
- the memory 126 may comprise, for example, DRAM, SRAM, flash memory, a hard drive or other magnetic storage, or any other suitable memory devices.
- the sensory subsystem 128 may comprise a plurality of sensors which may be operable to capture and/or generate video information corresponding to images and/or video streams generated via the monoscopic camera 102 .
- the sensory subsystem 128 may also comprise suitable logic, circuitry, interfaces, and/or code that may be operable to manage and/or control the various sensors in the sensory subsystem 128 , and/or to handle at least some of the processing of information generated and/or captured thereby.
- the sensory subsystem 128 may enable generating 2D video and corresponding depth and/or color information.
- the sensory subsystem 128 may comprise, for example, one or more image sensors, one or more depth sensors, and one or more color sensors.
- exemplary sensors that may be integrated into the sensory subsystem 128 are described in more detail below with respect to FIG. 2A .
- FIG. 2A is a diagram illustrating an exemplary monoscopic camera, in accordance with an embodiment of the invention.
- the monoscopic camera 102 which may comprise a memory 202 , a processor 204 , a digital signal processor (DSP) 206 , an error protection module 208 , a video encoder/decoder 210 , an audio encoder/decoder 212 , a speaker 214 , a microphone 216 , an optics module 218 , an emitter 220 , an input/output (I/O) module 228 , a digital display 230 , controls 232 , and optical viewfinder 234 .
- DSP digital signal processor
- the camera 102 may also comprise a plurality of sensors which may be operable to capture and/or generate video information corresponding to images and/or video streams.
- the camera 102 may comprise, for example, one or more image sensors 222 , one or more color sensors 224 , and one or more depth sensors 226 .
- the camera 102 may also comprise the lens 101 c , which may be operable to collect and sufficiently focus electromagnetic waves in the visible and infrared spectra to enable capturing images and/or video.
- the memory 202 may comprise suitable logic, circuitry, interfaces, and/or code that may enable temporary and/or permanent storage of data, and/or retrieval or fetching thereof.
- the memory 202 may comprise, for example, DRAM, SRAM, flash memory, a hard drive or other magnetic storage, or any other suitable memory devices.
- SRAM may be utilized to store data utilized and/or generated by the processor 204 and a hard-drive and/or flash memory may be utilized to store recorded image data and depth data.
- the processor 204 may comprise suitable logic, circuitry, interfaces, and/or code that may be operable to coordinate operation of the various components of the camera 102 .
- the processor 204 may, for example, run an operating system of the camera 102 and control communication of information and signals between components of the camera 102 .
- the processor 204 may execute instructions stored in the memory 202 .
- the DSP 206 may comprise suitable logic, circuitry, interfaces, and/or code that may be operable to perform complex processing of captured image data, captured depth data, and captured audio data.
- the DSP 206 may be operable to, for example, compress and/or decompress the data, encode and/or decode the data, and/or filter the data to remove noise and/or otherwise improve perceived audio and/or video quality for a listener and/or viewer.
- the error protection module 208 may comprise suitable logic, circuitry, interfaces and/or code that may be operable to perform error protection functions for the monoscopic camera 102 .
- the error protection module 208 may provide error protection to encoded 2D video images and corresponding depth information, and/or encoded audio data for transmission to a video rendering device that may be communicatively coupled to the monoscopic camera 102 .
- the video encoder/decoder 210 may comprise suitable logic, circuitry, interfaces, and/or code that may be operable to process captured color, brightness, and/or depth data to make the data suitable for conveyance to, for example, the display 230 and/or to one or more external devices via the I/O block 228 .
- the video encoder/decoder 210 may convert between, for example, raw RGB or YCrCb pixel values and an MPEG encoding.
- the video encoder/decoder 210 may be implemented in the DSP 206 .
- the audio encoder/decoder 212 may comprise suitable logic, circuitry, interfaces, and/or code that may be operable to process captured audio data to make the data suitable for conveyance to, for example, the speaker 214 and/or to one or more external devices via the I/O block 228 .
- the audio encoder/decoder 212 may convert between, for example, raw pulse-code-modulated audio and an MP3 or AAC encoding.
- the audio encoder/decoder 212 may be implemented in the DSP 206 .
- the speaker 214 may comprise suitable logic, circuitry, interfaces, and/or code operable to convert electrical signals into acoustic waves.
- the speaker 214 may be operable to amplify, equalize, and/or otherwise process audio signals generated in the camera 102 .
- the directionality of the speaker 214 may be controlled electronically and/or mechanically.
- the microphone 216 may comprise a transducer and associated logic, circuitry, interfaces, and/or code operable to convert acoustic waves into electrical signals.
- the microphone 216 may be operable to amplify, equalize, and/or otherwise process captured audio signals.
- the directionality of the microphone 216 may be controlled electronically and/or mechanically.
- the optics module 218 may comprise various optical devices for conditioning and directing EM waves received via the lens 101 c.
- the optics module 218 may direct EM waves in the visible spectrum to the image sensor 222 and direct EM waves in the infrared spectrum to the depth sensor 226 .
- the optics module 218 may comprise, for example, one or more lenses, prisms, color filters, and/or mirrors.
- Each image sensor 222 may comprise suitable logic, circuitry, interfaces, and/or code that may be operable to convert optical signals to electrical signals.
- Each image sensor 222 may comprise, for example, a charge coupled device (CCD) image sensor or a complementary metal oxide semiconductor (CMOS) image sensor.
- Each image sensor 222 may capture 2D brightness and/or color information.
- Each color sensor 224 may comprise suitable logic, circuitry, interfaces, and/or code that may be operable to detect color and/or generate color-related information based thereon in images captured via the camera 102 .
- Each depth sensor 226 may comprise suitable logic, circuitry, interfaces, and/or code that may be operable to detect EM waves in the infrared spectrum and determine distance to objects based on reflected infrared waves. In an embodiment of the invention, distance may be determined based on time-of-flight of infrared waves transmitted by the emitter 220 and reflected back to the depth sensor 226 . In an embodiment of the invention, depth may be determined based on distortion of a captured grid.
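- The time-of-flight principle described above reduces to a one-line computation: the range is half the round-trip time multiplied by the speed of light. The function name below is illustrative:

```python
SPEED_OF_LIGHT_M_S = 299_792_458.0

def tof_distance_m(round_trip_time_s):
    """Time-of-flight ranging: the emitter sends an infrared pulse, the
    depth sensor receives the reflection, and distance is half the
    round-trip time multiplied by the speed of light."""
    return SPEED_OF_LIGHT_M_S * round_trip_time_s / 2.0
```

For example, a round trip of about 6.67 nanoseconds corresponds to an object roughly one meter away.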
- the input/output module 228 may comprise suitable logic, circuitry, interfaces, and/or code that may enable the camera 102 to interface with other devices in accordance with one or more standards such as USB, PCI-X, IEEE 1394, HDMI, DisplayPort, and/or analog audio and/or analog video standards.
- the I/O module 228 may be operable to send and receive signals from the controls 232 , output video to the display 230 , output audio to the speaker 214 , handle audio input from the microphone 216 , read from and write to cassettes, flash cards, hard disk drives, solid state drives, or other external memory attached to the camera 102 , and/or output audio and/or video via one or more ports such as a IEEE 1394 or USB port.
- the digital display 230 may comprise suitable logic, circuitry, interfaces, and/or code that may enable displaying video and/or images, captured, generated, and/or processed via the monoscopic camera 102 .
- the digital display 230 may comprise, for example, an LCD, LED, OLED, or other digital display technology on which images recorded via the camera 102 may be displayed.
- the digital display 230 may be operable to display 3D images.
- the controls 232 may comprise suitable logic, circuitry, interfaces, and/or code that may enable a user to interact with the camera 102 , for example via controls for controlling recording and playback.
- the controls 232 may enable a user to select whether the camera 102 records and/or outputs video in 2D or 3D modes.
- the optical viewfinder 234 may enable a user to see what the lens 101 c “sees,” that is, what is “in frame.”
- the camera 102 may comprise an image sensor pipeline (ISP) 250 .
- the ISP 250 may be implemented as a dedicated component, and/or as part of another component of the camera 102 , such as the processor 204 for example.
- the ISP 250 may comprise suitable circuitry, logic and/or code that may be operable to process imaging (or video) data, which may be received from one or more imaging related sensors, such as the image sensor(s) 222 , the color sensor(s) 224 , and/or the depth sensor(s) 226 .
- the ISP 250 may perform and/or support various video processing operations and/or techniques comprising, for example, filtering, demosaicing, lens shading correction, defective pixel correction, white balancing, image compensation, Bayer interpolation, color transformation, and/or post filtering.
- the ISP 250 may provide accelerated processing of imaging data.
- the accelerated processing may be achieved by use of a pipeline-based architecture, with the ISP 250 comprising a programmable pipeline structure for example.
- the ISP 250 may comprise, for example, multiple sensor processing stages, implemented in hardware, software, firmware, and/or any combination thereof. Exemplary processing stages may comprise demosaicing, geometric distortion correction, color conversion, denoising, and/or sharpening, for example.
- processing of image data may be performed on variable sized tiles, reducing the memory requirements of the ISP 250 processes.
- the camera 102 may be utilized to generate 3D video and/or images based on captured 2D video data and corresponding depth information.
- the depth sensor(s) 226 may capture depth information and the image sensor(s) 222 may capture 2D image information.
- the image sensor(s) 222 may capture only brightness information for rendering black and white 3D video.
- the depth information may, for example, be stored and/or communicated as metadata and/or an additional layer of information associated with 2D image information.
- a data structure in which the 2D image information is stored may comprise one or more fields and/or indications that indicate depth data associated with the stored 2D image information is available for rendering a 3D image.
- packets in which the 2D image information is communicated may comprise one or more fields and/or indications that indicate depth data associated with the communicated 2D image information is available for rendering a 3D image.
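- One way to picture the indication fields described above is a frame record that carries the 2D image data plus a flag advertising that a depth layer is available. The layout below is purely illustrative, not the patent's actual storage or packet format:

```python
from dataclasses import dataclass

@dataclass
class FrameRecord:
    """Illustrative record: 2D image information stored together with
    an indication that associated depth data is available for 3D."""
    width: int
    height: int
    pixels: bytes                 # packed 2D image information
    depth_available: bool = False # the indication field described above
    depth_map: bytes = b""        # depth layer, present when flag is set

def can_render_3d(record: FrameRecord) -> bool:
    # A renderer checks the indication before attempting 3D composition.
    return record.depth_available and len(record.depth_map) > 0
```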
- the camera 102 may read the 2D image information out of memory, and process it to generate a 2D video stream to the display and/or the I/O block.
- For outputting 3D video, the camera 102 may: (1) read the 2D image information from memory; (2) determine, based on an indication stored in memory with the 2D image information, that associated depth information is available; (3) read the depth information from memory; and (4) process the 2D image information and depth information to generate a 3D video stream.
- Processing of the 2D image information and depth information may comprise synchronizing the depth information to the 2D image information. Processing of the 2D image information and depth information may comprise scaling and/or interpolating either or both of the 2D image information and the associated depth information.
- the resolution of the depth sensor 226 may be less than the resolution of the image sensor 222 . Accordingly, the camera 102 may be operable to interpolate between pixels of depth information to generate depth information for each pixel, or group of pixels, of 2D image information.
- the frame rate of the depth sensor 226 may be less than the frame rate of the image sensor 222 . Accordingly, the camera 102 may be operable to interpolate between frames of depth information to generate a frame of depth information for each frame of 2D image information.
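- The spatial and temporal interpolation described in the two paragraphs above can be sketched as follows. Nearest-neighbor upsampling and linear blending are assumptions chosen for brevity; an actual ISP might use bilinear or edge-aware filtering:

```python
import numpy as np

def upsample_depth(depth, image_shape):
    """Nearest-neighbor spatial interpolation of a lower-resolution
    depth map up to the image sensor's resolution."""
    dh, dw = depth.shape
    ih, iw = image_shape
    ys = np.arange(ih) * dh // ih  # map each image row to a depth row
    xs = np.arange(iw) * dw // iw  # map each image column to a depth column
    return depth[np.ix_(ys, xs)]

def interpolate_depth_frames(prev_frame, next_frame, t):
    """Temporal interpolation between two captured depth frames, for an
    image frame at fractional time t in [0, 1] between them."""
    return (1.0 - t) * prev_frame + t * next_frame
```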
- the monoscopic camera 102 may be operable to detect and/or track objects in 2D video captured via the image sensor(s) 222 , based on corresponding depth information captured via the depth sensor(s) 226 for example, and/or may adaptively process video information associated with detected and/or tracked objects, to enhance corresponding video images.
- the monoscopic camera 102 may be operable to utilize, for example, one or more recognition algorithms that may enable determining presence of certain objects in scenes captured via the image sensor 222 for example.
- the object recognition algorithms may detect objects based on, for example, determination of type of object, and/or preconfigured characteristics associated therewith.
- the object recognition algorithms utilized in the monoscopic video camera 102 may detect such objects as persons, or parts thereof, such as the face or hands for example. Associated characteristics may comprise, for example, the size of the object or specific parts thereof, such as the face of a person, and/or color related information, such as permissible color hues and/or shades.
- the monoscopic camera 102 may continue to track the object in successive image frames.
- the object tracking may be configured and/or controlled using depth information. This may enable tracking an object as it moves, for example, closer to or further away from the monoscopic camera 102 .
- the monoscopic camera 102 may be operable to adjust characteristics associated with detected objects, such as size and/or relative position to other objects in the scenes, as the tracked objects move within the captured scene.
- Adaptive processing of video information associated with detected and/or tracked objects may comprise configuring and/or modifying control parameters and/or criteria pertinent to video processing operations, and/or setting and/or adjusting video information (e.g. color, brightness, and/or shade) associated with the detected objects.
- Configuration of processing of object image related information may be performed based on preset criteria and/or parameters associated with identified types and/or characteristics of the objects.
- video information associated with certain objects may be generated and/or modified, based on preconfigured criteria for example. For example, in instances where an object is identified as a face, color related video information may be adjusted based on predetermined criteria associated with acceptable human color hues for example.
- depth information may also be utilized in conjunction with object detection and/or tracking during processing of object image related information.
- the relative size of an object identified as a face may be adjusted based on depth information, to ensure that the size of the face is appropriate, for example within an acceptable range and/or of acceptable quality, based on the identified relative depth of the face in relation to the camera 102 .
- FIG. 2B is a block diagram illustrating an exemplary image sensor pipeline (ISP), which may be utilized in accordance with an embodiment of the invention. Referring to FIG. 2B , there is shown the image sensor pipeline (ISP) 250 of FIG. 2A .
- the ISP 250 may comprise suitable circuitry, logic and/or code that may be operable to perform various functions associated with processing of imaging data, which may be received from one or more imaging related sensors, in an accelerated manner, by use of a pipeline-based architecture for example.
- the ISP 250 may be utilized to enable, for example, pipelined color processing of captured images.
- the ISP 250 may be configured as a programmable pipeline structure, comprising a plurality of functions 250 A - 250 N , each of which is associated with handling and/or performing a particular image processing function.
- the ISP 250 may enable accelerated image processing by splitting the processing of data associated with each particular image into stages, to enable concurrently handling multiple images with each of the plurality of functions 250 A - 250 N being utilized to, for example, perform the corresponding processing function on different images.
- the ISP 250 may enable handling multiple images since processing of each image may be at a different stage at any given point. This may enable implementing various aspects of the invention by adjusting different stages of pipelined functions, without affecting the overall processing duration, since some of the operations may be done while other stages are being performed.
- Data may be moved from any point of the ISP 250 and processed in software and the resulting software processed data may be put into any desired point of the ISP 250 for processing in hardware.
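- The staged concurrency described above can be modeled in software: each stage is a function, and while one image occupies a later stage, the next image can occupy an earlier one. The clock-step model below is a sketch of the pipelining concept, not the ISP's actual hardware behavior:

```python
def run_pipeline(stages, frames):
    """Software model of a processing pipeline: at each clock step every
    occupied stage hands its result to the next stage, so several frames
    are in flight at once. Returns fully processed frames plus a record
    of which value sat in which stage at each step."""
    n_stages = len(stages)
    results = []
    slots = [None] * n_stages  # slot i holds the frame currently in stage i
    schedule = []
    pending = list(frames)
    step = 0
    while pending or any(s is not None for s in slots):
        if slots[-1] is not None:
            results.append(slots[-1])      # last stage retires its frame
        for i in range(n_stages - 1, 0, -1):
            # Each stage consumes the previous stage's output.
            slots[i] = stages[i](slots[i - 1]) if slots[i - 1] is not None else None
        slots[0] = stages[0](pending.pop(0)) if pending else None
        schedule.append((step, list(slots)))
        step += 1
    return results, schedule
```

With two stages and three frames, the model finishes in five steps rather than six, showing the overlap that makes the pipeline faster than strictly sequential processing.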
- Exemplary processing functions handled and/or implemented by the ISP 250 may comprise, for example, auto-focus function 250 A , flash-metering function 250 B , auto-white-balance (AWB) function 250 C , image segmentation function 250 D , and/or image scaling function 250 N .
- the auto-focus function 250 A may comprise performing focusing operations automatically.
- focusing may comprise selecting one or more portions of an image to be focal points during image processing, in which light from these portions, and/or objects therein, is optimally captured and/or the corresponding image information is consequently very accurate.
- Auto-focus operations may comprise use and/or control of image sensors to enable selecting focus points, and/or to determine correct focusing associated therewith.
- auto-focus may be active, in which distance to the focus points (or objects) may be determined, and subsequently correct focusing may be effectuated, by controlling and/or adjusting available image sensors, using such techniques as light metering for example.
- Auto-focus may also be passive, in which focus point selection and/or corresponding focusing adjustment may be performed based on passive analysis of the image, and/or associated image information, after the image is captured.
- the flash-metering function 250 B may comprise controlling flashing operations, such as of the camera 102 , based on image sensory information.
- flash-metering may comprise determining and/or measuring levels of light or brightness in a scene with which an image is associated, and selecting and/or controlling based thereon the amount of light emitted by a flash component coupled to and/or integrated into the camera.
- the light measuring may be performed using one or more sensors, and/or via the camera's lenses.
- the AWB function 250 C may comprise performing white balance operations automatically.
- white balancing may comprise adjusting color intensities of portions of an image associated with the white color, to ensure that these portions may be rendered correctly, i.e. with a more natural feel, based on identification of the objects associated with these white areas and/or the settings in which the image was captured.
- the white color may typically be a function of equal, or near-equal, mixing of the three primary colors (red, green, and blue). Accordingly, during color balancing, the contribution or parameters associated with each of these three colors may be adjusted to adjust the whiteness of the white color region.
- white balancing may comprise adjusting image portions associated with snow such that the white color associated with the snow may be rendered with a degree of blueness.
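- One common way to equalize the red, green, and blue contributions described above is the gray-world heuristic: scale each channel so the three channel means match. This is a widely used AWB sketch, not necessarily the algorithm used by the ISP 250:

```python
import numpy as np

def gray_world_awb(rgb):
    """Gray-world automatic white balance sketch: compute per-channel
    means over the image and apply per-channel gains so that all three
    means converge, neutralizing a color cast in white/gray regions."""
    rgb = rgb.astype(np.float64)
    means = rgb.reshape(-1, 3).mean(axis=0)           # mean R, G, B
    gains = means.mean() / np.maximum(means, 1e-9)    # pull each channel to the gray level
    return np.clip(rgb * gains, 0, 255)               # gains broadcast over pixels
```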
- the image segmentation function 250 D may comprise partitioning an image, whose information is processed, into multiple segments, each comprising a plurality of contiguous and/or non-contiguous pixels, based on the presence of one or more common characteristics among the pixels in each segment.
- the common characteristics may be determined based on predetermined ranges associated with one or more items of video information, such as intensity and/or color.
- Image segmentation may be utilized to simplify and/or change the processing of image data, by configuring analysis and/or processing of image data in accordance with the common characteristics associated with each segment.
- Image segmentation may also be utilized to enable and/or enhance locating objects and boundaries, such as lines or curves, in images.
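- A minimal form of the segmentation described above partitions pixels by intensity range against predetermined thresholds, assigning one label per range regardless of contiguity. The function and threshold scheme below are illustrative assumptions:

```python
import numpy as np

def segment_by_intensity(gray, thresholds):
    """Partition a grayscale image into segments sharing a common
    characteristic, here an intensity range defined by predetermined
    thresholds. Pixels in the same range receive the same label,
    whether or not they are contiguous."""
    labels = np.zeros(gray.shape, dtype=np.int32)
    for i, t in enumerate(sorted(thresholds), start=1):
        labels[gray >= t] = i  # higher ranges overwrite lower ones
    return labels
```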
- the image scaling function 250 N may comprise resizing images, and/or portions thereof, to increase or decrease the image (or portion) size.
- image scaling may comprise and/or pertain to zooming operations, in which a portion of an image may be adjusted to fit a larger or smaller portion of the screen.
- Image scaling may affect various characteristics of the images, such as smoothness and/or sharpness. In this regard, increasing the size of an image may reduce the smoothness and/or sharpness of the image, while decreasing the size of an image may enhance its smoothness and/or sharpness.
- Image scaling may comprise subsampling or upsampling an image, or a portion thereof, based on whether the image (or portion) is being scaled down or up.
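- A minimal nearest-neighbor sketch of the subsampling/upsampling just described (one common resampling choice among many; real ISPs typically apply filtered interpolation):

```python
def scale_row(row, new_len):
    """Nearest-neighbor resample of one scanline: subsample when
    shrinking, repeat samples when enlarging."""
    old = len(row)
    return [row[min(old - 1, int(i * old / new_len))] for i in range(new_len)]

line = [10, 20, 30, 40]
up = scale_row(line, 8)    # scaling up: each sample repeated
down = scale_row(line, 2)  # scaling down: subsampled
```

The repeated samples in the enlarged row are what reduce perceived smoothness when scaling up, as noted above.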
- the depth information generated and/or captured via the depth sensor 226 may be utilized to enhance and/or improve image processing performed in the camera 102 .
- the depth information may be utilized to generate and/or adjust control information utilized in controlling and/or managing operations of the ISP 250 .
- the control information may be utilized to adjust and/or control various stages and/or functions of the ISP 250 , such as the auto-focus function 250 A , the flash-metering function 250 B , the AWB function 250 C , the image segmentation function 250 D , and/or the image scaling function 250 N .
- the auto-focus function 250 A may be adjusted, based on depth information, to enable selecting focus points, and/or configuring focusing operations relating thereto, adaptively at different depths relative to the camera 102 .
- algorithms utilized during AWB operations may be adjusted to enable applying white balancing adaptively at different depths relative to the camera 102 .
- algorithms and/or parameters utilized in scaling operations may be adjusted to enable performing scaling operations adaptively at different depths relative to the camera 102 .
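- One way to picture the depth-adaptive control described above is a per-depth-plane parameter table consulted by the ISP stages; the parameter names and values below are hypothetical:

```python
# Hypothetical per-depth-plane tuning: nearer planes get stronger
# sharpening for scaling and wider white-balance gain limits.
ISP_PARAMS_BY_DEPTH = [
    {"max_depth_m": 1.0, "sharpen": 0.8, "awb_gain_limit": 1.4},
    {"max_depth_m": 3.0, "sharpen": 0.5, "awb_gain_limit": 1.2},
    {"max_depth_m": float("inf"), "sharpen": 0.2, "awb_gain_limit": 1.1},
]

def params_for_depth(depth_m):
    """Select stage parameters for a region at the given depth."""
    for entry in ISP_PARAMS_BY_DEPTH:
        if depth_m <= entry["max_depth_m"]:
            return entry
    return ISP_PARAMS_BY_DEPTH[-1]
```

Each ISP function (auto-focus, AWB, scaling) could consult such a table per region, applying different settings at different depths relative to the camera.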
- Operations of the ISP 250 may also be controlled and/or adjusted based on detection and/or tracking of objects.
- the functions 250 A - 260 N of the ISP 250 may be modified and/or configured to support and/or enable performing various operations associated with detecting and/or tracking objects.
- the auto-focus function 250 A may be adjusted to focus on particular types of objects and/or to do so at particular depths, such as by incorporating depth information related parameters into this function.
- the flash-metering function 250 B may also be adjusted such that flash operations may be tailored to enhance lighting of detected and/or tracked objects.
- the image segmentation function 250 D may be adjusted to enhance locating objects and shapes associated therewith, such as lines or curves with particular characteristics, during object detection and/or tracking operations.
- At least some of the functions 250 A - 260 N of the ISP 250 may also be modified and/or configured to enhance and/or adjust various image processing operations associated with detected and/or tracked objects.
- the AWB function 250 C may be adjusted to perform white balancing adaptively on regions associated with detected and/or tracked objects.
- FIG. 3 illustrates processing of depth information and 2D image information to generate a 3D image, which may be utilized in accordance with an embodiment of the invention.
- a frame 330 of depth information which may be captured by the depth sensor(s) 226
- a frame 334 of 2D image information captured by the image sensors 222 .
- the depth information 330 and the 2D image information 334 may be processed to generate a frame 336 associated with a corresponding 3D image.
- plane 332 indicated by a dashed line, which is merely for illustration purposes to indicate depth on the two dimensional drawing sheets.
- the line weight is utilized to indicate depth—heavier lines being closer to the viewer.
- the object 338 is farthest from the camera 102
- the object 342 is closest to the camera 102
- the object 340 is at an intermediate distance.
- depth information such as in frame 330 for example, may be utilized to provide depth related adjustments to corresponding 2D video information, such as in frame 334 for example.
- the depth information may be mapped to a grayscale, or pseudo-grayscale, image for display to the viewer. Such mapping may be performed by the DSP 206 for example.
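- The grayscale mapping mentioned above can be sketched as a simple inverse-linear mapping, nearer samples rendered brighter; the convention and the 8-bit scale are illustrative assumptions:

```python
def depth_to_grayscale(depth_map, max_depth):
    """Map raw depth samples to 8-bit gray levels, nearer -> brighter."""
    return [[max(0, min(255, int(255 * (1 - d / max_depth))))
             for d in row] for row in depth_map]

# Depths in meters: 0 m (touching the camera) maps to white,
# max_depth maps to black.
gray = depth_to_grayscale([[0.0, 2.0], [4.0, 4.0]], max_depth=4.0)
```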
- the image associated with the frame 334 may be a conventional 2D image.
- a viewer of the frame 334 , for example on the display 120 or on a dedicated display device connected to the camera 102 via the I/O module 228 , may perceive the same distance to each of the objects 338 , 340 , and 342 . That is, each of the objects 338 , 340 , and 342 may appear to reside on the plane 332 .
- 2D video information and corresponding depth information may be processed to enable generating 3D images, associated with frame 336 for example, which may provide depth perception when viewed.
- the depth information of frame 330 may be utilized to adaptively adjust video information associated with each of the objects in the 2D image of frame 334 to create a perception of different depths for the objects contained therein.
- the viewer of frame 336 , on the display 120 or on a dedicated display device connected to the camera 102 via the I/O module 228 for example, may perceive the object 338 as being furthest from the viewer, the object 342 as being closest to the viewer, and the object 340 as being at an intermediate distance.
- the object 338 may appear to be behind the reference plane
- the object 340 may appear to be on the reference plane
- the object 342 may appear to be in front of the reference plane.
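- The in-front-of / on / behind-the-reference-plane effect described above can be sketched as a per-pixel disparity shift derived from depth. This is a simplified illustration (no hole filling, arbitrary gain), not the patent's specific method:

```python
def shift_row(row, depth_row, ref_depth, gain=2.0):
    """Build one scanline of a synthetic second view: pixels nearer
    than ref_depth shift one way, farther pixels the other, creating
    the perception of depth relative to the reference plane."""
    out = [0] * len(row)  # 0 marks unfilled (occluded) samples
    for x, (value, d) in enumerate(zip(row, depth_row)):
        disparity = int(round(gain * (ref_depth - d) / ref_depth))
        nx = x + disparity
        if 0 <= nx < len(row):
            out[nx] = value
    return out

row = [1, 2, 3, 4, 5]
same = shift_row(row, [2.0] * 5, ref_depth=2.0)  # all pixels on the plane
near = shift_row(row, [1.0, 2.0, 2.0, 2.0, 2.0], ref_depth=2.0)
```

Pixels on the reference plane are unmoved; the nearer first pixel is displaced, leaving a hole that a full renderer would have to fill.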
- depth information may be utilized to enable detecting and/or tracking objects in captured 2D images and/or video, and/or may be utilized to enable adaptively processing and/or generating video data associated with detected and/or tracked objects.
- object detection and/or tracking may be performed at different depths, corresponding to the depth of each of the objects 338 , 340 , and 342 , to enable detecting the presence of each of the objects at the appropriate depth, and to continue tracking these objects thereafter.
- the object detection may comprise use of recognition algorithms, which may enable determining presence of certain objects.
- object recognition algorithms may detect objects based on, for example, determination of the type of object, by classifying the object into one of a plurality of preconfigured categories for example, and/or based on characteristics associated therewith.
- object recognition algorithms may enable detecting objects such as persons, or parts thereof such as face or hands for example.
- FIG. 4A is a diagram that illustrates exemplary detection and/or tracking of objects via a monoscopic camera based on Z-depth information, which may be utilized in accordance with an embodiment of the invention.
- the monoscopic camera 102 and a physical object 402 , whose image may be captured via the monoscopic camera 102 .
- the object 402 may comprise, for example, a person and/or a particular part of a person such as face or hands for example.
- the monoscopic camera 102 may be utilized to capture 2D video and to generate corresponding depth information, substantially as described with regard to FIGS. 2 and 3 , for example.
- the monoscopic camera 102 may be operable to detect and/or track objects in captured video.
- the monoscopic camera 102 may be operable to detect presence of video information corresponding to the object 402 in captured 2D video.
- depth information generated via the monoscopic camera 102 may enable determining a plurality of depth planes, for example, corresponding to different distances from the monoscopic camera 102 .
- the monoscopic camera 102 may be operable to adaptively perform object detection and/or tracking, and/or subsequent processing of video information associated with detected and/or tracked objects.
- the monoscopic camera 102 may incorporate various object recognition algorithms to enable detecting certain objects.
- the object recognition may be performed based on determination of type, category, and/or characteristics of objects.
- the monoscopic camera 102 may be operable to detect and/or track objects, such as object 402 , based on capturing of 2D video and corresponding depth information.
- the monoscopic camera 102 may then adaptively process video information associated with detected and/or tracked object 402 , to enhance corresponding video images by ensuring that perception of object 402 in images captured by the camera 102 may be acceptable and/or normal to viewers.
- object detection may be performed based on and/or utilizing one or more recognition algorithms, which may enable determining presence of certain objects in scenes captured via the camera 102 .
- the object recognition algorithms may detect objects based on, for example, determination of type of object, and/or preconfigured or predetermined characteristics associated therewith.
- the object recognition algorithms may enable determining whether the object 402 may comprise a person, or a part thereof such as a face.
- object tracking may be configured and/or controlled using depth information. This may enable tracking an object as it moves, for example, closer to or further away from the monoscopic camera 102 .
- the camera 102 may be operable to track movement of the object 402 , in subsequent images, as it moves in the scene captured by the camera 102 .
- the object movement may comprise moving to various depths relative to the camera 102 , such as from depth D 1 , to depth D 2 , and then to depth D 3 . Accordingly, as the object 402 moves to the different depths, processing of video information associated with the object 402 may be continually modified and/or configured to ensure that the video information may remain within acceptable ranges based on the identified type and/or characteristics associated with the object 402 .
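- A minimal sketch of depth-plane quantization and nearest-neighbor association for the tracking described above; the plane boundaries and the (x, y, depth) representation are hypothetical:

```python
def depth_plane(d, edges=(1.5, 3.0)):
    """Quantize a depth reading into plane D1, D2, or D3."""
    if d <= edges[0]:
        return "D1"
    if d <= edges[1]:
        return "D2"
    return "D3"

def track(prev, detections):
    """Associate a tracked object with the nearest detection in
    (x, y, depth) space -- a minimal frame-to-frame tracker."""
    def sq_dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    return min(detections, key=lambda det: sq_dist(prev, det))

prev = (10, 10, 1.0)                              # object at depth D1
current = track(prev, [(11, 10, 1.8), (40, 5, 1.0)])
```

Here the object is re-associated across frames even as it crosses from plane D1 to plane D2, which is when per-depth processing parameters would be updated.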
- FIG. 4B is a diagram that illustrates exemplary selective processing of objects via a monoscopic camera subsequent to detection and/or tracking based on Z-depth information, which may be utilized in accordance with an embodiment of the invention. Referring to FIG. 4B , there is shown a plurality of successive frames 420 a - 420 c , which may be captured and/or generated via the monoscopic camera 102 .
- Each of the plurality of frames 420 a - 420 c may comprise video information associated with various elements present in a scene at which the monoscopic camera may be directed while generating and/or capturing the 2D video.
- a particular object 422 which may comprise a person for example, is shown in the plurality of frames 420 a - 420 c.
- the frames 420 a - 420 c may comprise image information associated with object 422 , which may correspond to a person for example.
- the image information associated with object 422 may be adjusted based on depth information captured in association with, and corresponding to captured 2D video.
- processing of the video information associated with the object 422 may be performed adaptively. For example, processing of the video information associated with the object 422 may be configured to ensure that the size of the object 422 may be adjusted, while remaining within an acceptable range based on the determined type and depth of the object 422 , as shown by changes in the size of the object 422 in frames 420 a - 420 c.
- object detection and/or processing of video information associated with objects may be adaptively configured based on scene detection.
- scene detection may comprise determining various characteristics associated with scenes in images captured by the monoscopic video camera.
- Exemplary scene characteristics may comprise type of setting in the scenes, such as rural vs. urban; type of objects present and/or anticipated in the scene, such as trees and/or buildings; and/or chronological information relating to the scene, such as season and/or time of day.
- the characteristics of the scene may be utilized to control and/or adjust object detection.
- object recognition may be configured to enable detecting the shape of a person in shaded areas, such as in the shadow of trees for example.
- processing of video information associated with objects may be configured based on region surrounding the object.
- shade and/or color related information associated with object 422 may be set and/or modified based on determination of the type of the settings around the object 422 .
- Video information associated with object 422 may be adjusted based on the surrounding regions 424 a - 424 c in frames 420 a - 420 c , respectively.
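- For illustration, adjusting an object's video information against its surrounding region might look like the following brightness-matching sketch; the blending strength and function names are hypothetical:

```python
def match_object_to_surround(obj_pixels, surround_pixels, strength=0.5):
    """Nudge the object's brightness toward the surrounding region's
    mean so the object remains plausible as the scene around it
    changes."""
    obj_mean = sum(obj_pixels) / len(obj_pixels)
    sur_mean = sum(surround_pixels) / len(surround_pixels)
    offset = strength * (sur_mean - obj_mean)
    return [max(0, min(255, int(p + offset))) for p in obj_pixels]

# Object darker than its bright surround region: nudged brighter.
adjusted = match_object_to_surround([100, 110], [180, 200, 220])
```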
- FIG. 5 is a flow chart that illustrates exemplary steps for utilizing an image sensor pipeline (ISP) for 3D imaging processing utilizing Z-depth information, in accordance with an embodiment of the invention.
- Referring to FIG. 5 , there is shown a flow chart 500 comprising a plurality of exemplary steps that may be performed to enable utilizing an image sensor pipeline (ISP) for 3D imaging processing utilizing Z-depth information during video processing.
- 2D video data may be captured via image sensor(s) in a monoscopic camera.
- depth information may be captured via depth sensor(s) in the monoscopic camera.
- the captured depth information may correspond to the captured 2D video.
- one or more objects may be detected and/or tracked in scene(s) in the captured video, based on captured depth information for example.
- at least some of the object detection and/or tracking operations may be performed via various stages and/or functions in the ISP 250 .
- at least some stages and/or functions of the ISP 250 may be modified and/or adjusted to support object detection and/or tracking operations.
- In step 508 , selective and/or adaptive processing of video information associated with detected and/or tracked objects may be performed based on depth information and/or local portion(s) of the captured video data surrounding the objects.
- at least some of the adaptive and/or selective video processing associated with detected and/or tracked objects may be performed by modifying and/or adjusting settings of various stages and/or functions in the ISP 250 .
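- The exemplary flow above (only step 508 is numbered in the text) can be sketched as a driver loop; the callable names are hypothetical placeholders for the camera's actual capture, detection, and ISP stages:

```python
def process_frame(capture_2d, capture_depth, detect_objects, process_object):
    """One pass of the exemplary flow: capture 2D video and depth,
    detect/track objects using the depth, then adaptively process
    the video information for each object (step 508)."""
    frame = capture_2d()                    # capture 2D video data
    depth = capture_depth()                 # capture corresponding depth
    objects = detect_objects(frame, depth)  # detect/track via depth
    return [process_object(obj, frame, depth) for obj in objects]

# Stub sensors/stages to show the control flow only.
result = process_frame(
    lambda: "frame",
    lambda: "depth",
    lambda f, d: ["obj-a", "obj-b"],
    lambda o, f, d: (o, d),
)
```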
- Various embodiments of the invention may comprise a method and system for utilizing an image sensor pipeline (ISP) for 3D imaging processing utilizing Z-depth information.
- the monoscopic video camera 102 may be utilized to detect and/or track, based on sensory information captured via one or more image sensors 222 , color sensors 224 , and/or depth sensors 226 , objects at varying depths, and may adaptively process video information associated with each of these objects, based on determined corresponding depths for these objects.
- the monoscopic video camera 102 may capture, via image sensors 222 and/or color sensors 224 , two-dimensional video, and may capture, via depth sensor 226 , corresponding depth information for the captured two-dimensional video.
- the monoscopic video camera may then detect and/or track objects in the captured two-dimensional video, based on the captured corresponding depth information for example. Furthermore, processing of image related information corresponding to the objects may be configured based on the detecting and/or tracking of the objects.
- object detection/tracking based configuration of image processing may comprise adjusting and/or controlling one or more functions 250 A - 250 N in the ISP 250 .
- the image processing of the monoscopic video camera 102 may be configured to provide adaptive and/or dynamic setting and/or modification of video information, such as color and/or brightness based on determined types of objects and/or based on determination of relative depth of each of the objects with respect to the monoscopic video camera 102 .
- Detection of objects may comprise determining the type and/or characteristics of each of the objects.
- identification of the type and/or characteristics of the objects may be performed based on one or more object recognition algorithms programmed into the monoscopic video camera 102 .
- Configuring the image processing of objects' image related information may be performed based on preset criteria and/or parameters associated with the identified types and/or characteristics of the objects.
- the monoscopic video camera 102 may be operable to synchronize the captured corresponding depth information to the captured two-dimensional video, to enable generating 3D perception for at least some images captured via the monoscopic video camera 102 .
- Other embodiments of the invention may provide a non-transitory computer readable medium and/or storage medium, and/or a non-transitory machine readable medium and/or storage medium, having stored thereon, a machine code and/or a computer program having at least one code section executable by a machine and/or a computer, thereby causing the machine and/or computer to perform the steps as described herein for utilizing an image sensor pipeline (ISP) for 3D imaging processing utilizing Z-depth information.
- the present invention may be realized in hardware, software, or a combination of hardware and software.
- the present invention may be realized in a centralized fashion in at least one computer system, or in a distributed fashion where different elements are spread across several interconnected computer systems. Any kind of computer system or other apparatus adapted for carrying out the methods described herein is suited.
- a typical combination of hardware and software may be a general-purpose computer system with a computer program that, when being loaded and executed, controls the computer system such that it carries out the methods described herein.
- the present invention may also be embedded in a computer program product, which comprises all the features enabling the implementation of the methods described herein, and which when loaded in a computer system is able to carry out these methods.
- Computer program in the present context means any expression, in any language, code or notation, of a set of instructions intended to cause a system having an information processing capability to perform a particular function either directly or after either or both of the following: a) conversion to another language, code or notation; b) reproduction in a different material form.
Abstract
Description
- This patent application makes reference to, claims priority to and claims benefit from U.S. Provisional Application Ser. No. 61/377,867, which was filed on Aug. 27, 2010.
- The above stated application is hereby incorporated herein by reference in its entirety.
- This application also makes reference to:
- U.S. application Ser. No. (Attorney Docket Number 23457U502) filed on even date herewith;
- U.S. application Ser. No. 13/077,900 (Attorney Docket Number 23461 U503) filed on Mar. 31, 2011;
- U.S. application Ser. No. 13/077,912 (Attorney Docket Number 23462U503) filed on Mar. 31, 2011;
- U.S. application Ser. No. 13/077,922 (Attorney Docket Number 23463U503) filed on Mar. 31, 2011;
- U.S. application Ser. No. 13/077,886 (Attorney Docket Number 23464U503) filed on Mar. 31, 2011;
- U.S. application Ser. No. 13/077,926 (Attorney Docket Number 23465U503) filed on Mar. 31, 2011;
- U.S. application Ser. No. 13/077,893 (Attorney Docket Number 23464U503) filed on Mar. 31, 2011;
- U.S. application Ser. No. 13/077,923 (Attorney Docket Number 23467U503) filed on Mar. 31, 2011;
- U.S. application Ser. No. ______ (Attorney Docket Number 23469U502) filed on even date herewith;
- U.S. Provisional Application Ser. No. 61/439,201 (Attorney Docket Number 23470U502) filed on Feb. 3, 2011;
- U.S. application Ser. No. ______ (Attorney Docket Number 23470U503) filed on even date herewith;
- U.S. Provisional Application Ser. No. 61/439,209 (Attorney Docket Number 23471 U502) filed on Feb. 3, 2011;
- U.S. application Ser. No. ______ (Attorney Docket Number 23471US03) filed on even date herewith;
- U.S. application Ser. No. 13/077,868 (Attorney Docket Number 23472U503) filed on Mar. 31, 2011;
- U.S. application Ser. No. 13/077,880 (Attorney Docket Number 23473U503) filed on Mar. 31, 2011;
- U.S. application Ser. No. 13/077,899 (Attorney Docket Number 23473U503) filed on Mar. 31, 2011; and
- U.S. application Ser. No. 13/077,930 (Attorney Docket Number 23475U503) filed on Mar. 31, 2011.
- Each of the above stated applications is hereby incorporated herein by reference in its entirety.
- [Not Applicable].
- [Not Applicable].
- Certain embodiments of the invention relate to video processing. More specifically, certain embodiments of the invention relate to a method and system for utilizing an image sensor pipeline (ISP) for 3D imaging processing utilizing Z-depth information.
- Support and demand for video systems that support three-dimensional (3D) video have increased rapidly in recent years. 3D video provides a whole new way to watch video, in the home and in theaters. However, 3D video systems are still in their infancy in many ways, and there is much room for improvement in terms of both cost and performance.
- Further limitations and disadvantages of conventional and traditional approaches will become apparent to one of skill in the art, through comparison of such systems with some aspects of the present invention as set forth in the remainder of the present application with reference to the drawings.
- A system and/or method is provided for utilizing an image sensor pipeline (ISP) for 3D imaging processing utilizing Z-depth information, substantially as shown in and/or described in connection with at least one of the figures, as set forth more completely in the claims.
- These and other advantages, aspects and novel features of the present invention, as well as details of an illustrated embodiment thereof, will be more fully understood from the following description and drawings.
-
FIG. 1 is a diagram that illustrates an exemplary monoscopic, or single-view, camera embodying aspects of the present invention, compared with a conventional stereoscopic camera. -
FIG. 2A is a diagram illustrating an exemplary monoscopic camera, which may be utilized in accordance with an embodiment of the invention. -
FIG. 2B is a block diagram illustrating an exemplary image sensor pipeline (ISP), which may be utilized in accordance with an embodiment of the invention. -
FIG. 3 is a diagram that illustrates exemplary processing of depth information and 2D image information to generate a 3D image, which may be utilized in accordance with an embodiment of the invention. -
FIG. 4A is a diagram that illustrates exemplary detection and/or tracking of objects via a monoscopic camera based on Z-depth information, which may be utilized in accordance with an embodiment of the invention. -
FIG. 4B is a diagram that illustrates exemplary selective processing of objects via a monoscopic camera subsequent to detection and/or tracking based on Z-depth information, which may be utilized in accordance with an embodiment of the invention. -
FIG. 5 is a flow chart that illustrates exemplary steps for utilizing an image sensor pipeline (ISP) for 3D imaging processing utilizing Z-depth information, in accordance with an embodiment of the invention. - Certain embodiments of the invention may be found in a method and system for utilizing an image sensor pipeline (ISP) for 3D imaging processing utilizing Z-depth information. In various embodiments of the invention, a monoscopic video camera may be utilized to detect and/or track objects at varying depths, and may adaptively process video information associated with each of these objects, based on determined corresponding depths for these objects. In this regard, the monoscopic video camera may capture, via at least one image sensor, two-dimensional video, and may capture, via at least one depth sensor, corresponding depth information for the captured two-dimensional video. The monoscopic video camera may then detect and/or track objects in the captured two-dimensional video, based on the captured corresponding depth information for example. Furthermore, processing of image related information corresponding to the objects may be configured based on the detecting and/or tracking of the objects. In this regard, video processing in the monoscopic video camera may be configured to provide adaptive and/or dynamic setting and/or modification of video information, such as color and/or brightness, based on determined types of objects and/or based on determination of relative depth of each of the objects with respect to the monoscopic video camera. Detection of objects may comprise determining type and/or characteristics of each of the objects. In this regard, identification of the type and/or characteristics of the objects may be performed based on one or more object recognition algorithms programmed into the monoscopic video camera.
- Configuration of processing of object image related information may be performed based on preset criteria and/or parameters associated with identified types and/or characteristics of the objects. The monoscopic video camera may also be operable to perform scene detection based on the two-dimensional video and/or the corresponding depth information, and object detection and/or tracking may be performed and/or adjusted based on scene detection. In this regard, the scene detection may comprise determining various characteristics associated with scenes in images captured by the monoscopic video camera. Exemplary scene characteristics may comprise type of setting in the scenes, such as rural vs. urban; type of objects present and/or anticipated in the scene, such as trees and/or buildings; and/or chronological information relating to the scene, such as season and/or time of day. The monoscopic video camera may be operable to synchronize the captured corresponding depth information to the captured two-dimensional video, to enable generating 3D perception for at least some images captured via the monoscopic video camera. Accordingly, the monoscopic video camera may compose three-dimensional video from captured two-dimensional video based on corresponding captured depth information. The monoscopic video camera may then render the composed three-dimensional video, autonomously—using integrated display in the monoscopic video camera, or via another display device, to which the 3D video may be communicated directly from the monoscopic video camera or indirectly via intermediate storage devices.
-
FIG. 1 is a diagram that compares a monoscopic camera embodying aspects of the present invention with a conventional stereoscopic camera. Referring to FIG. 1 , there is shown a stereoscopic camera 100 and a monoscopic camera 102 . - The
stereoscopic camera 100 may comprise suitable logic, circuitry, interfaces, and/or code that may enable capturing and/or generating stereoscopic video and/or images. In this regard, the stereoscopic camera 100 may comprise two lenses 101 a and 101 b . Each of the lenses 101 a and 101 b may capture images from a different viewpoint, and the images captured via the two lenses may be combined to generate a 3D image. In this regard, EM waves in the visible spectrum may be focused on a first one or more image sensors by the lens 101 a (and associated optics) and EM waves in the visible spectrum may be focused on a second one or more image sensors by the lens (and associated optics) 101 b . - The
monoscopic camera 102 may comprise suitable logic, circuitry, interfaces, and/or code that may enable capturing and/or generating monoscopic video and/or images. In this regard, the monoscopic camera 102 may capture images via a single viewpoint, corresponding to the lens 101 c for example. EM waves in the visible spectrum may be focused on one or more image sensors by the lens 101 c . The image sensor(s) may capture brightness and/or color information. The captured brightness and/or color information may be represented in any suitable color space such as the YCrCb color space or the RGB color space. In an exemplary aspect of the invention, the monoscopic camera 102 may be operable to generate 3D video and/or images from captured 2D video and/or images based on, for example, depth information. In this regard, the monoscopic camera 102 may also capture depth information via the lens 101 c (and associated optics). For example, the monoscopic camera 102 may comprise an infrared emitter, an infrared sensor, and associated circuitry operable to determine the distance to objects based on reflected infrared waves. Additional details of the monoscopic camera 102 are described below. - The monoscopic camera may comprise a
processor 124 , a memory 126 , and a sensory subsystem 128 . The processor 124 may comprise suitable logic, circuitry, interfaces, and/or code that may be operable to manage operation of various components of the camera and perform various computing and processing tasks. A single processor 124 is utilized only for illustration but the invention is not so limited. In an exemplary embodiment of the invention, various portions of the camera 102 depicted in FIG. 2A below may correspond to the processor 124 depicted in FIG. 1 . The memory 126 may comprise, for example, DRAM, SRAM, flash memory, a hard drive or other magnetic storage, or any other suitable memory devices. - The
sensory subsystem 128 may comprise a plurality of sensors which may be operable to capture and/or generate video information corresponding to images and/or video streams generated via the monoscopic camera 102 . The sensory subsystem 128 may also comprise suitable logic, circuitry, interfaces, and/or code that may be operable to manage and/or control the various sensors in the sensory subsystem 128 , and/or to handle at least some of the processing of information generated and/or captured thereby. In this regard, the sensory subsystem 128 may enable generating 2D video and corresponding depth and/or color information. The sensory subsystem 128 may comprise, for example, one or more image sensors, one or more depth sensors, and one or more color sensors. In this regard, exemplary sensors that may be integrated into the sensory subsystem 128 are described in more detail below with respect to FIG. 2A . -
FIG. 2A is a diagram illustrating an exemplary monoscopic camera, in accordance with an embodiment of the invention. Referring to FIG. 2A , there is shown the monoscopic camera 102 , which may comprise a memory 202 , a processor 204 , a digital signal processor (DSP) 206 , an error protection module 208 , a video encoder/decoder 210 , an audio encoder/decoder 212 , a speaker 214 , a microphone 216 , an optics module 218 , an emitter 220 , an input/output (I/O) module 228 , a digital display 230 , controls 232 , and an optical viewfinder 234 . The camera 102 may also comprise a plurality of sensors which may be operable to capture and/or generate video information corresponding to images and/or video streams. The camera 102 may comprise, for example, one or more image sensors 222 , one or more color sensors 224 , and one or more depth sensors 226 . The camera 102 may also comprise the lens 101 c , which may be operable to collect and sufficiently focus electromagnetic waves in the visible and infrared spectra to enable capturing images and/or video. - The
memory 202 may comprise suitable logic, circuitry, interfaces, and/or code that may enable temporary and/or permanent storage of data, and/or retrieval or fetching thereof. The memory 202 may comprise, for example, DRAM, SRAM, flash memory, a hard drive or other magnetic storage, or any other suitable memory devices. For example, SRAM may be utilized to store data utilized and/or generated by the processor 204 and a hard-drive and/or flash memory may be utilized to store recorded image data and depth data. The processor 204 may comprise suitable logic, circuitry, interfaces, and/or code that may be operable to coordinate operation of the various components of the camera 102 . The processor 204 may, for example, run an operating system of the camera 102 and control communication of information and signals between components of the camera 102 . The processor 204 may execute instructions stored in the memory 202 . The DSP 206 may comprise suitable logic, circuitry, interfaces, and/or code that may be operable to perform complex processing of captured image data, captured depth data, and captured audio data. The DSP 206 may be operable to, for example, compress and/or decompress the data, encode and/or decode the data, and/or filter the data to remove noise and/or otherwise improve perceived audio and/or video quality for a listener and/or viewer. - The
error protection module 208 may comprise suitable logic, circuitry, interfaces and/or code that may be operable to perform error protection functions for the monoscopic camera 102 . For example, the error protection module 208 may provide error protection to encoded 2D video images and corresponding depth information, and/or encoded audio data for transmission to a video rendering device that may be communicatively coupled to the monoscopic camera 102 . - The video encoder/
decoder 210 may comprise suitable logic, circuitry, interfaces, and/or code that may be operable to process captured color, brightness, and/or depth data to make the data suitable for conveyance to, for example, the display 230 and/or to one or more external devices via the I/O block 228. For example, the video encoder/decoder 210 may convert between raw RGB or YCrCb pixel values and an MPEG encoding. Although depicted as a separate block, the video encoder/decoder 210 may be implemented in the DSP 206. - The audio encoder/
decoder 212 may comprise suitable logic, circuitry, interfaces, and/or code that may be operable to process captured audio data to make the data suitable for conveyance to, for example, the speaker 214 and/or to one or more external devices via the I/O block 228. For example, the audio encoder/decoder 212 may convert between raw pulse-code-modulated audio and an MP3 or AAC encoding. Although depicted as a separate block, the audio encoder/decoder 212 may be implemented in the DSP 206. - The
speaker 214 may comprise suitable logic, circuitry, interfaces, and/or code operable to convert electrical signals into acoustic waves. The speaker 214 may be operable to amplify, equalize, and/or otherwise process audio signals based on audio information generated in the camera 102. The directionality of the speaker 214 may be controlled electronically and/or mechanically. - The
microphone 216 may comprise a transducer and associated logic, circuitry, interfaces, and/or code operable to convert acoustic waves into electrical signals. The microphone 216 may be operable to amplify, equalize, and/or otherwise process captured audio signals. The directionality of the microphone 216 may be controlled electronically and/or mechanically. - The
optics module 218 may comprise various optical devices for conditioning and directing EM waves received via the lens 101 c. The optics module 218 may direct EM waves in the visible spectrum to the image sensor 222 and direct EM waves in the infrared spectrum to the depth sensor 226. The optics module 218 may comprise, for example, one or more lenses, prisms, color filters, and/or mirrors. - Each
image sensor 222 may comprise suitable logic, circuitry, interfaces, and/or code that may be operable to convert optical signals to electrical signals. Each image sensor 222 may comprise, for example, a charge-coupled device (CCD) image sensor or a complementary metal-oxide-semiconductor (CMOS) image sensor. Each image sensor 222 may capture 2D brightness and/or color information. - Each
color sensor 224 may comprise suitable logic, circuitry, interfaces, and/or code that may be operable to detect color in images captured via the camera 102 and/or generate color-related information based thereon. - Each
depth sensor 226 may comprise suitable logic, circuitry, interfaces, and/or code that may be operable to detect EM waves in the infrared spectrum and determine distance to objects based on reflected infrared waves. In an embodiment of the invention, distance may be determined based on time-of-flight of infrared waves transmitted by the emitter 220 and reflected back to the depth sensor 226. In an embodiment of the invention, depth may be determined based on distortion of a captured grid. - The input/
output module 228 may comprise suitable logic, circuitry, interfaces, and/or code that may enable the camera 102 to interface with other devices in accordance with one or more standards such as USB, PCI-X, IEEE 1394, HDMI, DisplayPort, and/or analog audio and/or analog video standards. For example, the I/O module 228 may be operable to send and receive signals from the controls 232, output video to the display 230, output audio to the speaker 214, handle audio input from the microphone 216, read from and write to cassettes, flash cards, hard disk drives, solid state drives, or other external memory attached to the camera 102, and/or output audio and/or video via one or more ports such as an IEEE 1394 or USB port. - The
digital display 230 may comprise suitable logic, circuitry, interfaces, and/or code that may enable displaying video and/or images captured, generated, and/or processed via the monoscopic camera 102. In this regard, the digital display 230 may comprise, for example, an LCD, LED, OLED, or other digital display technology on which images recorded via the camera 102 may be displayed. In an embodiment of the invention, the digital display 230 may be operable to display 3D images. The controls 232 may comprise suitable logic, circuitry, interfaces, and/or code that may enable a user to interact with the camera 102, for example, controls for controlling recording and playback. In an embodiment of the invention, the controls 232 may enable a user to select whether the camera 102 records and/or outputs video in 2D or 3D modes. The optical viewfinder 234 may enable a user to see what the lens 101 c "sees," that is, what is "in frame." - In an exemplary aspect of the invention, the
camera 102 may comprise an image sensor pipeline (ISP) 250. In this regard, the ISP 250 may be implemented as a dedicated component and/or as part of another component of the camera 102, such as the processor 204 for example. The ISP 250 may comprise suitable circuitry, logic, and/or code that may be operable to process imaging (or video) data, which may be received from one or more imaging-related sensors, such as the image sensors 222, color sensors 224, and/or depth sensors 226. In this regard, the ISP 250 may perform and/or support various video processing operations and/or techniques comprising, for example, filtering, demosaicing, lens shading correction, defective pixel correction, white balance, image compensation, Bayer interpolation, color transformation, and/or post-filtering. The ISP 250 may provide accelerated processing of imaging data. In this regard, the accelerated processing may be achieved by use of a pipeline-based architecture, with the ISP 250 comprising a programmable pipeline structure for example. The ISP 250 may comprise, for example, multiple sensor processing stages, implemented in hardware, software, firmware, and/or any combination thereof. Exemplary processing stages may comprise demosaicing, geometric distortion correction, color conversion, denoising, and/or sharpening, for example. Furthermore, processing of image data may be performed on variable-sized tiles, reducing the memory requirements of the ISP 250 processes. - In operation, the
camera 102 may be utilized to generate 3D video and/or images based on captured 2D video data and corresponding depth information. For example, the depth sensor(s) 226 may capture depth information and the image sensor(s) 222 may capture 2D image information. Similarly, for a lower-end application of the camera 102, such as a security camera, the image sensor(s) 222 may capture only brightness information for rendering black-and-white 3D video. The depth information may, for example, be stored and/or communicated as metadata and/or an additional layer of information associated with the 2D image information. In this regard, a data structure in which the 2D image information is stored may comprise one or more fields and/or indications that indicate that depth data associated with the stored 2D image information is available for rendering a 3D image. Similarly, packets in which the 2D image information is communicated may comprise one or more fields and/or indications that indicate that depth data associated with the communicated 2D image information is available for rendering a 3D image. Thus, for outputting 2D video, the camera 102 may read the 2D image information out of memory and process it to generate a 2D video stream to the display and/or the I/O block. For outputting 3D video, the camera 102 may: (1) read the 2D image information from memory; (2) determine, based on an indication stored in memory with the 2D image information, that associated depth information is available; (3) read the depth information from memory; and (4) process the 2D image information and depth information to generate a 3D video stream. - Processing of the 2D image information and depth information may comprise synchronizing the depth information to the 2D image information. Processing of the 2D image information and depth information may comprise scaling and/or interpolating either or both of the 2D image information and the associated depth information. For example, the resolution of the
depth sensor 226 may be less than the resolution of the image sensor 222. Accordingly, the camera 102 may be operable to interpolate between pixels of depth information to generate depth information for each pixel, or group of pixels, of 2D image information. Similarly, the frame rate of the depth sensor 226 may be less than the frame rate of the image sensor 222. Accordingly, the camera 102 may be operable to interpolate between frames of depth information to generate a frame of depth information for each frame of 2D image information. - In various embodiments of the invention, the
monoscopic camera 102 may be operable to detect and/or track objects in 2D video captured via the image sensor(s) 222, based on corresponding depth information captured via the depth sensor(s) 226 for example, and/or may adaptively process video information associated with detected and/or tracked objects to enhance corresponding video images. In this regard, the monoscopic camera 102 may be operable to utilize, for example, one or more recognition algorithms that may enable determining the presence of certain objects in scenes captured via the image sensor 222 for example. The object recognition algorithms may detect objects based on, for example, determination of the type of object and/or preconfigured characteristics associated therewith. For example, the object recognition algorithms utilized in the monoscopic camera 102 may detect such objects as persons, or parts thereof, such as faces or hands for example. Associated characteristics may comprise, for example, the size of the object or specific parts thereof, such as the face of a person, and/or color-related information, such as permissible color hues and/or shades. Once an object is detected, the monoscopic camera 102 may continue to track the object in successive image frames. The object tracking may be configured and/or controlled using depth information. This may enable tracking an object as it moves, for example, closer to or further away from the monoscopic camera 102. Accordingly, the monoscopic camera 102 may be operable to adjust characteristics associated with detected objects, such as size and/or relative position to other objects in the scenes, as the tracked objects move within the captured scene. - Adaptive processing of video information associated with detected and/or tracked objects may comprise configuring and/or modifying control parameters and/or criteria pertinent to video processing operations, and/or setting and/or adjusting video information (e.g.
color, brightness, and/or shade) associated with the detected objects. Configuration of processing of object image related information may be performed based on preset criteria and/or parameters associated with identified types and/or characteristics of the objects. In this regard, video information associated with certain objects may be generated and/or modified based on preconfigured criteria for example. For example, in instances where an object is identified as a face, color-related video information may be adjusted based on predetermined criteria associated with acceptable human color hues for example. Furthermore, depth information may also be utilized in conjunction with object detection and/or tracking during processing of object image related information. For example, the relative size of an object identified as a face may be adjusted based on depth information, to ensure that the size of the face may be appropriate, for example within an acceptable range and/or quality, based on the identified relative depth of the face in relation to the
camera 102. -
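The depth-based plausibility check on a detected face's size described above can be sketched as follows. This is an illustrative model only: the pinhole focal length, the assumed physical face width, and the tolerance are hypothetical values, not parameters from the patent.

```python
# Hypothetical sketch: checking that a detected face's rendered size is
# within an acceptable range for its measured depth, using a simple
# pinhole-camera estimate. All constants below are illustrative assumptions.

def expected_face_width_px(depth_m, focal_px=500.0, face_width_m=0.15):
    """Pinhole-camera estimate of how wide a face should appear at a depth."""
    return focal_px * face_width_m / depth_m

def size_is_plausible(measured_px, depth_m, tolerance=0.5):
    """True if the measured width is within +/-50% of the expected width."""
    expected = expected_face_width_px(depth_m)
    return abs(measured_px - expected) <= tolerance * expected

# A ~75 px wide face at 1 m matches the pinhole estimate (500 * 0.15 / 1 = 75).
ok = size_is_plausible(75.0, 1.0)
bad = size_is_plausible(300.0, 2.0)   # expected ~37.5 px, so implausible
```

A camera could use such a check to reject spurious detections, or to rescale a tracked face region as its depth changes.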
FIG. 2B is a block diagram illustrating an exemplary image sensor pipeline (ISP), which may be utilized in accordance with an embodiment of the invention. Referring to FIG. 2B, there is shown the image sensor pipeline (ISP) 250 of FIG. 2A. - The
ISP 250 may comprise suitable circuitry, logic, and/or code that may be operable to perform various functions associated with processing of imaging data, which may be received from one or more imaging-related sensors, in an accelerated manner, by use of a pipeline-based architecture for example. The ISP 250 may be utilized to enable, for example, pipelined color processing of captured images. In this regard, the ISP 250 may be configured as a programmable pipeline structure, comprising a plurality of functions 250 A-250 N, each of which is associated with handling and/or performing a particular image processing function. Accordingly, the ISP 250 may enable accelerated image processing by splitting the processing of data associated with each particular image into stages, to enable concurrently handling multiple images, with each of the plurality of functions 250 A-250 N being utilized to, for example, perform the corresponding processing function on different images. In other words, the ISP 250 may enable handling multiple images since the processing of each image may be at a different stage at any given point. This may enable implementing various aspects of the invention by adjusting different stages of pipelined functions, without affecting the overall processing duration, since some of the operations may be done while other stages are being performed. Data may be moved from any point of the ISP 250 and processed in software, and the resulting software-processed data may be put into any desired point of the ISP 250 for processing in hardware. - Exemplary processing functions handled and/or implemented by the
ISP 250 may comprise, for example, an auto-focus function 250 A, a flash-metering function 250 B, an auto-white-balance (AWB) function 250 C, an image segmentation function 250 D, and/or an image scaling function 250 N. - The auto-
focus function 250 A may comprise performing focusing operations automatically. In this regard, focusing may comprise selecting one or more portions of an image to be focal points during image processing, in which light from these portions, and/or objects therein, is optimally captured and/or corresponding image information is consequently very accurate. Auto-focus operations may comprise use and/or control of image sensors to enable selecting focus points, and/or to determine correct focusing associated therewith. In this regard, auto-focus may be active, in which distance to the focus points (or objects) may be determined, and subsequently correct focusing may be effectuated, by controlling and/or adjusting available image sensors, using such techniques as light metering for example. Auto-focus may also be passive, in which focus point selection and/or corresponding focusing adjustment may be performed based on passive analysis of the image, and/or associated image information, after the image is captured. - The flash-
metering function 250 B may comprise controlling flashing operations, such as of the camera 102, based on image sensory information. In this regard, flash-metering may comprise determining and/or measuring levels of light or brightness in a scene with which an image is associated, and selecting and/or controlling, based thereon, the amount of light emitted by a flash component coupled to and/or integrated into the camera. The light measuring may be performed using one or more sensors, and/or via the camera's lenses. - The
AWB function 250 C may comprise performing white balance operations automatically. In this regard, white balancing may comprise adjusting color intensities of portions of an image associated with the white color, to ensure that these portions may be rendered correctly, i.e. with a more natural feel, based on identification of the objects associated with these white areas and/or the settings in which the image was captured. The white color may typically be a function of equal, or near equal, mixing of the three primary colors (red, green, and blue). Accordingly, during color balancing, contributions or parameters associated with each of these three colors may be adjusted to adjust the whiteness of the white color region. For example, white balancing may comprise adjusting image portions associated with snow such that the white color associated with the snow may be rendered with a degree of blueness. - The
image segmentation function 250 D may comprise partitioning an image, whose information is processed, into multiple segments, each comprising a plurality of contiguous and/or non-contiguous pixels, based on the presence of one or more common characteristics among pixels in each segment. The common characteristics may be determined based on predetermined ranges associated with one or more items of video information, such as intensity and/or color. Image segmentation may be utilized to enable simplifying and/or changing the processing of image data, by configuring analysis and/or processing of image data in accordance with the common characteristics associated with each segment. Image segmentation may also be utilized to enable and/or enhance locating objects and boundaries, such as lines or curves, in images. - The
image scaling function 250 N may comprise resizing images, and/or portions thereof, to increase or decrease the image (or portion) size. In this regard, image scaling may comprise and/or pertain to zooming operations, in which a portion of an image may be adjusted to fit a larger or smaller portion of the screen. Image scaling may affect various characteristics of the images, such as smoothness and/or sharpness. In this regard, increasing the size of an image may reduce the smoothness and/or sharpness of the image, while decreasing the size of an image may enhance its smoothness and/or sharpness. Image scaling may comprise subsampling or upsampling an image, or a portion thereof, based on whether the image (or portion) is being scaled down or up. - In various embodiments of the invention, the depth information generated and/or captured via the
depth sensor 226 may be utilized to enhance and/or improve image processing performed in the camera 102. In this regard, the depth information may be utilized to generate and/or adjust control information utilized in controlling and/or managing operations of the ISP 250. For example, the control information may be utilized to adjust and/or control various stages and/or functions of the ISP 250, such as the auto-focus function 250 A, the flash-metering function 250 B, the AWB function 250 C, the image segmentation function 250 D, and/or the image scaling function 250 N. For example, the auto-focus function 250 A may be adjusted, based on depth information, to enable selecting focus points, and/or configuring focusing operations relating thereto, adaptively at different depths relative to the camera 102. Also, algorithms utilized during AWB operations may be adjusted to enable applying white balancing adaptively at different depths relative to the camera 102. Similarly, algorithms and/or parameters utilized in scaling operations may be adjusted to enable performing scaling operations adaptively at different depths relative to the camera 102. - Operations of the
ISP 250 may also be controlled and/or adjusted based on detection and/or tracking of objects. In this regard, at least some of the functions 250 A-250 N of the ISP 250 may be modified and/or configured to support and/or enable performing various operations associated with detecting and/or tracking objects. For example, the auto-focus function 250 A may be adjusted to focus on particular types of objects and/or to do so at particular depths, such as by incorporating depth-information-related parameters into this function. The flash-metering function 250 B may also be adjusted such that flashing operations may be tailored to enhance lighting of detected and/or tracked objects. Similarly, the image segmentation function 250 D may be adjusted to enhance locating objects and shapes associated therewith, such as lines or curves with particular characteristics, during object detection and/or tracking operations. At least some of the functions 250 A-250 N of the ISP 250 may also be modified and/or configured to enhance and/or adjust various image processing operations associated with detected and/or tracked objects. For example, the AWB function 250 C may be adjusted to perform white balancing adaptively on regions associated with detected and/or tracked objects. -
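The staged, concurrent behaviour that the description attributes to the ISP 250 can be modelled in a few lines of software. This is a simplified scheduling model, not the hardware design: the stage functions below are toy placeholders for stages such as demosaicing, denoising, or sharpening, and one image enters the pipe per "clock" while every in-flight image occupies a different stage.

```python
# Minimal software model of a processing pipeline: with N stages, up to N
# images are in flight at once, each at a different stage. Stage functions
# here are placeholders for real ISP stages (demosaic, denoise, sharpen...).

def run_pipeline(stages, images):
    """Advance every in-flight image one stage per 'clock'; returns the
    finished images in their original order."""
    results = []
    in_flight = []                 # (image, index of next stage) pairs
    pending = list(images)
    while pending or in_flight:
        advanced = []
        for img, idx in in_flight:          # advance each image one stage
            img = stages[idx](img)
            if idx + 1 == len(stages):
                results.append(img)         # image left the last stage
            else:
                advanced.append((img, idx + 1))
        in_flight = advanced
        if pending:                         # accept one new image per clock
            in_flight.append((pending.pop(0), 0))
    return results

stages = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3]  # toy stages
out = run_pipeline(stages, [0, 10, 20])    # -> [-1, 19, 39]
```

Each image still passes through every stage in order ((x + 1) * 2 - 3), which is why adjusting one stage need not change the overall throughput of the pipe.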
FIG. 3 illustrates processing of depth information and 2D image information to generate a 3D image, which may be utilized in accordance with an embodiment of the invention. Referring to FIG. 3, there is shown a frame 330 of depth information, which may be captured by the depth sensor(s) 226, and a frame 334 of 2D image information, captured by the image sensors 222. In this regard, the depth information 330 and the 2D image information 334 may be processed to generate a frame 336 associated with a corresponding 3D image. Also shown in FIG. 3 is plane 332, indicated by a dashed line, which is merely for illustration purposes to indicate depth on the two-dimensional drawing sheets. - In the
frame 330, the line weight is utilized to indicate depth, heavier lines being closer to the viewer. Thus, the object 338 is farthest from the camera 102, the object 342 is closest to the camera 102, and the object 340 is at an intermediate distance. - In operation, depth information, such as in
frame 330 for example, may be utilized to provide depth-related adjustments to corresponding 2D video information, such as in frame 334 for example. For example, the depth information may be mapped to a grayscale, or pseudo-grayscale, image for display to the viewer. Such mapping may be performed by the DSP 206 for example. The image associated with the frame 334 may be a conventional 2D image. A viewer of the frame 334, for example on the display 120 or on a dedicated display device connected to the camera 102 via the I/O module 228, may perceive the same distance to each of the objects 338, 340, and 342. That is, the objects 338, 340, and 342 may all appear to be at the plane 332. - Accordingly, 2D video information and corresponding depth information may be processed to enable generating 3D images, associated with
frame 336 for example, which may provide depth perception when viewed. For example, the depth information of frame 330 may be utilized to adaptively adjust video information associated with each of the objects in the 2D images of frame 334 to create a perception of different depths for the objects contained therein. In this regard, the viewer of frame 336, on the display 120 or on a dedicated display device connected to the camera 102 via the I/O module 228 for example, may perceive the object 338 as being furthest from the viewer, the object 342 as being closest to the viewer, and the object 340 as being at an intermediate distance. In this regard, the object 338 may appear to be behind the reference plane, the object 340 may appear to be on the reference plane, and the object 342 may appear to be in front of the reference plane. - In various embodiments of the invention, depth information may be utilized to enable detecting and/or tracking objects in captured 2D images and/or video, and/or may be utilized to enable adaptively processing and/or generating video data associated with detected and/or tracked objects. In this regard, object detection and/or tracking may be performed at different depths, corresponding to the depth of each of the
objects 338, 340, and 342.
-
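The mapping of depth information to a grayscale image for display, mentioned above, can be sketched as a simple linear mapping. The convention used here (nearer objects rendered brighter) is an assumption for illustration; the patent does not fix a particular mapping.

```python
# Illustrative sketch: mapping a frame of depth values to 8-bit gray levels
# for display, with the nearest depth -> 255 (white) and the farthest -> 0.

def depth_to_grayscale(depth, d_min, d_max):
    """Map each depth in [d_min, d_max] to an 8-bit gray level."""
    span = float(d_max - d_min) or 1.0        # avoid division by zero
    return [[int(round(255 * (1.0 - (d - d_min) / span))) for d in row]
            for row in depth]

# Close pixels (depth 1.0) become white; far pixels (depth 3.0) become black.
gray = depth_to_grayscale([[1.0, 2.0], [3.0, 3.0]], 1.0, 3.0)
```

Such a pseudo-grayscale rendering gives a viewer a quick visual check of the captured depth frame before it is combined with the 2D image information.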
FIG. 4A is a diagram that illustrates exemplary detection and/or tracking of objects via a monoscopic camera based on Z-depth information, which may be utilized in accordance with an embodiment of the invention. Referring to FIG. 4A, there is shown the monoscopic camera 102 and a physical object 402, whose image may be captured via the monoscopic camera 102. The object 402 may comprise, for example, a person and/or a particular part of a person, such as a face or hands for example. - In operation, the
monoscopic camera 102 may be utilized to capture 2D video and to generate corresponding depth information, substantially as described with regard to FIGS. 2 and 3, for example. In this regard, the monoscopic camera 102 may be operable to detect and/or track objects in captured video. For example, the monoscopic camera 102 may be operable to detect the presence of video information corresponding to the object 402 in captured 2D video. Furthermore, depth information generated via the monoscopic camera 102 may enable determining a plurality of depth planes, for example, corresponding to different distances from the monoscopic camera 102. Accordingly, the monoscopic camera 102 may be operable to adaptively perform object detection and/or tracking, and/or subsequent processing of video information associated with detected and/or tracked objects. For example, the monoscopic camera 102 may incorporate various object recognition algorithms to enable detecting certain objects. In this regard, the object recognition may be performed based on determination of type, category, and/or characteristics of objects. - In various embodiments of the invention, the
monoscopic camera 102 may be operable to detect and/or track objects, such as object 402, based on the capturing of 2D video and corresponding depth information. The monoscopic camera 102 may then adaptively process video information associated with the detected and/or tracked object 402, to enhance corresponding video images by ensuring that the perception of object 402 in images captured by the camera 102 may be acceptable and/or normal to viewers. In this regard, object detection may be performed based on and/or utilizing one or more recognition algorithms, which may enable determining the presence of certain objects in scenes captured via the camera 102. The object recognition algorithms may detect objects based on, for example, determination of the type of object, and/or preconfigured or predetermined characteristics associated therewith. For example, the object recognition algorithms may enable determining whether the object 402 may comprise a person, or a part thereof such as a face. Furthermore, object tracking may be configured and/or controlled using depth information. This may enable tracking an object as it moves, for example, closer to or further away from the monoscopic camera 102. For example, after initially detecting the object 402, the camera 102 may be operable to track movement of the object 402, in subsequent images, as it moves in the scene captured by the camera 102. The object movement may comprise moving to various depths relative to the camera 102, such as from depth D1, to depth D2, then to depth D3. Accordingly, as the object 402 moves to the different depths, processing of video information associated with the object 402 may be continually modified and/or configured to ensure that the video information may remain within acceptable ranges based on the identified type and/or characteristics associated with the object 402. -
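The movement of a tracked object through depths D1, D2, and D3 can be sketched as assignment of each frame's measured depth to the nearest of a set of discrete depth planes. The plane depths below are hypothetical values chosen for illustration; the patent does not specify particular distances.

```python
# Hedged sketch: assigning a tracked object to one of several depth planes
# (labelled D1, D2, D3 as in the description above) and following it as it
# moves away from the camera. Plane depths in meters are assumptions.

PLANES = {"D1": 1.0, "D2": 2.0, "D3": 4.0}

def nearest_plane(object_depth):
    """Name of the depth plane closest to the object's measured depth."""
    return min(PLANES, key=lambda name: abs(PLANES[name] - object_depth))

def track(depth_samples):
    """Map an object's depth in successive frames to depth-plane labels."""
    return [nearest_plane(d) for d in depth_samples]

path = track([0.9, 2.1, 3.8])   # object moving away -> ['D1', 'D2', 'D3']
```

Per-plane processing parameters (focus, white balance, expected object size) could then be looked up from the plane label rather than recomputed per pixel.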
FIG. 4B is a diagram that illustrates exemplary selective processing of objects via a monoscopic camera subsequent to detection and/or tracking based on Z-depth information, which may be utilized in accordance with an embodiment of the invention. Referring to FIG. 4B, there is shown a plurality of successive frames 420 a-420 c, which may be captured and/or generated via the monoscopic camera 102. - Each of the plurality of frames 420 a-420 c may comprise video information associated with various elements present in a scene at which the monoscopic camera may be directed while generating and/or capturing the 2D video. In this regard, a
particular object 422, which may comprise a person for example, is shown in the plurality of frames 420 a-420 c. - In this regard, the frames 420 a-420 c may comprise image information associated with
object 422, which may correspond to a person for example. The image information associated with object 422 may be adjusted based on depth information captured in association with, and corresponding to, the captured 2D video. In this regard, based on determination of the type of the object, and its relative depth, processing of the video information associated with the object 422 may be performed adaptively. For example, processing of the video information associated with object 422 may be configured to ensure that the size of the object 422 may be adjusted, while remaining within an acceptable range based on the determined type and depth of the object 422, as shown by changes in the size of the object 422 in frames 420 a-420 c. - Furthermore, object detection and/or processing of video information associated with objects may be adaptively configured based on scene detection. In this regard, scene detection may comprise determining various characteristics associated with scenes in images captured by the monoscopic video camera. Exemplary scene characteristics may comprise the type of setting in the scenes, such as rural vs. urban; the type of objects present and/or anticipated in the scene, such as trees and/or buildings; and/or chronological information relating to the scene, such as season and/or time of day. For example, the characteristics of the scene may be utilized to control and/or adjust object detection. In this regard, object recognition may be configured to enable detecting the shape of a person in shaded areas, such as in the shadow of trees for example. Furthermore, processing of video information associated with objects may be configured based on the region surrounding the object. For example, shade and/or color related information associated with
object 422 may be set and/or modified based on determination of the type of the settings around the object 422. Video information associated with object 422 may be adjusted based on the surrounding regions 424 a-424 c in frames 420 a-420 c, respectively. -
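The region-based adjustment described above can be sketched as blending an object's pixel values toward the mean brightness of its surrounding region. The blend factor and the sample values are assumptions for illustration, not parameters from the patent.

```python
# Illustrative sketch: shifting a detected object's brightness part-way
# toward the mean of its surrounding region (e.g. region 424 a around
# object 422), so the object fits its setting. The 0.25 blend is assumed.

def adjust_to_surroundings(object_pixels, surrounding_pixels, blend=0.25):
    """Shift each object pixel a fraction of the way toward the
    surrounding region's mean brightness."""
    mean = sum(surrounding_pixels) / len(surrounding_pixels)
    return [p + blend * (mean - p) for p in object_pixels]

# A bright object in a dark surrounding region is toned down slightly.
adjusted = adjust_to_surroundings([100.0, 120.0], [60.0, 60.0, 60.0, 60.0])
```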
FIG. 5 is a flow chart that illustrates exemplary steps for utilizing an image sensor pipeline (ISP) for 3D imaging processing utilizing Z-depth information, in accordance with an embodiment of the invention. Referring to FIG. 5, there is shown a flow chart 500 comprising a plurality of exemplary steps that may be performed to utilize an image sensor pipeline (ISP) for 3D imaging processing utilizing Z-depth information during video processing. - In
step 502, 2D video may be captured via image sensor(s) in the monoscopic camera. In step 504, depth information may be captured via depth sensor(s) in the monoscopic camera. In this regard, the captured depth information may correspond to the captured 2D video. In step 506, one or more objects may be detected and/or tracked in scene(s) in the captured video, based on captured depth information for example. In this regard, at least some of the object detection and/or tracking operations may be performed based on various stages and/or functions in the ISP 250. Furthermore, at least some stages and/or functions of the ISP 250 may be modified and/or adjusted to support object detection and/or tracking operations. In step 508, selective and/or adaptive processing of video information associated with detected and/or tracked objects may be performed based on depth information and/or local portion(s) of captured video data surrounding the objects. In this regard, at least some of the adaptive and/or selective video processing associated with detected and/or tracked objects may be performed by modifying and/or adjusting settings of various stages and/or functions in the ISP 250. - Various embodiments of the invention may comprise a method and system for utilizing an image sensor pipeline (ISP) for 3D imaging processing utilizing Z-depth information. The
monoscopic video camera 102 may be utilized to detect and/or track, based on sensory information captured via one or more image sensors 222, color sensors 224, and/or depth sensors 226, objects at varying depths, and may adaptively process video information associated with each of these objects, based on determined corresponding depths for these objects. In this regard, the monoscopic video camera 102 may capture, via the image sensors 222 and/or color sensors 224, two-dimensional video, and may capture, via the depth sensor 226, corresponding depth information for the captured two-dimensional video. The monoscopic video camera may then detect and/or track objects in the captured two-dimensional video, based on the captured corresponding depth information for example. Furthermore, processing of image related information corresponding to the objects may be configured based on the detecting and/or tracking of the objects. In this regard, object detection/tracking based configuration of image processing may comprise adjusting and/or controlling one or more functions 250 A-250 N in the ISP 250. The image processing of the monoscopic video camera 102 may be configured to provide adaptive and/or dynamic setting and/or modification of video information, such as color and/or brightness, based on determined types of objects and/or based on determination of the relative depth of each of the objects with respect to the monoscopic video camera 102. Detection of objects may comprise determining the type and/or characteristics of each of the objects. In this regard, identification of the type and/or characteristics of the objects may be performed based on one or more object recognition algorithms programmed into the monoscopic video camera 102. Configuring image processing of object image related information may be performed based on preset criteria and/or parameters associated with identified types and/or characteristics of the objects.
The monoscopic video camera 102 may be operable to synchronize the captured corresponding depth information to the captured two-dimensional video, to enable generating 3D perception for at least some images captured via the monoscopic video camera 102. - Other embodiments of the invention may provide a non-transitory computer readable medium and/or storage medium, and/or a non-transitory machine readable medium and/or storage medium, having stored thereon, a machine code and/or a computer program having at least one code section executable by a machine and/or a computer, thereby causing the machine and/or computer to perform the steps as described herein for utilizing an image sensor pipeline (ISP) for 3D imaging processing utilizing Z-depth information.
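One simple way to realize the synchronization described above is nearest-timestamp matching between the color and depth capture streams. The application does not prescribe a particular pairing scheme, so the sketch below is an assumption for illustration only, using only the Python standard library; function and variable names are hypothetical.

```python
import bisect

def sync_depth_to_video(video_ts, depth_ts):
    """For each video-frame timestamp, return the index of the depth frame
    captured closest in time. Assumes depth_ts is sorted ascending."""
    matches = []
    for t in video_ts:
        i = bisect.bisect_left(depth_ts, t)
        # Consider the depth frames on either side of the insertion point.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(depth_ts)]
        matches.append(min(candidates, key=lambda j: abs(depth_ts[j] - t)))
    return matches

# 30 fps video paired against a 25 fps depth stream (timestamps in seconds).
video_ts = [0.000, 0.033, 0.066, 0.100]
depth_ts = [0.000, 0.040, 0.080]
pairs = sync_depth_to_video(video_ts, depth_ts)
```

When the two streams run at different frame rates, as here, a depth frame may be reused for consecutive video frames; a production pipeline might instead interpolate between neighboring depth maps.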
- Accordingly, the present invention may be realized in hardware, software, or a combination of hardware and software. The present invention may be realized in a centralized fashion in at least one computer system, or in a distributed fashion where different elements are spread across several interconnected computer systems. Any kind of computer system or other apparatus adapted for carrying out the methods described herein is suited. A typical combination of hardware and software may be a general-purpose computer system with a computer program that, when being loaded and executed, controls the computer system such that it carries out the methods described herein.
- The present invention may also be embedded in a computer program product, which comprises all the features enabling the implementation of the methods described herein, and which when loaded in a computer system is able to carry out these methods. Computer program in the present context means any expression, in any language, code or notation, of a set of instructions intended to cause a system having an information processing capability to perform a particular function either directly or after either or both of the following: a) conversion to another language, code or notation; b) reproduction in a different material form.
- While the present invention has been described with reference to certain embodiments, it will be understood by those skilled in the art that various changes may be made and equivalents may be substituted without departing from the scope of the present invention. In addition, many modifications may be made to adapt a particular situation or material to the teachings of the present invention without departing from its scope. Therefore, it is intended that the present invention not be limited to the particular embodiment disclosed, but that the present invention will include all embodiments falling within the scope of the appended claims.
Claims (20)
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US13/174,364 US20120050483A1 (en) | 2010-08-27 | 2011-06-30 | Method and system for utilizing an image sensor pipeline (isp) for 3d imaging processing utilizing z-depth information |
US13/174,261 US9013552B2 (en) | 2010-08-27 | 2011-06-30 | Method and system for utilizing image sensor pipeline (ISP) for scaling 3D images based on Z-depth information |
EP12004741.0A EP2541945A3 (en) | 2011-06-30 | 2012-06-25 | Method and system for utilizing an image sensor pipeline (ISP) for 3D imaging processing utilizing Z-depth information |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US37786710P | 2010-08-27 | 2010-08-27 | |
US13/174,364 US20120050483A1 (en) | 2010-08-27 | 2011-06-30 | Method and system for utilizing an image sensor pipeline (isp) for 3d imaging processing utilizing z-depth information |
Publications (1)
Publication Number | Publication Date |
---|---|
US20120050483A1 true US20120050483A1 (en) | 2012-03-01 |
Family
ID=45696699
Family Applications (3)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US13/174,430 Active 2032-12-18 US9100640B2 (en) | 2010-08-27 | 2011-06-30 | Method and system for utilizing image sensor pipeline (ISP) for enhancing color of the 3D image utilizing z-depth information |
US13/174,364 Abandoned US20120050483A1 (en) | 2010-08-27 | 2011-06-30 | Method and system for utilizing an image sensor pipeline (isp) for 3d imaging processing utilizing z-depth information |
US13/174,261 Active 2032-09-06 US9013552B2 (en) | 2010-08-27 | 2011-06-30 | Method and system for utilizing image sensor pipeline (ISP) for scaling 3D images based on Z-depth information |
Family Applications Before (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US13/174,430 Active 2032-12-18 US9100640B2 (en) | 2010-08-27 | 2011-06-30 | Method and system for utilizing image sensor pipeline (ISP) for enhancing color of the 3D image utilizing z-depth information |
Family Applications After (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US13/174,261 Active 2032-09-06 US9013552B2 (en) | 2010-08-27 | 2011-06-30 | Method and system for utilizing image sensor pipeline (ISP) for scaling 3D images based on Z-depth information |
Country Status (1)
Country | Link |
---|---|
US (3) | US9100640B2 (en) |
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20130187908A1 (en) * | 2012-01-20 | 2013-07-25 | Realtek Semiconductor Corp. | Image processing device and method thereof |
WO2014022490A1 (en) * | 2012-07-31 | 2014-02-06 | Intel Corporation | Context-driven adjustment of camera parameters |
US20150092029A1 (en) * | 2013-10-02 | 2015-04-02 | National Cheng Kung University | Method, device and system for packing color frame and original depth frame |
US20150319417A1 (en) * | 2014-05-02 | 2015-11-05 | Samsung Electronics Co., Ltd. | Electronic apparatus and method for taking a photograph in electronic apparatus |
US9330470B2 (en) | 2010-06-16 | 2016-05-03 | Intel Corporation | Method and system for modeling subjects from a depth map |
WO2016077568A1 (en) * | 2014-11-13 | 2016-05-19 | Intel Corporation | 3d enhanced image correction |
US9477303B2 (en) | 2012-04-09 | 2016-10-25 | Intel Corporation | System and method for combining three-dimensional tracking with a three-dimensional display for a user interface |
US9910498B2 (en) | 2011-06-23 | 2018-03-06 | Intel Corporation | System and method for close-range movement tracking |
US20190251403A1 (en) * | 2018-02-09 | 2019-08-15 | Stmicroelectronics (Research & Development) Limited | Apparatus, method and computer program for performing object recognition |
US11048333B2 (en) | 2011-06-23 | 2021-06-29 | Intel Corporation | System and method for close-range movement tracking |
Families Citing this family (23)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR101289269B1 (en) * | 2010-03-23 | 2013-07-24 | 한국전자통신연구원 | An apparatus and method for displaying image data in image system |
US20110279651A1 (en) * | 2010-05-17 | 2011-11-17 | Texas Instruments Incorporated | Method and Apparatus for Auto-Convergence Based on Auto-Focus Point for Stereoscopic Frame |
KR20120088467A (en) * | 2011-01-31 | 2012-08-08 | 삼성전자주식회사 | Method and apparatus for displaying partial 3d image in 2d image disaply area |
US8780161B2 (en) * | 2011-03-01 | 2014-07-15 | Hewlett-Packard Development Company, L.P. | System and method for modifying images |
KR20130127867A (en) * | 2012-05-15 | 2013-11-25 | 삼성전자주식회사 | Stereo vision apparatus and control method thereof |
KR20140010823A (en) * | 2012-07-17 | 2014-01-27 | 삼성전자주식회사 | Image data scaling method and image display apparatus |
US20140267616A1 (en) * | 2013-03-15 | 2014-09-18 | Scott A. Krig | Variable resolution depth representation |
US10497140B2 (en) * | 2013-08-15 | 2019-12-03 | Intel Corporation | Hybrid depth sensing pipeline |
US9292906B1 (en) * | 2013-09-06 | 2016-03-22 | Google Inc. | Two-dimensional image processing based on third dimension data |
KR101957243B1 (en) * | 2014-07-09 | 2019-03-12 | 삼성전자주식회사 | Multi view image display apparatus and multi view image display method thereof |
JP6443677B2 (en) * | 2015-03-12 | 2018-12-26 | 日本精機株式会社 | Head mounted display device |
CN106469551A (en) * | 2015-08-19 | 2017-03-01 | 中兴通讯股份有限公司 | A kind of pipeline noise reduction system and method |
US10672180B2 (en) * | 2016-05-02 | 2020-06-02 | Samsung Electronics Co., Ltd. | Method, apparatus, and recording medium for processing image |
US20180149826A1 (en) * | 2016-11-28 | 2018-05-31 | Microsoft Technology Licensing, Llc | Temperature-adjusted focus for cameras |
US10554881B2 (en) * | 2016-12-06 | 2020-02-04 | Microsoft Technology Licensing, Llc | Passive and active stereo vision 3D sensors with variable focal length lenses |
US10277889B2 (en) | 2016-12-27 | 2019-04-30 | Qualcomm Incorporated | Method and system for depth estimation based upon object magnification |
KR102320198B1 (en) | 2017-04-05 | 2021-11-02 | 삼성전자주식회사 | Method and apparatus for refining depth image |
WO2019031259A1 (en) * | 2017-08-08 | 2019-02-14 | ソニー株式会社 | Image processing device and method |
KR102417701B1 (en) * | 2017-08-28 | 2022-07-07 | 삼성전자주식회사 | Method and apparatus for displaying in electronic device |
US10511888B2 (en) | 2017-09-19 | 2019-12-17 | Sony Corporation | Calibration system for audience response capture and analysis of media content |
WO2019183733A1 (en) * | 2018-03-29 | 2019-10-03 | Matr Performance Inc. | Method and system for motion capture to enhance performance in an activity |
US11849223B2 (en) | 2018-12-21 | 2023-12-19 | Chronoptics Limited | Time of flight camera data processing system |
JP7292979B2 (en) * | 2019-05-31 | 2023-06-19 | 株式会社東芝 | Image processing device and image processing method |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20030206653A1 (en) * | 1995-07-28 | 2003-11-06 | Tatsushi Katayama | Image sensing and image processing apparatuses |
US20060029270A1 (en) * | 2004-08-03 | 2006-02-09 | Sony Corporation | System and method for efficiently performing a depth map recovery procedure |
US20070188623A1 (en) * | 2003-09-11 | 2007-08-16 | Haruo Yamashita | Visual processing device, visual processing method, visual processing program, integrated circuit, display device, image-capturing device, and portable information terminal |
US20080031327A1 (en) * | 2006-08-01 | 2008-02-07 | Haohong Wang | Real-time capturing and generating stereo images and videos with a monoscopic low power mobile device |
US20090060273A1 (en) * | 2007-08-03 | 2009-03-05 | Harman Becker Automotive Systems Gmbh | System for evaluating an image |
US20100118122A1 (en) * | 2008-11-07 | 2010-05-13 | Honeywell International Inc. | Method and apparatus for combining range information with an optical image |
Family Cites Families (20)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6414709B1 (en) * | 1994-11-03 | 2002-07-02 | Synthonics Incorporated | Methods and apparatus for zooming during capture and reproduction of 3-dimensional images |
US6445833B1 (en) * | 1996-07-18 | 2002-09-03 | Sanyo Electric Co., Ltd | Device and method for converting two-dimensional video into three-dimensional video |
ATE483327T1 (en) * | 2001-08-15 | 2010-10-15 | Koninkl Philips Electronics Nv | 3D VIDEO CONFERENCE SYSTEM |
WO2007145654A1 (en) * | 2005-10-28 | 2007-12-21 | Aepx Animation, Inc. | Automatic compositing of 3d objects in a still frame or series of frames and detection and manipulation of shadows in an image or series of images |
US7477777B2 (en) * | 2005-10-28 | 2009-01-13 | Aepx Animation, Inc. | Automatic compositing of 3D objects in a still frame or series of frames |
US8009903B2 (en) * | 2006-06-29 | 2011-08-30 | Panasonic Corporation | Image processor, image processing method, storage medium, and integrated circuit that can adjust a degree of depth feeling of a displayed high-quality image |
US8587638B2 (en) * | 2006-09-25 | 2013-11-19 | Nokia Corporation | Supporting a 3D presentation |
WO2008063167A1 (en) * | 2006-11-21 | 2008-05-29 | Thomson Licensing | Methods and systems for color correction of 3d images |
US8330801B2 (en) * | 2006-12-22 | 2012-12-11 | Qualcomm Incorporated | Complexity-adaptive 2D-to-3D video sequence conversion |
US9058668B2 (en) * | 2007-05-24 | 2015-06-16 | Broadcom Corporation | Method and system for inserting software processing in a hardware image sensor pipeline |
US7884823B2 (en) * | 2007-06-12 | 2011-02-08 | Microsoft Corporation | Three dimensional rendering of display information using viewer eye coordinates |
KR101420684B1 (en) * | 2008-02-13 | 2014-07-21 | 삼성전자주식회사 | Apparatus and method for matching color image and depth image |
US8059911B2 (en) * | 2008-09-30 | 2011-11-15 | Himax Technologies Limited | Depth-based image enhancement |
KR101502362B1 (en) * | 2008-10-10 | 2015-03-13 | 삼성전자주식회사 | Apparatus and Method for Image Processing |
CN101789235B (en) * | 2009-01-22 | 2011-12-28 | 华为终端有限公司 | Method and device for processing image |
JP4903240B2 (en) * | 2009-03-31 | 2012-03-28 | シャープ株式会社 | Video processing apparatus, video processing method, and computer program |
US8947422B2 (en) * | 2009-09-30 | 2015-02-03 | Disney Enterprises, Inc. | Gradient modeling toolkit for sculpting stereoscopic depth models for converting 2-D images into stereoscopic 3-D images |
US8537200B2 (en) * | 2009-10-23 | 2013-09-17 | Qualcomm Incorporated | Depth map generation techniques for conversion of 2D video data to 3D video data |
US9042636B2 (en) * | 2009-12-31 | 2015-05-26 | Disney Enterprises, Inc. | Apparatus and method for indicating depth of one or more pixels of a stereoscopic 3-D image comprised from a plurality of 2-D layers |
US8823776B2 (en) * | 2010-05-20 | 2014-09-02 | Cisco Technology, Inc. | Implementing selective image enhancement |
2011
- 2011-06-30 US US13/174,430 patent/US9100640B2/en active Active
- 2011-06-30 US US13/174,364 patent/US20120050483A1/en not_active Abandoned
- 2011-06-30 US US13/174,261 patent/US9013552B2/en active Active
Cited By (19)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9330470B2 (en) | 2010-06-16 | 2016-05-03 | Intel Corporation | Method and system for modeling subjects from a depth map |
US11048333B2 (en) | 2011-06-23 | 2021-06-29 | Intel Corporation | System and method for close-range movement tracking |
US9910498B2 (en) | 2011-06-23 | 2018-03-06 | Intel Corporation | System and method for close-range movement tracking |
US20130187908A1 (en) * | 2012-01-20 | 2013-07-25 | Realtek Semiconductor Corp. | Image processing device and method thereof |
US9477303B2 (en) | 2012-04-09 | 2016-10-25 | Intel Corporation | System and method for combining three-dimensional tracking with a three-dimensional display for a user interface |
KR101643496B1 (en) * | 2012-07-31 | 2016-07-27 | 인텔 코포레이션 | Context-driven adjustment of camera parameters |
KR20150027137A (en) * | 2012-07-31 | 2015-03-11 | 인텔 코오퍼레이션 | Context-driven adjustment of camera parameters |
WO2014022490A1 (en) * | 2012-07-31 | 2014-02-06 | Intel Corporation | Context-driven adjustment of camera parameters |
US20150092029A1 (en) * | 2013-10-02 | 2015-04-02 | National Cheng Kung University | Method, device and system for packing color frame and original depth frame |
US9832446B2 (en) * | 2013-10-02 | 2017-11-28 | National Cheng Kung University | Method, device and system for packing color frame and original depth frame |
US20150319417A1 (en) * | 2014-05-02 | 2015-11-05 | Samsung Electronics Co., Ltd. | Electronic apparatus and method for taking a photograph in electronic apparatus |
US10070041B2 (en) * | 2014-05-02 | 2018-09-04 | Samsung Electronics Co., Ltd. | Electronic apparatus and method for taking a photograph in electronic apparatus |
WO2016077568A1 (en) * | 2014-11-13 | 2016-05-19 | Intel Corporation | 3d enhanced image correction |
CN107077607A (en) * | 2014-11-13 | 2017-08-18 | 英特尔公司 | 3D enhanced image correction |
EP3219101A4 (en) * | 2014-11-13 | 2018-06-20 | Intel Corporation | 3d enhanced image correction |
US10764563B2 (en) * | 2014-11-13 | 2020-09-01 | Intel Corporation | 3D enhanced image correction |
US20160142702A1 (en) * | 2014-11-13 | 2016-05-19 | Intel Corporation | 3d enhanced image correction |
US20190251403A1 (en) * | 2018-02-09 | 2019-08-15 | Stmicroelectronics (Research & Development) Limited | Apparatus, method and computer program for performing object recognition |
US10922590B2 (en) * | 2018-02-09 | 2021-02-16 | Stmicroelectronics (Research & Development) Limited | Apparatus, method and computer program for performing object recognition |
Also Published As
Publication number | Publication date |
---|---|
US9100640B2 (en) | 2015-08-04 |
US9013552B2 (en) | 2015-04-21 |
US20120050482A1 (en) | 2012-03-01 |
US20120050484A1 (en) | 2012-03-01 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US9100640B2 (en) | Method and system for utilizing image sensor pipeline (ISP) for enhancing color of the 3D image utilizing z-depth information | |
US10897609B2 (en) | Systems and methods for multiscopic noise reduction and high-dynamic range | |
US9204128B2 (en) | Stereoscopic shooting device | |
US20120050478A1 (en) | Method and System for Utilizing Multiple 3D Source Views for Generating 3D Image | |
US10708486B2 (en) | Generation of a depth-artificial image by determining an interpolated supplementary depth through interpolation based on the original depths and a detected edge | |
US20120050480A1 (en) | Method and system for generating three-dimensional video utilizing a monoscopic camera | |
US9961272B2 (en) | Image capturing apparatus and method of controlling the same | |
US8810565B2 (en) | Method and system for utilizing depth information as an enhancement layer | |
US20140002612A1 (en) | Stereoscopic shooting device | |
JP2010088105A (en) | Imaging apparatus and method, and program | |
US9774841B2 (en) | Stereoscopic image capture device and control method of the same | |
US8878910B2 (en) | Stereoscopic image partial area enlargement and compound-eye imaging apparatus and recording medium | |
WO2019105151A1 (en) | Method and device for image white balance, storage medium and electronic equipment | |
US20120050491A1 (en) | Method and system for adjusting audio based on captured depth information | |
US20120050477A1 (en) | Method and System for Utilizing Depth Information for Providing Security Monitoring | |
US20110052070A1 (en) | Image processing apparatus, image processing method and computer readable-medium | |
EP2485494A1 (en) | Method and system for utilizing depth information as an enhancement layer | |
JP5733588B2 (en) | Image processing apparatus and method, and program | |
EP2541945A2 (en) | Method and system for utilizing an image sensor pipeline (ISP) for 3D imaging processing utilizing Z-depth information | |
EP2485493A2 (en) | Method and system for error protection of 3D video |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AS | Assignment |
Owner name: BROADCOM CORPORATION, CALIFORNIA Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:BOROSS, CHRIS;SESHADRI, NAMBIRAJAN;KARAOGUZ, JEYHAN;AND OTHERS;SIGNING DATES FROM 20110624 TO 20110720;REEL/FRAME:026666/0552 |
|
AS | Assignment |
Owner name: BANK OF AMERICA, N.A., AS COLLATERAL AGENT, NORTH CAROLINA Free format text: PATENT SECURITY AGREEMENT;ASSIGNOR:BROADCOM CORPORATION;REEL/FRAME:037806/0001 Effective date: 20160201 Owner name: BANK OF AMERICA, N.A., AS COLLATERAL AGENT, NORTH Free format text: PATENT SECURITY AGREEMENT;ASSIGNOR:BROADCOM CORPORATION;REEL/FRAME:037806/0001 Effective date: 20160201 |
|
STCB | Information on status: application discontinuation |
Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |
|
AS | Assignment |
Owner name: AVAGO TECHNOLOGIES GENERAL IP (SINGAPORE) PTE. LTD., SINGAPORE Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:BROADCOM CORPORATION;REEL/FRAME:041706/0001 Effective date: 20170120 Owner name: AVAGO TECHNOLOGIES GENERAL IP (SINGAPORE) PTE. LTD Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:BROADCOM CORPORATION;REEL/FRAME:041706/0001 Effective date: 20170120 |
|
AS | Assignment |
Owner name: BROADCOM CORPORATION, CALIFORNIA Free format text: TERMINATION AND RELEASE OF SECURITY INTEREST IN PATENTS;ASSIGNOR:BANK OF AMERICA, N.A., AS COLLATERAL AGENT;REEL/FRAME:041712/0001 Effective date: 20170119 |