WO2011158225A1 - System and method for enhancing images - Google Patents

System and method for enhancing images

Info

Publication number
WO2011158225A1
Authority
WO
WIPO (PCT)
Prior art keywords
low resolution
image
frames
frame
resolution
Prior art date
Application number
PCT/IL2011/000290
Other languages
French (fr)
Inventor
Yossi Deutsch
Original Assignee
Mirtemis Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Mirtemis Ltd. filed Critical Mirtemis Ltd.
Publication of WO2011158225A1 publication Critical patent/WO2011158225A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformations in the plane of the image
    • G06T3/40Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4053Scaling of whole images or parts thereof, e.g. expanding or contracting based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/50Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20212Image combination
    • G06T2207/20221Image fusion; Image merging

Definitions

  • Xk is the kth approximation of the final super resolution image
  • go to step 1 using the set of diff images (instead of the original LR set)
  • each input LR frame can be a color or a black and white frame.
  • each plane is processed separately (optionally in parallel):
  • each LR frame is up-scaled, translated and accumulated into a sumHRframes variable; the actual set of low resolution frames can then be discarded
  • N frames of simulated LR frames are then generated by down-sampling, shifting and then up-sampling according to the image registration stage;
  • the difference image is then interpolated by using a Gaussian filter (or any other suitable filter).
  • the size of the sequence of the LR frames needed to create the high resolution image was determined prior to execution.
  • the size of the high resolution frame was estimated (at first iteration it can be created using standard bi-cubic scaling or any other known single frame scaling method) and N frames of simulated low resolution frames were generated by down-sampling, geometric translation (e.g. shifting using -1*Di(x,y) for the i'th frame), adding noise and blurring. This step can also utilize any other anticipated degradation processes, such as artifacts caused by imperfections in the optical parts (e.g. lens, filters, etc.).
  • the set of simulated low resolution frames was then subtracted from the original low resolution frames.
  • Each resulting subtracted frame represents the difference between an actual and a simulated low resolution frame. This step enabled prediction of an optimum solution.
  • the high resolution grid formed a sparse matrix with accumulated information of all the mapped difference frames.
  • the high resolution grid was then interpolated using a Gaussian filter (with large support and appropriate sigma).
  • the interpolated high resolution grid was then added to a final super resolution image and the resulting image was then used as input for the next iteration of the above described process starting with re-estimation of the high resolution image.
  • steps (ii)-(iv) were altered as follows.
  • the simulated low resolution frames were mapped into an accumulated high resolution grid via upscaling and translating.
  • the current estimation of the high resolution frame (at first iteration it can be created using standard bi-cubic scaling or any other known single frame scaling method) was then identified and N frames of simulated low resolution frames were generated via down-sampling, geometric translation (e.g. shifting it using -1*Di(x,y) for the i'th frame), adding noise and blurring.
  • Each simulated low resolution image was then mapped to an accumulated high resolution simulation grid.
  • the simulation grid was then subtracted from the high resolution grid and the result was interpolated using a Gaussian filter (with larger support and higher sigma in order to accommodate operating in high resolution space).
  • the interpolated image was then added to the final high resolution image which in turn was used as an input for the next iteration.
  • Figure 3 illustrates a 6X digital zoom of the image of Figure 2 generated using bicubic interpolation.
  • Figure 4 illustrates a 6X digital zoom of the image of Figure 2 generated using the present approach.
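The accumulated-grid iteration outlined in the bullets above can be sketched roughly as follows. This is an illustrative numpy sketch, not the patent's implementation: it assumes whole-pixel shifts, nearest-neighbour upscaling, and a small separable Gaussian filter, and all function names are assumptions:

```python
import numpy as np

def gaussian_kernel(size=5, sigma=1.0):
    """1-D Gaussian, normalized to sum to 1."""
    ax = np.arange(size) - size // 2
    g = np.exp(-(ax ** 2) / (2 * sigma ** 2))
    return g / g.sum()

def gaussian_filter2d(img, size=5, sigma=1.0):
    """Separable 2-D Gaussian filtering (rows, then columns)."""
    k = gaussian_kernel(size, sigma)
    tmp = np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 1, img)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="same"), 0, tmp)

def map_to_hr_grid(lr_frames, shifts, factor, hr_shape):
    """Accumulate (upscale + translate) each LR frame into a single HR
    grid, so the LR sequence itself need not be kept in memory."""
    grid = np.zeros(hr_shape)
    for lr, shift in zip(lr_frames, shifts):
        up = np.kron(lr, np.ones((factor, factor)))    # naive upscale
        grid += np.roll(up, shift=shift, axis=(0, 1))  # translate
    return grid

def refine(hr_est, captured_grid, simulated_grid, size=7, sigma=2.0):
    """One iteration: Gaussian-interpolate the difference of the two
    accumulated grids and add it back to the HR estimate."""
    return hr_est + gaussian_filter2d(captured_grid - simulated_grid, size, sigma)
```

When the simulated grid matches the grid accumulated from the captured frames, the correction term vanishes and the estimate is left unchanged, which is the iteration's stopping condition.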

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Image Processing (AREA)

Abstract

A method of enhancing a low resolution image is provided. The method is effected by obtaining a plurality of low resolution frames corresponding to the low resolution image, upscaling a frame of the plurality of low resolution frames, generating a plurality of simulated low resolution frames, identifying differences between the plurality of low resolution frames and the plurality of simulated low resolution frames and mapping the differences to a high resolution grid and generating an enhanced version of the low resolution image.

Description

SYSTEM AND METHOD FOR ENHANCING IMAGES
FIELD AND BACKGROUND OF THE INVENTION
The present invention relates to a system and method for enhancing still and video images.
The quality of video and still images is often determined via their pixel count, which is mostly a function of the camera sensor (typically a CCD or CMOS sensor chip) used for capturing the image(s). The camera sensor converts captured light into discrete signals which ultimately form pixels on the resulting displayed image. The total number of pixels in the image determines its "pixel count". Consumer digital cameras in use today utilize CMOS sensors capable of capturing 5-14 megapixels (MP) of visual data.
Super-resolution is a term for a mathematical approach for increasing the resolution of an image beyond that of the original pixel count. Super-resolution approaches typically utilize information from several images to create one upsized image by extracting details from one or more frames to reconstruct other frames. Super-resolution is different from image upsizing approaches which synthesize artificial details in order to increase the pixel count of an image.
Several super-resolution approaches are known in the art; see, for example, Altunbasak et al. 2002 - Super-resolution still and video reconstruction from MPEG-coded video. IEEE Trans. Circuits Syst. Video Technol. 12:217-226; Elad and Feuer. 1999 - Super-resolution reconstruction of image sequences; Farsiu et al. 2003 - Robust shift and add approach to super-resolution. SPIE Conf. on Appl. Digital Signal and Image Process., pp. 121-130; Hardie et al. 1997 - Joint MAP registration and high-resolution image estimation using a sequence of undersampled images. IEEE Trans. Image Process. 6:1621-1633.
Most of these approaches utilize frame-to-frame movement of the capturing device and the objects within the captured scene to "fill-in" missing details and generate a supersized image or video sequence.
Although presently available super-resolution approaches are capable of enhancing still and video images, they are processor and memory intensive and thus cannot be effectively used for real-time video super-resolution. In addition, such approaches are typically insensitive to environmentally-produced artifacts, often magnifying such artifacts in resulting super-resolution images produced thereby.
Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Although methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present invention, suitable methods and materials are described below. In case of conflict, the patent specification, including definitions, will control. In addition, the materials, methods, and examples are illustrative only and not intended to be limiting.
Implementation of the method and system of the present invention involves performing or completing selected tasks or steps manually, automatically, or a combination thereof. Moreover, according to actual instrumentation and equipment of preferred embodiments of the method and system of the present invention, several selected steps could be implemented by hardware, by software on any operating system, by firmware, or a combination thereof. For example, as hardware, selected steps of the invention could be implemented as a chip or a circuit. As software, selected steps of the invention could be implemented as a plurality of software instructions being executed by a computer using any suitable operating system. In any case, selected steps of the method and system of the invention could be described as being performed by a data processor, such as a computing platform for executing a plurality of instructions.
BRIEF DESCRIPTION OF THE DRAWINGS
The invention is herein described, by way of example only, with reference to the accompanying drawings. With specific reference now to the drawings in detail, it is stressed that the particulars shown are by way of example and for purposes of illustrative discussion of the preferred embodiments of the present invention only, and are presented in the cause of providing what is believed to be the most useful and readily understood description of the principles and conceptual aspects of the invention. In this regard, no attempt is made to show structural details of the invention in more detail than is necessary for a fundamental understanding of the invention, the description taken with the drawings making apparent to those skilled in the art how the several forms of the invention may be embodied in practice.
In the drawings:
FIG. 1 is a flow chart illustrating the basic steps of the present method.
FIG. 2 is an image captured with a 640X480 CMOS camera.
FIG. 3 illustrates an 8X digital zoom of the image of Figure 2 generated using bilinear interpolation.
FIG. 4 illustrates a 6X digital zoom of the image of Figure 2 generated using the present approach.
DESCRIPTION OF THE PREFERRED EMBODIMENTS
The present invention is of a system and method which can be used to enhance images. Specifically, the present invention can be used to increase the pixel count of images and thus it can be used to enhance low resolution images.
The principles and operation of the present invention may be better understood with reference to the drawings and accompanying descriptions.
Before explaining at least one embodiment of the invention in detail, it is to be understood that the invention is not limited in its application to the details set forth in the following description or exemplified by the Examples. The invention is capable of other embodiments or of being practiced or carried out in various ways. Also, it is to be understood that the phraseology and terminology employed herein is for the purpose of description and should not be regarded as limiting.
Images captured by digital imaging sensors are definition limited by the resolution of the image sensor. Super resolution methods attempt to overcome the diffraction limits of the imaging sensor by reconstructing a high resolution image from a set of under-sampled, low resolution images. This enables an imaging system to output images with a resolution that is higher than the imaging sensor limits.
Prior art super resolution approaches typically utilize an image registration step, in which the geometric transforms between the captured frames are analyzed and a shift/rotation/warp is calculated with sub-pixel accuracy, followed by a super-resolution (SR) stage, in which the registered low resolution images are fused into a single, high resolution image. Most multi-frame super-resolution methods require the algorithm to keep a history of captured frames, and therefore memory requirements can become too demanding, especially for low power devices such as mobile phones and digital still cameras.
To overcome these limitations, several companies, such as NEC™, have implemented a single frame super resolution approach (see, for example, published US Patent applications 20090274385 and 20080267525) which overcomes the high memory requirements of prior art approaches.
Although the NEC approach substantially reduces memory requirements it still suffers from several limitations including not actually breaking the diffraction limits of the imaging sensor but, instead, trying to achieve a subjectively 'nicer' looking image.
While reducing the present invention to practice, the present inventors have devised a super-resolution approach which can be used to enhance low resolution images while circumventing the limitations of prior art approaches. As is further described herein, the present approach can be used to rapidly enhance low resolution still and video images, thus enabling real time super-resolved zoom in digital cameras and cell phones and high definition streaming of low definition video.
Thus, according to one aspect of the present invention, there is provided a method of enhancing a low resolution image. As used herein, the term "enhancing" when used with respect to a low resolution image refers to increasing the pixel count of the image by a factor of at least 1.5, preferably 4X, 6X, 8X or more (e.g. 10X) or improving image quality via reduction of noise or anti-aliasing. As used herein, the phrase low resolution image refers to an image characterized by low pixel count (typically standard definition or less), and/or to an image which can be enhanced by noise reduction or anti-aliasing. Enhancement can also refer to increasing the pixel count or quality of a high definition image, for example, by converting a high resolution image into a super high resolution image. Thus, the present approach can enhance any image captured by a digital imaging sensor at any resolution.
The method of the present invention is effected by first obtaining a plurality of low resolution frames corresponding to the low resolution image. The low resolution (LR) frames can be obtained as a time sequence (microseconds to seconds) of frames, i.e. by sequentially capturing a plurality (e.g. 5, 10, 15 or more) frames of the same image (scene) manually or preferably automatically. Although each frame can represent the image from the same angle and cover the same X-Y coordinates, a slight shift (e.g. of at least 0.1% of the pixel physical size) in angle and/or X-Y coordinates resulting from, for example, intentional or accidental camera movement is preferred since it substantially increases the amount of information covered by the sequence of frames.
Each captured frame of the LR frame sequence is registered to a reference frame (e.g. previous frame) and the displacement information is stored.
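Registration of each captured frame to a reference frame can be performed with any displacement-estimation method; the patent only requires that the displacement information be stored. As a minimal sketch, whole-pixel displacements can be recovered by phase correlation (the FFT-based approach and the function name are illustrative assumptions, not the patent's own registration step):

```python
import numpy as np

def register_translation(ref, frame):
    """Estimate the whole-pixel (dy, dx) displacement of `frame`
    relative to `ref` via phase correlation (an illustrative choice)."""
    F_ref = np.fft.fft2(ref)
    F_frame = np.fft.fft2(frame)
    cross_power = np.conj(F_ref) * F_frame
    cross_power /= np.abs(cross_power) + 1e-12   # normalize magnitudes
    corr = np.fft.ifft2(cross_power).real
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    # shifts past the half-way point wrap around to negative values
    if dy > ref.shape[0] // 2:
        dy -= ref.shape[0]
    if dx > ref.shape[1] // 2:
        dx -= ref.shape[1]
    return int(dy), int(dx)
```

In practice a sub-pixel refinement step would follow, since the method benefits from displacements as small as a fraction of a pixel.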
Once the sequence of LR frames is captured, a single frame (preferably the first or last in the sequence) is upscaled to create a first high resolution estimation by bicubic, bilinear or Lanczos interpolation, or any other known upscaling method. This step is performed as soon as the frame is captured; thus, in instances where the first frame of the LR sequence is used, this frame is processed immediately following capture.
The upscaled frame is then utilized to generate a plurality of simulated low resolution frames by sub-sampling according to the SR factor (HR_size/LR_size) and shifting according to the displacement information retrieved from the LR sequence.
Once a sequence of simulated low resolution frames is generated, it is compared to the original LR frames (captured by the device) by means of an L1 norm distance function, and the difference between the captured and simulated frames is mapped to a high resolution grid to generate an enhanced version of the low resolution image. The enhanced version of the low resolution image is then interpolated, using, for example, a Gaussian filter, to obtain the final enhanced image.
Comparison between the simulated LR frames and the original LR frames (captured by the device) is preferably effected by subtracting each pixel value of the simulated LR frames from its respective original LR frame of the captured sequence.
The result of the previous stage is an "error" image (showing the difference between the LR and simulated LR) and it is added to the estimated HR frame (described above).
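The upscale, simulate, and subtract steps described above can be sketched as follows. This is a simplified illustration under assumed conditions (integer SR factor, whole-pixel displacements, and nearest-neighbour upscaling standing in for the bicubic/bilinear/Lanczos interpolation the text suggests); all function names are illustrative:

```python
import numpy as np

def upscale_nearest(lr, factor):
    """First HR estimate by nearest-neighbour replication (a stand-in
    for the bicubic/bilinear/Lanczos interpolation named in the text)."""
    return np.kron(lr, np.ones((factor, factor)))

def simulate_lr(hr_est, factor, shift):
    """Simulate one captured LR frame: translate the HR estimate by the
    registered displacement, then sub-sample by the SR factor."""
    shifted = np.roll(hr_est, shift=shift, axis=(0, 1))
    return shifted[::factor, ::factor]

def error_images(lr_frames, hr_est, factor, shifts):
    """Per-frame 'error' images: captured LR minus simulated LR."""
    return [lr - simulate_lr(hr_est, factor, s)
            for lr, s in zip(lr_frames, shifts)]
```

When the HR estimate is consistent with the captured sequence, the error images approach zero; otherwise their contents are mapped to the high resolution grid to correct the estimate.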
Once the enhanced image is obtained it is preferably used to generate simulated LR frames, which are mapped to a new grid which is then compared to the grid generated from the original (captured) LR frames (first grid above). These steps are repeated multiple times in order to arrive at a final enhanced image (e.g. a high definition image in the case of an SD starting image). The present approach differs from prior art super resolution approaches. Whereas prior art approaches maintain (store) all the images of a captured sequence and compare captured and simulated low-resolution images, the present approach maps low resolution images into a high resolution grid, maps simulated low resolution images to another grid and identifies the differences between these grids. This obviates the need to maintain (store) captured LR images in memory and thus reduces the memory requirement.
It will be appreciated that the present approach reduces the memory requirements regardless of the number of LR frames used in the captured sequence. Such a reduction in memory requirements is possible because the present approach does not need to store captured frames in memory, needing only 2 X HR_width X HR_height memory space instead of the (HR_width X HR_height) + sequence_length X LR_width X LR_height stored by prior art approaches.
Thus, in the case of 30 captured LR frames of 640X480 pixels each, 4X scaling requires:
mem_size = (30 X 640 X 480) + 4 X (640 X 480) = 10444800 pixels stored by prior art approaches, while with the present approach, it only requires:
mem_size = 2 X 4 X(640 X 480) = 2457600 pixels stored, i.e. less than a quarter of the memory usage.
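The arithmetic above can be checked directly (pixel counts only; as the figures imply, "4X" here denotes a 4-fold increase in pixel count):

```python
# Memory footprint comparison for 30 captured 640x480 LR frames at 4X scaling.
n_frames, lr_w, lr_h, sr_factor = 30, 640, 480, 4

# Prior art: the whole LR sequence plus one HR-sized frame.
prior_art = n_frames * lr_w * lr_h + sr_factor * lr_w * lr_h

# Present approach: two HR-sized grids only.
present = 2 * sr_factor * lr_w * lr_h

print(prior_art)            # 10444800
print(present)              # 2457600
print(present / prior_art)  # 0.235..., i.e. less than a quarter
```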
Figure 1 is a flow chart illustrating the basic steps of the present methodology. A more detailed description of the present methodology is provided in Examples 1 and 2 of the Examples section which follows.
The present methodology can be used in enhancing digital zooming, of, for example, a digital camera or a digital microscope; in providing high definition stills; or for enhancing images by anti-aliasing or reduction of noise. It can be used to enhance any image of any type and content including real and animated images (e.g. CGI or 3D game images).
Example 2 of the Examples section which follows exemplifies use of the present method in enhancing digital zooming of a standard definition still image.
The present methodology can be implemented using a software application executed by a processing unit of a device, such as a personal computer, a server, a work station and the like; a handheld device such as a mobile phone or a PDA, a tablet, a digital stills or video camera and the like; or via a dedicated unit (chip) configured for such purposes and utilized by any of the aforementioned devices.
Thus, according to another aspect of the present invention there is provided a system for enhancing a low resolution image.
The system includes a processing unit capable of executing the methodology described herein. Exemplary systems include desktop and mobile computers, digital cameras, gaming devices (e.g. Sony PSP™), mobile phones, TV displays, and the like. As is mentioned hereinabove, the processing unit can be a dedicated processing unit, in which case an ASIC or an FPGA is used to implement the method, or it can be any processing unit capable of running dedicated software designed for executing the above described methodology.
In any case, the system is designed to process and enhance images captured by a camera of the system or for enhancing images (still and video) stored on or streamed to the system.
As used herein the term "about" refers to ± 10 %.
Additional objects, advantages, and novel features of the present invention will become apparent to one ordinarily skilled in the art upon examination of the following examples, which are not intended to be limiting. Additionally, each of the various embodiments and aspects of the present invention as delineated hereinabove and as claimed in the claims section below finds experimental support in the following examples.

EXAMPLES
Reference is now made to the following examples, which together with the above descriptions, illustrate the invention in a non limiting fashion.
EXAMPLE 1
Super-Resolution algorithm
The following describes an algorithm for multi-frame super resolution suitable for use in low memory digital still applications. Prior art super-resolution approaches assume that N low resolution images represent different snapshots of the same scene. The real image that needs to be estimated is a single high resolution image X of size Wsr x Hsr. Thus, each low resolution (LR) image I1, I2, ..., IN is treated as a down-sampled, degraded version of the real-life, actual image, with a size Wlr x Hlr, where Wlr < Wsr and Hlr < Hsr.
Thus, prior art approaches create a super-resolution image by solving the following equation:
I1 = D1 · B1 · W1 · X + e1
I2 = D2 · B2 · W2 · X + e2
...
IN = DN · BN · WN · X + eN
where D represents the down sampling matrix, B is the blur matrix and W is the geometric warping matrix.
This can be written in simple form as: I = A·X + e
where Ak = Dk · Bk · Wk for k = 1, 2, ..., N.
The operator A represents all image degradation factors such as sub-sampling, blurring, warping, etc., and e represents the Gaussian noise.
Solving the above for X is an inverse problem. Solving this problem can be done with numerical methods in an iterative approach.
By assuming that R approximates the inverse of matrix A, one can use the iteration:
Xk+1 = Xk + R · (I - A·Xk)
where A is the linear degradation operator, I is the set of low resolution images and Xk is the kth approximation of the final super resolution image.
Because I is a set of low resolution images, prior art approaches require that all images be kept in memory in order to enable iterative processing of these images.
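The update rule above can be made concrete with a toy example (an illustration of the general iteration, not the patent's implementation): here A down-samples a 1-D "image" by pairwise averaging, and R is a crude approximate inverse that up-samples by replication.

```python
import numpy as np

# Toy illustration of the iterative update Xk+1 = Xk + R(I - A*Xk).
def A(x):
    """Degradation operator: 2x down-sampling by pairwise averaging."""
    return x.reshape(-1, 2).mean(axis=1)

def R(i):
    """Approximate inverse of A: 2x up-sampling by replication."""
    return np.repeat(i, 2)

i_lr = np.array([1.0, 4.0, 2.0, 3.0])  # observed low resolution data
x = np.ones(8)                          # deliberately crude initial estimate

for _ in range(3):                      # iterative refinement
    x = x + R(i_lr - A(x))

# After refinement, degrading the estimate reproduces the observation.
print(np.allclose(A(x), i_lr))  # True
```

Because R·A is here a projection, the residual I - A·Xk vanishes after the first pass; with realistic blur and warping operators, more iterations are needed.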
In order to apply this prior art approach to video processing, the following general algorithm is utilized:
(i) create an empty, temporary, high resolution image;
(ii) translate each pixel in every LR image into the HR image (translation and up-scale - R);
(iii) create a set of simulated LR images by down-sampling and shifting an approximation to final HR (operator A on X);
(iv) compare each simulated LR with the original LR and create a difference image (I - A·X);
(v) interpolate the result using a Gaussian filter (any other filter is possible) with sigma equal to half the resolution factor and a kernel size big enough to follow the following guideline: min(kernel_values)/max(kernel_values) <= 0.005;
(vi) add the diff image to a finalHR image (Xk+1 = Xk + ...);
(vii) return to step (i), using the set of difference images instead of the original LR set.
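The kernel sizing guideline of step (v) can be turned into a small calculation (a sketch; the function name is not from the patent): grow the Gaussian kernel radius until the edge-to-center value ratio, which is min/max for a centered Gaussian, drops to 0.005 or below.

```python
import math

# Find the smallest kernel radius satisfying
# min(kernel_values)/max(kernel_values) <= ratio, with sigma equal to
# half the resolution factor as prescribed in step (v).
def kernel_radius(resolution_factor, ratio=0.005):
    sigma = resolution_factor / 2.0
    r = 1
    # For a centered Gaussian, the edge value over the center value is
    # exp(-r^2 / (2 sigma^2)).
    while math.exp(-(r * r) / (2.0 * sigma * sigma)) > ratio:
        r += 1
    return r  # the kernel size is then 2*r + 1 taps

# For a 6x zoom (sigma = 3) the guideline yields a radius of 10,
# i.e. a 21-tap kernel.
print(kernel_radius(6))  # 10
```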
The problem with such an implementation is that it requires the entire set of LR images for processing and thus requires large data storage capabilities.
In order to overcome the memory problem inherent to prior art approaches, the present inventors have devised a novel approach which forgoes storage of the entire LR image set and applies the following calculation:
Xn+1 = Xn + R · ( Σk(Rk · Ik) - Σk(Rk · Ak · Xn) )
where R is the approximation of the inverse of A in the accumulated high resolution space.
The following steps are utilized in order to implement the present approach. Each input LR frame can be a color or a black and white frame; in cases where the input LR frame is in color, each plane is processed separately (optionally in parallel):
(i) each LR frame is up-scaled, translated and accumulated into a sumHRframes variable - as a result, the actual set of low resolution images need not be stored;
(ii) an estimated version of finalHR is then created (by simply resizing the last frame using traditional methods such as bicubic interpolation);
(iii) N simulated LR frames are then generated by down-sampling, shifting and then up-sampling according to the image registration stage;
(iv) all simulated images are then accumulated into a sum_of_simLR;
(v) the difference between the sumHRframes and sum_of_simLR is then identified;
(vi) the difference image is then interpolated by using a Gaussian filter (or any other suitable filter); and
(vii) the result is added to a final HR image. These steps are then repeated for each LR frame. Since comparisons between the simulated and actual LR are performed in the high resolution space, one must compensate for this by increasing the size and sigma of the Gaussian filter to cover more area of the sparser matrix.
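The steps above can be sketched structurally as follows (a simplified interpretation under toy operators, not the patent's code: "up-scale and translate" is replaced by replication and "down-sample and shift" by pairwise averaging, with the hypothetical names upscale/degrade). The key property is that only accumulated high resolution buffers are kept, so each LR frame can be discarded once accumulated.

```python
import numpy as np

def upscale(i):   # stand-in for "up-scale and translate" into HR space
    return np.repeat(i, 2)

def degrade(x):   # stand-in for "down-sample and shift"
    return x.reshape(-1, 2).mean(axis=1)

frames = [np.array([1.0, 4.0, 2.0, 3.0]) for _ in range(5)]  # N LR frames

# (i) accumulate every up-scaled LR frame into sumHRframes; the LR
#     frames themselves no longer need to be stored after this point.
sum_hr_frames = sum(upscale(f) for f in frames)

# (ii) initial estimate of finalHR (deliberately crude here).
final_hr = np.ones(8)

for _ in range(3):
    # (iii)-(iv) accumulate simulated LR frames, mapped back into HR space.
    sum_of_sim_lr = sum(upscale(degrade(final_hr)) for _ in frames)
    # (v)-(vii) difference taken in HR space (optionally filtered), added back.
    diff = sum_hr_frames - sum_of_sim_lr
    final_hr = final_hr + diff / len(frames)

# Degrading the result reproduces the observed LR frames.
print(np.allclose(degrade(final_hr), frames[0]))  # True
```

Note that memory use stays at two HR-sized buffers (sum_hr_frames and sum_of_sim_lr) plus the estimate, independent of N.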
EXAMPLE 2
Super-resolution of a still image
The present approach was utilized to enhance an image captured with a 640X480 CMOS camera (Figure 2).
The size of the sequence of the LR frames needed to create the high resolution image was determined prior to execution.
(i) Each LR frame was registered to the previous captured LR frame and displacement parameters between the two frames were generated. If the images are translated, the result is Di(x) and Di(y) - representing the displacement on the horizontal and vertical planes, respectively, of the i'th frame. A set of displacements, D, was then generated for all input frames.
(ii) The size of the high resolution frame was estimated (at the first iteration it can be created using standard bi-cubic scaling or any other known single frame scaling method) and N simulated low resolution frames were generated by down-sampling, geometric translation (e.g. shifting using -1*Di(x,y) for the i'th frame), adding noise and blurring (this step can also utilize any other anticipated degradation processes, such as artifacts caused by imperfections in the optical parts, e.g. lens, filters, etc.).
The set of simulated low resolution frames was then subtracted from the original low resolution frames. Each resulting difference frame represents the difference between an actual and a simulated low resolution frame. This step enabled prediction of an optimum solution.
(iii) Each difference frame was then mapped to a high resolution grid using upscaling and translation using Di(x,y) for the i'th frame.
(iv) The high resolution grid formed a sparse matrix with accumulated information of all the mapped difference frames. The high resolution grid was then interpolated using a Gaussian filter (with large support and appropriate sigma). The interpolated high resolution grid was then added to a final super resolution image and the resulting image was then used as input for the next iteration of the above described process starting with re-estimation of the high resolution image.
(v) Following completion of the iterative process, the final image was sharpened by interpolating it with a Gaussian filter (with parameters similar to those used in iv) and multiplying it by a sharpening factor. This resulted in a blurred image that was subtracted from the final image: Fsr = Fsr - a(H*Fsr) where * represents convolution and H represents a Gaussian kernel and Fsr is the final super resolved image.
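The registration of step (i) is left open by the description above; phase correlation is one common choice (an assumption here, not the patent's stated method) for recovering the displacement parameters Di(x) and Di(y) when inter-frame motion is a pure translation:

```python
import numpy as np

# Estimate the integer translation between two frames by phase
# correlation: the normalized cross-power spectrum of a shifted pair
# has an inverse FFT that peaks at the displacement.
def phase_correlation(prev, cur):
    F1, F2 = np.fft.fft2(prev), np.fft.fft2(cur)
    cross = np.conj(F1) * F2
    cross /= np.abs(cross) + 1e-12          # keep only phase information
    corr = np.fft.ifft2(cross).real
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    # Map the wrapped peak position to signed displacements.
    if dy > prev.shape[0] // 2:
        dy -= prev.shape[0]
    if dx > prev.shape[1] // 2:
        dx -= prev.shape[1]
    return int(dx), int(dy)                 # Di(x), Di(y)

rng = np.random.default_rng(0)
prev = rng.random((32, 32))
cur = np.roll(prev, shift=(2, -3), axis=(0, 1))  # shifted by dy=2, dx=-3
print(phase_correlation(prev, cur))  # (-3, 2)
```

Sub-pixel displacements, which matter for super-resolution, would require interpolating around the correlation peak; the sketch above recovers integer shifts only.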
In order to further reduce memory requirements, steps (ii)-(iv) were altered as follows. The simulated low resolution frames were mapped into an accumulated high resolution grid via upscaling and translating. The current estimation of the high resolution frame (at first iteration it can be created using standard bi-cubic scaling or any other known single frame scaling method) was then identified and N frames of simulated low resolution frames were generated via down-sampling, geometric translation (e.g. shifting it using -l*Di(x,y) for the i'th frame), adding noise and blurring. Each simulated low resolution image was then mapped to an accumulated high resolution simulation grid.
The simulation grid was then subtracted from the high resolution grid and the result was interpolated using a Gaussian filter (with higher support and higher sigma in order to accommodate working in high resolution space). The interpolated image was then added to the final high resolution image which in turn was used as an input for the next iteration.
The above described alteration to the present process reduced memory use to twice the size of the final, super resolved image regardless of the size of the input sequence (the memory requirements are the same whether N=15 frames or N=100 frames).
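The sharpening formula of step (v), Fsr = Fsr - a(H*Fsr), can be sketched on a 1-D signal (an illustration only; the patent applies a 2-D Gaussian kernel, and the parameter values below are assumptions):

```python
import numpy as np

def gaussian_kernel(radius, sigma):
    """Normalized 1-D Gaussian kernel H."""
    x = np.arange(-radius, radius + 1)
    k = np.exp(-(x * x) / (2.0 * sigma * sigma))
    return k / k.sum()

def sharpen(f, a=0.5, radius=3, sigma=1.5):
    """Fsr = Fsr - a*(H * Fsr), where * is convolution."""
    h = gaussian_kernel(radius, sigma)
    blurred = np.convolve(f, h, mode="same")
    return f - a * blurred

f = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0])
g = sharpen(f)

# Hallmark of this kind of sharpening: overshoot above (1-a)*max on the
# high side of the edge and undershoot below zero on the low side.
print(g.max() > 0.5, g.min() < 0)  # True True
```

Note that the subtraction also scales the flat regions by roughly (1-a); in practice the result is typically renormalized or the factor a kept small.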
Figure 3 illustrates a 6X digital zoom of the image of Figure 2 generated using bicubic interpolation. Figure 4 illustrates a 6X digital zoom of the image of Figure 2 generated using the present approach.
These results clearly show that the present approach results in enhanced resolution/definition, a reduction in moire patterns and other aliasing effects, and an improved signal to noise ratio (SNR).

It is appreciated that certain features of the invention, which are, for clarity, described in the context of separate embodiments, may also be provided in combination in a single embodiment. Conversely, various features of the invention, which are, for brevity, described in the context of a single embodiment, may also be provided separately or in any suitable subcombination.
Although the invention has been described in conjunction with specific embodiments thereof, it is evident that many alternatives, modifications and variations will be apparent to those skilled in the art. Accordingly, it is intended to embrace all such alternatives, modifications and variations that fall within the spirit and broad scope of the appended claims. All publications, patents and patent applications mentioned in this specification are herein incorporated in their entirety by reference into the specification, to the same extent as if each individual publication, patent or patent application was specifically and individually indicated to be incorporated herein by reference. In addition, citation or identification of any reference in this application shall not be construed as an admission that such reference is available as prior art to the present invention.

Claims

WHAT IS CLAIMED IS:
1. A method of enhancing a low resolution image comprising:
(a) obtaining a plurality of low resolution frames corresponding to the low resolution image;
(b) upscaling a frame of said plurality of low resolution frames to generate an upscaled frame;
(c) using said upscaled frame to generate a plurality of simulated low resolution frames;
(d) identifying differences between said plurality of low resolution frames and said plurality of simulated low resolution frames; and
(e) mapping said differences to a high resolution grid and generating an enhanced version of said low resolution image.
2. The method of claim 1, wherein (d) is effected by:
(i) mapping said plurality of low resolution frames to a first grid;
(ii) mapping said plurality of simulated low resolution frames to a second grid; and
(iii) identifying said differences between said first grid and said second grid.
3. The method of claim 1, further comprising interpolating said differences prior to said generating said enhanced version of said low resolution image.
4. The method of claim 3, further comprising using said enhanced version of said low resolution image to repeat (c)-(e).
5. The method of claim 3, wherein said interpolating is effected using a Gaussian filter.
6. The method of claim 1, wherein said plurality of low resolution frames constitute a time sequence of said low resolution frames.
7. The method of claim 6, wherein said frame of said plurality of low resolution frames is a first or last frame of said time sequence.
8. A system for enhancing a low resolution image comprising a computing unit executing the method of claim 1.
9. The system of claim 8, wherein the system is a desktop or laptop computer, a tablet, a handheld device, a digital camera or a dedicated display.
PCT/IL2011/000290 2010-06-17 2011-04-05 System and method for enhancing images WO2011158225A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US35557510P 2010-06-17 2010-06-17
US61/355,575 2010-06-17

Publications (1)

Publication Number Publication Date
WO2011158225A1 true WO2011158225A1 (en) 2011-12-22



