US20160284066A1 - Method to improve video quality under low light conditions - Google Patents

Method to improve video quality under low light conditions

Info

Publication number
US20160284066A1
Authority
US
United States
Prior art keywords
frame
current frame
motion estimation
enhanced
filtering
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
US14/669,433
Other versions
US9466094B1 (en)
Inventor
Xiaogang Dong
Jiro Takatori
Tak Shing Wong
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sony Corp
Original Assignee
Sony Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sony Corp filed Critical Sony Corp
Priority to US14/669,433 priority Critical patent/US9466094B1/en
Assigned to SONY CORPORATION reassignment SONY CORPORATION ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: WONG, TAK SHING, DONG, XIAOGANG, Takatori, Jiro
Publication of US20160284066A1 publication Critical patent/US20160284066A1/en
Application granted granted Critical
Publication of US9466094B1 publication Critical patent/US9466094B1/en
Active legal-status Critical Current
Adjusted expiration legal-status Critical


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/50 Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • G06T5/002
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/70 Denoising; Smoothing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T15/00 3D [Three Dimensional] image rendering
    • G06T15/50 Lighting effects
    • G06T15/503 Blending, e.g. for anti-aliasing
    • G06T3/0093
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/80 Camera processing pipelines; Components thereof
    • H04N23/81 Camera processing pipelines; Components thereof for suppressing or minimising disturbance in the image signal generation
    • H04N5/235
    • H04N9/045
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2200/00 Indexing scheme for image data processing or generation, in general
    • G06T2200/12 Indexing scheme for image data processing or generation, in general involving antialiasing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10016 Video; Image sequence
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10024 Color image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20004 Adaptive image processing
    • G06T2207/20012 Locally adaptive
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20024 Filtering details
    • G06T2207/20028 Bilateral filtering


Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computer Graphics (AREA)
  • Image Analysis (AREA)

Abstract

A method to improve video quality by suppressing noise and artifacts in difference frames of a video is described herein.

Description

    FIELD OF THE INVENTION
  • The present invention relates to video processing. More specifically, the present invention relates to improving video quality under low light conditions.
  • BACKGROUND OF THE INVENTION
  • Video recording is a standard feature for cameras, smart phones, tablets and many other devices. Compact cameras and mobile devices such as phones and tablets are usually equipped with smaller image sensors and less than ideal optics. Improving video quality is especially important for these devices. High-end cameras and camcorders are generally equipped with larger image sensors and better optics. Videos captured with these devices have decent quality under normal lighting conditions. However, videos recorded under low light conditions still demand significant improvement, even for high-end cameras and camcorders. In addition, many recording devices have increased their resolutions in recent years (e.g., from SD to HD, from HD to 4K, and perhaps from 4K to 8K in the future). Increased video resolution lowers the signal-to-noise ratio at every pixel location on the image sensor, making video quality improvement even more challenging.
  • SUMMARY OF THE INVENTION
  • A method to improve video quality by suppressing noise and artifacts in difference frames of a video is described herein.
  • In one aspect, a method programmed in a non-transitory memory of a device comprises acquiring video content which includes a plurality of frames, including storing the video content in the non-transitory memory, performing motion estimation on a current frame and a previous frame to generate a motion estimated aligned frame, subtracting the motion estimated aligned frame from the current frame to generate a difference frame, enhancing the difference frame, adding the enhanced difference frame and the motion estimated aligned frame to generate a first enhanced current frame, enhancing the current frame directly to generate a second enhanced current frame, performing motion estimation error detection using the current frame and the motion estimated aligned frame and blending the first enhanced current frame with the second enhanced current frame based on the motion estimation error detection to generate an enhanced frame. The method further comprises capturing the video content with an image sensor. The motion estimation includes null motion estimation, global motion estimation, or local motion estimation. Enhancing the current frame includes spatial filtering or transformation domain filtering the current frame, and enhancing the difference frame includes spatial filtering or transformation domain filtering the difference frame. Filtering includes, but is not limited to, average filtering, bilateral filtering, or transformation domain filtering such as wavelet filtering. Blending utilizes a blending coefficient. The blending coefficient depends on a confidence level of motion estimation errors, wherein if the confidence level is approximately 100%, then the blending coefficient is 0, and wherein if the confidence level that there are no motion estimation errors is approximately 100%, then the blending coefficient is close to 1.
  • In another aspect, a system programmed in a non-transitory memory of a device comprises an image sensor configured for acquiring video content which includes a plurality of frames, hardware components configured for: performing motion estimation on a current frame and a previous frame to generate a motion estimated aligned frame, subtracting the motion estimated aligned frame from the current frame to generate a difference frame, enhancing the difference frame, adding the enhanced difference frame and the motion estimated aligned frame to generate a first enhanced current frame, enhancing the current frame directly to generate a second enhanced current frame, performing motion estimation error detection using the current frame and the motion estimated aligned frame and blending the first enhanced current frame with the second enhanced current frame based on the motion estimation error detection to generate an enhanced frame and a display device configured for displaying an enhanced video including the enhanced frame. The system further comprises an image processor for processing the video content. The motion estimation includes null motion estimation, global motion estimation, or local motion estimation. Enhancing the current frame includes spatial filtering or transformation domain filtering the current frame, and enhancing the difference frame includes spatial filtering or transformation domain filtering the difference frame. Filtering includes, but is not limited to, average filtering, bilateral filtering, or transformation domain filtering such as wavelet filtering. Blending utilizes a blending coefficient. The blending coefficient depends on a confidence level of motion estimation errors, wherein if the confidence level is approximately 100%, then the blending coefficient is 0, and wherein if the confidence level that there are no motion estimation errors is approximately 100%, then the blending coefficient is close to 1.
  • In another aspect, a camera apparatus comprises an image sensor configured for acquiring video content which includes a plurality of frames, a non-transitory memory for storing an application, the application for: performing motion estimation on a current frame and a previous frame to generate a motion estimated aligned frame, subtracting the motion estimated aligned frame from the current frame to generate a difference frame, enhancing the difference frame, adding the enhanced difference frame and the motion estimated aligned frame to generate a first enhanced current frame, enhancing the current frame directly to generate a second enhanced current frame, performing motion estimation error detection using the current frame and the motion estimated aligned frame and blending the first enhanced current frame with the second enhanced current frame based on the motion estimation error detection to generate an enhanced frame and a processing component coupled to the memory, the processing component configured for processing the application. The camera apparatus further comprises an image processor for processing the video content. The motion estimation includes null motion estimation, global motion estimation, or local motion estimation. Enhancing the current frame includes spatial filtering or transformation domain filtering the current frame, and enhancing the difference frame includes spatial filtering or transformation domain filtering the difference frame. Filtering includes, but is not limited to, average filtering, bilateral filtering, or transformation domain filtering such as wavelet filtering. Blending utilizes a blending coefficient. The blending coefficient depends on a confidence level of motion estimation errors, wherein if the confidence level is approximately 100%, then the blending coefficient is 0, and wherein if the confidence level that there are no motion estimation errors is approximately 100%, then the blending coefficient is close to 1.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • FIG. 1 illustrates examples of difference frames according to some embodiments.
  • FIG. 2 illustrates a flowchart of a method of improving video quality according to some embodiments.
  • FIG. 3 illustrates a block diagram of an exemplary computing device configured to implement the method of improving video quality according to some embodiments.
  • DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT
  • A video includes a sequence of frames or 2D images in temporal order. Each frame or 2D image usually consists of thousands or millions of pixels. There are one or more values at each pixel location. For example, there is one value per pixel in grayscale images, and there are three values per pixel in color images. The difference frame of two neighboring frames is obtained by subtracting the pixel values of the previous frame from the corresponding pixel values of the current frame. The correspondence of pixel locations between two frames may take the relative movements of objects into account. The difference frame is generally sparse, i.e., image values are zero or close to zero at most pixel locations. A method to improve video quality by suppressing noise and artifacts in the difference frames is described herein.
  • FIG. 1 illustrates examples of difference frames according to some embodiments. When a noisefree frame 1 is subtracted from a noisefree frame 2, the result is a noisefree difference frame. The difference frame looks mid-gray or close to mid-gray at most pixel locations. Here, mid-gray represents the pixel value 0 in the difference frame. The difference frame looks brighter at pixel locations where the values in the noisefree frame 2 are larger than the corresponding values in the noisefree frame 1. Similarly, the difference frame looks darker at pixel locations where the values in the noisefree frame 2 are smaller than the corresponding values in the noisefree frame 1. Relative motions of the objects between the two frames are not considered in this example. The difference frame may be seen as mid-gray almost everywhere when perfect motion compensation is applied between the two frames.
  • When a denoised frame 1 is subtracted from a noisy frame 2, the result is a noisy difference frame. Averages of pixel values are 0 or close to 0 in many areas in the difference frame. These areas are shown in mid-gray or close to mid-gray. However, values at individual pixel locations may deviate from 0 due to the presence of noise. If the noise in the noisy difference frame is suppressed and made as close as possible to the noisefree difference frame, a decently denoised frame 2 is able to be obtained.
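  • The sparsity property is easy to verify numerically. The following minimal NumPy sketch is illustrative only (the frame contents and noise level are made up, not from the patent); it forms a difference frame for a static scene and shows that the average stays near 0 while individual pixel values deviate due to noise:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical noisefree frame pair for a static scene: frame 2 equals frame 1.
noisefree_1 = rng.integers(0, 256, size=(480, 640)).astype(np.float32)
noisefree_2 = noisefree_1.copy()

# Noisefree difference frame: exactly 0 (mid-gray) at every pixel location.
diff_noisefree = noisefree_2 - noisefree_1

# Noisy frame 2 and the resulting noisy difference frame: the average stays
# near 0 in most areas, but individual pixels deviate because of the noise.
noisy_2 = noisefree_2 + rng.normal(0.0, 10.0, noisefree_2.shape).astype(np.float32)
diff_noisy = noisy_2 - noisefree_1

print(diff_noisefree.std())                 # 0.0
print(diff_noisy.mean(), diff_noisy.std())  # ~0 and ~10 (the injected noise level)
```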
  • FIG. 2 illustrates a flowchart of a method of improving video quality according to some embodiments. In some embodiments, a video is acquired or received. For example, a user takes a video using a digital camera device, and the video includes many frames. In the step 200, motion estimation is implemented using a current frame (e.g., noisy frame) and a previous (or preceding) enhanced frame (e.g., denoised frame) to generate a Motion Estimation (ME) aligned frame. A goal of the ME aligned frame is maximizing the sparsity of difference frames (or canceling meaningful signal as much as possible, so only noise remains in the difference frames). In the step 202, the ME aligned frame is subtracted from the current frame to generate a difference frame. In the step 204, the difference frame is enhanced. The difference frame is enhanced by any implementation of image enhancement algorithms. Suitable image enhancement algorithms may include steps such as noise reduction and/or artifact removal. In the step 206, the ME aligned frame and the enhanced difference frame are added together to generate a first enhanced current frame. Such enhancement is based on difference frame enhancement.
  • In the step 210, ME error detection is implemented using the current frame and the ME aligned frame as input. ME error detection attempts to detect any errors in the estimations of object movements. If there are any errors, different image contents may be present at the corresponding pixel locations of the current frame and the ME aligned frame. As a result, some motion artifacts may be observed in the output of the step 206, e.g., the first enhanced current frame. Therefore, an alternative enhancement method is needed for those areas affected by ME errors. One possible candidate is direct single frame enhancement.
  • In the step 208, a single frame enhance method is implemented using the current frame as input. The single frame enhance method includes any implementation of image enhancement algorithms, including steps such as noise reduction and/or artifact removal. The single frame enhance method outputs a second enhanced current frame. In the step 212, based on the ME error detection, the first enhanced current frame and the second enhanced current frame are blended to generate a final enhanced frame. The final enhanced frame is used to generate a video with better video quality.
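  • Putting the steps 200 through 212 together, the data flow of FIG. 2 is able to be expressed compactly. The sketch below is an illustrative NumPy rendering, not the patent's implementation; the `motion_estimate`, `enhance` and `detect_me_error` callables are placeholders for whatever ME, enhancement and error-detection methods a given system actually uses:

```python
import numpy as np

def enhance_video_frame(current, prev_enhanced,
                        motion_estimate, enhance, detect_me_error):
    """One pass of the FIG. 2 pipeline. All three callables are placeholders."""
    aligned = motion_estimate(current, prev_enhanced)  # step 200: ME aligned frame
    difference = current - aligned                     # step 202: difference frame
    enhanced_diff = enhance(difference)                # step 204: enhance difference
    first_enhanced = aligned + enhanced_diff           # step 206: first enhanced frame
    second_enhanced = enhance(current)                 # step 208: single frame enhance
    err_conf = detect_me_error(current, aligned)       # step 210: confidence of ME errors
    alpha = 1.0 - err_conf                             # alpha = 0 at 100% error confidence
    # step 212: blend; alpha is near 1 where ME is trusted, near 0 near ME errors
    return (1.0 - alpha) * second_enhanced + alpha * first_enhanced
```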
  • Motion estimation is generated using the current frame and the previous enhanced frame. Various motion estimations are able to be used depending on the desired system complexity. "Null" motion, the simplest, assumes there are no motions at all. Global motion assumes that there is only camera movement. Local motion assumes both camera and object movements. Motion estimation is generally not error-free. In one example, if "null" motion is assumed, then there are motion estimation errors whenever there are any camera or object movements. In another example, if global motion is assumed, then there are motion estimation errors if the global motion estimate is not accurate enough or there are any object movements. Even when local motion is applied, there usually exists some inaccurate estimation of local motions. Many different kinds of artifacts may appear when motion estimation errors occur. It is therefore important to have an "ME Error Detection" block to detect motion estimation errors as well as the areas more susceptible to artifacts.
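  • As one hedged illustration of the global case, a translation-only global motion is able to be estimated by exhaustive search over integer shifts. Real systems would typically use phase correlation or feature-based methods; the search range and the wrap-around edge handling below are toy simplifications:

```python
import numpy as np

def align_global_translation(current, previous, search=8):
    """Toy global motion estimation: test every integer shift in a small window
    and keep the one minimizing the mean squared difference frame."""
    best_shift, best_err = (0, 0), np.inf
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            candidate = np.roll(previous, (dy, dx), axis=(0, 1))  # wraps at edges
            err = np.mean((current - candidate) ** 2)
            if err < best_err:
                best_shift, best_err = (dy, dx), err
    return np.roll(previous, best_shift, axis=(0, 1))  # the ME aligned frame

# Null motion is the degenerate case: the ME aligned frame is simply the
# previous enhanced frame, so any camera or object movement becomes an ME error.
```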
  • Various image enhancement technologies are able to be applied to both “single frame enhance” and “difference frame enhance.” Sample technologies include: simple average filter, bilateral filter, “wavelet transform on incomplete image data and its applications in image processing”, as described in U.S. Pat. No. 8,731,281, issued on May 20, 2014, and “an improved method to measure local image similarity and its application in image processing,” as described in U.S. patent application Ser. No. 12/931,962, filed on Feb. 15, 2011, which is incorporated by reference in its entirety for all purposes.
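  • For the first two sample technologies, off-the-shelf implementations exist. The snippet below is a sketch using OpenCV, with made-up kernel size and sigma values, showing how they could be applied to a frame or a difference frame:

```python
import numpy as np
import cv2  # OpenCV ships both of the simple sample filters

frame = np.random.rand(480, 640).astype(np.float32)  # stand-in noisy input

# Simple average filter: strong noise suppression, but blurs edges.
averaged = cv2.blur(frame, (3, 3))

# Bilateral filter: the range kernel keeps pixels from mixing across edges,
# so noise is suppressed while object boundaries are preserved.
bilateral = cv2.bilateralFilter(frame, d=5, sigmaColor=0.1, sigmaSpace=5)
```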
  • A blending step of the first enhanced current frame (based on enhancing the difference frame) and the second enhanced current frame (based on enhancing the single frame) is applied to deal with ME errors. The result is: final enhanced frame = (1 − α) × second enhanced current frame + α × first enhanced current frame. Thus, if α = 0, then the final enhanced frame is 100% of the second enhanced current frame and 0% of the first enhanced current frame. If α = 0.9, then the final enhanced frame is 10% of the second enhanced current frame blended with 90% of the first enhanced current frame. Blending is able to be performed on a per-pixel basis, at the block level, or in any other implementation.
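  • The same formula applies at any granularity. A minimal sketch follows; the 16×16 block size and the α values are arbitrary choices for illustration:

```python
import numpy as np

h, w = 480, 640
first_enhanced = np.zeros((h, w), np.float32)   # placeholder frames
second_enhanced = np.ones((h, w), np.float32)

# Pixel-basis blending: one alpha per pixel location.
alpha_pixel = np.full((h, w), 0.9, np.float32)
final_pixel = (1 - alpha_pixel) * second_enhanced + alpha_pixel * first_enhanced

# Block-level blending: one alpha per 16x16 block, expanded to pixel resolution.
alpha_block = np.full((h // 16, w // 16), 0.5, np.float32)
alpha_map = np.kron(alpha_block, np.ones((16, 16), np.float32))
final_block = (1 - alpha_map) * second_enhanced + alpha_map * first_enhanced
```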
  • The blending coefficient α depends on the confidence level of ME error detection. When the confidence about occurrences of ME errors is 100%, α = 0 is used. When the confidence that there are no ME errors is approximately 100% (e.g., greater than 90%), an α close to 1 (e.g., 0.9) is used. Any α between 0 and 1 is able to be used. For example, if the confidence of ME errors is 50%, then α is able to be 0.5.
  • The confidence level is able to be determined in any manner. For example, the difference between the current frame and the ME aligned frame is calculated; if the difference is above a first threshold, then the confidence that there are errors is 100%; if the difference is below the first threshold but above a second threshold, then the confidence of errors is 90%; and so on, until the difference falls below a final (e.g., lowest) threshold, at which point the confidence of errors is 0% (e.g., 100% confidence of no errors). There are able to be any number of thresholds. A table is able to be used to provide the confidence of errors or no errors corresponding with the difference amount. Alternatively, a continuous function mapping the difference between the current frame and the ME aligned frame to the confidence level may also be defined.
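  • Both the threshold-table and the continuous-function variants are straightforward to sketch. In the NumPy example below, the threshold values, intermediate confidence levels and decay scale are invented for illustration; only the shape of the mapping (larger difference → higher error confidence → smaller α) follows the text:

```python
import numpy as np

def alpha_from_thresholds(current, aligned, t=(30.0, 20.0, 10.0)):
    """Table-style mapping: compare |difference| against descending thresholds
    to get a per-pixel confidence of ME errors, then return alpha = 1 - conf."""
    diff = np.abs(current - aligned)
    conf_errors = np.zeros_like(diff)   # below the lowest threshold: 0% errors
    conf_errors[diff >= t[2]] = 0.5     # illustrative intermediate level
    conf_errors[diff >= t[1]] = 0.9
    conf_errors[diff >= t[0]] = 1.0     # above the first threshold: 100% errors
    return 1.0 - conf_errors            # alpha = 0 at 100% error confidence

def alpha_continuous(current, aligned, scale=20.0):
    """Continuous alternative: a smooth monotone map from difference to alpha."""
    return np.exp(-np.abs(current - aligned) / scale)
```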
  • FIG. 3 illustrates a block diagram of an exemplary computing device configured to implement the improved video quality method according to some embodiments. The computing device 300 is able to be used to acquire, store, compute, process, communicate and/or display information such as images and videos. In general, a hardware structure suitable for implementing the computing device 300 includes a network interface 302, a memory 304, a processor 306, I/O device(s) 308, a bus 310 and a storage device 312. The choice of processor is not critical as long as a suitable processor with sufficient speed is chosen. The memory 304 is able to be any conventional computer memory known in the art. The storage device 312 is able to include a hard drive, CDROM, CDRW, DVD, DVDRW, Blu-ray disc/drive, flash memory card or any other storage device. The computing device 300 is able to include one or more network interfaces 302. An example of a network interface includes a network adapter connected to an Ethernet or other type of wired or wireless network interface adapter. The I/O device(s) 308 are able to include one or more of the following: keyboard, mouse, monitor, screen, printer, modem, touchscreen, button interface and other devices. Improved video quality method application(s) 330 used to implement the improved video quality method are likely to be stored in the storage device 312 and memory 304 and processed as applications are typically processed. More or fewer components shown in FIG. 3 are able to be included in the computing device 300. In some embodiments, specific hardware 320 is included for the improved video quality method. Although the computing device 300 in FIG. 3 includes applications 330 and hardware 320 for the improved video quality method, the improved video quality method is able to be implemented on a computing device in hardware, firmware, software or any combination thereof. For example, in some embodiments, the improved video quality method applications 330 are programmed in a memory and executed using a processor. In another example, in some embodiments, the improved video quality method hardware 320 is programmed hardware logic including gates specifically designed to implement the improved video quality method.
  • In some embodiments, the improved video quality method application(s) 330 include several applications and/or modules. In some embodiments, modules include one or more sub-modules as well. In some embodiments, fewer or additional modules are able to be included.
  • Examples of suitable computing devices include a personal computer, a laptop computer, a computer workstation, a server, a mainframe computer, a handheld computer, a personal digital assistant, a cellular/mobile telephone, a smart appliance, a gaming console, an augmented reality device, a digital camera, a digital camcorder, a camera phone, a smart phone, a tablet computer, a mobile device, a video player, a video disc writer/player (e.g., DVD writer/player, blu-ray disc writer/player), a television, a home entertainment system, a wearable computing device (e.g., smart watch) or any other suitable computing device.
  • To utilize the improved video quality method described herein, a device such as a digital camera/camcorder is used to acquire image/video content. The improved video quality method is automatically used when acquiring and/or encoding the content. The improved video quality method is able to be implemented with user assistance or automatically without user involvement.
  • In operation, the improved video quality method provides better quality content, particularly in low light situations.
  • Some Embodiments of a Method to Improve Video Quality Under Low Light Conditions
  • 1. A method programmed in a non-transitory memory of a device comprising:
      • a. acquiring video content which includes a plurality of frames, including storing the video content in the non-transitory memory;
      • b. performing motion estimation on a current frame and a previous frame to generate a motion estimated aligned frame;
      • c. subtracting the motion estimated aligned frame from the current frame to generate a difference frame;
      • d. enhancing the difference frame;
      • e. adding the enhanced difference frame and the motion estimated aligned frame to generate a first enhanced current frame;
      • f. enhancing the current frame directly to generate a second enhanced current frame;
      • g. performing motion estimation error detection using the current frame and the motion estimated aligned frame; and
      • h. blending the first enhanced current frame with the second enhanced current frame based on the motion estimation error detection to generate an enhanced frame.
      • 2. The method of clause 1 further comprising capturing the video content with an image sensor.
      • 3. The method of clause 1 wherein the motion estimation includes null motion estimation, global motion estimation, or local motion estimation.
      • 4. The method of clause 1 wherein enhancing the current frame includes spatial filtering or transformation domain filtering the current frame, and enhancing the difference frame includes spatial filtering or transformation domain filtering the difference frame.
      • 5. The method of clause 4 wherein filtering includes average filtering, bilateral filtering, or transformation domain filtering including wavelet filtering.
      • 6. The method of clause 1 wherein blending utilizes a blending coefficient.
      • 7. The method of clause 6 wherein the blending coefficient depends on a confidence level of motion estimation errors, wherein if the confidence level is approximately 100%, then the blending coefficient is 0, and wherein if the confidence level that there are no motion estimation errors is approximately 100%, then the blending coefficient is close to 1.
      • 8. A system programmed in a non-transitory memory of a device comprising:
        • a. an image sensor configured for acquiring video content which includes a plurality of frames;
        • b. hardware components configured for:
          • i. performing motion estimation on a current frame and a previous frame to generate a motion estimated aligned frame;
          • ii. subtracting the motion estimated aligned frame from the current frame to generate a difference frame;
          • iii. enhancing the difference frame;
          • iv. adding the enhanced difference frame and the motion estimated aligned frame to generate a first enhanced current frame;
          • v. enhancing the current frame directly to generate a second enhanced current frame;
          • vi. performing motion estimation error detection using the current frame and the motion estimated aligned frame; and
          • vii. blending the first enhanced current frame with the second enhanced current frame based on the motion estimation error detection to generate an enhanced frame; and
        • c. a display device configured for displaying an enhanced video including the enhanced frame.
      • 9. The system of clause 8 further comprising an image processor for processing the video content.
      • 10. The system of clause 8 wherein the motion estimation includes null motion estimation, global motion estimation, or local motion estimation.
      • 11. The system of clause 8 wherein enhancing the current frame includes spatial filtering or transformation domain filtering the current frame, and enhancing the difference frame includes spatial filtering or transformation domain filtering the difference frame.
      • 12. The system of clause 11 wherein filtering includes average filtering, bilateral filtering, or transformation domain filtering including wavelet filtering.
      • 13. The system of clause 8 wherein blending utilizes a blending coefficient.
      • 14. The system of clause 13 wherein the blending coefficient depends on a confidence level of motion estimation errors, wherein if the confidence level is approximately 100%, then the blending coefficient is 0, and wherein if the confidence level that there are no motion estimation errors is approximately 100%, then the blending coefficient is close to 1.
      • 15. A camera apparatus comprising:
        • a. an image sensor configured for acquiring video content which includes a plurality of frames;
        • b. a non-transitory memory for storing an application, the application for:
          • i. performing motion estimation on a current frame and a previous frame to generate a motion estimated aligned frame;
          • ii. subtracting the motion estimated aligned frame from the current frame to generate a difference frame;
          • iii. enhancing the difference frame;
          • iv. adding the enhanced difference frame and the motion estimated aligned frame to generate a first enhanced current frame;
          • v. enhancing the current frame directly to generate a second enhanced current frame;
          • vi. performing motion estimation error detection using the current frame and the motion estimated aligned frame; and
          • vii. blending the first enhanced current frame with the second enhanced current frame based on the motion estimation error detection to generate an enhanced frame; and
        • c. a processing component coupled to the memory, the processing component configured for processing the application.
      • 16. The camera apparatus of clause 15 further comprising an image processor for processing the video content.
      • 17. The camera apparatus of clause 15 wherein the motion estimation includes null motion estimation, global motion estimation, or local motion estimation.
      • 18. The camera apparatus of clause 15 wherein enhancing the current frame includes spatial filtering or transformation domain filtering the current frame, and enhancing the difference frame includes spatial filtering or transformation domain filtering the difference frame.
      • 19. The camera apparatus of clause 18 wherein filtering includes average filtering, bilateral filtering, or transformation domain filtering including wavelet filtering.
      • 20. The camera apparatus of clause 15 wherein blending utilizes a blending coefficient.
      • 21. The camera apparatus of clause 20 wherein the blending coefficient depends on a confidence level of motion estimation errors, wherein if the confidence level is approximately 100%, then the blending coefficient is 0, and wherein if the confidence level that there are no motion estimation errors is approximately 100%, then the blending coefficient is close to 1.
  • The present invention has been described in terms of specific embodiments incorporating details to facilitate the understanding of principles of construction and operation of the invention. Such reference herein to specific embodiments and details thereof is not intended to limit the scope of the claims appended hereto. It will be readily apparent to one skilled in the art that other various modifications may be made in the embodiment chosen for illustration without departing from the spirit and scope of the invention as defined by the claims.

Claims (21)

1. A method programmed in a non-transitory memory of a device comprising:
a. acquiring video content which includes a plurality of frames, including storing the video content in the non-transitory memory;
b. performing motion estimation on a current frame and a previous frame to generate a motion estimated aligned frame;
c. subtracting the motion estimated aligned frame from the current frame to generate a difference frame;
d. enhancing the difference frame;
e. adding the enhanced difference frame and the motion estimated aligned frame to generate a first enhanced current frame;
f. enhancing the current frame directly to generate a second enhanced current frame;
g. performing motion estimation error detection using the current frame and the motion estimated aligned frame; and
h. blending the first enhanced current frame with the second enhanced current frame based on the motion estimation error detection to generate an enhanced frame, wherein enhancing the current frame includes spatial filtering or transformation domain filtering the current frame, and enhancing the difference frame includes spatial filtering or transformation domain filtering the difference frame.
2. The method of claim 1 further comprising capturing the video content with an image sensor.
3. The method of claim 1 wherein the motion estimation includes null motion estimation, global motion estimation, or local motion estimation.
4. (canceled)
5. The method of claim 1 wherein filtering includes average filtering, bilateral filtering, or transformation domain filtering including wavelet filtering.
6. The method of claim 1 wherein blending utilizes a blending coefficient.
7. The method of claim 6 wherein the blending coefficient depends on a confidence level of motion estimation errors, wherein if the confidence level is approximately 100%, then the blending coefficient is 0, and wherein if the confidence level that there are no motion estimation errors is approximately 100%, then the blending coefficient is close to 1.
8. A system programmed in a non-transitory memory of a device comprising:
a. an image sensor configured for acquiring video content which includes a plurality of frames;
b. an image processor configured for:
i. performing motion estimation on a current frame and a previous frame to generate a motion estimated aligned frame;
ii. subtracting the motion estimated aligned frame from the current frame to generate a difference frame;
iii. enhancing the difference frame;
iv. adding the enhanced difference frame and the motion estimated aligned frame to generate a first enhanced current frame;
v. enhancing the current frame directly to generate a second enhanced current frame;
vi. performing motion estimation error detection using the current frame and the motion estimated aligned frame; and
vii. blending the first enhanced current frame with the second enhanced current frame based on the motion estimation error detection to generate an enhanced frame, wherein enhancing the current frame includes spatial filtering or transformation domain filtering the current frame, and enhancing the difference frame includes spatial filtering or transformation domain filtering the difference frame; and
c. a display device configured for displaying an enhanced video including the enhanced frame.
9. The system of claim 8 wherein the image processor is for processing the video content.
10. The system of claim 8 wherein the motion estimation includes null motion estimation, global motion estimation, or local motion estimation.
11. (canceled)
12. The system of claim 8 wherein filtering includes average filtering, bilateral filtering, or transformation domain filtering including wavelet filtering.
13. The system of claim 8 wherein blending utilizes a blending coefficient.
14. The system of claim 13 wherein the blending coefficient depends on a confidence level of motion estimation errors, wherein if the confidence level is approximately 100%, then the blending coefficient is 0, and wherein if the confidence level that there are no motion estimation errors is approximately 100%, then the blending coefficient is close to 1.
15. A camera apparatus comprising:
a. an image sensor configured for acquiring video content which includes a plurality of frames;
b. a non-transitory memory for storing an application, the application for:
i. performing motion estimation on a current frame and a previous frame to generate a motion estimated aligned frame;
ii. subtracting the motion estimated aligned frame from the current frame to generate a difference frame;
iii. enhancing the difference frame;
iv. adding the enhanced difference frame and the motion estimated aligned frame to generate a first enhanced current frame;
v. enhancing the current frame directly to generate a second enhanced current frame;
vi. performing motion estimation error detection using the current frame and the motion estimated aligned frame; and
vii. blending the first enhanced current frame with the second enhanced current frame based on the motion estimation error detection to generate an enhanced frame, wherein enhancing the current frame includes spatial filtering or transformation domain filtering the current frame, and enhancing the difference frame includes spatial filtering or transformation domain filtering the difference frame; and
c. a processor coupled to the memory, the processor configured for processing the application.
16. The camera apparatus of claim 15 wherein the processor is for processing the video content.
17. The camera apparatus of claim 15 wherein the motion estimation includes null motion estimation, global motion estimation, or local motion estimation.
18. (canceled)
19. The camera apparatus of claim 15 wherein filtering includes average filtering, bilateral filtering, or transformation domain filtering including wavelet filtering.
20. The camera apparatus of claim 15 wherein blending utilizes a blending coefficient.
21. The camera apparatus of claim 20 wherein the blending coefficient depends on a confidence level of motion estimation errors, wherein if the confidence level is approximately 100%, then the blending coefficient is 0, and wherein if the confidence level that there are no motion estimation errors is approximately 100%, then the blending coefficient is close to 1.
US14/669,433 2015-03-26 2015-03-26 Method to improve video quality under low light conditions Active 2035-04-17 US9466094B1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US14/669,433 US9466094B1 (en) 2015-03-26 2015-03-26 Method to improve video quality under low light conditions

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US14/669,433 US9466094B1 (en) 2015-03-26 2015-03-26 Method to improve video quality under low light conditions

Publications (2)

Publication Number Publication Date
US20160284066A1 (en) 2016-09-29
US9466094B1 US9466094B1 (en) 2016-10-11

Family

ID=56975711

Family Applications (1)

Application Number Title Priority Date Filing Date
US14/669,433 Active 2035-04-17 US9466094B1 (en) 2015-03-26 2015-03-26 Method to improve video quality under low light conditions

Country Status (1)

Country Link
US (1) US9466094B1 (en)

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6366317B1 (en) 1998-03-27 2002-04-02 Intel Corporation Motion estimation using intrapixel logic
US8477848B1 (en) 2008-04-22 2013-07-02 Marvell International Ltd. Picture rate conversion system architecture
JP5576812B2 (en) * 2011-02-16 2014-08-20 オリンパス株式会社 Image processing apparatus, image processing method, image processing program, and imaging apparatus
US9326008B2 (en) 2012-04-10 2016-04-26 Google Inc. Noise reduction for image sequences

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3352133A1 (en) * 2017-01-20 2018-07-25 Sony Corporation An efficient patch-based method for video denoising
US20180211365A1 (en) * 2017-01-20 2018-07-26 Sony Corporation Efficient path-based method for video denoising
CN108337402A (en) * 2017-01-20 2018-07-27 索尼公司 Effective block-based method for video denoising
KR20180086127A (en) * 2017-01-20 2018-07-30 소니 주식회사 An efficient patch-based method for video denoising
US10140689B2 (en) * 2017-01-20 2018-11-27 Sony Corporation Efficient path-based method for video denoising
KR102007601B1 (en) * 2017-01-20 2019-10-23 소니 주식회사 An efficient patch-based method for video denoising
US11138437B2 (en) * 2018-11-16 2021-10-05 Samsung Electronics Co., Ltd. Image processing apparatus and method thereof

Also Published As

Publication number Publication date
US9466094B1 (en) 2016-10-11

Similar Documents

Publication Publication Date Title
US9202263B2 (en) System and method for spatio video image enhancement
US9888186B2 (en) Image acquisition method and image acquisition apparatus
US20130071045A1 (en) Image transmitting apparatus, image receiving apparatus, image transmitting and receiving system, recording medium recording image transmitting program, and recording medium recording image receiving program
EP2164040B1 (en) System and method for high quality image and video upscaling
US8508606B2 (en) System and method for deblurring motion blurred images
KR20190004270A (en) Performing intensity equalization for mono and color images
KR102127306B1 (en) Movement detecting apparatus and movement detecting method
KR102445762B1 (en) Method and device for processing images
Gryaditskaya et al. Motion aware exposure bracketing for HDR video
US20190020814A1 (en) Imaging device and imaging method using compressed sensing
US9008421B2 (en) Image processing apparatus for performing color interpolation upon captured images and related method thereof
US9466094B1 (en) Method to improve video quality under low light conditions
US20110242423A1 (en) Method and Apparatus for Motion Detection
US8379146B2 (en) Deinterlacing method and apparatus for digital motion picture
JP6738053B2 (en) Image processing apparatus for reducing staircase artifacts from image signals
US10140689B2 (en) Efficient path-based method for video denoising
US9508020B2 (en) Image processing system with artifact suppression mechanism and method of operation thereof
US8373752B2 (en) Detection apparatus, detection method and computer readable medium thereof for detecting an object in real-time
US9275295B1 (en) Noise estimation based on a characteristic of digital video

Legal Events

Date Code Title Description
AS Assignment

Owner name: SONY CORPORATION, JAPAN

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:DONG, XIAOGANG;TAKATORI, JIRO;WONG, TAK SHING;SIGNING DATES FROM 20150423 TO 20150425;REEL/FRAME:035591/0332

FEPP Fee payment procedure

Free format text: PAYOR NUMBER ASSIGNED (ORIGINAL EVENT CODE: ASPN); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

STCF Information on status: patent grant

Free format text: PATENTED CASE

MAFP Maintenance fee payment

Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1551); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

Year of fee payment: 4

MAFP Maintenance fee payment

Free format text: PAYMENT OF MAINTENANCE FEE, 8TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1552); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

Year of fee payment: 8